* [RFC PATCH v1 00/17] Rewrite the remaining merge strategies from shell to C
@ 2020-06-25 12:19 Alban Gruin
  2020-06-25 12:19 ` [RFC PATCH v1 01/17] t6027: modernise tests Alban Gruin
                   ` (17 more replies)
  0 siblings, 18 replies; 221+ messages in thread
From: Alban Gruin @ 2020-06-25 12:19 UTC
  To: git; +Cc: Junio C Hamano, Alban Gruin

In an effort to reduce the number of shell scripts that are part of git,
I propose this series converting the two remaining merge strategies,
resolve and octopus, from shell to C.  This will enable slightly better
performance, better integration with git itself (no more forking to
perform these operations), and better portability (Windows and shell
scripts don't mix well).

Three scripts are actually converted: first git-merge-one-file.sh, then
git-merge-resolve.sh, and finally git-merge-octopus.sh.  Not only are
they converted, but they are also modified to operate without forking,
and then libified so they can be used by git without spawning another
process.

The first patch is not required for the rest of the series to work; I
wrote it while working on this series.

Patches 2-5 rewrite, clean, and libify `git merge-one-file', used by the
resolve and octopus strategies.

Patch 6 libifies `git merge-index', so the rewritten `git
merge-one-file' can be called without forking.

Patches 7-8-9 rewrite, clean, and libify `git merge-resolve'.

Patch 10 moves a function, better_branch_name(), that will prove itself
useful in the C version of `git merge-octopus', but that is not part of
libgit.a.

Patches 11-12-13 rewrite, clean, and libify `git merge-octopus'.

Patches 14-15-16-17 teach `git merge' and the sequencer to call the
strategies without forking.

This series keeps the commands `git merge-one-file', `git merge-resolve'
and `git merge-octopus', so any script depending on them should keep
working without any changes.

This series is based on c9c318d6bf (The fourth batch, 2020-06-22).  The
tip is tagged as "rewrite-and-cleanup-merge-strategies-v1" at
https://github.com/agrn/git.

Alban Gruin (17):
  t6027: modernise tests
  merge-one-file: rewrite in C
  merge-one-file: remove calls to external processes
  merge-one-file: use error() instead of fprintf(stderr, ...)
  merge-one-file: libify merge_one_file()
  merge-index: libify merge_one_path() and merge_all()
  merge-resolve: rewrite in C
  merge-resolve: remove calls to external processes
  merge-resolve: libify merge_resolve()
  merge-recursive: move better_branch_name() to merge.c
  merge-octopus: rewrite in C
  merge-octopus: remove calls to external processes
  merge-octopus: libify merge_octopus()
  merge: use the "resolve" strategy without forking
  merge: use the "octopus" strategy without forking
  sequencer: use the "resolve" strategy without forking
  sequencer: use the "octopus" merge strategy without forking

 Makefile                        |   7 +-
 builtin.h                       |   3 +
 builtin/merge-index.c           |  77 +----
 builtin/merge-octopus.c         |  65 ++++
 builtin/merge-one-file.c        |  74 ++++
 builtin/merge-recursive.c       |  16 +-
 builtin/merge-resolve.c         |  69 ++++
 builtin/merge.c                 |   9 +-
 cache.h                         |   2 +-
 git-merge-octopus.sh            | 112 -------
 git-merge-one-file.sh           | 167 ---------
 git-merge-resolve.sh            |  54 ---
 git.c                           |   3 +
 merge-strategies.c              | 577 ++++++++++++++++++++++++++++++++
 merge-strategies.h              |  44 +++
 merge.c                         |  12 +
 sequencer.c                     |  16 +-
 t/t6027-merge-binary.sh         |  27 +-
 t/t6035-merge-dir-to-symlink.sh |   2 +-
 19 files changed, 889 insertions(+), 447 deletions(-)
 create mode 100644 builtin/merge-octopus.c
 create mode 100644 builtin/merge-one-file.c
 create mode 100644 builtin/merge-resolve.c
 delete mode 100755 git-merge-octopus.sh
 delete mode 100755 git-merge-one-file.sh
 delete mode 100755 git-merge-resolve.sh
 create mode 100644 merge-strategies.c
 create mode 100644 merge-strategies.h

-- 
2.27.0.139.gc9c318d6bf



* [RFC PATCH v1 01/17] t6027: modernise tests
  2020-06-25 12:19 [RFC PATCH v1 00/17] Rewrite the remaining merge strategies from shell to C Alban Gruin
@ 2020-06-25 12:19 ` Alban Gruin
  2020-06-25 12:19 ` [RFC PATCH v1 02/17] merge-one-file: rewrite in C Alban Gruin
                   ` (16 subsequent siblings)
  17 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2020-06-25 12:19 UTC
  To: git; +Cc: Junio C Hamano, Alban Gruin

Some tests in t6027 use an if/then/else to check whether a command
failed, but nowadays we have the `test_must_fail' function to do this
correctly for us.

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 t/t6027-merge-binary.sh | 27 ++++++---------------------
 1 file changed, 6 insertions(+), 21 deletions(-)

diff --git a/t/t6027-merge-binary.sh b/t/t6027-merge-binary.sh
index 4e6c7cb77e..071d3f7343 100755
--- a/t/t6027-merge-binary.sh
+++ b/t/t6027-merge-binary.sh
@@ -5,7 +5,6 @@ test_description='ask merge-recursive to merge binary files'
 . ./test-lib.sh
 
 test_expect_success setup '
-
 	cat "$TEST_DIRECTORY"/test-binary-1.png >m &&
 	git add m &&
 	git ls-files -s | sed -e "s/ 0	/ 1	/" >E1 &&
@@ -35,33 +34,19 @@ test_expect_success setup '
 '
 
 test_expect_success resolve '
-
 	rm -f a* m* &&
 	git reset --hard anchor &&
-
-	if git merge -s resolve master
-	then
-		echo Oops, should not have succeeded
-		false
-	else
-		git ls-files -s >current
-		test_cmp expect current
-	fi
+	test_must_fail git merge -s resolve master &&
+	git ls-files -s >current &&
+	test_cmp expect current
 '
 
 test_expect_success recursive '
-
 	rm -f a* m* &&
 	git reset --hard anchor &&
-
-	if git merge -s recursive master
-	then
-		echo Oops, should not have succeeded
-		false
-	else
-		git ls-files -s >current
-		test_cmp expect current
-	fi
+	test_must_fail git merge -s recursive master &&
+	git ls-files -s >current &&
+	test_cmp expect current
 '
 
 test_done
-- 
2.27.0.139.gc9c318d6bf



* [RFC PATCH v1 02/17] merge-one-file: rewrite in C
  2020-06-25 12:19 [RFC PATCH v1 00/17] Rewrite the remaining merge strategies from shell to C Alban Gruin
  2020-06-25 12:19 ` [RFC PATCH v1 01/17] t6027: modernise tests Alban Gruin
@ 2020-06-25 12:19 ` Alban Gruin
  2020-06-25 14:55   ` Chris Torek
  2020-06-25 15:16   ` Phillip Wood
  2020-06-25 12:19 ` [RFC PATCH v1 03/17] merge-one-file: remove calls to external processes Alban Gruin
                   ` (15 subsequent siblings)
  17 siblings, 2 replies; 221+ messages in thread
From: Alban Gruin @ 2020-06-25 12:19 UTC
  To: git; +Cc: Junio C Hamano, Alban Gruin

This rewrites `git merge-one-file' from shell to C.  This port is very
straightforward: it keeps using external processes to edit the index,
for instance.  Errors are also displayed with fprintf() instead of
error().  Both of these will be addressed in the next few commits,
leading to its libification so its main function can be used from other
commands directly.

This also fixes a bug present in the original script: instead of
checking if a _regular_ file exists when a file exists in the branch to
merge, but not in our branch, the rewritten version checks if a file of
any kind (e.g. a directory) exists.  This fixes the test t6035.14,
where the branch to merge had a new file, `a/b', but our branch had a
directory there; it should have failed because a directory exists, but
it did not because there was no regular file called `a/b'.  This test is
now marked as successful.
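
The difference between the two checks can be sketched in standalone C.
This is only an illustration using POSIX stat() and lstat(), not code
from the patch: the helper names is_regular_file() and path_exists() are
hypothetical (the patch itself calls git's file_exists()).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/stat.h>

/* Mirrors the shell's `test -f': true only if path names a regular file. */
int is_regular_file(const char *path)
{
	struct stat st;

	assert(path);
	return stat(path, &st) == 0 && S_ISREG(st.st_mode);
}

/* True if anything at all (file, directory, symlink, ...) exists at path. */
int path_exists(const char *path)
{
	struct stat st;

	assert(path);
	return lstat(path, &st) == 0;
}
```

For a directory such as `a/b' in t6035.14, is_regular_file() reports
false while path_exists() reports true, which is why only the second
kind of check notices that the merge would clobber something.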

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 Makefile                        |   2 +-
 builtin.h                       |   1 +
 builtin/merge-one-file.c        | 275 ++++++++++++++++++++++++++++++++
 git-merge-one-file.sh           | 167 -------------------
 git.c                           |   1 +
 t/t6035-merge-dir-to-symlink.sh |   2 +-
 6 files changed, 279 insertions(+), 169 deletions(-)
 create mode 100644 builtin/merge-one-file.c
 delete mode 100755 git-merge-one-file.sh

diff --git a/Makefile b/Makefile
index 372139f1f2..19574f5133 100644
--- a/Makefile
+++ b/Makefile
@@ -596,7 +596,6 @@ SCRIPT_SH += git-bisect.sh
 SCRIPT_SH += git-difftool--helper.sh
 SCRIPT_SH += git-filter-branch.sh
 SCRIPT_SH += git-merge-octopus.sh
-SCRIPT_SH += git-merge-one-file.sh
 SCRIPT_SH += git-merge-resolve.sh
 SCRIPT_SH += git-mergetool.sh
 SCRIPT_SH += git-quiltimport.sh
@@ -1089,6 +1088,7 @@ BUILTIN_OBJS += builtin/mailsplit.o
 BUILTIN_OBJS += builtin/merge-base.o
 BUILTIN_OBJS += builtin/merge-file.o
 BUILTIN_OBJS += builtin/merge-index.o
+BUILTIN_OBJS += builtin/merge-one-file.o
 BUILTIN_OBJS += builtin/merge-ours.o
 BUILTIN_OBJS += builtin/merge-recursive.o
 BUILTIN_OBJS += builtin/merge-tree.o
diff --git a/builtin.h b/builtin.h
index a5ae15bfe5..9205d5ecdc 100644
--- a/builtin.h
+++ b/builtin.h
@@ -172,6 +172,7 @@ int cmd_merge_base(int argc, const char **argv, const char *prefix);
 int cmd_merge_index(int argc, const char **argv, const char *prefix);
 int cmd_merge_ours(int argc, const char **argv, const char *prefix);
 int cmd_merge_file(int argc, const char **argv, const char *prefix);
+int cmd_merge_one_file(int argc, const char **argv, const char *prefix);
 int cmd_merge_recursive(int argc, const char **argv, const char *prefix);
 int cmd_merge_tree(int argc, const char **argv, const char *prefix);
 int cmd_mktag(int argc, const char **argv, const char *prefix);
diff --git a/builtin/merge-one-file.c b/builtin/merge-one-file.c
new file mode 100644
index 0000000000..4992a6cd30
--- /dev/null
+++ b/builtin/merge-one-file.c
@@ -0,0 +1,275 @@
+/*
+ * Builtin "git merge-one-file"
+ *
+ * Copyright (c) 2020 Alban Gruin
+ *
+ * Based on git-merge-one-file.sh, written by Linus Torvalds.
+ *
+ * This is the git per-file merge script, called with
+ *
+ *   $1 - original file SHA1 (or empty)
+ *   $2 - file in branch1 SHA1 (or empty)
+ *   $3 - file in branch2 SHA1 (or empty)
+ *   $4 - pathname in repository
+ *   $5 - original file mode (or empty)
+ *   $6 - file in branch1 mode (or empty)
+ *   $7 - file in branch2 mode (or empty)
+ *
+ * Handle some trivial cases.. The _really_ trivial cases have
+ * been handled already by git read-tree, but that one doesn't
+ * do any merges that might change the tree layout.
+ */
+
+#define USE_THE_INDEX_COMPATIBILITY_MACROS
+#include "cache.h"
+#include "builtin.h"
+#include "commit.h"
+#include "dir.h"
+#include "lockfile.h"
+#include "object-store.h"
+#include "run-command.h"
+#include "xdiff-interface.h"
+
+static int create_temp_file(const struct object_id *oid, struct strbuf *path)
+{
+	struct child_process cp = CHILD_PROCESS_INIT;
+	struct strbuf err = STRBUF_INIT;
+	int ret;
+
+	cp.git_cmd = 1;
+	argv_array_pushl(&cp.args, "unpack-file", oid_to_hex(oid), NULL);
+	ret = pipe_command(&cp, NULL, 0, path, 0, &err, 0);
+	if (!ret && path->len > 0)
+		strbuf_trim_trailing_newline(path);
+
+	fprintf(stderr, "%.*s", (int) err.len, err.buf);
+	strbuf_release(&err);
+
+	return ret;
+}
+
+static int add_to_index_cacheinfo(unsigned int mode,
+				  const struct object_id *oid, const char *path)
+{
+	struct child_process cp = CHILD_PROCESS_INIT;
+
+	cp.git_cmd = 1;
+	argv_array_pushl(&cp.args, "update-index", "--add", "--cacheinfo", NULL);
+	argv_array_pushf(&cp.args, "%o,%s,%s", mode, oid_to_hex(oid), path);
+	return run_command(&cp);
+}
+
+static int remove_from_index(const char *path)
+{
+	struct child_process cp = CHILD_PROCESS_INIT;
+
+	cp.git_cmd = 1;
+	argv_array_pushl(&cp.args, "update-index", "--remove", "--", path, NULL);
+	return run_command(&cp);
+}
+
+static int checkout_from_index(const char *path)
+{
+	struct child_process cp = CHILD_PROCESS_INIT;
+
+	cp.git_cmd = 1;
+	argv_array_pushl(&cp.args, "checkout-index", "-u", "-f", "--", path, NULL);
+	return run_command(&cp);
+}
+
+static int merge_one_file_deleted(const struct object_id *orig_blob,
+				  const struct object_id *our_blob,
+				  const struct object_id *their_blob, const char *path,
+				  unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode)
+{
+	if ((our_blob && orig_mode != our_mode) ||
+	    (their_blob && orig_mode != their_mode)) {
+		fprintf(stderr, "ERROR: File %s deleted on one branch but had its\n", path);
+		fprintf(stderr, "ERROR: permissions changed on the other.\n");
+		return 1;
+	}
+
+	if (our_blob) {
+		printf("Removing %s\n", path);
+
+		if (file_exists(path))
+			remove_path(path);
+	}
+
+	return remove_from_index(path);
+}
+
+static int do_merge_one_file(const struct object_id *orig_blob,
+			     const struct object_id *our_blob,
+			     const struct object_id *their_blob, const char *path,
+			     unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode)
+{
+	int ret, source, dest;
+	struct strbuf src1 = STRBUF_INIT, src2 = STRBUF_INIT, orig = STRBUF_INIT;
+	struct child_process cp_merge = CHILD_PROCESS_INIT,
+		cp_checkout = CHILD_PROCESS_INIT,
+		cp_update = CHILD_PROCESS_INIT;
+
+	if (our_mode == S_IFLNK || their_mode == S_IFLNK) {
+		fprintf(stderr, "ERROR: %s: Not merging symbolic link changes.\n", path);
+		return 1;
+	} else if (our_mode == S_IFGITLINK || their_mode == S_IFGITLINK) {
+		fprintf(stderr, "ERROR: %s: Not merging conflicting submodule changes.\n",
+			path);
+		return 1;
+	}
+
+	create_temp_file(our_blob, &src1);
+	create_temp_file(their_blob, &src2);
+
+	if (orig_blob) {
+		printf("Auto-merging %s\n", path);
+		create_temp_file(orig_blob, &orig);
+	} else {
+		printf("Added %s in both, but differently.\n", path);
+		create_temp_file(the_hash_algo->empty_blob, &orig);
+	}
+
+	cp_merge.git_cmd = 1;
+	argv_array_pushl(&cp_merge.args, "merge-file", src1.buf, orig.buf, src2.buf,
+			 NULL);
+	ret = run_command(&cp_merge);
+
+	if (ret != 0)
+		ret = 1;
+
+	cp_checkout.git_cmd = 1;
+	argv_array_pushl(&cp_checkout.args, "checkout-index", "-f", "--stage=2",
+			 "--", path, NULL);
+	if (run_command(&cp_checkout))
+		return 1;
+
+	source = open(src1.buf, O_RDONLY);
+	dest = open(path, O_WRONLY | O_TRUNC);
+
+	copy_fd(source, dest);
+
+	close(source);
+	close(dest);
+
+	unlink(orig.buf);
+	unlink(src1.buf);
+	unlink(src2.buf);
+
+	strbuf_release(&src1);
+	strbuf_release(&src2);
+	strbuf_release(&orig);
+
+	if (ret) {
+		fprintf(stderr, "ERROR: ");
+
+		if (!orig_blob) {
+			fprintf(stderr, "content conflict");
+			if (our_mode != their_mode)
+				fprintf(stderr, ", ");
+		}
+
+		if (our_mode != their_mode)
+			fprintf(stderr, "permissions conflict: %o->%o,%o",
+				orig_mode, our_mode, their_mode);
+
+		fprintf(stderr, " in %s\n", path);
+
+		return 1;
+	}
+
+	cp_update.git_cmd = 1;
+	argv_array_pushl(&cp_update.args, "update-index", "--", path, NULL);
+	return run_command(&cp_update);
+}
+
+static int merge_one_file(const struct object_id *orig_blob,
+			  const struct object_id *our_blob,
+			  const struct object_id *their_blob, const char *path,
+			  unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode)
+{
+	if (orig_blob &&
+	    ((our_blob && oideq(orig_blob, our_blob)) ||
+	     (their_blob && oideq(orig_blob, their_blob))))
+		return merge_one_file_deleted(orig_blob, our_blob, their_blob, path,
+					      orig_mode, our_mode, their_mode);
+	else if (!orig_blob && our_blob && !their_blob) {
+		return add_to_index_cacheinfo(our_mode, our_blob, path);
+	} else if (!orig_blob && !our_blob && their_blob) {
+		printf("Adding %s\n", path);
+
+		if (file_exists(path)) {
+			fprintf(stderr, "ERROR: untracked %s is overwritten by the merge.\n", path);
+			return 1;
+		}
+
+		if (add_to_index_cacheinfo(their_mode, their_blob, path))
+			return 1;
+		return checkout_from_index(path);
+	} else if (!orig_blob && our_blob && their_blob &&
+		   oideq(our_blob, their_blob)) {
+		if (our_mode != their_mode) {
+			fprintf(stderr, "ERROR: File %s added identically in both branches,", path);
+			fprintf(stderr, "ERROR: but permissions conflict %o->%o.\n",
+				our_mode, their_mode);
+			return 1;
+		}
+
+		printf("Adding %s\n", path);
+
+		if (add_to_index_cacheinfo(our_mode, our_blob, path))
+			return 1;
+		return checkout_from_index(path);
+	} else if (our_blob && their_blob)
+		return do_merge_one_file(orig_blob, our_blob, their_blob, path,
+					 orig_mode, our_mode, their_mode);
+	else {
+		char *orig_hex = "", *our_hex = "", *their_hex = "";
+
+		if (orig_blob)
+			orig_hex = oid_to_hex(orig_blob);
+		if (our_blob)
+			our_hex = oid_to_hex(our_blob);
+		if (their_blob)
+			their_hex = oid_to_hex(their_blob);
+
+		fprintf(stderr, "ERROR: %s: Not handling case %s -> %s -> %s\n",
+			path, orig_hex, our_hex, their_hex);
+		return 1;
+	}
+
+	return 0;
+}
+
+static const char builtin_merge_one_file_usage[] =
+	"git merge-one-file <orig blob> <our blob> <their blob> <path> "
+	"<orig mode> <our mode> <their mode>\n\n"
+	"Blob ids and modes should be empty for missing files.";
+
+int cmd_merge_one_file(int argc, const char **argv, const char *prefix)
+{
+	struct object_id orig_blob, our_blob, their_blob,
+		*p_orig_blob = NULL, *p_our_blob = NULL, *p_their_blob = NULL;
+	unsigned int orig_mode = 0, our_mode = 0, their_mode = 0;
+
+	if (argc != 8)
+		usage(builtin_merge_one_file_usage);
+
+	if (!get_oid(argv[1], &orig_blob)) {
+		p_orig_blob = &orig_blob;
+		orig_mode = strtol(argv[5], NULL, 8);
+	}
+
+	if (!get_oid(argv[2], &our_blob)) {
+		p_our_blob = &our_blob;
+		our_mode = strtol(argv[6], NULL, 8);
+	}
+
+	if (!get_oid(argv[3], &their_blob)) {
+		p_their_blob = &their_blob;
+		their_mode = strtol(argv[7], NULL, 8);
+	}
+
+	return merge_one_file(p_orig_blob, p_our_blob, p_their_blob, argv[4],
+			      orig_mode, our_mode, their_mode);
+}
diff --git a/git-merge-one-file.sh b/git-merge-one-file.sh
deleted file mode 100755
index f6d9852d2f..0000000000
--- a/git-merge-one-file.sh
+++ /dev/null
@@ -1,167 +0,0 @@
-#!/bin/sh
-#
-# Copyright (c) Linus Torvalds, 2005
-#
-# This is the git per-file merge script, called with
-#
-#   $1 - original file SHA1 (or empty)
-#   $2 - file in branch1 SHA1 (or empty)
-#   $3 - file in branch2 SHA1 (or empty)
-#   $4 - pathname in repository
-#   $5 - original file mode (or empty)
-#   $6 - file in branch1 mode (or empty)
-#   $7 - file in branch2 mode (or empty)
-#
-# Handle some trivial cases.. The _really_ trivial cases have
-# been handled already by git read-tree, but that one doesn't
-# do any merges that might change the tree layout.
-
-USAGE='<orig blob> <our blob> <their blob> <path>'
-USAGE="$USAGE <orig mode> <our mode> <their mode>"
-LONG_USAGE="usage: git merge-one-file $USAGE
-
-Blob ids and modes should be empty for missing files."
-
-SUBDIRECTORY_OK=Yes
-. git-sh-setup
-cd_to_toplevel
-require_work_tree
-
-if test $# != 7
-then
-	echo "$LONG_USAGE"
-	exit 1
-fi
-
-case "${1:-.}${2:-.}${3:-.}" in
-#
-# Deleted in both or deleted in one and unchanged in the other
-#
-"$1.." | "$1.$1" | "$1$1.")
-	if { test -z "$6" && test "$5" != "$7"; } ||
-	   { test -z "$7" && test "$5" != "$6"; }
-	then
-		echo "ERROR: File $4 deleted on one branch but had its" >&2
-		echo "ERROR: permissions changed on the other." >&2
-		exit 1
-	fi
-
-	if test -n "$2"
-	then
-		echo "Removing $4"
-	else
-		# read-tree checked that index matches HEAD already,
-		# so we know we do not have this path tracked.
-		# there may be an unrelated working tree file here,
-		# which we should just leave unmolested.  Make sure
-		# we do not have it in the index, though.
-		exec git update-index --remove -- "$4"
-	fi
-	if test -f "$4"
-	then
-		rm -f -- "$4" &&
-		rmdir -p "$(expr "z$4" : 'z\(.*\)/')" 2>/dev/null || :
-	fi &&
-		exec git update-index --remove -- "$4"
-	;;
-
-#
-# Added in one.
-#
-".$2.")
-	# the other side did not add and we added so there is nothing
-	# to be done, except making the path merged.
-	exec git update-index --add --cacheinfo "$6" "$2" "$4"
-	;;
-"..$3")
-	echo "Adding $4"
-	if test -f "$4"
-	then
-		echo "ERROR: untracked $4 is overwritten by the merge." >&2
-		exit 1
-	fi
-	git update-index --add --cacheinfo "$7" "$3" "$4" &&
-		exec git checkout-index -u -f -- "$4"
-	;;
-
-#
-# Added in both, identically (check for same permissions).
-#
-".$3$2")
-	if test "$6" != "$7"
-	then
-		echo "ERROR: File $4 added identically in both branches," >&2
-		echo "ERROR: but permissions conflict $6->$7." >&2
-		exit 1
-	fi
-	echo "Adding $4"
-	git update-index --add --cacheinfo "$6" "$2" "$4" &&
-		exec git checkout-index -u -f -- "$4"
-	;;
-
-#
-# Modified in both, but differently.
-#
-"$1$2$3" | ".$2$3")
-
-	case ",$6,$7," in
-	*,120000,*)
-		echo "ERROR: $4: Not merging symbolic link changes." >&2
-		exit 1
-		;;
-	*,160000,*)
-		echo "ERROR: $4: Not merging conflicting submodule changes." >&2
-		exit 1
-		;;
-	esac
-
-	src1=$(git unpack-file $2)
-	src2=$(git unpack-file $3)
-	case "$1" in
-	'')
-		echo "Added $4 in both, but differently."
-		orig=$(git unpack-file $(git hash-object /dev/null))
-		;;
-	*)
-		echo "Auto-merging $4"
-		orig=$(git unpack-file $1)
-		;;
-	esac
-
-	git merge-file "$src1" "$orig" "$src2"
-	ret=$?
-	msg=
-	if test $ret != 0 || test -z "$1"
-	then
-		msg='content conflict'
-		ret=1
-	fi
-
-	# Create the working tree file, using "our tree" version from the
-	# index, and then store the result of the merge.
-	git checkout-index -f --stage=2 -- "$4" && cat "$src1" >"$4" || exit 1
-	rm -f -- "$orig" "$src1" "$src2"
-
-	if test "$6" != "$7"
-	then
-		if test -n "$msg"
-		then
-			msg="$msg, "
-		fi
-		msg="${msg}permissions conflict: $5->$6,$7"
-		ret=1
-	fi
-
-	if test $ret != 0
-	then
-		echo "ERROR: $msg in $4" >&2
-		exit 1
-	fi
-	exec git update-index -- "$4"
-	;;
-
-*)
-	echo "ERROR: $4: Not handling case $1 -> $2 -> $3" >&2
-	;;
-esac
-exit 1
diff --git a/git.c b/git.c
index a2d337eed7..058d91a2a5 100644
--- a/git.c
+++ b/git.c
@@ -532,6 +532,7 @@ static struct cmd_struct commands[] = {
 	{ "merge-file", cmd_merge_file, RUN_SETUP_GENTLY },
 	{ "merge-index", cmd_merge_index, RUN_SETUP | NO_PARSEOPT },
 	{ "merge-ours", cmd_merge_ours, RUN_SETUP | NO_PARSEOPT },
+	{ "merge-one-file", cmd_merge_one_file, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
 	{ "merge-recursive", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
 	{ "merge-recursive-ours", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
 	{ "merge-recursive-theirs", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
diff --git a/t/t6035-merge-dir-to-symlink.sh b/t/t6035-merge-dir-to-symlink.sh
index 2eddcc7664..5fb74e39a0 100755
--- a/t/t6035-merge-dir-to-symlink.sh
+++ b/t/t6035-merge-dir-to-symlink.sh
@@ -94,7 +94,7 @@ test_expect_success SYMLINKS 'a/b was resolved as symlink' '
 	test -h a/b
 '
 
-test_expect_failure 'do not lose untracked in merge (resolve)' '
+test_expect_success 'do not lose untracked in merge (resolve)' '
 	git reset --hard &&
 	git checkout baseline^0 &&
 	>a/b/c/e &&
-- 
2.27.0.139.gc9c318d6bf



* [RFC PATCH v1 03/17] merge-one-file: remove calls to external processes
  2020-06-25 12:19 [RFC PATCH v1 00/17] Rewrite the remaining merge strategies from shell to C Alban Gruin
  2020-06-25 12:19 ` [RFC PATCH v1 01/17] t6027: modernise tests Alban Gruin
  2020-06-25 12:19 ` [RFC PATCH v1 02/17] merge-one-file: rewrite in C Alban Gruin
@ 2020-06-25 12:19 ` Alban Gruin
  2020-06-25 12:19 ` [RFC PATCH v1 04/17] merge-one-file: use error() instead of fprintf(stderr, ...) Alban Gruin
                   ` (14 subsequent siblings)
  17 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2020-06-25 12:19 UTC
  To: git; +Cc: Junio C Hamano, Alban Gruin

To save precious cycles by avoiding reading and flushing the index
repeatedly, or writing temporary files when an operation can be
performed in-memory, this removes calls to external processes:

 - calls to `update-index --add --cacheinfo' are replaced by calls to
   add_cache_entry();

 - calls to `update-index --remove' are replaced by calls to
   remove_file_from_cache();

 - calls to `checkout-index -u -f' are replaced by calls to
   checkout_entry();

 - calls to `unpack-file' and `merge-file' are replaced by calls to
   read_mmblob() and xdl_merge(), respectively, to merge files
   in-memory;

 - calls to `checkout-index -f --stage=2' are replaced by calls to
   cache_file_exists();

 - calls to `update-index' are replaced by calls to add_file_to_cache().

To enable these changes, the index is read and written back in
cmd_merge_one_file().
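
The lock / write / commit-or-rollback flow used in cmd_merge_one_file()
(hold_locked_index(), then write_locked_index() on success or
rollback_lock_file() on failure) follows a general pattern that can be
sketched in standalone C.  This is a simplified illustration under
stated assumptions, not git's lockfile implementation:
update_with_lock() and its `failed' parameter are hypothetical, and the
"<path>.lock" naming is used only to show the idea.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: replace the contents of `path' with `data'.
 * `failed' stands in for the result of the work done while the lock
 * is held (e.g. the merge); if it is nonzero, nothing is committed.
 * Returns 0 on commit, -1 on rollback or error.
 */
int update_with_lock(const char *path, const char *data, int failed)
{
	char lock[4096];
	size_t len = strlen(data);
	int n, fd;

	n = snprintf(lock, sizeof(lock), "%s.lock", path);
	if (n < 0 || (size_t)n >= sizeof(lock))
		return -1;

	/* O_EXCL makes creation fail if the lock file already exists. */
	fd = open(lock, O_WRONLY | O_CREAT | O_EXCL, 0666);
	if (fd < 0)
		return -1;

	if (failed || write(fd, data, len) != (ssize_t)len) {
		/* Roll back: drop the lock file, leave `path' untouched. */
		close(fd);
		unlink(lock);
		return -1;
	}

	/* Commit: atomically move the new contents into place. */
	if (close(fd) < 0 || rename(lock, path) < 0) {
		unlink(lock);
		return -1;
	}
	return 0;
}
```

With this shape, a failed operation never leaves a half-written file
behind: the original is either replaced as a whole or not touched.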

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 builtin/merge-one-file.c | 160 +++++++++++++++++++--------------------
 1 file changed, 78 insertions(+), 82 deletions(-)

diff --git a/builtin/merge-one-file.c b/builtin/merge-one-file.c
index 4992a6cd30..d9ebd820cb 100644
--- a/builtin/merge-one-file.c
+++ b/builtin/merge-one-file.c
@@ -27,54 +27,48 @@
 #include "dir.h"
 #include "lockfile.h"
 #include "object-store.h"
-#include "run-command.h"
 #include "xdiff-interface.h"
 
-static int create_temp_file(const struct object_id *oid, struct strbuf *path)
-{
-	struct child_process cp = CHILD_PROCESS_INIT;
-	struct strbuf err = STRBUF_INIT;
-	int ret;
-
-	cp.git_cmd = 1;
-	argv_array_pushl(&cp.args, "unpack-file", oid_to_hex(oid), NULL);
-	ret = pipe_command(&cp, NULL, 0, path, 0, &err, 0);
-	if (!ret && path->len > 0)
-		strbuf_trim_trailing_newline(path);
-
-	fprintf(stderr, "%.*s", (int) err.len, err.buf);
-	strbuf_release(&err);
-
-	return ret;
-}
-
 static int add_to_index_cacheinfo(unsigned int mode,
 				  const struct object_id *oid, const char *path)
 {
-	struct child_process cp = CHILD_PROCESS_INIT;
+	struct cache_entry *ce;
+	int len, option;
 
-	cp.git_cmd = 1;
-	argv_array_pushl(&cp.args, "update-index", "--add", "--cacheinfo", NULL);
-	argv_array_pushf(&cp.args, "%o,%s,%s", mode, oid_to_hex(oid), path);
-	return run_command(&cp);
-}
+	if (!verify_path(path, mode))
+		return error("Invalid path '%s'", path);
 
-static int remove_from_index(const char *path)
-{
-	struct child_process cp = CHILD_PROCESS_INIT;
+	len = strlen(path);
+	ce = make_empty_cache_entry(&the_index, len);
 
-	cp.git_cmd = 1;
-	argv_array_pushl(&cp.args, "update-index", "--remove", "--", path, NULL);
-	return run_command(&cp);
+	oidcpy(&ce->oid, oid);
+	memcpy(ce->name, path, len);
+	ce->ce_flags = create_ce_flags(0);
+	ce->ce_namelen = len;
+	ce->ce_mode = create_ce_mode(mode);
+	if (assume_unchanged)
+		ce->ce_flags |= CE_VALID;
+	option = ADD_CACHE_OK_TO_ADD | ADD_CACHE_OK_TO_REPLACE;
+	if (add_cache_entry(ce, option))
+		return error("%s: cannot add to the index", path);
+
+	return 0;
 }
 
 static int checkout_from_index(const char *path)
 {
-	struct child_process cp = CHILD_PROCESS_INIT;
+	struct checkout state;
+	struct cache_entry *ce;
 
-	cp.git_cmd = 1;
-	argv_array_pushl(&cp.args, "checkout-index", "-u", "-f", "--", path, NULL);
-	return run_command(&cp);
+	state.istate = &the_index;
+	state.force = 1;
+	state.base_dir = "";
+	state.base_dir_len = 0;
+
+	ce = cache_file_exists(path, strlen(path), 0);
+	if (checkout_entry(ce, &state, NULL, NULL) < 0)
+		return error("%s: cannot checkout file", path);
+	return 0;
 }
 
 static int merge_one_file_deleted(const struct object_id *orig_blob,
@@ -96,7 +90,9 @@ static int merge_one_file_deleted(const struct object_id *orig_blob,
 			remove_path(path);
 	}
 
-	return remove_from_index(path);
+	if (remove_file_from_cache(path))
+		return error("%s: cannot remove from the index", path);
+	return 0;
 }
 
 static int do_merge_one_file(const struct object_id *orig_blob,
@@ -104,61 +100,50 @@ static int do_merge_one_file(const struct object_id *orig_blob,
 			     const struct object_id *their_blob, const char *path,
 			     unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode)
 {
-	int ret, source, dest;
-	struct strbuf src1 = STRBUF_INIT, src2 = STRBUF_INIT, orig = STRBUF_INIT;
-	struct child_process cp_merge = CHILD_PROCESS_INIT,
-		cp_checkout = CHILD_PROCESS_INIT,
-		cp_update = CHILD_PROCESS_INIT;
+	int ret, i, dest;
+	mmbuffer_t result = {NULL, 0};
+	mmfile_t mmfs[3];
+	xmparam_t xmp = {{0}};
+	struct cache_entry *ce;
 
-	if (our_mode == S_IFLNK || their_mode == S_IFLNK) {
-		fprintf(stderr, "ERROR: %s: Not merging symbolic link changes.\n", path);
-		return 1;
-	} else if (our_mode == S_IFGITLINK || their_mode == S_IFGITLINK) {
-		fprintf(stderr, "ERROR: %s: Not merging conflicting submodule changes.\n",
-			path);
-		return 1;
-	}
+	if (our_mode == S_IFLNK || their_mode == S_IFLNK)
+		return error(_("%s: Not merging symbolic link changes."), path);
+	else if (our_mode == S_IFGITLINK || their_mode == S_IFGITLINK)
+		return error(_("%s: Not merging conflicting submodule changes."), path);
 
-	create_temp_file(our_blob, &src1);
-	create_temp_file(their_blob, &src2);
+	read_mmblob(mmfs + 0, our_blob);
+	read_mmblob(mmfs + 2, their_blob);
 
 	if (orig_blob) {
 		printf("Auto-merging %s\n", path);
-		create_temp_file(orig_blob, &orig);
+		read_mmblob(mmfs + 1, orig_blob);
 	} else {
 		printf("Added %s in both, but differently.\n", path);
-		create_temp_file(the_hash_algo->empty_blob, &orig);
+		read_mmblob(mmfs + 1, the_hash_algo->empty_blob);
 	}
 
-	cp_merge.git_cmd = 1;
-	argv_array_pushl(&cp_merge.args, "merge-file", src1.buf, orig.buf, src2.buf,
-			 NULL);
-	ret = run_command(&cp_merge);
+	xmp.level = XDL_MERGE_ZEALOUS_ALNUM;
+	xmp.style = 0;
+	xmp.favor = 0;
 
-	if (ret != 0)
+	ret = xdl_merge(mmfs + 1, mmfs + 0, mmfs + 2, &xmp, &result);
+
+	for (i = 0; i < 3; i++)
+		free(mmfs[i].ptr);
+
+	if (ret > 127)
 		ret = 1;
 
-	cp_checkout.git_cmd = 1;
-	argv_array_pushl(&cp_checkout.args, "checkout-index", "-f", "--stage=2",
-			 "--", path, NULL);
-	if (run_command(&cp_checkout))
-		return 1;
+	ce = cache_file_exists(path, strlen(path), 0);
+	if (!ce)
+		BUG("file is not present in the cache?");
 
-	source = open(src1.buf, O_RDONLY);
-	dest = open(path, O_WRONLY | O_TRUNC);
-
-	copy_fd(source, dest);
-
-	close(source);
+	unlink(path);
+	dest = open(path, O_WRONLY | O_CREAT, ce->ce_mode);
+	write_in_full(dest, result.ptr, result.size);
 	close(dest);
 
-	unlink(orig.buf);
-	unlink(src1.buf);
-	unlink(src2.buf);
-
-	strbuf_release(&src1);
-	strbuf_release(&src2);
-	strbuf_release(&orig);
+	free(result.ptr);
 
 	if (ret) {
 		fprintf(stderr, "ERROR: ");
@@ -178,9 +163,7 @@ static int do_merge_one_file(const struct object_id *orig_blob,
 		return 1;
 	}
 
-	cp_update.git_cmd = 1;
-	argv_array_pushl(&cp_update.args, "update-index", "--", path, NULL);
-	return run_command(&cp_update);
+	return add_file_to_cache(path, 0);
 }
 
 static int merge_one_file(const struct object_id *orig_blob,
@@ -250,11 +233,17 @@ int cmd_merge_one_file(int argc, const char **argv, const char *prefix)
 {
 	struct object_id orig_blob, our_blob, their_blob,
 		*p_orig_blob = NULL, *p_our_blob = NULL, *p_their_blob = NULL;
-	unsigned int orig_mode = 0, our_mode = 0, their_mode = 0;
+	unsigned int orig_mode = 0, our_mode = 0, their_mode = 0, ret;
+	struct lock_file lock = LOCK_INIT;
 
 	if (argc != 8)
 		usage(builtin_merge_one_file_usage);
 
+	if (read_cache() < 0)
+		die("invalid index");
+
+	hold_locked_index(&lock, LOCK_DIE_ON_ERROR);
+
 	if (!get_oid(argv[1], &orig_blob)) {
 		p_orig_blob = &orig_blob;
 		orig_mode = strtol(argv[5], NULL, 8);
@@ -270,6 +259,13 @@ int cmd_merge_one_file(int argc, const char **argv, const char *prefix)
 		their_mode = strtol(argv[7], NULL, 8);
 	}
 
-	return merge_one_file(p_orig_blob, p_our_blob, p_their_blob, argv[4],
-			      orig_mode, our_mode, their_mode);
+	ret = merge_one_file(p_orig_blob, p_our_blob, p_their_blob, argv[4],
+			     orig_mode, our_mode, their_mode);
+
+	if (ret) {
+		rollback_lock_file(&lock);
+		return ret;
+	}
+
+	return write_locked_index(&the_index, &lock, COMMIT_LOCK);
 }
-- 
2.27.0.139.gc9c318d6bf



* [RFC PATCH v1 04/17] merge-one-file: use error() instead of fprintf(stderr, ...)
  2020-06-25 12:19 [RFC PATCH v1 00/17] Rewrite the remaining merge strategies from shell to C Alban Gruin
                   ` (2 preceding siblings ...)
  2020-06-25 12:19 ` [RFC PATCH v1 03/17] merge-one-file: remove calls to external processes Alban Gruin
@ 2020-06-25 12:19 ` Alban Gruin
  2020-06-25 12:19 ` [RFC PATCH v1 05/17] merge-one-file: libify merge_one_file() Alban Gruin
                   ` (13 subsequent siblings)
  17 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2020-06-25 12:19 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Alban Gruin

We have a handy helper function to display errors and return a value.
Use it instead of fprintf(stderr, ...).
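
As a rough illustration of the pattern (a sketch, not git's actual
implementation), a helper in this style prints a prefixed message to
stderr and returns a negative value, so a caller can report and bail
out in a single statement.  The names report_error() and check_modes()
are hypothetical.  Note that git's own error() returns -1, whereas the
code it replaces here returned 1.

```c
#include <stdarg.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for an error()-style helper: print a prefixed
 * message to stderr, then return -1 so the caller can propagate it.
 */
static int report_error(const char *fmt, ...)
{
	va_list ap;

	fputs("error: ", stderr);
	va_start(ap, fmt);
	vfprintf(stderr, fmt, ap);
	va_end(ap);
	fputc('\n', stderr);
	return -1;
}

/* Illustrative caller: one statement replaces fprintf() + return. */
static int check_modes(const char *path, unsigned int ours, unsigned int theirs)
{
	if (ours != theirs)
		return report_error("%s: permissions conflict %o->%o",
				    path, ours, theirs);
	return 0;
}
```

With this shape, callers that only test the result for non-zero keep
working, but any caller that compared against 1 would need a second
look.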

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 builtin/merge-one-file.c | 43 +++++++++++++---------------------------
 1 file changed, 14 insertions(+), 29 deletions(-)

diff --git a/builtin/merge-one-file.c b/builtin/merge-one-file.c
index d9ebd820cb..d612885723 100644
--- a/builtin/merge-one-file.c
+++ b/builtin/merge-one-file.c
@@ -77,11 +77,9 @@ static int merge_one_file_deleted(const struct object_id *orig_blob,
 				  unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode)
 {
 	if ((our_blob && orig_mode != our_mode) ||
-	    (their_blob && orig_mode != their_mode)) {
-		fprintf(stderr, "ERROR: File %s deleted on one branch but had its\n", path);
-		fprintf(stderr, "ERROR: permissions changed on the other.\n");
-		return 1;
-	}
+	    (their_blob && orig_mode != their_mode))
+		return error(_("File %s deleted on one branch but had its "
+			       "permissions changed on the other."), path);
 
 	if (our_blob) {
 		printf("Removing %s\n", path);
@@ -146,19 +144,11 @@ static int do_merge_one_file(const struct object_id *orig_blob,
 	free(result.ptr);
 
 	if (ret) {
-		fprintf(stderr, "ERROR: ");
-
-		if (!orig_blob) {
-			fprintf(stderr, "content conflict");
-			if (our_mode != their_mode)
-				fprintf(stderr, ", ");
-		}
-
+		if (!orig_blob)
+			error(_("content conflict in %s"), path);
 		if (our_mode != their_mode)
-			fprintf(stderr, "permissions conflict: %o->%o,%o",
-				orig_mode, our_mode, their_mode);
-
-		fprintf(stderr, " in %s\n", path);
+			error(_("permission conflict: %o->%o,%o in %s"),
+			      orig_mode, our_mode, their_mode, path);
 
 		return 1;
 	}
@@ -181,22 +171,18 @@ static int merge_one_file(const struct object_id *orig_blob,
 	} else if (!orig_blob && !our_blob && their_blob) {
 		printf("Adding %s\n", path);
 
-		if (file_exists(path)) {
-			fprintf(stderr, "ERROR: untracked %s is overwritten by the merge.\n", path);
-			return 1;
-		}
+		if (file_exists(path))
+			return error(_("untracked %s is overwritten by the merge."), path);
 
 		if (add_to_index_cacheinfo(their_mode, their_blob, path))
 			return 1;
 		return checkout_from_index(path);
 	} else if (!orig_blob && our_blob && their_blob &&
 		   oideq(our_blob, their_blob)) {
-		if (our_mode != their_mode) {
-			fprintf(stderr, "ERROR: File %s added identically in both branches,", path);
-			fprintf(stderr, "ERROR: but permissions conflict %o->%o.\n",
-				our_mode, their_mode);
-			return 1;
-		}
+		if (our_mode != their_mode)
+			return error(_("File %s added identically in both branches, "
+				       "but permissions conflict %o->%o."),
+				     path, our_mode, their_mode);
 
 		printf("Adding %s\n", path);
 
@@ -216,9 +202,8 @@ static int merge_one_file(const struct object_id *orig_blob,
 		if (their_blob)
 			their_hex = oid_to_hex(their_blob);
 
-		fprintf(stderr, "ERROR: %s: Not handling case %s -> %s -> %s\n",
+		return error(_("%s: Not handling case %s -> %s -> %s"),
 			path, orig_hex, our_hex, their_hex);
-		return 1;
 	}
 
 	return 0;
-- 
2.27.0.139.gc9c318d6bf



* [RFC PATCH v1 05/17] merge-one-file: libify merge_one_file()
  2020-06-25 12:19 [RFC PATCH v1 00/17] Rewrite the remaining merge strategies from shell to C Alban Gruin
                   ` (3 preceding siblings ...)
  2020-06-25 12:19 ` [RFC PATCH v1 04/17] merge-one-file: use error() instead of fprintf(stderr, ...) Alban Gruin
@ 2020-06-25 12:19 ` Alban Gruin
  2020-06-25 12:19 ` [RFC PATCH v1 06/17] merge-index: libify merge_one_path() and merge_all() Alban Gruin
                   ` (12 subsequent siblings)
  17 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2020-06-25 12:19 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Alban Gruin

This moves merge_one_file() (and its helper functions) to a new file,
merge-strategies.c.  This will enable the resolve and octopus strategies
to call it directly instead of forking.  It is also renamed to
merge_strategies_one_file().

This is not a faithful copy-and-paste; in the builtin version,
merge_one_file() operated on `the_repository' and `the_index', something
we cannot allow a function that is part of libgit.a to do.  Hence, it now
takes a pointer to a repository as its first argument (and its helper
functions take a pointer to an `index_state').

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---

Notes:
    This patch is best viewed with `--color-moved'.

 Makefile                 |   1 +
 builtin/merge-one-file.c | 190 +-------------------------------------
 merge-strategies.c       | 191 +++++++++++++++++++++++++++++++++++++++
 merge-strategies.h       |  13 +++
 4 files changed, 209 insertions(+), 186 deletions(-)
 create mode 100644 merge-strategies.c
 create mode 100644 merge-strategies.h

diff --git a/Makefile b/Makefile
index 19574f5133..1ab4d160cb 100644
--- a/Makefile
+++ b/Makefile
@@ -911,6 +911,7 @@ LIB_OBJS += match-trees.o
 LIB_OBJS += mem-pool.o
 LIB_OBJS += merge-blobs.o
 LIB_OBJS += merge-recursive.o
+LIB_OBJS += merge-strategies.o
 LIB_OBJS += merge.o
 LIB_OBJS += mergesort.o
 LIB_OBJS += midx.o
diff --git a/builtin/merge-one-file.c b/builtin/merge-one-file.c
index d612885723..2f7a3e1db2 100644
--- a/builtin/merge-one-file.c
+++ b/builtin/merge-one-file.c
@@ -23,191 +23,8 @@
 #define USE_THE_INDEX_COMPATIBILITY_MACROS
 #include "cache.h"
 #include "builtin.h"
-#include "commit.h"
-#include "dir.h"
 #include "lockfile.h"
-#include "object-store.h"
-#include "xdiff-interface.h"
-
-static int add_to_index_cacheinfo(unsigned int mode,
-				  const struct object_id *oid, const char *path)
-{
-	struct cache_entry *ce;
-	int len, option;
-
-	if (!verify_path(path, mode))
-		return error("Invalid path '%s'", path);
-
-	len = strlen(path);
-	ce = make_empty_cache_entry(&the_index, len);
-
-	oidcpy(&ce->oid, oid);
-	memcpy(ce->name, path, len);
-	ce->ce_flags = create_ce_flags(0);
-	ce->ce_namelen = len;
-	ce->ce_mode = create_ce_mode(mode);
-	if (assume_unchanged)
-		ce->ce_flags |= CE_VALID;
-	option = ADD_CACHE_OK_TO_ADD | ADD_CACHE_OK_TO_REPLACE;
-	if (add_cache_entry(ce, option))
-		return error("%s: cannot add to the index", path);
-
-	return 0;
-}
-
-static int checkout_from_index(const char *path)
-{
-	struct checkout state;
-	struct cache_entry *ce;
-
-	state.istate = &the_index;
-	state.force = 1;
-	state.base_dir = "";
-	state.base_dir_len = 0;
-
-	ce = cache_file_exists(path, strlen(path), 0);
-	if (checkout_entry(ce, &state, NULL, NULL) < 0)
-		return error("%s: cannot checkout file", path);
-	return 0;
-}
-
-static int merge_one_file_deleted(const struct object_id *orig_blob,
-				  const struct object_id *our_blob,
-				  const struct object_id *their_blob, const char *path,
-				  unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode)
-{
-	if ((our_blob && orig_mode != our_mode) ||
-	    (their_blob && orig_mode != their_mode))
-		return error(_("File %s deleted on one branch but had its "
-			       "permissions changed on the other."), path);
-
-	if (our_blob) {
-		printf("Removing %s\n", path);
-
-		if (file_exists(path))
-			remove_path(path);
-	}
-
-	if (remove_file_from_cache(path))
-		return error("%s: cannot remove from the index", path);
-	return 0;
-}
-
-static int do_merge_one_file(const struct object_id *orig_blob,
-			     const struct object_id *our_blob,
-			     const struct object_id *their_blob, const char *path,
-			     unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode)
-{
-	int ret, i, dest;
-	mmbuffer_t result = {NULL, 0};
-	mmfile_t mmfs[3];
-	xmparam_t xmp = {{0}};
-	struct cache_entry *ce;
-
-	if (our_mode == S_IFLNK || their_mode == S_IFLNK)
-		return error(_("%s: Not merging symbolic link changes."), path);
-	else if (our_mode == S_IFGITLINK || their_mode == S_IFGITLINK)
-		return error(_("%s: Not merging conflicting submodule changes."), path);
-
-	read_mmblob(mmfs + 0, our_blob);
-	read_mmblob(mmfs + 2, their_blob);
-
-	if (orig_blob) {
-		printf("Auto-merging %s\n", path);
-		read_mmblob(mmfs + 1, orig_blob);
-	} else {
-		printf("Added %s in both, but differently.\n", path);
-		read_mmblob(mmfs + 1, the_hash_algo->empty_blob);
-	}
-
-	xmp.level = XDL_MERGE_ZEALOUS_ALNUM;
-	xmp.style = 0;
-	xmp.favor = 0;
-
-	ret = xdl_merge(mmfs + 1, mmfs + 0, mmfs + 2, &xmp, &result);
-
-	for (i = 0; i < 3; i++)
-		free(mmfs[i].ptr);
-
-	if (ret > 127)
-		ret = 1;
-
-	ce = cache_file_exists(path, strlen(path), 0);
-	if (!ce)
-		BUG("file is not present in the cache?");
-
-	unlink(path);
-	dest = open(path, O_WRONLY | O_CREAT, ce->ce_mode);
-	write_in_full(dest, result.ptr, result.size);
-	close(dest);
-
-	free(result.ptr);
-
-	if (ret) {
-		if (!orig_blob)
-			error(_("content conflict in %s"), path);
-		if (our_mode != their_mode)
-			error(_("permission conflict: %o->%o,%o in %s"),
-			      orig_mode, our_mode, their_mode, path);
-
-		return 1;
-	}
-
-	return add_file_to_cache(path, 0);
-}
-
-static int merge_one_file(const struct object_id *orig_blob,
-			  const struct object_id *our_blob,
-			  const struct object_id *their_blob, const char *path,
-			  unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode)
-{
-	if (orig_blob &&
-	    ((our_blob && oideq(orig_blob, our_blob)) ||
-	     (their_blob && oideq(orig_blob, their_blob))))
-		return merge_one_file_deleted(orig_blob, our_blob, their_blob, path,
-					      orig_mode, our_mode, their_mode);
-	else if (!orig_blob && our_blob && !their_blob) {
-		return add_to_index_cacheinfo(our_mode, our_blob, path);
-	} else if (!orig_blob && !our_blob && their_blob) {
-		printf("Adding %s\n", path);
-
-		if (file_exists(path))
-			return error(_("untracked %s is overwritten by the merge."), path);
-
-		if (add_to_index_cacheinfo(their_mode, their_blob, path))
-			return 1;
-		return checkout_from_index(path);
-	} else if (!orig_blob && our_blob && their_blob &&
-		   oideq(our_blob, their_blob)) {
-		if (our_mode != their_mode)
-			return error(_("File %s added identically in both branches, "
-				       "but permissions conflict %o->%o."),
-				     path, our_mode, their_mode);
-
-		printf("Adding %s\n", path);
-
-		if (add_to_index_cacheinfo(our_mode, our_blob, path))
-			return 1;
-		return checkout_from_index(path);
-	} else if (our_blob && their_blob)
-		return do_merge_one_file(orig_blob, our_blob, their_blob, path,
-					 orig_mode, our_mode, their_mode);
-	else {
-		char *orig_hex = "", *our_hex = "", *their_hex = "";
-
-		if (orig_blob)
-			orig_hex = oid_to_hex(orig_blob);
-		if (our_blob)
-			our_hex = oid_to_hex(our_blob);
-		if (their_blob)
-			their_hex = oid_to_hex(their_blob);
-
-		return error(_("%s: Not handling case %s -> %s -> %s"),
-			path, orig_hex, our_hex, their_hex);
-	}
-
-	return 0;
-}
+#include "merge-strategies.h"
 
 static const char builtin_merge_one_file_usage[] =
 	"git merge-one-file <orig blob> <our blob> <their blob> <path> "
@@ -244,8 +61,9 @@ int cmd_merge_one_file(int argc, const char **argv, const char *prefix)
 		their_mode = strtol(argv[7], NULL, 8);
 	}
 
-	ret = merge_one_file(p_orig_blob, p_our_blob, p_their_blob, argv[4],
-			     orig_mode, our_mode, their_mode);
+	ret = merge_strategies_one_file(the_repository,
+					p_orig_blob, p_our_blob, p_their_blob, argv[4],
+					orig_mode, our_mode, their_mode);
 
 	if (ret) {
 		rollback_lock_file(&lock);
diff --git a/merge-strategies.c b/merge-strategies.c
new file mode 100644
index 0000000000..3a9fce9f22
--- /dev/null
+++ b/merge-strategies.c
@@ -0,0 +1,191 @@
+#include "cache.h"
+#include "dir.h"
+#include "merge-strategies.h"
+#include "xdiff-interface.h"
+
+static int add_to_index_cacheinfo(struct index_state *istate,
+				  unsigned int mode,
+				  const struct object_id *oid, const char *path)
+{
+	struct cache_entry *ce;
+	int len, option;
+
+	if (!verify_path(path, mode))
+		return error(_("Invalid path '%s'"), path);
+
+	len = strlen(path);
+	ce = make_empty_cache_entry(istate, len);
+
+	oidcpy(&ce->oid, oid);
+	memcpy(ce->name, path, len);
+	ce->ce_flags = create_ce_flags(0);
+	ce->ce_namelen = len;
+	ce->ce_mode = create_ce_mode(mode);
+	if (assume_unchanged)
+		ce->ce_flags |= CE_VALID;
+	option = ADD_CACHE_OK_TO_ADD | ADD_CACHE_OK_TO_REPLACE;
+	if (add_index_entry(istate, ce, option))
+		return error(_("%s: cannot add to the index"), path);
+
+	return 0;
+}
+
+static int checkout_from_index(struct index_state *istate, const char *path)
+{
+	struct checkout state = CHECKOUT_INIT;
+	struct cache_entry *ce;
+
+	state.istate = istate;
+	state.force = 1;
+	state.base_dir = "";
+	state.base_dir_len = 0;
+
+	ce = index_file_exists(istate, path, strlen(path), 0);
+	if (checkout_entry(ce, &state, NULL, NULL) < 0)
+		return error(_("%s: cannot checkout file"), path);
+	return 0;
+}
+
+static int merge_one_file_deleted(struct index_state *istate,
+				  const struct object_id *orig_blob,
+				  const struct object_id *our_blob,
+				  const struct object_id *their_blob, const char *path,
+				  unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode)
+{
+	if ((our_blob && orig_mode != our_mode) ||
+	    (their_blob && orig_mode != their_mode))
+		return error(_("File %s deleted on one branch but had its "
+			       "permissions changed on the other."), path);
+
+	if (our_blob) {
+		printf("Removing %s\n", path);
+
+		if (file_exists(path))
+			remove_path(path);
+	}
+
+	if (remove_file_from_index(istate, path))
+		return error("%s: cannot remove from the index", path);
+	return 0;
+}
+
+static int do_merge_one_file(struct index_state *istate,
+			     const struct object_id *orig_blob,
+			     const struct object_id *our_blob,
+			     const struct object_id *their_blob, const char *path,
+			     unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode)
+{
+	int ret, i, dest;
+	mmbuffer_t result = {NULL, 0};
+	mmfile_t mmfs[3];
+	xmparam_t xmp = {{0}};
+	struct cache_entry *ce;
+
+	if (our_mode == S_IFLNK || their_mode == S_IFLNK)
+		return error(_("%s: Not merging symbolic link changes."), path);
+	else if (our_mode == S_IFGITLINK || their_mode == S_IFGITLINK)
+		return error(_("%s: Not merging conflicting submodule changes."), path);
+
+	read_mmblob(mmfs + 0, our_blob);
+	read_mmblob(mmfs + 2, their_blob);
+
+	if (orig_blob) {
+		printf("Auto-merging %s\n", path);
+		read_mmblob(mmfs + 1, orig_blob);
+	} else {
+		printf("Added %s in both, but differently.\n", path);
+		read_mmblob(mmfs + 1, &null_oid);
+	}
+
+	xmp.level = XDL_MERGE_ZEALOUS_ALNUM;
+	xmp.style = 0;
+	xmp.favor = 0;
+
+	ret = xdl_merge(mmfs + 1, mmfs + 0, mmfs + 2, &xmp, &result);
+
+	for (i = 0; i < 3; i++)
+		free(mmfs[i].ptr);
+
+	if (ret > 127)
+		ret = 1;
+
+	ce = index_file_exists(istate, path, strlen(path), 0);
+	if (!ce)
+		BUG("file is not present in the cache?");
+
+	unlink(path);
+	dest = open(path, O_WRONLY | O_CREAT, ce->ce_mode);
+	write_in_full(dest, result.ptr, result.size);
+	close(dest);
+
+	free(result.ptr);
+
+	if (ret) {
+		if (!orig_blob)
+			error(_("content conflict in %s"), path);
+		if (our_mode != their_mode)
+			error(_("permission conflict: %o->%o,%o in %s"),
+			      orig_mode, our_mode, their_mode, path);
+
+		return 1;
+	}
+
+	return add_file_to_index(istate, path, 0);
+}
+
+int merge_strategies_one_file(struct repository *r,
+			      const struct object_id *orig_blob,
+			      const struct object_id *our_blob,
+			      const struct object_id *their_blob, const char *path,
+			      unsigned int orig_mode, unsigned int our_mode,
+			      unsigned int their_mode)
+{
+	if (orig_blob &&
+	    ((our_blob && oideq(orig_blob, our_blob)) ||
+	     (their_blob && oideq(orig_blob, their_blob))))
+		return merge_one_file_deleted(r->index,
+					      orig_blob, our_blob, their_blob, path,
+					      orig_mode, our_mode, their_mode);
+	else if (!orig_blob && our_blob && !their_blob) {
+		return add_to_index_cacheinfo(r->index, our_mode, our_blob, path);
+	} else if (!orig_blob && !our_blob && their_blob) {
+		printf("Adding %s\n", path);
+
+		if (file_exists(path))
+			return error(_("untracked %s is overwritten by the merge."), path);
+
+		if (add_to_index_cacheinfo(r->index, their_mode, their_blob, path))
+			return 1;
+		return checkout_from_index(r->index, path);
+	} else if (!orig_blob && our_blob && their_blob &&
+		   oideq(our_blob, their_blob)) {
+		if (our_mode != their_mode)
+			return error(_("File %s added identically in both branches, "
+				       "but permissions conflict %o->%o."),
+				     path, our_mode, their_mode);
+
+		printf("Adding %s\n", path);
+
+		if (add_to_index_cacheinfo(r->index, our_mode, our_blob, path))
+			return 1;
+		return checkout_from_index(r->index, path);
+	} else if (our_blob && their_blob)
+		return do_merge_one_file(r->index,
+					 orig_blob, our_blob, their_blob, path,
+					 orig_mode, our_mode, their_mode);
+	else {
+		char *orig_hex = "", *our_hex = "", *their_hex = "";
+
+		if (orig_blob)
+			orig_hex = oid_to_hex(orig_blob);
+		if (our_blob)
+			our_hex = oid_to_hex(our_blob);
+		if (their_blob)
+			their_hex = oid_to_hex(their_blob);
+
+		return error(_("%s: Not handling case %s -> %s -> %s"),
+			path, orig_hex, our_hex, their_hex);
+	}
+
+	return 0;
+}
diff --git a/merge-strategies.h b/merge-strategies.h
new file mode 100644
index 0000000000..b527d145c7
--- /dev/null
+++ b/merge-strategies.h
@@ -0,0 +1,13 @@
+#ifndef MERGE_STRATEGIES_H
+#define MERGE_STRATEGIES_H
+
+#include "object.h"
+
+int merge_strategies_one_file(struct repository *r,
+			      const struct object_id *orig_blob,
+			      const struct object_id *our_blob,
+			      const struct object_id *their_blob, const char *path,
+			      unsigned int orig_mode, unsigned int our_mode,
+			      unsigned int their_mode);
+
+#endif /* MERGE_STRATEGIES_H */
-- 
2.27.0.139.gc9c318d6bf



* [RFC PATCH v1 06/17] merge-index: libify merge_one_path() and merge_all()
  2020-06-25 12:19 [RFC PATCH v1 00/17] Rewrite the remaining merge strategies from shell to C Alban Gruin
                   ` (4 preceding siblings ...)
  2020-06-25 12:19 ` [RFC PATCH v1 05/17] merge-one-file: libify merge_one_file() Alban Gruin
@ 2020-06-25 12:19 ` Alban Gruin
  2020-06-26 10:13   ` Phillip Wood
  2020-06-25 12:19 ` [RFC PATCH v1 07/17] merge-resolve: rewrite in C Alban Gruin
                   ` (11 subsequent siblings)
  17 siblings, 1 reply; 221+ messages in thread
From: Alban Gruin @ 2020-06-25 12:19 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Alban Gruin

The "resolve" and "octopus" merge strategies do not call `git
merge-one-file' directly; they delegate the work to another git command,
`git merge-index', which loops over the files in the index and calls the
specified command.  Unfortunately, the functions implementing this loop
are not part of libgit.a, which means that once rewritten, the strategies
would still have to invoke `merge-one-file' by spawning a new process.

To avoid this, this moves merge_one_path(), merge_all(), and their
helpers to merge-strategies.c.  They also take a callback to dictate
what they should do for each file.  For now, only one callback, which
launches a new process, is defined, to preserve the behaviour of the
builtin version.
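
The callback arrangement can be sketched roughly as follows, assuming a
plain array of paths in place of the index.  toy_merge_cb,
toy_merge_all(), and toy_count_cb() are hypothetical names; only the
`oneshot' handling mirrors what merge_all() does in this patch.

```c
#include <stddef.h>

/* Hypothetical per-entry callback; `data` carries caller-defined state. */
typedef int (*toy_merge_cb)(const char *path, void *data);

/*
 * Call `cb` for each path.  Stop at the first failure unless `oneshot`
 * is set, in which case keep going and return the number of failures.
 */
static int toy_merge_all(const char **paths, size_t nr, int oneshot,
			 toy_merge_cb cb, void *data)
{
	int err = 0;
	size_t i;

	for (i = 0; i < nr; i++) {
		if (!cb(paths[i], data))
			continue;
		if (!oneshot)
			return 1;
		err++;
	}
	return err;
}

/* Example callback: counts calls, fails on paths starting with '!'. */
static int toy_count_cb(const char *path, void *data)
{
	int *calls = data;

	(*calls)++;
	return path[0] == '!';
}
```

The `void *data' argument is what lets one loop serve both a callback
that spawns a program (passing the program name) and, later, one that
works in-process.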

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---

Notes:
    This patch is best viewed with `--color-moved'.

 builtin/merge-index.c | 77 +++------------------------------
 merge-strategies.c    | 99 +++++++++++++++++++++++++++++++++++++++++++
 merge-strategies.h    | 17 ++++++++
 3 files changed, 123 insertions(+), 70 deletions(-)

diff --git a/builtin/merge-index.c b/builtin/merge-index.c
index 38ea6ad6ca..6cb666cc78 100644
--- a/builtin/merge-index.c
+++ b/builtin/merge-index.c
@@ -1,74 +1,11 @@
 #define USE_THE_INDEX_COMPATIBILITY_MACROS
 #include "builtin.h"
-#include "run-command.h"
-
-static const char *pgm;
-static int one_shot, quiet;
-static int err;
-
-static int merge_entry(int pos, const char *path)
-{
-	int found;
-	const char *arguments[] = { pgm, "", "", "", path, "", "", "", NULL };
-	char hexbuf[4][GIT_MAX_HEXSZ + 1];
-	char ownbuf[4][60];
-
-	if (pos >= active_nr)
-		die("git merge-index: %s not in the cache", path);
-	found = 0;
-	do {
-		const struct cache_entry *ce = active_cache[pos];
-		int stage = ce_stage(ce);
-
-		if (strcmp(ce->name, path))
-			break;
-		found++;
-		oid_to_hex_r(hexbuf[stage], &ce->oid);
-		xsnprintf(ownbuf[stage], sizeof(ownbuf[stage]), "%o", ce->ce_mode);
-		arguments[stage] = hexbuf[stage];
-		arguments[stage + 4] = ownbuf[stage];
-	} while (++pos < active_nr);
-	if (!found)
-		die("git merge-index: %s not in the cache", path);
-
-	if (run_command_v_opt(arguments, 0)) {
-		if (one_shot)
-			err++;
-		else {
-			if (!quiet)
-				die("merge program failed");
-			exit(1);
-		}
-	}
-	return found;
-}
-
-static void merge_one_path(const char *path)
-{
-	int pos = cache_name_pos(path, strlen(path));
-
-	/*
-	 * If it already exists in the cache as stage0, it's
-	 * already merged and there is nothing to do.
-	 */
-	if (pos < 0)
-		merge_entry(-pos-1, path);
-}
-
-static void merge_all(void)
-{
-	int i;
-	for (i = 0; i < active_nr; i++) {
-		const struct cache_entry *ce = active_cache[i];
-		if (!ce_stage(ce))
-			continue;
-		i += merge_entry(i, ce->name)-1;
-	}
-}
+#include "merge-strategies.h"
 
 int cmd_merge_index(int argc, const char **argv, const char *prefix)
 {
-	int i, force_file = 0;
+	int i, force_file = 0, err = 0, one_shot = 0, quiet = 0;
+	const char *pgm;
 
 	/* Without this we cannot rely on waitpid() to tell
 	 * what happened to our children.
@@ -98,14 +35,14 @@ int cmd_merge_index(int argc, const char **argv, const char *prefix)
 				continue;
 			}
 			if (!strcmp(arg, "-a")) {
-				merge_all();
+				err |= merge_all(&the_index, one_shot, quiet,
+						 merge_program_cb, (void *)pgm);
 				continue;
 			}
 			die("git merge-index: unknown option %s", arg);
 		}
-		merge_one_path(arg);
+		err |= merge_one_path(&the_index, one_shot, quiet, arg,
+				      merge_program_cb, (void *)pgm);
 	}
-	if (err && !quiet)
-		die("merge program failed");
 	return err;
 }
diff --git a/merge-strategies.c b/merge-strategies.c
index 3a9fce9f22..f4c0b4acd6 100644
--- a/merge-strategies.c
+++ b/merge-strategies.c
@@ -1,6 +1,7 @@
 #include "cache.h"
 #include "dir.h"
 #include "merge-strategies.h"
+#include "run-command.h"
 #include "xdiff-interface.h"
 
 static int add_to_index_cacheinfo(struct index_state *istate,
@@ -189,3 +190,101 @@ int merge_strategies_one_file(struct repository *r,
 
 	return 0;
 }
+
+int merge_program_cb(const struct object_id *orig_blob,
+		     const struct object_id *our_blob,
+		     const struct object_id *their_blob, const char *path,
+		     unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
+		     void *data)
+{
+	char ownbuf[3][60] = {{0}};
+	const char *arguments[] = { (char *)data, "", "", "", path,
+				    ownbuf[0], ownbuf[1], ownbuf[2],
+				    NULL };
+
+	if (orig_blob)
+		arguments[1] = oid_to_hex(orig_blob);
+	if (our_blob)
+		arguments[2] = oid_to_hex(our_blob);
+	if (their_blob)
+		arguments[3] = oid_to_hex(their_blob);
+
+	xsnprintf(ownbuf[0], sizeof(ownbuf[0]), "%o", orig_mode);
+	xsnprintf(ownbuf[1], sizeof(ownbuf[1]), "%o", our_mode);
+	xsnprintf(ownbuf[2], sizeof(ownbuf[2]), "%o", their_mode);
+
+	return run_command_v_opt(arguments, 0);
+}
+
+static int merge_entry(struct index_state *istate, int quiet, int pos,
+		       const char *path, merge_cb cb, void *data)
+{
+	int found = 0;
+	const struct object_id *oids[3] = {NULL};
+	unsigned int modes[3] = {0};
+
+	do {
+		const struct cache_entry *ce = istate->cache[pos];
+		int stage = ce_stage(ce);
+
+		if (strcmp(ce->name, path))
+			break;
+		found++;
+		oids[stage - 1] = &ce->oid;
+		modes[stage - 1] = ce->ce_mode;
+	} while (++pos < istate->cache_nr);
+	if (!found)
+		return error(_("%s is not in the cache"), path);
+
+	if (cb(oids[0], oids[1], oids[2], path, modes[0], modes[1], modes[2], data)) {
+		if (!quiet)
+			error(_("Merge program failed"));
+		return -2;
+	}
+
+	return found;
+}
+
+int merge_one_path(struct index_state *istate, int oneshot, int quiet,
+		   const char *path, merge_cb cb, void *data)
+{
+	int pos = index_name_pos(istate, path, strlen(path)), ret;
+
+	/*
+	 * If it already exists in the cache as stage0, it's
+	 * already merged and there is nothing to do.
+	 */
+	if (pos < 0) {
+		ret = merge_entry(istate, quiet, -pos - 1, path, cb, data);
+		if (ret == -1)
+			return -1;
+		else if (ret == -2)
+			return 1;
+	}
+	return 0;
+}
+
+int merge_all(struct index_state *istate, int oneshot, int quiet,
+	      merge_cb cb, void *data)
+{
+	int err = 0, i, ret;
+	for (i = 0; i < istate->cache_nr; i++) {
+		const struct cache_entry *ce = istate->cache[i];
+		if (!ce_stage(ce))
+			continue;
+
+		ret = merge_entry(istate, quiet, i, ce->name, cb, data);
+		if (ret > 0)
+			i += ret - 1;
+		else if (ret == -1)
+			return -1;
+		else if (ret == -2) {
+			if (oneshot)
+				err++;
+			else
+				return 1;
+		}
+	}
+
+	return err;
+}
diff --git a/merge-strategies.h b/merge-strategies.h
index b527d145c7..cf78d7eaf4 100644
--- a/merge-strategies.h
+++ b/merge-strategies.h
@@ -10,4 +10,21 @@ int merge_strategies_one_file(struct repository *r,
 			      unsigned int orig_mode, unsigned int our_mode,
 			      unsigned int their_mode);
 
+typedef int (*merge_cb)(const struct object_id *orig_blob,
+			const struct object_id *our_blob,
+			const struct object_id *their_blob, const char *path,
+			unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
+			void *data);
+
+int merge_program_cb(const struct object_id *orig_blob,
+		     const struct object_id *our_blob,
+		     const struct object_id *their_blob, const char *path,
+		     unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
+		     void *data);
+
+int merge_one_path(struct index_state *istate, int oneshot, int quiet,
+		   const char *path, merge_cb cb, void *data);
+int merge_all(struct index_state *istate, int oneshot, int quiet,
+	      merge_cb cb, void *data);
+
 #endif /* MERGE_STRATEGIES_H */
-- 
2.27.0.139.gc9c318d6bf



* [RFC PATCH v1 07/17] merge-resolve: rewrite in C
  2020-06-25 12:19 [RFC PATCH v1 00/17] Rewrite the remaining merge strategies from shell to C Alban Gruin
                   ` (5 preceding siblings ...)
  2020-06-25 12:19 ` [RFC PATCH v1 06/17] merge-index: libify merge_one_path() and merge_all() Alban Gruin
@ 2020-06-25 12:19 ` Alban Gruin
  2020-06-25 12:19 ` [RFC PATCH v1 08/17] merge-resolve: remove calls to external processes Alban Gruin
                   ` (10 subsequent siblings)
  17 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2020-06-25 12:19 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Alban Gruin

This rewrites `git merge-resolve' from shell to C.  As with `git
merge-one-file', this port still uses external processes for operations
on the index and to call `git merge-one-file'.  This will be addressed
in the next two commits.

The parameters of merge_resolve() may be surprising at first glance:
why use a commit list for `bases' and `remote' when we could use an
oid array and a pointer to an oid?  Because, in a later commit,
try_merge_strategy() will be able to call merge_resolve() directly, and
it already uses a commit list for `bases' (`common') and
`remote' (`remoteheads'), and a string for `head_arg'.  To reduce
friction later, merge_resolve() takes the same types of parameters.

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 Makefile                |   2 +-
 builtin.h               |   1 +
 builtin/merge-resolve.c | 112 ++++++++++++++++++++++++++++++++++++++++
 git-merge-resolve.sh    |  54 -------------------
 git.c                   |   1 +
 5 files changed, 115 insertions(+), 55 deletions(-)
 create mode 100644 builtin/merge-resolve.c
 delete mode 100755 git-merge-resolve.sh

diff --git a/Makefile b/Makefile
index 1ab4d160cb..ccea651ac8 100644
--- a/Makefile
+++ b/Makefile
@@ -596,7 +596,6 @@ SCRIPT_SH += git-bisect.sh
 SCRIPT_SH += git-difftool--helper.sh
 SCRIPT_SH += git-filter-branch.sh
 SCRIPT_SH += git-merge-octopus.sh
-SCRIPT_SH += git-merge-resolve.sh
 SCRIPT_SH += git-mergetool.sh
 SCRIPT_SH += git-quiltimport.sh
 SCRIPT_SH += git-request-pull.sh
@@ -1092,6 +1091,7 @@ BUILTIN_OBJS += builtin/merge-index.o
 BUILTIN_OBJS += builtin/merge-one-file.o
 BUILTIN_OBJS += builtin/merge-ours.o
 BUILTIN_OBJS += builtin/merge-recursive.o
+BUILTIN_OBJS += builtin/merge-resolve.o
 BUILTIN_OBJS += builtin/merge-tree.o
 BUILTIN_OBJS += builtin/merge.o
 BUILTIN_OBJS += builtin/mktag.o
diff --git a/builtin.h b/builtin.h
index 9205d5ecdc..6ea207c9fd 100644
--- a/builtin.h
+++ b/builtin.h
@@ -174,6 +174,7 @@ int cmd_merge_ours(int argc, const char **argv, const char *prefix);
 int cmd_merge_file(int argc, const char **argv, const char *prefix);
 int cmd_merge_one_file(int argc, const char **argv, const char *prefix);
 int cmd_merge_recursive(int argc, const char **argv, const char *prefix);
+int cmd_merge_resolve(int argc, const char **argv, const char *prefix);
 int cmd_merge_tree(int argc, const char **argv, const char *prefix);
 int cmd_mktag(int argc, const char **argv, const char *prefix);
 int cmd_mktree(int argc, const char **argv, const char *prefix);
diff --git a/builtin/merge-resolve.c b/builtin/merge-resolve.c
new file mode 100644
index 0000000000..c66fef7b7f
--- /dev/null
+++ b/builtin/merge-resolve.c
@@ -0,0 +1,112 @@
+/*
+ * Builtin "git merge-resolve"
+ *
+ * Copyright (c) 2020 Alban Gruin
+ *
+ * Based on git-merge-resolve.sh, written by Linus Torvalds and Junio C
+ * Hamano.
+ *
+ * Resolve two trees, using enhanced multi-base read-tree.
+ */
+
+#include "cache.h"
+#include "builtin.h"
+#include "run-command.h"
+
+static int merge_resolve(struct commit_list *bases, const char *head_arg,
+			 struct commit_list *remote)
+{
+	struct commit_list *j;
+	struct child_process cp_update = CHILD_PROCESS_INIT,
+		cp_read = CHILD_PROCESS_INIT,
+		cp_write = CHILD_PROCESS_INIT;
+
+	cp_update.git_cmd = 1;
+	argv_array_pushl(&cp_update.args, "update-index", "-q", "--refresh", NULL);
+	run_command(&cp_update);
+
+	cp_read.git_cmd = 1;
+	argv_array_pushl(&cp_read.args, "read-tree", "-u", "-m", "--aggressive", NULL);
+
+	for (j = bases; j && j->item; j = j->next)
+		argv_array_push(&cp_read.args, oid_to_hex(&j->item->object.oid));
+
+	if (head_arg)
+		argv_array_push(&cp_read.args, head_arg);
+	if (remote && remote->item)
+		argv_array_push(&cp_read.args, oid_to_hex(&remote->item->object.oid));
+
+	if (run_command(&cp_read))
+		return 2;
+
+	puts("Trying simple merge.");
+
+	cp_write.git_cmd = 1;
+	cp_write.no_stdout = 1;
+	cp_write.no_stderr = 1;
+	argv_array_push(&cp_write.args, "write-tree");
+	if (run_command(&cp_write)) {
+		struct child_process cp_merge = CHILD_PROCESS_INIT;
+
+		puts("Simple merge failed, trying Automatic merge.");
+
+		cp_merge.git_cmd = 1;
+		argv_array_pushl(&cp_merge.args, "merge-index", "-o",
+				 "git-merge-one-file", "-a", NULL);
+		if (run_command(&cp_merge))
+			return 1;
+	}
+
+	return 0;
+}
+
+static const char builtin_merge_resolve_usage[] =
+	"git merge-resolve <bases>... -- <head> <remote>";
+
+int cmd_merge_resolve(int argc, const char **argv, const char *prefix)
+{
+	int i, is_baseless = 1, sep_seen = 0;
+	const char *head = NULL;
+	struct commit_list *bases = NULL, *remote = NULL;
+	struct commit_list **next_base = &bases;
+
+	if (argc < 5)
+		usage(builtin_merge_resolve_usage);
+
+	/* The first parameters up to -- are merge bases; the rest are
+	 * heads. */
+	for (i = 1; i < argc; i++) {
+		if (strcmp(argv[i], "--") == 0)
+			sep_seen = 1;
+		else if (strcmp(argv[i], "-h") == 0)
+			usage(builtin_merge_resolve_usage);
+		else if (sep_seen && !head)
+			head = argv[i];
+		else if (remote) {
+			/* Give up if we are given two or more remotes.
+			 * Not handling octopus. */
+			return 2;
+		} else {
+			struct object_id oid;
+
+			get_oid(argv[i], &oid);
+			is_baseless &= sep_seen;
+
+			if (!oideq(&oid, the_hash_algo->empty_tree)) {
+				struct commit *commit;
+				commit = lookup_commit_or_die(&oid, argv[i]);
+
+				if (sep_seen)
+					commit_list_append(commit, &remote);
+				else
+					next_base = commit_list_append(commit, next_base);
+			}
+		}
+	}
+
+	/* Give up if this is a baseless merge. */
+	if (is_baseless)
+		return 2;
+
+	return merge_resolve(bases, head, remote);
+}
diff --git a/git-merge-resolve.sh b/git-merge-resolve.sh
deleted file mode 100755
index 343fe7bccd..0000000000
--- a/git-merge-resolve.sh
+++ /dev/null
@@ -1,54 +0,0 @@
-#!/bin/sh
-#
-# Copyright (c) 2005 Linus Torvalds
-# Copyright (c) 2005 Junio C Hamano
-#
-# Resolve two trees, using enhanced multi-base read-tree.
-
-# The first parameters up to -- are merge bases; the rest are heads.
-bases= head= remotes= sep_seen=
-for arg
-do
-	case ",$sep_seen,$head,$arg," in
-	*,--,)
-		sep_seen=yes
-		;;
-	,yes,,*)
-		head=$arg
-		;;
-	,yes,*)
-		remotes="$remotes$arg "
-		;;
-	*)
-		bases="$bases$arg "
-		;;
-	esac
-done
-
-# Give up if we are given two or more remotes -- not handling octopus.
-case "$remotes" in
-?*' '?*)
-	exit 2 ;;
-esac
-
-# Give up if this is a baseless merge.
-if test '' = "$bases"
-then
-	exit 2
-fi
-
-git update-index -q --refresh
-git read-tree -u -m --aggressive $bases $head $remotes || exit 2
-echo "Trying simple merge."
-if result_tree=$(git write-tree 2>/dev/null)
-then
-	exit 0
-else
-	echo "Simple merge failed, trying Automatic merge."
-	if git merge-index -o git-merge-one-file -a
-	then
-		exit 0
-	else
-		exit 1
-	fi
-fi
diff --git a/git.c b/git.c
index 058d91a2a5..2e92019493 100644
--- a/git.c
+++ b/git.c
@@ -536,6 +536,7 @@ static struct cmd_struct commands[] = {
 	{ "merge-recursive", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
 	{ "merge-recursive-ours", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
 	{ "merge-recursive-theirs", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
+	{ "merge-resolve", cmd_merge_resolve, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
 	{ "merge-subtree", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
 	{ "merge-tree", cmd_merge_tree, RUN_SETUP | NO_PARSEOPT },
 	{ "mktag", cmd_mktag, RUN_SETUP | NO_PARSEOPT },
-- 
2.27.0.139.gc9c318d6bf



* [RFC PATCH v1 08/17] merge-resolve: remove calls to external processes
  2020-06-25 12:19 [RFC PATCH v1 00/17] Rewrite the remaining merge strategies from shell to C Alban Gruin
                   ` (6 preceding siblings ...)
  2020-06-25 12:19 ` [RFC PATCH v1 07/17] merge-resolve: rewrite in C Alban Gruin
@ 2020-06-25 12:19 ` Alban Gruin
  2020-06-25 12:19 ` [RFC PATCH v1 09/17] merge-resolve: libify merge_resolve() Alban Gruin
                   ` (9 subsequent siblings)
  17 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2020-06-25 12:19 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Alban Gruin

This removes calls to external processes to avoid reading and writing
the index over and over again.

 - The call to `update-index -q --refresh' is replaced by a call to
   refresh_index().

 - The call to `read-tree' is replaced by a call to unpack_trees() (and
   all the setup needed).

 - The call to `write-tree' is replaced by a call to
   write_index_as_tree().

 - The call to `merge-index', needed to invoke `git merge-one-file', is
   replaced by a call to the new merge_all() function.  A callback
   function, merge_one_file_cb(), is added to allow it to call
   merge_one_file() without forking.

Here too, the index is read in cmd_merge_resolve(), but merge_resolve()
takes care of writing it back to disk.

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
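For reviewers: the read-tree replacement picks its merge function from
the number of trees handed to unpack_trees().  Below is a minimal
standalone sketch of that selection.  The `sketch_*' names and the enum
are illustrative stand-ins and not part of git; the patch itself
assigns oneway_merge, twoway_merge or threeway_merge to opts.fn.

```c
#include <assert.h>

/* Illustrative stand-ins for the oneway/twoway/threeway merge functions. */
enum sketch_merge_kind {
	SKETCH_NONE,
	SKETCH_ONEWAY,
	SKETCH_TWOWAY,
	SKETCH_THREEWAY
};

/* The two settings that depend on the number of trees. */
struct sketch_setup {
	enum sketch_merge_kind kind;
	int head_idx;
};

/*
 * Mirror the if/else chain in merge_resolve(): head_idx defaults to 1
 * and is only changed when three or more trees are unpacked.
 */
struct sketch_setup sketch_pick_merge(int nr_trees)
{
	struct sketch_setup s = { SKETCH_NONE, 1 };

	assert(nr_trees >= 0);

	if (nr_trees == 1) {
		s.kind = SKETCH_ONEWAY;
	} else if (nr_trees == 2) {
		s.kind = SKETCH_TWOWAY;
	} else if (nr_trees >= 3) {
		s.kind = SKETCH_THREEWAY;
		s.head_idx = nr_trees - 1;
	}

	return s;
}
```

With one base, the head and one remote, nr_trees is 3, so the sketch
selects the three-way merge with head_idx set to 2, matching
`opts.head_idx = i - 1' in the patch.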
 builtin/merge-resolve.c | 103 ++++++++++++++++++++++++++++------------
 merge-strategies.c      |  11 +++++
 merge-strategies.h      |   6 +++
 3 files changed, 89 insertions(+), 31 deletions(-)

diff --git a/builtin/merge-resolve.c b/builtin/merge-resolve.c
index c66fef7b7f..2c364fcdb0 100644
--- a/builtin/merge-resolve.c
+++ b/builtin/merge-resolve.c
@@ -10,54 +10,91 @@
  */
 
 #include "cache.h"
+#include "cache-tree.h"
 #include "builtin.h"
-#include "run-command.h"
+#include "lockfile.h"
+#include "merge-strategies.h"
+#include "unpack-trees.h"
+
+static int add_tree(const struct object_id *oid, struct tree_desc *t)
+{
+	struct tree *tree;
+
+	tree = parse_tree_indirect(oid);
+	if (parse_tree(tree))
+		return -1;
+
+	init_tree_desc(t, tree->buffer, tree->size);
+	return 0;
+}
 
 static int merge_resolve(struct commit_list *bases, const char *head_arg,
 			 struct commit_list *remote)
 {
+	int i = 0;
+	struct lock_file lock = LOCK_INIT;
+	struct tree_desc t[MAX_UNPACK_TREES];
+	struct unpack_trees_options opts;
+	struct object_id head, oid;
 	struct commit_list *j;
-	struct child_process cp_update = CHILD_PROCESS_INIT,
-		cp_read = CHILD_PROCESS_INIT,
-		cp_write = CHILD_PROCESS_INIT;
-
-	cp_update.git_cmd = 1;
-	argv_array_pushl(&cp_update.args, "update-index", "-q", "--refresh", NULL);
-	run_command(&cp_update);
-
-	cp_read.git_cmd = 1;
-	argv_array_pushl(&cp_read.args, "read-tree", "-u", "-m", "--aggressive", NULL);
-
-	for (j = bases; j && j->item; j = j->next)
-		argv_array_push(&cp_read.args, oid_to_hex(&j->item->object.oid));
 
 	if (head_arg)
-		argv_array_push(&cp_read.args, head_arg);
-	if (remote && remote->item)
-		argv_array_push(&cp_read.args, oid_to_hex(&remote->item->object.oid));
+		get_oid(head_arg, &head);
 
-	if (run_command(&cp_read))
-		return 2;
+	repo_hold_locked_index(the_repository, &lock, LOCK_DIE_ON_ERROR);
+	refresh_index(the_repository->index, 0, NULL, NULL, NULL);
+
+	memset(&opts, 0, sizeof(opts));
+	opts.head_idx = 1;
+	opts.src_index = the_repository->index;
+	opts.dst_index = the_repository->index;
+	opts.update = 1;
+	opts.merge = 1;
+	opts.aggressive = 1;
+
+	for (j = bases; j; j = j->next) {
+		if (add_tree(&j->item->object.oid, t + (i++)))
+			goto out;
+	}
+
+	if (head_arg && add_tree(&head, t + (i++)))
+		goto out;
+	if (remote && add_tree(&remote->item->object.oid, t + (i++)))
+		goto out;
+
+	if (i == 1)
+		opts.fn = oneway_merge;
+	else if (i == 2) {
+		opts.fn = twoway_merge;
+		opts.initial_checkout = is_index_unborn(the_repository->index);
+	} else if (i >= 3) {
+		opts.fn = threeway_merge;
+		opts.head_idx = i - 1;
+	}
+
+	if (unpack_trees(i, t, &opts))
+		goto out;
 
 	puts("Trying simple merge.");
+	write_locked_index(the_repository->index, &lock, COMMIT_LOCK);
 
-	cp_write.git_cmd = 1;
-	cp_write.no_stdout = 1;
-	cp_write.no_stderr = 1;
-	argv_array_push(&cp_write.args, "write-tree");
-	if (run_command(&cp_write)) {
-		struct child_process cp_merge = CHILD_PROCESS_INIT;
+	if (write_index_as_tree(&oid, the_repository->index,
+				the_repository->index_file, 0, NULL)) {
+		int ret;
 
-		puts("Simple merge failed, trying Automatic merge.");
+		repo_hold_locked_index(the_repository, &lock, LOCK_DIE_ON_ERROR);
+		ret = merge_all(the_repository->index, 0, 0,
+				merge_one_file_cb, the_repository);
 
-		cp_merge.git_cmd = 1;
-		argv_array_pushl(&cp_merge.args, "merge-index", "-o",
-				 "git-merge-one-file", "-a", NULL);
-		if (run_command(&cp_merge))
-			return 1;
+		write_locked_index(the_repository->index, &lock, COMMIT_LOCK);
+		return !!ret;
 	}
 
 	return 0;
+
+ out:
+	rollback_lock_file(&lock);
+	return 2;
 }
 
 static const char builtin_merge_resolve_usage[] =
@@ -73,6 +110,10 @@ int cmd_merge_resolve(int argc, const char **argv, const char *prefix)
 	if (argc < 5)
 		usage(builtin_merge_resolve_usage);
 
+	setup_work_tree();
+	if (repo_read_index(the_repository) < 0)
+		die("invalid index");
+
 	/* The first parameters up to -- are merge bases; the rest are
 	 * heads. */
 	for (i = 1; i < argc; i++) {
diff --git a/merge-strategies.c b/merge-strategies.c
index f4c0b4acd6..39bfa1af7b 100644
--- a/merge-strategies.c
+++ b/merge-strategies.c
@@ -191,6 +191,17 @@ int merge_strategies_one_file(struct repository *r,
 	return 0;
 }
 
+int merge_one_file_cb(const struct object_id *orig_blob,
+		      const struct object_id *our_blob,
+		      const struct object_id *their_blob, const char *path,
+		      unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
+		      void *data)
+{
+	return merge_strategies_one_file((struct repository *)data,
+					 orig_blob, our_blob, their_blob, path,
+					 orig_mode, our_mode, their_mode);
+}
+
 int merge_program_cb(const struct object_id *orig_blob,
 		     const struct object_id *our_blob,
 		     const struct object_id *their_blob, const char *path,
diff --git a/merge-strategies.h b/merge-strategies.h
index cf78d7eaf4..40e175ca39 100644
--- a/merge-strategies.h
+++ b/merge-strategies.h
@@ -16,6 +16,12 @@ typedef int (*merge_cb)(const struct object_id *orig_blob,
 			unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
 			void *data);
 
+int merge_one_file_cb(const struct object_id *orig_blob,
+		      const struct object_id *our_blob,
+		      const struct object_id *their_blob, const char *path,
+		      unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
+		      void *data);
+
 int merge_program_cb(const struct object_id *orig_blob,
 		     const struct object_id *our_blob,
 		     const struct object_id *their_blob, const char *path,
-- 
2.27.0.139.gc9c318d6bf



* [RFC PATCH v1 09/17] merge-resolve: libify merge_resolve()
  2020-06-25 12:19 [RFC PATCH v1 00/17] Rewrite the remaining merge strategies from shell to C Alban Gruin
                   ` (7 preceding siblings ...)
  2020-06-25 12:19 ` [RFC PATCH v1 08/17] merge-resolve: remove calls to external processes Alban Gruin
@ 2020-06-25 12:19 ` Alban Gruin
  2020-06-25 12:19 ` [RFC PATCH v1 10/17] merge-recursive: move better_branch_name() to merge.c Alban Gruin
                   ` (8 subsequent siblings)
  17 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2020-06-25 12:19 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Alban Gruin

This moves merge_resolve() (and its helper functions) to
merge-strategies.c.  This will enable `git merge' and the sequencer to
directly call it instead of forking.

Here too, this is not a faithful copy-and-paste; the new
merge_resolve() (renamed merge_strategies_resolve()) takes a pointer to
the repository, instead of using `the_repository'.

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---

Notes:
    This patch is best viewed with `--color-moved'.
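
To illustrate why taking a repository pointer helps: the pointer can be
threaded through merge_all() as the opaque `data' argument and
recovered in the callback, so no global is consulted.  This is a
minimal standalone sketch; every `sketch_*' type and function is
hypothetical and only stands in for git's index entries, struct
repository and merge_cb.  Returning the number of failed callbacks is
an assumption of the sketch, not a statement about merge_all().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct repository. */
struct sketch_repo {
	int files_merged;
};

/* Hypothetical stand-in for an index entry. */
struct sketch_entry {
	const char *path;
	int unmerged;
};

/* Same shape as a merge callback: per-path arguments plus opaque data. */
typedef int (*sketch_merge_cb)(const struct sketch_entry *e, void *data);

/* Call cb on every unmerged entry; return the number of failures. */
int sketch_merge_all(const struct sketch_entry *entries, size_t nr,
		     sketch_merge_cb cb, void *data)
{
	size_t i;
	int err = 0;

	for (i = 0; i < nr; i++) {
		if (!entries[i].unmerged)
			continue;
		if (cb(&entries[i], data))
			err++;
	}

	return err;
}

/* The worker takes a typed repository pointer instead of a global. */
static int sketch_merge_one_file(struct sketch_repo *r,
				 const struct sketch_entry *e)
{
	if (!e->path)
		return -1;
	r->files_merged++;
	return 0;
}

/* Adapter: recover the repository from the opaque pointer. */
int sketch_merge_one_file_cb(const struct sketch_entry *e, void *data)
{
	assert(data != NULL);
	return sketch_merge_one_file((struct sketch_repo *)data, e);
}
```

A caller would pass its own repository object as the last argument of
sketch_merge_all(), in the same way merge_strategies_resolve() passes
`r' to merge_all().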

 builtin/merge-resolve.c | 86 +----------------------------------------
 merge-strategies.c      | 85 ++++++++++++++++++++++++++++++++++++++++
 merge-strategies.h      |  5 +++
 3 files changed, 91 insertions(+), 85 deletions(-)

diff --git a/builtin/merge-resolve.c b/builtin/merge-resolve.c
index 2c364fcdb0..59f734473b 100644
--- a/builtin/merge-resolve.c
+++ b/builtin/merge-resolve.c
@@ -10,92 +10,8 @@
  */
 
 #include "cache.h"
-#include "cache-tree.h"
 #include "builtin.h"
-#include "lockfile.h"
 #include "merge-strategies.h"
-#include "unpack-trees.h"
-
-static int add_tree(const struct object_id *oid, struct tree_desc *t)
-{
-	struct tree *tree;
-
-	tree = parse_tree_indirect(oid);
-	if (parse_tree(tree))
-		return -1;
-
-	init_tree_desc(t, tree->buffer, tree->size);
-	return 0;
-}
-
-static int merge_resolve(struct commit_list *bases, const char *head_arg,
-			 struct commit_list *remote)
-{
-	int i = 0;
-	struct lock_file lock = LOCK_INIT;
-	struct tree_desc t[MAX_UNPACK_TREES];
-	struct unpack_trees_options opts;
-	struct object_id head, oid;
-	struct commit_list *j;
-
-	if (head_arg)
-		get_oid(head_arg, &head);
-
-	repo_hold_locked_index(the_repository, &lock, LOCK_DIE_ON_ERROR);
-	refresh_index(the_repository->index, 0, NULL, NULL, NULL);
-
-	memset(&opts, 0, sizeof(opts));
-	opts.head_idx = 1;
-	opts.src_index = the_repository->index;
-	opts.dst_index = the_repository->index;
-	opts.update = 1;
-	opts.merge = 1;
-	opts.aggressive = 1;
-
-	for (j = bases; j; j = j->next) {
-		if (add_tree(&j->item->object.oid, t + (i++)))
-			goto out;
-	}
-
-	if (head_arg && add_tree(&head, t + (i++)))
-		goto out;
-	if (remote && add_tree(&remote->item->object.oid, t + (i++)))
-		goto out;
-
-	if (i == 1)
-		opts.fn = oneway_merge;
-	else if (i == 2) {
-		opts.fn = twoway_merge;
-		opts.initial_checkout = is_index_unborn(the_repository->index);
-	} else if (i >= 3) {
-		opts.fn = threeway_merge;
-		opts.head_idx = i - 1;
-	}
-
-	if (unpack_trees(i, t, &opts))
-		goto out;
-
-	puts("Trying simple merge.");
-	write_locked_index(the_repository->index, &lock, COMMIT_LOCK);
-
-	if (write_index_as_tree(&oid, the_repository->index,
-				the_repository->index_file, 0, NULL)) {
-		int ret;
-
-		repo_hold_locked_index(the_repository, &lock, LOCK_DIE_ON_ERROR);
-		ret = merge_all(the_repository->index, 0, 0,
-				merge_one_file_cb, the_repository);
-
-		write_locked_index(the_repository->index, &lock, COMMIT_LOCK);
-		return !!ret;
-	}
-
-	return 0;
-
- out:
-	rollback_lock_file(&lock);
-	return 2;
-}
 
 static const char builtin_merge_resolve_usage[] =
 	"git merge-resolve <bases>... -- <head> <remote>";
@@ -149,5 +65,5 @@ int cmd_merge_resolve(int argc, const char **argv, const char *prefix)
 	if (is_baseless)
 		return 2;
 
-	return merge_resolve(bases, head, remote);
+	return merge_strategies_resolve(the_repository, bases, head, remote);
 }
diff --git a/merge-strategies.c b/merge-strategies.c
index 39bfa1af7b..a12c575590 100644
--- a/merge-strategies.c
+++ b/merge-strategies.c
@@ -1,7 +1,10 @@
 #include "cache.h"
+#include "cache-tree.h"
 #include "dir.h"
+#include "lockfile.h"
 #include "merge-strategies.h"
 #include "run-command.h"
+#include "unpack-trees.h"
 #include "xdiff-interface.h"
 
 static int add_to_index_cacheinfo(struct index_state *istate,
@@ -299,3 +302,85 @@ int merge_all(struct index_state *istate, int oneshot, int quiet,
 
 	return err;
 }
+
+static int add_tree(const struct object_id *oid, struct tree_desc *t)
+{
+	struct tree *tree;
+
+	tree = parse_tree_indirect(oid);
+	if (parse_tree(tree))
+		return -1;
+
+	init_tree_desc(t, tree->buffer, tree->size);
+	return 0;
+}
+
+int merge_strategies_resolve(struct repository *r,
+			     struct commit_list *bases, const char *head_arg,
+			     struct commit_list *remote)
+{
+	int i = 0;
+	struct lock_file lock = LOCK_INIT;
+	struct tree_desc t[MAX_UNPACK_TREES];
+	struct unpack_trees_options opts;
+	struct object_id head, oid;
+	struct commit_list *j;
+
+	if (head_arg)
+		get_oid(head_arg, &head);
+
+	repo_hold_locked_index(r, &lock, LOCK_DIE_ON_ERROR);
+	refresh_index(r->index, 0, NULL, NULL, NULL);
+
+	memset(&opts, 0, sizeof(opts));
+	opts.head_idx = 1;
+	opts.src_index = r->index;
+	opts.dst_index = r->index;
+	opts.update = 1;
+	opts.merge = 1;
+	opts.aggressive = 1;
+
+	for (j = bases; j && j->item; j = j->next) {
+		if (add_tree(&j->item->object.oid, t + (i++)))
+			goto out;
+	}
+
+	if (head_arg && add_tree(&head, t + (i++)))
+		goto out;
+	if (remote && add_tree(&remote->item->object.oid, t + (i++)))
+		goto out;
+
+	if (i == 1)
+		opts.fn = oneway_merge;
+	else if (i == 2) {
+		opts.fn = twoway_merge;
+		opts.initial_checkout = is_index_unborn(r->index);
+	} else if (i >= 3) {
+		opts.fn = threeway_merge;
+		opts.head_idx = i - 1;
+	}
+
+	if (unpack_trees(i, t, &opts))
+		goto out;
+
+	puts("Trying simple merge.");
+	write_locked_index(r->index, &lock, COMMIT_LOCK);
+
+	if (write_index_as_tree(&oid, r->index, r->index_file,
+				WRITE_TREE_SILENT, NULL)) {
+		int ret;
+
+		puts("Simple merge failed, trying Automatic merge.");
+		repo_hold_locked_index(r, &lock, LOCK_DIE_ON_ERROR);
+		ret = merge_all(r->index, 0, 0, merge_one_file_cb, r);
+
+		write_locked_index(r->index, &lock, COMMIT_LOCK);
+		return !!ret;
+	}
+
+	return 0;
+
+ out:
+	rollback_lock_file(&lock);
+	return 2;
+}
diff --git a/merge-strategies.h b/merge-strategies.h
index 40e175ca39..778f8ce9d6 100644
--- a/merge-strategies.h
+++ b/merge-strategies.h
@@ -1,6 +1,7 @@
 #ifndef MERGE_STRATEGIES_H
 #define MERGE_STRATEGIES_H
 
+#include "commit.h"
 #include "object.h"
 
 int merge_strategies_one_file(struct repository *r,
@@ -33,4 +34,8 @@ int merge_one_path(struct index_state *istate, int oneshot, int quiet,
 int merge_all(struct index_state *istate, int oneshot, int quiet,
 	      merge_cb cb, void *data);
 
+int merge_strategies_resolve(struct repository *r,
+			     struct commit_list *bases, const char *head_arg,
+			     struct commit_list *remote);
+
 #endif /* MERGE_STRATEGIES_H */
-- 
2.27.0.139.gc9c318d6bf



* [RFC PATCH v1 10/17] merge-recursive: move better_branch_name() to merge.c
  2020-06-25 12:19 [RFC PATCH v1 00/17] Rewrite the remaining merge strategies from shell to C Alban Gruin
                   ` (8 preceding siblings ...)
  2020-06-25 12:19 ` [RFC PATCH v1 09/17] merge-resolve: libify merge_resolve() Alban Gruin
@ 2020-06-25 12:19 ` Alban Gruin
  2020-06-25 12:19 ` [RFC PATCH v1 11/17] merge-octopus: rewrite in C Alban Gruin
                   ` (7 subsequent siblings)
  17 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2020-06-25 12:19 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Alban Gruin

better_branch_name() will be used by merge-octopus once it is rewritten
in C, so instead of duplicating it, this moves the function
preemptively into an appropriate file in libgit.a.  The function is
also renamed to merge_get_better_branch_name() to reflect its usage by
merge strategies.

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---

Notes:
    This patch is best viewed with `--color-moved'.
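
For readers unfamiliar with the helper, here is a minimal standalone
sketch of the lookup it performs: when the argument has the length of a
full hexadecimal object name, the GITHEAD_<hex> environment variable
supplies a friendlier name if it is set.  The sketch assumes SHA-1 (40
hex digits) and uses plain libc instead of xstrdup() and xsnprintf();
`sketch_better_branch_name' and `SKETCH_HEXSZ' are hypothetical names.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Assumption: SHA-1 object names, i.e. 40 hexadecimal characters. */
#define SKETCH_HEXSZ 40

/*
 * Return a newly allocated display name for `branch'.  The caller
 * owns the result and must free() it.
 */
char *sketch_better_branch_name(const char *branch)
{
	char env_name[8 + SKETCH_HEXSZ + 1]; /* "GITHEAD_" + hex + NUL */
	const char *name = branch;
	char *copy;

	assert(branch != NULL);

	/* Only full-length object names are looked up in the environment. */
	if (strlen(branch) == SKETCH_HEXSZ) {
		const char *value;

		snprintf(env_name, sizeof(env_name), "GITHEAD_%s", branch);
		value = getenv(env_name);
		if (value)
			name = value;
	}

	copy = malloc(strlen(name) + 1);
	if (!copy)
		abort();
	strcpy(copy, name);
	return copy;
}
```

Anything that is not a full-length object name is returned unchanged,
so a branch name such as "topic" never triggers an environment lookup.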

 builtin/merge-recursive.c | 16 ++--------------
 cache.h                   |  2 +-
 merge.c                   | 12 ++++++++++++
 3 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/builtin/merge-recursive.c b/builtin/merge-recursive.c
index a4bfd8fc51..972243b5e9 100644
--- a/builtin/merge-recursive.c
+++ b/builtin/merge-recursive.c
@@ -8,18 +8,6 @@
 static const char builtin_merge_recursive_usage[] =
 	"git %s <base>... -- <head> <remote> ...";
 
-static char *better_branch_name(const char *branch)
-{
-	static char githead_env[8 + GIT_MAX_HEXSZ + 1];
-	char *name;
-
-	if (strlen(branch) != the_hash_algo->hexsz)
-		return xstrdup(branch);
-	xsnprintf(githead_env, sizeof(githead_env), "GITHEAD_%s", branch);
-	name = getenv(githead_env);
-	return xstrdup(name ? name : branch);
-}
-
 int cmd_merge_recursive(int argc, const char **argv, const char *prefix)
 {
 	const struct object_id *bases[21];
@@ -75,8 +63,8 @@ int cmd_merge_recursive(int argc, const char **argv, const char *prefix)
 	if (get_oid(o.branch2, &h2))
 		die(_("could not resolve ref '%s'"), o.branch2);
 
-	o.branch1 = better1 = better_branch_name(o.branch1);
-	o.branch2 = better2 = better_branch_name(o.branch2);
+	o.branch1 = better1 = merge_get_better_branch_name(o.branch1);
+	o.branch2 = better2 = merge_get_better_branch_name(o.branch2);
 
 	if (o.verbosity >= 3)
 		printf(_("Merging %s with %s\n"), o.branch1, o.branch2);
diff --git a/cache.h b/cache.h
index 0f0485ecfe..bbbd8e352d 100644
--- a/cache.h
+++ b/cache.h
@@ -1915,7 +1915,7 @@ int checkout_fast_forward(struct repository *r,
 			  const struct object_id *from,
 			  const struct object_id *to,
 			  int overwrite_ignore);
-
+char *merge_get_better_branch_name(const char *branch);
 
 int sane_execvp(const char *file, char *const argv[]);
 
diff --git a/merge.c b/merge.c
index aa36de2f64..5f3d05268f 100644
--- a/merge.c
+++ b/merge.c
@@ -108,3 +108,15 @@ int checkout_fast_forward(struct repository *r,
 		return error(_("unable to write new index file"));
 	return 0;
 }
+
+char *merge_get_better_branch_name(const char *branch)
+{
+	static char githead_env[8 + GIT_MAX_HEXSZ + 1];
+	char *name;
+
+	if (strlen(branch) != the_hash_algo->hexsz)
+		return xstrdup(branch);
+	xsnprintf(githead_env, sizeof(githead_env), "GITHEAD_%s", branch);
+	name = getenv(githead_env);
+	return xstrdup(name ? name : branch);
+}
-- 
2.27.0.139.gc9c318d6bf



* [RFC PATCH v1 11/17] merge-octopus: rewrite in C
  2020-06-25 12:19 [RFC PATCH v1 00/17] Rewrite the remaining merge strategies from shell to C Alban Gruin
                   ` (9 preceding siblings ...)
  2020-06-25 12:19 ` [RFC PATCH v1 10/17] merge-recursive: move better_branch_name() to merge.c Alban Gruin
@ 2020-06-25 12:19 ` Alban Gruin
  2020-06-25 12:19 ` [RFC PATCH v1 12/17] merge-octopus: remove calls to external processes Alban Gruin
                   ` (6 subsequent siblings)
  17 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2020-06-25 12:19 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Alban Gruin

This rewrites `git merge-octopus' from shell to C.  As with the previous
two conversions, this port keeps using external processes for operations
on the index, or to call `git merge-one-file'.  This will be addressed
in the next two commits.

Here too, merge_octopus() takes two commit lists and a string to reduce
friction when try_merge_strategy() is modified to call it directly.

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
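As a reading aid, the command line handled here has the shape
`<bases>... -- <head> <remotes>...'.  The following minimal standalone
sketch only partitions the argument strings around the first `--'.
Unlike cmd_merge_octopus(), it does not resolve anything to commits or
skip the empty tree.  `struct sketch_merge_args' and both functions are
hypothetical names used for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical view of the three argument groups; points into argv. */
struct sketch_merge_args {
	const char **bases;
	size_t nr_bases;
	const char *head;
	const char **remotes;
	size_t nr_remotes;
};

/*
 * Split argv[1..argc-1] around the first "--".
 * Return 0 on success, -1 if the separator or the head is missing.
 */
int sketch_split_args(int argc, const char **argv,
		      struct sketch_merge_args *out)
{
	int i, sep = -1;

	assert(argv != NULL && out != NULL);

	for (i = 1; i < argc; i++) {
		if (!strcmp(argv[i], "--")) {
			sep = i;
			break;
		}
	}

	if (sep < 0 || sep + 1 >= argc)
		return -1;

	out->bases = argv + 1;
	out->nr_bases = (size_t)(sep - 1);
	out->head = argv[sep + 1];
	out->remotes = argv + sep + 2;
	out->nr_remotes = (size_t)(argc - sep - 2);
	return 0;
}

/* An octopus needs at least two remotes; otherwise resolve applies. */
int sketch_is_octopus(const struct sketch_merge_args *args)
{
	return args->nr_remotes >= 2;
}
```

When sketch_is_octopus() is false, the builtin exits with status 2 so
that the resolve strategy can be used instead.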
 Makefile                |   2 +-
 builtin.h               |   1 +
 builtin/merge-octopus.c | 241 ++++++++++++++++++++++++++++++++++++++++
 git-merge-octopus.sh    | 112 -------------------
 git.c                   |   1 +
 5 files changed, 244 insertions(+), 113 deletions(-)
 create mode 100644 builtin/merge-octopus.c
 delete mode 100755 git-merge-octopus.sh

diff --git a/Makefile b/Makefile
index ccea651ac8..8f45a3ec03 100644
--- a/Makefile
+++ b/Makefile
@@ -595,7 +595,6 @@ unexport CDPATH
 SCRIPT_SH += git-bisect.sh
 SCRIPT_SH += git-difftool--helper.sh
 SCRIPT_SH += git-filter-branch.sh
-SCRIPT_SH += git-merge-octopus.sh
 SCRIPT_SH += git-mergetool.sh
 SCRIPT_SH += git-quiltimport.sh
 SCRIPT_SH += git-request-pull.sh
@@ -1088,6 +1087,7 @@ BUILTIN_OBJS += builtin/mailsplit.o
 BUILTIN_OBJS += builtin/merge-base.o
 BUILTIN_OBJS += builtin/merge-file.o
 BUILTIN_OBJS += builtin/merge-index.o
+BUILTIN_OBJS += builtin/merge-octopus.o
 BUILTIN_OBJS += builtin/merge-one-file.o
 BUILTIN_OBJS += builtin/merge-ours.o
 BUILTIN_OBJS += builtin/merge-recursive.o
diff --git a/builtin.h b/builtin.h
index 6ea207c9fd..5a587ab70c 100644
--- a/builtin.h
+++ b/builtin.h
@@ -170,6 +170,7 @@ int cmd_mailsplit(int argc, const char **argv, const char *prefix);
 int cmd_merge(int argc, const char **argv, const char *prefix);
 int cmd_merge_base(int argc, const char **argv, const char *prefix);
 int cmd_merge_index(int argc, const char **argv, const char *prefix);
+int cmd_merge_octopus(int argc, const char **argv, const char *prefix);
 int cmd_merge_ours(int argc, const char **argv, const char *prefix);
 int cmd_merge_file(int argc, const char **argv, const char *prefix);
 int cmd_merge_one_file(int argc, const char **argv, const char *prefix);
diff --git a/builtin/merge-octopus.c b/builtin/merge-octopus.c
new file mode 100644
index 0000000000..6216beaa2b
--- /dev/null
+++ b/builtin/merge-octopus.c
@@ -0,0 +1,241 @@
+/*
+ * Builtin "git merge-octopus"
+ *
+ * Copyright (c) 2020 Alban Gruin
+ *
+ * Based on git-merge-octopus.sh, written by Junio C Hamano.
+ *
+ * Resolve two or more trees.
+ */
+
+#include "cache.h"
+#include "builtin.h"
+#include "commit-reach.h"
+#include "lockfile.h"
+#include "run-command.h"
+#include "unpack-trees.h"
+
+static int write_tree(struct tree **reference_tree)
+{
+	struct child_process cp = CHILD_PROCESS_INIT;
+	struct strbuf read_tree = STRBUF_INIT, err = STRBUF_INIT;
+	struct object_id oid;
+	int ret;
+
+	cp.git_cmd = 1;
+	argv_array_push(&cp.args, "write-tree");
+	ret = pipe_command(&cp, NULL, 0, &read_tree, 0, &err, 0);
+	if (err.len > 0)
+		fputs(err.buf, stderr);
+
+	strbuf_trim_trailing_newline(&read_tree);
+	get_oid(read_tree.buf, &oid);
+
+	*reference_tree = lookup_tree(the_repository, &oid);
+
+	strbuf_release(&read_tree);
+	strbuf_release(&err);
+	child_process_clear(&cp);
+
+	return ret;
+}
+
+static int merge_octopus(struct commit_list *bases, const char *head_arg,
+			 struct commit_list *remotes)
+{
+	int non_ff_merge = 0, ret = 0, references = 1;
+	struct commit **reference_commit;
+	struct tree *reference_tree;
+	struct commit_list *j;
+	struct object_id head;
+
+	get_oid(head_arg, &head);
+	reference_commit = xcalloc(commit_list_count(remotes) + 1, sizeof(struct commit *));
+	reference_commit[0] = lookup_commit_reference(the_repository, &head);
+	reference_tree = get_commit_tree(reference_commit[0]);
+
+	for (j = remotes; j; j = j->next) {
+		struct commit *c = j->item;
+		struct object_id *oid = &c->object.oid;
+		struct commit_list *common, *k;
+		char *branch_name;
+		int can_ff = 1;
+
+		if (ret) {
+			puts(_("Automated merge did not work."));
+			puts(_("Should not be doing an octopus."));
+
+			ret = 2;
+			goto out;
+		}
+
+		branch_name = merge_get_better_branch_name(oid_to_hex(oid));
+		common = get_merge_bases_many(c, references, reference_commit);
+
+		if (!common)
+			die(_("Unable to find common commit with %s"), branch_name);
+
+		for (k = common; k && !oideq(&k->item->object.oid, oid); k = k->next);
+
+		if (k) {
+			printf(_("Already up to date with %s\n"), branch_name);
+			free(branch_name);
+			free_commit_list(common);
+			continue;
+		}
+
+		if (!non_ff_merge) {
+			int i;
+
+			for (i = 0, k = common; k && i < references && can_ff; k = k->next, i++) {
+				can_ff = oideq(&k->item->object.oid,
+					       &reference_commit[i]->object.oid);
+			}
+		}
+
+		if (!non_ff_merge && can_ff) {
+			struct child_process cp = CHILD_PROCESS_INIT;
+
+			printf(_("Fast-forwarding to: %s\n"), branch_name);
+
+			cp.git_cmd = 1;
+			argv_array_pushl(&cp.args, "read-tree", "-u", "-m", NULL);
+			argv_array_push(&cp.args, oid_to_hex(&head));
+			argv_array_push(&cp.args, oid_to_hex(oid));
+
+			ret = run_command(&cp);
+			if (ret) {
+				free(branch_name);
+				free_commit_list(common);
+				goto out;
+			}
+
+			child_process_clear(&cp);
+			references = 0;
+			write_tree(&reference_tree);
+		} else {
+			struct commit_list *l;
+			struct tree *next = NULL;
+			struct child_process cp = CHILD_PROCESS_INIT;
+
+			non_ff_merge = 1;
+			printf(_("Trying simple merge with %s\n"), branch_name);
+
+			cp.git_cmd = 1;
+			argv_array_pushl(&cp.args, "read-tree", "-u", "-m", "--aggressive", NULL);
+
+			for (l = common; l; l = l->next)
+				argv_array_push(&cp.args, oid_to_hex(&l->item->object.oid));
+
+			argv_array_push(&cp.args, oid_to_hex(&reference_tree->object.oid));
+			argv_array_push(&cp.args, oid_to_hex(oid));
+
+			if (run_command(&cp)) {
+				ret = 2;
+
+				free(branch_name);
+				free_commit_list(common);
+
+				goto out;
+			}
+
+			child_process_clear(&cp);
+
+			if (write_tree(&next)) {
+				struct child_process cp = CHILD_PROCESS_INIT;
+				puts(_("Simple merge did not work, trying automatic merge."));
+
+				cp.git_cmd = 1;
+				argv_array_pushl(&cp.args, "merge-index", "-o",
+						 "git-merge-one-file", "-a", NULL);
+				if (run_command(&cp))
+					ret = 1;
+
+				child_process_clear(&cp);
+				write_tree(&next);
+			}
+
+			reference_tree = next;
+		}
+
+		reference_commit[references++] = c;
+
+		free(branch_name);
+		free_commit_list(common);
+	}
+
+out:
+	free(reference_commit);
+	return ret;
+}
+
+static const char builtin_merge_octopus_usage[] =
+	"git merge-octopus [<bases>...] -- <head> <remote1> <remote2> [<remotes>...]";
+
+int cmd_merge_octopus(int argc, const char **argv, const char *prefix)
+{
+	int i, sep_seen = 0;
+	struct commit_list *bases = NULL, *remotes = NULL;
+	struct commit_list **next_base = &bases, **next_remote = &remotes;
+	const char *head_arg = NULL;
+	struct child_process cp = CHILD_PROCESS_INIT;
+	struct strbuf files = STRBUF_INIT;
+
+	if (argc < 5)
+		usage(builtin_merge_octopus_usage);
+
+	/* The first parameters up to -- are merge bases; the rest are
+	 * heads. */
+	for (i = 1; i < argc; i++) {
+		if (strcmp(argv[i], "--") == 0)
+			sep_seen = 1;
+		else if (strcmp(argv[i], "-h") == 0)
+			usage(builtin_merge_octopus_usage);
+		else if (sep_seen && !head_arg)
+			head_arg = argv[i];
+		else {
+			struct object_id oid;
+
+			get_oid(argv[i], &oid);
+
+			if (!oideq(&oid, the_hash_algo->empty_tree)) {
+				struct commit *commit;
+				commit = lookup_commit_or_die(&oid, argv[i]);
+
+				if (sep_seen)
+					next_remote = commit_list_append(commit, next_remote);
+				else
+					next_base = commit_list_append(commit, next_base);
+			}
+		}
+	}
+
+	/* Reject if this is not an octopus -- resolve should be used
+	 * instead. */
+	if (commit_list_count(remotes) < 2)
+		return 2;
+
+	cp.git_cmd = 1;
+	argv_array_pushl(&cp.args, "diff-index", "--cached",
+			 "--name-only", "HEAD", "--", NULL);
+	pipe_command(&cp, NULL, 0, &files, 0, NULL, 0);
+	child_process_clear(&cp);
+
+	if (files.len > 0) {
+		struct strbuf **s, **b;
+
+		s = strbuf_split(&files, '\n');
+
+		fprintf(stderr, _("Error: Your local changes to the following "
+				  "files would be overwritten by merge\n"));
+
+		for (b = s; *b; b++)
+			fprintf(stderr, "    %.*s", (int)(*b)->len, (*b)->buf);
+
+		strbuf_list_free(s);
+		strbuf_release(&files);
+		return 2;
+	}
+
+	return merge_octopus(bases, head_arg, remotes);
+}
diff --git a/git-merge-octopus.sh b/git-merge-octopus.sh
deleted file mode 100755
index 7d19d37951..0000000000
--- a/git-merge-octopus.sh
+++ /dev/null
@@ -1,112 +0,0 @@
-#!/bin/sh
-#
-# Copyright (c) 2005 Junio C Hamano
-#
-# Resolve two or more trees.
-#
-
-. git-sh-setup
-
-LF='
-'
-
-# The first parameters up to -- are merge bases; the rest are heads.
-bases= head= remotes= sep_seen=
-for arg
-do
-	case ",$sep_seen,$head,$arg," in
-	*,--,)
-		sep_seen=yes
-		;;
-	,yes,,*)
-		head=$arg
-		;;
-	,yes,*)
-		remotes="$remotes$arg "
-		;;
-	*)
-		bases="$bases$arg "
-		;;
-	esac
-done
-
-# Reject if this is not an octopus -- resolve should be used instead.
-case "$remotes" in
-?*' '?*)
-	;;
-*)
-	exit 2 ;;
-esac
-
-# MRC is the current "merge reference commit"
-# MRT is the current "merge result tree"
-
-if ! git diff-index --quiet --cached HEAD --
-then
-    gettextln "Error: Your local changes to the following files would be overwritten by merge"
-    git diff-index --cached --name-only HEAD -- | sed -e 's/^/    /'
-    exit 2
-fi
-MRC=$(git rev-parse --verify -q $head)
-MRT=$(git write-tree)
-NON_FF_MERGE=0
-OCTOPUS_FAILURE=0
-for SHA1 in $remotes
-do
-	case "$OCTOPUS_FAILURE" in
-	1)
-		# We allow only last one to have a hand-resolvable
-		# conflicts.  Last round failed and we still had
-		# a head to merge.
-		gettextln "Automated merge did not work."
-		gettextln "Should not be doing an octopus."
-		exit 2
-	esac
-
-	eval pretty_name=\${GITHEAD_$SHA1:-$SHA1}
-	if test "$SHA1" = "$pretty_name"
-	then
-		SHA1_UP="$(echo "$SHA1" | tr a-z A-Z)"
-		eval pretty_name=\${GITHEAD_$SHA1_UP:-$pretty_name}
-	fi
-	common=$(git merge-base --all $SHA1 $MRC) ||
-		die "$(eval_gettext "Unable to find common commit with \$pretty_name")"
-
-	case "$LF$common$LF" in
-	*"$LF$SHA1$LF"*)
-		eval_gettextln "Already up to date with \$pretty_name"
-		continue
-		;;
-	esac
-
-	if test "$common,$NON_FF_MERGE" = "$MRC,0"
-	then
-		# The first head being merged was a fast-forward.
-		# Advance MRC to the head being merged, and use that
-		# tree as the intermediate result of the merge.
-		# We still need to count this as part of the parent set.
-
-		eval_gettextln "Fast-forwarding to: \$pretty_name"
-		git read-tree -u -m $head $SHA1 || exit
-		MRC=$SHA1 MRT=$(git write-tree)
-		continue
-	fi
-
-	NON_FF_MERGE=1
-
-	eval_gettextln "Trying simple merge with \$pretty_name"
-	git read-tree -u -m --aggressive  $common $MRT $SHA1 || exit 2
-	next=$(git write-tree 2>/dev/null)
-	if test $? -ne 0
-	then
-		gettextln "Simple merge did not work, trying automatic merge."
-		git merge-index -o git-merge-one-file -a ||
-		OCTOPUS_FAILURE=1
-		next=$(git write-tree 2>/dev/null)
-	fi
-
-	MRC="$MRC $SHA1"
-	MRT=$next
-done
-
-exit "$OCTOPUS_FAILURE"
diff --git a/git.c b/git.c
index 2e92019493..28634cf61f 100644
--- a/git.c
+++ b/git.c
@@ -531,6 +531,7 @@ static struct cmd_struct commands[] = {
 	{ "merge-base", cmd_merge_base, RUN_SETUP },
 	{ "merge-file", cmd_merge_file, RUN_SETUP_GENTLY },
 	{ "merge-index", cmd_merge_index, RUN_SETUP | NO_PARSEOPT },
+	{ "merge-octopus", cmd_merge_octopus, RUN_SETUP | NO_PARSEOPT },
 	{ "merge-ours", cmd_merge_ours, RUN_SETUP | NO_PARSEOPT },
 	{ "merge-one-file", cmd_merge_one_file, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
 	{ "merge-recursive", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
-- 
2.27.0.139.gc9c318d6bf



* [RFC PATCH v1 12/17] merge-octopus: remove calls to external processes
  2020-06-25 12:19 [RFC PATCH v1 00/17] Rewrite the remaining merge strategies from shell to C Alban Gruin
                   ` (10 preceding siblings ...)
  2020-06-25 12:19 ` [RFC PATCH v1 11/17] merge-octopus: rewrite in C Alban Gruin
@ 2020-06-25 12:19 ` Alban Gruin
  2020-06-25 12:19 ` [RFC PATCH v1 13/17] merge-octopus: libify merge_octopus() Alban Gruin
                   ` (5 subsequent siblings)
  17 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2020-06-25 12:19 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Alban Gruin

This removes calls to external processes to avoid reading and writing
the index over and over again.

 - Calls to `read-tree -u -m (--aggressive)?' are replaced by calls to
   unpack_trees().

 - The call to `write-tree' is replaced by a call to
   write_index_as_tree().

 - The call to `diff-index ...' is replaced by a call to
   repo_index_has_changes(), and is moved from cmd_merge_octopus() to
   merge_octopus().

 - The call to `merge-index', needed to invoke `git merge-one-file', is
   replaced by a call to merge_all().

The index is read in cmd_merge_octopus(), and is written back by
merge_octopus().

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 builtin/merge-octopus.c | 155 ++++++++++++++++++++++------------------
 1 file changed, 86 insertions(+), 69 deletions(-)

diff --git a/builtin/merge-octopus.c b/builtin/merge-octopus.c
index 6216beaa2b..14310a4eb1 100644
--- a/builtin/merge-octopus.c
+++ b/builtin/merge-octopus.c
@@ -9,33 +9,70 @@
  */
 
 #include "cache.h"
+#include "cache-tree.h"
 #include "builtin.h"
 #include "commit-reach.h"
 #include "lockfile.h"
-#include "run-command.h"
+#include "merge-strategies.h"
 #include "unpack-trees.h"
 
+static int fast_forward(const struct object_id *oids, int nr, int aggressive)
+{
+	int i;
+	struct tree_desc t[MAX_UNPACK_TREES];
+	struct unpack_trees_options opts;
+	struct lock_file lock = LOCK_INIT;
+
+	repo_read_index_preload(the_repository, NULL, 0);
+	if (refresh_index(the_repository->index, REFRESH_QUIET, NULL, NULL, NULL))
+		return -1;
+
+	repo_hold_locked_index(the_repository, &lock, LOCK_DIE_ON_ERROR);
+
+	memset(&opts, 0, sizeof(opts));
+	opts.head_idx = 1;
+	opts.src_index = the_repository->index;
+	opts.dst_index = the_repository->index;
+	opts.merge = 1;
+	opts.update = 1;
+	opts.aggressive = aggressive;
+
+	for (i = 0; i < nr; i++) {
+		struct tree *tree;
+		tree = parse_tree_indirect(oids + i);
+		if (parse_tree(tree))
+			return -1;
+		init_tree_desc(t + i, tree->buffer, tree->size);
+	}
+
+	if (nr == 1)
+		opts.fn = oneway_merge;
+	else if (nr == 2) {
+		opts.fn = twoway_merge;
+		opts.initial_checkout = is_index_unborn(the_repository->index);
+	} else if (nr >= 3) {
+		opts.fn = threeway_merge;
+		opts.head_idx = nr - 1;
+	}
+
+	if (unpack_trees(nr, t, &opts))
+		return -1;
+
+	if (write_locked_index(the_repository->index, &lock, COMMIT_LOCK))
+		return error(_("unable to write new index file"));
+
+	return 0;
+}
+
 static int write_tree(struct tree **reference_tree)
 {
-	struct child_process cp = CHILD_PROCESS_INIT;
-	struct strbuf read_tree = STRBUF_INIT, err = STRBUF_INIT;
 	struct object_id oid;
 	int ret;
 
-	cp.git_cmd = 1;
-	argv_array_push(&cp.args, "write-tree");
-	ret = pipe_command(&cp, NULL, 0, &read_tree, 0, &err, 0);
-	if (err.len > 0)
-		fputs(err.buf, stderr);
-
-	strbuf_trim_trailing_newline(&read_tree);
-	get_oid(read_tree.buf, &oid);
-
-	*reference_tree = lookup_tree(the_repository, &oid);
-
-	strbuf_release(&read_tree);
-	strbuf_release(&err);
-	child_process_clear(&cp);
+	ret = write_index_as_tree(&oid, the_repository->index,
+				  the_repository->index_file, 0, NULL);
+	if (!ret)
+		*reference_tree = lookup_tree(the_repository, &oid);
 
 	return ret;
 }
@@ -48,12 +85,23 @@ static int merge_octopus(struct commit_list *bases, const char *head_arg,
 	struct tree *reference_tree;
 	struct commit_list *j;
 	struct object_id head;
+	struct strbuf sb = STRBUF_INIT;
 
 	get_oid(head_arg, &head);
+
 	reference_commit = xcalloc(commit_list_count(remotes) + 1, sizeof(struct commit *));
 	reference_commit[0] = lookup_commit_reference(the_repository, &head);
 	reference_tree = get_commit_tree(reference_commit[0]);
 
+	if (repo_index_has_changes(the_repository, reference_tree, &sb)) {
+		error(_("Your local changes to the following files "
+			"would be overwritten by merge:\n  %s"),
+		      sb.buf);
+		strbuf_release(&sb);
+		ret = 2;
+		goto out;
+	}
+
 	for (j = remotes; j; j = j->next) {
 		struct commit *c = j->item;
 		struct object_id *oid = &c->object.oid;
@@ -94,43 +142,36 @@ static int merge_octopus(struct commit_list *bases, const char *head_arg,
 		}
 
 		if (!non_ff_merge && can_ff) {
-			struct child_process cp = CHILD_PROCESS_INIT;
-
+			struct object_id oids[2];
 			printf(_("Fast-forwarding to: %s\n"), branch_name);
 
-			cp.git_cmd = 1;
-			argv_array_pushl(&cp.args, "read-tree", "-u", "-m", NULL);
-			argv_array_push(&cp.args, oid_to_hex(&head));
-			argv_array_push(&cp.args, oid_to_hex(oid));
+			oidcpy(oids, &head);
+			oidcpy(oids + 1, oid);
 
-			ret = run_command(&cp);
+			ret = fast_forward(oids, 2, 0);
 			if (ret) {
 				free(branch_name);
 				free_commit_list(common);
 				goto out;
 			}
 
-			child_process_clear(&cp);
 			references = 0;
 			write_tree(&reference_tree);
 		} else {
-			struct commit_list *l;
+			int i = 0;
 			struct tree *next = NULL;
-			struct child_process cp = CHILD_PROCESS_INIT;
+			struct object_id oids[MAX_UNPACK_TREES];
 
 			non_ff_merge = 1;
 			printf(_("Trying simple merge with %s\n"), branch_name);
 
-			cp.git_cmd = 1;
-			argv_array_pushl(&cp.args, "read-tree", "-u", "-m", "--aggressive", NULL);
+			for (k = common; k; k = k->next)
+				oidcpy(oids + (i++), &k->item->object.oid);
 
-			for (l = common; l; l = l->next)
-				argv_array_push(&cp.args, oid_to_hex(&l->item->object.oid));
+			oidcpy(oids + (i++), &reference_tree->object.oid);
+			oidcpy(oids + (i++), oid);
 
-			argv_array_push(&cp.args, oid_to_hex(&reference_tree->object.oid));
-			argv_array_push(&cp.args, oid_to_hex(oid));
-
-			if (run_command(&cp)) {
+			if (fast_forward(oids, i, 1)) {
 				ret = 2;
 
 				free(branch_name);
@@ -139,19 +180,15 @@ static int merge_octopus(struct commit_list *bases, const char *head_arg,
 				goto out;
 			}
 
-			child_process_clear(&cp);
-
 			if (write_tree(&next)) {
-				struct child_process cp = CHILD_PROCESS_INIT;
+				struct lock_file lock = LOCK_INIT;
+
 				puts(_("Simple merge did not work, trying automatic merge."));
+				repo_hold_locked_index(the_repository, &lock, LOCK_DIE_ON_ERROR);
+				ret = !!merge_all(the_repository->index, 0, 0,
+						  merge_one_file_cb, the_repository);
+				write_locked_index(the_repository->index, &lock, COMMIT_LOCK);
 
-				cp.git_cmd = 1;
-				argv_array_pushl(&cp.args, "merge-index", "-o",
-						 "git-merge-one-file", "-a", NULL);
-				if (run_command(&cp))
-					ret = 1;
-
-				child_process_clear(&cp);
 				write_tree(&next);
 			}
 
@@ -178,12 +215,14 @@ int cmd_merge_octopus(int argc, const char **argv, const char *prefix)
 	struct commit_list *bases = NULL, *remotes = NULL;
 	struct commit_list **next_base = &bases, **next_remote = &remotes;
 	const char *head_arg = NULL;
-	struct child_process cp = CHILD_PROCESS_INIT;
-	struct strbuf files = STRBUF_INIT;
 
 	if (argc < 5)
 		usage(builtin_merge_octopus_usage);
 
+	setup_work_tree();
+	if (repo_read_index(the_repository) < 0)
+		die("corrupted cache");
+
 	/* The first parameters up to -- are merge bases; the rest are
 	 * heads. */
 	for (i = 1; i < argc; i++) {
@@ -215,27 +254,5 @@ int cmd_merge_octopus(int argc, const char **argv, const char *prefix)
 	if (commit_list_count(remotes) < 2)
 		return 2;
 
-	cp.git_cmd = 1;
-	argv_array_pushl(&cp.args, "diff-index", "--cached",
-			 "--name-only", "HEAD", "--", NULL);
-	pipe_command(&cp, NULL, 0, &files, 0, NULL, 0);
-	child_process_clear(&cp);
-
-	if (files.len > 0) {
-		struct strbuf **s, **b;
-
-		s = strbuf_split(&files, '\n');
-
-		fprintf(stderr, _("Error: Your local changes to the following "
-				  "files would be overwritten by merge\n"));
-
-		for (b = s; *b; b++)
-			fprintf(stderr, "    %.*s", (int)(*b)->len, (*b)->buf);
-
-		strbuf_list_free(s);
-		strbuf_release(&files);
-		return 2;
-	}
-
 	return merge_octopus(bases, head_arg, remotes);
 }
-- 
2.27.0.139.gc9c318d6bf



* [RFC PATCH v1 13/17] merge-octopus: libify merge_octopus()
  2020-06-25 12:19 [RFC PATCH v1 00/17] Rewrite the remaining merge strategies from shell to C Alban Gruin
                   ` (11 preceding siblings ...)
  2020-06-25 12:19 ` [RFC PATCH v1 12/17] merge-octopus: remove calls to external processes Alban Gruin
@ 2020-06-25 12:19 ` Alban Gruin
  2020-06-25 12:19 ` [RFC PATCH v1 14/17] merge: use the "resolve" strategy without forking Alban Gruin
                   ` (4 subsequent siblings)
  17 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2020-06-25 12:19 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Alban Gruin

This moves merge_octopus() (and its helper functions) to
merge-strategies.c.  This will enable `git merge' and the sequencer to
directly call it instead of forking.

Once again, this is not a faithful copy-and-paste; the new
merge_octopus() (renamed merge_strategies_octopus()) takes a pointer to
the repository, instead of using `the_repository'.

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---

Notes:
    This patch is best viewed with `--color-moved'.

 builtin/merge-octopus.c | 197 +---------------------------------------
 merge-strategies.c      | 191 ++++++++++++++++++++++++++++++++++++++
 merge-strategies.h      |   3 +
 3 files changed, 196 insertions(+), 195 deletions(-)

diff --git a/builtin/merge-octopus.c b/builtin/merge-octopus.c
index 14310a4eb1..37bbdf11cc 100644
--- a/builtin/merge-octopus.c
+++ b/builtin/merge-octopus.c
@@ -9,202 +9,9 @@
  */
 
 #include "cache.h"
-#include "cache-tree.h"
 #include "builtin.h"
-#include "commit-reach.h"
-#include "lockfile.h"
+#include "commit.h"
 #include "merge-strategies.h"
-#include "unpack-trees.h"
-
-static int fast_forward(const struct object_id *oids, int nr, int aggressive)
-{
-	int i;
-	struct tree_desc t[MAX_UNPACK_TREES];
-	struct unpack_trees_options opts;
-	struct lock_file lock = LOCK_INIT;
-
-	repo_read_index_preload(the_repository, NULL, 0);
-	if (refresh_index(the_repository->index, REFRESH_QUIET, NULL, NULL, NULL))
-		return -1;
-
-	repo_hold_locked_index(the_repository, &lock, LOCK_DIE_ON_ERROR);
-
-	memset(&opts, 0, sizeof(opts));
-	opts.head_idx = 1;
-	opts.src_index = the_repository->index;
-	opts.dst_index = the_repository->index;
-	opts.merge = 1;
-	opts.update = 1;
-	opts.aggressive = aggressive;
-
-	for (i = 0; i < nr; i++) {
-		struct tree *tree;
-		tree = parse_tree_indirect(oids + i);
-		if (parse_tree(tree))
-			return -1;
-		init_tree_desc(t + i, tree->buffer, tree->size);
-	}
-
-	if (nr == 1)
-		opts.fn = oneway_merge;
-	else if (nr == 2) {
-		opts.fn = twoway_merge;
-		opts.initial_checkout = is_index_unborn(the_repository->index);
-	} else if (nr >= 3) {
-		opts.fn = threeway_merge;
-		opts.head_idx = nr - 1;
-	}
-
-	if (unpack_trees(nr, t, &opts))
-		return -1;
-
-	if (write_locked_index(the_repository->index, &lock, COMMIT_LOCK))
-		return error(_("unable to write new index file"));
-
-	return 0;
-}
-
-static int write_tree(struct tree **reference_tree)
-{
-	struct object_id oid;
-	int ret;
-
-	ret = write_index_as_tree(&oid, the_repository->index,
-				  the_repository->index_file, 0, NULL);
-	if (!ret)
-		*reference_tree = lookup_tree(the_repository, &oid);
-
-	return ret;
-}
-
-static int merge_octopus(struct commit_list *bases, const char *head_arg,
-			 struct commit_list *remotes)
-{
-	int non_ff_merge = 0, ret = 0, references = 1;
-	struct commit **reference_commit;
-	struct tree *reference_tree;
-	struct commit_list *j;
-	struct object_id head;
-	struct strbuf sb = STRBUF_INIT;
-
-	get_oid(head_arg, &head);
-
-	reference_commit = xcalloc(commit_list_count(remotes) + 1, sizeof(struct commit *));
-	reference_commit[0] = lookup_commit_reference(the_repository, &head);
-	reference_tree = get_commit_tree(reference_commit[0]);
-
-	if (repo_index_has_changes(the_repository, reference_tree, &sb)) {
-		error(_("Your local changes to the following files "
-			"would be overwritten by merge:\n  %s"),
-		      sb.buf);
-		strbuf_release(&sb);
-		ret = 2;
-		goto out;
-	}
-
-	for (j = remotes; j; j = j->next) {
-		struct commit *c = j->item;
-		struct object_id *oid = &c->object.oid;
-		struct commit_list *common, *k;
-		char *branch_name;
-		int can_ff = 1;
-
-		if (ret) {
-			puts(_("Automated merge did not work."));
-			puts(_("Should not be doing an octopus."));
-
-			ret = 2;
-			goto out;
-		}
-
-		branch_name = merge_get_better_branch_name(oid_to_hex(oid));
-		common = get_merge_bases_many(c, references, reference_commit);
-
-		if (!common)
-			die(_("Unable to find common commit with %s"), branch_name);
-
-		for (k = common; k && !oideq(&k->item->object.oid, oid); k = k->next);
-
-		if (k) {
-			printf(_("Already up to date with %s\n"), branch_name);
-			free(branch_name);
-			free_commit_list(common);
-			continue;
-		}
-
-		if (!non_ff_merge) {
-			int i;
-
-			for (i = 0, k = common; k && i < references && can_ff; k = k->next, i++) {
-				can_ff = oideq(&k->item->object.oid,
-					       &reference_commit[i]->object.oid);
-			}
-		}
-
-		if (!non_ff_merge && can_ff) {
-			struct object_id oids[2];
-			printf(_("Fast-forwarding to: %s\n"), branch_name);
-
-			oidcpy(oids, &head);
-			oidcpy(oids + 1, oid);
-
-			ret = fast_forward(oids, 2, 0);
-			if (ret) {
-				free(branch_name);
-				free_commit_list(common);
-				goto out;
-			}
-
-			references = 0;
-			write_tree(&reference_tree);
-		} else {
-			int i = 0;
-			struct tree *next = NULL;
-			struct object_id oids[MAX_UNPACK_TREES];
-
-			non_ff_merge = 1;
-			printf(_("Trying simple merge with %s\n"), branch_name);
-
-			for (k = common; k; k = k->next)
-				oidcpy(oids + (i++), &k->item->object.oid);
-
-			oidcpy(oids + (i++), &reference_tree->object.oid);
-			oidcpy(oids + (i++), oid);
-
-			if (fast_forward(oids, i, 1)) {
-				ret = 2;
-
-				free(branch_name);
-				free_commit_list(common);
-
-				goto out;
-			}
-
-			if (write_tree(&next)) {
-				struct lock_file lock = LOCK_INIT;
-
-				puts(_("Simple merge did not work, trying automatic merge."));
-				repo_hold_locked_index(the_repository, &lock, LOCK_DIE_ON_ERROR);
-				ret = !!merge_all(the_repository->index, 0, 0,
-						  merge_one_file_cb, the_repository);
-				write_locked_index(the_repository->index, &lock, COMMIT_LOCK);
-
-				write_tree(&next);
-			}
-
-			reference_tree = next;
-		}
-
-		reference_commit[references++] = c;
-
-		free(branch_name);
-		free_commit_list(common);
-	}
-
-out:
-	free(reference_commit);
-	return ret;
-}
 
 static const char builtin_merge_octopus_usage[] =
 	"git merge-octopus [<bases>...] -- <head> <remote1> <remote2> [<remotes>...]";
@@ -254,5 +61,5 @@ int cmd_merge_octopus(int argc, const char **argv, const char *prefix)
 	if (commit_list_count(remotes) < 2)
 		return 2;
 
-	return merge_octopus(bases, head_arg, remotes);
+	return merge_strategies_octopus(the_repository, bases, head_arg, remotes);
 }
diff --git a/merge-strategies.c b/merge-strategies.c
index a12c575590..8395c4c787 100644
--- a/merge-strategies.c
+++ b/merge-strategies.c
@@ -1,5 +1,6 @@
 #include "cache.h"
 #include "cache-tree.h"
+#include "commit-reach.h"
 #include "dir.h"
 #include "lockfile.h"
 #include "merge-strategies.h"
@@ -384,3 +385,193 @@ int merge_strategies_resolve(struct repository *r,
 	rollback_lock_file(&lock);
 	return 2;
 }
+
+static int fast_forward(struct repository *r, const struct object_id *oids,
+			int nr, int aggressive)
+{
+	int i;
+	struct tree_desc t[MAX_UNPACK_TREES];
+	struct unpack_trees_options opts;
+	struct lock_file lock = LOCK_INIT;
+
+	repo_read_index_preload(r, NULL, 0);
+	if (refresh_index(r->index, REFRESH_QUIET, NULL, NULL, NULL))
+		return -1;
+
+	repo_hold_locked_index(r, &lock, LOCK_DIE_ON_ERROR);
+
+	memset(&opts, 0, sizeof(opts));
+	opts.head_idx = 1;
+	opts.src_index = r->index;
+	opts.dst_index = r->index;
+	opts.merge = 1;
+	opts.update = 1;
+	opts.aggressive = aggressive;
+
+	for (i = 0; i < nr; i++) {
+		struct tree *tree;
+		tree = parse_tree_indirect(oids + i);
+		if (parse_tree(tree))
+			return -1;
+		init_tree_desc(t + i, tree->buffer, tree->size);
+	}
+
+	if (nr == 1)
+		opts.fn = oneway_merge;
+	else if (nr == 2) {
+		opts.fn = twoway_merge;
+		opts.initial_checkout = is_index_unborn(r->index);
+	} else if (nr >= 3) {
+		opts.fn = threeway_merge;
+		opts.head_idx = nr - 1;
+	}
+
+	if (unpack_trees(nr, t, &opts))
+		return -1;
+
+	if (write_locked_index(r->index, &lock, COMMIT_LOCK))
+		return error(_("unable to write new index file"));
+
+	return 0;
+}
+
+static int write_tree(struct repository *r, struct tree **reference_tree)
+{
+	struct object_id oid;
+	int ret;
+
+	ret = write_index_as_tree(&oid, r->index, r->index_file, 0, NULL);
+	if (!ret)
+		*reference_tree = lookup_tree(r, &oid);
+
+	return ret;
+}
+
+int merge_strategies_octopus(struct repository *r,
+			     struct commit_list *bases, const char *head_arg,
+			     struct commit_list *remotes)
+{
+	int non_ff_merge = 0, ret = 0, references = 1;
+	struct commit **reference_commit;
+	struct tree *reference_tree;
+	struct commit_list *j;
+	struct object_id head;
+	struct strbuf sb = STRBUF_INIT;
+
+	get_oid(head_arg, &head);
+
+	reference_commit = xcalloc(commit_list_count(remotes) + 1, sizeof(struct commit *));
+	reference_commit[0] = lookup_commit_reference(r, &head);
+	reference_tree = repo_get_commit_tree(r, reference_commit[0]);
+
+	if (repo_index_has_changes(r, reference_tree, &sb)) {
+		error(_("Your local changes to the following files "
+			"would be overwritten by merge:\n  %s"),
+		      sb.buf);
+		strbuf_release(&sb);
+		ret = 2;
+		goto out;
+	}
+
+	for (j = remotes; j && j->item; j = j->next) {
+		struct commit *c = j->item;
+		struct object_id *oid = &c->object.oid;
+		struct commit_list *common, *k;
+		char *branch_name;
+		int can_ff = 1;
+
+		if (ret) {
+			puts(_("Automated merge did not work."));
+			puts(_("Should not be doing an octopus."));
+
+			ret = 2;
+			goto out;
+		}
+
+		branch_name = merge_get_better_branch_name(oid_to_hex(oid));
+		common = get_merge_bases_many(c, references, reference_commit);
+
+		if (!common)
+			die(_("Unable to find common commit with %s"), branch_name);
+
+		for (k = common; k && !oideq(&k->item->object.oid, oid); k = k->next);
+
+		if (k) {
+			printf(_("Already up to date with %s\n"), branch_name);
+			free(branch_name);
+			free_commit_list(common);
+			continue;
+		}
+
+		if (!non_ff_merge) {
+			int i;
+
+			for (i = 0, k = common; k && i < references && can_ff; k = k->next, i++) {
+				can_ff = oideq(&k->item->object.oid,
+					       &reference_commit[i]->object.oid);
+			}
+		}
+
+		if (!non_ff_merge && can_ff) {
+			struct object_id oids[2];
+			printf(_("Fast-forwarding to: %s\n"), branch_name);
+
+			oidcpy(oids, &head);
+			oidcpy(oids + 1, oid);
+
+			ret = fast_forward(r, oids, 2, 0);
+			if (ret) {
+				free(branch_name);
+				free_commit_list(common);
+				goto out;
+			}
+
+			references = 0;
+			write_tree(r, &reference_tree);
+		} else {
+			int i = 0;
+			struct tree *next = NULL;
+			struct object_id oids[MAX_UNPACK_TREES];
+
+			non_ff_merge = 1;
+			printf(_("Trying simple merge with %s\n"), branch_name);
+
+			for (k = common; k; k = k->next)
+				oidcpy(oids + (i++), &k->item->object.oid);
+
+			oidcpy(oids + (i++), &reference_tree->object.oid);
+			oidcpy(oids + (i++), oid);
+
+			if (fast_forward(r, oids, i, 1)) {
+				ret = 2;
+
+				free(branch_name);
+				free_commit_list(common);
+
+				goto out;
+			}
+
+			if (write_tree(r, &next)) {
+				struct lock_file lock = LOCK_INIT;
+
+				puts(_("Simple merge did not work, trying automatic merge."));
+				repo_hold_locked_index(r, &lock, LOCK_DIE_ON_ERROR);
+				ret = !!merge_all(r->index, 0, 0, merge_one_file_cb, r);
+				write_locked_index(r->index, &lock, COMMIT_LOCK);
+
+				write_tree(r, &next);
+			}
+
+			reference_tree = next;
+		}
+
+		reference_commit[references++] = c;
+
+		free(branch_name);
+		free_commit_list(common);
+	}
+
+out:
+	free(reference_commit);
+	return ret;
+}
diff --git a/merge-strategies.h b/merge-strategies.h
index 778f8ce9d6..938411a04e 100644
--- a/merge-strategies.h
+++ b/merge-strategies.h
@@ -37,5 +37,8 @@ int merge_all(struct index_state *istate, int oneshot, int quiet,
 int merge_strategies_resolve(struct repository *r,
 			     struct commit_list *bases, const char *head_arg,
 			     struct commit_list *remote);
+int merge_strategies_octopus(struct repository *r,
+			     struct commit_list *bases, const char *head_arg,
+			     struct commit_list *remotes);
 
 #endif /* MERGE_STRATEGIES_H */
-- 
2.27.0.139.gc9c318d6bf



* [RFC PATCH v1 14/17] merge: use the "resolve" strategy without forking
  2020-06-25 12:19 [RFC PATCH v1 00/17] Rewrite the remaining merge strategies from shell to C Alban Gruin
                   ` (12 preceding siblings ...)
  2020-06-25 12:19 ` [RFC PATCH v1 13/17] merge-octopus: libify merge_octopus() Alban Gruin
@ 2020-06-25 12:19 ` Alban Gruin
  2020-06-25 12:19 ` [RFC PATCH v1 15/17] merge: use the "octopus" " Alban Gruin
                   ` (3 subsequent siblings)
  17 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2020-06-25 12:19 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Alban Gruin

This teaches `git merge' to invoke the "resolve" strategy with a
function call instead of forking.

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 builtin/merge.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/builtin/merge.c b/builtin/merge.c
index 7da707bf55..d50b4ad6ad 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -41,6 +41,7 @@
 #include "commit-reach.h"
 #include "wt-status.h"
 #include "commit-graph.h"
+#include "merge-strategies.h"
 
 #define DEFAULT_TWOHEAD (1<<0)
 #define DEFAULT_OCTOPUS (1<<1)
@@ -744,7 +745,10 @@ static int try_merge_strategy(const char *strategy, struct commit_list *common,
 				       COMMIT_LOCK | SKIP_IF_UNCHANGED))
 			die(_("unable to write %s"), get_index_file());
 		return clean ? 0 : 1;
-	} else {
+	} else if (!strcmp(strategy, "resolve"))
+		return merge_strategies_resolve(the_repository, common,
+						head_arg, remoteheads);
+	else {
 		return try_merge_command(the_repository,
 					 strategy, xopts_nr, xopts,
 					 common, head_arg, remoteheads);
-- 
2.27.0.139.gc9c318d6bf



* [RFC PATCH v1 15/17] merge: use the "octopus" strategy without forking
  2020-06-25 12:19 [RFC PATCH v1 00/17] Rewrite the remaining merge strategies from shell to C Alban Gruin
                   ` (13 preceding siblings ...)
  2020-06-25 12:19 ` [RFC PATCH v1 14/17] merge: use the "resolve" strategy without forking Alban Gruin
@ 2020-06-25 12:19 ` Alban Gruin
  2020-06-25 12:19 ` [RFC PATCH v1 16/17] sequencer: use the "resolve" " Alban Gruin
                   ` (2 subsequent siblings)
  17 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2020-06-25 12:19 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Alban Gruin

This teaches `git merge' to invoke the "octopus" strategy with a
function call instead of forking.

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 builtin/merge.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/builtin/merge.c b/builtin/merge.c
index d50b4ad6ad..53f64ddb87 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -748,6 +748,9 @@ static int try_merge_strategy(const char *strategy, struct commit_list *common,
 	} else if (!strcmp(strategy, "resolve"))
 		return merge_strategies_resolve(the_repository, common,
 						head_arg, remoteheads);
+	else if (!strcmp(strategy, "octopus"))
+		return merge_strategies_octopus(the_repository, common,
+						head_arg, remoteheads);
 	else {
 		return try_merge_command(the_repository,
 					 strategy, xopts_nr, xopts,
-- 
2.27.0.139.gc9c318d6bf



* [RFC PATCH v1 16/17] sequencer: use the "resolve" strategy without forking
  2020-06-25 12:19 [RFC PATCH v1 00/17] Rewrite the remaining merge strategies from shell to C Alban Gruin
                   ` (14 preceding siblings ...)
  2020-06-25 12:19 ` [RFC PATCH v1 15/17] merge: use the "octopus" " Alban Gruin
@ 2020-06-25 12:19 ` Alban Gruin
  2020-06-25 16:11   ` Phillip Wood
  2020-06-25 12:19 ` [RFC PATCH v1 17/17] sequencer: use the "octopus" merge " Alban Gruin
  2020-09-01 10:56 ` [PATCH v2 00/11] Rewrite the remaining merge strategies from shell to C Alban Gruin
  17 siblings, 1 reply; 221+ messages in thread
From: Alban Gruin @ 2020-06-25 12:19 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Alban Gruin

This teaches the sequencer to invoke the "resolve" strategy with a
function call instead of forking.

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 sequencer.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/sequencer.c b/sequencer.c
index fd7701c88a..ea8dc58108 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -33,6 +33,7 @@
 #include "commit-reach.h"
 #include "rebase-interactive.h"
 #include "reset.h"
+#include "merge-strategies.h"
 
 #define GIT_REFLOG_ACTION "GIT_REFLOG_ACTION"
 
@@ -1922,9 +1923,15 @@ static int do_pick_commit(struct repository *r,
 
 		commit_list_insert(base, &common);
 		commit_list_insert(next, &remotes);
-		res |= try_merge_command(r, opts->strategy,
-					 opts->xopts_nr, (const char **)opts->xopts,
-					common, oid_to_hex(&head), remotes);
+
+		if (!strcmp(opts->strategy, "resolve")) {
+			repo_read_index(r);
+			res |= merge_strategies_resolve(r, common, oid_to_hex(&head), remotes);
+		} else
+			res |= try_merge_command(r, opts->strategy,
+						 opts->xopts_nr, (const char **)opts->xopts,
+						 common, oid_to_hex(&head), remotes);
+
 		free_commit_list(common);
 		free_commit_list(remotes);
 	}
-- 
2.27.0.139.gc9c318d6bf



* [RFC PATCH v1 17/17] sequencer: use the "octopus" merge strategy without forking
  2020-06-25 12:19 [RFC PATCH v1 00/17] Rewrite the remaining merge strategies from shell to C Alban Gruin
                   ` (15 preceding siblings ...)
  2020-06-25 12:19 ` [RFC PATCH v1 16/17] sequencer: use the "resolve" " Alban Gruin
@ 2020-06-25 12:19 ` Alban Gruin
  2020-09-01 10:56 ` [PATCH v2 00/11] Rewrite the remaining merge strategies from shell to C Alban Gruin
  17 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2020-06-25 12:19 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Alban Gruin

This teaches the sequencer to invoke the "octopus" strategy with a
function call instead of forking.

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 sequencer.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/sequencer.c b/sequencer.c
index ea8dc58108..f9fa995b4b 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -1927,6 +1927,9 @@ static int do_pick_commit(struct repository *r,
 		if (!strcmp(opts->strategy, "resolve")) {
 			repo_read_index(r);
 			res |= merge_strategies_resolve(r, common, oid_to_hex(&head), remotes);
+		} else if (!strcmp(opts->strategy, "octopus")) {
+			repo_read_index(r);
+			res |= merge_strategies_octopus(r, common, oid_to_hex(&head), remotes);
 		} else
 			res |= try_merge_command(r, opts->strategy,
 						 opts->xopts_nr, (const char **)opts->xopts,
-- 
2.27.0.139.gc9c318d6bf



* Re: [RFC PATCH v1 02/17] merge-one-file: rewrite in C
  2020-06-25 12:19 ` [RFC PATCH v1 02/17] merge-one-file: rewrite in C Alban Gruin
@ 2020-06-25 14:55   ` Chris Torek
  2020-06-25 15:16   ` Phillip Wood
  1 sibling, 0 replies; 221+ messages in thread
From: Chris Torek @ 2020-06-25 14:55 UTC (permalink / raw)
  To: Alban Gruin; +Cc: git, Junio C Hamano

much snippage below, keeping just enough context to see which file,
function, etc:

On Thu, Jun 25, 2020 at 5:49 AM Alban Gruin <alban.gruin@gmail.com> wrote:
> diff --git a/builtin/merge-one-file.c b/builtin/merge-one-file.c
> new file mode 100644
> index 0000000000..4992a6cd30
> --- /dev/null
> +++ b/builtin/merge-one-file.c
> @@ -0,0 +1,275 @@

> +static int do_merge_one_file(const struct object_id *orig_blob,
> +                            const struct object_id *our_blob,
> +                            const struct object_id *their_blob, const char *path,
> +                            unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode)
> +{
> +       int ret, source, dest;

> +       source = open(src1.buf, O_RDONLY);
> +       dest = open(path, O_WRONLY | O_TRUNC);
> +
> +       copy_fd(source, dest);
> +
> +       close(source);
> +       close(dest);
> +
> +       unlink(orig.buf);
> +       unlink(src1.buf);
> +       unlink(src2.buf);

Some of this goes away in subsequent patches, but most of these calls
should be checked for error returns, especially the two `open`s in case
someone has messed with permissions.
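
As a rough sketch of what such checks could look like (an illustration
only, not code from the series): the hypothetical helper
copy_file_checked() below uses plain POSIX calls and fprintf(), whereas
the real code would presumably report through git's own error helpers.
Like the quoted code, it opens the destination without O_CREAT, so the
destination is assumed to exist already.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of the patch: copy the contents of
 * `src' over the existing file `dst', checking every call that can
 * fail.  Returns 0 on success and -1 on any error.
 */
int copy_file_checked(const char *src, const char *dst)
{
	char buf[8192];
	ssize_t nr;
	int source, dest, ret = -1;

	source = open(src, O_RDONLY);
	if (source < 0) {
		fprintf(stderr, "error: could not open '%s': %s\n",
			src, strerror(errno));
		return -1;
	}

	/* No O_CREAT, as in the quoted code: dst must already exist. */
	dest = open(dst, O_WRONLY | O_TRUNC);
	if (dest < 0) {
		fprintf(stderr, "error: could not open '%s': %s\n",
			dst, strerror(errno));
		close(source);
		return -1;
	}

	while ((nr = read(source, buf, sizeof(buf))) != 0) {
		char *p = buf;

		if (nr < 0) {
			if (errno == EINTR)
				continue; /* interrupted, retry the read */
			fprintf(stderr, "error: could not read '%s': %s\n",
				src, strerror(errno));
			goto out;
		}

		/* write() may be partial, so loop until the chunk is out. */
		while (nr > 0) {
			ssize_t nw = write(dest, p, nr);

			if (nw < 0) {
				if (errno == EINTR)
					continue;
				fprintf(stderr, "error: could not write '%s': %s\n",
					dst, strerror(errno));
				goto out;
			}
			p += nw;
			nr -= nw;
		}
	}
	ret = 0;

out:
	close(source);
	/* A failed close() on the written file can hide a write error. */
	if (close(dest) < 0 && !ret) {
		fprintf(stderr, "error: could not close '%s': %s\n",
			dst, strerror(errno));
		ret = -1;
	}
	return ret;
}
```

A caller would then bail out on a non-zero return before going on to
unlink the temporary files.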

Chris


* Re: [RFC PATCH v1 02/17] merge-one-file: rewrite in C
  2020-06-25 12:19 ` [RFC PATCH v1 02/17] merge-one-file: rewrite in C Alban Gruin
  2020-06-25 14:55   ` Chris Torek
@ 2020-06-25 15:16   ` Phillip Wood
  2020-06-25 18:17     ` Phillip Wood
  2020-07-12 11:22     ` Alban Gruin
  1 sibling, 2 replies; 221+ messages in thread
From: Phillip Wood @ 2020-06-25 15:16 UTC (permalink / raw)
  To: Alban Gruin, git; +Cc: Junio C Hamano

Hi Alban

I think this series is a great idea

On 25/06/2020 13:19, Alban Gruin wrote:
> This rewrites `git merge-one-file' from shell to C.  This port is very
> straightforward: it keeps using external processes to edit the index,
> for instance.  Errors are also displayed with fprintf() instead of
> error().  Both of these will be addressed in the next few commits,
> leading to its libification so its main function can be used from other
> commands directly.
> 
> This also fixes a bug present in the original script: instead of
> checking if a _regular_ file exists when a file exists in the branch to
> merge, but not in our branch, the rewritten version checks if a file of
> any kind (i.e. a directory, ...) exists.  This fixes the test t6035.14,
> where the branch to merge had a new file, `a/b', but our branch had a
> directory there; it should have failed because a directory exists, but
> it did not because there was no regular file called `a/b'.  This test is
> now marked as successful.
> 
> Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
> ---
>   Makefile                        |   2 +-
>   builtin.h                       |   1 +
>   builtin/merge-one-file.c        | 275 ++++++++++++++++++++++++++++++++
>   git-merge-one-file.sh           | 167 -------------------
>   git.c                           |   1 +
>   t/t6035-merge-dir-to-symlink.sh |   2 +-
>   6 files changed, 279 insertions(+), 169 deletions(-)
>   create mode 100644 builtin/merge-one-file.c
>   delete mode 100755 git-merge-one-file.sh
> 
> diff --git a/Makefile b/Makefile
> index 372139f1f2..19574f5133 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -596,7 +596,6 @@ SCRIPT_SH += git-bisect.sh
>   SCRIPT_SH += git-difftool--helper.sh
>   SCRIPT_SH += git-filter-branch.sh
>   SCRIPT_SH += git-merge-octopus.sh
> -SCRIPT_SH += git-merge-one-file.sh
>   SCRIPT_SH += git-merge-resolve.sh
>   SCRIPT_SH += git-mergetool.sh
>   SCRIPT_SH += git-quiltimport.sh
> @@ -1089,6 +1088,7 @@ BUILTIN_OBJS += builtin/mailsplit.o
>   BUILTIN_OBJS += builtin/merge-base.o
>   BUILTIN_OBJS += builtin/merge-file.o
>   BUILTIN_OBJS += builtin/merge-index.o
> +BUILTIN_OBJS += builtin/merge-one-file.o
>   BUILTIN_OBJS += builtin/merge-ours.o
>   BUILTIN_OBJS += builtin/merge-recursive.o
>   BUILTIN_OBJS += builtin/merge-tree.o
> diff --git a/builtin.h b/builtin.h
> index a5ae15bfe5..9205d5ecdc 100644
> --- a/builtin.h
> +++ b/builtin.h
> @@ -172,6 +172,7 @@ int cmd_merge_base(int argc, const char **argv, const char *prefix);
>   int cmd_merge_index(int argc, const char **argv, const char *prefix);
>   int cmd_merge_ours(int argc, const char **argv, const char *prefix);
>   int cmd_merge_file(int argc, const char **argv, const char *prefix);
> +int cmd_merge_one_file(int argc, const char **argv, const char *prefix);
>   int cmd_merge_recursive(int argc, const char **argv, const char *prefix);
>   int cmd_merge_tree(int argc, const char **argv, const char *prefix);
>   int cmd_mktag(int argc, const char **argv, const char *prefix);
> diff --git a/builtin/merge-one-file.c b/builtin/merge-one-file.c
> new file mode 100644
> index 0000000000..4992a6cd30
> --- /dev/null
> +++ b/builtin/merge-one-file.c
> @@ -0,0 +1,275 @@
> +/*
> + * Builtin "git merge-one-file"
> + *
> + * Copyright (c) 2020 Alban Gruin
> + *
> + * Based on git-merge-one-file.sh, written by Linus Torvalds.
> + *
> + * This is the git per-file merge script, called with
> + *
> + *   $1 - original file SHA1 (or empty)
> + *   $2 - file in branch1 SHA1 (or empty)
> + *   $3 - file in branch2 SHA1 (or empty)
> + *   $4 - pathname in repository
> + *   $5 - original file mode (or empty)
> + *   $6 - file in branch1 mode (or empty)
> + *   $7 - file in branch2 mode (or empty)

Nitpick: these are now argv[1] etc. rather than $1 etc.

> + *
> + * Handle some trivial cases.. The _really_ trivial cases have
> + * been handled already by git read-tree, but that one doesn't
> + * do any merges that might change the tree layout.
> + */
> +
> +#define USE_THE_INDEX_COMPATIBILITY_MACROS
> +#include "cache.h"
> +#include "builtin.h"
> +#include "commit.h"
> +#include "dir.h"
> +#include "lockfile.h"
> +#include "object-store.h"
> +#include "run-command.h"
> +#include "xdiff-interface.h"
> +
> +static int create_temp_file(const struct object_id *oid, struct strbuf *path)
> +{
> +	struct child_process cp = CHILD_PROCESS_INIT;
> +	struct strbuf err = STRBUF_INIT;
> +	int ret;
> +
> +	cp.git_cmd = 1;
> +	argv_array_pushl(&cp.args, "unpack-file", oid_to_hex(oid), NULL);
> +	ret = pipe_command(&cp, NULL, 0, path, 0, &err, 0);
> +	if (!ret && path->len > 0)
> +		strbuf_trim_trailing_newline(path);
> +
> +	fprintf(stderr, "%.*s", (int) err.len, err.buf);
> +	strbuf_release(&err);
> +
> +	return ret;
> +}

I know others will disagree, but personally I'm not a huge fan of
rewriting shell functions in C that fork other builtins and then
converting the C to use the internal APIs; it seems much better to
just write the proper C version the first time. This is especially true
for simple functions such as the ones in this file. That way the reviewer
gets a clear view of the final code from the patch, rather than having
to piece it together from a series of additions and deletions.

> +
> +static int add_to_index_cacheinfo(unsigned int mode,
> +				  const struct object_id *oid, const char *path)
> +{
> +	struct child_process cp = CHILD_PROCESS_INIT;
> +
> +	cp.git_cmd = 1;
> +	argv_array_pushl(&cp.args, "update-index", "--add", "--cacheinfo", NULL);
> +	argv_array_pushf(&cp.args, "%o,%s,%s", mode, oid_to_hex(oid), path);
> +	return run_command(&cp);
> +}
> +
> +static int remove_from_index(const char *path)
> +{
> +	struct child_process cp = CHILD_PROCESS_INIT;
> +
> +	cp.git_cmd = 1;
> +	argv_array_pushl(&cp.args, "update-index", "--remove", "--", path, NULL);
> +	return run_command(&cp);
> +}
> +
> +static int checkout_from_index(const char *path)
> +{
> +	struct child_process cp = CHILD_PROCESS_INIT;
> +
> +	cp.git_cmd = 1;
> +	argv_array_pushl(&cp.args, "checkout-index", "-u", "-f", "--", path, NULL);
> +	return run_command(&cp);
> +}
> +
> +static int merge_one_file_deleted(const struct object_id *orig_blob,
> +				  const struct object_id *our_blob,
> +				  const struct object_id *their_blob, const char *path,
> +				  unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode)
> +{
> +	if ((our_blob && orig_mode != our_mode) ||
> +	    (their_blob && orig_mode != their_mode)) {
> +		fprintf(stderr, "ERROR: File %s deleted on one branch but had its\n", path);
> +		fprintf(stderr, "ERROR: permissions changed on the other.\n");
> +		return 1;
> +	}
> +
> +	if (our_blob) {
> +		printf("Removing %s\n", path);
> +
> +		if (file_exists(path))
> +			remove_path(path);
> +	}
> +
> +	return remove_from_index(path);
> +}
> +
> +static int do_merge_one_file(const struct object_id *orig_blob,
> +			     const struct object_id *our_blob,
> +			     const struct object_id *their_blob, const char *path,
> +			     unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode)
> +{
> +	int ret, source, dest;
> +	struct strbuf src1 = STRBUF_INIT, src2 = STRBUF_INIT, orig = STRBUF_INIT;
> +	struct child_process cp_merge = CHILD_PROCESS_INIT,
> +		cp_checkout = CHILD_PROCESS_INIT,
> +		cp_update = CHILD_PROCESS_INIT;
> +
> +	if (our_mode == S_IFLNK || their_mode == S_IFLNK) {
> +		fprintf(stderr, "ERROR: %s: Not merging symbolic link changes.\n", path);
> +		return 1;
> +	} else if (our_mode == S_IFGITLINK || their_mode == S_IFGITLINK) {
> +		fprintf(stderr, "ERROR: %s: Not merging conflicting submodule changes.\n",
> +			path);
> +		return 1;
> +	}
> +
> +	create_temp_file(our_blob, &src1);
> +	create_temp_file(their_blob, &src2);
> +
> +	if (orig_blob) {
> +		printf("Auto-merging %s\n", path);
> +		create_temp_file(orig_blob, &orig);
> +	} else {
> +		printf("Added %s in both, but differently.\n", path);
> +		create_temp_file(the_hash_algo->empty_blob, &orig);
> +	}
> +
> +	cp_merge.git_cmd = 1;
> +	argv_array_pushl(&cp_merge.args, "merge-file", src1.buf, orig.buf, src2.buf,
> +			 NULL);
> +	ret = run_command(&cp_merge);
> +
> +	if (ret != 0)
> +		ret = 1;
> +
> +	cp_checkout.git_cmd = 1;
> +	argv_array_pushl(&cp_checkout.args, "checkout-index", "-f", "--stage=2",
> +			 "--", path, NULL);
> +	if (run_command(&cp_checkout))
> +		return 1;
> +
> +	source = open(src1.buf, O_RDONLY);
> +	dest = open(path, O_WRONLY | O_TRUNC);
> +
> +	copy_fd(source, dest);
> +
> +	close(source);
> +	close(dest);
> +
> +	unlink(orig.buf);
> +	unlink(src1.buf);
> +	unlink(src2.buf);
> +
> +	strbuf_release(&src1);
> +	strbuf_release(&src2);
> +	strbuf_release(&orig);

The whole business of creating temporary files and forking seems like a
lot of effort compared to calling ll_merge(), which would also mean we
respect any merge attributes.

> +
> +	if (ret) {
> +		fprintf(stderr, "ERROR: ");
> +
> +		if (!orig_blob) {

I think the original does if (ret || !orig_blob), not &&.

> +			fprintf(stderr, "content conflict");
> +			if (our_mode != their_mode)
> +				fprintf(stderr, ", ");

Sentence lego. In any case, the message below should be printed
regardless of content conflicts. We should probably mark all these
messages for translation as well.
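
To make the two points above concrete, here is a rough sketch that
follows the logic of the shell script (content conflict when merge-file
failed or there was no common ancestor, permissions conflict whenever
the two modes differ) and picks a whole message per case instead of
gluing fragments together. The function name describe_conflict() and
its signature are assumptions made for this example only; in Git the
format strings would additionally be wrapped in _() for translation.

```c
#include <stdio.h>

/*
 * Hypothetical helper (not from the patch): fill buf with the conflict
 * description and return 1 if there is a conflict, 0 otherwise.
 */
static int describe_conflict(char *buf, size_t len, int merge_ret, int has_orig,
			     unsigned int orig_mode, unsigned int our_mode,
			     unsigned int their_mode)
{
	/* mirrors: if test $ret != 0 || test -z "$1" */
	int content = merge_ret != 0 || !has_orig;
	/* mirrors: if test "$6" != "$7" */
	int perms = our_mode != their_mode;

	if (content && perms)
		snprintf(buf, len,
			 "content conflict, permissions conflict: %o->%o,%o",
			 orig_mode, our_mode, their_mode);
	else if (content)
		snprintf(buf, len, "content conflict");
	else if (perms)
		snprintf(buf, len, "permissions conflict: %o->%o,%o",
			 orig_mode, our_mode, their_mode);
	else if (len)
		buf[0] = '\0';

	return content || perms;
}
```

The caller would then print "ERROR: <message> in <path>" once, which
keeps each translatable string a complete phrase.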

> +		}
> +
> +		if (our_mode != their_mode)
> +			fprintf(stderr, "permissions conflict: %o->%o,%o",
> +				orig_mode, our_mode, their_mode);
> +
> +		fprintf(stderr, " in %s\n", path);
> +
> +		return 1;
> +	}
> +
> +	cp_update.git_cmd = 1;
> +	argv_array_pushl(&cp_update.args, "update-index", "--", path, NULL);
> +	return run_command(&cp_update);
> +}
> +
> +static int merge_one_file(const struct object_id *orig_blob,
> +			  const struct object_id *our_blob,
> +			  const struct object_id *their_blob, const char *path,
> +			  unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode)
> +{
> +	if (orig_blob &&
> +	    ((our_blob && oideq(orig_blob, our_blob)) ||
> +	     (their_blob && oideq(orig_blob, their_blob))))
> +		return merge_one_file_deleted(orig_blob, our_blob, their_blob, path,
> +					      orig_mode, our_mode, their_mode);

It would be nice to preserve the comments from the script as I find they 
help a lot in understanding which case each piece of code is handling. 
The code above appears to be handling deletions but does not appear to 
check that one side is actually missing. Shouldn't it be something like

if (orig_blob &&
    ((!their_blob && (our_blob && oideq(orig_blob, our_blob))) ||
     (!our_blob && (their_blob && oideq(orig_blob, their_blob)))))

Maybe this could do with a test case
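
For reference, the three patterns in the script's first case arm
("$1..", "$1.$1" and "$1$1.") could be spelled out roughly as below.
This is only a sketch: blob IDs are represented as plain strings with
NULL for a missing side, and same_blob() and is_deletion_case() are
names invented here, standing in for oideq() and whatever the real
check ends up being called. Note that it also covers the "deleted in
both" pattern, which the condition suggested above leaves out.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for oideq(): both sides present and identical. */
static int same_blob(const char *a, const char *b)
{
	return a && b && !strcmp(a, b);
}

/*
 * Hypothetical helper: returns 1 for "deleted in both, or deleted in
 * one and unchanged in the other", 0 for every other combination.
 */
static int is_deletion_case(const char *orig, const char *ours,
			    const char *theirs)
{
	if (!orig)
		return 0;
	if (!ours && !theirs)
		return 1;	/* "$1..": deleted in both */
	if (!ours && same_blob(orig, theirs))
		return 1;	/* "$1.$1": deleted by us, unchanged by them */
	if (!theirs && same_blob(orig, ours))
		return 1;	/* "$1$1.": unchanged by us, deleted by them */
	return 0;
}
```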

> +	else if (!orig_blob && our_blob && !their_blob) {
> +		return add_to_index_cacheinfo(our_mode, our_blob, path);
> +	} else if (!orig_blob && !our_blob && their_blob) {
> +		printf("Adding %s\n", path);
> +
> +		if (file_exists(path)) {
> +			fprintf(stderr, "ERROR: untracked %s is overwritten by the merge.\n", path);
> +			return 1;
> +		}
> +
> +		if (add_to_index_cacheinfo(their_mode, their_blob, path))
> +			return 1;
> +		return checkout_from_index(path);
> +	} else if (!orig_blob && our_blob && their_blob &&
> +		   oideq(our_blob, their_blob)) {
> +		if (our_mode != their_mode) {
> +			fprintf(stderr, "ERROR: File %s added identically in both branches,", path);
> +			fprintf(stderr, "ERROR: but permissions conflict %o->%o.\n",
> +				our_mode, their_mode);
> +			return 1;
> +		}
> +
> +		printf("Adding %s\n", path);
> +
> +		if (add_to_index_cacheinfo(our_mode, our_blob, path))
> +			return 1;
> +		return checkout_from_index(path);
> +	} else if (our_blob && their_blob)
> +		return do_merge_one_file(orig_blob, our_blob, their_blob, path,
> +					 orig_mode, our_mode, their_mode);
> +	else {
> +		char *orig_hex = "", *our_hex = "", *their_hex = "";
> +
> +		if (orig_blob)
> +			orig_hex = oid_to_hex(orig_blob);
> +		if (our_blob)
> +			our_hex = oid_to_hex(our_blob);
> +		if (their_blob)
> +			their_hex = oid_to_hex(their_blob);
> +
> +		fprintf(stderr, "ERROR: %s: Not handling case %s -> %s -> %s\n",
> +			path, orig_hex, our_hex, their_hex);
> +		return 1;
> +	}
> +
> +	return 0;
> +}
> +
> +static const char builtin_merge_one_file_usage[] =
> +	"git merge-one-file <orig blob> <our blob> <their blob> <path> "
> +	"<orig mode> <our mode> <their mode>\n\n"
> +	"Blob ids and modes should be empty for missing files.";
> +
> +int cmd_merge_one_file(int argc, const char **argv, const char *prefix)
> +{
> +	struct object_id orig_blob, our_blob, their_blob,
> +		*p_orig_blob = NULL, *p_our_blob = NULL, *p_their_blob = NULL;
> +	unsigned int orig_mode = 0, our_mode = 0, their_mode = 0;
> +
> +	if (argc != 8)
> +		usage(builtin_merge_one_file_usage);
> +
> +	if (!get_oid(argv[1], &orig_blob)) {
> +		p_orig_blob = &orig_blob;
> +		orig_mode = strtol(argv[5], NULL, 8);

It would probably make sense to check that strtol() succeeds (and the 
mode is sensible), and also that get_oid() fails because argv[1] is 
empty, not because it is invalid.
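
Something along these lines is what I have in mind for the mode; treat
it as a sketch rather than a finished helper. The name
parse_file_mode() is hypothetical, and the list of accepted modes
(regular file, executable, symlink, gitlink) is my assumption about
what "sensible" means here.

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse an octal mode string, returning 0 and
 * storing the result on success, or -1 if the string is empty, has
 * trailing garbage, overflows, or is not one of the expected modes.
 */
static int parse_file_mode(const char *arg, unsigned int *mode)
{
	char *end;
	long val;

	if (!arg || !*arg)
		return -1;

	errno = 0;
	val = strtol(arg, &end, 8);
	if (errno || *end)
		return -1;

	switch (val) {
	case 0100644:	/* regular file */
	case 0100755:	/* executable */
	case 0120000:	/* symbolic link */
	case 0160000:	/* gitlink (submodule) */
		*mode = (unsigned int)val;
		return 0;
	default:
		return -1;
	}
}
```

The blob arguments could be handled in the same spirit: treat an empty
argv[N] as "missing" before calling get_oid() at all, and report an
error if get_oid() fails on a non-empty string.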

Thanks for working on this
Best Wishes

Phillip


> +	}
> +
> +	if (!get_oid(argv[2], &our_blob)) {
> +		p_our_blob = &our_blob;
> +		our_mode = strtol(argv[6], NULL, 8);
> +	}
> +
> +	if (!get_oid(argv[3], &their_blob)) {
> +		p_their_blob = &their_blob;
> +		their_mode = strtol(argv[7], NULL, 8);
> +	}
> +
> +	return merge_one_file(p_orig_blob, p_our_blob, p_their_blob, argv[4],
> +			      orig_mode, our_mode, their_mode);
> +}
> diff --git a/git-merge-one-file.sh b/git-merge-one-file.sh
> deleted file mode 100755
> index f6d9852d2f..0000000000
> --- a/git-merge-one-file.sh
> +++ /dev/null
> @@ -1,167 +0,0 @@
> -#!/bin/sh
> -#
> -# Copyright (c) Linus Torvalds, 2005
> -#
> -# This is the git per-file merge script, called with
> -#
> -#   $1 - original file SHA1 (or empty)
> -#   $2 - file in branch1 SHA1 (or empty)
> -#   $3 - file in branch2 SHA1 (or empty)
> -#   $4 - pathname in repository
> -#   $5 - original file mode (or empty)
> -#   $6 - file in branch1 mode (or empty)
> -#   $7 - file in branch2 mode (or empty)
> -#
> -# Handle some trivial cases.. The _really_ trivial cases have
> -# been handled already by git read-tree, but that one doesn't
> -# do any merges that might change the tree layout.
> -
> -USAGE='<orig blob> <our blob> <their blob> <path>'
> -USAGE="$USAGE <orig mode> <our mode> <their mode>"
> -LONG_USAGE="usage: git merge-one-file $USAGE
> -
> -Blob ids and modes should be empty for missing files."
> -
> -SUBDIRECTORY_OK=Yes
> -. git-sh-setup
> -cd_to_toplevel
> -require_work_tree
> -
> -if test $# != 7
> -then
> -	echo "$LONG_USAGE"
> -	exit 1
> -fi
> -
> -case "${1:-.}${2:-.}${3:-.}" in
> -#
> -# Deleted in both or deleted in one and unchanged in the other
> -#
> -"$1.." | "$1.$1" | "$1$1.")
> -	if { test -z "$6" && test "$5" != "$7"; } ||
> -	   { test -z "$7" && test "$5" != "$6"; }
> -	then
> -		echo "ERROR: File $4 deleted on one branch but had its" >&2
> -		echo "ERROR: permissions changed on the other." >&2
> -		exit 1
> -	fi
> -
> -	if test -n "$2"
> -	then
> -		echo "Removing $4"
> -	else
> -		# read-tree checked that index matches HEAD already,
> -		# so we know we do not have this path tracked.
> -		# there may be an unrelated working tree file here,
> -		# which we should just leave unmolested.  Make sure
> -		# we do not have it in the index, though.
> -		exec git update-index --remove -- "$4"
> -	fi
> -	if test -f "$4"
> -	then
> -		rm -f -- "$4" &&
> -		rmdir -p "$(expr "z$4" : 'z\(.*\)/')" 2>/dev/null || :
> -	fi &&
> -		exec git update-index --remove -- "$4"
> -	;;
> -
> -#
> -# Added in one.
> -#
> -".$2.")
> -	# the other side did not add and we added so there is nothing
> -	# to be done, except making the path merged.
> -	exec git update-index --add --cacheinfo "$6" "$2" "$4"
> -	;;
> -"..$3")
> -	echo "Adding $4"
> -	if test -f "$4"
> -	then
> -		echo "ERROR: untracked $4 is overwritten by the merge." >&2
> -		exit 1
> -	fi
> -	git update-index --add --cacheinfo "$7" "$3" "$4" &&
> -		exec git checkout-index -u -f -- "$4"
> -	;;
> -
> -#
> -# Added in both, identically (check for same permissions).
> -#
> -".$3$2")
> -	if test "$6" != "$7"
> -	then
> -		echo "ERROR: File $4 added identically in both branches," >&2
> -		echo "ERROR: but permissions conflict $6->$7." >&2
> -		exit 1
> -	fi
> -	echo "Adding $4"
> -	git update-index --add --cacheinfo "$6" "$2" "$4" &&
> -		exec git checkout-index -u -f -- "$4"
> -	;;
> -
> -#
> -# Modified in both, but differently.
> -#
> -"$1$2$3" | ".$2$3")
> -
> -	case ",$6,$7," in
> -	*,120000,*)
> -		echo "ERROR: $4: Not merging symbolic link changes." >&2
> -		exit 1
> -		;;
> -	*,160000,*)
> -		echo "ERROR: $4: Not merging conflicting submodule changes." >&2
> -		exit 1
> -		;;
> -	esac
> -
> -	src1=$(git unpack-file $2)
> -	src2=$(git unpack-file $3)
> -	case "$1" in
> -	'')
> -		echo "Added $4 in both, but differently."
> -		orig=$(git unpack-file $(git hash-object /dev/null))
> -		;;
> -	*)
> -		echo "Auto-merging $4"
> -		orig=$(git unpack-file $1)
> -		;;
> -	esac
> -
> -	git merge-file "$src1" "$orig" "$src2"
> -	ret=$?
> -	msg=
> -	if test $ret != 0 || test -z "$1"
> -	then
> -		msg='content conflict'
> -		ret=1
> -	fi
> -
> -	# Create the working tree file, using "our tree" version from the
> -	# index, and then store the result of the merge.
> -	git checkout-index -f --stage=2 -- "$4" && cat "$src1" >"$4" || exit 1
> -	rm -f -- "$orig" "$src1" "$src2"
> -
> -	if test "$6" != "$7"
> -	then
> -		if test -n "$msg"
> -		then
> -			msg="$msg, "
> -		fi
> -		msg="${msg}permissions conflict: $5->$6,$7"
> -		ret=1
> -	fi
> -
> -	if test $ret != 0
> -	then
> -		echo "ERROR: $msg in $4" >&2
> -		exit 1
> -	fi
> -	exec git update-index -- "$4"
> -	;;
> -
> -*)
> -	echo "ERROR: $4: Not handling case $1 -> $2 -> $3" >&2
> -	;;
> -esac
> -exit 1
> diff --git a/git.c b/git.c
> index a2d337eed7..058d91a2a5 100644
> --- a/git.c
> +++ b/git.c
> @@ -532,6 +532,7 @@ static struct cmd_struct commands[] = {
>   	{ "merge-file", cmd_merge_file, RUN_SETUP_GENTLY },
>   	{ "merge-index", cmd_merge_index, RUN_SETUP | NO_PARSEOPT },
>   	{ "merge-ours", cmd_merge_ours, RUN_SETUP | NO_PARSEOPT },
> +	{ "merge-one-file", cmd_merge_one_file, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
>   	{ "merge-recursive", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
>   	{ "merge-recursive-ours", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
>   	{ "merge-recursive-theirs", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
> diff --git a/t/t6035-merge-dir-to-symlink.sh b/t/t6035-merge-dir-to-symlink.sh
> index 2eddcc7664..5fb74e39a0 100755
> --- a/t/t6035-merge-dir-to-symlink.sh
> +++ b/t/t6035-merge-dir-to-symlink.sh
> @@ -94,7 +94,7 @@ test_expect_success SYMLINKS 'a/b was resolved as symlink' '
>   	test -h a/b
>   '
>   
> -test_expect_failure 'do not lose untracked in merge (resolve)' '
> +test_expect_success 'do not lose untracked in merge (resolve)' '
>   	git reset --hard &&
>   	git checkout baseline^0 &&
>   	>a/b/c/e &&
> 



* Re: [RFC PATCH v1 16/17] sequencer: use the "resolve" strategy without forking
  2020-06-25 12:19 ` [RFC PATCH v1 16/17] sequencer: use the "resolve" " Alban Gruin
@ 2020-06-25 16:11   ` Phillip Wood
  2020-07-12 11:27     ` Alban Gruin
  0 siblings, 1 reply; 221+ messages in thread
From: Phillip Wood @ 2020-06-25 16:11 UTC (permalink / raw)
  To: Alban Gruin, git; +Cc: Junio C Hamano

Hi Alban

On 25/06/2020 13:19, Alban Gruin wrote:
> This teaches the sequencer to invoke the "resolve" strategy with a
> function call instead of forking.

This is a good idea, however we should check the existing tests that use 
this strategy to see if they are doing so to test the 
try_merge_command() code path. I've got some patches in seen that use 
'--strategy=resolve' to exercise the "non merge-recursive" code path, so 
I'll update them to use a proper custom merge strategy.

Is it worth optimizing do_merge() to take advantage of resolve and 
octopus being builtin as well?

Best Wishes

Phil


> Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
> ---
>   sequencer.c | 13 ++++++++++---
>   1 file changed, 10 insertions(+), 3 deletions(-)
> 
> diff --git a/sequencer.c b/sequencer.c
> index fd7701c88a..ea8dc58108 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -33,6 +33,7 @@
>   #include "commit-reach.h"
>   #include "rebase-interactive.h"
>   #include "reset.h"
> +#include "merge-strategies.h"
>   
>   #define GIT_REFLOG_ACTION "GIT_REFLOG_ACTION"
>   
> @@ -1922,9 +1923,15 @@ static int do_pick_commit(struct repository *r,
>   
>   		commit_list_insert(base, &common);
>   		commit_list_insert(next, &remotes);
> -		res |= try_merge_command(r, opts->strategy,
> -					 opts->xopts_nr, (const char **)opts->xopts,
> -					common, oid_to_hex(&head), remotes);
> +
> +		if (!strcmp(opts->strategy, "resolve")) {
> +			repo_read_index(r);
> +			res |= merge_strategies_resolve(r, common, oid_to_hex(&head), remotes);
> +		} else
> +			res |= try_merge_command(r, opts->strategy,
> +						 opts->xopts_nr, (const char **)opts->xopts,
> +						 common, oid_to_hex(&head), remotes);
> +
>   		free_commit_list(common);
>   		free_commit_list(remotes);
>   	}
> 


* Re: [RFC PATCH v1 02/17] merge-one-file: rewrite in C
  2020-06-25 15:16   ` Phillip Wood
@ 2020-06-25 18:17     ` Phillip Wood
  2020-06-26 14:33       ` Phillip Wood
  2020-07-12 11:22     ` Alban Gruin
  1 sibling, 1 reply; 221+ messages in thread
From: Phillip Wood @ 2020-06-25 18:17 UTC (permalink / raw)
  To: Alban Gruin, git; +Cc: Junio C Hamano

On 25/06/2020 16:16, Phillip Wood wrote:
> Hi Alban
> 
> I think this series is a great idea
> 
> On 25/06/2020 13:19, Alban Gruin wrote:
>> This rewrites `git merge-one-file' from shell to C.  This port is very
>> straightforward: it keeps using external processes to edit the index,
>> for instance.  Errors are also displayed with fprintf() instead of
>> error().  Both of these will be addressed in the next few commits,
>> leading to its libification so its main function can be used from other
>> commands directly.
>>
>> This also fixes a bug present in the original script: instead of
>> checking if a _regular_ file exists when a file exists in the branch to
>> merge, but not in our branch, the rewritten version checks if a file of
>> any kind (i.e. a directory, ...) exists.  This fixes the test t6035.14,
>> where the branch to merge had a new file, `a/b', but our branch had a
>> directory there; it should have failed because a directory exists, but
>> it did not because there was no regular file called `a/b'.  This test is
>> now marked as successful.
>> [...]
>> +static int merge_one_file(const struct object_id *orig_blob,
>> +              const struct object_id *our_blob,
>> +              const struct object_id *their_blob, const char *path,
>> +              unsigned int orig_mode, unsigned int our_mode, unsigned
>> int their_mode)
>> +{
>> +    if (orig_blob &&
>> +        ((our_blob && oideq(orig_blob, our_blob)) ||
>> +         (their_blob && oideq(orig_blob, their_blob))))
>> +        return merge_one_file_deleted(orig_blob, our_blob,
>> their_blob, path,
>> +                          orig_mode, our_mode, their_mode);
> 
> It would be nice to preserve the comments from the script as I find they
> help a lot in understanding which case each piece of code is handling.
> The code above appears to be handling deletions but does not appear to
> check that one side is actually missing. Shouldn't it be something like
> 
> if (orig_blob &&
>     ((!their_blob && (our_blob && oideq(orig_blob, our_blob))) ||
>      (!our_blob && (their_blob && oideq(orig_blob, their_blob)))))
> 
> Maybe this could do with a test case

The reason your version works is that if only one side has changed,
read-tree will have done the merge itself, so this only gets called if
one side has been deleted. However, the original script printed an error
if someone accidentally called it when the content had changed on only
one side and there were no mode changes. I think we want to keep that
behavior.

In the future we could probably update this to also handle the cases
that read-tree normally takes care of rather than erroring out but I
don't think it is a high priority.

Best Wishes

Phillip


* Re: [RFC PATCH v1 06/17] merge-index: libify merge_one_path() and merge_all()
  2020-06-25 12:19 ` [RFC PATCH v1 06/17] merge-index: libify merge_one_path() and merge_all() Alban Gruin
@ 2020-06-26 10:13   ` Phillip Wood
  2020-06-26 14:32     ` Phillip Wood
  2020-07-12 11:36     ` Alban Gruin
  0 siblings, 2 replies; 221+ messages in thread
From: Phillip Wood @ 2020-06-26 10:13 UTC (permalink / raw)
  To: Alban Gruin, git; +Cc: Junio C Hamano

Hi Alban

On 25/06/2020 13:19, Alban Gruin wrote:
> The "resolve" and "octopus" merge strategies do not call directly `git
> merge-one-file', they delegate the work to another git command, `git
> merge-index', that will loop over files in the index and call the
> specified command.  Unfortunately, these functions are not part of
> libgit.a, which means that once rewritten, the strategies would still
> have to invoke `merge-one-file' by spawning a new process first.
> 
> To avoid this, this moves merge_one_path(), merge_all(), and their
> helpers to merge-strategies.c.  They also take a callback to dictate
> what they should do for each file.  For now, only one launching a new
> process is defined to preserve the behaviour of the builtin version.
> 
> Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
> ---
> 
> Notes:
>     This patch is best viewed with `--color-moved'.
> 
>  builtin/merge-index.c | 77 +++------------------------------
>  merge-strategies.c    | 99 +++++++++++++++++++++++++++++++++++++++++++
>  merge-strategies.h    | 17 ++++++++
>  3 files changed, 123 insertions(+), 70 deletions(-)
> 
> diff --git a/builtin/merge-index.c b/builtin/merge-index.c
> index 38ea6ad6ca..6cb666cc78 100644
> --- a/builtin/merge-index.c
> +++ b/builtin/merge-index.c
> @@ -1,74 +1,11 @@
>  #define USE_THE_INDEX_COMPATIBILITY_MACROS
>  #include "builtin.h"
> -#include "run-command.h"
> -
> -static const char *pgm;
> -static int one_shot, quiet;
> -static int err;
> -
> -static int merge_entry(int pos, const char *path)
> -{
> -	int found;
> -	const char *arguments[] = { pgm, "", "", "", path, "", "", "", NULL };
> -	char hexbuf[4][GIT_MAX_HEXSZ + 1];
> -	char ownbuf[4][60];
> -
> -	if (pos >= active_nr)
> -		die("git merge-index: %s not in the cache", path);
> -	found = 0;
> -	do {
> -		const struct cache_entry *ce = active_cache[pos];
> -		int stage = ce_stage(ce);
> -
> -		if (strcmp(ce->name, path))
> -			break;
> -		found++;
> -		oid_to_hex_r(hexbuf[stage], &ce->oid);
> -		xsnprintf(ownbuf[stage], sizeof(ownbuf[stage]), "%o", ce->ce_mode);
> -		arguments[stage] = hexbuf[stage];
> -		arguments[stage + 4] = ownbuf[stage];
> -	} while (++pos < active_nr);
> -	if (!found)
> -		die("git merge-index: %s not in the cache", path);
> -
> -	if (run_command_v_opt(arguments, 0)) {
> -		if (one_shot)
> -			err++;
> -		else {
> -			if (!quiet)
> -				die("merge program failed");
> -			exit(1);
> -		}
> -	}
> -	return found;
> -}
> -
> -static void merge_one_path(const char *path)
> -{
> -	int pos = cache_name_pos(path, strlen(path));
> -
> -	/*
> -	 * If it already exists in the cache as stage0, it's
> -	 * already merged and there is nothing to do.
> -	 */
> -	if (pos < 0)
> -		merge_entry(-pos-1, path);
> -}
> -
> -static void merge_all(void)
> -{
> -	int i;
> -	for (i = 0; i < active_nr; i++) {
> -		const struct cache_entry *ce = active_cache[i];
> -		if (!ce_stage(ce))
> -			continue;
> -		i += merge_entry(i, ce->name)-1;
> -	}
> -}
> +#include "merge-strategies.h"
>  
>  int cmd_merge_index(int argc, const char **argv, const char *prefix)
>  {
> -	int i, force_file = 0;
> +	int i, force_file = 0, err = 0, one_shot = 0, quiet = 0;
> +	const char *pgm;
>  
>  	/* Without this we cannot rely on waitpid() to tell
>  	 * what happened to our children.
> @@ -98,14 +35,14 @@ int cmd_merge_index(int argc, const char **argv, const char *prefix)
>  				continue;
>  			}
>  			if (!strcmp(arg, "-a")) {
> -				merge_all();
> +				err |= merge_all(&the_index, one_shot, quiet,
> +						 merge_program_cb, (void *)pgm);
>  				continue;
>  			}
>  			die("git merge-index: unknown option %s", arg);
>  		}
> -		merge_one_path(arg);
> +		err |= merge_one_path(&the_index, one_shot, quiet, arg,
> +				      merge_program_cb, (void *)pgm);
>  	}
> -	if (err && !quiet)
> -		die("merge program failed");
>  	return err;
>  }
> diff --git a/merge-strategies.c b/merge-strategies.c
> index 3a9fce9f22..f4c0b4acd6 100644
> --- a/merge-strategies.c
> +++ b/merge-strategies.c
> @@ -1,6 +1,7 @@
>  #include "cache.h"
>  #include "dir.h"
>  #include "merge-strategies.h"
> +#include "run-command.h"
>  #include "xdiff-interface.h"
>  
>  static int add_to_index_cacheinfo(struct index_state *istate,
> @@ -189,3 +190,101 @@ int merge_strategies_one_file(struct repository *r,
>  
>  	return 0;
>  }
> +
> +int merge_program_cb(const struct object_id *orig_blob,
> +		     const struct object_id *our_blob,
> +		     const struct object_id *their_blob, const char *path,
> +		     unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
> +		     void *data)

Using void* is slightly unfortunate but it's needed later.

It would be nice to check if the program to run is git-merge-one-file
and call the appropriate function instead in that case so all users of
merge-index get the benefit of it being builtin. That probably wants to
be done in cmd_merge_index() rather than here though.
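
Roughly, the selection could look like the sketch below. Everything in
it is simplified for illustration: merge_fn is a cut-down stand-in for
the merge_cb typedef in this patch, the two callbacks only print what
they would do, and pick_merge_fn() is a made-up name. Whether the
comparison should be against "git-merge-one-file" exactly or something
more lenient is an open question.

```c
#include <stdio.h>
#include <string.h>

/* Cut-down stand-in for merge_cb: just a path plus opaque data. */
typedef int (*merge_fn)(const char *path, void *data);

/* Would call the libified merge-one-file code directly. */
static int merge_builtin_one_file(const char *path, void *data)
{
	(void)data;
	printf("merging %s in-process\n", path);
	return 0;
}

/* Would spawn the user-supplied program; data carries its name. */
static int merge_external_program(const char *path, void *data)
{
	printf("would run %s on %s\n", (const char *)data, path);
	return 0;
}

/* Hypothetical selection step for cmd_merge_index(). */
static merge_fn pick_merge_fn(const char *pgm)
{
	if (!strcmp(pgm, "git-merge-one-file"))
		return merge_builtin_one_file;
	return merge_external_program;
}
```

The void pointer is what lets both kinds of callback share one
signature: the external one needs the program name, the builtin one
would presumably need a repository pointer instead.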

> +{
> +	char ownbuf[3][60] = {{0}};

I know this is copied from above but it would be better to use
GIT_MAX_HEXSZ rather than 60

> +	const char *arguments[] = { (char *)data, "", "", "", path,
> +				    ownbuf[0], ownbuf[1], ownbuf[2],
> +				    NULL };
> +
> +	if (orig_blob)
> +		arguments[1] = oid_to_hex(orig_blob);
> +	if (our_blob)
> +		arguments[2] = oid_to_hex(our_blob);
> +	if (their_blob)
> +		arguments[3] = oid_to_hex(their_blob);
> +
> +	xsnprintf(ownbuf[0], sizeof(ownbuf[0]), "%o", orig_mode);
> +	xsnprintf(ownbuf[1], sizeof(ownbuf[1]), "%o", our_mode);
> +	xsnprintf(ownbuf[2], sizeof(ownbuf[2]), "%o", their_mode);

These are leaked. Also, are you sure we want to fill out the mode if the
corresponding blob is missing? I guess it doesn't matter, but it would
be good to check that; I think the original passed "". It also passed
"" rather than "0000..." for the blobs that were missing, I think.

Best Wishes

Phillip
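
P.S. A small sketch of the "pass an empty string for a missing side"
behaviour mentioned above, in case it helps. The helper name
format_mode_arg() and the explicit "present" flag are assumptions made
for the example; the real code would know whether a stage is present
from the blob pointer being non-NULL.

```c
#include <stdio.h>

/*
 * Hypothetical helper: write the octal mode into buf when the stage is
 * present, and an empty string when it is missing, so the merge program
 * sees "" in that argument slot.
 */
static void format_mode_arg(char *buf, size_t len, int present,
			    unsigned int mode)
{
	if (!len)
		return;
	if (present)
		snprintf(buf, len, "%o", mode);
	else
		buf[0] = '\0';
}
```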

> +
> +	return run_command_v_opt(arguments, 0);
> +}
> +
> +static int merge_entry(struct index_state *istate, int quiet, int pos,
> +		       const char *path, merge_cb cb, void *data)
> +{
> +	int found = 0;
> +	const struct object_id *oids[3] = {NULL};
> +	unsigned int modes[3] = {0};
> +
> +	do {
> +		const struct cache_entry *ce = istate->cache[pos];
> +		int stage = ce_stage(ce);
> +
> +		if (strcmp(ce->name, path))
> +			break;
> +		found++;
> +		oids[stage - 1] = &ce->oid;
> +		modes[stage - 1] = ce->ce_mode;
> +	} while (++pos < istate->cache_nr);
> +	if (!found)
> +		return error(_("%s is not in the cache"), path);
> +
> +	if (cb(oids[0], oids[1], oids[2], path, modes[0], modes[1], modes[2], data)) {
> +		if (!quiet)
> +			error(_("Merge program failed"));
> +		return -2;
> +	}
> +
> +	return found;
> +}
> +
> +int merge_one_path(struct index_state *istate, int oneshot, int quiet,
> +		   const char *path, merge_cb cb, void *data)
> +{
> +	int pos = index_name_pos(istate, path, strlen(path)), ret;
> +
> +	/*
> +	 * If it already exists in the cache as stage0, it's
> +	 * already merged and there is nothing to do.
> +	 */
> +	if (pos < 0) {
> +		ret = merge_entry(istate, quiet, -pos - 1, path, cb, data);
> +		if (ret == -1)
> +			return -1;
> +		else if (ret == -2)
> +			return 1;
> +	}
> +	return 0;
> +}
> +
> +int merge_all(struct index_state *istate, int oneshot, int quiet,
> +	      merge_cb cb, void *data)
> +{
> +	int err = 0, i, ret;
> +	for (i = 0; i < istate->cache_nr; i++) {
> +		const struct cache_entry *ce = istate->cache[i];
> +		if (!ce_stage(ce))
> +			continue;
> +
> +		ret = merge_entry(istate, quiet, i, ce->name, cb, data);
> +		if (ret > 0)
> +			i += ret - 1;
> +		else if (ret == -1)
> +			return -1;
> +		else if (ret == -2) {
> +			if (oneshot)
> +				err++;
> +			else
> +				return 1;
> +		}
> +	}
> +
> +	return err;
> +}
> diff --git a/merge-strategies.h b/merge-strategies.h
> index b527d145c7..cf78d7eaf4 100644
> --- a/merge-strategies.h
> +++ b/merge-strategies.h
> @@ -10,4 +10,21 @@ int merge_strategies_one_file(struct repository *r,
>  			      unsigned int orig_mode, unsigned int our_mode,
>  			      unsigned int their_mode);
>  
> +typedef int (*merge_cb)(const struct object_id *orig_blob,
> +			const struct object_id *our_blob,
> +			const struct object_id *their_blob, const char *path,
> +			unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
> +			void *data);
> +
> +int merge_program_cb(const struct object_id *orig_blob,
> +		     const struct object_id *our_blob,
> +		     const struct object_id *their_blob, const char *path,
> +		     unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
> +		     void *data);
> +
> +int merge_one_path(struct index_state *istate, int oneshot, int quiet,
> +		   const char *path, merge_cb cb, void *data);
> +int merge_all(struct index_state *istate, int oneshot, int quiet,
> +	      merge_cb cb, void *data);
> +
>  #endif /* MERGE_STRATEGIES_H */
> 
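
The control flow of merge_all() in the patch above (skip stage-0 entries,
treat consecutive entries with the same name as one path, and either count
failures or stop at the first one depending on `oneshot') can be sketched
in isolation.  This is a simplified illustration only: `struct entry',
path_cb, merge_all_sketch() and probe_cb() are hypothetical, and the real
callback also receives object ids and modes.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified index entry; git's real type is struct cache_entry. */
struct entry {
	const char *name;
	int stage; /* 0 = merged, 1..3 = unmerged stages */
};

/* Simplified callback: returns non-zero when the per-path merge fails. */
typedef int (*path_cb)(const char *path, void *data);

/*
 * Visit every unmerged path once.  Consecutive entries sharing a name are
 * treated as one path.  With oneshot set, failures are counted and the
 * loop continues; otherwise the first failure returns 1.
 */
static int merge_all_sketch(const struct entry *e, size_t nr, int oneshot,
			    path_cb cb, void *data)
{
	int err = 0;
	size_t i = 0;

	while (i < nr) {
		size_t j = i + 1;

		if (!e[i].stage) {
			i++;
			continue;
		}
		while (j < nr && !strcmp(e[j].name, e[i].name))
			j++;
		if (cb(e[i].name, data)) {
			if (!oneshot)
				return 1;
			err++;
		}
		i = j;
	}
	return err;
}

/* Example callback: fails for one path and counts its invocations. */
struct probe {
	const char *fail_path;
	int calls;
};

static int probe_cb(const char *path, void *data)
{
	struct probe *p = data;

	p->calls++;
	return !strcmp(path, p->fail_path);
}
```

Passing the per-call state through `void *data' is what lets the same
loop drive either an external program or an in-process function.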


* Re: [RFC PATCH v1 06/17] merge-index: libify merge_one_path() and merge_all()
  2020-06-26 10:13   ` Phillip Wood
@ 2020-06-26 14:32     ` Phillip Wood
  2020-07-12 11:36     ` Alban Gruin
  1 sibling, 0 replies; 221+ messages in thread
From: Phillip Wood @ 2020-06-26 14:32 UTC (permalink / raw)
  To: Alban Gruin, git; +Cc: Junio C Hamano

Hi Alban

On 26/06/2020 11:13, Phillip Wood wrote:
> Hi Alban
> 
> On 25/06/2020 13:19, Alban Gruin wrote:
>> The "resolve" and "octopus" merge strategies do not call directly `git
>> merge-one-file', they delegate the work to another git command, `git
>> merge-index', that will loop over files in the index and call the
>> specified command.  Unfortunately, these functions are not part of
>> libgit.a, which means that once rewritten, the strategies would still
>> have to invoke `merge-one-file' by spawning a new process first.
>>
>> To avoid this, this moves merge_one_path(), merge_all(), and their
>> helpers to merge-strategies.c.  They also take a callback to dictate
>> what they should do for each file.  For now, only one launching a new
>> process is defined to preserve the behaviour of the builtin version.
>>
>> Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
>> ---
>>
>> Notes:
>>      This patch is best viewed with `--color-moved'.
>>
>>   builtin/merge-index.c | 77 +++------------------------------
>>   merge-strategies.c    | 99 +++++++++++++++++++++++++++++++++++++++++++
>>   merge-strategies.h    | 17 ++++++++
>>   3 files changed, 123 insertions(+), 70 deletions(-)
>>
>> diff --git a/builtin/merge-index.c b/builtin/merge-index.c
>> index 38ea6ad6ca..6cb666cc78 100644
>> --- a/builtin/merge-index.c
>> +++ b/builtin/merge-index.c
>> @@ -1,74 +1,11 @@
>>   #define USE_THE_INDEX_COMPATIBILITY_MACROS
>>   #include "builtin.h"
>> -#include "run-command.h"
>> -
>> -static const char *pgm;
>> -static int one_shot, quiet;
>> -static int err;
>> -
>> -static int merge_entry(int pos, const char *path)
>> -{
>> -	int found;
>> -	const char *arguments[] = { pgm, "", "", "", path, "", "", "", NULL };
>> -	char hexbuf[4][GIT_MAX_HEXSZ + 1];
>> -	char ownbuf[4][60];
>> -
>> -	if (pos >= active_nr)
>> -		die("git merge-index: %s not in the cache", path);
>> -	found = 0;
>> -	do {
>> -		const struct cache_entry *ce = active_cache[pos];
>> -		int stage = ce_stage(ce);
>> -
>> -		if (strcmp(ce->name, path))
>> -			break;
>> -		found++;
>> -		oid_to_hex_r(hexbuf[stage], &ce->oid);
>> -		xsnprintf(ownbuf[stage], sizeof(ownbuf[stage]), "%o", ce->ce_mode);
>> -		arguments[stage] = hexbuf[stage];
>> -		arguments[stage + 4] = ownbuf[stage];
>> -	} while (++pos < active_nr);
>> -	if (!found)
>> -		die("git merge-index: %s not in the cache", path);
>> -
>> -	if (run_command_v_opt(arguments, 0)) {
>> -		if (one_shot)
>> -			err++;
>> -		else {
>> -			if (!quiet)
>> -				die("merge program failed");
>> -			exit(1);
>> -		}
>> -	}
>> -	return found;
>> -}
>> -
>> -static void merge_one_path(const char *path)
>> -{
>> -	int pos = cache_name_pos(path, strlen(path));
>> -
>> -	/*
>> -	 * If it already exists in the cache as stage0, it's
>> -	 * already merged and there is nothing to do.
>> -	 */
>> -	if (pos < 0)
>> -		merge_entry(-pos-1, path);
>> -}
>> -
>> -static void merge_all(void)
>> -{
>> -	int i;
>> -	for (i = 0; i < active_nr; i++) {
>> -		const struct cache_entry *ce = active_cache[i];
>> -		if (!ce_stage(ce))
>> -			continue;
>> -		i += merge_entry(i, ce->name)-1;
>> -	}
>> -}
>> +#include "merge-strategies.h"
>>   
>>   int cmd_merge_index(int argc, const char **argv, const char *prefix)
>>   {
>> -	int i, force_file = 0;
>> +	int i, force_file = 0, err = 0, one_shot = 0, quiet = 0;
>> +	const char *pgm;
>>   
>>   	/* Without this we cannot rely on waitpid() to tell
>>   	 * what happened to our children.
>> @@ -98,14 +35,14 @@ int cmd_merge_index(int argc, const char **argv, const char *prefix)
>>   				continue;
>>   			}
>>   			if (!strcmp(arg, "-a")) {
>> -				merge_all();
>> +				err |= merge_all(&the_index, one_shot, quiet,
>> +						 merge_program_cb, (void *)pgm);
>>   				continue;
>>   			}
>>   			die("git merge-index: unknown option %s", arg);
>>   		}
>> -		merge_one_path(arg);
>> +		err |= merge_one_path(&the_index, one_shot, quiet, arg,
>> +				      merge_program_cb, (void *)pgm);
>>   	}
>> -	if (err && !quiet)
>> -		die("merge program failed");
>>   	return err;
>>   }
>> diff --git a/merge-strategies.c b/merge-strategies.c
>> index 3a9fce9f22..f4c0b4acd6 100644
>> --- a/merge-strategies.c
>> +++ b/merge-strategies.c
>> @@ -1,6 +1,7 @@
>>   #include "cache.h"
>>   #include "dir.h"
>>   #include "merge-strategies.h"
>> +#include "run-command.h"
>>   #include "xdiff-interface.h"
>>   
>>   static int add_to_index_cacheinfo(struct index_state *istate,
>> @@ -189,3 +190,101 @@ int merge_strategies_one_file(struct repository *r,
>>   
>>   	return 0;
>>   }
>> +
>> +int merge_program_cb(const struct object_id *orig_blob,
>> +		     const struct object_id *our_blob,
>> +		     const struct object_id *their_blob, const char *path,
>> +		     unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
>> +		     void *data)
> 
> Using void* is slightly unfortunate but it's needed later.
> 
> It would be nice to check if the program to run is git-merge-one-file
> and call the appropriate function instead in that case so all users of
> merge-index get the benefit of it being builtin. That probably wants to
> be done in cmd_merge_index() rather than here though.
> 
>> +{
>> +	char ownbuf[3][60] = {{0}};
> 
> I know this is copied from above but it would be better to use
> GIT_MAX_HEXSZ rather than 60
> 
>> +	const char *arguments[] = { (char *)data, "", "", "", path,
>> +				    ownbuf[0], ownbuf[1], ownbuf[2],
>> +				    NULL };
>> +
>> +	if (orig_blob)
>> +		arguments[1] = oid_to_hex(orig_blob);
>> +	if (our_blob)
>> +		arguments[2] = oid_to_hex(our_blob);
>> +	if (their_blob)
>> +		arguments[3] = oid_to_hex(their_blob);
>> +
>> +	xsnprintf(ownbuf[0], sizeof(ownbuf[0]), "%o", orig_mode);
>> +	xsnprintf(ownbuf[1], sizeof(ownbuf[1]), "%o", our_mode);
>> +	xsnprintf(ownbuf[2], sizeof(ownbuf[2]), "%o", their_mode);

Sorry, ignore all the comments below; they are nonsense.

Best Wishes

Phillip

> These are leaked. Also, are you sure we want to fill out the mode if the
> corresponding blob is missing? I guess it doesn't matter, but it would
> be good to check - I think the original passed "". It also passed
> "" rather than "0000..." for the blobs that were missing, I think.
> 
> Best Wishes
> 
> Phillip
> 
>> +
>> +	return run_command_v_opt(arguments, 0);
>> +}
>> +
>> +static int merge_entry(struct index_state *istate, int quiet, int pos,
>> +		       const char *path, merge_cb cb, void *data)
>> +{
>> +	int found = 0;
>> +	const struct object_id *oids[3] = {NULL};
>> +	unsigned int modes[3] = {0};
>> +
>> +	do {
>> +		const struct cache_entry *ce = istate->cache[pos];
>> +		int stage = ce_stage(ce);
>> +
>> +		if (strcmp(ce->name, path))
>> +			break;
>> +		found++;
>> +		oids[stage - 1] = &ce->oid;
>> +		modes[stage - 1] = ce->ce_mode;
>> +	} while (++pos < istate->cache_nr);
>> +	if (!found)
>> +		return error(_("%s is not in the cache"), path);
>> +
>> +	if (cb(oids[0], oids[1], oids[2], path, modes[0], modes[1], modes[2], data)) {
>> +		if (!quiet)
>> +			error(_("Merge program failed"));
>> +		return -2;
>> +	}
>> +
>> +	return found;
>> +}
>> +
>> +int merge_one_path(struct index_state *istate, int oneshot, int quiet,
>> +		   const char *path, merge_cb cb, void *data)
>> +{
>> +	int pos = index_name_pos(istate, path, strlen(path)), ret;
>> +
>> +	/*
>> +	 * If it already exists in the cache as stage0, it's
>> +	 * already merged and there is nothing to do.
>> +	 */
>> +	if (pos < 0) {
>> +		ret = merge_entry(istate, quiet, -pos - 1, path, cb, data);
>> +		if (ret == -1)
>> +			return -1;
>> +		else if (ret == -2)
>> +			return 1;
>> +	}
>> +	return 0;
>> +}
>> +
>> +int merge_all(struct index_state *istate, int oneshot, int quiet,
>> +	      merge_cb cb, void *data)
>> +{
>> +	int err = 0, i, ret;
>> +	for (i = 0; i < istate->cache_nr; i++) {
>> +		const struct cache_entry *ce = istate->cache[i];
>> +		if (!ce_stage(ce))
>> +			continue;
>> +
>> +		ret = merge_entry(istate, quiet, i, ce->name, cb, data);
>> +		if (ret > 0)
>> +			i += ret - 1;
>> +		else if (ret == -1)
>> +			return -1;
>> +		else if (ret == -2) {
>> +			if (oneshot)
>> +				err++;
>> +			else
>> +				return 1;
>> +		}
>> +	}
>> +
>> +	return err;
>> +}
>> diff --git a/merge-strategies.h b/merge-strategies.h
>> index b527d145c7..cf78d7eaf4 100644
>> --- a/merge-strategies.h
>> +++ b/merge-strategies.h
>> @@ -10,4 +10,21 @@ int merge_strategies_one_file(struct repository *r,
>>   			      unsigned int orig_mode, unsigned int our_mode,
>>   			      unsigned int their_mode);
>>   
>> +typedef int (*merge_cb)(const struct object_id *orig_blob,
>> +			const struct object_id *our_blob,
>> +			const struct object_id *their_blob, const char *path,
>> +			unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
>> +			void *data);
>> +
>> +int merge_program_cb(const struct object_id *orig_blob,
>> +		     const struct object_id *our_blob,
>> +		     const struct object_id *their_blob, const char *path,
>> +		     unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
>> +		     void *data);
>> +
>> +int merge_one_path(struct index_state *istate, int oneshot, int quiet,
>> +		   const char *path, merge_cb cb, void *data);
>> +int merge_all(struct index_state *istate, int oneshot, int quiet,
>> +	      merge_cb cb, void *data);
>> +
>>   #endif /* MERGE_STRATEGIES_H */
>>
> 

* Re: [RFC PATCH v1 02/17] merge-one-file: rewrite in C
  2020-06-25 18:17     ` Phillip Wood
@ 2020-06-26 14:33       ` Phillip Wood
  0 siblings, 0 replies; 221+ messages in thread
From: Phillip Wood @ 2020-06-26 14:33 UTC (permalink / raw)
  To: Alban Gruin, git; +Cc: Junio C Hamano

On 25/06/2020 19:17, Phillip Wood wrote:
> On 25/06/2020 16:16, Phillip Wood wrote:
>> Hi Alban
>>
>> I think this series is a great idea
>>
>> On 25/06/2020 13:19, Alban Gruin wrote:
>>> This rewrites `git merge-one-file' from shell to C.  This port is very
>>> straightforward: it keeps using external processes to edit the index,
>>> for instance.  Errors are also displayed with fprintf() instead of
>>> error().  Both of these will be addressed in the next few commits,
>>> leading to its libification so its main function can be used from other
>>> commands directly.
>>>
>>> This also fixes a bug present in the original script: instead of
>>> checking if a _regular_ file exists when a file exists in the branch to
>>> merge, but not in our branch, the rewritten version checks if a file of
>>> any kind (ie. a directory, ...) exists.  This fixes the tests t6035.14,
>>> where the branch to merge had a new file, `a/b', but our branch had a
>>> directory there; it should have failed because a directory exists, but
>>> it did not because there was no regular file called `a/b'.  This test is
>>> now marked as successful.
>>> [...]
>>> +static int merge_one_file(const struct object_id *orig_blob,
>>> +              const struct object_id *our_blob,
>>> +              const struct object_id *their_blob, const char *path,
>>> +              unsigned int orig_mode, unsigned int our_mode, unsigned
>>> int their_mode)
>>> +{
>>> +    if (orig_blob &&
>>> +        ((our_blob && oideq(orig_blob, our_blob)) ||
>>> +         (their_blob && oideq(orig_blob, their_blob))))
>>> +        return merge_one_file_deleted(orig_blob, our_blob,
>>> their_blob, path,
>>> +                          orig_mode, our_mode, their_mode);
>>
>> It would be nice to preserve the comments from the script as I find they
>> help a lot in understanding which case each piece of code is handling.
>> The code above appears to be handling deletions but does not appear to
>> check that one side is actually missing. Shouldn't it be something like
>>
>> if (orig_blob &&
>>      ((!their_blob && (our_blob && oideq(orig_blob, our_blob))) ||
>>       (!our_blob && (their_blob && oideq(orig_blob, their_blob))))
>>
>> Maybe this could do with a test case
> 
> The reason your version works is that if only one side has changed
> read-tree will have done the merge itself so this only gets called if
> one side has been deleted. However the original script printed an error
> if someone accidentally called when the content had changed in only one
> side and there were no mode changes. I think we want to keep that behavior.

Actually I think the original probably handles this case by calling 'git 
merge-file'

Best Wishes

Phillip
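
The stricter deletion check suggested earlier in this subthread (one side
must actually be missing, and the other must be unchanged from the base)
can be written as a small predicate.  This is only a sketch of that one
condition: `struct oid', oid_eq() and is_clean_deletion() are hypothetical
stand-ins for git's struct object_id and oideq(), and other cases the
original script may handle are not covered here.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for git's struct object_id. */
struct oid {
	unsigned char hash[20];
};

static int oid_eq(const struct oid *a, const struct oid *b)
{
	return !memcmp(a->hash, b->hash, sizeof(a->hash));
}

/*
 * Return 1 only when the path existed in the base, one side removed it,
 * and the other side left it unchanged.  A NULL pointer means the path
 * is absent from that stage.
 */
static int is_clean_deletion(const struct oid *orig, const struct oid *ours,
			     const struct oid *theirs)
{
	if (!orig)
		return 0;
	if (!theirs && ours && oid_eq(orig, ours))
		return 1;
	if (!ours && theirs && oid_eq(orig, theirs))
		return 1;
	return 0;
}
```

Unlike the condition in the patch, this returns 0 when our side is
unchanged and their side is present with different content.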

> In the future we could probably update this to also handle the cases
> that read-tree normally takes care of rather than erroring out but I
> don't think it is a high priority.
> 
> Best Wishes
> 
> Phillip
> 

* Re: [RFC PATCH v1 02/17] merge-one-file: rewrite in C
  2020-06-25 15:16   ` Phillip Wood
  2020-06-25 18:17     ` Phillip Wood
@ 2020-07-12 11:22     ` Alban Gruin
  1 sibling, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2020-07-12 11:22 UTC (permalink / raw)
  To: phillip.wood, git; +Cc: Junio C Hamano

Hi Phillip,

Phillip Wood (phillip.wood123@gmail.com) wrote:

> Hi Alban
> 
> I think this series is a great idea
> 
> On 25/06/2020 13:19, Alban Gruin wrote:
> -%<-
> > diff --git a/builtin/merge-one-file.c b/builtin/merge-one-file.c
> > new file mode 100644
> > index 0000000000..4992a6cd30
> > --- /dev/null
> > +++ b/builtin/merge-one-file.c
> > @@ -0,0 +1,275 @@
> > +/*
> > + * Builtin "git merge-one-file"
> > + *
> > + * Copyright (c) 2020 Alban Gruin
> > + *
> > + * Based on git-merge-one-file.sh, written by Linus Torvalds.
> > + *
> > + * This is the git per-file merge script, called with
> > + *
> > + *   $1 - original file SHA1 (or empty)
> > + *   $2 - file in branch1 SHA1 (or empty)
> > + *   $3 - file in branch2 SHA1 (or empty)
> > + *   $4 - pathname in repository
> > + *   $5 - original file mode (or empty)
> > + *   $6 - file in branch1 mode (or empty)
> > + *   $7 - file in branch2 mode (or empty)
> 
> nit pick - these are now argv[1] etc rather than $1 etc
> 

I'll change that, and replace "script" by "utility".

> > + *
> > + * Handle some trivial cases.. The _really_ trivial cases have
> > + * been handled already by git read-tree, but that one doesn't
> > + * do any merges that might change the tree layout.
> > + */
> > +
> > +#define USE_THE_INDEX_COMPATIBILITY_MACROS
> > +#include "cache.h"
> > +#include "builtin.h"
> > +#include "commit.h"
> > +#include "dir.h"
> > +#include "lockfile.h"
> > +#include "object-store.h"
> > +#include "run-command.h"
> > +#include "xdiff-interface.h"
> > +
> > +static int create_temp_file(const struct object_id *oid, struct strbuf
> > *path)
> > +{
> > +	struct child_process cp = CHILD_PROCESS_INIT;
> > +	struct strbuf err = STRBUF_INIT;
> > +	int ret;
> > +
> > +	cp.git_cmd = 1;
> > +	argv_array_pushl(&cp.args, "unpack-file", oid_to_hex(oid), NULL);
> > +	ret = pipe_command(&cp, NULL, 0, path, 0, &err, 0);
> > +	if (!ret && path->len > 0)
> > +		strbuf_trim_trailing_newline(path);
> > +
> > +	fprintf(stderr, "%.*s", (int) err.len, err.buf);
> > +	strbuf_release(&err);
> > +
> > +	return ret;
> > +}
> 
> I know others will disagree, but personally I'm not a huge fan of rewriting
> shell functions in C that fork other builtins and then converting the C to
> use the internal APIs; it seems much better to just write the proper C
> version the first time. This is especially true for simple functions such as
> the ones in this file. That way the reviewer gets a clear view of the final
> code from the patch, rather than having to piece it together from a series of
> additions and deletions.
> 

I understand -- I'll squash the "rewrite" and "use internal APIs" patches 
together as a last step for the v2, so I'd be able to get them back with 
all the changes made in the v2 if needed.

> -%<-
> > +static int do_merge_one_file(const struct object_id *orig_blob,
> > +			     const struct object_id *our_blob,
> > +			     const struct object_id *their_blob, const char
> > *path,
> > +			     unsigned int orig_mode, unsigned int our_mode,
> > unsigned int their_mode)
> > +{
> > +	int ret, source, dest;
> > +	struct strbuf src1 = STRBUF_INIT, src2 = STRBUF_INIT, orig =
> > STRBUF_INIT;
> > +	struct child_process cp_merge = CHILD_PROCESS_INIT,
> > +		cp_checkout = CHILD_PROCESS_INIT,
> > +		cp_update = CHILD_PROCESS_INIT;
> > +
> > +	if (our_mode == S_IFLNK || their_mode == S_IFLNK) {
> > +		fprintf(stderr, "ERROR: %s: Not merging symbolic link
> > changes.\n", path);
> > +		return 1;
> > +	} else if (our_mode == S_IFGITLINK || their_mode == S_IFGITLINK) {
> > +		fprintf(stderr, "ERROR: %s: Not merging conflicting submodule
> > changes.\n",
> > +			path);
> > +		return 1;
> > +	}
> > +
> > +	create_temp_file(our_blob, &src1);
> > +	create_temp_file(their_blob, &src2);
> > +
> > +	if (orig_blob) {
> > +		printf("Auto-merging %s\n", path);
> > +		create_temp_file(orig_blob, &orig);
> > +	} else {
> > +		printf("Added %s in both, but differently.\n", path);
> > +		create_temp_file(the_hash_algo->empty_blob, &orig);
> > +	}
> > +
> > +	cp_merge.git_cmd = 1;
> > +	argv_array_pushl(&cp_merge.args, "merge-file", src1.buf, orig.buf,
> > src2.buf,
> > +			 NULL);
> > +	ret = run_command(&cp_merge);
> > +
> > +	if (ret != 0)
> > +		ret = 1;
> > +
> > +	cp_checkout.git_cmd = 1;
> > +	argv_array_pushl(&cp_checkout.args, "checkout-index", "-f",
> > "--stage=2",
> > +			 "--", path, NULL);
> > +	if (run_command(&cp_checkout))
> > +		return 1;
> > +
> > +	source = open(src1.buf, O_RDONLY);
> > +	dest = open(path, O_WRONLY | O_TRUNC);
> > +
> > +	copy_fd(source, dest);
> > +
> > +	close(source);
> > +	close(dest);
> > +
> > +	unlink(orig.buf);
> > +	unlink(src1.buf);
> > +	unlink(src2.buf);
> > +
> > +	strbuf_release(&src1);
> > +	strbuf_release(&src2);
> > +	strbuf_release(&orig);
> 
> The whole business of creating temporary files and forking seems like a lot of
> effort compared to calling ll_merge() which would also mean we respect any
> merge attributes
> 
> > +
> > +	if (ret) {
> > +		fprintf(stderr, "ERROR: ");
> > +
> > +		if (!orig_blob) {
> 
> I think the original does if (ret || !orig_blob) not &&

Good catch.

> > +			fprintf(stderr, "content conflict");
> > +			if (our_mode != their_mode)
> > +				fprintf(stderr, ", ");
> 
> sentence lego, in any case the message below should be printed regardless of
> content conflicts. We should probably mark all these messages for translation
> as well.
> 

Yeah, I think I will replace them with two calls to `error()'.
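
A rough sketch of what two independent messages could look like, so that
neither depends on the other being printed.  fprintf() stands in for
error() here, and both the function name report_conflicts() and the exact
message wording are assumptions made for this illustration.

```c
#include <stdio.h>

/*
 * Hypothetical reporter: prints one complete message per kind of
 * conflict and returns how many were reported.  In git itself these
 * would be error() calls with the strings marked for translation.
 */
static int report_conflicts(FILE *out, const char *path, int content_conflict,
			    unsigned int our_mode, unsigned int their_mode)
{
	int nr = 0;

	if (content_conflict) {
		fprintf(out, "error: content conflict in %s\n", path);
		nr++;
	}
	if (our_mode != their_mode) {
		fprintf(out, "error: permission conflict: %o->%o in %s\n",
			our_mode, their_mode, path);
		nr++;
	}
	return nr;
}
```

Each message is a full sentence on its own, which avoids the sentence
lego and keeps the permission message independent of content conflicts.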

> -%<-
> > +int cmd_merge_one_file(int argc, const char **argv, const char *prefix)
> > +{
> > +	struct object_id orig_blob, our_blob, their_blob,
> > +		*p_orig_blob = NULL, *p_our_blob = NULL, *p_their_blob = NULL;
> > +	unsigned int orig_mode = 0, our_mode = 0, their_mode = 0;
> > +
> > +	if (argc != 8)
> > +		usage(builtin_merge_one_file_usage);
> > +
> > +	if (!get_oid(argv[1], &orig_blob)) {
> > +		p_orig_blob = &orig_blob;
> > +		orig_mode = strtol(argv[5], NULL, 8);
> 
> It would probably make sense to check that strtol() succeeds (and the mode is
> sensible), and also that get_oid() fails because argv[1] is empty, not because
> it is invalid.
> 

Checking that `orig_mode' and friends are lower than 0800, and that 
`*argv[1]' is not equal to '\0' should be enough, right?
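
One possible shape for the mode check, as a sketch under stated
assumptions: parse_octal_mode() is a hypothetical helper, and the upper
bound is left to the caller because the thread has not settled on what
counts as a sensible mode.  It rejects empty strings, signs, trailing
garbage and out-of-range values; the separate test for an empty argv[1]
would still be needed before calling get_oid().

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Parse an octal mode string.  Returns 0 and stores the value on success,
 * -1 if the string is empty, does not start with an octal digit, has
 * trailing garbage, overflows, or exceeds max_mode.  max_mode is a
 * caller-chosen bound (an assumption of this sketch).
 */
static int parse_octal_mode(const char *s, unsigned int max_mode,
			    unsigned int *out)
{
	char *end;
	unsigned long val;

	/* Also rejects "", leading whitespace and signs accepted by strtoul(). */
	if (*s < '0' || *s > '7')
		return -1;
	errno = 0;
	val = strtoul(s, &end, 8);
	if (errno || *end || val > max_mode)
		return -1;
	*out = (unsigned int)val;
	return 0;
}
```

Checking the end pointer is what distinguishes "100644" from "100648",
which a bare strtol() call would silently truncate.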

> Thanks for working on this

As always, thank you for your reviews.

> Best Wishes
> 
> Phillip
> 
> 

Cheers,
Alban

* Re: [RFC PATCH v1 16/17] sequencer: use the "resolve" strategy without forking
  2020-06-25 16:11   ` Phillip Wood
@ 2020-07-12 11:27     ` Alban Gruin
  0 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2020-07-12 11:27 UTC (permalink / raw)
  To: phillip.wood, git; +Cc: Junio C Hamano

Hi Phillip,

Phillip Wood (phillip.wood123@gmail.com) wrote:

> Hi Alban
> 
> On 25/06/2020 13:19, Alban Gruin wrote:
> > This teaches the sequencer to invoke the "resolve" strategy with a
> > function call instead of forking.
> 
> This is a good idea, however we should check the existing tests that use this
> strategy to see if they are doing so to test the try_merge_command() code
> path. I've got some patches in seen that use '--strategy=resolve' to exercise
> the "non merge-recursive" code path, so I'll update them to use a proper
> custom merge strategy.
> 
> Is it worth optimizing do_merge() to take advantage of resolve and octopus
> being builtin as well?
> 

Hmm, I see that do_merge() doesn't call the strategies directly, and 
delegates this work to git-merge.  If calling the new APIs does not require 
copying too much code from merge.c, then my answer is yes.

> Best Wishes
> 
> Phil
> 

Cheers,
Alban

* Re: [RFC PATCH v1 06/17] merge-index: libify merge_one_path() and merge_all()
  2020-06-26 10:13   ` Phillip Wood
  2020-06-26 14:32     ` Phillip Wood
@ 2020-07-12 11:36     ` Alban Gruin
  2020-07-12 18:02       ` Phillip Wood
  1 sibling, 1 reply; 221+ messages in thread
From: Alban Gruin @ 2020-07-12 11:36 UTC (permalink / raw)
  To: Phillip Wood, git; +Cc: Junio C Hamano

Hi Phillip,

Phillip Wood (phillip.wood123@gmail.com) wrote:

> Hi Alban
> 
> On 25/06/2020 13:19, Alban Gruin wrote:
> -%<-
> > diff --git a/merge-strategies.c b/merge-strategies.c
> > index 3a9fce9f22..f4c0b4acd6 100644
> > --- a/merge-strategies.c
> > +++ b/merge-strategies.c
> > @@ -1,6 +1,7 @@
> >  #include "cache.h"
> >  #include "dir.h"
> >  #include "merge-strategies.h"
> > +#include "run-command.h"
> >  #include "xdiff-interface.h"
> >  
> >  static int add_to_index_cacheinfo(struct index_state *istate,
> > @@ -189,3 +190,101 @@ int merge_strategies_one_file(struct repository *r,
> >  
> >  	return 0;
> >  }
> > +
> > +int merge_program_cb(const struct object_id *orig_blob,
> > +		     const struct object_id *our_blob,
> > +		     const struct object_id *their_blob, const char *path,
> > +		     unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
> > +		     void *data)
> 
> Using void* is slightly unfortunate but it's needed later.
> 
> It would be nice to check if the program to run is git-merge-one-file
> and call the appropriate function instead in that case so all users of
> merge-index get the benefit of it being builtin. That probably wants to
> be done in cmd_merge_index() rather than here though.
> 

Dunno, I am not completely comfortable with changing a parameter that 
specifically describes a program to a parameter that may be a program, 
except in one case where `merge-index' should lock the index, set up the 
worktree, and call a function instead.

Well, I say that, but implementing that behaviour is not that hard:

-- snip --
diff --git a/builtin/merge-index.c b/builtin/merge-index.c
index 6cb666cc78..19fff9a113 100644
--- a/builtin/merge-index.c
+++ b/builtin/merge-index.c
@@ -1,11 +1,15 @@
 #define USE_THE_INDEX_COMPATIBILITY_MACROS
 #include "builtin.h"
+#include "lockfile.h"
 #include "merge-strategies.h"
 
 int cmd_merge_index(int argc, const char **argv, const char *prefix)
 {
 	int i, force_file = 0, err = 0, one_shot = 0, quiet = 0;
 	const char *pgm;
+	void *data;
+	merge_cb merge_action;
+	struct lock_file lock = LOCK_INIT;
 
 	/* Without this we cannot rely on waitpid() to tell
 	 * what happened to our children.
@@ -26,7 +30,19 @@ int cmd_merge_index(int argc, const char **argv, const char *prefix)
 		quiet = 1;
 		i++;
 	}
+
 	pgm = argv[i++];
+	if (!strcmp(pgm, "git-merge-one-file")) {
+		merge_action = merge_one_file_cb;
+		data = (void *)the_repository;
+
+		setup_work_tree();
+		hold_locked_index(&lock, LOCK_DIE_ON_ERROR);
+	} else {
+		merge_action = merge_program_cb;
+		data = (void *)pgm;
+	}
+
 	for (; i < argc; i++) {
 		const char *arg = argv[i];
 		if (!force_file && *arg == '-') {
@@ -36,13 +52,22 @@ int cmd_merge_index(int argc, const char **argv, const char *prefix)
 			}
 			if (!strcmp(arg, "-a")) {
 				err |= merge_all(&the_index, one_shot, quiet,
-						 merge_program_cb, (void *)pgm);
+						 merge_action, data);
 				continue;
 			}
 			die("git merge-index: unknown option %s", arg);
 		}
 		err |= merge_one_path(&the_index, one_shot, quiet, arg,
-				      merge_program_cb, (void *)pgm);
+				      merge_action, data);
+	}
+
+	if (merge_action == merge_one_file_cb) {
+		if (err) {
+			rollback_lock_file(&lock);
+			return err;
+		}
+
+		return write_locked_index(&the_index, &lock, COMMIT_LOCK);
 	}
 	return err;
 }
-- snap --
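
Stripped of the git specifics, the dispatch in the patch above amounts to
picking a callback and its `void *data' from the program name.  The
following standalone sketch shows that selection only; action_fn,
`struct dispatch', pick_action() and the two placeholder callbacks are
hypothetical, and the `needs_index_lock' flag is a design variation for
illustration rather than something the patch does (the patch compares the
function pointer instead).

```c
#include <string.h>

/* Simplified callback shape; the real merge_cb takes oids and modes too. */
typedef int (*action_fn)(const char *path, void *data);

/* Placeholder for the in-process merge; always succeeds in this sketch. */
static int in_process_action(const char *path, void *data)
{
	(void)path;
	(void)data;
	return 0;
}

/* Placeholder for spawning `data' (the program name) on `path'. */
static int spawn_action(const char *path, void *data)
{
	(void)path;
	return data ? 0 : 1;
}

struct dispatch {
	action_fn fn;
	void *data;
	int needs_index_lock; /* only the in-process path writes the index itself */
};

/*
 * Choose the callback from the program name.  `repo' stands in for
 * the_repository; it is only handed to the in-process callback.
 */
static struct dispatch pick_action(const char *pgm, void *repo)
{
	struct dispatch d;

	if (!strcmp(pgm, "git-merge-one-file")) {
		d.fn = in_process_action;
		d.data = repo;
		d.needs_index_lock = 1;
	} else {
		d.fn = spawn_action;
		d.data = (void *)pgm;
		d.needs_index_lock = 0;
	}
	return d;
}
```

Recording the locking requirement next to the callback keeps the later
"commit or roll back the lock" decision from depending on which function
was selected.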

> > +{
> > +	char ownbuf[3][60] = {{0}};
> 
> I know this is copied from above but it would be better to use
> GIT_MAX_HEXSZ rather than 60
> 

Cheers,
Alban


* Re: [RFC PATCH v1 06/17] merge-index: libify merge_one_path() and merge_all()
  2020-07-12 11:36     ` Alban Gruin
@ 2020-07-12 18:02       ` Phillip Wood
  2020-07-12 20:10         ` Alban Gruin
  0 siblings, 1 reply; 221+ messages in thread
From: Phillip Wood @ 2020-07-12 18:02 UTC (permalink / raw)
  To: Alban Gruin, git; +Cc: Junio C Hamano

Hi Alban

On 12/07/2020 12:36, Alban Gruin wrote:
> Hi Phillip,
> 
> Phillip Wood (phillip.wood123@gmail.com) wrote:
> 
>> Hi Alban
>>
>> On 25/06/2020 13:19, Alban Gruin wrote:
>> -%<-
>>> diff --git a/merge-strategies.c b/merge-strategies.c
>>> index 3a9fce9f22..f4c0b4acd6 100644
>>> --- a/merge-strategies.c
>>> +++ b/merge-strategies.c
>>> @@ -1,6 +1,7 @@
>>>  #include "cache.h"
>>>  #include "dir.h"
>>>  #include "merge-strategies.h"
>>> +#include "run-command.h"
>>>  #include "xdiff-interface.h"
>>>  
>>>  static int add_to_index_cacheinfo(struct index_state *istate,
>>> @@ -189,3 +190,101 @@ int merge_strategies_one_file(struct repository *r,
>>>  
>>>  	return 0;
>>>  }
>>> +
>>> +int merge_program_cb(const struct object_id *orig_blob,
>>> +		     const struct object_id *our_blob,
>>> +		     const struct object_id *their_blob, const char *path,
>>> +		     unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
>>> +		     void *data)
>>
>> Using void* is slightly unfortunate but it's needed later.
>>
>> It would be nice to check if the program to run is git-merge-one-file
>> and call the appropriate function instead in that case so all users of
>> merge-index get the benefit of it being builtin. That probably wants to
>> be done in cmd_merge_index() rather than here though.
>>
> 
> Dunno, I am not completely comfortable with changing a parameter that 
> specifically describes a program to a parameter that may be a program, 
> except in one case where `merge-index' should lock the index, set up the 
> worktree, and call a function instead.

There is some previous discussion about this at
https://lore.kernel.org/git/xmqqblv5kr9u.fsf@gitster-ct.c.googlers.com/

I'll try and have a proper look at your comments towards the end of the
week (or maybe the week after the way things are at the moment...)

Best Wishes

Phillip

> Well, I say that, but implementing that behaviour is not that hard:
> 
> -- snip --
> diff --git a/builtin/merge-index.c b/builtin/merge-index.c
> index 6cb666cc78..19fff9a113 100644
> --- a/builtin/merge-index.c
> +++ b/builtin/merge-index.c
> @@ -1,11 +1,15 @@
>  #define USE_THE_INDEX_COMPATIBILITY_MACROS
>  #include "builtin.h"
> +#include "lockfile.h"
>  #include "merge-strategies.h"
>  
>  int cmd_merge_index(int argc, const char **argv, const char *prefix)
>  {
>  	int i, force_file = 0, err = 0, one_shot = 0, quiet = 0;
>  	const char *pgm;
> +	void *data;
> +	merge_cb merge_action;
> +	struct lock_file lock = LOCK_INIT;
>  
>  	/* Without this we cannot rely on waitpid() to tell
>  	 * what happened to our children.
> @@ -26,7 +30,19 @@ int cmd_merge_index(int argc, const char **argv, const char *prefix)
>  		quiet = 1;
>  		i++;
>  	}
> +
>  	pgm = argv[i++];
> +	if (!strcmp(pgm, "git-merge-one-file")) {
> +		merge_action = merge_one_file_cb;
> +		data = (void *)the_repository;
> +
> +		setup_work_tree();
> +		hold_locked_index(&lock, LOCK_DIE_ON_ERROR);
> +	} else {
> +		merge_action = merge_program_cb;
> +		data = (void *)pgm;
> +	}
> +
>  	for (; i < argc; i++) {
>  		const char *arg = argv[i];
>  		if (!force_file && *arg == '-') {
> @@ -36,13 +52,22 @@ int cmd_merge_index(int argc, const char **argv, const char *prefix)
>  			}
>  			if (!strcmp(arg, "-a")) {
>  				err |= merge_all(&the_index, one_shot, quiet,
> -						 merge_program_cb, (void *)pgm);
> +						 merge_action, data);
>  				continue;
>  			}
>  			die("git merge-index: unknown option %s", arg);
>  		}
>  		err |= merge_one_path(&the_index, one_shot, quiet, arg,
> -				      merge_program_cb, (void *)pgm);
> +				      merge_action, data);
> +	}
> +
> +	if (merge_action == merge_one_file_cb) {
> +		if (err) {
> +			rollback_lock_file(&lock);
> +			return err;
> +		}
> +
> +		return write_locked_index(&the_index, &lock, COMMIT_LOCK);
>  	}
>  	return err;
>  }
> -- snap --
> 
>>> +{
>>> +	char ownbuf[3][60] = {{0}};
>>
>> I know this is copied from above but it would be better to use
>> GIT_MAX_HEXSZ rather than 60
>>
> 
> Cheers,
> Alban
> 


* Re: [RFC PATCH v1 06/17] merge-index: libify merge_one_path() and merge_all()
  2020-07-12 18:02       ` Phillip Wood
@ 2020-07-12 20:10         ` Alban Gruin
  0 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2020-07-12 20:10 UTC (permalink / raw)
  To: Phillip Wood, git; +Cc: Junio C Hamano

Hi Phillip,

Phillip Wood (phillip.wood123@gmail.com) wrote:

> Hi Alban
> 
> On 12/07/2020 12:36, Alban Gruin wrote:
> > Hi Phillip,
> > 
> > Phillip Wood (phillip.wood123@gmail.com) wrote:
> > 
> >> Hi Alban
> >>
> >> On 25/06/2020 13:19, Alban Gruin wrote:
> >> -%<-
> >>> diff --git a/merge-strategies.c b/merge-strategies.c
> >>> index 3a9fce9f22..f4c0b4acd6 100644
> >>> --- a/merge-strategies.c
> >>> +++ b/merge-strategies.c
> >>> @@ -1,6 +1,7 @@
> >>>  #include "cache.h"
> >>>  #include "dir.h"
> >>>  #include "merge-strategies.h"
> >>> +#include "run-command.h"
> >>>  #include "xdiff-interface.h"
> >>>  
> >>>  static int add_to_index_cacheinfo(struct index_state *istate,
> >>> @@ -189,3 +190,101 @@ int merge_strategies_one_file(struct repository *r,
> >>>  
> >>>  	return 0;
> >>>  }
> >>> +
> >>> +int merge_program_cb(const struct object_id *orig_blob,
> >>> +		     const struct object_id *our_blob,
> >>> +		     const struct object_id *their_blob, const char *path,
> >>> +		     unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
> >>> +		     void *data)
> >>
> >> Using void* is slightly unfortunate but it's needed later.
> >>
> >> It would be nice to check if the program to run is git-merge-one-file
> >> and call the appropriate function instead in that case so all users of
> >> merge-index get the benefit of it being builtin. That probably wants to
> >> be done in cmd_merge_index() rather than here though.
> >>
> > 
> > Dunno, I am not completely comfortable with changing a parameter that
> > specifically describes a program into a parameter that may be a program,
> > except in one case where `merge-index' should lock the index, set up the
> > worktree, and call a function instead.
> 
> There is some previous discussion about this at
> https://lore.kernel.org/git/xmqqblv5kr9u.fsf@gitster-ct.c.googlers.com/
> 

Thanks.  If no-one seems really against doing that, I'll include the patch 
below in the v2, with an additional note in the man page.

> I'll try and have a proper look at your comments towards the end of the
> week (or maybe the week after the way things are at the moment...)
> 
> Best Wishes
> 
> Phillip
> 

Cheers,
Alban


^ permalink raw reply	[flat|nested] 221+ messages in thread

* [PATCH v2 00/11] Rewrite the remaining merge strategies from shell to C
  2020-06-25 12:19 [RFC PATCH v1 00/17] Rewrite the remaining merge strategies from shell to C Alban Gruin
                   ` (16 preceding siblings ...)
  2020-06-25 12:19 ` [RFC PATCH v1 17/17] sequencer: use the "octopus" merge " Alban Gruin
@ 2020-09-01 10:56 ` Alban Gruin
  2020-09-01 10:56   ` [PATCH v2 01/11] t6027: modernise tests Alban Gruin
                     ` (11 more replies)
  17 siblings, 12 replies; 221+ messages in thread
From: Alban Gruin @ 2020-09-01 10:56 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, phillip.wood, Alban Gruin

In an effort to reduce the number of shell scripts in git's codebase, I
propose this patch series converting the two remaining merge strategies,
resolve and octopus, from shell to C.  This will enable slightly better
performance, better integration with git itself (no more forking to
perform these operations), and better portability (Windows and shell
scripts don't mix well).

Three scripts are actually converted: first git-merge-one-file.sh, then
git-merge-resolve.sh, and finally git-merge-octopus.sh.  Not only are
they converted, but they are also modified to operate without forking,
and then libified so they can be used by git without spawning another
process.

The first patch is not important to make the whole series work, but I
made this patch while working on it.

This series keeps the commands `git merge-one-file', `git
merge-resolve', and `git merge-octopus', so any script depending on them
should keep working without any changes.

This series is based on d9cd433147 (po: add missing letter for French
message, 2020-08-27).  The tip is tagged as
"rewrite-merge-strategies-v2" at https://github.com/agrn/git.

Changes since v1:

 - Merged commits rewriting and libifying scripts.

 - Introduce checks in merge-one-file to check that file modes are
   correct.

 - Use ll_merge() instead of xdl_merge().

 - merge-index no longer forks to call git-merge-one-file.

 - Remove usage of the_index in merge-one-file.c.

 - Mark more strings for translation.

 - Carry more comments from the original scripts.

 - Use GIT_MAX_HEXSZ instead of hardcoding 60.
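
As a rough illustration of the mode check mentioned in the list above,
here is a minimal standalone sketch; it is not the code from the series.
The helper name parse_merge_mode() is hypothetical.  It assumes modes
arrive as octal strings, as they do on the `merge-one-file' command
line, and that only regular files, directories, and symbolic links are
acceptable:

```c
#include <errno.h>
#include <stdlib.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: parse an octal mode string such as "100644"
 * and accept it only if it describes a regular file, a directory or
 * a symbolic link.  Returns 0 and stores the mode on success, -1
 * otherwise.
 */
static int parse_merge_mode(const char *arg, unsigned int *mode)
{
	char *end;
	unsigned long val;

	errno = 0;
	val = strtoul(arg, &end, 8);
	if (errno || end == arg || *end != '\0')
		return -1; /* not a clean octal number */

	if (!(S_ISREG(val) || S_ISDIR(val) || S_ISLNK(val)))
		return -1; /* e.g. a bare permission mask such as 644 */

	*mode = (unsigned int)val;
	return 0;
}
```

Unlike a bare strtol() call, this sketch also rejects trailing garbage
and empty strings; whether the series should be that strict is a
separate question.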

Alban Gruin (11):
  t6027: modernise tests
  merge-one-file: rewrite in C
  merge-index: libify merge_one_path() and merge_all()
  merge-index: don't fork if the requested program is
    `git-merge-one-file'
  merge-resolve: rewrite in C
  merge-recursive: move better_branch_name() to merge.c
  merge-octopus: rewrite in C
  merge: use the "resolve" strategy without forking
  merge: use the "octopus" strategy without forking
  sequencer: use the "resolve" strategy without forking
  sequencer: use the "octopus" merge strategy without forking

 Makefile                        |   7 +-
 builtin.h                       |   3 +
 builtin/merge-index.c           | 102 ++----
 builtin/merge-octopus.c         |  65 ++++
 builtin/merge-one-file.c        |  85 +++++
 builtin/merge-recursive.c       |  16 +-
 builtin/merge-resolve.c         |  69 ++++
 builtin/merge.c                 |   9 +-
 cache.h                         |   2 +-
 git-merge-octopus.sh            | 112 ------
 git-merge-one-file.sh           | 167 ---------
 git-merge-resolve.sh            |  54 ---
 git.c                           |   3 +
 merge-strategies.c              | 594 ++++++++++++++++++++++++++++++++
 merge-strategies.h              |  44 +++
 merge.c                         |  12 +
 sequencer.c                     |  16 +-
 t/t6407-merge-binary.sh         |  27 +-
 t/t6415-merge-dir-to-symlink.sh |   2 +-
 19 files changed, 942 insertions(+), 447 deletions(-)
 create mode 100644 builtin/merge-octopus.c
 create mode 100644 builtin/merge-one-file.c
 create mode 100644 builtin/merge-resolve.c
 delete mode 100755 git-merge-octopus.sh
 delete mode 100755 git-merge-one-file.sh
 delete mode 100755 git-merge-resolve.sh
 create mode 100644 merge-strategies.c
 create mode 100644 merge-strategies.h

Range-diff against v1:
 1:  50e15b5243 !  1:  28c8fd11b6 t6027: modernise tests
    @@ Commit message
     
         Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
     
    - ## t/t6027-merge-binary.sh ##
    -@@ t/t6027-merge-binary.sh: test_description='ask merge-recursive to merge binary files'
    + ## t/t6407-merge-binary.sh ##
    +@@ t/t6407-merge-binary.sh: test_description='ask merge-recursive to merge binary files'
      . ./test-lib.sh
      
      test_expect_success setup '
    @@ t/t6027-merge-binary.sh: test_description='ask merge-recursive to merge binary f
      	cat "$TEST_DIRECTORY"/test-binary-1.png >m &&
      	git add m &&
      	git ls-files -s | sed -e "s/ 0	/ 1	/" >E1 &&
    -@@ t/t6027-merge-binary.sh: test_expect_success setup '
    +@@ t/t6407-merge-binary.sh: test_expect_success setup '
      '
      
      test_expect_success resolve '
 2:  08a337738e <  -:  ---------- merge-one-file: rewrite in C
 3:  5da78d5de1 <  -:  ---------- merge-one-file: remove calls to external processes
 4:  11c0da9e13 <  -:  ---------- merge-one-file: use error() instead of fprintf(stderr, ...)
 5:  df28965c8e <  -:  ---------- merge-one-file: libify merge_one_file()
 -:  ---------- >  2:  f5ab0fdf0a merge-one-file: rewrite in C
 6:  84f2f2946a !  3:  7f3ce7da17 merge-index: libify merge_one_path() and merge_all()
    @@ builtin/merge-index.c: int cmd_merge_index(int argc, const char **argv, const ch
     
      ## merge-strategies.c ##
     @@
    - #include "cache.h"
      #include "dir.h"
    + #include "ll-merge.h"
      #include "merge-strategies.h"
     +#include "run-command.h"
      #include "xdiff-interface.h"
    @@ merge-strategies.c: int merge_strategies_one_file(struct repository *r,
     +		     unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
     +		     void *data)
     +{
    -+	char ownbuf[3][60] = {{0}};
    ++	char ownbuf[3][GIT_MAX_HEXSZ] = {{0}};
     +	const char *arguments[] = { (char *)data, "", "", "", path,
     +				    ownbuf[0], ownbuf[1], ownbuf[2],
     +				    NULL };
 7:  1f864a4840 <  -:  ---------- merge-resolve: rewrite in C
 8:  3517990e6a <  -:  ---------- merge-resolve: remove calls to external processes
 9:  9831fe1729 <  -:  ---------- merge-resolve: libify merge_resolve()
 -:  ---------- >  4:  07e6a6aaef merge-index: don't fork if the requested program is `git-merge-one-file'
 -:  ---------- >  5:  117d4fc840 merge-resolve: rewrite in C
10:  99d42e8ea1 =  6:  4fc955962b merge-recursive: move better_branch_name() to merge.c
11:  3182673ea7 <  -:  ---------- merge-octopus: rewrite in C
12:  8f4cfcefb7 <  -:  ---------- merge-octopus: remove calls to external processes
13:  d4dba22988 <  -:  ---------- merge-octopus: libify merge_octopus()
 -:  ---------- >  7:  e7b9e15b34 merge-octopus: rewrite in C
14:  bbe50cd770 =  8:  cd0662201d merge: use the "resolve" strategy without forking
15:  b7aff6fb3a =  9:  0525ff0183 merge: use the "octopus" strategy without forking
16:  c1cdcce3a9 = 10:  6fbf599ba4 sequencer: use the "resolve" strategy without forking
17:  e68765cdc7 = 11:  2c2dc3cc62 sequencer: use the "octopus" merge strategy without forking
-- 
2.28.0.370.g2c2dc3cc62


^ permalink raw reply	[flat|nested] 221+ messages in thread

* [PATCH v2 01/11] t6027: modernise tests
  2020-09-01 10:56 ` [PATCH v2 00/11] Rewrite the remaining merge strategies from shell to C Alban Gruin
@ 2020-09-01 10:56   ` Alban Gruin
  2020-09-01 10:56   ` [PATCH v2 02/11] merge-one-file: rewrite in C Alban Gruin
                     ` (10 subsequent siblings)
  11 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2020-09-01 10:56 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, phillip.wood, Alban Gruin

Some tests in t6027 use an if/then/else to check whether a command
failed, but nowadays we have the `test_must_fail' function to do this
correctly for us.

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 t/t6407-merge-binary.sh | 27 ++++++---------------------
 1 file changed, 6 insertions(+), 21 deletions(-)

diff --git a/t/t6407-merge-binary.sh b/t/t6407-merge-binary.sh
index 4e6c7cb77e..071d3f7343 100755
--- a/t/t6407-merge-binary.sh
+++ b/t/t6407-merge-binary.sh
@@ -5,7 +5,6 @@ test_description='ask merge-recursive to merge binary files'
 . ./test-lib.sh
 
 test_expect_success setup '
-
 	cat "$TEST_DIRECTORY"/test-binary-1.png >m &&
 	git add m &&
 	git ls-files -s | sed -e "s/ 0	/ 1	/" >E1 &&
@@ -35,33 +34,19 @@ test_expect_success setup '
 '
 
 test_expect_success resolve '
-
 	rm -f a* m* &&
 	git reset --hard anchor &&
-
-	if git merge -s resolve master
-	then
-		echo Oops, should not have succeeded
-		false
-	else
-		git ls-files -s >current
-		test_cmp expect current
-	fi
+	test_must_fail git merge -s resolve master &&
+	git ls-files -s >current &&
+	test_cmp expect current
 '
 
 test_expect_success recursive '
-
 	rm -f a* m* &&
 	git reset --hard anchor &&
-
-	if git merge -s recursive master
-	then
-		echo Oops, should not have succeeded
-		false
-	else
-		git ls-files -s >current
-		test_cmp expect current
-	fi
+	test_must_fail git merge -s recursive master &&
+	git ls-files -s >current &&
+	test_cmp expect current
 '
 
 test_done
-- 
2.28.0.370.g2c2dc3cc62


^ permalink raw reply related	[flat|nested] 221+ messages in thread

* [PATCH v2 02/11] merge-one-file: rewrite in C
  2020-09-01 10:56 ` [PATCH v2 00/11] Rewrite the remaining merge strategies from shell to C Alban Gruin
  2020-09-01 10:56   ` [PATCH v2 01/11] t6027: modernise tests Alban Gruin
@ 2020-09-01 10:56   ` Alban Gruin
  2020-09-01 21:06     ` Junio C Hamano
  2020-09-01 10:56   ` [PATCH v2 03/11] merge-index: libify merge_one_path() and merge_all() Alban Gruin
                     ` (9 subsequent siblings)
  11 siblings, 1 reply; 221+ messages in thread
From: Alban Gruin @ 2020-09-01 10:56 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, phillip.wood, Alban Gruin

This rewrites `git merge-one-file' from shell to C.  This port is not
completely straightforward: to save precious cycles by avoiding reading
and flushing the index repeatedly, to avoid writing temporary files when
an operation can be performed in-memory, and to allow other functions to
use the rewrite without forking or worrying about the index, the calls
to external processes are replaced by calls to functions in libgit.a:

 - calls to `update-index --add --cacheinfo' are replaced by calls to
   add_cache_entry();

 - calls to `update-index --remove' are replaced by calls to
   remove_file_from_cache();

 - calls to `checkout-index -u -f' are replaced by calls to
   checkout_entry();

 - calls to `unpack-file' and `merge-file' are replaced by calls to
   read_mmblob() and ll_merge(), respectively, to merge files
   in-memory;

 - calls to `checkout-index -f --stage=2' are replaced by calls to
   cache_file_exists();

 - calls to `update-index' are replaced by calls to add_file_to_cache().

The bulk of the rewrite is done in a new file in libgit.a,
merge-strategies.c.  This will enable the resolve and octopus strategies
to directly call it instead of forking.

This also fixes a bug present in the original script: instead of
checking if a _regular_ file exists when a file exists in the branch to
merge, but not in our branch, the rewritten version checks if a file of
any kind (i.e. a directory, ...) exists.  This fixes test t6035.14,
where the branch to merge had a new file, `a/b', but our branch had a
directory there; it should have failed because a directory exists, but
it did not because there was no regular file called `a/b'.  This test is
now marked as successful.
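
The difference between the two checks can be sketched with plain POSIX
calls.  This is only an illustration under stated assumptions, not the
code in merge-strategies.c: the helper names is_regular_file() and
path_exists() are hypothetical, the first approximating the shell's
`test -f' and the second an lstat()-based "anything is there" check:

```c
#include <sys/stat.h>

/* Rough equivalent of the shell's `test -f': true only for regular files. */
static int is_regular_file(const char *path)
{
	struct stat st;

	return !stat(path, &st) && S_ISREG(st.st_mode);
}

/* True if anything at all (file, directory, symlink, ...) is at `path'. */
static int path_exists(const char *path)
{
	struct stat st;

	return !lstat(path, &st);
}
```

For a path that is a directory, is_regular_file() returns 0 while
path_exists() returns 1, which is the situation the old script let
through.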

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 Makefile                        |   3 +-
 builtin.h                       |   1 +
 builtin/merge-one-file.c        |  85 ++++++++++++++
 git-merge-one-file.sh           | 167 ---------------------------
 git.c                           |   1 +
 merge-strategies.c              | 199 ++++++++++++++++++++++++++++++++
 merge-strategies.h              |  13 +++
 t/t6415-merge-dir-to-symlink.sh |   2 +-
 8 files changed, 302 insertions(+), 169 deletions(-)
 create mode 100644 builtin/merge-one-file.c
 delete mode 100755 git-merge-one-file.sh
 create mode 100644 merge-strategies.c
 create mode 100644 merge-strategies.h

diff --git a/Makefile b/Makefile
index 65f8cfb236..8849d54063 100644
--- a/Makefile
+++ b/Makefile
@@ -596,7 +596,6 @@ SCRIPT_SH += git-bisect.sh
 SCRIPT_SH += git-difftool--helper.sh
 SCRIPT_SH += git-filter-branch.sh
 SCRIPT_SH += git-merge-octopus.sh
-SCRIPT_SH += git-merge-one-file.sh
 SCRIPT_SH += git-merge-resolve.sh
 SCRIPT_SH += git-mergetool.sh
 SCRIPT_SH += git-quiltimport.sh
@@ -911,6 +910,7 @@ LIB_OBJS += match-trees.o
 LIB_OBJS += mem-pool.o
 LIB_OBJS += merge-blobs.o
 LIB_OBJS += merge-recursive.o
+LIB_OBJS += merge-strategies.o
 LIB_OBJS += merge.o
 LIB_OBJS += mergesort.o
 LIB_OBJS += midx.o
@@ -1089,6 +1089,7 @@ BUILTIN_OBJS += builtin/mailsplit.o
 BUILTIN_OBJS += builtin/merge-base.o
 BUILTIN_OBJS += builtin/merge-file.o
 BUILTIN_OBJS += builtin/merge-index.o
+BUILTIN_OBJS += builtin/merge-one-file.o
 BUILTIN_OBJS += builtin/merge-ours.o
 BUILTIN_OBJS += builtin/merge-recursive.o
 BUILTIN_OBJS += builtin/merge-tree.o
diff --git a/builtin.h b/builtin.h
index a5ae15bfe5..9205d5ecdc 100644
--- a/builtin.h
+++ b/builtin.h
@@ -172,6 +172,7 @@ int cmd_merge_base(int argc, const char **argv, const char *prefix);
 int cmd_merge_index(int argc, const char **argv, const char *prefix);
 int cmd_merge_ours(int argc, const char **argv, const char *prefix);
 int cmd_merge_file(int argc, const char **argv, const char *prefix);
+int cmd_merge_one_file(int argc, const char **argv, const char *prefix);
 int cmd_merge_recursive(int argc, const char **argv, const char *prefix);
 int cmd_merge_tree(int argc, const char **argv, const char *prefix);
 int cmd_mktag(int argc, const char **argv, const char *prefix);
diff --git a/builtin/merge-one-file.c b/builtin/merge-one-file.c
new file mode 100644
index 0000000000..306a86c2f0
--- /dev/null
+++ b/builtin/merge-one-file.c
@@ -0,0 +1,85 @@
+/*
+ * Builtin "git merge-one-file"
+ *
+ * Copyright (c) 2020 Alban Gruin
+ *
+ * Based on git-merge-one-file.sh, written by Linus Torvalds.
+ *
+ * This is the git per-file merge utility, called with
+ *
+ *   argv[1] - original file SHA1 (or empty)
+ *   argv[2] - file in branch1 SHA1 (or empty)
+ *   argv[3] - file in branch2 SHA1 (or empty)
+ *   argv[4] - pathname in repository
+ *   argv[5] - original file mode (or empty)
+ *   argv[6] - file in branch1 mode (or empty)
+ *   argv[7] - file in branch2 mode (or empty)
+ *
+ * Handle some trivial cases. The _really_ trivial cases have been
+ * handled already by git read-tree, but that one doesn't do any merges
+ * that might change the tree layout.
+ */
+
+#include "cache.h"
+#include "builtin.h"
+#include "lockfile.h"
+#include "merge-strategies.h"
+
+static const char builtin_merge_one_file_usage[] =
+	"git merge-one-file <orig blob> <our blob> <their blob> <path> "
+	"<orig mode> <our mode> <their mode>\n\n"
+	"Blob ids and modes should be empty for missing files.";
+
+int cmd_merge_one_file(int argc, const char **argv, const char *prefix)
+{
+	struct object_id orig_blob, our_blob, their_blob,
+		*p_orig_blob = NULL, *p_our_blob = NULL, *p_their_blob = NULL;
+	unsigned int orig_mode = 0, our_mode = 0, their_mode = 0, ret = 0;
+	struct lock_file lock = LOCK_INIT;
+
+	if (argc != 8)
+		usage(builtin_merge_one_file_usage);
+
+	if (repo_read_index(the_repository) < 0)
+		die("invalid index");
+
+	repo_hold_locked_index(the_repository, &lock, LOCK_DIE_ON_ERROR);
+
+	if (!get_oid(argv[1], &orig_blob)) {
+		p_orig_blob = &orig_blob;
+		orig_mode = strtol(argv[5], NULL, 8);
+
+		if (!(S_ISREG(orig_mode) || S_ISDIR(orig_mode) || S_ISLNK(orig_mode)))
+			ret |= error(_("invalid 'orig' mode: %o"), orig_mode);
+	}
+
+	if (!get_oid(argv[2], &our_blob)) {
+		p_our_blob = &our_blob;
+		our_mode = strtol(argv[6], NULL, 8);
+
+		if (!(S_ISREG(our_mode) || S_ISDIR(our_mode) || S_ISLNK(our_mode)))
+			ret |= error(_("invalid 'our' mode: %o"), our_mode);
+	}
+
+	if (!get_oid(argv[3], &their_blob)) {
+		p_their_blob = &their_blob;
+		their_mode = strtol(argv[7], NULL, 8);
+
+		if (!(S_ISREG(their_mode) || S_ISDIR(their_mode) || S_ISLNK(their_mode)))
+			ret = error(_("invalid 'their' mode: %o"), their_mode);
+	}
+
+	if (ret)
+		return ret;
+
+	ret = merge_strategies_one_file(the_repository,
+					p_orig_blob, p_our_blob, p_their_blob, argv[4],
+					orig_mode, our_mode, their_mode);
+
+	if (ret) {
+		rollback_lock_file(&lock);
+		return ret;
+	}
+
+	return write_locked_index(the_repository->index, &lock, COMMIT_LOCK);
+}
diff --git a/git-merge-one-file.sh b/git-merge-one-file.sh
deleted file mode 100755
index f6d9852d2f..0000000000
--- a/git-merge-one-file.sh
+++ /dev/null
@@ -1,167 +0,0 @@
-#!/bin/sh
-#
-# Copyright (c) Linus Torvalds, 2005
-#
-# This is the git per-file merge script, called with
-#
-#   $1 - original file SHA1 (or empty)
-#   $2 - file in branch1 SHA1 (or empty)
-#   $3 - file in branch2 SHA1 (or empty)
-#   $4 - pathname in repository
-#   $5 - original file mode (or empty)
-#   $6 - file in branch1 mode (or empty)
-#   $7 - file in branch2 mode (or empty)
-#
-# Handle some trivial cases.. The _really_ trivial cases have
-# been handled already by git read-tree, but that one doesn't
-# do any merges that might change the tree layout.
-
-USAGE='<orig blob> <our blob> <their blob> <path>'
-USAGE="$USAGE <orig mode> <our mode> <their mode>"
-LONG_USAGE="usage: git merge-one-file $USAGE
-
-Blob ids and modes should be empty for missing files."
-
-SUBDIRECTORY_OK=Yes
-. git-sh-setup
-cd_to_toplevel
-require_work_tree
-
-if test $# != 7
-then
-	echo "$LONG_USAGE"
-	exit 1
-fi
-
-case "${1:-.}${2:-.}${3:-.}" in
-#
-# Deleted in both or deleted in one and unchanged in the other
-#
-"$1.." | "$1.$1" | "$1$1.")
-	if { test -z "$6" && test "$5" != "$7"; } ||
-	   { test -z "$7" && test "$5" != "$6"; }
-	then
-		echo "ERROR: File $4 deleted on one branch but had its" >&2
-		echo "ERROR: permissions changed on the other." >&2
-		exit 1
-	fi
-
-	if test -n "$2"
-	then
-		echo "Removing $4"
-	else
-		# read-tree checked that index matches HEAD already,
-		# so we know we do not have this path tracked.
-		# there may be an unrelated working tree file here,
-		# which we should just leave unmolested.  Make sure
-		# we do not have it in the index, though.
-		exec git update-index --remove -- "$4"
-	fi
-	if test -f "$4"
-	then
-		rm -f -- "$4" &&
-		rmdir -p "$(expr "z$4" : 'z\(.*\)/')" 2>/dev/null || :
-	fi &&
-		exec git update-index --remove -- "$4"
-	;;
-
-#
-# Added in one.
-#
-".$2.")
-	# the other side did not add and we added so there is nothing
-	# to be done, except making the path merged.
-	exec git update-index --add --cacheinfo "$6" "$2" "$4"
-	;;
-"..$3")
-	echo "Adding $4"
-	if test -f "$4"
-	then
-		echo "ERROR: untracked $4 is overwritten by the merge." >&2
-		exit 1
-	fi
-	git update-index --add --cacheinfo "$7" "$3" "$4" &&
-		exec git checkout-index -u -f -- "$4"
-	;;
-
-#
-# Added in both, identically (check for same permissions).
-#
-".$3$2")
-	if test "$6" != "$7"
-	then
-		echo "ERROR: File $4 added identically in both branches," >&2
-		echo "ERROR: but permissions conflict $6->$7." >&2
-		exit 1
-	fi
-	echo "Adding $4"
-	git update-index --add --cacheinfo "$6" "$2" "$4" &&
-		exec git checkout-index -u -f -- "$4"
-	;;
-
-#
-# Modified in both, but differently.
-#
-"$1$2$3" | ".$2$3")
-
-	case ",$6,$7," in
-	*,120000,*)
-		echo "ERROR: $4: Not merging symbolic link changes." >&2
-		exit 1
-		;;
-	*,160000,*)
-		echo "ERROR: $4: Not merging conflicting submodule changes." >&2
-		exit 1
-		;;
-	esac
-
-	src1=$(git unpack-file $2)
-	src2=$(git unpack-file $3)
-	case "$1" in
-	'')
-		echo "Added $4 in both, but differently."
-		orig=$(git unpack-file $(git hash-object /dev/null))
-		;;
-	*)
-		echo "Auto-merging $4"
-		orig=$(git unpack-file $1)
-		;;
-	esac
-
-	git merge-file "$src1" "$orig" "$src2"
-	ret=$?
-	msg=
-	if test $ret != 0 || test -z "$1"
-	then
-		msg='content conflict'
-		ret=1
-	fi
-
-	# Create the working tree file, using "our tree" version from the
-	# index, and then store the result of the merge.
-	git checkout-index -f --stage=2 -- "$4" && cat "$src1" >"$4" || exit 1
-	rm -f -- "$orig" "$src1" "$src2"
-
-	if test "$6" != "$7"
-	then
-		if test -n "$msg"
-		then
-			msg="$msg, "
-		fi
-		msg="${msg}permissions conflict: $5->$6,$7"
-		ret=1
-	fi
-
-	if test $ret != 0
-	then
-		echo "ERROR: $msg in $4" >&2
-		exit 1
-	fi
-	exec git update-index -- "$4"
-	;;
-
-*)
-	echo "ERROR: $4: Not handling case $1 -> $2 -> $3" >&2
-	;;
-esac
-exit 1
diff --git a/git.c b/git.c
index 8bd1d7551d..c97fea36c1 100644
--- a/git.c
+++ b/git.c
@@ -534,6 +534,7 @@ static struct cmd_struct commands[] = {
 	{ "merge-file", cmd_merge_file, RUN_SETUP_GENTLY },
 	{ "merge-index", cmd_merge_index, RUN_SETUP | NO_PARSEOPT },
 	{ "merge-ours", cmd_merge_ours, RUN_SETUP | NO_PARSEOPT },
+	{ "merge-one-file", cmd_merge_one_file, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
 	{ "merge-recursive", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
 	{ "merge-recursive-ours", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
 	{ "merge-recursive-theirs", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
diff --git a/merge-strategies.c b/merge-strategies.c
new file mode 100644
index 0000000000..f2af4a894d
--- /dev/null
+++ b/merge-strategies.c
@@ -0,0 +1,199 @@
+#include "cache.h"
+#include "dir.h"
+#include "ll-merge.h"
+#include "merge-strategies.h"
+#include "xdiff-interface.h"
+
+static int add_to_index_cacheinfo(struct index_state *istate,
+				  unsigned int mode,
+				  const struct object_id *oid, const char *path)
+{
+	struct cache_entry *ce;
+	int len, option;
+
+	if (!verify_path(path, mode))
+		return error(_("Invalid path '%s'"), path);
+
+	len = strlen(path);
+	ce = make_empty_cache_entry(istate, len);
+
+	oidcpy(&ce->oid, oid);
+	memcpy(ce->name, path, len);
+	ce->ce_flags = create_ce_flags(0);
+	ce->ce_namelen = len;
+	ce->ce_mode = create_ce_mode(mode);
+	if (assume_unchanged)
+		ce->ce_flags |= CE_VALID;
+	option = ADD_CACHE_OK_TO_ADD | ADD_CACHE_OK_TO_REPLACE;
+	if (add_index_entry(istate, ce, option))
+		return error(_("%s: cannot add to the index"), path);
+
+	return 0;
+}
+
+static int checkout_from_index(struct index_state *istate, const char *path)
+{
+	struct checkout state = CHECKOUT_INIT;
+	struct cache_entry *ce;
+
+	state.istate = istate;
+	state.force = 1;
+	state.base_dir = "";
+	state.base_dir_len = 0;
+
+	ce = index_file_exists(istate, path, strlen(path), 0);
+	if (checkout_entry(ce, &state, NULL, NULL) < 0)
+		return error(_("%s: cannot checkout file"), path);
+	return 0;
+}
+
+static int merge_one_file_deleted(struct index_state *istate,
+				  const struct object_id *orig_blob,
+				  const struct object_id *our_blob,
+				  const struct object_id *their_blob, const char *path,
+				  unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode)
+{
+	if ((our_blob && orig_mode != our_mode) ||
+	    (their_blob && orig_mode != their_mode))
+		return error(_("File %s deleted on one branch but had its "
+			       "permissions changed on the other."), path);
+
+	if (our_blob) {
+		printf(_("Removing %s\n"), path);
+
+		if (file_exists(path))
+			remove_path(path);
+	}
+
+	if (remove_file_from_index(istate, path))
+		return error("%s: cannot remove from the index", path);
+	return 0;
+}
+
+static int do_merge_one_file(struct index_state *istate,
+			     const struct object_id *orig_blob,
+			     const struct object_id *our_blob,
+			     const struct object_id *their_blob, const char *path,
+			     unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode)
+{
+	int ret, i, dest;
+	mmbuffer_t result = {NULL, 0};
+	mmfile_t mmfs[3];
+	struct ll_merge_options merge_opts = {0};
+	struct cache_entry *ce;
+
+	if (our_mode == S_IFLNK || their_mode == S_IFLNK)
+		return error(_("%s: Not merging symbolic link changes."), path);
+	else if (our_mode == S_IFGITLINK || their_mode == S_IFGITLINK)
+		return error(_("%s: Not merging conflicting submodule changes."), path);
+
+	read_mmblob(mmfs + 1, our_blob);
+	read_mmblob(mmfs + 2, their_blob);
+
+	if (orig_blob) {
+		printf(_("Auto-merging %s\n"), path);
+		read_mmblob(mmfs + 0, orig_blob);
+	} else {
+		printf(_("Added %s in both, but differently.\n"), path);
+		read_mmblob(mmfs + 0, &null_oid);
+	}
+
+	merge_opts.xdl_opts = XDL_MERGE_ZEALOUS_ALNUM;
+	ret = ll_merge(&result, path,
+		       mmfs + 0, "orig",
+		       mmfs + 1, "our",
+		       mmfs + 2, "their",
+		       istate, &merge_opts);
+
+	for (i = 0; i < 3; i++)
+		free(mmfs[i].ptr);
+
+	if (ret > 127 || !orig_blob)
+		ret = error(_("content conflict in %s"), path);
+
+	/* Create the working tree file, using "our tree" version from
+	   the index, and then store the result of the merge. */
+	ce = index_file_exists(istate, path, strlen(path), 0);
+	if (!ce)
+		BUG("file is not present in the cache?");
+
+	unlink(path);
+	dest = open(path, O_WRONLY | O_CREAT, ce->ce_mode);
+	write_in_full(dest, result.ptr, result.size);
+	close(dest);
+
+	free(result.ptr);
+
+	if (ret && our_mode != their_mode)
+		return error(_("permission conflict: %o->%o,%o in %s"),
+			     orig_mode, our_mode, their_mode, path);
+	if (ret)
+		return 1;
+
+	return add_file_to_index(istate, path, 0);
+}
+
+int merge_strategies_one_file(struct repository *r,
+			      const struct object_id *orig_blob,
+			      const struct object_id *our_blob,
+			      const struct object_id *their_blob, const char *path,
+			      unsigned int orig_mode, unsigned int our_mode,
+			      unsigned int their_mode)
+{
+	if (orig_blob &&
+	    ((!their_blob && our_blob && oideq(orig_blob, our_blob)) ||
+	     (!our_blob && their_blob && oideq(orig_blob, their_blob))))
+		/* Deleted in both or deleted in one and unchanged in
+		   the other */
+		return merge_one_file_deleted(r->index,
+					      orig_blob, our_blob, their_blob, path,
+					      orig_mode, our_mode, their_mode);
+	else if (!orig_blob && our_blob && !their_blob) {
+		/* Added in one.  The other side did not add and we
+		   added so there is nothing to be done, except making
+		   the path merged. */
+		return add_to_index_cacheinfo(r->index, our_mode, our_blob, path);
+	} else if (!orig_blob && !our_blob && their_blob) {
+		printf(_("Adding %s\n"), path);
+
+		if (file_exists(path))
+			return error(_("untracked %s is overwritten by the merge."), path);
+
+		if (add_to_index_cacheinfo(r->index, their_mode, their_blob, path))
+			return 1;
+		return checkout_from_index(r->index, path);
+	} else if (!orig_blob && our_blob && their_blob &&
+		   oideq(our_blob, their_blob)) {
+		/* Added in both, identically (check for same
+		   permissions). */
+		if (our_mode != their_mode)
+			return error(_("File %s added identically in both branches, "
+				       "but permissions conflict %o->%o."),
+				     path, our_mode, their_mode);
+
+		printf(_("Adding %s\n"), path);
+
+		if (add_to_index_cacheinfo(r->index, our_mode, our_blob, path))
+			return 1;
+		return checkout_from_index(r->index, path);
+	} else if (our_blob && their_blob)
+		/* Modified in both, but differently. */
+		return do_merge_one_file(r->index,
+					 orig_blob, our_blob, their_blob, path,
+					 orig_mode, our_mode, their_mode);
+	else {
+		char *orig_hex = "", *our_hex = "", *their_hex = "";
+
+		if (orig_blob)
+			orig_hex = oid_to_hex(orig_blob);
+		if (our_blob)
+			our_hex = oid_to_hex(our_blob);
+		if (their_blob)
+			their_hex = oid_to_hex(their_blob);
+
+		return error(_("%s: Not handling case %s -> %s -> %s"),
+			     path, orig_hex, our_hex, their_hex);
+	}
+
+	return 0;
+}
diff --git a/merge-strategies.h b/merge-strategies.h
new file mode 100644
index 0000000000..b527d145c7
--- /dev/null
+++ b/merge-strategies.h
@@ -0,0 +1,13 @@
+#ifndef MERGE_STRATEGIES_H
+#define MERGE_STRATEGIES_H
+
+#include "object.h"
+
+int merge_strategies_one_file(struct repository *r,
+			      const struct object_id *orig_blob,
+			      const struct object_id *our_blob,
+			      const struct object_id *their_blob, const char *path,
+			      unsigned int orig_mode, unsigned int our_mode,
+			      unsigned int their_mode);
+
+#endif /* MERGE_STRATEGIES_H */
diff --git a/t/t6415-merge-dir-to-symlink.sh b/t/t6415-merge-dir-to-symlink.sh
index 2eddcc7664..5fb74e39a0 100755
--- a/t/t6415-merge-dir-to-symlink.sh
+++ b/t/t6415-merge-dir-to-symlink.sh
@@ -94,7 +94,7 @@ test_expect_success SYMLINKS 'a/b was resolved as symlink' '
 	test -h a/b
 '
 
-test_expect_failure 'do not lose untracked in merge (resolve)' '
+test_expect_success 'do not lose untracked in merge (resolve)' '
 	git reset --hard &&
 	git checkout baseline^0 &&
 	>a/b/c/e &&
-- 
2.28.0.370.g2c2dc3cc62


^ permalink raw reply related	[flat|nested] 221+ messages in thread

* [PATCH v2 03/11] merge-index: libify merge_one_path() and merge_all()
  2020-09-01 10:56 ` [PATCH v2 00/11] Rewrite the remaining merge strategies from shell to C Alban Gruin
  2020-09-01 10:56   ` [PATCH v2 01/11] t6027: modernise tests Alban Gruin
  2020-09-01 10:56   ` [PATCH v2 02/11] merge-one-file: rewrite in C Alban Gruin
@ 2020-09-01 10:56   ` Alban Gruin
  2020-09-01 21:11     ` Junio C Hamano
  2020-09-01 10:56   ` [PATCH v2 04/11] merge-index: don't fork if the requested program is `git-merge-one-file' Alban Gruin
                     ` (8 subsequent siblings)
  11 siblings, 1 reply; 221+ messages in thread
From: Alban Gruin @ 2020-09-01 10:56 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, phillip.wood, Alban Gruin

The "resolve" and "octopus" merge strategies do not call `git
merge-one-file' directly; they delegate the work to another git command,
`git merge-index', which loops over files in the index and calls the
specified command.  Unfortunately, the functions doing this are not part
of libgit.a, which means that once rewritten, the strategies would still
have to invoke `merge-one-file' by spawning a new process first.

To avoid this, this patch moves merge_one_path(), merge_all(), and
their helpers to merge-strategies.c.  They now also take a callback
that dictates what to do for each file.  For now, only one callback,
which launches a new process, is defined to preserve the behaviour of
the builtin version.
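
The callback-driven loop described above can be sketched as follows.
This is a simplified, self-contained model and not the code from this
patch: `struct toy_entry', `toy_merge_cb', toy_merge_all() and
toy_count_cb() are hypothetical names, and the "index" is just a sorted
array of (path, stage) pairs.  It only illustrates how unmerged entries
are grouped by path and how the `oneshot' flag changes error handling.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for an index entry: a path and a stage (0-3). */
struct toy_entry {
	const char *path;
	int stage;
};

/* Per-path callback: present[n] is set when stage n + 1 exists. */
typedef int (*toy_merge_cb)(const char *path, const int present[3],
			    void *data);

/*
 * Walk a sorted entry array, skip stage-0 (already merged) entries,
 * group consecutive unmerged entries sharing a path, and hand each
 * group to the callback.  Returns 0 on success, 1 on the first failure
 * when oneshot is unset, or the number of failed paths otherwise.
 */
static int toy_merge_all(const struct toy_entry *e, size_t nr, int oneshot,
			 toy_merge_cb cb, void *data)
{
	int err = 0;
	size_t i = 0;

	while (i < nr) {
		int present[3] = { 0, 0, 0 };
		const char *path = e[i].path;

		assert(e[i].stage >= 0 && e[i].stage <= 3);
		if (!e[i].stage) {
			i++;
			continue;
		}
		/* Collect every stage recorded for this path. */
		while (i < nr && e[i].stage && !strcmp(e[i].path, path)) {
			present[e[i].stage - 1] = 1;
			i++;
		}
		if (cb(path, present, data)) {
			if (!oneshot)
				return 1;
			err++;
		}
	}
	return err;
}

/* Example callback: count calls, fail when "theirs" (stage 3) is missing. */
static int toy_count_cb(const char *path, const int present[3], void *data)
{
	int *calls = data;

	(*calls)++;
	printf("%s: base=%d ours=%d theirs=%d\n",
	       path, present[0], present[1], present[2]);
	return !present[2];
}
```

Passing the per-file action as a function pointer plus an opaque
`data' pointer is what lets the same loop either spawn a program or
call a library function later on.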

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 builtin/merge-index.c | 77 +++------------------------------
 merge-strategies.c    | 99 +++++++++++++++++++++++++++++++++++++++++++
 merge-strategies.h    | 17 ++++++++
 3 files changed, 123 insertions(+), 70 deletions(-)

diff --git a/builtin/merge-index.c b/builtin/merge-index.c
index 38ea6ad6ca..6cb666cc78 100644
--- a/builtin/merge-index.c
+++ b/builtin/merge-index.c
@@ -1,74 +1,11 @@
 #define USE_THE_INDEX_COMPATIBILITY_MACROS
 #include "builtin.h"
-#include "run-command.h"
-
-static const char *pgm;
-static int one_shot, quiet;
-static int err;
-
-static int merge_entry(int pos, const char *path)
-{
-	int found;
-	const char *arguments[] = { pgm, "", "", "", path, "", "", "", NULL };
-	char hexbuf[4][GIT_MAX_HEXSZ + 1];
-	char ownbuf[4][60];
-
-	if (pos >= active_nr)
-		die("git merge-index: %s not in the cache", path);
-	found = 0;
-	do {
-		const struct cache_entry *ce = active_cache[pos];
-		int stage = ce_stage(ce);
-
-		if (strcmp(ce->name, path))
-			break;
-		found++;
-		oid_to_hex_r(hexbuf[stage], &ce->oid);
-		xsnprintf(ownbuf[stage], sizeof(ownbuf[stage]), "%o", ce->ce_mode);
-		arguments[stage] = hexbuf[stage];
-		arguments[stage + 4] = ownbuf[stage];
-	} while (++pos < active_nr);
-	if (!found)
-		die("git merge-index: %s not in the cache", path);
-
-	if (run_command_v_opt(arguments, 0)) {
-		if (one_shot)
-			err++;
-		else {
-			if (!quiet)
-				die("merge program failed");
-			exit(1);
-		}
-	}
-	return found;
-}
-
-static void merge_one_path(const char *path)
-{
-	int pos = cache_name_pos(path, strlen(path));
-
-	/*
-	 * If it already exists in the cache as stage0, it's
-	 * already merged and there is nothing to do.
-	 */
-	if (pos < 0)
-		merge_entry(-pos-1, path);
-}
-
-static void merge_all(void)
-{
-	int i;
-	for (i = 0; i < active_nr; i++) {
-		const struct cache_entry *ce = active_cache[i];
-		if (!ce_stage(ce))
-			continue;
-		i += merge_entry(i, ce->name)-1;
-	}
-}
+#include "merge-strategies.h"
 
 int cmd_merge_index(int argc, const char **argv, const char *prefix)
 {
-	int i, force_file = 0;
+	int i, force_file = 0, err = 0, one_shot = 0, quiet = 0;
+	const char *pgm;
 
 	/* Without this we cannot rely on waitpid() to tell
 	 * what happened to our children.
@@ -98,14 +35,14 @@ int cmd_merge_index(int argc, const char **argv, const char *prefix)
 				continue;
 			}
 			if (!strcmp(arg, "-a")) {
-				merge_all();
+				err |= merge_all(&the_index, one_shot, quiet,
+						 merge_program_cb, (void *)pgm);
 				continue;
 			}
 			die("git merge-index: unknown option %s", arg);
 		}
-		merge_one_path(arg);
+		err |= merge_one_path(&the_index, one_shot, quiet, arg,
+				      merge_program_cb, (void *)pgm);
 	}
-	if (err && !quiet)
-		die("merge program failed");
 	return err;
 }
diff --git a/merge-strategies.c b/merge-strategies.c
index f2af4a894d..ffd6cf77d6 100644
--- a/merge-strategies.c
+++ b/merge-strategies.c
@@ -2,6 +2,7 @@
 #include "dir.h"
 #include "ll-merge.h"
 #include "merge-strategies.h"
+#include "run-command.h"
 #include "xdiff-interface.h"
 
 static int add_to_index_cacheinfo(struct index_state *istate,
@@ -197,3 +198,101 @@ int merge_strategies_one_file(struct repository *r,
 
 	return 0;
 }
+
+int merge_program_cb(const struct object_id *orig_blob,
+		     const struct object_id *our_blob,
+		     const struct object_id *their_blob, const char *path,
+		     unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
+		     void *data)
+{
+	char ownbuf[3][GIT_MAX_HEXSZ] = {{0}};
+	const char *arguments[] = { (char *)data, "", "", "", path,
+				    ownbuf[0], ownbuf[1], ownbuf[2],
+				    NULL };
+
+	if (orig_blob)
+		arguments[1] = oid_to_hex(orig_blob);
+	if (our_blob)
+		arguments[2] = oid_to_hex(our_blob);
+	if (their_blob)
+		arguments[3] = oid_to_hex(their_blob);
+
+	xsnprintf(ownbuf[0], sizeof(ownbuf[0]), "%o", orig_mode);
+	xsnprintf(ownbuf[1], sizeof(ownbuf[1]), "%o", our_mode);
+	xsnprintf(ownbuf[2], sizeof(ownbuf[2]), "%o", their_mode);
+
+	return run_command_v_opt(arguments, 0);
+}
+
+static int merge_entry(struct index_state *istate, int quiet, int pos,
+		       const char *path, merge_cb cb, void *data)
+{
+	int found = 0;
+	const struct object_id *oids[3] = {NULL};
+	unsigned int modes[3] = {0};
+
+	do {
+		const struct cache_entry *ce = istate->cache[pos];
+		int stage = ce_stage(ce);
+
+		if (strcmp(ce->name, path))
+			break;
+		found++;
+		oids[stage - 1] = &ce->oid;
+		modes[stage - 1] = ce->ce_mode;
+	} while (++pos < istate->cache_nr);
+	if (!found)
+		return error(_("%s is not in the cache"), path);
+
+	if (cb(oids[0], oids[1], oids[2], path, modes[0], modes[1], modes[2], data)) {
+		if (!quiet)
+			error(_("Merge program failed"));
+		return -2;
+	}
+
+	return found;
+}
+
+int merge_one_path(struct index_state *istate, int oneshot, int quiet,
+		   const char *path, merge_cb cb, void *data)
+{
+	int pos = index_name_pos(istate, path, strlen(path)), ret;
+
+	/*
+	 * If it already exists in the cache as stage0, it's
+	 * already merged and there is nothing to do.
+	 */
+	if (pos < 0) {
+		ret = merge_entry(istate, quiet, -pos - 1, path, cb, data);
+		if (ret == -1)
+			return -1;
+		else if (ret == -2)
+			return 1;
+	}
+	return 0;
+}
+
+int merge_all(struct index_state *istate, int oneshot, int quiet,
+	      merge_cb cb, void *data)
+{
+	int err = 0, i, ret;
+	for (i = 0; i < istate->cache_nr; i++) {
+		const struct cache_entry *ce = istate->cache[i];
+		if (!ce_stage(ce))
+			continue;
+
+		ret = merge_entry(istate, quiet, i, ce->name, cb, data);
+		if (ret > 0)
+			i += ret - 1;
+		else if (ret == -1)
+			return -1;
+		else if (ret == -2) {
+			if (oneshot)
+				err++;
+			else
+				return 1;
+		}
+	}
+
+	return err;
+}
diff --git a/merge-strategies.h b/merge-strategies.h
index b527d145c7..cf78d7eaf4 100644
--- a/merge-strategies.h
+++ b/merge-strategies.h
@@ -10,4 +10,21 @@ int merge_strategies_one_file(struct repository *r,
 			      unsigned int orig_mode, unsigned int our_mode,
 			      unsigned int their_mode);
 
+typedef int (*merge_cb)(const struct object_id *orig_blob,
+			const struct object_id *our_blob,
+			const struct object_id *their_blob, const char *path,
+			unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
+			void *data);
+
+int merge_program_cb(const struct object_id *orig_blob,
+		     const struct object_id *our_blob,
+		     const struct object_id *their_blob, const char *path,
+		     unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
+		     void *data);
+
+int merge_one_path(struct index_state *istate, int oneshot, int quiet,
+		   const char *path, merge_cb cb, void *data);
+int merge_all(struct index_state *istate, int oneshot, int quiet,
+	      merge_cb cb, void *data);
+
 #endif /* MERGE_STRATEGIES_H */
-- 
2.28.0.370.g2c2dc3cc62



* [PATCH v2 04/11] merge-index: don't fork if the requested program is `git-merge-one-file'
  2020-09-01 10:56 ` [PATCH v2 00/11] Rewrite the remaining merge strategies from shell to C Alban Gruin
                     ` (2 preceding siblings ...)
  2020-09-01 10:56   ` [PATCH v2 03/11] merge-index: libify merge_one_path() and merge_all() Alban Gruin
@ 2020-09-01 10:56   ` Alban Gruin
  2020-09-01 10:56   ` [PATCH v2 05/11] merge-resolve: rewrite in C Alban Gruin
                     ` (7 subsequent siblings)
  11 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2020-09-01 10:56 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, phillip.wood, Alban Gruin

Since `git-merge-one-file' has been rewritten and libified, this teaches
`merge-index' to call merge_strategies_one_file() without forking using
a new callback, merge_one_file_cb().
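
As a rough illustration of the dispatch this introduces, the sketch
below picks a per-file action from the program name.  It is a toy
model under stated assumptions: `toy_action', toy_in_process(),
toy_external() and toy_pick_action() are hypothetical names, and both
actions are empty stand-ins rather than real merges.

```c
#include <assert.h>
#include <string.h>

typedef int (*toy_action)(const char *path, void *data);

/* Stand-in for the in-process merge; `data' would be the repository. */
static int toy_in_process(const char *path, void *data)
{
	(void)path;
	(void)data;
	return 0;
}

/* Stand-in for running an external program; `data' is its name. */
static int toy_external(const char *path, void *data)
{
	(void)path;
	(void)data;
	return 0;
}

/*
 * Choose the action for `pgm' and store the matching callback payload
 * in *data: the repository for the in-process path, the program name
 * for the forking path.
 */
static toy_action toy_pick_action(const char *pgm, void *repo, void **data)
{
	if (!strcmp(pgm, "git-merge-one-file")) {
		*data = repo;
		return toy_in_process;
	}
	*data = (void *)pgm;
	return toy_external;
}
```

Any other program name keeps the forking behaviour, so existing users
of `git merge-index' with a custom merge program are unaffected.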

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 builtin/merge-index.c | 29 +++++++++++++++++++++++++++--
 merge-strategies.c    | 11 +++++++++++
 merge-strategies.h    |  6 ++++++
 3 files changed, 44 insertions(+), 2 deletions(-)

diff --git a/builtin/merge-index.c b/builtin/merge-index.c
index 6cb666cc78..19fff9a113 100644
--- a/builtin/merge-index.c
+++ b/builtin/merge-index.c
@@ -1,11 +1,15 @@
 #define USE_THE_INDEX_COMPATIBILITY_MACROS
 #include "builtin.h"
+#include "lockfile.h"
 #include "merge-strategies.h"
 
 int cmd_merge_index(int argc, const char **argv, const char *prefix)
 {
 	int i, force_file = 0, err = 0, one_shot = 0, quiet = 0;
 	const char *pgm;
+	void *data;
+	merge_cb merge_action;
+	struct lock_file lock = LOCK_INIT;
 
 	/* Without this we cannot rely on waitpid() to tell
 	 * what happened to our children.
@@ -26,7 +30,19 @@ int cmd_merge_index(int argc, const char **argv, const char *prefix)
 		quiet = 1;
 		i++;
 	}
+
 	pgm = argv[i++];
+	if (!strcmp(pgm, "git-merge-one-file")) {
+		merge_action = merge_one_file_cb;
+		data = (void *)the_repository;
+
+		setup_work_tree();
+		hold_locked_index(&lock, LOCK_DIE_ON_ERROR);
+	} else {
+		merge_action = merge_program_cb;
+		data = (void *)pgm;
+	}
+
 	for (; i < argc; i++) {
 		const char *arg = argv[i];
 		if (!force_file && *arg == '-') {
@@ -36,13 +52,22 @@ int cmd_merge_index(int argc, const char **argv, const char *prefix)
 			}
 			if (!strcmp(arg, "-a")) {
 				err |= merge_all(&the_index, one_shot, quiet,
-						 merge_program_cb, (void *)pgm);
+						 merge_action, data);
 				continue;
 			}
 			die("git merge-index: unknown option %s", arg);
 		}
 		err |= merge_one_path(&the_index, one_shot, quiet, arg,
-				      merge_program_cb, (void *)pgm);
+				      merge_action, data);
+	}
+
+	if (merge_action == merge_one_file_cb) {
+		if (err) {
+			rollback_lock_file(&lock);
+			return err;
+		}
+
+		return write_locked_index(&the_index, &lock, COMMIT_LOCK);
 	}
 	return err;
 }
diff --git a/merge-strategies.c b/merge-strategies.c
index ffd6cf77d6..00738863e4 100644
--- a/merge-strategies.c
+++ b/merge-strategies.c
@@ -199,6 +199,17 @@ int merge_strategies_one_file(struct repository *r,
 	return 0;
 }
 
+int merge_one_file_cb(const struct object_id *orig_blob,
+		      const struct object_id *our_blob,
+		      const struct object_id *their_blob, const char *path,
+		      unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
+		      void *data)
+{
+	return merge_strategies_one_file((struct repository *)data,
+					 orig_blob, our_blob, their_blob, path,
+					 orig_mode, our_mode, their_mode);
+}
+
 int merge_program_cb(const struct object_id *orig_blob,
 		     const struct object_id *our_blob,
 		     const struct object_id *their_blob, const char *path,
diff --git a/merge-strategies.h b/merge-strategies.h
index cf78d7eaf4..40e175ca39 100644
--- a/merge-strategies.h
+++ b/merge-strategies.h
@@ -16,6 +16,12 @@ typedef int (*merge_cb)(const struct object_id *orig_blob,
 			unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
 			void *data);
 
+int merge_one_file_cb(const struct object_id *orig_blob,
+		      const struct object_id *our_blob,
+		      const struct object_id *their_blob, const char *path,
+		      unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
+		      void *data);
+
 int merge_program_cb(const struct object_id *orig_blob,
 		     const struct object_id *our_blob,
 		     const struct object_id *their_blob, const char *path,
-- 
2.28.0.370.g2c2dc3cc62



* [PATCH v2 05/11] merge-resolve: rewrite in C
  2020-09-01 10:56 ` [PATCH v2 00/11] Rewrite the remaining merge strategies from shell to C Alban Gruin
                     ` (3 preceding siblings ...)
  2020-09-01 10:56   ` [PATCH v2 04/11] merge-index: don't fork if the requested program is `git-merge-one-file' Alban Gruin
@ 2020-09-01 10:56   ` Alban Gruin
  2020-09-01 10:57   ` [PATCH v2 06/11] merge-recursive: move better_branch_name() to merge.c Alban Gruin
                     ` (6 subsequent siblings)
  11 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2020-09-01 10:56 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, phillip.wood, Alban Gruin

This rewrites `git merge-resolve' from shell to C.  As with `git
merge-one-file', this port is not completely straightforward: it
removes calls to external processes to avoid reading and writing the
index over and over again.

 - The call to `update-index -q --refresh' is replaced by a call to
   refresh_index().

 - The call to `read-tree' is replaced by a call to unpack_trees() (and
   all the setup needed).

 - The call to `write-tree' is replaced by a call to
   write_index_as_tree().

 - The call to `merge-index', needed to invoke `git merge-one-file', is
   replaced by a call to the new merge_all() function.  The
   merge_one_file_cb() callback lets it call
   merge_strategies_one_file() without forking.

Here too, the index is read in cmd_merge_resolve(), but
merge_strategies_resolve() takes care of writing it back to the disk.

The parameters of merge_strategies_resolve() may be surprising at first
glance: why use commit lists for `bases' and `remote', when we could
use an oid array and a pointer to an oid?  Because, in a later commit,
try_merge_strategy() will be able to call merge_strategies_resolve()
directly, and it already uses a commit list for `bases' (`common') and
`remote' (`remoteheads'), and a string for `head_arg'.  To reduce
friction later, merge_strategies_resolve() takes the same types of
parameters.
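
For readers unfamiliar with the `<bases>... -- <head> <remote>'
calling convention that the builtin keeps, here is a minimal model of
the argument split.  It is only a sketch: `struct toy_args',
TOY_MAX_BASES and toy_parse_resolve_args() are hypothetical, names are
kept as plain strings instead of being resolved to commits, and the
return value 2 mirrors the "give up" exit code of the shell script.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_BASES 8 /* arbitrary limit chosen for this sketch */

struct toy_args {
	const char *bases[TOY_MAX_BASES];
	size_t nr_bases;
	const char *head;
	const char *remote;
};

/*
 * Everything before "--" is a merge base; after it come the head and
 * one remote.  Returns 0 on success, or 2 when the strategy should
 * give up: no base at all, more than one remote, or too many bases
 * for this toy structure.
 */
static int toy_parse_resolve_args(int argc, const char **argv,
				  struct toy_args *out)
{
	int i, sep_seen = 0;

	memset(out, 0, sizeof(*out));
	for (i = 1; i < argc; i++) {
		if (!strcmp(argv[i], "--")) {
			sep_seen = 1;
		} else if (!sep_seen) {
			if (out->nr_bases == TOY_MAX_BASES)
				return 2;
			out->bases[out->nr_bases++] = argv[i];
		} else if (!out->head) {
			out->head = argv[i];
		} else if (!out->remote) {
			out->remote = argv[i];
		} else {
			return 2; /* two or more remotes: not an octopus */
		}
	}
	return out->nr_bases ? 0 : 2; /* baseless merges are not handled */
}
```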

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 Makefile                |  2 +-
 builtin.h               |  1 +
 builtin/merge-resolve.c | 69 +++++++++++++++++++++++++++++++++
 git-merge-resolve.sh    | 54 --------------------------
 git.c                   |  1 +
 merge-strategies.c      | 85 +++++++++++++++++++++++++++++++++++++++++
 merge-strategies.h      |  5 +++
 7 files changed, 162 insertions(+), 55 deletions(-)
 create mode 100644 builtin/merge-resolve.c
 delete mode 100755 git-merge-resolve.sh

diff --git a/Makefile b/Makefile
index 8849d54063..929c3dc3eb 100644
--- a/Makefile
+++ b/Makefile
@@ -596,7 +596,6 @@ SCRIPT_SH += git-bisect.sh
 SCRIPT_SH += git-difftool--helper.sh
 SCRIPT_SH += git-filter-branch.sh
 SCRIPT_SH += git-merge-octopus.sh
-SCRIPT_SH += git-merge-resolve.sh
 SCRIPT_SH += git-mergetool.sh
 SCRIPT_SH += git-quiltimport.sh
 SCRIPT_SH += git-request-pull.sh
@@ -1092,6 +1091,7 @@ BUILTIN_OBJS += builtin/merge-index.o
 BUILTIN_OBJS += builtin/merge-one-file.o
 BUILTIN_OBJS += builtin/merge-ours.o
 BUILTIN_OBJS += builtin/merge-recursive.o
+BUILTIN_OBJS += builtin/merge-resolve.o
 BUILTIN_OBJS += builtin/merge-tree.o
 BUILTIN_OBJS += builtin/merge.o
 BUILTIN_OBJS += builtin/mktag.o
diff --git a/builtin.h b/builtin.h
index 9205d5ecdc..6ea207c9fd 100644
--- a/builtin.h
+++ b/builtin.h
@@ -174,6 +174,7 @@ int cmd_merge_ours(int argc, const char **argv, const char *prefix);
 int cmd_merge_file(int argc, const char **argv, const char *prefix);
 int cmd_merge_one_file(int argc, const char **argv, const char *prefix);
 int cmd_merge_recursive(int argc, const char **argv, const char *prefix);
+int cmd_merge_resolve(int argc, const char **argv, const char *prefix);
 int cmd_merge_tree(int argc, const char **argv, const char *prefix);
 int cmd_mktag(int argc, const char **argv, const char *prefix);
 int cmd_mktree(int argc, const char **argv, const char *prefix);
diff --git a/builtin/merge-resolve.c b/builtin/merge-resolve.c
new file mode 100644
index 0000000000..59f734473b
--- /dev/null
+++ b/builtin/merge-resolve.c
@@ -0,0 +1,69 @@
+/*
+ * Builtin "git merge-resolve"
+ *
+ * Copyright (c) 2020 Alban Gruin
+ *
+ * Based on git-merge-resolve.sh, written by Linus Torvalds and Junio C
+ * Hamano.
+ *
+ * Resolve two trees, using enhanced multi-base read-tree.
+ */
+
+#include "cache.h"
+#include "builtin.h"
+#include "merge-strategies.h"
+
+static const char builtin_merge_resolve_usage[] =
+	"git merge-resolve <bases>... -- <head> <remote>";
+
+int cmd_merge_resolve(int argc, const char **argv, const char *prefix)
+{
+	int i, is_baseless = 1, sep_seen = 0;
+	const char *head = NULL;
+	struct commit_list *bases = NULL, *remote = NULL;
+	struct commit_list **next_base = &bases;
+
+	if (argc < 5)
+		usage(builtin_merge_resolve_usage);
+
+	setup_work_tree();
+	if (repo_read_index(the_repository) < 0)
+		die("invalid index");
+
+	/* The first parameters up to -- are merge bases; the rest are
+	 * heads. */
+	for (i = 1; i < argc; i++) {
+		if (strcmp(argv[i], "--") == 0)
+			sep_seen = 1;
+		else if (strcmp(argv[i], "-h") == 0)
+			usage(builtin_merge_resolve_usage);
+		else if (sep_seen && !head)
+			head = argv[i];
+		else if (remote) {
+			/* Give up if we are given two or more remotes.
+			 * Not handling octopus. */
+			return 2;
+		} else {
+			struct object_id oid;
+
+			get_oid(argv[i], &oid);
+			is_baseless &= sep_seen;
+
+			if (!oideq(&oid, the_hash_algo->empty_tree)) {
+				struct commit *commit;
+				commit = lookup_commit_or_die(&oid, argv[i]);
+
+				if (sep_seen)
+					commit_list_append(commit, &remote);
+				else
+					next_base = commit_list_append(commit, next_base);
+			}
+		}
+	}
+
+	/* Give up if this is a baseless merge. */
+	if (is_baseless)
+		return 2;
+
+	return merge_strategies_resolve(the_repository, bases, head, remote);
+}
diff --git a/git-merge-resolve.sh b/git-merge-resolve.sh
deleted file mode 100755
index 343fe7bccd..0000000000
--- a/git-merge-resolve.sh
+++ /dev/null
@@ -1,54 +0,0 @@
-#!/bin/sh
-#
-# Copyright (c) 2005 Linus Torvalds
-# Copyright (c) 2005 Junio C Hamano
-#
-# Resolve two trees, using enhanced multi-base read-tree.
-
-# The first parameters up to -- are merge bases; the rest are heads.
-bases= head= remotes= sep_seen=
-for arg
-do
-	case ",$sep_seen,$head,$arg," in
-	*,--,)
-		sep_seen=yes
-		;;
-	,yes,,*)
-		head=$arg
-		;;
-	,yes,*)
-		remotes="$remotes$arg "
-		;;
-	*)
-		bases="$bases$arg "
-		;;
-	esac
-done
-
-# Give up if we are given two or more remotes -- not handling octopus.
-case "$remotes" in
-?*' '?*)
-	exit 2 ;;
-esac
-
-# Give up if this is a baseless merge.
-if test '' = "$bases"
-then
-	exit 2
-fi
-
-git update-index -q --refresh
-git read-tree -u -m --aggressive $bases $head $remotes || exit 2
-echo "Trying simple merge."
-if result_tree=$(git write-tree 2>/dev/null)
-then
-	exit 0
-else
-	echo "Simple merge failed, trying Automatic merge."
-	if git merge-index -o git-merge-one-file -a
-	then
-		exit 0
-	else
-		exit 1
-	fi
-fi
diff --git a/git.c b/git.c
index c97fea36c1..794ca6e9f0 100644
--- a/git.c
+++ b/git.c
@@ -538,6 +538,7 @@ static struct cmd_struct commands[] = {
 	{ "merge-recursive", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
 	{ "merge-recursive-ours", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
 	{ "merge-recursive-theirs", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
+	{ "merge-resolve", cmd_merge_resolve, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
 	{ "merge-subtree", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
 	{ "merge-tree", cmd_merge_tree, RUN_SETUP | NO_PARSEOPT },
 	{ "mktag", cmd_mktag, RUN_SETUP | NO_PARSEOPT },
diff --git a/merge-strategies.c b/merge-strategies.c
index 00738863e4..6b905dfc38 100644
--- a/merge-strategies.c
+++ b/merge-strategies.c
@@ -1,8 +1,11 @@
 #include "cache.h"
+#include "cache-tree.h"
 #include "dir.h"
 #include "ll-merge.h"
+#include "lockfile.h"
 #include "merge-strategies.h"
 #include "run-command.h"
+#include "unpack-trees.h"
 #include "xdiff-interface.h"
 
 static int add_to_index_cacheinfo(struct index_state *istate,
@@ -307,3 +310,85 @@ int merge_all(struct index_state *istate, int oneshot, int quiet,
 
 	return err;
 }
+
+static int add_tree(const struct object_id *oid, struct tree_desc *t)
+{
+	struct tree *tree;
+
+	tree = parse_tree_indirect(oid);
+	if (parse_tree(tree))
+		return -1;
+
+	init_tree_desc(t, tree->buffer, tree->size);
+	return 0;
+}
+
+int merge_strategies_resolve(struct repository *r,
+			     struct commit_list *bases, const char *head_arg,
+			     struct commit_list *remote)
+{
+	int i = 0;
+	struct lock_file lock = LOCK_INIT;
+	struct tree_desc t[MAX_UNPACK_TREES];
+	struct unpack_trees_options opts;
+	struct object_id head, oid;
+	struct commit_list *j;
+
+	if (head_arg)
+		get_oid(head_arg, &head);
+
+	repo_hold_locked_index(r, &lock, LOCK_DIE_ON_ERROR);
+	refresh_index(r->index, 0, NULL, NULL, NULL);
+
+	memset(&opts, 0, sizeof(opts));
+	opts.head_idx = 1;
+	opts.src_index = r->index;
+	opts.dst_index = r->index;
+	opts.update = 1;
+	opts.merge = 1;
+	opts.aggressive = 1;
+
+	for (j = bases; j && j->item; j = j->next) {
+		if (add_tree(&j->item->object.oid, t + (i++)))
+			goto out;
+	}
+
+	if (head_arg && add_tree(&head, t + (i++)))
+		goto out;
+	if (remote && add_tree(&remote->item->object.oid, t + (i++)))
+		goto out;
+
+	if (i == 1)
+		opts.fn = oneway_merge;
+	else if (i == 2) {
+		opts.fn = twoway_merge;
+		opts.initial_checkout = is_index_unborn(r->index);
+	} else if (i >= 3) {
+		opts.fn = threeway_merge;
+		opts.head_idx = i - 1;
+	}
+
+	if (unpack_trees(i, t, &opts))
+		goto out;
+
+	puts(_("Trying simple merge."));
+	write_locked_index(r->index, &lock, COMMIT_LOCK);
+
+	if (write_index_as_tree(&oid, r->index, r->index_file,
+				WRITE_TREE_SILENT, NULL)) {
+		int ret;
+
+		puts(_("Simple merge failed, trying Automatic merge."));
+		repo_hold_locked_index(r, &lock, LOCK_DIE_ON_ERROR);
+		ret = merge_all(r->index, 0, 0, merge_one_file_cb, r);
+
+		write_locked_index(r->index, &lock, COMMIT_LOCK);
+		return !!ret;
+	}
+
+	return 0;
+
+ out:
+	rollback_lock_file(&lock);
+	return 2;
+}
diff --git a/merge-strategies.h b/merge-strategies.h
index 40e175ca39..778f8ce9d6 100644
--- a/merge-strategies.h
+++ b/merge-strategies.h
@@ -1,6 +1,7 @@
 #ifndef MERGE_STRATEGIES_H
 #define MERGE_STRATEGIES_H
 
+#include "commit.h"
 #include "object.h"
 
 int merge_strategies_one_file(struct repository *r,
@@ -33,4 +34,8 @@ int merge_one_path(struct index_state *istate, int oneshot, int quiet,
 int merge_all(struct index_state *istate, int oneshot, int quiet,
 	      merge_cb cb, void *data);
 
+int merge_strategies_resolve(struct repository *r,
+			     struct commit_list *bases, const char *head_arg,
+			     struct commit_list *remote);
+
 #endif /* MERGE_STRATEGIES_H */
-- 
2.28.0.370.g2c2dc3cc62



* [PATCH v2 06/11] merge-recursive: move better_branch_name() to merge.c
  2020-09-01 10:56 ` [PATCH v2 00/11] Rewrite the remaining merge strategies from shell to C Alban Gruin
                     ` (4 preceding siblings ...)
  2020-09-01 10:56   ` [PATCH v2 05/11] merge-resolve: rewrite in C Alban Gruin
@ 2020-09-01 10:57   ` Alban Gruin
  2020-09-01 10:57   ` [PATCH v2 07/11] merge-octopus: rewrite in C Alban Gruin
                     ` (5 subsequent siblings)
  11 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2020-09-01 10:57 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, phillip.wood, Alban Gruin

better_branch_name() will be used by merge-octopus once it is
rewritten in C, so instead of duplicating it, this moves the function
preemptively into an appropriate file in libgit.a.  The function is
also renamed to merge_get_better_branch_name() to reflect its usage by
merge strategies.
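
The lookup performed by this helper can be illustrated outside of git
with the standalone sketch below.  It is an approximation, not the
moved function: toy_better_branch_name() and TOY_MAX_HEXSZ are
hypothetical, the hash length is passed as a parameter instead of
coming from the_hash_algo, and plain strdup() replaces xstrdup().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define TOY_MAX_HEXSZ 64 /* assumption: room for a SHA-256 hex name */

/*
 * If `branch' looks like a full hex object name (exactly `hexsz'
 * characters), return a copy of the value of the GITHEAD_<branch>
 * environment variable when it is set.  Otherwise return a copy of
 * `branch' itself.  The caller frees the result.
 */
static char *toy_better_branch_name(const char *branch, size_t hexsz)
{
	char env[8 + TOY_MAX_HEXSZ + 1];
	const char *name;

	if (hexsz > TOY_MAX_HEXSZ || strlen(branch) != hexsz)
		return strdup(branch);
	snprintf(env, sizeof(env), "GITHEAD_%s", branch);
	name = getenv(env);
	return strdup(name ? name : branch);
}
```

A caller would typically export GITHEAD_<hex> before invoking the
strategy so that messages show a readable name instead of a raw hash.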

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 builtin/merge-recursive.c | 16 ++--------------
 cache.h                   |  2 +-
 merge.c                   | 12 ++++++++++++
 3 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/builtin/merge-recursive.c b/builtin/merge-recursive.c
index a4bfd8fc51..972243b5e9 100644
--- a/builtin/merge-recursive.c
+++ b/builtin/merge-recursive.c
@@ -8,18 +8,6 @@
 static const char builtin_merge_recursive_usage[] =
 	"git %s <base>... -- <head> <remote> ...";
 
-static char *better_branch_name(const char *branch)
-{
-	static char githead_env[8 + GIT_MAX_HEXSZ + 1];
-	char *name;
-
-	if (strlen(branch) != the_hash_algo->hexsz)
-		return xstrdup(branch);
-	xsnprintf(githead_env, sizeof(githead_env), "GITHEAD_%s", branch);
-	name = getenv(githead_env);
-	return xstrdup(name ? name : branch);
-}
-
 int cmd_merge_recursive(int argc, const char **argv, const char *prefix)
 {
 	const struct object_id *bases[21];
@@ -75,8 +63,8 @@ int cmd_merge_recursive(int argc, const char **argv, const char *prefix)
 	if (get_oid(o.branch2, &h2))
 		die(_("could not resolve ref '%s'"), o.branch2);
 
-	o.branch1 = better1 = better_branch_name(o.branch1);
-	o.branch2 = better2 = better_branch_name(o.branch2);
+	o.branch1 = better1 = merge_get_better_branch_name(o.branch1);
+	o.branch2 = better2 = merge_get_better_branch_name(o.branch2);
 
 	if (o.verbosity >= 3)
 		printf(_("Merging %s with %s\n"), o.branch1, o.branch2);
diff --git a/cache.h b/cache.h
index 4cad61ffa4..a926b0bc87 100644
--- a/cache.h
+++ b/cache.h
@@ -1917,7 +1917,7 @@ int checkout_fast_forward(struct repository *r,
 			  const struct object_id *from,
 			  const struct object_id *to,
 			  int overwrite_ignore);
-
+char *merge_get_better_branch_name(const char *branch);
 
 int sane_execvp(const char *file, char *const argv[]);
 
diff --git a/merge.c b/merge.c
index 5fb88af102..801d673c5f 100644
--- a/merge.c
+++ b/merge.c
@@ -109,3 +109,15 @@ int checkout_fast_forward(struct repository *r,
 		return error(_("unable to write new index file"));
 	return 0;
 }
+
+char *merge_get_better_branch_name(const char *branch)
+{
+	static char githead_env[8 + GIT_MAX_HEXSZ + 1];
+	char *name;
+
+	if (strlen(branch) != the_hash_algo->hexsz)
+		return xstrdup(branch);
+	xsnprintf(githead_env, sizeof(githead_env), "GITHEAD_%s", branch);
+	name = getenv(githead_env);
+	return xstrdup(name ? name : branch);
+}
-- 
2.28.0.370.g2c2dc3cc62



* [PATCH v2 07/11] merge-octopus: rewrite in C
  2020-09-01 10:56 ` [PATCH v2 00/11] Rewrite the remaining merge strategies from shell to C Alban Gruin
                     ` (5 preceding siblings ...)
  2020-09-01 10:57   ` [PATCH v2 06/11] merge-recursive: move better_branch_name() to merge.c Alban Gruin
@ 2020-09-01 10:57   ` Alban Gruin
  2020-09-01 10:57   ` [PATCH v2 08/11] merge: use the "resolve" strategy without forking Alban Gruin
                     ` (4 subsequent siblings)
  11 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2020-09-01 10:57 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, phillip.wood, Alban Gruin

This rewrites `git merge-octopus' from shell to C.  As with the last
two conversions, this port removes calls to external processes to avoid
reading and writing the index over and over again.

 - Calls to `read-tree -u -m (--aggressive)?' are replaced by calls to
   unpack_trees().

 - The call to `write-tree' is replaced by a call to
   write_index_as_tree().

 - The call to `diff-index ...' is replaced by a call to
   repo_index_has_changes(), and is moved from cmd_merge_octopus() to
   merge_strategies_octopus().

 - The call to `merge-index', needed to invoke `git merge-one-file', is
   replaced by a call to merge_all().

The index is read in cmd_merge_octopus(), and is written back by
merge_strategies_octopus().

Here too, merge_strategies_octopus() takes two commit lists and a
string to reduce friction when try_merge_strategy() is modified to call
it directly.

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 Makefile                |   2 +-
 builtin.h               |   1 +
 builtin/merge-octopus.c |  65 +++++++++++++
 git-merge-octopus.sh    | 112 ----------------------
 git.c                   |   1 +
 merge-strategies.c      | 200 ++++++++++++++++++++++++++++++++++++++++
 merge-strategies.h      |   3 +
 7 files changed, 271 insertions(+), 113 deletions(-)
 create mode 100644 builtin/merge-octopus.c
 delete mode 100755 git-merge-octopus.sh

diff --git a/Makefile b/Makefile
index 929c3dc3eb..2fb26d9692 100644
--- a/Makefile
+++ b/Makefile
@@ -595,7 +595,6 @@ unexport CDPATH
 SCRIPT_SH += git-bisect.sh
 SCRIPT_SH += git-difftool--helper.sh
 SCRIPT_SH += git-filter-branch.sh
-SCRIPT_SH += git-merge-octopus.sh
 SCRIPT_SH += git-mergetool.sh
 SCRIPT_SH += git-quiltimport.sh
 SCRIPT_SH += git-request-pull.sh
@@ -1088,6 +1087,7 @@ BUILTIN_OBJS += builtin/mailsplit.o
 BUILTIN_OBJS += builtin/merge-base.o
 BUILTIN_OBJS += builtin/merge-file.o
 BUILTIN_OBJS += builtin/merge-index.o
+BUILTIN_OBJS += builtin/merge-octopus.o
 BUILTIN_OBJS += builtin/merge-one-file.o
 BUILTIN_OBJS += builtin/merge-ours.o
 BUILTIN_OBJS += builtin/merge-recursive.o
diff --git a/builtin.h b/builtin.h
index 6ea207c9fd..5a587ab70c 100644
--- a/builtin.h
+++ b/builtin.h
@@ -170,6 +170,7 @@ int cmd_mailsplit(int argc, const char **argv, const char *prefix);
 int cmd_merge(int argc, const char **argv, const char *prefix);
 int cmd_merge_base(int argc, const char **argv, const char *prefix);
 int cmd_merge_index(int argc, const char **argv, const char *prefix);
+int cmd_merge_octopus(int argc, const char **argv, const char *prefix);
 int cmd_merge_ours(int argc, const char **argv, const char *prefix);
 int cmd_merge_file(int argc, const char **argv, const char *prefix);
 int cmd_merge_one_file(int argc, const char **argv, const char *prefix);
diff --git a/builtin/merge-octopus.c b/builtin/merge-octopus.c
new file mode 100644
index 0000000000..37bbdf11cc
--- /dev/null
+++ b/builtin/merge-octopus.c
@@ -0,0 +1,65 @@
+/*
+ * Builtin "git merge-octopus"
+ *
+ * Copyright (c) 2020 Alban Gruin
+ *
+ * Based on git-merge-octopus.sh, written by Junio C Hamano.
+ *
+ * Resolve two or more trees.
+ */
+
+#include "cache.h"
+#include "builtin.h"
+#include "commit.h"
+#include "merge-strategies.h"
+
+static const char builtin_merge_octopus_usage[] =
+	"git merge-octopus [<bases>...] -- <head> <remote1> <remote2> [<remotes>...]";
+
+int cmd_merge_octopus(int argc, const char **argv, const char *prefix)
+{
+	int i, sep_seen = 0;
+	struct commit_list *bases = NULL, *remotes = NULL;
+	struct commit_list **next_base = &bases, **next_remote = &remotes;
+	const char *head_arg = NULL;
+
+	if (argc < 5)
+		usage(builtin_merge_octopus_usage);
+
+	setup_work_tree();
+	if (repo_read_index(the_repository) < 0)
+		die("corrupted cache");
+
+	/* The first parameters up to -- are merge bases; the rest are
+	 * heads. */
+	for (i = 1; i < argc; i++) {
+		if (strcmp(argv[i], "--") == 0)
+			sep_seen = 1;
+		else if (strcmp(argv[i], "-h") == 0)
+			usage(builtin_merge_octopus_usage);
+		else if (sep_seen && !head_arg)
+			head_arg = argv[i];
+		else {
+			struct object_id oid;
+
+			get_oid(argv[i], &oid);
+
+			if (!oideq(&oid, the_hash_algo->empty_tree)) {
+				struct commit *commit;
+				commit = lookup_commit_or_die(&oid, argv[i]);
+
+				if (sep_seen)
+					next_remote = commit_list_append(commit, next_remote);
+				else
+					next_base = commit_list_append(commit, next_base);
+			}
+		}
+	}
+
+	/* Reject if this is not an octopus -- resolve should be used
+	 * instead. */
+	if (commit_list_count(remotes) < 2)
+		return 2;
+
+	return merge_strategies_octopus(the_repository, bases, head_arg, remotes);
+}
diff --git a/git-merge-octopus.sh b/git-merge-octopus.sh
deleted file mode 100755
index 7d19d37951..0000000000
--- a/git-merge-octopus.sh
+++ /dev/null
@@ -1,112 +0,0 @@
-#!/bin/sh
-#
-# Copyright (c) 2005 Junio C Hamano
-#
-# Resolve two or more trees.
-#
-
-. git-sh-setup
-
-LF='
-'
-
-# The first parameters up to -- are merge bases; the rest are heads.
-bases= head= remotes= sep_seen=
-for arg
-do
-	case ",$sep_seen,$head,$arg," in
-	*,--,)
-		sep_seen=yes
-		;;
-	,yes,,*)
-		head=$arg
-		;;
-	,yes,*)
-		remotes="$remotes$arg "
-		;;
-	*)
-		bases="$bases$arg "
-		;;
-	esac
-done
-
-# Reject if this is not an octopus -- resolve should be used instead.
-case "$remotes" in
-?*' '?*)
-	;;
-*)
-	exit 2 ;;
-esac
-
-# MRC is the current "merge reference commit"
-# MRT is the current "merge result tree"
-
-if ! git diff-index --quiet --cached HEAD --
-then
-    gettextln "Error: Your local changes to the following files would be overwritten by merge"
-    git diff-index --cached --name-only HEAD -- | sed -e 's/^/    /'
-    exit 2
-fi
-MRC=$(git rev-parse --verify -q $head)
-MRT=$(git write-tree)
-NON_FF_MERGE=0
-OCTOPUS_FAILURE=0
-for SHA1 in $remotes
-do
-	case "$OCTOPUS_FAILURE" in
-	1)
-		# We allow only last one to have a hand-resolvable
-		# conflicts.  Last round failed and we still had
-		# a head to merge.
-		gettextln "Automated merge did not work."
-		gettextln "Should not be doing an octopus."
-		exit 2
-	esac
-
-	eval pretty_name=\${GITHEAD_$SHA1:-$SHA1}
-	if test "$SHA1" = "$pretty_name"
-	then
-		SHA1_UP="$(echo "$SHA1" | tr a-z A-Z)"
-		eval pretty_name=\${GITHEAD_$SHA1_UP:-$pretty_name}
-	fi
-	common=$(git merge-base --all $SHA1 $MRC) ||
-		die "$(eval_gettext "Unable to find common commit with \$pretty_name")"
-
-	case "$LF$common$LF" in
-	*"$LF$SHA1$LF"*)
-		eval_gettextln "Already up to date with \$pretty_name"
-		continue
-		;;
-	esac
-
-	if test "$common,$NON_FF_MERGE" = "$MRC,0"
-	then
-		# The first head being merged was a fast-forward.
-		# Advance MRC to the head being merged, and use that
-		# tree as the intermediate result of the merge.
-		# We still need to count this as part of the parent set.
-
-		eval_gettextln "Fast-forwarding to: \$pretty_name"
-		git read-tree -u -m $head $SHA1 || exit
-		MRC=$SHA1 MRT=$(git write-tree)
-		continue
-	fi
-
-	NON_FF_MERGE=1
-
-	eval_gettextln "Trying simple merge with \$pretty_name"
-	git read-tree -u -m --aggressive  $common $MRT $SHA1 || exit 2
-	next=$(git write-tree 2>/dev/null)
-	if test $? -ne 0
-	then
-		gettextln "Simple merge did not work, trying automatic merge."
-		git merge-index -o git-merge-one-file -a ||
-		OCTOPUS_FAILURE=1
-		next=$(git write-tree 2>/dev/null)
-	fi
-
-	MRC="$MRC $SHA1"
-	MRT=$next
-done
-
-exit "$OCTOPUS_FAILURE"
diff --git a/git.c b/git.c
index 794ca6e9f0..df0bebdafc 100644
--- a/git.c
+++ b/git.c
@@ -533,6 +533,7 @@ static struct cmd_struct commands[] = {
 	{ "merge-base", cmd_merge_base, RUN_SETUP },
 	{ "merge-file", cmd_merge_file, RUN_SETUP_GENTLY },
 	{ "merge-index", cmd_merge_index, RUN_SETUP | NO_PARSEOPT },
+	{ "merge-octopus", cmd_merge_octopus, RUN_SETUP | NO_PARSEOPT },
 	{ "merge-ours", cmd_merge_ours, RUN_SETUP | NO_PARSEOPT },
 	{ "merge-one-file", cmd_merge_one_file, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
 	{ "merge-recursive", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
diff --git a/merge-strategies.c b/merge-strategies.c
index 6b905dfc38..dee86389e3 100644
--- a/merge-strategies.c
+++ b/merge-strategies.c
@@ -1,5 +1,6 @@
 #include "cache.h"
 #include "cache-tree.h"
+#include "commit-reach.h"
 #include "dir.h"
 #include "ll-merge.h"
 #include "lockfile.h"
@@ -392,3 +393,202 @@ int merge_strategies_resolve(struct repository *r,
 	rollback_lock_file(&lock);
 	return 2;
 }
+
+static int fast_forward(struct repository *r, const struct object_id *oids,
+			int nr, int aggressive)
+{
+	int i;
+	struct tree_desc t[MAX_UNPACK_TREES];
+	struct unpack_trees_options opts;
+	struct lock_file lock = LOCK_INIT;
+
+	repo_read_index_preload(r, NULL, 0);
+	if (refresh_index(r->index, REFRESH_QUIET, NULL, NULL, NULL))
+		return -1;
+
+	repo_hold_locked_index(r, &lock, LOCK_DIE_ON_ERROR);
+
+	memset(&opts, 0, sizeof(opts));
+	opts.head_idx = 1;
+	opts.src_index = r->index;
+	opts.dst_index = r->index;
+	opts.merge = 1;
+	opts.update = 1;
+	opts.aggressive = aggressive;
+
+	for (i = 0; i < nr; i++) {
+		struct tree *tree;
+		tree = parse_tree_indirect(oids + i);
+		if (parse_tree(tree))
+			return -1;
+		init_tree_desc(t + i, tree->buffer, tree->size);
+	}
+
+	if (nr == 1)
+		opts.fn = oneway_merge;
+	else if (nr == 2) {
+		opts.fn = twoway_merge;
+		opts.initial_checkout = is_index_unborn(r->index);
+	} else if (nr >= 3) {
+		opts.fn = threeway_merge;
+		opts.head_idx = nr - 1;
+	}
+
+	if (unpack_trees(nr, t, &opts))
+		return -1;
+
+	if (write_locked_index(r->index, &lock, COMMIT_LOCK))
+		return error(_("unable to write new index file"));
+
+	return 0;
+}
+
+static int write_tree(struct repository *r, struct tree **reference_tree)
+{
+	struct object_id oid;
+	int ret;
+
+	ret = write_index_as_tree(&oid, r->index, r->index_file, 0, NULL);
+	if (!ret)
+		*reference_tree = lookup_tree(r, &oid);
+
+	return ret;
+}
+
+int merge_strategies_octopus(struct repository *r,
+			     struct commit_list *bases, const char *head_arg,
+			     struct commit_list *remotes)
+{
+	int non_ff_merge = 0, ret = 0, references = 1;
+	struct commit **reference_commit;
+	struct tree *reference_tree;
+	struct commit_list *j;
+	struct object_id head;
+	struct strbuf sb = STRBUF_INIT;
+
+	get_oid(head_arg, &head);
+
+	reference_commit = xcalloc(commit_list_count(remotes) + 1, sizeof(struct commit *));
+	reference_commit[0] = lookup_commit_reference(r, &head);
+	reference_tree = repo_get_commit_tree(r, reference_commit[0]);
+
+	if (repo_index_has_changes(r, reference_tree, &sb)) {
+		error(_("Your local changes to the following files "
+			"would be overwritten by merge:\n  %s"),
+		      sb.buf);
+		strbuf_release(&sb);
+		ret = 2;
+		goto out;
+	}
+
+	for (j = remotes; j && j->item; j = j->next) {
+		struct commit *c = j->item;
+		struct object_id *oid = &c->object.oid;
+		struct commit_list *common, *k;
+		char *branch_name;
+		int can_ff = 1;
+
+		if (ret) {
+			/* We allow only last one to have a
+			   hand-resolvable conflicts.  Last round failed
+			   and we still had a head to merge. */
+			puts(_("Automated merge did not work."));
+			puts(_("Should not be doing an octopus."));
+
+			ret = 2;
+			goto out;
+		}
+
+		branch_name = merge_get_better_branch_name(oid_to_hex(oid));
+		common = get_merge_bases_many(c, references, reference_commit);
+
+		if (!common)
+			die(_("Unable to find common commit with %s"), branch_name);
+
+		for (k = common; k && !oideq(&k->item->object.oid, oid); k = k->next);
+
+		if (k) {
+			printf(_("Already up to date with %s\n"), branch_name);
+			free(branch_name);
+			free_commit_list(common);
+			continue;
+		}
+
+		if (!non_ff_merge) {
+			int i;
+
+			for (i = 0, k = common; k && i < references && can_ff; k = k->next, i++) {
+				can_ff = oideq(&k->item->object.oid,
+					       &reference_commit[i]->object.oid);
+			}
+		}
+
+		if (!non_ff_merge && can_ff) {
+			/* The first head being merged was a
+			   fast-forward.  Advance the reference commit
+			   to the head being merged, and use that tree
+			   as the intermediate result of the merge.  We
+			   still need to count this as part of the
+			   parent set. */
+			struct object_id oids[2];
+			printf(_("Fast-forwarding to: %s\n"), branch_name);
+
+			oidcpy(oids, &head);
+			oidcpy(oids + 1, oid);
+
+			ret = fast_forward(r, oids, 2, 0);
+			if (ret) {
+				free(branch_name);
+				free_commit_list(common);
+				goto out;
+			}
+
+			references = 0;
+			write_tree(r, &reference_tree);
+		} else {
+			int i = 0;
+			struct tree *next = NULL;
+			struct object_id oids[MAX_UNPACK_TREES];
+
+			non_ff_merge = 1;
+			printf(_("Trying simple merge with %s\n"), branch_name);
+
+			for (k = common; k; k = k->next)
+				oidcpy(oids + (i++), &k->item->object.oid);
+
+			oidcpy(oids + (i++), &reference_tree->object.oid);
+			oidcpy(oids + (i++), oid);
+
+			if (fast_forward(r, oids, i, 1)) {
+				ret = 2;
+
+				free(branch_name);
+				free_commit_list(common);
+
+				goto out;
+			}
+
+			if (write_tree(r, &next)) {
+				struct lock_file lock = LOCK_INIT;
+
+				puts(_("Simple merge did not work, trying automatic merge."));
+				repo_hold_locked_index(r, &lock, LOCK_DIE_ON_ERROR);
+				ret = !!merge_all(r->index, 0, 0, merge_one_file_cb, r);
+				write_locked_index(r->index, &lock, COMMIT_LOCK);
+
+				write_tree(r, &next);
+			}
+
+			reference_tree = next;
+		}
+
+		reference_commit[references++] = c;
+
+		free(branch_name);
+		free_commit_list(common);
+	}
+
+out:
+	free(reference_commit);
+	return ret;
+}
diff --git a/merge-strategies.h b/merge-strategies.h
index 778f8ce9d6..938411a04e 100644
--- a/merge-strategies.h
+++ b/merge-strategies.h
@@ -37,5 +37,8 @@ int merge_all(struct index_state *istate, int oneshot, int quiet,
 int merge_strategies_resolve(struct repository *r,
 			     struct commit_list *bases, const char *head_arg,
 			     struct commit_list *remote);
+int merge_strategies_octopus(struct repository *r,
+			     struct commit_list *bases, const char *head_arg,
+			     struct commit_list *remote);
 
 #endif /* MERGE_STRATEGIES_H */
-- 
2.28.0.370.g2c2dc3cc62



* [PATCH v2 08/11] merge: use the "resolve" strategy without forking
  2020-09-01 10:56 ` [PATCH v2 00/11] Rewrite the remaining merge strategies from shell to C Alban Gruin
                     ` (6 preceding siblings ...)
  2020-09-01 10:57   ` [PATCH v2 07/11] merge-octopus: rewrite in C Alban Gruin
@ 2020-09-01 10:57   ` Alban Gruin
  2020-09-01 10:57   ` [PATCH v2 09/11] merge: use the "octopus" " Alban Gruin
                     ` (3 subsequent siblings)
  11 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2020-09-01 10:57 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, phillip.wood, Alban Gruin

This teaches `git merge' to invoke the "resolve" strategy with a
function call instead of forking.

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 builtin/merge.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/builtin/merge.c b/builtin/merge.c
index 74829a838e..541d9bed02 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -41,6 +41,7 @@
 #include "commit-reach.h"
 #include "wt-status.h"
 #include "commit-graph.h"
+#include "merge-strategies.h"
 
 #define DEFAULT_TWOHEAD (1<<0)
 #define DEFAULT_OCTOPUS (1<<1)
@@ -740,7 +741,10 @@ static int try_merge_strategy(const char *strategy, struct commit_list *common,
 				       COMMIT_LOCK | SKIP_IF_UNCHANGED))
 			die(_("unable to write %s"), get_index_file());
 		return clean ? 0 : 1;
-	} else {
+	} else if (!strcmp(strategy, "resolve"))
+		return merge_strategies_resolve(the_repository, common,
+						head_arg, remoteheads);
+	else {
 		return try_merge_command(the_repository,
 					 strategy, xopts_nr, xopts,
 					 common, head_arg, remoteheads);
-- 
2.28.0.370.g2c2dc3cc62



* [PATCH v2 09/11] merge: use the "octopus" strategy without forking
  2020-09-01 10:56 ` [PATCH v2 00/11] Rewrite the remaining merge strategies from shell to C Alban Gruin
                     ` (7 preceding siblings ...)
  2020-09-01 10:57   ` [PATCH v2 08/11] merge: use the "resolve" strategy without forking Alban Gruin
@ 2020-09-01 10:57   ` Alban Gruin
  2020-09-01 10:57   ` [PATCH v2 10/11] sequencer: use the "resolve" " Alban Gruin
                     ` (2 subsequent siblings)
  11 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2020-09-01 10:57 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, phillip.wood, Alban Gruin

This teaches `git merge' to invoke the "octopus" strategy with a
function call instead of forking.

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 builtin/merge.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/builtin/merge.c b/builtin/merge.c
index 541d9bed02..90e092ad02 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -744,6 +744,9 @@ static int try_merge_strategy(const char *strategy, struct commit_list *common,
 	} else if (!strcmp(strategy, "resolve"))
 		return merge_strategies_resolve(the_repository, common,
 						head_arg, remoteheads);
+	else if (!strcmp(strategy, "octopus"))
+		return merge_strategies_octopus(the_repository, common,
+						head_arg, remoteheads);
 	else {
 		return try_merge_command(the_repository,
 					 strategy, xopts_nr, xopts,
-- 
2.28.0.370.g2c2dc3cc62



* [PATCH v2 10/11] sequencer: use the "resolve" strategy without forking
  2020-09-01 10:56 ` [PATCH v2 00/11] Rewrite the remaining merge strategies from shell to C Alban Gruin
                     ` (8 preceding siblings ...)
  2020-09-01 10:57   ` [PATCH v2 09/11] merge: use the "octopus" " Alban Gruin
@ 2020-09-01 10:57   ` Alban Gruin
  2020-09-01 10:57   ` [PATCH v2 11/11] sequencer: use the "octopus" merge " Alban Gruin
  2020-10-05 12:26   ` [PATCH v3 00/11] Rewrite the remaining merge strategies from shell to C Alban Gruin
  11 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2020-09-01 10:57 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, phillip.wood, Alban Gruin

This teaches the sequencer to invoke the "resolve" strategy with a
function call instead of forking.

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 sequencer.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/sequencer.c b/sequencer.c
index 2425896911..c4c7b28d24 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -33,6 +33,7 @@
 #include "commit-reach.h"
 #include "rebase-interactive.h"
 #include "reset.h"
+#include "merge-strategies.h"
 
 #define GIT_REFLOG_ACTION "GIT_REFLOG_ACTION"
 
@@ -1922,9 +1923,15 @@ static int do_pick_commit(struct repository *r,
 
 		commit_list_insert(base, &common);
 		commit_list_insert(next, &remotes);
-		res |= try_merge_command(r, opts->strategy,
-					 opts->xopts_nr, (const char **)opts->xopts,
-					common, oid_to_hex(&head), remotes);
+
+		if (!strcmp(opts->strategy, "resolve")) {
+			repo_read_index(r);
+			res |= merge_strategies_resolve(r, common, oid_to_hex(&head), remotes);
+		} else
+			res |= try_merge_command(r, opts->strategy,
+						 opts->xopts_nr, (const char **)opts->xopts,
+						 common, oid_to_hex(&head), remotes);
+
 		free_commit_list(common);
 		free_commit_list(remotes);
 	}
-- 
2.28.0.370.g2c2dc3cc62



* [PATCH v2 11/11] sequencer: use the "octopus" merge strategy without forking
  2020-09-01 10:56 ` [PATCH v2 00/11] Rewrite the remaining merge strategies from shell to C Alban Gruin
                     ` (9 preceding siblings ...)
  2020-09-01 10:57   ` [PATCH v2 10/11] sequencer: use the "resolve" " Alban Gruin
@ 2020-09-01 10:57   ` Alban Gruin
  2020-10-05 12:26   ` [PATCH v3 00/11] Rewrite the remaining merge strategies from shell to C Alban Gruin
  11 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2020-09-01 10:57 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, phillip.wood, Alban Gruin

This teaches the sequencer to invoke the "octopus" strategy with a
function call instead of forking.

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 sequencer.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/sequencer.c b/sequencer.c
index c4c7b28d24..34853b8970 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -1927,6 +1927,9 @@ static int do_pick_commit(struct repository *r,
 		if (!strcmp(opts->strategy, "resolve")) {
 			repo_read_index(r);
 			res |= merge_strategies_resolve(r, common, oid_to_hex(&head), remotes);
+		} else if (!strcmp(opts->strategy, "octopus")) {
+			repo_read_index(r);
+			res |= merge_strategies_octopus(r, common, oid_to_hex(&head), remotes);
 		} else
 			res |= try_merge_command(r, opts->strategy,
 						 opts->xopts_nr, (const char **)opts->xopts,
-- 
2.28.0.370.g2c2dc3cc62



* Re: [PATCH v2 02/11] merge-one-file: rewrite in C
  2020-09-01 10:56   ` [PATCH v2 02/11] merge-one-file: rewrite in C Alban Gruin
@ 2020-09-01 21:06     ` Junio C Hamano
  2020-09-02 14:50       ` Alban Gruin
  0 siblings, 1 reply; 221+ messages in thread
From: Junio C Hamano @ 2020-09-01 21:06 UTC (permalink / raw)
  To: Alban Gruin; +Cc: git, phillip.wood

Alban Gruin <alban.gruin@gmail.com> writes:

> diff --git a/builtin/merge-one-file.c b/builtin/merge-one-file.c
> new file mode 100644
> index 0000000000..306a86c2f0
> --- /dev/null
> +++ b/builtin/merge-one-file.c
> @@ -0,0 +1,85 @@
> +/*
> + * Builtin "git merge-one-file"
> + *
> + * Copyright (c) 2020 Alban Gruin
> + *
> + * Based on git-merge-one-file.sh, written by Linus Torvalds.
> + *
> + * This is the git per-file merge utility, called with
> + *
> + *   argv[1] - original file SHA1 (or empty)
> + *   argv[2] - file in branch1 SHA1 (or empty)
> + *   argv[3] - file in branch2 SHA1 (or empty)
> + *   argv[4] - pathname in repository
> + *   argv[5] - original file mode (or empty)
> + *   argv[6] - file in branch1 mode (or empty)
> + *   argv[7] - file in branch2 mode (or empty)
> + *
> + * Handle some trivial cases. The _really_ trivial cases have been
> + * handled already by git read-tree, but that one doesn't do any merges
> + * that might change the tree layout.
> + */
> +
> +#include "cache.h"
> +#include "builtin.h"
> +#include "lockfile.h"
> +#include "merge-strategies.h"
> +
> +static const char builtin_merge_one_file_usage[] =
> +	"git merge-one-file <orig blob> <our blob> <their blob> <path> "
> +	"<orig mode> <our mode> <their mode>\n\n"
> +	"Blob ids and modes should be empty for missing files.";
> +
> +int cmd_merge_one_file(int argc, const char **argv, const char *prefix)
> +{
> +	struct object_id orig_blob, our_blob, their_blob,
> +		*p_orig_blob = NULL, *p_our_blob = NULL, *p_their_blob = NULL;
> +	unsigned int orig_mode = 0, our_mode = 0, their_mode = 0, ret = 0;
> +	struct lock_file lock = LOCK_INIT;
> +
> +	if (argc != 8)
> +		usage(builtin_merge_one_file_usage);
> +
> +	if (repo_read_index(the_repository) < 0)
> +		die("invalid index");
> +
> +	repo_hold_locked_index(the_repository, &lock, LOCK_DIE_ON_ERROR);

I do understand why we would want merge_strategies_one_file() helper
introduced by this step so that the helper can work in an arbitrary
repository (hence taking a pointer to repository structure as one of
its parameters).

But the "merge-one-file" command will always work in the_repository.
I do not see a point in using helpers that can work in an arbitrary
repository, like repo_read_index() or repo_hold_locked_index(), in
the above.  I only see downsides --- it is longer to read, makes
readers wonder if there is something tricky involving another
repository going on, etc.

> +	if (!get_oid(argv[1], &orig_blob)) {
> +		p_orig_blob = &orig_blob;
> +		orig_mode = strtol(argv[5], NULL, 8);

Write a wrapper around strtol(...,...,8) to reduce repetition, and
make sure you do not pass NULL as the second parameter to strtol()
to always check you parsed the string to the end.
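
Something along these lines, perhaps.  This is only an untested
sketch: the helper name parse_octal_mode() is made up for
illustration and is not part of the patch.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not from the patch): parse an octal file mode.
 * Returns 0 and stores the value in *mode on success; returns -1 if
 * the string has no octal digits, has trailing garbage, or is out of
 * range.
 */
int parse_octal_mode(const char *arg, unsigned int *mode)
{
	char *end;
	long value;

	errno = 0;
	value = strtol(arg, &end, 8);

	/* end == arg: nothing parsed; *end != '\0': trailing garbage */
	if (end == arg || *end != '\0' || errno)
		return -1;
	if (value < 0 || (unsigned long)value > UINT_MAX)
		return -1;

	*mode = (unsigned int)value;
	return 0;
}
```

The three call sites would then share one place that decides what
counts as a malformed mode.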

> +	ret = merge_strategies_one_file(the_repository,
> +					p_orig_blob, p_our_blob, p_their_blob, argv[4],
> +					orig_mode, our_mode, their_mode);

Here, as I said above, it is perfectly fine to pass
the_repository.

> +	if (ret) {
> +		rollback_lock_file(&lock);
> +		return ret;
> +	}
> +
> +	return write_locked_index(the_repository->index, &lock, COMMIT_LOCK);

Likewise, I do not see much point in saying the_repository->index; the_index
is a perfectly fine short-hand.

> diff --git a/merge-strategies.c b/merge-strategies.c
> new file mode 100644
> index 0000000000..f2af4a894d
> --- /dev/null
> +++ b/merge-strategies.c
> @@ -0,0 +1,199 @@
> +#include "cache.h"
> +#include "dir.h"
> +#include "ll-merge.h"
> +#include "merge-strategies.h"
> +#include "xdiff-interface.h"
> +
> +static int add_to_index_cacheinfo(struct index_state *istate,
> +				  unsigned int mode,
> +				  const struct object_id *oid, const char *path)
> +{
> +	struct cache_entry *ce;
> +	int len, option;
> +
> +	if (!verify_path(path, mode))
> +		return error(_("Invalid path '%s'"), path);
> +
> +	len = strlen(path);
> +	ce = make_empty_cache_entry(istate, len);
> +
> +	oidcpy(&ce->oid, oid);
> +	memcpy(ce->name, path, len);
> +	ce->ce_flags = create_ce_flags(0);
> +	ce->ce_namelen = len;
> +	ce->ce_mode = create_ce_mode(mode);
> +	if (assume_unchanged)
> +		ce->ce_flags |= CE_VALID;
> +	option = ADD_CACHE_OK_TO_ADD | ADD_CACHE_OK_TO_REPLACE;
> +	if (add_index_entry(istate, ce, option))
> +		return error(_("%s: cannot add to the index"), path);
> +
> +	return 0;
> +}
> +
> +static int checkout_from_index(struct index_state *istate, const char *path)
> +{
> +	struct checkout state = CHECKOUT_INIT;
> +	struct cache_entry *ce;
> +
> +	state.istate = istate;
> +	state.force = 1;
> +	state.base_dir = "";
> +	state.base_dir_len = 0;
> +
> +	ce = index_file_exists(istate, path, strlen(path), 0);
> +	if (checkout_entry(ce, &state, NULL, NULL) < 0)
> +		return error(_("%s: cannot checkout file"), path);
> +	return 0;
> +}
> +
> +static int merge_one_file_deleted(struct index_state *istate,
> +				  const struct object_id *orig_blob,
> +				  const struct object_id *our_blob,
> +				  const struct object_id *their_blob, const char *path,
> +				  unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode)
> +{
> +	if ((our_blob && orig_mode != our_mode) ||
> +	    (their_blob && orig_mode != their_mode))
> +		return error(_("File %s deleted on one branch but had its "
> +			       "permissions changed on the other."), path);
> +
> +	if (our_blob) {
> +		printf(_("Removing %s\n"), path);
> +
> +		if (file_exists(path))
> +			remove_path(path);
> +	}
> +
> +	if (remove_file_from_index(istate, path))
> +		return error("%s: cannot remove from the index", path);
> +	return 0;
> +}

The functions we see above are all easy to write these days, thanks
to the previous work that built many helpers to perform common
operations (e.g. remove_path()).  Reusing them is very good.

> +static int do_merge_one_file(struct index_state *istate,
> +			     const struct object_id *orig_blob,
> +			     const struct object_id *our_blob,
> +			     const struct object_id *their_blob, const char *path,
> +			     unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode)
> +{
> +	int ret, i, dest;
> +	mmbuffer_t result = {NULL, 0};
> +	mmfile_t mmfs[3];
> +	struct ll_merge_options merge_opts = {0};
> +	struct cache_entry *ce;
> +
> +	if (our_mode == S_IFLNK || their_mode == S_IFLNK)
> +		return error(_("%s: Not merging symbolic link changes."), path);
> +	else if (our_mode == S_IFGITLINK || their_mode == S_IFGITLINK)
> +		return error(_("%s: Not merging conflicting submodule changes."), path);
> +
> +	read_mmblob(mmfs + 1, our_blob);
> +	read_mmblob(mmfs + 2, their_blob);
> +
> +	if (orig_blob) {
> +		printf(_("Auto-merging %s\n"), path);
> +		read_mmblob(mmfs + 0, orig_blob);
> +	} else {
> +		printf(_("Added %s in both, but differently.\n"), path);
> +		read_mmblob(mmfs + 0, &null_oid);
> +	}
> +
> +	merge_opts.xdl_opts = XDL_MERGE_ZEALOUS_ALNUM;
> +	ret = ll_merge(&result, path,
> +		       mmfs + 0, "orig",
> +		       mmfs + 1, "our",
> +		       mmfs + 2, "their",
> +		       istate, &merge_opts);
> +
> +	for (i = 0; i < 3; i++)
> +		free(mmfs[i].ptr);
> +
> +	if (ret > 127 || !orig_blob)
> +		ret = error(_("content conflict in %s"), path);

The original only checked if ret is zero or non-zero; here we
require ret to be large.  Intended?  

ll_merge() that called ll_xdl_merge() (i.e. the most common case)
would return the value returned from xdl_merge(), which can be -1
when we got an error before calling xdl_do_merge().  xdl_do_merge()
in turn can return -1.  The most common case returns the value
returned from xdl_cleanup_merge(), which is 0 for clean merge, and
any positive integer (not clipped to 127 or 128) for conflicted one.
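
Going by the return values described above, a caller could tell the
three cases apart roughly like this.  It is a sketch only: the enum
and the function name are invented for illustration, and treating
every negative value as an error is my reading of the above.

```c
/* Hypothetical names, not from the patch or from ll-merge.h. */
enum merge_outcome {
	MERGE_CLEAN,
	MERGE_CONFLICT,
	MERGE_ERROR
};

/*
 * Classify a low-level merge return value:
 *   negative -> error before or during the merge
 *   zero     -> clean merge
 *   positive -> conflicted (the value is not clipped to 127 or 128)
 */
enum merge_outcome classify_merge_ret(int ret)
{
	if (ret < 0)
		return MERGE_ERROR;
	if (ret > 0)
		return MERGE_CONFLICT;
	return MERGE_CLEAN;
}
```

With "ret > 127", a return value of 1 (conflicted) would be taken
for a clean merge.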

> +	/* Create the working tree file, using "our tree" version from
> +	   the index, and then store the result of the merge. */

Style. (cf. Documentation/CodingGuidelines).

> +	ce = index_file_exists(istate, path, strlen(path), 0);
> +	if (!ce)
> +		BUG("file is not present in the cache?");
> +
> +	unlink(path);
> +	dest = open(path, O_WRONLY | O_CREAT, ce->ce_mode);
> +	write_in_full(dest, result.ptr, result.size);

If open() fails, we write to a bogus file descriptor here.
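
A minimal sketch of the kind of check I mean.  It is written against
plain POSIX calls so that it stands alone, rather than against git's
own wrappers; the function name write_result_file() is hypothetical.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: create "path" with the given mode and write
 * "len" bytes from "buf" into it.  Returns 0 on success and -1 on
 * any failure, without ever writing to an invalid descriptor.
 */
int write_result_file(const char *path, const void *buf, size_t len,
		      mode_t mode)
{
	const char *p = buf;
	int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, mode);

	if (fd < 0)
		return -1;	/* do not fall through to write() */

	while (len > 0) {
		ssize_t n = write(fd, p, len);

		if (n < 0 && errno == EINTR)
			continue;	/* interrupted, retry */
		if (n <= 0) {
			close(fd);
			return -1;
		}
		p += n;
		len -= (size_t)n;
	}

	return close(fd) < 0 ? -1 : 0;
}
```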

> +	close(dest);
> +
> +	free(result.ptr);
> +
> +	if (ret && our_mode != their_mode)
> +		return error(_("permission conflict: %o->%o,%o in %s"),
> +			     orig_mode, our_mode, their_mode, path);
> +	if (ret)
> +		return 1;

What is the error returning convention around here?  Our usual
convention is that 0 signals a success, and negative reports an
error.  Returning the value returned from add_file_to_index() below,
and error() above, are consistent with the convention, but this one
returns 1 that is not.  When deviating from convention, it needs to
be documented for the callers in a comment before the function
definition.

> +
> +	return add_file_to_index(istate, path, 0);
> +}



> +int merge_strategies_one_file(struct repository *r,
> +			      const struct object_id *orig_blob,
> +			      const struct object_id *our_blob,
> +			      const struct object_id *their_blob, const char *path,
> +			      unsigned int orig_mode, unsigned int our_mode,
> +			      unsigned int their_mode)
> +{
> +	if (orig_blob &&
> +	    ((!their_blob && our_blob && oideq(orig_blob, our_blob)) ||
> +	     (!our_blob && their_blob && oideq(orig_blob, their_blob))))
> +		/* Deleted in both or deleted in one and unchanged in
> +		   the other */
> +		return merge_one_file_deleted(r->index,
> +					      orig_blob, our_blob, their_blob, path,
> +					      orig_mode, our_mode, their_mode);
> +	else if (!orig_blob && our_blob && !their_blob) {
> +		/* Added in one.  The other side did not add and we
> +		   added so there is nothing to be done, except making
> +		   the path merged. */
> +		return add_to_index_cacheinfo(r->index, our_mode, our_blob, path);
> +	} else if (!orig_blob && !our_blob && their_blob) {
> +		printf(_("Adding %s\n"), path);
> +
> +		if (file_exists(path))
> +			return error(_("untracked %s is overwritten by the merge."), path);
> +
> +		if (add_to_index_cacheinfo(r->index, their_mode, their_blob, path))
> +			return 1;
> +		return checkout_from_index(r->index, path);
> +	} else if (!orig_blob && our_blob && their_blob &&
> +		   oideq(our_blob, their_blob)) {
> +		/* Added in both, identically (check for same
> +		   permissions). */
> +		if (our_mode != their_mode)
> +			return error(_("File %s added identically in both branches, "
> +				       "but permissions conflict %o->%o."),
> +				     path, our_mode, their_mode);
> +
> +		printf(_("Adding %s\n"), path);
> +
> +		if (add_to_index_cacheinfo(r->index, our_mode, our_blob, path))
> +			return 1;
> +		return checkout_from_index(r->index, path);
> +	} else if (our_blob && their_blob)
> +		/* Modified in both, but differently. */
> +		return do_merge_one_file(r->index,
> +					 orig_blob, our_blob, their_blob, path,
> +					 orig_mode, our_mode, their_mode);
> +	else {
> +		char *orig_hex = "", *our_hex = "", *their_hex = "";
> +
> +		if (orig_blob)
> +			orig_hex = oid_to_hex(orig_blob);
> +		if (our_blob)
> +			our_hex = oid_to_hex(our_blob);
> +		if (their_blob)
> +			their_hex = oid_to_hex(their_blob);

Prepare three char [] buffers and use oid_to_hex_r() instead of
relying on there being a sufficient number of entries in the
rotating buffer.
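
To illustrate the hazard with a toy stand-in (this is not git's
implementation; the toy_* names and the two-slot buffer are
assumptions made to keep the example small and the effect visible):

```c
#include <stdio.h>

#define TOY_HEXSZ 8

/*
 * Toy analogue of a helper that hands out a small set of static
 * buffers in rotation.  With two slots, the third call reuses the
 * buffer returned by the first.
 */
const char *toy_to_hex(unsigned int id)
{
	static char bufs[2][TOY_HEXSZ + 1];
	static int next;
	char *buf = bufs[next];

	next = (next + 1) % 2;
	snprintf(buf, TOY_HEXSZ + 1, "%08x", id);
	return buf;
}

/*
 * Toy analogue of the "_r" variant: the caller owns the buffer, so
 * the result stays valid no matter how many other calls follow.
 */
char *toy_to_hex_r(char *out, unsigned int id)
{
	snprintf(out, TOY_HEXSZ + 1, "%08x", id);
	return out;
}
```

With caller-owned buffers, the error() call can print all three
values without caring how many slots the rotating variant has.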

> +		return error(_("%s: Not handling case %s -> %s -> %s"),
> +			     path, orig_hex, our_hex, their_hex);
> +	}
> +
> +	return 0;
> +}
> diff --git a/merge-strategies.h b/merge-strategies.h
> new file mode 100644
> index 0000000000..b527d145c7
> --- /dev/null
> +++ b/merge-strategies.h
> @@ -0,0 +1,13 @@
> +#ifndef MERGE_STRATEGIES_H
> +#define MERGE_STRATEGIES_H
> +
> +#include "object.h"
> +
> +int merge_strategies_one_file(struct repository *r,
> +			      const struct object_id *orig_blob,
> +			      const struct object_id *our_blob,
> +			      const struct object_id *their_blob, const char *path,
> +			      unsigned int orig_mode, unsigned int our_mode,
> +			      unsigned int their_mode);
> +
> +#endif /* MERGE_STRATEGIES_H */
> diff --git a/t/t6415-merge-dir-to-symlink.sh b/t/t6415-merge-dir-to-symlink.sh
> index 2eddcc7664..5fb74e39a0 100755
> --- a/t/t6415-merge-dir-to-symlink.sh
> +++ b/t/t6415-merge-dir-to-symlink.sh
> @@ -94,7 +94,7 @@ test_expect_success SYMLINKS 'a/b was resolved as symlink' '
>  	test -h a/b
>  '
>  
> -test_expect_failure 'do not lose untracked in merge (resolve)' '
> +test_expect_success 'do not lose untracked in merge (resolve)' '
>  	git reset --hard &&
>  	git checkout baseline^0 &&
>  	>a/b/c/e &&


* Re: [PATCH v2 03/11] merge-index: libify merge_one_path() and merge_all()
  2020-09-01 10:56   ` [PATCH v2 03/11] merge-index: libify merge_one_path() and merge_all() Alban Gruin
@ 2020-09-01 21:11     ` Junio C Hamano
  2020-09-02 15:37       ` Alban Gruin
  0 siblings, 1 reply; 221+ messages in thread
From: Junio C Hamano @ 2020-09-01 21:11 UTC (permalink / raw)
  To: Alban Gruin; +Cc: git, phillip.wood

Alban Gruin <alban.gruin@gmail.com> writes:

> The "resolve" and "octopus" merge strategies do not call directly `git
> merge-one-file', they delegate the work to another git command, `git
> merge-index', that will loop over files in the index and call the
> specified command.  Unfortunately, these functions are not part of
> libgit.a, which means that once rewritten, the strategies would still
> have to invoke `merge-one-file' by spawning a new process first.
>
> To avoid this, this moves merge_one_path(), merge_all(), and their
> helpers to merge-strategies.c.  They also take a callback to dictate
> what they should do for each file.  For now, only one launching a new
> process is defined to preserve the behaviour of the builtin version.

... of the "builtin" version?  I thought this series is introducing
a new builtin version?  Puzzled...


* Re: [PATCH v2 02/11] merge-one-file: rewrite in C
  2020-09-01 21:06     ` Junio C Hamano
@ 2020-09-02 14:50       ` Alban Gruin
  0 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2020-09-02 14:50 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, phillip.wood

Hi Junio,

On 01/09/2020 at 23:06, Junio C Hamano wrote:
> Alban Gruin <alban.gruin@gmail.com> writes:
> 
>> diff --git a/builtin/merge-one-file.c b/builtin/merge-one-file.c
>> new file mode 100644
>> index 0000000000..306a86c2f0
>> --- /dev/null
>> +++ b/builtin/merge-one-file.c
>> @@ -0,0 +1,85 @@
>> +/*
>> + * Builtin "git merge-one-file"
>> + *
>> + * Copyright (c) 2020 Alban Gruin
>> + *
>> + * Based on git-merge-one-file.sh, written by Linus Torvalds.
>> + *
>> + * This is the git per-file merge utility, called with
>> + *
>> + *   argv[1] - original file SHA1 (or empty)
>> + *   argv[2] - file in branch1 SHA1 (or empty)
>> + *   argv[3] - file in branch2 SHA1 (or empty)
>> + *   argv[4] - pathname in repository
>> + *   argv[5] - original file mode (or empty)
>> + *   argv[6] - file in branch1 mode (or empty)
>> + *   argv[7] - file in branch2 mode (or empty)
>> + *
>> + * Handle some trivial cases. The _really_ trivial cases have been
>> + * handled already by git read-tree, but that one doesn't do any merges
>> + * that might change the tree layout.
>> + */
>> +
>> +#include "cache.h"
>> +#include "builtin.h"
>> +#include "lockfile.h"
>> +#include "merge-strategies.h"
>> +
>> +static const char builtin_merge_one_file_usage[] =
>> +	"git merge-one-file <orig blob> <our blob> <their blob> <path> "
>> +	"<orig mode> <our mode> <their mode>\n\n"
>> +	"Blob ids and modes should be empty for missing files.";
>> +
>> +int cmd_merge_one_file(int argc, const char **argv, const char *prefix)
>> +{
>> +	struct object_id orig_blob, our_blob, their_blob,
>> +		*p_orig_blob = NULL, *p_our_blob = NULL, *p_their_blob = NULL;
>> +	unsigned int orig_mode = 0, our_mode = 0, their_mode = 0, ret = 0;
>> +	struct lock_file lock = LOCK_INIT;
>> +
>> +	if (argc != 8)
>> +		usage(builtin_merge_one_file_usage);
>> +
>> +	if (repo_read_index(the_repository) < 0)
>> +		die("invalid index");
>> +
>> +	repo_hold_locked_index(the_repository, &lock, LOCK_DIE_ON_ERROR);
> 
> I do understand why we would want merge_strategies_one_file() helper
> introduced by this step so that the helper can work in an arbitrary
> repository (hence taking a pointer to repository structure as one of
> its parameters).
> 
> But the "merge-one-file" command will always work in the_repository.
> I do not see a point in using helpers that can work in an arbitrary
> repository, like repo_read_index() or repo_hold_locked_index(), in
> the above.  I only see downsides --- it is longer to read, makes
> readers wonder if there is something tricky involving another
> repository going on, etc.
> 

I was under the impression that using the_index is just deprecated, and
that we ought to avoid using it, even in builtins.

Will update that.

>> +	if (!get_oid(argv[1], &orig_blob)) {
>> +		p_orig_blob = &orig_blob;
>> +		orig_mode = strtol(argv[5], NULL, 8);
> 
> Write a wrapper around strtol(...,...,8) to reduce repetition, and
> make sure you do not pass NULL as the second parameter to strtol()
> to always check you parsed the string to the end.
> 
>> +	ret = merge_strategies_one_file(the_repository,
>> +					p_orig_blob, p_our_blob, p_their_blob, argv[4],
>> +					orig_mode, our_mode, their_mode);
> 
> Here, as I said above, it is perfectly fine to pass
> the_repository().
> 
>> +	if (ret) {
>> +		rollback_lock_file(&lock);
>> +		return ret;
>> +	}
>> +
>> +	return write_locked_index(the_repository->index, &lock, COMMIT_LOCK);
> 
> Likewise, I do not see much point in saying the_repository->index; the_index
> is a perfectly fine short-hand.
> 
> -%<-
>> +static int do_merge_one_file(struct index_state *istate,
>> +			     const struct object_id *orig_blob,
>> +			     const struct object_id *our_blob,
>> +			     const struct object_id *their_blob, const char *path,
>> +			     unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode)
>> +{
>> +	int ret, i, dest;
>> +	mmbuffer_t result = {NULL, 0};
>> +	mmfile_t mmfs[3];
>> +	struct ll_merge_options merge_opts = {0};
>> +	struct cache_entry *ce;
>> +
>> +	if (our_mode == S_IFLNK || their_mode == S_IFLNK)
>> +		return error(_("%s: Not merging symbolic link changes."), path);
>> +	else if (our_mode == S_IFGITLINK || their_mode == S_IFGITLINK)
>> +		return error(_("%s: Not merging conflicting submodule changes."), path);
>> +
>> +	read_mmblob(mmfs + 1, our_blob);
>> +	read_mmblob(mmfs + 2, their_blob);
>> +
>> +	if (orig_blob) {
>> +		printf(_("Auto-merging %s\n"), path);
>> +		read_mmblob(mmfs + 0, orig_blob);
>> +	} else {
>> +		printf(_("Added %s in both, but differently.\n"), path);
>> +		read_mmblob(mmfs + 0, &null_oid);
>> +	}
>> +
>> +	merge_opts.xdl_opts = XDL_MERGE_ZEALOUS_ALNUM;
>> +	ret = ll_merge(&result, path,
>> +		       mmfs + 0, "orig",
>> +		       mmfs + 1, "our",
>> +		       mmfs + 2, "their",
>> +		       istate, &merge_opts);
>> +
>> +	for (i = 0; i < 3; i++)
>> +		free(mmfs[i].ptr);
>> +
>> +	if (ret > 127 || !orig_blob)
>> +		ret = error(_("content conflict in %s"), path);
> 
> The original only checked if ret is zero or non-zero; here we
> require ret to be large.  Intended?  
> 
> ll_merge() that called ll_xdl_merge() (i.e. the most common case)
> would return the value returned from xdl_merge(), which can be -1
> when we got an error before calling xdl_do_merge().  xdl_do_merge()
> in turn can return -1.  The most common case returns the value
> returned from xdl_cleanup_merge(), which is 0 for clean merge, and
> any positive integer (not clipped to 127 or 128) for conflicted one.
> 

Huh, not sure why I did this, and I'm puzzled that it did not break
anything.
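
For illustration, the return convention described in the quoted review
(negative for an error, 0 for a clean merge, any positive value for a
conflicted one) could be handled along these lines.  This is only a
minimal sketch: classify_merge_result() and the enum are hypothetical
names, not functions from git.

```c
/* Hypothetical names for illustration; not part of git. */
enum merge_outcome {
	MERGE_CLEAN,
	MERGE_CONFLICT,
	MERGE_ERROR
};

/*
 * Classify a low-level merge result: negative is an error, 0 is a
 * clean merge, and any positive value (not clipped to 127 or 128)
 * is a conflicted merge.
 */
static enum merge_outcome classify_merge_result(int ret)
{
	if (ret < 0)
		return MERGE_ERROR;
	if (ret > 0)
		return MERGE_CONFLICT;
	return MERGE_CLEAN;
}
```

With such a check, a `ret > 127' test would miss small positive
values, which is the concern raised in the review.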

>> +	/* Create the working tree file, using "our tree" version from
>> +	   the index, and then store the result of the merge. */
> 
> Style. (cf. Documentation/CodingGuidelines).
> 
>> +	ce = index_file_exists(istate, path, strlen(path), 0);
>> +	if (!ce)
>> +		BUG("file is not present in the cache?");
>> +
>> +	unlink(path);
>> +	dest = open(path, O_WRONLY | O_CREAT, ce->ce_mode);
>> +	write_in_full(dest, result.ptr, result.size);
> 
> If open() fails, we write to a bogus file descriptor here.
> 
>> +	close(dest);
>> +
>> +	free(result.ptr);
>> +
>> +	if (ret && our_mode != their_mode)
>> +		return error(_("permission conflict: %o->%o,%o in %s"),
>> +			     orig_mode, our_mode, their_mode, path);
>> +	if (ret)
>> +		return 1;
> 
> What is the error returning convention around here?  Our usual
> convention is that 0 signals a success, and negative reports an
> error.  Returning the value returned from add_file_to_index() below,
> and error() above, are consistent with the convention, but this one
> returns 1 that is not.  When deviating from convention, it needs to
> be documented for the callers in a comment before the function
> definition.
> 

I stayed too close to the shell script on this one…

Note that this is not the case for "resolve" and "octopus", they use the
convention for merge backends, documented in builtin/merge.c:

> 		/*
> 		 * The backend exits with 1 when conflicts are
> 		 * left to be resolved, with 2 when it does not
> 		 * handle the given merge at all.
> 		 */

(In practice, it looks like any non-zero value lower than 2 indicates a
merge conflict, and any value greater than or equal to 2 is a general
failure.)
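
As a rough sketch of that reading of the backend convention, assuming
the status is a process exit code (so never negative), the mapping
could look like this.  interpret_backend_exit() and the enum are
hypothetical names used only for this example.

```c
/* Hypothetical names for illustration; not part of git. */
enum backend_status {
	BACKEND_OK,        /* exit code 0: merge succeeded */
	BACKEND_CONFLICTS, /* exit code 1: conflicts left to resolve */
	BACKEND_FAILED     /* exit code >= 2: merge not handled / failure */
};

/* Map a merge backend's exit code (assumed >= 0) to an outcome. */
static enum backend_status interpret_backend_exit(int status)
{
	if (status == 0)
		return BACKEND_OK;
	if (status < 2)
		return BACKEND_CONFLICTS;
	return BACKEND_FAILED;
}
```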

>> +
>> +	return add_file_to_index(istate, path, 0);
>> +}
> 
> 
> 
>> +int merge_strategies_one_file(struct repository *r,
>> +			      const struct object_id *orig_blob,
>> +			      const struct object_id *our_blob,
>> +			      const struct object_id *their_blob, const char *path,
>> +			      unsigned int orig_mode, unsigned int our_mode,
>> +			      unsigned int their_mode)
>> +{
>> +	if (orig_blob &&
>> +	    ((!their_blob && our_blob && oideq(orig_blob, our_blob)) ||
>> +	     (!our_blob && their_blob && oideq(orig_blob, their_blob))))
>> +		/* Deleted in both or deleted in one and unchanged in
>> +		   the other */
>> +		return merge_one_file_deleted(r->index,
>> +					      orig_blob, our_blob, their_blob, path,
>> +					      orig_mode, our_mode, their_mode);
>> +	else if (!orig_blob && our_blob && !their_blob) {
>> +		/* Added in one.  The other side did not add and we
>> +		   added so there is nothing to be done, except making
>> +		   the path merged. */
>> +		return add_to_index_cacheinfo(r->index, our_mode, our_blob, path);
>> +	} else if (!orig_blob && !our_blob && their_blob) {
>> +		printf(_("Adding %s\n"), path);
>> +
>> +		if (file_exists(path))
>> +			return error(_("untracked %s is overwritten by the merge."), path);
>> +
>> +		if (add_to_index_cacheinfo(r->index, their_mode, their_blob, path))
>> +			return 1;
>> +		return checkout_from_index(r->index, path);
>> +	} else if (!orig_blob && our_blob && their_blob &&
>> +		   oideq(our_blob, their_blob)) {
>> +		/* Added in both, identically (check for same
>> +		   permissions). */
>> +		if (our_mode != their_mode)
>> +			return error(_("File %s added identically in both branches, "
>> +				       "but permissions conflict %o->%o."),
>> +				     path, our_mode, their_mode);
>> +
>> +		printf(_("Adding %s\n"), path);
>> +
>> +		if (add_to_index_cacheinfo(r->index, our_mode, our_blob, path))
>> +			return 1;
>> +		return checkout_from_index(r->index, path);
>> +	} else if (our_blob && their_blob)
>> +		/* Modified in both, but differently. */
>> +		return do_merge_one_file(r->index,
>> +					 orig_blob, our_blob, their_blob, path,
>> +					 orig_mode, our_mode, their_mode);
>> +	else {
>> +		char *orig_hex = "", *our_hex = "", *their_hex = "";
>> +
>> +		if (orig_blob)
>> +			orig_hex = oid_to_hex(orig_blob);
>> +		if (our_blob)
>> +			our_hex = oid_to_hex(our_blob);
>> +		if (their_blob)
>> +			their_hex = oid_to_hex(their_blob);
> 
> Prepare three char [] buffers and use oid_to_hex_r() instead of
> relying on having a sufficient number of entries in the rotating
> buffer.
> 
>> +		return error(_("%s: Not handling case %s -> %s -> %s"),
>> +			     path, orig_hex, our_hex, their_hex);
>> +	}
>> +
>> +	return 0;
>> +}
>> diff --git a/merge-strategies.h b/merge-strategies.h
>> new file mode 100644
>> index 0000000000..b527d145c7
>> --- /dev/null
>> +++ b/merge-strategies.h
>> @@ -0,0 +1,13 @@
>> +#ifndef MERGE_STRATEGIES_H
>> +#define MERGE_STRATEGIES_H
>> +
>> +#include "object.h"
>> +
>> +int merge_strategies_one_file(struct repository *r,
>> +			      const struct object_id *orig_blob,
>> +			      const struct object_id *our_blob,
>> +			      const struct object_id *their_blob, const char *path,
>> +			      unsigned int orig_mode, unsigned int our_mode,
>> +			      unsigned int their_mode);
>> +
>> +#endif /* MERGE_STRATEGIES_H */
>> diff --git a/t/t6415-merge-dir-to-symlink.sh b/t/t6415-merge-dir-to-symlink.sh
>> index 2eddcc7664..5fb74e39a0 100755
>> --- a/t/t6415-merge-dir-to-symlink.sh
>> +++ b/t/t6415-merge-dir-to-symlink.sh
>> @@ -94,7 +94,7 @@ test_expect_success SYMLINKS 'a/b was resolved as symlink' '
>>  	test -h a/b
>>  '
>>  
>> -test_expect_failure 'do not lose untracked in merge (resolve)' '
>> +test_expect_success 'do not lose untracked in merge (resolve)' '
>>  	git reset --hard &&
>>  	git checkout baseline^0 &&
>>  	>a/b/c/e &&



* Re: [PATCH v2 03/11] merge-index: libify merge_one_path() and merge_all()
  2020-09-01 21:11     ` Junio C Hamano
@ 2020-09-02 15:37       ` Alban Gruin
  0 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2020-09-02 15:37 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, phillip.wood

On 01/09/2020 at 23:11, Junio C Hamano wrote:
> Alban Gruin <alban.gruin@gmail.com> writes:
> 
>> The "resolve" and "octopus" merge strategies do not call directly `git
>> merge-one-file', they delegate the work to another git command, `git
>> merge-index', that will loop over files in the index and call the
>> specified command.  Unfortunately, these functions are not part of
>> libgit.a, which means that once rewritten, the strategies would still
>> have to invoke `merge-one-file' by spawning a new process first.
>>
>> To avoid this, this moves merge_one_path(), merge_all(), and their
>> helpers to merge-strategies.c.  They also take a callback to dictate
>> what they should do for each file.  For now, only one launching a new
>> process is defined to preserve the behaviour of the builtin version.
> 
> ... of the "builtin" version?  I thought this series is introducing
> a new builtin version?  Puzzled...
> 

`merge-index' is already a builtin; this step libifies it.  Its core
feature is to repeatedly call a command (usually
`git-merge-one-file'), but the new version will call a callback instead,
so its behaviour is not hardcoded.  This patch only provides a callback
that starts a new command, to preserve the existing behaviour.

Perhaps rewording the last sentence like this would be better?

  For now, only one launching a new process is defined, to preserve the
behaviour of `merge-index'.
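
The callback idea described above can be sketched roughly as follows.
This is not the code from the patch: merge_path_fn, for_each_path() and
count_calls() are hypothetical names, and counting failures rather than
stopping at the first one is an assumption made for the example.

```c
#include <stddef.h>

/* Hypothetical callback type: invoked once for each path to merge. */
typedef int (*merge_path_fn)(const char *path, void *data);

/*
 * Call fn for every path.  Returns the number of paths for which
 * fn reported a failure (a non-zero return value).
 */
static int for_each_path(const char **paths, size_t nr,
			 merge_path_fn fn, void *data)
{
	size_t i;
	int failures = 0;

	for (i = 0; i < nr; i++)
		if (fn(paths[i], data))
			failures++;
	return failures;
}

/*
 * Example callback.  It only counts its invocations; a real one
 * could spawn a program or perform the merge in-process.
 */
static int count_calls(const char *path, void *data)
{
	(void)path;
	(*(int *)data)++;
	return 0;
}
```

The loop does not need to know whether the callback forks or not,
which is what lets a later step swap in an in-process implementation.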



* [PATCH v3 00/11] Rewrite the remaining merge strategies from shell to C
  2020-09-01 10:56 ` [PATCH v2 00/11] Rewrite the remaining merge strategies from shell to C Alban Gruin
                     ` (10 preceding siblings ...)
  2020-09-01 10:57   ` [PATCH v2 11/11] sequencer: use the "octopus" merge " Alban Gruin
@ 2020-10-05 12:26   ` Alban Gruin
  2020-10-05 12:26     ` [PATCH v3 01/11] t6027: modernise tests Alban Gruin
                       ` (12 more replies)
  11 siblings, 13 replies; 221+ messages in thread
From: Alban Gruin @ 2020-10-05 12:26 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, phillip.wood, Alban Gruin

In an effort to reduce the number of shell scripts in git's codebase, I
propose this patch series converting the two remaining merge strategies,
resolve and octopus, from shell to C.  This will enable slightly better
performance, better integration with git itself (no more forking to
perform these operations), and better portability (Windows and shell
scripts don't mix well).

Three scripts are actually converted: first git-merge-one-file.sh, then
git-merge-resolve.sh, and finally git-merge-octopus.sh.  Not only are
they converted, but they are also modified to operate without forking,
and then libified so they can be used by git without spawning another
process.

The first patch is not important to make the whole series work, but I
made this patch while working on it.

This series keeps the commands `git merge-one-file', `git
merge-resolve', and `git merge-octopus', so any script depending on them
should keep working without any changes.

This series is based on 306ee63a70 (Eighteenth batch, 2020-09-29).  The
tip is tagged as "rewrite-merge-strategies-v3" at
https://github.com/agrn/git.

Changes since v2:

 - Enable `USE_THE_INDEX_COMPATIBILITY_MACROS' in merge-one-file.c and
   use read_cache() and hold_locked_index() instead of repo_read_index()
   and repo_hold_locked_index() to improve readability.

 - Move file mode parsing to its own function in merge-one-file.c.

 - Improve I/O error handling in do_merge_one_file().

 - Return -1 instead of 1 when erroring out in do_merge_one_file() and
   merge_strategies_one_file().

 - Use oid_to_hex_r() instead of oid_to_hex() in do_merge_one_file().

 - Reformat multi-line comments.

 - Reword a sentence in commit 3/11.
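
The mode-parsing change listed above follows the review advice to
check that the whole string was consumed.  A minimal standalone sketch
of that idea is shown here; it is a variant, not the read_mode()
function from the patch: parse_octal_mode() is a hypothetical name, it
uses strtoul() rather than strtol(), and the 0177777 upper bound is an
assumption made for the example.

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Parse an octal file mode from arg into *mode.
 * Returns 0 on success, or -1 if arg contains no octal digits, has
 * trailing characters, or is out of range.
 */
static int parse_octal_mode(const char *arg, unsigned int *mode)
{
	char *end;
	unsigned long value;

	errno = 0;
	value = strtoul(arg, &end, 8);
	if (end == arg || *end != '\0' || errno == ERANGE ||
	    value > 0177777)
		return -1;

	*mode = (unsigned int)value;
	return 0;
}
```

Passing a non-NULL end pointer is what makes it possible to reject
input such as "100644x", which strtol(arg, NULL, 8) would silently
accept.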

Alban Gruin (11):
  t6027: modernise tests
  merge-one-file: rewrite in C
  merge-index: libify merge_one_path() and merge_all()
  merge-index: don't fork if the requested program is
    `git-merge-one-file'
  merge-resolve: rewrite in C
  merge-recursive: move better_branch_name() to merge.c
  merge-octopus: rewrite in C
  merge: use the "resolve" strategy without forking
  merge: use the "octopus" strategy without forking
  sequencer: use the "resolve" strategy without forking
  sequencer: use the "octopus" merge strategy without forking

 Makefile                        |   7 +-
 builtin.h                       |   3 +
 builtin/merge-index.c           | 102 ++----
 builtin/merge-octopus.c         |  69 ++++
 builtin/merge-one-file.c        |  92 +++++
 builtin/merge-recursive.c       |  16 +-
 builtin/merge-resolve.c         |  69 ++++
 builtin/merge.c                 |   9 +-
 cache.h                         |   2 +-
 git-merge-octopus.sh            | 112 ------
 git-merge-one-file.sh           | 167 ---------
 git-merge-resolve.sh            |  54 ---
 git.c                           |   3 +
 merge-strategies.c              | 613 ++++++++++++++++++++++++++++++++
 merge-strategies.h              |  44 +++
 merge.c                         |  12 +
 sequencer.c                     |  16 +-
 t/t6407-merge-binary.sh         |  27 +-
 t/t6415-merge-dir-to-symlink.sh |   2 +-
 19 files changed, 972 insertions(+), 447 deletions(-)
 create mode 100644 builtin/merge-octopus.c
 create mode 100644 builtin/merge-one-file.c
 create mode 100644 builtin/merge-resolve.c
 delete mode 100755 git-merge-octopus.sh
 delete mode 100755 git-merge-one-file.sh
 delete mode 100755 git-merge-resolve.sh
 create mode 100644 merge-strategies.c
 create mode 100644 merge-strategies.h

Range-diff against v2:
 1:  28c8fd11b6 =  1:  08c7df596a t6027: modernise tests
 2:  f5ab0fdf0a !  2:  ce911c99c0 merge-one-file: rewrite in C
    @@ builtin/merge-one-file.c (new)
     + * that might change the tree layout.
     + */
     +
    ++#define USE_THE_INDEX_COMPATIBILITY_MACROS
     +#include "cache.h"
     +#include "builtin.h"
     +#include "lockfile.h"
    @@ builtin/merge-one-file.c (new)
     +	"<orig mode> <our mode> <their mode>\n\n"
     +	"Blob ids and modes should be empty for missing files.";
     +
    ++static int read_mode(const char *name, const char *arg, unsigned int *mode)
    ++{
    ++	char *last;
    ++	int ret = 0;
    ++
    ++	*mode = strtol(arg, &last, 8);
    ++
    ++	if (*last)
    ++		ret = error(_("invalid '%s' mode: expected nothing, got '%c'"), name, *last);
    ++	else if (!(S_ISREG(*mode) || S_ISDIR(*mode) || S_ISLNK(*mode)))
    ++		ret = error(_("invalid '%s' mode: %o"), name, *mode);
    ++
    ++	return ret;
    ++}
    ++
     +int cmd_merge_one_file(int argc, const char **argv, const char *prefix)
     +{
     +	struct object_id orig_blob, our_blob, their_blob,
    @@ builtin/merge-one-file.c (new)
     +	if (argc != 8)
     +		usage(builtin_merge_one_file_usage);
     +
    -+	if (repo_read_index(the_repository) < 0)
    ++	if (read_cache() < 0)
     +		die("invalid index");
     +
    -+	repo_hold_locked_index(the_repository, &lock, LOCK_DIE_ON_ERROR);
    ++	hold_locked_index(&lock, LOCK_DIE_ON_ERROR);
     +
     +	if (!get_oid(argv[1], &orig_blob)) {
     +		p_orig_blob = &orig_blob;
    -+		orig_mode = strtol(argv[5], NULL, 8);
    -+
    -+		if (!(S_ISREG(orig_mode) || S_ISDIR(orig_mode) || S_ISLNK(orig_mode)))
    -+			ret |= error(_("invalid 'orig' mode: %o"), orig_mode);
    ++		ret = read_mode("orig", argv[5], &orig_mode);
     +	}
     +
     +	if (!get_oid(argv[2], &our_blob)) {
     +		p_our_blob = &our_blob;
    -+		our_mode = strtol(argv[6], NULL, 8);
    -+
    -+		if (!(S_ISREG(our_mode) || S_ISDIR(our_mode) || S_ISLNK(our_mode)))
    -+			ret |= error(_("invalid 'our' mode: %o"), our_mode);
    ++		ret = read_mode("our", argv[6], &our_mode);
     +	}
     +
     +	if (!get_oid(argv[3], &their_blob)) {
     +		p_their_blob = &their_blob;
    -+		their_mode = strtol(argv[7], NULL, 8);
    -+
    -+		if (!(S_ISREG(their_mode) || S_ISDIR(their_mode) || S_ISLNK(their_mode)))
    -+			ret = error(_("invalid 'their' mode: %o"), their_mode);
    ++		ret = read_mode("their", argv[7], &their_mode);
     +	}
     +
     +	if (ret)
    @@ builtin/merge-one-file.c (new)
     +
     +	if (ret) {
     +		rollback_lock_file(&lock);
    -+		return ret;
    ++		return !!ret;
     +	}
     +
    -+	return write_locked_index(the_repository->index, &lock, COMMIT_LOCK);
    ++	return write_locked_index(&the_index, &lock, COMMIT_LOCK);
     +}
     
      ## git-merge-one-file.sh (deleted) ##
    @@ merge-strategies.c (new)
     +			     unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode)
     +{
     +	int ret, i, dest;
    ++	ssize_t written;
     +	mmbuffer_t result = {NULL, 0};
     +	mmfile_t mmfs[3];
     +	struct ll_merge_options merge_opts = {0};
    @@ merge-strategies.c (new)
     +	for (i = 0; i < 3; i++)
     +		free(mmfs[i].ptr);
     +
    -+	if (ret > 127 || !orig_blob)
    -+		ret = error(_("content conflict in %s"), path);
    ++	if (ret < 0) {
    ++		free(result.ptr);
    ++		return error(_("Failed to execute internal merge"));
    ++	}
     +
    -+	/* Create the working tree file, using "our tree" version from
    -+	   the index, and then store the result of the merge. */
    ++	/*
    ++	 * Create the working tree file, using "our tree" version from
    ++	 * the index, and then store the result of the merge.
    ++	 */
     +	ce = index_file_exists(istate, path, strlen(path), 0);
     +	if (!ce)
     +		BUG("file is not present in the cache?");
     +
     +	unlink(path);
    -+	dest = open(path, O_WRONLY | O_CREAT, ce->ce_mode);
    -+	write_in_full(dest, result.ptr, result.size);
    ++	if ((dest = open(path, O_WRONLY | O_CREAT, ce->ce_mode)) < 0) {
    ++		free(result.ptr);
    ++		return error_errno(_("failed to open file '%s'"), path);
    ++	}
    ++
    ++	written = write_in_full(dest, result.ptr, result.size);
     +	close(dest);
     +
     +	free(result.ptr);
     +
    -+	if (ret && our_mode != their_mode)
    ++	if (written < 0)
    ++		return error_errno(_("failed to write to '%s'"), path);
    ++
    ++	if (ret != 0 || !orig_blob)
    ++		ret = error(_("content conflict in %s"), path);
    ++	if (our_mode != their_mode)
     +		return error(_("permission conflict: %o->%o,%o in %s"),
     +			     orig_mode, our_mode, their_mode, path);
     +	if (ret)
    -+		return 1;
    ++		return -1;
     +
     +	return add_file_to_index(istate, path, 0);
     +}
    @@ merge-strategies.c (new)
     +	if (orig_blob &&
     +	    ((!their_blob && our_blob && oideq(orig_blob, our_blob)) ||
     +	     (!our_blob && their_blob && oideq(orig_blob, their_blob))))
    -+		/* Deleted in both or deleted in one and unchanged in
    -+		   the other */
    ++		/* Deleted in both or deleted in one and unchanged in the other. */
     +		return merge_one_file_deleted(r->index,
     +					      orig_blob, our_blob, their_blob, path,
     +					      orig_mode, our_mode, their_mode);
     +	else if (!orig_blob && our_blob && !their_blob) {
    -+		/* Added in one.  The other side did not add and we
    -+		   added so there is nothing to be done, except making
    -+		   the path merged. */
    ++		/*
    ++		 * Added in one.  The other side did not add and we
    ++		 * added so there is nothing to be done, except making
    ++		 * the path merged.
    ++		 */
     +		return add_to_index_cacheinfo(r->index, our_mode, our_blob, path);
     +	} else if (!orig_blob && !our_blob && their_blob) {
     +		printf(_("Adding %s\n"), path);
    @@ merge-strategies.c (new)
     +			return error(_("untracked %s is overwritten by the merge."), path);
     +
     +		if (add_to_index_cacheinfo(r->index, their_mode, their_blob, path))
    -+			return 1;
    ++			return -1;
     +		return checkout_from_index(r->index, path);
     +	} else if (!orig_blob && our_blob && their_blob &&
     +		   oideq(our_blob, their_blob)) {
    -+		/* Added in both, identically (check for same
    -+		   permissions). */
    ++		/* Added in both, identically (check for same permissions). */
     +		if (our_mode != their_mode)
     +			return error(_("File %s added identically in both branches, "
     +				       "but permissions conflict %o->%o."),
    @@ merge-strategies.c (new)
     +		printf(_("Adding %s\n"), path);
     +
     +		if (add_to_index_cacheinfo(r->index, our_mode, our_blob, path))
    -+			return 1;
    ++			return -1;
     +		return checkout_from_index(r->index, path);
     +	} else if (our_blob && their_blob)
     +		/* Modified in both, but differently. */
    @@ merge-strategies.c (new)
     +					 orig_blob, our_blob, their_blob, path,
     +					 orig_mode, our_mode, their_mode);
     +	else {
    -+		char *orig_hex = "", *our_hex = "", *their_hex = "";
    ++		char orig_hex[GIT_MAX_HEXSZ] = {0}, our_hex[GIT_MAX_HEXSZ] = {0},
    ++			their_hex[GIT_MAX_HEXSZ] = {0};
     +
     +		if (orig_blob)
    -+			orig_hex = oid_to_hex(orig_blob);
    ++			oid_to_hex_r(orig_hex, orig_blob);
     +		if (our_blob)
    -+			our_hex = oid_to_hex(our_blob);
    ++			oid_to_hex_r(our_hex, our_blob);
     +		if (their_blob)
    -+			their_hex = oid_to_hex(their_blob);
    ++			oid_to_hex_r(their_hex, their_blob);
     +
     +		return error(_("%s: Not handling case %s -> %s -> %s"),
     +			     path, orig_hex, our_hex, their_hex);
 3:  7f3ce7da17 !  3:  7f0999f5a3 merge-index: libify merge_one_path() and merge_all()
    @@ Commit message
     
         To avoid this, this moves merge_one_path(), merge_all(), and their
         helpers to merge-strategies.c.  They also take a callback to dictate
    -    what they should do for each file.  For now, only one launching a new
    -    process is defined to preserve the behaviour of the builtin version.
    +    what they should do for each file.  For now, to preserve the behaviour
    +    of `merge-index', only one callback, launching a new process, is
    +    defined.
     
         Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
     
 4:  07e6a6aaef =  4:  c0bc05406d merge-index: don't fork if the requested program is `git-merge-one-file'
 5:  117d4fc840 =  5:  cbfe192982 merge-resolve: rewrite in C
 6:  4fc955962b =  6:  35e386f626 merge-recursive: move better_branch_name() to merge.c
 7:  e7b9e15b34 !  7:  41eb0f7199 merge-octopus: rewrite in C
    @@ Makefile: BUILTIN_OBJS += builtin/mailsplit.o
      BUILTIN_OBJS += builtin/merge-recursive.o
     
      ## builtin.h ##
    -@@ builtin.h: int cmd_mailsplit(int argc, const char **argv, const char *prefix);
    +@@ builtin.h: int cmd_maintenance(int argc, const char **argv, const char *prefix);
      int cmd_merge(int argc, const char **argv, const char *prefix);
      int cmd_merge_base(int argc, const char **argv, const char *prefix);
      int cmd_merge_index(int argc, const char **argv, const char *prefix);
    @@ builtin/merge-octopus.c (new)
     +	if (repo_read_index(the_repository) < 0)
     +		die("corrupted cache");
     +
    -+	/* The first parameters up to -- are merge bases; the rest are
    -+	 * heads. */
    ++	/*
    ++	 * The first parameters up to -- are merge bases; the rest are
    ++	 * heads.
    ++	 */
     +	for (i = 1; i < argc; i++) {
     +		if (strcmp(argv[i], "--") == 0)
     +			sep_seen = 1;
    @@ builtin/merge-octopus.c (new)
     +		}
     +	}
     +
    -+	/* Reject if this is not an octopus -- resolve should be used
    -+	 * instead. */
    ++	/*
    ++	 * Reject if this is not an octopus -- resolve should be used
    ++	 * instead.
    ++	 */
     +	if (commit_list_count(remotes) < 2)
     +		return 2;
     +
    @@ merge-strategies.c: int merge_strategies_resolve(struct repository *r,
     +		int can_ff = 1;
     +
     +		if (ret) {
    -+			/* We allow only last one to have a
    -+			   hand-resolvable conflicts.  Last round failed
    -+			   and we still had a head to merge. */
    ++			/*
    ++			 * We allow only last one to have a
    ++			 * hand-resolvable conflicts.  Last round failed
    ++			 * and we still had a head to merge.
    ++			 */
     +			puts(_("Automated merge did not work."));
     +			puts(_("Should not be doing an octopus."));
     +
    @@ merge-strategies.c: int merge_strategies_resolve(struct repository *r,
     +		}
     +
     +		if (!non_ff_merge && can_ff) {
    -+			/* The first head being merged was a
    -+			   fast-forward.  Advance the reference commit
    -+			   to the head being merged, and use that tree
    -+			   as the intermediate result of the merge.  We
    -+			   still need to count this as part of the
    -+			   parent set. */
    ++			/*
    ++			 * The first head being merged was a
    ++			 * fast-forward.  Advance the reference commit
    ++			 * to the head being merged, and use that tree
    ++			 * as the intermediate result of the merge.  We
    ++			 * still need to count this as part of the
    ++			 * parent set.
    ++			 */
     +			struct object_id oids[2];
     +			printf(_("Fast-forwarding to: %s\n"), branch_name);
     +
 8:  cd0662201d =  8:  8f6c1ac057 merge: use the "resolve" strategy without forking
 9:  0525ff0183 =  9:  b1125261d1 merge: use the "octopus" strategy without forking
10:  6fbf599ba4 = 10:  8d0932fd02 sequencer: use the "resolve" strategy without forking
11:  2c2dc3cc62 = 11:  e304723957 sequencer: use the "octopus" merge strategy without forking
-- 
2.28.0.662.ge304723957



* [PATCH v3 01/11] t6027: modernise tests
  2020-10-05 12:26   ` [PATCH v3 00/11] Rewrite the remaining merge strategies from shell to C Alban Gruin
@ 2020-10-05 12:26     ` Alban Gruin
  2020-10-06 20:50       ` Junio C Hamano
  2020-10-05 12:26     ` [PATCH v3 02/11] merge-one-file: rewrite in C Alban Gruin
                       ` (11 subsequent siblings)
  12 siblings, 1 reply; 221+ messages in thread
From: Alban Gruin @ 2020-10-05 12:26 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, phillip.wood, Alban Gruin

Some tests in t6027 use an if/then/else to check whether a command
failed, but nowadays we have the `test_must_fail' function to do it
correctly for us.

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 t/t6407-merge-binary.sh | 27 ++++++---------------------
 1 file changed, 6 insertions(+), 21 deletions(-)

diff --git a/t/t6407-merge-binary.sh b/t/t6407-merge-binary.sh
index 4e6c7cb77e..071d3f7343 100755
--- a/t/t6407-merge-binary.sh
+++ b/t/t6407-merge-binary.sh
@@ -5,7 +5,6 @@ test_description='ask merge-recursive to merge binary files'
 . ./test-lib.sh
 
 test_expect_success setup '
-
 	cat "$TEST_DIRECTORY"/test-binary-1.png >m &&
 	git add m &&
 	git ls-files -s | sed -e "s/ 0	/ 1	/" >E1 &&
@@ -35,33 +34,19 @@ test_expect_success setup '
 '
 
 test_expect_success resolve '
-
 	rm -f a* m* &&
 	git reset --hard anchor &&
-
-	if git merge -s resolve master
-	then
-		echo Oops, should not have succeeded
-		false
-	else
-		git ls-files -s >current
-		test_cmp expect current
-	fi
+	test_must_fail git merge -s resolve master &&
+	git ls-files -s >current &&
+	test_cmp expect current
 '
 
 test_expect_success recursive '
-
 	rm -f a* m* &&
 	git reset --hard anchor &&
-
-	if git merge -s recursive master
-	then
-		echo Oops, should not have succeeded
-		false
-	else
-		git ls-files -s >current
-		test_cmp expect current
-	fi
+	test_must_fail git merge -s recursive master &&
+	git ls-files -s >current &&
+	test_cmp expect current
 '
 
 test_done
-- 
2.28.0.662.ge304723957



* [PATCH v3 02/11] merge-one-file: rewrite in C
  2020-10-05 12:26   ` [PATCH v3 00/11] Rewrite the remaining merge strategies from shell to C Alban Gruin
  2020-10-05 12:26     ` [PATCH v3 01/11] t6027: modernise tests Alban Gruin
@ 2020-10-05 12:26     ` Alban Gruin
  2020-10-06 22:01       ` Junio C Hamano
  2020-10-05 12:26     ` [PATCH v3 03/11] merge-index: libify merge_one_path() and merge_all() Alban Gruin
                       ` (10 subsequent siblings)
  12 siblings, 1 reply; 221+ messages in thread
From: Alban Gruin @ 2020-10-05 12:26 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, phillip.wood, Alban Gruin

This rewrites `git merge-one-file' from shell to C.  This port is not
completely straightforward: to save precious cycles by avoiding reading
and flushing the index repeatedly and writing temporary files when an
operation can be performed in-memory, and to allow other functions to
use the rewrite without forking or worrying about the index, the calls
to external processes are replaced by calls to functions in libgit.a:

 - calls to `update-index --add --cacheinfo' are replaced by calls to
   add_cache_entry();

 - calls to `update-index --remove' are replaced by calls to
   remove_file_from_cache();

 - calls to `checkout-index -u -f' are replaced by calls to
   checkout_entry();

 - calls to `unpack-file' and `merge-files' are replaced by calls to
   read_mmblob() and ll_merge(), respectively, to merge files
   in-memory;

 - calls to `checkout-index -f --stage=2' are replaced by calls to
   cache_file_exists();

 - calls to `update-index' are replaced by calls to add_file_to_cache().

The bulk of the rewrite is done in a new file in libgit.a,
merge-strategies.c.  This will enable the resolve and octopus strategies
to directly call it instead of forking.

This also fixes a bug present in the original script: instead of
checking if a _regular_ file exists when a file exists in the branch to
merge, but not in our branch, the rewritten version checks if a file of
any kind (i.e. a directory, ...) exists.  This fixes the test t6035.14,
where the branch to merge had a new file, `a/b', but our branch had a
directory there; it should have failed because a directory exists, but
it did not because there was no regular file called `a/b'.  This test is
now marked as successful.
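
The difference between the two checks can be shown with plain POSIX
calls.  This is a minimal sketch, not code from the patch:
`is_regular_file()' mimics what `test -f' does in the script, and
`path_exists()' mimics an "anything at this path" check, which is how
I understand file_exists() to behave (an assumption on my side).  Both
function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/stat.h>

/* Like `test -f path`: true only for a regular file (symlinks followed). */
int is_regular_file(const char *path)
{
	struct stat st;

	assert(path != NULL);
	return !stat(path, &st) && S_ISREG(st.st_mode);
}

/* True for anything at this path: file, directory, symlink, ... */
int path_exists(const char *path)
{
	struct stat st;

	assert(path != NULL);
	return !lstat(path, &st);
}
```

With a directory at `a/b', is_regular_file("a/b") is 0 while
path_exists("a/b") is 1, which is the situation the test sets up: the
old check let the merge proceed, the new one refuses.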

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 Makefile                        |   3 +-
 builtin.h                       |   1 +
 builtin/merge-one-file.c        |  92 ++++++++++++++
 git-merge-one-file.sh           | 167 -------------------------
 git.c                           |   1 +
 merge-strategies.c              | 214 ++++++++++++++++++++++++++++++++
 merge-strategies.h              |  13 ++
 t/t6415-merge-dir-to-symlink.sh |   2 +-
 8 files changed, 324 insertions(+), 169 deletions(-)
 create mode 100644 builtin/merge-one-file.c
 delete mode 100755 git-merge-one-file.sh
 create mode 100644 merge-strategies.c
 create mode 100644 merge-strategies.h

diff --git a/Makefile b/Makefile
index de53954590..6dfdb33cb2 100644
--- a/Makefile
+++ b/Makefile
@@ -601,7 +601,6 @@ SCRIPT_SH += git-bisect.sh
 SCRIPT_SH += git-difftool--helper.sh
 SCRIPT_SH += git-filter-branch.sh
 SCRIPT_SH += git-merge-octopus.sh
-SCRIPT_SH += git-merge-one-file.sh
 SCRIPT_SH += git-merge-resolve.sh
 SCRIPT_SH += git-mergetool.sh
 SCRIPT_SH += git-quiltimport.sh
@@ -909,6 +908,7 @@ LIB_OBJS += match-trees.o
 LIB_OBJS += mem-pool.o
 LIB_OBJS += merge-blobs.o
 LIB_OBJS += merge-recursive.o
+LIB_OBJS += merge-strategies.o
 LIB_OBJS += merge.o
 LIB_OBJS += mergesort.o
 LIB_OBJS += midx.o
@@ -1094,6 +1094,7 @@ BUILTIN_OBJS += builtin/mailsplit.o
 BUILTIN_OBJS += builtin/merge-base.o
 BUILTIN_OBJS += builtin/merge-file.o
 BUILTIN_OBJS += builtin/merge-index.o
+BUILTIN_OBJS += builtin/merge-one-file.o
 BUILTIN_OBJS += builtin/merge-ours.o
 BUILTIN_OBJS += builtin/merge-recursive.o
 BUILTIN_OBJS += builtin/merge-tree.o
diff --git a/builtin.h b/builtin.h
index 53fb290963..4d2cd78856 100644
--- a/builtin.h
+++ b/builtin.h
@@ -178,6 +178,7 @@ int cmd_merge_base(int argc, const char **argv, const char *prefix);
 int cmd_merge_index(int argc, const char **argv, const char *prefix);
 int cmd_merge_ours(int argc, const char **argv, const char *prefix);
 int cmd_merge_file(int argc, const char **argv, const char *prefix);
+int cmd_merge_one_file(int argc, const char **argv, const char *prefix);
 int cmd_merge_recursive(int argc, const char **argv, const char *prefix);
 int cmd_merge_tree(int argc, const char **argv, const char *prefix);
 int cmd_mktag(int argc, const char **argv, const char *prefix);
diff --git a/builtin/merge-one-file.c b/builtin/merge-one-file.c
new file mode 100644
index 0000000000..598338ba16
--- /dev/null
+++ b/builtin/merge-one-file.c
@@ -0,0 +1,92 @@
+/*
+ * Builtin "git merge-one-file"
+ *
+ * Copyright (c) 2020 Alban Gruin
+ *
+ * Based on git-merge-one-file.sh, written by Linus Torvalds.
+ *
+ * This is the git per-file merge utility, called with
+ *
+ *   argv[1] - original file SHA1 (or empty)
+ *   argv[2] - file in branch1 SHA1 (or empty)
+ *   argv[3] - file in branch2 SHA1 (or empty)
+ *   argv[4] - pathname in repository
+ *   argv[5] - original file mode (or empty)
+ *   argv[6] - file in branch1 mode (or empty)
+ *   argv[7] - file in branch2 mode (or empty)
+ *
+ * Handle some trivial cases. The _really_ trivial cases have been
+ * handled already by git read-tree, but that one doesn't do any merges
+ * that might change the tree layout.
+ */
+
+#define USE_THE_INDEX_COMPATIBILITY_MACROS
+#include "cache.h"
+#include "builtin.h"
+#include "lockfile.h"
+#include "merge-strategies.h"
+
+static const char builtin_merge_one_file_usage[] =
+	"git merge-one-file <orig blob> <our blob> <their blob> <path> "
+	"<orig mode> <our mode> <their mode>\n\n"
+	"Blob ids and modes should be empty for missing files.";
+
+static int read_mode(const char *name, const char *arg, unsigned int *mode)
+{
+	char *last;
+	int ret = 0;
+
+	*mode = strtol(arg, &last, 8);
+
+	if (*last)
+		ret = error(_("invalid '%s' mode: expected nothing, got '%c'"), name, *last);
+	else if (!(S_ISREG(*mode) || S_ISDIR(*mode) || S_ISLNK(*mode)))
+		ret = error(_("invalid '%s' mode: %o"), name, *mode);
+
+	return ret;
+}
+
+int cmd_merge_one_file(int argc, const char **argv, const char *prefix)
+{
+	struct object_id orig_blob, our_blob, their_blob,
+		*p_orig_blob = NULL, *p_our_blob = NULL, *p_their_blob = NULL;
+	unsigned int orig_mode = 0, our_mode = 0, their_mode = 0, ret = 0;
+	struct lock_file lock = LOCK_INIT;
+
+	if (argc != 8)
+		usage(builtin_merge_one_file_usage);
+
+	if (read_cache() < 0)
+		die("invalid index");
+
+	hold_locked_index(&lock, LOCK_DIE_ON_ERROR);
+
+	if (!get_oid(argv[1], &orig_blob)) {
+		p_orig_blob = &orig_blob;
+		ret |= read_mode("orig", argv[5], &orig_mode);
+	}
+
+	if (!get_oid(argv[2], &our_blob)) {
+		p_our_blob = &our_blob;
+		ret |= read_mode("our", argv[6], &our_mode);
+	}
+
+	if (!get_oid(argv[3], &their_blob)) {
+		p_their_blob = &their_blob;
+		ret |= read_mode("their", argv[7], &their_mode);
+	}
+
+	if (ret)
+		return ret;
+
+	ret = merge_strategies_one_file(the_repository,
+					p_orig_blob, p_our_blob, p_their_blob, argv[4],
+					orig_mode, our_mode, their_mode);
+
+	if (ret) {
+		rollback_lock_file(&lock);
+		return !!ret;
+	}
+
+	return write_locked_index(&the_index, &lock, COMMIT_LOCK);
+}
diff --git a/git-merge-one-file.sh b/git-merge-one-file.sh
deleted file mode 100755
index f6d9852d2f..0000000000
--- a/git-merge-one-file.sh
+++ /dev/null
@@ -1,167 +0,0 @@
-#!/bin/sh
-#
-# Copyright (c) Linus Torvalds, 2005
-#
-# This is the git per-file merge script, called with
-#
-#   $1 - original file SHA1 (or empty)
-#   $2 - file in branch1 SHA1 (or empty)
-#   $3 - file in branch2 SHA1 (or empty)
-#   $4 - pathname in repository
-#   $5 - original file mode (or empty)
-#   $6 - file in branch1 mode (or empty)
-#   $7 - file in branch2 mode (or empty)
-#
-# Handle some trivial cases.. The _really_ trivial cases have
-# been handled already by git read-tree, but that one doesn't
-# do any merges that might change the tree layout.
-
-USAGE='<orig blob> <our blob> <their blob> <path>'
-USAGE="$USAGE <orig mode> <our mode> <their mode>"
-LONG_USAGE="usage: git merge-one-file $USAGE
-
-Blob ids and modes should be empty for missing files."
-
-SUBDIRECTORY_OK=Yes
-. git-sh-setup
-cd_to_toplevel
-require_work_tree
-
-if test $# != 7
-then
-	echo "$LONG_USAGE"
-	exit 1
-fi
-
-case "${1:-.}${2:-.}${3:-.}" in
-#
-# Deleted in both or deleted in one and unchanged in the other
-#
-"$1.." | "$1.$1" | "$1$1.")
-	if { test -z "$6" && test "$5" != "$7"; } ||
-	   { test -z "$7" && test "$5" != "$6"; }
-	then
-		echo "ERROR: File $4 deleted on one branch but had its" >&2
-		echo "ERROR: permissions changed on the other." >&2
-		exit 1
-	fi
-
-	if test -n "$2"
-	then
-		echo "Removing $4"
-	else
-		# read-tree checked that index matches HEAD already,
-		# so we know we do not have this path tracked.
-		# there may be an unrelated working tree file here,
-		# which we should just leave unmolested.  Make sure
-		# we do not have it in the index, though.
-		exec git update-index --remove -- "$4"
-	fi
-	if test -f "$4"
-	then
-		rm -f -- "$4" &&
-		rmdir -p "$(expr "z$4" : 'z\(.*\)/')" 2>/dev/null || :
-	fi &&
-		exec git update-index --remove -- "$4"
-	;;
-
-#
-# Added in one.
-#
-".$2.")
-	# the other side did not add and we added so there is nothing
-	# to be done, except making the path merged.
-	exec git update-index --add --cacheinfo "$6" "$2" "$4"
-	;;
-"..$3")
-	echo "Adding $4"
-	if test -f "$4"
-	then
-		echo "ERROR: untracked $4 is overwritten by the merge." >&2
-		exit 1
-	fi
-	git update-index --add --cacheinfo "$7" "$3" "$4" &&
-		exec git checkout-index -u -f -- "$4"
-	;;
-
-#
-# Added in both, identically (check for same permissions).
-#
-".$3$2")
-	if test "$6" != "$7"
-	then
-		echo "ERROR: File $4 added identically in both branches," >&2
-		echo "ERROR: but permissions conflict $6->$7." >&2
-		exit 1
-	fi
-	echo "Adding $4"
-	git update-index --add --cacheinfo "$6" "$2" "$4" &&
-		exec git checkout-index -u -f -- "$4"
-	;;
-
-#
-# Modified in both, but differently.
-#
-"$1$2$3" | ".$2$3")
-
-	case ",$6,$7," in
-	*,120000,*)
-		echo "ERROR: $4: Not merging symbolic link changes." >&2
-		exit 1
-		;;
-	*,160000,*)
-		echo "ERROR: $4: Not merging conflicting submodule changes." >&2
-		exit 1
-		;;
-	esac
-
-	src1=$(git unpack-file $2)
-	src2=$(git unpack-file $3)
-	case "$1" in
-	'')
-		echo "Added $4 in both, but differently."
-		orig=$(git unpack-file $(git hash-object /dev/null))
-		;;
-	*)
-		echo "Auto-merging $4"
-		orig=$(git unpack-file $1)
-		;;
-	esac
-
-	git merge-file "$src1" "$orig" "$src2"
-	ret=$?
-	msg=
-	if test $ret != 0 || test -z "$1"
-	then
-		msg='content conflict'
-		ret=1
-	fi
-
-	# Create the working tree file, using "our tree" version from the
-	# index, and then store the result of the merge.
-	git checkout-index -f --stage=2 -- "$4" && cat "$src1" >"$4" || exit 1
-	rm -f -- "$orig" "$src1" "$src2"
-
-	if test "$6" != "$7"
-	then
-		if test -n "$msg"
-		then
-			msg="$msg, "
-		fi
-		msg="${msg}permissions conflict: $5->$6,$7"
-		ret=1
-	fi
-
-	if test $ret != 0
-	then
-		echo "ERROR: $msg in $4" >&2
-		exit 1
-	fi
-	exec git update-index -- "$4"
-	;;
-
-*)
-	echo "ERROR: $4: Not handling case $1 -> $2 -> $3" >&2
-	;;
-esac
-exit 1
diff --git a/git.c b/git.c
index f1e8b56d99..a4d3f98094 100644
--- a/git.c
+++ b/git.c
@@ -540,6 +540,7 @@ static struct cmd_struct commands[] = {
 	{ "merge-file", cmd_merge_file, RUN_SETUP_GENTLY },
 	{ "merge-index", cmd_merge_index, RUN_SETUP | NO_PARSEOPT },
 	{ "merge-ours", cmd_merge_ours, RUN_SETUP | NO_PARSEOPT },
+	{ "merge-one-file", cmd_merge_one_file, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
 	{ "merge-recursive", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
 	{ "merge-recursive-ours", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
 	{ "merge-recursive-theirs", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
diff --git a/merge-strategies.c b/merge-strategies.c
new file mode 100644
index 0000000000..bbe6f48698
--- /dev/null
+++ b/merge-strategies.c
@@ -0,0 +1,214 @@
+#include "cache.h"
+#include "dir.h"
+#include "ll-merge.h"
+#include "merge-strategies.h"
+#include "xdiff-interface.h"
+
+static int add_to_index_cacheinfo(struct index_state *istate,
+				  unsigned int mode,
+				  const struct object_id *oid, const char *path)
+{
+	struct cache_entry *ce;
+	int len, option;
+
+	if (!verify_path(path, mode))
+		return error(_("Invalid path '%s'"), path);
+
+	len = strlen(path);
+	ce = make_empty_cache_entry(istate, len);
+
+	oidcpy(&ce->oid, oid);
+	memcpy(ce->name, path, len);
+	ce->ce_flags = create_ce_flags(0);
+	ce->ce_namelen = len;
+	ce->ce_mode = create_ce_mode(mode);
+	if (assume_unchanged)
+		ce->ce_flags |= CE_VALID;
+	option = ADD_CACHE_OK_TO_ADD | ADD_CACHE_OK_TO_REPLACE;
+	if (add_index_entry(istate, ce, option))
+		return error(_("%s: cannot add to the index"), path);
+
+	return 0;
+}
+
+static int checkout_from_index(struct index_state *istate, const char *path)
+{
+	struct checkout state = CHECKOUT_INIT;
+	struct cache_entry *ce;
+
+	state.istate = istate;
+	state.force = 1;
+	state.base_dir = "";
+	state.base_dir_len = 0;
+
+	ce = index_file_exists(istate, path, strlen(path), 0);
+	if (checkout_entry(ce, &state, NULL, NULL) < 0)
+		return error(_("%s: cannot checkout file"), path);
+	return 0;
+}
+
+static int merge_one_file_deleted(struct index_state *istate,
+				  const struct object_id *orig_blob,
+				  const struct object_id *our_blob,
+				  const struct object_id *their_blob, const char *path,
+				  unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode)
+{
+	if ((our_blob && orig_mode != our_mode) ||
+	    (their_blob && orig_mode != their_mode))
+		return error(_("File %s deleted on one branch but had its "
+			       "permissions changed on the other."), path);
+
+	if (our_blob) {
+		printf(_("Removing %s\n"), path);
+
+		if (file_exists(path))
+			remove_path(path);
+	}
+
+	if (remove_file_from_index(istate, path))
+		return error("%s: cannot remove from the index", path);
+	return 0;
+}
+
+static int do_merge_one_file(struct index_state *istate,
+			     const struct object_id *orig_blob,
+			     const struct object_id *our_blob,
+			     const struct object_id *their_blob, const char *path,
+			     unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode)
+{
+	int ret, i, dest;
+	ssize_t written;
+	mmbuffer_t result = {NULL, 0};
+	mmfile_t mmfs[3];
+	struct ll_merge_options merge_opts = {0};
+	struct cache_entry *ce;
+
+	if (our_mode == S_IFLNK || their_mode == S_IFLNK)
+		return error(_("%s: Not merging symbolic link changes."), path);
+	else if (our_mode == S_IFGITLINK || their_mode == S_IFGITLINK)
+		return error(_("%s: Not merging conflicting submodule changes."), path);
+
+	read_mmblob(mmfs + 1, our_blob);
+	read_mmblob(mmfs + 2, their_blob);
+
+	if (orig_blob) {
+		printf(_("Auto-merging %s\n"), path);
+		read_mmblob(mmfs + 0, orig_blob);
+	} else {
+		printf(_("Added %s in both, but differently.\n"), path);
+		read_mmblob(mmfs + 0, &null_oid);
+	}
+
+	merge_opts.xdl_opts = XDL_MERGE_ZEALOUS_ALNUM;
+	ret = ll_merge(&result, path,
+		       mmfs + 0, "orig",
+		       mmfs + 1, "our",
+		       mmfs + 2, "their",
+		       istate, &merge_opts);
+
+	for (i = 0; i < 3; i++)
+		free(mmfs[i].ptr);
+
+	if (ret < 0) {
+		free(result.ptr);
+		return error(_("Failed to execute internal merge"));
+	}
+
+	/*
+	 * Create the working tree file, using "our tree" version from
+	 * the index, and then store the result of the merge.
+	 */
+	ce = index_file_exists(istate, path, strlen(path), 0);
+	if (!ce)
+		BUG("file is not present in the cache?");
+
+	unlink(path);
+	if ((dest = open(path, O_WRONLY | O_CREAT, ce->ce_mode)) < 0) {
+		free(result.ptr);
+		return error_errno(_("failed to open file '%s'"), path);
+	}
+
+	written = write_in_full(dest, result.ptr, result.size);
+	close(dest);
+
+	free(result.ptr);
+
+	if (written < 0)
+		return error_errno(_("failed to write to '%s'"), path);
+
+	if (ret != 0 || !orig_blob)
+		ret = error(_("content conflict in %s"), path);
+	if (our_mode != their_mode)
+		return error(_("permission conflict: %o->%o,%o in %s"),
+			     orig_mode, our_mode, their_mode, path);
+	if (ret)
+		return -1;
+
+	return add_file_to_index(istate, path, 0);
+}
+
+int merge_strategies_one_file(struct repository *r,
+			      const struct object_id *orig_blob,
+			      const struct object_id *our_blob,
+			      const struct object_id *their_blob, const char *path,
+			      unsigned int orig_mode, unsigned int our_mode,
+			      unsigned int their_mode)
+{
+	if (orig_blob &&
+	    ((!their_blob && our_blob && oideq(orig_blob, our_blob)) ||
+	     (!our_blob && their_blob && oideq(orig_blob, their_blob))))
+		/* Deleted in both or deleted in one and unchanged in the other. */
+		return merge_one_file_deleted(r->index,
+					      orig_blob, our_blob, their_blob, path,
+					      orig_mode, our_mode, their_mode);
+	else if (!orig_blob && our_blob && !their_blob) {
+		/*
+		 * Added in one.  The other side did not add and we
+		 * added so there is nothing to be done, except making
+		 * the path merged.
+		 */
+		return add_to_index_cacheinfo(r->index, our_mode, our_blob, path);
+	} else if (!orig_blob && !our_blob && their_blob) {
+		printf(_("Adding %s\n"), path);
+
+		if (file_exists(path))
+			return error(_("untracked %s is overwritten by the merge."), path);
+
+		if (add_to_index_cacheinfo(r->index, their_mode, their_blob, path))
+			return -1;
+		return checkout_from_index(r->index, path);
+	} else if (!orig_blob && our_blob && their_blob &&
+		   oideq(our_blob, their_blob)) {
+		/* Added in both, identically (check for same permissions). */
+		if (our_mode != their_mode)
+			return error(_("File %s added identically in both branches, "
+				       "but permissions conflict %o->%o."),
+				     path, our_mode, their_mode);
+
+		printf(_("Adding %s\n"), path);
+
+		if (add_to_index_cacheinfo(r->index, our_mode, our_blob, path))
+			return -1;
+		return checkout_from_index(r->index, path);
+	} else if (our_blob && their_blob)
+		/* Modified in both, but differently. */
+		return do_merge_one_file(r->index,
+					 orig_blob, our_blob, their_blob, path,
+					 orig_mode, our_mode, their_mode);
+	else {
+		char orig_hex[GIT_MAX_HEXSZ + 1] = {0}, our_hex[GIT_MAX_HEXSZ + 1] = {0},
+			their_hex[GIT_MAX_HEXSZ + 1] = {0};
+
+		if (orig_blob)
+			oid_to_hex_r(orig_hex, orig_blob);
+		if (our_blob)
+			oid_to_hex_r(our_hex, our_blob);
+		if (their_blob)
+			oid_to_hex_r(their_hex, their_blob);
+
+		return error(_("%s: Not handling case %s -> %s -> %s"),
+			     path, orig_hex, our_hex, their_hex);
+	}
+
+	return 0;
+}
diff --git a/merge-strategies.h b/merge-strategies.h
new file mode 100644
index 0000000000..b527d145c7
--- /dev/null
+++ b/merge-strategies.h
@@ -0,0 +1,13 @@
+#ifndef MERGE_STRATEGIES_H
+#define MERGE_STRATEGIES_H
+
+#include "object.h"
+
+int merge_strategies_one_file(struct repository *r,
+			      const struct object_id *orig_blob,
+			      const struct object_id *our_blob,
+			      const struct object_id *their_blob, const char *path,
+			      unsigned int orig_mode, unsigned int our_mode,
+			      unsigned int their_mode);
+
+#endif /* MERGE_STRATEGIES_H */
diff --git a/t/t6415-merge-dir-to-symlink.sh b/t/t6415-merge-dir-to-symlink.sh
index 2eddcc7664..5fb74e39a0 100755
--- a/t/t6415-merge-dir-to-symlink.sh
+++ b/t/t6415-merge-dir-to-symlink.sh
@@ -94,7 +94,7 @@ test_expect_success SYMLINKS 'a/b was resolved as symlink' '
 	test -h a/b
 '
 
-test_expect_failure 'do not lose untracked in merge (resolve)' '
+test_expect_success 'do not lose untracked in merge (resolve)' '
 	git reset --hard &&
 	git checkout baseline^0 &&
 	>a/b/c/e &&
-- 
2.28.0.662.ge304723957



* [PATCH v3 03/11] merge-index: libify merge_one_path() and merge_all()
  2020-10-05 12:26   ` [PATCH v3 00/11] Rewrite the remaining merge strategies from shell to C Alban Gruin
  2020-10-05 12:26     ` [PATCH v3 01/11] t6027: modernise tests Alban Gruin
  2020-10-05 12:26     ` [PATCH v3 02/11] merge-one-file: rewrite in C Alban Gruin
@ 2020-10-05 12:26     ` Alban Gruin
  2020-10-09  4:48       ` Junio C Hamano
  2020-10-05 12:26     ` [PATCH v3 04/11] merge-index: don't fork if the requested program is `git-merge-one-file' Alban Gruin
                       ` (9 subsequent siblings)
  12 siblings, 1 reply; 221+ messages in thread
From: Alban Gruin @ 2020-10-05 12:26 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, phillip.wood, Alban Gruin

The "resolve" and "octopus" merge strategies do not call `git
merge-one-file' directly; they delegate the work to another git command,
`git merge-index', which loops over the files in the index and calls the
specified command.  Unfortunately, the functions implementing this loop
are not part of libgit.a, which means that once rewritten, the
strategies would still have to invoke `merge-one-file' by spawning a new
process.

To avoid this, this moves merge_one_path(), merge_all(), and their
helpers to merge-strategies.c.  They now also take a callback that
dictates what they should do for each file.  For now, to preserve the
behaviour of `merge-index', only one callback, which launches a new
process, is defined.
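
The shape of this callback-driven loop can be sketched without any git
internals.  The following is an illustration under simplifying
assumptions, not the code of this patch: `struct entry',
`for_each_unmerged()' and `count_cb()' are hypothetical names, the
"index" is a sorted array, and object ids are plain strings.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Mock index entry; stage 0 means "already merged". */
struct entry {
	const char *path;
	int stage;              /* 0, or 1 (base), 2 (ours), 3 (theirs) */
	const char *oid;
	unsigned int mode;
};

typedef int (*entry_cb)(const char *orig, const char *ours,
			const char *theirs, const char *path,
			unsigned int orig_mode, unsigned int our_mode,
			unsigned int their_mode, void *data);

/*
 * Walk a sorted array, gather the stages of each unmerged path, and
 * call cb once per path.  Returns the number of failed paths when
 * oneshot is set, or 1 at the first failure otherwise.
 */
int for_each_unmerged(const struct entry *entries, size_t nr, int oneshot,
		      entry_cb cb, void *data)
{
	size_t i = 0;
	int err = 0;

	while (i < nr) {
		const char *oids[3] = { NULL, NULL, NULL };
		unsigned int modes[3] = { 0, 0, 0 };
		const char *path = entries[i].path;

		if (!entries[i].stage) {
			i++;    /* stage 0: nothing to merge */
			continue;
		}

		for (; i < nr && entries[i].stage &&
		       !strcmp(entries[i].path, path); i++) {
			int stage = entries[i].stage;

			assert(stage >= 1 && stage <= 3);
			oids[stage - 1] = entries[i].oid;
			modes[stage - 1] = entries[i].mode;
		}

		if (cb(oids[0], oids[1], oids[2], path,
		       modes[0], modes[1], modes[2], data)) {
			if (!oneshot)
				return 1;
			err++;
		}
	}
	return err;
}

/* Example callback: counts calls, fails when "theirs" is missing. */
int count_cb(const char *orig, const char *ours, const char *theirs,
	     const char *path, unsigned int orig_mode, unsigned int our_mode,
	     unsigned int their_mode, void *data)
{
	(void)orig; (void)ours; (void)path;
	(void)orig_mode; (void)our_mode; (void)their_mode;
	(*(int *)data)++;
	return theirs == NULL;
}
```

The `void *data' argument is what lets one loop serve both purposes:
a callback that forks receives the program name, and a callback that
works in-process can receive whatever state it needs.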

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 builtin/merge-index.c | 77 +++------------------------------
 merge-strategies.c    | 99 +++++++++++++++++++++++++++++++++++++++++++
 merge-strategies.h    | 17 ++++++++
 3 files changed, 123 insertions(+), 70 deletions(-)

diff --git a/builtin/merge-index.c b/builtin/merge-index.c
index 38ea6ad6ca..6cb666cc78 100644
--- a/builtin/merge-index.c
+++ b/builtin/merge-index.c
@@ -1,74 +1,11 @@
 #define USE_THE_INDEX_COMPATIBILITY_MACROS
 #include "builtin.h"
-#include "run-command.h"
-
-static const char *pgm;
-static int one_shot, quiet;
-static int err;
-
-static int merge_entry(int pos, const char *path)
-{
-	int found;
-	const char *arguments[] = { pgm, "", "", "", path, "", "", "", NULL };
-	char hexbuf[4][GIT_MAX_HEXSZ + 1];
-	char ownbuf[4][60];
-
-	if (pos >= active_nr)
-		die("git merge-index: %s not in the cache", path);
-	found = 0;
-	do {
-		const struct cache_entry *ce = active_cache[pos];
-		int stage = ce_stage(ce);
-
-		if (strcmp(ce->name, path))
-			break;
-		found++;
-		oid_to_hex_r(hexbuf[stage], &ce->oid);
-		xsnprintf(ownbuf[stage], sizeof(ownbuf[stage]), "%o", ce->ce_mode);
-		arguments[stage] = hexbuf[stage];
-		arguments[stage + 4] = ownbuf[stage];
-	} while (++pos < active_nr);
-	if (!found)
-		die("git merge-index: %s not in the cache", path);
-
-	if (run_command_v_opt(arguments, 0)) {
-		if (one_shot)
-			err++;
-		else {
-			if (!quiet)
-				die("merge program failed");
-			exit(1);
-		}
-	}
-	return found;
-}
-
-static void merge_one_path(const char *path)
-{
-	int pos = cache_name_pos(path, strlen(path));
-
-	/*
-	 * If it already exists in the cache as stage0, it's
-	 * already merged and there is nothing to do.
-	 */
-	if (pos < 0)
-		merge_entry(-pos-1, path);
-}
-
-static void merge_all(void)
-{
-	int i;
-	for (i = 0; i < active_nr; i++) {
-		const struct cache_entry *ce = active_cache[i];
-		if (!ce_stage(ce))
-			continue;
-		i += merge_entry(i, ce->name)-1;
-	}
-}
+#include "merge-strategies.h"
 
 int cmd_merge_index(int argc, const char **argv, const char *prefix)
 {
-	int i, force_file = 0;
+	int i, force_file = 0, err = 0, one_shot = 0, quiet = 0;
+	const char *pgm;
 
 	/* Without this we cannot rely on waitpid() to tell
 	 * what happened to our children.
@@ -98,14 +35,14 @@ int cmd_merge_index(int argc, const char **argv, const char *prefix)
 				continue;
 			}
 			if (!strcmp(arg, "-a")) {
-				merge_all();
+				err |= merge_all(&the_index, one_shot, quiet,
+						 merge_program_cb, (void *)pgm);
 				continue;
 			}
 			die("git merge-index: unknown option %s", arg);
 		}
-		merge_one_path(arg);
+		err |= merge_one_path(&the_index, one_shot, quiet, arg,
+				      merge_program_cb, (void *)pgm);
 	}
-	if (err && !quiet)
-		die("merge program failed");
 	return err;
 }
diff --git a/merge-strategies.c b/merge-strategies.c
index bbe6f48698..f0e30f5624 100644
--- a/merge-strategies.c
+++ b/merge-strategies.c
@@ -2,6 +2,7 @@
 #include "dir.h"
 #include "ll-merge.h"
 #include "merge-strategies.h"
+#include "run-command.h"
 #include "xdiff-interface.h"
 
 static int add_to_index_cacheinfo(struct index_state *istate,
@@ -212,3 +213,101 @@ int merge_strategies_one_file(struct repository *r,
 
 	return 0;
 }
+
+int merge_program_cb(const struct object_id *orig_blob,
+		     const struct object_id *our_blob,
+		     const struct object_id *their_blob, const char *path,
+		     unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
+		     void *data)
+{
+	char ownbuf[3][GIT_MAX_HEXSZ] = {{0}};
+	const char *arguments[] = { (char *)data, "", "", "", path,
+				    ownbuf[0], ownbuf[1], ownbuf[2],
+				    NULL };
+
+	if (orig_blob)
+		arguments[1] = oid_to_hex(orig_blob);
+	if (our_blob)
+		arguments[2] = oid_to_hex(our_blob);
+	if (their_blob)
+		arguments[3] = oid_to_hex(their_blob);
+
+	xsnprintf(ownbuf[0], sizeof(ownbuf[0]), "%o", orig_mode);
+	xsnprintf(ownbuf[1], sizeof(ownbuf[1]), "%o", our_mode);
+	xsnprintf(ownbuf[2], sizeof(ownbuf[2]), "%o", their_mode);
+
+	return run_command_v_opt(arguments, 0);
+}
+
+static int merge_entry(struct index_state *istate, int quiet, int pos,
+		       const char *path, merge_cb cb, void *data)
+{
+	int found = 0;
+	const struct object_id *oids[3] = {NULL};
+	unsigned int modes[3] = {0};
+
+	do {
+		const struct cache_entry *ce = istate->cache[pos];
+		int stage = ce_stage(ce);
+
+		if (strcmp(ce->name, path))
+			break;
+		found++;
+		oids[stage - 1] = &ce->oid;
+		modes[stage - 1] = ce->ce_mode;
+	} while (++pos < istate->cache_nr);
+	if (!found)
+		return error(_("%s is not in the cache"), path);
+
+	if (cb(oids[0], oids[1], oids[2], path, modes[0], modes[1], modes[2], data)) {
+		if (!quiet)
+			error(_("Merge program failed"));
+		return -2;
+	}
+
+	return found;
+}
+
+int merge_one_path(struct index_state *istate, int oneshot, int quiet,
+		   const char *path, merge_cb cb, void *data)
+{
+	int pos = index_name_pos(istate, path, strlen(path)), ret;
+
+	/*
+	 * If it already exists in the cache as stage0, it's
+	 * already merged and there is nothing to do.
+	 */
+	if (pos < 0) {
+		ret = merge_entry(istate, quiet, -pos - 1, path, cb, data);
+		if (ret == -1)
+			return -1;
+		else if (ret == -2)
+			return 1;
+	}
+	return 0;
+}
+
+int merge_all(struct index_state *istate, int oneshot, int quiet,
+	      merge_cb cb, void *data)
+{
+	int err = 0, i, ret;
+	for (i = 0; i < istate->cache_nr; i++) {
+		const struct cache_entry *ce = istate->cache[i];
+		if (!ce_stage(ce))
+			continue;
+
+		ret = merge_entry(istate, quiet, i, ce->name, cb, data);
+		if (ret > 0)
+			i += ret - 1;
+		else if (ret == -1)
+			return -1;
+		else if (ret == -2) {
+			if (oneshot)
+				err++;
+			else
+				return 1;
+		}
+	}
+
+	return err;
+}
diff --git a/merge-strategies.h b/merge-strategies.h
index b527d145c7..cf78d7eaf4 100644
--- a/merge-strategies.h
+++ b/merge-strategies.h
@@ -10,4 +10,21 @@ int merge_strategies_one_file(struct repository *r,
 			      unsigned int orig_mode, unsigned int our_mode,
 			      unsigned int their_mode);
 
+typedef int (*merge_cb)(const struct object_id *orig_blob,
+			const struct object_id *our_blob,
+			const struct object_id *their_blob, const char *path,
+			unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
+			void *data);
+
+int merge_program_cb(const struct object_id *orig_blob,
+		     const struct object_id *our_blob,
+		     const struct object_id *their_blob, const char *path,
+		     unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
+		     void *data);
+
+int merge_one_path(struct index_state *istate, int oneshot, int quiet,
+		   const char *path, merge_cb cb, void *data);
+int merge_all(struct index_state *istate, int oneshot, int quiet,
+	      merge_cb cb, void *data);
+
 #endif /* MERGE_STRATEGIES_H */
-- 
2.28.0.662.ge304723957



* [PATCH v3 04/11] merge-index: don't fork if the requested program is `git-merge-one-file'
  2020-10-05 12:26   ` [PATCH v3 00/11] Rewrite the remaining merge strategies from shell to C Alban Gruin
                       ` (2 preceding siblings ...)
  2020-10-05 12:26     ` [PATCH v3 03/11] merge-index: libify merge_one_path() and merge_all() Alban Gruin
@ 2020-10-05 12:26     ` Alban Gruin
  2020-10-16 19:07       ` Junio C Hamano
  2020-10-05 12:26     ` [PATCH v3 05/11] merge-resolve: rewrite in C Alban Gruin
                       ` (8 subsequent siblings)
  12 siblings, 1 reply; 221+ messages in thread
From: Alban Gruin @ 2020-10-05 12:26 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, phillip.wood, Alban Gruin

Since `git-merge-one-file' has been rewritten and libified, this teaches
`merge-index' to call merge_strategies_one_file() without forking using
a new callback, merge_one_file_cb().
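
The selection between the two callbacks can be pictured as follows.
This is a reduced sketch, not the code of this patch:
`pick_action()', `struct action' and the two stand-in callbacks are
hypothetical, and the stand-ins do no real work.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*action_fn)(const char *path, void *data);

/* Hypothetical stand-in for the in-process callback. */
int run_in_process(const char *path, void *data)
{
	(void)path; (void)data;
	return 0;
}

/* Hypothetical stand-in for the forking callback; data is the program name. */
int run_external(const char *path, void *data)
{
	(void)path;
	return data == NULL;
}

struct action {
	action_fn fn;
	void *data;
};

/* Pick the callback and its opaque argument from the program name. */
struct action pick_action(const char *pgm, void *repo)
{
	struct action a;

	assert(pgm != NULL);
	if (!strcmp(pgm, "git-merge-one-file")) {
		a.fn = run_in_process;
		a.data = repo;
	} else {
		a.fn = run_external;
		a.data = (void *)pgm;
	}
	return a;
}
```

Note that the comparison is an exact string match, so only the literal
name `git-merge-one-file' takes the in-process path; any other
spelling, such as an absolute path to the same program, still forks.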

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 builtin/merge-index.c | 29 +++++++++++++++++++++++++++--
 merge-strategies.c    | 11 +++++++++++
 merge-strategies.h    |  6 ++++++
 3 files changed, 44 insertions(+), 2 deletions(-)

diff --git a/builtin/merge-index.c b/builtin/merge-index.c
index 6cb666cc78..19fff9a113 100644
--- a/builtin/merge-index.c
+++ b/builtin/merge-index.c
@@ -1,11 +1,15 @@
 #define USE_THE_INDEX_COMPATIBILITY_MACROS
 #include "builtin.h"
+#include "lockfile.h"
 #include "merge-strategies.h"
 
 int cmd_merge_index(int argc, const char **argv, const char *prefix)
 {
 	int i, force_file = 0, err = 0, one_shot = 0, quiet = 0;
 	const char *pgm;
+	void *data;
+	merge_cb merge_action;
+	struct lock_file lock = LOCK_INIT;
 
 	/* Without this we cannot rely on waitpid() to tell
 	 * what happened to our children.
@@ -26,7 +30,19 @@ int cmd_merge_index(int argc, const char **argv, const char *prefix)
 		quiet = 1;
 		i++;
 	}
+
 	pgm = argv[i++];
+	if (!strcmp(pgm, "git-merge-one-file")) {
+		merge_action = merge_one_file_cb;
+		data = (void *)the_repository;
+
+		setup_work_tree();
+		hold_locked_index(&lock, LOCK_DIE_ON_ERROR);
+	} else {
+		merge_action = merge_program_cb;
+		data = (void *)pgm;
+	}
+
 	for (; i < argc; i++) {
 		const char *arg = argv[i];
 		if (!force_file && *arg == '-') {
@@ -36,13 +52,22 @@ int cmd_merge_index(int argc, const char **argv, const char *prefix)
 			}
 			if (!strcmp(arg, "-a")) {
 				err |= merge_all(&the_index, one_shot, quiet,
-						 merge_program_cb, (void *)pgm);
+						 merge_action, data);
 				continue;
 			}
 			die("git merge-index: unknown option %s", arg);
 		}
 		err |= merge_one_path(&the_index, one_shot, quiet, arg,
-				      merge_program_cb, (void *)pgm);
+				      merge_action, data);
+	}
+
+	if (merge_action == merge_one_file_cb) {
+		if (err) {
+			rollback_lock_file(&lock);
+			return err;
+		}
+
+		return write_locked_index(&the_index, &lock, COMMIT_LOCK);
 	}
 	return err;
 }
diff --git a/merge-strategies.c b/merge-strategies.c
index f0e30f5624..c022ba9748 100644
--- a/merge-strategies.c
+++ b/merge-strategies.c
@@ -214,6 +214,17 @@ int merge_strategies_one_file(struct repository *r,
 	return 0;
 }
 
+int merge_one_file_cb(const struct object_id *orig_blob,
+		      const struct object_id *our_blob,
+		      const struct object_id *their_blob, const char *path,
+		      unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
+		      void *data)
+{
+	return merge_strategies_one_file((struct repository *)data,
+					 orig_blob, our_blob, their_blob, path,
+					 orig_mode, our_mode, their_mode);
+}
+
 int merge_program_cb(const struct object_id *orig_blob,
 		     const struct object_id *our_blob,
 		     const struct object_id *their_blob, const char *path,
diff --git a/merge-strategies.h b/merge-strategies.h
index cf78d7eaf4..40e175ca39 100644
--- a/merge-strategies.h
+++ b/merge-strategies.h
@@ -16,6 +16,12 @@ typedef int (*merge_cb)(const struct object_id *orig_blob,
 			unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
 			void *data);
 
+int merge_one_file_cb(const struct object_id *orig_blob,
+		      const struct object_id *our_blob,
+		      const struct object_id *their_blob, const char *path,
+		      unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
+		      void *data);
+
 int merge_program_cb(const struct object_id *orig_blob,
 		     const struct object_id *our_blob,
 		     const struct object_id *their_blob, const char *path,
-- 
2.28.0.662.ge304723957



* [PATCH v3 05/11] merge-resolve: rewrite in C
  2020-10-05 12:26   ` [PATCH v3 00/11] Rewrite the remaining merge strategies from shell to C Alban Gruin
                       ` (3 preceding siblings ...)
  2020-10-05 12:26     ` [PATCH v3 04/11] merge-index: don't fork if the requested program is `git-merge-one-file' Alban Gruin
@ 2020-10-05 12:26     ` Alban Gruin
  2020-10-16 19:19       ` Junio C Hamano
  2020-10-05 12:26     ` [PATCH v3 06/11] merge-recursive: move better_branch_name() to merge.c Alban Gruin
                       ` (7 subsequent siblings)
  12 siblings, 1 reply; 221+ messages in thread
From: Alban Gruin @ 2020-10-05 12:26 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, phillip.wood, Alban Gruin

This rewrites `git merge-resolve' from shell to C.  As for `git
merge-one-file', this port is not completely straightforward and removes
calls to external processes to avoid reading and writing the index over
and over again.

 - The call to `update-index -q --refresh' is replaced by a call to
   refresh_index().

 - The call to `read-tree' is replaced by a call to unpack_trees() (and
   all the setup needed).

 - The call to `write-tree' is replaced by a call to
   write_index_as_tree().

 - The call to `merge-index', needed to invoke `git merge-one-file', is
   replaced by a call to the new merge_all() function.  A callback
   function, merge_one_file_cb(), is added to allow it to call
   merge_one_file() without forking.
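
Part of the "setup needed" for unpack_trees() is choosing a merge
function from the number of trees, as merge_strategies_resolve() does in
the diff below.  A minimal standalone sketch of that choice follows;
`enum merge_kind', `struct merge_setup' and pick_merge() are
hypothetical names made up for illustration, standing in for the
oneway_merge/twoway_merge/threeway_merge callbacks and opts.head_idx.

```c
#include <assert.h>

/* Hypothetical stand-ins for oneway_merge, twoway_merge, threeway_merge. */
enum merge_kind { MERGE_NONE, MERGE_ONEWAY, MERGE_TWOWAY, MERGE_THREEWAY };

/* Hypothetical holder for the two fields the patch fills in. */
struct merge_setup {
	enum merge_kind kind;
	int head_idx;	/* value the patch would store in opts.head_idx */
};

static struct merge_setup pick_merge(int nr_trees)
{
	/* The patch sets head_idx to 1 before looking at the tree count. */
	struct merge_setup s = { MERGE_NONE, 1 };

	assert(nr_trees >= 0);

	if (nr_trees == 1)
		s.kind = MERGE_ONEWAY;
	else if (nr_trees == 2)
		s.kind = MERGE_TWOWAY;
	else if (nr_trees >= 3) {
		/* Any extra leading trees are merge bases. */
		s.kind = MERGE_THREEWAY;
		s.head_idx = nr_trees - 1;
	}
	return s;
}
```

With one base, a head and one remote, nr_trees is 3, so the sketch picks
the three-way case with head_idx set to 2.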

Here too, the index is read in cmd_merge_resolve(), but
merge_strategies_resolve() takes care of writing it back to the disk.

The parameters of merge_strategies_resolve() may be surprising at first
glance: why use a commit list for `bases' and `remote' when an oid array
and a pointer to an oid would do?  Because, in a later commit,
try_merge_strategy() will be able to call merge_strategies_resolve()
directly, and it already uses a commit list for `bases' (`common') and
`remote' (`remoteheads'), and a string for `head_arg'.  To reduce
friction later, merge_strategies_resolve() takes the same types of
parameters.
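
For context, the command-line wrapper receives the same three groups
(bases, head, remotes) as plain strings separated by `--'.  The split
can be sketched in standalone C as below.  `struct split_args',
split_merge_args() and MAX_ARGS are hypothetical names for this
illustration only; the patch itself resolves each argument to a commit
and appends it to a commit list instead of storing strings.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_ARGS 32	/* arbitrary limit chosen for this sketch */

struct split_args {
	const char *bases[MAX_ARGS];
	size_t nr_bases;
	const char *head;
	const char *remotes[MAX_ARGS];
	size_t nr_remotes;
};

/*
 * Split "<bases>... -- <head> <remotes>..." into its three groups.
 * argv must not include the program name.  Returns 0 on success, or -1
 * if a group does not fit in the fixed-size arrays.
 */
static int split_merge_args(int argc, const char **argv,
			    struct split_args *out)
{
	int i, sep_seen = 0;

	assert(out && (argv || argc == 0));
	*out = (struct split_args){ 0 };

	for (i = 0; i < argc; i++) {
		if (!strcmp(argv[i], "--")) {
			sep_seen = 1;
		} else if (sep_seen && !out->head) {
			/* First argument after the separator is our head. */
			out->head = argv[i];
		} else if (sep_seen) {
			if (out->nr_remotes == MAX_ARGS)
				return -1;
			out->remotes[out->nr_remotes++] = argv[i];
		} else {
			if (out->nr_bases == MAX_ARGS)
				return -1;
			out->bases[out->nr_bases++] = argv[i];
		}
	}
	return 0;
}
```

A resolve-style caller would then give up when nr_bases is 0 or
nr_remotes is greater than 1, which matches the two "give up" cases in
the removed shell script.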

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 Makefile                |  2 +-
 builtin.h               |  1 +
 builtin/merge-resolve.c | 69 +++++++++++++++++++++++++++++++++
 git-merge-resolve.sh    | 54 --------------------------
 git.c                   |  1 +
 merge-strategies.c      | 85 +++++++++++++++++++++++++++++++++++++++++
 merge-strategies.h      |  5 +++
 7 files changed, 162 insertions(+), 55 deletions(-)
 create mode 100644 builtin/merge-resolve.c
 delete mode 100755 git-merge-resolve.sh

diff --git a/Makefile b/Makefile
index 6dfdb33cb2..3cc6b192f1 100644
--- a/Makefile
+++ b/Makefile
@@ -601,7 +601,6 @@ SCRIPT_SH += git-bisect.sh
 SCRIPT_SH += git-difftool--helper.sh
 SCRIPT_SH += git-filter-branch.sh
 SCRIPT_SH += git-merge-octopus.sh
-SCRIPT_SH += git-merge-resolve.sh
 SCRIPT_SH += git-mergetool.sh
 SCRIPT_SH += git-quiltimport.sh
 SCRIPT_SH += git-request-pull.sh
@@ -1097,6 +1096,7 @@ BUILTIN_OBJS += builtin/merge-index.o
 BUILTIN_OBJS += builtin/merge-one-file.o
 BUILTIN_OBJS += builtin/merge-ours.o
 BUILTIN_OBJS += builtin/merge-recursive.o
+BUILTIN_OBJS += builtin/merge-resolve.o
 BUILTIN_OBJS += builtin/merge-tree.o
 BUILTIN_OBJS += builtin/merge.o
 BUILTIN_OBJS += builtin/mktag.o
diff --git a/builtin.h b/builtin.h
index 4d2cd78856..35e91c16d0 100644
--- a/builtin.h
+++ b/builtin.h
@@ -180,6 +180,7 @@ int cmd_merge_ours(int argc, const char **argv, const char *prefix);
 int cmd_merge_file(int argc, const char **argv, const char *prefix);
 int cmd_merge_one_file(int argc, const char **argv, const char *prefix);
 int cmd_merge_recursive(int argc, const char **argv, const char *prefix);
+int cmd_merge_resolve(int argc, const char **argv, const char *prefix);
 int cmd_merge_tree(int argc, const char **argv, const char *prefix);
 int cmd_mktag(int argc, const char **argv, const char *prefix);
 int cmd_mktree(int argc, const char **argv, const char *prefix);
diff --git a/builtin/merge-resolve.c b/builtin/merge-resolve.c
new file mode 100644
index 0000000000..59f734473b
--- /dev/null
+++ b/builtin/merge-resolve.c
@@ -0,0 +1,69 @@
+/*
+ * Builtin "git merge-resolve"
+ *
+ * Copyright (c) 2020 Alban Gruin
+ *
+ * Based on git-merge-resolve.sh, written by Linus Torvalds and Junio C
+ * Hamano.
+ *
+ * Resolve two trees, using enhanced multi-base read-tree.
+ */
+
+#include "cache.h"
+#include "builtin.h"
+#include "merge-strategies.h"
+
+static const char builtin_merge_resolve_usage[] =
+	"git merge-resolve <bases>... -- <head> <remote>";
+
+int cmd_merge_resolve(int argc, const char **argv, const char *prefix)
+{
+	int i, is_baseless = 1, sep_seen = 0;
+	const char *head = NULL;
+	struct commit_list *bases = NULL, *remote = NULL;
+	struct commit_list **next_base = &bases;
+
+	if (argc < 5)
+		usage(builtin_merge_resolve_usage);
+
+	setup_work_tree();
+	if (repo_read_index(the_repository) < 0)
+		die("invalid index");
+
+	/* The first parameters up to -- are merge bases; the rest are
+	 * heads. */
+	for (i = 1; i < argc; i++) {
+		if (strcmp(argv[i], "--") == 0)
+			sep_seen = 1;
+		else if (strcmp(argv[i], "-h") == 0)
+			usage(builtin_merge_resolve_usage);
+		else if (sep_seen && !head)
+			head = argv[i];
+		else if (remote) {
+			/* Give up if we are given two or more remotes.
+			 * Not handling octopus. */
+			return 2;
+		} else {
+			struct object_id oid;
+
+			get_oid(argv[i], &oid);
+			is_baseless &= sep_seen;
+
+			if (!oideq(&oid, the_hash_algo->empty_tree)) {
+				struct commit *commit;
+				commit = lookup_commit_or_die(&oid, argv[i]);
+
+				if (sep_seen)
+					commit_list_append(commit, &remote);
+				else
+					next_base = commit_list_append(commit, next_base);
+			}
+		}
+	}
+
+	/* Give up if this is a baseless merge. */
+	if (is_baseless)
+		return 2;
+
+	return merge_strategies_resolve(the_repository, bases, head, remote);
+}
diff --git a/git-merge-resolve.sh b/git-merge-resolve.sh
deleted file mode 100755
index 343fe7bccd..0000000000
--- a/git-merge-resolve.sh
+++ /dev/null
@@ -1,54 +0,0 @@
-#!/bin/sh
-#
-# Copyright (c) 2005 Linus Torvalds
-# Copyright (c) 2005 Junio C Hamano
-#
-# Resolve two trees, using enhanced multi-base read-tree.
-
-# The first parameters up to -- are merge bases; the rest are heads.
-bases= head= remotes= sep_seen=
-for arg
-do
-	case ",$sep_seen,$head,$arg," in
-	*,--,)
-		sep_seen=yes
-		;;
-	,yes,,*)
-		head=$arg
-		;;
-	,yes,*)
-		remotes="$remotes$arg "
-		;;
-	*)
-		bases="$bases$arg "
-		;;
-	esac
-done
-
-# Give up if we are given two or more remotes -- not handling octopus.
-case "$remotes" in
-?*' '?*)
-	exit 2 ;;
-esac
-
-# Give up if this is a baseless merge.
-if test '' = "$bases"
-then
-	exit 2
-fi
-
-git update-index -q --refresh
-git read-tree -u -m --aggressive $bases $head $remotes || exit 2
-echo "Trying simple merge."
-if result_tree=$(git write-tree 2>/dev/null)
-then
-	exit 0
-else
-	echo "Simple merge failed, trying Automatic merge."
-	if git merge-index -o git-merge-one-file -a
-	then
-		exit 0
-	else
-		exit 1
-	fi
-fi
diff --git a/git.c b/git.c
index a4d3f98094..64a1a1de41 100644
--- a/git.c
+++ b/git.c
@@ -544,6 +544,7 @@ static struct cmd_struct commands[] = {
 	{ "merge-recursive", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
 	{ "merge-recursive-ours", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
 	{ "merge-recursive-theirs", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
+	{ "merge-resolve", cmd_merge_resolve, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
 	{ "merge-subtree", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
 	{ "merge-tree", cmd_merge_tree, RUN_SETUP | NO_PARSEOPT },
 	{ "mktag", cmd_mktag, RUN_SETUP | NO_PARSEOPT },
diff --git a/merge-strategies.c b/merge-strategies.c
index c022ba9748..6b4b3d03a6 100644
--- a/merge-strategies.c
+++ b/merge-strategies.c
@@ -1,8 +1,11 @@
 #include "cache.h"
+#include "cache-tree.h"
 #include "dir.h"
 #include "ll-merge.h"
+#include "lockfile.h"
 #include "merge-strategies.h"
 #include "run-command.h"
+#include "unpack-trees.h"
 #include "xdiff-interface.h"
 
 static int add_to_index_cacheinfo(struct index_state *istate,
@@ -322,3 +325,85 @@ int merge_all(struct index_state *istate, int oneshot, int quiet,
 
 	return err;
 }
+
+static int add_tree(const struct object_id *oid, struct tree_desc *t)
+{
+	struct tree *tree;
+
+	tree = parse_tree_indirect(oid);
+	if (parse_tree(tree))
+		return -1;
+
+	init_tree_desc(t, tree->buffer, tree->size);
+	return 0;
+}
+
+int merge_strategies_resolve(struct repository *r,
+			     struct commit_list *bases, const char *head_arg,
+			     struct commit_list *remote)
+{
+	int i = 0;
+	struct lock_file lock = LOCK_INIT;
+	struct tree_desc t[MAX_UNPACK_TREES];
+	struct unpack_trees_options opts;
+	struct object_id head, oid;
+	struct commit_list *j;
+
+	if (head_arg)
+		get_oid(head_arg, &head);
+
+	repo_hold_locked_index(r, &lock, LOCK_DIE_ON_ERROR);
+	refresh_index(r->index, 0, NULL, NULL, NULL);
+
+	memset(&opts, 0, sizeof(opts));
+	opts.head_idx = 1;
+	opts.src_index = r->index;
+	opts.dst_index = r->index;
+	opts.update = 1;
+	opts.merge = 1;
+	opts.aggressive = 1;
+
+	for (j = bases; j && j->item; j = j->next) {
+		if (add_tree(&j->item->object.oid, t + (i++)))
+			goto out;
+	}
+
+	if (head_arg && add_tree(&head, t + (i++)))
+		goto out;
+	if (remote && add_tree(&remote->item->object.oid, t + (i++)))
+		goto out;
+
+	if (i == 1)
+		opts.fn = oneway_merge;
+	else if (i == 2) {
+		opts.fn = twoway_merge;
+		opts.initial_checkout = is_index_unborn(r->index);
+	} else if (i >= 3) {
+		opts.fn = threeway_merge;
+		opts.head_idx = i - 1;
+	}
+
+	if (unpack_trees(i, t, &opts))
+		goto out;
+
+	puts(_("Trying simple merge."));
+	write_locked_index(r->index, &lock, COMMIT_LOCK);
+
+	if (write_index_as_tree(&oid, r->index, r->index_file,
+				WRITE_TREE_SILENT, NULL)) {
+		int ret;
+
+		puts(_("Simple merge failed, trying Automatic merge."));
+		repo_hold_locked_index(r, &lock, LOCK_DIE_ON_ERROR);
+		ret = merge_all(r->index, 0, 0, merge_one_file_cb, r);
+
+		write_locked_index(r->index, &lock, COMMIT_LOCK);
+		return !!ret;
+	}
+
+	return 0;
+
+ out:
+	rollback_lock_file(&lock);
+	return 2;
+}
diff --git a/merge-strategies.h b/merge-strategies.h
index 40e175ca39..778f8ce9d6 100644
--- a/merge-strategies.h
+++ b/merge-strategies.h
@@ -1,6 +1,7 @@
 #ifndef MERGE_STRATEGIES_H
 #define MERGE_STRATEGIES_H
 
+#include "commit.h"
 #include "object.h"
 
 int merge_strategies_one_file(struct repository *r,
@@ -33,4 +34,8 @@ int merge_one_path(struct index_state *istate, int oneshot, int quiet,
 int merge_all(struct index_state *istate, int oneshot, int quiet,
 	      merge_cb cb, void *data);
 
+int merge_strategies_resolve(struct repository *r,
+			     struct commit_list *bases, const char *head_arg,
+			     struct commit_list *remote);
+
 #endif /* MERGE_STRATEGIES_H */
-- 
2.28.0.662.ge304723957



* [PATCH v3 06/11] merge-recursive: move better_branch_name() to merge.c
  2020-10-05 12:26   ` [PATCH v3 00/11] Rewrite the remaining merge strategies from shell to C Alban Gruin
                       ` (4 preceding siblings ...)
  2020-10-05 12:26     ` [PATCH v3 05/11] merge-resolve: rewrite in C Alban Gruin
@ 2020-10-05 12:26     ` Alban Gruin
  2020-10-05 12:26     ` [PATCH v3 07/11] merge-octopus: rewrite in C Alban Gruin
                       ` (6 subsequent siblings)
  12 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2020-10-05 12:26 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, phillip.wood, Alban Gruin

better_branch_name() will be used by merge-octopus once it is rewritten
in C, so instead of duplicating it, this moves the function preemptively
into an appropriate file in libgit.a.  The function is also renamed to
merge_get_better_branch_name() to reflect its use by merge strategies.
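
For readers unfamiliar with the function, here is a standalone sketch of
the lookup it performs: when the argument has the length of a full hex
object name, GITHEAD_<hex> is consulted in the environment for a
friendlier name.  The names pretty_branch_name(), dup_string() and HEXSZ
are hypothetical; the sketch assumes 40-digit SHA-1 names and uses plain
libc where git uses xstrdup(), xsnprintf() and the_hash_algo->hexsz.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define HEXSZ 40	/* assumption: SHA-1 object names, 40 hex digits */

/* Duplicate a string; the caller frees the result. */
static char *dup_string(const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);

	if (!copy)
		abort();
	memcpy(copy, s, len);
	return copy;
}

/* Return a newly allocated display name for `branch'. */
static char *pretty_branch_name(const char *branch)
{
	char env_name[8 + HEXSZ + 1];	/* "GITHEAD_" + hex digits + NUL */
	const char *name;

	assert(branch);

	/* Anything that is not a full-length hex name is returned as is. */
	if (strlen(branch) != HEXSZ)
		return dup_string(branch);

	snprintf(env_name, sizeof(env_name), "GITHEAD_%s", branch);
	name = getenv(env_name);
	return dup_string(name ? name : branch);
}
```

The caller owns the returned string in both cases, which is why the
callers in the diff below free the result.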

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 builtin/merge-recursive.c | 16 ++--------------
 cache.h                   |  2 +-
 merge.c                   | 12 ++++++++++++
 3 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/builtin/merge-recursive.c b/builtin/merge-recursive.c
index a4bfd8fc51..972243b5e9 100644
--- a/builtin/merge-recursive.c
+++ b/builtin/merge-recursive.c
@@ -8,18 +8,6 @@
 static const char builtin_merge_recursive_usage[] =
 	"git %s <base>... -- <head> <remote> ...";
 
-static char *better_branch_name(const char *branch)
-{
-	static char githead_env[8 + GIT_MAX_HEXSZ + 1];
-	char *name;
-
-	if (strlen(branch) != the_hash_algo->hexsz)
-		return xstrdup(branch);
-	xsnprintf(githead_env, sizeof(githead_env), "GITHEAD_%s", branch);
-	name = getenv(githead_env);
-	return xstrdup(name ? name : branch);
-}
-
 int cmd_merge_recursive(int argc, const char **argv, const char *prefix)
 {
 	const struct object_id *bases[21];
@@ -75,8 +63,8 @@ int cmd_merge_recursive(int argc, const char **argv, const char *prefix)
 	if (get_oid(o.branch2, &h2))
 		die(_("could not resolve ref '%s'"), o.branch2);
 
-	o.branch1 = better1 = better_branch_name(o.branch1);
-	o.branch2 = better2 = better_branch_name(o.branch2);
+	o.branch1 = better1 = merge_get_better_branch_name(o.branch1);
+	o.branch2 = better2 = merge_get_better_branch_name(o.branch2);
 
 	if (o.verbosity >= 3)
 		printf(_("Merging %s with %s\n"), o.branch1, o.branch2);
diff --git a/cache.h b/cache.h
index c0072d43b1..5fa0ed8d1a 100644
--- a/cache.h
+++ b/cache.h
@@ -1928,7 +1928,7 @@ int checkout_fast_forward(struct repository *r,
 			  const struct object_id *from,
 			  const struct object_id *to,
 			  int overwrite_ignore);
-
+char *merge_get_better_branch_name(const char *branch);
 
 int sane_execvp(const char *file, char *const argv[]);
 
diff --git a/merge.c b/merge.c
index 5fb88af102..801d673c5f 100644
--- a/merge.c
+++ b/merge.c
@@ -109,3 +109,15 @@ int checkout_fast_forward(struct repository *r,
 		return error(_("unable to write new index file"));
 	return 0;
 }
+
+char *merge_get_better_branch_name(const char *branch)
+{
+	static char githead_env[8 + GIT_MAX_HEXSZ + 1];
+	char *name;
+
+	if (strlen(branch) != the_hash_algo->hexsz)
+		return xstrdup(branch);
+	xsnprintf(githead_env, sizeof(githead_env), "GITHEAD_%s", branch);
+	name = getenv(githead_env);
+	return xstrdup(name ? name : branch);
+}
-- 
2.28.0.662.ge304723957



* [PATCH v3 07/11] merge-octopus: rewrite in C
  2020-10-05 12:26   ` [PATCH v3 00/11] Rewrite the remaining merge strategies from shell to C Alban Gruin
                       ` (5 preceding siblings ...)
  2020-10-05 12:26     ` [PATCH v3 06/11] merge-recursive: move better_branch_name() to merge.c Alban Gruin
@ 2020-10-05 12:26     ` Alban Gruin
  2020-10-05 12:26     ` [PATCH v3 08/11] merge: use the "resolve" strategy without forking Alban Gruin
                       ` (5 subsequent siblings)
  12 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2020-10-05 12:26 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, phillip.wood, Alban Gruin

This rewrites `git merge-octopus' from shell to C.  As with the last two
conversions, this port removes calls to external processes to avoid
reading and writing the index over and over again.

 - Calls to `read-tree -u -m (--aggressive)?' are replaced by calls to
   unpack_trees().

 - The call to `write-tree' is replaced by a call to
   write_index_as_tree().

 - The call to `diff-index ...' is replaced by a call to
   repo_index_has_changes(), and is moved from cmd_merge_octopus() to
   merge_strategies_octopus().

 - The call to `merge-index', needed to invoke `git merge-one-file', is
   replaced by a call to merge_all().

The index is read in cmd_merge_octopus(), and is written back by
merge_strategies_octopus().

Here too, merge_strategies_octopus() takes two commit lists and a string
to reduce friction when try_merge_strategy() is modified to call it
directly.
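
One behaviour the port has to preserve is the exit status of the loop
over remotes: only the last remote may leave hand-resolvable conflicts.
The sketch below models just that rule.  octopus_status() and its
`conflicted' array are hypothetical and stand in for the per-remote
merge results; other failures, such as a failing read-tree step, are
ignored here.

```c
#include <assert.h>
#include <stddef.h>

/*
 * conflicted[i] is nonzero when merging the i-th remote left conflicts.
 * Returns 0 for a clean merge, 1 when only the last remote conflicted,
 * and 2 when a remote conflicted and more remotes were still to come.
 */
static int octopus_status(const int *conflicted, size_t nr)
{
	int failure = 0;
	size_t i;

	assert(conflicted || nr == 0);

	for (i = 0; i < nr; i++) {
		/* Last round failed and we still have a head to merge. */
		if (failure)
			return 2;
		failure = !!conflicted[i];
	}
	return failure;
}
```

This mirrors the OCTOPUS_FAILURE check at the top of the loop in the
removed shell script and the `if (ret)' check in the C version.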

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 Makefile                |   2 +-
 builtin.h               |   1 +
 builtin/merge-octopus.c |  69 ++++++++++++++
 git-merge-octopus.sh    | 112 ----------------------
 git.c                   |   1 +
 merge-strategies.c      | 204 ++++++++++++++++++++++++++++++++++++++++
 merge-strategies.h      |   3 +
 7 files changed, 279 insertions(+), 113 deletions(-)
 create mode 100644 builtin/merge-octopus.c
 delete mode 100755 git-merge-octopus.sh

diff --git a/Makefile b/Makefile
index 3cc6b192f1..2b2bdffafe 100644
--- a/Makefile
+++ b/Makefile
@@ -600,7 +600,6 @@ unexport CDPATH
 SCRIPT_SH += git-bisect.sh
 SCRIPT_SH += git-difftool--helper.sh
 SCRIPT_SH += git-filter-branch.sh
-SCRIPT_SH += git-merge-octopus.sh
 SCRIPT_SH += git-mergetool.sh
 SCRIPT_SH += git-quiltimport.sh
 SCRIPT_SH += git-request-pull.sh
@@ -1093,6 +1092,7 @@ BUILTIN_OBJS += builtin/mailsplit.o
 BUILTIN_OBJS += builtin/merge-base.o
 BUILTIN_OBJS += builtin/merge-file.o
 BUILTIN_OBJS += builtin/merge-index.o
+BUILTIN_OBJS += builtin/merge-octopus.o
 BUILTIN_OBJS += builtin/merge-one-file.o
 BUILTIN_OBJS += builtin/merge-ours.o
 BUILTIN_OBJS += builtin/merge-recursive.o
diff --git a/builtin.h b/builtin.h
index 35e91c16d0..50225404a0 100644
--- a/builtin.h
+++ b/builtin.h
@@ -176,6 +176,7 @@ int cmd_maintenance(int argc, const char **argv, const char *prefix);
 int cmd_merge(int argc, const char **argv, const char *prefix);
 int cmd_merge_base(int argc, const char **argv, const char *prefix);
 int cmd_merge_index(int argc, const char **argv, const char *prefix);
+int cmd_merge_octopus(int argc, const char **argv, const char *prefix);
 int cmd_merge_ours(int argc, const char **argv, const char *prefix);
 int cmd_merge_file(int argc, const char **argv, const char *prefix);
 int cmd_merge_one_file(int argc, const char **argv, const char *prefix);
diff --git a/builtin/merge-octopus.c b/builtin/merge-octopus.c
new file mode 100644
index 0000000000..abf0981fe8
--- /dev/null
+++ b/builtin/merge-octopus.c
@@ -0,0 +1,69 @@
+/*
+ * Builtin "git merge-octopus"
+ *
+ * Copyright (c) 2020 Alban Gruin
+ *
+ * Based on git-merge-octopus.sh, written by Junio C Hamano.
+ *
+ * Resolve two or more trees.
+ */
+
+#include "cache.h"
+#include "builtin.h"
+#include "commit.h"
+#include "merge-strategies.h"
+
+static const char builtin_merge_octopus_usage[] =
+	"git merge-octopus [<bases>...] -- <head> <remote1> <remote2> [<remotes>...]";
+
+int cmd_merge_octopus(int argc, const char **argv, const char *prefix)
+{
+	int i, sep_seen = 0;
+	struct commit_list *bases = NULL, *remotes = NULL;
+	struct commit_list **next_base = &bases, **next_remote = &remotes;
+	const char *head_arg = NULL;
+
+	if (argc < 5)
+		usage(builtin_merge_octopus_usage);
+
+	setup_work_tree();
+	if (repo_read_index(the_repository) < 0)
+		die("corrupted cache");
+
+	/*
+	 * The first parameters up to -- are merge bases; the rest are
+	 * heads.
+	 */
+	for (i = 1; i < argc; i++) {
+		if (strcmp(argv[i], "--") == 0)
+			sep_seen = 1;
+		else if (strcmp(argv[i], "-h") == 0)
+			usage(builtin_merge_octopus_usage);
+		else if (sep_seen && !head_arg)
+			head_arg = argv[i];
+		else {
+			struct object_id oid;
+
+			get_oid(argv[i], &oid);
+
+			if (!oideq(&oid, the_hash_algo->empty_tree)) {
+				struct commit *commit;
+				commit = lookup_commit_or_die(&oid, argv[i]);
+
+				if (sep_seen)
+					next_remote = commit_list_append(commit, next_remote);
+				else
+					next_base = commit_list_append(commit, next_base);
+			}
+		}
+	}
+
+	/*
+	 * Reject if this is not an octopus -- resolve should be used
+	 * instead.
+	 */
+	if (commit_list_count(remotes) < 2)
+		return 2;
+
+	return merge_strategies_octopus(the_repository, bases, head_arg, remotes);
+}
diff --git a/git-merge-octopus.sh b/git-merge-octopus.sh
deleted file mode 100755
index 7d19d37951..0000000000
--- a/git-merge-octopus.sh
+++ /dev/null
@@ -1,112 +0,0 @@
-#!/bin/sh
-#
-# Copyright (c) 2005 Junio C Hamano
-#
-# Resolve two or more trees.
-#
-
-. git-sh-setup
-
-LF='
-'
-
-# The first parameters up to -- are merge bases; the rest are heads.
-bases= head= remotes= sep_seen=
-for arg
-do
-	case ",$sep_seen,$head,$arg," in
-	*,--,)
-		sep_seen=yes
-		;;
-	,yes,,*)
-		head=$arg
-		;;
-	,yes,*)
-		remotes="$remotes$arg "
-		;;
-	*)
-		bases="$bases$arg "
-		;;
-	esac
-done
-
-# Reject if this is not an octopus -- resolve should be used instead.
-case "$remotes" in
-?*' '?*)
-	;;
-*)
-	exit 2 ;;
-esac
-
-# MRC is the current "merge reference commit"
-# MRT is the current "merge result tree"
-
-if ! git diff-index --quiet --cached HEAD --
-then
-    gettextln "Error: Your local changes to the following files would be overwritten by merge"
-    git diff-index --cached --name-only HEAD -- | sed -e 's/^/    /'
-    exit 2
-fi
-MRC=$(git rev-parse --verify -q $head)
-MRT=$(git write-tree)
-NON_FF_MERGE=0
-OCTOPUS_FAILURE=0
-for SHA1 in $remotes
-do
-	case "$OCTOPUS_FAILURE" in
-	1)
-		# We allow only last one to have a hand-resolvable
-		# conflicts.  Last round failed and we still had
-		# a head to merge.
-		gettextln "Automated merge did not work."
-		gettextln "Should not be doing an octopus."
-		exit 2
-	esac
-
-	eval pretty_name=\${GITHEAD_$SHA1:-$SHA1}
-	if test "$SHA1" = "$pretty_name"
-	then
-		SHA1_UP="$(echo "$SHA1" | tr a-z A-Z)"
-		eval pretty_name=\${GITHEAD_$SHA1_UP:-$pretty_name}
-	fi
-	common=$(git merge-base --all $SHA1 $MRC) ||
-		die "$(eval_gettext "Unable to find common commit with \$pretty_name")"
-
-	case "$LF$common$LF" in
-	*"$LF$SHA1$LF"*)
-		eval_gettextln "Already up to date with \$pretty_name"
-		continue
-		;;
-	esac
-
-	if test "$common,$NON_FF_MERGE" = "$MRC,0"
-	then
-		# The first head being merged was a fast-forward.
-		# Advance MRC to the head being merged, and use that
-		# tree as the intermediate result of the merge.
-		# We still need to count this as part of the parent set.
-
-		eval_gettextln "Fast-forwarding to: \$pretty_name"
-		git read-tree -u -m $head $SHA1 || exit
-		MRC=$SHA1 MRT=$(git write-tree)
-		continue
-	fi
-
-	NON_FF_MERGE=1
-
-	eval_gettextln "Trying simple merge with \$pretty_name"
-	git read-tree -u -m --aggressive  $common $MRT $SHA1 || exit 2
-	next=$(git write-tree 2>/dev/null)
-	if test $? -ne 0
-	then
-		gettextln "Simple merge did not work, trying automatic merge."
-		git merge-index -o git-merge-one-file -a ||
-		OCTOPUS_FAILURE=1
-		next=$(git write-tree 2>/dev/null)
-	fi
-
-	MRC="$MRC $SHA1"
-	MRT=$next
-done
-
-exit "$OCTOPUS_FAILURE"
diff --git a/git.c b/git.c
index 64a1a1de41..d51fb5d2bf 100644
--- a/git.c
+++ b/git.c
@@ -539,6 +539,7 @@ static struct cmd_struct commands[] = {
 	{ "merge-base", cmd_merge_base, RUN_SETUP },
 	{ "merge-file", cmd_merge_file, RUN_SETUP_GENTLY },
 	{ "merge-index", cmd_merge_index, RUN_SETUP | NO_PARSEOPT },
+	{ "merge-octopus", cmd_merge_octopus, RUN_SETUP | NO_PARSEOPT },
 	{ "merge-ours", cmd_merge_ours, RUN_SETUP | NO_PARSEOPT },
 	{ "merge-one-file", cmd_merge_one_file, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
 	{ "merge-recursive", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
diff --git a/merge-strategies.c b/merge-strategies.c
index 6b4b3d03a6..37c662094e 100644
--- a/merge-strategies.c
+++ b/merge-strategies.c
@@ -1,5 +1,6 @@
 #include "cache.h"
 #include "cache-tree.h"
+#include "commit-reach.h"
 #include "dir.h"
 #include "ll-merge.h"
 #include "lockfile.h"
@@ -407,3 +408,206 @@ int merge_strategies_resolve(struct repository *r,
 	rollback_lock_file(&lock);
 	return 2;
 }
+
+static int fast_forward(struct repository *r, const struct object_id *oids,
+			int nr, int aggressive)
+{
+	int i;
+	struct tree_desc t[MAX_UNPACK_TREES];
+	struct unpack_trees_options opts;
+	struct lock_file lock = LOCK_INIT;
+
+	repo_read_index_preload(r, NULL, 0);
+	if (refresh_index(r->index, REFRESH_QUIET, NULL, NULL, NULL))
+		return -1;
+
+	repo_hold_locked_index(r, &lock, LOCK_DIE_ON_ERROR);
+
+	memset(&opts, 0, sizeof(opts));
+	opts.head_idx = 1;
+	opts.src_index = r->index;
+	opts.dst_index = r->index;
+	opts.merge = 1;
+	opts.update = 1;
+	opts.aggressive = aggressive;
+
+	for (i = 0; i < nr; i++) {
+		struct tree *tree;
+		tree = parse_tree_indirect(oids + i);
+		if (parse_tree(tree))
+			return -1;
+		init_tree_desc(t + i, tree->buffer, tree->size);
+	}
+
+	if (nr == 1)
+		opts.fn = oneway_merge;
+	else if (nr == 2) {
+		opts.fn = twoway_merge;
+		opts.initial_checkout = is_index_unborn(r->index);
+	} else if (nr >= 3) {
+		opts.fn = threeway_merge;
+		opts.head_idx = nr - 1;
+	}
+
+	if (unpack_trees(nr, t, &opts))
+		return -1;
+
+	if (write_locked_index(r->index, &lock, COMMIT_LOCK))
+		return error(_("unable to write new index file"));
+
+	return 0;
+}
+
+static int write_tree(struct repository *r, struct tree **reference_tree)
+{
+	struct object_id oid;
+	int ret;
+
+	ret = write_index_as_tree(&oid, r->index, r->index_file, 0, NULL);
+	if (!ret)
+		*reference_tree = lookup_tree(r, &oid);
+
+	return ret;
+}
+
+int merge_strategies_octopus(struct repository *r,
+			     struct commit_list *bases, const char *head_arg,
+			     struct commit_list *remotes)
+{
+	int non_ff_merge = 0, ret = 0, references = 1;
+	struct commit **reference_commit;
+	struct tree *reference_tree;
+	struct commit_list *j;
+	struct object_id head;
+	struct strbuf sb = STRBUF_INIT;
+
+	get_oid(head_arg, &head);
+
+	reference_commit = xcalloc(commit_list_count(remotes) + 1, sizeof(struct commit *));
+	reference_commit[0] = lookup_commit_reference(r, &head);
+	reference_tree = repo_get_commit_tree(r, reference_commit[0]);
+
+	if (repo_index_has_changes(r, reference_tree, &sb)) {
+		error(_("Your local changes to the following files "
+			"would be overwritten by merge:\n  %s"),
+		      sb.buf);
+		strbuf_release(&sb);
+		ret = 2;
+		goto out;
+	}
+
+	for (j = remotes; j && j->item; j = j->next) {
+		struct commit *c = j->item;
+		struct object_id *oid = &c->object.oid;
+		struct commit_list *common, *k;
+		char *branch_name;
+		int can_ff = 1;
+
+		if (ret) {
+			/*
+			 * We allow only last one to have a
+			 * hand-resolvable conflicts.  Last round failed
+			 * and we still had a head to merge.
+			 */
+			puts(_("Automated merge did not work."));
+			puts(_("Should not be doing an octopus."));
+
+			ret = 2;
+			goto out;
+		}
+
+		branch_name = merge_get_better_branch_name(oid_to_hex(oid));
+		common = get_merge_bases_many(c, references, reference_commit);
+
+		if (!common)
+			die(_("Unable to find common commit with %s"), branch_name);
+
+		for (k = common; k && !oideq(&k->item->object.oid, oid); k = k->next);
+
+		if (k) {
+			printf(_("Already up to date with %s\n"), branch_name);
+			free(branch_name);
+			free_commit_list(common);
+			continue;
+		}
+
+		if (!non_ff_merge) {
+			int i;
+
+			for (i = 0, k = common; k && i < references && can_ff; k = k->next, i++) {
+				can_ff = oideq(&k->item->object.oid,
+					       &reference_commit[i]->object.oid);
+			}
+		}
+
+		if (!non_ff_merge && can_ff) {
+			/*
+			 * The first head being merged was a
+			 * fast-forward.  Advance the reference commit
+			 * to the head being merged, and use that tree
+			 * as the intermediate result of the merge.  We
+			 * still need to count this as part of the
+			 * parent set.
+			 */
+			struct object_id oids[2];
+			printf(_("Fast-forwarding to: %s\n"), branch_name);
+
+			oidcpy(oids, &head);
+			oidcpy(oids + 1, oid);
+
+			ret = fast_forward(r, oids, 2, 0);
+			if (ret) {
+				free(branch_name);
+				free_commit_list(common);
+				goto out;
+			}
+
+			references = 0;
+			write_tree(r, &reference_tree);
+		} else {
+			int i = 0;
+			struct tree *next = NULL;
+			struct object_id oids[MAX_UNPACK_TREES];
+
+			non_ff_merge = 1;
+			printf(_("Trying simple merge with %s\n"), branch_name);
+
+			for (k = common; k; k = k->next)
+				oidcpy(oids + (i++), &k->item->object.oid);
+
+			oidcpy(oids + (i++), &reference_tree->object.oid);
+			oidcpy(oids + (i++), oid);
+
+			if (fast_forward(r, oids, i, 1)) {
+				ret = 2;
+
+				free(branch_name);
+				free_commit_list(common);
+
+				goto out;
+			}
+
+			if (write_tree(r, &next)) {
+				struct lock_file lock = LOCK_INIT;
+
+				puts(_("Simple merge did not work, trying automatic merge."));
+				repo_hold_locked_index(r, &lock, LOCK_DIE_ON_ERROR);
+				ret = !!merge_all(r->index, 0, 0, merge_one_file_cb, r);
+				write_locked_index(r->index, &lock, COMMIT_LOCK);
+
+				write_tree(r, &next);
+			}
+
+			reference_tree = next;
+		}
+
+		reference_commit[references++] = c;
+
+		free(branch_name);
+		free_commit_list(common);
+	}
+
+out:
+	free(reference_commit);
+	return ret;
+}
diff --git a/merge-strategies.h b/merge-strategies.h
index 778f8ce9d6..938411a04e 100644
--- a/merge-strategies.h
+++ b/merge-strategies.h
@@ -37,5 +37,8 @@ int merge_all(struct index_state *istate, int oneshot, int quiet,
 int merge_strategies_resolve(struct repository *r,
 			     struct commit_list *bases, const char *head_arg,
 			     struct commit_list *remote);
+int merge_strategies_octopus(struct repository *r,
+			     struct commit_list *bases, const char *head_arg,
+			     struct commit_list *remote);
 
 #endif /* MERGE_STRATEGIES_H */
-- 
2.28.0.662.ge304723957



* [PATCH v3 08/11] merge: use the "resolve" strategy without forking
  2020-10-05 12:26   ` [PATCH v3 00/11] Rewrite the remaining merge strategies from shell to C Alban Gruin
                       ` (6 preceding siblings ...)
  2020-10-05 12:26     ` [PATCH v3 07/11] merge-octopus: rewrite in C Alban Gruin
@ 2020-10-05 12:26     ` Alban Gruin
  2020-10-05 12:26     ` [PATCH v3 09/11] merge: use the "octopus" " Alban Gruin
                       ` (4 subsequent siblings)
  12 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2020-10-05 12:26 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, phillip.wood, Alban Gruin

This teaches `git merge' to invoke the "resolve" strategy with a
function call instead of forking.
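
The general shape of "call in-process when we know the strategy, fork
otherwise" can be sketched as below.  Note that this sketch uses a small
lookup table, whereas the patch simply extends the existing if/else
chain in try_merge_strategy() with a strcmp().  All names here
(strategy_fn, run_resolve(), run_external(), find_strategy()) are
hypothetical stand-ins; the real functions take a repository, commit
lists and a head argument.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*strategy_fn)(void);

/* Hypothetical stand-in for an in-process strategy. */
static int run_resolve(void)
{
	return 0;
}

/* Hypothetical stand-in for spawning "git merge-<strategy>". */
static int run_external(void)
{
	return 127;
}

struct strategy_entry {
	const char *name;
	strategy_fn fn;
};

/* Strategies that can run without forking. */
static const struct strategy_entry builtin_strategies[] = {
	{ "resolve", run_resolve },
};

/* Return the in-process function for `name', or the forking fallback. */
static strategy_fn find_strategy(const char *name)
{
	size_t i;
	size_t nr = sizeof(builtin_strategies) / sizeof(builtin_strategies[0]);

	assert(name);

	for (i = 0; i < nr; i++)
		if (!strcmp(builtin_strategies[i].name, name))
			return builtin_strategies[i].fn;
	return run_external;
}
```

Strategies that are not in the table, including user-supplied ones,
still go through the forking path, so the external commands keep
working.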

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 builtin/merge.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/builtin/merge.c b/builtin/merge.c
index 9d5359edc2..ddfefd8ce3 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -41,6 +41,7 @@
 #include "commit-reach.h"
 #include "wt-status.h"
 #include "commit-graph.h"
+#include "merge-strategies.h"
 
 #define DEFAULT_TWOHEAD (1<<0)
 #define DEFAULT_OCTOPUS (1<<1)
@@ -740,7 +741,10 @@ static int try_merge_strategy(const char *strategy, struct commit_list *common,
 				       COMMIT_LOCK | SKIP_IF_UNCHANGED))
 			die(_("unable to write %s"), get_index_file());
 		return clean ? 0 : 1;
-	} else {
+	} else if (!strcmp(strategy, "resolve"))
+		return merge_strategies_resolve(the_repository, common,
+						head_arg, remoteheads);
+	else {
 		return try_merge_command(the_repository,
 					 strategy, xopts_nr, xopts,
 					 common, head_arg, remoteheads);
-- 
2.28.0.662.ge304723957



* [PATCH v3 09/11] merge: use the "octopus" strategy without forking
  2020-10-05 12:26   ` [PATCH v3 00/11] Rewrite the remaining merge strategies from shell to C Alban Gruin
                       ` (7 preceding siblings ...)
  2020-10-05 12:26     ` [PATCH v3 08/11] merge: use the "resolve" strategy without forking Alban Gruin
@ 2020-10-05 12:26     ` Alban Gruin
  2020-10-05 12:26     ` [PATCH v3 10/11] sequencer: use the "resolve" " Alban Gruin
                       ` (3 subsequent siblings)
  12 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2020-10-05 12:26 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, phillip.wood, Alban Gruin

This teaches `git merge' to invoke the "octopus" strategy with a
function call instead of forking.

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 builtin/merge.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/builtin/merge.c b/builtin/merge.c
index ddfefd8ce3..02a2367647 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -744,6 +744,9 @@ static int try_merge_strategy(const char *strategy, struct commit_list *common,
 	} else if (!strcmp(strategy, "resolve"))
 		return merge_strategies_resolve(the_repository, common,
 						head_arg, remoteheads);
+	else if (!strcmp(strategy, "octopus"))
+		return merge_strategies_octopus(the_repository, common,
+						head_arg, remoteheads);
 	else {
 		return try_merge_command(the_repository,
 					 strategy, xopts_nr, xopts,
-- 
2.28.0.662.ge304723957



* [PATCH v3 10/11] sequencer: use the "resolve" strategy without forking
  2020-10-05 12:26   ` [PATCH v3 00/11] Rewrite the remaining merge strategies from shell to C Alban Gruin
                       ` (8 preceding siblings ...)
  2020-10-05 12:26     ` [PATCH v3 09/11] merge: use the "octopus" " Alban Gruin
@ 2020-10-05 12:26     ` Alban Gruin
  2020-10-05 12:26     ` [PATCH v3 11/11] sequencer: use the "octopus" merge " Alban Gruin
                       ` (2 subsequent siblings)
  12 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2020-10-05 12:26 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, phillip.wood, Alban Gruin

This teaches the sequencer to invoke the "resolve" strategy with a
function call instead of forking.

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 sequencer.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/sequencer.c b/sequencer.c
index e8676e965f..ff411d54af 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -33,6 +33,7 @@
 #include "commit-reach.h"
 #include "rebase-interactive.h"
 #include "reset.h"
+#include "merge-strategies.h"
 
 #define GIT_REFLOG_ACTION "GIT_REFLOG_ACTION"
 
@@ -2000,9 +2001,15 @@ static int do_pick_commit(struct repository *r,
 
 		commit_list_insert(base, &common);
 		commit_list_insert(next, &remotes);
-		res |= try_merge_command(r, opts->strategy,
-					 opts->xopts_nr, (const char **)opts->xopts,
-					common, oid_to_hex(&head), remotes);
+
+		if (!strcmp(opts->strategy, "resolve")) {
+			repo_read_index(r);
+			res |= merge_strategies_resolve(r, common, oid_to_hex(&head), remotes);
+		} else
+			res |= try_merge_command(r, opts->strategy,
+						 opts->xopts_nr, (const char **)opts->xopts,
+						 common, oid_to_hex(&head), remotes);
+
 		free_commit_list(common);
 		free_commit_list(remotes);
 	}
-- 
2.28.0.662.ge304723957



* [PATCH v3 11/11] sequencer: use the "octopus" merge strategy without forking
  2020-10-05 12:26   ` [PATCH v3 00/11] Rewrite the remaining merge strategies from shell to C Alban Gruin
                       ` (9 preceding siblings ...)
  2020-10-05 12:26     ` [PATCH v3 10/11] sequencer: use the "resolve" " Alban Gruin
@ 2020-10-05 12:26     ` Alban Gruin
  2020-10-07  6:57     ` [PATCH v3 00/11] Rewrite the remaining merge strategies from shell to C Johannes Schindelin
  2020-11-13 11:04     ` [PATCH v4 00/12] " Alban Gruin
  12 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2020-10-05 12:26 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, phillip.wood, Alban Gruin

This teaches the sequencer to invoke the "octopus" strategy with a
function call instead of forking.

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 sequencer.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/sequencer.c b/sequencer.c
index ff411d54af..746afad930 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -2005,6 +2005,9 @@ static int do_pick_commit(struct repository *r,
 		if (!strcmp(opts->strategy, "resolve")) {
 			repo_read_index(r);
 			res |= merge_strategies_resolve(r, common, oid_to_hex(&head), remotes);
+		} else if (!strcmp(opts->strategy, "octopus")) {
+			repo_read_index(r);
+			res |= merge_strategies_octopus(r, common, oid_to_hex(&head), remotes);
 		} else
 			res |= try_merge_command(r, opts->strategy,
 						 opts->xopts_nr, (const char **)opts->xopts,
-- 
2.28.0.662.ge304723957



* Re: [PATCH v3 01/11] t6027: modernise tests
  2020-10-05 12:26     ` [PATCH v3 01/11] t6027: modernise tests Alban Gruin
@ 2020-10-06 20:50       ` Junio C Hamano
  0 siblings, 0 replies; 221+ messages in thread
From: Junio C Hamano @ 2020-10-06 20:50 UTC (permalink / raw)
  To: Alban Gruin; +Cc: git, phillip.wood

Alban Gruin <alban.gruin@gmail.com> writes:

> Some tests in t6027 uses a if/then/else to check if a command failed or

s/uses/use/;

> not, but we have the `test_must_fail' function to do it correctly for us
> nowadays.

Makes sense.  The patch text reads well, too.

> Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
> ---
>  t/t6407-merge-binary.sh | 27 ++++++---------------------
>  1 file changed, 6 insertions(+), 21 deletions(-)
>
> diff --git a/t/t6407-merge-binary.sh b/t/t6407-merge-binary.sh
> index 4e6c7cb77e..071d3f7343 100755
> --- a/t/t6407-merge-binary.sh
> +++ b/t/t6407-merge-binary.sh
> @@ -5,7 +5,6 @@ test_description='ask merge-recursive to merge binary files'
>  . ./test-lib.sh
>  
>  test_expect_success setup '
> -
>  	cat "$TEST_DIRECTORY"/test-binary-1.png >m &&
>  	git add m &&
>  	git ls-files -s | sed -e "s/ 0	/ 1	/" >E1 &&
> @@ -35,33 +34,19 @@ test_expect_success setup '
>  '
>  
>  test_expect_success resolve '
> -
>  	rm -f a* m* &&
>  	git reset --hard anchor &&
> -
> -	if git merge -s resolve master
> -	then
> -		echo Oops, should not have succeeded
> -		false
> -	else
> -		git ls-files -s >current
> -		test_cmp expect current
> -	fi
> +	test_must_fail git merge -s resolve master &&
> +	git ls-files -s >current &&
> +	test_cmp expect current
>  '
>  
>  test_expect_success recursive '
> -
>  	rm -f a* m* &&
>  	git reset --hard anchor &&
> -
> -	if git merge -s recursive master
> -	then
> -		echo Oops, should not have succeeded
> -		false
> -	else
> -		git ls-files -s >current
> -		test_cmp expect current
> -	fi
> +	test_must_fail git merge -s recursive master &&
> +	git ls-files -s >current &&
> +	test_cmp expect current
>  '
>  
>  test_done


* Re: [PATCH v3 02/11] merge-one-file: rewrite in C
  2020-10-05 12:26     ` [PATCH v3 02/11] merge-one-file: rewrite in C Alban Gruin
@ 2020-10-06 22:01       ` Junio C Hamano
  2020-10-21 19:47         ` Alban Gruin
  0 siblings, 1 reply; 221+ messages in thread
From: Junio C Hamano @ 2020-10-06 22:01 UTC (permalink / raw)
  To: Alban Gruin; +Cc: git, phillip.wood

Alban Gruin <alban.gruin@gmail.com> writes:

> This rewrites `git merge-one-file' from shell to C.  This port is not
> completely straightforward: to save precious cycles by avoiding reading
> and flushing the index repeatedly, write temporary files when an
> operation can be performed in-memory, or allow other function to use the
> rewrite without forking nor worrying about the index,...

So, the in-core index is still used, but when the contents of the in-core
index do not have to be written out to disk, we just don't?  Makes sense.

> diff --git a/builtin/merge-one-file.c b/builtin/merge-one-file.c
> new file mode 100644
> index 0000000000..598338ba16
> --- /dev/null
> +++ b/builtin/merge-one-file.c
> @@ -0,0 +1,92 @@
> +/*
> + * Builtin "git merge-one-file"
> + *
> + * Copyright (c) 2020 Alban Gruin
> + *
> + * Based on git-merge-one-file.sh, written by Linus Torvalds.
> + *
> + * This is the git per-file merge utility, called with
> + *
> + *   argv[1] - original file SHA1 (or empty)
> + *   argv[2] - file in branch1 SHA1 (or empty)
> + *   argv[3] - file in branch2 SHA1 (or empty)

Let's modernize this comment while we are at it.

    SHA1 -> "object name" (or "blob object name")

> + *   argv[4] - pathname in repository
> + *   argv[5] - original file mode (or empty)
> + *   argv[6] - file in branch1 mode (or empty)
> + *   argv[7] - file in branch2 mode (or empty)
> + *
> + * Handle some trivial cases. The _really_ trivial cases have been
> + * handled already by git read-tree, but that one doesn't do any merges
> + * that might change the tree layout.
> + */
> +
> +#define USE_THE_INDEX_COMPATIBILITY_MACROS
> +#include "cache.h"
> +#include "builtin.h"
> +#include "lockfile.h"
> +#include "merge-strategies.h"
> +
> +static const char builtin_merge_one_file_usage[] =
> +	"git merge-one-file <orig blob> <our blob> <their blob> <path> "
> +	"<orig mode> <our mode> <their mode>\n\n"
> +	"Blob ids and modes should be empty for missing files.";
> +
> +static int read_mode(const char *name, const char *arg, unsigned int *mode)
> +{
> +	char *last;
> +	int ret = 0;
> +
> +	*mode = strtol(arg, &last, 8);
> +
> +	if (*last)
> +		ret = error(_("invalid '%s' mode: expected nothing, got '%c'"), name, *last);
> +	else if (!(S_ISREG(*mode) || S_ISDIR(*mode) || S_ISLNK(*mode)))
> +		ret = error(_("invalid '%s' mode: %o"), name, *mode);
> +
> +	return ret;
> +}
> +
> +int cmd_merge_one_file(int argc, const char **argv, const char *prefix)
> +{
> +	struct object_id orig_blob, our_blob, their_blob,
> +		*p_orig_blob = NULL, *p_our_blob = NULL, *p_their_blob = NULL;
> +	unsigned int orig_mode = 0, our_mode = 0, their_mode = 0, ret = 0;
> +	struct lock_file lock = LOCK_INIT;
> +
> +	if (argc != 8)
> +		usage(builtin_merge_one_file_usage);
> +
> +	if (read_cache() < 0)
> +		die("invalid index");
> +
> +	hold_locked_index(&lock, LOCK_DIE_ON_ERROR);
> +
> +	if (!get_oid(argv[1], &orig_blob)) {
> +		p_orig_blob = &orig_blob;
> +		ret = read_mode("orig", argv[5], &orig_mode);
> +	}

argv[1] is defined as "either the object name of the blob in the
common ancestor, or an empty string".  So you need to distinguish
three cases here, but you are only catching two.

 - argv[1] is an empty string; p_orig_blob can legitimately be left
   NULL.

 - argv[1] is a valid blob object name.  orig_blob should be
   populated and p_orig_blob should point at it.

 - argv[1] is garbage, names a non-blob object, or there is no such
   object with that name.  Don't we want to catch it as a mistake?

Also, when argv[1] is an empty string, argv[5] must also be an empty
string, or we got a wrong input---don't we want to catch it as a
mistake?

The third case needs a bit of thought.  For example, if $1 and $2
are the same and point at a non-existent object, we know we won't
care because we only care about $3.  In a lazily-cloned repository,
that may matter---we would not want to fail even if we do not have
blobs $1 and $2, as long as they are reasonably spelled as full
hexadecimal object names.  But we would want to fail if the blob
object named by $3 is missing.

One way to achieve semantics closer to the above than the posted
patch may be to tighten the parsing.  Instead of using "anything
goes" get_oid(), use get_oid_hex(), perhaps.
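
To make the three cases concrete, here is a minimal standalone sketch
of the stricter parsing.  It is not git code: the enum, both function
names and the hexsz parameter are hypothetical, and it only checks
spelling (it does not look the object up, which is the point for the
lazily-cloned case).  It also covers the rule that an empty blob
argument must come with an empty mode argument.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result codes for this sketch; none of these names come from git. */
enum blob_arg_kind {
	BLOB_ARG_MISSING,	/* empty string: no file on that side */
	BLOB_ARG_OBJECT,	/* well-formed full hexadecimal object name */
	BLOB_ARG_INVALID	/* anything else: treat as a caller mistake */
};

/* Classify one blob argument by spelling only; hexsz is 40 for SHA-1. */
static enum blob_arg_kind classify_blob_arg(const char *arg, size_t hexsz)
{
	size_t i;

	if (!*arg)
		return BLOB_ARG_MISSING;
	if (strlen(arg) != hexsz)
		return BLOB_ARG_INVALID;
	for (i = 0; i < hexsz; i++)
		if (!isxdigit((unsigned char)arg[i]))
			return BLOB_ARG_INVALID;
	return BLOB_ARG_OBJECT;
}

/*
 * A blob name and its mode must be both present or both empty.
 * Returns 0 when the pair is consistent, -1 otherwise.
 */
static int check_blob_and_mode(const char *blob, const char *mode, size_t hexsz)
{
	switch (classify_blob_arg(blob, hexsz)) {
	case BLOB_ARG_MISSING:
		return *mode ? -1 : 0;
	case BLOB_ARG_OBJECT:
		return *mode ? 0 : -1;
	default:
		return -1;
	}
}
```

With something like this, "HEAD" or an abbreviated name would be
rejected up front, while a full-length name is accepted whether or not
the object is present locally.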

> +	if (!get_oid(argv[2], &our_blob)) {
> +		p_our_blob = &our_blob;
> +		ret = read_mode("our", argv[6], &our_mode);
> +	}
> +
> +	if (!get_oid(argv[3], &their_blob)) {
> +		p_their_blob = &their_blob;
> +		ret = read_mode("their", argv[7], &their_mode);
> +	}
> +
> +	if (ret)
> +		return ret;
> +
> +	ret = merge_strategies_one_file(the_repository,
> +					p_orig_blob, p_our_blob, p_their_blob, argv[4],
> +					orig_mode, our_mode, their_mode);

That's a funny function name.  It's not like the function will be
taught different strategies to handle the three-way merge, no?  It
probably makes sense to name it after what it does, which is "three
way merge".

> +	if (ret) {
> +		rollback_lock_file(&lock);
> +		return !!ret;
> +	}
> +
> +	return write_locked_index(&the_index, &lock, COMMIT_LOCK);
> +}

> diff --git a/merge-strategies.c b/merge-strategies.c
> new file mode 100644
> index 0000000000..bbe6f48698
> --- /dev/null
> +++ b/merge-strategies.c
> @@ -0,0 +1,214 @@
> +#include "cache.h"
> +#include "dir.h"
> +#include "ll-merge.h"
> +#include "merge-strategies.h"
> +#include "xdiff-interface.h"
> +

> +static int add_to_index_cacheinfo(struct index_state *istate,
> +				  unsigned int mode,
> +				  const struct object_id *oid, const char *path)
> +{
> +	struct cache_entry *ce;
> +	int len, option;
> +
> +	if (!verify_path(path, mode))
> +		return error(_("Invalid path '%s'"), path);
> +
> +	len = strlen(path);
> +	ce = make_empty_cache_entry(istate, len);
> +
> +	oidcpy(&ce->oid, oid);
> +	memcpy(ce->name, path, len);
> +	ce->ce_flags = create_ce_flags(0);
> +	ce->ce_namelen = len;
> +	ce->ce_mode = create_ce_mode(mode);
> +	if (assume_unchanged)
> +		ce->ce_flags |= CE_VALID;
> +	option = ADD_CACHE_OK_TO_ADD | ADD_CACHE_OK_TO_REPLACE;
> +	if (add_index_entry(istate, ce, option))
> +		return error(_("%s: cannot add to the index"), path);
> +
> +	return 0;
> +}

The above correctly does 'git update-index --add --cacheinfo "$6"
"$2" "$4"', but don't copy-and-paste existing code to do so.  Add one
preliminary patch before everything else in the series to massage
and extract the add_cacheinfo() function out of builtin/update-index.c
and move it somewhere common like read-cache.c, so that we can
call it from here.

> +static int checkout_from_index(struct index_state *istate, const char *path)
> +{
> +	struct checkout state = CHECKOUT_INIT;
> +	struct cache_entry *ce;
> +
> +	state.istate = istate;
> +	state.force = 1;
> +	state.base_dir = "";
> +	state.base_dir_len = 0;
> +
> +	ce = index_file_exists(istate, path, strlen(path), 0);

This call is unfortunate for the reasons I mention later.

But if you must have this call, then you need to sanity check what
you get from index_file_exists().  ce must be a merged cache entry,
so

	if (!ce || ce_stage(ce))
		BUG(...);

> +	if (checkout_entry(ce, &state, NULL, NULL) < 0)
> +		return error(_("%s: cannot checkout file"), path);
> +	return 0;
> +}
> +
> +static int merge_one_file_deleted(struct index_state *istate,
> +				  const struct object_id *orig_blob,
> +				  const struct object_id *our_blob,
> +				  const struct object_id *their_blob, const char *path,
> +				  unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode)
> +{
> +	if ((our_blob && orig_mode != our_mode) ||
> +	    (their_blob && orig_mode != their_mode))
> +		return error(_("File %s deleted on one branch but had its "
> +			       "permissions changed on the other."), path);
> +
> +	if (our_blob) {
> +		printf(_("Removing %s\n"), path);
> +
> +		if (file_exists(path))
> +			remove_path(path);
> +	}
> +
> +	if (remove_file_from_index(istate, path))
> +		return error("%s: cannot remove from the index", path);
> +	return 0;

If the side that did not remove changed the mode, we don't silently
remove but fail and give the end user a chance to inspect the
situation.  If we had the blob and it is removed by them, we give a
message and only in that case we remove the file from the working
tree, together with any leading directory that has become empty.

And after that we make sure that the path is no longer in the
index.  The function removes entries for the path at all the stages,
which is exactly what we want.

OK.

> +}
> +
> +static int do_merge_one_file(struct index_state *istate,
> +			     const struct object_id *orig_blob,
> +			     const struct object_id *our_blob,
> +			     const struct object_id *their_blob, const char *path,
> +			     unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode)
> +{
> +	int ret, i, dest;
> +	ssize_t written;
> +	mmbuffer_t result = {NULL, 0};
> +	mmfile_t mmfs[3];
> +	struct ll_merge_options merge_opts = {0};
> +	struct cache_entry *ce;
> +
> +	if (our_mode == S_IFLNK || their_mode == S_IFLNK)
> +		return error(_("%s: Not merging symbolic link changes."), path);
> +	else if (our_mode == S_IFGITLINK || their_mode == S_IFGITLINK)
> +		return error(_("%s: Not merging conflicting submodule changes."), path);
> +
> +	read_mmblob(mmfs + 1, our_blob);
> +	read_mmblob(mmfs + 2, their_blob);
> +
> +	if (orig_blob) {
> +		printf(_("Auto-merging %s\n"), path);
> +		read_mmblob(mmfs + 0, orig_blob);
> +	} else {
> +		printf(_("Added %s in both, but differently.\n"), path);
> +		read_mmblob(mmfs + 0, &null_oid);
> +	}
> +
> +	merge_opts.xdl_opts = XDL_MERGE_ZEALOUS_ALNUM;
> +	ret = ll_merge(&result, path,
> +		       mmfs + 0, "orig",
> +		       mmfs + 1, "our",
> +		       mmfs + 2, "their",
> +		       istate, &merge_opts);

Is it correct to call into ll_merge() here?  The original used to
call "git merge-file" which called into xdl_merge().  Calling into
ll_merge() means the path is used to look up the attributes and use
the custom merge driver, which I am not offhand sure is what we want
to see at this low level (and if it turns out to be a good idea, we
definitely should explain the change of semantics in the proposed
log message for this commit).

> +	for (i = 0; i < 3; i++)
> +		free(mmfs[i].ptr);
> +
> +	if (ret < 0) {
> +		free(result.ptr);
> +		return error(_("Failed to execute internal merge"));
> +	}
> +
> +	/*
> +	 * Create the working tree file, using "our tree" version from
> +	 * the index, and then store the result of the merge.
> +	 */

The above is copied from the original, to explain what it did after
the comment, but it does not seem to match what the new code does.

> +	ce = index_file_exists(istate, path, strlen(path), 0);
> +	if (!ce)
> +		BUG("file is not present in the cache?");
> +
> +	unlink(path);
> +	if ((dest = open(path, O_WRONLY | O_CREAT, ce->ce_mode)) < 0) {
> +		free(result.ptr);
> +		return error_errno(_("failed to open file '%s'"), path);
> +	}
> +
> +	written = write_in_full(dest, result.ptr, result.size);
> +	close(dest);
> +
> +	free(result.ptr);
> +
> +	if (written < 0)
> +		return error_errno(_("failed to write to '%s'"), path);
> +

This open(..., ce->ce_mode) call is way insufficient.

The comment we have above this part of the code talks about the
difficulty of doing this correctly in scripted version.  Creating a
file by 'git checkout-index -f --stage=2 -- "$4"' and reusing it to
store the merged contents was the cleanest and easiest way without
having direct access to adjust_shared_perm() to create a working
tree file with the correct permission bits.

We are writing in C, so we should be able to do much better than the
scripted version, as we can later call adjust_shared_perm().
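
As a rough illustration of one part of the problem: an index entry
mode such as 0100644 carries file-type bits as well as permission
bits, so it is not a good third argument for open(2) as-is.  The
sketch below is standalone C with a hypothetical helper name; it
requests 0666 or 0777 depending on the executable bit and leaves the
rest to the umask.  This is an assumption about what a reasonable
creation mode looks like, and it does not reproduce
adjust_shared_perm(), which would still have to be called afterwards.

```c
#include <sys/types.h>

/*
 * Hypothetical helper: map an index entry mode (e.g. 0100644 or
 * 0100755) to the permission bits to request from open(2) when
 * creating the working tree file.  The umask trims them further.
 */
static mode_t perm_bits_for_index_mode(unsigned int ce_mode)
{
	/* Only the owner-executable bit of a blob mode is meaningful. */
	return (ce_mode & 0100) ? 0777 : 0666;
}
```

The result would be passed where the patch currently passes
ce->ce_mode, i.e. as the mode argument that accompanies O_CREAT.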

> +	if (ret != 0 || !orig_blob)
> +		ret = error(_("content conflict in %s"), path);
> +	if (our_mode != their_mode)
> +		return error(_("permission conflict: %o->%o,%o in %s"),
> +			     orig_mode, our_mode, their_mode, path);
> +	if (ret)
> +		return -1;
> +
> +	return add_file_to_index(istate, path, 0);
> +}
> +
> +int merge_strategies_one_file(struct repository *r,
> +			      const struct object_id *orig_blob,
> +			      const struct object_id *our_blob,
> +			      const struct object_id *their_blob, const char *path,
> +			      unsigned int orig_mode, unsigned int our_mode,
> +			      unsigned int their_mode)
> +{

In a long if/else if/else if/.../else cascade, enclose all bodies in
braces, if any one of them has a multi-statement body, to avoid
being distracting.
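
For illustration only, a standalone sketch of a fully braced cascade.
The enum, the function name and the needs_checkout out-parameter are
hypothetical, and the conditions are simplified (no object name
comparisons), so this is not a restatement of the patch's logic.

```c
/* Hypothetical outcome labels for this sketch; they are not git identifiers. */
enum toy_outcome {
	TOY_DELETED,
	TOY_ADDED_BY_US,
	TOY_ADDED_BY_THEM,
	TOY_CONTENT_MERGE,
	TOY_UNHANDLED
};

static enum toy_outcome toy_classify(int has_orig, int has_ours,
				     int has_theirs, int *needs_checkout)
{
	*needs_checkout = 0;

	if (has_orig && (!has_ours != !has_theirs)) {
		/* Present in the ancestor, gone on exactly one side. */
		return TOY_DELETED;
	} else if (!has_orig && has_ours && !has_theirs) {
		return TOY_ADDED_BY_US;
	} else if (!has_orig && !has_ours && has_theirs) {
		/*
		 * This arm has two statements, so it needs braces; the
		 * single-statement arms get them too for consistency.
		 */
		*needs_checkout = 1;
		return TOY_ADDED_BY_THEM;
	} else if (has_ours && has_theirs) {
		return TOY_CONTENT_MERGE;
	} else {
		return TOY_UNHANDLED;
	}
}
```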

> +	if (orig_blob &&
> +	    ((!their_blob && our_blob && oideq(orig_blob, our_blob)) ||
> +	     (!our_blob && their_blob && oideq(orig_blob, their_blob))))
> +		/* Deleted in both or deleted in one and unchanged in the other. */
> +		return merge_one_file_deleted(r->index,
> +					      orig_blob, our_blob, their_blob, path,
> +					      orig_mode, our_mode, their_mode);

OK, we've already reviewed that function.

> +	else if (!orig_blob && our_blob && !their_blob) {
> +		/*
> +		 * Added in one.  The other side did not add and we
> +		 * added so there is nothing to be done, except making
> +		 * the path merged.
> +		 */
> +		return add_to_index_cacheinfo(r->index, our_mode, our_blob, path);

OK, we've already reviewed that function.

> +	} else if (!orig_blob && !our_blob && their_blob) {
> +		printf(_("Adding %s\n"), path);
> +
> +		if (file_exists(path))
> +			return error(_("untracked %s is overwritten by the merge."), path);
> +
> +		if (add_to_index_cacheinfo(r->index, their_mode, their_blob, path))
> +			return -1;
> +		return checkout_from_index(r->index, path);

You did "add_to_index_cacheinfo()", so you MUST know which ce is to
be checked out.

Consider whether it is worth teaching add_to_index_cacheinfo() to give
you ce back and pass it to checkout_from_index(); that way, you do
not have to call index_file_exists() based on path in the function.
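
The shape of that interface could look like the sketch below.  It
uses toy stand-ins rather than git's index structures (every name in
it is hypothetical), and only shows the idea: the function that adds
the entry hands it back, so the caller can pass it on without a
second lookup by path.

```c
#include <stddef.h>
#include <string.h>

/* Toy stand-ins for the index and its entries; all names are hypothetical. */
struct toy_entry {
	char name[64];
	unsigned int mode;
};

struct toy_index {
	struct toy_entry entries[16];
	size_t nr;
};

/*
 * Add an entry for `path`, or replace the mode of an existing one, and
 * return it to the caller.  Returns NULL when the path does not fit or
 * the toy index is full.
 */
static struct toy_entry *toy_add_entry(struct toy_index *idx,
				       const char *path, unsigned int mode)
{
	struct toy_entry *e = NULL;
	size_t i;

	if (strlen(path) >= sizeof(idx->entries[0].name))
		return NULL;

	for (i = 0; i < idx->nr; i++)
		if (!strcmp(idx->entries[i].name, path))
			e = &idx->entries[i];

	if (!e) {
		if (idx->nr == sizeof(idx->entries) / sizeof(idx->entries[0]))
			return NULL;
		e = &idx->entries[idx->nr++];
		strcpy(e->name, path);
	}

	e->mode = mode;
	return e;
}
```

A caller would then check the returned pointer for NULL and feed it
directly to the checkout step.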

> +	} else if (!orig_blob && our_blob && their_blob &&
> +		   oideq(our_blob, their_blob)) {
> +		/* Added in both, identically (check for same permissions). */
> +		if (our_mode != their_mode)
> +			return error(_("File %s added identically in both branches, "
> +				       "but permissions conflict %o->%o."),
> +				     path, our_mode, their_mode);
> +
> +		printf(_("Adding %s\n"), path);
> +
> +		if (add_to_index_cacheinfo(r->index, our_mode, our_blob, path))
> +			return -1;
> +		return checkout_from_index(r->index, path);

Ditto.

> +	} else if (our_blob && their_blob)
> +		/* Modified in both, but differently. */
> +		return do_merge_one_file(r->index,
> +					 orig_blob, our_blob, their_blob, path,
> +					 orig_mode, our_mode, their_mode);
> +	else {
> +		char orig_hex[GIT_MAX_HEXSZ] = {0}, our_hex[GIT_MAX_HEXSZ] = {0},
> +			their_hex[GIT_MAX_HEXSZ] = {0};
> +
> +		if (orig_blob)
> +			oid_to_hex_r(orig_hex, orig_blob);
> +		if (our_blob)
> +			oid_to_hex_r(our_hex, our_blob);
> +		if (their_blob)
> +			oid_to_hex_r(their_hex, their_blob);
> +
> +		return error(_("%s: Not handling case %s -> %s -> %s"),
> +			     path, orig_hex, our_hex, their_hex);
> +	}
> +
> +	return 0;
> +}

I can see that this does go in the right direction.  With a bit more
attention to details it would soon be production-ready quality.

Thanks.


* Re: [PATCH v3 00/11] Rewrite the remaining merge strategies from shell to C
  2020-10-05 12:26   ` [PATCH v3 00/11] Rewrite the remaining merge strategies from shell to C Alban Gruin
                       ` (10 preceding siblings ...)
  2020-10-05 12:26     ` [PATCH v3 11/11] sequencer: use the "octopus" merge " Alban Gruin
@ 2020-10-07  6:57     ` Johannes Schindelin
  2020-11-13 11:04     ` [PATCH v4 00/12] " Alban Gruin
  12 siblings, 0 replies; 221+ messages in thread
From: Johannes Schindelin @ 2020-10-07  6:57 UTC (permalink / raw)
  To: Alban Gruin; +Cc: git, Junio C Hamano, phillip.wood

Hi Alban,

On Mon, 5 Oct 2020, Alban Gruin wrote:

> In a effort to reduce the number of shell scripts in git's codebase, I
> propose this patch series converting the two remaining merge strategies,
> resolve and octopus, from shell to C.  This will enable slightly better
> performance, better integration with git itself (no more forking to
> perform these operations), better portability (Windows and shell scripts
> don't mix well).
>
> Three scripts are actually converted: first git-merge-one-file.sh, then
> git-merge-resolve.sh, and finally git-merge-octopus.sh.  Not only they
> are converted, but they also are modified to operate without forking,
> and then libified so they can be used by git without spawning another
> process.
>
> The first patch is not important to make the whole series work, but I
> made this patch while working on it.
>
> This series keeps the commands `git merge-one-file', `git
> merge-resolve', and `git merge-octopus', so any script depending on them
> should keep working without any changes.

While that may be true, with SKIP_DASHED_BUILT_INS=YesPlease, it is no
longer true. And that is a good thing!

However, it also broke the CI build (`seen` sees a breakage in t9902.199).

I will send out a fix shortly (see
https://github.com/gitgitgadget/git/pull/745 for details).

Ciao,
Dscho

>
> This series is based on 306ee63a70 (Eighteenth batch, 2020-09-29).  The
> tip is tagged as "rewrite-merge-strategies-v3" at
> https://github.com/agrn/git.
>
> Changes since v2:
>
>  - Enable `USE_THE_INDEX_COMPATIBILITY_MACROS' in merge-one-file.c and
>    use read_cache() and hold_locked_index() instead of repo_read_index()
>    and repo_hold_locked_index() to improve readability.
>
>  - Move file mode parsing to its own function in merge-one-file.c.
>
>  - Improve IO errors handling in do_merge_one_file().
>
>  - Return -1 instead of 1 when erroring out in do_merge_one_file() and
>    merge_strategies_one_file().
>
>  - Use oid_to_hex_r() instead of oid_to_hex() in do_merge_one_file().
>
>  - Reformat multilines comments.
>
>  - Reworded a sentence in commit 3/11.
>
> Alban Gruin (11):
>   t6027: modernise tests
>   merge-one-file: rewrite in C
>   merge-index: libify merge_one_path() and merge_all()
>   merge-index: don't fork if the requested program is
>     `git-merge-one-file'
>   merge-resolve: rewrite in C
>   merge-recursive: move better_branch_name() to merge.c
>   merge-octopus: rewrite in C
>   merge: use the "resolve" strategy without forking
>   merge: use the "octopus" strategy without forking
>   sequencer: use the "resolve" strategy without forking
>   sequencer: use the "octopus" merge strategy without forking
>
>  Makefile                        |   7 +-
>  builtin.h                       |   3 +
>  builtin/merge-index.c           | 102 ++----
>  builtin/merge-octopus.c         |  69 ++++
>  builtin/merge-one-file.c        |  92 +++++
>  builtin/merge-recursive.c       |  16 +-
>  builtin/merge-resolve.c         |  69 ++++
>  builtin/merge.c                 |   9 +-
>  cache.h                         |   2 +-
>  git-merge-octopus.sh            | 112 ------
>  git-merge-one-file.sh           | 167 ---------
>  git-merge-resolve.sh            |  54 ---
>  git.c                           |   3 +
>  merge-strategies.c              | 613 ++++++++++++++++++++++++++++++++
>  merge-strategies.h              |  44 +++
>  merge.c                         |  12 +
>  sequencer.c                     |  16 +-
>  t/t6407-merge-binary.sh         |  27 +-
>  t/t6415-merge-dir-to-symlink.sh |   2 +-
>  19 files changed, 972 insertions(+), 447 deletions(-)
>  create mode 100644 builtin/merge-octopus.c
>  create mode 100644 builtin/merge-one-file.c
>  create mode 100644 builtin/merge-resolve.c
>  delete mode 100755 git-merge-octopus.sh
>  delete mode 100755 git-merge-one-file.sh
>  delete mode 100755 git-merge-resolve.sh
>  create mode 100644 merge-strategies.c
>  create mode 100644 merge-strategies.h
>
> Range-diff against v2:
>  1:  28c8fd11b6 =  1:  08c7df596a t6027: modernise tests
>  2:  f5ab0fdf0a !  2:  ce911c99c0 merge-one-file: rewrite in C
>     @@ builtin/merge-one-file.c (new)
>      + * that might change the tree layout.
>      + */
>      +
>     ++#define USE_THE_INDEX_COMPATIBILITY_MACROS
>      +#include "cache.h"
>      +#include "builtin.h"
>      +#include "lockfile.h"
>     @@ builtin/merge-one-file.c (new)
>      +	"<orig mode> <our mode> <their mode>\n\n"
>      +	"Blob ids and modes should be empty for missing files.";
>      +
>     ++static int read_mode(const char *name, const char *arg, unsigned int *mode)
>     ++{
>     ++	char *last;
>     ++	int ret = 0;
>     ++
>     ++	*mode = strtol(arg, &last, 8);
>     ++
>     ++	if (*last)
>     ++		ret = error(_("invalid '%s' mode: expected nothing, got '%c'"), name, *last);
>     ++	else if (!(S_ISREG(*mode) || S_ISDIR(*mode) || S_ISLNK(*mode)))
>     ++		ret = error(_("invalid '%s' mode: %o"), name, *mode);
>     ++
>     ++	return ret;
>     ++}
>     ++
>      +int cmd_merge_one_file(int argc, const char **argv, const char *prefix)
>      +{
>      +	struct object_id orig_blob, our_blob, their_blob,
>     @@ builtin/merge-one-file.c (new)
>      +	if (argc != 8)
>      +		usage(builtin_merge_one_file_usage);
>      +
>     -+	if (repo_read_index(the_repository) < 0)
>     ++	if (read_cache() < 0)
>      +		die("invalid index");
>      +
>     -+	repo_hold_locked_index(the_repository, &lock, LOCK_DIE_ON_ERROR);
>     ++	hold_locked_index(&lock, LOCK_DIE_ON_ERROR);
>      +
>      +	if (!get_oid(argv[1], &orig_blob)) {
>      +		p_orig_blob = &orig_blob;
>     -+		orig_mode = strtol(argv[5], NULL, 8);
>     -+
>     -+		if (!(S_ISREG(orig_mode) || S_ISDIR(orig_mode) || S_ISLNK(orig_mode)))
>     -+			ret |= error(_("invalid 'orig' mode: %o"), orig_mode);
>     ++		ret = read_mode("orig", argv[5], &orig_mode);
>      +	}
>      +
>      +	if (!get_oid(argv[2], &our_blob)) {
>      +		p_our_blob = &our_blob;
>     -+		our_mode = strtol(argv[6], NULL, 8);
>     -+
>     -+		if (!(S_ISREG(our_mode) || S_ISDIR(our_mode) || S_ISLNK(our_mode)))
>     -+			ret |= error(_("invalid 'our' mode: %o"), our_mode);
>     ++		ret = read_mode("our", argv[6], &our_mode);
>      +	}
>      +
>      +	if (!get_oid(argv[3], &their_blob)) {
>      +		p_their_blob = &their_blob;
>     -+		their_mode = strtol(argv[7], NULL, 8);
>     -+
>     -+		if (!(S_ISREG(their_mode) || S_ISDIR(their_mode) || S_ISLNK(their_mode)))
>     -+			ret = error(_("invalid 'their' mode: %o"), their_mode);
>     ++		ret = read_mode("their", argv[7], &their_mode);
>      +	}
>      +
>      +	if (ret)
>     @@ builtin/merge-one-file.c (new)
>      +
>      +	if (ret) {
>      +		rollback_lock_file(&lock);
>     -+		return ret;
>     ++		return !!ret;
>      +	}
>      +
>     -+	return write_locked_index(the_repository->index, &lock, COMMIT_LOCK);
>     ++	return write_locked_index(&the_index, &lock, COMMIT_LOCK);
>      +}
>
>       ## git-merge-one-file.sh (deleted) ##
>     @@ merge-strategies.c (new)
>      +			     unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode)
>      +{
>      +	int ret, i, dest;
>     ++	ssize_t written;
>      +	mmbuffer_t result = {NULL, 0};
>      +	mmfile_t mmfs[3];
>      +	struct ll_merge_options merge_opts = {0};
>     @@ merge-strategies.c (new)
>      +	for (i = 0; i < 3; i++)
>      +		free(mmfs[i].ptr);
>      +
>     -+	if (ret > 127 || !orig_blob)
>     -+		ret = error(_("content conflict in %s"), path);
>     ++	if (ret < 0) {
>     ++		free(result.ptr);
>     ++		return error(_("Failed to execute internal merge"));
>     ++	}
>      +
>     -+	/* Create the working tree file, using "our tree" version from
>     -+	   the index, and then store the result of the merge. */
>     ++	/*
>     ++	 * Create the working tree file, using "our tree" version from
>     ++	 * the index, and then store the result of the merge.
>     ++	 */
>      +	ce = index_file_exists(istate, path, strlen(path), 0);
>      +	if (!ce)
>      +		BUG("file is not present in the cache?");
>      +
>      +	unlink(path);
>     -+	dest = open(path, O_WRONLY | O_CREAT, ce->ce_mode);
>     -+	write_in_full(dest, result.ptr, result.size);
>     ++	if ((dest = open(path, O_WRONLY | O_CREAT, ce->ce_mode)) < 0) {
>     ++		free(result.ptr);
>     ++		return error_errno(_("failed to open file '%s'"), path);
>     ++	}
>     ++
>     ++	written = write_in_full(dest, result.ptr, result.size);
>      +	close(dest);
>      +
>      +	free(result.ptr);
>      +
>     -+	if (ret && our_mode != their_mode)
>     ++	if (written < 0)
>     ++		return error_errno(_("failed to write to '%s'"), path);
>     ++
>     ++	if (ret != 0 || !orig_blob)
>     ++		ret = error(_("content conflict in %s"), path);
>     ++	if (our_mode != their_mode)
>      +		return error(_("permission conflict: %o->%o,%o in %s"),
>      +			     orig_mode, our_mode, their_mode, path);
>      +	if (ret)
>     -+		return 1;
>     ++		return -1;
>      +
>      +	return add_file_to_index(istate, path, 0);
>      +}
>     @@ merge-strategies.c (new)
>      +	if (orig_blob &&
>      +	    ((!their_blob && our_blob && oideq(orig_blob, our_blob)) ||
>      +	     (!our_blob && their_blob && oideq(orig_blob, their_blob))))
>     -+		/* Deleted in both or deleted in one and unchanged in
>     -+		   the other */
>     ++		/* Deleted in both or deleted in one and unchanged in the other. */
>      +		return merge_one_file_deleted(r->index,
>      +					      orig_blob, our_blob, their_blob, path,
>      +					      orig_mode, our_mode, their_mode);
>      +	else if (!orig_blob && our_blob && !their_blob) {
>     -+		/* Added in one.  The other side did not add and we
>     -+		   added so there is nothing to be done, except making
>     -+		   the path merged. */
>     ++		/*
>     ++		 * Added in one.  The other side did not add and we
>     ++		 * added so there is nothing to be done, except making
>     ++		 * the path merged.
>     ++		 */
>      +		return add_to_index_cacheinfo(r->index, our_mode, our_blob, path);
>      +	} else if (!orig_blob && !our_blob && their_blob) {
>      +		printf(_("Adding %s\n"), path);
>     @@ merge-strategies.c (new)
>      +			return error(_("untracked %s is overwritten by the merge."), path);
>      +
>      +		if (add_to_index_cacheinfo(r->index, their_mode, their_blob, path))
>     -+			return 1;
>     ++			return -1;
>      +		return checkout_from_index(r->index, path);
>      +	} else if (!orig_blob && our_blob && their_blob &&
>      +		   oideq(our_blob, their_blob)) {
>     -+		/* Added in both, identically (check for same
>     -+		   permissions). */
>     ++		/* Added in both, identically (check for same permissions). */
>      +		if (our_mode != their_mode)
>      +			return error(_("File %s added identically in both branches, "
>      +				       "but permissions conflict %o->%o."),
>     @@ merge-strategies.c (new)
>      +		printf(_("Adding %s\n"), path);
>      +
>      +		if (add_to_index_cacheinfo(r->index, our_mode, our_blob, path))
>     -+			return 1;
>     ++			return -1;
>      +		return checkout_from_index(r->index, path);
>      +	} else if (our_blob && their_blob)
>      +		/* Modified in both, but differently. */
>     @@ merge-strategies.c (new)
>      +					 orig_blob, our_blob, their_blob, path,
>      +					 orig_mode, our_mode, their_mode);
>      +	else {
>     -+		char *orig_hex = "", *our_hex = "", *their_hex = "";
>     ++		char orig_hex[GIT_MAX_HEXSZ] = {0}, our_hex[GIT_MAX_HEXSZ] = {0},
>     ++			their_hex[GIT_MAX_HEXSZ] = {0};
>      +
>      +		if (orig_blob)
>     -+			orig_hex = oid_to_hex(orig_blob);
>     ++			oid_to_hex_r(orig_hex, orig_blob);
>      +		if (our_blob)
>     -+			our_hex = oid_to_hex(our_blob);
>     ++			oid_to_hex_r(our_hex, our_blob);
>      +		if (their_blob)
>     -+			their_hex = oid_to_hex(their_blob);
>     ++			oid_to_hex_r(their_hex, their_blob);
>      +
>      +		return error(_("%s: Not handling case %s -> %s -> %s"),
>      +			     path, orig_hex, our_hex, their_hex);
>  3:  7f3ce7da17 !  3:  7f0999f5a3 merge-index: libify merge_one_path() and merge_all()
>     @@ Commit message
>
>          To avoid this, this moves merge_one_path(), merge_all(), and their
>          helpers to merge-strategies.c.  They also take a callback to dictate
>     -    what they should do for each file.  For now, only one launching a new
>     -    process is defined to preserve the behaviour of the builtin version.
>     +    what they should do for each file.  For now, to preserve the behaviour
>     +    of `merge-index', only one callback, launching a new process, is
>     +    defined.
>
>          Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
>
>  4:  07e6a6aaef =  4:  c0bc05406d merge-index: don't fork if the requested program is `git-merge-one-file'
>  5:  117d4fc840 =  5:  cbfe192982 merge-resolve: rewrite in C
>  6:  4fc955962b =  6:  35e386f626 merge-recursive: move better_branch_name() to merge.c
>  7:  e7b9e15b34 !  7:  41eb0f7199 merge-octopus: rewrite in C
>     @@ Makefile: BUILTIN_OBJS += builtin/mailsplit.o
>       BUILTIN_OBJS += builtin/merge-recursive.o
>
>       ## builtin.h ##
>     -@@ builtin.h: int cmd_mailsplit(int argc, const char **argv, const char *prefix);
>     +@@ builtin.h: int cmd_maintenance(int argc, const char **argv, const char *prefix);
>       int cmd_merge(int argc, const char **argv, const char *prefix);
>       int cmd_merge_base(int argc, const char **argv, const char *prefix);
>       int cmd_merge_index(int argc, const char **argv, const char *prefix);
>     @@ builtin/merge-octopus.c (new)
>      +	if (repo_read_index(the_repository) < 0)
>      +		die("corrupted cache");
>      +
>     -+	/* The first parameters up to -- are merge bases; the rest are
>     -+	 * heads. */
>     ++	/*
>     ++	 * The first parameters up to -- are merge bases; the rest are
>     ++	 * heads.
>     ++	 */
>      +	for (i = 1; i < argc; i++) {
>      +		if (strcmp(argv[i], "--") == 0)
>      +			sep_seen = 1;
>     @@ builtin/merge-octopus.c (new)
>      +		}
>      +	}
>      +
>     -+	/* Reject if this is not an octopus -- resolve should be used
>     -+	 * instead. */
>     ++	/*
>     ++	 * Reject if this is not an octopus -- resolve should be used
>     ++	 * instead.
>     ++	 */
>      +	if (commit_list_count(remotes) < 2)
>      +		return 2;
>      +
>     @@ merge-strategies.c: int merge_strategies_resolve(struct repository *r,
>      +		int can_ff = 1;
>      +
>      +		if (ret) {
>     -+			/* We allow only last one to have a
>     -+			   hand-resolvable conflicts.  Last round failed
>     -+			   and we still had a head to merge. */
>     ++			/*
>     ++			 * We allow only last one to have a
>     ++			 * hand-resolvable conflicts.  Last round failed
>     ++			 * and we still had a head to merge.
>     ++			 */
>      +			puts(_("Automated merge did not work."));
>      +			puts(_("Should not be doing an octopus."));
>      +
>     @@ merge-strategies.c: int merge_strategies_resolve(struct repository *r,
>      +		}
>      +
>      +		if (!non_ff_merge && can_ff) {
>     -+			/* The first head being merged was a
>     -+			   fast-forward.  Advance the reference commit
>     -+			   to the head being merged, and use that tree
>     -+			   as the intermediate result of the merge.  We
>     -+			   still need to count this as part of the
>     -+			   parent set. */
>     ++			/*
>     ++			 * The first head being merged was a
>     ++			 * fast-forward.  Advance the reference commit
>     ++			 * to the head being merged, and use that tree
>     ++			 * as the intermediate result of the merge.  We
>     ++			 * still need to count this as part of the
>     ++			 * parent set.
>     ++			 */
>      +			struct object_id oids[2];
>      +			printf(_("Fast-forwarding to: %s\n"), branch_name);
>      +
>  8:  cd0662201d =  8:  8f6c1ac057 merge: use the "resolve" strategy without forking
>  9:  0525ff0183 =  9:  b1125261d1 merge: use the "octopus" strategy without forking
> 10:  6fbf599ba4 = 10:  8d0932fd02 sequencer: use the "resolve" strategy without forking
> 11:  2c2dc3cc62 = 11:  e304723957 sequencer: use the "octopus" merge strategy without forking
> --
> 2.28.0.662.ge304723957
>
>
>


* Re: [PATCH v3 03/11] merge-index: libify merge_one_path() and merge_all()
  2020-10-05 12:26     ` [PATCH v3 03/11] merge-index: libify merge_one_path() and merge_all() Alban Gruin
@ 2020-10-09  4:48       ` Junio C Hamano
  2020-11-06 19:53         ` Alban Gruin
  0 siblings, 1 reply; 221+ messages in thread
From: Junio C Hamano @ 2020-10-09  4:48 UTC (permalink / raw)
  To: Alban Gruin; +Cc: git, phillip.wood

Alban Gruin <alban.gruin@gmail.com> writes:

> diff --git a/merge-strategies.c b/merge-strategies.c
> index bbe6f48698..f0e30f5624 100644
> --- a/merge-strategies.c
> +++ b/merge-strategies.c
> @@ -2,6 +2,7 @@
>  #include "dir.h"
>  #include "ll-merge.h"
>  #include "merge-strategies.h"
> +#include "run-command.h"
>  #include "xdiff-interface.h"
>  
>  static int add_to_index_cacheinfo(struct index_state *istate,
> @@ -212,3 +213,101 @@ int merge_strategies_one_file(struct repository *r,
>  
>  	return 0;
>  }
> +
> +int merge_program_cb(const struct object_id *orig_blob,
> +		     const struct object_id *our_blob,
> +		     const struct object_id *their_blob, const char *path,
> +		     unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
> +		     void *data)
> +{
> +	char ownbuf[3][GIT_MAX_HEXSZ] = {{0}};
> +	const char *arguments[] = { (char *)data, "", "", "", path,
> +				    ownbuf[0], ownbuf[1], ownbuf[2],
> +				    NULL };
> +
> +	if (orig_blob)
> +		arguments[1] = oid_to_hex(orig_blob);
> +	if (our_blob)
> +		arguments[2] = oid_to_hex(our_blob);
> +	if (their_blob)
> +		arguments[3] = oid_to_hex(their_blob);

oid_to_hex() uses 4-slot rotating buffer, no?  Relying on the fact
that three would be available here without getting reused (or,
rather, our caller didn't make its own calls and/or does not mind
us invalidating all but one slot for them) feels a bit iffy.

Extending ownbuf[] to 6 elements and using oid_to_hex_r() would be a
trivial way to clarify the code.
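
To illustrate the hazard, here is a minimal standalone sketch.  It is
not git's code: id_to_hex() and id_to_hex_r() are hypothetical
stand-ins for oid_to_hex() and oid_to_hex_r(), working on a plain
unsigned int rather than a struct object_id, and the 4-slot ring is
only assumed to mirror the real one.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define HEX_LEN 8	/* hypothetical: 4-byte ids, 8 hex digits */
#define RING_SLOTS 4

/* Caller-owned buffer: the result lives as long as `out` does. */
static char *id_to_hex_r(char out[HEX_LEN + 1], unsigned int id)
{
	snprintf(out, HEX_LEN + 1, "%08x", id);
	return out;
}

/* Rotating buffer: every call overwrites the oldest of four slots. */
static const char *id_to_hex(unsigned int id)
{
	static char ring[RING_SLOTS][HEX_LEN + 1];
	static int next;
	char *slot = ring[next];

	next = (next + 1) % RING_SLOTS;
	return id_to_hex_r(slot, id);
}

/* Returns 1 when the fifth call has clobbered the first result. */
static int ring_reuses_slot(void)
{
	const char *first = id_to_hex(1), *fifth;

	id_to_hex(2);
	id_to_hex(3);
	id_to_hex(4);
	fifth = id_to_hex(5);
	return first == fifth && !strcmp(first, "00000005");
}

/* Returns 1 when both caller-owned results are still intact. */
static int owned_buffers_survive(void)
{
	char a[HEX_LEN + 1], b[HEX_LEN + 1];

	id_to_hex_r(a, 0xaa);
	id_to_hex_r(b, 0xbb);
	return !strcmp(a, "000000aa") && !strcmp(b, "000000bb");
}
```

With three results held here and an unknown number held by the caller,
the ring variant only works by luck; the _r variant makes the lifetime
explicit.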

> +	xsnprintf(ownbuf[0], sizeof(ownbuf[0]), "%o", orig_mode);
> +	xsnprintf(ownbuf[1], sizeof(ownbuf[1]), "%o", our_mode);
> +	xsnprintf(ownbuf[2], sizeof(ownbuf[2]), "%o", their_mode);

And these mode bits would not need GIT_MAX_HEXSZ to begin with.
This smells like a WIP that hasn't been carefully proofread.

	char oidbuf[3][GIT_MAX_HEXSZ + 1] = { 0 };
	char modebuf[3][8] = { 0 };
	char *args[] = {
		data, oidbuf[0], oidbuf[1], oidbuf[2], path,
		modebuf[0], modebuf[1], modebuf[2], NULL,
	};
        
	if (orig_blob)
		oid_to_hex_r(oidbuf[0], orig_blob);
	...
	xsnprintf(modebuf[0], ...);


Eh, wait.  Is this meant to be able to drive "git-merge-one-file",
i.e. a missing common/ours/theirs is indicated by an empty string
in both oid and mode?  If so, an unconditional xsnprintf() would
either give garbage or "0" at best, neither of which is an empty
string.  So the body would be more like

	if (orig_blob) {
		oid_to_hex_r(oidbuf[0], orig_blob);
		xsnprintf(modebuf[0], sizeof(modebuf[0]), "%o", orig_mode);
	}
	if (our_blob) {
		oid_to_hex_r(oidbuf[1], our_blob);
		xsnprintf(modebuf[1], sizeof(modebuf[1]), "%o", our_mode);
	}
	...

wouldn't it?
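
Spelled out as a standalone sketch, under some assumptions: struct
side and build_args() are hypothetical names, the object names are
passed in as ready-made strings, and plain snprintf() stands in for
xsnprintf().  The point is only that a missing side yields "" for
both its object name and its mode.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MODE_BUF 8	/* room for an octal mode such as "100644" plus NUL */

struct side {
	const char *hex;	/* NULL when this side has no blob */
	unsigned int mode;
};

/*
 * Fill args[] in the order a merge-one-file style program expects:
 * program, three object names, path, three modes, NULL.
 */
static void build_args(const char *prog, const char *path,
		       const struct side sides[3],
		       char modebuf[3][MODE_BUF], const char *args[9])
{
	int i;

	args[0] = prog;
	args[4] = path;
	args[8] = NULL;
	for (i = 0; i < 3; i++) {
		modebuf[i][0] = '\0';
		if (sides[i].hex)
			snprintf(modebuf[i], MODE_BUF, "%o", sides[i].mode);
		args[1 + i] = sides[i].hex ? sides[i].hex : "";
		args[5 + i] = modebuf[i];
	}
}
```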

> +	return run_command_v_opt(arguments, 0);
> +}
> +
> +static int merge_entry(struct index_state *istate, int quiet, int pos,
> +		       const char *path, merge_cb cb, void *data)

When we use an identifier "cb", it typically means callback data,
not a callback function which is often called "fn".  So, name the
type "merge_fn" (or "merge_func"), and call the parameter "fn".

> +{
> +	int found = 0;
> +	const struct object_id *oids[3] = {NULL};
> +	unsigned int modes[3] = {0};
> +
> +	do {
> +		const struct cache_entry *ce = istate->cache[pos];
> +		int stage = ce_stage(ce);
> +
> +		if (strcmp(ce->name, path))
> +			break;
> +		found++;
> +		oids[stage - 1] = &ce->oid;
> +		modes[stage - 1] = ce->ce_mode;
> +	} while (++pos < istate->cache_nr);
> +	if (!found)
> +		return error(_("%s is not in the cache"), path);
> +
> +	if (cb(oids[0], oids[1], oids[2], path, modes[0], modes[1], modes[2], data)) {
> +		if (!quiet)
> +			error(_("Merge program failed"));
> +		return -2;
> +	}
> +
> +	return found;
> +}

This copies from builtin/merge-index.c::merge_entry().

> +int merge_one_path(struct index_state *istate, int oneshot, int quiet,
> +		   const char *path, merge_cb cb, void *data)
> +{
> +	int pos = index_name_pos(istate, path, strlen(path)), ret;
> +
> +	/*
> +	 * If it already exists in the cache as stage0, it's
> +	 * already merged and there is nothing to do.
> +	 */
> +	if (pos < 0) {
> +		ret = merge_entry(istate, quiet, -pos - 1, path, cb, data);
> +		if (ret == -1)
> +			return -1;
> +		else if (ret == -2)
> +			return 1;
> +	}
> +	return 0;
> +}

Likewise from the same function in that file.

Are we removing the "git merge-index" program?  Reusing the same
identifier for these copied-and-pasted pairs of functions bothers
me for two reasons.

 - An identifier that was clear and unique enough in the original
   context as a file-scope static may not be a good name as a global
   identifier.

 - Having two similar-looking functions with the same name makes
   reading and learning the codebase starting at "git grep" hits
   more difficult than necessary.

> +int merge_all(struct index_state *istate, int oneshot, int quiet,
> +	      merge_cb cb, void *data)
> +{
> +	int err = 0, i, ret;
> +	for (i = 0; i < istate->cache_nr; i++) {
> +		const struct cache_entry *ce = istate->cache[i];
> +		if (!ce_stage(ce))
> +			continue;
> +
> +		ret = merge_entry(istate, quiet, i, ce->name, cb, data);
> +		if (ret > 0)
> +			i += ret - 1;
> +		else if (ret == -1)
> +			return -1;
> +		else if (ret == -2) {
> +			if (oneshot)
> +				err++;
> +			else
> +				return 1;
> +		}
> +	}
> +
> +	return err;
> +}

Likewise.

> diff --git a/merge-strategies.h b/merge-strategies.h
> index b527d145c7..cf78d7eaf4 100644
> --- a/merge-strategies.h
> +++ b/merge-strategies.h
> @@ -10,4 +10,21 @@ int merge_strategies_one_file(struct repository *r,
>  			      unsigned int orig_mode, unsigned int our_mode,
>  			      unsigned int their_mode);
>  
> +typedef int (*merge_cb)(const struct object_id *orig_blob,
> +			const struct object_id *our_blob,
> +			const struct object_id *their_blob, const char *path,
> +			unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
> +			void *data);

Call it "merge_one_file_func", probably.

> +int merge_program_cb(const struct object_id *orig_blob,

Call it spawn_merge_one_file() perhaps?

> +		     const struct object_id *our_blob,
> +		     const struct object_id *their_blob, const char *path,
> +		     unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
> +		     void *data);
> +
> +int merge_one_path(struct index_state *istate, int oneshot, int quiet,
> +		   const char *path, merge_cb cb, void *data);
> +int merge_all(struct index_state *istate, int oneshot, int quiet,
> +	      merge_cb cb, void *data);
>  #endif /* MERGE_STRATEGIES_H */


* Re: [PATCH v3 04/11] merge-index: don't fork if the requested program is `git-merge-one-file'
  2020-10-05 12:26     ` [PATCH v3 04/11] merge-index: don't fork if the requested program is `git-merge-one-file' Alban Gruin
@ 2020-10-16 19:07       ` Junio C Hamano
  0 siblings, 0 replies; 221+ messages in thread
From: Junio C Hamano @ 2020-10-16 19:07 UTC (permalink / raw)
  To: Alban Gruin; +Cc: git, phillip.wood

Alban Gruin <alban.gruin@gmail.com> writes:

> Since `git-merge-one-file' has been rewritten and libified, this teaches
> `merge-index' to call merge_strategies_one_file() without forking using
> a new callback, merge_one_file_cb().
>
> Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
> ---

I do not know how much of the change in this patch survives when the
previous step gets adjusted, so I'll skip this step for now.

Thanks.


* Re: [PATCH v3 05/11] merge-resolve: rewrite in C
  2020-10-05 12:26     ` [PATCH v3 05/11] merge-resolve: rewrite in C Alban Gruin
@ 2020-10-16 19:19       ` Junio C Hamano
  2020-11-06 19:53         ` Alban Gruin
  0 siblings, 1 reply; 221+ messages in thread
From: Junio C Hamano @ 2020-10-16 19:19 UTC (permalink / raw)
  To: Alban Gruin; +Cc: git, phillip.wood

Alban Gruin <alban.gruin@gmail.com> writes:

> +#include "cache.h"
> +#include "builtin.h"
> +#include "merge-strategies.h"
> +
> +static const char builtin_merge_resolve_usage[] =
> +	"git merge-resolve <bases>... -- <head> <remote>";
> +
> +int cmd_merge_resolve(int argc, const char **argv, const char *prefix)
> +{
> +	int i, is_baseless = 1, sep_seen = 0;
> +	const char *head = NULL;
> +	struct commit_list *bases = NULL, *remote = NULL;
> +	struct commit_list **next_base = &bases;
> +
> +	if (argc < 5)
> +		usage(builtin_merge_resolve_usage);
> +
> +	setup_work_tree();
> +	if (repo_read_index(the_repository) < 0)
> +		die("invalid index");
> +
> +	/* The first parameters up to -- are merge bases; the rest are
> +	 * heads. */

Style (I won't repeat).

> +	for (i = 1; i < argc; i++) {
> +		if (strcmp(argv[i], "--") == 0)

	if (!strcmp(...))

is more typical than comparing with "== 0".

> +			sep_seen = 1;
> +		else if (strcmp(argv[i], "-h") == 0)
> +			usage(builtin_merge_resolve_usage);
> +		else if (sep_seen && !head)
> +			head = argv[i];
> +		else if (remote) {
> +			/* Give up if we are given two or more remotes.
> +			 * Not handling octopus. */
> +			return 2;
> +		} else {
> +			struct object_id oid;
> +
> +			get_oid(argv[i], &oid);
> +			is_baseless &= sep_seen;
> +
> +			if (!oideq(&oid, the_hash_algo->empty_tree)) {

What is this business about an empty tree about?

> +				struct commit *commit;
> +				commit = lookup_commit_or_die(&oid, argv[i]);
> +
> +				if (sep_seen)
> +					commit_list_append(commit, &remote);
> +				else
> +					next_base = commit_list_append(commit, next_base);
> +			}
> +		}
> +	}
> +
> +	/* Give up if this is a baseless merge. */
> +	if (is_baseless)
> +		return 2;

This is quite convoluted.  

The original is much more straight-forward.  We just said "grab
everything before we see '--' and call them bases; immediately after
'--' is HEAD and everything else is remote.  Now do we have any
base?  Otherwise we cannot handle it".

I cannot see an equivalence to it in the rewritten result, with the
bit operation with is_baseless and sep_seen.  Wouldn't it be the
matter of checking if next_base is NULL, or is there something more
subtle that deserves in-code comment going on?
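
For reference, the straightforward shape described above could look
roughly like the following standalone sketch.  It only counts
arguments instead of building commit lists, and struct resolve_args
and parse_resolve_args() are hypothetical names, not anything in the
patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct resolve_args {
	int nr_bases;		/* arguments seen before "--" */
	const char *head;	/* first argument after "--", or NULL */
	int nr_remotes;		/* arguments after the head */
};

/* Returns 0 when resolve can proceed, 2 when it has to give up. */
static int parse_resolve_args(int argc, const char **argv,
			      struct resolve_args *out)
{
	int i, sep_seen = 0;

	out->nr_bases = 0;
	out->head = NULL;
	out->nr_remotes = 0;

	for (i = 1; i < argc; i++) {
		if (!sep_seen && !strcmp(argv[i], "--"))
			sep_seen = 1;
		else if (!sep_seen)
			out->nr_bases++;
		else if (!out->head)
			out->head = argv[i];
		else
			out->nr_remotes++;
	}

	if (!out->nr_bases)
		return 2;	/* baseless merge */
	if (out->nr_remotes > 1)
		return 2;	/* this would be an octopus */
	return 0;
}
```

The "do we have any base?" question then becomes a single check after
the loop, with no state threaded through the parsing.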

Thanks.


* Re: [PATCH v3 02/11] merge-one-file: rewrite in C
  2020-10-06 22:01       ` Junio C Hamano
@ 2020-10-21 19:47         ` Alban Gruin
  2020-10-21 20:28           ` Junio C Hamano
  2020-10-21 20:30           ` Junio C Hamano
  0 siblings, 2 replies; 221+ messages in thread
From: Alban Gruin @ 2020-10-21 19:47 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, phillip.wood

Hi Junio,

On 07/10/2020 00:01, Junio C Hamano wrote:
> Alban Gruin <alban.gruin@gmail.com> writes:
> 
>> This rewrites `git merge-one-file' from shell to C.  This port is not
>> completely straightforward: to save precious cycles by avoiding reading
>> and flushing the index repeatedly, write temporary files when an
>> operation can be performed in-memory, or allow other function to use the
>> rewrite without forking nor worrying about the index,...
> 
> So, the in-core index is still used, but when the contents of the in-core
> index does not have to be written out disk, we just don't?  Makes sense.
> 

>> diff --git a/builtin/merge-one-file.c b/builtin/merge-one-file.c
>> new file mode 100644
>> index 0000000000..598338ba16
>> --- /dev/null
>> +++ b/builtin/merge-one-file.c
>> @@ -0,0 +1,92 @@
>> +/*
>> + * Builtin "git merge-one-file"
>> + *
>> + * Copyright (c) 2020 Alban Gruin
>> + *
>> + * Based on git-merge-one-file.sh, written by Linus Torvalds.
>> + *
>> + * This is the git per-file merge utility, called with
>> + *
>> + *   argv[1] - original file SHA1 (or empty)
>> + *   argv[2] - file in branch1 SHA1 (or empty)
>> + *   argv[3] - file in branch2 SHA1 (or empty)
> 
> Let's modernize this comment while we are at it.
> 
>     SHA1 -> "object name" (or "blob object name")
> 
>> + *   argv[4] - pathname in repository
>> + *   argv[5] - original file mode (or empty)
>> + *   argv[6] - file in branch1 mode (or empty)
>> + *   argv[7] - file in branch2 mode (or empty)
>> + *
>> + * Handle some trivial cases. The _really_ trivial cases have been
>> + * handled already by git read-tree, but that one doesn't do any merges
>> + * that might change the tree layout.
>> + */
>> +
>> +#define USE_THE_INDEX_COMPATIBILITY_MACROS
>> +#include "cache.h"
>> +#include "builtin.h"
>> +#include "lockfile.h"
>> +#include "merge-strategies.h"
>> +
>> +static const char builtin_merge_one_file_usage[] =
>> +	"git merge-one-file <orig blob> <our blob> <their blob> <path> "
>> +	"<orig mode> <our mode> <their mode>\n\n"
>> +	"Blob ids and modes should be empty for missing files.";
>> +
>> +static int read_mode(const char *name, const char *arg, unsigned int *mode)
>> +{
>> +	char *last;
>> +	int ret = 0;
>> +
>> +	*mode = strtol(arg, &last, 8);
>> +
>> +	if (*last)
>> +		ret = error(_("invalid '%s' mode: expected nothing, got '%c'"), name, *last);
>> +	else if (!(S_ISREG(*mode) || S_ISDIR(*mode) || S_ISLNK(*mode)))
>> +		ret = error(_("invalid '%s' mode: %o"), name, *mode);
>> +
>> +	return ret;
>> +}
>> +
>> +int cmd_merge_one_file(int argc, const char **argv, const char *prefix)
>> +{
>> +	struct object_id orig_blob, our_blob, their_blob,
>> +		*p_orig_blob = NULL, *p_our_blob = NULL, *p_their_blob = NULL;
>> +	unsigned int orig_mode = 0, our_mode = 0, their_mode = 0, ret = 0;
>> +	struct lock_file lock = LOCK_INIT;
>> +
>> +	if (argc != 8)
>> +		usage(builtin_merge_one_file_usage);
>> +
>> +	if (read_cache() < 0)
>> +		die("invalid index");
>> +
>> +	hold_locked_index(&lock, LOCK_DIE_ON_ERROR);
>> +
>> +	if (!get_oid(argv[1], &orig_blob)) {
>> +		p_orig_blob = &orig_blob;
>> +		ret = read_mode("orig", argv[5], &orig_mode);
>> +	}
> 
> argv[1] is defined as "either the object name of the blob in the
> common ancestor, or an empty string".  So you need to distinguish
> three cases here, but you are only catching two.
> 
>  - argv[1] is an empty string; p_orig_blob can legitimately be left
>    NULL.
> 
>  - argv[1] is a valid blob object name.  orig_blob should be
>    populated and p_orig_blob should point at it.
> 
>  - argv[1] is garbage, names a non-blob object, or there is no such
>    object with that name.  Don't we want to catch it as a mistake?
> 
> Also, when argv[1] is an empty string, argv[5] must also be an empty
> string, or we got a wrong input---don't we want to catch it as a
> mistake?
> 
> The third case needs a bit of thought.  For example, if $1 and $2
> are the same and points at a non-existent object, we know we won't
> care because we only care about $3.  In a lazily-cloned repository,
> that may matter---we would not want to fail even if we not have blob
> $1 and $2, as long as they are reasonably spelled a full hexadecimal
> object name.  But we would want to fail if blob object named by $3
> is missing.
> 
> One way to achieve semantics closer to the above than the posted
> patch may be to tighten the parsing.  Instead of using "anything
> goes" get_oid(), use get_oid_hex(), perhaps.
> 
>> +	if (!get_oid(argv[2], &our_blob)) {
>> +		p_our_blob = &our_blob;
>> +		ret = read_mode("our", argv[6], &our_mode);
>> +	}
>> +
>> +	if (!get_oid(argv[3], &their_blob)) {
>> +		p_their_blob = &their_blob;
>> +		ret = read_mode("their", argv[7], &their_mode);
>> +	}
>> +
>> +	if (ret)
>> +		return ret;
>> +
>> +	ret = merge_strategies_one_file(the_repository,
>> +					p_orig_blob, p_our_blob, p_their_blob, argv[4],
>> +					orig_mode, our_mode, their_mode);
> 
> That's a funny function name.  It's not like the function will be
> taught different strategy to handle the three-way merge, no?  It
> probably makes sense to name it after what it does, which is "three
> way merge".
> 

Okay.  There's already a function called threeway_merge() in
unpack_trees() that does something different.
merge_strategies_threeway() should be good?

>> +	if (ret) {
>> +		rollback_lock_file(&lock);
>> +		return !!ret;
>> +	}
>> +
>> +	return write_locked_index(&the_index, &lock, COMMIT_LOCK);
>> +}
> 
>> diff --git a/merge-strategies.c b/merge-strategies.c
>> new file mode 100644
>> index 0000000000..bbe6f48698
>> --- /dev/null
>> +++ b/merge-strategies.c
>> @@ -0,0 +1,214 @@
>> +#include "cache.h"
>> +#include "dir.h"
>> +#include "ll-merge.h"
>> +#include "merge-strategies.h"
>> +#include "xdiff-interface.h"
>> +
> 
>> +static int add_to_index_cacheinfo(struct index_state *istate,
>> +				  unsigned int mode,
>> +				  const struct object_id *oid, const char *path)
>> +{
>> +	struct cache_entry *ce;
>> +	int len, option;
>> +
>> +	if (!verify_path(path, mode))
>> +		return error(_("Invalid path '%s'"), path);
>> +
>> +	len = strlen(path);
>> +	ce = make_empty_cache_entry(istate, len);
>> +
>> +	oidcpy(&ce->oid, oid);
>> +	memcpy(ce->name, path, len);
>> +	ce->ce_flags = create_ce_flags(0);
>> +	ce->ce_namelen = len;
>> +	ce->ce_mode = create_ce_mode(mode);
>> +	if (assume_unchanged)
>> +		ce->ce_flags |= CE_VALID;
>> +	option = ADD_CACHE_OK_TO_ADD | ADD_CACHE_OK_TO_REPLACE;
>> +	if (add_index_entry(istate, ce, option))
>> +		return error(_("%s: cannot add to the index"), path);
>> +
>> +	return 0;
>> +}
> 
> The above correctly does 'git update-index --add --cacheinfo "$6"
> "$2" "$4"' but don't copy-and-paste existing code to do so.  Add one
> preliminary patch before everything else in the series to massage
> and extract add_cacheinfo() function out of builtin/update-index.c,
> move it to somewhere common like read-cache.c and so that we can
> call it from here.
> 

Hmm, I’d really like to do this, but I have one remark/question about
it.  In builtin/update-index.c, when add_cache_entry() fails, this
message is printed:

	cannot add to the index - missing --add option?

Obviously, this is not what we want to show in git-merge when
add_index_entry() fails.  But then, verify_path() can also fail, and
will show a sensible message for any situation:

	Invalid path '%s'

Should I return error() when verify_path() fails, but e.g. -2 when
add_index_entry() fails, so that a caller seeing -2 rather than -1 from
this new add_cacheinfo() can print the appropriate message itself?  Or
should the caller verify the path, so that it cannot fail for that reason?

>> +static int checkout_from_index(struct index_state *istate, const char *path)
>> +{
>> +	struct checkout state = CHECKOUT_INIT;
>> +	struct cache_entry *ce;
>> +
>> +	state.istate = istate;
>> +	state.force = 1;
>> +	state.base_dir = "";
>> +	state.base_dir_len = 0;
>> +
>> +	ce = index_file_exists(istate, path, strlen(path), 0);
> 
> This call is unfortunate for the reasons I mention later.
> 
> But if you must have this call, then you need to sanity check what
> you get from index_file_exists().  ce must be a merged cache entry,
> so
> 
> 	if (!ce || ce_stage(ce))
> 		BUG(...);
> 

That’s ok, I managed to remove it following your advice.

>> +	if (checkout_entry(ce, &state, NULL, NULL) < 0)
>> +		return error(_("%s: cannot checkout file"), path);
>> +	return 0;
>> +}
>> +
>> +static int merge_one_file_deleted(struct index_state *istate,
>> +				  const struct object_id *orig_blob,
>> +				  const struct object_id *our_blob,
>> +				  const struct object_id *their_blob, const char *path,
>> +				  unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode)
>> +{
>> +	if ((our_blob && orig_mode != our_mode) ||
>> +	    (their_blob && orig_mode != their_mode))
>> +		return error(_("File %s deleted on one branch but had its "
>> +			       "permissions changed on the other."), path);
>> +
>> +	if (our_blob) {
>> +		printf(_("Removing %s\n"), path);
>> +
>> +		if (file_exists(path))
>> +			remove_path(path);
>> +	}
>> +
>> +	if (remove_file_from_index(istate, path))
>> +		return error("%s: cannot remove from the index", path);
>> +	return 0;
> 
> If the side that did not remove changed the mode, we don't silently
> remove but fail and give a chance to inspect the situation to the
> end user.  If we had the blob and it is removed by them, we give a
> message and only in that case we remove the file from the working
> tree, together with any leading directory that has become empty.
> 
> And after that we make sure that the path is no longer in the
> index.  The function removes entries for the path at all the stages,
> which is exactly what we want.
> 
> OK.
> 
>> +}
>> +
>> +static int do_merge_one_file(struct index_state *istate,
>> +			     const struct object_id *orig_blob,
>> +			     const struct object_id *our_blob,
>> +			     const struct object_id *their_blob, const char *path,
>> +			     unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode)
>> +{
>> +	int ret, i, dest;
>> +	ssize_t written;
>> +	mmbuffer_t result = {NULL, 0};
>> +	mmfile_t mmfs[3];
>> +	struct ll_merge_options merge_opts = {0};
>> +	struct cache_entry *ce;
>> +
>> +	if (our_mode == S_IFLNK || their_mode == S_IFLNK)
>> +		return error(_("%s: Not merging symbolic link changes."), path);
>> +	else if (our_mode == S_IFGITLINK || their_mode == S_IFGITLINK)
>> +		return error(_("%s: Not merging conflicting submodule changes."), path);
>> +
>> +	read_mmblob(mmfs + 1, our_blob);
>> +	read_mmblob(mmfs + 2, their_blob);
>> +
>> +	if (orig_blob) {
>> +		printf(_("Auto-merging %s\n"), path);
>> +		read_mmblob(mmfs + 0, orig_blob);
>> +	} else {
>> +		printf(_("Added %s in both, but differently.\n"), path);
>> +		read_mmblob(mmfs + 0, &null_oid);
>> +	}
>> +
>> +	merge_opts.xdl_opts = XDL_MERGE_ZEALOUS_ALNUM;
>> +	ret = ll_merge(&result, path,
>> +		       mmfs + 0, "orig",
>> +		       mmfs + 1, "our",
>> +		       mmfs + 2, "their",
>> +		       istate, &merge_opts);
> 
> Is it correct to call into ll_merge() here?  The original used to
> call "git merge-file" which called into xdl_merge().  Calling into
> ll_merge() means the path is used to look up the attributes and use
> the custom merge driver, which I am not offhand sure is what we want
> to see at this low level (and if it turns out to be a good idea, we
> definitely should explain the change of semantics in the proposed
> log message for this commit).
> 
>> +	for (i = 0; i < 3; i++)
>> +		free(mmfs[i].ptr);
>> +
>> +	if (ret < 0) {
>> +		free(result.ptr);
>> +		return error(_("Failed to execute internal merge"));
>> +	}
>> +
>> +	/*
>> +	 * Create the working tree file, using "our tree" version from
>> +	 * the index, and then store the result of the merge.
>> +	 */
> 
> The above is copied from the original, to explain what it did after
> the comment, but it does not seem to match what the new code does.
> 
>> +	ce = index_file_exists(istate, path, strlen(path), 0);
>> +	if (!ce)
>> +		BUG("file is not present in the cache?");
>> +
>> +	unlink(path);
>> +	if ((dest = open(path, O_WRONLY | O_CREAT, ce->ce_mode)) < 0) {
>> +		free(result.ptr);
>> +		return error_errno(_("failed to open file '%s'"), path);
>> +	}
>> +
>> +	written = write_in_full(dest, result.ptr, result.size);
>> +	close(dest);
>> +
>> +	free(result.ptr);
>> +
>> +	if (written < 0)
>> +		return error_errno(_("failed to write to '%s'"), path);
>> +
> 
> This open(..., ce->ce_mode) call is way insufficient.
> 
> The comment we have above this part of the code talks about the
> difficulty of doing this correctly in scripted version.  Creating a
> file by 'git checkout-index -f --stage=2 -- "$4"' and reusing it to
> store the merged contents was the cleanest and easiest way without
> having direct access to adjust_shared_perm() to create a working
> tree file with the correct permission bits.
> 
> We are writing in C, so we should be able to do much better than the
> scripted version, as we can later call adjust_shared_perm().
> 

I'm not sure I understand the issue correctly.

Is this because I fetch an entry from the index to have the mode of the
file, instead of using `our_mode'?  So I should move the error handling
of ll_merge()/xdl_merge() and the detection of the permission conflict
before writing in the file, and call open(…, our_mode)?

I'm also not sure why we need adjust_shared_perm() here.

>> +	if (ret != 0 || !orig_blob)
>> +		ret = error(_("content conflict in %s"), path);
>> +	if (our_mode != their_mode)
>> +		return error(_("permission conflict: %o->%o,%o in %s"),
>> +			     orig_mode, our_mode, their_mode, path);
>> +	if (ret)
>> +		return -1;
>> +
>> +	return add_file_to_index(istate, path, 0);
>> +}
>> +
>> +int merge_strategies_one_file(struct repository *r,
>> +			      const struct object_id *orig_blob,
>> +			      const struct object_id *our_blob,
>> +			      const struct object_id *their_blob, const char *path,
>> +			      unsigned int orig_mode, unsigned int our_mode,
>> +			      unsigned int their_mode)
>> +{
> 
> In a long if/else if/else if/.../else cascade, enclose all bodies in
> braces, if any one of them has a multi-statement body, to avoid
> being distracting.
> 
>> +	if (orig_blob &&
>> +	    ((!their_blob && our_blob && oideq(orig_blob, our_blob)) ||
>> +	     (!our_blob && their_blob && oideq(orig_blob, their_blob))))
>> +		/* Deleted in both or deleted in one and unchanged in the other. */
>> +		return merge_one_file_deleted(r->index,
>> +					      orig_blob, our_blob, their_blob, path,
>> +					      orig_mode, our_mode, their_mode);
> 
> OK, we've already reviewed that function.
> 
>> +	else if (!orig_blob && our_blob && !their_blob) {
>> +		/*
>> +		 * Added in one.  The other side did not add and we
>> +		 * added so there is nothing to be done, except making
>> +		 * the path merged.
>> +		 */
>> +		return add_to_index_cacheinfo(r->index, our_mode, our_blob, path);
> 
> OK, we've already reviewed that function.
> 
>> +	} else if (!orig_blob && !our_blob && their_blob) {
>> +		printf(_("Adding %s\n"), path);
>> +
>> +		if (file_exists(path))
>> +			return error(_("untracked %s is overwritten by the merge."), path);
>> +
>> +		if (add_to_index_cacheinfo(r->index, their_mode, their_blob, path))
>> +			return -1;
>> +		return checkout_from_index(r->index, path);
> 
> You did "add_to_index_cacheinfo()", so you MUST know which ce is to
> be checked out.
> 
> Consider if it is worth to teach add_to_index_cacheinfo() to give
> you ce back and pass it to checkout_from_index(); that way, you do
> not have to call index_file_exists() based on path in the function.
> 

OK, this is doable.

>> +	} else if (!orig_blob && our_blob && their_blob &&
>> +		   oideq(our_blob, their_blob)) {
>> +		/* Added in both, identically (check for same permissions). */
>> +		if (our_mode != their_mode)
>> +			return error(_("File %s added identically in both branches, "
>> +				       "but permissions conflict %o->%o."),
>> +				     path, our_mode, their_mode);
>> +
>> +		printf(_("Adding %s\n"), path);
>> +
>> +		if (add_to_index_cacheinfo(r->index, our_mode, our_blob, path))
>> +			return -1;
>> +		return checkout_from_index(r->index, path);
> 
> Ditto.
> 
>> +	} else if (our_blob && their_blob)
>> +		/* Modified in both, but differently. */
>> +		return do_merge_one_file(r->index,
>> +					 orig_blob, our_blob, their_blob, path,
>> +					 orig_mode, our_mode, their_mode);
>> +	else {
>> +		char orig_hex[GIT_MAX_HEXSZ] = {0}, our_hex[GIT_MAX_HEXSZ] = {0},
>> +			their_hex[GIT_MAX_HEXSZ] = {0};
>> +
>> +		if (orig_blob)
>> +			oid_to_hex_r(orig_hex, orig_blob);
>> +		if (our_blob)
>> +			oid_to_hex_r(our_hex, our_blob);
>> +		if (their_blob)
>> +			oid_to_hex_r(their_hex, their_blob);
>> +
>> +		return error(_("%s: Not handling case %s -> %s -> %s"),
>> +			     path, orig_hex, our_hex, their_hex);
>> +	}
>> +
>> +	return 0;
>> +}
> 
> I can see that this does go in the right direction.  With a bit more
> attention to details it would soon be production-ready quality.
> 
> Thanks.
> 

Thank you,
Alban
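
The cascade reviewed above can be read as a small decision table over
which of the three blobs exist.  A minimal sketch, assuming blobs are
represented by plain strings (NULL meaning "absent"); every `toy_' name
is hypothetical and not part of the patch, and the conditions simply
mirror the quoted ones, with every branch braced as the review asks:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical labels for the branches of the quoted cascade. */
enum toy_case {
	TOY_DELETED,
	TOY_ADDED_OURS,
	TOY_ADDED_THEIRS,
	TOY_ADDED_BOTH_SAME,
	TOY_CONTENT_MERGE,
	TOY_UNHANDLED
};

/* Stand-in for oideq(): both ids must be non-NULL here. */
int toy_eq(const char *a, const char *b)
{
	assert(a && b);
	return !strcmp(a, b);
}

/* Classify one path; a NULL id means the blob is missing on that side. */
enum toy_case toy_classify(const char *orig, const char *ours,
			   const char *theirs)
{
	if (orig &&
	    ((!theirs && ours && toy_eq(orig, ours)) ||
	     (!ours && theirs && toy_eq(orig, theirs)))) {
		/* Deleted in one side, unchanged in the other. */
		return TOY_DELETED;
	} else if (!orig && ours && !theirs) {
		/* Added by us only: just mark the path as merged. */
		return TOY_ADDED_OURS;
	} else if (!orig && !ours && theirs) {
		/* Added by them only: add to the index and check out. */
		return TOY_ADDED_THEIRS;
	} else if (!orig && ours && theirs && toy_eq(ours, theirs)) {
		/* Added identically in both (modes still need checking). */
		return TOY_ADDED_BOTH_SAME;
	} else if (ours && theirs) {
		/* Both sides have content that differs: content merge. */
		return TOY_CONTENT_MERGE;
	} else {
		return TOY_UNHANDLED;
	}
}
```

For instance, toy_classify("o", "a", NULL) ends in the last branch,
because one side modified the file while the other deleted it.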


^ permalink raw reply	[flat|nested] 221+ messages in thread

* Re: [PATCH v3 02/11] merge-one-file: rewrite in C
  2020-10-21 19:47         ` Alban Gruin
@ 2020-10-21 20:28           ` Junio C Hamano
  2020-10-21 21:20             ` Junio C Hamano
  2020-10-21 20:30           ` Junio C Hamano
  1 sibling, 1 reply; 221+ messages in thread
From: Junio C Hamano @ 2020-10-21 20:28 UTC (permalink / raw)
  To: Alban Gruin; +Cc: git, phillip.wood

Alban Gruin <alban.gruin@gmail.com> writes:

>>> +	/*
>>> +	 * Create the working tree file, using "our tree" version from
>>> +	 * the index, and then store the result of the merge.
>>> +	 */
>> 
>> The above is copied from the original, to explain what it did after
>> the comment, but it does not seem to match what the new code does.
>> 
>>> +	ce = index_file_exists(istate, path, strlen(path), 0);
>>> +	if (!ce)
>>> +		BUG("file is not present in the cache?");
>>> +
>>> +	unlink(path);
>>> +	if ((dest = open(path, O_WRONLY | O_CREAT, ce->ce_mode)) < 0) {
>>> +		free(result.ptr);
>>> +		return error_errno(_("failed to open file '%s'"), path);
>>> +	}
>>> +
>>> +	written = write_in_full(dest, result.ptr, result.size);
>>> +	close(dest);
>>> +
>>> +	free(result.ptr);
>>> +
>>> +	if (written < 0)
>>> +		return error_errno(_("failed to write to '%s'"), path);
>>> +
>> 
>> This open(..., ce->ce_mode) call is way insufficient.
>> 
>> The comment we have above this part of the code talks about the
>> difficulty of doing this correctly in scripted version.  Creating a
>> file by 'git checkout-index -f --stage=2 -- "$4"' and reusing it to
>> store the merged contents was the cleanest and easiest way without
>> having direct access to adjust_shared_perm() to create a working
>> tree file with the correct permission bits.

The original that the comment applies to does this

	git checkout-index -f --stage=2 -- "$4" && cat "$src1" >"$4" || exit 1

to create path "$4" with the correct mode bits, instead of a naïve

	mv "$src1" "$4"

because the filemode 'git checkout-index -f --stage=2 -- "$4"' gives
to file "$4" is by definition the most correct one for the path.
The command knows how user's umask and type bits in the index should
interact and produce the final mode bits, but "$src1" was created
without any regard to the mode bits---the 'git merge-file' command
only cares about the contents and not filemode.  We can even lose
the executable bit that way.  And preparing "$4" and then catting
the computed contents into it was a roundabout way (it wastes the
entire writing-out of the contents from the index), and that was
what the comment was about.

But all that is unnecessary once you port this to C.  So the comment
does not apply to the code you wrote, I think, and should just be
dropped.
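
To illustrate why the roundabout "checkout then cat" is not needed in C,
here is a rough sketch of creating the result file with sensible mode
bits directly.  It assumes the common convention of requesting 0777 or
0666 depending on the executable bit and letting the umask trim the
rest; the `toy_' helpers are hypothetical, and whether they match any
particular helper in git is an assumption:

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: pick creation bits from an index-style mode. */
mode_t toy_creation_mode(unsigned int index_mode)
{
	/* Owner-executable in the index => ask for 0777, else 0666. */
	return (index_mode & 0100) ? 0777 : 0666;
}

/* Hypothetical helper: replace `path` with `len` bytes from `buf`. */
int toy_write_merged(const char *path, const void *buf, size_t len,
		     unsigned int index_mode)
{
	const char *p = buf;
	int fd;

	assert(path && (buf || !len));

	unlink(path); /* a missing file is fine here */
	/* The kernel applies the user's umask to the requested bits. */
	fd = open(path, O_WRONLY | O_CREAT | O_EXCL,
		  toy_creation_mode(index_mode));
	if (fd < 0)
		return -1;

	while (len) {
		ssize_t n = write(fd, p, len);
		if (n < 0) {
			if (errno == EINTR)
				continue; /* interrupted, try again */
			close(fd);
			return -1;
		}
		p += n;
		len -= (size_t)n;
	}
	return close(fd);
}
```

With a umask of 022, a path recorded as 100755 would end up 0755 on
disk and one recorded as 100644 would end up 0644, without writing the
"ours" contents out first.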


^ permalink raw reply	[flat|nested] 221+ messages in thread

* Re: [PATCH v3 02/11] merge-one-file: rewrite in C
  2020-10-21 19:47         ` Alban Gruin
  2020-10-21 20:28           ` Junio C Hamano
@ 2020-10-21 20:30           ` Junio C Hamano
  1 sibling, 0 replies; 221+ messages in thread
From: Junio C Hamano @ 2020-10-21 20:30 UTC (permalink / raw)
  To: Alban Gruin; +Cc: git, phillip.wood

Alban Gruin <alban.gruin@gmail.com> writes:

>>> +	int ret, i, dest;
>>> +	ssize_t written;
>>> +	mmbuffer_t result = {NULL, 0};
>>> +	mmfile_t mmfs[3];
>>> +	struct ll_merge_options merge_opts = {0};
>>> +	struct cache_entry *ce;
>>> +
>>> +	if (our_mode == S_IFLNK || their_mode == S_IFLNK)
>>> +		return error(_("%s: Not merging symbolic link changes."), path);
>>> +	else if (our_mode == S_IFGITLINK || their_mode == S_IFGITLINK)
>>> +		return error(_("%s: Not merging conflicting submodule changes."), path);
>>> +
>>> +	read_mmblob(mmfs + 1, our_blob);
>>> +	read_mmblob(mmfs + 2, their_blob);
>>> +
>>> +	if (orig_blob) {
>>> +		printf(_("Auto-merging %s\n"), path);
>>> +		read_mmblob(mmfs + 0, orig_blob);
>>> +	} else {
>>> +		printf(_("Added %s in both, but differently.\n"), path);
>>> +		read_mmblob(mmfs + 0, &null_oid);
>>> +	}
>>> +
>>> +	merge_opts.xdl_opts = XDL_MERGE_ZEALOUS_ALNUM;
>>> +	ret = ll_merge(&result, path,
>>> +		       mmfs + 0, "orig",
>>> +		       mmfs + 1, "our",
>>> +		       mmfs + 2, "their",
>>> +		       istate, &merge_opts);
>> 
>> Is it correct to call into ll_merge() here?  The original used to
>> call "git merge-file" which called into xdl_merge().  Calling into
>> ll_merge() means the path is used to look up the attributes and use
>> the custom merge driver, which I am not offhand sure is what we want
>> to see at this low level (and if it turns out to be a good idea, we
>> definitely should explain the change of semantics in the proposed
>> log message for this commit).

I am still not sure if it is correct to call ll_merge() and not the
xdl_merge() from here.  We need to highlight this change in the log
message, if we were still going to do this.

Thanks.


^ permalink raw reply	[flat|nested] 221+ messages in thread

* Re: [PATCH v3 02/11] merge-one-file: rewrite in C
  2020-10-21 20:28           ` Junio C Hamano
@ 2020-10-21 21:20             ` Junio C Hamano
  0 siblings, 0 replies; 221+ messages in thread
From: Junio C Hamano @ 2020-10-21 21:20 UTC (permalink / raw)
  To: Alban Gruin; +Cc: git, phillip.wood

Junio C Hamano <gitster@pobox.com> writes:

>>> This open(..., ce->ce_mode) call is way insufficient.
>>> ...
> But all that is unnecessary once you port this to C.  So the comment
> does not apply to the code you wrote, I think, and should just be
> dropped.

Sorry, forgot to mention one thing.  Using ce->ce_mode to create the
output file is how the helpers in entry.c check out paths from
the index to the working tree, so the code is OK.  It's just that the
copied comment was about an issue that your code did not even have
to worry about.

Thanks.

^ permalink raw reply	[flat|nested] 221+ messages in thread

* Re: [PATCH v3 03/11] merge-index: libify merge_one_path() and merge_all()
  2020-10-09  4:48       ` Junio C Hamano
@ 2020-11-06 19:53         ` Alban Gruin
  0 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2020-11-06 19:53 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, phillip.wood

Hi Junio,

Le 09/10/2020 à 06:48, Junio C Hamano a écrit :
> Alban Gruin <alban.gruin@gmail.com> writes:
> 
>> diff --git a/merge-strategies.c b/merge-strategies.c
>> index bbe6f48698..f0e30f5624 100644
>> --- a/merge-strategies.c
>> +++ b/merge-strategies.c
>> @@ -2,6 +2,7 @@
>>  #include "dir.h"
>>  #include "ll-merge.h"
>>  #include "merge-strategies.h"
>> +#include "run-command.h"
>>  #include "xdiff-interface.h"
>>  
>>  static int add_to_index_cacheinfo(struct index_state *istate,
>> @@ -212,3 +213,101 @@ int merge_strategies_one_file(struct repository *r,
>>  
>>  	return 0;
>>  }
>> +
>> +int merge_program_cb(const struct object_id *orig_blob,
>> +		     const struct object_id *our_blob,
>> +		     const struct object_id *their_blob, const char *path,
>> +		     unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
>> +		     void *data)
>> +{
>> +	char ownbuf[3][GIT_MAX_HEXSZ] = {{0}};
>> +	const char *arguments[] = { (char *)data, "", "", "", path,
>> +				    ownbuf[0], ownbuf[1], ownbuf[2],
>> +				    NULL };
>> +
>> +	if (orig_blob)
>> +		arguments[1] = oid_to_hex(orig_blob);
>> +	if (our_blob)
>> +		arguments[2] = oid_to_hex(our_blob);
>> +	if (their_blob)
>> +		arguments[3] = oid_to_hex(their_blob);
> 
> oid_to_hex() uses 4-slot rotating buffer, no?  Relying on the fact
> that three would be available here without getting reused (or,
> rather, our caller didn't make its own calls and/or does not mind
> us invalidating all but one slot for them) feels a bit iffy.
> 
> Extending ownbuf[] to 6 elements and using oid_to_hex_r() would be a
> trivial way to clarify the code.
> 
>> +	xsnprintf(ownbuf[0], sizeof(ownbuf[0]), "%o", orig_mode);
>> +	xsnprintf(ownbuf[1], sizeof(ownbuf[1]), "%o", our_mode);
>> +	xsnprintf(ownbuf[2], sizeof(ownbuf[2]), "%o", their_mode);
> 
> And these mode bits would not need GIT_MAX_HEXSZ to begin with.
> This smells like a WIP that hasn't been carefully proofread.
> 
> 	char oidbuf[3][GIT_MAX_HEXSZ] = { 0 };
> 	char modebuf[3][8] = { 0 };

So here I picked GIT_MAX_HEXSZ + 1 and 10 for those buffers; these sizes
are already used by builtin/diff.c.

> 	char *args[] = {
> 		data, oidbuf[0], oidbuf[1], oidbuf[2], path,
> 		modebuf[0], modebuf[1], modebuf[2], NULL,
> 	};
>         
>         if (orig_blob)
> 		oid_to_hex_r(oidbuf[0], orig_blob);
> 	...
> 	xsnprintf(modebuf[0], ...);
> 
> 
> Eh, wait.  Is this meant to be able to drive "git-merge-one-file",
> i.e. a missing common/ours/theirs is indicated by an empty string
> in both oid and mode?  If so, an unconditional xsnprintf() would
> either give garbage or "0" at best, neither of which is an empty
> string.  So the body would be more like
> 
> 	if (orig_blob) {
> 		oid_to_hex_r(oidbuf[0], orig_blob);
> 		xsnprintf(modebuf[0], "%o", orig_mode);
> 	}
> 	if (our_blob) {
> 		oid_to_hex_r(oidbuf[1], our_blob);
> 		xsnprintf(modebuf[1], "%o", our_mode);
> 	}
> 	...
> 
> wouldn't it?
> 

Yes, especially since you suggested to error out if an empty oid has a
non-empty mode in the second patch.
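
To make the two points above concrete (a rotating static buffer can be
silently reused, and a missing side must stay an empty string), here is
a self-contained toy.  It does not use git's real functions: the `toy_'
names, the 4-byte id size and the buffer sizes are all assumptions made
for the sketch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define TOY_RAWSZ 4               /* hypothetical raw id size */
#define TOY_HEXSZ (2 * TOY_RAWSZ) /* hex digits, without the NUL */

struct toy_side {
	const unsigned char *raw; /* NULL when this side is missing */
	unsigned int mode;
};

/* Caller-owned buffer variant: `out` needs TOY_HEXSZ + 1 bytes. */
char *toy_to_hex_r(char *out, const unsigned char *raw)
{
	static const char digits[] = "0123456789abcdef";
	int i;

	assert(out && raw);
	for (i = 0; i < TOY_RAWSZ; i++) {
		out[2 * i] = digits[raw[i] >> 4];
		out[2 * i + 1] = digits[raw[i] & 0xf];
	}
	out[TOY_HEXSZ] = '\0';
	return out;
}

/* Rotating variant: the fifth call overwrites the first result. */
const char *toy_to_hex(const unsigned char *raw)
{
	static char bufs[4][TOY_HEXSZ + 1];
	static int next;
	char *out = bufs[next];

	next = (next + 1) % 4;
	return toy_to_hex_r(out, raw);
}

/* Fill argument strings; a missing side leaves both strings empty. */
void toy_fill_args(const struct toy_side sides[3],
		   char oidbuf[3][TOY_HEXSZ + 1], char modebuf[3][10])
{
	int i;

	for (i = 0; i < 3; i++) {
		oidbuf[i][0] = '\0';
		modebuf[i][0] = '\0';
		if (sides[i].raw) {
			toy_to_hex_r(oidbuf[i], sides[i].raw);
			snprintf(modebuf[i], 10, "%o", sides[i].mode);
		}
	}
}
```

Because each side writes into its own slot of oidbuf[] and modebuf[],
nothing the caller did earlier with the rotating variant can be
invalidated, and an absent blob is passed on as "" for both id and mode.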

>> +	return run_command_v_opt(arguments, 0);
>> +}
>> +
>> +static int merge_entry(struct index_state *istate, int quiet, int pos,
>> +		       const char *path, merge_cb cb, void *data)
> 
> When we use an identifier "cb", it typically means callback data,
> not a callback function which is often called "fn".  So, name the
> type "merge_fn" (or "merge_func"), and call the parameter "fn".
> 
>> +{
>> +	int found = 0;
>> +	const struct object_id *oids[3] = {NULL};
>> +	unsigned int modes[3] = {0};
>> +
>> +	do {
>> +		const struct cache_entry *ce = istate->cache[pos];
>> +		int stage = ce_stage(ce);
>> +
>> +		if (strcmp(ce->name, path))
>> +			break;
>> +		found++;
>> +		oids[stage - 1] = &ce->oid;
>> +		modes[stage - 1] = ce->ce_mode;
>> +	} while (++pos < istate->cache_nr);
>> +	if (!found)
>> +		return error(_("%s is not in the cache"), path);
>> +
>> +	if (cb(oids[0], oids[1], oids[2], path, modes[0], modes[1], modes[2], data)) {
>> +		if (!quiet)
>> +			error(_("Merge program failed"));
>> +		return -2;
>> +	}
>> +
>> +	return found;
>> +}
> 
> This copies from builtin/merge-index.c::merge_entry().
> 
>> +int merge_one_path(struct index_state *istate, int oneshot, int quiet,
>> +		   const char *path, merge_cb cb, void *data)
>> +{
>> +	int pos = index_name_pos(istate, path, strlen(path)), ret;
>> +
>> +	/*
>> +	 * If it already exists in the cache as stage0, it's
>> +	 * already merged and there is nothing to do.
>> +	 */
>> +	if (pos < 0) {
>> +		ret = merge_entry(istate, quiet, -pos - 1, path, cb, data);
>> +		if (ret == -1)
>> +			return -1;
>> +		else if (ret == -2)
>> +			return 1;
>> +	}
>> +	return 0;
>> +}
> 
> Likewise from the same function in that file.
> 
> Are we removing the "git merge-index" program?  Reusing the same
> identifier for these copied-and-pasted pairs of functions bothers
> me for two reasons.
> 
>  - An identifier that was clear and unique enough in the original
>    context as a file-scope static may not be a good name as a global
>    identifier.  
> 
>  - Having two similar-looking functions with the same name makes
>    reading and learning the codebase starting at "git grep" hits
>    more difficult than necessary.
> 

I don't plan to remove `git merge-index' -- nor any other program, for
that matter.  Why not rename merge_one_path() and merge_all() to
merge_index_path() and merge_all_index()?

>> +int merge_all(struct index_state *istate, int oneshot, int quiet,
>> +	      merge_cb cb, void *data)
>> +{
>> +	int err = 0, i, ret;
>> +	for (i = 0; i < istate->cache_nr; i++) {
>> +		const struct cache_entry *ce = istate->cache[i];
>> +		if (!ce_stage(ce))
>> +			continue;
>> +
>> +		ret = merge_entry(istate, quiet, i, ce->name, cb, data);
>> +		if (ret > 0)
>> +			i += ret - 1;
>> +		else if (ret == -1)
>> +			return -1;
>> +		else if (ret == -2) {
>> +			if (oneshot)
>> +				err++;
>> +			else
>> +				return 1;
>> +		}
>> +	}
>> +
>> +	return err;
>> +}
> 
> Likewise.
> 
>> diff --git a/merge-strategies.h b/merge-strategies.h
>> index b527d145c7..cf78d7eaf4 100644
>> --- a/merge-strategies.h
>> +++ b/merge-strategies.h
>> @@ -10,4 +10,21 @@ int merge_strategies_one_file(struct repository *r,
>>  			      unsigned int orig_mode, unsigned int our_mode,
>>  			      unsigned int their_mode);
>>  
>> +typedef int (*merge_cb)(const struct object_id *orig_blob,
>> +			const struct object_id *our_blob,
>> +			const struct object_id *their_blob, const char *path,
>> +			unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
>> +			void *data);
> 
> Call it "merge_one_file_func", probably.
> 
>> +int merge_program_cb(const struct object_id *orig_blob,
> 
> Call it spawn_merge_one_file() perhaps?
> 
>> +		     const struct object_id *our_blob,
>> +		     const struct object_id *their_blob, const char *path,
>> +		     unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
>> +		     void *data);
>> +
>> +int merge_one_path(struct index_state *istate, int oneshot, int quiet,
>> +		   const char *path, merge_cb cb, void *data);
>> +int merge_all(struct index_state *istate, int oneshot, int quiet,
>> +	      merge_cb cb, void *data);
>>  #endif /* MERGE_STRATEGIES_H */

Ack for the rest, the two function names are the only thing I am still
missing on this patch right now.

Alban


^ permalink raw reply	[flat|nested] 221+ messages in thread

* Re: [PATCH v3 05/11] merge-resolve: rewrite in C
  2020-10-16 19:19       ` Junio C Hamano
@ 2020-11-06 19:53         ` Alban Gruin
  0 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2020-11-06 19:53 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, phillip.wood

Le 16/10/2020 à 21:19, Junio C Hamano a écrit :
> Alban Gruin <alban.gruin@gmail.com> writes:
> 
>> +#include "cache.h"
>> +#include "builtin.h"
>> +#include "merge-strategies.h"
>> +
>> +static const char builtin_merge_resolve_usage[] =
>> +	"git merge-resolve <bases>... -- <head> <remote>";
>> +
>> +int cmd_merge_resolve(int argc, const char **argv, const char *prefix)
>> +{
>> +	int i, is_baseless = 1, sep_seen = 0;
>> +	const char *head = NULL;
>> +	struct commit_list *bases = NULL, *remote = NULL;
>> +	struct commit_list **next_base = &bases;
>> +
>> +	if (argc < 5)
>> +		usage(builtin_merge_resolve_usage);
>> +
>> +	setup_work_tree();
>> +	if (repo_read_index(the_repository) < 0)
>> +		die("invalid index");
>> +
>> +	/* The first parameters up to -- are merge bases; the rest are
>> +	 * heads. */
> 
> Style (I won't repeat).
> 
>> +	for (i = 1; i < argc; i++) {
>> +		if (strcmp(argv[i], "--") == 0)
> 
> 	if (!strcmp(...))
> 
> is more typical than comparing with "== 0".
> 
>> +			sep_seen = 1;
>> +		else if (strcmp(argv[i], "-h") == 0)
>> +			usage(builtin_merge_resolve_usage);
>> +		else if (sep_seen && !head)
>> +			head = argv[i];
>> +		else if (remote) {
>> +			/* Give up if we are given two or more remotes.
>> +			 * Not handling octopus. */
>> +			return 2;
>> +		} else {
>> +			struct object_id oid;
>> +
>> +			get_oid(argv[i], &oid);
>> +			is_baseless &= sep_seen;
>> +
>> +			if (!oideq(&oid, the_hash_algo->empty_tree)) {
> 
> What is this business about an empty tree about?
> 

I don’t remember my intent here -- perhaps I wanted to avoid merges on
empty trees…  I’ll remove that from here and merge-octopus.c.

>> +				struct commit *commit;
>> +				commit = lookup_commit_or_die(&oid, argv[i]);
>> +
>> +				if (sep_seen)
>> +					commit_list_append(commit, &remote);
>> +				else
>> +					next_base = commit_list_append(commit, next_base);
>> +			}
>> +		}
>> +	}
>> +
>> +	/* Give up if this is a baseless merge. */
>> +	if (is_baseless)
>> +		return 2;
> 
> This is quite convoluted.  
> 
> The original is much more straight-forward.  We just said "grab
> everything before we see '--' and call them bases; immediately after
> '--' is HEAD and everything else is remote.  Now do we have any
> base?  Otherwise we cannot handle it".
> 
> I cannot see an equivalence to it in the rewritten result, with the
> bit operation with is_baseless and sep_seen.  Wouldn't it be the
> matter of checking if next_base is NULL, or is there something more
> subtle that deserves in-code comment going on?
> 

After re-reading this many, many weeks later, I can confirm that this is
convoluted, and that there is a much better way to perform some checks:
for instance, checking if `bases' is NULL instead of having
`is_baseless', or checking after the loop if `remotes->next' is not NULL
to verify whether there are multiple remotes.
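
A minimal sketch of the simpler shape discussed here, working on plain
strings rather than commits.  The `toy_' names, the fixed capacity and
the choice to require exactly one remote are assumptions of the sketch,
not the code that ended up in the series:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX 16 /* hypothetical capacity, for the sketch only */

struct toy_args {
	const char *bases[TOY_MAX];
	int nr_bases;
	const char *head;
	const char *remotes[TOY_MAX];
	int nr_remotes;
};

/*
 * Everything before "--" is a base, the first argument after it is
 * the head, and the rest are remotes.  Returns 0 on success, 2 when
 * the merge cannot be handled (no base, or not exactly one remote),
 * and -1 when the toy capacity is exceeded.
 */
int toy_parse(int argc, const char **argv, struct toy_args *out)
{
	int i, sep_seen = 0;

	assert(argv && out);
	*out = (struct toy_args){0};

	for (i = 1; i < argc; i++) {
		if (!sep_seen && !strcmp(argv[i], "--")) {
			sep_seen = 1;
		} else if (!sep_seen) {
			if (out->nr_bases == TOY_MAX)
				return -1;
			out->bases[out->nr_bases++] = argv[i];
		} else if (!out->head) {
			out->head = argv[i];
		} else {
			if (out->nr_remotes == TOY_MAX)
				return -1;
			out->remotes[out->nr_remotes++] = argv[i];
		}
	}

	/* The checks happen once, after the loop. */
	if (!out->nr_bases)
		return 2; /* baseless merge */
	if (out->nr_remotes != 1)
		return 2; /* not a two-head merge */
	return 0;
}
```

Collecting first and validating afterwards keeps the loop free of the
is_baseless/sep_seen bit juggling that the review found hard to follow.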

> Thanks.
> 

Alban


^ permalink raw reply	[flat|nested] 221+ messages in thread

* [PATCH v4 00/12] Rewrite the remaining merge strategies from shell to C
  2020-10-05 12:26   ` [PATCH v3 00/11] Rewrite the remaining merge strategies from shell to C Alban Gruin
                       ` (11 preceding siblings ...)
  2020-10-07  6:57     ` [PATCH v3 00/11] Rewrite the remaining merge strategies from shell to C Johannes Schindelin
@ 2020-11-13 11:04     ` Alban Gruin
  2020-11-13 11:04       ` [PATCH v4 01/12] t6027: modernise tests Alban Gruin
                         ` (12 more replies)
  12 siblings, 13 replies; 221+ messages in thread
From: Alban Gruin @ 2020-11-13 11:04 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Phillip Wood, Alban Gruin

In an effort to reduce the number of shell scripts in git's codebase, I
propose this patch series converting the two remaining merge strategies,
resolve and octopus, from shell to C.  This will enable slightly better
performance, better integration with git itself (no more forking to
perform these operations), and better portability (Windows and shell
scripts don't mix well).

Three scripts are actually converted: first git-merge-one-file.sh, then
git-merge-resolve.sh, and finally git-merge-octopus.sh.  Not only are
they converted, but they are also modified to operate without forking,
and then libified so they can be used by git without spawning another
process.

The first patch is not important to make the whole series work, but I
made this patch while working on it.

This series keeps the commands `git merge-one-file', `git
merge-resolve', and `git merge-octopus', so any script depending on them
should keep working without any changes.

This series is based on 306ee63a70 (Eighteenth batch, 2020-09-29).  The
tip is tagged as "rewrite-merge-strategies-v4" at
https://github.com/agrn/git.

Changes since v3:

 - [2/12] Move add_cacheinfo() to read-cache.c and rename it
   add_to_index_cacheinfo().  That way, there is no need to copy it to
   merge-strategies.c.  It also returns the new cache entry.

 - [3/12] Changed SHA1 to "object name" in the comments

 - [3/12] Error out if an object was not specified but a corresponding
   mode was.

 - [3/12] Add a cache entry parameter to checkout_from_index() to avoid
   calling index_file_exists(), as all of its callers now have the new
   cache entry thanks to add_to_index_cacheinfo().

 - [3/12] Replace ll_merge() with xdl_merge() in do_merge_one_file().

 - [3/12] Fail earlier in the case of a permission conflict in
   do_merge_one_file().

 - [3/12] Use `our_mode' instead of fetching a cache entry to define the
   mode of a merged file in do_merge_one_file().

 - [3/12] Rename merge_strategies_one_file() to merge_three_way().

 - [3/12] Reformatted a long chain of if/else if/else blocks.

 - [4/12] Rename merge_all() to merge_all_index(), merge_one_path() to
   merge_index_path(), merge_program_cb() to merge_one_file_spawn(),
   `merge_cb' to `merge_fn', and the parameters `cb' to `fn'.

 - [4/12] Use oid_to_hex_r() instead of oid_to_hex() in
   merge_one_file_spawn().

 - [5/12] Rename merge_one_file_cb() to merge_one_file_func().

 - [6/12, 8/12] Enable `USE_THE_INDEX_COMPATIBILITY_MACROS' and use
   read_cache() instead of repo_read_index().

 - [6/12] The parameter parsing has been rewritten to look less
   convoluted.

 - [6/12] Reformatted multi-line comments.

 - [7/12] Fixed multiple mistakes in the commit message.

 - [8/12] The parameter parsing has been rewritten to look more like
   builtin/merge-resolve.c.

 - [3/12, 6/12, 8/12] Removed obsolete information from commit
   messages.

Alban Gruin (12):
  t6027: modernise tests
  update-index: move add_cacheinfo() to read-cache.c
  merge-one-file: rewrite in C
  merge-index: libify merge_one_path() and merge_all()
  merge-index: don't fork if the requested program is
    `git-merge-one-file'
  merge-resolve: rewrite in C
  merge-recursive: move better_branch_name() to merge.c
  merge-octopus: rewrite in C
  merge: use the "resolve" strategy without forking
  merge: use the "octopus" strategy without forking
  sequencer: use the "resolve" strategy without forking
  sequencer: use the "octopus" merge strategy without forking

 Makefile                        |   7 +-
 builtin.h                       |   3 +
 builtin/merge-index.c           | 102 ++----
 builtin/merge-octopus.c         |  69 ++++
 builtin/merge-one-file.c        |  94 ++++++
 builtin/merge-recursive.c       |  16 +-
 builtin/merge-resolve.c         |  73 ++++
 builtin/merge.c                 |   9 +-
 builtin/update-index.c          |  25 +-
 cache.h                         |   7 +-
 git-merge-octopus.sh            | 112 -------
 git-merge-one-file.sh           | 167 ---------
 git-merge-resolve.sh            |  54 ---
 git.c                           |   3 +
 merge-strategies.c              | 576 ++++++++++++++++++++++++++++++++
 merge-strategies.h              |  43 +++
 merge.c                         |  12 +
 read-cache.c                    |  35 ++
 sequencer.c                     |  16 +-
 t/t6407-merge-binary.sh         |  27 +-
 t/t6415-merge-dir-to-symlink.sh |   2 +-
 21 files changed, 987 insertions(+), 465 deletions(-)
 create mode 100644 builtin/merge-octopus.c
 create mode 100644 builtin/merge-one-file.c
 create mode 100644 builtin/merge-resolve.c
 delete mode 100755 git-merge-octopus.sh
 delete mode 100755 git-merge-one-file.sh
 delete mode 100755 git-merge-resolve.sh
 create mode 100644 merge-strategies.c
 create mode 100644 merge-strategies.h

Range-diff against v3:
 1:  08c7df596a =  1:  08c7df596a t6027: modernise tests
 -:  ---------- >  2:  df237da758 update-index: move add_cacheinfo() to read-cache.c
 2:  ce911c99c0 !  3:  b64bad0d23 merge-one-file: rewrite in C
    @@ -10,22 +10,23 @@
         external processes are replaced by calls to functions in libgit.a:
     
          - calls to `update-index --add --cacheinfo' are replaced by calls to
    -       add_cache_entry();
    +       add_to_index_cacheinfo();
     
          - calls to `update-index --remove' are replaced by calls to
    -       remove_file_from_cache();
    +       remove_file_from_index();
     
          - calls to `checkout-index -u -f' are replaced by calls to
            checkout_entry();
     
          - calls to `unpack-file' and `merge-files' are replaced by calls to
    -       read_mmblob() and ll_merge(), respectively, to merge files
    +       read_mmblob() and xdl_merge(), respectively, to merge files
            in-memory;
     
    -     - calls to `checkout-index -f --stage=2' are replaced by calls to
    -       cache_file_exists();
    +     - calls to `checkout-index -f --stage=2' are removed, as this is needed
    +       to have the correct permission bits on the merged file from the
    +       script, but not in the C version;
     
    -     - calls to `update-index' are replaced by calls to add_file_to_cache().
    +     - calls to `update-index' are replaced by calls to add_file_to_index().
     
         The bulk of the rewrite is done in a new file in libgit.a,
         merge-strategies.c.  This will enable the resolve and octopus strategies
    @@ -96,9 +97,9 @@
     + *
     + * This is the git per-file merge utility, called with
     + *
    -+ *   argv[1] - original file SHA1 (or empty)
    -+ *   argv[2] - file in branch1 SHA1 (or empty)
    -+ *   argv[3] - file in branch2 SHA1 (or empty)
    ++ *   argv[1] - original file object name (or empty)
    ++ *   argv[2] - file in branch1 object name (or empty)
    ++ *   argv[3] - file in branch2 object name (or empty)
     + *   argv[4] - pathname in repository
     + *   argv[5] - original file mode (or empty)
     + *   argv[6] - file in branch1 mode (or empty)
    @@ -150,27 +151,29 @@
     +
     +	hold_locked_index(&lock, LOCK_DIE_ON_ERROR);
     +
    -+	if (!get_oid(argv[1], &orig_blob)) {
    ++	if (!get_oid_hex(argv[1], &orig_blob)) {
     +		p_orig_blob = &orig_blob;
     +		ret = read_mode("orig", argv[5], &orig_mode);
    -+	}
    ++	} else if (!*argv[1] && *argv[5])
    ++		ret = error(_("no 'orig' object id given, but a mode was still given."));
     +
    -+	if (!get_oid(argv[2], &our_blob)) {
    ++	if (!get_oid_hex(argv[2], &our_blob)) {
     +		p_our_blob = &our_blob;
     +		ret = read_mode("our", argv[6], &our_mode);
    -+	}
    ++	} else if (!*argv[2] && *argv[6])
    ++		ret = error(_("no 'our' object id given, but a mode was still given."));
     +
    -+	if (!get_oid(argv[3], &their_blob)) {
    ++	if (!get_oid_hex(argv[3], &their_blob)) {
     +		p_their_blob = &their_blob;
     +		ret = read_mode("their", argv[7], &their_mode);
    -+	}
    ++	} else if (!*argv[3] && *argv[7])
    ++		ret = error(_("no 'their' object id given, but a mode was still given."));
     +
     +	if (ret)
     +		return ret;
     +
    -+	ret = merge_strategies_one_file(the_repository,
    -+					p_orig_blob, p_our_blob, p_their_blob, argv[4],
    -+					orig_mode, our_mode, their_mode);
    ++	ret = merge_three_way(the_repository, p_orig_blob, p_our_blob, p_their_blob,
    ++			      argv[4], orig_mode, our_mode, their_mode);
     +
     +	if (ret) {
     +		rollback_lock_file(&lock);
    @@ -372,55 +375,25 @@
     @@
     +#include "cache.h"
     +#include "dir.h"
    -+#include "ll-merge.h"
     +#include "merge-strategies.h"
     +#include "xdiff-interface.h"
     +
    -+static int add_to_index_cacheinfo(struct index_state *istate,
    -+				  unsigned int mode,
    -+				  const struct object_id *oid, const char *path)
    -+{
    -+	struct cache_entry *ce;
    -+	int len, option;
    -+
    -+	if (!verify_path(path, mode))
    -+		return error(_("Invalid path '%s'"), path);
    -+
    -+	len = strlen(path);
    -+	ce = make_empty_cache_entry(istate, len);
    -+
    -+	oidcpy(&ce->oid, oid);
    -+	memcpy(ce->name, path, len);
    -+	ce->ce_flags = create_ce_flags(0);
    -+	ce->ce_namelen = len;
    -+	ce->ce_mode = create_ce_mode(mode);
    -+	if (assume_unchanged)
    -+		ce->ce_flags |= CE_VALID;
    -+	option = ADD_CACHE_OK_TO_ADD | ADD_CACHE_OK_TO_REPLACE;
    -+	if (add_index_entry(istate, ce, option))
    -+		return error(_("%s: cannot add to the index"), path);
    -+
    -+	return 0;
    -+}
    -+
    -+static int checkout_from_index(struct index_state *istate, const char *path)
    ++static int checkout_from_index(struct index_state *istate, const char *path,
    ++			       struct cache_entry *ce)
     +{
     +	struct checkout state = CHECKOUT_INIT;
    -+	struct cache_entry *ce;
     +
     +	state.istate = istate;
     +	state.force = 1;
     +	state.base_dir = "";
     +	state.base_dir_len = 0;
     +
    -+	ce = index_file_exists(istate, path, strlen(path), 0);
     +	if (checkout_entry(ce, &state, NULL, NULL) < 0)
     +		return error(_("%s: cannot checkout file"), path);
     +	return 0;
     +}
     +
     +static int merge_one_file_deleted(struct index_state *istate,
    -+				  const struct object_id *orig_blob,
     +				  const struct object_id *our_blob,
     +				  const struct object_id *their_blob, const char *path,
     +				  unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode)
    @@ -452,16 +425,15 @@
     +	ssize_t written;
     +	mmbuffer_t result = {NULL, 0};
     +	mmfile_t mmfs[3];
    -+	struct ll_merge_options merge_opts = {0};
    -+	struct cache_entry *ce;
    ++	xmparam_t xmp = {{0}};
     +
     +	if (our_mode == S_IFLNK || their_mode == S_IFLNK)
     +		return error(_("%s: Not merging symbolic link changes."), path);
     +	else if (our_mode == S_IFGITLINK || their_mode == S_IFGITLINK)
     +		return error(_("%s: Not merging conflicting submodule changes."), path);
    -+
    -+	read_mmblob(mmfs + 1, our_blob);
    -+	read_mmblob(mmfs + 2, their_blob);
    ++	else if (our_mode != their_mode)
    ++		return error(_("permission conflict: %o->%o,%o in %s"),
    ++			     orig_mode, our_mode, their_mode, path);
     +
     +	if (orig_blob) {
     +		printf(_("Auto-merging %s\n"), path);
    @@ -471,12 +443,14 @@
     +		read_mmblob(mmfs + 0, &null_oid);
     +	}
     +
    -+	merge_opts.xdl_opts = XDL_MERGE_ZEALOUS_ALNUM;
    -+	ret = ll_merge(&result, path,
    -+		       mmfs + 0, "orig",
    -+		       mmfs + 1, "our",
    -+		       mmfs + 2, "their",
    -+		       istate, &merge_opts);
    ++	read_mmblob(mmfs + 1, our_blob);
    ++	read_mmblob(mmfs + 2, their_blob);
    ++
    ++	xmp.level = XDL_MERGE_ZEALOUS_ALNUM;
    ++	xmp.style = 0;
    ++	xmp.favor = 0;
    ++
    ++	ret = xdl_merge(mmfs + 0, mmfs + 1, mmfs + 2, &xmp, &result);
     +
     +	for (i = 0; i < 3; i++)
     +		free(mmfs[i].ptr);
    @@ -484,18 +458,13 @@
     +	if (ret < 0) {
     +		free(result.ptr);
     +		return error(_("Failed to execute internal merge"));
    ++	} else if (ret > 0 || !orig_blob) {
    ++		free(result.ptr);
    ++		return error(_("content conflict in %s"), path);
     +	}
     +
    -+	/*
    -+	 * Create the working tree file, using "our tree" version from
    -+	 * the index, and then store the result of the merge.
    -+	 */
    -+	ce = index_file_exists(istate, path, strlen(path), 0);
    -+	if (!ce)
    -+		BUG("file is not present in the cache?");
    -+
     +	unlink(path);
    -+	if ((dest = open(path, O_WRONLY | O_CREAT, ce->ce_mode)) < 0) {
    ++	if ((dest = open(path, O_WRONLY | O_CREAT, our_mode)) < 0) {
     +		free(result.ptr);
     +		return error_errno(_("failed to open file '%s'"), path);
     +	}
    @@ -508,49 +477,42 @@
     +	if (written < 0)
     +		return error_errno(_("failed to write to '%s'"), path);
     +
    -+	if (ret != 0 || !orig_blob)
    -+		ret = error(_("content conflict in %s"), path);
    -+	if (our_mode != their_mode)
    -+		return error(_("permission conflict: %o->%o,%o in %s"),
    -+			     orig_mode, our_mode, their_mode, path);
    -+	if (ret)
    -+		return -1;
    -+
     +	return add_file_to_index(istate, path, 0);
     +}
     +
    -+int merge_strategies_one_file(struct repository *r,
    -+			      const struct object_id *orig_blob,
    -+			      const struct object_id *our_blob,
    -+			      const struct object_id *their_blob, const char *path,
    -+			      unsigned int orig_mode, unsigned int our_mode,
    -+			      unsigned int their_mode)
    ++int merge_three_way(struct repository *r,
    ++		    const struct object_id *orig_blob,
    ++		    const struct object_id *our_blob,
    ++		    const struct object_id *their_blob, const char *path,
    ++		    unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode)
     +{
     +	if (orig_blob &&
     +	    ((!their_blob && our_blob && oideq(orig_blob, our_blob)) ||
    -+	     (!our_blob && their_blob && oideq(orig_blob, their_blob))))
    ++	     (!our_blob && their_blob && oideq(orig_blob, their_blob)))) {
     +		/* Deleted in both or deleted in one and unchanged in the other. */
    -+		return merge_one_file_deleted(r->index,
    -+					      orig_blob, our_blob, their_blob, path,
    ++		return merge_one_file_deleted(r->index, our_blob, their_blob, path,
     +					      orig_mode, our_mode, their_mode);
    -+	else if (!orig_blob && our_blob && !their_blob) {
    ++	} else if (!orig_blob && our_blob && !their_blob) {
     +		/*
     +		 * Added in one.  The other side did not add and we
     +		 * added so there is nothing to be done, except making
     +		 * the path merged.
     +		 */
    -+		return add_to_index_cacheinfo(r->index, our_mode, our_blob, path);
    ++		return add_to_index_cacheinfo(r->index, our_mode, our_blob, path, 0, 1, 1, NULL);
     +	} else if (!orig_blob && !our_blob && their_blob) {
    ++		struct cache_entry *ce;
     +		printf(_("Adding %s\n"), path);
     +
     +		if (file_exists(path))
     +			return error(_("untracked %s is overwritten by the merge."), path);
     +
    -+		if (add_to_index_cacheinfo(r->index, their_mode, their_blob, path))
    ++		if (add_to_index_cacheinfo(r->index, their_mode, their_blob, path, 0, 1, 1, &ce))
     +			return -1;
    -+		return checkout_from_index(r->index, path);
    ++		return checkout_from_index(r->index, path, ce);
     +	} else if (!orig_blob && our_blob && their_blob &&
     +		   oideq(our_blob, their_blob)) {
    ++		struct cache_entry *ce;
    ++
     +		/* Added in both, identically (check for same permissions). */
     +		if (our_mode != their_mode)
     +			return error(_("File %s added identically in both branches, "
    @@ -559,15 +521,15 @@
     +
     +		printf(_("Adding %s\n"), path);
     +
    -+		if (add_to_index_cacheinfo(r->index, our_mode, our_blob, path))
    ++		if (add_to_index_cacheinfo(r->index, our_mode, our_blob, path, 0, 1, 1, &ce))
     +			return -1;
    -+		return checkout_from_index(r->index, path);
    -+	} else if (our_blob && their_blob)
    ++		return checkout_from_index(r->index, path, ce);
    ++	} else if (our_blob && their_blob) {
     +		/* Modified in both, but differently. */
     +		return do_merge_one_file(r->index,
     +					 orig_blob, our_blob, their_blob, path,
     +					 orig_mode, our_mode, their_mode);
    -+	else {
    ++	} else {
     +		char orig_hex[GIT_MAX_HEXSZ] = {0}, our_hex[GIT_MAX_HEXSZ] = {0},
     +			their_hex[GIT_MAX_HEXSZ] = {0};
     +
    @@ -595,12 +557,11 @@
     +
     +#include "object.h"
     +
    -+int merge_strategies_one_file(struct repository *r,
    -+			      const struct object_id *orig_blob,
    -+			      const struct object_id *our_blob,
    -+			      const struct object_id *their_blob, const char *path,
    -+			      unsigned int orig_mode, unsigned int our_mode,
    -+			      unsigned int their_mode);
    ++int merge_three_way(struct repository *r,
    ++		    const struct object_id *orig_blob,
    ++		    const struct object_id *our_blob,
    ++		    const struct object_id *their_blob, const char *path,
    ++		    unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode);
     +
     +#endif /* MERGE_STRATEGIES_H */
     
 3:  7f0999f5a3 !  4:  c5577dc691 merge-index: libify merge_one_path() and merge_all()
    @@ -9,11 +9,11 @@
         libgit.a, which means that once rewritten, the strategies would still
         have to invoke `merge-one-file' by spawning a new process first.
     
    -    To avoid this, this moves merge_one_path(), merge_all(), and their
    -    helpers to merge-strategies.c.  They also take a callback to dictate
    -    what they should do for each file.  For now, to preserve the behaviour
    -    of `merge-index', only one callback, launching a new process, is
    -    defined.
    +    To avoid this, this moves and renames merge_one_path(), merge_all(), and
    +    their helpers to merge-strategies.c.  They also take a callback to
    +    dictate what they should do for each file.  For now, to preserve the
    +    behaviour of `merge-index', only one callback, launching a new process,
    +    is defined.
     
         Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
     
    @@ -103,15 +103,15 @@
      			}
      			if (!strcmp(arg, "-a")) {
     -				merge_all();
    -+				err |= merge_all(&the_index, one_shot, quiet,
    -+						 merge_program_cb, (void *)pgm);
    ++				err |= merge_all_index(&the_index, one_shot, quiet,
    ++						       merge_one_file_spawn, (void *)pgm);
      				continue;
      			}
      			die("git merge-index: unknown option %s", arg);
      		}
     -		merge_one_path(arg);
    -+		err |= merge_one_path(&the_index, one_shot, quiet, arg,
    -+				      merge_program_cb, (void *)pgm);
    ++		err |= merge_index_path(&the_index, one_shot, quiet, arg,
    ++					merge_one_file_spawn, (void *)pgm);
      	}
     -	if (err && !quiet)
     -		die("merge program failed");
    @@ -122,45 +122,49 @@
      --- a/merge-strategies.c
      +++ b/merge-strategies.c
     @@
    + #include "cache.h"
      #include "dir.h"
    - #include "ll-merge.h"
      #include "merge-strategies.h"
     +#include "run-command.h"
      #include "xdiff-interface.h"
      
    - static int add_to_index_cacheinfo(struct index_state *istate,
    + static int checkout_from_index(struct index_state *istate, const char *path,
     @@
      
      	return 0;
      }
     +
    -+int merge_program_cb(const struct object_id *orig_blob,
    -+		     const struct object_id *our_blob,
    -+		     const struct object_id *their_blob, const char *path,
    -+		     unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
    -+		     void *data)
    ++int merge_one_file_spawn(const struct object_id *orig_blob,
    ++			 const struct object_id *our_blob,
    ++			 const struct object_id *their_blob, const char *path,
    ++			 unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
    ++			 void *data)
     +{
    -+	char ownbuf[3][GIT_MAX_HEXSZ] = {{0}};
    -+	const char *arguments[] = { (char *)data, "", "", "", path,
    -+				    ownbuf[0], ownbuf[1], ownbuf[2],
    -+				    NULL };
    ++	char oids[3][GIT_MAX_HEXSZ + 1] = {{0}};
    ++	char modes[3][10] = {{0}};
    ++	const char *arguments[] = { (char *)data, oids[0], oids[1], oids[2],
    ++				    path, modes[0], modes[1], modes[2], NULL };
     +
    -+	if (orig_blob)
    -+		arguments[1] = oid_to_hex(orig_blob);
    -+	if (our_blob)
    -+		arguments[2] = oid_to_hex(our_blob);
    -+	if (their_blob)
    -+		arguments[3] = oid_to_hex(their_blob);
    ++	if (orig_blob) {
    ++		oid_to_hex_r(oids[0], orig_blob);
    ++		xsnprintf(modes[0], sizeof(modes[0]), "%06o", orig_mode);
    ++	}
     +
    -+	xsnprintf(ownbuf[0], sizeof(ownbuf[0]), "%o", orig_mode);
    -+	xsnprintf(ownbuf[1], sizeof(ownbuf[1]), "%o", our_mode);
    -+	xsnprintf(ownbuf[2], sizeof(ownbuf[2]), "%o", their_mode);
    ++	if (our_blob) {
    ++		oid_to_hex_r(oids[1], our_blob);
    ++		xsnprintf(modes[1], sizeof(modes[1]), "%06o", our_mode);
    ++	}
    ++
    ++	if (their_blob) {
    ++		oid_to_hex_r(oids[2], their_blob);
    ++		xsnprintf(modes[2], sizeof(modes[2]), "%06o", their_mode);
    ++	}
     +
     +	return run_command_v_opt(arguments, 0);
     +}
     +
     +static int merge_entry(struct index_state *istate, int quiet, int pos,
    -+		       const char *path, merge_cb cb, void *data)
    ++		       const char *path, merge_fn fn, void *data)
     +{
     +	int found = 0;
     +	const struct object_id *oids[3] = {NULL};
    @@ -179,7 +183,7 @@
     +	if (!found)
     +		return error(_("%s is not in the cache"), path);
     +
    -+	if (cb(oids[0], oids[1], oids[2], path, modes[0], modes[1], modes[2], data)) {
    ++	if (fn(oids[0], oids[1], oids[2], path, modes[0], modes[1], modes[2], data)) {
     +		if (!quiet)
     +			error(_("Merge program failed"));
     +		return -2;
    @@ -188,8 +192,8 @@
     +	return found;
     +}
     +
    -+int merge_one_path(struct index_state *istate, int oneshot, int quiet,
    -+		   const char *path, merge_cb cb, void *data)
    ++int merge_index_path(struct index_state *istate, int oneshot, int quiet,
    ++		     const char *path, merge_fn fn, void *data)
     +{
     +	int pos = index_name_pos(istate, path, strlen(path)), ret;
     +
    @@ -198,7 +202,7 @@
     +	 * already merged and there is nothing to do.
     +	 */
     +	if (pos < 0) {
    -+		ret = merge_entry(istate, quiet, -pos - 1, path, cb, data);
    ++		ret = merge_entry(istate, quiet, -pos - 1, path, fn, data);
     +		if (ret == -1)
     +			return -1;
     +		else if (ret == -2)
    @@ -207,8 +211,8 @@
     +	return 0;
     +}
     +
    -+int merge_all(struct index_state *istate, int oneshot, int quiet,
    -+	      merge_cb cb, void *data)
    ++int merge_all_index(struct index_state *istate, int oneshot, int quiet,
    ++		    merge_fn fn, void *data)
     +{
     +	int err = 0, i, ret;
     +	for (i = 0; i < istate->cache_nr; i++) {
    @@ -216,7 +220,7 @@
     +		if (!ce_stage(ce))
     +			continue;
     +
    -+		ret = merge_entry(istate, quiet, i, ce->name, cb, data);
    ++		ret = merge_entry(istate, quiet, i, ce->name, fn, data);
     +		if (ret > 0)
     +			i += ret - 1;
     +		else if (ret == -1)
    @@ -236,24 +240,24 @@
      --- a/merge-strategies.h
      +++ b/merge-strategies.h
     @@
    - 			      unsigned int orig_mode, unsigned int our_mode,
    - 			      unsigned int their_mode);
    + 		    const struct object_id *their_blob, const char *path,
    + 		    unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode);
      
    -+typedef int (*merge_cb)(const struct object_id *orig_blob,
    ++typedef int (*merge_fn)(const struct object_id *orig_blob,
     +			const struct object_id *our_blob,
     +			const struct object_id *their_blob, const char *path,
     +			unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
     +			void *data);
     +
    -+int merge_program_cb(const struct object_id *orig_blob,
    -+		     const struct object_id *our_blob,
    -+		     const struct object_id *their_blob, const char *path,
    -+		     unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
    -+		     void *data);
    ++int merge_one_file_spawn(const struct object_id *orig_blob,
    ++			 const struct object_id *our_blob,
    ++			 const struct object_id *their_blob, const char *path,
    ++			 unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
    ++			 void *data);
     +
    -+int merge_one_path(struct index_state *istate, int oneshot, int quiet,
    -+		   const char *path, merge_cb cb, void *data);
    -+int merge_all(struct index_state *istate, int oneshot, int quiet,
    -+	      merge_cb cb, void *data);
    ++int merge_index_path(struct index_state *istate, int oneshot, int quiet,
    ++		     const char *path, merge_fn fn, void *data);
    ++int merge_all_index(struct index_state *istate, int oneshot, int quiet,
    ++		    merge_fn fn, void *data);
     +
      #endif /* MERGE_STRATEGIES_H */
 4:  c0bc05406d !  5:  a0e6cebe89 merge-index: don't fork if the requested program is `git-merge-one-file'
    @@ -3,8 +3,8 @@
         merge-index: don't fork if the requested program is `git-merge-one-file'
     
         Since `git-merge-one-file' has been rewritten and libified, this teaches
    -    `merge-index' to call merge_strategies_one_file() without forking using
    -    a new callback, merge_one_file_cb().
    +    `merge-index' to call merge_three_way() without forking using a new
    +    callback, merge_one_file_func().
     
         Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
     
    @@ -22,7 +22,7 @@
      	int i, force_file = 0, err = 0, one_shot = 0, quiet = 0;
      	const char *pgm;
     +	void *data;
    -+	merge_cb merge_action;
    ++	merge_fn merge_action;
     +	struct lock_file lock = LOCK_INIT;
      
      	/* Without this we cannot rely on waitpid() to tell
    @@ -34,13 +34,13 @@
     +
      	pgm = argv[i++];
     +	if (!strcmp(pgm, "git-merge-one-file")) {
    -+		merge_action = merge_one_file_cb;
    ++		merge_action = merge_one_file_func;
     +		data = (void *)the_repository;
     +
     +		setup_work_tree();
     +		hold_locked_index(&lock, LOCK_DIE_ON_ERROR);
     +	} else {
    -+		merge_action = merge_program_cb;
    ++		merge_action = merge_one_file_spawn;
     +		data = (void *)pgm;
     +	}
     +
    @@ -50,19 +50,19 @@
     @@
      			}
      			if (!strcmp(arg, "-a")) {
    - 				err |= merge_all(&the_index, one_shot, quiet,
    --						 merge_program_cb, (void *)pgm);
    -+						 merge_action, data);
    + 				err |= merge_all_index(&the_index, one_shot, quiet,
    +-						       merge_one_file_spawn, (void *)pgm);
    ++						       merge_action, data);
      				continue;
      			}
      			die("git merge-index: unknown option %s", arg);
      		}
    - 		err |= merge_one_path(&the_index, one_shot, quiet, arg,
    --				      merge_program_cb, (void *)pgm);
    -+				      merge_action, data);
    + 		err |= merge_index_path(&the_index, one_shot, quiet, arg,
    +-					merge_one_file_spawn, (void *)pgm);
    ++					merge_action, data);
     +	}
     +
    -+	if (merge_action == merge_one_file_cb) {
    ++	if (merge_action == merge_one_file_func) {
     +		if (err) {
     +			rollback_lock_file(&lock);
     +			return err;
    @@ -80,20 +80,20 @@
      	return 0;
      }
      
    -+int merge_one_file_cb(const struct object_id *orig_blob,
    -+		      const struct object_id *our_blob,
    -+		      const struct object_id *their_blob, const char *path,
    -+		      unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
    -+		      void *data)
    ++int merge_one_file_func(const struct object_id *orig_blob,
    ++			const struct object_id *our_blob,
    ++			const struct object_id *their_blob, const char *path,
    ++			unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
    ++			void *data)
     +{
    -+	return merge_strategies_one_file((struct repository *)data,
    -+					 orig_blob, our_blob, their_blob, path,
    -+					 orig_mode, our_mode, their_mode);
    ++	return merge_three_way((struct repository *)data,
    ++			       orig_blob, our_blob, their_blob, path,
    ++			       orig_mode, our_mode, their_mode);
     +}
     +
    - int merge_program_cb(const struct object_id *orig_blob,
    - 		     const struct object_id *our_blob,
    - 		     const struct object_id *their_blob, const char *path,
    + int merge_one_file_spawn(const struct object_id *orig_blob,
    + 			 const struct object_id *our_blob,
    + 			 const struct object_id *their_blob, const char *path,
     
      diff --git a/merge-strategies.h b/merge-strategies.h
      --- a/merge-strategies.h
    @@ -102,12 +102,12 @@
      			unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
      			void *data);
      
    -+int merge_one_file_cb(const struct object_id *orig_blob,
    -+		      const struct object_id *our_blob,
    -+		      const struct object_id *their_blob, const char *path,
    -+		      unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
    -+		      void *data);
    ++int merge_one_file_func(const struct object_id *orig_blob,
    ++			const struct object_id *our_blob,
    ++			const struct object_id *their_blob, const char *path,
    ++			unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
    ++			void *data);
     +
    - int merge_program_cb(const struct object_id *orig_blob,
    - 		     const struct object_id *our_blob,
    - 		     const struct object_id *their_blob, const char *path,
    + int merge_one_file_spawn(const struct object_id *orig_blob,
    + 			 const struct object_id *our_blob,
    + 			 const struct object_id *their_blob, const char *path,
 5:  cbfe192982 !  6:  94fbc7e286 merge-resolve: rewrite in C
    @@ -17,12 +17,10 @@
            write_index_as_tree().
     
          - The call to `merge-index', needed to invoke `git merge-one-file', is
    -       replaced by a call to the new merge_all() function.  A callback
    -       function, merge_one_file_cb(), is added to allow it to call
    -       merge_one_file() without forking.
    +       replaced by a call to the new merge_all_index() function.
     
    -    Here too, the index is read in cmd_merge_resolve(), but
    -    merge_strategies_resolve() takes care of writing it back to the disk.
    +    The index is read in cmd_merge_resolve(), and is written back by
    +    merge_strategies_resolve().
     
         The parameters of merge_strategies_resolve() will be surprising at first
         glance: why using a commit list for `bases' and `remote', where we could
    @@ -83,6 +81,7 @@
     + * Resolve two trees, using enhanced multi-base read-tree.
     + */
     +
    ++#define USE_THE_INDEX_COMPATIBILITY_MACROS
     +#include "cache.h"
     +#include "builtin.h"
     +#include "merge-strategies.h"
    @@ -92,7 +91,7 @@
     +
     +int cmd_merge_resolve(int argc, const char **argv, const char *prefix)
     +{
    -+	int i, is_baseless = 1, sep_seen = 0;
    ++	int i, sep_seen = 0;
     +	const char *head = NULL;
     +	struct commit_list *bases = NULL, *remote = NULL;
     +	struct commit_list **next_base = &bases;
    @@ -101,42 +100,45 @@
     +		usage(builtin_merge_resolve_usage);
     +
     +	setup_work_tree();
    -+	if (repo_read_index(the_repository) < 0)
    ++	if (read_cache() < 0)
     +		die("invalid index");
     +
    -+	/* The first parameters up to -- are merge bases; the rest are
    -+	 * heads. */
    ++	/*
    ++	 * The first parameters up to -- are merge bases; the rest are
    ++	 * heads.
    ++	 */
     +	for (i = 1; i < argc; i++) {
    -+		if (strcmp(argv[i], "--") == 0)
    ++		if (!strcmp(argv[i], "--"))
     +			sep_seen = 1;
    -+		else if (strcmp(argv[i], "-h") == 0)
    ++		else if (!strcmp(argv[i], "-h"))
     +			usage(builtin_merge_resolve_usage);
     +		else if (sep_seen && !head)
     +			head = argv[i];
    -+		else if (remote) {
    -+			/* Give up if we are given two or more remotes.
    -+			 * Not handling octopus. */
    -+			return 2;
    -+		} else {
    ++		else {
     +			struct object_id oid;
    ++			struct commit *commit;
     +
    -+			get_oid(argv[i], &oid);
    -+			is_baseless &= sep_seen;
    ++			if (get_oid(argv[i], &oid))
    ++				die("object %s not found.", argv[i]);
     +
    -+			if (!oideq(&oid, the_hash_algo->empty_tree)) {
    -+				struct commit *commit;
    -+				commit = lookup_commit_or_die(&oid, argv[i]);
    ++			commit = lookup_commit_or_die(&oid, argv[i]);
     +
    -+				if (sep_seen)
    -+					commit_list_append(commit, &remote);
    -+				else
    -+					next_base = commit_list_append(commit, next_base);
    -+			}
    ++			if (sep_seen)
    ++				commit_list_insert(commit, &remote);
    ++			else
    ++				next_base = commit_list_append(commit, next_base);
     +		}
     +	}
     +
    ++	/*
    ++	 * Give up if we are given two or more remotes.  Not handling
    ++	 * octopus.
    ++	 */
    ++	if (remote && remote->next)
    ++		return 2;
    ++
     +	/* Give up if this is a baseless merge. */
    -+	if (is_baseless)
    ++	if (!bases)
     +		return 2;
     +
     +	return merge_strategies_resolve(the_repository, bases, head, remote);
    @@ -221,14 +223,13 @@
      #include "cache.h"
     +#include "cache-tree.h"
      #include "dir.h"
    - #include "ll-merge.h"
     +#include "lockfile.h"
      #include "merge-strategies.h"
      #include "run-command.h"
     +#include "unpack-trees.h"
      #include "xdiff-interface.h"
      
    - static int add_to_index_cacheinfo(struct index_state *istate,
    + static int checkout_from_index(struct index_state *istate, const char *path,
     @@
      
      	return err;
    @@ -303,7 +304,7 @@
     +
     +		puts(_("Simple merge failed, trying Automatic merge."));
     +		repo_hold_locked_index(r, &lock, LOCK_DIE_ON_ERROR);
    -+		ret = merge_all(r->index, 0, 0, merge_one_file_cb, r);
    ++		ret = merge_all_index(r->index, 0, 0, merge_one_file_func, r);
     +
     +		write_locked_index(r->index, &lock, COMMIT_LOCK);
     +		return !!ret;
    @@ -326,10 +327,10 @@
     +#include "commit.h"
      #include "object.h"
      
    - int merge_strategies_one_file(struct repository *r,
    + int merge_three_way(struct repository *r,
     @@
    - int merge_all(struct index_state *istate, int oneshot, int quiet,
    - 	      merge_cb cb, void *data);
    + int merge_all_index(struct index_state *istate, int oneshot, int quiet,
    + 		    merge_fn fn, void *data);
      
     +int merge_strategies_resolve(struct repository *r,
     +			     struct commit_list *bases, const char *head_arg,
 6:  35e386f626 !  7:  b582b7e5d1 merge-recursive: move better_branch_name() to merge.c
    @@ -2,8 +2,8 @@
     
         merge-recursive: move better_branch_name() to merge.c
     
    -    get_better_branch_name() will be used by rebase-octopus once it is
    -    rewritten in C, so instead of duplicating it, this moves this function
    +    better_branch_name() will be used by merge-octopus once it is rewritten
    +    in C, so instead of duplicating it, this moves this function
         preventively inside an appropriate file in libgit.a.  This function is
         also renamed to reflect its usage by merge strategies.
     
 7:  41eb0f7199 !  8:  d1936645d5 merge-octopus: rewrite in C
    @@ -13,11 +13,10 @@
            write_index_as_tree().
     
          - The call to `diff-index ...' is replaced by a call to
    -       repo_index_has_changes(), and is moved from cmd_merge_octopus() to
    -       merge_octopus().
    +       repo_index_has_changes().
     
          - The call to `merge-index', needed to invoke `git merge-one-file', is
    -       replaced by a call to merge_all().
    +       replaced by a call to merge_all_index().
     
         The index is read in cmd_merge_octopus(), and is written back by
         merge_strategies_octopus().
    @@ -75,6 +74,7 @@
     + * Resolve two or more trees.
     + */
     +
    ++#define USE_THE_INDEX_COMPATIBILITY_MACROS
     +#include "cache.h"
     +#include "builtin.h"
     +#include "commit.h"
    @@ -94,8 +94,8 @@
     +		usage(builtin_merge_octopus_usage);
     +
     +	setup_work_tree();
    -+	if (repo_read_index(the_repository) < 0)
    -+		die("corrupted cache");
    ++	if (read_cache() < 0)
    ++		die("invalid index");
     +
     +	/*
     +	 * The first parameters up to -- are merge bases; the rest are
    @@ -110,18 +110,17 @@
     +			head_arg = argv[i];
     +		else {
     +			struct object_id oid;
    ++			struct commit *commit;
     +
    -+			get_oid(argv[i], &oid);
    ++			if (get_oid(argv[i], &oid))
    ++				die("object %s not found.", argv[i]);
     +
    -+			if (!oideq(&oid, the_hash_algo->empty_tree)) {
    -+				struct commit *commit;
    -+				commit = lookup_commit_or_die(&oid, argv[i]);
    ++			commit = lookup_commit_or_die(&oid, argv[i]);
     +
    -+				if (sep_seen)
    -+					next_remote = commit_list_append(commit, next_remote);
    -+				else
    -+					next_base = commit_list_append(commit, next_base);
    -+			}
    ++			if (sep_seen)
    ++				next_remote = commit_list_append(commit, next_remote);
    ++			else
    ++				next_base = commit_list_append(commit, next_base);
     +		}
     +	}
     +
    @@ -273,8 +272,8 @@
      #include "cache-tree.h"
     +#include "commit-reach.h"
      #include "dir.h"
    - #include "ll-merge.h"
      #include "lockfile.h"
    + #include "merge-strategies.h"
     @@
      	rollback_lock_file(&lock);
      	return 2;
    @@ -463,7 +462,7 @@
     +
     +				puts(_("Simple merge did not work, trying automatic merge."));
     +				repo_hold_locked_index(r, &lock, LOCK_DIE_ON_ERROR);
    -+				ret = !!merge_all(r->index, 0, 0, merge_one_file_cb, r);
    ++				ret = !!merge_all_index(r->index, 0, 0, merge_one_file_func, r);
     +				write_locked_index(r->index, &lock, COMMIT_LOCK);
     +
     +				write_tree(r, &next);
 8:  8f6c1ac057 =  9:  26b1a3979c merge: use the "resolve" strategy without forking
 9:  b1125261d1 = 10:  23bc9824df merge: use the "octopus" strategy without forking
10:  8d0932fd02 = 11:  3a340f5984 sequencer: use the "resolve" strategy without forking
11:  e304723957 = 12:  ce3723cf34 sequencer: use the "octopus" merge strategy without forking
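
The callback mechanism shown in the range-diff for patch 4 (a function
pointer plus an opaque `void *data' argument) can be illustrated with a
much smaller sketch.  Everything below is hypothetical and only mirrors
the shape of merge_fn and merge_all_index(): `struct entry', `entry_fn',
for_each_unmerged() and the two callbacks are made-up names, and unlike
the real merge_entry() this sketch calls the callback once per entry
instead of grouping the stages of a path.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified analogue of merge_fn. */
typedef int (*entry_fn)(const char *path, unsigned int mode, void *data);

/* Hypothetical stand-in for an index entry. */
struct entry {
	const char *path;
	int stage;		/* 0 means "already merged" */
	unsigned int mode;
};

/*
 * Walk the entries and hand every unmerged one to the callback.
 * Returns 1 if any callback reported a failure, 0 otherwise.
 */
static int for_each_unmerged(const struct entry *entries, size_t nr,
			     entry_fn fn, void *data)
{
	int err = 0;
	size_t i;

	for (i = 0; i < nr; i++) {
		if (!entries[i].stage)
			continue;
		if (fn(entries[i].path, entries[i].mode, data))
			err = 1;
	}
	return err;
}

/* Callback using `data' as a counter of visited entries. */
static int count_cb(const char *path, unsigned int mode, void *data)
{
	int *count = data;

	(void)path;
	(void)mode;
	(*count)++;
	return 0;
}

/* Callback using `data' as the one path that must fail. */
static int fail_on_cb(const char *path, unsigned int mode, void *data)
{
	(void)mode;
	return !strcmp(path, (const char *)data);
}
```

The same walker serves both callbacks because each one interprets
`data' on its own; this is the property that lets `merge-index' pass
either a program name or a repository pointer.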
-- 
2.20.1



* [PATCH v4 01/12] t6027: modernise tests
  2020-11-13 11:04     ` [PATCH v4 00/12] " Alban Gruin
@ 2020-11-13 11:04       ` Alban Gruin
  2020-11-13 11:04       ` [PATCH v4 02/12] update-index: move add_cacheinfo() to read-cache.c Alban Gruin
                         ` (11 subsequent siblings)
  12 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2020-11-13 11:04 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Phillip Wood, Alban Gruin

Some tests in t6027 use an if/then/else to check whether a command
failed, but nowadays we have the `test_must_fail' function to do this
correctly for us.

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 t/t6407-merge-binary.sh | 27 ++++++---------------------
 1 file changed, 6 insertions(+), 21 deletions(-)

diff --git a/t/t6407-merge-binary.sh b/t/t6407-merge-binary.sh
index 4e6c7cb77e..071d3f7343 100755
--- a/t/t6407-merge-binary.sh
+++ b/t/t6407-merge-binary.sh
@@ -5,7 +5,6 @@ test_description='ask merge-recursive to merge binary files'
 . ./test-lib.sh
 
 test_expect_success setup '
-
 	cat "$TEST_DIRECTORY"/test-binary-1.png >m &&
 	git add m &&
 	git ls-files -s | sed -e "s/ 0	/ 1	/" >E1 &&
@@ -35,33 +34,19 @@ test_expect_success setup '
 '
 
 test_expect_success resolve '
-
 	rm -f a* m* &&
 	git reset --hard anchor &&
-
-	if git merge -s resolve master
-	then
-		echo Oops, should not have succeeded
-		false
-	else
-		git ls-files -s >current
-		test_cmp expect current
-	fi
+	test_must_fail git merge -s resolve master &&
+	git ls-files -s >current &&
+	test_cmp expect current
 '
 
 test_expect_success recursive '
-
 	rm -f a* m* &&
 	git reset --hard anchor &&
-
-	if git merge -s recursive master
-	then
-		echo Oops, should not have succeeded
-		false
-	else
-		git ls-files -s >current
-		test_cmp expect current
-	fi
+	test_must_fail git merge -s recursive master &&
+	git ls-files -s >current &&
+	test_cmp expect current
 '
 
 test_done
-- 
2.20.1



* [PATCH v4 02/12] update-index: move add_cacheinfo() to read-cache.c
  2020-11-13 11:04     ` [PATCH v4 00/12] " Alban Gruin
  2020-11-13 11:04       ` [PATCH v4 01/12] t6027: modernise tests Alban Gruin
@ 2020-11-13 11:04       ` Alban Gruin
  2020-11-13 11:04       ` [PATCH v4 03/12] merge-one-file: rewrite in C Alban Gruin
                         ` (10 subsequent siblings)
  12 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2020-11-13 11:04 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Phillip Wood, Alban Gruin

This moves the function add_cacheinfo() that already exists in
update-index.c to read-cache.c, renames it add_to_index_cacheinfo(),
and adds an `istate' parameter.  The new cache entry is returned through
a pointer passed in the parameters.  The return value is either 0
(success), -1 (invalid path), or -2 (failed to add the file to the
index).
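
As a rough sketch of how a caller might consume this three-way return
value, the snippet below mimics the wrapper left behind in
update-index.c.  fake_add_cacheinfo() is a hypothetical stand-in that
only reproduces the 0 / -1 / -2 convention; it does not touch a real
index, and its path check is deliberately naive.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for add_to_index_cacheinfo(): it follows the
 * same return convention but has no index behind it.
 */
static int fake_add_cacheinfo(const char *path, int allow_add)
{
	if (!*path || strstr(path, ".."))
		return -1;	/* invalid path */
	if (!allow_add)
		return -2;	/* the entry could not be added */
	return 0;
}

/* Caller-side mapping, modelled on the add_cacheinfo() wrapper. */
static int report_add(const char *path, int allow_add)
{
	int res = fake_add_cacheinfo(path, allow_add);

	if (res == -1)
		return res;	/* the callee is assumed to have reported it */
	if (res == -2) {
		fprintf(stderr,
			"%s: cannot add to the index - missing --add option?\n",
			path);
		return -1;
	}

	printf("add '%s'\n", path);
	return 0;
}
```

Keeping -1 and -2 distinct lets the callee report an invalid path
itself, while the caller picks the wording for a failed addition.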

This will become useful in the next commit, when the three-way merge
will need to call this function.

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 builtin/update-index.c | 25 +++++++------------------
 cache.h                |  5 +++++
 read-cache.c           | 35 +++++++++++++++++++++++++++++++++++
 3 files changed, 47 insertions(+), 18 deletions(-)

diff --git a/builtin/update-index.c b/builtin/update-index.c
index 79087bccea..44862f5e1d 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -404,27 +404,16 @@ static int process_path(const char *path, struct stat *st, int stat_errno)
 static int add_cacheinfo(unsigned int mode, const struct object_id *oid,
 			 const char *path, int stage)
 {
-	int len, option;
-	struct cache_entry *ce;
+	int res;
 
-	if (!verify_path(path, mode))
-		return error("Invalid path '%s'", path);
-
-	len = strlen(path);
-	ce = make_empty_cache_entry(&the_index, len);
-
-	oidcpy(&ce->oid, oid);
-	memcpy(ce->name, path, len);
-	ce->ce_flags = create_ce_flags(stage);
-	ce->ce_namelen = len;
-	ce->ce_mode = create_ce_mode(mode);
-	if (assume_unchanged)
-		ce->ce_flags |= CE_VALID;
-	option = allow_add ? ADD_CACHE_OK_TO_ADD : 0;
-	option |= allow_replace ? ADD_CACHE_OK_TO_REPLACE : 0;
-	if (add_cache_entry(ce, option))
+	res = add_to_index_cacheinfo(&the_index, mode, oid, path, stage,
+				     allow_add, allow_replace, NULL);
+	if (res == -1)
+		return res;
+	if (res == -2)
 		return error("%s: cannot add to the index - missing --add option?",
 			     path);
+
 	report("add '%s'", path);
 	return 0;
 }
diff --git a/cache.h b/cache.h
index c0072d43b1..be16ab3215 100644
--- a/cache.h
+++ b/cache.h
@@ -830,6 +830,11 @@ int remove_file_from_index(struct index_state *, const char *path);
 int add_to_index(struct index_state *, const char *path, struct stat *, int flags);
 int add_file_to_index(struct index_state *, const char *path, int flags);
 
+int add_to_index_cacheinfo(struct index_state *, unsigned int mode,
+			   const struct object_id *oid, const char *path,
+			   int stage, int allow_add, int allow_replace,
+			   struct cache_entry **pce);
+
 int chmod_index_entry(struct index_state *, struct cache_entry *ce, char flip);
 int ce_same_name(const struct cache_entry *a, const struct cache_entry *b);
 void set_object_name_for_intent_to_add_entry(struct cache_entry *ce);
diff --git a/read-cache.c b/read-cache.c
index ecf6f68994..c25f951db4 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1350,6 +1350,41 @@ int add_index_entry(struct index_state *istate, struct cache_entry *ce, int opti
 	return 0;
 }
 
+int add_to_index_cacheinfo(struct index_state *istate, unsigned int mode,
+			   const struct object_id *oid, const char *path,
+			   int stage, int allow_add, int allow_replace,
+			   struct cache_entry **pce)
+{
+	int len, option;
+	struct cache_entry *ce = NULL;
+
+	if (!verify_path(path, mode))
+		return error(_("Invalid path '%s'"), path);
+
+	len = strlen(path);
+	ce = make_empty_cache_entry(istate, len);
+
+	oidcpy(&ce->oid, oid);
+	memcpy(ce->name, path, len);
+	ce->ce_flags = create_ce_flags(stage);
+	ce->ce_namelen = len;
+	ce->ce_mode = create_ce_mode(mode);
+	if (assume_unchanged)
+		ce->ce_flags |= CE_VALID;
+	option = allow_add ? ADD_CACHE_OK_TO_ADD : 0;
+	option |= allow_replace ? ADD_CACHE_OK_TO_REPLACE : 0;
+
+	if (add_index_entry(istate, ce, option)) {
+		discard_cache_entry(ce);
+		return -2;
+	}
+
+	if (pce)
+		*pce = ce;
+
+	return 0;
+}
+
 /*
  * "refresh" does not calculate a new sha1 file or bring the
  * cache up-to-date for mode/content changes. But what it
-- 
2.20.1



* [PATCH v4 03/12] merge-one-file: rewrite in C
  2020-11-13 11:04     ` [PATCH v4 00/12] " Alban Gruin
  2020-11-13 11:04       ` [PATCH v4 01/12] t6027: modernise tests Alban Gruin
  2020-11-13 11:04       ` [PATCH v4 02/12] update-index: move add_cacheinfo() to read-cache.c Alban Gruin
@ 2020-11-13 11:04       ` Alban Gruin
  2020-11-13 11:04       ` [PATCH v4 04/12] merge-index: libify merge_one_path() and merge_all() Alban Gruin
                         ` (9 subsequent siblings)
  12 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2020-11-13 11:04 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Phillip Wood, Alban Gruin

This rewrites `git merge-one-file' from shell to C.  This port is not
completely straightforward: the calls to external processes are replaced
by calls to functions in libgit.a.  This saves precious cycles by
avoiding reading and flushing the index repeatedly, avoids writing
temporary files when an operation can be performed in-memory, and allows
other functions to use the rewrite without forking or worrying about the
index.  The replacements are as follows:

 - calls to `update-index --add --cacheinfo' are replaced by calls to
   add_to_index_cacheinfo();

 - calls to `update-index --remove' are replaced by calls to
   remove_file_from_index();

 - calls to `checkout-index -u -f' are replaced by calls to
   checkout_entry();

 - calls to `unpack-file' and `merge-files' are replaced by calls to
   read_mmblob() and xdl_merge(), respectively, to merge files
   in-memory;

 - calls to `checkout-index -f --stage=2' are removed: the script needed
   them to get the correct permission bits on the merged file, but the
   C version does not;

 - calls to `update-index' are replaced by calls to add_file_to_index().

The bulk of the rewrite is done in a new file in libgit.a,
merge-strategies.c.  This will enable the resolve and octopus strategies
to directly call it instead of forking.
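
As an illustration only, the per-file case analysis performed by
merge_three_way() can be sketched in standalone C.  The sketch mirrors
the conditions as they appear in this patch, but represents object names
as plain strings (NULL meaning "missing") and leaves out the mode
checks; `enum merge_case', same_blob(), classify() and
classify_examples() are hypothetical names, not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical labels for the cases; the patch does not define these. */
enum merge_case {
	CASE_DELETED_IN_ONE,	/* deleted in one branch, unchanged in the other */
	CASE_ADDED_BY_US,	/* only our branch added the path */
	CASE_ADDED_BY_THEM,	/* only their branch added the path */
	CASE_ADDED_IDENTICALLY,	/* both branches added the same blob */
	CASE_CONTENT_MERGE,	/* both sides have a blob and they differ */
	CASE_UNHANDLED
};

/* Compare two optional object names; NULL stands for "missing". */
static int same_blob(const char *a, const char *b)
{
	return a && b && !strcmp(a, b);
}

/* Same order of checks as merge_three_way(), without the mode checks. */
static enum merge_case classify(const char *orig, const char *ours,
				const char *theirs)
{
	if (orig &&
	    ((!theirs && same_blob(orig, ours)) ||
	     (!ours && same_blob(orig, theirs))))
		return CASE_DELETED_IN_ONE;
	if (!orig && ours && !theirs)
		return CASE_ADDED_BY_US;
	if (!orig && !ours && theirs)
		return CASE_ADDED_BY_THEM;
	if (!orig && same_blob(ours, theirs))
		return CASE_ADDED_IDENTICALLY;
	if (ours && theirs)
		return CASE_CONTENT_MERGE;
	return CASE_UNHANDLED;
}

/* A few example inputs, with made-up object names. */
static void classify_examples(void)
{
	assert(classify("aaa", "aaa", NULL) == CASE_DELETED_IN_ONE);
	assert(classify(NULL, "bbb", NULL) == CASE_ADDED_BY_US);
	assert(classify(NULL, NULL, "ccc") == CASE_ADDED_BY_THEM);
	assert(classify(NULL, "bbb", "bbb") == CASE_ADDED_IDENTICALLY);
	assert(classify("aaa", "bbb", "ccc") == CASE_CONTENT_MERGE);
}
```

Read literally, these conditions send a path deleted in both branches
(only an original blob) to the unhandled case, whereas the `"$1.."'
pattern of the shell script treated it as a deletion.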

This also fixes a bug present in the original script: when a file
exists in the branch to merge but not in our branch, the script only
checked whether a _regular_ file exists at that path, whereas the
rewritten version checks whether a file of any kind (i.e. a directory,
...) exists.  This fixes the test `do not lose untracked in merge
(resolve)' in t6415, where the branch to merge had a new file, `a/b',
but our branch had a directory there; the merge should have failed
because a directory exists, but it did not because there was no regular
file called `a/b'.  This test is now marked as successful.
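
The difference between the two checks can be sketched with plain POSIX
calls.  This is only an illustration: is_regular_file() and
path_exists() are hypothetical helpers standing in for the shell's
`test -f' and for an lstat()-based existence check, respectively; they
are not functions added by this patch.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <sys/stat.h>

/* Like the shell's `test -f': true only if path names a regular file. */
static int is_regular_file(const char *path)
{
	struct stat st;

	assert(path);
	return !stat(path, &st) && S_ISREG(st.st_mode);
}

/* True for anything present at path: file, directory, symlink, ... */
static int path_exists(const char *path)
{
	struct stat st;

	assert(path);
	return !lstat(path, &st);
}
```

For a directory, path_exists() is true while is_regular_file() is false,
which is the situation the old check failed to detect.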

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 Makefile                        |   3 +-
 builtin.h                       |   1 +
 builtin/merge-one-file.c        |  94 +++++++++++++++++
 git-merge-one-file.sh           | 167 ------------------------------
 git.c                           |   1 +
 merge-strategies.c              | 173 ++++++++++++++++++++++++++++++++
 merge-strategies.h              |  12 +++
 t/t6415-merge-dir-to-symlink.sh |   2 +-
 8 files changed, 284 insertions(+), 169 deletions(-)
 create mode 100644 builtin/merge-one-file.c
 delete mode 100755 git-merge-one-file.sh
 create mode 100644 merge-strategies.c
 create mode 100644 merge-strategies.h

diff --git a/Makefile b/Makefile
index de53954590..6dfdb33cb2 100644
--- a/Makefile
+++ b/Makefile
@@ -601,7 +601,6 @@ SCRIPT_SH += git-bisect.sh
 SCRIPT_SH += git-difftool--helper.sh
 SCRIPT_SH += git-filter-branch.sh
 SCRIPT_SH += git-merge-octopus.sh
-SCRIPT_SH += git-merge-one-file.sh
 SCRIPT_SH += git-merge-resolve.sh
 SCRIPT_SH += git-mergetool.sh
 SCRIPT_SH += git-quiltimport.sh
@@ -909,6 +908,7 @@ LIB_OBJS += match-trees.o
 LIB_OBJS += mem-pool.o
 LIB_OBJS += merge-blobs.o
 LIB_OBJS += merge-recursive.o
+LIB_OBJS += merge-strategies.o
 LIB_OBJS += merge.o
 LIB_OBJS += mergesort.o
 LIB_OBJS += midx.o
@@ -1094,6 +1094,7 @@ BUILTIN_OBJS += builtin/mailsplit.o
 BUILTIN_OBJS += builtin/merge-base.o
 BUILTIN_OBJS += builtin/merge-file.o
 BUILTIN_OBJS += builtin/merge-index.o
+BUILTIN_OBJS += builtin/merge-one-file.o
 BUILTIN_OBJS += builtin/merge-ours.o
 BUILTIN_OBJS += builtin/merge-recursive.o
 BUILTIN_OBJS += builtin/merge-tree.o
diff --git a/builtin.h b/builtin.h
index 53fb290963..4d2cd78856 100644
--- a/builtin.h
+++ b/builtin.h
@@ -178,6 +178,7 @@ int cmd_merge_base(int argc, const char **argv, const char *prefix);
 int cmd_merge_index(int argc, const char **argv, const char *prefix);
 int cmd_merge_ours(int argc, const char **argv, const char *prefix);
 int cmd_merge_file(int argc, const char **argv, const char *prefix);
+int cmd_merge_one_file(int argc, const char **argv, const char *prefix);
 int cmd_merge_recursive(int argc, const char **argv, const char *prefix);
 int cmd_merge_tree(int argc, const char **argv, const char *prefix);
 int cmd_mktag(int argc, const char **argv, const char *prefix);
diff --git a/builtin/merge-one-file.c b/builtin/merge-one-file.c
new file mode 100644
index 0000000000..9c21778e1d
--- /dev/null
+++ b/builtin/merge-one-file.c
@@ -0,0 +1,94 @@
+/*
+ * Builtin "git merge-one-file"
+ *
+ * Copyright (c) 2020 Alban Gruin
+ *
+ * Based on git-merge-one-file.sh, written by Linus Torvalds.
+ *
+ * This is the git per-file merge utility, called with
+ *
+ *   argv[1] - original file object name (or empty)
+ *   argv[2] - file in branch1 object name (or empty)
+ *   argv[3] - file in branch2 object name (or empty)
+ *   argv[4] - pathname in repository
+ *   argv[5] - original file mode (or empty)
+ *   argv[6] - file in branch1 mode (or empty)
+ *   argv[7] - file in branch2 mode (or empty)
+ *
+ * Handle some trivial cases. The _really_ trivial cases have been
+ * handled already by git read-tree, but that one doesn't do any merges
+ * that might change the tree layout.
+ */
+
+#define USE_THE_INDEX_COMPATIBILITY_MACROS
+#include "cache.h"
+#include "builtin.h"
+#include "lockfile.h"
+#include "merge-strategies.h"
+
+static const char builtin_merge_one_file_usage[] =
+	"git merge-one-file <orig blob> <our blob> <their blob> <path> "
+	"<orig mode> <our mode> <their mode>\n\n"
+	"Blob ids and modes should be empty for missing files.";
+
+static int read_mode(const char *name, const char *arg, unsigned int *mode)
+{
+	char *last;
+	int ret = 0;
+
+	*mode = strtol(arg, &last, 8);
+
+	if (*last)
+		ret = error(_("invalid '%s' mode: expected nothing, got '%c'"), name, *last);
+	else if (!(S_ISREG(*mode) || S_ISDIR(*mode) || S_ISLNK(*mode)))
+		ret = error(_("invalid '%s' mode: %o"), name, *mode);
+
+	return ret;
+}
+
+int cmd_merge_one_file(int argc, const char **argv, const char *prefix)
+{
+	struct object_id orig_blob, our_blob, their_blob,
+		*p_orig_blob = NULL, *p_our_blob = NULL, *p_their_blob = NULL;
+	unsigned int orig_mode = 0, our_mode = 0, their_mode = 0, ret = 0;
+	struct lock_file lock = LOCK_INIT;
+
+	if (argc != 8)
+		usage(builtin_merge_one_file_usage);
+
+	if (read_cache() < 0)
+		die("invalid index");
+
+	hold_locked_index(&lock, LOCK_DIE_ON_ERROR);
+
+	if (!get_oid_hex(argv[1], &orig_blob)) {
+		p_orig_blob = &orig_blob;
+		ret = read_mode("orig", argv[5], &orig_mode);
+	} else if (!*argv[1] && *argv[5])
+		ret = error(_("no 'orig' object id given, but a mode was still given."));
+
+	if (!get_oid_hex(argv[2], &our_blob)) {
+		p_our_blob = &our_blob;
+		ret = read_mode("our", argv[6], &our_mode);
+	} else if (!*argv[2] && *argv[6])
+		ret = error(_("no 'our' object id given, but a mode was still given."));
+
+	if (!get_oid_hex(argv[3], &their_blob)) {
+		p_their_blob = &their_blob;
+		ret = read_mode("their", argv[7], &their_mode);
+	} else if (!*argv[3] && *argv[7])
+		ret = error(_("no 'their' object id given, but a mode was still given."));
+
+	if (ret)
+		return ret;
+
+	ret = merge_three_way(the_repository, p_orig_blob, p_our_blob, p_their_blob,
+			      argv[4], orig_mode, our_mode, their_mode);
+
+	if (ret) {
+		rollback_lock_file(&lock);
+		return !!ret;
+	}
+
+	return write_locked_index(&the_index, &lock, COMMIT_LOCK);
+}
diff --git a/git-merge-one-file.sh b/git-merge-one-file.sh
deleted file mode 100755
index f6d9852d2f..0000000000
--- a/git-merge-one-file.sh
+++ /dev/null
@@ -1,167 +0,0 @@
-#!/bin/sh
-#
-# Copyright (c) Linus Torvalds, 2005
-#
-# This is the git per-file merge script, called with
-#
-#   $1 - original file SHA1 (or empty)
-#   $2 - file in branch1 SHA1 (or empty)
-#   $3 - file in branch2 SHA1 (or empty)
-#   $4 - pathname in repository
-#   $5 - original file mode (or empty)
-#   $6 - file in branch1 mode (or empty)
-#   $7 - file in branch2 mode (or empty)
-#
-# Handle some trivial cases.. The _really_ trivial cases have
-# been handled already by git read-tree, but that one doesn't
-# do any merges that might change the tree layout.
-
-USAGE='<orig blob> <our blob> <their blob> <path>'
-USAGE="$USAGE <orig mode> <our mode> <their mode>"
-LONG_USAGE="usage: git merge-one-file $USAGE
-
-Blob ids and modes should be empty for missing files."
-
-SUBDIRECTORY_OK=Yes
-. git-sh-setup
-cd_to_toplevel
-require_work_tree
-
-if test $# != 7
-then
-	echo "$LONG_USAGE"
-	exit 1
-fi
-
-case "${1:-.}${2:-.}${3:-.}" in
-#
-# Deleted in both or deleted in one and unchanged in the other
-#
-"$1.." | "$1.$1" | "$1$1.")
-	if { test -z "$6" && test "$5" != "$7"; } ||
-	   { test -z "$7" && test "$5" != "$6"; }
-	then
-		echo "ERROR: File $4 deleted on one branch but had its" >&2
-		echo "ERROR: permissions changed on the other." >&2
-		exit 1
-	fi
-
-	if test -n "$2"
-	then
-		echo "Removing $4"
-	else
-		# read-tree checked that index matches HEAD already,
-		# so we know we do not have this path tracked.
-		# there may be an unrelated working tree file here,
-		# which we should just leave unmolested.  Make sure
-		# we do not have it in the index, though.
-		exec git update-index --remove -- "$4"
-	fi
-	if test -f "$4"
-	then
-		rm -f -- "$4" &&
-		rmdir -p "$(expr "z$4" : 'z\(.*\)/')" 2>/dev/null || :
-	fi &&
-		exec git update-index --remove -- "$4"
-	;;
-
-#
-# Added in one.
-#
-".$2.")
-	# the other side did not add and we added so there is nothing
-	# to be done, except making the path merged.
-	exec git update-index --add --cacheinfo "$6" "$2" "$4"
-	;;
-"..$3")
-	echo "Adding $4"
-	if test -f "$4"
-	then
-		echo "ERROR: untracked $4 is overwritten by the merge." >&2
-		exit 1
-	fi
-	git update-index --add --cacheinfo "$7" "$3" "$4" &&
-		exec git checkout-index -u -f -- "$4"
-	;;
-
-#
-# Added in both, identically (check for same permissions).
-#
-".$3$2")
-	if test "$6" != "$7"
-	then
-		echo "ERROR: File $4 added identically in both branches," >&2
-		echo "ERROR: but permissions conflict $6->$7." >&2
-		exit 1
-	fi
-	echo "Adding $4"
-	git update-index --add --cacheinfo "$6" "$2" "$4" &&
-		exec git checkout-index -u -f -- "$4"
-	;;
-
-#
-# Modified in both, but differently.
-#
-"$1$2$3" | ".$2$3")
-
-	case ",$6,$7," in
-	*,120000,*)
-		echo "ERROR: $4: Not merging symbolic link changes." >&2
-		exit 1
-		;;
-	*,160000,*)
-		echo "ERROR: $4: Not merging conflicting submodule changes." >&2
-		exit 1
-		;;
-	esac
-
-	src1=$(git unpack-file $2)
-	src2=$(git unpack-file $3)
-	case "$1" in
-	'')
-		echo "Added $4 in both, but differently."
-		orig=$(git unpack-file $(git hash-object /dev/null))
-		;;
-	*)
-		echo "Auto-merging $4"
-		orig=$(git unpack-file $1)
-		;;
-	esac
-
-	git merge-file "$src1" "$orig" "$src2"
-	ret=$?
-	msg=
-	if test $ret != 0 || test -z "$1"
-	then
-		msg='content conflict'
-		ret=1
-	fi
-
-	# Create the working tree file, using "our tree" version from the
-	# index, and then store the result of the merge.
-	git checkout-index -f --stage=2 -- "$4" && cat "$src1" >"$4" || exit 1
-	rm -f -- "$orig" "$src1" "$src2"
-
-	if test "$6" != "$7"
-	then
-		if test -n "$msg"
-		then
-			msg="$msg, "
-		fi
-		msg="${msg}permissions conflict: $5->$6,$7"
-		ret=1
-	fi
-
-	if test $ret != 0
-	then
-		echo "ERROR: $msg in $4" >&2
-		exit 1
-	fi
-	exec git update-index -- "$4"
-	;;
-
-*)
-	echo "ERROR: $4: Not handling case $1 -> $2 -> $3" >&2
-	;;
-esac
-exit 1
diff --git a/git.c b/git.c
index f1e8b56d99..a4d3f98094 100644
--- a/git.c
+++ b/git.c
@@ -540,6 +540,7 @@ static struct cmd_struct commands[] = {
 	{ "merge-file", cmd_merge_file, RUN_SETUP_GENTLY },
 	{ "merge-index", cmd_merge_index, RUN_SETUP | NO_PARSEOPT },
 	{ "merge-ours", cmd_merge_ours, RUN_SETUP | NO_PARSEOPT },
+	{ "merge-one-file", cmd_merge_one_file, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
 	{ "merge-recursive", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
 	{ "merge-recursive-ours", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
 	{ "merge-recursive-theirs", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
diff --git a/merge-strategies.c b/merge-strategies.c
new file mode 100644
index 0000000000..f5fdb15bbf
--- /dev/null
+++ b/merge-strategies.c
@@ -0,0 +1,173 @@
+#include "cache.h"
+#include "dir.h"
+#include "merge-strategies.h"
+#include "xdiff-interface.h"
+
+static int checkout_from_index(struct index_state *istate, const char *path,
+			       struct cache_entry *ce)
+{
+	struct checkout state = CHECKOUT_INIT;
+
+	state.istate = istate;
+	state.force = 1;
+	state.base_dir = "";
+	state.base_dir_len = 0;
+
+	if (checkout_entry(ce, &state, NULL, NULL) < 0)
+		return error(_("%s: cannot checkout file"), path);
+	return 0;
+}
+
+static int merge_one_file_deleted(struct index_state *istate,
+				  const struct object_id *our_blob,
+				  const struct object_id *their_blob, const char *path,
+				  unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode)
+{
+	if ((our_blob && orig_mode != our_mode) ||
+	    (their_blob && orig_mode != their_mode))
+		return error(_("File %s deleted on one branch but had its "
+			       "permissions changed on the other."), path);
+
+	if (our_blob) {
+		printf(_("Removing %s\n"), path);
+
+		if (file_exists(path))
+			remove_path(path);
+	}
+
+	if (remove_file_from_index(istate, path))
+		return error("%s: cannot remove from the index", path);
+	return 0;
+}
+
+static int do_merge_one_file(struct index_state *istate,
+			     const struct object_id *orig_blob,
+			     const struct object_id *our_blob,
+			     const struct object_id *their_blob, const char *path,
+			     unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode)
+{
+	int ret, i, dest;
+	ssize_t written;
+	mmbuffer_t result = {NULL, 0};
+	mmfile_t mmfs[3];
+	xmparam_t xmp = {{0}};
+
+	if (our_mode == S_IFLNK || their_mode == S_IFLNK)
+		return error(_("%s: Not merging symbolic link changes."), path);
+	else if (our_mode == S_IFGITLINK || their_mode == S_IFGITLINK)
+		return error(_("%s: Not merging conflicting submodule changes."), path);
+	else if (our_mode != their_mode)
+		return error(_("permission conflict: %o->%o,%o in %s"),
+			     orig_mode, our_mode, their_mode, path);
+
+	if (orig_blob) {
+		printf(_("Auto-merging %s\n"), path);
+		read_mmblob(mmfs + 0, orig_blob);
+	} else {
+		printf(_("Added %s in both, but differently.\n"), path);
+		read_mmblob(mmfs + 0, &null_oid);
+	}
+
+	read_mmblob(mmfs + 1, our_blob);
+	read_mmblob(mmfs + 2, their_blob);
+
+	xmp.level = XDL_MERGE_ZEALOUS_ALNUM;
+	xmp.style = 0;
+	xmp.favor = 0;
+
+	ret = xdl_merge(mmfs + 0, mmfs + 1, mmfs + 2, &xmp, &result);
+
+	for (i = 0; i < 3; i++)
+		free(mmfs[i].ptr);
+
+	if (ret < 0) {
+		free(result.ptr);
+		return error(_("Failed to execute internal merge"));
+	} else if (ret > 0 || !orig_blob) {
+		free(result.ptr);
+		return error(_("content conflict in %s"), path);
+	}
+
+	unlink(path);
+	if ((dest = open(path, O_WRONLY | O_CREAT, our_mode)) < 0) {
+		free(result.ptr);
+		return error_errno(_("failed to open file '%s'"), path);
+	}
+
+	written = write_in_full(dest, result.ptr, result.size);
+	close(dest);
+
+	free(result.ptr);
+
+	if (written < 0)
+		return error_errno(_("failed to write to '%s'"), path);
+
+	return add_file_to_index(istate, path, 0);
+}
+
+int merge_three_way(struct repository *r,
+		    const struct object_id *orig_blob,
+		    const struct object_id *our_blob,
+		    const struct object_id *their_blob, const char *path,
+		    unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode)
+{
+	if (orig_blob &&
+	    ((!their_blob && our_blob && oideq(orig_blob, our_blob)) ||
+	     (!our_blob && their_blob && oideq(orig_blob, their_blob)))) {
+		/* Deleted in both or deleted in one and unchanged in the other. */
+		return merge_one_file_deleted(r->index, our_blob, their_blob, path,
+					      orig_mode, our_mode, their_mode);
+	} else if (!orig_blob && our_blob && !their_blob) {
+		/*
+		 * Added in one.  The other side did not add and we
+		 * added so there is nothing to be done, except making
+		 * the path merged.
+		 */
+		return add_to_index_cacheinfo(r->index, our_mode, our_blob, path, 0, 1, 1, NULL);
+	} else if (!orig_blob && !our_blob && their_blob) {
+		struct cache_entry *ce;
+		printf(_("Adding %s\n"), path);
+
+		if (file_exists(path))
+			return error(_("untracked %s is overwritten by the merge."), path);
+
+		if (add_to_index_cacheinfo(r->index, their_mode, their_blob, path, 0, 1, 1, &ce))
+			return -1;
+		return checkout_from_index(r->index, path, ce);
+	} else if (!orig_blob && our_blob && their_blob &&
+		   oideq(our_blob, their_blob)) {
+		struct cache_entry *ce;
+
+		/* Added in both, identically (check for same permissions). */
+		if (our_mode != their_mode)
+			return error(_("File %s added identically in both branches, "
+				       "but permissions conflict %o->%o."),
+				     path, our_mode, their_mode);
+
+		printf(_("Adding %s\n"), path);
+
+		if (add_to_index_cacheinfo(r->index, our_mode, our_blob, path, 0, 1, 1, &ce))
+			return -1;
+		return checkout_from_index(r->index, path, ce);
+	} else if (our_blob && their_blob) {
+		/* Modified in both, but differently. */
+		return do_merge_one_file(r->index,
+					 orig_blob, our_blob, their_blob, path,
+					 orig_mode, our_mode, their_mode);
+	} else {
+		char orig_hex[GIT_MAX_HEXSZ + 1] = {0}, our_hex[GIT_MAX_HEXSZ + 1] = {0},
+			their_hex[GIT_MAX_HEXSZ + 1] = {0};
+
+		if (orig_blob)
+			oid_to_hex_r(orig_hex, orig_blob);
+		if (our_blob)
+			oid_to_hex_r(our_hex, our_blob);
+		if (their_blob)
+			oid_to_hex_r(their_hex, their_blob);
+
+		return error(_("%s: Not handling case %s -> %s -> %s"),
+			     path, orig_hex, our_hex, their_hex);
+	}
+
+	return 0;
+}
diff --git a/merge-strategies.h b/merge-strategies.h
new file mode 100644
index 0000000000..e624c4f27c
--- /dev/null
+++ b/merge-strategies.h
@@ -0,0 +1,12 @@
+#ifndef MERGE_STRATEGIES_H
+#define MERGE_STRATEGIES_H
+
+#include "object.h"
+
+int merge_three_way(struct repository *r,
+		    const struct object_id *orig_blob,
+		    const struct object_id *our_blob,
+		    const struct object_id *their_blob, const char *path,
+		    unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode);
+
+#endif /* MERGE_STRATEGIES_H */
diff --git a/t/t6415-merge-dir-to-symlink.sh b/t/t6415-merge-dir-to-symlink.sh
index 2eddcc7664..5fb74e39a0 100755
--- a/t/t6415-merge-dir-to-symlink.sh
+++ b/t/t6415-merge-dir-to-symlink.sh
@@ -94,7 +94,7 @@ test_expect_success SYMLINKS 'a/b was resolved as symlink' '
 	test -h a/b
 '
 
-test_expect_failure 'do not lose untracked in merge (resolve)' '
+test_expect_success 'do not lose untracked in merge (resolve)' '
 	git reset --hard &&
 	git checkout baseline^0 &&
 	>a/b/c/e &&
-- 
2.20.1



* [PATCH v4 04/12] merge-index: libify merge_one_path() and merge_all()
  2020-11-13 11:04     ` [PATCH v4 00/12] " Alban Gruin
                         ` (2 preceding siblings ...)
  2020-11-13 11:04       ` [PATCH v4 03/12] merge-one-file: rewrite in C Alban Gruin
@ 2020-11-13 11:04       ` Alban Gruin
  2020-11-13 11:04       ` [PATCH v4 05/12] merge-index: don't fork if the requested program is `git-merge-one-file' Alban Gruin
                         ` (8 subsequent siblings)
  12 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2020-11-13 11:04 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Phillip Wood, Alban Gruin

The "resolve" and "octopus" merge strategies do not call `git
merge-one-file' directly; they delegate the work to another git command,
`git merge-index', which loops over the files in the index and calls the
specified command on each of them.  Unfortunately, the functions
implementing this loop are not part of libgit.a, which means that once
rewritten, the strategies would still have to invoke `merge-one-file' by
spawning a new process first.

To avoid this, this patch moves merge_one_path(), merge_all(), and their
helpers to merge-strategies.c, renaming the first two to
merge_index_path() and merge_all_index().  They now also take a callback
to dictate what they should do for each file.  For now, to preserve the
behaviour of `merge-index', only one callback, launching a new process,
is defined.
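
The callback approach can be sketched in standalone C as follows.  This
is a simplified model, not the code of this patch: `struct entry',
per_path_fn, for_each_unmerged() and count_path() are hypothetical
names, and the index is reduced to an array of entries sorted by name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for an index entry. */
struct entry {
	const char *name;	/* path */
	int stage;		/* 0 = merged, 1 = base, 2 = ours, 3 = theirs */
	const char *blob;	/* made-up object name */
};

/* Called once per unmerged path; blobs[] is indexed by stage - 1. */
typedef int (*per_path_fn)(const char *path, const char *blobs[3],
			   void *data);

/*
 * Walk entries sorted by name, group the consecutive unmerged entries
 * sharing a name, and call fn once per group.  Returns the number of
 * paths for which fn reported a failure.
 */
static int for_each_unmerged(const struct entry *entries, size_t nr,
			     per_path_fn fn, void *data)
{
	size_t i = 0;
	int failures = 0;

	assert(fn);
	while (i < nr) {
		const char *path = entries[i].name;
		const char *blobs[3] = { NULL, NULL, NULL };

		if (!entries[i].stage) {
			i++;	/* stage 0: already merged, nothing to do */
			continue;
		}
		for (; i < nr && !strcmp(entries[i].name, path); i++) {
			assert(entries[i].stage >= 1 && entries[i].stage <= 3);
			blobs[entries[i].stage - 1] = entries[i].blob;
		}
		if (fn(path, blobs, data))
			failures++;
	}
	return failures;
}

/* Example callback: count the paths seen through the opaque pointer. */
static int count_path(const char *path, const char *blobs[3], void *data)
{
	(void)path;
	(void)blobs;
	++*(int *)data;
	return 0;
}
```

The opaque `data' pointer lets each callback carry its own context: a
counter here, the program name for a callback spawning a process.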

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 builtin/merge-index.c |  77 +++----------------------------
 merge-strategies.c    | 103 ++++++++++++++++++++++++++++++++++++++++++
 merge-strategies.h    |  17 +++++++
 3 files changed, 127 insertions(+), 70 deletions(-)

diff --git a/builtin/merge-index.c b/builtin/merge-index.c
index 38ea6ad6ca..49e3382fb9 100644
--- a/builtin/merge-index.c
+++ b/builtin/merge-index.c
@@ -1,74 +1,11 @@
 #define USE_THE_INDEX_COMPATIBILITY_MACROS
 #include "builtin.h"
-#include "run-command.h"
-
-static const char *pgm;
-static int one_shot, quiet;
-static int err;
-
-static int merge_entry(int pos, const char *path)
-{
-	int found;
-	const char *arguments[] = { pgm, "", "", "", path, "", "", "", NULL };
-	char hexbuf[4][GIT_MAX_HEXSZ + 1];
-	char ownbuf[4][60];
-
-	if (pos >= active_nr)
-		die("git merge-index: %s not in the cache", path);
-	found = 0;
-	do {
-		const struct cache_entry *ce = active_cache[pos];
-		int stage = ce_stage(ce);
-
-		if (strcmp(ce->name, path))
-			break;
-		found++;
-		oid_to_hex_r(hexbuf[stage], &ce->oid);
-		xsnprintf(ownbuf[stage], sizeof(ownbuf[stage]), "%o", ce->ce_mode);
-		arguments[stage] = hexbuf[stage];
-		arguments[stage + 4] = ownbuf[stage];
-	} while (++pos < active_nr);
-	if (!found)
-		die("git merge-index: %s not in the cache", path);
-
-	if (run_command_v_opt(arguments, 0)) {
-		if (one_shot)
-			err++;
-		else {
-			if (!quiet)
-				die("merge program failed");
-			exit(1);
-		}
-	}
-	return found;
-}
-
-static void merge_one_path(const char *path)
-{
-	int pos = cache_name_pos(path, strlen(path));
-
-	/*
-	 * If it already exists in the cache as stage0, it's
-	 * already merged and there is nothing to do.
-	 */
-	if (pos < 0)
-		merge_entry(-pos-1, path);
-}
-
-static void merge_all(void)
-{
-	int i;
-	for (i = 0; i < active_nr; i++) {
-		const struct cache_entry *ce = active_cache[i];
-		if (!ce_stage(ce))
-			continue;
-		i += merge_entry(i, ce->name)-1;
-	}
-}
+#include "merge-strategies.h"
 
 int cmd_merge_index(int argc, const char **argv, const char *prefix)
 {
-	int i, force_file = 0;
+	int i, force_file = 0, err = 0, one_shot = 0, quiet = 0;
+	const char *pgm;
 
 	/* Without this we cannot rely on waitpid() to tell
 	 * what happened to our children.
@@ -98,14 +35,14 @@ int cmd_merge_index(int argc, const char **argv, const char *prefix)
 				continue;
 			}
 			if (!strcmp(arg, "-a")) {
-				merge_all();
+				err |= merge_all_index(&the_index, one_shot, quiet,
+						       merge_one_file_spawn, (void *)pgm);
 				continue;
 			}
 			die("git merge-index: unknown option %s", arg);
 		}
-		merge_one_path(arg);
+		err |= merge_index_path(&the_index, one_shot, quiet, arg,
+					merge_one_file_spawn, (void *)pgm);
 	}
-	if (err && !quiet)
-		die("merge program failed");
 	return err;
 }
diff --git a/merge-strategies.c b/merge-strategies.c
index f5fdb15bbf..e1d121c993 100644
--- a/merge-strategies.c
+++ b/merge-strategies.c
@@ -1,6 +1,7 @@
 #include "cache.h"
 #include "dir.h"
 #include "merge-strategies.h"
+#include "run-command.h"
 #include "xdiff-interface.h"
 
 static int checkout_from_index(struct index_state *istate, const char *path,
@@ -171,3 +172,105 @@ int merge_three_way(struct repository *r,
 
 	return 0;
 }
+
+int merge_one_file_spawn(const struct object_id *orig_blob,
+			 const struct object_id *our_blob,
+			 const struct object_id *their_blob, const char *path,
+			 unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
+			 void *data)
+{
+	char oids[3][GIT_MAX_HEXSZ + 1] = {{0}};
+	char modes[3][10] = {{0}};
+	const char *arguments[] = { (char *)data, oids[0], oids[1], oids[2],
+				    path, modes[0], modes[1], modes[2], NULL };
+
+	if (orig_blob) {
+		oid_to_hex_r(oids[0], orig_blob);
+		xsnprintf(modes[0], sizeof(modes[0]), "%06o", orig_mode);
+	}
+
+	if (our_blob) {
+		oid_to_hex_r(oids[1], our_blob);
+		xsnprintf(modes[1], sizeof(modes[1]), "%06o", our_mode);
+	}
+
+	if (their_blob) {
+		oid_to_hex_r(oids[2], their_blob);
+		xsnprintf(modes[2], sizeof(modes[2]), "%06o", their_mode);
+	}
+
+	return run_command_v_opt(arguments, 0);
+}
+
+static int merge_entry(struct index_state *istate, int quiet, int pos,
+		       const char *path, merge_fn fn, void *data)
+{
+	int found = 0;
+	const struct object_id *oids[3] = {NULL};
+	unsigned int modes[3] = {0};
+
+	do {
+		const struct cache_entry *ce = istate->cache[pos];
+		int stage = ce_stage(ce);
+
+		if (strcmp(ce->name, path))
+			break;
+		found++;
+		oids[stage - 1] = &ce->oid;
+		modes[stage - 1] = ce->ce_mode;
+	} while (++pos < istate->cache_nr);
+	if (!found)
+		return error(_("%s is not in the cache"), path);
+
+	if (fn(oids[0], oids[1], oids[2], path, modes[0], modes[1], modes[2], data)) {
+		if (!quiet)
+			error(_("Merge program failed"));
+		return -2;
+	}
+
+	return found;
+}
+
+int merge_index_path(struct index_state *istate, int oneshot, int quiet,
+		     const char *path, merge_fn fn, void *data)
+{
+	int pos = index_name_pos(istate, path, strlen(path)), ret;
+
+	/*
+	 * If it already exists in the cache as stage0, it's
+	 * already merged and there is nothing to do.
+	 */
+	if (pos < 0) {
+		ret = merge_entry(istate, quiet, -pos - 1, path, fn, data);
+		if (ret == -1)
+			return -1;
+		else if (ret == -2)
+			return 1;
+	}
+	return 0;
+}
+
+int merge_all_index(struct index_state *istate, int oneshot, int quiet,
+		    merge_fn fn, void *data)
+{
+	int err = 0, i, ret;
+	for (i = 0; i < istate->cache_nr; i++) {
+		const struct cache_entry *ce = istate->cache[i];
+		if (!ce_stage(ce))
+			continue;
+
+		ret = merge_entry(istate, quiet, i, ce->name, fn, data);
+		if (ret > 0)
+			i += ret - 1;
+		else if (ret == -1)
+			return -1;
+		else if (ret == -2) {
+			if (oneshot)
+				err++;
+			else
+				return 1;
+		}
+	}
+
+	return err;
+}
diff --git a/merge-strategies.h b/merge-strategies.h
index e624c4f27c..d2f52d6792 100644
--- a/merge-strategies.h
+++ b/merge-strategies.h
@@ -9,4 +9,21 @@ int merge_three_way(struct repository *r,
 		    const struct object_id *their_blob, const char *path,
 		    unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode);
 
+typedef int (*merge_fn)(const struct object_id *orig_blob,
+			const struct object_id *our_blob,
+			const struct object_id *their_blob, const char *path,
+			unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
+			void *data);
+
+int merge_one_file_spawn(const struct object_id *orig_blob,
+			 const struct object_id *our_blob,
+			 const struct object_id *their_blob, const char *path,
+			 unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
+			 void *data);
+
+int merge_index_path(struct index_state *istate, int oneshot, int quiet,
+		     const char *path, merge_fn fn, void *data);
+int merge_all_index(struct index_state *istate, int oneshot, int quiet,
+		    merge_fn fn, void *data);
+
 #endif /* MERGE_STRATEGIES_H */
-- 
2.20.1



* [PATCH v4 05/12] merge-index: don't fork if the requested program is `git-merge-one-file'
  2020-11-13 11:04     ` [PATCH v4 00/12] " Alban Gruin
                         ` (3 preceding siblings ...)
  2020-11-13 11:04       ` [PATCH v4 04/12] merge-index: libify merge_one_path() and merge_all() Alban Gruin
@ 2020-11-13 11:04       ` Alban Gruin
  2020-11-13 11:04       ` [PATCH v4 06/12] merge-resolve: rewrite in C Alban Gruin
                         ` (7 subsequent siblings)
  12 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2020-11-13 11:04 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Phillip Wood, Alban Gruin

Since `git-merge-one-file' has been rewritten and libified, this teaches
`merge-index' to call merge_three_way() without forking using a new
callback, merge_one_file_func().

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 builtin/merge-index.c | 29 +++++++++++++++++++++++++++--
 merge-strategies.c    | 11 +++++++++++
 merge-strategies.h    |  6 ++++++
 3 files changed, 44 insertions(+), 2 deletions(-)

diff --git a/builtin/merge-index.c b/builtin/merge-index.c
index 49e3382fb9..e684811d35 100644
--- a/builtin/merge-index.c
+++ b/builtin/merge-index.c
@@ -1,11 +1,15 @@
 #define USE_THE_INDEX_COMPATIBILITY_MACROS
 #include "builtin.h"
+#include "lockfile.h"
 #include "merge-strategies.h"
 
 int cmd_merge_index(int argc, const char **argv, const char *prefix)
 {
 	int i, force_file = 0, err = 0, one_shot = 0, quiet = 0;
 	const char *pgm;
+	void *data;
+	merge_fn merge_action;
+	struct lock_file lock = LOCK_INIT;
 
 	/* Without this we cannot rely on waitpid() to tell
 	 * what happened to our children.
@@ -26,7 +30,19 @@ int cmd_merge_index(int argc, const char **argv, const char *prefix)
 		quiet = 1;
 		i++;
 	}
+
 	pgm = argv[i++];
+	if (!strcmp(pgm, "git-merge-one-file")) {
+		merge_action = merge_one_file_func;
+		data = (void *)the_repository;
+
+		setup_work_tree();
+		hold_locked_index(&lock, LOCK_DIE_ON_ERROR);
+	} else {
+		merge_action = merge_one_file_spawn;
+		data = (void *)pgm;
+	}
+
 	for (; i < argc; i++) {
 		const char *arg = argv[i];
 		if (!force_file && *arg == '-') {
@@ -36,13 +52,22 @@ int cmd_merge_index(int argc, const char **argv, const char *prefix)
 			}
 			if (!strcmp(arg, "-a")) {
 				err |= merge_all_index(&the_index, one_shot, quiet,
-						       merge_one_file_spawn, (void *)pgm);
+						       merge_action, data);
 				continue;
 			}
 			die("git merge-index: unknown option %s", arg);
 		}
 		err |= merge_index_path(&the_index, one_shot, quiet, arg,
-					merge_one_file_spawn, (void *)pgm);
+					merge_action, data);
+	}
+
+	if (merge_action == merge_one_file_func) {
+		if (err) {
+			rollback_lock_file(&lock);
+			return err;
+		}
+
+		return write_locked_index(&the_index, &lock, COMMIT_LOCK);
 	}
 	return err;
 }
diff --git a/merge-strategies.c b/merge-strategies.c
index e1d121c993..aa31b7045c 100644
--- a/merge-strategies.c
+++ b/merge-strategies.c
@@ -173,6 +173,17 @@ int merge_three_way(struct repository *r,
 	return 0;
 }
 
+int merge_one_file_func(const struct object_id *orig_blob,
+			const struct object_id *our_blob,
+			const struct object_id *their_blob, const char *path,
+			unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
+			void *data)
+{
+	return merge_three_way((struct repository *)data,
+			       orig_blob, our_blob, their_blob, path,
+			       orig_mode, our_mode, their_mode);
+}
+
 int merge_one_file_spawn(const struct object_id *orig_blob,
 			 const struct object_id *our_blob,
 			 const struct object_id *their_blob, const char *path,
diff --git a/merge-strategies.h b/merge-strategies.h
index d2f52d6792..b69a12b390 100644
--- a/merge-strategies.h
+++ b/merge-strategies.h
@@ -15,6 +15,12 @@ typedef int (*merge_fn)(const struct object_id *orig_blob,
 			unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
 			void *data);
 
+int merge_one_file_func(const struct object_id *orig_blob,
+			const struct object_id *our_blob,
+			const struct object_id *their_blob, const char *path,
+			unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
+			void *data);
+
 int merge_one_file_spawn(const struct object_id *orig_blob,
 			 const struct object_id *our_blob,
 			 const struct object_id *their_blob, const char *path,
-- 
2.20.1



* [PATCH v4 06/12] merge-resolve: rewrite in C
  2020-11-13 11:04     ` [PATCH v4 00/12] " Alban Gruin
                         ` (4 preceding siblings ...)
  2020-11-13 11:04       ` [PATCH v4 05/12] merge-index: don't fork if the requested program is `git-merge-one-file' Alban Gruin
@ 2020-11-13 11:04       ` Alban Gruin
  2020-11-13 11:04       ` [PATCH v4 07/12] merge-recursive: move better_branch_name() to merge.c Alban Gruin
                         ` (6 subsequent siblings)
  12 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2020-11-13 11:04 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Phillip Wood, Alban Gruin

This rewrites `git merge-resolve' from shell to C.  As with `git
merge-one-file', this port is not completely straightforward: it removes
calls to external processes to avoid reading and writing the index over
and over again.

 - The call to `update-index -q --refresh' is replaced by a call to
   refresh_index().

 - The call to `read-tree' is replaced by a call to unpack_trees() (and
   all the setup needed).

 - The call to `write-tree' is replaced by a call to
   write_index_as_tree().

 - The call to `merge-index', needed to invoke `git merge-one-file', is
   replaced by a call to the new merge_all_index() function.

The index is read in cmd_merge_resolve(), and is written back by
merge_strategies_resolve().
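
The control flow that results from the replacements listed above can be
sketched in standalone C.  This is only an illustration, not the code of
this patch: `struct resolve_steps', resolve_flow(), step_ok() and
step_fail() are hypothetical names, and the three callbacks merely stand
in for unpack_trees(), write_index_as_tree() and merge_all_index().  The
return values follow the exit codes of the shell script (0 clean, 1
conflicts left, 2 not handled).

```c
#include <stdio.h>

/* Hypothetical stand-ins for the three library calls; 0 means success. */
struct resolve_steps {
	int (*unpack)(void *ctx);     /* stands in for unpack_trees() */
	int (*write_tree)(void *ctx); /* stands in for write_index_as_tree() */
	int (*merge_all)(void *ctx);  /* stands in for merge_all_index() */
};

/* Stubs so the sketch can be exercised without a repository. */
static int step_ok(void *ctx)
{
	(void)ctx;
	return 0;
}

static int step_fail(void *ctx)
{
	(void)ctx;
	return -1;
}

/*
 * Mirrors the shape of the shell script: give up with 2 if the trees
 * cannot be unpacked, succeed if a tree can be written right away, and
 * otherwise fall back to the per-file merge.
 */
static int resolve_flow(const struct resolve_steps *s, void *ctx)
{
	if (s->unpack(ctx))
		return 2;

	puts("Trying simple merge.");
	if (!s->write_tree(ctx))
		return 0;

	puts("Simple merge failed, trying Automatic merge.");
	return s->merge_all(ctx) ? 1 : 0;
}
```

Note that the per-file merge only runs when writing the tree fails,
that is, when unmerged entries remain in the index.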

The parameters of merge_strategies_resolve() will be surprising at first
glance: why use a commit list for `bases' and `remote', when we could
use an oid array and a pointer to an oid?  Because, in a later commit,
try_merge_strategy() will be able to call merge_strategies_resolve()
directly, and it already uses a commit list for `bases' (`common') and
`remote' (`remoteheads'), and a string for `head_arg'.  To reduce
friction later, merge_strategies_resolve() takes the same types of
parameters.
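
How the command line `<bases>... -- <head> <remote>' maps onto these
three parameters can be sketched without any git internals.  The names
below (`struct split_args', split_merge_args(), resolve_precheck(),
MAX_ARGS) are hypothetical and plain strings replace the commit lists;
the sketch only assumes the same splitting rule as the shell script.

```c
#include <stddef.h>
#include <string.h>

#define MAX_ARGS 16 /* arbitrary bound chosen for this sketch */

struct split_args {
	const char *bases[MAX_ARGS];
	size_t nr_bases;
	const char *head;
	const char *remotes[MAX_ARGS];
	size_t nr_remotes;
};

/*
 * Everything before "--" is a merge base, the first argument after it
 * is the head, and the rest are remotes.  Returns 0 on success, -1 if
 * more than MAX_ARGS bases or remotes are given.
 */
static int split_merge_args(int argc, const char **argv,
			    struct split_args *out)
{
	int i, sep_seen = 0;

	out->nr_bases = 0;
	out->nr_remotes = 0;
	out->head = NULL;

	for (i = 1; i < argc; i++) {
		if (!strcmp(argv[i], "--")) {
			sep_seen = 1;
		} else if (sep_seen && !out->head) {
			out->head = argv[i];
		} else if (sep_seen) {
			if (out->nr_remotes == MAX_ARGS)
				return -1;
			out->remotes[out->nr_remotes++] = argv[i];
		} else {
			if (out->nr_bases == MAX_ARGS)
				return -1;
			out->bases[out->nr_bases++] = argv[i];
		}
	}
	return 0;
}

/* 2 means "not for this strategy": an octopus or a baseless merge. */
static int resolve_precheck(const struct split_args *a)
{
	if (a->nr_remotes > 1)
		return 2;
	if (!a->nr_bases)
		return 2;
	return 0;
}
```

With `base1 -- HEAD topic', this yields one base, `HEAD' as the head and
one remote, which passes the precheck.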

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 Makefile                |  2 +-
 builtin.h               |  1 +
 builtin/merge-resolve.c | 73 +++++++++++++++++++++++++++++++++++
 git-merge-resolve.sh    | 54 --------------------------
 git.c                   |  1 +
 merge-strategies.c      | 85 +++++++++++++++++++++++++++++++++++++++++
 merge-strategies.h      |  5 +++
 7 files changed, 166 insertions(+), 55 deletions(-)
 create mode 100644 builtin/merge-resolve.c
 delete mode 100755 git-merge-resolve.sh

diff --git a/Makefile b/Makefile
index 6dfdb33cb2..3cc6b192f1 100644
--- a/Makefile
+++ b/Makefile
@@ -601,7 +601,6 @@ SCRIPT_SH += git-bisect.sh
 SCRIPT_SH += git-difftool--helper.sh
 SCRIPT_SH += git-filter-branch.sh
 SCRIPT_SH += git-merge-octopus.sh
-SCRIPT_SH += git-merge-resolve.sh
 SCRIPT_SH += git-mergetool.sh
 SCRIPT_SH += git-quiltimport.sh
 SCRIPT_SH += git-request-pull.sh
@@ -1097,6 +1096,7 @@ BUILTIN_OBJS += builtin/merge-index.o
 BUILTIN_OBJS += builtin/merge-one-file.o
 BUILTIN_OBJS += builtin/merge-ours.o
 BUILTIN_OBJS += builtin/merge-recursive.o
+BUILTIN_OBJS += builtin/merge-resolve.o
 BUILTIN_OBJS += builtin/merge-tree.o
 BUILTIN_OBJS += builtin/merge.o
 BUILTIN_OBJS += builtin/mktag.o
diff --git a/builtin.h b/builtin.h
index 4d2cd78856..35e91c16d0 100644
--- a/builtin.h
+++ b/builtin.h
@@ -180,6 +180,7 @@ int cmd_merge_ours(int argc, const char **argv, const char *prefix);
 int cmd_merge_file(int argc, const char **argv, const char *prefix);
 int cmd_merge_one_file(int argc, const char **argv, const char *prefix);
 int cmd_merge_recursive(int argc, const char **argv, const char *prefix);
+int cmd_merge_resolve(int argc, const char **argv, const char *prefix);
 int cmd_merge_tree(int argc, const char **argv, const char *prefix);
 int cmd_mktag(int argc, const char **argv, const char *prefix);
 int cmd_mktree(int argc, const char **argv, const char *prefix);
diff --git a/builtin/merge-resolve.c b/builtin/merge-resolve.c
new file mode 100644
index 0000000000..dca31676b8
--- /dev/null
+++ b/builtin/merge-resolve.c
@@ -0,0 +1,73 @@
+/*
+ * Builtin "git merge-resolve"
+ *
+ * Copyright (c) 2020 Alban Gruin
+ *
+ * Based on git-merge-resolve.sh, written by Linus Torvalds and Junio C
+ * Hamano.
+ *
+ * Resolve two trees, using enhanced multi-base read-tree.
+ */
+
+#define USE_THE_INDEX_COMPATIBILITY_MACROS
+#include "cache.h"
+#include "builtin.h"
+#include "merge-strategies.h"
+
+static const char builtin_merge_resolve_usage[] =
+	"git merge-resolve <bases>... -- <head> <remote>";
+
+int cmd_merge_resolve(int argc, const char **argv, const char *prefix)
+{
+	int i, sep_seen = 0;
+	const char *head = NULL;
+	struct commit_list *bases = NULL, *remote = NULL;
+	struct commit_list **next_base = &bases;
+
+	if (argc < 5)
+		usage(builtin_merge_resolve_usage);
+
+	setup_work_tree();
+	if (read_cache() < 0)
+		die("invalid index");
+
+	/*
+	 * The first parameters up to -- are merge bases; the rest are
+	 * heads.
+	 */
+	for (i = 1; i < argc; i++) {
+		if (!strcmp(argv[i], "--"))
+			sep_seen = 1;
+		else if (!strcmp(argv[i], "-h"))
+			usage(builtin_merge_resolve_usage);
+		else if (sep_seen && !head)
+			head = argv[i];
+		else {
+			struct object_id oid;
+			struct commit *commit;
+
+			if (get_oid(argv[i], &oid))
+				die("object %s not found.", argv[i]);
+
+			commit = lookup_commit_or_die(&oid, argv[i]);
+
+			if (sep_seen)
+				commit_list_insert(commit, &remote);
+			else
+				next_base = commit_list_append(commit, next_base);
+		}
+	}
+
+	/*
+	 * Give up if we are given two or more remotes.  Not handling
+	 * octopus.
+	 */
+	if (remote && remote->next)
+		return 2;
+
+	/* Give up if this is a baseless merge. */
+	if (!bases)
+		return 2;
+
+	return merge_strategies_resolve(the_repository, bases, head, remote);
+}
diff --git a/git-merge-resolve.sh b/git-merge-resolve.sh
deleted file mode 100755
index 343fe7bccd..0000000000
--- a/git-merge-resolve.sh
+++ /dev/null
@@ -1,54 +0,0 @@
-#!/bin/sh
-#
-# Copyright (c) 2005 Linus Torvalds
-# Copyright (c) 2005 Junio C Hamano
-#
-# Resolve two trees, using enhanced multi-base read-tree.
-
-# The first parameters up to -- are merge bases; the rest are heads.
-bases= head= remotes= sep_seen=
-for arg
-do
-	case ",$sep_seen,$head,$arg," in
-	*,--,)
-		sep_seen=yes
-		;;
-	,yes,,*)
-		head=$arg
-		;;
-	,yes,*)
-		remotes="$remotes$arg "
-		;;
-	*)
-		bases="$bases$arg "
-		;;
-	esac
-done
-
-# Give up if we are given two or more remotes -- not handling octopus.
-case "$remotes" in
-?*' '?*)
-	exit 2 ;;
-esac
-
-# Give up if this is a baseless merge.
-if test '' = "$bases"
-then
-	exit 2
-fi
-
-git update-index -q --refresh
-git read-tree -u -m --aggressive $bases $head $remotes || exit 2
-echo "Trying simple merge."
-if result_tree=$(git write-tree 2>/dev/null)
-then
-	exit 0
-else
-	echo "Simple merge failed, trying Automatic merge."
-	if git merge-index -o git-merge-one-file -a
-	then
-		exit 0
-	else
-		exit 1
-	fi
-fi
diff --git a/git.c b/git.c
index a4d3f98094..64a1a1de41 100644
--- a/git.c
+++ b/git.c
@@ -544,6 +544,7 @@ static struct cmd_struct commands[] = {
 	{ "merge-recursive", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
 	{ "merge-recursive-ours", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
 	{ "merge-recursive-theirs", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
+	{ "merge-resolve", cmd_merge_resolve, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
 	{ "merge-subtree", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
 	{ "merge-tree", cmd_merge_tree, RUN_SETUP | NO_PARSEOPT },
 	{ "mktag", cmd_mktag, RUN_SETUP | NO_PARSEOPT },
diff --git a/merge-strategies.c b/merge-strategies.c
index aa31b7045c..2b34ea0b76 100644
--- a/merge-strategies.c
+++ b/merge-strategies.c
@@ -1,7 +1,10 @@
 #include "cache.h"
+#include "cache-tree.h"
 #include "dir.h"
+#include "lockfile.h"
 #include "merge-strategies.h"
 #include "run-command.h"
+#include "unpack-trees.h"
 #include "xdiff-interface.h"
 
 static int checkout_from_index(struct index_state *istate, const char *path,
@@ -285,3 +288,85 @@ int merge_all_index(struct index_state *istate, int oneshot, int quiet,
 
 	return err;
 }
+
+static int add_tree(const struct object_id *oid, struct tree_desc *t)
+{
+	struct tree *tree;
+
+	tree = parse_tree_indirect(oid);
+	if (parse_tree(tree))
+		return -1;
+
+	init_tree_desc(t, tree->buffer, tree->size);
+	return 0;
+}
+
+int merge_strategies_resolve(struct repository *r,
+			     struct commit_list *bases, const char *head_arg,
+			     struct commit_list *remote)
+{
+	int i = 0;
+	struct lock_file lock = LOCK_INIT;
+	struct tree_desc t[MAX_UNPACK_TREES];
+	struct unpack_trees_options opts;
+	struct object_id head, oid;
+	struct commit_list *j;
+
+	if (head_arg)
+		get_oid(head_arg, &head);
+
+	repo_hold_locked_index(r, &lock, LOCK_DIE_ON_ERROR);
+	refresh_index(r->index, 0, NULL, NULL, NULL);
+
+	memset(&opts, 0, sizeof(opts));
+	opts.head_idx = 1;
+	opts.src_index = r->index;
+	opts.dst_index = r->index;
+	opts.update = 1;
+	opts.merge = 1;
+	opts.aggressive = 1;
+
+	for (j = bases; j && j->item; j = j->next) {
+		if (add_tree(&j->item->object.oid, t + (i++)))
+			goto out;
+	}
+
+	if (head_arg && add_tree(&head, t + (i++)))
+		goto out;
+	if (remote && add_tree(&remote->item->object.oid, t + (i++)))
+		goto out;
+
+	if (i == 1)
+		opts.fn = oneway_merge;
+	else if (i == 2) {
+		opts.fn = twoway_merge;
+		opts.initial_checkout = is_index_unborn(r->index);
+	} else if (i >= 3) {
+		opts.fn = threeway_merge;
+		opts.head_idx = i - 1;
+	}
+
+	if (unpack_trees(i, t, &opts))
+		goto out;
+
+	puts(_("Trying simple merge."));
+	write_locked_index(r->index, &lock, COMMIT_LOCK);
+
+	if (write_index_as_tree(&oid, r->index, r->index_file,
+				WRITE_TREE_SILENT, NULL)) {
+		int ret;
+
+		puts(_("Simple merge failed, trying Automatic merge."));
+		repo_hold_locked_index(r, &lock, LOCK_DIE_ON_ERROR);
+		ret = merge_all_index(r->index, 0, 0, merge_one_file_func, r);
+
+		write_locked_index(r->index, &lock, COMMIT_LOCK);
+		return !!ret;
+	}
+
+	return 0;
+
+ out:
+	rollback_lock_file(&lock);
+	return 2;
+}
diff --git a/merge-strategies.h b/merge-strategies.h
index b69a12b390..4f996261b4 100644
--- a/merge-strategies.h
+++ b/merge-strategies.h
@@ -1,6 +1,7 @@
 #ifndef MERGE_STRATEGIES_H
 #define MERGE_STRATEGIES_H
 
+#include "commit.h"
 #include "object.h"
 
 int merge_three_way(struct repository *r,
@@ -32,4 +33,8 @@ int merge_index_path(struct index_state *istate, int oneshot, int quiet,
 int merge_all_index(struct index_state *istate, int oneshot, int quiet,
 		    merge_fn fn, void *data);
 
+int merge_strategies_resolve(struct repository *r,
+			     struct commit_list *bases, const char *head_arg,
+			     struct commit_list *remote);
+
 #endif /* MERGE_STRATEGIES_H */
-- 
2.20.1



* [PATCH v4 07/12] merge-recursive: move better_branch_name() to merge.c
  2020-11-13 11:04     ` [PATCH v4 00/12] " Alban Gruin
                         ` (5 preceding siblings ...)
  2020-11-13 11:04       ` [PATCH v4 06/12] merge-resolve: rewrite in C Alban Gruin
@ 2020-11-13 11:04       ` Alban Gruin
  2020-11-13 11:04       ` [PATCH v4 08/12] merge-octopus: rewrite in C Alban Gruin
                         ` (5 subsequent siblings)
  12 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2020-11-13 11:04 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Phillip Wood, Alban Gruin

better_branch_name() will be used by merge-octopus once it is rewritten
in C, so instead of duplicating it, this moves the function ahead of
time into an appropriate file in libgit.a.  The function is also
renamed to reflect its usage by merge strategies.
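
For readers unfamiliar with the helper, its lookup can be sketched in
standalone C.  This is a simplified illustration, not the function
moved by this patch: better_name_sketch() and dup_string() are
hypothetical names, HEXSZ hard-codes a 40-character SHA-1 hex length
(the real code asks the_hash_algo), and allocation failure returns NULL
instead of dying.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define HEXSZ 40 /* assumption: SHA-1 object names */

/* Portable strdup() replacement; returns NULL on allocation failure. */
static char *dup_string(const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);

	if (copy)
		memcpy(copy, s, len);
	return copy;
}

/*
 * If `branch' looks like a full hex object name, prefer the
 * human-readable name exported in GITHEAD_<hex>, if any.  The caller
 * owns the returned string.
 */
static char *better_name_sketch(const char *branch)
{
	char env[8 + HEXSZ + 1]; /* "GITHEAD_" + hex + NUL */
	const char *name;

	if (strlen(branch) != HEXSZ)
		return dup_string(branch);

	snprintf(env, sizeof(env), "GITHEAD_%s", branch);
	name = getenv(env);
	return dup_string(name ? name : branch);
}
```

A name that is not exactly HEXSZ characters long is returned unchanged,
so ordinary branch names never trigger an environment lookup.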

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 builtin/merge-recursive.c | 16 ++--------------
 cache.h                   |  2 +-
 merge.c                   | 12 ++++++++++++
 3 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/builtin/merge-recursive.c b/builtin/merge-recursive.c
index a4bfd8fc51..972243b5e9 100644
--- a/builtin/merge-recursive.c
+++ b/builtin/merge-recursive.c
@@ -8,18 +8,6 @@
 static const char builtin_merge_recursive_usage[] =
 	"git %s <base>... -- <head> <remote> ...";
 
-static char *better_branch_name(const char *branch)
-{
-	static char githead_env[8 + GIT_MAX_HEXSZ + 1];
-	char *name;
-
-	if (strlen(branch) != the_hash_algo->hexsz)
-		return xstrdup(branch);
-	xsnprintf(githead_env, sizeof(githead_env), "GITHEAD_%s", branch);
-	name = getenv(githead_env);
-	return xstrdup(name ? name : branch);
-}
-
 int cmd_merge_recursive(int argc, const char **argv, const char *prefix)
 {
 	const struct object_id *bases[21];
@@ -75,8 +63,8 @@ int cmd_merge_recursive(int argc, const char **argv, const char *prefix)
 	if (get_oid(o.branch2, &h2))
 		die(_("could not resolve ref '%s'"), o.branch2);
 
-	o.branch1 = better1 = better_branch_name(o.branch1);
-	o.branch2 = better2 = better_branch_name(o.branch2);
+	o.branch1 = better1 = merge_get_better_branch_name(o.branch1);
+	o.branch2 = better2 = merge_get_better_branch_name(o.branch2);
 
 	if (o.verbosity >= 3)
 		printf(_("Merging %s with %s\n"), o.branch1, o.branch2);
diff --git a/cache.h b/cache.h
index be16ab3215..2d844576ea 100644
--- a/cache.h
+++ b/cache.h
@@ -1933,7 +1933,7 @@ int checkout_fast_forward(struct repository *r,
 			  const struct object_id *from,
 			  const struct object_id *to,
 			  int overwrite_ignore);
-
+char *merge_get_better_branch_name(const char *branch);
 
 int sane_execvp(const char *file, char *const argv[]);
 
diff --git a/merge.c b/merge.c
index 5fb88af102..801d673c5f 100644
--- a/merge.c
+++ b/merge.c
@@ -109,3 +109,15 @@ int checkout_fast_forward(struct repository *r,
 		return error(_("unable to write new index file"));
 	return 0;
 }
+
+char *merge_get_better_branch_name(const char *branch)
+{
+	static char githead_env[8 + GIT_MAX_HEXSZ + 1];
+	char *name;
+
+	if (strlen(branch) != the_hash_algo->hexsz)
+		return xstrdup(branch);
+	xsnprintf(githead_env, sizeof(githead_env), "GITHEAD_%s", branch);
+	name = getenv(githead_env);
+	return xstrdup(name ? name : branch);
+}
-- 
2.20.1



* [PATCH v4 08/12] merge-octopus: rewrite in C
  2020-11-13 11:04     ` [PATCH v4 00/12] " Alban Gruin
                         ` (6 preceding siblings ...)
  2020-11-13 11:04       ` [PATCH v4 07/12] merge-recursive: move better_branch_name() to merge.c Alban Gruin
@ 2020-11-13 11:04       ` Alban Gruin
  2020-11-13 11:04       ` [PATCH v4 09/12] merge: use the "resolve" strategy without forking Alban Gruin
                         ` (4 subsequent siblings)
  12 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2020-11-13 11:04 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Phillip Wood, Alban Gruin

This rewrites `git merge-octopus' from shell to C.  As with the last two
conversions, this port removes calls to external processes to avoid
reading and writing the index over and over again.

 - Calls to `read-tree -u -m (--aggressive)?' are replaced by calls to
   unpack_trees().

 - The call to `write-tree' is replaced by a call to
   write_index_as_tree().

 - The call to `diff-index ...' is replaced by a call to
   repo_index_has_changes().

 - The call to `merge-index', needed to invoke `git merge-one-file', is
   replaced by a call to merge_all_index().

The index is read in cmd_merge_octopus(), and is written back by
merge_strategies_octopus().

Here too, merge_strategies_octopus() takes two commit lists and a string
to reduce friction when try_merge_strategy() is modified to call it
directly.
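
The rule the octopus loop enforces (only the last head may leave
hand-resolvable conflicts) can be sketched independently of git.  The
names `enum round_result' and octopus_status() are hypothetical, and
each remote's merge is reduced to a precomputed outcome; hard failures
are simplified to status 2.

```c
#include <stddef.h>

/* Outcome of merging one remote head into the running result. */
enum round_result {
	ROUND_CLEAN,    /* merged without conflicts */
	ROUND_CONFLICT, /* per-file merge left conflicts */
	ROUND_ERROR     /* the trees could not be unpacked */
};

/*
 * Returns 0 for a clean octopus, 1 if only the last head conflicted,
 * and 2 if the merge has to be abandoned.
 */
static int octopus_status(const enum round_result *rounds, size_t nr)
{
	int ret = 0;
	size_t i;

	for (i = 0; i < nr; i++) {
		/* A previous round conflicted and a head is still left. */
		if (ret)
			return 2;

		if (rounds[i] == ROUND_ERROR)
			return 2;
		if (rounds[i] == ROUND_CONFLICT)
			ret = 1;
	}
	return ret;
}
```

A conflict in any round but the last therefore turns into status 2 on
the next iteration, matching the "Should not be doing an octopus"
message.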

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 Makefile                |   2 +-
 builtin.h               |   1 +
 builtin/merge-octopus.c |  69 ++++++++++++++
 git-merge-octopus.sh    | 112 ----------------------
 git.c                   |   1 +
 merge-strategies.c      | 204 ++++++++++++++++++++++++++++++++++++++++
 merge-strategies.h      |   3 +
 7 files changed, 279 insertions(+), 113 deletions(-)
 create mode 100644 builtin/merge-octopus.c
 delete mode 100755 git-merge-octopus.sh

diff --git a/Makefile b/Makefile
index 3cc6b192f1..2b2bdffafe 100644
--- a/Makefile
+++ b/Makefile
@@ -600,7 +600,6 @@ unexport CDPATH
 SCRIPT_SH += git-bisect.sh
 SCRIPT_SH += git-difftool--helper.sh
 SCRIPT_SH += git-filter-branch.sh
-SCRIPT_SH += git-merge-octopus.sh
 SCRIPT_SH += git-mergetool.sh
 SCRIPT_SH += git-quiltimport.sh
 SCRIPT_SH += git-request-pull.sh
@@ -1093,6 +1092,7 @@ BUILTIN_OBJS += builtin/mailsplit.o
 BUILTIN_OBJS += builtin/merge-base.o
 BUILTIN_OBJS += builtin/merge-file.o
 BUILTIN_OBJS += builtin/merge-index.o
+BUILTIN_OBJS += builtin/merge-octopus.o
 BUILTIN_OBJS += builtin/merge-one-file.o
 BUILTIN_OBJS += builtin/merge-ours.o
 BUILTIN_OBJS += builtin/merge-recursive.o
diff --git a/builtin.h b/builtin.h
index 35e91c16d0..50225404a0 100644
--- a/builtin.h
+++ b/builtin.h
@@ -176,6 +176,7 @@ int cmd_maintenance(int argc, const char **argv, const char *prefix);
 int cmd_merge(int argc, const char **argv, const char *prefix);
 int cmd_merge_base(int argc, const char **argv, const char *prefix);
 int cmd_merge_index(int argc, const char **argv, const char *prefix);
+int cmd_merge_octopus(int argc, const char **argv, const char *prefix);
 int cmd_merge_ours(int argc, const char **argv, const char *prefix);
 int cmd_merge_file(int argc, const char **argv, const char *prefix);
 int cmd_merge_one_file(int argc, const char **argv, const char *prefix);
diff --git a/builtin/merge-octopus.c b/builtin/merge-octopus.c
new file mode 100644
index 0000000000..ca8f9f345d
--- /dev/null
+++ b/builtin/merge-octopus.c
@@ -0,0 +1,69 @@
+/*
+ * Builtin "git merge-octopus"
+ *
+ * Copyright (c) 2020 Alban Gruin
+ *
+ * Based on git-merge-octopus.sh, written by Junio C Hamano.
+ *
+ * Resolve two or more trees.
+ */
+
+#define USE_THE_INDEX_COMPATIBILITY_MACROS
+#include "cache.h"
+#include "builtin.h"
+#include "commit.h"
+#include "merge-strategies.h"
+
+static const char builtin_merge_octopus_usage[] =
+	"git merge-octopus [<bases>...] -- <head> <remote1> <remote2> [<remotes>...]";
+
+int cmd_merge_octopus(int argc, const char **argv, const char *prefix)
+{
+	int i, sep_seen = 0;
+	struct commit_list *bases = NULL, *remotes = NULL;
+	struct commit_list **next_base = &bases, **next_remote = &remotes;
+	const char *head_arg = NULL;
+
+	if (argc < 5)
+		usage(builtin_merge_octopus_usage);
+
+	setup_work_tree();
+	if (read_cache() < 0)
+		die("invalid index");
+
+	/*
+	 * The first parameters up to -- are merge bases; the rest are
+	 * heads.
+	 */
+	for (i = 1; i < argc; i++) {
+		if (strcmp(argv[i], "--") == 0)
+			sep_seen = 1;
+		else if (strcmp(argv[i], "-h") == 0)
+			usage(builtin_merge_octopus_usage);
+		else if (sep_seen && !head_arg)
+			head_arg = argv[i];
+		else {
+			struct object_id oid;
+			struct commit *commit;
+
+			if (get_oid(argv[i], &oid))
+				die("object %s not found.", argv[i]);
+
+			commit = lookup_commit_or_die(&oid, argv[i]);
+
+			if (sep_seen)
+				next_remote = commit_list_append(commit, next_remote);
+			else
+				next_base = commit_list_append(commit, next_base);
+		}
+	}
+
+	/*
+	 * Reject if this is not an octopus -- resolve should be used
+	 * instead.
+	 */
+	if (commit_list_count(remotes) < 2)
+		return 2;
+
+	return merge_strategies_octopus(the_repository, bases, head_arg, remotes);
+}
diff --git a/git-merge-octopus.sh b/git-merge-octopus.sh
deleted file mode 100755
index 7d19d37951..0000000000
--- a/git-merge-octopus.sh
+++ /dev/null
@@ -1,112 +0,0 @@
-#!/bin/sh
-#
-# Copyright (c) 2005 Junio C Hamano
-#
-# Resolve two or more trees.
-#
-
-. git-sh-setup
-
-LF='
-'
-
-# The first parameters up to -- are merge bases; the rest are heads.
-bases= head= remotes= sep_seen=
-for arg
-do
-	case ",$sep_seen,$head,$arg," in
-	*,--,)
-		sep_seen=yes
-		;;
-	,yes,,*)
-		head=$arg
-		;;
-	,yes,*)
-		remotes="$remotes$arg "
-		;;
-	*)
-		bases="$bases$arg "
-		;;
-	esac
-done
-
-# Reject if this is not an octopus -- resolve should be used instead.
-case "$remotes" in
-?*' '?*)
-	;;
-*)
-	exit 2 ;;
-esac
-
-# MRC is the current "merge reference commit"
-# MRT is the current "merge result tree"
-
-if ! git diff-index --quiet --cached HEAD --
-then
-    gettextln "Error: Your local changes to the following files would be overwritten by merge"
-    git diff-index --cached --name-only HEAD -- | sed -e 's/^/    /'
-    exit 2
-fi
-MRC=$(git rev-parse --verify -q $head)
-MRT=$(git write-tree)
-NON_FF_MERGE=0
-OCTOPUS_FAILURE=0
-for SHA1 in $remotes
-do
-	case "$OCTOPUS_FAILURE" in
-	1)
-		# We allow only last one to have a hand-resolvable
-		# conflicts.  Last round failed and we still had
-		# a head to merge.
-		gettextln "Automated merge did not work."
-		gettextln "Should not be doing an octopus."
-		exit 2
-	esac
-
-	eval pretty_name=\${GITHEAD_$SHA1:-$SHA1}
-	if test "$SHA1" = "$pretty_name"
-	then
-		SHA1_UP="$(echo "$SHA1" | tr a-z A-Z)"
-		eval pretty_name=\${GITHEAD_$SHA1_UP:-$pretty_name}
-	fi
-	common=$(git merge-base --all $SHA1 $MRC) ||
-		die "$(eval_gettext "Unable to find common commit with \$pretty_name")"
-
-	case "$LF$common$LF" in
-	*"$LF$SHA1$LF"*)
-		eval_gettextln "Already up to date with \$pretty_name"
-		continue
-		;;
-	esac
-
-	if test "$common,$NON_FF_MERGE" = "$MRC,0"
-	then
-		# The first head being merged was a fast-forward.
-		# Advance MRC to the head being merged, and use that
-		# tree as the intermediate result of the merge.
-		# We still need to count this as part of the parent set.
-
-		eval_gettextln "Fast-forwarding to: \$pretty_name"
-		git read-tree -u -m $head $SHA1 || exit
-		MRC=$SHA1 MRT=$(git write-tree)
-		continue
-	fi
-
-	NON_FF_MERGE=1
-
-	eval_gettextln "Trying simple merge with \$pretty_name"
-	git read-tree -u -m --aggressive  $common $MRT $SHA1 || exit 2
-	next=$(git write-tree 2>/dev/null)
-	if test $? -ne 0
-	then
-		gettextln "Simple merge did not work, trying automatic merge."
-		git merge-index -o git-merge-one-file -a ||
-		OCTOPUS_FAILURE=1
-		next=$(git write-tree 2>/dev/null)
-	fi
-
-	MRC="$MRC $SHA1"
-	MRT=$next
-done
-
-exit "$OCTOPUS_FAILURE"
diff --git a/git.c b/git.c
index 64a1a1de41..d51fb5d2bf 100644
--- a/git.c
+++ b/git.c
@@ -539,6 +539,7 @@ static struct cmd_struct commands[] = {
 	{ "merge-base", cmd_merge_base, RUN_SETUP },
 	{ "merge-file", cmd_merge_file, RUN_SETUP_GENTLY },
 	{ "merge-index", cmd_merge_index, RUN_SETUP | NO_PARSEOPT },
+	{ "merge-octopus", cmd_merge_octopus, RUN_SETUP | NO_PARSEOPT },
 	{ "merge-ours", cmd_merge_ours, RUN_SETUP | NO_PARSEOPT },
 	{ "merge-one-file", cmd_merge_one_file, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
 	{ "merge-recursive", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
diff --git a/merge-strategies.c b/merge-strategies.c
index 2b34ea0b76..2ae27f4a80 100644
--- a/merge-strategies.c
+++ b/merge-strategies.c
@@ -1,5 +1,6 @@
 #include "cache.h"
 #include "cache-tree.h"
+#include "commit-reach.h"
 #include "dir.h"
 #include "lockfile.h"
 #include "merge-strategies.h"
@@ -370,3 +371,206 @@ int merge_strategies_resolve(struct repository *r,
 	rollback_lock_file(&lock);
 	return 2;
 }
+
+static int fast_forward(struct repository *r, const struct object_id *oids,
+			int nr, int aggressive)
+{
+	int i;
+	struct tree_desc t[MAX_UNPACK_TREES];
+	struct unpack_trees_options opts;
+	struct lock_file lock = LOCK_INIT;
+
+	repo_read_index_preload(r, NULL, 0);
+	if (refresh_index(r->index, REFRESH_QUIET, NULL, NULL, NULL))
+		return -1;
+
+	repo_hold_locked_index(r, &lock, LOCK_DIE_ON_ERROR);
+
+	memset(&opts, 0, sizeof(opts));
+	opts.head_idx = 1;
+	opts.src_index = r->index;
+	opts.dst_index = r->index;
+	opts.merge = 1;
+	opts.update = 1;
+	opts.aggressive = aggressive;
+
+	for (i = 0; i < nr; i++) {
+		struct tree *tree;
+		tree = parse_tree_indirect(oids + i);
+		if (parse_tree(tree))
+			return -1;
+		init_tree_desc(t + i, tree->buffer, tree->size);
+	}
+
+	if (nr == 1)
+		opts.fn = oneway_merge;
+	else if (nr == 2) {
+		opts.fn = twoway_merge;
+		opts.initial_checkout = is_index_unborn(r->index);
+	} else if (nr >= 3) {
+		opts.fn = threeway_merge;
+		opts.head_idx = nr - 1;
+	}
+
+	if (unpack_trees(nr, t, &opts))
+		return -1;
+
+	if (write_locked_index(r->index, &lock, COMMIT_LOCK))
+		return error(_("unable to write new index file"));
+
+	return 0;
+}
+
+static int write_tree(struct repository *r, struct tree **reference_tree)
+{
+	struct object_id oid;
+	int ret;
+
+	ret = write_index_as_tree(&oid, r->index, r->index_file, 0, NULL);
+	if (!ret)
+		*reference_tree = lookup_tree(r, &oid);
+
+	return ret;
+}
+
+int merge_strategies_octopus(struct repository *r,
+			     struct commit_list *bases, const char *head_arg,
+			     struct commit_list *remotes)
+{
+	int non_ff_merge = 0, ret = 0, references = 1;
+	struct commit **reference_commit;
+	struct tree *reference_tree;
+	struct commit_list *j;
+	struct object_id head;
+	struct strbuf sb = STRBUF_INIT;
+
+	get_oid(head_arg, &head);
+
+	reference_commit = xcalloc(commit_list_count(remotes) + 1, sizeof(struct commit *));
+	reference_commit[0] = lookup_commit_reference(r, &head);
+	reference_tree = repo_get_commit_tree(r, reference_commit[0]);
+
+	if (repo_index_has_changes(r, reference_tree, &sb)) {
+		error(_("Your local changes to the following files "
+			"would be overwritten by merge:\n  %s"),
+		      sb.buf);
+		strbuf_release(&sb);
+		ret = 2;
+		goto out;
+	}
+
+	for (j = remotes; j && j->item; j = j->next) {
+		struct commit *c = j->item;
+		struct object_id *oid = &c->object.oid;
+		struct commit_list *common, *k;
+		char *branch_name;
+		int can_ff = 1;
+
+		if (ret) {
+			/*
+			 * We allow only last one to have a
+			 * hand-resolvable conflicts.  Last round failed
+			 * and we still had a head to merge.
+			 */
+			puts(_("Automated merge did not work."));
+			puts(_("Should not be doing an octopus."));
+
+			ret = 2;
+			goto out;
+		}
+
+		branch_name = merge_get_better_branch_name(oid_to_hex(oid));
+		common = get_merge_bases_many(c, references, reference_commit);
+
+		if (!common)
+			die(_("Unable to find common commit with %s"), branch_name);
+
+		for (k = common; k && !oideq(&k->item->object.oid, oid); k = k->next);
+
+		if (k) {
+			printf(_("Already up to date with %s\n"), branch_name);
+			free(branch_name);
+			free_commit_list(common);
+			continue;
+		}
+
+		if (!non_ff_merge) {
+			int i;
+
+			for (i = 0, k = common; k && i < references && can_ff; k = k->next, i++) {
+				can_ff = oideq(&k->item->object.oid,
+					       &reference_commit[i]->object.oid);
+			}
+		}
+
+		if (!non_ff_merge && can_ff) {
+			/*
+			 * The first head being merged was a
+			 * fast-forward.  Advance the reference commit
+			 * to the head being merged, and use that tree
+			 * as the intermediate result of the merge.  We
+			 * still need to count this as part of the
+			 * parent set.
+			 */
+			struct object_id oids[2];
+			printf(_("Fast-forwarding to: %s\n"), branch_name);
+
+			oidcpy(oids, &head);
+			oidcpy(oids + 1, oid);
+
+			ret = fast_forward(r, oids, 2, 0);
+			if (ret) {
+				free(branch_name);
+				free_commit_list(common);
+				goto out;
+			}
+
+			references = 0;
+			write_tree(r, &reference_tree);
+		} else {
+			int i = 0;
+			struct tree *next = NULL;
+			struct object_id oids[MAX_UNPACK_TREES];
+
+			non_ff_merge = 1;
+			printf(_("Trying simple merge with %s\n"), branch_name);
+
+			for (k = common; k; k = k->next)
+				oidcpy(oids + (i++), &k->item->object.oid);
+
+			oidcpy(oids + (i++), &reference_tree->object.oid);
+			oidcpy(oids + (i++), oid);
+
+			if (fast_forward(r, oids, i, 1)) {
+				ret = 2;
+
+				free(branch_name);
+				free_commit_list(common);
+
+				goto out;
+			}
+
+			if (write_tree(r, &next)) {
+				struct lock_file lock = LOCK_INIT;
+
+				puts(_("Simple merge did not work, trying automatic merge."));
+				repo_hold_locked_index(r, &lock, LOCK_DIE_ON_ERROR);
+				ret = !!merge_all_index(r->index, 0, 0, merge_one_file_func, r);
+				write_locked_index(r->index, &lock, COMMIT_LOCK);
+
+				write_tree(r, &next);
+			}
+
+			reference_tree = next;
+		}
+
+		reference_commit[references++] = c;
+
+		free(branch_name);
+		free_commit_list(common);
+	}
+
+out:
+	free(reference_commit);
+	return ret;
+}
diff --git a/merge-strategies.h b/merge-strategies.h
index 4f996261b4..05232a5a89 100644
--- a/merge-strategies.h
+++ b/merge-strategies.h
@@ -36,5 +36,8 @@ int merge_all_index(struct index_state *istate, int oneshot, int quiet,
 int merge_strategies_resolve(struct repository *r,
 			     struct commit_list *bases, const char *head_arg,
 			     struct commit_list *remote);
+int merge_strategies_octopus(struct repository *r,
+			     struct commit_list *bases, const char *head_arg,
+			     struct commit_list *remote);
 
 #endif /* MERGE_STRATEGIES_H */
-- 
2.20.1



* [PATCH v4 09/12] merge: use the "resolve" strategy without forking
  2020-11-13 11:04     ` [PATCH v4 00/12] " Alban Gruin
                         ` (7 preceding siblings ...)
  2020-11-13 11:04       ` [PATCH v4 08/12] merge-octopus: rewrite in C Alban Gruin
@ 2020-11-13 11:04       ` Alban Gruin
  2020-11-13 11:04       ` [PATCH v4 10/12] merge: use the "octopus" " Alban Gruin
                         ` (3 subsequent siblings)
  12 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2020-11-13 11:04 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Phillip Wood, Alban Gruin

This teaches `git merge' to invoke the "resolve" strategy with a
function call instead of forking.

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 builtin/merge.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/builtin/merge.c b/builtin/merge.c
index 9d5359edc2..ddfefd8ce3 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -41,6 +41,7 @@
 #include "commit-reach.h"
 #include "wt-status.h"
 #include "commit-graph.h"
+#include "merge-strategies.h"
 
 #define DEFAULT_TWOHEAD (1<<0)
 #define DEFAULT_OCTOPUS (1<<1)
@@ -740,7 +741,10 @@ static int try_merge_strategy(const char *strategy, struct commit_list *common,
 				       COMMIT_LOCK | SKIP_IF_UNCHANGED))
 			die(_("unable to write %s"), get_index_file());
 		return clean ? 0 : 1;
-	} else {
+	} else if (!strcmp(strategy, "resolve"))
+		return merge_strategies_resolve(the_repository, common,
+						head_arg, remoteheads);
+	else {
 		return try_merge_command(the_repository,
 					 strategy, xopts_nr, xopts,
 					 common, head_arg, remoteheads);
-- 
2.20.1



* [PATCH v4 10/12] merge: use the "octopus" strategy without forking
  2020-11-13 11:04     ` [PATCH v4 00/12] " Alban Gruin
                         ` (8 preceding siblings ...)
  2020-11-13 11:04       ` [PATCH v4 09/12] merge: use the "resolve" strategy without forking Alban Gruin
@ 2020-11-13 11:04       ` Alban Gruin
  2020-11-13 11:04       ` [PATCH v4 11/12] sequencer: use the "resolve" " Alban Gruin
                         ` (2 subsequent siblings)
  12 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2020-11-13 11:04 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Phillip Wood, Alban Gruin

This teaches `git merge' to invoke the "octopus" strategy with a
function call instead of forking.

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 builtin/merge.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/builtin/merge.c b/builtin/merge.c
index ddfefd8ce3..02a2367647 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -744,6 +744,9 @@ static int try_merge_strategy(const char *strategy, struct commit_list *common,
 	} else if (!strcmp(strategy, "resolve"))
 		return merge_strategies_resolve(the_repository, common,
 						head_arg, remoteheads);
+	else if (!strcmp(strategy, "octopus"))
+		return merge_strategies_octopus(the_repository, common,
+						head_arg, remoteheads);
 	else {
 		return try_merge_command(the_repository,
 					 strategy, xopts_nr, xopts,
-- 
2.20.1



* [PATCH v4 11/12] sequencer: use the "resolve" strategy without forking
  2020-11-13 11:04     ` [PATCH v4 00/12] " Alban Gruin
                         ` (9 preceding siblings ...)
  2020-11-13 11:04       ` [PATCH v4 10/12] merge: use the "octopus" " Alban Gruin
@ 2020-11-13 11:04       ` Alban Gruin
  2020-11-13 11:04       ` [PATCH v4 12/12] sequencer: use the "octopus" merge " Alban Gruin
  2020-11-16 10:21       ` [PATCH v5 00/12] Rewrite the remaining merge strategies from shell to C Alban Gruin
  12 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2020-11-13 11:04 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Phillip Wood, Alban Gruin

This teaches the sequencer to invoke the "resolve" strategy with a
function call instead of forking.

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 sequencer.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/sequencer.c b/sequencer.c
index e8676e965f..ff411d54af 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -33,6 +33,7 @@
 #include "commit-reach.h"
 #include "rebase-interactive.h"
 #include "reset.h"
+#include "merge-strategies.h"
 
 #define GIT_REFLOG_ACTION "GIT_REFLOG_ACTION"
 
@@ -2000,9 +2001,15 @@ static int do_pick_commit(struct repository *r,
 
 		commit_list_insert(base, &common);
 		commit_list_insert(next, &remotes);
-		res |= try_merge_command(r, opts->strategy,
-					 opts->xopts_nr, (const char **)opts->xopts,
-					common, oid_to_hex(&head), remotes);
+
+		if (!strcmp(opts->strategy, "resolve")) {
+			repo_read_index(r);
+			res |= merge_strategies_resolve(r, common, oid_to_hex(&head), remotes);
+		} else
+			res |= try_merge_command(r, opts->strategy,
+						 opts->xopts_nr, (const char **)opts->xopts,
+						 common, oid_to_hex(&head), remotes);
+
 		free_commit_list(common);
 		free_commit_list(remotes);
 	}
-- 
2.20.1



* [PATCH v4 12/12] sequencer: use the "octopus" merge strategy without forking
  2020-11-13 11:04     ` [PATCH v4 00/12] " Alban Gruin
                         ` (10 preceding siblings ...)
  2020-11-13 11:04       ` [PATCH v4 11/12] sequencer: use the "resolve" " Alban Gruin
@ 2020-11-13 11:04       ` Alban Gruin
  2020-11-16 10:21       ` [PATCH v5 00/12] Rewrite the remaining merge strategies from shell to C Alban Gruin
  12 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2020-11-13 11:04 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Phillip Wood, Alban Gruin

This teaches the sequencer to invoke the "octopus" strategy with a
function call instead of forking.

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 sequencer.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/sequencer.c b/sequencer.c
index ff411d54af..746afad930 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -2005,6 +2005,9 @@ static int do_pick_commit(struct repository *r,
 		if (!strcmp(opts->strategy, "resolve")) {
 			repo_read_index(r);
 			res |= merge_strategies_resolve(r, common, oid_to_hex(&head), remotes);
+		} else if (!strcmp(opts->strategy, "octopus")) {
+			repo_read_index(r);
+			res |= merge_strategies_octopus(r, common, oid_to_hex(&head), remotes);
 		} else
 			res |= try_merge_command(r, opts->strategy,
 						 opts->xopts_nr, (const char **)opts->xopts,
-- 
2.20.1



* [PATCH v5 00/12] Rewrite the remaining merge strategies from shell to C
  2020-11-13 11:04     ` [PATCH v4 00/12] " Alban Gruin
                         ` (11 preceding siblings ...)
  2020-11-13 11:04       ` [PATCH v4 12/12] sequencer: use the "octopus" merge " Alban Gruin
@ 2020-11-16 10:21       ` Alban Gruin
  2020-11-16 10:21         ` [PATCH v5 01/12] t6027: modernise tests Alban Gruin
                           ` (12 more replies)
  12 siblings, 13 replies; 221+ messages in thread
From: Alban Gruin @ 2020-11-16 10:21 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Phillip Wood, Alban Gruin

In an effort to reduce the number of shell scripts in git's codebase, I
propose this patch series converting the two remaining merge strategies,
resolve and octopus, from shell to C.  This will enable slightly better
performance, better integration with git itself (no more forking to
perform these operations), and better portability (Windows and shell
scripts don't mix well).

Three scripts are actually converted: first git-merge-one-file.sh, then
git-merge-resolve.sh, and finally git-merge-octopus.sh.  They are not
only converted, but also modified to operate without forking, and then
libified so they can be used by git without spawning another process.

The first patch is not required for the rest of the series to work; I
wrote it while working on the series.

This series keeps the commands `git merge-one-file', `git
merge-resolve', and `git merge-octopus', so any script depending on them
should keep working without any changes.

This series is based on 306ee63a70 (Eighteenth batch, 2020-09-29).  The
tip is tagged as "rewrite-merge-strategies-v5" at
https://github.com/agrn/git.

Changes since v4:

 - [3/12] Split long lines to 80 characters max.

 - [6/12, 8/12] Define fast_forward() when rewriting `merge-resolve'
   instead of `merge-octopus' and use it in merge_strategies_resolve()
   to reduce code duplication.  This version takes a list of `tree_desc'
   instead of a list of oids.

 - [6/12, 8/12] Rename some variables (e.g. i -> nr, j -> i, k -> j).

 - [8/12] Merged the two loops detecting whether the merge was a
   fast-forward, or whether a step was already up to date, into a
   single, less convoluted loop.

 - [8/12] Moved the blocks doing a fast-forward and a non-fast-forward
   merge to their own functions to make the code simpler.  That way,
   there is no need to free `branch_name' and `common' each time an
   error is handled.

 - [8/12] A call to die() has been replaced by an error()/return.

 - [9/12, 10/12] Reformatted a chain of if/else if/else blocks.
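
As a rough illustration of the shared fast_forward() helper mentioned in
the [6/12, 8/12] item above, the way it picks an unpack_trees() callback
from the number of trees can be sketched as follows.  The enum, the
struct and choose_merge_fn() are hypothetical names made up for this
sketch; only the nr-to-callback mapping and the head_idx values are
taken from the range-diff below.

```c
/* Hypothetical labels for the unpack_trees() callbacks named in the series. */
enum merge_fn { NO_MERGE_FN, ONEWAY_MERGE, TWOWAY_MERGE, THREEWAY_MERGE };

/* Hypothetical container for the two fields that depend on `nr'. */
struct ff_choice {
	enum merge_fn fn;
	int head_idx;
};

/*
 * Sketch of the selection done in fast_forward(): one tree means a
 * one-way merge, two trees a two-way merge, three or more a three-way
 * merge where the head is the second-to-last tree.
 */
struct ff_choice choose_merge_fn(int nr)
{
	struct ff_choice c = { NO_MERGE_FN, 1 };	/* head_idx defaults to 1 */

	if (nr == 1)
		c.fn = ONEWAY_MERGE;
	else if (nr == 2)
		c.fn = TWOWAY_MERGE;
	else if (nr >= 3) {
		c.fn = THREEWAY_MERGE;
		c.head_idx = nr - 1;
	}

	return c;
}
```

With one merge base, the head and one remote, nr is 3, so the sketch
selects the three-way callback with head_idx set to 2.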

Alban Gruin (12):
  t6027: modernise tests
  update-index: move add_cacheinfo() to read-cache.c
  merge-one-file: rewrite in C
  merge-index: libify merge_one_path() and merge_all()
  merge-index: don't fork if the requested program is
    `git-merge-one-file'
  merge-resolve: rewrite in C
  merge-recursive: move better_branch_name() to merge.c
  merge-octopus: rewrite in C
  merge: use the "resolve" strategy without forking
  merge: use the "octopus" strategy without forking
  sequencer: use the "resolve" strategy without forking
  sequencer: use the "octopus" merge strategy without forking

 Makefile                        |   7 +-
 builtin.h                       |   3 +
 builtin/merge-index.c           | 102 ++----
 builtin/merge-octopus.c         |  69 ++++
 builtin/merge-one-file.c        |  94 ++++++
 builtin/merge-recursive.c       |  16 +-
 builtin/merge-resolve.c         |  73 +++++
 builtin/merge.c                 |   7 +
 builtin/update-index.c          |  25 +-
 cache.h                         |   7 +-
 git-merge-octopus.sh            | 112 -------
 git-merge-one-file.sh           | 167 ----------
 git-merge-resolve.sh            |  54 ---
 git.c                           |   3 +
 merge-strategies.c              | 564 ++++++++++++++++++++++++++++++++
 merge-strategies.h              |  43 +++
 merge.c                         |  12 +
 read-cache.c                    |  35 ++
 sequencer.c                     |  16 +-
 t/t6407-merge-binary.sh         |  27 +-
 t/t6415-merge-dir-to-symlink.sh |   2 +-
 21 files changed, 974 insertions(+), 464 deletions(-)
 create mode 100644 builtin/merge-octopus.c
 create mode 100644 builtin/merge-one-file.c
 create mode 100644 builtin/merge-resolve.c
 delete mode 100755 git-merge-octopus.sh
 delete mode 100755 git-merge-one-file.sh
 delete mode 100755 git-merge-resolve.sh
 create mode 100644 merge-strategies.c
 create mode 100644 merge-strategies.h

Range-diff against v4:
 1:  08c7df596a =  1:  08c7df596a t6027: modernise tests
 2:  df237da758 =  2:  df237da758 update-index: move add_cacheinfo() to read-cache.c
 3:  b64bad0d23 !  3:  eedddde8ea merge-one-file: rewrite in C
    @@ -498,7 +498,8 @@
     +		 * added so there is nothing to be done, except making
     +		 * the path merged.
     +		 */
    -+		return add_to_index_cacheinfo(r->index, our_mode, our_blob, path, 0, 1, 1, NULL);
    ++		return add_to_index_cacheinfo(r->index, our_mode, our_blob,
    ++					      path, 0, 1, 1, NULL);
     +	} else if (!orig_blob && !our_blob && their_blob) {
     +		struct cache_entry *ce;
     +		printf(_("Adding %s\n"), path);
    @@ -506,7 +507,8 @@
     +		if (file_exists(path))
     +			return error(_("untracked %s is overwritten by the merge."), path);
     +
    -+		if (add_to_index_cacheinfo(r->index, their_mode, their_blob, path, 0, 1, 1, &ce))
    ++		if (add_to_index_cacheinfo(r->index, their_mode, their_blob,
    ++					   path, 0, 1, 1, &ce))
     +			return -1;
     +		return checkout_from_index(r->index, path, ce);
     +	} else if (!orig_blob && our_blob && their_blob &&
    @@ -521,7 +523,8 @@
     +
     +		printf(_("Adding %s\n"), path);
     +
    -+		if (add_to_index_cacheinfo(r->index, our_mode, our_blob, path, 0, 1, 1, &ce))
    ++		if (add_to_index_cacheinfo(r->index, our_mode, our_blob,
    ++					   path, 0, 1, 1, &ce))
     +			return -1;
     +		return checkout_from_index(r->index, path, ce);
     +	} else if (our_blob && their_blob) {
 4:  c5577dc691 =  4:  a9b9942243 merge-index: libify merge_one_path() and merge_all()
 5:  a0e6cebe89 =  5:  12775907c5 merge-index: don't fork if the requested program is `git-merge-one-file'
 6:  94fbc7e286 !  6:  54a4a12504 merge-resolve: rewrite in C
    @@ -235,72 +235,86 @@
      	return err;
      }
     +
    -+static int add_tree(const struct object_id *oid, struct tree_desc *t)
    ++static int fast_forward(struct repository *r, struct tree_desc *t,
    ++			int nr, int aggressive)
     +{
    -+	struct tree *tree;
    -+
    -+	tree = parse_tree_indirect(oid);
    -+	if (parse_tree(tree))
    -+		return -1;
    -+
    -+	init_tree_desc(t, tree->buffer, tree->size);
    -+	return 0;
    -+}
    -+
    -+int merge_strategies_resolve(struct repository *r,
    -+			     struct commit_list *bases, const char *head_arg,
    -+			     struct commit_list *remote)
    -+{
    -+	int i = 0;
    -+	struct lock_file lock = LOCK_INIT;
    -+	struct tree_desc t[MAX_UNPACK_TREES];
     +	struct unpack_trees_options opts;
    -+	struct object_id head, oid;
    -+	struct commit_list *j;
    -+
    -+	if (head_arg)
    -+		get_oid(head_arg, &head);
    ++	struct lock_file lock = LOCK_INIT;
     +
    ++	refresh_index(r->index, REFRESH_QUIET, NULL, NULL, NULL);
     +	repo_hold_locked_index(r, &lock, LOCK_DIE_ON_ERROR);
    -+	refresh_index(r->index, 0, NULL, NULL, NULL);
     +
     +	memset(&opts, 0, sizeof(opts));
     +	opts.head_idx = 1;
     +	opts.src_index = r->index;
     +	opts.dst_index = r->index;
    -+	opts.update = 1;
     +	opts.merge = 1;
    -+	opts.aggressive = 1;
    ++	opts.update = 1;
    ++	opts.aggressive = aggressive;
     +
    -+	for (j = bases; j && j->item; j = j->next) {
    -+		if (add_tree(&j->item->object.oid, t + (i++)))
    -+			goto out;
    -+	}
    -+
    -+	if (head_arg && add_tree(&head, t + (i++)))
    -+		goto out;
    -+	if (remote && add_tree(&remote->item->object.oid, t + (i++)))
    -+		goto out;
    -+
    -+	if (i == 1)
    ++	if (nr == 1)
     +		opts.fn = oneway_merge;
    -+	else if (i == 2) {
    ++	else if (nr == 2) {
     +		opts.fn = twoway_merge;
     +		opts.initial_checkout = is_index_unborn(r->index);
    -+	} else if (i >= 3) {
    ++	} else if (nr >= 3) {
     +		opts.fn = threeway_merge;
    -+		opts.head_idx = i - 1;
    ++		opts.head_idx = nr - 1;
     +	}
     +
    -+	if (unpack_trees(i, t, &opts))
    -+		goto out;
    ++	if (unpack_trees(nr, t, &opts))
    ++		return -1;
    ++
    ++	if (write_locked_index(r->index, &lock, COMMIT_LOCK))
    ++		return error(_("unable to write new index file"));
    ++
    ++	return 0;
    ++}
    ++
    ++static int add_tree(struct tree *tree, struct tree_desc *t)
    ++{
    ++	if (parse_tree(tree))
    ++		return -1;
    ++
    ++	init_tree_desc(t, tree->buffer, tree->size);
    ++	return 0;
    ++}
    ++
    ++int merge_strategies_resolve(struct repository *r,
    ++			     struct commit_list *bases, const char *head_arg,
    ++			     struct commit_list *remote)
    ++{
    ++	struct tree_desc t[MAX_UNPACK_TREES];
    ++	struct object_id head, oid;
    ++	struct commit_list *i;
    ++	int nr = 0;
    ++
    ++	if (head_arg)
    ++		get_oid(head_arg, &head);
     +
     +	puts(_("Trying simple merge."));
    -+	write_locked_index(r->index, &lock, COMMIT_LOCK);
    ++
    ++	for (i = bases; i && i->item; i = i->next) {
    ++		if (add_tree(repo_get_commit_tree(r, i->item), t + (nr++)))
    ++			return 2;
    ++	}
    ++
    ++	if (head_arg) {
    ++		struct tree *tree = parse_tree_indirect(&head);
    ++		if (add_tree(tree, t + (nr++)))
    ++			return 2;
    ++	}
    ++
    ++	if (remote && add_tree(repo_get_commit_tree(r, remote->item), t + (nr++)))
    ++		return 2;
    ++
    ++	if (fast_forward(r, t, nr, 1))
    ++		return 2;
     +
     +	if (write_index_as_tree(&oid, r->index, r->index_file,
     +				WRITE_TREE_SILENT, NULL)) {
     +		int ret;
    ++		struct lock_file lock = LOCK_INIT;
     +
     +		puts(_("Simple merge failed, trying Automatic merge."));
     +		repo_hold_locked_index(r, &lock, LOCK_DIE_ON_ERROR);
    @@ -311,10 +325,6 @@
     +	}
     +
     +	return 0;
    -+
    -+ out:
    -+	rollback_lock_file(&lock);
    -+	return 2;
     +}
     
      diff --git a/merge-strategies.h b/merge-strategies.h
 7:  b582b7e5d1 =  7:  7c4ad06b95 merge-recursive: move better_branch_name() to merge.c
 8:  d1936645d5 !  8:  edbe08d41b merge-octopus: rewrite in C
    @@ -275,88 +275,107 @@
      #include "lockfile.h"
      #include "merge-strategies.h"
     @@
    - 	rollback_lock_file(&lock);
    - 	return 2;
    + 
    + 	return 0;
      }
     +
    -+static int fast_forward(struct repository *r, const struct object_id *oids,
    -+			int nr, int aggressive)
    -+{
    -+	int i;
    -+	struct tree_desc t[MAX_UNPACK_TREES];
    -+	struct unpack_trees_options opts;
    -+	struct lock_file lock = LOCK_INIT;
    -+
    -+	repo_read_index_preload(r, NULL, 0);
    -+	if (refresh_index(r->index, REFRESH_QUIET, NULL, NULL, NULL))
    -+		return -1;
    -+
    -+	repo_hold_locked_index(r, &lock, LOCK_DIE_ON_ERROR);
    -+
    -+	memset(&opts, 0, sizeof(opts));
    -+	opts.head_idx = 1;
    -+	opts.src_index = r->index;
    -+	opts.dst_index = r->index;
    -+	opts.merge = 1;
    -+	opts.update = 1;
    -+	opts.aggressive = aggressive;
    -+
    -+	for (i = 0; i < nr; i++) {
    -+		struct tree *tree;
    -+		tree = parse_tree_indirect(oids + i);
    -+		if (parse_tree(tree))
    -+			return -1;
    -+		init_tree_desc(t + i, tree->buffer, tree->size);
    -+	}
    -+
    -+	if (nr == 1)
    -+		opts.fn = oneway_merge;
    -+	else if (nr == 2) {
    -+		opts.fn = twoway_merge;
    -+		opts.initial_checkout = is_index_unborn(r->index);
    -+	} else if (nr >= 3) {
    -+		opts.fn = threeway_merge;
    -+		opts.head_idx = nr - 1;
    -+	}
    -+
    -+	if (unpack_trees(nr, t, &opts))
    -+		return -1;
    -+
    -+	if (write_locked_index(r->index, &lock, COMMIT_LOCK))
    -+		return error(_("unable to write new index file"));
    -+
    -+	return 0;
    -+}
    -+
     +static int write_tree(struct repository *r, struct tree **reference_tree)
     +{
     +	struct object_id oid;
     +	int ret;
     +
    -+	ret = write_index_as_tree(&oid, r->index, r->index_file, 0, NULL);
    -+	if (!ret)
    ++	if (!(ret = write_index_as_tree(&oid, r->index, r->index_file, 0, NULL)))
     +		*reference_tree = lookup_tree(r, &oid);
     +
     +	return ret;
     +}
     +
    ++static int octopus_fast_forward(struct repository *r, const char *branch_name,
    ++				struct tree *tree_head, struct tree *current_tree,
    ++				struct tree **reference_tree)
    ++{
    ++	/*
    ++	 * The first head being merged was a fast-forward.  Advance the
    ++	 * reference commit to the head being merged, and use that tree
    ++	 * as the intermediate result of the merge.  We still need to
    ++	 * count this as part of the parent set.
    ++	 */
    ++	struct tree_desc t[2];
    ++
    ++	printf(_("Fast-forwarding to: %s\n"), branch_name);
    ++
    ++	init_tree_desc(t, tree_head->buffer, tree_head->size);
    ++	if (add_tree(current_tree, t + 1))
    ++		return -1;
    ++	if (fast_forward(r, t, 2, 0))
    ++		return -1;
    ++	if (write_tree(r, reference_tree))
    ++		return -1;
    ++
    ++	return 0;
    ++}
    ++
    ++static int octopus_do_merge(struct repository *r, const char *branch_name,
    ++			    struct commit_list *common, struct tree *current_tree,
    ++			    struct tree **reference_tree)
    ++{
    ++	struct tree_desc t[MAX_UNPACK_TREES];
    ++	struct commit_list *j;
    ++	int nr = 0, ret = 0;
    ++
    ++	printf(_("Trying simple merge with %s\n"), branch_name);
    ++
    ++	for (j = common; j; j = j->next) {
    ++		struct tree *tree = repo_get_commit_tree(r, j->item);
    ++		if (add_tree(tree, t + (nr++)))
    ++			return -1;
    ++	}
    ++
    ++	if (add_tree(*reference_tree, t + (nr++)))
    ++		return -1;
    ++	if (add_tree(current_tree, t + (nr++)))
    ++		return -1;
    ++	if (fast_forward(r, t, nr, 1))
    ++		return -1;
    ++
    ++	if (write_tree(r, reference_tree)) {
    ++		struct lock_file lock = LOCK_INIT;
    ++
    ++		puts(_("Simple merge did not work, trying automatic merge."));
    ++		repo_hold_locked_index(r, &lock, LOCK_DIE_ON_ERROR);
    ++		ret = merge_all_index(r->index, 0, 0, merge_one_file_func, r);
    ++		write_locked_index(r->index, &lock, COMMIT_LOCK);
    ++
    ++		write_tree(r, reference_tree);
    ++	}
    ++
    ++	return ret ? -2 : 0;
    ++}
    ++
     +int merge_strategies_octopus(struct repository *r,
     +			     struct commit_list *bases, const char *head_arg,
     +			     struct commit_list *remotes)
     +{
    -+	int non_ff_merge = 0, ret = 0, references = 1;
    ++	int ff_merge = 1, ret = 0, references = 1;
     +	struct commit **reference_commit;
    -+	struct tree *reference_tree;
    -+	struct commit_list *j;
    ++	struct tree *reference_tree, *tree_head;
    ++	struct commit_list *i;
     +	struct object_id head;
     +	struct strbuf sb = STRBUF_INIT;
     +
     +	get_oid(head_arg, &head);
     +
    -+	reference_commit = xcalloc(commit_list_count(remotes) + 1, sizeof(struct commit *));
    ++	reference_commit = xcalloc(commit_list_count(remotes) + 1,
    ++				   sizeof(struct commit *));
     +	reference_commit[0] = lookup_commit_reference(r, &head);
     +	reference_tree = repo_get_commit_tree(r, reference_commit[0]);
     +
    ++	tree_head = repo_get_commit_tree(r, reference_commit[0]);
    ++	if (parse_tree(tree_head)) {
    ++		ret = 2;
    ++		goto out;
    ++	}
    ++
     +	if (repo_index_has_changes(r, reference_tree, &sb)) {
     +		error(_("Your local changes to the following files "
     +			"would be overwritten by merge:\n  %s"),
    @@ -366,12 +385,13 @@
     +		goto out;
     +	}
     +
    -+	for (j = remotes; j && j->item; j = j->next) {
    -+		struct commit *c = j->item;
    ++	for (i = remotes; i && i->item; i = i->next) {
    ++		struct commit *c = i->item;
     +		struct object_id *oid = &c->object.oid;
    -+		struct commit_list *common, *k;
    ++		struct tree *current_tree = repo_get_commit_tree(r, c);
    ++		struct commit_list *common, *j;
     +		char *branch_name;
    -+		int can_ff = 1;
    ++		int k = 0, up_to_date = 0;
     +
     +		if (ret) {
     +			/*
    @@ -389,92 +409,47 @@
     +		branch_name = merge_get_better_branch_name(oid_to_hex(oid));
     +		common = get_merge_bases_many(c, references, reference_commit);
     +
    -+		if (!common)
    -+			die(_("Unable to find common commit with %s"), branch_name);
    ++		if (!common) {
    ++			error(_("Unable to find common commit with %s"), branch_name);
     +
    -+		for (k = common; k && !oideq(&k->item->object.oid, oid); k = k->next);
    ++			free(branch_name);
    ++			free_commit_list(common);
     +
    -+		if (k) {
    ++			ret = 2;
    ++			goto out;
    ++		}
    ++
    ++		for (j = common; j && !(up_to_date || !ff_merge); j = j->next) {
    ++			up_to_date |= oideq(&j->item->object.oid, oid);
    ++
    ++			if (k < references)
    ++				ff_merge &= oideq(&j->item->object.oid, &reference_commit[k++]->object.oid);
    ++		}
    ++
    ++		if (up_to_date) {
     +			printf(_("Already up to date with %s\n"), branch_name);
    ++
     +			free(branch_name);
     +			free_commit_list(common);
     +			continue;
     +		}
     +
    -+		if (!non_ff_merge) {
    -+			int i;
    -+
    -+			for (i = 0, k = common; k && i < references && can_ff; k = k->next, i++) {
    -+				can_ff = oideq(&k->item->object.oid,
    -+					       &reference_commit[i]->object.oid);
    -+			}
    -+		}
    -+
    -+		if (!non_ff_merge && can_ff) {
    -+			/*
    -+			 * The first head being merged was a
    -+			 * fast-forward.  Advance the reference commit
    -+			 * to the head being merged, and use that tree
    -+			 * as the intermediate result of the merge.  We
    -+			 * still need to count this as part of the
    -+			 * parent set.
    -+			 */
    -+			struct object_id oids[2];
    -+			printf(_("Fast-forwarding to: %s\n"), branch_name);
    -+
    -+			oidcpy(oids, &head);
    -+			oidcpy(oids + 1, oid);
    -+
    -+			ret = fast_forward(r, oids, 2, 0);
    -+			if (ret) {
    -+				free(branch_name);
    -+				free_commit_list(common);
    -+				goto out;
    -+			}
    -+
    ++		if (ff_merge) {
    ++			ret = octopus_fast_forward(r, branch_name, tree_head,
    ++						   current_tree, &reference_tree);
     +			references = 0;
    -+			write_tree(r, &reference_tree);
     +		} else {
    -+			int i = 0;
    -+			struct tree *next = NULL;
    -+			struct object_id oids[MAX_UNPACK_TREES];
    -+
    -+			non_ff_merge = 1;
    -+			printf(_("Trying simple merge with %s\n"), branch_name);
    -+
    -+			for (k = common; k; k = k->next)
    -+				oidcpy(oids + (i++), &k->item->object.oid);
    -+
    -+			oidcpy(oids + (i++), &reference_tree->object.oid);
    -+			oidcpy(oids + (i++), oid);
    -+
    -+			if (fast_forward(r, oids, i, 1)) {
    -+				ret = 2;
    -+
    -+				free(branch_name);
    -+				free_commit_list(common);
    -+
    -+				goto out;
    -+			}
    -+
    -+			if (write_tree(r, &next)) {
    -+				struct lock_file lock = LOCK_INIT;
    -+
    -+				puts(_("Simple merge did not work, trying automatic merge."));
    -+				repo_hold_locked_index(r, &lock, LOCK_DIE_ON_ERROR);
    -+				ret = !!merge_all_index(r->index, 0, 0, merge_one_file_func, r);
    -+				write_locked_index(r->index, &lock, COMMIT_LOCK);
    -+
    -+				write_tree(r, &next);
    -+			}
    -+
    -+			reference_tree = next;
    ++			ret = octopus_do_merge(r, branch_name, common,
    ++					       current_tree, &reference_tree);
     +		}
     +
    -+		reference_commit[references++] = c;
    -+
     +		free(branch_name);
     +		free_commit_list(common);
    ++
    ++		if (ret == -1)
    ++			goto out;
    ++
    ++		reference_commit[references++] = c;
     +	}
     +
     +out:
 9:  26b1a3979c !  9:  e677b27c06 merge: use the "resolve" strategy without forking
    @@ -22,11 +22,9 @@
      				       COMMIT_LOCK | SKIP_IF_UNCHANGED))
      			die(_("unable to write %s"), get_index_file());
      		return clean ? 0 : 1;
    --	} else {
    -+	} else if (!strcmp(strategy, "resolve"))
    ++	} else if (!strcmp(strategy, "resolve")) {
     +		return merge_strategies_resolve(the_repository, common,
     +						head_arg, remoteheads);
    -+	else {
    + 	} else {
      		return try_merge_command(the_repository,
      					 strategy, xopts_nr, xopts,
    - 					 common, head_arg, remoteheads);
10:  23bc9824df ! 10:  963f316fd6 merge: use the "octopus" strategy without forking
    @@ -11,12 +11,12 @@
      --- a/builtin/merge.c
      +++ b/builtin/merge.c
     @@
    - 	} else if (!strcmp(strategy, "resolve"))
    + 	} else if (!strcmp(strategy, "resolve")) {
      		return merge_strategies_resolve(the_repository, common,
      						head_arg, remoteheads);
    -+	else if (!strcmp(strategy, "octopus"))
    ++	} else if (!strcmp(strategy, "octopus")) {
     +		return merge_strategies_octopus(the_repository, common,
     +						head_arg, remoteheads);
    - 	else {
    + 	} else {
      		return try_merge_command(the_repository,
      					 strategy, xopts_nr, xopts,
11:  3a340f5984 = 11:  0ad967a7e5 sequencer: use the "resolve" strategy without forking
12:  ce3723cf34 = 12:  3814f61717 sequencer: use the "octopus" merge strategy without forking
-- 
2.20.1



* [PATCH v5 01/12] t6027: modernise tests
  2020-11-16 10:21       ` [PATCH v5 00/12] Rewrite the remaining merge strategies from shell to C Alban Gruin
@ 2020-11-16 10:21         ` Alban Gruin
  2020-11-16 10:21         ` [PATCH v5 02/12] update-index: move add_cacheinfo() to read-cache.c Alban Gruin
                           ` (11 subsequent siblings)
  12 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2020-11-16 10:21 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Phillip Wood, Alban Gruin

Some tests in t6027 use an if/then/else to check whether a command
failed, but nowadays we have the `test_must_fail' function to do this
correctly for us.

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 t/t6407-merge-binary.sh | 27 ++++++---------------------
 1 file changed, 6 insertions(+), 21 deletions(-)

diff --git a/t/t6407-merge-binary.sh b/t/t6407-merge-binary.sh
index 4e6c7cb77e..071d3f7343 100755
--- a/t/t6407-merge-binary.sh
+++ b/t/t6407-merge-binary.sh
@@ -5,7 +5,6 @@ test_description='ask merge-recursive to merge binary files'
 . ./test-lib.sh
 
 test_expect_success setup '
-
 	cat "$TEST_DIRECTORY"/test-binary-1.png >m &&
 	git add m &&
 	git ls-files -s | sed -e "s/ 0	/ 1	/" >E1 &&
@@ -35,33 +34,19 @@ test_expect_success setup '
 '
 
 test_expect_success resolve '
-
 	rm -f a* m* &&
 	git reset --hard anchor &&
-
-	if git merge -s resolve master
-	then
-		echo Oops, should not have succeeded
-		false
-	else
-		git ls-files -s >current
-		test_cmp expect current
-	fi
+	test_must_fail git merge -s resolve master &&
+	git ls-files -s >current &&
+	test_cmp expect current
 '
 
 test_expect_success recursive '
-
 	rm -f a* m* &&
 	git reset --hard anchor &&
-
-	if git merge -s recursive master
-	then
-		echo Oops, should not have succeeded
-		false
-	else
-		git ls-files -s >current
-		test_cmp expect current
-	fi
+	test_must_fail git merge -s recursive master &&
+	git ls-files -s >current &&
+	test_cmp expect current
 '
 
 test_done
-- 
2.20.1



* [PATCH v5 02/12] update-index: move add_cacheinfo() to read-cache.c
  2020-11-16 10:21       ` [PATCH v5 00/12] Rewrite the remaining merge strategies from shell to C Alban Gruin
  2020-11-16 10:21         ` [PATCH v5 01/12] t6027: modernise tests Alban Gruin
@ 2020-11-16 10:21         ` Alban Gruin
  2020-11-16 10:21         ` [PATCH v5 03/12] merge-one-file: rewrite in C Alban Gruin
                           ` (10 subsequent siblings)
  12 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2020-11-16 10:21 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Phillip Wood, Alban Gruin

This moves the function add_cacheinfo() that already exists in
update-index.c to read-cache.c, renames it add_to_index_cacheinfo(),
and adds an `istate' parameter.  The new cache entry is returned through
a pointer passed in the parameters.  The return value is either 0
(success), -1 (invalid path), or -2 (failed to add the file to the
index).

This will become useful in the next commit, when the three-way merge
will need to call this function.

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
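Note for readers of the thread: the return-value convention described in
the commit message above can be sketched without libgit.a.  This is only
an illustration, not code from the series: the CACHEINFO_* macros and
handle_cacheinfo_result() are hypothetical names, and the function
imitates what add_cacheinfo() in builtin/update-index.c does with the
result in the diff below.

```c
#include <stdio.h>

/* Hypothetical names for the three documented return values. */
#define CACHEINFO_OK		0
#define CACHEINFO_BAD_PATH	(-1)
#define CACHEINFO_ADD_FAILED	(-2)

/*
 * -1 is passed through unchanged (the callee has already reported the
 * invalid path), -2 gets a caller-specific message and is turned into
 * -1, and 0 means success.
 */
int handle_cacheinfo_result(int res, const char *path)
{
	if (res == CACHEINFO_BAD_PATH)
		return res;
	if (res == CACHEINFO_ADD_FAILED) {
		fprintf(stderr,
			"error: %s: cannot add to the index - missing --add option?\n",
			path);
		return -1;
	}
	return CACHEINFO_OK;
}
```

Keeping -1 and -2 distinct lets each caller decide how to word the
"cannot add" failure, while the invalid-path message stays in one place.
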
 builtin/update-index.c | 25 +++++++------------------
 cache.h                |  5 +++++
 read-cache.c           | 35 +++++++++++++++++++++++++++++++++++
 3 files changed, 47 insertions(+), 18 deletions(-)

diff --git a/builtin/update-index.c b/builtin/update-index.c
index 79087bccea..44862f5e1d 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -404,27 +404,16 @@ static int process_path(const char *path, struct stat *st, int stat_errno)
 static int add_cacheinfo(unsigned int mode, const struct object_id *oid,
 			 const char *path, int stage)
 {
-	int len, option;
-	struct cache_entry *ce;
+	int res;
 
-	if (!verify_path(path, mode))
-		return error("Invalid path '%s'", path);
-
-	len = strlen(path);
-	ce = make_empty_cache_entry(&the_index, len);
-
-	oidcpy(&ce->oid, oid);
-	memcpy(ce->name, path, len);
-	ce->ce_flags = create_ce_flags(stage);
-	ce->ce_namelen = len;
-	ce->ce_mode = create_ce_mode(mode);
-	if (assume_unchanged)
-		ce->ce_flags |= CE_VALID;
-	option = allow_add ? ADD_CACHE_OK_TO_ADD : 0;
-	option |= allow_replace ? ADD_CACHE_OK_TO_REPLACE : 0;
-	if (add_cache_entry(ce, option))
+	res = add_to_index_cacheinfo(&the_index, mode, oid, path, stage,
+				     allow_add, allow_replace, NULL);
+	if (res == -1)
+		return res;
+	if (res == -2)
 		return error("%s: cannot add to the index - missing --add option?",
 			     path);
+
 	report("add '%s'", path);
 	return 0;
 }
diff --git a/cache.h b/cache.h
index c0072d43b1..be16ab3215 100644
--- a/cache.h
+++ b/cache.h
@@ -830,6 +830,11 @@ int remove_file_from_index(struct index_state *, const char *path);
 int add_to_index(struct index_state *, const char *path, struct stat *, int flags);
 int add_file_to_index(struct index_state *, const char *path, int flags);
 
+int add_to_index_cacheinfo(struct index_state *, unsigned int mode,
+			   const struct object_id *oid, const char *path,
+			   int stage, int allow_add, int allow_replace,
+			   struct cache_entry **pce);
+
 int chmod_index_entry(struct index_state *, struct cache_entry *ce, char flip);
 int ce_same_name(const struct cache_entry *a, const struct cache_entry *b);
 void set_object_name_for_intent_to_add_entry(struct cache_entry *ce);
diff --git a/read-cache.c b/read-cache.c
index ecf6f68994..c25f951db4 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1350,6 +1350,41 @@ int add_index_entry(struct index_state *istate, struct cache_entry *ce, int opti
 	return 0;
 }
 
+int add_to_index_cacheinfo(struct index_state *istate, unsigned int mode,
+			   const struct object_id *oid, const char *path,
+			   int stage, int allow_add, int allow_replace,
+			   struct cache_entry **pce)
+{
+	int len, option;
+	struct cache_entry *ce = NULL;
+
+	if (!verify_path(path, mode))
+		return error(_("Invalid path '%s'"), path);
+
+	len = strlen(path);
+	ce = make_empty_cache_entry(istate, len);
+
+	oidcpy(&ce->oid, oid);
+	memcpy(ce->name, path, len);
+	ce->ce_flags = create_ce_flags(stage);
+	ce->ce_namelen = len;
+	ce->ce_mode = create_ce_mode(mode);
+	if (assume_unchanged)
+		ce->ce_flags |= CE_VALID;
+	option = allow_add ? ADD_CACHE_OK_TO_ADD : 0;
+	option |= allow_replace ? ADD_CACHE_OK_TO_REPLACE : 0;
+
+	if (add_index_entry(istate, ce, option)) {
+		discard_cache_entry(ce);
+		return -2;
+	}
+
+	if (pce)
+		*pce = ce;
+
+	return 0;
+}
+
 /*
  * "refresh" does not calculate a new sha1 file or bring the
  * cache up-to-date for mode/content changes. But what it
-- 
2.20.1



* [PATCH v5 03/12] merge-one-file: rewrite in C
  2020-11-16 10:21       ` [PATCH v5 00/12] Rewrite the remaining merge strategies from shell to C Alban Gruin
  2020-11-16 10:21         ` [PATCH v5 01/12] t6027: modernise tests Alban Gruin
  2020-11-16 10:21         ` [PATCH v5 02/12] update-index: move add_cacheinfo() to read-cache.c Alban Gruin
@ 2020-11-16 10:21         ` Alban Gruin
  2020-11-16 10:21         ` [PATCH v5 04/12] merge-index: libify merge_one_path() and merge_all() Alban Gruin
                           ` (9 subsequent siblings)
  12 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2020-11-16 10:21 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Phillip Wood, Alban Gruin

This rewrites `git merge-one-file' from shell to C.  This port is not
completely straightforward: to save precious cycles by avoiding reading
and flushing the index repeatedly, to avoid writing temporary files when
an operation can be performed in-memory, and to allow other functions to
use the rewrite without forking or worrying about the index, the calls
to external processes are replaced by calls to functions in libgit.a:

 - calls to `update-index --add --cacheinfo' are replaced by calls to
   add_to_index_cacheinfo();

 - calls to `update-index --remove' are replaced by calls to
   remove_file_from_index();

 - calls to `checkout-index -u -f' are replaced by calls to
   checkout_entry();

 - calls to `unpack-file' and `merge-file' are replaced by calls to
   read_mmblob() and xdl_merge(), respectively, to merge files
   in-memory;

 - calls to `checkout-index -f --stage=2' are removed, as this is needed
   to have the correct permission bits on the merged file from the
   script, but not in the C version;

 - calls to `update-index' are replaced by calls to add_file_to_index().

The bulk of the rewrite is done in a new file in libgit.a,
merge-strategies.c.  This will enable the resolve and octopus strategies
to directly call it instead of forking.
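
For readers who want the case analysis at a glance, here is a minimal
standalone sketch of how the three blob names select an action.  It
follows the `case' patterns of the removed script rather than quoting
merge-strategies.c; classify_merge(), the enum, and the use of plain
strings (NULL meaning "no blob at that stage") are assumptions made for
illustration, not names from this series.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical labels for the situations the script distinguishes. */
enum merge_case {
	CASE_DELETED,           /* deleted in both, or deleted in one and unchanged in the other */
	CASE_ADDED_BY_US,       /* only our side added the path */
	CASE_ADDED_BY_THEM,     /* only their side added the path */
	CASE_ADDED_IDENTICALLY, /* both sides added the same blob */
	CASE_CONTENT_MERGE,     /* both sides have the path, a file-level merge is needed */
	CASE_UNHANDLED
};

/* Two stages match only if both are present and name the same blob. */
static int same(const char *a, const char *b)
{
	return a && b && !strcmp(a, b);
}

/* NULL means "no blob at that stage"; other values are hex object names. */
enum merge_case classify_merge(const char *orig, const char *ours,
			       const char *theirs)
{
	if (orig && !ours && !theirs)
		return CASE_DELETED;
	if (orig && ours && !theirs)
		return same(orig, ours) ? CASE_DELETED : CASE_UNHANDLED;
	if (orig && !ours && theirs)
		return same(orig, theirs) ? CASE_DELETED : CASE_UNHANDLED;
	if (!orig && ours && !theirs)
		return CASE_ADDED_BY_US;
	if (!orig && !ours && theirs)
		return CASE_ADDED_BY_THEM;
	if (!orig && same(ours, theirs))
		return CASE_ADDED_IDENTICALLY;
	if (ours && theirs)
		return CASE_CONTENT_MERGE;
	return CASE_UNHANDLED;
}
```

Mode checks (permission conflicts, symbolic links, submodules) are left
out of the sketch on purpose; they are applied after the case has been
chosen.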

This also fixes a bug present in the original script: when a file
exists in the branch to merge but not in our branch, the script only
checked whether a _regular_ file existed at that path, whereas the
rewritten version checks whether a file of any kind (i.e. a directory,
...) exists.  This fixes test t6035.14, where the branch to merge had a
new file, `a/b', but our branch had a directory there; the merge should
have failed because a directory exists, but it did not because there was
no regular file called `a/b'.  This test is now marked as successful.
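
The difference between the two checks can be illustrated with plain
POSIX calls.  This is only a sketch: regular_file_exists() and
anything_exists() are hypothetical helpers written for this note, the
first approximating the script's `test -f' and the second approximating
an lstat()-based existence check such as the one the C version relies
on.

```c
#define _POSIX_C_SOURCE 200809L /* for lstat() under strict -std= modes */
#include <sys/stat.h>

/* Roughly `test -f path`: true only for a regular file (symlinks followed). */
int regular_file_exists(const char *path)
{
	struct stat st;

	return !stat(path, &st) && S_ISREG(st.st_mode);
}

/* True for anything present at `path`: file, directory, symlink, ... */
int anything_exists(const char *path)
{
	struct stat st;

	return !lstat(path, &st);
}
```

With a directory at `a/b', the first helper reports nothing in the way
while the second reports an obstacle, which is the behaviour the test
expects.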

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 Makefile                        |   3 +-
 builtin.h                       |   1 +
 builtin/merge-one-file.c        |  94 +++++++++++++++++
 git-merge-one-file.sh           | 167 ------------------------------
 git.c                           |   1 +
 merge-strategies.c              | 176 ++++++++++++++++++++++++++++++++
 merge-strategies.h              |  12 +++
 t/t6415-merge-dir-to-symlink.sh |   2 +-
 8 files changed, 287 insertions(+), 169 deletions(-)
 create mode 100644 builtin/merge-one-file.c
 delete mode 100755 git-merge-one-file.sh
 create mode 100644 merge-strategies.c
 create mode 100644 merge-strategies.h

diff --git a/Makefile b/Makefile
index de53954590..6dfdb33cb2 100644
--- a/Makefile
+++ b/Makefile
@@ -601,7 +601,6 @@ SCRIPT_SH += git-bisect.sh
 SCRIPT_SH += git-difftool--helper.sh
 SCRIPT_SH += git-filter-branch.sh
 SCRIPT_SH += git-merge-octopus.sh
-SCRIPT_SH += git-merge-one-file.sh
 SCRIPT_SH += git-merge-resolve.sh
 SCRIPT_SH += git-mergetool.sh
 SCRIPT_SH += git-quiltimport.sh
@@ -909,6 +908,7 @@ LIB_OBJS += match-trees.o
 LIB_OBJS += mem-pool.o
 LIB_OBJS += merge-blobs.o
 LIB_OBJS += merge-recursive.o
+LIB_OBJS += merge-strategies.o
 LIB_OBJS += merge.o
 LIB_OBJS += mergesort.o
 LIB_OBJS += midx.o
@@ -1094,6 +1094,7 @@ BUILTIN_OBJS += builtin/mailsplit.o
 BUILTIN_OBJS += builtin/merge-base.o
 BUILTIN_OBJS += builtin/merge-file.o
 BUILTIN_OBJS += builtin/merge-index.o
+BUILTIN_OBJS += builtin/merge-one-file.o
 BUILTIN_OBJS += builtin/merge-ours.o
 BUILTIN_OBJS += builtin/merge-recursive.o
 BUILTIN_OBJS += builtin/merge-tree.o
diff --git a/builtin.h b/builtin.h
index 53fb290963..4d2cd78856 100644
--- a/builtin.h
+++ b/builtin.h
@@ -178,6 +178,7 @@ int cmd_merge_base(int argc, const char **argv, const char *prefix);
 int cmd_merge_index(int argc, const char **argv, const char *prefix);
 int cmd_merge_ours(int argc, const char **argv, const char *prefix);
 int cmd_merge_file(int argc, const char **argv, const char *prefix);
+int cmd_merge_one_file(int argc, const char **argv, const char *prefix);
 int cmd_merge_recursive(int argc, const char **argv, const char *prefix);
 int cmd_merge_tree(int argc, const char **argv, const char *prefix);
 int cmd_mktag(int argc, const char **argv, const char *prefix);
diff --git a/builtin/merge-one-file.c b/builtin/merge-one-file.c
new file mode 100644
index 0000000000..9c21778e1d
--- /dev/null
+++ b/builtin/merge-one-file.c
@@ -0,0 +1,94 @@
+/*
+ * Builtin "git merge-one-file"
+ *
+ * Copyright (c) 2020 Alban Gruin
+ *
+ * Based on git-merge-one-file.sh, written by Linus Torvalds.
+ *
+ * This is the git per-file merge utility, called with
+ *
+ *   argv[1] - original file object name (or empty)
+ *   argv[2] - file in branch1 object name (or empty)
+ *   argv[3] - file in branch2 object name (or empty)
+ *   argv[4] - pathname in repository
+ *   argv[5] - original file mode (or empty)
+ *   argv[6] - file in branch1 mode (or empty)
+ *   argv[7] - file in branch2 mode (or empty)
+ *
+ * Handle some trivial cases. The _really_ trivial cases have been
+ * handled already by git read-tree, but that one doesn't do any merges
+ * that might change the tree layout.
+ */
+
+#define USE_THE_INDEX_COMPATIBILITY_MACROS
+#include "cache.h"
+#include "builtin.h"
+#include "lockfile.h"
+#include "merge-strategies.h"
+
+static const char builtin_merge_one_file_usage[] =
+	"git merge-one-file <orig blob> <our blob> <their blob> <path> "
+	"<orig mode> <our mode> <their mode>\n\n"
+	"Blob ids and modes should be empty for missing files.";
+
+static int read_mode(const char *name, const char *arg, unsigned int *mode)
+{
+	char *last;
+	int ret = 0;
+
+	*mode = strtol(arg, &last, 8);
+
+	if (*last)
+		ret = error(_("invalid '%s' mode: expected nothing, got '%c'"), name, *last);
+	else if (!(S_ISREG(*mode) || S_ISDIR(*mode) || S_ISLNK(*mode)))
+		ret = error(_("invalid '%s' mode: %o"), name, *mode);
+
+	return ret;
+}
+
+int cmd_merge_one_file(int argc, const char **argv, const char *prefix)
+{
+	struct object_id orig_blob, our_blob, their_blob,
+		*p_orig_blob = NULL, *p_our_blob = NULL, *p_their_blob = NULL;
+	unsigned int orig_mode = 0, our_mode = 0, their_mode = 0, ret = 0;
+	struct lock_file lock = LOCK_INIT;
+
+	if (argc != 8)
+		usage(builtin_merge_one_file_usage);
+
+	if (read_cache() < 0)
+		die("invalid index");
+
+	hold_locked_index(&lock, LOCK_DIE_ON_ERROR);
+
+	if (!get_oid_hex(argv[1], &orig_blob)) {
+		p_orig_blob = &orig_blob;
+		ret = read_mode("orig", argv[5], &orig_mode);
+	} else if (!*argv[1] && *argv[5])
+		ret = error(_("no 'orig' object id given, but a mode was still given."));
+
+	if (!get_oid_hex(argv[2], &our_blob)) {
+		p_our_blob = &our_blob;
+		ret = read_mode("our", argv[6], &our_mode);
+	} else if (!*argv[2] && *argv[6])
+		ret = error(_("no 'our' object id given, but a mode was still given."));
+
+	if (!get_oid_hex(argv[3], &their_blob)) {
+		p_their_blob = &their_blob;
+		ret = read_mode("their", argv[7], &their_mode);
+	} else if (!*argv[3] && *argv[7])
+		ret = error(_("no 'their' object id given, but a mode was still given."));
+
+	if (ret)
+		return ret;
+
+	ret = merge_three_way(the_repository, p_orig_blob, p_our_blob, p_their_blob,
+			      argv[4], orig_mode, our_mode, their_mode);
+
+	if (ret) {
+		rollback_lock_file(&lock);
+		return !!ret;
+	}
+
+	return write_locked_index(&the_index, &lock, COMMIT_LOCK);
+}
diff --git a/git-merge-one-file.sh b/git-merge-one-file.sh
deleted file mode 100755
index f6d9852d2f..0000000000
--- a/git-merge-one-file.sh
+++ /dev/null
@@ -1,167 +0,0 @@
-#!/bin/sh
-#
-# Copyright (c) Linus Torvalds, 2005
-#
-# This is the git per-file merge script, called with
-#
-#   $1 - original file SHA1 (or empty)
-#   $2 - file in branch1 SHA1 (or empty)
-#   $3 - file in branch2 SHA1 (or empty)
-#   $4 - pathname in repository
-#   $5 - original file mode (or empty)
-#   $6 - file in branch1 mode (or empty)
-#   $7 - file in branch2 mode (or empty)
-#
-# Handle some trivial cases.. The _really_ trivial cases have
-# been handled already by git read-tree, but that one doesn't
-# do any merges that might change the tree layout.
-
-USAGE='<orig blob> <our blob> <their blob> <path>'
-USAGE="$USAGE <orig mode> <our mode> <their mode>"
-LONG_USAGE="usage: git merge-one-file $USAGE
-
-Blob ids and modes should be empty for missing files."
-
-SUBDIRECTORY_OK=Yes
-. git-sh-setup
-cd_to_toplevel
-require_work_tree
-
-if test $# != 7
-then
-	echo "$LONG_USAGE"
-	exit 1
-fi
-
-case "${1:-.}${2:-.}${3:-.}" in
-#
-# Deleted in both or deleted in one and unchanged in the other
-#
-"$1.." | "$1.$1" | "$1$1.")
-	if { test -z "$6" && test "$5" != "$7"; } ||
-	   { test -z "$7" && test "$5" != "$6"; }
-	then
-		echo "ERROR: File $4 deleted on one branch but had its" >&2
-		echo "ERROR: permissions changed on the other." >&2
-		exit 1
-	fi
-
-	if test -n "$2"
-	then
-		echo "Removing $4"
-	else
-		# read-tree checked that index matches HEAD already,
-		# so we know we do not have this path tracked.
-		# there may be an unrelated working tree file here,
-		# which we should just leave unmolested.  Make sure
-		# we do not have it in the index, though.
-		exec git update-index --remove -- "$4"
-	fi
-	if test -f "$4"
-	then
-		rm -f -- "$4" &&
-		rmdir -p "$(expr "z$4" : 'z\(.*\)/')" 2>/dev/null || :
-	fi &&
-		exec git update-index --remove -- "$4"
-	;;
-
-#
-# Added in one.
-#
-".$2.")
-	# the other side did not add and we added so there is nothing
-	# to be done, except making the path merged.
-	exec git update-index --add --cacheinfo "$6" "$2" "$4"
-	;;
-"..$3")
-	echo "Adding $4"
-	if test -f "$4"
-	then
-		echo "ERROR: untracked $4 is overwritten by the merge." >&2
-		exit 1
-	fi
-	git update-index --add --cacheinfo "$7" "$3" "$4" &&
-		exec git checkout-index -u -f -- "$4"
-	;;
-
-#
-# Added in both, identically (check for same permissions).
-#
-".$3$2")
-	if test "$6" != "$7"
-	then
-		echo "ERROR: File $4 added identically in both branches," >&2
-		echo "ERROR: but permissions conflict $6->$7." >&2
-		exit 1
-	fi
-	echo "Adding $4"
-	git update-index --add --cacheinfo "$6" "$2" "$4" &&
-		exec git checkout-index -u -f -- "$4"
-	;;
-
-#
-# Modified in both, but differently.
-#
-"$1$2$3" | ".$2$3")
-
-	case ",$6,$7," in
-	*,120000,*)
-		echo "ERROR: $4: Not merging symbolic link changes." >&2
-		exit 1
-		;;
-	*,160000,*)
-		echo "ERROR: $4: Not merging conflicting submodule changes." >&2
-		exit 1
-		;;
-	esac
-
-	src1=$(git unpack-file $2)
-	src2=$(git unpack-file $3)
-	case "$1" in
-	'')
-		echo "Added $4 in both, but differently."
-		orig=$(git unpack-file $(git hash-object /dev/null))
-		;;
-	*)
-		echo "Auto-merging $4"
-		orig=$(git unpack-file $1)
-		;;
-	esac
-
-	git merge-file "$src1" "$orig" "$src2"
-	ret=$?
-	msg=
-	if test $ret != 0 || test -z "$1"
-	then
-		msg='content conflict'
-		ret=1
-	fi
-
-	# Create the working tree file, using "our tree" version from the
-	# index, and then store the result of the merge.
-	git checkout-index -f --stage=2 -- "$4" && cat "$src1" >"$4" || exit 1
-	rm -f -- "$orig" "$src1" "$src2"
-
-	if test "$6" != "$7"
-	then
-		if test -n "$msg"
-		then
-			msg="$msg, "
-		fi
-		msg="${msg}permissions conflict: $5->$6,$7"
-		ret=1
-	fi
-
-	if test $ret != 0
-	then
-		echo "ERROR: $msg in $4" >&2
-		exit 1
-	fi
-	exec git update-index -- "$4"
-	;;
-
-*)
-	echo "ERROR: $4: Not handling case $1 -> $2 -> $3" >&2
-	;;
-esac
-exit 1
diff --git a/git.c b/git.c
index f1e8b56d99..a4d3f98094 100644
--- a/git.c
+++ b/git.c
@@ -540,6 +540,7 @@ static struct cmd_struct commands[] = {
 	{ "merge-file", cmd_merge_file, RUN_SETUP_GENTLY },
 	{ "merge-index", cmd_merge_index, RUN_SETUP | NO_PARSEOPT },
 	{ "merge-ours", cmd_merge_ours, RUN_SETUP | NO_PARSEOPT },
+	{ "merge-one-file", cmd_merge_one_file, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
 	{ "merge-recursive", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
 	{ "merge-recursive-ours", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
 	{ "merge-recursive-theirs", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
diff --git a/merge-strategies.c b/merge-strategies.c
new file mode 100644
index 0000000000..c5576dc891
--- /dev/null
+++ b/merge-strategies.c
@@ -0,0 +1,176 @@
+#include "cache.h"
+#include "dir.h"
+#include "merge-strategies.h"
+#include "xdiff-interface.h"
+
+static int checkout_from_index(struct index_state *istate, const char *path,
+			       struct cache_entry *ce)
+{
+	struct checkout state = CHECKOUT_INIT;
+
+	state.istate = istate;
+	state.force = 1;
+	state.base_dir = "";
+	state.base_dir_len = 0;
+
+	if (checkout_entry(ce, &state, NULL, NULL) < 0)
+		return error(_("%s: cannot checkout file"), path);
+	return 0;
+}
+
+static int merge_one_file_deleted(struct index_state *istate,
+				  const struct object_id *our_blob,
+				  const struct object_id *their_blob, const char *path,
+				  unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode)
+{
+	if ((our_blob && orig_mode != our_mode) ||
+	    (their_blob && orig_mode != their_mode))
+		return error(_("File %s deleted on one branch but had its "
+			       "permissions changed on the other."), path);
+
+	if (our_blob) {
+		printf(_("Removing %s\n"), path);
+
+		if (file_exists(path))
+			remove_path(path);
+	}
+
+	if (remove_file_from_index(istate, path))
+		return error("%s: cannot remove from the index", path);
+	return 0;
+}
+
+static int do_merge_one_file(struct index_state *istate,
+			     const struct object_id *orig_blob,
+			     const struct object_id *our_blob,
+			     const struct object_id *their_blob, const char *path,
+			     unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode)
+{
+	int ret, i, dest;
+	ssize_t written;
+	mmbuffer_t result = {NULL, 0};
+	mmfile_t mmfs[3];
+	xmparam_t xmp = {{0}};
+
+	if (our_mode == S_IFLNK || their_mode == S_IFLNK)
+		return error(_("%s: Not merging symbolic link changes."), path);
+	else if (our_mode == S_IFGITLINK || their_mode == S_IFGITLINK)
+		return error(_("%s: Not merging conflicting submodule changes."), path);
+	else if (our_mode != their_mode)
+		return error(_("permission conflict: %o->%o,%o in %s"),
+			     orig_mode, our_mode, their_mode, path);
+
+	if (orig_blob) {
+		printf(_("Auto-merging %s\n"), path);
+		read_mmblob(mmfs + 0, orig_blob);
+	} else {
+		printf(_("Added %s in both, but differently.\n"), path);
+		read_mmblob(mmfs + 0, &null_oid);
+	}
+
+	read_mmblob(mmfs + 1, our_blob);
+	read_mmblob(mmfs + 2, their_blob);
+
+	xmp.level = XDL_MERGE_ZEALOUS_ALNUM;
+	xmp.style = 0;
+	xmp.favor = 0;
+
+	ret = xdl_merge(mmfs + 0, mmfs + 1, mmfs + 2, &xmp, &result);
+
+	for (i = 0; i < 3; i++)
+		free(mmfs[i].ptr);
+
+	if (ret < 0) {
+		free(result.ptr);
+		return error(_("Failed to execute internal merge"));
+	} else if (ret > 0 || !orig_blob) {
+		free(result.ptr);
+		return error(_("content conflict in %s"), path);
+	}
+
+	unlink(path);
+	if ((dest = open(path, O_WRONLY | O_CREAT, our_mode)) < 0) {
+		free(result.ptr);
+		return error_errno(_("failed to open file '%s'"), path);
+	}
+
+	written = write_in_full(dest, result.ptr, result.size);
+	close(dest);
+
+	free(result.ptr);
+
+	if (written < 0)
+		return error_errno(_("failed to write to '%s'"), path);
+
+	return add_file_to_index(istate, path, 0);
+}
+
+int merge_three_way(struct repository *r,
+		    const struct object_id *orig_blob,
+		    const struct object_id *our_blob,
+		    const struct object_id *their_blob, const char *path,
+		    unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode)
+{
+	if (orig_blob &&
+	    ((!their_blob && our_blob && oideq(orig_blob, our_blob)) ||
+	     (!our_blob && their_blob && oideq(orig_blob, their_blob)))) {
+		/* Deleted in both or deleted in one and unchanged in the other. */
+		return merge_one_file_deleted(r->index, our_blob, their_blob, path,
+					      orig_mode, our_mode, their_mode);
+	} else if (!orig_blob && our_blob && !their_blob) {
+		/*
+		 * Added in one.  The other side did not add and we
+		 * added so there is nothing to be done, except making
+		 * the path merged.
+		 */
+		return add_to_index_cacheinfo(r->index, our_mode, our_blob,
+					      path, 0, 1, 1, NULL);
+	} else if (!orig_blob && !our_blob && their_blob) {
+		struct cache_entry *ce;
+		printf(_("Adding %s\n"), path);
+
+		if (file_exists(path))
+			return error(_("untracked %s is overwritten by the merge."), path);
+
+		if (add_to_index_cacheinfo(r->index, their_mode, their_blob,
+					   path, 0, 1, 1, &ce))
+			return -1;
+		return checkout_from_index(r->index, path, ce);
+	} else if (!orig_blob && our_blob && their_blob &&
+		   oideq(our_blob, their_blob)) {
+		struct cache_entry *ce;
+
+		/* Added in both, identically (check for same permissions). */
+		if (our_mode != their_mode)
+			return error(_("File %s added identically in both branches, "
+				       "but permissions conflict %o->%o."),
+				     path, our_mode, their_mode);
+
+		printf(_("Adding %s\n"), path);
+
+		if (add_to_index_cacheinfo(r->index, our_mode, our_blob,
+					   path, 0, 1, 1, &ce))
+			return -1;
+		return checkout_from_index(r->index, path, ce);
+	} else if (our_blob && their_blob) {
+		/* Modified in both, but differently. */
+		return do_merge_one_file(r->index,
+					 orig_blob, our_blob, their_blob, path,
+					 orig_mode, our_mode, their_mode);
+	} else {
+		char orig_hex[GIT_MAX_HEXSZ + 1] = {0}, our_hex[GIT_MAX_HEXSZ + 1] = {0},
+			their_hex[GIT_MAX_HEXSZ + 1] = {0};
+
+		if (orig_blob)
+			oid_to_hex_r(orig_hex, orig_blob);
+		if (our_blob)
+			oid_to_hex_r(our_hex, our_blob);
+		if (their_blob)
+			oid_to_hex_r(their_hex, their_blob);
+
+		return error(_("%s: Not handling case %s -> %s -> %s"),
+			     path, orig_hex, our_hex, their_hex);
+	}
+
+	return 0;
+}
diff --git a/merge-strategies.h b/merge-strategies.h
new file mode 100644
index 0000000000..e624c4f27c
--- /dev/null
+++ b/merge-strategies.h
@@ -0,0 +1,12 @@
+#ifndef MERGE_STRATEGIES_H
+#define MERGE_STRATEGIES_H
+
+#include "object.h"
+
+int merge_three_way(struct repository *r,
+		    const struct object_id *orig_blob,
+		    const struct object_id *our_blob,
+		    const struct object_id *their_blob, const char *path,
+		    unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode);
+
+#endif /* MERGE_STRATEGIES_H */
diff --git a/t/t6415-merge-dir-to-symlink.sh b/t/t6415-merge-dir-to-symlink.sh
index 2eddcc7664..5fb74e39a0 100755
--- a/t/t6415-merge-dir-to-symlink.sh
+++ b/t/t6415-merge-dir-to-symlink.sh
@@ -94,7 +94,7 @@ test_expect_success SYMLINKS 'a/b was resolved as symlink' '
 	test -h a/b
 '
 
-test_expect_failure 'do not lose untracked in merge (resolve)' '
+test_expect_success 'do not lose untracked in merge (resolve)' '
 	git reset --hard &&
 	git checkout baseline^0 &&
 	>a/b/c/e &&
-- 
2.20.1



* [PATCH v5 04/12] merge-index: libify merge_one_path() and merge_all()
  2020-11-16 10:21       ` [PATCH v5 00/12] Rewrite the remaining merge strategies from shell to C Alban Gruin
                           ` (2 preceding siblings ...)
  2020-11-16 10:21         ` [PATCH v5 03/12] merge-one-file: rewrite in C Alban Gruin
@ 2020-11-16 10:21         ` Alban Gruin
  2020-11-16 10:21         ` [PATCH v5 05/12] merge-index: don't fork if the requested program is `git-merge-one-file' Alban Gruin
                           ` (8 subsequent siblings)
  12 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2020-11-16 10:21 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Phillip Wood, Alban Gruin

The "resolve" and "octopus" merge strategies do not call `git
merge-one-file' directly; they delegate the work to another git command,
`git merge-index', that loops over files in the index and calls the
specified command.  Unfortunately, the functions implementing this loop
are not part of libgit.a, which means that once rewritten, the
strategies would still have to invoke `merge-one-file' by spawning a new
process first.

To avoid this, this moves and renames merge_one_path(), merge_all(), and
their helpers to merge-strategies.c.  They also take a callback to
dictate what they should do for each file.  For now, to preserve the
behaviour of `merge-index', only one callback, launching a new process,
is defined.
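
The callback-driven loop can be pictured with the following simplified
sketch.  `struct entry', `merge_cb', merge_all_entries() and
count_and_check() are hypothetical stand-ins for the index and for the
functions introduced here, and the list is assumed to be sorted by path
with a valid stage (1 to 3) on every unmerged entry.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for an index entry. */
struct entry {
	const char *path;
	int stage;        /* 0 = merged, 1 = base, 2 = ours, 3 = theirs */
	const char *blob; /* stand-in for an object id */
};

typedef int (*merge_cb)(const char *orig, const char *ours,
			const char *theirs, const char *path, void *data);

/*
 * Walk a path-sorted list, gather stages 1..3 of each unmerged path and
 * hand them to the callback.  With `oneshot` set, keep going and return
 * the number of failed paths; otherwise return 1 at the first failure.
 */
int merge_all_entries(const struct entry *entries, size_t nr, int oneshot,
		      merge_cb fn, void *data)
{
	size_t i = 0;
	int err = 0;

	while (i < nr) {
		const char *path = entries[i].path;
		const char *blobs[3] = { NULL, NULL, NULL };

		if (!entries[i].stage) { /* already merged, nothing to do */
			i++;
			continue;
		}

		/* Consume every stage recorded for this path. */
		for (; i < nr && !strcmp(entries[i].path, path); i++)
			blobs[entries[i].stage - 1] = entries[i].blob;

		if (fn(blobs[0], blobs[1], blobs[2], path, data)) {
			if (!oneshot)
				return 1;
			err++;
		}
	}
	return err;
}

/* Example callback: counts calls and fails when "theirs" is missing. */
int count_and_check(const char *orig, const char *ours, const char *theirs,
		    const char *path, void *data)
{
	(void)orig;
	(void)ours;
	(void)path;
	++*(int *)data;
	return theirs == NULL;
}
```

Passing the per-file action as a function pointer plus an opaque `data'
argument is what lets the same loop serve both a process-spawning
callback and, later, an in-process one.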

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 builtin/merge-index.c |  77 +++----------------------------
 merge-strategies.c    | 103 ++++++++++++++++++++++++++++++++++++++++++
 merge-strategies.h    |  17 +++++++
 3 files changed, 127 insertions(+), 70 deletions(-)

diff --git a/builtin/merge-index.c b/builtin/merge-index.c
index 38ea6ad6ca..49e3382fb9 100644
--- a/builtin/merge-index.c
+++ b/builtin/merge-index.c
@@ -1,74 +1,11 @@
 #define USE_THE_INDEX_COMPATIBILITY_MACROS
 #include "builtin.h"
-#include "run-command.h"
-
-static const char *pgm;
-static int one_shot, quiet;
-static int err;
-
-static int merge_entry(int pos, const char *path)
-{
-	int found;
-	const char *arguments[] = { pgm, "", "", "", path, "", "", "", NULL };
-	char hexbuf[4][GIT_MAX_HEXSZ + 1];
-	char ownbuf[4][60];
-
-	if (pos >= active_nr)
-		die("git merge-index: %s not in the cache", path);
-	found = 0;
-	do {
-		const struct cache_entry *ce = active_cache[pos];
-		int stage = ce_stage(ce);
-
-		if (strcmp(ce->name, path))
-			break;
-		found++;
-		oid_to_hex_r(hexbuf[stage], &ce->oid);
-		xsnprintf(ownbuf[stage], sizeof(ownbuf[stage]), "%o", ce->ce_mode);
-		arguments[stage] = hexbuf[stage];
-		arguments[stage + 4] = ownbuf[stage];
-	} while (++pos < active_nr);
-	if (!found)
-		die("git merge-index: %s not in the cache", path);
-
-	if (run_command_v_opt(arguments, 0)) {
-		if (one_shot)
-			err++;
-		else {
-			if (!quiet)
-				die("merge program failed");
-			exit(1);
-		}
-	}
-	return found;
-}
-
-static void merge_one_path(const char *path)
-{
-	int pos = cache_name_pos(path, strlen(path));
-
-	/*
-	 * If it already exists in the cache as stage0, it's
-	 * already merged and there is nothing to do.
-	 */
-	if (pos < 0)
-		merge_entry(-pos-1, path);
-}
-
-static void merge_all(void)
-{
-	int i;
-	for (i = 0; i < active_nr; i++) {
-		const struct cache_entry *ce = active_cache[i];
-		if (!ce_stage(ce))
-			continue;
-		i += merge_entry(i, ce->name)-1;
-	}
-}
+#include "merge-strategies.h"
 
 int cmd_merge_index(int argc, const char **argv, const char *prefix)
 {
-	int i, force_file = 0;
+	int i, force_file = 0, err = 0, one_shot = 0, quiet = 0;
+	const char *pgm;
 
 	/* Without this we cannot rely on waitpid() to tell
 	 * what happened to our children.
@@ -98,14 +35,14 @@ int cmd_merge_index(int argc, const char **argv, const char *prefix)
 				continue;
 			}
 			if (!strcmp(arg, "-a")) {
-				merge_all();
+				err |= merge_all_index(&the_index, one_shot, quiet,
+						       merge_one_file_spawn, (void *)pgm);
 				continue;
 			}
 			die("git merge-index: unknown option %s", arg);
 		}
-		merge_one_path(arg);
+		err |= merge_index_path(&the_index, one_shot, quiet, arg,
+					merge_one_file_spawn, (void *)pgm);
 	}
-	if (err && !quiet)
-		die("merge program failed");
 	return err;
 }
diff --git a/merge-strategies.c b/merge-strategies.c
index c5576dc891..4eb96129f1 100644
--- a/merge-strategies.c
+++ b/merge-strategies.c
@@ -1,6 +1,7 @@
 #include "cache.h"
 #include "dir.h"
 #include "merge-strategies.h"
+#include "run-command.h"
 #include "xdiff-interface.h"
 
 static int checkout_from_index(struct index_state *istate, const char *path,
@@ -174,3 +175,105 @@ int merge_three_way(struct repository *r,
 
 	return 0;
 }
+
+int merge_one_file_spawn(const struct object_id *orig_blob,
+			 const struct object_id *our_blob,
+			 const struct object_id *their_blob, const char *path,
+			 unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
+			 void *data)
+{
+	char oids[3][GIT_MAX_HEXSZ + 1] = {{0}};
+	char modes[3][10] = {{0}};
+	const char *arguments[] = { (char *)data, oids[0], oids[1], oids[2],
+				    path, modes[0], modes[1], modes[2], NULL };
+
+	if (orig_blob) {
+		oid_to_hex_r(oids[0], orig_blob);
+		xsnprintf(modes[0], sizeof(modes[0]), "%06o", orig_mode);
+	}
+
+	if (our_blob) {
+		oid_to_hex_r(oids[1], our_blob);
+		xsnprintf(modes[1], sizeof(modes[1]), "%06o", our_mode);
+	}
+
+	if (their_blob) {
+		oid_to_hex_r(oids[2], their_blob);
+		xsnprintf(modes[2], sizeof(modes[2]), "%06o", their_mode);
+	}
+
+	return run_command_v_opt(arguments, 0);
+}
+
+static int merge_entry(struct index_state *istate, int quiet, int pos,
+		       const char *path, merge_fn fn, void *data)
+{
+	int found = 0;
+	const struct object_id *oids[3] = {NULL};
+	unsigned int modes[3] = {0};
+
+	do {
+		const struct cache_entry *ce = istate->cache[pos];
+		int stage = ce_stage(ce);
+
+		if (strcmp(ce->name, path))
+			break;
+		found++;
+		oids[stage - 1] = &ce->oid;
+		modes[stage - 1] = ce->ce_mode;
+	} while (++pos < istate->cache_nr);
+	if (!found)
+		return error(_("%s is not in the cache"), path);
+
+	if (fn(oids[0], oids[1], oids[2], path, modes[0], modes[1], modes[2], data)) {
+		if (!quiet)
+			error(_("Merge program failed"));
+		return -2;
+	}
+
+	return found;
+}
+
+int merge_index_path(struct index_state *istate, int oneshot, int quiet,
+		     const char *path, merge_fn fn, void *data)
+{
+	int pos = index_name_pos(istate, path, strlen(path)), ret;
+
+	/*
+	 * If it already exists in the cache as stage0, it's
+	 * already merged and there is nothing to do.
+	 */
+	if (pos < 0) {
+		ret = merge_entry(istate, quiet, -pos - 1, path, fn, data);
+		if (ret == -1)
+			return -1;
+		else if (ret == -2)
+			return 1;
+	}
+	return 0;
+}
+
+int merge_all_index(struct index_state *istate, int oneshot, int quiet,
+		    merge_fn fn, void *data)
+{
+	int err = 0, i, ret;
+	for (i = 0; i < istate->cache_nr; i++) {
+		const struct cache_entry *ce = istate->cache[i];
+		if (!ce_stage(ce))
+			continue;
+
+		ret = merge_entry(istate, quiet, i, ce->name, fn, data);
+		if (ret > 0)
+			i += ret - 1;
+		else if (ret == -1)
+			return -1;
+		else if (ret == -2) {
+			if (oneshot)
+				err++;
+			else
+				return 1;
+		}
+	}
+
+	return err;
+}
diff --git a/merge-strategies.h b/merge-strategies.h
index e624c4f27c..d2f52d6792 100644
--- a/merge-strategies.h
+++ b/merge-strategies.h
@@ -9,4 +9,21 @@ int merge_three_way(struct repository *r,
 		    const struct object_id *their_blob, const char *path,
 		    unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode);
 
+typedef int (*merge_fn)(const struct object_id *orig_blob,
+			const struct object_id *our_blob,
+			const struct object_id *their_blob, const char *path,
+			unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
+			void *data);
+
+int merge_one_file_spawn(const struct object_id *orig_blob,
+			 const struct object_id *our_blob,
+			 const struct object_id *their_blob, const char *path,
+			 unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
+			 void *data);
+
+int merge_index_path(struct index_state *istate, int oneshot, int quiet,
+		     const char *path, merge_fn fn, void *data);
+int merge_all_index(struct index_state *istate, int oneshot, int quiet,
+		    merge_fn fn, void *data);
+
 #endif /* MERGE_STRATEGIES_H */
-- 
2.20.1



* [PATCH v5 05/12] merge-index: don't fork if the requested program is `git-merge-one-file'
  2020-11-16 10:21       ` [PATCH v5 00/12] Rewrite the remaining merge strategies from shell to C Alban Gruin
                           ` (3 preceding siblings ...)
  2020-11-16 10:21         ` [PATCH v5 04/12] merge-index: libify merge_one_path() and merge_all() Alban Gruin
@ 2020-11-16 10:21         ` Alban Gruin
  2020-11-16 10:21         ` [PATCH v5 06/12] merge-resolve: rewrite in C Alban Gruin
                           ` (7 subsequent siblings)
  12 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2020-11-16 10:21 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Phillip Wood, Alban Gruin

Since `git-merge-one-file' has been rewritten and libified, this teaches
`merge-index' to call merge_three_way() without forking using a new
callback, merge_one_file_func().
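
As a rough illustration of the dispatch, the sketch below picks a
callback and its opaque argument from the program name.  All names in
it (action_fn, action_builtin(), action_spawn(), select_action()) are
hypothetical and only mirror the shape of the change; the real callbacks
take the blob names and modes as well.

```c
#include <string.h>

typedef int (*action_fn)(const char *path, void *data);

/* Hypothetical in-process action: `data` would carry the repository. */
int action_builtin(const char *path, void *data)
{
	(void)path;
	(void)data;
	return 0;
}

/* Hypothetical fallback: `data` carries the name of the program to run. */
int action_spawn(const char *path, void *data)
{
	(void)path;
	(void)data;
	return 0;
}

/*
 * Choose the in-process action when the requested program is the one
 * that has been libified; otherwise fall back to spawning it.  The
 * matching opaque argument is stored in `*data`.
 */
action_fn select_action(const char *pgm, void *repo, void **data)
{
	if (!strcmp(pgm, "git-merge-one-file")) {
		*data = repo;
		return action_builtin;
	}
	*data = (void *)pgm;
	return action_spawn;
}
```

Only an exact match on the program name takes the in-process path, so
any other merge program keeps being run as a separate process.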

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 builtin/merge-index.c | 29 +++++++++++++++++++++++++++--
 merge-strategies.c    | 11 +++++++++++
 merge-strategies.h    |  6 ++++++
 3 files changed, 44 insertions(+), 2 deletions(-)

diff --git a/builtin/merge-index.c b/builtin/merge-index.c
index 49e3382fb9..e684811d35 100644
--- a/builtin/merge-index.c
+++ b/builtin/merge-index.c
@@ -1,11 +1,15 @@
 #define USE_THE_INDEX_COMPATIBILITY_MACROS
 #include "builtin.h"
+#include "lockfile.h"
 #include "merge-strategies.h"
 
 int cmd_merge_index(int argc, const char **argv, const char *prefix)
 {
 	int i, force_file = 0, err = 0, one_shot = 0, quiet = 0;
 	const char *pgm;
+	void *data;
+	merge_fn merge_action;
+	struct lock_file lock = LOCK_INIT;
 
 	/* Without this we cannot rely on waitpid() to tell
 	 * what happened to our children.
@@ -26,7 +30,19 @@ int cmd_merge_index(int argc, const char **argv, const char *prefix)
 		quiet = 1;
 		i++;
 	}
+
 	pgm = argv[i++];
+	if (!strcmp(pgm, "git-merge-one-file")) {
+		merge_action = merge_one_file_func;
+		data = (void *)the_repository;
+
+		setup_work_tree();
+		hold_locked_index(&lock, LOCK_DIE_ON_ERROR);
+	} else {
+		merge_action = merge_one_file_spawn;
+		data = (void *)pgm;
+	}
+
 	for (; i < argc; i++) {
 		const char *arg = argv[i];
 		if (!force_file && *arg == '-') {
@@ -36,13 +52,22 @@ int cmd_merge_index(int argc, const char **argv, const char *prefix)
 			}
 			if (!strcmp(arg, "-a")) {
 				err |= merge_all_index(&the_index, one_shot, quiet,
-						       merge_one_file_spawn, (void *)pgm);
+						       merge_action, data);
 				continue;
 			}
 			die("git merge-index: unknown option %s", arg);
 		}
 		err |= merge_index_path(&the_index, one_shot, quiet, arg,
-					merge_one_file_spawn, (void *)pgm);
+					merge_action, data);
+	}
+
+	if (merge_action == merge_one_file_func) {
+		if (err) {
+			rollback_lock_file(&lock);
+			return err;
+		}
+
+		return write_locked_index(&the_index, &lock, COMMIT_LOCK);
 	}
 	return err;
 }
diff --git a/merge-strategies.c b/merge-strategies.c
index 4eb96129f1..2ed3a8dd68 100644
--- a/merge-strategies.c
+++ b/merge-strategies.c
@@ -176,6 +176,17 @@ int merge_three_way(struct repository *r,
 	return 0;
 }
 
+int merge_one_file_func(const struct object_id *orig_blob,
+			const struct object_id *our_blob,
+			const struct object_id *their_blob, const char *path,
+			unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
+			void *data)
+{
+	return merge_three_way((struct repository *)data,
+			       orig_blob, our_blob, their_blob, path,
+			       orig_mode, our_mode, their_mode);
+}
+
 int merge_one_file_spawn(const struct object_id *orig_blob,
 			 const struct object_id *our_blob,
 			 const struct object_id *their_blob, const char *path,
diff --git a/merge-strategies.h b/merge-strategies.h
index d2f52d6792..b69a12b390 100644
--- a/merge-strategies.h
+++ b/merge-strategies.h
@@ -15,6 +15,12 @@ typedef int (*merge_fn)(const struct object_id *orig_blob,
 			unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
 			void *data);
 
+int merge_one_file_func(const struct object_id *orig_blob,
+			const struct object_id *our_blob,
+			const struct object_id *their_blob, const char *path,
+			unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
+			void *data);
+
 int merge_one_file_spawn(const struct object_id *orig_blob,
 			 const struct object_id *our_blob,
 			 const struct object_id *their_blob, const char *path,
-- 
2.20.1



* [PATCH v5 06/12] merge-resolve: rewrite in C
  2020-11-16 10:21       ` [PATCH v5 00/12] Rewrite the remaining merge strategies from shell to C Alban Gruin
                           ` (4 preceding siblings ...)
  2020-11-16 10:21         ` [PATCH v5 05/12] merge-index: don't fork if the requested program is `git-merge-one-file' Alban Gruin
@ 2020-11-16 10:21         ` Alban Gruin
  2020-11-16 10:21         ` [PATCH v5 07/12] merge-recursive: move better_branch_name() to merge.c Alban Gruin
                           ` (6 subsequent siblings)
  12 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2020-11-16 10:21 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Phillip Wood, Alban Gruin

This rewrites `git merge-resolve' from shell to C.  As with `git
merge-one-file', this port is not completely straightforward: it removes
calls to external processes to avoid reading and writing the index over
and over again.

 - The call to `update-index -q --refresh' is replaced by a call to
   refresh_index().

 - The call to `read-tree' is replaced by a call to unpack_trees() (and
   all the setup needed).

 - The call to `write-tree' is replaced by a call to
   write_index_as_tree().

 - The call to `merge-index', needed to invoke `git merge-one-file', is
   replaced by a call to the new merge_all_index() function.

The index is read in cmd_merge_resolve(), and is written back by
merge_strategies_resolve().
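
The control flow that results from the replacements listed above can be
sketched roughly as follows.  This is only an illustration, not the code
from this patch: `struct resolve_steps', resolve_flow(), step_ok() and
step_fail() are hypothetical names, and the callbacks merely stand in
for unpack_trees(), write_index_as_tree() and merge_all_index().  The
exit codes follow the original shell script (2: give up, 1: conflicts
left, 0: clean).

```c
#include <stddef.h>

/*
 * Hypothetical callbacks standing in for the real steps.  Each one
 * returns 0 on success and non-zero on failure.
 */
struct resolve_steps {
	int (*unpack)(void *ctx);     /* like `read-tree -u -m --aggressive' */
	int (*write_tree)(void *ctx); /* like `write-tree' */
	int (*merge_all)(void *ctx);  /* like `merge-index -o git-merge-one-file -a' */
};

/* Trivial stand-ins, only useful for trying the flow out. */
int step_ok(void *ctx) { (void)ctx; return 0; }
int step_fail(void *ctx) { (void)ctx; return -1; }

/*
 * Returns 2 if the trees cannot be unpacked, 0 if the merge is
 * clean, and 1 if the file-level merge leaves conflicts.
 */
int resolve_flow(const struct resolve_steps *s, void *ctx)
{
	if (s->unpack(ctx))
		return 2;

	/* "Simple merge": the index can be written as a tree as-is. */
	if (!s->write_tree(ctx))
		return 0;

	/* "Automatic merge": merge each unmerged path in turn. */
	return s->merge_all(ctx) ? 1 : 0;
}
```

With these stand-ins, a run where the unpack step and the tree write
both succeed never reaches the file-level merge at all.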

The parameters of merge_strategies_resolve() may be surprising at first
glance: why use a commit list for `bases' and `remote', when we could
use an oid array and a pointer to an oid?  Because, in a later commit,
try_merge_strategy() will be able to call merge_strategies_resolve()
directly, and it already uses a commit list for `bases' (`common') and
`remote' (`remoteheads'), and a string for `head_arg'.  To reduce
friction later, merge_strategies_resolve() takes the same types of
parameters.

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 Makefile                |  2 +-
 builtin.h               |  1 +
 builtin/merge-resolve.c | 73 +++++++++++++++++++++++++++++++
 git-merge-resolve.sh    | 54 -----------------------
 git.c                   |  1 +
 merge-strategies.c      | 95 +++++++++++++++++++++++++++++++++++++++++
 merge-strategies.h      |  5 +++
 7 files changed, 176 insertions(+), 55 deletions(-)
 create mode 100644 builtin/merge-resolve.c
 delete mode 100755 git-merge-resolve.sh

diff --git a/Makefile b/Makefile
index 6dfdb33cb2..3cc6b192f1 100644
--- a/Makefile
+++ b/Makefile
@@ -601,7 +601,6 @@ SCRIPT_SH += git-bisect.sh
 SCRIPT_SH += git-difftool--helper.sh
 SCRIPT_SH += git-filter-branch.sh
 SCRIPT_SH += git-merge-octopus.sh
-SCRIPT_SH += git-merge-resolve.sh
 SCRIPT_SH += git-mergetool.sh
 SCRIPT_SH += git-quiltimport.sh
 SCRIPT_SH += git-request-pull.sh
@@ -1097,6 +1096,7 @@ BUILTIN_OBJS += builtin/merge-index.o
 BUILTIN_OBJS += builtin/merge-one-file.o
 BUILTIN_OBJS += builtin/merge-ours.o
 BUILTIN_OBJS += builtin/merge-recursive.o
+BUILTIN_OBJS += builtin/merge-resolve.o
 BUILTIN_OBJS += builtin/merge-tree.o
 BUILTIN_OBJS += builtin/merge.o
 BUILTIN_OBJS += builtin/mktag.o
diff --git a/builtin.h b/builtin.h
index 4d2cd78856..35e91c16d0 100644
--- a/builtin.h
+++ b/builtin.h
@@ -180,6 +180,7 @@ int cmd_merge_ours(int argc, const char **argv, const char *prefix);
 int cmd_merge_file(int argc, const char **argv, const char *prefix);
 int cmd_merge_one_file(int argc, const char **argv, const char *prefix);
 int cmd_merge_recursive(int argc, const char **argv, const char *prefix);
+int cmd_merge_resolve(int argc, const char **argv, const char *prefix);
 int cmd_merge_tree(int argc, const char **argv, const char *prefix);
 int cmd_mktag(int argc, const char **argv, const char *prefix);
 int cmd_mktree(int argc, const char **argv, const char *prefix);
diff --git a/builtin/merge-resolve.c b/builtin/merge-resolve.c
new file mode 100644
index 0000000000..dca31676b8
--- /dev/null
+++ b/builtin/merge-resolve.c
@@ -0,0 +1,73 @@
+/*
+ * Builtin "git merge-resolve"
+ *
+ * Copyright (c) 2020 Alban Gruin
+ *
+ * Based on git-merge-resolve.sh, written by Linus Torvalds and Junio C
+ * Hamano.
+ *
+ * Resolve two trees, using enhanced multi-base read-tree.
+ */
+
+#define USE_THE_INDEX_COMPATIBILITY_MACROS
+#include "cache.h"
+#include "builtin.h"
+#include "merge-strategies.h"
+
+static const char builtin_merge_resolve_usage[] =
+	"git merge-resolve <bases>... -- <head> <remote>";
+
+int cmd_merge_resolve(int argc, const char **argv, const char *prefix)
+{
+	int i, sep_seen = 0;
+	const char *head = NULL;
+	struct commit_list *bases = NULL, *remote = NULL;
+	struct commit_list **next_base = &bases;
+
+	if (argc < 5)
+		usage(builtin_merge_resolve_usage);
+
+	setup_work_tree();
+	if (read_cache() < 0)
+		die("invalid index");
+
+	/*
+	 * The first parameters up to -- are merge bases; the rest are
+	 * heads.
+	 */
+	for (i = 1; i < argc; i++) {
+		if (!strcmp(argv[i], "--"))
+			sep_seen = 1;
+		else if (!strcmp(argv[i], "-h"))
+			usage(builtin_merge_resolve_usage);
+		else if (sep_seen && !head)
+			head = argv[i];
+		else {
+			struct object_id oid;
+			struct commit *commit;
+
+			if (get_oid(argv[i], &oid))
+				die("object %s not found.", argv[i]);
+
+			commit = lookup_commit_or_die(&oid, argv[i]);
+
+			if (sep_seen)
+				commit_list_insert(commit, &remote);
+			else
+				next_base = commit_list_append(commit, next_base);
+		}
+	}
+
+	/*
+	 * Give up if we are given two or more remotes.  Not handling
+	 * octopus.
+	 */
+	if (remote && remote->next)
+		return 2;
+
+	/* Give up if this is a baseless merge. */
+	if (!bases)
+		return 2;
+
+	return merge_strategies_resolve(the_repository, bases, head, remote);
+}
diff --git a/git-merge-resolve.sh b/git-merge-resolve.sh
deleted file mode 100755
index 343fe7bccd..0000000000
--- a/git-merge-resolve.sh
+++ /dev/null
@@ -1,54 +0,0 @@
-#!/bin/sh
-#
-# Copyright (c) 2005 Linus Torvalds
-# Copyright (c) 2005 Junio C Hamano
-#
-# Resolve two trees, using enhanced multi-base read-tree.
-
-# The first parameters up to -- are merge bases; the rest are heads.
-bases= head= remotes= sep_seen=
-for arg
-do
-	case ",$sep_seen,$head,$arg," in
-	*,--,)
-		sep_seen=yes
-		;;
-	,yes,,*)
-		head=$arg
-		;;
-	,yes,*)
-		remotes="$remotes$arg "
-		;;
-	*)
-		bases="$bases$arg "
-		;;
-	esac
-done
-
-# Give up if we are given two or more remotes -- not handling octopus.
-case "$remotes" in
-?*' '?*)
-	exit 2 ;;
-esac
-
-# Give up if this is a baseless merge.
-if test '' = "$bases"
-then
-	exit 2
-fi
-
-git update-index -q --refresh
-git read-tree -u -m --aggressive $bases $head $remotes || exit 2
-echo "Trying simple merge."
-if result_tree=$(git write-tree 2>/dev/null)
-then
-	exit 0
-else
-	echo "Simple merge failed, trying Automatic merge."
-	if git merge-index -o git-merge-one-file -a
-	then
-		exit 0
-	else
-		exit 1
-	fi
-fi
diff --git a/git.c b/git.c
index a4d3f98094..64a1a1de41 100644
--- a/git.c
+++ b/git.c
@@ -544,6 +544,7 @@ static struct cmd_struct commands[] = {
 	{ "merge-recursive", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
 	{ "merge-recursive-ours", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
 	{ "merge-recursive-theirs", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
+	{ "merge-resolve", cmd_merge_resolve, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
 	{ "merge-subtree", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
 	{ "merge-tree", cmd_merge_tree, RUN_SETUP | NO_PARSEOPT },
 	{ "mktag", cmd_mktag, RUN_SETUP | NO_PARSEOPT },
diff --git a/merge-strategies.c b/merge-strategies.c
index 2ed3a8dd68..9fafee5954 100644
--- a/merge-strategies.c
+++ b/merge-strategies.c
@@ -1,7 +1,10 @@
 #include "cache.h"
+#include "cache-tree.h"
 #include "dir.h"
+#include "lockfile.h"
 #include "merge-strategies.h"
 #include "run-command.h"
+#include "unpack-trees.h"
 #include "xdiff-interface.h"
 
 static int checkout_from_index(struct index_state *istate, const char *path,
@@ -288,3 +291,95 @@ int merge_all_index(struct index_state *istate, int oneshot, int quiet,
 
 	return err;
 }
+
+static int fast_forward(struct repository *r, struct tree_desc *t,
+			int nr, int aggressive)
+{
+	struct unpack_trees_options opts;
+	struct lock_file lock = LOCK_INIT;
+
+	refresh_index(r->index, REFRESH_QUIET, NULL, NULL, NULL);
+	repo_hold_locked_index(r, &lock, LOCK_DIE_ON_ERROR);
+
+	memset(&opts, 0, sizeof(opts));
+	opts.head_idx = 1;
+	opts.src_index = r->index;
+	opts.dst_index = r->index;
+	opts.merge = 1;
+	opts.update = 1;
+	opts.aggressive = aggressive;
+
+	if (nr == 1)
+		opts.fn = oneway_merge;
+	else if (nr == 2) {
+		opts.fn = twoway_merge;
+		opts.initial_checkout = is_index_unborn(r->index);
+	} else if (nr >= 3) {
+		opts.fn = threeway_merge;
+		opts.head_idx = nr - 1;
+	}
+
+	if (unpack_trees(nr, t, &opts))
+		return -1;
+
+	if (write_locked_index(r->index, &lock, COMMIT_LOCK))
+		return error(_("unable to write new index file"));
+
+	return 0;
+}
+
+static int add_tree(struct tree *tree, struct tree_desc *t)
+{
+	if (parse_tree(tree))
+		return -1;
+
+	init_tree_desc(t, tree->buffer, tree->size);
+	return 0;
+}
+
+int merge_strategies_resolve(struct repository *r,
+			     struct commit_list *bases, const char *head_arg,
+			     struct commit_list *remote)
+{
+	struct tree_desc t[MAX_UNPACK_TREES];
+	struct object_id head, oid;
+	struct commit_list *i;
+	int nr = 0;
+
+	if (head_arg)
+		get_oid(head_arg, &head);
+
+	puts(_("Trying simple merge."));
+
+	for (i = bases; i && i->item; i = i->next) {
+		if (add_tree(repo_get_commit_tree(r, i->item), t + (nr++)))
+			return 2;
+	}
+
+	if (head_arg) {
+		struct tree *tree = parse_tree_indirect(&head);
+		if (add_tree(tree, t + (nr++)))
+			return 2;
+	}
+
+	if (remote && add_tree(repo_get_commit_tree(r, remote->item), t + (nr++)))
+		return 2;
+
+	if (fast_forward(r, t, nr, 1))
+		return 2;
+
+	if (write_index_as_tree(&oid, r->index, r->index_file,
+				WRITE_TREE_SILENT, NULL)) {
+		int ret;
+		struct lock_file lock = LOCK_INIT;
+
+		puts(_("Simple merge failed, trying Automatic merge."));
+		repo_hold_locked_index(r, &lock, LOCK_DIE_ON_ERROR);
+		ret = merge_all_index(r->index, 0, 0, merge_one_file_func, r);
+
+		write_locked_index(r->index, &lock, COMMIT_LOCK);
+		return !!ret;
+	}
+
+	return 0;
+}
diff --git a/merge-strategies.h b/merge-strategies.h
index b69a12b390..4f996261b4 100644
--- a/merge-strategies.h
+++ b/merge-strategies.h
@@ -1,6 +1,7 @@
 #ifndef MERGE_STRATEGIES_H
 #define MERGE_STRATEGIES_H
 
+#include "commit.h"
 #include "object.h"
 
 int merge_three_way(struct repository *r,
@@ -32,4 +33,8 @@ int merge_index_path(struct index_state *istate, int oneshot, int quiet,
 int merge_all_index(struct index_state *istate, int oneshot, int quiet,
 		    merge_fn fn, void *data);
 
+int merge_strategies_resolve(struct repository *r,
+			     struct commit_list *bases, const char *head_arg,
+			     struct commit_list *remote);
+
 #endif /* MERGE_STRATEGIES_H */
-- 
2.20.1



* [PATCH v5 07/12] merge-recursive: move better_branch_name() to merge.c
  2020-11-16 10:21       ` [PATCH v5 00/12] Rewrite the remaining merge strategies from shell to C Alban Gruin
                           ` (5 preceding siblings ...)
  2020-11-16 10:21         ` [PATCH v5 06/12] merge-resolve: rewrite in C Alban Gruin
@ 2020-11-16 10:21         ` Alban Gruin
  2020-11-16 10:21         ` [PATCH v5 08/12] merge-octopus: rewrite in C Alban Gruin
                           ` (5 subsequent siblings)
  12 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2020-11-16 10:21 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Phillip Wood, Alban Gruin

better_branch_name() will be used by merge-octopus once it is rewritten
in C, so instead of duplicating it, this moves the function ahead of
time into merge.c, which is part of libgit.a.  The function is also
renamed to merge_get_better_branch_name() to reflect its usage by merge
strategies.
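
For readers unfamiliar with this helper, here is a standalone sketch of
the lookup it performs: when the argument has the length of a full hex
object name, a GITHEAD_<hex> environment variable may carry a friendlier
name.  This is not the function from this patch: pretty_branch_name()
and HEX_LEN are hypothetical, the hash length is hardcoded to 40 (an
assumption, where the real code consults the_hash_algo->hexsz), and
plain libc calls replace xstrdup() and xsnprintf().

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define HEX_LEN 40 /* assumption: SHA-1 hex length */

/*
 * Returns a newly allocated string that the caller must free(), or
 * NULL if the allocation fails.
 */
char *pretty_branch_name(const char *branch)
{
	char env_name[8 + HEX_LEN + 1]; /* "GITHEAD_" + hex + NUL */
	const char *name = branch;
	char *copy;
	size_t len;

	/* Only full-length hex names are looked up in the environment. */
	if (strlen(branch) == HEX_LEN) {
		const char *value;

		snprintf(env_name, sizeof(env_name), "GITHEAD_%s", branch);
		value = getenv(env_name);
		if (value)
			name = value;
	}

	len = strlen(name) + 1;
	copy = malloc(len);
	if (copy)
		memcpy(copy, name, len);
	return copy;
}
```

A name of any other length, or one without a matching environment
variable, comes back unchanged (as a copy).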

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 builtin/merge-recursive.c | 16 ++--------------
 cache.h                   |  2 +-
 merge.c                   | 12 ++++++++++++
 3 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/builtin/merge-recursive.c b/builtin/merge-recursive.c
index a4bfd8fc51..972243b5e9 100644
--- a/builtin/merge-recursive.c
+++ b/builtin/merge-recursive.c
@@ -8,18 +8,6 @@
 static const char builtin_merge_recursive_usage[] =
 	"git %s <base>... -- <head> <remote> ...";
 
-static char *better_branch_name(const char *branch)
-{
-	static char githead_env[8 + GIT_MAX_HEXSZ + 1];
-	char *name;
-
-	if (strlen(branch) != the_hash_algo->hexsz)
-		return xstrdup(branch);
-	xsnprintf(githead_env, sizeof(githead_env), "GITHEAD_%s", branch);
-	name = getenv(githead_env);
-	return xstrdup(name ? name : branch);
-}
-
 int cmd_merge_recursive(int argc, const char **argv, const char *prefix)
 {
 	const struct object_id *bases[21];
@@ -75,8 +63,8 @@ int cmd_merge_recursive(int argc, const char **argv, const char *prefix)
 	if (get_oid(o.branch2, &h2))
 		die(_("could not resolve ref '%s'"), o.branch2);
 
-	o.branch1 = better1 = better_branch_name(o.branch1);
-	o.branch2 = better2 = better_branch_name(o.branch2);
+	o.branch1 = better1 = merge_get_better_branch_name(o.branch1);
+	o.branch2 = better2 = merge_get_better_branch_name(o.branch2);
 
 	if (o.verbosity >= 3)
 		printf(_("Merging %s with %s\n"), o.branch1, o.branch2);
diff --git a/cache.h b/cache.h
index be16ab3215..2d844576ea 100644
--- a/cache.h
+++ b/cache.h
@@ -1933,7 +1933,7 @@ int checkout_fast_forward(struct repository *r,
 			  const struct object_id *from,
 			  const struct object_id *to,
 			  int overwrite_ignore);
-
+char *merge_get_better_branch_name(const char *branch);
 
 int sane_execvp(const char *file, char *const argv[]);
 
diff --git a/merge.c b/merge.c
index 5fb88af102..801d673c5f 100644
--- a/merge.c
+++ b/merge.c
@@ -109,3 +109,15 @@ int checkout_fast_forward(struct repository *r,
 		return error(_("unable to write new index file"));
 	return 0;
 }
+
+char *merge_get_better_branch_name(const char *branch)
+{
+	static char githead_env[8 + GIT_MAX_HEXSZ + 1];
+	char *name;
+
+	if (strlen(branch) != the_hash_algo->hexsz)
+		return xstrdup(branch);
+	xsnprintf(githead_env, sizeof(githead_env), "GITHEAD_%s", branch);
+	name = getenv(githead_env);
+	return xstrdup(name ? name : branch);
+}
-- 
2.20.1



* [PATCH v5 08/12] merge-octopus: rewrite in C
  2020-11-16 10:21       ` [PATCH v5 00/12] Rewrite the remaining merge strategies from shell to C Alban Gruin
                           ` (6 preceding siblings ...)
  2020-11-16 10:21         ` [PATCH v5 07/12] merge-recursive: move better_branch_name() to merge.c Alban Gruin
@ 2020-11-16 10:21         ` Alban Gruin
  2020-11-16 10:21         ` [PATCH v5 09/12] merge: use the "resolve" strategy without forking Alban Gruin
                           ` (4 subsequent siblings)
  12 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2020-11-16 10:21 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Phillip Wood, Alban Gruin

This rewrites `git merge-octopus' from shell to C.  As with the last two
conversions, this port removes calls to external processes to avoid
reading and writing the index over and over again.

 - Calls to `read-tree -u -m (--aggressive)?' are replaced by calls to
   unpack_trees().

 - The call to `write-tree' is replaced by a call to
   write_index_as_tree().

 - The call to `diff-index ...' is replaced by a call to
   repo_index_has_changes().

 - The call to `merge-index', needed to invoke `git merge-one-file', is
   replaced by a call to merge_all_index().

The index is read in cmd_merge_octopus(), and is written back by
merge_strategies_octopus().

Here too, merge_strategies_octopus() takes two commit lists and a string
to reduce friction when try_merge_strategy() is modified to call it
directly.
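
As a reading aid, the per-head decision made by the original shell loop
(already up to date, fast-forward, or real merge) can be sketched as
below.  This is a simplified model, not the code from this patch:
octopus_classify() and the enum are hypothetical, commits are plain
strings instead of `struct commit', and the reference is assumed to be
a single commit, which holds in the shell script for as long as no
non-fast-forward merge has happened.

```c
#include <stddef.h>
#include <string.h>

enum octopus_action {
	OCTOPUS_UP_TO_DATE,   /* the head is already contained */
	OCTOPUS_FAST_FORWARD, /* advance the reference to the head */
	OCTOPUS_MERGE         /* a real merge is needed */
};

/*
 * `common' holds the merge bases of `remote' and the current
 * reference; `non_ff_seen' is non-zero once a real merge happened.
 */
enum octopus_action octopus_classify(const char *remote,
				     const char **common, size_t nr_common,
				     const char *reference, int non_ff_seen)
{
	size_t i;

	/* The head is one of the merge bases: nothing to do. */
	for (i = 0; i < nr_common; i++)
		if (!strcmp(common[i], remote))
			return OCTOPUS_UP_TO_DATE;

	/* The only merge base is the reference itself: fast-forward. */
	if (!non_ff_seen && nr_common == 1 && !strcmp(common[0], reference))
		return OCTOPUS_FAST_FORWARD;

	return OCTOPUS_MERGE;
}
```

Once a real merge has happened, every later head is merged for real,
even if it would otherwise qualify as a fast-forward.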

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 Makefile                |   2 +-
 builtin.h               |   1 +
 builtin/merge-octopus.c |  69 ++++++++++++++++
 git-merge-octopus.sh    | 112 -------------------------
 git.c                   |   1 +
 merge-strategies.c      | 179 ++++++++++++++++++++++++++++++++++++++++
 merge-strategies.h      |   3 +
 7 files changed, 254 insertions(+), 113 deletions(-)
 create mode 100644 builtin/merge-octopus.c
 delete mode 100755 git-merge-octopus.sh

diff --git a/Makefile b/Makefile
index 3cc6b192f1..2b2bdffafe 100644
--- a/Makefile
+++ b/Makefile
@@ -600,7 +600,6 @@ unexport CDPATH
 SCRIPT_SH += git-bisect.sh
 SCRIPT_SH += git-difftool--helper.sh
 SCRIPT_SH += git-filter-branch.sh
-SCRIPT_SH += git-merge-octopus.sh
 SCRIPT_SH += git-mergetool.sh
 SCRIPT_SH += git-quiltimport.sh
 SCRIPT_SH += git-request-pull.sh
@@ -1093,6 +1092,7 @@ BUILTIN_OBJS += builtin/mailsplit.o
 BUILTIN_OBJS += builtin/merge-base.o
 BUILTIN_OBJS += builtin/merge-file.o
 BUILTIN_OBJS += builtin/merge-index.o
+BUILTIN_OBJS += builtin/merge-octopus.o
 BUILTIN_OBJS += builtin/merge-one-file.o
 BUILTIN_OBJS += builtin/merge-ours.o
 BUILTIN_OBJS += builtin/merge-recursive.o
diff --git a/builtin.h b/builtin.h
index 35e91c16d0..50225404a0 100644
--- a/builtin.h
+++ b/builtin.h
@@ -176,6 +176,7 @@ int cmd_maintenance(int argc, const char **argv, const char *prefix);
 int cmd_merge(int argc, const char **argv, const char *prefix);
 int cmd_merge_base(int argc, const char **argv, const char *prefix);
 int cmd_merge_index(int argc, const char **argv, const char *prefix);
+int cmd_merge_octopus(int argc, const char **argv, const char *prefix);
 int cmd_merge_ours(int argc, const char **argv, const char *prefix);
 int cmd_merge_file(int argc, const char **argv, const char *prefix);
 int cmd_merge_one_file(int argc, const char **argv, const char *prefix);
diff --git a/builtin/merge-octopus.c b/builtin/merge-octopus.c
new file mode 100644
index 0000000000..ca8f9f345d
--- /dev/null
+++ b/builtin/merge-octopus.c
@@ -0,0 +1,69 @@
+/*
+ * Builtin "git merge-octopus"
+ *
+ * Copyright (c) 2020 Alban Gruin
+ *
+ * Based on git-merge-octopus.sh, written by Junio C Hamano.
+ *
+ * Resolve two or more trees.
+ */
+
+#define USE_THE_INDEX_COMPATIBILITY_MACROS
+#include "cache.h"
+#include "builtin.h"
+#include "commit.h"
+#include "merge-strategies.h"
+
+static const char builtin_merge_octopus_usage[] =
+	"git merge-octopus [<bases>...] -- <head> <remote1> <remote2> [<remotes>...]";
+
+int cmd_merge_octopus(int argc, const char **argv, const char *prefix)
+{
+	int i, sep_seen = 0;
+	struct commit_list *bases = NULL, *remotes = NULL;
+	struct commit_list **next_base = &bases, **next_remote = &remotes;
+	const char *head_arg = NULL;
+
+	if (argc < 5)
+		usage(builtin_merge_octopus_usage);
+
+	setup_work_tree();
+	if (read_cache() < 0)
+		die("invalid index");
+
+	/*
+	 * The first parameters up to -- are merge bases; the rest are
+	 * heads.
+	 */
+	for (i = 1; i < argc; i++) {
+		if (strcmp(argv[i], "--") == 0)
+			sep_seen = 1;
+		else if (strcmp(argv[i], "-h") == 0)
+			usage(builtin_merge_octopus_usage);
+		else if (sep_seen && !head_arg)
+			head_arg = argv[i];
+		else {
+			struct object_id oid;
+			struct commit *commit;
+
+			if (get_oid(argv[i], &oid))
+				die("object %s not found.", argv[i]);
+
+			commit = lookup_commit_or_die(&oid, argv[i]);
+
+			if (sep_seen)
+				next_remote = commit_list_append(commit, next_remote);
+			else
+				next_base = commit_list_append(commit, next_base);
+		}
+	}
+
+	/*
+	 * Reject if this is not an octopus -- resolve should be used
+	 * instead.
+	 */
+	if (commit_list_count(remotes) < 2)
+		return 2;
+
+	return merge_strategies_octopus(the_repository, bases, head_arg, remotes);
+}
diff --git a/git-merge-octopus.sh b/git-merge-octopus.sh
deleted file mode 100755
index 7d19d37951..0000000000
--- a/git-merge-octopus.sh
+++ /dev/null
@@ -1,112 +0,0 @@
-#!/bin/sh
-#
-# Copyright (c) 2005 Junio C Hamano
-#
-# Resolve two or more trees.
-#
-
-. git-sh-setup
-
-LF='
-'
-
-# The first parameters up to -- are merge bases; the rest are heads.
-bases= head= remotes= sep_seen=
-for arg
-do
-	case ",$sep_seen,$head,$arg," in
-	*,--,)
-		sep_seen=yes
-		;;
-	,yes,,*)
-		head=$arg
-		;;
-	,yes,*)
-		remotes="$remotes$arg "
-		;;
-	*)
-		bases="$bases$arg "
-		;;
-	esac
-done
-
-# Reject if this is not an octopus -- resolve should be used instead.
-case "$remotes" in
-?*' '?*)
-	;;
-*)
-	exit 2 ;;
-esac
-
-# MRC is the current "merge reference commit"
-# MRT is the current "merge result tree"
-
-if ! git diff-index --quiet --cached HEAD --
-then
-    gettextln "Error: Your local changes to the following files would be overwritten by merge"
-    git diff-index --cached --name-only HEAD -- | sed -e 's/^/    /'
-    exit 2
-fi
-MRC=$(git rev-parse --verify -q $head)
-MRT=$(git write-tree)
-NON_FF_MERGE=0
-OCTOPUS_FAILURE=0
-for SHA1 in $remotes
-do
-	case "$OCTOPUS_FAILURE" in
-	1)
-		# We allow only last one to have a hand-resolvable
-		# conflicts.  Last round failed and we still had
-		# a head to merge.
-		gettextln "Automated merge did not work."
-		gettextln "Should not be doing an octopus."
-		exit 2
-	esac
-
-	eval pretty_name=\${GITHEAD_$SHA1:-$SHA1}
-	if test "$SHA1" = "$pretty_name"
-	then
-		SHA1_UP="$(echo "$SHA1" | tr a-z A-Z)"
-		eval pretty_name=\${GITHEAD_$SHA1_UP:-$pretty_name}
-	fi
-	common=$(git merge-base --all $SHA1 $MRC) ||
-		die "$(eval_gettext "Unable to find common commit with \$pretty_name")"
-
-	case "$LF$common$LF" in
-	*"$LF$SHA1$LF"*)
-		eval_gettextln "Already up to date with \$pretty_name"
-		continue
-		;;
-	esac
-
-	if test "$common,$NON_FF_MERGE" = "$MRC,0"
-	then
-		# The first head being merged was a fast-forward.
-		# Advance MRC to the head being merged, and use that
-		# tree as the intermediate result of the merge.
-		# We still need to count this as part of the parent set.
-
-		eval_gettextln "Fast-forwarding to: \$pretty_name"
-		git read-tree -u -m $head $SHA1 || exit
-		MRC=$SHA1 MRT=$(git write-tree)
-		continue
-	fi
-
-	NON_FF_MERGE=1
-
-	eval_gettextln "Trying simple merge with \$pretty_name"
-	git read-tree -u -m --aggressive  $common $MRT $SHA1 || exit 2
-	next=$(git write-tree 2>/dev/null)
-	if test $? -ne 0
-	then
-		gettextln "Simple merge did not work, trying automatic merge."
-		git merge-index -o git-merge-one-file -a ||
-		OCTOPUS_FAILURE=1
-		next=$(git write-tree 2>/dev/null)
-	fi
-
-	MRC="$MRC $SHA1"
-	MRT=$next
-done
-
-exit "$OCTOPUS_FAILURE"
diff --git a/git.c b/git.c
index 64a1a1de41..d51fb5d2bf 100644
--- a/git.c
+++ b/git.c
@@ -539,6 +539,7 @@ static struct cmd_struct commands[] = {
 	{ "merge-base", cmd_merge_base, RUN_SETUP },
 	{ "merge-file", cmd_merge_file, RUN_SETUP_GENTLY },
 	{ "merge-index", cmd_merge_index, RUN_SETUP | NO_PARSEOPT },
+	{ "merge-octopus", cmd_merge_octopus, RUN_SETUP | NO_PARSEOPT },
 	{ "merge-ours", cmd_merge_ours, RUN_SETUP | NO_PARSEOPT },
 	{ "merge-one-file", cmd_merge_one_file, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
 	{ "merge-recursive", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
diff --git a/merge-strategies.c b/merge-strategies.c
index 9fafee5954..970ff4793d 100644
--- a/merge-strategies.c
+++ b/merge-strategies.c
@@ -1,5 +1,6 @@
 #include "cache.h"
 #include "cache-tree.h"
+#include "commit-reach.h"
 #include "dir.h"
 #include "lockfile.h"
 #include "merge-strategies.h"
@@ -383,3 +384,181 @@ int merge_strategies_resolve(struct repository *r,
 
 	return 0;
 }
+
+static int write_tree(struct repository *r, struct tree **reference_tree)
+{
+	struct object_id oid;
+	int ret;
+
+	if (!(ret = write_index_as_tree(&oid, r->index, r->index_file, 0, NULL)))
+		*reference_tree = lookup_tree(r, &oid);
+
+	return ret;
+}
+
+static int octopus_fast_forward(struct repository *r, const char *branch_name,
+				struct tree *tree_head, struct tree *current_tree,
+				struct tree **reference_tree)
+{
+	/*
+	 * The first head being merged was a fast-forward.  Advance the
+	 * reference commit to the head being merged, and use that tree
+	 * as the intermediate result of the merge.  We still need to
+	 * count this as part of the parent set.
+	 */
+	struct tree_desc t[2];
+
+	printf(_("Fast-forwarding to: %s\n"), branch_name);
+
+	init_tree_desc(t, tree_head->buffer, tree_head->size);
+	if (add_tree(current_tree, t + 1))
+		return -1;
+	if (fast_forward(r, t, 2, 0))
+		return -1;
+	if (write_tree(r, reference_tree))
+		return -1;
+
+	return 0;
+}
+
+static int octopus_do_merge(struct repository *r, const char *branch_name,
+			    struct commit_list *common, struct tree *current_tree,
+			    struct tree **reference_tree)
+{
+	struct tree_desc t[MAX_UNPACK_TREES];
+	struct commit_list *j;
+	int nr = 0, ret = 0;
+
+	printf(_("Trying simple merge with %s\n"), branch_name);
+
+	for (j = common; j; j = j->next) {
+		struct tree *tree = repo_get_commit_tree(r, j->item);
+		if (add_tree(tree, t + (nr++)))
+			return -1;
+	}
+
+	if (add_tree(*reference_tree, t + (nr++)))
+		return -1;
+	if (add_tree(current_tree, t + (nr++)))
+		return -1;
+	if (fast_forward(r, t, nr, 1))
+		return -1;
+
+	if (write_tree(r, reference_tree)) {
+		struct lock_file lock = LOCK_INIT;
+
+		puts(_("Simple merge did not work, trying automatic merge."));
+		repo_hold_locked_index(r, &lock, LOCK_DIE_ON_ERROR);
+		ret = merge_all_index(r->index, 0, 0, merge_one_file_func, r);
+		write_locked_index(r->index, &lock, COMMIT_LOCK);
+
+		write_tree(r, reference_tree);
+	}
+
+	return ret ? -2 : 0;
+}
+
+int merge_strategies_octopus(struct repository *r,
+			     struct commit_list *bases, const char *head_arg,
+			     struct commit_list *remotes)
+{
+	int ff_merge = 1, ret = 0, references = 1;
+	struct commit **reference_commit;
+	struct tree *reference_tree, *tree_head;
+	struct commit_list *i;
+	struct object_id head;
+	struct strbuf sb = STRBUF_INIT;
+
+	get_oid(head_arg, &head);
+
+	reference_commit = xcalloc(commit_list_count(remotes) + 1,
+				   sizeof(struct commit *));
+	reference_commit[0] = lookup_commit_reference(r, &head);
+	reference_tree = repo_get_commit_tree(r, reference_commit[0]);
+
+	tree_head = repo_get_commit_tree(r, reference_commit[0]);
+	if (parse_tree(tree_head)) {
+		ret = 2;
+		goto out;
+	}
+
+	if (repo_index_has_changes(r, reference_tree, &sb)) {
+		error(_("Your local changes to the following files "
+			"would be overwritten by merge:\n  %s"),
+		      sb.buf);
+		strbuf_release(&sb);
+		ret = 2;
+		goto out;
+	}
+
+	for (i = remotes; i && i->item; i = i->next) {
+		struct commit *c = i->item;
+		struct object_id *oid = &c->object.oid;
+		struct tree *current_tree = repo_get_commit_tree(r, c);
+		struct commit_list *common, *j;
+		char *branch_name;
+		int k = 0, up_to_date = 0;
+
+		if (ret) {
+			/*
+			 * We allow only the last one to have
+			 * hand-resolvable conflicts.  Last round failed
+			 * and we still had a head to merge.
+			 */
+			puts(_("Automated merge did not work."));
+			puts(_("Should not be doing an octopus."));
+
+			ret = 2;
+			goto out;
+		}
+
+		branch_name = merge_get_better_branch_name(oid_to_hex(oid));
+		common = get_merge_bases_many(c, references, reference_commit);
+
+		if (!common) {
+			error(_("Unable to find common commit with %s"), branch_name);
+
+			free(branch_name);
+			free_commit_list(common);
+
+			ret = 2;
+			goto out;
+		}
+
+		for (j = common; j && !(up_to_date || !ff_merge); j = j->next) {
+			up_to_date |= oideq(&j->item->object.oid, oid);
+
+			if (k < references)
+				ff_merge &= oideq(&j->item->object.oid, &reference_commit[k++]->object.oid);
+		}
+
+		if (up_to_date) {
+			printf(_("Already up to date with %s\n"), branch_name);
+
+			free(branch_name);
+			free_commit_list(common);
+			continue;
+		}
+
+		if (ff_merge) {
+			ret = octopus_fast_forward(r, branch_name, tree_head,
+						   current_tree, &reference_tree);
+			references = 0;
+		} else {
+			ret = octopus_do_merge(r, branch_name, common,
+					       current_tree, &reference_tree);
+		}
+
+		free(branch_name);
+		free_commit_list(common);
+
+		if (ret == -1)
+			goto out;
+
+		reference_commit[references++] = c;
+	}
+
+out:
+	free(reference_commit);
+	return ret;
+}
diff --git a/merge-strategies.h b/merge-strategies.h
index 4f996261b4..05232a5a89 100644
--- a/merge-strategies.h
+++ b/merge-strategies.h
@@ -36,5 +36,8 @@ int merge_all_index(struct index_state *istate, int oneshot, int quiet,
 int merge_strategies_resolve(struct repository *r,
 			     struct commit_list *bases, const char *head_arg,
 			     struct commit_list *remote);
+int merge_strategies_octopus(struct repository *r,
+			     struct commit_list *bases, const char *head_arg,
+			     struct commit_list *remote);
 
 #endif /* MERGE_STRATEGIES_H */
-- 
2.20.1



* [PATCH v5 09/12] merge: use the "resolve" strategy without forking
  2020-11-16 10:21       ` [PATCH v5 00/12] Rewrite the remaining merge strategies from shell to C Alban Gruin
                           ` (7 preceding siblings ...)
  2020-11-16 10:21         ` [PATCH v5 08/12] merge-octopus: rewrite in C Alban Gruin
@ 2020-11-16 10:21         ` Alban Gruin
  2020-11-16 10:21         ` [PATCH v5 10/12] merge: use the "octopus" " Alban Gruin
                           ` (3 subsequent siblings)
  12 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2020-11-16 10:21 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Phillip Wood, Alban Gruin

This teaches `git merge' to invoke the "resolve" strategy with a
function call instead of forking.
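
The idea of preferring an in-process function over spawning a command
can be illustrated with a small lookup table.  This is only a sketch
under assumptions: the patch itself adds a branch to the existing
if/else chain in try_merge_strategy(), and `strategy_fn',
`struct strategy_entry', run_resolve() and find_builtin_strategy() are
hypothetical names.

```c
#include <stddef.h>
#include <string.h>

typedef int (*strategy_fn)(void *ctx);

struct strategy_entry {
	const char *name;
	strategy_fn fn;
};

/* Placeholder for an in-process strategy; always reports success. */
int run_resolve(void *ctx) { (void)ctx; return 0; }

/*
 * Returns the in-process implementation of `name', or NULL when the
 * caller has to fall back to spawning `git merge-<name>'.
 */
strategy_fn find_builtin_strategy(const struct strategy_entry *table,
				  size_t nr, const char *name)
{
	size_t i;

	for (i = 0; i < nr; i++)
		if (!strcmp(table[i].name, name))
			return table[i].fn;
	return NULL;
}
```

A caller would invoke the returned function when it is non-NULL and
spawn the external command otherwise, so user-provided strategies keep
working.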

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 builtin/merge.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/builtin/merge.c b/builtin/merge.c
index 9d5359edc2..3b35aa320c 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -41,6 +41,7 @@
 #include "commit-reach.h"
 #include "wt-status.h"
 #include "commit-graph.h"
+#include "merge-strategies.h"
 
 #define DEFAULT_TWOHEAD (1<<0)
 #define DEFAULT_OCTOPUS (1<<1)
@@ -740,6 +741,9 @@ static int try_merge_strategy(const char *strategy, struct commit_list *common,
 				       COMMIT_LOCK | SKIP_IF_UNCHANGED))
 			die(_("unable to write %s"), get_index_file());
 		return clean ? 0 : 1;
+	} else if (!strcmp(strategy, "resolve")) {
+		return merge_strategies_resolve(the_repository, common,
+						head_arg, remoteheads);
 	} else {
 		return try_merge_command(the_repository,
 					 strategy, xopts_nr, xopts,
-- 
2.20.1



* [PATCH v5 10/12] merge: use the "octopus" strategy without forking
  2020-11-16 10:21       ` [PATCH v5 00/12] Rewrite the remaining merge strategies from shell to C Alban Gruin
                           ` (8 preceding siblings ...)
  2020-11-16 10:21         ` [PATCH v5 09/12] merge: use the "resolve" strategy without forking Alban Gruin
@ 2020-11-16 10:21         ` Alban Gruin
  2020-11-16 10:21         ` [PATCH v5 11/12] sequencer: use the "resolve" " Alban Gruin
                           ` (2 subsequent siblings)
  12 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2020-11-16 10:21 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Phillip Wood, Alban Gruin

This teaches `git merge' to invoke the "octopus" strategy with a
function call instead of forking.

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 builtin/merge.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/builtin/merge.c b/builtin/merge.c
index 3b35aa320c..f3345a582a 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -744,6 +744,9 @@ static int try_merge_strategy(const char *strategy, struct commit_list *common,
 	} else if (!strcmp(strategy, "resolve")) {
 		return merge_strategies_resolve(the_repository, common,
 						head_arg, remoteheads);
+	} else if (!strcmp(strategy, "octopus")) {
+		return merge_strategies_octopus(the_repository, common,
+						head_arg, remoteheads);
 	} else {
 		return try_merge_command(the_repository,
 					 strategy, xopts_nr, xopts,
-- 
2.20.1



* [PATCH v5 11/12] sequencer: use the "resolve" strategy without forking
  2020-11-16 10:21       ` [PATCH v5 00/12] Rewrite the remaining merge strategies from shell to C Alban Gruin
                           ` (9 preceding siblings ...)
  2020-11-16 10:21         ` [PATCH v5 10/12] merge: use the "octopus" " Alban Gruin
@ 2020-11-16 10:21         ` Alban Gruin
  2020-11-16 10:21         ` [PATCH v5 12/12] sequencer: use the "octopus" merge " Alban Gruin
  2020-11-24 11:53         ` [PATCH v6 00/13] Rewrite the remaining merge strategies from shell to C Alban Gruin
  12 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2020-11-16 10:21 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Phillip Wood, Alban Gruin

This teaches the sequencer to invoke the "resolve" strategy with a
function call instead of forking.

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 sequencer.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/sequencer.c b/sequencer.c
index e8676e965f..ff411d54af 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -33,6 +33,7 @@
 #include "commit-reach.h"
 #include "rebase-interactive.h"
 #include "reset.h"
+#include "merge-strategies.h"
 
 #define GIT_REFLOG_ACTION "GIT_REFLOG_ACTION"
 
@@ -2000,9 +2001,15 @@ static int do_pick_commit(struct repository *r,
 
 		commit_list_insert(base, &common);
 		commit_list_insert(next, &remotes);
-		res |= try_merge_command(r, opts->strategy,
-					 opts->xopts_nr, (const char **)opts->xopts,
-					common, oid_to_hex(&head), remotes);
+
+		if (!strcmp(opts->strategy, "resolve")) {
+			repo_read_index(r);
+			res |= merge_strategies_resolve(r, common, oid_to_hex(&head), remotes);
+		} else
+			res |= try_merge_command(r, opts->strategy,
+						 opts->xopts_nr, (const char **)opts->xopts,
+						 common, oid_to_hex(&head), remotes);
+
 		free_commit_list(common);
 		free_commit_list(remotes);
 	}
-- 
2.20.1



* [PATCH v5 12/12] sequencer: use the "octopus" merge strategy without forking
  2020-11-16 10:21       ` [PATCH v5 00/12] Rewrite the remaining merge strategies from shell to C Alban Gruin
                           ` (10 preceding siblings ...)
  2020-11-16 10:21         ` [PATCH v5 11/12] sequencer: use the "resolve" " Alban Gruin
@ 2020-11-16 10:21         ` Alban Gruin
  2020-11-24 11:53         ` [PATCH v6 00/13] Rewrite the remaining merge strategies from shell to C Alban Gruin
  12 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2020-11-16 10:21 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Phillip Wood, Alban Gruin

This teaches the sequencer to invoke the "octopus" strategy with a
function call instead of forking.

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 sequencer.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/sequencer.c b/sequencer.c
index ff411d54af..746afad930 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -2005,6 +2005,9 @@ static int do_pick_commit(struct repository *r,
 		if (!strcmp(opts->strategy, "resolve")) {
 			repo_read_index(r);
 			res |= merge_strategies_resolve(r, common, oid_to_hex(&head), remotes);
+		} else if (!strcmp(opts->strategy, "octopus")) {
+			repo_read_index(r);
+			res |= merge_strategies_octopus(r, common, oid_to_hex(&head), remotes);
 		} else
 			res |= try_merge_command(r, opts->strategy,
 						 opts->xopts_nr, (const char **)opts->xopts,
-- 
2.20.1



* [PATCH v6 00/13] Rewrite the remaining merge strategies from shell to C
  2020-11-16 10:21       ` [PATCH v5 00/12] Rewrite the remaining merge strategies from shell to C Alban Gruin
                           ` (11 preceding siblings ...)
  2020-11-16 10:21         ` [PATCH v5 12/12] sequencer: use the "octopus" merge " Alban Gruin
@ 2020-11-24 11:53         ` Alban Gruin
  2020-11-24 11:53           ` [PATCH v6 01/13] t6407: modernise tests Alban Gruin
                             ` (15 more replies)
  12 siblings, 16 replies; 221+ messages in thread
From: Alban Gruin @ 2020-11-24 11:53 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Phillip Wood, Alban Gruin

In an effort to reduce the number of shell scripts in git's codebase, I
propose this patch series converting the two remaining merge strategies,
resolve and octopus, from shell to C.  This will enable slightly better
performance, better integration with git itself (no more forking to
perform these operations), and better portability (Windows and shell
scripts don't mix well).

Three scripts are actually converted: first git-merge-one-file.sh, then
git-merge-resolve.sh, and finally git-merge-octopus.sh.  Not only are
they converted, they are also modified to operate without forking, and
then libified so they can be used by git without spawning another
process.

The first patch is not required for the rest of the series to work, but
I wrote it while working on the series.

This series keeps the commands `git merge-one-file', `git
merge-resolve', and `git merge-octopus', so any script depending on them
should keep working without any changes.

This series is based on 306ee63a70 (Eighteenth batch, 2020-09-29).  The
tip is tagged as "rewrite-merge-strategies-v6" at
https://github.com/agrn/git.

Changes since v5:

 - [1/13] Change the commit message to reflect the change of the name of
   t6027.

 - [2/13] Introduce changes in t6060 to avoid potential issues with the
   libified version of merge-index, where a conflicted file could be
   left unmerged (more details in the commit message).

 - [4/13] Fix error handling in do_merge_one_file().

 - [5/13] Pass the repository instead of the index to the libified
   version of merge-index.  This will be useful when merge_three_way()
   is modified to put more information in the merge markers: instead of
   passing the repository as context, an array of two strings will be
   given to merge_entry()'s callback.

 - [5/13] Change error handling in merge_entry().  Instead of returning
   -2 when the merge program fails, it now increments an error counter
   passed by pointer.  This way the caller (namely merge_all_index())
   can still tell from the return value how many times a file was found
   in the index.  When not run in oneshot mode, the number of errors
   should remain 0.  merge_entry() should also not print "Merge program
   failed" when run in oneshot mode.

 - [6/13] Fix the issue described in the second patch.

 - [7/13] Pass the repository to merge_all_index().

 - [9/13] Set the flag WRITE_TREE_SILENT when calling
   write_index_as_tree().

 - [9/13] Cleanup of merge_strategies_octopus() (removing redundant
   code, removing gotos, etc.).

 - [12/13, 13/13] Reformatted an if/else if/else sequence.
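
The merge_entry() error handling described in the [5/13] items above can
be illustrated with a toy model.  This is only a sketch, not the code
from the series: the toy_* names, the simplified entry struct, the
sample data and the callback signature are all assumptions made for the
illustration.  The return value always reports how many entries were
found, while failures are accumulated through a pointer:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for an index entry: a path and a stage (0 = merged). */
struct toy_entry {
	const char *name;
	int stage;
};

/* Hypothetical merge callback: returns non-zero when the merge fails. */
typedef int (*toy_merge_fn)(const char *path);

/*
 * Count the consecutive entries named `path` starting at `pos`, run the
 * callback once, and report a failure by incrementing *err.  The return
 * value is the number of entries found, or -1 if there were none.
 */
int toy_merge_entry(const struct toy_entry *entries, unsigned int nr,
		    unsigned int pos, const char *path, int *err,
		    toy_merge_fn fn)
{
	int found = 0;

	assert(err);
	while (pos < nr && !strcmp(entries[pos].name, path)) {
		found++;
		pos++;
	}
	if (!found)
		return -1;
	if (fn(path))
		(*err)++;
	return found;
}

/* Walk all unmerged entries; in oneshot mode, keep going after a failure. */
int toy_merge_all(const struct toy_entry *entries, unsigned int nr,
		  int oneshot, toy_merge_fn fn)
{
	int err = 0, ret;
	unsigned int i;

	for (i = 0; i < nr; i++) {
		if (!entries[i].stage)
			continue;

		ret = toy_merge_entry(entries, nr, i, entries[i].name, &err, fn);
		if (ret > 0)
			i += ret - 1;	/* skip the other stages of this path */
		else if (ret == -1)
			return -1;

		if (err && !oneshot)
			return 1;
	}

	return err;
}

/* Sample data: "a", "b" and "d" are conflicted, "c" is already merged. */
const struct toy_entry toy_sample[] = {
	{ "a", 1 }, { "a", 2 }, { "a", 3 },
	{ "b", 1 }, { "b", 2 }, { "b", 3 },
	{ "c", 0 },
	{ "d", 2 }, { "d", 3 },
};
const unsigned int toy_sample_nr = sizeof(toy_sample) / sizeof(toy_sample[0]);

/* Number of times the sample callback below has been invoked. */
int toy_calls;

/* Sample callback: counts its invocations and fails for every path but "a". */
int toy_fail_unless_a(const char *path)
{
	toy_calls++;
	return strcmp(path, "a") != 0;
}
```

With the sample entries and toy_fail_unless_a(), toy_merge_all() returns
2 in oneshot mode (both "b" and "d" are attempted), whereas without
oneshot it returns 1 right after the first failure.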

Alban Gruin (13):
  t6407: modernise tests
  t6060: modify multiple files to expose a possible issue with
    merge-index
  update-index: move add_cacheinfo() to read-cache.c
  merge-one-file: rewrite in C
  merge-index: libify merge_one_path() and merge_all()
  merge-index: don't fork if the requested program is
    `git-merge-one-file'
  merge-resolve: rewrite in C
  merge-recursive: move better_branch_name() to merge.c
  merge-octopus: rewrite in C
  merge: use the "resolve" strategy without forking
  merge: use the "octopus" strategy without forking
  sequencer: use the "resolve" strategy without forking
  sequencer: use the "octopus" merge strategy without forking

 Makefile                        |   7 +-
 builtin.h                       |   3 +
 builtin/merge-index.c           | 101 ++----
 builtin/merge-octopus.c         |  69 ++++
 builtin/merge-one-file.c        |  94 ++++++
 builtin/merge-recursive.c       |  16 +-
 builtin/merge-resolve.c         |  73 ++++
 builtin/merge.c                 |   7 +
 builtin/update-index.c          |  25 +-
 cache.h                         |   7 +-
 git-merge-octopus.sh            | 112 -------
 git-merge-one-file.sh           | 167 ----------
 git-merge-resolve.sh            |  54 ---
 git.c                           |   3 +
 merge-strategies.c              | 571 ++++++++++++++++++++++++++++++++
 merge-strategies.h              |  46 +++
 merge.c                         |  12 +
 read-cache.c                    |  35 ++
 sequencer.c                     |  17 +-
 t/t6060-merge-index.sh          |  10 +-
 t/t6407-merge-binary.sh         |  27 +-
 t/t6415-merge-dir-to-symlink.sh |   2 +-
 22 files changed, 992 insertions(+), 466 deletions(-)
 create mode 100644 builtin/merge-octopus.c
 create mode 100644 builtin/merge-one-file.c
 create mode 100644 builtin/merge-resolve.c
 delete mode 100755 git-merge-octopus.sh
 delete mode 100755 git-merge-one-file.sh
 delete mode 100755 git-merge-resolve.sh
 create mode 100644 merge-strategies.c
 create mode 100644 merge-strategies.h

Range-diff against v5:
 1:  08c7df596a !  1:  70d6507330 t6027: modernise tests
    @@ Metadata
     Author: Alban Gruin <alban.gruin@gmail.com>
     
      ## Commit message ##
    -    t6027: modernise tests
    +    t6407: modernise tests
     
    -    Some tests in t6027 uses a if/then/else to check if a command failed or
    +    Some tests in t6407 uses a if/then/else to check if a command failed or
         not, but we have the `test_must_fail' function to do it correctly for us
         nowadays.
     
 -:  ---------- >  2:  25e9c47e41 t6060: modify multiple files to expose a possible issue with merge-index
 2:  df237da758 =  3:  e7ea43c5ff update-index: move add_cacheinfo() to read-cache.c
 3:  eedddde8ea !  4:  284fc4227f merge-one-file: rewrite in C
    @@ merge-strategies.c (new)
     +		return error(_("%s: Not merging symbolic link changes."), path);
     +	else if (our_mode == S_IFGITLINK || their_mode == S_IFGITLINK)
     +		return error(_("%s: Not merging conflicting submodule changes."), path);
    -+	else if (our_mode != their_mode)
    -+		return error(_("permission conflict: %o->%o,%o in %s"),
    -+			     orig_mode, our_mode, their_mode, path);
     +
     +	if (orig_blob) {
     +		printf(_("Auto-merging %s\n"), path);
    @@ merge-strategies.c (new)
     +	if (ret < 0) {
     +		free(result.ptr);
     +		return error(_("Failed to execute internal merge"));
    -+	} else if (ret > 0 || !orig_blob) {
    -+		free(result.ptr);
    -+		return error(_("content conflict in %s"), path);
     +	}
     +
    ++	if (ret > 0 || !orig_blob)
    ++		ret = error(_("content conflict in %s"), path);
    ++	if (our_mode != their_mode)
    ++		ret = error(_("permission conflict: %o->%o,%o in %s"),
    ++			    orig_mode, our_mode, their_mode, path);
    ++
     +	unlink(path);
     +	if ((dest = open(path, O_WRONLY | O_CREAT, our_mode)) < 0) {
     +		free(result.ptr);
    @@ merge-strategies.c (new)
     +
     +	if (written < 0)
     +		return error_errno(_("failed to write to '%s'"), path);
    ++	if (ret)
    ++		return ret;
     +
     +	return add_file_to_index(istate, path, 0);
     +}
 4:  a9b9942243 !  5:  54abee902f merge-index: libify merge_one_path() and merge_all()
    @@ builtin/merge-index.c: int cmd_merge_index(int argc, const char **argv, const ch
      			}
      			if (!strcmp(arg, "-a")) {
     -				merge_all();
    -+				err |= merge_all_index(&the_index, one_shot, quiet,
    ++				err |= merge_all_index(the_repository, one_shot, quiet,
     +						       merge_one_file_spawn, (void *)pgm);
      				continue;
      			}
      			die("git merge-index: unknown option %s", arg);
      		}
     -		merge_one_path(arg);
    -+		err |= merge_index_path(&the_index, one_shot, quiet, arg,
    ++		err |= merge_index_path(the_repository, one_shot, quiet, arg,
     +					merge_one_file_spawn, (void *)pgm);
      	}
     -	if (err && !quiet)
    @@ merge-strategies.c: int merge_three_way(struct repository *r,
      	return 0;
      }
     +
    -+int merge_one_file_spawn(const struct object_id *orig_blob,
    ++int merge_one_file_spawn(struct repository *r,
    ++			 const struct object_id *orig_blob,
     +			 const struct object_id *our_blob,
     +			 const struct object_id *their_blob, const char *path,
     +			 unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
    @@ merge-strategies.c: int merge_three_way(struct repository *r,
     +	return run_command_v_opt(arguments, 0);
     +}
     +
    -+static int merge_entry(struct index_state *istate, int quiet, int pos,
    -+		       const char *path, merge_fn fn, void *data)
    ++static int merge_entry(struct repository *r, int quiet, unsigned int pos,
    ++		       const char *path, int *err, merge_fn fn, void *data)
     +{
     +	int found = 0;
     +	const struct object_id *oids[3] = {NULL};
     +	unsigned int modes[3] = {0};
     +
     +	do {
    -+		const struct cache_entry *ce = istate->cache[pos];
    ++		const struct cache_entry *ce = r->index->cache[pos];
     +		int stage = ce_stage(ce);
     +
     +		if (strcmp(ce->name, path))
    @@ merge-strategies.c: int merge_three_way(struct repository *r,
     +		found++;
     +		oids[stage - 1] = &ce->oid;
     +		modes[stage - 1] = ce->ce_mode;
    -+	} while (++pos < istate->cache_nr);
    ++	} while (++pos < r->index->cache_nr);
     +	if (!found)
     +		return error(_("%s is not in the cache"), path);
     +
    -+	if (fn(oids[0], oids[1], oids[2], path, modes[0], modes[1], modes[2], data)) {
    ++	if (fn(r, oids[0], oids[1], oids[2], path,
    ++	       modes[0], modes[1], modes[2], data)) {
     +		if (!quiet)
     +			error(_("Merge program failed"));
    -+		return -2;
    ++		(*err)++;
     +	}
     +
     +	return found;
     +}
     +
    -+int merge_index_path(struct index_state *istate, int oneshot, int quiet,
    ++int merge_index_path(struct repository *r, int oneshot, int quiet,
     +		     const char *path, merge_fn fn, void *data)
     +{
    -+	int pos = index_name_pos(istate, path, strlen(path)), ret;
    ++	int pos = index_name_pos(r->index, path, strlen(path)), ret, err = 0;
     +
     +	/*
     +	 * If it already exists in the cache as stage0, it's
     +	 * already merged and there is nothing to do.
     +	 */
     +	if (pos < 0) {
    -+		ret = merge_entry(istate, quiet, -pos - 1, path, fn, data);
    ++		ret = merge_entry(r, quiet || oneshot, -pos - 1, path, &err, fn, data);
     +		if (ret == -1)
     +			return -1;
    -+		else if (ret == -2)
    ++		else if (err)
     +			return 1;
     +	}
     +	return 0;
     +}
     +
    -+int merge_all_index(struct index_state *istate, int oneshot, int quiet,
    ++int merge_all_index(struct repository *r, int oneshot, int quiet,
     +		    merge_fn fn, void *data)
     +{
    -+	int err = 0, i, ret;
    -+	for (i = 0; i < istate->cache_nr; i++) {
    -+		const struct cache_entry *ce = istate->cache[i];
    ++	int err = 0, ret;
    ++	unsigned int i;
    ++
    ++	for (i = 0; i < r->index->cache_nr; i++) {
    ++		const struct cache_entry *ce = r->index->cache[i];
     +		if (!ce_stage(ce))
     +			continue;
     +
    -+		ret = merge_entry(istate, quiet, i, ce->name, fn, data);
    ++		ret = merge_entry(r, quiet || oneshot, i, ce->name, &err, fn, data);
     +		if (ret > 0)
     +			i += ret - 1;
     +		else if (ret == -1)
     +			return -1;
    -+		else if (ret == -2) {
    -+			if (oneshot)
    -+				err++;
    -+			else
    -+				return 1;
    -+		}
    ++
    ++		if (err && !oneshot)
    ++			return 1;
     +	}
     +
     +	return err;
    @@ merge-strategies.h: int merge_three_way(struct repository *r,
      		    const struct object_id *their_blob, const char *path,
      		    unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode);
      
    -+typedef int (*merge_fn)(const struct object_id *orig_blob,
    ++typedef int (*merge_fn)(struct repository *r,
    ++			const struct object_id *orig_blob,
     +			const struct object_id *our_blob,
     +			const struct object_id *their_blob, const char *path,
     +			unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
     +			void *data);
     +
    -+int merge_one_file_spawn(const struct object_id *orig_blob,
    ++int merge_one_file_spawn(struct repository *r,
    ++			 const struct object_id *orig_blob,
     +			 const struct object_id *our_blob,
     +			 const struct object_id *their_blob, const char *path,
     +			 unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
     +			 void *data);
     +
    -+int merge_index_path(struct index_state *istate, int oneshot, int quiet,
    ++int merge_index_path(struct repository *r, int oneshot, int quiet,
     +		     const char *path, merge_fn fn, void *data);
    -+int merge_all_index(struct index_state *istate, int oneshot, int quiet,
    ++int merge_all_index(struct repository *r, int oneshot, int quiet,
     +		    merge_fn fn, void *data);
     +
      #endif /* MERGE_STRATEGIES_H */
 5:  12775907c5 !  6:  acaf100edd merge-index: don't fork if the requested program is `git-merge-one-file'
    @@ Commit message
         `merge-index' to call merge_three_way() without forking using a new
         callback, merge_one_file_func().
     
    +    To avoid any issue with a shrinking index because of the merge function
    +    used (directly in the process or by forking), as described earlier, the
    +    iterator of the loop of merge_all_index() is increased by the number of
    +    entries with the same name, minus the difference between the number of
    +    entries in the index before and after the merge.
    +
    +    This should handle a shrinking index correctly, but could lead to issues
    +    with a growing index.  However, this case is not treated, as there is no
    +    callback that can produce such a case.
    +
         Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
     
      ## builtin/merge-index.c ##
    @@ builtin/merge-index.c
      {
      	int i, force_file = 0, err = 0, one_shot = 0, quiet = 0;
      	const char *pgm;
    -+	void *data;
    ++	void *data = NULL;
     +	merge_fn merge_action;
     +	struct lock_file lock = LOCK_INIT;
      
    @@ builtin/merge-index.c: int cmd_merge_index(int argc, const char **argv, const ch
      	}
     +
      	pgm = argv[i++];
    ++	setup_work_tree();
    ++
     +	if (!strcmp(pgm, "git-merge-one-file")) {
     +		merge_action = merge_one_file_func;
    -+		data = (void *)the_repository;
    -+
    -+		setup_work_tree();
     +		hold_locked_index(&lock, LOCK_DIE_ON_ERROR);
     +	} else {
     +		merge_action = merge_one_file_spawn;
    @@ builtin/merge-index.c: int cmd_merge_index(int argc, const char **argv, const ch
     @@ builtin/merge-index.c: int cmd_merge_index(int argc, const char **argv, const char *prefix)
      			}
      			if (!strcmp(arg, "-a")) {
    - 				err |= merge_all_index(&the_index, one_shot, quiet,
    + 				err |= merge_all_index(the_repository, one_shot, quiet,
     -						       merge_one_file_spawn, (void *)pgm);
     +						       merge_action, data);
      				continue;
      			}
      			die("git merge-index: unknown option %s", arg);
      		}
    - 		err |= merge_index_path(&the_index, one_shot, quiet, arg,
    + 		err |= merge_index_path(the_repository, one_shot, quiet, arg,
     -					merge_one_file_spawn, (void *)pgm);
     +					merge_action, data);
     +	}
    @@ merge-strategies.c: int merge_three_way(struct repository *r,
      	return 0;
      }
      
    -+int merge_one_file_func(const struct object_id *orig_blob,
    ++int merge_one_file_func(struct repository *r,
    ++			const struct object_id *orig_blob,
     +			const struct object_id *our_blob,
     +			const struct object_id *their_blob, const char *path,
     +			unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
     +			void *data)
     +{
    -+	return merge_three_way((struct repository *)data,
    ++	return merge_three_way(r,
     +			       orig_blob, our_blob, their_blob, path,
     +			       orig_mode, our_mode, their_mode);
     +}
     +
    - int merge_one_file_spawn(const struct object_id *orig_blob,
    + int merge_one_file_spawn(struct repository *r,
    + 			 const struct object_id *orig_blob,
      			 const struct object_id *our_blob,
    - 			 const struct object_id *their_blob, const char *path,
    +@@ merge-strategies.c: int merge_all_index(struct repository *r, int oneshot, int quiet,
    + 		    merge_fn fn, void *data)
    + {
    + 	int err = 0, ret;
    +-	unsigned int i;
    ++	unsigned int i, prev_nr;
    + 
    + 	for (i = 0; i < r->index->cache_nr; i++) {
    + 		const struct cache_entry *ce = r->index->cache[i];
    + 		if (!ce_stage(ce))
    + 			continue;
    + 
    ++		prev_nr = r->index->cache_nr;
    + 		ret = merge_entry(r, quiet || oneshot, i, ce->name, &err, fn, data);
    +-		if (ret > 0)
    +-			i += ret - 1;
    +-		else if (ret == -1)
    ++		if (ret > 0) {
    ++			/* Don't bother handling an index that has
    ++			   grown, since merge_one_file_func() can't grow
    ++			   it, and merge_one_file_spawn() can't change
    ++			   it. */
    ++			i += ret - (prev_nr - r->index->cache_nr) - 1;
    ++		} else if (ret == -1)
    + 			return -1;
    + 
    + 		if (err && !oneshot)
     
      ## merge-strategies.h ##
    -@@ merge-strategies.h: typedef int (*merge_fn)(const struct object_id *orig_blob,
    +@@ merge-strategies.h: typedef int (*merge_fn)(struct repository *r,
      			unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
      			void *data);
      
    -+int merge_one_file_func(const struct object_id *orig_blob,
    ++int merge_one_file_func(struct repository *r,
    ++			const struct object_id *orig_blob,
     +			const struct object_id *our_blob,
     +			const struct object_id *their_blob, const char *path,
     +			unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
     +			void *data);
     +
    - int merge_one_file_spawn(const struct object_id *orig_blob,
    + int merge_one_file_spawn(struct repository *r,
    + 			 const struct object_id *orig_blob,
      			 const struct object_id *our_blob,
    - 			 const struct object_id *their_blob, const char *path,
 6:  54a4a12504 !  7:  9a9e3faeff merge-resolve: rewrite in C
    @@ merge-strategies.c
      #include "xdiff-interface.h"
      
      static int checkout_from_index(struct index_state *istate, const char *path,
    -@@ merge-strategies.c: int merge_all_index(struct index_state *istate, int oneshot, int quiet,
    +@@ merge-strategies.c: int merge_all_index(struct repository *r, int oneshot, int quiet,
      
      	return err;
      }
    @@ merge-strategies.c: int merge_all_index(struct index_state *istate, int oneshot,
     +
     +		puts(_("Simple merge failed, trying Automatic merge."));
     +		repo_hold_locked_index(r, &lock, LOCK_DIE_ON_ERROR);
    -+		ret = merge_all_index(r->index, 0, 0, merge_one_file_func, r);
    ++		ret = merge_all_index(r, 1, 0, merge_one_file_func, NULL);
     +
     +		write_locked_index(r->index, &lock, COMMIT_LOCK);
     +		return !!ret;
    @@ merge-strategies.h
      #include "object.h"
      
      int merge_three_way(struct repository *r,
    -@@ merge-strategies.h: int merge_index_path(struct index_state *istate, int oneshot, int quiet,
    - int merge_all_index(struct index_state *istate, int oneshot, int quiet,
    +@@ merge-strategies.h: int merge_index_path(struct repository *r, int oneshot, int quiet,
    + int merge_all_index(struct repository *r, int oneshot, int quiet,
      		    merge_fn fn, void *data);
      
     +int merge_strategies_resolve(struct repository *r,
 7:  7c4ad06b95 =  8:  359346229c merge-recursive: move better_branch_name() to merge.c
 8:  edbe08d41b !  9:  4dff780212 merge-octopus: rewrite in C
    @@ merge-strategies.c: int merge_strategies_resolve(struct repository *r,
     +	struct object_id oid;
     +	int ret;
     +
    -+	if (!(ret = write_index_as_tree(&oid, r->index, r->index_file, 0, NULL)))
    ++	if (!(ret = write_index_as_tree(&oid, r->index, r->index_file,
    ++					WRITE_TREE_SILENT, NULL)))
     +		*reference_tree = lookup_tree(r, &oid);
     +
     +	return ret;
    @@ merge-strategies.c: int merge_strategies_resolve(struct repository *r,
     +
     +		puts(_("Simple merge did not work, trying automatic merge."));
     +		repo_hold_locked_index(r, &lock, LOCK_DIE_ON_ERROR);
    -+		ret = merge_all_index(r->index, 0, 0, merge_one_file_func, r);
    ++		ret = merge_all_index(r, 1, 0, merge_one_file_func, NULL);
     +		write_locked_index(r->index, &lock, COMMIT_LOCK);
     +
     +		write_tree(r, reference_tree);
    @@ merge-strategies.c: int merge_strategies_resolve(struct repository *r,
     +			     struct commit_list *remotes)
     +{
     +	int ff_merge = 1, ret = 0, references = 1;
    -+	struct commit **reference_commit;
    -+	struct tree *reference_tree, *tree_head;
    ++	struct commit **reference_commit, *head_commit;
    ++	struct tree *reference_tree, *head_tree;
     +	struct commit_list *i;
     +	struct object_id head;
     +	struct strbuf sb = STRBUF_INIT;
     +
     +	get_oid(head_arg, &head);
    ++	head_commit = lookup_commit_reference(r, &head);
    ++	head_tree = repo_get_commit_tree(r, head_commit);
     +
    -+	reference_commit = xcalloc(commit_list_count(remotes) + 1,
    -+				   sizeof(struct commit *));
    -+	reference_commit[0] = lookup_commit_reference(r, &head);
    -+	reference_tree = repo_get_commit_tree(r, reference_commit[0]);
    ++	if (parse_tree(head_tree))
    ++		return 2;
     +
    -+	tree_head = repo_get_commit_tree(r, reference_commit[0]);
    -+	if (parse_tree(tree_head)) {
    -+		ret = 2;
    -+		goto out;
    -+	}
    -+
    -+	if (repo_index_has_changes(r, reference_tree, &sb)) {
    ++	if (repo_index_has_changes(r, head_tree, &sb)) {
     +		error(_("Your local changes to the following files "
     +			"would be overwritten by merge:\n  %s"),
     +		      sb.buf);
     +		strbuf_release(&sb);
    -+		ret = 2;
    -+		goto out;
    ++		return 2;
     +	}
     +
    ++	reference_commit = xcalloc(commit_list_count(remotes) + 1,
    ++				   sizeof(struct commit *));
    ++	reference_commit[0] = head_commit;
    ++	reference_tree = head_tree;
    ++
     +	for (i = remotes; i && i->item; i = i->next) {
     +		struct commit *c = i->item;
     +		struct object_id *oid = &c->object.oid;
    @@ merge-strategies.c: int merge_strategies_resolve(struct repository *r,
     +			puts(_("Automated merge did not work."));
     +			puts(_("Should not be doing an octopus."));
     +
    -+			ret = 2;
    -+			goto out;
    ++			free(reference_commit);
    ++			return 2;
     +		}
     +
     +		branch_name = merge_get_better_branch_name(oid_to_hex(oid));
    @@ merge-strategies.c: int merge_strategies_resolve(struct repository *r,
     +
     +			free(branch_name);
     +			free_commit_list(common);
    ++			free(reference_commit);
     +
    -+			ret = 2;
    -+			goto out;
    ++			return 2;
     +		}
     +
     +		for (j = common; j && !(up_to_date || !ff_merge); j = j->next) {
    @@ merge-strategies.c: int merge_strategies_resolve(struct repository *r,
     +		}
     +
     +		if (ff_merge) {
    -+			ret = octopus_fast_forward(r, branch_name, tree_head,
    ++			ret = octopus_fast_forward(r, branch_name, head_tree,
     +						   current_tree, &reference_tree);
     +			references = 0;
     +		} else {
    @@ merge-strategies.c: int merge_strategies_resolve(struct repository *r,
     +		free_commit_list(common);
     +
     +		if (ret == -1)
    -+			goto out;
    ++			break;
     +
     +		reference_commit[references++] = c;
     +	}
     +
    -+out:
     +	free(reference_commit);
     +	return ret;
     +}
     
      ## merge-strategies.h ##
    -@@ merge-strategies.h: int merge_all_index(struct index_state *istate, int oneshot, int quiet,
    +@@ merge-strategies.h: int merge_all_index(struct repository *r, int oneshot, int quiet,
      int merge_strategies_resolve(struct repository *r,
      			     struct commit_list *bases, const char *head_arg,
      			     struct commit_list *remote);
 9:  e677b27c06 = 10:  76f02b4531 merge: use the "resolve" strategy without forking
10:  963f316fd6 = 11:  c9e0a38d0f merge: use the "octopus" strategy without forking
11:  0ad967a7e5 ! 12:  5b595efa46 sequencer: use the "resolve" strategy without forking
    @@ sequencer.c: static int do_pick_commit(struct repository *r,
     +		if (!strcmp(opts->strategy, "resolve")) {
     +			repo_read_index(r);
     +			res |= merge_strategies_resolve(r, common, oid_to_hex(&head), remotes);
    -+		} else
    ++		} else {
     +			res |= try_merge_command(r, opts->strategy,
     +						 opts->xopts_nr, (const char **)opts->xopts,
     +						 common, oid_to_hex(&head), remotes);
    ++		}
     +
      		free_commit_list(common);
      		free_commit_list(remotes);
12:  3814f61717 ! 13:  7eb0f13442 sequencer: use the "octopus" merge strategy without forking
    @@ sequencer.c: static int do_pick_commit(struct repository *r,
     +		} else if (!strcmp(opts->strategy, "octopus")) {
     +			repo_read_index(r);
     +			res |= merge_strategies_octopus(r, common, oid_to_hex(&head), remotes);
    - 		} else
    + 		} else {
      			res |= try_merge_command(r, opts->strategy,
      						 opts->xopts_nr, (const char **)opts->xopts,
-- 
2.29.2.260.ge31aba42fb



* [PATCH v6 01/13] t6407: modernise tests
  2020-11-24 11:53         ` [PATCH v6 00/13] Rewrite the remaining merge strategies from shell to C Alban Gruin
@ 2020-11-24 11:53           ` Alban Gruin
  2020-11-24 11:53           ` [PATCH v6 02/13] t6060: modify multiple files to expose a possible issue with merge-index Alban Gruin
                             ` (14 subsequent siblings)
  15 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2020-11-24 11:53 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Phillip Wood, Alban Gruin

Some tests in t6407 use an if/then/else to check whether a command
failed, but nowadays we have the `test_must_fail' function to do this
correctly for us.

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 t/t6407-merge-binary.sh | 27 ++++++---------------------
 1 file changed, 6 insertions(+), 21 deletions(-)

diff --git a/t/t6407-merge-binary.sh b/t/t6407-merge-binary.sh
index 4e6c7cb77e..071d3f7343 100755
--- a/t/t6407-merge-binary.sh
+++ b/t/t6407-merge-binary.sh
@@ -5,7 +5,6 @@ test_description='ask merge-recursive to merge binary files'
 . ./test-lib.sh
 
 test_expect_success setup '
-
 	cat "$TEST_DIRECTORY"/test-binary-1.png >m &&
 	git add m &&
 	git ls-files -s | sed -e "s/ 0	/ 1	/" >E1 &&
@@ -35,33 +34,19 @@ test_expect_success setup '
 '
 
 test_expect_success resolve '
-
 	rm -f a* m* &&
 	git reset --hard anchor &&
-
-	if git merge -s resolve master
-	then
-		echo Oops, should not have succeeded
-		false
-	else
-		git ls-files -s >current
-		test_cmp expect current
-	fi
+	test_must_fail git merge -s resolve master &&
+	git ls-files -s >current &&
+	test_cmp expect current
 '
 
 test_expect_success recursive '
-
 	rm -f a* m* &&
 	git reset --hard anchor &&
-
-	if git merge -s recursive master
-	then
-		echo Oops, should not have succeeded
-		false
-	else
-		git ls-files -s >current
-		test_cmp expect current
-	fi
+	test_must_fail git merge -s recursive master &&
+	git ls-files -s >current &&
+	test_cmp expect current
 '
 
 test_done
-- 
2.29.2.260.ge31aba42fb



* [PATCH v6 02/13] t6060: modify multiple files to expose a possible issue with merge-index
  2020-11-24 11:53         ` [PATCH v6 00/13] Rewrite the remaining merge strategies from shell to C Alban Gruin
  2020-11-24 11:53           ` [PATCH v6 01/13] t6407: modernise tests Alban Gruin
@ 2020-11-24 11:53           ` Alban Gruin
  2020-11-24 11:53           ` [PATCH v6 03/13] update-index: move add_cacheinfo() to read-cache.c Alban Gruin
                             ` (13 subsequent siblings)
  15 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2020-11-24 11:53 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Phillip Wood, Alban Gruin

Currently, merge-index iterates over every index entry, skipping stage0
entries.  It will then count how many entries following the current one
have the same name, then fork to do the merge.  It will then increase
the iterator by the number of entries to skip them.  This behaviour is
correct, as even if the subprocess modifies the index, merge-index does
not reload it at all.

But once it is rewritten to call a function instead of forking, the
index it iterates over will be modified in place and may shrink when a
conflict happens or a file is removed, so we have to be careful to
handle such cases.

Here is an example:

 *    Merge branches, file1 and file2 are trivially mergeable.
 |\
 | *  Modifies file1 and file2.
 * |  Modifies file1 and file2.
 |/
 *    Adds file1 and file2.

When the merge happens, the index will look like this:

 i -> 0. file1 (stage1)
      1. file1 (stage2)
      2. file1 (stage3)
      3. file2 (stage1)
      4. file2 (stage2)
      5. file2 (stage3)

merge-index handles `file1' first.  As it appears 3 times after the
iterator, it is merged.  The index is now stale, `i' is increased by 3,
and the index now looks like this:

      0. file1 (stage1)
      1. file1 (stage2)
      2. file1 (stage3)
 i -> 3. file2 (stage1)
      4. file2 (stage2)
      5. file2 (stage3)

`file2' appears three times too, so it is merged.

With a naive rewrite, the index would look like this:

      0. file1 (stage0)
      1. file2 (stage1)
      2. file2 (stage2)
 i -> 3. file2 (stage3)

`file2' appears once at the iterator or after, so it will be added,
_not_ merged.  Which is wrong.

A naive rewrite would lead to improperly merged files, or even files not
handled at all.

This changes t6060 to reproduce this case, by creating 2 files instead
of 1, to check the correctness of the soon-to-be-rewritten merge-index.
The files are identical, which is not really important -- the factors
that could trigger this issue are that they should be separated by at
most one entry in the index, and that the first one in the index should
be trivially mergeable.
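
The scenario above can be reproduced with a toy model.  The following
is only a sketch under stated assumptions: `struct toy_index',
merge_all_toy() and the other names are hypothetical and are not Git's
data structures or functions.  It models an in-memory merge that
replaces the staged entries of a path with a single stage0 entry, and
compares an iterator that still skips the "stale" entries with one that
does not.

```c
#include <assert.h>
#include <string.h>

#define MAX_ENTRIES 16

/* Toy stand-ins for cache entries and the index (hypothetical). */
struct entry { const char *name; int stage; };
struct toy_index { struct entry e[MAX_ENTRIES]; int nr; };

/* Count the entries at or after `pos' that share the name found at `pos'. */
static int count_same_name(const struct toy_index *idx, int pos)
{
	int n = 0;

	while (pos + n < idx->nr &&
	       !strcmp(idx->e[pos + n].name, idx->e[pos].name))
		n++;
	return n;
}

/*
 * Model an in-memory merge: the `found' staged entries at `pos' collapse
 * into one stage0 entry, so the index shrinks by `found - 1' entries.
 */
static void resolve_in_place(struct toy_index *idx, int pos, int found)
{
	assert(found >= 1 && pos + found <= idx->nr);
	idx->e[pos].stage = 0;
	memmove(&idx->e[pos + 1], &idx->e[pos + found],
		(idx->nr - pos - found) * sizeof(idx->e[0]));
	idx->nr -= found - 1;
}

/*
 * Walk the index and "merge" every unmerged path.  Returns how many paths
 * were seen with all three stages.  With `skip_stale_entries' set, the
 * iterator advances as if the merged entries were still in the index.
 */
static int merge_all_toy(struct toy_index *idx, int skip_stale_entries)
{
	int i, merged = 0;

	for (i = 0; i < idx->nr; i++) {
		int found;

		if (!idx->e[i].stage)
			continue;
		found = count_same_name(idx, i);
		if (found == 3)
			merged++;
		resolve_in_place(idx, i, found);
		if (skip_stale_entries)
			i += found - 1;
	}
	return merged;
}

/* The six-entry index from the example: file1 and file2, stages 1 to 3. */
static struct toy_index two_conflicted_files(void)
{
	struct toy_index idx = {
		{ { "file1", 1 }, { "file1", 2 }, { "file1", 3 },
		  { "file2", 1 }, { "file2", 2 }, { "file2", 3 } },
		6
	};
	return idx;
}
```

With `skip_stale_entries' set, the iterator lands on the stage3 entry of
`file2' and sees it only once, as described above; without it, both
paths are seen with their three stages.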

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 t/t6060-merge-index.sh | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/t/t6060-merge-index.sh b/t/t6060-merge-index.sh
index ddf34f0115..9e15ceb957 100755
--- a/t/t6060-merge-index.sh
+++ b/t/t6060-merge-index.sh
@@ -7,16 +7,19 @@ test_expect_success 'setup diverging branches' '
 	for i in 1 2 3 4 5 6 7 8 9 10; do
 		echo $i
 	done >file &&
-	git add file &&
+	cp file file2 &&
+	git add file file2 &&
 	git commit -m base &&
 	git tag base &&
 	sed s/2/two/ <file >tmp &&
 	mv tmp file &&
+	cp file file2 &&
 	git commit -a -m two &&
 	git tag two &&
 	git checkout -b other HEAD^ &&
 	sed s/10/ten/ <file >tmp &&
 	mv tmp file &&
+	cp file file2 &&
 	git commit -a -m ten &&
 	git tag ten
 '
@@ -35,8 +38,11 @@ ten
 EOF
 
 test_expect_success 'read-tree does not resolve content merge' '
+	cat >expect <<-\EOF &&
+	file
+	file2
+	EOF
 	git read-tree -i -m base ten two &&
-	echo file >expect &&
 	git diff-files --name-only --diff-filter=U >unmerged &&
 	test_cmp expect unmerged
 '
-- 
2.29.2.260.ge31aba42fb



* [PATCH v6 03/13] update-index: move add_cacheinfo() to read-cache.c
  2020-11-24 11:53         ` [PATCH v6 00/13] Rewrite the remaining merge strategies from shell to C Alban Gruin
  2020-11-24 11:53           ` [PATCH v6 01/13] t6407: modernise tests Alban Gruin
  2020-11-24 11:53           ` [PATCH v6 02/13] t6060: modify multiple files to expose a possible issue with merge-index Alban Gruin
@ 2020-11-24 11:53           ` Alban Gruin
  2020-12-22 20:54             ` Junio C Hamano
  2020-11-24 11:53           ` [PATCH v6 04/13] merge-one-file: rewrite in C Alban Gruin
                             ` (12 subsequent siblings)
  15 siblings, 1 reply; 221+ messages in thread
From: Alban Gruin @ 2020-11-24 11:53 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Phillip Wood, Alban Gruin

This moves the function add_cacheinfo() that already exists in
update-index.c to read-cache.c, renames it add_to_index_cacheinfo(),
and adds an `istate' parameter.  The new cache entry is returned through
a pointer passed in the parameters.  The return value is either 0
(success), -1 (invalid path), or -2 (failed to add the file to the
index).
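
As an illustration of this return-value convention, here is a minimal
sketch using a stand-in function.  toy_add_cacheinfo(), `struct
toy_entry', the enum names and describe_add_result() are hypothetical;
only the three return values and the optional output pointer mirror the
description above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a cache entry. */
struct toy_entry { const char *path; };

/* The three outcomes described above, given names for readability. */
enum { ADD_OK = 0, ADD_INVALID_PATH = -1, ADD_FAILED = -2 };

/*
 * Stand-in for the moved function: validates the path, "adds" the entry,
 * and hands the new entry back through `pce' when the caller wants it.
 */
static int toy_add_cacheinfo(const char *path, int allow_add,
			     struct toy_entry *storage,
			     struct toy_entry **pce)
{
	if (!path || !*path)
		return ADD_INVALID_PATH;
	if (!allow_add)
		return ADD_FAILED;

	storage->path = path;
	if (pce)
		*pce = storage;
	return ADD_OK;
}

/* A caller can tell the two failure modes apart to pick its own message. */
static const char *describe_add_result(int res)
{
	switch (res) {
	case ADD_OK:
		return "added";
	case ADD_INVALID_PATH:
		return "invalid path";
	case ADD_FAILED:
		return "cannot add to the index";
	default:
		assert(0 && "unexpected return value");
		return "unknown";
	}
}
```

Keeping the two failures distinct lets a caller such as update-index
keep its own "missing --add option?" hint for the -2 case only.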

This will become useful in the next commit, when the three-way merge
will need to call this function.

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 builtin/update-index.c | 25 +++++++------------------
 cache.h                |  5 +++++
 read-cache.c           | 35 +++++++++++++++++++++++++++++++++++
 3 files changed, 47 insertions(+), 18 deletions(-)

diff --git a/builtin/update-index.c b/builtin/update-index.c
index 79087bccea..44862f5e1d 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -404,27 +404,16 @@ static int process_path(const char *path, struct stat *st, int stat_errno)
 static int add_cacheinfo(unsigned int mode, const struct object_id *oid,
 			 const char *path, int stage)
 {
-	int len, option;
-	struct cache_entry *ce;
+	int res;
 
-	if (!verify_path(path, mode))
-		return error("Invalid path '%s'", path);
-
-	len = strlen(path);
-	ce = make_empty_cache_entry(&the_index, len);
-
-	oidcpy(&ce->oid, oid);
-	memcpy(ce->name, path, len);
-	ce->ce_flags = create_ce_flags(stage);
-	ce->ce_namelen = len;
-	ce->ce_mode = create_ce_mode(mode);
-	if (assume_unchanged)
-		ce->ce_flags |= CE_VALID;
-	option = allow_add ? ADD_CACHE_OK_TO_ADD : 0;
-	option |= allow_replace ? ADD_CACHE_OK_TO_REPLACE : 0;
-	if (add_cache_entry(ce, option))
+	res = add_to_index_cacheinfo(&the_index, mode, oid, path, stage,
+				     allow_add, allow_replace, NULL);
+	if (res == -1)
+		return res;
+	if (res == -2)
 		return error("%s: cannot add to the index - missing --add option?",
 			     path);
+
 	report("add '%s'", path);
 	return 0;
 }
diff --git a/cache.h b/cache.h
index c0072d43b1..be16ab3215 100644
--- a/cache.h
+++ b/cache.h
@@ -830,6 +830,11 @@ int remove_file_from_index(struct index_state *, const char *path);
 int add_to_index(struct index_state *, const char *path, struct stat *, int flags);
 int add_file_to_index(struct index_state *, const char *path, int flags);
 
+int add_to_index_cacheinfo(struct index_state *, unsigned int mode,
+			   const struct object_id *oid, const char *path,
+			   int stage, int allow_add, int allow_replace,
+			   struct cache_entry **pce);
+
 int chmod_index_entry(struct index_state *, struct cache_entry *ce, char flip);
 int ce_same_name(const struct cache_entry *a, const struct cache_entry *b);
 void set_object_name_for_intent_to_add_entry(struct cache_entry *ce);
diff --git a/read-cache.c b/read-cache.c
index ecf6f68994..c25f951db4 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1350,6 +1350,41 @@ int add_index_entry(struct index_state *istate, struct cache_entry *ce, int opti
 	return 0;
 }
 
+int add_to_index_cacheinfo(struct index_state *istate, unsigned int mode,
+			   const struct object_id *oid, const char *path,
+			   int stage, int allow_add, int allow_replace,
+			   struct cache_entry **pce)
+{
+	int len, option;
+	struct cache_entry *ce = NULL;
+
+	if (!verify_path(path, mode))
+		return error(_("Invalid path '%s'"), path);
+
+	len = strlen(path);
+	ce = make_empty_cache_entry(istate, len);
+
+	oidcpy(&ce->oid, oid);
+	memcpy(ce->name, path, len);
+	ce->ce_flags = create_ce_flags(stage);
+	ce->ce_namelen = len;
+	ce->ce_mode = create_ce_mode(mode);
+	if (assume_unchanged)
+		ce->ce_flags |= CE_VALID;
+	option = allow_add ? ADD_CACHE_OK_TO_ADD : 0;
+	option |= allow_replace ? ADD_CACHE_OK_TO_REPLACE : 0;
+
+	if (add_index_entry(istate, ce, option)) {
+		discard_cache_entry(ce);
+		return -2;
+	}
+
+	if (pce)
+		*pce = ce;
+
+	return 0;
+}
+
 /*
  * "refresh" does not calculate a new sha1 file or bring the
  * cache up-to-date for mode/content changes. But what it
-- 
2.29.2.260.ge31aba42fb



* [PATCH v6 04/13] merge-one-file: rewrite in C
  2020-11-24 11:53         ` [PATCH v6 00/13] Rewrite the remaining merge strategies from shell to C Alban Gruin
                             ` (2 preceding siblings ...)
  2020-11-24 11:53           ` [PATCH v6 03/13] update-index: move add_cacheinfo() to read-cache.c Alban Gruin
@ 2020-11-24 11:53           ` Alban Gruin
  2020-12-22 21:36             ` Junio C Hamano
  2020-11-24 11:53           ` [PATCH v6 05/13] merge-index: libify merge_one_path() and merge_all() Alban Gruin
                             ` (11 subsequent siblings)
  15 siblings, 1 reply; 221+ messages in thread
From: Alban Gruin @ 2020-11-24 11:53 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Phillip Wood, Alban Gruin

This rewrites `git merge-one-file' from shell to C.  This port is not
completely straightforward: to save precious cycles by avoiding reading
and flushing the index repeatedly and writing temporary files when an
operation can be performed in-memory, and to allow other functions to
use the rewrite without forking or worrying about the index, the calls
to external processes are replaced by calls to functions in libgit.a:

 - calls to `update-index --add --cacheinfo' are replaced by calls to
   add_to_index_cacheinfo();

 - calls to `update-index --remove' are replaced by calls to
   remove_file_from_index();

 - calls to `checkout-index -u -f' are replaced by calls to
   checkout_entry();

 - calls to `unpack-file' and `merge-file' are replaced by calls to
   read_mmblob() and xdl_merge(), respectively, to merge files
   in-memory;

 - calls to `checkout-index -f --stage=2' are removed, as this is needed
   to have the correct permission bits on the merged file from the
   script, but not in the C version;

 - calls to `update-index' are replaced by calls to add_file_to_index().

The bulk of the rewrite is done in a new file in libgit.a,
merge-strategies.c.  This will enable the resolve and octopus strategies
to directly call it instead of forking.
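
The case analysis performed for each path can be sketched as follows.
This is a simplified illustration, not the code of this patch:
classify() and `enum toy_case' are hypothetical names, and blobs are
represented by plain strings (NULL meaning "missing") instead of object
ids.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical names for the situations a per-file merge distinguishes. */
enum toy_case {
	CASE_DELETED,            /* deleted in one, unchanged in the other */
	CASE_ADDED_BY_US,        /* only our side has the file */
	CASE_ADDED_BY_THEM,      /* only their side has the file */
	CASE_ADDED_IDENTICALLY,  /* both sides added the same content */
	CASE_CONTENT_MERGE,      /* both sides have it, contents differ */
	CASE_UNHANDLED
};

/* Two blob ids are the same when both are present and equal. */
static int same(const char *a, const char *b)
{
	return a && b && !strcmp(a, b);
}

/* Classify a path from the blob ids of the base, our and their version. */
static enum toy_case classify(const char *orig, const char *ours,
			      const char *theirs)
{
	/* At least one of the three versions has to exist. */
	assert(orig || ours || theirs);

	if (orig && ((ours && !theirs && same(orig, ours)) ||
		     (!ours && theirs && same(orig, theirs))))
		return CASE_DELETED;
	if (!orig && ours && !theirs)
		return CASE_ADDED_BY_US;
	if (!orig && !ours && theirs)
		return CASE_ADDED_BY_THEM;
	if (!orig && ours && theirs && same(ours, theirs))
		return CASE_ADDED_IDENTICALLY;
	if (ours && theirs)
		return CASE_CONTENT_MERGE;
	return CASE_UNHANDLED;
}
```

For instance, a file modified on our side and deleted on theirs matches
none of the trivial cases and ends up as CASE_UNHANDLED in this sketch.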

This also fixes a bug present in the original script: instead of
checking if a _regular_ file exists when a file exists in the branch to
merge, but not in our branch, the rewritten version checks if a file of
any kind (e.g. a directory) exists.  This fixes the test t6415.14,
where the branch to merge had a new file, `a/b', but our branch had a
directory there; it should have failed because a directory exists, but
it did not because there was no regular file called `a/b'.  This test is
now marked as successful.
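
The difference between the two checks can be sketched with plain POSIX
calls.  This is an illustration under assumptions: is_regular_file()
approximates what `test -f' does in the script, and anything_exists()
assumes that the existence check used by the C version behaves like an
lstat() call.  Both helpers and blocks_new_file() are hypothetical
names.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/stat.h>

/* Roughly `test -f path': the path exists and is a regular file. */
static int is_regular_file(const char *path)
{
	struct stat st;

	return !stat(path, &st) && S_ISREG(st.st_mode);
}

/* Something of any kind (file, directory, symlink, ...) is at `path'. */
static int anything_exists(const char *path)
{
	struct stat st;

	return !lstat(path, &st);
}

/*
 * Would checking out a new file at `path' clobber something that is
 * already in the working tree?  `regular_files_only' selects the
 * script's behaviour; otherwise the stricter check is used.
 */
static int blocks_new_file(const char *path, int regular_files_only)
{
	int blocked = regular_files_only ? is_regular_file(path)
					 : anything_exists(path);

	/* A regular file is also "something", never the other way around. */
	assert(!blocked || anything_exists(path));
	return blocked;
}
```

With a directory in the way, only the stricter check reports that the
merge would overwrite something, which is the situation exercised by
the test mentioned above.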

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 Makefile                        |   3 +-
 builtin.h                       |   1 +
 builtin/merge-one-file.c        |  94 +++++++++++++++++
 git-merge-one-file.sh           | 167 ------------------------------
 git.c                           |   1 +
 merge-strategies.c              | 178 ++++++++++++++++++++++++++++++++
 merge-strategies.h              |  12 +++
 t/t6415-merge-dir-to-symlink.sh |   2 +-
 8 files changed, 289 insertions(+), 169 deletions(-)
 create mode 100644 builtin/merge-one-file.c
 delete mode 100755 git-merge-one-file.sh
 create mode 100644 merge-strategies.c
 create mode 100644 merge-strategies.h

diff --git a/Makefile b/Makefile
index de53954590..6dfdb33cb2 100644
--- a/Makefile
+++ b/Makefile
@@ -601,7 +601,6 @@ SCRIPT_SH += git-bisect.sh
 SCRIPT_SH += git-difftool--helper.sh
 SCRIPT_SH += git-filter-branch.sh
 SCRIPT_SH += git-merge-octopus.sh
-SCRIPT_SH += git-merge-one-file.sh
 SCRIPT_SH += git-merge-resolve.sh
 SCRIPT_SH += git-mergetool.sh
 SCRIPT_SH += git-quiltimport.sh
@@ -909,6 +908,7 @@ LIB_OBJS += match-trees.o
 LIB_OBJS += mem-pool.o
 LIB_OBJS += merge-blobs.o
 LIB_OBJS += merge-recursive.o
+LIB_OBJS += merge-strategies.o
 LIB_OBJS += merge.o
 LIB_OBJS += mergesort.o
 LIB_OBJS += midx.o
@@ -1094,6 +1094,7 @@ BUILTIN_OBJS += builtin/mailsplit.o
 BUILTIN_OBJS += builtin/merge-base.o
 BUILTIN_OBJS += builtin/merge-file.o
 BUILTIN_OBJS += builtin/merge-index.o
+BUILTIN_OBJS += builtin/merge-one-file.o
 BUILTIN_OBJS += builtin/merge-ours.o
 BUILTIN_OBJS += builtin/merge-recursive.o
 BUILTIN_OBJS += builtin/merge-tree.o
diff --git a/builtin.h b/builtin.h
index 53fb290963..4d2cd78856 100644
--- a/builtin.h
+++ b/builtin.h
@@ -178,6 +178,7 @@ int cmd_merge_base(int argc, const char **argv, const char *prefix);
 int cmd_merge_index(int argc, const char **argv, const char *prefix);
 int cmd_merge_ours(int argc, const char **argv, const char *prefix);
 int cmd_merge_file(int argc, const char **argv, const char *prefix);
+int cmd_merge_one_file(int argc, const char **argv, const char *prefix);
 int cmd_merge_recursive(int argc, const char **argv, const char *prefix);
 int cmd_merge_tree(int argc, const char **argv, const char *prefix);
 int cmd_mktag(int argc, const char **argv, const char *prefix);
diff --git a/builtin/merge-one-file.c b/builtin/merge-one-file.c
new file mode 100644
index 0000000000..9c21778e1d
--- /dev/null
+++ b/builtin/merge-one-file.c
@@ -0,0 +1,94 @@
+/*
+ * Builtin "git merge-one-file"
+ *
+ * Copyright (c) 2020 Alban Gruin
+ *
+ * Based on git-merge-one-file.sh, written by Linus Torvalds.
+ *
+ * This is the git per-file merge utility, called with
+ *
+ *   argv[1] - original file object name (or empty)
+ *   argv[2] - file in branch1 object name (or empty)
+ *   argv[3] - file in branch2 object name (or empty)
+ *   argv[4] - pathname in repository
+ *   argv[5] - original file mode (or empty)
+ *   argv[6] - file in branch1 mode (or empty)
+ *   argv[7] - file in branch2 mode (or empty)
+ *
+ * Handle some trivial cases. The _really_ trivial cases have been
+ * handled already by git read-tree, but that one doesn't do any merges
+ * that might change the tree layout.
+ */
+
+#define USE_THE_INDEX_COMPATIBILITY_MACROS
+#include "cache.h"
+#include "builtin.h"
+#include "lockfile.h"
+#include "merge-strategies.h"
+
+static const char builtin_merge_one_file_usage[] =
+	"git merge-one-file <orig blob> <our blob> <their blob> <path> "
+	"<orig mode> <our mode> <their mode>\n\n"
+	"Blob ids and modes should be empty for missing files.";
+
+static int read_mode(const char *name, const char *arg, unsigned int *mode)
+{
+	char *last;
+	int ret = 0;
+
+	*mode = strtol(arg, &last, 8);
+
+	if (*last)
+		ret = error(_("invalid '%s' mode: expected nothing, got '%c'"), name, *last);
+	else if (!(S_ISREG(*mode) || S_ISDIR(*mode) || S_ISLNK(*mode)))
+		ret = error(_("invalid '%s' mode: %o"), name, *mode);
+
+	return ret;
+}
+
+int cmd_merge_one_file(int argc, const char **argv, const char *prefix)
+{
+	struct object_id orig_blob, our_blob, their_blob,
+		*p_orig_blob = NULL, *p_our_blob = NULL, *p_their_blob = NULL;
+	unsigned int orig_mode = 0, our_mode = 0, their_mode = 0, ret = 0;
+	struct lock_file lock = LOCK_INIT;
+
+	if (argc != 8)
+		usage(builtin_merge_one_file_usage);
+
+	if (read_cache() < 0)
+		die("invalid index");
+
+	hold_locked_index(&lock, LOCK_DIE_ON_ERROR);
+
+	if (!get_oid_hex(argv[1], &orig_blob)) {
+		p_orig_blob = &orig_blob;
+		ret = read_mode("orig", argv[5], &orig_mode);
+	} else if (!*argv[1] && *argv[5])
+		ret = error(_("no 'orig' object id given, but a mode was still given."));
+
+	if (!get_oid_hex(argv[2], &our_blob)) {
+		p_our_blob = &our_blob;
+		ret = read_mode("our", argv[6], &our_mode);
+	} else if (!*argv[2] && *argv[6])
+		ret = error(_("no 'our' object id given, but a mode was still given."));
+
+	if (!get_oid_hex(argv[3], &their_blob)) {
+		p_their_blob = &their_blob;
+		ret = read_mode("their", argv[7], &their_mode);
+	} else if (!*argv[3] && *argv[7])
+		ret = error(_("no 'their' object id given, but a mode was still given."));
+
+	if (ret)
+		return ret;
+
+	ret = merge_three_way(the_repository, p_orig_blob, p_our_blob, p_their_blob,
+			      argv[4], orig_mode, our_mode, their_mode);
+
+	if (ret) {
+		rollback_lock_file(&lock);
+		return !!ret;
+	}
+
+	return write_locked_index(&the_index, &lock, COMMIT_LOCK);
+}
diff --git a/git-merge-one-file.sh b/git-merge-one-file.sh
deleted file mode 100755
index f6d9852d2f..0000000000
--- a/git-merge-one-file.sh
+++ /dev/null
@@ -1,167 +0,0 @@
-#!/bin/sh
-#
-# Copyright (c) Linus Torvalds, 2005
-#
-# This is the git per-file merge script, called with
-#
-#   $1 - original file SHA1 (or empty)
-#   $2 - file in branch1 SHA1 (or empty)
-#   $3 - file in branch2 SHA1 (or empty)
-#   $4 - pathname in repository
-#   $5 - original file mode (or empty)
-#   $6 - file in branch1 mode (or empty)
-#   $7 - file in branch2 mode (or empty)
-#
-# Handle some trivial cases.. The _really_ trivial cases have
-# been handled already by git read-tree, but that one doesn't
-# do any merges that might change the tree layout.
-
-USAGE='<orig blob> <our blob> <their blob> <path>'
-USAGE="$USAGE <orig mode> <our mode> <their mode>"
-LONG_USAGE="usage: git merge-one-file $USAGE
-
-Blob ids and modes should be empty for missing files."
-
-SUBDIRECTORY_OK=Yes
-. git-sh-setup
-cd_to_toplevel
-require_work_tree
-
-if test $# != 7
-then
-	echo "$LONG_USAGE"
-	exit 1
-fi
-
-case "${1:-.}${2:-.}${3:-.}" in
-#
-# Deleted in both or deleted in one and unchanged in the other
-#
-"$1.." | "$1.$1" | "$1$1.")
-	if { test -z "$6" && test "$5" != "$7"; } ||
-	   { test -z "$7" && test "$5" != "$6"; }
-	then
-		echo "ERROR: File $4 deleted on one branch but had its" >&2
-		echo "ERROR: permissions changed on the other." >&2
-		exit 1
-	fi
-
-	if test -n "$2"
-	then
-		echo "Removing $4"
-	else
-		# read-tree checked that index matches HEAD already,
-		# so we know we do not have this path tracked.
-		# there may be an unrelated working tree file here,
-		# which we should just leave unmolested.  Make sure
-		# we do not have it in the index, though.
-		exec git update-index --remove -- "$4"
-	fi
-	if test -f "$4"
-	then
-		rm -f -- "$4" &&
-		rmdir -p "$(expr "z$4" : 'z\(.*\)/')" 2>/dev/null || :
-	fi &&
-		exec git update-index --remove -- "$4"
-	;;
-
-#
-# Added in one.
-#
-".$2.")
-	# the other side did not add and we added so there is nothing
-	# to be done, except making the path merged.
-	exec git update-index --add --cacheinfo "$6" "$2" "$4"
-	;;
-"..$3")
-	echo "Adding $4"
-	if test -f "$4"
-	then
-		echo "ERROR: untracked $4 is overwritten by the merge." >&2
-		exit 1
-	fi
-	git update-index --add --cacheinfo "$7" "$3" "$4" &&
-		exec git checkout-index -u -f -- "$4"
-	;;
-
-#
-# Added in both, identically (check for same permissions).
-#
-".$3$2")
-	if test "$6" != "$7"
-	then
-		echo "ERROR: File $4 added identically in both branches," >&2
-		echo "ERROR: but permissions conflict $6->$7." >&2
-		exit 1
-	fi
-	echo "Adding $4"
-	git update-index --add --cacheinfo "$6" "$2" "$4" &&
-		exec git checkout-index -u -f -- "$4"
-	;;
-
-#
-# Modified in both, but differently.
-#
-"$1$2$3" | ".$2$3")
-
-	case ",$6,$7," in
-	*,120000,*)
-		echo "ERROR: $4: Not merging symbolic link changes." >&2
-		exit 1
-		;;
-	*,160000,*)
-		echo "ERROR: $4: Not merging conflicting submodule changes." >&2
-		exit 1
-		;;
-	esac
-
-	src1=$(git unpack-file $2)
-	src2=$(git unpack-file $3)
-	case "$1" in
-	'')
-		echo "Added $4 in both, but differently."
-		orig=$(git unpack-file $(git hash-object /dev/null))
-		;;
-	*)
-		echo "Auto-merging $4"
-		orig=$(git unpack-file $1)
-		;;
-	esac
-
-	git merge-file "$src1" "$orig" "$src2"
-	ret=$?
-	msg=
-	if test $ret != 0 || test -z "$1"
-	then
-		msg='content conflict'
-		ret=1
-	fi
-
-	# Create the working tree file, using "our tree" version from the
-	# index, and then store the result of the merge.
-	git checkout-index -f --stage=2 -- "$4" && cat "$src1" >"$4" || exit 1
-	rm -f -- "$orig" "$src1" "$src2"
-
-	if test "$6" != "$7"
-	then
-		if test -n "$msg"
-		then
-			msg="$msg, "
-		fi
-		msg="${msg}permissions conflict: $5->$6,$7"
-		ret=1
-	fi
-
-	if test $ret != 0
-	then
-		echo "ERROR: $msg in $4" >&2
-		exit 1
-	fi
-	exec git update-index -- "$4"
-	;;
-
-*)
-	echo "ERROR: $4: Not handling case $1 -> $2 -> $3" >&2
-	;;
-esac
-exit 1
diff --git a/git.c b/git.c
index f1e8b56d99..a4d3f98094 100644
--- a/git.c
+++ b/git.c
@@ -540,6 +540,7 @@ static struct cmd_struct commands[] = {
 	{ "merge-file", cmd_merge_file, RUN_SETUP_GENTLY },
 	{ "merge-index", cmd_merge_index, RUN_SETUP | NO_PARSEOPT },
 	{ "merge-ours", cmd_merge_ours, RUN_SETUP | NO_PARSEOPT },
+	{ "merge-one-file", cmd_merge_one_file, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
 	{ "merge-recursive", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
 	{ "merge-recursive-ours", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
 	{ "merge-recursive-theirs", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
diff --git a/merge-strategies.c b/merge-strategies.c
new file mode 100644
index 0000000000..20a328bf57
--- /dev/null
+++ b/merge-strategies.c
@@ -0,0 +1,178 @@
+#include "cache.h"
+#include "dir.h"
+#include "merge-strategies.h"
+#include "xdiff-interface.h"
+
+static int checkout_from_index(struct index_state *istate, const char *path,
+			       struct cache_entry *ce)
+{
+	struct checkout state = CHECKOUT_INIT;
+
+	state.istate = istate;
+	state.force = 1;
+	state.base_dir = "";
+	state.base_dir_len = 0;
+
+	if (checkout_entry(ce, &state, NULL, NULL) < 0)
+		return error(_("%s: cannot checkout file"), path);
+	return 0;
+}
+
+static int merge_one_file_deleted(struct index_state *istate,
+				  const struct object_id *our_blob,
+				  const struct object_id *their_blob, const char *path,
+				  unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode)
+{
+	if ((our_blob && orig_mode != our_mode) ||
+	    (their_blob && orig_mode != their_mode))
+		return error(_("File %s deleted on one branch but had its "
+			       "permissions changed on the other."), path);
+
+	if (our_blob) {
+		printf(_("Removing %s\n"), path);
+
+		if (file_exists(path))
+			remove_path(path);
+	}
+
+	if (remove_file_from_index(istate, path))
+		return error("%s: cannot remove from the index", path);
+	return 0;
+}
+
+static int do_merge_one_file(struct index_state *istate,
+			     const struct object_id *orig_blob,
+			     const struct object_id *our_blob,
+			     const struct object_id *their_blob, const char *path,
+			     unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode)
+{
+	int ret, i, dest;
+	ssize_t written;
+	mmbuffer_t result = {NULL, 0};
+	mmfile_t mmfs[3];
+	xmparam_t xmp = {{0}};
+
+	if (our_mode == S_IFLNK || their_mode == S_IFLNK)
+		return error(_("%s: Not merging symbolic link changes."), path);
+	else if (our_mode == S_IFGITLINK || their_mode == S_IFGITLINK)
+		return error(_("%s: Not merging conflicting submodule changes."), path);
+
+	if (orig_blob) {
+		printf(_("Auto-merging %s\n"), path);
+		read_mmblob(mmfs + 0, orig_blob);
+	} else {
+		printf(_("Added %s in both, but differently.\n"), path);
+		read_mmblob(mmfs + 0, &null_oid);
+	}
+
+	read_mmblob(mmfs + 1, our_blob);
+	read_mmblob(mmfs + 2, their_blob);
+
+	xmp.level = XDL_MERGE_ZEALOUS_ALNUM;
+	xmp.style = 0;
+	xmp.favor = 0;
+
+	ret = xdl_merge(mmfs + 0, mmfs + 1, mmfs + 2, &xmp, &result);
+
+	for (i = 0; i < 3; i++)
+		free(mmfs[i].ptr);
+
+	if (ret < 0) {
+		free(result.ptr);
+		return error(_("Failed to execute internal merge"));
+	}
+
+	if (ret > 0 || !orig_blob)
+		ret = error(_("content conflict in %s"), path);
+	if (our_mode != their_mode)
+		ret = error(_("permission conflict: %o->%o,%o in %s"),
+			    orig_mode, our_mode, their_mode, path);
+
+	unlink(path);
+	if ((dest = open(path, O_WRONLY | O_CREAT, our_mode)) < 0) {
+		free(result.ptr);
+		return error_errno(_("failed to open file '%s'"), path);
+	}
+
+	written = write_in_full(dest, result.ptr, result.size);
+	close(dest);
+
+	free(result.ptr);
+
+	if (written < 0)
+		return error_errno(_("failed to write to '%s'"), path);
+	if (ret)
+		return ret;
+
+	return add_file_to_index(istate, path, 0);
+}
+
+int merge_three_way(struct repository *r,
+		    const struct object_id *orig_blob,
+		    const struct object_id *our_blob,
+		    const struct object_id *their_blob, const char *path,
+		    unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode)
+{
+	if (orig_blob &&
+	    ((!their_blob && our_blob && oideq(orig_blob, our_blob)) ||
+	     (!our_blob && their_blob && oideq(orig_blob, their_blob)))) {
+		/* Deleted in both or deleted in one and unchanged in the other. */
+		return merge_one_file_deleted(r->index, our_blob, their_blob, path,
+					      orig_mode, our_mode, their_mode);
+	} else if (!orig_blob && our_blob && !their_blob) {
+		/*
+		 * Added in one.  The other side did not add and we
+		 * added so there is nothing to be done, except making
+		 * the path merged.
+		 */
+		return add_to_index_cacheinfo(r->index, our_mode, our_blob,
+					      path, 0, 1, 1, NULL);
+	} else if (!orig_blob && !our_blob && their_blob) {
+		struct cache_entry *ce;
+		printf(_("Adding %s\n"), path);
+
+		if (file_exists(path))
+			return error(_("untracked %s is overwritten by the merge."), path);
+
+		if (add_to_index_cacheinfo(r->index, their_mode, their_blob,
+					   path, 0, 1, 1, &ce))
+			return -1;
+		return checkout_from_index(r->index, path, ce);
+	} else if (!orig_blob && our_blob && their_blob &&
+		   oideq(our_blob, their_blob)) {
+		struct cache_entry *ce;
+
+		/* Added in both, identically (check for same permissions). */
+		if (our_mode != their_mode)
+			return error(_("File %s added identically in both branches, "
+				       "but permissions conflict %o->%o."),
+				     path, our_mode, their_mode);
+
+		printf(_("Adding %s\n"), path);
+
+		if (add_to_index_cacheinfo(r->index, our_mode, our_blob,
+					   path, 0, 1, 1, &ce))
+			return -1;
+		return checkout_from_index(r->index, path, ce);
+	} else if (our_blob && their_blob) {
+		/* Modified in both, but differently. */
+		return do_merge_one_file(r->index,
+					 orig_blob, our_blob, their_blob, path,
+					 orig_mode, our_mode, their_mode);
+	} else {
+		char orig_hex[GIT_MAX_HEXSZ + 1] = {0}, our_hex[GIT_MAX_HEXSZ + 1] = {0},
+			their_hex[GIT_MAX_HEXSZ + 1] = {0};
+
+		if (orig_blob)
+			oid_to_hex_r(orig_hex, orig_blob);
+		if (our_blob)
+			oid_to_hex_r(our_hex, our_blob);
+		if (their_blob)
+			oid_to_hex_r(their_hex, their_blob);
+
+		return error(_("%s: Not handling case %s -> %s -> %s"),
+			     path, orig_hex, our_hex, their_hex);
+	}
+
+	return 0;
+}
diff --git a/merge-strategies.h b/merge-strategies.h
new file mode 100644
index 0000000000..e624c4f27c
--- /dev/null
+++ b/merge-strategies.h
@@ -0,0 +1,12 @@
+#ifndef MERGE_STRATEGIES_H
+#define MERGE_STRATEGIES_H
+
+#include "object.h"
+
+int merge_three_way(struct repository *r,
+		    const struct object_id *orig_blob,
+		    const struct object_id *our_blob,
+		    const struct object_id *their_blob, const char *path,
+		    unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode);
+
+#endif /* MERGE_STRATEGIES_H */
diff --git a/t/t6415-merge-dir-to-symlink.sh b/t/t6415-merge-dir-to-symlink.sh
index 2eddcc7664..5fb74e39a0 100755
--- a/t/t6415-merge-dir-to-symlink.sh
+++ b/t/t6415-merge-dir-to-symlink.sh
@@ -94,7 +94,7 @@ test_expect_success SYMLINKS 'a/b was resolved as symlink' '
 	test -h a/b
 '
 
-test_expect_failure 'do not lose untracked in merge (resolve)' '
+test_expect_success 'do not lose untracked in merge (resolve)' '
 	git reset --hard &&
 	git checkout baseline^0 &&
 	>a/b/c/e &&
-- 
2.29.2.260.ge31aba42fb



* [PATCH v6 05/13] merge-index: libify merge_one_path() and merge_all()
  2020-11-24 11:53         ` [PATCH v6 00/13] Rewrite the remaining merge strategies from shell to C Alban Gruin
                             ` (3 preceding siblings ...)
  2020-11-24 11:53           ` [PATCH v6 04/13] merge-one-file: rewrite in C Alban Gruin
@ 2020-11-24 11:53           ` Alban Gruin
  2021-01-05 15:59             ` Derrick Stolee
  2020-11-24 11:53           ` [PATCH v6 06/13] merge-index: don't fork if the requested program is `git-merge-one-file' Alban Gruin
                             ` (10 subsequent siblings)
  15 siblings, 1 reply; 221+ messages in thread
From: Alban Gruin @ 2020-11-24 11:53 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Phillip Wood, Alban Gruin

The "resolve" and "octopus" merge strategies do not call `git
merge-one-file' directly; they delegate the work to another git command,
`git merge-index', which loops over files in the index and calls the
specified command.  Unfortunately, the functions implementing this loop
are not part of libgit.a, which means that once rewritten, the
strategies would still have to invoke `merge-one-file' by spawning a new
process first.

To avoid this, this moves merge_one_path(), merge_all(), and their
helpers to merge-strategies.c and renames them.  They also take a
callback to dictate what they should do for each file.  For now, to
preserve the behaviour of `merge-index', only one callback, launching a
new process, is defined.
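
The callback approach can be illustrated with a small, self-contained
sketch.  The names below (toy_merge_fn, toy_merge_paths(), `struct
toy_stats') are hypothetical and do not reproduce the signatures
introduced by this patch.  As a simplification, the loop stops at the
first failure when `one_shot' is not set, where `merge-index' would exit
instead.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-path callback: returns non-zero on failure. */
typedef int (*toy_merge_fn)(const char *path, void *data);

/*
 * Call `fn' for every path.  In one-shot mode, failures are counted and
 * the loop goes on; otherwise it stops at the first failure.  Returns
 * the number of failed paths.
 */
static int toy_merge_paths(const char **paths, size_t nr, int one_shot,
			   toy_merge_fn fn, void *data)
{
	size_t i;
	int err = 0;

	assert(fn);
	for (i = 0; i < nr; i++) {
		if (fn(paths[i], data)) {
			err++;
			if (!one_shot)
				break;
		}
	}
	return err;
}

/* Example callback state, passed through the opaque `data' pointer. */
struct toy_stats { int calls; char bad_prefix; };

/* Example callback: fails for paths starting with `bad_prefix'. */
static int toy_fail_on_prefix(const char *path, void *data)
{
	struct toy_stats *stats = data;

	stats->calls++;
	return path[0] == stats->bad_prefix;
}
```

The opaque `data' pointer plays the role of the program name handed to
the process-spawning callback: the loop does not need to know what the
callback does with it.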

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 builtin/merge-index.c |  77 +++----------------------------
 merge-strategies.c    | 104 ++++++++++++++++++++++++++++++++++++++++++
 merge-strategies.h    |  19 ++++++++
 3 files changed, 130 insertions(+), 70 deletions(-)

diff --git a/builtin/merge-index.c b/builtin/merge-index.c
index 38ea6ad6ca..d5e5713b25 100644
--- a/builtin/merge-index.c
+++ b/builtin/merge-index.c
@@ -1,74 +1,11 @@
 #define USE_THE_INDEX_COMPATIBILITY_MACROS
 #include "builtin.h"
-#include "run-command.h"
-
-static const char *pgm;
-static int one_shot, quiet;
-static int err;
-
-static int merge_entry(int pos, const char *path)
-{
-	int found;
-	const char *arguments[] = { pgm, "", "", "", path, "", "", "", NULL };
-	char hexbuf[4][GIT_MAX_HEXSZ + 1];
-	char ownbuf[4][60];
-
-	if (pos >= active_nr)
-		die("git merge-index: %s not in the cache", path);
-	found = 0;
-	do {
-		const struct cache_entry *ce = active_cache[pos];
-		int stage = ce_stage(ce);
-
-		if (strcmp(ce->name, path))
-			break;
-		found++;
-		oid_to_hex_r(hexbuf[stage], &ce->oid);
-		xsnprintf(ownbuf[stage], sizeof(ownbuf[stage]), "%o", ce->ce_mode);
-		arguments[stage] = hexbuf[stage];
-		arguments[stage + 4] = ownbuf[stage];
-	} while (++pos < active_nr);
-	if (!found)
-		die("git merge-index: %s not in the cache", path);
-
-	if (run_command_v_opt(arguments, 0)) {
-		if (one_shot)
-			err++;
-		else {
-			if (!quiet)
-				die("merge program failed");
-			exit(1);
-		}
-	}
-	return found;
-}
-
-static void merge_one_path(const char *path)
-{
-	int pos = cache_name_pos(path, strlen(path));
-
-	/*
-	 * If it already exists in the cache as stage0, it's
-	 * already merged and there is nothing to do.
-	 */
-	if (pos < 0)
-		merge_entry(-pos-1, path);
-}
-
-static void merge_all(void)
-{
-	int i;
-	for (i = 0; i < active_nr; i++) {
-		const struct cache_entry *ce = active_cache[i];
-		if (!ce_stage(ce))
-			continue;
-		i += merge_entry(i, ce->name)-1;
-	}
-}
+#include "merge-strategies.h"
 
 int cmd_merge_index(int argc, const char **argv, const char *prefix)
 {
-	int i, force_file = 0;
+	int i, force_file = 0, err = 0, one_shot = 0, quiet = 0;
+	const char *pgm;
 
 	/* Without this we cannot rely on waitpid() to tell
 	 * what happened to our children.
@@ -98,14 +35,14 @@ int cmd_merge_index(int argc, const char **argv, const char *prefix)
 				continue;
 			}
 			if (!strcmp(arg, "-a")) {
-				merge_all();
+				err |= merge_all_index(the_repository, one_shot, quiet,
+						       merge_one_file_spawn, (void *)pgm);
 				continue;
 			}
 			die("git merge-index: unknown option %s", arg);
 		}
-		merge_one_path(arg);
+		err |= merge_index_path(the_repository, one_shot, quiet, arg,
+					merge_one_file_spawn, (void *)pgm);
 	}
-	if (err && !quiet)
-		die("merge program failed");
 	return err;
 }
diff --git a/merge-strategies.c b/merge-strategies.c
index 20a328bf57..6f27e66dfe 100644
--- a/merge-strategies.c
+++ b/merge-strategies.c
@@ -1,6 +1,7 @@
 #include "cache.h"
 #include "dir.h"
 #include "merge-strategies.h"
+#include "run-command.h"
 #include "xdiff-interface.h"
 
 static int checkout_from_index(struct index_state *istate, const char *path,
@@ -176,3 +177,106 @@ int merge_three_way(struct repository *r,
 
 	return 0;
 }
+
+int merge_one_file_spawn(struct repository *r,
+			 const struct object_id *orig_blob,
+			 const struct object_id *our_blob,
+			 const struct object_id *their_blob, const char *path,
+			 unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
+			 void *data)
+{
+	char oids[3][GIT_MAX_HEXSZ + 1] = {{0}};
+	char modes[3][10] = {{0}};
+	const char *arguments[] = { (char *)data, oids[0], oids[1], oids[2],
+				    path, modes[0], modes[1], modes[2], NULL };
+
+	if (orig_blob) {
+		oid_to_hex_r(oids[0], orig_blob);
+		xsnprintf(modes[0], sizeof(modes[0]), "%06o", orig_mode);
+	}
+
+	if (our_blob) {
+		oid_to_hex_r(oids[1], our_blob);
+		xsnprintf(modes[1], sizeof(modes[1]), "%06o", our_mode);
+	}
+
+	if (their_blob) {
+		oid_to_hex_r(oids[2], their_blob);
+		xsnprintf(modes[2], sizeof(modes[2]), "%06o", their_mode);
+	}
+
+	return run_command_v_opt(arguments, 0);
+}
+
+static int merge_entry(struct repository *r, int quiet, unsigned int pos,
+		       const char *path, int *err, merge_fn fn, void *data)
+{
+	int found = 0;
+	const struct object_id *oids[3] = {NULL};
+	unsigned int modes[3] = {0};
+
+	do {
+		const struct cache_entry *ce = r->index->cache[pos];
+		int stage = ce_stage(ce);
+
+		if (strcmp(ce->name, path))
+			break;
+		found++;
+		oids[stage - 1] = &ce->oid;
+		modes[stage - 1] = ce->ce_mode;
+	} while (++pos < r->index->cache_nr);
+	if (!found)
+		return error(_("%s is not in the cache"), path);
+
+	if (fn(r, oids[0], oids[1], oids[2], path,
+	       modes[0], modes[1], modes[2], data)) {
+		if (!quiet)
+			error(_("Merge program failed"));
+		(*err)++;
+	}
+
+	return found;
+}
+
+int merge_index_path(struct repository *r, int oneshot, int quiet,
+		     const char *path, merge_fn fn, void *data)
+{
+	int pos = index_name_pos(r->index, path, strlen(path)), ret, err = 0;
+
+	/*
+	 * If it already exists in the cache as stage0, it's
+	 * already merged and there is nothing to do.
+	 */
+	if (pos < 0) {
+		ret = merge_entry(r, quiet || oneshot, -pos - 1, path, &err, fn, data);
+		if (ret == -1)
+			return -1;
+		else if (err)
+			return 1;
+	}
+	return 0;
+}
+
+int merge_all_index(struct repository *r, int oneshot, int quiet,
+		    merge_fn fn, void *data)
+{
+	int err = 0, ret;
+	unsigned int i;
+
+	for (i = 0; i < r->index->cache_nr; i++) {
+		const struct cache_entry *ce = r->index->cache[i];
+		if (!ce_stage(ce))
+			continue;
+
+		ret = merge_entry(r, quiet || oneshot, i, ce->name, &err, fn, data);
+		if (ret > 0)
+			i += ret - 1;
+		else if (ret == -1)
+			return -1;
+
+		if (err && !oneshot)
+			return 1;
+	}
+
+	return err;
+}
diff --git a/merge-strategies.h b/merge-strategies.h
index e624c4f27c..94c40635c4 100644
--- a/merge-strategies.h
+++ b/merge-strategies.h
@@ -9,4 +9,23 @@ int merge_three_way(struct repository *r,
 		    const struct object_id *their_blob, const char *path,
 		    unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode);
 
+typedef int (*merge_fn)(struct repository *r,
+			const struct object_id *orig_blob,
+			const struct object_id *our_blob,
+			const struct object_id *their_blob, const char *path,
+			unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
+			void *data);
+
+int merge_one_file_spawn(struct repository *r,
+			 const struct object_id *orig_blob,
+			 const struct object_id *our_blob,
+			 const struct object_id *their_blob, const char *path,
+			 unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
+			 void *data);
+
+int merge_index_path(struct repository *r, int oneshot, int quiet,
+		     const char *path, merge_fn fn, void *data);
+int merge_all_index(struct repository *r, int oneshot, int quiet,
+		    merge_fn fn, void *data);
+
 #endif /* MERGE_STRATEGIES_H */
-- 
2.29.2.260.ge31aba42fb



* [PATCH v6 06/13] merge-index: don't fork if the requested program is `git-merge-one-file'
  2020-11-24 11:53         ` [PATCH v6 00/13] Rewrite the remaining merge strategies from shell to C Alban Gruin
                             ` (4 preceding siblings ...)
  2020-11-24 11:53           ` [PATCH v6 05/13] merge-index: libify merge_one_path() and merge_all() Alban Gruin
@ 2020-11-24 11:53           ` Alban Gruin
  2021-01-05 16:11             ` Derrick Stolee
  2020-11-24 11:53           ` [PATCH v6 07/13] merge-resolve: rewrite in C Alban Gruin
                             ` (9 subsequent siblings)
  15 siblings, 1 reply; 221+ messages in thread
From: Alban Gruin @ 2020-11-24 11:53 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Phillip Wood, Alban Gruin

Since `git-merge-one-file' has been rewritten and libified, this teaches
`merge-index' to call merge_three_way() without forking using a new
callback, merge_one_file_func().

To avoid any issue with a shrinking index because of the merge function
used (directly in the process or by forking), as described earlier, the
iterator of the loop of merge_all_index() is increased by the number of
entries with the same name, minus the difference between the number of
entries in the index before and after the merge.

This should handle a shrinking index correctly, but could lead to issues
with a growing index.  However, this case is not handled, as there is no
callback that can produce such a case.
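
The iterator arithmetic described above can be checked in isolation with
a small sketch.  The helper next_entry_pos() is hypothetical (it does
not exist in git); it assumes unsigned positions as in the patch and
folds the loop's own increment into the result, so it returns the
position of the next entry to examine.

```c
#include <assert.h>

/*
 * i:       position of the first entry of the path just merged
 * found:   number of entries that shared that path before the merge
 * prev_nr: number of index entries before the merge
 * cur_nr:  number of index entries after the merge
 *
 * Equivalent to "i += found - (prev_nr - cur_nr) - 1" followed by the
 * loop's "i++".  A growing index is deliberately not supported.
 */
static unsigned int next_entry_pos(unsigned int i, unsigned int found,
				   unsigned int prev_nr, unsigned int cur_nr)
{
	assert(cur_nr <= prev_nr);
	return i + found - (prev_nr - cur_nr);
}
```

For instance, with three staged entries at positions 2 to 4 and an
untouched index (the forking case), the next position is 5; if the
merge collapses them into a single entry, the index shrinks by two and
the next position is 3.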

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 builtin/merge-index.c | 28 ++++++++++++++++++++++++++--
 merge-strategies.c    | 25 +++++++++++++++++++++----
 merge-strategies.h    |  7 +++++++
 3 files changed, 54 insertions(+), 6 deletions(-)

diff --git a/builtin/merge-index.c b/builtin/merge-index.c
index d5e5713b25..60fcde579f 100644
--- a/builtin/merge-index.c
+++ b/builtin/merge-index.c
@@ -1,11 +1,15 @@
 #define USE_THE_INDEX_COMPATIBILITY_MACROS
 #include "builtin.h"
+#include "lockfile.h"
 #include "merge-strategies.h"
 
 int cmd_merge_index(int argc, const char **argv, const char *prefix)
 {
 	int i, force_file = 0, err = 0, one_shot = 0, quiet = 0;
 	const char *pgm;
+	void *data = NULL;
+	merge_fn merge_action;
+	struct lock_file lock = LOCK_INIT;
 
 	/* Without this we cannot rely on waitpid() to tell
 	 * what happened to our children.
@@ -26,7 +30,18 @@ int cmd_merge_index(int argc, const char **argv, const char *prefix)
 		quiet = 1;
 		i++;
 	}
+
 	pgm = argv[i++];
+	setup_work_tree();
+
+	if (!strcmp(pgm, "git-merge-one-file")) {
+		merge_action = merge_one_file_func;
+		hold_locked_index(&lock, LOCK_DIE_ON_ERROR);
+	} else {
+		merge_action = merge_one_file_spawn;
+		data = (void *)pgm;
+	}
+
 	for (; i < argc; i++) {
 		const char *arg = argv[i];
 		if (!force_file && *arg == '-') {
@@ -36,13 +51,22 @@ int cmd_merge_index(int argc, const char **argv, const char *prefix)
 			}
 			if (!strcmp(arg, "-a")) {
 				err |= merge_all_index(the_repository, one_shot, quiet,
-						       merge_one_file_spawn, (void *)pgm);
+						       merge_action, data);
 				continue;
 			}
 			die("git merge-index: unknown option %s", arg);
 		}
 		err |= merge_index_path(the_repository, one_shot, quiet, arg,
-					merge_one_file_spawn, (void *)pgm);
+					merge_action, data);
+	}
+
+	if (merge_action == merge_one_file_func) {
+		if (err) {
+			rollback_lock_file(&lock);
+			return err;
+		}
+
+		return write_locked_index(&the_index, &lock, COMMIT_LOCK);
 	}
 	return err;
 }
diff --git a/merge-strategies.c b/merge-strategies.c
index 6f27e66dfe..542cefcf3d 100644
--- a/merge-strategies.c
+++ b/merge-strategies.c
@@ -178,6 +178,18 @@ int merge_three_way(struct repository *r,
 	return 0;
 }
 
+int merge_one_file_func(struct repository *r,
+			const struct object_id *orig_blob,
+			const struct object_id *our_blob,
+			const struct object_id *their_blob, const char *path,
+			unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
+			void *data)
+{
+	return merge_three_way(r,
+			       orig_blob, our_blob, their_blob, path,
+			       orig_mode, our_mode, their_mode);
+}
+
 int merge_one_file_spawn(struct repository *r,
 			 const struct object_id *orig_blob,
 			 const struct object_id *our_blob,
@@ -261,17 +273,22 @@ int merge_all_index(struct repository *r, int oneshot, int quiet,
 		    merge_fn fn, void *data)
 {
 	int err = 0, ret;
-	unsigned int i;
+	unsigned int i, prev_nr;
 
 	for (i = 0; i < r->index->cache_nr; i++) {
 		const struct cache_entry *ce = r->index->cache[i];
 		if (!ce_stage(ce))
 			continue;
 
+		prev_nr = r->index->cache_nr;
 		ret = merge_entry(r, quiet || oneshot, i, ce->name, &err, fn, data);
-		if (ret > 0)
-			i += ret - 1;
-		else if (ret == -1)
+		if (ret > 0) {
+			/* Don't bother handling an index that has
+			   grown, since merge_one_file_func() can't grow
+			   it, and merge_one_file_spawn() can't change
+			   it. */
+			i += ret - (prev_nr - r->index->cache_nr) - 1;
+		} else if (ret == -1)
 			return -1;
 
 		if (err && !oneshot)
diff --git a/merge-strategies.h b/merge-strategies.h
index 94c40635c4..0b74d45431 100644
--- a/merge-strategies.h
+++ b/merge-strategies.h
@@ -16,6 +16,13 @@ typedef int (*merge_fn)(struct repository *r,
 			unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
 			void *data);
 
+int merge_one_file_func(struct repository *r,
+			const struct object_id *orig_blob,
+			const struct object_id *our_blob,
+			const struct object_id *their_blob, const char *path,
+			unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
+			void *data);
+
 int merge_one_file_spawn(struct repository *r,
 			 const struct object_id *orig_blob,
 			 const struct object_id *our_blob,
-- 
2.29.2.260.ge31aba42fb



* [PATCH v6 07/13] merge-resolve: rewrite in C
  2020-11-24 11:53         ` [PATCH v6 00/13] Rewrite the remaining merge strategies from shell to C Alban Gruin
                             ` (5 preceding siblings ...)
  2020-11-24 11:53           ` [PATCH v6 06/13] merge-index: don't fork if the requested program is `git-merge-one-file' Alban Gruin
@ 2020-11-24 11:53           ` Alban Gruin
  2020-11-24 11:53           ` [PATCH v6 08/13] merge-recursive: move better_branch_name() to merge.c Alban Gruin
                             ` (8 subsequent siblings)
  15 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2020-11-24 11:53 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Phillip Wood, Alban Gruin

This rewrites `git merge-resolve' from shell to C.  As with `git
merge-one-file', this port is not completely straightforward and removes
calls to external processes to avoid reading and writing the index over
and over again.

 - The call to `update-index -q --refresh' is replaced by a call to
   refresh_index().

 - The call to `read-tree' is replaced by a call to unpack_trees() (and
   all the setup needed).

 - The call to `write-tree' is replaced by a call to
   write_index_as_tree().

 - The call to `merge-index', needed to invoke `git merge-one-file', is
   replaced by a call to the new merge_all_index() function.

The index is read in cmd_merge_resolve(), and is written back by
merge_strategies_resolve().

The parameters of merge_strategies_resolve() may be surprising at first
glance: why use a commit list for `bases' and `remote', when we could
use an oid array and a pointer to an oid?  Because, in a later commit,
try_merge_strategy() will be able to call merge_strategies_resolve()
directly, and it already uses a commit list for `bases' (`common') and
`remote' (`remoteheads'), and a string for `head_arg'.  To reduce
friction later, merge_strategies_resolve() takes the same types of
parameters.

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 Makefile                |  2 +-
 builtin.h               |  1 +
 builtin/merge-resolve.c | 73 +++++++++++++++++++++++++++++++
 git-merge-resolve.sh    | 54 -----------------------
 git.c                   |  1 +
 merge-strategies.c      | 95 +++++++++++++++++++++++++++++++++++++++++
 merge-strategies.h      |  5 +++
 7 files changed, 176 insertions(+), 55 deletions(-)
 create mode 100644 builtin/merge-resolve.c
 delete mode 100755 git-merge-resolve.sh

diff --git a/Makefile b/Makefile
index 6dfdb33cb2..3cc6b192f1 100644
--- a/Makefile
+++ b/Makefile
@@ -601,7 +601,6 @@ SCRIPT_SH += git-bisect.sh
 SCRIPT_SH += git-difftool--helper.sh
 SCRIPT_SH += git-filter-branch.sh
 SCRIPT_SH += git-merge-octopus.sh
-SCRIPT_SH += git-merge-resolve.sh
 SCRIPT_SH += git-mergetool.sh
 SCRIPT_SH += git-quiltimport.sh
 SCRIPT_SH += git-request-pull.sh
@@ -1097,6 +1096,7 @@ BUILTIN_OBJS += builtin/merge-index.o
 BUILTIN_OBJS += builtin/merge-one-file.o
 BUILTIN_OBJS += builtin/merge-ours.o
 BUILTIN_OBJS += builtin/merge-recursive.o
+BUILTIN_OBJS += builtin/merge-resolve.o
 BUILTIN_OBJS += builtin/merge-tree.o
 BUILTIN_OBJS += builtin/merge.o
 BUILTIN_OBJS += builtin/mktag.o
diff --git a/builtin.h b/builtin.h
index 4d2cd78856..35e91c16d0 100644
--- a/builtin.h
+++ b/builtin.h
@@ -180,6 +180,7 @@ int cmd_merge_ours(int argc, const char **argv, const char *prefix);
 int cmd_merge_file(int argc, const char **argv, const char *prefix);
 int cmd_merge_one_file(int argc, const char **argv, const char *prefix);
 int cmd_merge_recursive(int argc, const char **argv, const char *prefix);
+int cmd_merge_resolve(int argc, const char **argv, const char *prefix);
 int cmd_merge_tree(int argc, const char **argv, const char *prefix);
 int cmd_mktag(int argc, const char **argv, const char *prefix);
 int cmd_mktree(int argc, const char **argv, const char *prefix);
diff --git a/builtin/merge-resolve.c b/builtin/merge-resolve.c
new file mode 100644
index 0000000000..dca31676b8
--- /dev/null
+++ b/builtin/merge-resolve.c
@@ -0,0 +1,73 @@
+/*
+ * Builtin "git merge-resolve"
+ *
+ * Copyright (c) 2020 Alban Gruin
+ *
+ * Based on git-merge-resolve.sh, written by Linus Torvalds and Junio C
+ * Hamano.
+ *
+ * Resolve two trees, using enhanced multi-base read-tree.
+ */
+
+#define USE_THE_INDEX_COMPATIBILITY_MACROS
+#include "cache.h"
+#include "builtin.h"
+#include "merge-strategies.h"
+
+static const char builtin_merge_resolve_usage[] =
+	"git merge-resolve <bases>... -- <head> <remote>";
+
+int cmd_merge_resolve(int argc, const char **argv, const char *prefix)
+{
+	int i, sep_seen = 0;
+	const char *head = NULL;
+	struct commit_list *bases = NULL, *remote = NULL;
+	struct commit_list **next_base = &bases;
+
+	if (argc < 5)
+		usage(builtin_merge_resolve_usage);
+
+	setup_work_tree();
+	if (read_cache() < 0)
+		die("invalid index");
+
+	/*
+	 * The first parameters up to -- are merge bases; the rest are
+	 * heads.
+	 */
+	for (i = 1; i < argc; i++) {
+		if (!strcmp(argv[i], "--"))
+			sep_seen = 1;
+		else if (!strcmp(argv[i], "-h"))
+			usage(builtin_merge_resolve_usage);
+		else if (sep_seen && !head)
+			head = argv[i];
+		else {
+			struct object_id oid;
+			struct commit *commit;
+
+			if (get_oid(argv[i], &oid))
+				die("object %s not found.", argv[i]);
+
+			commit = lookup_commit_or_die(&oid, argv[i]);
+
+			if (sep_seen)
+				commit_list_insert(commit, &remote);
+			else
+				next_base = commit_list_append(commit, next_base);
+		}
+	}
+
+	/*
+	 * Give up if we are given two or more remotes.  Not handling
+	 * octopus.
+	 */
+	if (remote && remote->next)
+		return 2;
+
+	/* Give up if this is a baseless merge. */
+	if (!bases)
+		return 2;
+
+	return merge_strategies_resolve(the_repository, bases, head, remote);
+}
diff --git a/git-merge-resolve.sh b/git-merge-resolve.sh
deleted file mode 100755
index 343fe7bccd..0000000000
--- a/git-merge-resolve.sh
+++ /dev/null
@@ -1,54 +0,0 @@
-#!/bin/sh
-#
-# Copyright (c) 2005 Linus Torvalds
-# Copyright (c) 2005 Junio C Hamano
-#
-# Resolve two trees, using enhanced multi-base read-tree.
-
-# The first parameters up to -- are merge bases; the rest are heads.
-bases= head= remotes= sep_seen=
-for arg
-do
-	case ",$sep_seen,$head,$arg," in
-	*,--,)
-		sep_seen=yes
-		;;
-	,yes,,*)
-		head=$arg
-		;;
-	,yes,*)
-		remotes="$remotes$arg "
-		;;
-	*)
-		bases="$bases$arg "
-		;;
-	esac
-done
-
-# Give up if we are given two or more remotes -- not handling octopus.
-case "$remotes" in
-?*' '?*)
-	exit 2 ;;
-esac
-
-# Give up if this is a baseless merge.
-if test '' = "$bases"
-then
-	exit 2
-fi
-
-git update-index -q --refresh
-git read-tree -u -m --aggressive $bases $head $remotes || exit 2
-echo "Trying simple merge."
-if result_tree=$(git write-tree 2>/dev/null)
-then
-	exit 0
-else
-	echo "Simple merge failed, trying Automatic merge."
-	if git merge-index -o git-merge-one-file -a
-	then
-		exit 0
-	else
-		exit 1
-	fi
-fi
diff --git a/git.c b/git.c
index a4d3f98094..64a1a1de41 100644
--- a/git.c
+++ b/git.c
@@ -544,6 +544,7 @@ static struct cmd_struct commands[] = {
 	{ "merge-recursive", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
 	{ "merge-recursive-ours", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
 	{ "merge-recursive-theirs", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
+	{ "merge-resolve", cmd_merge_resolve, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
 	{ "merge-subtree", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
 	{ "merge-tree", cmd_merge_tree, RUN_SETUP | NO_PARSEOPT },
 	{ "mktag", cmd_mktag, RUN_SETUP | NO_PARSEOPT },
diff --git a/merge-strategies.c b/merge-strategies.c
index 542cefcf3d..9aa07e91b5 100644
--- a/merge-strategies.c
+++ b/merge-strategies.c
@@ -1,7 +1,10 @@
 #include "cache.h"
+#include "cache-tree.h"
 #include "dir.h"
+#include "lockfile.h"
 #include "merge-strategies.h"
 #include "run-command.h"
+#include "unpack-trees.h"
 #include "xdiff-interface.h"
 
 static int checkout_from_index(struct index_state *istate, const char *path,
@@ -297,3 +300,95 @@ int merge_all_index(struct repository *r, int oneshot, int quiet,
 
 	return err;
 }
+
+static int fast_forward(struct repository *r, struct tree_desc *t,
+			int nr, int aggressive)
+{
+	struct unpack_trees_options opts;
+	struct lock_file lock = LOCK_INIT;
+
+	refresh_index(r->index, REFRESH_QUIET, NULL, NULL, NULL);
+	repo_hold_locked_index(r, &lock, LOCK_DIE_ON_ERROR);
+
+	memset(&opts, 0, sizeof(opts));
+	opts.head_idx = 1;
+	opts.src_index = r->index;
+	opts.dst_index = r->index;
+	opts.merge = 1;
+	opts.update = 1;
+	opts.aggressive = aggressive;
+
+	if (nr == 1)
+		opts.fn = oneway_merge;
+	else if (nr == 2) {
+		opts.fn = twoway_merge;
+		opts.initial_checkout = is_index_unborn(r->index);
+	} else if (nr >= 3) {
+		opts.fn = threeway_merge;
+		opts.head_idx = nr - 1;
+	}
+
+	if (unpack_trees(nr, t, &opts))
+		return -1;
+
+	if (write_locked_index(r->index, &lock, COMMIT_LOCK))
+		return error(_("unable to write new index file"));
+
+	return 0;
+}
+
+static int add_tree(struct tree *tree, struct tree_desc *t)
+{
+	if (parse_tree(tree))
+		return -1;
+
+	init_tree_desc(t, tree->buffer, tree->size);
+	return 0;
+}
+
+int merge_strategies_resolve(struct repository *r,
+			     struct commit_list *bases, const char *head_arg,
+			     struct commit_list *remote)
+{
+	struct tree_desc t[MAX_UNPACK_TREES];
+	struct object_id head, oid;
+	struct commit_list *i;
+	int nr = 0;
+
+	if (head_arg)
+		get_oid(head_arg, &head);
+
+	puts(_("Trying simple merge."));
+
+	for (i = bases; i && i->item; i = i->next) {
+		if (add_tree(repo_get_commit_tree(r, i->item), t + (nr++)))
+			return 2;
+	}
+
+	if (head_arg) {
+		struct tree *tree = parse_tree_indirect(&head);
+		if (add_tree(tree, t + (nr++)))
+			return 2;
+	}
+
+	if (remote && add_tree(repo_get_commit_tree(r, remote->item), t + (nr++)))
+		return 2;
+
+	if (fast_forward(r, t, nr, 1))
+		return 2;
+
+	if (write_index_as_tree(&oid, r->index, r->index_file,
+				WRITE_TREE_SILENT, NULL)) {
+		int ret;
+		struct lock_file lock = LOCK_INIT;
+
+		puts(_("Simple merge failed, trying Automatic merge."));
+		repo_hold_locked_index(r, &lock, LOCK_DIE_ON_ERROR);
+		ret = merge_all_index(r, 1, 0, merge_one_file_func, NULL);
+
+		write_locked_index(r->index, &lock, COMMIT_LOCK);
+		return !!ret;
+	}
+
+	return 0;
+}
diff --git a/merge-strategies.h b/merge-strategies.h
index 0b74d45431..47dcd71ad5 100644
--- a/merge-strategies.h
+++ b/merge-strategies.h
@@ -1,6 +1,7 @@
 #ifndef MERGE_STRATEGIES_H
 #define MERGE_STRATEGIES_H
 
+#include "commit.h"
 #include "object.h"
 
 int merge_three_way(struct repository *r,
@@ -35,4 +36,8 @@ int merge_index_path(struct repository *r, int oneshot, int quiet,
 int merge_all_index(struct repository *r, int oneshot, int quiet,
 		    merge_fn fn, void *data);
 
+int merge_strategies_resolve(struct repository *r,
+			     struct commit_list *bases, const char *head_arg,
+			     struct commit_list *remote);
+
 #endif /* MERGE_STRATEGIES_H */
-- 
2.29.2.260.ge31aba42fb



* [PATCH v6 08/13] merge-recursive: move better_branch_name() to merge.c
  2020-11-24 11:53         ` [PATCH v6 00/13] Rewrite the remaining merge strategies from shell to C Alban Gruin
                             ` (6 preceding siblings ...)
  2020-11-24 11:53           ` [PATCH v6 07/13] merge-resolve: rewrite in C Alban Gruin
@ 2020-11-24 11:53           ` Alban Gruin
  2021-01-05 16:19             ` Derrick Stolee
  2020-11-24 11:53           ` [PATCH v6 09/13] merge-octopus: rewrite in C Alban Gruin
                             ` (7 subsequent siblings)
  15 siblings, 1 reply; 221+ messages in thread
From: Alban Gruin @ 2020-11-24 11:53 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Phillip Wood, Alban Gruin

better_branch_name() will be used by merge-octopus once it is rewritten
in C, so instead of duplicating it, this preemptively moves the function
into an appropriate file in libgit.a.  The function is also renamed to
reflect its usage by merge strategies.
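
For readers unfamiliar with the lookup this function performs, here is a
minimal sketch using only standard C.  The names toy_dup() and
toy_better_branch_name() are hypothetical, and the hash length is passed
as a parameter (capped at 64 as an assumption of this sketch) instead of
being read from the_hash_algo; the real function uses git's xstrdup()
and xsnprintf() helpers.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Portable replacement for strdup(); returns NULL on allocation failure. */
static char *toy_dup(const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);

	if (copy)
		memcpy(copy, s, len);
	return copy;
}

/*
 * If branch looks like a full hex object name (hexsz characters long),
 * return a copy of $GITHEAD_<branch> when that variable is set;
 * otherwise return a copy of branch itself.  The caller frees the result.
 */
static char *toy_better_branch_name(const char *branch, size_t hexsz)
{
	char env[8 + 64 + 1];	/* "GITHEAD_" + up to 64 hex digits + NUL */
	const char *name;

	if (hexsz > 64 || strlen(branch) != hexsz)
		return toy_dup(branch);
	snprintf(env, sizeof(env), "GITHEAD_%s", branch);
	name = getenv(env);
	return toy_dup(name ? name : branch);
}
```

With GITHEAD_<hex> exported by the caller, the hex name is mapped to a
human-readable name; any argument whose length differs from hexsz is
returned unchanged.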

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 builtin/merge-recursive.c | 16 ++--------------
 cache.h                   |  2 +-
 merge.c                   | 12 ++++++++++++
 3 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/builtin/merge-recursive.c b/builtin/merge-recursive.c
index a4bfd8fc51..972243b5e9 100644
--- a/builtin/merge-recursive.c
+++ b/builtin/merge-recursive.c
@@ -8,18 +8,6 @@
 static const char builtin_merge_recursive_usage[] =
 	"git %s <base>... -- <head> <remote> ...";
 
-static char *better_branch_name(const char *branch)
-{
-	static char githead_env[8 + GIT_MAX_HEXSZ + 1];
-	char *name;
-
-	if (strlen(branch) != the_hash_algo->hexsz)
-		return xstrdup(branch);
-	xsnprintf(githead_env, sizeof(githead_env), "GITHEAD_%s", branch);
-	name = getenv(githead_env);
-	return xstrdup(name ? name : branch);
-}
-
 int cmd_merge_recursive(int argc, const char **argv, const char *prefix)
 {
 	const struct object_id *bases[21];
@@ -75,8 +63,8 @@ int cmd_merge_recursive(int argc, const char **argv, const char *prefix)
 	if (get_oid(o.branch2, &h2))
 		die(_("could not resolve ref '%s'"), o.branch2);
 
-	o.branch1 = better1 = better_branch_name(o.branch1);
-	o.branch2 = better2 = better_branch_name(o.branch2);
+	o.branch1 = better1 = merge_get_better_branch_name(o.branch1);
+	o.branch2 = better2 = merge_get_better_branch_name(o.branch2);
 
 	if (o.verbosity >= 3)
 		printf(_("Merging %s with %s\n"), o.branch1, o.branch2);
diff --git a/cache.h b/cache.h
index be16ab3215..2d844576ea 100644
--- a/cache.h
+++ b/cache.h
@@ -1933,7 +1933,7 @@ int checkout_fast_forward(struct repository *r,
 			  const struct object_id *from,
 			  const struct object_id *to,
 			  int overwrite_ignore);
-
+char *merge_get_better_branch_name(const char *branch);
 
 int sane_execvp(const char *file, char *const argv[]);
 
diff --git a/merge.c b/merge.c
index 5fb88af102..801d673c5f 100644
--- a/merge.c
+++ b/merge.c
@@ -109,3 +109,15 @@ int checkout_fast_forward(struct repository *r,
 		return error(_("unable to write new index file"));
 	return 0;
 }
+
+char *merge_get_better_branch_name(const char *branch)
+{
+	static char githead_env[8 + GIT_MAX_HEXSZ + 1];
+	char *name;
+
+	if (strlen(branch) != the_hash_algo->hexsz)
+		return xstrdup(branch);
+	xsnprintf(githead_env, sizeof(githead_env), "GITHEAD_%s", branch);
+	name = getenv(githead_env);
+	return xstrdup(name ? name : branch);
+}
-- 
2.29.2.260.ge31aba42fb



* [PATCH v6 09/13] merge-octopus: rewrite in C
  2020-11-24 11:53         ` [PATCH v6 00/13] Rewrite the remaining merge strategies from shell to C Alban Gruin
                             ` (7 preceding siblings ...)
  2020-11-24 11:53           ` [PATCH v6 08/13] merge-recursive: move better_branch_name() to merge.c Alban Gruin
@ 2020-11-24 11:53           ` Alban Gruin
  2021-01-05 16:40             ` Derrick Stolee
  2020-11-24 11:53           ` [PATCH v6 10/13] merge: use the "resolve" strategy without forking Alban Gruin
                             ` (6 subsequent siblings)
  15 siblings, 1 reply; 221+ messages in thread
From: Alban Gruin @ 2020-11-24 11:53 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Phillip Wood, Alban Gruin

This rewrites `git merge-octopus' from shell to C.  As with the previous
two conversions, this port removes calls to external processes to avoid
reading and writing the index over and over again.

 - Calls to `read-tree -u -m (--aggressive)?' are replaced by calls to
   unpack_trees().

 - The call to `write-tree' is replaced by a call to
   write_index_as_tree().

 - The call to `diff-index ...' is replaced by a call to
   repo_index_has_changes().

 - The call to `merge-index', needed to invoke `git merge-one-file', is
   replaced by a call to merge_all_index().

The index is read in cmd_merge_octopus(), and is written back by
merge_strategies_octopus().

Here too, merge_strategies_octopus() takes two commit lists and a string
to reduce friction when try_merge_strategy() is later modified to call
it directly.

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 Makefile                |   2 +-
 builtin.h               |   1 +
 builtin/merge-octopus.c |  69 ++++++++++++++++
 git-merge-octopus.sh    | 112 -------------------------
 git.c                   |   1 +
 merge-strategies.c      | 177 ++++++++++++++++++++++++++++++++++++++++
 merge-strategies.h      |   3 +
 7 files changed, 252 insertions(+), 113 deletions(-)
 create mode 100644 builtin/merge-octopus.c
 delete mode 100755 git-merge-octopus.sh

diff --git a/Makefile b/Makefile
index 3cc6b192f1..2b2bdffafe 100644
--- a/Makefile
+++ b/Makefile
@@ -600,7 +600,6 @@ unexport CDPATH
 SCRIPT_SH += git-bisect.sh
 SCRIPT_SH += git-difftool--helper.sh
 SCRIPT_SH += git-filter-branch.sh
-SCRIPT_SH += git-merge-octopus.sh
 SCRIPT_SH += git-mergetool.sh
 SCRIPT_SH += git-quiltimport.sh
 SCRIPT_SH += git-request-pull.sh
@@ -1093,6 +1092,7 @@ BUILTIN_OBJS += builtin/mailsplit.o
 BUILTIN_OBJS += builtin/merge-base.o
 BUILTIN_OBJS += builtin/merge-file.o
 BUILTIN_OBJS += builtin/merge-index.o
+BUILTIN_OBJS += builtin/merge-octopus.o
 BUILTIN_OBJS += builtin/merge-one-file.o
 BUILTIN_OBJS += builtin/merge-ours.o
 BUILTIN_OBJS += builtin/merge-recursive.o
diff --git a/builtin.h b/builtin.h
index 35e91c16d0..50225404a0 100644
--- a/builtin.h
+++ b/builtin.h
@@ -176,6 +176,7 @@ int cmd_maintenance(int argc, const char **argv, const char *prefix);
 int cmd_merge(int argc, const char **argv, const char *prefix);
 int cmd_merge_base(int argc, const char **argv, const char *prefix);
 int cmd_merge_index(int argc, const char **argv, const char *prefix);
+int cmd_merge_octopus(int argc, const char **argv, const char *prefix);
 int cmd_merge_ours(int argc, const char **argv, const char *prefix);
 int cmd_merge_file(int argc, const char **argv, const char *prefix);
 int cmd_merge_one_file(int argc, const char **argv, const char *prefix);
diff --git a/builtin/merge-octopus.c b/builtin/merge-octopus.c
new file mode 100644
index 0000000000..ca8f9f345d
--- /dev/null
+++ b/builtin/merge-octopus.c
@@ -0,0 +1,69 @@
+/*
+ * Builtin "git merge-octopus"
+ *
+ * Copyright (c) 2020 Alban Gruin
+ *
+ * Based on git-merge-octopus.sh, written by Junio C Hamano.
+ *
+ * Resolve two or more trees.
+ */
+
+#define USE_THE_INDEX_COMPATIBILITY_MACROS
+#include "cache.h"
+#include "builtin.h"
+#include "commit.h"
+#include "merge-strategies.h"
+
+static const char builtin_merge_octopus_usage[] =
+	"git merge-octopus [<bases>...] -- <head> <remote1> <remote2> [<remotes>...]";
+
+int cmd_merge_octopus(int argc, const char **argv, const char *prefix)
+{
+	int i, sep_seen = 0;
+	struct commit_list *bases = NULL, *remotes = NULL;
+	struct commit_list **next_base = &bases, **next_remote = &remotes;
+	const char *head_arg = NULL;
+
+	if (argc < 5)
+		usage(builtin_merge_octopus_usage);
+
+	setup_work_tree();
+	if (read_cache() < 0)
+		die("invalid index");
+
+	/*
+	 * The first parameters up to -- are merge bases; the rest are
+	 * heads.
+	 */
+	for (i = 1; i < argc; i++) {
+		if (strcmp(argv[i], "--") == 0)
+			sep_seen = 1;
+		else if (strcmp(argv[i], "-h") == 0)
+			usage(builtin_merge_octopus_usage);
+		else if (sep_seen && !head_arg)
+			head_arg = argv[i];
+		else {
+			struct object_id oid;
+			struct commit *commit;
+
+			if (get_oid(argv[i], &oid))
+				die("object %s not found.", argv[i]);
+
+			commit = lookup_commit_or_die(&oid, argv[i]);
+
+			if (sep_seen)
+				next_remote = commit_list_append(commit, next_remote);
+			else
+				next_base = commit_list_append(commit, next_base);
+		}
+	}
+
+	/*
+	 * Reject if this is not an octopus -- resolve should be used
+	 * instead.
+	 */
+	if (commit_list_count(remotes) < 2)
+		return 2;
+
+	return merge_strategies_octopus(the_repository, bases, head_arg, remotes);
+}
diff --git a/git-merge-octopus.sh b/git-merge-octopus.sh
deleted file mode 100755
index 7d19d37951..0000000000
--- a/git-merge-octopus.sh
+++ /dev/null
@@ -1,112 +0,0 @@
-#!/bin/sh
-#
-# Copyright (c) 2005 Junio C Hamano
-#
-# Resolve two or more trees.
-#
-
-. git-sh-setup
-
-LF='
-'
-
-# The first parameters up to -- are merge bases; the rest are heads.
-bases= head= remotes= sep_seen=
-for arg
-do
-	case ",$sep_seen,$head,$arg," in
-	*,--,)
-		sep_seen=yes
-		;;
-	,yes,,*)
-		head=$arg
-		;;
-	,yes,*)
-		remotes="$remotes$arg "
-		;;
-	*)
-		bases="$bases$arg "
-		;;
-	esac
-done
-
-# Reject if this is not an octopus -- resolve should be used instead.
-case "$remotes" in
-?*' '?*)
-	;;
-*)
-	exit 2 ;;
-esac
-
-# MRC is the current "merge reference commit"
-# MRT is the current "merge result tree"
-
-if ! git diff-index --quiet --cached HEAD --
-then
-    gettextln "Error: Your local changes to the following files would be overwritten by merge"
-    git diff-index --cached --name-only HEAD -- | sed -e 's/^/    /'
-    exit 2
-fi
-MRC=$(git rev-parse --verify -q $head)
-MRT=$(git write-tree)
-NON_FF_MERGE=0
-OCTOPUS_FAILURE=0
-for SHA1 in $remotes
-do
-	case "$OCTOPUS_FAILURE" in
-	1)
-		# We allow only last one to have a hand-resolvable
-		# conflicts.  Last round failed and we still had
-		# a head to merge.
-		gettextln "Automated merge did not work."
-		gettextln "Should not be doing an octopus."
-		exit 2
-	esac
-
-	eval pretty_name=\${GITHEAD_$SHA1:-$SHA1}
-	if test "$SHA1" = "$pretty_name"
-	then
-		SHA1_UP="$(echo "$SHA1" | tr a-z A-Z)"
-		eval pretty_name=\${GITHEAD_$SHA1_UP:-$pretty_name}
-	fi
-	common=$(git merge-base --all $SHA1 $MRC) ||
-		die "$(eval_gettext "Unable to find common commit with \$pretty_name")"
-
-	case "$LF$common$LF" in
-	*"$LF$SHA1$LF"*)
-		eval_gettextln "Already up to date with \$pretty_name"
-		continue
-		;;
-	esac
-
-	if test "$common,$NON_FF_MERGE" = "$MRC,0"
-	then
-		# The first head being merged was a fast-forward.
-		# Advance MRC to the head being merged, and use that
-		# tree as the intermediate result of the merge.
-		# We still need to count this as part of the parent set.
-
-		eval_gettextln "Fast-forwarding to: \$pretty_name"
-		git read-tree -u -m $head $SHA1 || exit
-		MRC=$SHA1 MRT=$(git write-tree)
-		continue
-	fi
-
-	NON_FF_MERGE=1
-
-	eval_gettextln "Trying simple merge with \$pretty_name"
-	git read-tree -u -m --aggressive  $common $MRT $SHA1 || exit 2
-	next=$(git write-tree 2>/dev/null)
-	if test $? -ne 0
-	then
-		gettextln "Simple merge did not work, trying automatic merge."
-		git merge-index -o git-merge-one-file -a ||
-		OCTOPUS_FAILURE=1
-		next=$(git write-tree 2>/dev/null)
-	fi
-
-	MRC="$MRC $SHA1"
-	MRT=$next
-done
-
-exit "$OCTOPUS_FAILURE"
diff --git a/git.c b/git.c
index 64a1a1de41..d51fb5d2bf 100644
--- a/git.c
+++ b/git.c
@@ -539,6 +539,7 @@ static struct cmd_struct commands[] = {
 	{ "merge-base", cmd_merge_base, RUN_SETUP },
 	{ "merge-file", cmd_merge_file, RUN_SETUP_GENTLY },
 	{ "merge-index", cmd_merge_index, RUN_SETUP | NO_PARSEOPT },
+	{ "merge-octopus", cmd_merge_octopus, RUN_SETUP | NO_PARSEOPT },
 	{ "merge-ours", cmd_merge_ours, RUN_SETUP | NO_PARSEOPT },
 	{ "merge-one-file", cmd_merge_one_file, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
 	{ "merge-recursive", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
diff --git a/merge-strategies.c b/merge-strategies.c
index 9aa07e91b5..4d9dd55296 100644
--- a/merge-strategies.c
+++ b/merge-strategies.c
@@ -1,5 +1,6 @@
 #include "cache.h"
 #include "cache-tree.h"
+#include "commit-reach.h"
 #include "dir.h"
 #include "lockfile.h"
 #include "merge-strategies.h"
@@ -392,3 +393,179 @@ int merge_strategies_resolve(struct repository *r,
 
 	return 0;
 }
+
+static int write_tree(struct repository *r, struct tree **reference_tree)
+{
+	struct object_id oid;
+	int ret;
+
+	if (!(ret = write_index_as_tree(&oid, r->index, r->index_file,
+					WRITE_TREE_SILENT, NULL)))
+		*reference_tree = lookup_tree(r, &oid);
+
+	return ret;
+}
+
+static int octopus_fast_forward(struct repository *r, const char *branch_name,
+				struct tree *tree_head, struct tree *current_tree,
+				struct tree **reference_tree)
+{
+	/*
+	 * The first head being merged was a fast-forward.  Advance the
+	 * reference commit to the head being merged, and use that tree
+	 * as the intermediate result of the merge.  We still need to
+	 * count this as part of the parent set.
+	 */
+	struct tree_desc t[2];
+
+	printf(_("Fast-forwarding to: %s\n"), branch_name);
+
+	init_tree_desc(t, tree_head->buffer, tree_head->size);
+	if (add_tree(current_tree, t + 1))
+		return -1;
+	if (fast_forward(r, t, 2, 0))
+		return -1;
+	if (write_tree(r, reference_tree))
+		return -1;
+
+	return 0;
+}
+
+static int octopus_do_merge(struct repository *r, const char *branch_name,
+			    struct commit_list *common, struct tree *current_tree,
+			    struct tree **reference_tree)
+{
+	struct tree_desc t[MAX_UNPACK_TREES];
+	struct commit_list *j;
+	int nr = 0, ret = 0;
+
+	printf(_("Trying simple merge with %s\n"), branch_name);
+
+	for (j = common; j; j = j->next) {
+		struct tree *tree = repo_get_commit_tree(r, j->item);
+		if (add_tree(tree, t + (nr++)))
+			return -1;
+	}
+
+	if (add_tree(*reference_tree, t + (nr++)))
+		return -1;
+	if (add_tree(current_tree, t + (nr++)))
+		return -1;
+	if (fast_forward(r, t, nr, 1))
+		return -1;
+
+	if (write_tree(r, reference_tree)) {
+		struct lock_file lock = LOCK_INIT;
+
+		puts(_("Simple merge did not work, trying automatic merge."));
+		repo_hold_locked_index(r, &lock, LOCK_DIE_ON_ERROR);
+		ret = merge_all_index(r, 1, 0, merge_one_file_func, NULL);
+		write_locked_index(r->index, &lock, COMMIT_LOCK);
+
+		write_tree(r, reference_tree);
+	}
+
+	return ret ? -2 : 0;
+}
+
+int merge_strategies_octopus(struct repository *r,
+			     struct commit_list *bases, const char *head_arg,
+			     struct commit_list *remotes)
+{
+	int ff_merge = 1, ret = 0, references = 1;
+	struct commit **reference_commit, *head_commit;
+	struct tree *reference_tree, *head_tree;
+	struct commit_list *i;
+	struct object_id head;
+	struct strbuf sb = STRBUF_INIT;
+
+	get_oid(head_arg, &head);
+	head_commit = lookup_commit_reference(r, &head);
+	head_tree = repo_get_commit_tree(r, head_commit);
+
+	if (parse_tree(head_tree))
+		return 2;
+
+	if (repo_index_has_changes(r, head_tree, &sb)) {
+		error(_("Your local changes to the following files "
+			"would be overwritten by merge:\n  %s"),
+		      sb.buf);
+		strbuf_release(&sb);
+		return 2;
+	}
+
+	reference_commit = xcalloc(commit_list_count(remotes) + 1,
+				   sizeof(struct commit *));
+	reference_commit[0] = head_commit;
+	reference_tree = head_tree;
+
+	for (i = remotes; i && i->item; i = i->next) {
+		struct commit *c = i->item;
+		struct object_id *oid = &c->object.oid;
+		struct tree *current_tree = repo_get_commit_tree(r, c);
+		struct commit_list *common, *j;
+		char *branch_name;
+		int k = 0, up_to_date = 0;
+
+		if (ret) {
+			/*
+			 * We allow only last one to have a
+			 * hand-resolvable conflicts.  Last round failed
+			 * and we still had a head to merge.
+			 */
+			puts(_("Automated merge did not work."));
+			puts(_("Should not be doing an octopus."));
+
+			free(reference_commit);
+			return 2;
+		}
+
+		branch_name = merge_get_better_branch_name(oid_to_hex(oid));
+		common = get_merge_bases_many(c, references, reference_commit);
+
+		if (!common) {
+			error(_("Unable to find common commit with %s"), branch_name);
+
+			free(branch_name);
+			free_commit_list(common);
+			free(reference_commit);
+
+			return 2;
+		}
+
+		for (j = common; j && !(up_to_date || !ff_merge); j = j->next) {
+			up_to_date |= oideq(&j->item->object.oid, oid);
+
+			if (k < references)
+				ff_merge &= oideq(&j->item->object.oid, &reference_commit[k++]->object.oid);
+		}
+
+		if (up_to_date) {
+			printf(_("Already up to date with %s\n"), branch_name);
+
+			free(branch_name);
+			free_commit_list(common);
+			continue;
+		}
+
+		if (ff_merge) {
+			ret = octopus_fast_forward(r, branch_name, head_tree,
+						   current_tree, &reference_tree);
+			references = 0;
+		} else {
+			ret = octopus_do_merge(r, branch_name, common,
+					       current_tree, &reference_tree);
+		}
+
+		free(branch_name);
+		free_commit_list(common);
+
+		if (ret == -1)
+			break;
+
+		reference_commit[references++] = c;
+	}
+
+	free(reference_commit);
+	return ret;
+}
diff --git a/merge-strategies.h b/merge-strategies.h
index 47dcd71ad5..05c50159ec 100644
--- a/merge-strategies.h
+++ b/merge-strategies.h
@@ -39,5 +39,8 @@ int merge_all_index(struct repository *r, int oneshot, int quiet,
 int merge_strategies_resolve(struct repository *r,
 			     struct commit_list *bases, const char *head_arg,
 			     struct commit_list *remote);
+int merge_strategies_octopus(struct repository *r,
+			     struct commit_list *bases, const char *head_arg,
+			     struct commit_list *remote);
 
 #endif /* MERGE_STRATEGIES_H */
-- 
2.29.2.260.ge31aba42fb



* [PATCH v6 10/13] merge: use the "resolve" strategy without forking
  2020-11-24 11:53         ` [PATCH v6 00/13] Rewrite the remaining merge strategies from shell to C Alban Gruin
                             ` (8 preceding siblings ...)
  2020-11-24 11:53           ` [PATCH v6 09/13] merge-octopus: rewrite in C Alban Gruin
@ 2020-11-24 11:53           ` Alban Gruin
  2021-01-05 16:45             ` Derrick Stolee
  2020-11-24 11:53           ` [PATCH v6 11/13] merge: use the "octopus" " Alban Gruin
                             ` (5 subsequent siblings)
  15 siblings, 1 reply; 221+ messages in thread
From: Alban Gruin @ 2020-11-24 11:53 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Phillip Wood, Alban Gruin

This teaches `git merge' to invoke the "resolve" strategy with a
function call instead of forking.

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 builtin/merge.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/builtin/merge.c b/builtin/merge.c
index 9d5359edc2..3b35aa320c 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -41,6 +41,7 @@
 #include "commit-reach.h"
 #include "wt-status.h"
 #include "commit-graph.h"
+#include "merge-strategies.h"
 
 #define DEFAULT_TWOHEAD (1<<0)
 #define DEFAULT_OCTOPUS (1<<1)
@@ -740,6 +741,9 @@ static int try_merge_strategy(const char *strategy, struct commit_list *common,
 				       COMMIT_LOCK | SKIP_IF_UNCHANGED))
 			die(_("unable to write %s"), get_index_file());
 		return clean ? 0 : 1;
+	} else if (!strcmp(strategy, "resolve")) {
+		return merge_strategies_resolve(the_repository, common,
+						head_arg, remoteheads);
 	} else {
 		return try_merge_command(the_repository,
 					 strategy, xopts_nr, xopts,
-- 
2.29.2.260.ge31aba42fb



* [PATCH v6 11/13] merge: use the "octopus" strategy without forking
  2020-11-24 11:53         ` [PATCH v6 00/13] Rewrite the remaining merge strategies from shell to C Alban Gruin
                             ` (9 preceding siblings ...)
  2020-11-24 11:53           ` [PATCH v6 10/13] merge: use the "resolve" strategy without forking Alban Gruin
@ 2020-11-24 11:53           ` Alban Gruin
  2020-11-24 11:53           ` [PATCH v6 12/13] sequencer: use the "resolve" " Alban Gruin
                             ` (4 subsequent siblings)
  15 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2020-11-24 11:53 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Phillip Wood, Alban Gruin

This teaches `git merge' to invoke the "octopus" strategy with a
function call instead of forking.

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 builtin/merge.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/builtin/merge.c b/builtin/merge.c
index 3b35aa320c..f3345a582a 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -744,6 +744,9 @@ static int try_merge_strategy(const char *strategy, struct commit_list *common,
 	} else if (!strcmp(strategy, "resolve")) {
 		return merge_strategies_resolve(the_repository, common,
 						head_arg, remoteheads);
+	} else if (!strcmp(strategy, "octopus")) {
+		return merge_strategies_octopus(the_repository, common,
+						head_arg, remoteheads);
 	} else {
 		return try_merge_command(the_repository,
 					 strategy, xopts_nr, xopts,
-- 
2.29.2.260.ge31aba42fb



* [PATCH v6 12/13] sequencer: use the "resolve" strategy without forking
  2020-11-24 11:53         ` [PATCH v6 00/13] Rewrite the remaining merge strategies from shell to C Alban Gruin
                             ` (10 preceding siblings ...)
  2020-11-24 11:53           ` [PATCH v6 11/13] merge: use the "octopus" " Alban Gruin
@ 2020-11-24 11:53           ` Alban Gruin
  2020-11-24 11:53           ` [PATCH v6 13/13] sequencer: use the "octopus" merge " Alban Gruin
                             ` (3 subsequent siblings)
  15 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2020-11-24 11:53 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Phillip Wood, Alban Gruin

This teaches the sequencer to invoke the "resolve" strategy with a
function call instead of forking.

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 sequencer.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/sequencer.c b/sequencer.c
index e8676e965f..706c2eee87 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -33,6 +33,7 @@
 #include "commit-reach.h"
 #include "rebase-interactive.h"
 #include "reset.h"
+#include "merge-strategies.h"
 
 #define GIT_REFLOG_ACTION "GIT_REFLOG_ACTION"
 
@@ -2000,9 +2001,16 @@ static int do_pick_commit(struct repository *r,
 
 		commit_list_insert(base, &common);
 		commit_list_insert(next, &remotes);
-		res |= try_merge_command(r, opts->strategy,
-					 opts->xopts_nr, (const char **)opts->xopts,
-					common, oid_to_hex(&head), remotes);
+
+		if (!strcmp(opts->strategy, "resolve")) {
+			repo_read_index(r);
+			res |= merge_strategies_resolve(r, common, oid_to_hex(&head), remotes);
+		} else {
+			res |= try_merge_command(r, opts->strategy,
+						 opts->xopts_nr, (const char **)opts->xopts,
+						 common, oid_to_hex(&head), remotes);
+		}
+
 		free_commit_list(common);
 		free_commit_list(remotes);
 	}
-- 
2.29.2.260.ge31aba42fb



* [PATCH v6 13/13] sequencer: use the "octopus" merge strategy without forking
  2020-11-24 11:53         ` [PATCH v6 00/13] Rewrite the remaining merge strategies from shell to C Alban Gruin
                             ` (11 preceding siblings ...)
  2020-11-24 11:53           ` [PATCH v6 12/13] sequencer: use the "resolve" " Alban Gruin
@ 2020-11-24 11:53           ` Alban Gruin
  2020-11-24 19:34           ` [PATCH v6 00/13] Rewrite the remaining merge strategies from shell to C SZEDER Gábor
                             ` (2 subsequent siblings)
  15 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2020-11-24 11:53 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Phillip Wood, Alban Gruin

This teaches the sequencer to invoke the "octopus" strategy with a
function call instead of forking.

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 sequencer.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/sequencer.c b/sequencer.c
index 706c2eee87..591de451a2 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -2005,6 +2005,9 @@ static int do_pick_commit(struct repository *r,
 		if (!strcmp(opts->strategy, "resolve")) {
 			repo_read_index(r);
 			res |= merge_strategies_resolve(r, common, oid_to_hex(&head), remotes);
+		} else if (!strcmp(opts->strategy, "octopus")) {
+			repo_read_index(r);
+			res |= merge_strategies_octopus(r, common, oid_to_hex(&head), remotes);
 		} else {
 			res |= try_merge_command(r, opts->strategy,
 						 opts->xopts_nr, (const char **)opts->xopts,
-- 
2.29.2.260.ge31aba42fb



* Re: [PATCH v6 00/13] Rewrite the remaining merge strategies from shell to C
  2020-11-24 11:53         ` [PATCH v6 00/13] Rewrite the remaining merge strategies from shell to C Alban Gruin
                             ` (12 preceding siblings ...)
  2020-11-24 11:53           ` [PATCH v6 13/13] sequencer: use the "octopus" merge " Alban Gruin
@ 2020-11-24 19:34           ` SZEDER Gábor
  2021-01-05 16:50           ` Derrick Stolee
  2021-03-17 20:49           ` [PATCH v7 00/15] " Alban Gruin
  15 siblings, 0 replies; 221+ messages in thread
From: SZEDER Gábor @ 2020-11-24 19:34 UTC (permalink / raw)
  To: Alban Gruin; +Cc: git, Junio C Hamano, Phillip Wood

On Tue, Nov 24, 2020 at 12:53:02PM +0100, Alban Gruin wrote:
> In an effort to reduce the number of shell scripts in git's codebase, I
> propose this patch series converting the two remaining merge strategies,
> resolve and octopus, from shell to C.  This will enable slightly better
> performance, better integration with git itself (no more forking to
> perform these operations), and better portability (Windows and shell
> scripts don't mix well).
> 
> Three scripts are actually converted: first git-merge-one-file.sh, then
> git-merge-resolve.sh, and finally git-merge-octopus.sh.  Not only are
> they converted, but they are also modified to operate without forking,
> and then libified so they can be used by git without spawning another
> process.

> This series is based on 306ee63a70 (Eighteenth batch, 2020-09-29).

This patch series should be based on top of 722fc37491 (help: do not
expect built-in commands to be hardlinked, 2020-10-07) (in
v2.29.0-rc1), because without the fix in that commit we don't get the
list of available merge strategies when building with
SKIP_DASHED_BUILT_INS=YesPlease:

  $ make clean
  [...]
  $ SKIP_DASHED_BUILT_INS=YesPlease make
  [...]
  $ git merge -s help
  Could not find merge strategy 'help'.
  Available strategies are:.

Our completion script relies on this to list available strategies, and
a test in 't9902-completion.sh' fails without that fix.



* Re: [PATCH v6 03/13] update-index: move add_cacheinfo() to read-cache.c
  2020-11-24 11:53           ` [PATCH v6 03/13] update-index: move add_cacheinfo() to read-cache.c Alban Gruin
@ 2020-12-22 20:54             ` Junio C Hamano
  0 siblings, 0 replies; 221+ messages in thread
From: Junio C Hamano @ 2020-12-22 20:54 UTC (permalink / raw)
  To: Alban Gruin; +Cc: git, Phillip Wood

Alban Gruin <alban.gruin@gmail.com> writes:

> This moves the function add_cacheinfo() that already exists in
> update-index.c to read-cache.c, renames it add_to_index_cacheinfo(),
> and adds an `istate' parameter.  The new cache entry is returned through
> a pointer passed in the parameters.  The return value is either 0
> (success), -1 (invalid path), or -2 (failed to add the file to the
> index).
>
> This will become useful in the next commit, when the three-way merge
> will need to call this function.
>
> Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
> ---
>  builtin/update-index.c | 25 +++++++------------------
>  cache.h                |  5 +++++
>  read-cache.c           | 35 +++++++++++++++++++++++++++++++++++
>  3 files changed, 47 insertions(+), 18 deletions(-)
>
> diff --git a/builtin/update-index.c b/builtin/update-index.c
> index 79087bccea..44862f5e1d 100644
> --- a/builtin/update-index.c
> +++ b/builtin/update-index.c
> @@ -404,27 +404,16 @@ static int process_path(const char *path, struct stat *st, int stat_errno)
>  static int add_cacheinfo(unsigned int mode, const struct object_id *oid,
>  			 const char *path, int stage)
>  {
> -	int len, option;
> -	struct cache_entry *ce;
> +	int res;
>  
> -	if (!verify_path(path, mode))
> -		return error("Invalid path '%s'", path);
> -
> -	len = strlen(path);
> -	ce = make_empty_cache_entry(&the_index, len);
> -
> -	oidcpy(&ce->oid, oid);
> -	memcpy(ce->name, path, len);
> -	ce->ce_flags = create_ce_flags(stage);
> -	ce->ce_namelen = len;
> -	ce->ce_mode = create_ce_mode(mode);
> -	if (assume_unchanged)
> -		ce->ce_flags |= CE_VALID;
> -	option = allow_add ? ADD_CACHE_OK_TO_ADD : 0;
> -	option |= allow_replace ? ADD_CACHE_OK_TO_REPLACE : 0;
> -	if (add_cache_entry(ce, option))
> +	res = add_to_index_cacheinfo(&the_index, mode, oid, path, stage,
> +				     allow_add, allow_replace, NULL);
> +	if (res == -1)
> +		return res;
> +	if (res == -2)
>  		return error("%s: cannot add to the index - missing --add option?",
>  			     path);

Introduce symbolic constants (C preprocessor macros) so that the
above becomes

	if (res == ADD_TO_INDEX_CACHEINFO_UNABLE_TO_ADD)
		return error("%s: cannot add to the index - missing --add option?",
			     path);
	if (res < 0)
		return res;

or something like that.

Stepping back a bit.

It feels _really_ odd that add_to_index_cacheinfo() became silent
only for one error-return case while the other error case emits an
error message on its own, without any way to squelch it.  Isn't this
adapting too much to the needs of a single (future) caller?

It may make more sense to do

	#define ADD_TO_INDEX_CACHEINFO_INVALID_PATH	(-1)
	#define ADD_TO_INDEX_CACHEINFO_UNABLE_TO_ADD	(-2)

and make both silent.  At least that would be more consistent.
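
To illustrate the split of responsibilities, here is a minimal,
self-contained sketch, not code from the patch.  The two macro names are
the ones proposed above; add_to_index_cacheinfo_sketch() and
add_cacheinfo_sketch() are hypothetical stand-ins, and the path check is
only a placeholder for what verify_path() really does.  The library
function stays silent and the caller chooses the message:

```c
#include <stdio.h>
#include <string.h>

/* Names as proposed above; the values match the patch's -1 and -2. */
#define ADD_TO_INDEX_CACHEINFO_INVALID_PATH	(-1)
#define ADD_TO_INDEX_CACHEINFO_UNABLE_TO_ADD	(-2)

/*
 * Hypothetical stand-in for the library function.  It never prints
 * anything; failures are reported only through the return value.
 * The path check is a placeholder, not the real verify_path() logic.
 */
static int add_to_index_cacheinfo_sketch(const char *path, int allow_add)
{
	if (!path || !*path || strstr(path, ".."))
		return ADD_TO_INDEX_CACHEINFO_INVALID_PATH;
	if (!allow_add)
		return ADD_TO_INDEX_CACHEINFO_UNABLE_TO_ADD;
	return 0;
}

/* Hypothetical caller: it decides whether and how to report errors. */
static int add_cacheinfo_sketch(const char *path, int allow_add)
{
	int res = add_to_index_cacheinfo_sketch(path, allow_add);

	if (res == ADD_TO_INDEX_CACHEINFO_INVALID_PATH) {
		fprintf(stderr, "error: Invalid path '%s'\n", path ? path : "");
		return -1;
	}
	if (res == ADD_TO_INDEX_CACHEINFO_UNABLE_TO_ADD) {
		fprintf(stderr, "error: %s: cannot add to the index - "
			"missing --add option?\n", path);
		return -1;
	}
	return res;
}
```

With this shape, a future caller that wants no output at all can call
the library function directly and inspect the return value itself.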

> +
>  	report("add '%s'", path);
>  	return 0;
>  }
> diff --git a/cache.h b/cache.h
> index c0072d43b1..be16ab3215 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -830,6 +830,11 @@ int remove_file_from_index(struct index_state *, const char *path);
>  int add_to_index(struct index_state *, const char *path, struct stat *, int flags);
>  int add_file_to_index(struct index_state *, const char *path, int flags);

This is a public function with mysterious 0/-1/-2 return values; a
reader deserves a comment that explains how to call this function,
how to treat its return value, etc.

You already have enough material to fill in such a comment in your
proposed log message, it seems, which is good.

> +int add_to_index_cacheinfo(struct index_state *, unsigned int mode,
> +			   const struct object_id *oid, const char *path,
> +			   int stage, int allow_add, int allow_replace,
> +			   struct cache_entry **pce);
> +
>  int chmod_index_entry(struct index_state *, struct cache_entry *ce, char flip);
>  int ce_same_name(const struct cache_entry *a, const struct cache_entry *b);
>  void set_object_name_for_intent_to_add_entry(struct cache_entry *ce);
> diff --git a/read-cache.c b/read-cache.c
> index ecf6f68994..c25f951db4 100644
> --- a/read-cache.c
> +++ b/read-cache.c
> @@ -1350,6 +1350,41 @@ int add_index_entry(struct index_state *istate, struct cache_entry *ce, int opti
>  	return 0;
>  }
>  
> +int add_to_index_cacheinfo(struct index_state *istate, unsigned int mode,
> +			   const struct object_id *oid, const char *path,
> +			   int stage, int allow_add, int allow_replace,
> +			   struct cache_entry **pce)
> +{

I see two behaviour differences from the original, which may be
worth noting in the proposed log message as difference.

 - callers of add_cacheinfo() never learned of the new cache entry;
   this allows the caller to optionally obtain a pointer to it.

 - we used to leak a new cache entry when add_cache_entry() refused
   to add it to the index; the leak got plugged.

> +	int len, option;
> +	struct cache_entry *ce = NULL;

Why initialize it to NULL?  It is quite clear in the code that the
variable is never used until it is assigned to.

> +	if (!verify_path(path, mode))
> +		return error(_("Invalid path '%s'"), path);
> +
> +	len = strlen(path);
> +	ce = make_empty_cache_entry(istate, len);
> +
> +	oidcpy(&ce->oid, oid);
> +	memcpy(ce->name, path, len);
> +	ce->ce_flags = create_ce_flags(stage);
> +	ce->ce_namelen = len;
> +	ce->ce_mode = create_ce_mode(mode);
> +	if (assume_unchanged)
> +		ce->ce_flags |= CE_VALID;
> +	option = allow_add ? ADD_CACHE_OK_TO_ADD : 0;
> +	option |= allow_replace ? ADD_CACHE_OK_TO_REPLACE : 0;
> +
> +	if (add_index_entry(istate, ce, option)) {
> +		discard_cache_entry(ce);

This behaviour is new.  We were leaking the ce.

> +		return -2;
> +	}
> +
> +	if (pce)
> +		*pce = ce;

I think you mean by 'p' a "pointer", but that is a horrible way to
name things.  We know from the type that it is a pointer to a
pointer already; what reader needs to learn from either its name or
a comment associated with it is what purpose it serves.

Perhaps call it with a name that hints it is used as the return
parameter, e.g. ce_ret?
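
As a generic illustration of an optional return parameter named for its
purpose, combined with freeing the new object when insertion is refused,
here is a small sketch.  Nothing in it is git code: struct entry_sketch,
struct table_sketch, table_insert() and table_add() are all hypothetical
names made up for the example.

```c
#include <stdlib.h>
#include <string.h>

struct entry_sketch {
	char *name;
};

/* Hypothetical container that refuses new entries once it is full. */
struct table_sketch {
	struct entry_sketch *slots[4];
	int nr;
};

static int table_insert(struct table_sketch *t, struct entry_sketch *e)
{
	if (t->nr >= 4)
		return -1;
	t->slots[t->nr++] = e;
	return 0;
}

/*
 * Create an entry called "name" and add it to the table.
 *
 * Returns 0 on success; if entry_ret is not NULL, the new entry is
 * stored there.  Returns -2 on failure; the half-built entry is freed
 * so that it does not leak, and *entry_ret is left untouched.
 */
static int table_add(struct table_sketch *t, const char *name,
		     struct entry_sketch **entry_ret)
{
	size_t len = strlen(name);
	struct entry_sketch *e = malloc(sizeof(*e));

	if (!e)
		return -2;
	e->name = malloc(len + 1);
	if (!e->name) {
		free(e);
		return -2;
	}
	memcpy(e->name, name, len + 1);

	if (table_insert(t, e)) {
		/* The table did not take ownership; release the entry. */
		free(e->name);
		free(e);
		return -2;
	}

	if (entry_ret)
		*entry_ret = e;
	return 0;
}
```

Callers that do not care about the new entry simply pass NULL as the
last argument.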

> +	return 0;
> +}
> +
>  /*
>   * "refresh" does not calculate a new sha1 file or bring the
>   * cache up-to-date for mode/content changes. But what it


* Re: [PATCH v6 04/13] merge-one-file: rewrite in C
  2020-11-24 11:53           ` [PATCH v6 04/13] merge-one-file: rewrite in C Alban Gruin
@ 2020-12-22 21:36             ` Junio C Hamano
  2021-01-03 22:41               ` Alban Gruin
  0 siblings, 1 reply; 221+ messages in thread
From: Junio C Hamano @ 2020-12-22 21:36 UTC (permalink / raw)
  To: Alban Gruin; +Cc: git, Phillip Wood

Alban Gruin <alban.gruin@gmail.com> writes:

> This rewrites `git merge-one-file' from shell to C.  This port is not
> completely straightforward: to save precious cycles by avoiding reading
> and flushing the index repeatedly and writing temporary files when an
> operation can be performed in-memory, and to allow other functions to
> use the rewrite without forking or worrying about the index, the calls
> to external processes are replaced by calls to functions in libgit.a:
>
>  - calls to `update-index --add --cacheinfo' are replaced by calls to
>    add_to_index_cacheinfo();
>
>  - calls to `update-index --remove' are replaced by calls to
>    remove_file_from_index();
>
>  - calls to `checkout-index -u -f' are replaced by calls to
>    checkout_entry();
>
>  - calls to `unpack-file' and `merge-file' are replaced by calls to
>    read_mmblob() and xdl_merge(), respectively, to merge files
>    in-memory;
>
>  - calls to `checkout-index -f --stage=2' are removed, as this is needed
>    to have the correct permission bits on the merged file from the
>    script, but not in the C version;
>
>  - calls to `update-index' are replaced by calls to add_file_to_index().
>
> The bulk of the rewrite is done in a new file in libgit.a,
> merge-strategies.c.  This will enable the resolve and octopus strategies
> to directly call it instead of forking.
>
> This also fixes a bug present in the original script: instead of
> checking if a _regular_ file exists when a file exists in the branch to
> merge, but not in our branch, the rewritten version checks if a file of
> any kind (i.e. a directory, ...) exists.  This fixes the test t6035.14,
> where the branch to merge had a new file, `a/b', but our branch had a
> directory there; it should have failed because a directory exists, but
> it did not because there was no regular file called `a/b'.  This test is
> now marked as successful.
>
> Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
> ---
>  Makefile                        |   3 +-
>  builtin.h                       |   1 +
>  builtin/merge-one-file.c        |  94 +++++++++++++++++
>  git-merge-one-file.sh           | 167 ------------------------------
>  git.c                           |   1 +
>  merge-strategies.c              | 178 ++++++++++++++++++++++++++++++++
>  merge-strategies.h              |  12 +++
>  t/t6415-merge-dir-to-symlink.sh |   2 +-
>  8 files changed, 289 insertions(+), 169 deletions(-)
>  create mode 100644 builtin/merge-one-file.c
>  delete mode 100755 git-merge-one-file.sh
>  create mode 100644 merge-strategies.c
>  create mode 100644 merge-strategies.h
>
> diff --git a/Makefile b/Makefile
> index de53954590..6dfdb33cb2 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -601,7 +601,6 @@ SCRIPT_SH += git-bisect.sh
>  SCRIPT_SH += git-difftool--helper.sh
>  SCRIPT_SH += git-filter-branch.sh
>  SCRIPT_SH += git-merge-octopus.sh
> -SCRIPT_SH += git-merge-one-file.sh
>  SCRIPT_SH += git-merge-resolve.sh
>  SCRIPT_SH += git-mergetool.sh
>  SCRIPT_SH += git-quiltimport.sh
> @@ -909,6 +908,7 @@ LIB_OBJS += match-trees.o
>  LIB_OBJS += mem-pool.o
>  LIB_OBJS += merge-blobs.o
>  LIB_OBJS += merge-recursive.o
> +LIB_OBJS += merge-strategies.o
>  LIB_OBJS += merge.o
>  LIB_OBJS += mergesort.o
>  LIB_OBJS += midx.o
> @@ -1094,6 +1094,7 @@ BUILTIN_OBJS += builtin/mailsplit.o
>  BUILTIN_OBJS += builtin/merge-base.o
>  BUILTIN_OBJS += builtin/merge-file.o
>  BUILTIN_OBJS += builtin/merge-index.o
> +BUILTIN_OBJS += builtin/merge-one-file.o
>  BUILTIN_OBJS += builtin/merge-ours.o
>  BUILTIN_OBJS += builtin/merge-recursive.o
>  BUILTIN_OBJS += builtin/merge-tree.o
> diff --git a/builtin.h b/builtin.h
> index 53fb290963..4d2cd78856 100644
> --- a/builtin.h
> +++ b/builtin.h
> @@ -178,6 +178,7 @@ int cmd_merge_base(int argc, const char **argv, const char *prefix);
>  int cmd_merge_index(int argc, const char **argv, const char *prefix);
>  int cmd_merge_ours(int argc, const char **argv, const char *prefix);
>  int cmd_merge_file(int argc, const char **argv, const char *prefix);
> +int cmd_merge_one_file(int argc, const char **argv, const char *prefix);
>  int cmd_merge_recursive(int argc, const char **argv, const char *prefix);
>  int cmd_merge_tree(int argc, const char **argv, const char *prefix);
>  int cmd_mktag(int argc, const char **argv, const char *prefix);
> diff --git a/builtin/merge-one-file.c b/builtin/merge-one-file.c
> new file mode 100644
> index 0000000000..9c21778e1d
> --- /dev/null
> +++ b/builtin/merge-one-file.c
> @@ -0,0 +1,94 @@
> +/*
> + * Builtin "git merge-one-file"
> + *
> + * Copyright (c) 2020 Alban Gruin
> + *
> + * Based on git-merge-one-file.sh, written by Linus Torvalds.
> + *
> + * This is the git per-file merge utility, called with
> + *
> + *   argv[1] - original file object name (or empty)
> + *   argv[2] - file in branch1 object name (or empty)
> + *   argv[3] - file in branch2 object name (or empty)
> + *   argv[4] - pathname in repository
> + *   argv[5] - original file mode (or empty)
> + *   argv[6] - file in branch1 mode (or empty)
> + *   argv[7] - file in branch2 mode (or empty)
> + *
> + * Handle some trivial cases. The _really_ trivial cases have been
> + * handled already by git read-tree, but that one doesn't do any merges
> + * that might change the tree layout.
> + */
> +
> +#define USE_THE_INDEX_COMPATIBILITY_MACROS
> +#include "cache.h"
> +#include "builtin.h"
> +#include "lockfile.h"
> +#include "merge-strategies.h"
> +
> +static const char builtin_merge_one_file_usage[] =
> +	"git merge-one-file <orig blob> <our blob> <their blob> <path> "
> +	"<orig mode> <our mode> <their mode>\n\n"
> +	"Blob ids and modes should be empty for missing files.";
> +
> +static int read_mode(const char *name, const char *arg, unsigned int *mode)
> +{
> +	char *last;
> +	int ret = 0;
> +
> +	*mode = strtol(arg, &last, 8);
> +
> +	if (*last)
> +		ret = error(_("invalid '%s' mode: expected nothing, got '%c'"), name, *last);
> +	else if (!(S_ISREG(*mode) || S_ISDIR(*mode) || S_ISLNK(*mode)))
> +		ret = error(_("invalid '%s' mode: %o"), name, *mode);
> +
> +	return ret;
> +}
> +
> +int cmd_merge_one_file(int argc, const char **argv, const char *prefix)
> +{
> +	struct object_id orig_blob, our_blob, their_blob,
> +		*p_orig_blob = NULL, *p_our_blob = NULL, *p_their_blob = NULL;
> +	unsigned int orig_mode = 0, our_mode = 0, their_mode = 0, ret = 0;
> +	struct lock_file lock = LOCK_INIT;
> +
> +	if (argc != 8)
> +		usage(builtin_merge_one_file_usage);
> +
> +	if (read_cache() < 0)
> +		die("invalid index");
> +
> +	hold_locked_index(&lock, LOCK_DIE_ON_ERROR);
> +
> +	if (!get_oid_hex(argv[1], &orig_blob)) {
> +		p_orig_blob = &orig_blob;
> +		ret = read_mode("orig", argv[5], &orig_mode);
> +	} else if (!*argv[1] && *argv[5])
> +		ret = error(_("no 'orig' object id given, but a mode was still given."));
> +
> +	if (!get_oid_hex(argv[2], &our_blob)) {
> +		p_our_blob = &our_blob;
> +		ret = read_mode("our", argv[6], &our_mode);
> +	} else if (!*argv[2] && *argv[6])
> +		ret = error(_("no 'our' object id given, but a mode was still given."));
> +
> +	if (!get_oid_hex(argv[3], &their_blob)) {
> +		p_their_blob = &their_blob;
> +		ret = read_mode("their", argv[7], &their_mode);
> +	} else if (!*argv[3] && *argv[7])
> +		ret = error(_("no 'their' object id given, but a mode was still given."));
> +
> +	if (ret)
> +		return ret;
> +
> +	ret = merge_three_way(the_repository, p_orig_blob, p_our_blob, p_their_blob,
> +			      argv[4], orig_mode, our_mode, their_mode);
> +
> +	if (ret) {
> +		rollback_lock_file(&lock);
> +		return !!ret;
> +	}
> +
> +	return write_locked_index(&the_index, &lock, COMMIT_LOCK);
> +}
> diff --git a/git-merge-one-file.sh b/git-merge-one-file.sh
> deleted file mode 100755
> index f6d9852d2f..0000000000
> --- a/git-merge-one-file.sh
> +++ /dev/null
> @@ -1,167 +0,0 @@
> -#!/bin/sh
> -#
> -# Copyright (c) Linus Torvalds, 2005
> -#
> -# This is the git per-file merge script, called with
> -#
> -#   $1 - original file SHA1 (or empty)
> -#   $2 - file in branch1 SHA1 (or empty)
> -#   $3 - file in branch2 SHA1 (or empty)
> -#   $4 - pathname in repository
> -#   $5 - original file mode (or empty)
> -#   $6 - file in branch1 mode (or empty)
> -#   $7 - file in branch2 mode (or empty)
> -#
> -# Handle some trivial cases.. The _really_ trivial cases have
> -# been handled already by git read-tree, but that one doesn't
> -# do any merges that might change the tree layout.
> -
> -USAGE='<orig blob> <our blob> <their blob> <path>'
> -USAGE="$USAGE <orig mode> <our mode> <their mode>"
> -LONG_USAGE="usage: git merge-one-file $USAGE
> -
> -Blob ids and modes should be empty for missing files."
> -
> -SUBDIRECTORY_OK=Yes
> -. git-sh-setup
> -cd_to_toplevel
> -require_work_tree
> -
> -if test $# != 7
> -then
> -	echo "$LONG_USAGE"
> -	exit 1
> -fi
> -
> -case "${1:-.}${2:-.}${3:-.}" in
> -#
> -# Deleted in both or deleted in one and unchanged in the other
> -#
> -"$1.." | "$1.$1" | "$1$1.")
> -	if { test -z "$6" && test "$5" != "$7"; } ||
> -	   { test -z "$7" && test "$5" != "$6"; }
> -	then
> -		echo "ERROR: File $4 deleted on one branch but had its" >&2
> -		echo "ERROR: permissions changed on the other." >&2
> -		exit 1
> -	fi
> -
> -	if test -n "$2"
> -	then
> -		echo "Removing $4"
> -	else
> -		# read-tree checked that index matches HEAD already,
> -		# so we know we do not have this path tracked.
> -		# there may be an unrelated working tree file here,
> -		# which we should just leave unmolested.  Make sure
> -		# we do not have it in the index, though.
> -		exec git update-index --remove -- "$4"
> -	fi
> -	if test -f "$4"
> -	then
> -		rm -f -- "$4" &&
> -		rmdir -p "$(expr "z$4" : 'z\(.*\)/')" 2>/dev/null || :
> -	fi &&
> -		exec git update-index --remove -- "$4"
> -	;;
> -
> -#
> -# Added in one.
> -#
> -".$2.")
> -	# the other side did not add and we added so there is nothing
> -	# to be done, except making the path merged.
> -	exec git update-index --add --cacheinfo "$6" "$2" "$4"
> -	;;
> -"..$3")
> -	echo "Adding $4"
> -	if test -f "$4"
> -	then
> -		echo "ERROR: untracked $4 is overwritten by the merge." >&2
> -		exit 1
> -	fi
> -	git update-index --add --cacheinfo "$7" "$3" "$4" &&
> -		exec git checkout-index -u -f -- "$4"
> -	;;
> -
> -#
> -# Added in both, identically (check for same permissions).
> -#
> -".$3$2")
> -	if test "$6" != "$7"
> -	then
> -		echo "ERROR: File $4 added identically in both branches," >&2
> -		echo "ERROR: but permissions conflict $6->$7." >&2
> -		exit 1
> -	fi
> -	echo "Adding $4"
> -	git update-index --add --cacheinfo "$6" "$2" "$4" &&
> -		exec git checkout-index -u -f -- "$4"
> -	;;
> -
> -#
> -# Modified in both, but differently.
> -#
> -"$1$2$3" | ".$2$3")
> -
> -	case ",$6,$7," in
> -	*,120000,*)
> -		echo "ERROR: $4: Not merging symbolic link changes." >&2
> -		exit 1
> -		;;
> -	*,160000,*)
> -		echo "ERROR: $4: Not merging conflicting submodule changes." >&2
> -		exit 1
> -		;;
> -	esac
> -
> -	src1=$(git unpack-file $2)
> -	src2=$(git unpack-file $3)
> -	case "$1" in
> -	'')
> -		echo "Added $4 in both, but differently."
> -		orig=$(git unpack-file $(git hash-object /dev/null))
> -		;;
> -	*)
> -		echo "Auto-merging $4"
> -		orig=$(git unpack-file $1)
> -		;;
> -	esac
> -
> -	git merge-file "$src1" "$orig" "$src2"
> -	ret=$?
> -	msg=
> -	if test $ret != 0 || test -z "$1"
> -	then
> -		msg='content conflict'
> -		ret=1
> -	fi
> -
> -	# Create the working tree file, using "our tree" version from the
> -	# index, and then store the result of the merge.
> -	git checkout-index -f --stage=2 -- "$4" && cat "$src1" >"$4" || exit 1
> -	rm -f -- "$orig" "$src1" "$src2"
> -
> -	if test "$6" != "$7"
> -	then
> -		if test -n "$msg"
> -		then
> -			msg="$msg, "
> -		fi
> -		msg="${msg}permissions conflict: $5->$6,$7"
> -		ret=1
> -	fi
> -
> -	if test $ret != 0
> -	then
> -		echo "ERROR: $msg in $4" >&2
> -		exit 1
> -	fi
> -	exec git update-index -- "$4"
> -	;;
> -
> -*)
> -	echo "ERROR: $4: Not handling case $1 -> $2 -> $3" >&2
> -	;;
> -esac
> -exit 1
> diff --git a/git.c b/git.c
> index f1e8b56d99..a4d3f98094 100644
> --- a/git.c
> +++ b/git.c
> @@ -540,6 +540,7 @@ static struct cmd_struct commands[] = {
>  	{ "merge-file", cmd_merge_file, RUN_SETUP_GENTLY },
>  	{ "merge-index", cmd_merge_index, RUN_SETUP | NO_PARSEOPT },
>  	{ "merge-ours", cmd_merge_ours, RUN_SETUP | NO_PARSEOPT },
> +	{ "merge-one-file", cmd_merge_one_file, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
>  	{ "merge-recursive", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
>  	{ "merge-recursive-ours", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
>  	{ "merge-recursive-theirs", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
> diff --git a/merge-strategies.c b/merge-strategies.c
> new file mode 100644
> index 0000000000..20a328bf57
> --- /dev/null
> +++ b/merge-strategies.c
> @@ -0,0 +1,178 @@
> +#include "cache.h"
> +#include "dir.h"
> +#include "merge-strategies.h"
> +#include "xdiff-interface.h"
> +
> +static int checkout_from_index(struct index_state *istate, const char *path,
> +			       struct cache_entry *ce)
> +{
> +	struct checkout state = CHECKOUT_INIT;
> +
> +	state.istate = istate;
> +	state.force = 1;
> +	state.base_dir = "";
> +	state.base_dir_len = 0;
> +
> +	if (checkout_entry(ce, &state, NULL, NULL) < 0)
> +		return error(_("%s: cannot checkout file"), path);
> +	return 0;
> +}
> +
> +static int merge_one_file_deleted(struct index_state *istate,
> +				  const struct object_id *our_blob,
> +				  const struct object_id *their_blob, const char *path,
> +				  unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode)
> +{
> +	if ((our_blob && orig_mode != our_mode) ||
> +	    (their_blob && orig_mode != their_mode))
> +		return error(_("File %s deleted on one branch but had its "
> +			       "permissions changed on the other."), path);
> +
> +	if (our_blob) {
> +		printf(_("Removing %s\n"), path);
> +
> +		if (file_exists(path))
> +			remove_path(path);
> +	}
> +
> +	if (remove_file_from_index(istate, path))
> +		return error("%s: cannot remove from the index", path);
> +	return 0;
> +}
> +
> +static int do_merge_one_file(struct index_state *istate,
> +			     const struct object_id *orig_blob,
> +			     const struct object_id *our_blob,
> +			     const struct object_id *their_blob, const char *path,
> +			     unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode)
> +{
> +	int ret, i, dest;
> +	ssize_t written;
> +	mmbuffer_t result = {NULL, 0};
> +	mmfile_t mmfs[3];
> +	xmparam_t xmp = {{0}};
> +
> +	if (our_mode == S_IFLNK || their_mode == S_IFLNK)
> +		return error(_("%s: Not merging symbolic link changes."), path);
> +	else if (our_mode == S_IFGITLINK || their_mode == S_IFGITLINK)
> +		return error(_("%s: Not merging conflicting submodule changes."), path);
> +
> +	if (orig_blob) {
> +		printf(_("Auto-merging %s\n"), path);
> +		read_mmblob(mmfs + 0, orig_blob);
> +	} else {
> +		printf(_("Added %s in both, but differently.\n"), path);
> +		read_mmblob(mmfs + 0, &null_oid);
> +	}
> +
> +	read_mmblob(mmfs + 1, our_blob);
> +	read_mmblob(mmfs + 2, their_blob);
> +
> +	xmp.level = XDL_MERGE_ZEALOUS_ALNUM;
> +	xmp.style = 0;
> +	xmp.favor = 0;
> +
> +	ret = xdl_merge(mmfs + 0, mmfs + 1, mmfs + 2, &xmp, &result);
> +
> +	for (i = 0; i < 3; i++)
> +		free(mmfs[i].ptr);
> +
> +	if (ret < 0) {
> +		free(result.ptr);
> +		return error(_("Failed to execute internal merge"));
> +	}
> +
> +	if (ret > 0 || !orig_blob)
> +		ret = error(_("content conflict in %s"), path);
> +	if (our_mode != their_mode)
> +		ret = error(_("permission conflict: %o->%o,%o in %s"),
> +			    orig_mode, our_mode, their_mode, path);
> +
> +	unlink(path);
> +	if ((dest = open(path, O_WRONLY | O_CREAT, our_mode)) < 0) {
> +		free(result.ptr);
> +		return error_errno(_("failed to open file '%s'"), path);
> +	}
> +
> +	written = write_in_full(dest, result.ptr, result.size);
> +	close(dest);
> +
> +	free(result.ptr);
> +
> +	if (written < 0)
> +		return error_errno(_("failed to write to '%s'"), path);
> +	if (ret)
> +		return ret;
> +
> +	return add_file_to_index(istate, path, 0);
> +}
> +
> +int merge_three_way(struct repository *r,
> +		    const struct object_id *orig_blob,
> +		    const struct object_id *our_blob,
> +		    const struct object_id *their_blob, const char *path,
> +		    unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode)
> +{
> +	if (orig_blob &&
> +	    ((!their_blob && our_blob && oideq(orig_blob, our_blob)) ||
> +	     (!our_blob && their_blob && oideq(orig_blob, their_blob)))) {
> +		/* Deleted in both or deleted in one and unchanged in the other. */
> +		return merge_one_file_deleted(r->index, our_blob, their_blob, path,
> +					      orig_mode, our_mode, their_mode);

When both ours and theirs deleted the path, orig_blob by definition
cannot be NULL, so the "orig_blob &&" part would be true, but the other
part, which requires either (!their && our) or (!our && their) to hold,
cannot be satisfied.  So the comment does not match the behaviour.

You'd need "(!their_blob && !our_blob) ||" in the second part?

This shows a lack of test coverage, I think.  A manual test seems to
trigger the "unhandled case" error you added:

$ make
$ ./git-merge-one-file $(git rev-parse :COPYING) "" "" \
	COPYING \
	100644 "" ""
error: COPYING: Not handling case 536e55524db72bd2acf175208aef4f3dfc148d42 ->  ->
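
The condition discussed above can be modelled outside of git.  What
follows is only a sketch under stated assumptions: object ids are plain
strings, NULL means "missing", and is_deleted_case() and same_id() are
hypothetical names that are not part of the series.  It includes the
"(!their_blob && !our_blob) ||" term suggested above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for git's oideq(); both ids must be present (non-NULL). */
int same_id(const char *a, const char *b)
{
	assert(a && b);
	return !strcmp(a, b);
}

/*
 * Returns 1 when the path was deleted in both branches, or deleted in
 * one branch and left unchanged in the other.  NULL means "missing".
 */
int is_deleted_case(const char *orig, const char *ours, const char *theirs)
{
	if (!orig)
		return 0;
	return (!ours && !theirs) ||
	       (!theirs && ours && same_id(orig, ours)) ||
	       (!ours && theirs && same_id(orig, theirs));
}
```

With this predicate, the manual test above (orig present, ours and
theirs both empty) falls into the "deleted" arm rather than the
"not handling case" arm.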

> +	} else if (!orig_blob && our_blob && !their_blob) {
> +		/*
> +		 * Added in one.  The other side did not add and we
> +		 * added so there is nothing to be done, except making
> +		 * the path merged.
> +		 */

This is not the sole "Added in one" case.  The next elseif arm also
is added in one.

What is notable in this elseif arm is that this is "added in ours",
which allows us (and forces us) not to touch the working tree with
extra "checkout".  So either remove "Added in one" from here for
symmetry with the next elseif arm, or better yet say "Added in
ours".

> +		return add_to_index_cacheinfo(r->index, our_mode, our_blob,
> +					      path, 0, 1, 1, NULL);

All callers of add_to_index_cacheinfo() use 0, 1, 1 for stage,
allow_add and allow_replace, except for the original.  The new
callers you added should not have to keep repeating 0, 1, 1 like
this caller does (see below).

> +	} else if (!orig_blob && !our_blob && their_blob) {
> +		struct cache_entry *ce;
> +		printf(_("Adding %s\n"), path);
> +
> +		if (file_exists(path))
> +			return error(_("untracked %s is overwritten by the merge."), path);
> +
> +		if (add_to_index_cacheinfo(r->index, their_mode, their_blob,
> +					   path, 0, 1, 1, &ce))
> +			return -1;
> +		return checkout_from_index(r->index, path, ce);

"git grep -A4 -e add_to_index_cacheinfo" after applying all patches
in the series shows us that the &ce parameter was added only to call
checkout_from_index() using it.

I doubt add_to_index_cacheinfo() is the right interface for this
series.  This caller (and all other callers in the series that call
add_to_index_cacheinfo(), followed by checkout_from_index()) rather
wants to have a function (defined in <cache.h>):

	extern int add_merge_result_to_index(struct index_state *,
			unsigned int mode,
			const struct object_id *oid,
			const char *path,
			int checkout);

with which the last 4 lines of the above hunk can just become

		return add_merge_result_to_index(r->index,
			their_mode, their_blob, path, 1);

I would think.  The earlier caller to add_to_index_cacheinfo() for
"ours is the result" can pass 0 to the checkout parameter so the
helper won't make a call to checkout_from_index().

And the step to add that helper would be in this patch (it could be
after the previous step and before this step, but it is probably
easier to understand if the new helper is introduced with its
callers).

If we were to do that, then I do not mind the repetition of 0, 1, 1
too much.
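
A minimal sketch of the shape such a helper might take, assuming
nothing about git's real data structures: the struct definitions
and the stub_add_cacheinfo() and stub_checkout() functions below are
hypothetical stand-ins for add_to_index_cacheinfo() and
checkout_from_index(), and only count calls so the control flow can be
checked in isolation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real types are declared in git's headers. */
struct index_state { int nr_added; int nr_checked_out; };
struct object_id { unsigned char hash[32]; };
struct cache_entry { unsigned int mode; };

/* Stub for add_to_index_cacheinfo(): records that an entry was added. */
static int stub_add_cacheinfo(struct index_state *istate, unsigned int mode,
			      const struct object_id *oid, const char *path,
			      struct cache_entry *ce)
{
	if (!oid || !path)
		return -1;
	ce->mode = mode;
	istate->nr_added++;
	return 0;
}

/* Stub for checkout_from_index(): pretends to write the file out. */
static int stub_checkout(struct index_state *istate, const char *path,
			 const struct cache_entry *ce)
{
	(void)path;
	(void)ce;
	istate->nr_checked_out++;
	return 0;
}

/* Add the merge result to the index; check it out only when asked to. */
int add_merge_result_to_index(struct index_state *istate, unsigned int mode,
			      const struct object_id *oid, const char *path,
			      int checkout)
{
	struct cache_entry ce;

	assert(istate);
	if (stub_add_cacheinfo(istate, mode, oid, path, &ce))
		return -1;
	if (!checkout)
		return 0;
	return stub_checkout(istate, path, &ce);
}
```

In this model the "added in ours" arm would pass 0 for checkout, and
the "added in theirs" and "added identically in both" arms would pass 1.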

> +	} else if (!orig_blob && our_blob && their_blob &&
> +		   oideq(our_blob, their_blob)) {
> +		struct cache_entry *ce;
> +
> +		/* Added in both, identically (check for same permissions). */
> +		if (our_mode != their_mode)
> +			return error(_("File %s added identically in both branches, "
> +				       "but permissions conflict %o->%o."),
> +				     path, our_mode, their_mode);
> +
> +		printf(_("Adding %s\n"), path);
> +
> +		if (add_to_index_cacheinfo(r->index, our_mode, our_blob,
> +					   path, 0, 1, 1, &ce))
> +			return -1;
> +		return checkout_from_index(r->index, path, ce);

Likewise; this wants to call add_merge_result_to_index(), too.

> +	} else if (our_blob && their_blob) {
> +		/* Modified in both, but differently. */
> +		return do_merge_one_file(r->index,
> +					 orig_blob, our_blob, their_blob, path,
> +					 orig_mode, our_mode, their_mode);
> +	} else {
> +		char orig_hex[GIT_MAX_HEXSZ] = {0}, our_hex[GIT_MAX_HEXSZ] = {0},
> +			their_hex[GIT_MAX_HEXSZ] = {0};
> +
> +		if (orig_blob)
> +			oid_to_hex_r(orig_hex, orig_blob);
> +		if (our_blob)
> +			oid_to_hex_r(our_hex, our_blob);
> +		if (their_blob)
> +			oid_to_hex_r(their_hex, their_blob);
> +
> +		return error(_("%s: Not handling case %s -> %s -> %s"),
> +			     path, orig_hex, our_hex, their_hex);
> +	}
> +
> +	return 0;
> +}
> diff --git a/merge-strategies.h b/merge-strategies.h
> new file mode 100644
> index 0000000000..e624c4f27c
> --- /dev/null
> +++ b/merge-strategies.h
> @@ -0,0 +1,12 @@
> +#ifndef MERGE_STRATEGIES_H
> +#define MERGE_STRATEGIES_H
> +
> +#include "object.h"
> +
> +int merge_three_way(struct repository *r,
> +		    const struct object_id *orig_blob,
> +		    const struct object_id *our_blob,
> +		    const struct object_id *their_blob, const char *path,
> +		    unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode);
> +
> +#endif /* MERGE_STRATEGIES_H */
> diff --git a/t/t6415-merge-dir-to-symlink.sh b/t/t6415-merge-dir-to-symlink.sh
> index 2eddcc7664..5fb74e39a0 100755
> --- a/t/t6415-merge-dir-to-symlink.sh
> +++ b/t/t6415-merge-dir-to-symlink.sh
> @@ -94,7 +94,7 @@ test_expect_success SYMLINKS 'a/b was resolved as symlink' '
>  	test -h a/b
>  '
>  
> -test_expect_failure 'do not lose untracked in merge (resolve)' '
> +test_expect_success 'do not lose untracked in merge (resolve)' '
>  	git reset --hard &&
>  	git checkout baseline^0 &&
>  	>a/b/c/e &&


* Re: [PATCH v6 04/13] merge-one-file: rewrite in C
  2020-12-22 21:36             ` Junio C Hamano
@ 2021-01-03 22:41               ` Alban Gruin
  2021-01-08  6:54                 ` Junio C Hamano
  0 siblings, 1 reply; 221+ messages in thread
From: Alban Gruin @ 2021-01-03 22:41 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Phillip Wood

Hi Junio,

Thank you for your comments.

Le 22/12/2020 à 22:36, Junio C Hamano a écrit :
> Alban Gruin <alban.gruin@gmail.com> writes:
> 
>> +int merge_three_way(struct repository *r,
>> +		    const struct object_id *orig_blob,
>> +		    const struct object_id *our_blob,
>> +		    const struct object_id *their_blob, const char *path,
>> +		    unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode)
>> +{
>> +	if (orig_blob &&
>> +	    ((!their_blob && our_blob && oideq(orig_blob, our_blob)) ||
>> +	     (!our_blob && their_blob && oideq(orig_blob, their_blob)))) {
>> +		/* Deleted in both or deleted in one and unchanged in the other. */
>> +		return merge_one_file_deleted(r->index, our_blob, their_blob, path,
>> +					      orig_mode, our_mode, their_mode);
> 
> When both ours and theirs deleted the path, orig_blob by definition
> cannot be NULL, so the "orig_blob &&" part would be true, but the other
> part, which requires either (!their && our) or (!our && their) to hold,
> cannot be satisfied.  So the comment does not match the behaviour.
> 
> You'd need "(!their_blob && !our_blob) ||" in the second part?
> 

Yes, you're correct.

> This shows a lack of test coverage, I think.  A manual test seems to
> trigger the "unhandled case" error you added:
> 
> $ make
> $ ./git-merge-one-file $(git rev-parse :COPYING) "" "" \
> 	COPYING \
> 	100644 "" ""
> error: COPYING: Not handling case 536e55524db72bd2acf175208aef4f3dfc148d42 ->  ->
> 

Okay, I will add a test case for this.

>> +	} else if (!orig_blob && our_blob && !their_blob) {
>> +		/*
>> +		 * Added in one.  The other side did not add and we
>> +		 * added so there is nothing to be done, except making
>> +		 * the path merged.
>> +		 */
> 
> This is not the sole "Added in one" case.  The next elseif arm also
> is added in one.
> 
> What is notable in this elseif arm is that this is "added in ours",
> which allows us (and forces us) not to touch the working tree with
> extra "checkout".  So either remove "Added in one" from here for
> symmetry with the next elseif arm, or better yet say "Added in
> ours".
> 
>> +		return add_to_index_cacheinfo(r->index, our_mode, our_blob,
>> +					      path, 0, 1, 1, NULL);
> 
> All callers of add_to_index_cacheinfo() use 0, 1, 1 for stage,
> allow_add and allow_replace, except for the original.  The new
> callers you added should not have to keep repeating 0, 1, 1 like
> this caller does (see below).
> 
>> +	} else if (!orig_blob && !our_blob && their_blob) {
>> +		struct cache_entry *ce;
>> +		printf(_("Adding %s\n"), path);
>> +
>> +		if (file_exists(path))
>> +			return error(_("untracked %s is overwritten by the merge."), path);
>> +
>> +		if (add_to_index_cacheinfo(r->index, their_mode, their_blob,
>> +					   path, 0, 1, 1, &ce))
>> +			return -1;
>> +		return checkout_from_index(r->index, path, ce);
> 
> "git grep -A4 -e add_to_index_cacheinfo" after applying all patches
> in the series shows us that the &ce parameter was added only to call
> checkout_from_index() using it.
> 
> I doubt add_to_index_cacheinfo() is the right interface for this
> series.  This caller (and all other callers in the series that call
> add_to_index_cacheinfo(), followed by checkout_from_index()) rather
> wants to have a function (defined in <cache.h>):
> 
> 	extern int add_merge_result_to_index(struct index_state *,
> 			unsigned int mode,
> 			const struct object_id *oid,
> 			const char *path,
> 			int checkout);
> 
> with which the last 4 lines of the above hunk can just become
> 
> 		return add_merge_result_to_index(r->index,
> 			their_mode, their_blob, path, 1);
> 
> I would think.  The earlier caller to add_to_index_cacheinfo() for
> "ours is the result" can pass 0 to the checkout parameter so the
> helper won't make a call to checkout_from_index().
> 
> And the step to add that helper would be in this patch (it could be
> after the previous step and before this step, but it is probably
> easier to understand if the new helper is introduced with its
> callers).
> 
> If we were to do that, then I do not mind the repetition of 0, 1, 1
> too much.
> 

Okay.  Are we sure we want add_merge_result_to_index() inside
read-cache.c/cache.h?

Cheers,
Alban



* Re: [PATCH v6 05/13] merge-index: libify merge_one_path() and merge_all()
  2020-11-24 11:53           ` [PATCH v6 05/13] merge-index: libify merge_one_path() and merge_all() Alban Gruin
@ 2021-01-05 15:59             ` Derrick Stolee
  2021-01-05 23:20               ` Alban Gruin
  0 siblings, 1 reply; 221+ messages in thread
From: Derrick Stolee @ 2021-01-05 15:59 UTC (permalink / raw)
  To: Alban Gruin, git; +Cc: Junio C Hamano, Phillip Wood

On 11/24/2020 6:53 AM, Alban Gruin wrote:
> The "resolve" and "octopus" merge strategies do not call directly `git
> merge-one-file', they delegate the work to another git command, `git
> merge-index', that will loop over files in the index and call the
> specified command.  Unfortunately, these functions are not part of
> libgit.a, which means that once rewritten, the strategies would still
> have to invoke `merge-one-file' by spawning a new process first.

This is a good thing to do.
 
> To avoid this, this moves and renames merge_one_path(), merge_all(), and
> their helpers to merge-strategies.c.  They also take a callback to
> dictate what they should do for each file.  For now, to preserve the
> behaviour of `merge-index', only one callback, launching a new process,
> is defined.

I don't think the callback should be in libgit.a, though. The callback
itself should be a static method inside builtin/merge-index.c.

> Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
> ---
>  builtin/merge-index.c |  77 +++----------------------------
>  merge-strategies.c    | 104 ++++++++++++++++++++++++++++++++++++++++++
>  merge-strategies.h    |  19 ++++++++
>  3 files changed, 130 insertions(+), 70 deletions(-)
> 
> diff --git a/builtin/merge-index.c b/builtin/merge-index.c
> index 38ea6ad6ca..d5e5713b25 100644
> --- a/builtin/merge-index.c
> +++ b/builtin/merge-index.c
> @@ -1,74 +1,11 @@
>  #define USE_THE_INDEX_COMPATIBILITY_MACROS
>  #include "builtin.h"
> -#include "run-command.h"
> -
> -static const char *pgm;
> -static int one_shot, quiet;
> -static int err;
> -
> -static int merge_entry(int pos, const char *path)
> -{
> -	int found;
> -	const char *arguments[] = { pgm, "", "", "", path, "", "", "", NULL };
> -	char hexbuf[4][GIT_MAX_HEXSZ + 1];
> -	char ownbuf[4][60];
> -
> -	if (pos >= active_nr)
> -		die("git merge-index: %s not in the cache", path);
> -	found = 0;
> -	do {
> -		const struct cache_entry *ce = active_cache[pos];
> -		int stage = ce_stage(ce);
> -
> -		if (strcmp(ce->name, path))
> -			break;
> -		found++;
> -		oid_to_hex_r(hexbuf[stage], &ce->oid);
> -		xsnprintf(ownbuf[stage], sizeof(ownbuf[stage]), "%o", ce->ce_mode);
> -		arguments[stage] = hexbuf[stage];
> -		arguments[stage + 4] = ownbuf[stage];
> -	} while (++pos < active_nr);
> -	if (!found)
> -		die("git merge-index: %s not in the cache", path);
> -
> -	if (run_command_v_opt(arguments, 0)) {
> -		if (one_shot)
> -			err++;
> -		else {
> -			if (!quiet)
> -				die("merge program failed");
> -			exit(1);
> -		}
> -	}
> -	return found;
> -}
> -
> -static void merge_one_path(const char *path)
> -{
> -	int pos = cache_name_pos(path, strlen(path));
> -
> -	/*
> -	 * If it already exists in the cache as stage0, it's
> -	 * already merged and there is nothing to do.
> -	 */
> -	if (pos < 0)
> -		merge_entry(-pos-1, path);
> -}
> -
> -static void merge_all(void)
> -{
> -	int i;
> -	for (i = 0; i < active_nr; i++) {
> -		const struct cache_entry *ce = active_cache[i];
> -		if (!ce_stage(ce))
> -			continue;
> -		i += merge_entry(i, ce->name)-1;
> -	}
> -}
> +#include "merge-strategies.h"
>  
>  int cmd_merge_index(int argc, const char **argv, const char *prefix)
>  {
> -	int i, force_file = 0;
> +	int i, force_file = 0, err = 0, one_shot = 0, quiet = 0;
> +	const char *pgm;
>  
>  	/* Without this we cannot rely on waitpid() to tell
>  	 * what happened to our children.
> @@ -98,14 +35,14 @@ int cmd_merge_index(int argc, const char **argv, const char *prefix)
>  				continue;
>  			}
>  			if (!strcmp(arg, "-a")) {
> -				merge_all();
> +				err |= merge_all_index(the_repository, one_shot, quiet,
> +						       merge_one_file_spawn, (void *)pgm);

This hunk makes it look like pgm is uninitialized, but it is set earlier
in cmd_merge_index() (previously referring to the global instance). Good.

> +int merge_one_file_spawn(struct repository *r,
> +			 const struct object_id *orig_blob,
> +			 const struct object_id *our_blob,
> +			 const struct object_id *their_blob, const char *path,
> +			 unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
> +			 void *data)
> +{
> +	char oids[3][GIT_MAX_HEXSZ + 1] = {{0}};
> +	char modes[3][10] = {{0}};
> +	const char *arguments[] = { (char *)data, oids[0], oids[1], oids[2],
> +				    path, modes[0], modes[1], modes[2], NULL };
> +
> +	if (orig_blob) {
> +		oid_to_hex_r(oids[0], orig_blob);
> +		xsnprintf(modes[0], sizeof(modes[0]), "%06o", orig_mode);
> +	}
> +
> +	if (our_blob) {
> +		oid_to_hex_r(oids[1], our_blob);
> +		xsnprintf(modes[1], sizeof(modes[1]), "%06o", our_mode);
> +	}
> +
> +	if (their_blob) {
> +		oid_to_hex_r(oids[2], their_blob);
> +		xsnprintf(modes[2], sizeof(modes[2]), "%06o", their_mode);
> +	}
> +
> +	return run_command_v_opt(arguments, 0);
> +}

Yes, this would be better in the builtin code. Better to keep the meaning
of 'data' clear in the context of that file.
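
To illustrate the split being suggested, here is a simplified sketch,
not the code from the series: merge_fn is reduced to a hypothetical
two-argument form, and for_each_unmerged(), spawn_cb() and
builtin_merge_all() are made-up names.  The point is only that the
iterating code treats 'data' as opaque, while the static callback next
to the command is the one place that knows it is the program name.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified, hypothetical version of the series' merge_fn typedef. */
typedef int (*merge_fn)(const char *path, void *data);

/* "Library" side: walks the paths and treats data as opaque. */
int for_each_unmerged(const char *const *paths, size_t nr,
		      merge_fn fn, void *data)
{
	size_t i;
	int err = 0;

	assert(fn);
	for (i = 0; i < nr; i++)
		if (fn(paths[i], data))
			err++;
	return err;
}

/*
 * "Builtin" side: the callback is static, and this file is the only
 * place that knows data points to the program name.
 */
static int spawn_cb(const char *path, void *data)
{
	const char *pgm = data;

	/* A real callback would run pgm on path; this stub only validates. */
	return !pgm || !*pgm || !path || !*path;
}

/* Entry point of the "builtin": the only caller that passes pgm as data. */
int builtin_merge_all(const char *pgm, const char *const *paths, size_t nr)
{
	return for_each_unmerged(paths, nr, spawn_cb, (void *)pgm);
}
```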

> +static int merge_entry(struct repository *r, int quiet, unsigned int pos,
> +		       const char *path, int *err, merge_fn fn, void *data)
> +{
> +	int found = 0;
> +	const struct object_id *oids[3] = {NULL};
> +	unsigned int modes[3] = {0};
> +
> +	do {
> +		const struct cache_entry *ce = r->index->cache[pos];
> +		int stage = ce_stage(ce);
> +
> +		if (strcmp(ce->name, path))
> +			break;
> +		found++;
> +		oids[stage - 1] = &ce->oid;
> +		modes[stage - 1] = ce->ce_mode;
> +	} while (++pos < r->index->cache_nr);
> +	if (!found)
> +		return error(_("%s is not in the cache"), path);
> +
> +	if (fn(r, oids[0], oids[1], oids[2], path,
> +	       modes[0], modes[1], modes[2], data)) {
> +		if (!quiet)
> +			error(_("Merge program failed"));
> +		(*err)++;
> +	}
> +
> +	return found;
> +}
> +
> +int merge_index_path(struct repository *r, int oneshot, int quiet,
> +		     const char *path, merge_fn fn, void *data)
> +{
> +	int pos = index_name_pos(r->index, path, strlen(path)), ret, err = 0;
> +
> +	/*
> +	 * If it already exists in the cache as stage0, it's
> +	 * already merged and there is nothing to do.
> +	 */
> +	if (pos < 0) {
> +		ret = merge_entry(r, quiet || oneshot, -pos - 1, path, &err, fn, data);
> +		if (ret == -1)
> +			return -1;
> +		else if (err)
> +			return 1;
> +	}
> +	return 0;
> +}
> +
> +int merge_all_index(struct repository *r, int oneshot, int quiet,
> +		    merge_fn fn, void *data)
> +{
> +	int err = 0, ret;
> +	unsigned int i;
> +
> +	for (i = 0; i < r->index->cache_nr; i++) {
> +		const struct cache_entry *ce = r->index->cache[i];
> +		if (!ce_stage(ce))
> +			continue;
> +
> +		ret = merge_entry(r, quiet || oneshot, i, ce->name, &err, fn, data);
> +		if (ret > 0)
> +			i += ret - 1;
> +		else if (ret == -1)
> +			return -1;
> +
> +		if (err && !oneshot)
> +			return 1;
> +	}
> +
> +	return err;
> +}

I notice that these methods don't actually use the repository pointer
more than they just use 'r->index'. Should they instead take a
'struct index_state *istate' directly? (I see that the repository is
used later by merge_strategies_resolve(), but not in these.)

If you think it likely that we will need a repository for these methods,
then feel free to ignore me and keep your 'r' pointer.

Thanks,
-Stolee


* Re: [PATCH v6 06/13] merge-index: don't fork if the requested program is `git-merge-one-file'
  2020-11-24 11:53           ` [PATCH v6 06/13] merge-index: don't fork if the requested program is `git-merge-one-file' Alban Gruin
@ 2021-01-05 16:11             ` Derrick Stolee
  2021-01-05 17:35               ` Martin Ågren
  2021-01-05 23:20               ` Alban Gruin
  0 siblings, 2 replies; 221+ messages in thread
From: Derrick Stolee @ 2021-01-05 16:11 UTC (permalink / raw)
  To: Alban Gruin, git; +Cc: Junio C Hamano, Phillip Wood

On 11/24/2020 6:53 AM, Alban Gruin wrote:
> +
>  	pgm = argv[i++];
> +	setup_work_tree();
> +
> +	if (!strcmp(pgm, "git-merge-one-file")) {

This stood out to me as possibly fragile. What if we call the
non-dashed form "git merge-one-file"? Shouldn't we be doing so?

Or, is this something that is handled higher in the builtin
machinery to take the non-dashed version and change it to the
dashed version for historical reasons?

> +		merge_action = merge_one_file_func;
> +		hold_locked_index(&lock, LOCK_DIE_ON_ERROR);
> +	} else {
> +		merge_action = merge_one_file_spawn;
> +		data = (void *)pgm;
> +	}
> +

...

> +	if (merge_action == merge_one_file_func) {

nit: This made me think it would be better to check the 'lock'
itself to see if it was initialized or not. Perhaps

	if (lock.tempfile) {

would be the appropriate way to check this?

For now, this is equivalent behavior, but it might be helpful if
we add more cases that take the lock in the future.

> +		if (err) {
> +			rollback_lock_file(&lock);
> +			return err;
> +		}
> +
> +		return write_locked_index(&the_index, &lock, COMMIT_LOCK);
>  	}
>  	return err;

nit: this could be simplified. In total, I recommend:

	if (lock.tempfile) {
		if (err)
			rollback_lock_file(&lock);
		else
			return write_locked_index(&the_index, &lock, COMMIT_LOCK);
	}
	return err;
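
The control flow of this suggestion can be checked in isolation.  The
sketch below is an assumption-laden model, not git code: struct
lock_file here is a hypothetical stand-in holding only a tempfile
pointer plus two bookkeeping flags, and stub_commit() and
stub_rollback() take the place of write_locked_index() and
rollback_lock_file().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for git's struct lock_file. */
struct lock_file {
	void *tempfile;   /* non-NULL while the lock is held */
	int committed;
	int rolled_back;
};

/* Stub for rollback_lock_file(): releases the lock, discarding changes. */
static void stub_rollback(struct lock_file *lk)
{
	lk->tempfile = NULL;
	lk->rolled_back = 1;
}

/* Stub for write_locked_index(..., COMMIT_LOCK): always succeeds here. */
static int stub_commit(struct lock_file *lk)
{
	lk->tempfile = NULL;
	lk->committed = 1;
	return 0;
}

/* Commit on success, roll back on error, do nothing if no lock is held. */
int finish_merge_index(struct lock_file *lock, int err)
{
	assert(lock);
	if (lock->tempfile) {
		if (err)
			stub_rollback(lock);
		else
			return stub_commit(lock);
	}
	return err;
}
```

In all three situations (lock held and no error, lock held and error,
lock not held) the return value matches the longer version quoted
above.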


>  }
> diff --git a/merge-strategies.c b/merge-strategies.c
> index 6f27e66dfe..542cefcf3d 100644
> --- a/merge-strategies.c
> +++ b/merge-strategies.c
> @@ -178,6 +178,18 @@ int merge_three_way(struct repository *r,
>  	return 0;
>  }
>  
> +int merge_one_file_func(struct repository *r,
> +			const struct object_id *orig_blob,
> +			const struct object_id *our_blob,
> +			const struct object_id *their_blob, const char *path,
> +			unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
> +			void *data)
> +{
> +	return merge_three_way(r,
> +			       orig_blob, our_blob, their_blob, path,
> +			       orig_mode, our_mode, their_mode);
> +}
> +

Again, I don't recommend making this callback in the library. Instead, keep
it in the builtin and then use merge_three_way() which is in the library.

>  int merge_one_file_spawn(struct repository *r,
>  			 const struct object_id *orig_blob,
>  			 const struct object_id *our_blob,
> @@ -261,17 +273,22 @@ int merge_all_index(struct repository *r, int oneshot, int quiet,
>  		    merge_fn fn, void *data)
>  {
>  	int err = 0, ret;
> -	unsigned int i;
> +	unsigned int i, prev_nr;
>  
>  	for (i = 0; i < r->index->cache_nr; i++) {
>  		const struct cache_entry *ce = r->index->cache[i];
>  		if (!ce_stage(ce))
>  			continue;
>  
> +		prev_nr = r->index->cache_nr;
>  		ret = merge_entry(r, quiet || oneshot, i, ce->name, &err, fn, data);
> -		if (ret > 0)
> -			i += ret - 1;
> -		else if (ret == -1)
> +		if (ret > 0) {
> +			/* Don't bother handling an index that has
> +			   grown, since merge_one_file_func() can't grow
> +			   it, and merge_one_file_spawn() can't change
> +			   it. */

multi-line comment style is as follows:

	/*
	 * Don't bother handling an index that has
	 * grown, since merge_one_file_func() can't grow
	 * it, and merge_one_file_spawn() can't change it.
	 */

Thanks,
-Stolee


* Re: [PATCH v6 08/13] merge-recursive: move better_branch_name() to merge.c
  2020-11-24 11:53           ` [PATCH v6 08/13] merge-recursive: move better_branch_name() to merge.c Alban Gruin
@ 2021-01-05 16:19             ` Derrick Stolee
  0 siblings, 0 replies; 221+ messages in thread
From: Derrick Stolee @ 2021-01-05 16:19 UTC (permalink / raw)
  To: Alban Gruin, git; +Cc: Junio C Hamano, Phillip Wood

On 11/24/2020 6:53 AM, Alban Gruin wrote:
> better_branch_name() will be used by merge-octopus once it is rewritten
> in C, so instead of duplicating it, this moves this function
> preventively inside an appropriate file in libgit.a.  This function is
> also renamed to reflect its usage by merge strategies.

s/preventively/preemptively/

> diff --git a/cache.h b/cache.h
> index be16ab3215..2d844576ea 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -1933,7 +1933,7 @@ int checkout_fast_forward(struct repository *r,
>  			  const struct object_id *from,
>  			  const struct object_id *to,
>  			  int overwrite_ignore);
> -
> +char *merge_get_better_branch_name(const char *branch);
>  
>  int sane_execvp(const char *file, char *const argv[]);

I tend to avoid adding new things to the enormous cache.h, but the
best place I could see was refs.h next to repo_default_branch_name().

Maybe cache.h is fine.

-Stolee


* Re: [PATCH v6 09/13] merge-octopus: rewrite in C
  2020-11-24 11:53           ` [PATCH v6 09/13] merge-octopus: rewrite in C Alban Gruin
@ 2021-01-05 16:40             ` Derrick Stolee
  0 siblings, 0 replies; 221+ messages in thread
From: Derrick Stolee @ 2021-01-05 16:40 UTC (permalink / raw)
  To: Alban Gruin, git; +Cc: Junio C Hamano, Phillip Wood

On 11/24/2020 6:53 AM, Alban Gruin wrote:
> This rewrites `git merge-octopus' from shell to C.  As for the two last
> conversions, this port removes calls to external processes to avoid
> reading and writing the index over and over again.

...

> diff --git a/builtin/merge-octopus.c b/builtin/merge-octopus.c
> new file mode 100644
> index 0000000000..ca8f9f345d
> --- /dev/null
> +++ b/builtin/merge-octopus.c
> @@ -0,0 +1,69 @@
> +/*
> + * Builtin "git merge-octopus"
> + *
> + * Copyright (c) 2020 Alban Gruin
> + *
> + * Based on git-merge-octopus.sh, written by Junio C Hamano.
> + *
> + * Resolve two or more trees.
> + */
> +
> +#define USE_THE_INDEX_COMPATIBILITY_MACROS

Hm. It would be best if this was not added to any new code. Please
squash in changes that avoid these macros.

> +#include "cache.h"
> +#include "builtin.h"
> +#include "commit.h"
> +#include "merge-strategies.h"
> +
> +static const char builtin_merge_octopus_usage[] =
> +	"git merge-octopus [<bases>...] -- <head> <remote1> <remote2> [<remotes>...]";
> +
> +int cmd_merge_octopus(int argc, const char **argv, const char *prefix)
> +{
> +	int i, sep_seen = 0;

One strategy I've been trying to do, when I remember, is to start each
builtin with

	struct repository *repo = the_repository;

then use 'repo' over 'the_repository' and other macros. This gets us
closer to the state where the builtin cmd_*() methods could take a
repository pointer as a parameter. (We are a ways off, but every little
bit helps, right?)

> +	/*
> +	 * Reject if this is not an octopus -- resolve should be used
> +	 * instead.
> +	 */
> +	if (commit_list_count(remotes) < 2)
> +		return 2;

This caused me to pause, since it might be nice to have a warning message
here. However, it is identical behavior to the script, including the
comment.

It appears that there is no Documentation/git-merge-octopus.txt, but such
a doc file would want to include the meaning of exit code 2.

> diff --git a/merge-strategies.c b/merge-strategies.c
> index 9aa07e91b5..4d9dd55296 100644
> --- a/merge-strategies.c
> +++ b/merge-strategies.c
> @@ -1,5 +1,6 @@
>  #include "cache.h"
>  #include "cache-tree.h"
> +#include "commit-reach.h"

You had my curiosity, but now you have my attention. ;)

> +int merge_strategies_octopus(struct repository *r,
> +			     struct commit_list *bases, const char *head_arg,
> +			     struct commit_list *remotes)
> +{
> +	int ff_merge = 1, ret = 0, references = 1;
> +	struct commit **reference_commit, *head_commit;

'reference_commit' might be clearer if it was plural, right?

> +	struct tree *reference_tree, *head_tree;
> +	struct commit_list *i;
> +	struct object_id head;
> +	struct strbuf sb = STRBUF_INIT;
> +
> +	get_oid(head_arg, &head);
> +	head_commit = lookup_commit_reference(r, &head);
> +	head_tree = repo_get_commit_tree(r, head_commit);
> +
> +	if (parse_tree(head_tree))
> +		return 2;
> +
> +	if (repo_index_has_changes(r, head_tree, &sb)) {
> +		error(_("Your local changes to the following files "
> +			"would be overwritten by merge:\n  %s"),
> +		      sb.buf);
> +		strbuf_release(&sb);
> +		return 2;
> +	}
> +
> +	reference_commit = xcalloc(commit_list_count(remotes) + 1,
> +				   sizeof(struct commit *));
> +	reference_commit[0] = head_commit;
> +	reference_tree = head_tree;
> +
> +	for (i = remotes; i && i->item; i = i->next) {
> +		struct commit *c = i->item;
> +		struct object_id *oid = &c->object.oid;
> +		struct tree *current_tree = repo_get_commit_tree(r, c);
> +		struct commit_list *common, *j;
> +		char *branch_name;
> +		int k = 0, up_to_date = 0;
> +
> +		if (ret) {
> +			/*
> +			 * We allow only last one to have a
> +			 * hand-resolvable conflicts.  Last round failed
> +			 * and we still had a head to merge.
> +			 */
> +			puts(_("Automated merge did not work."));
> +			puts(_("Should not be doing an octopus."));
> +
> +			free(reference_commit);
> +			return 2;
> +		}
> +
> +		branch_name = merge_get_better_branch_name(oid_to_hex(oid));
> +		common = get_merge_bases_many(c, references, reference_commit);

Here we are. You should probably use repo_get_merge_bases_many().

'references' is not a list, but instead a count. Could
it be renamed nr_references or something?

> +
> +		if (!common) {
> +			error(_("Unable to find common commit with %s"), branch_name);
> +
> +			free(branch_name);
> +			free_commit_list(common);
> +			free(reference_commit);
> +
> +			return 2;

Hm. We are getting into magic constant territory. Perhaps this should
be marked with a macro in merge-strategies.h? It could be used in the
case of "only two heads" as well.
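
A minimal sketch of what such a macro might look like, covering both
early exits quoted above. The name MERGE_STRATEGY_REJECTED and the two
helper functions are hypothetical (the thread does not settle on any),
and the reading of 2 as "this strategy cannot handle the merge" is an
assumption about how `git merge' interprets a strategy's exit code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical name: the strategy refuses to handle this merge. */
#define MERGE_STRATEGY_REJECTED 2

/* An octopus needs at least two remotes; otherwise resolve should be used. */
int octopus_precheck(size_t nr_remotes)
{
	if (nr_remotes < 2)
		return MERGE_STRATEGY_REJECTED;
	return 0;
}

/* Same constant for the "no common commit" case. */
int octopus_check_bases(const char *branch_name, size_t nr_bases)
{
	assert(branch_name != NULL);

	if (!nr_bases) {
		fprintf(stderr, "error: Unable to find common commit with %s\n",
			branch_name);
		return MERGE_STRATEGY_REJECTED;
	}
	return 0;
}
```

With a named constant, every `return 2' in the quoted function would
read the same way, and a future doc file would have one place to
point at.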

> +		}
> +
> +		for (j = common; j && !(up_to_date || !ff_merge); j = j->next) {
> +			up_to_date |= oideq(&j->item->object.oid, oid);
> +
> +			if (k < references)
> +				ff_merge &= oideq(&j->item->object.oid, &reference_commit[k++]->object.oid);

I'm confused about this line. Shouldn't we care only about
reference_commit[references]? If we _do_ care about all possible
reference_commit[k] values, then shouldn't this be a loop over the
k values, not a single check per k (and advancing as we iterate
through the results from common)?

Seems we could use some test cases around criss-cross octopus
merges (i.e. multiple merge bases).

> +		}
> +
> +		if (up_to_date) {
> +			printf(_("Already up to date with %s\n"), branch_name);
> +
> +			free(branch_name);
> +			free_commit_list(common);
> +			continue;
> +		}
> +
> +		if (ff_merge) {
> +			ret = octopus_fast_forward(r, branch_name, head_tree,
> +						   current_tree, &reference_tree);
> +			references = 0;
> +		} else {
> +			ret = octopus_do_merge(r, branch_name, common,
> +					       current_tree, &reference_tree);
> +		}
> +
> +		free(branch_name);
> +		free_commit_list(common);
> +
> +		if (ret == -1)
> +			break;
> +
> +		reference_commit[references++] = c;
> +	}
> +
> +	free(reference_commit);
> +	return ret;
> +}

This patch could use a little work, but it's a good start.

Thanks,
-Stolee


* Re: [PATCH v6 10/13] merge: use the "resolve" strategy without forking
  2020-11-24 11:53           ` [PATCH v6 10/13] merge: use the "resolve" strategy without forking Alban Gruin
@ 2021-01-05 16:45             ` Derrick Stolee
  0 siblings, 0 replies; 221+ messages in thread
From: Derrick Stolee @ 2021-01-05 16:45 UTC (permalink / raw)
  To: Alban Gruin, git; +Cc: Junio C Hamano, Phillip Wood

On 11/24/2020 6:53 AM, Alban Gruin wrote:
> This teaches `git merge' to invoke the "resolve" strategy with a
> function call instead of forking.
...
> @@ -740,6 +741,9 @@ static int try_merge_strategy(const char *strategy, struct commit_list *common,
>  				       COMMIT_LOCK | SKIP_IF_UNCHANGED))
>  			die(_("unable to write %s"), get_index_file());
>  		return clean ? 0 : 1;
> +	} else if (!strcmp(strategy, "resolve")) {
> +		return merge_strategies_resolve(the_repository, common,
> +						head_arg, remoteheads);
>  	} else {
>  		return try_merge_command(the_repository,
>  					 strategy, xopts_nr, xopts,
> 

This is a very satisfying change.

Thanks,
-Stolee

* Re: [PATCH v6 00/13] Rewrite the remaining merge strategies from shell to C
  2020-11-24 11:53         ` [PATCH v6 00/13] Rewrite the remaining merge strategies from shell to C Alban Gruin
                             ` (13 preceding siblings ...)
  2020-11-24 19:34           ` [PATCH v6 00/13] Rewrite the remaining merge strategies from shell to C SZEDER Gábor
@ 2021-01-05 16:50           ` Derrick Stolee
  2021-03-17 20:49           ` [PATCH v7 00/15] " Alban Gruin
  15 siblings, 0 replies; 221+ messages in thread
From: Derrick Stolee @ 2021-01-05 16:50 UTC (permalink / raw)
  To: Alban Gruin, git; +Cc: Junio C Hamano, Phillip Wood, SZEDER Gábor

On 11/24/2020 6:53 AM, Alban Gruin wrote:
> In an effort to reduce the number of shell scripts in git's codebase, I
> propose this patch series converting the two remaining merge strategies,
> resolve and octopus, from shell to C.  This will enable slightly better
> performance, better integration with git itself (no more forking to
> perform these operations), better portability (Windows and shell scripts
> don't mix well).
> 
> Three scripts are actually converted: first git-merge-one-file.sh, then
> git-merge-resolve.sh, and finally git-merge-octopus.sh.  Not only are
> they converted, but they are also modified to operate without forking,
> and then libified so they can be used by git without spawning another
> process.

This is a worthwhile effort. Of course, I wasn't familiar with this
area and only took interest when I started working in a conflicting
area.

I did my best in reviewing the content here. I did not comment further
on the patches where Junio already gave extensive review.

> This series keeps the commands `git merge-one-file', `git
> merge-resolve', and `git merge-octopus', so any script depending on them
> should keep working without any changes.

I pointed out some questions about the "dashed versus non-dashed"
forms.

> This series is based on 306ee63a70 (Eighteenth batch, 2020-09-29).  The
> tip is tagged as "rewrite-merge-strategies-v6" at
> https://github.com/agrn/git.

Please also base onto 722fc37491 (help: do not expect built-in
commands to be hardlinked, 2020-10-07) as requested by Szeder.

Thanks,
-Stolee

* Re: [PATCH v6 06/13] merge-index: don't fork if the requested program is `git-merge-one-file'
  2021-01-05 16:11             ` Derrick Stolee
@ 2021-01-05 17:35               ` Martin Ågren
  2021-01-05 23:20                 ` Alban Gruin
  2021-01-05 23:20               ` Alban Gruin
  1 sibling, 1 reply; 221+ messages in thread
From: Martin Ågren @ 2021-01-05 17:35 UTC (permalink / raw)
  To: Derrick Stolee
  Cc: Alban Gruin, Git Mailing List, Junio C Hamano, Phillip Wood

On Tue, 5 Jan 2021 at 17:13, Derrick Stolee <stolee@gmail.com> wrote:
>
> On 11/24/2020 6:53 AM, Alban Gruin wrote:
> > +     if (merge_action == merge_one_file_func) {
>
> nit: This made me think it would be better to check the 'lock'
> itself to see if it was initialized or not. Perhaps
>
>         if (lock.tempfile) {
>
> would be the appropriate way to check this?

> nit: this could be simplified. In total, I recommend:
>
>         if (lock.tempfile) {
>                 if (err)
>                         rollback_lock_file(&lock);
>                 else
>                         return write_locked_index(&the_index, &lock, COMMIT_LOCK);
>         }
>         return err;

FWIW, I also find that way of writing it easier to grok. Although,
rather than peeking at `lock.tempfile`, I suggest using
`is_lock_file_locked(&lock)`.
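
Combined, the two suggestions might look like the sketch below. All
types and functions here are stand-ins so that the control flow
compiles on its own: in git, `struct lock_file`,
`is_lock_file_locked()`, `rollback_lock_file()` and
`write_locked_index()` are the real counterparts and are defined
differently.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for git's struct lock_file; only tracks whether the lock is held. */
struct lock_file_sketch {
	int held;
};

/* Stand-in for is_lock_file_locked(). */
int sketch_is_locked(const struct lock_file_sketch *lk)
{
	return lk->held;
}

/* Stand-in for rollback_lock_file(): release the lock, discard changes. */
void sketch_rollback(struct lock_file_sketch *lk)
{
	lk->held = 0;
}

/* Stand-in for write_locked_index(..., COMMIT_LOCK): pretend the write worked. */
int sketch_commit(struct lock_file_sketch *lk)
{
	lk->held = 0;
	return 0;
}

/* Tail of the command: only touch the lock if it was actually taken. */
int finish_merge_index(struct lock_file_sketch *lk, int err)
{
	assert(lk != NULL);

	if (sketch_is_locked(lk)) {
		if (err)
			sketch_rollback(lk);
		else
			return sketch_commit(lk);
	}
	return err;
}
```

The check depends only on the state of the lock, not on which callback
was chosen, so it keeps working if another code path takes the lock
later.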

Martin

* Re: [PATCH v6 05/13] merge-index: libify merge_one_path() and merge_all()
  2021-01-05 15:59             ` Derrick Stolee
@ 2021-01-05 23:20               ` Alban Gruin
  0 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2021-01-05 23:20 UTC (permalink / raw)
  To: Derrick Stolee, git; +Cc: Junio C Hamano, Phillip Wood

Hi Derrick,

On 05/01/2021 at 16:59, Derrick Stolee wrote:
> On 11/24/2020 6:53 AM, Alban Gruin wrote:
>> The "resolve" and "octopus" merge strategies do not call directly `git
>> merge-one-file', they delegate the work to another git command, `git
>> merge-index', that will loop over files in the index and call the
>> specified command.  Unfortunately, these functions are not part of
>> libgit.a, which means that once rewritten, the strategies would still
>> have to invoke `merge-one-file' by spawning a new process first.
> 
> This is a good thing to do.
>  
>> To avoid this, this moves and renames merge_one_path(), merge_all(), and
>> their helpers to merge-strategies.c.  They also take a callback to
>> dictate what they should do for each file.  For now, to preserve the
>> behaviour of `merge-index', only one callback, launching a new process,
>> is defined.
> 
> I don't think the callback should be in libgit.a, though. The callback
> itself should be a static method inside builtin/merge-index.c.
> 

Right.  Modern code should not use this callback -- or the merge-index
builtin once this gets merged.

>> Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
>> ---
>>  builtin/merge-index.c |  77 +++----------------------------
>>  merge-strategies.c    | 104 ++++++++++++++++++++++++++++++++++++++++++
>>  merge-strategies.h    |  19 ++++++++
>>  3 files changed, 130 insertions(+), 70 deletions(-)
>>
>> diff --git a/builtin/merge-index.c b/builtin/merge-index.c
>> index 38ea6ad6ca..d5e5713b25 100644
>> --- a/builtin/merge-index.c
>> +++ b/builtin/merge-index.c
>> @@ -1,74 +1,11 @@
>>  #define USE_THE_INDEX_COMPATIBILITY_MACROS
>>  #include "builtin.h"
>> -#include "run-command.h"
>> -
>> -static const char *pgm;
>> -static int one_shot, quiet;
>> -static int err;
>> -
>> -static int merge_entry(int pos, const char *path)
>> -{
>> -	int found;
>> -	const char *arguments[] = { pgm, "", "", "", path, "", "", "", NULL };
>> -	char hexbuf[4][GIT_MAX_HEXSZ + 1];
>> -	char ownbuf[4][60];
>> -
>> -	if (pos >= active_nr)
>> -		die("git merge-index: %s not in the cache", path);
>> -	found = 0;
>> -	do {
>> -		const struct cache_entry *ce = active_cache[pos];
>> -		int stage = ce_stage(ce);
>> -
>> -		if (strcmp(ce->name, path))
>> -			break;
>> -		found++;
>> -		oid_to_hex_r(hexbuf[stage], &ce->oid);
>> -		xsnprintf(ownbuf[stage], sizeof(ownbuf[stage]), "%o", ce->ce_mode);
>> -		arguments[stage] = hexbuf[stage];
>> -		arguments[stage + 4] = ownbuf[stage];
>> -	} while (++pos < active_nr);
>> -	if (!found)
>> -		die("git merge-index: %s not in the cache", path);
>> -
>> -	if (run_command_v_opt(arguments, 0)) {
>> -		if (one_shot)
>> -			err++;
>> -		else {
>> -			if (!quiet)
>> -				die("merge program failed");
>> -			exit(1);
>> -		}
>> -	}
>> -	return found;
>> -}
>> -
>> -static void merge_one_path(const char *path)
>> -{
>> -	int pos = cache_name_pos(path, strlen(path));
>> -
>> -	/*
>> -	 * If it already exists in the cache as stage0, it's
>> -	 * already merged and there is nothing to do.
>> -	 */
>> -	if (pos < 0)
>> -		merge_entry(-pos-1, path);
>> -}
>> -
>> -static void merge_all(void)
>> -{
>> -	int i;
>> -	for (i = 0; i < active_nr; i++) {
>> -		const struct cache_entry *ce = active_cache[i];
>> -		if (!ce_stage(ce))
>> -			continue;
>> -		i += merge_entry(i, ce->name)-1;
>> -	}
>> -}
>> +#include "merge-strategies.h"
>>  
>>  int cmd_merge_index(int argc, const char **argv, const char *prefix)
>>  {
>> -	int i, force_file = 0;
>> +	int i, force_file = 0, err = 0, one_shot = 0, quiet = 0;
>> +	const char *pgm;
>>  
>>  	/* Without this we cannot rely on waitpid() to tell
>>  	 * what happened to our children.
>> @@ -98,14 +35,14 @@ int cmd_merge_index(int argc, const char **argv, const char *prefix)
>>  				continue;
>>  			}
>>  			if (!strcmp(arg, "-a")) {
>> -				merge_all();
>> +				err |= merge_all_index(the_repository, one_shot, quiet,
>> +						       merge_one_file_spawn, (void *)pgm);
> 
> This hunk makes it look like pgm is uninitialized, but it is set earlier
> in cmd_merge_index() (previously referring to the global instance). Good.
> 
>> +int merge_one_file_spawn(struct repository *r,
>> +			 const struct object_id *orig_blob,
>> +			 const struct object_id *our_blob,
>> +			 const struct object_id *their_blob, const char *path,
>> +			 unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
>> +			 void *data)
>> +{
>> +	char oids[3][GIT_MAX_HEXSZ + 1] = {{0}};
>> +	char modes[3][10] = {{0}};
>> +	const char *arguments[] = { (char *)data, oids[0], oids[1], oids[2],
>> +				    path, modes[0], modes[1], modes[2], NULL };
>> +
>> +	if (orig_blob) {
>> +		oid_to_hex_r(oids[0], orig_blob);
>> +		xsnprintf(modes[0], sizeof(modes[0]), "%06o", orig_mode);
>> +	}
>> +
>> +	if (our_blob) {
>> +		oid_to_hex_r(oids[1], our_blob);
>> +		xsnprintf(modes[1], sizeof(modes[1]), "%06o", our_mode);
>> +	}
>> +
>> +	if (their_blob) {
>> +		oid_to_hex_r(oids[2], their_blob);
>> +		xsnprintf(modes[2], sizeof(modes[2]), "%06o", their_mode);
>> +	}
>> +
>> +	return run_command_v_opt(arguments, 0);
>> +}
> 
> Yes, this would be better in the builtin code. Better to keep the meaning
> of 'data' clear in the context of that file.
> 
>> +static int merge_entry(struct repository *r, int quiet, unsigned int pos,
>> +		       const char *path, int *err, merge_fn fn, void *data)
>> +{
>> +	int found = 0;
>> +	const struct object_id *oids[3] = {NULL};
>> +	unsigned int modes[3] = {0};
>> +
>> +	do {
>> +		const struct cache_entry *ce = r->index->cache[pos];
>> +		int stage = ce_stage(ce);
>> +
>> +		if (strcmp(ce->name, path))
>> +			break;
>> +		found++;
>> +		oids[stage - 1] = &ce->oid;
>> +		modes[stage - 1] = ce->ce_mode;
>> +	} while (++pos < r->index->cache_nr);
>> +	if (!found)
>> +		return error(_("%s is not in the cache"), path);
>> +
>> +	if (fn(r, oids[0], oids[1], oids[2], path,
>> +	       modes[0], modes[1], modes[2], data)) {
>> +		if (!quiet)
>> +			error(_("Merge program failed"));
>> +		(*err)++;
>> +	}
>> +
>> +	return found;
>> +}
>> +
>> +int merge_index_path(struct repository *r, int oneshot, int quiet,
>> +		     const char *path, merge_fn fn, void *data)
>> +{
>> +	int pos = index_name_pos(r->index, path, strlen(path)), ret, err = 0;
>> +
>> +	/*
>> +	 * If it already exists in the cache as stage0, it's
>> +	 * already merged and there is nothing to do.
>> +	 */
>> +	if (pos < 0) {
>> +		ret = merge_entry(r, quiet || oneshot, -pos - 1, path, &err, fn, data);
>> +		if (ret == -1)
>> +			return -1;
>> +		else if (err)
>> +			return 1;
>> +	}
>> +	return 0;
>> +}
>> +
>> +int merge_all_index(struct repository *r, int oneshot, int quiet,
>> +		    merge_fn fn, void *data)
>> +{
>> +	int err = 0, ret;
>> +	unsigned int i;
>> +
>> +	for (i = 0; i < r->index->cache_nr; i++) {
>> +		const struct cache_entry *ce = r->index->cache[i];
>> +		if (!ce_stage(ce))
>> +			continue;
>> +
>> +		ret = merge_entry(r, quiet || oneshot, i, ce->name, &err, fn, data);
>> +		if (ret > 0)
>> +			i += ret - 1;
>> +		else if (ret == -1)
>> +			return -1;
>> +
>> +		if (err && !oneshot)
>> +			return 1;
>> +	}
>> +
>> +	return err;
>> +}
> 
> I notice that these methods don't actually use the repository pointer
> more than they just use 'r->index'. Should they instead take a
> 'struct index_state *istate' directly? (I see that the repository is
> used later by merge_strategies_resolve(), but not in these.)
> 
> If you think it likely that we will need a repository for these methods,
> then feel free to ignore me and keep your 'r' pointer.
> 

Ouch, you're right.  I thought this was necessary because
merge_three_way() wanted a `struct repository *', without noticing that
it was in fact unnecessary, even in my follow-up patch.  I'll change that.

> Thanks,
> -Stolee
> 

Cheers,
Alban


* Re: [PATCH v6 06/13] merge-index: don't fork if the requested program is `git-merge-one-file'
  2021-01-05 16:11             ` Derrick Stolee
  2021-01-05 17:35               ` Martin Ågren
@ 2021-01-05 23:20               ` Alban Gruin
  2021-01-06  2:04                 ` Junio C Hamano
  1 sibling, 1 reply; 221+ messages in thread
From: Alban Gruin @ 2021-01-05 23:20 UTC (permalink / raw)
  To: Derrick Stolee, git; +Cc: Junio C Hamano, Phillip Wood

On 05/01/2021 at 17:11, Derrick Stolee wrote:
> On 11/24/2020 6:53 AM, Alban Gruin wrote:
>> +
>>  	pgm = argv[i++];
>> +	setup_work_tree();
>> +
>> +	if (!strcmp(pgm, "git-merge-one-file")) {
> 
> This stood out to me as possibly fragile. What if we call the
> non-dashed form "git merge-one-file"? Shouldn't we be doing so?
> 
> Or, is this something that is handled higher in the builtin
> machinery to take the non-dashed version and change it to the
> dashed version for historical reasons?
> 

We had the same discussion with Phillip, who pointed out this previous
discussion about this topic:
https://lore.kernel.org/git/xmqqblv5kr9u.fsf@gitster-ct.c.googlers.com/

So, it's probably OK to do that.

>> +		merge_action = merge_one_file_func;
>> +		hold_locked_index(&lock, LOCK_DIE_ON_ERROR);
>> +	} else {
>> +		merge_action = merge_one_file_spawn;
>> +		data = (void *)pgm;
>> +	}
>> +
> 
> ...
> 
>> +	if (merge_action == merge_one_file_func) {
> 
> nit: This made me think it would be better to check the 'lock'
> itself to see if it was initialized or not. Perhaps
> 
> 	if (lock.tempfile) {
> 
> would be the appropriate way to check this?
> 
> For now, this is equivalent behavior, but it might be helpful if
> we add more cases that take the lock in the future.
> 
>> +		if (err) {
>> +			rollback_lock_file(&lock);
>> +			return err;
>> +		}
>> +
>> +		return write_locked_index(&the_index, &lock, COMMIT_LOCK);
>>  	}
>>  	return err;
> 
> nit: this could be simplified. In total, I recommend:
> 
> 	if (lock.tempfile) {
> 		if (err)
> 			rollback_lock_file(&lock);
> 		else
> 			return write_locked_index(&the_index, &lock, COMMIT_LOCK);
> 	}
> 	return err;
> 

Sure, looks better than mine.  :)

> 
>>  }
>> diff --git a/merge-strategies.c b/merge-strategies.c
>> index 6f27e66dfe..542cefcf3d 100644
>> --- a/merge-strategies.c
>> +++ b/merge-strategies.c
>> @@ -178,6 +178,18 @@ int merge_three_way(struct repository *r,
>>  	return 0;
>>  }
>>  
>> +int merge_one_file_func(struct repository *r,
>> +			const struct object_id *orig_blob,
>> +			const struct object_id *our_blob,
>> +			const struct object_id *their_blob, const char *path,
>> +			unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
>> +			void *data)
>> +{
>> +	return merge_three_way(r,
>> +			       orig_blob, our_blob, their_blob, path,
>> +			       orig_mode, our_mode, their_mode);
>> +}
>> +
> 
> Again, I don't recommend making this callback in the library. Instead, keep
> it in the builtin and then use merge_three_way() which is in the library.
> 

This is not possible with this callback, as it will be used later by
merge_strategies_resolve() and indirectly by merge_strategies_octopus().

>>  int merge_one_file_spawn(struct repository *r,
>>  			 const struct object_id *orig_blob,
>>  			 const struct object_id *our_blob,
>> @@ -261,17 +273,22 @@ int merge_all_index(struct repository *r, int oneshot, int quiet,
>>  		    merge_fn fn, void *data)
>>  {
>>  	int err = 0, ret;
>> -	unsigned int i;
>> +	unsigned int i, prev_nr;
>>  
>>  	for (i = 0; i < r->index->cache_nr; i++) {
>>  		const struct cache_entry *ce = r->index->cache[i];
>>  		if (!ce_stage(ce))
>>  			continue;
>>  
>> +		prev_nr = r->index->cache_nr;
>>  		ret = merge_entry(r, quiet || oneshot, i, ce->name, &err, fn, data);
>> -		if (ret > 0)
>> -			i += ret - 1;
>> -		else if (ret == -1)
>> +		if (ret > 0) {
>> +			/* Don't bother handling an index that has
>> +			   grown, since merge_one_file_func() can't grow
>> +			   it, and merge_one_file_spawn() can't change
>> +			   it. */
> 
> multi-line comment style is as follows:
> 
> 	/*
> 	 * Don't bother handling an index that has
> 	 * grown, since merge_one_file_func() can't grow
> 	 * it, and merge_one_file_spawn() can't change it.
> 	 */
> 
> Thanks,
> -Stolee
> 


* Re: [PATCH v6 06/13] merge-index: don't fork if the requested program is `git-merge-one-file'
  2021-01-05 17:35               ` Martin Ågren
@ 2021-01-05 23:20                 ` Alban Gruin
  0 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2021-01-05 23:20 UTC (permalink / raw)
  To: Martin Ågren, Derrick Stolee
  Cc: Git Mailing List, Junio C Hamano, Phillip Wood

Hi Martin & Derrick,

On 05/01/2021 at 18:35, Martin Ågren wrote:
> On Tue, 5 Jan 2021 at 17:13, Derrick Stolee <stolee@gmail.com> wrote:
>>
>> On 11/24/2020 6:53 AM, Alban Gruin wrote:
>>> +     if (merge_action == merge_one_file_func) {
>>
>> nit: This made me think it would be better to check the 'lock'
>> itself to see if it was initialized or not. Perhaps
>>
>>         if (lock.tempfile) {
>>
>> would be the appropriate way to check this?
> 
>> nit: this could be simplified. In total, I recommend:
>>
>>         if (lock.tempfile) {
>>                 if (err)
>>                         rollback_lock_file(&lock);
>>                 else
>>                         return write_locked_index(&the_index, &lock, COMMIT_LOCK);
>>         }
>>         return err;
> 
> FWIW, I also find that way of writing it easier to grok. Although,
> rather than peeking at `lock.tempfile`, I suggest using
> `is_lock_file_locked(&lock)`.
> 

OK, this looks good to me.

> Martin
> 

Alban


* Re: [PATCH v6 06/13] merge-index: don't fork if the requested program is `git-merge-one-file'
  2021-01-05 23:20               ` Alban Gruin
@ 2021-01-06  2:04                 ` Junio C Hamano
  2021-01-10 17:15                   ` Alban Gruin
  0 siblings, 1 reply; 221+ messages in thread
From: Junio C Hamano @ 2021-01-06  2:04 UTC (permalink / raw)
  To: Alban Gruin; +Cc: Derrick Stolee, git, Phillip Wood

Alban Gruin <alban.gruin@gmail.com> writes:

> We had the same discussion with Phillip, who pointed out this previous
> discussion about this topic:
> https://lore.kernel.org/git/xmqqblv5kr9u.fsf@gitster-ct.c.googlers.com/
>
> So, it's probably OK to do that.

These days, there exists an optional installation option that
won't even install built-in commands in $GIT_EXEC_PATH, which
invalidates the assessment made in 2019 in the article you cited
above, so the code might still be OK, but the old justification no
longer would apply.

In any case, if two people who reviewed a patch found the same thing
in it fishy, it is an indication that the reason why the apparently
fishy code is OK needs to be better explained so that future readers
of the code do not have to be puzzled about the same thing.

Thanks.


* Re: [PATCH v6 04/13] merge-one-file: rewrite in C
  2021-01-03 22:41               ` Alban Gruin
@ 2021-01-08  6:54                 ` Junio C Hamano
  0 siblings, 0 replies; 221+ messages in thread
From: Junio C Hamano @ 2021-01-08  6:54 UTC (permalink / raw)
  To: Alban Gruin; +Cc: git, Phillip Wood

Alban Gruin <alban.gruin@gmail.com> writes:

>> If we were to do that, then I do not mind the repetition of 0, 1, 1
>> too much.

Sorry, I think we will not need to see repetition of "0, 1, 1" if we
take the route I outlined above.

> Okay.  Are we sure we want add_merge_result_to_index() inside
> read-cache.c/cache.h?

I wouldn't be surprised if you can find a better place, so please
try to see if there is one.  If the only user ends up being the
merge-one-file itself and nobody else, then it may be a better
place.  Or perhaps merge-strategies.c turns out to be a better place
if other parts of the merge machinery can reuse it.

Thanks.

* Re: [PATCH v6 06/13] merge-index: don't fork if the requested program is `git-merge-one-file'
  2021-01-06  2:04                 ` Junio C Hamano
@ 2021-01-10 17:15                   ` Alban Gruin
  2021-01-10 20:51                     ` Junio C Hamano
  0 siblings, 1 reply; 221+ messages in thread
From: Alban Gruin @ 2021-01-10 17:15 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Derrick Stolee, git, Phillip Wood

Hi Junio,

On 06/01/2021 at 03:04, Junio C Hamano wrote:
> Alban Gruin <alban.gruin@gmail.com> writes:
> 
>> We had the same discussion with Phillip, who pointed out this previous
>> discussion about this topic:
>> https://lore.kernel.org/git/xmqqblv5kr9u.fsf@gitster-ct.c.googlers.com/
>>
>> So, it's probably OK to do that.
> 
> These days, there exists an optional installation option that
> won't even install built-in commands in $GIT_EXEC_PATH, which
> invalidates the assessment made in 2019 in the article you cited
> above, so the code might still be OK, but the old justification no
> longer would apply.
> 
> In any case, if two people who reviewed a patch found the same thing
> in it fishy, it is an indication that the reason why the apparently
> fishy code is OK needs to be better explained so that future readers
> of the code do not have to be puzzled about the same thing.
> 
> Thanks.
> 

Perhaps we could try to check if the provided command exists (with
locate_in_PATH()), if it does, run it through merge_one_file_spawn(),
else, use merge_one_file_func()?

Alban


* Re: [PATCH v6 06/13] merge-index: don't fork if the requested program is `git-merge-one-file'
  2021-01-10 17:15                   ` Alban Gruin
@ 2021-01-10 20:51                     ` Junio C Hamano
  2021-03-08 20:32                       ` Alban Gruin
  0 siblings, 1 reply; 221+ messages in thread
From: Junio C Hamano @ 2021-01-10 20:51 UTC (permalink / raw)
  To: Alban Gruin; +Cc: Derrick Stolee, git, Phillip Wood

Alban Gruin <alban.gruin@gmail.com> writes:

>> These days, there exists an optional installation option that
>> won't even install built-in commands in $GIT_EXEC_PATH, which
>> invalidates the assessment made in 2019 in the article you cited
>> above, so the code might still be OK, but the old justification no
>> longer would apply.
>> 
>> In any case, if two people who reviewed a patch found the same thing
>> in it fishy, it is an indication that the reason why the apparently
>> fishy code is OK needs to be better explained so that future readers
>> of the code do not have to be puzzled about the same thing.
>
> Perhaps we could try to check if the provided command exists (with
> locate_in_PATH()), if it does, run it through merge_one_file_spawn(),
> else, use merge_one_file_func()?

So you think your current implementation will be broken if the "no
dashed git binary on disk" installation option is used?

I do not think "first check if an on-disk command exists and use it,
otherwise check its name" alone would work well in practice.  Both
the 'cat' example that appears in the manual page, and the typical
invocation of git-merge-one-file from merge-resolve:

	git merge-index cat MM
	git merge-index git-merge-one-file -a

would work just as well as before, but that would not give you a way to
bypass fork() for the latter.  And changing the order of checks
would mean the users won't have a way to override a buggy builtin
implementation of merge_one_file function.  Besides, using the name
of the binary feels like a bad hack.  

As the invocation from merge-resolve is purely an internal matter,
it may make more sense to introduce a new option and explicitly tell
merge-index that the command line is not asking for an external
program to be spawned, e.g.

	git merge-index --use=merge-one-file -a

You'd prepare a table of internally implemented "take info on a
single path that is being merged and give an automated resolution"
functions, which begins with a single entry that maps the string
"merge-one-file" to your merge_one_file_func function.  Any value to
the "--use" option that names a function not in the table would
cause an error.

Note that in the above the "table of functions" is merely
conceptual.  It is perfectly OK to implement the single entry table
by codeflow (i.e. "if (!strcmp()) ... else error();").  But thinking
in terms of "a table of functions the user can choose from" helps to
form the right mental picture.
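
As a minimal sketch of the "table of functions" idea described above (not
the code from the series): the callback type is deliberately simplified,
and the names merge_fn_sketch, merge_one_file_sketch, internal_mergers and
lookup_internal_merger are hypothetical, chosen only for illustration.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified callback type; the real one takes more arguments. */
typedef int (*merge_fn_sketch)(const char *path);

/* Stand-in for an internally implemented single-path merge function. */
static int merge_one_file_sketch(const char *path)
{
	printf("merging %s in-process\n", path);
	return 0;
}

/* The table: each entry maps a "--use=<name>" value to a function. */
static const struct {
	const char *name;
	merge_fn_sketch fn;
} internal_mergers[] = {
	{ "merge-one-file", merge_one_file_sketch },
};

/* Return the function registered under `name`, or NULL if there is none. */
static merge_fn_sketch lookup_internal_merger(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(internal_mergers) / sizeof(internal_mergers[0]); i++)
		if (!strcmp(internal_mergers[i].name, name))
			return internal_mergers[i].fn;
	return NULL;
}
```

A caller would treat a NULL result as the "name not in the table" error
case, and would never fall back to spawning an external program for a
"--use" value.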

Hmm?


* Re: [PATCH v6 06/13] merge-index: don't fork if the requested program is `git-merge-one-file'
  2021-01-10 20:51                     ` Junio C Hamano
@ 2021-03-08 20:32                       ` Alban Gruin
  0 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2021-03-08 20:32 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Derrick Stolee, git, Phillip Wood

Hi Junio,

On 10/01/2021 at 21:51, Junio C Hamano wrote:
> Alban Gruin <alban.gruin@gmail.com> writes:
> 
>>> These days, there exists an optional installation option that
>>> won't even install built-in commands in $GIT_EXEC_PATH, which
>>> invalidates the assessment made in 2019 in the article you cited
>>> above, so the code might still be OK, but the old justification no
>>> longer would apply.
>>>
>>> In any case, if two people who reviewed a patch found the same thing
>>> in it fishy, it is an indication that the reason why the apparently
>>> fishy code is OK needs to be better explained so that future readers
>>> of the code do not have to be puzzled about the same thing.
>>
>> Perhaps we could try to check if the provided command exists (with
>> locate_in_PATH()), if it does, run it through merge_one_file_spawn(),
>> else, use merge_one_file_func()?
> 
> So you think your current implementation will be broken if the "no
> dashed git binary on disk" installation option is used?
> 
> I do not think "first check if an on-disk command exists and use it,
> otherwise check its name" alone would work well in practice.  Both
> the 'cat' example that appears in the manual page, and the typical
> invocation of git-merge-one-file from merge-resolve:
> 
> 	git merge-index cat MM
> 	git merge-index git-merge-one-file -a
> 
> would work just as well as before, but that would not give you a way to
> bypass fork() for the latter.  And changing the order of checks
> would mean the users won't have a way to override a buggy builtin
> implementation of merge_one_file function.  Besides, using the name
> of the binary feels like a bad hack.  
> 
> As the invocation from merge-resolve is purely an internal matter,
> it may make more sense to introduce a new option and explicitly tell
> merge-index that the command line is not asking for an external
> program to be spawned, e.g.
> 
> 	git merge-index --use=merge-one-file -a
> 
> You'd prepare a table of internally implemented "take info on a
> single path that is being merged and give an automated resolution"
> functions, which begins with a single entry that maps the string
> "merge-one-file" to your merge_one_file_func function.  Any value to
> the "--use" option that names a function not in the table would
> cause an error.
> 
> Note that in the above the "table of functions" is merely
> conceptual.  It is perfectly OK to implement the single entry table
> by codeflow (i.e. "if (!strcmp()) ... else error();").  But thinking
> in terms of "a table of functions the user can choose from" helps to
> form the right mental picture.
> 
> Hmm?
> 

Yes, this should work.

To achieve this, I think I have to reorder this series a bit.
Currently, this is what it looks like:

 1. convert git-merge-one-file;
 2. libify git-merge-index, add the ability to call merge-one-file directly;
 3. convert the resolve strategy;
 4. convert the octopus strategy.

After the reorder, the series would look like this:

 1. libify git-merge-index, add `--use=merge-one-file', change
git-merge-resolve.sh, -octopus.sh, and t6060 to use this new parameter;
 2. convert git-merge-one-file, add the ability for merge-index to call
it directly;
 3. convert the resolve strategy;
 4. convert the octopus strategy.

Alban



* [PATCH v7 00/15] Rewrite the remaining merge strategies from shell to C
  2020-11-24 11:53         ` [PATCH v6 00/13] Rewrite the remaining merge strategies from shell to C Alban Gruin
                             ` (14 preceding siblings ...)
  2021-01-05 16:50           ` Derrick Stolee
@ 2021-03-17 20:49           ` Alban Gruin
  2021-03-17 20:49             ` [PATCH v7 01/15] t6407: modernise tests Alban Gruin
                               ` (15 more replies)
  15 siblings, 16 replies; 221+ messages in thread
From: Alban Gruin @ 2021-03-17 20:49 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Phillip Wood, Derrick Stolee, Alban Gruin

In an effort to reduce the number of shell scripts in git's codebase, I
propose this patch series converting the two remaining merge strategies,
resolve and octopus, from shell to C.  This will enable slightly better
performance, better integration with git itself (no more forking to
perform these operations), and better portability (Windows and shell
scripts don't mix well).

Three scripts are actually converted: first git-merge-one-file.sh, then
git-merge-resolve.sh, and finally git-merge-octopus.sh.  Not only are
they converted, but they are also modified to operate without forking,
and then libified so they can be used by git without spawning another
process.

The first patch is not required for the rest of the series to work; I
wrote it while working on the series.

This series keeps the commands `git merge-one-file', `git
merge-resolve', and `git merge-octopus', so any script depending on them
should keep working without changes.

This series is based on a5828ae6b5 (Git 2.31, 2021-03-15).  The tip is
tagged as "rewrite-merge-strategies-v7" at https://github.com/agrn/git.

Changes since v6:

 - The series has been rebased on Git 2.31.

 - The series has been reordered.  Now, all the work around merge-index
   happens first to handle the case when git has been compiled with
   SKIP_DASHED_BUILT_INS enabled.

 - Remove usage of the index in the new builtins and in merge-index.

 - Adapt t6407 to use the "main" branch instead of "master".

 - Move merge_one_file_spawn() from merge-strategies.c to
   builtin/merge-index.c.

 - The functions extracted from merge-index and merge_three_way() now
   take a `struct index_state *' instead of a `struct repository *'.

 - Introduce ADD_TO_INDEX_CACHEINFO_{INVALID_PATH,UNABLE_TO_ADD}.

 - Remove checkout_from_index(), and replace it by
   add_merge_result_to_index(), a new function that calls
   add_to_index_cacheinfo() and checkout_entry() at the same time.

 - Fix a case where a file deleted in both branches would result in a
   failure in merge_three_way().  A test case has been added in t6060 to
   check that the new version is correct.

 - Rename some variables in merge_strategies_octopus(), and change its
   flow to make it more understandable.

 - Use CALLOC_ARRAY() in merge_strategies_octopus() instead of
   xcalloc().

 - Change merge-resolve and merge-octopus to handle the case where they
   are given an empty tree instead of a commit.
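
The ADD_TO_INDEX_CACHEINFO_* item above can be pictured with a small
standalone sketch: the library-style function only returns a named code,
and the caller decides how to report it.  The two macro values are the
ones introduced by the series; add_cacheinfo_sketch() and
describe_add_result() are hypothetical stand-ins that do not touch a real
index, and the path check is a deliberately crude placeholder.

```c
#include <string.h>

#define ADD_TO_INDEX_CACHEINFO_INVALID_PATH (-1)
#define ADD_TO_INDEX_CACHEINFO_UNABLE_TO_ADD (-2)

/*
 * Hypothetical stand-in for add_to_index_cacheinfo(): it prints nothing
 * and only reports what went wrong through its return value.
 */
static int add_cacheinfo_sketch(const char *path, int allow_add)
{
	/* Placeholder check, not git's verify_path(). */
	if (!path || !*path || strstr(path, ".."))
		return ADD_TO_INDEX_CACHEINFO_INVALID_PATH;
	if (!allow_add)
		return ADD_TO_INDEX_CACHEINFO_UNABLE_TO_ADD;
	return 0;
}

/* Caller-side mapping from a return code to a message. */
static const char *describe_add_result(int res)
{
	switch (res) {
	case 0:
		return "ok";
	case ADD_TO_INDEX_CACHEINFO_INVALID_PATH:
		return "invalid path";
	case ADD_TO_INDEX_CACHEINFO_UNABLE_TO_ADD:
		return "cannot add to the index";
	default:
		return "unknown error";
	}
}
```

Keeping the message out of the shared function lets each caller pick
wording that fits its own context.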

Alban Gruin (15):
  t6407: modernise tests
  t6060: modify multiple files to expose a possible issue with
    merge-index
  t6060: add tests for removed files
  merge-index: libify merge_one_path() and merge_all()
  merge-index: drop the index
  merge-index: add a new way to invoke `git-merge-one-file'
  update-index: move add_cacheinfo() to read-cache.c
  merge-one-file: rewrite in C
  merge-resolve: rewrite in C
  merge-recursive: move better_branch_name() to merge.c
  merge-octopus: rewrite in C
  merge: use the "resolve" strategy without forking
  merge: use the "octopus" strategy without forking
  sequencer: use the "resolve" strategy without forking
  sequencer: use the "octopus" merge strategy without forking

 Documentation/git-merge-index.txt |   7 +-
 Makefile                          |   7 +-
 builtin.h                         |   3 +
 builtin/merge-index.c             | 119 ++++---
 builtin/merge-octopus.c           |  70 ++++
 builtin/merge-one-file.c          |  94 ++++++
 builtin/merge-recursive.c         |  16 +-
 builtin/merge-resolve.c           |  74 ++++
 builtin/merge.c                   |   7 +
 builtin/update-index.c            |  25 +-
 cache.h                           |  10 +-
 git-merge-octopus.sh              | 112 ------
 git-merge-one-file.sh             | 167 ---------
 git-merge-resolve.sh              |  54 ---
 git.c                             |   3 +
 merge-strategies.c                | 544 ++++++++++++++++++++++++++++++
 merge-strategies.h                |  39 +++
 merge.c                           |  12 +
 read-cache.c                      |  35 ++
 sequencer.c                       |  17 +-
 t/t6060-merge-index.sh            |  23 +-
 t/t6407-merge-binary.sh           |  27 +-
 t/t6415-merge-dir-to-symlink.sh   |   2 +-
 23 files changed, 1001 insertions(+), 466 deletions(-)
 create mode 100644 builtin/merge-octopus.c
 create mode 100644 builtin/merge-one-file.c
 create mode 100644 builtin/merge-resolve.c
 delete mode 100755 git-merge-octopus.sh
 delete mode 100755 git-merge-one-file.sh
 delete mode 100755 git-merge-resolve.sh
 create mode 100644 merge-strategies.c
 create mode 100644 merge-strategies.h

Range-diff against v6:
 1:  70d6507330 !  1:  dfe230bfce t6407: modernise tests
    @@ Commit message
         Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
     
      ## t/t6407-merge-binary.sh ##
    -@@ t/t6407-merge-binary.sh: test_description='ask merge-recursive to merge binary files'
    +@@ t/t6407-merge-binary.sh: export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
      . ./test-lib.sh
      
      test_expect_success setup '
    @@ t/t6407-merge-binary.sh: test_expect_success setup '
      	rm -f a* m* &&
      	git reset --hard anchor &&
     -
    --	if git merge -s resolve master
    +-	if git merge -s resolve main
     -	then
     -		echo Oops, should not have succeeded
     -		false
    @@ t/t6407-merge-binary.sh: test_expect_success setup '
     -		git ls-files -s >current
     -		test_cmp expect current
     -	fi
    -+	test_must_fail git merge -s resolve master &&
    ++	test_must_fail git merge -s resolve main &&
     +	git ls-files -s >current &&
     +	test_cmp expect current
      '
    @@ t/t6407-merge-binary.sh: test_expect_success setup '
      	rm -f a* m* &&
      	git reset --hard anchor &&
     -
    --	if git merge -s recursive master
    +-	if git merge -s recursive main
     -	then
     -		echo Oops, should not have succeeded
     -		false
    @@ t/t6407-merge-binary.sh: test_expect_success setup '
     -		git ls-files -s >current
     -		test_cmp expect current
     -	fi
    -+	test_must_fail git merge -s recursive master &&
    ++	test_must_fail git merge -s recursive main &&
     +	git ls-files -s >current &&
     +	test_cmp expect current
      '
 2:  25e9c47e41 =  2:  575e24685d t6060: modify multiple files to expose a possible issue with merge-index
 -:  ---------- >  3:  4f366ff363 t6060: add tests for removed files
 -:  ---------- >  4:  6af79a6b2d merge-index: libify merge_one_path() and merge_all()
 -:  ---------- >  5:  909ed66114 merge-index: drop the index
 -:  ---------- >  6:  1a8aba05bd merge-index: add a new way to invoke `git-merge-one-file'
 3:  e7ea43c5ff !  7:  1f6635512c update-index: move add_cacheinfo() to read-cache.c
    @@ builtin/update-index.c: static int process_path(const char *path, struct stat *s
     -	if (add_cache_entry(ce, option))
     +	res = add_to_index_cacheinfo(&the_index, mode, oid, path, stage,
     +				     allow_add, allow_replace, NULL);
    -+	if (res == -1)
    -+		return res;
    -+	if (res == -2)
    ++	if (res == ADD_TO_INDEX_CACHEINFO_INVALID_PATH)
    ++		return error(_("Invalid path '%s'"), path);
    ++	if (res == ADD_TO_INDEX_CACHEINFO_UNABLE_TO_ADD)
      		return error("%s: cannot add to the index - missing --add option?",
      			     path);
     +
    @@ cache.h: int remove_file_from_index(struct index_state *, const char *path);
      int add_to_index(struct index_state *, const char *path, struct stat *, int flags);
      int add_file_to_index(struct index_state *, const char *path, int flags);
      
    ++#define ADD_TO_INDEX_CACHEINFO_INVALID_PATH (-1)
    ++#define ADD_TO_INDEX_CACHEINFO_UNABLE_TO_ADD (-2)
    ++
     +int add_to_index_cacheinfo(struct index_state *, unsigned int mode,
     +			   const struct object_id *oid, const char *path,
     +			   int stage, int allow_add, int allow_replace,
    -+			   struct cache_entry **pce);
    ++			   struct cache_entry **ce_ret);
     +
      int chmod_index_entry(struct index_state *, struct cache_entry *ce, char flip);
      int ce_same_name(const struct cache_entry *a, const struct cache_entry *b);
    @@ read-cache.c: int add_index_entry(struct index_state *istate, struct cache_entry
     +int add_to_index_cacheinfo(struct index_state *istate, unsigned int mode,
     +			   const struct object_id *oid, const char *path,
     +			   int stage, int allow_add, int allow_replace,
    -+			   struct cache_entry **pce)
    ++			   struct cache_entry **ce_ret)
     +{
     +	int len, option;
    -+	struct cache_entry *ce = NULL;
    ++	struct cache_entry *ce;
     +
     +	if (!verify_path(path, mode))
    -+		return error(_("Invalid path '%s'"), path);
    ++		return ADD_TO_INDEX_CACHEINFO_INVALID_PATH;
     +
     +	len = strlen(path);
     +	ce = make_empty_cache_entry(istate, len);
    @@ read-cache.c: int add_index_entry(struct index_state *istate, struct cache_entry
     +
     +	if (add_index_entry(istate, ce, option)) {
     +		discard_cache_entry(ce);
    -+		return -2;
    ++		return ADD_TO_INDEX_CACHEINFO_UNABLE_TO_ADD;
     +	}
     +
    -+	if (pce)
    -+		*pce = ce;
    ++	if (ce_ret)
    ++		*ce_ret = ce;
     +
     +	return 0;
     +}
 4:  284fc4227f !  8:  8755608f6d merge-one-file: rewrite in C
    @@ Commit message
         it did not because there was no regular file called `a/b'.  This test is
         now marked as successful.
     
    +    This also teaches `merge-index' to call merge_three_way() (when invoked
    +    with `--use=merge-one-file') without forking using a new callback,
    +    merge_one_file_func().
    +
    +    To avoid any issue with a shrinking index because of the merge function
    +    used (directly in the process or by forking), as described earlier, the
    +    iterator of the loop of merge_all_index() is increased by the number of
    +    entries with the same name, minus the difference between the number of
    +    entries in the index before and after the merge.
    +
    +    This should handle a shrinking index correctly, but could lead to issues
    +    with a growing index.  However, this case is not treated, as there is no
    +    callback that can produce such a case.
    +
         Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
     
      ## Makefile ##
    @@ Makefile: SCRIPT_SH += git-bisect.sh
      SCRIPT_SH += git-merge-resolve.sh
      SCRIPT_SH += git-mergetool.sh
      SCRIPT_SH += git-quiltimport.sh
    -@@ Makefile: LIB_OBJS += match-trees.o
    - LIB_OBJS += mem-pool.o
    - LIB_OBJS += merge-blobs.o
    - LIB_OBJS += merge-recursive.o
    -+LIB_OBJS += merge-strategies.o
    - LIB_OBJS += merge.o
    - LIB_OBJS += mergesort.o
    - LIB_OBJS += midx.o
     @@ Makefile: BUILTIN_OBJS += builtin/mailsplit.o
      BUILTIN_OBJS += builtin/merge-base.o
      BUILTIN_OBJS += builtin/merge-file.o
    @@ builtin.h: int cmd_merge_base(int argc, const char **argv, const char *prefix);
      int cmd_merge_tree(int argc, const char **argv, const char *prefix);
      int cmd_mktag(int argc, const char **argv, const char *prefix);
     
    + ## builtin/merge-index.c ##
    +@@ builtin/merge-index.c: static int merge_one_file_spawn(struct index_state *istate,
    + int cmd_merge_index(int argc, const char **argv, const char *prefix)
    + {
    + 	int i, force_file = 0, err = 0, one_shot = 0, quiet = 0;
    +-	merge_fn merge_action = merge_one_file_spawn;
    ++	merge_fn merge_action;
    + 	struct lock_file lock = LOCK_INIT;
    + 	struct repository *r = the_repository;
    + 	const char *use_internal = NULL;
    +@@ builtin/merge-index.c: int cmd_merge_index(int argc, const char **argv, const char *prefix)
    + 
    + 	if (skip_prefix(pgm, "--use=", &use_internal)) {
    + 		if (!strcmp(use_internal, "merge-one-file"))
    +-			pgm = "git-merge-one-file";
    ++			merge_action = merge_one_file_func;
    + 		else
    + 			die(_("git merge-index: unknown internal program %s"), use_internal);
    +-	}
    ++
    ++		repo_hold_locked_index(r, &lock, LOCK_DIE_ON_ERROR);
    ++	} else
    ++		merge_action = merge_one_file_spawn;
    + 
    + 	for (; i < argc; i++) {
    + 		const char *arg = argv[i];
    +
      ## builtin/merge-one-file.c (new) ##
     @@
     +/*
    @@ builtin/merge-one-file.c (new)
     + * that might change the tree layout.
     + */
     +
    -+#define USE_THE_INDEX_COMPATIBILITY_MACROS
     +#include "cache.h"
     +#include "builtin.h"
     +#include "lockfile.h"
    @@ builtin/merge-one-file.c (new)
     +		*p_orig_blob = NULL, *p_our_blob = NULL, *p_their_blob = NULL;
     +	unsigned int orig_mode = 0, our_mode = 0, their_mode = 0, ret = 0;
     +	struct lock_file lock = LOCK_INIT;
    ++	struct repository *r = the_repository;
     +
     +	if (argc != 8)
     +		usage(builtin_merge_one_file_usage);
     +
    -+	if (read_cache() < 0)
    ++	if (repo_read_index(r) < 0)
     +		die("invalid index");
     +
    -+	hold_locked_index(&lock, LOCK_DIE_ON_ERROR);
    ++	repo_hold_locked_index(r, &lock, LOCK_DIE_ON_ERROR);
     +
     +	if (!get_oid_hex(argv[1], &orig_blob)) {
     +		p_orig_blob = &orig_blob;
    @@ builtin/merge-one-file.c (new)
     +	if (ret)
     +		return ret;
     +
    -+	ret = merge_three_way(the_repository, p_orig_blob, p_our_blob, p_their_blob,
    ++	ret = merge_three_way(r->index, p_orig_blob, p_our_blob, p_their_blob,
     +			      argv[4], orig_mode, our_mode, their_mode);
     +
     +	if (ret) {
    @@ builtin/merge-one-file.c (new)
     +		return !!ret;
     +	}
     +
    -+	return write_locked_index(&the_index, &lock, COMMIT_LOCK);
    ++	return write_locked_index(r->index, &lock, COMMIT_LOCK);
     +}
     
      ## git-merge-one-file.sh (deleted) ##
    @@ git.c: static struct cmd_struct commands[] = {
      	{ "merge-recursive-ours", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
      	{ "merge-recursive-theirs", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
     
    - ## merge-strategies.c (new) ##
    + ## merge-strategies.c ##
     @@
    -+#include "cache.h"
    + #include "cache.h"
     +#include "dir.h"
    -+#include "merge-strategies.h"
    + #include "merge-strategies.h"
     +#include "xdiff-interface.h"
     +
    -+static int checkout_from_index(struct index_state *istate, const char *path,
    -+			       struct cache_entry *ce)
    ++static int add_merge_result_to_index(struct index_state *istate, unsigned int mode,
    ++				     const struct object_id *oid, const char *path,
    ++				     int checkout)
     +{
    -+	struct checkout state = CHECKOUT_INIT;
    ++	struct cache_entry *ce;
    ++	int res;
     +
    -+	state.istate = istate;
    -+	state.force = 1;
    -+	state.base_dir = "";
    -+	state.base_dir_len = 0;
    ++	res = add_to_index_cacheinfo(istate, mode, oid, path, 0, 1, 1, &ce);
    ++	if (res == -1)
    ++		return error(_("Invalid path '%s'"), path);
    ++	else if (res == -2)
    ++		return -1;
    ++
    ++	if (checkout) {
    ++		struct checkout state = CHECKOUT_INIT;
    ++
    ++		state.istate = istate;
    ++		state.force = 1;
    ++		state.base_dir = "";
    ++		state.base_dir_len = 0;
    ++
    ++		if (checkout_entry(ce, &state, NULL, NULL) < 0)
    ++			return error(_("%s: cannot checkout file"), path);
    ++	}
     +
    -+	if (checkout_entry(ce, &state, NULL, NULL) < 0)
    -+		return error(_("%s: cannot checkout file"), path);
     +	return 0;
     +}
     +
    @@ merge-strategies.c (new)
     +				  const struct object_id *their_blob, const char *path,
     +				  unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode)
     +{
    -+	if ((our_blob && orig_mode != our_mode) ||
    -+	    (their_blob && orig_mode != their_mode))
    ++	if ((!our_blob && orig_mode != their_mode) ||
    ++	    (!their_blob && orig_mode != our_mode))
     +		return error(_("File %s deleted on one branch but had its "
     +			       "permissions changed on the other."), path);
     +
    @@ merge-strategies.c (new)
     +	return add_file_to_index(istate, path, 0);
     +}
     +
    -+int merge_three_way(struct repository *r,
    ++int merge_three_way(struct index_state *istate,
     +		    const struct object_id *orig_blob,
     +		    const struct object_id *our_blob,
     +		    const struct object_id *their_blob, const char *path,
     +		    unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode)
     +{
     +	if (orig_blob &&
    -+	    ((!their_blob && our_blob && oideq(orig_blob, our_blob)) ||
    ++	    ((!our_blob && !their_blob) ||
    ++	     (!their_blob && our_blob && oideq(orig_blob, our_blob)) ||
     +	     (!our_blob && their_blob && oideq(orig_blob, their_blob)))) {
     +		/* Deleted in both or deleted in one and unchanged in the other. */
    -+		return merge_one_file_deleted(r->index, our_blob, their_blob, path,
    ++		return merge_one_file_deleted(istate, our_blob, their_blob, path,
     +					      orig_mode, our_mode, their_mode);
     +	} else if (!orig_blob && our_blob && !their_blob) {
     +		/*
    -+		 * Added in one.  The other side did not add and we
    ++		 * Added in ours.  The other side did not add and we
     +		 * added so there is nothing to be done, except making
     +		 * the path merged.
     +		 */
    -+		return add_to_index_cacheinfo(r->index, our_mode, our_blob,
    -+					      path, 0, 1, 1, NULL);
    ++		return add_merge_result_to_index(istate, our_mode, our_blob, path, 0);
     +	} else if (!orig_blob && !our_blob && their_blob) {
    -+		struct cache_entry *ce;
     +		printf(_("Adding %s\n"), path);
     +
     +		if (file_exists(path))
     +			return error(_("untracked %s is overwritten by the merge."), path);
     +
    -+		if (add_to_index_cacheinfo(r->index, their_mode, their_blob,
    -+					   path, 0, 1, 1, &ce))
    -+			return -1;
    -+		return checkout_from_index(r->index, path, ce);
    ++		return add_merge_result_to_index(istate, their_mode, their_blob, path, 1);
     +	} else if (!orig_blob && our_blob && their_blob &&
     +		   oideq(our_blob, their_blob)) {
    -+		struct cache_entry *ce;
    -+
     +		/* Added in both, identically (check for same permissions). */
     +		if (our_mode != their_mode)
     +			return error(_("File %s added identically in both branches, "
    @@ merge-strategies.c (new)
     +
     +		printf(_("Adding %s\n"), path);
     +
    -+		if (add_to_index_cacheinfo(r->index, our_mode, our_blob,
    -+					   path, 0, 1, 1, &ce))
    -+			return -1;
    -+		return checkout_from_index(r->index, path, ce);
    ++		return add_merge_result_to_index(istate, our_mode, our_blob, path, 1);
     +	} else if (our_blob && their_blob) {
     +		/* Modified in both, but differently. */
    -+		return do_merge_one_file(r->index,
    ++		return do_merge_one_file(istate,
     +					 orig_blob, our_blob, their_blob, path,
     +					 orig_mode, our_mode, their_mode);
     +	} else {
    @@ merge-strategies.c (new)
     +
     +	return 0;
     +}
    ++
    ++int merge_one_file_func(struct index_state *istate,
    ++			const struct object_id *orig_blob,
    ++			const struct object_id *our_blob,
    ++			const struct object_id *their_blob, const char *path,
    ++			unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
    ++			void *data)
    ++{
    ++	return merge_three_way(istate,
    ++			       orig_blob, our_blob, their_blob, path,
    ++			       orig_mode, our_mode, their_mode);
    ++}
    + 
    + static int merge_entry(struct index_state *istate, int quiet, unsigned int pos,
    + 		       const char *path, int *err, merge_fn fn, void *data)
    +@@ merge-strategies.c: int merge_all_index(struct index_state *istate, int oneshot, int quiet,
    + 		    merge_fn fn, void *data)
    + {
    + 	int err = 0, ret;
    +-	unsigned int i;
    ++	unsigned int i, prev_nr;
    + 
    + 	for (i = 0; i < istate->cache_nr; i++) {
    + 		const struct cache_entry *ce = istate->cache[i];
    + 		if (!ce_stage(ce))
    + 			continue;
    + 
    ++		prev_nr = istate->cache_nr;
    + 		ret = merge_entry(istate, quiet || oneshot, i, ce->name, &err, fn, data);
    +-		if (ret > 0)
    +-			i += ret - 1;
    +-		else if (ret == -1)
    ++		if (ret > 0) {
    ++			/*
    ++			 * Don't bother handling an index that has
    ++			 * grown, since merge_one_file_func() can't grow
    ++			 * it, and merge_one_file_spawn() can't change
    ++			 * it.
    ++			 */
    ++			i += ret - (prev_nr - istate->cache_nr) - 1;
    ++		} else if (ret == -1)
    + 			return -1;
    + 
    + 		if (err && !oneshot)
     
    - ## merge-strategies.h (new) ##
    + ## merge-strategies.h ##
     @@
    -+#ifndef MERGE_STRATEGIES_H
    -+#define MERGE_STRATEGIES_H
    -+
    -+#include "object.h"
    -+
    -+int merge_three_way(struct repository *r,
    + 
    + #include "object.h"
    + 
    ++int merge_three_way(struct index_state *istate,
     +		    const struct object_id *orig_blob,
     +		    const struct object_id *our_blob,
     +		    const struct object_id *their_blob, const char *path,
     +		    unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode);
     +
    -+#endif /* MERGE_STRATEGIES_H */
    + typedef int (*merge_fn)(struct index_state *istate,
    + 			const struct object_id *orig_blob,
    + 			const struct object_id *our_blob,
    +@@ merge-strategies.h: typedef int (*merge_fn)(struct index_state *istate,
    + 			unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
    + 			void *data);
    + 
    ++int merge_one_file_func(struct index_state *istate,
    ++			const struct object_id *orig_blob,
    ++			const struct object_id *our_blob,
    ++			const struct object_id *their_blob, const char *path,
    ++			unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
    ++			void *data);
    ++
    + int merge_index_path(struct index_state *istate, int oneshot, int quiet,
    + 		     const char *path, merge_fn fn, void *data);
    + int merge_all_index(struct index_state *istate, int oneshot, int quiet,
    +
    + ## t/t6060-merge-index.sh ##
    +@@ t/t6060-merge-index.sh: test_expect_success 'merge-one-file fails without a work tree' '
    + 	(cd bare.git &&
    + 	 GIT_INDEX_FILE=$PWD/merge.index &&
    + 	 export GIT_INDEX_FILE &&
    +-	 test_must_fail git merge-index git-merge-one-file -a
    ++	 test_must_fail git merge-index --use=merge-one-file -a
    + 	)
    + '
    + 
     
      ## t/t6415-merge-dir-to-symlink.sh ##
     @@ t/t6415-merge-dir-to-symlink.sh: test_expect_success SYMLINKS 'a/b was resolved as symlink' '
 5:  54abee902f <  -:  ---------- merge-index: libify merge_one_path() and merge_all()
 6:  acaf100edd <  -:  ---------- merge-index: don't fork if the requested program is `git-merge-one-file'
 7:  9a9e3faeff !  9:  3ecf49a8ac merge-resolve: rewrite in C
    @@ builtin/merge-resolve.c (new)
     + * Resolve two trees, using enhanced multi-base read-tree.
     + */
     +
    -+#define USE_THE_INDEX_COMPATIBILITY_MACROS
     +#include "cache.h"
     +#include "builtin.h"
     +#include "merge-strategies.h"
    @@ builtin/merge-resolve.c (new)
     +	const char *head = NULL;
     +	struct commit_list *bases = NULL, *remote = NULL;
     +	struct commit_list **next_base = &bases;
    ++	struct repository *r = the_repository;
     +
     +	if (argc < 5)
     +		usage(builtin_merge_resolve_usage);
     +
     +	setup_work_tree();
    -+	if (read_cache() < 0)
    ++	if (repo_read_index(r) < 0)
     +		die("invalid index");
     +
     +	/*
    @@ builtin/merge-resolve.c (new)
     +			if (get_oid(argv[i], &oid))
     +				die("object %s not found.", argv[i]);
     +
    -+			commit = lookup_commit_or_die(&oid, argv[i]);
    ++			commit = oideq(&oid, r->hash_algo->empty_tree) ?
    ++				NULL : lookup_commit_or_die(&oid, argv[i]);
     +
     +			if (sep_seen)
     +				commit_list_insert(commit, &remote);
    @@ builtin/merge-resolve.c (new)
     +	if (!bases)
     +		return 2;
     +
    -+	return merge_strategies_resolve(the_repository, bases, head, remote);
    ++	return merge_strategies_resolve(r, bases, head, remote);
     +}
     
      ## git-merge-resolve.sh (deleted) ##
    @@ git-merge-resolve.sh (deleted)
     -	exit 0
     -else
     -	echo "Simple merge failed, trying Automatic merge."
    --	if git merge-index -o git-merge-one-file -a
    +-	if git merge-index -o --use=merge-one-file -a
     -	then
     -		exit 0
     -	else
    @@ merge-strategies.c
      #include "dir.h"
     +#include "lockfile.h"
      #include "merge-strategies.h"
    - #include "run-command.h"
     +#include "unpack-trees.h"
      #include "xdiff-interface.h"
      
    - static int checkout_from_index(struct index_state *istate, const char *path,
    -@@ merge-strategies.c: int merge_all_index(struct repository *r, int oneshot, int quiet,
    + static int add_merge_result_to_index(struct index_state *istate, unsigned int mode,
    +@@ merge-strategies.c: int merge_all_index(struct index_state *istate, int oneshot, int quiet,
      
      	return err;
      }
    @@ merge-strategies.c: int merge_all_index(struct repository *r, int oneshot, int q
     +
     +		puts(_("Simple merge failed, trying Automatic merge."));
     +		repo_hold_locked_index(r, &lock, LOCK_DIE_ON_ERROR);
    -+		ret = merge_all_index(r, 1, 0, merge_one_file_func, NULL);
    ++		ret = merge_all_index(r->index, 1, 0, merge_one_file_func, NULL);
     +
     +		write_locked_index(r->index, &lock, COMMIT_LOCK);
     +		return !!ret;
    @@ merge-strategies.h
     +#include "commit.h"
      #include "object.h"
      
    - int merge_three_way(struct repository *r,
    -@@ merge-strategies.h: int merge_index_path(struct repository *r, int oneshot, int quiet,
    - int merge_all_index(struct repository *r, int oneshot, int quiet,
    + int merge_three_way(struct index_state *istate,
    +@@ merge-strategies.h: int merge_index_path(struct index_state *istate, int oneshot, int quiet,
    + int merge_all_index(struct index_state *istate, int oneshot, int quiet,
      		    merge_fn fn, void *data);
      
     +int merge_strategies_resolve(struct repository *r,
 8:  359346229c = 10:  615b04d417 merge-recursive: move better_branch_name() to merge.c
 9:  4dff780212 ! 11:  a6ece04f3d merge-octopus: rewrite in C
    @@ builtin/merge-octopus.c (new)
     + * Resolve two or more trees.
     + */
     +
    -+#define USE_THE_INDEX_COMPATIBILITY_MACROS
     +#include "cache.h"
     +#include "builtin.h"
     +#include "commit.h"
    @@ builtin/merge-octopus.c (new)
     +	struct commit_list *bases = NULL, *remotes = NULL;
     +	struct commit_list **next_base = &bases, **next_remote = &remotes;
     +	const char *head_arg = NULL;
    ++	struct repository *r = the_repository;
     +
     +	if (argc < 5)
     +		usage(builtin_merge_octopus_usage);
     +
     +	setup_work_tree();
    -+	if (read_cache() < 0)
    ++	if (repo_read_index(r) < 0)
     +		die("invalid index");
     +
     +	/*
    @@ builtin/merge-octopus.c (new)
     +			if (get_oid(argv[i], &oid))
     +				die("object %s not found.", argv[i]);
     +
    -+			commit = lookup_commit_or_die(&oid, argv[i]);
    ++			commit = oideq(&oid, r->hash_algo->empty_tree) ?
    ++				NULL : lookup_commit_or_die(&oid, argv[i]);
     +
     +			if (sep_seen)
     +				next_remote = commit_list_append(commit, next_remote);
    @@ builtin/merge-octopus.c (new)
     +	if (commit_list_count(remotes) < 2)
     +		return 2;
     +
    -+	return merge_strategies_octopus(the_repository, bases, head_arg, remotes);
    ++	return merge_strategies_octopus(r, bases, head_arg, remotes);
     +}
     
      ## git-merge-octopus.sh (deleted) ##
    @@ git-merge-octopus.sh (deleted)
     -	if test $? -ne 0
     -	then
     -		gettextln "Simple merge did not work, trying automatic merge."
    --		git merge-index -o git-merge-one-file -a ||
    +-		git merge-index -o --use=merge-one-file -a ||
     -		OCTOPUS_FAILURE=1
     -		next=$(git write-tree 2>/dev/null)
     -	fi
    @@ merge-strategies.c: int merge_strategies_resolve(struct repository *r,
     +			    struct tree **reference_tree)
     +{
     +	struct tree_desc t[MAX_UNPACK_TREES];
    -+	struct commit_list *j;
    ++	struct commit_list *i;
     +	int nr = 0, ret = 0;
     +
     +	printf(_("Trying simple merge with %s\n"), branch_name);
     +
    -+	for (j = common; j; j = j->next) {
    -+		struct tree *tree = repo_get_commit_tree(r, j->item);
    ++	for (i = common; i; i = i->next) {
    ++		struct tree *tree = repo_get_commit_tree(r, i->item);
     +		if (add_tree(tree, t + (nr++)))
     +			return -1;
     +	}
    @@ merge-strategies.c: int merge_strategies_resolve(struct repository *r,
     +	if (add_tree(current_tree, t + (nr++)))
     +		return -1;
     +	if (fast_forward(r, t, nr, 1))
    -+		return -1;
    ++		return 2;
     +
     +	if (write_tree(r, reference_tree)) {
     +		struct lock_file lock = LOCK_INIT;
     +
     +		puts(_("Simple merge did not work, trying automatic merge."));
     +		repo_hold_locked_index(r, &lock, LOCK_DIE_ON_ERROR);
    -+		ret = merge_all_index(r, 1, 0, merge_one_file_func, NULL);
    ++		ret = !!merge_all_index(r->index, 0, 0, merge_one_file_func, NULL);
     +		write_locked_index(r->index, &lock, COMMIT_LOCK);
     +
     +		write_tree(r, reference_tree);
     +	}
     +
    -+	return ret ? -2 : 0;
    ++	return ret;
     +}
     +
     +int merge_strategies_octopus(struct repository *r,
     +			     struct commit_list *bases, const char *head_arg,
     +			     struct commit_list *remotes)
     +{
    -+	int ff_merge = 1, ret = 0, references = 1;
    -+	struct commit **reference_commit, *head_commit;
    ++	int ff_merge = 1, ret = 0, nr_references = 1;
    ++	struct commit **reference_commits, *head_commit;
     +	struct tree *reference_tree, *head_tree;
     +	struct commit_list *i;
     +	struct object_id head;
    @@ merge-strategies.c: int merge_strategies_resolve(struct repository *r,
     +		return 2;
     +	}
     +
    -+	reference_commit = xcalloc(commit_list_count(remotes) + 1,
    -+				   sizeof(struct commit *));
    -+	reference_commit[0] = head_commit;
    ++	CALLOC_ARRAY(reference_commits, commit_list_count(remotes) + 1);
    ++	reference_commits[0] = head_commit;
     +	reference_tree = head_tree;
     +
     +	for (i = remotes; i && i->item; i = i->next) {
    @@ merge-strategies.c: int merge_strategies_resolve(struct repository *r,
     +		struct object_id *oid = &c->object.oid;
     +		struct tree *current_tree = repo_get_commit_tree(r, c);
     +		struct commit_list *common, *j;
    -+		char *branch_name;
    -+		int k = 0, up_to_date = 0;
    -+
    -+		if (ret) {
    -+			/*
    -+			 * We allow only last one to have a
    -+			 * hand-resolvable conflicts.  Last round failed
    -+			 * and we still had a head to merge.
    -+			 */
    -+			puts(_("Automated merge did not work."));
    -+			puts(_("Should not be doing an octopus."));
    -+
    -+			free(reference_commit);
    -+			return 2;
    -+		}
    -+
    -+		branch_name = merge_get_better_branch_name(oid_to_hex(oid));
    -+		common = get_merge_bases_many(c, references, reference_commit);
    ++		char *branch_name = merge_get_better_branch_name(oid_to_hex(oid));
    ++		int up_to_date = 0;
     +
    ++		common = repo_get_merge_bases_many(r, c, nr_references, reference_commits);
     +		if (!common) {
     +			error(_("Unable to find common commit with %s"), branch_name);
     +
     +			free(branch_name);
     +			free_commit_list(common);
    -+			free(reference_commit);
    ++			free(reference_commits);
     +
     +			return 2;
     +		}
     +
    -+		for (j = common; j && !(up_to_date || !ff_merge); j = j->next) {
    ++		for (j = common; j && !up_to_date && ff_merge; j = j->next) {
     +			up_to_date |= oideq(&j->item->object.oid, oid);
     +
    -+			if (k < references)
    -+				ff_merge &= oideq(&j->item->object.oid, &reference_commit[k++]->object.oid);
    ++			if (!j->next &&
    ++			    !oideq(&j->item->object.oid,
    ++				   &reference_commits[nr_references - 1]->object.oid))
    ++				ff_merge = 0;
     +		}
     +
     +		if (up_to_date) {
    @@ merge-strategies.c: int merge_strategies_resolve(struct repository *r,
     +		if (ff_merge) {
     +			ret = octopus_fast_forward(r, branch_name, head_tree,
     +						   current_tree, &reference_tree);
    -+			references = 0;
    ++			nr_references = 0;
     +		} else {
     +			ret = octopus_do_merge(r, branch_name, common,
     +					       current_tree, &reference_tree);
    @@ merge-strategies.c: int merge_strategies_resolve(struct repository *r,
     +		free(branch_name);
     +		free_commit_list(common);
     +
    -+		if (ret == -1)
    ++		if (ret == -1 || ret == 2)
     +			break;
    ++		else if (ret && i->next) {
    ++			/*
    ++			 * We allow only last one to have a
    ++			 * hand-resolvable conflicts.  Last round failed
    ++			 * and we still had a head to merge.
    ++			 */
    ++			puts(_("Automated merge did not work."));
    ++			puts(_("Should not be doing an octopus."));
     +
    -+		reference_commit[references++] = c;
    ++			free(reference_commits);
    ++			return 2;
    ++		}
    ++
    ++		reference_commits[nr_references++] = c;
     +	}
     +
    -+	free(reference_commit);
    ++	free(reference_commits);
     +	return ret;
     +}
     
      ## merge-strategies.h ##
    -@@ merge-strategies.h: int merge_all_index(struct repository *r, int oneshot, int quiet,
    +@@ merge-strategies.h: int merge_all_index(struct index_state *istate, int oneshot, int quiet,
      int merge_strategies_resolve(struct repository *r,
      			     struct commit_list *bases, const char *head_arg,
      			     struct commit_list *remote);
10:  76f02b4531 = 12:  cc1500147b merge: use the "resolve" strategy without forking
11:  c9e0a38d0f = 13:  ec3dc3b81e merge: use the "octopus" strategy without forking
12:  5b595efa46 = 14:  e7dc4a15d4 sequencer: use the "resolve" strategy without forking
13:  7eb0f13442 = 15:  34280dd82d sequencer: use the "octopus" merge strategy without forking
-- 
2.31.0



* [PATCH v7 01/15] t6407: modernise tests
  2021-03-17 20:49           ` [PATCH v7 00/15] " Alban Gruin
@ 2021-03-17 20:49             ` Alban Gruin
  2021-03-17 20:49             ` [PATCH v7 02/15] t6060: modify multiple files to expose a possible issue with merge-index Alban Gruin
                               ` (14 subsequent siblings)
  15 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2021-03-17 20:49 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Phillip Wood, Derrick Stolee, Alban Gruin

Some tests in t6407 use an if/then/else to check whether a command
failed, but nowadays we have the `test_must_fail' function to do this
correctly for us.

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 t/t6407-merge-binary.sh | 27 ++++++---------------------
 1 file changed, 6 insertions(+), 21 deletions(-)

diff --git a/t/t6407-merge-binary.sh b/t/t6407-merge-binary.sh
index d4273f2575..bd2696367b 100755
--- a/t/t6407-merge-binary.sh
+++ b/t/t6407-merge-binary.sh
@@ -8,7 +8,6 @@ export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 . ./test-lib.sh
 
 test_expect_success setup '
-
 	cat "$TEST_DIRECTORY"/test-binary-1.png >m &&
 	git add m &&
 	git ls-files -s | sed -e "s/ 0	/ 1	/" >E1 &&
@@ -38,33 +37,19 @@ test_expect_success setup '
 '
 
 test_expect_success resolve '
-
 	rm -f a* m* &&
 	git reset --hard anchor &&
-
-	if git merge -s resolve main
-	then
-		echo Oops, should not have succeeded
-		false
-	else
-		git ls-files -s >current
-		test_cmp expect current
-	fi
+	test_must_fail git merge -s resolve main &&
+	git ls-files -s >current &&
+	test_cmp expect current
 '
 
 test_expect_success recursive '
-
 	rm -f a* m* &&
 	git reset --hard anchor &&
-
-	if git merge -s recursive main
-	then
-		echo Oops, should not have succeeded
-		false
-	else
-		git ls-files -s >current
-		test_cmp expect current
-	fi
+	test_must_fail git merge -s recursive main &&
+	git ls-files -s >current &&
+	test_cmp expect current
 '
 
 test_done
-- 
2.31.0



* [PATCH v7 02/15] t6060: modify multiple files to expose a possible issue with merge-index
  2021-03-17 20:49           ` [PATCH v7 00/15] " Alban Gruin
  2021-03-17 20:49             ` [PATCH v7 01/15] t6407: modernise tests Alban Gruin
@ 2021-03-17 20:49             ` Alban Gruin
  2021-03-17 20:49             ` [PATCH v7 03/15] t6060: add tests for removed files Alban Gruin
                               ` (13 subsequent siblings)
  15 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2021-03-17 20:49 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Phillip Wood, Derrick Stolee, Alban Gruin

Currently, merge-index iterates over every index entry, skipping stage0
entries.  For each remaining entry, it counts how many entries starting
at the current one have the same name, forks to do the merge, and then
advances the iterator by that count to skip them.  This behaviour is
correct: even if the subprocess modifies the index, merge-index never
reloads it.

But once it is rewritten to call a function, the index it iterates over
will be modified in place and may shrink when a conflict happens or when
a file is removed, so we have to be careful to handle such cases.

Here is an example:

 *    Merge branches, file1 and file2 are trivially mergeable.
 |\
 | *  Modifies file1 and file2.
 * |  Modifies file1 and file2.
 |/
 *    Adds file1 and file2.

When the merge happens, the index will look like this:

 i -> 0. file1 (stage1)
      1. file1 (stage2)
      2. file1 (stage3)
      3. file2 (stage1)
      4. file2 (stage2)
      5. file2 (stage3)

merge-index handles `file1' first.  As it appears 3 times after the
iterator, it is merged.  The index is now stale, `i' is increased by 3,
and the index now looks like this:

      0. file1 (stage1)
      1. file1 (stage2)
      2. file1 (stage3)
 i -> 3. file2 (stage1)
      4. file2 (stage2)
      5. file2 (stage3)

`file2' appears three times too, so it is merged.

With a naive rewrite, the index would look like this:

      0. file1 (stage0)
      1. file2 (stage1)
      2. file2 (stage2)
 i -> 3. file2 (stage3)

`file2' appears only once at or after the iterator, so it will be added,
_not_ merged, which is wrong.

A naive rewrite would lead to improperly merged files, or even to files
not being handled at all.

This changes t6060 to reproduce this case, by creating 2 files instead
of 1, to check the correctness of the soon-to-be-rewritten merge-index.
The files are identical, which is not really important -- the factors
that could trigger this issue are that they should be separated by at
most one entry in the index, and that the first one in the index should
be trivially mergeable.
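
The walk described above can be sketched with a toy model.  This is only
an illustration under stated assumptions: `struct toy_entry',
count_same_name(), toy_resolve() and toy_merge_all() are hypothetical
names, not git functions, and the `compensate' mode shows one possible
way to keep the iterator in sync, not necessarily the one this series
ends up using.

```c
#include <assert.h>
#include <string.h>

/* Toy stand-in for an index entry; git's struct cache_entry is richer. */
struct toy_entry {
	const char *name;
	int stage;	/* 0 = merged, 1..3 = unmerged stages */
};

/* Count consecutive entries, starting at pos, named like entries[pos]. */
static unsigned int count_same_name(const struct toy_entry *entries,
				    unsigned int nr, unsigned int pos)
{
	unsigned int n = 0;

	while (pos + n < nr &&
	       !strcmp(entries[pos + n].name, entries[pos].name))
		n++;
	return n;
}

/*
 * Toy "merge": collapse the n entries at pos into a single stage-0
 * entry, shrinking the array in place.  Returns the new entry count.
 */
static unsigned int toy_resolve(struct toy_entry *entries, unsigned int nr,
				unsigned int pos, unsigned int n)
{
	assert(n >= 1 && pos + n <= nr);
	entries[pos].stage = 0;
	memmove(entries + pos + 1, entries + pos + n,
		(nr - pos - n) * sizeof(*entries));
	return nr - n + 1;
}

/*
 * Walk the toy index and return how many paths were seen with all
 * three stages.  With compensate == 0 the iterator advances by the old
 * group size although the array shrank (the naive rewrite); otherwise
 * it steps over the single stage-0 entry that replaced the group.
 */
static int toy_merge_all(struct toy_entry *entries, unsigned int nr,
			 int compensate)
{
	int full_merges = 0;
	unsigned int i = 0;

	while (i < nr) {
		unsigned int n;

		if (!entries[i].stage) {
			i++;
			continue;
		}
		n = count_same_name(entries, nr, i);
		if (n == 3)
			full_merges++;
		nr = toy_resolve(entries, nr, i, n);
		i += compensate ? 1 : n;
	}
	return full_merges;
}
```

On the six-entry index from the example, the naive mode sees only
`file1' with all three stages, while the compensating mode sees both
files.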

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 t/t6060-merge-index.sh | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/t/t6060-merge-index.sh b/t/t6060-merge-index.sh
index ddf34f0115..9e15ceb957 100755
--- a/t/t6060-merge-index.sh
+++ b/t/t6060-merge-index.sh
@@ -7,16 +7,19 @@ test_expect_success 'setup diverging branches' '
 	for i in 1 2 3 4 5 6 7 8 9 10; do
 		echo $i
 	done >file &&
-	git add file &&
+	cp file file2 &&
+	git add file file2 &&
 	git commit -m base &&
 	git tag base &&
 	sed s/2/two/ <file >tmp &&
 	mv tmp file &&
+	cp file file2 &&
 	git commit -a -m two &&
 	git tag two &&
 	git checkout -b other HEAD^ &&
 	sed s/10/ten/ <file >tmp &&
 	mv tmp file &&
+	cp file file2 &&
 	git commit -a -m ten &&
 	git tag ten
 '
@@ -35,8 +38,11 @@ ten
 EOF
 
 test_expect_success 'read-tree does not resolve content merge' '
+	cat >expect <<-\EOF &&
+	file
+	file2
+	EOF
 	git read-tree -i -m base ten two &&
-	echo file >expect &&
 	git diff-files --name-only --diff-filter=U >unmerged &&
 	test_cmp expect unmerged
 '
-- 
2.31.0



* [PATCH v7 03/15] t6060: add tests for removed files
  2021-03-17 20:49           ` [PATCH v7 00/15] " Alban Gruin
  2021-03-17 20:49             ` [PATCH v7 01/15] t6407: modernise tests Alban Gruin
  2021-03-17 20:49             ` [PATCH v7 02/15] t6060: modify multiple files to expose a possible issue with merge-index Alban Gruin
@ 2021-03-17 20:49             ` Alban Gruin
  2021-03-22 21:36               ` Johannes Schindelin
  2021-03-17 20:49             ` [PATCH v7 04/15] merge-index: libify merge_one_path() and merge_all() Alban Gruin
                               ` (12 subsequent siblings)
  15 siblings, 1 reply; 221+ messages in thread
From: Alban Gruin @ 2021-03-17 20:49 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Phillip Wood, Derrick Stolee, Alban Gruin

Until now, t6060 did not check git-merge-one-file's behaviour when a
file is deleted in a branch.  To avoid regressions on this during the
conversion, this adds a new file, `file3', in the commit tagged as
`base', and deletes it in the commit tagged as `two'.

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 t/t6060-merge-index.sh | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/t/t6060-merge-index.sh b/t/t6060-merge-index.sh
index 9e15ceb957..0cbd8a1f7f 100755
--- a/t/t6060-merge-index.sh
+++ b/t/t6060-merge-index.sh
@@ -8,12 +8,14 @@ test_expect_success 'setup diverging branches' '
 		echo $i
 	done >file &&
 	cp file file2 &&
-	git add file file2 &&
+	cp file file3 &&
+	git add file file2 file3 &&
 	git commit -m base &&
 	git tag base &&
 	sed s/2/two/ <file >tmp &&
 	mv tmp file &&
 	cp file file2 &&
+	git rm file3 &&
 	git commit -a -m two &&
 	git tag two &&
 	git checkout -b other HEAD^ &&
@@ -41,6 +43,7 @@ test_expect_success 'read-tree does not resolve content merge' '
 	cat >expect <<-\EOF &&
 	file
 	file2
+	file3
 	EOF
 	git read-tree -i -m base ten two &&
 	git diff-files --name-only --diff-filter=U >unmerged &&
-- 
2.31.0



* [PATCH v7 04/15] merge-index: libify merge_one_path() and merge_all()
  2021-03-17 20:49           ` [PATCH v7 00/15] " Alban Gruin
                               ` (2 preceding siblings ...)
  2021-03-17 20:49             ` [PATCH v7 03/15] t6060: add tests for removed files Alban Gruin
@ 2021-03-17 20:49             ` Alban Gruin
  2021-03-17 20:49             ` [PATCH v7 05/15] merge-index: drop the index Alban Gruin
                               ` (11 subsequent siblings)
  15 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2021-03-17 20:49 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Phillip Wood, Derrick Stolee, Alban Gruin

The "resolve" and "octopus" merge strategies do not call `git
merge-one-file' directly; they delegate the work to another git command,
`git merge-index', which loops over files in the index and calls the
specified command.  Unfortunately, the functions implementing this loop
are not part of libgit.a, which means that once rewritten, the
strategies would still have to invoke `merge-one-file' by spawning a new
process first.

To avoid this, this moves and renames merge_one_path(), merge_all(), and
their helpers to merge-strategies.c.  They also take a callback to
dictate what they should do for each file.  For now, to preserve the
behaviour of `merge-index', only one callback, launching a new process,
is defined.
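
As a rough illustration of the callback-driven loop, here is a
simplified, self-contained sketch.  The `demo_' names and the reduced
callback signature are assumptions made for the example; the real
merge_fn in this patch also receives the index, the three object IDs
and the three modes.

```c
#include <assert.h>
#include <string.h>

/* Toy index entry: a path and its stage (0 = merged, 1..3 = unmerged). */
struct demo_entry {
	const char *name;
	int stage;
};

/*
 * Hypothetical, reduced callback type.  `present' is a bitmask of the
 * stages recorded for `path' (bit 0 = stage 1, bit 1 = stage 2,
 * bit 2 = stage 3).  A non-zero return value means the merge failed.
 */
typedef int (*demo_merge_fn)(const char *path, unsigned int present,
			     void *data);

/*
 * Call fn once per unmerged path.  In oneshot mode, keep going after a
 * failure and return the number of failures; otherwise stop at the
 * first failure and return 1.
 */
static int demo_merge_all(const struct demo_entry *entries, unsigned int nr,
			  int oneshot, demo_merge_fn fn, void *data)
{
	int err = 0;
	unsigned int i = 0;

	while (i < nr) {
		const char *path = entries[i].name;
		unsigned int present = 0;

		if (!entries[i].stage) {
			i++;
			continue;
		}
		/* Gather every stage recorded for this path. */
		for (; i < nr && !strcmp(entries[i].name, path); i++) {
			assert(entries[i].stage >= 1 && entries[i].stage <= 3);
			present |= 1u << (entries[i].stage - 1);
		}
		if (fn(path, present, data)) {
			err++;
			if (!oneshot)
				return 1;
		}
	}
	return err;
}

/* Example callback: counts calls, fails when stage 3 is missing. */
static int demo_count_cb(const char *path, unsigned int present, void *data)
{
	int *calls = data;

	(void)path;
	(*calls)++;
	return !(present & 4u);
}
```

Swapping demo_count_cb() for another function changes what happens for
each path without touching the loop, which is the point of passing a
callback.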

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 Makefile              |  1 +
 builtin/merge-index.c | 90 +++++++++++++++----------------------------
 merge-strategies.c    | 75 ++++++++++++++++++++++++++++++++++++
 merge-strategies.h    | 18 +++++++++
 4 files changed, 125 insertions(+), 59 deletions(-)
 create mode 100644 merge-strategies.c
 create mode 100644 merge-strategies.h

diff --git a/Makefile b/Makefile
index dfb0f1000f..1b1dc49e86 100644
--- a/Makefile
+++ b/Makefile
@@ -913,6 +913,7 @@ LIB_OBJS += merge-blobs.o
 LIB_OBJS += merge-ort.o
 LIB_OBJS += merge-ort-wrappers.o
 LIB_OBJS += merge-recursive.o
+LIB_OBJS += merge-strategies.o
 LIB_OBJS += merge.o
 LIB_OBJS += mergesort.o
 LIB_OBJS += midx.o
diff --git a/builtin/merge-index.c b/builtin/merge-index.c
index 38ea6ad6ca..70f440d9a0 100644
--- a/builtin/merge-index.c
+++ b/builtin/merge-index.c
@@ -1,74 +1,43 @@
 #define USE_THE_INDEX_COMPATIBILITY_MACROS
 #include "builtin.h"
+#include "merge-strategies.h"
 #include "run-command.h"
 
 static const char *pgm;
-static int one_shot, quiet;
-static int err;
 
-static int merge_entry(int pos, const char *path)
+static int merge_one_file_spawn(struct index_state *istate,
+				const struct object_id *orig_blob,
+				const struct object_id *our_blob,
+				const struct object_id *their_blob, const char *path,
+				unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
+				void *data)
 {
-	int found;
-	const char *arguments[] = { pgm, "", "", "", path, "", "", "", NULL };
-	char hexbuf[4][GIT_MAX_HEXSZ + 1];
-	char ownbuf[4][60];
+	char oids[3][GIT_MAX_HEXSZ + 1] = {{0}};
+	char modes[3][10] = {{0}};
+	const char *arguments[] = { pgm, oids[0], oids[1], oids[2],
+				    path, modes[0], modes[1], modes[2], NULL };
 
-	if (pos >= active_nr)
-		die("git merge-index: %s not in the cache", path);
-	found = 0;
-	do {
-		const struct cache_entry *ce = active_cache[pos];
-		int stage = ce_stage(ce);
-
-		if (strcmp(ce->name, path))
-			break;
-		found++;
-		oid_to_hex_r(hexbuf[stage], &ce->oid);
-		xsnprintf(ownbuf[stage], sizeof(ownbuf[stage]), "%o", ce->ce_mode);
-		arguments[stage] = hexbuf[stage];
-		arguments[stage + 4] = ownbuf[stage];
-	} while (++pos < active_nr);
-	if (!found)
-		die("git merge-index: %s not in the cache", path);
-
-	if (run_command_v_opt(arguments, 0)) {
-		if (one_shot)
-			err++;
-		else {
-			if (!quiet)
-				die("merge program failed");
-			exit(1);
-		}
+	if (orig_blob) {
+		oid_to_hex_r(oids[0], orig_blob);
+		xsnprintf(modes[0], sizeof(modes[0]), "%06o", orig_mode);
 	}
-	return found;
-}
-
-static void merge_one_path(const char *path)
-{
-	int pos = cache_name_pos(path, strlen(path));
 
-	/*
-	 * If it already exists in the cache as stage0, it's
-	 * already merged and there is nothing to do.
-	 */
-	if (pos < 0)
-		merge_entry(-pos-1, path);
-}
+	if (our_blob) {
+		oid_to_hex_r(oids[1], our_blob);
+		xsnprintf(modes[1], sizeof(modes[1]), "%06o", our_mode);
+	}
 
-static void merge_all(void)
-{
-	int i;
-	for (i = 0; i < active_nr; i++) {
-		const struct cache_entry *ce = active_cache[i];
-		if (!ce_stage(ce))
-			continue;
-		i += merge_entry(i, ce->name)-1;
+	if (their_blob) {
+		oid_to_hex_r(oids[2], their_blob);
+		xsnprintf(modes[2], sizeof(modes[2]), "%06o", their_mode);
 	}
+
+	return run_command_v_opt(arguments, 0);
 }
 
 int cmd_merge_index(int argc, const char **argv, const char *prefix)
 {
-	int i, force_file = 0;
+	int i, force_file = 0, err = 0, one_shot = 0, quiet = 0;
 
 	/* Without this we cannot rely on waitpid() to tell
 	 * what happened to our children.
@@ -89,7 +58,9 @@ int cmd_merge_index(int argc, const char **argv, const char *prefix)
 		quiet = 1;
 		i++;
 	}
+
 	pgm = argv[i++];
+
 	for (; i < argc; i++) {
 		const char *arg = argv[i];
 		if (!force_file && *arg == '-') {
@@ -98,14 +69,15 @@ int cmd_merge_index(int argc, const char **argv, const char *prefix)
 				continue;
 			}
 			if (!strcmp(arg, "-a")) {
-				merge_all();
+				err |= merge_all_index(&the_index, one_shot, quiet,
+						       merge_one_file_spawn, NULL);
 				continue;
 			}
 			die("git merge-index: unknown option %s", arg);
 		}
-		merge_one_path(arg);
+		err |= merge_index_path(&the_index, one_shot, quiet, arg,
+					merge_one_file_spawn, NULL);
 	}
-	if (err && !quiet)
-		die("merge program failed");
+
 	return err;
 }
diff --git a/merge-strategies.c b/merge-strategies.c
new file mode 100644
index 0000000000..c80f964612
--- /dev/null
+++ b/merge-strategies.c
@@ -0,0 +1,75 @@
+#include "cache.h"
+#include "merge-strategies.h"
+
+static int merge_entry(struct index_state *istate, int quiet, unsigned int pos,
+		       const char *path, int *err, merge_fn fn, void *data)
+{
+	int found = 0;
+	const struct object_id *oids[3] = {NULL};
+	unsigned int modes[3] = {0};
+
+	do {
+		const struct cache_entry *ce = istate->cache[pos];
+		int stage = ce_stage(ce);
+
+		if (strcmp(ce->name, path))
+			break;
+		found++;
+		oids[stage - 1] = &ce->oid;
+		modes[stage - 1] = ce->ce_mode;
+	} while (++pos < istate->cache_nr);
+	if (!found)
+		return error(_("%s is not in the cache"), path);
+
+	if (fn(istate, oids[0], oids[1], oids[2], path,
+	       modes[0], modes[1], modes[2], data)) {
+		if (!quiet)
+			error(_("Merge program failed"));
+		(*err)++;
+	}
+
+	return found;
+}
+
+int merge_index_path(struct index_state *istate, int oneshot, int quiet,
+		     const char *path, merge_fn fn, void *data)
+{
+	int pos = index_name_pos(istate, path, strlen(path)), ret, err = 0;
+
+	/*
+	 * If it already exists in the cache as stage0, it's
+	 * already merged and there is nothing to do.
+	 */
+	if (pos < 0) {
+		ret = merge_entry(istate, quiet || oneshot, -pos - 1, path, &err, fn, data);
+		if (ret == -1)
+			return -1;
+		else if (err)
+			return 1;
+	}
+	return 0;
+}
+
+int merge_all_index(struct index_state *istate, int oneshot, int quiet,
+		    merge_fn fn, void *data)
+{
+	int err = 0, ret;
+	unsigned int i;
+
+	for (i = 0; i < istate->cache_nr; i++) {
+		const struct cache_entry *ce = istate->cache[i];
+		if (!ce_stage(ce))
+			continue;
+
+		ret = merge_entry(istate, quiet || oneshot, i, ce->name, &err, fn, data);
+		if (ret > 0)
+			i += ret - 1;
+		else if (ret == -1)
+			return -1;
+
+		if (err && !oneshot)
+			return 1;
+	}
+
+	return err;
+}
diff --git a/merge-strategies.h b/merge-strategies.h
new file mode 100644
index 0000000000..88f476f170
--- /dev/null
+++ b/merge-strategies.h
@@ -0,0 +1,18 @@
+#ifndef MERGE_STRATEGIES_H
+#define MERGE_STRATEGIES_H
+
+#include "object.h"
+
+typedef int (*merge_fn)(struct index_state *istate,
+			const struct object_id *orig_blob,
+			const struct object_id *our_blob,
+			const struct object_id *their_blob, const char *path,
+			unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
+			void *data);
+
+int merge_index_path(struct index_state *istate, int oneshot, int quiet,
+		     const char *path, merge_fn fn, void *data);
+int merge_all_index(struct index_state *istate, int oneshot, int quiet,
+		    merge_fn fn, void *data);
+
+#endif /* MERGE_STRATEGIES_H */
-- 
2.31.0



* [PATCH v7 05/15] merge-index: drop the index
  2021-03-17 20:49           ` [PATCH v7 00/15] " Alban Gruin
                               ` (3 preceding siblings ...)
  2021-03-17 20:49             ` [PATCH v7 04/15] merge-index: libify merge_one_path() and merge_all() Alban Gruin
@ 2021-03-17 20:49             ` Alban Gruin
  2021-03-17 20:49             ` [PATCH v7 06/15] merge-index: add a new way to invoke `git-merge-one-file' Alban Gruin
                               ` (10 subsequent siblings)
  15 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2021-03-17 20:49 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Phillip Wood, Derrick Stolee, Alban Gruin

In an effort to reduce the usage of the global index throughout the
codebase, this removes references to it in `git merge-index'.

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 builtin/merge-index.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/builtin/merge-index.c b/builtin/merge-index.c
index 70f440d9a0..49ddf3f9cd 100644
--- a/builtin/merge-index.c
+++ b/builtin/merge-index.c
@@ -1,4 +1,3 @@
-#define USE_THE_INDEX_COMPATIBILITY_MACROS
 #include "builtin.h"
 #include "merge-strategies.h"
 #include "run-command.h"
@@ -38,6 +37,7 @@ static int merge_one_file_spawn(struct index_state *istate,
 int cmd_merge_index(int argc, const char **argv, const char *prefix)
 {
 	int i, force_file = 0, err = 0, one_shot = 0, quiet = 0;
+	struct repository *r = the_repository;
 
 	/* Without this we cannot rely on waitpid() to tell
 	 * what happened to our children.
@@ -47,7 +47,8 @@ int cmd_merge_index(int argc, const char **argv, const char *prefix)
 	if (argc < 3)
 		usage("git merge-index [-o] [-q] <merge-program> (-a | [--] [<filename>...])");
 
-	read_cache();
+	if (repo_read_index(r) < 0)
+		die("invalid index");
 
 	i = 1;
 	if (!strcmp(argv[i], "-o")) {
@@ -69,13 +70,13 @@ int cmd_merge_index(int argc, const char **argv, const char *prefix)
 				continue;
 			}
 			if (!strcmp(arg, "-a")) {
-				err |= merge_all_index(&the_index, one_shot, quiet,
+				err |= merge_all_index(r->index, one_shot, quiet,
 						       merge_one_file_spawn, NULL);
 				continue;
 			}
 			die("git merge-index: unknown option %s", arg);
 		}
-		err |= merge_index_path(&the_index, one_shot, quiet, arg,
+		err |= merge_index_path(r->index, one_shot, quiet, arg,
 					merge_one_file_spawn, NULL);
 	}
 
-- 
2.31.0



* [PATCH v7 06/15] merge-index: add a new way to invoke `git-merge-one-file'
  2021-03-17 20:49           ` [PATCH v7 00/15] " Alban Gruin
                               ` (4 preceding siblings ...)
  2021-03-17 20:49             ` [PATCH v7 05/15] merge-index: drop the index Alban Gruin
@ 2021-03-17 20:49             ` Alban Gruin
  2021-03-17 20:49             ` [PATCH v7 07/15] update-index: move add_cacheinfo() to read-cache.c Alban Gruin
                               ` (9 subsequent siblings)
  15 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2021-03-17 20:49 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Phillip Wood, Derrick Stolee, Alban Gruin

Since `git-merge-one-file' will be rewritten and libified, there may be
cases where there is no executable with this name (i.e. when git is
compiled with `SKIP_DASHED_BUILT_INS' enabled).  This adds a new way to
invoke this particular program even if it does not exist, by passing
`--use=merge-one-file' to merge-index.  For now, it still forks.
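
The argument handling can be sketched in isolation as follows.  This is
a minimal sketch, not the patch itself: demo_skip_prefix() is a local
stand-in written for this example (git has its own skip_prefix()), and
demo_resolve_program() is a hypothetical helper that returns NULL where
the patch calls die().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Local stand-in for a prefix-skipping helper: if str starts with
 * prefix, store the remainder in *out and return 1; else return 0.
 */
static int demo_skip_prefix(const char *str, const char *prefix,
			    const char **out)
{
	size_t len = strlen(prefix);

	if (strncmp(str, prefix, len))
		return 0;
	*out = str + len;
	return 1;
}

/*
 * Map the <merge-program> argument to the program to run.  Returns
 * NULL for an unknown --use= value.
 */
static const char *demo_resolve_program(const char *arg)
{
	const char *internal = NULL;

	assert(arg);
	if (!demo_skip_prefix(arg, "--use=", &internal))
		return arg;	/* an ordinary external merge program */
	if (!strcmp(internal, "merge-one-file"))
		return "git-merge-one-file";
	return NULL;
}
```

Any argument that does not start with `--use=' is passed through
unchanged, so existing callers that name an external merge program keep
working.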

The test suite and shell scripts (git-merge-octopus.sh and
git-merge-resolve.sh) are updated to use this new convention.

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 Documentation/git-merge-index.txt |  7 ++++---
 builtin/merge-index.c             | 25 ++++++++++++++++++++++---
 git-merge-octopus.sh              |  2 +-
 git-merge-resolve.sh              |  2 +-
 t/t6060-merge-index.sh            |  8 ++++----
 5 files changed, 32 insertions(+), 12 deletions(-)

diff --git a/Documentation/git-merge-index.txt b/Documentation/git-merge-index.txt
index 2ab84a91e5..57e7e03b4c 100644
--- a/Documentation/git-merge-index.txt
+++ b/Documentation/git-merge-index.txt
@@ -9,7 +9,7 @@ git-merge-index - Run a merge for files needing merging
 SYNOPSIS
 --------
 [verse]
-'git merge-index' [-o] [-q] <merge-program> (-a | [--] <file>*)
+'git merge-index' [-o] [-q] (<merge-program> | --use=merge-one-file) (-a | [--] <file>*)
 
 DESCRIPTION
 -----------
@@ -44,8 +44,9 @@ code.
 Typically this is run with a script calling Git's imitation of
 the 'merge' command from the RCS package.
 
-A sample script called 'git merge-one-file' is included in the
-distribution.
+A sample script called 'git merge-one-file' used to be included in the
+distribution. This program must now be called with
+'--use=merge-one-file'.
 
 ALERT ALERT ALERT! The Git "merge object order" is different from the
 RCS 'merge' program merge object order. In the above ordering, the
diff --git a/builtin/merge-index.c b/builtin/merge-index.c
index 49ddf3f9cd..fd5b1a5a92 100644
--- a/builtin/merge-index.c
+++ b/builtin/merge-index.c
@@ -1,4 +1,5 @@
 #include "builtin.h"
+#include "lockfile.h"
 #include "merge-strategies.h"
 #include "run-command.h"
 
@@ -37,7 +38,10 @@ static int merge_one_file_spawn(struct index_state *istate,
 int cmd_merge_index(int argc, const char **argv, const char *prefix)
 {
 	int i, force_file = 0, err = 0, one_shot = 0, quiet = 0;
+	merge_fn merge_action = merge_one_file_spawn;
+	struct lock_file lock = LOCK_INIT;
 	struct repository *r = the_repository;
+	const char *use_internal = NULL;
 
 	/* Without this we cannot rely on waitpid() to tell
 	 * what happened to our children.
@@ -45,7 +49,7 @@ int cmd_merge_index(int argc, const char **argv, const char *prefix)
 	signal(SIGCHLD, SIG_DFL);
 
 	if (argc < 3)
-		usage("git merge-index [-o] [-q] <merge-program> (-a | [--] [<filename>...])");
+		usage("git merge-index [-o] [-q] (<merge-program> | --use=merge-one-file) (-a | [--] [<filename>...])");
 
 	if (repo_read_index(r) < 0)
 		die("invalid index");
@@ -61,6 +65,14 @@ int cmd_merge_index(int argc, const char **argv, const char *prefix)
 	}
 
 	pgm = argv[i++];
+	setup_work_tree();
+
+	if (skip_prefix(pgm, "--use=", &use_internal)) {
+		if (!strcmp(use_internal, "merge-one-file"))
+			pgm = "git-merge-one-file";
+		else
+			die(_("git merge-index: unknown internal program %s"), use_internal);
+	}
 
 	for (; i < argc; i++) {
 		const char *arg = argv[i];
@@ -71,13 +83,20 @@ int cmd_merge_index(int argc, const char **argv, const char *prefix)
 			}
 			if (!strcmp(arg, "-a")) {
 				err |= merge_all_index(r->index, one_shot, quiet,
-						       merge_one_file_spawn, NULL);
+						       merge_action, NULL);
 				continue;
 			}
 			die("git merge-index: unknown option %s", arg);
 		}
 		err |= merge_index_path(r->index, one_shot, quiet, arg,
-					merge_one_file_spawn, NULL);
+					merge_action, NULL);
+	}
+
+	if (is_lock_file_locked(&lock)) {
+		if (err)
+			rollback_lock_file(&lock);
+		else
+			return write_locked_index(r->index, &lock, COMMIT_LOCK);
 	}
 
 	return err;
diff --git a/git-merge-octopus.sh b/git-merge-octopus.sh
index 7d19d37951..2770891960 100755
--- a/git-merge-octopus.sh
+++ b/git-merge-octopus.sh
@@ -100,7 +100,7 @@ do
 	if test $? -ne 0
 	then
 		gettextln "Simple merge did not work, trying automatic merge."
-		git merge-index -o git-merge-one-file -a ||
+		git merge-index -o --use=merge-one-file -a ||
 		OCTOPUS_FAILURE=1
 		next=$(git write-tree 2>/dev/null)
 	fi
diff --git a/git-merge-resolve.sh b/git-merge-resolve.sh
index 343fe7bccd..0b4fc88b61 100755
--- a/git-merge-resolve.sh
+++ b/git-merge-resolve.sh
@@ -45,7 +45,7 @@ then
 	exit 0
 else
 	echo "Simple merge failed, trying Automatic merge."
-	if git merge-index -o git-merge-one-file -a
+	if git merge-index -o --use=merge-one-file -a
 	then
 		exit 0
 	else
diff --git a/t/t6060-merge-index.sh b/t/t6060-merge-index.sh
index 0cbd8a1f7f..d0cdfeddc1 100755
--- a/t/t6060-merge-index.sh
+++ b/t/t6060-merge-index.sh
@@ -50,8 +50,8 @@ test_expect_success 'read-tree does not resolve content merge' '
 	test_cmp expect unmerged
 '
 
-test_expect_success 'git merge-index git-merge-one-file resolves' '
-	git merge-index git-merge-one-file -a &&
+test_expect_success 'git merge-index --use=merge-one-file resolves' '
+	git merge-index --use=merge-one-file -a &&
 	git diff-files --name-only --diff-filter=U >unmerged &&
 	test_must_be_empty unmerged &&
 	test_cmp expect-merged file &&
@@ -83,7 +83,7 @@ test_expect_success 'merge-one-file respects GIT_WORK_TREE' '
 	 export GIT_WORK_TREE &&
 	 GIT_INDEX_FILE=$PWD/merge.index &&
 	 export GIT_INDEX_FILE &&
-	 git merge-index git-merge-one-file -a &&
+	 git merge-index --use=merge-one-file -a &&
 	 git cat-file blob :file >work/file-index
 	) &&
 	test_cmp expect-merged bare.git/work/file &&
@@ -98,7 +98,7 @@ test_expect_success 'merge-one-file respects core.worktree' '
 	 export GIT_DIR &&
 	 git config core.worktree "$PWD/child" &&
 	 git read-tree -i -m base ten two &&
-	 git merge-index git-merge-one-file -a &&
+	 git merge-index --use=merge-one-file -a &&
 	 git cat-file blob :file >file-index
 	) &&
 	test_cmp expect-merged subdir/child/file &&
-- 
2.31.0



* [PATCH v7 07/15] update-index: move add_cacheinfo() to read-cache.c
  2021-03-17 20:49           ` [PATCH v7 00/15] " Alban Gruin
                               ` (5 preceding siblings ...)
  2021-03-17 20:49             ` [PATCH v7 06/15] merge-index: add a new way to invoke `git-merge-one-file' Alban Gruin
@ 2021-03-17 20:49             ` Alban Gruin
  2021-03-22 21:59               ` Johannes Schindelin
  2021-03-17 20:49             ` [PATCH v7 08/15] merge-one-file: rewrite in C Alban Gruin
                               ` (8 subsequent siblings)
  15 siblings, 1 reply; 221+ messages in thread
From: Alban Gruin @ 2021-03-17 20:49 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Phillip Wood, Derrick Stolee, Alban Gruin

This moves the body of add_cacheinfo() from builtin/update-index.c to a
new function in read-cache.c, add_to_index_cacheinfo(), which takes an
`istate' parameter as well as the `allow_add' and `allow_replace' flags.
add_cacheinfo() becomes a thin wrapper around it.  The new cache entry
is returned through an optional pointer parameter, `ce_ret'.  The return
value is either 0 (success), -1 (invalid path), or -2 (failed to add the
file to the index).
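
The return-value convention above could be consumed as in the following
sketch.  It is a standalone illustration, not code from this series: the
two macros carry the same values as the ones added to cache.h, while
report_cacheinfo_result() is a hypothetical helper that only formats a
message and maps the result to the usual 0 / -1 status.

```c
#include <stdio.h>

/* Same values as the macros this patch adds to cache.h. */
#define ADD_TO_INDEX_CACHEINFO_INVALID_PATH (-1)
#define ADD_TO_INDEX_CACHEINFO_UNABLE_TO_ADD (-2)

/*
 * Hypothetical helper, not part of the patch: turn a return value of
 * add_to_index_cacheinfo() into a message and a 0 / -1 status.
 */
static int report_cacheinfo_result(int res, const char *path,
				   char *msg, size_t msglen)
{
	switch (res) {
	case 0:
		snprintf(msg, msglen, "add '%s'", path);
		return 0;
	case ADD_TO_INDEX_CACHEINFO_INVALID_PATH:
		snprintf(msg, msglen, "Invalid path '%s'", path);
		return -1;
	case ADD_TO_INDEX_CACHEINFO_UNABLE_TO_ADD:
		snprintf(msg, msglen, "%s: cannot add to the index", path);
		return -1;
	default:
		/* Any other value is outside the documented contract. */
		snprintf(msg, msglen, "%s: unexpected result %d", path, res);
		return -1;
	}
}
```

Keeping the error messages in the caller, as sketched here, lets each
user of the library function word its own diagnostics.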

This will become useful in the next commit, when the three-way merge
will need to call this function.

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 builtin/update-index.c | 25 +++++++------------------
 cache.h                |  8 ++++++++
 read-cache.c           | 35 +++++++++++++++++++++++++++++++++++
 3 files changed, 50 insertions(+), 18 deletions(-)

diff --git a/builtin/update-index.c b/builtin/update-index.c
index 79087bccea..6b86e89840 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -404,27 +404,16 @@ static int process_path(const char *path, struct stat *st, int stat_errno)
 static int add_cacheinfo(unsigned int mode, const struct object_id *oid,
 			 const char *path, int stage)
 {
-	int len, option;
-	struct cache_entry *ce;
+	int res;
 
-	if (!verify_path(path, mode))
-		return error("Invalid path '%s'", path);
-
-	len = strlen(path);
-	ce = make_empty_cache_entry(&the_index, len);
-
-	oidcpy(&ce->oid, oid);
-	memcpy(ce->name, path, len);
-	ce->ce_flags = create_ce_flags(stage);
-	ce->ce_namelen = len;
-	ce->ce_mode = create_ce_mode(mode);
-	if (assume_unchanged)
-		ce->ce_flags |= CE_VALID;
-	option = allow_add ? ADD_CACHE_OK_TO_ADD : 0;
-	option |= allow_replace ? ADD_CACHE_OK_TO_REPLACE : 0;
-	if (add_cache_entry(ce, option))
+	res = add_to_index_cacheinfo(&the_index, mode, oid, path, stage,
+				     allow_add, allow_replace, NULL);
+	if (res == ADD_TO_INDEX_CACHEINFO_INVALID_PATH)
+		return error(_("Invalid path '%s'"), path);
+	if (res == ADD_TO_INDEX_CACHEINFO_UNABLE_TO_ADD)
 		return error("%s: cannot add to the index - missing --add option?",
 			     path);
+
 	report("add '%s'", path);
 	return 0;
 }
diff --git a/cache.h b/cache.h
index 6fda8091f1..41e30c0da2 100644
--- a/cache.h
+++ b/cache.h
@@ -832,6 +832,14 @@ int remove_file_from_index(struct index_state *, const char *path);
 int add_to_index(struct index_state *, const char *path, struct stat *, int flags);
 int add_file_to_index(struct index_state *, const char *path, int flags);
 
+#define ADD_TO_INDEX_CACHEINFO_INVALID_PATH (-1)
+#define ADD_TO_INDEX_CACHEINFO_UNABLE_TO_ADD (-2)
+
+int add_to_index_cacheinfo(struct index_state *, unsigned int mode,
+			   const struct object_id *oid, const char *path,
+			   int stage, int allow_add, int allow_replace,
+			   struct cache_entry **ce_ret);
+
 int chmod_index_entry(struct index_state *, struct cache_entry *ce, char flip);
 int ce_same_name(const struct cache_entry *a, const struct cache_entry *b);
 void set_object_name_for_intent_to_add_entry(struct cache_entry *ce);
diff --git a/read-cache.c b/read-cache.c
index 1e9a50c6c7..b514523ca1 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1350,6 +1350,41 @@ int add_index_entry(struct index_state *istate, struct cache_entry *ce, int opti
 	return 0;
 }
 
+int add_to_index_cacheinfo(struct index_state *istate, unsigned int mode,
+			   const struct object_id *oid, const char *path,
+			   int stage, int allow_add, int allow_replace,
+			   struct cache_entry **ce_ret)
+{
+	int len, option;
+	struct cache_entry *ce;
+
+	if (!verify_path(path, mode))
+		return ADD_TO_INDEX_CACHEINFO_INVALID_PATH;
+
+	len = strlen(path);
+	ce = make_empty_cache_entry(istate, len);
+
+	oidcpy(&ce->oid, oid);
+	memcpy(ce->name, path, len);
+	ce->ce_flags = create_ce_flags(stage);
+	ce->ce_namelen = len;
+	ce->ce_mode = create_ce_mode(mode);
+	if (assume_unchanged)
+		ce->ce_flags |= CE_VALID;
+	option = allow_add ? ADD_CACHE_OK_TO_ADD : 0;
+	option |= allow_replace ? ADD_CACHE_OK_TO_REPLACE : 0;
+
+	if (add_index_entry(istate, ce, option)) {
+		discard_cache_entry(ce);
+		return ADD_TO_INDEX_CACHEINFO_UNABLE_TO_ADD;
+	}
+
+	if (ce_ret)
+		*ce_ret = ce;
+
+	return 0;
+}
+
 /*
  * "refresh" does not calculate a new sha1 file or bring the
  * cache up-to-date for mode/content changes. But what it
-- 
2.31.0



* [PATCH v7 08/15] merge-one-file: rewrite in C
  2021-03-17 20:49           ` [PATCH v7 00/15] " Alban Gruin
                               ` (6 preceding siblings ...)
  2021-03-17 20:49             ` [PATCH v7 07/15] update-index: move add_cacheinfo() to read-cache.c Alban Gruin
@ 2021-03-17 20:49             ` Alban Gruin
  2021-03-22 22:20               ` Johannes Schindelin
  2021-03-17 20:49             ` [PATCH v7 09/15] merge-resolve: " Alban Gruin
                               ` (7 subsequent siblings)
  15 siblings, 1 reply; 221+ messages in thread
From: Alban Gruin @ 2021-03-17 20:49 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Phillip Wood, Derrick Stolee, Alban Gruin

This rewrites `git merge-one-file' from shell to C.  This port is not
completely straightforward.  To save precious cycles by not reading and
flushing the index repeatedly, to avoid writing temporary files when an
operation can be performed in-memory, and to allow other functions to
use the rewrite without forking or worrying about the index, the calls
to external processes are replaced by calls to functions in libgit.a:

 - calls to `update-index --add --cacheinfo' are replaced by calls to
   add_to_index_cacheinfo();

 - calls to `update-index --remove' are replaced by calls to
   remove_file_from_index();

 - calls to `checkout-index -u -f' are replaced by calls to
   checkout_entry();

 - calls to `unpack-file' and `merge-file' are replaced by calls to
   read_mmblob() and xdl_merge(), respectively, to merge files
   in-memory;

 - calls to `checkout-index -f --stage=2' are removed, as this is needed
   to have the correct permission bits on the merged file from the
   script, but not in the C version;

 - calls to `update-index' are replaced by calls to add_file_to_index().

The bulk of the rewrite is done in a new file in libgit.a,
merge-strategies.c.  This will enable the resolve and octopus strategies
to directly call it instead of forking.

This also fixes a bug present in the original script: when a file
exists in the branch to merge but not in our branch, the script only
checked whether a _regular_ file exists at that path, whereas the
rewritten version checks whether a file of any kind (i.e. a directory,
...) exists.  This fixes the test `do not lose untracked in merge
(resolve)' in t6415-merge-dir-to-symlink.sh, where the branch to merge
had a new file, `a/b', but our branch had a directory there; the merge
should have failed because a directory exists, but it did not because
there was no regular file called `a/b'.  This test is now marked as
successful.
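
The difference between the two checks can be sketched in plain POSIX C.
This is an illustration only: is_regular_file() and anything_exists()
are hypothetical names, and the sketch assumes that the existence check
on the C side amounts to a successful lstat() on the path.

```c
#include <sys/stat.h>

/* Rough equivalent of the shell's `test -f`: true only for regular files. */
static int is_regular_file(const char *path)
{
	struct stat st;

	return !stat(path, &st) && S_ISREG(st.st_mode);
}

/* True for anything present at `path`: file, directory, symlink, ... */
static int anything_exists(const char *path)
{
	struct stat st;

	return !lstat(path, &st);
}
```

For a path that is a directory, the first check reports nothing in the
way while the second one does, which is the situation exercised by the
`a/b' case above.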

This also teaches `merge-index' to call merge_three_way() (when invoked
with `--use=merge-one-file') without forking using a new callback,
merge_one_file_func().
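
As a reading aid, the case analysis done by merge_three_way() can be
summarised by the standalone sketch below.  It mirrors the order of the
conditions in this patch, but it is not the patch's code: object ids
are modelled as plain strings (NULL for a missing blob), and the enum
and function names are made up for the illustration.

```c
#include <string.h>

enum merge_case {
	CASE_DELETED,		/* deleted in both, or deleted in one and unchanged in the other */
	CASE_ADDED_IN_OURS,
	CASE_ADDED_IN_THEIRS,
	CASE_ADDED_IN_BOTH_IDENTICALLY,
	CASE_CONTENT_MERGE,	/* present on both sides, contents to be merged */
	CASE_UNHANDLED
};

/* NULL stands for a missing blob; ids are compared as plain strings. */
static int same(const char *a, const char *b)
{
	return a && b && !strcmp(a, b);
}

static enum merge_case classify(const char *orig, const char *ours,
				const char *theirs)
{
	if (orig && ((!ours && !theirs) ||
		     (!theirs && same(orig, ours)) ||
		     (!ours && same(orig, theirs))))
		return CASE_DELETED;
	if (!orig && ours && !theirs)
		return CASE_ADDED_IN_OURS;
	if (!orig && !ours && theirs)
		return CASE_ADDED_IN_THEIRS;
	if (!orig && same(ours, theirs))
		return CASE_ADDED_IN_BOTH_IDENTICALLY;
	if (ours && theirs)
		return CASE_CONTENT_MERGE;
	return CASE_UNHANDLED;
}
```

A path deleted on one side and modified on the other falls through to
the last case, as it does in the shell script's final `*)' branch.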

To avoid any issue with a shrinking index because of the merge function
used (directly in the process or by forking), as described earlier, the
iterator of the loop of merge_all_index() is increased by the number of
entries with the same name, minus the difference between the number of
entries in the index before and after the merge.

This should handle a shrinking index correctly, but could lead to issues
with a growing index.  That case is not handled, as no existing
callback can grow the index.
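
The iterator arithmetic can be tried out on a toy model.  The sketch
below is not git code: `struct toy_index', toy_merge_entry() and the
rule that paths named "gone" are removed entirely are assumptions made
for the illustration.  Only the adjustment of `i' follows the formula
used in merge_all_index().

```c
#include <assert.h>
#include <string.h>

/* Toy index entry: a path plus a merge stage (0 means merged). */
struct entry {
	const char *name;
	int stage;
};

struct toy_index {
	struct entry cache[16];
	unsigned int cache_nr;
};

/* Remove `count` entries starting at `pos`, shifting the tail down. */
static void remove_entries(struct toy_index *idx, unsigned int pos,
			   unsigned int count)
{
	assert(pos + count <= idx->cache_nr);
	memmove(idx->cache + pos, idx->cache + pos + count,
		(idx->cache_nr - pos - count) * sizeof(*idx->cache));
	idx->cache_nr -= count;
}

/*
 * Hypothetical merge step: count the entries that share the name at
 * `pos`, then either drop them all (paths named "gone") or collapse
 * them into a single stage-0 entry.  Returns the number of entries found.
 */
static int toy_merge_entry(struct toy_index *idx, unsigned int pos)
{
	const char *name = idx->cache[pos].name;
	unsigned int found = 0;

	while (pos + found < idx->cache_nr &&
	       !strcmp(idx->cache[pos + found].name, name))
		found++;

	if (!strcmp(name, "gone")) {
		remove_entries(idx, pos, found);
	} else {
		remove_entries(idx, pos + 1, found - 1);
		idx->cache[pos].stage = 0;
	}
	return (int)found;
}

/*
 * Walk the toy index.  With `compensate` set, the iterator is adjusted
 * by the number of entries removed; without it, the walk assumes that
 * the index keeps its size.  Returns the number of entries left unmerged.
 */
static unsigned int toy_merge_all(struct toy_index *idx, int compensate)
{
	unsigned int i, prev_nr, unmerged = 0;

	for (i = 0; i < idx->cache_nr; i++) {
		int ret;

		if (!idx->cache[i].stage)
			continue;

		prev_nr = idx->cache_nr;
		ret = toy_merge_entry(idx, i);
		if (compensate)
			i += ret - (prev_nr - idx->cache_nr) - 1;
		else
			i += ret - 1;
	}

	for (i = 0; i < idx->cache_nr; i++)
		if (idx->cache[i].stage)
			unmerged++;
	return unmerged;
}

/* Fill the toy index with one merged path and three unmerged ones. */
static void toy_fill(struct toy_index *idx)
{
	static const struct entry initial[] = {
		{ "a", 0 },
		{ "b", 1 }, { "b", 2 }, { "b", 3 },
		{ "gone", 1 }, { "gone", 2 },
		{ "z", 1 }, { "z", 3 },
	};

	memcpy(idx->cache, initial, sizeof(initial));
	idx->cache_nr = sizeof(initial) / sizeof(initial[0]);
}
```

After toy_fill(), a compensated walk leaves no unmerged entry, whereas
the uncompensated one jumps over the two "gone" entries once `b' has
been collapsed.  When every entry of a path is removed, the adjustment
is -1; `i' being unsigned, this relies on well-defined wraparound
followed by the loop's own increment.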

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 Makefile                        |   2 +-
 builtin.h                       |   1 +
 builtin/merge-index.c           |   9 +-
 builtin/merge-one-file.c        |  94 +++++++++++++++
 git-merge-one-file.sh           | 167 --------------------------
 git.c                           |   1 +
 merge-strategies.c              | 207 +++++++++++++++++++++++++++++++-
 merge-strategies.h              |  13 ++
 t/t6060-merge-index.sh          |   2 +-
 t/t6415-merge-dir-to-symlink.sh |   2 +-
 10 files changed, 321 insertions(+), 177 deletions(-)
 create mode 100644 builtin/merge-one-file.c
 delete mode 100755 git-merge-one-file.sh

diff --git a/Makefile b/Makefile
index 1b1dc49e86..e2e4389f76 100644
--- a/Makefile
+++ b/Makefile
@@ -600,7 +600,6 @@ SCRIPT_SH += git-bisect.sh
 SCRIPT_SH += git-difftool--helper.sh
 SCRIPT_SH += git-filter-branch.sh
 SCRIPT_SH += git-merge-octopus.sh
-SCRIPT_SH += git-merge-one-file.sh
 SCRIPT_SH += git-merge-resolve.sh
 SCRIPT_SH += git-mergetool.sh
 SCRIPT_SH += git-quiltimport.sh
@@ -1100,6 +1099,7 @@ BUILTIN_OBJS += builtin/mailsplit.o
 BUILTIN_OBJS += builtin/merge-base.o
 BUILTIN_OBJS += builtin/merge-file.o
 BUILTIN_OBJS += builtin/merge-index.o
+BUILTIN_OBJS += builtin/merge-one-file.o
 BUILTIN_OBJS += builtin/merge-ours.o
 BUILTIN_OBJS += builtin/merge-recursive.o
 BUILTIN_OBJS += builtin/merge-tree.o
diff --git a/builtin.h b/builtin.h
index b6ce981b73..227c133036 100644
--- a/builtin.h
+++ b/builtin.h
@@ -179,6 +179,7 @@ int cmd_merge_base(int argc, const char **argv, const char *prefix);
 int cmd_merge_index(int argc, const char **argv, const char *prefix);
 int cmd_merge_ours(int argc, const char **argv, const char *prefix);
 int cmd_merge_file(int argc, const char **argv, const char *prefix);
+int cmd_merge_one_file(int argc, const char **argv, const char *prefix);
 int cmd_merge_recursive(int argc, const char **argv, const char *prefix);
 int cmd_merge_tree(int argc, const char **argv, const char *prefix);
 int cmd_mktag(int argc, const char **argv, const char *prefix);
diff --git a/builtin/merge-index.c b/builtin/merge-index.c
index fd5b1a5a92..04d38aa130 100644
--- a/builtin/merge-index.c
+++ b/builtin/merge-index.c
@@ -38,7 +38,7 @@ static int merge_one_file_spawn(struct index_state *istate,
 int cmd_merge_index(int argc, const char **argv, const char *prefix)
 {
 	int i, force_file = 0, err = 0, one_shot = 0, quiet = 0;
-	merge_fn merge_action = merge_one_file_spawn;
+	merge_fn merge_action;
 	struct lock_file lock = LOCK_INIT;
 	struct repository *r = the_repository;
 	const char *use_internal = NULL;
@@ -69,10 +69,13 @@ int cmd_merge_index(int argc, const char **argv, const char *prefix)
 
 	if (skip_prefix(pgm, "--use=", &use_internal)) {
 		if (!strcmp(use_internal, "merge-one-file"))
-			pgm = "git-merge-one-file";
+			merge_action = merge_one_file_func;
 		else
 			die(_("git merge-index: unknown internal program %s"), use_internal);
-	}
+
+		repo_hold_locked_index(r, &lock, LOCK_DIE_ON_ERROR);
+	} else
+		merge_action = merge_one_file_spawn;
 
 	for (; i < argc; i++) {
 		const char *arg = argv[i];
diff --git a/builtin/merge-one-file.c b/builtin/merge-one-file.c
new file mode 100644
index 0000000000..ad99c6dbd4
--- /dev/null
+++ b/builtin/merge-one-file.c
@@ -0,0 +1,94 @@
+/*
+ * Builtin "git merge-one-file"
+ *
+ * Copyright (c) 2020 Alban Gruin
+ *
+ * Based on git-merge-one-file.sh, written by Linus Torvalds.
+ *
+ * This is the git per-file merge utility, called with
+ *
+ *   argv[1] - original file object name (or empty)
+ *   argv[2] - file in branch1 object name (or empty)
+ *   argv[3] - file in branch2 object name (or empty)
+ *   argv[4] - pathname in repository
+ *   argv[5] - original file mode (or empty)
+ *   argv[6] - file in branch1 mode (or empty)
+ *   argv[7] - file in branch2 mode (or empty)
+ *
+ * Handle some trivial cases. The _really_ trivial cases have been
+ * handled already by git read-tree, but that one doesn't do any merges
+ * that might change the tree layout.
+ */
+
+#include "cache.h"
+#include "builtin.h"
+#include "lockfile.h"
+#include "merge-strategies.h"
+
+static const char builtin_merge_one_file_usage[] =
+	"git merge-one-file <orig blob> <our blob> <their blob> <path> "
+	"<orig mode> <our mode> <their mode>\n\n"
+	"Blob ids and modes should be empty for missing files.";
+
+static int read_mode(const char *name, const char *arg, unsigned int *mode)
+{
+	char *last;
+	int ret = 0;
+
+	*mode = strtol(arg, &last, 8);
+
+	if (*last)
+		ret = error(_("invalid '%s' mode: expected nothing, got '%c'"), name, *last);
+	else if (!(S_ISREG(*mode) || S_ISDIR(*mode) || S_ISLNK(*mode)))
+		ret = error(_("invalid '%s' mode: %o"), name, *mode);
+
+	return ret;
+}
+
+int cmd_merge_one_file(int argc, const char **argv, const char *prefix)
+{
+	struct object_id orig_blob, our_blob, their_blob,
+		*p_orig_blob = NULL, *p_our_blob = NULL, *p_their_blob = NULL;
+	unsigned int orig_mode = 0, our_mode = 0, their_mode = 0, ret = 0;
+	struct lock_file lock = LOCK_INIT;
+	struct repository *r = the_repository;
+
+	if (argc != 8)
+		usage(builtin_merge_one_file_usage);
+
+	if (repo_read_index(r) < 0)
+		die("invalid index");
+
+	repo_hold_locked_index(r, &lock, LOCK_DIE_ON_ERROR);
+
+	if (!get_oid_hex(argv[1], &orig_blob)) {
+		p_orig_blob = &orig_blob;
+		ret = read_mode("orig", argv[5], &orig_mode);
+	} else if (!*argv[1] && *argv[5])
+		ret = error(_("no 'orig' object id given, but a mode was still given."));
+
+	if (!get_oid_hex(argv[2], &our_blob)) {
+		p_our_blob = &our_blob;
+		ret = read_mode("our", argv[6], &our_mode);
+	} else if (!*argv[2] && *argv[6])
+		ret = error(_("no 'our' object id given, but a mode was still given."));
+
+	if (!get_oid_hex(argv[3], &their_blob)) {
+		p_their_blob = &their_blob;
+		ret = read_mode("their", argv[7], &their_mode);
+	} else if (!*argv[3] && *argv[7])
+		ret = error(_("no 'their' object id given, but a mode was still given."));
+
+	if (ret)
+		return ret;
+
+	ret = merge_three_way(r->index, p_orig_blob, p_our_blob, p_their_blob,
+			      argv[4], orig_mode, our_mode, their_mode);
+
+	if (ret) {
+		rollback_lock_file(&lock);
+		return !!ret;
+	}
+
+	return write_locked_index(r->index, &lock, COMMIT_LOCK);
+}
diff --git a/git-merge-one-file.sh b/git-merge-one-file.sh
deleted file mode 100755
index f6d9852d2f..0000000000
--- a/git-merge-one-file.sh
+++ /dev/null
@@ -1,167 +0,0 @@
-#!/bin/sh
-#
-# Copyright (c) Linus Torvalds, 2005
-#
-# This is the git per-file merge script, called with
-#
-#   $1 - original file SHA1 (or empty)
-#   $2 - file in branch1 SHA1 (or empty)
-#   $3 - file in branch2 SHA1 (or empty)
-#   $4 - pathname in repository
-#   $5 - original file mode (or empty)
-#   $6 - file in branch1 mode (or empty)
-#   $7 - file in branch2 mode (or empty)
-#
-# Handle some trivial cases.. The _really_ trivial cases have
-# been handled already by git read-tree, but that one doesn't
-# do any merges that might change the tree layout.
-
-USAGE='<orig blob> <our blob> <their blob> <path>'
-USAGE="$USAGE <orig mode> <our mode> <their mode>"
-LONG_USAGE="usage: git merge-one-file $USAGE
-
-Blob ids and modes should be empty for missing files."
-
-SUBDIRECTORY_OK=Yes
-. git-sh-setup
-cd_to_toplevel
-require_work_tree
-
-if test $# != 7
-then
-	echo "$LONG_USAGE"
-	exit 1
-fi
-
-case "${1:-.}${2:-.}${3:-.}" in
-#
-# Deleted in both or deleted in one and unchanged in the other
-#
-"$1.." | "$1.$1" | "$1$1.")
-	if { test -z "$6" && test "$5" != "$7"; } ||
-	   { test -z "$7" && test "$5" != "$6"; }
-	then
-		echo "ERROR: File $4 deleted on one branch but had its" >&2
-		echo "ERROR: permissions changed on the other." >&2
-		exit 1
-	fi
-
-	if test -n "$2"
-	then
-		echo "Removing $4"
-	else
-		# read-tree checked that index matches HEAD already,
-		# so we know we do not have this path tracked.
-		# there may be an unrelated working tree file here,
-		# which we should just leave unmolested.  Make sure
-		# we do not have it in the index, though.
-		exec git update-index --remove -- "$4"
-	fi
-	if test -f "$4"
-	then
-		rm -f -- "$4" &&
-		rmdir -p "$(expr "z$4" : 'z\(.*\)/')" 2>/dev/null || :
-	fi &&
-		exec git update-index --remove -- "$4"
-	;;
-
-#
-# Added in one.
-#
-".$2.")
-	# the other side did not add and we added so there is nothing
-	# to be done, except making the path merged.
-	exec git update-index --add --cacheinfo "$6" "$2" "$4"
-	;;
-"..$3")
-	echo "Adding $4"
-	if test -f "$4"
-	then
-		echo "ERROR: untracked $4 is overwritten by the merge." >&2
-		exit 1
-	fi
-	git update-index --add --cacheinfo "$7" "$3" "$4" &&
-		exec git checkout-index -u -f -- "$4"
-	;;
-
-#
-# Added in both, identically (check for same permissions).
-#
-".$3$2")
-	if test "$6" != "$7"
-	then
-		echo "ERROR: File $4 added identically in both branches," >&2
-		echo "ERROR: but permissions conflict $6->$7." >&2
-		exit 1
-	fi
-	echo "Adding $4"
-	git update-index --add --cacheinfo "$6" "$2" "$4" &&
-		exec git checkout-index -u -f -- "$4"
-	;;
-
-#
-# Modified in both, but differently.
-#
-"$1$2$3" | ".$2$3")
-
-	case ",$6,$7," in
-	*,120000,*)
-		echo "ERROR: $4: Not merging symbolic link changes." >&2
-		exit 1
-		;;
-	*,160000,*)
-		echo "ERROR: $4: Not merging conflicting submodule changes." >&2
-		exit 1
-		;;
-	esac
-
-	src1=$(git unpack-file $2)
-	src2=$(git unpack-file $3)
-	case "$1" in
-	'')
-		echo "Added $4 in both, but differently."
-		orig=$(git unpack-file $(git hash-object /dev/null))
-		;;
-	*)
-		echo "Auto-merging $4"
-		orig=$(git unpack-file $1)
-		;;
-	esac
-
-	git merge-file "$src1" "$orig" "$src2"
-	ret=$?
-	msg=
-	if test $ret != 0 || test -z "$1"
-	then
-		msg='content conflict'
-		ret=1
-	fi
-
-	# Create the working tree file, using "our tree" version from the
-	# index, and then store the result of the merge.
-	git checkout-index -f --stage=2 -- "$4" && cat "$src1" >"$4" || exit 1
-	rm -f -- "$orig" "$src1" "$src2"
-
-	if test "$6" != "$7"
-	then
-		if test -n "$msg"
-		then
-			msg="$msg, "
-		fi
-		msg="${msg}permissions conflict: $5->$6,$7"
-		ret=1
-	fi
-
-	if test $ret != 0
-	then
-		echo "ERROR: $msg in $4" >&2
-		exit 1
-	fi
-	exec git update-index -- "$4"
-	;;
-
-*)
-	echo "ERROR: $4: Not handling case $1 -> $2 -> $3" >&2
-	;;
-esac
-exit 1
diff --git a/git.c b/git.c
index 9bc077a025..95eb74efe1 100644
--- a/git.c
+++ b/git.c
@@ -544,6 +544,7 @@ static struct cmd_struct commands[] = {
 	{ "merge-file", cmd_merge_file, RUN_SETUP_GENTLY },
 	{ "merge-index", cmd_merge_index, RUN_SETUP | NO_PARSEOPT },
 	{ "merge-ours", cmd_merge_ours, RUN_SETUP | NO_PARSEOPT },
+	{ "merge-one-file", cmd_merge_one_file, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
 	{ "merge-recursive", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
 	{ "merge-recursive-ours", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
 	{ "merge-recursive-theirs", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
diff --git a/merge-strategies.c b/merge-strategies.c
index c80f964612..2717af51fd 100644
--- a/merge-strategies.c
+++ b/merge-strategies.c
@@ -1,5 +1,197 @@
 #include "cache.h"
+#include "dir.h"
 #include "merge-strategies.h"
+#include "xdiff-interface.h"
+
+static int add_merge_result_to_index(struct index_state *istate, unsigned int mode,
+				     const struct object_id *oid, const char *path,
+				     int checkout)
+{
+	struct cache_entry *ce;
+	int res;
+
+	res = add_to_index_cacheinfo(istate, mode, oid, path, 0, 1, 1, &ce);
+	if (res == -1)
+		return error(_("Invalid path '%s'"), path);
+	else if (res == -2)
+		return -1;
+
+	if (checkout) {
+		struct checkout state = CHECKOUT_INIT;
+
+		state.istate = istate;
+		state.force = 1;
+		state.base_dir = "";
+		state.base_dir_len = 0;
+
+		if (checkout_entry(ce, &state, NULL, NULL) < 0)
+			return error(_("%s: cannot checkout file"), path);
+	}
+
+	return 0;
+}
+
+static int merge_one_file_deleted(struct index_state *istate,
+				  const struct object_id *our_blob,
+				  const struct object_id *their_blob, const char *path,
+				  unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode)
+{
+	if ((!our_blob && orig_mode != their_mode) ||
+	    (!their_blob && orig_mode != our_mode))
+		return error(_("File %s deleted on one branch but had its "
+			       "permissions changed on the other."), path);
+
+	if (our_blob) {
+		printf(_("Removing %s\n"), path);
+
+		if (file_exists(path))
+			remove_path(path);
+	}
+
+	if (remove_file_from_index(istate, path))
+		return error("%s: cannot remove from the index", path);
+	return 0;
+}
+
+static int do_merge_one_file(struct index_state *istate,
+			     const struct object_id *orig_blob,
+			     const struct object_id *our_blob,
+			     const struct object_id *their_blob, const char *path,
+			     unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode)
+{
+	int ret, i, dest;
+	ssize_t written;
+	mmbuffer_t result = {NULL, 0};
+	mmfile_t mmfs[3];
+	xmparam_t xmp = {{0}};
+
+	if (our_mode == S_IFLNK || their_mode == S_IFLNK)
+		return error(_("%s: Not merging symbolic link changes."), path);
+	else if (our_mode == S_IFGITLINK || their_mode == S_IFGITLINK)
+		return error(_("%s: Not merging conflicting submodule changes."), path);
+
+	if (orig_blob) {
+		printf(_("Auto-merging %s\n"), path);
+		read_mmblob(mmfs + 0, orig_blob);
+	} else {
+		printf(_("Added %s in both, but differently.\n"), path);
+		read_mmblob(mmfs + 0, &null_oid);
+	}
+
+	read_mmblob(mmfs + 1, our_blob);
+	read_mmblob(mmfs + 2, their_blob);
+
+	xmp.level = XDL_MERGE_ZEALOUS_ALNUM;
+	xmp.style = 0;
+	xmp.favor = 0;
+
+	ret = xdl_merge(mmfs + 0, mmfs + 1, mmfs + 2, &xmp, &result);
+
+	for (i = 0; i < 3; i++)
+		free(mmfs[i].ptr);
+
+	if (ret < 0) {
+		free(result.ptr);
+		return error(_("Failed to execute internal merge"));
+	}
+
+	if (ret > 0 || !orig_blob)
+		ret = error(_("content conflict in %s"), path);
+	if (our_mode != their_mode)
+		ret = error(_("permission conflict: %o->%o,%o in %s"),
+			    orig_mode, our_mode, their_mode, path);
+
+	unlink(path);
+	if ((dest = open(path, O_WRONLY | O_CREAT, our_mode)) < 0) {
+		free(result.ptr);
+		return error_errno(_("failed to open file '%s'"), path);
+	}
+
+	written = write_in_full(dest, result.ptr, result.size);
+	close(dest);
+
+	free(result.ptr);
+
+	if (written < 0)
+		return error_errno(_("failed to write to '%s'"), path);
+	if (ret)
+		return ret;
+
+	return add_file_to_index(istate, path, 0);
+}
+
+int merge_three_way(struct index_state *istate,
+		    const struct object_id *orig_blob,
+		    const struct object_id *our_blob,
+		    const struct object_id *their_blob, const char *path,
+		    unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode)
+{
+	if (orig_blob &&
+	    ((!our_blob && !their_blob) ||
+	     (!their_blob && our_blob && oideq(orig_blob, our_blob)) ||
+	     (!our_blob && their_blob && oideq(orig_blob, their_blob)))) {
+		/* Deleted in both or deleted in one and unchanged in the other. */
+		return merge_one_file_deleted(istate, our_blob, their_blob, path,
+					      orig_mode, our_mode, their_mode);
+	} else if (!orig_blob && our_blob && !their_blob) {
+		/*
+		 * Added in ours.  The other side did not add and we
+		 * added so there is nothing to be done, except making
+		 * the path merged.
+		 */
+		return add_merge_result_to_index(istate, our_mode, our_blob, path, 0);
+	} else if (!orig_blob && !our_blob && their_blob) {
+		printf(_("Adding %s\n"), path);
+
+		if (file_exists(path))
+			return error(_("untracked %s is overwritten by the merge."), path);
+
+		return add_merge_result_to_index(istate, their_mode, their_blob, path, 1);
+	} else if (!orig_blob && our_blob && their_blob &&
+		   oideq(our_blob, their_blob)) {
+		/* Added in both, identically (check for same permissions). */
+		if (our_mode != their_mode)
+			return error(_("File %s added identically in both branches, "
+				       "but permissions conflict %o->%o."),
+				     path, our_mode, their_mode);
+
+		printf(_("Adding %s\n"), path);
+
+		return add_merge_result_to_index(istate, our_mode, our_blob, path, 1);
+	} else if (our_blob && their_blob) {
+		/* Modified in both, but differently. */
+		return do_merge_one_file(istate,
+					 orig_blob, our_blob, their_blob, path,
+					 orig_mode, our_mode, their_mode);
+	} else {
+		char orig_hex[GIT_MAX_HEXSZ] = {0}, our_hex[GIT_MAX_HEXSZ] = {0},
+			their_hex[GIT_MAX_HEXSZ] = {0};
+
+		if (orig_blob)
+			oid_to_hex_r(orig_hex, orig_blob);
+		if (our_blob)
+			oid_to_hex_r(our_hex, our_blob);
+		if (their_blob)
+			oid_to_hex_r(their_hex, their_blob);
+
+		return error(_("%s: Not handling case %s -> %s -> %s"),
+			     path, orig_hex, our_hex, their_hex);
+	}
+
+	return 0;
+}
+
+int merge_one_file_func(struct index_state *istate,
+			const struct object_id *orig_blob,
+			const struct object_id *our_blob,
+			const struct object_id *their_blob, const char *path,
+			unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
+			void *data)
+{
+	return merge_three_way(istate,
+			       orig_blob, our_blob, their_blob, path,
+			       orig_mode, our_mode, their_mode);
+}
 
 static int merge_entry(struct index_state *istate, int quiet, unsigned int pos,
 		       const char *path, int *err, merge_fn fn, void *data)
@@ -54,17 +246,24 @@ int merge_all_index(struct index_state *istate, int oneshot, int quiet,
 		    merge_fn fn, void *data)
 {
 	int err = 0, ret;
-	unsigned int i;
+	unsigned int i, prev_nr;
 
 	for (i = 0; i < istate->cache_nr; i++) {
 		const struct cache_entry *ce = istate->cache[i];
 		if (!ce_stage(ce))
 			continue;
 
+		prev_nr = istate->cache_nr;
 		ret = merge_entry(istate, quiet || oneshot, i, ce->name, &err, fn, data);
-		if (ret > 0)
-			i += ret - 1;
-		else if (ret == -1)
+		if (ret > 0) {
+			/*
+			 * Don't bother handling an index that has
+			 * grown, since merge_one_file_func() can't grow
+			 * it, and merge_one_file_spawn() can't change
+			 * it.
+			 */
+			i += ret - (prev_nr - istate->cache_nr) - 1;
+		} else if (ret == -1)
 			return -1;
 
 		if (err && !oneshot)
diff --git a/merge-strategies.h b/merge-strategies.h
index 88f476f170..8705a550ca 100644
--- a/merge-strategies.h
+++ b/merge-strategies.h
@@ -3,6 +3,12 @@
 
 #include "object.h"
 
+int merge_three_way(struct index_state *istate,
+		    const struct object_id *orig_blob,
+		    const struct object_id *our_blob,
+		    const struct object_id *their_blob, const char *path,
+		    unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode);
+
 typedef int (*merge_fn)(struct index_state *istate,
 			const struct object_id *orig_blob,
 			const struct object_id *our_blob,
@@ -10,6 +16,13 @@ typedef int (*merge_fn)(struct index_state *istate,
 			unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
 			void *data);
 
+int merge_one_file_func(struct index_state *istate,
+			const struct object_id *orig_blob,
+			const struct object_id *our_blob,
+			const struct object_id *their_blob, const char *path,
+			unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
+			void *data);
+
 int merge_index_path(struct index_state *istate, int oneshot, int quiet,
 		     const char *path, merge_fn fn, void *data);
 int merge_all_index(struct index_state *istate, int oneshot, int quiet,
diff --git a/t/t6060-merge-index.sh b/t/t6060-merge-index.sh
index d0cdfeddc1..d9c07965dc 100755
--- a/t/t6060-merge-index.sh
+++ b/t/t6060-merge-index.sh
@@ -72,7 +72,7 @@ test_expect_success 'merge-one-file fails without a work tree' '
 	(cd bare.git &&
 	 GIT_INDEX_FILE=$PWD/merge.index &&
 	 export GIT_INDEX_FILE &&
-	 test_must_fail git merge-index git-merge-one-file -a
+	 test_must_fail git merge-index --use=merge-one-file -a
 	)
 '
 
diff --git a/t/t6415-merge-dir-to-symlink.sh b/t/t6415-merge-dir-to-symlink.sh
index 2ce104aca7..075da1f55f 100755
--- a/t/t6415-merge-dir-to-symlink.sh
+++ b/t/t6415-merge-dir-to-symlink.sh
@@ -97,7 +97,7 @@ test_expect_success SYMLINKS 'a/b was resolved as symlink' '
 	test -h a/b
 '
 
-test_expect_failure 'do not lose untracked in merge (resolve)' '
+test_expect_success 'do not lose untracked in merge (resolve)' '
 	git reset --hard &&
 	git checkout baseline^0 &&
 	>a/b/c/e &&
-- 
2.31.0



* [PATCH v7 09/15] merge-resolve: rewrite in C
  2021-03-17 20:49           ` [PATCH v7 00/15] " Alban Gruin
                               ` (7 preceding siblings ...)
  2021-03-17 20:49             ` [PATCH v7 08/15] merge-one-file: rewrite in C Alban Gruin
@ 2021-03-17 20:49             ` Alban Gruin
  2021-03-23 22:21               ` Johannes Schindelin
  2021-03-17 20:49             ` [PATCH v7 10/15] merge-recursive: move better_branch_name() to merge.c Alban Gruin
                               ` (6 subsequent siblings)
  15 siblings, 1 reply; 221+ messages in thread
From: Alban Gruin @ 2021-03-17 20:49 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Phillip Wood, Derrick Stolee, Alban Gruin

This rewrites `git merge-resolve' from shell to C.  As with `git
merge-one-file', this port is not completely straightforward and removes
calls to external processes to avoid reading and writing the index over
and over again.

 - The call to `update-index -q --refresh' is replaced by a call to
   refresh_index().

 - The call to `read-tree' is replaced by a call to unpack_trees() (and
   all the setup needed).

 - The call to `write-tree' is replaced by a call to
   write_index_as_tree().

 - The call to `merge-index', needed to invoke `git merge-one-file', is
   replaced by a call to the new merge_all_index() function.

The index is read in cmd_merge_resolve(), and is written back by
merge_strategies_resolve().

The parameters of merge_strategies_resolve() may be surprising at first
glance: why use a commit list for `bases' and `remote', where we could
use an oid array, and a pointer to an oid?  Because, in a later commit,
try_merge_strategy() will be able to call merge_strategies_resolve()
directly, and it already uses a commit list for `bases' (`common') and
`remote' (`remoteheads'), and a string for `head_arg'.  To reduce
friction later, merge_strategies_resolve() takes the same types of
parameters.

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 Makefile                |  2 +-
 builtin.h               |  1 +
 builtin/merge-resolve.c | 74 ++++++++++++++++++++++++++++++++
 git-merge-resolve.sh    | 54 -----------------------
 git.c                   |  1 +
 merge-strategies.c      | 95 +++++++++++++++++++++++++++++++++++++++++
 merge-strategies.h      |  5 +++
 7 files changed, 177 insertions(+), 55 deletions(-)
 create mode 100644 builtin/merge-resolve.c
 delete mode 100755 git-merge-resolve.sh

diff --git a/Makefile b/Makefile
index e2e4389f76..8fccc38006 100644
--- a/Makefile
+++ b/Makefile
@@ -600,7 +600,6 @@ SCRIPT_SH += git-bisect.sh
 SCRIPT_SH += git-difftool--helper.sh
 SCRIPT_SH += git-filter-branch.sh
 SCRIPT_SH += git-merge-octopus.sh
-SCRIPT_SH += git-merge-resolve.sh
 SCRIPT_SH += git-mergetool.sh
 SCRIPT_SH += git-quiltimport.sh
 SCRIPT_SH += git-request-pull.sh
@@ -1102,6 +1101,7 @@ BUILTIN_OBJS += builtin/merge-index.o
 BUILTIN_OBJS += builtin/merge-one-file.o
 BUILTIN_OBJS += builtin/merge-ours.o
 BUILTIN_OBJS += builtin/merge-recursive.o
+BUILTIN_OBJS += builtin/merge-resolve.o
 BUILTIN_OBJS += builtin/merge-tree.o
 BUILTIN_OBJS += builtin/merge.o
 BUILTIN_OBJS += builtin/mktag.o
diff --git a/builtin.h b/builtin.h
index 227c133036..c3029cef46 100644
--- a/builtin.h
+++ b/builtin.h
@@ -181,6 +181,7 @@ int cmd_merge_ours(int argc, const char **argv, const char *prefix);
 int cmd_merge_file(int argc, const char **argv, const char *prefix);
 int cmd_merge_one_file(int argc, const char **argv, const char *prefix);
 int cmd_merge_recursive(int argc, const char **argv, const char *prefix);
+int cmd_merge_resolve(int argc, const char **argv, const char *prefix);
 int cmd_merge_tree(int argc, const char **argv, const char *prefix);
 int cmd_mktag(int argc, const char **argv, const char *prefix);
 int cmd_mktree(int argc, const char **argv, const char *prefix);
diff --git a/builtin/merge-resolve.c b/builtin/merge-resolve.c
new file mode 100644
index 0000000000..0f2e487c4d
--- /dev/null
+++ b/builtin/merge-resolve.c
@@ -0,0 +1,74 @@
+/*
+ * Builtin "git merge-resolve"
+ *
+ * Copyright (c) 2020 Alban Gruin
+ *
+ * Based on git-merge-resolve.sh, written by Linus Torvalds and Junio C
+ * Hamano.
+ *
+ * Resolve two trees, using enhanced multi-base read-tree.
+ */
+
+#include "cache.h"
+#include "builtin.h"
+#include "merge-strategies.h"
+
+static const char builtin_merge_resolve_usage[] =
+	"git merge-resolve <bases>... -- <head> <remote>";
+
+int cmd_merge_resolve(int argc, const char **argv, const char *prefix)
+{
+	int i, sep_seen = 0;
+	const char *head = NULL;
+	struct commit_list *bases = NULL, *remote = NULL;
+	struct commit_list **next_base = &bases;
+	struct repository *r = the_repository;
+
+	if (argc < 5)
+		usage(builtin_merge_resolve_usage);
+
+	setup_work_tree();
+	if (repo_read_index(r) < 0)
+		die("invalid index");
+
+	/*
+	 * The first parameters up to -- are merge bases; the rest are
+	 * heads.
+	 */
+	for (i = 1; i < argc; i++) {
+		if (!strcmp(argv[i], "--"))
+			sep_seen = 1;
+		else if (!strcmp(argv[i], "-h"))
+			usage(builtin_merge_resolve_usage);
+		else if (sep_seen && !head)
+			head = argv[i];
+		else {
+			struct object_id oid;
+			struct commit *commit;
+
+			if (get_oid(argv[i], &oid))
+				die("object %s not found.", argv[i]);
+
+			commit = oideq(&oid, r->hash_algo->empty_tree) ?
+				NULL : lookup_commit_or_die(&oid, argv[i]);
+
+			if (sep_seen)
+				commit_list_insert(commit, &remote);
+			else
+				next_base = commit_list_append(commit, next_base);
+		}
+	}
+
+	/*
+	 * Give up if we are given two or more remotes.  Not handling
+	 * octopus.
+	 */
+	if (remote && remote->next)
+		return 2;
+
+	/* Give up if this is a baseless merge. */
+	if (!bases)
+		return 2;
+
+	return merge_strategies_resolve(r, bases, head, remote);
+}
diff --git a/git-merge-resolve.sh b/git-merge-resolve.sh
deleted file mode 100755
index 0b4fc88b61..0000000000
--- a/git-merge-resolve.sh
+++ /dev/null
@@ -1,54 +0,0 @@
-#!/bin/sh
-#
-# Copyright (c) 2005 Linus Torvalds
-# Copyright (c) 2005 Junio C Hamano
-#
-# Resolve two trees, using enhanced multi-base read-tree.
-
-# The first parameters up to -- are merge bases; the rest are heads.
-bases= head= remotes= sep_seen=
-for arg
-do
-	case ",$sep_seen,$head,$arg," in
-	*,--,)
-		sep_seen=yes
-		;;
-	,yes,,*)
-		head=$arg
-		;;
-	,yes,*)
-		remotes="$remotes$arg "
-		;;
-	*)
-		bases="$bases$arg "
-		;;
-	esac
-done
-
-# Give up if we are given two or more remotes -- not handling octopus.
-case "$remotes" in
-?*' '?*)
-	exit 2 ;;
-esac
-
-# Give up if this is a baseless merge.
-if test '' = "$bases"
-then
-	exit 2
-fi
-
-git update-index -q --refresh
-git read-tree -u -m --aggressive $bases $head $remotes || exit 2
-echo "Trying simple merge."
-if result_tree=$(git write-tree 2>/dev/null)
-then
-	exit 0
-else
-	echo "Simple merge failed, trying Automatic merge."
-	if git merge-index -o --use=merge-one-file -a
-	then
-		exit 0
-	else
-		exit 1
-	fi
-fi
diff --git a/git.c b/git.c
index 95eb74efe1..ce1f237369 100644
--- a/git.c
+++ b/git.c
@@ -548,6 +548,7 @@ static struct cmd_struct commands[] = {
 	{ "merge-recursive", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
 	{ "merge-recursive-ours", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
 	{ "merge-recursive-theirs", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
+	{ "merge-resolve", cmd_merge_resolve, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
 	{ "merge-subtree", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
 	{ "merge-tree", cmd_merge_tree, RUN_SETUP | NO_PARSEOPT },
 	{ "mktag", cmd_mktag, RUN_SETUP | NO_PARSEOPT },
diff --git a/merge-strategies.c b/merge-strategies.c
index 2717af51fd..a51700dae5 100644
--- a/merge-strategies.c
+++ b/merge-strategies.c
@@ -1,6 +1,9 @@
 #include "cache.h"
+#include "cache-tree.h"
 #include "dir.h"
+#include "lockfile.h"
 #include "merge-strategies.h"
+#include "unpack-trees.h"
 #include "xdiff-interface.h"
 
 static int add_merge_result_to_index(struct index_state *istate, unsigned int mode,
@@ -272,3 +275,95 @@ int merge_all_index(struct index_state *istate, int oneshot, int quiet,
 
 	return err;
 }
+
+static int fast_forward(struct repository *r, struct tree_desc *t,
+			int nr, int aggressive)
+{
+	struct unpack_trees_options opts;
+	struct lock_file lock = LOCK_INIT;
+
+	refresh_index(r->index, REFRESH_QUIET, NULL, NULL, NULL);
+	repo_hold_locked_index(r, &lock, LOCK_DIE_ON_ERROR);
+
+	memset(&opts, 0, sizeof(opts));
+	opts.head_idx = 1;
+	opts.src_index = r->index;
+	opts.dst_index = r->index;
+	opts.merge = 1;
+	opts.update = 1;
+	opts.aggressive = aggressive;
+
+	if (nr == 1)
+		opts.fn = oneway_merge;
+	else if (nr == 2) {
+		opts.fn = twoway_merge;
+		opts.initial_checkout = is_index_unborn(r->index);
+	} else if (nr >= 3) {
+		opts.fn = threeway_merge;
+		opts.head_idx = nr - 1;
+	}
+
+	if (unpack_trees(nr, t, &opts))
+		return -1;
+
+	if (write_locked_index(r->index, &lock, COMMIT_LOCK))
+		return error(_("unable to write new index file"));
+
+	return 0;
+}
+
+static int add_tree(struct tree *tree, struct tree_desc *t)
+{
+	if (parse_tree(tree))
+		return -1;
+
+	init_tree_desc(t, tree->buffer, tree->size);
+	return 0;
+}
+
+int merge_strategies_resolve(struct repository *r,
+			     struct commit_list *bases, const char *head_arg,
+			     struct commit_list *remote)
+{
+	struct tree_desc t[MAX_UNPACK_TREES];
+	struct object_id head, oid;
+	struct commit_list *i;
+	int nr = 0;
+
+	if (head_arg)
+		get_oid(head_arg, &head);
+
+	puts(_("Trying simple merge."));
+
+	for (i = bases; i && i->item; i = i->next) {
+		if (add_tree(repo_get_commit_tree(r, i->item), t + (nr++)))
+			return 2;
+	}
+
+	if (head_arg) {
+		struct tree *tree = parse_tree_indirect(&head);
+		if (add_tree(tree, t + (nr++)))
+			return 2;
+	}
+
+	if (remote && add_tree(repo_get_commit_tree(r, remote->item), t + (nr++)))
+		return 2;
+
+	if (fast_forward(r, t, nr, 1))
+		return 2;
+
+	if (write_index_as_tree(&oid, r->index, r->index_file,
+				WRITE_TREE_SILENT, NULL)) {
+		int ret;
+		struct lock_file lock = LOCK_INIT;
+
+		puts(_("Simple merge failed, trying Automatic merge."));
+		repo_hold_locked_index(r, &lock, LOCK_DIE_ON_ERROR);
+		ret = merge_all_index(r->index, 1, 0, merge_one_file_func, NULL);
+
+		write_locked_index(r->index, &lock, COMMIT_LOCK);
+		return !!ret;
+	}
+
+	return 0;
+}
diff --git a/merge-strategies.h b/merge-strategies.h
index 8705a550ca..bba4bf999c 100644
--- a/merge-strategies.h
+++ b/merge-strategies.h
@@ -1,6 +1,7 @@
 #ifndef MERGE_STRATEGIES_H
 #define MERGE_STRATEGIES_H
 
+#include "commit.h"
 #include "object.h"
 
 int merge_three_way(struct index_state *istate,
@@ -28,4 +29,8 @@ int merge_index_path(struct index_state *istate, int oneshot, int quiet,
 int merge_all_index(struct index_state *istate, int oneshot, int quiet,
 		    merge_fn fn, void *data);
 
+int merge_strategies_resolve(struct repository *r,
+			     struct commit_list *bases, const char *head_arg,
+			     struct commit_list *remote);
+
 #endif /* MERGE_STRATEGIES_H */
-- 
2.31.0


^ permalink raw reply related	[flat|nested] 221+ messages in thread

* [PATCH v7 10/15] merge-recursive: move better_branch_name() to merge.c
  2021-03-17 20:49           ` [PATCH v7 00/15] " Alban Gruin
                               ` (8 preceding siblings ...)
  2021-03-17 20:49             ` [PATCH v7 09/15] merge-resolve: " Alban Gruin
@ 2021-03-17 20:49             ` Alban Gruin
  2021-03-17 20:49             ` [PATCH v7 11/15] merge-octopus: rewrite in C Alban Gruin
                               ` (5 subsequent siblings)
  15 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2021-03-17 20:49 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Phillip Wood, Derrick Stolee, Alban Gruin

better_branch_name() will be used by merge-octopus once it is rewritten
in C, so instead of duplicating it, this moves the function ahead of
time into an appropriate file in libgit.a.  The function is also
renamed to merge_get_better_branch_name() to reflect its usage by merge
strategies.
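
For readers unfamiliar with the helper, its lookup can be sketched
without any git internals.  The version below is an approximation under
stated assumptions: it hard-codes a 40-digit hex length (SHA-1) where
the real function asks the_hash_algo, and `copy_string()',
`better_branch_name_sketch()' and `SKETCH_HEXSZ' are hypothetical names
standing in for xstrdup(), the moved function and the hash size.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Assumption for this sketch: 40 hex digits, i.e. a SHA-1 object name. */
#define SKETCH_HEXSZ 40

/* Hypothetical stand-in for xstrdup(); the caller frees the result. */
static char *copy_string(const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);

	if (copy)
		memcpy(copy, s, len);
	return copy;
}

/*
 * If `branch' looks like a full object name, prefer the human-readable
 * name stored in the GITHEAD_<hex> environment variable, if any.
 * Otherwise return a copy of `branch' itself.
 */
static char *better_branch_name_sketch(const char *branch)
{
	char env_name[8 + SKETCH_HEXSZ + 1]; /* "GITHEAD_" + hex + NUL */
	const char *name;

	if (strlen(branch) != SKETCH_HEXSZ)
		return copy_string(branch);

	snprintf(env_name, sizeof(env_name), "GITHEAD_%s", branch);
	name = getenv(env_name);
	return copy_string(name ? name : branch);
}
```

A name that is not exactly as long as a hex object name is returned
unchanged; a full hex name is returned unchanged only when no matching
GITHEAD_<hex> variable is set in the environment.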

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 builtin/merge-recursive.c | 16 ++--------------
 cache.h                   |  2 +-
 merge.c                   | 12 ++++++++++++
 3 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/builtin/merge-recursive.c b/builtin/merge-recursive.c
index a4bfd8fc51..972243b5e9 100644
--- a/builtin/merge-recursive.c
+++ b/builtin/merge-recursive.c
@@ -8,18 +8,6 @@
 static const char builtin_merge_recursive_usage[] =
 	"git %s <base>... -- <head> <remote> ...";
 
-static char *better_branch_name(const char *branch)
-{
-	static char githead_env[8 + GIT_MAX_HEXSZ + 1];
-	char *name;
-
-	if (strlen(branch) != the_hash_algo->hexsz)
-		return xstrdup(branch);
-	xsnprintf(githead_env, sizeof(githead_env), "GITHEAD_%s", branch);
-	name = getenv(githead_env);
-	return xstrdup(name ? name : branch);
-}
-
 int cmd_merge_recursive(int argc, const char **argv, const char *prefix)
 {
 	const struct object_id *bases[21];
@@ -75,8 +63,8 @@ int cmd_merge_recursive(int argc, const char **argv, const char *prefix)
 	if (get_oid(o.branch2, &h2))
 		die(_("could not resolve ref '%s'"), o.branch2);
 
-	o.branch1 = better1 = better_branch_name(o.branch1);
-	o.branch2 = better2 = better_branch_name(o.branch2);
+	o.branch1 = better1 = merge_get_better_branch_name(o.branch1);
+	o.branch2 = better2 = merge_get_better_branch_name(o.branch2);
 
 	if (o.verbosity >= 3)
 		printf(_("Merging %s with %s\n"), o.branch1, o.branch2);
diff --git a/cache.h b/cache.h
index 41e30c0da2..e89a8c3404 100644
--- a/cache.h
+++ b/cache.h
@@ -1852,7 +1852,7 @@ int checkout_fast_forward(struct repository *r,
 			  const struct object_id *from,
 			  const struct object_id *to,
 			  int overwrite_ignore);
-
+char *merge_get_better_branch_name(const char *branch);
 
 int sane_execvp(const char *file, char *const argv[]);
 
diff --git a/merge.c b/merge.c
index 5fb88af102..801d673c5f 100644
--- a/merge.c
+++ b/merge.c
@@ -109,3 +109,15 @@ int checkout_fast_forward(struct repository *r,
 		return error(_("unable to write new index file"));
 	return 0;
 }
+
+char *merge_get_better_branch_name(const char *branch)
+{
+	static char githead_env[8 + GIT_MAX_HEXSZ + 1];
+	char *name;
+
+	if (strlen(branch) != the_hash_algo->hexsz)
+		return xstrdup(branch);
+	xsnprintf(githead_env, sizeof(githead_env), "GITHEAD_%s", branch);
+	name = getenv(githead_env);
+	return xstrdup(name ? name : branch);
+}
-- 
2.31.0


^ permalink raw reply related	[flat|nested] 221+ messages in thread

* [PATCH v7 11/15] merge-octopus: rewrite in C
  2021-03-17 20:49           ` [PATCH v7 00/15] " Alban Gruin
                               ` (9 preceding siblings ...)
  2021-03-17 20:49             ` [PATCH v7 10/15] merge-recursive: move better_branch_name() to merge.c Alban Gruin
@ 2021-03-17 20:49             ` Alban Gruin
  2021-03-23 23:58               ` Johannes Schindelin
  2021-03-17 20:49             ` [PATCH v7 12/15] merge: use the "resolve" strategy without forking Alban Gruin
                               ` (4 subsequent siblings)
  15 siblings, 1 reply; 221+ messages in thread
From: Alban Gruin @ 2021-03-17 20:49 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Phillip Wood, Derrick Stolee, Alban Gruin

This rewrites `git merge-octopus' from shell to C.  As with the previous
two conversions, this port removes calls to external processes to avoid
reading and writing the index over and over again.

 - Calls to `read-tree -u -m (--aggressive)?' are replaced by calls to
   unpack_trees().

 - The call to `write-tree' is replaced by a call to
   write_index_as_tree().

 - The call to `diff-index ...' is replaced by a call to
   repo_index_has_changes().

 - The call to `merge-index', needed to invoke `git merge-one-file', is
   replaced by a call to merge_all_index().

The index is read in cmd_merge_octopus(), and is written back by
merge_strategies_octopus().

Here too, merge_strategies_octopus() takes two commit lists and a string
to reduce friction when try_merge_strategy() is later modified to call
it directly.
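
One rule carried over from the shell script is that only the last head
may be left with hand-resolvable conflicts.  The sketch below models
just that control flow, assuming each head has already produced an
outcome; `octopus_result()' and the MERGE_* constants are hypothetical
names, and no actual merging happens here.

```c
#include <stddef.h>

/* Hypothetical per-head outcomes, mirroring the strategy's exit codes. */
enum {
	MERGE_CLEAN = 0,     /* merged without conflicts */
	MERGE_CONFLICTS = 1, /* conflicts left for the user to resolve */
	MERGE_GIVE_UP = 2    /* the strategy cannot handle this merge */
};

/*
 * Fold the outcomes of merging each head, in order, into the overall
 * result: conflicts are tolerated only on the last head, and a
 * "give up" outcome stops the loop immediately.
 */
static int octopus_result(const int *outcomes, size_t nr)
{
	size_t i;
	int ret = MERGE_CLEAN;

	for (i = 0; i < nr; i++) {
		ret = outcomes[i];

		if (ret == MERGE_GIVE_UP)
			break;

		/* Conflicts, and there is still a head left to merge. */
		if (ret == MERGE_CONFLICTS && i + 1 < nr)
			return MERGE_GIVE_UP;
	}

	return ret;
}
```

Under this model, the outcomes { clean, clean, conflicts } yield 1, so
the user can resolve the last head by hand, whereas { clean, conflicts,
clean } yields 2 because a head remained after the failed round.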

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 Makefile                |   2 +-
 builtin.h               |   1 +
 builtin/merge-octopus.c |  70 ++++++++++++++++
 git-merge-octopus.sh    | 112 -------------------------
 git.c                   |   1 +
 merge-strategies.c      | 175 ++++++++++++++++++++++++++++++++++++++++
 merge-strategies.h      |   3 +
 7 files changed, 251 insertions(+), 113 deletions(-)
 create mode 100644 builtin/merge-octopus.c
 delete mode 100755 git-merge-octopus.sh

diff --git a/Makefile b/Makefile
index 8fccc38006..fa8f1a2ddf 100644
--- a/Makefile
+++ b/Makefile
@@ -599,7 +599,6 @@ unexport CDPATH
 SCRIPT_SH += git-bisect.sh
 SCRIPT_SH += git-difftool--helper.sh
 SCRIPT_SH += git-filter-branch.sh
-SCRIPT_SH += git-merge-octopus.sh
 SCRIPT_SH += git-mergetool.sh
 SCRIPT_SH += git-quiltimport.sh
 SCRIPT_SH += git-request-pull.sh
@@ -1098,6 +1097,7 @@ BUILTIN_OBJS += builtin/mailsplit.o
 BUILTIN_OBJS += builtin/merge-base.o
 BUILTIN_OBJS += builtin/merge-file.o
 BUILTIN_OBJS += builtin/merge-index.o
+BUILTIN_OBJS += builtin/merge-octopus.o
 BUILTIN_OBJS += builtin/merge-one-file.o
 BUILTIN_OBJS += builtin/merge-ours.o
 BUILTIN_OBJS += builtin/merge-recursive.o
diff --git a/builtin.h b/builtin.h
index c3029cef46..ef2b65c9d0 100644
--- a/builtin.h
+++ b/builtin.h
@@ -177,6 +177,7 @@ int cmd_maintenance(int argc, const char **argv, const char *prefix);
 int cmd_merge(int argc, const char **argv, const char *prefix);
 int cmd_merge_base(int argc, const char **argv, const char *prefix);
 int cmd_merge_index(int argc, const char **argv, const char *prefix);
+int cmd_merge_octopus(int argc, const char **argv, const char *prefix);
 int cmd_merge_ours(int argc, const char **argv, const char *prefix);
 int cmd_merge_file(int argc, const char **argv, const char *prefix);
 int cmd_merge_one_file(int argc, const char **argv, const char *prefix);
diff --git a/builtin/merge-octopus.c b/builtin/merge-octopus.c
new file mode 100644
index 0000000000..9b9939b6b2
--- /dev/null
+++ b/builtin/merge-octopus.c
@@ -0,0 +1,70 @@
+/*
+ * Builtin "git merge-octopus"
+ *
+ * Copyright (c) 2020 Alban Gruin
+ *
+ * Based on git-merge-octopus.sh, written by Junio C Hamano.
+ *
+ * Resolve two or more trees.
+ */
+
+#include "cache.h"
+#include "builtin.h"
+#include "commit.h"
+#include "merge-strategies.h"
+
+static const char builtin_merge_octopus_usage[] =
+	"git merge-octopus [<bases>...] -- <head> <remote1> <remote2> [<remotes>...]";
+
+int cmd_merge_octopus(int argc, const char **argv, const char *prefix)
+{
+	int i, sep_seen = 0;
+	struct commit_list *bases = NULL, *remotes = NULL;
+	struct commit_list **next_base = &bases, **next_remote = &remotes;
+	const char *head_arg = NULL;
+	struct repository *r = the_repository;
+
+	if (argc < 5)
+		usage(builtin_merge_octopus_usage);
+
+	setup_work_tree();
+	if (repo_read_index(r) < 0)
+		die("invalid index");
+
+	/*
+	 * The first parameters up to -- are merge bases; the rest are
+	 * heads.
+	 */
+	for (i = 1; i < argc; i++) {
+		if (strcmp(argv[i], "--") == 0)
+			sep_seen = 1;
+		else if (strcmp(argv[i], "-h") == 0)
+			usage(builtin_merge_octopus_usage);
+		else if (sep_seen && !head_arg)
+			head_arg = argv[i];
+		else {
+			struct object_id oid;
+			struct commit *commit;
+
+			if (get_oid(argv[i], &oid))
+				die("object %s not found.", argv[i]);
+
+			commit = oideq(&oid, r->hash_algo->empty_tree) ?
+				NULL : lookup_commit_or_die(&oid, argv[i]);
+
+			if (sep_seen)
+				next_remote = commit_list_append(commit, next_remote);
+			else
+				next_base = commit_list_append(commit, next_base);
+		}
+	}
+
+	/*
+	 * Reject if this is not an octopus -- resolve should be used
+	 * instead.
+	 */
+	if (commit_list_count(remotes) < 2)
+		return 2;
+
+	return merge_strategies_octopus(r, bases, head_arg, remotes);
+}
diff --git a/git-merge-octopus.sh b/git-merge-octopus.sh
deleted file mode 100755
index 2770891960..0000000000
--- a/git-merge-octopus.sh
+++ /dev/null
@@ -1,112 +0,0 @@
-#!/bin/sh
-#
-# Copyright (c) 2005 Junio C Hamano
-#
-# Resolve two or more trees.
-#
-
-. git-sh-setup
-
-LF='
-'
-
-# The first parameters up to -- are merge bases; the rest are heads.
-bases= head= remotes= sep_seen=
-for arg
-do
-	case ",$sep_seen,$head,$arg," in
-	*,--,)
-		sep_seen=yes
-		;;
-	,yes,,*)
-		head=$arg
-		;;
-	,yes,*)
-		remotes="$remotes$arg "
-		;;
-	*)
-		bases="$bases$arg "
-		;;
-	esac
-done
-
-# Reject if this is not an octopus -- resolve should be used instead.
-case "$remotes" in
-?*' '?*)
-	;;
-*)
-	exit 2 ;;
-esac
-
-# MRC is the current "merge reference commit"
-# MRT is the current "merge result tree"
-
-if ! git diff-index --quiet --cached HEAD --
-then
-    gettextln "Error: Your local changes to the following files would be overwritten by merge"
-    git diff-index --cached --name-only HEAD -- | sed -e 's/^/    /'
-    exit 2
-fi
-MRC=$(git rev-parse --verify -q $head)
-MRT=$(git write-tree)
-NON_FF_MERGE=0
-OCTOPUS_FAILURE=0
-for SHA1 in $remotes
-do
-	case "$OCTOPUS_FAILURE" in
-	1)
-		# We allow only last one to have a hand-resolvable
-		# conflicts.  Last round failed and we still had
-		# a head to merge.
-		gettextln "Automated merge did not work."
-		gettextln "Should not be doing an octopus."
-		exit 2
-	esac
-
-	eval pretty_name=\${GITHEAD_$SHA1:-$SHA1}
-	if test "$SHA1" = "$pretty_name"
-	then
-		SHA1_UP="$(echo "$SHA1" | tr a-z A-Z)"
-		eval pretty_name=\${GITHEAD_$SHA1_UP:-$pretty_name}
-	fi
-	common=$(git merge-base --all $SHA1 $MRC) ||
-		die "$(eval_gettext "Unable to find common commit with \$pretty_name")"
-
-	case "$LF$common$LF" in
-	*"$LF$SHA1$LF"*)
-		eval_gettextln "Already up to date with \$pretty_name"
-		continue
-		;;
-	esac
-
-	if test "$common,$NON_FF_MERGE" = "$MRC,0"
-	then
-		# The first head being merged was a fast-forward.
-		# Advance MRC to the head being merged, and use that
-		# tree as the intermediate result of the merge.
-		# We still need to count this as part of the parent set.
-
-		eval_gettextln "Fast-forwarding to: \$pretty_name"
-		git read-tree -u -m $head $SHA1 || exit
-		MRC=$SHA1 MRT=$(git write-tree)
-		continue
-	fi
-
-	NON_FF_MERGE=1
-
-	eval_gettextln "Trying simple merge with \$pretty_name"
-	git read-tree -u -m --aggressive  $common $MRT $SHA1 || exit 2
-	next=$(git write-tree 2>/dev/null)
-	if test $? -ne 0
-	then
-		gettextln "Simple merge did not work, trying automatic merge."
-		git merge-index -o --use=merge-one-file -a ||
-		OCTOPUS_FAILURE=1
-		next=$(git write-tree 2>/dev/null)
-	fi
-
-	MRC="$MRC $SHA1"
-	MRT=$next
-done
-
-exit "$OCTOPUS_FAILURE"
diff --git a/git.c b/git.c
index ce1f237369..c47cc441a3 100644
--- a/git.c
+++ b/git.c
@@ -543,6 +543,7 @@ static struct cmd_struct commands[] = {
 	{ "merge-base", cmd_merge_base, RUN_SETUP },
 	{ "merge-file", cmd_merge_file, RUN_SETUP_GENTLY },
 	{ "merge-index", cmd_merge_index, RUN_SETUP | NO_PARSEOPT },
+	{ "merge-octopus", cmd_merge_octopus, RUN_SETUP | NO_PARSEOPT },
 	{ "merge-ours", cmd_merge_ours, RUN_SETUP | NO_PARSEOPT },
 	{ "merge-one-file", cmd_merge_one_file, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
 	{ "merge-recursive", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
diff --git a/merge-strategies.c b/merge-strategies.c
index a51700dae5..ebc0d0b1e2 100644
--- a/merge-strategies.c
+++ b/merge-strategies.c
@@ -1,5 +1,6 @@
 #include "cache.h"
 #include "cache-tree.h"
+#include "commit-reach.h"
 #include "dir.h"
 #include "lockfile.h"
 #include "merge-strategies.h"
@@ -367,3 +368,177 @@ int merge_strategies_resolve(struct repository *r,
 
 	return 0;
 }
+
+static int write_tree(struct repository *r, struct tree **reference_tree)
+{
+	struct object_id oid;
+	int ret;
+
+	if (!(ret = write_index_as_tree(&oid, r->index, r->index_file,
+					WRITE_TREE_SILENT, NULL)))
+		*reference_tree = lookup_tree(r, &oid);
+
+	return ret;
+}
+
+static int octopus_fast_forward(struct repository *r, const char *branch_name,
+				struct tree *tree_head, struct tree *current_tree,
+				struct tree **reference_tree)
+{
+	/*
+	 * The first head being merged was a fast-forward.  Advance the
+	 * reference commit to the head being merged, and use that tree
+	 * as the intermediate result of the merge.  We still need to
+	 * count this as part of the parent set.
+	 */
+	struct tree_desc t[2];
+
+	printf(_("Fast-forwarding to: %s\n"), branch_name);
+
+	init_tree_desc(t, tree_head->buffer, tree_head->size);
+	if (add_tree(current_tree, t + 1))
+		return -1;
+	if (fast_forward(r, t, 2, 0))
+		return -1;
+	if (write_tree(r, reference_tree))
+		return -1;
+
+	return 0;
+}
+
+static int octopus_do_merge(struct repository *r, const char *branch_name,
+			    struct commit_list *common, struct tree *current_tree,
+			    struct tree **reference_tree)
+{
+	struct tree_desc t[MAX_UNPACK_TREES];
+	struct commit_list *i;
+	int nr = 0, ret = 0;
+
+	printf(_("Trying simple merge with %s\n"), branch_name);
+
+	for (i = common; i; i = i->next) {
+		struct tree *tree = repo_get_commit_tree(r, i->item);
+		if (add_tree(tree, t + (nr++)))
+			return -1;
+	}
+
+	if (add_tree(*reference_tree, t + (nr++)))
+		return -1;
+	if (add_tree(current_tree, t + (nr++)))
+		return -1;
+	if (fast_forward(r, t, nr, 1))
+		return 2;
+
+	if (write_tree(r, reference_tree)) {
+		struct lock_file lock = LOCK_INIT;
+
+		puts(_("Simple merge did not work, trying automatic merge."));
+		repo_hold_locked_index(r, &lock, LOCK_DIE_ON_ERROR);
+		ret = !!merge_all_index(r->index, 0, 0, merge_one_file_func, NULL);
+		write_locked_index(r->index, &lock, COMMIT_LOCK);
+
+		write_tree(r, reference_tree);
+	}
+
+	return ret;
+}
+
+int merge_strategies_octopus(struct repository *r,
+			     struct commit_list *bases, const char *head_arg,
+			     struct commit_list *remotes)
+{
+	int ff_merge = 1, ret = 0, nr_references = 1;
+	struct commit **reference_commits, *head_commit;
+	struct tree *reference_tree, *head_tree;
+	struct commit_list *i;
+	struct object_id head;
+	struct strbuf sb = STRBUF_INIT;
+
+	get_oid(head_arg, &head);
+	head_commit = lookup_commit_reference(r, &head);
+	head_tree = repo_get_commit_tree(r, head_commit);
+
+	if (parse_tree(head_tree))
+		return 2;
+
+	if (repo_index_has_changes(r, head_tree, &sb)) {
+		error(_("Your local changes to the following files "
+			"would be overwritten by merge:\n  %s"),
+		      sb.buf);
+		strbuf_release(&sb);
+		return 2;
+	}
+
+	CALLOC_ARRAY(reference_commits, commit_list_count(remotes) + 1);
+	reference_commits[0] = head_commit;
+	reference_tree = head_tree;
+
+	for (i = remotes; i && i->item; i = i->next) {
+		struct commit *c = i->item;
+		struct object_id *oid = &c->object.oid;
+		struct tree *current_tree = repo_get_commit_tree(r, c);
+		struct commit_list *common, *j;
+		char *branch_name = merge_get_better_branch_name(oid_to_hex(oid));
+		int up_to_date = 0;
+
+		common = repo_get_merge_bases_many(r, c, nr_references, reference_commits);
+		if (!common) {
+			error(_("Unable to find common commit with %s"), branch_name);
+
+			free(branch_name);
+			free_commit_list(common);
+			free(reference_commits);
+
+			return 2;
+		}
+
+		for (j = common; j && !up_to_date && ff_merge; j = j->next) {
+			up_to_date |= oideq(&j->item->object.oid, oid);
+
+			if (!j->next &&
+			    !oideq(&j->item->object.oid,
+				   &reference_commits[nr_references - 1]->object.oid))
+				ff_merge = 0;
+		}
+
+		if (up_to_date) {
+			printf(_("Already up to date with %s\n"), branch_name);
+
+			free(branch_name);
+			free_commit_list(common);
+			continue;
+		}
+
+		if (ff_merge) {
+			ret = octopus_fast_forward(r, branch_name, head_tree,
+						   current_tree, &reference_tree);
+			nr_references = 0;
+		} else {
+			ret = octopus_do_merge(r, branch_name, common,
+					       current_tree, &reference_tree);
+		}
+
+		free(branch_name);
+		free_commit_list(common);
+
+		if (ret == -1 || ret == 2)
+			break;
+		else if (ret && i->next) {
+			/*
+			 * We allow only last one to have a
+			 * hand-resolvable conflicts.  Last round failed
+			 * and we still had a head to merge.
+			 */
+			puts(_("Automated merge did not work."));
+			puts(_("Should not be doing an octopus."));
+
+			free(reference_commits);
+			return 2;
+		}
+
+		reference_commits[nr_references++] = c;
+	}
+
+	free(reference_commits);
+	return ret;
+}
diff --git a/merge-strategies.h b/merge-strategies.h
index bba4bf999c..8de2249ee6 100644
--- a/merge-strategies.h
+++ b/merge-strategies.h
@@ -32,5 +32,8 @@ int merge_all_index(struct index_state *istate, int oneshot, int quiet,
 int merge_strategies_resolve(struct repository *r,
 			     struct commit_list *bases, const char *head_arg,
 			     struct commit_list *remote);
+int merge_strategies_octopus(struct repository *r,
+			     struct commit_list *bases, const char *head_arg,
+			     struct commit_list *remote);
 
 #endif /* MERGE_STRATEGIES_H */
-- 
2.31.0


^ permalink raw reply related	[flat|nested] 221+ messages in thread

* [PATCH v7 12/15] merge: use the "resolve" strategy without forking
  2021-03-17 20:49           ` [PATCH v7 00/15] " Alban Gruin
                               ` (10 preceding siblings ...)
  2021-03-17 20:49             ` [PATCH v7 11/15] merge-octopus: rewrite in C Alban Gruin
@ 2021-03-17 20:49             ` Alban Gruin
  2021-03-17 20:49             ` [PATCH v7 13/15] merge: use the "octopus" " Alban Gruin
                               ` (3 subsequent siblings)
  15 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2021-03-17 20:49 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Phillip Wood, Derrick Stolee, Alban Gruin

This teaches `git merge' to invoke the "resolve" strategy with a
function call instead of forking.

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 builtin/merge.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/builtin/merge.c b/builtin/merge.c
index eb00b273e6..87921497a2 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -43,6 +43,7 @@
 #include "commit-reach.h"
 #include "wt-status.h"
 #include "commit-graph.h"
+#include "merge-strategies.h"
 
 #define DEFAULT_TWOHEAD (1<<0)
 #define DEFAULT_OCTOPUS (1<<1)
@@ -755,6 +756,9 @@ static int try_merge_strategy(const char *strategy, struct commit_list *common,
 				       COMMIT_LOCK | SKIP_IF_UNCHANGED))
 			die(_("unable to write %s"), get_index_file());
 		return clean ? 0 : 1;
+	} else if (!strcmp(strategy, "resolve")) {
+		return merge_strategies_resolve(the_repository, common,
+						head_arg, remoteheads);
 	} else {
 		return try_merge_command(the_repository,
 					 strategy, xopts_nr, xopts,
-- 
2.31.0


^ permalink raw reply related	[flat|nested] 221+ messages in thread

* [PATCH v7 13/15] merge: use the "octopus" strategy without forking
  2021-03-17 20:49           ` [PATCH v7 00/15] " Alban Gruin
                               ` (11 preceding siblings ...)
  2021-03-17 20:49             ` [PATCH v7 12/15] merge: use the "resolve" strategy without forking Alban Gruin
@ 2021-03-17 20:49             ` Alban Gruin
  2021-03-17 20:49             ` [PATCH v7 14/15] sequencer: use the "resolve" " Alban Gruin
                               ` (2 subsequent siblings)
  15 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2021-03-17 20:49 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Phillip Wood, Derrick Stolee, Alban Gruin

This teaches `git merge' to invoke the "octopus" strategy with a
function call instead of forking.

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 builtin/merge.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/builtin/merge.c b/builtin/merge.c
index 87921497a2..79f1e8bdd1 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -759,6 +759,9 @@ static int try_merge_strategy(const char *strategy, struct commit_list *common,
 	} else if (!strcmp(strategy, "resolve")) {
 		return merge_strategies_resolve(the_repository, common,
 						head_arg, remoteheads);
+	} else if (!strcmp(strategy, "octopus")) {
+		return merge_strategies_octopus(the_repository, common,
+						head_arg, remoteheads);
 	} else {
 		return try_merge_command(the_repository,
 					 strategy, xopts_nr, xopts,
-- 
2.31.0


^ permalink raw reply related	[flat|nested] 221+ messages in thread

* [PATCH v7 14/15] sequencer: use the "resolve" strategy without forking
  2021-03-17 20:49           ` [PATCH v7 00/15] " Alban Gruin
                               ` (12 preceding siblings ...)
  2021-03-17 20:49             ` [PATCH v7 13/15] merge: use the "octopus" " Alban Gruin
@ 2021-03-17 20:49             ` Alban Gruin
  2021-03-17 20:49             ` [PATCH v7 15/15] sequencer: use the "octopus" merge " Alban Gruin
  2022-08-09 18:54             ` [PATCH v8 00/14] Rewrite the remaining merge strategies from shell to C Alban Gruin
  15 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2021-03-17 20:49 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Phillip Wood, Derrick Stolee, Alban Gruin

This teaches the sequencer to invoke the "resolve" strategy with a
function call instead of forking.

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 sequencer.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/sequencer.c b/sequencer.c
index d2332d3e17..ec8e9bda22 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -34,6 +34,7 @@
 #include "commit-reach.h"
 #include "rebase-interactive.h"
 #include "reset.h"
+#include "merge-strategies.h"
 
 #define GIT_REFLOG_ACTION "GIT_REFLOG_ACTION"
 
@@ -2049,9 +2050,16 @@ static int do_pick_commit(struct repository *r,
 
 		commit_list_insert(base, &common);
 		commit_list_insert(next, &remotes);
-		res |= try_merge_command(r, opts->strategy,
-					 opts->xopts_nr, (const char **)opts->xopts,
-					common, oid_to_hex(&head), remotes);
+
+		if (!strcmp(opts->strategy, "resolve")) {
+			repo_read_index(r);
+			res |= merge_strategies_resolve(r, common, oid_to_hex(&head), remotes);
+		} else {
+			res |= try_merge_command(r, opts->strategy,
+						 opts->xopts_nr, (const char **)opts->xopts,
+						 common, oid_to_hex(&head), remotes);
+		}
+
 		free_commit_list(common);
 		free_commit_list(remotes);
 	}
-- 
2.31.0



* [PATCH v7 15/15] sequencer: use the "octopus" merge strategy without forking
  2021-03-17 20:49           ` [PATCH v7 00/15] " Alban Gruin
                               ` (13 preceding siblings ...)
  2021-03-17 20:49             ` [PATCH v7 14/15] sequencer: use the "resolve" " Alban Gruin
@ 2021-03-17 20:49             ` Alban Gruin
  2022-08-09 18:54             ` [PATCH v8 00/14] Rewrite the remaining merge strategies from shell to C Alban Gruin
  15 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2021-03-17 20:49 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Phillip Wood, Derrick Stolee, Alban Gruin

This teaches the sequencer to invoke the "octopus" strategy with a
function call instead of forking.

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 sequencer.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/sequencer.c b/sequencer.c
index ec8e9bda22..683ebfc8e2 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -2054,6 +2054,9 @@ static int do_pick_commit(struct repository *r,
 		if (!strcmp(opts->strategy, "resolve")) {
 			repo_read_index(r);
 			res |= merge_strategies_resolve(r, common, oid_to_hex(&head), remotes);
+		} else if (!strcmp(opts->strategy, "octopus")) {
+			repo_read_index(r);
+			res |= merge_strategies_octopus(r, common, oid_to_hex(&head), remotes);
 		} else {
 			res |= try_merge_command(r, opts->strategy,
 						 opts->xopts_nr, (const char **)opts->xopts,
-- 
2.31.0



* Re: [PATCH v7 03/15] t6060: add tests for removed files
  2021-03-17 20:49             ` [PATCH v7 03/15] t6060: add tests for removed files Alban Gruin
@ 2021-03-22 21:36               ` Johannes Schindelin
  2021-03-23 20:43                 ` Alban Gruin
  0 siblings, 1 reply; 221+ messages in thread
From: Johannes Schindelin @ 2021-03-22 21:36 UTC (permalink / raw)
  To: Alban Gruin; +Cc: git, Junio C Hamano, Phillip Wood, Derrick Stolee

Hi Alban,

On Wed, 17 Mar 2021, Alban Gruin wrote:

> Until now, t6060 did not not check git-mere-one-file's behaviour when a

Channeling my inner Eric Sunshine: s/mere-one/merge-one/ ;-)

> file is deleted in a branch.  To avoid regressions on this during the
> conversion, this adds a new file, `file3', in the commit tagged as`base', and

Maybe "during the conversion from shell script to C"?

Other than that, looks good to me! Thanks,
Dscho

> deletes it in the commit tagged as `two'.
>
> Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
> ---
>  t/t6060-merge-index.sh | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/t/t6060-merge-index.sh b/t/t6060-merge-index.sh
> index 9e15ceb957..0cbd8a1f7f 100755
> --- a/t/t6060-merge-index.sh
> +++ b/t/t6060-merge-index.sh
> @@ -8,12 +8,14 @@ test_expect_success 'setup diverging branches' '
>  		echo $i
>  	done >file &&
>  	cp file file2 &&
> -	git add file file2 &&
> +	cp file file3 &&
> +	git add file file2 file3 &&
>  	git commit -m base &&
>  	git tag base &&
>  	sed s/2/two/ <file >tmp &&
>  	mv tmp file &&
>  	cp file file2 &&
> +	git rm file3 &&
>  	git commit -a -m two &&
>  	git tag two &&
>  	git checkout -b other HEAD^ &&
> @@ -41,6 +43,7 @@ test_expect_success 'read-tree does not resolve content merge' '
>  	cat >expect <<-\EOF &&
>  	file
>  	file2
> +	file3
>  	EOF
>  	git read-tree -i -m base ten two &&
>  	git diff-files --name-only --diff-filter=U >unmerged &&
> --
> 2.31.0
>
>


* Re: [PATCH v7 07/15] update-index: move add_cacheinfo() to read-cache.c
  2021-03-17 20:49             ` [PATCH v7 07/15] update-index: move add_cacheinfo() to read-cache.c Alban Gruin
@ 2021-03-22 21:59               ` Johannes Schindelin
  2021-03-23 20:45                 ` Alban Gruin
  0 siblings, 1 reply; 221+ messages in thread
From: Johannes Schindelin @ 2021-03-22 21:59 UTC (permalink / raw)
  To: Alban Gruin; +Cc: git, Junio C Hamano, Phillip Wood, Derrick Stolee

Hi Alban,

On Wed, 17 Mar 2021, Alban Gruin wrote:

> This moves the function add_cacheinfo() that already exists in
> update-index.c to read-cache.c, renames it add_to_index_cacheinfo(),
> and adds an `istate' parameter.  The new cache entry is returned through
> a pointer passed in the parameters.  The return value is either 0
> (success), -1 (invalid path), or -2 (failed to add the file in the
> index).

This paragraph still talks about magic numbers, but the code has constants
for them. Maybe elevate the commit message to a more generic description
that does not spend time on specifying the exact values, but rather lists
the three outcomes in plain English?

Other than that, this looks fine to me! Thanks,
Dscho

>
> This will become useful in the next commit, when the three-way merge
> will need to call this function.
>
> Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
> ---
>  builtin/update-index.c | 25 +++++++------------------
>  cache.h                |  8 ++++++++
>  read-cache.c           | 35 +++++++++++++++++++++++++++++++++++
>  3 files changed, 50 insertions(+), 18 deletions(-)
>
> diff --git a/builtin/update-index.c b/builtin/update-index.c
> index 79087bccea..6b86e89840 100644
> --- a/builtin/update-index.c
> +++ b/builtin/update-index.c
> @@ -404,27 +404,16 @@ static int process_path(const char *path, struct stat *st, int stat_errno)
>  static int add_cacheinfo(unsigned int mode, const struct object_id *oid,
>  			 const char *path, int stage)
>  {
> -	int len, option;
> -	struct cache_entry *ce;
> +	int res;
>
> -	if (!verify_path(path, mode))
> -		return error("Invalid path '%s'", path);
> -
> -	len = strlen(path);
> -	ce = make_empty_cache_entry(&the_index, len);
> -
> -	oidcpy(&ce->oid, oid);
> -	memcpy(ce->name, path, len);
> -	ce->ce_flags = create_ce_flags(stage);
> -	ce->ce_namelen = len;
> -	ce->ce_mode = create_ce_mode(mode);
> -	if (assume_unchanged)
> -		ce->ce_flags |= CE_VALID;
> -	option = allow_add ? ADD_CACHE_OK_TO_ADD : 0;
> -	option |= allow_replace ? ADD_CACHE_OK_TO_REPLACE : 0;
> -	if (add_cache_entry(ce, option))
> +	res = add_to_index_cacheinfo(&the_index, mode, oid, path, stage,
> +				     allow_add, allow_replace, NULL);
> +	if (res == ADD_TO_INDEX_CACHEINFO_INVALID_PATH)
> +		return error(_("Invalid path '%s'"), path);
> +	if (res == ADD_TO_INDEX_CACHEINFO_UNABLE_TO_ADD)
>  		return error("%s: cannot add to the index - missing --add option?",
>  			     path);
> +
>  	report("add '%s'", path);
>  	return 0;
>  }
> diff --git a/cache.h b/cache.h
> index 6fda8091f1..41e30c0da2 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -832,6 +832,14 @@ int remove_file_from_index(struct index_state *, const char *path);
>  int add_to_index(struct index_state *, const char *path, struct stat *, int flags);
>  int add_file_to_index(struct index_state *, const char *path, int flags);
>
> +#define ADD_TO_INDEX_CACHEINFO_INVALID_PATH (-1)
> +#define ADD_TO_INDEX_CACHEINFO_UNABLE_TO_ADD (-2)
> +
> +int add_to_index_cacheinfo(struct index_state *, unsigned int mode,
> +			   const struct object_id *oid, const char *path,
> +			   int stage, int allow_add, int allow_replace,
> +			   struct cache_entry **ce_ret);
> +
>  int chmod_index_entry(struct index_state *, struct cache_entry *ce, char flip);
>  int ce_same_name(const struct cache_entry *a, const struct cache_entry *b);
>  void set_object_name_for_intent_to_add_entry(struct cache_entry *ce);
> diff --git a/read-cache.c b/read-cache.c
> index 1e9a50c6c7..b514523ca1 100644
> --- a/read-cache.c
> +++ b/read-cache.c
> @@ -1350,6 +1350,41 @@ int add_index_entry(struct index_state *istate, struct cache_entry *ce, int opti
>  	return 0;
>  }
>
> +int add_to_index_cacheinfo(struct index_state *istate, unsigned int mode,
> +			   const struct object_id *oid, const char *path,
> +			   int stage, int allow_add, int allow_replace,
> +			   struct cache_entry **ce_ret)
> +{
> +	int len, option;
> +	struct cache_entry *ce;
> +
> +	if (!verify_path(path, mode))
> +		return ADD_TO_INDEX_CACHEINFO_INVALID_PATH;
> +
> +	len = strlen(path);
> +	ce = make_empty_cache_entry(istate, len);
> +
> +	oidcpy(&ce->oid, oid);
> +	memcpy(ce->name, path, len);
> +	ce->ce_flags = create_ce_flags(stage);
> +	ce->ce_namelen = len;
> +	ce->ce_mode = create_ce_mode(mode);
> +	if (assume_unchanged)
> +		ce->ce_flags |= CE_VALID;
> +	option = allow_add ? ADD_CACHE_OK_TO_ADD : 0;
> +	option |= allow_replace ? ADD_CACHE_OK_TO_REPLACE : 0;
> +
> +	if (add_index_entry(istate, ce, option)) {
> +		discard_cache_entry(ce);
> +		return ADD_TO_INDEX_CACHEINFO_UNABLE_TO_ADD;
> +	}
> +
> +	if (ce_ret)
> +		*ce_ret = ce;
> +
> +	return 0;
> +}
> +
>  /*
>   * "refresh" does not calculate a new sha1 file or bring the
>   * cache up-to-date for mode/content changes. But what it
> --
> 2.31.0
>
>


* Re: [PATCH v7 08/15] merge-one-file: rewrite in C
  2021-03-17 20:49             ` [PATCH v7 08/15] merge-one-file: rewrite in C Alban Gruin
@ 2021-03-22 22:20               ` Johannes Schindelin
  2021-03-23 20:53                 ` Alban Gruin
  0 siblings, 1 reply; 221+ messages in thread
From: Johannes Schindelin @ 2021-03-22 22:20 UTC (permalink / raw)
  To: Alban Gruin; +Cc: git, Junio C Hamano, Phillip Wood, Derrick Stolee

Hi Alban,

On Wed, 17 Mar 2021, Alban Gruin wrote:

> This rewrites `git merge-one-file' from shell to C.  This port is not
> completely straightforward: to save precious cycles by avoiding reading
> and flushing the index repeatedly, write temporary files when an
> operation can be performed in-memory, or allow other function to use the
> rewrite without forking nor worrying about the index, the calls to
> external processes are replaced by calls to functions in libgit.a:
>
>  - calls to `update-index --add --cacheinfo' are replaced by calls to
>    add_to_index_cacheinfo();
>
>  - calls to `update-index --remove' are replaced by calls to
>    remove_file_from_index();
>
>  - calls to `checkout-index -u -f' are replaced by calls to
>    checkout_entry();
>
>  - calls to `unpack-file' and `merge-files' are replaced by calls to
>    read_mmblob() and xdl_merge(), respectively, to merge files
>    in-memory;
>
>  - calls to `checkout-index -f --stage=2' are removed, as this is needed
>    to have the correct permission bits on the merged file from the
>    script, but not in the C version;
>
>  - calls to `update-index' are replaced by calls to add_file_to_index().
>
> The bulk of the rewrite is done in a new file in libgit.a,
> merge-strategies.c.  This will enable the resolve and octopus strategies
> to directly call it instead of forking.
>
> This also fixes a bug present in the original script: instead of
> checking if a _regular_ file exists when a file exists in the branch to
> merge, but not in our branch, the rewritten version checks if a file of
> any kind (ie. a directory, ...) exists.  This fixes the tests t6035.14,
> where the branch to merge had a new file, `a/b', but our branch had a
> directory there; it should have failed because a directory exists, but
> it did not because there was no regular file called `a/b'.  This test is
> now marked as successful.
>
> This also teaches `merge-index' to call merge_three_way() (when invoked
> with `--use=merge-one-file') without forking using a new callback,
> merge_one_file_func().
>
> To avoid any issue with a shrinking index because of the merge function
> used (directly in the process or by forking), as described earlier, the
> iterator of the loop of merge_all_index() is increased by the number of
> entries with the same name, minus the difference between the number of
> entries in the index before and after the merge.
>
> This should handle a shrinking index correctly, but could lead to issues
> with a growing index.  However, this case is not treated, as there is no
> callback that can produce such a case.

Nice!
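
As a worked example of the iterator adjustment described in the quoted
message, the sketch below isolates the arithmetic that the patch adds to
merge_all_index().  The helper name next_position() and its parameters
are hypothetical and exist only for illustration.  It assumes, as the
commit message states, that the merge step reports the number of entries
sharing the current name, and that the index can shrink or stay the same
size but never grow.

```c
#include <assert.h>

/*
 * Hypothetical helper mirroring the arithmetic in merge_all_index():
 * given the current position `i`, the number of same-name entries `ret`
 * reported for the path, and the index size before (`prev_nr`) and
 * after (`cur_nr`) the merge callback ran, return the position the loop
 * examines next, that is, after the for loop's own i++.
 */
static unsigned int next_position(unsigned int i, unsigned int ret,
				  unsigned int prev_nr, unsigned int cur_nr)
{
	/* The adjustment only covers an index that shrank or kept its size. */
	assert(cur_nr <= prev_nr);

	/*
	 * Unsigned arithmetic wraps, so a net adjustment of "-1" (all
	 * entries for the path were dropped) is still well defined.
	 */
	i += ret - (prev_nr - cur_nr) - 1;
	return i + 1;
}
```

With three staged entries for one path starting at position 4 in a
ten-entry index: if the callback forks and the in-memory index is
untouched, next_position(4, 3, 10, 10) gives 7; if the three stages are
replaced by one merged entry (size 8), it gives 5; if the path is removed
entirely (size 7), it gives 4.  In each case that is the first entry of
the following path.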

> diff --git a/builtin/merge-index.c b/builtin/merge-index.c
> index fd5b1a5a92..04d38aa130 100644
> --- a/builtin/merge-index.c
> +++ b/builtin/merge-index.c
> @@ -38,7 +38,7 @@ static int merge_one_file_spawn(struct index_state *istate,
>  int cmd_merge_index(int argc, const char **argv, const char *prefix)
>  {
>  	int i, force_file = 0, err = 0, one_shot = 0, quiet = 0;
> -	merge_fn merge_action = merge_one_file_spawn;
> +	merge_fn merge_action;
>  	struct lock_file lock = LOCK_INIT;
>  	struct repository *r = the_repository;
>  	const char *use_internal = NULL;
> @@ -69,10 +69,13 @@ int cmd_merge_index(int argc, const char **argv, const char *prefix)
>
>  	if (skip_prefix(pgm, "--use=", &use_internal)) {
>  		if (!strcmp(use_internal, "merge-one-file"))
> -			pgm = "git-merge-one-file";
> +			merge_action = merge_one_file_func;
>  		else
>  			die(_("git merge-index: unknown internal program %s"), use_internal);
> -	}
> +
> +		repo_hold_locked_index(r, &lock, LOCK_DIE_ON_ERROR);
> +	} else
> +		merge_action = merge_one_file_spawn;

I would have a slight preference to keep the default initializer, because
that makes it easier to reason about. But if you _want_ to keep this patch
as-is, I won't object.
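
To illustrate the alternative mentioned above, here is a minimal
standalone sketch of the "default initializer" shape.  It is not the
code from the series: the merge_fn type is reduced to a one-argument
stand-in, the two callbacks are stubs, and pick_merge_action() is a
hypothetical name used only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for the merge_fn callback type of the series. */
typedef int (*merge_fn)(const char *path);

/* Stub callbacks; the real ones fork a program or merge in-process. */
static int merge_one_file_spawn(const char *path) { (void)path; return 1; }
static int merge_one_file_func(const char *path) { (void)path; return 2; }

/*
 * Hypothetical selector: the spawning callback is the default, and it
 * is overridden only when "--use=merge-one-file" is requested.  Returns
 * NULL for an unknown internal program (the builtin would die() here).
 */
static merge_fn pick_merge_action(const char *pgm)
{
	merge_fn merge_action = merge_one_file_spawn;

	assert(pgm != NULL);

	if (!strncmp(pgm, "--use=", 6)) {
		if (!strcmp(pgm + 6, "merge-one-file"))
			merge_action = merge_one_file_func;
		else
			return NULL;
	}

	return merge_action;
}
```

With the default in the declaration, there is a single assignment to
audit, and no trailing "else" branch is needed to guarantee that the
pointer is set before use.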

It is a bit sad that the conversion cannot be done more incrementally, as
there is a lot to unpack in the many different cases that are handled. It
looks correct, though.

Just one thing:

>
>  	for (; i < argc; i++) {
>  		const char *arg = argv[i];
> diff --git a/builtin/merge-one-file.c b/builtin/merge-one-file.c
> new file mode 100644
> index 0000000000..ad99c6dbd4
> --- /dev/null
> +++ b/builtin/merge-one-file.c
> @@ -0,0 +1,94 @@
> +/*
> + * Builtin "git merge-one-file"
> + *
> + * Copyright (c) 2020 Alban Gruin
> + *
> + * Based on git-merge-one-file.sh, written by Linus Torvalds.
> + *
> + * This is the git per-file merge utility, called with
> + *
> + *   argv[1] - original file object name (or empty)
> + *   argv[2] - file in branch1 object name (or empty)
> + *   argv[3] - file in branch2 object name (or empty)
> + *   argv[4] - pathname in repository
> + *   argv[5] - original file mode (or empty)
> + *   argv[6] - file in branch1 mode (or empty)
> + *   argv[7] - file in branch2 mode (or empty)
> + *
> + * Handle some trivial cases. The _really_ trivial cases have been
> + * handled already by git read-tree, but that one doesn't do any merges
> + * that might change the tree layout.
> + */
> +
> +#include "cache.h"
> +#include "builtin.h"
> +#include "lockfile.h"
> +#include "merge-strategies.h"
> +
> +static const char builtin_merge_one_file_usage[] =
> +	"git merge-one-file <orig blob> <our blob> <their blob> <path> "
> +	"<orig mode> <our mode> <their mode>\n\n"
> +	"Blob ids and modes should be empty for missing files.";
> +
> +static int read_mode(const char *name, const char *arg, unsigned int *mode)
> +{
> +	char *last;
> +	int ret = 0;
> +
> +	*mode = strtol(arg, &last, 8);
> +
> +	if (*last)
> +		ret = error(_("invalid '%s' mode: expected nothing, got '%c'"), name, *last);
> +	else if (!(S_ISREG(*mode) || S_ISDIR(*mode) || S_ISLNK(*mode)))
> +		ret = error(_("invalid '%s' mode: %o"), name, *mode);
> +
> +	return ret;
> +}
> +
> +int cmd_merge_one_file(int argc, const char **argv, const char *prefix)
> +{
> +	struct object_id orig_blob, our_blob, their_blob,
> +		*p_orig_blob = NULL, *p_our_blob = NULL, *p_their_blob = NULL;
> +	unsigned int orig_mode = 0, our_mode = 0, their_mode = 0, ret = 0;
> +	struct lock_file lock = LOCK_INIT;
> +	struct repository *r = the_repository;
> +
> +	if (argc != 8)
> +		usage(builtin_merge_one_file_usage);
> +
> +	if (repo_read_index(r) < 0)
> +		die("invalid index");
> +
> +	repo_hold_locked_index(r, &lock, LOCK_DIE_ON_ERROR);
> +
> +	if (!get_oid_hex(argv[1], &orig_blob)) {
> +		p_orig_blob = &orig_blob;
> +		ret = read_mode("orig", argv[5], &orig_mode);
> +	} else if (!*argv[1] && *argv[5])
> +		ret = error(_("no 'orig' object id given, but a mode was still given."));

Here, it looks as if the case of an empty `argv[1]` is not handled
_explicitly_, but we rely on `get_oid_hex()` to return non-zero, and then
we rely on the second arm _also_ not re-assigning `orig_blob`.

I wonder whether this could be checked, and whether it would make sense to
fold this, along with most of these 5 lines, into the `read_mode()` helper
function (DRYing up the code even further).
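
One possible shape for such a folded helper is sketched below, as a
standalone illustration rather than a proposed patch.  Everything in it
is an assumption: read_blob_arg() and looks_like_oid() are hypothetical
names, the object id check is a SHA-1-only stand-in for get_oid_hex(),
and the S_ISREG()/S_ISDIR()/S_ISLNK() validation done by read_mode() in
the patch is left out for brevity.  Unlike the quoted code, this sketch
rejects a non-empty but malformed object id explicitly.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/* Stand-in for get_oid_hex(): accepts exactly 40 hex digits (SHA-1). */
static int looks_like_oid(const char *hex)
{
	size_t i;

	for (i = 0; i < 40; i++)
		if (!isxdigit((unsigned char)hex[i]))
			return 0; /* also stops at an early NUL */
	return hex[40] == '\0';
}

/*
 * Hypothetical helper: returns 0 on success and -1 on error.  On
 * success, *present tells whether the blob exists on that side and
 * *mode holds its mode (0 when the blob is absent).
 */
static int read_blob_arg(const char *hex, const char *mode_arg,
			 int *present, unsigned int *mode)
{
	char *end;
	long parsed;

	assert(hex && mode_arg && present && mode);
	*present = 0;
	*mode = 0;

	if (!*hex)
		/* Missing file: a mode must not be given either. */
		return *mode_arg ? -1 : 0;

	if (!looks_like_oid(hex))
		return -1;

	parsed = strtol(mode_arg, &end, 8);
	if (end == mode_arg || *end || parsed < 0)
		return -1;

	*mode = (unsigned int)parsed;
	*present = 1;
	return 0;
}
```

The caller would then invoke it once per side ("orig", "our", "their")
and derive the p_*_blob pointers from the `present` flag, so the empty
argument case is handled in one explicit place.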

As for the rest of the patch, it is totally possible that I missed a bug,
but it looks correct to me, and the added regression tests give me a good
feeling about the patch, too.

Thanks,
Dscho

> +
> +	if (!get_oid_hex(argv[2], &our_blob)) {
> +		p_our_blob = &our_blob;
> +		ret = read_mode("our", argv[6], &our_mode);
> +	} else if (!*argv[2] && *argv[6])
> +		ret = error(_("no 'our' object id given, but a mode was still given."));
> +
> +	if (!get_oid_hex(argv[3], &their_blob)) {
> +		p_their_blob = &their_blob;
> +		ret = read_mode("their", argv[7], &their_mode);
> +	} else if (!*argv[3] && *argv[7])
> +		ret = error(_("no 'their' object id given, but a mode was still given."));
> +
> +	if (ret)
> +		return ret;
> +
> +	ret = merge_three_way(r->index, p_orig_blob, p_our_blob, p_their_blob,
> +			      argv[4], orig_mode, our_mode, their_mode);
> +
> +	if (ret) {
> +		rollback_lock_file(&lock);
> +		return !!ret;
> +	}
> +
> +	return write_locked_index(r->index, &lock, COMMIT_LOCK);
> +}
> diff --git a/git-merge-one-file.sh b/git-merge-one-file.sh
> deleted file mode 100755
> index f6d9852d2f..0000000000
> --- a/git-merge-one-file.sh
> +++ /dev/null
> @@ -1,167 +0,0 @@
> -#!/bin/sh
> -#
> -# Copyright (c) Linus Torvalds, 2005
> -#
> -# This is the git per-file merge script, called with
> -#
> -#   $1 - original file SHA1 (or empty)
> -#   $2 - file in branch1 SHA1 (or empty)
> -#   $3 - file in branch2 SHA1 (or empty)
> -#   $4 - pathname in repository
> -#   $5 - original file mode (or empty)
> -#   $6 - file in branch1 mode (or empty)
> -#   $7 - file in branch2 mode (or empty)
> -#
> -# Handle some trivial cases.. The _really_ trivial cases have
> -# been handled already by git read-tree, but that one doesn't
> -# do any merges that might change the tree layout.
> -
> -USAGE='<orig blob> <our blob> <their blob> <path>'
> -USAGE="$USAGE <orig mode> <our mode> <their mode>"
> -LONG_USAGE="usage: git merge-one-file $USAGE
> -
> -Blob ids and modes should be empty for missing files."
> -
> -SUBDIRECTORY_OK=Yes
> -. git-sh-setup
> -cd_to_toplevel
> -require_work_tree
> -
> -if test $# != 7
> -then
> -	echo "$LONG_USAGE"
> -	exit 1
> -fi
> -
> -case "${1:-.}${2:-.}${3:-.}" in
> -#
> -# Deleted in both or deleted in one and unchanged in the other
> -#
> -"$1.." | "$1.$1" | "$1$1.")
> -	if { test -z "$6" && test "$5" != "$7"; } ||
> -	   { test -z "$7" && test "$5" != "$6"; }
> -	then
> -		echo "ERROR: File $4 deleted on one branch but had its" >&2
> -		echo "ERROR: permissions changed on the other." >&2
> -		exit 1
> -	fi
> -
> -	if test -n "$2"
> -	then
> -		echo "Removing $4"
> -	else
> -		# read-tree checked that index matches HEAD already,
> -		# so we know we do not have this path tracked.
> -		# there may be an unrelated working tree file here,
> -		# which we should just leave unmolested.  Make sure
> -		# we do not have it in the index, though.
> -		exec git update-index --remove -- "$4"
> -	fi
> -	if test -f "$4"
> -	then
> -		rm -f -- "$4" &&
> -		rmdir -p "$(expr "z$4" : 'z\(.*\)/')" 2>/dev/null || :
> -	fi &&
> -		exec git update-index --remove -- "$4"
> -	;;
> -
> -#
> -# Added in one.
> -#
> -".$2.")
> -	# the other side did not add and we added so there is nothing
> -	# to be done, except making the path merged.
> -	exec git update-index --add --cacheinfo "$6" "$2" "$4"
> -	;;
> -"..$3")
> -	echo "Adding $4"
> -	if test -f "$4"
> -	then
> -		echo "ERROR: untracked $4 is overwritten by the merge." >&2
> -		exit 1
> -	fi
> -	git update-index --add --cacheinfo "$7" "$3" "$4" &&
> -		exec git checkout-index -u -f -- "$4"
> -	;;
> -
> -#
> -# Added in both, identically (check for same permissions).
> -#
> -".$3$2")
> -	if test "$6" != "$7"
> -	then
> -		echo "ERROR: File $4 added identically in both branches," >&2
> -		echo "ERROR: but permissions conflict $6->$7." >&2
> -		exit 1
> -	fi
> -	echo "Adding $4"
> -	git update-index --add --cacheinfo "$6" "$2" "$4" &&
> -		exec git checkout-index -u -f -- "$4"
> -	;;
> -
> -#
> -# Modified in both, but differently.
> -#
> -"$1$2$3" | ".$2$3")
> -
> -	case ",$6,$7," in
> -	*,120000,*)
> -		echo "ERROR: $4: Not merging symbolic link changes." >&2
> -		exit 1
> -		;;
> -	*,160000,*)
> -		echo "ERROR: $4: Not merging conflicting submodule changes." >&2
> -		exit 1
> -		;;
> -	esac
> -
> -	src1=$(git unpack-file $2)
> -	src2=$(git unpack-file $3)
> -	case "$1" in
> -	'')
> -		echo "Added $4 in both, but differently."
> -		orig=$(git unpack-file $(git hash-object /dev/null))
> -		;;
> -	*)
> -		echo "Auto-merging $4"
> -		orig=$(git unpack-file $1)
> -		;;
> -	esac
> -
> -	git merge-file "$src1" "$orig" "$src2"
> -	ret=$?
> -	msg=
> -	if test $ret != 0 || test -z "$1"
> -	then
> -		msg='content conflict'
> -		ret=1
> -	fi
> -
> -	# Create the working tree file, using "our tree" version from the
> -	# index, and then store the result of the merge.
> -	git checkout-index -f --stage=2 -- "$4" && cat "$src1" >"$4" || exit 1
> -	rm -f -- "$orig" "$src1" "$src2"
> -
> -	if test "$6" != "$7"
> -	then
> -		if test -n "$msg"
> -		then
> -			msg="$msg, "
> -		fi
> -		msg="${msg}permissions conflict: $5->$6,$7"
> -		ret=1
> -	fi
> -
> -	if test $ret != 0
> -	then
> -		echo "ERROR: $msg in $4" >&2
> -		exit 1
> -	fi
> -	exec git update-index -- "$4"
> -	;;
> -
> -*)
> -	echo "ERROR: $4: Not handling case $1 -> $2 -> $3" >&2
> -	;;
> -esac
> -exit 1
> diff --git a/git.c b/git.c
> index 9bc077a025..95eb74efe1 100644
> --- a/git.c
> +++ b/git.c
> @@ -544,6 +544,7 @@ static struct cmd_struct commands[] = {
>  	{ "merge-file", cmd_merge_file, RUN_SETUP_GENTLY },
>  	{ "merge-index", cmd_merge_index, RUN_SETUP | NO_PARSEOPT },
>  	{ "merge-ours", cmd_merge_ours, RUN_SETUP | NO_PARSEOPT },
> +	{ "merge-one-file", cmd_merge_one_file, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
>  	{ "merge-recursive", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
>  	{ "merge-recursive-ours", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
>  	{ "merge-recursive-theirs", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
> diff --git a/merge-strategies.c b/merge-strategies.c
> index c80f964612..2717af51fd 100644
> --- a/merge-strategies.c
> +++ b/merge-strategies.c
> @@ -1,5 +1,197 @@
>  #include "cache.h"
> +#include "dir.h"
>  #include "merge-strategies.h"
> +#include "xdiff-interface.h"
> +
> +static int add_merge_result_to_index(struct index_state *istate, unsigned int mode,
> +				     const struct object_id *oid, const char *path,
> +				     int checkout)
> +{
> +	struct cache_entry *ce;
> +	int res;
> +
> +	res = add_to_index_cacheinfo(istate, mode, oid, path, 0, 1, 1, &ce);
> +	if (res == -1)
> +		return error(_("Invalid path '%s'"), path);
> +	else if (res == -2)
> +		return -1;
> +
> +	if (checkout) {
> +		struct checkout state = CHECKOUT_INIT;
> +
> +		state.istate = istate;
> +		state.force = 1;
> +		state.base_dir = "";
> +		state.base_dir_len = 0;
> +
> +		if (checkout_entry(ce, &state, NULL, NULL) < 0)
> +			return error(_("%s: cannot checkout file"), path);
> +	}
> +
> +	return 0;
> +}
> +
> +static int merge_one_file_deleted(struct index_state *istate,
> +				  const struct object_id *our_blob,
> +				  const struct object_id *their_blob, const char *path,
> +				  unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode)
> +{
> +	if ((!our_blob && orig_mode != their_mode) ||
> +	    (!their_blob && orig_mode != our_mode))
> +		return error(_("File %s deleted on one branch but had its "
> +			       "permissions changed on the other."), path);
> +
> +	if (our_blob) {
> +		printf(_("Removing %s\n"), path);
> +
> +		if (file_exists(path))
> +			remove_path(path);
> +	}
> +
> +	if (remove_file_from_index(istate, path))
> +		return error("%s: cannot remove from the index", path);
> +	return 0;
> +}
> +
> +static int do_merge_one_file(struct index_state *istate,
> +			     const struct object_id *orig_blob,
> +			     const struct object_id *our_blob,
> +			     const struct object_id *their_blob, const char *path,
> +			     unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode)
> +{
> +	int ret, i, dest;
> +	ssize_t written;
> +	mmbuffer_t result = {NULL, 0};
> +	mmfile_t mmfs[3];
> +	xmparam_t xmp = {{0}};
> +
> +	if (our_mode == S_IFLNK || their_mode == S_IFLNK)
> +		return error(_("%s: Not merging symbolic link changes."), path);
> +	else if (our_mode == S_IFGITLINK || their_mode == S_IFGITLINK)
> +		return error(_("%s: Not merging conflicting submodule changes."), path);
> +
> +	if (orig_blob) {
> +		printf(_("Auto-merging %s\n"), path);
> +		read_mmblob(mmfs + 0, orig_blob);
> +	} else {
> +		printf(_("Added %s in both, but differently.\n"), path);
> +		read_mmblob(mmfs + 0, &null_oid);
> +	}
> +
> +	read_mmblob(mmfs + 1, our_blob);
> +	read_mmblob(mmfs + 2, their_blob);
> +
> +	xmp.level = XDL_MERGE_ZEALOUS_ALNUM;
> +	xmp.style = 0;
> +	xmp.favor = 0;
> +
> +	ret = xdl_merge(mmfs + 0, mmfs + 1, mmfs + 2, &xmp, &result);
> +
> +	for (i = 0; i < 3; i++)
> +		free(mmfs[i].ptr);
> +
> +	if (ret < 0) {
> +		free(result.ptr);
> +		return error(_("Failed to execute internal merge"));
> +	}
> +
> +	if (ret > 0 || !orig_blob)
> +		ret = error(_("content conflict in %s"), path);
> +	if (our_mode != their_mode)
> +		ret = error(_("permission conflict: %o->%o,%o in %s"),
> +			    orig_mode, our_mode, their_mode, path);
> +
> +	unlink(path);
> +	if ((dest = open(path, O_WRONLY | O_CREAT, our_mode)) < 0) {
> +		free(result.ptr);
> +		return error_errno(_("failed to open file '%s'"), path);
> +	}
> +
> +	written = write_in_full(dest, result.ptr, result.size);
> +	close(dest);
> +
> +	free(result.ptr);
> +
> +	if (written < 0)
> +		return error_errno(_("failed to write to '%s'"), path);
> +	if (ret)
> +		return ret;
> +
> +	return add_file_to_index(istate, path, 0);
> +}
> +
> +int merge_three_way(struct index_state *istate,
> +		    const struct object_id *orig_blob,
> +		    const struct object_id *our_blob,
> +		    const struct object_id *their_blob, const char *path,
> +		    unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode)
> +{
> +	if (orig_blob &&
> +	    ((!our_blob && !their_blob) ||
> +	     (!their_blob && our_blob && oideq(orig_blob, our_blob)) ||
> +	     (!our_blob && their_blob && oideq(orig_blob, their_blob)))) {
> +		/* Deleted in both or deleted in one and unchanged in the other. */
> +		return merge_one_file_deleted(istate, our_blob, their_blob, path,
> +					      orig_mode, our_mode, their_mode);
> +	} else if (!orig_blob && our_blob && !their_blob) {
> +		/*
> +		 * Added in ours.  The other side did not add and we
> +		 * added so there is nothing to be done, except making
> +		 * the path merged.
> +		 */
> +		return add_merge_result_to_index(istate, our_mode, our_blob, path, 0);
> +	} else if (!orig_blob && !our_blob && their_blob) {
> +		printf(_("Adding %s\n"), path);
> +
> +		if (file_exists(path))
> +			return error(_("untracked %s is overwritten by the merge."), path);
> +
> +		return add_merge_result_to_index(istate, their_mode, their_blob, path, 1);
> +	} else if (!orig_blob && our_blob && their_blob &&
> +		   oideq(our_blob, their_blob)) {
> +		/* Added in both, identically (check for same permissions). */
> +		if (our_mode != their_mode)
> +			return error(_("File %s added identically in both branches, "
> +				       "but permissions conflict %o->%o."),
> +				     path, our_mode, their_mode);
> +
> +		printf(_("Adding %s\n"), path);
> +
> +		return add_merge_result_to_index(istate, our_mode, our_blob, path, 1);
> +	} else if (our_blob && their_blob) {
> +		/* Modified in both, but differently. */
> +		return do_merge_one_file(istate,
> +					 orig_blob, our_blob, their_blob, path,
> +					 orig_mode, our_mode, their_mode);
> +	} else {
> +		char orig_hex[GIT_MAX_HEXSZ] = {0}, our_hex[GIT_MAX_HEXSZ] = {0},
> +			their_hex[GIT_MAX_HEXSZ] = {0};
> +
> +		if (orig_blob)
> +			oid_to_hex_r(orig_hex, orig_blob);
> +		if (our_blob)
> +			oid_to_hex_r(our_hex, our_blob);
> +		if (their_blob)
> +			oid_to_hex_r(their_hex, their_blob);
> +
> +		return error(_("%s: Not handling case %s -> %s -> %s"),
> +			     path, orig_hex, our_hex, their_hex);
> +	}
> +
> +	return 0;
> +}
> +
> +int merge_one_file_func(struct index_state *istate,
> +			const struct object_id *orig_blob,
> +			const struct object_id *our_blob,
> +			const struct object_id *their_blob, const char *path,
> +			unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
> +			void *data)
> +{
> +	return merge_three_way(istate,
> +			       orig_blob, our_blob, their_blob, path,
> +			       orig_mode, our_mode, their_mode);
> +}
>
>  static int merge_entry(struct index_state *istate, int quiet, unsigned int pos,
>  		       const char *path, int *err, merge_fn fn, void *data)
> @@ -54,17 +246,24 @@ int merge_all_index(struct index_state *istate, int oneshot, int quiet,
>  		    merge_fn fn, void *data)
>  {
>  	int err = 0, ret;
> -	unsigned int i;
> +	unsigned int i, prev_nr;
>
>  	for (i = 0; i < istate->cache_nr; i++) {
>  		const struct cache_entry *ce = istate->cache[i];
>  		if (!ce_stage(ce))
>  			continue;
>
> +		prev_nr = istate->cache_nr;
>  		ret = merge_entry(istate, quiet || oneshot, i, ce->name, &err, fn, data);
> -		if (ret > 0)
> -			i += ret - 1;
> -		else if (ret == -1)
> +		if (ret > 0) {
> +			/*
> +			 * Don't bother handling an index that has
> +			 * grown, since merge_one_file_func() can't grow
> +			 * it, and merge_one_file_spawn() can't change
> +			 * it.
> +			 */
> +			i += ret - (prev_nr - istate->cache_nr) - 1;
> +		} else if (ret == -1)
>  			return -1;
>
>  		if (err && !oneshot)
> diff --git a/merge-strategies.h b/merge-strategies.h
> index 88f476f170..8705a550ca 100644
> --- a/merge-strategies.h
> +++ b/merge-strategies.h
> @@ -3,6 +3,12 @@
>
>  #include "object.h"
>
> +int merge_three_way(struct index_state *istate,
> +		    const struct object_id *orig_blob,
> +		    const struct object_id *our_blob,
> +		    const struct object_id *their_blob, const char *path,
> +		    unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode);
> +
>  typedef int (*merge_fn)(struct index_state *istate,
>  			const struct object_id *orig_blob,
>  			const struct object_id *our_blob,
> @@ -10,6 +16,13 @@ typedef int (*merge_fn)(struct index_state *istate,
>  			unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
>  			void *data);
>
> +int merge_one_file_func(struct index_state *istate,
> +			const struct object_id *orig_blob,
> +			const struct object_id *our_blob,
> +			const struct object_id *their_blob, const char *path,
> +			unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
> +			void *data);
> +
>  int merge_index_path(struct index_state *istate, int oneshot, int quiet,
>  		     const char *path, merge_fn fn, void *data);
>  int merge_all_index(struct index_state *istate, int oneshot, int quiet,
> diff --git a/t/t6060-merge-index.sh b/t/t6060-merge-index.sh
> index d0cdfeddc1..d9c07965dc 100755
> --- a/t/t6060-merge-index.sh
> +++ b/t/t6060-merge-index.sh
> @@ -72,7 +72,7 @@ test_expect_success 'merge-one-file fails without a work tree' '
>  	(cd bare.git &&
>  	 GIT_INDEX_FILE=$PWD/merge.index &&
>  	 export GIT_INDEX_FILE &&
> -	 test_must_fail git merge-index git-merge-one-file -a
> +	 test_must_fail git merge-index --use=merge-one-file -a
>  	)
>  '
>
> diff --git a/t/t6415-merge-dir-to-symlink.sh b/t/t6415-merge-dir-to-symlink.sh
> index 2ce104aca7..075da1f55f 100755
> --- a/t/t6415-merge-dir-to-symlink.sh
> +++ b/t/t6415-merge-dir-to-symlink.sh
> @@ -97,7 +97,7 @@ test_expect_success SYMLINKS 'a/b was resolved as symlink' '
>  	test -h a/b
>  '
>
> -test_expect_failure 'do not lose untracked in merge (resolve)' '
> +test_expect_success 'do not lose untracked in merge (resolve)' '
>  	git reset --hard &&
>  	git checkout baseline^0 &&
>  	>a/b/c/e &&
> --
> 2.31.0
>
>


* Re: [PATCH v7 03/15] t6060: add tests for removed files
  2021-03-22 21:36               ` Johannes Schindelin
@ 2021-03-23 20:43                 ` Alban Gruin
  0 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2021-03-23 20:43 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, Junio C Hamano, Phillip Wood, Derrick Stolee

Hi Johannes,

On 22/03/2021 at 22:36, Johannes Schindelin wrote:
> Hi Alban,
> 
> On Wed, 17 Mar 2021, Alban Gruin wrote:
> 
>> Until now, t6060 did not not check git-mere-one-file's behaviour when a
> 
> Channeling my inner Eric Sunshine: s/mere-one/merge-one/ ;-)
> 

Good catch.

>> file is deleted in a branch.  To avoid regressions on this during the
>> conversion, this adds a new file, `file3', in the commit tagged as `base', and
> 
> Maybe "during the conversion from shell script to C"?
> 

I'll rewrite it as "during the conversion from shell to C".

Cheers,
Alban

> Other than that, looks good to me! Thanks,
> Dscho
> 



* Re: [PATCH v7 07/15] update-index: move add_cacheinfo() to read-cache.c
  2021-03-22 21:59               ` Johannes Schindelin
@ 2021-03-23 20:45                 ` Alban Gruin
  0 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2021-03-23 20:45 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, Junio C Hamano, Phillip Wood, Derrick Stolee

Hi Johannes,

On 22/03/2021 at 22:59, Johannes Schindelin wrote:
> Hi Alban,
> 
> On Wed, 17 Mar 2021, Alban Gruin wrote:
> 
>> This moves the function add_cacheinfo() that already exists in
>> update-index.c to read-cache.c, renames it add_to_index_cacheinfo(),
>> and adds an `istate' parameter.  The new cache entry is returned through
>> a pointer passed in the parameters.  The return value is either 0
>> (success), -1 (invalid path), or -2 (failed to add the file in the
>> index).
> 
> This paragraph still talks about magic numbers, but the code has constants
> for them. Maybe elevate the commit message to a more generic description
> that does not spend time on specifying the exact values, but rather lists
> the three outcomes in plain English?
> 

Okay, I'll do this.

Cheers,
Alban

> Other than that, this looks fine to me! Thanks,
> Dscho
> 



* Re: [PATCH v7 08/15] merge-one-file: rewrite in C
  2021-03-22 22:20               ` Johannes Schindelin
@ 2021-03-23 20:53                 ` Alban Gruin
  2021-03-24  9:10                   ` Johannes Schindelin
  0 siblings, 1 reply; 221+ messages in thread
From: Alban Gruin @ 2021-03-23 20:53 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, Junio C Hamano, Phillip Wood, Derrick Stolee

Hi Johannes,

On 22/03/2021 at 23:20, Johannes Schindelin wrote:
> Hi Alban,
> 
> On Wed, 17 Mar 2021, Alban Gruin wrote:
> 

>> @@ -69,10 +69,13 @@ int cmd_merge_index(int argc, const char **argv, const char *prefix)
>>
>>  	if (skip_prefix(pgm, "--use=", &use_internal)) {
>>  		if (!strcmp(use_internal, "merge-one-file"))
>> -			pgm = "git-merge-one-file";
>> +			merge_action = merge_one_file_func;
>>  		else
>>  			die(_("git merge-index: unknown internal program %s"), use_internal);
>> -	}
>> +
>> +		repo_hold_locked_index(r, &lock, LOCK_DIE_ON_ERROR);
>> +	} else
>> +		merge_action = merge_one_file_spawn;
> 
> I would have a slight preference to keep the default initializer, because
> that makes it easier to reason about. But if you _want_ to keep this patch
> as-is, I won't object.
> 

Yeah, not sure why I did this.  I'll change this.

> It is a bit sad that the conversion cannot be done more incrementally, as
> there is a lot to unpack in the many different cases that are handled. It
> looks correct, though.
> 
> Just one thing:
> 
>>
>>  	for (; i < argc; i++) {
>>  		const char *arg = argv[i];
>> diff --git a/builtin/merge-one-file.c b/builtin/merge-one-file.c
>> new file mode 100644
>> index 0000000000..ad99c6dbd4
>> --- /dev/null
>> +++ b/builtin/merge-one-file.c
>> @@ -0,0 +1,94 @@
>> +/*
>> + * Builtin "git merge-one-file"
>> + *
>> + * Copyright (c) 2020 Alban Gruin
>> + *
>> + * Based on git-merge-one-file.sh, written by Linus Torvalds.
>> + *
>> + * This is the git per-file merge utility, called with
>> + *
>> + *   argv[1] - original file object name (or empty)
>> + *   argv[2] - file in branch1 object name (or empty)
>> + *   argv[3] - file in branch2 object name (or empty)
>> + *   argv[4] - pathname in repository
>> + *   argv[5] - original file mode (or empty)
>> + *   argv[6] - file in branch1 mode (or empty)
>> + *   argv[7] - file in branch2 mode (or empty)
>> + *
>> + * Handle some trivial cases. The _really_ trivial cases have been
>> + * handled already by git read-tree, but that one doesn't do any merges
>> + * that might change the tree layout.
>> + */
>> +
>> +#include "cache.h"
>> +#include "builtin.h"
>> +#include "lockfile.h"
>> +#include "merge-strategies.h"
>> +
>> +static const char builtin_merge_one_file_usage[] =
>> +	"git merge-one-file <orig blob> <our blob> <their blob> <path> "
>> +	"<orig mode> <our mode> <their mode>\n\n"
>> +	"Blob ids and modes should be empty for missing files.";
>> +
>> +static int read_mode(const char *name, const char *arg, unsigned int *mode)
>> +{
>> +	char *last;
>> +	int ret = 0;
>> +
>> +	*mode = strtol(arg, &last, 8);
>> +
>> +	if (*last)
>> +		ret = error(_("invalid '%s' mode: expected nothing, got '%c'"), name, *last);
>> +	else if (!(S_ISREG(*mode) || S_ISDIR(*mode) || S_ISLNK(*mode)))
>> +		ret = error(_("invalid '%s' mode: %o"), name, *mode);
>> +
>> +	return ret;
>> +}
>> +
>> +int cmd_merge_one_file(int argc, const char **argv, const char *prefix)
>> +{
>> +	struct object_id orig_blob, our_blob, their_blob,
>> +		*p_orig_blob = NULL, *p_our_blob = NULL, *p_their_blob = NULL;
>> +	unsigned int orig_mode = 0, our_mode = 0, their_mode = 0, ret = 0;
>> +	struct lock_file lock = LOCK_INIT;
>> +	struct repository *r = the_repository;
>> +
>> +	if (argc != 8)
>> +		usage(builtin_merge_one_file_usage);
>> +
>> +	if (repo_read_index(r) < 0)
>> +		die("invalid index");
>> +
>> +	repo_hold_locked_index(r, &lock, LOCK_DIE_ON_ERROR);
>> +
>> +	if (!get_oid_hex(argv[1], &orig_blob)) {
>> +		p_orig_blob = &orig_blob;
>> +		ret = read_mode("orig", argv[5], &orig_mode);
>> +	} else if (!*argv[1] && *argv[5])
>> +		ret = error(_("no 'orig' object id given, but a mode was still given."));
> 
> Here, it looks as if the case of an empty `argv[1]` is not handled
> _explicitly_, but we rely on `get_oid_hex()` to return non-zero, and then
> we rely on the second arm _also_ not re-assigning `orig_blob`.
> 
> I wonder whether this could be checked, and whether it would make sense to
> fold this, along with most of these 5 lines, into the `read_mode()` helper
> function (DRYing up the code even further).
> 

Do you mean rewriting the first condition to read like this:

    if (*argv[1] && !get_oid_hex(argv[1], &orig_blob)) {

?

In which case yes, I can do that.
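
To make the "handle the empty argument explicitly" idea concrete, here is a
minimal standalone sketch. It is not the code from the patch:
parse_mode_arg() and its three-way return convention are assumptions made
for illustration, and it prints with fprintf() instead of git's error().

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>

/*
 * Hypothetical helper, not the code from the patch: parse an octal file
 * mode such as "100644".  An empty string means "no file on this side"
 * and is reported separately from a malformed value.
 *
 * Returns 1 if a valid mode was parsed, 0 if `arg` was empty,
 * and -1 on error.
 */
static int parse_mode_arg(const char *name, const char *arg, unsigned int *mode)
{
	char *end;
	unsigned long value;

	*mode = 0;
	if (!*arg)
		return 0;	/* explicitly handled: nothing was given */

	errno = 0;
	value = strtoul(arg, &end, 8);
	/* Reject trailing garbage and values too large to be a file mode. */
	if (errno || *end || value > 0177777) {
		fprintf(stderr, "error: invalid '%s' mode: '%s'\n", name, arg);
		return -1;
	}

	/* Same file types the quoted read_mode() accepts. */
	if (!(S_ISREG(value) || S_ISDIR(value) || S_ISLNK(value))) {
		fprintf(stderr, "error: invalid '%s' mode: %lo\n", name, value);
		return -1;
	}

	*mode = (unsigned int)value;
	return 1;
}
```

With a helper shaped like this, the caller can tell "empty", "valid" and
"malformed" apart without relying on get_oid_hex() failing on an empty
string.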

BTW the last two calls to read_mode() should look like

    err |= read_mode(…);

Cheers,
Alban

> As for the rest of the patch, it is totally possible that I missed a bug,
> but it looks correct to me, and the added regression tests give me a good
> feeling about the patch, too.
> 
> Thanks,
> Dscho
> 


* Re: [PATCH v7 09/15] merge-resolve: rewrite in C
  2021-03-17 20:49             ` [PATCH v7 09/15] merge-resolve: " Alban Gruin
@ 2021-03-23 22:21               ` Johannes Schindelin
  2021-04-10 14:17                 ` Alban Gruin
  0 siblings, 1 reply; 221+ messages in thread
From: Johannes Schindelin @ 2021-03-23 22:21 UTC (permalink / raw)
  To: Alban Gruin; +Cc: git, Junio C Hamano, Phillip Wood, Derrick Stolee

Hi Alban,

On Wed, 17 Mar 2021, Alban Gruin wrote:

> diff --git a/merge-strategies.c b/merge-strategies.c
> index 2717af51fd..a51700dae5 100644
> --- a/merge-strategies.c
> +++ b/merge-strategies.c
> @@ -272,3 +275,95 @@ int merge_all_index(struct index_state *istate, int oneshot, int quiet,
>
>  	return err;
>  }
> +
> +static int fast_forward(struct repository *r, struct tree_desc *t,
> +			int nr, int aggressive)
> +{
> +	struct unpack_trees_options opts;
> +	struct lock_file lock = LOCK_INIT;
> +
> +	refresh_index(r->index, REFRESH_QUIET, NULL, NULL, NULL);
> +	repo_hold_locked_index(r, &lock, LOCK_DIE_ON_ERROR);

Shouldn't we lock the index first, and _then_ refresh it? I guess not,
seeing as we don't do that either in `cmd_status()`: there, we also
refresh the index and _then_ lock it.

> +
> +	memset(&opts, 0, sizeof(opts));
> +	opts.head_idx = 1;
> +	opts.src_index = r->index;
> +	opts.dst_index = r->index;
> +	opts.merge = 1;
> +	opts.update = 1;
> +	opts.aggressive = aggressive;
> +
> +	if (nr == 1)
> +		opts.fn = oneway_merge;
> +	else if (nr == 2) {
> +		opts.fn = twoway_merge;
> +		opts.initial_checkout = is_index_unborn(r->index);
> +	} else if (nr >= 3) {
> +		opts.fn = threeway_merge;
> +		opts.head_idx = nr - 1;
> +	}

Given the function's name `fast_forward()`, I have to admit that I
somewhat stumbled over these merges.

> +
> +	if (unpack_trees(nr, t, &opts))
> +		return -1;
> +
> +	if (write_locked_index(r->index, &lock, COMMIT_LOCK))
> +		return error(_("unable to write new index file"));
> +
> +	return 0;
> +}
> +
> +static int add_tree(struct tree *tree, struct tree_desc *t)
> +{
> +	if (parse_tree(tree))
> +		return -1;
> +
> +	init_tree_desc(t, tree->buffer, tree->size);
> +	return 0;
> +}

This is a really trivial helper, but it is used a couple times below, so
it makes sense to have it encapsulated in a separate function.

> +
> +int merge_strategies_resolve(struct repository *r,
> +			     struct commit_list *bases, const char *head_arg,
> +			     struct commit_list *remote)

Since it is a list, and since the original variable in the shell script
had been named in the plural form, let's do the same here: `remotes`.

> +{
> +	struct tree_desc t[MAX_UNPACK_TREES];
> +	struct object_id head, oid;
> +	struct commit_list *i;
> +	int nr = 0;
> +
> +	if (head_arg)
> +		get_oid(head_arg, &head);
> +
> +	puts(_("Trying simple merge."));

Good. Usually I would recommend to print this to `stderr`, but the
original script prints it to `stdout`, so we should do that here, too.

> +
> +	for (i = bases; i && i->item; i = i->next) {
> +		if (add_tree(repo_get_commit_tree(r, i->item), t + (nr++)))
> +			return 2;

Since we're talking about a library function, not a `cmd_*()` function,
the return value on error should probably be negative.

Even better would be to let the function return an `enum` that contains
labels with more intuitive meaning than "2".

It _is_ the expected exit code when calling `git merge-resolve`, of course
(because of the `|| exit 2` after that `read-tree` call), but I wonder
whether a better layer for that `2` would be the `cmd_merge_resolve()`
function, letting `merge_strategies_resolve()` report failures in a more
fine-grained fashion.

> +	}
> +
> +	if (head_arg) {

It would probably be easier to read if the `if (head_arg)` clause above
was merged into this here clause.

> +		struct tree *tree = parse_tree_indirect(&head);
> +		if (add_tree(tree, t + (nr++)))
> +			return 2;
> +	}
> +
> +	if (remote && add_tree(repo_get_commit_tree(r, remote->item), t + (nr++)))
> +		return 2;

You get away with assuming that `remotes` only contains at most a single
entry because `cmd_merge_resolve()` verified it.

However, as the intention is to use this as a library function, I think
the input validation needs to be moved here instead of relying on all
callers to verify that they send at most one "remote" ref.
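
The two suggestions above (an `enum` return value, and validating the input
in the library function itself) could be sketched roughly as follows. This
is a standalone illustration under stated assumptions: `struct rev_list`,
the `RESOLVE_*` labels, resolve_sketch() and resolve_exit_code() are all
made-up names, and the actual merge is replaced by a placeholder flag.

```c
#include <stddef.h>

/* Hypothetical stand-in for git's `struct commit_list`. */
struct rev_list {
	const char *name;
	struct rev_list *next;
};

/* Hypothetical result labels; the names are illustrative only. */
enum resolve_result {
	RESOLVE_CLEAN = 0,		/* merged without conflicts */
	RESOLVE_CONFLICTS = 1,		/* merged, conflicts left to resolve */
	RESOLVE_INVALID_INPUT = -1,	/* e.g. more than one remote */
	RESOLVE_FAILED = -2		/* could not read trees or write index */
};

static size_t rev_list_count(const struct rev_list *l)
{
	size_t n = 0;

	for (; l; l = l->next)
		n++;
	return n;
}

/* Library layer: validates its own input and reports negative errors. */
static enum resolve_result resolve_sketch(const struct rev_list *remotes,
					  int had_conflicts)
{
	if (rev_list_count(remotes) > 1)
		return RESOLVE_INVALID_INPUT;

	/* The real tree merge would happen here; this is a placeholder. */
	return had_conflicts ? RESOLVE_CONFLICTS : RESOLVE_CLEAN;
}

/* Command layer: translate results into the exit codes scripts expect. */
static int resolve_exit_code(enum resolve_result res)
{
	switch (res) {
	case RESOLVE_CLEAN:
		return 0;
	case RESOLVE_CONFLICTS:
		return 1;
	default:
		return 2;	/* every hard failure maps to "exit 2" */
	}
}
```

The point of the split is that only the command layer knows about the
historical exit code 2, while library callers get a labelled result.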

Other than that, this patch looks good to me.

Thanks,
Dscho

> +
> +	if (fast_forward(r, t, nr, 1))
> +		return 2;
> +
> +	if (write_index_as_tree(&oid, r->index, r->index_file,
> +				WRITE_TREE_SILENT, NULL)) {
> +		int ret;
> +		struct lock_file lock = LOCK_INIT;
> +
> +		puts(_("Simple merge failed, trying Automatic merge."));
> +		repo_hold_locked_index(r, &lock, LOCK_DIE_ON_ERROR);
> +		ret = merge_all_index(r->index, 1, 0, merge_one_file_func, NULL);
> +
> +		write_locked_index(r->index, &lock, COMMIT_LOCK);
> +		return !!ret;
> +	}
> +
> +	return 0;
> +}
> diff --git a/merge-strategies.h b/merge-strategies.h
> index 8705a550ca..bba4bf999c 100644
> --- a/merge-strategies.h
> +++ b/merge-strategies.h
> @@ -1,6 +1,7 @@
>  #ifndef MERGE_STRATEGIES_H
>  #define MERGE_STRATEGIES_H
>
> +#include "commit.h"
>  #include "object.h"
>
>  int merge_three_way(struct index_state *istate,
> @@ -28,4 +29,8 @@ int merge_index_path(struct index_state *istate, int oneshot, int quiet,
>  int merge_all_index(struct index_state *istate, int oneshot, int quiet,
>  		    merge_fn fn, void *data);
>
> +int merge_strategies_resolve(struct repository *r,
> +			     struct commit_list *bases, const char *head_arg,
> +			     struct commit_list *remote);
> +
>  #endif /* MERGE_STRATEGIES_H */
> --
> 2.31.0
>
>


* Re: [PATCH v7 11/15] merge-octopus: rewrite in C
  2021-03-17 20:49             ` [PATCH v7 11/15] merge-octopus: rewrite in C Alban Gruin
@ 2021-03-23 23:58               ` Johannes Schindelin
  0 siblings, 0 replies; 221+ messages in thread
From: Johannes Schindelin @ 2021-03-23 23:58 UTC (permalink / raw)
  To: Alban Gruin; +Cc: git, Junio C Hamano, Phillip Wood, Derrick Stolee

Hi Alban,

On Wed, 17 Mar 2021, Alban Gruin wrote:

> This rewrites `git merge-octopus' from shell to C.  As for the two last
> conversions, this port removes calls to external processes to avoid
> reading and writing the index over and over again.
>
>  - Calls to `read-tree -u -m (--aggressive)?' are replaced by calls to
>    unpack_trees().
>
>  - The call to `write-tree' is replaced by a call to
>    write_index_as_tree().
>
>  - The call to `diff-index ...' is replaced by a call to
>    repo_index_has_changes().
>
>  - The call to `merge-index', needed to invoke `git merge-one-file', is
>    replaced by a call to merge_all_index().
>
> The index is read in cmd_merge_octopus(), and is wrote back by

s/wrote/written/

> merge_strategies_octopus().

I wonder why, though. Maybe the commit message could clarify that?

> Here to, merge_strategies_octopus() takes two commit lists and a string

s/to,/too,/

> to reduce frictions when try_merge_strategies() will be modified to call

s/frictions/friction/

> it directly.
>
> Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
> ---
>
> [...]
> diff --git a/builtin/merge-octopus.c b/builtin/merge-octopus.c
> new file mode 100644
> index 0000000000..9b9939b6b2
> --- /dev/null
> +++ b/builtin/merge-octopus.c
> @@ -0,0 +1,70 @@
> +/*
> + * Builtin "git merge-octopus"
> + *
> + * Copyright (c) 2020 Alban Gruin
> + *
> + * Based on git-merge-octopus.sh, written by Junio C Hamano.
> + *
> + * Resolve two or more trees.
> + */
> +
> +#include "cache.h"
> +#include "builtin.h"
> +#include "commit.h"
> +#include "merge-strategies.h"
> +
> +static const char builtin_merge_octopus_usage[] =
> +	"git merge-octopus [<bases>...] -- <head> <remote1> <remote2> [<remotes>...]";
> +
> +int cmd_merge_octopus(int argc, const char **argv, const char *prefix)
> +{
> +	int i, sep_seen = 0;
> +	struct commit_list *bases = NULL, *remotes = NULL;
> +	struct commit_list **next_base = &bases, **next_remote = &remotes;
> +	const char *head_arg = NULL;
> +	struct repository *r = the_repository;
> +
> +	if (argc < 5)
> +		usage(builtin_merge_octopus_usage);
> +
> +	setup_work_tree();
> +	if (repo_read_index(r) < 0)
> +		die("invalid index");
> +
> +	/*
> +	 * The first parameters up to -- are merge bases; the rest are
> +	 * heads.
> +	 */
> +	for (i = 1; i < argc; i++) {
> +		if (strcmp(argv[i], "--") == 0)
> +			sep_seen = 1;
> +		else if (strcmp(argv[i], "-h") == 0)
> +			usage(builtin_merge_octopus_usage);
> +		else if (sep_seen && !head_arg)
> +			head_arg = argv[i];
> +		else {
> +			struct object_id oid;
> +			struct commit *commit;
> +
> +			if (get_oid(argv[i], &oid))
> +				die("object %s not found.", argv[i]);
> +
> +			commit = oideq(&oid, r->hash_algo->empty_tree) ?
> +				NULL : lookup_commit_or_die(&oid, argv[i]);
> +
> +			if (sep_seen)
> +				next_remote = commit_list_append(commit, next_remote);
> +			else
> +				next_base = commit_list_append(commit, next_base);
> +		}
> +	}
> +
> +	/*
> +	 * Reject if this is not an octopus -- resolve should be used
> +	 * instead.
> +	 */
> +	if (commit_list_count(remotes) < 2)
> +		return 2;

As with `merge-resolve`, I would suggest to:

- move this input validation down to `merge_strategies_octopus()`, and
- change that function's signature to return an `enum`, and then
- make sure that that `enum` uses easy-to-understand labels.

> +
> +	return merge_strategies_octopus(r, bases, head_arg, remotes);
> +}
>
> [...]
>
> diff --git a/merge-strategies.c b/merge-strategies.c
> index a51700dae5..ebc0d0b1e2 100644
> --- a/merge-strategies.c
> +++ b/merge-strategies.c
> @@ -367,3 +368,177 @@ int merge_strategies_resolve(struct repository *r,
>
>  	return 0;
>  }
> +
> +static int write_tree(struct repository *r, struct tree **reference_tree)
> +{
> +	struct object_id oid;
> +	int ret;
> +
> +	if (!(ret = write_index_as_tree(&oid, r->index, r->index_file,
> +					WRITE_TREE_SILENT, NULL)))
> +		*reference_tree = lookup_tree(r, &oid);
> +
> +	return ret;
> +}
> +
> +static int octopus_fast_forward(struct repository *r, const char *branch_name,
> +				struct tree *tree_head, struct tree *current_tree,
> +				struct tree **reference_tree)

While I objected to the name of the `fast_forward()` function, I think the
`octopus_fast_forward()` function is named aptly.

> +{
> +	/*
> +	 * The first head being merged was a fast-forward.  Advance the
> +	 * reference commit to the head being merged, and use that tree
> +	 * as the intermediate result of the merge.  We still need to
> +	 * count this as part of the parent set.
> +	 */
> +	struct tree_desc t[2];
> +
> +	printf(_("Fast-forwarding to: %s\n"), branch_name);
> +
> +	init_tree_desc(t, tree_head->buffer, tree_head->size);
> +	if (add_tree(current_tree, t + 1))
> +		return -1;
> +	if (fast_forward(r, t, 2, 0))
> +		return -1;
> +	if (write_tree(r, reference_tree))
> +		return -1;
> +
> +	return 0;
> +}
> +
> +static int octopus_do_merge(struct repository *r, const char *branch_name,
> +			    struct commit_list *common, struct tree *current_tree,
> +			    struct tree **reference_tree)
> +{
> +	struct tree_desc t[MAX_UNPACK_TREES];
> +	struct commit_list *i;
> +	int nr = 0, ret = 0;
> +
> +	printf(_("Trying simple merge with %s\n"), branch_name);
> +
> +	for (i = common; i; i = i->next) {
> +		struct tree *tree = repo_get_commit_tree(r, i->item);
> +		if (add_tree(tree, t + (nr++)))
> +			return -1;
> +	}
> +
> +	if (add_tree(*reference_tree, t + (nr++)))
> +		return -1;
> +	if (add_tree(current_tree, t + (nr++)))
> +		return -1;
> +	if (fast_forward(r, t, nr, 1))
> +		return 2;
> +
> +	if (write_tree(r, reference_tree)) {
> +		struct lock_file lock = LOCK_INIT;
> +
> +		puts(_("Simple merge did not work, trying automatic merge."));
> +		repo_hold_locked_index(r, &lock, LOCK_DIE_ON_ERROR);

It is a bit funny to see this as the only time in this patch where the
index is locked, and it is immediately released thereafter.

I would have expected the lock to be taken first thing in
`merge_strategies_octopus()` and then being committed only on success, or
on failure to merge.

> +		ret = !!merge_all_index(r->index, 0, 0, merge_one_file_func, NULL);
> +		write_locked_index(r->index, &lock, COMMIT_LOCK);
> +
> +		write_tree(r, reference_tree);
> +	}
> +
> +	return ret;
> +}
> +
> +int merge_strategies_octopus(struct repository *r,
> +			     struct commit_list *bases, const char *head_arg,
> +			     struct commit_list *remotes)
> +{
> +	int ff_merge = 1, ret = 0, nr_references = 1;
> +	struct commit **reference_commits, *head_commit;
> +	struct tree *reference_tree, *head_tree;
> +	struct commit_list *i;
> +	struct object_id head;
> +	struct strbuf sb = STRBUF_INIT;
> +
> +	get_oid(head_arg, &head);
> +	head_commit = lookup_commit_reference(r, &head);
> +	head_tree = repo_get_commit_tree(r, head_commit);
> +
> +	if (parse_tree(head_tree))
> +		return 2;
> +
> +	if (repo_index_has_changes(r, head_tree, &sb)) {
> +		error(_("Your local changes to the following files "
> +			"would be overwritten by merge:\n  %s"),
> +		      sb.buf);
> +		strbuf_release(&sb);
> +		return 2;
> +	}
> +
> +	CALLOC_ARRAY(reference_commits, commit_list_count(remotes) + 1);
> +	reference_commits[0] = head_commit;
> +	reference_tree = head_tree;
> +
> +	for (i = remotes; i && i->item; i = i->next) {
> +		struct commit *c = i->item;
> +		struct object_id *oid = &c->object.oid;
> +		struct tree *current_tree = repo_get_commit_tree(r, c);
> +		struct commit_list *common, *j;
> +		char *branch_name = merge_get_better_branch_name(oid_to_hex(oid));
> +		int up_to_date = 0;
> +
> +		common = repo_get_merge_bases_many(r, c, nr_references, reference_commits);
> +		if (!common) {
> +			error(_("Unable to find common commit with %s"), branch_name);
> +
> +			free(branch_name);
> +			free_commit_list(common);
> +			free(reference_commits);
> +
> +			return 2;
> +		}
> +
> +		for (j = common; j && !up_to_date && ff_merge; j = j->next) {
> +			up_to_date |= oideq(&j->item->object.oid, oid);

Semantically, I would argue that this is an `||=`, not `|=`: we want a
Boolean "or", not a bit-wise one.

> +
> +			if (!j->next &&
> +			    !oideq(&j->item->object.oid,
> +				   &reference_commits[nr_references - 1]->object.oid))
> +				ff_merge = 0;
> +		}

Hmm. This is combining two things into the same loop, with a combined loop
condition. The two things are:

	case "$LF$common$LF" in
        *"$LF$SHA1$LF"*)
                eval_gettextln "Already up to date with \$pretty_name"
                continue
                ;;
        esac

        if test "$common,$NON_FF_MERGE" = "$MRC,0"
        then
                # The first head being merged was a fast-forward.
                # Advance MRC to the head being merged, and use that
                # tree as the intermediate result of the merge.
                # We still need to count this as part of the parent set.

                eval_gettextln "Fast-forwarding to: \$pretty_name"
                git read-tree -u -m $head $SHA1 || exit
                MRC=$SHA1 MRT=$(git write-tree)
                continue
        fi

        NON_FF_MERGE=1

The first one tries to verify that the `common` list contains `oid`. The C
code does this, too, using the intuitive variable name `up_to_date`, which
is good.

Now, big question: is there a way for the loop to exit before we had a
chance to see the common commit that is identical to `oid`? And I think
there is: `ff_merge` is not reset between iterations of the outer loop
(the one iterating over `remotes`). If that is the case, then we would
miss that we're already up to date.

Next thing is that `if test "$common,$NON_FF_MERGE" = "$MRC,0"` thing.
This is turned into that `if (!j->next && ...)` thing, and I _think_ that
it does the wrong thing. Rather than verifying that the `common` list
is identical to "MRC" (= the merge reference list), it would only ever
compare the last entries of `common` and MRC.

I have a hard time convincing myself that this is equivalent to the shell
script version.

Instead, I think it should read somewhat like this:

		for (j = common, k = 0; j && (!up_to_date || ff_merge); j = j->next) {
			if (oideq(&j->item->object.oid, oid))
				up_to_date = 1;

			if (ff_merge &&
			    (k >= nr_references ||
			     !oideq(&j->item->object.oid,
				    &reference_commits[k++]->object.oid)))
				ff_merge = 0;
		}

But quite honestly, this still looks "too clever" and too fragile to me.
For something as rare as an octopus merge, I'd _much_ rather have simpler
code that is easy to reason about and does the job reliably (if somewhat
slower than a hyper-optimized version):

		/*
		 * If `oid` is reachable from `HEAD`, we're already up to
		 * date.
		 */
		for (j = common; j; j = j->next)
			if (oideq(&j->item->object.oid, oid)) {
				up_to_date = 1;
				break;
			}

		if (up_to_date) {
			printf(_("Already up to date with %s\n"), branch_name);

			free(branch_name);
			free_commit_list(common);
			continue;
		}

		for (j = common, k = 0; ff_merge && j; j = j->next)
			if (k >= nr_references ||
			    !oideq(&j->item->object.oid,
				   &reference_commits[k++]->object.oid))
				ff_merge = 0;
		if (k != nr_references)
			ff_merge = 0;


But the more I stare at the shell script code, the more I start to believe
that this `MRC` business is just a very convoluted way to essentially
verify that the `HEAD` is the _single_ merge base.

I say that because I cannot fail to notice that `$common` separates the
merge bases by newlines, while `$MRC` separates its entries by spaces.
Therefore,

		test "$common,$NON_FF_MERGE" = "$MRC,0"

can only ever evaluate to `true` if both `$common` and `$MRC` contain
exactly one and the same oid, namely the one of the revision to which we
just fast-forwarded in the previous iteration.

Therefore, the logic does not even need a loop. It would be as trivial as:

		/*
		 * If we could fast-forward so far and `HEAD` is the
		 * single merge base with the current `remote` revision,
		 * keep fast-forwarding.
		 */
		if (ff_merge && common && !common->next && nr_references == 1 &&
		    oideq(&common->item->object.oid,
			  &reference_commits[0]->object.oid)) {
			ret = octopus_fast_forward(r, branch_name, head_tree,
						   current_tree, &reference_tree);
			nr_references = 0;
		} else {
			ff_merge = 0;
			ret = octopus_do_merge(r, branch_name, common,
					       current_tree, &reference_tree);
		}
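
To illustrate the two checks outside of git's data structures, here is a
small standalone sketch. Everything in it is an assumption made for the
example: `struct id_list` stands in for a commit list, object ids are plain
strings, and list_contains() and can_fast_forward() are hypothetical names.

```c
#include <string.h>

/* Hypothetical stand-in for a commit list; ids are plain strings here. */
struct id_list {
	const char *id;
	struct id_list *next;
};

/* Is `id` one of the merge bases?  Mirrors the "already up to date" test. */
static int list_contains(const struct id_list *l, const char *id)
{
	for (; l; l = l->next)
		if (!strcmp(l->id, id))
			return 1;
	return 0;
}

/*
 * Fast-forward is possible only while no real merge has happened yet
 * and the single merge base is the current reference commit.
 */
static int can_fast_forward(int ff_merge, const struct id_list *common,
			    const char *reference_id)
{
	return ff_merge && common && !common->next &&
	       !strcmp(common->id, reference_id);
}
```

Neither check needs to share a loop with the other, which is what makes
this form easier to compare against the shell script.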


> +
> +		if (up_to_date) {
> +			printf(_("Already up to date with %s\n"), branch_name);
> +
> +			free(branch_name);
> +			free_commit_list(common);
> +			continue;
> +		}
> +
> +		if (ff_merge) {
> +			ret = octopus_fast_forward(r, branch_name, head_tree,
> +						   current_tree, &reference_tree);
> +			nr_references = 0;
> +		} else {
> +			ret = octopus_do_merge(r, branch_name, common,
> +					       current_tree, &reference_tree);
> +		}
> +
> +		free(branch_name);
> +		free_commit_list(common);
> +
> +		if (ret == -1 || ret == 2)
> +			break;
> +		else if (ret && i->next) {
> +			/*
> +			 * We allow only last one to have a
> +			 * hand-resolvable conflicts.  Last round failed
> +			 * and we still had a head to merge.
> +			 */
> +			puts(_("Automated merge did not work."));
> +			puts(_("Should not be doing an octopus."));
> +
> +			free(reference_commits);
> +			return 2;

I see that you moved this block from the beginning of the loop to the end
(in the script, it was at the start of the loop). This is a good change.

I wonder, though, whether it wouldn't make more sense to replace the last
two lines with this:

			ret = 2;
			break;

That way, we need not worry about releasing resources in multiple places
in the future: it will all be done at the end of the function.
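
As a generic illustration of that pattern (not the octopus code itself),
here is a minimal sketch in which every failure sets `ret` and breaks out
of the loop, so the buffer is released in exactly one place. run_rounds(),
`fail_at` and `freed_count` are made-up names for the example.

```c
#include <stdlib.h>

/*
 * Hypothetical loop, not the octopus code: every failure sets `ret` and
 * breaks, so the buffer is released in exactly one place.
 * `fail_at` is the (made-up) round that fails; pass -1 for no failure.
 * `freed_count` only exists so a caller can observe the cleanup.
 */
static int run_rounds(int rounds, int fail_at, int *freed_count)
{
	int *scratch = calloc((size_t)rounds + 1, sizeof(*scratch));
	int ret = 0, i;

	if (!scratch)
		return -1;

	for (i = 0; i < rounds; i++) {
		if (i == fail_at) {
			ret = 2;	/* instead of free() + return 2 here */
			break;
		}
		scratch[i] = i;
	}

	free(scratch);		/* the only cleanup site */
	(*freed_count)++;
	return ret;
}
```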

Phew. What a lot to unpack.

Please let me express my gratitude for working on this. My many comments
may seem as if I am unhappy with the progress, but nothing could be
further from the truth. I am impressed by your tenacity, and I hope that I
could do my little bit to make this patch series as good as we can.

Thanks,
Dscho

> +		}
> +
> +		reference_commits[nr_references++] = c;
> +	}
> +
> +	free(reference_commits);
> +	return ret;
> +}
> diff --git a/merge-strategies.h b/merge-strategies.h
> index bba4bf999c..8de2249ee6 100644
> --- a/merge-strategies.h
> +++ b/merge-strategies.h
> @@ -32,5 +32,8 @@ int merge_all_index(struct index_state *istate, int oneshot, int quiet,
>  int merge_strategies_resolve(struct repository *r,
>  			     struct commit_list *bases, const char *head_arg,
>  			     struct commit_list *remote);
> +int merge_strategies_octopus(struct repository *r,
> +			     struct commit_list *bases, const char *head_arg,
> +			     struct commit_list *remote);
>
>  #endif /* MERGE_STRATEGIES_H */
> --
> 2.31.0
>
>


* Re: [PATCH v7 08/15] merge-one-file: rewrite in C
  2021-03-23 20:53                 ` Alban Gruin
@ 2021-03-24  9:10                   ` Johannes Schindelin
  2021-04-10 14:17                     ` Alban Gruin
  0 siblings, 1 reply; 221+ messages in thread
From: Johannes Schindelin @ 2021-03-24  9:10 UTC (permalink / raw)
  To: Alban Gruin; +Cc: git, Junio C Hamano, Phillip Wood, Derrick Stolee

Hi Alban,

On Tue, 23 Mar 2021, Alban Gruin wrote:

> On 22/03/2021 at 23:20, Johannes Schindelin wrote:
> >
> > On Wed, 17 Mar 2021, Alban Gruin wrote:
> >
> >>
> >>  	for (; i < argc; i++) {
> >>  		const char *arg = argv[i];
> >> diff --git a/builtin/merge-one-file.c b/builtin/merge-one-file.c
> >> new file mode 100644
> >> index 0000000000..ad99c6dbd4
> >> --- /dev/null
> >> +++ b/builtin/merge-one-file.c
> >> @@ -0,0 +1,94 @@
> >> +/*
> >> + * Builtin "git merge-one-file"
> >> + *
> >> + * Copyright (c) 2020 Alban Gruin
> >> + *
> >> + * Based on git-merge-one-file.sh, written by Linus Torvalds.
> >> + *
> >> + * This is the git per-file merge utility, called with
> >> + *
> >> + *   argv[1] - original file object name (or empty)
> >> + *   argv[2] - file in branch1 object name (or empty)
> >> + *   argv[3] - file in branch2 object name (or empty)
> >> + *   argv[4] - pathname in repository
> >> + *   argv[5] - original file mode (or empty)
> >> + *   argv[6] - file in branch1 mode (or empty)
> >> + *   argv[7] - file in branch2 mode (or empty)
> >> + *
> >> + * Handle some trivial cases. The _really_ trivial cases have been
> >> + * handled already by git read-tree, but that one doesn't do any merges
> >> + * that might change the tree layout.
> >> + */
> >> +
> >> +#include "cache.h"
> >> +#include "builtin.h"
> >> +#include "lockfile.h"
> >> +#include "merge-strategies.h"
> >> +
> >> +static const char builtin_merge_one_file_usage[] =
> >> +	"git merge-one-file <orig blob> <our blob> <their blob> <path> "
> >> +	"<orig mode> <our mode> <their mode>\n\n"
> >> +	"Blob ids and modes should be empty for missing files.";
> >> +
> >> +static int read_mode(const char *name, const char *arg, unsigned int *mode)
> >> +{
> >> +	char *last;
> >> +	int ret = 0;
> >> +
> >> +	*mode = strtol(arg, &last, 8);
> >> +
> >> +	if (*last)
> >> +		ret = error(_("invalid '%s' mode: expected nothing, got '%c'"), name, *last);
> >> +	else if (!(S_ISREG(*mode) || S_ISDIR(*mode) || S_ISLNK(*mode)))
> >> +		ret = error(_("invalid '%s' mode: %o"), name, *mode);
> >> +
> >> +	return ret;
> >> +}
> >> +
> >> +int cmd_merge_one_file(int argc, const char **argv, const char *prefix)
> >> +{
> >> +	struct object_id orig_blob, our_blob, their_blob,
> >> +		*p_orig_blob = NULL, *p_our_blob = NULL, *p_their_blob = NULL;
> >> +	unsigned int orig_mode = 0, our_mode = 0, their_mode = 0, ret = 0;
> >> +	struct lock_file lock = LOCK_INIT;
> >> +	struct repository *r = the_repository;
> >> +
> >> +	if (argc != 8)
> >> +		usage(builtin_merge_one_file_usage);
> >> +
> >> +	if (repo_read_index(r) < 0)
> >> +		die("invalid index");
> >> +
> >> +	repo_hold_locked_index(r, &lock, LOCK_DIE_ON_ERROR);
> >> +
> >> +	if (!get_oid_hex(argv[1], &orig_blob)) {
> >> +		p_orig_blob = &orig_blob;
> >> +		ret = read_mode("orig", argv[5], &orig_mode);
> >> +	} else if (!*argv[1] && *argv[5])
> >> +		ret = error(_("no 'orig' object id given, but a mode was still given."));
> >
> > Here, it looks as if the case of an empty `argv[1]` is not handled
> > _explicitly_, but we rely on `get_oid_hex()` to return non-zero, and then
> > we rely on the second arm _also_ not re-assigning `orig_blob`.
> >
> > I wonder whether this could be checked, and whether it would make sense to
> > fold this, along with most of these 5 lines, into the `read_mode()` helper
> > function (DRYing up the code even further).
> >
>
> Do you mean rewriting the first condition to read like this:
>
>     if (*argv[1] && !get_oid_hex(argv[1], &orig_blob)) {
>
> ?
>
> In which case yes, I can do that.

Yes, that's what I meant. Or this instead:

	if (!*argv[1]) {
		if (*argv[5])
			ret = error(... mode was still given ...)
	} else if (!get_oid_hex(...)) {
		...
	}

> BTW the two lasts calls to read_mode() should be like
>
>     err |= read_mode(…);

While this is certainly shorter than

	if (read_mode(...))
		ret = -1;

I actually prefer the latter, for clarity (we do want `read_mode()` to be
called, i.e. we cannot use `||=` here, but it is also not a bit-wise "or"
operation, therefore `|=` strikes me as misleading). What do you think?

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 221+ messages in thread

* Re: [PATCH v7 08/15] merge-one-file: rewrite in C
  2021-03-24  9:10                   ` Johannes Schindelin
@ 2021-04-10 14:17                     ` Alban Gruin
  0 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2021-04-10 14:17 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, Junio C Hamano, Phillip Wood, Derrick Stolee

Hi Johannes,

Le 24/03/2021 à 10:10, Johannes Schindelin a écrit :
> Hi Alban,
> 
> On Tue, 23 Mar 2021, Alban Gruin wrote:
> 
>> Le 22/03/2021 à 23:20, Johannes Schindelin a écrit :
>>>
>>> On Wed, 17 Mar 2021, Alban Gruin wrote:
>>>
>>>>
>>>>  	for (; i < argc; i++) {
>>>>  		const char *arg = argv[i];
>>>> diff --git a/builtin/merge-one-file.c b/builtin/merge-one-file.c
>>>> new file mode 100644
>>>> index 0000000000..ad99c6dbd4
>>>> --- /dev/null
>>>> +++ b/builtin/merge-one-file.c
>>>> @@ -0,0 +1,94 @@
>>>> +/*
>>>> + * Builtin "git merge-one-file"
>>>> + *
>>>> + * Copyright (c) 2020 Alban Gruin
>>>> + *
>>>> + * Based on git-merge-one-file.sh, written by Linus Torvalds.
>>>> + *
>>>> + * This is the git per-file merge utility, called with
>>>> + *
>>>> + *   argv[1] - original file object name (or empty)
>>>> + *   argv[2] - file in branch1 object name (or empty)
>>>> + *   argv[3] - file in branch2 object name (or empty)
>>>> + *   argv[4] - pathname in repository
>>>> + *   argv[5] - original file mode (or empty)
>>>> + *   argv[6] - file in branch1 mode (or empty)
>>>> + *   argv[7] - file in branch2 mode (or empty)
>>>> + *
>>>> + * Handle some trivial cases. The _really_ trivial cases have been
>>>> + * handled already by git read-tree, but that one doesn't do any merges
>>>> + * that might change the tree layout.
>>>> + */
>>>> +
>>>> +#include "cache.h"
>>>> +#include "builtin.h"
>>>> +#include "lockfile.h"
>>>> +#include "merge-strategies.h"
>>>> +
>>>> +static const char builtin_merge_one_file_usage[] =
>>>> +	"git merge-one-file <orig blob> <our blob> <their blob> <path> "
>>>> +	"<orig mode> <our mode> <their mode>\n\n"
>>>> +	"Blob ids and modes should be empty for missing files.";
>>>> +
>>>> +static int read_mode(const char *name, const char *arg, unsigned int *mode)
>>>> +{
>>>> +	char *last;
>>>> +	int ret = 0;
>>>> +
>>>> +	*mode = strtol(arg, &last, 8);
>>>> +
>>>> +	if (*last)
>>>> +		ret = error(_("invalid '%s' mode: expected nothing, got '%c'"), name, *last);
>>>> +	else if (!(S_ISREG(*mode) || S_ISDIR(*mode) || S_ISLNK(*mode)))
>>>> +		ret = error(_("invalid '%s' mode: %o"), name, *mode);
>>>> +
>>>> +	return ret;
>>>> +}
>>>> +
>>>> +int cmd_merge_one_file(int argc, const char **argv, const char *prefix)
>>>> +{
>>>> +	struct object_id orig_blob, our_blob, their_blob,
>>>> +		*p_orig_blob = NULL, *p_our_blob = NULL, *p_their_blob = NULL;
>>>> +	unsigned int orig_mode = 0, our_mode = 0, their_mode = 0, ret = 0;
>>>> +	struct lock_file lock = LOCK_INIT;
>>>> +	struct repository *r = the_repository;
>>>> +
>>>> +	if (argc != 8)
>>>> +		usage(builtin_merge_one_file_usage);
>>>> +
>>>> +	if (repo_read_index(r) < 0)
>>>> +		die("invalid index");
>>>> +
>>>> +	repo_hold_locked_index(r, &lock, LOCK_DIE_ON_ERROR);
>>>> +
>>>> +	if (!get_oid_hex(argv[1], &orig_blob)) {
>>>> +		p_orig_blob = &orig_blob;
>>>> +		ret = read_mode("orig", argv[5], &orig_mode);
>>>> +	} else if (!*argv[1] && *argv[5])
>>>> +		ret = error(_("no 'orig' object id given, but a mode was still given."));
>>>
>>> Here, it looks as if the case of an empty `argv[1]` is not handled
>>> _explicitly_, but we rely on `get_oid_hex()` to return non-zero, and then
>>> we rely on the second arm _also_ not re-assigning `orig_blob`.
>>>
>>> I wonder whether this could be checked, and whether it would make sense to
>>> fold this, along with most of these 5 lines, into the `read_mode()` helper
>>> function (DRYing up the code even further).
>>>
>>
>> Do you mean rewriting the first condition to read like this:
>>
>>     if (*argv[1] && !get_oid_hex(argv[1], &orig_blob)) {
>>
>> ?
>>
>> In which case yes, I can do that.
> 
> Yes, that's what I meant. Or this instead:
> 
> 	if (!*argv[1]) {
> 		if (*argv[5])
> 			ret = error(... mode was still given ...)
> 	} else if (!get_oid_hex(...)) {
> 		...
> 	}
> 
>> BTW the two lasts calls to read_mode() should be like
>>
>>     err |= read_mode(…);
> 
> While this is certainly shorter than
> 
> 	if (read_mode(...))
> 		ret = -1;
> 

So, I folded all of this into a single function that reads the mode,
converts the oid, and shows an error if needed.  Now, I have:

    if (read_param("orig", argv[1], argv[5], &orig_blob,
                   &p_orig_blob, &orig_mode))
        ret = -1;

    if (read_param("our", …))
        ret = -1;

    if (read_param("their", …))
        ret = -1;

    if (ret)
        return ret;


> I actually prefer the latter, for clarity (we do want `read_mode()` to be
> called, i.e. we cannot use `||=` here, but it is also not a bit-wise "or"
> operation, therefore `|=` strikes me as misleading). What do you think?
> 

Yes, I think it's much clearer that way.

FYI, `||=' does not exist in C.
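
To illustrate the point discussed above, here is a minimal,
self-contained sketch (not the code from the patch) contrasting the
"run every check, remember any failure" form with a short-circuiting
`||' chain.  check_field(), validate_all() and validate_short_circuit()
are hypothetical names used only for this illustration.

```c
#include <stdio.h>

/* Counts how often the (hypothetical) validator was invoked. */
static int calls;

/* Hypothetical validator: returns 0 on success, -1 on failure. */
static int check_field(const char *name, const char *value)
{
	calls++;
	if (!*value) {
		fprintf(stderr, "error: empty '%s'\n", name);
		return -1;
	}
	return 0;
}

/* Every check runs, and ret is only ever 0 or -1. */
static int validate_all(const char *orig, const char *our, const char *their)
{
	int ret = 0;

	if (check_field("orig", orig))
		ret = -1;
	if (check_field("our", our))
		ret = -1;
	if (check_field("their", their))
		ret = -1;

	return ret;
}

/* Short-circuit variant: later checks are skipped after the first failure. */
static int validate_short_circuit(const char *orig, const char *our,
				  const char *their)
{
	return (check_field("orig", orig) ||
		check_field("our", our) ||
		check_field("their", their)) ? -1 : 0;
}
```

With two empty arguments, validate_all() reports both problems, while
validate_short_circuit() stops after the first one.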

Cheers,
Alban

> Ciao,
> Dscho
> 


^ permalink raw reply	[flat|nested] 221+ messages in thread

* Re: [PATCH v7 09/15] merge-resolve: rewrite in C
  2021-03-23 22:21               ` Johannes Schindelin
@ 2021-04-10 14:17                 ` Alban Gruin
  0 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2021-04-10 14:17 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, Junio C Hamano, Phillip Wood, Derrick Stolee

Hi Johannes,

Le 23/03/2021 à 23:21, Johannes Schindelin a écrit :
> Hi Alban,
> 
> On Wed, 17 Mar 2021, Alban Gruin wrote:
> 
>> diff --git a/merge-strategies.c b/merge-strategies.c
>> index 2717af51fd..a51700dae5 100644
>> --- a/merge-strategies.c
>> +++ b/merge-strategies.c
>> @@ -272,3 +275,95 @@ int merge_all_index(struct index_state *istate, int oneshot, int quiet,
>>
>>  	return err;
>>  }
>> +
>> +static int fast_forward(struct repository *r, struct tree_desc *t,
>> +			int nr, int aggressive)
>> +{
>> +	struct unpack_trees_options opts;
>> +	struct lock_file lock = LOCK_INIT;
>> +
>> +	refresh_index(r->index, REFRESH_QUIET, NULL, NULL, NULL);
>> +	repo_hold_locked_index(r, &lock, LOCK_DIE_ON_ERROR);
> 
> Shouldn't we lock the index first, and _then_ refresh it? I guess not,
> seeing as we don't do that either in `cmd_status()`: there, we also
> refresh the index and _then_ lock it.
> 

Yeah, I don't think I saw a lock/refresh sequence, but I may be wrong.

>> +
>> +	memset(&opts, 0, sizeof(opts));
>> +	opts.head_idx = 1;
>> +	opts.src_index = r->index;
>> +	opts.dst_index = r->index;
>> +	opts.merge = 1;
>> +	opts.update = 1;
>> +	opts.aggressive = aggressive;
>> +
>> +	if (nr == 1)
>> +		opts.fn = oneway_merge;
>> +	else if (nr == 2) {
>> +		opts.fn = twoway_merge;
>> +		opts.initial_checkout = is_index_unborn(r->index);
>> +	} else if (nr >= 3) {
>> +		opts.fn = threeway_merge;
>> +		opts.head_idx = nr - 1;
>> +	}
> 
> Given the function's name `fast_forward()`, I have to admit that I
> somewhat stumbled over these merges.
>> +
>> +	if (unpack_trees(nr, t, &opts))
>> +		return -1;
>> +

I just noticed that the lock is not released if there is an error here.
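
A minimal sketch of the fix this implies, using a hypothetical
stand-in for the lock (struct demo_lock and the demo_*() helpers are
not git's API; in git itself the error path would presumably call
rollback_lock_file()):

```c
#include <stdio.h>

/* Hypothetical stand-in for a lock file; not git's struct lock_file. */
struct demo_lock {
	int held;
};

static void demo_lock_acquire(struct demo_lock *lk) { lk->held = 1; }
static void demo_lock_rollback(struct demo_lock *lk) { lk->held = 0; }
static int demo_lock_commit(struct demo_lock *lk) { lk->held = 0; return 0; }

/* step_fails simulates the merge step returning non-zero. */
static int demo_merge(struct demo_lock *lk, int step_fails)
{
	demo_lock_acquire(lk);

	if (step_fails) {
		/* Release the lock before bailing out. */
		demo_lock_rollback(lk);
		return -1;
	}

	if (demo_lock_commit(lk)) {
		fprintf(stderr, "error: unable to write new index file\n");
		return -1;
	}

	return 0;
}
```

Whichever path is taken, the lock is no longer held when demo_merge()
returns.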

>> +	if (write_locked_index(r->index, &lock, COMMIT_LOCK))
>> +		return error(_("unable to write new index file"));
>> +
>> +	return 0;
>> +}
>> +
>> +static int add_tree(struct tree *tree, struct tree_desc *t)
>> +{
>> +	if (parse_tree(tree))
>> +		return -1;
>> +
>> +	init_tree_desc(t, tree->buffer, tree->size);
>> +	return 0;
>> +}
> 
> This is a really trivial helper, but it is used a couple times below, so
> it makes sense to have it encapsulated in a separate function.
> 
>> +
>> +int merge_strategies_resolve(struct repository *r,
>> +			     struct commit_list *bases, const char *head_arg,
>> +			     struct commit_list *remote)
> 
> Since it is a list, and since the original variable in the shell script
> had been named in the plural form, let's do the same here: `remotes`.
> 

This one is supposed to contain only one commit, so I'm not really
convinced that this parameter should be in the plural form.

>> +{
>> +	struct tree_desc t[MAX_UNPACK_TREES];
>> +	struct object_id head, oid;
>> +	struct commit_list *i;
>> +	int nr = 0;
>> +
>> +	if (head_arg)
>> +		get_oid(head_arg, &head);
>> +
>> +	puts(_("Trying simple merge."));
> 
> Good. Usually I would recommend to print this to `stderr`, but the
> original script prints it to `stdout`, so we should do that here, too.
> 
>> +
>> +	for (i = bases; i && i->item; i = i->next) {
>> +		if (add_tree(repo_get_commit_tree(r, i->item), t + (nr++)))
>> +			return 2;
> 
> Since we're talking about a library function, not a `cmd_*()` function,
> the return value on error should probably be negative.
> 
> Even better would be to let the function return an `enum` that contains
> labels with more intuitive meaning than "2".
> 
> It _is_ the expected exit code when calling `git merge-resolve`, of course
> (because of the `|| exit 2` after that `read-tree` call), but I wonder
> whether a better layer for that `2` would be the `cmd_merge_resolve()`
> function, letting `merge_strategies_resolve()` report failures in a more
> fine-grained fashion.
> 

Right -- I'll see what I can do here.
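
One possible shape for this, as a sketch only: the library function
returns a small enum, and the builtin translates it into the exit code.
The enum labels and both function names are hypothetical.  Only the
exit code 2 for "cannot handle this merge" comes from the discussion
above; the values 0 and 1 are assumptions made for the illustration.

```c
/* Hypothetical result codes; the names are illustrative only. */
enum demo_resolve_result {
	DEMO_RESOLVE_CANNOT_HANDLE = -1, /* e.g. a tree could not be read */
	DEMO_RESOLVE_OK = 0,
	DEMO_RESOLVE_CONFLICTS = 1,
};

/* Stand-in for the library function: reports what happened. */
static enum demo_resolve_result demo_resolve(int trees_ok, int conflicts)
{
	if (!trees_ok)
		return DEMO_RESOLVE_CANNOT_HANDLE;
	return conflicts ? DEMO_RESOLVE_CONFLICTS : DEMO_RESOLVE_OK;
}

/* Stand-in for the builtin: only this layer knows about exit codes. */
static int demo_cmd_exit_code(enum demo_resolve_result res)
{
	switch (res) {
	case DEMO_RESOLVE_OK:
		return 0;
	case DEMO_RESOLVE_CONFLICTS:
		return 1;
	case DEMO_RESOLVE_CANNOT_HANDLE:
		return 2;
	}
	return 2;
}
```

This keeps the "2" in the command layer, and lets other callers of the
library function tell the failure modes apart.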

>> +	}
>> +
>> +	if (head_arg) {
> 
> It would probably be easier to read if the `if (head_arg)` clause above
> was merged into this here clause.
> 
>> +		struct tree *tree = parse_tree_indirect(&head);
>> +		if (add_tree(tree, t + (nr++)))
>> +			return 2;
>> +	}
>> +
>> +	if (remote && add_tree(repo_get_commit_tree(r, remote->item), t + (nr++)))
>> +		return 2;
> 
> You get away with assuming that `remotes` only contains at most a single
> entry because `cmd_merge_resolve()` verified it.
> 
> However, as the intention is to use this as a library function, I think
> the input validation needs to be moved here instead of relying on all
> callers to verify that they send at most one "remote" ref.
> 
> Other than that, this patch looks good to me.
> 
Well, this condition checks that there is one commit, and if so, uses it
to call add_tree().  I don't see the mistake here.
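
For reference, the validation being asked for could look roughly like
the sketch below.  It uses a hypothetical list type
(struct demo_commit_list) rather than git's struct commit_list, and a
hypothetical helper name; the two conditions mirror the checks
currently done in cmd_merge_resolve() (more than one remote, or no
merge base).

```c
#include <stddef.h>

/* Hypothetical minimal list type, standing in for a commit list. */
struct demo_commit_list {
	int item;
	struct demo_commit_list *next;
};

/* Returns 0 if the input can be handled by a two-head merge, -1 otherwise. */
static int demo_resolve_check(const struct demo_commit_list *bases,
			      const struct demo_commit_list *remote)
{
	/* Two or more remotes: this would be an octopus merge. */
	if (remote && remote->next)
		return -1;

	/* No merge base at all: a baseless merge. */
	if (!bases)
		return -1;

	return 0;
}
```

With the check inside the library function, every caller gets the same
guarantee without having to repeat it.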

Cheers,
Alban

> Thanks,
> Dscho
> 



^ permalink raw reply	[flat|nested] 221+ messages in thread

* [PATCH v8 00/14] Rewrite the remaining merge strategies from shell to C
  2021-03-17 20:49           ` [PATCH v7 00/15] " Alban Gruin
                               ` (14 preceding siblings ...)
  2021-03-17 20:49             ` [PATCH v7 15/15] sequencer: use the "octopus" merge " Alban Gruin
@ 2022-08-09 18:54             ` Alban Gruin
  2022-08-09 18:54               ` [PATCH v8 01/14] t6060: modify multiple files to expose a possible issue with merge-index Alban Gruin
                                 ` (14 more replies)
  15 siblings, 15 replies; 221+ messages in thread
From: Alban Gruin @ 2022-08-09 18:54 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Phillip Wood, Johannes Schindelin, Alban Gruin

In an effort to reduce the number of shell scripts in git's codebase, I
propose this patch series converting the two remaining merge strategies,
resolve and octopus, from shell to C.  This will enable slightly better
performance, better integration with git itself (no more forking to
perform these operations), and better portability (Windows and shell
scripts don't mix well).

Three scripts are actually converted: first git-merge-one-file.sh, then
git-merge-resolve.sh, and finally git-merge-octopus.sh.  Not only are
they converted, but they are also modified to operate without forking,
and then libified so they can be used by git without spawning another
process.

This series keeps the commands `git merge-one-file', `git
merge-resolve', and `git merge-octopus', so any script depending on them
should keep working without changes.

This series is based on c50926e1f4 (The eleventh batch, 2022-08-08).
The tip is tagged as "rewrite-merge-strategies-v8" at
https://github.com/agrn/git.

Changes since v7:

 - The series has been rebased.

 - The first commit has been dropped, since t6407 was modernized by
   8127a2b1f5 (merge tests: use "test_must_fail" instead of ad-hoc
   pattern, 2022-03-07).

 - The `quiet' parameter of merge_entry() has been removed.  Merge
   program failures are now reported by merge_index_path() and
   merge_all_index().

 - merge_all_index() now reports merge program failures in oneshot mode,
   as merge-index did.

 - In the `merge-index' builtin, the change removing the default value
   of `merge_action' was reverted, as suggested by Johannes Schindelin.

 - The argument parsing and error handling in merge-one-file.c has been
   cleaned up, as suggested by Johannes.

 - Parameter checking for the merge strategies was moved from the
   builtins to merge_strategy_resolve() and merge_strategy_octopus(), as
   suggested by Johannes.

 - Both strategies were modified to lock the index only once at the
   start, and release the lock once at the end.  Calls to
   write_index_as_tree() were replaced with calls to a new internal
   function, write_tree(), that does not lock the index.

   In v7, write_tree() also called lookup_tree() on the result of
   write_index_as_tree().  As the result was only used by the octopus
   strategy, this call was moved to merge_strategy_octopus().

   This change was suggested by Johannes.

 - 24ba8b70c9 (merge-resolve: abort if index does not match HEAD,
   2022-07-23) added a check in git-merge-resolve.sh that makes the
   strategy exit if the index does not match HEAD.  This change was
   brought along.  Since the same check was made in merge-octopus, it
   has been factored out into a function in merge-strategies.c:
   check_index_is_head().  merge_strategy_octopus() was modified to use
   this new function, too.

 - In merge-strategies.c, fast_forward() was renamed to merge_trees().

 - Fixed the parameters to a call to merge_all_index() in octopus_do_merge().

 - The changes to merge_strategy_octopus() suggested by Johannes [0] were
   applied.

 - Some commit messages were clarified.

[0] https://lore.kernel.org/git/nycvar.QRO.7.76.6.2103232323330.50@tvgsbejvaqbjf.bet/
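
The reporting rules for merge program failures listed in the changes
above can be sketched in isolation as follows.  This is a simplified
model, not the code from the series: entries are plain integers, and
demo_merge_all(), demo_fail_on_odd() and demo_report() are hypothetical
names.

```c
#include <stdio.h>

/* Hypothetical merge callback: non-zero means merging the entry failed. */
typedef int (*demo_merge_fn)(int entry);

static int demo_calls;   /* number of times the callback ran */
static int demo_reports; /* number of failure messages printed */

static int demo_fail_on_odd(int entry)
{
	demo_calls++;
	return entry % 2;
}

static void demo_report(void)
{
	demo_reports++;
	fprintf(stderr, "error: merge program failed\n");
}

static int demo_merge_all(const int *entries, int nr, int oneshot, int quiet,
			  demo_merge_fn fn)
{
	int err = 0, i;

	for (i = 0; i < nr; i++) {
		if (fn(entries[i]))
			err++;

		/* Without oneshot, stop at the first failure. */
		if (err && !oneshot) {
			if (!quiet)
				demo_report();
			return 1;
		}
	}

	/* In oneshot mode, report once after all entries were tried. */
	if (err && !quiet)
		demo_report();
	return err;
}
```

In oneshot mode every entry is tried and the failure count is
returned; otherwise the first failure ends the loop.  In both modes
the message is printed at most once, and never when quiet is set.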

Alban Gruin (14):
  t6060: modify multiple files to expose a possible issue with
    merge-index
  t6060: add tests for removed files
  merge-index: libify merge_one_path() and merge_all()
  merge-index: drop the index
  merge-index: add a new way to invoke `git-merge-one-file'
  update-index: move add_cacheinfo() to read-cache.c
  merge-one-file: rewrite in C
  merge-resolve: rewrite in C
  merge-recursive: move better_branch_name() to merge.c
  merge-octopus: rewrite in C
  merge: use the "resolve" strategy without forking
  merge: use the "octopus" strategy without forking
  sequencer: use the "resolve" strategy without forking
  sequencer: use the "octopus" strategy without forking

 Documentation/git-merge-index.txt |   7 +-
 Makefile                          |   7 +-
 builtin.h                         |   3 +
 builtin/merge-index.c             | 122 +++---
 builtin/merge-octopus.c           |  63 ++++
 builtin/merge-one-file.c          |  92 +++++
 builtin/merge-recursive.c         |  16 +-
 builtin/merge-resolve.c           |  63 ++++
 builtin/merge.c                   |   7 +
 builtin/update-index.c            |  25 +-
 cache.h                           |  10 +-
 git-merge-octopus.sh              | 112 ------
 git-merge-one-file.sh             | 167 ---------
 git-merge-resolve.sh              |  64 ----
 git.c                             |   3 +
 merge-strategies.c                | 590 ++++++++++++++++++++++++++++++
 merge-strategies.h                |  39 ++
 merge.c                           |  12 +
 read-cache.c                      |  35 ++
 sequencer.c                       |  17 +-
 t/t6060-merge-index.sh            |  23 +-
 t/t6415-merge-dir-to-symlink.sh   |   2 +-
 t/t7607-merge-state.sh            |   2 +-
 23 files changed, 1022 insertions(+), 459 deletions(-)
 create mode 100644 builtin/merge-octopus.c
 create mode 100644 builtin/merge-one-file.c
 create mode 100644 builtin/merge-resolve.c
 delete mode 100755 git-merge-octopus.sh
 delete mode 100755 git-merge-one-file.sh
 delete mode 100755 git-merge-resolve.sh
 create mode 100644 merge-strategies.c
 create mode 100644 merge-strategies.h

Range-diff against v7:
 1:  dfe230bfce <  -:  ---------- t6407: modernise tests
 2:  575e24685d !  1:  2e23e45435 t6060: modify multiple files to expose a possible issue with merge-index
    @@ Commit message
         Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
     
      ## t/t6060-merge-index.sh ##
    -@@ t/t6060-merge-index.sh: test_expect_success 'setup diverging branches' '
    - 	for i in 1 2 3 4 5 6 7 8 9 10; do
    - 		echo $i
    - 	done >file &&
    +@@ t/t6060-merge-index.sh: test_description='basic git merge-index / git-merge-one-file tests'
    + 
    + test_expect_success 'setup diverging branches' '
    + 	test_write_lines 1 2 3 4 5 6 7 8 9 10 >file &&
     -	git add file &&
     +	cp file file2 &&
     +	git add file file2 &&
 3:  4f366ff363 !  2:  f48f2f7c3c t6060: add tests for removed files
    @@ Metadata
      ## Commit message ##
         t6060: add tests for removed files
     
    -    Until now, t6060 did not not check git-mere-one-file's behaviour when a
    +    Until now, t6060 did not not check git-merge-one-file's behaviour when a
         file is deleted in a branch.  To avoid regressions on this during the
    -    conversion, this adds a new file, `file3', in the commit tagged as`base', and
    -    deletes it in the commit tagged as `two'.
    +    conversion from shell to C, this adds a new file, `file3', in the commit
    +    tagged as `base', and deletes it in the commit tagged as `two'.
     
         Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
     
      ## t/t6060-merge-index.sh ##
    -@@ t/t6060-merge-index.sh: test_expect_success 'setup diverging branches' '
    - 		echo $i
    - 	done >file &&
    +@@ t/t6060-merge-index.sh: test_description='basic git merge-index / git-merge-one-file tests'
    + test_expect_success 'setup diverging branches' '
    + 	test_write_lines 1 2 3 4 5 6 7 8 9 10 >file &&
      	cp file file2 &&
     -	git add file file2 &&
     +	cp file file3 &&
 4:  6af79a6b2d !  3:  331141f0cb merge-index: libify merge_one_path() and merge_all()
    @@ Makefile: LIB_OBJS += merge-blobs.o
      LIB_OBJS += merge-recursive.o
     +LIB_OBJS += merge-strategies.o
      LIB_OBJS += merge.o
    - LIB_OBJS += mergesort.o
      LIB_OBJS += midx.o
    + LIB_OBJS += name-hash.o
     
      ## builtin/merge-index.c ##
     @@
    @@ builtin/merge-index.c
     -static void merge_all(void)
     -{
     -	int i;
    +-	/* TODO: audit for interaction with sparse-index. */
    +-	ensure_full_index(&the_index);
     -	for (i = 0; i < active_nr; i++) {
     -		const struct cache_entry *ce = active_cache[i];
     -		if (!ce_stage(ce))
    @@ merge-strategies.c (new)
     +#include "cache.h"
     +#include "merge-strategies.h"
     +
    -+static int merge_entry(struct index_state *istate, int quiet, unsigned int pos,
    ++static int merge_entry(struct index_state *istate, unsigned int pos,
     +		       const char *path, int *err, merge_fn fn, void *data)
     +{
     +	int found = 0;
    @@ merge-strategies.c (new)
     +		return error(_("%s is not in the cache"), path);
     +
     +	if (fn(istate, oids[0], oids[1], oids[2], path,
    -+	       modes[0], modes[1], modes[2], data)) {
    -+		if (!quiet)
    -+			error(_("Merge program failed"));
    ++	       modes[0], modes[1], modes[2], data))
     +		(*err)++;
    -+	}
     +
     +	return found;
     +}
    @@ merge-strategies.c (new)
     +	 * already merged and there is nothing to do.
     +	 */
     +	if (pos < 0) {
    -+		ret = merge_entry(istate, quiet || oneshot, -pos - 1, path, &err, fn, data);
    ++		ret = merge_entry(istate, -pos - 1, path, &err, fn, data);
     +		if (ret == -1)
     +			return -1;
    -+		else if (err)
    ++		else if (err) {
    ++			if (!quiet && !oneshot)
    ++				error(_("merge program failed"));
     +			return 1;
    ++		}
     +	}
     +	return 0;
     +}
    @@ merge-strategies.c (new)
     +	int err = 0, ret;
     +	unsigned int i;
     +
    ++	/* TODO: audit for interaction with sparse-index. */
    ++	ensure_full_index(istate);
     +	for (i = 0; i < istate->cache_nr; i++) {
     +		const struct cache_entry *ce = istate->cache[i];
     +		if (!ce_stage(ce))
     +			continue;
     +
    -+		ret = merge_entry(istate, quiet || oneshot, i, ce->name, &err, fn, data);
    ++		ret = merge_entry(istate, i, ce->name, &err, fn, data);
     +		if (ret > 0)
     +			i += ret - 1;
     +		else if (ret == -1)
     +			return -1;
     +
    -+		if (err && !oneshot)
    ++		if (err && !oneshot) {
    ++			if (!quiet)
    ++				error(_("merge program failed"));
     +			return 1;
    ++		}
     +	}
     +
    ++	if (err && !quiet)
    ++		error(_("merge program failed"));
     +	return err;
     +}
     
    @@ merge-strategies.h (new)
     +		    merge_fn fn, void *data);
     +
     +#endif /* MERGE_STRATEGIES_H */
    +
    + ## t/t7607-merge-state.sh ##
    +@@ t/t7607-merge-state.sh: test_expect_success 'Ensure we restore original state if no merge strategy handl
    + 	# just hit conflicts, it completely fails and says that it cannot
    + 	# handle this type of merge.
    + 	test_expect_code 2 git merge branch2 branch3 >output 2>&1 &&
    +-	grep "fatal: merge program failed" output &&
    ++	grep "error: merge program failed" output &&
    + 	grep "Should not be doing an octopus" output &&
    + 
    + 	# Make sure we did not leave stray changes around when no appropriate
 5:  909ed66114 !  4:  a3c0815fc1 merge-index: drop the index
    @@ builtin/merge-index.c: int cmd_merge_index(int argc, const char **argv, const ch
     +	if (repo_read_index(r) < 0)
     +		die("invalid index");
      
    + 	/* TODO: audit for interaction with sparse-index. */
    +-	ensure_full_index(&the_index);
    ++	ensure_full_index(r->index);
    + 
      	i = 1;
      	if (!strcmp(argv[i], "-o")) {
     @@ builtin/merge-index.c: int cmd_merge_index(int argc, const char **argv, const char *prefix)
 6:  1a8aba05bd !  5:  558e65e39b merge-index: add a new way to invoke `git-merge-one-file'
    @@ Documentation/git-merge-index.txt: git-merge-index - Run a merge for files needi
      SYNOPSIS
      --------
      [verse]
    --'git merge-index' [-o] [-q] <merge-program> (-a | [--] <file>*)
    -+'git merge-index' [-o] [-q] (<merge-program> | --use=merge-one-file) (-a | [--] <file>*)
    +-'git merge-index' [-o] [-q] <merge-program> (-a | ( [--] <file>...) )
    ++'git merge-index' [-o] [-q] (<merge-program> | --use=merge-one-file) (-a | ( [--] <file>...) )
      
      DESCRIPTION
      -----------
 7:  1f6635512c !  6:  94edebfb69 update-index: move add_cacheinfo() to read-cache.c
    @@ Commit message
         This moves the function add_cacheinfo() that already exists in
         update-index.c to update-index.c, renames it add_to_index_cacheinfo(),
         and adds an `istate' parameter.  The new cache entry is returned through
    -    a pointer passed in the parameters.  The return value is either 0
    -    (success), -1 (invalid path), or -2 (failed to add the file in the
    -    index).
    +    a pointer passed in the parameters.  This function can return three
    +    values:
    +
    +     - 0, when the file has been successfully added to the index;
    +     - ADD_TO_INDEX_CACHEINFO_INVALID_PATH, when the file does not exists;
    +     - ADD_TO_INDEX_CACHEINFO_UNABLE_TO_ADD, when the file could not be
    +       added to the index.
     
         This will become useful in the next commit, when the three-way merge
         will need to call this function.
 8:  8755608f6d !  7:  123d299df7 merge-one-file: rewrite in C
    @@ builtin.h: int cmd_merge_base(int argc, const char **argv, const char *prefix);
      int cmd_mktag(int argc, const char **argv, const char *prefix);
     
      ## builtin/merge-index.c ##
    -@@ builtin/merge-index.c: static int merge_one_file_spawn(struct index_state *istate,
    - int cmd_merge_index(int argc, const char **argv, const char *prefix)
    - {
    - 	int i, force_file = 0, err = 0, one_shot = 0, quiet = 0;
    --	merge_fn merge_action = merge_one_file_spawn;
    -+	merge_fn merge_action;
    - 	struct lock_file lock = LOCK_INIT;
    - 	struct repository *r = the_repository;
    - 	const char *use_internal = NULL;
     @@ builtin/merge-index.c: int cmd_merge_index(int argc, const char **argv, const char *prefix)
      
      	if (skip_prefix(pgm, "--use=", &use_internal)) {
    @@ builtin/merge-index.c: int cmd_merge_index(int argc, const char **argv, const ch
     +			merge_action = merge_one_file_func;
      		else
      			die(_("git merge-index: unknown internal program %s"), use_internal);
    --	}
     +
     +		repo_hold_locked_index(r, &lock, LOCK_DIE_ON_ERROR);
    -+	} else
    -+		merge_action = merge_one_file_spawn;
    + 	}
      
      	for (; i < argc; i++) {
    - 		const char *arg = argv[i];
     
      ## builtin/merge-one-file.c (new) ##
     @@
    @@ builtin/merge-one-file.c (new)
     +	"<orig mode> <our mode> <their mode>\n\n"
     +	"Blob ids and modes should be empty for missing files.";
     +
    -+static int read_mode(const char *name, const char *arg, unsigned int *mode)
    ++static int read_param(const char *name, const char *arg_blob, const char *arg_mode,
    ++		      struct object_id *blob, struct object_id **p_blob, unsigned int *mode)
     +{
    -+	char *last;
    -+	int ret = 0;
    ++	if (*arg_blob && !get_oid_hex(arg_blob, blob)) {
    ++		char *last;
     +
    -+	*mode = strtol(arg, &last, 8);
    ++		*p_blob = blob;
    ++		*mode = strtol(arg_mode, &last, 8);
     +
    -+	if (*last)
    -+		ret = error(_("invalid '%s' mode: expected nothing, got '%c'"), name, *last);
    -+	else if (!(S_ISREG(*mode) || S_ISDIR(*mode) || S_ISLNK(*mode)))
    -+		ret = error(_("invalid '%s' mode: %o"), name, *mode);
    ++		if (*last)
    ++			return error(_("invalid '%s' mode: expected nothing, got '%c'"), name, *last);
    ++		else if (!(S_ISREG(*mode) || S_ISDIR(*mode) || S_ISLNK(*mode)))
    ++			return error(_("invalid '%s' mode: %o"), name, *mode);
    ++	} else if (!*arg_blob && *arg_mode)
    ++		return error(_("no '%s' object id given, but a mode was still given."), name);
     +
    -+	return ret;
    ++	return 0;
     +}
     +
     +int cmd_merge_one_file(int argc, const char **argv, const char *prefix)
    @@ builtin/merge-one-file.c (new)
     +
     +	repo_hold_locked_index(r, &lock, LOCK_DIE_ON_ERROR);
     +
    -+	if (!get_oid_hex(argv[1], &orig_blob)) {
    -+		p_orig_blob = &orig_blob;
    -+		ret = read_mode("orig", argv[5], &orig_mode);
    -+	} else if (!*argv[1] && *argv[5])
    -+		ret = error(_("no 'orig' object id given, but a mode was still given."));
    ++	if (read_param("orig", argv[1], argv[5], &orig_blob,
    ++		       &p_orig_blob, &orig_mode))
    ++		ret = -1;
     +
    -+	if (!get_oid_hex(argv[2], &our_blob)) {
    -+		p_our_blob = &our_blob;
    -+		ret = read_mode("our", argv[6], &our_mode);
    -+	} else if (!*argv[2] && *argv[6])
    -+		ret = error(_("no 'our' object id given, but a mode was still given."));
    ++	if (read_param("our", argv[2], argv[6], &our_blob,
    ++		       &p_our_blob, &our_mode))
    ++		ret = -1;
     +
    -+	if (!get_oid_hex(argv[3], &their_blob)) {
    -+		p_their_blob = &their_blob;
    -+		ret = read_mode("their", argv[7], &their_mode);
    -+	} else if (!*argv[3] && *argv[7])
    -+		ret = error(_("no 'their' object id given, but a mode was still given."));
    ++	if (read_param("their", argv[3], argv[7], &their_blob,
    ++		       &p_their_blob, &their_mode))
    ++		ret = -1;
     +
     +	if (ret)
     +		return ret;
    @@ merge-strategies.c
     @@
      #include "cache.h"
     +#include "dir.h"
    ++#include "entry.h"
      #include "merge-strategies.h"
     +#include "xdiff-interface.h"
     +
    @@ merge-strategies.c
     +		read_mmblob(mmfs + 0, orig_blob);
     +	} else {
     +		printf(_("Added %s in both, but differently.\n"), path);
    -+		read_mmblob(mmfs + 0, &null_oid);
    ++		read_mmblob(mmfs + 0, null_oid());
     +	}
     +
     +	read_mmblob(mmfs + 1, our_blob);
    @@ merge-strategies.c
     +			       orig_mode, our_mode, their_mode);
     +}
      
    - static int merge_entry(struct index_state *istate, int quiet, unsigned int pos,
    + static int merge_entry(struct index_state *istate, unsigned int pos,
      		       const char *path, int *err, merge_fn fn, void *data)
     @@ merge-strategies.c: int merge_all_index(struct index_state *istate, int oneshot, int quiet,
      		    merge_fn fn, void *data)
    @@ merge-strategies.c: int merge_all_index(struct index_state *istate, int oneshot,
     -	unsigned int i;
     +	unsigned int i, prev_nr;
      
    - 	for (i = 0; i < istate->cache_nr; i++) {
    - 		const struct cache_entry *ce = istate->cache[i];
    + 	/* TODO: audit for interaction with sparse-index. */
    + 	ensure_full_index(istate);
    +@@ merge-strategies.c: int merge_all_index(struct index_state *istate, int oneshot, int quiet,
      		if (!ce_stage(ce))
      			continue;
      
     +		prev_nr = istate->cache_nr;
    - 		ret = merge_entry(istate, quiet || oneshot, i, ce->name, &err, fn, data);
    + 		ret = merge_entry(istate, i, ce->name, &err, fn, data);
     -		if (ret > 0)
     -			i += ret - 1;
     -		else if (ret == -1)
    @@ merge-strategies.c: int merge_all_index(struct index_state *istate, int oneshot,
     +		} else if (ret == -1)
      			return -1;
      
    - 		if (err && !oneshot)
    + 		if (err && !oneshot) {
     
      ## merge-strategies.h ##
     @@
 9:  3ecf49a8ac !  8:  f181cef10b merge-resolve: rewrite in C
    @@ Commit message
            all the setup needed).
     
          - The call to `write-tree' is replaced by a call to
    -       write_index_as_tree().
    +       cache_tree_update().  This call is wrapped in a new function,
    +       write_tree().  It is made to mimick write_index_as_tree() with
    +       WRITE_TREE_SILENT flag, but without locking the index; this is taken
    +       care directly in merge_strategies_resolve().
    +
    +     - The call to `diff-index ...' is replaced by a call to
    +       repo_index_has_changes().
     
          - The call to `merge-index', needed to invoke `git merge-one-file', is
            replaced by a call to the new merge_all_index() function.
     
         The index is read in cmd_merge_resolve(), and is written back by
    -    merge_strategies_resolve().
    +    merge_strategies_resolve().  This is to accommodate future applications:
    +    in `git-merge', the index has already been read when the merge strategy
    +    is called, so it would be redundant to read it again when the builtin
    +    will be able to use merge_strategies_resolve() directly.
     
         The parameters of merge_strategies_resolve() will be surprising at first
         glance: why using a commit list for `bases' and `remote', where we could
    @@ Commit message
         frictions later, merge_strategies_resolve() takes the same types of
         parameters.
     
    +    merge_strategies_resolve() locks the index only once, at the beginning
    +    of the merge, and releases it when the merge has been completed.
    +
         Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
     
      ## Makefile ##
    @@ builtin/merge-resolve.c (new)
     +		}
     +	}
     +
    -+	/*
    -+	 * Give up if we are given two or more remotes.  Not handling
    -+	 * octopus.
    -+	 */
    -+	if (remote && remote->next)
    -+		return 2;
    -+
    -+	/* Give up if this is a baseless merge. */
    -+	if (!bases)
    -+		return 2;
    -+
     +	return merge_strategies_resolve(r, bases, head, remote);
     +}
     
    @@ git-merge-resolve.sh (deleted)
     -#
     -# Resolve two trees, using enhanced multi-base read-tree.
     -
    +-. git-sh-setup
    +-
    +-# Abort if index does not match HEAD
    +-if ! git diff-index --quiet --cached HEAD --
    +-then
    +-    gettextln "Error: Your local changes to the following files would be overwritten by merge"
    +-    git diff-index --cached --name-only HEAD -- | sed -e 's/^/    /'
    +-    exit 2
    +-fi
    +-
     -# The first parameters up to -- are merge bases; the rest are heads.
     -bases= head= remotes= sep_seen=
     -for arg
    @@ git.c: static struct cmd_struct commands[] = {
      	{ "merge-recursive-theirs", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
     +	{ "merge-resolve", cmd_merge_resolve, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
      	{ "merge-subtree", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
    - 	{ "merge-tree", cmd_merge_tree, RUN_SETUP | NO_PARSEOPT },
    + 	{ "merge-tree", cmd_merge_tree, RUN_SETUP },
      	{ "mktag", cmd_mktag, RUN_SETUP | NO_PARSEOPT },
     
      ## merge-strategies.c ##
    @@ merge-strategies.c
      #include "cache.h"
     +#include "cache-tree.h"
      #include "dir.h"
    + #include "entry.h"
     +#include "lockfile.h"
      #include "merge-strategies.h"
     +#include "unpack-trees.h"
      #include "xdiff-interface.h"
      
    ++static int check_index_is_head(struct repository *r, const char *head_arg)
    ++{
    ++	struct commit *head_commit;
    ++	struct tree *head_tree;
    ++	struct object_id head;
    ++	struct strbuf sb = STRBUF_INIT;
    ++
    ++	get_oid(head_arg, &head);
    ++	head_commit = lookup_commit_reference(r, &head);
    ++	head_tree = repo_get_commit_tree(r, head_commit);
    ++
    ++	if (repo_index_has_changes(r, head_tree, &sb)) {
    ++		error(_("Your local changes to the following files "
    ++			"would be overwritten by merge:\n  %s"),
    ++		      sb.buf);
    ++		strbuf_release(&sb);
    ++		return 1;
    ++	}
    ++
    ++	return 0;
    ++}
    ++
      static int add_merge_result_to_index(struct index_state *istate, unsigned int mode,
    + 				     const struct object_id *oid, const char *path,
    + 				     int checkout)
     @@ merge-strategies.c: int merge_all_index(struct index_state *istate, int oneshot, int quiet,
    - 
    + 		error(_("merge program failed"));
      	return err;
      }
     +
    -+static int fast_forward(struct repository *r, struct tree_desc *t,
    -+			int nr, int aggressive)
    ++static int merge_trees(struct repository *r, struct tree_desc *t,
    ++		       int nr, int aggressive)
     +{
     +	struct unpack_trees_options opts;
    -+	struct lock_file lock = LOCK_INIT;
     +
     +	refresh_index(r->index, REFRESH_QUIET, NULL, NULL, NULL);
    -+	repo_hold_locked_index(r, &lock, LOCK_DIE_ON_ERROR);
     +
     +	memset(&opts, 0, sizeof(opts));
     +	opts.head_idx = 1;
    @@ merge-strategies.c: int merge_all_index(struct index_state *istate, int oneshot,
     +
     +	if (unpack_trees(nr, t, &opts))
     +		return -1;
    -+
    -+	if (write_locked_index(r->index, &lock, COMMIT_LOCK))
    -+		return error(_("unable to write new index file"));
    -+
     +	return 0;
     +}
     +
    @@ merge-strategies.c: int merge_all_index(struct index_state *istate, int oneshot,
     +	return 0;
     +}
     +
    ++static int write_tree(struct repository *r)
    ++{
    ++	int was_valid;
    ++	was_valid = r->index->cache_tree &&
    ++		cache_tree_fully_valid(r->index->cache_tree);
    ++
    ++	if (!was_valid && cache_tree_update(r->index, WRITE_TREE_SILENT) < 0)
    ++		return WRITE_TREE_UNMERGED_INDEX;
    ++	return 0;
    ++}
    ++
     +int merge_strategies_resolve(struct repository *r,
     +			     struct commit_list *bases, const char *head_arg,
     +			     struct commit_list *remote)
     +{
     +	struct tree_desc t[MAX_UNPACK_TREES];
    -+	struct object_id head, oid;
     +	struct commit_list *i;
    -+	int nr = 0;
    ++	struct lock_file lock = LOCK_INIT;
    ++	int nr = 0, ret = 0;
     +
    -+	if (head_arg)
    -+		get_oid(head_arg, &head);
    ++	/* Abort if index does not match head */
    ++	if (check_index_is_head(r, head_arg))
    ++		return 2;
    ++
    ++	/*
    ++	 * Give up if we are given two or more remotes.  Not handling
    ++	 * octopus.
    ++	 */
    ++	if (remote && remote->next)
    ++		return 2;
    ++
    ++	/* Give up if this is a baseless merge. */
    ++	if (!bases)
    ++		return 2;
     +
     +	puts(_("Trying simple merge."));
     +
    @@ merge-strategies.c: int merge_all_index(struct index_state *istate, int oneshot,
     +	}
     +
     +	if (head_arg) {
    -+		struct tree *tree = parse_tree_indirect(&head);
    ++		struct object_id head;
    ++		struct tree *tree;
    ++
    ++		get_oid(head_arg, &head);
    ++		tree = parse_tree_indirect(&head);
    ++
     +		if (add_tree(tree, t + (nr++)))
     +			return 2;
     +	}
    @@ merge-strategies.c: int merge_all_index(struct index_state *istate, int oneshot,
     +	if (remote && add_tree(repo_get_commit_tree(r, remote->item), t + (nr++)))
     +		return 2;
     +
    -+	if (fast_forward(r, t, nr, 1))
    ++	repo_hold_locked_index(r, &lock, LOCK_DIE_ON_ERROR);
    ++
    ++	if (merge_trees(r, t, nr, 1)) {
    ++		rollback_lock_file(&lock);
     +		return 2;
    ++	}
     +
    -+	if (write_index_as_tree(&oid, r->index, r->index_file,
    -+				WRITE_TREE_SILENT, NULL)) {
    -+		int ret;
    -+		struct lock_file lock = LOCK_INIT;
    -+
    ++	if (write_tree(r)) {
     +		puts(_("Simple merge failed, trying Automatic merge."));
    -+		repo_hold_locked_index(r, &lock, LOCK_DIE_ON_ERROR);
     +		ret = merge_all_index(r->index, 1, 0, merge_one_file_func, NULL);
    -+
    -+		write_locked_index(r->index, &lock, COMMIT_LOCK);
    -+		return !!ret;
     +	}
     +
    -+	return 0;
    ++	if (write_locked_index(r->index, &lock, COMMIT_LOCK))
    ++		return !!error(_("unable to write new index file"));
    ++	return !!ret;
     +}
     
      ## merge-strategies.h ##
10:  615b04d417 =  9:  cc1ba1acc9 merge-recursive: move better_branch_name() to merge.c
11:  a6ece04f3d ! 10:  c48e2de914 merge-octopus: rewrite in C
    @@ Commit message
          - Calls to `read-tree -u -m (--aggressive)?' are replaced by calls to
            unpack_trees().
     
    -     - The call to `write-tree' is replaced by a call to
    -       write_index_as_tree().
    +     - The call to `write-tree' is replaced by a call to write_tree().
     
          - The call to `diff-index ...' is replaced by a call to
            repo_index_has_changes().
    @@ Commit message
          - The call to `merge-index', needed to invoke `git merge-one-file', is
            replaced by a call to merge_all_index().
     
    -    The index is read in cmd_merge_octopus(), and is wrote back by
    -    merge_strategies_octopus().
    +    The index is read in cmd_merge_octopus(), and is written back by
    +    merge_strategies_octopus(), for the same reason as merge-resolve.
     
    -    Here to, merge_strategies_octopus() takes two commit lists and a string
    -    to reduce frictions when try_merge_strategies() will be modified to call
    -    it directly.
    +    Here too, merge_strategies_octopus() takes two commit lists and a string
    +    to reduce friction when try_merge_strategies() will be modified to call
    +    it directly.  It also locks the index at the start of the merge, and
    +    releases it at the end.
     
         Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
     
    @@ builtin/merge-octopus.c (new)
     +		}
     +	}
     +
    -+	/*
    -+	 * Reject if this is not an octopus -- resolve should be used
    -+	 * instead.
    -+	 */
    -+	if (commit_list_count(remotes) < 2)
    -+		return 2;
    -+
     +	return merge_strategies_octopus(r, bases, head_arg, remotes);
     +}
     
    @@ merge-strategies.c
      #include "cache-tree.h"
     +#include "commit-reach.h"
      #include "dir.h"
    + #include "entry.h"
      #include "lockfile.h"
    - #include "merge-strategies.h"
     @@ merge-strategies.c: int merge_strategies_resolve(struct repository *r,
    - 
    - 	return 0;
    + 		return !!error(_("unable to write new index file"));
    + 	return !!ret;
      }
     +
    -+static int write_tree(struct repository *r, struct tree **reference_tree)
    -+{
    -+	struct object_id oid;
    -+	int ret;
    -+
    -+	if (!(ret = write_index_as_tree(&oid, r->index, r->index_file,
    -+					WRITE_TREE_SILENT, NULL)))
    -+		*reference_tree = lookup_tree(r, &oid);
    -+
    -+	return ret;
    -+}
    -+
     +static int octopus_fast_forward(struct repository *r, const char *branch_name,
    -+				struct tree *tree_head, struct tree *current_tree,
    -+				struct tree **reference_tree)
    ++				struct tree *tree_head, struct tree *current_tree)
     +{
     +	/*
     +	 * The first head being merged was a fast-forward.  Advance the
    @@ merge-strategies.c: int merge_strategies_resolve(struct repository *r,
     +	init_tree_desc(t, tree_head->buffer, tree_head->size);
     +	if (add_tree(current_tree, t + 1))
     +		return -1;
    -+	if (fast_forward(r, t, 2, 0))
    ++	if (merge_trees(r, t, 2, 0))
     +		return -1;
    -+	if (write_tree(r, reference_tree))
    ++	if (write_tree(r))
     +		return -1;
     +
     +	return 0;
    @@ merge-strategies.c: int merge_strategies_resolve(struct repository *r,
     +
     +static int octopus_do_merge(struct repository *r, const char *branch_name,
     +			    struct commit_list *common, struct tree *current_tree,
    -+			    struct tree **reference_tree)
    ++			    struct tree *reference_tree)
     +{
     +	struct tree_desc t[MAX_UNPACK_TREES];
     +	struct commit_list *i;
    @@ merge-strategies.c: int merge_strategies_resolve(struct repository *r,
     +			return -1;
     +	}
     +
    -+	if (add_tree(*reference_tree, t + (nr++)))
    ++	if (add_tree(reference_tree, t + (nr++)))
     +		return -1;
     +	if (add_tree(current_tree, t + (nr++)))
     +		return -1;
    -+	if (fast_forward(r, t, nr, 1))
    ++	if (merge_trees(r, t, nr, 1))
     +		return 2;
     +
    -+	if (write_tree(r, reference_tree)) {
    -+		struct lock_file lock = LOCK_INIT;
    -+
    ++	if (write_tree(r)) {
     +		puts(_("Simple merge did not work, trying automatic merge."));
    -+		repo_hold_locked_index(r, &lock, LOCK_DIE_ON_ERROR);
    -+		ret = !!merge_all_index(r->index, 0, 0, merge_one_file_func, NULL);
    -+		write_locked_index(r->index, &lock, COMMIT_LOCK);
    -+
    -+		write_tree(r, reference_tree);
    ++		ret = !!merge_all_index(r->index, 1, 0, merge_one_file_func, NULL);
    ++		write_tree(r);
     +	}
     +
     +	return ret;
    @@ merge-strategies.c: int merge_strategies_resolve(struct repository *r,
     +	struct tree *reference_tree, *head_tree;
     +	struct commit_list *i;
     +	struct object_id head;
    -+	struct strbuf sb = STRBUF_INIT;
    ++	struct lock_file lock = LOCK_INIT;
    ++
    ++	/*
    ++	 * Reject if this is not an octopus -- resolve should be used
    ++	 * instead.
    ++	 */
    ++	if (commit_list_count(remotes) < 2)
    ++		return 2;
    ++
    ++	/* Abort if index does not match head */
    ++	if (check_index_is_head(r, head_arg))
    ++		return 2;
     +
     +	get_oid(head_arg, &head);
     +	head_commit = lookup_commit_reference(r, &head);
     +	head_tree = repo_get_commit_tree(r, head_commit);
     +
    -+	if (parse_tree(head_tree))
    -+		return 2;
    -+
    -+	if (repo_index_has_changes(r, head_tree, &sb)) {
    -+		error(_("Your local changes to the following files "
    -+			"would be overwritten by merge:\n  %s"),
    -+		      sb.buf);
    -+		strbuf_release(&sb);
    -+		return 2;
    -+	}
    -+
     +	CALLOC_ARRAY(reference_commits, commit_list_count(remotes) + 1);
     +	reference_commits[0] = head_commit;
     +	reference_tree = head_tree;
     +
    ++	repo_hold_locked_index(r, &lock, LOCK_DIE_ON_ERROR);
    ++
     +	for (i = remotes; i && i->item; i = i->next) {
     +		struct commit *c = i->item;
     +		struct object_id *oid = &c->object.oid;
    @@ merge-strategies.c: int merge_strategies_resolve(struct repository *r,
     +
     +			free(branch_name);
     +			free_commit_list(common);
    -+			free(reference_commits);
     +
    -+			return 2;
    ++			ret = 2;
    ++			break;
     +		}
     +
    -+		for (j = common; j && !up_to_date && ff_merge; j = j->next) {
    -+			up_to_date |= oideq(&j->item->object.oid, oid);
    -+
    -+			if (!j->next &&
    -+			    !oideq(&j->item->object.oid,
    -+				   &reference_commits[nr_references - 1]->object.oid))
    -+				ff_merge = 0;
    ++		/*
    ++		 * If `oid' is reachable from `HEAD', we're already up
    ++		 * to date.
    ++		 */
    ++		for (j = common; j; j = j->next) {
    ++			if (oideq(&j->item->object.oid, oid)) {
    ++				up_to_date = 1;
    ++				break;
    ++			}
     +		}
     +
     +		if (up_to_date) {
    @@ merge-strategies.c: int merge_strategies_resolve(struct repository *r,
     +			continue;
     +		}
     +
    -+		if (ff_merge) {
    -+			ret = octopus_fast_forward(r, branch_name, head_tree,
    -+						   current_tree, &reference_tree);
    ++		/*
    ++		 * If we could fast-forward so far and `HEAD' is the
    ++		 * single merge base with the current `remote' revision,
    ++		 * keep fast-forwarding.
    ++		 */
    ++		if (ff_merge && common && !common->next && nr_references == 1 &&
    ++		    oideq(&common->item->object.oid,
    ++			  &reference_commits[0]->object.oid)) {
    ++			ret = octopus_fast_forward(r, branch_name, head_tree, current_tree);
     +			nr_references = 0;
     +		} else {
     +			ret = octopus_do_merge(r, branch_name, common,
    -+					       current_tree, &reference_tree);
    ++					       current_tree, reference_tree);
    ++			ff_merge = 0;
     +		}
     +
     +		free(branch_name);
    @@ merge-strategies.c: int merge_strategies_resolve(struct repository *r,
     +			puts(_("Automated merge did not work."));
     +			puts(_("Should not be doing an octopus."));
     +
    -+			free(reference_commits);
    -+			return 2;
    ++			ret = 2;
    ++			break;
     +		}
     +
     +		reference_commits[nr_references++] = c;
    ++		reference_tree = lookup_tree(r, &r->index->cache_tree->oid);
     +	}
     +
     +	free(reference_commits);
    ++	write_locked_index(r->index, &lock, COMMIT_LOCK);
    ++
     +	return ret;
     +}
     
12:  cc1500147b = 11:  bcc7b851ef merge: use the "resolve" strategy without forking
13:  ec3dc3b81e = 12:  9ba13186ed merge: use the "octopus" strategy without forking
14:  e7dc4a15d4 ! 13:  a815a16f33 sequencer: use the "resolve" strategy without forking
    @@ Commit message
     
      ## sequencer.c ##
     @@
    - #include "commit-reach.h"
    - #include "rebase-interactive.h"
      #include "reset.h"
    + #include "branch.h"
    + #include "log-tree.h"
     +#include "merge-strategies.h"
      
      #define GIT_REFLOG_ACTION "GIT_REFLOG_ACTION"
15:  34280dd82d ! 14:  5a11fd0e71 sequencer: use the "octopus" merge strategy without forking
    @@ Metadata
     Author: Alban Gruin <alban.gruin@gmail.com>
     
      ## Commit message ##
    -    sequencer: use the "octopus" merge strategy without forking
    +    sequencer: use the "octopus" strategy without forking
     
         This teaches the sequencer to invoke the "octopus" strategy with a
         function call instead of forking.
-- 
2.37.1.412.gcfdce49ffd



* [PATCH v8 01/14] t6060: modify multiple files to expose a possible issue with merge-index
  2022-08-09 18:54             ` [PATCH v8 00/14] Rewrite the remaining merge strategies from shell to C Alban Gruin
@ 2022-08-09 18:54               ` Alban Gruin
  2022-08-09 18:54               ` [PATCH v8 02/14] t6060: add tests for removed files Alban Gruin
                                 ` (13 subsequent siblings)
  14 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2022-08-09 18:54 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Phillip Wood, Johannes Schindelin, Alban Gruin

Currently, merge-index iterates over every index entry, skipping stage0
entries.  It will then count how many entries following the current one
have the same name, then fork to do the merge.  It will then increase
the iterator by the number of entries to skip them.  This behaviour is
correct, as even if the subprocess modifies the index, merge-index does
not reload it at all.

But once it is rewritten to use a function, the index it uses will be
modified and may shrink when a conflict happens or if a file is
removed, so we have to be careful to handle such cases.

Here is an example:

 *    Merge branches, file1 and file2 are trivially mergeable.
 |\
 | *  Modifies file1 and file2.
 * |  Modifies file1 and file2.
 |/
 *    Adds file1 and file2.

When the merge happens, the index will look like this:

 i -> 0. file1 (stage1)
      1. file1 (stage2)
      2. file1 (stage3)
      3. file2 (stage1)
      4. file2 (stage2)
      5. file2 (stage3)

merge-index handles `file1' first.  As it appears three times at or
after the iterator, it is merged.  The in-memory index is now stale, `i'
is increased by 3, and merge-index still sees the index like this:

      0. file1 (stage1)
      1. file1 (stage2)
      2. file1 (stage3)
 i -> 3. file2 (stage1)
      4. file2 (stage2)
      5. file2 (stage3)

`file2' appears three times too, so it is merged.

With a naive rewrite, the index would look like this:

      0. file1 (stage0)
      1. file2 (stage1)
      2. file2 (stage2)
 i -> 3. file2 (stage3)

`file2' appears once at the iterator or after, so it will be added,
_not_ merged.  Which is wrong.

A naive rewrite would lead to improperly merged files, or even files not
handled at all.

This changes t6060 to reproduce this case, by creating 2 files instead
of 1, to check the correctness of the soon-to-be-rewritten merge-index.
The files are identical, which is not really important -- the factors
that could trigger this issue are that they should be separated by at
most one entry in the index, and that the first one in the index should
be trivially mergeable.
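
The skipping problem described above can be reproduced with a small
stand-alone model.  This is only a sketch under stated assumptions: the
`toy_*' names are hypothetical and are not part of git, and the
correction shown (remembering the entry count before the merge and
subtracting the shrinkage from the iterator) is one plausible way to
handle a shrinking index, not necessarily what the final patch does.

```c
#include <assert.h>
#include <string.h>

#define TOY_MAX_ENTRIES 16

/* Hypothetical stand-ins for cache entries and the index. */
struct toy_entry { const char *name; int stage; };
struct toy_index { struct toy_entry e[TOY_MAX_ENTRIES]; unsigned int nr; };

/* Count consecutive entries named `path', starting at `pos'. */
static unsigned int toy_count_same(const struct toy_index *idx,
				   unsigned int pos, const char *path)
{
	unsigned int n = 0;

	while (pos + n < idx->nr && !strcmp(idx->e[pos + n].name, path))
		n++;
	return n;
}

/* Pretend merge: collapse `found' staged entries into one stage0 entry. */
static void toy_resolve(struct toy_index *idx, unsigned int pos,
			unsigned int found)
{
	assert(found >= 1 && pos + found <= idx->nr);
	idx->e[pos].stage = 0;
	memmove(&idx->e[pos + 1], &idx->e[pos + found],
		(idx->nr - pos - found) * sizeof(idx->e[0]));
	idx->nr -= found - 1;
}

/* Fill the index with file1 and file2, each at stages 1, 2 and 3. */
void toy_fill(struct toy_index *idx)
{
	static const char *names[] = { "file1", "file2" };
	unsigned int n, s;

	idx->nr = 0;
	for (n = 0; n < 2; n++) {
		for (s = 1; s <= 3; s++) {
			idx->e[idx->nr].name = names[n];
			idx->e[idx->nr].stage = (int)s;
			idx->nr++;
		}
	}
}

/*
 * Walk the index and "merge" every unmerged path in place.  Returns the
 * number of paths that were seen with all three stages.  When
 * `adjust_for_shrink' is zero, the iterator is advanced as if the index
 * had not changed, which is the naive rewrite.
 */
int toy_merge_all(struct toy_index *idx, int adjust_for_shrink)
{
	int full = 0;
	unsigned int i;

	for (i = 0; i < idx->nr; i++) {
		unsigned int found, prev_nr;

		if (!idx->e[i].stage)
			continue;

		found = toy_count_same(idx, i, idx->e[i].name);
		if (found == 3)
			full++;

		prev_nr = idx->nr;
		toy_resolve(idx, i, found);

		i += found - 1;
		if (adjust_for_shrink)
			i -= prev_nr - idx->nr; /* entries that disappeared */
	}
	return full;
}
```

With the six-entry index from the example, the naive walk only sees
`file1' with all three stages and lands on the last entry of `file2',
whereas the adjusted walk sees both files with all three stages.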

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 t/t6060-merge-index.sh | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/t/t6060-merge-index.sh b/t/t6060-merge-index.sh
index ed449abe55..d0d6dec0c8 100755
--- a/t/t6060-merge-index.sh
+++ b/t/t6060-merge-index.sh
@@ -5,16 +5,19 @@ test_description='basic git merge-index / git-merge-one-file tests'
 
 test_expect_success 'setup diverging branches' '
 	test_write_lines 1 2 3 4 5 6 7 8 9 10 >file &&
-	git add file &&
+	cp file file2 &&
+	git add file file2 &&
 	git commit -m base &&
 	git tag base &&
 	sed s/2/two/ <file >tmp &&
 	mv tmp file &&
+	cp file file2 &&
 	git commit -a -m two &&
 	git tag two &&
 	git checkout -b other HEAD^ &&
 	sed s/10/ten/ <file >tmp &&
 	mv tmp file &&
+	cp file file2 &&
 	git commit -a -m ten &&
 	git tag ten
 '
@@ -33,8 +36,11 @@ ten
 EOF
 
 test_expect_success 'read-tree does not resolve content merge' '
+	cat >expect <<-\EOF &&
+	file
+	file2
+	EOF
 	git read-tree -i -m base ten two &&
-	echo file >expect &&
 	git diff-files --name-only --diff-filter=U >unmerged &&
 	test_cmp expect unmerged
 '
-- 
2.37.1.412.gcfdce49ffd



* [PATCH v8 02/14] t6060: add tests for removed files
  2022-08-09 18:54             ` [PATCH v8 00/14] Rewrite the remaining merge strategies from shell to C Alban Gruin
  2022-08-09 18:54               ` [PATCH v8 01/14] t6060: modify multiple files to expose a possible issue with merge-index Alban Gruin
@ 2022-08-09 18:54               ` Alban Gruin
  2022-08-09 18:54               ` [PATCH v8 03/14] merge-index: libify merge_one_path() and merge_all() Alban Gruin
                                 ` (12 subsequent siblings)
  14 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2022-08-09 18:54 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Phillip Wood, Johannes Schindelin, Alban Gruin

Until now, t6060 did not check git-merge-one-file's behaviour when a
file is deleted in a branch.  To avoid regressions on this during the
conversion from shell to C, this adds a new file, `file3', in the commit
tagged as `base', and deletes it in the commit tagged as `two'.

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 t/t6060-merge-index.sh | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/t/t6060-merge-index.sh b/t/t6060-merge-index.sh
index d0d6dec0c8..bb4da4bbb2 100755
--- a/t/t6060-merge-index.sh
+++ b/t/t6060-merge-index.sh
@@ -6,12 +6,14 @@ test_description='basic git merge-index / git-merge-one-file tests'
 test_expect_success 'setup diverging branches' '
 	test_write_lines 1 2 3 4 5 6 7 8 9 10 >file &&
 	cp file file2 &&
-	git add file file2 &&
+	cp file file3 &&
+	git add file file2 file3 &&
 	git commit -m base &&
 	git tag base &&
 	sed s/2/two/ <file >tmp &&
 	mv tmp file &&
 	cp file file2 &&
+	git rm file3 &&
 	git commit -a -m two &&
 	git tag two &&
 	git checkout -b other HEAD^ &&
@@ -39,6 +41,7 @@ test_expect_success 'read-tree does not resolve content merge' '
 	cat >expect <<-\EOF &&
 	file
 	file2
+	file3
 	EOF
 	git read-tree -i -m base ten two &&
 	git diff-files --name-only --diff-filter=U >unmerged &&
-- 
2.37.1.412.gcfdce49ffd



* [PATCH v8 03/14] merge-index: libify merge_one_path() and merge_all()
  2022-08-09 18:54             ` [PATCH v8 00/14] Rewrite the remaining merge strategies from shell to C Alban Gruin
  2022-08-09 18:54               ` [PATCH v8 01/14] t6060: modify multiple files to expose a possible issue with merge-index Alban Gruin
  2022-08-09 18:54               ` [PATCH v8 02/14] t6060: add tests for removed files Alban Gruin
@ 2022-08-09 18:54               ` Alban Gruin
  2022-08-17  2:10                 ` Ævar Arnfjörð Bjarmason
  2022-08-09 18:54               ` [PATCH v8 04/14] merge-index: drop the index Alban Gruin
                                 ` (11 subsequent siblings)
  14 siblings, 1 reply; 221+ messages in thread
From: Alban Gruin @ 2022-08-09 18:54 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Phillip Wood, Johannes Schindelin, Alban Gruin

The "resolve" and "octopus" merge strategies do not call `git
merge-one-file' directly; they delegate the work to another git command,
`git merge-index', which loops over files in the index and calls the
specified command.  Unfortunately, the functions implementing this loop
are not part of libgit.a, which means that once rewritten, the strategies
would still have to invoke `merge-one-file' by spawning a new process.

To avoid this, this moves and renames merge_one_path(), merge_all(), and
their helpers to merge-strategies.c.  They also take a callback to
dictate what they should do for each file.  For now, to preserve the
behaviour of `merge-index', only one callback, launching a new process,
is defined.
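
The callback arrangement described above can be illustrated with a
reduced, stand-alone model.  This is a sketch, not the code from this
patch: `toy_merge_fn', `toy_merge_paths' and `toy_conflict_cb' are
hypothetical names, and blob ids are modelled as plain strings (NULL
meaning "stage absent") instead of `struct object_id'.  It only aims to
show how a driver can stay agnostic about what is done for each path,
and how a `oneshot'-like flag could change when the driver stops.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified analogue of merge_fn. */
typedef int (*toy_merge_fn)(const char *orig, const char *ours,
			    const char *theirs, const char *path,
			    void *data);

struct toy_path {
	const char *path;
	const char *stages[3]; /* base, ours, theirs; NULL if absent */
};

/*
 * Call `fn' once per path.  Without `oneshot', stop at the first
 * failure and return 1; with it, keep going and return the number of
 * failures.
 */
int toy_merge_paths(const struct toy_path *paths, size_t nr, int oneshot,
		    toy_merge_fn fn, void *data)
{
	int err = 0;
	size_t i;

	assert(fn != NULL);
	for (i = 0; i < nr; i++) {
		if (fn(paths[i].stages[0], paths[i].stages[1],
		       paths[i].stages[2], paths[i].path, data)) {
			err++;
			if (!oneshot)
				return 1;
		}
	}
	return err;
}

/*
 * Example callback: count invocations through `data', and report a
 * failure when both sides are present and differ.
 */
int toy_conflict_cb(const char *orig, const char *ours, const char *theirs,
		    const char *path, void *data)
{
	int *calls = data;

	(void)orig;
	(void)path;
	(*calls)++;
	return ours && theirs && strcmp(ours, theirs) != 0;
}
```

A caller that wants to spawn a program would pass a different callback
to the same driver; the `void *data' pointer lets each callback carry
its own state without globals.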

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 Makefile               |  1 +
 builtin/merge-index.c  | 92 ++++++++++++++----------------------------
 merge-strategies.c     | 82 +++++++++++++++++++++++++++++++++++++
 merge-strategies.h     | 18 +++++++++
 t/t7607-merge-state.sh |  2 +-
 5 files changed, 133 insertions(+), 62 deletions(-)
 create mode 100644 merge-strategies.c
 create mode 100644 merge-strategies.h

diff --git a/Makefile b/Makefile
index 2ec9b2dc6b..40d1be4e5e 100644
--- a/Makefile
+++ b/Makefile
@@ -991,6 +991,7 @@ LIB_OBJS += merge-blobs.o
 LIB_OBJS += merge-ort.o
 LIB_OBJS += merge-ort-wrappers.o
 LIB_OBJS += merge-recursive.o
+LIB_OBJS += merge-strategies.o
 LIB_OBJS += merge.o
 LIB_OBJS += midx.o
 LIB_OBJS += name-hash.o
diff --git a/builtin/merge-index.c b/builtin/merge-index.c
index c0383fe9df..f66cc515d8 100644
--- a/builtin/merge-index.c
+++ b/builtin/merge-index.c
@@ -1,76 +1,43 @@
 #define USE_THE_INDEX_COMPATIBILITY_MACROS
 #include "builtin.h"
+#include "merge-strategies.h"
 #include "run-command.h"
 
 static const char *pgm;
-static int one_shot, quiet;
-static int err;
 
-static int merge_entry(int pos, const char *path)
+static int merge_one_file_spawn(struct index_state *istate,
+				const struct object_id *orig_blob,
+				const struct object_id *our_blob,
+				const struct object_id *their_blob, const char *path,
+				unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
+				void *data)
 {
-	int found;
-	const char *arguments[] = { pgm, "", "", "", path, "", "", "", NULL };
-	char hexbuf[4][GIT_MAX_HEXSZ + 1];
-	char ownbuf[4][60];
+	char oids[3][GIT_MAX_HEXSZ + 1] = {{0}};
+	char modes[3][10] = {{0}};
+	const char *arguments[] = { pgm, oids[0], oids[1], oids[2],
+				    path, modes[0], modes[1], modes[2], NULL };
 
-	if (pos >= active_nr)
-		die("git merge-index: %s not in the cache", path);
-	found = 0;
-	do {
-		const struct cache_entry *ce = active_cache[pos];
-		int stage = ce_stage(ce);
-
-		if (strcmp(ce->name, path))
-			break;
-		found++;
-		oid_to_hex_r(hexbuf[stage], &ce->oid);
-		xsnprintf(ownbuf[stage], sizeof(ownbuf[stage]), "%o", ce->ce_mode);
-		arguments[stage] = hexbuf[stage];
-		arguments[stage + 4] = ownbuf[stage];
-	} while (++pos < active_nr);
-	if (!found)
-		die("git merge-index: %s not in the cache", path);
-
-	if (run_command_v_opt(arguments, 0)) {
-		if (one_shot)
-			err++;
-		else {
-			if (!quiet)
-				die("merge program failed");
-			exit(1);
-		}
+	if (orig_blob) {
+		oid_to_hex_r(oids[0], orig_blob);
+		xsnprintf(modes[0], sizeof(modes[0]), "%06o", orig_mode);
 	}
-	return found;
-}
-
-static void merge_one_path(const char *path)
-{
-	int pos = cache_name_pos(path, strlen(path));
 
-	/*
-	 * If it already exists in the cache as stage0, it's
-	 * already merged and there is nothing to do.
-	 */
-	if (pos < 0)
-		merge_entry(-pos-1, path);
-}
+	if (our_blob) {
+		oid_to_hex_r(oids[1], our_blob);
+		xsnprintf(modes[1], sizeof(modes[1]), "%06o", our_mode);
+	}
 
-static void merge_all(void)
-{
-	int i;
-	/* TODO: audit for interaction with sparse-index. */
-	ensure_full_index(&the_index);
-	for (i = 0; i < active_nr; i++) {
-		const struct cache_entry *ce = active_cache[i];
-		if (!ce_stage(ce))
-			continue;
-		i += merge_entry(i, ce->name)-1;
+	if (their_blob) {
+		oid_to_hex_r(oids[2], their_blob);
+		xsnprintf(modes[2], sizeof(modes[2]), "%06o", their_mode);
 	}
+
+	return run_command_v_opt(arguments, 0);
 }
 
 int cmd_merge_index(int argc, const char **argv, const char *prefix)
 {
-	int i, force_file = 0;
+	int i, force_file = 0, err = 0, one_shot = 0, quiet = 0;
 
 	/* Without this we cannot rely on waitpid() to tell
 	 * what happened to our children.
@@ -94,7 +61,9 @@ int cmd_merge_index(int argc, const char **argv, const char *prefix)
 		quiet = 1;
 		i++;
 	}
+
 	pgm = argv[i++];
+
 	for (; i < argc; i++) {
 		const char *arg = argv[i];
 		if (!force_file && *arg == '-') {
@@ -103,14 +72,15 @@ int cmd_merge_index(int argc, const char **argv, const char *prefix)
 				continue;
 			}
 			if (!strcmp(arg, "-a")) {
-				merge_all();
+				err |= merge_all_index(&the_index, one_shot, quiet,
+						       merge_one_file_spawn, NULL);
 				continue;
 			}
 			die("git merge-index: unknown option %s", arg);
 		}
-		merge_one_path(arg);
+		err |= merge_index_path(&the_index, one_shot, quiet, arg,
+					merge_one_file_spawn, NULL);
 	}
-	if (err && !quiet)
-		die("merge program failed");
+
 	return err;
 }
diff --git a/merge-strategies.c b/merge-strategies.c
new file mode 100644
index 0000000000..418c9dd710
--- /dev/null
+++ b/merge-strategies.c
@@ -0,0 +1,82 @@
+#include "cache.h"
+#include "merge-strategies.h"
+
+static int merge_entry(struct index_state *istate, unsigned int pos,
+		       const char *path, int *err, merge_fn fn, void *data)
+{
+	int found = 0;
+	const struct object_id *oids[3] = {NULL};
+	unsigned int modes[3] = {0};
+
+	do {
+		const struct cache_entry *ce = istate->cache[pos];
+		int stage = ce_stage(ce);
+
+		if (strcmp(ce->name, path))
+			break;
+		found++;
+		oids[stage - 1] = &ce->oid;
+		modes[stage - 1] = ce->ce_mode;
+	} while (++pos < istate->cache_nr);
+	if (!found)
+		return error(_("%s is not in the cache"), path);
+
+	if (fn(istate, oids[0], oids[1], oids[2], path,
+	       modes[0], modes[1], modes[2], data))
+		(*err)++;
+
+	return found;
+}
+
+int merge_index_path(struct index_state *istate, int oneshot, int quiet,
+		     const char *path, merge_fn fn, void *data)
+{
+	int pos = index_name_pos(istate, path, strlen(path)), ret, err = 0;
+
+	/*
+	 * If it already exists in the cache as stage0, it's
+	 * already merged and there is nothing to do.
+	 */
+	if (pos < 0) {
+		ret = merge_entry(istate, -pos - 1, path, &err, fn, data);
+		if (ret == -1)
+			return -1;
+		else if (err) {
+			if (!quiet && !oneshot)
+				error(_("merge program failed"));
+			return 1;
+		}
+	}
+	return 0;
+}
+
+int merge_all_index(struct index_state *istate, int oneshot, int quiet,
+		    merge_fn fn, void *data)
+{
+	int err = 0, ret;
+	unsigned int i;
+
+	/* TODO: audit for interaction with sparse-index. */
+	ensure_full_index(istate);
+	for (i = 0; i < istate->cache_nr; i++) {
+		const struct cache_entry *ce = istate->cache[i];
+		if (!ce_stage(ce))
+			continue;
+
+		ret = merge_entry(istate, i, ce->name, &err, fn, data);
+		if (ret > 0)
+			i += ret - 1;
+		else if (ret == -1)
+			return -1;
+
+		if (err && !oneshot) {
+			if (!quiet)
+				error(_("merge program failed"));
+			return 1;
+		}
+	}
+
+	if (err && !quiet)
+		error(_("merge program failed"));
+	return err;
+}
diff --git a/merge-strategies.h b/merge-strategies.h
new file mode 100644
index 0000000000..88f476f170
--- /dev/null
+++ b/merge-strategies.h
@@ -0,0 +1,18 @@
+#ifndef MERGE_STRATEGIES_H
+#define MERGE_STRATEGIES_H
+
+#include "object.h"
+
+typedef int (*merge_fn)(struct index_state *istate,
+			const struct object_id *orig_blob,
+			const struct object_id *our_blob,
+			const struct object_id *their_blob, const char *path,
+			unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
+			void *data);
+
+int merge_index_path(struct index_state *istate, int oneshot, int quiet,
+		     const char *path, merge_fn fn, void *data);
+int merge_all_index(struct index_state *istate, int oneshot, int quiet,
+		    merge_fn fn, void *data);
+
+#endif /* MERGE_STRATEGIES_H */
diff --git a/t/t7607-merge-state.sh b/t/t7607-merge-state.sh
index 89a62ac53b..96befa5b80 100755
--- a/t/t7607-merge-state.sh
+++ b/t/t7607-merge-state.sh
@@ -20,7 +20,7 @@ test_expect_success 'Ensure we restore original state if no merge strategy handl
 	# just hit conflicts, it completely fails and says that it cannot
 	# handle this type of merge.
 	test_expect_code 2 git merge branch2 branch3 >output 2>&1 &&
-	grep "fatal: merge program failed" output &&
+	grep "error: merge program failed" output &&
 	grep "Should not be doing an octopus" output &&
 
 	# Make sure we did not leave stray changes around when no appropriate
-- 
2.37.1.412.gcfdce49ffd



* [PATCH v8 04/14] merge-index: drop the index
  2022-08-09 18:54             ` [PATCH v8 00/14] Rewrite the remaining merge strategies from shell to C Alban Gruin
                                 ` (2 preceding siblings ...)
  2022-08-09 18:54               ` [PATCH v8 03/14] merge-index: libify merge_one_path() and merge_all() Alban Gruin
@ 2022-08-09 18:54               ` Alban Gruin
  2022-08-09 18:54               ` [PATCH v8 05/14] merge-index: add a new way to invoke `git-merge-one-file' Alban Gruin
                                 ` (10 subsequent siblings)
  14 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2022-08-09 18:54 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Phillip Wood, Johannes Schindelin, Alban Gruin

In an effort to reduce the usage of the global index throughout the
codebase, this removes references to it in `git merge-index'.

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 builtin/merge-index.c | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/builtin/merge-index.c b/builtin/merge-index.c
index f66cc515d8..9d74b6e85c 100644
--- a/builtin/merge-index.c
+++ b/builtin/merge-index.c
@@ -1,4 +1,3 @@
-#define USE_THE_INDEX_COMPATIBILITY_MACROS
 #include "builtin.h"
 #include "merge-strategies.h"
 #include "run-command.h"
@@ -38,6 +37,7 @@ static int merge_one_file_spawn(struct index_state *istate,
 int cmd_merge_index(int argc, const char **argv, const char *prefix)
 {
 	int i, force_file = 0, err = 0, one_shot = 0, quiet = 0;
+	struct repository *r = the_repository;
 
 	/* Without this we cannot rely on waitpid() to tell
 	 * what happened to our children.
@@ -47,10 +47,11 @@ int cmd_merge_index(int argc, const char **argv, const char *prefix)
 	if (argc < 3)
 		usage("git merge-index [-o] [-q] <merge-program> (-a | [--] [<filename>...])");
 
-	read_cache();
+	if (repo_read_index(r) < 0)
+		die("invalid index");
 
 	/* TODO: audit for interaction with sparse-index. */
-	ensure_full_index(&the_index);
+	ensure_full_index(r->index);
 
 	i = 1;
 	if (!strcmp(argv[i], "-o")) {
@@ -72,13 +73,13 @@ int cmd_merge_index(int argc, const char **argv, const char *prefix)
 				continue;
 			}
 			if (!strcmp(arg, "-a")) {
-				err |= merge_all_index(&the_index, one_shot, quiet,
+				err |= merge_all_index(r->index, one_shot, quiet,
 						       merge_one_file_spawn, NULL);
 				continue;
 			}
 			die("git merge-index: unknown option %s", arg);
 		}
-		err |= merge_index_path(&the_index, one_shot, quiet, arg,
+		err |= merge_index_path(r->index, one_shot, quiet, arg,
 					merge_one_file_spawn, NULL);
 	}
 
-- 
2.37.1.412.gcfdce49ffd



* [PATCH v8 05/14] merge-index: add a new way to invoke `git-merge-one-file'
  2022-08-09 18:54             ` [PATCH v8 00/14] Rewrite the remaining merge strategies from shell to C Alban Gruin
                                 ` (3 preceding siblings ...)
  2022-08-09 18:54               ` [PATCH v8 04/14] merge-index: drop the index Alban Gruin
@ 2022-08-09 18:54               ` Alban Gruin
  2022-08-09 21:36                 ` Johannes Schindelin
  2022-08-09 18:54               ` [PATCH v8 06/14] update-index: move add_cacheinfo() to read-cache.c Alban Gruin
                                 ` (9 subsequent siblings)
  14 siblings, 1 reply; 221+ messages in thread
From: Alban Gruin @ 2022-08-09 18:54 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Phillip Wood, Johannes Schindelin, Alban Gruin

Since `git-merge-one-file' will be rewritten and libified, there may be
cases where no executable with that name exists (i.e. when git is
compiled with `SKIP_DASHED_BUILT_INS' enabled).  This adds a new way to
invoke this particular program even if it does not exist, by passing
`--use=merge-one-file' to merge-index.  For now, it still forks.

The test suite and shell scripts (git-merge-octopus.sh and
git-merge-resolve.sh) are updated to use this new convention.

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
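Note for readers: the `--use=' handling described above can be sketched
outside of git as follows.  This is only an illustration under stated
assumptions: skip_prefix_sketch() is a hypothetical stand-in for git's
skip_prefix(), resolve_merge_program() is a made-up name, and returning
NULL stands in for the die() that the real command performs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for git's skip_prefix(): if `str` starts with
 * `prefix`, store a pointer to the remainder in `*out` and return 1.
 */
static int skip_prefix_sketch(const char *str, const char *prefix,
			      const char **out)
{
	size_t len = strlen(prefix);

	assert(str && prefix && out);
	if (strncmp(str, prefix, len))
		return 0;
	*out = str + len;
	return 1;
}

/*
 * Map the <merge-program> argument to the program to run.  Returns
 * NULL for an unknown `--use=' value (the real command dies instead).
 */
static const char *resolve_merge_program(const char *pgm)
{
	const char *internal = NULL;

	if (!skip_prefix_sketch(pgm, "--use=", &internal))
		return pgm;	/* external program, used as given */
	if (!strcmp(internal, "merge-one-file"))
		return "git-merge-one-file";
	return NULL;
}
```

With this sketch, "--use=merge-one-file" resolves to the dashed program
name, while any other argument without the prefix is passed through
untouched.
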
 Documentation/git-merge-index.txt |  7 ++++---
 builtin/merge-index.c             | 25 ++++++++++++++++++++++---
 git-merge-octopus.sh              |  2 +-
 git-merge-resolve.sh              |  2 +-
 t/t6060-merge-index.sh            |  8 ++++----
 5 files changed, 32 insertions(+), 12 deletions(-)

diff --git a/Documentation/git-merge-index.txt b/Documentation/git-merge-index.txt
index eea56b3154..622638a13b 100644
--- a/Documentation/git-merge-index.txt
+++ b/Documentation/git-merge-index.txt
@@ -9,7 +9,7 @@ git-merge-index - Run a merge for files needing merging
 SYNOPSIS
 --------
 [verse]
-'git merge-index' [-o] [-q] <merge-program> (-a | ( [--] <file>...) )
+'git merge-index' [-o] [-q] (<merge-program> | --use=merge-one-file) (-a | ( [--] <file>...) )
 
 DESCRIPTION
 -----------
@@ -44,8 +44,9 @@ code.
 Typically this is run with a script calling Git's imitation of
 the 'merge' command from the RCS package.
 
-A sample script called 'git merge-one-file' is included in the
-distribution.
+A sample script called 'git merge-one-file' used to be included in the
+distribution. This program must now be called with
+'--use=merge-one-file'.
 
 ALERT ALERT ALERT! The Git "merge object order" is different from the
 RCS 'merge' program merge object order. In the above ordering, the
diff --git a/builtin/merge-index.c b/builtin/merge-index.c
index 9d74b6e85c..aba3ba5694 100644
--- a/builtin/merge-index.c
+++ b/builtin/merge-index.c
@@ -1,4 +1,5 @@
 #include "builtin.h"
+#include "lockfile.h"
 #include "merge-strategies.h"
 #include "run-command.h"
 
@@ -37,7 +38,10 @@ static int merge_one_file_spawn(struct index_state *istate,
 int cmd_merge_index(int argc, const char **argv, const char *prefix)
 {
 	int i, force_file = 0, err = 0, one_shot = 0, quiet = 0;
+	merge_fn merge_action = merge_one_file_spawn;
+	struct lock_file lock = LOCK_INIT;
 	struct repository *r = the_repository;
+	const char *use_internal = NULL;
 
 	/* Without this we cannot rely on waitpid() to tell
 	 * what happened to our children.
@@ -45,7 +49,7 @@ int cmd_merge_index(int argc, const char **argv, const char *prefix)
 	signal(SIGCHLD, SIG_DFL);
 
 	if (argc < 3)
-		usage("git merge-index [-o] [-q] <merge-program> (-a | [--] [<filename>...])");
+		usage("git merge-index [-o] [-q] (<merge-program> | --use=merge-one-file) (-a | [--] [<filename>...])");
 
 	if (repo_read_index(r) < 0)
 		die("invalid index");
@@ -64,6 +68,14 @@ int cmd_merge_index(int argc, const char **argv, const char *prefix)
 	}
 
 	pgm = argv[i++];
+	setup_work_tree();
+
+	if (skip_prefix(pgm, "--use=", &use_internal)) {
+		if (!strcmp(use_internal, "merge-one-file"))
+			pgm = "git-merge-one-file";
+		else
+			die(_("git merge-index: unknown internal program %s"), use_internal);
+	}
 
 	for (; i < argc; i++) {
 		const char *arg = argv[i];
@@ -74,13 +86,20 @@ int cmd_merge_index(int argc, const char **argv, const char *prefix)
 			}
 			if (!strcmp(arg, "-a")) {
 				err |= merge_all_index(r->index, one_shot, quiet,
-						       merge_one_file_spawn, NULL);
+						       merge_action, NULL);
 				continue;
 			}
 			die("git merge-index: unknown option %s", arg);
 		}
 		err |= merge_index_path(r->index, one_shot, quiet, arg,
-					merge_one_file_spawn, NULL);
+					merge_action, NULL);
+	}
+
+	if (is_lock_file_locked(&lock)) {
+		if (err)
+			rollback_lock_file(&lock);
+		else
+			return write_locked_index(r->index, &lock, COMMIT_LOCK);
 	}
 
 	return err;
diff --git a/git-merge-octopus.sh b/git-merge-octopus.sh
index 7d19d37951..2770891960 100755
--- a/git-merge-octopus.sh
+++ b/git-merge-octopus.sh
@@ -100,7 +100,7 @@ do
 	if test $? -ne 0
 	then
 		gettextln "Simple merge did not work, trying automatic merge."
-		git merge-index -o git-merge-one-file -a ||
+		git merge-index -o --use=merge-one-file -a ||
 		OCTOPUS_FAILURE=1
 		next=$(git write-tree 2>/dev/null)
 	fi
diff --git a/git-merge-resolve.sh b/git-merge-resolve.sh
index 77e93121bf..e59175eb75 100755
--- a/git-merge-resolve.sh
+++ b/git-merge-resolve.sh
@@ -55,7 +55,7 @@ then
 	exit 0
 else
 	echo "Simple merge failed, trying Automatic merge."
-	if git merge-index -o git-merge-one-file -a
+	if git merge-index -o --use=merge-one-file -a
 	then
 		exit 0
 	else
diff --git a/t/t6060-merge-index.sh b/t/t6060-merge-index.sh
index bb4da4bbb2..3845a9d3cc 100755
--- a/t/t6060-merge-index.sh
+++ b/t/t6060-merge-index.sh
@@ -48,8 +48,8 @@ test_expect_success 'read-tree does not resolve content merge' '
 	test_cmp expect unmerged
 '
 
-test_expect_success 'git merge-index git-merge-one-file resolves' '
-	git merge-index git-merge-one-file -a &&
+test_expect_success 'git merge-index --use=merge-one-file resolves' '
+	git merge-index --use=merge-one-file -a &&
 	git diff-files --name-only --diff-filter=U >unmerged &&
 	test_must_be_empty unmerged &&
 	test_cmp expect-merged file &&
@@ -81,7 +81,7 @@ test_expect_success 'merge-one-file respects GIT_WORK_TREE' '
 	 export GIT_WORK_TREE &&
 	 GIT_INDEX_FILE=$PWD/merge.index &&
 	 export GIT_INDEX_FILE &&
-	 git merge-index git-merge-one-file -a &&
+	 git merge-index --use=merge-one-file -a &&
 	 git cat-file blob :file >work/file-index
 	) &&
 	test_cmp expect-merged bare.git/work/file &&
@@ -96,7 +96,7 @@ test_expect_success 'merge-one-file respects core.worktree' '
 	 export GIT_DIR &&
 	 git config core.worktree "$PWD/child" &&
 	 git read-tree -i -m base ten two &&
-	 git merge-index git-merge-one-file -a &&
+	 git merge-index --use=merge-one-file -a &&
 	 git cat-file blob :file >file-index
 	) &&
 	test_cmp expect-merged subdir/child/file &&
-- 
2.37.1.412.gcfdce49ffd



* [PATCH v8 06/14] update-index: move add_cacheinfo() to read-cache.c
  2022-08-09 18:54             ` [PATCH v8 00/14] Rewrite the remaining merge strategies from shell to C Alban Gruin
                                 ` (4 preceding siblings ...)
  2022-08-09 18:54               ` [PATCH v8 05/14] merge-index: add a new way to invoke `git-merge-one-file' Alban Gruin
@ 2022-08-09 18:54               ` Alban Gruin
  2022-08-09 18:54               ` [PATCH v8 07/14] merge-one-file: rewrite in C Alban Gruin
                                 ` (8 subsequent siblings)
  14 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2022-08-09 18:54 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Phillip Wood, Johannes Schindelin, Alban Gruin

This moves the function add_cacheinfo() that already exists in
update-index.c to read-cache.c, renames it add_to_index_cacheinfo(),
and adds an `istate' parameter.  The new cache entry is returned through
a pointer passed in the parameters.  This function can return three
values:

 - 0, when the file has been successfully added to the index;
 - ADD_TO_INDEX_CACHEINFO_INVALID_PATH, when the path is invalid;
 - ADD_TO_INDEX_CACHEINFO_UNABLE_TO_ADD, when the file could not be
   added to the index.

This will become useful in the next commit, when the three-way merge
will need to call this function.

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
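Note for readers: a caller is expected to translate the three return
values into its own messages.  A minimal standalone sketch of that
mapping, assuming a hypothetical helper describe_add_result() that is
not part of this patch (the two macros are copied from the patch, the
message strings are only illustrative):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define ADD_TO_INDEX_CACHEINFO_INVALID_PATH (-1)
#define ADD_TO_INDEX_CACHEINFO_UNABLE_TO_ADD (-2)

/*
 * Hypothetical caller-side helper: turn a result code into a message
 * stored in `buf`.  Returns 0 on success, -1 on any error.
 */
static int describe_add_result(int res, const char *path,
			       char *buf, size_t len)
{
	assert(path && buf && len);

	switch (res) {
	case 0:
		snprintf(buf, len, "add '%s'", path);
		return 0;
	case ADD_TO_INDEX_CACHEINFO_INVALID_PATH:
		snprintf(buf, len, "Invalid path '%s'", path);
		return -1;
	case ADD_TO_INDEX_CACHEINFO_UNABLE_TO_ADD:
		snprintf(buf, len, "%s: cannot add to the index", path);
		return -1;
	default:
		/* Defensive: no other value is documented. */
		snprintf(buf, len, "%s: unexpected result %d", path, res);
		return -1;
	}
}
```

Keeping the codes distinct lets each caller pick its own wording, which
is what allows update-index to keep its "missing --add option?" hint
while other callers stay silent.
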
 builtin/update-index.c | 25 +++++++------------------
 cache.h                |  8 ++++++++
 read-cache.c           | 35 +++++++++++++++++++++++++++++++++++
 3 files changed, 50 insertions(+), 18 deletions(-)

diff --git a/builtin/update-index.c b/builtin/update-index.c
index b62249905f..2e322a58f2 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -411,27 +411,16 @@ static int process_path(const char *path, struct stat *st, int stat_errno)
 static int add_cacheinfo(unsigned int mode, const struct object_id *oid,
 			 const char *path, int stage)
 {
-	int len, option;
-	struct cache_entry *ce;
+	int res;
 
-	if (!verify_path(path, mode))
-		return error("Invalid path '%s'", path);
-
-	len = strlen(path);
-	ce = make_empty_cache_entry(&the_index, len);
-
-	oidcpy(&ce->oid, oid);
-	memcpy(ce->name, path, len);
-	ce->ce_flags = create_ce_flags(stage);
-	ce->ce_namelen = len;
-	ce->ce_mode = create_ce_mode(mode);
-	if (assume_unchanged)
-		ce->ce_flags |= CE_VALID;
-	option = allow_add ? ADD_CACHE_OK_TO_ADD : 0;
-	option |= allow_replace ? ADD_CACHE_OK_TO_REPLACE : 0;
-	if (add_cache_entry(ce, option))
+	res = add_to_index_cacheinfo(&the_index, mode, oid, path, stage,
+				     allow_add, allow_replace, NULL);
+	if (res == ADD_TO_INDEX_CACHEINFO_INVALID_PATH)
+		return error(_("Invalid path '%s'"), path);
+	if (res == ADD_TO_INDEX_CACHEINFO_UNABLE_TO_ADD)
 		return error("%s: cannot add to the index - missing --add option?",
 			     path);
+
 	report("add '%s'", path);
 	return 0;
 }
diff --git a/cache.h b/cache.h
index 4aa1bd079d..6b5d0a2ba3 100644
--- a/cache.h
+++ b/cache.h
@@ -885,6 +885,14 @@ int remove_file_from_index(struct index_state *, const char *path);
 int add_to_index(struct index_state *, const char *path, struct stat *, int flags);
 int add_file_to_index(struct index_state *, const char *path, int flags);
 
+#define ADD_TO_INDEX_CACHEINFO_INVALID_PATH (-1)
+#define ADD_TO_INDEX_CACHEINFO_UNABLE_TO_ADD (-2)
+
+int add_to_index_cacheinfo(struct index_state *, unsigned int mode,
+			   const struct object_id *oid, const char *path,
+			   int stage, int allow_add, int allow_replace,
+			   struct cache_entry **ce_ret);
+
 int chmod_index_entry(struct index_state *, struct cache_entry *ce, char flip);
 int ce_same_name(const struct cache_entry *a, const struct cache_entry *b);
 void set_object_name_for_intent_to_add_entry(struct cache_entry *ce);
diff --git a/read-cache.c b/read-cache.c
index 4de207752d..e895bf5c6a 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1436,6 +1436,41 @@ int add_index_entry(struct index_state *istate, struct cache_entry *ce, int opti
 	return 0;
 }
 
+int add_to_index_cacheinfo(struct index_state *istate, unsigned int mode,
+			   const struct object_id *oid, const char *path,
+			   int stage, int allow_add, int allow_replace,
+			   struct cache_entry **ce_ret)
+{
+	int len, option;
+	struct cache_entry *ce;
+
+	if (!verify_path(path, mode))
+		return ADD_TO_INDEX_CACHEINFO_INVALID_PATH;
+
+	len = strlen(path);
+	ce = make_empty_cache_entry(istate, len);
+
+	oidcpy(&ce->oid, oid);
+	memcpy(ce->name, path, len);
+	ce->ce_flags = create_ce_flags(stage);
+	ce->ce_namelen = len;
+	ce->ce_mode = create_ce_mode(mode);
+	if (assume_unchanged)
+		ce->ce_flags |= CE_VALID;
+	option = allow_add ? ADD_CACHE_OK_TO_ADD : 0;
+	option |= allow_replace ? ADD_CACHE_OK_TO_REPLACE : 0;
+
+	if (add_index_entry(istate, ce, option)) {
+		discard_cache_entry(ce);
+		return ADD_TO_INDEX_CACHEINFO_UNABLE_TO_ADD;
+	}
+
+	if (ce_ret)
+		*ce_ret = ce;
+
+	return 0;
+}
+
 /*
  * "refresh" does not calculate a new sha1 file or bring the
  * cache up-to-date for mode/content changes. But what it
-- 
2.37.1.412.gcfdce49ffd



* [PATCH v8 07/14] merge-one-file: rewrite in C
  2022-08-09 18:54             ` [PATCH v8 00/14] Rewrite the remaining merge strategies from shell to C Alban Gruin
                                 ` (5 preceding siblings ...)
  2022-08-09 18:54               ` [PATCH v8 06/14] update-index: move add_cacheinfo() to read-cache.c Alban Gruin
@ 2022-08-09 18:54               ` Alban Gruin
  2022-08-09 22:01                 ` Johannes Schindelin
  2022-08-09 18:54               ` [PATCH v8 08/14] merge-resolve: " Alban Gruin
                                 ` (7 subsequent siblings)
  14 siblings, 1 reply; 221+ messages in thread
From: Alban Gruin @ 2022-08-09 18:54 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Phillip Wood, Johannes Schindelin, Alban Gruin

This rewrites `git merge-one-file' from shell to C.  This port is not
completely straightforward: to save precious cycles by avoiding reading
and flushing the index repeatedly and writing temporary files when an
operation can be performed in-memory, and to allow other functions to
use the rewrite without forking or worrying about the index, the calls
to external processes are replaced by calls to functions in libgit.a:

 - calls to `update-index --add --cacheinfo' are replaced by calls to
   add_to_index_cacheinfo();

 - calls to `update-index --remove' are replaced by calls to
   remove_file_from_index();

 - calls to `checkout-index -u -f' are replaced by calls to
   checkout_entry();

 - calls to `unpack-file' and `merge-file' are replaced by calls to
   read_mmblob() and xdl_merge(), respectively, to merge files
   in-memory;

 - calls to `checkout-index -f --stage=2' are removed, as this is needed
   to have the correct permission bits on the merged file from the
   script, but not in the C version;

 - calls to `update-index' are replaced by calls to add_file_to_index().

The bulk of the rewrite is done in a new file in libgit.a,
merge-strategies.c.  This will enable the resolve and octopus strategies
to directly call it instead of forking.

This also fixes a bug present in the original script: instead of
checking if a _regular_ file exists when a file exists in the branch to
merge, but not in our branch, the rewritten version checks if a file of
any kind (i.e. a directory, ...) exists.  This fixes test t6035.14,
where the branch to merge had a new file, `a/b', but our branch had a
directory there; it should have failed because a directory exists, but
it did not because there was no regular file called `a/b'.  This test is
now marked as successful.

This also teaches `merge-index' to call merge_three_way() (when invoked
with `--use=merge-one-file') without forking using a new callback,
merge_one_file_func().

To avoid any issue with a shrinking index because of the merge function
used (directly in the process or by forking), as described earlier, the
iterator of the loop of merge_all_index() is increased by the number of
entries with the same name, minus the difference between the number of
entries in the index before and after the merge.

This should handle a shrinking index correctly, but could lead to issues
with a growing index.  However, this case is not handled, as there is no
callback that can produce it.

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
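Note for readers: the iterator adjustment for a shrinking index,
described at the end of the commit message, can be illustrated with a
toy model.  Everything below is an assumption made for illustration:
struct toy_index, count_same_name(), resolve_in_place() and visit_all()
do not exist in git, and the toy "merge" simply collapses all entries
of a path into one, whereas a real callback may also remove the path
entirely.

```c
#include <assert.h>
#include <string.h>

#define MAX_ENTRIES 16

/* Toy index: sorted names; an unmerged path appears once per stage. */
struct toy_index {
	const char *name[MAX_ENTRIES];
	unsigned int nr;
};

/* Count consecutive entries starting at `pos' that share one name. */
static unsigned int count_same_name(const struct toy_index *idx,
				    unsigned int pos)
{
	unsigned int n = 0;

	while (pos + n < idx->nr &&
	       !strcmp(idx->name[pos], idx->name[pos + n]))
		n++;
	return n;
}

/* Toy "merge": collapse `found' entries at `pos' into a single one. */
static void resolve_in_place(struct toy_index *idx, unsigned int pos,
			     unsigned int found)
{
	assert(found >= 1 && pos + found <= idx->nr);
	memmove(&idx->name[pos + 1], &idx->name[pos + found],
		(idx->nr - pos - found) * sizeof(idx->name[0]));
	idx->nr -= found - 1;
}

/* Visit every path exactly once; returns the number of paths seen. */
static unsigned int visit_all(struct toy_index *idx)
{
	unsigned int i = 0, visited = 0;

	while (i < idx->nr) {
		unsigned int before = idx->nr;
		unsigned int found = count_same_name(idx, i);

		if (found > 1)
			resolve_in_place(idx, i, found);
		/* Same-name entries, minus how much the index shrank. */
		i += found - (before - idx->nr);
		visited++;
	}
	return visited;
}
```

If the callback removed the path altogether, the index would shrink by
`found' entries and the step would be zero, so the loop would continue
with whichever entry slid into the current slot.  A growing index would
make the subtraction wrap around, which matches the caveat in the commit
message.
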
 Makefile                        |   2 +-
 builtin.h                       |   1 +
 builtin/merge-index.c           |   4 +-
 builtin/merge-one-file.c        |  92 ++++++++++++++
 git-merge-one-file.sh           | 167 -------------------------
 git.c                           |   1 +
 merge-strategies.c              | 208 +++++++++++++++++++++++++++++++-
 merge-strategies.h              |  13 ++
 t/t6060-merge-index.sh          |   2 +-
 t/t6415-merge-dir-to-symlink.sh |   2 +-
 10 files changed, 317 insertions(+), 175 deletions(-)
 create mode 100644 builtin/merge-one-file.c
 delete mode 100755 git-merge-one-file.sh

diff --git a/Makefile b/Makefile
index 40d1be4e5e..e2e6cbbb41 100644
--- a/Makefile
+++ b/Makefile
@@ -631,7 +631,6 @@ SCRIPT_SH += git-bisect.sh
 SCRIPT_SH += git-difftool--helper.sh
 SCRIPT_SH += git-filter-branch.sh
 SCRIPT_SH += git-merge-octopus.sh
-SCRIPT_SH += git-merge-one-file.sh
 SCRIPT_SH += git-merge-resolve.sh
 SCRIPT_SH += git-mergetool.sh
 SCRIPT_SH += git-quiltimport.sh
@@ -1186,6 +1185,7 @@ BUILTIN_OBJS += builtin/mailsplit.o
 BUILTIN_OBJS += builtin/merge-base.o
 BUILTIN_OBJS += builtin/merge-file.o
 BUILTIN_OBJS += builtin/merge-index.o
+BUILTIN_OBJS += builtin/merge-one-file.o
 BUILTIN_OBJS += builtin/merge-ours.o
 BUILTIN_OBJS += builtin/merge-recursive.o
 BUILTIN_OBJS += builtin/merge-tree.o
diff --git a/builtin.h b/builtin.h
index 40e9ecc848..cdbe91bbe8 100644
--- a/builtin.h
+++ b/builtin.h
@@ -182,6 +182,7 @@ int cmd_merge_base(int argc, const char **argv, const char *prefix);
 int cmd_merge_index(int argc, const char **argv, const char *prefix);
 int cmd_merge_ours(int argc, const char **argv, const char *prefix);
 int cmd_merge_file(int argc, const char **argv, const char *prefix);
+int cmd_merge_one_file(int argc, const char **argv, const char *prefix);
 int cmd_merge_recursive(int argc, const char **argv, const char *prefix);
 int cmd_merge_tree(int argc, const char **argv, const char *prefix);
 int cmd_mktag(int argc, const char **argv, const char *prefix);
diff --git a/builtin/merge-index.c b/builtin/merge-index.c
index aba3ba5694..a242b357f8 100644
--- a/builtin/merge-index.c
+++ b/builtin/merge-index.c
@@ -72,9 +72,11 @@ int cmd_merge_index(int argc, const char **argv, const char *prefix)
 
 	if (skip_prefix(pgm, "--use=", &use_internal)) {
 		if (!strcmp(use_internal, "merge-one-file"))
-			pgm = "git-merge-one-file";
+			merge_action = merge_one_file_func;
 		else
 			die(_("git merge-index: unknown internal program %s"), use_internal);
+
+		repo_hold_locked_index(r, &lock, LOCK_DIE_ON_ERROR);
 	}
 
 	for (; i < argc; i++) {
diff --git a/builtin/merge-one-file.c b/builtin/merge-one-file.c
new file mode 100644
index 0000000000..ec718cc1c9
--- /dev/null
+++ b/builtin/merge-one-file.c
@@ -0,0 +1,92 @@
+/*
+ * Builtin "git merge-one-file"
+ *
+ * Copyright (c) 2020 Alban Gruin
+ *
+ * Based on git-merge-one-file.sh, written by Linus Torvalds.
+ *
+ * This is the git per-file merge utility, called with
+ *
+ *   argv[1] - original file object name (or empty)
+ *   argv[2] - file in branch1 object name (or empty)
+ *   argv[3] - file in branch2 object name (or empty)
+ *   argv[4] - pathname in repository
+ *   argv[5] - original file mode (or empty)
+ *   argv[6] - file in branch1 mode (or empty)
+ *   argv[7] - file in branch2 mode (or empty)
+ *
+ * Handle some trivial cases. The _really_ trivial cases have been
+ * handled already by git read-tree, but that one doesn't do any merges
+ * that might change the tree layout.
+ */
+
+#include "cache.h"
+#include "builtin.h"
+#include "lockfile.h"
+#include "merge-strategies.h"
+
+static const char builtin_merge_one_file_usage[] =
+	"git merge-one-file <orig blob> <our blob> <their blob> <path> "
+	"<orig mode> <our mode> <their mode>\n\n"
+	"Blob ids and modes should be empty for missing files.";
+
+static int read_param(const char *name, const char *arg_blob, const char *arg_mode,
+		      struct object_id *blob, struct object_id **p_blob, unsigned int *mode)
+{
+	if (*arg_blob && !get_oid_hex(arg_blob, blob)) {
+		char *last;
+
+		*p_blob = blob;
+		*mode = strtol(arg_mode, &last, 8);
+
+		if (*last)
+			return error(_("invalid '%s' mode: expected nothing, got '%c'"), name, *last);
+		else if (!(S_ISREG(*mode) || S_ISDIR(*mode) || S_ISLNK(*mode)))
+			return error(_("invalid '%s' mode: %o"), name, *mode);
+	} else if (!*arg_blob && *arg_mode)
+		return error(_("no '%s' object id given, but a mode was still given."), name);
+
+	return 0;
+}
+
+int cmd_merge_one_file(int argc, const char **argv, const char *prefix)
+{
+	struct object_id orig_blob, our_blob, their_blob,
+		*p_orig_blob = NULL, *p_our_blob = NULL, *p_their_blob = NULL;
+	unsigned int orig_mode = 0, our_mode = 0, their_mode = 0, ret = 0;
+	struct lock_file lock = LOCK_INIT;
+	struct repository *r = the_repository;
+
+	if (argc != 8)
+		usage(builtin_merge_one_file_usage);
+
+	if (repo_read_index(r) < 0)
+		die("invalid index");
+
+	repo_hold_locked_index(r, &lock, LOCK_DIE_ON_ERROR);
+
+	if (read_param("orig", argv[1], argv[5], &orig_blob,
+		       &p_orig_blob, &orig_mode))
+		ret = -1;
+
+	if (read_param("our", argv[2], argv[6], &our_blob,
+		       &p_our_blob, &our_mode))
+		ret = -1;
+
+	if (read_param("their", argv[3], argv[7], &their_blob,
+		       &p_their_blob, &their_mode))
+		ret = -1;
+
+	if (ret)
+		return ret;
+
+	ret = merge_three_way(r->index, p_orig_blob, p_our_blob, p_their_blob,
+			      argv[4], orig_mode, our_mode, their_mode);
+
+	if (ret) {
+		rollback_lock_file(&lock);
+		return !!ret;
+	}
+
+	return write_locked_index(r->index, &lock, COMMIT_LOCK);
+}
diff --git a/git-merge-one-file.sh b/git-merge-one-file.sh
deleted file mode 100755
index f6d9852d2f..0000000000
--- a/git-merge-one-file.sh
+++ /dev/null
@@ -1,167 +0,0 @@
-#!/bin/sh
-#
-# Copyright (c) Linus Torvalds, 2005
-#
-# This is the git per-file merge script, called with
-#
-#   $1 - original file SHA1 (or empty)
-#   $2 - file in branch1 SHA1 (or empty)
-#   $3 - file in branch2 SHA1 (or empty)
-#   $4 - pathname in repository
-#   $5 - original file mode (or empty)
-#   $6 - file in branch1 mode (or empty)
-#   $7 - file in branch2 mode (or empty)
-#
-# Handle some trivial cases.. The _really_ trivial cases have
-# been handled already by git read-tree, but that one doesn't
-# do any merges that might change the tree layout.
-
-USAGE='<orig blob> <our blob> <their blob> <path>'
-USAGE="$USAGE <orig mode> <our mode> <their mode>"
-LONG_USAGE="usage: git merge-one-file $USAGE
-
-Blob ids and modes should be empty for missing files."
-
-SUBDIRECTORY_OK=Yes
-. git-sh-setup
-cd_to_toplevel
-require_work_tree
-
-if test $# != 7
-then
-	echo "$LONG_USAGE"
-	exit 1
-fi
-
-case "${1:-.}${2:-.}${3:-.}" in
-#
-# Deleted in both or deleted in one and unchanged in the other
-#
-"$1.." | "$1.$1" | "$1$1.")
-	if { test -z "$6" && test "$5" != "$7"; } ||
-	   { test -z "$7" && test "$5" != "$6"; }
-	then
-		echo "ERROR: File $4 deleted on one branch but had its" >&2
-		echo "ERROR: permissions changed on the other." >&2
-		exit 1
-	fi
-
-	if test -n "$2"
-	then
-		echo "Removing $4"
-	else
-		# read-tree checked that index matches HEAD already,
-		# so we know we do not have this path tracked.
-		# there may be an unrelated working tree file here,
-		# which we should just leave unmolested.  Make sure
-		# we do not have it in the index, though.
-		exec git update-index --remove -- "$4"
-	fi
-	if test -f "$4"
-	then
-		rm -f -- "$4" &&
-		rmdir -p "$(expr "z$4" : 'z\(.*\)/')" 2>/dev/null || :
-	fi &&
-		exec git update-index --remove -- "$4"
-	;;
-
-#
-# Added in one.
-#
-".$2.")
-	# the other side did not add and we added so there is nothing
-	# to be done, except making the path merged.
-	exec git update-index --add --cacheinfo "$6" "$2" "$4"
-	;;
-"..$3")
-	echo "Adding $4"
-	if test -f "$4"
-	then
-		echo "ERROR: untracked $4 is overwritten by the merge." >&2
-		exit 1
-	fi
-	git update-index --add --cacheinfo "$7" "$3" "$4" &&
-		exec git checkout-index -u -f -- "$4"
-	;;
-
-#
-# Added in both, identically (check for same permissions).
-#
-".$3$2")
-	if test "$6" != "$7"
-	then
-		echo "ERROR: File $4 added identically in both branches," >&2
-		echo "ERROR: but permissions conflict $6->$7." >&2
-		exit 1
-	fi
-	echo "Adding $4"
-	git update-index --add --cacheinfo "$6" "$2" "$4" &&
-		exec git checkout-index -u -f -- "$4"
-	;;
-
-#
-# Modified in both, but differently.
-#
-"$1$2$3" | ".$2$3")
-
-	case ",$6,$7," in
-	*,120000,*)
-		echo "ERROR: $4: Not merging symbolic link changes." >&2
-		exit 1
-		;;
-	*,160000,*)
-		echo "ERROR: $4: Not merging conflicting submodule changes." >&2
-		exit 1
-		;;
-	esac
-
-	src1=$(git unpack-file $2)
-	src2=$(git unpack-file $3)
-	case "$1" in
-	'')
-		echo "Added $4 in both, but differently."
-		orig=$(git unpack-file $(git hash-object /dev/null))
-		;;
-	*)
-		echo "Auto-merging $4"
-		orig=$(git unpack-file $1)
-		;;
-	esac
-
-	git merge-file "$src1" "$orig" "$src2"
-	ret=$?
-	msg=
-	if test $ret != 0 || test -z "$1"
-	then
-		msg='content conflict'
-		ret=1
-	fi
-
-	# Create the working tree file, using "our tree" version from the
-	# index, and then store the result of the merge.
-	git checkout-index -f --stage=2 -- "$4" && cat "$src1" >"$4" || exit 1
-	rm -f -- "$orig" "$src1" "$src2"
-
-	if test "$6" != "$7"
-	then
-		if test -n "$msg"
-		then
-			msg="$msg, "
-		fi
-		msg="${msg}permissions conflict: $5->$6,$7"
-		ret=1
-	fi
-
-	if test $ret != 0
-	then
-		echo "ERROR: $msg in $4" >&2
-		exit 1
-	fi
-	exec git update-index -- "$4"
-	;;
-
-*)
-	echo "ERROR: $4: Not handling case $1 -> $2 -> $3" >&2
-	;;
-esac
-exit 1
diff --git a/git.c b/git.c
index e5d62fa5a9..f5d3c6cb39 100644
--- a/git.c
+++ b/git.c
@@ -561,6 +561,7 @@ static struct cmd_struct commands[] = {
 	{ "merge-file", cmd_merge_file, RUN_SETUP_GENTLY },
 	{ "merge-index", cmd_merge_index, RUN_SETUP | NO_PARSEOPT },
 	{ "merge-ours", cmd_merge_ours, RUN_SETUP | NO_PARSEOPT },
+	{ "merge-one-file", cmd_merge_one_file, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
 	{ "merge-recursive", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
 	{ "merge-recursive-ours", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
 	{ "merge-recursive-theirs", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
diff --git a/merge-strategies.c b/merge-strategies.c
index 418c9dd710..373b69c10b 100644
--- a/merge-strategies.c
+++ b/merge-strategies.c
@@ -1,5 +1,198 @@
 #include "cache.h"
+#include "dir.h"
+#include "entry.h"
 #include "merge-strategies.h"
+#include "xdiff-interface.h"
+
+static int add_merge_result_to_index(struct index_state *istate, unsigned int mode,
+				     const struct object_id *oid, const char *path,
+				     int checkout)
+{
+	struct cache_entry *ce;
+	int res;
+
+	res = add_to_index_cacheinfo(istate, mode, oid, path, 0, 1, 1, &ce);
+	if (res == -1)
+		return error(_("Invalid path '%s'"), path);
+	else if (res == -2)
+		return -1;
+
+	if (checkout) {
+		struct checkout state = CHECKOUT_INIT;
+
+		state.istate = istate;
+		state.force = 1;
+		state.base_dir = "";
+		state.base_dir_len = 0;
+
+		if (checkout_entry(ce, &state, NULL, NULL) < 0)
+			return error(_("%s: cannot checkout file"), path);
+	}
+
+	return 0;
+}
+
+static int merge_one_file_deleted(struct index_state *istate,
+				  const struct object_id *our_blob,
+				  const struct object_id *their_blob, const char *path,
+				  unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode)
+{
+	if ((!our_blob && orig_mode != their_mode) ||
+	    (!their_blob && orig_mode != our_mode))
+		return error(_("File %s deleted on one branch but had its "
+			       "permissions changed on the other."), path);
+
+	if (our_blob) {
+		printf(_("Removing %s\n"), path);
+
+		if (file_exists(path))
+			remove_path(path);
+	}
+
+	if (remove_file_from_index(istate, path))
+		return error("%s: cannot remove from the index", path);
+	return 0;
+}
+
+static int do_merge_one_file(struct index_state *istate,
+			     const struct object_id *orig_blob,
+			     const struct object_id *our_blob,
+			     const struct object_id *their_blob, const char *path,
+			     unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode)
+{
+	int ret, i, dest;
+	ssize_t written;
+	mmbuffer_t result = {NULL, 0};
+	mmfile_t mmfs[3];
+	xmparam_t xmp = {{0}};
+
+	if (our_mode == S_IFLNK || their_mode == S_IFLNK)
+		return error(_("%s: Not merging symbolic link changes."), path);
+	else if (our_mode == S_IFGITLINK || their_mode == S_IFGITLINK)
+		return error(_("%s: Not merging conflicting submodule changes."), path);
+
+	if (orig_blob) {
+		printf(_("Auto-merging %s\n"), path);
+		read_mmblob(mmfs + 0, orig_blob);
+	} else {
+		printf(_("Added %s in both, but differently.\n"), path);
+		read_mmblob(mmfs + 0, null_oid());
+	}
+
+	read_mmblob(mmfs + 1, our_blob);
+	read_mmblob(mmfs + 2, their_blob);
+
+	xmp.level = XDL_MERGE_ZEALOUS_ALNUM;
+	xmp.style = 0;
+	xmp.favor = 0;
+
+	ret = xdl_merge(mmfs + 0, mmfs + 1, mmfs + 2, &xmp, &result);
+
+	for (i = 0; i < 3; i++)
+		free(mmfs[i].ptr);
+
+	if (ret < 0) {
+		free(result.ptr);
+		return error(_("Failed to execute internal merge"));
+	}
+
+	if (ret > 0 || !orig_blob)
+		ret = error(_("content conflict in %s"), path);
+	if (our_mode != their_mode)
+		ret = error(_("permission conflict: %o->%o,%o in %s"),
+			    orig_mode, our_mode, their_mode, path);
+
+	unlink(path);
+	if ((dest = open(path, O_WRONLY | O_CREAT, our_mode)) < 0) {
+		free(result.ptr);
+		return error_errno(_("failed to open file '%s'"), path);
+	}
+
+	written = write_in_full(dest, result.ptr, result.size);
+	close(dest);
+
+	free(result.ptr);
+
+	if (written < 0)
+		return error_errno(_("failed to write to '%s'"), path);
+	if (ret)
+		return ret;
+
+	return add_file_to_index(istate, path, 0);
+}
+
+int merge_three_way(struct index_state *istate,
+		    const struct object_id *orig_blob,
+		    const struct object_id *our_blob,
+		    const struct object_id *their_blob, const char *path,
+		    unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode)
+{
+	if (orig_blob &&
+	    ((!our_blob && !their_blob) ||
+	     (!their_blob && our_blob && oideq(orig_blob, our_blob)) ||
+	     (!our_blob && their_blob && oideq(orig_blob, their_blob)))) {
+		/* Deleted in both or deleted in one and unchanged in the other. */
+		return merge_one_file_deleted(istate, our_blob, their_blob, path,
+					      orig_mode, our_mode, their_mode);
+	} else if (!orig_blob && our_blob && !their_blob) {
+		/*
+		 * Added in ours.  The other side did not add and we
+		 * added so there is nothing to be done, except making
+		 * the path merged.
+		 */
+		return add_merge_result_to_index(istate, our_mode, our_blob, path, 0);
+	} else if (!orig_blob && !our_blob && their_blob) {
+		printf(_("Adding %s\n"), path);
+
+		if (file_exists(path))
+			return error(_("untracked %s is overwritten by the merge."), path);
+
+		return add_merge_result_to_index(istate, their_mode, their_blob, path, 1);
+	} else if (!orig_blob && our_blob && their_blob &&
+		   oideq(our_blob, their_blob)) {
+		/* Added in both, identically (check for same permissions). */
+		if (our_mode != their_mode)
+			return error(_("File %s added identically in both branches, "
+				       "but permissions conflict %o->%o."),
+				     path, our_mode, their_mode);
+
+		printf(_("Adding %s\n"), path);
+
+		return add_merge_result_to_index(istate, our_mode, our_blob, path, 1);
+	} else if (our_blob && their_blob) {
+		/* Modified in both, but differently. */
+		return do_merge_one_file(istate,
+					 orig_blob, our_blob, their_blob, path,
+					 orig_mode, our_mode, their_mode);
+	} else {
+		char orig_hex[GIT_MAX_HEXSZ + 1] = {0}, our_hex[GIT_MAX_HEXSZ + 1] = {0},
+			their_hex[GIT_MAX_HEXSZ + 1] = {0};
+
+		if (orig_blob)
+			oid_to_hex_r(orig_hex, orig_blob);
+		if (our_blob)
+			oid_to_hex_r(our_hex, our_blob);
+		if (their_blob)
+			oid_to_hex_r(their_hex, their_blob);
+
+		return error(_("%s: Not handling case %s -> %s -> %s"),
+			     path, orig_hex, our_hex, their_hex);
+	}
+
+	return 0;
+}
+
+int merge_one_file_func(struct index_state *istate,
+			const struct object_id *orig_blob,
+			const struct object_id *our_blob,
+			const struct object_id *their_blob, const char *path,
+			unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
+			void *data)
+{
+	return merge_three_way(istate,
+			       orig_blob, our_blob, their_blob, path,
+			       orig_mode, our_mode, their_mode);
+}
 
 static int merge_entry(struct index_state *istate, unsigned int pos,
 		       const char *path, int *err, merge_fn fn, void *data)
@@ -54,7 +247,7 @@ int merge_all_index(struct index_state *istate, int oneshot, int quiet,
 		    merge_fn fn, void *data)
 {
 	int err = 0, ret;
-	unsigned int i;
+	unsigned int i, prev_nr;
 
 	/* TODO: audit for interaction with sparse-index. */
 	ensure_full_index(istate);
@@ -63,10 +256,17 @@ int merge_all_index(struct index_state *istate, int oneshot, int quiet,
 		if (!ce_stage(ce))
 			continue;
 
+		prev_nr = istate->cache_nr;
 		ret = merge_entry(istate, i, ce->name, &err, fn, data);
-		if (ret > 0)
-			i += ret - 1;
-		else if (ret == -1)
+		if (ret > 0) {
+			/*
+			 * Don't bother handling an index that has
+			 * grown, since merge_one_file_func() can't grow
+			 * it, and merge_one_file_spawn() can't change
+			 * it.
+			 */
+			i += ret - (prev_nr - istate->cache_nr) - 1;
+		} else if (ret == -1)
 			return -1;
 
 		if (err && !oneshot) {
diff --git a/merge-strategies.h b/merge-strategies.h
index 88f476f170..8705a550ca 100644
--- a/merge-strategies.h
+++ b/merge-strategies.h
@@ -3,6 +3,12 @@
 
 #include "object.h"
 
+int merge_three_way(struct index_state *istate,
+		    const struct object_id *orig_blob,
+		    const struct object_id *our_blob,
+		    const struct object_id *their_blob, const char *path,
+		    unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode);
+
 typedef int (*merge_fn)(struct index_state *istate,
 			const struct object_id *orig_blob,
 			const struct object_id *our_blob,
@@ -10,6 +16,13 @@ typedef int (*merge_fn)(struct index_state *istate,
 			unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
 			void *data);
 
+int merge_one_file_func(struct index_state *istate,
+			const struct object_id *orig_blob,
+			const struct object_id *our_blob,
+			const struct object_id *their_blob, const char *path,
+			unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
+			void *data);
+
 int merge_index_path(struct index_state *istate, int oneshot, int quiet,
 		     const char *path, merge_fn fn, void *data);
 int merge_all_index(struct index_state *istate, int oneshot, int quiet,
diff --git a/t/t6060-merge-index.sh b/t/t6060-merge-index.sh
index 3845a9d3cc..9976996c80 100755
--- a/t/t6060-merge-index.sh
+++ b/t/t6060-merge-index.sh
@@ -70,7 +70,7 @@ test_expect_success 'merge-one-file fails without a work tree' '
 	(cd bare.git &&
 	 GIT_INDEX_FILE=$PWD/merge.index &&
 	 export GIT_INDEX_FILE &&
-	 test_must_fail git merge-index git-merge-one-file -a
+	 test_must_fail git merge-index --use=merge-one-file -a
 	)
 '
 
diff --git a/t/t6415-merge-dir-to-symlink.sh b/t/t6415-merge-dir-to-symlink.sh
index 2655e295f5..10bc5eb8c4 100755
--- a/t/t6415-merge-dir-to-symlink.sh
+++ b/t/t6415-merge-dir-to-symlink.sh
@@ -99,7 +99,7 @@ test_expect_success SYMLINKS 'a/b was resolved as symlink' '
 	test -h a/b
 '
 
-test_expect_failure 'do not lose untracked in merge (resolve)' '
+test_expect_success 'do not lose untracked in merge (resolve)' '
 	git reset --hard &&
 	git checkout baseline^0 &&
 	>a/b/c/e &&
-- 
2.37.1.412.gcfdce49ffd



* [PATCH v8 08/14] merge-resolve: rewrite in C
  2022-08-09 18:54             ` [PATCH v8 00/14] Rewrite the remaining merge strategies from shell to C Alban Gruin
                                 ` (6 preceding siblings ...)
  2022-08-09 18:54               ` [PATCH v8 07/14] merge-one-file: rewrite in C Alban Gruin
@ 2022-08-09 18:54               ` Alban Gruin
  2022-08-10 15:03                 ` Phillip Wood
                                   ` (2 more replies)
  2022-08-09 18:54               ` [PATCH v8 09/14] merge-recursive: move better_branch_name() to merge.c Alban Gruin
                                 ` (6 subsequent siblings)
  14 siblings, 3 replies; 221+ messages in thread
From: Alban Gruin @ 2022-08-09 18:54 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Phillip Wood, Johannes Schindelin, Alban Gruin

This rewrites `git merge-resolve' from shell to C.  As with `git
merge-one-file', this port is not completely straightforward: it removes
calls to external processes to avoid reading and writing the index over
and over again.

 - The call to `update-index -q --refresh' is replaced by a call to
   refresh_index().

 - The call to `read-tree' is replaced by a call to unpack_trees() (and
   all the setup needed).

 - The call to `write-tree' is replaced by a call to
   cache_tree_update().  This call is wrapped in a new function,
   write_tree().  It mimics write_index_as_tree() with the
   WRITE_TREE_SILENT flag, but without locking the index; locking is
   taken care of directly in merge_strategies_resolve().

 - The call to `diff-index ...' is replaced by a call to
   repo_index_has_changes().

 - The call to `merge-index', needed to invoke `git merge-one-file', is
   replaced by a call to the new merge_all_index() function.

The index is read in cmd_merge_resolve(), and is written back by
merge_strategies_resolve().  This is to accommodate future applications:
in `git-merge', the index has already been read when the merge strategy
is called, so it would be redundant to read it again once the builtin
is able to use merge_strategies_resolve() directly.

The parameters of merge_strategies_resolve() may be surprising at first
glance: why use commit lists for `bases' and `remote', when we could
use an oid array and a pointer to an oid?  Because, in a later commit,
try_merge_strategy() will be able to call merge_strategies_resolve()
directly, and it already uses a commit list for `bases' (`common') and
`remote' (`remoteheads'), and a string for `head_arg'.  To reduce
friction later, merge_strategies_resolve() takes the same types of
parameters.
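
As a rough illustration of the `<bases>... -- <head> <remote>' calling
convention that cmd_merge_resolve() parses, the sketch below splits the
arguments into three groups.  It keeps plain strings instead of looking
up commits, and `struct merge_args', `split_merge_args()' and the limit
of 16 entries are hypothetical names and values chosen for the example;
they are not part of this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_ARGS 16 /* arbitrary limit for this sketch */

/* Hypothetical container for the three groups of arguments. */
struct merge_args {
	const char *bases[MAX_ARGS];
	size_t nr_bases;
	const char *head;
	const char *remotes[MAX_ARGS];
	size_t nr_remotes;
};

/*
 * Split "<bases>... -- <head> <remote>..." into its parts.
 * Everything before "--" is a base, the first argument after it is
 * the head, and the rest are remotes.
 * Returns 0 on success, -1 if a group holds too many entries.
 */
static int split_merge_args(int argc, const char **argv,
			    struct merge_args *out)
{
	int i, sep_seen = 0;

	assert(out != NULL);
	memset(out, 0, sizeof(*out));

	for (i = 1; i < argc; i++) {
		if (!strcmp(argv[i], "--")) {
			sep_seen = 1;
		} else if (sep_seen && !out->head) {
			out->head = argv[i];
		} else if (sep_seen) {
			if (out->nr_remotes == MAX_ARGS)
				return -1;
			out->remotes[out->nr_remotes++] = argv[i];
		} else {
			if (out->nr_bases == MAX_ARGS)
				return -1;
			out->bases[out->nr_bases++] = argv[i];
		}
	}
	return 0;
}
```

With `base1 base2 -- HEAD topic', this yields two bases, `HEAD' as the
head and a single remote, which is the shape the resolve strategy
expects.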

merge_strategies_resolve() locks the index only once, at the beginning
of the merge, and releases it when the merge has been completed.
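
For readers unfamiliar with the locking scheme, here is a minimal
sketch of the underlying idea: create `<path>.lock' exclusively, write
the new content there, and rename it over `<path>' on success.  It is a
simplified stand-in written against plain POSIX calls, not git's
lockfile API; `struct simple_lock', `hold_lock()', `commit_lock()' and
`rollback_lock()' are hypothetical names.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical, simplified stand-in for a lock on one file. */
struct simple_lock {
	char path[4096];          /* the file being protected */
	char lock_path[4096 + 5]; /* "<path>.lock" */
	int fd;                   /* descriptor of the lock file, or -1 */
};

/* Take the lock; fails with -1 if someone else already holds it. */
static int hold_lock(struct simple_lock *lk, const char *path)
{
	assert(lk != NULL && path != NULL);
	snprintf(lk->path, sizeof(lk->path), "%s", path);
	snprintf(lk->lock_path, sizeof(lk->lock_path), "%s.lock", path);
	lk->fd = open(lk->lock_path, O_WRONLY | O_CREAT | O_EXCL, 0666);
	return lk->fd < 0 ? -1 : 0;
}

/* Publish the new content by renaming the lock file into place. */
static int commit_lock(struct simple_lock *lk)
{
	if (close(lk->fd) < 0)
		return -1;
	lk->fd = -1;
	return rename(lk->lock_path, lk->path);
}

/* Give up: drop the lock file and leave the original untouched. */
static void rollback_lock(struct simple_lock *lk)
{
	if (lk->fd >= 0)
		close(lk->fd);
	lk->fd = -1;
	unlink(lk->lock_path);
}
```

A caller following the pattern described above would call hold_lock()
once, run every step of the merge, and then call either commit_lock()
or rollback_lock(), rather than locking around each step.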

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 Makefile                |   2 +-
 builtin.h               |   1 +
 builtin/merge-resolve.c |  63 ++++++++++++++++++
 git-merge-resolve.sh    |  64 -------------------
 git.c                   |   1 +
 merge-strategies.c      | 137 ++++++++++++++++++++++++++++++++++++++++
 merge-strategies.h      |   5 ++
 7 files changed, 208 insertions(+), 65 deletions(-)
 create mode 100644 builtin/merge-resolve.c
 delete mode 100755 git-merge-resolve.sh

diff --git a/Makefile b/Makefile
index e2e6cbbb41..0c18acb979 100644
--- a/Makefile
+++ b/Makefile
@@ -631,7 +631,6 @@ SCRIPT_SH += git-bisect.sh
 SCRIPT_SH += git-difftool--helper.sh
 SCRIPT_SH += git-filter-branch.sh
 SCRIPT_SH += git-merge-octopus.sh
-SCRIPT_SH += git-merge-resolve.sh
 SCRIPT_SH += git-mergetool.sh
 SCRIPT_SH += git-quiltimport.sh
 SCRIPT_SH += git-request-pull.sh
@@ -1188,6 +1187,7 @@ BUILTIN_OBJS += builtin/merge-index.o
 BUILTIN_OBJS += builtin/merge-one-file.o
 BUILTIN_OBJS += builtin/merge-ours.o
 BUILTIN_OBJS += builtin/merge-recursive.o
+BUILTIN_OBJS += builtin/merge-resolve.o
 BUILTIN_OBJS += builtin/merge-tree.o
 BUILTIN_OBJS += builtin/merge.o
 BUILTIN_OBJS += builtin/mktag.o
diff --git a/builtin.h b/builtin.h
index cdbe91bbe8..4627229944 100644
--- a/builtin.h
+++ b/builtin.h
@@ -184,6 +184,7 @@ int cmd_merge_ours(int argc, const char **argv, const char *prefix);
 int cmd_merge_file(int argc, const char **argv, const char *prefix);
 int cmd_merge_one_file(int argc, const char **argv, const char *prefix);
 int cmd_merge_recursive(int argc, const char **argv, const char *prefix);
+int cmd_merge_resolve(int argc, const char **argv, const char *prefix);
 int cmd_merge_tree(int argc, const char **argv, const char *prefix);
 int cmd_mktag(int argc, const char **argv, const char *prefix);
 int cmd_mktree(int argc, const char **argv, const char *prefix);
diff --git a/builtin/merge-resolve.c b/builtin/merge-resolve.c
new file mode 100644
index 0000000000..a51158ebf8
--- /dev/null
+++ b/builtin/merge-resolve.c
@@ -0,0 +1,63 @@
+/*
+ * Builtin "git merge-resolve"
+ *
+ * Copyright (c) 2020 Alban Gruin
+ *
+ * Based on git-merge-resolve.sh, written by Linus Torvalds and Junio C
+ * Hamano.
+ *
+ * Resolve two trees, using enhanced multi-base read-tree.
+ */
+
+#include "cache.h"
+#include "builtin.h"
+#include "merge-strategies.h"
+
+static const char builtin_merge_resolve_usage[] =
+	"git merge-resolve <bases>... -- <head> <remote>";
+
+int cmd_merge_resolve(int argc, const char **argv, const char *prefix)
+{
+	int i, sep_seen = 0;
+	const char *head = NULL;
+	struct commit_list *bases = NULL, *remote = NULL;
+	struct commit_list **next_base = &bases;
+	struct repository *r = the_repository;
+
+	if (argc < 5)
+		usage(builtin_merge_resolve_usage);
+
+	setup_work_tree();
+	if (repo_read_index(r) < 0)
+		die("invalid index");
+
+	/*
+	 * The first parameters up to -- are merge bases; the rest are
+	 * heads.
+	 */
+	for (i = 1; i < argc; i++) {
+		if (!strcmp(argv[i], "--"))
+			sep_seen = 1;
+		else if (!strcmp(argv[i], "-h"))
+			usage(builtin_merge_resolve_usage);
+		else if (sep_seen && !head)
+			head = argv[i];
+		else {
+			struct object_id oid;
+			struct commit *commit;
+
+			if (get_oid(argv[i], &oid))
+				die("object %s not found.", argv[i]);
+
+			commit = oideq(&oid, r->hash_algo->empty_tree) ?
+				NULL : lookup_commit_or_die(&oid, argv[i]);
+
+			if (sep_seen)
+				commit_list_insert(commit, &remote);
+			else
+				next_base = commit_list_append(commit, next_base);
+		}
+	}
+
+	return merge_strategies_resolve(r, bases, head, remote);
+}
diff --git a/git-merge-resolve.sh b/git-merge-resolve.sh
deleted file mode 100755
index e59175eb75..0000000000
--- a/git-merge-resolve.sh
+++ /dev/null
@@ -1,64 +0,0 @@
-#!/bin/sh
-#
-# Copyright (c) 2005 Linus Torvalds
-# Copyright (c) 2005 Junio C Hamano
-#
-# Resolve two trees, using enhanced multi-base read-tree.
-
-. git-sh-setup
-
-# Abort if index does not match HEAD
-if ! git diff-index --quiet --cached HEAD --
-then
-    gettextln "Error: Your local changes to the following files would be overwritten by merge"
-    git diff-index --cached --name-only HEAD -- | sed -e 's/^/    /'
-    exit 2
-fi
-
-# The first parameters up to -- are merge bases; the rest are heads.
-bases= head= remotes= sep_seen=
-for arg
-do
-	case ",$sep_seen,$head,$arg," in
-	*,--,)
-		sep_seen=yes
-		;;
-	,yes,,*)
-		head=$arg
-		;;
-	,yes,*)
-		remotes="$remotes$arg "
-		;;
-	*)
-		bases="$bases$arg "
-		;;
-	esac
-done
-
-# Give up if we are given two or more remotes -- not handling octopus.
-case "$remotes" in
-?*' '?*)
-	exit 2 ;;
-esac
-
-# Give up if this is a baseless merge.
-if test '' = "$bases"
-then
-	exit 2
-fi
-
-git update-index -q --refresh
-git read-tree -u -m --aggressive $bases $head $remotes || exit 2
-echo "Trying simple merge."
-if result_tree=$(git write-tree 2>/dev/null)
-then
-	exit 0
-else
-	echo "Simple merge failed, trying Automatic merge."
-	if git merge-index -o --use=merge-one-file -a
-	then
-		exit 0
-	else
-		exit 1
-	fi
-fi
diff --git a/git.c b/git.c
index f5d3c6cb39..09d222da88 100644
--- a/git.c
+++ b/git.c
@@ -565,6 +565,7 @@ static struct cmd_struct commands[] = {
 	{ "merge-recursive", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
 	{ "merge-recursive-ours", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
 	{ "merge-recursive-theirs", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
+	{ "merge-resolve", cmd_merge_resolve, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
 	{ "merge-subtree", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
 	{ "merge-tree", cmd_merge_tree, RUN_SETUP },
 	{ "mktag", cmd_mktag, RUN_SETUP | NO_PARSEOPT },
diff --git a/merge-strategies.c b/merge-strategies.c
index 373b69c10b..30f225ae5f 100644
--- a/merge-strategies.c
+++ b/merge-strategies.c
@@ -1,9 +1,34 @@
 #include "cache.h"
+#include "cache-tree.h"
 #include "dir.h"
 #include "entry.h"
+#include "lockfile.h"
 #include "merge-strategies.h"
+#include "unpack-trees.h"
 #include "xdiff-interface.h"
 
+static int check_index_is_head(struct repository *r, const char *head_arg)
+{
+	struct commit *head_commit;
+	struct tree *head_tree;
+	struct object_id head;
+	struct strbuf sb = STRBUF_INIT;
+
+	get_oid(head_arg, &head);
+	head_commit = lookup_commit_reference(r, &head);
+	head_tree = repo_get_commit_tree(r, head_commit);
+
+	if (repo_index_has_changes(r, head_tree, &sb)) {
+		error(_("Your local changes to the following files "
+			"would be overwritten by merge:\n  %s"),
+		      sb.buf);
+		strbuf_release(&sb);
+		return 1;
+	}
+
+	return 0;
+}
+
 static int add_merge_result_to_index(struct index_state *istate, unsigned int mode,
 				     const struct object_id *oid, const char *path,
 				     int checkout)
@@ -280,3 +305,115 @@ int merge_all_index(struct index_state *istate, int oneshot, int quiet,
 		error(_("merge program failed"));
 	return err;
 }
+
+static int merge_trees(struct repository *r, struct tree_desc *t,
+		       int nr, int aggressive)
+{
+	struct unpack_trees_options opts;
+
+	refresh_index(r->index, REFRESH_QUIET, NULL, NULL, NULL);
+
+	memset(&opts, 0, sizeof(opts));
+	opts.head_idx = 1;
+	opts.src_index = r->index;
+	opts.dst_index = r->index;
+	opts.merge = 1;
+	opts.update = 1;
+	opts.aggressive = aggressive;
+
+	if (nr == 1)
+		opts.fn = oneway_merge;
+	else if (nr == 2) {
+		opts.fn = twoway_merge;
+		opts.initial_checkout = is_index_unborn(r->index);
+	} else if (nr >= 3) {
+		opts.fn = threeway_merge;
+		opts.head_idx = nr - 1;
+	}
+
+	if (unpack_trees(nr, t, &opts))
+		return -1;
+	return 0;
+}
+
+static int add_tree(struct tree *tree, struct tree_desc *t)
+{
+	if (parse_tree(tree))
+		return -1;
+
+	init_tree_desc(t, tree->buffer, tree->size);
+	return 0;
+}
+
+static int write_tree(struct repository *r)
+{
+	int was_valid;
+	was_valid = r->index->cache_tree &&
+		cache_tree_fully_valid(r->index->cache_tree);
+
+	if (!was_valid && cache_tree_update(r->index, WRITE_TREE_SILENT) < 0)
+		return WRITE_TREE_UNMERGED_INDEX;
+	return 0;
+}
+
+int merge_strategies_resolve(struct repository *r,
+			     struct commit_list *bases, const char *head_arg,
+			     struct commit_list *remote)
+{
+	struct tree_desc t[MAX_UNPACK_TREES];
+	struct commit_list *i;
+	struct lock_file lock = LOCK_INIT;
+	int nr = 0, ret = 0;
+
+	/* Abort if index does not match head */
+	if (check_index_is_head(r, head_arg))
+		return 2;
+
+	/*
+	 * Give up if we are given two or more remotes.  Not handling
+	 * octopus.
+	 */
+	if (remote && remote->next)
+		return 2;
+
+	/* Give up if this is a baseless merge. */
+	if (!bases)
+		return 2;
+
+	puts(_("Trying simple merge."));
+
+	for (i = bases; i && i->item; i = i->next) {
+		if (add_tree(repo_get_commit_tree(r, i->item), t + (nr++)))
+			return 2;
+	}
+
+	if (head_arg) {
+		struct object_id head;
+		struct tree *tree;
+
+		get_oid(head_arg, &head);
+		tree = parse_tree_indirect(&head);
+
+		if (add_tree(tree, t + (nr++)))
+			return 2;
+	}
+
+	if (remote && add_tree(repo_get_commit_tree(r, remote->item), t + (nr++)))
+		return 2;
+
+	repo_hold_locked_index(r, &lock, LOCK_DIE_ON_ERROR);
+
+	if (merge_trees(r, t, nr, 1)) {
+		rollback_lock_file(&lock);
+		return 2;
+	}
+
+	if (write_tree(r)) {
+		puts(_("Simple merge failed, trying Automatic merge."));
+		ret = merge_all_index(r->index, 1, 0, merge_one_file_func, NULL);
+	}
+
+	if (write_locked_index(r->index, &lock, COMMIT_LOCK))
+		return !!error(_("unable to write new index file"));
+	return !!ret;
+}
diff --git a/merge-strategies.h b/merge-strategies.h
index 8705a550ca..bba4bf999c 100644
--- a/merge-strategies.h
+++ b/merge-strategies.h
@@ -1,6 +1,7 @@
 #ifndef MERGE_STRATEGIES_H
 #define MERGE_STRATEGIES_H
 
+#include "commit.h"
 #include "object.h"
 
 int merge_three_way(struct index_state *istate,
@@ -28,4 +29,8 @@ int merge_index_path(struct index_state *istate, int oneshot, int quiet,
 int merge_all_index(struct index_state *istate, int oneshot, int quiet,
 		    merge_fn fn, void *data);
 
+int merge_strategies_resolve(struct repository *r,
+			     struct commit_list *bases, const char *head_arg,
+			     struct commit_list *remote);
+
 #endif /* MERGE_STRATEGIES_H */
-- 
2.37.1.412.gcfdce49ffd



* [PATCH v8 09/14] merge-recursive: move better_branch_name() to merge.c
  2022-08-09 18:54             ` [PATCH v8 00/14] Rewrite the remaining merge strategies from shell to C Alban Gruin
                                 ` (7 preceding siblings ...)
  2022-08-09 18:54               ` [PATCH v8 08/14] merge-resolve: " Alban Gruin
@ 2022-08-09 18:54               ` Alban Gruin
  2022-08-09 18:54               ` [PATCH v8 10/14] merge-octopus: rewrite in C Alban Gruin
                                 ` (5 subsequent siblings)
  14 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2022-08-09 18:54 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Phillip Wood, Johannes Schindelin, Alban Gruin

better_branch_name() will be used by merge-octopus once it is rewritten
in C, so instead of duplicating it, this moves the function ahead of
time into an appropriate file in libgit.a.  The function is also renamed
to reflect its usage by merge strategies.
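
To show what the function does without pulling in git's headers, here
is a standalone sketch of the same lookup: if the argument looks like a
full hexadecimal object name, the environment variable `GITHEAD_<hex>'
is consulted for a friendlier name.  `pretty_branch_name()' and `HEXSZ'
are hypothetical names, and the fixed length of 40 assumes a SHA-1
repository, whereas the real code asks the_hash_algo.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define HEXSZ 40 /* assumption: SHA-1 object names */

/*
 * Return a malloc'd, human-friendly name for `branch'.
 * The caller owns the result and must free() it.
 */
static char *pretty_branch_name(const char *branch)
{
	char env[8 + HEXSZ + 1]; /* "GITHEAD_" + hex + NUL */
	const char *name;
	char *copy;

	assert(branch != NULL);

	/* Only full-length object names are looked up. */
	if (strlen(branch) == HEXSZ) {
		snprintf(env, sizeof(env), "GITHEAD_%s", branch);
		name = getenv(env);
		if (name)
			branch = name;
	}

	copy = malloc(strlen(branch) + 1);
	if (copy)
		strcpy(copy, branch);
	return copy;
}
```

A caller that exports `GITHEAD_<hex>=topic' before invoking a strategy
therefore sees `topic' in messages instead of the raw object name.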

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 builtin/merge-recursive.c | 16 ++--------------
 cache.h                   |  2 +-
 merge.c                   | 12 ++++++++++++
 3 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/builtin/merge-recursive.c b/builtin/merge-recursive.c
index b9acbf5d34..ae429c8514 100644
--- a/builtin/merge-recursive.c
+++ b/builtin/merge-recursive.c
@@ -8,18 +8,6 @@
 static const char builtin_merge_recursive_usage[] =
 	"git %s <base>... -- <head> <remote> ...";
 
-static char *better_branch_name(const char *branch)
-{
-	static char githead_env[8 + GIT_MAX_HEXSZ + 1];
-	char *name;
-
-	if (strlen(branch) != the_hash_algo->hexsz)
-		return xstrdup(branch);
-	xsnprintf(githead_env, sizeof(githead_env), "GITHEAD_%s", branch);
-	name = getenv(githead_env);
-	return xstrdup(name ? name : branch);
-}
-
 int cmd_merge_recursive(int argc, const char **argv, const char *prefix)
 {
 	const struct object_id *bases[21];
@@ -75,8 +63,8 @@ int cmd_merge_recursive(int argc, const char **argv, const char *prefix)
 	if (get_oid(o.branch2, &h2))
 		die(_("could not resolve ref '%s'"), o.branch2);
 
-	o.branch1 = better1 = better_branch_name(o.branch1);
-	o.branch2 = better2 = better_branch_name(o.branch2);
+	o.branch1 = better1 = merge_get_better_branch_name(o.branch1);
+	o.branch2 = better2 = merge_get_better_branch_name(o.branch2);
 
 	if (o.verbosity >= 3)
 		printf(_("Merging %s with %s\n"), o.branch1, o.branch2);
diff --git a/cache.h b/cache.h
index 6b5d0a2ba3..61ac42fa43 100644
--- a/cache.h
+++ b/cache.h
@@ -1916,7 +1916,7 @@ int checkout_fast_forward(struct repository *r,
 			  const struct object_id *from,
 			  const struct object_id *to,
 			  int overwrite_ignore);
-
+char *merge_get_better_branch_name(const char *branch);
 
 int sane_execvp(const char *file, char *const argv[]);
 
diff --git a/merge.c b/merge.c
index 2382ff66d3..d87bfd4824 100644
--- a/merge.c
+++ b/merge.c
@@ -102,3 +102,15 @@ int checkout_fast_forward(struct repository *r,
 		return error(_("unable to write new index file"));
 	return 0;
 }
+
+char *merge_get_better_branch_name(const char *branch)
+{
+	static char githead_env[8 + GIT_MAX_HEXSZ + 1];
+	char *name;
+
+	if (strlen(branch) != the_hash_algo->hexsz)
+		return xstrdup(branch);
+	xsnprintf(githead_env, sizeof(githead_env), "GITHEAD_%s", branch);
+	name = getenv(githead_env);
+	return xstrdup(name ? name : branch);
+}
-- 
2.37.1.412.gcfdce49ffd



* [PATCH v8 10/14] merge-octopus: rewrite in C
  2022-08-09 18:54             ` [PATCH v8 00/14] Rewrite the remaining merge strategies from shell to C Alban Gruin
                                 ` (8 preceding siblings ...)
  2022-08-09 18:54               ` [PATCH v8 09/14] merge-recursive: move better_branch_name() to merge.c Alban Gruin
@ 2022-08-09 18:54               ` Alban Gruin
  2022-08-09 18:54               ` [PATCH v8 11/14] merge: use the "resolve" strategy without forking Alban Gruin
                                 ` (4 subsequent siblings)
  14 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2022-08-09 18:54 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Phillip Wood, Johannes Schindelin, Alban Gruin

This rewrites `git merge-octopus' from shell to C.  As with the previous
two conversions, this port removes calls to external processes to avoid
reading and writing the index over and over again.

 - Calls to `read-tree -u -m (--aggressive)?' are replaced by calls to
   unpack_trees().

 - The call to `write-tree' is replaced by a call to write_tree().

 - The call to `diff-index ...' is replaced by a call to
   repo_index_has_changes().

 - The call to `merge-index', needed to invoke `git merge-one-file', is
   replaced by a call to merge_all_index().

The index is read in cmd_merge_octopus(), and is written back by
merge_strategies_octopus(), for the same reason as merge-resolve.

Here too, merge_strategies_octopus() takes two commit lists and a string
to reduce friction once try_merge_strategy() is modified to call it
directly.  It also locks the index at the start of the merge, and
releases it at the end.
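
One behaviour of the removed shell script is worth keeping in mind when
reading the port: only the last head may leave hand-resolvable
conflicts.  The sketch below models just that exit-code rule, based on
the OCTOPUS_FAILURE handling in git-merge-octopus.sh.  The `merge_ok'
array is a hypothetical stand-in for running each merge, and
`octopus_exit_code()' is not a function from this series.

```c
#include <assert.h>
#include <stddef.h>

/*
 * merge_ok[i] is non-zero when merging remote i completed cleanly.
 * Returns 0 if every round was clean, 1 if only the last round left
 * conflicts, and 2 if a round conflicted while heads still remained.
 */
static int octopus_exit_code(const int *merge_ok, size_t nr)
{
	int failed = 0;
	size_t i;

	assert(merge_ok != NULL || nr == 0);

	for (i = 0; i < nr; i++) {
		/* The previous round failed and a head is still left. */
		if (failed)
			return 2;
		if (!merge_ok[i])
			failed = 1;
	}
	return failed;
}
```

The fast-forward and "already up to date" shortcuts of the script are
deliberately left out of this model.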

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 Makefile                |   2 +-
 builtin.h               |   1 +
 builtin/merge-octopus.c |  63 +++++++++++++++
 git-merge-octopus.sh    | 112 --------------------------
 git.c                   |   1 +
 merge-strategies.c      | 171 ++++++++++++++++++++++++++++++++++++++++
 merge-strategies.h      |   3 +
 7 files changed, 240 insertions(+), 113 deletions(-)
 create mode 100644 builtin/merge-octopus.c
 delete mode 100755 git-merge-octopus.sh

diff --git a/Makefile b/Makefile
index 0c18acb979..9fe1e72f6e 100644
--- a/Makefile
+++ b/Makefile
@@ -630,7 +630,6 @@ unexport CDPATH
 SCRIPT_SH += git-bisect.sh
 SCRIPT_SH += git-difftool--helper.sh
 SCRIPT_SH += git-filter-branch.sh
-SCRIPT_SH += git-merge-octopus.sh
 SCRIPT_SH += git-mergetool.sh
 SCRIPT_SH += git-quiltimport.sh
 SCRIPT_SH += git-request-pull.sh
@@ -1184,6 +1183,7 @@ BUILTIN_OBJS += builtin/mailsplit.o
 BUILTIN_OBJS += builtin/merge-base.o
 BUILTIN_OBJS += builtin/merge-file.o
 BUILTIN_OBJS += builtin/merge-index.o
+BUILTIN_OBJS += builtin/merge-octopus.o
 BUILTIN_OBJS += builtin/merge-one-file.o
 BUILTIN_OBJS += builtin/merge-ours.o
 BUILTIN_OBJS += builtin/merge-recursive.o
diff --git a/builtin.h b/builtin.h
index 4627229944..9305dda166 100644
--- a/builtin.h
+++ b/builtin.h
@@ -180,6 +180,7 @@ int cmd_maintenance(int argc, const char **argv, const char *prefix);
 int cmd_merge(int argc, const char **argv, const char *prefix);
 int cmd_merge_base(int argc, const char **argv, const char *prefix);
 int cmd_merge_index(int argc, const char **argv, const char *prefix);
+int cmd_merge_octopus(int argc, const char **argv, const char *prefix);
 int cmd_merge_ours(int argc, const char **argv, const char *prefix);
 int cmd_merge_file(int argc, const char **argv, const char *prefix);
 int cmd_merge_one_file(int argc, const char **argv, const char *prefix);
diff --git a/builtin/merge-octopus.c b/builtin/merge-octopus.c
new file mode 100644
index 0000000000..ff3089bfca
--- /dev/null
+++ b/builtin/merge-octopus.c
@@ -0,0 +1,63 @@
+/*
+ * Builtin "git merge-octopus"
+ *
+ * Copyright (c) 2020 Alban Gruin
+ *
+ * Based on git-merge-octopus.sh, written by Junio C Hamano.
+ *
+ * Resolve two or more trees.
+ */
+
+#include "cache.h"
+#include "builtin.h"
+#include "commit.h"
+#include "merge-strategies.h"
+
+static const char builtin_merge_octopus_usage[] =
+	"git merge-octopus [<bases>...] -- <head> <remote1> <remote2> [<remotes>...]";
+
+int cmd_merge_octopus(int argc, const char **argv, const char *prefix)
+{
+	int i, sep_seen = 0;
+	struct commit_list *bases = NULL, *remotes = NULL;
+	struct commit_list **next_base = &bases, **next_remote = &remotes;
+	const char *head_arg = NULL;
+	struct repository *r = the_repository;
+
+	if (argc < 5)
+		usage(builtin_merge_octopus_usage);
+
+	setup_work_tree();
+	if (repo_read_index(r) < 0)
+		die("invalid index");
+
+	/*
+	 * The first parameters up to -- are merge bases; the rest are
+	 * heads.
+	 */
+	for (i = 1; i < argc; i++) {
+		if (strcmp(argv[i], "--") == 0)
+			sep_seen = 1;
+		else if (strcmp(argv[i], "-h") == 0)
+			usage(builtin_merge_octopus_usage);
+		else if (sep_seen && !head_arg)
+			head_arg = argv[i];
+		else {
+			struct object_id oid;
+			struct commit *commit;
+
+			if (get_oid(argv[i], &oid))
+				die("object %s not found.", argv[i]);
+
+			commit = oideq(&oid, r->hash_algo->empty_tree) ?
+				NULL : lookup_commit_or_die(&oid, argv[i]);
+
+			if (sep_seen)
+				next_remote = commit_list_append(commit, next_remote);
+			else
+				next_base = commit_list_append(commit, next_base);
+		}
+	}
+
+	return merge_strategies_octopus(r, bases, head_arg, remotes);
+}
diff --git a/git-merge-octopus.sh b/git-merge-octopus.sh
deleted file mode 100755
index 2770891960..0000000000
--- a/git-merge-octopus.sh
+++ /dev/null
@@ -1,112 +0,0 @@
-#!/bin/sh
-#
-# Copyright (c) 2005 Junio C Hamano
-#
-# Resolve two or more trees.
-#
-
-. git-sh-setup
-
-LF='
-'
-
-# The first parameters up to -- are merge bases; the rest are heads.
-bases= head= remotes= sep_seen=
-for arg
-do
-	case ",$sep_seen,$head,$arg," in
-	*,--,)
-		sep_seen=yes
-		;;
-	,yes,,*)
-		head=$arg
-		;;
-	,yes,*)
-		remotes="$remotes$arg "
-		;;
-	*)
-		bases="$bases$arg "
-		;;
-	esac
-done
-
-# Reject if this is not an octopus -- resolve should be used instead.
-case "$remotes" in
-?*' '?*)
-	;;
-*)
-	exit 2 ;;
-esac
-
-# MRC is the current "merge reference commit"
-# MRT is the current "merge result tree"
-
-if ! git diff-index --quiet --cached HEAD --
-then
-    gettextln "Error: Your local changes to the following files would be overwritten by merge"
-    git diff-index --cached --name-only HEAD -- | sed -e 's/^/    /'
-    exit 2
-fi
-MRC=$(git rev-parse --verify -q $head)
-MRT=$(git write-tree)
-NON_FF_MERGE=0
-OCTOPUS_FAILURE=0
-for SHA1 in $remotes
-do
-	case "$OCTOPUS_FAILURE" in
-	1)
-		# We allow only last one to have a hand-resolvable
-		# conflicts.  Last round failed and we still had
-		# a head to merge.
-		gettextln "Automated merge did not work."
-		gettextln "Should not be doing an octopus."
-		exit 2
-	esac
-
-	eval pretty_name=\${GITHEAD_$SHA1:-$SHA1}
-	if test "$SHA1" = "$pretty_name"
-	then
-		SHA1_UP="$(echo "$SHA1" | tr a-z A-Z)"
-		eval pretty_name=\${GITHEAD_$SHA1_UP:-$pretty_name}
-	fi
-	common=$(git merge-base --all $SHA1 $MRC) ||
-		die "$(eval_gettext "Unable to find common commit with \$pretty_name")"
-
-	case "$LF$common$LF" in
-	*"$LF$SHA1$LF"*)
-		eval_gettextln "Already up to date with \$pretty_name"
-		continue
-		;;
-	esac
-
-	if test "$common,$NON_FF_MERGE" = "$MRC,0"
-	then
-		# The first head being merged was a fast-forward.
-		# Advance MRC to the head being merged, and use that
-		# tree as the intermediate result of the merge.
-		# We still need to count this as part of the parent set.
-
-		eval_gettextln "Fast-forwarding to: \$pretty_name"
-		git read-tree -u -m $head $SHA1 || exit
-		MRC=$SHA1 MRT=$(git write-tree)
-		continue
-	fi
-
-	NON_FF_MERGE=1
-
-	eval_gettextln "Trying simple merge with \$pretty_name"
-	git read-tree -u -m --aggressive  $common $MRT $SHA1 || exit 2
-	next=$(git write-tree 2>/dev/null)
-	if test $? -ne 0
-	then
-		gettextln "Simple merge did not work, trying automatic merge."
-		git merge-index -o --use=merge-one-file -a ||
-		OCTOPUS_FAILURE=1
-		next=$(git write-tree 2>/dev/null)
-	fi
-
-	MRC="$MRC $SHA1"
-	MRT=$next
-done
-
-exit "$OCTOPUS_FAILURE"
diff --git a/git.c b/git.c
index 09d222da88..7a5e506c64 100644
--- a/git.c
+++ b/git.c
@@ -560,6 +560,7 @@ static struct cmd_struct commands[] = {
 	{ "merge-base", cmd_merge_base, RUN_SETUP },
 	{ "merge-file", cmd_merge_file, RUN_SETUP_GENTLY },
 	{ "merge-index", cmd_merge_index, RUN_SETUP | NO_PARSEOPT },
+	{ "merge-octopus", cmd_merge_octopus, RUN_SETUP | NO_PARSEOPT },
 	{ "merge-ours", cmd_merge_ours, RUN_SETUP | NO_PARSEOPT },
 	{ "merge-one-file", cmd_merge_one_file, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
 	{ "merge-recursive", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
diff --git a/merge-strategies.c b/merge-strategies.c
index 30f225ae5f..3e8255614b 100644
--- a/merge-strategies.c
+++ b/merge-strategies.c
@@ -1,5 +1,6 @@
 #include "cache.h"
 #include "cache-tree.h"
+#include "commit-reach.h"
 #include "dir.h"
 #include "entry.h"
 #include "lockfile.h"
@@ -417,3 +418,173 @@ int merge_strategies_resolve(struct repository *r,
 		return !!error(_("unable to write new index file"));
 	return !!ret;
 }
+
+static int octopus_fast_forward(struct repository *r, const char *branch_name,
+				struct tree *tree_head, struct tree *current_tree)
+{
+	/*
+	 * The first head being merged was a fast-forward.  Advance the
+	 * reference commit to the head being merged, and use that tree
+	 * as the intermediate result of the merge.  We still need to
+	 * count this as part of the parent set.
+	 */
+	struct tree_desc t[2];
+
+	printf(_("Fast-forwarding to: %s\n"), branch_name);
+
+	init_tree_desc(t, tree_head->buffer, tree_head->size);
+	if (add_tree(current_tree, t + 1))
+		return -1;
+	if (merge_trees(r, t, 2, 0))
+		return -1;
+	if (write_tree(r))
+		return -1;
+
+	return 0;
+}
+
+static int octopus_do_merge(struct repository *r, const char *branch_name,
+			    struct commit_list *common, struct tree *current_tree,
+			    struct tree *reference_tree)
+{
+	struct tree_desc t[MAX_UNPACK_TREES];
+	struct commit_list *i;
+	int nr = 0, ret = 0;
+
+	printf(_("Trying simple merge with %s\n"), branch_name);
+
+	for (i = common; i; i = i->next) {
+		struct tree *tree = repo_get_commit_tree(r, i->item);
+		if (add_tree(tree, t + (nr++)))
+			return -1;
+	}
+
+	if (add_tree(reference_tree, t + (nr++)))
+		return -1;
+	if (add_tree(current_tree, t + (nr++)))
+		return -1;
+	if (merge_trees(r, t, nr, 1))
+		return 2;
+
+	if (write_tree(r)) {
+		puts(_("Simple merge did not work, trying automatic merge."));
+		ret = !!merge_all_index(r->index, 1, 0, merge_one_file_func, NULL);
+		write_tree(r);
+	}
+
+	return ret;
+}
+
+int merge_strategies_octopus(struct repository *r,
+			     struct commit_list *bases, const char *head_arg,
+			     struct commit_list *remotes)
+{
+	int ff_merge = 1, ret = 0, nr_references = 1;
+	struct commit **reference_commits, *head_commit;
+	struct tree *reference_tree, *head_tree;
+	struct commit_list *i;
+	struct object_id head;
+	struct lock_file lock = LOCK_INIT;
+
+	/*
+	 * Reject if this is not an octopus -- resolve should be used
+	 * instead.
+	 */
+	if (commit_list_count(remotes) < 2)
+		return 2;
+
+	/* Abort if index does not match head */
+	if (check_index_is_head(r, head_arg))
+		return 2;
+
+	get_oid(head_arg, &head);
+	head_commit = lookup_commit_reference(r, &head);
+	head_tree = repo_get_commit_tree(r, head_commit);
+
+	CALLOC_ARRAY(reference_commits, commit_list_count(remotes) + 1);
+	reference_commits[0] = head_commit;
+	reference_tree = head_tree;
+
+	repo_hold_locked_index(r, &lock, LOCK_DIE_ON_ERROR);
+
+	for (i = remotes; i && i->item; i = i->next) {
+		struct commit *c = i->item;
+		struct object_id *oid = &c->object.oid;
+		struct tree *current_tree = repo_get_commit_tree(r, c);
+		struct commit_list *common, *j;
+		char *branch_name = merge_get_better_branch_name(oid_to_hex(oid));
+		int up_to_date = 0;
+
+		common = repo_get_merge_bases_many(r, c, nr_references, reference_commits);
+		if (!common) {
+			error(_("Unable to find common commit with %s"), branch_name);
+
+			free(branch_name);
+			free_commit_list(common);
+
+			ret = 2;
+			break;
+		}
+
+		/*
+		 * If `oid' is reachable from `HEAD', we're already up
+		 * to date.
+		 */
+		for (j = common; j; j = j->next) {
+			if (oideq(&j->item->object.oid, oid)) {
+				up_to_date = 1;
+				break;
+			}
+		}
+
+		if (up_to_date) {
+			printf(_("Already up to date with %s\n"), branch_name);
+
+			free(branch_name);
+			free_commit_list(common);
+			continue;
+		}
+
+		/*
+		 * If we could fast-forward so far and `HEAD' is the
+		 * single merge base with the current `remote' revision,
+		 * keep fast-forwarding.
+		 */
+		if (ff_merge && common && !common->next && nr_references == 1 &&
+		    oideq(&common->item->object.oid,
+			  &reference_commits[0]->object.oid)) {
+			ret = octopus_fast_forward(r, branch_name, head_tree, current_tree);
+			nr_references = 0;
+		} else {
+			ret = octopus_do_merge(r, branch_name, common,
+					       current_tree, reference_tree);
+			ff_merge = 0;
+		}
+
+		free(branch_name);
+		free_commit_list(common);
+
+		if (ret == -1 || ret == 2)
+			break;
+		else if (ret && i->next) {
+			/*
+			 * We allow only last one to have a
+			 * hand-resolvable conflicts.  Last round failed
+			 * and we still had a head to merge.
+			 */
+			puts(_("Automated merge did not work."));
+			puts(_("Should not be doing an octopus."));
+
+			ret = 2;
+			break;
+		}
+
+		reference_commits[nr_references++] = c;
+		reference_tree = lookup_tree(r, &r->index->cache_tree->oid);
+	}
+
+	free(reference_commits);
+	write_locked_index(r->index, &lock, COMMIT_LOCK);
+
+	return ret;
+}
diff --git a/merge-strategies.h b/merge-strategies.h
index bba4bf999c..8de2249ee6 100644
--- a/merge-strategies.h
+++ b/merge-strategies.h
@@ -32,5 +32,8 @@ int merge_all_index(struct index_state *istate, int oneshot, int quiet,
 int merge_strategies_resolve(struct repository *r,
 			     struct commit_list *bases, const char *head_arg,
 			     struct commit_list *remote);
+int merge_strategies_octopus(struct repository *r,
+			     struct commit_list *bases, const char *head_arg,
+			     struct commit_list *remote);
 
 #endif /* MERGE_STRATEGIES_H */
-- 
2.37.1.412.gcfdce49ffd



* [PATCH v8 11/14] merge: use the "resolve" strategy without forking
  2022-08-09 18:54             ` [PATCH v8 00/14] Rewrite the remaining merge strategies from shell to C Alban Gruin
                                 ` (9 preceding siblings ...)
  2022-08-09 18:54               ` [PATCH v8 10/14] merge-octopus: rewrite in C Alban Gruin
@ 2022-08-09 18:54               ` Alban Gruin
  2022-08-13 16:18                 ` Junio C Hamano
  2022-08-09 18:54               ` [PATCH v8 12/14] merge: use the "octopus" " Alban Gruin
                                 ` (3 subsequent siblings)
  14 siblings, 1 reply; 221+ messages in thread
From: Alban Gruin @ 2022-08-09 18:54 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Phillip Wood, Johannes Schindelin, Alban Gruin

This teaches `git merge' to invoke the "resolve" strategy with a
function call instead of forking.

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 builtin/merge.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/builtin/merge.c b/builtin/merge.c
index f7c92c0e64..0ab2993ab2 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -44,6 +44,7 @@
 #include "commit-reach.h"
 #include "wt-status.h"
 #include "commit-graph.h"
+#include "merge-strategies.h"
 
 #define DEFAULT_TWOHEAD (1<<0)
 #define DEFAULT_OCTOPUS (1<<1)
@@ -774,6 +775,9 @@ static int try_merge_strategy(const char *strategy, struct commit_list *common,
 				       COMMIT_LOCK | SKIP_IF_UNCHANGED))
 			die(_("unable to write %s"), get_index_file());
 		return clean ? 0 : 1;
+	} else if (!strcmp(strategy, "resolve")) {
+		return merge_strategies_resolve(the_repository, common,
+						head_arg, remoteheads);
 	} else {
 		return try_merge_command(the_repository,
 					 strategy, xopts_nr, xopts,
-- 
2.37.1.412.gcfdce49ffd



* [PATCH v8 12/14] merge: use the "octopus" strategy without forking
  2022-08-09 18:54             ` [PATCH v8 00/14] Rewrite the remaining merge strategies from shell to C Alban Gruin
                                 ` (10 preceding siblings ...)
  2022-08-09 18:54               ` [PATCH v8 11/14] merge: use the "resolve" strategy without forking Alban Gruin
@ 2022-08-09 18:54               ` Alban Gruin
  2022-08-09 18:54               ` [PATCH v8 13/14] sequencer: use the "resolve" " Alban Gruin
                                 ` (2 subsequent siblings)
  14 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2022-08-09 18:54 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Phillip Wood, Johannes Schindelin, Alban Gruin

This teaches `git merge' to invoke the "octopus" strategy with a
function call instead of forking.

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 builtin/merge.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/builtin/merge.c b/builtin/merge.c
index 0ab2993ab2..a44a6b810b 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -778,6 +778,9 @@ static int try_merge_strategy(const char *strategy, struct commit_list *common,
 	} else if (!strcmp(strategy, "resolve")) {
 		return merge_strategies_resolve(the_repository, common,
 						head_arg, remoteheads);
+	} else if (!strcmp(strategy, "octopus")) {
+		return merge_strategies_octopus(the_repository, common,
+						head_arg, remoteheads);
 	} else {
 		return try_merge_command(the_repository,
 					 strategy, xopts_nr, xopts,
-- 
2.37.1.412.gcfdce49ffd



* [PATCH v8 13/14] sequencer: use the "resolve" strategy without forking
  2022-08-09 18:54             ` [PATCH v8 00/14] Rewrite the remaining merge strategies from shell to C Alban Gruin
                                 ` (11 preceding siblings ...)
  2022-08-09 18:54               ` [PATCH v8 12/14] merge: use the "octopus" " Alban Gruin
@ 2022-08-09 18:54               ` Alban Gruin
  2022-08-09 18:54               ` [PATCH v8 14/14] sequencer: use the "octopus" " Alban Gruin
  2022-11-18 11:18               ` [PATCH v9 00/12] merge-index: prepare to rewrite merge drivers in C Ævar Arnfjörð Bjarmason
  14 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2022-08-09 18:54 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Phillip Wood, Johannes Schindelin, Alban Gruin

This teaches the sequencer to invoke the "resolve" strategy with a
function call instead of forking.

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 sequencer.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/sequencer.c b/sequencer.c
index 5f22b7cd37..0e5e6cbb24 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -37,6 +37,7 @@
 #include "reset.h"
 #include "branch.h"
 #include "log-tree.h"
+#include "merge-strategies.h"
 
 #define GIT_REFLOG_ACTION "GIT_REFLOG_ACTION"
 
@@ -2314,9 +2315,16 @@ static int do_pick_commit(struct repository *r,
 
 		commit_list_insert(base, &common);
 		commit_list_insert(next, &remotes);
-		res |= try_merge_command(r, opts->strategy,
-					 opts->xopts_nr, (const char **)opts->xopts,
-					common, oid_to_hex(&head), remotes);
+
+		if (!strcmp(opts->strategy, "resolve")) {
+			repo_read_index(r);
+			res |= merge_strategies_resolve(r, common, oid_to_hex(&head), remotes);
+		} else {
+			res |= try_merge_command(r, opts->strategy,
+						 opts->xopts_nr, (const char **)opts->xopts,
+						 common, oid_to_hex(&head), remotes);
+		}
+
 		free_commit_list(common);
 		free_commit_list(remotes);
 	}
-- 
2.37.1.412.gcfdce49ffd



* [PATCH v8 14/14] sequencer: use the "octopus" strategy without forking
  2022-08-09 18:54             ` [PATCH v8 00/14] Rewrite the remaining merge strategies from shell to C Alban Gruin
                                 ` (12 preceding siblings ...)
  2022-08-09 18:54               ` [PATCH v8 13/14] sequencer: use the "resolve" " Alban Gruin
@ 2022-08-09 18:54               ` Alban Gruin
  2022-11-18 11:18               ` [PATCH v9 00/12] merge-index: prepare to rewrite merge drivers in C Ævar Arnfjörð Bjarmason
  14 siblings, 0 replies; 221+ messages in thread
From: Alban Gruin @ 2022-08-09 18:54 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Phillip Wood, Johannes Schindelin, Alban Gruin

This teaches the sequencer to invoke the "octopus" strategy with a
function call instead of forking.

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
---
 sequencer.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/sequencer.c b/sequencer.c
index 0e5e6cbb24..00a3620584 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -2319,6 +2319,9 @@ static int do_pick_commit(struct repository *r,
 		if (!strcmp(opts->strategy, "resolve")) {
 			repo_read_index(r);
 			res |= merge_strategies_resolve(r, common, oid_to_hex(&head), remotes);
+		} else if (!strcmp(opts->strategy, "octopus")) {
+			repo_read_index(r);
+			res |= merge_strategies_octopus(r, common, oid_to_hex(&head), remotes);
 		} else {
 			res |= try_merge_command(r, opts->strategy,
 						 opts->xopts_nr, (const char **)opts->xopts,
-- 
2.37.1.412.gcfdce49ffd



* Re: [PATCH v8 05/14] merge-index: add a new way to invoke `git-merge-one-file'
  2022-08-09 18:54               ` [PATCH v8 05/14] merge-index: add a new way to invoke `git-merge-one-file' Alban Gruin
@ 2022-08-09 21:36                 ` Johannes Schindelin
  2022-08-10 13:14                   ` Phillip Wood
  0 siblings, 1 reply; 221+ messages in thread
From: Johannes Schindelin @ 2022-08-09 21:36 UTC (permalink / raw)
  To: Alban Gruin; +Cc: git, Junio C Hamano, Phillip Wood

Hi Alban,

On Tue, 9 Aug 2022, Alban Gruin wrote:

> Since `git-merge-one-file' will be rewritten and libified, there may be
> cases where there is no executable named this way (ie. when git is
> compiled with `SKIP_DASHED_BUILT_INS' enabled).  This adds a new way to
> invoke this particular program even if it does not exist, by passing
> `--use=merge-one-file' to merge-index.  For now, it still forks.

I read up about Stolee's and Phillip's suggestion, and about Junio chiming
in, but I have to point out that all the objections against special-casing
`!strcmp(pgm, "git-merge-one-file")` share one fundamental flaw: they fail
to acknowledge that we will _have_ to special-case this value once we turn
`merge-one-file` into a built-in.

And the reason is: there might be scripts out there that expect `git
merge-index git-merge-one-file [...]` to continue to work even when
building Git with `SKIP_DASHED_BUILT_INS`.

In light of that, I would like to point out that we really _must_ revert
to `if (!strcmp(pgm, "git-merge-one-file"))`, ie. to what v6 did (see
https://lore.kernel.org/git/20201124115315.13311-7-alban.gruin@gmail.com/).

And since we must do that anyway, I do not see any need for the `--use`
option at all, it just complicates the usage and does not really provide
any benefit that I can see.

On the upside: skipping the `--use` option will dramatically simplify this
patch.
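
To make the idea concrete, here is a minimal standalone sketch of the
special-casing, not the code from v6. All names in it (merge_handler,
builtin_merge_one_file, spawn_external, pick_handler) are hypothetical
stand-ins, and the two handlers only print what they would do. The point
is that only the exact historical program name is routed to the
in-process implementation; everything else still goes to an external
program.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical callback type; not git's actual merge_fn signature. */
typedef int (*merge_handler)(const char *path);

/* Stand-in for the libified merge-one-file logic (runs in-process). */
static int builtin_merge_one_file(const char *path)
{
	printf("in-process merge of %s\n", path);
	return 0;
}

/* Stand-in for spawning an arbitrary external merge program. */
static int spawn_external(const char *path)
{
	printf("would fork a helper for %s\n", path);
	return 0;
}

/*
 * Choose a handler from the program name given on the command line.
 * Only the exact historical name is special-cased, so existing
 * callers keep working even if no such executable is installed.
 */
static merge_handler pick_handler(const char *pgm)
{
	assert(pgm);
	if (!strcmp(pgm, "git-merge-one-file"))
		return builtin_merge_one_file;
	return spawn_external;
}
```

With this shape, a script that passes `git-merge-one-file` gets the
built-in path, while any other program name (including a name with a
space in it) is treated as an external program.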

Sorry for not catching this earlier.

> diff --git a/Documentation/git-merge-index.txt b/Documentation/git-merge-index.txt
> [...]
> @@ -44,8 +44,9 @@ code.
>  Typically this is run with a script calling Git's imitation of
>  the 'merge' command from the RCS package.
>
> -A sample script called 'git merge-one-file' is included in the
> -distribution.
> +A sample script called 'git merge-one-file' used to be included in the
> +distribution. This program must now be called with
> +'--use=merge-one-file'.

It probably makes more sense to just drop this paragraph because we will
no longer provide that sample script.

Thanks,
Dscho


* Re: [PATCH v8 07/14] merge-one-file: rewrite in C
  2022-08-09 18:54               ` [PATCH v8 07/14] merge-one-file: rewrite in C Alban Gruin
@ 2022-08-09 22:01                 ` Johannes Schindelin
  0 siblings, 0 replies; 221+ messages in thread
From: Johannes Schindelin @ 2022-08-09 22:01 UTC (permalink / raw)
  To: Alban Gruin; +Cc: git, Junio C Hamano, Phillip Wood

Hi Alban,

what an incredible amount of careful work. Thank you for doing this.

A few minor comments:

On Tue, 9 Aug 2022, Alban Gruin wrote:

> diff --git a/builtin/merge-one-file.c b/builtin/merge-one-file.c
> new file mode 100644
> index 0000000000..ec718cc1c9
> --- /dev/null
> +++ b/builtin/merge-one-file.c
> @@ -0,0 +1,92 @@
> +/*
> + * Builtin "git merge-one-file"
> + *
> + * Copyright (c) 2020 Alban Gruin

There have been claims that it is still March 2020 (see e.g.
https://ismarchoveryet.com/), but I believe that those are jokes and that
we're really in the year 2022 now. It should be safe to adjust the text
accordingly.

:-)

> [...]
> +int merge_three_way(struct index_state *istate,
> +		    const struct object_id *orig_blob,
> +		    const struct object_id *our_blob,
> +		    const struct object_id *their_blob, const char *path,
> +		    unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode)
> +{
> [...]
> +}
> +
> +int merge_one_file_func(struct index_state *istate,
> +			const struct object_id *orig_blob,
> +			const struct object_id *our_blob,
> +			const struct object_id *their_blob, const char *path,
> +			unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
> +			void *data)
> +{
> +	return merge_three_way(istate,
> +			       orig_blob, our_blob, their_blob, path,
> +			       orig_mode, our_mode, their_mode);
> +}

I have only read the patch series until this point (and plan on continuing
with the remaining patches tomorrow), so I might be wrong, but... is there
any other user of `merge_three_way()` left? If not (and I suspect this is
the case), then the `merge_three_way()` code could be moved into
`merge_one_file_func()`.

> [...]
> diff --git a/t/t6060-merge-index.sh b/t/t6060-merge-index.sh
> index 3845a9d3cc..9976996c80 100755
> --- a/t/t6060-merge-index.sh
> +++ b/t/t6060-merge-index.sh
> @@ -70,7 +70,7 @@ test_expect_success 'merge-one-file fails without a work tree' '
>  	(cd bare.git &&
>  	 GIT_INDEX_FILE=$PWD/merge.index &&
>  	 export GIT_INDEX_FILE &&
> -	 test_must_fail git merge-index git-merge-one-file -a
> +	 test_must_fail git merge-index --use=merge-one-file -a

This hunk probably wanted to live in [PATCH v8 05/14] merge-index: add a
new way to invoke `git-merge-one-file', but as I pointed out in my reply
to that patch: I do not think that we have to introduce that `--use=<...>`
option at all.

>  	)
>  '
>
> diff --git a/t/t6415-merge-dir-to-symlink.sh b/t/t6415-merge-dir-to-symlink.sh
> index 2655e295f5..10bc5eb8c4 100755
> --- a/t/t6415-merge-dir-to-symlink.sh
> +++ b/t/t6415-merge-dir-to-symlink.sh
> @@ -99,7 +99,7 @@ test_expect_success SYMLINKS 'a/b was resolved as symlink' '
>  	test -h a/b
>  '
>
> -test_expect_failure 'do not lose untracked in merge (resolve)' '
> +test_expect_success 'do not lose untracked in merge (resolve)' '

Very, very nice.

Thank you!
Dscho

>  	git reset --hard &&
>  	git checkout baseline^0 &&
>  	>a/b/c/e &&
> --
> 2.37.1.412.gcfdce49ffd
>
>


* Re: [PATCH v8 05/14] merge-index: add a new way to invoke `git-merge-one-file'
  2022-08-09 21:36                 ` Johannes Schindelin
@ 2022-08-10 13:14                   ` Phillip Wood
  0 siblings, 0 replies; 221+ messages in thread
From: Phillip Wood @ 2022-08-10 13:14 UTC (permalink / raw)
  To: Johannes Schindelin, Alban Gruin; +Cc: git, Junio C Hamano

On 09/08/2022 22:36, Johannes Schindelin wrote:
> Hi Alban,
> 
> On Tue, 9 Aug 2022, Alban Gruin wrote:
> 
>> Since `git-merge-one-file' will be rewritten and libified, there may be
>> cases where there is no executable named this way (ie. when git is
>> compiled with `SKIP_DASHED_BUILT_INS' enabled).  This adds a new way to
>> invoke this particular program even if it does not exist, by passing
>> `--use=merge-one-file' to merge-index.  For now, it still forks.
> 
> I read up about Stolee's and Phillip's suggestion,

I thought I was in favor of special casing git-merge-one-file. Stolee 
seems to have been worried about someone passing "git merge-one-file" 
but I think we only accept a program name and not a program plus 
arguments so that shouldn't be a problem.

> and about Junio chiming
> in, but I have to point out that all the objections against special-casing
> `!strcmp(pgm, "git-merge-one-file")` share one fundamental flaw: they fail
> to acknowledge that we will _have_ to special-case this value once we turn
> `merge-one-file` into a built-in.
> 
> And the reason is: there might be scripts out there that expect `git
> merge-index git-merge-one-file [...]` to continue to work even when
> building Git with `SKIP_DASHED_BUILT_INS`.
> 
> In light of that, I would like to point out that we really _must_ revert
> to `if (!strcmp(pgm, "git-merge-one-file"))`, ie. to what v6 did (see
> https://lore.kernel.org/git/20201124115315.13311-7-alban.gruin@gmail.com/).
> 
> And since we must do that anyway, I do not see any need for the `--use`
> option at all, it just complicates the usage and does not really provide
> any benefit that I can see.
> 
> On the upside: skipping the `--use` option will dramatically simplify this
> patch.

I'd be happy to see the '--use' option dropped as well.

Thanks for continuing to work on this, Alban. I'm sorry I've not had time
to look at it properly since v1; I'll try to take a good look at this
version.

Best Wishes

Phillip

> Sorry for not catching this earlier.
> 
>> diff --git a/Documentation/git-merge-index.txt b/Documentation/git-merge-index.txt
>> [...]
>> @@ -44,8 +44,9 @@ code.
>>   Typically this is run with a script calling Git's imitation of
>>   the 'merge' command from the RCS package.
>>
>> -A sample script called 'git merge-one-file' is included in the
>> -distribution.
>> +A sample script called 'git merge-one-file' used to be included in the
>> +distribution. This program must now be called with
>> +'--use=merge-one-file'.
> 
> It probably makes more sense to just drop this paragraph because we will
> no longer provide that sample script.
> 
> Thanks,
> Dscho



* Re: [PATCH v8 08/14] merge-resolve: rewrite in C
  2022-08-09 18:54               ` [PATCH v8 08/14] merge-resolve: " Alban Gruin
@ 2022-08-10 15:03                 ` Phillip Wood
  2022-08-10 21:20                   ` Junio C Hamano
  2022-08-16 12:17                   ` Johannes Schindelin
  2022-08-17  2:16                 ` Ævar Arnfjörð Bjarmason
  2022-08-18 14:43                 ` Ævar Arnfjörð Bjarmason
  2 siblings, 2 replies; 221+ messages in thread
From: Phillip Wood @ 2022-08-10 15:03 UTC (permalink / raw)
  To: Alban Gruin, git; +Cc: Junio C Hamano, Johannes Schindelin

Hi Alban

On 09/08/2022 19:54, Alban Gruin wrote:
> This rewrites `git merge-resolve' from shell to C.  As for `git
> merge-one-file', this port is not completely straightforward and removes
> calls to external processes to avoid reading and writing the index over
> and over again.
> 
>   - The call to `update-index -q --refresh' is replaced by a call to
>     refresh_index().
> 
>   - The call to `read-tree' is replaced by a call to unpack_trees() (and
>     all the setup needed).
> 
>   - The call to `write-tree' is replaced by a call to
>     cache_tree_update().  This call is wrapped in a new function,
>     write_tree().  It is made to mimic write_index_as_tree() with
>     WRITE_TREE_SILENT flag, but without locking the index; this is taken
>     care of directly in merge_strategies_resolve().
> 
>   - The call to `diff-index ...' is replaced by a call to
>     repo_index_has_changes().
> 
>   - The call to `merge-index', needed to invoke `git merge-one-file', is
>     replaced by a call to the new merge_all_index() function.
> 
> The index is read in cmd_merge_resolve(), and is written back by
> merge_strategies_resolve().  This is to accommodate future applications:
> in `git-merge', the index has already been read when the merge strategy
> is called, so it would be redundant to read it again when the builtin
> will be able to use merge_strategies_resolve() directly.
> 
> The parameters of merge_strategies_resolve() will be surprising at first
> glance: why use a commit list for `bases' and `remote', when we could
> use an oid array and a pointer to an oid?  Because, in a later commit,
> try_merge_strategy() will be able to call merge_strategies_resolve()
> directly, and it already uses a commit list for `bases' (`common') and
> `remote' (`remoteheads'), and a string for `head_arg'.  To reduce
> friction later, merge_strategies_resolve() takes the same types of
> parameters.

git-merge-resolve will happily merge three trees; unfortunately, using
lists of commits will break that.

> merge_strategies_resolve() locks the index only once, at the beginning
> of the merge, and releases it when the merge has been completed.
> 
> Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
> ---
> diff --git a/builtin/merge-resolve.c b/builtin/merge-resolve.c
> new file mode 100644
> index 0000000000..a51158ebf8
> --- /dev/null
> +++ b/builtin/merge-resolve.c
> @@ -0,0 +1,63 @@
> +/*
> + * Builtin "git merge-resolve"
> + *
> + * Copyright (c) 2020 Alban Gruin
> + *
> + * Based on git-merge-resolve.sh, written by Linus Torvalds and Junio C
> + * Hamano.
> + *
> + * Resolve two trees, using enhanced multi-base read-tree.
> + */
> +
> +#include "cache.h"
> +#include "builtin.h"
> +#include "merge-strategies.h"
> +
> +static const char builtin_merge_resolve_usage[] =
> +	"git merge-resolve <bases>... -- <head> <remote>";
> +
> +int cmd_merge_resolve(int argc, const char **argv, const char *prefix)
> +{
> +	int i, sep_seen = 0;
> +	const char *head = NULL;
> +	struct commit_list *bases = NULL, *remote = NULL;
> +	struct commit_list **next_base = &bases;
> +	struct repository *r = the_repository;
> +
> +	if (argc < 5)
> +		usage(builtin_merge_resolve_usage);

I think it would be better to call parse_options() and then check argc.
That would give better error messages for unknown options and support
'-h' for free.

I think we also need to call git_config(). I see that read-tree respects 
submodule.recurse so I think we need the same here. I suspect we should 
also be reading the merge config to respect merge.conflictStyle.
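
As a rough illustration of the "then check argc" step, here is a
standalone sketch that validates the positional layout
`<bases>... -- <head> <remote>`. It does not use git's parse_options();
it assumes the options have already been consumed and that the remaining
arguments still contain the `--` separator. The struct, the function name
and the MAX_BASES limit are all made up for the example.

```c
#include <assert.h>
#include <string.h>

#define MAX_BASES 8	/* illustrative limit, not taken from git */

struct resolve_args {
	const char *bases[MAX_BASES];
	int nr_bases;
	const char *head;
	const char *remote;
};

/*
 * Split "<bases>... -- <head> <remote>" into its parts.
 * argv holds only positional arguments (no program name).
 * Returns 0 on success, -1 if the layout is wrong.
 */
static int split_resolve_args(int argc, const char **argv,
			      struct resolve_args *out)
{
	int i, sep = -1;

	assert(out);
	memset(out, 0, sizeof(*out));

	for (i = 0; i < argc; i++) {
		if (!strcmp(argv[i], "--")) {
			sep = i;
			break;
		}
	}

	/* Need at least one base before the separator... */
	if (sep < 1 || sep > MAX_BASES)
		return -1;
	/* ...and exactly <head> and <remote> after it. */
	if (argc - sep - 1 != 2)
		return -1;

	for (i = 0; i < sep; i++)
		out->bases[out->nr_bases++] = argv[i];
	out->head = argv[sep + 1];
	out->remote = argv[sep + 2];
	return 0;
}
```

A caller would print the usage string whenever this returns -1, which
keeps the "wrong number of arguments" handling in one place.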

> +	setup_work_tree();
> +	if (repo_read_index(r) < 0)
> +		die("invalid index");

This should probably be marked for translation.

> +
> +	/*
> +	 * The first parameters up to -- are merge bases; the rest are
> +	 * heads.
> +	 */
> +	for (i = 1; i < argc; i++) {
> +		if (!strcmp(argv[i], "--"))
> +			sep_seen = 1;
> +		else if (!strcmp(argv[i], "-h"))
> +			usage(builtin_merge_resolve_usage);
> +		else if (sep_seen && !head)
> +			head = argv[i];
> +		else {
> +			struct object_id oid;
> +			struct commit *commit;
> +
> +			if (get_oid(argv[i], &oid))
> +				die("object %s not found.", argv[i]);

This should be marked for translation as well.

> +			commit = oideq(&oid, r->hash_algo->empty_tree) ?
> +				NULL : lookup_commit_or_die(&oid, argv[i]);

As I said above, git-merge-resolve should be callable with trees I think.

> +
> +			if (sep_seen)
> +				commit_list_insert(commit, &remote);
> +			else
> +				next_base = commit_list_append(commit, next_base);
> +		}
> +	}
> +
> +	return merge_strategies_resolve(r, bases, head, remote);
> +}
> diff --git a/merge-strategies.c b/merge-strategies.c
> index 373b69c10b..30f225ae5f 100644
> --- a/merge-strategies.c
> +++ b/merge-strategies.c
> @@ -1,9 +1,34 @@
>   #include "cache.h"
> +#include "cache-tree.h"
>   #include "dir.h"
>   #include "entry.h"
> +#include "lockfile.h"
>   #include "merge-strategies.h"
> +#include "unpack-trees.h"
>   #include "xdiff-interface.h"
>   
> +static int check_index_is_head(struct repository *r, const char *head_arg)
> +{
> +	struct commit *head_commit;
> +	struct tree *head_tree;
> +	struct object_id head;
> +	struct strbuf sb = STRBUF_INIT;
> +
> +	get_oid(head_arg, &head);
> +	head_commit = lookup_commit_reference(r, &head);
> +	head_tree = repo_get_commit_tree(r, head_commit);

Can this all be replaced by a call to parse_tree_indirect()? We should
also handle an invalid HEAD.

> +
> +	if (repo_index_has_changes(r, head_tree, &sb)) {
> +		error(_("Your local changes to the following files "
> +			"would be overwritten by merge:\n  %s"),
> +		      sb.buf);

This matches the script, but I wonder why the script did not check for
unstaged changes.


> +int merge_strategies_resolve(struct repository *r,
> +			     struct commit_list *bases, const char *head_arg,

As well as the commit vs tree comments above I think that we should be 
getting the callers to parse head rather than passing a string. Both 
builtin/merge.c and sequencer.c have a struct commit we can use.

> +			     struct commit_list *remote)
> +{
> +	struct tree_desc t[MAX_UNPACK_TREES];
> +	struct commit_list *i;
> +	struct lock_file lock = LOCK_INIT;
> +	int nr = 0, ret = 0;
> +
> +	/* Abort if index does not match head */
> +	if (check_index_is_head(r, head_arg))
> +		return 2;
> +
> +	/*
> +	 * Give up if we are given two or more remotes.  Not handling
> +	 * octopus.
> +	 */
> +	if (remote && remote->next)
> +		return 2;
> +
> +	/* Give up if this is a baseless merge. */
> +	if (!bases)
> +		return 2;
> +
> +	puts(_("Trying simple merge."));
> +
> +	for (i = bases; i && i->item; i = i->next) {
> +		if (add_tree(repo_get_commit_tree(r, i->item), t + (nr++)))

This needs to check that we're not overrunning the end of t, in the same
way that builtin/read-tree.c:list_trees() checks before adding a tree.
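
For illustration only, the kind of guard meant here can be sketched
outside of git like this. MAX_TREES stands in for MAX_UNPACK_TREES, and
struct tree_slot and push_tree() are hypothetical; this is not the
add_tree() helper from the patch.

```c
#include <assert.h>

#define MAX_TREES 8	/* stand-in for git's MAX_UNPACK_TREES */

struct tree_slot {
	const char *name;	/* placeholder for a real tree descriptor */
};

/*
 * Append one entry to a fixed-size array, refusing to write past
 * the end.  Returns 0 on success, -1 when the array is full.
 */
static int push_tree(struct tree_slot *slots, int *nr, const char *name)
{
	assert(slots && nr);
	if (*nr >= MAX_TREES)
		return -1;
	slots[(*nr)++].name = name;
	return 0;
}
```

The counter is only advanced after the bound has been checked, so a
failed call leaves both the array and the count untouched.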


Except for the tree issue, the conversion of the script looks correct to
me, and you have been careful to preserve the exit values.

Best Wishes

Phillip

> +			return 2;
> +	}
> +
> +	if (head_arg) {
> +		struct object_id head;
> +		struct tree *tree;
> +
> +		get_oid(head_arg, &head);
> +		tree = parse_tree_indirect(&head);
> +
> +		if (add_tree(tree, t + (nr++)))
> +			return 2;
> +	}
> +
> +	if (remote && add_tree(repo_get_commit_tree(r, remote->item), t + (nr++)))
> +		return 2;
> +
> +	repo_hold_locked_index(r, &lock, LOCK_DIE_ON_ERROR);
> +
> +	if (merge_trees(r, t, nr, 1)) {
> +		rollback_lock_file(&lock);
> +		return 2;
> +	}
> +
> +	if (write_tree(r)) {
> +		puts(_("Simple merge failed, trying Automatic merge."));
> +		ret = merge_all_index(r->index, 1, 0, merge_one_file_func, NULL);
> +	}
> +
> +	if (write_locked_index(r->index, &lock, COMMIT_LOCK))
> +		return !!error(_("unable to write new index file"));
> +	return !!ret;
> +}
> diff --git a/merge-strategies.h b/merge-strategies.h
> index 8705a550ca..bba4bf999c 100644
> --- a/merge-strategies.h
> +++ b/merge-strategies.h
> @@ -1,6 +1,7 @@
>   #ifndef MERGE_STRATEGIES_H
>   #define MERGE_STRATEGIES_H
>   
> +#include "commit.h"
>   #include "object.h"
>   
>   int merge_three_way(struct index_state *istate,
> @@ -28,4 +29,8 @@ int merge_index_path(struct index_state *istate, int oneshot, int quiet,
>   int merge_all_index(struct index_state *istate, int oneshot, int quiet,
>   		    merge_fn fn, void *data);
>   
> +int merge_strategies_resolve(struct repository *r,
> +			     struct commit_list *bases, const char *head_arg,
> +			     struct commit_list *remote);
> +
>   #endif /* MERGE_STRATEGIES_H */


* Re: [PATCH v8 08/14] merge-resolve: rewrite in C
  2022-08-10 15:03                 ` Phillip Wood
@ 2022-08-10 21:20                   ` Junio C Hamano
  2022-08-16 12:09                     ` Johannes Schindelin
  2022-08-16 12:17                   ` Johannes Schindelin
  1 sibling, 1 reply; 221+ messages in thread
From: Junio C Hamano @ 2022-08-10 21:20 UTC (permalink / raw)
  To: Phillip Wood; +Cc: Alban Gruin, git, Johannes Schindelin

Phillip Wood <phillip.wood123@gmail.com> writes:

> git-merge-resolve will happily merge three trees, unfortunately using
> lists of commits will break that.

True.

While I agree that it would make sense to rewrite some strategies in
C, I do not quite see the point of redoing this particular one.  Its
simplicity is one of the few remaining shining points in the
"resolve" strategy, and it can serve as an easy-to-understand
example to demonstrate what a merge-strategy implementation should
look like.  However, given the improvements to the "recursive" and
more recently the "ort" strategy, I do not know how much "real" use
there is to it.  I even suspect that the users do not mind if a
platform does not ship this strategy by default if it has so much
trouble running a shell script.

By rewriting it to C, we would lose an easy-to-understand example
that the users can easily run to see how it works, but what we gain
in exchange is not clear, at least to me.

* Re: [PATCH v8 11/14] merge: use the "resolve" strategy without forking
  2022-08-09 18:54               ` [PATCH v8 11/14] merge: use the "resolve" strategy without forking Alban Gruin
@ 2022-08-13 16:18                 ` Junio C Hamano
  0 siblings, 0 replies; 221+ messages in thread
From: Junio C Hamano @ 2022-08-13 16:18 UTC (permalink / raw)
  To: Alban Gruin; +Cc: git, Phillip Wood, Johannes Schindelin

Alban Gruin <alban.gruin@gmail.com> writes:

> This teaches `git merge' to invoke the "resolve" strategy with a
> function call instead of forking.
>
> Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
> ---
>  builtin/merge.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/builtin/merge.c b/builtin/merge.c
> index f7c92c0e64..0ab2993ab2 100644
> --- a/builtin/merge.c
> +++ b/builtin/merge.c
> @@ -44,6 +44,7 @@
>  #include "commit-reach.h"
>  #include "wt-status.h"
>  #include "commit-graph.h"
> +#include "merge-strategies.h"
>  
>  #define DEFAULT_TWOHEAD (1<<0)
>  #define DEFAULT_OCTOPUS (1<<1)
> @@ -774,6 +775,9 @@ static int try_merge_strategy(const char *strategy, struct commit_list *common,
>  				       COMMIT_LOCK | SKIP_IF_UNCHANGED))
>  			die(_("unable to write %s"), get_index_file());
>  		return clean ? 0 : 1;
> +	} else if (!strcmp(strategy, "resolve")) {
> +		return merge_strategies_resolve(the_repository, common,
> +						head_arg, remoteheads);
>  	} else {
>  		return try_merge_command(the_repository,
>  					 strategy, xopts_nr, xopts,

This is another thing that probably hurts the overall project more
than it helps, I am afraid.

Recall the recent effort in Elijah's en/merge-restore-to-pristine topic; cf.
https://lore.kernel.org/git/pull.1231.v5.git.1658541198.gitgitgadget@gmail.com/

There are different failure modes of merge strategy backends, and
the "git merge" command that drives them must be prepared to handle
various failures from them.  It is one selling point of "git merge"
that there is a codepath that lets you use your own merge strategy
backend.

Before this series, we had recursive and ort backends that are
internally called without going through try_merge_command()
codepath, and resolve and octopus covered the more general codepath,
the same one that is used by external third-party strategy backends,
and we had test coverage for all.
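
For readers who have not looked at that more general codepath: going by
the usage string of the builtin under review ("<bases>... -- <head>
<remote>"), an external backend is a program named git-merge-<name> that
receives the merge bases, a "--" separator, the head and the remote(s).
The following is only a minimal sketch of splitting those arguments.
The helper name parse_strategy_args is hypothetical, and using exit
status 2 for "cannot handle this request" is an assumption drawn from
the codes that appear in this series, not a specification.

```shell
#!/bin/sh
# Hypothetical helper (not part of Git): split the arguments
# "<bases>... -- <head> <remote>..." into three shell variables.
parse_strategy_args () {
	bases= head= remotes= sep_seen=
	for arg
	do
		if test -z "$sep_seen"
		then
			if test "$arg" = "--"
			then
				sep_seen=t
			else
				# Everything before "--" is a merge base.
				bases="$bases$arg "
			fi
		elif test -z "$head"
		then
			# First argument after "--" is our side.
			head=$arg
		else
			# Remaining arguments are the side(s) being merged.
			remotes="$remotes$arg "
		fi
	done
	# Assumed convention: status 2 means "this backend cannot handle it".
	test -n "$sep_seen" && test -n "$head" && test -n "$remotes" || return 2
}
```

A real backend would go on to inspect the index and trees; the point
here is only the shape of the interface that "git merge" has to drive
for third-party strategies.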

As I said earlier, as the "ort" strategy got more mature and
performant, the simpler "resolve" may have outlived the value we get
out of its use in the real world (read: what's the last time you ran
"git merge" with the "-s resolve" option?).  So at this point, the
value of having tests of "-s resolve" in our test suite mostly does
not come from the fact that we are keeping "resolve" alive.  It
comes from the fact that we are making sure that the codepath to
drive external merge strategies does not regress.  While it moves to
calling resolve and octopus internally, I do not think this series
compensates for the loss of test coverage by adding tests to drive a
custom merge strategy.

A possible correction may be to _add_ a new merge strategy written
in C that implements the same algorithm "resolve" uses, give it a
different name, say "c-resolve", and call it internally instead of
spawning.  And keep "resolve" instead of replacing it with
"c-resolve".  You can duplicate the tests we have for "-s resolve"
so that the new "-s c-resolve" codepath gets tested to the same
degree.  Then we will not lose the value "resolve" has, which is to
serve as a testbed for external merge strategy.

But as I said already, I suspect that "-s resolve" is not of much
use in the real world, not because it is not implemented in C but
because there is a generally better alternative.  It makes us wonder
if we are making good use of our engineering effort by giving yet
another strategy, "-s c-resolve", to the users.

IOW, I am not sure there is value in rewriting resolve in C (except
for educational value for the developer who does the task, that is),
and it is doubly dubious to call it internally instead of spawning
it as an external command.

So, I dunno.  I think between octopus and resolve, the former might
still be used and it might make sense to have a more "performant"
version of it (there is no strong reason why it needs to use the
same resolve backend for repeated pairwise merges it does---it could
just call into recursive or ort machinery instead if the resolve
machinery is more cumbersome to use) by rewriting it in C.  But
rewriting "resolve" in C to call it internally looks to me a
regression overall to the value "resolve" gives to this project.
Stopping at rewriting it in C but still calling it externally might
make it more acceptable, though.

Thanks.


* Re: [PATCH v8 08/14] merge-resolve: rewrite in C
  2022-08-10 21:20                   ` Junio C Hamano
@ 2022-08-16 12:09                     ` Johannes Schindelin
  2022-08-16 19:36                       ` Junio C Hamano
  0 siblings, 1 reply; 221+ messages in thread
From: Johannes Schindelin @ 2022-08-16 12:09 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Phillip Wood, Alban Gruin, git

Hi Junio,

On Wed, 10 Aug 2022, Junio C Hamano wrote:

> While I agree that it would make sense to rewrite some strategies in
> C, I do not quite see the point of redoing this particular one.  Its
> simplicity is one of the few remaining shining points in the
> "resolve" strategy, and it can serve as an easy-to-understand
> example to demonstrate what a merge-strategy implementation should
> look like.

I am sure we can do much better than

	https://github.com/git/git/blob/v2.37.2/git-merge-resolve.sh

when it comes to demonstrating a script to implement a custom merge
strategy. The really nice thing about a custom merge strategy, after all,
is that you can forgo pretty much all error handling and command-line
parsing because you know precisely how you are going to use it.

> However, given the improvements to the "recursive" and more recently
> the "ort" strategy, I do not know how much "real" use there is to it.  I
> even suspect that the users do not mind if a platform does not ship this
> strategy by default if it has so much trouble running a shell script.
>
> By rewriting it to C, we would lose an easy-to-understand example that
> the users can easily run to see how it works, but what we gain in
> exchange is not clear, at least to me.

We reduce Git's reliance on POSIX shell scripting, we reduce the number of
programming languages contributors need to be familiar with, we open up to
code coverage/static analysis tools that handle C but not shell scripts,
just to name a few.

If you want to have an easy example of a custom merge strategy, then let's
have that easy example. `git-merge-resolve.sh` ain't that example.

It would be a different matter if you had commented about
`git-merge-ours.sh`:
https://github.com/git/git/blob/v2.17.0/contrib/examples/git-merge-ours.sh
That _was_ a simple and easy example.

I would also have understood a lament about the absence of any good
example in https://git-scm.com/docs/git-merge#_merge_strategies to help
users develop their own custom merge strategies.

I'm all in favor of adding such a good example there, but there is no
reason to hold back `git merge-resolve` from being implemented in C.

Ciao,
Dscho

* Re: [PATCH v8 08/14] merge-resolve: rewrite in C
  2022-08-10 15:03                 ` Phillip Wood
  2022-08-10 21:20                   ` Junio C Hamano
@ 2022-08-16 12:17                   ` Johannes Schindelin
  2022-08-16 14:02                     ` Phillip Wood
  1 sibling, 1 reply; 221+ messages in thread
From: Johannes Schindelin @ 2022-08-16 12:17 UTC (permalink / raw)
  To: phillip.wood; +Cc: Alban Gruin, git, Junio C Hamano

Hi Phillip,

On Wed, 10 Aug 2022, Phillip Wood wrote:

> On 09/08/2022 19:54, Alban Gruin wrote:
> > This rewrites `git merge-resolve' from shell to C.  As for `git
> > merge-one-file', this port is not completely straightforward and removes
> > calls to external processes to avoid reading and writing the index over
> > and over again.
> >
> >   - The call to `update-index -q --refresh' is replaced by a call to
> >     refresh_index().
> >
> >   - The call to `read-tree' is replaced by a call to unpack_trees() (and
> >     all the setup needed).
> >
> >   - The call to `write-tree' is replaced by a call to
> >     cache_tree_update().  This call is wrapped in a new function,
> >     write_tree().  It is made to mimic write_index_as_tree() with
> >     WRITE_TREE_SILENT flag, but without locking the index; this is taken
> >     care of directly in merge_strategies_resolve().
> >
> >   - The call to `diff-index ...' is replaced by a call to
> >     repo_index_has_changes().
> >
> >   - The call to `merge-index', needed to invoke `git merge-one-file', is
> >     replaced by a call to the new merge_all_index() function.
> >
> > The index is read in cmd_merge_resolve(), and is written back by
> > merge_strategies_resolve().  This is to accommodate future applications:
> > in `git-merge', the index has already been read when the merge strategy
> > is called, so it would be redundant to read it again when the builtin
> > will be able to use merge_strategies_resolve() directly.
> >
> > The parameters of merge_strategies_resolve() will be surprising at first
> > glance: why use a commit list for `bases' and `remote', where we could
> > use an oid array, and a pointer to an oid?  Because, in a later commit,
> > try_merge_strategy() will be able to call merge_strategies_resolve()
> > directly, and it already uses a commit list for `bases' (`common') and
> > `remote' (`remoteheads'), and a string for `head_arg'.  To reduce
> > frictions later, merge_strategies_resolve() takes the same types of
> > parameters.
>
> git-merge-resolve will happily merge three trees, unfortunately using
> lists of commits will break that.

But isn't `merge-resolve` specifically implemented as a merge strategy? I
do not see any contract in Git's documentation that commits to supporting
direct calls to the implementation detail that is `git merge-resolve`:

	$ man git-merge-resolve
	No manual entry for git-merge-resolve

> > merge_strategies_resolve() locks the index only once, at the beginning
> > of the merge, and releases it when the merge has been completed.
> >
> > Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
> > ---
> > diff --git a/builtin/merge-resolve.c b/builtin/merge-resolve.c
> > new file mode 100644
> > index 0000000000..a51158ebf8
> > --- /dev/null
> > +++ b/builtin/merge-resolve.c
> > @@ -0,0 +1,63 @@
> > +/*
> > + * Builtin "git merge-resolve"
> > + *
> > + * Copyright (c) 2020 Alban Gruin
> > + *
> > + * Based on git-merge-resolve.sh, written by Linus Torvalds and Junio C
> > + * Hamano.
> > + *
> > + * Resolve two trees, using enhanced multi-base read-tree.
> > + */
> > +
> > +#include "cache.h"
> > +#include "builtin.h"
> > +#include "merge-strategies.h"
> > +
> > +static const char builtin_merge_resolve_usage[] =
> > +	"git merge-resolve <bases>... -- <head> <remote>";
> > +
> > +int cmd_merge_resolve(int argc, const char **argv, const char *prefix)
> > +{
> > +	int i, sep_seen = 0;
> > +	const char *head = NULL;
> > +	struct commit_list *bases = NULL, *remote = NULL;
> > +	struct commit_list **next_base = &bases;
> > +	struct repository *r = the_repository;
> > +
> > +	if (argc < 5)
> > +		usage(builtin_merge_resolve_usage);
>
> I think it would be better to call parse_options() and then check argc. That
> would give better error messages for unknown options and supports '-h' for
> free.

Again, we are talking about a merge strategy, a program that is not meant
to be called directly by the user. Why should we complicate the code by
using the `parse_options` machinery?

> I think we also need to call git_config(). I see that read-tree respects
> submodule.recurse so I think we need the same here. I suspect we should
> also be reading the merge config to respect merge.conflictStyle.

Valid concerns. Extra brownie points if you can provide a simple test case
that demonstrates the current behavior.

> > +
> > +	if (repo_index_has_changes(r, head_tree, &sb)) {
> > +		error(_("Your local changes to the following files "
> > +			"would be overwritten by merge:\n  %s"),
> > +		      sb.buf);
>
> This matches the script but I wonder why that did not check for unstaged
> changes.

Any deviations from the scripted behavior should be done on top of this
patch series, unless the deviations make the conversion substantially
cleaner.

Thanks,
Dscho

* Re: [PATCH v8 08/14] merge-resolve: rewrite in C
  2022-08-16 12:17                   ` Johannes Schindelin
@ 2022-08-16 14:02                     ` Phillip Wood
  0 siblings, 0 replies; 221+ messages in thread
From: Phillip Wood @ 2022-08-16 14:02 UTC (permalink / raw)
  To: Johannes Schindelin, phillip.wood; +Cc: Alban Gruin, git, Junio C Hamano



On 16/08/2022 13:17, Johannes Schindelin wrote:
> Hi Phillip,
> 
> On Wed, 10 Aug 2022, Phillip Wood wrote:
> 
>> On 09/08/2022 19:54, Alban Gruin wrote:
>>> This rewrites `git merge-resolve' from shell to C.  As for `git
>>> merge-one-file', this port is not completely straightforward and removes
>>> calls to external processes to avoid reading and writing the index over
>>> and over again.
>>>
>>>    - The call to `update-index -q --refresh' is replaced by a call to
>>>      refresh_index().
>>>
>>>    - The call to `read-tree' is replaced by a call to unpack_trees() (and
>>>      all the setup needed).
>>>
>>>    - The call to `write-tree' is replaced by a call to
>>>      cache_tree_update().  This call is wrapped in a new function,
>>>      write_tree().  It is made to mimic write_index_as_tree() with
>>>      WRITE_TREE_SILENT flag, but without locking the index; this is taken
>>>      care of directly in merge_strategies_resolve().
>>>
>>>    - The call to `diff-index ...' is replaced by a call to
>>>      repo_index_has_changes().
>>>
>>>    - The call to `merge-index', needed to invoke `git merge-one-file', is
>>>      replaced by a call to the new merge_all_index() function.
>>>
>>> The index is read in cmd_merge_resolve(), and is written back by
>>> merge_strategies_resolve().  This is to accommodate future applications:
>>> in `git-merge', the index has already been read when the merge strategy
>>> is called, so it would be redundant to read it again when the builtin
>>> will be able to use merge_strategies_resolve() directly.
>>>
>>> The parameters of merge_strategies_resolve() will be surprising at first
>>> glance: why use a commit list for `bases' and `remote', where we could
>>> use an oid array, and a pointer to an oid?  Because, in a later commit,
>>> try_merge_strategy() will be able to call merge_strategies_resolve()
>>> directly, and it already uses a commit list for `bases' (`common') and
>>> `remote' (`remoteheads'), and a string for `head_arg'.  To reduce
>>> frictions later, merge_strategies_resolve() takes the same types of
>>> parameters.
>>
>> git-merge-resolve will happily merge three trees, unfortunately using
>> lists of commits will break that.
> 
> But isn't `merge-resolve` specifically implemented as a merge strategy? I
> do not see any contract in Git's documentation that commits to supporting
> direct calls to the implementation detail that is `git merge-resolve`:
> 
> 	$ man git-merge-resolve
> 	No manual entry for git-merge-resolve

I've certainly got scripts that call "git merge-recursive" with a
mixture of commits and trees (it's kind of doing a cherry-pick), so it
wouldn't surprise me if someone was doing something weird with
merge-resolve.

>>> +int cmd_merge_resolve(int argc, const char **argv, const char *prefix)
>>> +{
>>> +	int i, sep_seen = 0;
>>> +	const char *head = NULL;
>>> +	struct commit_list *bases = NULL, *remote = NULL;
>>> +	struct commit_list **next_base = &bases;
>>> +	struct repository *r = the_repository;
>>> +
>>> +	if (argc < 5)
>>> +		usage(builtin_merge_resolve_usage);
>>
>> I think it would be better to call parse_options() and then check argc. That
>> would give better error messages for unknown options and supports '-h' for
>> free.
> 
> Again, we are talking about a merge strategy, a program that is not meant
> to be called directly by the user. Why should we complicate the code by
> using the `parse_options` machinery?

I thought it would simplify the implementation of '-h' below. However as 
the script does not support '-h' we should perhaps drop support for that 
and the usage() call if we want a strictly equivalent conversion.

>> I think we also need to call git_config(). I see that read-tree respects
>> submodule.recurse so I think we need the same here. I suspect we should
>> also be reading the merge config to respect merge.conflictStyle.
> 
> Valid concerns. Extra brownie points if you can provide a simple test case
> that demonstrates the current behavior.

I'll add it to my todo list.

>>> +
>>> +	if (repo_index_has_changes(r, head_tree, &sb)) {
>>> +		error(_("Your local changes to the following files "
>>> +			"would be overwritten by merge:\n  %s"),
>>> +		      sb.buf);
>>
>> This matches the script but I wonder why that did not check for unstaged
>> changes.
> 
> Any deviations from the scripted behavior should be done on top of this
> patch series, unless the deviations make the conversion substantially
> cleaner.

I agree. Having thought some more I suspect it is relying on 
unpack_trees() to error out if there are unstaged changes.

Best Wishes

Phillip

> Thanks,
> Dscho

* Re: [PATCH v8 08/14] merge-resolve: rewrite in C
  2022-08-16 12:09                     ` Johannes Schindelin
@ 2022-08-16 19:36                       ` Junio C Hamano
  2022-08-17  9:42                         ` Johannes Schindelin
  0 siblings, 1 reply; 221+ messages in thread
From: Junio C Hamano @ 2022-08-16 19:36 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Phillip Wood, Alban Gruin, git

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> I'm all in favor of adding such a good example there, but there is no
> reason to hold back `git merge-resolve` from being implemented in C.

You did not address the primary point, i.e. why the particular
change is a bad one.  Sure, you lost a scripted porcelain or two
that are not used much, but in exchange for what?  That is _the_
issue and you skirt around it.

The series makes us lose all strategies that are actively tested
that are spawned as a subprocess, which is the way all third-party
strategies will be used.  After this, we have less test coverage of
the codepaths we care about, which is *not* a scripted "resolve"
strategy, but the code that runs third-party strategies as
externals.

* Re: [PATCH v8 03/14] merge-index: libify merge_one_path() and merge_all()
  2022-08-09 18:54               ` [PATCH v8 03/14] merge-index: libify merge_one_path() and merge_all() Alban Gruin
@ 2022-08-17  2:10                 ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 221+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-08-17  2:10 UTC (permalink / raw)
  To: Alban Gruin; +Cc: git, Junio C Hamano, Phillip Wood, Johannes Schindelin


On Tue, Aug 09 2022, Alban Gruin wrote:

[Re-arranged]

Rather than changing this behavior:

> diff --git a/t/t7607-merge-state.sh b/t/t7607-merge-state.sh
> index 89a62ac53b..96befa5b80 100755
> --- a/t/t7607-merge-state.sh
> +++ b/t/t7607-merge-state.sh
> @@ -20,7 +20,7 @@ test_expect_success 'Ensure we restore original state if no merge strategy handl
>  	# just hit conflicts, it completely fails and says that it cannot
>  	# handle this type of merge.
>  	test_expect_code 2 git merge branch2 branch3 >output 2>&1 &&
> -	grep "fatal: merge program failed" output &&
> +	grep "error: merge program failed" output &&
>  	grep "Should not be doing an octopus" output &&

In this (or some of it?):

> [...]
>  	# Make sure we did not leave stray changes around when no appropriate
> [...]
> -	if (run_command_v_opt(arguments, 0)) {
> -		if (one_shot)
> -			err++;
> -		else {
> -			if (!quiet)
> -				die("merge program failed");
> -			exit(1);
> [...]
> +			if (!quiet && !oneshot)
> +				error(_("merge program failed"));
> +			return 1;
> [...]
> +		if (err && !oneshot) {
> +			if (!quiet)
> +				error(_("merge program failed"));
> +			return 1;
> +		}
> +	}
> +
> +	if (err && !quiet)
> +		error(_("merge program failed"));

Should we not be using die_message() here instead?

* Re: [PATCH v8 08/14] merge-resolve: rewrite in C
  2022-08-09 18:54               ` [PATCH v8 08/14] merge-resolve: " Alban Gruin
  2022-08-10 15:03                 ` Phillip Wood
@ 2022-08-17  2:16                 ` Ævar Arnfjörð Bjarmason
  2022-08-18 14:43                 ` Ævar Arnfjörð Bjarmason
  2 siblings, 0 replies; 221+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-08-17  2:16 UTC (permalink / raw)
  To: Alban Gruin; +Cc: git, Junio C Hamano, Phillip Wood, Johannes Schindelin


On Tue, Aug 09 2022, Alban Gruin wrote:

I think the rest of this series has been careful to keep output as-is
(in some cases arguably to a fault, e.g. carrying forward two
"gettextln" invocations as two puts(), which we should almost definitely
fold into one string).

But here:

> -    gettextln "Error: Your local changes to the following files would be overwritten by merge"
> [...]
> +		error(_("Your local changes to the following files "
> +			"would be overwritten by merge:\n  %s"),
> +		      sb.buf);

We introduce a subtle behavior change: we used to say "Error:", but now
it's "error:". Also, since a1fd2cf8cd6 (i18n: mark message helpers prefix
for translation, 2022-06-21) the interaction with how "error: " is
translated is different, but let's leave that aside.

Now, I think the change probably makes sense & isn't risky, but perhaps
note it in a commit message, or even precede it with a commit to
s/Error/error/g in the *.sh code before the migration?
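
To make the difference concrete, here is a small sketch, assuming some
caller (a wrapper script or a test) that matches on the lowercase
prefix.  The helper has_lowercase_prefix is made up for illustration,
and the new message is abbreviated to its first line:

```shell
#!/bin/sh
# Message as printed by the shell script (capital "E") ...
old_msg='Error: Your local changes to the following files would be overwritten by merge'
# ... and the first line as printed through error() in the C port.
new_msg='error: Your local changes to the following files would be overwritten by merge:'

# Hypothetical check that a wrapper script or test might perform.
has_lowercase_prefix () {
	printf '%s\n' "$1" | grep -q '^error: '
}
```

With these definitions the check succeeds for $new_msg and fails for
$old_msg, which is the kind of thing a case-sensitive grep in a test or
in a user's script would notice.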

Also, if we *are* changing it while we're at it let's also s/Your/your/,
no? I see some other uses in other pre-existing merge*.c files, so maybe
the unusual casing is magical.

* Re: [PATCH v8 08/14] merge-resolve: rewrite in C
  2022-08-16 19:36                       ` Junio C Hamano
@ 2022-08-17  9:42                         ` Johannes Schindelin
  2022-08-17 19:06                           ` Elijah Newren
  2022-08-17 19:12                           ` Junio C Hamano
  0 siblings, 2 replies; 221+ messages in thread
From: Johannes Schindelin @ 2022-08-17  9:42 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Phillip Wood, Alban Gruin, git

Hi Junio,

On Tue, 16 Aug 2022, Junio C Hamano wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
>
> > I'm all in favor of adding such a good example there, but there is no
> > reason to hold back `git merge-resolve` from being implemented in C.
>
> You did not address the primary point, i.e. why the particular
> change is a bad one.  Sure, you lost a scripted porcelain or two
> that are not used much, but in exchange for what?  That is _the_
> issue and you skirt around it.

In exchange for what I mentioned already in
https://lore.kernel.org/git/qs23r0n8-9r24-6095-3n9n-9131s69974p1@tzk.qr/,
i.e. in the part you deleted from the quoted mail:

	We reduce Git's reliance on POSIX shell scripting, we reduce the
	number of programming languages contributors need to be familiar
	with, we open up to code coverage/static analysis tools that
	handle C but not shell scripts, just to name a few.

To reiterate why reducing the reliance on POSIX shell scripting is a good
thing:

- we pay a steep price in the form of performance issues (you will recall
  that merely rewriting the `rebase -i` engine in C and nothing else
  improved the overall run time of p3404 5x on Windows, 4x on macOS and
  still 3.5x on Linux, see
  https://lore.kernel.org/git/cover.1483370556.git.johannes.schindelin@gmx.de/)

  Yes, Linux sees such an incredible performance boost. Surprising, right?

- on Windows, even aside from the performance problems (which I deem
  reason enough on their own to aim for Git being implemented purely in
  C), users run into issues where anti-malware simply blocks shell
  scripts, sometimes even quarantines entire parts of Git for Windows.

- have you ever attempted to debug a Git invocation that involves spawning
  a shell script that in turn spawns the failing Git command, using `gdb`?
  I have. It ain't pretty. And you know that there are easier ways to
  abuse and deter new contributors than to ask them to do the same. In
  particular when large amounts of data have to be passed between those
  processes, typically via `stdio`.

- show me the equivalent of CodeQL/Coverity for POSIX shell scripting? ;-)

- portability issues dictate that we're not just using your grandfather's
  POSIX shell scripting, but that we limit it to a subset that is opaque
  to developers unfamiliar with the Git project.

- as a consequence, our shell scripts are highly opinionated, often using
  unintuitive idioms such as `&&` chains instead of `set -e`, which makes
  them unsuitable as examples of how to script Git for regular users.

- a decreasing number of software developers is familiar with the
  intricacies of that language, leaving us with tech debt.
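
As background on the `&&` versus `set -e` point in the list above, the
following sketch shows one well-known way the two differ.  It is offered
as an illustration of the shell semantics only, not as the project's
stated rationale; the function names are invented for the example.

```shell
#!/bin/sh
# "set -e" is ignored while a function runs on the left-hand side of "||",
# so the failing "false" does not stop the function.
step_with_set_e () {
	set -e
	false
	echo "reached despite failure"
}

# An explicit "&&" chain stops at the first failure in any context.
step_with_chain () {
	false &&
	echo "never printed"
}

# Each call runs in a command substitution, so "set -e" does not leak out.
out_e=$(step_with_set_e || echo "handled")
out_chain=$(step_with_chain || echo "handled")
```

After running this, $out_e holds the text echoed after the failing
command, while $out_chain holds "handled".  Somebody reading a script
written in either style has to know which of the two behaviors applies.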

In short, there is not a single shred of doubt in my mind that avoiding
shell scripted parts in Git is a really good goal to have for this
project.

> The series makes us lose all strategies that are actively tested
> that are spawned as a subprocess, which is the way all third-party
> strategies will be used.

Then have that even-simpler-than `git-merge-resolve.sh` example be tested
as part of the test suite. That's what the test suite is for.

> After this, we have less test coverage of the codepaths we care about,
> which is *not* a scripted "resolve" strategy, but the code that runs
> third-party strategies as externals.

It is better to leave the responsibility for test coverage to the test
suite, and to avoid shipping the corresponding support code to users.

tl;dr your concerns are easy to address, without having to incur the price
of keeping parts of Git implemented in shell.

Ciao,
Dscho

* Re: [PATCH v8 08/14] merge-resolve: rewrite in C
  2022-08-17  9:42                         ` Johannes Schindelin
@ 2022-08-17 19:06                           ` Elijah Newren
  2022-08-17 19:18                             ` Junio C Hamano
  2022-08-17 19:12                           ` Junio C Hamano
  1 sibling, 1 reply; 221+ messages in thread
From: Elijah Newren @ 2022-08-17 19:06 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Junio C Hamano, Phillip Wood, Alban Gruin, Git Mailing List

Hi Dscho,

I share some of Junio's concerns, and feel your response is addressing
a tangent but not the actual issues.  Perhaps I can try to explain why
from a slightly different perspective...

On Wed, Aug 17, 2022 at 2:51 AM Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
>
> Hi Junio,
>
> On Tue, 16 Aug 2022, Junio C Hamano wrote:
>
> > Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> >
> > > I'm all in favor of adding such a good example there, but there is no
> > > reason to hold back `git merge-resolve` from being implemented in C.
> >
> > You did not address the primary point, i.e. why the particular
> > change is a bad one.  Sure, you lost a scripted porcelain or two
> > that are not used much, but in exchange for what?  That is _the_
> > issue and you skirt around it.
>
> In exchange for what I mentioned already in
> https://lore.kernel.org/git/qs23r0n8-9r24-6095-3n9n-9131s69974p1@tzk.qr/,
> i.e. in the part you deleted from the quoted mail:
>
>         We reduce Git's reliance on POSIX shell scripting, we reduce the
>         number of programming languages contributors need to be familiar
>         with, we open up to code coverage/static analysis tools that
>         handle C but not shell scripts, just to name a few.
>
> To reiterate why reducing the reliance on POSIX shell scripting is a good
> thing:
>
> - we pay a steep price in the form of performance issues (you will recall
>   that merely rewriting the `rebase -i` engine in C and nothing else
>   improved the overall run time of p3404 5x on Windows, 4x on macOS and
>   still 3.5x on Linux, see
>   https://lore.kernel.org/git/cover.1483370556.git.johannes.schindelin@gmx.de/)
>
>   Yes, Linux sees such an incredible performance boost. Surprising, right?

Sure, scripts are slow *if* run.  Junio asked explicitly about that
"if" part, which you seem to be overlooking, and thus you are
answering a different question.

Is anyone, anywhere, ever running `-s resolve`?  Junio is doubting it,
and given that even some Git developers were unaware of its
existence[1], I have to wonder too.

[1] https://public-inbox.org/git/kl6l7d58k535.fsf@chooglen-macbookpro.roam.corp.google.com/,
look for "today I learned..."

> - on Windows, even aside from the performance problems (which I deem
>   reason enough on their own to aim for Git being implemented purely in
>   C), users run into issues where anti-malware simply blocks shell
>   scripts, sometimes even quarantines entire parts of Git for Windows.

I'm not sure I'm following.  If users do attempt to run `git
{merge,rebase,cherry-pick,revert} --strategy resolve`, then
anti-malware disables other parts of Git for Windows?  Or is the mere
presence of git-merge-resolve enough to trigger such problems?  If the
latter, then I could agree that's really problematic and worth
addressing.  If the former, we may be back to that all important "if".
But I'm not sure it's one of those two; could you clarify a bit here?

> - have you ever attempted to debug a Git invocation that involves spawning
>   a shell script that in turn spawns the failing Git command, using `gdb`?
>   I have. It ain't pretty. And you know that there are easier ways to
>   abuse and deter new contributors than to ask them to do the same. In
>   particular when large amounts of data have to be passed between those
>   processes, typically via `stdio`.

Yes, that's very painful.  It's annoyed me many times.  It's a
problem, *if* you need to debug a script.  But again, you seem to be
presuming that git-merge-resolve is in use, which dodges the very
question Junio was asking.  Is it in use?

> - show me the equivalent of CodeQL/Coverity for POSIX shell scripting? ;-)
>
> - portability issues dictate that we're not just using your grandfather's
>   POSIX shell scripting, but that we limit it to a subset that is opaque
>   to developers unfamiliar with the Git project.
>
> - as a consequence, our shell scripts are highly opinionated, often using
>   unintuitive idioms such as `&&` chains instead of `set -e`, which makes
>   them unsuitable as examples of how to script Git for regular users.
>
> - a decreasing number of software developers is familiar with the
>   intricacies of that language, leaving us with tech debt.

Yes, these are real issues for shell code that is being actively
developed and maintained.  (shellcheck might help as a
CodeQL/Coverity-like thing for shell.)

However, that doesn't really apply here.  There have literally only
been two commits to git-merge-resolve.sh in the last decade, one from
me that copied a few lines verbatim from git-merge-octopus.sh, and the
other was a single character change 5 years ago.

> In short, there is not a single shred of doubt in my mind that avoiding
> shell scripted parts in Git is a really good goal to have for this
> project.

I think it's a good goal in general, especially for anything heavily
used.  I share Junio's concern about this one in particular.  I'm not
sure this script is even being used directly, the maintenance burden
for it is essentially zero, and the script does have both educational
and testing value.

> > The series makes us lose all strategies that are actively tested
> > that are spawned as a subprocess, which is the way all third-party
> > strategies will be used.
>
> Then have that even-simpler-than `git-merge-resolve.sh` example be tested
> as part of the test suite. That's what the test suite is for.

That simpler thing being a resurrection of git-merge-ours.sh from
a00a42ae33708caa742d9e9fbf10692cfa42f032^ ?

That would test that we shell out to another strategy.  But it
wouldn't really test as many of the cases in builtin/merge.c for
dealing with external strategies.  `-s resolve` can fail on
"interesting" changes, after making changes to the working tree and
index, and builtin/merge.c is expected to handle that -- using a
simpler example would lose that important testing.  (I kinda think
it's a bug that it doesn't clean up after itself and that we made
builtin/merge.c do the cleanup, but backward compatibility suggests we
at least need some way to keep testing that we handle that.)  We would
also need to be careful about testing the "preferred" strategy when
the user asks for multiple strategies, another thing covered in our
testsuite (though using two builtins might be good enough for that).
I'd have to look over the testsuite to check and see if there are
other important properties being tested too; -s resolve has been used
in a few dozen places.

> > After this, we have less test coverage of the codepaths we care about,
> > which is *not* a scripted "resolve" strategy, but the code that runs
> > third-party strategies as externals.
>
> It is better to leave the responsibility of test coverage to the test
> suite, avoiding to ship the corresponding support code to users.
>
> tl;dr your concerns are easy to address, without having to incur the price
> of keeping parts of Git implemented in shell.

There's also another concern you tried to address in your other email;
let me quote from that email here:

> If you want to have an easy example of a custom merge strategy, then let's
> have that easy example. `git-merge-resolve.sh` ain't that example.
>
> It would be a different matter if you had commented about
> `git-merge-ours.sh`:
> https://github.com/git/git/blob/v2.17.0/contrib/examples/git-merge-ours.sh
> That _was_ a simple and easy example.

...and it was _utterly useless_ as an example.  It only checked that
the user hadn't modified the index since HEAD.  It doesn't demonstrate
anything about how to merge differing entries, since that merge
strategy specifically ignores changes made on the other side.  Since
merging differing entries is the whole point of writing a strategy, I
see no educational value in that particular script.

`git-merge-resolve.sh` may be an imperfect example, but it's certainly
far superior to that.

> I would also have understood a lament about the absence of any good
> example in https://git-scm.com/docs/git-merge#_merge_strategies to help
> users develop their own custom merge strategies.
>
> I'm all in favor of adding such a good example there, but there is no
> reason to hold back `git merge-resolve` from being implemented in C.

If someone makes a better example (which I agree could be done,
especially if it added lots of comments about what was required and
why), and ensures we keep useful test coverage (maybe using Junio's
c-resolve suggestion in another email), then my concerns about
reimplementing git-merge-resolve.sh in C go away.

If that happens, then I still think it's a useless exercise to do the
reimplementation -- unless someone can provide evidence of `-s
resolve` being in use -- but it's not a harmful exercise and wouldn't
concern me.

If the better example and mechanism to retain good test coverage
aren't provided, then I worry that reimplementing is a bunch of work
for an at best theoretical benefit, coupled with a double whammy
practical regression.

^ permalink raw reply	[flat|nested] 221+ messages in thread

* Re: [PATCH v8 08/14] merge-resolve: rewrite in C
  2022-08-17  9:42                         ` Johannes Schindelin
  2022-08-17 19:06                           ` Elijah Newren
@ 2022-08-17 19:12                           ` Junio C Hamano
  1 sibling, 0 replies; 221+ messages in thread
From: Junio C Hamano @ 2022-08-17 19:12 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Phillip Wood, Alban Gruin, git

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> To reiterate why reducing the reliance on POSIX shell scripting is a good
> thing:
>
> - we pay a steep price in the form of performance issues (you will recall

Irrelevant.  Who uses resolve these days?

> - have you ever attempted to debug a Git invocation that involves spawning
>   a shell script that in turn spawns the failing Git command, using `gdb`?

Remember, I have been doing this longer than you have, so of course
I have, but I do not think it is relevant.  An external program as a
merge strategy does not have to be written in shell, but third-party
strategies can be written in anything, so some who choose to do so
may still have to.  There is no avoiding that.

What our contributors, new and old, need to do is to maintain the
codepath that spawns these third-party strategy programs working.

There were two steps I gave review messages to, and the one I had
more trouble with was actually not the [08/14] you are making a big
fuss about.  It was the "we no longer spawn resolve or octopus"
step(s).  If we really want to rewrite "resolve" in C, while I think
there are better ways to use our resources, rewriting it by itself
would not _hurt_ the project all that much, as long as we keep it an
external program.

And by "maintain the codepath working", we would want to catch silly
mistakes while "refactoring", like the one we had when we changed
the underlying machinery to spawn hooks in a recent release, without
caring (I wouldn't say "without knowing"; those who did and reviewed
the change including me didn't even think about how the standard I/O
streams are seen by hook scripts and how they react to them).  Just
like tests around small toy sample hooks did not catch the
regression, "a small toy sample that is only spawned in a test piece
or two to pretend to be a merge strategy program" would not be a
good substitute for running something real.



^ permalink raw reply	[flat|nested] 221+ messages in thread

* Re: [PATCH v8 08/14] merge-resolve: rewrite in C
  2022-08-17 19:06                           ` Elijah Newren
@ 2022-08-17 19:18                             ` Junio C Hamano
  2022-08-18 14:24                               ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 221+ messages in thread
From: Junio C Hamano @ 2022-08-17 19:18 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Johannes Schindelin, Phillip Wood, Alban Gruin, Git Mailing List

Elijah Newren <newren@gmail.com> writes:

> There's also another concern you tried to address in your other email;
> let me quote from that email here:
>
>> If you want to have an easy example of a custom merge strategy, then let's
>> have that easy example. `git-merge-resolve.sh` ain't that example.
>>
>> It would be a different matter if you had commented about
>> `git-merge-ours.sh`:
>> https://github.com/git/git/blob/v2.17.0/contrib/examples/git-merge-ours.sh
>> That _was_ a simple and easy example.
>
> ...and it was _utterly useless_ as an example.  It only checked that
> the user hadn't modified the index since HEAD.  It doesn't demonstrate
> anything about how to merge differing entries, since that merge
> strategy specifically ignores changes made on the other side.  Since
> merging differing entries is the whole point of writing a strategy, I
> see no educational value in that particular script.
>
> `git-merge-resolve.sh` may be an imperfect example, but it's certainly
> far superior to that.
> ...
> If someone makes a better example (which I agree could be done,
> especially if it added lots of comments about what was required and
> why), and ensures we keep useful test coverage (maybe using Junio's
> c-resolve suggestion in another email), then my concerns about
> reimplementing git-merge-resolve.sh in C go away.
>
> If that happens, then I still think it's a useless exercise to do the
> reimplementation -- unless someone can provide evidence of `-s
> resolve` being in use -- but it's not a harmful exercise and wouldn't
> concern me.
>
> If the better example and mechanism to retain good test coverage
> aren't provided, then I worry that reimplementing is a bunch of work
> for an at best theoretical benefit, coupled with a double whammy
> practical regression.

Ah, you said many things I wanted to say already.  Thanks.

^ permalink raw reply	[flat|nested] 221+ messages in thread

* Re: [PATCH v8 08/14] merge-resolve: rewrite in C
  2022-08-17 19:18                             ` Junio C Hamano
@ 2022-08-18 14:24                               ` Ævar Arnfjörð Bjarmason
  2022-08-18 17:32                                 ` Junio C Hamano
  2022-08-19  1:43                                 ` Elijah Newren
  0 siblings, 2 replies; 221+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-08-18 14:24 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Elijah Newren, Johannes Schindelin, Phillip Wood, Alban Gruin,
	Git Mailing List


On Wed, Aug 17 2022, Junio C Hamano wrote:

> Elijah Newren <newren@gmail.com> writes:
>
>> There's also another concern you tried to address in your other email;
>> let me quote from that email here:
>>
>>> If you want to have an easy example of a custom merge strategy, then let's
>>> have that easy example. `git-merge-resolve.sh` ain't that example.
>>>
>>> It would be a different matter if you had commented about
>>> `git-merge-ours.sh`:
>>> https://github.com/git/git/blob/v2.17.0/contrib/examples/git-merge-ours.sh
>>> That _was_ a simple and easy example.
>>
>> ...and it was _utterly useless_ as an example.  It only checked that
>> the user hadn't modified the index since HEAD.  It doesn't demonstrate
>> anything about how to merge differing entries, since that merge
>> strategy specifically ignores changes made on the other side.  Since
>> merging differing entries is the whole point of writing a strategy, I
>> see no educational value in that particular script.
>>
>> `git-merge-resolve.sh` may be an imperfect example, but it's certainly
>> far superior to that.
>> ...
>> If someone makes a better example (which I agree could be done,
>> especially if it added lots of comments about what was required and
>> why), and ensures we keep useful test coverage (maybe using Junio's
>> c-resolve suggestion in another email), then my concerns about
>> reimplementing git-merge-resolve.sh in C go away.
>>
>> If that happens, then I still think it's a useless exercise to do the
>> reimplementation -- unless someone can provide evidence of `-s
>> resolve` being in use -- but it's not a harmful exercise and wouldn't
>> concern me.
>>
>> If the better example and mechanism to retain good test coverage
>> aren't provided, then I worry that reimplementing is a bunch of work
>> for an at best theoretical benefit, coupled with a double whammy
>> practical regression.
>
> Ah, you said many things I wanted to say already.  Thanks.

I may have missed something in this thread, but wouldn't an acceptable
way to please everyone here be to:

 1. Have git's behavior be that of the end of this series...
 2. Add a GIT_TEST_* mode where we'll optionally invoke these "built-in"
    merge strategies as commands, i.e. have them fall back to
    "try_merge_command()".

So something like this on top of this series (assume my SOB etc. if this
is acceptable). I only tested this locally, but it seems to do the right
thing for me:

diff --git a/ci/run-build-and-tests.sh b/ci/run-build-and-tests.sh
index 8ebff425967..9d0f68b8147 100755
--- a/ci/run-build-and-tests.sh
+++ b/ci/run-build-and-tests.sh
@@ -30,6 +30,7 @@ linux-TEST-vars)
 	export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=master
 	export GIT_TEST_WRITE_REV_INDEX=1
 	export GIT_TEST_CHECKOUT_WORKERS=2
+	export GIT_TEST_MERGE_COMMANDS=true
 	;;
 linux-clang)
 	export GIT_TEST_DEFAULT_HASH=sha1
diff --git a/sequencer.c b/sequencer.c
index 00a36205848..91d651f9b12 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -2309,6 +2309,7 @@ static int do_pick_commit(struct repository *r,
 	} else {
 		struct commit_list *common = NULL;
 		struct commit_list *remotes = NULL;
+		const int test_commands = git_env_bool("GIT_TEST_MERGE_COMMANDS", 0);
 
 		res = write_message(msgbuf.buf, msgbuf.len,
 				    git_path_merge_msg(r), 0);
@@ -2316,10 +2317,10 @@ static int do_pick_commit(struct repository *r,
 		commit_list_insert(base, &common);
 		commit_list_insert(next, &remotes);
 
-		if (!strcmp(opts->strategy, "resolve")) {
+		if (!test_commands && !strcmp(opts->strategy, "resolve")) {
 			repo_read_index(r);
 			res |= merge_strategies_resolve(r, common, oid_to_hex(&head), remotes);
-		} else if (!strcmp(opts->strategy, "octopus")) {
+		} else if (!test_commands && !strcmp(opts->strategy, "octopus")) {
 			repo_read_index(r);
 			res |= merge_strategies_octopus(r, common, oid_to_hex(&head), remotes);
 		} else {

^ permalink raw reply related	[flat|nested] 221+ messages in thread

* Re: [PATCH v8 08/14] merge-resolve: rewrite in C
  2022-08-09 18:54               ` [PATCH v8 08/14] merge-resolve: " Alban Gruin
  2022-08-10 15:03                 ` Phillip Wood
  2022-08-17  2:16                 ` Ævar Arnfjörð Bjarmason
@ 2022-08-18 14:43                 ` Ævar Arnfjörð Bjarmason
  2 siblings, 0 replies; 221+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-08-18 14:43 UTC (permalink / raw)
  To: Alban Gruin; +Cc: git, Junio C Hamano, Phillip Wood, Johannes Schindelin


On Tue, Aug 09 2022, Alban Gruin wrote:

> +int merge_strategies_resolve(struct repository *r,
> +			     struct commit_list *bases, const char *head_arg,
> +			     struct commit_list *remote);

It would be very nice to have this prototype declared as a:

	typedef int (*merge_strategy_fn_t)(...);

Or whatever, so that you can refer to it when you later use this in
12/14. Then the end state of this series could have this on top:
	
	diff --git a/merge-strategies.h b/merge-strategies.h
	index 8de2249ee6b..79b828105ba 100644
	--- a/merge-strategies.h
	+++ b/merge-strategies.h
	@@ -29,6 +29,9 @@ int merge_index_path(struct index_state *istate, int oneshot, int quiet,
	 int merge_all_index(struct index_state *istate, int oneshot, int quiet,
	 		    merge_fn fn, void *data);
	 
	+typedef int (*merge_strategy_fn_t)(struct repository *r,
	+			     struct commit_list *bases, const char *head_arg,
	+			     struct commit_list *remote);
	 int merge_strategies_resolve(struct repository *r,
	 			     struct commit_list *bases, const char *head_arg,
	 			     struct commit_list *remote);
	diff --git a/sequencer.c b/sequencer.c
	index 00a36205848..d5ef12dda27 100644
	--- a/sequencer.c
	+++ b/sequencer.c
	@@ -2309,6 +2309,7 @@ static int do_pick_commit(struct repository *r,
	 	} else {
	 		struct commit_list *common = NULL;
	 		struct commit_list *remotes = NULL;
	+		merge_strategy_fn_t fn = NULL;
	 
	 		res = write_message(msgbuf.buf, msgbuf.len,
	 				    git_path_merge_msg(r), 0);
	@@ -2316,12 +2317,14 @@ static int do_pick_commit(struct repository *r,
	 		commit_list_insert(base, &common);
	 		commit_list_insert(next, &remotes);
	 
	-		if (!strcmp(opts->strategy, "resolve")) {
	-			repo_read_index(r);
	-			res |= merge_strategies_resolve(r, common, oid_to_hex(&head), remotes);
	-		} else if (!strcmp(opts->strategy, "octopus")) {
	+		if (!strcmp(opts->strategy, "resolve"))
	+			fn = merge_strategies_resolve;
	+		else if (!strcmp(opts->strategy, "octopus"))
	+			fn = merge_strategies_octopus;
	+
	+		if (fn) {
	 			repo_read_index(r);
	-			res |= merge_strategies_octopus(r, common, oid_to_hex(&head), remotes);
	+			res |= fn(r, common, oid_to_hex(&head), remotes);
	 		} else {
	 			res |= try_merge_command(r, opts->strategy,
	 						 opts->xopts_nr, (const char **)opts->xopts,

We could replace that if/else if with a static array, and loop over it
to find the "fn" (if any), but I thought it wasn't worth it just for
this.

This would also make my suggestion on top at
https://lore.kernel.org/git/220818.868rnlaa0h.gmgdl@evledraar.gmail.com/
nicer. I.e. we could just make that:

	if (git_env_bool("GIT_TEST_MERGE_COMMANDS", 0))
		fn = NULL;

And not need to add the "are we in the test mode" to the if/else if
branch for all of the internal strategies.
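
For illustration only, the static-array variant mentioned above,
combined with the test-mode override, might look roughly like the
sketch below. It is standalone rather than a patch against the series:
demo_resolve() and demo_octopus() are stubs standing in for
merge_strategies_resolve() and merge_strategies_octopus(),
find_builtin_strategy() is a made-up helper name, and the
"test_commands" flag stands in for the result of
git_env_bool("GIT_TEST_MERGE_COMMANDS", 0).

```c
#include <stddef.h>
#include <string.h>

/* Opaque stand-ins for git's real types; only the shape matters here. */
struct repository;
struct commit_list;

typedef int (*merge_strategy_fn_t)(struct repository *r,
				   struct commit_list *bases,
				   const char *head_arg,
				   struct commit_list *remote);

/* Stub standing in for merge_strategies_resolve(). */
int demo_resolve(struct repository *r, struct commit_list *bases,
		 const char *head_arg, struct commit_list *remote)
{
	(void)r; (void)bases; (void)head_arg; (void)remote;
	return 0;
}

/* Stub standing in for merge_strategies_octopus(). */
int demo_octopus(struct repository *r, struct commit_list *bases,
		 const char *head_arg, struct commit_list *remote)
{
	(void)r; (void)bases; (void)head_arg; (void)remote;
	return 0;
}

/* One row per strategy that has an in-process implementation. */
static const struct {
	const char *name;
	merge_strategy_fn_t fn;
} builtin_strategies[] = {
	{ "resolve", demo_resolve },
	{ "octopus", demo_octopus },
};

/*
 * Hypothetical helper: return the in-process implementation for
 * "name", or NULL when the caller should fall back to spawning the
 * strategy as a command. A non-zero "test_commands" forces the
 * fallback for every strategy.
 */
merge_strategy_fn_t find_builtin_strategy(const char *name, int test_commands)
{
	size_t i;
	size_t nr = sizeof(builtin_strategies) / sizeof(builtin_strategies[0]);

	if (test_commands)
		return NULL;
	for (i = 0; i < nr; i++)
		if (!strcmp(builtin_strategies[i].name, name))
			return builtin_strategies[i].fn;
	return NULL;
}
```

With something of this shape the caller only has to check whether the
returned "fn" is NULL, and adding another internal strategy means
adding one row to the table.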

^ permalink raw reply	[flat|nested] 221+ messages in thread

* Re: [PATCH v8 08/14] merge-resolve: rewrite in C
  2022-08-18 14:24                               ` Ævar Arnfjörð Bjarmason
@ 2022-08-18 17:32                                 ` Junio C Hamano
  2022-08-19  1:43                                 ` Elijah Newren
  1 sibling, 0 replies; 221+ messages in thread
From: Junio C Hamano @ 2022-08-18 17:32 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Elijah Newren, Johannes Schindelin, Phillip Wood, Alban Gruin,
	Git Mailing List

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> I may have missed something in this thread, but wouldn't an acceptable
> way to please everyone here be to:

Why pile on MORE cruft on top of a needless rewrite into an internal
call, when it is cleaner to just get rid of the part that makes an
internal call?

^ permalink raw reply	[flat|nested] 221+ messages in thread

* Re: [PATCH v8 08/14] merge-resolve: rewrite in C
  2022-08-18 14:24                               ` Ævar Arnfjörð Bjarmason
  2022-08-18 17:32                                 ` Junio C Hamano
@ 2022-08-19  1:43                                 ` Elijah Newren
  2022-08-19  2:45                                   ` Ævar Arnfjörð Bjarmason
  1 sibling, 1 reply; 221+ messages in thread
From: Elijah Newren @ 2022-08-19  1:43 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Junio C Hamano, Johannes Schindelin, Phillip Wood, Alban Gruin,
	Git Mailing List

On Thu, Aug 18, 2022 at 7:42 AM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
> On Wed, Aug 17 2022, Junio C Hamano wrote:
>
> > Elijah Newren <newren@gmail.com> writes:
> >
> >> There's also another concern you tried to address in your other email;
> >> let me quote from that email here:
> >>
> >>> If you want to have an easy example of a custom merge strategy, then let's
> >>> have that easy example. `git-merge-resolve.sh` ain't that example.
> >>>
> >>> It would be a different matter if you had commented about
> >>> `git-merge-ours.sh`:
> >>> https://github.com/git/git/blob/v2.17.0/contrib/examples/git-merge-ours.sh
> >>> That _was_ a simple and easy example.
> >>
> >> ...and it was _utterly useless_ as an example.  It only checked that
> >> the user hadn't modified the index since HEAD.  It doesn't demonstrate
> >> anything about how to merge differing entries, since that merge
> >> strategy specifically ignores changes made on the other side.  Since
> >> merging differing entries is the whole point of writing a strategy, I
> >> see no educational value in that particular script.
> >>
> >> `git-merge-resolve.sh` may be an imperfect example, but it's certainly
> >> far superior to that.
> >> ...
> >> If someone makes a better example (which I agree could be done,
> >> especially if it added lots of comments about what was required and
> >> why), and ensures we keep useful test coverage (maybe using Junio's
> >> c-resolve suggestion in another email), then my concerns about
> >> reimplementing git-merge-resolve.sh in C go away.
> >>
> >> If that happens, then I still think it's a useless exercise to do the
> >> reimplementation -- unless someone can provide evidence of `-s
> >> resolve` being in use -- but it's not a harmful exercise and wouldn't
> >> concern me.
> >>
> >> If the better example and mechanism to retain good test coverage
> >> aren't provided, then I worry that reimplementing is a bunch of work
> >> for an at best theoretical benefit, coupled with a double whammy
> >> practical regression.
> >
> > Ah, you said many things I wanted to say already.  Thanks.
>
> I may have missed something in this thread, but wouldn't an acceptable
> way to please everyone here be to:
>
>  1. Have git's behavior be that of the end of this series...
>  2. Add a GIT_TEST_* mode where we'll optionally invoke these "built-in"
>     merge strategies as commands, i.e. have them fall back to
>     "try_merge_command()".

In the portion of the email you quoted and responded to, most of the
text was talking about how git-merge-resolve.sh serves an important
educational purpose, yet you've only tried to address the testing
issue.  I think both are important.  The easiest way to fix the
educational shortcoming of this series is to reverse the deleting of
git-merge-resolve.sh, and restore the building and distribution of
git-merge-resolve from that script.  Unfortunately, that generates a
collision between both the script and the builtin being used to build
the same file (namely, git-merge-resolve)...which is yet another
reason that the easiest solution available here is to just not rewrite
this script in C at all.

There are certainly other possible solutions to the educational issue,
and might not even be too hard, but we'd need someone to implement one
before I'd agree we found an "acceptable way to please everyone".  :-)

> So something like this on top of this series (assume my SOB etc. if this
> is acceptable). I only tested this locally, but it seems to do the right
> thing for me:
<snip patch>

How did you test?  I'm a bit confused...unless I'm misreading
something, it appears to me that ci/lib.sh sets SKIP_DASHED_BUILT_INS
unconditionally which would probably cause your proposal to break.

^ permalink raw reply	[flat|nested] 221+ messages in thread

* Re: [PATCH v8 08/14] merge-resolve: rewrite in C
  2022-08-19  1:43                                 ` Elijah Newren
@ 2022-08-19  2:45                                   ` Ævar Arnfjörð Bjarmason
  2022-08-19  4:27                                     ` Elijah Newren
  0 siblings, 1 reply; 221+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-08-19  2:45 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Junio C Hamano, Johannes Schindelin, Phillip Wood, Alban Gruin,
	Git Mailing List


On Thu, Aug 18 2022, Elijah Newren wrote:

> On Thu, Aug 18, 2022 at 7:42 AM Ævar Arnfjörð Bjarmason
> <avarab@gmail.com> wrote:
>>
>> On Wed, Aug 17 2022, Junio C Hamano wrote:
>>
>> > Elijah Newren <newren@gmail.com> writes:
>> >
>> >> There's also another concern you tried to address in your other email;
>> >> let me quote from that email here:
>> >>
>> >>> If you want to have an easy example of a custom merge strategy, then let's
>> >>> have that easy example. `git-merge-resolve.sh` ain't that example.
>> >>>
>> >>> It would be a different matter if you had commented about
>> >>> `git-merge-ours.sh`:
>> >>> https://github.com/git/git/blob/v2.17.0/contrib/examples/git-merge-ours.sh
>> >>> That _was_ a simple and easy example.
>> >>
>> >> ...and it was _utterly useless_ as an example.  It only checked that
>> >> the user hadn't modified the index since HEAD.  It doesn't demonstrate
>> >> anything about how to merge differing entries, since that merge
>> >> strategy specifically ignores changes made on the other side.  Since
>> >> merging differing entries is the whole point of writing a strategy, I
>> >> see no educational value in that particular script.
>> >>
>> >> `git-merge-resolve.sh` may be an imperfect example, but it's certainly
>> >> far superior to that.
>> >> ...
>> >> If someone makes a better example (which I agree could be done,
>> >> especially if it added lots of comments about what was required and
>> >> why), and ensures we keep useful test coverage (maybe using Junio's
>> >> c-resolve suggestion in another email), then my concerns about
>> >> reimplementing git-merge-resolve.sh in C go away.
>> >>
>> >> If that happens, then I still think it's a useless exercise to do the
>> >> reimplementation -- unless someone can provide evidence of `-s
>> >> resolve` being in use -- but it's not a harmful exercise and wouldn't
>> >> concern me.
>> >>
>> >> If the better example and mechanism to retain good test coverage
>> >> aren't provided, then I worry that reimplementing is a bunch of work
>> >> for an at best theoretical benefit, coupled with a double whammy
>> >> practical regression.
>> >
>> > Ah, you said many things I wanted to say already.  Thanks.
>>
>> I may have missed something in this thread, but wouldn't an acceptable
>> way to please everyone here be to:
>>
>>  1. Have git's behavior be that of the end of this series...
>>  2. Add a GIT_TEST_* mode where we'll optionally invoke these "built-in"
>>     merge strategies as commands, i.e. have them fall back to
>>     "try_merge_command()".
>
> In the portion of the email you quoted and responded to, most of the
> text was talking about how git-merge-resolve.sh serves an important
> educational purpose, yet you've only tried to address the testing
> issue.  I think both are important.

*Nod*, I meant (but should have said) "on the topic of the test
 coverage"...

> The easiest way to fix the
> educational shortcoming of this series is to reverse the deleting of
> git-merge-resolve.sh, and restore the building and distribution of
> git-merge-resolve from that script.  Unfortunately, that generates a
> collision between both the script and the builtin being used to build
> the same file (namely, git-merge-resolve)...

I'd think if we were shipping it as an example we could give it a
different name, or not install it as an executable, but in the "shared"
part (along with the README etc.).

Or keep it in-tree in contrib, but we did try that sort of thing before
with 49eb8d39c78 (Remove contrib/examples/*, 2018-03-25) :)

I think the best way forward is to just note in the documentation some
examples of how to write a merge driver, either by linking to an older
version of the script, or quoting it inline.

> which is yet another
> reason that the easiest solution available here is to just not rewrite
> this script in C at all.

I think there are bigger benefits to moving more things to C & built-ins,
so I'd prefer to see some version of this where what we do by default is
to call this C code (or similar), and not as a sub-process.

> There are certainly other possible solutions to the educational issue,
> and might not even be too hard, but we'd need someone to implement one
> before I'd agree we found an "acceptable way to please everyone".  :-)

*nod*

>> So something like this on top of this series (assume my SOB etc. if this
>> is acceptable). I only tested this locally, but it seems to do the right
>> thing for me:
> <snip patch>
>
> How did you test?  I'm a bit confused...unless I'm misreading
> something, it appears to me that ci/lib.sh sets SKIP_DASHED_BUILT_INS
> unconditionally which would probably cause your proposal to break.

Admittedly not very thoroughly, but I'm fairly sure it does the right
thing when it comes to this, and SKIP_DASHED_BUILT_INS doesn't enter
into it (and all my local builds use SKIP_DASHED_BUILT_INS=Y).

The try_merge_command() invokes merge-what-ever, and does a
run_command_v_opt(args.v, RUN_GIT_CMD). At that point we'll invoke a
"git merge-what-ever", i.e. we don't need a "git-merge-what-ever" binary
to exist.

This is what we do in general when git is invoking itself, and we'd need
to go out of our way to have it not work in this case (i.e. build it as
a stand-alone program, like git-http-fetch, and not as a built-in).
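
To make the above concrete, here is a minimal standalone sketch of the
argument vector such a spawn ends up with: "git" followed by the
dashless "merge-<strategy>" subcommand, so no dashed
"git-merge-<strategy>" file has to exist for a built-in. It does not use
git's run-command API; build_strategy_argv() is a hypothetical helper
invented for this illustration.

```c
#include <stdio.h>

/*
 * Hypothetical helper, not part of git: fill argv[] the way a
 * RUN_GIT_CMD-style invocation is described above, i.e. "git"
 * followed by "merge-<strategy>". "buf" provides the storage for the
 * subcommand name. Returns 0 on success, -1 if "buf" is too small.
 */
int build_strategy_argv(const char *strategy, char *buf, size_t buflen,
			const char *argv[3])
{
	int n = snprintf(buf, buflen, "merge-%s", strategy);

	if (n < 0 || (size_t)n >= buflen)
		return -1; /* name did not fit into the buffer */

	argv[0] = "git";	/* dispatch through the git wrapper */
	argv[1] = buf;		/* e.g. "merge-resolve" */
	argv[2] = NULL;		/* argv terminator */
	return 0;
}
```

A caller would pass the strategy name given to "-s" and hand the
resulting argv to whatever spawning mechanism it uses.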

^ permalink raw reply	[flat|nested] 221+ messages in thread

* Re: [PATCH v8 08/14] merge-resolve: rewrite in C
  2022-08-19  2:45                                   ` Ævar Arnfjörð Bjarmason
@ 2022-08-19  4:27                                     ` Elijah Newren
  0 siblings, 0 replies; 221+ messages in thread
From: Elijah Newren @ 2022-08-19  4:27 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Junio C Hamano, Johannes Schindelin, Phillip Wood, Alban Gruin,
	Git Mailing List

On Thu, Aug 18, 2022 at 7:55 PM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
> On Thu, Aug 18 2022, Elijah Newren wrote:
>
> > On Thu, Aug 18, 2022 at 7:42 AM Ævar Arnfjörð Bjarmason
> > <avarab@gmail.com> wrote:
> >>
> >> On Wed, Aug 17 2022, Junio C Hamano wrote:
> >>
> >> > Elijah Newren <newren@gmail.com> writes:
> >> >
> >> >> There's also another concern you tried to address in your other email;
> >> >> let me quote from that email here:
> >> >>
> >> >>> If you want to have an easy example of a custom merge strategy, then let's
> >> >>> have that easy example. `git-merge-resolve.sh` ain't that example.
> >> >>>
> >> >>> It would be a different matter if you had commented about
> >> >>> `git-merge-ours.sh`:
> >> >>> https://github.com/git/git/blob/v2.17.0/contrib/examples/git-merge-ours.sh
> >> >>> That _was_ a simple and easy example.
> >> >>
> >> >> ...and it was _utterly useless_ as an example.  It only checked that
> >> >> the user hadn't modified the index since HEAD.  It doesn't demonstrate
> >> >> anything about how to merge differing entries, since that merge
> >> >> strategy specifically ignores changes made on the other side.  Since
> >> >> merging differing entries is the whole point of writing a strategy, I
> >> >> see no educational value in that particular script.
> >> >>
> >> >> `git-merge-resolve.sh` may be an imperfect example, but it's certainly
> >> >> far superior to that.
> >> >> ...
> >> >> If someone makes a better example (which I agree could be done,
> >> >> especially if it added lots of comments about what was required and
> >> >> why), and ensures we keep useful test coverage (maybe using Junio's
> >> >> c-resolve suggestion in another email), then my concerns about
> >> >> reimplementing git-merge-resolve.sh in C go away.
> >> >>
> >> >> If that happens, then I still think it's a useless exercise to do the
> >> >> reimplementation -- unless someone can provide evidence of `-s
> >> >> resolve` being in use -- but it's not a harmful exercise and wouldn't
> >> >> concern me.
> >> >>
> >> >> If the better example and mechanism to retain good test coverage
> >> >> aren't provided, then I worry that reimplementing is a bunch of work
> >> >> for an at best theoretical benefit, coupled with a double whammy
> >> >> practical regression.
> >> >
> >> > Ah, you said many things I wanted to say already.  Thanks.
> >>
> >> I may have missed something in this thread, but wouldn't an acceptable
> >> way to please everyone here be to:
> >>
> >>  1. Have git's behavior be that of the end of this series...
> >>  2. Add a GIT_TEST_* mode where we'll optionally invoke these "built-in"
> >>     merge strategies as commands, i.e. have them fall back to
> >>     "try_merge_command()".
> >
> > In the portion of the email you quoted and responded to, most of the
> > text was talking about how git-merge-resolve.sh serves an important
> > educational purpose, yet you've only tried to address the testing
> > issue.  I think both are important.
>
> *Nod*, I meant (but should have said) "on the topic of the test
>  coverage"...

Ah, yes, that would have helped.  :-)

> > The easiest way to fix the
> > educational shortcoming of this series is to reverse the deleting of
> > git-merge-resolve.sh, and restore the building and distribution of
> > git-merge-resolve from that script.  Unfortunately, that generates a
> > collision between both the script and the builtin being used to build
> > the same file (namely, git-merge-resolve)...
>
> I'd think if we were shipping it as an example we could give it a
> different name, or not install it as an executable, but in the "shared"
> part (along with the README etc.).

Seems reasonable; I'm slightly partial to the name
"git-merge-strategy-demo" (though "--strategy strategy-demo" might
look weird), or perhaps just "git-merge-demo" (though that makes
people wonder what kind of demo).

> Or keep it in-tree in contrib, but we did try that sort of thing before
> with 49eb8d39c78 (Remove contrib/examples/*, 2018-03-25) :)
>
> I think the best way forward is to just note in the documentation some
> examples of how to write a merge driver, either by linking to an older
> version of the script, or quoting it inline.

Nitpick: "merge strategy", not "merge driver".

A merge driver is something defined in .gitattributes and only ever
functions on three versions of one file, never having bigger knowledge
of the wider tree.  A merge driver is thus a special purpose three-way
content merge of a single file (replacing the normal xdiff merge
stuff) tailored to a specific file type.

A merge strategy, in contrast, is given multiple commits to merge and
thus has a view of the whole tree.  A merge strategy needs to decide
whether and how to handle directory/file conflicts, differing modes,
submodule updates, recursive ancestor consolidation, file renames
(including weird cases like colliding renames or renamed differently),
directory renames, etc.  A merge strategy may well call various merge
drivers (assuming some are defined in .gitattributes) for different
paths within the tree, and/or fall back to calling (directly or
indirectly) the code in xdiff to handle the three-way merge of
individual files.

> > which is yet another
> > reason that the easiest solution available here is to just not rewrite
> > this script in C at all.
>
> I think there's bigger benefits to moving more things to C & built-ins,
> so I'd prefer to see some version of this where what we do by default is
> to call this C code (or similar), and not as a sub-process.

Yes, in general I agree there are big benefits to moving towards C &
built-ins.  I'm unconvinced any of them apply in the specific case of
merge-resolve, as noted at length earlier in this thread.

If someone wants to do it anyway, they should just make sure that (1)
testing of external merge strategies doesn't regress and remains well
tested, and (2) there is a good story for educating users about how to
write external merge strategies, or at least as good as what we have
now.  If I feel either is being ignored or regressing, I'll likely
express my concerns again.

> > There are certainly other possible solutions to the educational issue,
> > and might not even be too hard, but we'd need someone to implement one
> > before I'd agree we found an "acceptable way to please everyone".  :-)
>
> *nod*
>
> >> So something like this on top of this series (assume my SOB etc. if this
> >> is acceptable). I only tested this locally, but it seems to do the right
> >> thing for me:
> > <snip patch>
> >
> > How did you test?  I'm a bit confused...unless I'm misreading
> > something, it appears to me that ci/lib.sh sets SKIP_DASHED_BUILT_INS
> > unconditionally which would probably cause your proposal to break.
>
> Admittedly not very thoroughly, but I'm fairly sure it does the right
> thing when it comes to this, and SKIP_DASHED_BUILT_INS doesn't enter
> into it (and all my local builds use SKIP_DASHED_BUILT_INS=Y).
>
> The try_merge_command() invokes merge-what-ever, and does a
> run_command_v_opt(args.v, RUN_GIT_CMD). At that point we'll invoke a
> "git merge-what-ever", i.e. we don't need a "git-merge-what-ever" binary
> to exist.
>
> This is what we do in general when git is invoking itself, and we'd need
> to go out of our way to have it not work in this case (i.e. build it as
> a stand-alone program, like git-http-fetch, and not as a built-in).

Oh, right, I was mixing up git-merge-one-file (which merge-resolve has
merge-index invoke, and yes including the dash right after "git") and
`git merge-resolve`.  Sorry about that.
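
For readers following along, the distinction can be sketched in a few
lines of C. This is a toy model and not git's run-command code: the
names toy_prepare_argv() and TOY_RUN_GIT_CMD are made up for the
illustration, and the only behaviour it assumes is the one described
above, namely that a command run with RUN_GIT_CMD is executed as
"git <cmd>" rather than as a dashed "git-<cmd>" binary.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_RUN_GIT_CMD (1 << 0) /* hypothetical stand-in for RUN_GIT_CMD */

/*
 * Build the argv that would be executed. With TOY_RUN_GIT_CMD set, "git"
 * is prepended, so { "merge-resolve", ... } runs as "git merge-resolve ..."
 * and no dashed "git-merge-resolve" binary has to exist on disk.
 * "out" must have room for "nr" + 2 pointers. Returns the number of
 * arguments written, not counting the terminating NULL.
 */
static size_t toy_prepare_argv(const char **out, const char *const *args,
			       size_t nr, unsigned flags)
{
	size_t n = 0, i;

	assert(out && args);
	if (flags & TOY_RUN_GIT_CMD)
		out[n++] = "git";
	for (i = 0; i < nr; i++)
		out[n++] = args[i];
	out[n] = NULL;
	return n;
}
```

Without the flag the first element stays "merge-resolve", which is the
case where a stand-alone program of that name would have to be found.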


* [PATCH v9 00/12] merge-index: prepare to rewrite merge drivers in C
  2022-08-09 18:54             ` [PATCH v8 00/14] Rewrite the remaining merge strategies from shell to C Alban Gruin
                                 ` (13 preceding siblings ...)
  2022-08-09 18:54               ` [PATCH v8 14/14] sequencer: use the "octopus" " Alban Gruin
@ 2022-11-18 11:18               ` Ævar Arnfjörð Bjarmason
  2022-11-18 11:18                 ` [PATCH v9 01/12] merge-index doc & -h: fix padding, labels and "()" use Ævar Arnfjörð Bjarmason
                                   ` (13 more replies)
  14 siblings, 14 replies; 221+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-11-18 11:18 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, Junio C Hamano, Alban Gruin, Phillip Wood,
	Elijah Newren, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason

This is a prep series for a re-roll of Alban Gruin's series to rewrite
various merge drivers from *.sh to *.c, and to be able to call those
in-process.

This was last discussed on-list in August[1], and has been ejected
from "seen" due to staleness.

The last time around there were concerns with the later part of this
topic, but the parts that are included here weren't controversial. The
later part will be part 2 (and I think I've addressed those concerns).

Changes since the v8:

 * Migrate "merge-index" to parse_options(), first in a bug-for-bug
   compatible way, and then later on fix the behavior, and add tests
   along the way.

 * The 5-9/12 here were all split out in one way or another from
   Alban's 3/14[2], in such a way as to make the diff for 10/12 as
   friendly as possible (e.g. catering to rename detection).

 * Alban's v8 converted die()/exit() in the built-in to "error()", but
   in doing so introduced a behavior change: where we'd previously exit
   right away on the first failure among N items, in the v8 we'd attempt
   all N items.

   It turns out that almost nothing that came after cares about the
   die(), i.e. even though we've lib-ified it, it's OK to call die()
   within that library.

 * A new 11/12 hopefully makes the way merge-index parses out OIDs and
   types easier to reason about.

 * Finally, 12/12 makes the semantics of "merge-index" sane vis-a-vis
   parse_options().
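
The die()-versus-error() point above may be easier to see with a toy
model. The sketch below is not git code: toy_merge_fn, toy_merge_all()
and fail_on() are hypothetical names, and paths are plain strings
instead of index entries. It only illustrates the difference between
stopping at the first failing item and attempting all N items while
counting failures, which is roughly what "-o" (one-shot) asks for.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback type: returns non-zero when merging "path" fails. */
typedef int (*toy_merge_fn)(const char *path, void *data);

struct toy_result {
	size_t attempted; /* how many items the callback was invoked for */
	size_t failed;    /* how many of those invocations failed */
};

/*
 * Walk all paths. If "oneshot" is zero, stop at the first failure;
 * otherwise try every item and only count the failures.
 */
static struct toy_result toy_merge_all(const char **paths, size_t nr,
				       int oneshot, toy_merge_fn fn,
				       void *data)
{
	struct toy_result res = { 0, 0 };
	size_t i;

	assert(fn);
	for (i = 0; i < nr; i++) {
		res.attempted++;
		if (!fn(paths[i], data))
			continue;
		res.failed++;
		if (!oneshot)
			break;
	}
	return res;
}

/* Example callback: fails only for the path named by "data". */
static int fail_on(const char *path, void *data)
{
	return !strcmp(path, (const char *)data);
}
```

With the paths "a", "b", "c" and a callback that fails on "a", the
non-one-shot walk attempts one item, while the one-shot walk attempts
all three and reports one failure.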

Passing CI and branch for this at
https://github.com/avar/git/tree/ag/merge-strategies-in-c-prep

The follow-on from this is then
https://github.com/avar/git/tree/ag/merge-strategies-in-c-2, for those
that want to peek ahead.

1. https://lore.kernel.org/git/20220809185429.20098-9-alban.gruin@gmail.com/
2. https://lore.kernel.org/git/20220809185429.20098-4-alban.gruin@gmail.com/

Alban Gruin (4):
  t6060: modify multiple files to expose a possible issue with
    merge-index
  t6060: add tests for removed files
  merge-index: improve die() error messages
  merge-index: libify merge_one_path() and merge_all()

Ævar Arnfjörð Bjarmason (8):
  merge-index doc & -h: fix padding, labels and "()" use
  merge-index tests: add usage tests
  merge-index: migrate to parse_options() API
  merge-index i18n: mark die() messages for translation
  merge-index: stop calling ensure_full_index() twice
  builtin/merge-index.c: don't USE_THE_INDEX_COMPATIBILITY_MACROS
  merge-index: use "struct strvec" and helper to prepare args
  merge-index: make the argument parsing sensible & simpler

 Documentation/git-merge-index.txt |   2 +-
 Makefile                          |   1 +
 builtin/merge-index.c             | 169 ++++++++++++++----------------
 git.c                             |   2 +-
 merge-strategies.c                |  87 +++++++++++++++
 merge-strategies.h                |  19 ++++
 t/t0450/txt-help-mismatches       |   1 -
 t/t6060-merge-index.sh            |  65 +++++++++++-
 8 files changed, 250 insertions(+), 96 deletions(-)
 create mode 100644 merge-strategies.c
 create mode 100644 merge-strategies.h

Range-diff against v8:
 -:  ----------- >  1:  cafc7db374e merge-index doc & -h: fix padding, labels and "()" use
 1:  0f791f500e6 !  2:  099d4812601 t6060: modify multiple files to expose a possible issue with merge-index
    @@ Commit message
         be trivially mergeable.
     
         Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
    -    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    +    Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
      ## t/t6060-merge-index.sh ##
    -@@ t/t6060-merge-index.sh: test_description='basic git merge-index / git-merge-one-file tests'
    +@@ t/t6060-merge-index.sh: TEST_PASSES_SANITIZE_LEAK=true
      
      test_expect_success 'setup diverging branches' '
      	test_write_lines 1 2 3 4 5 6 7 8 9 10 >file &&
 2:  ed9e7a45855 !  3:  af3a235a224 t6060: add tests for removed files
    @@ Commit message
         tagged as `base', and deletes it in the commit tagged as `two'.
     
         Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
    -    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    +    Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
      ## t/t6060-merge-index.sh ##
    -@@ t/t6060-merge-index.sh: test_description='basic git merge-index / git-merge-one-file tests'
    +@@ t/t6060-merge-index.sh: TEST_PASSES_SANITIZE_LEAK=true
      test_expect_success 'setup diverging branches' '
      	test_write_lines 1 2 3 4 5 6 7 8 9 10 >file &&
      	cp file file2 &&
 -:  ----------- >  4:  7d686637fa3 merge-index tests: add usage tests
 -:  ----------- >  5:  845f9b0cc19 merge-index: migrate to parse_options() API
 -:  ----------- >  6:  fc4e64f669e merge-index: improve die() error messages
 -:  ----------- >  7:  04c2bae9e68 merge-index i18n: mark die() messages for translation
 -:  ----------- >  8:  badfc60354a merge-index: stop calling ensure_full_index() twice
 -:  ----------- >  9:  f29343197eb builtin/merge-index.c: don't USE_THE_INDEX_COMPATIBILITY_MACROS
 3:  d1d5740a8e5 ! 10:  c7a131a9a86 merge-index: libify merge_one_path() and merge_all()
    @@ Metadata
      ## Commit message ##
         merge-index: libify merge_one_path() and merge_all()
     
    -    The "resolve" and "octopus" merge strategies do not call directly `git
    -    merge-one-file', they delegate the work to another git command, `git
    -    merge-index', that will loop over files in the index and call the
    -    specified command.  Unfortunately, these functions are not part of
    -    libgit.a, which means that once rewritten, the strategies would still
    -    have to invoke `merge-one-file' by spawning a new process first.
    +    Move the workhorse functions in "builtin/merge-index.c" into a new
    +    "merge-strategies" library, and mostly "libify" the code while doing
    +    so.
     
    -    To avoid this, this moves and renames merge_one_path(), merge_all(), and
    -    their helpers to merge-strategies.c.  They also take a callback to
    -    dictate what they should do for each file.  For now, to preserve the
    -    behaviour of `merge-index', only one callback, launching a new process,
    -    is defined.
    +    Eventually this will allow us to invoke merge strategies such as
    +    "resolve" and "octopus" in-process, once we've followed-up and
    +    replaced "git-merge-{resolve,octopus}.sh" etc.
    +
    +    But for now let's move this code, while trying to optimize for as much
    +    of it as possible being highlighted by the diff rename detection.
    +
    +    We still call die() in this library. An earlier version of this[1]
    +    converted these to "error()", but the problem with that is that we'd then
    +    potentially run into the same error N times, e.g. once for every
    +    "<file>" we were asked to operate on, instead of dying on the first
    +    case. So let's leave those to "die()" for now.
    +
    +    1. https://lore.kernel.org/git/20220809185429.20098-4-alban.gruin@gmail.com/
     
         Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
    -    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    +    Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
      ## Makefile ##
     @@ Makefile: LIB_OBJS += merge-blobs.o
    @@ Makefile: LIB_OBJS += merge-blobs.o
     
      ## builtin/merge-index.c ##
     @@
    - #define USE_THE_INDEX_COMPATIBILITY_MACROS
      #include "builtin.h"
    + #include "parse-options.h"
     +#include "merge-strategies.h"
      #include "run-command.h"
      
    - static const char *pgm;
    +-static const char *pgm;
     -static int one_shot, quiet;
     -static int err;
    ++struct mofs_data {
    ++	const char *program;
    ++};
      
    --static int merge_entry(int pos, const char *path)
    -+static int merge_one_file_spawn(struct index_state *istate,
    -+				const struct object_id *orig_blob,
    -+				const struct object_id *our_blob,
    -+				const struct object_id *their_blob, const char *path,
    -+				unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
    -+				void *data)
    +-static int merge_entry(struct index_state *istate, int pos, const char *path)
    ++static int merge_one_file(struct index_state *istate,
    ++			  const struct object_id *orig_blob,
    ++			  const struct object_id *our_blob,
    ++			  const struct object_id *their_blob, const char *path,
    ++			  unsigned int orig_mode, unsigned int our_mode,
    ++			  unsigned int their_mode, void *data)
      {
     -	int found;
    --	const char *arguments[] = { pgm, "", "", "", path, "", "", "", NULL };
    --	char hexbuf[4][GIT_MAX_HEXSZ + 1];
    --	char ownbuf[4][60];
    -+	char oids[3][GIT_MAX_HEXSZ + 1] = {{0}};
    -+	char modes[3][10] = {{0}};
    -+	const char *arguments[] = { pgm, oids[0], oids[1], oids[2],
    -+				    path, modes[0], modes[1], modes[2], NULL };
    ++	struct mofs_data *d = data;
    ++	const char *pgm = d->program;
    + 	const char *arguments[] = { pgm, "", "", "", path, "", "", "", NULL };
    + 	char hexbuf[4][GIT_MAX_HEXSZ + 1];
    + 	char ownbuf[4][60];
    ++	int stage = 0;
    + 	struct child_process cmd = CHILD_PROCESS_INIT;
      
    --	if (pos >= active_nr)
    --		die("git merge-index: %s not in the cache", path);
    +-	if (pos >= istate->cache_nr)
    +-		die(_("'%s' is not in the cache"), path);
     -	found = 0;
     -	do {
    --		const struct cache_entry *ce = active_cache[pos];
    +-		const struct cache_entry *ce = istate->cache[pos];
     -		int stage = ce_stage(ce);
     -
     -		if (strcmp(ce->name, path))
    @@ builtin/merge-index.c
     -		xsnprintf(ownbuf[stage], sizeof(ownbuf[stage]), "%o", ce->ce_mode);
     -		arguments[stage] = hexbuf[stage];
     -		arguments[stage + 4] = ownbuf[stage];
    --	} while (++pos < active_nr);
    +-	} while (++pos < istate->cache_nr);
     -	if (!found)
    --		die("git merge-index: %s not in the cache", path);
    +-		die(_("'%s' is not in the cache"), path);
     -
    --	if (run_command_v_opt(arguments, 0)) {
    +-	strvec_pushv(&cmd.args, arguments);
    +-	if (run_command(&cmd)) {
     -		if (one_shot)
     -			err++;
     -		else {
     -			if (!quiet)
    --				die("merge program failed");
    +-				die(_("merge program failed"));
     -			exit(1);
     -		}
    -+	if (orig_blob) {
    -+		oid_to_hex_r(oids[0], orig_blob);
    -+		xsnprintf(modes[0], sizeof(modes[0]), "%06o", orig_mode);
    ++#define ADD_MOF_ARG(oid, mode) \
    ++	if ((oid)) { \
    ++		stage++; \
    ++		oid_to_hex_r(hexbuf[stage], (oid)); \
    ++		xsnprintf(ownbuf[stage], sizeof(ownbuf[stage]), "%06o", (mode)); \
    ++		arguments[stage] = hexbuf[stage]; \
    ++		arguments[stage + 4] = ownbuf[stage]; \
      	}
     -	return found;
     -}
    - 
    --static void merge_one_path(const char *path)
    --{
    --	int pos = cache_name_pos(path, strlen(path));
     -
    +-static void merge_one_path(struct index_state *istate, const char *path)
    +-{
    +-	int pos = index_name_pos(istate, path, strlen(path));
    + 
     -	/*
     -	 * If it already exists in the cache as stage0, it's
     -	 * already merged and there is nothing to do.
     -	 */
     -	if (pos < 0)
    --		merge_entry(-pos-1, path);
    +-		merge_entry(istate, -pos-1, path);
     -}
    -+	if (our_blob) {
    -+		oid_to_hex_r(oids[1], our_blob);
    -+		xsnprintf(modes[1], sizeof(modes[1]), "%06o", our_mode);
    -+	}
    - 
    --static void merge_all(void)
    +-
    +-static void merge_all(struct index_state *istate)
     -{
     -	int i;
    --	/* TODO: audit for interaction with sparse-index. */
    --	ensure_full_index(&the_index);
    --	for (i = 0; i < active_nr; i++) {
    --		const struct cache_entry *ce = active_cache[i];
    ++	ADD_MOF_ARG(orig_blob, orig_mode);
    ++	ADD_MOF_ARG(our_blob, our_mode);
    ++	ADD_MOF_ARG(their_blob, their_mode);
    + 
    +-	for (i = 0; i < istate->cache_nr; i++) {
    +-		const struct cache_entry *ce = istate->cache[i];
     -		if (!ce_stage(ce))
     -			continue;
    --		i += merge_entry(i, ce->name)-1;
    -+	if (their_blob) {
    -+		oid_to_hex_r(oids[2], their_blob);
    -+		xsnprintf(modes[2], sizeof(modes[2]), "%06o", their_mode);
    - 	}
    -+
    -+	return run_command_v_opt(arguments, 0);
    +-		i += merge_entry(istate, i, ce->name)-1;
    +-	}
    ++	strvec_pushv(&cmd.args, arguments);
    ++	return run_command(&cmd);
      }
      
      int cmd_merge_index(int argc, const char **argv, const char *prefix)
      {
    --	int i, force_file = 0;
    -+	int i, force_file = 0, err = 0, one_shot = 0, quiet = 0;
    ++	int err = 0;
    + 	int all = 0;
    ++	int one_shot = 0;
    ++	int quiet = 0;
    + 	const char * const usage[] = {
    + 		N_("git merge-index [-o] [-q] <merge-program> (-a | ([--] <file>...))"),
    + 		NULL
    +@@ builtin/merge-index.c: int cmd_merge_index(int argc, const char **argv, const char *prefix)
    + 		OPT_END(),
    + 	};
    + #undef OPT__MERGE_INDEX_ALL
    ++	struct mofs_data data = { 0 };
      
      	/* Without this we cannot rely on waitpid() to tell
      	 * what happened to our children.
     @@ builtin/merge-index.c: int cmd_merge_index(int argc, const char **argv, const char *prefix)
    - 		quiet = 1;
    - 		i++;
    - 	}
    -+
    - 	pgm = argv[i++];
    -+
    - 	for (; i < argc; i++) {
    - 		const char *arg = argv[i];
    - 		if (!force_file && *arg == '-') {
    + 	/* <merge-program> and its options */
    + 	if (!argc)
    + 		usage_msg_opt(_("need a <merge-program> argument"), usage, options);
    +-	pgm = argv[0];
    ++	data.program = argv[0];
    + 	argc = parse_options(argc, argv, prefix, options_prog, usage, 0);
    + 	if (argc && all)
    + 		usage_msg_opt(_("'-a' and '<file>...' are mutually exclusive"),
     @@ builtin/merge-index.c: int cmd_merge_index(int argc, const char **argv, const char *prefix)
    - 				continue;
    - 			}
    - 			if (!strcmp(arg, "-a")) {
    --				merge_all();
    -+				err |= merge_all_index(&the_index, one_shot, quiet,
    -+						       merge_one_file_spawn, NULL);
    - 				continue;
    - 			}
    - 			die("git merge-index: unknown option %s", arg);
    - 		}
    --		merge_one_path(arg);
    -+		err |= merge_index_path(&the_index, one_shot, quiet, arg,
    -+					merge_one_file_spawn, NULL);
    - 	}
    + 	ensure_full_index(the_repository->index);
    + 
    + 	if (all)
    +-		merge_all(the_repository->index);
    ++		err |= merge_all_index(the_repository->index, one_shot, quiet,
    ++				       merge_one_file, &data);
    + 	else
    + 		for (size_t i = 0; i < argc; i++)
    +-			merge_one_path(the_repository->index, argv[i]);
    ++			err |= merge_index_path(the_repository->index,
    ++						one_shot, quiet, argv[i],
    ++						merge_one_file, &data);
    + 
     -	if (err && !quiet)
    --		die("merge program failed");
    -+
    +-		die(_("merge program failed"));
      	return err;
      }
     
    @@ merge-strategies.c (new)
     +#include "merge-strategies.h"
     +
     +static int merge_entry(struct index_state *istate, unsigned int pos,
    -+		       const char *path, int *err, merge_fn fn, void *data)
    ++		       const char *path, int *err, merge_index_fn fn,
    ++		       void *data)
     +{
     +	int found = 0;
    -+	const struct object_id *oids[3] = {NULL};
    -+	unsigned int modes[3] = {0};
    ++	const struct object_id *oids[3] = { 0 };
    ++	unsigned int modes[3] = { 0 };
     +
    ++	*err = 0;
    ++
    ++	if (pos >= istate->cache_nr)
    ++		die(_("'%s' is not in the cache"), path);
     +	do {
     +		const struct cache_entry *ce = istate->cache[pos];
     +		int stage = ce_stage(ce);
    @@ merge-strategies.c (new)
     +		modes[stage - 1] = ce->ce_mode;
     +	} while (++pos < istate->cache_nr);
     +	if (!found)
    -+		return error(_("%s is not in the cache"), path);
    ++		die(_("'%s' is not in the cache"), path);
     +
    -+	if (fn(istate, oids[0], oids[1], oids[2], path,
    -+	       modes[0], modes[1], modes[2], data))
    ++	if (fn(istate, oids[0], oids[1], oids[2], path, modes[0], modes[1],
    ++	       modes[2], data))
     +		(*err)++;
     +
     +	return found;
     +}
     +
     +int merge_index_path(struct index_state *istate, int oneshot, int quiet,
    -+		     const char *path, merge_fn fn, void *data)
    ++		     const char *path, merge_index_fn fn, void *data)
     +{
    -+	int pos = index_name_pos(istate, path, strlen(path)), ret, err = 0;
    ++	int err, ret;
    ++	int pos = index_name_pos(istate, path, strlen(path));
     +
     +	/*
     +	 * If it already exists in the cache as stage0, it's
     +	 * already merged and there is nothing to do.
     +	 */
    -+	if (pos < 0) {
    -+		ret = merge_entry(istate, -pos - 1, path, &err, fn, data);
    -+		if (ret == -1)
    -+			return -1;
    -+		else if (err) {
    -+			if (!quiet && !oneshot)
    -+				error(_("merge program failed"));
    -+			return 1;
    -+		}
    ++	if (pos >= 0)
    ++		return 0;
    ++
    ++	ret = merge_entry(istate, -pos - 1, path, &err, fn, data);
    ++	if (ret < 0)
    ++		return ret;
    ++	if (err) {
    ++		if (!quiet && !oneshot)
    ++			die(_("merge program failed"));
    ++		return 1;
     +	}
     +	return 0;
     +}
     +
     +int merge_all_index(struct index_state *istate, int oneshot, int quiet,
    -+		    merge_fn fn, void *data)
    ++		    merge_index_fn fn, void *data)
     +{
    -+	int err = 0, ret;
    ++	int err, ret;
     +	unsigned int i;
     +
    -+	/* TODO: audit for interaction with sparse-index. */
    -+	ensure_full_index(istate);
     +	for (i = 0; i < istate->cache_nr; i++) {
     +		const struct cache_entry *ce = istate->cache[i];
     +		if (!ce_stage(ce))
     +			continue;
     +
     +		ret = merge_entry(istate, i, ce->name, &err, fn, data);
    -+		if (ret > 0)
    ++		if (ret < 0)
    ++			return ret;
    ++		else if (ret > 0)
     +			i += ret - 1;
    -+		else if (ret == -1)
    -+			return -1;
     +
     +		if (err && !oneshot) {
     +			if (!quiet)
    -+				error(_("merge program failed"));
    ++				die(_("merge program failed"));
     +			return 1;
     +		}
     +	}
     +
     +	if (err && !quiet)
    -+		error(_("merge program failed"));
    ++		die(_("merge program failed"));
     +	return err;
     +}
     
    @@ merge-strategies.h (new)
     +#ifndef MERGE_STRATEGIES_H
     +#define MERGE_STRATEGIES_H
     +
    -+#include "object.h"
    -+
    -+typedef int (*merge_fn)(struct index_state *istate,
    -+			const struct object_id *orig_blob,
    -+			const struct object_id *our_blob,
    -+			const struct object_id *their_blob, const char *path,
    -+			unsigned int orig_mode, unsigned int our_mode, unsigned int their_mode,
    -+			void *data);
    ++struct object_id;
    ++struct index_state;
    ++typedef int (*merge_index_fn)(struct index_state *istate,
    ++			      const struct object_id *orig_blob,
    ++			      const struct object_id *our_blob,
    ++			      const struct object_id *their_blob,
    ++			      const char *path, unsigned int orig_mode,
    ++			      unsigned int our_mode, unsigned int their_mode,
    ++			      void *data);
     +
     +int merge_index_path(struct index_state *istate, int oneshot, int quiet,
    -+		     const char *path, merge_fn fn, void *data);
    ++		     const char *path, merge_index_fn fn, void *data);
     +int merge_all_index(struct index_state *istate, int oneshot, int quiet,
    -+		    merge_fn fn, void *data);
    ++		    merge_index_fn fn, void *data);
     +
     +#endif /* MERGE_STRATEGIES_H */
    -
    - ## t/t7607-merge-state.sh ##
    -@@ t/t7607-merge-state.sh: test_expect_success 'Ensure we restore original state if no merge strategy handl
    - 	# just hit conflicts, it completely fails and says that it cannot
    - 	# handle this type of merge.
    - 	test_expect_code 2 git merge branch2 branch3 >output 2>&1 &&
    --	grep "fatal: merge program failed" output &&
    -+	grep "error: merge program failed" output &&
    - 	grep "Should not be doing an octopus" output &&
    - 
    - 	# Make sure we did not leave stray changes around when no appropriate
 4:  4b0420836c1 <  -:  ----------- merge-index: drop the index
 5:  19a4fc52c57 <  -:  ----------- merge-index: add a new way to invoke `git-merge-one-file'
 6:  376130c1334 <  -:  ----------- update-index: move add_cacheinfo() to read-cache.c
 7:  e440127edf2 <  -:  ----------- merge-one-file: rewrite in C
 8:  661c358836e <  -:  ----------- merge-resolve: rewrite in C
 9:  388128cd351 <  -:  ----------- merge-recursive: move better_branch_name() to merge.c
10:  1515e154bf5 <  -:  ----------- merge-octopus: rewrite in C
11:  701c47371a7 <  -:  ----------- merge: use the "resolve" strategy without forking
12:  17597d0cc57 <  -:  ----------- merge: use the "octopus" strategy without forking
13:  cecfa666ecb <  -:  ----------- sequencer: use the "resolve" strategy without forking
14:  a23c0491a1f <  -:  ----------- sequencer: use the "octopus" strategy without forking
 -:  ----------- > 11:  adb712ca7a5 merge-index: use "struct strvec" and helper to prepare args
 -:  ----------- > 12:  f0368560140 merge-index: make the argument parsing sensible & simpler
-- 
2.38.0.1511.gcdcff1f1dc2



* [PATCH v9 01/12] merge-index doc & -h: fix padding, labels and "()" use
  2022-11-18 11:18               ` [PATCH v9 00/12] merge-index: prepare to rewrite merge drivers in C Ævar Arnfjörð Bjarmason
@ 2022-11-18 11:18                 ` Ævar Arnfjörð Bjarmason
  2022-11-18 11:18                 ` [PATCH v9 02/12] t6060: modify multiple files to expose a possible issue with merge-index Ævar Arnfjörð Bjarmason
                                   ` (12 subsequent siblings)
  13 siblings, 0 replies; 221+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-11-18 11:18 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, Junio C Hamano, Alban Gruin, Phillip Wood,
	Elijah Newren, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason

Make the "merge-index" doc SYNOPSIS and "-h" output consistent with
one another, and fix small issues with them:

- Whitespace padding, per e2f4e7e8c0f (doc txt & -h consistency:
  correct padding around "[]()", 2022-10-13).

- Use "<file>" consistently, rather than using "<filename>" in the
  "-h" output, and "<file>" in the SYNOPSIS.

- The "-h" version incorrectly claimed that the filename was optional,
  but it's not.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 Documentation/git-merge-index.txt | 2 +-
 builtin/merge-index.c             | 2 +-
 t/t0450/txt-help-mismatches       | 1 -
 3 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/Documentation/git-merge-index.txt b/Documentation/git-merge-index.txt
index eea56b3154e..a297105d6d8 100644
--- a/Documentation/git-merge-index.txt
+++ b/Documentation/git-merge-index.txt
@@ -9,7 +9,7 @@ git-merge-index - Run a merge for files needing merging
 SYNOPSIS
 --------
 [verse]
-'git merge-index' [-o] [-q] <merge-program> (-a | ( [--] <file>...) )
+'git merge-index' [-o] [-q] <merge-program> (-a | ([--] <file>...))
 
 DESCRIPTION
 -----------
diff --git a/builtin/merge-index.c b/builtin/merge-index.c
index 012f52bd007..1a5a64afd2a 100644
--- a/builtin/merge-index.c
+++ b/builtin/merge-index.c
@@ -80,7 +80,7 @@ int cmd_merge_index(int argc, const char **argv, const char *prefix)
 	signal(SIGCHLD, SIG_DFL);
 
 	if (argc < 3)
-		usage("git merge-index [-o] [-q] <merge-program> (-a | [--] [<filename>...])");
+		usage("git merge-index [-o] [-q] <merge-program> (-a | ([--] <file>...))");
 
 	read_cache();
 
diff --git a/t/t0450/txt-help-mismatches b/t/t0450/txt-help-mismatches
index a0777acd667..9e73c1892ae 100644
--- a/t/t0450/txt-help-mismatches
+++ b/t/t0450/txt-help-mismatches
@@ -34,7 +34,6 @@ mailsplit
 maintenance
 merge
 merge-file
-merge-index
 merge-one-file
 multi-pack-index
 name-rev
-- 
2.38.0.1511.gcdcff1f1dc2



* [PATCH v9 02/12] t6060: modify multiple files to expose a possible issue with merge-index
  2022-11-18 11:18               ` [PATCH v9 00/12] merge-index: prepare to rewrite merge drivers in C Ævar Arnfjörð Bjarmason
  2022-11-18 11:18                 ` [PATCH v9 01/12] merge-index doc & -h: fix padding, labels and "()" use Ævar Arnfjörð Bjarmason
@ 2022-11-18 11:18                 ` Ævar Arnfjörð Bjarmason
  2022-11-18 11:18                 ` [PATCH v9 03/12] t6060: add tests for removed files Ævar Arnfjörð Bjarmason
                                   ` (11 subsequent siblings)
  13 siblings, 0 replies; 221+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-11-18 11:18 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, Junio C Hamano, Alban Gruin, Phillip Wood,
	Elijah Newren, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason

From: Alban Gruin <alban.gruin@gmail.com>

Currently, merge-index iterates over every index entry, skipping stage0
entries.  It will then count how many entries following the current one
have the same name, then fork to do the merge.  It will then increase
the iterator by the number of entries to skip them.  This behaviour is
correct, as even if the subprocess modifies the index, merge-index does
not reload it at all.

But once it is rewritten to call a function in-process, the index it
uses will be modified and may shrink when a conflict happens or when a
file is removed, so we have to be careful to handle such cases.

Here is an example:

 *    Merge branches, file1 and file2 are trivially mergeable.
 |\
 | *  Modifies file1 and file2.
 * |  Modifies file1 and file2.
 |/
 *    Adds file1 and file2.

When the merge happens, the index will look like this:

 i -> 0. file1 (stage1)
      1. file1 (stage2)
      2. file1 (stage3)
      3. file2 (stage1)
      4. file2 (stage2)
      5. file2 (stage3)

merge-index handles `file1' first.  As it appears 3 times after the
iterator, it is merged.  The index is now stale, `i' is increased by 3,
and the index now looks like this:

      0. file1 (stage1)
      1. file1 (stage2)
      2. file1 (stage3)
 i -> 3. file2 (stage1)
      4. file2 (stage2)
      5. file2 (stage3)

`file2' appears three times too, so it is merged.

With a naive rewrite, the index would look like this:

      0. file1 (stage0)
      1. file2 (stage1)
      2. file2 (stage2)
 i -> 3. file2 (stage3)

`file2' appears once at the iterator or after, so it will be added,
_not_ merged.  Which is wrong.

A naive rewrite would lead to improperly merged files, or even files not
handled at all.
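
The failure mode above can be reproduced outside of git with a toy
model. The following is only a sketch under stated assumptions: the
"toy_" names are hypothetical, the index is a plain array, and a merge
is modelled as collapsing all stages of a path into one stage0 entry.
It is not the merge-index code itself.

```c
#include <assert.h>
#include <string.h>

/* Toy stand-in for an index entry: a path plus a stage (0 = merged). */
struct toy_entry {
	const char *name;
	int stage;
};

/* Count the consecutive entries named `path`, starting at `pos`. */
static int toy_count_same(const struct toy_entry *e, int nr, int pos,
			  const char *path)
{
	int found = 0;

	while (pos + found < nr && !strcmp(e[pos + found].name, path))
		found++;
	return found;
}

/*
 * Pretend to merge in place: the `found` entries at `pos` collapse into
 * one stage0 entry, so the array shrinks by `found - 1`. Returns the new
 * number of entries.
 */
static int toy_collapse(struct toy_entry *e, int nr, int pos, int found)
{
	assert(found >= 1 && pos + found <= nr);
	e[pos].stage = 0;
	memmove(&e[pos + 1], &e[pos + found],
		(size_t)(nr - pos - found) * sizeof(*e));
	return nr - (found - 1);
}

/*
 * Walk the toy index and "merge" every unmerged path. Returns how many
 * paths were seen with all three stages. With `naive` set, the iterator
 * still skips `found` entries after each merge, which was only safe
 * while the in-memory index was never modified.
 */
static int toy_merge_all(struct toy_entry *e, int nr, int naive)
{
	int full = 0;

	for (int i = 0; i < nr; i++) {
		int found;

		if (!e[i].stage)
			continue;
		found = toy_count_same(e, nr, i, e[i].name);
		if (found == 3)
			full++;
		nr = toy_collapse(e, nr, i, found);
		if (naive)
			i += found - 1;
	}
	return full;
}
```

With the six-entry index from the example, the naive walk lands on
file2's stage3 entry and sees only one of its stages, while the walk
that does not skip after a collapse sees all three stages of both files.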

This changes t6060 to reproduce this case, by creating 2 files instead
of 1, to check the correctness of the soon-to-be-rewritten merge-index.
The files are identical, which is not really important -- the factors
that could trigger this issue are that they should be separated by at
most one entry in the index, and that the first one in the index should
be trivially mergeable.

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/t6060-merge-index.sh | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/t/t6060-merge-index.sh b/t/t6060-merge-index.sh
index 1a8b64cce18..30513351c23 100755
--- a/t/t6060-merge-index.sh
+++ b/t/t6060-merge-index.sh
@@ -7,16 +7,19 @@ TEST_PASSES_SANITIZE_LEAK=true
 
 test_expect_success 'setup diverging branches' '
 	test_write_lines 1 2 3 4 5 6 7 8 9 10 >file &&
-	git add file &&
+	cp file file2 &&
+	git add file file2 &&
 	git commit -m base &&
 	git tag base &&
 	sed s/2/two/ <file >tmp &&
 	mv tmp file &&
+	cp file file2 &&
 	git commit -a -m two &&
 	git tag two &&
 	git checkout -b other HEAD^ &&
 	sed s/10/ten/ <file >tmp &&
 	mv tmp file &&
+	cp file file2 &&
 	git commit -a -m ten &&
 	git tag ten
 '
@@ -35,8 +38,11 @@ ten
 EOF
 
 test_expect_success 'read-tree does not resolve content merge' '
+	cat >expect <<-\EOF &&
+	file
+	file2
+	EOF
 	git read-tree -i -m base ten two &&
-	echo file >expect &&
 	git diff-files --name-only --diff-filter=U >unmerged &&
 	test_cmp expect unmerged
 '
-- 
2.38.0.1511.gcdcff1f1dc2



* [PATCH v9 03/12] t6060: add tests for removed files
  2022-11-18 11:18               ` [PATCH v9 00/12] merge-index: prepare to rewrite merge drivers in C Ævar Arnfjörð Bjarmason
  2022-11-18 11:18                 ` [PATCH v9 01/12] merge-index doc & -h: fix padding, labels and "()" use Ævar Arnfjörð Bjarmason
  2022-11-18 11:18                 ` [PATCH v9 02/12] t6060: modify multiple files to expose a possible issue with merge-index Ævar Arnfjörð Bjarmason
@ 2022-11-18 11:18                 ` Ævar Arnfjörð Bjarmason
  2022-11-18 11:18                 ` [PATCH v9 04/12] merge-index tests: add usage tests Ævar Arnfjörð Bjarmason
                                   ` (10 subsequent siblings)
  13 siblings, 0 replies; 221+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-11-18 11:18 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, Junio C Hamano, Alban Gruin, Phillip Wood,
	Elijah Newren, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason

From: Alban Gruin <alban.gruin@gmail.com>

Until now, t6060 did not check git-merge-one-file's behaviour when a
file is deleted in a branch.  To avoid regressions on this during the
conversion from shell to C, this adds a new file, `file3', in the commit
tagged as `base', and deletes it in the commit tagged as `two'.

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/t6060-merge-index.sh | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/t/t6060-merge-index.sh b/t/t6060-merge-index.sh
index 30513351c23..079151ee06d 100755
--- a/t/t6060-merge-index.sh
+++ b/t/t6060-merge-index.sh
@@ -8,12 +8,14 @@ TEST_PASSES_SANITIZE_LEAK=true
 test_expect_success 'setup diverging branches' '
 	test_write_lines 1 2 3 4 5 6 7 8 9 10 >file &&
 	cp file file2 &&
-	git add file file2 &&
+	cp file file3 &&
+	git add file file2 file3 &&
 	git commit -m base &&
 	git tag base &&
 	sed s/2/two/ <file >tmp &&
 	mv tmp file &&
 	cp file file2 &&
+	git rm file3 &&
 	git commit -a -m two &&
 	git tag two &&
 	git checkout -b other HEAD^ &&
@@ -41,6 +43,7 @@ test_expect_success 'read-tree does not resolve content merge' '
 	cat >expect <<-\EOF &&
 	file
 	file2
+	file3
 	EOF
 	git read-tree -i -m base ten two &&
 	git diff-files --name-only --diff-filter=U >unmerged &&
-- 
2.38.0.1511.gcdcff1f1dc2



* [PATCH v9 04/12] merge-index tests: add usage tests
  2022-11-18 11:18               ` [PATCH v9 00/12] merge-index: prepare to rewrite merge drivers in C Ævar Arnfjörð Bjarmason
                                   ` (2 preceding siblings ...)
  2022-11-18 11:18                 ` [PATCH v9 03/12] t6060: add tests for removed files Ævar Arnfjörð Bjarmason
@ 2022-11-18 11:18                 ` Ævar Arnfjörð Bjarmason
  2022-11-18 11:18                 ` [PATCH v9 05/12] merge-index: migrate to parse_options() API Ævar Arnfjörð Bjarmason
                                   ` (9 subsequent siblings)
  13 siblings, 0 replies; 221+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-11-18 11:18 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, Junio C Hamano, Alban Gruin, Phillip Wood,
	Elijah Newren, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason

Add tests that stress the current behavior of the options parsing in
cmd_merge_index(), in preparation for moving it over to
parse_options().

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/t6060-merge-index.sh | 44 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 44 insertions(+)

diff --git a/t/t6060-merge-index.sh b/t/t6060-merge-index.sh
index 079151ee06d..edc03b41ab9 100755
--- a/t/t6060-merge-index.sh
+++ b/t/t6060-merge-index.sh
@@ -5,6 +5,50 @@ test_description='basic git merge-index / git-merge-one-file tests'
 TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
+test_expect_success 'usage: 1 argument' '
+	test_expect_code 129 git merge-index a >out 2>err &&
+	test_must_be_empty out &&
+	grep ^usage err
+'
+
+test_expect_success 'usage: 2 arguments' '
+	cat >expect <<-\EOF &&
+	fatal: git merge-index: b not in the cache
+	EOF
+	test_expect_code 128 git merge-index a b >out 2>actual &&
+	test_must_be_empty out &&
+	test_cmp expect actual
+'
+
+test_expect_success 'usage: -a before <program>' '
+	cat >expect <<-\EOF &&
+	fatal: git merge-index: b not in the cache
+	EOF
+	test_expect_code 128 git merge-index -a b program >out 2>actual &&
+	test_must_be_empty out &&
+	test_cmp expect actual
+'
+
+for opt in -q -o
+do
+	test_expect_success "usage: $opt after -a" '
+		cat >expect <<-EOF &&
+		fatal: git merge-index: unknown option $opt
+		EOF
+		test_expect_code 128 git merge-index -a $opt >out 2>actual &&
+		test_must_be_empty out &&
+		test_cmp expect actual
+	'
+
+	test_expect_success "usage: $opt program" '
+		test_expect_code 0 git merge-index $opt program
+	'
+done
+
+test_expect_success 'usage: program' '
+	test_expect_code 129 git merge-index program
+'
+
 test_expect_success 'setup diverging branches' '
 	test_write_lines 1 2 3 4 5 6 7 8 9 10 >file &&
 	cp file file2 &&
-- 
2.38.0.1511.gcdcff1f1dc2



* [PATCH v9 05/12] merge-index: migrate to parse_options() API
  2022-11-18 11:18               ` [PATCH v9 00/12] merge-index: prepare to rewrite merge drivers in C Ævar Arnfjörð Bjarmason
                                   ` (3 preceding siblings ...)
  2022-11-18 11:18                 ` [PATCH v9 04/12] merge-index tests: add usage tests Ævar Arnfjörð Bjarmason
@ 2022-11-18 11:18                 ` Ævar Arnfjörð Bjarmason
  2022-11-18 11:18                 ` [PATCH v9 06/12] merge-index: improve die() error messages Ævar Arnfjörð Bjarmason
                                   ` (8 subsequent siblings)
  13 siblings, 0 replies; 221+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-11-18 11:18 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, Junio C Hamano, Alban Gruin, Phillip Wood,
	Elijah Newren, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason

Migrate the "merge-index" command to the parse_options() API; a
preceding commit added tests for the existing behavior.

In a subsequent commit we'll adjust the behavior to be more consistent
with how most other commands work, but for now let's take pains to
preserve it as-is. We need to e.g. call parse_options() twice now, as
the "-a" option is currently only understood after "<merge-program>".
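
The two-pass shape described above can be sketched in plain C. This is
a simplified illustration, not git's parse_options() API: "toy_opts"
and "toy_parse" are hypothetical names, and unlike the real command it
only counts the <file> arguments instead of collecting them.

```c
#include <assert.h>
#include <string.h>

struct toy_opts {
	int one_shot;        /* -o */
	int quiet;           /* -q */
	int all;             /* -a, only accepted after <merge-program> */
	const char *program; /* <merge-program> */
	int nr_files;        /* number of <file> arguments seen */
};

/* Returns 0 on success, -1 on a usage error. argv[0] is the command. */
static int toy_parse(int argc, const char **argv, struct toy_opts *o)
{
	int i = 1;
	int force_file = 0;

	assert(argc >= 1 && argv && o);
	memset(o, 0, sizeof(*o));

	/* Pass 1: options before <merge-program>, stop at a non-option. */
	for (; i < argc && argv[i][0] == '-'; i++) {
		if (!strcmp(argv[i], "-o"))
			o->one_shot = 1;
		else if (!strcmp(argv[i], "-q"))
			o->quiet = 1;
		else
			return -1; /* includes "-a" before the program */
	}
	if (i == argc)
		return -1; /* need a <merge-program> argument */
	o->program = argv[i++];

	/* Pass 2: "-a" or "[--] <file>..." after the program. */
	for (; i < argc; i++) {
		if (!force_file && !strcmp(argv[i], "--"))
			force_file = 1;
		else if (!force_file && !strcmp(argv[i], "-a"))
			o->all = 1;
		else if (!force_file && argv[i][0] == '-')
			return -1; /* unknown option after the program */
		else
			o->nr_files++;
	}
	if (o->all && o->nr_files)
		return -1; /* '-a' and '<file>...' are mutually exclusive */
	return 0;
}
```

Splitting the work at <merge-program> is what lets "-a" be rejected in
the first pass and accepted in the second.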

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/merge-index.c  | 71 ++++++++++++++++++++++++++----------------
 git.c                  |  2 +-
 t/t6060-merge-index.sh | 10 +++---
 3 files changed, 51 insertions(+), 32 deletions(-)

diff --git a/builtin/merge-index.c b/builtin/merge-index.c
index 1a5a64afd2a..3bd0790465e 100644
--- a/builtin/merge-index.c
+++ b/builtin/merge-index.c
@@ -1,5 +1,6 @@
 #define USE_THE_INDEX_COMPATIBILITY_MACROS
 #include "builtin.h"
+#include "parse-options.h"
 #include "run-command.h"
 
 static const char *pgm;
@@ -72,7 +73,26 @@ static void merge_all(void)
 
 int cmd_merge_index(int argc, const char **argv, const char *prefix)
 {
-	int i, force_file = 0;
+	int all = 0;
+	const char * const usage[] = {
+		N_("git merge-index [-o] [-q] <merge-program> (-a | ([--] <file>...))"),
+		NULL
+	};
+#define OPT__MERGE_INDEX_ALL(v) \
+	OPT_BOOL('a', NULL, (v), \
+		 N_("merge all files in the index that need merging"))
+	struct option options[] = {
+		OPT_BOOL('o', NULL, &one_shot,
+			 N_("don't stop at the first failed merge")),
+		OPT__QUIET(&quiet, N_("be quiet")),
+		OPT__MERGE_INDEX_ALL(&all), /* include "-a" to show it in "-bh" */
+		OPT_END(),
+	};
+	struct option options_prog[] = {
+		OPT__MERGE_INDEX_ALL(&all),
+		OPT_END(),
+	};
+#undef OPT__MERGE_INDEX_ALL
 
 	/* Without this we cannot rely on waitpid() to tell
 	 * what happened to our children.
@@ -80,38 +100,35 @@ int cmd_merge_index(int argc, const char **argv, const char *prefix)
 	signal(SIGCHLD, SIG_DFL);
 
 	if (argc < 3)
-		usage("git merge-index [-o] [-q] <merge-program> (-a | ([--] <file>...))");
+		usage_with_options(usage, options);
+
+	/* Option parsing without <merge-program> options */
+	argc = parse_options(argc, argv, prefix, options, usage,
+			     PARSE_OPT_STOP_AT_NON_OPTION);
+	if (all)
+		usage_msg_optf(_("'%s' option can only be provided after '<merge-program>'"),
+			      usage, options, "-a");
+	/* <merge-program> and its options */
+	if (!argc)
+		usage_msg_opt(_("need a <merge-program> argument"), usage, options);
+	pgm = argv[0];
+	argc = parse_options(argc, argv, prefix, options_prog, usage, 0);
+	if (argc && all)
+		usage_msg_opt(_("'-a' and '<file>...' are mutually exclusive"),
+			      usage, options);
 
 	read_cache();
 
 	/* TODO: audit for interaction with sparse-index. */
 	ensure_full_index(&the_index);
 
-	i = 1;
-	if (!strcmp(argv[i], "-o")) {
-		one_shot = 1;
-		i++;
-	}
-	if (!strcmp(argv[i], "-q")) {
-		quiet = 1;
-		i++;
-	}
-	pgm = argv[i++];
-	for (; i < argc; i++) {
-		const char *arg = argv[i];
-		if (!force_file && *arg == '-') {
-			if (!strcmp(arg, "--")) {
-				force_file = 1;
-				continue;
-			}
-			if (!strcmp(arg, "-a")) {
-				merge_all();
-				continue;
-			}
-			die("git merge-index: unknown option %s", arg);
-		}
-		merge_one_path(arg);
-	}
+
+	if (all)
+		merge_all();
+	else
+		for (size_t i = 0; i < argc; i++)
+			merge_one_path(argv[i]);
+
 	if (err && !quiet)
 		die("merge program failed");
 	return err;
diff --git a/git.c b/git.c
index 6662548986f..83696fd8b4a 100644
--- a/git.c
+++ b/git.c
@@ -560,7 +560,7 @@ static struct cmd_struct commands[] = {
 	{ "merge", cmd_merge, RUN_SETUP | NEED_WORK_TREE },
 	{ "merge-base", cmd_merge_base, RUN_SETUP },
 	{ "merge-file", cmd_merge_file, RUN_SETUP_GENTLY },
-	{ "merge-index", cmd_merge_index, RUN_SETUP | NO_PARSEOPT },
+	{ "merge-index", cmd_merge_index, RUN_SETUP },
 	{ "merge-ours", cmd_merge_ours, RUN_SETUP | NO_PARSEOPT },
 	{ "merge-recursive", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
 	{ "merge-recursive-ours", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
diff --git a/t/t6060-merge-index.sh b/t/t6060-merge-index.sh
index edc03b41ab9..6c59e7bc4e5 100755
--- a/t/t6060-merge-index.sh
+++ b/t/t6060-merge-index.sh
@@ -22,9 +22,10 @@ test_expect_success 'usage: 2 arguments' '
 
 test_expect_success 'usage: -a before <program>' '
 	cat >expect <<-\EOF &&
-	fatal: git merge-index: b not in the cache
+	fatal: '\''-a'\'' option can only be provided after '\''<merge-program>'\''
 	EOF
-	test_expect_code 128 git merge-index -a b program >out 2>actual &&
+	test_expect_code 129 git merge-index -a b program >out 2>actual.raw &&
+	grep "^fatal:" actual.raw >actual &&
 	test_must_be_empty out &&
 	test_cmp expect actual
 '
@@ -33,9 +34,10 @@ for opt in -q -o
 do
 	test_expect_success "usage: $opt after -a" '
 		cat >expect <<-EOF &&
-		fatal: git merge-index: unknown option $opt
+		fatal: '\''-a'\'' option can only be provided after '\''<merge-program>'\''
 		EOF
-		test_expect_code 128 git merge-index -a $opt >out 2>actual &&
+		test_expect_code 129 git merge-index -a $opt >out 2>actual.raw &&
+		grep "^fatal:" actual.raw >actual &&
 		test_must_be_empty out &&
 		test_cmp expect actual
 	'
-- 
2.38.0.1511.gcdcff1f1dc2



* [PATCH v9 06/12] merge-index: improve die() error messages
  2022-11-18 11:18               ` [PATCH v9 00/12] merge-index: prepare to rewrite merge drivers in C Ævar Arnfjörð Bjarmason
                                   ` (4 preceding siblings ...)
  2022-11-18 11:18                 ` [PATCH v9 05/12] merge-index: migrate to parse_options() API Ævar Arnfjörð Bjarmason
@ 2022-11-18 11:18                 ` Ævar Arnfjörð Bjarmason
  2022-11-18 11:18                 ` [PATCH v9 07/12] merge-index i18n: mark die() messages for translation Ævar Arnfjörð Bjarmason
                                   ` (7 subsequent siblings)
  13 siblings, 0 replies; 221+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-11-18 11:18 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, Junio C Hamano, Alban Gruin, Phillip Wood,
	Elijah Newren, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason

From: Alban Gruin <alban.gruin@gmail.com>

Our usual convention is not to repeat the program name back at the
user, and to quote path arguments. Let's do that now to reduce the
size of the subsequent commit.

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/merge-index.c  | 4 ++--
 t/t6060-merge-index.sh | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/builtin/merge-index.c b/builtin/merge-index.c
index 3bd0790465e..0b06c69354b 100644
--- a/builtin/merge-index.c
+++ b/builtin/merge-index.c
@@ -16,7 +16,7 @@ static int merge_entry(int pos, const char *path)
 	struct child_process cmd = CHILD_PROCESS_INIT;
 
 	if (pos >= active_nr)
-		die("git merge-index: %s not in the cache", path);
+		die("'%s' is not in the cache", path);
 	found = 0;
 	do {
 		const struct cache_entry *ce = active_cache[pos];
@@ -31,7 +31,7 @@ static int merge_entry(int pos, const char *path)
 		arguments[stage + 4] = ownbuf[stage];
 	} while (++pos < active_nr);
 	if (!found)
-		die("git merge-index: %s not in the cache", path);
+		die("'%s' is not in the cache", path);
 
 	strvec_pushv(&cmd.args, arguments);
 	if (run_command(&cmd)) {
diff --git a/t/t6060-merge-index.sh b/t/t6060-merge-index.sh
index 6c59e7bc4e5..bc201a69552 100755
--- a/t/t6060-merge-index.sh
+++ b/t/t6060-merge-index.sh
@@ -13,7 +13,7 @@ test_expect_success 'usage: 1 argument' '
 
 test_expect_success 'usage: 2 arguments' '
 	cat >expect <<-\EOF &&
-	fatal: git merge-index: b not in the cache
+	fatal: '\''b'\'' is not in the cache
 	EOF
 	test_expect_code 128 git merge-index a b >out 2>actual &&
 	test_must_be_empty out &&
-- 
2.38.0.1511.gcdcff1f1dc2



* [PATCH v9 07/12] merge-index i18n: mark die() messages for translation
  2022-11-18 11:18               ` [PATCH v9 00/12] merge-index: prepare to rewrite merge drivers in C Ævar Arnfjörð Bjarmason
                                   ` (5 preceding siblings ...)
  2022-11-18 11:18                 ` [PATCH v9 06/12] merge-index: improve die() error messages Ævar Arnfjörð Bjarmason
@ 2022-11-18 11:18                 ` Ævar Arnfjörð Bjarmason
  2022-11-18 11:18                 ` [PATCH v9 08/12] merge-index: stop calling ensure_full_index() twice Ævar Arnfjörð Bjarmason
                                   ` (6 subsequent siblings)
  13 siblings, 0 replies; 221+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-11-18 11:18 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, Junio C Hamano, Alban Gruin, Phillip Wood,
	Elijah Newren, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason

Mark the die() messages for translation with _(). We don't rely on the
specifics of these messages as plumbing, so they can be safely
translated.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/merge-index.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/builtin/merge-index.c b/builtin/merge-index.c
index 0b06c69354b..ee48587a8fb 100644
--- a/builtin/merge-index.c
+++ b/builtin/merge-index.c
@@ -16,7 +16,7 @@ static int merge_entry(int pos, const char *path)
 	struct child_process cmd = CHILD_PROCESS_INIT;
 
 	if (pos >= active_nr)
-		die("'%s' is not in the cache", path);
+		die(_("'%s' is not in the cache"), path);
 	found = 0;
 	do {
 		const struct cache_entry *ce = active_cache[pos];
@@ -31,7 +31,7 @@ static int merge_entry(int pos, const char *path)
 		arguments[stage + 4] = ownbuf[stage];
 	} while (++pos < active_nr);
 	if (!found)
-		die("'%s' is not in the cache", path);
+		die(_("'%s' is not in the cache"), path);
 
 	strvec_pushv(&cmd.args, arguments);
 	if (run_command(&cmd)) {
@@ -39,7 +39,7 @@ static int merge_entry(int pos, const char *path)
 			err++;
 		else {
 			if (!quiet)
-				die("merge program failed");
+				die(_("merge program failed"));
 			exit(1);
 		}
 	}
@@ -130,6 +130,6 @@ int cmd_merge_index(int argc, const char **argv, const char *prefix)
 			merge_one_path(argv[i]);
 
 	if (err && !quiet)
-		die("merge program failed");
+		die(_("merge program failed"));
 	return err;
 }
-- 
2.38.0.1511.gcdcff1f1dc2



* [PATCH v9 08/12] merge-index: stop calling ensure_full_index() twice
  2022-11-18 11:18               ` [PATCH v9 00/12] merge-index: prepare to rewrite merge drivers in C Ævar Arnfjörð Bjarmason
                                   ` (6 preceding siblings ...)
  2022-11-18 11:18                 ` [PATCH v9 07/12] merge-index i18n: mark die() messages for translation Ævar Arnfjörð Bjarmason
@ 2022-11-18 11:18                 ` Ævar Arnfjörð Bjarmason
  2022-11-18 11:18                 ` [PATCH v9 09/12] builtin/merge-index.c: don't USE_THE_INDEX_COMPATIBILITY_MACROS Ævar Arnfjörð Bjarmason
                                   ` (5 subsequent siblings)
  13 siblings, 0 replies; 221+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-11-18 11:18 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, Junio C Hamano, Alban Gruin, Phillip Wood,
	Elijah Newren, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason

When most of the ensure_full_index() calls were added in
8e97852919f (Merge branch 'ds/sparse-index-protections', 2021-04-30)
we could add them at the start of cmd_*() for built-ins, but in some
cases we couldn't do that, as we'd only want to initialize the index
conditionally on some branches in the code.

But this code added in 299e2c4561b (merge-index: ensure full index,
2021-04-01) (part of 8e97852919f) isn't such a case. The merge_all()
function is only called by cmd_merge_index(), which before calling it
will have called ensure_full_index() unconditionally.

We can therefore skip this. While we're at it, and mainly so that
we'll see the relevant code in the context, let's fix a minor
whitespace issue that the addition of the ensure_full_index() call in
299e2c4561b introduced.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/merge-index.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/builtin/merge-index.c b/builtin/merge-index.c
index ee48587a8fb..9bffcc5b0f1 100644
--- a/builtin/merge-index.c
+++ b/builtin/merge-index.c
@@ -61,8 +61,7 @@ static void merge_one_path(const char *path)
 static void merge_all(void)
 {
 	int i;
-	/* TODO: audit for interaction with sparse-index. */
-	ensure_full_index(&the_index);
+
 	for (i = 0; i < active_nr; i++) {
 		const struct cache_entry *ce = active_cache[i];
 		if (!ce_stage(ce))
@@ -122,7 +121,6 @@ int cmd_merge_index(int argc, const char **argv, const char *prefix)
 	/* TODO: audit for interaction with sparse-index. */
 	ensure_full_index(&the_index);
 
-
 	if (all)
 		merge_all();
 	else
-- 
2.38.0.1511.gcdcff1f1dc2



* [PATCH v9 09/12] builtin/merge-index.c: don't USE_THE_INDEX_COMPATIBILITY_MACROS
  2022-11-18 11:18               ` [PATCH v9 00/12] merge-index: prepare to rewrite merge drivers in C Ævar Arnfjörð Bjarmason
                                   ` (7 preceding siblings ...)
  2022-11-18 11:18                 ` [PATCH v9 08/12] merge-index: stop calling ensure_full_index() twice Ævar Arnfjörð Bjarmason
@ 2022-11-18 11:18                 ` Ævar Arnfjörð Bjarmason
  2022-11-18 11:18                 ` [PATCH v9 10/12] merge-index: libify merge_one_path() and merge_all() Ævar Arnfjörð Bjarmason
                                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 221+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-11-18 11:18 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, Junio C Hamano, Alban Gruin, Phillip Wood,
	Elijah Newren, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason

Remove "USE_THE_INDEX_COMPATIBILITY_MACROS" and instead pass
"the_index" around between the functions in this file. In a subsequent
commit we'll libify this, and don't want to use
"USE_THE_INDEX_COMPATIBILITY_MACROS" in any more places in the
top-level *.c files. Doing this first makes that diff a lot smaller.
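
As a rough illustration of the pattern (a sketch only, with
hypothetical "toy_" types rather than git's "struct index_state"), a
helper that takes its index as a parameter works on whichever index
the caller passes in, instead of being tied to a single global:

```c
#include <assert.h>

struct toy_entry {
	const char *name;
	int stage; /* 0 = merged, 1..3 = unmerged stages */
};

struct toy_index {
	struct toy_entry *cache;
	unsigned int cache_nr;
};

/* Count the unmerged entries in the index the caller hands us. */
static unsigned int toy_count_unmerged(const struct toy_index *istate)
{
	unsigned int n = 0;

	assert(istate);
	for (unsigned int i = 0; i < istate->cache_nr; i++)
		if (istate->cache[i].stage)
			n++;
	return n;
}
```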

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/merge-index.c | 31 +++++++++++++++----------------
 1 file changed, 15 insertions(+), 16 deletions(-)

diff --git a/builtin/merge-index.c b/builtin/merge-index.c
index 9bffcc5b0f1..c269d76cc8f 100644
--- a/builtin/merge-index.c
+++ b/builtin/merge-index.c
@@ -1,4 +1,3 @@
-#define USE_THE_INDEX_COMPATIBILITY_MACROS
 #include "builtin.h"
 #include "parse-options.h"
 #include "run-command.h"
@@ -7,7 +6,7 @@ static const char *pgm;
 static int one_shot, quiet;
 static int err;
 
-static int merge_entry(int pos, const char *path)
+static int merge_entry(struct index_state *istate, int pos, const char *path)
 {
 	int found;
 	const char *arguments[] = { pgm, "", "", "", path, "", "", "", NULL };
@@ -15,11 +14,11 @@ static int merge_entry(int pos, const char *path)
 	char ownbuf[4][60];
 	struct child_process cmd = CHILD_PROCESS_INIT;
 
-	if (pos >= active_nr)
+	if (pos >= istate->cache_nr)
 		die(_("'%s' is not in the cache"), path);
 	found = 0;
 	do {
-		const struct cache_entry *ce = active_cache[pos];
+		const struct cache_entry *ce = istate->cache[pos];
 		int stage = ce_stage(ce);
 
 		if (strcmp(ce->name, path))
@@ -29,7 +28,7 @@ static int merge_entry(int pos, const char *path)
 		xsnprintf(ownbuf[stage], sizeof(ownbuf[stage]), "%o", ce->ce_mode);
 		arguments[stage] = hexbuf[stage];
 		arguments[stage + 4] = ownbuf[stage];
-	} while (++pos < active_nr);
+	} while (++pos < istate->cache_nr);
 	if (!found)
 		die(_("'%s' is not in the cache"), path);
 
@@ -46,27 +45,27 @@ static int merge_entry(int pos, const char *path)
 	return found;
 }
 
-static void merge_one_path(const char *path)
+static void merge_one_path(struct index_state *istate, const char *path)
 {
-	int pos = cache_name_pos(path, strlen(path));
+	int pos = index_name_pos(istate, path, strlen(path));
 
 	/*
 	 * If it already exists in the cache as stage0, it's
 	 * already merged and there is nothing to do.
 	 */
 	if (pos < 0)
-		merge_entry(-pos-1, path);
+		merge_entry(istate, -pos-1, path);
 }
 
-static void merge_all(void)
+static void merge_all(struct index_state *istate)
 {
 	int i;
 
-	for (i = 0; i < active_nr; i++) {
-		const struct cache_entry *ce = active_cache[i];
+	for (i = 0; i < istate->cache_nr; i++) {
+		const struct cache_entry *ce = istate->cache[i];
 		if (!ce_stage(ce))
 			continue;
-		i += merge_entry(i, ce->name)-1;
+		i += merge_entry(istate, i, ce->name)-1;
 	}
 }
 
@@ -116,16 +115,16 @@ int cmd_merge_index(int argc, const char **argv, const char *prefix)
 		usage_msg_opt(_("'-a' and '<file>...' are mutually exclusive"),
 			      usage, options);
 
-	read_cache();
+	repo_read_index(the_repository);
 
 	/* TODO: audit for interaction with sparse-index. */
-	ensure_full_index(&the_index);
+	ensure_full_index(the_repository->index);
 
 	if (all)
-		merge_all();
+		merge_all(the_repository->index);
 	else
 		for (size_t i = 0; i < argc; i++)
-			merge_one_path(argv[i]);
+			merge_one_path(the_repository->index, argv[i]);
 
 	if (err && !quiet)
 		die(_("merge program failed"));
-- 
2.38.0.1511.gcdcff1f1dc2



* [PATCH v9 10/12] merge-index: libify merge_one_path() and merge_all()
  2022-11-18 11:18               ` [PATCH v9 00/12] merge-index: prepare to rewrite merge drivers in C Ævar Arnfjörð Bjarmason
                                   ` (8 preceding siblings ...)
  2022-11-18 11:18                 ` [PATCH v9 09/12] builtin/merge-index.c: don't USE_THE_INDEX_COMPATIBILITY_MACROS Ævar Arnfjörð Bjarmason
@ 2022-11-18 11:18                 ` Ævar Arnfjörð Bjarmason
  2022-11-18 11:18                 ` [PATCH v9 11/12] merge-index: use "struct strvec" and helper to prepare args Ævar Arnfjörð Bjarmason
                                   ` (3 subsequent siblings)
  13 siblings, 0 replies; 221+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-11-18 11:18 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, Junio C Hamano, Alban Gruin, Phillip Wood,
	Elijah Newren, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason

From: Alban Gruin <alban.gruin@gmail.com>

Move the workhorse functions in "builtin/merge-index.c" into a new
"merge-strategies" library, and mostly "libify" the code while doing
so.

Eventually this will allow us to invoke merge strategies such as
"resolve" and "octopus" in-process, once we've followed up and
replaced "git-merge-{resolve,octopus}.sh" etc.

But for now let's move this code, while trying to optimize for as much
of it as possible being highlighted by the diff rename detection.

We still call die() in this library. An earlier version of this[1]
converted these to "error()", but the problem with that is that we'd
then potentially run into the same error N times, e.g. once for every
"<file>" we were asked to operate on, instead of dying on the first
case. So let's leave those as "die()" for now.
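
The callback-driven shape of the library can be sketched with toy
types. This is an assumption-laden illustration: the "toy_" names are
hypothetical, blobs are plain strings rather than object IDs, and a
missing path simply returns 0 here where the real code calls die().

```c
#include <assert.h>
#include <string.h>

struct toy_entry {
	const char *name;
	const char *blob;  /* stand-in for an object ID */
	unsigned int mode;
	int stage;         /* 1 = base, 2 = ours, 3 = theirs */
};

/* Per-path callback: returns non-zero when the merge failed. */
typedef int (*toy_merge_fn)(const char *orig, const char *ours,
			    const char *theirs, const char *path,
			    unsigned int orig_mode, unsigned int our_mode,
			    unsigned int their_mode, void *data);

/*
 * Gather the stages of `path` starting at `pos`, hand them to `fn`, and
 * report a failure through `*err`. Returns the number of entries used.
 */
static int toy_merge_entry(const struct toy_entry *e, unsigned int nr,
			   unsigned int pos, const char *path, int *err,
			   toy_merge_fn fn, void *data)
{
	const char *blobs[3] = { NULL };
	unsigned int modes[3] = { 0 };
	int found = 0;

	assert(e && path && err && fn);
	*err = 0;
	for (; pos < nr && !strcmp(e[pos].name, path); pos++) {
		int stage = e[pos].stage;

		assert(stage >= 1 && stage <= 3);
		blobs[stage - 1] = e[pos].blob;
		modes[stage - 1] = e[pos].mode;
		found++;
	}
	if (found && fn(blobs[0], blobs[1], blobs[2], path,
			modes[0], modes[1], modes[2], data))
		(*err)++;
	return found;
}

/* Example callback: counts calls via `data`, fails without "theirs". */
static int toy_count_calls(const char *orig, const char *ours,
			   const char *theirs, const char *path,
			   unsigned int orig_mode, unsigned int our_mode,
			   unsigned int their_mode, void *data)
{
	int *calls = data;

	(void)orig; (void)ours; (void)path;
	(void)orig_mode; (void)our_mode; (void)their_mode;
	(*calls)++;
	return theirs ? 0 : 1;
}
```

The "void *data" argument plays the role that file-scope state such as
the program name played before: the caller owns it and the driver only
passes it through.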

1. https://lore.kernel.org/git/20220809185429.20098-4-alban.gruin@gmail.com/

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 Makefile              |  1 +
 builtin/merge-index.c | 95 ++++++++++++++++---------------------------
 merge-strategies.c    | 87 +++++++++++++++++++++++++++++++++++++++
 merge-strategies.h    | 19 +++++++++
 4 files changed, 142 insertions(+), 60 deletions(-)
 create mode 100644 merge-strategies.c
 create mode 100644 merge-strategies.h

diff --git a/Makefile b/Makefile
index 4927379184c..ccd467cec79 100644
--- a/Makefile
+++ b/Makefile
@@ -1000,6 +1000,7 @@ LIB_OBJS += merge-blobs.o
 LIB_OBJS += merge-ort.o
 LIB_OBJS += merge-ort-wrappers.o
 LIB_OBJS += merge-recursive.o
+LIB_OBJS += merge-strategies.o
 LIB_OBJS += merge.o
 LIB_OBJS += midx.o
 LIB_OBJS += name-hash.o
diff --git a/builtin/merge-index.c b/builtin/merge-index.c
index c269d76cc8f..21598a52383 100644
--- a/builtin/merge-index.c
+++ b/builtin/merge-index.c
@@ -1,77 +1,50 @@
 #include "builtin.h"
 #include "parse-options.h"
+#include "merge-strategies.h"
 #include "run-command.h"
 
-static const char *pgm;
-static int one_shot, quiet;
-static int err;
+struct mofs_data {
+	const char *program;
+};
 
-static int merge_entry(struct index_state *istate, int pos, const char *path)
+static int merge_one_file(struct index_state *istate,
+			  const struct object_id *orig_blob,
+			  const struct object_id *our_blob,
+			  const struct object_id *their_blob, const char *path,
+			  unsigned int orig_mode, unsigned int our_mode,
+			  unsigned int their_mode, void *data)
 {
-	int found;
+	struct mofs_data *d = data;
+	const char *pgm = d->program;
 	const char *arguments[] = { pgm, "", "", "", path, "", "", "", NULL };
 	char hexbuf[4][GIT_MAX_HEXSZ + 1];
 	char ownbuf[4][60];
+	int stage = 0;
 	struct child_process cmd = CHILD_PROCESS_INIT;
 
-	if (pos >= istate->cache_nr)
-		die(_("'%s' is not in the cache"), path);
-	found = 0;
-	do {
-		const struct cache_entry *ce = istate->cache[pos];
-		int stage = ce_stage(ce);
-
-		if (strcmp(ce->name, path))
-			break;
-		found++;
-		oid_to_hex_r(hexbuf[stage], &ce->oid);
-		xsnprintf(ownbuf[stage], sizeof(ownbuf[stage]), "%o", ce->ce_mode);
-		arguments[stage] = hexbuf[stage];
-		arguments[stage + 4] = ownbuf[stage];
-	} while (++pos < istate->cache_nr);
-	if (!found)
-		die(_("'%s' is not in the cache"), path);
-
-	strvec_pushv(&cmd.args, arguments);
-	if (run_command(&cmd)) {
-		if (one_shot)
-			err++;
-		else {
-			if (!quiet)
-				die(_("merge program failed"));
-			exit(1);
-		}
+#define ADD_MOF_ARG(oid, mode) \
+	if ((oid)) { \
+		stage++; \
+		oid_to_hex_r(hexbuf[stage], (oid)); \
+		xsnprintf(ownbuf[stage], sizeof(ownbuf[stage]), "%06o", (mode)); \
+		arguments[stage] = hexbuf[stage]; \
+		arguments[stage + 4] = ownbuf[stage]; \
 	}
-	return found;
-}
-
-static void merge_one_path(struct index_state *istate, const char *path)
-{
-	int pos = index_name_pos(istate, path, strlen(path));
 
-	/*
-	 * If it already exists in the cache as stage0, it's
-	 * already merged and there is nothing to do.
-	 */
-	if (pos < 0)
-		merge_entry(istate, -pos-1, path);
-}
-
-static void merge_all(struct index_state *istate)
-{
-	int i;
+	ADD_MOF_ARG(orig_blob, orig_mode);
+	ADD_MOF_ARG(our_blob, our_mode);
+	ADD_MOF_ARG(their_blob, their_mode);
 
-	for (i = 0; i < istate->cache_nr; i++) {
-		const struct cache_entry *ce = istate->cache[i];
-		if (!ce_stage(ce))
-			continue;
-		i += merge_entry(istate, i, ce->name)-1;
-	}
+	strvec_pushv(&cmd.args, arguments);
+	return run_command(&cmd);
 }
 
 int cmd_merge_index(int argc, const char **argv, const char *prefix)
 {
+	int err = 0;
 	int all = 0;
+	int one_shot = 0;
+	int quiet = 0;
 	const char * const usage[] = {
 		N_("git merge-index [-o] [-q] <merge-program> (-a | ([--] <file>...))"),
 		NULL
@@ -91,6 +64,7 @@ int cmd_merge_index(int argc, const char **argv, const char *prefix)
 		OPT_END(),
 	};
 #undef OPT__MERGE_INDEX_ALL
+	struct mofs_data data = { 0 };
 
 	/* Without this we cannot rely on waitpid() to tell
 	 * what happened to our children.
@@ -109,7 +83,7 @@ int cmd_merge_index(int argc, const char **argv, const char *prefix)
 	/* <merge-program> and its options */
 	if (!argc)
 		usage_msg_opt(_("need a <merge-program> argument"), usage, options);
-	pgm = argv[0];
+	data.program = argv[0];
 	argc = parse_options(argc, argv, prefix, options_prog, usage, 0);
 	if (argc && all)
 		usage_msg_opt(_("'-a' and '<file>...' are mutually exclusive"),
@@ -121,12 +95,13 @@ int cmd_merge_index(int argc, const char **argv, const char *prefix)
 	ensure_full_index(the_repository->index);
 
 	if (all)
-		merge_all(the_repository->index);
+		err |= merge_all_index(the_repository->index, one_shot, quiet,
+				       merge_one_file, &data);
 	else
 		for (size_t i = 0; i < argc; i++)
-			merge_one_path(the_repository->index, argv[i]);
+			err |= merge_index_path(the_repository->index,
+						one_shot, quiet, argv[i],
+						merge_one_file, &data);
 
-	if (err && !quiet)
-		die(_("merge program failed"));
 	return err;
 }
diff --git a/merge-strategies.c b/merge-strategies.c
new file mode 100644
index 00000000000..30691fccd77
--- /dev/null
+++ b/merge-strategies.c
@@ -0,0 +1,87 @@
+#include "cache.h"
+#include "merge-strategies.h"
+
+static int merge_entry(struct index_state *istate, unsigned int pos,
+		       const char *path, int *err, merge_index_fn fn,
+		       void *data)
+{
+	int found = 0;
+	const struct object_id *oids[3] = { 0 };
+	unsigned int modes[3] = { 0 };
+
+	*err = 0;
+
+	if (pos >= istate->cache_nr)
+		die(_("'%s' is not in the cache"), path);
+	do {
+		const struct cache_entry *ce = istate->cache[pos];
+		int stage = ce_stage(ce);
+
+		if (strcmp(ce->name, path))
+			break;
+		found++;
+		oids[stage - 1] = &ce->oid;
+		modes[stage - 1] = ce->ce_mode;
+	} while (++pos < istate->cache_nr);
+	if (!found)
+		die(_("'%s' is not in the cache"), path);
+
+	if (fn(istate, oids[0], oids[1], oids[2], path, modes[0], modes[1],
+	       modes[2], data))
+		(*err)++;
+
+	return found;
+}
+
+int merge_index_path(struct index_state *istate, int oneshot, int quiet,
+		     const char *path, merge_index_fn fn, void *data)
+{
+	int err, ret;
+	int pos = index_name_pos(istate, path, strlen(path));
+
+	/*
+	 * If it already exists in the cache as stage0, it's
+	 * already merged and there is nothing to do.
+	 */
+	if (pos >= 0)
+		return 0;
+
+	ret = merge_entry(istate, -pos - 1, path, &err, fn, data);
+	if (ret < 0)
+		return ret;
+	if (err) {
+		if (!quiet && !oneshot)
+			die(_("merge program failed"));
+		return 1;
+	}
+	return 0;
+}
+
+int merge_all_index(struct index_state *istate, int oneshot, int quiet,
+		    merge_index_fn fn, void *data)
+{
+	int err, ret;
+	unsigned int i;
+
+	for (i = 0; i < istate->cache_nr; i++) {
+		const struct cache_entry *ce = istate->cache[i];
+		if (!ce_stage(ce))
+			continue;
+
+		ret = merge_entry(istate, i, ce->name, &err, fn, data);
+		if (ret < 0)
+			return ret;
+		else if (ret > 0)
+			i += ret - 1;
+
+		if (err && !oneshot) {
+			if (!quiet)
+				die(_("merge program failed"));
+			return 1;
+		}
+	}
+
+	if (err && !quiet)
+		die(_("merge program failed"));
+	return err;
+}
diff --git a/merge-strategies.h b/merge-strategies.h
new file mode 100644
index 00000000000..cee9168a046
--- /dev/null
+++ b/merge-strategies.h
@@ -0,0 +1,19 @@
+#ifndef MERGE_STRATEGIES_H
+#define MERGE_STRATEGIES_H
+
+struct object_id;
+struct index_state;
+typedef int (*merge_index_fn)(struct index_state *istate,
+			      const struct object_id *orig_blob,
+			      const struct object_id *our_blob,
+			      const struct object_id *their_blob,
+			      const char *path, unsigned int orig_mode,
+			      unsigned int our_mode, unsigned int their_mode,
+			      void *data);
+
+int merge_index_path(struct index_state *istate, int oneshot, int quiet,
+		     const char *path, merge_index_fn fn, void *data);
+int merge_all_index(struct index_state *istate, int oneshot, int quiet,
+		    merge_index_fn fn, void *data);
+
+#endif /* MERGE_STRATEGIES_H */
-- 
2.38.0.1511.gcdcff1f1dc2


^ permalink raw reply related	[flat|nested] 221+ messages in thread

* [PATCH v9 11/12] merge-index: use "struct strvec" and helper to prepare args
  2022-11-18 11:18               ` [PATCH v9 00/12] merge-index: prepare to rewrite merge drivers in C Ævar Arnfjörð Bjarmason
                                   ` (9 preceding siblings ...)
  2022-11-18 11:18                 ` [PATCH v9 10/12] merge-index: libify merge_one_path() and merge_all() Ævar Arnfjörð Bjarmason
@ 2022-11-18 11:18                 ` Ævar Arnfjörð Bjarmason
  2022-11-18 11:18                 ` [PATCH v9 12/12] merge-index: make the argument parsing sensible & simpler Ævar Arnfjörð Bjarmason
                                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 221+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-11-18 11:18 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, Junio C Hamano, Alban Gruin, Phillip Wood,
	Elijah Newren, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason

Refactor the code that was libified in the preceding commit to use
strvec_pushf() with a helper function, instead of in-place xsnprintf()
code that we generate with a macro.

This is less efficient in terms of the number of allocations we do, but
it's now much clearer what's going on. The logic is simply that we
have an argument list like:

	<merge-program> <oids> <path> <modes>

Where we always need either an OID/mode pair, or "". Now we'll add
both to their own strvec, which we then combine at the end.
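
The layout described above can be sketched in plain C without git's
strvec API. This is only an illustration: "struct side" and
build_args() are hypothetical names that do not appear in the patch,
and fixed-size buffers stand in for the two strvecs. An absent side
contributes "" in both its OID slot and its mode slot, so the path
always lands at index 4.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for one side of the merge (base, ours, theirs). */
struct side {
	int has_blob;		/* 0 means this side is absent */
	const char *hex;	/* hex object name, used only if has_blob */
	unsigned int mode;	/* file mode, used only if has_blob */
};

/*
 * Fill argv as: <program> <oid> <oid> <oid> <path> <mode> <mode> <mode> NULL
 * modebuf provides storage for the formatted octal modes.
 */
static void build_args(const char *argv[9], char modebuf[3][8],
		       const char *program, const char *path,
		       const struct side sides[3])
{
	int i;

	assert(program && path);

	argv[0] = program;
	argv[4] = path;
	argv[8] = NULL;

	for (i = 0; i < 3; i++) {
		if (sides[i].has_blob) {
			/* Same "%06o" format the patch uses for modes. */
			snprintf(modebuf[i], 8, "%06o", sides[i].mode);
			argv[1 + i] = sides[i].hex;
			argv[5 + i] = modebuf[i];
		} else {
			/* Absent side: empty OID and empty mode. */
			argv[1 + i] = "";
			argv[5 + i] = "";
		}
	}
}
```

A caller would declare "const char *argv[9]" and "char modebuf[3][8]",
fill three "struct side" values and hand the resulting argv to whatever
spawns the merge program.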

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/merge-index.c | 44 ++++++++++++++++++++++++++-----------------
 1 file changed, 27 insertions(+), 17 deletions(-)

diff --git a/builtin/merge-index.c b/builtin/merge-index.c
index 21598a52383..d679272391b 100644
--- a/builtin/merge-index.c
+++ b/builtin/merge-index.c
@@ -7,6 +7,18 @@ struct mofs_data {
 	const char *program;
 };
 
+static void push_arg(struct strvec *oids, struct strvec *modes,
+		     const struct object_id *oid, const unsigned int mode)
+{
+	if (oid) {
+		strvec_push(oids, oid_to_hex(oid));
+		strvec_pushf(modes, "%06o", mode);
+	} else {
+		strvec_push(oids, "");
+		strvec_push(modes, "");
+	}
+}
+
 static int merge_one_file(struct index_state *istate,
 			  const struct object_id *orig_blob,
 			  const struct object_id *our_blob,
@@ -15,27 +27,25 @@ static int merge_one_file(struct index_state *istate,
 			  unsigned int their_mode, void *data)
 {
 	struct mofs_data *d = data;
-	const char *pgm = d->program;
-	const char *arguments[] = { pgm, "", "", "", path, "", "", "", NULL };
-	char hexbuf[4][GIT_MAX_HEXSZ + 1];
-	char ownbuf[4][60];
-	int stage = 0;
+	const char *program = d->program;
+	struct strvec oids = STRVEC_INIT;
+	struct strvec modes = STRVEC_INIT;
 	struct child_process cmd = CHILD_PROCESS_INIT;
 
-#define ADD_MOF_ARG(oid, mode) \
-	if ((oid)) { \
-		stage++; \
-		oid_to_hex_r(hexbuf[stage], (oid)); \
-		xsnprintf(ownbuf[stage], sizeof(ownbuf[stage]), "%06o", (mode)); \
-		arguments[stage] = hexbuf[stage]; \
-		arguments[stage + 4] = ownbuf[stage]; \
-	}
+	strvec_push(&cmd.args, program);
+
+	push_arg(&oids, &modes, orig_blob, orig_mode);
+	push_arg(&oids, &modes, our_blob, our_mode);
+	push_arg(&oids, &modes, their_blob, their_mode);
+
+	strvec_pushv(&cmd.args, oids.v);
+	strvec_clear(&oids);
+
+	strvec_push(&cmd.args, path);
 
-	ADD_MOF_ARG(orig_blob, orig_mode);
-	ADD_MOF_ARG(our_blob, our_mode);
-	ADD_MOF_ARG(their_blob, their_mode);
+	strvec_pushv(&cmd.args, modes.v);
+	strvec_clear(&modes);
 
-	strvec_pushv(&cmd.args, arguments);
 	return run_command(&cmd);
 }
 
-- 
2.38.0.1511.gcdcff1f1dc2


^ permalink raw reply related	[flat|nested] 221+ messages in thread

* [PATCH v9 12/12] merge-index: make the argument parsing sensible & simpler
  2022-11-18 11:18               ` [PATCH v9 00/12] merge-index: prepare to rewrite merge drivers in C Ævar Arnfjörð Bjarmason
                                   ` (10 preceding siblings ...)
  2022-11-18 11:18                 ` [PATCH v9 11/12] merge-index: use "struct strvec" and helper to prepare args Ævar Arnfjörð Bjarmason
@ 2022-11-18 11:18                 ` Ævar Arnfjörð Bjarmason
  2022-11-18 23:30                 ` [PATCH v9 00/12] merge-index: prepare to rewrite merge drivers in C Taylor Blau
  2022-12-15  8:52                 ` [PATCH v10 " Ævar Arnfjörð Bjarmason
  13 siblings, 0 replies; 221+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-11-18 11:18 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, Junio C Hamano, Alban Gruin, Phillip Wood,
	Elijah Newren, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason

In a preceding commit when we migrated to parse_options() we took
pains to be bug-for-bug compatible with the existing command-line
interface, if possible.

I.e. we forbade forms like:

	git merge-index -a <program>
	git merge-index <program> <opts> -a

But allowed:

	git merge-index <program> -a
	git merge-index <opts> <program> -a

As the "-a" argument was considered to be provided for the "<program>",
but not a part of "<opts>".

We don't really need this strictness, as we don't have two "-a"
options. It's much simpler to implement a schema where the first
non-option argument is the <program>, and the rest are the
"<file>...". We only allow that rest if the "-a" option isn't
supplied.
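
A minimal sketch of that schema, assuming the option parser has
already consumed "-o", "-q" and "-a" and left only the non-option
arguments. check_args() and the enum are hypothetical names used for
illustration; the real code reports these cases through
usage_msg_opt() rather than return values.

```c
#include <assert.h>
#include <stddef.h>

enum parse_result {
	PARSE_OK,
	NEED_PROGRAM,		/* no <merge-program> given */
	NEED_A_OR_FILES,	/* neither -a nor <file>... given */
	A_AND_FILES		/* both -a and <file>... given */
};

/*
 * argv holds the non-option arguments; "all" is set when -a was seen.
 * On PARSE_OK, *program is the first argument and *files/*nr_files
 * describe the rest.
 */
static enum parse_result check_args(int argc, const char **argv, int all,
				    const char **program,
				    const char ***files, int *nr_files)
{
	assert(program && files && nr_files);

	if (!argc)
		return NEED_PROGRAM;

	/* The first non-option argument is the <merge-program>. */
	*program = argv[0];
	argv++;
	argc--;

	if (!argc && !all)
		return NEED_A_OR_FILES;
	if (argc && all)
		return A_AND_FILES;

	*files = argv;
	*nr_files = argc;
	return PARSE_OK;
}
```

With this ordering of checks, "-a b program" is rejected as mutually
exclusive (program "b", file "program"), which matches the updated
expectation in the test changes of this patch.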

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/merge-index.c  | 28 ++++++++--------------------
 t/t6060-merge-index.sh | 12 +++++++++---
 2 files changed, 17 insertions(+), 23 deletions(-)

diff --git a/builtin/merge-index.c b/builtin/merge-index.c
index d679272391b..d8b62e4f663 100644
--- a/builtin/merge-index.c
+++ b/builtin/merge-index.c
@@ -59,21 +59,14 @@ int cmd_merge_index(int argc, const char **argv, const char *prefix)
 		N_("git merge-index [-o] [-q] <merge-program> (-a | ([--] <file>...))"),
 		NULL
 	};
-#define OPT__MERGE_INDEX_ALL(v) \
-	OPT_BOOL('a', NULL, (v), \
-		 N_("merge all files in the index that need merging"))
 	struct option options[] = {
 		OPT_BOOL('o', NULL, &one_shot,
 			 N_("don't stop at the first failed merge")),
 		OPT__QUIET(&quiet, N_("be quiet")),
-		OPT__MERGE_INDEX_ALL(&all), /* include "-a" to show it in "-bh" */
+		OPT_BOOL('a', NULL, &all,
+			 N_("merge all files in the index that need merging")),
 		OPT_END(),
 	};
-	struct option options_prog[] = {
-		OPT__MERGE_INDEX_ALL(&all),
-		OPT_END(),
-	};
-#undef OPT__MERGE_INDEX_ALL
 	struct mofs_data data = { 0 };
 
 	/* Without this we cannot rely on waitpid() to tell
@@ -81,20 +74,15 @@ int cmd_merge_index(int argc, const char **argv, const char *prefix)
 	 */
 	signal(SIGCHLD, SIG_DFL);
 
-	if (argc < 3)
-		usage_with_options(usage, options);
-
-	/* Option parsing without <merge-program> options */
-	argc = parse_options(argc, argv, prefix, options, usage,
-			     PARSE_OPT_STOP_AT_NON_OPTION);
-	if (all)
-		usage_msg_optf(_("'%s' option can only be provided after '<merge-program>'"),
-			      usage, options, "-a");
-	/* <merge-program> and its options */
+	argc = parse_options(argc, argv, prefix, options, usage, 0);
 	if (!argc)
 		usage_msg_opt(_("need a <merge-program> argument"), usage, options);
 	data.program = argv[0];
-	argc = parse_options(argc, argv, prefix, options_prog, usage, 0);
+	argv++;
+	argc--;
+	if (!argc && !all)
+		usage_msg_opt(_("need '-a' or '<file>...'"),
+			      usage, options);
 	if (argc && all)
 		usage_msg_opt(_("'-a' and '<file>...' are mutually exclusive"),
 			      usage, options);
diff --git a/t/t6060-merge-index.sh b/t/t6060-merge-index.sh
index bc201a69552..4ff9ace7f73 100755
--- a/t/t6060-merge-index.sh
+++ b/t/t6060-merge-index.sh
@@ -22,7 +22,7 @@ test_expect_success 'usage: 2 arguments' '
 
 test_expect_success 'usage: -a before <program>' '
 	cat >expect <<-\EOF &&
-	fatal: '\''-a'\'' option can only be provided after '\''<merge-program>'\''
+	fatal: '\''-a'\'' and '\''<file>...'\'' are mutually exclusive
 	EOF
 	test_expect_code 129 git merge-index -a b program >out 2>actual.raw &&
 	grep "^fatal:" actual.raw >actual &&
@@ -34,7 +34,7 @@ for opt in -q -o
 do
 	test_expect_success "usage: $opt after -a" '
 		cat >expect <<-EOF &&
-		fatal: '\''-a'\'' option can only be provided after '\''<merge-program>'\''
+		fatal: need a <merge-program> argument
 		EOF
 		test_expect_code 129 git merge-index -a $opt >out 2>actual.raw &&
 		grep "^fatal:" actual.raw >actual &&
@@ -43,7 +43,13 @@ do
 	'
 
 	test_expect_success "usage: $opt program" '
-		test_expect_code 0 git merge-index $opt program
+		cat >expect <<-EOF &&
+		fatal: need '\''-a'\'' or '\''<file>...'\''
+		EOF
+		test_expect_code 129 git merge-index $opt program 2>actual.raw &&
+		grep "^fatal:" actual.raw >actual &&
+		test_must_be_empty out &&
+		test_cmp expect actual
 	'
 done
 
-- 
2.38.0.1511.gcdcff1f1dc2


^ permalink raw reply related	[flat|nested] 221+ messages in thread

* Re: [PATCH v9 00/12] merge-index: prepare to rewrite merge drivers in C
  2022-11-18 11:18               ` [PATCH v9 00/12] merge-index: prepare to rewrite merge drivers in C Ævar Arnfjörð Bjarmason
                                   ` (11 preceding siblings ...)
  2022-11-18 11:18                 ` [PATCH v9 12/12] merge-index: make the argument parsing sensible & simpler Ævar Arnfjörð Bjarmason
@ 2022-11-18 23:30                 ` Taylor Blau
  2022-11-19 12:46                   ` Ævar Arnfjörð Bjarmason
  2022-12-15  8:52                 ` [PATCH v10 " Ævar Arnfjörð Bjarmason
  13 siblings, 1 reply; 221+ messages in thread
From: Taylor Blau @ 2022-11-18 23:30 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Junio C Hamano, Alban Gruin, Phillip Wood, Elijah Newren,
	Johannes Schindelin

On Fri, Nov 18, 2022 at 12:18:17PM +0100, Ævar Arnfjörð Bjarmason wrote:
> This is a prep series for a re-roll of Alban Gruin's series to rewrite
> various merge drivers from *.sh to *.c, and being able to call those
> in-process.

Thanks for resurrecting this topic. I couldn't quite tell what this was
supposed to be based on from your cover letter, but digging around your
repo, the best I could come up with was:

    $ git log --oneline --first-parent --merges master.
    00c0dd7b8a Merge branch 'ab/various-leak-fixes' into ab/merge-index-prep
    dc39d4bbb4 Merge branch 'pw/rebase-no-reflog-action' into ab/merge-index-prep

when queuing, which seemed to do the trick.

If that wasn't what you had intended, let me know. The series does not
apply as-is on top of 'master' (which is at eea7033409 (The twelfth
batch, 2022-11-14), at the time of writing).

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 221+ messages in thread

* Re: [PATCH v9 00/12] merge-index: prepare to rewrite merge drivers in C
  2022-11-18 23:30                 ` [PATCH v9 00/12] merge-index: prepare to rewrite merge drivers in C Taylor Blau
@ 2022-11-19 12:46                   ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 221+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-11-19 12:46 UTC (permalink / raw)
  To: Taylor Blau
  Cc: git, Junio C Hamano, Alban Gruin, Phillip Wood, Elijah Newren,
	Johannes Schindelin


On Fri, Nov 18 2022, Taylor Blau wrote:

> On Fri, Nov 18, 2022 at 12:18:17PM +0100, Ævar Arnfjörð Bjarmason wrote:
>> This is a prep series for a re-roll of Alban Gruin's series to rewrite
>> various merge drivers from *.sh to *.c, and being able to call those
>> in-process.
>
> Thanks for resurrecting this topic. I couldn't quite tell what this was
> supposed to be based on from your cover letter, but digging around your
> repo, the best I could come up with was:
>
>     $ git log --oneline --first-parent --merges master.
>     00c0dd7b8a Merge branch 'ab/various-leak-fixes' into ab/merge-index-prep
>     dc39d4bbb4 Merge branch 'pw/rebase-no-reflog-action' into ab/merge-index-prep
>
> when queuing, which seemed to do the trick.

Yes, sorry. It completely slipped my mind to mention it, but it's on top
of pw/rebase-no-reflog-action + ab/various-leak-fixes, except...

> If that wasn't what you had intended, let me know. The series does not
> apply as-is on top of 'master' (which is at eea7033409 (The twelfth
> batch, 2022-11-14), at the time of writing).

...just applying it on ab/various-leak-fixes won't *quite* do it, it'll
also need the more recent "master", namely the now-landed
rs/no-more-run-command-v.

^ permalink raw reply	[flat|nested] 221+ messages in thread

* [PATCH v10 00/12] merge-index: prepare to rewrite merge drivers in C
  2022-11-18 11:18               ` [PATCH v9 00/12] merge-index: prepare to rewrite merge drivers in C Ævar Arnfjörð Bjarmason
                                   ` (12 preceding siblings ...)
  2022-11-18 23:30                 ` [PATCH v9 00/12] merge-index: prepare to rewrite merge drivers in C Taylor Blau
@ 2022-12-15  8:52                 ` Ævar Arnfjörð Bjarmason
  2022-12-15  8:52                   ` [PATCH v10 01/12] merge-index doc & -h: fix padding, labels and "()" use Ævar Arnfjörð Bjarmason
                                     ` (11 more replies)
  13 siblings, 12 replies; 221+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-12-15  8:52 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, Junio C Hamano, Alban Gruin, Phillip Wood,
	Elijah Newren, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason

This is a prep series for a re-roll of Alban Gruin's series to rewrite
various merge drivers from *.sh to *.c, and being able to call those
in-process.

That series was discussed on-list in August[1], and has now been
ejected from "seen" due to staleness. This v10 re-roll is my second
attempt at re-starting this topic, see [2] for v9.

In v8 there were concerns with the later part of this topic, but the
parts that are included here weren't controversial. The later part will
be part 2 (and I think I've addressed those concerns).

Changes since v9:

* Rebase on minor (unrelated) merge-index and
  "USE_THE_INDEX_COMPATIBILITY_MACROS" changes that have since landed.

* Trivial adjustments to error messages, including marking one that
  wasn't marked with _() for translation.

See [3] for my branch for this topic, which includes passing CI.

1. https://lore.kernel.org/git/20220809185429.20098-9-alban.gruin@gmail.com/
2. https://lore.kernel.org/git/cover-v9-00.12-00000000000-20221118T110058Z-avarab@gmail.com/
3. https://github.com/avar/git/tree/ag/merge-strategies-in-c-prep-3

Alban Gruin (4):
  t6060: modify multiple files to expose a possible issue with
    merge-index
  t6060: add tests for removed files
  merge-index: improve die() error messages
  merge-index: libify merge_one_path() and merge_all()

Ævar Arnfjörð Bjarmason (8):
  merge-index doc & -h: fix padding, labels and "()" use
  merge-index tests: add usage tests
  merge-index: migrate to parse_options() API
  merge-index i18n: mark die() messages for translation
  merge-index: stop calling ensure_full_index() twice
  builtin/merge-index.c: don't USE_THE_INDEX_VARIABLE
  merge-index: use "struct strvec" and helper to prepare args
  merge-index: make the argument parsing sensible & simpler

 Documentation/git-merge-index.txt |   2 +-
 Makefile                          |   1 +
 builtin/merge-index.c             | 167 ++++++++++++++----------------
 git.c                             |   2 +-
 merge-strategies.c                |  87 ++++++++++++++++
 merge-strategies.h                |  19 ++++
 t/t0450/txt-help-mismatches       |   1 -
 t/t6060-merge-index.sh            |  65 +++++++++++-
 8 files changed, 249 insertions(+), 95 deletions(-)
 create mode 100644 merge-strategies.c
 create mode 100644 merge-strategies.h

Range-diff against v9:
 1:  660b1242707 !  1:  9240ab10649 merge-index doc & -h: fix padding, labels and "()" use
    @@ builtin/merge-index.c: int cmd_merge_index(int argc, const char **argv, const ch
     -		usage("git merge-index [-o] [-q] <merge-program> (-a | [--] [<filename>...])");
     +		usage("git merge-index [-o] [-q] <merge-program> (-a | ([--] <file>...))");
      
    - 	read_cache();
    + 	repo_read_index(the_repository);
      
     
      ## t/t0450/txt-help-mismatches ##
 2:  caf4a3790c4 =  2:  de36b52286b t6060: modify multiple files to expose a possible issue with merge-index
 3:  d659ac983f8 =  3:  5edc8132329 t6060: add tests for removed files
 4:  7c5b7c36411 =  4:  aa731011e0a merge-index tests: add usage tests
 5:  07f6936011a !  5:  a3f69564ac5 merge-index: migrate to parse_options() API
    @@ Commit message
     
      ## builtin/merge-index.c ##
     @@
    - #define USE_THE_INDEX_COMPATIBILITY_MACROS
    + #define USE_THE_INDEX_VARIABLE
      #include "builtin.h"
     +#include "parse-options.h"
      #include "run-command.h"
    @@ builtin/merge-index.c: int cmd_merge_index(int argc, const char **argv, const ch
     +		usage_msg_opt(_("'-a' and '<file>...' are mutually exclusive"),
     +			      usage, options);
      
    - 	read_cache();
    + 	repo_read_index(the_repository);
      
      	/* TODO: audit for interaction with sparse-index. */
      	ensure_full_index(&the_index);
 6:  8d6cfd4bacc !  6:  324368401a2 merge-index: improve die() error messages
    @@ builtin/merge-index.c
     @@ builtin/merge-index.c: static int merge_entry(int pos, const char *path)
      	struct child_process cmd = CHILD_PROCESS_INIT;
      
    - 	if (pos >= active_nr)
    + 	if (pos >= the_index.cache_nr)
     -		die("git merge-index: %s not in the cache", path);
     +		die("'%s' is not in the cache", path);
      	found = 0;
      	do {
    - 		const struct cache_entry *ce = active_cache[pos];
    + 		const struct cache_entry *ce = the_index.cache[pos];
     @@ builtin/merge-index.c: static int merge_entry(int pos, const char *path)
      		arguments[stage + 4] = ownbuf[stage];
    - 	} while (++pos < active_nr);
    + 	} while (++pos < the_index.cache_nr);
      	if (!found)
     -		die("git merge-index: %s not in the cache", path);
     +		die("'%s' is not in the cache", path);
 7:  62c5fd4faaa !  7:  de4d11798db merge-index i18n: mark die() messages for translation
    @@ builtin/merge-index.c
     @@ builtin/merge-index.c: static int merge_entry(int pos, const char *path)
      	struct child_process cmd = CHILD_PROCESS_INIT;
      
    - 	if (pos >= active_nr)
    + 	if (pos >= the_index.cache_nr)
     -		die("'%s' is not in the cache", path);
     +		die(_("'%s' is not in the cache"), path);
      	found = 0;
      	do {
    - 		const struct cache_entry *ce = active_cache[pos];
    + 		const struct cache_entry *ce = the_index.cache[pos];
     @@ builtin/merge-index.c: static int merge_entry(int pos, const char *path)
      		arguments[stage + 4] = ownbuf[stage];
    - 	} while (++pos < active_nr);
    + 	} while (++pos < the_index.cache_nr);
      	if (!found)
     -		die("'%s' is not in the cache", path);
     +		die(_("'%s' is not in the cache"), path);
 8:  e44d58a505a !  8:  45cf7995448 merge-index: stop calling ensure_full_index() twice
    @@ builtin/merge-index.c: static void merge_one_path(const char *path)
     -	/* TODO: audit for interaction with sparse-index. */
     -	ensure_full_index(&the_index);
     +
    - 	for (i = 0; i < active_nr; i++) {
    - 		const struct cache_entry *ce = active_cache[i];
    + 	for (i = 0; i < the_index.cache_nr; i++) {
    + 		const struct cache_entry *ce = the_index.cache[i];
      		if (!ce_stage(ce))
     @@ builtin/merge-index.c: int cmd_merge_index(int argc, const char **argv, const char *prefix)
      	/* TODO: audit for interaction with sparse-index. */
 9:  1f7c941035d !  9:  fc9a05ee034 builtin/merge-index.c: don't USE_THE_INDEX_COMPATIBILITY_MACROS
    @@ Metadata
     Author: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
      ## Commit message ##
    -    builtin/merge-index.c: don't USE_THE_INDEX_COMPATIBILITY_MACROS
    +    builtin/merge-index.c: don't USE_THE_INDEX_VARIABLE
     
    -    Remove "USE_THE_INDEX_COMPATIBILITY_MACROS" and instead pass
    -    "the_index" around between the functions in this file. In a subsequent
    -    commit we'll libify this, and don't want to use
    -    "USE_THE_INDEX_COMPATIBILITY_MACROS" in any more places in the
    -    top-level *.c files. Doing this first makes that diff a lot smaller.
    +    Remove "USE_THE_INDEX_VARIABLE" and instead pass "the_index" around
    +    between the functions in this file. In a subsequent commit we'll
    +    libify this, and don't want to use "USE_THE_INDEX_VARIABLE" in any
    +    more places in the top-level *.c files. Doing this first makes that
    +    diff a lot smaller.
     
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
      ## builtin/merge-index.c ##
     @@
    --#define USE_THE_INDEX_COMPATIBILITY_MACROS
    +-#define USE_THE_INDEX_VARIABLE
      #include "builtin.h"
      #include "parse-options.h"
      #include "run-command.h"
    @@ builtin/merge-index.c: static int merge_entry(int pos, const char *path)
      	char ownbuf[4][60];
      	struct child_process cmd = CHILD_PROCESS_INIT;
      
    --	if (pos >= active_nr)
    +-	if (pos >= the_index.cache_nr)
     +	if (pos >= istate->cache_nr)
      		die(_("'%s' is not in the cache"), path);
      	found = 0;
      	do {
    --		const struct cache_entry *ce = active_cache[pos];
    +-		const struct cache_entry *ce = the_index.cache[pos];
     +		const struct cache_entry *ce = istate->cache[pos];
      		int stage = ce_stage(ce);
      
    @@ builtin/merge-index.c: static int merge_entry(int pos, const char *path)
      		xsnprintf(ownbuf[stage], sizeof(ownbuf[stage]), "%o", ce->ce_mode);
      		arguments[stage] = hexbuf[stage];
      		arguments[stage + 4] = ownbuf[stage];
    --	} while (++pos < active_nr);
    +-	} while (++pos < the_index.cache_nr);
     +	} while (++pos < istate->cache_nr);
      	if (!found)
      		die(_("'%s' is not in the cache"), path);
    @@ builtin/merge-index.c: static int merge_entry(int pos, const char *path)
     -static void merge_one_path(const char *path)
     +static void merge_one_path(struct index_state *istate, const char *path)
      {
    --	int pos = cache_name_pos(path, strlen(path));
    +-	int pos = index_name_pos(&the_index, path, strlen(path));
     +	int pos = index_name_pos(istate, path, strlen(path));
      
      	/*
    @@ builtin/merge-index.c: static int merge_entry(int pos, const char *path)
      {
      	int i;
      
    --	for (i = 0; i < active_nr; i++) {
    --		const struct cache_entry *ce = active_cache[i];
    +-	for (i = 0; i < the_index.cache_nr; i++) {
    +-		const struct cache_entry *ce = the_index.cache[i];
     +	for (i = 0; i < istate->cache_nr; i++) {
     +		const struct cache_entry *ce = istate->cache[i];
      		if (!ce_stage(ce))
    @@ builtin/merge-index.c: static int merge_entry(int pos, const char *path)
      }
      
     @@ builtin/merge-index.c: int cmd_merge_index(int argc, const char **argv, const char *prefix)
    - 		usage_msg_opt(_("'-a' and '<file>...' are mutually exclusive"),
    - 			      usage, options);
    - 
    --	read_cache();
    -+	repo_read_index(the_repository);
    + 	repo_read_index(the_repository);
      
      	/* TODO: audit for interaction with sparse-index. */
     -	ensure_full_index(&the_index);
10:  8c43b64dec4 = 10:  0efc5039e46 merge-index: libify merge_one_path() and merge_all()
11:  592db883dad = 11:  748fef4434f merge-index: use "struct strvec" and helper to prepare args
12:  5a2c4dd3acf = 12:  40b6d296f3a merge-index: make the argument parsing sensible & simpler
-- 
2.39.0.rc2.1048.g0e5493b8d5b


^ permalink raw reply	[flat|nested] 221+ messages in thread

* [PATCH v10 01/12] merge-index doc & -h: fix padding, labels and "()" use
  2022-12-15  8:52                 ` [PATCH v10 " Ævar Arnfjörð Bjarmason
@ 2022-12-15  8:52                   ` Ævar Arnfjörð Bjarmason
  2022-12-15  8:52                   ` [PATCH v10 02/12] t6060: modify multiple files to expose a possible issue with merge-index Ævar Arnfjörð Bjarmason
                                     ` (10 subsequent siblings)
  11 siblings, 0 replies; 221+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-12-15  8:52 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, Junio C Hamano, Alban Gruin, Phillip Wood,
	Elijah Newren, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason

Make the "merge-index" doc SYNOPSIS and "-h" output consistent with
one another, and fix small issues with them:

- Whitespace padding, per e2f4e7e8c0f (doc txt & -h consistency:
  correct padding around "[]()", 2022-10-13).

- Use "<file>" consistently, rather than using "<filename>" in the
  "-h" output, and "<file>" in the SYNOPSIS.

- The "-h" version incorrectly claimed that the filename was optional,
  but it's not.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 Documentation/git-merge-index.txt | 2 +-
 builtin/merge-index.c             | 2 +-
 t/t0450/txt-help-mismatches       | 1 -
 3 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/Documentation/git-merge-index.txt b/Documentation/git-merge-index.txt
index eea56b3154e..a297105d6d8 100644
--- a/Documentation/git-merge-index.txt
+++ b/Documentation/git-merge-index.txt
@@ -9,7 +9,7 @@ git-merge-index - Run a merge for files needing merging
 SYNOPSIS
 --------
 [verse]
-'git merge-index' [-o] [-q] <merge-program> (-a | ( [--] <file>...) )
+'git merge-index' [-o] [-q] <merge-program> (-a | ([--] <file>...))
 
 DESCRIPTION
 -----------
diff --git a/builtin/merge-index.c b/builtin/merge-index.c
index 452f833ac46..69b18ed82ac 100644
--- a/builtin/merge-index.c
+++ b/builtin/merge-index.c
@@ -80,7 +80,7 @@ int cmd_merge_index(int argc, const char **argv, const char *prefix)
 	signal(SIGCHLD, SIG_DFL);
 
 	if (argc < 3)
-		usage("git merge-index [-o] [-q] <merge-program> (-a | [--] [<filename>...])");
+		usage("git merge-index [-o] [-q] <merge-program> (-a | ([--] <file>...))");
 
 	repo_read_index(the_repository);
 
diff --git a/t/t0450/txt-help-mismatches b/t/t0450/txt-help-mismatches
index a0777acd667..9e73c1892ae 100644
--- a/t/t0450/txt-help-mismatches
+++ b/t/t0450/txt-help-mismatches
@@ -34,7 +34,6 @@ mailsplit
 maintenance
 merge
 merge-file
-merge-index
 merge-one-file
 multi-pack-index
 name-rev
-- 
2.39.0.rc2.1048.g0e5493b8d5b


^ permalink raw reply related	[flat|nested] 221+ messages in thread

* [PATCH v10 02/12] t6060: modify multiple files to expose a possible issue with merge-index
  2022-12-15  8:52                 ` [PATCH v10 " Ævar Arnfjörð Bjarmason
  2022-12-15  8:52                   ` [PATCH v10 01/12] merge-index doc & -h: fix padding, labels and "()" use Ævar Arnfjörð Bjarmason
@ 2022-12-15  8:52                   ` Ævar Arnfjörð Bjarmason
  2022-12-15  8:52                   ` [PATCH v10 03/12] t6060: add tests for removed files Ævar Arnfjörð Bjarmason
                                     ` (9 subsequent siblings)
  11 siblings, 0 replies; 221+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-12-15  8:52 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, Junio C Hamano, Alban Gruin, Phillip Wood,
	Elijah Newren, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason

From: Alban Gruin <alban.gruin@gmail.com>

Currently, merge-index iterates over every index entry, skipping stage0
entries.  It will then count how many entries following the current one
have the same name, then fork to do the merge.  It will then increase
the iterator by the number of entries to skip them.  This behaviour is
correct, as even if the subprocess modifies the index, merge-index does
not reload it at all.

But once it is rewritten to call a function in-process, the index it
uses will be modified and may shrink when a conflict happens or if a
file is removed, so we have to be careful to handle such cases.

Here is an example:

 *    Merge branches, file1 and file2 are trivially mergeable.
 |\
 | *  Modifies file1 and file2.
 * |  Modifies file1 and file2.
 |/
 *    Adds file1 and file2.

When the merge happens, the index will look like this:

 i -> 0. file1 (stage1)
      1. file1 (stage2)
      2. file1 (stage3)
      3. file2 (stage1)
      4. file2 (stage2)
      5. file2 (stage3)

merge-index handles `file1' first.  As it appears 3 times after the
iterator, it is merged.  The index is now stale, `i' is increased by 3,
and the index now looks like this:

      0. file1 (stage1)
      1. file1 (stage2)
      2. file1 (stage3)
 i -> 3. file2 (stage1)
      4. file2 (stage2)
      5. file2 (stage3)

`file2' appears three times too, so it is merged.

With a naive rewrite, the index would look like this:

      0. file1 (stage0)
      1. file2 (stage1)
      2. file2 (stage2)
 i -> 3. file2 (stage3)

`file2' appears once at the iterator or after, so it will be added,
_not_ merged.  Which is wrong.

A naive rewrite would lead to improperly merged files, or even files not
handled at all.

This changes t6060 to reproduce this case, by creating 2 files instead
of 1, to check the correctness of the soon-to-be-rewritten merge-index.
The files are identical, which is not really important -- the factors
that could trigger this issue are that they should be separated by at
most one entry in the index, and that the first one in the index should
be trivially mergeable.
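
As a rough illustration of the failure mode described above (not part of
this patch), here is a self-contained toy model in C.  Every name in it
(toy_entry, toy_index, merge_all_toy() and so on) is hypothetical and
unrelated to git's real index API; it assumes only that an in-process
merge collapses the staged entries of a path into a single stage-0
entry, which shrinks the array while the loop is still walking it.

```c
#include <assert.h>
#include <string.h>

#define MAX_ENTRIES 16

/* Hypothetical stand-in for an index entry; not git's struct cache_entry. */
struct toy_entry {
	const char *name;
	int stage;	/* 0 = merged, 1..3 = unmerged stages */
};

struct toy_index {
	struct toy_entry e[MAX_ENTRIES];
	size_t nr;
};

/* The index from the example: two paths, three stages each. */
static struct toy_index two_conflicted_files(void)
{
	struct toy_index idx = {
		.e = {
			{ "file1", 1 }, { "file1", 2 }, { "file1", 3 },
			{ "file2", 1 }, { "file2", 2 }, { "file2", 3 },
		},
		.nr = 6,
	};
	return idx;
}

/* Count the consecutive entries at or after "pos" that share its name. */
static size_t count_same_name(const struct toy_index *idx, size_t pos)
{
	size_t n = 0;

	while (pos + n < idx->nr &&
	       !strcmp(idx->e[pos + n].name, idx->e[pos].name))
		n++;
	return n;
}

/*
 * Resolve in place: collapse the "n" entries starting at "pos" into one
 * stage-0 entry, shrinking the index by n - 1.
 */
static void resolve_in_place(struct toy_index *idx, size_t pos, size_t n)
{
	assert(n >= 1 && pos + n <= idx->nr);
	idx->e[pos].stage = 0;
	memmove(&idx->e[pos + 1], &idx->e[pos + n],
		(idx->nr - pos - n) * sizeof(idx->e[0]));
	idx->nr -= n - 1;
}

/*
 * Walk the index.  With "skip_old_count" set, the loop still advances by
 * the pre-merge entry count, as the forking implementation may safely do.
 * Returns how many paths were handled with all three stages visible.
 */
static int merge_all_toy(struct toy_index *idx, int skip_old_count)
{
	int full_merges = 0;
	size_t i;

	for (i = 0; i < idx->nr; i++) {
		size_t n;

		if (!idx->e[i].stage)
			continue;
		n = count_same_name(idx, i);
		if (n == 3)
			full_merges++;
		resolve_in_place(idx, i, n);
		if (skip_old_count)
			i += n - 1;	/* stale: the index already shrank */
	}
	return full_merges;
}
```

In this model, calling merge_all_toy() on two_conflicted_files() with
skip_old_count set sees all three stages only for `file1' and then lands
on the stage3 entry of `file2'; with skip_old_count unset, both paths
are seen with all three stages.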

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/t6060-merge-index.sh | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/t/t6060-merge-index.sh b/t/t6060-merge-index.sh
index 1a8b64cce18..30513351c23 100755
--- a/t/t6060-merge-index.sh
+++ b/t/t6060-merge-index.sh
@@ -7,16 +7,19 @@ TEST_PASSES_SANITIZE_LEAK=true
 
 test_expect_success 'setup diverging branches' '
 	test_write_lines 1 2 3 4 5 6 7 8 9 10 >file &&
-	git add file &&
+	cp file file2 &&
+	git add file file2 &&
 	git commit -m base &&
 	git tag base &&
 	sed s/2/two/ <file >tmp &&
 	mv tmp file &&
+	cp file file2 &&
 	git commit -a -m two &&
 	git tag two &&
 	git checkout -b other HEAD^ &&
 	sed s/10/ten/ <file >tmp &&
 	mv tmp file &&
+	cp file file2 &&
 	git commit -a -m ten &&
 	git tag ten
 '
@@ -35,8 +38,11 @@ ten
 EOF
 
 test_expect_success 'read-tree does not resolve content merge' '
+	cat >expect <<-\EOF &&
+	file
+	file2
+	EOF
 	git read-tree -i -m base ten two &&
-	echo file >expect &&
 	git diff-files --name-only --diff-filter=U >unmerged &&
 	test_cmp expect unmerged
 '
-- 
2.39.0.rc2.1048.g0e5493b8d5b


^ permalink raw reply related	[flat|nested] 221+ messages in thread

* [PATCH v10 03/12] t6060: add tests for removed files
  2022-12-15  8:52                 ` [PATCH v10 " Ævar Arnfjörð Bjarmason
  2022-12-15  8:52                   ` [PATCH v10 01/12] merge-index doc & -h: fix padding, labels and "()" use Ævar Arnfjörð Bjarmason
  2022-12-15  8:52                   ` [PATCH v10 02/12] t6060: modify multiple files to expose a possible issue with merge-index Ævar Arnfjörð Bjarmason
@ 2022-12-15  8:52                   ` Ævar Arnfjörð Bjarmason
  2022-12-15  8:52                   ` [PATCH v10 04/12] merge-index tests: add usage tests Ævar Arnfjörð Bjarmason
                                     ` (8 subsequent siblings)
  11 siblings, 0 replies; 221+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-12-15  8:52 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, Junio C Hamano, Alban Gruin, Phillip Wood,
	Elijah Newren, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason

From: Alban Gruin <alban.gruin@gmail.com>

Until now, t6060 did not check git-merge-one-file's behaviour when a
file is deleted in a branch.  To avoid regressions on this during the
conversion from shell to C, this adds a new file, `file3', in the commit
tagged as `base', and deletes it in the commit tagged as `two'.

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/t6060-merge-index.sh | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/t/t6060-merge-index.sh b/t/t6060-merge-index.sh
index 30513351c23..079151ee06d 100755
--- a/t/t6060-merge-index.sh
+++ b/t/t6060-merge-index.sh
@@ -8,12 +8,14 @@ TEST_PASSES_SANITIZE_LEAK=true
 test_expect_success 'setup diverging branches' '
 	test_write_lines 1 2 3 4 5 6 7 8 9 10 >file &&
 	cp file file2 &&
-	git add file file2 &&
+	cp file file3 &&
+	git add file file2 file3 &&
 	git commit -m base &&
 	git tag base &&
 	sed s/2/two/ <file >tmp &&
 	mv tmp file &&
 	cp file file2 &&
+	git rm file3 &&
 	git commit -a -m two &&
 	git tag two &&
 	git checkout -b other HEAD^ &&
@@ -41,6 +43,7 @@ test_expect_success 'read-tree does not resolve content merge' '
 	cat >expect <<-\EOF &&
 	file
 	file2
+	file3
 	EOF
 	git read-tree -i -m base ten two &&
 	git diff-files --name-only --diff-filter=U >unmerged &&
-- 
2.39.0.rc2.1048.g0e5493b8d5b


^ permalink raw reply related	[flat|nested] 221+ messages in thread

* [PATCH v10 04/12] merge-index tests: add usage tests
  2022-12-15  8:52                 ` [PATCH v10 " Ævar Arnfjörð Bjarmason
                                     ` (2 preceding siblings ...)
  2022-12-15  8:52                   ` [PATCH v10 03/12] t6060: add tests for removed files Ævar Arnfjörð Bjarmason
@ 2022-12-15  8:52                   ` Ævar Arnfjörð Bjarmason
  2022-12-15  8:52                   ` [PATCH v10 05/12] merge-index: migrate to parse_options() API Ævar Arnfjörð Bjarmason
                                     ` (7 subsequent siblings)
  11 siblings, 0 replies; 221+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-12-15  8:52 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, Junio C Hamano, Alban Gruin, Phillip Wood,
	Elijah Newren, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason

Add tests that stress the current behavior of the options parsing in
cmd_merge_index(), in preparation for moving it over to
parse_options().

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/t6060-merge-index.sh | 44 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 44 insertions(+)

diff --git a/t/t6060-merge-index.sh b/t/t6060-merge-index.sh
index 079151ee06d..edc03b41ab9 100755
--- a/t/t6060-merge-index.sh
+++ b/t/t6060-merge-index.sh
@@ -5,6 +5,50 @@ test_description='basic git merge-index / git-merge-one-file tests'
 TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
+test_expect_success 'usage: 1 argument' '
+	test_expect_code 129 git merge-index a >out 2>err &&
+	test_must_be_empty out &&
+	grep ^usage err
+'
+
+test_expect_success 'usage: 2 arguments' '
+	cat >expect <<-\EOF &&
+	fatal: git merge-index: b not in the cache
+	EOF
+	test_expect_code 128 git merge-index a b >out 2>actual &&
+	test_must_be_empty out &&
+	test_cmp expect actual
+'
+
+test_expect_success 'usage: -a before <program>' '
+	cat >expect <<-\EOF &&
+	fatal: git merge-index: b not in the cache
+	EOF
+	test_expect_code 128 git merge-index -a b program >out 2>actual &&
+	test_must_be_empty out &&
+	test_cmp expect actual
+'
+
+for opt in -q -o
+do
+	test_expect_success "usage: $opt after -a" '
+		cat >expect <<-EOF &&
+		fatal: git merge-index: unknown option $opt
+		EOF
+		test_expect_code 128 git merge-index -a $opt >out 2>actual &&
+		test_must_be_empty out &&
+		test_cmp expect actual
+	'
+
+	test_expect_success "usage: $opt program" '
+		test_expect_code 0 git merge-index $opt program
+	'
+done
+
+test_expect_success 'usage: program' '
+	test_expect_code 129 git merge-index program
+'
+
 test_expect_success 'setup diverging branches' '
 	test_write_lines 1 2 3 4 5 6 7 8 9 10 >file &&
 	cp file file2 &&
-- 
2.39.0.rc2.1048.g0e5493b8d5b


^ permalink raw reply related	[flat|nested] 221+ messages in thread

* [PATCH v10 05/12] merge-index: migrate to parse_options() API
  2022-12-15  8:52                 ` [PATCH v10 " Ævar Arnfjörð Bjarmason
                                     ` (3 preceding siblings ...)
  2022-12-15  8:52                   ` [PATCH v10 04/12] merge-index tests: add usage tests Ævar Arnfjörð Bjarmason
@ 2022-12-15  8:52                   ` Ævar Arnfjörð Bjarmason
  2022-12-15  8:52                   ` [PATCH v10 06/12] merge-index: improve die() error messages Ævar Arnfjörð Bjarmason
                                     ` (6 subsequent siblings)
  11 siblings, 0 replies; 221+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-12-15  8:52 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, Junio C Hamano, Alban Gruin, Phillip Wood,
	Elijah Newren, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason

Migrate the "merge-index" command to the parse_options() API. A
preceding commit added tests for the existing behavior.

In a subsequent commit we'll adjust the behavior to be more consistent
with how most other commands work, but for now let's take pains to
preserve it as-is. We need to e.g. call parse_options() twice now, as
the "-a" option is currently only understood after "<merge-program>".
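
To make the ordering rule concrete, here is a hedged, self-contained
sketch in C of the two-pass idea.  It does not use parse_options() and
does not reproduce all of its details (for example "--" before
"<merge-program>"); toy_parse(), struct toy_args and TOY_USAGE_ERROR are
hypothetical names made up for this illustration.  The first pass
accepts only "-o" and "-q" and stops at the first non-option, which is
taken as "<merge-program>"; the second pass accepts "-a" or paths.

```c
#include <assert.h>
#include <string.h>

#define TOY_USAGE_ERROR 129	/* exit code the usage tests in this series expect */

/* Hypothetical result structure; not part of git. */
struct toy_args {
	int one_shot, quiet, all;
	const char *program;
	const char **files;	/* points into argv */
	int nr_files;
};

/* argv[0] is the command name, as in cmd_merge_index(). */
static int toy_parse(int argc, const char **argv, struct toy_args *out)
{
	int i = 1, dashdash = 0;

	memset(out, 0, sizeof(*out));
	if (argc < 3)
		return TOY_USAGE_ERROR;

	/* Pass 1: options before <merge-program>; stop at the first non-option. */
	for (; i < argc && argv[i][0] == '-'; i++) {
		if (!strcmp(argv[i], "-o"))
			out->one_shot = 1;
		else if (!strcmp(argv[i], "-q"))
			out->quiet = 1;
		else
			return TOY_USAGE_ERROR;	/* includes a misplaced "-a" */
	}
	if (i == argc)
		return TOY_USAGE_ERROR;	/* no <merge-program> */
	out->program = argv[i++];

	/* Pass 2: "-a" or paths; "--" forces the rest to be paths. */
	out->files = argv + i;
	for (; i < argc; i++) {
		if (!dashdash && argv[i][0] == '-') {
			if (!strcmp(argv[i], "--"))
				dashdash = 1;
			else if (!strcmp(argv[i], "-a"))
				out->all = 1;
			else
				return TOY_USAGE_ERROR;
			continue;
		}
		out->files[out->nr_files++] = argv[i];	/* compact in place */
	}
	if (out->all && out->nr_files)
		return TOY_USAGE_ERROR;	/* "-a" and paths are mutually exclusive */
	return 0;
}
```

With this sketch, "merge-index -a b program" is rejected in the first
pass, while "merge-index -o prog -a" is accepted with both flags set.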

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/merge-index.c  | 71 ++++++++++++++++++++++++++----------------
 git.c                  |  2 +-
 t/t6060-merge-index.sh | 10 +++---
 3 files changed, 51 insertions(+), 32 deletions(-)

diff --git a/builtin/merge-index.c b/builtin/merge-index.c
index 69b18ed82ac..3855531c579 100644
--- a/builtin/merge-index.c
+++ b/builtin/merge-index.c
@@ -1,5 +1,6 @@
 #define USE_THE_INDEX_VARIABLE
 #include "builtin.h"
+#include "parse-options.h"
 #include "run-command.h"
 
 static const char *pgm;
@@ -72,7 +73,26 @@ static void merge_all(void)
 
 int cmd_merge_index(int argc, const char **argv, const char *prefix)
 {
-	int i, force_file = 0;
+	int all = 0;
+	const char * const usage[] = {
+		N_("git merge-index [-o] [-q] <merge-program> (-a | ([--] <file>...))"),
+		NULL
+	};
+#define OPT__MERGE_INDEX_ALL(v) \
+	OPT_BOOL('a', NULL, (v), \
+		 N_("merge all files in the index that need merging"))
+	struct option options[] = {
+		OPT_BOOL('o', NULL, &one_shot,
+			 N_("don't stop at the first failed merge")),
+		OPT__QUIET(&quiet, N_("be quiet")),
+		OPT__MERGE_INDEX_ALL(&all), /* include "-a" to show it in "-bh" */
+		OPT_END(),
+	};
+	struct option options_prog[] = {
+		OPT__MERGE_INDEX_ALL(&all),
+		OPT_END(),
+	};
+#undef OPT__MERGE_INDEX_ALL
 
 	/* Without this we cannot rely on waitpid() to tell
 	 * what happened to our children.
@@ -80,38 +100,35 @@ int cmd_merge_index(int argc, const char **argv, const char *prefix)
 	signal(SIGCHLD, SIG_DFL);
 
 	if (argc < 3)
-		usage("git merge-index [-o] [-q] <merge-program> (-a | ([--] <file>...))");
+		usage_with_options(usage, options);
+
+	/* Option parsing without <merge-program> options */
+	argc = parse_options(argc, argv, prefix, options, usage,
+			     PARSE_OPT_STOP_AT_NON_OPTION);
+	if (all)
+		usage_msg_optf(_("'%s' option can only be provided after '<merge-program>'"),
+			      usage, options, "-a");
+	/* <merge-program> and its options */
+	if (!argc)
+		usage_msg_opt(_("need a <merge-program> argument"), usage, options);
+	pgm = argv[0];
+	argc = parse_options(argc, argv, prefix, options_prog, usage, 0);
+	if (argc && all)
+		usage_msg_opt(_("'-a' and '<file>...' are mutually exclusive"),
+			      usage, options);
 
 	repo_read_index(the_repository);
 
 	/* TODO: audit for interaction with sparse-index. */
 	ensure_full_index(&the_index);
 
-	i = 1;
-	if (!strcmp(argv[i], "-o")) {
-		one_shot = 1;
-		i++;
-	}
-	if (!strcmp(argv[i], "-q")) {
-		quiet = 1;
-		i++;
-	}
-	pgm = argv[i++];
-	for (; i < argc; i++) {
-		const char *arg = argv[i];
-		if (!force_file && *arg == '-') {
-			if (!strcmp(arg, "--")) {
-				force_file = 1;
-				continue;
-			}
-			if (!strcmp(arg, "-a")) {
-				merge_all();
-				continue;
-			}
-			die("git merge-index: unknown option %s", arg);
-		}
-		merge_one_path(arg);
-	}
+
+	if (all)
+		merge_all();
+	else
+		for (size_t i = 0; i < argc; i++)
+			merge_one_path(argv[i]);
+
 	if (err && !quiet)
 		die("merge program failed");
 	return err;
diff --git a/git.c b/git.c
index 277a8cce840..557a33925e3 100644
--- a/git.c
+++ b/git.c
@@ -560,7 +560,7 @@ static struct cmd_struct commands[] = {
 	{ "merge", cmd_merge, RUN_SETUP | NEED_WORK_TREE },
 	{ "merge-base", cmd_merge_base, RUN_SETUP },
 	{ "merge-file", cmd_merge_file, RUN_SETUP_GENTLY },
-	{ "merge-index", cmd_merge_index, RUN_SETUP | NO_PARSEOPT },
+	{ "merge-index", cmd_merge_index, RUN_SETUP },
 	{ "merge-ours", cmd_merge_ours, RUN_SETUP | NO_PARSEOPT },
 	{ "merge-recursive", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
 	{ "merge-recursive-ours", cmd_merge_recursive, RUN_SETUP | NEED_WORK_TREE | NO_PARSEOPT },
diff --git a/t/t6060-merge-index.sh b/t/t6060-merge-index.sh
index edc03b41ab9..6c59e7bc4e5 100755
--- a/t/t6060-merge-index.sh
+++ b/t/t6060-merge-index.sh
@@ -22,9 +22,10 @@ test_expect_success 'usage: 2 arguments' '
 
 test_expect_success 'usage: -a before <program>' '
 	cat >expect <<-\EOF &&
-	fatal: git merge-index: b not in the cache
+	fatal: '\''-a'\'' option can only be provided after '\''<merge-program>'\''
 	EOF
-	test_expect_code 128 git merge-index -a b program >out 2>actual &&
+	test_expect_code 129 git merge-index -a b program >out 2>actual.raw &&
+	grep "^fatal:" actual.raw >actual &&
 	test_must_be_empty out &&
 	test_cmp expect actual
 '
@@ -33,9 +34,10 @@ for opt in -q -o
 do
 	test_expect_success "usage: $opt after -a" '
 		cat >expect <<-EOF &&
-		fatal: git merge-index: unknown option $opt
+		fatal: '\''-a'\'' option can only be provided after '\''<merge-program>'\''
 		EOF
-		test_expect_code 128 git merge-index -a $opt >out 2>actual &&
+		test_expect_code 129 git merge-index -a $opt >out 2>actual.raw &&
+		grep "^fatal:" actual.raw >actual &&
 		test_must_be_empty out &&
 		test_cmp expect actual
 	'
-- 
2.39.0.rc2.1048.g0e5493b8d5b


^ permalink raw reply related	[flat|nested] 221+ messages in thread

* [PATCH v10 06/12] merge-index: improve die() error messages
  2022-12-15  8:52                 ` [PATCH v10 " Ævar Arnfjörð Bjarmason
                                     ` (4 preceding siblings ...)
  2022-12-15  8:52                   ` [PATCH v10 05/12] merge-index: migrate to parse_options() API Ævar Arnfjörð Bjarmason
@ 2022-12-15  8:52                   ` Ævar Arnfjörð Bjarmason
  2022-12-15  8:52                   ` [PATCH v10 07/12] merge-index i18n: mark die() messages for translation Ævar Arnfjörð Bjarmason
                                     ` (5 subsequent siblings)
  11 siblings, 0 replies; 221+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-12-15  8:52 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, Junio C Hamano, Alban Gruin, Phillip Wood,
	Elijah Newren, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason

From: Alban Gruin <alban.gruin@gmail.com>

Our usual convention is not to repeat the program name back at the
user, and to quote path arguments. Let's do that now to reduce the
size of the subsequent commit.

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/merge-index.c  | 4 ++--
 t/t6060-merge-index.sh | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/builtin/merge-index.c b/builtin/merge-index.c
index 3855531c579..2dc789fb787 100644
--- a/builtin/merge-index.c
+++ b/builtin/merge-index.c
@@ -16,7 +16,7 @@ static int merge_entry(int pos, const char *path)
 	struct child_process cmd = CHILD_PROCESS_INIT;
 
 	if (pos >= the_index.cache_nr)
-		die("git merge-index: %s not in the cache", path);
+		die("'%s' is not in the cache", path);
 	found = 0;
 	do {
 		const struct cache_entry *ce = the_index.cache[pos];
@@ -31,7 +31,7 @@ static int merge_entry(int pos, const char *path)
 		arguments[stage + 4] = ownbuf[stage];
 	} while (++pos < the_index.cache_nr);
 	if (!found)
-		die("git merge-index: %s not in the cache", path);
+		die("'%s' is not in the cache", path);
 
 	strvec_pushv(&cmd.args, arguments);
 	if (run_command(&cmd)) {
diff --git a/t/t6060-merge-index.sh b/t/t6060-merge-index.sh
index 6c59e7bc4e5..bc201a69552 100755
--- a/t/t6060-merge-index.sh
+++ b/t/t6060-merge-index.sh
@@ -13,7 +13,7 @@ test_expect_success 'usage: 1 argument' '
 
 test_expect_success 'usage: 2 arguments' '
 	cat >expect <<-\EOF &&
-	fatal: git merge-index: b not in the cache
+	fatal: '\''b'\'' is not in the cache
 	EOF
 	test_expect_code 128 git merge-index a b >out 2>actual &&
 	test_must_be_empty out &&
-- 
2.39.0.rc2.1048.g0e5493b8d5b


^ permalink raw reply related	[flat|nested] 221+ messages in thread

* [PATCH v10 07/12] merge-index i18n: mark die() messages for translation
  2022-12-15  8:52                 ` [PATCH v10 " Ævar Arnfjörð Bjarmason
                                     ` (5 preceding siblings ...)
  2022-12-15  8:52                   ` [PATCH v10 06/12] merge-index: improve die() error messages Ævar Arnfjörð Bjarmason
@ 2022-12-15  8:52                   ` Ævar Arnfjörð Bjarmason
  2022-12-15  8:52                   ` [PATCH v10 08/12] merge-index: stop calling ensure_full_index() twice Ævar Arnfjörð Bjarmason
                                     ` (4 subsequent siblings)
  11 siblings, 0 replies; 221+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-12-15  8:52 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, Junio C Hamano, Alban Gruin, Phillip Wood,
	Elijah Newren, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason

Mark the die() messages for translation with _(). We don't rely on the
specifics of these messages as plumbing, so they can be safely
translated.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/merge-index.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/builtin/merge-index.c b/builtin/merge-index.c
index 2dc789fb787..4d91e7ea122 100644
--- a/builtin/merge-index.c
+++ b/builtin/merge-index.c
@@ -16,7 +16,7 @@ static int merge_entry(int pos, const char *path)
 	struct child_process cmd = CHILD_PROCESS_INIT;
 
 	if (pos >= the_index.cache_nr)
-		die("'%s' is not in the cache", path);
+		die(_("'%s' is not in the cache"), path);
 	found = 0;
 	do {
 		const struct cache_entry *ce = the_index.cache[pos];
@@ -31,7 +31,7 @@ static int merge_entry(int pos, const char *path)
 		arguments[stage + 4] = ownbuf[stage];
 	} while (++pos < the_index.cache_nr);
 	if (!found)
-		die("'%s' is not in the cache", path);
+		die(_("'%s' is not in the cache"), path);
 
 	strvec_pushv(&cmd.args, arguments);
 	if (run_command(&cmd)) {
@@ -39,7 +39,7 @@ static int merge_entry(int pos, const char *path)
 			err++;
 		else {
 			if (!quiet)
-				die("merge program failed");
+				die(_("merge program failed"));
 			exit(1);
 		}
 	}
@@ -130,6 +130,6 @@ int cmd_merge_index(int argc, const char **argv, const char *prefix)
 			merge_one_path(argv[i]);
 
 	if (err && !quiet)
-		die("merge program failed");
+		die(_("merge program failed"));
 	return err;
 }
-- 
2.39.0.rc2.1048.g0e5493b8d5b


^ permalink raw reply related	[flat|nested] 221+ messages in thread

* [PATCH v10 08/12] merge-index: stop calling ensure_full_index() twice
  2022-12-15  8:52                 ` [PATCH v10 " Ævar Arnfjörð Bjarmason
                                     ` (6 preceding siblings ...)
  2022-12-15  8:52                   ` [PATCH v10 07/12] merge-index i18n: mark die() messages for translation Ævar Arnfjörð Bjarmason
@ 2022-12-15  8:52                   ` Ævar Arnfjörð Bjarmason
  2022-12-15  8:52                   ` [PATCH v10 09/12] builtin/merge-index.c: don't USE_THE_INDEX_VARIABLE Ævar Arnfjörð Bjarmason
                                     ` (3 subsequent siblings)
  11 siblings, 0 replies; 221+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-12-15  8:52 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, Junio C Hamano, Alban Gruin, Phillip Wood,
	Elijah Newren, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason

When most of the ensure_full_index() calls were added in
8e97852919f (Merge branch 'ds/sparse-index-protections', 2021-04-30)
we could add them at the start of cmd_*() for built-ins, but in some
cases we couldn't do that, as we'd only want to initialize the index
conditionally on some branches in the code.

But this code added in 299e2c4561b (merge-index: ensure full index,
2021-04-01) (part of 8e97852919f) isn't such a case. The merge_all()
function is only called by cmd_merge_index(), which before calling it
will have called ensure_full_index() unconditionally.

We can therefore skip this. While we're at it, and mainly so that
we'll see the relevant code in the context, let's fix a minor
whitespace issue that the addition of the ensure_full_index() call in
299e2c4561b introduced.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/merge-index.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/builtin/merge-index.c b/builtin/merge-index.c
index 4d91e7ea122..cd160779cbf 100644
--- a/builtin/merge-index.c
+++ b/builtin/merge-index.c
@@ -61,8 +61,7 @@ static void merge_one_path(const char *path)
 static void merge_all(void)
 {
 	int i;
-	/* TODO: audit for interaction with sparse-index. */
-	ensure_full_index(&the_index);
+
 	for (i = 0; i < the_index.cache_nr; i++) {
 		const struct cache_entry *ce = the_index.cache[i];
 		if (!ce_stage(ce))
@@ -122,7 +121,6 @@ int cmd_merge_index(int argc, const char **argv, const char *prefix)
 	/* TODO: audit for interaction with sparse-index. */
 	ensure_full_index(&the_index);
 
-
 	if (all)
 		merge_all();
 	else
-- 
2.39.0.rc2.1048.g0e5493b8d5b


^ permalink raw reply related	[flat|nested] 221+ messages in thread

* [PATCH v10 09/12] builtin/merge-index.c: don't USE_THE_INDEX_VARIABLE
  2022-12-15  8:52                 ` [PATCH v10 " Ævar Arnfjörð Bjarmason
                                     ` (7 preceding siblings ...)
  2022-12-15  8:52                   ` [PATCH v10 08/12] merge-index: stop calling ensure_full_index() twice Ævar Arnfjörð Bjarmason
@ 2022-12-15  8:52                   ` Ævar Arnfjörð Bjarmason
  2022-12-15  8:52                   ` [PATCH v10 10/12] merge-index: libify merge_one_path() and merge_all() Ævar Arnfjörð Bjarmason
                                     ` (2 subsequent siblings)
  11 siblings, 0 replies; 221+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-12-15  8:52 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, Junio C Hamano, Alban Gruin, Phillip Wood,
	Elijah Newren, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason

Remove "USE_THE_INDEX_VARIABLE" and instead pass "the_index" around
between the functions in this file. In a subsequent commit we'll
libify this, and don't want to use "USE_THE_INDEX_VARIABLE" in any
more places in the top-level *.c files. Doing this first makes that
diff a lot smaller.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/merge-index.c | 29 ++++++++++++++---------------
 1 file changed, 14 insertions(+), 15 deletions(-)

diff --git a/builtin/merge-index.c b/builtin/merge-index.c
index cd160779cbf..c269d76cc8f 100644
--- a/builtin/merge-index.c
+++ b/builtin/merge-index.c
@@ -1,4 +1,3 @@
-#define USE_THE_INDEX_VARIABLE
 #include "builtin.h"
 #include "parse-options.h"
 #include "run-command.h"
@@ -7,7 +6,7 @@ static const char *pgm;
 static int one_shot, quiet;
 static int err;
 
-static int merge_entry(int pos, const char *path)
+static int merge_entry(struct index_state *istate, int pos, const char *path)
 {
 	int found;
 	const char *arguments[] = { pgm, "", "", "", path, "", "", "", NULL };
@@ -15,11 +14,11 @@ static int merge_entry(int pos, const char *path)
 	char ownbuf[4][60];
 	struct child_process cmd = CHILD_PROCESS_INIT;
 
-	if (pos >= the_index.cache_nr)
+	if (pos >= istate->cache_nr)
 		die(_("'%s' is not in the cache"), path);
 	found = 0;
 	do {
-		const struct cache_entry *ce = the_index.cache[pos];
+		const struct cache_entry *ce = istate->cache[pos];
 		int stage = ce_stage(ce);
 
 		if (strcmp(ce->name, path))
@@ -29,7 +28,7 @@ static int merge_entry(int pos, const char *path)
 		xsnprintf(ownbuf[stage], sizeof(ownbuf[stage]), "%o", ce->ce_mode);
 		arguments[stage] = hexbuf[stage];
 		arguments[stage + 4] = ownbuf[stage];
-	} while (++pos < the_index.cache_nr);
+	} while (++pos < istate->cache_nr);
 	if (!found)
 		die(_("'%s' is not in the cache"), path);
 
@@ -46,27 +45,27 @@ static int merge_entry(int pos, const char *path)
 	return found;
 }
 
-static void merge_one_path(const char *path)
+static void merge_one_path(struct index_state *istate, const char *path)
 {
-	int pos = index_name_pos(&the_index, path, strlen(path));
+	int pos = index_name_pos(istate, path, strlen(path));
 
 	/*
 	 * If it already exists in the cache as stage0, it's
 	 * already merged and there is nothing to do.
 	 */
 	if (pos < 0)
-		merge_entry(-pos-1, path);
+		merge_entry(istate, -pos-1, path);
 }
 
-static void merge_all(void)
+static void merge_all(struct index_state *istate)
 {
 	int i;
 
-	for (i = 0; i < the_index.cache_nr; i++) {
-		const struct cache_entry *ce = the_index.cache[i];
+	for (i = 0; i < istate->cache_nr; i++) {
+		const struct cache_entry *ce = istate->cache[i];
 		if (!ce_stage(ce))
 			continue;
-		i += merge_entry(i, ce->name)-1;
+		i += merge_entry(istate, i, ce->name)-1;
 	}
 }
 
@@ -119,13 +118,13 @@ int cmd_merge_index(int argc, const char **argv, const char *prefix)
 	repo_read_index(the_repository);
 
 	/* TODO: audit for interaction with sparse-index. */
-	ensure_full_index(&the_index);
+	ensure_full_index(the_repository->index);
 
 	if (all)
-		merge_all();
+		merge_all(the_repository->index);
 	else
 		for (size_t i = 0; i < argc; i++)
-			merge_one_path(argv[i]);
+			merge_one_path(the_repository->index, argv[i]);
 
 	if (err && !quiet)
 		die(_("merge program failed"));
-- 
2.39.0.rc2.1048.g0e5493b8d5b


^ permalink raw reply related	[flat|nested] 221+ messages in thread

* [PATCH v10 10/12] merge-index: libify merge_one_path() and merge_all()
  2022-12-15  8:52                 ` [PATCH v10 " Ævar Arnfjörð Bjarmason
                                     ` (8 preceding siblings ...)
  2022-12-15  8:52                   ` [PATCH v10 09/12] builtin/merge-index.c: don't USE_THE_INDEX_VARIABLE Ævar Arnfjörð Bjarmason
@ 2022-12-15  8:52                   ` Ævar Arnfjörð Bjarmason
  2022-12-15  8:52                   ` [PATCH v10 11/12] merge-index: use "struct strvec" and helper to prepare args Ævar Arnfjörð Bjarmason
  2022-12-15  8:52                   ` [PATCH v10 12/12] merge-index: make the argument parsing sensible & simpler Ævar Arnfjörð Bjarmason
  11 siblings, 0 replies; 221+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-12-15  8:52 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, Junio C Hamano, Alban Gruin, Phillip Wood,
	Elijah Newren, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason

From: Alban Gruin <alban.gruin@gmail.com>

Move the workhorse functions in "builtin/merge-index.c" into a new
"merge-strategies" library, and mostly "libify" the code while doing
so.

Eventually this will allow us to invoke merge strategies such as
"resolve" and "octopus" in-process, once we've followed-up and
replaced "git-merge-{resolve,octopus}.sh" etc.

But for now let's move this code, while trying to optimize for as much
of it as possible being highlighted by the diff rename detection.

We still call die() in this library. An earlier version of this[1]
converted these to "error()", but the problem with that is that we'd then
potentially run into the same error N times, e.g. once for every
"<file>" we were asked to operate on, instead of dying on the first
case. So let's leave those to "die()" for now.

1. https://lore.kernel.org/git/20220809185429.20098-4-alban.gruin@gmail.com/
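
For readers unfamiliar with the pattern, the following self-contained C
sketch shows the general shape of a callback-driven walker with an
opaque "void *data" argument, and of stopping at the first failure
unless a "oneshot" flag is set.  It is only an approximation under
stated assumptions: toy_merge_fn, struct toy_state, toy_merge_one() and
toy_merge_paths() are hypothetical names, and the sketch returns an
error where the library code in this patch may call die().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback type: returns non-zero when merging "path" fails. */
typedef int (*toy_merge_fn)(const char *path, void *data);

/* Caller-owned state, handed back to the callback through "void *data". */
struct toy_state {
	const char *fail_on;	/* path for which the callback reports failure */
	int calls;		/* how many times the callback ran */
};

static int toy_merge_one(const char *path, void *data)
{
	struct toy_state *st = data;

	assert(st);
	st->calls++;
	return !strcmp(path, st->fail_on);
}

/*
 * Call "fn" for each path.  Without "oneshot", stop at the first failure;
 * with it, keep going and report the failure at the end.
 */
static int toy_merge_paths(const char **paths, size_t nr, int oneshot,
			   toy_merge_fn fn, void *data)
{
	int err = 0;
	size_t i;

	for (i = 0; i < nr; i++) {
		if (!fn(paths[i], data))
			continue;
		err = 1;
		if (!oneshot)
			break;
	}
	return err;
}
```

Keeping the per-caller state behind "void *data" is what lets one
walker serve both a caller that forks a merge program and a caller that
merges in-process.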

Signed-off-by: Alban Gruin <alban.gruin@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 Makefile              |  1 +
 builtin/merge-index.c | 95 ++++++++++++++++---------------------------
 merge-strategies.c    | 87 +++++++++++++++++++++++++++++++++++++++
 merge-strategies.h    | 19 +++++++++
 4 files changed, 142 insertions(+), 60 deletions(-)
 create mode 100644 merge-strategies.c
 create mode 100644 merge-strategies.h

diff --git a/Makefile b/Makefile
index 0f7d7ab1fd2..6f4ac2e541d 100644
--- a/Makefile
+++ b/Makefile
@@ -1064,6 +1064,7 @@ LIB_OBJS += merge-blobs.o
 LIB_OBJS += merge-ort.o
 LIB_OBJS += merge-ort-wrappers.o
 LIB_OBJS += merge-recursive.o
+LIB_OBJS += merge-strategies.o
 LIB_OBJS += merge.o
 LIB_OBJS += midx.o
 LIB_OBJS += name-hash.o
diff --git a/builtin/merge-index.c b/builtin/merge-index.c
index c269d76cc8f..21598a52383 100644
--- a/builtin/merge-index.c
+++ b/builtin/merge-index.c
@@ -1,77 +1,50 @@
 #include "builtin.h"
 #include "parse-options.h"
+#include "merge-strategies.h"
 #include "run-command.h"
 
-static const char *pgm;
-static int one_shot, quiet;
-static int err;
+struct mofs_data {
+	const char *program;
+};
 
-static int merge_entry(struct index_state *istate, int pos, const char *path)
+static int merge_one_file(struct index_state *istate,
+			  const struct object_id *orig_blob,
+			  const struct object_id *our_blob,
+			  const struct object_id *their_blob, const char *path,
+			  unsigned int orig_mode, unsigned int our_mode,
+			  unsigned int their_mode, void *data)
 {
-	int found;
+	struct mofs_data *d = data;
+	const char *pgm = d->program;
 	const char *arguments[] = { pgm, "", "", "", path, "", "", "", NULL };
 	char hexbuf[4][GIT_MAX_HEXSZ + 1];
 	char ownbuf[4][60];
+	int stage = 0;
 	struct child_process cmd = CHILD_PROCESS_INIT;
 
-	if (pos >= istate->cache_nr)
-		die(_("'%s' is not in the cache"), path);
-	found = 0;
-	do {
-		const struct cache_entry *ce = istate->cache[pos];
-		int stage = ce_stage(ce);
-
-		if (strcmp(ce->name, path))
-			break;
-		found++;
-		oid_to_hex_r(hexbuf[stage], &ce->oid);
-		xsnprintf(ownbuf[stage], sizeof(ownbuf[stage]), "%o", ce->ce_mode);
-		arguments[stage] = hexbuf[stage];
-		arguments[stage + 4] = ownbuf[stage];
-	} while (++pos < istate->cache_nr);
-	if (!found)
-		die(_("'%s' is not in the cache"), path);
-
-	strvec_pushv(&cmd.args, arguments);
-	if (run_command(&cmd)) {
-		if (one_shot)
-			err++;
-		else {
-			if (!quiet)
-				die(_("merge program failed"));
-			exit(1);
-		}
+#define ADD_MOF_ARG(oid, mode) \
+	if ((oid)) { \
+		stage++; \
+		oid_to_hex_r(hexbuf[stage], (oid)); \
+		xsnprintf(ownbuf[stage], sizeof(ownbuf[stage]), "%06o", (mode)); \
+		arguments[stage] = hexbuf[stage]; \
+		arguments[stage + 4] = ownbuf[stage]; \
 	}
-	return found;
-}
-
-static void merge_one_path(struct index_state *istate, const char *path)
-{
-	int pos = index_name_pos(istate, path, strlen(path));
 
-	/*
-	 * If it already exists in the cache as stage0, it's
-	 * already merged and there is nothing to do.
-	 */
-	if (pos < 0)
-		merge_entry(istate, -pos-1, path);
-}
-
-static void merge_all(struct index_state *istate)
-{
-	int i;
+	ADD_MOF_ARG(orig_blob, orig_mode);
+	ADD_MOF_ARG(our_blob, our_mode);
+	ADD_MOF_ARG(their_blob, their_mode);
 
-	for (i = 0; i < istate->cache_nr; i++) {
-		const struct cache_entry *ce = istate->cache[i];
-		if (!ce_stage(ce))
-			continue;
-		i += merge_entry(istate, i, ce->name)-1;
-	}
+	strvec_pushv(&cmd.args, arguments);
+	return run_command(&cmd);
 }
 
 int cmd_merge_index(int argc, const char **argv, const char *prefix)
 {
+	int err = 0;
 	int all = 0;
+	int one_shot = 0;
+	int quiet = 0;
 	const char * const usage[] = {
 		N_("git merge-index [-o] [-q] <merge-program> (-a | ([--] <file>...))"),
 		NULL
@@ -91,6 +64,7 @@ int cmd_merge_index(int argc, const char **argv, const char *prefix)
 		OPT_END(),
 	};
 #undef OPT__MERGE_INDEX_ALL
+	struct mofs_data data = { 0 };
 
 	/* Without this we cannot rely on waitpid() to tell
 	 * what happened to our children.
@@ -109,7 +83,7 @@ int cmd_merge_index(int argc, const char **argv, const char *prefix)
 	/* <merge-program> and its options */
 	if (!argc)
 		usage_msg_opt(_("need a <merge-program> argument"), usage, options);
-	pgm = argv[0];
+	data.program = argv[0];
 	argc = parse_options(argc, argv, prefix, options_prog, usage, 0);
 	if (argc && all)
 		usage_msg_opt(_("'-a' and '<file>...' are mutually exclusive"),
@@ -121,12 +95,13 @@ int cmd_merge_index(int argc, const char **argv, const char *prefix)
 	ensure_full_index(the_repository->index);
 
 	if (all)
-		merge_all(the_repository->index);
+		err |= merge_all_index(the_repository->index, one_shot, quiet,
+				       merge_one_file, &data);
 	else
 		for (size_t i = 0; i < argc; i++)
-			merge_one_path(the_repository->index, argv[i]);
+			err |= merge_index_path(the_repository->index,
+						one_shot, quiet, argv[i],
+						merge_one_file, &data);
 
-	if (err && !quiet)
-		die(_("merge program failed"));
 	return err;
 }
diff --git a/merge-strategies.c b/merge-strategies.c
new file mode 100644
index 00000000000..30691fccd77
--- /dev/null
+++ b/merge-strategies.c
@@ -0,0 +1,87 @@
+#include "cache.h"
+#include "merge-strategies.h"
+
+static int merge_entry(struct index_state *istate, unsigned int pos,
+		       const char *path, int *err, merge_index_fn fn,
+		       void *data)
+{
+	int found = 0;
+	const struct object_id *oids[3] = { 0 };
+	unsigned int modes[3] = { 0 };
+
+	*err = 0;
+
+	if (pos >= istate->cache_nr)
+		die(_("'%s' is not in the cache"), path);
+	do {
+		const struct cache_entry *ce = istate->cache[pos];
+		int stage = ce_stage(ce);
+
+		if (strcmp(ce->name, path))
+			break;
+		found++;
+		oids[stage - 1] = &ce->oid;
+		modes[stage - 1] = ce->ce_mode;
+	} while (++pos < istate->cache_nr);
+	if (!found)
+		die(_("'%s' is not in the cache"), path);
+
+	if (fn(istate, oids[0], oids[1], oids[2], path, modes[0], modes[1],
+	       modes[2], data))
+		(*err)++;
+
+	return found;
+}
+
+int merge_index_path(struct index_state *istate, int oneshot, int quiet,
+		     const char *path, merge_index_fn fn, void *data)
+{
+	int err, ret;
+	int pos = index_name_pos(istate, path, strlen(path));
+
+	/*
+	 * If it already exists in the cache as stage0, it's
+	 * already merged and there is nothing to do.
+	 */
+	if (pos >= 0)
+		return 0;
+
+	ret = merge_entry(istate, -pos - 1, path, &err, fn, data);
+	if (ret < 0)
+		return ret;
+	if (err) {
+		if (!quiet && !oneshot)
+			die(_("merge program failed"));
+		return 1;
+	}
+	return 0;
+}
+
+int merge_all_index(struct index_state *istate, int oneshot, int quiet,
+		    merge_index_fn fn, void *data)
+{
+	int err = 0, ret;
+	unsigned int i;
+
+	for (i = 0; i < istate->cache_nr; i++) {
+		const struct cache_entry *ce = istate->cache[i];
+		if (!ce_stage(ce))
+			continue;
+
+		ret = merge_entry(istate, i, ce->name, &err, fn, data);
+		if (ret < 0)
+			return ret;
+		else if (ret > 0)
+			i += ret - 1;
+
+		if (err && !oneshot) {
+			if (!quiet)
+				die(_("merge program failed"));
+			return 1;
+		}
+	}
+
+	if (err && !quiet)
+		die(_("merge program failed"));
+	return err;
+}
diff --git a/merge-strategies.h b/merge-strategies.h
new file mode 100644
index 00000000000..cee9168a046
--- /dev/null
+++ b/merge-strategies.h
@@ -0,0 +1,19 @@
+#ifndef MERGE_STRATEGIES_H
+#define MERGE_STRATEGIES_H
+
+struct object_id;
+struct index_state;
+typedef int (*merge_index_fn)(struct index_state *istate,
+			      const struct object_id *orig_blob,
+			      const struct object_id *our_blob,
+			      const struct object_id *their_blob,
+			      const char *path, unsigned int orig_mode,
+			      unsigned int our_mode, unsigned int their_mode,
+			      void *data);
+
+int merge_index_path(struct index_state *istate, int oneshot, int quiet,
+		     const char *path, merge_index_fn fn, void *data);
+int merge_all_index(struct index_state *istate, int oneshot, int quiet,
+		    merge_index_fn fn, void *data);
+
+#endif /* MERGE_STRATEGIES_H */
-- 
2.39.0.rc2.1048.g0e5493b8d5b



* [PATCH v10 11/12] merge-index: use "struct strvec" and helper to prepare args
  2022-12-15  8:52                 ` [PATCH v10 " Ævar Arnfjörð Bjarmason
                                     ` (9 preceding siblings ...)
  2022-12-15  8:52                   ` [PATCH v10 10/12] merge-index: libify merge_one_path() and merge_all() Ævar Arnfjörð Bjarmason
@ 2022-12-15  8:52                   ` Ævar Arnfjörð Bjarmason
  2022-12-15  8:52                   ` [PATCH v10 12/12] merge-index: make the argument parsing sensible & simpler Ævar Arnfjörð Bjarmason
  11 siblings, 0 replies; 221+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-12-15  8:52 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, Junio C Hamano, Alban Gruin, Phillip Wood,
	Elijah Newren, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason

Refactor the code that was libified in the preceding commit to use
strvec_pushf() with a helper function, instead of in-place xsnprintf()
code that we generate with a macro.

This is less efficient in terms of the number of allocations we do, but
it's now much clearer what's going on. The logic is simply that we
have an argument list like:

	<merge-program> <oids> <path> <modes>

Where each of the three stages contributes either an OID/mode pair, or
a pair of "" placeholders. Now we'll add the OIDs and the modes to
their own strvecs, which we then combine around the <path> at the end.
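
As an illustration only, the layout described above can be sketched in
standalone C without git's strvec API. The names stage_entry,
dup_or_die() and build_merge_argv() are hypothetical and are not part
of this patch; the sketch only shows how an absent stage becomes a ""
placeholder in both the OID and the mode position.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define NR_STAGES 3

/* Hypothetical stand-in for one index stage; hex_oid is NULL if absent. */
struct stage_entry {
	const char *hex_oid;
	unsigned int mode;
};

/* Duplicate a string or exit (a rough stand-in for git's xstrdup()). */
static char *dup_or_die(const char *s)
{
	char *copy = malloc(strlen(s) + 1);

	if (!copy) {
		perror("malloc");
		exit(1);
	}
	strcpy(copy, s);
	return copy;
}

/*
 * Fill argv, which must have room for NR_STAGES * 2 + 3 pointers, with:
 *
 *	<program> <oid> <oid> <oid> <path> <mode> <mode> <mode> NULL
 *
 * A missing stage yields "" for both its OID and its mode.
 * The caller owns (and should free) every string stored in argv.
 */
static void build_merge_argv(char **argv, const char *program,
			     const struct stage_entry *stages,
			     const char *path)
{
	char mode[16];
	int n = 0;

	argv[n++] = dup_or_die(program);
	for (int i = 0; i < NR_STAGES; i++)
		argv[n++] = dup_or_die(stages[i].hex_oid ?
				       stages[i].hex_oid : "");
	argv[n++] = dup_or_die(path);
	for (int i = 0; i < NR_STAGES; i++) {
		if (stages[i].hex_oid) {
			/* Same "%06o" format the patch uses for modes. */
			snprintf(mode, sizeof(mode), "%06o", stages[i].mode);
			argv[n++] = dup_or_die(mode);
		} else {
			argv[n++] = dup_or_die("");
		}
	}
	argv[n] = NULL;
}
```

With a missing first (ancestor) stage, this produces an argument
vector whose second and sixth elements are both "", which matches the
"either an OID/mode pair, or """ rule described above.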

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/merge-index.c | 44 ++++++++++++++++++++++++++-----------------
 1 file changed, 27 insertions(+), 17 deletions(-)

diff --git a/builtin/merge-index.c b/builtin/merge-index.c
index 21598a52383..d679272391b 100644
--- a/builtin/merge-index.c
+++ b/builtin/merge-index.c
@@ -7,6 +7,18 @@ struct mofs_data {
 	const char *program;
 };
 
+static void push_arg(struct strvec *oids, struct strvec *modes,
+		     const struct object_id *oid, const unsigned int mode)
+{
+	if (oid) {
+		strvec_push(oids, oid_to_hex(oid));
+		strvec_pushf(modes, "%06o", mode);
+	} else {
+		strvec_push(oids, "");
+		strvec_push(modes, "");
+	}
+}
+
 static int merge_one_file(struct index_state *istate,
 			  const struct object_id *orig_blob,
 			  const struct object_id *our_blob,
@@ -15,27 +27,25 @@ static int merge_one_file(struct index_state *istate,
 			  unsigned int their_mode, void *data)
 {
 	struct mofs_data *d = data;
-	const char *pgm = d->program;
-	const char *arguments[] = { pgm, "", "", "", path, "", "", "", NULL };
-	char hexbuf[4][GIT_MAX_HEXSZ + 1];
-	char ownbuf[4][60];
-	int stage = 0;
+	const char *program = d->program;
+	struct strvec oids = STRVEC_INIT;
+	struct strvec modes = STRVEC_INIT;
 	struct child_process cmd = CHILD_PROCESS_INIT;
 
-#define ADD_MOF_ARG(oid, mode) \
-	if ((oid)) { \
-		stage++; \
-		oid_to_hex_r(hexbuf[stage], (oid)); \
-		xsnprintf(ownbuf[stage], sizeof(ownbuf[stage]), "%06o", (mode)); \
-		arguments[stage] = hexbuf[stage]; \
-		arguments[stage + 4] = ownbuf[stage]; \
-	}
+	strvec_push(&cmd.args, program);
+
+	push_arg(&oids, &modes, orig_blob, orig_mode);
+	push_arg(&oids, &modes, our_blob, our_mode);
+	push_arg(&oids, &modes, their_blob, their_mode);
+
+	strvec_pushv(&cmd.args, oids.v);
+	strvec_clear(&oids);
+
+	strvec_push(&cmd.args, path);
 
-	ADD_MOF_ARG(orig_blob, orig_mode);
-	ADD_MOF_ARG(our_blob, our_mode);
-	ADD_MOF_ARG(their_blob, their_mode);
+	strvec_pushv(&cmd.args, modes.v);
+	strvec_clear(&modes);
 
-	strvec_pushv(&cmd.args, arguments);
 	return run_command(&cmd);
 }
 
-- 
2.39.0.rc2.1048.g0e5493b8d5b



* [PATCH v10 12/12] merge-index: make the argument parsing sensible & simpler
  2022-12-15  8:52                 ` [PATCH v10 " Ævar Arnfjörð Bjarmason
                                     ` (10 preceding siblings ...)
  2022-12-15  8:52                   ` [PATCH v10 11/12] merge-index: use "struct strvec" and helper to prepare args Ævar Arnfjörð Bjarmason
@ 2022-12-15  8:52                   ` Ævar Arnfjörð Bjarmason
  11 siblings, 0 replies; 221+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-12-15  8:52 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, Junio C Hamano, Alban Gruin, Phillip Wood,
	Elijah Newren, Johannes Schindelin,
	Ævar Arnfjörð Bjarmason

In a preceding commit when we migrated to parse_options() we took
pains to be bug-for-bug compatible with the existing command-line
interface, if possible.

I.e. we forbade forms like:

	git merge-index -a <program>
	git merge-index <program> <opts> -a

But allowed:

	git merge-index <program> -a
	git merge-index <opts> <program> -a

As the "-a" argument was considered to be provided for the "<program>",
and not to be a part of "<opts>".

We don't really need this strictness, as we don't have two "-a"
options. It's much simpler to implement a scheme where the first
non-option argument is the <program>, and the rest are the
"<file>...". We only allow that rest if the "-a" option isn't
supplied, and we require one or the other.
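
The rule above can be sketched as a small standalone parser. This is
an illustration under simplifying assumptions, not the code in this
patch: it does not use parse_options(), it only knows the short
options "-a", "-o" and "-q", and the names merge_index_args and
parse_merge_index_args() as well as the "unknown option" message are
hypothetical. The other three messages are the ones used in the diff
below.

```c
#include <string.h>

/* Hypothetical result of parsing; files points into the caller's argv. */
struct merge_index_args {
	int all, one_shot, quiet;
	const char *program;
	const char **files;
	int nr_files;
};

/*
 * Parse the arguments that follow "git merge-index". Options may
 * appear anywhere before "--"; non-options are compacted to the front
 * of argv in their original order. Returns NULL on success, or a
 * usage message describing the problem.
 */
static const char *parse_merge_index_args(int argc, const char **argv,
					  struct merge_index_args *out)
{
	int nr = 0, no_more_opts = 0;

	memset(out, 0, sizeof(*out));
	for (int i = 0; i < argc; i++) {
		const char *arg = argv[i];

		if (no_more_opts || arg[0] != '-' || !arg[1])
			argv[nr++] = arg; /* keep non-options, in order */
		else if (!strcmp(arg, "--"))
			no_more_opts = 1;
		else if (!strcmp(arg, "-a"))
			out->all = 1;
		else if (!strcmp(arg, "-o"))
			out->one_shot = 1;
		else if (!strcmp(arg, "-q"))
			out->quiet = 1;
		else
			return "unknown option";
	}

	/* The first non-option is the <program>, the rest are <file>... */
	if (!nr)
		return "need a <merge-program> argument";
	out->program = argv[0];
	out->files = argv + 1;
	out->nr_files = nr - 1;

	if (!out->nr_files && !out->all)
		return "need '-a' or '<file>...'";
	if (out->nr_files && out->all)
		return "'-a' and '<file>...' are mutually exclusive";
	return NULL;
}
```

Under this scheme "-a b program" is rejected because "b" becomes the
<program> and "program" a <file>, which conflicts with "-a". That is
why the corresponding test's expected message changes in the diff
below.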

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 builtin/merge-index.c  | 28 ++++++++--------------------
 t/t6060-merge-index.sh | 12 +++++++++---
 2 files changed, 17 insertions(+), 23 deletions(-)

diff --git a/builtin/merge-index.c b/builtin/merge-index.c
index d679272391b..d8b62e4f663 100644
--- a/builtin/merge-index.c
+++ b/builtin/merge-index.c
@@ -59,21 +59,14 @@ int cmd_merge_index(int argc, const char **argv, const char *prefix)
 		N_("git merge-index [-o] [-q] <merge-program> (-a | ([--] <file>...))"),
 		NULL
 	};
-#define OPT__MERGE_INDEX_ALL(v) \
-	OPT_BOOL('a', NULL, (v), \
-		 N_("merge all files in the index that need merging"))
 	struct option options[] = {
 		OPT_BOOL('o', NULL, &one_shot,
 			 N_("don't stop at the first failed merge")),
 		OPT__QUIET(&quiet, N_("be quiet")),
-		OPT__MERGE_INDEX_ALL(&all), /* include "-a" to show it in "-bh" */
+		OPT_BOOL('a', NULL, &all,
+			 N_("merge all files in the index that need merging")),
 		OPT_END(),
 	};
-	struct option options_prog[] = {
-		OPT__MERGE_INDEX_ALL(&all),
-		OPT_END(),
-	};
-#undef OPT__MERGE_INDEX_ALL
 	struct mofs_data data = { 0 };
 
 	/* Without this we cannot rely on waitpid() to tell
@@ -81,20 +74,15 @@ int cmd_merge_index(int argc, const char **argv, const char *prefix)
 	 */
 	signal(SIGCHLD, SIG_DFL);
 
-	if (argc < 3)
-		usage_with_options(usage, options);
-
-	/* Option parsing without <merge-program> options */
-	argc = parse_options(argc, argv, prefix, options, usage,
-			     PARSE_OPT_STOP_AT_NON_OPTION);
-	if (all)
-		usage_msg_optf(_("'%s' option can only be provided after '<merge-program>'"),
-			      usage, options, "-a");
-	/* <merge-program> and its options */
+	argc = parse_options(argc, argv, prefix, options, usage, 0);
 	if (!argc)
 		usage_msg_opt(_("need a <merge-program> argument"), usage, options);
 	data.program = argv[0];
-	argc = parse_options(argc, argv, prefix, options_prog, usage, 0);
+	argv++;
+	argc--;
+	if (!argc && !all)
+		usage_msg_opt(_("need '-a' or '<file>...'"),
+			      usage, options);
 	if (argc && all)
 		usage_msg_opt(_("'-a' and '<file>...' are mutually exclusive"),
 			      usage, options);
diff --git a/t/t6060-merge-index.sh b/t/t6060-merge-index.sh
index bc201a69552..4ff9ace7f73 100755
--- a/t/t6060-merge-index.sh
+++ b/t/t6060-merge-index.sh
@@ -22,7 +22,7 @@ test_expect_success 'usage: 2 arguments' '
 
 test_expect_success 'usage: -a before <program>' '
 	cat >expect <<-\EOF &&
-	fatal: '\''-a'\'' option can only be provided after '\''<merge-program>'\''
+	fatal: '\''-a'\'' and '\''<file>...'\'' are mutually exclusive
 	EOF
 	test_expect_code 129 git merge-index -a b program >out 2>actual.raw &&
 	grep "^fatal:" actual.raw >actual &&
@@ -34,7 +34,7 @@ for opt in -q -o
 do
 	test_expect_success "usage: $opt after -a" '
 		cat >expect <<-EOF &&
-		fatal: '\''-a'\'' option can only be provided after '\''<merge-program>'\''
+		fatal: need a <merge-program> argument
 		EOF
 		test_expect_code 129 git merge-index -a $opt >out 2>actual.raw &&
 		grep "^fatal:" actual.raw >actual &&
@@ -43,7 +43,13 @@ do
 	'
 
 	test_expect_success "usage: $opt program" '
-		test_expect_code 0 git merge-index $opt program
+		cat >expect <<-EOF &&
+		fatal: need '\''-a'\'' or '\''<file>...'\''
+		EOF
+		test_expect_code 129 git merge-index $opt program >out 2>actual.raw &&
+		grep "^fatal:" actual.raw >actual &&
+		test_must_be_empty out &&
+		test_cmp expect actual
 	'
 done
 
-- 
2.39.0.rc2.1048.g0e5493b8d5b



end of thread, other threads:[~2022-12-15  8:53 UTC | newest]

Thread overview: 221+ messages
2020-06-25 12:19 [RFC PATCH v1 00/17] Rewrite the remaining merge strategies from shell to C Alban Gruin
2020-06-25 12:19 ` [RFC PATCH v1 01/17] t6027: modernise tests Alban Gruin
2020-06-25 12:19 ` [RFC PATCH v1 02/17] merge-one-file: rewrite in C Alban Gruin
2020-06-25 14:55   ` Chris Torek
2020-06-25 15:16   ` Phillip Wood
2020-06-25 18:17     ` Phillip Wood
2020-06-26 14:33       ` Phillip Wood
2020-07-12 11:22     ` Alban Gruin
2020-06-25 12:19 ` [RFC PATCH v1 03/17] merge-one-file: remove calls to external processes Alban Gruin
2020-06-25 12:19 ` [RFC PATCH v1 04/17] merge-one-file: use error() instead of fprintf(stderr, ...) Alban Gruin
2020-06-25 12:19 ` [RFC PATCH v1 05/17] merge-one-file: libify merge_one_file() Alban Gruin
2020-06-25 12:19 ` [RFC PATCH v1 06/17] merge-index: libify merge_one_path() and merge_all() Alban Gruin
2020-06-26 10:13   ` Phillip Wood
2020-06-26 14:32     ` Phillip Wood
2020-07-12 11:36     ` Alban Gruin
2020-07-12 18:02       ` Phillip Wood
2020-07-12 20:10         ` Alban Gruin
2020-06-25 12:19 ` [RFC PATCH v1 07/17] merge-resolve: rewrite in C Alban Gruin
2020-06-25 12:19 ` [RFC PATCH v1 08/17] merge-resolve: remove calls to external processes Alban Gruin
2020-06-25 12:19 ` [RFC PATCH v1 09/17] merge-resolve: libify merge_resolve() Alban Gruin
2020-06-25 12:19 ` [RFC PATCH v1 10/17] merge-recursive: move better_branch_name() to merge.c Alban Gruin
2020-06-25 12:19 ` [RFC PATCH v1 11/17] merge-octopus: rewrite in C Alban Gruin
2020-06-25 12:19 ` [RFC PATCH v1 12/17] merge-octopus: remove calls to external processes Alban Gruin
2020-06-25 12:19 ` [RFC PATCH v1 13/17] merge-octopus: libify merge_octopus() Alban Gruin
2020-06-25 12:19 ` [RFC PATCH v1 14/17] merge: use the "resolve" strategy without forking Alban Gruin
2020-06-25 12:19 ` [RFC PATCH v1 15/17] merge: use the "octopus" " Alban Gruin
2020-06-25 12:19 ` [RFC PATCH v1 16/17] sequencer: use the "resolve" " Alban Gruin
2020-06-25 16:11   ` Phillip Wood
2020-07-12 11:27     ` Alban Gruin
2020-06-25 12:19 ` [RFC PATCH v1 17/17] sequencer: use the "octopus" merge " Alban Gruin
2020-09-01 10:56 ` [PATCH v2 00/11] Rewrite the remaining merge strategies from shell to C Alban Gruin
2020-09-01 10:56   ` [PATCH v2 01/11] t6027: modernise tests Alban Gruin
2020-09-01 10:56   ` [PATCH v2 02/11] merge-one-file: rewrite in C Alban Gruin
2020-09-01 21:06     ` Junio C Hamano
2020-09-02 14:50       ` Alban Gruin
2020-09-01 10:56   ` [PATCH v2 03/11] merge-index: libify merge_one_path() and merge_all() Alban Gruin
2020-09-01 21:11     ` Junio C Hamano
2020-09-02 15:37       ` Alban Gruin
2020-09-01 10:56   ` [PATCH v2 04/11] merge-index: don't fork if the requested program is `git-merge-one-file' Alban Gruin
2020-09-01 10:56   ` [PATCH v2 05/11] merge-resolve: rewrite in C Alban Gruin
2020-09-01 10:57   ` [PATCH v2 06/11] merge-recursive: move better_branch_name() to merge.c Alban Gruin
2020-09-01 10:57   ` [PATCH v2 07/11] merge-octopus: rewrite in C Alban Gruin
2020-09-01 10:57   ` [PATCH v2 08/11] merge: use the "resolve" strategy without forking Alban Gruin
2020-09-01 10:57   ` [PATCH v2 09/11] merge: use the "octopus" " Alban Gruin
2020-09-01 10:57   ` [PATCH v2 10/11] sequencer: use the "resolve" " Alban Gruin
2020-09-01 10:57   ` [PATCH v2 11/11] sequencer: use the "octopus" merge " Alban Gruin
2020-10-05 12:26   ` [PATCH v3 00/11] Rewrite the remaining merge strategies from shell to C Alban Gruin
2020-10-05 12:26     ` [PATCH v3 01/11] t6027: modernise tests Alban Gruin
2020-10-06 20:50       ` Junio C Hamano
2020-10-05 12:26     ` [PATCH v3 02/11] merge-one-file: rewrite in C Alban Gruin
2020-10-06 22:01       ` Junio C Hamano
2020-10-21 19:47         ` Alban Gruin
2020-10-21 20:28           ` Junio C Hamano
2020-10-21 21:20             ` Junio C Hamano
2020-10-21 20:30           ` Junio C Hamano
2020-10-05 12:26     ` [PATCH v3 03/11] merge-index: libify merge_one_path() and merge_all() Alban Gruin
2020-10-09  4:48       ` Junio C Hamano
2020-11-06 19:53         ` Alban Gruin
2020-10-05 12:26     ` [PATCH v3 04/11] merge-index: don't fork if the requested program is `git-merge-one-file' Alban Gruin
2020-10-16 19:07       ` Junio C Hamano
2020-10-05 12:26     ` [PATCH v3 05/11] merge-resolve: rewrite in C Alban Gruin
2020-10-16 19:19       ` Junio C Hamano
2020-11-06 19:53         ` Alban Gruin
2020-10-05 12:26     ` [PATCH v3 06/11] merge-recursive: move better_branch_name() to merge.c Alban Gruin
2020-10-05 12:26     ` [PATCH v3 07/11] merge-octopus: rewrite in C Alban Gruin
2020-10-05 12:26     ` [PATCH v3 08/11] merge: use the "resolve" strategy without forking Alban Gruin
2020-10-05 12:26     ` [PATCH v3 09/11] merge: use the "octopus" " Alban Gruin
2020-10-05 12:26     ` [PATCH v3 10/11] sequencer: use the "resolve" " Alban Gruin
2020-10-05 12:26     ` [PATCH v3 11/11] sequencer: use the "octopus" merge " Alban Gruin
2020-10-07  6:57     ` [PATCH v3 00/11] Rewrite the remaining merge strategies from shell to C Johannes Schindelin
2020-11-13 11:04     ` [PATCH v4 00/12] " Alban Gruin
2020-11-13 11:04       ` [PATCH v4 01/12] t6027: modernise tests Alban Gruin
2020-11-13 11:04       ` [PATCH v4 02/12] update-index: move add_cacheinfo() to read-cache.c Alban Gruin
2020-11-13 11:04       ` [PATCH v4 03/12] merge-one-file: rewrite in C Alban Gruin
2020-11-13 11:04       ` [PATCH v4 04/12] merge-index: libify merge_one_path() and merge_all() Alban Gruin
2020-11-13 11:04       ` [PATCH v4 05/12] merge-index: don't fork if the requested program is `git-merge-one-file' Alban Gruin
2020-11-13 11:04       ` [PATCH v4 06/12] merge-resolve: rewrite in C Alban Gruin
2020-11-13 11:04       ` [PATCH v4 07/12] merge-recursive: move better_branch_name() to merge.c Alban Gruin
2020-11-13 11:04       ` [PATCH v4 08/12] merge-octopus: rewrite in C Alban Gruin
2020-11-13 11:04       ` [PATCH v4 09/12] merge: use the "resolve" strategy without forking Alban Gruin
2020-11-13 11:04       ` [PATCH v4 10/12] merge: use the "octopus" " Alban Gruin
2020-11-13 11:04       ` [PATCH v4 11/12] sequencer: use the "resolve" " Alban Gruin
2020-11-13 11:04       ` [PATCH v4 12/12] sequencer: use the "octopus" merge " Alban Gruin
2020-11-16 10:21       ` [PATCH v5 00/12] Rewrite the remaining merge strategies from shell to C Alban Gruin
2020-11-16 10:21         ` [PATCH v5 01/12] t6027: modernise tests Alban Gruin
2020-11-16 10:21         ` [PATCH v5 02/12] update-index: move add_cacheinfo() to read-cache.c Alban Gruin
2020-11-16 10:21         ` [PATCH v5 03/12] merge-one-file: rewrite in C Alban Gruin
2020-11-16 10:21         ` [PATCH v5 04/12] merge-index: libify merge_one_path() and merge_all() Alban Gruin
2020-11-16 10:21         ` [PATCH v5 05/12] merge-index: don't fork if the requested program is `git-merge-one-file' Alban Gruin
2020-11-16 10:21         ` [PATCH v5 06/12] merge-resolve: rewrite in C Alban Gruin
2020-11-16 10:21         ` [PATCH v5 07/12] merge-recursive: move better_branch_name() to merge.c Alban Gruin
2020-11-16 10:21         ` [PATCH v5 08/12] merge-octopus: rewrite in C Alban Gruin
2020-11-16 10:21         ` [PATCH v5 09/12] merge: use the "resolve" strategy without forking Alban Gruin
2020-11-16 10:21         ` [PATCH v5 10/12] merge: use the "octopus" " Alban Gruin
2020-11-16 10:21         ` [PATCH v5 11/12] sequencer: use the "resolve" " Alban Gruin
2020-11-16 10:21         ` [PATCH v5 12/12] sequencer: use the "octopus" merge " Alban Gruin
2020-11-24 11:53         ` [PATCH v6 00/13] Rewrite the remaining merge strategies from shell to C Alban Gruin
2020-11-24 11:53           ` [PATCH v6 01/13] t6407: modernise tests Alban Gruin
2020-11-24 11:53           ` [PATCH v6 02/13] t6060: modify multiple files to expose a possible issue with merge-index Alban Gruin
2020-11-24 11:53           ` [PATCH v6 03/13] update-index: move add_cacheinfo() to read-cache.c Alban Gruin
2020-12-22 20:54             ` Junio C Hamano
2020-11-24 11:53           ` [PATCH v6 04/13] merge-one-file: rewrite in C Alban Gruin
2020-12-22 21:36             ` Junio C Hamano
2021-01-03 22:41               ` Alban Gruin
2021-01-08  6:54                 ` Junio C Hamano
2020-11-24 11:53           ` [PATCH v6 05/13] merge-index: libify merge_one_path() and merge_all() Alban Gruin
2021-01-05 15:59             ` Derrick Stolee
2021-01-05 23:20               ` Alban Gruin
2020-11-24 11:53           ` [PATCH v6 06/13] merge-index: don't fork if the requested program is `git-merge-one-file' Alban Gruin
2021-01-05 16:11             ` Derrick Stolee
2021-01-05 17:35               ` Martin Ågren
2021-01-05 23:20                 ` Alban Gruin
2021-01-05 23:20               ` Alban Gruin
2021-01-06  2:04                 ` Junio C Hamano
2021-01-10 17:15                   ` Alban Gruin
2021-01-10 20:51                     ` Junio C Hamano
2021-03-08 20:32                       ` Alban Gruin
2020-11-24 11:53           ` [PATCH v6 07/13] merge-resolve: rewrite in C Alban Gruin
2020-11-24 11:53           ` [PATCH v6 08/13] merge-recursive: move better_branch_name() to merge.c Alban Gruin
2021-01-05 16:19             ` Derrick Stolee
2020-11-24 11:53           ` [PATCH v6 09/13] merge-octopus: rewrite in C Alban Gruin
2021-01-05 16:40             ` Derrick Stolee
2020-11-24 11:53           ` [PATCH v6 10/13] merge: use the "resolve" strategy without forking Alban Gruin
2021-01-05 16:45             ` Derrick Stolee
2020-11-24 11:53           ` [PATCH v6 11/13] merge: use the "octopus" " Alban Gruin
2020-11-24 11:53           ` [PATCH v6 12/13] sequencer: use the "resolve" " Alban Gruin
2020-11-24 11:53           ` [PATCH v6 13/13] sequencer: use the "octopus" merge " Alban Gruin
2020-11-24 19:34           ` [PATCH v6 00/13] Rewrite the remaining merge strategies from shell to C SZEDER Gábor
2021-01-05 16:50           ` Derrick Stolee
2021-03-17 20:49           ` [PATCH v7 00/15] " Alban Gruin
2021-03-17 20:49             ` [PATCH v7 01/15] t6407: modernise tests Alban Gruin
2021-03-17 20:49             ` [PATCH v7 02/15] t6060: modify multiple files to expose a possible issue with merge-index Alban Gruin
2021-03-17 20:49             ` [PATCH v7 03/15] t6060: add tests for removed files Alban Gruin
2021-03-22 21:36               ` Johannes Schindelin
2021-03-23 20:43                 ` Alban Gruin
2021-03-17 20:49             ` [PATCH v7 04/15] merge-index: libify merge_one_path() and merge_all() Alban Gruin
2021-03-17 20:49             ` [PATCH v7 05/15] merge-index: drop the index Alban Gruin
2021-03-17 20:49             ` [PATCH v7 06/15] merge-index: add a new way to invoke `git-merge-one-file' Alban Gruin
2021-03-17 20:49             ` [PATCH v7 07/15] update-index: move add_cacheinfo() to read-cache.c Alban Gruin
2021-03-22 21:59               ` Johannes Schindelin
2021-03-23 20:45                 ` Alban Gruin
2021-03-17 20:49             ` [PATCH v7 08/15] merge-one-file: rewrite in C Alban Gruin
2021-03-22 22:20               ` Johannes Schindelin
2021-03-23 20:53                 ` Alban Gruin
2021-03-24  9:10                   ` Johannes Schindelin
2021-04-10 14:17                     ` Alban Gruin
2021-03-17 20:49             ` [PATCH v7 09/15] merge-resolve: " Alban Gruin
2021-03-23 22:21               ` Johannes Schindelin
2021-04-10 14:17                 ` Alban Gruin
2021-03-17 20:49             ` [PATCH v7 10/15] merge-recursive: move better_branch_name() to merge.c Alban Gruin
2021-03-17 20:49             ` [PATCH v7 11/15] merge-octopus: rewrite in C Alban Gruin
2021-03-23 23:58               ` Johannes Schindelin
2021-03-17 20:49             ` [PATCH v7 12/15] merge: use the "resolve" strategy without forking Alban Gruin
2021-03-17 20:49             ` [PATCH v7 13/15] merge: use the "octopus" " Alban Gruin
2021-03-17 20:49             ` [PATCH v7 14/15] sequencer: use the "resolve" " Alban Gruin
2021-03-17 20:49             ` [PATCH v7 15/15] sequencer: use the "octopus" merge " Alban Gruin
2022-08-09 18:54             ` [PATCH v8 00/14] Rewrite the remaining merge strategies from shell to C Alban Gruin
2022-08-09 18:54               ` [PATCH v8 01/14] t6060: modify multiple files to expose a possible issue with merge-index Alban Gruin
2022-08-09 18:54               ` [PATCH v8 02/14] t6060: add tests for removed files Alban Gruin
2022-08-09 18:54               ` [PATCH v8 03/14] merge-index: libify merge_one_path() and merge_all() Alban Gruin
2022-08-17  2:10                 ` Ævar Arnfjörð Bjarmason
2022-08-09 18:54               ` [PATCH v8 04/14] merge-index: drop the index Alban Gruin
2022-08-09 18:54               ` [PATCH v8 05/14] merge-index: add a new way to invoke `git-merge-one-file' Alban Gruin
2022-08-09 21:36                 ` Johannes Schindelin
2022-08-10 13:14                   ` Phillip Wood
2022-08-09 18:54               ` [PATCH v8 06/14] update-index: move add_cacheinfo() to read-cache.c Alban Gruin
2022-08-09 18:54               ` [PATCH v8 07/14] merge-one-file: rewrite in C Alban Gruin
2022-08-09 22:01                 ` Johannes Schindelin
2022-08-09 18:54               ` [PATCH v8 08/14] merge-resolve: " Alban Gruin
2022-08-10 15:03                 ` Phillip Wood
2022-08-10 21:20                   ` Junio C Hamano
2022-08-16 12:09                     ` Johannes Schindelin
2022-08-16 19:36                       ` Junio C Hamano
2022-08-17  9:42                         ` Johannes Schindelin
2022-08-17 19:06                           ` Elijah Newren
2022-08-17 19:18                             ` Junio C Hamano
2022-08-18 14:24                               ` Ævar Arnfjörð Bjarmason
2022-08-18 17:32                                 ` Junio C Hamano
2022-08-19  1:43                                 ` Elijah Newren
2022-08-19  2:45                                   ` Ævar Arnfjörð Bjarmason
2022-08-19  4:27                                     ` Elijah Newren
2022-08-17 19:12                           ` Junio C Hamano
2022-08-16 12:17                   ` Johannes Schindelin
2022-08-16 14:02                     ` Phillip Wood
2022-08-17  2:16                 ` Ævar Arnfjörð Bjarmason
2022-08-18 14:43                 ` Ævar Arnfjörð Bjarmason
2022-08-09 18:54               ` [PATCH v8 09/14] merge-recursive: move better_branch_name() to merge.c Alban Gruin
2022-08-09 18:54               ` [PATCH v8 10/14] merge-octopus: rewrite in C Alban Gruin
2022-08-09 18:54               ` [PATCH v8 11/14] merge: use the "resolve" strategy without forking Alban Gruin
2022-08-13 16:18                 ` Junio C Hamano
2022-08-09 18:54               ` [PATCH v8 12/14] merge: use the "octopus" " Alban Gruin
2022-08-09 18:54               ` [PATCH v8 13/14] sequencer: use the "resolve" " Alban Gruin
2022-08-09 18:54               ` [PATCH v8 14/14] sequencer: use the "octopus" " Alban Gruin
2022-11-18 11:18               ` [PATCH v9 00/12] merge-index: prepare to rewrite merge drivers in C Ævar Arnfjörð Bjarmason
2022-11-18 11:18                 ` [PATCH v9 01/12] merge-index doc & -h: fix padding, labels and "()" use Ævar Arnfjörð Bjarmason
2022-11-18 11:18                 ` [PATCH v9 02/12] t6060: modify multiple files to expose a possible issue with merge-index Ævar Arnfjörð Bjarmason
2022-11-18 11:18                 ` [PATCH v9 03/12] t6060: add tests for removed files Ævar Arnfjörð Bjarmason
2022-11-18 11:18                 ` [PATCH v9 04/12] merge-index tests: add usage tests Ævar Arnfjörð Bjarmason
2022-11-18 11:18                 ` [PATCH v9 05/12] merge-index: migrate to parse_options() API Ævar Arnfjörð Bjarmason
2022-11-18 11:18                 ` [PATCH v9 06/12] merge-index: improve die() error messages Ævar Arnfjörð Bjarmason
2022-11-18 11:18                 ` [PATCH v9 07/12] merge-index i18n: mark die() messages for translation Ævar Arnfjörð Bjarmason
2022-11-18 11:18                 ` [PATCH v9 08/12] merge-index: stop calling ensure_full_index() twice Ævar Arnfjörð Bjarmason
2022-11-18 11:18                 ` [PATCH v9 09/12] builtin/merge-index.c: don't USE_THE_INDEX_COMPATIBILITY_MACROS Ævar Arnfjörð Bjarmason
2022-11-18 11:18                 ` [PATCH v9 10/12] merge-index: libify merge_one_path() and merge_all() Ævar Arnfjörð Bjarmason
2022-11-18 11:18                 ` [PATCH v9 11/12] merge-index: use "struct strvec" and helper to prepare args Ævar Arnfjörð Bjarmason
2022-11-18 11:18                 ` [PATCH v9 12/12] merge-index: make the argument parsing sensible & simpler Ævar Arnfjörð Bjarmason
2022-11-18 23:30                 ` [PATCH v9 00/12] merge-index: prepare to rewrite merge drivers in C Taylor Blau
2022-11-19 12:46                   ` Ævar Arnfjörð Bjarmason
2022-12-15  8:52                 ` [PATCH v10 " Ævar Arnfjörð Bjarmason
2022-12-15  8:52                   ` [PATCH v10 01/12] merge-index doc & -h: fix padding, labels and "()" use Ævar Arnfjörð Bjarmason
2022-12-15  8:52                   ` [PATCH v10 02/12] t6060: modify multiple files to expose a possible issue with merge-index Ævar Arnfjörð Bjarmason
2022-12-15  8:52                   ` [PATCH v10 03/12] t6060: add tests for removed files Ævar Arnfjörð Bjarmason
2022-12-15  8:52                   ` [PATCH v10 04/12] merge-index tests: add usage tests Ævar Arnfjörð Bjarmason
2022-12-15  8:52                   ` [PATCH v10 05/12] merge-index: migrate to parse_options() API Ævar Arnfjörð Bjarmason
2022-12-15  8:52                   ` [PATCH v10 06/12] merge-index: improve die() error messages Ævar Arnfjörð Bjarmason
2022-12-15  8:52                   ` [PATCH v10 07/12] merge-index i18n: mark die() messages for translation Ævar Arnfjörð Bjarmason
2022-12-15  8:52                   ` [PATCH v10 08/12] merge-index: stop calling ensure_full_index() twice Ævar Arnfjörð Bjarmason
2022-12-15  8:52                   ` [PATCH v10 09/12] builtin/merge-index.c: don't USE_THE_INDEX_VARIABLE Ævar Arnfjörð Bjarmason
2022-12-15  8:52                   ` [PATCH v10 10/12] merge-index: libify merge_one_path() and merge_all() Ævar Arnfjörð Bjarmason
2022-12-15  8:52                   ` [PATCH v10 11/12] merge-index: use "struct strvec" and helper to prepare args Ævar Arnfjörð Bjarmason
2022-12-15  8:52                   ` [PATCH v10 12/12] merge-index: make the argument parsing sensible & simpler Ævar Arnfjörð Bjarmason
