All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v4 0/2] push: submodule support
@ 2011-08-19 22:08 Fredrik Gustafsson
  2011-08-19 22:08 ` [PATCH v4 1/2] push: Don't push a repository with unpushed submodules Fredrik Gustafsson
  2011-08-19 22:08 ` [PATCH v4 2/2] push: teach --recurse-submodules the on-demand option Fredrik Gustafsson
  0 siblings, 2 replies; 20+ messages in thread
From: Fredrik Gustafsson @ 2011-08-19 22:08 UTC (permalink / raw)
  To: git; +Cc: iveqy, hvoigt, jens.lehmann, gitster

The first iteration of this patch series can be found here:
http://thread.gmane.org/gmane.comp.version-control.git/176328/focus=176327

The second iteration of this patch series can be found here:
http://thread.gmane.org/gmane.comp.version-control.git/177992

The third iteration of this patch series can be found here:
http://thread.gmane.org/gmane.comp.version-control.git/179037/focus=179048

Fredrik Gustafsson (2):
  push: Don't push a repository with unpushed submodules
  push: teach --recurse-submodules the on-demand option

 Documentation/git-push.txt     |    9 ++
 builtin/push.c                 |   26 +++++++
 combine-diff.c                 |    2 +-
 submodule.c                    |  161 ++++++++++++++++++++++++++++++++++++++++
 submodule.h                    |    2 +
 t/t5531-deep-submodule-push.sh |  111 +++++++++++++++++++++++++++
 transport.c                    |   17 ++++
 transport.h                    |    2 +
 8 files changed, 329 insertions(+), 1 deletions(-)

-- 
1.7.6.551.gfb18e

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v4 1/2] push: Don't push a repository with unpushed submodules
  2011-08-19 22:08 [PATCH v4 0/2] push: submodule support Fredrik Gustafsson
@ 2011-08-19 22:08 ` Fredrik Gustafsson
  2011-08-19 23:26   ` Junio C Hamano
  2011-08-19 22:08 ` [PATCH v4 2/2] push: teach --recurse-submodules the on-demand option Fredrik Gustafsson
  1 sibling, 1 reply; 20+ messages in thread
From: Fredrik Gustafsson @ 2011-08-19 22:08 UTC (permalink / raw)
  To: git; +Cc: iveqy, hvoigt, jens.lehmann, gitster

When working with submodules it is easy to forget to push a
submodule to the server but pushing a super-project that
contains a commit for that submodule. The result is that the
superproject points at a submodule commit that is not available
on the server.

This adds the option --recurse-submodules=check to push. When
using this option git will check that all submodule commits that
are about to be pushed are present on a remote of the submodule.

To be able to use a combined diff, disabling a diff callback has
been removed from combined-diff.c.

Signed-off-by: Fredrik Gustafsson <iveqy@iveqy.com>
Mentored-by: Jens Lehmann <Jens.Lehmann@web.de>
Mentored-by: Heiko Voigt <hvoigt@hvoigt.net>
---
 Documentation/git-push.txt     |    6 ++
 builtin/push.c                 |   19 +++++++
 combine-diff.c                 |    2 +-
 submodule.c                    |  108 ++++++++++++++++++++++++++++++++++++++++
 submodule.h                    |    1 +
 t/t5531-deep-submodule-push.sh |   87 ++++++++++++++++++++++++++++++++
 transport.c                    |    9 +++
 transport.h                    |    1 +
 8 files changed, 232 insertions(+), 1 deletions(-)

diff --git a/Documentation/git-push.txt b/Documentation/git-push.txt
index 49c6e9f..aede488 100644
--- a/Documentation/git-push.txt
+++ b/Documentation/git-push.txt
@@ -162,6 +162,12 @@ useful if you write an alias or script around 'git push'.
 	is specified. This flag forces progress status even if the
 	standard error stream is not directed to a terminal.
 
+--recurse-submodules=check::
+	Check whether all submodule commits used by the revisions to be
+	pushed are available on a remote tracking branch. Otherwise the
+	push will be aborted and the command will exit with non-zero status.
+
+
 include::urls-remotes.txt[]
 
 OUTPUT
diff --git a/builtin/push.c b/builtin/push.c
index 9cebf9e..35cce53 100644
--- a/builtin/push.c
+++ b/builtin/push.c
@@ -8,6 +8,7 @@
 #include "remote.h"
 #include "transport.h"
 #include "parse-options.h"
+#include "submodule.h"
 
 static const char * const push_usage[] = {
 	"git push [<options>] [<repository> [<refspec>...]]",
@@ -219,6 +220,21 @@ static int do_push(const char *repo, int flags)
 	return !!errs;
 }
 
+static int option_parse_recurse_submodules(const struct option *opt,
+				   const char *arg, int unset)
+{
+	int *flags = opt->value;
+	if (arg) {
+		if (!strcmp(arg, "check"))
+			*flags |= TRANSPORT_RECURSE_SUBMODULES_CHECK;
+		else
+			die("bad %s argument: %s", opt->long_name, arg);
+	} else
+		die("option %s needs an argument (check)", opt->long_name);
+
+	return 0;
+}
+
 int cmd_push(int argc, const char **argv, const char *prefix)
 {
 	int flags = 0;
@@ -236,6 +252,9 @@ int cmd_push(int argc, const char **argv, const char *prefix)
 		OPT_BIT('n' , "dry-run", &flags, "dry run", TRANSPORT_PUSH_DRY_RUN),
 		OPT_BIT( 0,  "porcelain", &flags, "machine-readable output", TRANSPORT_PUSH_PORCELAIN),
 		OPT_BIT('f', "force", &flags, "force updates", TRANSPORT_PUSH_FORCE),
+		{ OPTION_CALLBACK, 0, "recurse-submodules", &flags, "check",
+			"controls recursive pushing of submodules",
+			PARSE_OPT_OPTARG, option_parse_recurse_submodules },
 		OPT_BOOLEAN( 0 , "thin", &thin, "use thin pack"),
 		OPT_STRING( 0 , "receive-pack", &receivepack, "receive-pack", "receive pack program"),
 		OPT_STRING( 0 , "exec", &receivepack, "receive-pack", "receive pack program"),
diff --git a/combine-diff.c b/combine-diff.c
index b11eb71..f7a8978 100644
--- a/combine-diff.c
+++ b/combine-diff.c
@@ -1074,7 +1074,7 @@ void diff_tree_combined(const unsigned char *sha1,
 		 * when doing combined diff.
 		 */
 		int stat_opt = (opt->output_format &
-				(DIFF_FORMAT_NUMSTAT|DIFF_FORMAT_DIFFSTAT));
+				(DIFF_FORMAT_NUMSTAT|DIFF_FORMAT_DIFFSTAT|DIFF_FORMAT_CALLBACK));
 		if (i == 0 && stat_opt)
 			diffopts.output_format = stat_opt;
 		else
diff --git a/submodule.c b/submodule.c
index 1ba9646..45f508c 100644
--- a/submodule.c
+++ b/submodule.c
@@ -308,6 +308,114 @@ void set_config_fetch_recurse_submodules(int value)
 	config_fetch_recurse_submodules = value;
 }
 
+static int has_remote(const char *refname, const unsigned char *sha1, int flags, void *cb_data)
+{
+	return 1;
+}
+
+static int submodule_needs_pushing(const char *path, const unsigned char sha1[20])
+{
+	if (add_submodule_odb(path) || !lookup_commit_reference(sha1))
+		return 0;
+
+	if (for_each_remote_ref_submodule(path, has_remote, NULL) > 0) {
+		struct child_process cp;
+		const char *argv[] = {"rev-list", NULL, "--not", "--remotes", "-n", "1" , NULL};
+		struct strbuf buf = STRBUF_INIT;
+		int needs_pushing = 0;
+
+		argv[1] = sha1_to_hex(sha1);
+		memset(&cp, 0, sizeof(cp));
+		cp.argv = argv;
+		cp.env = local_repo_env;
+		cp.git_cmd = 1;
+		cp.no_stdin = 1;
+		cp.out = -1;
+		cp.dir = path;
+		if (start_command(&cp))
+			die("Could not run 'git rev-list %s --not --remotes -n 1' command in submodule %s",
+				sha1_to_hex(sha1), path);
+		if (strbuf_read(&buf, cp.out, 41))
+			needs_pushing = 1;
+		finish_command(&cp);
+		close(cp.out);
+		strbuf_release(&buf);
+		return needs_pushing;
+	}
+
+	return 0;
+}
+
+static void collect_submodules_from_diff(struct diff_queue_struct *q,
+					 struct diff_options *options,
+					 void *data)
+{
+	int i;
+	int *needs_pushing = data;
+
+	for (i = 0; i < q->nr; i++) {
+		struct diff_filepair *p = q->queue[i];
+		if (!S_ISGITLINK(p->two->mode))
+			continue;
+		if (submodule_needs_pushing(p->two->path, p->two->sha1)) {
+			*needs_pushing = 1;
+			break;
+		}
+	}
+}
+
+
+static void commit_need_pushing(struct commit *commit, struct commit_list *parent, int *needs_pushing)
+{
+	const unsigned char (*parents)[20];
+	unsigned int i, n;
+	struct rev_info rev;
+
+	n = commit_list_count(parent);
+	parents = xmalloc(n * sizeof(*parents));
+
+	for (i = 0; i < n; i++) {
+		hashcpy((unsigned char *)(parents + i), parent->item->object.sha1);
+		parent = parent->next;
+	}
+
+	init_revisions(&rev, NULL);
+	rev.diffopt.output_format |= DIFF_FORMAT_CALLBACK;
+	rev.diffopt.format_callback = collect_submodules_from_diff;
+	rev.diffopt.format_callback_data = needs_pushing;
+	diff_tree_combined(commit->object.sha1, parents, n, 1, &rev);
+
+	free(parents);
+}
+
+int check_submodule_needs_pushing(unsigned char new_sha1[20], const char *remotes_name)
+{
+	struct rev_info rev;
+	struct commit *commit;
+	const char *argv[] = {NULL, NULL, "--not", "NULL", NULL};
+	int argc = ARRAY_SIZE(argv) - 1;
+	char *sha1_copy;
+	int needs_pushing = 0;
+	struct strbuf remotes_arg = STRBUF_INIT;
+
+	strbuf_addf(&remotes_arg, "--remotes=%s", remotes_name);
+	init_revisions(&rev, NULL);
+	sha1_copy = xstrdup(sha1_to_hex(new_sha1));
+	argv[1] = sha1_copy;
+	argv[3] = remotes_arg.buf;
+	setup_revisions(argc, argv, &rev, NULL);
+	if (prepare_revision_walk(&rev))
+		die("revision walk setup failed");
+
+	while ((commit = get_revision(&rev)) && !needs_pushing)
+		commit_need_pushing(commit, commit->parents, &needs_pushing);
+
+	free(sha1_copy);
+	strbuf_release(&remotes_arg);
+
+	return needs_pushing;
+}
+
 static int is_submodule_commit_present(const char *path, unsigned char sha1[20])
 {
 	int is_present = 0;
diff --git a/submodule.h b/submodule.h
index 5350b0d..799c22d 100644
--- a/submodule.h
+++ b/submodule.h
@@ -29,5 +29,6 @@ int fetch_populated_submodules(int num_options, const char **options,
 unsigned is_submodule_modified(const char *path, int ignore_untracked);
 int merge_submodule(unsigned char result[20], const char *path, const unsigned char base[20],
 		    const unsigned char a[20], const unsigned char b[20]);
+int check_submodule_needs_pushing(unsigned char new_sha1[20], const char *remotes_name);
 
 #endif
diff --git a/t/t5531-deep-submodule-push.sh b/t/t5531-deep-submodule-push.sh
index faa2e96..30bec4b 100755
--- a/t/t5531-deep-submodule-push.sh
+++ b/t/t5531-deep-submodule-push.sh
@@ -32,4 +32,91 @@ test_expect_success push '
 	)
 '
 
+test_expect_success 'push if submodule has no remote' '
+	(
+		cd work/gar/bage &&
+		>junk2 &&
+		git add junk2 &&
+		git commit -m "Second junk"
+	) &&
+	(
+		cd work &&
+		git add gar/bage &&
+		git commit -m "Second commit for gar/bage" &&
+		git push --recurse-submodules=check ../pub.git master
+	)
+'
+
+test_expect_success 'push fails if submodule commit not on remote' '
+	(
+		cd work/gar &&
+		git clone --bare bage ../../submodule.git &&
+		cd bage &&
+		git remote add origin ../../../submodule.git &&
+		git fetch &&
+		>junk3 &&
+		git add junk3 &&
+		git commit -m "Third junk"
+	) &&
+	(
+		cd work &&
+		git add gar/bage &&
+		git commit -m "Third commit for gar/bage" &&
+		test_must_fail git push --recurse-submodules=check ../pub.git master
+	)
+'
+
+test_expect_success 'push succeeds after commit was pushed to remote' '
+	(
+		cd work/gar/bage &&
+		git push origin master
+	) &&
+	(
+		cd work &&
+		git push --recurse-submodules=check ../pub.git master
+	)
+'
+
+test_expect_success 'push fails when commit on multiple branches if one branch has no remote' '
+	(
+		cd work/gar/bage &&
+		>junk4 &&
+		git add junk4 &&
+		git commit -m "Fourth junk"
+	) &&
+	(
+		cd work &&
+		git branch branch2 &&
+		git add gar/bage &&
+		git commit -m "Fourth commit for gar/bage" &&
+		git checkout branch2 &&
+		(
+			cd gar/bage &&
+			git checkout HEAD~1
+		) &&
+		>junk1 &&
+		git add junk1 &&
+		git commit -m "First junk" &&
+		test_must_fail git push --recurse-submodules=check ../pub.git
+	)
+'
+
+test_expect_success 'push succeeds if submodule has no remote and is on the first superproject commit' '
+	git init --bare a
+	git clone a a1 &&
+	(
+		cd a1 &&
+		git init b
+		(
+			cd b &&
+			>junk &&
+			git add junk &&
+			git commit -m "initial"
+		) &&
+		git add b &&
+		git commit -m "added submodule" &&
+		git push --recurse-submodule=check origin master
+	)
+'
+
 test_done
diff --git a/transport.c b/transport.c
index 98c5778..d2725e5 100644
--- a/transport.c
+++ b/transport.c
@@ -10,6 +10,7 @@
 #include "refs.h"
 #include "branch.h"
 #include "url.h"
+#include "submodule.h"
 
 /* rsync support */
 
@@ -1045,6 +1046,14 @@ int transport_push(struct transport *transport,
 			flags & TRANSPORT_PUSH_MIRROR,
 			flags & TRANSPORT_PUSH_FORCE);
 
+		if ((flags & TRANSPORT_RECURSE_SUBMODULES_CHECK) && !is_bare_repository()) {
+			struct ref *ref = remote_refs;
+			for (; ref; ref = ref->next)
+				if (!is_null_sha1(ref->new_sha1) &&
+				    check_submodule_needs_pushing(ref->new_sha1,transport->remote->name))
+					die("There are unpushed submodules, aborting.");
+		}
+
 		push_ret = transport->push_refs(transport, remote_refs, flags);
 		err = push_had_errors(remote_refs);
 		ret = push_ret | err;
diff --git a/transport.h b/transport.h
index 161d724..059b330 100644
--- a/transport.h
+++ b/transport.h
@@ -101,6 +101,7 @@ struct transport {
 #define TRANSPORT_PUSH_MIRROR 8
 #define TRANSPORT_PUSH_PORCELAIN 16
 #define TRANSPORT_PUSH_SET_UPSTREAM 32
+#define TRANSPORT_RECURSE_SUBMODULES_CHECK 64
 
 #define TRANSPORT_SUMMARY_WIDTH (2 * DEFAULT_ABBREV + 3)
 
-- 
1.7.6.551.gfb18e

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v4 2/2] push: teach --recurse-submodules the on-demand option
  2011-08-19 22:08 [PATCH v4 0/2] push: submodule support Fredrik Gustafsson
  2011-08-19 22:08 ` [PATCH v4 1/2] push: Don't push a repository with unpushed submodules Fredrik Gustafsson
@ 2011-08-19 22:08 ` Fredrik Gustafsson
  2011-09-02 18:21   ` Junio C Hamano
  1 sibling, 1 reply; 20+ messages in thread
From: Fredrik Gustafsson @ 2011-08-19 22:08 UTC (permalink / raw)
  To: git; +Cc: iveqy, hvoigt, jens.lehmann, gitster

When using this option git will search for all submodules that
have changed in the revisions to be send. It will then try to
push the currently checked out branch of each submodule.

This helps when a user has finished working on a change which
involves submodules and just wants to push everything in one go.

Signed-off-by: Fredrik Gustafsson <iveqy@iveqy.com>
Mentored-by: Jens Lehmann <Jens.Lehmann@web.de>
Mentored-by: Heiko Voigt <hvoigt@hvoigt.net>
---
 Documentation/git-push.txt     |   13 ++++--
 builtin/push.c                 |    7 +++
 submodule.c                    |   89 ++++++++++++++++++++++++++++++++--------
 submodule.h                    |    1 +
 t/t5531-deep-submodule-push.sh |   24 +++++++++++
 transport.c                    |   10 ++++-
 transport.h                    |    1 +
 7 files changed, 121 insertions(+), 24 deletions(-)

diff --git a/Documentation/git-push.txt b/Documentation/git-push.txt
index aede488..fe60d28 100644
--- a/Documentation/git-push.txt
+++ b/Documentation/git-push.txt
@@ -162,11 +162,14 @@ useful if you write an alias or script around 'git push'.
 	is specified. This flag forces progress status even if the
 	standard error stream is not directed to a terminal.
 
---recurse-submodules=check::
-	Check whether all submodule commits used by the revisions to be
-	pushed are available on a remote tracking branch. Otherwise the
-	push will be aborted and the command will exit with non-zero status.
-
+--recurse-submodules=<check|on-demand>::
+	Check whether all submodule commits used by the revisions to be pushed
+	are available on a remote tracking branch. If check is used the push
+	will be aborted and the command will exit with non-zero status.
+	If on-demand is used all submodules that changed in the
+	to be pushed will be pushed. If on-demand was not able
+	to push all necessary revisions it will also be aborted and exit
+	with non-zero status.
 
 include::urls-remotes.txt[]
 
diff --git a/builtin/push.c b/builtin/push.c
index 35cce53..f2ef8dd 100644
--- a/builtin/push.c
+++ b/builtin/push.c
@@ -224,9 +224,16 @@ static int option_parse_recurse_submodules(const struct option *opt,
 				   const char *arg, int unset)
 {
 	int *flags = opt->value;
+
+	if (*flags & (TRANSPORT_RECURSE_SUBMODULES_CHECK |
+		      TRANSPORT_RECURSE_SUBMODULES_ON_DEMAND))
+		die("%s can only be used once.", opt->long_name);
+
 	if (arg) {
 		if (!strcmp(arg, "check"))
 			*flags |= TRANSPORT_RECURSE_SUBMODULES_CHECK;
+		else if (!strcmp(arg, "on-demand"))
+			*flags |= TRANSPORT_RECURSE_SUBMODULES_ON_DEMAND;
 		else
 			die("bad %s argument: %s", opt->long_name, arg);
 	} else
diff --git a/submodule.c b/submodule.c
index 45f508c..dc95498 100644
--- a/submodule.c
+++ b/submodule.c
@@ -8,7 +8,10 @@
 #include "diffcore.h"
 #include "refs.h"
 #include "string-list.h"
+#include "transport.h"
 
+typedef int (*needs_push_func_t)(const char *path, const unsigned char sha1[20],
+		void *data);
 static struct string_list config_name_for_path;
 static struct string_list config_fetch_recurse_submodules_for_name;
 static struct string_list config_ignore_for_name;
@@ -308,21 +311,24 @@ void set_config_fetch_recurse_submodules(int value)
 	config_fetch_recurse_submodules = value;
 }
 
+typedef int (*module_func_t)(const char *path, const unsigned char sha1[20], void *data);
+
 static int has_remote(const char *refname, const unsigned char *sha1, int flags, void *cb_data)
 {
 	return 1;
 }
 
-static int submodule_needs_pushing(const char *path, const unsigned char sha1[20])
+int submodule_needs_pushing(const char *path, const unsigned char sha1[20], void *data)
 {
+	int *needs_pushing = data;
+
 	if (add_submodule_odb(path) || !lookup_commit_reference(sha1))
-		return 0;
+		return 1;
 
 	if (for_each_remote_ref_submodule(path, has_remote, NULL) > 0) {
 		struct child_process cp;
 		const char *argv[] = {"rev-list", NULL, "--not", "--remotes", "-n", "1" , NULL};
 		struct strbuf buf = STRBUF_INIT;
-		int needs_pushing = 0;
 
 		argv[1] = sha1_to_hex(sha1);
 		memset(&cp, 0, sizeof(cp));
@@ -336,41 +342,74 @@ static int submodule_needs_pushing(const char *path, const unsigned char sha1[20
 			die("Could not run 'git rev-list %s --not --remotes -n 1' command in submodule %s",
 				sha1_to_hex(sha1), path);
 		if (strbuf_read(&buf, cp.out, 41))
-			needs_pushing = 1;
+			*needs_pushing = 1;
 		finish_command(&cp);
 		close(cp.out);
 		strbuf_release(&buf);
-		return needs_pushing;
+		return !*needs_pushing;
 	}
 
-	return 0;
+	return 1;
+}
+
+int push_submodule(const char *path, const unsigned char sha1[20], void *data)
+{
+	if (add_submodule_odb(path) || !lookup_commit_reference(sha1))
+		return 1;
+
+	if (for_each_remote_ref_submodule(path, has_remote, NULL) > 0) {
+		struct child_process cp;
+		const char *argv[] = {"push", NULL};
+
+		memset(&cp, 0, sizeof(cp));
+		cp.argv = argv;
+		cp.env = local_repo_env;
+		cp.git_cmd = 1;
+		cp.no_stdin = 1;
+		cp.out = -1;
+		cp.dir = path;
+		if (run_command(&cp))
+			die("Could not run 'git push' command in submodule %s", path);
+		close(cp.out);
+	}
+
+	return 1;
 }
 
+struct collect_submodules_data {
+	module_func_t func;
+	void *data;
+	int ret;
+};
+
 static void collect_submodules_from_diff(struct diff_queue_struct *q,
 					 struct diff_options *options,
 					 void *data)
 {
 	int i;
-	int *needs_pushing = data;
+	struct collect_submodules_data *me = data;
 
 	for (i = 0; i < q->nr; i++) {
 		struct diff_filepair *p = q->queue[i];
 		if (!S_ISGITLINK(p->two->mode))
 			continue;
-		if (submodule_needs_pushing(p->two->path, p->two->sha1)) {
-			*needs_pushing = 1;
+		if (!(me->ret = me->func(p->two->path, p->two->sha1, me->data)))
 			break;
-		}
 	}
 }
 
-
-static void commit_need_pushing(struct commit *commit, struct commit_list *parent, int *needs_pushing)
+static int commit_need_pushing(struct commit *commit, struct commit_list *parent,
+	module_func_t func, void *data)
 {
 	const unsigned char (*parents)[20];
 	unsigned int i, n;
 	struct rev_info rev;
 
+	struct collect_submodules_data cb;
+	cb.func = func;
+	cb.data = data;
+	cb.ret = 1;
+
 	n = commit_list_count(parent);
 	parents = xmalloc(n * sizeof(*parents));
 
@@ -382,21 +421,23 @@ static void commit_need_pushing(struct commit *commit, struct commit_list *paren
 	init_revisions(&rev, NULL);
 	rev.diffopt.output_format |= DIFF_FORMAT_CALLBACK;
 	rev.diffopt.format_callback = collect_submodules_from_diff;
-	rev.diffopt.format_callback_data = needs_pushing;
+	rev.diffopt.format_callback_data = &cb;
 	diff_tree_combined(commit->object.sha1, parents, n, 1, &rev);
 
 	free(parents);
+	return cb.ret;
 }
 
-int check_submodule_needs_pushing(unsigned char new_sha1[20], const char *remotes_name)
+static int inspect_superproject_commits(unsigned char new_sha1[20], const char *remotes_name,
+	module_func_t func, void *data)
 {
 	struct rev_info rev;
 	struct commit *commit;
 	const char *argv[] = {NULL, NULL, "--not", "NULL", NULL};
 	int argc = ARRAY_SIZE(argv) - 1;
 	char *sha1_copy;
-	int needs_pushing = 0;
 	struct strbuf remotes_arg = STRBUF_INIT;
+	int do_continue = 1;
 
 	strbuf_addf(&remotes_arg, "--remotes=%s", remotes_name);
 	init_revisions(&rev, NULL);
@@ -407,13 +448,25 @@ int check_submodule_needs_pushing(unsigned char new_sha1[20], const char *remote
 	if (prepare_revision_walk(&rev))
 		die("revision walk setup failed");
 
-	while ((commit = get_revision(&rev)) && !needs_pushing)
-		commit_need_pushing(commit, commit->parents, &needs_pushing);
+	while ((commit = get_revision(&rev)) && do_continue)
+		do_continue = commit_need_pushing(commit, commit->parents, func, data);
 
 	free(sha1_copy);
 	strbuf_release(&remotes_arg);
 
-	return needs_pushing;
+	return do_continue;
+}
+
+int check_submodule_needs_pushing(unsigned char new_sha1[20], const char *remotes_name)
+{
+	int needs_push = 0;
+	inspect_superproject_commits(new_sha1, remotes_name, submodule_needs_pushing, &needs_push);
+	return needs_push;
+}
+
+void push_unpushed_submodules(unsigned char new_sha1[20], const char *remotes_name)
+{
+	inspect_superproject_commits(new_sha1, remotes_name, push_submodule, NULL);
 }
 
 static int is_submodule_commit_present(const char *path, unsigned char sha1[20])
diff --git a/submodule.h b/submodule.h
index 799c22d..a0074aa 100644
--- a/submodule.h
+++ b/submodule.h
@@ -30,5 +30,6 @@ unsigned is_submodule_modified(const char *path, int ignore_untracked);
 int merge_submodule(unsigned char result[20], const char *path, const unsigned char base[20],
 		    const unsigned char a[20], const unsigned char b[20]);
 int check_submodule_needs_pushing(unsigned char new_sha1[20], const char *remotes_name);
+void push_unpushed_submodules(unsigned char new_sha1[20], const char *remotes_name);
 
 #endif
diff --git a/t/t5531-deep-submodule-push.sh b/t/t5531-deep-submodule-push.sh
index 30bec4b..35820ec 100755
--- a/t/t5531-deep-submodule-push.sh
+++ b/t/t5531-deep-submodule-push.sh
@@ -119,4 +119,28 @@ test_expect_success 'push succeeds if submodule has no remote and is on the firs
 	)
 '
 
+test_expect_success 'push unpushed submodules' '
+	(
+		cd work &&
+		git checkout master &&
+		git push --recurse-submodules=on-demand ../pub.git master
+	)
+'
+
+test_expect_success 'push unpushed submodules when not needed' '
+	(
+		cd work &&
+		(
+			cd gar/bage &&
+			>junk4 &&
+			git add junk4 &&
+			git commit -m "junk4" &&
+			git push
+		) &&
+		git add gar/bage &&
+		git commit -m "updated submodule" &&
+		git push --recurse-submodules=on-demand ../pub.git master
+	)
+'
+
 test_done
diff --git a/transport.c b/transport.c
index d2725e5..59c90c7 100644
--- a/transport.c
+++ b/transport.c
@@ -1046,7 +1046,15 @@ int transport_push(struct transport *transport,
 			flags & TRANSPORT_PUSH_MIRROR,
 			flags & TRANSPORT_PUSH_FORCE);
 
-		if ((flags & TRANSPORT_RECURSE_SUBMODULES_CHECK) && !is_bare_repository()) {
+		if ((flags & TRANSPORT_RECURSE_SUBMODULES_ON_DEMAND) && !is_bare_repository()) {
+			struct ref *ref = remote_refs;
+			for (; ref; ref = ref->next)
+				if (!is_null_sha1(ref->new_sha1))
+				    push_unpushed_submodules(ref->new_sha1,transport->remote->name);
+		}
+
+		if ((flags & (TRANSPORT_RECURSE_SUBMODULES_ON_DEMAND |
+			      TRANSPORT_RECURSE_SUBMODULES_CHECK)) && !is_bare_repository()) {
 			struct ref *ref = remote_refs;
 			for (; ref; ref = ref->next)
 				if (!is_null_sha1(ref->new_sha1) &&
diff --git a/transport.h b/transport.h
index 059b330..9d19c78 100644
--- a/transport.h
+++ b/transport.h
@@ -102,6 +102,7 @@ struct transport {
 #define TRANSPORT_PUSH_PORCELAIN 16
 #define TRANSPORT_PUSH_SET_UPSTREAM 32
 #define TRANSPORT_RECURSE_SUBMODULES_CHECK 64
+#define TRANSPORT_RECURSE_SUBMODULES_ON_DEMAND 128
 
 #define TRANSPORT_SUMMARY_WIDTH (2 * DEFAULT_ABBREV + 3)
 
-- 
1.7.6.551.gfb18e

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH v4 1/2] push: Don't push a repository with unpushed submodules
  2011-08-19 22:08 ` [PATCH v4 1/2] push: Don't push a repository with unpushed submodules Fredrik Gustafsson
@ 2011-08-19 23:26   ` Junio C Hamano
  2011-08-20  6:32     ` Junio C Hamano
  2011-08-20  6:34     ` [PATCH 2/2] demonstrate format-callback used in combined diff Junio C Hamano
  0 siblings, 2 replies; 20+ messages in thread
From: Junio C Hamano @ 2011-08-19 23:26 UTC (permalink / raw)
  To: Fredrik Gustafsson; +Cc: git, hvoigt, jens.lehmann

Fredrik Gustafsson <iveqy@iveqy.com> writes:

> diff --git a/combine-diff.c b/combine-diff.c
> index b11eb71..f7a8978 100644
> --- a/combine-diff.c
> +++ b/combine-diff.c
> @@ -1074,7 +1074,7 @@ void diff_tree_combined(const unsigned char *sha1,
>  		 * when doing combined diff.
>  		 */
>  		int stat_opt = (opt->output_format &
> -				(DIFF_FORMAT_NUMSTAT|DIFF_FORMAT_DIFFSTAT));
> +				(DIFF_FORMAT_NUMSTAT|DIFF_FORMAT_DIFFSTAT|DIFF_FORMAT_CALLBACK));
>  		if (i == 0 && stat_opt)
>  			diffopts.output_format = stat_opt;
>  		else

Sorry, but this is not what I meant. With this change, you are running N
(= number of parents) diffs with the end result, but only making a
callback while running a diff with the first parent, and not getting
anything from comparison with other parents.

The existing NUMSTAT/STAT exception is only justified because that is how
"diff --stat" shows merges (i.e. showing the extent of damage to the
mainline, assuming you are viewing a merge to the mainline from a side
branch).

What I meant was more along the lines of the following, but I think we
would need a new kind of callback that can take N-way parents (which is
not depicted here).

Let me cook up something and get back to you later tonight.

 combine-diff.c |    6 ++++++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/combine-diff.c b/combine-diff.c
index 655fa89..51ebd31 100644
--- a/combine-diff.c
+++ b/combine-diff.c
@@ -1017,6 +1017,12 @@ void diff_tree_combined(const unsigned char *sha1,
 			num_paths++;
 	}
 	if (num_paths) {
+		if (opt->output_format & DIFF_FORMAT_CALLBACK) {
+			for (p = paths; p; p = p->next) {
+				if (p->len)
+					... make callback here ...
+			}
+		}
 		if (opt->output_format & (DIFF_FORMAT_RAW |
 					  DIFF_FORMAT_NAME |
 					  DIFF_FORMAT_NAME_STATUS)) {

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH v4 1/2] push: Don't push a repository with unpushed submodules
  2011-08-19 23:26   ` Junio C Hamano
@ 2011-08-20  6:32     ` Junio C Hamano
  2011-08-21  6:48       ` Junio C Hamano
  2011-08-20  6:34     ` [PATCH 2/2] demonstrate format-callback used in combined diff Junio C Hamano
  1 sibling, 1 reply; 20+ messages in thread
From: Junio C Hamano @ 2011-08-20  6:32 UTC (permalink / raw)
  To: Fredrik Gustafsson; +Cc: git, hvoigt, jens.lehmann

Junio C Hamano <gitster@pobox.com> writes:

> What I meant was more along the lines of the following, but I think we
> would need a new kind of callback that can take N-way parents (which is
> not depicted here).
>
> Let me cook up something and get back to you later tonight.

And here is a two-patch series to do just that.

The first one is meant for you to use, and the second one is a sample
application of the new machinery.

-- >8 --
Subject: [PATCH 1/2] combine-diff: support format_callback

This teaches combine-diff machinery to feed a combined merge to a callback
function when DIFF_FORMAT_CALLBACK is specified.

So far, format callback functions are not used for anything but 2-way
diffs. A callback is given a diff_queue_struct, which is an array of
diff_filepair. As its name suggests, a diff_filepair is a _pair_ of
diff_filespec that represents a single preimage and a single postimage.

Since "diff -c" is to compare N parents with a single merge result and
filter out any paths whose result match one (or more) of the parent(s),
its output has to be able to represent N preimages and 1 postimage. For
this reason, a callback function that inspects a diff_filepair that
results from this new infrastructure can and is expected to view the
preimage side (i.e. pair->one) as an array of diff_filespec. Each element
in the array, except for the last one, is marked with "has_more_entries"
bit, so that the same callback function can be used for 2-way diffs and
combined diffs.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 combine-diff.c |   69 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 diffcore.h     |    2 +-
 2 files changed, 70 insertions(+), 1 deletions(-)

diff --git a/combine-diff.c b/combine-diff.c
index 655fa89..de88186 100644
--- a/combine-diff.c
+++ b/combine-diff.c
@@ -970,6 +970,72 @@ void show_combined_diff(struct combine_diff_path *p,
 		show_patch_diff(p, num_parent, dense, rev);
 }
 
+static void free_combined_pair(struct diff_filepair *pair)
+{
+	free(pair->two);
+	free(pair);
+}
+
+/*
+ * A combine_diff_path expresses N parents on the LHS against 1 merge
+ * result. Synthesize a diff_filepair that has N entries on the "one"
+ * side and 1 entry on the "two" side.
+ *
+ * In the future, we might want to add more data to combine_diff_path
+ * so that we can fill fields we are ignoring (most notably, size) here,
+ * but currently nobody uses it, so this should suffice for now.
+ */
+static struct diff_filepair *combined_pair(struct combine_diff_path *p,
+					   int num_parent)
+{
+	int i;
+	struct diff_filepair *pair;
+	struct diff_filespec *pool;
+
+	pair = xmalloc(sizeof(*pair));
+	pool = xcalloc(num_parent + 1, sizeof(struct diff_filespec));
+	pair->one = pool + 1;
+	pair->two = pool;
+
+	for (i = 0; i < num_parent; i++) {
+		pair->one[i].path = p->path;
+		pair->one[i].mode = p->parent[i].mode;
+		hashcpy(pair->one[i].sha1, p->parent[i].sha1);
+		pair->one[i].sha1_valid = !is_null_sha1(p->parent[i].sha1);
+		pair->one[i].has_more_entries = 1;
+	}
+	pair->one[num_parent - 1].has_more_entries = 0;
+
+	pair->two->path = p->path;
+	pair->two->mode = p->mode;
+	hashcpy(pair->two->sha1, p->sha1);
+	pair->two->sha1_valid = !is_null_sha1(p->sha1);
+	return pair;
+}
+
+static void handle_combined_callback(struct diff_options *opt,
+				     struct combine_diff_path *paths,
+				     int num_parent,
+				     int num_paths)
+{
+	struct combine_diff_path *p;
+	struct diff_queue_struct q;
+	int i;
+
+	q.queue = xcalloc(num_paths, sizeof(struct diff_filepair *));
+	q.alloc = num_paths;
+	q.nr = num_paths;
+	for (i = 0, p = paths; p; p = p->next) {
+		if (!p->len)
+			continue;
+		q.queue[i++] = combined_pair(p, num_parent);
+	}
+	opt->format_callback(&q, opt, opt->format_callback_data);
+	for (i = 0; i < num_paths; i++)
+		free_combined_pair(q.queue[i]);
+	free(q.queue);
+}
+
 void diff_tree_combined(const unsigned char *sha1,
 			const unsigned char parent[][20],
 			int num_parent,
@@ -1029,6 +1095,9 @@ void diff_tree_combined(const unsigned char *sha1,
 		else if (opt->output_format &
 			 (DIFF_FORMAT_NUMSTAT|DIFF_FORMAT_DIFFSTAT))
 			needsep = 1;
+		else if (opt->output_format & DIFF_FORMAT_CALLBACK)
+			handle_combined_callback(opt, paths, num_parent, num_paths);
+
 		if (opt->output_format & DIFF_FORMAT_PATCH) {
 			if (needsep)
 				putchar(opt->line_termination);
diff --git a/diffcore.h b/diffcore.h
index b8f1fde..8f32b82 100644
--- a/diffcore.h
+++ b/diffcore.h
@@ -45,7 +45,7 @@ struct diff_filespec {
 	unsigned dirty_submodule : 2;  /* For submodules: its work tree is dirty */
 #define DIRTY_SUBMODULE_UNTRACKED 1
 #define DIRTY_SUBMODULE_MODIFIED  2
-
+	unsigned has_more_entries : 1; /* only appear in combined diff */
 	struct userdiff_driver *driver;
 	/* data should be considered "binary"; -1 means "don't know yet" */
 	int is_binary;
-- 
1.7.6.557.gcee42

 

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 2/2] demonstrate format-callback used in combined diff
  2011-08-19 23:26   ` Junio C Hamano
  2011-08-20  6:32     ` Junio C Hamano
@ 2011-08-20  6:34     ` Junio C Hamano
  2011-08-21 21:55       ` Fredrik Gustafsson
  1 sibling, 1 reply; 20+ messages in thread
From: Junio C Hamano @ 2011-08-20  6:34 UTC (permalink / raw)
  To: Fredrik Gustafsson; +Cc: git, hvoigt, jens.lehmann

This demonstrates how to use the format-callback machinery added
to combined diff.

 $ ./git demo v1.7.6..maint

works like "git log" with the same revision-range arguments, but shows
list of paths that have contents in the child commit different from any of
its parent commit(s). As a consequence, when a trivial merge takes the
contents of a path as a whole from one parent, such a path is not shown.

Notice how the same function can be used to be called back for a two-way
diff (i.e. there is only one entry on the preimage "one" side) and also
for a combined diff.

Obviously not meant for inclusion.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin.h     |    1 +
 builtin/log.c |   48 ++++++++++++++++++++++++++++++++++++++++++++++++
 git.c         |    1 +
 3 files changed, 50 insertions(+), 0 deletions(-)

diff --git a/builtin.h b/builtin.h
index 0e9da90..aef8917 100644
--- a/builtin.h
+++ b/builtin.h
@@ -59,6 +59,7 @@ extern int cmd_commit(int argc, const char **argv, const char *prefix);
 extern int cmd_commit_tree(int argc, const char **argv, const char *prefix);
 extern int cmd_config(int argc, const char **argv, const char *prefix);
 extern int cmd_count_objects(int argc, const char **argv, const char *prefix);
+extern int cmd_demo(int argc, const char **argv, const char *prefix);
 extern int cmd_describe(int argc, const char **argv, const char *prefix);
 extern int cmd_diff_files(int argc, const char **argv, const char *prefix);
 extern int cmd_diff_index(int argc, const char **argv, const char *prefix);
diff --git a/builtin/log.c b/builtin/log.c
index 5c2af59..cc222c8 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -8,6 +8,7 @@
 #include "color.h"
 #include "commit.h"
 #include "diff.h"
+#include "diffcore.h"
 #include "revision.h"
 #include "log-tree.h"
 #include "builtin.h"
@@ -436,6 +437,53 @@ static void show_rev_tweak_rev(struct rev_info *rev, struct setup_revision_opt *
 		rev->diffopt.output_format = DIFF_FORMAT_PATCH;
 }
 
+static void show_paths_callback(struct diff_queue_struct *q,
+				struct diff_options *options,
+				void *data)
+{
+	int i;
+	for (i = 0; i < q->nr; i++) {
+		struct diff_filepair *pair = q->queue[i];
+		struct diff_filespec *spec;
+		int j;
+
+		j = 0;
+		spec = pair->one;
+		while (1) {
+			printf("Parent[%d] %s (%s)\n",
+			       j, spec->path, sha1_to_hex(spec->sha1));
+			if (!spec->has_more_entries)
+				break;
+			j++;
+			spec++;
+		}
+		printf("Result    %s (%s)\n",
+		       pair->two->path, sha1_to_hex(pair->two->sha1));
+	}
+}
+
+int cmd_demo(int argc, const char **argv, const char *prefix)
+{
+	struct rev_info rev;
+	struct setup_revision_opt opt;
+
+	init_revisions(&rev, prefix);
+	rev.diff = 1;
+	memset(&opt, 0, sizeof(opt));
+	opt.def = "HEAD";
+	cmd_log_init(argc, argv, prefix, &rev, &opt);
+
+	rev.diffopt.output_format = DIFF_FORMAT_CALLBACK;
+	rev.diffopt.format_callback = show_paths_callback;
+	rev.diffopt.format_callback_data = NULL;
+	rev.diff = 1;
+	rev.combine_merges = 1;
+	rev.dense_combined_merges = 0;
+	rev.ignore_merges = 0;
+
+	return cmd_log_walk(&rev);
+}
+
 int cmd_show(int argc, const char **argv, const char *prefix)
 {
 	struct rev_info rev;
diff --git a/git.c b/git.c
index 89721d4..34d2381 100644
--- a/git.c
+++ b/git.c
@@ -346,6 +346,7 @@ static void handle_internal_command(int argc, const char **argv)
 		{ "commit-tree", cmd_commit_tree, RUN_SETUP },
 		{ "config", cmd_config, RUN_SETUP_GENTLY },
 		{ "count-objects", cmd_count_objects, RUN_SETUP },
+		{ "demo", cmd_demo, RUN_SETUP },
 		{ "describe", cmd_describe, RUN_SETUP },
 		{ "diff", cmd_diff },
 		{ "diff-files", cmd_diff_files, RUN_SETUP | NEED_WORK_TREE },
-- 
1.7.6.557.gcee42

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH v4 1/2] push: Don't push a repository with unpushed submodules
  2011-08-20  6:32     ` Junio C Hamano
@ 2011-08-21  6:48       ` Junio C Hamano
  2011-08-22 19:47         ` Heiko Voigt
  0 siblings, 1 reply; 20+ messages in thread
From: Junio C Hamano @ 2011-08-21  6:48 UTC (permalink / raw)
  To: Fredrik Gustafsson; +Cc: git, hvoigt, jens.lehmann

Junio C Hamano <gitster@pobox.com> writes:

> Junio C Hamano <gitster@pobox.com> writes:
>
>> What I meant was more along the lines of the following, but I think we
>> would need a new kind of callback that can take N-way parents (which is
>> not depicted here).
>>
>> Let me cook up something and get back to you later tonight.
>
> And here is a two-patch series to do just that.
>
> The first one is meant for you to use, and the second one is a sample
> application of the new machinery.
>
> -- >8 --
> Subject: [PATCH 1/2] combine-diff: support format_callback
>
> This teaches combine-diff machinery to feed a combined merge to a callback
> function when DIFF_FORMAT_CALLBACK is specified.

After removing the change to combine-diff.c from your two-patch series, I
applied them on top of this one, and queued the result in 'pu'.

While I tried to be careful while doing this callback-for-combine-diff
patch so that a callback function written for two-way diff can be used
without any change as long as it does not care about the LHS (i.e. "one")
of the filepair, please double check. I didn't read your change to
submodule.c very carefully (and I didn't have to change it).

The result seems to pass your new tests ;-).

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH 2/2] demonstrate format-callback used in combined diff
  2011-08-20  6:34     ` [PATCH 2/2] demonstrate format-callback used in combined diff Junio C Hamano
@ 2011-08-21 21:55       ` Fredrik Gustafsson
  0 siblings, 0 replies; 20+ messages in thread
From: Fredrik Gustafsson @ 2011-08-21 21:55 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Fredrik Gustafsson, git, hvoigt, jens.lehmann

Thank you. GSoC is now finished and I have examans next week that needs
my attention. I will continue finish what I started but won't be able to
do so until the end of the weak. Just so you know...

/Fredrik Gustafsson

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: Re: [PATCH v4 1/2] push: Don't push a repository with unpushed submodules
  2011-08-21  6:48       ` Junio C Hamano
@ 2011-08-22 19:47         ` Heiko Voigt
  2011-08-22 22:22           ` Junio C Hamano
  0 siblings, 1 reply; 20+ messages in thread
From: Heiko Voigt @ 2011-08-22 19:47 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Fredrik Gustafsson, git, jens.lehmann

Hi,

On Sat, Aug 20, 2011 at 11:48:48PM -0700, Junio C Hamano wrote:
> After removing the change to combine-diff.c from your two-patch series, I
> applied them on top of this one, and queued the result in 'pu'.
> 
> While I tried to be careful while doing this callback-for-combine-diff
> patch so that a callback function written for two-way diff can be used
> without any change as long as it does not care about the LHS (i.e. "one")
> of the filepair, please double check. I didn't read your change to
> submodule.c very carefully (and I didn't have to change it).
> 
> The result seems to pass your new tests ;-).

Very nice. Today I had a deeper look into the current tests for
on-demand and found a bug in them. Cleaning them up also revealed a bug
in the current code. Junio could you please squash this[1] in the last
patch (on-demand option).

I analysed the cause of this bug and it seems that we are not allowed to
iterate revisions using init_revisions() and setup_revisions() more
than once. I tracked this down to the SEEN flag in the struct object.
Junio since you are one person listed in the api docs could you maybe
quickly explain to me what this flag is used for?

I quickly tried to implement a reset_revision_walk function which will
reset this flag but it seems that this breaks some expectations in the
code since I got a segfault.

Cheers Heiko

[1]

diff --git a/t/t5531-deep-submodule-push.sh b/t/t5531-deep-submodule-push.sh
index 35820ec..b0e94f7 100755
--- a/t/t5531-deep-submodule-push.sh
+++ b/t/t5531-deep-submodule-push.sh
@@ -124,6 +124,10 @@ test_expect_success 'push unpushed submodules' '
 		cd work &&
 		git checkout master &&
 		git push --recurse-submodules=on-demand ../pub.git master
+		cd gar/bage &&
+		git rev-parse master >expected &&
+		git rev-parse origin/master >actual &&
+		test_cmp expected actual
 	)
 '
 
@@ -132,10 +136,14 @@ test_expect_success 'push unpushed submodules when not needed' '
 		cd work &&
 		(
 			cd gar/bage &&
-			>junk4 &&
-			git add junk4 &&
-			git commit -m "junk4" &&
-			git push
+			git checkout master &&
+			>junk5 &&
+			git add junk5 &&
+			git commit -m "junk5" &&
+			git push &&
+			git rev-parse master >expected &&
+			git rev-parse origin/master >actual &&
+			test_cmp expected actual
 		) &&
 		git add gar/bage &&
 		git commit -m "updated submodule" &&
@@ -143,4 +151,20 @@ test_expect_success 'push unpushed submodules when not needed' '
 	)
 '
 
+test_expect_failure 'push unpushed submodules on-demand fails when submodule not pushable' '
+	(
+		cd work &&
+		(
+			cd gar/bage &&
+			git checkout HEAD~0 &&
+			>junk6 &&
+			git add junk6 &&
+			git commit -m "junk6"
+		) &&
+		git add gar/bage &&
+		git commit -m "updated submodule" &&
+		test_must_fail git push --recurse-submodules=on-demand ../pub.git master
+	)
+'
+
 test_done
-- 
1.7.6.46.g0f058

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH v4 1/2] push: Don't push a repository with unpushed submodules
  2011-08-22 19:47         ` Heiko Voigt
@ 2011-08-22 22:22           ` Junio C Hamano
  2011-08-23 19:45             ` Heiko Voigt
  2011-08-24 21:14             ` [WIP PATCH] revision-walking: allow iterating revisions multiple times Heiko Voigt
  0 siblings, 2 replies; 20+ messages in thread
From: Junio C Hamano @ 2011-08-22 22:22 UTC (permalink / raw)
  To: Heiko Voigt; +Cc: Fredrik Gustafsson, git, jens.lehmann

Heiko Voigt <hvoigt@hvoigt.net> writes:

> Junio since you are one person listed in the api docs could you maybe
> quickly explain to me what this flag is used for?

It is used in order to avoid walking the object we have walked already.

Which in turn means that once you walk chain of objects, unless you
remember the ones you walked and clear the marks after you are done, you
cannot walk the object chain for unrelated purposes.  See how functions
like get_merge_bases_many() walk portions of graph for their own purpose
and then avoid disrupting others by calling clear_commit_marks(). The use
of TMP_MARK (and its clearing after the function is done with the marked
objects) in remove_duplicate_parents() serve the same purpose.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v4 1/2] push: Don't push a repository with unpushed submodules
  2011-08-22 22:22           ` Junio C Hamano
@ 2011-08-23 19:45             ` Heiko Voigt
  2011-08-24 21:14             ` [WIP PATCH] revision-walking: allow iterating revisions multiple times Heiko Voigt
  1 sibling, 0 replies; 20+ messages in thread
From: Heiko Voigt @ 2011-08-23 19:45 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Fredrik Gustafsson, git, jens.lehmann

On Mon, Aug 22, 2011 at 03:22:31PM -0700, Junio C Hamano wrote:
> Heiko Voigt <hvoigt@hvoigt.net> writes:
> 
> > Junio since you are one person listed in the api docs could you maybe
> > quickly explain to me what this flag is used for?
> 
> It is used in order to avoid walking the object we have walked already.
> 
> Which in turn means that once you walk chain of objects, unless you
> remember the ones you walked and clear the marks after you are done, you
> cannot walk the object chain for unrelated purposes.  See how functions
> like get_merge_bases_many() walk portions of graph for their own purpose
> and then avoid disrupting others by calling clear_commit_marks(). The use
> of TMP_MARK (and its clearing after the function is done with the marked
> objects) in remove_duplicate_parents() serve the same purpose.

Thanks I will have look at those places and try to cook up something.

Cheers Heiko

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [WIP PATCH] revision-walking: allow iterating revisions multiple times
  2011-08-22 22:22           ` Junio C Hamano
  2011-08-23 19:45             ` Heiko Voigt
@ 2011-08-24 21:14             ` Heiko Voigt
  2011-08-24 21:44               ` Junio C Hamano
  1 sibling, 1 reply; 20+ messages in thread
From: Heiko Voigt @ 2011-08-24 21:14 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Fredrik Gustafsson, git, jens.lehmann

---
Hi,

On Mon, Aug 22, 2011 at 03:22:31PM -0700, Junio C Hamano wrote:
> Heiko Voigt <hvoigt@hvoigt.net> writes:
> 
> > Junio since you are one person listed in the api docs could you maybe
> > quickly explain to me what this flag is used for?
> 
> It is used in order to avoid walking the object we have walked already.
> 
> Which in turn means that once you walk chain of objects, unless you
> remember the ones you walked and clear the marks after you are done, you
> cannot walk the object chain for unrelated purposes.  See how functions
> like get_merge_bases_many() walk portions of graph for their own purpose
> and then avoid disrupting others by calling clear_commit_marks(). The use
> of TMP_MARK (and its clearing after the function is done with the marked
> objects) in remove_duplicate_parents() serve the same purpose.

What do you think about this approach ? Its not yet correctly collecting
revisions for all situations but it fixes the demonstrated test failure.

 revision.c  |   24 ++++++++++++++++++++++++
 revision.h  |    3 +++
 submodule.c |    3 +++
 3 files changed, 30 insertions(+), 0 deletions(-)

diff --git a/revision.c b/revision.c
index c46cfaa..e374c4a 100644
--- a/revision.c
+++ b/revision.c
@@ -500,6 +500,8 @@ static int add_parents_to_list(struct rev_info *revs, struct commit *commit,
 				continue;
 			p->object.flags |= SEEN;
 			commit_list_insert_by_date_cached(p, list, cached_base, cache_ptr);
+			if (revs->fill_reset_list)
+				add_object_array(&p->object, NULL, &revs->walked);
 		}
 		return 0;
 	}
@@ -527,6 +529,8 @@ static int add_parents_to_list(struct rev_info *revs, struct commit *commit,
 		if (!(p->object.flags & SEEN)) {
 			p->object.flags |= SEEN;
 			commit_list_insert_by_date_cached(p, list, cached_base, cache_ptr);
+			if (revs->fill_reset_list)
+				add_object_array(&p->object, NULL, &revs->walked);
 		}
 		if (revs->first_parent_only)
 			break;
@@ -1950,6 +1954,23 @@ static void set_children(struct rev_info *revs)
 	}
 }
 
+void reset_revision_walk(struct rev_info *revs)
+{
+	int nr = revs->walked.nr;
+	struct object_array_entry *e = revs->walked.objects;
+
+	/* reset the seen flags set by prepare_revision_walk */
+	while (--nr >= 0) {
+		struct object *o = e->item;
+		o->flags &= ~(ALL_REV_FLAGS);
+		e++;
+	}
+	free(revs->walked.objects);
+	revs->walked.nr = 0;
+	revs->walked.alloc = 0;
+	revs->walked.objects = NULL;
+}
+
 int prepare_revision_walk(struct rev_info *revs)
 {
 	int nr = revs->pending.nr;
@@ -1964,6 +1985,9 @@ int prepare_revision_walk(struct rev_info *revs)
 		if (commit) {
 			if (!(commit->object.flags & SEEN)) {
 				commit->object.flags |= SEEN;
+				if (revs->fill_reset_list)
+					add_object_array(&commit->object, NULL,
+							 &revs->walked);
 				commit_list_insert_by_date(commit, &revs->commits);
 			}
 		}
diff --git a/revision.h b/revision.h
index 3d64ada..6a0fa99 100644
--- a/revision.h
+++ b/revision.h
@@ -28,6 +28,7 @@ struct rev_info {
 	/* Starting list */
 	struct commit_list *commits;
 	struct object_array pending;
+	struct object_array walked;
 
 	/* Parents of shown commits */
 	struct object_array boundary_commits;
@@ -72,6 +73,7 @@ struct rev_info {
 			bisect:1,
 			ancestry_path:1,
 			first_parent_only:1;
+	unsigned int	fill_reset_list:1;
 
 	/* Diff flags */
 	unsigned int	diff:1,
@@ -169,6 +171,7 @@ extern void parse_revision_opt(struct rev_info *revs, struct parse_opt_ctx_t *ct
 				 const char * const usagestr[]);
 extern int handle_revision_arg(const char *arg, struct rev_info *revs,int flags,int cant_be_filename);
 
+extern void reset_revision_walk(struct rev_info *revs);
 extern int prepare_revision_walk(struct rev_info *revs);
 extern struct commit *get_revision(struct rev_info *revs);
 extern char *get_revision_mark(const struct rev_info *revs, const struct commit *commit);
diff --git a/submodule.c b/submodule.c
index dc95498..410d8e4 100644
--- a/submodule.c
+++ b/submodule.c
@@ -441,6 +441,7 @@ static int inspect_superproject_commits(unsigned char new_sha1[20], const char *
 
 	strbuf_addf(&remotes_arg, "--remotes=%s", remotes_name);
 	init_revisions(&rev, NULL);
+	rev.fill_reset_list = 1;
 	sha1_copy = xstrdup(sha1_to_hex(new_sha1));
 	argv[1] = sha1_copy;
 	argv[3] = remotes_arg.buf;
@@ -451,6 +452,8 @@ static int inspect_superproject_commits(unsigned char new_sha1[20], const char *
 	while ((commit = get_revision(&rev)) && do_continue)
 		do_continue = commit_need_pushing(commit, commit->parents, func, data);
 
+
+	reset_revision_walk(&rev);
 	free(sha1_copy);
 	strbuf_release(&remotes_arg);
 
-- 
1.7.6.553.g84dc

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [WIP PATCH] revision-walking: allow iterating revisions multiple times
  2011-08-24 21:14             ` [WIP PATCH] revision-walking: allow iterating revisions multiple times Heiko Voigt
@ 2011-08-24 21:44               ` Junio C Hamano
  0 siblings, 0 replies; 20+ messages in thread
From: Junio C Hamano @ 2011-08-24 21:44 UTC (permalink / raw)
  To: Heiko Voigt; +Cc: Fredrik Gustafsson, git, jens.lehmann

Heiko Voigt <hvoigt@hvoigt.net> writes:

> +void reset_revision_walk(struct rev_info *revs)
> +{
> +	int nr = revs->walked.nr;
> +	struct object_array_entry *e = revs->walked.objects;
> +
> +	/* reset the seen flags set by prepare_revision_walk */
> +	while (--nr >= 0) {
> +		struct object *o = e->item;
> +		o->flags &= ~(ALL_REV_FLAGS);
> +		e++;
> +	}
> +	free(revs->walked.objects);
> +	revs->walked.nr = 0;
> +	revs->walked.alloc = 0;
> +	revs->walked.objects = NULL;
> +}

I am afraid that this is not good enough for general purpose.  The object
you walk in the middle of doing something may have been marked for reasons
other than your extra walking before you started your walk. Imagine

 * The command takes arguments like rev-list does;

 * It calls setup_revisions(), which marks commits given from the command
   line with marks like UNINTERESTING, and then prepare_revision_walk();

 * It walks the commit graph and does interesting things on commits that
   it discovers, by repeatedly calling get_revision(), e.g.:

   	while ((commit = get_revision()) != NULL) {
		do_something_interesting(commit);
        }

Now, you add a new caller that walks the commit graph for a different
reason from the primary revision walking done by the command somewhere
down in the callchain of do_something_interesting()---obviously you cannot
use the above reset_revision_walk() to clean things up, as it will break
the outer revision walk.

If on the other hand you will _never_ have more than one revision walk
going on, it may amount to the same thing to iterate over the object array
and clear all the flags.

Traditionally the way to do nested revision walk that can potentially be
done more than once (but never having such a sub-walk in parallel) was to
remember the start points of the subwalk, use private marks that are not
used in the outer walk during the subwalk, and call clear_commit_marks()
on these start points when a subwalk is done to clear only the marks the
subwalk used.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v4 2/2] push: teach --recurse-submodules the on-demand option
  2011-08-19 22:08 ` [PATCH v4 2/2] push: teach --recurse-submodules the on-demand option Fredrik Gustafsson
@ 2011-09-02 18:21   ` Junio C Hamano
       [not found]     ` <20111017190749.GA3126@sandbox-rc>
  0 siblings, 1 reply; 20+ messages in thread
From: Junio C Hamano @ 2011-09-02 18:21 UTC (permalink / raw)
  To: Fredrik Gustafsson; +Cc: git, hvoigt, jens.lehmann

Fredrik Gustafsson <iveqy@iveqy.com> writes:

> diff --git a/submodule.c b/submodule.c
> index 45f508c..dc95498 100644
> --- a/submodule.c
> +++ b/submodule.c
> @@ -8,7 +8,10 @@
>  #include "diffcore.h"
>  #include "refs.h"
>  #include "string-list.h"
> +#include "transport.h"
>  
> +typedef int (*needs_push_func_t)(const char *path, const unsigned char sha1[20],
> +		void *data);
>  static struct string_list config_name_for_path;
>  static struct string_list config_fetch_recurse_submodules_for_name;
>  static struct string_list config_ignore_for_name;
> @@ -308,21 +311,24 @@ void set_config_fetch_recurse_submodules(int value)
>  	config_fetch_recurse_submodules = value;
>  }
>  
> +typedef int (*module_func_t)(const char *path, const unsigned char sha1[20], void *data);
> +
>  static int has_remote(const char *refname, const unsigned char *sha1, int flags, void *cb_data)
>  {
>  	return 1;
>  }
>  
> -static int submodule_needs_pushing(const char *path, const unsigned char sha1[20])
> +int submodule_needs_pushing(const char *path, const unsigned char sha1[20], void *data)
>  {
> +	int *needs_pushing = data;
> +
>  	if (add_submodule_odb(path) || !lookup_commit_reference(sha1))
> -		return 0;
> +		return 1;
>
>  	if (for_each_remote_ref_submodule(path, has_remote, NULL) > 0) {
>  		struct child_process cp;
>  		const char *argv[] = {"rev-list", NULL, "--not", "--remotes", "-n", "1" , NULL};
>  		struct strbuf buf = STRBUF_INIT;
> -		int needs_pushing = 0;
>  
>  		argv[1] = sha1_to_hex(sha1);
>  		memset(&cp, 0, sizeof(cp));
> @@ -336,41 +342,74 @@ static int submodule_needs_pushing(const char *path, const unsigned char sha1[20
>  			die("Could not run 'git rev-list %s --not --remotes -n 1' command in submodule %s",
>  				sha1_to_hex(sha1), path);
>  		if (strbuf_read(&buf, cp.out, 41))
> -			needs_pushing = 1;
> +			*needs_pushing = 1;
>  		finish_command(&cp);
>  		close(cp.out);
>  		strbuf_release(&buf);
> -		return needs_pushing;
> +		return !*needs_pushing;
>  	}
> -	return 0;
> +	return 1;
> +}

It appears to me that this patch is flipping the meaning of the function,
and the returned value from here is no longer "do we know that this
submodule needs to be pushed (yes/no)?".  The function needs to be renamed
to describe what it does better.

Also you would need to give a comment before the function to describe the
semantics of these two return values (one from the function, the other
from the value placed via the callback data pointer).

The latter is especially important because the caller that gets 1 from
this function would not be able to tell if the value in the callback data
pointer is valid (only happens if "rev-list" said something) or undefined
(no assignment is ever done via *needs_pushing pointer to zero it when
"rev-list" is silent, or if no submodule is checked out at path).

> +int push_submodule(const char *path, const unsigned char sha1[20], void *data)
> +{
> +	if (add_submodule_odb(path) || !lookup_commit_reference(sha1))
> +		return 1;
> +
> +	if (for_each_remote_ref_submodule(path, has_remote, NULL) > 0) {
> +		struct child_process cp;
> +		const char *argv[] = {"push", NULL};
> +
> +		memset(&cp, 0, sizeof(cp));
> +		cp.argv = argv;
> +		cp.env = local_repo_env;
> +		cp.git_cmd = 1;
> +		cp.no_stdin = 1;
> +		cp.out = -1;

Is this correct? Nobody seems to read from this pipe from the "git push"
output. Don't you either want to send it to the end user, or squelch it by
sending it to /dev/null? You could of course read its output between the
following run_command() and close() and do something intelligent depending
on what the command tells you, if you wanted to, but I somehow doubt that
is what you had in mind here...

> +		cp.dir = path;
> +		if (run_command(&cp))
> +			die("Could not run 'git push' command in submodule %s", path);
> +		close(cp.out);
> +	}
> +
> +	return 1;
>  }

Do you really want to "die" here? You would definitely do if the failure
was due to corruption of your submodule repository, but wouldn't you want
to continue pushing other submodules if you couldn't push this submodule
due to non-fast-forward (i.e. somebody else pushed there first), for
example?

> +struct collect_submodules_data {
> +	module_func_t func;
> +	void *data;
> +	int ret;
> +};

What are the meaning of these fields? Document them.

Do you really need a double indirection like this, I wonder...

>  static void collect_submodules_from_diff(struct diff_queue_struct *q,
>  					 struct diff_options *options,
>  					 void *data)
>  {
>  	int i;
> -	int *needs_pushing = data;
> +	struct collect_submodules_data *me = data;
>  
>  	for (i = 0; i < q->nr; i++) {
>  		struct diff_filepair *p = q->queue[i];
>  		if (!S_ISGITLINK(p->two->mode))
>  			continue;
> -		if (submodule_needs_pushing(p->two->path, p->two->sha1)) {
> -			*needs_pushing = 1;
> +		if (!(me->ret = me->func(p->two->path, p->two->sha1, me->data)))
>  			break;
> -		}
>  	}
>  }
>  
> -
> -static void commit_need_pushing(struct commit *commit, struct commit_list *parent, int *needs_pushing)
> +static int commit_need_pushing(struct commit *commit, struct commit_list *parent,
> +	module_func_t func, void *data)
>  {
>  	const unsigned char (*parents)[20];
>  	unsigned int i, n;
>  	struct rev_info rev;
>  
> +	struct collect_submodules_data cb;
> +	cb.func = func;

Just a style thing, but because we do not allow decl-after-statement, it
is customary to have the blank line _after_ the last decl, not in between
the declarations.

> +	cb.data = data;
> +	cb.ret = 1;
> +
>  	n = commit_list_count(parent);
>  	parents = xmalloc(n * sizeof(*parents));
>  
> @@ -382,21 +421,23 @@ static void commit_need_pushing(struct commit *commit, struct commit_list *paren
>  	init_revisions(&rev, NULL);
>  	rev.diffopt.output_format |= DIFF_FORMAT_CALLBACK;
>  	rev.diffopt.format_callback = collect_submodules_from_diff;
> -	rev.diffopt.format_callback_data = needs_pushing;
> +	rev.diffopt.format_callback_data = &cb;
>  	diff_tree_combined(commit->object.sha1, parents, n, 1, &rev);
>  
>  	free(parents);
> +	return cb.ret;
>  }
>  
> -int check_submodule_needs_pushing(unsigned char new_sha1[20], const char *remotes_name)
> +static int inspect_superproject_commits(unsigned char new_sha1[20], const char *remotes_name,
> +	module_func_t func, void *data)

Contrast your new name with "check-submodule-needs-pushing".  "inspect"
(or "check" for that matter) is a poor word to use in function names, as
the word by itself does not convey what aspect of the object of the verb
is being inspected or checked. The old name was fine because other words
in the name described what it was checking. The new name does not tell us
anything useful. First try to explain to yourself at high level what the
function does in a few lines, and then a more appropriate name would come
to you.

>  {
>  	struct rev_info rev;
>  	struct commit *commit;
>  	const char *argv[] = {NULL, NULL, "--not", "NULL", NULL};

What is this string "NULL" doing here???

>  	int argc = ARRAY_SIZE(argv) - 1;
>  	char *sha1_copy;
> -	int needs_pushing = 0;
>  	struct strbuf remotes_arg = STRBUF_INIT;
> +	int do_continue = 1;
>  
>  	strbuf_addf(&remotes_arg, "--remotes=%s", remotes_name);
>  	init_revisions(&rev, NULL);
> @@ -407,13 +448,25 @@ int check_submodule_needs_pushing(unsigned char new_sha1[20], const char *remote
>  	if (prepare_revision_walk(&rev))
>  		die("revision walk setup failed");
>  
> -	while ((commit = get_revision(&rev)) && !needs_pushing)
> -		commit_need_pushing(commit, commit->parents, &needs_pushing);
> +	while ((commit = get_revision(&rev)) && do_continue)
> +		do_continue = commit_need_pushing(commit, commit->parents, func, data);

A funny way to write

	white ((commit = get_revision(&rev)) != NULL) {
               if (!commit_need_pushing(commit, commit->parents, func, data))
			break;
	}

or even:

	white ((commit = get_revision(&rev)) != NULL &&
        	commit_need_pushing(commit, commit->parents, func, data))
		 ; /* nothing */

No caller of this function uses its return value (one caller uses its
return value left in "data" pointer), so I do not think you would need the
"do_continue" variable, which is misnamed (the name makes sense only as
the loop control inside this function, but does not make any sense as the
return value from this function---it does not tell the caller to continue).

As there is only this calling site of commit_need_pushing(), I wonder why
the function needs to be able to take commit and commit->parents as
separate parameters. Does it even make sense in other contexts to compare
a commit with list of commits that are not its parents and decide if the
commit needs pushing based on that comparison?

>  
>  	free(sha1_copy);
>  	strbuf_release(&remotes_arg);
>  
> -	return needs_pushing;
> +	return do_continue;
> +}
> +
> +int check_submodule_needs_pushing(unsigned char new_sha1[20], const char *remotes_name)
> +{
> +	int needs_push = 0;
> +	inspect_superproject_commits(new_sha1, remotes_name, submodule_needs_pushing, &needs_push);
> +	return needs_push;
> +}
> +
> +void push_unpushed_submodules(unsigned char new_sha1[20], const char *remotes_name)
> +{
> +	inspect_superproject_commits(new_sha1, remotes_name, push_submodule, NULL);
>  }

Again the called function is misnamed as the primary purpose it is used is not
to inspect, but to cause effects. But more importantly...

What does this do, given the loop structure of inspect_superproject_commits()? 
If your superproject is three commits ahead of the remote, the get_revision()
loop may run three times, calling commit_need_pushing() and have it inspect
these superproject commits, and may find that you bound different commits
from the same submodule multiple times during these three superproject commits.
Don't you end up running "git push" multiple times?

I have to say the overall code struction of this patch is simply broken.

How about doing it this way instead?

 - Update check-submodule-needs-pushing that used to stop at the first
   submodule that are not up-to-date not to do that. Instead, loop over
   all the submodules, find and collect which ones needs pushing, and
   return it as a list of submodules. Make sure you have the same
   submodule appear at most once in the result.  You may want to rename it
   to reflect the new role of the function (i.e. collecting submodules
   that needs to be pushed). Perhaps collect_stale_submodules() or
   something.

   For this, I do not think you need to touch the implementation of the
   submodule_needs_pushing() function at all. You do not need to introduce
   the indirection such as module_func_t and collect_submodules_data
   either. The only change needed is to collect_submodules_from_diff()
   that would treat the callback data not as a pointer to int
   (needs-pushing), but as a pointer to the structure to collect the names
   of submodules that need to be pushed (e.g. "struct string_list"), and
   make it not break the loop upon the first submodule that is stale.

 - Instead of adding a call to push-unpushed-submodules before
   check-submodule-needs-pushing in transport_push(), first call
   check-submodule-needs-pushing when on-demand or check is in effect.

   When in check mode, if check-submodule-needs-pushing returned a
   non-empty list, report which ones are stale and die.

   If in on-demand mode, you have a list of submodules you need to run
   "git push" in. Iterate over that list and do your push_submodule().
   You may want to reconsider your "die()" there, though.

Hmm?

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v4 2/2] push: teach --recurse-submodules the on-demand option
       [not found]     ` <20111017190749.GA3126@sandbox-rc>
@ 2011-10-17 22:33       ` Junio C Hamano
  2011-10-18 20:58         ` Jens Lehmann
  0 siblings, 1 reply; 20+ messages in thread
From: Junio C Hamano @ 2011-10-17 22:33 UTC (permalink / raw)
  To: Heiko Voigt; +Cc: Jens Lehmann, Fredrik Gustafsson, git

Heiko Voigt <hvoigt@hvoigt.net> writes:

> since we have not heard anything from Fredrik I will probably look into
> cleaning this up. Should I do that with follow-up patches since this
> patch is already in next?

I thought we kicked it back to 'pu' after 1.7.7 cycle.

I would personally want to put a freeze on "recursively do anything to
submodule" topic (including but not limited to "checkout") for now, until
we know how we would want to support "floating submodule" model. For
existing code in-flight, I would like to see us at least have a warm and
fuzzy feeling that we know which part of the code such a support would
need to undo and how the update would look like before moving forward.

There are two camps that use submodules in their large-ish projects.

One is mostly happy with the traditional "submodule trees checked out must
match what the superproject says, otherwise you have local changes and the
build product cannot be called to have emerged from that particular
superproject commit" model. Let's call this "exact submodules" model.

The other prefers "submodule trees checked out are whatever submodule
commits that happen to sit at the tips of the designated branches the
superproject wants to use" model. The superproject tree does not exactly
know or care what commit to use from each of its submodules, and I would
imagine that it may be more convenient for developers. They do not have to
care the entire build product while they commit---only the integration
process that could be separate and perhaps automated needs to know.

We haven't given any explicit support to the latter "floating submodules"
model so far. There may be easy workarounds to many of the potential
issues, (e.g. at "git diff/status" level, there may be some configuration
variables to tell the tools to ignore differences between the commit the
superproject records for the submodule path and the HEAD in the
submodule), but with recent work on submodules such as "allow pushing
superproject only after submodule commits are pushed out", I am afraid
that we seem to be piling random new things with the assumption that we
would never support anything but "exact submodules" model. Continuing the
development that way would require retrofitting support for "floating
submodules" model to largely undo the unwarranted assumptions existing
code makes. That is the reason why I would like to see people think about
the need to support the other "floating submodules" model, before making
the existing mess even worse.

The very first step for floating submodules support would be relatively
simple. We could declare that an entry in the .gitmodules file in the
superproject can optionally specify which branch needs to be checked out
with something like:

	[submodule "libfoo"]
		branch = master
                path = include/foo
                url = git://foo.com/git/lib.git
                
and when such an entry is defined, a command at the superproject level
would largely ignore what is at include/foo in the tree object recorded in
the superproject commit and in the index. When we show "git status" in the
superproject, instead of using the commit bound to the superproject, we
would use include/foo/.git/HEAD as the basis for detecting "local" changes
to the submodule. We could even declare that the gitlink for such a
submodule should record 0{40} SHA-1 in the superproject, but I do not
think that is necessary.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v4 2/2] push: teach --recurse-submodules the on-demand option
  2011-10-17 22:33       ` Junio C Hamano
@ 2011-10-18 20:58         ` Jens Lehmann
  2011-12-12 21:16           ` Phil Hord
  0 siblings, 1 reply; 20+ messages in thread
From: Jens Lehmann @ 2011-10-18 20:58 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Heiko Voigt, Fredrik Gustafsson, git

Am 18.10.2011 00:33, schrieb Junio C Hamano:
> I would personally want to put a freeze on "recursively do anything to
> submodule" topic (including but not limited to "checkout") for now, until
> we know how we would want to support "floating submodule" model. For
> existing code in-flight, I would like to see us at least have a warm and
> fuzzy feeling that we know which part of the code such a support would
> need to undo and how the update would look like before moving forward.

Makes sense.

> There are two camps that use submodules in their large-ish projects.
> 
> One is mostly happy with the traditional "submodule trees checked out must
> match what the superproject says, otherwise you have local changes and the
> build product cannot be called to have emerged from that particular
> superproject commit" model. Let's call this "exact submodules" model.
> 
> The other prefers "submodule trees checked out are whatever submodule
> commits that happen to sit at the tips of the designated branches the
> superproject wants to use" model. The superproject tree does not exactly
> know or care what commit to use from each of its submodules, and I would
> imagine that it may be more convenient for developers. They do not have to
> care the entire build product while they commit---only the integration
> process that could be separate and perhaps automated needs to know.
>
> We haven't given any explicit support to the latter "floating submodules"
> model so far. There may be easy workarounds to many of the potential
> issues, (e.g. at "git diff/status" level, there may be some configuration
> variables to tell the tools to ignore differences between the commit the
> superproject records for the submodule path and the HEAD in the
> submodule), but with recent work on submodules such as "allow pushing
> superproject only after submodule commits are pushed out", I am afraid
> that we seem to be piling random new things with the assumption that we
> would never support anything but "exact submodules" model.

It's not about never supporting anything else, but right now we are
scratching our own itch ;-)

> Continuing the
> development that way would require retrofitting support for "floating
> submodules" model to largely undo the unwarranted assumptions existing
> code makes. That is the reason why I would like to see people think about
> the need to support the other "floating submodules" model, before making
> the existing mess even worse.

If you configure diff.ignoreSubmodules=all and fetch.recurseSubmodules=false
and write a script fetching and checking out the branch(es) of your choice
in the submodule(s) you run each time you want to update the branch tip
there, you should be almost there with current Git. But yes, we could do
better.

> The very first step for floating submodules support would be relatively
> simple. We could declare that an entry in the .gitmodules file in the
> superproject can optionally specify which branch needs to be checked out
> with something like:
> 
> 	[submodule "libfoo"]
> 		branch = master
>                 path = include/foo
>                 url = git://foo.com/git/lib.git
>                 
> and when such an entry is defined, a command at the superproject level
> would largely ignore what is at include/foo in the tree object recorded in
> the superproject commit and in the index. When we show "git status" in the
> superproject, instead of using the commit bound to the superproject, we
> would use include/foo/.git/HEAD as the basis for detecting "local" changes
> to the submodule.

Yup. And the presence of the "branch" config could tell "git submodule
update" to fetch and advance that branch to the tip every time it is run.
And it could tell the diff machinery (which is also used by status) to
ignore the differences between a submodule's HEAD and the SHA-1 in the
superproject (while still allowing to silence the presence of untracked
and/or modified files by using the diff.ignoreSubmodules option) and
fetch would just stop doing any on-demand action for such submodules.
Anything I missed?

> We could even declare that the gitlink for such a
> submodule should record 0{40} SHA-1 in the superproject, but I do not
> think that is necessary.

Me neither, e.g. the SHA-1 which was the submodules HEAD when it was added
should do nicely. And that would avoid referencing a non-existing commit
in case you later want to turn a floating submodule into an exact one.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v4 2/2] push: teach --recurse-submodules the on-demand option
  2011-10-18 20:58         ` Jens Lehmann
@ 2011-12-12 21:16           ` Phil Hord
  2011-12-12 22:29             ` Jens Lehmann
  0 siblings, 1 reply; 20+ messages in thread
From: Phil Hord @ 2011-12-12 21:16 UTC (permalink / raw)
  To: Jens Lehmann; +Cc: Junio C Hamano, Heiko Voigt, Fredrik Gustafsson, git

On Tue, Oct 18, 2011 at 4:58 PM, Jens Lehmann <Jens.Lehmann@web.de> wrote:
> Am 18.10.2011 00:33, schrieb Junio C Hamano:
>> The very first step for floating submodules support would be relatively
>> simple. We could declare that an entry in the .gitmodules file in the
>> superproject can optionally specify which branch needs to be checked out
>> with something like:
>>
>>       [submodule "libfoo"]
>>               branch = master
>>                 path = include/foo
>>                 url = git://foo.com/git/lib.git
>>
>> and when such an entry is defined, a command at the superproject level
>> would largely ignore what is at include/foo in the tree object recorded in
>> the superproject commit and in the index. When we show "git status" in the
>> superproject, instead of using the commit bound to the superproject, we
>> would use include/foo/.git/HEAD as the basis for detecting "local" changes
>> to the submodule.
>
> Yup. And the presence of the "branch" config could tell "git submodule
> update" to fetch and advance that branch to the tip every time it is run.
> And it could tell the diff machinery (which is also used by status) to
> ignore the differences between a submodule's HEAD and the SHA-1 in the
> superproject (while still allowing to silence the presence of untracked
> and/or modified files by using the diff.ignoreSubmodules option) and
> fetch would just stop doing any on-demand action for such submodules.
> Anything I missed?
>
>> We could even declare that the gitlink for such a
>> submodule should record 0{40} SHA-1 in the superproject, but I do not
>> think that is necessary.
>
> Me neither, e.g. the SHA-1 which was the submodules HEAD when it was added
> should do nicely. And that would avoid referencing a non-existing commit
> in case you later want to turn a floating submodule into an exact one.


I'm sorry I missed this comment before.

I hope we can allow storing the actual gitlink in the superproject for
each commit even when we're using floating submodules.  I
thought-experimented with this a bit last year and came to the
conclusion that I should be able to 'float' to tips (developer
convenience) and also to store the SHA-1 of each gitlink through
history (automated maybe; as-needed).

The problem with "float-only" is that it loses history so, for
example, git-bisect doesn't work.

The problem with "float + gitlinks", of course, is that it looks like
"not floating" to the developers (git-status is dirty unless
overridden, etc.)

Is there a deeper reason this wouldn't be possible?

Phil

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v4 2/2] push: teach --recurse-submodules the on-demand option
  2011-12-12 21:16           ` Phil Hord
@ 2011-12-12 22:29             ` Jens Lehmann
  2011-12-12 23:50               ` Phil Hord
  0 siblings, 1 reply; 20+ messages in thread
From: Jens Lehmann @ 2011-12-12 22:29 UTC (permalink / raw)
  To: Phil Hord; +Cc: Junio C Hamano, Heiko Voigt, Fredrik Gustafsson, git

Am 12.12.2011 22:16, schrieb Phil Hord:
> On Tue, Oct 18, 2011 at 4:58 PM, Jens Lehmann <Jens.Lehmann@web.de> wrote:
>> Am 18.10.2011 00:33, schrieb Junio C Hamano:
>>> We could even declare that the gitlink for such a
>>> submodule should record 0{40} SHA-1 in the superproject, but I do not
>>> think that is necessary.
>>
>> Me neither, e.g. the SHA-1 which was the submodules HEAD when it was added
>> should do nicely. And that would avoid referencing a non-existing commit
>> in case you later want to turn a floating submodule into an exact one.
> 
> 
> I'm sorry I missed this comment before.
> 
> I hope we can allow storing the actual gitlink in the superproject for
> each commit even when we're using floating submodules.

I think you misread my statement, I was just talking about the initial
commit containing the newly added submodule, not any subsequent ones.
Floating makes differences between the original SHA-1 and the current
tip of the branch invisible, so there is nothing to commit.

>  I thought-experimented with this a bit last year and came to the
> conclusion that I should be able to 'float' to tips (developer
> convenience) and also to store the SHA-1 of each gitlink through
> history (automated maybe; as-needed).

Which means that after "git submodule update" floated a submodule branch
further, you would have to commit that in the superproject.

> The problem with "float-only" is that it loses history so, for
> example, git-bisect doesn't work.

Yep. And different developers can have the same superproject commit
checked out but their submodules can be quite different.

> The problem with "float + gitlinks", of course, is that it looks like
> "not floating" to the developers (git-status is dirty unless
> overridden, etc.)

Yeah. But what if each "git submodule update" would update the tip of
the submodule branch and add that to the superproject? You could follow
a tip but still produce reproducible trees.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v4 2/2] push: teach --recurse-submodules the on-demand option
  2011-12-12 22:29             ` Jens Lehmann
@ 2011-12-12 23:50               ` Phil Hord
  2011-12-13  8:48                 ` Jens Lehmann
  0 siblings, 1 reply; 20+ messages in thread
From: Phil Hord @ 2011-12-12 23:50 UTC (permalink / raw)
  To: Jens Lehmann; +Cc: Junio C Hamano, Heiko Voigt, Fredrik Gustafsson, git

On Mon, Dec 12, 2011 at 5:29 PM, Jens Lehmann <Jens.Lehmann@web.de> wrote:
> Am 12.12.2011 22:16, schrieb Phil Hord:
>> On Tue, Oct 18, 2011 at 4:58 PM, Jens Lehmann <Jens.Lehmann@web.de> wrote:
>>> Am 18.10.2011 00:33, schrieb Junio C Hamano:
>>>> We could even declare that the gitlink for such a
>>>> submodule should record 0{40} SHA-1 in the superproject, but I do not
>>>> think that is necessary.
>>>
>>> Me neither, e.g. the SHA-1 which was the submodules HEAD when it was added
>>> should do nicely. And that would avoid referencing a non-existing commit
>>> in case you later want to turn a floating submodule into an exact one.
>>
>>
>> I'm sorry I missed this comment before.
>>
>> I hope we can allow storing the actual gitlink in the superproject for
>> each commit even when we're using floating submodules.
>
> I think you misread my statement, I was just talking about the initial
> commit containing the newly added submodule, not any subsequent ones.
> Floating makes differences between the original SHA-1 and the current
> tip of the branch invisible, so there is nothing to commit.
>
>>  I thought-experimented with this a bit last year and came to the
>> conclusion that I should be able to 'float' to tips (developer
>> convenience) and also to store the SHA-1 of each gitlink through
>> history (automated maybe; as-needed).
>
> Which means that after "git submodule update" floated a submodule branch
> further, you would have to commit that in the superproject.

Sadly, yes.  Currently I have my CI-server do this for me after it
verifies each new submodule commit is able to build successfully.

>> The problem with "float-only" is that it loses history so, for
>> example, git-bisect doesn't work.
>
> Yep. And different developers can have the same superproject commit
> checked out but their submodules can be quite different.

>> The problem with "float + gitlinks", of course, is that it looks like
>> "not floating" to the developers (git-status is dirty unless
>> overridden, etc.)
>
> Yeah. But what if each "git submodule update" would update the tip of
> the submodule branch and add that to the superproject? You could follow
> a tip but still produce reproducible trees.

Yes, and that's what I want.

Not what it sounded like was being suggested before, which (to my
eyes) implied that the submodule gitlinks were useless noise.

Phil

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v4 2/2] push: teach --recurse-submodules the on-demand option
  2011-12-12 23:50               ` Phil Hord
@ 2011-12-13  8:48                 ` Jens Lehmann
  0 siblings, 0 replies; 20+ messages in thread
From: Jens Lehmann @ 2011-12-13  8:48 UTC (permalink / raw)
  To: Phil Hord; +Cc: Junio C Hamano, Heiko Voigt, Fredrik Gustafsson, git

Am 13.12.2011 00:50, schrieb Phil Hord:
> On Mon, Dec 12, 2011 at 5:29 PM, Jens Lehmann <Jens.Lehmann@web.de> wrote:
>> Am 12.12.2011 22:16, schrieb Phil Hord:
>>>  I thought-experimented with this a bit last year and came to the
>>> conclusion that I should be able to 'float' to tips (developer
>>> convenience) and also to store the SHA-1 of each gitlink through
>>> history (automated maybe; as-needed).
>>
>> Which means that after "git submodule update" floated a submodule branch
>> further, you would have to commit that in the superproject.
> 
> Sadly, yes.  Currently I have my CI-server do this for me after it
> verifies each new submodule commit is able to build successfully.

Which I think is a good thing to do, as you have a good chance of
catching breakage introduced by the submodule updates. "float-only"
submodules won't always be a pleasant experience, as they can (and
sometimes will) get you into trouble when advancing them introduces
bugs (and then you can't even bisect that breakage).

>>> The problem with "float + gitlinks", of course, is that it looks like
>>> "not floating" to the developers (git-status is dirty unless
>>> overridden, etc.)
>>
>> Yeah. But what if each "git submodule update" would update the tip of
>> the submodule branch and add that to the superproject? You could follow
>> a tip but still produce reproducible trees.
> 
> Yes, and that's what I want.
> 
> Not what it sounded like was being suggested before, which (to my
> eyes) implied that the submodule gitlinks were useless noise.

It was suggested in other threads in the past. For a start, you could
write a script doing that and play around with it. And if that works
well for you, we can discuss if implementing that functionality into
"git submodule update" makes sense.

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2011-12-13  8:48 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-08-19 22:08 [PATCH v4 0/2] push: submodule support Fredrik Gustafsson
2011-08-19 22:08 ` [PATCH v4 1/2] push: Don't push a repository with unpushed submodules Fredrik Gustafsson
2011-08-19 23:26   ` Junio C Hamano
2011-08-20  6:32     ` Junio C Hamano
2011-08-21  6:48       ` Junio C Hamano
2011-08-22 19:47         ` Heiko Voigt
2011-08-22 22:22           ` Junio C Hamano
2011-08-23 19:45             ` Heiko Voigt
2011-08-24 21:14             ` [WIP PATCH] revision-walking: allow iterating revisions multiple times Heiko Voigt
2011-08-24 21:44               ` Junio C Hamano
2011-08-20  6:34     ` [PATCH 2/2] demonstrate format-callback used in combined diff Junio C Hamano
2011-08-21 21:55       ` Fredrik Gustafsson
2011-08-19 22:08 ` [PATCH v4 2/2] push: teach --recurse-submodules the on-demand option Fredrik Gustafsson
2011-09-02 18:21   ` Junio C Hamano
     [not found]     ` <20111017190749.GA3126@sandbox-rc>
2011-10-17 22:33       ` Junio C Hamano
2011-10-18 20:58         ` Jens Lehmann
2011-12-12 21:16           ` Phil Hord
2011-12-12 22:29             ` Jens Lehmann
2011-12-12 23:50               ` Phil Hord
2011-12-13  8:48                 ` Jens Lehmann

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.