All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC PATCH 0/2] Teach rm to better handle submodules
@ 2012-07-04 20:43 Jens Lehmann
  2012-07-04 20:44 ` [RFC PATCH 1/2] rm: don't fail when removing populated submodules Jens Lehmann
                   ` (2 more replies)
  0 siblings, 3 replies; 19+ messages in thread
From: Jens Lehmann @ 2012-07-04 20:43 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Michał Górny, Phil Hord, Heiko Voigt

This is a follow-up to gmane thread 200564 about teaching "git
submodule" the "rm" command. It has the intention to only having
to add a small amount of new code to the submodule script while
handling the heavy lifting in core git.

With this series I also intend to make submodule handling a bit
easier for users by teaching "git rm" to remove the submodule
section from the .gitmodules file, no matter if the submodule is
populated or not. Not being able to do that because the section
is not found there or the .gitmodules file is already deleted
will only issue a warning (with the intent to make the user aware
that "git rm" would do that for him in case he did that himself).

The first patch introduces a change in behavior: Using rm on a
populated submodule will not error out anymore but just issue a
warning instead (In a subsequent series I intend to teach rm the
--recurse-submodule option to really remove the submodule when
its .git is a gitfile. Then the usual checks for a submodule to
not accidentally loose any changes will be added, for now it just
compares the submodules HEAD to the commit recorded in the index).
Me thinks this change should be ok as it makes a former failing
call succeed, but maybe others disagree here.

The second patch removes the corresponding submodule section from
.gitmodules, which should be done when removing a submodule so it
can be done by git itself to help the user ("git submodule add"
automatically adds it, so "git [submodule] rm" should also remove
it).

This series adds some extra cost to a "git rm" because of the
parsing of the .gitmodules file and keeping track of which of the
to-be-deleted entries are submodules, but I don't expect that to
be noticeable. Parsing .gitmodules is needed in the long run
anyways to detect submodules whose work tree is configured to be
removed automatically (or not) and the extra 'is_submodule' flag
is only added for the files to be removed and setting and testing
it is fast.

I would appreciate if someone with more knowledge of the index
api than I have to review the stage_updated_gitmodules() in patch
2 to see if I'm doing something stupid there.

Opinions?


Jens Lehmann (2):
  rm: don't fail when removing populated submodules
  rm: remove submodules from the index and the .gitmodules file

 builtin/rm.c         | 42 +++++++++++++++++---------
 submodule.c          | 55 ++++++++++++++++++++++++++++++++++
 submodule.h          |  2 ++
 t/t3600-rm.sh        | 84 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 t/t7610-mergetool.sh |  6 ++--
 5 files changed, 172 insertions(+), 17 deletions(-)

-- 
1.7.11.1.105.g9f6831b

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [RFC PATCH 1/2] rm: don't fail when removing populated submodules
  2012-07-04 20:43 [RFC PATCH 0/2] Teach rm to better handle submodules Jens Lehmann
@ 2012-07-04 20:44 ` Jens Lehmann
  2012-07-06  6:57   ` Junio C Hamano
  2012-07-04 20:44 ` [RFC PATCH 2/2] rm: remove submodules from the index and the .gitmodules file Jens Lehmann
  2012-07-05  0:44 ` [RFC PATCH 0/2] Teach rm to better handle submodules Junio C Hamano
  2 siblings, 1 reply; 19+ messages in thread
From: Jens Lehmann @ 2012-07-04 20:44 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Michał Górny, Phil Hord, Heiko Voigt

Currently using "git rm" on a populated submodule produces this error:
	fatal: git rm: '<submodule path>': Is a directory
Using it on an unpopulated submodule removes the empty directory silently
and removes the gitlink from the index, while it doesn't do the latter
when the submodule is populated but errors out.

While the error technically correct (the submodule directory can't be
removed because it still contains the checked out work tree) rm could do
better because it knows it is a submodule. It should remove the gitlink
from the index no matter if it is populated or not. Also not being able to
remove a submodule directory isn't an error but should only issue a
warning to inform the user about that fact while removing the gitlink from
the index nonetheless.

Change "git rm" so it only issues a warning if a populated submodule
cannot be removed. Also apply the same policy as for regular files and
require forcing when the submodules HEAD is different than what is
recorded in the index. To achieve that the list storing to be deleted
files and submodules is extended by a boolean recording if the entry is a
submodule or not to reuse that information in the removal phase.

While this changes behavior of "git rm", it only fixes an error where it
never worked properly.

Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
---
 builtin/rm.c  | 34 +++++++++++++++++-----------
 t/t3600-rm.sh | 71 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 92 insertions(+), 13 deletions(-)

diff --git a/builtin/rm.c b/builtin/rm.c
index 90c8a50..1c73dcf 100644
--- a/builtin/rm.c
+++ b/builtin/rm.c
@@ -17,7 +17,10 @@ static const char * const builtin_rm_usage[] = {

 static struct {
 	int nr, alloc;
-	const char **name;
+	struct {
+		const char *name;
+		char is_submodule;
+	} *entry;
 } list;

 static int check_local_mod(unsigned char *head, int index_only)
@@ -37,7 +40,7 @@ static int check_local_mod(unsigned char *head, int index_only)
 		struct stat st;
 		int pos;
 		struct cache_entry *ce;
-		const char *name = list.name[i];
+		const char *name = list.entry[i].name;
 		unsigned char sha1[20];
 		unsigned mode;
 		int local_changes = 0;
@@ -54,11 +57,11 @@ static int check_local_mod(unsigned char *head, int index_only)
 			/* It already vanished from the working tree */
 			continue;
 		}
-		else if (S_ISDIR(st.st_mode)) {
+		else if (S_ISDIR(st.st_mode) && !S_ISGITLINK(ce->ce_mode)) {
 			/* if a file was removed and it is now a
 			 * directory, that is the same as ENOENT as
 			 * far as git is concerned; we do not track
-			 * directories.
+			 * directories unless they are submodules.
 			 */
 			continue;
 		}
@@ -173,8 +176,9 @@ int cmd_rm(int argc, const char **argv, const char *prefix)
 		struct cache_entry *ce = active_cache[i];
 		if (!match_pathspec(pathspec, ce->name, ce_namelen(ce), 0, seen))
 			continue;
-		ALLOC_GROW(list.name, list.nr + 1, list.alloc);
-		list.name[list.nr++] = ce->name;
+		ALLOC_GROW(list.entry, list.nr + 1, list.alloc);
+		list.entry[list.nr].name = ce->name;
+		list.entry[list.nr++].is_submodule = S_ISGITLINK(ce->ce_mode);
 	}

 	if (pathspec) {
@@ -222,7 +226,7 @@ int cmd_rm(int argc, const char **argv, const char *prefix)
 	 * the index unless all of them succeed.
 	 */
 	for (i = 0; i < list.nr; i++) {
-		const char *path = list.name[i];
+		const char *path = list.entry[i].name;
 		if (!quiet)
 			printf("rm '%s'\n", path);

@@ -244,13 +248,17 @@ int cmd_rm(int argc, const char **argv, const char *prefix)
 	if (!index_only) {
 		int removed = 0;
 		for (i = 0; i < list.nr; i++) {
-			const char *path = list.name[i];
-			if (!remove_path(path)) {
-				removed = 1;
-				continue;
+			const char *path = list.entry[i].name;
+			if (!list.entry[i].is_submodule) {
+				if (!remove_path(path)) {
+					removed = 1;
+					continue;
+				}
+				if (!removed)
+					die_errno("git rm: '%s'", path);
+			} else {
+				rmdir_or_warn(path);
 			}
-			if (!removed)
-				die_errno("git rm: '%s'", path);
 		}
 	}

diff --git a/t/t3600-rm.sh b/t/t3600-rm.sh
index 9fd28bc..2af8bb9 100755
--- a/t/t3600-rm.sh
+++ b/t/t3600-rm.sh
@@ -262,4 +262,75 @@ test_expect_success 'rm removes subdirectories recursively' '
 	! test -d dir
 '

+cat >expect <<EOF
+D  submod
+EOF
+
+test_expect_success 'rm removes empty submodules from work tree' '
+	mkdir submod &&
+	git update-index --add --cacheinfo 160000 $(git rev-parse HEAD) submod &&
+	git config -f .gitmodules submodule.sub.url ./. &&
+	git config -f .gitmodules submodule.sub.path submod &&
+	git submodule init &&
+	git add .gitmodules &&
+	git commit -m "add submodule" &&
+	git rm submod &&
+	test ! -e submod &&
+	git status -s -uno > actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'rm leaves work tree of populated submodules alone' '
+	git reset --hard &&
+	git submodule update &&
+	git rm submod &&
+	test -d submod &&
+	test -f submod/.git &&
+	git status -s -uno > actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'rm of a populated submodule with different HEAD requires forcing' '
+	git reset --hard &&
+	git submodule update &&
+	(cd submod &&
+		git checkout HEAD^
+	) &&
+	test_must_fail git rm submod &&
+	git status -s -uno > actual &&
+	git rm --force submod &&
+	test -s actual &&
+	test -d submod &&
+	test -f submod/.git &&
+	git status -s -uno > actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'rm of a populated submodule with modifications succeeds' '
+	git reset --hard &&
+	git submodule update &&
+	(cd submod &&
+		echo X >empty
+	) &&
+	git rm submod &&
+	test -d submod &&
+	test -f submod/.git &&
+	git status -s -uno > actual &&
+	test_cmp expect actual
+'
+
+test_expect_success 'rm of a populated submodule with untracked files succeeds' '
+	git reset --hard &&
+	git submodule update &&
+	(cd submod &&
+		echo X >untracked
+	) &&
+	git rm submod &&
+	test -d submod &&
+	test -f submod/.git &&
+	git status -s -uno > actual &&
+	test_cmp expect actual &&
+	rm submod/untracked
+'
+
 test_done
-- 
1.7.11.1.105.g9f6831b

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [RFC PATCH 2/2] rm: remove submodules from the index and the .gitmodules file
  2012-07-04 20:43 [RFC PATCH 0/2] Teach rm to better handle submodules Jens Lehmann
  2012-07-04 20:44 ` [RFC PATCH 1/2] rm: don't fail when removing populated submodules Jens Lehmann
@ 2012-07-04 20:44 ` Jens Lehmann
  2012-07-05  0:44 ` [RFC PATCH 0/2] Teach rm to better handle submodules Junio C Hamano
  2 siblings, 0 replies; 19+ messages in thread
From: Jens Lehmann @ 2012-07-04 20:44 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Michał Górny, Phil Hord, Heiko Voigt

Currently using "git rm" on a submodule only removes the gitlink from the
index but leaves the submodule's section in .gitmodules untouched.

It should help the user by not only removing the gitlink from the index
but by also removing the "submodule.<submodule name>" section from the
.gitmodules file and stage both.

Change "git rm" so it removes the submodule section from .gitmodules and
only issues a warning if it cannot be removed, as the user might have done
that himself.

In t7610 three uses of "git rm submod" had to be replaced with
"git update-index --remove submod" because that test expects .gitmodules
to stay untouched.

Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
---
 builtin/rm.c         |  8 +++++++-
 submodule.c          | 55 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 submodule.h          |  2 ++
 t/t3600-rm.sh        | 25 ++++++++++++++++++------
 t/t7610-mergetool.sh |  6 +++---
 5 files changed, 86 insertions(+), 10 deletions(-)

diff --git a/builtin/rm.c b/builtin/rm.c
index 1c73dcf..7190dc9 100644
--- a/builtin/rm.c
+++ b/builtin/rm.c
@@ -9,6 +9,7 @@
 #include "cache-tree.h"
 #include "tree-walk.h"
 #include "parse-options.h"
+#include "submodule.h"

 static const char * const builtin_rm_usage[] = {
 	"git rm [options] [--] <file>...",
@@ -149,6 +150,7 @@ int cmd_rm(int argc, const char **argv, const char *prefix)
 	const char **pathspec;
 	char *seen;

+	gitmodules_config();
 	git_config(git_default_config, NULL);

 	argc = parse_options(argc, argv, prefix, builtin_rm_options,
@@ -246,7 +248,7 @@ int cmd_rm(int argc, const char **argv, const char *prefix)
 	 * in the middle)
 	 */
 	if (!index_only) {
-		int removed = 0;
+		int removed = 0, gitmodules_modified = 0;
 		for (i = 0; i < list.nr; i++) {
 			const char *path = list.entry[i].name;
 			if (!list.entry[i].is_submodule) {
@@ -258,8 +260,12 @@ int cmd_rm(int argc, const char **argv, const char *prefix)
 					die_errno("git rm: '%s'", path);
 			} else {
 				rmdir_or_warn(path);
+				if (!remove_path_from_gitmodules(path))
+					gitmodules_modified = 1;
 			}
 		}
+		if (gitmodules_modified)
+			stage_updated_gitmodules();
 	}

 	if (active_cache_changed) {
diff --git a/submodule.c b/submodule.c
index 959d349..f7e4b14 100644
--- a/submodule.c
+++ b/submodule.c
@@ -10,6 +10,7 @@
 #include "string-list.h"
 #include "sha1-array.h"
 #include "argv-array.h"
+#include "blob.h"

 static struct string_list config_name_for_path;
 static struct string_list config_fetch_recurse_submodules_for_name;
@@ -30,6 +31,60 @@ static struct sha1_array ref_tips_after_fetch;
  */
 static int gitmodules_is_unmerged;

+/*
+ * Try to remove the "submodule.<name>" section from .gitmodules where the
+ * given path is configured.
+ */
+int remove_path_from_gitmodules(const char *path)
+{
+	struct strbuf sect = STRBUF_INIT;
+	struct string_list_item *path_option;
+
+	path_option = unsorted_string_list_lookup(&config_name_for_path, path);
+	if (!path_option) {
+		warning("Could not find section in .gitmodules where path=%s", path);
+		return -1;
+	}
+	strbuf_addstr(&sect, "submodule.");
+	strbuf_addstr(&sect, path_option->util);
+	if (git_config_rename_section_in_file(".gitmodules", sect.buf, NULL) < 0) {
+		/* Maybe the user already did that, don't error out here */
+		warning("Could not remove .gitmodules entry for %s", path);
+		return -1;
+	}
+	strbuf_release(&sect);
+	return 0;
+}
+
+void stage_updated_gitmodules(void)
+{
+	struct strbuf buf = STRBUF_INIT;
+	struct stat st;
+	int pos;
+	struct cache_entry *ce;
+	int namelen = strlen(".gitmodules");
+
+	pos = cache_name_pos(".gitmodules", strlen(".gitmodules"));
+	if (pos < 0) {
+		warning(_("could not find .gitmodules in index"));
+		return;
+	}
+	ce = active_cache[pos];
+	ce->ce_flags = namelen;
+	if (strbuf_read_file(&buf, ".gitmodules", 0))
+		die(_("reading updated .gitmodules failed"));
+	if (lstat(".gitmodules", &st) < 0)
+		die_errno(_("unable to stat updated .gitmodules"));
+	fill_stat_cache_info(ce, &st);
+	ce->ce_mode = ce_mode_from_stat(ce, st.st_mode);
+	if (remove_file_from_cache(".gitmodules") < 0)
+		die(_("unable to remove .gitmodules from index"));
+	if (write_sha1_file(buf.buf, buf.len, blob_type, ce->sha1))
+		die(_("adding updated .gitmodules failed"));
+	if (add_cache_entry(ce, ADD_CACHE_OK_TO_ADD|ADD_CACHE_OK_TO_REPLACE))
+		die(_("staging updated .gitmodules failed"));
+}
+
 static int add_submodule_odb(const char *path)
 {
 	struct strbuf objects_directory = STRBUF_INIT;
diff --git a/submodule.h b/submodule.h
index e105b0e..1bfc2c6 100644
--- a/submodule.h
+++ b/submodule.h
@@ -10,6 +10,8 @@ enum {
 	RECURSE_SUBMODULES_ON = 2
 };

+int remove_path_from_gitmodules(const char *path);
+void stage_updated_gitmodules(void);
 void set_diffopt_flags_from_submodule_config(struct diff_options *diffopt,
 		const char *path);
 int submodule_config(const char *var, const char *value, void *cb);
diff --git a/t/t3600-rm.sh b/t/t3600-rm.sh
index 2af8bb9..4a6475f 100755
--- a/t/t3600-rm.sh
+++ b/t/t3600-rm.sh
@@ -263,10 +263,11 @@ test_expect_success 'rm removes subdirectories recursively' '
 '

 cat >expect <<EOF
+M  .gitmodules
 D  submod
 EOF

-test_expect_success 'rm removes empty submodules from work tree' '
+test_expect_success 'rm removes empty submodules from work tree and .gitmodules' '
 	mkdir submod &&
 	git update-index --add --cacheinfo 160000 $(git rev-parse HEAD) submod &&
 	git config -f .gitmodules submodule.sub.url ./. &&
@@ -277,17 +278,21 @@ test_expect_success 'rm removes empty submodules from work tree' '
 	git rm submod &&
 	test ! -e submod &&
 	git status -s -uno > actual &&
-	test_cmp expect actual
+	test_cmp expect actual &&
+	test_must_fail git config -f .gitmodules submodule.sub.url &&
+	test_must_fail git config -f .gitmodules submodule.sub.path
 '

-test_expect_success 'rm leaves work tree of populated submodules alone' '
+test_expect_success 'rm leaves work tree of populated submodules alone but updates .gitmodules' '
 	git reset --hard &&
 	git submodule update &&
 	git rm submod &&
 	test -d submod &&
 	test -f submod/.git &&
 	git status -s -uno > actual &&
-	test_cmp expect actual
+	test_cmp expect actual &&
+	test_must_fail git config -f .gitmodules submodule.sub.url &&
+	test_must_fail git config -f .gitmodules submodule.sub.path
 '

 test_expect_success 'rm of a populated submodule with different HEAD requires forcing' '
@@ -298,12 +303,16 @@ test_expect_success 'rm of a populated submodule with different HEAD requires fo
 	) &&
 	test_must_fail git rm submod &&
 	git status -s -uno > actual &&
+	git config -f .gitmodules submodule.sub.url &&
+	git config -f .gitmodules submodule.sub.path &&
 	git rm --force submod &&
 	test -s actual &&
 	test -d submod &&
 	test -f submod/.git &&
 	git status -s -uno > actual &&
-	test_cmp expect actual
+	test_cmp expect actual &&
+	test_must_fail git config -f .gitmodules submodule.sub.url &&
+	test_must_fail git config -f .gitmodules submodule.sub.path
 '

 test_expect_success 'rm of a populated submodule with modifications succeeds' '
@@ -316,7 +325,9 @@ test_expect_success 'rm of a populated submodule with modifications succeeds' '
 	test -d submod &&
 	test -f submod/.git &&
 	git status -s -uno > actual &&
-	test_cmp expect actual
+	test_cmp expect actual &&
+	test_must_fail git config -f .gitmodules submodule.sub.url &&
+	test_must_fail git config -f .gitmodules submodule.sub.path
 '

 test_expect_success 'rm of a populated submodule with untracked files succeeds' '
@@ -330,6 +341,8 @@ test_expect_success 'rm of a populated submodule with untracked files succeeds'
 	test -f submod/.git &&
 	git status -s -uno > actual &&
 	test_cmp expect actual &&
+	test_must_fail git config -f .gitmodules submodule.sub.url &&
+	test_must_fail git config -f .gitmodules submodule.sub.path &&
 	rm submod/untracked
 '

diff --git a/t/t7610-mergetool.sh b/t/t7610-mergetool.sh
index f5e16fc..3261588 100755
--- a/t/t7610-mergetool.sh
+++ b/t/t7610-mergetool.sh
@@ -215,7 +215,7 @@ test_expect_success 'deleted vs modified submodule' '
     git checkout -b test6 branch1 &&
     git submodule update -N &&
     mv submod submod-movedaside &&
-    git rm submod &&
+    git update-index --remove submod &&
     git commit -m "Submodule deleted from branch" &&
     git checkout -b test6.a test6 &&
     test_must_fail git merge master &&
@@ -284,7 +284,7 @@ test_expect_success 'file vs modified submodule' '
     git checkout -b test7 branch1 &&
     git submodule update -N &&
     mv submod submod-movedaside &&
-    git rm submod &&
+    git update-index --remove submod &&
     echo not a submodule >submod &&
     git add submod &&
     git commit -m "Submodule path becomes file" &&
@@ -415,7 +415,7 @@ test_expect_success 'submodule in subdirectory' '
 test_expect_success 'directory vs modified submodule' '
     git checkout -b test11 branch1 &&
     mv submod submod-movedaside &&
-    git rm submod &&
+    git update-index --remove submod &&
     mkdir submod &&
     echo not a submodule >submod/file16 &&
     git add submod/file16 &&
-- 
1.7.11.1.105.g9f6831b

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH 0/2] Teach rm to better handle submodules
  2012-07-04 20:43 [RFC PATCH 0/2] Teach rm to better handle submodules Jens Lehmann
  2012-07-04 20:44 ` [RFC PATCH 1/2] rm: don't fail when removing populated submodules Jens Lehmann
  2012-07-04 20:44 ` [RFC PATCH 2/2] rm: remove submodules from the index and the .gitmodules file Jens Lehmann
@ 2012-07-05  0:44 ` Junio C Hamano
  2012-07-05 19:06   ` Jens Lehmann
  2 siblings, 1 reply; 19+ messages in thread
From: Junio C Hamano @ 2012-07-05  0:44 UTC (permalink / raw)
  To: Jens Lehmann; +Cc: git, Michał Górny, Phil Hord, Heiko Voigt

Jens Lehmann <Jens.Lehmann@web.de> writes:

> This is a follow-up to gmane thread 200564 about teaching "git
> submodule" the "rm" command. It has the intention to only having
> to add a small amount of new code to the submodule script while
> handling the heavy lifting in core git.
>
> With this series I also intend to make submodule handling a bit
> easier for users by teaching "git rm" to remove the submodule
> section from the .gitmodules file, no matter if the submodule is
> populated or not. Not being able to do that because the section
> is not found there or the .gitmodules file is already deleted
> will only issue a warning (with the intent to make the user aware
> that "git rm" would do that for him in case he did that himself).

I wouldn't claim that I have thought things through yet, but in
general my instinct tells me that it is a bad idea to try pushing
down "submodule management" related bits to the core.

Let me think aloud about removing a directory, in a checkout of a
superproject, that has a submodule checkout.  There are at least two
majorly different reasons a user wants to do this:

 (1) The project used to bind a submodule as a part of the
     superproject there, but it no longer wants to do so (e.g. we
     used to ship compiler toolchain as part of our embedded
     appliance superproject, but from this release on, we expect
     developers to install the toolchain on their own).  We want to
     remove the submodule directory, we want to remove the submodule
     entry in the superproject's tree, and we want to remove the
     submodule entry from .gitmodules file.

 (2) One particular project participant has been working on one part
     of the superproject, say "the documentation submodule", hence
     "submodule init" was used for that submodule part to populate
     it.  Since the participant has done with the work, there is no
     longer need to keep the submodule checkout (submodule uninit?),
     and getting rid of it makes the working tree leaner and "git
     pull" faster (as there is no longer need to fetch updated
     history in the uninteresting submodule).  We just want to
     remove the contents of the submodule directory.  We do want to
     keep the index entry for the submodule in the superproject.  We
     do want to keep the .gitmodules intact.

This kind (1) of reorganization hopefully is rare (in other words,
you wouldn't drop a submodule today and turn around to add it again
tomorrow), done by one person and the result is propagated to all
other project participants.  On the other hand, (2) can be done by
any and all participants of the project at any time.

Which audience should "git rm $path" serve?  My gut feeling is that
the "project structure change" part that goes beyond the core
(e.g. what is in ".gitmodules") and should be a rare flag-day event
deserves a separate command, if only to make the user aware that the
user is invoking a heavy-weight (from the point-of-view of the
workflow) operation.

I am not convinced that "git rm $path" is a good interface for (2),
though.  For one thing, you would need to keep an empty directory at
the submodule path for the purpose of (2), so it is not really "rm".

Perhaps "git submodule uninit" might be needed to support (2), in
addition to "git submodule rm" to support (1).

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH 0/2] Teach rm to better handle submodules
  2012-07-05  0:44 ` [RFC PATCH 0/2] Teach rm to better handle submodules Junio C Hamano
@ 2012-07-05 19:06   ` Jens Lehmann
  2012-07-05 22:10     ` Junio C Hamano
  0 siblings, 1 reply; 19+ messages in thread
From: Jens Lehmann @ 2012-07-05 19:06 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Michał Górny, Phil Hord, Heiko Voigt

Am 05.07.2012 02:44, schrieb Junio C Hamano:
> Jens Lehmann <Jens.Lehmann@web.de> writes:
> 
>> This is a follow-up to gmane thread 200564 about teaching "git
>> submodule" the "rm" command. It has the intention to only having
>> to add a small amount of new code to the submodule script while
>> handling the heavy lifting in core git.
>>
>> With this series I also intend to make submodule handling a bit
>> easier for users by teaching "git rm" to remove the submodule
>> section from the .gitmodules file, no matter if the submodule is
>> populated or not. Not being able to do that because the section
>> is not found there or the .gitmodules file is already deleted
>> will only issue a warning (with the intent to make the user aware
>> that "git rm" would do that for him in case he did that himself).
> 
> I wouldn't claim that I have thought things through yet, but in
> general my instinct tells me that it is a bad idea to try pushing
> down "submodule management" related bits to the core.

Am I right assuming you are only talking about 2/2 here and not
about the bugfix in 1/2?

> Let me think aloud about removing a directory, in a checkout of a
> superproject, that has a submodule checkout.  There are at least two
> majorly different reasons a user wants to do this:
> 
>  (1) The project used to bind a submodule as a part of the
>      superproject there, but it no longer wants to do so (e.g. we
>      used to ship compiler toolchain as part of our embedded
>      appliance superproject, but from this release on, we expect
>      developers to install the toolchain on their own).  We want to
>      remove the submodule directory, we want to remove the submodule
>      entry in the superproject's tree, and we want to remove the
>      submodule entry from .gitmodules file.
> 
>  (2) One particular project participant has been working on one part
>      of the superproject, say "the documentation submodule", hence
>      "submodule init" was used for that submodule part to populate
>      it.  Since the participant has done with the work, there is no
>      longer need to keep the submodule checkout (submodule uninit?),
>      and getting rid of it makes the working tree leaner and "git
>      pull" faster (as there is no longer need to fetch updated
>      history in the uninteresting submodule).  We just want to
>      remove the contents of the submodule directory.  We do want to
>      keep the index entry for the submodule in the superproject.  We
>      do want to keep the .gitmodules intact.
> 
> This kind (1) of reorganization hopefully is rare (in other words,
> you wouldn't drop a submodule today and turn around to add it again
> tomorrow), done by one person and the result is propagated to all
> other project participants.  On the other hand, (2) can be done by
> any and all participants of the project at any time.
> 
> Which audience should "git rm $path" serve?  My gut feeling is that
> the "project structure change" part that goes beyond the core
> (e.g. what is in ".gitmodules") and should be a rare flag-day event
> deserves a separate command, if only to make the user aware that the
> user is invoking a heavy-weight (from the point-of-view of the
> workflow) operation.

Yes, "git rm" is for case (1) where someone wants to remove a
submodule from the project with the commit he prepares. Case (2)
is not addressed at all by this series (but I like the "uninit"
idea, makes lots of sense to me).

> I am not convinced that "git rm $path" is a good interface for (2),
> though.  For one thing, you would need to keep an empty directory at
> the submodule path for the purpose of (2), so it is not really "rm".

I totally agree here.

> Perhaps "git submodule uninit" might be needed to support (2), in
> addition to "git submodule rm" to support (1).

I think we should, as doing a simple "rm -rf $path" won't do the
job as you already pointed out. First git status will complain
that $path has been deleted, so you'll have to manually recreate
an empty directory there. And then you'll also have to revoke
your interest in having that submodule populated by removing the
submodule.name entry from .git/config, or the next "git submodule
update" will happily re-populate the submodule. All that should
be handled by a "git submodule uninit" (including the tests for
submodule modifications that could be lost). I don't see any
core git command where such a functionality could sanely live.

But that is subject to another patch. Here I wanted to get some
feedback if submodule users would agree with me that a "git rm"
should remove a submodule from index *and* .gitmodules (because
they want to remove it from future versions of the repo, and
then the removal from .gitmodules should happen at the same
point in time, so rm could just do it for you).

I don't have strong feelings about the decision if only the
submodule script should do any modifications to .gitmodules or
if it belongs into git core as part of the rm command, though
for me the latter feels more natural.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH 0/2] Teach rm to better handle submodules
  2012-07-05 19:06   ` Jens Lehmann
@ 2012-07-05 22:10     ` Junio C Hamano
  0 siblings, 0 replies; 19+ messages in thread
From: Junio C Hamano @ 2012-07-05 22:10 UTC (permalink / raw)
  To: Jens Lehmann; +Cc: git, Michał Górny, Phil Hord, Heiko Voigt

Jens Lehmann <Jens.Lehmann@web.de> writes:

> Am 05.07.2012 02:44, schrieb Junio C Hamano:
> ...
>> I wouldn't claim that I have thought things through yet, but in
>> general my instinct tells me that it is a bad idea to try pushing
>> down "submodule management" related bits to the core.
>
> Am I right assuming you are only talking about 2/2 here and not
> about the bugfix in 1/2?

I was addressing the general attitude, expressed in your [0/2] cover
letter, of eagerly attempting to push down to the core side what
could be affected by non-core policy, namely, things like what is in
".gitmodules" and how the working trees and controlling repositories
of submodules are laid out (cf. the other thread on GIT_WORK_TREE
and GIT_DIR used for the top-level superproject).  The comment was
not about either of the two patches.

When making a project-wide structural change to the superproject by
adding a submodule, "git submodule add" is used to add optional
information in .gitmodules, because where the other people would
clone the history for the submodule (or if there is a place for
other people to fetch it in the first place---a private standalone
project does not even need one) is not something the core "git add"
is not interested in.  Ceasing to use a submodule project wide (the
use case (1) in the message you are responding to) by removing the
submodule tree and an entry in .gitmodule for it is the same kind of
project wide structural change, and it felt very out of place, at
least to me, to have "git rm" do anything with .gitmodules file.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH 1/2] rm: don't fail when removing populated submodules
  2012-07-04 20:44 ` [RFC PATCH 1/2] rm: don't fail when removing populated submodules Jens Lehmann
@ 2012-07-06  6:57   ` Junio C Hamano
  2012-07-07 12:51     ` Jens Lehmann
  0 siblings, 1 reply; 19+ messages in thread
From: Junio C Hamano @ 2012-07-06  6:57 UTC (permalink / raw)
  To: Jens Lehmann; +Cc: git, Michał Górny, Phil Hord, Heiko Voigt

Jens Lehmann <Jens.Lehmann@web.de> writes:

> Currently using "git rm" on a populated submodule produces this error:
> 	fatal: git rm: '<submodule path>': Is a directory
> Using it on an unpopulated submodule removes the empty directory silently
> and removes the gitlink from the index, while it doesn't do the latter
> when the submodule is populated but errors out.
>
> While the error technically correct (the submodule directory can't be
> removed because it still contains the checked out work tree) rm could do
> better because it knows it is a submodule.

Correct in principle, but the definition of "better" could be open
to discussion.

> It should remove the gitlink
> from the index no matter if it is populated or not.

Why?  If you have a regular file that has changes with respect to
the index entry, the index version is kept as well as the working
tree version.  You can "rm --cached" to remove the index entry, and
you can "rm --force" to remove both, nuking the change you made to
the working tree, but we do not touch such a "dirty" entry without
any explicit option.

> Also not being able to
> remove a submodule directory isn't an error but should only issue a
> warning to inform the user about that fact while removing the gitlink from
> the index nonetheless.

You are repeating yourself without justification.

> Change "git rm" so it only issues a warning if a populated submodule
> cannot be removed.

I find this part iffy due to the above.  With --cached, perhaps, but
without any option, I do not think I heard a sound justification
behind this change.

> Also apply the same policy as for regular files and
> require forcing when the submodules HEAD is different than what is
> recorded in the index. 

I think the "policy" for regular files is that "git rm $path" errors
out to avoid losing information in $path.  Even if the HEAD in the
submodule points at the same commit as recorded in the index, if the
submodule directory has other changes that (cd $path && git status)
would report, we would not want to remove it, no?

I am not sure if the difference between $path/.git/HEAD and :$path
(the version in the index) matters.  Maybe it does, maybe it
doesn't.

One possible sane behaviour of "git rm $path" might be:

 - If --force is given, remove it from the index and from the
   working tree (i.e. "rm -rf $path"), but use the "gitfile"
   facility to save $path/.git away to $GIT_DIR/modules/$name; error
   out if the submodule directory $path cannot be removed like this.
   We would probably want to remove "submodule.<name>.*" entries in
   .gitmodules for <name> for which "submodule.<name>.path" matches
   the $path.

 - If --cached is given, remove it from the index if the version in
   the index match either HEAD or the $path/.git/HEAD, without
   touching the working tree.  This is consistent with what happens
   to a regular file.

 - If neither --force nor --cached is given, run an equivalent of
   (cd $path && git status), and also check if $path/.git/HEAD
   matches the index version.  Error out if the submodule directory
   is dirty (again I am not sure about this part).  If the submodule
   directory is clean, do the same as the case with --force.

> While this changes behavior of "git rm", it only fixes an error where it
> never worked properly.

It stops an error from being issued, but I am not convinced that the
new behaviour is necessarily a sensible one.  A change that stops an
error and performs a random operation is not necessarily a "fix".

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH 1/2] rm: don't fail when removing populated submodules
  2012-07-06  6:57   ` Junio C Hamano
@ 2012-07-07 12:51     ` Jens Lehmann
  2012-07-08  7:32       ` Junio C Hamano
  0 siblings, 1 reply; 19+ messages in thread
From: Jens Lehmann @ 2012-07-07 12:51 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Michał Górny, Phil Hord, Heiko Voigt

Am 06.07.2012 08:57, schrieb Junio C Hamano:
> Jens Lehmann <Jens.Lehmann@web.de> writes:
>> Also apply the same policy as for regular files and
>> require forcing when the submodules HEAD is different than what is
>> recorded in the index. 
> 
> I think the "policy" for regular files is that "git rm $path" errors
> out to avoid losing information in $path.  Even if the HEAD in the
> submodule points at the same commit as recorded in the index, if the
> submodule directory has other changes that (cd $path && git status)
> would report, we would not want to remove it, no?
>
> I am not sure if the difference between $path/.git/HEAD and :$path
> (the version in the index) matters.  Maybe it does, maybe it
> doesn't.

Doh. I don't know how I got the idea it would be so, but a quick
test with checkout and rebase showed they ignore if a submodules
HEAD is different from the commit recorded. So plain rm should do
the same as long as it doesn't touch the submodule work tree and
I'll remove checking the HEAD from patch 1. I'll prepare v2 which
will also include an updated commit message.

> One possible sane behaviour of "git rm $path" might be:
> 
>  - If --force is given, remove it from the index and from the
>    working tree (i.e. "rm -rf $path"), but use the "gitfile"
>    facility to save $path/.git away to $GIT_DIR/modules/$name; error
>    out if the submodule directory $path cannot be removed like this.
>    We would probably want to remove "submodule.<name>.*" entries in
>    .gitmodules for <name> for which "submodule.<name>.path" matches
>    the $path.
> 
>  - If --cached is given, remove it from the index if the version in
>    the index match either HEAD or the $path/.git/HEAD, without
>    touching the working tree.  This is consistent with what happens
>    to a regular file.
> 
>  - If neither --force nor --cached is given, run an equivalent of
>    (cd $path && git status), and also check if $path/.git/HEAD
>    matches the index version.  Error out if the submodule directory
>    is dirty (again I am not sure about this part).  If the submodule
>    directory is clean, do the same as the case with --force.

What you describe here is exactly how I think "git submodule rm" and
"git rm --recurse-submodules" should behave.

The questions remaining for me are:

* What should a "git rm --no-recurse-submodules" do?

  I think it should try to follow the policy git core commands use:

  - don't touch the submodule's work tree

  - remove the submodule directory if it is empty and warn if not
    (currently it dies if not, to change that to a warning is the
    subject of patch 1)

  The more difficult question is if it should remove the submodule
  entry from .gitmodules (patch 2). As that file is part of the
  superproject's work tree and core git already learned to read
  configuration options for status, diff and fetch from it I think
  it's a good idea to help the user by doing so (but maybe we should
  make this configurable and/or add an option to enable/disable it).
  But on the other hand maybe users expect this only to happen when
  they use "git submodule rm" and "git rm" should leave .gitmodules
  alone?

* What should the default behavior of "git rm" be.

  I tend to think that as all other core git commands never
  manipulate a submodule's work tree without the --recurse-submodules
  option, rm should do the same. So I think we should default to
  the --no-recure-submodules case described above to not surprise the
  user. That makes checking the submodule for modifications unnecessary
  until the --recure-submodules option is implemented.


Does that make sense?

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH 1/2] rm: don't fail when removing populated submodules
  2012-07-07 12:51     ` Jens Lehmann
@ 2012-07-08  7:32       ` Junio C Hamano
  2012-07-08 15:08         ` Jens Lehmann
  0 siblings, 1 reply; 19+ messages in thread
From: Junio C Hamano @ 2012-07-08  7:32 UTC (permalink / raw)
  To: Jens Lehmann; +Cc: git, Michał Górny, Phil Hord, Heiko Voigt

Jens Lehmann <Jens.Lehmann@web.de> writes:

>> One possible sane behaviour of "git rm $path" might be:
>> 
>>  - If --force is given, remove it from the index and from the
>>    working tree (i.e. "rm -rf $path"), but use the "gitfile"
>>    facility to save $path/.git away to $GIT_DIR/modules/$name; error
>>    out if the submodule directory $path cannot be removed like this.
>>    We would probably want to remove "submodule.<name>.*" entries in
>>    .gitmodules for <name> for which "submodule.<name>.path" matches
>>    the $path.
>> 
>>  - If --cached is given, remove it from the index if the version in
>>    the index match either HEAD or the $path/.git/HEAD, without
>>    touching the working tree.  This is consistent with what happens
>>    to a regular file.
>> 
>>  - If neither --force nor --cached is given, run an equivalent of
>>    (cd $path && git status), and also check if $path/.git/HEAD
>>    matches the index version.  Error out if the submodule directory
>>    is dirty (again I am not sure about this part).  If the submodule
>>    directory is clean, do the same as the case with --force.
>
> What you describe here is exactly how I think "git submodule rm" and
> "git rm --recurse-submodules" should behave.

If you have a directory A with a file B in it (i.e. A/B), "git rm A"
is refused and you have to say "git rm -r A".  So I can see why the
above description of the mine is wrong with respect to "-r" option
(all cases should fail if you did not give "-r" option).

But I do not think "git rm" needs "--recurse-submodules".  Wasn't
"--recurse-submodules" the option to control, when you tell Git to
do something to submodule "A", what should happen to submodules
contained in the submodule "A" (e.g. "A/B" that appears at path "B"
that itself is a separate project bound as a submodule to "A")?

I can see what fetching or updating "A" (e.g. "git submodule
update") while leaving "A/B" intact means, so there is a reason to
have two ways (with or without --recurse-submodules) to such a
command, but I do not see any sensible definition of what it means
to "remove" (whether it is "git submodule rm" or just plain "git
rm") "A" without affecting "A/B", especially with respect to the
working tree files.  If you remove "A" and its working tree files,
is there a sensible way to keep "A/B" and its working tree files?

I am OK if you choose to implement the behaviour described above
only in "git submodule rm A" and not plain "git rm -r A", but if you
are going that route, I do not see how it is an improvement for it
to remove the index entry for A from the index if your "git rm -r A"
does not remove working tree files for submodule A.  The user asked
to remove A with a command that would remove both index entry and
working tree file for a regular file (or a directory), the command
may decide it is not prudent to do so for whatever reason.  Perhaps
the entity being asked to remove has local changes the user may
regret losing.  Perhaps we decided that such an opration to cause a
large structural change should not be done with a plain "rm" but
should be done with "git submodule rm".  The reasons do not matter
much, but for the end result to be consistent, shouldn't the command
keep the index entry intact if it does not remove the working tree?

Unless the user used "git rm --cached" to explicitly ask for the
index entry to be removed without touching the working tree files,
that is.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH 1/2] rm: don't fail when removing populated submodules
  2012-07-08  7:32       ` Junio C Hamano
@ 2012-07-08 15:08         ` Jens Lehmann
  2012-07-09  2:17           ` Junio C Hamano
  0 siblings, 1 reply; 19+ messages in thread
From: Jens Lehmann @ 2012-07-08 15:08 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Michał Górny, Phil Hord, Heiko Voigt

Am 08.07.2012 09:32, schrieb Junio C Hamano:
> Jens Lehmann <Jens.Lehmann@web.de> writes:
> 
>>> One possible sane behaviour of "git rm $path" might be:
>>>
>>>  - If --force is given, remove it from the index and from the
>>>    working tree (i.e. "rm -rf $path"), but use the "gitfile"
>>>    facility to save $path/.git away to $GIT_DIR/modules/$name; error
>>>    out if the submodule directory $path cannot be removed like this.
>>>    We would probably want to remove "submodule.<name>.*" entries in
>>>    .gitmodules for <name> for which "submodule.<name>.path" matches
>>>    the $path.
>>>
>>>  - If --cached is given, remove it from the index if the version in
>>>    the index match either HEAD or the $path/.git/HEAD, without
>>>    touching the working tree.  This is consistent with what happens
>>>    to a regular file.
>>>
>>>  - If neither --force nor --cached is given, run an equivalent of
>>>    (cd $path && git status), and also check if $path/.git/HEAD
>>>    matches the index version.  Error out if the submodule directory
>>>    is dirty (again I am not sure about this part).  If the submodule
>>>    directory is clean, do the same as the case with --force.
>>
>> What you describe here is exactly how I think "git submodule rm" and
>> "git rm --recurse-submodules" should behave.
> 
> If you have a directory A with a file B in it (i.e. A/B), "git rm A"
> is refused and you have to say "git rm -r A".  So I can see why the
> above description of the mine is wrong with respect to "-r" option
> (all cases should fail if you did not give "-r" option).

I think that depends on how you see submodules in the context of
the superproject. If you see the submodule entry as an abstraction
describing the whole work tree below it with a single SHA1, "-r"
isn't necessary (but won't harm either). Only if you think that the
individual files of a submodule make sense in the superproject's
context, you'd need "-r". I think we always use the first approach:
you either get all files which are in a submodule's work tree, or
none. So I don't think we should require "-r" here, as a submodule
is a single object of the superproject. And for the same reason
"git rm A/B" issues an error: the file A/B doesn't exist in the
superproject.

> But I do not think "git rm" needs "--recurse-submodules".  Wasn't
> "--recurse-submodules" the option to control, when you tell Git to
> do something to submodule "A", what should happen to submodules
> contained in the submodule "A" (e.g. "A/B" that appears at path "B"
> that itself is a separate project bound as a submodule to "A")?

Nope. Only the "--recursive" option to the git submodule script
works like that (and almost everyone seems to use that option by
default anyway). But for all commands that understand the
"--recurse-submodule" option (currently these are clone, fetch,
merge, pull and push) that means "include submodules in what you
do and don't stop at the first level but recurse all the way down".
Without this option they won't even touch the first level of
submodules.

> I am OK if you choose to implement the behaviour described above
> only in "git submodule rm A" and not plain "git rm -r A", but if you
> are going that route, I do not see how it is an improvement for it
> to remove the index entry for A from the index if your "git rm -r A"
> does not remove working tree files for submodule A.  The user asked
> to remove A with a command that would remove both index entry and
> working tree file for a regular file (or a directory), the command
> may decide it is not prudent to do so for whatever reason.  Perhaps
> the entity being asked to remove has local changes the user may
> regret losing.  Perhaps we decided that such an opration to cause a
> large structural change should not be done with a plain "rm" but
> should be done with "git submodule rm".  The reasons do not matter
> much, but for the end result to be consistent, shouldn't the command
> keep the index entry intact if it does not remove the working tree?

All other core commands happily change the index without updating
the submodule's work tree. My first patch intended to make "git rm"
behave the same: Don't care if the submodule's work tree is still
there, but just record in the index what the user told you. Right
now it *always* throws an error no matter if the submodule is
modified or not, while a "rm" should either just work on unmodified
content or leave submodules alone. That's what I'm trying to fix.

To me checking out a commit before the submodule was added should
leave work tree and index in the same state with respect to the
submodule as when I do a "git rm sub". In the former case the
submodule directory is still there with all files but it won't be
present in the index (and .gitmodules) anymore. And checkout
doesn't care at all about any modifications in the submodule before
updating. "git rm" will behave just the same after my patch (of
course I'm talking about the upcoming v2 without checking the HEAD).

But I agree with you that rm should consider modifications of the
work tree of a submodule (unless "-f") before actually removing its
work tree. But I don't think we should teach "git rm" to do that
without an explicit "--recurse-submodules" for the following reasons:

* Change in behavior:
  Currently "git rm submod" always errors out when the submodule is
  populated. Would it remove an unmodified submodule from index and
  work tree and only error out if it would throw away local changes
  it will surprise users who expect the submodule work tree being
  left untouched by rm (and all other core commands).

* Consistency:
  rm would be the first core command to touch submodules work trees
  without being given the "--recurse-submodules" option (clone being
  the only command actually touching the work tree when that option
  is used by populating all submodules, all other just update refs
  and the object database in submodules). I'd rather have one config
  option teaching all work tree changing commands to always update
  submodules work trees later, when enough of them learned the
  "--recurse-submodules" option.

* Workflow issues:
  All current git core commands leave submodules untouched. So you
  might have to run a "git submodule update" after checking out a
  new branch in which you want to remove a submodule before "git rm"
  will remove it without "-f". No other core command demands that.

But if other people think these issues aren't as important as I
think they are, I'll happily teach "git rm" to do that by default.

The important change for me is the next step which teaches "git rm"
the "--recurse-submodules" option (no matter if enabled as default
or not). If this change here doesn't make sense to you I'll drop it
and start implementing the checking and removal of submodules. That
can then be used by "git submodule rm".

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH 1/2] rm: don't fail when removing populated submodules
  2012-07-08 15:08         ` Jens Lehmann
@ 2012-07-09  2:17           ` Junio C Hamano
  2012-07-09  5:02             ` Junio C Hamano
  2012-07-09 18:33             ` Jens Lehmann
  0 siblings, 2 replies; 19+ messages in thread
From: Junio C Hamano @ 2012-07-09  2:17 UTC (permalink / raw)
  To: Jens Lehmann; +Cc: git, Michał Górny, Phil Hord, Heiko Voigt

Jens Lehmann <Jens.Lehmann@web.de> writes:

>>> What you describe here is exactly how I think "git submodule rm" and
>>> "git rm --recurse-submodules" should behave.
>> 
>> If you have a directory A with a file B in it (i.e. A/B), "git rm A"
>> is refused and you have to say "git rm -r A".  So I can see why the
>> above description of the mine is wrong with respect to "-r" option
>> (all cases should fail if you did not give "-r" option).
>
> I think that depends on how you see submodules in the context of
> the superproject.

I am OK with "git rm path" removing the submodule working tree and
the index entry for path without -r (of course "git rm dir" would
not remove submodule "dir/path" in a plain directory "dir" and needs
to be spelled "git rm -r dir" or "git rm dir/path" but that is the
same if "path" were a regular file and a submodule is not special);
thinking about it again, I think it makes more sense, because a
submodule is just one single path entity in the context of the
superproject.

>> But I do not think "git rm" needs "--recurse-submodules".  Wasn't
>> "--recurse-submodules" the option to control, when you tell Git to
>> do something to submodule "A", what should happen to submodules
>> contained in the submodule "A" (e.g. "A/B" that appears at path "B"
>> that itself is a separate project bound as a submodule to "A")?
>
> Nope. Only the "--recursive" option to the git submodule script
> works like that (and almost everyone seems to use that option by
> default anyway). But for all commands that understand the
> "--recurse-submodule" option (currently these are clone, fetch,
> merge, pull and push) that means "include submodules in what you
> do and don't stop at the first level but recurse all the way down".
> Without this option they won't even touch the first level of
> submodules.

OK, but what does "rm --no-recurse-submodules path" could possibly
mean in that case?  If you remove "path" by definition anything
underneath "path" cannot be remain, so in the context of "rm", once
you decide to remove submodule at "path", not recursing is an option.

So I still think "--recurse-submodules" does not make any sense to
the "rm" command.  I would understand a "Do not attempt to remove
submodules and ignore their existence altogether" option, even
though I do not think it is useful.

> All other core commands happily change the index without updating
> the submodule's work tree.

What are "all other core commands"?  "fetch" by definition does not
touch working tree inside or outside working tree.  "add" is about
recording the state from the working tree to the index, and
following the earlier point you raised (I unfortunately culled from
the quote), as the rest of core Git considers a submodule a single
path entity in the context of the superproject, "the state from the
working tree" for the submodule is where its HEAD points at, so it
is correct not to look at the working tree.

Without going outside the context of "rm", I think the current
behaviour of "git rm path" for submodule is merely punting---erroring
out against a request for an unimplemented feature.

If you ask an unpopulated submodule to be removed, we could simply
rmdir() it and remove the entry from the index, but that is far
insufficient for handling a submodule repository that is already
"init"ed.  And we did not want to plug the "check if removal will
lose any state from the submodule repository" logic because the
information is no use for us for a long time until we added the
"gitfile" support to allow us to relocate path/.git for the
submodule safely away when we remove the working tree of it.

But now we do have gitfile, so we could remove submodule working
tree.  I think not erroring out and removing only the index entry is
a irresponsible thing to do.  It would mean that we pretend as if a
feature that was earlier unimplemented (hence errored out) is now
supported, but it does not do what the user asked us to do, no?

I do not know what other commands you have in mind, but some of them
may be similar "recursing down and performing an operation that
potentially can fail is too complex, and we are not sure if we have
enough sequencing infrastructure to guide the user to sort out the
mess in the half-result, so let's punt and not to that part and have
the user sort out" half-features, and if that is the case, I do not
think it is prudent to model "rm", which is something that we now
could implement properly, with the necessary infrastructure already
in place, after them.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH 1/2] rm: don't fail when removing populated submodules
  2012-07-09  2:17           ` Junio C Hamano
@ 2012-07-09  5:02             ` Junio C Hamano
  2012-07-09 18:33             ` Jens Lehmann
  1 sibling, 0 replies; 19+ messages in thread
From: Junio C Hamano @ 2012-07-09  5:02 UTC (permalink / raw)
  To: Jens Lehmann; +Cc: git, Michał Górny, Phil Hord, Heiko Voigt

Junio C Hamano <gitster@pobox.com> writes:

>> Nope. Only the "--recursive" option to the git submodule script
>> works like that (and almost everyone seems to use that option by
>> default anyway). But for all commands that understand the
>> "--recurse-submodule" option (currently these are clone, fetch,
>> merge, pull and push) that means "include submodules in what you
>> do and don't stop at the first level but recurse all the way down".
>> Without this option they won't even touch the first level of
>> submodules.
>
> OK, but what does "rm --no-recurse-submodules path" could possibly
> mean in that case?  If you remove "path" by definition anything
> underneath "path" cannot be remain, so in the context of "rm", once
> you decide to remove submodule at "path", not recursing is an option.

Yikes, I hate myself after making silly typoes.  Of course the above
needs s/cannot .. remain/cannot remain/; and more importantly, not
recursing is _not_ an option once you decide to remove it.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH 1/2] rm: don't fail when removing populated submodules
  2012-07-09  2:17           ` Junio C Hamano
  2012-07-09  5:02             ` Junio C Hamano
@ 2012-07-09 18:33             ` Jens Lehmann
  2012-07-09 19:38               ` Junio C Hamano
  1 sibling, 1 reply; 19+ messages in thread
From: Jens Lehmann @ 2012-07-09 18:33 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Michał Górny, Phil Hord, Heiko Voigt

Am 09.07.2012 04:17, schrieb Junio C Hamano:
> Jens Lehmann <Jens.Lehmann@web.de> writes:
> So I still think "--recurse-submodules" does not make any sense to
> the "rm" command.  I would understand a "Do not attempt to remove
> submodules and ignore their existence altogether" option, even
> though I do not think it is useful.

Yes, when "rm" removes populated submodule work trees by default
then there is no need for a "--recurse-submodules" option. And
then we don't need a "--no-recurse-submodules" either as that use
case is already covered by the "--cached" option, right?

>> All other core commands happily change the index without updating
>> the submodule's work tree.
> 
> What are "all other core commands"?  "fetch" by definition does not
> touch working tree inside or outside working tree.  "add" is about
> recording the state from the working tree to the index, and
> following the earlier point you raised (I unfortunately culled from
> the quote), as the rest of core Git considers a submodule a single
> path entity in the context of the superproject, "the state from the
> working tree" for the submodule is where its HEAD points at, so it
> is correct not to look at the working tree.

I wanted to say "all other /work tree updating/ core commands"
here (checkout, merge, reset ...). Sorry for the confusion.

> Without going outside the context of "rm", I think the current
> behaviour of "git rm path" for submodule is merely punting---erroring
> out against a request for an unimplemented feature.
> 
> If you ask an unpopulated submodule to be removed, we could simply
> rmdir() it and remove the entry from the index, but that is far
> insufficient for handling a submodule repository that is already
> "init"ed.  And we did not want to plug the "check if removal will
> lose any state from the submodule repository" logic because the
> information is no use for us for a long time until we added the
> "gitfile" support to allow us to relocate path/.git for the
> submodule safely away when we remove the working tree of it.
> 
> But now we do have gitfile, so we could remove submodule working
> tree.  I think not erroring out and removing only the index entry is
> a irresponsible thing to do.  It would mean that we pretend as if a
> feature that was earlier unimplemented (hence errored out) is now
> supported, but it does not do what the user asked us to do, no?
> 
> I do not know what other commands you have in mind, but some of them
> may be similar "recursing down and performing an operation that
> potentially can fail is too complex, and we are not sure if we have
> enough sequencing infrastructure to guide the user to sort out the
> mess in the half-result, so let's punt and not to that part and have
> the user sort out" half-features, and if that is the case, I do not
> think it is prudent to model "rm", which is something that we now
> could implement properly, with the necessary infrastructure already
> in place, after them.

Cool, so let's drop this patch and I'll teach "rm" to handle
populated submodules according to what we do for regular files:
Make sure there are no modifications which could get lost (unless
"-f") and remove all tracked files and the gitfile from the work
tree (unless "--cached") before removing the submodule from the
index. If the submodule uses the old layout with the .git
directory instead of a gitfile we error out just like we do today.

And we didn't talk about untracked or ignored files which may live
inside the submodules work tree so far, but according to what a
"rm -r" does in the superproject they should still be around after
using "rm" on a populated submodule, right?

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH 1/2] rm: don't fail when removing populated submodules
  2012-07-09 18:33             ` Jens Lehmann
@ 2012-07-09 19:38               ` Junio C Hamano
  2012-07-09 20:23                 ` Jens Lehmann
  0 siblings, 1 reply; 19+ messages in thread
From: Junio C Hamano @ 2012-07-09 19:38 UTC (permalink / raw)
  To: Jens Lehmann; +Cc: git, Michał Górny, Phil Hord, Heiko Voigt

Jens Lehmann <Jens.Lehmann@web.de> writes:

> Cool, so let's drop this patch and I'll teach "rm" to handle
> populated submodules according to what we do for regular files:
> Make sure there are no modifications which could get lost (unless
> "-f") and remove all tracked files and the gitfile from the work
> tree (unless "--cached") before removing the submodule from the
> index. If the submodule uses the old layout with the .git
> directory instead of a gitfile we error out just like we do today.

Alternatively we could "mv" .git directory out of the way and the
next "git checkout" of a branch that still has the submodule can
make a gitfile to point there, no?

Or we can just error out and tell the user to do so herself.

> And we didn't talk about untracked or ignored files which may live
> inside the submodules work tree so far, but according to what a
> "rm -r" does in the superproject they should still be around after
> using "rm" on a populated submodule, right?

Until we add the "precious" class, untracked and ignored files are
expendable, so if a submodule working tree has no modification and
only has leftover *.o files, they can be nuked as part of submodule
removal, but if it has an untracked and unignored *.c file for
example, the "rm" operation without "-f" should be stopped, no?

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH 1/2] rm: don't fail when removing populated submodules
  2012-07-09 19:38               ` Junio C Hamano
@ 2012-07-09 20:23                 ` Jens Lehmann
  2012-08-16 21:56                   ` Junio C Hamano
  0 siblings, 1 reply; 19+ messages in thread
From: Jens Lehmann @ 2012-07-09 20:23 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Michał Górny, Phil Hord, Heiko Voigt

Am 09.07.2012 21:38, schrieb Junio C Hamano:
> Jens Lehmann <Jens.Lehmann@web.de> writes:
> 
>> Cool, so let's drop this patch and I'll teach "rm" to handle
>> populated submodules according to what we do for regular files:
>> Make sure there are no modifications which could get lost (unless
>> "-f") and remove all tracked files and the gitfile from the work
>> tree (unless "--cached") before removing the submodule from the
>> index. If the submodule uses the old layout with the .git
>> directory instead of a gitfile we error out just like we do today.
> 
> Alternatively we could "mv" .git directory out of the way and the
> next "git checkout" of a branch that still has the submodule can
> make a gitfile to point there, no?

Yup. That would mean a smooth transition for legacy .git-dir
submodules into the new gitfile world.

>> And we didn't talk about untracked or ignored files which may live
>> inside the submodules work tree so far, but according to what a
>> "rm -r" does in the superproject they should still be around after
>> using "rm" on a populated submodule, right?
> 
> Until we add the "precious" class, untracked and ignored files are
> expendable, so if a submodule working tree has no modification and
> only has leftover *.o files, they can be nuked as part of submodule
> removal, but if it has an untracked and unignored *.c file for
> example, the "rm" operation without "-f" should be stopped, no?

Ok, untracked files mark the submodule modified while ignored files
which are not tracked won't.

Thanks for this discussion, I'll start hacking on that.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH 1/2] rm: don't fail when removing populated submodules
  2012-07-09 20:23                 ` Jens Lehmann
@ 2012-08-16 21:56                   ` Junio C Hamano
  2012-08-17 16:44                     ` Jens Lehmann
  0 siblings, 1 reply; 19+ messages in thread
From: Junio C Hamano @ 2012-08-16 21:56 UTC (permalink / raw)
  To: Jens Lehmann; +Cc: git, Michał Górny, Phil Hord, Heiko Voigt

Jens Lehmann <Jens.Lehmann@web.de> writes:

> Am 09.07.2012 21:38, schrieb Junio C Hamano:
>> Jens Lehmann <Jens.Lehmann@web.de> writes:
>> 
>>> Cool, so let's drop this patch and I'll teach "rm" to handle
>>> populated submodules according to what we do for regular files:
>>> Make sure there are no modifications which could get lost (unless
>>> "-f") and remove all tracked files and the gitfile from the work
>>> tree (unless "--cached") before removing the submodule from the
>>> index. If the submodule uses the old layout with the .git
>>> directory instead of a gitfile we error out just like we do today.
>> 
>> Alternatively we could "mv" .git directory out of the way and the
>> next "git checkout" of a branch that still has the submodule can
>> make a gitfile to point there, no?
>
> Yup. That would mean a smooth transition for legacy .git-dir
> submodules into the new gitfile world.
>
>>> And we didn't talk about untracked or ignored files which may live
>>> inside the submodules work tree so far, but according to what a
>>> "rm -r" does in the superproject they should still be around after
>>> using "rm" on a populated submodule, right?
>> 
>> Until we add the "precious" class, untracked and ignored files are
>> expendable, so if a submodule working tree has no modification and
>> only has leftover *.o files, they can be nuked as part of submodule
>> removal, but if it has an untracked and unignored *.c file for
>> example, the "rm" operation without "-f" should be stopped, no?
>
> Ok, untracked files mark the submodule modified while ignored files
> which are not tracked won't.
>
> Thanks for this discussion, I'll start hacking on that.

A mild ping on seemingly stalled topic.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH 1/2] rm: don't fail when removing populated submodules
  2012-08-16 21:56                   ` Junio C Hamano
@ 2012-08-17 16:44                     ` Jens Lehmann
  2012-08-17 18:11                       ` Phil Hord
  0 siblings, 1 reply; 19+ messages in thread
From: Jens Lehmann @ 2012-08-17 16:44 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Michał Górny, Phil Hord, Heiko Voigt

Am 16.08.2012 23:56, schrieb Junio C Hamano:
> Jens Lehmann <Jens.Lehmann@web.de> writes:
> 
>> Am 09.07.2012 21:38, schrieb Junio C Hamano:
>>> Jens Lehmann <Jens.Lehmann@web.de> writes:
>>>
>>>> Cool, so let's drop this patch and I'll teach "rm" to handle
>>>> populated submodules according to what we do for regular files:
>>>> Make sure there are no modifications which could get lost (unless
>>>> "-f") and remove all tracked files and the gitfile from the work
>>>> tree (unless "--cached") before removing the submodule from the
>>>> index. If the submodule uses the old layout with the .git
>>>> directory instead of a gitfile we error out just like we do today.
>>>
>>> Alternatively we could "mv" .git directory out of the way and the
>>> next "git checkout" of a branch that still has the submodule can
>>> make a gitfile to point there, no?
>>
>> Yup. That would mean a smooth transition for legacy .git-dir
>> submodules into the new gitfile world.
>>
>>>> And we didn't talk about untracked or ignored files which may live
>>>> inside the submodules work tree so far, but according to what a
>>>> "rm -r" does in the superproject they should still be around after
>>>> using "rm" on a populated submodule, right?
>>>
>>> Until we add the "precious" class, untracked and ignored files are
>>> expendable, so if a submodule working tree has no modification and
>>> only has leftover *.o files, they can be nuked as part of submodule
>>> removal, but if it has an untracked and unignored *.c file for
>>> example, the "rm" operation without "-f" should be stopped, no?
>>
>> Ok, untracked files mark the submodule modified while ignored files
>> which are not tracked won't.
>>
>> Thanks for this discussion, I'll start hacking on that.
> 
> A mild ping on seemingly stalled topic.

I'm almost there. The only thing left is to check if a nested
submodule is using a git directory. In that case I expect "rm" to
fail even when -f is used to protect the submodule's history. I
still need to find a suitable command for recursing the submodules
and doing that check.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH 1/2] rm: don't fail when removing populated submodules
  2012-08-17 16:44                     ` Jens Lehmann
@ 2012-08-17 18:11                       ` Phil Hord
  2012-08-19 19:38                         ` Jens Lehmann
  0 siblings, 1 reply; 19+ messages in thread
From: Phil Hord @ 2012-08-17 18:11 UTC (permalink / raw)
  To: Jens Lehmann; +Cc: Junio C Hamano, git, Michał Górny, Heiko Voigt

On Fri, Aug 17, 2012 at 12:44 PM, Jens Lehmann <Jens.Lehmann@web.de> wrote:
>
> I'm almost there. The only thing left is to check if a nested
> submodule is using a git directory. In that case I expect "rm" to
> fail even when -f is used to protect the submodule's history. I
> still need to find a suitable command for recursing the submodules
> and doing that check.

I suppose the style of this is wrong, but this seems to work for me.

git submodule foreach --recursive '! test -f .git'

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [RFC PATCH 1/2] rm: don't fail when removing populated submodules
  2012-08-17 18:11                       ` Phil Hord
@ 2012-08-19 19:38                         ` Jens Lehmann
  0 siblings, 0 replies; 19+ messages in thread
From: Jens Lehmann @ 2012-08-19 19:38 UTC (permalink / raw)
  To: Phil Hord; +Cc: Junio C Hamano, git, Michał Górny, Heiko Voigt

Am 17.08.2012 20:11, schrieb Phil Hord:
> On Fri, Aug 17, 2012 at 12:44 PM, Jens Lehmann <Jens.Lehmann@web.de> wrote:
>>
>> I'm almost there. The only thing left is to check if a nested
>> submodule is using a git directory. In that case I expect "rm" to
>> fail even when -f is used to protect the submodule's history. I
>> still need to find a suitable command for recursing the submodules
>> and doing that check.
> 
> I suppose the style of this is wrong, but this seems to work for me.
> 
> git submodule foreach --recursive '! test -f .git'

Thanks! I was looking for something less expensive, but given that
I don't expect removing submodules to be a performance critical
operation this test should just work fine.

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2012-08-19 19:39 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-07-04 20:43 [RFC PATCH 0/2] Teach rm to better handle submodules Jens Lehmann
2012-07-04 20:44 ` [RFC PATCH 1/2] rm: don't fail when removing populated submodules Jens Lehmann
2012-07-06  6:57   ` Junio C Hamano
2012-07-07 12:51     ` Jens Lehmann
2012-07-08  7:32       ` Junio C Hamano
2012-07-08 15:08         ` Jens Lehmann
2012-07-09  2:17           ` Junio C Hamano
2012-07-09  5:02             ` Junio C Hamano
2012-07-09 18:33             ` Jens Lehmann
2012-07-09 19:38               ` Junio C Hamano
2012-07-09 20:23                 ` Jens Lehmann
2012-08-16 21:56                   ` Junio C Hamano
2012-08-17 16:44                     ` Jens Lehmann
2012-08-17 18:11                       ` Phil Hord
2012-08-19 19:38                         ` Jens Lehmann
2012-07-04 20:44 ` [RFC PATCH 2/2] rm: remove submodules from the index and the .gitmodules file Jens Lehmann
2012-07-05  0:44 ` [RFC PATCH 0/2] Teach rm to better handle submodules Junio C Hamano
2012-07-05 19:06   ` Jens Lehmann
2012-07-05 22:10     ` Junio C Hamano

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.