All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC/PATCH] Disallow commands from within unpopulated submodules.
@ 2017-01-19 19:30 Stefan Beller
  2017-01-20 19:17 ` Jeff King
  0 siblings, 1 reply; 13+ messages in thread
From: Stefan Beller @ 2017-01-19 19:30 UTC (permalink / raw)
  To: git; +Cc: Stefan Beller

Consider you have a submodule at path "sub". What should happen in case
you run a command such as "git -C sub add ." ?

Here is what currently happens:
1) The submodule is populated, i.e. there is a .git (file/dir) inside
   "sub". This is equivalent of running "git add ." in the submodule and
   it behaves as you would expect, adding all files to the index.
2) The submodule is not populated or even not initialized.
   For quite some time we got
   $ git -C sub add .
   git: pathspec.c:317: prefix_pathspec: Assertion `item->nowildcard_len <= item->len && item->prefix <= item->len' failed.
   Aborted (core dumped)
   (This is fixed by another patch in flight to not assert,
    but rather die with a better message instead; but that patch is
    merely a fix of a corner case in the pathspec code.)

While 1) is rather uncontroversial, there are multiple things the user
may have intended with this command in 2):
 * add the submodule to the superproject
 * add all files inside the sub/ directory to the submodule or
   superproject.
It is unclear what the user intended, so rather error out instead.

Now let's ask the same question for "git -C sub status ." (which is a
command that is only reading and not writing to the repository)

1) If the submodule is populated, the user clearly intended to know
   more about the submodules status
2) It is unclear if the user wanted to learn about the submodules state
   (So ideally: "The submodule 'sub' is not initialized. To init ...")
   or the status check should be applied to the superproject instead.

Avoid the confusion in 2) as well and just error out for now. Later on
we may want to add another flag to git.c to allow commands to be run
inside unpopulated submodules and each command reacts appropriately.

Signed-off-by: Stefan Beller <sbeller@google.com>
---

This is the next logical step after sb/pathspec-errors (pathspec:
give better message for submodule related pathspec error). If you are in
a path that is clearly a submodules, I would expect that most users would
expect the git operation to apply to the submodule.  In case of unpopulated
submodules, this is not the case though, but we apply the operation to the
superproject, which may be wrong or confusing.  Hence just error out for now.
Later we may want to add a flag that allows specific commands to run in such
a setup (e.g. git status could give a fancier message than a die(..)).

I marked this as RFC
* to request for comments if this is a good idea from a UI-perspective
* because I did not adapt any test for this patch. (A lot of submodule tests
  seem to break with this; From a cursory read of those tests, I'd rather
  blame the tests for being sloppy than this patch damaging user expectations)

Thanks,
Stefan

 git.c       |  3 +++
 submodule.c | 36 ++++++++++++++++++++++++++++++++++++
 submodule.h |  1 +
 3 files changed, 40 insertions(+)

diff --git a/git.c b/git.c
index bbaa949e9c..ca53b61474 100644
--- a/git.c
+++ b/git.c
@@ -2,6 +2,7 @@
 #include "exec_cmd.h"
 #include "help.h"
 #include "run-command.h"
+#include "submodule.h"
 
 const char git_usage_string[] =
 	"git [--version] [--help] [-C <path>] [-c name=value]\n"
@@ -364,6 +365,8 @@ static int run_builtin(struct cmd_struct *p, int argc, const char **argv)
 		if (prefix)
 			die("can't use --super-prefix from a subdirectory");
 	}
+	if (prefix)
+		check_prefix_inside_submodule(prefix);
 
 	if (!help && p->option & NEED_WORK_TREE)
 		setup_work_tree();
diff --git a/submodule.c b/submodule.c
index 73521cdbb2..357a22b804 100644
--- a/submodule.c
+++ b/submodule.c
@@ -495,6 +495,42 @@ void set_config_fetch_recurse_submodules(int value)
 	config_fetch_recurse_submodules = value;
 }
 
+/* check if the given prefix is inside an uninitialized submodule */
+void check_prefix_inside_submodule(const char *prefix)
+{
+	const char *work_tree = get_git_work_tree();
+	if (work_tree) {
+		int pos;
+		const struct cache_entry *in_submodule = NULL;
+
+		if (read_cache() < 0)
+			die("index file corrupt");
+		pos = cache_name_pos(prefix, strlen(prefix));
+		/*
+		 * gitlinks are recored with no ending '/' in the index,
+		 * but the prefix has an ending '/', so we will never find
+		 * an exact match, but always the position where we'd
+		 * insert the prefix.
+		 */
+		if (pos < 0) {
+			const struct cache_entry *ce;
+			int len = strlen(prefix);
+			/* Check the previous position */
+			pos = (-1 - pos) - 1;
+			ce = active_cache[pos];
+			if (ce->ce_namelen < len)
+				len = ce->ce_namelen;
+			if (!memcmp(ce->name, prefix, len))
+				in_submodule = ce;
+		} else
+			/* This case cannot happen because */
+			die("BUG: prefixes end with '/', but we do not record ending slashes in the index");
+
+		if (in_submodule)
+			die(_("command from inside unpopulated submodule '%s' not supported."), in_submodule->name);
+	}
+}
+
 static int has_remote(const char *refname, const struct object_id *oid,
 		      int flags, void *cb_data)
 {
diff --git a/submodule.h b/submodule.h
index b7576d6f43..6b30542822 100644
--- a/submodule.h
+++ b/submodule.h
@@ -30,6 +30,7 @@ struct submodule_update_strategy {
 };
 #define SUBMODULE_UPDATE_STRATEGY_INIT {SM_UPDATE_UNSPECIFIED, NULL}
 
+extern void check_prefix_inside_submodule(const char *prefix);
 int is_staging_gitmodules_ok(void);
 int update_path_in_gitmodules(const char *oldpath, const char *newpath);
 int remove_path_from_gitmodules(const char *path);
-- 
2.11.0.299.g762782ba8a


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [RFC/PATCH] Disallow commands from within unpopulated submodules.
  2017-01-19 19:30 [RFC/PATCH] Disallow commands from within unpopulated submodules Stefan Beller
@ 2017-01-20 19:17 ` Jeff King
  2017-01-20 19:33   ` Stefan Beller
  2017-01-21 12:55   ` Duy Nguyen
  0 siblings, 2 replies; 13+ messages in thread
From: Jeff King @ 2017-01-20 19:17 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git

On Thu, Jan 19, 2017 at 11:30:23AM -0800, Stefan Beller wrote:

> Now let's ask the same question for "git -C sub status ." (which is a
> command that is only reading and not writing to the repository)
> 
> 1) If the submodule is populated, the user clearly intended to know
>    more about the submodules status
> 2) It is unclear if the user wanted to learn about the submodules state
>    (So ideally: "The submodule 'sub' is not initialized. To init ...")
>    or the status check should be applied to the superproject instead.
> 
> Avoid the confusion in 2) as well and just error out for now. Later on
> we may want to add another flag to git.c to allow commands to be run
> inside unpopulated submodules and each command reacts appropriately.

I like the general idea of catching commands in unpopulated submodules,
but I'm somewhat uncomfortable with putting an unconditional check into
git.c, for two reasons:

  1. Reading the index can be expensive. You would not want "git
     rev-parse" to incur this cost.

  2. How does this interact with commands which do interact with the
     index? Don't they expect to find the_index unpopulated?

     (I notice that it's effectively tied to RUN_SETUP, which is good.
      But that also means that many commands, like "diff", won't get the
      benefit. Not to mention non-builtins).

I'd rather see it in the commands themselves. Especially given the
"ideal" in your status example, which requires command-specific
knowledge.

-Peff

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC/PATCH] Disallow commands from within unpopulated submodules.
  2017-01-20 19:17 ` Jeff King
@ 2017-01-20 19:33   ` Stefan Beller
  2017-01-20 19:42     ` Jeff King
  2017-01-21 12:55   ` Duy Nguyen
  1 sibling, 1 reply; 13+ messages in thread
From: Stefan Beller @ 2017-01-20 19:33 UTC (permalink / raw)
  To: Jeff King; +Cc: git

On Fri, Jan 20, 2017 at 11:17 AM, Jeff King <peff@peff.net> wrote:
> On Thu, Jan 19, 2017 at 11:30:23AM -0800, Stefan Beller wrote:
>
>> Now let's ask the same question for "git -C sub status ." (which is a
>> command that is only reading and not writing to the repository)
>>
>> 1) If the submodule is populated, the user clearly intended to know
>>    more about the submodules status
>> 2) It is unclear if the user wanted to learn about the submodules state
>>    (So ideally: "The submodule 'sub' is not initialized. To init ...")
>>    or the status check should be applied to the superproject instead.
>>
>> Avoid the confusion in 2) as well and just error out for now. Later on
>> we may want to add another flag to git.c to allow commands to be run
>> inside unpopulated submodules and each command reacts appropriately.
>
> I like the general idea of catching commands in unpopulated submodules,
> but I'm somewhat uncomfortable with putting an unconditional check into
> git.c, for two reasons:
>
>   1. Reading the index can be expensive. You would not want "git
>      rev-parse" to incur this cost.

Well, I would want rev-parse to not be run in the wrong repo.
(intended to rev-parse something in the submodule, but got results for
the superproject).

Talking about rev-parse, I was about to propose an extension in reply to
"[PATCH] git-prompt.sh: add submodule indicator" that rev-parse could
learn a flag similar to --show-toplevel, named:
    --show-superproject-if-any or
    --indicate-if-in-submodule-possibly
which would help out there.

>
>   2. How does this interact with commands which do interact with the
>      index? Don't they expect to find the_index unpopulated?

That is another sloppiness in this RFC patch, as I haven't nailed down
the corner cases yet.

>
>      (I notice that it's effectively tied to RUN_SETUP, which is good.
>       But that also means that many commands, like "diff", won't get the
>       benefit. Not to mention non-builtins).
>
> I'd rather see it in the commands themselves. Especially given the
> "ideal" in your status example, which requires command-specific
> knowledge.

So you rather want to go bottom up, i.e. add it to each command individually
for which it makes sense, instead of rather first having a catch-it-all like
this and then we can have a flag similar to RUN_SETUP, e.g.
ALLOW_IN_UNPOP_SUBMODULE, which allows commands to
take over the responsibility to act responsibly in this case?

status may be the first command for going that route; I wonder if we'd
want to add this feature unconditionally or only in the porcelain case.
(In plumbing you're supposed to know what you're doing... so there is
no need as well as our promise to not change it)

Thanks,
Stefan

>
> -Peff

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC/PATCH] Disallow commands from within unpopulated submodules.
  2017-01-20 19:33   ` Stefan Beller
@ 2017-01-20 19:42     ` Jeff King
  2017-01-20 19:53       ` Stefan Beller
  0 siblings, 1 reply; 13+ messages in thread
From: Jeff King @ 2017-01-20 19:42 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git

On Fri, Jan 20, 2017 at 11:33:45AM -0800, Stefan Beller wrote:

> > I'd rather see it in the commands themselves. Especially given the
> > "ideal" in your status example, which requires command-specific
> > knowledge.
> 
> So you rather want to go bottom up, i.e. add it to each command individually
> for which it makes sense, instead of rather first having a catch-it-all like
> this and then we can have a flag similar to RUN_SETUP, e.g.
> ALLOW_IN_UNPOP_SUBMODULE, which allows commands to
> take over the responsibility to act responsibly in this case?

Yes. I know it's "less safe" in the sense that commands have to make an
effort to detect the situation, but I feel like only they'll know what
the sensible behavior is. And they can also do the check at a time when
they would be reading the index anyway.

> status may be the first command for going that route; I wonder if we'd
> want to add this feature unconditionally or only in the porcelain case.
> (In plumbing you're supposed to know what you're doing... so there is
> no need as well as our promise to not change it)

Yeah. The reason that it would be so painful to load the index
for every rev-parse is not just that it probably doesn't otherwise need
the index, but that scripts may make a _ton_ of rev-parse (or other
plumbing) calls.

One alternative would be to make the check cheaper. Could we reliably
tell from the submodule.foo.* block in the config that path "foo" is a
submodule? I think that would work after "submodule init" but not right
after "git clone". So the index really is the source of truth there.

I guess there could be an index extension "these are the gitlinks I
contain" and in theory we could read just that extension. I dunno.

-Peff

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC/PATCH] Disallow commands from within unpopulated submodules.
  2017-01-20 19:42     ` Jeff King
@ 2017-01-20 19:53       ` Stefan Beller
  2017-01-20 20:00         ` Jeff King
  0 siblings, 1 reply; 13+ messages in thread
From: Stefan Beller @ 2017-01-20 19:53 UTC (permalink / raw)
  To: Jeff King; +Cc: git

On Fri, Jan 20, 2017 at 11:42 AM, Jeff King <peff@peff.net> wrote:
> On Fri, Jan 20, 2017 at 11:33:45AM -0800, Stefan Beller wrote:
>
>> > I'd rather see it in the commands themselves. Especially given the
>> > "ideal" in your status example, which requires command-specific
>> > knowledge.
>>
>> So you rather want to go bottom up, i.e. add it to each command individually
>> for which it makes sense, instead of rather first having a catch-it-all like
>> this and then we can have a flag similar to RUN_SETUP, e.g.
>> ALLOW_IN_UNPOP_SUBMODULE, which allows commands to
>> take over the responsibility to act responsibly in this case?
>
> Yes. I know it's "less safe" in the sense that commands have to make an
> effort to detect the situation, but I feel like only they'll know what
> the sensible behavior is. And they can also do the check at a time when
> they would be reading the index anyway.
>
>> status may be the first command for going that route; I wonder if we'd
>> want to add this feature unconditionally or only in the porcelain case.
>> (In plumbing you're supposed to know what you're doing... so there is
>> no need as well as our promise to not change it)
>
> Yeah. The reason that it would be so painful to load the index
> for every rev-parse is not just that it probably doesn't otherwise need
> the index, but that scripts may make a _ton_ of rev-parse (or other
> plumbing) calls.
>
> One alternative would be to make the check cheaper. Could we reliably
> tell from the submodule.foo.* block in the config that path "foo" is a
> submodule? I think that would work after "submodule init" but not right
> after "git clone". So the index really is the source of truth there.

Well we can check if there is a .gitmodules file that has a
submodule.*.path equal to the last part of $CWD, no need to look
at the git config.

And that would also work right after git clone (in an
unpopulated/uninitialized submodule as I call it).

And in my current understanding of submodules the check in
.gitmodules ought to be enough, too.

>
> I guess there could be an index extension "these are the gitlinks I
> contain" and in theory we could read just that extension. I dunno.
>
> -Peff

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC/PATCH] Disallow commands from within unpopulated submodules.
  2017-01-20 19:53       ` Stefan Beller
@ 2017-01-20 20:00         ` Jeff King
  2017-01-20 20:07           ` Stefan Beller
  2017-01-20 21:41           ` Junio C Hamano
  0 siblings, 2 replies; 13+ messages in thread
From: Jeff King @ 2017-01-20 20:00 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git

On Fri, Jan 20, 2017 at 11:53:01AM -0800, Stefan Beller wrote:

> > One alternative would be to make the check cheaper. Could we reliably
> > tell from the submodule.foo.* block in the config that path "foo" is a
> > submodule? I think that would work after "submodule init" but not right
> > after "git clone". So the index really is the source of truth there.
> 
> Well we can check if there is a .gitmodules file that has a
> submodule.*.path equal to the last part of $CWD, no need to look
> at the git config.
> 
> And that would also work right after git clone (in an
> unpopulated/uninitialized submodule as I call it).
> 
> And in my current understanding of submodules the check in
> .gitmodules ought to be enough, too.

Yeah, that probably makes sense. You can have a gitlink without a
.gitmodules file, but I don't quite know what that would mean in terms
of submodules (I guess it's not a submodule but "something else").

-Peff

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC/PATCH] Disallow commands from within unpopulated submodules.
  2017-01-20 20:00         ` Jeff King
@ 2017-01-20 20:07           ` Stefan Beller
  2017-01-20 21:41           ` Junio C Hamano
  1 sibling, 0 replies; 13+ messages in thread
From: Stefan Beller @ 2017-01-20 20:07 UTC (permalink / raw)
  To: Jeff King; +Cc: git

On Fri, Jan 20, 2017 at 12:00 PM, Jeff King <peff@peff.net> wrote:
> On Fri, Jan 20, 2017 at 11:53:01AM -0800, Stefan Beller wrote:
>
>> > One alternative would be to make the check cheaper. Could we reliably
>> > tell from the submodule.foo.* block in the config that path "foo" is a
>> > submodule? I think that would work after "submodule init" but not right
>> > after "git clone". So the index really is the source of truth there.
>>
>> Well we can check if there is a .gitmodules file that has a
>> submodule.*.path equal to the last part of $CWD, no need to look
>> at the git config.
>>
>> And that would also work right after git clone (in an
>> unpopulated/uninitialized submodule as I call it).
>>
>> And in my current understanding of submodules the check in
>> .gitmodules ought to be enough, too.
>
> Yeah, that probably makes sense. You can have a gitlink without a
> .gitmodules file, but I don't quite know what that would mean in terms
> of submodules (I guess it's not a submodule but "something else").

yeah, I agree it could be git series[1] at work, or as you said
"something else", and we have no idea what to do.

I think this could actually be implemented top-down, because the
check is cheap as we're beginning with lstat(.gitmodules), and no further
pursue checking this corner case in case the .gitmodules is not found.

I'll see if I can make a patch that passes the test suite.

[1] https://github.com/git-series/git-series/blob/master/INTERNALS.md

>
> -Peff

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC/PATCH] Disallow commands from within unpopulated submodules.
  2017-01-20 20:00         ` Jeff King
  2017-01-20 20:07           ` Stefan Beller
@ 2017-01-20 21:41           ` Junio C Hamano
  2017-01-20 21:48             ` Jeff King
  2017-01-20 21:58             ` Junio C Hamano
  1 sibling, 2 replies; 13+ messages in thread
From: Junio C Hamano @ 2017-01-20 21:41 UTC (permalink / raw)
  To: Jeff King; +Cc: Stefan Beller, git

Jeff King <peff@peff.net> writes:

>> And in my current understanding of submodules the check in
>> .gitmodules ought to be enough, too.
>
> Yeah, that probably makes sense. You can have a gitlink without a
> .gitmodules file, but I don't quite know what that would mean in terms
> of submodules (I guess it's not a submodule but "something else").

That may be a lot better than reading the index unconditionally, but
I'd rather not to see "git rev-parse" read ".gitmodules" at all.  It
would discourage scripted use of Git for no good reason.



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC/PATCH] Disallow commands from within unpopulated submodules.
  2017-01-20 21:41           ` Junio C Hamano
@ 2017-01-20 21:48             ` Jeff King
  2017-01-20 21:58             ` Junio C Hamano
  1 sibling, 0 replies; 13+ messages in thread
From: Jeff King @ 2017-01-20 21:48 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Stefan Beller, git

On Fri, Jan 20, 2017 at 01:41:54PM -0800, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> >> And in my current understanding of submodules the check in
> >> .gitmodules ought to be enough, too.
> >
> > Yeah, that probably makes sense. You can have a gitlink without a
> > .gitmodules file, but I don't quite know what that would mean in terms
> > of submodules (I guess it's not a submodule but "something else").
> 
> That may be a lot better than reading the index unconditionally, but
> I'd rather not to see "git rev-parse" read ".gitmodules" at all.  It
> would discourage scripted use of Git for no good reason.

Why is that? Just because it makes rev-parse seem more bloated?

I think Stefan's putting it into git.c is confusing the issue a bit.
This is fundamentally about figuring out which git repository we're in,
and that procedure is the right place to put the check.

IOW, when we call setup_git_repository() we are already walking up the
tree and looking at .git/HEAD, .git/config, etc to see if we are in a
valid git repository. It doesn't seem unreasonable to me to make this
part of that check. I.e.:

  - if we we walked up from the working tree (so we have a non-NULL
    prefix); and

  - if there is a .gitmodules file; and

  - if the .gitmodules file shows that we were inside what _should_ have
    been a submodule; then

  - complain and do not accept the outer repository as a valid repo.

That adds only an extra failed open() for people who do not use
submodules, and an extra config-file parse for people who do. And then
only when they are not in the top-level of the working tree (so scripts,
etc that cd_to_toplevel wouldn't pay per-invocation).

-Peff

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC/PATCH] Disallow commands from within unpopulated submodules.
  2017-01-20 21:41           ` Junio C Hamano
  2017-01-20 21:48             ` Jeff King
@ 2017-01-20 21:58             ` Junio C Hamano
  2017-01-20 22:01               ` Jeff King
  2017-01-20 23:22               ` Stefan Beller
  1 sibling, 2 replies; 13+ messages in thread
From: Junio C Hamano @ 2017-01-20 21:58 UTC (permalink / raw)
  To: Jeff King; +Cc: Stefan Beller, git

Junio C Hamano <gitster@pobox.com> writes:

> Jeff King <peff@peff.net> writes:
>
>>> And in my current understanding of submodules the check in
>>> .gitmodules ought to be enough, too.
>>
>> Yeah, that probably makes sense. You can have a gitlink without a
>> .gitmodules file, but I don't quite know what that would mean in terms
>> of submodules (I guess it's not a submodule but "something else").
>
> That may be a lot better than reading the index unconditionally, but
> I'd rather not to see "git rev-parse" read ".gitmodules" at all.  It
> would discourage scripted use of Git for no good reason.

Thinking about this more, I suspect that

	cd sub && git anything

when the index of the top-level thinks "sub" must be a submodule and
the user is not interested in "sub" (hence it hasn't gone through
"git submodule init" or "update") should get the same error as you
would get if you did

	cd /var/tmp/ && git anything

when none of /, /var, /var/tmp/ is controlled by any Git repository.
I.e. "fatal: Not a git repository".

Perhaps we can update two things and make it cheap.

 - checking out the top-level working tree without populating the
   working tree of a submodule learns to do a bit more than just
   creating an empty directory.  Instead, it creates the third kind
   of ".git" (we currently support two kinds of ".git", one that is
   a repository itself, and another that is points at a repository),
   that tells us that there is not (yet) a repository there.

 - the "discovering the root of the working tree" logic learns to
   notice the third kind of ".git" and stop with "Not a git
   repository".



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC/PATCH] Disallow commands from within unpopulated submodules.
  2017-01-20 21:58             ` Junio C Hamano
@ 2017-01-20 22:01               ` Jeff King
  2017-01-20 23:22               ` Stefan Beller
  1 sibling, 0 replies; 13+ messages in thread
From: Jeff King @ 2017-01-20 22:01 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Stefan Beller, git

On Fri, Jan 20, 2017 at 01:58:02PM -0800, Junio C Hamano wrote:

> Junio C Hamano <gitster@pobox.com> writes:
> 
> > Jeff King <peff@peff.net> writes:
> >
> >>> And in my current understanding of submodules the check in
> >>> .gitmodules ought to be enough, too.
> >>
> >> Yeah, that probably makes sense. You can have a gitlink without a
> >> .gitmodules file, but I don't quite know what that would mean in terms
> >> of submodules (I guess it's not a submodule but "something else").
> >
> > That may be a lot better than reading the index unconditionally, but
> > I'd rather not to see "git rev-parse" read ".gitmodules" at all.  It
> > would discourage scripted use of Git for no good reason.
> 
> Thinking about this more, I suspect that
> 
> 	cd sub && git anything
> 
> when the index of the top-level thinks "sub" must be a submodule and
> the user is not interested in "sub" (hence it hasn't gone through
> "git submodule init" or "update") should get the same error as you
> would get if you did
> 
> 	cd /var/tmp/ && git anything
> 
> when none of /, /var, /var/tmp/ is controlled by any Git repository.
> I.e. "fatal: Not a git repository".

Not sure if our emails just crossed, but yes, I agree completely.

> Perhaps we can update two things and make it cheap.
> 
>  - checking out the top-level working tree without populating the
>    working tree of a submodule learns to do a bit more than just
>    creating an empty directory.  Instead, it creates the third kind
>    of ".git" (we currently support two kinds of ".git", one that is
>    a repository itself, and another that is points at a repository),
>    that tells us that there is not (yet) a repository there.
> 
>  - the "discovering the root of the working tree" logic learns to
>    notice the third kind of ".git" and stop with "Not a git
>    repository".

Yeah, I thought about suggesting something like that earlier. It's
slightly more efficient than the "find the root and then complain" thing
Stefan and I were talking about, but I'd worry it comes with weird
corner cases. E.g., are there tools (or other parts of git) that care
about it literally being an empty directory? E.g., would parts of
"checkout" need to know that it's OK to blow it away?

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC/PATCH] Disallow commands from within unpopulated submodules.
  2017-01-20 21:58             ` Junio C Hamano
  2017-01-20 22:01               ` Jeff King
@ 2017-01-20 23:22               ` Stefan Beller
  1 sibling, 0 replies; 13+ messages in thread
From: Stefan Beller @ 2017-01-20 23:22 UTC (permalink / raw)
  To: Junio C Hamano, Brandon Williams, Jonathan Nieder; +Cc: Jeff King, git

On Fri, Jan 20, 2017 at 1:58 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Junio C Hamano <gitster@pobox.com> writes:
>
>> Jeff King <peff@peff.net> writes:
>>
>>>> And in my current understanding of submodules the check in
>>>> .gitmodules ought to be enough, too.
>>>
>>> Yeah, that probably makes sense. You can have a gitlink without a
>>> .gitmodules file, but I don't quite know what that would mean in terms
>>> of submodules (I guess it's not a submodule but "something else").
>>
>> That may be a lot better than reading the index unconditionally, but
>> I'd rather not to see "git rev-parse" read ".gitmodules" at all.  It
>> would discourage scripted use of Git for no good reason.
>
> Thinking about this more, I suspect that
>
>         cd sub && git anything
>
> when the index of the top-level thinks "sub" must be a submodule and
> the user is not interested in "sub" (hence it hasn't gone through
> "git submodule init" or "update") should get the same error as you
> would get if you did
>
>         cd /var/tmp/ && git anything
>
> when none of /, /var, /var/tmp/ is controlled by any Git repository.
> I.e. "fatal: Not a git repository".

I agree. The idea with a tombstone sounds great from a
performance perspective as you do not need to do extra work
in the superproject at all, because any gitlink is detected early
in the discovery.

The big BUT is however the following:
How do current users know if a submodule is e.g. populated?
(From say a third party script). Most likely they use something like

    test -e ${sub}/.git

as that just worked. So if we go with the tombstone idea, we
may break things. (i.e. the fictional third party script confirms any
submodule to be there, but it is not)

I do really like the idea though, so maybe we also need to provide
some submodule plumbing that we opine to be the "correct" way
to see the submodules state[1] to make the transition easier for the
script writers?

[1] c.f. submodule states in
https://github.com/gitster/git/commit/e2b51b9df618ceeff7c4ec044e20f5ce9a87241e

>
> Perhaps we can update two things and make it cheap.
>
>  - checking out the top-level working tree without populating the
>    working tree of a submodule learns to do a bit more than just
>    creating an empty directory.  Instead, it creates the third kind
>    of ".git" (we currently support two kinds of ".git", one that is
>    a repository itself, and another that is points at a repository),
>    that tells us that there is not (yet) a repository there.
>
>  - the "discovering the root of the working tree" logic learns to
>    notice the third kind of ".git" and stop with "Not a git
>    repository".

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC/PATCH] Disallow commands from within unpopulated submodules.
  2017-01-20 19:17 ` Jeff King
  2017-01-20 19:33   ` Stefan Beller
@ 2017-01-21 12:55   ` Duy Nguyen
  1 sibling, 0 replies; 13+ messages in thread
From: Duy Nguyen @ 2017-01-21 12:55 UTC (permalink / raw)
  To: Jeff King; +Cc: Stefan Beller, Git Mailing List

On Sat, Jan 21, 2017 at 2:17 AM, Jeff King <peff@peff.net> wrote:
> On Thu, Jan 19, 2017 at 11:30:23AM -0800, Stefan Beller wrote:
>
>> Now let's ask the same question for "git -C sub status ." (which is a
>> command that is only reading and not writing to the repository)
>>
>> 1) If the submodule is populated, the user clearly intended to know
>>    more about the submodules status
>> 2) It is unclear if the user wanted to learn about the submodules state
>>    (So ideally: "The submodule 'sub' is not initialized. To init ...")
>>    or the status check should be applied to the superproject instead.
>>
>> Avoid the confusion in 2) as well and just error out for now. Later on
>> we may want to add another flag to git.c to allow commands to be run
>> inside unpopulated submodules and each command reacts appropriately.
>
> I like the general idea of catching commands in unpopulated submodules,
> but I'm somewhat uncomfortable with putting an unconditional check into
> git.c, for two reasons:
>
>   1. Reading the index can be expensive. You would not want "git
>      rev-parse" to incur this cost.
>
>   2. How does this interact with commands which do interact with the
>      index? Don't they expect to find the_index unpopulated?
>
>      (I notice that it's effectively tied to RUN_SETUP, which is good.
>       But that also means that many commands, like "diff", won't get the
>       benefit. Not to mention non-builtins).
>
> I'd rather see it in the commands themselves. Especially given the
> "ideal" in your status example, which requires command-specific
> knowledge.

I agree. It's already bad enough for pathspec code to peek into the
index, adding a hidden dependency between parse_pathspec() and
read_cache(). And I still think parse_pathspec() is not the right
place to check submodule paths. Worktree should be checked as well, in
the case that the submodule is not yet registered in the index. The
right place to do that is per-command, with their consent so to speak,
because they may need to set things up (index, .git/config and stuff)
properly before explicitly doing this check.
-- 
Duy

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2017-01-21 12:55 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-01-19 19:30 [RFC/PATCH] Disallow commands from within unpopulated submodules Stefan Beller
2017-01-20 19:17 ` Jeff King
2017-01-20 19:33   ` Stefan Beller
2017-01-20 19:42     ` Jeff King
2017-01-20 19:53       ` Stefan Beller
2017-01-20 20:00         ` Jeff King
2017-01-20 20:07           ` Stefan Beller
2017-01-20 21:41           ` Junio C Hamano
2017-01-20 21:48             ` Jeff King
2017-01-20 21:58             ` Junio C Hamano
2017-01-20 22:01               ` Jeff King
2017-01-20 23:22               ` Stefan Beller
2017-01-21 12:55   ` Duy Nguyen

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.