All of lore.kernel.org
 help / color / mirror / Atom feed
* Re: [PATCH 1/3] externalize is_git_repository
       [not found] <1449252917-3877-1-git-send-email-a.krey@gmx.de>
@ 2015-12-05  6:58 ` Jeff King
       [not found] ` <1449252917-3877-2-git-send-email-a.krey@gmx.de>
       [not found] ` <1449252917-3877-3-git-send-email-a.krey@gmx.de>
  2 siblings, 0 replies; 5+ messages in thread
From: Jeff King @ 2015-12-05  6:58 UTC (permalink / raw)
  To: Andreas Krey; +Cc: Git Mailing List

On Fri, Dec 04, 2015 at 07:15:15PM +0100, Andreas Krey wrote:

> diff --git a/cache.h b/cache.h
> index 736abc0..6626e97 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -439,6 +439,7 @@ extern int is_inside_work_tree(void);
>  extern const char *get_git_dir(void);
>  extern const char *get_git_common_dir(void);
>  extern int is_git_directory(const char *path);
> +extern int is_git_repository(struct strbuf *sb);

I wonder if we need to give this a better name if it is going to be
globally available. Seeing these two functions defined next to each
other, I have to wonder: what is the difference?

The first one ("is_git_directory") is about testing whether a particular
path is a ".git" directory. The second is about looking for a ".git"
inside the path, and seeing if _that_ points to a valid repository.
That's one way for it to be a repository, but not all (a repository
could itself simply be a bare repo that passes is_git_directory(), after
all).

So maybe a better name for the new function would be
"directory_contains_dotgit_repo" or something?

>  /*
> + * Return 1 if the given path is the root of a git repository or
> + * submodule else 0. Will not return 1 for bare repositories with the
> + * exception of creating a bare repository in "foo/.git" and calling
> + * is_git_repository("foo").
> + */
> +int is_git_repository(struct strbuf *path)

I think it's more useful to put a descriptive comment like this in the
header, where the interface is defined (I know you are just cutting and
pasting this, but in the prior version the declaration and the
definition were in the same place).

> +{
> +	int ret = 0;
> +	int gitfile_error;
> +	size_t orig_path_len = path->len;
> +	assert(orig_path_len != 0);
> +	strbuf_complete(path, '/');
> +	strbuf_addstr(path, ".git");
> +	if (read_gitfile_gently(path->buf, &gitfile_error) || is_git_directory(path->buf))
> +		ret = 1;
> +	if (gitfile_error == READ_GITFILE_ERR_OPEN_FAILED ||
> +	    gitfile_error == READ_GITFILE_ERR_READ_FAILED)
> +		ret = 1;  /* This could be a real .git file, take the
> +			   * safe option and avoid cleaning */

This comment is somewhat stale; we don't know we're cleaning.

Is it always going to be appropriate to err on the side of "yes, this
could be a repository?". My hunch is "yes", because we generally
consider sub-repos to be precious, so that's the safer thing to do.

-Peff

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 2/3] add is_git_repo (is_git_repository for char*)
       [not found] ` <1449252917-3877-2-git-send-email-a.krey@gmx.de>
@ 2015-12-05  7:01   ` Jeff King
  0 siblings, 0 replies; 5+ messages in thread
From: Jeff King @ 2015-12-05  7:01 UTC (permalink / raw)
  To: Andreas Krey; +Cc: Git Mailing List

On Fri, Dec 04, 2015 at 07:15:16PM +0100, Andreas Krey wrote:

> diff --git a/cache.h b/cache.h
> index 6626e97..42ab0c3 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -440,6 +440,7 @@ extern const char *get_git_dir(void);
>  extern const char *get_git_common_dir(void);
>  extern int is_git_directory(const char *path);
>  extern int is_git_repository(struct strbuf *sb);
> +extern int is_git_repo(const char *path);

If we have two functions which do the same thing but differ only in
their interface (here strbuf versus a C-string), we should probably give
them related names that indicate the difference.

So "is_git_repository_str()" or something. (though per my previous
email, I think we should have a new name for the first one, too).

-Peff

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH/RFC 3/3] ls-files/dir: use is_git_repo to detect submodules
       [not found] ` <1449252917-3877-3-git-send-email-a.krey@gmx.de>
@ 2015-12-05  7:37   ` Jeff King
  2015-12-06 16:59     ` Andreas Krey
  0 siblings, 1 reply; 5+ messages in thread
From: Jeff King @ 2015-12-05  7:37 UTC (permalink / raw)
  To: Andreas Krey; +Cc: Git Mailing List

On Fri, Dec 04, 2015 at 07:15:17PM +0100, Andreas Krey wrote:

> Using resolve_gitlink_ref to check for submodules
> creates submodule list entries even when it isn't
> one, and causes O(n^2) runtime behaviour in repos
> with many untracked directories.
> 
> Use is_git_repo instead for detection.
> 
> Signed-off-by: Andreas Krey <a.krey@gmx.de>
> ---
> This looks good, but it breaks test - at least
> number 67 ('../bar/a/b/c works with relative local
> path - ../foo/bar.git') in t7400-submodule-basic.sh
> 
> I don't really understand yet how to fix that,
> but it is due to resolve_gitlink_ref looking
> for a valid HEAD which is_git_repo doesn't.

Hrm. Weird. You'd think it would break with the existing code if I do
this, then:

diff --git a/t/t7400-submodule-basic.sh b/t/t7400-submodule-basic.sh
index 540771c..944e9f5 100755
--- a/t/t7400-submodule-basic.sh
+++ b/t/t7400-submodule-basic.sh
@@ -754,7 +754,7 @@ test_expect_success '../bar/a/b/c works with relative local path - ../foo/bar.gi
 		cp pristine-.git-config .git/config &&
 		cp pristine-.gitmodules .gitmodules &&
 		mkdir -p a/b/c &&
-		(cd a/b/c; git init) &&
+		(cd a/b/c; git init && git commit --allow-empty -m foo) &&
 		git config remote.origin.url ../foo/bar.git &&
 		git submodule add ../bar/a/b/c ./a/b/c &&
 		git submodule init &&

But it doesn't. So presumably there is a mismatch where some other code
expects that anything which gets marked as a repo in the code you
changed can also be resolved to a sha1. I'm not sure where that other
code is though (in git-submodule.sh, or elsewhere in add).

Perhaps we end up in index_path(), which then barfs? That would explain
this (run in the trash directory after the test fails):

  $ cd reltest && git add a/b/c
  error: unable to index file a/b/c/
  fatal: adding files failed

We know it is a git dir, but there is no sha1 for us to actually add as
the gitlink entry.

If that is the case, then there is either some very tricky refactoring
required, or what we are trying to do here is simply wrong. Maybe it
would be simpler to just speed up resolve_gitlink_ref with a better data
structure.

-Peff

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH/RFC 3/3] ls-files/dir: use is_git_repo to detect submodules
  2015-12-05  7:37   ` [PATCH/RFC 3/3] ls-files/dir: use is_git_repo to detect submodules Jeff King
@ 2015-12-06 16:59     ` Andreas Krey
  2015-12-07 21:10       ` Jeff King
  0 siblings, 1 reply; 5+ messages in thread
From: Andreas Krey @ 2015-12-06 16:59 UTC (permalink / raw)
  To: Jeff King; +Cc: Git Mailing List

On Sat, 05 Dec 2015 02:37:44 +0000, Jeff King wrote:
...
> Hrm. Weird. You'd think it would break with the existing code if I do
> this, then:
> 
...
> -		(cd a/b/c; git init) &&
> +		(cd a/b/c; git init && git commit --allow-empty -m foo) &&
>  		git config remote.origin.url ../foo/bar.git &&
>  		git submodule add ../bar/a/b/c ./a/b/c &&

I tried a -f here instead; did not work either.

I guess I will first wade through the other failures my 'fix'
causes to see the total damage.

...
> We know it is a git dir, but there is no sha1 for us to actually add as
> the gitlink entry.
> 
> If that is the case, then there is either some very tricky refactoring
> required,

Yes, it looks like the return code delivered need to be slightly different
dependent on the user.

> or what we are trying to do here is simply wrong. Maybe it
> would be simpler to just speed up resolve_gitlink_ref with a better data
> structure.

Which is what I did on square one, but now we already have a real fix
for git clean, and now it's even less advantageous the fix the consequence
(the submodule cache blowing up) instead of the cause (asking for it
in the first place).

I don't think we should let is_git_repository look for a valid(ish) HEAD.

Andreas

PS: I seem to not quite have send-email under control, the envelope from
    seems to made the patches not reach the mailing list (nor me in the CC).

-- 
"Totally trivial. Famous last words."
From: Linus Torvalds <torvalds@*.org>
Date: Fri, 22 Jan 2010 07:29:21 -0800

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH/RFC 3/3] ls-files/dir: use is_git_repo to detect submodules
  2015-12-06 16:59     ` Andreas Krey
@ 2015-12-07 21:10       ` Jeff King
  0 siblings, 0 replies; 5+ messages in thread
From: Jeff King @ 2015-12-07 21:10 UTC (permalink / raw)
  To: Andreas Krey; +Cc: Git Mailing List

On Sun, Dec 06, 2015 at 05:59:26PM +0100, Andreas Krey wrote:

> On Sat, 05 Dec 2015 02:37:44 +0000, Jeff King wrote:
> ...
> > Hrm. Weird. You'd think it would break with the existing code if I do
> > this, then:
> > 
> ...
> > -		(cd a/b/c; git init) &&
> > +		(cd a/b/c; git init && git commit --allow-empty -m foo) &&
> >  		git config remote.origin.url ../foo/bar.git &&
> >  		git submodule add ../bar/a/b/c ./a/b/c &&
> 
> I tried a -f here instead; did not work either.
> 
> I guess I will first wade through the other failures my 'fix'
> causes to see the total damage.

The only other one I saw was in the completion tests. And it looked like
`git add` failing in a way similar to what I dug into here. So I think
it's "just" the one bug.

> > We know it is a git dir, but there is no sha1 for us to actually add as
> > the gitlink entry.
> > 
> > If that is the case, then there is either some very tricky refactoring
> > required,
> 
> Yes, it looks like the return code delivered need to be slightly different
> dependent on the user.

Maybe. From your patch it looks like the "git-add" code will return it
as "untracked". Which makes sense if we then want to add it. But if it
has no HEAD commit we _cannot_ add it, as we have no sha1 to stick in
the index. It should probably be ignored totally in that case.

But that means you have to actually find out if HEAD is valid or not.
Which is what the current code is doing. :-/

> > or what we are trying to do here is simply wrong. Maybe it
> > would be simpler to just speed up resolve_gitlink_ref with a better data
> > structure.
> 
> Which is what I did on square one, but now we already have a real fix
> for git clean, and now it's even less advantageous the fix the consequence
> (the submodule cache blowing up) instead of the cause (asking for it
> in the first place).

I think "clean" is a much simpler case. It only wants to know "can I
skip this entry as an untracked sub-repo?". And that is similar to what
"git status" or "git ls-files" wants to know. But "git add" wants to
know "can I _add_ this entry to the index", and that is a different
question (but I think answered by the same code that powers ls-files).

> PS: I seem to not quite have send-email under control, the envelope from
>     seems to made the patches not reach the mailing list (nor me in the CC).

Hmm, yeah. Obviously they made it to me directly, but the list is a
little bit picky.

-Peff

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2015-12-07 21:10 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <1449252917-3877-1-git-send-email-a.krey@gmx.de>
2015-12-05  6:58 ` [PATCH 1/3] externalize is_git_repository Jeff King
     [not found] ` <1449252917-3877-2-git-send-email-a.krey@gmx.de>
2015-12-05  7:01   ` [PATCH 2/3] add is_git_repo (is_git_repository for char*) Jeff King
     [not found] ` <1449252917-3877-3-git-send-email-a.krey@gmx.de>
2015-12-05  7:37   ` [PATCH/RFC 3/3] ls-files/dir: use is_git_repo to detect submodules Jeff King
2015-12-06 16:59     ` Andreas Krey
2015-12-07 21:10       ` Jeff King

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.