All of lore.kernel.org
 help / color / mirror / Atom feed
* git tries to stat //HEAD when searching for a repo, leading to huge delays on Cygwin
@ 2017-11-02 23:45 Andrew Baumann
  2017-11-03  1:03 ` Jeff King
  2017-11-03  1:06 ` [PATCH] setup.c: don't try to access '//HEAD' during repo discovery SZEDER Gábor
  0 siblings, 2 replies; 6+ messages in thread
From: Andrew Baumann @ 2017-11-02 23:45 UTC (permalink / raw)
  To: git

Hi,

I have a workaround for this, but someone on stack overflow [1] suggested reporting it upstream, so here you go:

I have a fancy shell prompt that executes "git rev-parse --is-inside-work-tree" to determine whether we're currently inside a working directory. This causes git to walk up the directory hierarchy looking for a containing git repo. For example, when invoked from my home directory, it stats the following paths, in order:

/home/me/.git
/home/me/.git/HEAD
/home/me/HEAD
/home
/home/.git
/home/.git/HEAD
/home/HEAD
/
/.git
/.git/HEAD
//HEAD

The last name (//HEAD) interacts badly with Cygwin, which interprets it as a UNC file share, and so demand-loads a bunch of extra DLLs and attempts to resolve/contact the server named HEAD. This obviously doesn't work too well, especially over a slow network link.

I've tested with the latest Cygwin git (2.15.0); this was also present in a prior version.

Cheers,
Andrew

[1] https://stackoverflow.com/questions/47084672

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: git tries to stat //HEAD when searching for a repo, leading to huge delays on Cygwin
  2017-11-02 23:45 git tries to stat //HEAD when searching for a repo, leading to huge delays on Cygwin Andrew Baumann
@ 2017-11-03  1:03 ` Jeff King
  2017-11-03 12:32   ` Johannes Schindelin
  2017-11-03  1:06 ` [PATCH] setup.c: don't try to access '//HEAD' during repo discovery SZEDER Gábor
  1 sibling, 1 reply; 6+ messages in thread
From: Jeff King @ 2017-11-03  1:03 UTC (permalink / raw)
  To: Andrew Baumann; +Cc: Johannes Schindelin, git

On Thu, Nov 02, 2017 at 11:45:55PM +0000, Andrew Baumann wrote:

> I have a workaround for this, but someone on stack overflow [1]
> suggested reporting it upstream, so here you go:
> 
> I have a fancy shell prompt that executes "git rev-parse
> --is-inside-work-tree" to determine whether we're currently inside a
> working directory. This causes git to walk up the directory hierarchy
> looking for a containing git repo. For example, when invoked from my
> home directory, it stats the following paths, in order:
> 
> /home/me/.git
> /home/me/.git/HEAD
> /home/me/HEAD
> /home
> /home/.git
> /home/.git/HEAD
> /home/HEAD
> /
> /.git
> /.git/HEAD
> //HEAD
> 
> The last name (//HEAD) interacts badly with Cygwin, which interprets
> it as a UNC file share, and so demand-loads a bunch of extra DLLs and
> attempts to resolve/contact the server named HEAD. This obviously
> doesn't work too well, especially over a slow network link.
> 
> I've tested with the latest Cygwin git (2.15.0); this was also present
> in a prior version.

Interesting. I can reproduce on Linux (but of course "//HEAD" is cheap
to look at there). It bisects to ce9b8aab5d (setup_git_directory_1():
avoid changing global state, 2017-03-13). Before that, the end of the
strace for "git rev-parse --git-dir" looks like:

  chdir("..")                             = 0
  stat(".git", 0x7fffba398e00)            = -1 ENOENT (No such file or directory)
  lstat(".git/HEAD", 0x7fffba398dd0)      = -1 ENOENT (No such file or directory)
  lstat("./HEAD", 0x7fffba398dd0)         = -1 ENOENT (No such file or directory)
  write(2, "fatal: Not a git repository (or "..., 69) = 69

and after:

  stat("/.git", 0x7ffdb28b7eb0)           = -1 ENOENT (No such file or directory)
  lstat("/.git/HEAD", 0x7ffdb28b7e80)     = -1 ENOENT (No such file or directory)
  lstat("//HEAD", 0x7ffdb28b7e80)         = -1 ENOENT (No such file or directory)
  write(2, "fatal: Not a git repository (or "..., 69) = 69

Switching to using absolute paths rather than chdir-ing around is
intentional for that commit, but it looks like we just need to
special-case the construction of the root path.

Like this, perhaps:

diff --git a/setup.c b/setup.c
index 27a31de33f..5d0b6a88e3 100644
--- a/setup.c
+++ b/setup.c
@@ -283,7 +283,9 @@ int is_git_directory(const char *suspect)
 	size_t len;
 
 	/* Check worktree-related signatures */
-	strbuf_addf(&path, "%s/HEAD", suspect);
+	strbuf_addstr(&path, suspect);
+	strbuf_complete(&path, '/');
+	strbuf_addstr(&path, "HEAD");
 	if (validate_headref(path.buf))
 		goto done;
 

-Peff

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH] setup.c: don't try to access '//HEAD' during repo discovery
  2017-11-02 23:45 git tries to stat //HEAD when searching for a repo, leading to huge delays on Cygwin Andrew Baumann
  2017-11-03  1:03 ` Jeff King
@ 2017-11-03  1:06 ` SZEDER Gábor
  2017-11-03 12:36   ` Johannes Schindelin
  1 sibling, 1 reply; 6+ messages in thread
From: SZEDER Gábor @ 2017-11-03  1:06 UTC (permalink / raw)
  To: Andrew Baumann; +Cc: Johannes Schindelin, git, SZEDER Gábor

Commit ce9b8aab5 (setup_git_directory_1(): avoid changing global state,
2017-03-13) changed how the git directory is discovered, and as a side
effect when the discovery reaches the root directory Git tries to
access paths like '//HEAD' and '//objects'.  This interacts badly with
Cygwin, which interprets it as a UNC file share, and so demand-loads a
bunch of extra DLLs and attempts to resolve/contact the server named
HEAD.  This obviously doesn't work too well, especially over a slow
network link.

Special case the root directory in is_git_directory() to prevent
accessing paths with leading double slashes.

Reported-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
---

I'm not quite sure whether this is the right or complete fix.  I can't
test it on Cygwin, and I don't know whether Git for Windows is
affected with it's drive prefixes in paths.  Anyway, at least strace
output on Linux looks good to me.

 setup.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/setup.c b/setup.c
index 03f51e056..0cfc5e676 100644
--- a/setup.c
+++ b/setup.c
@@ -311,6 +311,10 @@ int is_git_directory(const char *suspect)
 	int ret = 0;
 	size_t len;
 
+	/* To avoid accessing '//HEAD' & co when checking the root directory */
+	if (!strcmp(suspect, "/"))
+		suspect = "";
+
 	/* Check worktree-related signatures */
 	strbuf_addf(&path, "%s/HEAD", suspect);
 	if (validate_headref(path.buf))
-- 
2.15.0.67.gb67a46776


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: git tries to stat //HEAD when searching for a repo, leading to huge delays on Cygwin
  2017-11-03  1:03 ` Jeff King
@ 2017-11-03 12:32   ` Johannes Schindelin
  2017-11-03 18:29     ` Jeff King
  0 siblings, 1 reply; 6+ messages in thread
From: Johannes Schindelin @ 2017-11-03 12:32 UTC (permalink / raw)
  To: Jeff King; +Cc: Andrew Baumann, git

Hi Peff,


On Thu, 2 Nov 2017, Jeff King wrote:

> On Thu, Nov 02, 2017 at 11:45:55PM +0000, Andrew Baumann wrote:
> 
> > I have a workaround for this, but someone on stack overflow [1]
> > suggested reporting it upstream, so here you go:
> > 
> > I have a fancy shell prompt that executes "git rev-parse
> > --is-inside-work-tree" to determine whether we're currently inside a
> > working directory. This causes git to walk up the directory hierarchy
> > looking for a containing git repo. For example, when invoked from my
> > home directory, it stats the following paths, in order:
> > 
> > /home/me/.git
> > /home/me/.git/HEAD
> > /home/me/HEAD
> > /home
> > /home/.git
> > /home/.git/HEAD
> > /home/HEAD
> > /
> > /.git
> > /.git/HEAD
> > //HEAD
> > 
> > The last name (//HEAD) interacts badly with Cygwin, which interprets
> > it as a UNC file share, and so demand-loads a bunch of extra DLLs and
> > attempts to resolve/contact the server named HEAD. This obviously
> > doesn't work too well, especially over a slow network link.
> > 
> > I've tested with the latest Cygwin git (2.15.0); this was also present
> > in a prior version.
> 
> Interesting. I can reproduce on Linux (but of course "//HEAD" is cheap
> to look at there). It bisects to ce9b8aab5d (setup_git_directory_1():
> avoid changing global state, 2017-03-13). Before that, the end of the
> strace for "git rev-parse --git-dir" looks like:
> 
>   chdir("..")                             = 0
>   stat(".git", 0x7fffba398e00)            = -1 ENOENT (No such file or directory)
>   lstat(".git/HEAD", 0x7fffba398dd0)      = -1 ENOENT (No such file or directory)
>   lstat("./HEAD", 0x7fffba398dd0)         = -1 ENOENT (No such file or directory)
>   write(2, "fatal: Not a git repository (or "..., 69) = 69
> 
> and after:
> 
>   stat("/.git", 0x7ffdb28b7eb0)           = -1 ENOENT (No such file or directory)
>   lstat("/.git/HEAD", 0x7ffdb28b7e80)     = -1 ENOENT (No such file or directory)
>   lstat("//HEAD", 0x7ffdb28b7e80)         = -1 ENOENT (No such file or directory)
>   write(2, "fatal: Not a git repository (or "..., 69) = 69
> 
> Switching to using absolute paths rather than chdir-ing around is
> intentional for that commit, but it looks like we just need to
> special-case the construction of the root path.
> 
> Like this, perhaps:
> 
> diff --git a/setup.c b/setup.c
> index 27a31de33f..5d0b6a88e3 100644
> --- a/setup.c
> +++ b/setup.c
> @@ -283,7 +283,9 @@ int is_git_directory(const char *suspect)
>  	size_t len;
>  
>  	/* Check worktree-related signatures */
> -	strbuf_addf(&path, "%s/HEAD", suspect);
> +	strbuf_addstr(&path, suspect);
> +	strbuf_complete(&path, '/');
> +	strbuf_addstr(&path, "HEAD");
>  	if (validate_headref(path.buf))
>  		goto done;

Yes, that would work around the issue. TBH I expected `/` to not be a
valid bare repository path (and therefore I thought that `suspect` could
never be just a single slash), but people do all kinds of crazy stuff, right?

I note also that there are tons of `strbuf_addstr(...);
strbuf_complete(..., '/');` patterns in our code, as well as
`strbuf("%s/blub", dir)`, which probably should all be condensed into
single function calls both for semantic clarity as well as to avoid double
slashes in the middle of paths.

In the short run, though, let's take your patch. Maybe with a commit
message like this?

-- snipsnap --
setup: avoid double slashes when looking for HEAD

Andrew Baumann reported that when called outside of any Git worktree, `git
rev-parse --is-inside-work-tree` eventually tries to access `//HEAD`, i.e.
any `HEAD` file in the root directory, but with a double slash.

This double slash is not only unintentional, but is allowed by the POSIX
standard to have a special meaning. And most notably on Windows, it does,
where it refers to a UNC path of the form `//server/share/`.

As a consequence, afore-mentioned `rev-parse` call not only looks for the
wrong thing, but it also causes serious delays, as Windows will try to
access a server called `HEAD`.

Let's simply avoid the unintended double slash.

Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] setup.c: don't try to access '//HEAD' during repo discovery
  2017-11-03  1:06 ` [PATCH] setup.c: don't try to access '//HEAD' during repo discovery SZEDER Gábor
@ 2017-11-03 12:36   ` Johannes Schindelin
  0 siblings, 0 replies; 6+ messages in thread
From: Johannes Schindelin @ 2017-11-03 12:36 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: Andrew Baumann, git

[-- Attachment #1: Type: text/plain, Size: 1039 bytes --]

Hi Gábor,

On Fri, 3 Nov 2017, SZEDER Gábor wrote:

> Commit ce9b8aab5 (setup_git_directory_1(): avoid changing global state,
> 2017-03-13) changed how the git directory is discovered, and as a side
> effect when the discovery reaches the root directory Git tries to
> access paths like '//HEAD' and '//objects'.  This interacts badly with
> Cygwin, which interprets it as a UNC file share, and so demand-loads a
> bunch of extra DLLs and attempts to resolve/contact the server named
> HEAD.  This obviously doesn't work too well, especially over a slow
> network link.
> 
> Special case the root directory in is_git_directory() to prevent
> accessing paths with leading double slashes.

I would rather not special-case the (Unix) root directory. The underlying
problem, as Peff pointed out, is that an extra slash is appended if the
directory. And as `is_git_directory()` is not marked as static, we must
assume that there are other callers that may pass in directory paths
ending in a slash already.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: git tries to stat //HEAD when searching for a repo, leading to huge delays on Cygwin
  2017-11-03 12:32   ` Johannes Schindelin
@ 2017-11-03 18:29     ` Jeff King
  0 siblings, 0 replies; 6+ messages in thread
From: Jeff King @ 2017-11-03 18:29 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Andrew Baumann, git

On Fri, Nov 03, 2017 at 01:32:26PM +0100, Johannes Schindelin wrote:

> > diff --git a/setup.c b/setup.c
> > index 27a31de33f..5d0b6a88e3 100644
> > --- a/setup.c
> > +++ b/setup.c
> > @@ -283,7 +283,9 @@ int is_git_directory(const char *suspect)
> >  	size_t len;
> >  
> >  	/* Check worktree-related signatures */
> > -	strbuf_addf(&path, "%s/HEAD", suspect);
> > +	strbuf_addstr(&path, suspect);
> > +	strbuf_complete(&path, '/');
> > +	strbuf_addstr(&path, "HEAD");
> >  	if (validate_headref(path.buf))
> >  		goto done;
> 
> Yes, that would work around the issue. TBH I expected `/` to not be a
> valid bare repository path (and therefore I thought that `suspect` could
> never be just a single slash), but people do all kinds of crazy stuff, right?

Heh. People _do_ do crazy stuff, but I think here it is just the tool
double-checking if people are doing crazy stuff (which they mostly
aren't) by walking up to the root.

> I note also that there are tons of `strbuf_addstr(...);
> strbuf_complete(..., '/');` patterns in our code, as well as
> `strbuf("%s/blub", dir)`, which probably should all be condensed into
> single function calls both for semantic clarity as well as to avoid double
> slashes in the middle of paths.

Yeah, I had the same thought. This _seems_ like the kind of thing
mkpathdup() would handle, but it doesn't (and as far as I can tell never
did).

Grepping around for calls to strbuf_complete() with '/' shows that most
callers wouldn't benefit. Many do trickier things like setting up a path
ending in slash, and then appending the final element repeatedly while
iterating over a list. Some have a strbuf already and just want to
append the final element to it.

On the other hand, I suspect these are only the cases that are already
conscientious about avoiding double-slashes. There are probably a ton of
xstrfmt("%s/%s", dir, file) equivalents that just aren't careful.

> In the short run, though, let's take your patch. Maybe with a commit
> message like this?

Agreed. There are enough pitfalls to a general path-constructing helper
that I don't think we should hold up a fix while on it.

> -- snipsnap --
> setup: avoid double slashes when looking for HEAD

I see you posted this separately, so I'll review there.

Thanks for finishing it off (I had intended to circle back to it this
morning myself).

-Peff

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2017-11-03 18:29 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-11-02 23:45 git tries to stat //HEAD when searching for a repo, leading to huge delays on Cygwin Andrew Baumann
2017-11-03  1:03 ` Jeff King
2017-11-03 12:32   ` Johannes Schindelin
2017-11-03 18:29     ` Jeff King
2017-11-03  1:06 ` [PATCH] setup.c: don't try to access '//HEAD' during repo discovery SZEDER Gábor
2017-11-03 12:36   ` Johannes Schindelin

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.