From: Johannes Schindelin <Johannes.Schindelin@gmx.de>
To: Junio C Hamano <gitster@pobox.com>
Cc: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
	"Han Xin" <hanxin.hx@bytedance.com>,
	chiyutianyi@gmail.com, derrickstolee@github.com,
	git@vger.kernel.org, haiyangtand@gmail.com,
	jonathantanmy@google.com, me@ttaylorr.com, ps@pks.im
Subject: Re: [PATCH v3 2/2] commit-graph.c: no lazy fetch in lookup_commit_in_graph()
Date: Fri, 1 Jul 2022 21:31:26 +0200 (CEST)
Message-ID: <n3p471no-671q-2701-1r72-s0q02ns09053@tzk.qr>
In-Reply-To: <xmqq5ykignwb.fsf@gitster.g>

Hi Junio,

On Thu, 30 Jun 2022, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>
> > On Thu, Jun 30 2022, Johannes Schindelin wrote:
> >
> >> On Tue, 28 Jun 2022, Junio C Hamano wrote:
> >>
> >>> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
> >>>
> >>> >> +test_expect_success 'setup: prepare a repository with commit-graph contains the commit' '
> >>> >> +	git init with-commit-graph &&
> >>> >> +	echo "$(pwd)/with-commit/.git/objects" \
> >>> >> +		>with-commit-graph/.git/objects/info/alternates &&
> >>> >
> >>> > nit: you can use $PWD instead of $(pwd).
> >>>
> >>> We can, and it would not make any difference on non-Windows.
> >>>
> >>> But which one should we use to cater to Windows?  $(pwd) is a full
> >>> path in Windows notation "C:\Program Files\Git\..." while $PWD is
> >>> MSYS style "/C/Program Files/Git/..." or something like that, IIRC?
> >>
> >> Indeed, and since the `alternates` file is supposed to be read by
> >> `git.exe`, a non-MSYS program, the original was good, and the nit
> >> suggested the incorrect form.
> >
> > I looked at t5615-alternate-env.sh which does the equivalent of:
> >
> > 	GIT_ALTERNATE_OBJECT_DIRECTORIES="$PWD/one.git/objects:$PWD/two.git/objects" \
> > 		git cat-file [...]
> >
> > We run that test on all our platforms, does the $PWD form work in the
> > environment variable, but not when we write it to the "alternates" file?
> > Or is there some other subtlety there that I'm missing?
>
> I am also curious to see a clear and concise explanation so that we
> do not have to repeat this discussion later.

Unfortunately, I do not see any way to explain this concisely: MSYS2 does
hard-to-explain things here, in the hope of Doing The Right Thing (most of
the time, anyway).

Whenever you call a non-MSYS program from an MSYS program (and remember,
an MSYS program is a program that uses the MSYS2 runtime that acts as a
POSIX emulation layer), "magic" things are done. In our context,
`bash.exe` is an MSYS program, and the non-MSYS program that is called is
`git.exe`.

So what are those "magic" things? The command-line arguments and the
environment variables are auto-converted: everything that looks like a
Unix-style path (or path list, like the `PATH` environment variable) is
converted to a Windows-style path or path list (on Windows, the colon
cannot be the separator in `PATH` because it is already part of drive
prefixes such as `C:`, therefore the semicolon is used).

And this is where it gets _really_ tricky to explain what is going on:
what _does_ look like a Unix-style path? The exact rules are convoluted
and hard to explain, but they work _most of the time_. For example,
`/usr/bin:/hello` is converted to `C:\Program Files\Git\usr\bin;C:\Program
Files\Git\hello` or something like that. But `kernel.org:/home/gitster` is
not, because it looks more like an SSH path. Similarly, `C:/Program Files`
is interpreted as a Windows-style path, even if it could technically be a
Unix-style path list.

Now, if you call `git.exe -C /blabla <command>`, it works, because
`git.exe` is a non-MSYS program, therefore that `/blabla` is converted to
a Windows-style path before executing `git.exe`. However, when you write a
file via `echo /blabla >file`, that `echo` is either the Bash built-in, or
it is an MSYS program, and no argument conversion takes place. If you
_then_ ask `git.exe` to read and interpret the file as a path, it won't
know what to do with that Unix-style path.

You can substitute `$PWD` for `/blabla` in all of this, and it will hold
true just the same.
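
A minimal sketch of the "write a file" half of this, runnable on any
platform. The file name `path-file` is made up for illustration, and
`/blabla` is the placeholder path used above. It only shows that the shell
writes the path verbatim; the conversion that would happen for a
command-line argument of a non-MSYS program cannot be reproduced outside
of MSYS2.

```shell
# Write a Unix-style path into a file, the way a test script might.
# `printf` (like `echo`) is a shell built-in or an MSYS program, so the
# path is written as-is; no MSYS2 path conversion happens here.
unix_path=/blabla
printf '%s\n' "$unix_path" >path-file

# Reading the file back yields the very same text on every platform.
# A non-MSYS program that reads this file would see "/blabla", not a
# Windows-style path.
stored=$(cat path-file)
printf 'stored: %s\n' "$stored"
```

By contrast, the same `/blabla` passed as an argument to `git.exe` would
be rewritten by the MSYS2 runtime before `git.exe` ever sees it.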

So what makes `pwd` special?

Well, `pwd.exe` itself is an MSYS program, so it would still report a path
that `git.exe` cannot understand. But in Git's test suite, we specifically
override `pwd` to be a shell function that calls `pwd.exe -W`, which does
output Windows-style paths.
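
Roughly, such an override could look like the sketch below. This is an
illustration under assumptions, not a copy of Git's test library: the
`uname -s` check is an assumed way to detect MSYS2/MinGW, and the sketch
uses Bash's built-in `pwd`, following the wording of the commit message
quoted further down ("passing -W to bash's pwd"). The `-W` option exists
only in the MSYS2 flavor of Bash, so the function is defined only there.

```shell
# Define the override only where `pwd -W` is expected to exist.
case "$(uname -s)" in
*MINGW*|*MSYS*)
	# Report the current directory in Windows notation, so that
	# $(pwd) yields a path that a non-MSYS program understands.
	pwd () {
		builtin pwd -W
	}
	;;
esac

# On other platforms `pwd` is left untouched.
here=$(pwd)
printf '%s\n' "$here"
```

With this in place, `$(pwd)` and `$PWD` deliberately differ on Windows:
the former is Windows-style, the latter stays MSYS-style.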

The thing that makes that `GIT_*=$PWD git ...` call work is that the
environment is automagically converted because `git` is a non-MSYS
program. The thing that makes `echo $PWD >.git/objects/info/alternates`
not work is that `echo` _is_ an MSYS program (or Bash built-in, which is
the same thing here, for all practical purposes), so it writes the path
verbatim into that file, but then we expect `git.exe` to read this file
and interpret it as a list of paths.
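
Put together, the form from the patch under discussion can be sketched as
follows. The scratch directory and the `mkdir` calls are stand-ins for the
repositories that the real test creates with `git init`; only the `echo`
line mirrors the quoted test. It relies on the test suite's `pwd` override
described above to obtain a Windows-style path on Windows; elsewhere
`$(pwd)` and `$PWD` normally give the same result.

```shell
# Hypothetical scratch layout standing in for the two repositories.
scratch=$(mktemp -d) &&
cd "$scratch" &&
mkdir -p with-commit/.git/objects \
	with-commit-graph/.git/objects/info &&

# Use $(pwd), not $PWD: the text lands in a file verbatim, and that file
# is later read by git.exe, which needs a path it can interpret.
echo "$(pwd)/with-commit/.git/objects" \
	>with-commit-graph/.git/objects/info/alternates &&

cat with-commit-graph/.git/objects/info/alternates
```

For an environment variable such as `GIT_ALTERNATE_OBJECT_DIRECTORIES`,
the `$PWD` form keeps working because the value is converted when
`git.exe` is started.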

Hopefully that clarifies the scenario a bit, even if it is far from a
concise explanation (I did edit this mail multiple times for clarity and
brevity, though, as I do with pretty much all of my mails).

Ciao,
Dscho

> We have
>
>  - When a test checks for an absolute path that a git command generated,
>    construct the expected value using $(pwd) rather than $PWD,
>    $TEST_DIRECTORY, or $TRASH_DIRECTORY. It makes a difference on
>    Windows, where the shell (MSYS bash) mangles absolute path names.
>    For details, see the commit message of 4114156ae9.
>
> in t/README, but even with the log message of 4114156a (Tests on
> Windows: $(pwd) must return Windows-style paths, 2009-03-13) [*1*],
> I have no idea what makes the thing you found in t5615 work and your
> suggestion to use $PWD in the new one not work.
>
> GIT_ALTERNATE_OBJECT_DIRECTORIES is a PATH_SEP (not necessarily a
> colon) separated list, and I think the way t5615 uses it is broken
> on Windows where PATH_SEP is defined as semicolon without the $PWD
> vs $(pwd) issue.  Is the test checking the right thing?
>
>
> [Footnote]
>
> *1*
>
>     Tests on Windows: $(pwd) must return Windows-style paths
>
>     Many tests pass $(pwd) in some form to git and later test that the output
>     of git contains the correct value of $(pwd). For example, the test of
>     'git remote show' sets up a remote that contains $(pwd) and then the
>     expected result must contain $(pwd).
>
>     Again, MSYS-bash's path mangling kicks in: Plain $(pwd) uses the MSYS style
>     absolute path /c/path/to/git. The test case would write this name into
>     the 'expect' file. But when git is invoked, MSYS-bash converts this name to
>     the Windows style path c:/path/to/git, and git would produce this form in
>     the result; the test would fail.
>
>     We fix this by passing -W to bash's pwd that produces the Windows-style
>     path.
>
>     There are a two cases that need an accompanying change:
>
>     - In t1504 the value of $(pwd) becomes part of a path list. In this case,
>       the lone 'c' in something like /foo:c:/path/to/git:/bar inhibits
>       MSYS-bashes path mangling; IOW in this case we want the /c/path/to/git
>       form to allow path mangling. We use $PWD instead of $(pwd), which always
>       has the latter form.
>
>     - In t6200, $(pwd) - the Windows style path - must be used to construct the
>       expected result because that is the path form that git sees. (The change
>       in the test itself is just for consistency: 'git fetch' always sees the
>       Windows-style path, with or without the change.)
>
>     Signed-off-by: Johannes Sixt <j6t@kdbg.org>
>
>
