* [RFC] a tale of Git 2.5, ssh transport and GIT_* environment variables @ 2015-09-04 10:52 Giuseppe Bilotta 2015-09-04 12:54 ` Jeff King 0 siblings, 1 reply; 12+ messages in thread From: Giuseppe Bilotta @ 2015-09-04 10:52 UTC (permalink / raw) To: Git List Hello all, I'm going to tell a tale of the oddest bug I've ever come across in Git and aks for suggestions on how we can improve the user experience in a sensible way. I've been stymied for days trying to solve the following issue, which I came across every time I tried to push to a specific machine when using git 2.5 (I'm honestly not sure about which version introduced this, and I haven't runthe necessary bisection due to lack of time, sorry). Trying to push any changes with 2.5 resulted in this kind of failure: user@clientmachine:~/some/git/workdir $ git push Counting objects: 6, done. Delta compression using up to 8 threads. Compressing objects: 100% (5/5), done. Writing objects: 100% (6/6), 841 bytes | 0 bytes/s, done. Total 6 (delta 2), reused 0 (delta 0) fatal: Could not switch to '/home/user/some/git': No such file or directory error: unpack failed: unpack-objects abnormal exit fatal: Could not switch to '/home/user/some/git': No such file or directory To git@remote.machine:remote-repo ! [remote rejected] master -> master (n/a (unpacker error)) error: failed to push some refs to 'git@remote.machine:remote-repo' Notice two things: the messages refer to the worktree updir of the CLIENT machine, and even though it's _completely not obvious_ due to the missing 'remote:' lines, the messages actually come from the SERVER. The lack of indicator lines _alone_ took me hours of debugging before I finally understood that they were coming from the other side (yes, I woud say that it's a bug), but after this I was even more perplexed: why the heck would the SERVER try to switch to the updir of the CLIENT worktree? I still couldn't do much on the SERVER to debug due to a variety of reasons, but I finally had a suspicion: it was almost as if the SERVER was getting the GIT_DIR information from the CLIENT. And why the heck would _that_ be the case? I then remembered that the server was actually configured to AcceptEnv GIT_* in sshd_config, for reasons related to git identity preservation despite single login account (please don't ask). Turning the AcceptEnv to a stricter GIT_AUTHOR* and GIT_COMMITTER* solved the issue. Now, a couple of questions: 1. since when have git internals started exporting GIT_* variables related to the git dir and worktree location? 2. is it worth making sure that these don't get propagated via ssh? Regarding 2., there are actually webpages around suggesting the AcceptEnv GIT_* trick, and these setups are _all_ going to break with the latest git, so this is something that might be worth looking into. -- Giuseppe "Oblomov" Bilotta ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [RFC] a tale of Git 2.5, ssh transport and GIT_* environment variables 2015-09-04 10:52 [RFC] a tale of Git 2.5, ssh transport and GIT_* environment variables Giuseppe Bilotta @ 2015-09-04 12:54 ` Jeff King 2015-09-04 14:02 ` Giuseppe Bilotta 2015-09-04 18:18 ` Junio C Hamano 0 siblings, 2 replies; 12+ messages in thread From: Jeff King @ 2015-09-04 12:54 UTC (permalink / raw) To: Giuseppe Bilotta; +Cc: Git List On Fri, Sep 04, 2015 at 12:52:45PM +0200, Giuseppe Bilotta wrote: > Trying to push any changes with 2.5 resulted in this kind of failure: > > user@clientmachine:~/some/git/workdir $ git push > Counting objects: 6, done. > Delta compression using up to 8 threads. > Compressing objects: 100% (5/5), done. > Writing objects: 100% (6/6), 841 bytes | 0 bytes/s, done. > Total 6 (delta 2), reused 0 (delta 0) > fatal: Could not switch to '/home/user/some/git': No such file or directory > error: unpack failed: unpack-objects abnormal exit > fatal: Could not switch to '/home/user/some/git': No such file or directory > To git@remote.machine:remote-repo > ! [remote rejected] master -> master (n/a (unpacker error)) > error: failed to push some refs to 'git@remote.machine:remote-repo' > > Notice two things: the messages refer to the worktree updir of the > CLIENT machine, and even though it's _completely not obvious_ due to > the missing 'remote:' lines, the messages actually come from the > SERVER. The lack of indicator lines _alone_ took me hours of debugging > before I finally understood that they were coming from the other side Older versions of receive-pack would let unpack-objects output go straight to stderr, but that changed in a22e6f8 (receive-pack: send pack-processing stderr over sideband, 2012-09-21), which is in git v1.7.12.3. What version of git is running on the remote server? E.g., even without going over ssh, if I do: git init echo content >file && git add file && git commit -m foo git init --bare dst.git # force unpacker to fail chmod -w dst.git/objects git push dst.git I get: Counting objects: 3, done. Writing objects: 100% (3/3), 205 bytes | 0 bytes/s, done. Total 3 (delta 0), reused 0 (delta 0) remote: error: insufficient permission for adding an object to repository database ./objects remote: fatal: failed to write object error: unpack failed: unpack-objects abnormal exit To dst.git ! [remote rejected] master -> master (unpacker error) error: failed to push some refs to 'dst.git' The "unpack failed" line _does_ come from the remote, but comes straight from receive-pack, not the child unpack-objects. Receive-pack distinguishes between errors which should go to the client and which just go to stderr for debugging (remember that not all transports actually propagate stderr to the client; ssh is special here). It's possible we could switch the "unpack failed" to go to the client, but it is redundant with the "unpacker error" which _does_ go to the client. > I still couldn't do much on the SERVER to debug due to a variety of > reasons, but I finally had a suspicion: it was almost as if the SERVER > was getting the GIT_DIR information from the CLIENT. And why the heck > would _that_ be the case? > > I then remembered that the server was actually configured to AcceptEnv > GIT_* in sshd_config, for reasons related to git identity preservation > despite single login account (please don't ask). Turning the AcceptEnv > to a stricter GIT_AUTHOR* and GIT_COMMITTER* solved the issue. I couldn't reproduce this problem, either during a local push, or across an ssh session (where the client has "SendEnv GIT_*" and the server has "AcceptEnv GIT_*"). In the local case, we explicitly unset GIT_DIR and related variables in connect.c:git_connect. We don't seem to do so for the ssh case, though. I can confirm that the variable makes it across to the remote: GIT_DIR=$PWD git push \ --receive-pack='echo >&2 GIT_DIR=$GIT_DIR; git-receive-pack' \ remote-host:remote-repo shows our local $PWD on the remote side (though note that you have to explicitly set $GIT_DIR; git-push does not do so normally). On the receiving side, git-receive-pack takes an argument for the repo path, and calls enter_repo. That should result in calling set_git_dir(), which overwrites $GIT_DIR in the environment. AFAICT, it has always done so. So I'm not sure how GIT_DIR would leak through, even on an older version of git. > 1. since when have git internals started exporting GIT_* variables > related to the git dir and worktree location? It has done so for a long time, though the exact rules for doing so changes from time to time. Browsing "git log", I couldn't find any recent changes in this area. It would help if you could bisect the problem, as I can't manage to replicate it. > 2. is it worth making sure that these don't get propagated via ssh? It seems like a reasonable idea for git_connect() to do so in the ssh case, as well as the local case. That _could_ be a regression for somebody who uses an ssh-wrapper whose behavior changes based on some $GIT_* variable, but that seems a bit far-fetched. It shouldn't be necessary for $GIT_DIR, but it makes sense for other git variables. E.g., with "AcceptEnv GIT_*", "git -c" config is propagated. E.g.: # make a syntactically bogus commit commit=$(git cat-file commit HEAD | sed 's/>/>>/' | git hash-object -t commit -w --stdin) git update-ref HEAD $commit # confirm that git complains git fsck # confirm that it gets blocked by receive.fsckObjects; the "-c" is # applied to the receive-pack directly, so we are already in the # "remote" repo. git push --receive-pack='git -c receive.fsckobjects=true receive-pack' dst.git # now let's try it with a local "-c". This doesn't get propagated, as # we clean out any environment-level config before running # receive-pack in the "remote". As a result, the push succeeds. git -c receive.fsckobjects=true push dst.git # and now the same thing over ssh; this _will_ complain, as we # propagated the config. git -c receive.fsckObjects=true push remote:dst.git -Peff ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [RFC] a tale of Git 2.5, ssh transport and GIT_* environment variables 2015-09-04 12:54 ` Jeff King @ 2015-09-04 14:02 ` Giuseppe Bilotta 2015-09-04 14:26 ` Jeff King 2015-09-04 18:18 ` Junio C Hamano 1 sibling, 1 reply; 12+ messages in thread From: Giuseppe Bilotta @ 2015-09-04 14:02 UTC (permalink / raw) To: Jeff King; +Cc: Git List Hello, On Fri, Sep 4, 2015 at 2:54 PM, Jeff King <peff@peff.net> wrote: > On Fri, Sep 04, 2015 at 12:52:45PM +0200, Giuseppe Bilotta wrote: > >> Notice two things: the messages refer to the worktree updir of the >> CLIENT machine, and even though it's _completely not obvious_ due to >> the missing 'remote:' lines, the messages actually come from the >> SERVER. The lack of indicator lines _alone_ took me hours of debugging >> before I finally understood that they were coming from the other side > > Older versions of receive-pack would let unpack-objects output go > straight to stderr, but that changed in a22e6f8 (receive-pack: send > pack-processing stderr over sideband, 2012-09-21), which is in git > v1.7.12.3. What version of git is running on the remote server? Sorry, I forgot to mention this in my origina email, I should have. The remote is quite ancient: git version 1.7.0.4 >> I still couldn't do much on the SERVER to debug due to a variety of >> reasons, but I finally had a suspicion: it was almost as if the SERVER >> was getting the GIT_DIR information from the CLIENT. And why the heck >> would _that_ be the case? >> >> I then remembered that the server was actually configured to AcceptEnv >> GIT_* in sshd_config, for reasons related to git identity preservation >> despite single login account (please don't ask). Turning the AcceptEnv >> to a stricter GIT_AUTHOR* and GIT_COMMITTER* solved the issue. > > I couldn't reproduce this problem, either during a local push, or across > an ssh session (where the client has "SendEnv GIT_*" and the server has > "AcceptEnv GIT_*"). [snip] > On the receiving side, git-receive-pack takes an argument for the repo > path, and calls enter_repo. That should result in calling set_git_dir(), > which overwrites $GIT_DIR in the environment. AFAICT, it has always done > so. So I'm not sure how GIT_DIR would leak through, even on an older > version of git. Hm, aside from the ancient version of git, the other side also has gitolite2 running (can you tell it's an ancient installation that needs upgrading?). However,I've done a quick test by creating an empty bare git repository outside of gitolite2, and I'm still getting the same error, so I would exclude gitolite2 interference. In fact, your testing approach gave me the idea to test by prepending env 1>&2 to the receive pack and this way I noticed that the variable being set was GIT_WORK_TREE. In fact, using --receive-pack 'export -n GIT_WORK_TREE ; git-receive-pack' fixes the problem for me. So, I've at least been able to circumscribe the problem to: server git 1.7.0.4, client git 2.5, GIT_WORK_TREE being sent over the ssh connection. I'll see if I can find some time to do some proper bisecting next week. -- Giuseppe "Oblomov" Bilotta ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [RFC] a tale of Git 2.5, ssh transport and GIT_* environment variables 2015-09-04 14:02 ` Giuseppe Bilotta @ 2015-09-04 14:26 ` Jeff King 0 siblings, 0 replies; 12+ messages in thread From: Jeff King @ 2015-09-04 14:26 UTC (permalink / raw) To: Giuseppe Bilotta; +Cc: Git List On Fri, Sep 04, 2015 at 04:02:09PM +0200, Giuseppe Bilotta wrote: > So, I've at least been able to circumscribe the problem to: server git > 1.7.0.4, client git 2.5, GIT_WORK_TREE being sent over the ssh > connection. I'll see if I can find some time to do some proper > bisecting next week. That sounds like progress. I was hoping I could replicate it with that information, but I still can't: LOCAL_GIT=git.v2.5.0 REMOTE_GIT=/home/peff/compile/git/bin-wrappers/git $ $LOCAL_GIT version git version 2.5.0 $ ssh git.peff.net $REMOTE_GIT version git version 1.7.0.4 $ ssh git.peff.net "rm -rf /tmp/foo.git; $REMOTE_GIT init --bare /tmp/foo.git" $ GIT_DIR=$PWD/.git GIT_WORK_TREE=$PWD $LOCAL_GIT push \ --receive-pack="env | grep GIT >&2; $REMOTE_GIT receive-pack" \ git.peff.net:/tmp/foo.git HEAD:branch GIT_DIR=/home/peff/tmp/.git GIT_WORK_TREE=/home/peff/tmp GIT_PREFIX= Counting objects: 3, done. Writing objects: 100% (3/3), 208 bytes | 0 bytes/s, done. Total 3 (delta 0), reused 0 (delta 0) To git.peff.net:/tmp/foo.git * [new branch] HEAD -> branch Interestingly, if I try pushing straight to master, the remote side complains about trying to push the checked-out branch, even though it's a bare repo (presumably it's getting confused by the GIT_WORK_TREE setting; so that's _a_ bug, but not the one you are seeing). So I'm not sure what is different about my setup and yours. -Peff ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [RFC] a tale of Git 2.5, ssh transport and GIT_* environment variables 2015-09-04 12:54 ` Jeff King 2015-09-04 14:02 ` Giuseppe Bilotta @ 2015-09-04 18:18 ` Junio C Hamano 2015-09-04 21:10 ` Giuseppe Bilotta 2015-09-04 21:44 ` Jeff King 1 sibling, 2 replies; 12+ messages in thread From: Junio C Hamano @ 2015-09-04 18:18 UTC (permalink / raw) To: Jeff King; +Cc: Giuseppe Bilotta, Git List Jeff King <peff@peff.net> writes: > It shouldn't be necessary for $GIT_DIR, but it makes sense for other git > variables. E.g., with "AcceptEnv GIT_*", "git -c" config is propagated. > E.g.: > ... Just to make sure I got you correctly, you are saying that "we propagate, but that is not correct. We should stop doing so", right? ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [RFC] a tale of Git 2.5, ssh transport and GIT_* environment variables 2015-09-04 18:18 ` Junio C Hamano @ 2015-09-04 21:10 ` Giuseppe Bilotta 2015-09-04 21:44 ` Jeff King 1 sibling, 0 replies; 12+ messages in thread From: Giuseppe Bilotta @ 2015-09-04 21:10 UTC (permalink / raw) To: Junio C Hamano; +Cc: Jeff King, Git List On Fri, Sep 4, 2015 at 8:18 PM, Junio C Hamano <gitster@pobox.com> wrote: > Jeff King <peff@peff.net> writes: > >> It shouldn't be necessary for $GIT_DIR, but it makes sense for other git >> variables. E.g., with "AcceptEnv GIT_*", "git -c" config is propagated. >> E.g.: >> ... > > Just to make sure I got you correctly, you are saying that "we > propagate, but that is not correct. We should stop doing so", right? Indeed, we should not propagate them. For what we know so far, at least GIT_DIR and GIT_WORK_TREE should be removed from the ssh environment, in case the remote server has AcceptEnv GIT_* and an older git version. -- Giuseppe "Oblomov" Bilotta ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [RFC] a tale of Git 2.5, ssh transport and GIT_* environment variables 2015-09-04 18:18 ` Junio C Hamano 2015-09-04 21:10 ` Giuseppe Bilotta @ 2015-09-04 21:44 ` Jeff King 2015-09-04 22:40 ` [PATCH] git_connect: clear GIT_* environment for ssh Jeff King 1 sibling, 1 reply; 12+ messages in thread From: Jeff King @ 2015-09-04 21:44 UTC (permalink / raw) To: Junio C Hamano; +Cc: Giuseppe Bilotta, Git List On Fri, Sep 04, 2015 at 11:18:01AM -0700, Junio C Hamano wrote: > Jeff King <peff@peff.net> writes: > > > It shouldn't be necessary for $GIT_DIR, but it makes sense for other git > > variables. E.g., with "AcceptEnv GIT_*", "git -c" config is propagated. > > E.g.: > > ... > > Just to make sure I got you correctly, you are saying that "we > propagate, but that is not correct. We should stop doing so", right? Exactly. We do not propagate config over git:// or http:// (because we do not share our environment). Nor do we do so over same-machine connections (because we explicitly clean the environment). So ssh:// is the odd duck. -Peff ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH] git_connect: clear GIT_* environment for ssh 2015-09-04 21:44 ` Jeff King @ 2015-09-04 22:40 ` Jeff King 2015-09-05 13:50 ` Giuseppe Bilotta 0 siblings, 1 reply; 12+ messages in thread From: Jeff King @ 2015-09-04 22:40 UTC (permalink / raw) To: Junio C Hamano; +Cc: Giuseppe Bilotta, Git List On Fri, Sep 04, 2015 at 05:44:54PM -0400, Jeff King wrote: > > Just to make sure I got you correctly, you are saying that "we > > propagate, but that is not correct. We should stop doing so", right? > > Exactly. We do not propagate config over git:// or http:// (because we > do not share our environment). Nor do we do so over same-machine > connections (because we explicitly clean the environment). So ssh:// is > the odd duck. So here is the patch I would propose. I'm fairly certain it will solve Giuseppe's problem, though I think there is something else odd going on here that I don't understand. I'm worried that we're papering over another regression. Giuseppe, if you can still find time to do that bisect, it would be helpful. -- >8 -- Subject: git_connect: clear GIT_* environment for ssh When we "switch" to another local repository to run the server side of a fetch or push, we must clear the variables in local_repo_env so that our local $GIT_DIR, etc, do not pollute the upload-pack or receive-pack that is executing in the "remote" repository. We have never done so for ssh connections. For the most part, nobody has noticed because ssh will not pass unknown environment variables by default. However, it is not out of the question for a user to configure ssh to pass along GIT_* variables using SendEnv/AcceptEnv. We can demonstrate the problem by using "git -c" on a local command and seeing its impact on a remote repository. This config ends up in $GIT_CONFIG_PARAMETERS. In the local case, the config has no impact, but in the ssh transport, it does (our test script has a fake ssh that passes through all environment variables; this isn't normal, but does simulate one possible setup). Signed-off-by: Jeff King <peff@peff.net> --- connect.c | 4 ++-- t/t5507-remote-environment.sh | 34 ++++++++++++++++++++++++++++++++++ 2 files changed, 36 insertions(+), 2 deletions(-) create mode 100755 t/t5507-remote-environment.sh diff --git a/connect.c b/connect.c index c0144d8..962f990 100644 --- a/connect.c +++ b/connect.c @@ -721,6 +721,8 @@ struct child_process *git_connect(int fd[2], const char *url, strbuf_addch(&cmd, ' '); sq_quote_buf(&cmd, path); + /* remove repo-local variables from the environment */ + conn->env = local_repo_env; conn->in = conn->out = -1; if (protocol == PROTO_SSH) { const char *ssh; @@ -778,8 +780,6 @@ struct child_process *git_connect(int fd[2], const char *url, } argv_array_push(&conn->args, ssh_host); } else { - /* remove repo-local variables from the environment */ - conn->env = local_repo_env; conn->use_shell = 1; } argv_array_push(&conn->args, cmd.buf); diff --git a/t/t5507-remote-environment.sh b/t/t5507-remote-environment.sh new file mode 100755 index 0000000..e614929 --- /dev/null +++ b/t/t5507-remote-environment.sh @@ -0,0 +1,34 @@ +#!/bin/sh + +test_description='check environment showed to remote side of transports' +. ./test-lib.sh + +test_expect_success 'set up "remote" push situation' ' + test_commit one && + git config push.default current && + git init remote +' + +test_expect_success 'set up fake ssh' ' + GIT_SSH_COMMAND="f() { + cd \"\$TRASH_DIRECTORY\" && + eval \"\$2\" + }; f" && + export GIT_SSH_COMMAND && + export TRASH_DIRECTORY +' + +# due to receive.denyCurrentBranch=true +test_expect_success 'confirm default push fails' ' + test_must_fail git push remote +' + +test_expect_success 'config does not travel over same-machine push' ' + test_must_fail git -c receive.denyCurrentBranch=false push remote +' + +test_expect_success 'config does not travel over ssh push' ' + test_must_fail git -c receive.denyCurrentBranch=false push host:remote +' + +test_done -- 2.5.1.812.ge796bff ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH] git_connect: clear GIT_* environment for ssh 2015-09-04 22:40 ` [PATCH] git_connect: clear GIT_* environment for ssh Jeff King @ 2015-09-05 13:50 ` Giuseppe Bilotta 2015-09-08 8:33 ` [PATCH] git_connect: clarify conn->use_shell flag Jeff King 0 siblings, 1 reply; 12+ messages in thread From: Giuseppe Bilotta @ 2015-09-05 13:50 UTC (permalink / raw) To: Jeff King; +Cc: Junio C Hamano, Git List On Sat, Sep 5, 2015 at 12:40 AM, Jeff King <peff@peff.net> wrote: > So here is the patch I would propose. I'm fairly certain it will solve > Giuseppe's problem, though I think there is something else odd going on > here that I don't understand. I'm worried that we're papering over > another regression. Giuseppe, if you can still find time to do that > bisect, it would be helpful. So, as it happens I was able to script the check in a not-too-convoluted way using the same pair of machines where I came across the problem,, and it turns out that the culprit is as obvious as one would expect: d95138e695d99d32dcad528a2a7974f434c51e79 is the first bad commit commit d95138e695d99d32dcad528a2a7974f434c51e79 Author: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Date: Fri Jun 26 17:37:35 2015 +0700 setup: set env $GIT_WORK_TREE when work tree is set, like $GIT_DIR which matches my previous findings in that the problem was GIT_WORK_TREE being leaked. Since setting GIT_WORK_TREE when setting GIT_DIR is correct, the solution is indeed to clear GIT_* environment variables for ssh, as you propose. I tested your patch and indeed it fixes my problem. I still don't understand why you cannot replicate the issue on your side. One thing that is _extremely_ important in reproducing the problem is that the leaked GIT_WORK_TREE point to a non-existent (or otherwise inaccessible) directory in the server machine. For example, on my first attempt to reproduce I was trying to use my clone of the git repo, and it wouldn't work because I _did_ have a ~/src/git on both machines, even though I was pushing to a remote.machine:/tmp/test.git Would it be worth looking at the issue server-side too, as this looks like something that might have exploit potential? (At the very least, it could be used to test if/which directories the remote user has access to.) > -- >8 -- > Subject: git_connect: clear GIT_* environment for ssh > > When we "switch" to another local repository to run the server > side of a fetch or push, we must clear the variables in > local_repo_env so that our local $GIT_DIR, etc, do not > pollute the upload-pack or receive-pack that is executing in > the "remote" repository. I think we might want to mention GIT_WORK_TREE explicitly here, since this seems to be the one causing issues. > We have never done so for ssh connections. For the most > part, nobody has noticed because ssh will not pass unknown > environment variables by default. However, it is not out of > the question for a user to configure ssh to pass along GIT_* > variables using SendEnv/AcceptEnv. > > We can demonstrate the problem by using "git -c" on a local > command and seeing its impact on a remote repository. This > config ends up in $GIT_CONFIG_PARAMETERS. In the local case, > the config has no impact, but in the ssh transport, it does > (our test script has a fake ssh that passes through all > environment variables; this isn't normal, but does simulate > one possible setup). > > Signed-off-by: Jeff King <peff@peff.net> The patch works for me, so aside from the suggested commit message change, I'm all for it. Reviewed-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com> > --- > connect.c | 4 ++-- > t/t5507-remote-environment.sh | 34 ++++++++++++++++++++++++++++++++++ > 2 files changed, 36 insertions(+), 2 deletions(-) > create mode 100755 t/t5507-remote-environment.sh > > diff --git a/connect.c b/connect.c > index c0144d8..962f990 100644 > --- a/connect.c > +++ b/connect.c > @@ -721,6 +721,8 @@ struct child_process *git_connect(int fd[2], const char *url, > strbuf_addch(&cmd, ' '); > sq_quote_buf(&cmd, path); > > + /* remove repo-local variables from the environment */ > + conn->env = local_repo_env; A posteriori, this makes a lot of sense too: why single out a single protocol in the environment cleanup? > conn->in = conn->out = -1; > if (protocol == PROTO_SSH) { > const char *ssh; > @@ -778,8 +780,6 @@ struct child_process *git_connect(int fd[2], const char *url, > } > argv_array_push(&conn->args, ssh_host); > } else { > - /* remove repo-local variables from the environment */ > - conn->env = local_repo_env; > conn->use_shell = 1; Of course we're now left with just conn->use_shell = 1 in the non-ssh case. This could be moved in front of the if, and replaced with something like conn>use_shell = !(procol == PROTO_SSH), but maybe this would kill legibility. It's just that a single line i the else clause of a large if looks odd. -- Giuseppe "Oblomov" Bilotta ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH] git_connect: clarify conn->use_shell flag 2015-09-05 13:50 ` Giuseppe Bilotta @ 2015-09-08 8:33 ` Jeff King 2015-09-08 17:18 ` Junio C Hamano 0 siblings, 1 reply; 12+ messages in thread From: Jeff King @ 2015-09-08 8:33 UTC (permalink / raw) To: Giuseppe Bilotta; +Cc: Junio C Hamano, Git List On Sat, Sep 05, 2015 at 03:50:08PM +0200, Giuseppe Bilotta wrote: > which matches my previous findings in that the problem was > GIT_WORK_TREE being leaked. Since setting GIT_WORK_TREE when setting > GIT_DIR is correct, the solution is indeed to clear GIT_* environment > variables for ssh, as you propose. I tested your patch and indeed it > fixes my problem. Yeah, that makes sense. > I still don't understand why you cannot replicate the issue on your > side. One thing that is _extremely_ important in reproducing the > problem is that the leaked GIT_WORK_TREE point to a non-existent (or > otherwise inaccessible) directory in the server machine. For example, > on my first attempt to reproduce I was trying to use my clone of the > git repo, and it wouldn't work because I _did_ have a ~/src/git on > both machines, even though I was pushing to a > remote.machine:/tmp/test.git I checked that the path in question is not accessible in my test setup. I'm also confused why it doesn't replicate for me. I noticed that your original report shows receive-pack behaving well, but the child unpack-objects complaining. That process should not care about GIT_WORK_TREE either, though. > Would it be worth looking at the issue server-side too, as this looks > like something that might have exploit potential? (At the very least, > it could be used to test if/which directories the remote user has > access to.) I think if anything, the server side should be clearing GIT_DIR, GIT_WORK_TREE, etc, when it does enter_repo(). The point of that function is to enter a repository which is given externally (generally on the command-line, though in the case of git-daemon, across the socket). Clearing any existing local-repo environment variables seems like it should be part of that. That's as easy as: diff --git a/path.c b/path.c index 95acbaf..395a75d 100644 --- a/path.c +++ b/path.c @@ -383,6 +383,10 @@ const char *enter_repo(const char *path, int strict) { static char used_path[PATH_MAX]; static char validated_path[PATH_MAX]; + const char * const *env; + + for (env = local_repo_env; *env; env++) + unsetenv(*env); if (!path) return NULL; but I suspect it would break some esoteric cases. E.g.: git clone -u 'git -c some.option=true upload-pack' foo.git does what you'd expect now. With that patch, upload-pack would clear its env before entering "foo.git", and would ignore it. It would probably be OK to clean GIT_DIR (since we overwrite it in enter_repo anyway) and GIT_WORK_TREE (since upload-pack, etc, should not care about it). But the others are much more confusing (should we clear GIT_ALTERNATE_OBJECT_DIRECTORIES? You would not want that to bleed over between repositories, but setting it manually seems like a legitimate, if uncommon, thing to do). So I'm hesitant to do a patch like that because it seems like the wrong layer. It's at the interface between the old and new repos that we need clean these up. And we are (with my other patch) doing that on the client side. If the server side wants to enforce some protection, that's a good idea, too. But it should be happening at the ssh layer; i.e., disabling "AcceptEnv GIT_*". > > -- >8 -- > > Subject: git_connect: clear GIT_* environment for ssh > > > > When we "switch" to another local repository to run the server > > side of a fetch or push, we must clear the variables in > > local_repo_env so that our local $GIT_DIR, etc, do not > > pollute the upload-pack or receive-pack that is executing in > > the "remote" repository. > > I think we might want to mention GIT_WORK_TREE explicitly here, since > this seems to be the one causing issues. Heh. I avoided doing so because I could not convince it to cause an issue (and even with that aside, the config thing is wrong by itself, and I _could_ test that). > The patch works for me, so aside from the suggested commit message > change, I'm all for it. Thanks for testing/reviewing. > > @@ -778,8 +780,6 @@ struct child_process *git_connect(int fd[2], const char *url, > > } > > argv_array_push(&conn->args, ssh_host); > > } else { > > - /* remove repo-local variables from the environment */ > > - conn->env = local_repo_env; > > conn->use_shell = 1; > > Of course we're now left with just conn->use_shell = 1 in the non-ssh > case. This could be moved in front of the if, and replaced with > something like conn>use_shell = !(procol == PROTO_SSH), but maybe this > would kill legibility. It's just that a single line i the else clause > of a large if looks odd. Yeah, I noticed that, too, but I wanted to keep the diff minimal. Maybe it is worth doing this cleanup on top. -- >8 -- Subject: [PATCH] git_connect: clarify conn->use_shell flag When executing user-specified programs, we generally always want to use a shell, for flexibility and consistency. One big exception is executing $GIT_SSH, which for historical reasons must not use a shell. Once upon a time the logic in git_connect looked like: if (protocol == PROTO_SSH) { ... setup ssh ... } else { ... setup local connection ... conn->use_shell = 1; } But over time the PROTO_SSH block has grown, and the "local" block has shrunk so that it contains only conn->use_shell; it's easy to miss at the end of the large block. Moreover, PROTO_SSH now also sometimes sets use_shell, when the new GIT_SSH_COMMAND is used. To make the flow easier to follow, let's just set conn->use_shell when we're setting up the "conn" struct, and unset it (with a comment) in the historical GIT_SSH case. Note that for clarity we leave "use_shell" on in the case that we fall back to our default "ssh" This looks like a behavior change, but in practice run-command.c optimizes shell invocations without metacharacters into a straight execve anyway. Signed-off-by: Jeff King <peff@peff.net> --- connect.c | 17 +++++++++++------ 1 file changed, 11 insertions(+), 6 deletions(-) diff --git a/connect.c b/connect.c index 962f990..6f3f85e 100644 --- a/connect.c +++ b/connect.c @@ -723,6 +723,7 @@ struct child_process *git_connect(int fd[2], const char *url, /* remove repo-local variables from the environment */ conn->env = local_repo_env; + conn->use_shell = 1; conn->in = conn->out = -1; if (protocol == PROTO_SSH) { const char *ssh; @@ -748,15 +749,21 @@ struct child_process *git_connect(int fd[2], const char *url, } ssh = getenv("GIT_SSH_COMMAND"); - if (ssh) { - conn->use_shell = 1; + if (ssh) putty = 0; - } else { + else { const char *base; char *ssh_dup; + /* + * GIT_SSH is the no-shell version of + * GIT_SSH_COMMAND (and must remain so for + * historical compatibility). + */ ssh = getenv("GIT_SSH"); - if (!ssh) + if (ssh) + conn->use_shell = 0; + else ssh = "ssh"; ssh_dup = xstrdup(ssh); @@ -779,8 +786,6 @@ struct child_process *git_connect(int fd[2], const char *url, argv_array_push(&conn->args, port); } argv_array_push(&conn->args, ssh_host); - } else { - conn->use_shell = 1; } argv_array_push(&conn->args, cmd.buf); -- 2.6.0.rc0.358.gaf70435 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH] git_connect: clarify conn->use_shell flag 2015-09-08 8:33 ` [PATCH] git_connect: clarify conn->use_shell flag Jeff King @ 2015-09-08 17:18 ` Junio C Hamano 2015-09-08 21:40 ` Jeff King 0 siblings, 1 reply; 12+ messages in thread From: Junio C Hamano @ 2015-09-08 17:18 UTC (permalink / raw) To: Jeff King; +Cc: Giuseppe Bilotta, Git List Jeff King <peff@peff.net> writes: > Subject: [PATCH] git_connect: clarify conn->use_shell flag > > When executing user-specified programs, we generally always > want to use a shell, for flexibility and consistency. One > big exception is executing $GIT_SSH, which for historical > reasons must not use a shell. > > Once upon a time the logic in git_connect looked like: > > if (protocol == PROTO_SSH) { > ... setup ssh ... > } else { > ... setup local connection ... > conn->use_shell = 1; > } > > But over time the PROTO_SSH block has grown, and the "local" > block has shrunk so that it contains only conn->use_shell; > it's easy to miss at the end of the large block. Moreover, > PROTO_SSH now also sometimes sets use_shell, when the new > GIT_SSH_COMMAND is used. > > To make the flow easier to follow, let's just set > conn->use_shell when we're setting up the "conn" struct, and > unset it (with a comment) in the historical GIT_SSH case. Makes perfect sense. I think another thing that falls into the same class wrt readability is 'putty'; if the code does putty=0 at the beginning of "various flavours of SSH", and sets it only when it checks with "various flavours of plink" when GIT_SSH_COMMAND is not set, the logic would be even clearer, I suspect. > Note that for clarity we leave "use_shell" on in the case > that we fall back to our default "ssh" This looks like a > behavior change, but in practice run-command.c optimizes > shell invocations without metacharacters into a straight > execve anyway. Hmm, interesting. I am not sure if that has to be the way, though. Wouldn't the resulting code become simpler if you do not do that? That is, is is what I have in mind on top of your patch. Did I break something? connect.c | 17 ++++++++--------- 1 file changed, 8 insertions(+), 9 deletions(-) diff --git a/connect.c b/connect.c index 6f3f85e..acd39d7 100644 --- a/connect.c +++ b/connect.c @@ -727,7 +727,7 @@ struct child_process *git_connect(int fd[2], const char *url, conn->in = conn->out = -1; if (protocol == PROTO_SSH) { const char *ssh; - int putty, tortoiseplink = 0; + int putty = 0, tortoiseplink = 0; char *ssh_host = hostandport; const char *port = NULL; get_host_and_port(&ssh_host, &port); @@ -749,9 +749,7 @@ struct child_process *git_connect(int fd[2], const char *url, } ssh = getenv("GIT_SSH_COMMAND"); - if (ssh) - putty = 0; - else { + if (!ssh) { const char *base; char *ssh_dup; @@ -760,10 +758,10 @@ struct child_process *git_connect(int fd[2], const char *url, * GIT_SSH_COMMAND (and must remain so for * historical compatibility). */ + conn->use_shell = 0; + ssh = getenv("GIT_SSH"); - if (ssh) - conn->use_shell = 0; - else + if (!ssh) ssh = "ssh"; ssh_dup = xstrdup(ssh); @@ -771,8 +769,9 @@ struct child_process *git_connect(int fd[2], const char *url, tortoiseplink = !strcasecmp(base, "tortoiseplink") || !strcasecmp(base, "tortoiseplink.exe"); - putty = !strcasecmp(base, "plink") || - !strcasecmp(base, "plink.exe") || tortoiseplink; + putty = tortoiseplink || + !strcasecmp(base, "plink") || + !strcasecmp(base, "plink.exe"); free(ssh_dup); } ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH] git_connect: clarify conn->use_shell flag 2015-09-08 17:18 ` Junio C Hamano @ 2015-09-08 21:40 ` Jeff King 0 siblings, 0 replies; 12+ messages in thread From: Jeff King @ 2015-09-08 21:40 UTC (permalink / raw) To: Junio C Hamano; +Cc: Giuseppe Bilotta, Git List On Tue, Sep 08, 2015 at 10:18:41AM -0700, Junio C Hamano wrote: > > To make the flow easier to follow, let's just set > > conn->use_shell when we're setting up the "conn" struct, and > > unset it (with a comment) in the historical GIT_SSH case. > > Makes perfect sense. I think another thing that falls into the same > class wrt readability is 'putty'; if the code does putty=0 at the > beginning of "various flavours of SSH", and sets it only when it > checks with "various flavours of plink" when GIT_SSH_COMMAND is not > set, the logic would be even clearer, I suspect. Yeah, I think so. > > Note that for clarity we leave "use_shell" on in the case > > that we fall back to our default "ssh" This looks like a > > behavior change, but in practice run-command.c optimizes > > shell invocations without metacharacters into a straight > > execve anyway. > > Hmm, interesting. I am not sure if that has to be the way, though. > Wouldn't the resulting code become simpler if you do not do that? Heh, I originally wrote it that way, and waffled between sending one or the other. > That is, is is what I have in mind on top of your patch. Did I > break something? > > connect.c | 17 ++++++++--------- > 1 file changed, 8 insertions(+), 9 deletions(-) I think both changes are correct, and the result looks nice to read. Feel free to squash them in as you apply. -Peff ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2015-09-08 21:40 UTC | newest] Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2015-09-04 10:52 [RFC] a tale of Git 2.5, ssh transport and GIT_* environment variables Giuseppe Bilotta 2015-09-04 12:54 ` Jeff King 2015-09-04 14:02 ` Giuseppe Bilotta 2015-09-04 14:26 ` Jeff King 2015-09-04 18:18 ` Junio C Hamano 2015-09-04 21:10 ` Giuseppe Bilotta 2015-09-04 21:44 ` Jeff King 2015-09-04 22:40 ` [PATCH] git_connect: clear GIT_* environment for ssh Jeff King 2015-09-05 13:50 ` Giuseppe Bilotta 2015-09-08 8:33 ` [PATCH] git_connect: clarify conn->use_shell flag Jeff King 2015-09-08 17:18 ` Junio C Hamano 2015-09-08 21:40 ` Jeff King
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.