git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* $GIT_DIR is no longer set when pre-commit hooks are called
@ 2018-08-22 23:16 Gregory Oschwald
  2018-08-26  0:41 ` Jeff King
  0 siblings, 1 reply; 5+ messages in thread
From: Gregory Oschwald @ 2018-08-22 23:16 UTC (permalink / raw)
  To: git

As of the release of 2.18.0, $GIT_DIR is no longer set before calling
pre-commit hooks. This change was introduced in "set_work_tree: use
chdir_notify" (8500e0de) and is still present in master.

I reviewed the discussion when this change was initially submitted,
and I don't think this behavior change was intentional.

I have several hooks that were broken like this, and from a quick
Google search, using $GIT_DIR without setting a default does seem
pretty common in hooks.

Greg

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: $GIT_DIR is no longer set when pre-commit hooks are called
  2018-08-22 23:16 $GIT_DIR is no longer set when pre-commit hooks are called Gregory Oschwald
@ 2018-08-26  0:41 ` Jeff King
  2018-08-27 16:25   ` Johannes Schindelin
  0 siblings, 1 reply; 5+ messages in thread
From: Jeff King @ 2018-08-26  0:41 UTC (permalink / raw)
  To: Gregory Oschwald; +Cc: git

On Wed, Aug 22, 2018 at 04:16:00PM -0700, Gregory Oschwald wrote:

> As of the release of 2.18.0, $GIT_DIR is no longer set before calling
> pre-commit hooks. This change was introduced in "set_work_tree: use
> chdir_notify" (8500e0de) and is still present in master.
> 
> I reviewed the discussion when this change was initially submitted,
> and I don't think this behavior change was intentional.

It was sort-of intentional and sort-of not.

We never set intentionally GIT_DIR for hooks in the first place, but it
would sometimes end up set as a side effect of other operations. So some
hooks might see it and some might not. In the case of the pre-commit
hook, I think it was probably set consistently, since git-commit
requires a working tree, and setup_work_tree() used to set it as an
accidental side effect of its absolute/relative adjustments.

The "right" way to find the directory has always been "git rev-parse
--git-dir" (which will use GIT_DIR if set, and otherwise do the normal
discovery process).

However, I am sympathetic to the breakage of existing hooks. AFAICT that
while unintentional, we've probably been consistently setting $GIT_DIR
in hooks for commands with work-trees since 2008, as a side effect of
044bbbcb63 (Make git_dir a path relative to work_tree in
setup_work_tree(), 2008-06-19). Although I am slightly confused by this
earlier thread where the OP complained that the variable is set only
sometimes:

  https://public-inbox.org/git/CAEDDsWdXQ1+UukvbfRoTPzY3Y9sOaxQ7nh+qL_Mcuy3=XKKh7w@mail.gmail.com/

(and there the preferred behavior is actually _not_ to have it set,
because it's a gotcha when chdir-ing to another repo).

If we want to keep supporting this case, then I think we should be doing
it consistently, by setting $GIT_DIR in the child environment as we run
the hook. Something like the patch below, but ideally we'd apply it
consistently to all hooks (of course, that would break any hooks that
chdir to a new repo without resetting GIT_DIR, but such hooks are
already iffy, as they may already sometimes see GIT_DIR set in the
environment).

diff --git a/builtin/commit.c b/builtin/commit.c
index 3bfeabc463..3670024a25 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -1440,6 +1440,7 @@ int run_commit_hook(int editor_is_used, const char *index_file, const char *name
 	int ret;
 
 	argv_array_pushf(&hook_env, "GIT_INDEX_FILE=%s", index_file);
+	argv_array_pushf(&hook_env, "GIT_DIR=%s", get_git_dir());
 
 	/*
 	 * Let the hook know that no editor will be launched.

-Peff

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: $GIT_DIR is no longer set when pre-commit hooks are called
  2018-08-26  0:41 ` Jeff King
@ 2018-08-27 16:25   ` Johannes Schindelin
  2018-08-27 23:37     ` Jeff King
  0 siblings, 1 reply; 5+ messages in thread
From: Johannes Schindelin @ 2018-08-27 16:25 UTC (permalink / raw)
  To: Jeff King; +Cc: Gregory Oschwald, git

Hi Peff,

On Sat, 25 Aug 2018, Jeff King wrote:

> On Wed, Aug 22, 2018 at 04:16:00PM -0700, Gregory Oschwald wrote:
> 
> diff --git a/builtin/commit.c b/builtin/commit.c
> index 3bfeabc463..3670024a25 100644
> --- a/builtin/commit.c
> +++ b/builtin/commit.c
> @@ -1440,6 +1440,7 @@ int run_commit_hook(int editor_is_used, const char *index_file, const char *name
>  	int ret;
>  
>  	argv_array_pushf(&hook_env, "GIT_INDEX_FILE=%s", index_file);
> +	argv_array_pushf(&hook_env, "GIT_DIR=%s", get_git_dir());

We did something similar in sequencer.c, and it required setting
`GIT_WORK_TREE`, too, to avoid problems in worktrees. Do you need the same
here, too?

Ciao,
Dscho

>  	/*
>  	 * Let the hook know that no editor will be launched.
> 
> -Peff
> 

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: $GIT_DIR is no longer set when pre-commit hooks are called
  2018-08-27 16:25   ` Johannes Schindelin
@ 2018-08-27 23:37     ` Jeff King
  2018-08-28 12:50       ` Johannes Schindelin
  0 siblings, 1 reply; 5+ messages in thread
From: Jeff King @ 2018-08-27 23:37 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Gregory Oschwald, git

On Mon, Aug 27, 2018 at 06:25:26PM +0200, Johannes Schindelin wrote:

> On Sat, 25 Aug 2018, Jeff King wrote:
> 
> > On Wed, Aug 22, 2018 at 04:16:00PM -0700, Gregory Oschwald wrote:
> > 
> > diff --git a/builtin/commit.c b/builtin/commit.c
> > index 3bfeabc463..3670024a25 100644
> > --- a/builtin/commit.c
> > +++ b/builtin/commit.c
> > @@ -1440,6 +1440,7 @@ int run_commit_hook(int editor_is_used, const char *index_file, const char *name
> >  	int ret;
> >  
> >  	argv_array_pushf(&hook_env, "GIT_INDEX_FILE=%s", index_file);
> > +	argv_array_pushf(&hook_env, "GIT_DIR=%s", get_git_dir());
> 
> We did something similar in sequencer.c, and it required setting
> `GIT_WORK_TREE`, too, to avoid problems in worktrees. Do you need the same
> here, too?

Hmm, good point. Maybe. If we're just trying to restore the original
behavior (i.e., fix a regression), then this would behave as the
original.

I'm not at all opposed to going beyond that and actually fixing more
cases. But I am a little nervous of introducing a new regression, given
the subtlety around some of our startup code.

My current understanding is:

  - If we're setting GIT_DIR, it's probably always save to set
    GIT_WORK_TREE (or GIT_IMPLICIT_WORK_TREE=0 if there isn't a
    worktree). I.e., there is no case I know that would be broken by
    this.

  - Passing GIT_DIR to any sub-process operating in the context of our
    repo (i.e., excluding cases where we're clearing local_repo_env) can
    break a script that expects to do:

      cd /another/repo
      git ...

    and operate in /another/repo. But such a script is already broken at
    least in some cases, because running the initial command as:

      git --git-dir=/some/repo

    will mean that GIT_DIR is set. So a hook depending on that is
    already broken, though it may be small consolation to somebody whose
    hook happened to work most of the time, because they do not pass in
    GIT_DIR.

  - Any command that uses setup_work_tree() (which includes things with
    NEED_WORK_TREE in git.c) would have always been setting GIT_DIR up
    through v2.17.x. So any hooks they run would have seen it
    consistently, and therefore are exempt from being regressed in the
    case above.

So I _think_ it would be safe to:

  1. Set GIT_DIR again anytime we call setup_work_tree(), because that's
     what has always happened.

  2. Start setting GIT_WORK_TREE at the same time. This didn't happen
     before, but it can't break anybody, and might help.

That makes the rules for when GIT_DIR is set confusing, but at least it
shouldn't regress. A more consistent rule of "hooks always see GIT_DIR"
(or even "any sub-process always sees...") is easier to explain, but
might break people in practice.

I notice also that sequencer.c sets an absolute path, since 09d7b6c6fa
(sequencer: pass absolute GIT_DIR to exec commands, 2017-10-31). That
does seem friendlier, though I wonder if could break any scripts. I
cannot think of a case, unless the intermediate paths were no accessible
to the uid running the process (but then, how would Git have generated
the absolute path in the first place?).

-Peff

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: $GIT_DIR is no longer set when pre-commit hooks are called
  2018-08-27 23:37     ` Jeff King
@ 2018-08-28 12:50       ` Johannes Schindelin
  0 siblings, 0 replies; 5+ messages in thread
From: Johannes Schindelin @ 2018-08-28 12:50 UTC (permalink / raw)
  To: Jeff King; +Cc: Gregory Oschwald, git

Hi Peff,

On Mon, 27 Aug 2018, Jeff King wrote:

> On Mon, Aug 27, 2018 at 06:25:26PM +0200, Johannes Schindelin wrote:
> 
> > On Sat, 25 Aug 2018, Jeff King wrote:
> > 
> > > On Wed, Aug 22, 2018 at 04:16:00PM -0700, Gregory Oschwald wrote:
> > > 
> > > diff --git a/builtin/commit.c b/builtin/commit.c
> > > index 3bfeabc463..3670024a25 100644
> > > --- a/builtin/commit.c
> > > +++ b/builtin/commit.c
> > > @@ -1440,6 +1440,7 @@ int run_commit_hook(int editor_is_used, const char *index_file, const char *name
> > >  	int ret;
> > >  
> > >  	argv_array_pushf(&hook_env, "GIT_INDEX_FILE=%s", index_file);
> > > +	argv_array_pushf(&hook_env, "GIT_DIR=%s", get_git_dir());
> > 
> > We did something similar in sequencer.c, and it required setting
> > `GIT_WORK_TREE`, too, to avoid problems in worktrees. Do you need the same
> > here, too?
> 
> Hmm, good point. Maybe. If we're just trying to restore the original
> behavior (i.e., fix a regression), then this would behave as the
> original.
> 
> I'm not at all opposed to going beyond that and actually fixing more
> cases. But I am a little nervous of introducing a new regression, given
> the subtlety around some of our startup code.

I concur about the subtlety around the startup code. Quite a bit of that
mess is now a bit easier to grok, thanks to my work to discover the
GIT_DIR gently, but there are still monsters lurking in that code.

Having said that, I don't think that we can get away with setting GIT_DIR
without GIT_WORK_TREE here. That would *definitely* introduce a bug, as
far as I can tell.

> My current understanding is:
> 
>   - If we're setting GIT_DIR, it's probably always save to set

With s/save/safe/, I agree.

>     GIT_WORK_TREE (or GIT_IMPLICIT_WORK_TREE=0 if there isn't a
>     worktree). I.e., there is no case I know that would be broken by
>     this.
> 
>   - Passing GIT_DIR to any sub-process operating in the context of our
>     repo (i.e., excluding cases where we're clearing local_repo_env) can
>     break a script that expects to do:
> 
>       cd /another/repo
>       git ...
> 
>     and operate in /another/repo. But such a script is already broken at
>     least in some cases, because running the initial command as:
> 
>       git --git-dir=/some/repo
> 
>     will mean that GIT_DIR is set.

Since this mailing list often indulges in tangent fest, I'll allow myself
this one: `git --git-dir=...` should probably handle the case where
GIT_DIR/GIT_WORK_TREE were set differently when the command was started,
and if so, unset them.

>     already broken, though it may be small consolation to somebody whose
>     hook happened to work most of the time, because they do not pass in
>     GIT_DIR.

I seem to remember two reports about such funny way to do things. So they
exist, I agree.

>   - Any command that uses setup_work_tree() (which includes things with
>     NEED_WORK_TREE in git.c) would have always been setting GIT_DIR up
>     through v2.17.x. So any hooks they run would have seen it
>     consistently, and therefore are exempt from being regressed in the
>     case above.
> 
> So I _think_ it would be safe to:
> 
>   1. Set GIT_DIR again anytime we call setup_work_tree(), because that's
>      what has always happened.
> 
>   2. Start setting GIT_WORK_TREE at the same time. This didn't happen
>      before, but it can't break anybody, and might help.

With s/might help/will help in some edge cases/, I agree.

> That makes the rules for when GIT_DIR is set confusing, but at least it
> shouldn't regress. A more consistent rule of "hooks always see GIT_DIR"
> (or even "any sub-process always sees...") is easier to explain, but
> might break people in practice.

Indeed.

> I notice also that sequencer.c sets an absolute path, since 09d7b6c6fa
> (sequencer: pass absolute GIT_DIR to exec commands, 2017-10-31). That
> does seem friendlier, though I wonder if could break any scripts. I
> cannot think of a case, unless the intermediate paths were no accessible
> to the uid running the process (but then, how would Git have generated
> the absolute path in the first place?).

Please note that (IIRC) this commit just reinstated the handling in
git-sh-setup, which always set GIT_DIR to an absolute path. Scripts used
in conjunction with `git rebase -i` would therefore have had to expect
this.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2018-08-28 12:50 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-08-22 23:16 $GIT_DIR is no longer set when pre-commit hooks are called Gregory Oschwald
2018-08-26  0:41 ` Jeff King
2018-08-27 16:25   ` Johannes Schindelin
2018-08-27 23:37     ` Jeff King
2018-08-28 12:50       ` Johannes Schindelin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).