git.vger.kernel.org archive mirror
* Problem: staging of an alternative repository
@ 2014-04-30 21:22 Pasha Bolokhov
  2014-04-30 21:35 ` Jonathan Nieder
  0 siblings, 1 reply; 7+ messages in thread
From: Pasha Bolokhov @ 2014-04-30 21:22 UTC
  To: git

        Hi

    It turns out Git treats the directory '.git' differently enough
from everything else. That may be ok, but here's one place where I
encountered an unpleasant (and imho unexpected) behaviour:

    if you supply a different repository base name, say, '.git_new',
by either setting GIT_DIR or using the '--git-dir' option, Git 'add'
will not make any exception for it and think of it as a new (weird)
directory. In particular, 'git add -A' followed by a commit will
add this repository into itself with all its guts.
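
The behaviour described above can be mimicked with a small Python sketch. It is not Git's code: `list_addable` is a hypothetical stand-in for the directory walk behind 'git add -A', and the only rule it models is the one discussed in this thread, namely that a directory named exactly '.git' is skipped wherever it appears.

```python
import os
import tempfile

def list_addable(work_tree):
    """Return relative paths a naive 'add everything' walk would pick up.

    Hypothetical model: only directories named exactly '.git' are skipped.
    """
    found = []
    for root, dirs, files in os.walk(work_tree):
        # Prune '.git' at any depth; no other name is treated specially.
        dirs[:] = [d for d in dirs if d != ".git"]
        for name in files:
            rel = os.path.relpath(os.path.join(root, name), work_tree)
            found.append(rel.replace(os.sep, "/"))
    return sorted(found)

def demo():
    """Build a throwaway tree with a '.git' and a '.git_new' and walk it."""
    with tempfile.TemporaryDirectory() as tree:
        for path in ("file.txt", ".git/HEAD", ".git_new/HEAD", "sub/.git/HEAD"):
            full = os.path.join(tree, path)
            os.makedirs(os.path.dirname(full), exist_ok=True)
            with open(full, "w") as fh:
                fh.write("x\n")
        return list_addable(tree)

print(demo())
```

In this model both '.git' directories are left out, while the contents of '.git_new' are listed like any other file.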

    Now I know, the '--git-dir' option may usually be meant to use
when the repository is somewhere outside of the work tree, and such a
problem would not arise. And even if it is inside, sure enough, you
can add this '.git_new' to the ignores or excludes. But is this really
what you expect?

    I would like to offer to fix this behaviour myself (it is rooted
in 'dir.c'). However, there are uncertainties, and I'm asking for
opinions.

    Apparently, the assumption that the repository is in '.git' has
propagated far enough. In particular, every '.git' within the working
tree seems to be ignored for the purpose of staging. Is this
consistent behaviour? And perhaps there are a million more places
where the name '.git' is hard-coded, and it might be reasonable to
question the legitimacy of that. Or, conversely, to what degree or
depth (in the source code) does one *expect* Git to rename all its
hard-coded '.git's into '.gut's when a "GIT_DIR=.gut" is supplied?

   cheers
Pavel

* Re: Problem: staging of an alternative repository
  2014-04-30 21:22 Problem: staging of an alternative repository Pasha Bolokhov
@ 2014-04-30 21:35 ` Jonathan Nieder
  2014-05-02  5:23   ` Pasha Bolokhov
  2014-05-02  6:20   ` Duy Nguyen
  0 siblings, 2 replies; 7+ messages in thread
From: Jonathan Nieder @ 2014-04-30 21:35 UTC
  To: Pasha Bolokhov; +Cc: git

Hi Pavel,

Pasha Bolokhov wrote:

>     It turns out Git treats the directory '.git' differently enough
> from everything else. That may be ok,

Yeah, it's intended.

[...]
>     if you supply a different repository base name, say, '.git_new',
> by either setting GIT_DIR or using the '--git-dir' option, Git 'add'
> will not make any exception for it and think of it as a new (weird)
> directory.

Yep, a git repository metadata directory named .git_new is not special
in any way and you can use "git add" to track it if you want (for
example to add a testcase).

[...]
>     Now I know, the '--git-dir' option may usually be meant to use
> when the repository is somewhere outside of the work tree, and such a
> problem would not arise. And even if it is inside, sure enough, you
> can add this '.git_new' to the ignores or excludes. But is this really
> what you expect?

I think it's more that it never came up.  Excluding the current
$GIT_DIR from what "git add" can add (on top of the current rule of
excluding all instances of ".git") seems like a sensible change,
assuming it can be done without hurting the code too much. ;-)

But as you note, you are not using $GIT_DIR the way it was intended to
be used.

Thanks and hope that helps,
Jonathan

* Re: Problem: staging of an alternative repository
  2014-04-30 21:35 ` Jonathan Nieder
@ 2014-05-02  5:23   ` Pasha Bolokhov
  2014-05-02  6:20   ` Duy Nguyen
  1 sibling, 0 replies; 7+ messages in thread
From: Pasha Bolokhov @ 2014-05-02  5:23 UTC
  To: Jonathan Nieder; +Cc: git

    Hi Jonathan

Thanks for the answers

> I think it's more that it never came up.  Excluding the current
> $GIT_DIR from what "git add" can add (on top of the current rule of
> excluding all instances of ".git") seems like a sensible change,
> assuming it can be done without hurting the code too much. ;-)

I did notice that it has come up in the forums, in related but less
obvious forms.

Anyway, this can hopefully be easily fixed and I can look into it. My
understanding is that, unlike the special treatment of ".git", the
alternative repository (call it '.gut') should only be "ignored" at
the *top* of the work tree and not anywhere deeper inside. And of
course, the special treatment (that is, the ignoring) of ".git" should
be kept as it is. Am I right?

   regards
Pavel

* Re: Problem: staging of an alternative repository
  2014-04-30 21:35 ` Jonathan Nieder
  2014-05-02  5:23   ` Pasha Bolokhov
@ 2014-05-02  6:20   ` Duy Nguyen
  2014-05-07 20:51     ` Pasha Bolokhov
  2014-05-17 16:31     ` Pasha Bolokhov
  1 sibling, 2 replies; 7+ messages in thread
From: Duy Nguyen @ 2014-05-02  6:20 UTC
  To: Jonathan Nieder; +Cc: Pasha Bolokhov, Git Mailing List

On Thu, May 1, 2014 at 4:35 AM, Jonathan Nieder <jrnieder@gmail.com> wrote:
>>     Now I know, the '--git-dir' option may usually be meant to use
>> when the repository is somewhere outside of the work tree, and such a
>> problem would not arise. And even if it is inside, sure enough, you
>> can add this '.git_new' to the ignores or excludes. But is this really
>> what you expect?
>
> I think it's more that it never came up.  Excluding the current
> $GIT_DIR from what "git add" can add (on top of the current rule of
> excluding all instances of ".git") seems like a sensible change,
> assuming it can be done without hurting the code too much. ;-)

I think it came up before. Changes could be very messy (but I did not
check carefully) because right now we just compare $(basename $path)
with ".git", one path component, simple and easy. Checking against
$GIT_DIR means all path components. You also have to deal with
relative and absolute paths and symlinks in some path components. You
may also need to think about whether the submodule detection code
(checking ".git" again) is impacted. On top of that, the
read_directory() code is already messy (or at least scary to me) with
all kinds of shortcuts we have added over the years. A simpler
solution may be to ignore all directories whose last component is
"$GIT_DIR_NAME" (e.g. GIT_DIR_NAME=.git_new).
-- 
Duy

* Re: Problem: staging of an alternative repository
  2014-05-02  6:20   ` Duy Nguyen
@ 2014-05-07 20:51     ` Pasha Bolokhov
  2014-05-17 16:31     ` Pasha Bolokhov
  1 sibling, 0 replies; 7+ messages in thread
From: Pasha Bolokhov @ 2014-05-07 20:51 UTC
  To: Duy Nguyen; +Cc: Jonathan Nieder, Git Mailing List

    Hi,

I've looked more closely; here are my observations and the
resulting suggestions:

- I suggest checking the given "path" against $GIT_DIR only
*string-wise*. Either or both may be relative paths (so the comparison
is best performed after converting them to absolute paths). That is
the only solution which gives predictable results, but it does not
handle symlinks: if two differing paths actually point to the same
location through symlinks, the comparison will report a mismatch. Only
on an exact string-wise match do we ignore the "path".

- With this way of comparison, only the root ".git_new" will be
ignored. Submodules will likely contain the usual ".git". Since the
code normally ignores ".git" anyway, and I do not intend to change
that, I don't see how submodule detection can be affected

- The problem of resolving symlinks is, in the most general case,
unsolvable (e.g. imagine one of the symlinks points to another
filesystem which may be up or down depending on the day of the week:
it's easy to construct a scenario where symlinks resolve (or even fail
to resolve) differently on different runs).

- Even if it were solvable, the current implementation of handling
".git" certainly does not check any symlinks:
        $ mv -i .git .metadata
        $ ln -s .metadata .git
  Then "git add -A" will certainly grab all of ".metadata" and store it
  in the repository itself.
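
The string-wise comparison proposed in the first bullet, and its symlink blind spot, can be sketched in Python. This is an illustration only, not code from 'dir.c' or 'abspath.c'; the helper name `same_path_stringwise` is made up here, and the demo assumes a platform where creating symlinks is permitted.

```python
import os
import tempfile

def same_path_stringwise(path, git_dir, cwd):
    """Compare two paths as strings after making both absolute.

    Hypothetical helper: symlinks are deliberately NOT resolved, so two
    different spellings of one location compare as different.
    """
    def absolute(p):
        # Purely lexical: join with cwd, then collapse '.' and '..'.
        return os.path.normpath(os.path.join(cwd, p))
    return absolute(path) == absolute(git_dir)

def demo():
    """Recreate the '.metadata' plus '.git' symlink layout from the bullet."""
    with tempfile.TemporaryDirectory() as tree:
        os.mkdir(os.path.join(tree, ".metadata"))
        os.symlink(".metadata", os.path.join(tree, ".git"))
        stringwise = same_path_stringwise(".git", ".metadata", tree)
        resolved = (os.path.realpath(os.path.join(tree, ".git"))
                    == os.path.realpath(os.path.join(tree, ".metadata")))
        return stringwise, resolved

print(same_path_stringwise(".git_new", "./.git_new", "/work"))
print(same_path_stringwise("sub/.git_new", ".git_new", "/work"))
print(demo())
```

The lexical comparison treats '.git_new' and './.git_new' as the same path and rejects 'sub/.git_new', but it also reports the symlink and its target as different even though resolving them yields one location.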

  Please let me know what you think. Again, I can try to implement
this suggestion carefully.

Pavel



* Re: Problem: staging of an alternative repository
  2014-05-02  6:20   ` Duy Nguyen
  2014-05-07 20:51     ` Pasha Bolokhov
@ 2014-05-17 16:31     ` Pasha Bolokhov
  2014-05-19 10:05       ` Duy Nguyen
  1 sibling, 1 reply; 7+ messages in thread
From: Pasha Bolokhov @ 2014-05-17 16:31 UTC
  To: Duy Nguyen; +Cc: Jonathan Nieder, Git Mailing List

        Hi again

    I've come up with a fix for this. It's just two and a half lines,
and required more studying the code than typing.
A lot of path-processing work has been implemented in "abspath.c" and
"dir.c", including the symlinks and checking whether one path is a
subdirectory of another. I just added an "exclude" for GITDIR without
touching anything else.

    Now the best place to add that exclude would probably be "git.c",
right after the option "--git-dir" is processed. But this is not
actually the place where excludes are initialized or used anyhow.
Since initialization of excludes is done more or less individually by
each command concerned about them, the most "centralized" place
happens to be dir:setup_standard_excludes(), and that's where I did
it. One of the (side?) effects is that the excludes work in such a way
that any directory named ".metadata" in the directory tree will be
ignored once "--git-dir=.metadata" has been given.
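
The side effect described above can be illustrated with a Python sketch. It does not reproduce setup_standard_excludes(); `git_dir_exclude` and `is_excluded` are hypothetical names, and the sketch assumes the exclude is simply the last path component of the git directory, matched against every component of a candidate path.

```python
import os

def git_dir_exclude(git_dir):
    """Derive an exclude name from --git-dir (assumed rule: basename only)."""
    return os.path.basename(os.path.normpath(git_dir))

def is_excluded(rel_path, excludes):
    """True if any component of rel_path equals one of the exclude names."""
    return any(part in excludes for part in rel_path.split("/"))

# '.git' stays excluded as before; the git-dir name is added on top of it.
excludes = {".git", git_dir_exclude("./.metadata/")}

for candidate in ("src/main.c", ".metadata/HEAD",
                  "docs/.metadata/notes", ".metadata.bak/x"):
    print(candidate, is_excluded(candidate, excludes))
```

Because only the name is compared, 'docs/.metadata/notes' is excluded as well as the top-level '.metadata', which is the "any directory named .metadata" effect mentioned above.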

    Now if you guys don't see anything against this, I would shoot out a patch?

    regards
Pasha



* Re: Problem: staging of an alternative repository
  2014-05-17 16:31     ` Pasha Bolokhov
@ 2014-05-19 10:05       ` Duy Nguyen
  0 siblings, 0 replies; 7+ messages in thread
From: Duy Nguyen @ 2014-05-19 10:05 UTC
  To: Pasha Bolokhov; +Cc: Jonathan Nieder, Git Mailing List

On Sat, May 17, 2014 at 11:31 PM, Pasha Bolokhov
<pasha.bolokhov@gmail.com> wrote:
>     Now if you guys don't see anything against this, I would shoot out a patch?
>

If you have written the patch already, I see no harm in sending it
here. I'm concerned about the performance impact on this code, which is
already slow when the repo is large. But we can benchmark it later.
-- 
Duy
