git.vger.kernel.org archive mirror
* git grep doesn't follow symbolic link
@ 2012-01-09 16:54 Bertrand BENOIT
  2012-01-10  5:56 ` Ramkumar Ramachandra
  0 siblings, 1 reply; 10+ messages in thread
From: Bertrand BENOIT @ 2012-01-09 16:54 UTC (permalink / raw)
  To: git

Hi,

I have not found anything about this in the documentation, so I am reporting it.

When using git grep, symbolic links are not followed.
Is this the intended behavior?

I've tested with a symbolic link that is:
 - 'ignored'
 - NOT staged for commit
 - to be committed
 - committed
In every case -> no result when querying through the symbolic link

Example:
# git grep foo mySrc
-> OK answer [...]

# ln -s mySrc test
# git grep foo test
-> KO: No answer

# git add test
# git grep foo test
-> KO: No answer

# git commit -m "DO NOT PUSH" test
# git grep foo test
-> KO: No answer

Best Regards,
Bertrand


* Re: git grep doesn't follow symbolic link
  2012-01-09 16:54 git grep doesn't follow symbolic link Bertrand BENOIT
@ 2012-01-10  5:56 ` Ramkumar Ramachandra
  2012-01-10 10:00   ` Thomas Rast
  0 siblings, 1 reply; 10+ messages in thread
From: Ramkumar Ramachandra @ 2012-01-10  5:56 UTC (permalink / raw)
  To: Bertrand BENOIT; +Cc: git

Hi Bertrand,

Bertrand BENOIT wrote:
> When using git grep, symbolic links are not followed.
> Is this the intended behavior?

I'd imagine so: symbolic links are not portable across different file
systems; Git's internal representation of a symbolic link is a file
containing the path of the file to be linked to.
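
A minimal sketch of that representation, assuming a POSIX shell, a
filesystem that supports symbolic links, and a throwaway repository. The
file names reuse the ones from the report; the variable names are only for
illustration. It shows the link recorded with mode 120000, the stored blob
holding just the target path, and "git grep" skipping the link entry:

```shell
set -e
repo=$(mktemp -d)            # scratch repository, assumed disposable
cd "$repo"
git init -q

printf 'foo\n' > mySrc       # regular file containing the pattern
ln -s mySrc test             # symbolic link pointing at it
git add mySrc test

# The index records the link with the symlink mode.
mode=$(git ls-files -s test | cut -d' ' -f1)

# The blob stored for the link is the target path, not the file contents.
target=$(git cat-file -p :test)

# git grep searches regular tracked files and skips the symlink entry.
if git grep -q foo -- mySrc; then via_file=match; else via_file=nomatch; fi
if git grep -q foo -- test;  then via_link=match; else via_link=nomatch; fi

echo "mode=$mode target=$target file=$via_file link=$via_link"
```

The link's own contents ("mySrc") do not contain "foo", so even a grep over
the stored blob would not match here.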

> I have not found anything about this in the documentation, so I am reporting it.

Hm, the description says:

Look for specified patterns in the tracked files in the work tree, blobs
registered in the index file, or blobs in given tree objects.

Hm, "tracked files in the work tree" is definitely sub-optimal: "blobs
corresponding to the tracked files in the work tree" is probably
better.  Then again, I think the description is too cryptic for an
end-user: do you have any suggestions?  Have we mentioned how Git
handles symbolic links anywhere in the documentation?  If not, where
should this information go?

-- Ram


* Re: git grep doesn't follow symbolic link
  2012-01-10  5:56 ` Ramkumar Ramachandra
@ 2012-01-10 10:00   ` Thomas Rast
  2012-01-10 18:22     ` Junio C Hamano
  0 siblings, 1 reply; 10+ messages in thread
From: Thomas Rast @ 2012-01-10 10:00 UTC (permalink / raw)
  To: Ramkumar Ramachandra; +Cc: Bertrand BENOIT, git

Ramkumar Ramachandra <artagnon@gmail.com> writes:

> Hi Bertrand,
>
> Bertrand BENOIT wrote:
>> When using git grep, symbolic links are not followed.
>> Is this the intended behavior?
>
> I'd imagine so: symbolic links are not portable across different file
> systems; Git's internal representation of a symbolic link is a file
> containing the path of the file to be linked to.

I'd actually welcome a fix to this general area, for an entirely
different reason.  With bash and ordinary diff I can do things like

  diff -u <(ls) <(cd elsewhere && ls) | less

But I lose all the cute features of git-diff.  I *could* say

  git diff --no-index <(ls) <(cd elsewhere && ls)

and it helpfully tells me

  diff --git 1/dev/fd/63 2/dev/fd/62
  index 55ccbe5..d796c45 120000
  --- 1/dev/fd/63
  +++ 2/dev/fd/62
  @@ -1 +1 @@
  -pipe:[607341]
  \ No newline at end of file
  +pipe:[607343]
  \ No newline at end of file

Of course that's diff and not grep, but I think they suffer from the
same flaw: they share the file-kind handling logic of the rest of git in
a case where it's not very helpful.

-- 
Thomas Rast
trast@{inf,student}.ethz.ch


* Re: git grep doesn't follow symbolic link
  2012-01-10 10:00   ` Thomas Rast
@ 2012-01-10 18:22     ` Junio C Hamano
  2012-01-14  9:50       ` Nguyen Thai Ngoc Duy
  0 siblings, 1 reply; 10+ messages in thread
From: Junio C Hamano @ 2012-01-10 18:22 UTC (permalink / raw)
  To: Thomas Rast; +Cc: Ramkumar Ramachandra, Bertrand BENOIT, git

Thomas Rast <trast@student.ethz.ch> writes:

>> I'd imagine so: symbolic links are not portable across different file
>> systems; Git's internal representation of a symbolic link is a file
>> containing the path of the file to be linked to.
>
> I'd actually welcome a fix to this general area,...

Even though some platforms may lack symbolic links, where they are
supported, they have a clear and defined meaning and that is what Git
tracks as contents: where the link points at.

So we would want our "git diff" to tell us, even if you moved without
content modification the symbolic link target that lives somewhere on your
filesystem but is outside the control of Git, and updated a symbolic link
that is tracked by Git to point to a new location, that you updated the
link. On the other hand, if you did not update a tracked symbolic link,
even if the location the link points at that may or may not be under the
control of Git, we do not want "git diff" to show anything. As far as that
link is concerned, nothing has changed.
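
The two cases can be sketched as follows, assuming a POSIX shell, symlink
support, and a scratch directory. The names outside.txt, moved.txt and link,
and the committer identity, are made up for the illustration. Editing the
file the link points at leaves "git diff" silent, while repointing the link
is reported as a change:

```shell
set -e
work=$(mktemp -d)                 # scratch area, assumed disposable
cd "$work"
git init -q repo
printf 'one\n' > outside.txt      # lives outside the repository
cd repo

ln -s ../outside.txt link
git add link
git -c user.name='Example' -c user.email='example@example.invalid' \
    commit -q -m 'add link'

# Case 1: the target's contents change, the link itself does not.
printf 'two\n' > ../outside.txt
if git diff --quiet -- link; then target_edit=unchanged; else target_edit=changed; fi

# Case 2: the link is updated to point at a new location.
printf 'one\n' > ../moved.txt
rm link
ln -s ../moved.txt link
if git diff --quiet -- link; then retarget=unchanged; else retarget=changed; fi

echo "target edited: $target_edit, link repointed: $retarget"
```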

Changing this would not be a fix; it would be butchering.

Having said that...

> But I lose all the cute features of git-diff.  I *could* say
>
>   git diff --no-index <(ls) <(cd elsewhere && ls)

... "--no-index" is specifically _not_ about tracked contents of Git, but
was bolted on as a poor-man's substitute for GNU diff (think of it as
somebody wanted to have the nifty "git diff" features like renames and
coloring, but did not want to bother to port them to GNU diff codebase,
but instead hacked Git codebase to work outside Git tracked contents).  In
that context, I would agree that it _might_ make sense to treat special
files and symbolic links in a way that is different from how tracked
contents are handled.


* Re: git grep doesn't follow symbolic link
  2012-01-10 18:22     ` Junio C Hamano
@ 2012-01-14  9:50       ` Nguyen Thai Ngoc Duy
  2012-01-15  2:07         ` Junio C Hamano
  0 siblings, 1 reply; 10+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2012-01-14  9:50 UTC (permalink / raw)
  To: Junio C Hamano, Pang Yan Han
  Cc: Thomas Rast, Ramkumar Ramachandra, Bertrand BENOIT, git

(Pang's patch [1] caught my attention so I returned to the original discussion)

[1] http://thread.gmane.org/gmane.comp.version-control.git/188552

On Wed, Jan 11, 2012 at 1:22 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Thomas Rast <trast@student.ethz.ch> writes:
>
>>> I'd imagine so: symbolic links are not portable across different file
>>> systems; Git's internal representation of a symbolic link is a file
>>> containing the path of the file to be linked to.
>>
>> I'd actually welcome a fix to this general area,...
>
> Even though some platforms may lack symbolic links, where they are
> supported, they have a clear and defined meaning and that is what Git
> tracks as contents: where the link points at.
>
> So we would want our "git diff" to tell us, even if you moved without
> content modification the symbolic link target that lives somewhere on your
> filesystem but is outside the control of Git, and updated a symbolic link
> that is tracked by Git to point to a new location, that you updated the
> link. On the other hand, if you did not update a tracked symbolic link,
> even if the location the link points at that may or may not be under the
> control of Git, we do not want "git diff" to show anything. As far as that
> link is concerned, nothing has changed.
>
> Changing this would not be a fix; it would be butchering.

That's a good default. But git should allow me to say "diff the files
that symlinks point to". Link target is content from git perspective,
not from user perspective.

So instead of changing the default behavior specifically for git-grep as
Pang did, I think adding a --follow-symlinks option that could be
passed to grep or any of the diff family would be a better approach.
-- 
Duy


* Re: git grep doesn't follow symbolic link
  2012-01-14  9:50       ` Nguyen Thai Ngoc Duy
@ 2012-01-15  2:07         ` Junio C Hamano
  2012-01-15  9:47           ` Nguyen Thai Ngoc Duy
  0 siblings, 1 reply; 10+ messages in thread
From: Junio C Hamano @ 2012-01-15  2:07 UTC (permalink / raw)
  To: Nguyen Thai Ngoc Duy
  Cc: Pang Yan Han, Thomas Rast, Ramkumar Ramachandra, Bertrand BENOIT, git

Nguyen Thai Ngoc Duy <pclouds@gmail.com> writes:

>> Even though some platforms may lack symbolic links, where they are
>> supported, they have a clear and defined meaning and that is what Git
>> tracks as contents: where the link points at.
>>
>> So we would want our "git diff" to tell us, even if you moved without
>> content modification the symbolic link target that lives somewhere on your
>> filesystem but is outside the control of Git, and updated a symbolic link
>> that is tracked by Git to point to a new location, that you updated the
>> link. On the other hand, if you did not update a tracked symbolic link,
>> even if the location the link points at that may or may not be under the
>> control of Git, we do not want "git diff" to show anything. As far as that
>> link is concerned, nothing has changed.
>>
>> Changing this would not be a fix; it would be butchering.
>
> That's a good default. But git should allow me to say "diff the files
> that symlinks point to". Link target is content from git perspective,
> not from user perspective.
>
> So instead of changing the default behavior specifically for git-grep as
> Pang did, I think adding a --follow-symlinks option that could be
> passed to grep or any of the diff family would be a better approach.

Stop and think what "git diff --follow-symlinks v1.3.0 v1.7.0" should do
when these versions record a symbolic link, "from user perspective", if
the link points outside the tracked contents. Naturally, the users would
expect that the comparison is made between the contents of the file back
when v1.3.0 was tagged and the contents of the file (which may or may not
be the same path depending on the target of that symbolic link) back when
v1.7.0 was tagged.

But that is something that the user is *NOT* tracking with the system, and
hence something we cannot give the right answer. Your "--follow-symlinks"
option only encourages the *wrong* perception on the users' side, without
supporting what it appears to promise to the users. Why could it be an
improvement?

Compared to that, limiting the optional support for following symlinks
only in "--no-index" case, where the user explicitly asks us to look at
the data that is not managed by Git at all, makes more sense.  At the
design level, I wouldn't be fundamentally opposed to a change to add an
optional "follow the symlink" feature only when "--no-index" is asked for.

I didn't look at the posted patch, so I do not know if it adds an optional
following or unconditionally makes us follow symbolic links, or if the
patch sensibly implements the feature, though. That is a separate issue.


* Re: git grep doesn't follow symbolic link
  2012-01-15  2:07         ` Junio C Hamano
@ 2012-01-15  9:47           ` Nguyen Thai Ngoc Duy
  2012-01-16 22:44             ` Junio C Hamano
  0 siblings, 1 reply; 10+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2012-01-15  9:47 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Pang Yan Han, Thomas Rast, Ramkumar Ramachandra, Bertrand BENOIT, git

On Sun, Jan 15, 2012 at 9:07 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Stop and think what "git diff --follow-symlinks v1.3.0 v1.7.0" should do
> when these versions record a symbolic link, "from user perspective", if
> the link points outside the tracked contents. Naturally, the users would
> expect that the comparison is made between the contents of the file back
> when v1.3.0 was tagged and the contents of the file (which may or may not
> be the same path depending on the target of that symbolic link) back when
> v1.7.0 was tagged.
>
> But that is something that the user is *NOT* tracking with the system, and
> hence something we cannot give the right answer. Your "--follow-symlinks"
> option only encourages the *wrong* perception on the users' side, without
> supporting what it appears to promise to the users. Why could it be an
> improvement?

It's not wrong per se. It's an implication that users have to take
when they choose to use it. We may help make it clear that the
symlinks point to untracked files by putting some indication in the
diff.

When I do "git log -Sfoo -- '*.cxx'" I don't really care if bar.cxx is
a symlink. Neither does my compiler. It may be a symlink's target
change that makes "foo" appear. Git could help me detect that quickly
instead of sticking with tracked contents only.

Even if we decide --follow-symlinks=untracked is a bad idea,
--follow-symlinks=tracked (i.e. follow symlinks to tracked files only)
is still a good thing to support. And I suspect that's the more common
case, as linking outside the repository is nondeterministic.

The "=tracked" could be dropped if we have no other option value. I'm
thinking of --follow-symlinks=submodule, which is currently covered by
a separate option name.
-- 
Duy


* Re: git grep doesn't follow symbolic link
  2012-01-15  9:47           ` Nguyen Thai Ngoc Duy
@ 2012-01-16 22:44             ` Junio C Hamano
  2012-01-17  1:55               ` Nguyen Thai Ngoc Duy
  0 siblings, 1 reply; 10+ messages in thread
From: Junio C Hamano @ 2012-01-16 22:44 UTC (permalink / raw)
  To: Nguyen Thai Ngoc Duy
  Cc: Pang Yan Han, Thomas Rast, Ramkumar Ramachandra, Bertrand BENOIT, git

Nguyen Thai Ngoc Duy <pclouds@gmail.com> writes:

> It's not wrong per se. It's an implication that users have to take
> when they choose to use it. We may help make it clear that the
> symlinks point to untracked files by putting some indication in the
> diff.
>
> When I do "git log -Sfoo -- '*.cxx'" I don't really care if bar.cxx is
> a symlink. Neither does my compiler. It may be a symlink's target
> change that makes "foo" appear. Git could help me detect that quickly
> instead of sticking with tracked contents only.

Nothing in Git tells us that whatever bar.cxx points at, which happens to
be in your filesystem today, had "foo" in it at the time of the historical
commit that updated the bar.cxx symlink to point to that file. It is
*WRONG* to show that commit as something that changes bar.cxx to contain
"foo" (or more precisely, changes the count of "foo" in it).

Why is it so hard to understand?


* Re: git grep doesn't follow symbolic link
  2012-01-16 22:44             ` Junio C Hamano
@ 2012-01-17  1:55               ` Nguyen Thai Ngoc Duy
  2012-01-17  6:44                 ` Junio C Hamano
  0 siblings, 1 reply; 10+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2012-01-17  1:55 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Pang Yan Han, Thomas Rast, Ramkumar Ramachandra, Bertrand BENOIT, git

On Tue, Jan 17, 2012 at 5:44 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Nguyen Thai Ngoc Duy <pclouds@gmail.com> writes:
>
>> It's not wrong per se. It's an implication that users have to take
>> when they choose to use it. We may help make it clear that the
>> symlinks point to untracked files by putting some indication in the
>> diff.
>>
>> When I do "git log -Sfoo -- '*.cxx'" I don't really care if bar.cxx is
>> a symlink. Neither does my compiler. It may be a symlink's target
>> change that makes "foo" appear. Git could help me detect that quickly
>> instead of sticking with tracked contents only.
>
> Nothing in Git tells us that whatever bar.cxx points at, which happens to
> be in your filesystem today, had "foo" in it at the time of the historical
> commit that updated the bar.cxx symlink to point to that file. It is
> *WRONG* to show that commit as something that changes bar.cxx to contain
> "foo" (or more precisely, changes the count of "foo" in it).
>
> Why is it so hard to understand?

OK, resolving links to untracked contents is bad and should only be
supported in the --no-index case. Would resolving links to tracked
contents be OK in principle? (I'm not sure how messed up the diff code
could be with these changes.)
-- 
Duy


* Re: git grep doesn't follow symbolic link
  2012-01-17  1:55               ` Nguyen Thai Ngoc Duy
@ 2012-01-17  6:44                 ` Junio C Hamano
  0 siblings, 0 replies; 10+ messages in thread
From: Junio C Hamano @ 2012-01-17  6:44 UTC (permalink / raw)
  To: Nguyen Thai Ngoc Duy
  Cc: Pang Yan Han, Thomas Rast, Ramkumar Ramachandra, Bertrand BENOIT, git

Nguyen Thai Ngoc Duy <pclouds@gmail.com> writes:

> OK, resolving links to untracked contents is bad and should only be
> supported in the --no-index case. Would resolving links to tracked
> contents be OK in principle?

Conceptually it is not as bad, but I still doubt it is "ok".

It would defeat one of the fundamental properties of Git (or any content
based revision control scheme for that matter): a tree object records the
hash of its contents, so if two subtrees agree at the content hash level,
you do not have to descend into them to compare what they contain.

Imagine that you have a symlink at a/b/c/d that points to a file e at the
root level, and you are running "git log a/b/c".  Even if the entire
hierarchy a/ does not change in a commit since its parent, you may have to
show a/b/c/d only because "e" has changed.
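
A minimal sketch of this scenario, assuming a POSIX shell, symlink support,
and a throwaway repository. The commit helper and its identity settings are
made up for the illustration; the layout follows the a/b/c/d and e example
above. When only "e" changes, the tree object for a/ keeps the same hash in
both commits, and a path-limited history walk reports only the commit that
added the link:

```shell
set -e
repo=$(mktemp -d)                 # scratch repository, assumed disposable
cd "$repo"
git init -q

# Hypothetical helper so the example does not depend on global config.
commit() {
    git -c user.name='Example' -c user.email='example@example.invalid' \
        commit -q "$@"
}

mkdir -p a/b/c
printf 'v1\n' > e
ln -s ../../../e a/b/c/d          # a/b/c/d points at the root-level file e
git add e a/b/c/d
commit -m 'add e and a symlink to it'

printf 'v2\n' > e                 # only the link target changes
commit -a -m 'change e'

# The subtree hash for a/ is identical in parent and child ...
tree_before=$(git rev-parse HEAD~1:a)
tree_after=$(git rev-parse HEAD:a)

# ... so history limited to a/b/c lists only the first commit.
commits_touching=$(git rev-list --count HEAD -- a/b/c)

echo "before=$tree_before after=$tree_after commits=$commits_touching"
```

Following the link would mean the second commit has to be shown for a/b/c
even though nothing under a/ differs at the hash level.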

So I suspect that the required change would involve a lot more than a
naïve "when we reach the leaf level, if it is a symlink, read the link
contents and call get_tree_entry() to dereference the blob, or if the link
points outside the tree, use 0{40} to say 'contents undefined'".

After you compare 'a' of parent and child and find them to be identical,
you still need to anticipate that the hierarchy _might_ have a symbolic
link somewhere deep inside, and read _everything_ at least once in order
to find symbolic links and where they point at (if you did that to parent
already, and if you know that the child agrees with it at 'a', then you
obviously do not have to read everything in the child---you know the
parent and the child have the same _contents_ in 'a' at that point). And
then grab the pointee out of parent tree and child tree to compare.

I personally do not think it is worth it.

