* Using .gitignore symbolic links?
@ 2021-06-18  2:34 Tessa L. H. Lovelace
  2021-06-18  6:44 ` Robert Karszniewicz
  2021-06-18 11:15 ` Ævar Arnfjörð Bjarmason
  0 siblings, 2 replies; 5+ messages in thread
From: Tessa L. H. Lovelace @ 2021-06-18  2:34 UTC
  To: git

The recent release candidate of Git (v2.32.0) hit my OS this week, and
it included a line (see ref below) saying that symbolic links for several
specific files are now ignored.

Thank you for putting the changelogs in an accessible location; knowing
that this was a known breaking change was useful in debugging why my
workflows stopped working.

I have two concerns.

First, the error thrown is

 > "warning: unable to access '.gitignore': Too many levels of symbolic
 > links",

...which does not accurately represent what is happening.

I spent a bit of time convinced that I'd broken something with the 
symbolic links during setup, and an error such as "symbolic linking no 
longer allowed for 'filename'." would make more sense, given the change 
under discussion eliminates *any* use of symbolic links.


Secondly, and more personally important to me, a system administrator:
My repositories use symbolic links to allow a single .gitignore file to 
define my folder structure, allowing me to avoid hardcoding the 
repo-specific folder paths into my configs.

Is there a flag to disable this new behavior?

If not, this change means I need to update dozens of files, duplicates 
all, or completely rewrite my .gitignore files to have shyteloads of 
arbitrary file paths in them, which I'd rather not do.

Also, is there a justification for forcing this as the on-update 
default new behavior, when a user-querying behavior (such as with 'git 
pull' defaults as they've changed recently) exists?

---

ref 
https://github.com/git/git/commit/142430338477d9d1bb25be66267225fb58498d92#diff-eae5facd145e2748250f7b275e45cb001c0b8e2c47c529a4e28bbfa208e5fb59R7


===


Thoughts?

-- 
Tessa L. H. Lovelace
----
office:		503.893.9709
consulting:	assorted.tech



* Re: Using .gitignore symbolic links?
  2021-06-18  2:34 Using .gitignore symbolic links? Tessa L. H. Lovelace
@ 2021-06-18  6:44 ` Robert Karszniewicz
  2021-06-18 11:15 ` Ævar Arnfjörð Bjarmason
  1 sibling, 0 replies; 5+ messages in thread
From: Robert Karszniewicz @ 2021-06-18  6:44 UTC
  To: Tessa L. H. Lovelace; +Cc: git

On Thu, Jun 17, 2021 at 07:34:40PM -0700, Tessa L. H. Lovelace wrote:
> Secondly, and more personally important to me, a system administrator:
> My repositories use symbolic links to allow a single .gitignore file to 
> define my folder structure, allowing me to avoid hardcoding the 
> repo-specific folder paths into my configs.
> 
> Is there a flag to disable this new behavior?
> 
> If not, this change means I need to update dozens of files, duplicates 
> all, or completely rewrite my .gitignore files to have shyteloads of 
> arbitrary file paths in them, which I'd rather not do.

Hmm, it sounds like `core.excludesFile` described in git-config(1) could
do what you need:

  core.excludesFile
      Specifies the pathname to the file that contains patterns to
      describe paths that are not meant to be tracked, in addition to
      .gitignore (per-directory) and .git/info/exclude. Defaults to
      $XDG_CONFIG_HOME/git/ignore. If $XDG_CONFIG_HOME is either not set
      or empty, $HOME/.config/git/ignore is used instead. See
      gitignore(5).
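
The default lookup quoted above can be sketched in a few lines of Python. This only illustrates the documented fallback rule and is not Git's own code; the function name `default_excludes_file` is made up for this example, and it does not read any Git configuration.

```python
import os


def default_excludes_file(env=None):
    """Return the default core.excludesFile path per the rule quoted above.

    Hypothetical helper for illustration only: it mirrors the documented
    fallback and never consults the actual Git configuration.
    """
    env = os.environ if env is None else env
    xdg = env.get("XDG_CONFIG_HOME")
    if xdg:  # set and non-empty
        return os.path.join(xdg, "git", "ignore")
    # Unset or empty: fall back to $HOME/.config/git/ignore
    return os.path.join(env.get("HOME", ""), ".config", "git", "ignore")
```

To point Git at a different shared file, `git config --global core.excludesFile <path>` sets the option for the current user.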

Regards,
Robert Karszniewicz


* Re: Using .gitignore symbolic links?
  2021-06-18  2:34 Using .gitignore symbolic links? Tessa L. H. Lovelace
  2021-06-18  6:44 ` Robert Karszniewicz
@ 2021-06-18 11:15 ` Ævar Arnfjörð Bjarmason
  2021-06-18 12:55   ` Jeff King
  2021-07-03 17:29   ` Tessa L.
  1 sibling, 2 replies; 5+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-18 11:15 UTC
  To: Tessa L. H. Lovelace; +Cc: git, Jeff King


On Thu, Jun 17 2021, Tessa L. H. Lovelace wrote:

> The recent release candidate of Git (v2.32.0) hit my OS this week, and
> it included a line (see ref below) saying that symbolic links for
> several specific files are now ignored.
>
> Thank you for putting the changelogs in an accessible location,
> knowing that this was a known breaking change was useful in debugging
> why my workflows stopped working.
>
> I have two concerns.
>
> First, the error thrown is
>
>  > "warning: unable to access '.gitignore': Too many levels of symbolic
>  > links",
>
> ...which does not accurately represent what is happening.
>
> I spent a bit of time convinced that I'd broken something with the
> symbolic links during setup, and an error such as "symbolic linking no 
> longer allowed for 'filename'." would make more sense, given the
> change under discussion eliminates *any* use of symbolic links.
>
>
> Secondly, and more personally important to me, a system administrator:
> My repositories use symbolic links to allow a single .gitignore file
> to define my folder structure, allowing me to avoid hardcoding the 
> repo-specific folder paths into my configs.
>
> Is there a flag to disable this new behavior?
>
> If not, this change means I need to update dozens of files, duplicates
> all, or completely rewrite my .gitignore files to have shyteloads of 
> arbitrary file paths in them, which I'd rather not do.
>
> Also, is there a justification for forcing this as the on-update
> default new behavior, when a user-querying behavior (such as with 'git 
> pull' defaults as they've changed recently) exists?

[CC-ing Jeff]

Breaking this was intentional, see https://github.com/git/git/commit/2ef579e261

That doesn't mean we can't take it back.

As discussed by Robert's reply and in that commit there's the workaround
of .git/info/exclude and the core.excludesFile.

However, we realize that sucks for many users. Let's say you have a
script to clone a "tree" of repositories similar to but not using
git-submodule (or they live side-by-side), such a thing won't Just Work
anymore.

At the end of the day there's an inherent conflict here between security
and convenience. We really want a repository to be safe to just "git
clone", i.e. we don't set up any hooks, execute code etc.; these
gitattributes and gitignore issues were on the edges of that.

We can make it work as before, but it gets hard to distinguish the
gitignore you mean, from a gitignore that's pointing to /dev/urandom
(annoying), or to some crafted out-of-tree thing that'll cause an
overflow in the parser and an RCE.

Any way out of that that's configurable is going to be the same
opt-in problem as core.excludesFile is now.

So I'd think our options are basically:

 1) Do nothing, it sucks for some people (like you) but we think it's worth it

 2) Some DWYM middle ground, e.g. we could discover if the link points
    to another git repo, and only trust it then, or if it's in the
    user's $HOME or whatever.

 3) Bring back the old behavior, it was more of a "while we're at it for
    gitattributes..." fix than something specifically a problem with
    gitignore, the RCE threat is a hypothetical, and we can more easily
    audit/be confident in the gitignore parser, probably...


* Re: Using .gitignore symbolic links?
  2021-06-18 11:15 ` Ævar Arnfjörð Bjarmason
@ 2021-06-18 12:55   ` Jeff King
  2021-07-03 17:29   ` Tessa L.
  1 sibling, 0 replies; 5+ messages in thread
From: Jeff King @ 2021-06-18 12:55 UTC
  To: Ævar Arnfjörð Bjarmason; +Cc: Tessa L. H. Lovelace, git

On Fri, Jun 18, 2021 at 01:15:46PM +0200, Ævar Arnfjörð Bjarmason wrote:

> Breaking this was intentional, see https://github.com/git/git/commit/2ef579e261
> 
> That doesn't mean we can't take it back.
> 
> As discussed by Robert's reply and in that commit there's the workaround
> of .git/info/exclude and the core.excludesFile.

I'd prefer not to undo it because of the security implications that led
us there in the first place. And for the most part, there should be a
better way to accomplish the same things:

  - these symlinked files were already subtly broken; Git was not
    following the links when it read them from the index or a tree.

  - repetitive links within a tree can be refactored to use patterns in
    the top-level .gitignore, .gitattributes, etc. It may be that our
    pattern language is not sufficient for some cases, but improving
    that seems like a better path forward.

  - links that span repos can use core.excludesFile or similar (and
    conditional config can help enable them only when you want to). It
    may also be that this could be extended to cover more cases (e.g.,
    you can only have one configured excludesFile, but you may want
    several). Or just a symlink from .git/info/exclude if there's one
    per repo.

I'd be curious to hear if any of those solutions don't help in this
case.

> At the end of the day there's an inherent conflict here between security
> and convenience. We really want a repository to be safe to just "git
> clone", i.e. we don't set up any hooks, execute code etc.; these
> gitattributes and gitignore issues were on edges of that.
> 
> We can make it work as before, but it gets hard to distinguish the
> gitignore you mean, from a gitignore that's pointing to /dev/urandom
> (annoying), or to some crafted out-of-tree thing that'll cause an
> overflow in the parser and an RCE.

I agree with all of this, but I would soften the "RCE" part a bit. An
untrusted repository can already feed whatever it wants into the parser.
The danger of symlinks is that accessing out-of-tree paths may cause
unexpected results (information disclosure in some situations, but also
weirdness when opening files in /dev, /proc, etc).

> Any way out of that that's configurable is going to be the same
> opt-in problem as core.excludesFile is now.
> 
> So I'd think our options are basically:
> 
>  1) Do nothing, it sucks for some people (like you) but we think it's worth it

I hope the "it sucks" is "the transition sucks", but they're still able
to configure Git differently to achieve the same goals in a roughly
similar way. Again, I'd be curious to hear about cases where this isn't
true.

I'm not completely opposed to having a config switch for "allow
gitignore symlinks" as an escape hatch, as long as the default is still
"off". One of the things I don't like about it is that the config option
needs to come with a warning explaining how the result is still subtly
broken.

>  2) Some DWYM middle ground, e.g. we could discover if the link points
>     to another git repo, and only trust it then, or if it's in the
>     user's $HOME or whatever.

We've talked before about identifying out-of-tree symlinks. It's not
clear to me in this case if the symlinks are to other paths within the
repository, or if they go out-of-tree.

In-tree symlinks are OK. It's just complicated and error-prone to detect
them (because of course interior paths may themselves be symlinks).

I think we'd always want to forbid out-of-tree symlinks, no matter what
they're pointing to (because we don't have any idea what's "safe" in the
user's filesystem). It's easier for both us and the user to just have a
switch for "look at these symlinks anyway".
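
To illustrate why such a check has to resolve the whole path and not just the final link, here is a rough Python sketch. It is not how Git handles this, and `link_stays_in_tree` is a hypothetical name; it relies on `os.path.realpath()` to resolve every component, including symlinked interior directories.

```python
import os


def link_stays_in_tree(repo_root, link_path):
    """Return True if link_path, fully resolved, lies inside repo_root.

    Illustrative only; this is not Git's logic. realpath() resolves
    every path component, so a symlinked interior directory that leads
    out of the tree is caught as well as a final-component symlink.
    """
    root = os.path.realpath(repo_root)
    target = os.path.realpath(link_path)
    # The resolved target is in-tree when the root is a prefix of it.
    return os.path.commonpath([root, target]) == root
```

With a layout where `main/.gitignore` links to `../doc/.gitignore`, the function returns True. A path that passes through a directory symlink leading outside the repository returns False, even when its final component is a regular file.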

>  3) Bring back the old behavior, it was more of a "while we're at it for
>     gitattributes..." fix than something specifically a problem with
>     gitignore, the RCE threat is a hypothetical, and we can more easily
>     audit/be confident in the gitignore parser, probably...

Hopefully it's obvious at this point that I'd prefer not to go that
route. :)

Tessa mentioned one other thing, which is somewhat orthogonal to the
options you listed. The error message is just:

  warning: unable to access '.gitignore': Too many levels of symbolic links

This comes from a generic open-or-warn function. The kernel is giving us
ELOOP, which we feed to strerror(). And it's _technically_ true, in that
we allow 0 levels of symbolic links. But we could perhaps intercept
ELOOP in the gitignore and gitattributes code to produce a more coherent
warning.
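
A minimal sketch of what such an intercept could look like, written in Python rather than Git's C and assuming a POSIX system where opening a symlink with O_NOFOLLOW fails with ELOOP (as on Linux). The function name and the wording of the friendlier warning are invented for illustration; they are not Git's.

```python
import errno
import os


def read_ignore_file(path):
    """Read an ignore file without following a symlink at `path`.

    Returns a (text, warning) pair. Sketch only: assumes open() with
    O_NOFOLLOW reports ELOOP when the final component is a symlink.
    """
    try:
        fd = os.open(path, os.O_RDONLY | os.O_NOFOLLOW)
    except OSError as exc:
        if exc.errno == errno.ELOOP:
            # Intercept ELOOP here instead of handing it to strerror().
            return None, f"warning: '{path}' is a symbolic link; not following it"
        # Any other failure keeps the generic strerror()-style message.
        return None, f"warning: unable to access '{path}': {os.strerror(exc.errno)}"
    with os.fdopen(fd, "r") as handle:
        return handle.read(), None
```

Note that ELOOP can also mean a genuine symlink loop in an interior path component, so a real implementation would probably need to tell those cases apart.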

TBH, I didn't give too much thought to user experience in the original
patches because my digging showed that using symlinks for these files
was exceedingly rare (at least on the corpus of GitHub repos I scanned,
but of course all the world is not hosted on GitHub, and there will
always be edge cases anyway).

-Peff


* Re: Using .gitignore symbolic links?
  2021-06-18 11:15 ` Ævar Arnfjörð Bjarmason
  2021-06-18 12:55   ` Jeff King
@ 2021-07-03 17:29   ` Tessa L.
  1 sibling, 0 replies; 5+ messages in thread
From: Tessa L. @ 2021-07-03 17:29 UTC
  To: Ævar Arnfjörð Bjarmason; +Cc: git, Jeff King

Robert, Peff, and Ævar,

Thanks for the responses on this.

Will attempt to better-articulate both my use case and my concerns.


--- 

First, an attempt to provide clarity regarding my use case.

This is a hypothetical representation of my project's structure
(symlinks designated by '->'):
```
..
.
./.gitignore  -> ./doc/.gitignore
./doc
./doc/readme.md
./doc/.gitignore
./main
./main/.gitignore -> ../doc/.gitignore
./main/00_textFile.src
./main/00_textfile.src~
./main/00_textFile.src~
./main/01_differenTextFile.src
./static
./static/.gitignore -> ../doc/.gitignore
./static/executeable.sh
./static/executeable.sh~
```
The contents of these listed files (aside from the .gitignore, as
articulated below) don't actually matter, though it's possibly relevant
that the tilde-differentiated files are the default "temp/backup" file
for GNU nano.

Creating symbolic links between the single file in ./doc/.gitignore
and other directories (repository root, ./main, ./static, etc...) means
repo-wide ignoring of nano's backup files can be added with the single
line:
```
*~
```
...in my ./doc/.gitignore. And just like that, I've eliminated the
entire visible clutter of those backup files from my filetree.
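
For comparison, gitignore(5) says that a pattern containing no slash may match at any level below the .gitignore that holds it, so a single `*~` line in a top-level .gitignore should cover the same files without any links. Below is a deliberately simplified Python model of that one rule. The function name is invented here, and negation, directory-only patterns, and anchored patterns are all ignored.

```python
import fnmatch
import posixpath


def ignored_by_root_pattern(pattern, path):
    """Simplified model of one gitignore rule, for illustration only.

    A slash-free pattern is tested against the final path component,
    which is why it applies at every directory depth.
    """
    if "/" in pattern:
        raise ValueError("this sketch only models patterns without '/'")
    return fnmatch.fnmatchcase(posixpath.basename(path), pattern)


# Paths taken from the example tree above.
tree = [
    "doc/readme.md",
    "main/00_textFile.src",
    "main/00_textFile.src~",
    "static/executeable.sh",
    "static/executeable.sh~",
]
ignored = [p for p in tree if ignored_by_root_pattern("*~", p)]
```

In this model `ignored` ends up holding only the two tilde-suffixed paths, one under `main/` and one under `static/`.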

---


Having said that, it's worth noting that I was, also, trying to solve
the "...but adding the metaphorical '*~' to my gitignore on every new
project is a pain, and then if there's another global-ish file-pattern
to squash, updating each and every repo is a pain, too..." problem.

Thank you, Robert, for calling my attention to core.excludesFile,
that's going to be horrifically misused in my homelab sometime soon-
ish.


---

Having said *that*, I did not have visibility on commit 2ef579e261, and
am especially in favour of the mentality of "Let's do X instead of Y so
that (the) two cases behave consistently."

My actual case of (ab)using symlinks to point at a single 'ignore' file
within the repo (as opposed to linking from outside the repo) is subtly
different, but it is definitely not within the supported use of git as
a version control system, and I'm not expecting to halt progress
because of some edge case use like this.

I'd rather come up with a better path than "take it back", if that
makes sense.

---


That being said, I have concerns about the implemented approach here.

As a former information security professional, I'm comfortable saying
that the presented dichotomy of the "inherent conflict between security
and convenience" ignores how effective communication and user-consent
fixes the problem.

Arbitrarily breaking system-wide expectations for a low-level utility
(symlinking) in the name of security (it's still a hypothetical RCE,
right? There's not currently a working proof of this exploit?)
seems...problematic...in that it prioritizes something that *may*
happen over the users it breaks today, pushing breaking changes down the
pipe and assuming that the user will just figure it out...when often
even highly technical users completely miss that something has changed,
much less on what level.

Obviously, security has to happen at all levels...and a bash-compatible
system has a literally countably infinite number of ways to cause harm,
if you're running someone else's code without attempting some form of
analysis beforehand.


I, and others who use bash or similar shells, are going to expect
symlinks to work consistently both inside git repos and out of it, and
'what if a malicious repo symlinks to Something Bad' seems...outside
the scope of doing version control. 

Trying to detect exploits at this level (the version control system)
seems like a lot of complexity to add for something that is
fundamentally at odds with the philosophy of 'do one thing well'. And,
choosing to pursue and define what is or is not an acceptable use of
symlinks, or even changing behavior based on internal/external linking,
sounds a lot like scope creep to me.


I'm not saying that to downplay the seriousness of the security
concern...but hopefully to contextualize my deeper concerns about a
choice to not just start breaking the (working, system-level-
consistent) defaults, but to do so without somehow informing the user
and giving them a form of choice regarding the change.


---

I acknowledge that 'too many levels of symlinks' is technically valid
for 'zero levels allowed', but that's not what is functionally
communicated to the end user. I debated about calling out this
distinction, but decided to orient on how a less-technical user would
perceive the error.


In that mindset (of keeping an eye on what a less-technical user will
perceive), I've watched with happy interest the process of
adjusting the defaults for various commands (git pull and git init,
especially).

The excellent use of user-communicating blocks of text in those cases
while preserving the in-use legacy defaults, while allowing an informed
choice (eg, presenting the user with a suggestion to run specific
command(s) to change the fast-forward behavior or to rename the default
branch on init, respectively) seems like a much better path to me.


If nothing else, I'd like to see that model of user-
querying/informing/consenting behaviour (from the fast-forward and init
examples) happen with this case, for consistency at the very least.


---

Back to my specific use case.

All three of the potential solutions subtly miss my need, so my
suggestion for a 'fourth option' would look something like a flag to
prompt a 'flat' (non-tree) interpretation of the file(s) inside the
repo when filtering the displayed file-list via gitignore/excludesFile.

So, instead of checking if each folder has a rule matching against it,
any rule in the .gitignore (or wherever the core.excludesFile points)
applies to the base-level of any directory inside the repo, resulting
in essentially the same behaviour.

This would eliminate my specific use of symlinks entirely, though it
doesn't touch on my concern about symlinks behaving differently inside
version control than pretty much everywhere else inside a symlink-
capable filesystem.


I don't have visibility on the complexities of adjusting/adding this,
so please correct my assumptions where they conflict with the realities
of developing the next release candidate.


Once again, I appreciate your time and communication on this.

--
Tessa L.

office: 503.893.9709
web: 	https://assorted.tech


