All of lore.kernel.org
 help / color / mirror / Atom feed
* [Discussion] What is Git's Security Boundary?
@ 2022-04-26 17:00 Derrick Stolee
  2022-05-16 14:13 ` Derrick Stolee
  2022-05-17 12:55 ` Ævar Arnfjörð Bjarmason
  0 siblings, 2 replies; 7+ messages in thread
From: Derrick Stolee @ 2022-04-26 17:00 UTC (permalink / raw)
  To: Git Mailing List, Junio C Hamano, Taylor Blau, Emily Shaffer,
	Glen Choo, Ævar Arnfjörð Bjarmason,
	Christian Couder

I've been having a few discussions internally and externally with folks
about the 2.35.2 release and the safe.directory config value. After
stumbling a little with a too-technical proposal, I (along with Taylor)
figured out that I was jumping into "solutions" mode without first talking
about the problem and agreeing on common language there.

Specifically, the issue at the root of CVE-2022-24765 [1] is that a
machine can have an "untrusted" Git repository on-disk. The actual exploit
uses hooks as an attack vector. The protection we put in place was the
safe.directory config key, which is only read out of "protected" config
files.

[1] https://github.com/git-for-windows/git/security/advisories/GHSA-vw2c-22j4-2fh2

I had started going down a rabbit hole of additional ways to harden our
security stance using this concept of "protected" config files, but that
was premature.

I'm hoping to start a conversation in this thread about "What is Git's
security boundary?" so we can have an established base to work from for
future security incidents or protections.

The only documentation I could find is our security policy [2], but that
doesn't try to define this boundary. Recent discussions on the config-
based hooks have also talked about the security boundary [3].

[2] https://github.com/git/git/security/policy
[3] https://lore.kernel.org/git/20200425205727.GB6421@camp.crustytoothpaste.net/


What is Git's Security Boundary?
================================

From my discussions, I've seen two different perspectives on Git's
security responsibility:

1. When Git interacts with another _machine_ over a protocol, then Git
   must be hardened against malicious actors across that protocol. Git
   expects data on disk from that point on to be created by "trusted"
   actors. Any files checked out from a remote source are expected to
   be inspected by the user before they are used.

2. Users expect Git commands, especially those that do not communicate
   over a network, to be as safe to use as a text editor. If there is
   a realistic vector where an unprivileged actor with access to a
   mounted disk can create a malicious repo on disk and convince another
   user to run a Git command, then Git should protect against this attack
   to the best of its ability.

The choice here is not an either-or: (1) is obviously an important case to
handle. The question is "how much of (2) are we responsible for?" I admit
that this area has no strict boundary, but instead must be inspected on a
case-by-case basis. The level of social engineering must be taken into
account. It is probably clear that we shouldn't protect against someone
who was tricked into running `git bisect run rm -rf /*` but most cases
require more inspection.

The case that came about in the recent security release had an attack
vector that is identical to using a shared repository, which is a valid
use case. Hence, the `safe.directory` config value was created. Users
using shared repositories like this (now marked with `safe.directory`) are
susceptible to someone modifying the repository-local config in a
repository they have been using without issue. This allows an actor with
access to one user's credentials to gain access to other users on that
machine, spreading their access. The ability to change the executable run
by Git via local config is the critical piece of this attack.


Example: Out of security boundary: improper filesystem permissions
------------------------------------------------------------------

It can be helpful to include a case of something that is outside of our
security boundary. In particular, if the machine has filesystem
permissions set improperly (and different than the default upon install),
then the following things can happen:

1. An attacker could replace the Git executable.

2. An attacker could modify `~/.gitconfig` to change config and have Git
   execute arbitrary code.

3. An attacker could modify `~/.bashrc` (or equivalents) to change
   environment variables such as the `PATH` or anything else.

Hopefully we can all agree that if the machine is in such a state, then
there is nothing we can do and would not call this a vulnerability _in Git_.


Example: Out of security boundary: "clone and make"
---------------------------------------------------

A well-established example is the case where a combination of social
engineering and lack of user care lead to an attack. This is most
commonly of the form of "clone this repo, go into it and run 'make'".

Building code right after cloning is a common workflow for developers,
but that does not mean that this is an attack. Running content that
could execute arbitrary code on your machine should always be checked
carefully.

This "attack" vector is considered outside of the boundary because of
the social engineering aspect (the attacker needs you to clone the
repo _and_ run something inside it) and the lack of care by the user.
The user has ample opportunity to check what they are about to do
before running that second execution step.

The other side of this situation is this: there is nothing Git can do.
Once Git has cloned the data it was asked to clone, no part of this
attack uses Git itself. There is no opportunity to intervene. Git is
doing exactly what it is asked to do: nothing less _and nothing more_.


Question: Is there a better way to describe the boundary?
---------------------------------------------------------

Please submit your attempts to clarify the boundary in replies.


Question: Is "protected" config _really_ more trustworthy?
----------------------------------------------------------

This leads to an interesting question: Do we think that `~/.gitconfig`
and `/etc/gitconfig` are "more trustworthy" than `.git/config`?

I think that if an attacker has access to write to system or global config,
then they have access to do more harm without Git. On the other hand,
local config overrides the other config values, so local config can "unset"
any potentially-malicious values in system and global config. I don't
think such "defensive config" is common.


Example Security Boundary Question: Unprotected Config and Executing Code
-------------------------------------------------------------------------

We have a lot of ways of executing external code from within Git. This is
a key extensibility feature for customizing Git functionality. One way to
do this is to create executable files in the $GIT_DIR/hooks/ directory.
Git will discover these hooks and run them at the appropriate times.

There are also many config options that specify external commands to run:

* `core.fsmonitor=<path>` is executed before scanning the worktree for
  created or modified files.

* `core.editor` specifies an editor when writing commit messages, among
  other user-input scenarios.

* `credential.helper` specifies an external tool to assist with connecting
  to a remote repository.

* `uploadPack.packObjectsHook` chooses an alternative for `git pack-objects`
  during `git upload-pack`.

The list is actually quite long. This last one, `uploadPack.packObjectsHook`
_does_ do a check for protected config: it does not allow its value to
come from repository-local config.

However, most of these options really do want to have customization on a
per-repository basis, hence this proliferation of config options and
local hook directories.

I'm concerned that as long as we allow arbitrary execution to happen based
on repository-local data, we will always have security risks in Git. For
that reason, I'm interested in exploring ways to change how we execute
code externally, even if it means we would (eventually) need to introduce
a breaking change.

The idea would be to allow repository-local customization by selecting
from a list of "installed" executables. The list of "installed"
executables comes from protected config (and the Git binary itself).

The plan I would like to put forward is to restrict all external execution
to be from one of these sources:

1. Specified in system config (`/etc/gitconfig`).
2. Specified in global config (`~/.gitconfig`).
3. An allow-list of known tools, on $PATH (e.g. `vim`).

Such a change would be a major one, and would require changing a lot of
existing code paths. In particular, this completely removes the
functionality of the `$GIT_DIR/hooks/` directory in favor of a config-
based approach. This would require wide communication before pulling all
support for the old way, and likely a 3.0 version designation. After the
old hook mechanism is removed, the "safe.directory" protection from 2.35.2
would no longer be needed.

At minimum, in the short term, this would affect the proposed design of
config-based hooks.

I think this is a good example to think about at a high level before going
into the technical details. We can use it to test any proposed security
boundary definitions to see if it lands on one side or another. Here are
some points:

1. These config-based executables cannot be set in a full repository by
   a "git clone" operation.

2. These config-based executables can be set in an embedded bare
   repository, but the user needs to move deeper into the repository for
   that to have any affect. This leads to some amount of social engineering
   being involved in the attack. See [4] for recent discussion on this.

   [4] https://lore.kernel.org/git/Ylobp7sntKeWTLDX@nand.local/

3. If users are sharing a common Git repository, then if an attacker gains
   control of one user's account, they can use the shared repository as an
   attack vector to gain control of the other users' accounts. For this
   case, do we consider the "safe.directory" config as an indicator of
   "I trust this repo and all users that can access it _in perpetuity_" or
   instead "I need to use this repo, even though it is owned by someone
   else."

4. Git's dedication to backwards compatibility might prevent any attempt
   to change how hooks are specified or config can be used to customize
   its behavior.

5. The technical challenge of converting all possible execution paths may
   be too daunting to ever feel the project is "complete" and be able to
   confidently say "Git is safe from these kinds of attacks".


Conclusion
----------

I look forward to hearing from the community about this topic. There are
so many things to think about in this space, and I hope that a lot of
voices can share their perspectives here.

Please collect any action items here at the end. I would love to add a
doc to the Git tree that summarizes our conclusions here, even if it is
only a start to our evolving security boundary.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Discussion] What is Git's Security Boundary?
  2022-04-26 17:00 [Discussion] What is Git's Security Boundary? Derrick Stolee
@ 2022-05-16 14:13 ` Derrick Stolee
  2022-05-16 14:38   ` rsbecker
  2022-05-17 12:55 ` Ævar Arnfjörð Bjarmason
  1 sibling, 1 reply; 7+ messages in thread
From: Derrick Stolee @ 2022-05-16 14:13 UTC (permalink / raw)
  To: Git Mailing List, Junio C Hamano, Taylor Blau, Emily Shaffer,
	Glen Choo, Ævar Arnfjörð Bjarmason,
	Christian Couder

On 4/26/2022 1:00 PM, Derrick Stolee wrote:
> I've been having a few discussions internally and externally with folks
> about the 2.35.2 release and the safe.directory config value. After
> stumbling a little with a too-technical proposal, I (along with Taylor)
> figured out that I was jumping into "solutions" mode without first talking
> about the problem and agreeing on common language there.

> I'm hoping to start a conversation in this thread about "What is Git's
> security boundary?" so we can have an established base to work from for
> future security incidents or protections.

I'm back from a vacation, and haven't seen any response to this message.

I thought this would be an interesting topic that would create a lot of
valuable discussion, but that has not happened. I have a few ideas of why
that could be:

1. It's long, so readers put if off until it fell off the end of their
   inboxes.

2. The fixes for 2.36.1 have been taking priority.

3. There are no patches, so I should submit code if I want concrete
   feedback.

4. I'm so off base that it's not even worth replying.

Of course, it could be a combination of these or any number of other
things.

I'm sending this email as a hopeful ping that this topic could use some
feedback. I'm looking forward to your ideas.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 7+ messages in thread

* RE: [Discussion] What is Git's Security Boundary?
  2022-05-16 14:13 ` Derrick Stolee
@ 2022-05-16 14:38   ` rsbecker
  2022-05-20 17:23     ` Derrick Stolee
  0 siblings, 1 reply; 7+ messages in thread
From: rsbecker @ 2022-05-16 14:38 UTC (permalink / raw)
  To: 'Derrick Stolee', 'Git Mailing List',
	'Junio C Hamano', 'Taylor Blau',
	'Emily Shaffer', 'Glen Choo',
	'Ævar Arnfjörð Bjarmason',
	'Christian Couder'

On May 16, 2022 10:14 AM, Derrick Stolee wrote:
>On 4/26/2022 1:00 PM, Derrick Stolee wrote:
>> I've been having a few discussions internally and externally with
>> folks about the 2.35.2 release and the safe.directory config value.
>> After stumbling a little with a too-technical proposal, I (along with
>> Taylor) figured out that I was jumping into "solutions" mode without
>> first talking about the problem and agreeing on common language there.
>
>> I'm hoping to start a conversation in this thread about "What is Git's
>> security boundary?" so we can have an established base to work from
>> for future security incidents or protections.
>
>I'm back from a vacation, and haven't seen any response to this message.
>
>I thought this would be an interesting topic that would create a lot of valuable
>discussion, but that has not happened. I have a few ideas of why that could be:
>
>1. It's long, so readers put if off until it fell off the end of their
>   inboxes.
>
>2. The fixes for 2.36.1 have been taking priority.
>
>3. There are no patches, so I should submit code if I want concrete
>   feedback.
>
>4. I'm so off base that it's not even worth replying.
>
>Of course, it could be a combination of these or any number of other things.
>
>I'm sending this email as a hopeful ping that this topic could use some feedback.
>I'm looking forward to your ideas.

Some ramblings, since you asked, and I hope I am not missing the point:

I guess some (me) were waiting for more ideas on what you meant by "Security boundary". In network security, the definition is fairly clear - the line where security needs change, so a firewall, DMZ, etc. When talking about applications, a security boundary would be an area where the concept of a user diverges from the system, so your GitHub logon vs. user ids on the servers where GitHub runs - or perhaps Amazon is a better example.

The line blurs for git because we depend on the underlying user authentication mechanisms of the platform. To do anything in git, you either have to have a legitimate logon to the server where git runs or are coming in anonymously in a read-only (hopefully) fashion. In one view, your boundary expands beyond one system, making the boundary non-traditional.

The "security boundary" line is different for git than what a network security admin would consider as a similar domain. In gits terms (my view anyway), the boundary is functional. Do we want git doing something intended vs. unintended given the structure of the repository. In strict technical terms, the boundary is at fopen() and exec(). Can git access something or do something on a system and if so, should it. Conversely, is git blocked from doing something it should be able to do. This seems like well structured problem except for the introduction of incoming changes that could trigger undesired behaviour either at clone, fetch/merge time, switch or other situations where there is a side impact.

So putting the fopen() boundary into a box, that seems pretty much up to the operating system. I am not 100% sure that the safe.directory situation is required for that - although I have had customers asking for something like that for about 3 years.

There are three areas of ancillary impacts that give me continual concern: clean/smudge, hooks, workflows. Each hits the exec() boundary. Clean/smudge has a well-defined control that is up to the user or system admin to manage. Similarly hooks, although hook import has become a topic lately. The GitHub (and other app) Workflow Actions concept opens up a new area that allows the exec() boundary to be traversed, potentially with undesired side effects. Actions depends on GitHub to provide safety controls, which is outside git's responsibility although git is the transport vector through which potential problems can be introduced. We then get into "trust" and who is trusted across that boundary and is the trust justified. If it were up to me, I would want all of the incoming changes to be signed at least for accountability, but more having some kind of authentication to ensure the trust.

I'm not sure whether this is what you are looking for.

Sincerely,
Randall


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Discussion] What is Git's Security Boundary?
  2022-04-26 17:00 [Discussion] What is Git's Security Boundary? Derrick Stolee
  2022-05-16 14:13 ` Derrick Stolee
@ 2022-05-17 12:55 ` Ævar Arnfjörð Bjarmason
  2022-05-20 17:53   ` Derrick Stolee
  1 sibling, 1 reply; 7+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-05-17 12:55 UTC (permalink / raw)
  To: Derrick Stolee
  Cc: Git Mailing List, Junio C Hamano, Taylor Blau, Emily Shaffer,
	Glen Choo, Christian Couder


On Tue, Apr 26 2022, Derrick Stolee wrote:

[Snip and just focusing on the "what to do with config" part of this]:

> Question: Is "protected" config _really_ more trustworthy?
> ----------------------------------------------------------
>
> This leads to an interesting question: Do we think that `~/.gitconfig`
> and `/etc/gitconfig` are "more trustworthy" than `.git/config`?
>
> I think that if an attacker has access to write to system or global config,
> then they have access to do more harm without Git. On the other hand,
> local config overrides the other config values, so local config can "unset"
> any potentially-malicious values in system and global config. I don't
> think such "defensive config" is common.

I think this is a subset of areas where we rightfully piggy-back on FS
permissions, and should continue to do so.

I.e. we should trust /etc/gitconfig and the like because someone who can
modify it can modify /usr/bin/git anyway, so trying to defend against
anything in that area is pointless.

Yes we allow overriding that config, but that should be thought of as a
convenience, not as a nascent security boundary.

> Example Security Boundary Question: Unprotected Config and Executing Code
> -------------------------------------------------------------------------
>
> We have a lot of ways of executing external code from within Git. This is
> a key extensibility feature for customizing Git functionality. One way to
> do this is to create executable files in the $GIT_DIR/hooks/ directory.
> Git will discover these hooks and run them at the appropriate times.
>
> There are also many config options that specify external commands to run:
>
> * `core.fsmonitor=<path>` is executed before scanning the worktree for
>   created or modified files.
>
> * `core.editor` specifies an editor when writing commit messages, among
>   other user-input scenarios.
>
> * `credential.helper` specifies an external tool to assist with connecting
>   to a remote repository.
>
> * `uploadPack.packObjectsHook` chooses an alternative for `git pack-objects`
>   during `git upload-pack`.
>
> The list is actually quite long. This last one, `uploadPack.packObjectsHook`
> _does_ do a check for protected config: it does not allow its value to
> come from repository-local config.
>
> However, most of these options really do want to have customization on a
> per-repository basis, hence this proliferation of config options and
> local hook directories.
>
> I'm concerned that as long as we allow arbitrary execution to happen based
> on repository-local data, we will always have security risks in Git. For
> that reason, I'm interested in exploring ways to change how we execute
> code externally, even if it means we would (eventually) need to introduce
> a breaking change.

I agree that we should mainly be thinking of these config values that
directly execute something external, but as elaborated on below I think
any security solution that narrowly focuses only on these is on the
wrong path.

You can e.g. point git-send-email to hostile server, or disable its
SSL/TLS settings with config. Ditto HTTP settings to disable certificate
checking etc. etc.

You can also set transfer.fsckObjects=false or one of the fsck.*
settings and open the door to fetching a payload which propagates a
known part CVE. But more below.

> The idea would be to allow repository-local customization by selecting
> from a list of "installed" executables. The list of "installed"
> executables comes from protected config (and the Git binary itself).

Most of this type of config doesn't point to a path to an executable,
but is a string we'll give to "sh -c" or equivalent. E.g. for editors we
couldn't naively add "emacs" to such a whitelist, as it supports
command-line options to evaluate arbitrary code.

How would your plan handle such cases?

> The plan I would like to put forward is to restrict all external execution
> to be from one of these sources:
>
> 1. Specified in system config (`/etc/gitconfig`).
> 2. Specified in global config (`~/.gitconfig`).
> 3. An allow-list of known tools, on $PATH (e.g. `vim`).
>
> Such a change would be a major one, and would require changing a lot of
> existing code paths. In particular, this completely removes the
> functionality of the `$GIT_DIR/hooks/` directory in favor of a config-
> based approach. This would require wide communication before pulling all
> support for the old way, and likely a 3.0 version designation. After the
> old hook mechanism is removed, the "safe.directory" protection from 2.35.2
> would no longer be needed.

Aside from any of the details of safe.directory & how we discover hook
it was my understanding per [1] that Johannes Schindelin disagreed with
this assessment of what safe.directory was for.

I.e. now the known vector is a hook, but in the previous off-list
discussions I'd proposed narrowing the new safe.directory error down to
handle that hook case only, but per the "cat being out of the bag" in
[1] there was concern about other non-hook issues being found.

Perhaps that assessment has changed, just noting it here for
completeness.

In any case, I don't think that we'd need to make the removal of
$GIT_DIR/hooks support in favor of config-based hooks a dependency of
any such proposal.

The current config-based hook proposal would allow you to exhaustively
emulate $GIT_DIR/hooks by defining all our hooks to those
paths. Therefore any security mechanism could surely consider the old
$GIT_DIR/hooks to be handled equivalently to however it would handle
that sort of config-based hooks configuration.

> At minimum, in the short term, this would affect the proposed design of
> config-based hooks.
>
> I think this is a good example to think about at a high level before going
> into the technical details. We can use it to test any proposed security
> boundary definitions to see if it lands on one side or another. Here are
> some points:
>
> 1. These config-based executables cannot be set in a full repository by
>    a "git clone" operation.
>
> 2. These config-based executables can be set in an embedded bare
>    repository, but the user needs to move deeper into the repository for
>    that to have any affect. This leads to some amount of social engineering
>    being involved in the attack. See [4] for recent discussion on this.
>
>    [4] https://lore.kernel.org/git/Ylobp7sntKeWTLDX@nand.local/
>
> 3. If users are sharing a common Git repository, then if an attacker gains
>    control of one user's account, they can use the shared repository as an
>    attack vector to gain control of the other users' accounts. For this
>    case, do we consider the "safe.directory" config as an indicator of
>    "I trust this repo and all users that can access it _in perpetuity_" or
>    instead "I need to use this repo, even though it is owned by someone
>    else."
>
> 4. Git's dedication to backwards compatibility might prevent any attempt
>    to change how hooks are specified or config can be used to customize
>    its behavior.
>
> 5. The technical challenge of converting all possible execution paths may
>    be too daunting to ever feel the project is "complete" and be able to
>    confidently say "Git is safe from these kinds of attacks".
>
>
> Conclusion
> ----------
>
> I look forward to hearing from the community about this topic. There are
> so many things to think about in this space, and I hope that a lot of
> voices can share their perspectives here.
>
> Please collect any action items here at the end. I would love to add a
> doc to the Git tree that summarizes our conclusions here, even if it is
> only a start to our evolving security boundary.

This didn't make it on-list, but in the off-list discussion about
safe.directory I pointed out that this class of problem is something
Emacs has been dealing with for decades, and which we'd do well to try
to emulate. [2] below is the relevant part of my
<220303.861qzi3mag.gmgdl@evledraar.gmail.com> (sent on Thu, 03 Mar 2022
19:27:59 +0100), I also mentioned it in passing in [3].

The brief overview for it in Emacs's documentation is available here:
https://www.gnu.org/software/emacs/manual/html_node/emacs/Safe-File-Variables.html

I feel strongly that something like that is a much better direction to
go in than an approch that tries to narrowly classify only "dangerous"
config.

That sort of approach would basically do the reverse. We'd whitelist
"safe" config (e.g. diff.orderFile or whatever), and ask the user if
they're OK with using config that falls outside of the whitelisting.

By classifying our own config (and we'd probably need more than just
"safe" and "executes arbitrary code") the common case is that users
shouldn't need to answer those questions, as we'd know that the config
is safe.

This would be implemented by having a config mechanism "mark" which
area(s) of config are "safe". We'd only pay attention to such a config
from sources that area already "safe".

So, to begin with this addresses cases where e.g. a tool like git-annex
will execute arbitrary executables based on git configuration, which any
mechanism that marks only config git itself knows about won't be able to
do (it uses its own config space).

But it also extends this mechanism from being something *just* focused
on narrowly addressing security to something generally useful. E.g. even
if a repository on disk has "safe" config I might still want to say that
I don't want to use its alias.* config or whatever.

Whatever mechanism we end up with here, I think that now that the dust
has settled on the CVE we'd do well to consider those sorts of
problems.

A core.editor pointing to "rm -rf /" is a security issue, but any such
issues are just subsets of annoying third-party config doing something I
didn't ask for.

1. https://lore.kernel.org/git/220412.86h76yglfe.gmgdl@evledraar.gmail.com/
2. 
	I know I've mentioned this before, including at past dev meetings, but I
	think that people interested in our whole config / repo reading ACL and
	"post-ACL" story should really look at how Emacs solves this problem.
	
	Not just "yeah yeah, Ævar's blathering about Emacs again" :). It really
	is highly topical here, especially to the point that Johannes & Stolee
	are discussing about university setups. It's exactly the sort of setups
	it was aimed at. The problem space is essentially the same.
	
	In brief summary. Emacs's implementation language and its "config
	format" are one and the same thing. It loads it from ~, and it can load
	it from a directory you're looking at, or even the file, or over the
	network.
	
	For those familiar with "vim" it would be like its "expandtab"
	etc. "file variables" were really arbitrary C code that it would compile
	and dynamically load. Scary, I know.
	
	I think what it does to do that safely is basically the same sort of
	security model we should be aiming for in git when it comes to reading
	the config format. In a roundabout way it's doing the same thing:
	potentially executing arbitrary code.
	
	I wrote about this a while ago in the context of loading an in-repo (as
	in part of the repo content, a .gitconfig like we have .gitattributes)
	here: https://lore.kernel.org/git/87zi6eakkt.fsf@evledraar.gmail.com/;
	
	I've mentioned it a few times on-list after, following that breadcrumb
	trail should lead to relevant discussions:
	https://lore.kernel.org/git/?q=%2F87zi6eakkt.fsf%40evledraar.gmail.com
	
	The tl;dr is that we'd move to a whitelisting-based approach where (for
	example, the details could be tweaked):
	
	 1. You could define in ~/.gitconfig (or other trusted "this is really
	    the user's") what files/paths you trust.
	 2. Anything not on the list we'd either ignore or prompt you the first
	    time about.
	 3. When you "trust" a "foreign" source we'd default to a narrow whitelist,
	    i.e. saying "this config defines these 5 variables at these values".
	    Is it OK to load those from this path?
	
	    If the user says "yes" we'd write those answers to the ~/.gitconfig for
	    future reference. I.e. these config values at this path are OK.
	 4. We'd provide a way to run git commands that normally read config in a
	    way that completely ignores config, or that only trust "safe" config
	    (e.g. user.{name,email}, but nothing/little else). This could be used
	    for git-prompt.bash, git-completion.bash etc.)
	
	 5. We could (perhaps by default) prompt you for the first time you read
	    config from a repo. Doing a "git clone" or "git init" would add a "this
	    is OK", but otherwise asking would cover e.g. wget-ing a tarred-up git
	    repo whose config is malicious.
	
	Moving that sort of model would definitely break things, but hopefully
	in a way that wouldn't be too disruptive.
3. https://lore.kernel.org/git/220413.86fsmgeq15.gmgdl@evledraar.gmail.com/

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Discussion] What is Git's Security Boundary?
  2022-05-16 14:38   ` rsbecker
@ 2022-05-20 17:23     ` Derrick Stolee
  0 siblings, 0 replies; 7+ messages in thread
From: Derrick Stolee @ 2022-05-20 17:23 UTC (permalink / raw)
  To: rsbecker, 'Git Mailing List', 'Junio C Hamano',
	'Taylor Blau', 'Emily Shaffer',
	'Glen Choo',
	'Ævar Arnfjörð Bjarmason',
	'Christian Couder'

On 5/16/2022 10:38 AM, rsbecker@nexbridge.com wrote:
> On May 16, 2022 10:14 AM, Derrick Stolee wrote:
>>
>> I'm sending this email as a hopeful ping that this topic could use some feedback.
>> I'm looking forward to your ideas.
> 
> Some ramblings, since you asked, and I hope I am not missing the point:
> 
> I guess some (me) were waiting for more ideas on what you meant by
> "Security boundary". In network security, the definition is fairly clear
> - the line where security needs change, so a firewall, DMZ, etc. When
> talking about applications, a security boundary would be an area where
> the concept of a user diverges from the system, so your GitHub logon vs.
> user ids on the servers where GitHub runs - or perhaps Amazon is a
> better example.
> 
> The line blurs for git because we depend on the underlying user
> authentication mechanisms of the platform. To do anything in git, you
> either have to have a legitimate logon to the server where git runs or
> are coming in anonymously in a read-only (hopefully) fashion. In one
> view, your boundary expands beyond one system, making the boundary
> non-traditional.

Yes, this is exactly why this is an interesting discussion to have.

> The "security boundary" line is different for git than what a network
> security admin would consider as a similar domain. In gits terms (my
> view anyway), the boundary is functional. Do we want git doing something
> intended vs. unintended given the structure of the repository. In strict
> technical terms, the boundary is at fopen() and exec(). Can git access
> something or do something on a system and if so, should it. Conversely,
> is git blocked from doing something it should be able to do. This seems
> like well structured problem except for the introduction of incoming
> changes that could trigger undesired behaviour either at clone,
> fetch/merge time, switch or other situations where there is a side
> impact.

I agree that the boundary is functional. We want Git users to feel safe
running Git commands that their data will not go anywhere unintended and
no unintended behavior could comprimise their security. This is all for
things outside of the umbrella of "doing what you told Git to do," so not
understanding Git isn't a way to claim there is a security issue. Git
should push data where it is told, when the appropriate commands are run.
Git should run the hooks that are configured in the repository, since that
is an important functionality.

The biggest questions are how much we can rely on a "properly configured
and secured" system? Should we consider the filesystem to be trusted
state, or are our only concerns with data that is sent over a network? The
recent CVE around safe.directory hints that we don't always trust the file
system. Embedded Git repositories can be placed by a "git clone" but they
are not dangerous until after the user has a chance to inspect the data
that is on their filesystem.

> So putting the fopen() boundary into a box, that seems pretty much up to
> the operating system. I am not 100% sure that the safe.directory
> situation is required for that - although I have had customers asking
> for something like that for about 3 years.
> 
> There are three areas of ancillary impacts that give me continual
> concern: clean/smudge, hooks, workflows. Each hits the exec() boundary.
> Clean/smudge has a well-defined control that is up to the user or system
> admin to manage. Similarly hooks, although hook import has become a
> topic lately. The GitHub (and other app) Workflow Actions concept opens
> up a new area that allows the exec() boundary to be traversed,
> potentially with undesired side effects. Actions depends on GitHub to
> provide safety controls, which is outside git's responsibility although
> git is the transport vector through which potential problems can be
> introduced.

My biggest concern (outside of our well-established concerns over network
communication vulnerabilities) is the exec() boundary. How easy is it for
an attacker to trick Git into running a bad hook? This goes hand in hand
with how difficult it is to "install" hooks and that efforts to make that
easier are also likely to make it easier to create this kind of
vulnerability.

> We then get into "trust" and who is trusted across that
> boundary and is the trust justified. If it were up to me, I would want
> all of the incoming changes to be signed at least for accountability,
> but more having some kind of authentication to ensure the trust.

This level of trust is interesting. Outside of introducing an opt-in mode
that rejects any commits that are not signed by trusted parties, we
cannot make this change without breaking almost all existing scenarios.
This is an interesting thing to think about providing for ultra-security-
conscious folks.

Thanks for your thoughts!
-Stolee

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Discussion] What is Git's Security Boundary?
  2022-05-17 12:55 ` Ævar Arnfjörð Bjarmason
@ 2022-05-20 17:53   ` Derrick Stolee
  2022-05-21 10:22     ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 7+ messages in thread
From: Derrick Stolee @ 2022-05-20 17:53 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Git Mailing List, Junio C Hamano, Taylor Blau, Emily Shaffer,
	Glen Choo, Christian Couder

On 5/17/2022 8:55 AM, Ævar Arnfjörð Bjarmason wrote:
> 
> On Tue, Apr 26 2022, Derrick Stolee wrote:
> 
> [Snip and just focusing on the "what to do with config" part of this]:
> 
>> Question: Is "protected" config _really_ more trustworthy?
>> ----------------------------------------------------------
>>
>> This leads to an interesting question: Do we think that `~/.gitconfig`
>> and `/etc/gitconfig` are "more trustworthy" than `.git/config`?
>>
>> I think that if an attacker has access to write to system or global config,
>> then they have access to do more harm without Git. On the other hand,
>> local config overrides the other config values, so local config can "unset"
>> any potentially-malicious values in system and global config. I don't
>> think such "defensive config" is common.
> 
> I think this is a subset of areas where we rightfully piggy-back on FS
> permissions, and should continue to do so.
> 
> I.e. we should trust /etc/gitconfig and the like because someone who can
> modify it can modify /usr/bin/git anyway, so trying to defend against
> anything in that area is pointless.
> 
> Yes we allow overriding that config, but that should be thought of as a
> convenience, not as a nascent security boundary.

Thanks. This was my thought, too. I thought it worth bringing up to confirm.

>> Example Security Boundary Question: Unprotected Config and Executing Code
>> -------------------------------------------------------------------------
>>
>> We have a lot of ways of executing external code from within Git. This is
>> a key extensibility feature for customizing Git functionality. One way to
>> do this is to create executable files in the $GIT_DIR/hooks/ directory.
>> Git will discover these hooks and run them at the appropriate times.
>>
>> There are also many config options that specify external commands to run:
>>
>> * `core.fsmonitor=<path>` is executed before scanning the worktree for
>>   created or modified files.
>>
>> * `core.editor` specifies an editor when writing commit messages, among
>>   other user-input scenarios.
>>
>> * `credential.helper` specifies an external tool to assist with connecting
>>   to a remote repository.
>>
>> * `uploadPack.packObjectsHook` chooses an alternative for `git pack-objects`
>>   during `git upload-pack`.
>>
>> The list is actually quite long. This last one, `uploadPack.packObjectsHook`
>> _does_ do a check for protected config: it does not allow its value to
>> come from repository-local config.
>>
>> However, most of these options really do want to have customization on a
>> per-repository basis, hence this proliferation of config options and
>> local hook directories.
>>
>> I'm concerned that as long as we allow arbitrary execution to happen based
>> on repository-local data, we will always have security risks in Git. For
>> that reason, I'm interested in exploring ways to change how we execute
>> code externally, even if it means we would (eventually) need to introduce
>> a breaking change.
> 
> I agree that we should mainly be thinking of these config values that
> directly execute something external, but as elaborated on below I think
> any security solution that narrowly focuses only on these is on the
> wrong path.
> 
> You can e.g. point git-send-email to hostile server, or disable its
> SSL/TLS settings with config. Ditto HTTP settings to disable certificate
> checking etc. etc.
> 
> You can also set transfer.fsckObjects=false or one of the fsck.*
> settings and open the door to fetching a payload which propagates a
> known part CVE. But more below.

These examples you mention are definitely things that can go wrong, but
they become much harder to use as an attack because of the extra hoops
needed by the user.

The one thing I think is particularly interesting in your examples is the
case of checking certs. I'm thinking specifically of a case where someone
updates the local repo config to have a different remote URL and tricks
the user into pushing their private repo to another location. (Although,
why didn't they just do that themselves with their read access to the
repo?)

A leaky bucket can have many holes, but that doesn't mean we shouldn't
start patching the holes we see and can reach. And the process of fixing
the ones we know about makes it easier to fix more in the future using
similar mechanisms.

>> The idea would be to allow repository-local customization by selecting
>> from a list of "installed" executables. The list of "installed"
>> executables comes from protected config (and the Git binary itself).
> 
> Most of this type of config doesn't point to a path to an executable,
> but is a string we'll give to "sh -c" or equivalent. E.g. for editors we
> couldn't naively add "emacs" to such a whitelist, as it supports
> command-line options to evaluate arbitrary code.
>
> How would your plan handle such cases?

We could add "emacs" if we assume there are no other arguments. Extra
arguments would need to be part of the "installed hook".

>> The plan I would like to put forward is to restrict all external execution
>> to be from one of these sources:
>>
>> 1. Specified in system config (`/etc/gitconfig`).
>> 2. Specified in global config (`~/.gitconfig`).
>> 3. An allow-list of known tools, on $PATH (e.g. `vim`).
>>
>> Such a change would be a major one, and would require changing a lot of
>> existing code paths. In particular, this completely removes the
>> functionality of the `$GIT_DIR/hooks/` directory in favor of a config-
>> based approach. This would require wide communication before pulling all
>> support for the old way, and likely a 3.0 version designation. After the
>> old hook mechanism is removed, the "safe.directory" protection from 2.35.2
>> would no longer be needed.
> 
> Aside from any of the details of safe.directory & how we discover hook
> it was my understanding per [1] that Johannes Schindelin disagreed with
> this assessment of what safe.directory was for.
> 
> I.e. now the known vector is a hook, but in the previous off-list
> discussions I'd proposed narrowing the new safe.directory error down to
> handle that hook case only, but per the "cat being out of the bag" in
> [1] there was concern about other non-hook issues being found.
> 
> Perhaps that assessment has changed, just noting it here for
> completeness.

You're right that since there are other ways to use shared repos to break
user expectations and create a vulnerability, then safe.directory will
likely still be needed.

I also think that safe.directory is still insufficient because a shared
repo can be marked as "safe" but then be attacked later when a "trusted"
user is compromised.

> In any case, I don't think that we'd need to make the removal of
> $GIT_DIR/hooks support in favor of config-based hooks a dependency of
> any such proposal.
> 
> The current config-based hook proposal would allow you to exhaustively
> emulate $GIT_DIR/hooks by defining all our hooks to those
> paths. Therefore any security mechanism could surely consider the old
> $GIT_DIR/hooks to be handled equivalently to however it would handle
> that sort of config-based hooks configuration.

Here's another way I would phrase my thoughts here:

If we were designing the hook mechanism _today_, then I would absolutely
advocate that we require the hook definitions to come from protected
config and not be repository local. It is too dangerous to let this level
of arbitrary execution be done in shared repository context.

The question we face today is two-fold:

1. Is that enough of a risk that we would want to break backwards
   compatibility and remove $GIT_DIR/hooks as a hook mechanism?

2. Should any _new_ way of configuring hooks be more secure than the
   $GIT_DIR/hooks mechanism?

In my opinion, I think the answer to (2) is "absolutely yes" and the
answer to (1) is "maybe". The full answer depends on how well the new
system works, which is only something we can learn after it is built and
used in real-world scenarios.

>> At minimum, in the short term, this would affect the proposed design of
>> config-based hooks.
>>
>> I think this is a good example to think about at a high level before going
>> into the technical details. We can use it to test any proposed security
>> boundary definitions to see if it lands on one side or another. Here are
>> some points:
>>
>> 1. These config-based executables cannot be set in a full repository by
>>    a "git clone" operation.
>>
>> 2. These config-based executables can be set in an embedded bare
>>    repository, but the user needs to move deeper into the repository for
>>    that to have any affect. This leads to some amount of social engineering
>>    being involved in the attack. See [4] for recent discussion on this.
>>
>>    [4] https://lore.kernel.org/git/Ylobp7sntKeWTLDX@nand.local/
>>
>> 3. If users are sharing a common Git repository, then if an attacker gains
>>    control of one user's account, they can use the shared repository as an
>>    attack vector to gain control of the other users' accounts. For this
>>    case, do we consider the "safe.directory" config as an indicator of
>>    "I trust this repo and all users that can access it _in perpetuity_" or
>>    instead "I need to use this repo, even though it is owned by someone
>>    else."
>>
>> 4. Git's dedication to backwards compatibility might prevent any attempt
>>    to change how hooks are specified or config can be used to customize
>>    its behavior.
>>
>> 5. The technical challenge of converting all possible execution paths may
>>    be too daunting to ever feel the project is "complete" and be able to
>>    confidently say "Git is safe from these kinds of attacks".
>>
>>
>> Conclusion
>> ----------
>>
>> I look forward to hearing from the community about this topic. There are
>> so many things to think about in this space, and I hope that a lot of
>> voices can share their perspectives here.
>>
>> Please collect any action items here at the end. I would love to add a
>> doc to the Git tree that summarizes our conclusions here, even if it is
>> only a start to our evolving security boundary.
> 
> This didn't make it on-list, but in the off-list discussion about
> safe.directory I pointed out that this class of problem is something
> Emacs has been dealing with for decades, and which we'd do well to try
> to emulate. [2] below is the relevant part of my
> <220303.861qzi3mag.gmgdl@evledraar.gmail.com> (sent on Thu, 03 Mar 2022
> 19:27:59 +0100), I also mentioned it in passing in [3].
> 
> The brief overview for it in Emacs's documentation is available here:
> https://www.gnu.org/software/emacs/manual/html_node/emacs/Safe-File-Variables.html
> 
> I feel strongly that something like that is a much better direction to
> go in than an approch that tries to narrowly classify only "dangerous"
> config.
>
> That sort of approach would basically do the reverse. We'd whitelist
> "safe" config (e.g. diff.orderFile or whatever), and ask the user if
> they're OK with using config that falls outside of the whitelisting.
> 
> By classifying our own config (and we'd probably need more than just
> "safe" and "executes arbitrary code") the common case is that users
> shouldn't need to answer those questions, as we'd know that the config
> is safe.

You are focusing on the part where it displays all config that is not
known to be "safe" but ignore the parts where it refuses to take changes
for config that is known to be "risky".

Essentially I'm advocating for adding the less-invasive "never accept
risky config from untrusted sources" and you are advocating for "always
prompt for any untrusted config that isn't completely safe".
 
> This would be implemented by having a config mechanism "mark" which
> area(s) of config are "safe". We'd only pay attention to such a config
> from sources that area already "safe".

Such a direction seems like it would need a significant amount of extra
work before it would make Git usable in these shared scenarios. The
amount of "safe" config seems to be quite large _and_ continues to grow.
We would need to evaluate every boolean config option as it is written
and do an extra step to add it to this allow-list. Of course, this also
assumes that we are only guarding these repo-local config options when the
filesystem communicates that the repo is owned by someone else. I'd like
to remove the owner from the equation and stop trusting repo-local config
for things like this, if at all possible.

> So, to begin with this addresses cases where e.g. a tool like git-annex
> will execute arbitrary executables based on git configuration, which any
> mechanism that marks only config git itself knows about won't be able to
> do (it uses its own config space).
> 
> But it also extends this mechanism from being something *just* focused
> on narrowly addressing security to something generally useful. E.g. even
> if a repository on disk has "safe" config I might still want to say that
> I don't want to use its alias.* config or whatever.

This sounds again like an opposite direction: you want to have something
in your global config that ignores certain config keys in repo-local
config. That creates a user-specified deny-list which I find an interesting
direction.

My goal is to make Git safer for users who would not set up such a deny-
list.

> Whatever mechanism we end up with here, I think that now that the dust
> has settled on the CVE we'd do well to consider those sorts of
> problems.
> 
> A core.editor pointing to "rm -rf /" is a security issue, but any such
> issues are just subsets of annoying third-party config doing something I
> didn't ask for.

I feel you are making my point exactly: "rm -rf /" is just a placeholder
for something I don't want to happen and can be harmful. The flexibility
here allows attackers to be very creative with how they attack Git users
and I'd like to shut down those approaches however possible.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Discussion] What is Git's Security Boundary?
  2022-05-20 17:53   ` Derrick Stolee
@ 2022-05-21 10:22     ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 7+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-05-21 10:22 UTC (permalink / raw)
  To: Derrick Stolee
  Cc: Git Mailing List, Junio C Hamano, Taylor Blau, Emily Shaffer,
	Glen Choo, Christian Couder


On Fri, May 20 2022, Derrick Stolee wrote:

> On 5/17/2022 8:55 AM, Ævar Arnfjörð Bjarmason wrote:
> [...]
> These examples you mention are definitely things that can go wrong, but
> they become much harder to use as an attack because of the extra hoops
> needed by the user.

Yes in those cases, but as noted later we've also got e.g. git-annex
configuration which is the same arbitrary command execution.

> The one thing I think is particularly interesting in your examples is the
> case of checking certs. I'm thinking specifically of a case where someone
> updates the local repo config to have a different remote URL and tricks
> the user into pushing their private repo to another location. (Although,
> why didn't they just do that themselves with their read access to the
> repo?)

.... [replied to just below]....

> A leaky bucket can have many holes, but that doesn't mean we shouldn't
> start patching the holes we see and can reach. And the process of fixing
> the ones we know about makes it easier to fix more in the future using
> similar mechanisms.

My concern is that we'd start out with implementing a security model
using the wrong approach, without learning lessons from existing working
systems. E.g. the Emacs one that's been dealing with the same problem.

I do think that doing it that way will probably be simpler to do as an
initial implementation, and if we can create a system that's
sufficiently secure and doesn't have arbitrary restrictions (e.g. only
focusing on "executable config") we won't need any of the subsequent
work of trying to combine a more extensible system with an initial
limited implementation down the road.

>>> The idea would be to allow repository-local customization by selecting
>>> from a list of "installed" executables. The list of "installed"
>>> executables comes from protected config (and the Git binary itself).
>> 
>> Most of this type of config doesn't point to a path to an executable,
>> but is a string we'll give to "sh -c" or equivalent. E.g. for editors we
>> couldn't naively add "emacs" to such a whitelist, as it supports
>> command-line options to evaluate arbitrary code.
>>
>> How would your plan handle such cases?
>
> We could add "emacs" if we assume there are no other arguments. Extra
> arguments would need to be part of the "installed hook".

Yes, it's possible to make it work, but e.g. some want to run emacs (or
whatever editor) with "-nw", my $EDITOR is "emacsclient
--alternate-editor emacs".

So being able to e.g. have a regex to accept config for that key is more
useful than an arbitrary limitation of only allowing commands that you
could feed to "command -v" (or something), and requiring a wrapper in
$PATH for anything more complex...

>>> The plan I would like to put forward is to restrict all external execution
>>> to be from one of these sources:
>>>
>>> 1. Specified in system config (`/etc/gitconfig`).
>>> 2. Specified in global config (`~/.gitconfig`).
>>> 3. An allow-list of known tools, on $PATH (e.g. `vim`).
>>>
>>> Such a change would be a major one, and would require changing a lot of
>>> existing code paths. In particular, this completely removes the
>>> functionality of the `$GIT_DIR/hooks/` directory in favor of a config-
>>> based approach. This would require wide communication before pulling all
>>> support for the old way, and likely a 3.0 version designation. After the
>>> old hook mechanism is removed, the "safe.directory" protection from 2.35.2
>>> would no longer be needed.
>> 
>> Aside from any of the details of safe.directory & how we discover hook
>> it was my understanding per [1] that Johannes Schindelin disagreed with
>> this assessment of what safe.directory was for.
>> 
>> I.e. now the known vector is a hook, but in the previous off-list
>> discussions I'd proposed narrowing the new safe.directory error down to
>> handle that hook case only, but per the "cat being out of the bag" in
>> [1] there was concern about other non-hook issues being found.
>> 
>> Perhaps that assessment has changed, just noting it here for
>> completeness.
>
> You're right that since there are other ways to use shared repos to break
> user expectations and create a vulnerability, then safe.directory will
> likely still be needed.
>
> I also think that safe.directory is still insufficient because a shared
> repo can be marked as "safe" but then be attacked later when a "trusted"
> user is compromised.

Yes, exactly. I brought this up in the initial discussions about
safe.directory. I.e. it's by design creating scenarios where we're
making users more unsafe than they'd be if we narrowly focused just on
the hook config.

That's because we force them to set a repo as "safe" based on FS
permissions, but don't care about the content of the config, we just
assumed that it had that "bad hook".

Then once we mark it as safe the config can change, and could be changed
to something that would be truly harmful, i.e. now the "bad hook" which
wasn't there before could appear.

Which is just completely the wrong model to go for, and we should
instead go for something that works the way emacs's dir-locals
does.

There the default is to trust almost no config, and if you open an
untrusted file that wants to set new config the default UI is to ask you
*which parts specifically* you'd like to trust.

Those can then be whitelisted (either globally, or as path:config-pair),
so that when you open the file the next time and it's added the
proverbial "rm -rf /" you'll pick up only the parts of config you
already audited/trusted, not the new parts.

Which is the opposite of the direction safe.directory is taking us in,
where we're treating config safety as a boolean we can mark as a
one-off, and which doesn't take into account subsequent config changes
at all.

>> In any case, I don't think that we'd need to make the removal of
>> $GIT_DIR/hooks support in favor of config-based hooks a dependency of
>> any such proposal.
>> 
>> The current config-based hook proposal would allow you to exhaustively
>> emulate $GIT_DIR/hooks by defining all our hooks to those
>> paths. Therefore any security mechanism could surely consider the old
>> $GIT_DIR/hooks to be handled equivalently to however it would handle
>> that sort of config-based hooks configuration.
>
> Here's another way I would phrase my thoughts here:
>
> If we were designing the hook mechanism _today_, then I would absolutely
> advocate that we require the hook definitions to come from protected
> config and not be repository local. It is too dangerous to let this level
> of arbitrary execution be done in shared repository context.

I agree that it's unsafe now, but I think it's wrong to head in the
direction of "protected config".

We should trust content, not origin or FS permissions. Just as I'd be
happy to accept a "git bundle" from an untrusted source if I could
validate it against a known OID I should be able to open a completely
untrusted repo (e.g. random tar.gz I downloaded) and have its config be
enacted, but only those subsets of config I've deemed safe in that
security context.

> The question we face today is two-fold:
>
> 1. Is that enough of a risk that we would want to break backwards
>    compatibility and remove $GIT_DIR/hooks as a hook mechanism?
>
> 2. Should any _new_ way of configuring hooks be more secure than the
>    $GIT_DIR/hooks mechanism?
>
> In my opinion, I think the answer to (2) is "absolutely yes" and the
> answer to (1) is "maybe". The full answer depends on how well the new
> system works, which is only something we can learn after it is built and
> used in real-world scenarios.

I think for any replacement of safe.config we'd just treat both the same
way we'd treat any other config.

>>> [...]
>>> I look forward to hearing from the community about this topic. There are
>>> so many things to think about in this space, and I hope that a lot of
>>> voices can share their perspectives here.
>>>
>>> Please collect any action items here at the end. I would love to add a
>>> doc to the Git tree that summarizes our conclusions here, even if it is
>>> only a start to our evolving security boundary.
>> 
>> This didn't make it on-list, but in the off-list discussion about
>> safe.directory I pointed out that this class of problem is something
>> Emacs has been dealing with for decades, and which we'd do well to try
>> to emulate. [2] below is the relevant part of my
>> <220303.861qzi3mag.gmgdl@evledraar.gmail.com> (sent on Thu, 03 Mar 2022
>> 19:27:59 +0100), I also mentioned it in passing in [3].
>> 
>> The brief overview for it in Emacs's documentation is available here:
>> https://www.gnu.org/software/emacs/manual/html_node/emacs/Safe-File-Variables.html
>> 
>> I feel strongly that something like that is a much better direction to
>> go in than an approch that tries to narrowly classify only "dangerous"
>> config.
>>
>> That sort of approach would basically do the reverse. We'd whitelist
>> "safe" config (e.g. diff.orderFile or whatever), and ask the user if
>> they're OK with using config that falls outside of the whitelisting.
>> 
>> By classifying our own config (and we'd probably need more than just
>> "safe" and "executes arbitrary code") the common case is that users
>> shouldn't need to answer those questions, as we'd know that the config
>> is safe.
>
> You are focusing on the part where it displays all config that is not
> known to be "safe" but ignore the parts where it refuses to take changes
> for config that is known to be "risky".
>
> Essentially I'm advocating for adding the less-invasive "never accept
> risky config from untrusted sources" and you are advocating for "always
> prompt for any untrusted config that isn't completely safe".

No, I'm not advocating for that. You wouldn't always prompt, and
depending on your configuration you could perfectly emulate your
proposal with a dir-locals-like implementation, but the reverse isn't
true.

And we could make the default configuration match what you've suggested,
I'm suggesting that a replacement can be both more generic, extensible
and safer.

>> This would be implemented by having a config mechanism "mark" which
>> area(s) of config are "safe". We'd only pay attention to such a config
>> from sources that area already "safe".
>
> Such a direction seems like it would need a significant amount of extra
> work before it would make Git usable in these shared scenarios. The
> amount of "safe" config seems to be quite large _and_ continues to grow.
> We would need to evaluate every boolean config option as it is written
> and do an extra step to add it to this allow-list. [...]

We wouldn't need to do that, I think we should do it to be friendlier to
users, just like emacs has a list of built-in config that's safe
(e.g. what tab indent level you'd like and similar).

But we could punt that until later and only say that we found config
keys X, Y and Z, we're ignoring them for now, but you can whitelist them
in general or for this key (or path/key combination) by doing ABC.

> [...]Of course, this also
> assumes that we are only guarding these repo-local config options when the
> filesystem communicates that the repo is owned by someone else. I'd like
> to remove the owner from the equation and stop trusting repo-local config
> for things like this, if at all possible.

Yes, I'm in complete agreement with you that we should completely ignore
FS ownership. Per the above content should matter, not ownership.

Having said that I think there might be some place for FS ownership in
the end, e.g. just as you might configure that ~/git/mine has config you
trust but ~/git/unpacked-tarballs has completely untrusted config you
might know enough about your system to do that based on FS permissions
instead.

But it should not be the default we we do gatekeeping. It should be
equally safe to inspect /tmp/bad-repo created by another user as a
tarball you extracted from an untrusted source as your $USER.

>> So, to begin with this addresses cases where e.g. a tool like git-annex
>> will execute arbitrary executables based on git configuration, which any
>> mechanism that marks only config git itself knows about won't be able to
>> do (it uses its own config space).
>> 
>> But it also extends this mechanism from being something *just* focused
>> on narrowly addressing security to something generally useful. E.g. even
>> if a repository on disk has "safe" config I might still want to say that
>> I don't want to use its alias.* config or whatever.
>
> This sounds again like an opposite direction: you want to have something
> in your global config that ignores certain config keys in repo-local
> config. That creates a user-specified deny-list which I find an interesting
> direction.
>
> My goal is to make Git safer for users who would not set up such a deny-
> list.

Mine too, I'm not suggesting that anyone would set up a deny list by
default, just as emacs's dir-locals is safe by default we should ship a
sensible safe default.

But seriously, I think it would be really helpful for reaching some
understanding here if you just took some time to play with an Emacs
installation & how dir-locals works.

I'm not trying to advocate that you switch your $EDITOR (which I realize
is a borderline religious issue), nor am I mentioning emacs because it
happens to be my $EDITOR of choice.

I'm mentioning it because it's:

 1. Solving the same problem we're trying to solve here, but has an
    existing battle-tested implementation with the benefit of lessons
    learned.

 2. I'm not aware of any other software that even tries to solve the
    same problem, let alone does so successfully (but I'll happily admit
    that's probably just my own ignorance).

    E.g. last I checked vim had similar (modeline) config that relied on
    an exhaustive built-in whitelist, and other such "get config from
    file" features in editors I've tried worked the same way.

    Emacs is the only one I'm aware of where it's arbitrarily
    extensible, able to accept and safely deal with config Emacs has no
    idea about (because it's for some third party package or whatever),
    and where all of that config might range from "completely safe, at
    worst annying" (e.g. indentation configuration) to completely unsafe
    (e.g. execute this arbitrary command).

I didn't link to this upthread because I hadn't found it yet, but when
this came up on-list in the past (in a different context) I proposed
doing the git-specific config driving it like this:

    https://lore.kernel.org/git/874lkq11ug.fsf@evledraar.gmail.com/

I.e. you could accept/reject config based on sources (as in mapping to
git config's notion of --system, --global, --local etc.). In that
proposal your upthread suggestion would be implemented as something
like:

	[config "repo"]
	reject = core.fsmonitor
	reject = core.editor
	reject = credential.helper
	reject = uploadPack.packObjectsHook
	accept = *

In terms of implementation (which I guess I'm talking myself into at
this point) it shouldn't be that complex. We have various "early config"
reading already, e.g. for trace2 and "git init/clone".

So in this case we'd just early on read config to get these
"meta-config-rules" (i.e. the "[config ...]" settings above), and do
nothing with them except note what we found.

Then when we were using the config API "for real" (i.e. in later
git_config() callbacks and the like) we'd just exclude such config.

Then if we excluded anything we'd emit an advice() or whatever saying
that we did, that excluded config could be viewed by running X command,
or whitelisted (or the notice silenced) by doing Y or Z.

The exclusion logic would be a relatively simple rule-based filter
driven by something like wildmatch(). I.e. given a key like
"core.fsmonitor" see if we have any excludes (if not pass it along), if
we have excludes does it match etc.

Does any of that sound like it still wouldn't satisfy the goals you've
outlined? Again, I'm interested in the underlying mechanism, whether
we'd e.g. configure it by default to use some set of excludes or another
is, I think, best discussed later, except to the extent that one
proposal or another would have arbitrary limitations that would make
some use-case we'd have in mind impossible.

>> Whatever mechanism we end up with here, I think that now that the dust
>> has settled on the CVE we'd do well to consider those sorts of
>> problems.
>> 
>> A core.editor pointing to "rm -rf /" is a security issue, but any such
>> issues are just subsets of annoying third-party config doing something I
>> didn't ask for.
>
> I feel you are making my point exactly: "rm -rf /" is just a placeholder
> for something I don't want to happen and can be harmful. The flexibility
> here allows attackers to be very creative with how they attack Git users
> and I'd like to shut down those approaches however possible.

I think this should be addressed by the above replies, but perhaps I've
misunderstood you here, so let me know if you feel there's something
outstanding.

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2022-05-21 11:07 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-04-26 17:00 [Discussion] What is Git's Security Boundary? Derrick Stolee
2022-05-16 14:13 ` Derrick Stolee
2022-05-16 14:38   ` rsbecker
2022-05-20 17:23     ` Derrick Stolee
2022-05-17 12:55 ` Ævar Arnfjörð Bjarmason
2022-05-20 17:53   ` Derrick Stolee
2022-05-21 10:22     ` Ævar Arnfjörð Bjarmason

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.