Re: [Discussion] What is Git's Security Boundary?

From: Derrick Stolee <derrickstolee@github.com>
To: rsbecker@nexbridge.com,
	"'Git Mailing List'" <git@vger.kernel.org>,
	"'Junio C Hamano'" <gitster@pobox.com>,
	"'Taylor Blau'" <me@ttaylorr.com>,
	"'Emily Shaffer'" <emilyshaffer@google.com>,
	"'Glen Choo'" <chooglen@google.com>,
	"'Ævar Arnfjörð Bjarmason'" <avarab@gmail.com>,
	"'Christian Couder'" <christian.couder@gmail.com>
Subject: Re: [Discussion] What is Git's Security Boundary?
Date: Fri, 20 May 2022 13:23:31 -0400
Message-ID: <35e80e21-7388-8047-d8b9-02e136d20e04@github.com>
In-Reply-To: <004d01d86932$a36f95f0$ea4ec1d0$@nexbridge.com>

On 5/16/2022 10:38 AM, rsbecker@nexbridge.com wrote:
> On May 16, 2022 10:14 AM, Derrick Stolee wrote:
>>
>> I'm sending this email as a hopeful ping that this topic could use some feedback.
>> I'm looking forward to your ideas.
> 
> Some ramblings, since you asked, and I hope I am not missing the point:
> 
> I guess some (me) were waiting for more ideas on what you meant by
> "Security boundary". In network security, the definition is fairly clear
> - the line where security needs change, so a firewall, DMZ, etc. When
> talking about applications, a security boundary would be an area where
> the concept of a user diverges from the system, so your GitHub logon vs.
> user ids on the servers where GitHub runs - or perhaps Amazon is a
> better example.
> 
> The line blurs for git because we depend on the underlying user
> authentication mechanisms of the platform. To do anything in git, you
> either have to have a legitimate logon to the server where git runs or
> are coming in anonymously in a read-only (hopefully) fashion. In one
> view, your boundary expands beyond one system, making the boundary
> non-traditional.

Yes, this is exactly why this is an interesting discussion to have.

> The "security boundary" line is different for git than what a network
> security admin would consider as a similar domain. In Git's terms (my
> view anyway), the boundary is functional. Do we want git doing something
> intended vs. unintended given the structure of the repository. In strict
> technical terms, the boundary is at fopen() and exec(). Can git access
> something or do something on a system and if so, should it. Conversely,
> is git blocked from doing something it should be able to do. This seems
> like a well-structured problem except for the introduction of incoming
> changes that could trigger undesired behaviour either at clone,
> fetch/merge time, switch or other situations where there is a side
> impact.

I agree that the boundary is functional. We want Git users to feel safe
running Git commands, confident that their data will not go anywhere
unintended and that no unintended behavior could compromise their
security. This covers only things outside of the umbrella of "doing what
you told Git to do," so not understanding Git isn't a way to claim there
is a security issue. Git should push data where it is told, when the
appropriate commands are run. Git should run the hooks that are configured
in the repository, since that is important functionality.

The biggest question is how much we can rely on a "properly configured
and secured" system. Should we consider the filesystem to be trusted
state, or are our only concerns with data that is sent over a network? The
recent CVE around safe.directory hints that we don't always trust the file
system. Embedded Git repositories can be placed by a "git clone" but they
are not dangerous until after the user has a chance to inspect the data
that is on their filesystem.
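
To make the "do we trust the filesystem" question concrete, here is a
rough sketch of the kind of ownership check that safe.directory implies.
This is a simplification for discussion, not Git's actual implementation:
the function name and its parameters are hypothetical, and it only looks
at the owner of the given directory on a POSIX system.

```python
import os


def is_trusted_repo(repo_path, safe_directories, current_uid=None):
    """Return True if repo_path is owned by the current user, or if it
    has been explicitly allowed (in the spirit of safe.directory)."""
    if current_uid is None:
        current_uid = os.getuid()  # POSIX only

    real = os.path.realpath(repo_path)

    # Trust directories that the current user owns.
    if os.stat(real).st_uid == current_uid:
        return True

    # Otherwise require an explicit opt-in, with "*" meaning "allow all".
    for entry in safe_directories:
        if entry == "*" or os.path.realpath(entry) == real:
            return True
    return False
```

The point of the sketch is that the filesystem is only trusted when the
user either owns the data or has said so explicitly.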

> So putting the fopen() boundary into a box, that seems pretty much up to
> the operating system. I am not 100% sure that the safe.directory
> situation is required for that - although I have had customers asking
> for something like that for about 3 years.
> 
> There are three areas of ancillary impacts that give me continual
> concern: clean/smudge, hooks, workflows. Each hits the exec() boundary.
> Clean/smudge has a well-defined control that is up to the user or system
> admin to manage. Similarly hooks, although hook import has become a
> topic lately. The GitHub (and other app) Workflow Actions concept opens
> up a new area that allows the exec() boundary to be traversed,
> potentially with undesired side effects. Actions depends on GitHub to
> provide safety controls, which is outside git's responsibility although
> git is the transport vector through which potential problems can be
> introduced.

My biggest concern (outside of our well-established concerns over network
communication vulnerabilities) is the exec() boundary. How easy is it for
an attacker to trick Git into running a bad hook? This goes hand in hand
with how difficult it is to "install" hooks: efforts to make that easier
are also likely to make it easier to create this kind of vulnerability.
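
As a thought experiment, a guard at the exec() boundary could look
something like the sketch below. To be clear, this is a hypothetical
policy and not what Git does today; the function name and the specific
rules (regular file, owned by the current user, not writable by group or
others, executable) are assumptions I picked for illustration.

```python
import os
import stat


def hook_looks_safe(hook_path, current_uid=None):
    """Hypothetical pre-exec check for a hook file (POSIX only)."""
    if current_uid is None:
        current_uid = os.getuid()

    try:
        st = os.lstat(hook_path)  # lstat: do not follow symlinks
    except FileNotFoundError:
        return False

    # Refuse symlinks, directories, and other non-regular files.
    if not stat.S_ISREG(st.st_mode):
        return False

    # The hook must belong to the user who is about to run it.
    if st.st_uid != current_uid:
        return False

    # Nobody else should be able to modify the hook.
    if st.st_mode & (stat.S_IWGRP | stat.S_IWOTH):
        return False

    # Finally, it has to be executable by its owner.
    return bool(st.st_mode & stat.S_IXUSR)
```

Every rule added here makes a bad hook harder to slip in, and makes
legitimate hook installation a little harder too, which is exactly the
tension above.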

> We then get into "trust" and who is trusted across that
> boundary and is the trust justified. If it were up to me, I would want
> all of the incoming changes to be signed at least for accountability,
> but more having some kind of authentication to ensure the trust.

This level of trust is interesting. Outside of introducing an opt-in mode
that rejects any commits that are not signed by trusted parties, we
cannot make this change without breaking almost all existing scenarios.
This is an interesting thing to think about providing for ultra-security-
conscious folks.
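
A minimal sketch of what such an opt-in policy might decide, assuming
signature verification has already happened elsewhere and we only have a
verified signer (or nothing) per commit. The names CommitInfo,
rejected_commits, and require_signed are made up for this example and do
not correspond to any existing Git option.

```python
from typing import Iterable, List, NamedTuple, Optional


class CommitInfo(NamedTuple):
    commit_id: str
    signer: Optional[str]  # verified signer identity, None if unsigned


def rejected_commits(
    commits: Iterable[CommitInfo],
    trusted_signers: Iterable[str],
    require_signed: bool = False,
) -> List[str]:
    """Return the ids of commits the policy would reject.

    With require_signed off (the default), nothing is rejected, so
    existing workflows keep working unchanged.
    """
    if not require_signed:
        return []
    trusted = set(trusted_signers)
    return [
        c.commit_id
        for c in commits
        if c.signer is None or c.signer not in trusted
    ]
```

Keeping the default off is what avoids breaking existing scenarios; only
users who opt in would see unsigned or untrusted commits refused.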

Thanks for your thoughts!
-Stolee