From: John Cai <johncai86@gmail.com>
To: Glen Choo <chooglen@google.com>
Cc: git@vger.kernel.org, Emily Shaffer <emilyshaffer@google.com>,
justin@justinsteven.com, Taylor Blau <me@ttaylorr.com>
Subject: Re: Bare repositories in the working tree are a security risk
Date: Thu, 07 Apr 2022 14:38:41 -0400 [thread overview]
Message-ID: <FF4B21A6-A8B6-4142-B325-B9C9193BD885@gmail.com> (raw)
In-Reply-To: <kl6lsfqpygsj.fsf@chooglen-macbookpro.roam.corp.google.com>
Hi Glen,
This is an interesting find. Thanks for writing it up.
On 6 Apr 2022, at 18:43, Glen Choo wrote:
> Hi all,
>
> My Google colleagues and I would like to address what we think is a security
> risk - namely that Git repositories can contain valid bare repositories in their
> working trees. This is based on an excellent article by Justin Steven [1] (CC-ed
> here).
>
> Below is a discussion of:
>
> * The risky behavior - what Git does and why it is risky
> * Possible responses to this risk
> * A proposed approach
>
> The proposed changes are nontrivial, so I’d really appreciate any feedback here.
> Unfortunately, I will be out of the office and won’t respond to emails for the
> next 7 days or so, but there will still be at least one Google Git team member
> keeping tabs on the discussion :)
>
> = TL;DR
>
> Git repositories should not be allowed to contain bare repositories in their
> working trees because:
>
> * Such bare repositories can have maliciously crafted config files that cause
> `git` to execute arbitrary code.
> * Git itself can distribute the malicious repo via `git clone`; no need to
> distribute repos out of band e.g. via tarballs [2].
> * Many `git` commands can be affected by malicious config files, and many users
> have tools that will run `git` in the current directory or the subdirectories
> of a repo. Once the malicious repo has been cloned, very little social
> engineering is needed; the user might only need to open the repo in an editor
> or `cd` into the correct subdirectory.
>
> = Background
>
> (This section is primarily a summary of [1]. I highly, highly recommend reading
> that as it describes the issue in much more detail and is extremely readable
> regardless of Git experience.)
>
> Certain Git configuration options are particularly enticing targets for
> attackers, e.g. core.fsmonitor can execute arbitrary commands and is invoked
> on many innocuous-looking `git` commands (like `git status`). This is even more
> risky when one considers that many tools (like shell prompts and IDEs) will run
> `git` opportunistically inside a repository - so many users won't even need to
> explicitly run `git` to be affected [3].
>
> Since config files are such an enticing target for attackers, Git intentionally
> avoids distributing config files with repos - a user shouldn't be able to `git
> clone` a repo with a config file (or really, any files inside .git). However,
> one can 'trick' Git into doing this by embedding a bare repository inside of
> another, containing repository: a repository can contain subdirectories that are
> valid bare repositories, and any `git` operations run in such a subdirectory
> will then use the bare repository’s config file instead of the "main"
> repository’s.
>
> An attack might look like this:
>
> * Attacker creates a repository where subdirectory "Documentation/" is a bare
> repository i.e. it contains "HEAD", "refs/" and "objects/" [4]. Attacker
> also adds "config" with a malicious setting for core.fsmonitor.
> * Attacker convinces User to read their project's documentation by `git
> clone`-ing their repository and inspecting the "Documentation/" directory.
> * User cd-s into "Documentation/" and their shell prompt runs `git status`,
This might be obvious to most reading this, but for those trying this out at home,
you'll need to create a worktree that points to the bare repository and then run
`git status` in there.
> executing the core.fsmonitor command defined by Attacker.
This is easily reproduceable.
>
> = What can we do about it?
>
> Each subsection is an alternative and an analysis (+/- are pros/cons).
>
> == 1. Prevent users from checking out bare repos
>
> This is similar to an existing mitigation where we prevent repository entries
> that can be confused for ".git" (like ".GIT"). but it requires checking multiple
> entries instead of a single entry. I suspect that we could accomplish this in
> one of two ways:
>
> a. Prevent bare repos from entering the index.
> b. Prevent writing bare repos to the working tree.
>
> + Relatively robust protection - because the malicious repo never enters the
> working tree, we even protect other tools (e.g. JGit) from doing dangerous
> things in the embedded repo (provided the checkout is done with `git`, of
> course).
> - This breaks some 'valid' workflows (e.g. someone embedding a bare repo as a
> more convenient alternative to submodules), but it seems reasonable to let
> users opt out of this behavior.
> - (1a) is difficult to do in practice because many code paths add entries to
> the index, and checking a combination of new entry and existing entries is
> much trickier than checking just the new entry.
> - (1b) might also be difficult, though not as difficult as 1a because we
> already have a complete list of entries we will write. I don’t think there
> are existing facilities that do this sort of checking of multiple entries.
>
> == 2. Detect and reject bare repos using `git fsck` and `transfer.fsckObjects`.
>
> This entails checking for the markers of a bare repository (HEAD, refs/,
> objects/) in tree objects. This shares a precedent with (1), since `git fsck`
> will also detect ".GIT".
>
> + Most reputable hosting sites set `transfer.fsckObjects`, which allows them to
> detect and prevent this kind of transmission.
> + Confers some protection to users even without them doing anything.
> + Easy to visualize and to write.
> - This won’t directly protect most users because they don’t typically set
> `transfer.fsckObjects` (in fact, `transfer.fsckObjects` will render many
> projects, like linux.git, uncloneable without additional config)
> - Won’t guard against malicious/poorly configured hosts.
>
> == 3. Detect that we are in an embedded bare repo and ignore the embedded bare
> repository in favor of the containing repo.
>
> For example, if setup.c detects that we are in a bare repo, it could then walk
> up the directory hierarchy to see if there is a containing repo that tracks the
> bare one. If so, setup.c chooses to use the containing repo instead.
>
> + Extremely robust; this even protects against a checkout by an earlier Git
> version.
> + Users who don’t use bare repos won’t even notice the difference.
> - The change in rules breaks some legitimate workflows e.g. a user with a repo
> at HOME and bare repos underneath.
> - Potentially very expensive for bare repo users because setup.c will likely
> walk up to the root directory; we’d be essentially *forcing* those users to
> set GIT_CEILING_DIRECTORIES.
> - Doesn’t protect users who use other tools e.g. JGit.
>
> == 4. Educate users about this risk without making code changes.
>
> Some risks fall into this category e.g. "Avoid unarchiving repositories because
> .git/config might be poisoned." [2].
>
> + We don’t break any users.
> - Breaks the trust that users have in `git clone`.
> - IMO "Inspect every repo in its entirety before cloning it" is too much of a
> burden to put on users.
>
> = Next steps
>
> I propose that we prevent repositories from containing bare repositories by
> doing the following (in order):
>
> * Implement (2) by adding a new fsck message "embeddedBareRepo".
> * When this is done, hosting sites can hopefully use this capability to
> prevent transmission, and help us understand the prevalence of such attacks.
> * Implement (1b) by teaching unpack_trees.c to check whether the tree contains
> an entire bare repo, and die() if so. This will be guarded by a
> defaults-to-true config value.
> * This would only block a bare repo from being written in a single operation.
> It wouldn’t stop a user from writing a bare repo entry-by-entry using "git
> checkout <path>", but the amount of social engineering required probably
> renders this attack infeasible.
> * As I noted earlier, I foresee some difficulty actually implementing this
> because I don’t think we have facilities for checking multiple tree entries
> at once.
>
> I am particularly interested in hearing feedback about (1b), namely:
>
> * How to actually teach unpack_trees.c to detect bare repos.
> * Whether preventing bare repos in unpack_trees.c is a good enough mitigation
> (e.g. Are there other code paths that should block bare repos? Should we be
> checking the index instead of the tree?).
>
> I have a patch that does (2); I will send that out and we can leave feedback on
> that separately.
>
> = Demonstration
>
> This is based on a script by Taylor Blau (thanks!). [1] also contains a
> demonstration that runs in Docker.
>
> #!/bin/sh
>
> rm -fr malicious cloned &&git init malicious &&
>
> (
> cd malicious &&
>
> mkdir -p bare &&
> cd bare &&
>
> echo 'ref: refs/heads/main' >HEAD &&
> cat >config <<-EOF
> [core]
> repositoryformatversion = 0
> filemode = true
> bare = false
> worktree = "worktree"
> fsmonitor = "echo pwned >&2; false"
> EOF
>
> mkdir objects refs worktree &&
> touch worktree/.gitkeep &&
>
> git add . &&
> git commit -m ".gitkeep" &&
>
> cd .. &&
> git add bare &&
> git commit -m 'initial commit'
> ) &&
>
> git clone --no-local malicious cloned &&
> cd cloned/bare &&
> git status # pwned
>
> = Footnotes
>
> [1] https://github.com/justinsteven/advisories/blob/main/2022_git_buried_bare_repos_and_fsmonitor_various_abuses.md.
> [2] Archived repositories with malicious .git directories are a known risk. See
> https://lore.kernel.org/git/20171003123239.lisk43a2goxtxkro@sigill.intra.peff.net/
> for an on-list discussion. This is also described in
> https://blog.sonarsource.com/securing-developer-tools-git-integrations
> (referenced in [1]).
> [3] We even ship such a tool - contrib/completion/git-prompt.sh. A user can pwn
> themselves with tab completion even if they don’t have a prompt configured
> [4] Any directory containing these entries will be recognized as a bare repo, so
> an attacker could add arbitrary entries to the directory to obfuscate the fact
> that it is a bare repo.
> [*] https://offensi.com/2019/12/16/4-google-cloud-shell-bugs-explained-bug-3/ is
> similar to [1], but uses hooks instead of core.fsmonitor.
next prev parent reply other threads:[~2022-04-07 18:38 UTC|newest]
Thread overview: 50+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-04-06 22:43 Bare repositories in the working tree are a security risk Glen Choo
2022-04-06 23:22 ` [PATCH] fsck: detect bare repos in trees and warn Glen Choo
2022-04-07 12:42 ` Johannes Schindelin
2022-04-07 13:21 ` Derrick Stolee
2022-04-07 14:14 ` Ævar Arnfjörð Bjarmason
2022-04-14 20:02 ` Glen Choo
2022-04-15 12:46 ` Ævar Arnfjörð Bjarmason
2022-04-07 15:11 ` Junio C Hamano
2022-04-13 22:24 ` Glen Choo
2022-04-07 13:12 ` Ævar Arnfjörð Bjarmason
2022-04-07 15:20 ` Junio C Hamano
2022-04-07 18:38 ` John Cai [this message]
2022-04-07 21:24 ` Bare repositories in the working tree are a security risk brian m. carlson
2022-04-07 21:53 ` Justin Steven
2022-04-07 22:10 ` brian m. carlson
2022-04-07 22:40 ` rsbecker
2022-04-08 5:54 ` Junio C Hamano
2022-04-14 0:03 ` Junio C Hamano
2022-04-14 0:04 ` Glen Choo
2022-04-13 23:44 ` Glen Choo
2022-04-13 20:37 ` Glen Choo
2022-04-13 23:36 ` Junio C Hamano
2022-04-14 16:41 ` Glen Choo
2022-04-14 17:35 ` Junio C Hamano
2022-04-14 18:19 ` Junio C Hamano
2022-04-15 21:33 ` Glen Choo
2022-04-15 22:17 ` Junio C Hamano
2022-04-16 0:52 ` Taylor Blau
2022-04-15 22:43 ` Glen Choo
2022-04-15 20:13 ` Junio C Hamano
2022-04-15 23:45 ` Glen Choo
2022-04-15 23:59 ` Glen Choo
2022-04-16 1:00 ` Taylor Blau
2022-04-16 1:18 ` Junio C Hamano
2022-04-16 1:30 ` Taylor Blau
2022-04-16 0:34 ` Glen Choo
2022-04-16 0:41 ` Glen Choo
2022-04-16 1:28 ` Taylor Blau
2022-04-21 18:25 ` Emily Shaffer
2022-04-21 18:29 ` Emily Shaffer
2022-04-21 18:47 ` Junio C Hamano
2022-04-21 18:54 ` Taylor Blau
2022-04-21 19:09 ` Taylor Blau
2022-04-21 21:01 ` Emily Shaffer
2022-04-21 21:22 ` Taylor Blau
2022-04-29 23:57 ` Glen Choo
2022-04-30 1:14 ` Taylor Blau
2022-05-02 19:39 ` Glen Choo
2022-05-02 14:05 ` Philip Oakley
2022-05-02 18:50 ` Junio C Hamano
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=FF4B21A6-A8B6-4142-B325-B9C9193BD885@gmail.com \
--to=johncai86@gmail.com \
--cc=chooglen@google.com \
--cc=emilyshaffer@google.com \
--cc=git@vger.kernel.org \
--cc=justin@justinsteven.com \
--cc=me@ttaylorr.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).