From: John Cai <email@example.com> To: Glen Choo <firstname.lastname@example.org> Cc: email@example.com, Emily Shaffer <firstname.lastname@example.org>, email@example.com, Taylor Blau <firstname.lastname@example.org> Subject: Re: Bare repositories in the working tree are a security risk Date: Thu, 07 Apr 2022 14:38:41 -0400 [thread overview] Message-ID: <FF4B21A6-A8B6-4142-B325-B9C9193BD885@gmail.com> (raw) In-Reply-To: <email@example.com> Hi Glen, This is an interesting find. Thanks for writing it up. On 6 Apr 2022, at 18:43, Glen Choo wrote: > Hi all, > > My Google colleagues and I would like to address what we think is a security > risk - namely that Git repositories can contain valid bare repositories in their > working trees. This is based on an excellent article by Justin Steven  (CC-ed > here). > > Below is a discussion of: > > * The risky behavior - what Git does and why it is risky > * Possible responses to this risk > * A proposed approach > > The proposed changes are nontrivial, so I’d really appreciate any feedback here. > Unfortunately, I will be out of the office and won’t respond to emails for the > next 7 days or so, but there will still be at least one Google Git team member > keeping tabs on the discussion :) > > = TL;DR > > Git repositories should not be allowed to contain bare repositories in their > working trees because: > > * Such bare repositories can have maliciously crafted config files that cause > `git` to execute arbitrary code. > * Git itself can distribute the malicious repo via `git clone`; no need to > distribute repos out of band e.g. via tarballs . > * Many `git` commands can be affected by malicious config files, and many users > have tools that will run `git` in the current directory or the subdirectories > of a repo. Once the malicious repo has been cloned, very little social > engineering is needed; the user might only need to open the repo in an editor > or `cd` into the correct subdirectory. > > = Background > > (This section is primarily a summary of . I highly, highly recommend reading > that as it describes the issue in much more detail and is extremely readable > regardless of Git experience.) > > Certain Git configuration options are particularly enticing targets for > attackers, e.g. core.fsmonitor can execute arbitrary commands and is invoked > on many innocuous-looking `git` commands (like `git status`). This is even more > risky when one considers that many tools (like shell prompts and IDEs) will run > `git` opportunistically inside a repository - so many users won't even need to > explicitly run `git` to be affected . > > Since config files are such an enticing target for attackers, Git intentionally > avoids distributing config files with repos - a user shouldn't be able to `git > clone` a repo with a config file (or really, any files inside .git). However, > one can 'trick' Git into doing this by embedding a bare repository inside of > another, containing repository: a repository can contain subdirectories that are > valid bare repositories, and any `git` operations run in such a subdirectory > will then use the bare repository’s config file instead of the "main" > repository’s. > > An attack might look like this: > > * Attacker creates a repository where subdirectory "Documentation/" is a bare > repository i.e. it contains "HEAD", "refs/" and "objects/" . Attacker > also adds "config" with a malicious setting for core.fsmonitor. > * Attacker convinces User to read their project's documentation by `git > clone`-ing their repository and inspecting the "Documentation/" directory. > * User cd-s into "Documentation/" and their shell prompt runs `git status`, This might be obvious to most reading this, but for those trying this out at home, you'll need to create a worktree that points to the bare repository and then run `git status` in there. > executing the core.fsmonitor command defined by Attacker. This is easily reproduceable. > > = What can we do about it? > > Each subsection is an alternative and an analysis (+/- are pros/cons). > > == 1. Prevent users from checking out bare repos > > This is similar to an existing mitigation where we prevent repository entries > that can be confused for ".git" (like ".GIT"). but it requires checking multiple > entries instead of a single entry. I suspect that we could accomplish this in > one of two ways: > > a. Prevent bare repos from entering the index. > b. Prevent writing bare repos to the working tree. > > + Relatively robust protection - because the malicious repo never enters the > working tree, we even protect other tools (e.g. JGit) from doing dangerous > things in the embedded repo (provided the checkout is done with `git`, of > course). > - This breaks some 'valid' workflows (e.g. someone embedding a bare repo as a > more convenient alternative to submodules), but it seems reasonable to let > users opt out of this behavior. > - (1a) is difficult to do in practice because many code paths add entries to > the index, and checking a combination of new entry and existing entries is > much trickier than checking just the new entry. > - (1b) might also be difficult, though not as difficult as 1a because we > already have a complete list of entries we will write. I don’t think there > are existing facilities that do this sort of checking of multiple entries. > > == 2. Detect and reject bare repos using `git fsck` and `transfer.fsckObjects`. > > This entails checking for the markers of a bare repository (HEAD, refs/, > objects/) in tree objects. This shares a precedent with (1), since `git fsck` > will also detect ".GIT". > > + Most reputable hosting sites set `transfer.fsckObjects`, which allows them to > detect and prevent this kind of transmission. > + Confers some protection to users even without them doing anything. > + Easy to visualize and to write. > - This won’t directly protect most users because they don’t typically set > `transfer.fsckObjects` (in fact, `transfer.fsckObjects` will render many > projects, like linux.git, uncloneable without additional config) > - Won’t guard against malicious/poorly configured hosts. > > == 3. Detect that we are in an embedded bare repo and ignore the embedded bare > repository in favor of the containing repo. > > For example, if setup.c detects that we are in a bare repo, it could then walk > up the directory hierarchy to see if there is a containing repo that tracks the > bare one. If so, setup.c chooses to use the containing repo instead. > > + Extremely robust; this even protects against a checkout by an earlier Git > version. > + Users who don’t use bare repos won’t even notice the difference. > - The change in rules breaks some legitimate workflows e.g. a user with a repo > at HOME and bare repos underneath. > - Potentially very expensive for bare repo users because setup.c will likely > walk up to the root directory; we’d be essentially *forcing* those users to > set GIT_CEILING_DIRECTORIES. > - Doesn’t protect users who use other tools e.g. JGit. > > == 4. Educate users about this risk without making code changes. > > Some risks fall into this category e.g. "Avoid unarchiving repositories because > .git/config might be poisoned." . > > + We don’t break any users. > - Breaks the trust that users have in `git clone`. > - IMO "Inspect every repo in its entirety before cloning it" is too much of a > burden to put on users. > > = Next steps > > I propose that we prevent repositories from containing bare repositories by > doing the following (in order): > > * Implement (2) by adding a new fsck message "embeddedBareRepo". > * When this is done, hosting sites can hopefully use this capability to > prevent transmission, and help us understand the prevalence of such attacks. > * Implement (1b) by teaching unpack_trees.c to check whether the tree contains > an entire bare repo, and die() if so. This will be guarded by a > defaults-to-true config value. > * This would only block a bare repo from being written in a single operation. > It wouldn’t stop a user from writing a bare repo entry-by-entry using "git > checkout <path>", but the amount of social engineering required probably > renders this attack infeasible. > * As I noted earlier, I foresee some difficulty actually implementing this > because I don’t think we have facilities for checking multiple tree entries > at once. > > I am particularly interested in hearing feedback about (1b), namely: > > * How to actually teach unpack_trees.c to detect bare repos. > * Whether preventing bare repos in unpack_trees.c is a good enough mitigation > (e.g. Are there other code paths that should block bare repos? Should we be > checking the index instead of the tree?). > > I have a patch that does (2); I will send that out and we can leave feedback on > that separately. > > = Demonstration > > This is based on a script by Taylor Blau (thanks!).  also contains a > demonstration that runs in Docker. > > #!/bin/sh > > rm -fr malicious cloned &&git init malicious && > > ( > cd malicious && > > mkdir -p bare && > cd bare && > > echo 'ref: refs/heads/main' >HEAD && > cat >config <<-EOF > [core] > repositoryformatversion = 0 > filemode = true > bare = false > worktree = "worktree" > fsmonitor = "echo pwned >&2; false" > EOF > > mkdir objects refs worktree && > touch worktree/.gitkeep && > > git add . && > git commit -m ".gitkeep" && > > cd .. && > git add bare && > git commit -m 'initial commit' > ) && > > git clone --no-local malicious cloned && > cd cloned/bare && > git status # pwned > > = Footnotes > >  https://github.com/justinsteven/advisories/blob/main/2022_git_buried_bare_repos_and_fsmonitor_various_abuses.md. >  Archived repositories with malicious .git directories are a known risk. See > https://firstname.lastname@example.org/ > for an on-list discussion. This is also described in > https://blog.sonarsource.com/securing-developer-tools-git-integrations > (referenced in ). >  We even ship such a tool - contrib/completion/git-prompt.sh. A user can pwn > themselves with tab completion even if they don’t have a prompt configured >  Any directory containing these entries will be recognized as a bare repo, so > an attacker could add arbitrary entries to the directory to obfuscate the fact > that it is a bare repo. > [*] https://offensi.com/2019/12/16/4-google-cloud-shell-bugs-explained-bug-3/ is > similar to , but uses hooks instead of core.fsmonitor.
next prev parent reply other threads:[~2022-04-07 18:38 UTC|newest] Thread overview: 50+ messages / expand[flat|nested] mbox.gz Atom feed top 2022-04-06 22:43 Glen Choo 2022-04-06 23:22 ` [PATCH] fsck: detect bare repos in trees and warn Glen Choo 2022-04-07 12:42 ` Johannes Schindelin 2022-04-07 13:21 ` Derrick Stolee 2022-04-07 14:14 ` Ævar Arnfjörð Bjarmason 2022-04-14 20:02 ` Glen Choo 2022-04-15 12:46 ` Ævar Arnfjörð Bjarmason 2022-04-07 15:11 ` Junio C Hamano 2022-04-13 22:24 ` Glen Choo 2022-04-07 13:12 ` Ævar Arnfjörð Bjarmason 2022-04-07 15:20 ` Junio C Hamano 2022-04-07 18:38 ` John Cai [this message] 2022-04-07 21:24 ` Bare repositories in the working tree are a security risk brian m. carlson 2022-04-07 21:53 ` Justin Steven 2022-04-07 22:10 ` brian m. carlson 2022-04-07 22:40 ` rsbecker 2022-04-08 5:54 ` Junio C Hamano 2022-04-14 0:03 ` Junio C Hamano 2022-04-14 0:04 ` Glen Choo 2022-04-13 23:44 ` Glen Choo 2022-04-13 20:37 ` Glen Choo 2022-04-13 23:36 ` Junio C Hamano 2022-04-14 16:41 ` Glen Choo 2022-04-14 17:35 ` Junio C Hamano 2022-04-14 18:19 ` Junio C Hamano 2022-04-15 21:33 ` Glen Choo 2022-04-15 22:17 ` Junio C Hamano 2022-04-16 0:52 ` Taylor Blau 2022-04-15 22:43 ` Glen Choo 2022-04-15 20:13 ` Junio C Hamano 2022-04-15 23:45 ` Glen Choo 2022-04-15 23:59 ` Glen Choo 2022-04-16 1:00 ` Taylor Blau 2022-04-16 1:18 ` Junio C Hamano 2022-04-16 1:30 ` Taylor Blau 2022-04-16 0:34 ` Glen Choo 2022-04-16 0:41 ` Glen Choo 2022-04-16 1:28 ` Taylor Blau 2022-04-21 18:25 ` Emily Shaffer 2022-04-21 18:29 ` Emily Shaffer 2022-04-21 18:47 ` Junio C Hamano 2022-04-21 18:54 ` Taylor Blau 2022-04-21 19:09 ` Taylor Blau 2022-04-21 21:01 ` Emily Shaffer 2022-04-21 21:22 ` Taylor Blau 2022-04-29 23:57 ` Glen Choo 2022-04-30 1:14 ` Taylor Blau 2022-05-02 19:39 ` Glen Choo 2022-05-02 14:05 ` Philip Oakley 2022-05-02 18:50 ` Junio C Hamano
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=FF4B21A6-A8B6-4142-B325-B9C9193BD885@gmail.com \ --email@example.com \ --firstname.lastname@example.org \ --email@example.com \ --firstname.lastname@example.org \ --email@example.com \ --firstname.lastname@example.org \ --subject='Re: Bare repositories in the working tree are a security risk' \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).