* Bare repositories in the working tree are a security risk @ 2022-04-06 22:43 Glen Choo 2022-04-06 23:22 ` [PATCH] fsck: detect bare repos in trees and warn Glen Choo ` (5 more replies) 0 siblings, 6 replies; 50+ messages in thread From: Glen Choo @ 2022-04-06 22:43 UTC (permalink / raw) To: git; +Cc: Emily Shaffer, justin, Taylor Blau Hi all, My Google colleagues and I would like to address what we think is a security risk - namely that Git repositories can contain valid bare repositories in their working trees. This is based on an excellent article by Justin Steven [1] (CC-ed here). Below is a discussion of: * The risky behavior - what Git does and why it is risky * Possible responses to this risk * A proposed approach The proposed changes are nontrivial, so I’d really appreciate any feedback here. Unfortunately, I will be out of the office and won’t respond to emails for the next 7 days or so, but there will still be at least one Google Git team member keeping tabs on the discussion :) = TL;DR Git repositories should not be allowed to contain bare repositories in their working trees because: * Such bare repositories can have maliciously crafted config files that cause `git` to execute arbitrary code. * Git itself can distribute the malicious repo via `git clone`; no need to distribute repos out of band e.g. via tarballs [2]. * Many `git` commands can be affected by malicious config files, and many users have tools that will run `git` in the current directory or the subdirectories of a repo. Once the malicious repo has been cloned, very little social engineering is needed; the user might only need to open the repo in an editor or `cd` into the correct subdirectory. = Background (This section is primarily a summary of [1]. I highly, highly recommend reading that as it describes the issue in much more detail and is extremely readable regardless of Git experience.) Certain Git configuration options are particularly enticing targets for attackers, e.g. core.fsmonitor can execute arbitrary commands and is invoked on many innocuous-looking `git` commands (like `git status`). This is even more risky when one considers that many tools (like shell prompts and IDEs) will run `git` opportunistically inside a repository - so many users won't even need to explicitly run `git` to be affected [3]. Since config files are such an enticing target for attackers, Git intentionally avoids distributing config files with repos - a user shouldn't be able to `git clone` a repo with a config file (or really, any files inside .git). However, one can 'trick' Git into doing this by embedding a bare repository inside of another, containing repository: a repository can contain subdirectories that are valid bare repositories, and any `git` operations run in such a subdirectory will then use the bare repository’s config file instead of the "main" repository’s. An attack might look like this: * Attacker creates a repository where subdirectory "Documentation/" is a bare repository i.e. it contains "HEAD", "refs/" and "objects/" [4]. Attacker also adds "config" with a malicious setting for core.fsmonitor. * Attacker convinces User to read their project's documentation by `git clone`-ing their repository and inspecting the "Documentation/" directory. * User cd-s into "Documentation/" and their shell prompt runs `git status`, executing the core.fsmonitor command defined by Attacker. = What can we do about it? Each subsection is an alternative and an analysis (+/- are pros/cons). == 1. Prevent users from checking out bare repos This is similar to an existing mitigation where we prevent repository entries that can be confused for ".git" (like ".GIT"). but it requires checking multiple entries instead of a single entry. I suspect that we could accomplish this in one of two ways: a. Prevent bare repos from entering the index. b. Prevent writing bare repos to the working tree. + Relatively robust protection - because the malicious repo never enters the working tree, we even protect other tools (e.g. JGit) from doing dangerous things in the embedded repo (provided the checkout is done with `git`, of course). - This breaks some 'valid' workflows (e.g. someone embedding a bare repo as a more convenient alternative to submodules), but it seems reasonable to let users opt out of this behavior. - (1a) is difficult to do in practice because many code paths add entries to the index, and checking a combination of new entry and existing entries is much trickier than checking just the new entry. - (1b) might also be difficult, though not as difficult as 1a because we already have a complete list of entries we will write. I don’t think there are existing facilities that do this sort of checking of multiple entries. == 2. Detect and reject bare repos using `git fsck` and `transfer.fsckObjects`. This entails checking for the markers of a bare repository (HEAD, refs/, objects/) in tree objects. This shares a precedent with (1), since `git fsck` will also detect ".GIT". + Most reputable hosting sites set `transfer.fsckObjects`, which allows them to detect and prevent this kind of transmission. + Confers some protection to users even without them doing anything. + Easy to visualize and to write. - This won’t directly protect most users because they don’t typically set `transfer.fsckObjects` (in fact, `transfer.fsckObjects` will render many projects, like linux.git, uncloneable without additional config) - Won’t guard against malicious/poorly configured hosts. == 3. Detect that we are in an embedded bare repo and ignore the embedded bare repository in favor of the containing repo. For example, if setup.c detects that we are in a bare repo, it could then walk up the directory hierarchy to see if there is a containing repo that tracks the bare one. If so, setup.c chooses to use the containing repo instead. + Extremely robust; this even protects against a checkout by an earlier Git version. + Users who don’t use bare repos won’t even notice the difference. - The change in rules breaks some legitimate workflows e.g. a user with a repo at HOME and bare repos underneath. - Potentially very expensive for bare repo users because setup.c will likely walk up to the root directory; we’d be essentially *forcing* those users to set GIT_CEILING_DIRECTORIES. - Doesn’t protect users who use other tools e.g. JGit. == 4. Educate users about this risk without making code changes. Some risks fall into this category e.g. "Avoid unarchiving repositories because .git/config might be poisoned." [2]. + We don’t break any users. - Breaks the trust that users have in `git clone`. - IMO "Inspect every repo in its entirety before cloning it" is too much of a burden to put on users. = Next steps I propose that we prevent repositories from containing bare repositories by doing the following (in order): * Implement (2) by adding a new fsck message "embeddedBareRepo". * When this is done, hosting sites can hopefully use this capability to prevent transmission, and help us understand the prevalence of such attacks. * Implement (1b) by teaching unpack_trees.c to check whether the tree contains an entire bare repo, and die() if so. This will be guarded by a defaults-to-true config value. * This would only block a bare repo from being written in a single operation. It wouldn’t stop a user from writing a bare repo entry-by-entry using "git checkout <path>", but the amount of social engineering required probably renders this attack infeasible. * As I noted earlier, I foresee some difficulty actually implementing this because I don’t think we have facilities for checking multiple tree entries at once. I am particularly interested in hearing feedback about (1b), namely: * How to actually teach unpack_trees.c to detect bare repos. * Whether preventing bare repos in unpack_trees.c is a good enough mitigation (e.g. Are there other code paths that should block bare repos? Should we be checking the index instead of the tree?). I have a patch that does (2); I will send that out and we can leave feedback on that separately. = Demonstration This is based on a script by Taylor Blau (thanks!). [1] also contains a demonstration that runs in Docker. #!/bin/sh rm -fr malicious cloned &&git init malicious && ( cd malicious && mkdir -p bare && cd bare && echo 'ref: refs/heads/main' >HEAD && cat >config <<-EOF [core] repositoryformatversion = 0 filemode = true bare = false worktree = "worktree" fsmonitor = "echo pwned >&2; false" EOF mkdir objects refs worktree && touch worktree/.gitkeep && git add . && git commit -m ".gitkeep" && cd .. && git add bare && git commit -m 'initial commit' ) && git clone --no-local malicious cloned && cd cloned/bare && git status # pwned = Footnotes [1] https://github.com/justinsteven/advisories/blob/main/2022_git_buried_bare_repos_and_fsmonitor_various_abuses.md. [2] Archived repositories with malicious .git directories are a known risk. See https://lore.kernel.org/git/20171003123239.lisk43a2goxtxkro@sigill.intra.peff.net/ for an on-list discussion. This is also described in https://blog.sonarsource.com/securing-developer-tools-git-integrations (referenced in [1]). [3] We even ship such a tool - contrib/completion/git-prompt.sh. A user can pwn themselves with tab completion even if they don’t have a prompt configured [4] Any directory containing these entries will be recognized as a bare repo, so an attacker could add arbitrary entries to the directory to obfuscate the fact that it is a bare repo. [*] https://offensi.com/2019/12/16/4-google-cloud-shell-bugs-explained-bug-3/ is similar to [1], but uses hooks instead of core.fsmonitor. ^ permalink raw reply [flat|nested] 50+ messages in thread
* [PATCH] fsck: detect bare repos in trees and warn 2022-04-06 22:43 Bare repositories in the working tree are a security risk Glen Choo @ 2022-04-06 23:22 ` Glen Choo 2022-04-07 12:42 ` Johannes Schindelin ` (2 more replies) 2022-04-07 18:38 ` Bare repositories in the working tree are a security risk John Cai ` (4 subsequent siblings) 5 siblings, 3 replies; 50+ messages in thread From: Glen Choo @ 2022-04-06 23:22 UTC (permalink / raw) To: git; +Cc: Glen Choo [-- Warning: decoded text below may be mangled, UTF-8 assumed --] [-- Attachment #1: Type: text/plain; charset="kjj whjj", Size: 4768 bytes --] Git tries not to distribute configs in-repo because they are a security risk. However, an attacker can do exactly this if they embed a bare repo inside of another repo. Teach fsck to detect whether a tree object contains a bare repo (as determined by setup.c) and warn. This will help hosting sites detect and prevent transmission of such malicious repos. See [1] for a more in-depth discussion, including future steps and alternatives. [1] https://lore.kernel.org/git/kl6lsfqpygsj.fsf@chooglen-macbookpro.roam.corp.google.com/ Signed-off-by: Glen Choo <chooglen@google.com> --- fsck.c | 19 +++++++++++++++++++ fsck.h | 1 + setup.c | 4 ++++ t/t1450-fsck.sh | 36 ++++++++++++++++++++++++++++++++++++ 4 files changed, 60 insertions(+) diff --git a/fsck.c b/fsck.c index 3ec500d707..11c11c348a 100644 --- a/fsck.c +++ b/fsck.c @@ -573,6 +573,9 @@ static int fsck_tree(const struct object_id *tree_oid, int has_bad_modes = 0; int has_dup_entries = 0; int not_properly_sorted = 0; + int has_head = 0; + int has_refs_entry = 0; + int has_objects_entry = 0; struct tree_desc desc; unsigned o_mode; const char *o_name; @@ -602,6 +605,12 @@ static int fsck_tree(const struct object_id *tree_oid, has_dotdot |= !strcmp(name, ".."); has_dotgit |= is_hfs_dotgit(name) || is_ntfs_dotgit(name); has_zero_pad |= *(char *)desc.buffer == '0'; + has_head |= !strcasecmp(name, "HEAD") + && (S_ISLNK(mode) || S_ISREG(mode)); + has_refs_entry |= !strcasecmp(name, "refs") + && (S_ISLNK(mode) || S_ISDIR(mode)); + has_objects_entry |= !strcasecmp(name, "objects") + && (S_ISLNK(mode) || S_ISDIR(mode)); if (is_hfs_dotgitmodules(name) || is_ntfs_dotgitmodules(name)) { if (!S_ISLNK(mode)) @@ -739,6 +748,16 @@ static int fsck_tree(const struct object_id *tree_oid, retval += report(options, tree_oid, OBJ_TREE, FSCK_MSG_TREE_NOT_SORTED, "not properly sorted"); + /* + * Determine if this tree looks like a bare repository according + * to the rules of setup.c. If those are changed, this should be + * changed too. + */ + if (has_head && has_refs_entry && has_objects_entry) + retval += report(options, tree_oid, OBJ_TREE, + FSCK_MSG_EMBEDDED_BARE_REPO, + "contains bare repository"); + return retval; } diff --git a/fsck.h b/fsck.h index d07f7a2459..3f0f73b0f3 100644 --- a/fsck.h +++ b/fsck.h @@ -65,6 +65,7 @@ enum fsck_msg_type { FUNC(NULL_SHA1, WARN) \ FUNC(ZERO_PADDED_FILEMODE, WARN) \ FUNC(NUL_IN_COMMIT, WARN) \ + FUNC(EMBEDDED_BARE_REPO, WARN) \ /* infos (reported as warnings, but ignored by default) */ \ FUNC(GITMODULES_PARSE, INFO) \ FUNC(GITIGNORE_SYMLINK, INFO) \ diff --git a/setup.c b/setup.c index 04ce33cdcd..2600548776 100644 --- a/setup.c +++ b/setup.c @@ -336,6 +336,10 @@ int get_common_dir_noenv(struct strbuf *sb, const char *gitdir) * - either a HEAD symlink or a HEAD file that is formatted as * a proper "ref:", or a regular file HEAD that has a properly * formatted sha1 object name. + * + * fsck.c checks for bare repositories in trees using similar rules, but a + * duplicated implementation. If these are changed, the correspnding code in + * fsck.c should change too. */ int is_git_directory(const char *suspect) { diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh index de50c0ea01..a65827bc03 100755 --- a/t/t1450-fsck.sh +++ b/t/t1450-fsck.sh @@ -563,6 +563,42 @@ dot-backslash-case .\\\\.GIT\\\\foobar dotgit-case-backslash .git\\\\foobar EOF +test_expect_success "fsck notices bare repo" ' +( + mkdir -p embedded-bare-repo/bare && + git init embedded-bare-repo && + ( + cd embedded-bare-repo/bare && + echo content >HEAD && + mkdir refs/ objects/ && + echo content >refs/foo && + echo content >objects/foo && + git add . && + git commit -m base && + bad_tree=$(git rev-parse HEAD:bare) && + git fsck 2>out && + test_i18ngrep "warning.*tree $bad_tree: embeddedBareRepo: contains bare repository" out + ) +)' + +test_expect_success "fsck notices bare repo with odd casing" ' +( + mkdir -p embedded-bare-repo-case/bare && + git init embedded-bare-repo-case && + ( + cd embedded-bare-repo-case/bare && + echo content >heAD && + mkdir Refs/ objectS/ && + echo content >Refs/foo && + echo content >objectS/foo && + git add . && + git commit -m base && + bad_tree=$(git rev-parse HEAD:bare) && + git fsck 2>out && + test_i18ngrep "warning.*tree $bad_tree: embeddedBareRepo: contains bare repository" out + ) +)' + test_expect_success 'fsck allows .Ňit' ' ( git init not-dotgit && base-commit: 805e0a68082a217f0112db9ee86a022227a9c81b -- 2.33.GIT ^ permalink raw reply related [flat|nested] 50+ messages in thread
* Re: [PATCH] fsck: detect bare repos in trees and warn 2022-04-06 23:22 ` [PATCH] fsck: detect bare repos in trees and warn Glen Choo @ 2022-04-07 12:42 ` Johannes Schindelin 2022-04-07 13:21 ` Derrick Stolee 2022-04-07 13:12 ` Ævar Arnfjörð Bjarmason 2022-04-07 15:20 ` Junio C Hamano 2 siblings, 1 reply; 50+ messages in thread From: Johannes Schindelin @ 2022-04-07 12:42 UTC (permalink / raw) To: Glen Choo; +Cc: git Hi Glen, On Wed, 6 Apr 2022, Glen Choo wrote: > Git tries not to distribute configs in-repo because they are a security > risk. However, an attacker can do exactly this if they embed a bare > repo inside of another repo. > > Teach fsck to detect whether a tree object contains a bare repo (as > determined by setup.c) and warn. This will help hosting sites detect and > prevent transmission of such malicious repos. > > See [1] for a more in-depth discussion, including future steps and > alternatives. > > [1] https://lore.kernel.org/git/kl6lsfqpygsj.fsf@chooglen-macbookpro.roam.corp.google.com/ Out of curiosity: does this new check trigger with https://github.com/libgit2/libgit2? AFAIR it has embedded repositories that are used in its test suite. In other words, libgit2 has a legitimate use case for embedded bare repositories, I believe. Thank you, Johannes > > Signed-off-by: Glen Choo <chooglen@google.com> > --- > fsck.c | 19 +++++++++++++++++++ > fsck.h | 1 + > setup.c | 4 ++++ > t/t1450-fsck.sh | 36 ++++++++++++++++++++++++++++++++++++ > 4 files changed, 60 insertions(+) > > diff --git a/fsck.c b/fsck.c > index 3ec500d707..11c11c348a 100644 > --- a/fsck.c > +++ b/fsck.c > @@ -573,6 +573,9 @@ static int fsck_tree(const struct object_id *tree_oid, > int has_bad_modes = 0; > int has_dup_entries = 0; > int not_properly_sorted = 0; > + int has_head = 0; > + int has_refs_entry = 0; > + int has_objects_entry = 0; > struct tree_desc desc; > unsigned o_mode; > const char *o_name; > @@ -602,6 +605,12 @@ static int fsck_tree(const struct object_id *tree_oid, > has_dotdot |= !strcmp(name, ".."); > has_dotgit |= is_hfs_dotgit(name) || is_ntfs_dotgit(name); > has_zero_pad |= *(char *)desc.buffer == '0'; > + has_head |= !strcasecmp(name, "HEAD") > + && (S_ISLNK(mode) || S_ISREG(mode)); > + has_refs_entry |= !strcasecmp(name, "refs") > + && (S_ISLNK(mode) || S_ISDIR(mode)); > + has_objects_entry |= !strcasecmp(name, "objects") > + && (S_ISLNK(mode) || S_ISDIR(mode)); > > if (is_hfs_dotgitmodules(name) || is_ntfs_dotgitmodules(name)) { > if (!S_ISLNK(mode)) > @@ -739,6 +748,16 @@ static int fsck_tree(const struct object_id *tree_oid, > retval += report(options, tree_oid, OBJ_TREE, > FSCK_MSG_TREE_NOT_SORTED, > "not properly sorted"); > + /* > + * Determine if this tree looks like a bare repository according > + * to the rules of setup.c. If those are changed, this should be > + * changed too. > + */ > + if (has_head && has_refs_entry && has_objects_entry) > + retval += report(options, tree_oid, OBJ_TREE, > + FSCK_MSG_EMBEDDED_BARE_REPO, > + "contains bare repository"); > + > return retval; > } > > diff --git a/fsck.h b/fsck.h > index d07f7a2459..3f0f73b0f3 100644 > --- a/fsck.h > +++ b/fsck.h > @@ -65,6 +65,7 @@ enum fsck_msg_type { > FUNC(NULL_SHA1, WARN) \ > FUNC(ZERO_PADDED_FILEMODE, WARN) \ > FUNC(NUL_IN_COMMIT, WARN) \ > + FUNC(EMBEDDED_BARE_REPO, WARN) \ > /* infos (reported as warnings, but ignored by default) */ \ > FUNC(GITMODULES_PARSE, INFO) \ > FUNC(GITIGNORE_SYMLINK, INFO) \ > diff --git a/setup.c b/setup.c > index 04ce33cdcd..2600548776 100644 > --- a/setup.c > +++ b/setup.c > @@ -336,6 +336,10 @@ int get_common_dir_noenv(struct strbuf *sb, const char *gitdir) > * - either a HEAD symlink or a HEAD file that is formatted as > * a proper "ref:", or a regular file HEAD that has a properly > * formatted sha1 object name. > + * > + * fsck.c checks for bare repositories in trees using similar rules, but a > + * duplicated implementation. If these are changed, the correspnding code in > + * fsck.c should change too. > */ > int is_git_directory(const char *suspect) > { > diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh > index de50c0ea01..a65827bc03 100755 > --- a/t/t1450-fsck.sh > +++ b/t/t1450-fsck.sh > @@ -563,6 +563,42 @@ dot-backslash-case .\\\\.GIT\\\\foobar > dotgit-case-backslash .git\\\\foobar > EOF > > +test_expect_success "fsck notices bare repo" ' > +( > + mkdir -p embedded-bare-repo/bare && > + git init embedded-bare-repo && > + ( > + cd embedded-bare-repo/bare && > + echo content >HEAD && > + mkdir refs/ objects/ && > + echo content >refs/foo && > + echo content >objects/foo && > + git add . && > + git commit -m base && > + bad_tree=$(git rev-parse HEAD:bare) && > + git fsck 2>out && > + test_i18ngrep "warning.*tree $bad_tree: embeddedBareRepo: contains bare repository" out > + ) > +)' > + > +test_expect_success "fsck notices bare repo with odd casing" ' > +( > + mkdir -p embedded-bare-repo-case/bare && > + git init embedded-bare-repo-case && > + ( > + cd embedded-bare-repo-case/bare && > + echo content >heAD && > + mkdir Refs/ objectS/ && > + echo content >Refs/foo && > + echo content >objectS/foo && > + git add . && > + git commit -m base && > + bad_tree=$(git rev-parse HEAD:bare) && > + git fsck 2>out && > + test_i18ngrep "warning.*tree $bad_tree: embeddedBareRepo: contains bare repository" out > + ) > +)' > + > test_expect_success 'fsck allows .??it' ' > ( > git init not-dotgit && > > base-commit: 805e0a68082a217f0112db9ee86a022227a9c81b > -- > 2.33.GIT > > ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH] fsck: detect bare repos in trees and warn 2022-04-07 12:42 ` Johannes Schindelin @ 2022-04-07 13:21 ` Derrick Stolee 2022-04-07 14:14 ` Ævar Arnfjörð Bjarmason ` (2 more replies) 0 siblings, 3 replies; 50+ messages in thread From: Derrick Stolee @ 2022-04-07 13:21 UTC (permalink / raw) To: Johannes Schindelin, Glen Choo; +Cc: git On 4/7/2022 8:42 AM, Johannes Schindelin wrote: > Hi Glen, > > On Wed, 6 Apr 2022, Glen Choo wrote: > >> Git tries not to distribute configs in-repo because they are a security >> risk. However, an attacker can do exactly this if they embed a bare >> repo inside of another repo. >> >> Teach fsck to detect whether a tree object contains a bare repo (as >> determined by setup.c) and warn. This will help hosting sites detect and >> prevent transmission of such malicious repos. >> >> See [1] for a more in-depth discussion, including future steps and >> alternatives. >> >> [1] https://lore.kernel.org/git/kl6lsfqpygsj.fsf@chooglen-macbookpro.roam.corp.google.com/ > > Out of curiosity: does this new check trigger with > https://github.com/libgit2/libgit2? AFAIR it has embedded repositories > that are used in its test suite. In other words, libgit2 has a legitimate > use case for embedded bare repositories, I believe. It is definitely good to keep in mind that other repositories have included bare repositories for convenience. I'm not sure that the behavior of some good actors should outweigh the benefits of protecting against this attack vector. The trouble here is: how could the libgit2 repo change their project to not trigger this warning? These bare repos are in their history forever if they don't do go through significant work and pain to remove them from their history. We would want to have a way to make the warnings less severe for special cases like this. Simultaneously, we wouldn't want to bless all _forks_ of libgit2. Finally, the real thing we want to avoid is having the Git client write these trees to disk, for example during a 'git checkout', unless the user gives an override. (We would want 'git bisect' to still work on the libgit2 repo, for example.) A more complete protection here would be: 1. Warn when finding a bare repo as a tree (this patch). 2. Suppress warnings on trusted repos, scoped to a specific set of known trees _or_ based on some set of known commits (in case the known trees are too large). 3. Prevent writing a bare repo to the worktree, unless the user provided an opt-in to that behavior. Since your patch is moving in the right direction here, I don't think steps (2) and (3) are required to move forward with your patch. However, it is a good opportunity to discuss the full repercussions of this issue. Thanks, -Stolee ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH] fsck: detect bare repos in trees and warn 2022-04-07 13:21 ` Derrick Stolee @ 2022-04-07 14:14 ` Ævar Arnfjörð Bjarmason 2022-04-14 20:02 ` Glen Choo 2022-04-07 15:11 ` Junio C Hamano 2022-04-13 22:24 ` Glen Choo 2 siblings, 1 reply; 50+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2022-04-07 14:14 UTC (permalink / raw) To: Derrick Stolee; +Cc: Johannes Schindelin, Glen Choo, git On Thu, Apr 07 2022, Derrick Stolee wrote: > A more complete protection here would be: > > 1. Warn when finding a bare repo as a tree (this patch). > > 2. Suppress warnings on trusted repos, scoped to a specific set of known > trees _or_ based on some set of known commits (in case the known trees > are too large). > > 3. Prevent writing a bare repo to the worktree, unless the user provided > an opt-in to that behavior. > > Since your patch is moving in the right direction here, I don't think > steps (2) and (3) are required to move forward with your patch. However, > it is a good opportunity to discuss the full repercussions of this issue. Isn't a gentler solution here to: 1. In setup.c, we detect a repo 2. Walk up a directory 3. Do we find a repo? 4. Does that repo "contain" the first one? If yes: die on setup If no: it's OK It also seems to me that there's pretty much perfect overlap between this and the long-discussed topic of marking a submodule with config v.s. detecting it on the fly. Since we're already discussing breaking long-standing repos in the wild here I think it's good to take a step back and think of a potential more narrow solution. There's nothing wrong with having tracked content per-se, it's just that we ourselves have other code that'll shoot itself in the foot in certain scenarios, like this one. But for a newer git that needs to run this fsck check it'll already need to detect that something "looks like a repo", but if it can do that and setup.c can walk up from the parent directory & find a repository we can combine the two. Of course we'll probably still want an opt-in fsck check for hosting providers who'll want to protect older clients, but as long as the two are separated that'll only need to last as long as such clients are potentially there in the wild. ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH] fsck: detect bare repos in trees and warn 2022-04-07 14:14 ` Ævar Arnfjörð Bjarmason @ 2022-04-14 20:02 ` Glen Choo 2022-04-15 12:46 ` Ævar Arnfjörð Bjarmason 0 siblings, 1 reply; 50+ messages in thread From: Glen Choo @ 2022-04-14 20:02 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason, Derrick Stolee Cc: Johannes Schindelin, git Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes: > On Thu, Apr 07 2022, Derrick Stolee wrote: > >> A more complete protection here would be: >> >> 1. Warn when finding a bare repo as a tree (this patch). >> >> 2. Suppress warnings on trusted repos, scoped to a specific set of known >> trees _or_ based on some set of known commits (in case the known trees >> are too large). >> >> 3. Prevent writing a bare repo to the worktree, unless the user provided >> an opt-in to that behavior. >> >> Since your patch is moving in the right direction here, I don't think >> steps (2) and (3) are required to move forward with your patch. However, >> it is a good opportunity to discuss the full repercussions of this issue. > > Isn't a gentler solution here to: > > 1. In setup.c, we detect a repo > 2. Walk up a directory > 3. Do we find a repo? > 4. Does that repo "contain" the first one? > If yes: die on setup > If no: it's OK > > It also seems to me that there's pretty much perfect overlap between > this and the long-discussed topic of marking a submodule with config > v.s. detecting it on the fly. Your suggestion seems similar to: == 3. Detect that we are in an embedded bare repo and ignore the embedded bare repository in favor of the containing repo. which I also think is a simple, robust mitigation if we put aside the problem of walking up to the root in too many situations. I seem to recall that this problem has come up before in [1] (and possibly other topics? I wasn't really able to locate them through a cursory search..), so I assume that's what you're referring to by "long-discussed topic". (Forgive me if I'm asking you to repeat yourself yet another time) I seem to recall that we weren't able to reach consensus on whether it's okay for Git to opportunistically walk up the directory hierarchy during setup, especially since There are some situations where this is extremely expensive (VFS, network mount). I actually like this option quite a lot, but I don't see how we could implement this without imposing a big penalty to all bare repo users - they'd either be forced to set GIT_DIR or GIT_CEILING_DIRECTORIES, or take a (potentially big) performance hit. Hopefully I'm just framing this too narrowly and you're approaching this differently. PS: As an aside, wouldn't this also break libgit2? We could make this opt-out behavior, though that requires us to read system config _before_ discovering the gitdir (as I discussed in [2]). [1] https://lore.kernel.org/git/211109.86v912dtfw.gmgdl@evledraar.gmail.com/ [2] https://lore.kernel.org/git/kl6lv8vc90ts.fsf@chooglen-macbookpro.roam.corp.google.com ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH] fsck: detect bare repos in trees and warn 2022-04-14 20:02 ` Glen Choo @ 2022-04-15 12:46 ` Ævar Arnfjörð Bjarmason 0 siblings, 0 replies; 50+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2022-04-15 12:46 UTC (permalink / raw) To: Glen Choo; +Cc: Derrick Stolee, Johannes Schindelin, git On Thu, Apr 14 2022, Glen Choo wrote: > Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes: > >> On Thu, Apr 07 2022, Derrick Stolee wrote: >> >>> A more complete protection here would be: >>> >>> 1. Warn when finding a bare repo as a tree (this patch). >>> >>> 2. Suppress warnings on trusted repos, scoped to a specific set of known >>> trees _or_ based on some set of known commits (in case the known trees >>> are too large). >>> >>> 3. Prevent writing a bare repo to the worktree, unless the user provided >>> an opt-in to that behavior. >>> >>> Since your patch is moving in the right direction here, I don't think >>> steps (2) and (3) are required to move forward with your patch. However, >>> it is a good opportunity to discuss the full repercussions of this issue. >> >> Isn't a gentler solution here to: >> >> 1. In setup.c, we detect a repo >> 2. Walk up a directory >> 3. Do we find a repo? >> 4. Does that repo "contain" the first one? >> If yes: die on setup >> If no: it's OK >> >> It also seems to me that there's pretty much perfect overlap between >> this and the long-discussed topic of marking a submodule with config >> v.s. detecting it on the fly. > > Your suggestion seems similar to: > > == 3. Detect that we are in an embedded bare repo and ignore the embedded bare > repository in favor of the containing repo. > > which I also think is a simple, robust mitigation if we put aside the > problem of walking up to the root in too many situations. I seem to > recall that this problem has come up before in [1] (and possibly other > topics? I wasn't really able to locate them through a cursory search..), > so I assume that's what you're referring to by "long-discussed topic". Yes, I mean the submodule.superprojectGitDir topic. > (Forgive me if I'm asking you to repeat yourself yet another time) I > seem to recall that we weren't able to reach consensus on whether it's > okay for Git to opportunistically walk up the directory hierarchy during > setup, especially since There are some situations where this is > extremely expensive (VFS, network mount). I'm not sure, but I think per the later https://lore.kernel.org/git/220204.86pmo34d2m.gmgdl@evledraar.gmail.com/ and https://lore.kernel.org/git/220311.8635joj0lf.gmgdl@evledraar.gmail.com/ that any optimization concerns were likely just "this is slow in shellscript" and not at the FS level. There were also passing references to some internal Google-specific NFS-ish implementation that I know nothing about (but you might), i.e. what I asked about in: https://lore.kernel.org/git/220212.864k53yfws.gmgdl@evledraar.gmail.com/ But given the v9 superprojectGitDir becoming a boolean instead of a path in v9 I'm not sure/have no idea. The only thing I'm sure of is if past iterations of the series were addressing such a problem as an optimization that doesn't seem to be a current goal. As noted in those past exchanges I have tested this method on e.g. AIX whose FS is unbelievably slow, and I couldn't even tell the differenc. That's because if you look at the total FS syscalls even for an uninitialized repo just traversing .git, getting config etc. is going to dwarf "walking up" in terms of number of calls. Of course not all calls are going to be equal, and there's that potential "I'm not NFS-y, but a parent is" case etc. In any case, I think even *if* we had such a case somewhere that this plan would still make sense. Such users could simply set GIT_CEILING_DIRECTORIES or something similar if they cared about the performance. But for everyone else we'd do the right thing, and not prematurely optimize. I.e. we actually *are* concerned not with "does it look like a bare repo?" but "is this thing that looks like a bare repo within our current actual repo or not?". > I actually like this option quite a lot, but I don't see how we could > implement this without imposing a big penalty to all bare repo users - > they'd either be forced to set GIT_DIR or GIT_CEILING_DIRECTORIES, or > take a (potentially big) performance hit. Hopefully I'm just framing > this too narrowly and you're approaching this differently. As noted in the [1] you quoted (link below) I tried to quantify that potential penalty, and it seems to be a complete non-issue. Of course there may be other scenarios where it matters, but I haven't seen any concrete data to support that. Doesn't pretty everyone who cares about the performance of bare in any capacity do so because they're running a server that's using git-upload-pack and the like? Those require you to specify the exact .git directory you want. I.e. wouldn't this *only* apply to those doing the equivalent of "git -C some-dir" to "cd" to a bare repo? > PS: As an aside, wouldn't this also break libgit2? We could make this > opt-out behavior, though that requires us to read system config _before_ > discovering the gitdir (as I discussed in [2]). No it wouldn't? I don't use libgit2, but upthread there's concern that banning things that look-like-a-repo from being tracked would break it. Whereas I'm pointing out that we don't need to do that, we can just keep searching upwards. But yes, it would "break" anything that assumed you could cd to that tracked-looks-like-or-is--a-gitdir and have e.g. "git config" pick up its config instead of our "real repo" config, but that's exactly what we want in this case isn't it? I'm just pointing out that we can do it on the fly in setup.c, instead of forbidding such content from ever being tracked within the repository, which we'd be doing because we know we're doing the wrong thing in that setup.c codepath. Let's just fix that bit in setup.c instead. > [1] https://lore.kernel.org/git/211109.86v912dtfw.gmgdl@evledraar.gmail.com/ > [2] https://lore.kernel.org/git/kl6lv8vc90ts.fsf@chooglen-macbookpro.roam.corp.google.com ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH] fsck: detect bare repos in trees and warn 2022-04-07 13:21 ` Derrick Stolee 2022-04-07 14:14 ` Ævar Arnfjörð Bjarmason @ 2022-04-07 15:11 ` Junio C Hamano 2022-04-13 22:24 ` Glen Choo 2 siblings, 0 replies; 50+ messages in thread From: Junio C Hamano @ 2022-04-07 15:11 UTC (permalink / raw) To: Derrick Stolee; +Cc: Johannes Schindelin, Glen Choo, git Derrick Stolee <derrickstolee@github.com> writes: > It is definitely good to keep in mind that other repositories have > included bare repositories for convenience. I'm not sure that the behavior > of some good actors should outweigh the benefits of protecting against > this attack vector. Good line of thinking. > 2. Suppress warnings on trusted repos, scoped to a specific set of known > trees _or_ based on some set of known commits (in case the known trees > are too large). Is "It is OK to check out an embedded repository from this commit or any of its ancestors" the kind of suppression you meant by "known commits"? > 3. Prevent writing a bare repo to the worktree, unless the user provided > an opt-in to that behavior. > > Since your patch is moving in the right direction here, I don't think > steps (2) and (3) are required to move forward with your patch. However, > it is a good opportunity to discuss the full repercussions of this issue. We can definitely start without (3). Unleashing (1) before (2) is ready would mean folks cannot clone projects like libgit2 until later, which takes us back to the first point you made above. Those who use embedded bare repositories as test vector can easily switch to storing an equivalent of "a tarball that is expanded while building and/or testing", and as long as the user trusts the project enough to run its build procedure or tests, we are not adding much to the attack surface, I would guess. ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH] fsck: detect bare repos in trees and warn 2022-04-07 13:21 ` Derrick Stolee 2022-04-07 14:14 ` Ævar Arnfjörð Bjarmason 2022-04-07 15:11 ` Junio C Hamano @ 2022-04-13 22:24 ` Glen Choo 2 siblings, 0 replies; 50+ messages in thread From: Glen Choo @ 2022-04-13 22:24 UTC (permalink / raw) To: Derrick Stolee, Johannes Schindelin, Junio C Hamano; +Cc: git Derrick Stolee <derrickstolee@github.com> writes: > On 4/7/2022 8:42 AM, Johannes Schindelin wrote: >> Hi Glen, >> >> On Wed, 6 Apr 2022, Glen Choo wrote: >> >>> Git tries not to distribute configs in-repo because they are a security >>> risk. However, an attacker can do exactly this if they embed a bare >>> repo inside of another repo. >>> >>> Teach fsck to detect whether a tree object contains a bare repo (as >>> determined by setup.c) and warn. This will help hosting sites detect and >>> prevent transmission of such malicious repos. >>> >>> See [1] for a more in-depth discussion, including future steps and >>> alternatives. >>> >>> [1] https://lore.kernel.org/git/kl6lsfqpygsj.fsf@chooglen-macbookpro.roam.corp.google.com/ >> >> Out of curiosity: does this new check trigger with >> https://github.com/libgit2/libgit2? AFAIR it has embedded repositories >> that are used in its test suite. In other words, libgit2 has a legitimate >> use case for embedded bare repositories, I believe. > > It is definitely good to keep in mind that other repositories have > included bare repositories for convenience. I'm not sure that the behavior > of some good actors should outweigh the benefits of protecting against > this attack vector. > > The trouble here is: how could the libgit2 repo change their project to > not trigger this warning? These bare repos are in their history forever if > they don't do go through significant work and pain to remove them from > their history. We would want to have a way to make the warnings less > severe for special cases like this. > > Simultaneously, we wouldn't want to bless all _forks_ of libgit2. Yes, that makes sense. Thanks for the thoughtful reply. > 2. Suppress warnings on trusted repos, scoped to a specific set of known > trees _or_ based on some set of known commits (in case the known trees > are too large). Since Junio mentioned downthread that we'd need (2), I'll focus on this. I'm not sure I follow, though, so let me try to verbalize my thought process to see what I'm not understanding... By "Suppress warnings on trusted repos", I assume this is done on the hosting side? (Since I can't imagine a built-in Git feature that could selectively trust repos.) "scoped to a specific set of known trees" sounds like fsck.skipList i.e. as a host, I can configure a list of "good" libgit2 trees that I will trust and those will be skipped by fsck. So from my _very_ naive reading of (2), it seems like we already have all of the pieces in place for hosts to do (2) on their own, _unless_ we think that fsck.skipList is inadequate for this use case. e.g. I personally can't imagine any way to list every "good" tree and still have a cloneable fork of libgit2, so we might to teach fsck to do something smarter like "skip any objects reachable by these commits". ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH] fsck: detect bare repos in trees and warn 2022-04-06 23:22 ` [PATCH] fsck: detect bare repos in trees and warn Glen Choo 2022-04-07 12:42 ` Johannes Schindelin @ 2022-04-07 13:12 ` Ævar Arnfjörð Bjarmason 2022-04-07 15:20 ` Junio C Hamano 2 siblings, 0 replies; 50+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2022-04-07 13:12 UTC (permalink / raw) To: Glen Choo; +Cc: git On Wed, Apr 06 2022, Glen Choo wrote: > @@ -602,6 +605,12 @@ static int fsck_tree(const struct object_id *tree_oid, > has_dotdot |= !strcmp(name, ".."); > has_dotgit |= is_hfs_dotgit(name) || is_ntfs_dotgit(name); > has_zero_pad |= *(char *)desc.buffer == '0'; > + has_head |= !strcasecmp(name, "HEAD") > + && (S_ISLNK(mode) || S_ISREG(mode)); > + has_refs_entry |= !strcasecmp(name, "refs") > + && (S_ISLNK(mode) || S_ISDIR(mode)); > + has_objects_entry |= !strcasecmp(name, "objects") > + && (S_ISLNK(mode) || S_ISDIR(mode)); Doesn't this code need to use is_hfs_dot_str() instead of strcasecmp() like the other similar checks? > @@ -336,6 +336,10 @@ int get_common_dir_noenv(struct strbuf *sb, const char *gitdir) > * - either a HEAD symlink or a HEAD file that is formatted as > * a proper "ref:", or a regular file HEAD that has a properly > * formatted sha1 object name. > + * > + * fsck.c checks for bare repositories in trees using similar rules, but a > + * duplicated implementation. If these are changed, the correspnding code in > + * fsck.c should change too. > */ Probably took much hassle to factor these so it can be re-used. Typo: correspnding. > + test_i18ngrep "warning.*tree $bad_tree: embeddedBareRepo: contains bare repository" out s/test_i18ngrep/grep/ ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH] fsck: detect bare repos in trees and warn 2022-04-06 23:22 ` [PATCH] fsck: detect bare repos in trees and warn Glen Choo 2022-04-07 12:42 ` Johannes Schindelin 2022-04-07 13:12 ` Ævar Arnfjörð Bjarmason @ 2022-04-07 15:20 ` Junio C Hamano 2 siblings, 0 replies; 50+ messages in thread From: Junio C Hamano @ 2022-04-07 15:20 UTC (permalink / raw) To: Glen Choo; +Cc: git Glen Choo <chooglen@google.com> writes: > @@ -602,6 +605,12 @@ static int fsck_tree(const struct object_id *tree_oid, > has_dotdot |= !strcmp(name, ".."); > has_dotgit |= is_hfs_dotgit(name) || is_ntfs_dotgit(name); > has_zero_pad |= *(char *)desc.buffer == '0'; > + has_head |= !strcasecmp(name, "HEAD") > + && (S_ISLNK(mode) || S_ISREG(mode)); > + has_refs_entry |= !strcasecmp(name, "refs") > + && (S_ISLNK(mode) || S_ISDIR(mode)); Sorry, I know I am to blame, but I think it is wrong to insist "refs" (and others) to be a directory. setup.c::is_git_directory() only checks with access(X_OK). Also, if I am not mistaken, we plan to take advantage of it and may make refs a regular file, for example, to signal that the ref-files backend is unwelcome in the repository. HEAD can be a symbolic link, not necessarily a regular file. We do not create "refs" and "objects" as symbolic links ourselves, but it is good to see that the code protects us against such a directory with these entries being one. > + has_objects_entry |= !strcasecmp(name, "objects") > + && (S_ISLNK(mode) || S_ISDIR(mode)); > > if (is_hfs_dotgitmodules(name) || is_ntfs_dotgitmodules(name)) { > if (!S_ISLNK(mode)) > @@ -739,6 +748,16 @@ static int fsck_tree(const struct object_id *tree_oid, > retval += report(options, tree_oid, OBJ_TREE, > FSCK_MSG_TREE_NOT_SORTED, > "not properly sorted"); > + /* > + * Determine if this tree looks like a bare repository according > + * to the rules of setup.c. If those are changed, this should be > + * changed too. > + */ > + if (has_head && has_refs_entry && has_objects_entry) > + retval += report(options, tree_oid, OBJ_TREE, > + FSCK_MSG_EMBEDDED_BARE_REPO, > + "contains bare repository"); > + > return retval; > } ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Bare repositories in the working tree are a security risk 2022-04-06 22:43 Bare repositories in the working tree are a security risk Glen Choo 2022-04-06 23:22 ` [PATCH] fsck: detect bare repos in trees and warn Glen Choo @ 2022-04-07 18:38 ` John Cai 2022-04-07 21:24 ` brian m. carlson ` (3 subsequent siblings) 5 siblings, 0 replies; 50+ messages in thread From: John Cai @ 2022-04-07 18:38 UTC (permalink / raw) To: Glen Choo; +Cc: git, Emily Shaffer, justin, Taylor Blau Hi Glen, This is an interesting find. Thanks for writing it up. On 6 Apr 2022, at 18:43, Glen Choo wrote: > Hi all, > > My Google colleagues and I would like to address what we think is a security > risk - namely that Git repositories can contain valid bare repositories in their > working trees. This is based on an excellent article by Justin Steven [1] (CC-ed > here). > > Below is a discussion of: > > * The risky behavior - what Git does and why it is risky > * Possible responses to this risk > * A proposed approach > > The proposed changes are nontrivial, so I’d really appreciate any feedback here. > Unfortunately, I will be out of the office and won’t respond to emails for the > next 7 days or so, but there will still be at least one Google Git team member > keeping tabs on the discussion :) > > = TL;DR > > Git repositories should not be allowed to contain bare repositories in their > working trees because: > > * Such bare repositories can have maliciously crafted config files that cause > `git` to execute arbitrary code. > * Git itself can distribute the malicious repo via `git clone`; no need to > distribute repos out of band e.g. via tarballs [2]. > * Many `git` commands can be affected by malicious config files, and many users > have tools that will run `git` in the current directory or the subdirectories > of a repo. Once the malicious repo has been cloned, very little social > engineering is needed; the user might only need to open the repo in an editor > or `cd` into the correct subdirectory. > > = Background > > (This section is primarily a summary of [1]. I highly, highly recommend reading > that as it describes the issue in much more detail and is extremely readable > regardless of Git experience.) > > Certain Git configuration options are particularly enticing targets for > attackers, e.g. core.fsmonitor can execute arbitrary commands and is invoked > on many innocuous-looking `git` commands (like `git status`). This is even more > risky when one considers that many tools (like shell prompts and IDEs) will run > `git` opportunistically inside a repository - so many users won't even need to > explicitly run `git` to be affected [3]. > > Since config files are such an enticing target for attackers, Git intentionally > avoids distributing config files with repos - a user shouldn't be able to `git > clone` a repo with a config file (or really, any files inside .git). However, > one can 'trick' Git into doing this by embedding a bare repository inside of > another, containing repository: a repository can contain subdirectories that are > valid bare repositories, and any `git` operations run in such a subdirectory > will then use the bare repository’s config file instead of the "main" > repository’s. > > An attack might look like this: > > * Attacker creates a repository where subdirectory "Documentation/" is a bare > repository i.e. it contains "HEAD", "refs/" and "objects/" [4]. Attacker > also adds "config" with a malicious setting for core.fsmonitor. > * Attacker convinces User to read their project's documentation by `git > clone`-ing their repository and inspecting the "Documentation/" directory. > * User cd-s into "Documentation/" and their shell prompt runs `git status`, This might be obvious to most reading this, but for those trying this out at home, you'll need to create a worktree that points to the bare repository and then run `git status` in there. > executing the core.fsmonitor command defined by Attacker. This is easily reproduceable. > > = What can we do about it? > > Each subsection is an alternative and an analysis (+/- are pros/cons). > > == 1. Prevent users from checking out bare repos > > This is similar to an existing mitigation where we prevent repository entries > that can be confused for ".git" (like ".GIT"). but it requires checking multiple > entries instead of a single entry. I suspect that we could accomplish this in > one of two ways: > > a. Prevent bare repos from entering the index. > b. Prevent writing bare repos to the working tree. > > + Relatively robust protection - because the malicious repo never enters the > working tree, we even protect other tools (e.g. JGit) from doing dangerous > things in the embedded repo (provided the checkout is done with `git`, of > course). > - This breaks some 'valid' workflows (e.g. someone embedding a bare repo as a > more convenient alternative to submodules), but it seems reasonable to let > users opt out of this behavior. > - (1a) is difficult to do in practice because many code paths add entries to > the index, and checking a combination of new entry and existing entries is > much trickier than checking just the new entry. > - (1b) might also be difficult, though not as difficult as 1a because we > already have a complete list of entries we will write. I don’t think there > are existing facilities that do this sort of checking of multiple entries. > > == 2. Detect and reject bare repos using `git fsck` and `transfer.fsckObjects`. > > This entails checking for the markers of a bare repository (HEAD, refs/, > objects/) in tree objects. This shares a precedent with (1), since `git fsck` > will also detect ".GIT". > > + Most reputable hosting sites set `transfer.fsckObjects`, which allows them to > detect and prevent this kind of transmission. > + Confers some protection to users even without them doing anything. > + Easy to visualize and to write. > - This won’t directly protect most users because they don’t typically set > `transfer.fsckObjects` (in fact, `transfer.fsckObjects` will render many > projects, like linux.git, uncloneable without additional config) > - Won’t guard against malicious/poorly configured hosts. > > == 3. Detect that we are in an embedded bare repo and ignore the embedded bare > repository in favor of the containing repo. > > For example, if setup.c detects that we are in a bare repo, it could then walk > up the directory hierarchy to see if there is a containing repo that tracks the > bare one. If so, setup.c chooses to use the containing repo instead. > > + Extremely robust; this even protects against a checkout by an earlier Git > version. > + Users who don’t use bare repos won’t even notice the difference. > - The change in rules breaks some legitimate workflows e.g. a user with a repo > at HOME and bare repos underneath. > - Potentially very expensive for bare repo users because setup.c will likely > walk up to the root directory; we’d be essentially *forcing* those users to > set GIT_CEILING_DIRECTORIES. > - Doesn’t protect users who use other tools e.g. JGit. > > == 4. Educate users about this risk without making code changes. > > Some risks fall into this category e.g. "Avoid unarchiving repositories because > .git/config might be poisoned." [2]. > > + We don’t break any users. > - Breaks the trust that users have in `git clone`. > - IMO "Inspect every repo in its entirety before cloning it" is too much of a > burden to put on users. > > = Next steps > > I propose that we prevent repositories from containing bare repositories by > doing the following (in order): > > * Implement (2) by adding a new fsck message "embeddedBareRepo". > * When this is done, hosting sites can hopefully use this capability to > prevent transmission, and help us understand the prevalence of such attacks. > * Implement (1b) by teaching unpack_trees.c to check whether the tree contains > an entire bare repo, and die() if so. This will be guarded by a > defaults-to-true config value. > * This would only block a bare repo from being written in a single operation. > It wouldn’t stop a user from writing a bare repo entry-by-entry using "git > checkout <path>", but the amount of social engineering required probably > renders this attack infeasible. > * As I noted earlier, I foresee some difficulty actually implementing this > because I don’t think we have facilities for checking multiple tree entries > at once. > > I am particularly interested in hearing feedback about (1b), namely: > > * How to actually teach unpack_trees.c to detect bare repos. > * Whether preventing bare repos in unpack_trees.c is a good enough mitigation > (e.g. Are there other code paths that should block bare repos? Should we be > checking the index instead of the tree?). > > I have a patch that does (2); I will send that out and we can leave feedback on > that separately. > > = Demonstration > > This is based on a script by Taylor Blau (thanks!). [1] also contains a > demonstration that runs in Docker. > > #!/bin/sh > > rm -fr malicious cloned &&git init malicious && > > ( > cd malicious && > > mkdir -p bare && > cd bare && > > echo 'ref: refs/heads/main' >HEAD && > cat >config <<-EOF > [core] > repositoryformatversion = 0 > filemode = true > bare = false > worktree = "worktree" > fsmonitor = "echo pwned >&2; false" > EOF > > mkdir objects refs worktree && > touch worktree/.gitkeep && > > git add . && > git commit -m ".gitkeep" && > > cd .. && > git add bare && > git commit -m 'initial commit' > ) && > > git clone --no-local malicious cloned && > cd cloned/bare && > git status # pwned > > = Footnotes > > [1] https://github.com/justinsteven/advisories/blob/main/2022_git_buried_bare_repos_and_fsmonitor_various_abuses.md. > [2] Archived repositories with malicious .git directories are a known risk. See > https://lore.kernel.org/git/20171003123239.lisk43a2goxtxkro@sigill.intra.peff.net/ > for an on-list discussion. This is also described in > https://blog.sonarsource.com/securing-developer-tools-git-integrations > (referenced in [1]). > [3] We even ship such a tool - contrib/completion/git-prompt.sh. A user can pwn > themselves with tab completion even if they don’t have a prompt configured > [4] Any directory containing these entries will be recognized as a bare repo, so > an attacker could add arbitrary entries to the directory to obfuscate the fact > that it is a bare repo. > [*] https://offensi.com/2019/12/16/4-google-cloud-shell-bugs-explained-bug-3/ is > similar to [1], but uses hooks instead of core.fsmonitor. ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Bare repositories in the working tree are a security risk 2022-04-06 22:43 Bare repositories in the working tree are a security risk Glen Choo 2022-04-06 23:22 ` [PATCH] fsck: detect bare repos in trees and warn Glen Choo 2022-04-07 18:38 ` Bare repositories in the working tree are a security risk John Cai @ 2022-04-07 21:24 ` brian m. carlson 2022-04-07 21:53 ` Justin Steven 2022-04-13 20:37 ` Glen Choo ` (2 subsequent siblings) 5 siblings, 1 reply; 50+ messages in thread From: brian m. carlson @ 2022-04-07 21:24 UTC (permalink / raw) To: Glen Choo; +Cc: git, Emily Shaffer, justin, Taylor Blau [-- Attachment #1: Type: text/plain, Size: 2415 bytes --] On 2022-04-06 at 22:43:08, Glen Choo wrote: > An attack might look like this: > > * Attacker creates a repository where subdirectory "Documentation/" is a bare > repository i.e. it contains "HEAD", "refs/" and "objects/" [4]. Attacker > also adds "config" with a malicious setting for core.fsmonitor. > * Attacker convinces User to read their project's documentation by `git > clone`-ing their repository and inspecting the "Documentation/" directory. > * User cd-s into "Documentation/" and their shell prompt runs `git status`, > executing the core.fsmonitor command defined by Attacker. As mentioned elsewhere, git status doesn't work without a working tree. > = Next steps > > I propose that we prevent repositories from containing bare repositories by > doing the following (in order): > > * Implement (2) by adding a new fsck message "embeddedBareRepo". > * When this is done, hosting sites can hopefully use this capability to > prevent transmission, and help us understand the prevalence of such attacks. > * Implement (1b) by teaching unpack_trees.c to check whether the tree contains > an entire bare repo, and die() if so. This will be guarded by a > defaults-to-true config value. > * This would only block a bare repo from being written in a single operation. > It wouldn’t stop a user from writing a bare repo entry-by-entry using "git > checkout <path>", but the amount of social engineering required probably > renders this attack infeasible. > * As I noted earlier, I foresee some difficulty actually implementing this > because I don’t think we have facilities for checking multiple tree entries > at once. I'm aware of repositories that happen to break in this case. It's not uncommon to embed bare repositories when working with tools that involve Git, and this will definitely break them. git fast-import isn't always a valid option because the test data may involve specific structures or tooling that can't be replicated that way, or it involves things like commit signatures which aren't round-tripped. Instead, I'd rather see us avoid executing any program from the config or any hooks in a bare repository without a working tree (except for pushes). I think that would avoid breaking things while still improving security. -- brian m. carlson (he/him or they/them) Toronto, Ontario, CA [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 262 bytes --] ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Bare repositories in the working tree are a security risk 2022-04-07 21:24 ` brian m. carlson @ 2022-04-07 21:53 ` Justin Steven 2022-04-07 22:10 ` brian m. carlson 0 siblings, 1 reply; 50+ messages in thread From: Justin Steven @ 2022-04-07 21:53 UTC (permalink / raw) To: brian m. carlson, Glen Choo, git, Emily Shaffer, justin, Taylor Blau Hi all, I'm the author of one of the articles linked in Glen's mail. Thank you Glen for summarising the problem beautifully and pushing this forward. Brian said: > As mentioned elsewhere, git status doesn't work without a working tree. This is correct. However, it is possible to embed a bare repo that has its own core.worktree which points to a directory within the containing repo, satisfying the requirement of having a working tree. This is covered in the article [1] and looks to be accounted for in Taylor's reproducer script which admittedly I haven't run. > Instead, I'd rather see us avoid executing any program from the config > or any hooks in a bare repository without a working tree (except for > pushes). I think that would avoid breaking things while still improving > security. Due to the fact that the embedded bare repo can be made to have a working tree, this won't be an effective fix. I'm not dismissing your examples of uses of Git which would break under Glen's suggestions. Thank you for describing these. [1] https://github.com/justinsteven/advisories/blob/main/2022_git_buried_bare_repos_and_fsmonitor_various_abuses.md#poc---regular-vs-bare-repos-and-adding-a-corefsmonitor-payload-to-a-bare-repo -- Justin ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Bare repositories in the working tree are a security risk 2022-04-07 21:53 ` Justin Steven @ 2022-04-07 22:10 ` brian m. carlson 2022-04-07 22:40 ` rsbecker ` (2 more replies) 0 siblings, 3 replies; 50+ messages in thread From: brian m. carlson @ 2022-04-07 22:10 UTC (permalink / raw) To: Justin Steven; +Cc: Glen Choo, git, Emily Shaffer, Taylor Blau [-- Attachment #1: Type: text/plain, Size: 1902 bytes --] On 2022-04-07 at 21:53:26, Justin Steven wrote: > Hi all, > > I'm the author of one of the articles linked in Glen's mail. Thank you > Glen for summarising the problem beautifully and pushing this forward. > > Brian said: > > As mentioned elsewhere, git status doesn't work without a working tree. > > This is correct. However, it is possible to embed a bare repo that has > its own core.worktree which points to a directory within the > containing repo, satisfying the requirement of having a working tree. > This is covered in the article [1] and looks to be accounted for in > Taylor's reproducer script which admittedly I haven't run. > > > Instead, I'd rather see us avoid executing any program from the config > > or any hooks in a bare repository without a working tree (except for > > pushes). I think that would avoid breaking things while still improving > > security. > > Due to the fact that the embedded bare repo can be made to have a > working tree, this won't be an effective fix. Then we'd probably be better off just walking up the entire hierarchy and excluding worktrees from embedded bare repositories, or otherwise restricting the config we read. That will probably mean we'll need to walk the entire directory hierarchy to see if it's embedded (or at least to the root of the device) in such a case, but that should be relatively uncommon. I'd definitely like to see us make a security improvement here, but I also would like to avoid us breaking a lot of repositories, especially since we lack alternatives. If git fast-import could 100% correctly round-trip all commits and repositories, I would be much more open to blocking this in fsck after a deprecation period, but as it stands that's not possible. Perhaps improving that would be a suitable way forward. -- brian m. carlson (he/him or they/them) Toronto, Ontario, CA [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 262 bytes --] ^ permalink raw reply [flat|nested] 50+ messages in thread
* RE: Bare repositories in the working tree are a security risk 2022-04-07 22:10 ` brian m. carlson @ 2022-04-07 22:40 ` rsbecker 2022-04-08 5:54 ` Junio C Hamano 2022-04-13 23:44 ` Glen Choo 2 siblings, 0 replies; 50+ messages in thread From: rsbecker @ 2022-04-07 22:40 UTC (permalink / raw) To: 'brian m. carlson', 'Justin Steven' Cc: 'Glen Choo', git, 'Emily Shaffer', 'Taylor Blau' On April 7, 2022 6:10 PM, brian m. carlson wrote: >On 2022-04-07 at 21:53:26, Justin Steven wrote: >> Hi all, >> >> I'm the author of one of the articles linked in Glen's mail. Thank you >> Glen for summarising the problem beautifully and pushing this forward. >> >> Brian said: >> > As mentioned elsewhere, git status doesn't work without a working tree. >> >> This is correct. However, it is possible to embed a bare repo that has >> its own core.worktree which points to a directory within the >> containing repo, satisfying the requirement of having a working tree. >> This is covered in the article [1] and looks to be accounted for in >> Taylor's reproducer script which admittedly I haven't run. >> >> > Instead, I'd rather see us avoid executing any program from the >> > config or any hooks in a bare repository without a working tree >> > (except for pushes). I think that would avoid breaking things while >> > still improving security. >> >> Due to the fact that the embedded bare repo can be made to have a >> working tree, this won't be an effective fix. > >Then we'd probably be better off just walking up the entire hierarchy and >excluding worktrees from embedded bare repositories, or otherwise restricting >the config we read. That will probably mean we'll need to walk the entire >directory hierarchy to see if it's embedded (or at least to the root of the device) in >such a case, but that should be relatively uncommon. > >I'd definitely like to see us make a security improvement here, but I also would like >to avoid us breaking a lot of repositories, especially since we lack alternatives. > >If git fast-import could 100% correctly round-trip all commits and repositories, I >would be much more open to blocking this in fsck after a deprecation period, but >as it stands that's not possible. Perhaps improving that would be a suitable way >forward. One option relating to enable/disable this is to set up a config option that, by default is false, to allow embedded bare repositories. At least with enough warning that this option is required, it might be acceptable. I would prefer never to receive a bare repo through any means (including through a more worrying submodule). From an attack vector standpoint, I would be more concerned about this in an automation setting, say GitHub workflows or Jenkins GitSCM. At least with GitHub workflows, this is "somewhat" contained within VMs under GitHub's control - although not entirely. Jenkins is probably more vulnerable as the clones done through that path do not get the same scrutiny, although in my world, I use a dedicated non-root UID and sandbox the whole thing. This topic makes me nervous and wonder whether we should be self-reporting a CVE. Shuddering, Randall ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Bare repositories in the working tree are a security risk 2022-04-07 22:10 ` brian m. carlson 2022-04-07 22:40 ` rsbecker @ 2022-04-08 5:54 ` Junio C Hamano 2022-04-14 0:03 ` Junio C Hamano 2022-04-14 0:04 ` Glen Choo 2022-04-13 23:44 ` Glen Choo 2 siblings, 2 replies; 50+ messages in thread From: Junio C Hamano @ 2022-04-08 5:54 UTC (permalink / raw) To: brian m. carlson Cc: Justin Steven, Glen Choo, git, Emily Shaffer, Taylor Blau "brian m. carlson" <sandals@crustytoothpaste.net> writes: > Then we'd probably be better off just walking up the entire hierarchy > and excluding worktrees from embedded bare repositories, or otherwise > restricting the config we read. That will probably mean we'll need to > walk the entire directory hierarchy to see if it's embedded (or at least > to the root of the device) in such a case, but that should be relatively > uncommon. I find this direction to notice iffy "user data" and disable it quite reasonable. A configuration file can define alias, and it would be yet another attack vector to overload common ones users likely use ("git co", "git st", ...). There may also be a hooks/ directory. I wonder if it is an acceptable defence to deliberately "corrupt" such user data when we notice that they smell fishy, perhaps by renaming "config" and "hooks", when they are found next to "HEAD" and "objects" and "refs", to "config.disabled" and "hooks.disabled"? I am just thinking aloud without assessing if it is sensible or feasible at ths point. I am not sure if "walking the hierarchy up" is an effective enough defence offhand. Do we consider it too much social engineering to make the user follow cloning instruction of the malicious project to prepare a repository, with core.worktree set to elsewhere, and pull into it? Since walking up from any subdirectory of the directory the core.worktree points at will never see a directory, with ".git/" subdirectory that is the malicious project, "git status" run in the "embedded" place in such a scenario will not notice that it is a repository lookalike that came from outside. But we can write it off as an approach needing too much social engineering, that's OK. ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Bare repositories in the working tree are a security risk 2022-04-08 5:54 ` Junio C Hamano @ 2022-04-14 0:03 ` Junio C Hamano 2022-04-14 0:04 ` Glen Choo 1 sibling, 0 replies; 50+ messages in thread From: Junio C Hamano @ 2022-04-14 0:03 UTC (permalink / raw) To: brian m. carlson Cc: Justin Steven, Glen Choo, git, Emily Shaffer, Taylor Blau Junio C Hamano <gitster@pobox.com> writes: > I am not sure if "walking the hierarchy up" is an effective enough > defence offhand. Do we consider it too much social engineering to > make the user follow cloning instruction of the malicious project to > prepare a repository, with core.worktree set to elsewhere, and pull > into it? Since walking up from any subdirectory of the directory > the core.worktree points at will never see a directory, with ".git/" > subdirectory that is the malicious project, "git status" run in the > "embedded" place in such a scenario will not notice that it is a > repository lookalike that came from outside. But we can write it > off as an approach needing too much social engineering, that's OK. In other words, an attacker could lead the victim to clone their malicious repository at path $R with core.worktree in its configuration file $R/config pointing at some directory $D that is entirely unrelated to $R. In a subdirectory of $D (i.e. the working tree of that malicious repository) there may be a directory $G with HEAD, refs and config in it. When the victim visits such a directory $G, going up from there to the root of the filesystem would not find any directory with ".git/" subdirectory in it, so your protection may not even notice that $G came from a checkout of $R. ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Bare repositories in the working tree are a security risk 2022-04-08 5:54 ` Junio C Hamano 2022-04-14 0:03 ` Junio C Hamano @ 2022-04-14 0:04 ` Glen Choo 1 sibling, 0 replies; 50+ messages in thread From: Glen Choo @ 2022-04-14 0:04 UTC (permalink / raw) To: Junio C Hamano, brian m. carlson Cc: Justin Steven, git, Emily Shaffer, Taylor Blau Junio C Hamano <gitster@pobox.com> writes: > "brian m. carlson" <sandals@crustytoothpaste.net> writes: > >> Then we'd probably be better off just walking up the entire hierarchy >> and excluding worktrees from embedded bare repositories, or otherwise >> restricting the config we read. That will probably mean we'll need to >> walk the entire directory hierarchy to see if it's embedded (or at least >> to the root of the device) in such a case, but that should be relatively >> uncommon. > > I find this direction to notice iffy "user data" and disable it > quite reasonable. A configuration file can define alias, and it > would be yet another attack vector to overload common ones users > likely use ("git co", "git st", ...). There may also be a hooks/ > directory. > > I wonder if it is an acceptable defence to deliberately "corrupt" > such user data when we notice that they smell fishy, perhaps by > renaming "config" and "hooks", when they are found next to "HEAD" > and "objects" and "refs", to "config.disabled" and "hooks.disabled"? > I am just thinking aloud without assessing if it is sensible or > feasible at ths point. Interesting idea - so the final result for the user is that they can check out a valid bare repository, but it will not have anything "sensitive". I suspect that it will be difficult to define "will not have anything sensitive" in practice. e.g. Justin's original article [1] shows how .git/index can be used in surprising, malicious ways. And if we're going to corrupt the repo anyway, it just be easier block the entire bare repo from entering the worktree by default (but with an escape hatch for users who know what they are doing). > I am not sure if "walking the hierarchy up" is an effective enough > defence offhand. Do we consider it too much social engineering to > make the user follow cloning instruction of the malicious project to > prepare a repository, with core.worktree set to elsewhere, and pull > into it? Since walking up from any subdirectory of the directory > the core.worktree points at will never see a directory, with ".git/" > subdirectory that is the malicious project, "git status" run in the > "embedded" place in such a scenario will not notice that it is a > repository lookalike that came from outside. But we can write it > off as an approach needing too much social engineering, that's OK. IMO that sounds like too much social engineering :) [1] https://github.com/justinsteven/advisories/blob/main/2022_git_buried_bare_repos_and_fsmonitor_various_abuses.md ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Bare repositories in the working tree are a security risk 2022-04-07 22:10 ` brian m. carlson 2022-04-07 22:40 ` rsbecker 2022-04-08 5:54 ` Junio C Hamano @ 2022-04-13 23:44 ` Glen Choo 2 siblings, 0 replies; 50+ messages in thread From: Glen Choo @ 2022-04-13 23:44 UTC (permalink / raw) To: brian m. carlson, Justin Steven; +Cc: git, Emily Shaffer, Taylor Blau Thanks a lot for chiming in, especially for advocating on behalf of users of embedded bare repos :) "brian m. carlson" <sandals@crustytoothpaste.net> writes: > On 2022-04-07 at 21:53:26, Justin Steven wrote: >> Hi all, >> >> I'm the author of one of the articles linked in Glen's mail. Thank you >> Glen for summarising the problem beautifully and pushing this forward. >> >> Brian said: >> > As mentioned elsewhere, git status doesn't work without a working tree. >> >> This is correct. However, it is possible to embed a bare repo that has >> its own core.worktree which points to a directory within the >> containing repo, satisfying the requirement of having a working tree. >> This is covered in the article [1] and looks to be accounted for in >> Taylor's reproducer script which admittedly I haven't run. >> >> > Instead, I'd rather see us avoid executing any program from the config >> > or any hooks in a bare repository without a working tree (except for >> > pushes). I think that would avoid breaking things while still improving >> > security. >> >> Due to the fact that the embedded bare repo can be made to have a >> working tree, this won't be an effective fix. > > Then we'd probably be better off just walking up the entire hierarchy > and excluding worktrees from embedded bare repositories, or otherwise > restricting the config we read. That will probably mean we'll need to > walk the entire directory hierarchy to see if it's embedded (or at least > to the root of the device) in such a case, but that should be relatively > uncommon. IIUC, the difference between your suggestion and (3) in the original email: == 3. Detect that we are in an embedded bare repo and ignore the embedded bare repository in favor of the containing repo. is that we only walk the hierarchy in the event that the bare repo has a worktree? If so, then yes that seems relatively uncommon and wouldn't break most bare repo users. What I'm unsure about is whether or not this reduces the attack surface enough; it protects the highest value target, `git status`, but I'm not sure about the following: * How well does this protect other, existing commands (e.g. `log`, `branch` and `remote` fall into the same category of commands that appear informative and harmless)? * How well does this protect other, abusable configuration directives (`core.pager` perhaps)? * How future-proof is this? (if someone removes a dependency on core.worktree, is the command vulnerable again?) There's a similar argument to be made about restricting the config we read - we'd be running the risk of either missing some abusable config that already exists or will exist. And if my understanding is correct (and it very well might be wrong), then this seems too piecemeal to be worth the effort. > I'd definitely like to see us make a security improvement here, but I > also would like to avoid us breaking a lot of repositories, especially > since we lack alternatives. Agreed, leaving existing users high and dry is an awful outcome. > > If git fast-import could 100% correctly round-trip all commits and > repositories, I would be much more open to blocking this in fsck after a > deprecation period, but as it stands that's not possible. Perhaps > improving that would be a suitable way forward. ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Bare repositories in the working tree are a security risk 2022-04-06 22:43 Bare repositories in the working tree are a security risk Glen Choo ` (2 preceding siblings ...) 2022-04-07 21:24 ` brian m. carlson @ 2022-04-13 20:37 ` Glen Choo 2022-04-13 23:36 ` Junio C Hamano 2022-04-16 0:34 ` Glen Choo 2022-04-16 0:41 ` Glen Choo 5 siblings, 1 reply; 50+ messages in thread From: Glen Choo @ 2022-04-13 20:37 UTC (permalink / raw) To: git; +Cc: Emily Shaffer, justin, Taylor Blau, martinvonz, brian m. carlson Thanks for the thoughtful replies, everyone. I'm still catching up with the discussion but I thought I'd share a suggestion/observation from a colleague (cc-ed, thanks!) who was following the discussion separately. Glen Choo <chooglen@google.com> writes: > = What can we do about it? > > Each subsection is an alternative and an analysis (+/- are pros/cons). > > == 1. Prevent users from checking out bare repos > ... > == 2. Detect and reject bare repos using `git fsck` and `transfer.fsckObjects`. > ... > == 3. Detect that we are in an embedded bare repo and ignore the embedded bare > repository in favor of the containing repo. > ... > == 4. Educate users about this risk without making code changes. Martin observed that, viability aside, there's another approach that I haven't discussed: == 5 Disable bare repo discovery We could introduce a config value that disables bare repo discovery altogether. This would only disable _discovery_; a user can still use the bare repo by specifying the gitdir (e.g. via `--git-dir=.` or GIT_DIR). + Extremely robust (for the majority of users who don't need bare repos at least). + Easy to understand. - We need to read config _before_ discovering the gitdir; although Git _can_ do this (e.g. via read_early_config()), I don't think there is precedent for doing this during repository setup. - Making this behavior opt-out would break every bare repo user (they'd need to either start passing --git-dir or change their config). - Making this behavior opt-in won't help the users who need it the most. I'm not sure if specifically "disable bare repo discovery for all users by default" is viable, but the line of thinking seems like it might yield some good ideas. Thoughts? ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Bare repositories in the working tree are a security risk 2022-04-13 20:37 ` Glen Choo @ 2022-04-13 23:36 ` Junio C Hamano 2022-04-14 16:41 ` Glen Choo 0 siblings, 1 reply; 50+ messages in thread From: Junio C Hamano @ 2022-04-13 23:36 UTC (permalink / raw) To: Glen Choo Cc: git, Emily Shaffer, justin, Taylor Blau, martinvonz, brian m. carlson Glen Choo <chooglen@google.com> writes: > Martin observed that, viability aside, there's another approach that I > haven't discussed: > > == 5 Disable bare repo discovery > > We could introduce a config value that disables bare repo discovery > altogether. This would only disable _discovery_; a user can still use > the bare repo by specifying the gitdir (e.g. via `--git-dir=.` or > GIT_DIR). Does it or does it not count as "allowing discovery to do its job" if you go to the directory, knowing that the directory is a bare one, and expect Git to work in it? I am guessing that your definition of "discovery" is not even consider if the current directory is a repository and always force the user to tell us with --git-dir or GIT_DIR. I am not sure that is realistically feasible (I am thinking of cases like "git fetch" going to the remote repository on the local disk that is bare to run "git upload-pack"), but if the fallout is not too bad, it may be a good heuristics. ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Bare repositories in the working tree are a security risk 2022-04-13 23:36 ` Junio C Hamano @ 2022-04-14 16:41 ` Glen Choo 2022-04-14 17:35 ` Junio C Hamano 2022-04-15 20:13 ` Junio C Hamano 0 siblings, 2 replies; 50+ messages in thread From: Glen Choo @ 2022-04-14 16:41 UTC (permalink / raw) To: Junio C Hamano Cc: git, Emily Shaffer, justin, Taylor Blau, martinvonz, brian m. carlson Junio C Hamano <gitster@pobox.com> writes: > Glen Choo <chooglen@google.com> writes: > >> Martin observed that, viability aside, there's another approach that I >> haven't discussed: >> >> == 5 Disable bare repo discovery >> >> We could introduce a config value that disables bare repo discovery >> altogether. This would only disable _discovery_; a user can still use >> the bare repo by specifying the gitdir (e.g. via `--git-dir=.` or >> GIT_DIR). > > Does it or does it not count as "allowing discovery to do its job" > if you go to the directory, knowing that the directory is a bare > one, and expect Git to work in it? > > I am guessing that your definition of "discovery" is not even > consider if the current directory is a repository and always force > the user to tell us with --git-dir or GIT_DIR. Yes, I mean that even the current directory will be ignored when discovery is disabled. > I am not sure that > is realistically feasible (I am thinking of cases like "git fetch" > going to the remote repository on the local disk that is bare to run > "git upload-pack"), but if the fallout is not too bad, it may be a > good heuristics. Good detail - I hadn't considered the impact on our own child processes. I suspect this might be a huge undertaking. Unless there is significant interest in this option, I probably won't pursue it further. ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Bare repositories in the working tree are a security risk 2022-04-14 16:41 ` Glen Choo @ 2022-04-14 17:35 ` Junio C Hamano 2022-04-14 18:19 ` Junio C Hamano 2022-04-15 21:33 ` Glen Choo 2022-04-15 20:13 ` Junio C Hamano 1 sibling, 2 replies; 50+ messages in thread From: Junio C Hamano @ 2022-04-14 17:35 UTC (permalink / raw) To: Glen Choo Cc: git, Emily Shaffer, justin, Taylor Blau, martinvonz, brian m. carlson Glen Choo <chooglen@google.com> writes: > Yes, I mean that even the current directory will be ignored when > discovery is disabled. OK. >> I am not sure that >> is realistically feasible (I am thinking of cases like "git fetch" >> going to the remote repository on the local disk that is bare to run >> "git upload-pack"), but if the fallout is not too bad, it may be a >> good heuristics. > > Good detail - I hadn't considered the impact on our own child processes. > I suspect this might be a huge undertaking. Unless there is significant > interest in this option, I probably won't pursue it further. I do not necessarily think so. The entry points to transport on the server side are quite limited (and the client side is dealing with your own repositories anyway), and they already know which directory in the server filesystem to hand to the upload-pack and friends, so it would be a matter of passing GIT_DIR=$there when they call into the run_command() API, if they are not already doing so. ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Bare repositories in the working tree are a security risk 2022-04-14 17:35 ` Junio C Hamano @ 2022-04-14 18:19 ` Junio C Hamano 2022-04-15 21:33 ` Glen Choo 1 sibling, 0 replies; 50+ messages in thread From: Junio C Hamano @ 2022-04-14 18:19 UTC (permalink / raw) To: Glen Choo Cc: git, Emily Shaffer, justin, Taylor Blau, martinvonz, brian m. carlson Junio C Hamano <gitster@pobox.com> writes: > Glen Choo <chooglen@google.com> writes: > >> Yes, I mean that even the current directory will be ignored when >> discovery is disabled. > > OK. > >>> I am not sure that >>> is realistically feasible (I am thinking of cases like "git fetch" >>> going to the remote repository on the local disk that is bare to run >>> "git upload-pack"), but if the fallout is not too bad, it may be a >>> good heuristics. >> >> Good detail - I hadn't considered the impact on our own child processes. >> I suspect this might be a huge undertaking. Unless there is significant >> interest in this option, I probably won't pursue it further. > I do not necessarily think so. The entry points to transport on the By "not" I meant "this might be huge? It may not be". Sorry for being unclear. > server side are quite limited (and the client side is dealing with > your own repositories anyway), and they already know which directory > in the server filesystem to hand to the upload-pack and friends, so > it would be a matter of passing GIT_DIR=$there when they call into the > run_command() API, if they are not already doing so. ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Bare repositories in the working tree are a security risk 2022-04-14 17:35 ` Junio C Hamano 2022-04-14 18:19 ` Junio C Hamano @ 2022-04-15 21:33 ` Glen Choo 2022-04-15 22:17 ` Junio C Hamano 2022-04-15 22:43 ` Glen Choo 1 sibling, 2 replies; 50+ messages in thread From: Glen Choo @ 2022-04-15 21:33 UTC (permalink / raw) To: Junio C Hamano Cc: git, Emily Shaffer, justin, Taylor Blau, martinvonz, brian m. carlson Junio C Hamano <gitster@pobox.com> writes: > Glen Choo <chooglen@google.com> writes: > >> Yes, I mean that even the current directory will be ignored when >> discovery is disabled. > > OK. > >>> I am not sure that >>> is realistically feasible (I am thinking of cases like "git fetch" >>> going to the remote repository on the local disk that is bare to run >>> "git upload-pack"), but if the fallout is not too bad, it may be a >>> good heuristics. >> >> Good detail - I hadn't considered the impact on our own child processes. >> I suspect this might be a huge undertaking. Unless there is significant >> interest in this option, I probably won't pursue it further. > > I do not necessarily think so. The entry points to transport on the > server side are quite limited (and the client side is dealing with > your own repositories anyway), and they already know which directory > in the server filesystem to hand to the upload-pack and friends, so > it would be a matter of passing GIT_DIR=$there when they call into the > run_command() API, if they are not already doing so. FWIW I experimented with turning off bare repo recognition altogether and seeing what breaks. CI Run: https://github.com/chooglen/git/runs/6042600953?check_suite_focus=true AFAICT, most of the test failures are what we'd expect if we stopped recognizing bare repos altogether. One stands out to me in particular though - in t5550-http-fetch-dumb.sh, git clone $HTTPD_URL/dumb/repo.git clone-tmpl yields ++ git clone http://127.0.0.1:5550/dumb/repo.git clone-tmpl Cloning into 'clone-tmpl'... fatal: repository 'http://127.0.0.1:5550/dumb/repo.git/' not found This sounds to me like Git isn't recognizing the static http files as a remote Git repo, and if so, --git-dir doesn't sound like it'll save us. But I'm not that familiar with the transport code so I don't know if this is a big deal or just a quirk with our httpd tests. ----- >8 --------- >8 --------- >8 --------- >8 --------- >8 ---- diff --git a/setup.c b/setup.c index 2600548776..794d9624d5 100644 --- a/setup.c +++ b/setup.c @@ -1196,10 +1196,10 @@ static enum discovery_result setup_git_directory_gently_1(struct strbuf *dir, return GIT_DIR_DISCOVERED; } - if (is_git_directory(dir->buf)) { - strbuf_addstr(gitdir, "."); - return GIT_DIR_BARE; - } + /* if (is_git_directory(dir->buf)) { */ + /* strbuf_addstr(gitdir, "."); */ + /* return GIT_DIR_BARE; */ + /* } */ if (offset <= min_offset) return GIT_DIR_HIT_CEILING; ^ permalink raw reply related [flat|nested] 50+ messages in thread
* Re: Bare repositories in the working tree are a security risk 2022-04-15 21:33 ` Glen Choo @ 2022-04-15 22:17 ` Junio C Hamano 2022-04-16 0:52 ` Taylor Blau 2022-04-15 22:43 ` Glen Choo 1 sibling, 1 reply; 50+ messages in thread From: Junio C Hamano @ 2022-04-15 22:17 UTC (permalink / raw) To: Glen Choo Cc: git, Emily Shaffer, justin, Taylor Blau, martinvonz, brian m. carlson Glen Choo <chooglen@google.com> writes: > FWIW I experimented with turning off bare repo recognition altogether > and seeing what breaks. I guess our mails crossed ;-) I've done a similar one locally and sent a report out earlier, and I think the part of the function in question that we disabled is identical. > ++ git clone http://127.0.0.1:5550/dumb/repo.git clone-tmpl > Cloning into 'clone-tmpl'... > fatal: repository 'http://127.0.0.1:5550/dumb/repo.git/' not found > > This sounds to me like Git isn't recognizing the static http files as a > remote Git repo, and if so, --git-dir doesn't sound like it'll save us. If the http server side we ship _depends_ on the fact that we historically consider that it is enough to chdir into a directory to use that directory as a bare repository, it is not all that surprising that the server side infrastructure needs to do an equivalent of "export GIT_DIR=." in addition to chdir it already does into the directory. There may be other things that the http responder in the affected test needs to do before it can recognize that the dumb/repo.git URL refers to a valid bare repository, and until that happens, the above experiment may not start working. I am not worried about that kind for breakage all that much, because it is entirely under _our_ control how an HTTP request received results in a "git" invocation, how "git clone/fetch ssh://" invokes the process that runs "git upload-pack" on the other side in the directory requested, etc. What worries me more is the effect on _other_ people's server implementations ("server", because that is one major class of use case of bare repositories). Essentially they need to identify the places where they depend on the current behaviour (i.e. going to the bare repository is enough to use it) and export GIT_DIR=. when they invoke "git" there. The actual change that is necessary might be small, but identifying the places that need such changes may be added burden to them. Thanks. ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Bare repositories in the working tree are a security risk 2022-04-15 22:17 ` Junio C Hamano @ 2022-04-16 0:52 ` Taylor Blau 0 siblings, 0 replies; 50+ messages in thread From: Taylor Blau @ 2022-04-16 0:52 UTC (permalink / raw) To: Junio C Hamano Cc: Glen Choo, git, Emily Shaffer, justin, martinvonz, brian m. carlson On Fri, Apr 15, 2022 at 03:17:54PM -0700, Junio C Hamano wrote: > What worries me more is the effect on _other_ people's server > implementations ("server", because that is one major class of use > case of bare repositories). Essentially they need to identify the > places where they depend on the current behaviour (i.e. going to the > bare repository is enough to use it) and export GIT_DIR=. when they > invoke "git" there. The actual change that is necessary might be > small, but identifying the places that need such changes may be > added burden to them. I'm definitely biased here, but I think that this is probably too big of a burden to place on forge administrators. Every repository on GitHub (and I suspect GitLab, too) is written as a bare repository on disk, and so having to inject `GIT_DIR=.` into every git invocation, while doable, seems like an unnecessary pain to impose. I suppose you could hide this behavior behind an opt-in configuration setting, e.g., `core.bareDiscovery` which determines whether or not we will try to discover bare repositories without `GIT_DIR` in the environment. That setting could default to true to avoid breaking existing use-cases. But I think that's just masking the overall pain that this approach would incur, without providing much protection, so I wouldn't be unhappy to see us pursue a different approach here. Thanks, Taylor ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Bare repositories in the working tree are a security risk 2022-04-15 21:33 ` Glen Choo 2022-04-15 22:17 ` Junio C Hamano @ 2022-04-15 22:43 ` Glen Choo 1 sibling, 0 replies; 50+ messages in thread From: Glen Choo @ 2022-04-15 22:43 UTC (permalink / raw) To: Junio C Hamano Cc: git, Emily Shaffer, justin, Taylor Blau, martinvonz, brian m. carlson Glen Choo <chooglen@google.com> writes: > Junio C Hamano <gitster@pobox.com> writes: > >> Glen Choo <chooglen@google.com> writes: >> >>> Yes, I mean that even the current directory will be ignored when >>> discovery is disabled. >> >> OK. >> >>>> I am not sure that >>>> is realistically feasible (I am thinking of cases like "git fetch" >>>> going to the remote repository on the local disk that is bare to run >>>> "git upload-pack"), but if the fallout is not too bad, it may be a >>>> good heuristics. >>> >>> Good detail - I hadn't considered the impact on our own child processes. >>> I suspect this might be a huge undertaking. Unless there is significant >>> interest in this option, I probably won't pursue it further. >> >> I do not necessarily think so. The entry points to transport on the >> server side are quite limited (and the client side is dealing with >> your own repositories anyway), and they already know which directory >> in the server filesystem to hand to the upload-pack and friends, so >> it would be a matter of passing GIT_DIR=$there when they call into the >> run_command() API, if they are not already doing so. > > FWIW I experimented with turning off bare repo recognition altogether > and seeing what breaks. > > CI Run: https://github.com/chooglen/git/runs/6042600953?check_suite_focus=true > > AFAICT, most of the test failures are what we'd expect if we stopped > recognizing bare repos altogether. One stands out to me in particular > though - in t5550-http-fetch-dumb.sh, > > git clone $HTTPD_URL/dumb/repo.git clone-tmpl > > yields > > ++ git clone http://127.0.0.1:5550/dumb/repo.git clone-tmpl > Cloning into 'clone-tmpl'... > fatal: repository 'http://127.0.0.1:5550/dumb/repo.git/' not found > > This sounds to me like Git isn't recognizing the static http files as a > remote Git repo, and if so, --git-dir doesn't sound like it'll save us. > But I'm not that familiar with the transport code so I don't know if > this is a big deal or just a quirk with our httpd tests. Ok I think this is a false alarm - the previous test does some setup on the server, which is a bare repo. It was the _setup_ that was broken, not the actual `git clone`. ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Bare repositories in the working tree are a security risk 2022-04-14 16:41 ` Glen Choo 2022-04-14 17:35 ` Junio C Hamano @ 2022-04-15 20:13 ` Junio C Hamano 2022-04-15 23:45 ` Glen Choo 1 sibling, 1 reply; 50+ messages in thread From: Junio C Hamano @ 2022-04-15 20:13 UTC (permalink / raw) To: Glen Choo Cc: git, Emily Shaffer, justin, Taylor Blau, martinvonz, brian m. carlson Glen Choo <chooglen@google.com> writes: > Yes, I mean that even the current directory will be ignored when > discovery is disabled. > >> I am not sure that >> is realistically feasible (I am thinking of cases like "git fetch" >> going to the remote repository on the local disk that is bare to run >> "git upload-pack"), but if the fallout is not too bad, it may be a >> good heuristics. > > Good detail - I hadn't considered the impact on our own child processes. > I suspect this might be a huge undertaking. Unless there is significant > interest in this option, I probably won't pursue it further. So, I was doing a bit of experiment. The change to the set-up essentially means that working in either a bare repository or in the .git/ subdirectory of a non-bare repository that you used to be able to do with cd bare.git && git cmd cd .git/ && git cmd is now forbidden, unless you explicitly say which directory you want to use as the GIT_DIR, i.e. cd bare.git && GIT_DIR=. git cmd cd .git/ && GIT_DIR=. git cmd The required change to the code is surprisingly small, but the fallout is very big. Partial patches to two tests are attached with some commentary at the end. setup.c | 9 +++++++-- diff --git c/setup.c w/setup.c index a7b36f3ffb..52304ae57e 100644 --- c/setup.c +++ w/setup.c @@ -1203,12 +1203,15 @@ static enum discovery_result setup_git_directory_gently_1(struct strbuf *dir, * Test in the following order (relative to the dir): * - .git (file containing "gitdir: <path>") * - .git/ - * - ./ (bare) + * - ./ (bare) --- not * - ../.git * - ../.git/ - * - ../ (bare) + * - ../ (bare) --- not * - ../../.git * etc. + * we used to test the directory itself as a bare one + * after checking if the directory has .git/ that is a repository + * and before moving up the hierarchy, but no longer do so. */ one_filesystem = !git_env_bool("GIT_DISCOVERY_ACROSS_FILESYSTEM", 0); if (one_filesystem) @@ -1238,12 +1241,14 @@ static enum discovery_result setup_git_directory_gently_1(struct strbuf *dir, return GIT_DIR_DISCOVERED; } +#if 0 if (is_git_directory(dir->buf)) { if (!ensure_valid_ownership(dir->buf)) return GIT_DIR_INVALID_OWNERSHIP; strbuf_addstr(gitdir, "."); return GIT_DIR_BARE; } +#endif if (offset <= min_offset) return GIT_DIR_HIT_CEILING; THe above is all it takes to disable the recognition of a bare repository during the repositiry discovery. The code to handle explicitly specified GIT_DIR is way before the loop being touched here and works OK as before. Interestingly, many breakages in tests are not due to "clone" failing to operate in "the other side". For example, this one tries to inspect a bare repository that was created by a clone operation, and we need to tell "git" where the GIT_DIR is, even after we chdir to it. I do not think this is so common but with our popularity, not-so-common population is still an absolute large number of users that we do not want to hurt. diff --git c/t/t5601-clone.sh w/t/t5601-clone.sh index 4a61f2c901..822ee3164a 100755 --- c/t/t5601-clone.sh +++ w/t/t5601-clone.sh @@ -111,9 +111,9 @@ test_expect_success 'clone --mirror' ' git clone --mirror src mirror && test -f mirror/HEAD && test ! -f mirror/file && - FETCH="$(cd mirror && git config remote.origin.fetch)" && + FETCH="$(cd mirror && GIT_DIR=. git config remote.origin.fetch)" && test "+refs/*:refs/*" = "$FETCH" && - MIRROR="$(cd mirror && git config --bool remote.origin.mirror)" && + MIRROR="$(cd mirror && GIT_DIR=. git config --bool remote.origin.mirror)" && test "$MIRROR" = true ' @@ -151,7 +151,7 @@ test_expect_success 'clone --mirror does not repeat tags' ' git tag some-tag HEAD) && git clone --mirror src mirror2 && (cd mirror2 && - git show-ref 2> clone.err > clone.out) && + GIT_DIR=. git show-ref 2>clone.err >clone.out) && ! grep Duplicate mirror2/clone.err && grep some-tag mirror2/clone.out The next example is "init" test, which seems to reveal that refusing to discover a bare repository is not exactly an approach that is workable. The first two hunks just show the same pattern as above. The check_config helper is given a directory, under which 'config' and 'refs' should exist (in other words, the first part of the helper emulates setup.c::is_git_directory()). Then it tries to read from the configuration file found in that directory. Be it a bare repository or .git/ subdirectory of a non-bare repository the caller has given us, we need to do the same GIT_DIR=. dance as the above. The last hunk does not really fix. The scenario is this: * A bare repository bare-ancestor-aliased.git is prepared * Its config file says "[alias] aliasedinit=init" in it * In that bare repository (next to 'refs', 'config'), we create a subdirectory 'plain-nested'. * We go to that bare-ancestor-aliased.git/plain-nested/ directory and say "git aliasedinit". This wants to be able to create a subdirectory in this bare repository and make that subdirectory an independent repository (it happens to be a non-bare repository, but a bare repository should also work---think of .git/modules/name/ bare repository that is designed to be pointed at with gitfile: link from a submodule working tree). Now, in order for the last step to be able to even _use_ the alias, it has to treat the bare-ancestor-aliased.git directory as a bare repository and read its configuration. But by explicitly setting GIT_DIR to that location, the behaviour of "git init" itself is changed. It no longer initializes the current directory, i.e. plain-nested subdirectory of the bare-ancestor-aliased.git; GIT_DIR tells us to (re)initialize it instead. This fundamentally does not work. It is a separate matter if it makes sense to allow creating a new repository inside a bare repository (it does---that is what the modern submodule layout uses), or to allow an alias to help doing so defined in the top-level bare repository (it probably does---the end-user may want to have a handy way to customize how submodules are configured). But it seems to tell us that with today's external interface we have for "git init", we cannot take configuration from a shallower level and use it to drive "git init" to create a new repository at a deeper level. diff --git c/t/t0001-init.sh w/t/t0001-init.sh index d479303efa..57f2eedac0 100755 --- c/t/t0001-init.sh +++ w/t/t0001-init.sh @@ -6,6 +6,7 @@ TEST_PASSES_SANITIZE_LEAK=true . ./test-lib.sh check_config () { + # NEEDSWORK: refs could be a file in post reftable world. if test_path_is_dir "$1" && test_path_is_file "$1/config" && test_path_is_dir "$1/refs" then @@ -21,8 +22,8 @@ check_config () { return 1 fi - bare=$(cd "$1" && git config --bool core.bare) - worktree=$(cd "$1" && git config core.worktree) || + bare=$(cd "$1" && GIT_DIR=. git config --bool core.bare) + worktree=$(cd "$1" && GIT_DIR=. git config core.worktree) || worktree=unset test "$bare" = "$2" && test "$worktree" = "$3" || { @@ -84,7 +85,7 @@ test_expect_success 'plain nested in bare through aliased command' ' echo "[alias] aliasedinit = init" >>config && mkdir plain-nested && cd plain-nested && - git aliasedinit + GIT_DIR=.. git aliasedinit ;# does not work ) && check_config bare-ancestor-aliased.git/plain-nested/.git false unset ' ^ permalink raw reply related [flat|nested] 50+ messages in thread
* Re: Bare repositories in the working tree are a security risk 2022-04-15 20:13 ` Junio C Hamano @ 2022-04-15 23:45 ` Glen Choo 2022-04-15 23:59 ` Glen Choo 2022-04-16 1:00 ` Taylor Blau 0 siblings, 2 replies; 50+ messages in thread From: Glen Choo @ 2022-04-15 23:45 UTC (permalink / raw) To: Junio C Hamano Cc: git, Emily Shaffer, justin, Taylor Blau, martinvonz, brian m. carlson Junio C Hamano <gitster@pobox.com> writes: > Glen Choo <chooglen@google.com> writes: > >> Yes, I mean that even the current directory will be ignored when >> discovery is disabled. >> >>> I am not sure that >>> is realistically feasible (I am thinking of cases like "git fetch" >>> going to the remote repository on the local disk that is bare to run >>> "git upload-pack"), but if the fallout is not too bad, it may be a >>> good heuristics. >> >> Good detail - I hadn't considered the impact on our own child processes. >> I suspect this might be a huge undertaking. Unless there is significant >> interest in this option, I probably won't pursue it further. > > So, I was doing a bit of experiment. The change to the set-up > essentially means that working in either a bare repository or in the > .git/ subdirectory of a non-bare repository that you used to be able > to do with > > cd bare.git && git cmd > cd .git/ && git cmd > > is now forbidden, unless you explicitly say which directory you want > to use as the GIT_DIR, i.e. > > cd bare.git && GIT_DIR=. git cmd > cd .git/ && GIT_DIR=. git cmd > > The required change to the code is surprisingly small, but the > fallout is very big. Partial patches to two tests are attached with > some commentary at the end. Thanks for running the experiment and the thorough writeup! Within the Google Git team, there was quite a bit of interest in this approach because of how simple it is, but the projected impact on users is potentially very high; this experiment is a great starting heuristic. > Interestingly, many breakages in tests are not due to "clone" > failing to operate in "the other side". For example, this one tries > to inspect a bare repository that was created by a clone operation, > and we need to tell "git" where the GIT_DIR is, even after we chdir > to it. I do not think this is so common but with our popularity, > not-so-common population is still an absolute large number of users > that we do not want to hurt. > > diff --git c/t/t5601-clone.sh w/t/t5601-clone.sh > index 4a61f2c901..822ee3164a 100755 > --- c/t/t5601-clone.sh > +++ w/t/t5601-clone.sh > @@ -111,9 +111,9 @@ test_expect_success 'clone --mirror' ' > git clone --mirror src mirror && > test -f mirror/HEAD && > test ! -f mirror/file && > - FETCH="$(cd mirror && git config remote.origin.fetch)" && > + FETCH="$(cd mirror && GIT_DIR=. git config remote.origin.fetch)" && > test "+refs/*:refs/*" = "$FETCH" && > - MIRROR="$(cd mirror && git config --bool remote.origin.mirror)" && > + MIRROR="$(cd mirror && GIT_DIR=. git config --bool remote.origin.mirror)" && > test "$MIRROR" = true > > ' > @@ -151,7 +151,7 @@ test_expect_success 'clone --mirror does not repeat tags' ' > git tag some-tag HEAD) && > git clone --mirror src mirror2 && > (cd mirror2 && > - git show-ref 2> clone.err > clone.out) && > + GIT_DIR=. git show-ref 2>clone.err >clone.out) && > ! grep Duplicate mirror2/clone.err && > grep some-tag mirror2/clone.out The assumption when I made this proposal is that --git-dir is a sufficient workaround when we don't detect bare repos. As such, I'm of the opinion that this is the "expected" result of such a change - we really do expect that `git clone --bare` no longer works without --git-dir. I don't think this case (by itself) means that we can ignore bare repos. > The next example is "init" test, which seems to reveal that refusing > to discover a bare repository is not exactly an approach that is > workable. > > The first two hunks just show the same pattern as above. The > check_config helper is given a directory, under which 'config' and > 'refs' should exist (in other words, the first part of the helper > emulates setup.c::is_git_directory()). Then it tries to read from > the configuration file found in that directory. Be it a bare > repository or .git/ subdirectory of a non-bare repository the caller > has given us, we need to do the same GIT_DIR=. dance as the above. > > The last hunk does not really fix. The scenario is this: > > * A bare repository bare-ancestor-aliased.git is prepared > > * Its config file says "[alias] aliasedinit=init" in it > > * In that bare repository (next to 'refs', 'config'), we create > a subdirectory 'plain-nested'. > > * We go to that bare-ancestor-aliased.git/plain-nested/ > directory and say "git aliasedinit". > > This wants to be able to create a subdirectory in this bare > repository and make that subdirectory an independent repository (it > happens to be a non-bare repository, but a bare repository should > also work---think of .git/modules/name/ bare repository that is > designed to be pointed at with gitfile: link from a submodule > working tree). > > Now, in order for the last step to be able to even _use_ the alias, > it has to treat the bare-ancestor-aliased.git directory as a bare > repository and read its configuration. But by explicitly setting > GIT_DIR to that location, the behaviour of "git init" itself is > changed. It no longer initializes the current directory, > i.e. plain-nested subdirectory of the bare-ancestor-aliased.git; > GIT_DIR tells us to (re)initialize it instead. However, this case shows that --git-dir won't work at all with "git init", and I wouldn't be surprised if there are other breakages hiding just out of plain sight. This worries me a lot more, and I think disabling bare repo detection wholesale might be a nonstarter. I wonder if all we need to do is make setup.c a little smarter - we recognize bare repos, but *only* if they are contained in a directory named `.git/`. This should fix "git init" I think, because even though we'd ignore "./ (bare)", we'd recognize `../.git/` and we get the right behavior again. I haven't tested it yet, but this proposal sounds like it has quite a few merits to me: - Easy to explain. - Easy rationalization ("We used to be quite cavalier about detecting bare repos, but now we're being more strict by default"). - Fixes the embedded bare repo problem (since we don't allow .git). - No-op and neglible performance hit for non-bare repo users, even if they occasionally run "git" inside ".git". As with the original proposal of "ignore bare repos altogether", this will still be a headache for bare repo users (unless they don't mind renaming their bare repo directory to ".git"), so this behavior needs to be configurable, but perhaps the fallout is small enough that this could be a safe default. > This fundamentally does not work. > > It is a separate matter if it makes sense to allow creating a new > repository inside a bare repository (it does---that is what the > modern submodule layout uses), or to allow an alias to help doing so > defined in the top-level bare repository (it probably does---the > end-user may want to have a handy way to customize how submodules > are configured). But it seems to tell us that with today's external > interface we have for "git init", we cannot take configuration from > a shallower level and use it to drive "git init" to create a new > repository at a deeper level. > ' ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Bare repositories in the working tree are a security risk 2022-04-15 23:45 ` Glen Choo @ 2022-04-15 23:59 ` Glen Choo 2022-04-16 1:00 ` Taylor Blau 1 sibling, 0 replies; 50+ messages in thread From: Glen Choo @ 2022-04-15 23:59 UTC (permalink / raw) To: Junio C Hamano Cc: git, Emily Shaffer, justin, Taylor Blau, martinvonz, brian m. carlson > I wonder if all we need to do is make setup.c a little smarter - we > recognize bare repos, but *only* if they are contained in a directory > named `.git/`. This should fix "git init" I think, because even though > we'd ignore "./ (bare)", we'd recognize `../.git/` and we get the right > behavior again. > > > It is a separate matter if it makes sense to allow creating a new > > repository inside a bare repository (it does---that is what the > > modern submodule layout uses), or to allow an alias to help doing so > > defined in the top-level bare repository (it probably does---the > > end-user may want to have a handy way to customize how submodules > > are configured). But it seems to tell us that with today's external > > interface we have for "git init", we cannot take configuration from > > a shallower level and use it to drive "git init" to create a new > > repository at a deeper level. Hah, I forgot that submodules are stored as `.git/modules/foo` and not `.git/modules/foo/.git`, so this doesn't preserve the existing behavior for submodules nested > 1 level. ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Bare repositories in the working tree are a security risk 2022-04-15 23:45 ` Glen Choo 2022-04-15 23:59 ` Glen Choo @ 2022-04-16 1:00 ` Taylor Blau 2022-04-16 1:18 ` Junio C Hamano 1 sibling, 1 reply; 50+ messages in thread From: Taylor Blau @ 2022-04-16 1:00 UTC (permalink / raw) To: Glen Choo Cc: Junio C Hamano, git, Emily Shaffer, justin, martinvonz, brian m. carlson On Fri, Apr 15, 2022 at 04:45:32PM -0700, Glen Choo wrote: > However, this case shows that --git-dir won't work at all with "git > init", and I wouldn't be surprised if there are other breakages hiding > just out of plain sight. This worries me a lot more, and I think > disabling bare repo detection wholesale might be a nonstarter. > > I wonder if all we need to do is make setup.c a little smarter - we > recognize bare repos, but *only* if they are contained in a directory > named `.git/`. This should fix "git init" I think, because even though > we'd ignore "./ (bare)", we'd recognize `../.git/` and we get the right > behavior again. > > I haven't tested it yet, but this proposal sounds like it has quite a > few merits to me: > > - Easy to explain. > - Easy rationalization ("We used to be quite cavalier about detecting > bare repos, but now we're being more strict by default"). > - Fixes the embedded bare repo problem (since we don't allow .git). > - No-op and neglible performance hit for non-bare repo users, even if > they occasionally run "git" inside ".git". > > As with the original proposal of "ignore bare repos altogether", this > will still be a headache for bare repo users (unless they don't mind > renaming their bare repo directory to ".git"), so this behavior needs to > be configurable, but perhaps the fallout is small enough that this > could be a safe default. I think you are underestimating the fallout, at least if I'm understanding your proposal correctly. Is the proposal to only detect bare repositories that are called `.git`? I think that's what you're suggesting, though can't we just as easily embed a bare repository named ".git" in a clone as long as its not in the root directory? But like I said in an earlier message, I think that this is vastly underestimating the number of legitimate repositories that are stored as bare repos on disk (not embedded in a non-bare repo, nor named ".git"). Thanks, Taylor ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Bare repositories in the working tree are a security risk 2022-04-16 1:00 ` Taylor Blau @ 2022-04-16 1:18 ` Junio C Hamano 2022-04-16 1:30 ` Taylor Blau 0 siblings, 1 reply; 50+ messages in thread From: Junio C Hamano @ 2022-04-16 1:18 UTC (permalink / raw) To: Taylor Blau Cc: Glen Choo, git, Emily Shaffer, justin, martinvonz, brian m. carlson Taylor Blau <me@ttaylorr.com> writes: > Is the proposal to only detect bare repositories that are called `.git`? > I think that's what you're suggesting, though can't we just as easily > embed a bare repository named ".git" in a clone as long as its not in > the root directory? I do not think "you can use your bare repository as before ONLY if the directory is named .git; otherwise you must use GIT_DIR to point at it" would fly; the Glen's exception may help many uses of ".git subdirectory of a non-bare repository as if it were a bare" you can find in tests, but does not help real-world use cases where there may be bunch of bare repositories named "$project.git" at all. But I have to point out that your attack above would not work, as we do not allow ".git" directory in the index to begin with. IOW, you as an attacker may be able to prepare such a tree with nonstandard tools, but the victim won't be able to check it out (and fsck-during-transfer would probably block the cloning). ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Bare repositories in the working tree are a security risk 2022-04-16 1:18 ` Junio C Hamano @ 2022-04-16 1:30 ` Taylor Blau 0 siblings, 0 replies; 50+ messages in thread From: Taylor Blau @ 2022-04-16 1:30 UTC (permalink / raw) To: Junio C Hamano Cc: Taylor Blau, Glen Choo, git, Emily Shaffer, justin, martinvonz, brian m. carlson On Fri, Apr 15, 2022 at 06:18:14PM -0700, Junio C Hamano wrote: > Taylor Blau <me@ttaylorr.com> writes: > > > Is the proposal to only detect bare repositories that are called `.git`? > > I think that's what you're suggesting, though can't we just as easily > > embed a bare repository named ".git" in a clone as long as its not in > > the root directory? > > I do not think "you can use your bare repository as before ONLY if > the directory is named .git; otherwise you must use GIT_DIR to point > at it" would fly; the Glen's exception may help many uses of ".git > subdirectory of a non-bare repository as if it were a bare" you can > find in tests, but does not help real-world use cases where there > may be bunch of bare repositories named "$project.git" at all. Agreed. > But I have to point out that your attack above would not work, as we > do not allow ".git" directory in the index to begin with. IOW, you > as an attacker may be able to prepare such a tree with nonstandard > tools, but the victim won't be able to check it out (and > fsck-during-transfer would probably block the cloning). Makes sense, and thanks for the reminder; I agree. Thanks, Taylor ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Bare repositories in the working tree are a security risk 2022-04-06 22:43 Bare repositories in the working tree are a security risk Glen Choo ` (3 preceding siblings ...) 2022-04-13 20:37 ` Glen Choo @ 2022-04-16 0:34 ` Glen Choo 2022-04-16 0:41 ` Glen Choo 5 siblings, 0 replies; 50+ messages in thread From: Glen Choo @ 2022-04-16 0:34 UTC (permalink / raw) To: git; +Cc: Emily Shaffer, justin, Taylor Blau Hi all! Thanks for chiming in. I'm sorry I wasn't able to participate in a timely manner, and unfortunately (once again..), I will be out of the office (returning 27 Apr, PDT) and my colleagues will be keeping tabs on the discussion in my stead (once again..). Nevertheless, I think the conversation has been pretty fruitful, so I'm optimistic about next steps. Here's my understanding of the conversation thus far - do chime in if you feel I'm off base or if I've missed something: * We all agree that something needs to be done about embedded bare repos. This is a pretty good starting point IMO, because we agree that 'do nothing' isn't a good response. * There are use cases for embedded bare repos that don't have great alternatives (e.g. libgit2 uses bare repos in its tests). Even if this workflow is frowned upon (I personally don't think we should support it), I don't think we're ready to categorically declare that Git should ban embedded bare repos altogether (e.g. the way we ban .GiT). * We want additional protection on the client besides `git fsck`; there are a few ways to do this: 1. Prevent checking out an embedded bare repo. 2. Detect if the bare repo is embedded and refuse to work with it. 3. Detect if the bare repo is embedded and do not read its config/hooks, but everything else still 'works'. 4. Don't detect bare repos. 5. Only detect bare repos that are named `.git` [1]. (I've responded with my thoughts on each of these approaches in-thread). With that in mind, here's what I propose we do next: * Merge the `git fsck` patch [2] if we think that it is useful despite the potentially huge number of false positives. That patch needs some fixing; I'll make the changes when I'm back. * I'll experiment with (5), and if it seems promising, I'll propose this as an opt-in feature, with the intent of making it opt-out in the future. We'll opt-into this at Google to help figure out if this is a good default or not. * If that direction turns out not to be promising, I'll pursue (1), since that is the only option that can be configured per-repo, which should hopefully minimize the fallout. Given that this embedded bare repo problem has been known for a long time, I don't think we need to rush out a fix, but (especially since I'll be OOO) I'm more than happy for someone to take my ideas (or any ideas) and run with them. [1] https://lore.kernel.org/git/kl6l5yn99ahv.fsf@chooglen-macbookpro.roam.corp.google.com/ [2] https://lore.kernel.org/git/20220406232231.47714-1-chooglen@google.com ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Bare repositories in the working tree are a security risk 2022-04-06 22:43 Bare repositories in the working tree are a security risk Glen Choo ` (4 preceding siblings ...) 2022-04-16 0:34 ` Glen Choo @ 2022-04-16 0:41 ` Glen Choo 2022-04-16 1:28 ` Taylor Blau 5 siblings, 1 reply; 50+ messages in thread From: Glen Choo @ 2022-04-16 0:41 UTC (permalink / raw) To: git Cc: Emily Shaffer, justin, Taylor Blau, Johannes Schindelin, Ævar Arnfjörð Bjarmason, Derrick Stolee, Junio C Hamano, brian m. carlson, rsbecker (resending this because I think the original got blocked, and I messed up the CC list anyway.) Hi all! Thanks for chiming in. I'm sorry I wasn't able to participate in a timely manner, and unfortunately (once again..), I will be out of the office (returning 27 Apr, PDT) and my colleagues will be keeping tabs on the discussion in my stead (once again..). Nevertheless, I think the conversation has been pretty fruitful, so I'm optimistic about next steps. Here's my understanding of the conversation thus far - do chime in if you feel I'm off base or if I've missed something: * We all agree that something needs to be done about embedded bare repos. This is a pretty good starting point IMO, because we agree that 'do nothing' isn't a good response. * There are use cases for embedded bare repos that don't have great alternatives (e.g. libgit2 uses bare repos in its tests). Even if this workflow is frowned upon (I personally don't think we should support it), I don't think we're ready to categorically declare that Git should ban embedded bare repos altogether (e.g. the way we ban .GiT). * We want additional protection on the client besides `git fsck`; there are a few ways to do this: 1. Prevent checking out an embedded bare repo. 2. Detect if the bare repo is embedded and refuse to work with it. 3. Detect if the bare repo is embedded and do not read its config/hooks, but everything else still 'works'. 4. Don't detect bare repos. 5. Only detect bare repos that are named `.git` [1]. (I've responded with my thoughts on each of these approaches in-thread). With that in mind, here's what I propose we do next: * Merge the `git fsck` patch [2] if we think that it is useful despite the potentially huge number of false positives. That patch needs some fixing; I'll make the changes when I'm back. * I'll experiment with (5), and if it seems promising, I'll propose this as an opt-in feature, with the intent of making it opt-out in the future. We'll opt-into this at Google to help figure out if this is a good default or not. * If that direction turns out not to be promising, I'll pursue (1), since that is the only option that can be configured per-repo, which should hopefully minimize the fallout. Given that this embedded bare repo problem has been known for a long time, I don't think we need to rush out a fix, but (especially since I'll be OOO) I'm more than happy for someone to take my ideas (or any ideas) and run with them. [1] https://lore.kernel.org/git/kl6l5yn99ahv.fsf@chooglen-macbookpro.roam.corp.google.com/ [2] https://lore.kernel.org/git/20220406232231.47714-1-chooglen@google.com ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Bare repositories in the working tree are a security risk 2022-04-16 0:41 ` Glen Choo @ 2022-04-16 1:28 ` Taylor Blau 2022-04-21 18:25 ` Emily Shaffer 2022-04-29 23:57 ` Glen Choo 0 siblings, 2 replies; 50+ messages in thread From: Taylor Blau @ 2022-04-16 1:28 UTC (permalink / raw) To: Glen Choo Cc: git, Emily Shaffer, justin, Taylor Blau, Johannes Schindelin, Ævar Arnfjörð Bjarmason, Derrick Stolee, Junio C Hamano, brian m. carlson, rsbecker On Fri, Apr 15, 2022 at 05:41:59PM -0700, Glen Choo wrote: > * We all agree that something needs to be done about embedded bare repos. This > is a pretty good starting point IMO, because we agree that 'do nothing' isn't > a good response. To be totally honest, I am not absolutely convinced. I agree that it's sub-optimal that Git is an attack vector for remote code execution, but I think there is significant social engineering required in order to meaningfully exploit this. Particularly because an attacker must convince their victim to: - clone the repository, - cd into the embedded bare repository, and - run a git command Scripting around the output of git commands in your PS1 makes the latter more likely, so I think it's worthwhile to explore how to either prevent this type of attack or make it substantially less likely to have a user run git commands that execute parts of the config opportunistically. That said, I think there are other approaches that we could take that would hopefully disrupt fewer existing workflows. > * There are use cases for embedded bare repos that don't have great alternatives > (e.g. libgit2 uses bare repos in its tests). Even if this workflow is frowned > upon (I personally don't think we should support it), I don't think we're > ready to categorically declare that Git should ban embedded bare repos > altogether (e.g. the way we ban .GiT). I think there are a handful of legitimate reasons that we might want to continue supporting this; for projects like libgit2 and Git LFS, it's useful to have repositories in a known state to execute tests in. Having bare Git repositories contained in some "test fixture" directory is a really easy way to do just that. > * We want additional protection on the client besides `git fsck`; there are > a few ways to do this: I'm a little late to chime into the thread, but I appreciate you summarizing some of the approaches discussed so far. Let me add my thoughts on each of these in order: > 1. Prevent checking out an embedded bare repo. > 2. Detect if the bare repo is embedded and refuse to work with it. > 3. Detect if the bare repo is embedded and do not read its config/hooks, but > everything else still 'works'. > 4. Don't detect bare repos. > 5. Only detect bare repos that are named `.git` [1]. > > (I've responded with my thoughts on each of these approaches in-thread). 1. Likely disrupts too many legitimate workflows for us to adopt without designing some way to declare an embedded bare repository is "safe". 2. Ditto. 3. This seems the most promising approach so far. Similar to (1), I would also want to make sure we provide an easy way to declare a bare repository as "safe" in order to avoid permanently disrupting valid workflows that have accumulated over the past >15 years. 4. Seems like this approach is too heavy-handed. 5. Ditto. > With that in mind, here's what I propose we do next: > > * Merge the `git fsck` patch [2] if we think that it is useful despite the > potentially huge number of false positives. That patch needs some fixing; I'll > make the changes when I'm back. If there are lots of false positives, we should consider downgrading the severity of the proposed `EMBEDDED_BARE_REPO` fsck check to "info". I'm not clear if there is another reason why this patch would have a significant number of false positives (i.e., if the detection mechanism is over-zealous). But if not, and this does detect only legitimate embedded bare repositories, we should use it as a reminder to consider how many use-cases and workflows will be affected by whatever approach we take here, if any. > * I'll experiment with (5), and if it seems promising, I'll propose this as an > opt-in feature, with the intent of making it opt-out in the future. We'll > opt-into this at Google to help figure out if this is a good default or not. > > * If that direction turns out not to be promising, I'll pursue (1), since that > is the only option that can be configured per-repo, which should hopefully > minimize the fallout. Here's an alternative approach, which I haven't seen discussed thus far: When a bare repository is embedded in another repository, avoid reading its config by default. This means that most Git commands will still work, but without the possibility of running any "executable" portions of the config. To opt-out (i.e., to allow legitimate use-cases to start reading embedded bare repository config again), the embedding repository would have to set a multi-valued `safe.embeddedRepo` configuration. This would specify a list of paths relative to the embedding repository's root of known-safe bare repositories. I think there are a couple of desirable attributes of this approach: - It minimally disrupts bare repositories, restricting the change to only embedded repositories. - It allows most Git commands to continue working as expected (modulo reading the config), hopefully making the population of users whose workflows will suddenly break pretty small. - It requires the user to explicitly opt-in to the unsafe behavior, because an attacker could not influence the embedding repository's `safe.embeddedRepo` config. If we were going to do something about this, I would strongly advocate for something that resembles the above. Or at the very least, some solution that captures the attributes I outlined there. I would be happy to work together with you (or anybody!) on developing patches in that direction, so let me know if you are interested in coordinating our efforts. > Given that this embedded bare repo problem has been known for a long time, I > don't think we need to rush out a fix, but (especially since I'll be OOO) I'm > more than happy for someone to take my ideas (or any ideas) and run with them. No rush. Regardless of your time out-of-office, we should take advantage of the fact that this is a long-known wart to carefully craft a solution that provides some additional safety while disrupting as few existing workflows as possible. Thanks, Taylor ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Bare repositories in the working tree are a security risk 2022-04-16 1:28 ` Taylor Blau @ 2022-04-21 18:25 ` Emily Shaffer 2022-04-21 18:29 ` Emily Shaffer 2022-04-21 19:09 ` Taylor Blau 2022-04-29 23:57 ` Glen Choo 1 sibling, 2 replies; 50+ messages in thread From: Emily Shaffer @ 2022-04-21 18:25 UTC (permalink / raw) To: Taylor Blau Cc: Glen Choo, Git List, justin, Johannes Schindelin, Ævar Arnfjörð Bjarmason, Derrick Stolee, Junio C Hamano, brian m. carlson, Randall S. Becker On Fri, Apr 15, 2022 at 6:28 PM Taylor Blau <me@ttaylorr.com> wrote: > > On Fri, Apr 15, 2022 at 05:41:59PM -0700, Glen Choo wrote: > > * We want additional protection on the client besides `git fsck`; there are > > a few ways to do this: > > I'm a little late to chime into the thread, but I appreciate you > summarizing some of the approaches discussed so far. Let me add my > thoughts on each of these in order: > > > 1. Prevent checking out an embedded bare repo. > > 2. Detect if the bare repo is embedded and refuse to work with it. > > 3. Detect if the bare repo is embedded and do not read its config/hooks, but > > everything else still 'works'. > > 4. Don't detect bare repos. > > 5. Only detect bare repos that are named `.git` [1]. > > > > (I've responded with my thoughts on each of these approaches in-thread). > > 1. Likely disrupts too many legitimate workflows for us to adopt > without designing some way to declare an embedded bare repository > is "safe". > 2. Ditto. > 3. This seems the most promising approach so far. Similar to (1), I > would also want to make sure we provide an easy way to declare a > bare repository as "safe" in order to avoid permanently disrupting > valid workflows that have accumulated over the past >15 years. I'd like to think a little more about how we want to determine that a bare repo isn't embedded - wantonly scanning up the filesystem for any gitdir above the current one is really expensive. When I tried that approach for the purposes of including some shared config between superproject and submodules, it slowed down the Git test suite by something like 3-5x. So, I suppose that "we think this is bare" means refs/ and objects/ present in a directory that isn't named '.git/', and then we hunt anywhere above us for another .git/? Of course being careful not to accept another bare repo as the "parent" gitdir. Does it work for submodules? I suppose it works for non-bare submodules - and for bare submodules, e.g. 'submodule-having-project.git/modules/some-submodule/' we can rely on the user to turn that flag on if they want their submodule's config and hooks to be noticed from the gitdir. (From 'worktree-for-submodule-having-project/some-submodule' there is a '.git' pointer, so I'd expect things to work normally that way, right?) As long as we are careful to avoid searching the filesystem in *every* case, this seems like a pretty reasonable approach to me. > 4. Seems like this approach is too heavy-handed. > 5. Ditto. > > > With that in mind, here's what I propose we do next: > > > > * Merge the `git fsck` patch [2] if we think that it is useful despite the > > potentially huge number of false positives. That patch needs some fixing; I'll > > make the changes when I'm back. > > If there are lots of false positives, we should consider downgrading the > severity of the proposed `EMBEDDED_BARE_REPO` fsck check to "info". I'm > not clear if there is another reason why this patch would have a > significant number of false positives (i.e., if the detection mechanism > is over-zealous). > > But if not, and this does detect only legitimate embedded bare > repositories, we should use it as a reminder to consider how many > use-cases and workflows will be affected by whatever approach we take > here, if any. > > > * I'll experiment with (5), and if it seems promising, I'll propose this as an > > opt-in feature, with the intent of making it opt-out in the future. We'll > > opt-into this at Google to help figure out if this is a good default or not. > > > > * If that direction turns out not to be promising, I'll pursue (1), since that > > is the only option that can be configured per-repo, which should hopefully > > minimize the fallout. > > Here's an alternative approach, which I haven't seen discussed thus far: > > When a bare repository is embedded in another repository, avoid reading > its config by default. This means that most Git commands will still > work, but without the possibility of running any "executable" portions > of the config. To opt-out (i.e., to allow legitimate use-cases to start > reading embedded bare repository config again), the embedding repository > would have to set a multi-valued `safe.embeddedRepo` configuration. This > would specify a list of paths relative to the embedding repository's > root of known-safe bare repositories. > > I think there are a couple of desirable attributes of this approach: > > - It minimally disrupts bare repositories, restricting the change to > only embedded repositories. > - It allows most Git commands to continue working as expected (modulo > reading the config), hopefully making the population of users whose > workflows will suddenly break pretty small. > - It requires the user to explicitly opt-in to the unsafe behavior, > because an attacker could not influence the embedding repository's > `safe.embeddedRepo` config. > > If we were going to do something about this, I would strongly advocate > for something that resembles the above. Or at the very least, some > solution that captures the attributes I outlined there. Nice - a more strict spin on proposal 3 from above, if I understand it right. Rather than allowing any and all bare repos, avoid someone sneaking in a malicious one next to legitimate ones by using an allowlist. Seems reasonable to me. - Emily ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Bare repositories in the working tree are a security risk 2022-04-21 18:25 ` Emily Shaffer @ 2022-04-21 18:29 ` Emily Shaffer 2022-04-21 18:47 ` Junio C Hamano 2022-04-21 19:09 ` Taylor Blau 1 sibling, 1 reply; 50+ messages in thread From: Emily Shaffer @ 2022-04-21 18:29 UTC (permalink / raw) To: Taylor Blau Cc: Glen Choo, Git List, justin, Johannes Schindelin, Ævar Arnfjörð Bjarmason, Derrick Stolee, Junio C Hamano, brian m. carlson, Randall S. Becker On Thu, Apr 21, 2022 at 11:25 AM Emily Shaffer <emilyshaffer@google.com> wrote: > > On Fri, Apr 15, 2022 at 6:28 PM Taylor Blau <me@ttaylorr.com> wrote: > > > > On Fri, Apr 15, 2022 at 05:41:59PM -0700, Glen Choo wrote: > > > * We want additional protection on the client besides `git fsck`; there are > > > a few ways to do this: > > > > I'm a little late to chime into the thread, but I appreciate you > > summarizing some of the approaches discussed so far. Let me add my > > thoughts on each of these in order: > > > > > 1. Prevent checking out an embedded bare repo. > > > 2. Detect if the bare repo is embedded and refuse to work with it. > > > 3. Detect if the bare repo is embedded and do not read its config/hooks, but > > > everything else still 'works'. > > > 4. Don't detect bare repos. > > > 5. Only detect bare repos that are named `.git` [1]. > > > > > > (I've responded with my thoughts on each of these approaches in-thread). > > > > 1. Likely disrupts too many legitimate workflows for us to adopt > > without designing some way to declare an embedded bare repository > > is "safe". > > 2. Ditto. > > 3. This seems the most promising approach so far. Similar to (1), I > > would also want to make sure we provide an easy way to declare a > > bare repository as "safe" in order to avoid permanently disrupting > > valid workflows that have accumulated over the past >15 years. > > I'd like to think a little more about how we want to determine that a > bare repo isn't embedded - wantonly scanning up the filesystem for any > gitdir above the current one is really expensive. When I tried that > approach for the purposes of including some shared config between > superproject and submodules, it slowed down the Git test suite by > something like 3-5x. So, I suppose that "we think this is bare" means > refs/ and objects/ present in a directory that isn't named '.git/', > and then we hunt anywhere above us for another .git/? Of course being > careful not to accept another bare repo as the "parent" gitdir. > > Does it work for submodules? I suppose it works for non-bare > submodules - and for bare submodules, e.g. > 'submodule-having-project.git/modules/some-submodule/' we can rely on > the user to turn that flag on if they want their submodule's config > and hooks to be noticed from the gitdir. (From > 'worktree-for-submodule-having-project/some-submodule' there is a > '.git' pointer, so I'd expect things to work normally that way, > right?) > > As long as we are careful to avoid searching the filesystem in *every* > case, this seems like a pretty reasonable approach to me. > > > 4. Seems like this approach is too heavy-handed. > > 5. Ditto. > > > > > With that in mind, here's what I propose we do next: > > > > > > * Merge the `git fsck` patch [2] if we think that it is useful despite the > > > potentially huge number of false positives. That patch needs some fixing; I'll > > > make the changes when I'm back. > > > > If there are lots of false positives, we should consider downgrading the > > severity of the proposed `EMBEDDED_BARE_REPO` fsck check to "info". I'm > > not clear if there is another reason why this patch would have a > > significant number of false positives (i.e., if the detection mechanism > > is over-zealous). > > > > But if not, and this does detect only legitimate embedded bare > > repositories, we should use it as a reminder to consider how many > > use-cases and workflows will be affected by whatever approach we take > > here, if any. > > > > > * I'll experiment with (5), and if it seems promising, I'll propose this as an > > > opt-in feature, with the intent of making it opt-out in the future. We'll > > > opt-into this at Google to help figure out if this is a good default or not. > > > > > > * If that direction turns out not to be promising, I'll pursue (1), since that > > > is the only option that can be configured per-repo, which should hopefully > > > minimize the fallout. > > > > Here's an alternative approach, which I haven't seen discussed thus far: > > > > When a bare repository is embedded in another repository, avoid reading > > its config by default. This means that most Git commands will still > > work, but without the possibility of running any "executable" portions > > of the config. To opt-out (i.e., to allow legitimate use-cases to start > > reading embedded bare repository config again), the embedding repository > > would have to set a multi-valued `safe.embeddedRepo` configuration. This > > would specify a list of paths relative to the embedding repository's > > root of known-safe bare repositories. > > > > I think there are a couple of desirable attributes of this approach: > > > > - It minimally disrupts bare repositories, restricting the change to > > only embedded repositories. > > - It allows most Git commands to continue working as expected (modulo > > reading the config), hopefully making the population of users whose > > workflows will suddenly break pretty small. > > - It requires the user to explicitly opt-in to the unsafe behavior, > > because an attacker could not influence the embedding repository's > > `safe.embeddedRepo` config. > > > > If we were going to do something about this, I would strongly advocate > > for something that resembles the above. Or at the very least, some > > solution that captures the attributes I outlined there. > > Nice - a more strict spin on proposal 3 from above, if I understand it > right. Rather than allowing any and all bare repos, avoid someone > sneaking in a malicious one next to legitimate ones by using an > allowlist. Seems reasonable to me. Ah, another thing I forgot to mention. There has been a little discussion in the past about isolating "safe" parts of config (and gitdir) from "unsafe" parts, e.g. "which configs and embedded scripts are executables", to help better protect from zipfile-type attacks, which are very similar to this kind of attack. I wonder if it makes sense to consider that sort of thing as a needswork for this bugfix? e.g. "/* NEEDSWORK: Only ignore unsafe configs and hooks instead of ignoring the entire embedded config and hooks in the future */"? > > - Emily ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Bare repositories in the working tree are a security risk 2022-04-21 18:29 ` Emily Shaffer @ 2022-04-21 18:47 ` Junio C Hamano 2022-04-21 18:54 ` Taylor Blau 0 siblings, 1 reply; 50+ messages in thread From: Junio C Hamano @ 2022-04-21 18:47 UTC (permalink / raw) To: Emily Shaffer Cc: Taylor Blau, Glen Choo, Git List, justin, Johannes Schindelin, Ævar Arnfjörð Bjarmason, Derrick Stolee, brian m. carlson, Randall S. Becker Emily Shaffer <emilyshaffer@google.com> writes: >> Nice - a more strict spin on proposal 3 from above, if I understand it >> right. Rather than allowing any and all bare repos, avoid someone >> sneaking in a malicious one next to legitimate ones by using an >> allowlist. Seems reasonable to me. > > Ah, another thing I forgot to mention. There has been a little > discussion in the past about isolating "safe" parts of config (and > gitdir) from "unsafe" parts, e.g. "which configs and embedded scripts > are executables", to help better protect from zipfile-type attacks, > which are very similar to this kind of attack. I wonder if it makes > sense to consider that sort of thing as a needswork for this bugfix? > e.g. "/* NEEDSWORK: Only ignore unsafe configs and hooks instead of > ignoring the entire embedded config and hooks in the future */"? There have been such discussions in the past and they all went nowhere because such safe-listing fundamentally does not work. What you consider "safe" today may turn out to be "unsafe" and in a later version of Git will stop honoring it, and for those who depended on it being listed as "safe", such a security fix will be a regression. Disabling the whole thing if the file can be tainted is the only sensible way forward without promising users that they will be hurt by such changes/regressions in the future, I would think. So, no, I do not think such a NEEDSWORK comment is welcome. Thanks. ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Bare repositories in the working tree are a security risk 2022-04-21 18:47 ` Junio C Hamano @ 2022-04-21 18:54 ` Taylor Blau 0 siblings, 0 replies; 50+ messages in thread From: Taylor Blau @ 2022-04-21 18:54 UTC (permalink / raw) To: Junio C Hamano Cc: Emily Shaffer, Taylor Blau, Glen Choo, Git List, justin, Johannes Schindelin, Ævar Arnfjörð Bjarmason, Derrick Stolee, brian m. carlson, Randall S. Becker On Thu, Apr 21, 2022 at 11:47:34AM -0700, Junio C Hamano wrote: > Emily Shaffer <emilyshaffer@google.com> writes: > > > Ah, another thing I forgot to mention. There has been a little > > discussion in the past about isolating "safe" parts of config (and > > gitdir) from "unsafe" parts, e.g. "which configs and embedded scripts > > are executables", to help better protect from zipfile-type attacks, > > which are very similar to this kind of attack. I wonder if it makes > > sense to consider that sort of thing as a needswork for this bugfix? > > e.g. "/* NEEDSWORK: Only ignore unsafe configs and hooks instead of > > ignoring the entire embedded config and hooks in the future */"? > > There have been such discussions in the past and they all went > nowhere because such safe-listing fundamentally does not work. What > you consider "safe" today may turn out to be "unsafe" and in a later > version of Git will stop honoring it, and for those who depended on > it being listed as "safe", such a security fix will be a regression. > > Disabling the whole thing if the file can be tainted is the only > sensible way forward without promising users that they will be hurt > by such changes/regressions in the future, I would think. I assume when Junio says "safe-listing" he is talking about your "which configs and embedded scripts are executables". I have tossed that idea around in my own head for a little while, and in addition to the points that Junio refers to, I think that this could be confusing for users. I worry about forcing the user to consider which parts of their config and hooks are being read/ignored, and then re-interpret their meaning in light of that. That seems like it would be an unnecessarily tricky position to put users in, and I think we could get around it by either reading the config/hooks, or ignoring them entirely. (It also seems error-prone to me: just trying to list which parts of our config could lead to command-execution is challenging for me at least. In addition to the ongoing maintenance cost, a clever attacker would almost certainly be able to spot some obscure piece of config that we didn't consider and then use it in their attack.) Thanks, Taylor ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Bare repositories in the working tree are a security risk 2022-04-21 18:25 ` Emily Shaffer 2022-04-21 18:29 ` Emily Shaffer @ 2022-04-21 19:09 ` Taylor Blau 2022-04-21 21:01 ` Emily Shaffer 1 sibling, 1 reply; 50+ messages in thread From: Taylor Blau @ 2022-04-21 19:09 UTC (permalink / raw) To: Emily Shaffer Cc: Taylor Blau, Glen Choo, Git List, justin, Johannes Schindelin, Ævar Arnfjörð Bjarmason, Derrick Stolee, Junio C Hamano, brian m. carlson, Randall S. Becker On Thu, Apr 21, 2022 at 11:25:39AM -0700, Emily Shaffer wrote: > On Fri, Apr 15, 2022 at 6:28 PM Taylor Blau <me@ttaylorr.com> wrote: > > > > On Fri, Apr 15, 2022 at 05:41:59PM -0700, Glen Choo wrote: > > > * We want additional protection on the client besides `git fsck`; there are > > > a few ways to do this: > > > > I'm a little late to chime into the thread, but I appreciate you > > summarizing some of the approaches discussed so far. Let me add my > > thoughts on each of these in order: > > > > > 1. Prevent checking out an embedded bare repo. > > > 2. Detect if the bare repo is embedded and refuse to work with it. > > > 3. Detect if the bare repo is embedded and do not read its config/hooks, but > > > everything else still 'works'. > > > 4. Don't detect bare repos. > > > 5. Only detect bare repos that are named `.git` [1]. > > > > > > (I've responded with my thoughts on each of these approaches in-thread). > > > > 1. Likely disrupts too many legitimate workflows for us to adopt > > without designing some way to declare an embedded bare repository > > is "safe". > > 2. Ditto. > > 3. This seems the most promising approach so far. Similar to (1), I > > would also want to make sure we provide an easy way to declare a > > bare repository as "safe" in order to avoid permanently disrupting > > valid workflows that have accumulated over the past >15 years. > > I'd like to think a little more about how we want to determine that a > bare repo isn't embedded - wantonly scanning up the filesystem for any > gitdir above the current one is really expensive. When I tried that > approach for the purposes of including some shared config between > superproject and submodules, it slowed down the Git test suite by > something like 3-5x. So, I suppose that "we think this is bare" means > refs/ and objects/ present in a directory that isn't named '.git/', > and then we hunt anywhere above us for another .git/? Of course being > careful not to accept another bare repo as the "parent" gitdir. Definitely worth considering. Fundamentally I think if you're going to allow checking out embedded bare repositories (which I strongly think that we should), any search for "is this repository embedded in another one?" needs to walk along its entire path up to the filesystem root. I'd think that we could offer some tools, perhaps "safe.boundary", specifying a list of directories to halt our traversal at. IOW, if you have a bunch of bare repositories underneath /data/repositories, and know that they're all safe, you could set your boundary there to signal to Git "if my path is a subdirectory of anything in safe.boundary, then I don't care whether the repo is embedded or not". > Does it work for submodules? I suppose it works for non-bare > submodules - and for bare submodules, e.g. > 'submodule-having-project.git/modules/some-submodule/' we can rely on > the user to turn that flag on if they want their submodule's config > and hooks to be noticed from the gitdir. (From > 'worktree-for-submodule-having-project/some-submodule' there is a > '.git' pointer, so I'd expect things to work normally that way, > right?) I don't enough about submodules to comment here, sorry. > > Here's an alternative approach, which I haven't seen discussed thus > > far: [...] > > Nice - a more strict spin on proposal 3 from above, if I understand it > right. Rather than allowing any and all bare repos, avoid someone > sneaking in a malicious one next to legitimate ones by using an > allowlist. Seems reasonable to me. We'd probably want to allow saying "all embedded bare repositories are safe to read config/hooks from", too. I hadn't considered this approach as a way to read some embedded repos and not others; I suspect the overwhelmingly common use-case would be: `git config --local safe.embeddedRepo '*'`. Thanks, Taylor ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Bare repositories in the working tree are a security risk 2022-04-21 19:09 ` Taylor Blau @ 2022-04-21 21:01 ` Emily Shaffer 2022-04-21 21:22 ` Taylor Blau 0 siblings, 1 reply; 50+ messages in thread From: Emily Shaffer @ 2022-04-21 21:01 UTC (permalink / raw) To: Taylor Blau Cc: Glen Choo, Git List, justin, Johannes Schindelin, Ævar Arnfjörð Bjarmason, Derrick Stolee, Junio C Hamano, brian m. carlson, Randall S. Becker, Martin von Zweigbergk On Thu, Apr 21, 2022 at 12:09 PM Taylor Blau <me@ttaylorr.com> wrote: > On Thu, Apr 21, 2022 at 11:25:39AM -0700, Emily Shaffer wrote: > > On Fri, Apr 15, 2022 at 6:28 PM Taylor Blau <me@ttaylorr.com> wrote: > > > 3. This seems the most promising approach so far. Similar to (1), I > > > would also want to make sure we provide an easy way to declare a > > > bare repository as "safe" in order to avoid permanently disrupting > > > valid workflows that have accumulated over the past >15 years. > > > > I'd like to think a little more about how we want to determine that a > > bare repo isn't embedded - wantonly scanning up the filesystem for any > > gitdir above the current one is really expensive. When I tried that > > approach for the purposes of including some shared config between > > superproject and submodules, it slowed down the Git test suite by > > something like 3-5x. So, I suppose that "we think this is bare" means > > refs/ and objects/ present in a directory that isn't named '.git/', > > and then we hunt anywhere above us for another .git/? Of course being > > careful not to accept another bare repo as the "parent" gitdir. > > Definitely worth considering. Fundamentally I think if you're going to > allow checking out embedded bare repositories (which I strongly think > that we should), any search for "is this repository embedded in another > one?" needs to walk along its entire path up to the filesystem root. > > I'd think that we could offer some tools, perhaps "safe.boundary", > specifying a list of directories to halt our traversal at. IOW, if you > have a bunch of bare repositories underneath /data/repositories, and > know that they're all safe, you could set your boundary there to signal > to Git "if my path is a subdirectory of anything in safe.boundary, then > I don't care whether the repo is embedded or not". Ah, I'm sorry to say that I was more on board before I read this mail than I am after :) I don't think there is much reason to continue searching above the first found '.git' because we disallow '.git' from being committed or checked out, right? So this extra filesystem hunting and ceiling math seems unnecessary to me. I know I am slightly allergic to searching the filesystem for a parent repo to begin with, so I'm sure I've got some bias here ;) > > > Here's an alternative approach, which I haven't seen discussed thus > > > far: [...] > > > > Nice - a more strict spin on proposal 3 from above, if I understand it > > right. Rather than allowing any and all bare repos, avoid someone > > sneaking in a malicious one next to legitimate ones by using an > > allowlist. Seems reasonable to me. > > We'd probably want to allow saying "all embedded bare repositories are > safe to read config/hooks from", too. I hadn't considered this approach > as a way to read some embedded repos and not others; I suspect the > overwhelmingly common use-case would be: `git config --local > safe.embeddedRepo '*'`. Ah, I dislike this option for the exact reason I mentioned - avoiding a malicious repo being snuck in next to legitimate repos. I'd prefer to rely on exact matching only - but as the config needs to be set by every contributor every time the set of bare repos changes, that sounds impossible to manage for a project which may be constantly adding and removing these repos. That said.... I'm biased again, but if you want to be certain of the state of another repository in order to run tests in it... why not use a submodule for that repository? Not a helpful comment for those already using embedded bare repos, though ;) Thanks for the food for thought. - Emily ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Bare repositories in the working tree are a security risk 2022-04-21 21:01 ` Emily Shaffer @ 2022-04-21 21:22 ` Taylor Blau 0 siblings, 0 replies; 50+ messages in thread From: Taylor Blau @ 2022-04-21 21:22 UTC (permalink / raw) To: Emily Shaffer Cc: Taylor Blau, Glen Choo, Git List, justin, Johannes Schindelin, Ævar Arnfjörð Bjarmason, Derrick Stolee, Junio C Hamano, brian m. carlson, Randall S. Becker, Martin von Zweigbergk On Thu, Apr 21, 2022 at 02:01:44PM -0700, Emily Shaffer wrote: > I don't think there is much reason to continue searching above the > first found '.git' because we disallow '.git' from being committed or > checked out, right? So this extra filesystem hunting and ceiling math > seems unnecessary to me. I know I am slightly allergic to searching > the filesystem for a parent repo to begin with, so I'm sure I've got > some bias here ;) Right; once we find a non-bare embedding repository we should absolutely stop our search. I was suggesting we could stop our search earlier for environments that use non-embedded bare repositories with a fairly deep directory structure. E.g., if you store lots of bare repositories in /data/repositories and you know that none of them are embedded, we could quickly determine whether or not the cwd is a descendent of /data/repositories and avoid the search entirely if so. > > We'd probably want to allow saying "all embedded bare repositories are > > safe to read config/hooks from", too. I hadn't considered this approach > > as a way to read some embedded repos and not others; I suspect the > > overwhelmingly common use-case would be: `git config --local > > safe.embeddedRepo '*'`. > > Ah, I dislike this option for the exact reason I mentioned - avoiding > a malicious repo being snuck in next to legitimate repos. I'd prefer > to rely on exact matching only - but as the config needs to be set by > every contributor every time the set of bare repos changes, that > sounds impossible to manage for a project which may be constantly > adding and removing these repos. I share your distaste for this sort of thing, but I think we have to recognize that there are likely to be many repos that embed dozens or more bare repos inside of them, and asking them to opt-in each one individually seems like a fairly large request for Git to make of them. > That said.... I'm biased again, but if you want to be certain of the > state of another repository in order to run tests in it... why not use > a submodule for that repository? Not a helpful comment for those > already using embedded bare repos, though ;) Exactly, re: your last sentence. > Thanks for the food for thought. Ditto! Thanks, Taylor ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Bare repositories in the working tree are a security risk 2022-04-16 1:28 ` Taylor Blau 2022-04-21 18:25 ` Emily Shaffer @ 2022-04-29 23:57 ` Glen Choo 2022-04-30 1:14 ` Taylor Blau 2022-05-02 14:05 ` Philip Oakley 1 sibling, 2 replies; 50+ messages in thread From: Glen Choo @ 2022-04-29 23:57 UTC (permalink / raw) To: Taylor Blau Cc: git, Emily Shaffer, justin, Taylor Blau, Johannes Schindelin, Ævar Arnfjörð Bjarmason, Derrick Stolee, Junio C Hamano, brian m. carlson, rsbecker Thanks so much for the response - it's really helpful, and there's a lot of food for thought here. (Sorry that I didn't get back to you sooner, there really is a lot of great stuff here to think about :)) Taylor Blau <me@ttaylorr.com> writes: > On Fri, Apr 15, 2022 at 05:41:59PM -0700, Glen Choo wrote: >> * There are use cases for embedded bare repos that don't have great alternatives >> (e.g. libgit2 uses bare repos in its tests). Even if this workflow is frowned >> upon (I personally don't think we should support it), I don't think we're >> ready to categorically declare that Git should ban embedded bare repos >> altogether (e.g. the way we ban .GiT). > > I think there are a handful of legitimate reasons that we might want to > continue supporting this; for projects like libgit2 and Git LFS, it's > useful to have repositories in a known state to execute tests in. Having > bare Git repositories contained in some "test fixture" directory is a > really easy way to do just that. If I were designing Git from scratch, I would probably block embedded bare repos from being committed altogether - if an embedded bare repo doesn't behave particularly differently from `.git` (which we pretty much agree we should not support), then this is just an inherently dangerous way to work. But yes, we have historically allowed embedded bare repos, and I don't think we should stop supporting them altogether. For instance, I don't see a good alternative for the test fixture use case: - Submodules aren't a good fit because they only allow you to include the contents of a submodule's tree, whereas in a test fixture, you really do want the gitdir internals to be source controlled so that you get nice predictable results. - Users could store the repos in some other form e.g. CDN, tarball. It's fine when running from a test script, but it's pretty awful to author/review any changes. - Perhaps the users could munge the bare repo at commit time e.g. instead of storing (refs/, objects/, HEAD), they could store (test_refs/, test_objects/, test_HEAD), which would later get turned into the bare repo in the test script. It's a little silly, but not unreasonable for a test script, I think. So I'll leave behind this idea of "blocking embedding bare repos" for now; I think there are more promising proposals in this thread. >> 1. Prevent checking out an embedded bare repo. >> 2. Detect if the bare repo is embedded and refuse to work with it. >> 3. Detect if the bare repo is embedded and do not read its config/hooks, but >> everything else still 'works'. >> 4. Don't detect bare repos. >> 5. Only detect bare repos that are named `.git` [1]. >> >> (I've responded with my thoughts on each of these approaches in-thread). > > 1. Likely disrupts too many legitimate workflows for us to adopt > without designing some way to declare an embedded bare repository > is "safe". > 2. Ditto. > 3. This seems the most promising approach so far. Similar to (1), I > would also want to make sure we provide an easy way to declare a > bare repository as "safe" in order to avoid permanently disrupting > valid workflows that have accumulated over the past >15 years. > 4. Seems like this approach is too heavy-handed. > 5. Ditto. If I understand you correctly, it seems like we can ship any of the options from 1.-3., provided there is an easy way to opt-in known, "safe" bare repos. After some more reflection, I tend to agree that 4.-5. are too heavy-handed for the general population because they would break nearly all bare repo users. The performance cost of detecting an embedded bare repo isn't ideal either, but that can be easily fixed via GIT_CEILING_DIRECTORIES or an allow-list of "safe" repos, and it's a lot gentler than just ignoring bare repos altogether. I suspect that there is a subset of the user population who would love to disable bare repo detection because they a) never expect to see bare repos in a given environment and b) want to be defensive about the possibility of unexpected bare repos. Since this really heavy-handed, I don't think this fixes the embedded bare repo problem for everyone, but that subset of users will be very well-served :) I think that there must be some coherent framework that encompasses this idea + the allow-list you proposed, but I haven't come up with one yet. >> With that in mind, here's what I propose we do next: >> >> * Merge the `git fsck` patch [2] if we think that it is useful despite the >> potentially huge number of false positives. That patch needs some fixing; I'll >> make the changes when I'm back. > > If there are lots of false positives, we should consider downgrading the > severity of the proposed `EMBEDDED_BARE_REPO` fsck check to "info". I'm > not clear if there is another reason why this patch would have a > significant number of false positives (i.e., if the detection mechanism > is over-zealous). > > But if not, and this does detect only legitimate embedded bare > repositories, we should use it as a reminder to consider how many > use-cases and workflows will be affected by whatever approach we take > here, if any. Yes, that's good to keep in mind. After mulling about it some more, I don't have a clear direction on the fsck patch to be honest, I'll leave this alone for now and I'll return to it if I get a clearer picture. >> * I'll experiment with (5), and if it seems promising, I'll propose this as an >> opt-in feature, with the intent of making it opt-out in the future. We'll >> opt-into this at Google to help figure out if this is a good default or not. >> >> * If that direction turns out not to be promising, I'll pursue (1), since that >> is the only option that can be configured per-repo, which should hopefully >> minimize the fallout. > > Here's an alternative approach, which I haven't seen discussed thus far: > > When a bare repository is embedded in another repository, avoid reading > its config by default. This means that most Git commands will still > work, but without the possibility of running any "executable" portions > of the config. To opt-out (i.e., to allow legitimate use-cases to start > reading embedded bare repository config again), the embedding repository > would have to set a multi-valued `safe.embeddedRepo` configuration. This > would specify a list of paths relative to the embedding repository's > root of known-safe bare repositories. > > I think there are a couple of desirable attributes of this approach: > > - It minimally disrupts bare repositories, restricting the change to > only embedded repositories. > - It allows most Git commands to continue working as expected (modulo > reading the config), hopefully making the population of users whose > workflows will suddenly break pretty small. > - It requires the user to explicitly opt-in to the unsafe behavior, > because an attacker could not influence the embedding repository's > `safe.embeddedRepo` config. > > If we were going to do something about this, I would strongly advocate > for something that resembles the above. Or at the very least, some > solution that captures the attributes I outlined there. I really like the `safe.embeddedRepo` idea, though I'm not convinced about "respect only the safe parts of the embedded repo". I'll address the latter first. I think brian m. carlson was coming from a similar direction, and the "respect only the safe parts of the embedded repo" part of the proposal sounds similar to his [1]. Both seem to be motivated by your second point - protect as many workflows as we can. It's a good guiding principle, and I think it's a good place to start from. That said, I'm not sure that this proposal serves these users that well: - Not reading the config might break the embedded bare repos in ways we don't expect (e.g. not reading core.repositoryformatversion). - Users who use embedded bare repos as test fixtures presumably want their tests to mimic real-world scenarios as closely as possible; running in this half-state of "use some parts of the repo but not others" doesn't seem a good fit for that use case. - This complicates the rules significantly for the user, who now has to figure out which parts of the bare repo are respected and which are not. - I'm also of the opinion that changing the rules like this actually does affect workflows, even if it doesn't break libgit2's tests. - A diligent user still has to convince themselves that the tests are passing for the right reasons, possibly adapting to the new rules (e.g. by selectively enabling `safe.embeddedRepo` on the right test fixtures). - A less diligent user might not even realize the change has happened and end up with difficult to debug results somewhere down the line. I'm also not keen on it for other reasons: - This expands the attack surface significantly, and I'm pessimistic that we can maintain a list of the 'safe' parts of a bare repo. A lot of attention has been focused on config/hooks, but I wouldn't be surprised if a creative attacker finds some other avenue that we missed (maybe a buffer overflow attack on a malicious index file?). - I expect that this is also going to be really complex to implement and maintain; instead of looking in a single gitdir for everything, we now look in two gitdirs. What is promising is an allow-listing scheme like `safe.embeddedRepo` that can be enabled on a per-repository basis. . Using an allowlist to selectively choose *entire* embedded bare repos preserves the first and third attributes you described, and it keeps things simple(-ish) for us and users. Breaking libgit2 and Git LFS this way is pretty harsh, but it will give us the confidence that we have communicated the behavior change (and fix!) to the relevant users, rather than having them find out the wrong way. Some extra thoughts (in case they're helpful): - It's pretty important to get the format of `safe.embeddedRepo` 'correct', but what 'correct' is is up for debate. For example, should we allow '*'? (I think so, but I know some don't ;)). - There might be some unifying principles behind "allowlisting certain embedded bare repos" and "disabling/enabling bare repo detection" that can guide our fix. - Perhaps we could allow different 'levels' of bare repo protection, like 'allow all bare repos', 'allow only non-embedded bare repos', 'allow no bare repos', 'allow embedded bare repos but not their configs". - If we do want to discourage embedded bare repos (and flip the default), this kind of gradual roll-out might give projects a way to incrementally migrate. [1] https://lore.kernel.org/git/Yk9Wcr74gvhtyOi7@camp.crustytoothpaste.net > > I would be happy to work together with you (or anybody!) on developing > patches in that direction, so let me know if you are interested in > coordinating our efforts. Thanks! Once again, I really appreciate the response. I think it's moved the discussion forward in a meaningful way :) I think the easiest next step is for me (or whoever :)) to propose patches and to hash out the details there. ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Bare repositories in the working tree are a security risk 2022-04-29 23:57 ` Glen Choo @ 2022-04-30 1:14 ` Taylor Blau 2022-05-02 19:39 ` Glen Choo 2022-05-02 14:05 ` Philip Oakley 1 sibling, 1 reply; 50+ messages in thread From: Taylor Blau @ 2022-04-30 1:14 UTC (permalink / raw) To: Glen Choo Cc: git, Emily Shaffer, justin, Johannes Schindelin, Ævar Arnfjörð Bjarmason, Derrick Stolee, Junio C Hamano, brian m. carlson, rsbecker On Fri, Apr 29, 2022 at 04:57:51PM -0700, Glen Choo wrote: > Thanks so much for the response - it's really helpful, and there's a lot of food > for thought here. > > (Sorry that I didn't get back to you sooner, there really is a lot of > great stuff here to think about :)) No problem, I'm glad that you found it helpful (and likewise!). > So I'll leave behind this idea of "blocking embedding bare repos" for > now; I think there are more promising proposals in this thread. I am in strong agreement with you here, but I would add an additional point which is that even if we encouraged submodules over embedded bare repos, or suggested storing bare repos on CDNs or what have you, there's significant momentum in the "do nothing" category which we have to take into account, too. So even if we made it as easy as possible to convert to, for e.g., using submodules, getting the millions (?) of repositories with embedded bare repositories in them that have accumulated over the years to actually change seems virtually impossible to me. > >> 1. Prevent checking out an embedded bare repo. > >> 2. Detect if the bare repo is embedded and refuse to work with it. > >> 3. Detect if the bare repo is embedded and do not read its config/hooks, but > >> everything else still 'works'. > >> 4. Don't detect bare repos. > >> 5. Only detect bare repos that are named `.git` [1]. > >> > >> (I've responded with my thoughts on each of these approaches in-thread). > > > > 1. Likely disrupts too many legitimate workflows for us to adopt > > without designing some way to declare an embedded bare repository > > is "safe". > > 2. Ditto. > > 3. This seems the most promising approach so far. Similar to (1), I > > would also want to make sure we provide an easy way to declare a > > bare repository as "safe" in order to avoid permanently disrupting > > valid workflows that have accumulated over the past >15 years. > > 4. Seems like this approach is too heavy-handed. > > 5. Ditto. > > If I understand you correctly, it seems like we can ship any of the options from > 1.-3., provided there is an easy way to opt-in known, "safe" bare repos. After thinking about it some more, I think that we should probably try to ship (3) of the ones that we agree are viable, but more on that below... > Yes, that's good to keep in mind. After mulling about it some more, I don't have > a clear direction on the fsck patch to be honest, I'll leave this alone for now > and I'll return to it if I get a clearer picture. Sounds good. I'm happy to have ideas bounced off of me in the meantime ;). > I really like the `safe.embeddedRepo` idea, though I'm not convinced about > "respect only the safe parts of the embedded repo". I'll address the latter > first. To be clear, I am advocating for "only the safe parts" insofar as "read repository extensions, core.repositoryFormatVersion and literally nothing else". I'm definitely not suggesting we go and enumerate every configurable value, determine whether it's safe or not, and then read only the safe ones. That approach seems doomed to fail, since no matter how clever we are, there will always be some slightly-cleverer attacker who can find a vector that we missed. > I think brian m. carlson was coming from a similar direction, and the "respect > only the safe parts of the embedded repo" part of the proposal sounds similar to > his [1]. Both seem to be motivated by your second point - protect as many > workflows as we can. It's a good guiding principle, and I think it's a good > place to start from. That said, I'm not sure that this proposal serves these > users that well: > > - Not reading the config might break the embedded bare repos in ways we don't > expect (e.g. not reading core.repositoryformatversion). I'm hoping that if we include this among the list of essential- and known-safe config values that we'll mitigate this reasonably well. > - Users who use embedded bare repos as test fixtures presumably want their tests > to mimic real-world scenarios as closely as possible; running in this > half-state of "use some parts of the repo but not others" doesn't seem a good > fit for that use case. It's hard to estimate how many of these tests will get broken. But I think the important trade-off to consider between "abort all Git operations in embedded bare repositories" and "warn, and avoid reading config/hooks" is how disruptive the change will be. The latter seems far less disruptive to me, so I would rather us favor that over an overly-conservative approach. > - This complicates the rules significantly for the user, who now has to figure > out which parts of the bare repo are respected and which are not. On this point I disagree, but I suspect we weren't on the same page about what "only the safe parts" meant when you wrote this. To be extra-extra clear, I don't think we should read some parts of config and not other, I mean we should read _only_ the above listed parts (the format version and extensions) and nothing else. > - I'm also of the opinion that changing the rules like this actually does affect > workflows, even if it doesn't break libgit2's tests. > - A diligent user still has to convince themselves that the tests are passing > for the right reasons, possibly adapting to the new rules (e.g. by > selectively enabling `safe.embeddedRepo` on the right test fixtures). > - A less diligent user might not even realize the change has happened and > end up with difficult to debug results somewhere down the line. I am sympathetic to what you're saying, but I (a) think there's still a tradeoff here that doesn't obviously point us in one direction or the other and (b) we should equally keep in mind other workflows besides just test fixtures. Does that change our thinking at all? I'm not sure. > I'm also not keen on it for other reasons: > > - This expands the attack surface significantly, and I'm pessimistic that we > can maintain a list of the 'safe' parts of a bare repo. A lot of attention has > been focused on config/hooks, but I wouldn't be surprised if a creative > attacker finds some other avenue that we missed (maybe a buffer overflow > attack on a malicious index file?). I disagree, though again I suspect we were thinking of different thing when saying "only read safe parts of the config". Still though, I would argue that it limits the attack surface at the right level, which is to say any vector that we _did_ miss is something that we should just fix (e.g., preventing a buffer overflow) and not "oops, this config value does specify an executable". (We shouldn't have to deal with the index file, though, since a bare repository would not read the index, no?). > - I expect that this is also going to be really complex to implement and > maintain; instead of looking in a single gitdir for everything, we now look in > two gitdirs. I'd think that any approach we take that has different behavior for bare repositories depending on whether or not they are embedded has to do a similar check, so I don't think this adds significant complexity. Though not having written any code here yet, I'd take what I say with a huge grain of salt ;-). > What is promising is an allow-listing scheme like `safe.embeddedRepo` that can > be enabled on a per-repository basis. . Using an allowlist to selectively choose > *entire* embedded bare repos preserves the first and third attributes you > described, and it keeps things simple(-ish) for us and users. Breaking libgit2 > and Git LFS this way is pretty harsh, but it will give us the confidence that we > have communicated the behavior change (and fix!) to the relevant users, rather > than having them find out the wrong way. Hmm. I still am pretty convicted that this (avoid working with unknown embedded bare repositories) is too harsh of a change to make the default. Replace "Breaking libgit2 and Git LFS" with "breaking many hundreds of thousands of repositories" [1], and I think that we would need to come up with something more lightweight. > Some extra thoughts (in case they're helpful): > > - It's pretty important to get the format of `safe.embeddedRepo` 'correct', but > what 'correct' is is up for debate. For example, should we allow '*'? (I think > so, but I know some don't ;)). Stolee (cc'd) will have an interesting perspective here (at least as it relates to the 2.35.3 release), I think. > - There might be some unifying principles behind "allowlisting certain embedded > bare repos" and "disabling/enabling bare repo detection" that can guide our > fix. > - Perhaps we could allow different 'levels' of bare repo protection, like > 'allow all bare repos', 'allow only non-embedded bare repos', 'allow no bare > repos', 'allow embedded bare repos but not their configs". > > - If we do want to discourage embedded bare repos (and flip the default), this > kind of gradual roll-out might give projects a way to incrementally migrate. Ah! Are you suggesting a global configuration setting that controls the behavior of embedded bare repositories that _aren't_ listed in a repositories safe.embeddedRepo list? That sort of thing could work, though I'd argue strongly that any Git 2.x.y release should make the default behavior "avoid config/hooks" as opposed to "refuse to work". We could consider changing that default in Git 3.x.y, but I feel strongly that any Git 2.x.y release should cater as much to existing workflows as possible without significantly compromising on attack surface area. Thanks, Taylor [1]: I have no idea if that figure is right, but I suspect it's in the right order of magnitude. I could look into it further, though. ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Bare repositories in the working tree are a security risk 2022-04-30 1:14 ` Taylor Blau @ 2022-05-02 19:39 ` Glen Choo 0 siblings, 0 replies; 50+ messages in thread From: Glen Choo @ 2022-05-02 19:39 UTC (permalink / raw) To: Taylor Blau Cc: git, Emily Shaffer, justin, Johannes Schindelin, Ævar Arnfjörð Bjarmason, Derrick Stolee, Junio C Hamano, brian m. carlson, rsbecker Taylor Blau <me@ttaylorr.com> writes: >> >> 1. Prevent checking out an embedded bare repo. >> >> 2. Detect if the bare repo is embedded and refuse to work with it. >> >> 3. Detect if the bare repo is embedded and do not read its config/hooks, but >> >> everything else still 'works'. >> >> 4. Don't detect bare repos. >> >> 5. Only detect bare repos that are named `.git` [1]. >> >> >> >> (I've responded with my thoughts on each of these approaches in-thread). >> > >> > 1. Likely disrupts too many legitimate workflows for us to adopt >> > without designing some way to declare an embedded bare repository >> > is "safe". >> > 2. Ditto. >> > 3. This seems the most promising approach so far. Similar to (1), I >> > would also want to make sure we provide an easy way to declare a >> > bare repository as "safe" in order to avoid permanently disrupting >> > valid workflows that have accumulated over the past >15 years. >> > 4. Seems like this approach is too heavy-handed. >> > 5. Ditto. >> >> If I understand you correctly, it seems like we can ship any of the options from >> 1.-3., provided there is an easy way to opt-in known, "safe" bare repos. > > After thinking about it some more, I think that we should probably try > to ship (3) of the ones that we agree are viable, but more on that > below... > >> I really like the `safe.embeddedRepo` idea, though I'm not convinced about >> "respect only the safe parts of the embedded repo". I'll address the latter >> first. > > To be clear, I am advocating for "only the safe parts" insofar as "read > repository extensions, core.repositoryFormatVersion and literally > nothing else". I'm definitely not suggesting we go and enumerate every > configurable value, determine whether it's safe or not, and then read > only the safe ones. That approach seems doomed to fail, since no matter > how clever we are, there will always be some slightly-cleverer attacker > who can find a vector that we missed. [...] >> - This complicates the rules significantly for the user, who now has to figure >> out which parts of the bare repo are respected and which are not. > > On this point I disagree, but I suspect we weren't on the same page > about what "only the safe parts" meant when you wrote this. To be > extra-extra clear, I don't think we should read some parts of config and > not other, I mean we should read _only_ the above listed parts (the > format version and extensions) and nothing else. Ah, to clarify, I'm taking an even more pessimistic stance here, which is that I'd prefer to avoid trusting any parts of the repo (not the config) as "safe", e.g. I'm not even keen on something as innocuous-looking as "We trust _only_ HEAD, refs/ and objects/". Maybe this is overly-pessimistic though; your response (further down) suggests that this might not be as bad as I think. >> - I'm also of the opinion that changing the rules like this actually does affect >> workflows, even if it doesn't break libgit2's tests. >> - A diligent user still has to convince themselves that the tests are passing >> for the right reasons, possibly adapting to the new rules (e.g. by >> selectively enabling `safe.embeddedRepo` on the right test fixtures). >> - A less diligent user might not even realize the change has happened and >> end up with difficult to debug results somewhere down the line. > > I am sympathetic to what you're saying, but I (a) think there's still a > tradeoff here that doesn't obviously point us in one direction or the > other and (b) we should equally keep in mind other workflows besides > just test fixtures. Does that change our thinking at all? I'm not sure. Just musing here. I think (b) is particularly important to keep in mind. If a workflow is well-served by something else, I really don't mind breaking it at all. I suppose the test fixture use case comes up the most frequently because it is one that we are familiar with, and doesn't seem to have a good alternative. Maybe we're approaching the limits of what we can know without performing some studies of users in the wild. >> I'm also not keen on it for other reasons: >> >> - This expands the attack surface significantly, and I'm pessimistic that we >> can maintain a list of the 'safe' parts of a bare repo. A lot of attention has >> been focused on config/hooks, but I wouldn't be surprised if a creative >> attacker finds some other avenue that we missed (maybe a buffer overflow >> attack on a malicious index file?). > > I disagree, though again I suspect we were thinking of different thing > when saying "only read safe parts of the config". Still though, I would > argue that it limits the attack surface at the right level, which is to > say any vector that we _did_ miss is something that we should just fix > (e.g., preventing a buffer overflow) and not "oops, this config value > does specify an executable". > > (We shouldn't have to deal with the index file, though, since a bare > repository would not read the index, no?). I think this comment highlights the difference in opinion (in a good way!). IMO 'only reading parts of the repo and not others' will fix this particular arbitrary code execution problem, but it won't fix (and might even open us up to) other, future problems with embedded bare repos. From this comment, I get the impression (and I hope I'm representing you correctly) that you think these future problems are manageable, and don't justify breaking more workflows than we need to in order to fix this arbitrary code execution problem. I think it's a healthy disagreement though - there's enough common ground for us to start working our approaches separately (plus I don't think either of us will be able to convince the other by thinking up hypotheticals on a mailing list thread :p), and the discussion will probably get even better once we start comparing, reviewing, and internally testing the patches. (Ah, I'm not sure about the index file in particular, I was just speculating.) >> - I expect that this is also going to be really complex to implement and >> maintain; instead of looking in a single gitdir for everything, we now look in >> two gitdirs. > > I'd think that any approach we take that has different behavior > for bare repositories depending on whether or not they are embedded has > to do a similar check, so I don't think this adds significant > complexity. Though not having written any code here yet, I'd take what I > say with a huge grain of salt ;-). ;) >> - There might be some unifying principles behind "allowlisting certain embedded >> bare repos" and "disabling/enabling bare repo detection" that can guide our >> fix. >> - Perhaps we could allow different 'levels' of bare repo protection, like >> 'allow all bare repos', 'allow only non-embedded bare repos', 'allow no bare >> repos', 'allow embedded bare repos but not their configs". >> >> - If we do want to discourage embedded bare repos (and flip the default), this >> kind of gradual roll-out might give projects a way to incrementally migrate. > > Ah! Are you suggesting a global configuration setting that controls the > behavior of embedded bare repositories that _aren't_ listed in a > repositories safe.embeddedRepo list? Yes, and an even more restrictive mode that disables bare repo detection altogether. I suspect this is a safe, non-disruptive default for many user, but it obviously can't be the default today (or possibly ever? we'd need a lot of user testing for sure). ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Bare repositories in the working tree are a security risk 2022-04-29 23:57 ` Glen Choo 2022-04-30 1:14 ` Taylor Blau @ 2022-05-02 14:05 ` Philip Oakley 2022-05-02 18:50 ` Junio C Hamano 1 sibling, 1 reply; 50+ messages in thread From: Philip Oakley @ 2022-05-02 14:05 UTC (permalink / raw) To: Glen Choo, Taylor Blau Cc: git, Emily Shaffer, justin, Johannes Schindelin, Ævar Arnfjörð Bjarmason, Derrick Stolee, Junio C Hamano, brian m. carlson, rsbecker On 30/04/2022 00:57, Glen Choo wrote: > If I were designing Git from scratch, I would probably block embedded bare repos > from being committed altogether - if an embedded bare repo doesn't behave > particularly differently from `.git` (which we pretty much agree we should not > support), then this is just an inherently dangerous way to work. > > But yes, we have historically allowed embedded bare repos, and I don't think we > should stop supporting them altogether. For instance, I don't see a good > alternative for the test fixture use case: > > - Submodules aren't a good fit because they only allow you to include the > contents of a submodule's tree, whereas in a test fixture, you really do want > the gitdir internals to be source controlled so that you get nice predictable > results. > - Users could store the repos in some other form e.g. CDN, tarball. It's fine > when running from a test script, but it's pretty awful to author/review any > changes. Can I check that storing bundles (of other repos) within a repository is considered reasonably safe? I have been looking at how Git's documentation could carry with it small exemplar repositories that cover the commit hierarchies shown in the various man pages to allow users to see, explore and understand the man page examples. I'd settled on bundles as a reasonable compromise, with the exemplar repositories being generated via the test suite (obviously). With the recent focus on security issues, I thought it worth asking now. > - Perhaps the users could munge the bare repo at commit time e.g. instead of > storing (refs/, objects/, HEAD), they could store (test_refs/, test_objects/, > test_HEAD), which would later get turned into the bare repo in the test > script. It's a little silly, but not unreasonable for a test script, I think. -- Philip ^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: Bare repositories in the working tree are a security risk 2022-05-02 14:05 ` Philip Oakley @ 2022-05-02 18:50 ` Junio C Hamano 0 siblings, 0 replies; 50+ messages in thread From: Junio C Hamano @ 2022-05-02 18:50 UTC (permalink / raw) To: Philip Oakley Cc: Glen Choo, Taylor Blau, git, Emily Shaffer, justin, Johannes Schindelin, Ævar Arnfjörð Bjarmason, Derrick Stolee, brian m. carlson, rsbecker Philip Oakley <philipoakley@iee.email> writes: > Can I check that storing bundles (of other repos) within a repository is > considered reasonably safe? I think the threat model we are protecting against is that we shouldn't have to worry about "git clone" and then "cd" into the hierarchy. If you did "make" in there and their Makefile (or "runme.sh" shipped as part of the tree) is malicious, that is not something we can solve. So, I would say it is safe to have a bundle, as in its statinary state it does not actively do anything bad, even when you did "ls" in a directory that stores it. > I have been looking at how Git's documentation could carry with it small > exemplar repositories that cover the commit hierarchies shown in the > various man pages to allow users to see, explore and understand the man > page examples. I'd settled on bundles as a reasonable compromise, with > the exemplar repositories being generated via the test suite > (obviously). With the recent focus on security issues, I thought it > worth asking now. A bundle would be an OK vehicle. This depends on the size of the sample project, but another that may be more suitable may be to create the repository on the fly in the "test suite" you were planning to use to extract from the bundle. ^ permalink raw reply [flat|nested] 50+ messages in thread
end of thread, other threads:[~2022-05-02 19:39 UTC | newest] Thread overview: 50+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2022-04-06 22:43 Bare repositories in the working tree are a security risk Glen Choo 2022-04-06 23:22 ` [PATCH] fsck: detect bare repos in trees and warn Glen Choo 2022-04-07 12:42 ` Johannes Schindelin 2022-04-07 13:21 ` Derrick Stolee 2022-04-07 14:14 ` Ævar Arnfjörð Bjarmason 2022-04-14 20:02 ` Glen Choo 2022-04-15 12:46 ` Ævar Arnfjörð Bjarmason 2022-04-07 15:11 ` Junio C Hamano 2022-04-13 22:24 ` Glen Choo 2022-04-07 13:12 ` Ævar Arnfjörð Bjarmason 2022-04-07 15:20 ` Junio C Hamano 2022-04-07 18:38 ` Bare repositories in the working tree are a security risk John Cai 2022-04-07 21:24 ` brian m. carlson 2022-04-07 21:53 ` Justin Steven 2022-04-07 22:10 ` brian m. carlson 2022-04-07 22:40 ` rsbecker 2022-04-08 5:54 ` Junio C Hamano 2022-04-14 0:03 ` Junio C Hamano 2022-04-14 0:04 ` Glen Choo 2022-04-13 23:44 ` Glen Choo 2022-04-13 20:37 ` Glen Choo 2022-04-13 23:36 ` Junio C Hamano 2022-04-14 16:41 ` Glen Choo 2022-04-14 17:35 ` Junio C Hamano 2022-04-14 18:19 ` Junio C Hamano 2022-04-15 21:33 ` Glen Choo 2022-04-15 22:17 ` Junio C Hamano 2022-04-16 0:52 ` Taylor Blau 2022-04-15 22:43 ` Glen Choo 2022-04-15 20:13 ` Junio C Hamano 2022-04-15 23:45 ` Glen Choo 2022-04-15 23:59 ` Glen Choo 2022-04-16 1:00 ` Taylor Blau 2022-04-16 1:18 ` Junio C Hamano 2022-04-16 1:30 ` Taylor Blau 2022-04-16 0:34 ` Glen Choo 2022-04-16 0:41 ` Glen Choo 2022-04-16 1:28 ` Taylor Blau 2022-04-21 18:25 ` Emily Shaffer 2022-04-21 18:29 ` Emily Shaffer 2022-04-21 18:47 ` Junio C Hamano 2022-04-21 18:54 ` Taylor Blau 2022-04-21 19:09 ` Taylor Blau 2022-04-21 21:01 ` Emily Shaffer 2022-04-21 21:22 ` Taylor Blau 2022-04-29 23:57 ` Glen Choo 2022-04-30 1:14 ` Taylor Blau 2022-05-02 19:39 ` Glen Choo 2022-05-02 14:05 ` Philip Oakley 2022-05-02 18:50 ` Junio C Hamano
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.