* [PATCH] [RFC] setup.c: make bare repo discovery optional
@ 2022-05-06 18:30 Glen Choo via GitGitGadget
  2022-05-06 20:33 ` Junio C Hamano
                   ` (2 more replies)
  0 siblings, 3 replies; 113+ messages in thread
From: Glen Choo via GitGitGadget @ 2022-05-06 18:30 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, brian m. carlson, Derrick Stolee, Junio C Hamano,
	Emily Shaffer, Glen Choo, Glen Choo

From: Glen Choo <chooglen@google.com>

Add a config variable, `safe.barerepository`, that tells Git whether or
not to recognize bare repositories when it is trying to discover the
repository. This only affects repository discovery, thus it has no
effect if discovery was not done (e.g. `--git-dir` was passed).

This is motivated by the fact that some workflows don't use bare
repositories at all, and users may prefer to opt out of bare repository
discovery altogether:

- An easy assumption for a user to make is that Git commands run
  anywhere inside a repository's working tree will use the same
  repository. However, if the working tree contains a bare repository
  below the root-level (".git" is preferred at the root-level), any
  operations inside that bare repository use the bare repository
  instead.

  In the worst case, attackers can use this confusion to trick users
  into running arbitrary code (see [1] for a deeper discussion). But
  even in benign situations (e.g. a user renames ".git/" to ".git.old/"
  and commits it for archival purposes), disabling bare repository
  discovery can be a simpler mode of operation (e.g. because the user
  doesn't actually want to use ".git.old/") [2].

- Git won't "accidentally" recognize a directory that wasn't meant to be
  a bare repository, but happens to resemble one. While such accidents
  are probably very rare in practice, this lets users reduce the chance
  to zero.

This config is designed to be used like an allow-list, but it is not yet
clear what a good format for this allow-list would be. As such, this
patch limits the config value to a tri-state of ["*"|""|(unset)]:

- [*|(unset)] recognize all bare repositories (like Git does today)
- (empty) recognize no bare repositories

and leaves the full format to be determined later.
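
The value handling described above can be sketched in isolation as follows.
This is a minimal illustration, not the patch itself: the helper name
`parse_bare_repository_value` is hypothetical, and the "(unset)" case is
assumed to be handled by the caller starting from `allowed = 1`, since no
value is ever seen when the key is absent.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the patch): classify one value of
 * safe.barerepository.
 *
 * Returns 0 and stores the decision in *allowed for a supported value,
 * or -1 for anything else, because the allow-list format is undecided.
 */
static int parse_bare_repository_value(const char *value, int *allowed)
{
	/* A valueless key or "*" means: recognize all bare repositories. */
	if (!value || !strcmp(value, "*")) {
		*allowed = 1;
		return 0;
	}
	/* The empty string means: recognize no bare repositories. */
	if (!*value) {
		*allowed = 0;
		return 0;
	}
	/* Paths or patterns are rejected for now; *allowed is untouched. */
	return -1;
}
```

Rejecting every other value, rather than treating it as true or false,
keeps the door open for a later allow-list format without changing the
meaning of configs written today.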

[1]: https://lore.kernel.org/git/kl6lsfqpygsj.fsf@chooglen-macbookpro.roam.corp.google.com
[2]: I don't personally know anyone who does this as part of their
normal workflow, but a cursory search on GitHub suggests that there is a
not insubstantial number of people who munge ".git" in order to store
its contents.

https://github.com/search?l=&o=desc&p=1&q=ref+size%3A%3C1000+filename%3AHEAD&s=indexed&type=Code
(aka search for the text "ref", size:<1000, filename:HEAD)

Signed-off-by: Glen Choo <chooglen@google.com>
---
    RFC setup.c: make bare repo discovery optional
    
    (Forgive the non-standard RFC tag, I haven't figured out how to send as
    RFC using GGG. I also didn't realize that /preview would also respect
    CC...)
    
    = Description
    
    A relatively easy win that came out of the discussions around embedded
    bare repos [1], is to just let users opt-out of discovering bare repos.
    This patch does exactly that, by adding a 'boolean' config variable,
    safe.barerepository.
    
    safe.barerepository is presented to users as an allow-list of
    directories that Git will recognize as a bare repository during the
    repository discovery process (much like safe.directory), but this patch
    only implements (and permits) boolean behavior (i.e. on, off and unset).
    Hopefully, this gives us some room to discuss and experiment with
    possible formats.
    
    Thanks to Taylor for suggesting the allow-list idea :)
    
    I think the core concept of letting users toggle bare repo discovery is
    solid, but I'm sending this as RFC for the following reasons:
    
     * I don't love the name safe.barerepository, because it feels like Git
       is saying that bare repos are unsafe and consequently, that bare repo
       users are behaving unsafely. On the other hand, this is quite similar
       to safe.directory in a few respects, so it might make sense for the
       naming to reflect that.
    
     * The *-gcc CI jobs don't pass. I haven't discerned any kind of pattern
       yet.
    
    = How this relates to embedded bare repos
    
    This does not change the default behavior (i.e. Git will still discover
    all bare repos by default) because that would be catastrophic for bare
    repo users [2]. As such, this patch isn't intended to solve the problem
    of embedded bare repos for all users once and for all, but I think it
    does improve our stance on the matter:
    
     * In the short-term, users who know they won't need bare repos (or
       those who are willing to set GIT_DIR for all of their bare repos) can
       opt-in to a safer, easier to reason about mode of operation.
    
     * In the longer-term, we might identify a usable-enough default that we
       can give opt-out protection that works for the vast majority of
       users.
    
    = Other questions/Concerns
    
     * Maybe it's more informative for the user if we die() (or warn()) when
       we find a bare repo instead of silently ignoring it?
    
     * I wonder if it makes sense to separate the toggle for bare repo
       discovery and the allow-list of bare repositories. Something like
       core.barediscovery or discovery.barerepository has a lot less baggage
       than safe.*, and boolean enable/disable is a lot simpler, but this
       isn't good from an extensibility perspective.
    
     * Is there any reason why safe.barerepository shouldn't use the same
       format as (its obvious inspiration) safe.directory?
    
     * Are the docs clear enough? I found those hard to put into words, so
       I'd especially appreciate wording suggestions :)
    
    = Future work
    
     * Like safe.directory, safe.barerepository is only read from system and
       global config. I anticipate that this is too restrictive; there has
       already been some discussion of adding a GIT_SAFE_DIRECTORIES
       environment variable for safe.directory [3], and it would be useful
       to have the same thing for safe.barerepository.
    
    [1]
    https://lore.kernel.org/git/kl6lsfqpygsj.fsf@chooglen-macbookpro.roam.corp.google.com
    [2] In https://lore.kernel.org/git/xmqqh76ucdg6.fsf@gitster.g, Junio
    experimented with switching off bare repo discovery altogether and
    relying solely on GIT_DIR. The resulting fallout was deemed too big to
    be feasible.
    [3] https://lore.kernel.org/git/xmqqee1il09v.fsf@gitster.g/

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1261%2Fchooglen%2Fsetup%2Fdisable-bare-repo-config-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1261/chooglen/setup/disable-bare-repo-config-v1
Pull-Request: https://github.com/git/git/pull/1261

 Documentation/config/safe.txt | 24 +++++++++++++++
 setup.c                       | 36 ++++++++++++++++++++++-
 t/t1510-repo-setup.sh         | 55 +++++++++++++++++++++++++++++++++++
 3 files changed, 114 insertions(+), 1 deletion(-)

diff --git a/Documentation/config/safe.txt b/Documentation/config/safe.txt
index 6d764fe0ccf..02032251ffd 100644
--- a/Documentation/config/safe.txt
+++ b/Documentation/config/safe.txt
@@ -1,3 +1,27 @@
+safe.barerepository::
+	This config entry specifies directories that Git can recognize as
+	a bare repository when looking for the repository (aka repository
+	discovery). This has no effect if repository discovery is not
+	performed e.g. the path to the repository is set via `--git-dir`
+	(see linkgit:git[1]).
++
+It is recommended that you set this value so that Git will only use the bare
+repositories you intend it to. This prevents certain types of security and
+non-security problems, such as:
+
+* `git clone`-ing a repository containing a maliciously bare repository
+  inside it.
+* Git recognizing a directory that isn't mean to be a bare repository,
+  but happens to look like one.
++
+The currently supported values are `*` (Git recognizes all bare
+repositories) and the empty value (Git never recognizes bare repositories).
+Defaults to `*`.
++
+This config setting is only respected when specified in a system or global
+config, not when it is specified in a repository config or via the command
+line option `-c safe.barerepository=<path>`.
+
 safe.directory::
 	These config entries specify Git-tracked directories that are
 	considered safe even if they are owned by someone other than the
diff --git a/setup.c b/setup.c
index a7b36f3ffbf..9b5dd877273 100644
--- a/setup.c
+++ b/setup.c
@@ -1133,6 +1133,40 @@ static int ensure_valid_ownership(const char *path)
 	return data.is_safe;
 }
 
+/*
+ * This is similar to safe_directory_data, but only supports true/false.
+ */
+struct safe_bare_repository_data {
+	int is_safe;
+};
+
+static int safe_bare_repository_cb(const char *key, const char *value, void *d)
+{
+	struct safe_bare_repository_data *data = d;
+
+	if (strcmp(key, "safe.barerepository"))
+		return 0;
+
+	if (!value || !strcmp(value, "*")) {
+		data->is_safe = 1;
+		return 0;
+	}
+	if (!*value) {
+		data->is_safe = 0;
+		return 0;
+	}
+	return -1;
+}
+
+static int should_detect_bare(void)
+{
+	struct safe_bare_repository_data data;
+
+	read_very_early_config(safe_bare_repository_cb, &data);
+
+	return data.is_safe;
+}
+
 enum discovery_result {
 	GIT_DIR_NONE = 0,
 	GIT_DIR_EXPLICIT,
@@ -1238,7 +1272,7 @@ static enum discovery_result setup_git_directory_gently_1(struct strbuf *dir,
 			return GIT_DIR_DISCOVERED;
 		}
 
-		if (is_git_directory(dir->buf)) {
+		if (should_detect_bare() && is_git_directory(dir->buf)) {
 			if (!ensure_valid_ownership(dir->buf))
 				return GIT_DIR_INVALID_OWNERSHIP;
 			strbuf_addstr(gitdir, ".");
diff --git a/t/t1510-repo-setup.sh b/t/t1510-repo-setup.sh
index 591505a39c0..3ce8f776921 100755
--- a/t/t1510-repo-setup.sh
+++ b/t/t1510-repo-setup.sh
@@ -541,6 +541,61 @@ test_expect_success '#16e: bareness preserved by --bare' '
 	)
 '
 
+# Test the tri-state of [(unset)|""|"*"].
+test_expect_success '#16f: bare repo in worktree' '
+	test_when_finished "git config --global --unset safe.barerepository" &&
+	setup_repo 16f unset "" unset &&
+
+	git init --bare 16f/default/bare &&
+	git init --bare 16f/default/bare/bare &&
+	try_case 16f/default/bare unset unset \
+		. "(null)" "$here/16f/default/bare" "(null)" &&
+	try_case 16f/default/bare/bare unset unset \
+		. "(null)" "$here/16f/default/bare/bare" "(null)" &&
+
+	git config --global safe.barerepository "*" &&
+	git init --bare 16f/all/bare &&
+	git init --bare 16f/all/bare/bare &&
+	try_case 16f/all/bare unset unset \
+		. "(null)" "$here/16f/all/bare" "(null)" &&
+	try_case 16f/all/bare/bare unset unset \
+		. "(null)" "$here/16f/all/bare/bare" "(null)" &&
+
+	git config --global safe.barerepository "" &&
+	git init --bare 16f/never/bare &&
+	git init --bare 16f/never/bare/bare &&
+	try_case 16f/never/bare unset unset \
+		".git" "$here/16f" "$here/16f" "never/bare/" &&
+	try_case 16f/never/bare/bare unset unset \
+		".git" "$here/16f" "$here/16f" "never/bare/bare/"
+'
+
+test_expect_success '#16g: inside .git with safe.barerepository' '
+	test_when_finished "git config --global --unset safe.barerepository" &&
+
+	# Omit the "default" case; it is covered by 16a.
+
+	git config --global safe.barerepository "*" &&
+	setup_repo 16g/all unset "" unset &&
+	mkdir -p 16g/all/.git/wt/sub &&
+	try_case 16g/all/.git unset unset \
+		. "(null)" "$here/16g/all/.git" "(null)" &&
+	try_case 16g/all/.git/wt unset unset \
+		"$here/16g/all/.git" "(null)" "$here/16g/all/.git/wt" "(null)" &&
+	try_case 16g/all/.git/wt/sub unset unset \
+		"$here/16g/all/.git" "(null)" "$here/16g/all/.git/wt/sub" "(null)" &&
+
+	git config --global safe.barerepository "" &&
+	setup_repo 16g/never unset "" unset &&
+	mkdir -p 16g/never/.git/wt/sub &&
+	try_case 16g/never/.git unset unset \
+		".git" "$here/16g/never" "$here/16g/never" ".git/" &&
+	try_case 16g/never/.git/wt unset unset \
+		".git" "$here/16g/never" "$here/16g/never" ".git/wt/" &&
+	try_case 16g/never/.git/wt/sub unset unset \
+		".git" "$here/16g/never" "$here/16g/never" ".git/wt/sub/"
+'
+
 test_expect_success '#17: GIT_WORK_TREE without explicit GIT_DIR is accepted (bare case)' '
 	# Just like #16.
 	setup_repo 17a unset "" true &&

base-commit: 0f828332d5ac36fc63b7d8202652efa152809856
-- 
gitgitgadget


* Re: [PATCH] [RFC] setup.c: make bare repo discovery optional
  2022-05-06 18:30 [PATCH] [RFC] setup.c: make bare repo discovery optional Glen Choo via GitGitGadget
@ 2022-05-06 20:33 ` Junio C Hamano
  2022-05-09 21:42 ` Taylor Blau
  2022-05-13 23:37 ` [PATCH v2 0/2] " Glen Choo via GitGitGadget
  2 siblings, 0 replies; 113+ messages in thread
From: Junio C Hamano @ 2022-05-06 20:33 UTC (permalink / raw)
  To: Glen Choo via GitGitGadget
  Cc: git, Taylor Blau, brian m. carlson, Derrick Stolee,
	Emily Shaffer, Glen Choo

"Glen Choo via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Glen Choo <chooglen@google.com>
>
> Add a config variable, `safe.barerepository`, that tells Git whether or
> not to recognize bare repositories when it is trying to discover the
> repository. This only affects repository discovery, thus it has no
> effect if discovery was not done (e.g. `--git-dir` was passed).

> +safe.barerepository::
> +	This config entry specifies directories that Git can recognize as
> +	a bare repository when looking for the repository (aka repository
> +	discovery). This has no effect if repository discovery is not
> +	performed e.g. the path to the repository is set via `--git-dir`
> +	(see linkgit:git[1]).
> ++
> +It is recommended that you set this value so that Git will only use the bare
> +repositories you intend it to. This prevents certain types of security and
> +non-security problems, such as:
> +
> +* `git clone`-ing a repository containing a maliciously bare repository
> +  inside it.

"maliciously bare"? "malicious bare" probably.

> +* Git recognizing a directory that isn't mean to be a bare repository,

"mean to be" -> "meant to be".

> +  but happens to look like one.

> diff --git a/setup.c b/setup.c
> index a7b36f3ffbf..9b5dd877273 100644
> --- a/setup.c
> +++ b/setup.c
> @@ -1133,6 +1133,40 @@ static int ensure_valid_ownership(const char *path)
>  	return data.is_safe;
>  }
>  
> +/*
> + * This is similar to safe_directory_data, but only supports true/false.
> + */
> +struct safe_bare_repository_data {
> +	int is_safe;
> +};
> +
> +static int safe_bare_repository_cb(const char *key, const char *value, void *d)
> +{
> +	struct safe_bare_repository_data *data = d;
> +
> +	if (strcmp(key, "safe.barerepository"))
> +		return 0;
> +
> +	if (!value || !strcmp(value, "*")) {
> +		data->is_safe = 1;
> +		return 0;
> +	}
> +	if (!*value) {
> +		data->is_safe = 0;
> +		return 0;
> +	}
> +	return -1;
> +}
> +
> +static int should_detect_bare(void)
> +{
> +	struct safe_bare_repository_data data;
> +
> +	read_very_early_config(safe_bare_repository_cb, &data);
> +
> +	return data.is_safe;
> +}
> +
>  enum discovery_result {
>  	GIT_DIR_NONE = 0,
>  	GIT_DIR_EXPLICIT,
> @@ -1238,7 +1272,7 @@ static enum discovery_result setup_git_directory_gently_1(struct strbuf *dir,
>  			return GIT_DIR_DISCOVERED;
>  		}
>  
> -		if (is_git_directory(dir->buf)) {
> +		if (should_detect_bare() && is_git_directory(dir->buf)) {
>  			if (!ensure_valid_ownership(dir->buf))
>  				return GIT_DIR_INVALID_OWNERSHIP;
>  			strbuf_addstr(gitdir, ".");

This is in a loop, which will go up and try the parent directory if
the body of this block is not entered, so it is calling the new
should_detect_bare() helper over and over if it returns false.

Not a very good idea.

Perhaps this would help?  I dunno.

static int should_detect_bare(void)
{
	static int should = -1; /* unknown yet */

	if (should < 0) {
		struct safe_bare_repository_data data = { 0 };
		read_very_early_config(safe_bare_repository_cb, &data);
		should = data.is_safe;
	}
	return should;
}
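
The effect of this kind of caching can be checked in isolation with a
small stand-alone sketch. Everything here is an assumption made for
illustration: `read_config_stub` stands in for the real config lookup and
pretends the key is set to the empty value, and the initial value of 1
is chosen to match the documented default of `*`. The starting value
matters when the key is unset, because the callback is then never
invoked and the initial value is what gets returned.

```c
#include <stddef.h>

/* Counts how often the (stubbed) config read happens. */
static int config_reads;

/*
 * Stand-in for the real config lookup; hypothetical, for illustration
 * only. It pretends safe.barerepository is set to the empty value.
 */
static void read_config_stub(int *is_safe)
{
	config_reads++;
	*is_safe = 0;
}

static int should_detect_bare_cached(void)
{
	static int should = -1; /* -1: not looked up yet */

	if (should < 0) {
		/* Assumed default when the key is unset: allow ("*"). */
		int is_safe = 1;

		read_config_stub(&is_safe);
		should = is_safe;
	}
	return should;
}

/* Mimics the discovery loop: one query per directory level. */
static int count_allowed_levels(int levels)
{
	int i, allowed = 0;

	for (i = 0; i < levels; i++)
		allowed += should_detect_bare_cached();
	return allowed;
}
```

Calling `count_allowed_levels(5)` queries the helper five times, but the
stubbed config read runs only once.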

In any case, I very much appreciate the fact that this touches the
setup_git_directory_gently_1() codepath only minimally, as we have
other plans to update the code further soonish.

Thanks.


* Re: [PATCH] [RFC] setup.c: make bare repo discovery optional
  2022-05-06 18:30 [PATCH] [RFC] setup.c: make bare repo discovery optional Glen Choo via GitGitGadget
  2022-05-06 20:33 ` Junio C Hamano
@ 2022-05-09 21:42 ` Taylor Blau
  2022-05-09 22:54   ` Junio C Hamano
  2022-05-10 22:00   ` Glen Choo
  2022-05-13 23:37 ` [PATCH v2 0/2] " Glen Choo via GitGitGadget
  2 siblings, 2 replies; 113+ messages in thread
From: Taylor Blau @ 2022-05-09 21:42 UTC (permalink / raw)
  To: Glen Choo via GitGitGadget
  Cc: git, brian m. carlson, Derrick Stolee, Junio C Hamano,
	Emily Shaffer, Glen Choo

Hi Glen,

On Fri, May 06, 2022 at 06:30:10PM +0000, Glen Choo via GitGitGadget wrote:
> From: Glen Choo <chooglen@google.com>
>
> Add a config variable, `safe.barerepository`, that tells Git whether or
> not to recognize bare repositories when it is trying to discover the
> repository. This only affects repository discovery, thus it has no
> effect if discovery was not done (e.g. `--git-dir` was passed).

Thanks for working on this! I'm excited to see some patches here, though
I'm not totally convinced of this direction. More below.

To summarize, this proposal attempts to work around the problem of
embedding bare repositories in non-bare checkouts by providing a way to
opt-out of bare repository discovery (which is to only discover things
that are listed in the safe.bareRepository configuration).

I agree that this would prevent the problem you're trying to solve, but
I have significant concerns that this patch is going too far (at the
risk of future damage to unrelated workflows) in order to accomplish
that goal.

My concern is that if we ever flipped the default (i.e. that
"safe.bareRepository" might someday be ""), that many legitimate cases
of using bare repositories would be broken. I think there are many such
legitimate use cases that _do_ rely on discovering bare repositories
(i.e., Git invocations that do not have a `--git-dir` in their
command-line). One such example would be forges, but I imagine that
there are many other uses we don't even know about, and I would like to
avoid breaking those if we ended up changing the default.

If it's possible to pursue a more targeted fix that leaves non-embedded
bare repositories alone, I'd like to try and focus these efforts on a
more narrow fix that would address just the case of embedded bare
repositories. I think that the direction I outlined in:

    https://lore.kernel.org/git/Ylobp7sntKeWTLDX@nand.local/

could be a good place to start (see the paragraph beginning with "Here's
an alternative approach" and below for the details).

One potential problem with that approach (that this patch doesn't suffer
from) is that any discovery which finds a bare repository would have to
continue up to the root of the volume in order to figure out whether or
not that bare repository is embedded in another non-bare one. That is
probably a non-starter due to performance, but I think you could easily
work around it with a top-level setting that controls whether or not you
even _care_ about embedded bare repositories.

For example, if I set safe.bareRepository='*' in my top-level
/etc/gitconfig, then we can avoid having to continue discovery for bare
repositories altogether because we know we'll allow it anyway.

To pursue a change that targets just embedded bare repositories, I think
you fundamentally have to do an exhaustive repository discovery in order
to figure out whether the (bare) repository you're dealing with is
embedded or not. So having an opt-out for users that either (a) don't
care or (b) can't accept the performance degradation that Emily
mentioned as a result of doing unbounded filesystem traversal would be
sensible.
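
To make the cost concrete, the exhaustive check described above might
look roughly like the sketch below. It is an illustration under stated
assumptions, not a proposal for setup.c: `is_embedded_bare_repo`,
`has_dot_git_fn` and `fake_has_dot_git` are hypothetical names, the
input is assumed to be an absolute path without a trailing slash, and a
real check would also need to handle gitfiles, worktrees and ownership.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical predicate: does `dir` contain a ".git" entry? */
typedef int (*has_dot_git_fn)(const char *dir);

/* Test stand-in: pretend only /home/u/proj is a non-bare checkout. */
static int fake_has_dot_git(const char *dir)
{
	return !strcmp(dir, "/home/u/proj");
}

/*
 * Returns 1 if any proper ancestor of `bare_dir` looks like the top of
 * a non-bare working tree, 0 otherwise. When nothing matches, every
 * ancestor up to "/" is visited, which is the unbounded traversal cost.
 */
static int is_embedded_bare_repo(const char *bare_dir,
				 has_dot_git_fn has_dot_git)
{
	char buf[4096];
	size_t len = strlen(bare_dir);

	if (len >= sizeof(buf))
		return 0; /* too long for this sketch; treat as not embedded */
	memcpy(buf, bare_dir, len + 1);

	while (len > 1) {
		/* Strip the last path component to get the parent. */
		char *slash = strrchr(buf, '/');

		if (!slash)
			break; /* relative path: nothing more to inspect */
		if (slash == buf)
			slash[1] = '\0'; /* the parent is the root "/" */
		else
			*slash = '\0';
		len = strlen(buf);

		if (has_dot_git(buf))
			return 1;
	}
	return 0;
}
```

With the stand-in predicate, `/home/u/proj/vendor/x.git` is reported as
embedded, while `/srv/git/x.git` is not, and the latter only after
walking all the way to the root.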

Playing devil's advocate for a moment, though, even if we had something
like the proposal I outlined, flipping the top-level default from '*' to
some value that implies we stop working in embedded bare repositories
will break existing workflows. But that breakage would just be limited
to embedded bare repositories, and not non-embedded ones. So I think on
balance that breakage would affect fewer real-world users, while still
being just as easy to recover from.

>     safe.barerepository is presented to users as an allow-list of
>     directories that Git will recognize as a bare repository during the
>     repository discovery process (much like safe.directory), but this patch
>     only implements (and permits) boolean behavior (i.e. on, off and unset).
>     Hopefully, this gives us some room to discuss and experiment with
>     possible formats.
>
>     Thanks to Taylor for suggesting the allow-list idea :)

I did suggest an allow-list, but not this one ;-).

>     I think the core concept of letting users toggle bare repo discovery is
>     solid, but I'm sending this as RFC for the following reasons:
>
>      * I don't love the name safe.barerepository, because it feels like Git
>        is saying that bare repos are unsafe and consequently, that bare repo
>        users are behaving unsafely. On the other hand, this is quite similar
>        to safe.directory in a few respects, so it might make sense for the
>        naming to reflect that.

Yes, the concerns I outlined above are definitely echoing this
sentiment. Another way to say it is that this feels like too big of a
hammer (i.e., it is targeting _all_ bare repositories, not just embedded
ones) for too small of a nail (embedded bare repositories). As you're
probably sick of hearing me say by now, I would strongly prefer a more
targeted solution (perhaps what I outlined, or perhaps something else,
so long as it doesn't break non-embedded bare repositories if we ever
decided to change the default value of safe.bareRepository).

>      * The *-gcc CI jobs don't pass. I haven't discerned any kind of pattern
>        yet.

Interesting. I wouldn't expect this to be the case (since the default is
to allow everything right now).

>      * In the longer-term, we might identify a usable-enough default that we
>        can give opt-out protection that works for the vast majority of
>        users.

Perhaps, and I think if this were the case then I would feel differently
about this patch. But I don't want us to paint ourselves into a corner,
either. It would be unfortunate to, say, find ourselves in a position
where the only protection against some novel embedded bare repository
attack is to change a default that would break many existing workflows
for _non_-embedded bare repositories.

>     = Other questions/Concerns
>
>      * Maybe it's more informative for the user if we die() (or warn()) when
>        we find a bare repo instead of silently ignoring it?

We should definitely provide more feedback to the user. If I set
`safe.bareRepository` to the empty string via a global config, and then
execute a Git command in a non-embedded bare repository, I get:

    $ git.compile config --get --global --default='*' safe.bareRepository

    $ git.compile rev-parse --absolute-git-dir
    fatal: not a git repository (or any of the parent directories): .git

whereas on the last release of Git, I get instead:

    $ git rev-parse --absolute-git-dir
    /home/ttaylorr/repo.git

I'm still not convinced that just reading repository extensions while
ignoring the rest of config and hooks is too confusing, so I'd be more
in favor of something like:

    $ git.compile rev-parse --absolute-git-dir
    warning: ignoring repository config and hooks
    advice: to permit bare repository discovery (which
    advice: will read config and hooks), consider running:
    advice:
    advice:   $ git config --global --add safe.bareRepository /home/ttaylorr/repo.git
    /home/ttaylorr/repo.git

(though I still feel strongly that we should pursue a more targeted
approach here).

Thanks,
Taylor


* Re: [PATCH] [RFC] setup.c: make bare repo discovery optional
  2022-05-09 21:42 ` Taylor Blau
@ 2022-05-09 22:54   ` Junio C Hamano
  2022-05-09 23:57     ` Taylor Blau
  2022-05-10 22:00   ` Glen Choo
  1 sibling, 1 reply; 113+ messages in thread
From: Junio C Hamano @ 2022-05-09 22:54 UTC (permalink / raw)
  To: Taylor Blau
  Cc: Glen Choo via GitGitGadget, git, brian m. carlson,
	Derrick Stolee, Emily Shaffer, Glen Choo

Taylor Blau <me@ttaylorr.com> writes:

> Thanks for working on this! I'm excited to see some patches here, though
> I'm not totally convinced of this direction. More below.
>
> To summarize, this proposal attempts to work around the problem of
> embedding bare repositories in non-bare checkouts by providing a way to
> opt-out of bare repository discovery (which is to only discover things
> that are listed in the safe.bareRepository configuration).
>
> I agree that this would prevent the problem you're trying to solve, but
> I have significant concerns that this patch is going too far (at the
> risk of future damage to unrelated workflows) in order to accomplish
> that goal.
>
> My concern is that if we ever flipped the default (i.e. that
> "safe.bareRepository" might someday be ""), that many legitimate cases
> of using bare repositories would be broken. I think there are many such
> legitimate use cases that _do_ rely on discovering bare repositories
> (i.e., Git invocations that do not have a `--git-dir` in their
> command-line).

I think 99% of such use is to chdir into the directory with HEAD,
refs/ and objects/ in it and let git recognise the cwd is a git
directory.  Am I mistaken, or are there tools that chdir into
objects/08/ and rely on setup_git_directory_gently_1() to find the
parent directory of that 'objects' directory to be a git directory?

I am wondering if another knob to make that particular use case
easier may be sufficient.  If you are a forge operator, you'd just
set a boolean configuration variable to say "it is sufficient to
chdir into a directory to use it as a bare repository without exporting
the environment variable GIT_DIR=."

It is likely that end-user human users would not want to enable such
a variable, of course, but I wonder if a simple single knob would be
sufficient to help other use cases you are worried about?

While I wish "extensions and nothing else", i.e. we use "degraded
access", not "refuse to give access at all", were workable, I am
pessimistic that it would work well in practice.

Saying "nothing else" is easy, but we do "if X exists, use it" for
hooks, and to implement "nothing else", you'd need to find such
code and say "even if X exists, because we are in this strange
embedded bare thing, ignore this part of the logic" for every X.
We've been casually saying "potentially risky config" and then
started mixing "hooks" in the discussion, but who knows what other
things are used from the repository by third-party tools that we
have yet to add to the mix?

> I'm still not convinced that just reading repository extensions while
> ignoring the rest of config and hooks is too confusing, so I'd be more
> in favor of something like:

I do not think it would be confusing.  I am worried about it being
error prone.

>     $ git.compile rev-parse --absolute-git-dir
>     warning: ignoring repository config and hooks
>     advice: to permit bare repository discovery (which
>     advice: will read config and hooks), consider running:
>     advice:
>     advice:   $ git config --global --add safe.bareRepository /home/ttaylorr/repo.git
>     /home/ttaylorr/repo.git

Is the last line meant to be an output from "rev-parse --absolute-git-dir"?
IOW, the warning says you are ignoring, but we are still recognising
it as a repository?

By the way, do we need safe.bareRepository?  Shouldn't
safe.directory cover the same purpose?  

If a directory is listed in the latter, you are saying that (1) the
directory is OK to use as a repository, and (2) it is so even if the
directory is owned by somebody else, not you.

Theoretically you can argue that there can be cases where you only
want (1) and not (2), but as long as you control such a directory
(like an embedded repository in your project's checkout) yourself,
you do not have to worry about the "ok even if it is owned by
somebody else" part.






* Re: [PATCH] [RFC] setup.c: make bare repo discovery optional
  2022-05-09 22:54   ` Junio C Hamano
@ 2022-05-09 23:57     ` Taylor Blau
  2022-05-10  0:23       ` Junio C Hamano
  0 siblings, 1 reply; 113+ messages in thread
From: Taylor Blau @ 2022-05-09 23:57 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Taylor Blau, Glen Choo via GitGitGadget, git, brian m. carlson,
	Derrick Stolee, Emily Shaffer, Glen Choo

On Mon, May 09, 2022 at 03:54:08PM -0700, Junio C Hamano wrote:
> Taylor Blau <me@ttaylorr.com> writes:
>
> > Thanks for working on this! I'm excited to see some patches here, though
> > I'm not totally convinced of this direction. More below.
> >
> > To summarize, this proposal attempts to work around the problem of
> > embedding bare repositories in non-bare checkouts by providing a way to
> > opt-out of bare repository discovery (which is to only discover things
> > that are listed in the safe.bareRepository configuration).
> >
> > I agree that this would prevent the problem you're trying to solve, but
> > I have significant concerns that this patch is going too far (at the
> > risk of future damage to unrelated workflows) in order to accomplish
> > that goal.
> >
> > My concern is that if we ever flipped the default (i.e. that
> > "safe.bareRepository" might someday be ""), that many legitimate cases
> > of using bare repositories would be broken. I think there are many such
> > legitimate use cases that _do_ rely on discovering bare repositories
> > (i.e., Git invocations that do not have a `--git-dir` in their
> > command-line).
>
> I think 99% of such use is to chdir into the directory with HEAD,
> refs/ and objects/ in it and let git recognise the cwd is a git
> directory.  Am I mistaken, or are there tools that chdir into
> objects/08/ and rely on setup_git_directory_gently_1() to find the
> parent directory of that 'objects' directory to be a git directory?

If you took this change, and then at some point in the future we changed
the default value of safe.bareRepository to "", wouldn't that break that
99% of use cases you are talking about?

When I read your "I think 99% of such use is ...", it makes me think
that this change won't disrupt bare repo discovery when we only traverse
one layer above $CWD. But this change disrupts the case where we don't
need to traverse at all to do discovery (i.e., when $CWD is the root of
a bare repository).

> I am wondering if another knob to make that particular use case
> easier may be sufficient.  If you are a forge operator, you'd just
> set a boolean configuration variable to say "it is sufficient to
> chdir into a directory to use it as a bare repository without exporting
> the environment variable GIT_DIR=."

Yes, GitHub would almost certainly set safe.bareRepository to "*"
regardless of what Git's own default would be.

> It is likely that end-user human users would not want to enable such
> a variable, of course, but I wonder if a simple single knob would be
> sufficient to help other use cases you are worried about?

I'm not sure I agree that end-users wouldn't want to touch this knob. If
they have embedded bare repositories that they rely on as test fixtures,
for example, wouldn't safe.bareRepository need to be tweaked?

(On a separate but somewhat-related note, I still think that this
setting should be read from the repository config, too, i.e., it seems
odd that we'd force a user to set safe.bareRepository to some deeply
nested repository (in the embedded case) via their global config.)

> While I wish "extensions and nothing else", i.e. we use "degraded
> access", not "refuse to give access at all", were workable, I am
> pessimistic that it would work well in practice.
>
> Saying "nothing else" is easy, but we do "if X exists, use it" for
> hook, and to implement "nothing else", you'd need to find such a
> code and say "even if X exists, because we are in this strange
> embedded bare thing, ignore this part of the logic" for every X.
> We've been casually saying "potentially risky config" and then
> started mixing "hooks" in the discussion, but who knows what other
> things are used from the repository by third-party tools that we
> need to yet add to the mix?
>
> > I'm still not convinced that just reading repository extensions while
> > ignoring the rest of config and hooks is too confusing, so I'd be more
> > in favor of something like:
>
> I do not think it would be confusing.  I am worried about it being
> error prone.

Yeah, on this and the above quoted hunk, I am fine if our behavior
eventually becomes "call die()" when we are in an embedded bare
repository. But I do think this transition should be gradual, i.e., we
should likely emit a warning in those cases that would be broken in the
future to say "this will break, run this `git config` invocation if you
want it to remain working".

> >     $ git.compile rev-parse --absolute-git-dir
> >     warning: ignoring repository config and hooks
> >     advice: to permit bare repository discovery (which
> >     advice: will read config and hooks), consider running:
> >     advice:
> >     advice:   $ git config --global --add safe.bareRepository /home/ttaylorr/repo.git
> >     /home/ttaylorr/repo.git
>
> Is the last line meant to be an output from "rev-parse --absolute-git-dir"?
> IOW, the warning says you are ignoring, but we are still recognising
> it as a repository?

In this example, yes. But again, I'm not so deeply attached to the idea
that we *have* to run in those cases. So I would equally be OK with the
above s/warning/fatal and minus the last line, too (i.e., that we call
die(), obviously we'd have to emit the advice before calling die()).

> By the way, do we need safe.bareRepository?  Shouldn't
> safe.directory cover the same purpose?
>
> If a directory is on the latter, you are saying that (1) the
> directory is OK to use as a repository, and (2) it is so even if the
> directory is owned by somebody else, not you.
>
> Theoretically you can argue that there can be cases where you only
> want (1) and not (2), but as long as you control such a directory
> (like an embedded repository in your project's checkout) yourself,
> you do not have to worry about the "ok even if it is owned by
> somebody else" part.

I'm not sure yet, but will think more about it.

Thanks,
Taylor


* Re: [PATCH] [RFC] setup.c: make bare repo discovery optional
  2022-05-09 23:57     ` Taylor Blau
@ 2022-05-10  0:23       ` Junio C Hamano
  0 siblings, 0 replies; 113+ messages in thread
From: Junio C Hamano @ 2022-05-10  0:23 UTC (permalink / raw)
  To: Taylor Blau
  Cc: Glen Choo via GitGitGadget, git, brian m. carlson,
	Derrick Stolee, Emily Shaffer, Glen Choo

Taylor Blau <me@ttaylorr.com> writes:

>> > My concern is that if we ever flipped the default (i.e. that
>> > "safe.bareRepository" might someday be ""), that many legitimate cases
>> > of using bare repositories would be broken. I think there are many such
>> > legitimate use cases that _do_ rely on discovering bare repositories
>> > (i.e., Git invocations that do not have a `--git-dir` in their
>> > command-line).
>>
>> I think 99% of such use is to chdir into the directory with HEAD,
>> refs/ and objects/ in it and let git recognise the cwd is a git
>> directory.  Am I mistaken, or are there tools that chdir into
>> objects/08/ and rely on setup_git_directory_gently_1() to find the
>> parent directory of that 'objects' directory to be a git directory?
>
> If you took this change, and then at some point in the future we changed
> the default value of safe.bareRepository to "", wouldn't that break that
> 99% of use cases you are talking about?

Our spawning (e.g. "fetch" run_command()s "upload-pack" in a local
repository, or "fetch" runs "upload-pack" over ssh connection, or
http gateway runs "upload-pack" after learning which repository the
request is fetching from) of subcommands can and should be fixed by
exporting "GIT_DIR=." when we spawn them in the target directory,
and such a fix should be more or less trivial.  It must happen
before such a switch of default happens (if it is what we plan to
do, that is).  Also, the trivial fix must be conveyed to third-party
tool authors, who must be given time to adjust their wares.
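
A minimal sketch of what that adjustment could look like in a
third-party wrapper script. The function name is hypothetical, and the
command to run is supplied by the caller, so nothing here assumes a
particular Git subcommand.

```shell
# Hypothetical wrapper: run a command inside a bare repository while
# stating explicitly that the current directory is the repository,
# instead of relying on discovery to work that out.
run_in_bare_repo () {
	repo=$1
	shift
	(
		# The subshell keeps the chdir and GIT_DIR away from the caller.
		cd "$repo" &&
		GIT_DIR=. &&
		export GIT_DIR &&
		"$@"
	)
}
```

A caller would then write something like
"run_in_bare_repo path/to/repo.git git rev-parse --absolute-git-dir"
rather than a bare "cd path/to/repo.git && git ...".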

That's part of the usual migration process, and I am not so worried
about it.

If some third-party tool for whatever reason wants to start from a
random subdirectory in a bare repository, that is a different story.
Fixing such a third-party tool would be more involved than "more or
less trivial".

> When I read your "I think 99% of such use is ...", it makes me think
> that this change won't disrupt bare repo discovery when we only traverse
> one layer above $CWD. But this change disrupts the case where we don't
> need to traverse at all to do discovery (i.e., when $CWD is the root of
> a bare repository).

By "this change" you mean what Glen proposes?  I think it was
designed to break the use case where you go there to signal that you
want to use the directory as a repository.

>> I am wondering if another knob to help that particular use case
>> easier may be sufficient.  If you are a forge operator, you'd just
>> set a boolean configuration variable to say "it is sufficient to
>> chdir into a directory to use it as a bare repository without exporting
>> the environment variable GIT_DIR=."

And such a boolean, without safe.bareRepository setting, should be
sufficient to cover that 99% of such use, because it disables that
deliberate refusal of treating CWD as a repository without
explicitly saying that is what you want with "GIT_DIR=.".  One thing
I wasn't sure about was if that 99% number is close to reality,
hence my question.

> Yes, GitHub would almost certainly set safe.bareRepository to "*"
> regardless of what Git's own default would be.

And with such a boolean, I am hoping that GitHub does not have to make
such a wildly open setting.  Only $CWD that is the top of a repository,
without allowing it to be any random subdirectory, would be allowed.

> I'm not sure I agree that end-users wouldn't want to touch this knob. If
> they have embedded bare repositories that they rely on as test fixtures,
> for example, wouldn't safe.bareRepository need to be tweaked?

But not in the "My $CWD is always fine" knob, whose only reason is
to simplify things without opening you up unnecessarily too widely
for hosting sites.


* Re: [PATCH] [RFC] setup.c: make bare repo discovery optional
  2022-05-09 21:42 ` Taylor Blau
  2022-05-09 22:54   ` Junio C Hamano
@ 2022-05-10 22:00   ` Glen Choo
  1 sibling, 0 replies; 113+ messages in thread
From: Glen Choo @ 2022-05-10 22:00 UTC (permalink / raw)
  To: Taylor Blau, Glen Choo via GitGitGadget
  Cc: git, brian m. carlson, Derrick Stolee, Junio C Hamano, Emily Shaffer

Hi Taylor,

Taylor Blau <me@ttaylorr.com> writes:

> Hi Glen,
>
> On Fri, May 06, 2022 at 06:30:10PM +0000, Glen Choo via GitGitGadget wrote:
>> From: Glen Choo <chooglen@google.com>
>>
>> Add a config variable, `safe.barerepository`, that tells Git whether or
>> not to recognize bare repositories when it is trying to discover the
>> repository. This only affects repository discovery, thus it has no
>> effect if discovery was not done (e.g. `--git-dir` was passed).
>
> To summarize, this proposal attempts to work around the problem of
> embedding bare repositories in non-bare checkouts by providing a way to
> opt-out of bare repository discovery (which is to only discover things
> that are listed in the safe.bareRepository configuration).
>
> I agree that this would prevent the problem you're trying to solve, but
> I have significant concerns that this patch is going too far (at the
> risk of future damage to unrelated workflows) in order to accomplish
> that goal.

Thanks again for the careful read. As I understand it, your concern is
that making bare repository discovery configurable and then flipping the
default to e.g. never detecting bare repositories is too disruptive a
way to fix the embedded bare repository problem. And to avoid disrupting
non-embedded bare repositories, you would prefer to pursue a more
targeted fix.

If the problem statement were limited to embedded bare repositories,
then I agree that this is way more than overkill, and that a targeted
solution would be preferable.

More generally however, the problem of embedded bare repositories seems
to suggest that bare repository discovery doesn't serve all users well,
and in fact, may even be a net negative for a subset of users. I'd be
interested in hearing your thoughts from that perspective, e.g.

- Should bare repository discovery be configurable?
- What is a good default for bare repository discovery? (regardless of
  how feasible changing the default is)

This is a somewhat different direction from how the conversation started
(I hope it doesn't look like I'm shifting the goal posts), but I think
it's a good opportunity to step back and simplify something that we
wish we had gotten right in the beginning.

And even if we don't flip the default, shipping the config value still
seems useful e.g. there's a good amount of interest in disabling bare
repository discovery at $DAYJOB (and I think we'll get a lot of
interesting results once we do).

>>     safe.barerepository is presented to users as an allow-list of
>>     directories that Git will recognize as a bare repository during the
>>     repository discovery process (much like safe.directory), but this patch
>>     only implements (and permits) boolean behavior (i.e. on, off and unset).
>>     Hopefully, this gives us some room to discuss and experiment with
>>     possible formats.
>>
>>     Thanks to Taylor for suggesting the allow-list idea :)
>
> I did suggest an allow-list, but not this one ;-).

Ah, yes. Oops. Sorry if it looked like I was putting words in your
mouth.

What I really meant was that an allow-list (untethered from any specific
purpose) seems like a useful 'UI primitive', so thanks for bringing up
the option.

>>     I think the core concept of letting users toggle bare repo discovery is
>>     solid, but I'm sending this as RFC for the following reasons:
>>
>>      * I don't love the name safe.barerepository, because it feels like Git
>>        is saying that bare repos are unsafe and consequently, that bare repo
>>        users are behaving unsafely. On the other hand, this is quite similar
>>        to safe.directory in a few respects, so it might make sense for the
>>        naming to reflect that.
>
> Yes, the concerns I outlined above are definitely echoing this
> sentiment. Another way to say it is that this feels like too big of a
> hammer (i.e., it is targeting _all_ bare repositories, not just embedded
> ones) for too small of a nail (embedded bare repositories). As you're
> probably sick of hearing me say by now, I would strongly prefer a more
> targeted solution (perhaps what I outlined, or perhaps something else,
> so long as it doesn't break non-embedded bare repositories if/ever we
> decided to change the default value of safe.bareRepository).

Ok, yeah I think safe.barerepository is a terrible way to achieve my
purported goal of 'making bare repository discovery
configurable/simpler' - using the "safe." namespace makes it impossible
to see this as anything other than protection against dangerous, unknown
bare repositories. I'll drop the idea of safe-listing known bare
repositories for now, that seems unproductive.

'Optionally disable bare repository discovery' still sounds like it's on
the table though, but probably with a different kind of UX e.g.
"discovery.barerepository" with the options:

- always: always discover bare repos
- never: never discover bare repos
- cwd-only: only discover bare repos if they are the cwd
- dotgit-only: only discover bare repos if they are a descendant of
  .git/
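
As a rough illustration of how these four values might be told apart,
here is a shell sketch of the decision alone. It assumes discovery has
already found the bare repository root; the function name, its
arguments and the return codes are made up, and the reading of
"descendant of .git/" used for dotgit-only is an assumption.

```shell
# Hypothetical decision helper for the four proposed values.
#   $1: mode (always | never | cwd-only | dotgit-only)
#   $2: root of the bare repository that discovery found
#   $3: directory the command was started in
# Returns 0 if the repository may be used, 1 if not, 2 on a bad mode.
bare_discovery_allowed () {
	mode=$1 repo=$2 cwd=$3
	case "$mode" in
	always)
		return 0 ;;
	never)
		return 1 ;;
	cwd-only)
		test "$repo" = "$cwd" ;;
	dotgit-only)
		# Assumed reading of "a descendant of .git/": the repository
		# root is ".git" itself or lives somewhere below a ".git".
		case "$repo/" in
		*/.git/*) return 0 ;;
		*) return 1 ;;
		esac ;;
	*)
		return 2 ;;
	esac
}
```

Under this reading, a renamed ".git.old/" is rejected by dotgit-only,
and cwd-only rejects a command started in refs/ of a bare repository.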

>>      * The *-gcc CI jobs don't pass. I haven't discerned any kind of pattern
>>        yet.
>
> Interesting. I wouldn't expect this to be the case (since the default is
> to allow everything right now).

This might be a false alarm - I saw similar failures on an unrelated
patch. I think my "master" is just out of date :(


* [PATCH v2 0/2] setup.c: make bare repo discovery optional
  2022-05-06 18:30 [PATCH] [RFC] setup.c: make bare repo discovery optional Glen Choo via GitGitGadget
  2022-05-06 20:33 ` Junio C Hamano
  2022-05-09 21:42 ` Taylor Blau
@ 2022-05-13 23:37 ` Glen Choo via GitGitGadget
  2022-05-13 23:37   ` [PATCH v2 1/2] " Glen Choo via GitGitGadget
                     ` (5 more replies)
  2 siblings, 6 replies; 113+ messages in thread
From: Glen Choo via GitGitGadget @ 2022-05-13 23:37 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, brian m. carlson, Derrick Stolee, Junio C Hamano,
	Emily Shaffer, Glen Choo

Thanks all for the comments on v1, I've expanded this series somewhat to
address them, notably:

 * The config value is now named discovery.bare and is an enum (instead of
   an allow-list). This hopefully moves us away from "bare repos are unsafe
   and need to be guarded against" and towards "bare repos can be made
   optional if it serves your needs better".
 * discovery.bare now causes git to die() instead of silently ignoring the
   bare repo.
 * Add an option that allows a bare repo if it is the CWD, since this is
   presumably a reasonable default for 99% of bare repo users [1].

= Questions/Concerns

 * die()-ing is necessary if we're trying to flip the default value of
   discovery.bare. We'd expect many bare repo users to be broken, and it's
   more helpful to fail loudly than to silently ignore the bare repo.
   
   But in the long term, long after we've flipped the default and users know
   that they need to opt into bare repo discovery, would it be a better UX
   to just silently ignore the bare repo?

= Patch organization

 * Patch 1 introduces discovery.bare with allowed values [always|never].

 * Patch 2 adds discovery.bare=cwd, which is useful when users don't always
   set GIT_DIR, e.g. because their workflow really depends on it or they
   are in the midst of a migration.

= Series history

Changes since v1:

 * Rename safe.barerepository to discovery.bare and make it die()
 * Move tests into t/t0034-discovery-bare.sh
 * Avoid unnecessary config reading by using a static variable
 * Add discovery.bare=cwd
 * Fix typos

[1] I tried this 'cwd' setting on our test suite, with some pretty promising
results.

https://github.com/chooglen/git/actions/runs/2321914777

Out of the 8 failing scripts:

 * 6 are of the form "make sure we 'do the right thing' inside a
   subdirectory of a bare repo" (which typically means .git) e.g.
   t9903-bash-prompt.sh. We should be setting discovery.bare=always for
   these tests, so this is a non-issue.
 * t5323-pack-redundant.sh can be rewritten to -C into the root of the bare
   repo instead of a subdirectory.
 * t3310-notes-merge-manual-resolve.sh: not sure what the test is checking
   in particular, but I think this can be rewritten.

IOW, I don't think we have any commands that require that CWD is a
subdirectory of a bare repo, and we could use discovery.bare without much
hassle.

Glen Choo (2):
  setup.c: make bare repo discovery optional
  setup.c: learn discovery.bareRepository=cwd

 Documentation/config/discovery.txt | 26 +++++++++
 setup.c                            | 89 ++++++++++++++++++++++++++++--
 t/t0034-discovery-bare.sh          | 69 +++++++++++++++++++++++
 3 files changed, 178 insertions(+), 6 deletions(-)
 create mode 100644 Documentation/config/discovery.txt
 create mode 100755 t/t0034-discovery-bare.sh


base-commit: e8005e4871f130c4e402ddca2032c111252f070a
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1261%2Fchooglen%2Fsetup%2Fdisable-bare-repo-config-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1261/chooglen/setup/disable-bare-repo-config-v2
Pull-Request: https://github.com/git/git/pull/1261

Range-diff vs v1:

 1:  3370258c4b3 ! 1:  22b10bf9da8 [RFC] setup.c: make bare repo discovery optional
     @@ Metadata
      Author: Glen Choo <chooglen@google.com>
      
       ## Commit message ##
     -    [RFC] setup.c: make bare repo discovery optional
     +    setup.c: make bare repo discovery optional
      
     -    Add a config variable, `safe.barerepository`, that tells Git whether or
     -    not to recognize bare repositories when it is trying to discover the
     -    repository. This only affects repository discovery, thus it has no
     +    Add a config variable, `discovery.bare`, that tells Git whether or not
     +    it should work with the bare repository it has discovered i.e. Git will
     +    die() if it discovers a bare repository, but it is not allowed by
     +    `discovery.bare`. This only affects repository discovery, thus it has no
          effect if discovery was not done (e.g. `--git-dir` was passed).
      
          This is motivated by the fact that some workflows don't use bare
     @@ Commit message
            are probably very rare in practice, this lets users reduce the chance
            to zero.
      
     -    This config is designed to be used like an allow-list, but it is not yet
     -    clear what a good format for this allow-list would be. As such, this
     -    patch limits the config value to a tri-state of [true|false|unset]:
     +    This config is an enum of:
      
     -    - [*|(unset)] recognize all bare repositories (like Git does today)
     -    - (empty) recognize no bare repositories
     +    - ["always"|(unset)]: always recognize bare repositories (like Git does
     +      today)
     +    - "never": never recognize bare repositories
      
     -    and leaves the full format to be determined later.
     +    More values are expected to be added later, and the default is expected
     +    to change (i.e. to something other than "always").
      
          [1]: https://lore.kernel.org/git/kl6lsfqpygsj.fsf@chooglen-macbookpro.roam.corp.google.com
          [2]: I don't personally know anyone who does this as part of their
     @@ Commit message
      
          Signed-off-by: Glen Choo <chooglen@google.com>
      
     - ## Documentation/config/safe.txt ##
     +    WIP setup.c: make discovery.bare die on failure
     +
     +    Signed-off-by: Glen Choo <chooglen@google.com>
     +
     + ## Documentation/config/discovery.txt (new) ##
      @@
     -+safe.barerepository::
     -+	This config entry specifies directories that Git can recognize as
     -+	a bare repository when looking for the repository (aka repository
     ++discovery.bare::
     ++	Specifies what kinds of directories Git can recognize as a bare
     ++	repository when looking for the repository (aka repository
      +	discovery). This has no effect if repository discovery is not
      +	performed e.g. the path to the repository is set via `--git-dir`
      +	(see linkgit:git[1]).
      ++
     -+It is recommended that you set this value so that Git will only use the bare
     -+repositories you intend it to. This prevents certain types of security and
     -+non-security problems, such as:
     -+
     -+* `git clone`-ing a repository containing a maliciously bare repository
     -+  inside it.
     -+* Git recognizing a directory that isn't mean to be a bare repository,
     -+  but happens to look like one.
     -++
     -+The currently supported values are `*` (Git recognizes all bare
     -+repositories) and the empty value (Git never recognizes bare repositories).
     -+Defaults to `*`.
     -++
      +This config setting is only respected when specified in a system or global
      +config, not when it is specified in a repository config or via the command
     -+line option `-c safe.barerepository=<path>`.
     ++line option `-c discovery.bare=<value>`.
     +++
     ++The currently supported values are `always` (Git always recognizes bare
     ++repositories) and `never` (Git never recognizes bare repositories).
     ++This defaults to `always`, but this default is likely to change.
     +++
     ++If your workflow does not rely on bare repositories, it is recommended that
     ++you set this value to `never`. This makes repository discovery easier to
     ++reason about and prevents certain types of security and non-security
     ++problems, such as:
      +
     - safe.directory::
     - 	These config entries specify Git-tracked directories that are
     - 	considered safe even if they are owned by someone other than the
     ++* `git clone`-ing a repository containing a malicious bare repository
     ++  inside it.
     ++* Git recognizing a directory that isn't meant to be a bare repository,
     ++  but happens to look like one.
      
       ## setup.c ##
     +@@
     + static int inside_git_dir = -1;
     + static int inside_work_tree = -1;
     + static int work_tree_config_is_bogus;
     ++enum discovery_bare_config {
     ++	DISCOVERY_BARE_UNKNOWN = -1,
     ++	DISCOVERY_BARE_NEVER = 0,
     ++	DISCOVERY_BARE_ALWAYS,
     ++};
     ++static enum discovery_bare_config discovery_bare_config =
     ++	DISCOVERY_BARE_UNKNOWN;
     + 
     + static struct startup_info the_startup_info;
     + struct startup_info *startup_info = &the_startup_info;
      @@ setup.c: static int ensure_valid_ownership(const char *path)
       	return data.is_safe;
       }
       
     -+/*
     -+ * This is similar to safe_directory_data, but only supports true/false.
     -+ */
     -+struct safe_bare_repository_data {
     -+	int is_safe;
     -+};
     -+
     -+static int safe_bare_repository_cb(const char *key, const char *value, void *d)
     ++static int discovery_bare_cb(const char *key, const char *value, void *d)
      +{
     -+	struct safe_bare_repository_data *data = d;
     -+
     -+	if (strcmp(key, "safe.barerepository"))
     ++	if (strcmp(key, "discovery.bare"))
      +		return 0;
      +
     -+	if (!value || !strcmp(value, "*")) {
     -+		data->is_safe = 1;
     ++	if (!strcmp(value, "never")) {
     ++		discovery_bare_config = DISCOVERY_BARE_NEVER;
      +		return 0;
      +	}
     -+	if (!*value) {
     -+		data->is_safe = 0;
     ++	if (!strcmp(value, "always")) {
     ++		discovery_bare_config = DISCOVERY_BARE_ALWAYS;
      +		return 0;
      +	}
      +	return -1;
      +}
      +
     -+static int should_detect_bare(void)
     ++static int check_bare_repo_allowed(void)
      +{
     -+	struct safe_bare_repository_data data;
     -+
     -+	read_very_early_config(safe_bare_repository_cb, &data);
     ++	if (discovery_bare_config == DISCOVERY_BARE_UNKNOWN) {
     ++		read_very_early_config(discovery_bare_cb, NULL);
     ++		/* We didn't find a value; use the default. */
     ++		if (discovery_bare_config == DISCOVERY_BARE_UNKNOWN)
     ++			discovery_bare_config = DISCOVERY_BARE_ALWAYS;
     ++	}
     ++	switch (discovery_bare_config) {
     ++	case DISCOVERY_BARE_NEVER:
     ++		return 0;
     ++	case DISCOVERY_BARE_ALWAYS:
     ++		return 1;
     ++	default:
     ++		BUG("invalid discovery_bare_config %d", discovery_bare_config);
     ++	}
     ++}
      +
     -+	return data.is_safe;
     ++static const char *discovery_bare_config_to_string(void)
     ++{
     ++	switch (discovery_bare_config) {
     ++	case DISCOVERY_BARE_NEVER:
     ++		return "never";
     ++	case DISCOVERY_BARE_ALWAYS:
     ++		return "always";
     ++	default:
     ++		BUG("invalid discovery_bare_config %d", discovery_bare_config);
     ++	}
      +}
      +
       enum discovery_result {
       	GIT_DIR_NONE = 0,
       	GIT_DIR_EXPLICIT,
     +@@ setup.c: enum discovery_result {
     + 	GIT_DIR_HIT_CEILING = -1,
     + 	GIT_DIR_HIT_MOUNT_POINT = -2,
     + 	GIT_DIR_INVALID_GITFILE = -3,
     +-	GIT_DIR_INVALID_OWNERSHIP = -4
     ++	GIT_DIR_INVALID_OWNERSHIP = -4,
     ++	GIT_DIR_DISALLOWED_BARE = -5
     + };
     + 
     + /*
      @@ setup.c: static enum discovery_result setup_git_directory_gently_1(struct strbuf *dir,
     - 			return GIT_DIR_DISCOVERED;
       		}
       
     --		if (is_git_directory(dir->buf)) {
     -+		if (should_detect_bare() && is_git_directory(dir->buf)) {
     + 		if (is_git_directory(dir->buf)) {
     ++			if (!check_bare_repo_allowed())
     ++				return GIT_DIR_DISALLOWED_BARE;
       			if (!ensure_valid_ownership(dir->buf))
       				return GIT_DIR_INVALID_OWNERSHIP;
       			strbuf_addstr(gitdir, ".");
     +@@ setup.c: const char *setup_git_directory_gently(int *nongit_ok)
     + 		}
     + 		*nongit_ok = 1;
     + 		break;
     ++	case GIT_DIR_DISALLOWED_BARE:
     ++		if (!nongit_ok) {
     ++			die(_("cannot use bare repository '%s' (discovery.bare is '%s')"),
     ++			    dir.buf,
     ++			    discovery_bare_config_to_string());
     ++		}
     ++		*nongit_ok = 1;
     ++		break;
     + 	case GIT_DIR_NONE:
     + 		/*
     + 		 * As a safeguard against setup_git_directory_gently_1 returning
      
     - ## t/t1510-repo-setup.sh ##
     -@@ t/t1510-repo-setup.sh: test_expect_success '#16e: bareness preserved by --bare' '
     - 	)
     - '
     - 
     -+# Test the tri-state of [(unset)|""|"*"].
     -+test_expect_success '#16f: bare repo in worktree' '
     -+	test_when_finished "git config --global --unset safe.barerepository" &&
     -+	setup_repo 16f unset "" unset &&
     + ## t/t0034-discovery-bare.sh (new) ##
     +@@
     ++#!/bin/sh
      +
     -+	git init --bare 16f/default/bare &&
     -+	git init --bare 16f/default/bare/bare &&
     -+	try_case 16f/default/bare unset unset \
     -+		. "(null)" "$here/16f/default/bare" "(null)" &&
     -+	try_case 16f/default/bare/bare unset unset \
     -+		. "(null)" "$here/16f/default/bare/bare" "(null)" &&
     ++test_description='verify discovery.bare checks'
      +
     -+	git config --global safe.barerepository "*" &&
     -+	git init --bare 16f/all/bare &&
     -+	git init --bare 16f/all/bare/bare &&
     -+	try_case 16f/all/bare unset unset \
     -+		. "(null)" "$here/16f/all/bare" "(null)" &&
     -+	try_case 16f/all/bare/bare unset unset \
     -+		. "(null)" "$here/16f/all/bare/bare" "(null)" &&
     ++. ./test-lib.sh
      +
     -+	git config --global safe.barerepository "" &&
     -+	git init --bare 16f/never/bare &&
     -+	git init --bare 16f/never/bare/bare &&
     -+	try_case 16f/never/bare unset unset \
     -+		".git" "$here/16f" "$here/16f" "never/bare/" &&
     -+	try_case 16f/never/bare/bare unset unset \
     -+		".git" "$here/16f" "$here/16f" "never/bare/bare/"
     -+'
     ++pwd="$(pwd)"
      +
     -+test_expect_success '#16g: inside .git with safe.barerepository' '
     -+	test_when_finished "git config --global --unset safe.barerepository" &&
     ++expect_allowed () {
     ++	git rev-parse --absolute-git-dir >actual &&
     ++	echo "$pwd/outer-repo/bare-repo" >expected &&
     ++	test_cmp expected actual
     ++}
      +
     -+	# Omit the "default" case; it is covered by 16a.
     ++expect_rejected () {
     ++	test_must_fail git rev-parse --absolute-git-dir 2>err &&
     ++	grep "discovery.bare" err
     ++}
      +
     -+	git config --global safe.barerepository "*" &&
     -+	setup_repo 16g/all unset "" unset &&
     -+	mkdir -p 16g/all/.git/wt/sub &&
     -+	try_case 16g/all/.git unset unset \
     -+		. "(null)" "$here/16g/all/.git" "(null)" &&
     -+	try_case 16g/all/.git/wt unset unset \
     -+		"$here/16g/all/.git" "(null)" "$here/16g/all/.git/wt" "(null)" &&
     -+	try_case 16g/all/.git/wt/sub unset unset \
     -+		"$here/16g/all/.git" "(null)" "$here/16g/all/.git/wt/sub" "(null)" &&
     ++test_expect_success 'setup bare repo in worktree' '
     ++	git init outer-repo &&
     ++	git init --bare outer-repo/bare-repo
     ++'
     ++
     ++test_expect_success 'discovery.bare unset' '
     ++	(
     ++		cd outer-repo/bare-repo &&
     ++		expect_allowed &&
     ++		cd refs/ &&
     ++		expect_allowed
     ++	)
     ++'
     ++
     ++test_expect_success 'discovery.bare=always' '
     ++	git config --global discovery.bare always &&
     ++	(
     ++		cd outer-repo/bare-repo &&
     ++		expect_allowed &&
     ++		cd refs/ &&
     ++		expect_allowed
     ++	)
     ++'
      +
     -+	git config --global safe.barerepository "" &&
     -+	setup_repo 16g/never unset "" unset &&
     -+	mkdir -p 16g/never/.git/wt/sub &&
     -+	try_case 16g/never/.git unset unset \
     -+		".git" "$here/16g/never" "$here/16g/never" ".git/" &&
     -+	try_case 16g/never/.git/wt unset unset \
     -+		".git" "$here/16g/never" "$here/16g/never" ".git/wt/" &&
     -+	try_case 16g/never/.git/wt/sub unset unset \
     -+		".git" "$here/16g/never" "$here/16g/never" ".git/wt/sub/"
     ++test_expect_success 'discovery.bare=never' '
     ++	git config --global discovery.bare never &&
     ++	(
     ++		cd outer-repo/bare-repo &&
     ++		expect_rejected &&
     ++		cd refs/ &&
     ++		expect_rejected
     ++	) &&
     ++	(
     ++		GIT_DIR=outer-repo/bare-repo &&
     ++		export GIT_DIR &&
     ++		expect_allowed
     ++	)
      +'
      +
     - test_expect_success '#17: GIT_WORK_TREE without explicit GIT_DIR is accepted (bare case)' '
     - 	# Just like #16.
     - 	setup_repo 17a unset "" true &&
     ++test_done
 -:  ----------- > 2:  62070aab7eb setup.c: learn discovery.bareRepository=cwd

-- 
gitgitgadget


* [PATCH v2 1/2] setup.c: make bare repo discovery optional
  2022-05-13 23:37 ` [PATCH v2 0/2] " Glen Choo via GitGitGadget
@ 2022-05-13 23:37   ` Glen Choo via GitGitGadget
  2022-05-16 18:12     ` Glen Choo
  2022-05-16 18:46     ` Derrick Stolee
  2022-05-13 23:37   ` [PATCH v2 2/2] setup.c: learn discovery.bareRepository=cwd Glen Choo via GitGitGadget
                     ` (4 subsequent siblings)
  5 siblings, 2 replies; 113+ messages in thread
From: Glen Choo via GitGitGadget @ 2022-05-13 23:37 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, brian m. carlson, Derrick Stolee, Junio C Hamano,
	Emily Shaffer, Glen Choo, Glen Choo

From: Glen Choo <chooglen@google.com>

Add a config variable, `discovery.bare`, that tells Git whether or not
it should work with the bare repository it has discovered i.e. Git will
die() if it discovers a bare repository, but it is not allowed by
`discovery.bare`. This only affects repository discovery, thus it has no
effect if discovery was not done (e.g. `--git-dir` was passed).

This is motivated by the fact that some workflows don't use bare
repositories at all, and users may prefer to opt out of bare repository
discovery altogether:

- An easy assumption for a user to make is that Git commands run
  anywhere inside a repository's working tree will use the same
  repository. However, if the working tree contains a bare repository
  below the root-level (".git" is preferred at the root-level), any
  operations inside that bare repository use the bare repository
  instead.

  In the worst case, attackers can use this confusion to trick users
  into running arbitrary code (see [1] for a deeper discussion). But
  even in benign situations (e.g. a user renames ".git/" to ".git.old/"
  and commits it for archival purposes), disabling bare repository
  discovery can be a simpler mode of operation (e.g. because the user
  doesn't actually want to use ".git.old/") [2].

- Git won't "accidentally" recognize a directory that wasn't meant to be
  a bare repository, but happens to resemble one. While such accidents
  are probably very rare in practice, this lets users reduce the chance
  to zero.

This config is an enum of:

- ["always"|(unset)]: always recognize bare repositories (like Git does
  today)
- "never": never recognize bare repositories

More values are expected to be added later, and the default is expected
to change (i.e. to something other than "always").

[1]: https://lore.kernel.org/git/kl6lsfqpygsj.fsf@chooglen-macbookpro.roam.corp.google.com
[2]: I don't personally know anyone who does this as part of their
normal workflow, but a cursory search on GitHub suggests that there is a
not insubstantial number of people who munge ".git" in order to store
its contents.

https://github.com/search?l=&o=desc&p=1&q=ref+size%3A%3C1000+filename%3AHEAD&s=indexed&type=Code
(aka search for the text "ref", size:<1000, filename:HEAD)

Signed-off-by: Glen Choo <chooglen@google.com>

WIP setup.c: make discovery.bare die on failure

Signed-off-by: Glen Choo <chooglen@google.com>
---
 Documentation/config/discovery.txt | 24 +++++++++++
 setup.c                            | 66 +++++++++++++++++++++++++++++-
 t/t0034-discovery-bare.sh          | 59 ++++++++++++++++++++++++++
 3 files changed, 148 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/config/discovery.txt
 create mode 100755 t/t0034-discovery-bare.sh

diff --git a/Documentation/config/discovery.txt b/Documentation/config/discovery.txt
new file mode 100644
index 00000000000..761cabe6e70
--- /dev/null
+++ b/Documentation/config/discovery.txt
@@ -0,0 +1,24 @@
+discovery.bare::
+	Specifies what kinds of directories Git can recognize as a bare
+	repository when looking for the repository (aka repository
+	discovery). This has no effect if repository discovery is not
+	performed e.g. the path to the repository is set via `--git-dir`
+	(see linkgit:git[1]).
++
+This config setting is only respected when specified in a system or global
+config, not when it is specified in a repository config or via the command
+line option `-c discovery.bare=<value>`.
++
+The currently supported values are `always` (Git always recognizes bare
+repositories) and `never` (Git never recognizes bare repositories).
+This defaults to `always`, but this default is likely to change.
++
+If your workflow does not rely on bare repositories, it is recommended that
+you set this value to `never`. This makes repository discovery easier to
+reason about and prevents certain types of security and non-security
+problems, such as:
+
+* `git clone`-ing a repository containing a malicious bare repository
+  inside it.
+* Git recognizing a directory that isn't meant to be a bare repository,
+  but happens to look like one.
diff --git a/setup.c b/setup.c
index a7b36f3ffbf..cee01d86f0c 100644
--- a/setup.c
+++ b/setup.c
@@ -10,6 +10,13 @@
 static int inside_git_dir = -1;
 static int inside_work_tree = -1;
 static int work_tree_config_is_bogus;
+enum discovery_bare_config {
+	DISCOVERY_BARE_UNKNOWN = -1,
+	DISCOVERY_BARE_NEVER = 0,
+	DISCOVERY_BARE_ALWAYS,
+};
+static enum discovery_bare_config discovery_bare_config =
+	DISCOVERY_BARE_UNKNOWN;
 
 static struct startup_info the_startup_info;
 struct startup_info *startup_info = &the_startup_info;
@@ -1133,6 +1140,52 @@ static int ensure_valid_ownership(const char *path)
 	return data.is_safe;
 }
 
+static int discovery_bare_cb(const char *key, const char *value, void *d)
+{
+	if (strcmp(key, "discovery.bare"))
+		return 0;
+
+	if (!strcmp(value, "never")) {
+		discovery_bare_config = DISCOVERY_BARE_NEVER;
+		return 0;
+	}
+	if (!strcmp(value, "always")) {
+		discovery_bare_config = DISCOVERY_BARE_ALWAYS;
+		return 0;
+	}
+	return -1;
+}
+
+static int check_bare_repo_allowed(void)
+{
+	if (discovery_bare_config == DISCOVERY_BARE_UNKNOWN) {
+		read_very_early_config(discovery_bare_cb, NULL);
+		/* We didn't find a value; use the default. */
+		if (discovery_bare_config == DISCOVERY_BARE_UNKNOWN)
+			discovery_bare_config = DISCOVERY_BARE_ALWAYS;
+	}
+	switch (discovery_bare_config) {
+	case DISCOVERY_BARE_NEVER:
+		return 0;
+	case DISCOVERY_BARE_ALWAYS:
+		return 1;
+	default:
+		BUG("invalid discovery_bare_config %d", discovery_bare_config);
+	}
+}
+
+static const char *discovery_bare_config_to_string(void)
+{
+	switch (discovery_bare_config) {
+	case DISCOVERY_BARE_NEVER:
+		return "never";
+	case DISCOVERY_BARE_ALWAYS:
+		return "always";
+	default:
+		BUG("invalid discovery_bare_config %d", discovery_bare_config);
+	}
+}
+
 enum discovery_result {
 	GIT_DIR_NONE = 0,
 	GIT_DIR_EXPLICIT,
@@ -1142,7 +1195,8 @@ enum discovery_result {
 	GIT_DIR_HIT_CEILING = -1,
 	GIT_DIR_HIT_MOUNT_POINT = -2,
 	GIT_DIR_INVALID_GITFILE = -3,
-	GIT_DIR_INVALID_OWNERSHIP = -4
+	GIT_DIR_INVALID_OWNERSHIP = -4,
+	GIT_DIR_DISALLOWED_BARE = -5
 };
 
 /*
@@ -1239,6 +1293,8 @@ static enum discovery_result setup_git_directory_gently_1(struct strbuf *dir,
 		}
 
 		if (is_git_directory(dir->buf)) {
+			if (!check_bare_repo_allowed())
+				return GIT_DIR_DISALLOWED_BARE;
 			if (!ensure_valid_ownership(dir->buf))
 				return GIT_DIR_INVALID_OWNERSHIP;
 			strbuf_addstr(gitdir, ".");
@@ -1385,6 +1441,14 @@ const char *setup_git_directory_gently(int *nongit_ok)
 		}
 		*nongit_ok = 1;
 		break;
+	case GIT_DIR_DISALLOWED_BARE:
+		if (!nongit_ok) {
+			die(_("cannot use bare repository '%s' (discovery.bare is '%s')"),
+			    dir.buf,
+			    discovery_bare_config_to_string());
+		}
+		*nongit_ok = 1;
+		break;
 	case GIT_DIR_NONE:
 		/*
 		 * As a safeguard against setup_git_directory_gently_1 returning
diff --git a/t/t0034-discovery-bare.sh b/t/t0034-discovery-bare.sh
new file mode 100755
index 00000000000..9c774872c4e
--- /dev/null
+++ b/t/t0034-discovery-bare.sh
@@ -0,0 +1,59 @@
+#!/bin/sh
+
+test_description='verify discovery.bare checks'
+
+. ./test-lib.sh
+
+pwd="$(pwd)"
+
+expect_allowed () {
+	git rev-parse --absolute-git-dir >actual &&
+	echo "$pwd/outer-repo/bare-repo" >expected &&
+	test_cmp expected actual
+}
+
+expect_rejected () {
+	test_must_fail git rev-parse --absolute-git-dir 2>err &&
+	grep "discovery.bare" err
+}
+
+test_expect_success 'setup bare repo in worktree' '
+	git init outer-repo &&
+	git init --bare outer-repo/bare-repo
+'
+
+test_expect_success 'discovery.bare unset' '
+	(
+		cd outer-repo/bare-repo &&
+		expect_allowed &&
+		cd refs/ &&
+		expect_allowed
+	)
+'
+
+test_expect_success 'discovery.bare=always' '
+	git config --global discovery.bare always &&
+	(
+		cd outer-repo/bare-repo &&
+		expect_allowed &&
+		cd refs/ &&
+		expect_allowed
+	)
+'
+
+test_expect_success 'discovery.bare=never' '
+	git config --global discovery.bare never &&
+	(
+		cd outer-repo/bare-repo &&
+		expect_rejected &&
+		cd refs/ &&
+		expect_rejected
+	) &&
+	(
+		GIT_DIR=outer-repo/bare-repo &&
+		export GIT_DIR &&
+		expect_allowed
+	)
+'
+
+test_done
-- 
gitgitgadget



* [PATCH v2 2/2] setup.c: learn discovery.bareRepository=cwd
  2022-05-13 23:37 ` [PATCH v2 0/2] " Glen Choo via GitGitGadget
  2022-05-13 23:37   ` [PATCH v2 1/2] " Glen Choo via GitGitGadget
@ 2022-05-13 23:37   ` Glen Choo via GitGitGadget
  2022-05-16 18:49     ` Derrick Stolee
  2022-05-16 16:40   ` [PATCH v2 0/2] setup.c: make bare repo discovery optional Junio C Hamano
                     ` (3 subsequent siblings)
  5 siblings, 1 reply; 113+ messages in thread
From: Glen Choo via GitGitGadget @ 2022-05-13 23:37 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, brian m. carlson, Derrick Stolee, Junio C Hamano,
	Emily Shaffer, Glen Choo, Glen Choo

From: Glen Choo <chooglen@google.com>

Add a 'cwd' option to discovery.bareRepository, which allows a bare
repository to be used if and only if the cwd is the root of a bare
repository. This covers the common case where a user works with a bare
repository by cd-ing into the repository's root.

Signed-off-by: Glen Choo <chooglen@google.com>
---
 Documentation/config/discovery.txt |  6 ++++--
 setup.c                            | 27 ++++++++++++++++++++-------
 t/t0034-discovery-bare.sh          | 10 ++++++++++
 3 files changed, 34 insertions(+), 9 deletions(-)

diff --git a/Documentation/config/discovery.txt b/Documentation/config/discovery.txt
index 761cabe6e70..d7cdee3a5e1 100644
--- a/Documentation/config/discovery.txt
+++ b/Documentation/config/discovery.txt
@@ -10,8 +10,10 @@ config, not when it is specified in a repository config or via the command
 line option `-c discovery.bare=<value>`.
 +
 The currently supported values are `always` (Git always recognizes bare
-repositories) and `never` (Git never recognizes bare repositories).
-This defaults to `always`, but this default is likely to change.
+repositories), `cwd` (Git only recognizes bare repositories if they are the
+current working directory) and `never` (Git never recognizes bare
+repositories). This defaults to `always`, but this default is likely to
+change.
 +
 If your workflow does not rely on bare repositories, it is recommended that
 you set this value to `never`. This makes repository discovery easier to
diff --git a/setup.c b/setup.c
index cee01d86f0c..ead999f404c 100644
--- a/setup.c
+++ b/setup.c
@@ -14,6 +14,7 @@ enum discovery_bare_config {
 	DISCOVERY_BARE_UNKNOWN = -1,
 	DISCOVERY_BARE_NEVER = 0,
 	DISCOVERY_BARE_ALWAYS,
+	DISCOVERY_BARE_CWD,
 };
 static enum discovery_bare_config discovery_bare_config =
 	DISCOVERY_BARE_UNKNOWN;
@@ -1153,10 +1154,14 @@ static int discovery_bare_cb(const char *key, const char *value, void *d)
 		discovery_bare_config = DISCOVERY_BARE_ALWAYS;
 		return 0;
 	}
+	if (!strcmp(value, "cwd")) {
+		discovery_bare_config = DISCOVERY_BARE_CWD;
+		return 0;
+	}
 	return -1;
 }
 
-static int check_bare_repo_allowed(void)
+static int check_bare_repo_allowed(const char *cwd, const char *path)
 {
 	if (discovery_bare_config == DISCOVERY_BARE_UNKNOWN) {
 		read_very_early_config(discovery_bare_cb, NULL);
@@ -1169,6 +1174,8 @@ static int check_bare_repo_allowed(void)
 		return 0;
 	case DISCOVERY_BARE_ALWAYS:
 		return 1;
+	case DISCOVERY_BARE_CWD:
+		return !strcmp(cwd, path);
 	default:
 		BUG("invalid discovery_bare_config %d", discovery_bare_config);
 	}
@@ -1181,6 +1188,8 @@ static const char *discovery_bare_config_to_string(void)
 		return "never";
 	case DISCOVERY_BARE_ALWAYS:
 		return "always";
+	case DISCOVERY_BARE_CWD:
+		return "cwd";
 	default:
 		BUG("invalid discovery_bare_config %d", discovery_bare_config);
 	}
@@ -1212,7 +1221,8 @@ enum discovery_result {
  * the discovered .git/ directory, if any. If `gitdir` is not absolute, it
  * is relative to `dir` (i.e. *not* necessarily the cwd).
  */
-static enum discovery_result setup_git_directory_gently_1(struct strbuf *dir,
+static enum discovery_result setup_git_directory_gently_1(struct strbuf *cwd,
+							  struct strbuf *dir,
 							  struct strbuf *gitdir,
 							  int die_on_error)
 {
@@ -1293,7 +1303,7 @@ static enum discovery_result setup_git_directory_gently_1(struct strbuf *dir,
 		}
 
 		if (is_git_directory(dir->buf)) {
-			if (!check_bare_repo_allowed())
+			if (!check_bare_repo_allowed(cwd->buf, dir->buf))
 				return GIT_DIR_DISALLOWED_BARE;
 			if (!ensure_valid_ownership(dir->buf))
 				return GIT_DIR_INVALID_OWNERSHIP;
@@ -1319,16 +1329,18 @@ static enum discovery_result setup_git_directory_gently_1(struct strbuf *dir,
 int discover_git_directory(struct strbuf *commondir,
 			   struct strbuf *gitdir)
 {
-	struct strbuf dir = STRBUF_INIT, err = STRBUF_INIT;
+	struct strbuf cwd = STRBUF_INIT, dir = STRBUF_INIT, err = STRBUF_INIT;
 	size_t gitdir_offset = gitdir->len, cwd_len;
 	size_t commondir_offset = commondir->len;
 	struct repository_format candidate = REPOSITORY_FORMAT_INIT;
 
-	if (strbuf_getcwd(&dir))
+	if (strbuf_getcwd(&cwd))
 		return -1;
+	strbuf_addbuf(&dir, &cwd);
 
 	cwd_len = dir.len;
-	if (setup_git_directory_gently_1(&dir, gitdir, 0) <= 0) {
+	if (setup_git_directory_gently_1(&cwd, &dir, gitdir, 0) <= 0) {
+		strbuf_release(&cwd);
 		strbuf_release(&dir);
 		return -1;
 	}
@@ -1351,6 +1363,7 @@ int discover_git_directory(struct strbuf *commondir,
 	strbuf_reset(&dir);
 	strbuf_addf(&dir, "%s/config", commondir->buf + commondir_offset);
 	read_repository_format(&candidate, dir.buf);
+	strbuf_release(&cwd);
 	strbuf_release(&dir);
 
 	if (verify_repository_format(&candidate, &err) < 0) {
@@ -1400,7 +1413,7 @@ const char *setup_git_directory_gently(int *nongit_ok)
 		die_errno(_("Unable to read current working directory"));
 	strbuf_addbuf(&dir, &cwd);
 
-	switch (setup_git_directory_gently_1(&dir, &gitdir, 1)) {
+	switch (setup_git_directory_gently_1(&cwd, &dir, &gitdir, 1)) {
 	case GIT_DIR_EXPLICIT:
 		prefix = setup_explicit_git_dir(gitdir.buf, &cwd, &repo_fmt, nongit_ok);
 		break;
diff --git a/t/t0034-discovery-bare.sh b/t/t0034-discovery-bare.sh
index 9c774872c4e..ba44cf19c99 100755
--- a/t/t0034-discovery-bare.sh
+++ b/t/t0034-discovery-bare.sh
@@ -56,4 +56,14 @@ test_expect_success 'discovery.bare=never' '
 	)
 '
 
+test_expect_success 'discovery.bare=cwd' '
+	git config --global discovery.bare cwd &&
+	(
+		cd outer-repo/bare-repo &&
+		expect_allowed &&
+		cd refs/ &&
+		expect_rejected
+	)
+'
+
 test_done
-- 
gitgitgadget


* Re: [PATCH v2 0/2] setup.c: make bare repo discovery optional
  2022-05-13 23:37 ` [PATCH v2 0/2] " Glen Choo via GitGitGadget
  2022-05-13 23:37   ` [PATCH v2 1/2] " Glen Choo via GitGitGadget
  2022-05-13 23:37   ` [PATCH v2 2/2] setup.c: learn discovery.bareRepository=cwd Glen Choo via GitGitGadget
@ 2022-05-16 16:40   ` Junio C Hamano
  2022-05-16 18:36     ` Glen Choo
  2022-05-16 16:43   ` Junio C Hamano
                     ` (2 subsequent siblings)
  5 siblings, 1 reply; 113+ messages in thread
From: Junio C Hamano @ 2022-05-16 16:40 UTC (permalink / raw)
  To: Glen Choo via GitGitGadget
  Cc: git, Taylor Blau, brian m. carlson, Derrick Stolee,
	Emily Shaffer, Glen Choo

"Glen Choo via GitGitGadget" <gitgitgadget@gmail.com> writes:

>  * die()-ing is necessary if we're trying to flip the default value of
>    discovery.bare. We'd expect many bare repo users to be broken, and it's
>    more helpful to fail loudly than to silently ignore the bare repo.
>
>    But in the long term, long after we've flipped the default and users know
>    that they need to opt into bare repo discovery, would it be a better UX
>    to just silently ignore the bare repo?

Would a middle-ground of giving a warning() message help?  Can it be
loud and annoying enough to nudge the users to adjust without
breaking the functionality?

The longer-term default should be "cwd is allowed, but we do not
bother going up from objects/04 subdirectory of a bare repository",
not "bare repositories should not be usable at all without GIT_DIR".

>      +    Add a config variable, `discovery.bare`, that tells Git whether or not
>      +    it should work with the bare repository it has discovered i.e. Git will
>      +    die() if it discovers a bare repository, but it is not allowed by

Missing comma before "i.e."

>      ++discovery.bare::
>      ++	Specifies what kinds of directories Git can recognize as a bare
>      ++	repository when looking for the repository (aka repository
>       +	discovery). This has no effect if repository discovery is not
>       +	performed e.g. the path to the repository is set via `--git-dir`
>       +	(see linkgit:git[1]).
>       ++
>       +This config setting is only respected when specified in a system or global
>       +config, not when it is specified in a repository config or via the command
>      ++line option `-c discovery.bare=<value>`.

;-)

>      +++
>      ++The currently supported values are `always` (Git always recognizes bare
>      ++repositories) and `never` (Git never recognizes bare repositories).
>      ++This defaults to `always`, but this default is likely to change.
>      +++
>      ++If your workflow does not rely on bare repositories, it is recommended that
>      ++you set this value to `never`. This makes repository discovery easier to
>      ++reason about and prevents certain types of security and non-security
>      ++problems, such as:

Hopefully "git fetch" over ssh:// and file:/// would run the other
side with GIT_DIR explicitly set?  As long as this recommendation
does not break these use cases, I think we are OK, but I do not yet
find these "problems, such as..." so convincing.


* Re: [PATCH v2 0/2] setup.c: make bare repo discovery optional
  2022-05-13 23:37 ` [PATCH v2 0/2] " Glen Choo via GitGitGadget
                     ` (2 preceding siblings ...)
  2022-05-16 16:40   ` [PATCH v2 0/2] setup.c: make bare repo discovery optional Junio C Hamano
@ 2022-05-16 16:43   ` Junio C Hamano
  2022-05-16 19:07   ` Derrick Stolee
  2022-05-27 21:09   ` [PATCH v3 0/5] config: introduce discovery.bare and protected config Glen Choo via GitGitGadget
  5 siblings, 0 replies; 113+ messages in thread
From: Junio C Hamano @ 2022-05-16 16:43 UTC (permalink / raw)
  To: Glen Choo via GitGitGadget
  Cc: git, Taylor Blau, brian m. carlson, Derrick Stolee,
	Emily Shaffer, Glen Choo

"Glen Choo via GitGitGadget" <gitgitgadget@gmail.com> writes:

>  t/t0034-discovery-bare.sh          | 69 +++++++++++++++++++++++

This number is already in use by an in-flight topic, if I am not
mistaken.  Please make it a habit to always check your topic works
well when merged to 'next' and to 'seen'.

Thanks.



* Re: [PATCH v2 1/2] setup.c: make bare repo discovery optional
  2022-05-13 23:37   ` [PATCH v2 1/2] " Glen Choo via GitGitGadget
@ 2022-05-16 18:12     ` Glen Choo
  2022-05-16 18:46     ` Derrick Stolee
  1 sibling, 0 replies; 113+ messages in thread
From: Glen Choo @ 2022-05-16 18:12 UTC (permalink / raw)
  To: Glen Choo via GitGitGadget, git
  Cc: Taylor Blau, brian m. carlson, Derrick Stolee, Junio C Hamano,
	Emily Shaffer

"Glen Choo via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Glen Choo <chooglen@google.com>
>
> Add a config variable, `discovery.bare`, that tells Git whether or not
> it should work with the bare repository it has discovered i.e. Git will
> die() if it discovers a bare repository, but it is not allowed by
> `discovery.bare`. This only affects repository discovery, thus it has no
> effect if discovery was not done (e.g. `--git-dir` was passed).
>
> This is motivated by the fact that some workflows don't use bare
> repositories at all, and users may prefer to opt out of bare repository
> discovery altogether:
>
> - An easy assumption for a user to make is that Git commands run
>   anywhere inside a repository's working tree will use the same
>   repository. However, if the working tree contains a bare repository
>   below the root-level (".git" is preferred at the root-level), any
>   operations inside that bare repository use the bare repository
>   instead.
>
>   In the worst case, attackers can use this confusion to trick users
>   into running arbitrary code (see [1] for a deeper discussion). But
>   even in benign situations (e.g. a user renames ".git/" to ".git.old/"
>   and commits it for archival purposes), disabling bare repository
>   discovery can be a simpler mode of operation (e.g. because the user
>   doesn't actually want to use ".git.old/") [2].
>
> - Git won't "accidentally" recognize a directory that wasn't meant to be
>   a bare repository, but happens to resemble one. While such accidents
>   are probably very rare in practice, this lets users reduce the chance
>   to zero.
>
> This config is an enum of:
>
> - ["always"|(unset)]: always recognize bare repositories (like Git does
>   today)
> - "never": never recognize bare repositories
>
> More values are expected to be added later, and the default is expected
> to change (i.e. to something other than "always").
>
> [1]: https://lore.kernel.org/git/kl6lsfqpygsj.fsf@chooglen-macbookpro.roam.corp.google.com
> [2]: I don't personally know anyone who does this as part of their
> normal workflow, but a cursory search on GitHub suggests that there is a
> not insubstantial number of people who munge ".git" in order to store
> its contents.
>
> https://github.com/search?l=&o=desc&p=1&q=ref+size%3A%3C1000+filename%3AHEAD&s=indexed&type=Code
> (aka search for the text "ref", size:<1000, filename:HEAD)
>
> Signed-off-by: Glen Choo <chooglen@google.com>

The intended commit message ends here...

> WIP setup.c: make discovery.bare die on failure
>
> Signed-off-by: Glen Choo <chooglen@google.com>

Ugh, dumb mistake (bad squash). Fortunately this was one of my more
professional-sounding WIP commit messages.


* Re: [PATCH v2 0/2] setup.c: make bare repo discovery optional
  2022-05-16 16:40   ` [PATCH v2 0/2] setup.c: make bare repo discovery optional Junio C Hamano
@ 2022-05-16 18:36     ` Glen Choo
  2022-05-16 19:16       ` Junio C Hamano
  0 siblings, 1 reply; 113+ messages in thread
From: Glen Choo @ 2022-05-16 18:36 UTC (permalink / raw)
  To: Junio C Hamano, Glen Choo via GitGitGadget
  Cc: git, Taylor Blau, brian m. carlson, Derrick Stolee,
	Emily Shaffer, rsbecker

Junio C Hamano <gitster@pobox.com> writes:

> "Glen Choo via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
>>  * die()-ing is necessary if we're trying to flip the default value of
>>    discovery.bare. We'd expect many bare repo users to be broken, and it's
>>    more helpful to fail loudly than to silently ignore the bare repo.
>>
>>    But in the long term, long after we've flipped the default and users know
>>    that they need to opt into bare repo discovery, would it be a better UX
>>    to just silently ignore the bare repo?
>
> Would a middle-ground of giving a warning() message help?  Can it be
> loud and annoying enough to nudge the users to adjust without
> breaking the functionality?

Personally, when my tool changes its behavior, I would strongly prefer
it to die than to "change behavior + warn". I'd feel more comfortable
knowing that the tool did nothing as opposed to doing the wrong thing
and only being informed after the fact. Also, I sometimes ignore
warnings ;)

When we _do_ transition away from die(), ignore + warning() sounds like
a good first step.

But if any of this flies in the face of the project's conventions, let
me know as such.

>>      +    Add a config variable, `discovery.bare`, that tells Git whether or not
>>      +    it should work with the bare repository it has discovered i.e. Git will
>>      +    die() if it discovers a bare repository, but it is not allowed by
>
> Missing comma before "i.e."

Thanks.

>>      +++
>>      ++The currently supported values are `always` (Git always recognizes bare
>>      ++repositories) and `never` (Git never recognizes bare repositories).
>>      ++This defaults to `always`, but this default is likely to change.
>>      +++
>>      ++If your workflow does not rely on bare repositories, it is recommended that
>>      ++you set this value to `never`. This makes repository discovery easier to
>>      ++reason about and prevents certain types of security and non-security
>>      ++problems, such as:
>
> Hopefully "git fetch" over ssh:// and file:/// would run the other
> side with GIT_DIR explicitly set?

Ah, I'll check this and get back to you.

>                                                        I do not yet
> find these "problems, such as..." so convincing.

What would be a convincing rationale to you? I'll capture that here.

I'm assuming that you already have such a rationale in mind when you
say that the longer-term default is that "we respect bare repositories
only if they are the cwd". I'm also assuming that this rationale is
something other than embedded bare repos, because "cwd-only" does not
protect against that.

Perhaps "never" sounds better to folks who don't ever expect bare
repositories and want to lock down the environment. Randall (cc-ed)
suggests one such use case in [1].

(To Randall: Oops, I actually meant to cc you earlier, since you were
the first to suggest a practical use case for never allowing bare repos.
It must've slipped my mind).

[1] https://lore.kernel.org/git/005d01d84ad0$782e8fc0$688baf40$@nexbridge.com.


* Re: [PATCH v2 1/2] setup.c: make bare repo discovery optional
  2022-05-13 23:37   ` [PATCH v2 1/2] " Glen Choo via GitGitGadget
  2022-05-16 18:12     ` Glen Choo
@ 2022-05-16 18:46     ` Derrick Stolee
  2022-05-16 22:25       ` Taylor Blau
  2022-05-17 20:24       ` Glen Choo
  1 sibling, 2 replies; 113+ messages in thread
From: Derrick Stolee @ 2022-05-16 18:46 UTC (permalink / raw)
  To: Glen Choo via GitGitGadget, git
  Cc: Taylor Blau, brian m. carlson, Junio C Hamano, Emily Shaffer, Glen Choo

On 5/13/2022 7:37 PM, Glen Choo via GitGitGadget wrote:
> From: Glen Choo <chooglen@google.com>
> 
> Add a config variable, `discovery.bare`, that tells Git whether or not
> it should work with the bare repository it has discovered i.e. Git will
> die() if it discovers a bare repository, but it is not allowed by
> `discovery.bare`. This only affects repository discovery, thus it has no
> effect if discovery was not done (e.g. `--git-dir` was passed).

> This config is an enum of:
> 
> - ["always"|(unset)]: always recognize bare repositories (like Git does
>   today)
> - "never": never recognize bare repositories
> 
> More values are expected to be added later, and the default is expected
> to change (i.e. to something other than "always").

I think it is fine to include the "never" option for users to opt-in to
this super-protected state, but I want to make it very clear that we
should never move to it as a new default. This phrasing of 'something
other than "always"' is key, but it might be good to point out that
"never" is very unlikely to be that default.

> WIP setup.c: make discovery.bare die on failure
> 
> Signed-off-by: Glen Choo <chooglen@google.com>

Accidental concatenation of squashed commit?

> diff --git a/Documentation/config/discovery.txt b/Documentation/config/discovery.txt
> new file mode 100644
> index 00000000000..761cabe6e70
> --- /dev/null
> +++ b/Documentation/config/discovery.txt
> @@ -0,0 +1,24 @@
> +discovery.bare::
> +	Specifies what kinds of directories Git can recognize as a bare
> +	repository when looking for the repository (aka repository
> +	discovery). This has no effect if repository discovery is not
> +	performed e.g. the path to the repository is set via `--git-dir`
> +	(see linkgit:git[1]).

Avoid "e.g." here.

	This has no effect if the repository is specified directly via
	the --git-dir command-line option or the GIT_DIR environment
	variable.

> +This config setting is only respected when specified in a system or global
> +config, not when it is specified in a repository config or via the command
> +line option `-c discovery.bare=<value>`.

We are sprinkling config options that have these same restrictions throughout
the config documentation. It might be time to define a term like "protected
config" at the top of git-config.txt and then refer to that from these other
locations.

> +The currently supported values are `always` (Git always recognizes bare
> +repositories) and `never` (Git never recognizes bare repositories).

This sentence will need to change as more values are added, and as it stands
it will become complicated. A bulleted list would be easier to edit in the future.

> +This defaults to `always`, but this default is likely to change.

For now, I would say "but this default may change in the future." instead.

> +If your workflow does not rely on bare repositories, it is recommended that
> +you set this value to `never`. This makes repository discovery easier to
> +reason about and prevents certain types of security and non-security
> +problems, such as:
> +

(You might need a "+" here.)

> +* `git clone`-ing a repository containing a malicious bare repository
> +  inside it.
> +* Git recognizing a directory that isn't meant to be a bare repository,
> +  but happens to look like one.

I think these last bits recommending the 'never' option are a bit
distracting. It doesn't make repository discovery "easier to reason
about" because we still discover the bare repo and die() instead of
skipping it and looking higher for a non-bare repository in the
parent directories. The case of an "accidentally-recognized bare
repo" is so unlikely it is probably not worth mentioning in these docs.

Instead, I think something like this might be better:

  If you do not use bare repositories in your workflow, then it may
  be beneficial to set `discovery.bare` to `never` in your global
  config. This will protect you from attacks that involve cloning a
  repository that contains a bare repository and running a Git
  command within that directory.

> +static int check_bare_repo_allowed(void)
> +{
> +	if (discovery_bare_config == DISCOVERY_BARE_UNKNOWN) {
> +		read_very_early_config(discovery_bare_cb, NULL);

This will add the third place where we use read_very_early_config(),
adding to the existing calls in tr2_sysenv_load() and
ensure_valid_ownership(). If I understand it correctly, that means
that every Git execution in a bare repository will now parse the
system and global config three times.

This doesn't count the check for uploadpack.packobjectshook in
upload-pack.c that uses current_config_scope() to restrict its
value to the system and global config.

We are probably at the point where we need to instead create a
configset that stores this "protected config" and allow us to
lookup config keys directly from that configset instead of
iterating through these config files repeatedly.
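
To illustrate the idea only, here is a minimal, self-contained sketch of
a "parse once, look up many times" cache. Everything in it is
hypothetical: `struct protected_config`, `protected_config_get()` and
the `mock_files` table are stand-ins invented for this example, not
Git's configset API, and `parse_count` exists only to show that the
files are read a single time.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct kv {
	const char *key;
	const char *value;
};

/* Hypothetical stand-in for the parsed system and global config files. */
static const struct kv mock_files[] = {
	{ "safe.directory", "/srv/repo" },
	{ "discovery.bare", "never" },
};

/* Counts how often the "files" are read; for demonstration only. */
static int parse_count;

struct protected_config {
	int loaded;
	const struct kv *entries;
	size_t nr;
};

/* Read the protected config once; later calls are no-ops. */
static void protected_config_load(struct protected_config *pc)
{
	if (pc->loaded)
		return;
	pc->entries = mock_files;
	pc->nr = sizeof(mock_files) / sizeof(mock_files[0]);
	pc->loaded = 1;
	parse_count++;
}

/* Look up a key in the cache; returns NULL if it is not set. */
static const char *protected_config_get(struct protected_config *pc,
					const char *key)
{
	const char *value = NULL;
	size_t i;

	assert(pc && key);
	protected_config_load(pc);
	for (i = 0; i < pc->nr; i++)
		if (!strcmp(pc->entries[i].key, key))
			value = pc->entries[i].value; /* last one wins */
	return value;
}
```

With something of this shape, each caller asks for its own key and the
underlying files are only walked on the first lookup.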

> +		/* We didn't find a value; use the default. */
> +		if (discovery_bare_config == DISCOVERY_BARE_UNKNOWN)
> +			discovery_bare_config = DISCOVERY_BARE_ALWAYS;

This could also be done in advance of the config parsing
by setting discovery_bare_config = DISCOVERY_BARE_ALWAYS before
calling read_very_early_config(). Avoids an if and a comment
here, which might be nice.
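
A rough sketch of what I mean, written as a standalone program rather
than against setup.c. The enum and callback mirror the patch, but
`read_mock_config()` is a hypothetical stand-in for
read_very_early_config() that feeds at most one key/value pair to the
callback, and check_bare_repo_allowed() takes that pair as parameters
only so the example can be exercised.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum discovery_bare_config {
	DISCOVERY_BARE_UNKNOWN = -1,
	DISCOVERY_BARE_NEVER = 0,
	DISCOVERY_BARE_ALWAYS,
};

static enum discovery_bare_config discovery_bare_config =
	DISCOVERY_BARE_UNKNOWN;

/* Same shape as the callback in the patch. */
static int discovery_bare_cb(const char *key, const char *value, void *d)
{
	(void)d;
	if (strcmp(key, "discovery.bare"))
		return 0;
	if (!value)
		return -1;
	if (!strcmp(value, "never")) {
		discovery_bare_config = DISCOVERY_BARE_NEVER;
		return 0;
	}
	if (!strcmp(value, "always")) {
		discovery_bare_config = DISCOVERY_BARE_ALWAYS;
		return 0;
	}
	return -1;
}

/* Hypothetical stand-in for read_very_early_config(). */
static void read_mock_config(const char *key, const char *value)
{
	if (key)
		(void)discovery_bare_cb(key, value, NULL);
}

static int check_bare_repo_allowed(const char *key, const char *value)
{
	if (discovery_bare_config == DISCOVERY_BARE_UNKNOWN) {
		/* Assign the default first; the callback overrides it. */
		discovery_bare_config = DISCOVERY_BARE_ALWAYS;
		read_mock_config(key, value);
	}
	assert(discovery_bare_config != DISCOVERY_BARE_UNKNOWN);
	return discovery_bare_config == DISCOVERY_BARE_ALWAYS;
}
```

The fallback "if" and its comment disappear because the variable can no
longer be UNKNOWN once the config has been read.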

> +	}
> +	switch (discovery_bare_config) {
> +	case DISCOVERY_BARE_NEVER:
> +		return 0;
> +	case DISCOVERY_BARE_ALWAYS:
> +		return 1;
> +	default:
> +		BUG("invalid discovery_bare_config %d", discovery_bare_config);
> +	}

You return -1 in discovery_bare_cb when the key matches, but
the value is not understood. Should we check the return value
of read_very_early_config(), too?

> +static const char *discovery_bare_config_to_string(void)
> +{
> +	switch (discovery_bare_config) {
> +	case DISCOVERY_BARE_NEVER:
> +		return "never";
> +	case DISCOVERY_BARE_ALWAYS:
> +		return "always";
> +	default:
> +		BUG("invalid discovery_bare_config %d", discovery_bare_config);

In general, I'm not sure these BUG() statements are helpful,
but they aren't hurting anything. I wonder if it would be
better to use DISCOVERY_BARE_UNKNOWN instead of default,
because then the compiler should notice that the switch needs
updating when a new enum mode is added.
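
Something along these lines, as a standalone sketch and not a drop-in
replacement: the function takes the enum as a parameter (the patch reads
a file-level variable instead), and assert() stands in for BUG(). The
assumption is a compiler that warns about unhandled enumerators in a
switch without a default label, as GCC and Clang do with -Wswitch.

```c
#include <assert.h>
#include <stddef.h>

enum discovery_bare_config {
	DISCOVERY_BARE_UNKNOWN = -1,
	DISCOVERY_BARE_NEVER = 0,
	DISCOVERY_BARE_ALWAYS,
};

static const char *discovery_bare_config_to_string(enum discovery_bare_config c)
{
	/*
	 * No "default" label: every enumerator is listed, so adding a
	 * new one without a matching case can trigger a -Wswitch warning.
	 */
	switch (c) {
	case DISCOVERY_BARE_NEVER:
		return "never";
	case DISCOVERY_BARE_ALWAYS:
		return "always";
	case DISCOVERY_BARE_UNKNOWN:
		break;
	}
	/* Stand-in for BUG(): only reached for UNKNOWN or a bogus value. */
	assert(!"invalid discovery_bare_config");
	return NULL;
}
```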

> @@ -1142,7 +1195,8 @@ enum discovery_result {
>  	GIT_DIR_HIT_CEILING = -1,
>  	GIT_DIR_HIT_MOUNT_POINT = -2,
>  	GIT_DIR_INVALID_GITFILE = -3,
> -	GIT_DIR_INVALID_OWNERSHIP = -4
> +	GIT_DIR_INVALID_OWNERSHIP = -4,
> +	GIT_DIR_DISALLOWED_BARE = -5

I think that you can add a comma at the end of this enum to avoid the
changed line the next time the enum needs to be expanded.

>  };
>  
>  /*
> @@ -1239,6 +1293,8 @@ static enum discovery_result setup_git_directory_gently_1(struct strbuf *dir,
>  		}
>  
>  		if (is_git_directory(dir->buf)) {
> +			if (!check_bare_repo_allowed())
> +				return GIT_DIR_DISALLOWED_BARE;

Won't this fail if someone runs a Git command inside of a .git/
directory for a non-bare repository? I just want to be sure that
we hit this error instead:

	fatal: this operation must be run in a work tree

I see that this error is tested in t0008-ignores.sh, but that's
with the default "always" value. It would be good to explicitly
check that this is the right error when using the "never" config.

Thanks,
-Stolee

* Re: [PATCH v2 2/2] setup.c: learn discovery.bareRepository=cwd
  2022-05-13 23:37   ` [PATCH v2 2/2] setup.c: learn discovery.bareRepository=cwd Glen Choo via GitGitGadget
@ 2022-05-16 18:49     ` Derrick Stolee
  0 siblings, 0 replies; 113+ messages in thread
From: Derrick Stolee @ 2022-05-16 18:49 UTC (permalink / raw)
  To: Glen Choo via GitGitGadget, git
  Cc: Taylor Blau, brian m. carlson, Junio C Hamano, Emily Shaffer, Glen Choo

On 5/13/2022 7:37 PM, Glen Choo via GitGitGadget wrote:
> From: Glen Choo <chooglen@google.com>
> 
> Add a 'cwd' option to discovery.bareRepository, which allows a bare
> repository to be used if and only if the cwd is the root of a bare
> repository. This covers the common case where a user works with a bare
> repository by cd-ing into the repository's root.

I don't consider this case valuable. In addition to allowing
the most-common use case, it also allows the most-common route
that an attacker would use to try to get a user to run a Git
command in a malicious embedded bare repo. I think we are
better off without it.

Thanks,
-Stolee

* Re: [PATCH v2 0/2] setup.c: make bare repo discovery optional
  2022-05-13 23:37 ` [PATCH v2 0/2] " Glen Choo via GitGitGadget
                     ` (3 preceding siblings ...)
  2022-05-16 16:43   ` Junio C Hamano
@ 2022-05-16 19:07   ` Derrick Stolee
  2022-05-16 22:43     ` Taylor Blau
                       ` (2 more replies)
  2022-05-27 21:09   ` [PATCH v3 0/5] config: introduce discovery.bare and protected config Glen Choo via GitGitGadget
  5 siblings, 3 replies; 113+ messages in thread
From: Derrick Stolee @ 2022-05-16 19:07 UTC (permalink / raw)
  To: Glen Choo via GitGitGadget, git
  Cc: Taylor Blau, brian m. carlson, Junio C Hamano, Emily Shaffer, Glen Choo

On 5/13/2022 7:37 PM, Glen Choo via GitGitGadget wrote:
> Thanks all for the comments on v1, I've expanded this series somewhat to
> address them,...

Please include a full cover letter with each version, so reviewers
can respond to the full series goals.

Your series here intends to start protecting against malicious
embedded bare repositories by allowing users to opt-in to a more
protected state. When the 'discovery.bare' option is set, then
Git may die() on a bare repository that is discovered based on
the current working directory (these protections are ignored if
the user specifies the directory directly through --git-dir or
$GIT_DIR).

The 'discovery.bare' option has these values at the end of your
series:

* 'always' (default) allows all bare repos, matching the current
  behavior of Git.

* 'never' avoids operating in bare repositories altogether.

* 'cwd' operates in a bare repository only if the current directory
  is exactly the root of the bare repository.

It is important that we keep 'always' as the default at first,
because we do not want to introduce a breaking change without
warning (at least for an issue like this that has been around
for a long time).

The 'never' option is a good one for very security-conscious
users who really want to avoid problems. I don't anticipate that
users who know about this option and set it themselves are the
type that would fall for the social engineering required to
attack using this vector, but I can imagine an IT department
installing the value in system config across a fleet of machines.

I find the 'cwd' option to not be valuable. It unblocks most
existing users, but also almost completely removes the protection
that the option was supposed to provide.

I find neither the 'never' nor the 'cwd' option an acceptable choice
for a future default.

I also think that this protection is too rigid: it restricts
_all_ bare repositories, not just embedded ones. There is no check
to see if the parent directory of the bare repository is inside a
non-bare repository.

This leads to what I think would be a valuable replacement for
the 'cwd' option:

* 'no-embedded' allows non-embedded bare repositories. An
  _embedded bare repository_ is a bare repository whose parent
  directory is contained in the worktree of a non-bare Git
  repository. When in this mode, embedded bare repositories are
  not allowed unless the parent non-bare Git repository has a
  'safe.embedded' config value storing the path to the current
  embedded bare repository.

That was certainly difficult to write, but here it is as
pseudo-code to hopefully remove some doubt as to how this might
work:

  if repo is bare:
    if value == "always":
       return ALLOWED
    if value == "never":
       return FORBIDDEN;

    path = get_parent_repo()

    if !path:
       return ALLOWED
    
    if config_file_has_value("{path}/.git/config", "safe.embedded", repo):
       return ALLOWED

    return FORBIDDEN

With this kind of option, we can protect users from these
social engineering attacks while providing an opt-in protection
for scenarios where embedded bare repos are currently being used
(while also not breaking anyone using non-embedded bare repos).
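
The pseudo-code above might translate to C along these lines. This is
only an untested sketch: struct parent_repo, check_bare_repo() and the
enum names are hypothetical, and the lookup of the parent repository
and its 'safe.embedded' values is assumed to have happened already
(parent is NULL when no enclosing non-bare repository was found):

```c
#include <stddef.h>
#include <string.h>

enum discovery_bare_mode {
	DISCOVERY_BARE_ALWAYS,
	DISCOVERY_BARE_NEVER,
	DISCOVERY_BARE_NO_EMBEDDED,
};

enum bare_verdict {
	BARE_ALLOWED,
	BARE_FORBIDDEN,
};

/* Hypothetical model of the enclosing non-bare repository. */
struct parent_repo {
	const char **safe_embedded; /* values of safe.embedded */
	size_t nr;
};

static enum bare_verdict check_bare_repo(enum discovery_bare_mode mode,
					 const char *bare_path,
					 const struct parent_repo *parent)
{
	size_t i;

	if (mode == DISCOVERY_BARE_ALWAYS)
		return BARE_ALLOWED;
	if (mode == DISCOVERY_BARE_NEVER)
		return BARE_FORBIDDEN;

	/* no-embedded: without a parent repo, the bare repo is not embedded */
	if (!parent)
		return BARE_ALLOWED;

	/* embedded: allowed only if the parent lists it in safe.embedded */
	for (i = 0; i < parent->nr; i++)
		if (!strcmp(parent->safe_embedded[i], bare_path))
			return BARE_ALLOWED;

	return BARE_FORBIDDEN;
}
```

The exact-string comparison is a simplification; a real implementation
would need to decide how paths in 'safe.embedded' are normalized.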

I think Taylor was mentioning something like this in his previous
replies, perhaps even to the previous thread on this topic.

This 'no-embedded' option is something that I could see as a
potential new default, after it has proven itself in a released
version of Git.

There are performance drawbacks to checking the parent path for
a Git repo, which is why it is only done when in "no-embedded"
mode.

I mentioned some other concerns in your PATCH 1 about how we
are now adding the third use of read_very_early_config() and that
we should probably refactor that before adding the third option,
in order to avoid additional performance costs as well as it
being difficult to audit which config options are only checked
from these "protected" config files.

Thanks,
-Stolee

* Re: [PATCH v2 0/2] setup.c: make bare repo discovery optional
  2022-05-16 18:36     ` Glen Choo
@ 2022-05-16 19:16       ` Junio C Hamano
  2022-05-16 20:27         ` Glen Choo
  0 siblings, 1 reply; 113+ messages in thread
From: Junio C Hamano @ 2022-05-16 19:16 UTC (permalink / raw)
  To: Glen Choo
  Cc: Glen Choo via GitGitGadget, git, Taylor Blau, brian m. carlson,
	Derrick Stolee, Emily Shaffer, rsbecker

Glen Choo <chooglen@google.com> writes:

> Junio C Hamano <gitster@pobox.com> writes:
>
>> "Glen Choo via GitGitGadget" <gitgitgadget@gmail.com> writes:
>>
>>>  * die()-ing is necessary if we're trying to flip the default value of
>>>    discovery.bare. We'd expect many bare repo users to be broken, and it's
>>>    more helpful to fail loudly than to silently ignore the bare repo.
>>>
>>>    But in the long term, long after we've flipped the default and users know
>>>    that they need to opt into bare repo discovery, would it be a better UX
>>>    to just silently ignore the bare repo?
>>
>> Would a middle-ground of giving a warning() message help?  Can it be
>> loud and annoying enough to nudge the users to adjust without
>> breaking the functionality?
>
> Personally, when my tool changes its behavior, I would strongly prefer
> it to die than to "change behavior + warn". I'd feel more comfortable
> knowing that the tool did nothing as opposed to doing the wrong thing
> and only being informed after the fact. Also, I sometimes ignore
> warnings ;)

Heh, personally I would try very hard not to change the behaviour
without being explicitly asked by the users with configuration or command
line option.  Flipping the default has traditionally been done in
two or three phases.

 (1) We start by giving a loud and annoying warning to those who
     haven't configured and tell them the default *will* change, how
     to keep the current behaviour forever, and how to live in the
     future by adopting the future default early.

 (2) After a while, we flip the default.  Those who haven't
     configured are given a notice that the default has changed, how
     to keep the old behaviour forever, and how to explicitly choose
     the same value as the default to squelch the notice.

 (3) After yet another while, we stop giving the notice.  If we
     omitted (2), here is where we flip the default.

Strictly speaking, we can have (1) in one release and then could
directly jump to (3), but some distros may skip the release that
has (1), and (2) is an attempt to help users of such distros.
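
As a rough illustration of the three phases (not code from any Git
release; every name below is made up for the sketch), the effective
built-in default and the message shown to unconfigured users could be
modelled as:

```c
#include <stddef.h>

enum flip_phase {
	FLIP_ANNOUNCE = 1,		/* (1) old default, loud warning */
	FLIP_SWITCHED_WITH_NOTICE,	/* (2) new default, notice */
	FLIP_DONE,			/* (3) new default, silent */
};

struct flip_outcome {
	int use_new_default;	/* built-in default; moot once the user configures */
	const char *message;	/* NULL when nothing is printed */
};

static struct flip_outcome default_for_phase(enum flip_phase phase,
					     int user_configured)
{
	struct flip_outcome out;

	out.use_new_default = (phase != FLIP_ANNOUNCE);
	out.message = NULL;

	/* An explicit setting, old or new, squelches the message. */
	if (user_configured)
		return out;

	if (phase == FLIP_ANNOUNCE)
		out.message = "the default will change; configure to keep "
			      "the current behaviour or to adopt the new one";
	else if (phase == FLIP_SWITCHED_WITH_NOTICE)
		out.message = "the default has changed; configure to keep "
			      "the old behaviour or to squelch this notice";
	return out;
}
```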

>> Hopefully "git fetch" over ssh:// and file:/// would run the other
>> side with GIT_DIR explicitly set?
>
> Ah, I'll check this and get back to you.
>
>>                                                        I do not yet
>> find these "problems, such as..." so convincing.
>
> What would be a convincing rationale to you? I'll capture that here.

That is a wrong question.  You are the one pushing for castrating
the bare repositories.

> I'm assuming that you already have such a rationale in mind when you
> say that the longer-term default is that "we respect bare repositories
> only if they are the cwd.". I'm also assuming that this rationale is
> something other than embedded bare repos, because "cwd-only" does not
> protect against that.

No, I do not have such a "different" rationale to justify the change
proposed in this patch.  I was saying that the claim "embedded bare
repos are risky", backed by your two examples, did not sound all
that serious a problem.  Presented with a more serious breakage
scenario, it may make the description more convincing.

Thanks.

* Re: [PATCH v2 0/2] setup.c: make bare repo discovery optional
  2022-05-16 19:16       ` Junio C Hamano
@ 2022-05-16 20:27         ` Glen Choo
  2022-05-16 22:16           ` Junio C Hamano
  0 siblings, 1 reply; 113+ messages in thread
From: Glen Choo @ 2022-05-16 20:27 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Glen Choo via GitGitGadget, git, Taylor Blau, brian m. carlson,
	Derrick Stolee, Emily Shaffer, rsbecker

Junio C Hamano <gitster@pobox.com> writes:

> Glen Choo <chooglen@google.com> writes:
>
>> Junio C Hamano <gitster@pobox.com> writes:
>>
>>> "Glen Choo via GitGitGadget" <gitgitgadget@gmail.com> writes:
>>>
>>>>  * die()-ing is necessary if we're trying to flip the default value of
>>>>    discovery.bare. We'd expect many bare repo users to be broken, and it's
>>>>    more helpful to fail loudly than to silently ignore the bare repo.
>>>>
>>>>    But in the long term, long after we've flipped the default and users know
>>>>    that they need to opt into bare repo discovery, would it be a better UX
>>>>    to just silently ignore the bare repo?
>>>
>>> Would a middle-ground of giving a warning() message help?  Can it be
>>> loud and annoying enough to nudge the users to adjust without
>>> breaking the functionality?
>>
>> Personally, when my tool changes its behavior, I would strongly prefer
>> it to die than to "change behavior + warn". I'd feel more comfortable
>> knowing that the tool did nothing as opposed to doing the wrong thing
>> and only being informed after the fact. Also, I sometimes ignore
>> warnings ;)
>
> Heh, personally I would try very hard not to change the behaviour
> without being explicitly asked by the users with configuration or command
> line option.  Flipping the default has traditionally been done in
> two or three phases.
>
>  (1) We start by giving a loud and annoying warning to those who
>      haven't configured and tell them the default *will* change, how
>      to keep the current behaviour forever, and how to live in the
>      future by adopting the future default early.
>
>  (2) After a while, we flip the default.  Those who haven't
>      configured are given a notice that the default has changed, how
>      to keep the old behaviour forever, and how to explicitly choose
>      the same value as the default to squelch the notice.
>
>  (3) After yet another while, we stop giving the notice.  If we
>      omitted (2), here is where we flip the default.
>
> Strictly speaking, we can have (1) in one release and then could
> directly jump to (3), but some distros may skip the release that
> has (1), and (2) is an attempt to help users of such distros.

Ah, that is very helpful. Thanks. It's pretty clear that I misunderstood
what you meant by "giving a warning() message" - the warning() is there
to prepare users in advance of the change; we don't actually want the
warning() in the long term.

For something as disruptive as discovering bare repos, having all of
(1), (2) and (3) sounds appropriate.

>>> Hopefully "git fetch" over ssh:// and file:/// would run the other
>>> side with GIT_DIR explicitly set?
>>
>> Ah, I'll check this and get back to you.
>>
>>>                                                        I do not yet
>>> find these "problems, such as..." so convincing.
>>
>> What would be a convincing rationale to you? I'll capture that here.
>
> That is a wrong question.  You are the one pushing for castrating
> the bare repositories.

Let me clarify in case this wasn't received the way I intended. Earlier
in the thread, you mentioned:

  The longer-term default should be "cwd is allowed, but we do not
  bother going up from object/04 subdirectory of a bare repository",
  [...]

which I took to mean "Junio thinks that, by default, Git should stop
walking up to find a bare repo, and thinks this is better because of
rationale X.", and not, "Junio does not think that the default needs to
change, but is just suggesting a better default than Glen's".

If it is the former, then there is obviously some thought process here
that is worth sharing.

If it is the latter, then I'm in favor of taking Stolee's suggestion to
drop "cwd", since nobody else finds it useful enough. (I like the
'simplification' story, but not enough to push "cwd" through, especially
since it does quite little security-wise.)

>> I'm assuming that you already have such a rationale in mind when you
>> say that the longer-term default is that "we respect bare repositories
>> only if they are the cwd.". I'm also assuming that this rationale is
>> something other than embedded bare repos, because "cwd-only" does not
>> protect against that.
>
> No, I do not have such a "different" rationale to justify the change
> proposed in this patch.  I was saying that the claim "embedded bare
> repos are risky", backed by your two examples, did not sound all
> that serious a problem.  Presented with a more serious breakage
> scenario, it may make the description more convincing.

Fair. I'll mull over this.

* Re: [PATCH v2 0/2] setup.c: make bare repo discovery optional
  2022-05-16 20:27         ` Glen Choo
@ 2022-05-16 22:16           ` Junio C Hamano
  0 siblings, 0 replies; 113+ messages in thread
From: Junio C Hamano @ 2022-05-16 22:16 UTC (permalink / raw)
  To: Glen Choo
  Cc: Glen Choo via GitGitGadget, git, Taylor Blau, brian m. carlson,
	Derrick Stolee, Emily Shaffer, rsbecker

Glen Choo <chooglen@google.com> writes:

> which I took to mean "Junio thinks that, by default, Git should stop
> walking up to find a bare repo, and thinks this is better because of
> rationale X."

The X is "it would not break existing use case too badly, just to
address a 'security' story whose severity is not so clearly
expressed".

> If it is the latter, then I'm in favor of taking Stolee's suggestion to
> drop "cwd", since nobody else finds it useful enough. (I like the
> 'simplification' story, but not enough to push "cwd" through, especially
> since it does quite little security-wise.)

As long as you'll be there to answer the angry mob that complains
loudly (and irritatingly enough, they only do so after a release is
made to flip the default), I do not care too much either way ;-).

Thanks.


* Re: [PATCH v2 1/2] setup.c: make bare repo discovery optional
  2022-05-16 18:46     ` Derrick Stolee
@ 2022-05-16 22:25       ` Taylor Blau
  2022-05-17 20:24       ` Glen Choo
  1 sibling, 0 replies; 113+ messages in thread
From: Taylor Blau @ 2022-05-16 22:25 UTC (permalink / raw)
  To: Derrick Stolee
  Cc: Glen Choo via GitGitGadget, git, brian m. carlson,
	Junio C Hamano, Emily Shaffer, Glen Choo

On Mon, May 16, 2022 at 02:46:55PM -0400, Derrick Stolee wrote:
> On 5/13/2022 7:37 PM, Glen Choo via GitGitGadget wrote:
> > From: Glen Choo <chooglen@google.com>
> >
> > Add a config variable, `discovery.bare`, that tells Git whether or not
> > it should work with the bare repository it has discovered i.e. Git will
> > die() if it discovers a bare repository, but it is not allowed by
> > `discovery.bare`. This only affects repository discovery, thus it has no
> > effect if discovery was not done (e.g. `--git-dir` was passed).
>
> > This config is an enum of:
> >
> > - ["always"|(unset)]: always recognize bare repositories (like Git does
> >   today)
> > - "never": never recognize bare repositories
> >
> > More values are expected to be added later, and the default is expected
> > to change (i.e. to something other than "always").
>
> I think it is fine to include the "never" option for users to opt-in to
> this super-protected state, but I want to make it very clear that we
> should never move to it as a new default. This phrasing of 'something
> other than "always"' is key, but it might be good to point out that
> "never" is very unlikely to be that default.

I am confused, then.

What does a user who has some legitimate (non-embedded) bare
repositories do if they are skeptical of other bare repositories? I
suspect the best answer we would be able to provide with these patches
is "use `--git-dir`".

What happens to a user who has a combination of legitimate bare
repositories, embedded bare repositories that they trust, and other
embedded bare repositories that they don't?

As far as I can tell, our recommendation with these tools would be to:

  - run `git config --global discovery.bare never`, and
  - include `--git-dir=$(pwd)` in any git invocations in bare
    repositories that they do trust

This gets at my concerns from [1] and [2] (mostly [2], in this case)
that we're trying to close the embedded bare repos problem with an
overly broad solution, at the expense of usability.

I can't shake the feeling that something like I described towards the
bottom of [2] would give you all of the security guarantees you're after
without compromising on usability for non-embedded bare repositories.

I'm happy to explore this direction more myself if you don't want to. I
would just much rather see us adopt an approach that doesn't break more
use-cases than it has to if such a thing can be avoided.

I cannot endorse these patches as-is.

Thanks,
Taylor

[1]: https://lore.kernel.org/git/Ylobp7sntKeWTLDX@nand.local/
[2]: https://lore.kernel.org/git/YnmKwLoQCorBnMe2@nand.local/

* Re: [PATCH v2 0/2] setup.c: make bare repo discovery optional
  2022-05-16 19:07   ` Derrick Stolee
@ 2022-05-16 22:43     ` Taylor Blau
  2022-05-16 23:19     ` Junio C Hamano
  2022-05-17 18:56     ` Glen Choo
  2 siblings, 0 replies; 113+ messages in thread
From: Taylor Blau @ 2022-05-16 22:43 UTC (permalink / raw)
  To: Derrick Stolee
  Cc: Glen Choo via GitGitGadget, git, brian m. carlson,
	Junio C Hamano, Emily Shaffer, Glen Choo

On Mon, May 16, 2022 at 03:07:35PM -0400, Derrick Stolee wrote:
> On 5/13/2022 7:37 PM, Glen Choo via GitGitGadget wrote:
> > Thanks all for the comments on v1, I've expanded this series somewhat to
> > address them,...
>
> Please include a full cover letter with each version, so reviewers
> can respond to the full series goals.
>
> Your series here intends to start protecting against malicious
> embedded bare repositories by allowing users to opt-in to a more
> protected state. [...]

Thanks for the summary, which I think will be especially helpful to
others looking at this series for the very first time.

> The 'never' option is a good one for very security-conscious
> users who really want to avoid problems. I don't anticipate that
> users who know about this option and set it themselves are the
> type that would fall for the social engineering required to
> attack using this vector, but I can imagine an IT department
> installing the value in system config across a fleet of machines.

When I first read this, I disagreed, since presumably that same crowd
has legitimate bare repositories that they want to continue being able
to operate in without having to pass `--git-dir` or `$GIT_DIR` in.

In fact...

> I also think that this protection is too rigid: it restricts
> _all_ bare repositories, not just embedded ones. There is no check
> to see if the parent directory of the bare repository is inside a
> non-bare repository.

...this resonates quite a bit more with me. "never" isn't a good option
unless you aren't a user of bare repositories _and_ don't have any
embedded bare repositories (either none at all, or none that you trust).

> This leads to what I think would be a valuable replacement for
> the 'cwd' option:
>
> * 'no-embedded' allows non-embedded bare repositories. An
>   _embedded bare repository_ is a bare repository whose parent
>   directory is contained in the worktree of a non-bare Git
>   repository. When in this mode, embedded bare repositories are
>   not allowed unless the parent non-bare Git repository has a
>   'safe.embedded' config value storing the path to the current
>   embedded bare repository.
>
> That was certainly difficult to write, but here it is as
> pseudo-code to hopefully remove some doubt as to how this might
> work:
>
>   if repo is bare:
>     if value == "always":
>        return ALLOWED
>     if value == "never":
>        return FORBIDDEN;

This is indeed very similar to a proposal I had made upthread (which you
note lower down in this email). One thing that's nice is that we only
have to traverse up to the parent repo when in the "no-embedded" mode.
That may be slow (since it's unbounded all the way up to the filesystem
root or a ceiling directory, whichever we encounter first), but I think
it's unavoidable if you need to distinguish between embedded and
non-embedded bare repositories.

>     path = get_parent_repo()
>
>     if !path:
>        return ALLOWED
>
>     if config_file_has_value("{path}/.git/config", "safe.embedded", repo):
>        return ALLOWED
>
>     return FORBIDDEN
>
> With this kind of option, we can protect users from these
> social engineering attacks while providing an opt-in protection
> for scenarios where embedded bare repos are currently being used
> (while also not breaking anyone using non-embedded bare repos).
>
> I think Taylor was mentioning something like this in his previous
> replies, perhaps even to the previous thread on this topic.

Yep, see: https://lore.kernel.org/git/Ylobp7sntKeWTLDX@nand.local/.

> This 'no-embedded' option is something that I could see as a
> potential new default, after it has proven itself in a released
> version of Git.

I would be totally happy to see "no-embedded" become the default. It
might be nice to issue a warning when the top-level config is unset,
to give users a heads up about cases that may be broken, perhaps like:

    if repo is bare:
      switch (value) {
      case "always":
        return ALLOWED;
      case "never":
        return FORBIDDEN;
      case "no-embedded": # fallthrough
      case "":
        path = get_parent_repo()
        if !path
          return ALLOWED;

        if config_file_has_value("{path}/.git/config", "safe.embedded", repo)
          return ALLOWED;

        if value == "no-embedded":
          return FORBIDDEN;

        # otherwise, we're in an embedded bare repository with an unset
        # discovery.bare config.
        #
        # warn that this will break in the future...
        warning(_("%s is embedded within %s"), the_repository.path, path);
        advise(_("to allow discovery for this embedded repo, either run"));
        advise(_(""));
        advise(_("  $ git config --global discovery.bare always, or"));
        advise(_("  $ git -C '%s' config --local safe.embedded '%s'"),
               path, relpath(path, the_repository.path));

        # ...but allow the invocation for now until the default is
        # changed.
        return ALLOWED;
      default:
        die(_("unrecognized value of discovery.bare: '%s'"), value);
      }

...where relpath is similar to Go's path/filepath.Rel function.
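
Stripped of the warning text and the config lookups, the decision
itself could be sketched in C as below. This is untested, and struct
bare_decision, decide_bare_repo() and the enum names are hypothetical;
has_parent_repo and in_safe_embedded stand for the results of the
get_parent_repo() and 'safe.embedded' checks in the pseudo-code:

```c
enum discovery_bare_mode {
	DISCOVERY_BARE_UNSET,		/* discovery.bare not configured */
	DISCOVERY_BARE_ALWAYS,
	DISCOVERY_BARE_NEVER,
	DISCOVERY_BARE_NO_EMBEDDED,
};

struct bare_decision {
	int allowed;	/* continue with the discovered bare repo? */
	int warn;	/* emit the "will break in the future" warning? */
};

static struct bare_decision decide_bare_repo(enum discovery_bare_mode mode,
					     int has_parent_repo,
					     int in_safe_embedded)
{
	struct bare_decision d = { 0, 0 };

	switch (mode) {
	case DISCOVERY_BARE_ALWAYS:
		d.allowed = 1;
		break;
	case DISCOVERY_BARE_NEVER:
		break;
	case DISCOVERY_BARE_NO_EMBEDDED:
	case DISCOVERY_BARE_UNSET:
		if (!has_parent_repo || in_safe_embedded) {
			d.allowed = 1;
		} else if (mode == DISCOVERY_BARE_UNSET) {
			/* transition period: warn, but keep working */
			d.allowed = 1;
			d.warn = 1;
		}
		break;
	}
	return d;
}
```

Keeping the warning as a flag in the result, rather than printing from
inside the function, is just a choice that makes the sketch easy to
test in isolation.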

With an appropriate deprecation period, I think we could even get away
from the "continue executing, but don't read config+hooks" approach,
which in retrospect is more error-prone and difficult to reason about
than I had initially given it credit for.

Thanks,
Taylor

* Re: [PATCH v2 0/2] setup.c: make bare repo discovery optional
  2022-05-16 19:07   ` Derrick Stolee
  2022-05-16 22:43     ` Taylor Blau
@ 2022-05-16 23:19     ` Junio C Hamano
  2022-05-17 18:56     ` Glen Choo
  2 siblings, 0 replies; 113+ messages in thread
From: Junio C Hamano @ 2022-05-16 23:19 UTC (permalink / raw)
  To: Derrick Stolee
  Cc: Glen Choo via GitGitGadget, git, Taylor Blau, brian m. carlson,
	Emily Shaffer, Glen Choo

Derrick Stolee <derrickstolee@github.com> writes:

> * 'no-embedded' allows non-embedded bare repositories. An
>   _embedded bare repository_ is a bare repository whose parent
>   directory is contained in the worktree of a non-bare Git
>   repository. When in this mode, embedded bare repositories are
>   not allowed unless the parent non-bare Git repository has a
>   'safe.embedded' config value storing the path to the current
>   embedded bare repository.

Sounds sensible.  I wonder how expensive this will be in practice,
but the behaviour seems well thought out.

* Re: [PATCH v2 0/2] setup.c: make bare repo discovery optional
  2022-05-16 19:07   ` Derrick Stolee
  2022-05-16 22:43     ` Taylor Blau
  2022-05-16 23:19     ` Junio C Hamano
@ 2022-05-17 18:56     ` Glen Choo
  2 siblings, 0 replies; 113+ messages in thread
From: Glen Choo @ 2022-05-17 18:56 UTC (permalink / raw)
  To: Derrick Stolee, Glen Choo via GitGitGadget, git
  Cc: Taylor Blau, brian m. carlson, Junio C Hamano, Emily Shaffer


Thanks, I think this has advanced the conversation quite a bit.

Derrick Stolee <derrickstolee@github.com> writes:

> On 5/13/2022 7:37 PM, Glen Choo via GitGitGadget wrote:
>> Thanks all for the comments on v1, I've expanded this series somewhat to
>> address them,...
>
> Please include a full cover letter with each version, so reviewers
> can respond to the full series goals.
>
> Your series here intends to start protecting against malicious
> embedded bare repositories by allowing users to opt-in to a more
> protected state. When the 'discovery.bare' option is set, then
> Git may die() on a bare repository that is discovered based on
> the current working directory (these protections are ignored if
> the user specifies the directory directly through --git-dir or
> $GIT_DIR).
>
> The 'discovery.bare' option has these values at the end of your
> series:
>
> * 'always' (default) allows all bare repos, matching the current
>   behavior of Git.
>
> * 'never' avoids operating in bare repositories altogether.
>
> * 'cwd' operates in a bare repository only if the current directory
>   is exactly the root of the bare repository.

My mistake, I should have prepared this summary myself. Thanks again.

> It is important that we keep 'always' as the default at first,
> because we do not want to introduce a breaking change without
> warning (at least for an issue like this that has been around
> for a long time).

Yes.

> The 'never' option is a good one for very security-conscious
> users who really want to avoid problems. I don't anticipate that
> users who know about this option and set it themselves are the
> type that would fall for the social engineering required to
> attack using this vector, but I can imagine an IT department
> installing the value in system config across a fleet of machines.

Yes. Setting the 'never' option in a system config is the use case that
motivated this.

> I find the 'cwd' option to not be valuable. It unblocks most
> existing users, but also almost completely removes the protection
> that the option was supposed to provide.

Ok, I agree that it provides next-to-no protection. I'll drop it in this
series; it's easy enough to reimplement if users really want it anyway.

> This leads to what I think would be a valuable replacement for
> the 'cwd' option:
>
> * 'no-embedded' allows non-embedded bare repositories. An
>   _embedded bare repository_ is a bare repository whose parent
>   directory is contained in the worktree of a non-bare Git
>   repository. When in this mode, embedded bare repositories are
>   not allowed unless the parent non-bare Git repository has a
>   'safe.embedded' config value storing the path to the current
>   embedded bare repository.
>
> That was certainly difficult to write, but here it is as
> pseudo-code to hopefully remove some doubt as to how this might
> work:
>
>   if repo is bare:
>     if value == "always":
>        return ALLOWED
>     if value == "never":
>        return FORBIDDEN;
>
>     path = get_parent_repo()
>
>     if !path:
>        return ALLOWED
>     
>     if config_file_has_value("{path}/.git/config", "safe.embedded", repo):
>        return ALLOWED
>
>     return FORBIDDEN
>
> With this kind of option, we can protect users from these
> social engineering attacks while providing an opt-in protection
> for scenarios where embedded bare repos are currently being used
> (while also not breaking anyone using non-embedded bare repos).

[...]

> This 'no-embedded' option is something that I could see as a
> potential new default, after it has proven itself in a released
> version of Git.

I agree, this sounds like a good default that should work for most
users.

That said, I don't think I will implement it, and even if I do, it won't
be in this series. I have serious doubts that I'd be able to deliver it
in a reasonable amount of time (I tried preparing patches to this effect
and failed [1]), and 'never' is sufficient for $DAYJOB's current needs.

I would be very happy to see this come to fruition though. I have no
objections to anyone preparing patches for this, and I'll gladly review
those if that's helpful.

[1] The specific trouble I had was figuring out whether or not the
 'parent' repo was tracking the bare repo, since an untracked bare repo
 in the working tree isn't (in some sense) really "embedded" and it
 can't have come from a remote.

 But maybe the tracking check is unnecessary. We would break a few more
 users without it, but 'safe.embedded' is an easy enough way for a user
 to unbreak themselves.

> I mentioned some other concerns in your PATCH 1 about how we
> are now adding the third use of read_very_early_config() and that
> we should probably refactor that before adding the third option,
> in order to avoid additional performance costs as well as it
> being difficult to audit which config options are only checked
> from these "protected" config files.

Makes sense. I'll ask about specifics on that subthread.

>
> Thanks,
> -Stolee

* Re: [PATCH v2 1/2] setup.c: make bare repo discovery optional
  2022-05-16 18:46     ` Derrick Stolee
  2022-05-16 22:25       ` Taylor Blau
@ 2022-05-17 20:24       ` Glen Choo
  2022-05-17 21:51         ` Glen Choo
  1 sibling, 1 reply; 113+ messages in thread
From: Glen Choo @ 2022-05-17 20:24 UTC (permalink / raw)
  To: Derrick Stolee, Glen Choo via GitGitGadget, git
  Cc: Taylor Blau, brian m. carlson, Junio C Hamano, Emily Shaffer

Thanks for being thorough, I find it really helpful.

For brevity, I won't reply to comments that I think are obviously good,
so you can assume I'll incorporate anything that isn't commented on.

Derrick Stolee <derrickstolee@github.com> writes:

> On 5/13/2022 7:37 PM, Glen Choo via GitGitGadget wrote:
>> From: Glen Choo <chooglen@google.com>
>> 
>> +This config setting is only respected when specified in a system or global
>> +config, not when it is specified in a repository config or via the command
>> +line option `-c discovery.bare=<value>`.
>
> We are sprinkling config options that have these same restrictions throughout
> the config documentation. It might be time to define a term like "protected
> config" at the top of git-config.txt and then refer to that from these other
> locations.

Agree, and I think defining the term will be useful in future on-list
discussions.

>> +static int check_bare_repo_allowed(void)
>> +{
>> +	if (discovery_bare_config == DISCOVERY_BARE_UNKNOWN) {
>> +		read_very_early_config(discovery_bare_cb, NULL);
>
> This will add the third place where we use read_very_early_config(),
> adding to the existing calls in tr2_sysenv_load() and
> ensure_valid_ownership(). If I understand it correctly, that means
> that every Git execution in a bare repository will now parse the
> system and global config three times.
>
> This doesn't count the check for uploadpack.packobjectshook in
> upload-pack.c that uses current_config_scope() to restrict its
> value to the system and global config.
>
> We are probably at the point where we need to instead create a
> configset that stores this "protected config" and allow us to
> lookup config keys directly from that configset instead of
> iterating through these config files repeatedly.

Looking at all of the read_very_early_config() calls,

- check_bare_repo_allowed() can use git_configset_get_string()
- ensure_valid_ownership() can use git_configset_get_value_multi()
- tr2_sysenv_load() reads every value with the "trace2." prefix. AFAICT
  configsets only support exact key lookups and I don't see an easy way
  to teach configsets to support prefix lookups.

(I didn't look too closely at uploadpack.packobjectshook because I don't
know enough about config scopes to comment.)

So using a configset, we'll still need to read the config files at least
twice. That's better than thrice, but it doesn't cover the
tr2_sysenv_load() use case, and we'll run into this yet again if we add
a function that reads all config values with a given prefix.

A hacky alternative that covers all of these use cases would be to read
all protected config in a single pass, e.g.

  struct protected_config {
         struct safe_directory_data safe_directory_data;
         const char *discovery_bare;
         struct string_list tr2_sysenv;
  };

  static int protected_config_cb(const char *var, const char *value,
                                 void *data)
  {
    /* Parse EVERYTHING that belongs in protected_config. */
    return 0;
  }

but protected_config_cb() would have to parse too many unrelated things
for my liking.

So I'll use the configset for the cases where the key is known, and
perhaps we'll punt on tr2_sysenv_load().
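
To make the trade-off concrete, here is a minimal sketch of the
cache-based approach: protected config is read once into memory, and both
exact-key lookups and prefix scans are answered from that cache.
Everything in it (`struct cfg_cache`, `cfg_cache_get()`,
`cfg_cache_for_each_prefix()`) is hypothetical and only illustrates the
idea; it is not Git's config_set API, and a real implementation would
presumably hash keys rather than scan linearly.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory cache of protected config; not Git's config_set. */
struct cfg_entry {
	const char *key;
	const char *value;
};

struct cfg_cache {
	const struct cfg_entry *entries;
	size_t nr;
};

/* Exact-key lookup: the last matching entry wins. */
const char *cfg_cache_get(const struct cfg_cache *c, const char *key)
{
	const char *found = NULL;
	size_t i;

	for (i = 0; i < c->nr; i++)
		if (!strcmp(c->entries[i].key, key))
			found = c->entries[i].value;
	return found;
}

typedef void (*cfg_each_fn)(const char *key, const char *value, void *data);

/*
 * Prefix scan: call fn (if non-NULL) for every cached entry whose key
 * starts with prefix, and return the number of matching entries.
 */
size_t cfg_cache_for_each_prefix(const struct cfg_cache *c,
				 const char *prefix,
				 cfg_each_fn fn, void *data)
{
	size_t len = strlen(prefix), i, hits = 0;

	for (i = 0; i < c->nr; i++) {
		if (strncmp(c->entries[i].key, prefix, len))
			continue;
		if (fn)
			fn(c->entries[i].key, c->entries[i].value, data);
		hits++;
	}
	return hits;
}
```

With something along these lines, a trace2-style reader could ask for
every "trace2." key without re-reading the config files, at the cost of
one pass over the cached entries per scan.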

>> +	}
>> +	switch (discovery_bare_config) {
>> +	case DISCOVERY_BARE_NEVER:
>> +		return 0;
>> +	case DISCOVERY_BARE_ALWAYS:
>> +		return 1;
>> +	default:
>> +		BUG("invalid discovery_bare_config %d", discovery_bare_config);
>> +	}
>
> You return -1 in discovery_bare_cb when the key matches, but
> the value is not understood. Should we check the return value
> of read_very_early_config(), too?

This comment doesn't apply because unlike most other config reading
functions, read_very_early_config() and read_early_config() die when the
callback returns -1.

I'm not sure why this is the case though, and maybe you think there is
value in having a non-die()-ing variant, e.g.
read_very_early_config_gently()?

>>  };
>>  
>>  /*
>> @@ -1239,6 +1293,8 @@ static enum discovery_result setup_git_directory_gently_1(struct strbuf *dir,
>>  		}
>>  
>>  		if (is_git_directory(dir->buf)) {
>> +			if (!check_bare_repo_allowed())
>> +				return GIT_DIR_DISALLOWED_BARE;
>
> Won't this fail if someone runs a Git command inside of a .git/
> directory for a non-bare repository? I just want to be sure that
> we hit this error instead:
>
> 	fatal: this operation must be run in a work tree
>
> I see that this error is tested in t0008-ignores.sh, but that's
> with the default "always" value. It would be good to explicitly
> check that this is the right error when using the "never" config.

Yes, it will fail if run inside of a .git/ directory. "never" prevents
you from working from inside .git/ unless you set GIT_DIR.

IIRC, we don't show "fatal: this operation must be run in a work
tree" for every Git command, e.g. "git log" works just fine. It makes
sense to show this warning when the CWD supports 'some, but not all' Git
commands, but I don't think this is valuable if we forbid *all* Git
commands.

Instead of trying to make "never" accommodate this use case, perhaps what
we want is a "dotgit-only" option that allows a bare repository if it is
below a .git/ directory. Since we forbid .git in the index, this seems
somewhat safe, but I hadn't proposed this sooner because I don't know if
we need it yet, and I'm certain that there are less secure edge cases
that need to be thought through.
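
As a rough illustration of what a "dotgit-only" check might look at,
here is a sketch that tests whether any component of a path is exactly
".git". The helper name `path_is_in_dotgit()` is made up for this
example, and it assumes '/'-separated paths with no normalization; it is
not code from this series.

```c
#include <string.h>

/*
 * Hypothetical helper: return 1 if any component of `path` is exactly
 * ".git", i.e. the path is a .git/ directory or lies below one.
 * Assumes '/'-separated paths; does no normalization, symlink
 * resolution, or case folding.
 */
int path_is_in_dotgit(const char *path)
{
	const char *p = path;

	while (*p) {
		const char *end;

		/* Skip any run of separators before the next component. */
		while (*p == '/')
			p++;
		end = strchr(p, '/');
		if (!end)
			end = p + strlen(p);
		if (end - p == 4 && !memcmp(p, ".git", 4))
			return 1;
		p = end;
	}
	return 0;
}
```

A byte-wise comparison like this ignores case-insensitive filesystems
and symlinks, which are the sort of edge cases that would need to be
thought through before relying on such a check.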


* Re: [PATCH v2 1/2] setup.c: make bare repo discovery optional
  2022-05-17 20:24       ` Glen Choo
@ 2022-05-17 21:51         ` Glen Choo
  0 siblings, 0 replies; 113+ messages in thread
From: Glen Choo @ 2022-05-17 21:51 UTC (permalink / raw)
  To: Derrick Stolee, Glen Choo via GitGitGadget, git
  Cc: Taylor Blau, brian m. carlson, Junio C Hamano, Emily Shaffer

Glen Choo <chooglen@google.com> writes:

>>> +static int check_bare_repo_allowed(void)
>>> +{
>>> +	if (discovery_bare_config == DISCOVERY_BARE_UNKNOWN) {
>>> +		read_very_early_config(discovery_bare_cb, NULL);
>>
>> This will add the third place where we use read_very_early_config(),
>> adding to the existing calls in tr2_sysenv_load() and
>> ensure_valid_ownership(). If I understand it correctly, that means
>> that every Git execution in a bare repository will now parse the
>> system and global config three times.
>>
>> This doesn't count the check for uploadpack.packobjectshook in
>> upload-pack.c that uses current_config_scope() to restrict its
>> value to the system and global config.
>>
>> We are probably at the point where we need to instead create a
>> configset that stores this "protected config" and allow us to
>> lookup config keys directly from that configset instead of
>> iterating through these config files repeatedly.
>
> Looking at all of the read_very_early_config() calls,
>
> - check_bare_repo_allowed() can use git_configset_get_string()
> - ensure_valid_ownership() can use git_configset_get_value_multi()
> - tr2_sysenv_load() reads every value with the "trace2." prefix. AFAICT
>   configsets only support exact key lookups and I don't see an easy way
>   to teach configsets to support prefix lookups.
>
> (I didn't look too closely at uploadpack.packobjectshook because I don't
> know enough about config scopes to comment.)
>
> So using a configset, we'll still need to read the config files at least
> twice. That's better than thrice, but it doesn't cover the
> tr2_sysenv_load() use case, and we'll run into this yet again if we add
> a function that reads all config values with a given prefix.
>
> A hacky alternative that covers all of these use cases would be to read
> all protected config in a single pass, e.g.
>
>   struct protected_config {
>          struct safe_directory_data safe_directory_data;
>          const char *discovery_bare;
>          struct string_list tr2_sysenv;
>   };
>
>   static int protected_config_cb(const char *var, const char *value,
>                                  void *data)
>   {
>     /* Parse EVERYTHING that belongs in protected_config. */
>     return 0;
>   }
>
> but protected_config_cb() would have to parse too many unrelated things
> for my liking.
>
> So I'll use the configset for the cases where the key is known, and
> perhaps we'll punt on tr2_sysenv_load().

Since I'm trying to replace read_very_early_config() anyway, is this a
good time to teach git to respect "-c safe.directory"?

My understanding of [1] is that we only ignore "-c safe.directory"
because read_very_early_config() doesn't support it, but we would prefer
to support it if we could.

[1] https://lore.kernel.org/git/xmqqlevabcsu.fsf@gitster.g/


* [PATCH v3 0/5] config: introduce discovery.bare and protected config
  2022-05-13 23:37 ` [PATCH v2 0/2] " Glen Choo via GitGitGadget
                     ` (4 preceding siblings ...)
  2022-05-16 19:07   ` Derrick Stolee
@ 2022-05-27 21:09   ` Glen Choo via GitGitGadget
  2022-05-27 21:09     ` [PATCH v3 1/5] Documentation: define protected configuration Glen Choo via GitGitGadget
                       ` (5 more replies)
  5 siblings, 6 replies; 113+ messages in thread
From: Glen Choo via GitGitGadget @ 2022-05-27 21:09 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, brian m. carlson, Derrick Stolee, Junio C Hamano,
	Emily Shaffer, Glen Choo

Thanks everyone for the feedback. This round has two major changes compared
to the last:

 * This feature is now purely motivated by the embedded bare repo attack;
   I've dropped all of the language pertaining to 'simplifying' bare
   repository discovery.

 * Protected config gets a formal definition and is implemented using a
   config_set, which should hopefully address most of the performance
   concerns of using read_very_early_config() (see 2/5 for more discussion).
   
   With this new implementation, it's easy to teach protected config to
   include "-c". This round includes some patches (4-5/5) that do this but
   I'm ok to drop them if we don't agree that they are a good idea.

This round is newly rebased onto master because
sg/safe-directory-tests-and-docs has been recently integrated and the "-c"
patches directly contradict that.

= Description

There is a known social engineering attack that takes advantage of the fact
that a working tree can include an entire bare repository, including a
config file. A user could run a Git command inside the bare repository
thinking that the config file of the 'outer' repository would be used, but
in reality, the bare repository's config file (which is attacker-controlled)
is used, which may result in arbitrary code execution. See [1] for a fuller
description and deeper discussion.

This series implements a simple way of preventing such attacks: create a
config option, discovery.bare, that tells Git whether or not to die when it
finds a bare repository. discovery.bare has two values:

 * "always": always allow bare repositories (default), identical to current
   behavior
 * "never": never allow bare repositories

and users/system administrators who never expect to work with bare
repositories can secure their environments using "never". discovery.bare has
no effect if --git-dir or GIT_DIR is passed because we are confident that
the user is not confused about which repository is being used.
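
The behavior described above can be sketched roughly as follows. The
enum constants reuse names that appear in the patch, but their numeric
values and the two helpers (`parse_discovery_bare()`,
`bare_repo_allowed()`) are assumptions made for this illustration, not
the code in setup.c.

```c
#include <string.h>

/* Names follow the patch; the numeric values are assumed. */
enum discovery_bare {
	DISCOVERY_BARE_UNKNOWN = -1,
	DISCOVERY_BARE_NEVER = 0,
	DISCOVERY_BARE_ALWAYS,
};

/* Map a config value to the enum; return 0 on success, -1 if unrecognized. */
int parse_discovery_bare(const char *value, enum discovery_bare *out)
{
	if (!value)
		return -1;
	if (!strcmp(value, "never"))
		*out = DISCOVERY_BARE_NEVER;
	else if (!strcmp(value, "always"))
		*out = DISCOVERY_BARE_ALWAYS;
	else
		return -1;
	return 0;
}

/*
 * Decide whether a bare repository may be used. An explicitly given
 * repository (--git-dir or GIT_DIR) bypasses the setting because no
 * discovery took place; an unset value behaves like "always".
 */
int bare_repo_allowed(enum discovery_bare setting, int explicit_git_dir)
{
	if (explicit_git_dir)
		return 1;
	return setting != DISCOVERY_BARE_NEVER;
}
```

In this sketch an unrecognized value is reported to the caller rather
than silently ignored, so the caller can decide whether to die().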

This series does not change the default behavior, but in the long run, a
"no-embedded" option might be a safe and usable default [2]. "never" is too
restrictive and unlikely to be the default.

For security reasons, discovery.bare cannot be read from repository-level
config (because we would end up trusting the embedded bare repository that
we aren't supposed to trust to begin with). Since this would introduce a 3rd
variable that is only read from 'protected/trusted config' (the others are
safe.directory and uploadpack.packObjectsHook) this series also defines and
creates a shared implementation for 'protected config'.
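
One way to picture the 'protected config' rule is as a filter on where a
value came from. The sketch below uses a made-up `enum cfg_scope` and
`scope_is_protected()`; Git's real scope enum and the implementation in
this series may look different. The `include_command_line` flag stands
in for the question of whether "-c" counts as protected.

```c
/* Hypothetical scope labels for this sketch; not Git's actual enum. */
enum cfg_scope {
	CFG_SCOPE_SYSTEM,
	CFG_SCOPE_GLOBAL,
	CFG_SCOPE_LOCAL,
	CFG_SCOPE_WORKTREE,
	CFG_SCOPE_COMMAND,
};

/*
 * Return 1 if a value from `scope` should be trusted as protected
 * config. System and global config always qualify; repository-level
 * config never does; "-c" qualifies only if include_command_line is set.
 */
int scope_is_protected(enum cfg_scope scope, int include_command_line)
{
	switch (scope) {
	case CFG_SCOPE_SYSTEM:
	case CFG_SCOPE_GLOBAL:
		return 1;
	case CFG_SCOPE_COMMAND:
		return !!include_command_line;
	case CFG_SCOPE_LOCAL:
	case CFG_SCOPE_WORKTREE:
		return 0;
	}
	return 0;
}
```

Repository-level scopes are rejected unconditionally here because that
config may itself be attacker-controlled.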

= Patch organization

 * Patches 1-2 define "protected config" and create a shared implementation.
 * Patch 3 introduces discovery.bare.
 * Patches 4-5 expand the definition of "protected config" to include the
   CLI option "-c" [3]. Since this is identical to how
   uploadpack.packObjectsHook currently behaves, it is refactored to a
   "protected config only" variable.

= Series history

Changes in v3:

 * Rebase onto a more recent 'master'
 * Reframe this feature only in terms of the 'embedded bare repo' attack.
 * Other docs improvements (thanks Stolee in particular!)
 * Protected config no longer uses read_very_early_config() and is only read
   once
 * Protected config now includes "-c"
 * uploadpack.packObjectsHook now uses protected config instead of ignoring
   repo config using config scopes

Changes in v2:

 * Rename safe.barerepository to discovery.bare and make it die()
 * Move tests into t/t0034-discovery-bare.sh
 * Avoid unnecessary config reading by using a static variable
 * Add discovery.bare=cwd
 * Fix typos

= Future work

 * This series does not implement the "no-embedded" option [2] and I won't
   work on it any time soon, but I'd be more than happy to review if someone
   sends patches.
 * With discovery.bare, if a builtin is marked RUN_SETUP_GENTLY, setup.c
   doesn't die() and we don't tell users why their repository was rejected,
   e.g. "git config" gives an opaque "fatal: not in a git directory". This
   isn't a new problem though, since safe.directory has the same issue.

[1]
https://lore.kernel.org/git/kl6lsfqpygsj.fsf@chooglen-macbookpro.roam.corp.google.com
[2] This was first suggested in
https://lore.kernel.org/git/5b969c5e-e802-c447-ad25-6acc0b784582@github.com
[3] https://lore.kernel.org/git/xmqqlevabcsu.fsf@gitster.g/ suggests that
safe.directory doesn't need to ignore "-c", it just happened to be
implemented that way.

Glen Choo (5):
  Documentation: define protected configuration
  config: read protected config with `git_protected_config()`
  setup.c: create `discovery.bare`
  config: include "-c" in protected config
  upload-pack: make uploadpack.packObjectsHook protected

 Documentation/config.txt            |  8 ++++
 Documentation/config/discovery.txt  | 19 ++++++++
 Documentation/config/safe.txt       | 19 ++++----
 Documentation/config/uploadpack.txt | 22 ++++------
 Documentation/glossary-content.txt  | 18 ++++++++
 config.c                            | 41 +++++++++++++++++
 config.h                            | 17 ++++++++
 repository.c                        |  5 +++
 repository.h                        |  8 ++++
 setup.c                             | 68 ++++++++++++++++++++++++++++-
 t/t0033-safe-directory.sh           | 24 +++++-----
 t/t0035-discovery-bare.sh           | 63 ++++++++++++++++++++++++++
 upload-pack.c                       | 17 +++++---
 13 files changed, 283 insertions(+), 46 deletions(-)
 create mode 100644 Documentation/config/discovery.txt
 create mode 100755 t/t0035-discovery-bare.sh


base-commit: f9b95943b68b6b8ca5a6072f50a08411c6449b55
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1261%2Fchooglen%2Fsetup%2Fdisable-bare-repo-config-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1261/chooglen/setup/disable-bare-repo-config-v3
Pull-Request: https://github.com/git/git/pull/1261

Range-diff vs v2:

 -:  ----------- > 1:  575676c760d Documentation: define protected configuration
 -:  ----------- > 2:  7499a280961 config: read protected config with `git_protected_config()`
 1:  22b10bf9da8 ! 3:  d5a3e9f9845 setup.c: make bare repo discovery optional
     @@ Metadata
      Author: Glen Choo <chooglen@google.com>
      
       ## Commit message ##
     -    setup.c: make bare repo discovery optional
     +    setup.c: create `discovery.bare`
      
     -    Add a config variable, `discovery.bare`, that tells Git whether or not
     -    it should work with the bare repository it has discovered i.e. Git will
     -    die() if it discovers a bare repository, but it is not allowed by
     -    `discovery.bare`. This only affects repository discovery, thus it has no
     -    effect if discovery was not done (e.g. `--git-dir` was passed).
     +    There is a known social engineering attack that takes advantage of the
     +    fact that a working tree can include an entire bare repository,
     +    including a config file. A user could run a Git command inside the bare
     +    repository thinking that the config file of the 'outer' repository would
     +    be used, but in reality, the bare repository's config file (which is
     +    attacker-controlled) is used, which may result in arbitrary code
     +    execution. See [1] for a fuller description and deeper discussion.
      
     -    This is motivated by the fact that some workflows don't use bare
     -    repositories at all, and users may prefer to opt out of bare repository
     -    discovery altogether:
     +    A simple mitigation is to forbid bare repositories unless specified via
     +    `--git-dir` or `GIT_DIR`. In environments that don't use bare
     +    repositories, this would be minimally disruptive.
      
     -    - An easy assumption for a user to make is that Git commands run
     -      anywhere inside a repository's working tree will use the same
     -      repository. However, if the working tree contains a bare repository
     -      below the root-level (".git" is preferred at the root-level), any
     -      operations inside that bare repository use the bare repository
     -      instead.
     -
     -      In the worst case, attackers can use this confusion to trick users
     -      into running arbitrary code (see [1] for a deeper discussion). But
     -      even in benign situations (e.g. a user renames ".git/" to ".git.old/"
     -      and commits it for archival purposes), disabling bare repository
     -      discovery can be a simpler mode of operation (e.g. because the user
     -      doesn't actually want to use ".git.old/") [2].
     -
     -    - Git won't "accidentally" recognize a directory that wasn't meant to be
     -      a bare repository, but happens to resemble one. While such accidents
     -      are probably very rare in practice, this lets users reduce the chance
     -      to zero.
     +    Create a config variable, `discovery.bare`, that tells Git whether or
     +    not to die() when it discovers a bare repository. This only affects
     +    repository discovery, thus it has no effect if discovery was not
     +    done (e.g. `--git-dir` was passed).
      
          This config is an enum of:
      
     -    - ["always"|(unset)]: always recognize bare repositories (like Git does
     -      today)
     -    - "never": never recognize bare repositories
     +    - "always": always allow bare repositories (this is the default)
     +    - "never": never allow bare repositories
      
     -    More values are expected to be added later, and the default is expected
     -    to change (i.e. to something other than "always").
     +    If we want to protect users from such attacks by default, neither value
     +    will suffice - "always" provides no protection, but "never" is
     +    impractical for bare repository users. A more usable default would be to
     +    allow only non-embedded bare repositories ([2] contains one such
     +    proposal), but detecting if a repository is embedded is potentially
     +    non-trivial, so this work is not implemented in this series.
      
          [1]: https://lore.kernel.org/git/kl6lsfqpygsj.fsf@chooglen-macbookpro.roam.corp.google.com
     -    [2]: I don't personally know anyone who does this as part of their
     -    normal workflow, but a cursory search on GitHub suggests that there is a
     -    not insubstantial number of people who munge ".git" in order to store
     -    its contents.
     -
     -    https://github.com/search?l=&o=desc&p=1&q=ref+size%3A%3C1000+filename%3AHEAD&s=indexed&type=Code
     -    (aka search for the text "ref", size:<1000, filename:HEAD)
     +    [2]: https://lore.kernel.org/git/5b969c5e-e802-c447-ad25-6acc0b784582@github.com
      
          Signed-off-by: Glen Choo <chooglen@google.com>
      
     -    WIP setup.c: make discovery.bare die on failure
     -
     -    Signed-off-by: Glen Choo <chooglen@google.com>
     + ## Documentation/config.txt ##
     +@@ Documentation/config.txt: include::config/diff.txt[]
     + 
     + include::config/difftool.txt[]
     + 
     ++include::config/discovery.txt[]
     ++
     + include::config/extensions.txt[]
     + 
     + include::config/fastimport.txt[]
      
       ## Documentation/config/discovery.txt (new) ##
      @@
      +discovery.bare::
     -+	Specifies what kinds of directories Git can recognize as a bare
     -+	repository when looking for the repository (aka repository
     -+	discovery). This has no effect if repository discovery is not
     -+	performed e.g. the path to the repository is set via `--git-dir`
     -+	(see linkgit:git[1]).
     ++	'(Protected config only)' Specifies whether Git will work with a
     ++	bare repository that it found during repository discovery. This
     ++	has no effect if the repository is specified directly via the
     ++	--git-dir command-line option or the GIT_DIR environment
     ++	variable (see linkgit:git[1]).
      ++
     -+This config setting is only respected when specified in a system or global
     -+config, not when it is specified in a repository config or via the command
     -+line option `-c discovery.bare=<value>`.
     ++The currently supported values are:
      ++
     -+The currently supported values are `always` (Git always recognizes bare
     -+repositories) and `never` (Git never recognizes bare repositories).
     -+This defaults to `always`, but this default is likely to change.
     ++* `always`: Git always works with bare repositories
     ++* `never`: Git never works with bare repositories
      ++
     -+If your workflow does not rely on bare repositories, it is recommended that
     -+you set this value to `never`. This makes repository discovery easier to
     -+reason about and prevents certain types of security and non-security
     -+problems, such as:
     -+
     -+* `git clone`-ing a repository containing a malicious bare repository
     -+  inside it.
     -+* Git recognizing a directory that isn't meant to be a bare repository,
     -+  but happens to look like one.
     ++This defaults to `always`, but this default may change in the future.
     +++
     ++If you do not use bare repositories in your workflow, then it may be
     ++beneficial to set `discovery.bare` to `never` in your global config.
     ++This will protect you from attacks that involve cloning a repository
     ++that contains a bare repository and running a Git command within that
     ++directory.
      
       ## setup.c ##
      @@
     @@ setup.c: static int ensure_valid_ownership(const char *path)
      +static int check_bare_repo_allowed(void)
      +{
      +	if (discovery_bare_config == DISCOVERY_BARE_UNKNOWN) {
     -+		read_very_early_config(discovery_bare_cb, NULL);
     -+		/* We didn't find a value; use the default. */
     -+		if (discovery_bare_config == DISCOVERY_BARE_UNKNOWN)
     -+			discovery_bare_config = DISCOVERY_BARE_ALWAYS;
     ++		discovery_bare_config = DISCOVERY_BARE_ALWAYS;
     ++		git_protected_config(discovery_bare_cb, NULL);
      +	}
      +	switch (discovery_bare_config) {
      +	case DISCOVERY_BARE_NEVER:
      +		return 0;
      +	case DISCOVERY_BARE_ALWAYS:
      +		return 1;
     -+	default:
     ++	case DISCOVERY_BARE_UNKNOWN:
      +		BUG("invalid discovery_bare_config %d", discovery_bare_config);
      +	}
     ++	return 0;
      +}
      +
      +static const char *discovery_bare_config_to_string(void)
     @@ setup.c: static int ensure_valid_ownership(const char *path)
      +		return "never";
      +	case DISCOVERY_BARE_ALWAYS:
      +		return "always";
     -+	default:
     ++	case DISCOVERY_BARE_UNKNOWN:
      +		BUG("invalid discovery_bare_config %d", discovery_bare_config);
      +	}
     ++	return NULL;
      +}
      +
       enum discovery_result {
     @@ setup.c: enum discovery_result {
       	GIT_DIR_INVALID_GITFILE = -3,
      -	GIT_DIR_INVALID_OWNERSHIP = -4
      +	GIT_DIR_INVALID_OWNERSHIP = -4,
     -+	GIT_DIR_DISALLOWED_BARE = -5
     ++	GIT_DIR_DISALLOWED_BARE = -5,
       };
       
       /*
     @@ setup.c: const char *setup_git_directory_gently(int *nongit_ok)
       		/*
       		 * As a safeguard against setup_git_directory_gently_1 returning
      
     - ## t/t0034-discovery-bare.sh (new) ##
     + ## t/t0035-discovery-bare.sh (new) ##
      @@
      +#!/bin/sh
      +
     @@ t/t0034-discovery-bare.sh (new)
      +
      +pwd="$(pwd)"
      +
     -+expect_allowed () {
     -+	git rev-parse --absolute-git-dir >actual &&
     -+	echo "$pwd/outer-repo/bare-repo" >expected &&
     -+	test_cmp expected actual
     -+}
     -+
      +expect_rejected () {
     -+	test_must_fail git rev-parse --absolute-git-dir 2>err &&
     ++	test_must_fail git rev-parse --git-dir 2>err &&
      +	grep "discovery.bare" err
      +}
      +
     @@ t/t0034-discovery-bare.sh (new)
      +test_expect_success 'discovery.bare unset' '
      +	(
      +		cd outer-repo/bare-repo &&
     -+		expect_allowed &&
     -+		cd refs/ &&
     -+		expect_allowed
     ++		git rev-parse --git-dir
      +	)
      +'
      +
     @@ t/t0034-discovery-bare.sh (new)
      +	git config --global discovery.bare always &&
      +	(
      +		cd outer-repo/bare-repo &&
     -+		expect_allowed &&
     -+		cd refs/ &&
     -+		expect_allowed
     ++		git rev-parse --git-dir
      +	)
      +'
      +
     @@ t/t0034-discovery-bare.sh (new)
      +	git config --global discovery.bare never &&
      +	(
      +		cd outer-repo/bare-repo &&
     -+		expect_rejected &&
     -+		cd refs/ &&
      +		expect_rejected
     -+	) &&
     ++	)
     ++'
     ++
     ++test_expect_success 'discovery.bare in the repository' '
     ++	(
     ++		cd outer-repo/bare-repo &&
     ++		# Temporarily set discovery.bare=always, otherwise git
     ++		# config fails with "fatal: not in a git directory"
     ++		# (like safe.directory)
     ++		git config --global discovery.bare always &&
     ++		git config discovery.bare always &&
     ++		git config --global discovery.bare never &&
     ++		expect_rejected
     ++	)
     ++'
     ++
     ++test_expect_success 'discovery.bare on the command line' '
     ++	git config --global discovery.bare never &&
      +	(
     -+		GIT_DIR=outer-repo/bare-repo &&
     -+		export GIT_DIR &&
     -+		expect_allowed
     ++		cd outer-repo/bare-repo &&
     ++		test_must_fail git -c discovery.bare=always rev-parse --git-dir 2>err &&
     ++		grep "discovery.bare" err
      +	)
      +'
      +
 2:  62070aab7eb < -:  ----------- setup.c: learn discovery.bareRepository=cwd
 -:  ----------- > 4:  66a0a208176 config: include "-c" in protected config
 -:  ----------- > 5:  e25d5907cd1 upload-pack: make uploadpack.packObjectsHook protected

-- 
gitgitgadget


* [PATCH v3 1/5] Documentation: define protected configuration
  2022-05-27 21:09   ` [PATCH v3 0/5] config: introduce discovery.bare and protected config Glen Choo via GitGitGadget
@ 2022-05-27 21:09     ` Glen Choo via GitGitGadget
  2022-05-27 23:29       ` Junio C Hamano
  2022-05-27 21:09     ` [PATCH v3 2/5] config: read protected config with `git_protected_config()` Glen Choo via GitGitGadget
                       ` (4 subsequent siblings)
  5 siblings, 1 reply; 113+ messages in thread
From: Glen Choo via GitGitGadget @ 2022-05-27 21:09 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, brian m. carlson, Derrick Stolee, Junio C Hamano,
	Emily Shaffer, Glen Choo, Glen Choo

From: Glen Choo <chooglen@google.com>

For security reasons, some config variables are only trusted when they
are specified in so-called 'protected configuration' [1]. A future
commit will introduce another such config variable, so this is a good
time to standardize the documentation and implementation of 'protected
configuration'.

Define 'protected configuration' as global and system-level config, and
mark `safe.directory` 'Protected config only'. In a future commit,
protected configuration will also include "-c".

The following variables are intentionally not marked 'Protected config
only':

- `uploadpack.packObjectsHook` has the same security concerns as
  `safe.directory`, but due to a different implementation, it also
  respects the "-c" option.

  When protected configuration includes "-c", `uploadpack.packObjectsHook`
  will be marked 'Protected config only'.

- `trace2.*` happens to read the same config as `safe.directory` because
  they share an implementation. However, this is not for security
  reasons; it is because we want to start tracing so early that
  repository-level config and "-c" are not available [2].

  This requirement is unique to `trace2.*`, so it does not make sense
  for protected configuration to be subject to the same constraints.

[1] For example,
https://lore.kernel.org/git/6af83767-576b-75c4-c778-0284344a8fe7@github.com/
[2] https://lore.kernel.org/git/a0c89d0d-669e-bf56-25d2-cbb09b012e70@jeffhostetler.com/

Signed-off-by: Glen Choo <chooglen@google.com>
---
 Documentation/config.txt           |  6 ++++++
 Documentation/config/safe.txt      | 19 ++++++++-----------
 Documentation/glossary-content.txt | 18 ++++++++++++++++++
 3 files changed, 32 insertions(+), 11 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index e284b042f22..07832de1a6c 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -369,6 +369,12 @@ inventing new variables for use in your own tool, make sure their
 names do not conflict with those that are used by Git itself and
 other popular tools, and describe them in your documentation.
 
+Variables marked with '(Protected config only)' are only respected when
+they are specified in protected configuration. This includes global and
+system-level config, and excludes repository config, the command line
+option `-c`, and environment variables. For more details, see the
+'protected configuration' entry in linkgit:gitglossary[7].
+
 include::config/advice.txt[]
 
 include::config/core.txt[]
diff --git a/Documentation/config/safe.txt b/Documentation/config/safe.txt
index ae0e2e3bdb4..c1caec460e8 100644
--- a/Documentation/config/safe.txt
+++ b/Documentation/config/safe.txt
@@ -1,21 +1,18 @@
 safe.directory::
-	These config entries specify Git-tracked directories that are
-	considered safe even if they are owned by someone other than the
-	current user. By default, Git will refuse to even parse a Git
-	config of a repository owned by someone else, let alone run its
-	hooks, and this config setting allows users to specify exceptions,
-	e.g. for intentionally shared repositories (see the `--shared`
-	option in linkgit:git-init[1]).
+	'(Protected config only) ' These config entries specify
+	Git-tracked directories that are considered safe even if they
+	are owned by someone other than the current user. By default,
+	Git will refuse to even parse a Git config of a repository owned
+	by someone else, let alone run its hooks, and this config
+	setting allows users to specify exceptions, e.g. for
+	intentionally shared repositories (see the `--shared` option in
+	linkgit:git-init[1]).
 +
 This is a multi-valued setting, i.e. you can add more than one directory
 via `git config --add`. To reset the list of safe directories (e.g. to
 override any such directories specified in the system config), add a
 `safe.directory` entry with an empty value.
 +
-This config setting is only respected when specified in a system or global
-config, not when it is specified in a repository config, via the command
-line option `-c safe.directory=<path>`, or in environment variables.
-+
 The value of this setting is interpolated, i.e. `~/<path>` expands to a
 path relative to the home directory and `%(prefix)/<path>` expands to a
 path relative to Git's (runtime) prefix.
diff --git a/Documentation/glossary-content.txt b/Documentation/glossary-content.txt
index aa2f41f5e70..a669983abd6 100644
--- a/Documentation/glossary-content.txt
+++ b/Documentation/glossary-content.txt
@@ -483,6 +483,24 @@ exclude;;
 	head ref. If the remote <<def_head,head>> is not an
 	ancestor to the local head, the push fails.
 
+[[def_protected_config]]protected configuration::
+	Protected configuration is configuration that Git considers more
+	trustworthy because it is unlikely to be tampered with by an
+	attacker. For security reasons, some configuration variables are
+	only respected when they are defined in protected configuration.
++
+Protected configuration includes:
++
+- system-level config, e.g. `/etc/git/config`
+- global config, e.g. `$XDG_CONFIG_HOME/git/config` and
+  `$HOME/.gitconfig`
++
+Protected configuration excludes:
++
+- repository config, e.g. `$GIT_DIR/config` and
+  `$GIT_DIR/config.worktree`
+- the command line option `-c` and its equivalent environment variables
+
 [[def_reachable]]reachable::
 	All of the ancestors of a given <<def_commit,commit>> are said to be
 	"reachable" from that commit. More
-- 
gitgitgadget


* [PATCH v3 2/5] config: read protected config with `git_protected_config()`
  2022-05-27 21:09   ` [PATCH v3 0/5] config: introduce discovery.bare and protected config Glen Choo via GitGitGadget
  2022-05-27 21:09     ` [PATCH v3 1/5] Documentation: define protected configuration Glen Choo via GitGitGadget
@ 2022-05-27 21:09     ` Glen Choo via GitGitGadget
  2022-05-28  0:28       ` Junio C Hamano
  2022-06-02 12:56       ` Derrick Stolee
  2022-05-27 21:09     ` [PATCH v3 3/5] setup.c: create `discovery.bare` Glen Choo via GitGitGadget
                       ` (3 subsequent siblings)
  5 siblings, 2 replies; 113+ messages in thread
From: Glen Choo via GitGitGadget @ 2022-05-27 21:09 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, brian m. carlson, Derrick Stolee, Junio C Hamano,
	Emily Shaffer, Glen Choo, Glen Choo

From: Glen Choo <chooglen@google.com>

Protected config is read using `read_very_early_config()`, which has
several downsides:

- Every call to `read_very_early_config()` parses global and
  system-level config files anew, but this can be optimized by just
  parsing them once [1].
- Protected variables should respect "-c" because we can reasonably
  assume that it comes from the user. But, `read_very_early_config()`
  can't use "-c" because it is called so early that it does not have
  access to command line arguments.

Introduce `git_protected_config()`, which reads protected config and
caches the values in `the_repository->protected_config`. Then, refactor
`safe.directory` to use `git_protected_config()`.

This implementation can still be improved, however:

- `git_protected_config()` iterates through every variable in
  `the_repository->protected_config`, which may still be too expensive to
  be called in every "git" invocation. There exist constant time lookup
  functions for non-protected config (repo_config_get_*()), but for
  simplicity, this commit does not implement similar functions for
  protected config.

- Protected config is stored in `the_repository` so that we don't need
  to statically allocate it. But this might be confusing since protected
  config ignores repository config by definition.

[1] While `git_protected_config()` should save on file I/O, I wasn't
able to measure a meaningful difference between that and
`read_very_early_config()` on my machine (which has an SSD).
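
The parse-once behaviour described above can be sketched in isolation.
This is only an illustration under stated assumptions: `config_cache`,
`load_protected_config()`, `protected_config()` and the hard-coded
key/value pairs are hypothetical stand-ins, not Git's `config_set` API,
and no real files are read.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; these are not Git's real types. */
typedef int (*config_cb)(const char *key, const char *value, void *data);

struct kv {
	const char *key;
	const char *value;
};

#define MAX_ENTRIES 8

struct config_cache {
	int initialized;
	size_t nr;
	struct kv entries[MAX_ENTRIES];
};

struct config_cache protected_cache; /* zero-initialized: not loaded yet */
int parse_count; /* how many times the (pretend) files were parsed */

/* Pretend to parse the system and global config files. */
void load_protected_config(struct config_cache *cache)
{
	parse_count++;
	cache->entries[0] = (struct kv){ "safe.directory", "/srv/shared" };
	cache->entries[1] = (struct kv){ "discovery.bare", "never" };
	cache->nr = 2;
	cache->initialized = 1;
}

/* Parse on first use only, then replay every cached pair to the callback. */
void protected_config(config_cb fn, void *data)
{
	size_t i;

	if (!protected_cache.initialized)
		load_protected_config(&protected_cache);
	assert(protected_cache.nr <= MAX_ENTRIES);
	for (i = 0; i < protected_cache.nr; i++)
		fn(protected_cache.entries[i].key,
		   protected_cache.entries[i].value, data);
}

/* Example callback: counts the pairs it is shown. */
int count_entries(const char *key, const char *value, void *data)
{
	(void)key;
	(void)value;
	++*(int *)data;
	return 0;
}
```

Calling `protected_config(count_entries, &n)` twice visits the cached
pairs twice but bumps `parse_count` only once. Each call still walks
every entry, which is the remaining per-call cost noted in the list
above.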

Signed-off-by: Glen Choo <chooglen@google.com>
---
 config.c     | 35 +++++++++++++++++++++++++++++++++++
 config.h     |  8 ++++++++
 repository.c |  5 +++++
 repository.h |  8 ++++++++
 setup.c      |  2 +-
 5 files changed, 57 insertions(+), 1 deletion(-)

diff --git a/config.c b/config.c
index fa471dbdb89..c30bb7c5d09 100644
--- a/config.c
+++ b/config.c
@@ -2614,6 +2614,41 @@ int repo_config_get_pathname(struct repository *repo,
 	return ret;
 }
 
+/* Read protected config into the_repository->protected_config. */
+static void read_protected_config(void)
+{
+	char *xdg_config = NULL, *user_config = NULL, *system_config = NULL;
+
+	CALLOC_ARRAY(the_repository->protected_config, 1);
+	git_configset_init(the_repository->protected_config);
+
+	system_config = git_system_config();
+	git_global_config(&user_config, &xdg_config);
+
+	git_configset_add_file(the_repository->protected_config, system_config);
+	git_configset_add_file(the_repository->protected_config, xdg_config);
+	git_configset_add_file(the_repository->protected_config, user_config);
+
+	free(system_config);
+	free(xdg_config);
+	free(user_config);
+}
+
+/* Ensure that the_repository->protected_config has been initialized. */
+static void git_protected_config_check_init(void)
+{
+	if (the_repository->protected_config &&
+	    the_repository->protected_config->hash_initialized)
+		return;
+	read_protected_config();
+}
+
+void git_protected_config(config_fn_t fn, void *data)
+{
+	git_protected_config_check_init();
+	configset_iter(the_repository->protected_config, fn, data);
+}
+
 /* Functions used historically to read configuration from 'the_repository' */
 void git_config(config_fn_t fn, void *data)
 {
diff --git a/config.h b/config.h
index 7654f61c634..411965f52b5 100644
--- a/config.h
+++ b/config.h
@@ -505,6 +505,14 @@ int repo_config_get_maybe_bool(struct repository *repo,
 int repo_config_get_pathname(struct repository *repo,
 			     const char *key, const char **dest);
 
+/*
+ * Functions for reading protected config. By definition, protected
+ * config ignores repository config, so it is unnecessary to read
+ * protected config from any `struct repository` other than
+ * the_repository.
+ */
+void git_protected_config(config_fn_t fn, void *data);
+
 /**
  * Querying For Specific Variables
  * -------------------------------
diff --git a/repository.c b/repository.c
index 5d166b692c8..ec319a5e09a 100644
--- a/repository.c
+++ b/repository.c
@@ -295,6 +295,11 @@ void repo_clear(struct repository *repo)
 		FREE_AND_NULL(repo->remote_state);
 	}
 
+	if (repo->protected_config) {
+		git_configset_clear(repo->protected_config);
+		FREE_AND_NULL(repo->protected_config);
+	}
+
 	repo_clear_path_cache(&repo->cached_paths);
 }
 
diff --git a/repository.h b/repository.h
index 6cc661e5a43..24251aac553 100644
--- a/repository.h
+++ b/repository.h
@@ -126,6 +126,14 @@ struct repository {
 
 	struct repo_settings settings;
 
+	/*
+	 * Config that comes from trusted sources, namely
+	 * - system config files (e.g. /etc/gitconfig)
+	 * - global config files (e.g. $HOME/.gitconfig,
+	 *   $XDG_CONFIG_HOME/git)
+	 */
+	struct config_set *protected_config;
+
 	/* Subsystems */
 	/*
 	 * Repository's config which contains key-value pairs from the usual
diff --git a/setup.c b/setup.c
index f818dd858c6..847d47f9195 100644
--- a/setup.c
+++ b/setup.c
@@ -1128,7 +1128,7 @@ static int ensure_valid_ownership(const char *path)
 	    is_path_owned_by_current_user(path))
 		return 1;
 
-	read_very_early_config(safe_directory_cb, &data);
+	git_protected_config(safe_directory_cb, &data);
 
 	return data.is_safe;
 }
-- 
gitgitgadget


* [PATCH v3 3/5] setup.c: create `discovery.bare`
  2022-05-27 21:09   ` [PATCH v3 0/5] config: introduce discovery.bare and protected config Glen Choo via GitGitGadget
  2022-05-27 21:09     ` [PATCH v3 1/5] Documentation: define protected configuration Glen Choo via GitGitGadget
  2022-05-27 21:09     ` [PATCH v3 2/5] config: read protected config with `git_protected_config()` Glen Choo via GitGitGadget
@ 2022-05-27 21:09     ` Glen Choo via GitGitGadget
  2022-05-28  0:59       ` Junio C Hamano
  2022-06-02 13:11       ` Derrick Stolee
  2022-05-27 21:09     ` [PATCH v3 4/5] config: include "-c" in protected config Glen Choo via GitGitGadget
                       ` (2 subsequent siblings)
  5 siblings, 2 replies; 113+ messages in thread
From: Glen Choo via GitGitGadget @ 2022-05-27 21:09 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, brian m. carlson, Derrick Stolee, Junio C Hamano,
	Emily Shaffer, Glen Choo, Glen Choo

From: Glen Choo <chooglen@google.com>

There is a known social engineering attack that takes advantage of the
fact that a working tree can include an entire bare repository,
including a config file. A user could run a Git command inside the bare
repository thinking that the config file of the 'outer' repository would
be used, but in reality, the bare repository's config file (which is
attacker-controlled) is used, which may result in arbitrary code
execution. See [1] for a fuller description and deeper discussion.

A simple mitigation is to forbid bare repositories unless specified via
`--git-dir` or `GIT_DIR`. In environments that don't use bare
repositories, this would be minimally disruptive.

Create a config variable, `discovery.bare`, that tells Git whether or
not to die() when it discovers a bare repository. This only affects
repository discovery, thus it has no effect if discovery was not
done (e.g. `--git-dir` was passed).

This config is an enum of:

- "always": always allow bare repositories (this is the default)
- "never": never allow bare repositories

If we want to protect users from such attacks by default, neither value
will suffice - "always" provides no protection, but "never" is
impractical for bare repository users. A more usable default would be to
allow only non-embedded bare repositories ([2] contains one such
proposal), but detecting if a repository is embedded is potentially
non-trivial, so this work is not implemented in this series.
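
As a rough, standalone sketch of the two-valued setting described above
(not the code in this patch): `bare_policy`, `parse_bare_policy()` and
`bare_repo_allowed()` are hypothetical names, and rejecting a missing
value is an assumption of the sketch rather than something the patch
specifies.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum bare_policy {
	BARE_UNKNOWN = -1,
	BARE_NEVER = 0,
	BARE_ALWAYS
};

/*
 * Map a config string to a policy. Returns 0 and sets *out on success,
 * or -1 (leaving *out untouched) for a missing or unrecognized value.
 */
int parse_bare_policy(const char *value, enum bare_policy *out)
{
	if (!value)
		return -1;
	if (!strcmp(value, "never")) {
		*out = BARE_NEVER;
		return 0;
	}
	if (!strcmp(value, "always")) {
		*out = BARE_ALWAYS;
		return 0;
	}
	return -1;
}

/*
 * Returns 1 if a discovered bare repository may be used, 0 if not, and
 * -1 if the configured value is invalid. `configured` is NULL when the
 * variable is unset, in which case the default ("always") applies.
 */
int bare_repo_allowed(const char *configured)
{
	enum bare_policy policy = BARE_ALWAYS;

	if (configured && parse_bare_policy(configured, &policy) < 0)
		return -1;
	assert(policy != BARE_UNKNOWN);
	return policy == BARE_ALWAYS;
}
```

With these definitions, an unset variable and "always" both allow the
repository, "never" rejects it, and any other string is reported as an
error instead of silently falling back to the default.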

[1]: https://lore.kernel.org/git/kl6lsfqpygsj.fsf@chooglen-macbookpro.roam.corp.google.com
[2]: https://lore.kernel.org/git/5b969c5e-e802-c447-ad25-6acc0b784582@github.com

Signed-off-by: Glen Choo <chooglen@google.com>
---
 Documentation/config.txt           |  2 +
 Documentation/config/discovery.txt | 19 +++++++++
 setup.c                            | 66 +++++++++++++++++++++++++++++-
 t/t0035-discovery-bare.sh          | 64 +++++++++++++++++++++++++++++
 4 files changed, 150 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/config/discovery.txt
 create mode 100755 t/t0035-discovery-bare.sh

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 07832de1a6c..34133288d75 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -415,6 +415,8 @@ include::config/diff.txt[]
 
 include::config/difftool.txt[]
 
+include::config/discovery.txt[]
+
 include::config/extensions.txt[]
 
 include::config/fastimport.txt[]
diff --git a/Documentation/config/discovery.txt b/Documentation/config/discovery.txt
new file mode 100644
index 00000000000..fbe93597e7c
--- /dev/null
+++ b/Documentation/config/discovery.txt
@@ -0,0 +1,19 @@
+discovery.bare::
+	'(Protected config only)' Specifies whether Git will work with a
+	bare repository that it found during repository discovery. This
+	has no effect if the repository is specified directly via the
+	--git-dir command-line option or the GIT_DIR environment
+	variable (see linkgit:git[1]).
++
+The currently supported values are:
++
+* `always`: Git always works with bare repositories
+* `never`: Git never works with bare repositories
++
+This defaults to `always`, but this default may change in the future.
++
+If you do not use bare repositories in your workflow, then it may be
+beneficial to set `discovery.bare` to `never` in your global config.
+This will protect you from attacks that involve cloning a repository
+that contains a bare repository and running a Git command within that
+directory.
diff --git a/setup.c b/setup.c
index 847d47f9195..6686743ab7d 100644
--- a/setup.c
+++ b/setup.c
@@ -10,6 +10,13 @@
 static int inside_git_dir = -1;
 static int inside_work_tree = -1;
 static int work_tree_config_is_bogus;
+enum discovery_bare_config {
+	DISCOVERY_BARE_UNKNOWN = -1,
+	DISCOVERY_BARE_NEVER = 0,
+	DISCOVERY_BARE_ALWAYS,
+};
+static enum discovery_bare_config discovery_bare_config =
+	DISCOVERY_BARE_UNKNOWN;
 
 static struct startup_info the_startup_info;
 struct startup_info *startup_info = &the_startup_info;
@@ -1133,6 +1140,52 @@ static int ensure_valid_ownership(const char *path)
 	return data.is_safe;
 }
 
+static int discovery_bare_cb(const char *key, const char *value, void *d)
+{
+	if (strcmp(key, "discovery.bare"))
+		return 0;
+
+	if (!strcmp(value, "never")) {
+		discovery_bare_config = DISCOVERY_BARE_NEVER;
+		return 0;
+	}
+	if (!strcmp(value, "always")) {
+		discovery_bare_config = DISCOVERY_BARE_ALWAYS;
+		return 0;
+	}
+	return -1;
+}
+
+static int check_bare_repo_allowed(void)
+{
+	if (discovery_bare_config == DISCOVERY_BARE_UNKNOWN) {
+		discovery_bare_config = DISCOVERY_BARE_ALWAYS;
+		git_protected_config(discovery_bare_cb, NULL);
+	}
+	switch (discovery_bare_config) {
+	case DISCOVERY_BARE_NEVER:
+		return 0;
+	case DISCOVERY_BARE_ALWAYS:
+		return 1;
+	case DISCOVERY_BARE_UNKNOWN:
+		BUG("invalid discovery_bare_config %d", discovery_bare_config);
+	}
+	return 0;
+}
+
+static const char *discovery_bare_config_to_string(void)
+{
+	switch (discovery_bare_config) {
+	case DISCOVERY_BARE_NEVER:
+		return "never";
+	case DISCOVERY_BARE_ALWAYS:
+		return "always";
+	case DISCOVERY_BARE_UNKNOWN:
+		BUG("invalid discovery_bare_config %d", discovery_bare_config);
+	}
+	return NULL;
+}
+
 enum discovery_result {
 	GIT_DIR_NONE = 0,
 	GIT_DIR_EXPLICIT,
@@ -1142,7 +1195,8 @@ enum discovery_result {
 	GIT_DIR_HIT_CEILING = -1,
 	GIT_DIR_HIT_MOUNT_POINT = -2,
 	GIT_DIR_INVALID_GITFILE = -3,
-	GIT_DIR_INVALID_OWNERSHIP = -4
+	GIT_DIR_INVALID_OWNERSHIP = -4,
+	GIT_DIR_DISALLOWED_BARE = -5,
 };
 
 /*
@@ -1239,6 +1293,8 @@ static enum discovery_result setup_git_directory_gently_1(struct strbuf *dir,
 		}
 
 		if (is_git_directory(dir->buf)) {
+			if (!check_bare_repo_allowed())
+				return GIT_DIR_DISALLOWED_BARE;
 			if (!ensure_valid_ownership(dir->buf))
 				return GIT_DIR_INVALID_OWNERSHIP;
 			strbuf_addstr(gitdir, ".");
@@ -1385,6 +1441,14 @@ const char *setup_git_directory_gently(int *nongit_ok)
 		}
 		*nongit_ok = 1;
 		break;
+	case GIT_DIR_DISALLOWED_BARE:
+		if (!nongit_ok) {
+			die(_("cannot use bare repository '%s' (discovery.bare is '%s')"),
+			    dir.buf,
+			    discovery_bare_config_to_string());
+		}
+		*nongit_ok = 1;
+		break;
 	case GIT_DIR_NONE:
 		/*
 		 * As a safeguard against setup_git_directory_gently_1 returning
diff --git a/t/t0035-discovery-bare.sh b/t/t0035-discovery-bare.sh
new file mode 100755
index 00000000000..94c2f76d774
--- /dev/null
+++ b/t/t0035-discovery-bare.sh
@@ -0,0 +1,64 @@
+#!/bin/sh
+
+test_description='verify discovery.bare checks'
+
+. ./test-lib.sh
+
+pwd="$(pwd)"
+
+expect_rejected () {
+	test_must_fail git rev-parse --git-dir 2>err &&
+	grep "discovery.bare" err
+}
+
+test_expect_success 'setup bare repo in worktree' '
+	git init outer-repo &&
+	git init --bare outer-repo/bare-repo
+'
+
+test_expect_success 'discovery.bare unset' '
+	(
+		cd outer-repo/bare-repo &&
+		git rev-parse --git-dir
+	)
+'
+
+test_expect_success 'discovery.bare=always' '
+	git config --global discovery.bare always &&
+	(
+		cd outer-repo/bare-repo &&
+		git rev-parse --git-dir
+	)
+'
+
+test_expect_success 'discovery.bare=never' '
+	git config --global discovery.bare never &&
+	(
+		cd outer-repo/bare-repo &&
+		expect_rejected
+	)
+'
+
+test_expect_success 'discovery.bare in the repository' '
+	(
+		cd outer-repo/bare-repo &&
+		# Temporarily set discovery.bare=always, otherwise git
+		# config fails with "fatal: not in a git directory"
+		# (like safe.directory)
+		git config --global discovery.bare always &&
+		git config discovery.bare always &&
+		git config --global discovery.bare never &&
+		expect_rejected
+	)
+'
+
+test_expect_success 'discovery.bare on the command line' '
+	git config --global discovery.bare never &&
+	(
+		cd outer-repo/bare-repo &&
+		test_must_fail git -c discovery.bare=always rev-parse --git-dir 2>err &&
+		grep "discovery.bare" err
+	)
+'
+
+test_done
-- 
gitgitgadget


* [PATCH v3 4/5] config: include "-c" in protected config
  2022-05-27 21:09   ` [PATCH v3 0/5] config: introduce discovery.bare and protected config Glen Choo via GitGitGadget
                       ` (2 preceding siblings ...)
  2022-05-27 21:09     ` [PATCH v3 3/5] setup.c: create `discovery.bare` Glen Choo via GitGitGadget
@ 2022-05-27 21:09     ` Glen Choo via GitGitGadget
  2022-06-02 13:15       ` Derrick Stolee
  2022-05-27 21:09     ` [PATCH v3 5/5] upload-pack: make uploadpack.packObjectsHook protected Glen Choo via GitGitGadget
  2022-06-07 20:57     ` [PATCH v4 0/5] config: introduce discovery.bare and protected config Glen Choo via GitGitGadget
  5 siblings, 1 reply; 113+ messages in thread
From: Glen Choo via GitGitGadget @ 2022-05-27 21:09 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, brian m. carlson, Derrick Stolee, Junio C Hamano,
	Emily Shaffer, Glen Choo, Glen Choo

From: Glen Choo <chooglen@google.com>

Protected config should include the command line (aka "-c") because we
can be quite certain that this config is specified by the user.

Introduce a function, `git_configset_add_parameters()`, that adds "-c"
config to a config_set, and use it to add "-c" to protected config.
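
The effect of appending "-c" after the config files can be illustrated
with a small model in which sources are consulted in the order they
were added and the last matching value wins. Everything below is a
hypothetical sketch: `config_source`, `lookup_protected()` and the
sample values are assumptions for illustration, not Git's
implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct kv {
	const char *key;
	const char *value;
};

struct config_source {
	const char *name;          /* label only, e.g. "global" */
	int is_protected;          /* 0 for repository-level config */
	const struct kv *entries;
	size_t nr;
};

/*
 * Walk the sources in order, skipping unprotected ones, and return the
 * last value seen for `key` (or NULL if no protected source sets it).
 */
const char *lookup_protected(const struct config_source *sources,
			     size_t nr_sources, const char *key)
{
	const char *result = NULL;
	size_t i, j;

	assert(key != NULL);
	for (i = 0; i < nr_sources; i++) {
		if (!sources[i].is_protected)
			continue;
		for (j = 0; j < sources[i].nr; j++)
			if (!strcmp(sources[i].entries[j].key, key))
				result = sources[i].entries[j].value;
	}
	return result;
}
```

If a global source sets `discovery.bare` to "never", an unprotected
repository source sets it to "always", and a protected command-line
source added last also sets it to "always", the lookup yields "always".
Without the command-line source it yields "never", because the
repository source is skipped.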

Signed-off-by: Glen Choo <chooglen@google.com>
---
 Documentation/config.txt           |  6 +++---
 Documentation/glossary-content.txt |  2 +-
 config.c                           |  6 ++++++
 config.h                           |  9 +++++++++
 t/t0033-safe-directory.sh          | 24 ++++++++++--------------
 t/t0035-discovery-bare.sh          |  3 +--
 6 files changed, 30 insertions(+), 20 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 34133288d75..f40a3e297ce 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -370,9 +370,9 @@ names do not conflict with those that are used by Git itself and
 other popular tools, and describe them in your documentation.
 
 Variables marked with '(Protected config only)' are only respected when
-they are specified in protected configuration. This includes global and
-system-level config, and excludes repository config, the command line
-option `-c`, and environment variables. For more details, see the
+they are specified in protected configuration. This includes global,
+system-level config, the command line option `-c`, and environment
+variables, and excludes repository config. For more details, see the
 'protected configuration' entry in linkgit:gitglossary[7].
 
 include::config/advice.txt[]
diff --git a/Documentation/glossary-content.txt b/Documentation/glossary-content.txt
index a669983abd6..4190c410a00 100644
--- a/Documentation/glossary-content.txt
+++ b/Documentation/glossary-content.txt
@@ -494,12 +494,12 @@ Protected configuration includes:
 - system-level config, e.g. `/etc/git/config`
 - global config, e.g. `$XDG_CONFIG_HOME/git/config` and
   `$HOME/.gitconfig`
+- the command line option `-c` and its equivalent environment variables
 +
 Protected configuration excludes:
 +
 - repository config, e.g. `$GIT_DIR/config` and
   `$GIT_DIR/config.worktree`
-- the command line option `-c` and its equivalent environment variables
 
 [[def_reachable]]reachable::
 	All of the ancestors of a given <<def_commit,commit>> are said to be
diff --git a/config.c b/config.c
index c30bb7c5d09..22192ca1d63 100644
--- a/config.c
+++ b/config.c
@@ -2373,6 +2373,11 @@ int git_configset_add_file(struct config_set *cs, const char *filename)
 	return git_config_from_file(config_set_callback, filename, cs);
 }
 
+int git_configset_add_parameters(struct config_set *cs)
+{
+	return git_config_from_parameters(config_set_callback, cs);
+}
+
 int git_configset_get_value(struct config_set *cs, const char *key, const char **value)
 {
 	const struct string_list *values = NULL;
@@ -2628,6 +2633,7 @@ static void read_protected_config(void)
 	git_configset_add_file(the_repository->protected_config, system_config);
 	git_configset_add_file(the_repository->protected_config, xdg_config);
 	git_configset_add_file(the_repository->protected_config, user_config);
+	git_configset_add_parameters(the_repository->protected_config);
 
 	free(system_config);
 	free(xdg_config);
diff --git a/config.h b/config.h
index 411965f52b5..e3ff1fcf683 100644
--- a/config.h
+++ b/config.h
@@ -446,6 +446,15 @@ void git_configset_init(struct config_set *cs);
  */
 int git_configset_add_file(struct config_set *cs, const char *filename);
 
+/**
+ * Parses command line options and environment variables, and adds the
+ * variable-value pairs to the `config_set`. Returns 0 on success, or -1
+ * if there is an error in parsing. The caller decides whether to free
+ * the incomplete configset or continue using it when the function
+ * returns -1.
+ */
+int git_configset_add_parameters(struct config_set *cs);
+
 /**
  * Finds and returns the value list, sorted in order of increasing priority
  * for the configuration variable `key` and config set `cs`. When the
diff --git a/t/t0033-safe-directory.sh b/t/t0033-safe-directory.sh
index 238b25f91a3..5a1cd0d0947 100755
--- a/t/t0033-safe-directory.sh
+++ b/t/t0033-safe-directory.sh
@@ -16,24 +16,20 @@ test_expect_success 'safe.directory is not set' '
 	expect_rejected_dir
 '
 
-test_expect_success 'ignoring safe.directory on the command line' '
-	test_must_fail git -c safe.directory="$(pwd)" status 2>err &&
-	grep "unsafe repository" err
+test_expect_success 'safe.directory on the command line' '
+	git -c safe.directory="$(pwd)" status
 '
 
-test_expect_success 'ignoring safe.directory in the environment' '
-	test_must_fail env GIT_CONFIG_COUNT=1 \
-		GIT_CONFIG_KEY_0="safe.directory" \
-		GIT_CONFIG_VALUE_0="$(pwd)" \
-		git status 2>err &&
-	grep "unsafe repository" err
+test_expect_success 'safe.directory in the environment' '
+	env GIT_CONFIG_COUNT=1 \
+	    GIT_CONFIG_KEY_0="safe.directory" \
+	    GIT_CONFIG_VALUE_0="$(pwd)" \
+	    git status
 '
 
-test_expect_success 'ignoring safe.directory in GIT_CONFIG_PARAMETERS' '
-	test_must_fail env \
-		GIT_CONFIG_PARAMETERS="${SQ}safe.directory${SQ}=${SQ}$(pwd)${SQ}" \
-		git status 2>err &&
-	grep "unsafe repository" err
+test_expect_success 'safe.directory in GIT_CONFIG_PARAMETERS' '
+	env GIT_CONFIG_PARAMETERS="${SQ}safe.directory${SQ}=${SQ}$(pwd)${SQ}" \
+	    git status
 '
 
 test_expect_success 'ignoring safe.directory in repo config' '
diff --git a/t/t0035-discovery-bare.sh b/t/t0035-discovery-bare.sh
index 94c2f76d774..0d5983df307 100755
--- a/t/t0035-discovery-bare.sh
+++ b/t/t0035-discovery-bare.sh
@@ -56,8 +56,7 @@ test_expect_success 'discovery.bare on the command line' '
 	git config --global discovery.bare never &&
 	(
 		cd outer-repo/bare-repo &&
-		test_must_fail git -c discovery.bare=always rev-parse --git-dir 2>err &&
-		grep "discovery.bare" err
+		git -c discovery.bare=always rev-parse --git-dir
 	)
 '
 
-- 
gitgitgadget


* [PATCH v3 5/5] upload-pack: make uploadpack.packObjectsHook protected
  2022-05-27 21:09   ` [PATCH v3 0/5] config: introduce discovery.bare and protected config Glen Choo via GitGitGadget
                       ` (3 preceding siblings ...)
  2022-05-27 21:09     ` [PATCH v3 4/5] config: include "-c" in protected config Glen Choo via GitGitGadget
@ 2022-05-27 21:09     ` Glen Choo via GitGitGadget
  2022-06-02 13:18       ` Derrick Stolee
  2022-06-07 20:57     ` [PATCH v4 0/5] config: introduce discovery.bare and protected config Glen Choo via GitGitGadget
  5 siblings, 1 reply; 113+ messages in thread
From: Glen Choo via GitGitGadget @ 2022-05-27 21:09 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, brian m. carlson, Derrick Stolee, Junio C Hamano,
	Emily Shaffer, Glen Choo, Glen Choo

From: Glen Choo <chooglen@google.com>

Now that protected config includes "-c", "uploadpack.packObjectsHook"
behaves identically to a 'Protected config only' variable. Refactor it
to use git_protected_config() and mark it 'Protected config only'.
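
The split into a general callback and a protected-only callback can be
sketched as follows. This is a simplified model, not upload-pack.c:
`upload_settings`, `general_cb()` and `protected_cb()` are hypothetical
names, and the "true" comparison is a stand-in for Git's boolean
parsing.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct upload_settings {
	int allow_filter;
	const char *pack_objects_hook;
};

/*
 * Callback for all config, including repository config. It never looks
 * at the hook variable, so a repository cannot set it.
 */
int general_cb(const char *key, const char *value, void *data)
{
	struct upload_settings *s = data;

	assert(s != NULL);
	if (!strcmp(key, "uploadpack.allowfilter"))
		s->allow_filter = value && !strcmp(value, "true");
	return 0;
}

/* Callback fed only from protected sources; it handles just the hook. */
int protected_cb(const char *key, const char *value, void *data)
{
	struct upload_settings *s = data;

	assert(s != NULL);
	if (!strcmp(key, "uploadpack.packobjectshook"))
		s->pack_objects_hook = value;
	return 0;
}
```

Feeding a hook value through `general_cb()` leaves `pack_objects_hook`
unset; only a value passed through `protected_cb()` takes effect. The
scope check therefore lives in the choice of which source feeds which
callback, not inside the callback itself.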

Signed-off-by: Glen Choo <chooglen@google.com>
---
 Documentation/config/uploadpack.txt | 22 +++++++++-------------
 upload-pack.c                       | 17 +++++++++++------
 2 files changed, 20 insertions(+), 19 deletions(-)

diff --git a/Documentation/config/uploadpack.txt b/Documentation/config/uploadpack.txt
index 32fad5bbe81..57e5e021323 100644
--- a/Documentation/config/uploadpack.txt
+++ b/Documentation/config/uploadpack.txt
@@ -39,19 +39,15 @@ uploadpack.keepAlive::
 	disables keepalive packets entirely. The default is 5 seconds.
 
 uploadpack.packObjectsHook::
-	If this option is set, when `upload-pack` would run
-	`git pack-objects` to create a packfile for a client, it will
-	run this shell command instead.  The `pack-objects` command and
-	arguments it _would_ have run (including the `git pack-objects`
-	at the beginning) are appended to the shell command. The stdin
-	and stdout of the hook are treated as if `pack-objects` itself
-	was run. I.e., `upload-pack` will feed input intended for
-	`pack-objects` to the hook, and expects a completed packfile on
-	stdout.
-+
-Note that this configuration variable is ignored if it is seen in the
-repository-level config (this is a safety measure against fetching from
-untrusted repositories).
+	'(Protected config only)' If this option is set, when
+	`upload-pack` would run `git pack-objects` to create a packfile
+	for a client, it will run this shell command instead. The
+	`pack-objects` command and arguments it _would_ have run
+	(including the `git pack-objects` at the beginning) are appended
+	to the shell command. The stdin and stdout of the hook are
+	treated as if `pack-objects` itself was run. I.e., `upload-pack`
+	will feed input intended for `pack-objects` to the hook, and
+	expects a completed packfile on stdout.
 
 uploadpack.allowFilter::
 	If this option is set, `upload-pack` will support partial
diff --git a/upload-pack.c b/upload-pack.c
index 3a851b36066..2a39391369d 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -1321,18 +1321,21 @@ static int upload_pack_config(const char *var, const char *value, void *cb_data)
 		data->advertise_sid = git_config_bool(var, value);
 	}
 
-	if (current_config_scope() != CONFIG_SCOPE_LOCAL &&
-	    current_config_scope() != CONFIG_SCOPE_WORKTREE) {
-		if (!strcmp("uploadpack.packobjectshook", var))
-			return git_config_string(&data->pack_objects_hook, var, value);
-	}
-
 	if (parse_object_filter_config(var, value, data) < 0)
 		return -1;
 
 	return parse_hide_refs_config(var, value, "uploadpack");
 }
 
+static int upload_pack_protected_config(const char *var, const char *value, void *cb_data)
+{
+	struct upload_pack_data *data = cb_data;
+
+	if (!strcmp("uploadpack.packobjectshook", var))
+		return git_config_string(&data->pack_objects_hook, var, value);
+	return 0;
+}
+
 void upload_pack(const int advertise_refs, const int stateless_rpc,
 		 const int timeout)
 {
@@ -1342,6 +1345,7 @@ void upload_pack(const int advertise_refs, const int stateless_rpc,
 	upload_pack_data_init(&data);
 
 	git_config(upload_pack_config, &data);
+	git_protected_config(upload_pack_protected_config, &data);
 
 	data.stateless_rpc = stateless_rpc;
 	data.timeout = timeout;
@@ -1697,6 +1701,7 @@ int upload_pack_v2(struct repository *r, struct packet_reader *request)
 	data.use_sideband = LARGE_PACKET_MAX;
 
 	git_config(upload_pack_config, &data);
+	git_protected_config(upload_pack_protected_config, &data);
 
 	while (state != FETCH_DONE) {
 		switch (state) {
-- 
gitgitgadget

* Re: [PATCH v3 1/5] Documentation: define protected configuration
  2022-05-27 21:09     ` [PATCH v3 1/5] Documentation: define protected configuration Glen Choo via GitGitGadget
@ 2022-05-27 23:29       ` Junio C Hamano
  2022-06-02 12:42         ` Derrick Stolee
  2022-06-03 15:57         ` Glen Choo
  0 siblings, 2 replies; 113+ messages in thread
From: Junio C Hamano @ 2022-05-27 23:29 UTC (permalink / raw)
  To: Glen Choo via GitGitGadget
  Cc: git, Taylor Blau, brian m. carlson, Derrick Stolee,
	Emily Shaffer, Glen Choo

"Glen Choo via GitGitGadget" <gitgitgadget@gmail.com> writes:

>  safe.directory::
> -	These config entries specify Git-tracked directories that are
> -	considered safe even if they are owned by someone other than the
> -	current user. By default, Git will refuse to even parse a Git
> -	config of a repository owned by someone else, let alone run its
> -	hooks, and this config setting allows users to specify exceptions,
> -	e.g. for intentionally shared repositories (see the `--shared`
> -	option in linkgit:git-init[1]).
> +	'(Protected config only) ' These config entries specify

What's the SP in "only) '" doing?

> diff --git a/Documentation/glossary-content.txt b/Documentation/glossary-content.txt
> index aa2f41f5e70..a669983abd6 100644
> --- a/Documentation/glossary-content.txt
> +++ b/Documentation/glossary-content.txt
> @@ -483,6 +483,24 @@ exclude;;
>  	head ref. If the remote <<def_head,head>> is not an
>  	ancestor to the local head, the push fails.
>  
> +[[def_protected_config]]protected configuration::
> +	Protected configuration is configuration that Git considers more
> +	trustworthy because it is unlikely to be tampered with by an
> +	attacker. For security reasons, some configuration variables are
> +	only respected when they are defined in protected configuration.
> ++
> +Protected configuration includes:
> ++
> +- system-level config, e.g. `/etc/git/config`
> +- global config, e.g. `$XDG_CONFIG_HOME/git/config` and
> +  `$HOME/.gitconfig`
> ++
> +Protected configuration excludes:
> ++
> +- repository config, e.g. `$GIT_DIR/config` and
> +  `$GIT_DIR/config.worktree`
> +- the command line option `-c` and its equivalent environment variables

The description is a bit unclear about what "protected configuration"
refers to.

If it is the scopes (as in "git config --show-scope") Git can trust
more, in other words, a statement like this

    safe.directory is honored only when it comes from a protected
    configuration.

is what you want to make easier to write by introducing a new
phrase, perhaps use the word "scope" for more consistency?  E.g.

    Only safe.directory that is defined in a trusted scope is
    honored.

I dunno.

It would make sense to give a rationale behind the seemingly
arbitrary choice of what is and what is not "protected".  Not
necessarily in the glossary, but in the proposed log message of the
commit that makes the decision.  The rationale must help readers to
be able to answer the following questions.

 - The system level is "protected" because?  Is it because we do not
   even try to protect ourselves from those who can write anywhere
   in /etc/ or other system directories?

 - The per-user config is "protected" because?  Is it because our
   primary interest in "protection" is to protect individual users
   from landmines laid in the filesystem by other users, and those
   who can already write into $HOME are not who we try to guard
   against?

 - The per-repo config is not "protected" (i.e. "trusted"), because?
   If we are not honoring a configuration in the repository, why are
   we working in that repository in the first place?

 - The per invocation config is not "protected" (i.e. "trusted"),
   because?  If we cannot trust our own command line, what
   prevents an attacker from mucking with our command line to say
   "sudo whatever" using the same attack vector?

Thanks.


* Re: [PATCH v3 2/5] config: read protected config with `git_protected_config()`
  2022-05-27 21:09     ` [PATCH v3 2/5] config: read protected config with `git_protected_config()` Glen Choo via GitGitGadget
@ 2022-05-28  0:28       ` Junio C Hamano
  2022-05-31 17:43         ` Glen Choo
  2022-06-02 12:56       ` Derrick Stolee
  1 sibling, 1 reply; 113+ messages in thread
From: Junio C Hamano @ 2022-05-28  0:28 UTC (permalink / raw)
  To: Glen Choo via GitGitGadget
  Cc: git, Taylor Blau, brian m. carlson, Derrick Stolee,
	Emily Shaffer, Glen Choo

"Glen Choo via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Glen Choo <chooglen@google.com>
>
> Protected config is read using `read_very_early_config()`, which has
> several downsides:
>
> - Every call to `read_very_early_config()` parses global and
>   system-level config files anew, but this can be optimized by just
>   parsing them once [1].
> - Protected variables should respect "-c" because we can reasonably
>   assume that it comes from the user. But, `read_very_early_config()`
>   can't use "-c" because it is called so early that it does not have
>   access to command line arguments.

Now we are talking about protected "variable".  Is that a synonym
for "config", or are there some distinctions between them?

> - Protected config is stored in `the_repository` so that we don't need
>   to statically allocate it. But this might be confusing since protected
>   config ignores repository config by definition.

Yes, it indeed is.  Is it because we were over-eager when we
introduced the "struct repository *repo" parameter to many functions
and the configuration system wants you to have some repository, even
when you know you are not reading from any repository?  

I am wondering if it is a cleaner solution *not* to hang the
protected config as a configset in the_repository, but keep the
configset as a separate global variable, perhaps static to config.c
and is meant to be only accessed via git_protected_config() and the
like.

> @@ -295,6 +295,11 @@ void repo_clear(struct repository *repo)
>  		FREE_AND_NULL(repo->remote_state);
>  	}
>  
> +	if (repo->protected_config) {
> +		git_configset_clear(repo->protected_config);
> +		FREE_AND_NULL(repo->protected_config);
> +	}
> +

This becomes necessary only because each repository instance has
protected_config, even though we need only one instance, no matter
how many repositories we are accessing in this single invocation of
Git, no?

How should "git config -l" interact with "protected config" and
"protected variables", by the way?  Should a user be able to tell
which ones are coming from protected scope?  Should we gain, next to
--global, --system, etc., --protected option to list only the
protected config/variable?

This is another thing that I find iffy on terminology.  Should a
random variable, like user.name, be a "protected config", if it is
found in $HOME/.gitconfig?  If it comes from there, surely we can
trust its value, but unlike things like safe.directory, there is no
code that wants to enforce that we pay attention only to user.name
that came from trusted scopes.  Should such a variable be called
"protected variable"?

Thanks.


* Re: [PATCH v3 3/5] setup.c: create `discovery.bare`
  2022-05-27 21:09     ` [PATCH v3 3/5] setup.c: create `discovery.bare` Glen Choo via GitGitGadget
@ 2022-05-28  0:59       ` Junio C Hamano
  2022-06-02 13:11       ` Derrick Stolee
  1 sibling, 0 replies; 113+ messages in thread
From: Junio C Hamano @ 2022-05-28  0:59 UTC (permalink / raw)
  To: Glen Choo via GitGitGadget
  Cc: git, Taylor Blau, brian m. carlson, Derrick Stolee,
	Emily Shaffer, Glen Choo

"Glen Choo via GitGitGadget" <gitgitgadget@gmail.com> writes:

> +enum discovery_bare_config {
> +	DISCOVERY_BARE_UNKNOWN = -1,
> +	DISCOVERY_BARE_NEVER = 0,
> +	DISCOVERY_BARE_ALWAYS,
> +};
> +static enum discovery_bare_config discovery_bare_config =
> +	DISCOVERY_BARE_UNKNOWN;

Can discovery_bare come from anywhere other than config?

I am wondering if both the variable and the type should be called
"discovery_bare_allowed" instead.  That it comes from the config is
not the more important part.  That it determines if it is allowed
is.

> +static int check_bare_repo_allowed(void)
> +{
> +	if (discovery_bare_config == DISCOVERY_BARE_UNKNOWN) {
> +		discovery_bare_config = DISCOVERY_BARE_ALWAYS;
> +		git_protected_config(discovery_bare_cb, NULL);
> +	}

OK, so the thing is initialized to "unknown", and the first time we
want to use the value of it, we read from the file (or default to
"always").  Makes sense.

And then ...

> +	switch (discovery_bare_config) {
> +	case DISCOVERY_BARE_NEVER:
> +		return 0;
> +	case DISCOVERY_BARE_ALWAYS:
> +		return 1;
> +	case DISCOVERY_BARE_UNKNOWN:
> +		BUG("invalid discovery_bare_config %d", discovery_bare_config);

... this is being defensive; we know discovery_bare_cb() won't give
UNKNOWN, but we want to make sure.

> +	}
> +	return 0;
> +}
> +
> +static const char *discovery_bare_config_to_string(void)
> +{

But this one feels strangely asymmetrical, as there is no inherent
reason why one must be called before the other.  I would expect it
to either

 * take a parameter of type "enum discovery_bare" and return
   "never", "always", or "unset", without calling any BUG().

or

 * have the same "we lazily figure out the discovery_bare_config
   variable on demand" logic.

As both of these functions are file-scope static, we can live with
it, though.
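
To illustrate the first option, here is a minimal sketch of a helper
that takes the value as a parameter and never calls BUG().  The type
name follows the "discovery_bare_allowed" rename floated above and is
only an assumption, not what the patch uses.

```c
/* Hypothetical type name, following the rename suggested above. */
enum discovery_bare_allowed {
	DISCOVERY_BARE_UNKNOWN = -1,
	DISCOVERY_BARE_NEVER = 0,
	DISCOVERY_BARE_ALWAYS,
};

/*
 * Pure mapping from value to name.  It reads no global state, so it
 * does not matter whether the config has been looked up yet.
 */
static const char *discovery_bare_allowed_to_string(enum discovery_bare_allowed v)
{
	switch (v) {
	case DISCOVERY_BARE_NEVER:
		return "never";
	case DISCOVERY_BARE_ALWAYS:
		return "always";
	case DISCOVERY_BARE_UNKNOWN:
		return "unset";
	}
	/* Out-of-range values are reported the same way as "unset". */
	return "unset";
}
```

A caller would pass whatever value it already holds, which removes the
ordering dependency on check_bare_repo_allowed().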

> +	switch (discovery_bare_config) {
> +	case DISCOVERY_BARE_NEVER:
> +		return "never";
> +	case DISCOVERY_BARE_ALWAYS:
> +		return "always";
> +	case DISCOVERY_BARE_UNKNOWN:
> +		BUG("invalid discovery_bare_config %d", discovery_bare_config);
> +	}
> +	return NULL;
> +}



* Re: [PATCH v3 2/5] config: read protected config with `git_protected_config()`
  2022-05-28  0:28       ` Junio C Hamano
@ 2022-05-31 17:43         ` Glen Choo
  2022-06-01 15:58           ` Junio C Hamano
  0 siblings, 1 reply; 113+ messages in thread
From: Glen Choo @ 2022-05-31 17:43 UTC (permalink / raw)
  To: Junio C Hamano, Glen Choo via GitGitGadget
  Cc: git, Taylor Blau, brian m. carlson, Derrick Stolee, Emily Shaffer

Junio C Hamano <gitster@pobox.com> writes:

>> Protected config is read using `read_very_early_config()`, which has
>> several downsides:
>>
>> - Every call to `read_very_early_config()` parses global and
>>   system-level config files anew, but this can be optimized by just
>>   parsing them once [1].
>> - Protected variables should respect "-c" because we can reasonably
>>   assume that it comes from the user. But, `read_very_early_config()`
>>   can't use "-c" because it is called so early that it does not have
>>   access to command line arguments.
>
> Now we are talking about protected "variable".  Is that a synonym
> for "config", or are there some distinctions between them?

Sorry, that's an old term I was toying with (this somehow snuck through
my proofreading). I just meant "variable that is only read from
protected config", aka a "protected config only variable".

A goal in this version was to introduce as little jargon as possible, so
- "protected config" refers to the set of config sources, and
- "protected config only" refers to config variables/settings that are
  only read from protected config.

>> - Protected config is stored in `the_repository` so that we don't need
>>   to statically allocate it. But this might be confusing since protected
>>   config ignores repository config by definition.
>
> Yes, it indeed is.  Is it because we were over-eager when we
> introduced the "struct repository *repo" parameter to many functions
> and the configuration system wants you to have some repository, even
> when you know you are not reading from any repository?  

Ah no, I was just trying to avoid yet-another global variable (since
IIRC we want to move towards a more lib-like Git), and the_repository
was a convenient global variable to (ab)use.

> I am wondering if it is a cleaner solution *not* to hang the
> protected config as a configset in the_repository, but keep the
> configset as a separate global variable, perhaps static to config.c
> and is meant to be only accessed via git_protected_config() and the
> like.

I think your suggestion to use a global variable is better, as much as I
want to avoid another global variable. Protected config would affect any
repositories that we work with in-core, so using a global sounds ok.

environment.c might be a better place since we already make a concerted
effort to put global config variables there instead of config.c.

As an aside, I wonder how we could get rid of all of the globals in
environment.c in the long term. Maybe we would have yet-another all
encompassing global, the_environment, and then figure out which
variables belong to the repository and which belong to the environment.

>> @@ -295,6 +295,11 @@ void repo_clear(struct repository *repo)
>>  		FREE_AND_NULL(repo->remote_state);
>>  	}
>>  
>> +	if (repo->protected_config) {
>> +		git_configset_clear(repo->protected_config);
>> +		FREE_AND_NULL(repo->protected_config);
>> +	}
>> +
>
> This becomes necessary only because each repository instance has
> protected_config, even though we need only one instance, no matter
> how many repositories we are accessing in this single invocation of
> Git, no?

Yes.

> How should "git config -l" interact with "protected config" and
> "protected variables", by the way?  Should a user be able to tell
> which ones are coming from protected scope?  Should we gain, next to
> --global, --system, etc., --protected option to list only the
> protected config/variable?

I'll have to think about this some more. My initial thoughts are that we
should do this if we formalize 'protected' as a scope-like concept, but
I don't see the lack of "--protected" as a significant hindrance to
users because they can use "--global" and "--system" (albeit in two
invocations instead of one).

> This is another thing that I find iffy on terminology.  Should a
> random variable, like user.name, be a "protected config", if it is
> found in $HOME/.gitconfig?  If it comes from there, surely we can
> trust its value, but unlike things like safe.directory, there is no
> code that wants to enforce that we pay attention only to user.name
> that came from trusted scopes.  Should such a variable be called
> "protected variable"?

Ah.. I think it would be best to pretend that the "Protected variable"
typo never happened. That term was destined to be confusing and
meaningless.

Instead, we can use "protected config" to refer to the config and
"protected config only" to refer to variables. Since "protected config"
is defined as (global + system + CLI) config, then yes, we would say
that it is "protected config". But since we do not enforce that
"user.name" _must_ come from only protected config, it is not "protected
config only".
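
As a rough sketch of that distinction (the scope names and helpers
below are made up for illustration and are not Git's own enum or API):

```c
/* Hypothetical scope names; Git's real enum is not reproduced here. */
enum toy_scope {
	TOY_SCOPE_SYSTEM,
	TOY_SCOPE_GLOBAL,
	TOY_SCOPE_LOCAL,
	TOY_SCOPE_WORKTREE,
	TOY_SCOPE_COMMAND,
};

/* "Protected config" = system + global + command line, as defined above. */
static int toy_scope_is_protected(enum toy_scope scope)
{
	switch (scope) {
	case TOY_SCOPE_SYSTEM:
	case TOY_SCOPE_GLOBAL:
	case TOY_SCOPE_COMMAND:
		return 1;
	case TOY_SCOPE_LOCAL:
	case TOY_SCOPE_WORKTREE:
		return 0;
	}
	return 0;
}

/*
 * A "protected config only" variable is honored only when it comes
 * from a protected scope; any other variable is honored everywhere.
 */
static int toy_should_honor(int protected_config_only, enum toy_scope scope)
{
	return !protected_config_only || toy_scope_is_protected(scope);
}
```

With this, "user.name" (not "protected config only") is honored from
the repository config, while "safe.directory" would not be.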


* Re: [PATCH v3 2/5] config: read protected config with `git_protected_config()`
  2022-05-31 17:43         ` Glen Choo
@ 2022-06-01 15:58           ` Junio C Hamano
  0 siblings, 0 replies; 113+ messages in thread
From: Junio C Hamano @ 2022-06-01 15:58 UTC (permalink / raw)
  To: Glen Choo
  Cc: Glen Choo via GitGitGadget, git, Taylor Blau, brian m. carlson,
	Derrick Stolee, Emily Shaffer

Glen Choo <chooglen@google.com> writes:

> A goal in this version was to introduce as little jargon as possible, so
> - "protected config" refers to the set of config sources, and
> - "protected config only" refers to config variables/settings that are
>   only read from protected config.

OK.  Let's have such a clear pair of definitions somewhere in the
doc or at least in a proposed log message.

>
>>> - Protected config is stored in `the_repository` so that we don't need
>>>   to statically allocate it. But this might be confusing since protected
>>>   config ignores repository config by definition.
>>
>> Yes, it indeed is.  Is it because we were over-eager when we
>> introduced the "struct repository *repo" parameter to many functions
>> and the configuration system wants you to have some repository, even
>> when you know you are not reading from any repository?  
>
> Ah no, I was just trying to avoid yet-another global variable (since
> IIRC we want to move towards a more lib-like Git), and the_repository
> was a convenient global variable to (ab)use.

If this does not have to be known only inside config.c, until we
introduce a more global bag of things, which may have the current
the_repository as one of its components, I do not think it hurts to
have a file-scope static there.  Then, perhaps git_configset_get*()
helper functions can recognize cs==NULL as a sign that the caller
wants to grab from the "protected config", or something?  If we do
not want to expose the underlying global variable to the public, that
is.
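
Something along these lines, perhaps.  This is only a toy sketch: the
struct and the toy_*() names are stand-ins invented for illustration,
not the real config_set API, and the "parsing" is faked with a fixed
entry.

```c
#include <stddef.h>
#include <string.h>

#define TOY_MAX_ENTRIES 8

/* Toy stand-in for a config set; not Git's struct config_set. */
struct toy_configset {
	const char *keys[TOY_MAX_ENTRIES];
	const char *values[TOY_MAX_ENTRIES];
	size_t nr;
	int initialized;
};

/* The file-scope static; nothing outside this file can name it. */
static struct toy_configset protected_config;

static void toy_configset_add(struct toy_configset *cs,
			      const char *key, const char *value)
{
	if (cs->nr < TOY_MAX_ENTRIES) {
		cs->keys[cs->nr] = key;
		cs->values[cs->nr] = value;
		cs->nr++;
	}
}

/* Lazily fill the static set the first time anybody asks for it. */
static void toy_prepare_protected_config(void)
{
	if (protected_config.initialized)
		return;
	/* Real code would parse system, global and "-c" config here. */
	toy_configset_add(&protected_config, "discovery.bare", "never");
	protected_config.initialized = 1;
}

/* cs == NULL means "look in the protected config". */
static const char *toy_configset_get(struct toy_configset *cs, const char *key)
{
	const char *found = NULL;
	size_t i;

	if (!cs) {
		toy_prepare_protected_config();
		cs = &protected_config;
	}
	for (i = 0; i < cs->nr; i++)
		if (!strcmp(cs->keys[i], key))
			found = cs->values[i]; /* last one wins */
	return found;
}
```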

> As an aside, I wonder how we could get rid of all of the globals in
> environment.c in the long term. Maybe we would have yet-another all
> encompassing global, the_environment, and then figure out which
> variables belong to the repository and which belong to the environment.

I think we are on the same page, we'd probably need something called
the_world ;-)

> Instead, we can use "protected config" to refer to the config and
> "protected config only" to refer to variables. Since "protected config"
> is defined as (global + system + CLI) config, then yes, we would say
> that it is "protected config". But since we do not enforce that
> "user.name" _must_ come from only protected config, it is not "protected
> config only".

Very clear.  Thanks.


* Re: [PATCH v3 1/5] Documentation: define protected configuration
  2022-05-27 23:29       ` Junio C Hamano
@ 2022-06-02 12:42         ` Derrick Stolee
  2022-06-02 16:53           ` Junio C Hamano
  2022-06-03 15:57         ` Glen Choo
  1 sibling, 1 reply; 113+ messages in thread
From: Derrick Stolee @ 2022-06-02 12:42 UTC (permalink / raw)
  To: Junio C Hamano, Glen Choo via GitGitGadget
  Cc: git, Taylor Blau, brian m. carlson, Emily Shaffer, Glen Choo

On 5/27/2022 7:29 PM, Junio C Hamano wrote:
> "Glen Choo via GitGitGadget" <gitgitgadget@gmail.com> writes:
>> +[[def_protected_config]]protected configuration::
>> +	Protected configuration is configuration that Git considers more
>> +	trustworthy because it is unlikely to be tampered with by an
>> +	attacker. For security reasons, some configuration variables are
>> +	only respected when they are defined in protected configuration.
>> ++
>> +Protected configuration includes:
>> ++
>> +- system-level config, e.g. `/etc/git/config`
>> +- global config, e.g. `$XDG_CONFIG_HOME/git/config` and
>> +  `$HOME/.gitconfig`
>> +Protected configuration excludes:
>> ++
>> +- repository config, e.g. `$GIT_DIR/config` and
>> +  `$GIT_DIR/config.worktree`
>> +- the command line option `-c` and its equivalent environment variables
> 
> The description is a bit unclear about what "protected configuration"
> refers to.
> 
> If it is the scopes (as in "git config --show-scope") Git can trust
> more, in other words, a statement like this
> 
>     safe.directory is honored only when it comes from a protected
>     configuration.
> 
> is what you want to make easier to write by introducing a new
> phrase, perhaps use the word "scope" for more consistency?  E.g.
> 
>     Only safe.directory that is defined in a trusted scope is
>     honored.
> 
> I dunno.
> 
> It would make sense to give a rationale behind the seemingly
> arbitrary choice of what is and what is not "protected".  Not
> necessarily in the glossary, but in the proposed log message of the
> commit that makes the decision.  The rationale must help readers to
> be able to answer the following questions.
> 
>  - The system level is "protected" because?  Is it because we do not
>    even try to protect ourselves from those who can write anywhere
>    in /etc/ or other system directories?
> 
>  - The per-user config is "protected" because?  Is it because our
>    primary interest in "protection" is to protect individual users
>    from landmines laid in the filesystem by other users, and those
>    who can already write into $HOME are not who we try to guard against?

I think the answer to both of these questions is "yes", so they can
be turned into an affirmative sentence:

	We do not even try to protect ourselves from those who can
	write anywhere...

>  - The per-repo config is not "protected" (i.e. "trusted"), because?
>    If we are not honoring a configuration in the repository, why are
>    we working in that repository in the first place?

This requires an example:

	Some workflows use repositories stored in shared directories,
	which are writable by multiple unprivileged users.
 
>  - The per invocation config is not "protected" (i.e. "trusted"),
>    because?  If we cannot trust our own command line, what
>    prevents an attacker from mucking with our command line to say
>    "sudo whatever" using the same attack vector?

With this argument, I agree that -c config can be considered
protected. At the very least, it is visible to the user when they
are running a command. This would unify our expectations with
uploadPack.packObjectsHook, too.

Thanks,
-Stolee


* Re: [PATCH v3 2/5] config: read protected config with `git_protected_config()`
  2022-05-27 21:09     ` [PATCH v3 2/5] config: read protected config with `git_protected_config()` Glen Choo via GitGitGadget
  2022-05-28  0:28       ` Junio C Hamano
@ 2022-06-02 12:56       ` Derrick Stolee
  1 sibling, 0 replies; 113+ messages in thread
From: Derrick Stolee @ 2022-06-02 12:56 UTC (permalink / raw)
  To: Glen Choo via GitGitGadget, git
  Cc: Taylor Blau, brian m. carlson, Junio C Hamano, Emily Shaffer, Glen Choo

On 5/27/2022 5:09 PM, Glen Choo via GitGitGadget wrote:
> From: Glen Choo <chooglen@google.com>
> 
> Protected config is read using `read_very_early_config()`, which has
> several downsides:
> 
> - Every call to `read_very_early_config()` parses global and
>   system-level config files anew, but this can be optimized by just
>   parsing them once [1].
> - Protected variables should respect "-c" because we can reasonably
>   assume that it comes from the user. But, `read_very_early_config()`
>   can't use "-c" because it is called so early that it does not have
>   access to command line arguments.
> 
> Introduce `git_protected_config()`, which reads protected config and
> caches the values in `the_repository.protected_config`. Then, refactor
> `safe.directory` to use `git_protected_config()`.
> 
> This implementation can still be improved, however:
> 
> - `git_protected_config()` iterates through every variable in
>   `the_repository.protected_config`, which may still be too expensive to
>   be called in every "git" invocation. There exist constant time lookup
>   functions for non-protected config (repo_config_get_*()), but for
>   simplicity, this commit does not implement similar functions for
>   protected config.

I originally thought that we should jump to that "right" solution, but
the existing logic in ensure_valid_ownership() uses the iterator method,
mostly because it uses a multi-valued string. There are helpers that
allow iterating over a specific multi-valued key, but there is no reason
to complicate the current patch with that amount of refactoring. That
can be handled as a completely separate topic.
 
> - Protected config is stored in `the_repository` so that we don't need
>   to statically allocate it. But this might be confusing since protected
>   config ignores repository config by definition.

I agree with Junio's suggestion of keeping this as a static global in
config.c, accessible only by the public methods from config.h. A future
where we have "the_world" might be nice for inventory on all these
globals. Definitely not something to hold up this series.

> +/* Read protected config into the_repository->protected_config. */
> +static void read_protected_config(void)
> +{
> +	char *xdg_config = NULL, *user_config = NULL, *system_config = NULL;
> +
> +	CALLOC_ARRAY(the_repository->protected_config, 1);
> +	git_configset_init(the_repository->protected_config);
> +
> +	system_config = git_system_config();
> +	git_global_config(&user_config, &xdg_config);
> +
> +	git_configset_add_file(the_repository->protected_config, system_config);
> +	git_configset_add_file(the_repository->protected_config, xdg_config);
> +	git_configset_add_file(the_repository->protected_config, user_config);
> +
> +	free(system_config);
> +	free(xdg_config);
> +	free(user_config);
> +}

This loads the config from three files, including the xdg_config, which
I wasn't thinking about before.

This implementation does not use the -c config yet, which you listed as
a downside of read_very_early_config(). I see that you include that in
your patch 4, but the commit message for this patch could list that as a
step that will be handled by a later change.

(You could also do that as patch 3 and add a test near the existing
safe.directory tests instead of waiting for discovery.bare.)

> +
> +/* Ensure that the_repository->protected_config has been initialized. */
> +static void git_protected_config_check_init(void)
> +{
> +	if (the_repository->protected_config &&
> +	    the_repository->protected_config->hash_initialized)
> +		return;
> +	read_protected_config();
> +}
> +
> +void git_protected_config(config_fn_t fn, void *data)
> +{
> +	git_protected_config_check_init();
> +	configset_iter(the_repository->protected_config, fn, data);
> +}

These two methods are clearly correct.

..._check_init() is an OK name. I've seen us use "prepare_...()" in
other areas as a way of making sure that we have the proper state
(see prepare_packed_git() and the like), so maybe a rename here to
match would be worthwhile. Feel free to ignore.

> +	if (repo->protected_config) {
> +		git_configset_clear(repo->protected_config);
> +		FREE_AND_NULL(repo->protected_config);
> +	}

This will have no equivalent when protected_config is left as a
static global, but that is fine. It only goes out of scope with
the end of the process, anyway.

> @@ -1128,7 +1128,7 @@ static int ensure_valid_ownership(const char *path)
>  	    is_path_owned_by_current_user(path))
>  		return 1;
>  
> -	read_very_early_config(safe_directory_cb, &data);
> +	git_protected_config(safe_directory_cb, &data);

Nice to have a very simple conversion here.

Thanks,
-Stolee


* Re: [PATCH v3 3/5] setup.c: create `discovery.bare`
  2022-05-27 21:09     ` [PATCH v3 3/5] setup.c: create `discovery.bare` Glen Choo via GitGitGadget
  2022-05-28  0:59       ` Junio C Hamano
@ 2022-06-02 13:11       ` Derrick Stolee
  1 sibling, 0 replies; 113+ messages in thread
From: Derrick Stolee @ 2022-06-02 13:11 UTC (permalink / raw)
  To: Glen Choo via GitGitGadget, git
  Cc: Taylor Blau, brian m. carlson, Junio C Hamano, Emily Shaffer, Glen Choo

On 5/27/2022 5:09 PM, Glen Choo via GitGitGadget wrote:
> +enum discovery_bare_config {
> +	DISCOVERY_BARE_UNKNOWN = -1,
> +	DISCOVERY_BARE_NEVER = 0,
> +	DISCOVERY_BARE_ALWAYS,
> +};
> +static enum discovery_bare_config discovery_bare_config =
> +	DISCOVERY_BARE_UNKNOWN;

Using this static global is fine, I think.

> +static int discovery_bare_cb(const char *key, const char *value, void *d)
> +{
> +	if (strcmp(key, "discovery.bare"))
> +		return 0;
> +
> +	if (!strcmp(value, "never")) {
> +		discovery_bare_config = DISCOVERY_BARE_NEVER;
> +		return 0;
> +	}
> +	if (!strcmp(value, "always")) {
> +		discovery_bare_config = DISCOVERY_BARE_ALWAYS;
> +		return 0;
> +	}
> +	return -1;
> +}

However, I do think that this _cb method could benefit from interpreting
the 'd' pointer as a 'enum discovery_bare_config *' and assigning the
value at the pointer. We can then pass the global to the
git_protected_config() call below.

This is probably over-defensive future-proofing, but this kind of change
would be necessary if we ever wanted to return the enum instead of
simply an integer, as below:

> +
> +static int check_bare_repo_allowed(void)
> +{
> +	if (discovery_bare_config == DISCOVERY_BARE_UNKNOWN) {
> +		discovery_bare_config = DISCOVERY_BARE_ALWAYS;
> +		git_protected_config(discovery_bare_cb, NULL);
> +	}
> +	switch (discovery_bare_config) {
> +	case DISCOVERY_BARE_NEVER:
> +		return 0;
> +	case DISCOVERY_BARE_ALWAYS:
> +		return 1;
> +	case DISCOVERY_BARE_UNKNOWN:
> +		BUG("invalid discovery_bare_config %d", discovery_bare_config);
> +	}
> +	return 0;
> +}

With the recommended change to the _cb method, we could rewrite this as

static enum discovery_bare_config get_discovery_bare(void)
{
	enum discovery_bare_config result = DISCOVERY_BARE_ALWAYS;
	git_protected_config(discovery_bare_cb, &result);
	return result;
}

With this, we can drop the UNKNOWN and let the caller treat the response
as a simple boolean.

I think this is simpler overall, but also makes it easier to extend in the
future to have "discovery.bare=non-embedded" by adding a new mode and
adjusting the consumer in setup_git_directory_gently_1() to use a switch()
on the returned enum.
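
For concreteness, a self-contained sketch of the callback writing
through 'd'.  The toy_*() names are stand-ins for
git_protected_config() and its inputs, invented here so the example
compiles on its own.  The NULL check on 'value' is an addition that is
not in the patch.

```c
#include <stddef.h>
#include <string.h>

enum discovery_bare_config {
	DISCOVERY_BARE_NEVER = 0,
	DISCOVERY_BARE_ALWAYS,
};

/* Same (key, value, data) shape as the callback in the patch. */
typedef int (*toy_config_fn)(const char *key, const char *value, void *data);

struct toy_kv {
	const char *key;
	const char *value;
};

static int discovery_bare_cb(const char *key, const char *value, void *d)
{
	enum discovery_bare_config *config = d;

	if (strcmp(key, "discovery.bare"))
		return 0;
	if (!value)
		return -1; /* assumption: a valueless key is an error */
	if (!strcmp(value, "never")) {
		*config = DISCOVERY_BARE_NEVER;
		return 0;
	}
	if (!strcmp(value, "always")) {
		*config = DISCOVERY_BARE_ALWAYS;
		return 0;
	}
	return -1;
}

/* Toy stand-in for git_protected_config(): feeds each pair to fn. */
static int toy_protected_config(const struct toy_kv *kvs, size_t nr,
				toy_config_fn fn, void *data)
{
	size_t i;

	for (i = 0; i < nr; i++)
		if (fn(kvs[i].key, kvs[i].value, data) < 0)
			return -1;
	return 0;
}

static enum discovery_bare_config get_discovery_bare(const struct toy_kv *kvs,
						     size_t nr)
{
	/* Default to "always" when the variable is not set anywhere. */
	enum discovery_bare_config result = DISCOVERY_BARE_ALWAYS;

	toy_protected_config(kvs, nr, discovery_bare_cb, &result);
	return result;
}
```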

> +
> +static const char *discovery_bare_config_to_string(void)
> +{
> +	switch (discovery_bare_config) {
> +	case DISCOVERY_BARE_NEVER:
> +		return "never";
> +	case DISCOVERY_BARE_ALWAYS:
> +		return "always";
> +	case DISCOVERY_BARE_UNKNOWN:
> +		BUG("invalid discovery_bare_config %d", discovery_bare_config);

This case should be a "default:" in case somehow an arbitrary integer
value was placed in the variable. This could also take an enum as a
parameter, to avoid being coupled to the global.

> +++ b/t/t0035-discovery-bare.sh
> @@ -0,0 +1,64 @@
> +#!/bin/sh
> +
> +test_description='verify discovery.bare checks'
> +
> +. ./test-lib.sh
> +
> +pwd="$(pwd)"
> +
> +expect_rejected () {
> +	test_must_fail git rev-parse --git-dir 2>err &&
> +	grep "discovery.bare" err
> +}

Should we make a simple "expect_accepted" helper in case we ever
want to replace the "git rev-parse --git-dir" with anything else?

> +
> +test_expect_success 'setup bare repo in worktree' '
> +	git init outer-repo &&
> +	git init --bare outer-repo/bare-repo
> +'
> +
> +test_expect_success 'discovery.bare unset' '
> +	(
> +		cd outer-repo/bare-repo &&
> +		git rev-parse --git-dir
> +	)
> +'
> +
> +test_expect_success 'discovery.bare=always' '
> +	git config --global discovery.bare always &&
> +	(
> +		cd outer-repo/bare-repo &&
> +		git rev-parse --git-dir
> +	)
> +'
> +
> +test_expect_success 'discovery.bare=never' '
> +	git config --global discovery.bare never &&
> +	(
> +		cd outer-repo/bare-repo &&
> +		expect_rejected
> +	)
> +'
> +
> +test_expect_success 'discovery.bare in the repository' '
> +	(
> +		cd outer-repo/bare-repo &&
> +		# Temporarily set discovery.bare=always, otherwise git
> +		# config fails with "fatal: not in a git directory"
> +		# (like safe.directory)
> +		git config --global discovery.bare always &&
> +		git config discovery.bare always &&
> +		git config --global discovery.bare never &&
> +		expect_rejected
> +	)
> +'
> +
> +test_expect_success 'discovery.bare on the command line' '
> +	git config --global discovery.bare never &&
> +	(
> +		cd outer-repo/bare-repo &&
> +		test_must_fail git -c discovery.bare=always rev-parse --git-dir 2>err &&
> +		grep "discovery.bare" err
> +	)

Ok, at the current place in the series, this test_must_fail matches
expectation. If you reorder to have this patch after your current patch 4,
then we can write this test immediately as a successful case.

We could also reuse some information from the expect_rejected helper by
adding this:

expect_rejected () {
	test_must_fail git "$@" rev-parse --git-dir 2>err &&
	grep "discovery.bare" err
}

Then you can test the -c options in the tests as

	expect_rejected -c discovery.bare=always

Thanks,
-Stolee


* Re: [PATCH v3 4/5] config: include "-c" in protected config
  2022-05-27 21:09     ` [PATCH v3 4/5] config: include "-c" in protected config Glen Choo via GitGitGadget
@ 2022-06-02 13:15       ` Derrick Stolee
  0 siblings, 0 replies; 113+ messages in thread
From: Derrick Stolee @ 2022-06-02 13:15 UTC (permalink / raw)
  To: Glen Choo via GitGitGadget, git
  Cc: Taylor Blau, brian m. carlson, Junio C Hamano, Emily Shaffer, Glen Choo

On 5/27/2022 5:09 PM, Glen Choo via GitGitGadget wrote:
  
> +int git_configset_add_parameters(struct config_set *cs)
> +{
> +	return git_config_from_parameters(config_set_callback, cs);
> +}
> +

This one-line method could be inlined into the read_protected_config()
method:

> @@ -2628,6 +2633,7 @@ static void read_protected_config(void)
>  	git_configset_add_file(the_repository->protected_config, system_config);
>  	git_configset_add_file(the_repository->protected_config, xdg_config);
>  	git_configset_add_file(the_repository->protected_config, user_config);
> +	git_configset_add_parameters(the_repository->protected_config);
	git_config_from_parameters(config_set_callback, the_repository->protected_config);

...would be the way to inline it.

> +/**
> + * Parses command line options and environment variables, and adds the
> + * variable-value pairs to the `config_set`. Returns 0 on success, or -1
> + * if there is an error in parsing. The caller decides whether to free
> + * the incomplete configset or continue using it when the function
> + * returns -1.
> + */
> +int git_configset_add_parameters(struct config_set *cs);

You do make it public here. I wonder if we can think of other consumers
of this method that justify the addition to the API.

But this is also a nitpick. I don't feel strongly one way or another. The
code definitely works as-is.

Thanks,
-Stolee


* Re: [PATCH v3 5/5] upload-pack: make uploadpack.packObjectsHook protected
  2022-05-27 21:09     ` [PATCH v3 5/5] upload-pack: make uploadpack.packObjectsHook protected Glen Choo via GitGitGadget
@ 2022-06-02 13:18       ` Derrick Stolee
  0 siblings, 0 replies; 113+ messages in thread
From: Derrick Stolee @ 2022-06-02 13:18 UTC (permalink / raw)
  To: Glen Choo via GitGitGadget, git
  Cc: Taylor Blau, brian m. carlson, Junio C Hamano, Emily Shaffer, Glen Choo

On 5/27/2022 5:09 PM, Glen Choo via GitGitGadget wrote:
> From: Glen Choo <chooglen@google.com>
> 
> Now that protected config includes "-c", "uploadpack.packObjectsHook"
> behaves identically to a 'Protected config only' variable. Refactor it
> to use git_protected_config() and mark it 'Protected config only'.

I'm really glad to see this simplification at the end of your series.

> @@ -1321,18 +1321,21 @@ static int upload_pack_config(const char *var, const char *value, void *cb_data)
>  		data->advertise_sid = git_config_bool(var, value);
>  	}
>  
> -	if (current_config_scope() != CONFIG_SCOPE_LOCAL &&
> -	    current_config_scope() != CONFIG_SCOPE_WORKTREE) {
> -		if (!strcmp("uploadpack.packobjectshook", var))
> -			return git_config_string(&data->pack_objects_hook, var, value);
> -	}
> -

...

> +static int upload_pack_protected_config(const char *var, const char *value, void *cb_data)
> +{
> +	struct upload_pack_data *data = cb_data;
> +
> +	if (!strcmp("uploadpack.packobjectshook", var))
> +		return git_config_string(&data->pack_objects_hook, var, value);
> +	return 0;
> +}
> +

This is much cleaner.

> @@ -1342,6 +1345,7 @@ void upload_pack(const int advertise_refs, const int stateless_rpc,
>  	upload_pack_data_init(&data);
>  
>  	git_config(upload_pack_config, &data);
> +	git_protected_config(upload_pack_protected_config, &data);
>  
>  	data.stateless_rpc = stateless_rpc;
>  	data.timeout = timeout;
> @@ -1697,6 +1701,7 @@ int upload_pack_v2(struct repository *r, struct packet_reader *request)
>  	data.use_sideband = LARGE_PACKET_MAX;
>  
>  	git_config(upload_pack_config, &data);
> +	git_protected_config(upload_pack_protected_config, &data);

It's unfortunate that there are two places that need this change.
Is it worth adding a static helper that executes these?

static void get_upload_pack_config(void *data)
{
	git_config(upload_pack_config, data);
	git_protected_config(upload_pack_protected_config, data);
}

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH v3 1/5] Documentation: define protected configuration
  2022-06-02 12:42         ` Derrick Stolee
@ 2022-06-02 16:53           ` Junio C Hamano
  2022-06-02 17:39             ` Glen Choo
  0 siblings, 1 reply; 113+ messages in thread
From: Junio C Hamano @ 2022-06-02 16:53 UTC (permalink / raw)
  To: Derrick Stolee
  Cc: Glen Choo via GitGitGadget, git, Taylor Blau, brian m. carlson,
	Emily Shaffer, Glen Choo

Derrick Stolee <derrickstolee@github.com> writes:

>> It would make sense to give a rationale behind the seemingly
>> arbitrary choice of what is and what is not "protected".  Not
>> necessarily in the glossary, but in the proposed log message of the
>> commit that makes the decision.  The rationale must help readers to
>> be able to answer the following questions.
>> 
>>  - The system level is "protected" because?  Is it because we do not
>>    even try to protect ourselves from those who can write anywhere
>>    in /etc/ or other system directories?
>> 
>>  - The per-user config is "protected" because?  Is it because our
>>    primary interest in "protection" is to protect individual users
>>    from landmines laid in the filesystem by other users, and those
>>    who can already write into $HOME are not who we try to guard against?
>
> I think the answer to these two questions is "yes", so they can
> be turned into an affirmative sentence:
>
> 	We do not event try to protect ourselves from those who can
> 	write anywhere...

s/event/even/.

>
>>  - The per-repo config is not "protected" (i.e. "trusted"), because?
>>    If we are not honoring a configuration in the repository, why are
>>    we working in that repository in the first place?
>
> This requires an example:
>
> 	Some workflows use repositories stored in shared directories,
> 	which are writable by multiple unprivileged users.

Hmph, "... and we do not trust these colleagues"?  It might be true,
but it sounds like a rather weak rationale, at least to me.  A natural
reaction coming from a devil's advocate naïve me would be "well, then I
would not be directly interacting with such a repository; I'd work in
a clone of it of my own, and pull and push as needed".

Isn't the reason more like "users may go spelunking random places in
the filesystem, with PS1 settings and the like that cause some "git"
command to be invoked automatically in their current directory, and
we want to protect these users from getting harmed by a random
repository with hostile contents in its configuration and hooks
without even realizing they have wandered into such a repository"?

>>  - The per invocation config is not "protected" (i.e. "trusted"),
>>    because?  If we cannot trust our own command line, what
>>    prevents an attacker from mucking with our command line to say
>>    "sudo whatever" using the same attack vector?
>
> With this argument, I agree that -c config can be considered
> protected. At the very least, it is visible to the user when they
> are running a command. This would unify our expectations with
> uploadPack.packObjectsHook, too.

Yup, that matches my understanding.

In any case, I'd prefer to see not just the definition but the
reasoning behind the decision that made some "protected" while
leaving others not-"protected" clearly documented to help users.

Thanks.

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH v3 1/5] Documentation: define protected configuration
  2022-06-02 16:53           ` Junio C Hamano
@ 2022-06-02 17:39             ` Glen Choo
  0 siblings, 0 replies; 113+ messages in thread
From: Glen Choo @ 2022-06-02 17:39 UTC (permalink / raw)
  To: Junio C Hamano, Derrick Stolee
  Cc: Glen Choo via GitGitGadget, git, Taylor Blau, brian m. carlson,
	Emily Shaffer

Junio C Hamano <gitster@pobox.com> writes:

> Derrick Stolee <derrickstolee@github.com> writes:
>>>  - The per-repo config is not "protected" (i.e. "trusted"), because?
>>>    If we are not honoring a configuration in the repository, why are
>>>    we working in that repository in the first place?
>>
>> This requires an example:
>>
>> 	Some workflows use repositories stored in shared directories,
>> 	which are writable by multiple unprivileged users.
>
> Isn't the reason more like "users may go spelunking random places in
> the filesystem, with PS1 settings and the like that cause some "git"
> command to be invoked automatically in their current directory, and
> we want to protect these users from getting harmed by a random
> repository with hostile contents in its configuration and hooks
> without even realizing they have wandered into such a repository"?

Hm, this is my understanding as well, i.e. `safe.directory` is meant to
protect you from shared repositories that you didn't expect, but it lets
you trust the shared repositories that you need (and there is no
protection once you decide to trust the repo).
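
A simplified model of that behaviour, written as standalone C rather than
taken from setup.c: a repository owned by someone else is only used when the
user has listed it in `safe.directory`. The `sketch_` name, the precomputed
ownership flag and the plain string comparison are assumptions made for
illustration; the real check also has to deal with path normalization and
platform-specific ownership rules.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return 1 if the repository at `path` may be used, 0 otherwise.
 * `owned_by_user` is an assumed, precomputed ownership result;
 * `safe_dirs` models the configured values of safe.directory.
 * This is a hypothetical helper, not a function from Git.
 */
int sketch_repo_is_allowed(const char *path, int owned_by_user,
			   const char *const *safe_dirs, size_t nr)
{
	size_t i;

	if (owned_by_user)
		return 1; /* owned by the current user: no opt-in needed */

	for (i = 0; i < nr; i++) {
		/* "*" opts out of the ownership check entirely */
		if (!strcmp(safe_dirs[i], "*") || !strcmp(safe_dirs[i], path))
			return 1;
	}
	return 0; /* a shared repository the user did not opt into */
}
```

Once a path is listed, the model returns 1 unconditionally for it, which
mirrors the "no protection once you decide to trust the repo" point above.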

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH v3 1/5] Documentation: define protected configuration
  2022-05-27 23:29       ` Junio C Hamano
  2022-06-02 12:42         ` Derrick Stolee
@ 2022-06-03 15:57         ` Glen Choo
  1 sibling, 0 replies; 113+ messages in thread
From: Glen Choo @ 2022-06-03 15:57 UTC (permalink / raw)
  To: Junio C Hamano, Glen Choo via GitGitGadget
  Cc: git, Taylor Blau, brian m. carlson, Derrick Stolee, Emily Shaffer

Junio C Hamano <gitster@pobox.com> writes:

> "Glen Choo via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
>>  safe.directory::
>> -	These config entries specify Git-tracked directories that are
>> -	considered safe even if they are owned by someone other than the
>> -	current user. By default, Git will refuse to even parse a Git
>> -	config of a repository owned by someone else, let alone run its
>> -	hooks, and this config setting allows users to specify exceptions,
>> -	e.g. for intentionally shared repositories (see the `--shared`
>> -	option in linkgit:git-init[1]).
>> +	'(Protected config only) ' These config entries specify
>
> What's the SP in "only) '" doing?

Silly typo. Thanks for the catch :)

>> diff --git a/Documentation/glossary-content.txt b/Documentation/glossary-content.txt
>> index aa2f41f5e70..a669983abd6 100644
>> --- a/Documentation/glossary-content.txt
>> +++ b/Documentation/glossary-content.txt
>> @@ -483,6 +483,24 @@ exclude;;
>>  	head ref. If the remote <<def_head,head>> is not an
>>  	ancestor to the local head, the push fails.
>>  
>> +[[def_protected_config]]protected configuration::
>> +	Protected configuration is configuration that Git considers more
>> +	trustworthy because it is unlikely to be tampered with by an
>> +	attacker. For security reasons, some configuration variables are
>> +	only respected when they are defined in protected configuration.
>> ++
>> +Protected configuration includes:
>> ++
>> +- system-level config, e.g. `/etc/git/config`
>> +- global config, e.g. `$XDG_CONFIG_HOME/git/config` and
>> +  `$HOME/.gitconfig`
>> +Protected configuration excludes:
>> ++
>> +- repository config, e.g. `$GIT_DIR/config` and
>> +  `$GIT_DIR/config.worktree`
>> +- the command line option `-c` and its equivalent environment variables
>
> The description is a bit unclear about what "protected configuration"
> refers to.
>
> If it is the scopes (as in "git config --show-scope") Git can trust
> more, in other words, a statement like this
>
>     safe.directory is honored only when it comes from a protected
>     configuration.
>
> is what you want to make easier to write by introducing a new
> phrase, perhaps use the word "scope" for more consistency?  E.g.
>
>     Only safe.directory that is defined in a trusted scope is
>     honored.

Good point. I think using scope would be a lot clearer, and maybe I
will consider s/protected configuration/protected scope. I'm hesitant to
call the scope "trusted", because I don't want to insinuate that
repository config is "untrusted" since we _do_ trust it in most cases.

I don't think Documentation/git-config.txt has adequately defined what a
'scope' is though, even though scopes have been with us since 9acc591111
(config: add a notion of "scope", 2016-05-18). The best I could find is
"--show-scope", introduced in 145d59f482 (config: add '--show-scope' to
print the scope of a config value, 2020-02-10), which mentions scopes
but doesn't link the idea back to the specific files or CLI options
("--system", "--global", etc).

So I'll see if I can improve the docs around scopes since that will help
the language in this patch.

^ permalink raw reply	[flat|nested] 113+ messages in thread

* [PATCH v4 0/5] config: introduce discovery.bare and protected config
  2022-05-27 21:09   ` [PATCH v3 0/5] config: introduce discovery.bare and protected config Glen Choo via GitGitGadget
                       ` (4 preceding siblings ...)
  2022-05-27 21:09     ` [PATCH v3 5/5] upload-pack: make uploadpack.packObjectsHook protected Glen Choo via GitGitGadget
@ 2022-06-07 20:57     ` Glen Choo via GitGitGadget
  2022-06-07 20:57       ` [PATCH v4 1/5] Documentation/git-config.txt: add SCOPES section Glen Choo via GitGitGadget
                         ` (7 more replies)
  5 siblings, 8 replies; 113+ messages in thread
From: Glen Choo via GitGitGadget @ 2022-06-07 20:57 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, brian m. carlson, Derrick Stolee, Junio C Hamano,
	Emily Shaffer, Glen Choo

Thanks again for the kind feedback, everyone :)

The motivation has remained the same as the last round; you can find it in
the "Description" section in the previous cover letter [1].

This round doesn't introduce any major code changes. The most notable
changes are:

 * I've reorganized the patches so that protected config includes "-c" from
   the beginning (3/5) (instead of trying to avoid changing safe.directory).
   As a result, uploadpack.packObjectsHook becomes the first 'protected
   config only' variable instead of safe.directory.
   
   Since we start the conversion with uploadpack.packObjectsHook, I was
   curious whether we might be able to reuse its approach of "reading the
   full set of config, but checking the scope of each value", which might be
   nice because we could reuse the cache in the_repository->config instead
   of creating an entirely new configset. I didn't pursue it further, but
   I've noted this alternative in 3/5's commit message.

 * 'Protected configuration' is now defined as a set of configuration scopes
   (2/5). This follows a suggestion from Junio [2], which I thought read
   very clearly. We haven't done a great job at describing 'scopes' in
   Documentation/git-config.txt though, so I tried to remedy that by
   cleaning up a little and adding a SCOPES section (1/5).
   
   Frankly, I'm not very happy with the end result - it's not nearly as
   clear as I had hoped, and I think I might be introducing some confusion
   between the words "config" and "scope". I'd appreciate any feedback that
   helps us get to a good final wording.

 * I added a test for "git config --show-scope" and the 'worktree' scope,
   since 'worktree' wasn't listed in Documentation/git-config.txt (6/5).
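
To make the "read the full set of config, but check the scope of each value"
alternative from the first bullet concrete, here is a minimal standalone
sketch in C. It is not Git's implementation: the `sketch_` names, the enum
and the entry struct are assumptions made for illustration. Only the idea
comes from this series, namely that a 'protected config only' variable is
accepted from the system, global and command scopes and ignored otherwise.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for Git's config scopes; not the real enum. */
enum sketch_scope {
	SKETCH_SCOPE_SYSTEM,
	SKETCH_SCOPE_GLOBAL,
	SKETCH_SCOPE_LOCAL,
	SKETCH_SCOPE_WORKTREE,
	SKETCH_SCOPE_COMMAND
};

/* Protected config is the system, global and command scopes. */
int sketch_scope_is_protected(enum sketch_scope scope)
{
	return scope == SKETCH_SCOPE_SYSTEM ||
	       scope == SKETCH_SCOPE_GLOBAL ||
	       scope == SKETCH_SCOPE_COMMAND;
}

/* One parsed config value together with the scope it came from. */
struct sketch_entry {
	enum sketch_scope scope;
	const char *key;
	const char *value;
};

/*
 * Walk every entry in order and return the last value of `key` that
 * was defined in a protected scope, or NULL if there is none. Values
 * from the local and worktree scopes are silently skipped.
 */
const char *sketch_get_protected(const struct sketch_entry *entries,
				 size_t nr, const char *key)
{
	const char *found = NULL;
	size_t i;

	for (i = 0; i < nr; i++) {
		if (strcmp(entries[i].key, key))
			continue;
		if (!sketch_scope_is_protected(entries[i].scope))
			continue;
		found = entries[i].value;
	}
	return found;
}
```

With a global entry followed by a local entry for the same key, the lookup
returns the global value; with only the local entry it returns NULL.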

= Patch organization

 * Patch 1 adds a section on config scopes to our docs
 * Patches 2-3 define 'protected config' and create a shared implementation.
 * Patch 4 refactors safe.directory to use protected config
 * Patch 5 adds discovery.bare

= Series history

Changes in v4:

 * 2/5's commit message now justifies what scopes are included in protected
   config
 * The global configset is now a file-scope static inside config.c
   (previously it was a member of the_repository).
 * Rename discovery_bare_config to discovery_bare_allowed
 * Make discovery_bare_allowed function-scoped (instead of global).
 * Add an expect_accepted helper to the discovery.bare tests.
 * Add a helper to "upload-pack" that reads the protected and non-protected
   config

Changes in v3:

 * Rebase onto a more recent 'master'
 * Reframe this feature only in terms of the 'embedded bare repo' attack.
 * Other docs improvements (thanks Stolee in particular!)
 * Protected config no longer uses read_very_early_config() and is only read
   once
 * Protected config now includes "-c"
 * uploadpack.packObjectsHook now uses protected config instead of ignoring
   repo config using config scopes

Changes in v2:

 * Rename safe.barerepository to discovery.bare and make it die()
 * Move tests into t/t0034-discovery-bare.sh
 * Avoid unnecessary config reading by using a static variable
 * Add discovery.bare=cwd
 * Fix typos

= Future work

 * This series does not implement the "no-embedded" option [3] and I won't
   work on it any time soon, but I'd be more than happy to review if someone
   sends patches.
 * With discovery.bare, if a builtin is marked RUN_SETUP_GENTLY, setup.c
   doesn't die() and we don't tell users why their repository was rejected,
   e.g. "git config" gives an opaque "fatal: not in a git directory". This
   isn't a new problem though, since safe.directory has the same issue.

[1] https://lore.kernel.org/git/pull.1261.v3.git.git.1653685761.gitgitgadget@gmail.com
[2] https://lore.kernel.org/git/xmqqh75a1rmd.fsf@gitster.g
[3] This was first suggested in
    https://lore.kernel.org/git/5b969c5e-e802-c447-ad25-6acc0b784582@github.com

Glen Choo (5):
  Documentation/git-config.txt: add SCOPES section
  Documentation: define protected configuration
  config: read protected config with `git_protected_config()`
  safe.directory: use git_protected_config()
  setup.c: create `discovery.bare`

 Documentation/config.txt            |  2 +
 Documentation/config/discovery.txt  | 19 +++++++
 Documentation/config/safe.txt       |  6 +--
 Documentation/config/uploadpack.txt |  6 +--
 Documentation/git-config.txt        | 77 +++++++++++++++++++++++------
 config.c                            | 51 +++++++++++++++++++
 config.h                            | 17 +++++++
 setup.c                             | 59 +++++++++++++++++++++-
 t/t0033-safe-directory.sh           | 24 ++++-----
 t/t0035-discovery-bare.sh           | 68 +++++++++++++++++++++++++
 t/t5544-pack-objects-hook.sh        |  7 ++-
 upload-pack.c                       | 27 ++++++----
 12 files changed, 316 insertions(+), 47 deletions(-)
 create mode 100644 Documentation/config/discovery.txt
 create mode 100755 t/t0035-discovery-bare.sh


base-commit: f9b95943b68b6b8ca5a6072f50a08411c6449b55
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1261%2Fchooglen%2Fsetup%2Fdisable-bare-repo-config-v4
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1261/chooglen/setup/disable-bare-repo-config-v4
Pull-Request: https://github.com/git/git/pull/1261

Range-diff vs v3:

 1:  575676c760d < -:  ----------- Documentation: define protected configuration
 -:  ----------- > 1:  c0e27ab3b3e Documentation/git-config.txt: add SCOPES section
 5:  e25d5907cd1 ! 2:  a5a1dcb03e1 upload-pack: make uploadpack.packObjectsHook protected
     @@ Metadata
      Author: Glen Choo <chooglen@google.com>
      
       ## Commit message ##
     -    upload-pack: make uploadpack.packObjectsHook protected
     +    Documentation: define protected configuration
      
     -    Now that protected config includes "-c", "uploadpack.packObjectsHook"
     -    behaves identically to a 'Protected config only' variable. Refactor it
     -    to use git_protected_config() and mark it 'Protected config only'.
     +    For security reasons, there are config variables that are only trusted
     +    when they are specified in extra-trustworthy configuration scopes, which
     +    are sometimes referred to on-list as 'protected configuration' [1]. A
     +    future commit will introduce another such variable, so let's define our
     +    terms so that we can have consistent documentation and implementation.
     +
     +    In our documentation, define 'protected config' as the system, global
     +    and command config scopes. As a shorthand, I will refer to variables
     +    that are only respected in protected config as 'protected config only',
     +    but this term is not used in the documentation.
     +
     +    This definition of protected configuration is based on whether or not
     +    Git can reasonably protect the user by ignoring the configuration scope:
     +
     +    - System, global and command line config are considered protected
     +      because an attacker who has control over any of those can do plenty of
     +      harm without Git, so we gain very little by ignoring those scopes.
     +    - On the other hand, local (and similarly, worktree) config are not
     +      considered protected because it is relatively easy for an attacker to
     +      control local config, e.g.:
     +      - On some shared user environments, a non-admin attacker can create a
     +        repository high up the directory hierarchy (e.g. C:\.git on Windows),
     +        and a user may accidentally use it when their PS1 automatically
     +        invokes "git" commands.
     +
     +        `safe.directory` prevents attacks of this form by making sure that
     +        the user intended to use the shared repository. It obviously
     +        shouldn't be read from the repository, because that would end up
     +        trusting the repository that Git was supposed to reject.
     +      - "git upload-pack" is expected to run in repositories that may not be
     +        controlled by the user. We cannot ignore all config in that
     +        repository (because "git upload-pack" would fail), but we can limit
     +        the risks by ignoring `uploadpack.packObjectsHook`.
     +
     +    Only `uploadpack.packObjectsHook` is 'protected config only'. The
     +    following variables are intentionally excluded:
     +
     +    - `safe.directory` should be 'protected config only', but it does not
     +      technically fit the definition because it is not respected in the
     +      "command" scope. A future commit will fix this.
     +
     +    - `trace2.*` happens to read the same scopes as `safe.directory` because
     +      they share an implementation. However, this is not for security
     +      reasons; it is because we want to start tracing so early that
     +      repository-level config and "-c" are not available [2].
     +
     +      This requirement is unique to `trace2.*`, so it does not make sense
     +      for protected configuration to be subject to the same constraints.
     +
     +    [1] For example,
     +    https://lore.kernel.org/git/6af83767-576b-75c4-c778-0284344a8fe7@github.com/
     +    [2] https://lore.kernel.org/git/a0c89d0d-669e-bf56-25d2-cbb09b012e70@jeffhostetler.com/
      
          Signed-off-by: Glen Choo <chooglen@google.com>
      
       ## Documentation/config/uploadpack.txt ##
     -@@ Documentation/config/uploadpack.txt: uploadpack.keepAlive::
     - 	disables keepalive packets entirely. The default is 5 seconds.
     - 
     - uploadpack.packObjectsHook::
     --	If this option is set, when `upload-pack` would run
     --	`git pack-objects` to create a packfile for a client, it will
     --	run this shell command instead.  The `pack-objects` command and
     --	arguments it _would_ have run (including the `git pack-objects`
     --	at the beginning) are appended to the shell command. The stdin
     --	and stdout of the hook are treated as if `pack-objects` itself
     --	was run. I.e., `upload-pack` will feed input intended for
     --	`pack-objects` to the hook, and expects a completed packfile on
     --	stdout.
     --+
     +@@ Documentation/config/uploadpack.txt: uploadpack.packObjectsHook::
     + 	`pack-objects` to the hook, and expects a completed packfile on
     + 	stdout.
     + +
      -Note that this configuration variable is ignored if it is seen in the
      -repository-level config (this is a safety measure against fetching from
      -untrusted repositories).
     -+	'(Protected config only)' If this option is set, when
     -+	`upload-pack` would run `git pack-objects` to create a packfile
     -+	for a client, it will run this shell command instead. The
     -+	`pack-objects` command and arguments it _would_ have run
     -+	(including the `git pack-objects` at the beginning) are appended
     -+	to the shell command. The stdin and stdout of the hook are
     -+	treated as if `pack-objects` itself was run. I.e., `upload-pack`
     -+	will feed input intended for `pack-objects` to the hook, and
     -+	expects a completed packfile on stdout.
     ++Note that this configuration variable is only respected when it is specified
     ++in protected config (see <<SCOPES>>). This is a safety measure against
     ++fetching from untrusted repositories.
       
       uploadpack.allowFilter::
       	If this option is set, `upload-pack` will support partial
      
     - ## upload-pack.c ##
     -@@ upload-pack.c: static int upload_pack_config(const char *var, const char *value, void *cb_data)
     - 		data->advertise_sid = git_config_bool(var, value);
     - 	}
     + ## Documentation/git-config.txt ##
     +@@ Documentation/git-config.txt: You can change the way options are read/written by specifying the path to a
     + file (`--file`), or by specifying a configuration scope (`--system`,
     + `--global`, `--local`, `--worktree`); see <<OPTIONS>> above.
       
     --	if (current_config_scope() != CONFIG_SCOPE_LOCAL &&
     --	    current_config_scope() != CONFIG_SCOPE_WORKTREE) {
     --		if (!strcmp("uploadpack.packobjectshook", var))
     --			return git_config_string(&data->pack_objects_hook, var, value);
     --	}
     --
     - 	if (parse_object_filter_config(var, value, data) < 0)
     - 		return -1;
     ++[[SCOPES]]
     + SCOPES
     + ------
       
     - 	return parse_hide_refs_config(var, value, "uploadpack");
     - }
     +@@ Documentation/git-config.txt: Most configuration options are respected regardless of the scope it is
     + defined in, but some options are only respected in certain scopes. See the
     + option's documentation for the full details.
       
     -+static int upload_pack_protected_config(const char *var, const char *value, void *cb_data)
     -+{
     -+	struct upload_pack_data *data = cb_data;
     ++Protected config
     ++~~~~~~~~~~~~~~~~
      +
     -+	if (!strcmp("uploadpack.packobjectshook", var))
     -+		return git_config_string(&data->pack_objects_hook, var, value);
     -+	return 0;
     -+}
     ++Protected config refers to the 'system', 'global', and 'command' scopes. Git
     ++considers these scopes to be especially trustworthy because they are likely
     ++to be controlled by the user or a trusted administrator. An attacker who
     ++controls these scopes can do substantial harm without using Git, so it is
     ++assumed that the user's environment protects these scopes against attackers.
      +
     - void upload_pack(const int advertise_refs, const int stateless_rpc,
     - 		 const int timeout)
     - {
     -@@ upload-pack.c: void upload_pack(const int advertise_refs, const int stateless_rpc,
     - 	upload_pack_data_init(&data);
     - 
     - 	git_config(upload_pack_config, &data);
     -+	git_protected_config(upload_pack_protected_config, &data);
     - 
     - 	data.stateless_rpc = stateless_rpc;
     - 	data.timeout = timeout;
     -@@ upload-pack.c: int upload_pack_v2(struct repository *r, struct packet_reader *request)
     - 	data.use_sideband = LARGE_PACKET_MAX;
     - 
     - 	git_config(upload_pack_config, &data);
     -+	git_protected_config(upload_pack_protected_config, &data);
     ++For security reasons, certain options are only respected when they are
     ++specified in protected config, and ignored otherwise.
     ++
     + ENVIRONMENT
     + -----------
       
     - 	while (state != FETCH_DONE) {
     - 		switch (state) {
 2:  7499a280961 ! 3:  94b40907e66 config: read protected config with `git_protected_config()`
     @@ Metadata
       ## Commit message ##
          config: read protected config with `git_protected_config()`
      
     -    Protected config is read using `read_very_early_config()`, which has
     -    several downsides:
     -
     -    - Every call to `read_very_early_config()` parses global and
     -      system-level config files anew, but this can be optimized by just
     -      parsing them once [1].
     -    - Protected variables should respect "-c" because we can reasonably
     -      assume that it comes from the user. But, `read_very_early_config()`
     -      can't use "-c" because it is called so early that it does not have
     -      access to command line arguments.
     -
     -    Introduce `git_protected_config()`, which reads protected config and
     -    caches the values in `the_repository.protected_config`. Then, refactor
     -    `safe.directory` to use `git_protected_config()`.
     -
     -    This implementation can still be improved, however:
     -
     -    - `git_protected_config()` iterates through every variable in
     -      `the_repository.protected_config`, which may still be too expensive to
     -      be called in every "git" invocation. There exist constant time lookup
     -      functions for non-protected config (repo_config_get_*()), but for
     -      simplicity, this commit does not implement similar functions for
     -      protected config.
     -
     -    - Protected config is stored in `the_repository` so that we don't need
     -      to statically allocate it. But this might be confusing since protected
     -      config ignores repository config by definition.
     -
     -    [1] While `git_protected_config()` should save on file I/O, I wasn't
     -    able to measure a meaningful difference between that and
     -    `read_very_early_config()` on my machine (which has an SSD).
     +    `uploadpack.packObjectsHook` is the only 'protected config only'
     +    variable today, but we've noted that `safe.directory` and the upcoming
     +    `discovery.bare` should also be 'protected config only'. So, for
     +    consistency, we'd like to have a single implementation for protected
     +    config.
     +
     +    The primary constraints are:
     +
     +    1. Reading from protected config should be as fast as possible. Nearly
     +       all "git" commands inside a bare repository will read both
     +       `safe.directory` and `discovery.bare`, so we cannot afford to be
     +       slow.
     +
     +    2. Protected config must be readable when the gitdir is not known.
     +       `safe.directory` and `discovery.bare` both affect repository
     +       discovery and the gitdir is not known at that point [1].
     +
     +    The chosen implementation in this commit is to read protected config and
     +    cache the values in a global configset. This is similar to the caching
     +    behavior we get with the_repository->config.
     +
     +    Introduce git_protected_config(), which reads protected config and
     +    caches them in the global configset protected_config. Then, refactor
     +    `uploadpack.packObjectsHook` to use git_protected_config().
     +
     +    The protected config functions are named similarly to their
     +    non-protected counterparts, e.g. git_protected_config_check_init() vs
     +    git_config_check_init().
     +
     +    In light of constraint 1, this implementation can still be improved
     +    since git_protected_config() iterates through every variable in
     +    protected_config, which may still be too expensive. There exist constant
     +    time lookup functions for non-protected config (repo_config_get_*()),
     +    but for simplicity, this commit does not implement similar functions for
     +    protected config.
     +
     +    An alternative that avoids introducing another configset is to continue
     +    to read all config using git_config(), but only accept values that have
     +    the correct config scope [2]. This technically fulfills constraint 2,
     +    because git_config() simply ignores the local and worktree config when
     +    the gitdir is not known. However, this would read incomplete config into
     +    the_repository->config, which would need to be reset when the gitdir is
     +    known and git_config() needs to read the local and worktree config.
     +    Resetting the_repository->config might be reasonable while we only have
     +    these 'protected config only' variables, but it's not clear whether this
     +    extends well to future variables.
     +
     +    [1] In this case, we do have a candidate gitdir though, so with a little
     +    refactoring, it might be possible to provide a gitdir.
     +    [2] This is how `uploadpack.packObjectsHook` was implemented prior to
     +    this commit.
      
          Signed-off-by: Glen Choo <chooglen@google.com>
      
       ## config.c ##
     +@@ config.c: static enum config_scope current_parsing_scope;
     + static int pack_compression_seen;
     + static int zlib_compression_seen;
     + 
     ++/*
     ++ * Config that comes from trusted sources, namely:
     ++ * - system config files (e.g. /etc/gitconfig)
     ++ * - global config files (e.g. $HOME/.gitconfig,
     ++ *   $XDG_CONFIG_HOME/git)
     ++ * - the command line.
     ++ *
     ++ * This is declared here for code cleanliness, but unlike the other
     ++ * static variables, this does not hold config parser state.
     ++ */
     ++static struct config_set protected_config;
     ++
     + static int config_file_fgetc(struct config_source *conf)
     + {
     + 	return getc_unlocked(conf->u.file);
     +@@ config.c: int git_configset_add_file(struct config_set *cs, const char *filename)
     + 	return git_config_from_file(config_set_callback, filename, cs);
     + }
     + 
     ++int git_configset_add_parameters(struct config_set *cs)
     ++{
     ++	return git_config_from_parameters(config_set_callback, cs);
     ++}
     ++
     + int git_configset_get_value(struct config_set *cs, const char *key, const char **value)
     + {
     + 	const struct string_list *values = NULL;
      @@ config.c: int repo_config_get_pathname(struct repository *repo,
       	return ret;
       }
       
     -+/* Read protected config into the_repository->protected_config. */
     ++/* Read values into protected_config. */
      +static void read_protected_config(void)
      +{
      +	char *xdg_config = NULL, *user_config = NULL, *system_config = NULL;
      +
     -+	CALLOC_ARRAY(the_repository->protected_config, 1);
     -+	git_configset_init(the_repository->protected_config);
     ++	git_configset_init(&protected_config);
      +
      +	system_config = git_system_config();
      +	git_global_config(&user_config, &xdg_config);
      +
     -+	git_configset_add_file(the_repository->protected_config, system_config);
     -+	git_configset_add_file(the_repository->protected_config, xdg_config);
     -+	git_configset_add_file(the_repository->protected_config, user_config);
     ++	git_configset_add_file(&protected_config, system_config);
     ++	git_configset_add_file(&protected_config, xdg_config);
     ++	git_configset_add_file(&protected_config, user_config);
     ++	git_configset_add_parameters(&protected_config);
      +
      +	free(system_config);
      +	free(xdg_config);
      +	free(user_config);
      +}
      +
     -+/* Ensure that the_repository->protected_config has been initialized. */
     ++/* Ensure that protected_config has been initialized. */
      +static void git_protected_config_check_init(void)
      +{
     -+	if (the_repository->protected_config &&
     -+	    the_repository->protected_config->hash_initialized)
     ++	if (protected_config.hash_initialized)
      +		return;
      +	read_protected_config();
      +}
     @@ config.c: int repo_config_get_pathname(struct repository *repo,
      +void git_protected_config(config_fn_t fn, void *data)
      +{
      +	git_protected_config_check_init();
     -+	configset_iter(the_repository->protected_config, fn, data);
     ++	configset_iter(&protected_config, fn, data);
      +}
      +
       /* Functions used historically to read configuration from 'the_repository' */
     @@ config.c: int repo_config_get_pathname(struct repository *repo,
       {
      
       ## config.h ##
     +@@ config.h: void git_configset_init(struct config_set *cs);
     +  */
     + int git_configset_add_file(struct config_set *cs, const char *filename);
     + 
     ++/**
     ++ * Parses command line options and environment variables, and adds the
     ++ * variable-value pairs to the `config_set`. Returns 0 on success, or -1
     ++ * if there is an error in parsing. The caller decides whether to free
     ++ * the incomplete configset or continue using it when the function
     ++ * returns -1.
     ++ */
     ++int git_configset_add_parameters(struct config_set *cs);
     ++
     + /**
     +  * Finds and returns the value list, sorted in order of increasing priority
     +  * for the configuration variable `key` and config set `cs`. When the
      @@ config.h: int repo_config_get_maybe_bool(struct repository *repo,
       int repo_config_get_pathname(struct repository *repo,
       			     const char *key, const char **dest);
     @@ config.h: int repo_config_get_maybe_bool(struct repository *repo,
        * Querying For Specific Variables
        * -------------------------------
      
     - ## repository.c ##
     -@@ repository.c: void repo_clear(struct repository *repo)
     - 		FREE_AND_NULL(repo->remote_state);
     - 	}
     - 
     -+	if (repo->protected_config) {
     -+		git_configset_clear(repo->protected_config);
     -+		FREE_AND_NULL(repo->protected_config);
     -+	}
     + ## t/t5544-pack-objects-hook.sh ##
     +@@ t/t5544-pack-objects-hook.sh: test_expect_success 'hook does not run from repo config' '
     + 	! grep "hook running" stderr &&
     + 	test_path_is_missing .git/hook.args &&
     + 	test_path_is_missing .git/hook.stdin &&
     +-	test_path_is_missing .git/hook.stdout
     ++	test_path_is_missing .git/hook.stdout &&
      +
     - 	repo_clear_path_cache(&repo->cached_paths);
     - }
     ++	# check that global config is used instead
     ++	test_config_global uploadpack.packObjectsHook ./hook &&
     ++	git clone --no-local . dst2.git 2>stderr &&
     ++	grep "hook running" stderr
     + '
       
     + test_expect_success 'hook works with partial clone' '
      
     - ## repository.h ##
     -@@ repository.h: struct repository {
     + ## upload-pack.c ##
     +@@ upload-pack.c: static int upload_pack_config(const char *var, const char *value, void *cb_data)
     + 		data->advertise_sid = git_config_bool(var, value);
     + 	}
       
     - 	struct repo_settings settings;
     +-	if (current_config_scope() != CONFIG_SCOPE_LOCAL &&
     +-	    current_config_scope() != CONFIG_SCOPE_WORKTREE) {
     +-		if (!strcmp("uploadpack.packobjectshook", var))
     +-			return git_config_string(&data->pack_objects_hook, var, value);
     +-	}
     +-
     + 	if (parse_object_filter_config(var, value, data) < 0)
     + 		return -1;
     + 
     + 	return parse_hide_refs_config(var, value, "uploadpack");
     + }
       
     -+	/*
     -+	 * Config that comes from trusted sources, namely
     -+	 * - system config files (e.g. /etc/gitconfig)
     -+	 * - global config files (e.g. $HOME/.gitconfig,
     -+	 *   $XDG_CONFIG_HOME/git)
     -+	 */
     -+	struct config_set *protected_config;
     ++static int upload_pack_protected_config(const char *var, const char *value, void *cb_data)
     ++{
     ++	struct upload_pack_data *data = cb_data;
     ++
     ++	if (!strcmp("uploadpack.packobjectshook", var))
     ++		return git_config_string(&data->pack_objects_hook, var, value);
     ++	return 0;
     ++}
     ++
     ++static void get_upload_pack_config(struct upload_pack_data *data)
     ++{
     ++	git_config(upload_pack_config, data);
     ++	git_protected_config(upload_pack_protected_config, data);
     ++}
      +
     - 	/* Subsystems */
     - 	/*
     - 	 * Repository's config which contains key-value pairs from the usual
     -
     - ## setup.c ##
     -@@ setup.c: static int ensure_valid_ownership(const char *path)
     - 	    is_path_owned_by_current_user(path))
     - 		return 1;
     + void upload_pack(const int advertise_refs, const int stateless_rpc,
     + 		 const int timeout)
     + {
     +@@ upload-pack.c: void upload_pack(const int advertise_refs, const int stateless_rpc,
     + 	struct upload_pack_data data;
       
     --	read_very_early_config(safe_directory_cb, &data);
     -+	git_protected_config(safe_directory_cb, &data);
     + 	upload_pack_data_init(&data);
     +-
     +-	git_config(upload_pack_config, &data);
     ++	get_upload_pack_config(&data);
       
     - 	return data.is_safe;
     - }
     + 	data.stateless_rpc = stateless_rpc;
     + 	data.timeout = timeout;
     +@@ upload-pack.c: int upload_pack_v2(struct repository *r, struct packet_reader *request)
     + 
     + 	upload_pack_data_init(&data);
     + 	data.use_sideband = LARGE_PACKET_MAX;
     +-
     +-	git_config(upload_pack_config, &data);
     ++	get_upload_pack_config(&data);
     + 
     + 	while (state != FETCH_DONE) {
     + 		switch (state) {
 4:  66a0a208176 ! 4:  156817966fa config: include "-c" in protected config
     @@ Metadata
      Author: Glen Choo <chooglen@google.com>
      
       ## Commit message ##
     -    config: include "-c" in protected config
     +    safe.directory: use git_protected_config()
      
     -    Protected config should include the command line (aka "-c") because we
     -    can be quite certain that this config is specified by the user.
     -
     -    Introduce a function, `git_configset_add_parameters()`, that adds "-c"
     -    config to a config_set, and use it to add "-c" to protected config.
     +    Use git_protected_config() to read `safe.directory` instead of
     +    read_very_early_config(), making it 'protected config only'. As a
     +    result, `safe.directory` now respects "-c", so update the tests and docs
     +    accordingly.
      
          Signed-off-by: Glen Choo <chooglen@google.com>
      
     - ## Documentation/config.txt ##
     -@@ Documentation/config.txt: names do not conflict with those that are used by Git itself and
     - other popular tools, and describe them in your documentation.
     - 
     - Variables marked with '(Protected config only)' are only respected when
     --they are specified in protected configuration. This includes global and
     --system-level config, and excludes repository config, the command line
     --option `-c`, and environment variables. For more details, see the
     -+they are specified in protected configuration. This includes global,
     -+system-level config, the command line option `-c`, and environment
     -+variables, and excludes repository config. For more details, see the
     - 'protected configuration' entry in linkgit:gitglossary[7].
     - 
     - include::config/advice.txt[]
     -
     - ## Documentation/glossary-content.txt ##
     -@@ Documentation/glossary-content.txt: Protected configuration includes:
     - - system-level config, e.g. `/etc/git/config`
     - - global config, e.g. `$XDG_CONFIG_HOME/git/config` and
     -   `$HOME/.gitconfig`
     -+- the command line option `-c` and its equivalent environment variables
     + ## Documentation/config/safe.txt ##
     +@@ Documentation/config/safe.txt: via `git config --add`. To reset the list of safe directories (e.g. to
     + override any such directories specified in the system config), add a
     + `safe.directory` entry with an empty value.
       +
     - Protected configuration excludes:
     +-This config setting is only respected when specified in a system or global
     +-config, not when it is specified in a repository config, via the command
     +-line option `-c safe.directory=<path>`, or in environment variables.
     ++This config setting is only respected in protected configuration (see
     ++<<SCOPES>>). This prevents the untrusted repository from tampering with this
     ++value.
       +
     - - repository config, e.g. `$GIT_DIR/config` and
     -   `$GIT_DIR/config.worktree`
     --- the command line option `-c` and its equivalent environment variables
     - 
     - [[def_reachable]]reachable::
     - 	All of the ancestors of a given <<def_commit,commit>> are said to be
     + The value of this setting is interpolated, i.e. `~/<path>` expands to a
     + path relative to the home directory and `%(prefix)/<path>` expands to a
      
     - ## config.c ##
     -@@ config.c: int git_configset_add_file(struct config_set *cs, const char *filename)
     - 	return git_config_from_file(config_set_callback, filename, cs);
     - }
     + ## setup.c ##
     +@@ setup.c: static int ensure_valid_ownership(const char *path)
     + 	    is_path_owned_by_current_user(path))
     + 		return 1;
       
     -+int git_configset_add_parameters(struct config_set *cs)
     -+{
     -+	return git_config_from_parameters(config_set_callback, cs);
     -+}
     -+
     - int git_configset_get_value(struct config_set *cs, const char *key, const char **value)
     - {
     - 	const struct string_list *values = NULL;
     -@@ config.c: static void read_protected_config(void)
     - 	git_configset_add_file(the_repository->protected_config, system_config);
     - 	git_configset_add_file(the_repository->protected_config, xdg_config);
     - 	git_configset_add_file(the_repository->protected_config, user_config);
     -+	git_configset_add_parameters(the_repository->protected_config);
     +-	read_very_early_config(safe_directory_cb, &data);
     ++	git_protected_config(safe_directory_cb, &data);
       
     - 	free(system_config);
     - 	free(xdg_config);
     -
     - ## config.h ##
     -@@ config.h: void git_configset_init(struct config_set *cs);
     -  */
     - int git_configset_add_file(struct config_set *cs, const char *filename);
     - 
     -+/**
     -+ * Parses command line options and environment variables, and adds the
     -+ * variable-value pairs to the `config_set`. Returns 0 on success, or -1
     -+ * if there is an error in parsing. The caller decides whether to free
     -+ * the incomplete configset or continue using it when the function
     -+ * returns -1.
     -+ */
     -+int git_configset_add_parameters(struct config_set *cs);
     -+
     - /**
     -  * Finds and returns the value list, sorted in order of increasing priority
     -  * for the configuration variable `key` and config set `cs`. When the
     + 	return data.is_safe;
     + }
      
       ## t/t0033-safe-directory.sh ##
      @@ t/t0033-safe-directory.sh: test_expect_success 'safe.directory is not set' '
     @@ t/t0033-safe-directory.sh: test_expect_success 'safe.directory is not set' '
       '
       
       test_expect_success 'ignoring safe.directory in repo config' '
     -
     - ## t/t0035-discovery-bare.sh ##
     -@@ t/t0035-discovery-bare.sh: test_expect_success 'discovery.bare on the command line' '
     - 	git config --global discovery.bare never &&
     - 	(
     - 		cd outer-repo/bare-repo &&
     --		test_must_fail git -c discovery.bare=always rev-parse --git-dir 2>err &&
     --		grep "discovery.bare" err
     -+		git -c discovery.bare=always rev-parse --git-dir
     - 	)
     - '
     - 
 3:  d5a3e9f9845 ! 5:  29053d029f8 setup.c: create `discovery.bare`
     @@ setup.c
       static int inside_git_dir = -1;
       static int inside_work_tree = -1;
       static int work_tree_config_is_bogus;
     -+enum discovery_bare_config {
     -+	DISCOVERY_BARE_UNKNOWN = -1,
     ++enum discovery_bare_allowed {
      +	DISCOVERY_BARE_NEVER = 0,
      +	DISCOVERY_BARE_ALWAYS,
      +};
     -+static enum discovery_bare_config discovery_bare_config =
     -+	DISCOVERY_BARE_UNKNOWN;
       
       static struct startup_info the_startup_info;
       struct startup_info *startup_info = &the_startup_info;
     @@ setup.c: static int ensure_valid_ownership(const char *path)
       
      +static int discovery_bare_cb(const char *key, const char *value, void *d)
      +{
     ++	enum discovery_bare_allowed *discovery_bare_allowed = d;
     ++
      +	if (strcmp(key, "discovery.bare"))
      +		return 0;
      +
      +	if (!strcmp(value, "never")) {
     -+		discovery_bare_config = DISCOVERY_BARE_NEVER;
     ++		*discovery_bare_allowed = DISCOVERY_BARE_NEVER;
      +		return 0;
      +	}
      +	if (!strcmp(value, "always")) {
     -+		discovery_bare_config = DISCOVERY_BARE_ALWAYS;
     ++		*discovery_bare_allowed = DISCOVERY_BARE_ALWAYS;
      +		return 0;
      +	}
      +	return -1;
      +}
      +
     -+static int check_bare_repo_allowed(void)
     ++static enum discovery_bare_allowed get_discovery_bare(void)
      +{
     -+	if (discovery_bare_config == DISCOVERY_BARE_UNKNOWN) {
     -+		discovery_bare_config = DISCOVERY_BARE_ALWAYS;
     -+		git_protected_config(discovery_bare_cb, NULL);
     -+	}
     -+	switch (discovery_bare_config) {
     -+	case DISCOVERY_BARE_NEVER:
     -+		return 0;
     -+	case DISCOVERY_BARE_ALWAYS:
     -+		return 1;
     -+	case DISCOVERY_BARE_UNKNOWN:
     -+		BUG("invalid discovery_bare_config %d", discovery_bare_config);
     -+	}
     -+	return 0;
     ++	enum discovery_bare_allowed result = DISCOVERY_BARE_ALWAYS;
     ++	git_protected_config(discovery_bare_cb, &result);
     ++	return result;
      +}
      +
     -+static const char *discovery_bare_config_to_string(void)
     ++static const char *discovery_bare_allowed_to_string(
     ++	enum discovery_bare_allowed discovery_bare_allowed)
      +{
     -+	switch (discovery_bare_config) {
     ++	switch (discovery_bare_allowed) {
      +	case DISCOVERY_BARE_NEVER:
      +		return "never";
      +	case DISCOVERY_BARE_ALWAYS:
      +		return "always";
     -+	case DISCOVERY_BARE_UNKNOWN:
     -+		BUG("invalid discovery_bare_config %d", discovery_bare_config);
     ++	default:
     ++		BUG("invalid discovery_bare_allowed %d",
     ++		    discovery_bare_allowed);
      +	}
      +	return NULL;
      +}
     @@ setup.c: static enum discovery_result setup_git_directory_gently_1(struct strbuf
       		}
       
       		if (is_git_directory(dir->buf)) {
     -+			if (!check_bare_repo_allowed())
     ++			if (!get_discovery_bare())
      +				return GIT_DIR_DISALLOWED_BARE;
       			if (!ensure_valid_ownership(dir->buf))
       				return GIT_DIR_INVALID_OWNERSHIP;
     @@ setup.c: const char *setup_git_directory_gently(int *nongit_ok)
      +		if (!nongit_ok) {
      +			die(_("cannot use bare repository '%s' (discovery.bare is '%s')"),
      +			    dir.buf,
     -+			    discovery_bare_config_to_string());
     ++			    discovery_bare_allowed_to_string(get_discovery_bare()));
      +		}
      +		*nongit_ok = 1;
      +		break;
     @@ t/t0035-discovery-bare.sh (new)
      +
      +pwd="$(pwd)"
      +
     ++expect_accepted () {
     ++	git "$@" rev-parse --git-dir
     ++}
     ++
      +expect_rejected () {
     -+	test_must_fail git rev-parse --git-dir 2>err &&
     ++	test_must_fail git "$@" rev-parse --git-dir 2>err &&
      +	grep "discovery.bare" err
      +}
      +
     @@ t/t0035-discovery-bare.sh (new)
      +test_expect_success 'discovery.bare unset' '
      +	(
      +		cd outer-repo/bare-repo &&
     -+		git rev-parse --git-dir
     ++		expect_accepted
      +	)
      +'
      +
     @@ t/t0035-discovery-bare.sh (new)
      +	git config --global discovery.bare always &&
      +	(
      +		cd outer-repo/bare-repo &&
     -+		git rev-parse --git-dir
     ++		expect_accepted
      +	)
      +'
      +
     @@ t/t0035-discovery-bare.sh (new)
      +	git config --global discovery.bare never &&
      +	(
      +		cd outer-repo/bare-repo &&
     -+		test_must_fail git -c discovery.bare=always rev-parse --git-dir 2>err &&
     -+		grep "discovery.bare" err
     ++		expect_accepted -c discovery.bare=always &&
     ++		expect_rejected -c discovery.bare=
      +	)
      +'
      +

-- 
gitgitgadget


* [PATCH v4 1/5] Documentation/git-config.txt: add SCOPES section
  2022-06-07 20:57     ` [PATCH v4 0/5] config: introduce discovery.bare and protected config Glen Choo via GitGitGadget
@ 2022-06-07 20:57       ` Glen Choo via GitGitGadget
  2022-06-07 20:57       ` [PATCH v4 2/5] Documentation: define protected configuration Glen Choo via GitGitGadget
                         ` (6 subsequent siblings)
  7 siblings, 0 replies; 113+ messages in thread
From: Glen Choo via GitGitGadget @ 2022-06-07 20:57 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, brian m. carlson, Derrick Stolee, Junio C Hamano,
	Emily Shaffer, Glen Choo, Glen Choo

From: Glen Choo <chooglen@google.com>

In a subsequent commit, we will introduce "protected config", which is
easiest to describe in terms of configuration scopes (i.e. it's the
union of the 'system', 'global', and 'command' scopes). This description
is fine for ML discussions, but it's inadequate for end users because we
don't provide a good description of "config scopes" in the public docs.

145d59f482 (config: add '--show-scope' to print the scope of a config
value, 2020-02-10) introduced the word "scope" to our public docs, but
that only enumerates the scopes and assumes the user can figure out
what those values mean.

Add a SCOPES section to Documentation/git-config.txt that describes the
config scopes and their corresponding CLI options, and mentions that
some configuration options are only respected in certain scopes. Then,
use the word "scope" to simplify the FILES section and change some
confusing wording.
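
For readers who want to see the scopes in practice, here is a minimal
sketch (not part of the patch). It assumes a Git new enough to have
`--show-scope` (introduced by 145d59f482), and the key `example.key`,
its values, and the throwaway directories are made up for illustration:

```shell
#!/bin/sh
# Sketch: set the same (hypothetical) key in three scopes and ask
# 'git config' which scope each value came from.
set -e

# Use a throwaway HOME and repository so real config is untouched.
tmp=$(mktemp -d)
export HOME="$tmp/home" XDG_CONFIG_HOME="$tmp/home/.config"
export GIT_CONFIG_NOSYSTEM=1	# skip the system scope for this demo
mkdir -p "$HOME"
git init -q "$tmp/repo"
cd "$tmp/repo"

git config --global example.key from-global	# 'global' scope
git config --local example.key from-local	# 'local' scope

# "-c" supplies a value in the 'command' scope for this invocation only.
scopes=$(git -c example.key=from-command \
	config --show-scope --get-all example.key)
printf '%s\n' "$scopes"
```

Each output line is prefixed with the scope that supplied the value,
which is the vocabulary the new SCOPES section documents.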

Signed-off-by: Glen Choo <chooglen@google.com>
---
 Documentation/git-config.txt | 64 ++++++++++++++++++++++++++++--------
 1 file changed, 50 insertions(+), 14 deletions(-)

diff --git a/Documentation/git-config.txt b/Documentation/git-config.txt
index bdcfd94b642..5e4c95f2423 100644
--- a/Documentation/git-config.txt
+++ b/Documentation/git-config.txt
@@ -297,8 +297,8 @@ The default is to use a pager.
 FILES
 -----
 
-If not set explicitly with `--file`, there are four files where
-'git config' will search for configuration options:
+By default, 'git config' will read configuration options from multiple
+files:
 
 $(prefix)/etc/gitconfig::
 	System-wide configuration file.
@@ -322,27 +322,63 @@ $GIT_DIR/config.worktree::
 	This is optional and is only searched when
 	`extensions.worktreeConfig` is present in $GIT_DIR/config.
 
-If no further options are given, all reading options will read all of these
-files that are available. If the global or the system-wide configuration
-file are not available they will be ignored. If the repository configuration
-file is not available or readable, 'git config' will exit with a non-zero
-error code. However, in neither case will an error message be issued.
+You may also provide additional configuration parameters when running any
+git command by using the `-c` option. See linkgit:git[1] for details.
+
+Options will be read from all of these files that are available. If the
+global or the system-wide configuration file are not available they will be
+ignored. If the repository configuration file is not available or readable,
+'git config' will exit with a non-zero error code. However, in neither case
+will an error message be issued.
 
 The files are read in the order given above, with last value found taking
 precedence over values read earlier.  When multiple values are taken then all
 values of a key from all files will be used.
 
-You may override individual configuration parameters when running any git
-command by using the `-c` option. See linkgit:git[1] for details.
-
-All writing options will per default write to the repository specific
+By default, options are only written to the repository specific
 configuration file. Note that this also affects options like `--replace-all`
 and `--unset`. *'git config' will only ever change one file at a time*.
 
-You can override these rules using the `--global`, `--system`,
-`--local`, `--worktree`, and `--file` command-line options; see
-<<OPTIONS>> above.
+You can change the way options are read/written by specifying the path to a
+file (`--file`), or by specifying a configuration scope (`--system`,
+`--global`, `--local`, `--worktree`); see <<OPTIONS>> above.
+
+SCOPES
+------
+
+Each configuration source falls within a configuration scope. The scopes
+are:
+
+system::
+	$(prefix)/etc/gitconfig
+
+global::
+	$XDG_CONFIG_HOME/git/config
++
+~/.gitconfig
+
+local::
+	$GIT_DIR/config
+
+worktree::
+	$GIT_DIR/config.worktree
+
+command::
+	environment variables
++
+the `-c` option
+
+With the exception of 'command', each scope corresponds to a command line
+option - `--system`, `--global`, `--local`, `--worktree`.
+
+When reading options, specifying a scope will only read options from the
+files within that scope. When writing options, specifying a scope will write
+to the files within that scope (instead of the repository specific
+configuration file). See <<OPTIONS>> above for a complete description.
 
+Most configuration options are respected regardless of the scope it is
+defined in, but some options are only respected in certain scopes. See the
+option's documentation for the full details.
 
 ENVIRONMENT
 -----------
-- 
gitgitgadget



* [PATCH v4 2/5] Documentation: define protected configuration
  2022-06-07 20:57     ` [PATCH v4 0/5] config: introduce discovery.bare and protected config Glen Choo via GitGitGadget
  2022-06-07 20:57       ` [PATCH v4 1/5] Documentation/git-config.txt: add SCOPES section Glen Choo via GitGitGadget
@ 2022-06-07 20:57       ` Glen Choo via GitGitGadget
  2022-06-22 21:58         ` Jonathan Tan
  2022-06-07 20:57       ` [PATCH v4 3/5] config: read protected config with `git_protected_config()` Glen Choo via GitGitGadget
                         ` (5 subsequent siblings)
  7 siblings, 1 reply; 113+ messages in thread
From: Glen Choo via GitGitGadget @ 2022-06-07 20:57 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, brian m. carlson, Derrick Stolee, Junio C Hamano,
	Emily Shaffer, Glen Choo, Glen Choo

From: Glen Choo <chooglen@google.com>

For security reasons, there are config variables that are only trusted
when they are specified in extra-trustworthy configuration scopes, which
are sometimes referred to on-list as 'protected configuration' [1]. A
future commit will introduce another such variable, so let's define our
terms so that we can have consistent documentation and implementation.

In our documentation, define 'protected config' as the system, global
and command config scopes. As a shorthand, I will refer to variables
that are only respected in protected config as 'protected config only',
but this term is not used in the documentation.

This definition of protected configuration is based on whether or not
Git can reasonably protect the user by ignoring the configuration scope:

- System, global and command line config are considered protected
  because an attacker who has control over any of those can do plenty of
  harm without Git, so we gain very little by ignoring those scopes.
- On the other hand, local (and similarly, worktree) config are not
  considered protected because it is relatively easy for an attacker to
  control local config, e.g.:
  - On some shared user environments, a non-admin attacker can create a
    repository high up the directory hierarchy (e.g. C:\.git on Windows),
    and a user may accidentally use it when their PS1 automatically
    invokes "git" commands.

    `safe.directory` prevents attacks of this form by making sure that
    the user intended to use the shared repository. It obviously
    shouldn't be read from the repository, because that would end up
    trusting the repository that Git was supposed to reject.
  - "git upload-pack" is expected to run in repositories that may not be
    controlled by the user. We cannot ignore all config in that
    repository (because "git upload-pack" would fail), but we can limit
    the risks by ignoring `uploadpack.packObjectsHook`.
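
As a rough illustration of the "git upload-pack" case (not part of the
patch), the sketch below sets `uploadpack.packObjectsHook` first in
repository config and then in global config, and records what each
clone prints. The hook script, its message, and all paths are
hypothetical, and the sketch relies on the existing behavior that the
hook is ignored in repository config:

```shell
#!/bin/sh
# Sketch: uploadpack.packObjectsHook is honored from global config but
# ignored when it only appears in the repository's own config.
set -e

tmp=$(mktemp -d)
export HOME="$tmp/home" XDG_CONFIG_HOME="$tmp/home/.config"
export GIT_CONFIG_NOSYSTEM=1
mkdir -p "$HOME"

git init -q "$tmp/src"
cd "$tmp/src"
git -c user.name=Example -c user.email=example@example.com \
	commit -q --allow-empty -m initial

# Hypothetical hook: announce itself, then run the real pack-objects
# command that upload-pack passes as arguments.
cat >"$tmp/hook" <<\EOF
#!/bin/sh
echo "hook running" >&2
exec "$@"
EOF
chmod +x "$tmp/hook"

# Repository (local) config is not protected config.
git config --local uploadpack.packObjectsHook "$tmp/hook"
git clone --no-local . "$tmp/dst1" 2>"$tmp/err1"
git config --local --unset uploadpack.packObjectsHook

# Global config is protected config.
git config --global uploadpack.packObjectsHook "$tmp/hook"
git clone --no-local . "$tmp/dst2" 2>"$tmp/err2"
```

Comparing "$tmp/err1" and "$tmp/err2" afterwards shows whether the hook
ran for each clone; t/t5544-pack-objects-hook.sh checks the same thing
in the test suite.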

Only `uploadpack.packObjectsHook` is 'protected config only'. The
following variables are intentionally excluded:

- `safe.directory` should be 'protected config only', but it does not
  technically fit the definition because it is not respected in the
  "command" scope. A future commit will fix this.

- `trace2.*` happens to read the same scopes as `safe.directory` because
  they share an implementation. However, this is not for security
  reasons; it is because we want to start tracing so early that
  repository-level config and "-c" are not available [2].

  This requirement is unique to `trace2.*`, so it does not make sense
  for protected configuration to be subject to the same constraints.

[1] For example,
https://lore.kernel.org/git/6af83767-576b-75c4-c778-0284344a8fe7@github.com/
[2] https://lore.kernel.org/git/a0c89d0d-669e-bf56-25d2-cbb09b012e70@jeffhostetler.com/

Signed-off-by: Glen Choo <chooglen@google.com>
---
 Documentation/config/uploadpack.txt |  6 +++---
 Documentation/git-config.txt        | 13 +++++++++++++
 2 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/Documentation/config/uploadpack.txt b/Documentation/config/uploadpack.txt
index 32fad5bbe81..029abbefdff 100644
--- a/Documentation/config/uploadpack.txt
+++ b/Documentation/config/uploadpack.txt
@@ -49,9 +49,9 @@ uploadpack.packObjectsHook::
 	`pack-objects` to the hook, and expects a completed packfile on
 	stdout.
 +
-Note that this configuration variable is ignored if it is seen in the
-repository-level config (this is a safety measure against fetching from
-untrusted repositories).
+Note that this configuration variable is only respected when it is specified
+in protected config (see <<SCOPES>>). This is a safety measure against
+fetching from untrusted repositories.
 
 uploadpack.allowFilter::
 	If this option is set, `upload-pack` will support partial
diff --git a/Documentation/git-config.txt b/Documentation/git-config.txt
index 5e4c95f2423..2b4334faec9 100644
--- a/Documentation/git-config.txt
+++ b/Documentation/git-config.txt
@@ -343,6 +343,7 @@ You can change the way options are read/written by specifying the path to a
 file (`--file`), or by specifying a configuration scope (`--system`,
 `--global`, `--local`, `--worktree`); see <<OPTIONS>> above.
 
+[[SCOPES]]
 SCOPES
 ------
 
@@ -380,6 +381,18 @@ Most configuration options are respected regardless of the scope it is
 defined in, but some options are only respected in certain scopes. See the
 option's documentation for the full details.
 
+Protected config
+~~~~~~~~~~~~~~~~
+
+Protected config refers to the 'system', 'global', and 'command' scopes. Git
+considers these scopes to be especially trustworthy because they are likely
+to be controlled by the user or a trusted administrator. An attacker who
+controls these scopes can do substantial harm without using Git, so it is
+assumed that the user's environment protects these scopes against attackers.
+
+For security reasons, certain options are only respected when they are
+specified in protected config, and ignored otherwise.
+
 ENVIRONMENT
 -----------
 
-- 
gitgitgadget



* [PATCH v4 3/5] config: read protected config with `git_protected_config()`
  2022-06-07 20:57     ` [PATCH v4 0/5] config: introduce discovery.bare and protected config Glen Choo via GitGitGadget
  2022-06-07 20:57       ` [PATCH v4 1/5] Documentation/git-config.txt: add SCOPES section Glen Choo via GitGitGadget
  2022-06-07 20:57       ` [PATCH v4 2/5] Documentation: define protected configuration Glen Choo via GitGitGadget
@ 2022-06-07 20:57       ` Glen Choo via GitGitGadget
  2022-06-07 22:49         ` Junio C Hamano
  2022-06-07 20:57       ` [PATCH v4 4/5] safe.directory: use git_protected_config() Glen Choo via GitGitGadget
                         ` (4 subsequent siblings)
  7 siblings, 1 reply; 113+ messages in thread
From: Glen Choo via GitGitGadget @ 2022-06-07 20:57 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, brian m. carlson, Derrick Stolee, Junio C Hamano,
	Emily Shaffer, Glen Choo, Glen Choo

From: Glen Choo <chooglen@google.com>

`uploadpack.packObjectsHook` is the only 'protected config only'
variable today, but we've noted that `safe.directory` and the upcoming
`discovery.bare` should also be 'protected config only'. So, for
consistency, we'd like to have a single implementation for protected
config.

The primary constraints are:

1. Reading from protected config should be as fast as possible. Nearly
   all "git" commands inside a bare repository will read both
   `safe.directory` and `discovery.bare`, so we cannot afford to be
   slow.

2. Protected config must be readable when the gitdir is not known.
   `safe.directory` and `discovery.bare` both affect repository
   discovery and the gitdir is not known at that point [1].

The chosen implementation in this commit is to read protected config and
cache the values in a global configset. This is similar to the caching
behavior we get with the_repository->config.

Introduce git_protected_config(), which reads protected config and
caches it in the global configset protected_config. Then, refactor
`uploadpack.packObjectsHook` to use git_protected_config().

The protected config functions are named similarly to their
non-protected counterparts, e.g. git_protected_config_check_init() vs
git_config_check_init().

In light of constraint 1, this implementation can still be improved
since git_protected_config() iterates through every variable in
protected_config, which may still be too expensive. There exist constant
time lookup functions for non-protected config (repo_config_get_*()),
but for simplicity, this commit does not implement similar functions for
protected config.

An alternative that avoids introducing another configset is to continue
to read all config using git_config(), but only accept values that have
the correct config scope [2]. This technically fulfills constraint 2,
because git_config() simply ignores the local and worktree config when
the gitdir is not known. However, this would read incomplete config into
the_repository->config, which would need to be reset when the gitdir is
known and git_config() needs to read the local and worktree config.
Resetting the_repository->config might be reasonable while we only have
these 'protected config only' variables, but it's not clear whether this
extends well to future variables.

[1] In this case, we do have a candidate gitdir though, so with a little
refactoring, it might be possible to provide a gitdir.
[2] This is how `uploadpack.packObjectsHook` was implemented prior to
this commit.

Signed-off-by: Glen Choo <chooglen@google.com>
---
 config.c                     | 51 ++++++++++++++++++++++++++++++++++++
 config.h                     | 17 ++++++++++++
 t/t5544-pack-objects-hook.sh |  7 ++++-
 upload-pack.c                | 27 ++++++++++++-------
 4 files changed, 91 insertions(+), 11 deletions(-)

diff --git a/config.c b/config.c
index fa471dbdb89..56b7ed5ffe8 100644
--- a/config.c
+++ b/config.c
@@ -81,6 +81,18 @@ static enum config_scope current_parsing_scope;
 static int pack_compression_seen;
 static int zlib_compression_seen;
 
+/*
+ * Config that comes from trusted sources, namely:
+ * - system config files (e.g. /etc/gitconfig)
+ * - global config files (e.g. $HOME/.gitconfig,
+ *   $XDG_CONFIG_HOME/git)
+ * - the command line.
+ *
+ * This is declared here for code cleanliness, but unlike the other
+ * static variables, this does not hold config parser state.
+ */
+static struct config_set protected_config;
+
 static int config_file_fgetc(struct config_source *conf)
 {
 	return getc_unlocked(conf->u.file);
@@ -2373,6 +2385,11 @@ int git_configset_add_file(struct config_set *cs, const char *filename)
 	return git_config_from_file(config_set_callback, filename, cs);
 }
 
+int git_configset_add_parameters(struct config_set *cs)
+{
+	return git_config_from_parameters(config_set_callback, cs);
+}
+
 int git_configset_get_value(struct config_set *cs, const char *key, const char **value)
 {
 	const struct string_list *values = NULL;
@@ -2614,6 +2631,40 @@ int repo_config_get_pathname(struct repository *repo,
 	return ret;
 }
 
+/* Read values into protected_config. */
+static void read_protected_config(void)
+{
+	char *xdg_config = NULL, *user_config = NULL, *system_config = NULL;
+
+	git_configset_init(&protected_config);
+
+	system_config = git_system_config();
+	git_global_config(&user_config, &xdg_config);
+
+	git_configset_add_file(&protected_config, system_config);
+	git_configset_add_file(&protected_config, xdg_config);
+	git_configset_add_file(&protected_config, user_config);
+	git_configset_add_parameters(&protected_config);
+
+	free(system_config);
+	free(xdg_config);
+	free(user_config);
+}
+
+/* Ensure that protected_config has been initialized. */
+static void git_protected_config_check_init(void)
+{
+	if (protected_config.hash_initialized)
+		return;
+	read_protected_config();
+}
+
+void git_protected_config(config_fn_t fn, void *data)
+{
+	git_protected_config_check_init();
+	configset_iter(&protected_config, fn, data);
+}
+
 /* Functions used historically to read configuration from 'the_repository' */
 void git_config(config_fn_t fn, void *data)
 {
diff --git a/config.h b/config.h
index 7654f61c634..e3ff1fcf683 100644
--- a/config.h
+++ b/config.h
@@ -446,6 +446,15 @@ void git_configset_init(struct config_set *cs);
  */
 int git_configset_add_file(struct config_set *cs, const char *filename);
 
+/**
+ * Parses command line options and environment variables, and adds the
+ * variable-value pairs to the `config_set`. Returns 0 on success, or -1
+ * if there is an error in parsing. The caller decides whether to free
+ * the incomplete configset or continue using it when the function
+ * returns -1.
+ */
+int git_configset_add_parameters(struct config_set *cs);
+
 /**
  * Finds and returns the value list, sorted in order of increasing priority
  * for the configuration variable `key` and config set `cs`. When the
@@ -505,6 +514,14 @@ int repo_config_get_maybe_bool(struct repository *repo,
 int repo_config_get_pathname(struct repository *repo,
 			     const char *key, const char **dest);
 
+/*
+ * Functions for reading protected config. By definition, protected
+ * config ignores repository config, so it is unnecessary to read
+ * protected config from any `struct repository` other than
+ * the_repository.
+ */
+void git_protected_config(config_fn_t fn, void *data);
+
 /**
  * Querying For Specific Variables
  * -------------------------------
diff --git a/t/t5544-pack-objects-hook.sh b/t/t5544-pack-objects-hook.sh
index dd5f44d986f..54f54f8d2eb 100755
--- a/t/t5544-pack-objects-hook.sh
+++ b/t/t5544-pack-objects-hook.sh
@@ -56,7 +56,12 @@ test_expect_success 'hook does not run from repo config' '
 	! grep "hook running" stderr &&
 	test_path_is_missing .git/hook.args &&
 	test_path_is_missing .git/hook.stdin &&
-	test_path_is_missing .git/hook.stdout
+	test_path_is_missing .git/hook.stdout &&
+
+	# check that global config is used instead
+	test_config_global uploadpack.packObjectsHook ./hook &&
+	git clone --no-local . dst2.git 2>stderr &&
+	grep "hook running" stderr
 '
 
 test_expect_success 'hook works with partial clone' '
diff --git a/upload-pack.c b/upload-pack.c
index 3a851b36066..09f48317b02 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -1321,18 +1321,27 @@ static int upload_pack_config(const char *var, const char *value, void *cb_data)
 		data->advertise_sid = git_config_bool(var, value);
 	}
 
-	if (current_config_scope() != CONFIG_SCOPE_LOCAL &&
-	    current_config_scope() != CONFIG_SCOPE_WORKTREE) {
-		if (!strcmp("uploadpack.packobjectshook", var))
-			return git_config_string(&data->pack_objects_hook, var, value);
-	}
-
 	if (parse_object_filter_config(var, value, data) < 0)
 		return -1;
 
 	return parse_hide_refs_config(var, value, "uploadpack");
 }
 
+static int upload_pack_protected_config(const char *var, const char *value, void *cb_data)
+{
+	struct upload_pack_data *data = cb_data;
+
+	if (!strcmp("uploadpack.packobjectshook", var))
+		return git_config_string(&data->pack_objects_hook, var, value);
+	return 0;
+}
+
+static void get_upload_pack_config(struct upload_pack_data *data)
+{
+	git_config(upload_pack_config, data);
+	git_protected_config(upload_pack_protected_config, data);
+}
+
 void upload_pack(const int advertise_refs, const int stateless_rpc,
 		 const int timeout)
 {
@@ -1340,8 +1349,7 @@ void upload_pack(const int advertise_refs, const int stateless_rpc,
 	struct upload_pack_data data;
 
 	upload_pack_data_init(&data);
-
-	git_config(upload_pack_config, &data);
+	get_upload_pack_config(&data);
 
 	data.stateless_rpc = stateless_rpc;
 	data.timeout = timeout;
@@ -1695,8 +1703,7 @@ int upload_pack_v2(struct repository *r, struct packet_reader *request)
 
 	upload_pack_data_init(&data);
 	data.use_sideband = LARGE_PACKET_MAX;
-
-	git_config(upload_pack_config, &data);
+	get_upload_pack_config(&data);
 
 	while (state != FETCH_DONE) {
 		switch (state) {
-- 
gitgitgadget



* [PATCH v4 4/5] safe.directory: use git_protected_config()
  2022-06-07 20:57     ` [PATCH v4 0/5] config: introduce discovery.bare and protected config Glen Choo via GitGitGadget
                         ` (2 preceding siblings ...)
  2022-06-07 20:57       ` [PATCH v4 3/5] config: read protected config with `git_protected_config()` Glen Choo via GitGitGadget
@ 2022-06-07 20:57       ` Glen Choo via GitGitGadget
  2022-06-07 20:57       ` [PATCH v4 5/5] setup.c: create `discovery.bare` Glen Choo via GitGitGadget
                         ` (3 subsequent siblings)
  7 siblings, 0 replies; 113+ messages in thread
From: Glen Choo via GitGitGadget @ 2022-06-07 20:57 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, brian m. carlson, Derrick Stolee, Junio C Hamano,
	Emily Shaffer, Glen Choo, Glen Choo

From: Glen Choo <chooglen@google.com>

Use git_protected_config() to read `safe.directory` instead of
read_very_early_config(), making it 'protected config only'. As a
result, `safe.directory` now respects "-c", so update the tests and docs
accordingly.

Signed-off-by: Glen Choo <chooglen@google.com>
---
 Documentation/config/safe.txt |  6 +++---
 setup.c                       |  2 +-
 t/t0033-safe-directory.sh     | 24 ++++++++++--------------
 3 files changed, 14 insertions(+), 18 deletions(-)

diff --git a/Documentation/config/safe.txt b/Documentation/config/safe.txt
index ae0e2e3bdb4..2a7d2324250 100644
--- a/Documentation/config/safe.txt
+++ b/Documentation/config/safe.txt
@@ -12,9 +12,9 @@ via `git config --add`. To reset the list of safe directories (e.g. to
 override any such directories specified in the system config), add a
 `safe.directory` entry with an empty value.
 +
-This config setting is only respected when specified in a system or global
-config, not when it is specified in a repository config, via the command
-line option `-c safe.directory=<path>`, or in environment variables.
+This config setting is only respected in protected configuration (see
+<<SCOPES>>). This prevents the untrusted repository from tampering with this
+value.
 +
 The value of this setting is interpolated, i.e. `~/<path>` expands to a
 path relative to the home directory and `%(prefix)/<path>` expands to a
diff --git a/setup.c b/setup.c
index f818dd858c6..847d47f9195 100644
--- a/setup.c
+++ b/setup.c
@@ -1128,7 +1128,7 @@ static int ensure_valid_ownership(const char *path)
 	    is_path_owned_by_current_user(path))
 		return 1;
 
-	read_very_early_config(safe_directory_cb, &data);
+	git_protected_config(safe_directory_cb, &data);
 
 	return data.is_safe;
 }
diff --git a/t/t0033-safe-directory.sh b/t/t0033-safe-directory.sh
index 238b25f91a3..5a1cd0d0947 100755
--- a/t/t0033-safe-directory.sh
+++ b/t/t0033-safe-directory.sh
@@ -16,24 +16,20 @@ test_expect_success 'safe.directory is not set' '
 	expect_rejected_dir
 '
 
-test_expect_success 'ignoring safe.directory on the command line' '
-	test_must_fail git -c safe.directory="$(pwd)" status 2>err &&
-	grep "unsafe repository" err
+test_expect_success 'safe.directory on the command line' '
+	git -c safe.directory="$(pwd)" status
 '
 
-test_expect_success 'ignoring safe.directory in the environment' '
-	test_must_fail env GIT_CONFIG_COUNT=1 \
-		GIT_CONFIG_KEY_0="safe.directory" \
-		GIT_CONFIG_VALUE_0="$(pwd)" \
-		git status 2>err &&
-	grep "unsafe repository" err
+test_expect_success 'safe.directory in the environment' '
+	env GIT_CONFIG_COUNT=1 \
+	    GIT_CONFIG_KEY_0="safe.directory" \
+	    GIT_CONFIG_VALUE_0="$(pwd)" \
+	    git status
 '
 
-test_expect_success 'ignoring safe.directory in GIT_CONFIG_PARAMETERS' '
-	test_must_fail env \
-		GIT_CONFIG_PARAMETERS="${SQ}safe.directory${SQ}=${SQ}$(pwd)${SQ}" \
-		git status 2>err &&
-	grep "unsafe repository" err
+test_expect_success 'safe.directory in GIT_CONFIG_PARAMETERS' '
+	env GIT_CONFIG_PARAMETERS="${SQ}safe.directory${SQ}=${SQ}$(pwd)${SQ}" \
+	    git status
 '
 
 test_expect_success 'ignoring safe.directory in repo config' '
-- 
gitgitgadget



* [PATCH v4 5/5] setup.c: create `discovery.bare`
  2022-06-07 20:57     ` [PATCH v4 0/5] config: introduce discovery.bare and protected config Glen Choo via GitGitGadget
                         ` (3 preceding siblings ...)
  2022-06-07 20:57       ` [PATCH v4 4/5] safe.directory: use git_protected_config() Glen Choo via GitGitGadget
@ 2022-06-07 20:57       ` Glen Choo via GitGitGadget
  2022-06-07 21:37         ` Glen Choo
  2022-06-22 22:03       ` [PATCH v4 0/5] config: introduce discovery.bare and protected config Jonathan Tan
                         ` (2 subsequent siblings)
  7 siblings, 1 reply; 113+ messages in thread
From: Glen Choo via GitGitGadget @ 2022-06-07 20:57 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, brian m. carlson, Derrick Stolee, Junio C Hamano,
	Emily Shaffer, Glen Choo, Glen Choo

From: Glen Choo <chooglen@google.com>

There is a known social engineering attack that takes advantage of the
fact that a working tree can include an entire bare repository,
including a config file. A user could run a Git command inside the bare
repository thinking that the config file of the 'outer' repository would
be used, but in reality, the bare repository's config file (which is
attacker-controlled) is used, which may result in arbitrary code
execution. See [1] for a fuller description and deeper discussion.

A simple mitigation is to forbid bare repositories unless specified via
`--git-dir` or `GIT_DIR`. In environments that don't use bare
repositories, this would be minimally disruptive.

Create a config variable, `discovery.bare`, that tells Git whether or
not to die() when it discovers a bare repository. This only affects
repository discovery, thus it has no effect if discovery was not
done (e.g. `--git-dir` was passed).

This config is an enum of:

- "always": always allow bare repositories (this is the default)
- "never": never allow bare repositories

If we want to protect users from such attacks by default, neither value
will suffice - "always" provides no protection, but "never" is
impractical for bare repository users. A more usable default would be to
allow only non-embedded bare repositories ([2] contains one such
proposal), but detecting if a repository is embedded is potentially
non-trivial, so this work is not implemented in this series.

[1]: https://lore.kernel.org/git/kl6lsfqpygsj.fsf@chooglen-macbookpro.roam.corp.google.com
[2]: https://lore.kernel.org/git/5b969c5e-e802-c447-ad25-6acc0b784582@github.com

Signed-off-by: Glen Choo <chooglen@google.com>
---
 Documentation/config.txt           |  2 +
 Documentation/config/discovery.txt | 19 +++++++++
 setup.c                            | 57 ++++++++++++++++++++++++-
 t/t0035-discovery-bare.sh          | 68 ++++++++++++++++++++++++++++++
 4 files changed, 145 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/config/discovery.txt
 create mode 100755 t/t0035-discovery-bare.sh

diff --git a/Documentation/config.txt b/Documentation/config.txt
index e284b042f22..9a5e1329772 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -409,6 +409,8 @@ include::config/diff.txt[]
 
 include::config/difftool.txt[]
 
+include::config/discovery.txt[]
+
 include::config/extensions.txt[]
 
 include::config/fastimport.txt[]
diff --git a/Documentation/config/discovery.txt b/Documentation/config/discovery.txt
new file mode 100644
index 00000000000..fbe93597e7c
--- /dev/null
+++ b/Documentation/config/discovery.txt
@@ -0,0 +1,19 @@
+discovery.bare::
+	'(Protected config only)' Specifies whether Git will work with a
+	bare repository that it found during repository discovery. This
+	has no effect if the repository is specified directly via the
+	--git-dir command-line option or the GIT_DIR environment
+	variable (see linkgit:git[1]).
++
+The currently supported values are:
++
+* `always`: Git always works with bare repositories
+* `never`: Git never works with bare repositories
++
+This defaults to `always`, but this default may change in the future.
++
+If you do not use bare repositories in your workflow, then it may be
+beneficial to set `discovery.bare` to `never` in your global config.
+This will protect you from attacks that involve cloning a repository
+that contains a bare repository and running a Git command within that
+directory.
diff --git a/setup.c b/setup.c
index 847d47f9195..4d8d3c1bc7d 100644
--- a/setup.c
+++ b/setup.c
@@ -10,6 +10,10 @@
 static int inside_git_dir = -1;
 static int inside_work_tree = -1;
 static int work_tree_config_is_bogus;
+enum discovery_bare_allowed {
+	DISCOVERY_BARE_NEVER = 0,
+	DISCOVERY_BARE_ALWAYS,
+};
 
 static struct startup_info the_startup_info;
 struct startup_info *startup_info = &the_startup_info;
@@ -1133,6 +1137,46 @@ static int ensure_valid_ownership(const char *path)
 	return data.is_safe;
 }
 
+static int discovery_bare_cb(const char *key, const char *value, void *d)
+{
+	enum discovery_bare_allowed *discovery_bare_allowed = d;
+
+	if (strcmp(key, "discovery.bare"))
+		return 0;
+
+	if (!strcmp(value, "never")) {
+		*discovery_bare_allowed = DISCOVERY_BARE_NEVER;
+		return 0;
+	}
+	if (!strcmp(value, "always")) {
+		*discovery_bare_allowed = DISCOVERY_BARE_ALWAYS;
+		return 0;
+	}
+	return -1;
+}
+
+static enum discovery_bare_allowed get_discovery_bare(void)
+{
+	enum discovery_bare_allowed result = DISCOVERY_BARE_ALWAYS;
+	git_protected_config(discovery_bare_cb, &result);
+	return result;
+}
+
+static const char *discovery_bare_allowed_to_string(
+	enum discovery_bare_allowed discovery_bare_allowed)
+{
+	switch (discovery_bare_allowed) {
+	case DISCOVERY_BARE_NEVER:
+		return "never";
+	case DISCOVERY_BARE_ALWAYS:
+		return "always";
+	default:
+		BUG("invalid discovery_bare_allowed %d",
+		    discovery_bare_allowed);
+	}
+	return NULL;
+}
+
 enum discovery_result {
 	GIT_DIR_NONE = 0,
 	GIT_DIR_EXPLICIT,
@@ -1142,7 +1186,8 @@ enum discovery_result {
 	GIT_DIR_HIT_CEILING = -1,
 	GIT_DIR_HIT_MOUNT_POINT = -2,
 	GIT_DIR_INVALID_GITFILE = -3,
-	GIT_DIR_INVALID_OWNERSHIP = -4
+	GIT_DIR_INVALID_OWNERSHIP = -4,
+	GIT_DIR_DISALLOWED_BARE = -5,
 };
 
 /*
@@ -1239,6 +1284,8 @@ static enum discovery_result setup_git_directory_gently_1(struct strbuf *dir,
 		}
 
 		if (is_git_directory(dir->buf)) {
+			if (!get_discovery_bare())
+				return GIT_DIR_DISALLOWED_BARE;
 			if (!ensure_valid_ownership(dir->buf))
 				return GIT_DIR_INVALID_OWNERSHIP;
 			strbuf_addstr(gitdir, ".");
@@ -1385,6 +1432,14 @@ const char *setup_git_directory_gently(int *nongit_ok)
 		}
 		*nongit_ok = 1;
 		break;
+	case GIT_DIR_DISALLOWED_BARE:
+		if (!nongit_ok) {
+			die(_("cannot use bare repository '%s' (discovery.bare is '%s')"),
+			    dir.buf,
+			    discovery_bare_allowed_to_string(get_discovery_bare()));
+		}
+		*nongit_ok = 1;
+		break;
 	case GIT_DIR_NONE:
 		/*
 		 * As a safeguard against setup_git_directory_gently_1 returning
diff --git a/t/t0035-discovery-bare.sh b/t/t0035-discovery-bare.sh
new file mode 100755
index 00000000000..0b345d361e6
--- /dev/null
+++ b/t/t0035-discovery-bare.sh
@@ -0,0 +1,68 @@
+#!/bin/sh
+
+test_description='verify discovery.bare checks'
+
+. ./test-lib.sh
+
+pwd="$(pwd)"
+
+expect_accepted () {
+	git "$@" rev-parse --git-dir
+}
+
+expect_rejected () {
+	test_must_fail git "$@" rev-parse --git-dir 2>err &&
+	grep "discovery.bare" err
+}
+
+test_expect_success 'setup bare repo in worktree' '
+	git init outer-repo &&
+	git init --bare outer-repo/bare-repo
+'
+
+test_expect_success 'discovery.bare unset' '
+	(
+		cd outer-repo/bare-repo &&
+		expect_accepted
+	)
+'
+
+test_expect_success 'discovery.bare=always' '
+	git config --global discovery.bare always &&
+	(
+		cd outer-repo/bare-repo &&
+		expect_accepted
+	)
+'
+
+test_expect_success 'discovery.bare=never' '
+	git config --global discovery.bare never &&
+	(
+		cd outer-repo/bare-repo &&
+		expect_rejected
+	)
+'
+
+test_expect_success 'discovery.bare in the repository' '
+	(
+		cd outer-repo/bare-repo &&
+		# Temporarily set discovery.bare=always, otherwise git
+		# config fails with "fatal: not in a git directory"
+		# (like safe.directory)
+		git config --global discovery.bare always &&
+		git config discovery.bare always &&
+		git config --global discovery.bare never &&
+		expect_rejected
+	)
+'
+
+test_expect_success 'discovery.bare on the command line' '
+	git config --global discovery.bare never &&
+	(
+		cd outer-repo/bare-repo &&
+		expect_accepted -c discovery.bare=always &&
+		expect_rejected -c discovery.bare=
+	)
+'
+
+test_done
-- 
gitgitgadget


* Re: [PATCH v4 5/5] setup.c: create `discovery.bare`
  2022-06-07 20:57       ` [PATCH v4 5/5] setup.c: create `discovery.bare` Glen Choo via GitGitGadget
@ 2022-06-07 21:37         ` Glen Choo
  0 siblings, 0 replies; 113+ messages in thread
From: Glen Choo @ 2022-06-07 21:37 UTC (permalink / raw)
  To: Glen Choo via GitGitGadget, git
  Cc: Taylor Blau, brian m. carlson, Derrick Stolee, Junio C Hamano,
	Emily Shaffer

"Glen Choo via GitGitGadget" <gitgitgadget@gmail.com> writes:

> diff --git a/Documentation/config/discovery.txt b/Documentation/config/discovery.txt
> new file mode 100644
> index 00000000000..fbe93597e7c
> --- /dev/null
> +++ b/Documentation/config/discovery.txt
> @@ -0,0 +1,19 @@
> +discovery.bare::
> +	'(Protected config only)' Specifies whether Git will work with a
> +	bare repository that it found during repository discovery. This
> +	has no effect if the repository is specified directly via the
> +	--git-dir command-line option or the GIT_DIR environment
> +	variable (see linkgit:git[1]).

Ugh, I forgot to update the docs for `discovery.bare`. This should be
reworded to be consistent with `safe.directory` and
`uploadpack.packObjectsHook`, e.g.

   discovery.bare::
   	Specifies whether Git will work with a bare repository that it found
   	during repository discovery. This has no effect if the repository is
   	specified directly via the --git-dir command-line option or the
   	GIT_DIR environment variable (see linkgit:git[1]).

    This config setting is only respected in protected configuration
    (see <<SCOPES>>). This prevents the untrusted repository from
    tampering with this value.


* Re: [PATCH v4 3/5] config: read protected config with `git_protected_config()`
  2022-06-07 20:57       ` [PATCH v4 3/5] config: read protected config with `git_protected_config()` Glen Choo via GitGitGadget
@ 2022-06-07 22:49         ` Junio C Hamano
  2022-06-08  0:22           ` Glen Choo
  0 siblings, 1 reply; 113+ messages in thread
From: Junio C Hamano @ 2022-06-07 22:49 UTC (permalink / raw)
  To: Glen Choo via GitGitGadget
  Cc: git, Taylor Blau, brian m. carlson, Derrick Stolee,
	Emily Shaffer, Glen Choo

"Glen Choo via GitGitGadget" <gitgitgadget@gmail.com> writes:

> diff --git a/upload-pack.c b/upload-pack.c
> index 3a851b36066..09f48317b02 100644
> --- a/upload-pack.c
> +++ b/upload-pack.c
> @@ -1321,18 +1321,27 @@ static int upload_pack_config(const char *var, const char *value, void *cb_data)
>  		data->advertise_sid = git_config_bool(var, value);
>  	}
>  
> -	if (current_config_scope() != CONFIG_SCOPE_LOCAL &&
> -	    current_config_scope() != CONFIG_SCOPE_WORKTREE) {
> -		if (!strcmp("uploadpack.packobjectshook", var))
> -			return git_config_string(&data->pack_objects_hook, var, value);
> -	}
> -

The lossage of this block is because this general git_config()
callback routine that is used to read from any scope is no longer
used to pick up the sensitive variable.  Instead, we need to get it
with a different API, namely, git_protected_config().

It is probably good that in the new code we are not encouraging
folks to write random comparisons on current_config_scope(), and
instead uniformly use git_protected_config().  That may promote
consistency.

An obvious alternative to achieve the same consistency would be to
introduce a helper, and rewrite (instead of removing) the above part
like so:

	if (in_protected_scope()) {
		... parse sensitive variable ...
	}

We would not need any other change to this file in this patch if we
go that route, I suspect.
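
As a rough illustration of the helper suggested above, here is a
minimal standalone sketch. It is not Git's code: the enum below only
mirrors the scope names that appear in this thread, and
scope_is_protected() is a hypothetical name that takes the scope as an
argument (instead of calling current_config_scope()) so that the
example compiles on its own.

```c
#include <assert.h>

/* Stand-in for Git's enum config_scope; the numeric values are illustrative. */
enum config_scope {
	CONFIG_SCOPE_UNKNOWN = 0,
	CONFIG_SCOPE_SYSTEM,
	CONFIG_SCOPE_GLOBAL,
	CONFIG_SCOPE_LOCAL,
	CONFIG_SCOPE_WORKTREE,
	CONFIG_SCOPE_COMMAND,
};

/*
 * Hypothetical helper: returns 1 only for the scopes this series calls
 * "protected" (system, global, command) and 0 for everything else.
 */
int scope_is_protected(enum config_scope scope)
{
	/* Catch values outside the enum modeled above. */
	assert(scope <= CONFIG_SCOPE_COMMAND);

	switch (scope) {
	case CONFIG_SCOPE_SYSTEM:
	case CONFIG_SCOPE_GLOBAL:
	case CONFIG_SCOPE_COMMAND:
		return 1;
	default:
		return 0;
	}
}
```

Note that an allow-list like this is stricter than the removed block,
which rejected only the local and worktree scopes and therefore
accepted every other scope.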

>  	if (parse_object_filter_config(var, value, data) < 0)
>  		return -1;
>  
>  	return parse_hide_refs_config(var, value, "uploadpack");
>  }
>  
> +static int upload_pack_protected_config(const char *var, const char *value, void *cb_data)
> +{
> +	struct upload_pack_data *data = cb_data;
> +
> +	if (!strcmp("uploadpack.packobjectshook", var))
> +		return git_config_string(&data->pack_objects_hook, var, value);
> +	return 0;
> +}
> +
> +static void get_upload_pack_config(struct upload_pack_data *data)
> +{
> +	git_config(upload_pack_config, data);
> +	git_protected_config(upload_pack_protected_config, data);
> +}

Where we used to just do git_config(upload_pack_config), we now need
to do a separate git_protected_config().  It feels a bit wasteful to
iterate over the same configset twice, but it is not like we are
doing the IO and text file parsing multiple times.  This looks quite
straight-forward.
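
The check-init-then-iterate pattern that git_protected_config() follows
in this patch can be modeled in isolation. The sketch below is a toy,
not Git's configset: struct cached_set, its parse_count field, the
hard-coded entries and all function names are assumptions made for
illustration. It only shows that the expensive load runs once per set,
however many times the set is iterated afterwards.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*config_fn)(const char *key, const char *value, void *data);

struct kv {
	const char *key;
	const char *value;
};

/* Pretend contents of the protected config files. */
static const struct kv fake_protected_files[] = {
	{ "uploadpack.packobjectshook", "./hook" },
	{ "safe.directory", "/srv/repo" },
};

struct cached_set {
	int initialized;	/* plays the role of hash_initialized */
	int parse_count;	/* how often the expensive load ran */
	const struct kv *items;
	size_t nr;
};

/* Load the entries on first use only; later calls return immediately. */
void cached_set_check_init(struct cached_set *cs)
{
	assert(cs);
	if (cs->initialized)
		return;
	cs->items = fake_protected_files;
	cs->nr = sizeof(fake_protected_files) / sizeof(fake_protected_files[0]);
	cs->parse_count++;
	cs->initialized = 1;
}

/* Call fn for every cached entry; stop early if fn reports an error. */
int cached_set_iter(struct cached_set *cs, config_fn fn, void *data)
{
	size_t i;

	cached_set_check_init(cs);
	for (i = 0; i < cs->nr; i++)
		if (fn(cs->items[i].key, cs->items[i].value, data) < 0)
			return -1;
	return 0;
}

/* Example callback: count the entries it is shown. */
int count_entries(const char *key, const char *value, void *data)
{
	(void)key;
	(void)value;
	++*(int *)data;
	return 0;
}
```

Iterating a zero-initialized struct cached_set twice with
count_entries() visits every entry twice while parse_count stays at 1.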


* Re: [PATCH v4 3/5] config: read protected config with `git_protected_config()`
  2022-06-07 22:49         ` Junio C Hamano
@ 2022-06-08  0:22           ` Glen Choo
  0 siblings, 0 replies; 113+ messages in thread
From: Glen Choo @ 2022-06-08  0:22 UTC (permalink / raw)
  To: Junio C Hamano, Glen Choo via GitGitGadget
  Cc: git, Taylor Blau, brian m. carlson, Derrick Stolee, Emily Shaffer

Junio C Hamano <gitster@pobox.com> writes:

> "Glen Choo via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
>> diff --git a/upload-pack.c b/upload-pack.c
>> index 3a851b36066..09f48317b02 100644
>> --- a/upload-pack.c
>> +++ b/upload-pack.c
>> @@ -1321,18 +1321,27 @@ static int upload_pack_config(const char *var, const char *value, void *cb_data)
>>  		data->advertise_sid = git_config_bool(var, value);
>>  	}
>>  
>> -	if (current_config_scope() != CONFIG_SCOPE_LOCAL &&
>> -	    current_config_scope() != CONFIG_SCOPE_WORKTREE) {
>> -		if (!strcmp("uploadpack.packobjectshook", var))
>> -			return git_config_string(&data->pack_objects_hook, var, value);
>> -	}
>> -
>
> The lossage of this block is because this general git_config()
> callback routine that is used to read from any scope is no longer
> used to pick up the sensitive variable.  Instead, we need to get it
> with a different API, namely, git_protected_config().
>
> It is probably good that in the new code we are not encouraging
> folks to write random comparisons on current_config_scope(), and
> instead uniformly use git_protected_config().  That may promote
> consistency.
>
> An obvious alternative to achieve the same consistency would be to
> introduce a helper, and rewrite (instead of removing) the above part
> like so:
>
> 	if (in_protected_scope()) {
> 		... parse sensitive variable ...
> 	}
>
> We would not need any other change to this file in this patch if we
> go that route, I suspect.

Yes, and as noted in the commit message, this approach seems to work for
`safe.directory` and `discovery.bare` too.

>>  	if (parse_object_filter_config(var, value, data) < 0)
>>  		return -1;
>>  
>>  	return parse_hide_refs_config(var, value, "uploadpack");
>>  }
>>  
>> +static int upload_pack_protected_config(const char *var, const char *value, void *cb_data)
>> +{
>> +	struct upload_pack_data *data = cb_data;
>> +
>> +	if (!strcmp("uploadpack.packobjectshook", var))
>> +		return git_config_string(&data->pack_objects_hook, var, value);
>> +	return 0;
>> +}
>> +
>> +static void get_upload_pack_config(struct upload_pack_data *data)
>> +{
>> +	git_config(upload_pack_config, data);
>> +	git_protected_config(upload_pack_protected_config, data);
>> +}
>
> Where we used to just do git_config(upload_pack_config), we now need
> to do a separate git_protected_config().  It feels a bit wasteful to
> iterate over the same configset twice, but it is not like we are
> doing the IO and text file parsing multiple times.  This looks quite
> straight-forward.

Yeah it's not optimal, but at the very least, I think it's easy enough
to understand that we could replace it with something more economical in
the future.


* Re: [PATCH v4 2/5] Documentation: define protected configuration
  2022-06-07 20:57       ` [PATCH v4 2/5] Documentation: define protected configuration Glen Choo via GitGitGadget
@ 2022-06-22 21:58         ` Jonathan Tan
  2022-06-23 18:21           ` Glen Choo
  0 siblings, 1 reply; 113+ messages in thread
From: Jonathan Tan @ 2022-06-22 21:58 UTC (permalink / raw)
  To: Glen Choo via GitGitGadget
  Cc: Jonathan Tan, git, Taylor Blau, brian m. carlson, Derrick Stolee,
	Junio C Hamano, Emily Shaffer, Glen Choo

"Glen Choo via GitGitGadget" <gitgitgadget@gmail.com> writes:
> From: Glen Choo <chooglen@google.com>
> 
> For security reasons, there are config variables that are only trusted
> when they are specified in extra-trustworthy configuration scopes, which

Probably better to delete "extra-trustworthy", or at least "extra-" -
it's better to explain why and how they're trustworthy, which you have
already done in the commit message.

> diff --git a/Documentation/git-config.txt b/Documentation/git-config.txt
> index 5e4c95f2423..2b4334faec9 100644
> --- a/Documentation/git-config.txt
> +++ b/Documentation/git-config.txt

[snip]

> +Protected config refers to the 'system', 'global', and 'command' scopes. Git
> +considers these scopes to be especially trustworthy because they are likely
> +to be controlled by the user or a trusted administrator. An attacker who
> +controls these scopes can do substantial harm without using Git, so it is
> +assumed that the user's environment protects these scopes against attackers.
> +
> +For security reasons, certain options are only respected when they are
> +specified in protected config, and ignored otherwise.

Also "especially trustworthy" here.


* Re: [PATCH v4 0/5] config: introduce discovery.bare and protected config
  2022-06-07 20:57     ` [PATCH v4 0/5] config: introduce discovery.bare and protected config Glen Choo via GitGitGadget
                         ` (4 preceding siblings ...)
  2022-06-07 20:57       ` [PATCH v4 5/5] setup.c: create `discovery.bare` Glen Choo via GitGitGadget
@ 2022-06-22 22:03       ` Jonathan Tan
  2022-06-23 17:13         ` Glen Choo
  2022-06-27 18:19       ` Glen Choo
  2022-06-27 18:36       ` [PATCH v5 " Glen Choo via GitGitGadget
  7 siblings, 1 reply; 113+ messages in thread
From: Jonathan Tan @ 2022-06-22 22:03 UTC (permalink / raw)
  To: Glen Choo via GitGitGadget
  Cc: Jonathan Tan, git, Taylor Blau, brian m. carlson, Derrick Stolee,
	Junio C Hamano, Emily Shaffer, Glen Choo

"Glen Choo via GitGitGadget" <gitgitgadget@gmail.com> writes:
> Glen Choo (5):
>   Documentation/git-config.txt: add SCOPES section
>   Documentation: define protected configuration

Forgot to mention when I was sending my comments on patch 2: we should
standardize on "protected config" and not use "protected configuration"
anywhere.

>   config: read protected config with `git_protected_config()`
>   safe.directory: use git_protected_config()
>   setup.c: create `discovery.bare`

Thanks - I think this is a nice feature to have. Everything looks good
except for some minor comments on text in patch 2, which I have sent.

One alternative design would have been to have separate configsets for
protected config and non-protected config (or even better, separate
configsets for trace2 config, protected config minus trace2 config, and
non-protected config) but that doesn't have to block the submission of
this patch set.


* Re: [PATCH v4 0/5] config: introduce discovery.bare and protected config
  2022-06-22 22:03       ` [PATCH v4 0/5] config: introduce discovery.bare and protected config Jonathan Tan
@ 2022-06-23 17:13         ` Glen Choo
  2022-06-23 18:32           ` Junio C Hamano
  0 siblings, 1 reply; 113+ messages in thread
From: Glen Choo @ 2022-06-23 17:13 UTC (permalink / raw)
  To: Jonathan Tan, Glen Choo via GitGitGadget
  Cc: Jonathan Tan, git, Taylor Blau, brian m. carlson, Derrick Stolee,
	Junio C Hamano, Emily Shaffer

Jonathan Tan <jonathantanmy@google.com> writes:

> "Glen Choo via GitGitGadget" <gitgitgadget@gmail.com> writes:
>> Glen Choo (5):
>>   Documentation/git-config.txt: add SCOPES section
>>   Documentation: define protected configuration
>
> Forgot to mention when I was sending my comments on patch 2: we should
> standardize on "protected config" and not use "protected configuration"
> anywhere.

Makes sense.

> One alternative design would have been to have separate configsets for
> protected config and non-protected config (or even better, separate
> configsets for trace2 config, protected config minus trace2 config, and
> non-protected config) but that doesn't have to block the submission of
> this patch set.

I suppose that the idea behind this is that we only parse and store each
config file exactly once. It's a good goal, but the whole point of the
configset is that we can query a single struct to figure out the value
of a config variable. Having multiple configsets starts to shift more of
the burden to the callers because they now have to query multiple
configsets to find their desired config value, and we already start to
see some of this unpleasantness in this series.

An alternative that I'd been thinking about is to make a few changes to
the git_config_* + configset API to allow us to use a single configset
for all of our needs:

1. Keep track of what config we've read when reading into
   the_repository->config, i.e. instead of a boolean "all config has
   been [un]read", we can express "system and global config has been
   read, but not local or command config". Then, use this information to
   load config from sources as they become available. This will allow us
   to read incomplete config for trace2 and setup.c (discovery.bare and
   safe.directory), and only read what we need later on.

   This assumes that when Git reads config, that config is always valid
   later on. So this is broken if, e.g. we read global config file A
   during setup, but when we discover the repo, we discard A and read
   global config file B instead. I don't know if we do this or if we are
   planning to in the future.

2. Add an additional argument that specifies what scopes to respect when
   reading config (maybe as a set of flags). This gives us extra
   specificity when using the git_config*() functions, so we could get
   rid of git_protected_config() like so:

    /* Change enum config_scope into flags first... */

    #define WIP_SCOPES_PROTECTED (CONFIG_SCOPE_SYSTEM | \
      CONFIG_SCOPE_GLOBAL | CONFIG_SCOPE_COMMAND)

    static enum discovery_bare_allowed get_discovery_bare(void)
    {
      enum discovery_bare_allowed result = DISCOVERY_BARE_ALWAYS;
      git_config(discovery_bare_cb, &result, WIP_SCOPES_PROTECTED);
      return result;
    }

   And as an added bonus, this gives us an easy way to implement the
   constant time git_config_*() functions for protected config. We could
   even do this without doing 1. first. I haven't looked into whether
   we could turn the enum into flags, but otherwise, I think this is
   pretty feasible.
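
As a rough, self-contained sketch of idea 2 (not git's actual API): the
flagcfg_* names below are hypothetical, and the entry array stands in
for a configset. It assumes the scopes have already been turned into
bit flags, so that a set of scopes is a bitwise OR of its members.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flag-style scopes; not git's enum config_scope. */
enum flagcfg_scope {
	FLAGCFG_SCOPE_SYSTEM   = 1 << 0,
	FLAGCFG_SCOPE_GLOBAL   = 1 << 1,
	FLAGCFG_SCOPE_LOCAL    = 1 << 2,
	FLAGCFG_SCOPE_WORKTREE = 1 << 3,
	FLAGCFG_SCOPE_COMMAND  = 1 << 4
};

/* A set of scopes is the bitwise OR of its members. */
#define FLAGCFG_SCOPES_PROTECTED \
	(FLAGCFG_SCOPE_SYSTEM | FLAGCFG_SCOPE_GLOBAL | FLAGCFG_SCOPE_COMMAND)

struct flagcfg_entry {
	const char *key;
	const char *value;
	enum flagcfg_scope scope;
};

typedef int (*flagcfg_cb)(const char *key, const char *value, void *data);

/* Call cb for each entry whose scope is in `scopes`, in stored order. */
static int flagcfg_config(const struct flagcfg_entry *entries, size_t nr,
			  unsigned scopes, flagcfg_cb cb, void *data)
{
	size_t i;

	assert(cb);
	for (i = 0; i < nr; i++) {
		if (!(entries[i].scope & scopes))
			continue;
		if (cb(entries[i].key, entries[i].value, data) < 0)
			return -1;
	}
	return 0;
}

/* Callback: remember the last value seen for discovery.bare. */
static int flagcfg_discovery_bare_cb(const char *key, const char *value,
				     void *data)
{
	if (!strcmp(key, "discovery.bare"))
		*(const char **)data = value;
	return 0;
}
```

With FLAGCFG_SCOPES_PROTECTED as the mask, a discovery.bare value that
only appears in a local-scope entry is never passed to the callback.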

* Re: [PATCH v4 2/5] Documentation: define protected configuration
  2022-06-22 21:58         ` Jonathan Tan
@ 2022-06-23 18:21           ` Glen Choo
  0 siblings, 0 replies; 113+ messages in thread
From: Glen Choo @ 2022-06-23 18:21 UTC (permalink / raw)
  To: Jonathan Tan, Glen Choo via GitGitGadget
  Cc: Jonathan Tan, git, Taylor Blau, brian m. carlson, Derrick Stolee,
	Junio C Hamano, Emily Shaffer

Jonathan Tan <jonathantanmy@google.com> writes:

> "Glen Choo via GitGitGadget" <gitgitgadget@gmail.com> writes:
>> From: Glen Choo <chooglen@google.com>
>> 
>> For security reasons, there are config variables that are only trusted
>> when they are specified in extra-trustworthy configuration scopes, which
>
> Probably better to delete "extra-trustworthy", or at least "extra-" -
> it's better to explain why and how they're trustworthy, which you have
> already done in the commit message.

Hm, do you find it superfluous, misleading or something else entirely?

The use of "extra-" was quite intentional. I'm afraid that if we
describe protected config as "trustworthy", we insinuate that
local/worktree config is "untrustworthy" (but of course this isn't
always true; Git usually uses repo config).

>> diff --git a/Documentation/git-config.txt b/Documentation/git-config.txt
>> index 5e4c95f2423..2b4334faec9 100644
>> --- a/Documentation/git-config.txt
>> +++ b/Documentation/git-config.txt
>
> [snip]
>
>> +Protected config refers to the 'system', 'global', and 'command' scopes. Git
>> +considers these scopes to be especially trustworthy because they are likely
>> +to be controlled by the user or a trusted administrator. An attacker who
>> +controls these scopes can do substantial harm without using Git, so it is
>> +assumed that the user's environment protects these scopes against attackers.
>> +
>> +For security reasons, certain options are only respected when they are
>> +specified in protected config, and ignored otherwise.
>
> Also "especially trustworthy" here.


* Re: [PATCH v4 0/5] config: introduce discovery.bare and protected config
  2022-06-23 17:13         ` Glen Choo
@ 2022-06-23 18:32           ` Junio C Hamano
  2022-06-27 17:34             ` Glen Choo
  0 siblings, 1 reply; 113+ messages in thread
From: Junio C Hamano @ 2022-06-23 18:32 UTC (permalink / raw)
  To: Glen Choo
  Cc: Jonathan Tan, Glen Choo via GitGitGadget, git, Taylor Blau,
	brian m. carlson, Derrick Stolee, Emily Shaffer

Glen Choo <chooglen@google.com> writes:

> Jonathan Tan <jonathantanmy@google.com> writes:
>
>> "Glen Choo via GitGitGadget" <gitgitgadget@gmail.com> writes:
>>> Glen Choo (5):
>>>   Documentation/git-config.txt: add SCOPES section
>>>   Documentation: define protected configuration
>>
>> Forgot to mention when I was sending my comments on patch 2: we should
>> standardize on "protected config" and not use "protected configuration"
>> anywhere.
>
> Makes sense.

Using a single word consistently does make sense, but why favor a
non-word over a proper word ;-)?

> I suppose that the idea behind this is that we only parse and store each
> config file exactly once. It's a good goal, but the whole point of the
> configset is that we can query a single struct to figure out the value
> of a config variable. Having multiple configsets starts to shift more of
> the burden to the callers because they now have to query multiple
> configsets to find their desired config value, and we already start to
> see some of this unpleasantness in this series.

Yes, I was worried about this, too.  "parse and store exactly once"
may merely be a performance thing, but it still matters, even though
it is not worse than making duplicate callbacks to overwrite globals
that have been already set earlier, which will affect correctness ;-)

> An alternative that I'd been thinking about is to make a few changes to
> the git_config_* + configset API to allow us to use a single configset
> for all of our needs:
>
> 1. Keep track of what config we've read when reading into
>    the_repository->config, i.e. instead of a boolean "all config has
>    been [un]read", we can express "system and global config has been
>    read, but not local or command config". Then, use this information to
>    load config from sources as they become available. This will allow us
>    to read incomplete config for trace2 and setup.c (discovery.bare and
>    safe.directory), and only read what we need later on.

That is not a bad direction to go, but are we sure that we always
read in the right order (and there is one single right order) and
stop at the right step?

config.c::do_git_config_sequence() reads the system and then the
global before the local, the worktree, and the command line.  We
would allow the values of "protected" configuration variables to be
inspected by stopping after the first two and inspecting the result
before the local and the rest overrides them, but will we need
*only* that kind of partial configuration reading that stops exactly
there?  Even with the proposed "protected" scheme, I thought we plan
to honor the command line ones, so we may need to read
system+global+command without reading anything else to grab the
values only from the protected sources (ah, I like the application
of the adjective "protected" to the source, not variables, because
that is what we are really talking about---alternatively we could
call it "safe").  But if we later read local and worktree ones
lazily, unless we _insert_ them before what we read from the command
line, we'll break the last-one-wins property, so we need to be
careful.  I guess each configuration value in the configset knows
where it came from, so it probably is possible to insert the ones
you read lazily later in the right spot.
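
To make the ordering concern concrete, here is a minimal sketch under
stated assumptions: the ordcfg_* names are hypothetical, the set is a
small fixed array rather than git's configset, and the scopes are
numbered in the order system, global, local, worktree, command. Each
entry remembers its scope, a late insert lands before any entry from a
higher-numbered scope, and a lookup takes the last matching entry.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical scopes, numbered in reading order. */
enum ordcfg_scope {
	ORDCFG_SYSTEM, ORDCFG_GLOBAL, ORDCFG_LOCAL, ORDCFG_WORKTREE,
	ORDCFG_COMMAND
};

struct ordcfg_entry {
	const char *key;
	const char *value;
	enum ordcfg_scope scope;
};

#define ORDCFG_MAX 16

struct ordcfg_set {
	struct ordcfg_entry items[ORDCFG_MAX];
	size_t nr;
};

/*
 * Insert after every existing entry whose scope is <= the new entry's
 * scope, so entries stay sorted by scope and keep their relative order
 * within a scope.
 */
static void ordcfg_insert(struct ordcfg_set *set, const char *key,
			  const char *value, enum ordcfg_scope scope)
{
	size_t pos = set->nr;

	assert(set->nr < ORDCFG_MAX);
	while (pos > 0 && set->items[pos - 1].scope > scope)
		pos--;
	memmove(&set->items[pos + 1], &set->items[pos],
		(set->nr - pos) * sizeof(set->items[0]));
	set->items[pos].key = key;
	set->items[pos].value = value;
	set->items[pos].scope = scope;
	set->nr++;
}

/* Last-one-wins: return the value of the final matching entry, or NULL. */
static const char *ordcfg_get(const struct ordcfg_set *set, const char *key)
{
	size_t i = set->nr;

	while (i-- > 0)
		if (!strcmp(set->items[i].key, key))
			return set->items[i].value;
	return NULL;
}
```

If system and command entries are inserted first and a local entry is
added afterwards, the local entry ends up between them, so the command
line value still wins on lookup.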

> 2. Add an additional argument that specifies what scopes to respect when
>    reading config (maybe as a set of flags). This gives us extra
>    specificity when using the git_config*() functions, so we could get
>    rid of git_protected_config() like so:
>
>     /* Change enum config_scope into flags first... */
>
>     #define WIP_SCOPES_PROTECTED (CONFIG_SCOPE_SYSTEM | \
>       CONFIG_SCOPE_GLOBAL | CONFIG_SCOPE_COMMAND)
>
>     static enum discovery_bare_allowed get_discovery_bare(void)
>     {
>       enum discovery_bare_allowed result = DISCOVERY_BARE_ALWAYS;
>       git_config(discovery_bare_cb, &result, WIP_SCOPES_PROTECTED);
>       return result;
>     }

Alternatively, we could make the callback aware of the scope for
each var-value it is called and have it filter, but that would be a
bigger surgery.

I think a new iterator git_config_in_scope(), instead of updating
git_config(), would make sense.  By definition, all existing
git_config() callers do not need the scope specifiers, and
"protected" may be the first one but will not be the last one that
needs to read from particular scopes.

* Re: [PATCH v4 0/5] config: introduce discovery.bare and protected config
  2022-06-23 18:32           ` Junio C Hamano
@ 2022-06-27 17:34             ` Glen Choo
  0 siblings, 0 replies; 113+ messages in thread
From: Glen Choo @ 2022-06-27 17:34 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Jonathan Tan, Glen Choo via GitGitGadget, git, Taylor Blau,
	brian m. carlson, Derrick Stolee, Emily Shaffer

Junio C Hamano <gitster@pobox.com> writes:

> Glen Choo <chooglen@google.com> writes:
>
>> Jonathan Tan <jonathantanmy@google.com> writes:
>>
>>> "Glen Choo via GitGitGadget" <gitgitgadget@gmail.com> writes:
>>>> Glen Choo (5):
>>>>   Documentation/git-config.txt: add SCOPES section
>>>>   Documentation: define protected configuration
>>>
>>> Forgot to mention when I was sending my comments on patch 2: we should
>>> standardize on "protected config" and not use "protected configuration"
>>> anywhere.
>>
>> Makes sense.
>
> Using a single word consistently does make sense, but why favor a
> non-word over a proper word ;-)?

Hm, I guess there's an argument that "config" is a term of art that
specifically refers to things from "git config". From that lens, it's much less
confusing to see the CONFIGURATION section in
Documentation/git-config.txt. But the argument is a little flimsy
because I don't think that's something we've stuck to anywhere.

I'll use "configuration" if it's not too unwieldy.

>> I suppose that the idea behind this is that we only parse and store each
>> config file exactly once. It's a good goal, but the whole point of the
>> configset is that we can query a single struct to figure out the value
>> of a config variable. Having multiple configsets starts to shift more of
>> the burden to the callers because they now have to query multiple
>> configsets to find their desired config value, and we already start to
>> see some of this unpleasantness in this series.
>
> Yes, I was worried about this, too.  "parse and store exactly once"
> may merely be a performance thing, but it still matters, even though
> it is not worse than making duplicate callbacks to overwrite globals
> that have been already set earlier, which will affect correctness ;-)

Exactly.

>> An alternative that I'd been thinking about is to make a few changes to
>> the git_config_* + configset API to allow us to use a single configset
>> for all of our needs:
>>
>> 1. Keep track of what config we've read when reading into
>>    the_repository->config, i.e. instead of a boolean "all config has
>>    been [un]read", we can express "system and global config has been
>>    read, but not local or command config". Then, use this information to
>>    load config from sources as they become available. This will allow us
>>    to read incomplete config for trace2 and setup.c (discovery.bare and
>>    safe.directory), and only read what we need later on.
>
> That is not a bad direction to go, but are we sure that we always
> read in the right order (and there is one single right order) and
> stop at the right step?
>
> config.c::do_git_config_sequence() reads the system and then the
> global before the local, the worktree, and the command line.  We
> would allow the values of "protected" configuration variables to be
> inspected by stopping after the first two and inspecting the result
> before the local and the rest overrides them, but will we need
> *only* that kind of partial configuration reading that stops exactly
> there?  Even with the proposed "protected" scheme, I thought we plan
> to honor the command line ones, so we may need to read
> system+global+command without reading anything else to grab the
> values only from the protected sources (ah, I like the application
> of the adjective "protected" to the source, not variables, because
> that is what we are really talking about---alternatively we could
> call it "safe").  But if we later read local and worktree ones
> lazily, unless we _insert_ them before what we read from the command
> line, we'll break the last-one-wins property, so we need to be
> careful.  I guess each configuration value in the configset knows
> where it came from, so it probably is possible to insert the ones
> you read lazily later in the right spot.

Yeah, last-one-wins makes this a lot trickier. I thought that it would
be nice to have insert-with-priority because that also eliminates some
of the correctness concerns in this series, i.e. that ensures protected
config has the same priority as regular config, but that's a bigger
undertaking and I'm not certain about the performance.

>> 2. Add an additional argument that specifies what scopes to respect when
>>    reading config (maybe as a set of flags). This gives us extra
>>    specificity when using the git_config*() functions, so we could get
>>    rid of git_protected_config() like so:
>>
>>     /* Change enum config_scope into flags first... */
>>
>>     #define WIP_SCOPES_PROTECTED (CONFIG_SCOPE_SYSTEM | \
>>       CONFIG_SCOPE_GLOBAL | CONFIG_SCOPE_COMMAND)
>>
>>     static enum discovery_bare_allowed get_discovery_bare(void)
>>     {
>>       enum discovery_bare_allowed result = DISCOVERY_BARE_ALWAYS;
>>       git_config(discovery_bare_cb, &result, WIP_SCOPES_PROTECTED);
>>       return result;
>>     }
>
> Alternatively, we could make the callback aware of the scope for
> each var-value it is called and have it filter, but that would be a
> bigger surgery.
>
> I think a new iterator git_config_in_scope(), instead of updating
> git_config(), would make sense.  By definition, all existing
> git_config() callers do not need the scope specifiers, and
> "protected" may be the first one but will not be the last one that
> needs to read from particular scopes.

Makes sense. The signature of git_config() could stay the same, but we
could refactor it to use git_config_in_scope().
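
A minimal sketch of that shape, assuming flag-style scopes and using
hypothetical wrapcfg_* names in place of git_config() and
git_config_in_scope(). The static array is only a stand-in for whatever
the configset would hold.

```c
#include <assert.h>
#include <stddef.h>

#define WRAPCFG_SCOPE_SYSTEM  (1u << 0)
#define WRAPCFG_SCOPE_GLOBAL  (1u << 1)
#define WRAPCFG_SCOPE_LOCAL   (1u << 2)
#define WRAPCFG_SCOPE_COMMAND (1u << 3)
#define WRAPCFG_SCOPE_ALL     (~0u)

struct wrapcfg_entry {
	const char *key;
	const char *value;
	unsigned scope;
};

typedef int (*wrapcfg_cb)(const char *key, const char *value, void *data);

/* Stand-in for the entries a configset would hold. */
static const struct wrapcfg_entry wrapcfg_entries[] = {
	{ "core.editor", "vi", WRAPCFG_SCOPE_SYSTEM },
	{ "user.name", "A U Thor", WRAPCFG_SCOPE_GLOBAL },
	{ "core.editor", "emacs", WRAPCFG_SCOPE_LOCAL },
};

/* New iterator: visit only entries from the requested scopes. */
static int wrapcfg_config_in_scope(wrapcfg_cb cb, void *data, unsigned scopes)
{
	size_t i;
	size_t nr = sizeof(wrapcfg_entries) / sizeof(wrapcfg_entries[0]);

	assert(cb);
	for (i = 0; i < nr; i++)
		if ((wrapcfg_entries[i].scope & scopes) &&
		    cb(wrapcfg_entries[i].key, wrapcfg_entries[i].value, data) < 0)
			return -1;
	return 0;
}

/* Existing entry point keeps its two-argument shape and delegates. */
static int wrapcfg_config(wrapcfg_cb cb, void *data)
{
	return wrapcfg_config_in_scope(cb, data, WRAPCFG_SCOPE_ALL);
}

/* Example callback: count how many entries were visited. */
static int wrapcfg_count_cb(const char *key, const char *value, void *data)
{
	(void)key;
	(void)value;
	++*(int *)data;
	return 0;
}
```

Existing callers would keep calling the two-argument function, and only
the callers that care about scopes would pass a mask explicitly.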

* Re: [PATCH v4 0/5] config: introduce discovery.bare and protected config
  2022-06-07 20:57     ` [PATCH v4 0/5] config: introduce discovery.bare and protected config Glen Choo via GitGitGadget
                         ` (5 preceding siblings ...)
  2022-06-22 22:03       ` [PATCH v4 0/5] config: introduce discovery.bare and protected config Jonathan Tan
@ 2022-06-27 18:19       ` Glen Choo
  2022-06-27 18:36       ` [PATCH v5 " Glen Choo via GitGitGadget
  7 siblings, 0 replies; 113+ messages in thread
From: Glen Choo @ 2022-06-27 18:19 UTC (permalink / raw)
  To: Glen Choo via GitGitGadget, git
  Cc: Taylor Blau, brian m. carlson, Derrick Stolee, Junio C Hamano,
	Emily Shaffer

"Glen Choo via GitGitGadget" <gitgitgadget@gmail.com> writes:

> This round doesn't introduce any major code changes. The most notable
> changes are:
>

[...]

>
>  * I added a test for "git config --show-scope" and the 'worktree' scope,
>    since 'worktree' wasn't listed in Documentation/git-config.txt (6/5).

Erratum: This patch was sent as gc/document-config-worktree-scope [1].
Patch 6/5 obviously doesn't exist.

[1] https://lore.kernel.org/git/pull.1274.git.git.1654637044966.gitgitgadget@gmail.com

* [PATCH v5 0/5] config: introduce discovery.bare and protected config
  2022-06-07 20:57     ` [PATCH v4 0/5] config: introduce discovery.bare and protected config Glen Choo via GitGitGadget
                         ` (6 preceding siblings ...)
  2022-06-27 18:19       ` Glen Choo
@ 2022-06-27 18:36       ` Glen Choo via GitGitGadget
  2022-06-27 18:36         ` [PATCH v5 1/5] Documentation/git-config.txt: add SCOPES section Glen Choo via GitGitGadget
                           ` (5 more replies)
  7 siblings, 6 replies; 113+ messages in thread
From: Glen Choo via GitGitGadget @ 2022-06-27 18:36 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, brian m. carlson, Derrick Stolee, Junio C Hamano,
	Emily Shaffer, Jonathan Tan, Glen Choo

The previous round of this series was picked up by the Google-hosted "Review
Club" (the event is on http://tinyurl.com/gitcal). This round incorporates
Jonathan Tan's feedback (thanks!) as well as some feedback from Review Club
itself.

This round only contains changes to commit messages and documentation. As
requested in Review Club, I've included a full "Description" section in this
cover letter for the convenience of new readers.

= Description

There is a known social engineering attack that takes advantage of the fact
that a working tree can include an entire bare repository, including a
config file. A user could run a Git command inside the bare repository
thinking that the config file of the 'outer' repository would be used, but
in reality, the bare repository's config file (which is attacker-controlled)
is used, which may result in arbitrary code execution. See [1] for a fuller
description and deeper discussion.

This series implements a simple way of preventing such attacks: create a
config option, discovery.bare, that tells Git whether or not to die when it
finds a bare repository. discovery.bare has two values:

 * "always": always allow bare repositories (default), identical to current
   behavior
 * "never": never allow bare repositories

and users/system administrators who never expect to work with bare
repositories can secure their environments using "never". discovery.bare has
no effect if --git-dir or GIT_DIR is passed because we are confident that
the user is not confused about which repository is being used.
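
The decision described above can be sketched as follows. This is only
an illustration of the described behavior, not the code in setup.c: the
barecfg_* names are hypothetical, and explicit_git_dir stands for
"--git-dir or GIT_DIR was given".

```c
#include <assert.h>
#include <string.h>

/* Hypothetical policy values mirroring "never" and "always". */
enum barecfg_policy { BARECFG_NEVER, BARECFG_ALWAYS };

/* Parse a discovery.bare value; returns 0 on success, -1 if unrecognized. */
static int barecfg_parse(const char *value, enum barecfg_policy *out)
{
	assert(out);
	if (value && !strcmp(value, "always"))
		*out = BARECFG_ALWAYS;
	else if (value && !strcmp(value, "never"))
		*out = BARECFG_NEVER;
	else
		return -1;
	return 0;
}

/*
 * Decide whether a bare repository may be used. When the repository was
 * named explicitly, no discovery happened and the policy does not apply.
 */
static int barecfg_allowed(enum barecfg_policy policy, int explicit_git_dir)
{
	if (explicit_git_dir)
		return 1;
	return policy == BARECFG_ALWAYS;
}
```

With "never", barecfg_allowed() returns 0 for a discovered bare
repository and 1 when the repository was named explicitly.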

This series does not change the default behavior, but in the long run, a
"no-embedded" option might be a safe and usable default [2]. "never" is too
restrictive and unlikely to be the default.

For security reasons, discovery.bare cannot be read from repository-level
config (because we would end up trusting the embedded bare repository that
we aren't supposed to trust to begin with). Since this would introduce a 3rd
variable that is only read from 'protected/trusted configuration' (the
others are safe.directory and uploadpack.packObjectsHook) this series also
defines and creates a shared implementation for 'protected configuration'.

= Patch organization

 * Patch 1 adds a section on configuration scopes to our docs
 * Patches 2-3 define 'protected configuration' and create a shared
   implementation.
 * Patch 4 refactors safe.directory to use protected configuration
 * Patch 5 adds discovery.bare

= Series history

Changes in v5:

 * Standardize the usage of "protected configuration" instead of mixing
   "config" and "configuration". This required some unfortunate rewrapping.
 * Remove mentions of "trustworthiness" when discussing protected
   configuration and focus on what Git does instead.
   * The rationale of protected vs non-protected is still kept.
 * Fix the stale documentation entry for discovery.bare.
 * Include a fuller description of how discovery.bare and "--git-dir"
   interact instead of saying "has no effect".

Changes in v4:

 * 2/5's commit message now justifies what scopes are included in protected
   config
 * The global configset is now a file-scope static inside config.c
   (previously it was a member of the_repository).
 * Rename discovery_bare_config to discovery_bare_allowed
 * Make discovery_bare_allowed function-scoped (instead of global).
 * Add an expect_accepted helper to the discovery.bare tests.
 * Add a helper to "upload-pack" that reads the protected and non-protected
   config

Changes in v3:

 * Rebase onto a more recent 'master'
 * Reframe this feature only in terms of the 'embedded bare repo' attack.
 * Other docs improvements (thanks Stolee in particular!)
 * Protected config no longer uses read_very_early_config() and is only read
   once
 * Protected config now includes "-c"
 * uploadpack.packObjectsHook now uses protected config instead of ignoring
   repo config using config scopes

Changes in v2:

 * Rename safe.barerepository to discovery.bare and make it die()
 * Move tests into t/t0034-discovery-bare.sh
 * Avoid unnecessary config reading by using a static variable
 * Add discovery.bare=cwd
 * Fix typos

= Future work

 * This series does not implement the "no-embedded" option [2] and I won't
   work on it any time soon, but I'd be more than happy to review if someone
   sends patches.
 * With discovery.bare, if a builtin is marked RUN_SETUP_GENTLY, setup.c
   doesn't die() and we don't tell users why their repository was rejected,
   e.g. "git config" gives an opaque "fatal: not in a git directory". This
   isn't a new problem though, since safe.directory has the same issue.

[1]
https://lore.kernel.org/git/kl6lsfqpygsj.fsf@chooglen-macbookpro.roam.corp.google.com

[2] This was first suggested in
https://lore.kernel.org/git/5b969c5e-e802-c447-ad25-6acc0b784582@github.com

Glen Choo (5):
  Documentation/git-config.txt: add SCOPES section
  Documentation: define protected configuration
  config: learn `git_protected_config()`
  safe.directory: use git_protected_config()
  setup.c: create `discovery.bare`

 Documentation/config.txt            |  2 +
 Documentation/config/discovery.txt  | 23 +++++++++
 Documentation/config/safe.txt       |  6 +--
 Documentation/config/uploadpack.txt |  6 +--
 Documentation/git-config.txt        | 77 +++++++++++++++++++++++------
 config.c                            | 51 +++++++++++++++++++
 config.h                            | 17 +++++++
 setup.c                             | 59 +++++++++++++++++++++-
 t/t0033-safe-directory.sh           | 24 ++++-----
 t/t0035-discovery-bare.sh           | 68 +++++++++++++++++++++++++
 t/t5544-pack-objects-hook.sh        |  7 ++-
 upload-pack.c                       | 27 ++++++----
 12 files changed, 320 insertions(+), 47 deletions(-)
 create mode 100644 Documentation/config/discovery.txt
 create mode 100755 t/t0035-discovery-bare.sh


base-commit: f770e9f396d48b567ef7b37d273e91ad570a3522
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1261%2Fchooglen%2Fsetup%2Fdisable-bare-repo-config-v5
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1261/chooglen/setup/disable-bare-repo-config-v5
Pull-Request: https://github.com/git/git/pull/1261

Range-diff vs v4:

 1:  c0e27ab3b3e ! 1:  ee9619f6ec0 Documentation/git-config.txt: add SCOPES section
     @@ Metadata
       ## Commit message ##
          Documentation/git-config.txt: add SCOPES section
      
     -    In a subsequent commit, we will introduce "protected config", which is
     -    easiest to describe in terms of configuration scopes (i.e. it's the
     -    union of the 'system', 'global', and 'command' scopes). This description
     -    is fine for ML discussions, but it's inadequate for end users because we
     -    don't provide a good description of "config scopes" in the public docs.
     +    In a subsequent commit, we will introduce "protected configuration",
     +    which is easiest to describe in terms of configuration scopes (i.e. it's
     +    the union of the 'system', 'global', and 'command' scopes). This
     +    description is fine for ML discussions, but it's inadequate for end
     +    users because we don't provide a good description of "configuration
     +    scopes" in the public docs.
      
          145d59f482 (config: add '--show-scope' to print the scope of a config
          value, 2020-02-10) introduced the word "scope" to our public docs, but
     @@ Commit message
          those values mean.
      
          Add a SCOPES section to Documentation/git-config.txt that describes the
     -    config scopes, their corresponding CLI options, and mentions that some
     -    configuration options are only respected in certain scopes. Then,
     +    configuration scopes, their corresponding CLI options, and mentions that
     +    some configuration options are only respected in certain scopes. Then,
          use the word "scope" to simplify the FILES section and change some
          confusing wording.
      
 2:  a5a1dcb03e1 ! 2:  43627c05c0b Documentation: define protected configuration
     @@ Commit message
          Documentation: define protected configuration
      
          For security reasons, there are config variables that are only trusted
     -    when they are specified in extra-trustworthy configuration scopes, which
     -    are sometimes referred to on-list as 'protected configuration' [1]. A
     -    future commit will introduce another such variable, so let's define our
     -    terms so that we can have consistent documentation and implementation.
     +    when they are specified in certain configuration scopes, which are
     +    sometimes referred to on-list as 'protected configuration' [1]. A future
     +    commit will introduce another such variable, so let's define our terms
     +    so that we can have consistent documentation and implementation.
      
     -    In our documentation, define 'protected config' as the system, global
     -    and command config scopes. As a shorthand, I will refer to variables
     -    that are only respected in protected config as 'protected config only',
     -    but this term is not used in the documentation.
     +    In our documentation, define 'protected configuration' as the system,
     +    global and command config scopes. As a shorthand, I will refer to
     +    variables that are only respected in protected config as 'protected
     +    configuration only', but this term is not used in the documentation.
      
     -    This definition of protected configuration is based on whether or not
     -    Git can reasonably protect the user by ignoring the configuration scope:
     +    This definition of protected config is based on whether or not Git can
     +    reasonably protect the user by ignoring the configuration scope:
      
          - System, global and command line config are considered protected
            because an attacker who has control over any of those can do plenty of
     @@ Commit message
            considered protected because it is relatively easy for an attacker to
            control local config, e.g.:
            - On some shared user environments, a non-admin attacker can create a
     -        repository high up the directory hierarchy (e.g. C:\.git on Windows),
     -        and a user may accidentally use it when their PS1 automatically
     -        invokes "git" commands.
     +        repository high up the directory hierarchy (e.g. C:\.git on
     +        Windows), and a user may accidentally use it when their PS1
     +        automatically invokes "git" commands.
      
              `safe.directory` prevents attacks of this form by making sure that
              the user intended to use the shared repository. It obviously
     @@ Commit message
              repository (because "git upload-pack" would fail), but we can limit
              the risks by ignoring `uploadpack.packObjectsHook`.
      
     -    Only `uploadpack.packObjectsHook` is 'protected config only'. The
     +    Only `uploadpack.packObjectsHook` is 'protected configuration only'. The
          following variables are intentionally excluded:
      
     -    - `safe.directory` should be 'protected config only', but it does not
     -      technically fit the definition because it is not respected in the
     +    - `safe.directory` should be 'protected configuration only', but it does
     +      not technically fit the definition because it is not respected in the
            "command" scope. A future commit will fix this.
      
          - `trace2.*` happens to read the same scopes as `safe.directory` because
     @@ Documentation/git-config.txt: Most configuration options are respected regardles
       defined in, but some options are only respected in certain scopes. See the
       option's documentation for the full details.
       
     -+Protected config
     -+~~~~~~~~~~~~~~~~
     -+
     -+Protected config refers to the 'system', 'global', and 'command' scopes. Git
     -+considers these scopes to be especially trustworthy because they are likely
     -+to be controlled by the user or a trusted administrator. An attacker who
     -+controls these scopes can do substantial harm without using Git, so it is
     -+assumed that the user's environment protects these scopes against attackers.
     ++Protected configuration
     ++~~~~~~~~~~~~~~~~~~~~~~~
      +
     ++Protected configuration refers to the 'system', 'global', and 'command' scopes.
      +For security reasons, certain options are only respected when they are
     -+specified in protected config, and ignored otherwise.
     ++specified in protected configuration, and ignored otherwise.
     ++
     ++Git treats these scopes as if they are controlled by the user or a trusted
     ++administrator. This is because an attacker who controls these scopes can do
     ++substantial harm without using Git, so it is assumed that the user's environment
     ++protects these scopes against attackers.
      +
       ENVIRONMENT
       -----------
 3:  94b40907e66 ! 3:  3efe282e6b9 config: read protected config with `git_protected_config()`
     @@ Metadata
      Author: Glen Choo <chooglen@google.com>
      
       ## Commit message ##
     -    config: read protected config with `git_protected_config()`
     +    config: learn `git_protected_config()`
      
     -    `uploadpack.packObjectsHook` is the only 'protected config only'
     +    `uploadpack.packObjectsHook` is the only 'protected configuration only'
          variable today, but we've noted that `safe.directory` and the upcoming
     -    `discovery.bare` should also be 'protected config only'. So, for
     +    `discovery.bare` should also be 'protected configuration only'. So, for
          consistency, we'd like to have a single implementation for protected
          config.
      
          The primary constraints are:
      
     -    1. Reading from protected config should be as fast as possible. Nearly
     -       all "git" commands inside a bare repository will read both
     +    1. Reading from protected configuration should be as fast as possible.
     +       Nearly all "git" commands inside a bare repository will read both
             `safe.directory` and `discovery.bare`, so we cannot afford to be
             slow.
      
     @@ Commit message
             `safe.directory` and `discovery.bare` both affect repository
             discovery and the gitdir is not known at that point [1].
      
     -    The chosen implementation in this commit is to read protected config and
     -    cache the values in a global configset. This is similar to the caching
     -    behavior we get with the_repository->config.
     +    The chosen implementation in this commit is to read protected
     +    configuration and cache the values in a global configset. This is
     +    similar to the caching behavior we get with the_repository->config.
      
     -    Introduce git_protected_config(), which reads protected config and
     -    caches them in the global configset protected_config. Then, refactor
     +    Introduce git_protected_config(), which reads protected configuration
     +    and caches them in the global configset protected_config. Then, refactor
          `uploadpack.packObjectsHook` to use git_protected_config().
      
     -    The protected config functions are named similarly to their
     +    The protected configuration functions are named similarly to their
          non-protected counterparts, e.g. git_protected_config_check_init() vs
          git_config_check_init().
      
          In light of constraint 1, this implementation can still be improved
          since git_protected_config() iterates through every variable in
          protected_config, which may still be too expensive. There exist constant
     -    time lookup functions for non-protected config (repo_config_get_*()),
     -    but for simplicity, this commit does not implement similar functions for
     -    protected config.
     +    time lookup functions for non-protected configuration
     +    (repo_config_get_*()), but for simplicity, this commit does not
     +    implement similar functions for protected configuration.
      
          An alternative that avoids introducing another configset is to continue
          to read all config using git_config(), but only accept values that have
     @@ Commit message
          the_repository->config, which would need to be reset when the gitdir is
          known and git_config() needs to read the local and worktree config.
          Resetting the_repository->config might be reasonable while we only have
     -    these 'protected config only' variables, but it's not clear whether this
     -    extends well to future variables.
     +    these 'protected configuration only' variables, but it's not clear
     +    whether this extends well to future variables.
      
          [1] In this case, we do have a candidate gitdir though, so with a little
          refactoring, it might be possible to provide a gitdir.
 4:  156817966fa ! 4:  ec925823414 safe.directory: use git_protected_config()
     @@ Commit message
          safe.directory: use git_protected_config()
      
          Use git_protected_config() to read `safe.directory` instead of
     -    read_very_early_config(), making it 'protected config only'. As a
     -    result, `safe.directory` now respects "-c", so update the tests and docs
     -    accordingly.
     +    read_very_early_config(), making it 'protected configuration only'.
     +
     +    As a result, `safe.directory` now respects "-c", so update the tests and
     +    docs accordingly. It used to ignore "-c" due to how it was implemented,
     +    not because of security or correctness concerns [1].
     +
     +    [1] https://lore.kernel.org/git/xmqqlevabcsu.fsf@gitster.g/
      
          Signed-off-by: Glen Choo <chooglen@google.com>
      
 5:  29053d029f8 ! 5:  14411512783 setup.c: create `discovery.bare`
     @@ Commit message
          Create a config variable, `discovery.bare`, that tells Git whether or
          not to die() when it discovers a bare repository. This only affects
          repository discovery, thus it has no effect if discovery was not
     -    done (e.g. `--git-dir` was passed).
     +    done, e.g. if the user passes `--git-dir=my-dir`, discovery will be
     +    skipped and my-dir will be used as the repo regardless of the
     +    `discovery.bare` value.
      
          This config is an enum of:
      
     @@ Documentation/config.txt: include::config/diff.txt[]
       ## Documentation/config/discovery.txt (new) ##
      @@
      +discovery.bare::
     -+	'(Protected config only)' Specifies whether Git will work with a
     -+	bare repository that it found during repository discovery. This
     -+	has no effect if the repository is specified directly via the
     -+	--git-dir command-line option or the GIT_DIR environment
     -+	variable (see linkgit:git[1]).
     ++	Specifies whether Git will work with a bare repository that it
     ++	found during repository discovery. If the repository is
     ++	specified directly via the --git-dir command-line option or the
     ++	GIT_DIR environment variable (see linkgit:git[1]), Git will
     ++	always use the specified repository, regardless of this value.
     +++
     ++This config setting is only respected in protected configuration (see
     ++<<SCOPES>>). This prevents the untrusted repository from tampering with
     ++this value.
      ++
      +The currently supported values are:
      ++

-- 
gitgitgadget


* [PATCH v5 1/5] Documentation/git-config.txt: add SCOPES section
  2022-06-27 18:36       ` [PATCH v5 " Glen Choo via GitGitGadget
@ 2022-06-27 18:36         ` Glen Choo via GitGitGadget
  2022-06-27 18:36         ` [PATCH v5 2/5] Documentation: define protected configuration Glen Choo via GitGitGadget
                           ` (4 subsequent siblings)
  5 siblings, 0 replies; 113+ messages in thread
From: Glen Choo via GitGitGadget @ 2022-06-27 18:36 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, brian m. carlson, Derrick Stolee, Junio C Hamano,
	Emily Shaffer, Jonathan Tan, Glen Choo, Glen Choo

From: Glen Choo <chooglen@google.com>

In a subsequent commit, we will introduce "protected configuration",
which is easiest to describe in terms of configuration scopes (i.e. it's
the union of the 'system', 'global', and 'command' scopes). This
description is fine for ML discussions, but it's inadequate for end
users because we don't provide a good description of "configuration
scopes" in the public docs.

145d59f482 (config: add '--show-scope' to print the scope of a config
value, 2020-02-10) introduced the word "scope" to our public docs, but
that only enumerates the scopes and assumes the user can figure out
what those values mean.

Add a SCOPES section to Documentation/git-config.txt that describes the
configuration scopes, their corresponding CLI options, and mentions that
some configuration options are only respected in certain scopes. Then,
use the word "scope" to simplify the FILES section and change some
confusing wording.

Signed-off-by: Glen Choo <chooglen@google.com>
---
 Documentation/git-config.txt | 64 ++++++++++++++++++++++++++++--------
 1 file changed, 50 insertions(+), 14 deletions(-)

diff --git a/Documentation/git-config.txt b/Documentation/git-config.txt
index 9376e39aef2..f93d437b898 100644
--- a/Documentation/git-config.txt
+++ b/Documentation/git-config.txt
@@ -297,8 +297,8 @@ The default is to use a pager.
 FILES
 -----
 
-If not set explicitly with `--file`, there are four files where
-'git config' will search for configuration options:
+By default, 'git config' will read configuration options from multiple
+files:
 
 $(prefix)/etc/gitconfig::
 	System-wide configuration file.
@@ -322,27 +322,63 @@ $GIT_DIR/config.worktree::
 	This is optional and is only searched when
 	`extensions.worktreeConfig` is present in $GIT_DIR/config.
 
-If no further options are given, all reading options will read all of these
-files that are available. If the global or the system-wide configuration
-file are not available they will be ignored. If the repository configuration
-file is not available or readable, 'git config' will exit with a non-zero
-error code. However, in neither case will an error message be issued.
+You may also provide additional configuration parameters when running any
+git command by using the `-c` option. See linkgit:git[1] for details.
+
+Options will be read from all of these files that are available. If the
+global or the system-wide configuration file are not available they will be
+ignored. If the repository configuration file is not available or readable,
+'git config' will exit with a non-zero error code. However, in neither case
+will an error message be issued.
 
 The files are read in the order given above, with last value found taking
 precedence over values read earlier.  When multiple values are taken then all
 values of a key from all files will be used.
 
-You may override individual configuration parameters when running any git
-command by using the `-c` option. See linkgit:git[1] for details.
-
-All writing options will per default write to the repository specific
+By default, options are only written to the repository specific
 configuration file. Note that this also affects options like `--replace-all`
 and `--unset`. *'git config' will only ever change one file at a time*.
 
-You can override these rules using the `--global`, `--system`,
-`--local`, `--worktree`, and `--file` command-line options; see
-<<OPTIONS>> above.
+You can change the way options are read/written by specifying the path to a
+file (`--file`), or by specifying a configuration scope (`--system`,
+`--global`, `--local`, `--worktree`); see <<OPTIONS>> above.
+
+SCOPES
+------
+
+Each configuration source falls within a configuration scope. The scopes
+are:
+
+system::
+	$(prefix)/etc/gitconfig
+
+global::
+	$XDG_CONFIG_HOME/git/config
++
+~/.gitconfig
+
+local::
+	$GIT_DIR/config
+
+worktree::
+	$GIT_DIR/config.worktree
+
+command::
+	environment variables
++
+the `-c` option
+
+With the exception of 'command', each scope corresponds to a command line
+option - `--system`, `--global`, `--local`, `--worktree`.
+
+When reading options, specifying a scope will only read options from the
+files within that scope. When writing options, specifying a scope will write
+to the files within that scope (instead of the repository specific
+configuration file). See <<OPTIONS>> above for a complete description.
 
+Most configuration options are respected regardless of the scope it is
+defined in, but some options are only respected in certain scopes. See the
+option's documentation for the full details.
 
 ENVIRONMENT
 -----------
-- 
gitgitgadget



* [PATCH v5 2/5] Documentation: define protected configuration
  2022-06-27 18:36       ` [PATCH v5 " Glen Choo via GitGitGadget
  2022-06-27 18:36         ` [PATCH v5 1/5] Documentation/git-config.txt: add SCOPES section Glen Choo via GitGitGadget
@ 2022-06-27 18:36         ` Glen Choo via GitGitGadget
  2022-06-27 18:36         ` [PATCH v5 3/5] config: learn `git_protected_config()` Glen Choo via GitGitGadget
                           ` (3 subsequent siblings)
  5 siblings, 0 replies; 113+ messages in thread
From: Glen Choo via GitGitGadget @ 2022-06-27 18:36 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, brian m. carlson, Derrick Stolee, Junio C Hamano,
	Emily Shaffer, Jonathan Tan, Glen Choo, Glen Choo

From: Glen Choo <chooglen@google.com>

For security reasons, there are config variables that are only trusted
when they are specified in certain configuration scopes, which are
sometimes referred to on-list as 'protected configuration' [1]. A future
commit will introduce another such variable, so let's define our terms
so that we can have consistent documentation and implementation.

In our documentation, define 'protected configuration' as the system,
global and command config scopes. As a shorthand, I will refer to
variables that are only respected in protected config as 'protected
configuration only', but this term is not used in the documentation.

This definition of protected config is based on whether or not Git can
reasonably protect the user by ignoring the configuration scope:

- System, global and command line config are considered protected
  because an attacker who has control over any of those can do plenty of
  harm without Git, so we gain very little by ignoring those scopes.
- On the other hand, local (and similarly, worktree) config are not
  considered protected because it is relatively easy for an attacker to
  control local config, e.g.:
  - On some shared user environments, a non-admin attacker can create a
    repository high up the directory hierarchy (e.g. C:\.git on
    Windows), and a user may accidentally use it when their PS1
    automatically invokes "git" commands.

    `safe.directory` prevents attacks of this form by making sure that
    the user intended to use the shared repository. It obviously
    shouldn't be read from the repository, because that would end up
    trusting the repository that Git was supposed to reject.
  - "git upload-pack" is expected to run in repositories that may not be
    controlled by the user. We cannot ignore all config in that
    repository (because "git upload-pack" would fail), but we can limit
    the risks by ignoring `uploadpack.packObjectsHook`.
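
As an illustration only, the scope classification described above could
be sketched as follows. The enum and function names are hypothetical
and are not Git's actual definitions; the sketch only encodes "system,
global and command are protected, local and worktree are not".

```c
#include <stdbool.h>

/* Hypothetical scope enum for illustration; not Git's own enum. */
enum sketch_scope {
	SKETCH_SCOPE_SYSTEM,
	SKETCH_SCOPE_GLOBAL,
	SKETCH_SCOPE_LOCAL,
	SKETCH_SCOPE_WORKTREE,
	SKETCH_SCOPE_COMMAND,
};

/* Returns true if the scope counts as protected configuration. */
static bool sketch_scope_is_protected(enum sketch_scope scope)
{
	switch (scope) {
	case SKETCH_SCOPE_SYSTEM:
	case SKETCH_SCOPE_GLOBAL:
	case SKETCH_SCOPE_COMMAND:
		return true;
	case SKETCH_SCOPE_LOCAL:
	case SKETCH_SCOPE_WORKTREE:
		return false;
	}
	/* Unknown scopes are treated as unprotected in this sketch. */
	return false;
}
```

A 'protected configuration only' variable would then be accepted only
when this predicate holds for the scope that supplied the value.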

Only `uploadpack.packObjectsHook` is 'protected configuration only'. The
following variables are intentionally excluded:

- `safe.directory` should be 'protected configuration only', but it does
  not technically fit the definition because it is not respected in the
  "command" scope. A future commit will fix this.

- `trace2.*` happens to read the same scopes as `safe.directory` because
  they share an implementation. However, this is not for security
  reasons; it is because we want to start tracing so early that
  repository-level config and "-c" are not available [2].

  This requirement is unique to `trace2.*`, so it does not make sense
  for protected configuration to be subject to the same constraints.

[1] For example,
https://lore.kernel.org/git/6af83767-576b-75c4-c778-0284344a8fe7@github.com/
[2] https://lore.kernel.org/git/a0c89d0d-669e-bf56-25d2-cbb09b012e70@jeffhostetler.com/

Signed-off-by: Glen Choo <chooglen@google.com>
---
 Documentation/config/uploadpack.txt |  6 +++---
 Documentation/git-config.txt        | 13 +++++++++++++
 2 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/Documentation/config/uploadpack.txt b/Documentation/config/uploadpack.txt
index 32fad5bbe81..029abbefdff 100644
--- a/Documentation/config/uploadpack.txt
+++ b/Documentation/config/uploadpack.txt
@@ -49,9 +49,9 @@ uploadpack.packObjectsHook::
 	`pack-objects` to the hook, and expects a completed packfile on
 	stdout.
 +
-Note that this configuration variable is ignored if it is seen in the
-repository-level config (this is a safety measure against fetching from
-untrusted repositories).
+Note that this configuration variable is only respected when it is specified
+in protected config (see <<SCOPES>>). This is a safety measure against
+fetching from untrusted repositories.
 
 uploadpack.allowFilter::
 	If this option is set, `upload-pack` will support partial
diff --git a/Documentation/git-config.txt b/Documentation/git-config.txt
index f93d437b898..f1810952891 100644
--- a/Documentation/git-config.txt
+++ b/Documentation/git-config.txt
@@ -343,6 +343,7 @@ You can change the way options are read/written by specifying the path to a
 file (`--file`), or by specifying a configuration scope (`--system`,
 `--global`, `--local`, `--worktree`); see <<OPTIONS>> above.
 
+[[SCOPES]]
 SCOPES
 ------
 
@@ -380,6 +381,18 @@ Most configuration options are respected regardless of the scope it is
 defined in, but some options are only respected in certain scopes. See the
 option's documentation for the full details.
 
+Protected configuration
+~~~~~~~~~~~~~~~~~~~~~~~
+
+Protected configuration refers to the 'system', 'global', and 'command' scopes.
+For security reasons, certain options are only respected when they are
+specified in protected configuration, and ignored otherwise.
+
+Git treats these scopes as if they are controlled by the user or a trusted
+administrator. This is because an attacker who controls these scopes can do
+substantial harm without using Git, so it is assumed that the user's environment
+protects these scopes against attackers.
+
 ENVIRONMENT
 -----------
 
-- 
gitgitgadget



* [PATCH v5 3/5] config: learn `git_protected_config()`
  2022-06-27 18:36       ` [PATCH v5 " Glen Choo via GitGitGadget
  2022-06-27 18:36         ` [PATCH v5 1/5] Documentation/git-config.txt: add SCOPES section Glen Choo via GitGitGadget
  2022-06-27 18:36         ` [PATCH v5 2/5] Documentation: define protected configuration Glen Choo via GitGitGadget
@ 2022-06-27 18:36         ` Glen Choo via GitGitGadget
  2022-06-27 18:36         ` [PATCH v5 4/5] safe.directory: use git_protected_config() Glen Choo via GitGitGadget
                           ` (2 subsequent siblings)
  5 siblings, 0 replies; 113+ messages in thread
From: Glen Choo via GitGitGadget @ 2022-06-27 18:36 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, brian m. carlson, Derrick Stolee, Junio C Hamano,
	Emily Shaffer, Jonathan Tan, Glen Choo, Glen Choo

From: Glen Choo <chooglen@google.com>

`uploadpack.packObjectsHook` is the only 'protected configuration only'
variable today, but we've noted that `safe.directory` and the upcoming
`discovery.bare` should also be 'protected configuration only'. So, for
consistency, we'd like to have a single implementation for protected
config.

The primary constraints are:

1. Reading from protected configuration should be as fast as possible.
   Nearly all "git" commands inside a bare repository will read both
   `safe.directory` and `discovery.bare`, so we cannot afford to be
   slow.

2. Protected config must be readable when the gitdir is not known.
   `safe.directory` and `discovery.bare` both affect repository
   discovery and the gitdir is not known at that point [1].

The chosen implementation in this commit is to read protected
configuration and cache the values in a global configset. This is
similar to the caching behavior we get with the_repository->config.

Introduce git_protected_config(), which reads protected configuration
and caches it in the global configset protected_config. Then, refactor
`uploadpack.packObjectsHook` to use git_protected_config().
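
The read-once-then-iterate pattern described above can be sketched in
isolation roughly as follows. This is a simplified model, not the code
in this patch: the `sketch_*` names, the fixed-size array, and the
hard-coded entries are all assumptions standing in for the real
configset and config sources.

```c
#include <stddef.h>

/* Hypothetical stand-ins for a config entry and a config callback. */
struct sketch_entry {
	const char *key;
	const char *value;
};

typedef int (*sketch_fn_t)(const char *key, const char *value, void *data);

static struct sketch_entry sketch_cache[8];
static size_t sketch_cache_nr;
static int sketch_cache_initialized;
static int sketch_read_count; /* how many times the sources were read */

/* Stands in for reading the system, global and command-line sources. */
static void sketch_read_protected(void)
{
	sketch_cache_nr = 0;
	sketch_cache[sketch_cache_nr++] =
		(struct sketch_entry){ "safe.directory", "/srv/repo" };
	sketch_cache[sketch_cache_nr++] =
		(struct sketch_entry){ "discovery.bare", "never" };
	sketch_cache_initialized = 1;
	sketch_read_count++;
}

/* Populate the cache on first use only. */
static void sketch_check_init(void)
{
	if (sketch_cache_initialized)
		return;
	sketch_read_protected();
}

/* Hand every cached entry to the callback, in insertion order. */
static void sketch_protected_config(sketch_fn_t fn, void *data)
{
	size_t i;

	sketch_check_init();
	for (i = 0; i < sketch_cache_nr; i++)
		fn(sketch_cache[i].key, sketch_cache[i].value, data);
}

/* Example callback: counts the entries it is shown. */
static int sketch_count_cb(const char *key, const char *value, void *data)
{
	(void)key;
	(void)value;
	(*(int *)data)++;
	return 0;
}
```

Calling sketch_protected_config() twice reads the sources once but
still visits every entry on each call, which is the per-call cost that
the iteration caveat further down refers to.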

The protected configuration functions are named similarly to their
non-protected counterparts, e.g. git_protected_config_check_init() vs
git_config_check_init().

In light of constraint 1, this implementation can still be improved
since git_protected_config() iterates through every variable in
protected_config, which may still be too expensive. There exist constant
time lookup functions for non-protected configuration
(repo_config_get_*()), but for simplicity, this commit does not
implement similar functions for protected configuration.

An alternative that avoids introducing another configset is to continue
to read all config using git_config(), but only accept values that have
the correct config scope [2]. This technically fulfills constraint 2,
because git_config() simply ignores the local and worktree config when
the gitdir is not known. However, this would read incomplete config into
the_repository->config, which would need to be reset when the gitdir is
known and git_config() needs to read the local and worktree config.
Resetting the_repository->config might be reasonable while we only have
these 'protected configuration only' variables, but it's not clear
whether this extends well to future variables.

[1] In this case, we do have a candidate gitdir though, so with a little
refactoring, it might be possible to provide a gitdir.
[2] This is how `uploadpack.packObjectsHook` was implemented prior to
this commit.

Signed-off-by: Glen Choo <chooglen@google.com>
---
 config.c                     | 51 ++++++++++++++++++++++++++++++++++++
 config.h                     | 17 ++++++++++++
 t/t5544-pack-objects-hook.sh |  7 ++++-
 upload-pack.c                | 27 ++++++++++++-------
 4 files changed, 91 insertions(+), 11 deletions(-)

diff --git a/config.c b/config.c
index 9b0e9c93285..29e62f5d0ed 100644
--- a/config.c
+++ b/config.c
@@ -81,6 +81,18 @@ static enum config_scope current_parsing_scope;
 static int pack_compression_seen;
 static int zlib_compression_seen;
 
+/*
+ * Config that comes from trusted sources, namely:
+ * - system config files (e.g. /etc/gitconfig)
+ * - global config files (e.g. $HOME/.gitconfig,
+ *   $XDG_CONFIG_HOME/git)
+ * - the command line.
+ *
+ * This is declared here for code cleanliness, but unlike the other
+ * static variables, this does not hold config parser state.
+ */
+static struct config_set protected_config;
+
 static int config_file_fgetc(struct config_source *conf)
 {
 	return getc_unlocked(conf->u.file);
@@ -2378,6 +2390,11 @@ int git_configset_add_file(struct config_set *cs, const char *filename)
 	return git_config_from_file(config_set_callback, filename, cs);
 }
 
+int git_configset_add_parameters(struct config_set *cs)
+{
+	return git_config_from_parameters(config_set_callback, cs);
+}
+
 int git_configset_get_value(struct config_set *cs, const char *key, const char **value)
 {
 	const struct string_list *values = NULL;
@@ -2619,6 +2636,40 @@ int repo_config_get_pathname(struct repository *repo,
 	return ret;
 }
 
+/* Read values into protected_config. */
+static void read_protected_config(void)
+{
+	char *xdg_config = NULL, *user_config = NULL, *system_config = NULL;
+
+	git_configset_init(&protected_config);
+
+	system_config = git_system_config();
+	git_global_config(&user_config, &xdg_config);
+
+	git_configset_add_file(&protected_config, system_config);
+	git_configset_add_file(&protected_config, xdg_config);
+	git_configset_add_file(&protected_config, user_config);
+	git_configset_add_parameters(&protected_config);
+
+	free(system_config);
+	free(xdg_config);
+	free(user_config);
+}
+
+/* Ensure that protected_config has been initialized. */
+static void git_protected_config_check_init(void)
+{
+	if (protected_config.hash_initialized)
+		return;
+	read_protected_config();
+}
+
+void git_protected_config(config_fn_t fn, void *data)
+{
+	git_protected_config_check_init();
+	configset_iter(&protected_config, fn, data);
+}
+
 /* Functions used historically to read configuration from 'the_repository' */
 void git_config(config_fn_t fn, void *data)
 {
diff --git a/config.h b/config.h
index 7654f61c634..e3ff1fcf683 100644
--- a/config.h
+++ b/config.h
@@ -446,6 +446,15 @@ void git_configset_init(struct config_set *cs);
  */
 int git_configset_add_file(struct config_set *cs, const char *filename);
 
+/**
+ * Parses command line options and environment variables, and adds the
+ * variable-value pairs to the `config_set`. Returns 0 on success, or -1
+ * if there is an error in parsing. The caller decides whether to free
+ * the incomplete configset or continue using it when the function
+ * returns -1.
+ */
+int git_configset_add_parameters(struct config_set *cs);
+
 /**
  * Finds and returns the value list, sorted in order of increasing priority
  * for the configuration variable `key` and config set `cs`. When the
@@ -505,6 +514,14 @@ int repo_config_get_maybe_bool(struct repository *repo,
 int repo_config_get_pathname(struct repository *repo,
 			     const char *key, const char **dest);
 
+/*
+ * Functions for reading protected config. By definition, protected
+ * config ignores repository config, so it is unnecessary to read
+ * protected config from any `struct repository` other than
+ * the_repository.
+ */
+void git_protected_config(config_fn_t fn, void *data);
+
 /**
  * Querying For Specific Variables
  * -------------------------------
diff --git a/t/t5544-pack-objects-hook.sh b/t/t5544-pack-objects-hook.sh
index dd5f44d986f..54f54f8d2eb 100755
--- a/t/t5544-pack-objects-hook.sh
+++ b/t/t5544-pack-objects-hook.sh
@@ -56,7 +56,12 @@ test_expect_success 'hook does not run from repo config' '
 	! grep "hook running" stderr &&
 	test_path_is_missing .git/hook.args &&
 	test_path_is_missing .git/hook.stdin &&
-	test_path_is_missing .git/hook.stdout
+	test_path_is_missing .git/hook.stdout &&
+
+	# check that global config is used instead
+	test_config_global uploadpack.packObjectsHook ./hook &&
+	git clone --no-local . dst2.git 2>stderr &&
+	grep "hook running" stderr
 '
 
 test_expect_success 'hook works with partial clone' '
diff --git a/upload-pack.c b/upload-pack.c
index 3a851b36066..09f48317b02 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -1321,18 +1321,27 @@ static int upload_pack_config(const char *var, const char *value, void *cb_data)
 		data->advertise_sid = git_config_bool(var, value);
 	}
 
-	if (current_config_scope() != CONFIG_SCOPE_LOCAL &&
-	    current_config_scope() != CONFIG_SCOPE_WORKTREE) {
-		if (!strcmp("uploadpack.packobjectshook", var))
-			return git_config_string(&data->pack_objects_hook, var, value);
-	}
-
 	if (parse_object_filter_config(var, value, data) < 0)
 		return -1;
 
 	return parse_hide_refs_config(var, value, "uploadpack");
 }
 
+static int upload_pack_protected_config(const char *var, const char *value, void *cb_data)
+{
+	struct upload_pack_data *data = cb_data;
+
+	if (!strcmp("uploadpack.packobjectshook", var))
+		return git_config_string(&data->pack_objects_hook, var, value);
+	return 0;
+}
+
+static void get_upload_pack_config(struct upload_pack_data *data)
+{
+	git_config(upload_pack_config, data);
+	git_protected_config(upload_pack_protected_config, data);
+}
+
 void upload_pack(const int advertise_refs, const int stateless_rpc,
 		 const int timeout)
 {
@@ -1340,8 +1349,7 @@ void upload_pack(const int advertise_refs, const int stateless_rpc,
 	struct upload_pack_data data;
 
 	upload_pack_data_init(&data);
-
-	git_config(upload_pack_config, &data);
+	get_upload_pack_config(&data);
 
 	data.stateless_rpc = stateless_rpc;
 	data.timeout = timeout;
@@ -1695,8 +1703,7 @@ int upload_pack_v2(struct repository *r, struct packet_reader *request)
 
 	upload_pack_data_init(&data);
 	data.use_sideband = LARGE_PACKET_MAX;
-
-	git_config(upload_pack_config, &data);
+	get_upload_pack_config(&data);
 
 	while (state != FETCH_DONE) {
 		switch (state) {
-- 
gitgitgadget



* [PATCH v5 4/5] safe.directory: use git_protected_config()
  2022-06-27 18:36       ` [PATCH v5 " Glen Choo via GitGitGadget
                           ` (2 preceding siblings ...)
  2022-06-27 18:36         ` [PATCH v5 3/5] config: learn `git_protected_config()` Glen Choo via GitGitGadget
@ 2022-06-27 18:36         ` Glen Choo via GitGitGadget
  2022-06-27 18:36         ` [PATCH v5 5/5] setup.c: create `discovery.bare` Glen Choo via GitGitGadget
  2022-06-30 18:13         ` [PATCH v6 0/5] config: introduce discovery.bare and protected config Glen Choo via GitGitGadget
  5 siblings, 0 replies; 113+ messages in thread
From: Glen Choo via GitGitGadget @ 2022-06-27 18:36 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, brian m. carlson, Derrick Stolee, Junio C Hamano,
	Emily Shaffer, Jonathan Tan, Glen Choo, Glen Choo

From: Glen Choo <chooglen@google.com>

Use git_protected_config() to read `safe.directory` instead of
read_very_early_config(), making it 'protected configuration only'.

As a result, `safe.directory` now respects "-c", so update the tests and
docs accordingly. It used to ignore "-c" due to how it was implemented,
not because of security or correctness concerns [1].

[1] https://lore.kernel.org/git/xmqqlevabcsu.fsf@gitster.g/

Signed-off-by: Glen Choo <chooglen@google.com>
---
 Documentation/config/safe.txt |  6 +++---
 setup.c                       |  2 +-
 t/t0033-safe-directory.sh     | 24 ++++++++++--------------
 3 files changed, 14 insertions(+), 18 deletions(-)

diff --git a/Documentation/config/safe.txt b/Documentation/config/safe.txt
index fa02f3ccc54..f72b4408798 100644
--- a/Documentation/config/safe.txt
+++ b/Documentation/config/safe.txt
@@ -12,9 +12,9 @@ via `git config --add`. To reset the list of safe directories (e.g. to
 override any such directories specified in the system config), add a
 `safe.directory` entry with an empty value.
 +
-This config setting is only respected when specified in a system or global
-config, not when it is specified in a repository config, via the command
-line option `-c safe.directory=<path>`, or in environment variables.
+This config setting is only respected in protected configuration (see
+<<SCOPES>>). This prevents the untrusted repository from tampering with this
+value.
 +
 The value of this setting is interpolated, i.e. `~/<path>` expands to a
 path relative to the home directory and `%(prefix)/<path>` expands to a
diff --git a/setup.c b/setup.c
index faf5095e44d..c8e3c32814d 100644
--- a/setup.c
+++ b/setup.c
@@ -1137,7 +1137,7 @@ static int ensure_valid_ownership(const char *path)
 	    is_path_owned_by_current_user(path))
 		return 1;
 
-	read_very_early_config(safe_directory_cb, &data);
+	git_protected_config(safe_directory_cb, &data);
 
 	return data.is_safe;
 }
diff --git a/t/t0033-safe-directory.sh b/t/t0033-safe-directory.sh
index 238b25f91a3..5a1cd0d0947 100755
--- a/t/t0033-safe-directory.sh
+++ b/t/t0033-safe-directory.sh
@@ -16,24 +16,20 @@ test_expect_success 'safe.directory is not set' '
 	expect_rejected_dir
 '
 
-test_expect_success 'ignoring safe.directory on the command line' '
-	test_must_fail git -c safe.directory="$(pwd)" status 2>err &&
-	grep "unsafe repository" err
+test_expect_success 'safe.directory on the command line' '
+	git -c safe.directory="$(pwd)" status
 '
 
-test_expect_success 'ignoring safe.directory in the environment' '
-	test_must_fail env GIT_CONFIG_COUNT=1 \
-		GIT_CONFIG_KEY_0="safe.directory" \
-		GIT_CONFIG_VALUE_0="$(pwd)" \
-		git status 2>err &&
-	grep "unsafe repository" err
+test_expect_success 'safe.directory in the environment' '
+	env GIT_CONFIG_COUNT=1 \
+	    GIT_CONFIG_KEY_0="safe.directory" \
+	    GIT_CONFIG_VALUE_0="$(pwd)" \
+	    git status
 '
 
-test_expect_success 'ignoring safe.directory in GIT_CONFIG_PARAMETERS' '
-	test_must_fail env \
-		GIT_CONFIG_PARAMETERS="${SQ}safe.directory${SQ}=${SQ}$(pwd)${SQ}" \
-		git status 2>err &&
-	grep "unsafe repository" err
+test_expect_success 'safe.directory in GIT_CONFIG_PARAMETERS' '
+	env GIT_CONFIG_PARAMETERS="${SQ}safe.directory${SQ}=${SQ}$(pwd)${SQ}" \
+	    git status
 '
 
 test_expect_success 'ignoring safe.directory in repo config' '
-- 
gitgitgadget



* [PATCH v5 5/5] setup.c: create `discovery.bare`
  2022-06-27 18:36       ` [PATCH v5 " Glen Choo via GitGitGadget
                           ` (3 preceding siblings ...)
  2022-06-27 18:36         ` [PATCH v5 4/5] safe.directory: use git_protected_config() Glen Choo via GitGitGadget
@ 2022-06-27 18:36         ` Glen Choo via GitGitGadget
  2022-06-30 13:20           ` Ævar Arnfjörð Bjarmason
  2022-06-30 18:13         ` [PATCH v6 0/5] config: introduce discovery.bare and protected config Glen Choo via GitGitGadget
  5 siblings, 1 reply; 113+ messages in thread
From: Glen Choo via GitGitGadget @ 2022-06-27 18:36 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, brian m. carlson, Derrick Stolee, Junio C Hamano,
	Emily Shaffer, Jonathan Tan, Glen Choo, Glen Choo

From: Glen Choo <chooglen@google.com>

There is a known social engineering attack that takes advantage of the
fact that a working tree can include an entire bare repository,
including a config file. A user could run a Git command inside the bare
repository thinking that the config file of the 'outer' repository would
be used, but in reality, the bare repository's config file (which is
attacker-controlled) is used, which may result in arbitrary code
execution. See [1] for a fuller description and deeper discussion.

A simple mitigation is to forbid bare repositories unless specified via
`--git-dir` or `GIT_DIR`. In environments that don't use bare
repositories, this would be minimally disruptive.

Create a config variable, `discovery.bare`, that tells Git whether or
not to die() when it discovers a bare repository. This only affects
repository discovery, thus it has no effect if discovery was not
done, e.g. if the user passes `--git-dir=my-dir`, discovery will be
skipped and my-dir will be used as the repo regardless of the
`discovery.bare` value.

This config is an enum of:

- "always": always allow bare repositories (this is the default)
- "never": never allow bare repositories

If we want to protect users from such attacks by default, neither value
will suffice - "always" provides no protection, but "never" is
impractical for bare repository users. A more usable default would be to
allow only non-embedded bare repositories ([2] contains one such
proposal), but detecting if a repository is embedded is potentially
non-trivial, so this work is not implemented in this series.

[1]: https://lore.kernel.org/git/kl6lsfqpygsj.fsf@chooglen-macbookpro.roam.corp.google.com
[2]: https://lore.kernel.org/git/5b969c5e-e802-c447-ad25-6acc0b784582@github.com

Signed-off-by: Glen Choo <chooglen@google.com>
---
 Documentation/config.txt           |  2 +
 Documentation/config/discovery.txt | 23 ++++++++++
 setup.c                            | 57 ++++++++++++++++++++++++-
 t/t0035-discovery-bare.sh          | 68 ++++++++++++++++++++++++++++++
 4 files changed, 149 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/config/discovery.txt
 create mode 100755 t/t0035-discovery-bare.sh

diff --git a/Documentation/config.txt b/Documentation/config.txt
index e284b042f22..9a5e1329772 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -409,6 +409,8 @@ include::config/diff.txt[]
 
 include::config/difftool.txt[]
 
+include::config/discovery.txt[]
+
 include::config/extensions.txt[]
 
 include::config/fastimport.txt[]
diff --git a/Documentation/config/discovery.txt b/Documentation/config/discovery.txt
new file mode 100644
index 00000000000..bbcf89bb0b5
--- /dev/null
+++ b/Documentation/config/discovery.txt
@@ -0,0 +1,23 @@
+discovery.bare::
+	Specifies whether Git will work with a bare repository that it
+	found during repository discovery. If the repository is
+	specified directly via the --git-dir command-line option or the
+	GIT_DIR environment variable (see linkgit:git[1]), Git will
+	always use the specified repository, regardless of this value.
++
+This config setting is only respected in protected configuration (see
+<<SCOPES>>). This prevents the untrusted repository from tampering with
+this value.
++
+The currently supported values are:
++
+* `always`: Git always works with bare repositories
+* `never`: Git never works with bare repositories
++
+This defaults to `always`, but this default may change in the future.
++
+If you do not use bare repositories in your workflow, then it may be
+beneficial to set `discovery.bare` to `never` in your global config.
+This will protect you from attacks that involve cloning a repository
+that contains a bare repository and running a Git command within that
+directory.
diff --git a/setup.c b/setup.c
index c8e3c32814d..16938fd5a24 100644
--- a/setup.c
+++ b/setup.c
@@ -10,6 +10,10 @@
 static int inside_git_dir = -1;
 static int inside_work_tree = -1;
 static int work_tree_config_is_bogus;
+enum discovery_bare_allowed {
+	DISCOVERY_BARE_NEVER = 0,
+	DISCOVERY_BARE_ALWAYS,
+};
 
 static struct startup_info the_startup_info;
 struct startup_info *startup_info = &the_startup_info;
@@ -1142,6 +1146,46 @@ static int ensure_valid_ownership(const char *path)
 	return data.is_safe;
 }
 
+static int discovery_bare_cb(const char *key, const char *value, void *d)
+{
+	enum discovery_bare_allowed *discovery_bare_allowed = d;
+
+	if (strcmp(key, "discovery.bare"))
+		return 0;
+
+	if (!strcmp(value, "never")) {
+		*discovery_bare_allowed = DISCOVERY_BARE_NEVER;
+		return 0;
+	}
+	if (!strcmp(value, "always")) {
+		*discovery_bare_allowed = DISCOVERY_BARE_ALWAYS;
+		return 0;
+	}
+	return -1;
+}
+
+static enum discovery_bare_allowed get_discovery_bare(void)
+{
+	enum discovery_bare_allowed result = DISCOVERY_BARE_ALWAYS;
+	git_protected_config(discovery_bare_cb, &result);
+	return result;
+}
+
+static const char *discovery_bare_allowed_to_string(
+	enum discovery_bare_allowed discovery_bare_allowed)
+{
+	switch (discovery_bare_allowed) {
+	case DISCOVERY_BARE_NEVER:
+		return "never";
+	case DISCOVERY_BARE_ALWAYS:
+		return "always";
+	default:
+		BUG("invalid discovery_bare_allowed %d",
+		    discovery_bare_allowed);
+	}
+	return NULL;
+}
+
 enum discovery_result {
 	GIT_DIR_NONE = 0,
 	GIT_DIR_EXPLICIT,
@@ -1151,7 +1195,8 @@ enum discovery_result {
 	GIT_DIR_HIT_CEILING = -1,
 	GIT_DIR_HIT_MOUNT_POINT = -2,
 	GIT_DIR_INVALID_GITFILE = -3,
-	GIT_DIR_INVALID_OWNERSHIP = -4
+	GIT_DIR_INVALID_OWNERSHIP = -4,
+	GIT_DIR_DISALLOWED_BARE = -5,
 };
 
 /*
@@ -1248,6 +1293,8 @@ static enum discovery_result setup_git_directory_gently_1(struct strbuf *dir,
 		}
 
 		if (is_git_directory(dir->buf)) {
+			if (!get_discovery_bare())
+				return GIT_DIR_DISALLOWED_BARE;
 			if (!ensure_valid_ownership(dir->buf))
 				return GIT_DIR_INVALID_OWNERSHIP;
 			strbuf_addstr(gitdir, ".");
@@ -1394,6 +1441,14 @@ const char *setup_git_directory_gently(int *nongit_ok)
 		}
 		*nongit_ok = 1;
 		break;
+	case GIT_DIR_DISALLOWED_BARE:
+		if (!nongit_ok) {
+			die(_("cannot use bare repository '%s' (discovery.bare is '%s')"),
+			    dir.buf,
+			    discovery_bare_allowed_to_string(get_discovery_bare()));
+		}
+		*nongit_ok = 1;
+		break;
 	case GIT_DIR_NONE:
 		/*
 		 * As a safeguard against setup_git_directory_gently_1 returning
diff --git a/t/t0035-discovery-bare.sh b/t/t0035-discovery-bare.sh
new file mode 100755
index 00000000000..0b345d361e6
--- /dev/null
+++ b/t/t0035-discovery-bare.sh
@@ -0,0 +1,68 @@
+#!/bin/sh
+
+test_description='verify discovery.bare checks'
+
+. ./test-lib.sh
+
+pwd="$(pwd)"
+
+expect_accepted () {
+	git "$@" rev-parse --git-dir
+}
+
+expect_rejected () {
+	test_must_fail git "$@" rev-parse --git-dir 2>err &&
+	grep "discovery.bare" err
+}
+
+test_expect_success 'setup bare repo in worktree' '
+	git init outer-repo &&
+	git init --bare outer-repo/bare-repo
+'
+
+test_expect_success 'discovery.bare unset' '
+	(
+		cd outer-repo/bare-repo &&
+		expect_accepted
+	)
+'
+
+test_expect_success 'discovery.bare=always' '
+	git config --global discovery.bare always &&
+	(
+		cd outer-repo/bare-repo &&
+		expect_accepted
+	)
+'
+
+test_expect_success 'discovery.bare=never' '
+	git config --global discovery.bare never &&
+	(
+		cd outer-repo/bare-repo &&
+		expect_rejected
+	)
+'
+
+test_expect_success 'discovery.bare in the repository' '
+	(
+		cd outer-repo/bare-repo &&
+		# Temporarily set discovery.bare=always, otherwise git
+		# config fails with "fatal: not in a git directory"
+		# (like safe.directory)
+		git config --global discovery.bare always &&
+		git config discovery.bare always &&
+		git config --global discovery.bare never &&
+		expect_rejected
+	)
+'
+
+test_expect_success 'discovery.bare on the command line' '
+	git config --global discovery.bare never &&
+	(
+		cd outer-repo/bare-repo &&
+		expect_accepted -c discovery.bare=always &&
+		expect_rejected -c discovery.bare=
+	)
+'
+
+test_done
-- 
gitgitgadget


* Re: [PATCH v5 5/5] setup.c: create `discovery.bare`
  2022-06-27 18:36         ` [PATCH v5 5/5] setup.c: create `discovery.bare` Glen Choo via GitGitGadget
@ 2022-06-30 13:20           ` Ævar Arnfjörð Bjarmason
  2022-06-30 17:28             ` Glen Choo
  0 siblings, 1 reply; 113+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-06-30 13:20 UTC (permalink / raw)
  To: Glen Choo via GitGitGadget
  Cc: git, Taylor Blau, brian m. carlson, Derrick Stolee,
	Junio C Hamano, Emily Shaffer, Jonathan Tan, Glen Choo


On Mon, Jun 27 2022, Glen Choo via GitGitGadget wrote:

> From: Glen Choo <chooglen@google.com>

> diff --git a/t/t0035-discovery-bare.sh b/t/t0035-discovery-bare.sh
> new file mode 100755
> index 00000000000..0b345d361e6
> --- /dev/null
> +++ b/t/t0035-discovery-bare.sh
> @@ -0,0 +1,68 @@
> +#!/bin/sh
> +
> +test_description='verify discovery.bare checks'
> +

You're missing a:

	TEST_PASSES_SANITIZE_LEAK=true

Above this line:

> +. ./test-lib.sh

Which tells us that this new test doesn't leak (yay!)

> +expect_accepted () {
> +	git "$@" rev-parse --git-dir
> +}

I think we can do away with this helper, we use the argument support
once, and for the rest we can inline the trivial command...

> +
> +expect_rejected () {
> +	test_must_fail git "$@" rev-parse --git-dir 2>err &&
> +	grep "discovery.bare" err

grep -F ?

This helper is less trivial, but more obvious would be a "run command
and assert xyz about the output" helper, see
e.g. test_stdout_line_count.


> +test_expect_success 'discovery.bare unset' '
> +	(
> +		cd outer-repo/bare-repo &&
> +		expect_accepted
> +	)

Also: Odd to use a sub-shell when the helper takes -C...

> +'
> +
> +test_expect_success 'discovery.bare=always' '
> +	git config --global discovery.bare always &&
> +	(
> +		cd outer-repo/bare-repo &&
> +		expect_accepted
> +	)
> +'
> +
> +test_expect_success 'discovery.bare=never' '
> +	git config --global discovery.bare never &&
> +	(
> +		cd outer-repo/bare-repo &&
> +		expect_rejected
> +	)

...ditto...


> +'
> +
> +test_expect_success 'discovery.bare in the repository' '
> +	(
> +		cd outer-repo/bare-repo &&
> +		# Temporarily set discovery.bare=always, otherwise git
> +		# config fails with "fatal: not in a git directory"
> +		# (like safe.directory)
> +		git config --global discovery.bare always &&
> +		git config discovery.bare always &&
> +		git config --global discovery.bare never &&
> +		expect_rejected
> +	)

Drop the sub-shell and use test_config?

> +'
> +
> +test_expect_success 'discovery.bare on the command line' '
> +	git config --global discovery.bare never &&
> +	(
> +		cd outer-repo/bare-repo &&
> +		expect_accepted -c discovery.bare=always &&
> +		expect_rejected -c discovery.bare=
> +	)
> +'
> +
> +test_done



* Re: [PATCH v5 5/5] setup.c: create `discovery.bare`
  2022-06-30 13:20           ` Ævar Arnfjörð Bjarmason
@ 2022-06-30 17:28             ` Glen Choo
  0 siblings, 0 replies; 113+ messages in thread
From: Glen Choo @ 2022-06-30 17:28 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason, Glen Choo via GitGitGadget
  Cc: git, Taylor Blau, brian m. carlson, Derrick Stolee,
	Junio C Hamano, Emily Shaffer, Jonathan Tan

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> On Mon, Jun 27 2022, Glen Choo via GitGitGadget wrote:
>
>> From: Glen Choo <chooglen@google.com>
>
>> diff --git a/t/t0035-discovery-bare.sh b/t/t0035-discovery-bare.sh
>> new file mode 100755
>> index 00000000000..0b345d361e6
>> --- /dev/null
>> +++ b/t/t0035-discovery-bare.sh
>> @@ -0,0 +1,68 @@
>> +#!/bin/sh
>> +
>> +test_description='verify discovery.bare checks'
>> +
>
> You're missing a:
>
> 	TEST_PASSES_SANITIZE_LEAK=true
>
> Above this line:
>
>> +. ./test-lib.sh
>
> Which tells us that this new test doesn't leak (yay!)

Ah, thanks! Hooray.

>> +expect_accepted () {
>> +	git "$@" rev-parse --git-dir
>> +}
>
> I think we can do away with this helper, we use the argument support
> once, and for the rest we can inline the trivial command...

That is true, having fewer test helpers can be a good idea. Though in
this case, the helper wins out slightly (IMO at least) because of the 
readability/refactoring benefit.

>> +
>> +expect_rejected () {
>> +	test_must_fail git "$@" rev-parse --git-dir 2>err &&
>> +	grep "discovery.bare" err
>
> grep -F ?
>
> This helper is less trivial, but more obvious would be a "run command
> and assert xyz about the output" helper, see
> e.g. test_stdout_line_count.

This follows the precedent of t0033, which does the same "run command and
grep the result". And just as I typed this out, I remembered that
t0033's corresponding test helper was made more specific in f62563988f
(t0033-safe-directory: check the error message without matching the
trash dir, 2022-04-27), because just grep-ing for the config variable
masked some errors.

It turns out the same thing is happening in the last test - I forgot
that "-c" doesn't unset the variable (it sets the value to ''), and the
test_must_fail passes because we fail to parse "discovery.bare", _not_
because we forbade the repo.
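
To make the two failure modes concrete, here is a minimal sketch of the
value parsing, modeled on discovery_bare_cb() from the patch. The names
bare_policy and parse_bare_policy are hypothetical stand-ins, and the
NULL guard is an addition that is not in the patch:

```c
#include <string.h>

/* Hypothetical stand-in for enum discovery_bare_allowed in the patch. */
enum bare_policy {
	BARE_NEVER = 0,
	BARE_ALWAYS,
};

/*
 * Parse a discovery.bare value. Returns 0 and stores the policy on
 * success, or -1 if the value is not recognized. The empty string,
 * which is what "-c discovery.bare=" supplies, is not recognized, so
 * it takes the error path instead of behaving like "unset".
 */
int parse_bare_policy(const char *value, enum bare_policy *out)
{
	if (!value)
		return -1;	/* assumption: treat a valueless key as invalid */
	if (!strcmp(value, "never")) {
		*out = BARE_NEVER;
		return 0;
	}
	if (!strcmp(value, "always")) {
		*out = BARE_ALWAYS;
		return 0;
	}
	return -1;
}
```

With this shape, "never" yields a policy that rejects the repository,
while "" yields a parse error. Both make the command fail, and both
error messages mention "discovery.bare", which is why grepping for the
variable name alone cannot tell them apart.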

So besides -F, I think the only change here would be to grep on the
specific "cannot use bare repository" message (instead of grepping for
"discovery.bare").

>> +test_expect_success 'discovery.bare unset' '
>> +	(
>> +		cd outer-repo/bare-repo &&
>> +		expect_accepted
>> +	)
>
> Also: Odd to use a sub-shell when the helper takes -C...
>
>> +'
>> +
>> +test_expect_success 'discovery.bare=always' '
>> +	git config --global discovery.bare always &&
>> +	(
>> +		cd outer-repo/bare-repo &&
>> +		expect_accepted
>> +	)
>> +'
>> +
>> +test_expect_success 'discovery.bare=never' '
>> +	git config --global discovery.bare never &&
>> +	(
>> +		cd outer-repo/bare-repo &&
>> +		expect_rejected
>> +	)
>
> ...ditto...

Ok, I'll drop the sub-shell.

>
>> +'
>> +
>> +test_expect_success 'discovery.bare in the repository' '
>> +	(
>> +		cd outer-repo/bare-repo &&
>> +		# Temporarily set discovery.bare=always, otherwise git
>> +		# config fails with "fatal: not in a git directory"
>> +		# (like safe.directory)
>> +		git config --global discovery.bare always &&
>> +		git config discovery.bare always &&
>> +		git config --global discovery.bare never &&
>> +		expect_rejected
>> +	)
>
> Drop the sub-shell and use test_config?

Oh, I was so focused on t0033 that I hadn't realized that we had
test_config_global. Thanks :)

>> +'
>> +
>> +test_expect_success 'discovery.bare on the command line' '
>> +	git config --global discovery.bare never &&
>> +	(
>> +		cd outer-repo/bare-repo &&
>> +		expect_accepted -c discovery.bare=always &&
>> +		expect_rejected -c discovery.bare=
>> +	)
>> +'
>> +
>> +test_done


* [PATCH v6 0/5] config: introduce discovery.bare and protected config
  2022-06-27 18:36       ` [PATCH v5 " Glen Choo via GitGitGadget
                           ` (4 preceding siblings ...)
  2022-06-27 18:36         ` [PATCH v5 5/5] setup.c: create `discovery.bare` Glen Choo via GitGitGadget
@ 2022-06-30 18:13         ` Glen Choo via GitGitGadget
  2022-06-30 18:13           ` [PATCH v6 1/5] Documentation/git-config.txt: add SCOPES section Glen Choo via GitGitGadget
                             ` (7 more replies)
  5 siblings, 8 replies; 113+ messages in thread
From: Glen Choo via GitGitGadget @ 2022-06-30 18:13 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, brian m. carlson, Derrick Stolee, Junio C Hamano,
	Emily Shaffer, Jonathan Tan,
	Ævar Arnfjörð Bjarmason, Glen Choo

This is a quick re-roll to address Ævar's comments on the tests (thanks!).

This version is mostly simple refactoring, and I also removed a useless test
assertion (see "Series history").

= Description

There is a known social engineering attack that takes advantage of the fact
that a working tree can include an entire bare repository, including a
config file. A user could run a Git command inside the bare repository
thinking that the config file of the 'outer' repository would be used, but
in reality, the bare repository's config file (which is attacker-controlled)
is used, which may result in arbitrary code execution. See [1] for a fuller
description and deeper discussion.

This series implements a simple way of preventing such attacks: create a
config option, discovery.bare, that tells Git whether or not to die when it
finds a bare repository. discovery.bare has two values:

 * "always": always allow bare repositories (default), identical to current
   behavior
 * "never": never allow bare repositories

and users/system administrators who never expect to work with bare
repositories can secure their environments using "never". discovery.bare has
no effect if --git-dir or GIT_DIR is passed because we are confident that
the user is not confused about which repository is being used.
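
As a rough illustration of the rule above, the decision can be modeled
as a small predicate. This is a simplified sketch and not the code in
setup.c; repo_allowed and bare_policy are hypothetical names:

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two discovery.bare values. */
enum bare_policy {
	BARE_NEVER = 0,
	BARE_ALWAYS,
};

/*
 * Decide whether a repository may be used.
 *   explicit_git_dir: the user passed --git-dir or set GIT_DIR
 *   is_bare:          discovery landed on a bare repository
 */
bool repo_allowed(bool explicit_git_dir, bool is_bare,
		  enum bare_policy policy)
{
	if (explicit_git_dir)
		return true;	/* discovery is skipped, policy is not consulted */
	if (is_bare && policy == BARE_NEVER)
		return false;	/* discovered bare repository is rejected */
	return true;
}
```

Under this model, "never" only ever affects the discovered-and-bare
case; non-bare repositories and explicitly named repositories are
accepted regardless of the setting.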

This series does not change the default behavior, but in the long run, a
"no-embedded" option might be a safe and usable default [2]. "never" is too
restrictive and unlikely to be the default.

For security reasons, discovery.bare cannot be read from repository-level
config (because we would end up trusting the embedded bare repository that
we aren't supposed to trust to begin with). Since this would introduce a 3rd
variable that is only read from 'protected/trusted configuration' (the
others are safe.directory and uploadpack.packObjectsHook) this series also
defines and creates a shared implementation for 'protected configuration'.
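
The effect of reading a variable only from protected configuration can
be sketched as a filter over config entries. This is an illustrative
model, not the implementation in config.c; the scope enum, struct entry
and protected_lookup are hypothetical names, and entries are assumed to
arrive in read order (system, global, local, worktree, command) with
the last value winning:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical scope tags, in the order config sources are read. */
enum scope {
	SCOPE_SYSTEM,
	SCOPE_GLOBAL,
	SCOPE_LOCAL,
	SCOPE_WORKTREE,
	SCOPE_COMMAND,
};

struct entry {
	enum scope scope;
	const char *value;
};

/* Protected configuration is the system, global and command scopes. */
bool scope_is_protected(enum scope s)
{
	return s == SCOPE_SYSTEM || s == SCOPE_GLOBAL || s == SCOPE_COMMAND;
}

/*
 * Return the last value found in a protected scope, or the fallback
 * if there is none. Entries from local and worktree scopes are
 * skipped, so a repository cannot override the result.
 */
const char *protected_lookup(const struct entry *entries, size_t n,
			     const char *fallback)
{
	const char *result = fallback;
	size_t i;

	for (i = 0; i < n; i++)
		if (scope_is_protected(entries[i].scope))
			result = entries[i].value;
	return result;
}
```

In this model, a global "never" followed by a repository-level "always"
resolves to "never", whereas a later "-c discovery.bare=always" in the
command scope does take effect.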

= Patch organization

 * Patch 1 adds a section on configuration scopes to our docs
 * Patches 2-3 define 'protected configuration' and create a shared
   implementation.
 * Patch 4 refactors safe.directory to use protected configuration
 * Patch 5 adds discovery.bare

= Series history

Changes in v6:

 * Add TEST_PASSES_SANITIZE_LEAK=true
 * Replace all sub-shells with -C and use test_config_global
 * Change the expect_rejected helper to use "grep -F" with a more specific
   message.
   * This reveals that the "-c discovery.bare=" assertion in the last test
     was passing for the wrong reason (because '' is an invalid value for
     "discovery.bare"). I removed it because it wasn't doing anything useful
     anyway - I was trying to make discovery.bare unset in the command line,
     but the whole point of that test is to assert that we respect the CLI
     arg.

Changes in v5:

 * Standardize the usage of "protected configuration" instead of mixing
   "config" and "configuration". This required some unfortunate rewrapping.
 * Remove mentions of "trustworthiness" when discussing protected
   configuration and focus on what Git does instead.
   * The rationale of protected vs non-protected is still kept.
 * Fix the stale documentation entry for discovery.bare.
 * Include a fuller description of how discovery.bare and "--git-dir"
   interact instead of saying "has no effect".

Changes in v4:

 * 2/5's commit message now justifies what scopes are included in protected
   config
 * The global configset is now a file-scope static inside config.c
   (previously it was a member of the_repository).
 * Rename discovery_bare_config to discovery_bare_allowed
 * Make discovery_bare_allowed function-scoped (instead of global).
 * Add an expect_accepted helper to the discovery.bare tests.
 * Add a helper to "upload-pack" that reads the protected and non-protected
   config

Changes in v3:

 * Rebase onto a more recent 'master'
 * Reframe this feature only in terms of the 'embedded bare repo' attack.
 * Other docs improvements (thanks Stolee in particular!)
 * Protected config no longer uses read_very_early_config() and is only read
   once
 * Protected config now includes "-c"
 * uploadpack.packObjectsHook now uses protected config instead of ignoring
   repo config using config scopes

Changes in v2:

 * Rename safe.barerepository to discovery.bare and make it die()
 * Move tests into t/t0034-discovery-bare.sh
 * Avoid unnecessary config reading by using a static variable
 * Add discovery.bare=cwd
 * Fix typos

= Future work

 * This series does not implement the "no-embedded" option [2] and I won't
   work on it any time soon, but I'd be more than happy to review if someone
   sends patches.
 * With discovery.bare, if a builtin is marked RUN_SETUP_GENTLY, setup.c
   doesn't die() and we don't tell users why their repository was rejected,
   e.g. "git config" gives an opaque "fatal: not in a git directory". This
   isn't a new problem though, since safe.directory has the same issue.
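
The RUN_SETUP_GENTLY limitation in the second item can be sketched as
follows. This is a simplified model of the GIT_DIR_DISALLOWED_BARE case
in setup_git_directory_gently(); the outcome enum and the function name
are hypothetical, and die() is represented by a return value so that
the sketch stays self-contained:

```c
#include <stddef.h>

/* Hypothetical outcomes of rejecting a discovered bare repository. */
enum outcome {
	OUTCOME_DIE_WITH_REASON,	/* die() naming discovery.bare */
	OUTCOME_NOT_A_REPO,		/* caller only learns "no repository" */
};

/*
 * nongit_ok == NULL: the caller requires a repository, so setup can
 * report why the repository was rejected.
 * nongit_ok != NULL: the caller asked for gentle setup, so the reason
 * is dropped and only *nongit_ok = 1 is reported.
 */
enum outcome handle_disallowed_bare(int *nongit_ok)
{
	if (!nongit_ok)
		return OUTCOME_DIE_WITH_REASON;
	*nongit_ok = 1;
	return OUTCOME_NOT_A_REPO;
}
```

In the gentle case the builtin later fails on its own with a generic
message, which is why the user sees "not in a git directory" rather
than a message mentioning discovery.bare.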

[1]
https://lore.kernel.org/git/kl6lsfqpygsj.fsf@chooglen-macbookpro.roam.corp.google.com

[2] This was first suggested in
https://lore.kernel.org/git/5b969c5e-e802-c447-ad25-6acc0b784582@github.com

Glen Choo (5):
  Documentation/git-config.txt: add SCOPES section
  Documentation: define protected configuration
  config: learn `git_protected_config()`
  safe.directory: use git_protected_config()
  setup.c: create `discovery.bare`

 Documentation/config.txt            |  2 +
 Documentation/config/discovery.txt  | 23 +++++++++
 Documentation/config/safe.txt       |  6 +--
 Documentation/config/uploadpack.txt |  6 +--
 Documentation/git-config.txt        | 77 +++++++++++++++++++++++------
 config.c                            | 51 +++++++++++++++++++
 config.h                            | 17 +++++++
 setup.c                             | 59 +++++++++++++++++++++-
 t/t0033-safe-directory.sh           | 24 ++++-----
 t/t0035-discovery-bare.sh           | 52 +++++++++++++++++++
 t/t5544-pack-objects-hook.sh        |  7 ++-
 upload-pack.c                       | 27 ++++++----
 12 files changed, 304 insertions(+), 47 deletions(-)
 create mode 100644 Documentation/config/discovery.txt
 create mode 100755 t/t0035-discovery-bare.sh


base-commit: f770e9f396d48b567ef7b37d273e91ad570a3522
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1261%2Fchooglen%2Fsetup%2Fdisable-bare-repo-config-v6
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1261/chooglen/setup/disable-bare-repo-config-v6
Pull-Request: https://github.com/git/git/pull/1261

Range-diff vs v5:

 1:  ee9619f6ec0 = 1:  ee9619f6ec0 Documentation/git-config.txt: add SCOPES section
 2:  43627c05c0b = 2:  43627c05c0b Documentation: define protected configuration
 3:  3efe282e6b9 = 3:  3efe282e6b9 config: learn `git_protected_config()`
 4:  ec925823414 = 4:  ec925823414 safe.directory: use git_protected_config()
 5:  14411512783 ! 5:  a1323d963f9 setup.c: create `discovery.bare`
     @@ t/t0035-discovery-bare.sh (new)
      +
      +test_description='verify discovery.bare checks'
      +
     ++TEST_PASSES_SANITIZE_LEAK=true
      +. ./test-lib.sh
      +
      +pwd="$(pwd)"
     @@ t/t0035-discovery-bare.sh (new)
      +
      +expect_rejected () {
      +	test_must_fail git "$@" rev-parse --git-dir 2>err &&
     -+	grep "discovery.bare" err
     ++	grep -F "cannot use bare repository" err
      +}
      +
      +test_expect_success 'setup bare repo in worktree' '
     @@ t/t0035-discovery-bare.sh (new)
      +'
      +
      +test_expect_success 'discovery.bare unset' '
     -+	(
     -+		cd outer-repo/bare-repo &&
     -+		expect_accepted
     -+	)
     ++	expect_accepted -C outer-repo/bare-repo
      +'
      +
      +test_expect_success 'discovery.bare=always' '
     -+	git config --global discovery.bare always &&
     -+	(
     -+		cd outer-repo/bare-repo &&
     -+		expect_accepted
     -+	)
     ++	test_config_global discovery.bare always &&
     ++	expect_accepted -C outer-repo/bare-repo
      +'
      +
      +test_expect_success 'discovery.bare=never' '
     -+	git config --global discovery.bare never &&
     -+	(
     -+		cd outer-repo/bare-repo &&
     -+		expect_rejected
     -+	)
     ++	test_config_global discovery.bare never &&
     ++	expect_rejected -C outer-repo/bare-repo
      +'
      +
      +test_expect_success 'discovery.bare in the repository' '
     -+	(
     -+		cd outer-repo/bare-repo &&
     -+		# Temporarily set discovery.bare=always, otherwise git
     -+		# config fails with "fatal: not in a git directory"
     -+		# (like safe.directory)
     -+		git config --global discovery.bare always &&
     -+		git config discovery.bare always &&
     -+		git config --global discovery.bare never &&
     -+		expect_rejected
     -+	)
     ++	# discovery.bare must not be "never", otherwise git config fails
     ++	# with "fatal: not in a git directory" (like safe.directory)
     ++	test_config -C outer-repo/bare-repo discovery.bare always &&
     ++	test_config_global discovery.bare never &&
     ++	expect_rejected -C outer-repo/bare-repo
      +'
      +
      +test_expect_success 'discovery.bare on the command line' '
     -+	git config --global discovery.bare never &&
     -+	(
     -+		cd outer-repo/bare-repo &&
     -+		expect_accepted -c discovery.bare=always &&
     -+		expect_rejected -c discovery.bare=
     -+	)
     ++	test_config_global discovery.bare never &&
     ++	expect_accepted -C outer-repo/bare-repo \
     ++		-c discovery.bare=always
      +'
      +
      +test_done

-- 
gitgitgadget


* [PATCH v6 1/5] Documentation/git-config.txt: add SCOPES section
  2022-06-30 18:13         ` [PATCH v6 0/5] config: introduce discovery.bare and protected config Glen Choo via GitGitGadget
@ 2022-06-30 18:13           ` Glen Choo via GitGitGadget
  2022-06-30 22:32             ` Taylor Blau
  2022-06-30 18:13           ` [PATCH v6 2/5] Documentation: define protected configuration Glen Choo via GitGitGadget
                             ` (6 subsequent siblings)
  7 siblings, 1 reply; 113+ messages in thread
From: Glen Choo via GitGitGadget @ 2022-06-30 18:13 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, brian m. carlson, Derrick Stolee, Junio C Hamano,
	Emily Shaffer, Jonathan Tan,
	Ævar Arnfjörð Bjarmason, Glen Choo, Glen Choo

From: Glen Choo <chooglen@google.com>

In a subsequent commit, we will introduce "protected configuration",
which is easiest to describe in terms of configuration scopes (i.e. it's
the union of the 'system', 'global', and 'command' scopes). This
description is fine for ML discussions, but it's inadequate for end
users because we don't provide a good description of "configuration
scopes" in the public docs.

145d59f482 (config: add '--show-scope' to print the scope of a config
value, 2020-02-10) introduced the word "scope" to our public docs, but
that only enumerates the scopes and assumes the user can figure out
what those values mean.

Add a SCOPES section to Documentation/git-config.txt that describes the
configuration scopes, their corresponding CLI options, and mentions that
some configuration options are only respected in certain scopes. Then,
use the word "scope" to simplify the FILES section and change some
confusing wording.

Signed-off-by: Glen Choo <chooglen@google.com>
---
 Documentation/git-config.txt | 64 ++++++++++++++++++++++++++++--------
 1 file changed, 50 insertions(+), 14 deletions(-)

diff --git a/Documentation/git-config.txt b/Documentation/git-config.txt
index 9376e39aef2..f93d437b898 100644
--- a/Documentation/git-config.txt
+++ b/Documentation/git-config.txt
@@ -297,8 +297,8 @@ The default is to use a pager.
 FILES
 -----
 
-If not set explicitly with `--file`, there are four files where
-'git config' will search for configuration options:
+By default, 'git config' will read configuration options from multiple
+files:
 
 $(prefix)/etc/gitconfig::
 	System-wide configuration file.
@@ -322,27 +322,63 @@ $GIT_DIR/config.worktree::
 	This is optional and is only searched when
 	`extensions.worktreeConfig` is present in $GIT_DIR/config.
 
-If no further options are given, all reading options will read all of these
-files that are available. If the global or the system-wide configuration
-file are not available they will be ignored. If the repository configuration
-file is not available or readable, 'git config' will exit with a non-zero
-error code. However, in neither case will an error message be issued.
+You may also provide additional configuration parameters when running any
+git command by using the `-c` option. See linkgit:git[1] for details.
+
+Options will be read from all of these files that are available. If the
+global or the system-wide configuration file are not available they will be
+ignored. If the repository configuration file is not available or readable,
+'git config' will exit with a non-zero error code. However, in neither case
+will an error message be issued.
 
 The files are read in the order given above, with last value found taking
 precedence over values read earlier.  When multiple values are taken then all
 values of a key from all files will be used.
 
-You may override individual configuration parameters when running any git
-command by using the `-c` option. See linkgit:git[1] for details.
-
-All writing options will per default write to the repository specific
+By default, options are only written to the repository specific
 configuration file. Note that this also affects options like `--replace-all`
 and `--unset`. *'git config' will only ever change one file at a time*.
 
-You can override these rules using the `--global`, `--system`,
-`--local`, `--worktree`, and `--file` command-line options; see
-<<OPTIONS>> above.
+You can change the way options are read/written by specifying the path to a
+file (`--file`), or by specifying a configuration scope (`--system`,
+`--global`, `--local`, `--worktree`); see <<OPTIONS>> above.
+
+SCOPES
+------
+
+Each configuration source falls within a configuration scope. The scopes
+are:
+
+system::
+	$(prefix)/etc/gitconfig
+
+global::
+	$XDG_CONFIG_HOME/git/config
++
+~/.gitconfig
+
+local::
+	$GIT_DIR/config
+
+worktree::
+	$GIT_DIR/config.worktree
+
+command::
+	environment variables
++
+the `-c` option
+
+With the exception of 'command', each scope corresponds to a command line
+option - `--system`, `--global`, `--local`, `--worktree`.
+
+When reading options, specifying a scope will only read options from the
+files within that scope. When writing options, specifying a scope will write
+to the files within that scope (instead of the repository specific
+configuration file). See <<OPTIONS>> above for a complete description.
 
+Most configuration options are respected regardless of the scope it is
+defined in, but some options are only respected in certain scopes. See the
+option's documentation for the full details.
 
 ENVIRONMENT
 -----------
-- 
gitgitgadget



* [PATCH v6 2/5] Documentation: define protected configuration
  2022-06-30 18:13         ` [PATCH v6 0/5] config: introduce discovery.bare and protected config Glen Choo via GitGitGadget
  2022-06-30 18:13           ` [PATCH v6 1/5] Documentation/git-config.txt: add SCOPES section Glen Choo via GitGitGadget
@ 2022-06-30 18:13           ` Glen Choo via GitGitGadget
  2022-06-30 23:49             ` Taylor Blau
  2022-06-30 18:13           ` [PATCH v6 3/5] config: learn `git_protected_config()` Glen Choo via GitGitGadget
                             ` (5 subsequent siblings)
  7 siblings, 1 reply; 113+ messages in thread
From: Glen Choo via GitGitGadget @ 2022-06-30 18:13 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, brian m. carlson, Derrick Stolee, Junio C Hamano,
	Emily Shaffer, Jonathan Tan,
	Ævar Arnfjörð Bjarmason, Glen Choo, Glen Choo

From: Glen Choo <chooglen@google.com>

For security reasons, there are config variables that are only trusted
when they are specified in certain configuration scopes, which are
sometimes referred to on-list as 'protected configuration' [1]. A future
commit will introduce another such variable, so let's define our terms
so that we can have consistent documentation and implementation.

In our documentation, define 'protected configuration' as the system,
global and command config scopes. As a shorthand, I will refer to
variables that are only respected in protected config as 'protected
configuration only', but this term is not used in the documentation.

This definition of protected config is based on whether or not Git can
reasonably protect the user by ignoring the configuration scope:

- System, global and command line config are considered protected
  because an attacker who has control over any of those can do plenty of
  harm without Git, so we gain very little by ignoring those scopes.
- On the other hand, local (and similarly, worktree) config are not
  considered protected because it is relatively easy for an attacker to
  control local config, e.g.:
  - On some shared user environments, a non-admin attacker can create a
    repository high up the directory hierarchy (e.g. C:\.git on
    Windows), and a user may accidentally use it when their PS1
    automatically invokes "git" commands.

    `safe.directory` prevents attacks of this form by making sure that
    the user intended to use the shared repository. It obviously
    shouldn't be read from the repository, because that would end up
    trusting the repository that Git was supposed to reject.
  - "git upload-pack" is expected to run in repositories that may not be
    controlled by the user. We cannot ignore all config in that
    repository (because "git upload-pack" would fail), but we can limit
    the risks by ignoring `uploadpack.packObjectsHook`.
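
As an illustration only (this is not code from the series), the definition
above can be sketched as a small predicate. The enum and the helper
`example_scope_is_protected()` are hypothetical names invented for this
sketch; Git has its own `enum config_scope` in config.h, which is not
reproduced here.

```c
#include <stdbool.h>

/* Hypothetical scope enum, modelled on the scopes named in this message. */
enum example_scope {
	EXAMPLE_SCOPE_SYSTEM,
	EXAMPLE_SCOPE_GLOBAL,
	EXAMPLE_SCOPE_LOCAL,
	EXAMPLE_SCOPE_WORKTREE,
	EXAMPLE_SCOPE_COMMAND,
};

/* Returns true only for the scopes this message calls "protected". */
bool example_scope_is_protected(enum example_scope scope)
{
	switch (scope) {
	case EXAMPLE_SCOPE_SYSTEM:
	case EXAMPLE_SCOPE_GLOBAL:
	case EXAMPLE_SCOPE_COMMAND:
		return true;
	case EXAMPLE_SCOPE_LOCAL:
	case EXAMPLE_SCOPE_WORKTREE:
		return false;
	}
	return false; /* anything unrecognized is not trusted */
}
```

A 'protected configuration only' variable would then be honored only when
the predicate is true for the scope in which it was set.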

Only `uploadpack.packObjectsHook` is 'protected configuration only'. The
following variables are intentionally excluded:

- `safe.directory` should be 'protected configuration only', but it does
  not technically fit the definition because it is not respected in the
  "command" scope. A future commit will fix this.

- `trace2.*` happens to read the same scopes as `safe.directory` because
  they share an implementation. However, this is not for security
  reasons; it is because we want to start tracing so early that
  repository-level config and "-c" are not available [2].

  This requirement is unique to `trace2.*`, so it does not make sense
  for protected configuration to be subject to the same constraints.

[1] For example,
https://lore.kernel.org/git/6af83767-576b-75c4-c778-0284344a8fe7@github.com/
[2] https://lore.kernel.org/git/a0c89d0d-669e-bf56-25d2-cbb09b012e70@jeffhostetler.com/

Signed-off-by: Glen Choo <chooglen@google.com>
---
 Documentation/config/uploadpack.txt |  6 +++---
 Documentation/git-config.txt        | 13 +++++++++++++
 2 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/Documentation/config/uploadpack.txt b/Documentation/config/uploadpack.txt
index 32fad5bbe81..029abbefdff 100644
--- a/Documentation/config/uploadpack.txt
+++ b/Documentation/config/uploadpack.txt
@@ -49,9 +49,9 @@ uploadpack.packObjectsHook::
 	`pack-objects` to the hook, and expects a completed packfile on
 	stdout.
 +
-Note that this configuration variable is ignored if it is seen in the
-repository-level config (this is a safety measure against fetching from
-untrusted repositories).
+Note that this configuration variable is only respected when it is specified
+in protected config (see <<SCOPES>>). This is a safety measure against
+fetching from untrusted repositories.
 
 uploadpack.allowFilter::
 	If this option is set, `upload-pack` will support partial
diff --git a/Documentation/git-config.txt b/Documentation/git-config.txt
index f93d437b898..f1810952891 100644
--- a/Documentation/git-config.txt
+++ b/Documentation/git-config.txt
@@ -343,6 +343,7 @@ You can change the way options are read/written by specifying the path to a
 file (`--file`), or by specifying a configuration scope (`--system`,
 `--global`, `--local`, `--worktree`); see <<OPTIONS>> above.
 
+[[SCOPES]]
 SCOPES
 ------
 
@@ -380,6 +381,18 @@ Most configuration options are respected regardless of the scope it is
 defined in, but some options are only respected in certain scopes. See the
 option's documentation for the full details.
 
+Protected configuration
+~~~~~~~~~~~~~~~~~~~~~~~
+
+Protected configuration refers to the 'system', 'global', and 'command' scopes.
+For security reasons, certain options are only respected when they are
+specified in protected configuration, and ignored otherwise.
+
+Git treats these scopes as if they are controlled by the user or a trusted
+administrator. This is because an attacker who controls these scopes can do
+substantial harm without using Git, so it is assumed that the user's environment
+protects these scopes against attackers.
+
 ENVIRONMENT
 -----------
 
-- 
gitgitgadget



* [PATCH v6 3/5] config: learn `git_protected_config()`
  2022-06-30 18:13         ` [PATCH v6 0/5] config: introduce discovery.bare and protected config Glen Choo via GitGitGadget
  2022-06-30 18:13           ` [PATCH v6 1/5] Documentation/git-config.txt: add SCOPES section Glen Choo via GitGitGadget
  2022-06-30 18:13           ` [PATCH v6 2/5] Documentation: define protected configuration Glen Choo via GitGitGadget
@ 2022-06-30 18:13           ` Glen Choo via GitGitGadget
  2022-07-01  1:22             ` Taylor Blau
  2022-06-30 18:13           ` [PATCH v6 4/5] safe.directory: use git_protected_config() Glen Choo via GitGitGadget
                             ` (4 subsequent siblings)
  7 siblings, 1 reply; 113+ messages in thread
From: Glen Choo via GitGitGadget @ 2022-06-30 18:13 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, brian m. carlson, Derrick Stolee, Junio C Hamano,
	Emily Shaffer, Jonathan Tan,
	Ævar Arnfjörð Bjarmason, Glen Choo, Glen Choo

From: Glen Choo <chooglen@google.com>

`uploadpack.packObjectsHook` is the only 'protected configuration only'
variable today, but we've noted that `safe.directory` and the upcoming
`discovery.bare` should also be 'protected configuration only'. So, for
consistency, we'd like to have a single implementation for protected
config.

The primary constraints are:

1. Reading from protected configuration should be as fast as possible.
   Nearly all "git" commands inside a bare repository will read both
   `safe.directory` and `discovery.bare`, so we cannot afford to be
   slow.

2. Protected config must be readable when the gitdir is not known.
   `safe.directory` and `discovery.bare` both affect repository
   discovery and the gitdir is not known at that point [1].

The chosen implementation in this commit is to read protected
configuration and cache the values in a global configset. This is
similar to the caching behavior we get with the_repository->config.

Introduce git_protected_config(), which reads protected configuration
and caches the values in the global configset protected_config. Then,
refactor `uploadpack.packObjectsHook` to use git_protected_config().
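
For readers unfamiliar with the pattern, here is a standalone sketch of
"read once, cache, then iterate with a callback". Every name in it
(`example_configset`, `example_protected_config_iter()`, and so on) is
hypothetical and stands in for the real configset machinery; the
hard-coded entries replace actually parsing the system, global and
command-line config.

```c
#include <stddef.h>

/* Hypothetical stand-in for a config callback type. */
typedef int (*example_config_fn)(const char *key, const char *value,
				 void *data);

struct example_entry {
	const char *key;
	const char *value;
};

struct example_configset {
	int initialized;
	const struct example_entry *entries;
	size_t nr;
};

static struct example_configset example_protected_config;
int example_read_count; /* how many times the sources were read */

/* Stands in for reading the system, global and command-line config. */
static void example_read_protected_config(void)
{
	static const struct example_entry entries[] = {
		{ "safe.directory", "/srv/shared" },
		{ "discovery.bare", "never" },
	};

	example_protected_config.entries = entries;
	example_protected_config.nr = sizeof(entries) / sizeof(entries[0]);
	example_protected_config.initialized = 1;
	example_read_count++;
}

/*
 * Read the sources on first use only; later calls reuse the cache.
 * The callback's return value is ignored here for brevity.
 */
void example_protected_config_iter(example_config_fn fn, void *data)
{
	size_t i;

	if (!example_protected_config.initialized)
		example_read_protected_config();
	for (i = 0; i < example_protected_config.nr; i++)
		fn(example_protected_config.entries[i].key,
		   example_protected_config.entries[i].value, data);
}

/* Sample callback: counts the entries it is shown. */
int example_count_cb(const char *key, const char *value, void *data)
{
	(void)key;
	(void)value;
	(*(int *)data)++;
	return 0;
}
```

Calling `example_protected_config_iter()` twice visits every entry twice
but reads the sources only once. Each call still walks the whole set,
which is the linear cost discussed under constraint 1.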

The protected configuration functions are named similarly to their
non-protected counterparts, e.g. git_protected_config_check_init() vs
git_config_check_init().

In light of constraint 1, this implementation can still be improved
since git_protected_config() iterates through every variable in
protected_config, which may still be too expensive. There exist constant
time lookup functions for non-protected configuration
(repo_config_get_*()), but for simplicity, this commit does not
implement similar functions for protected configuration.

An alternative that avoids introducing another configset is to continue
to read all config using git_config(), but only accept values that have
the correct config scope [2]. This technically fulfills constraint 2,
because git_config() simply ignores the local and worktree config when
the gitdir is not known. However, this would read incomplete config into
the_repository->config, which would need to be reset when the gitdir is
known and git_config() needs to read the local and worktree config.
Resetting the_repository->config might be reasonable while we only have
these 'protected configuration only' variables, but it's not clear
whether this extends well to future variables.

[1] In this case, we do have a candidate gitdir though, so with a little
refactoring, it might be possible to provide a gitdir.
[2] This is how `uploadpack.packObjectsHook` was implemented prior to
this commit.

Signed-off-by: Glen Choo <chooglen@google.com>
---
 config.c                     | 51 ++++++++++++++++++++++++++++++++++++
 config.h                     | 17 ++++++++++++
 t/t5544-pack-objects-hook.sh |  7 ++++-
 upload-pack.c                | 27 ++++++++++++-------
 4 files changed, 91 insertions(+), 11 deletions(-)

diff --git a/config.c b/config.c
index 9b0e9c93285..29e62f5d0ed 100644
--- a/config.c
+++ b/config.c
@@ -81,6 +81,18 @@ static enum config_scope current_parsing_scope;
 static int pack_compression_seen;
 static int zlib_compression_seen;
 
+/*
+ * Config that comes from trusted sources, namely:
+ * - system config files (e.g. /etc/gitconfig)
+ * - global config files (e.g. $HOME/.gitconfig,
+ *   $XDG_CONFIG_HOME/git)
+ * - the command line.
+ *
+ * This is declared here for code cleanliness, but unlike the other
+ * static variables, this does not hold config parser state.
+ */
+static struct config_set protected_config;
+
 static int config_file_fgetc(struct config_source *conf)
 {
 	return getc_unlocked(conf->u.file);
@@ -2378,6 +2390,11 @@ int git_configset_add_file(struct config_set *cs, const char *filename)
 	return git_config_from_file(config_set_callback, filename, cs);
 }
 
+int git_configset_add_parameters(struct config_set *cs)
+{
+	return git_config_from_parameters(config_set_callback, cs);
+}
+
 int git_configset_get_value(struct config_set *cs, const char *key, const char **value)
 {
 	const struct string_list *values = NULL;
@@ -2619,6 +2636,40 @@ int repo_config_get_pathname(struct repository *repo,
 	return ret;
 }
 
+/* Read values into protected_config. */
+static void read_protected_config(void)
+{
+	char *xdg_config = NULL, *user_config = NULL, *system_config = NULL;
+
+	git_configset_init(&protected_config);
+
+	system_config = git_system_config();
+	git_global_config(&user_config, &xdg_config);
+
+	git_configset_add_file(&protected_config, system_config);
+	git_configset_add_file(&protected_config, xdg_config);
+	git_configset_add_file(&protected_config, user_config);
+	git_configset_add_parameters(&protected_config);
+
+	free(system_config);
+	free(xdg_config);
+	free(user_config);
+}
+
+/* Ensure that protected_config has been initialized. */
+static void git_protected_config_check_init(void)
+{
+	if (protected_config.hash_initialized)
+		return;
+	read_protected_config();
+}
+
+void git_protected_config(config_fn_t fn, void *data)
+{
+	git_protected_config_check_init();
+	configset_iter(&protected_config, fn, data);
+}
+
 /* Functions used historically to read configuration from 'the_repository' */
 void git_config(config_fn_t fn, void *data)
 {
diff --git a/config.h b/config.h
index 7654f61c634..e3ff1fcf683 100644
--- a/config.h
+++ b/config.h
@@ -446,6 +446,15 @@ void git_configset_init(struct config_set *cs);
  */
 int git_configset_add_file(struct config_set *cs, const char *filename);
 
+/**
+ * Parses command line options and environment variables, and adds the
+ * variable-value pairs to the `config_set`. Returns 0 on success, or -1
+ * if there is an error in parsing. The caller decides whether to free
+ * the incomplete configset or continue using it when the function
+ * returns -1.
+ */
+int git_configset_add_parameters(struct config_set *cs);
+
 /**
  * Finds and returns the value list, sorted in order of increasing priority
  * for the configuration variable `key` and config set `cs`. When the
@@ -505,6 +514,14 @@ int repo_config_get_maybe_bool(struct repository *repo,
 int repo_config_get_pathname(struct repository *repo,
 			     const char *key, const char **dest);
 
+/*
+ * Functions for reading protected config. By definition, protected
+ * config ignores repository config, so it is unnecessary to read
+ * protected config from any `struct repository` other than
+ * the_repository.
+ */
+void git_protected_config(config_fn_t fn, void *data);
+
 /**
  * Querying For Specific Variables
  * -------------------------------
diff --git a/t/t5544-pack-objects-hook.sh b/t/t5544-pack-objects-hook.sh
index dd5f44d986f..54f54f8d2eb 100755
--- a/t/t5544-pack-objects-hook.sh
+++ b/t/t5544-pack-objects-hook.sh
@@ -56,7 +56,12 @@ test_expect_success 'hook does not run from repo config' '
 	! grep "hook running" stderr &&
 	test_path_is_missing .git/hook.args &&
 	test_path_is_missing .git/hook.stdin &&
-	test_path_is_missing .git/hook.stdout
+	test_path_is_missing .git/hook.stdout &&
+
+	# check that global config is used instead
+	test_config_global uploadpack.packObjectsHook ./hook &&
+	git clone --no-local . dst2.git 2>stderr &&
+	grep "hook running" stderr
 '
 
 test_expect_success 'hook works with partial clone' '
diff --git a/upload-pack.c b/upload-pack.c
index 3a851b36066..09f48317b02 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -1321,18 +1321,27 @@ static int upload_pack_config(const char *var, const char *value, void *cb_data)
 		data->advertise_sid = git_config_bool(var, value);
 	}
 
-	if (current_config_scope() != CONFIG_SCOPE_LOCAL &&
-	    current_config_scope() != CONFIG_SCOPE_WORKTREE) {
-		if (!strcmp("uploadpack.packobjectshook", var))
-			return git_config_string(&data->pack_objects_hook, var, value);
-	}
-
 	if (parse_object_filter_config(var, value, data) < 0)
 		return -1;
 
 	return parse_hide_refs_config(var, value, "uploadpack");
 }
 
+static int upload_pack_protected_config(const char *var, const char *value, void *cb_data)
+{
+	struct upload_pack_data *data = cb_data;
+
+	if (!strcmp("uploadpack.packobjectshook", var))
+		return git_config_string(&data->pack_objects_hook, var, value);
+	return 0;
+}
+
+static void get_upload_pack_config(struct upload_pack_data *data)
+{
+	git_config(upload_pack_config, data);
+	git_protected_config(upload_pack_protected_config, data);
+}
+
 void upload_pack(const int advertise_refs, const int stateless_rpc,
 		 const int timeout)
 {
@@ -1340,8 +1349,7 @@ void upload_pack(const int advertise_refs, const int stateless_rpc,
 	struct upload_pack_data data;
 
 	upload_pack_data_init(&data);
-
-	git_config(upload_pack_config, &data);
+	get_upload_pack_config(&data);
 
 	data.stateless_rpc = stateless_rpc;
 	data.timeout = timeout;
@@ -1695,8 +1703,7 @@ int upload_pack_v2(struct repository *r, struct packet_reader *request)
 
 	upload_pack_data_init(&data);
 	data.use_sideband = LARGE_PACKET_MAX;
-
-	git_config(upload_pack_config, &data);
+	get_upload_pack_config(&data);
 
 	while (state != FETCH_DONE) {
 		switch (state) {
-- 
gitgitgadget



* [PATCH v6 4/5] safe.directory: use git_protected_config()
  2022-06-30 18:13         ` [PATCH v6 0/5] config: introduce discovery.bare and protected config Glen Choo via GitGitGadget
                             ` (2 preceding siblings ...)
  2022-06-30 18:13           ` [PATCH v6 3/5] config: learn `git_protected_config()` Glen Choo via GitGitGadget
@ 2022-06-30 18:13           ` Glen Choo via GitGitGadget
  2022-06-30 18:13           ` [PATCH v6 5/5] setup.c: create `discovery.bare` Glen Choo via GitGitGadget
                             ` (3 subsequent siblings)
  7 siblings, 0 replies; 113+ messages in thread
From: Glen Choo via GitGitGadget @ 2022-06-30 18:13 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, brian m. carlson, Derrick Stolee, Junio C Hamano,
	Emily Shaffer, Jonathan Tan,
	Ævar Arnfjörð Bjarmason, Glen Choo, Glen Choo

From: Glen Choo <chooglen@google.com>

Use git_protected_config() to read `safe.directory` instead of
read_very_early_config(), making it 'protected configuration only'.

As a result, `safe.directory` now respects "-c", so update the tests and
docs accordingly. It used to ignore "-c" due to how it was implemented,
not because of security or correctness concerns [1].

[1] https://lore.kernel.org/git/xmqqlevabcsu.fsf@gitster.g/

Signed-off-by: Glen Choo <chooglen@google.com>
---
 Documentation/config/safe.txt |  6 +++---
 setup.c                       |  2 +-
 t/t0033-safe-directory.sh     | 24 ++++++++++--------------
 3 files changed, 14 insertions(+), 18 deletions(-)

diff --git a/Documentation/config/safe.txt b/Documentation/config/safe.txt
index fa02f3ccc54..f72b4408798 100644
--- a/Documentation/config/safe.txt
+++ b/Documentation/config/safe.txt
@@ -12,9 +12,9 @@ via `git config --add`. To reset the list of safe directories (e.g. to
 override any such directories specified in the system config), add a
 `safe.directory` entry with an empty value.
 +
-This config setting is only respected when specified in a system or global
-config, not when it is specified in a repository config, via the command
-line option `-c safe.directory=<path>`, or in environment variables.
+This config setting is only respected in protected configuration (see
+<<SCOPES>>). This prevents the untrusted repository from tampering with this
+value.
 +
 The value of this setting is interpolated, i.e. `~/<path>` expands to a
 path relative to the home directory and `%(prefix)/<path>` expands to a
diff --git a/setup.c b/setup.c
index faf5095e44d..c8e3c32814d 100644
--- a/setup.c
+++ b/setup.c
@@ -1137,7 +1137,7 @@ static int ensure_valid_ownership(const char *path)
 	    is_path_owned_by_current_user(path))
 		return 1;
 
-	read_very_early_config(safe_directory_cb, &data);
+	git_protected_config(safe_directory_cb, &data);
 
 	return data.is_safe;
 }
diff --git a/t/t0033-safe-directory.sh b/t/t0033-safe-directory.sh
index 238b25f91a3..5a1cd0d0947 100755
--- a/t/t0033-safe-directory.sh
+++ b/t/t0033-safe-directory.sh
@@ -16,24 +16,20 @@ test_expect_success 'safe.directory is not set' '
 	expect_rejected_dir
 '
 
-test_expect_success 'ignoring safe.directory on the command line' '
-	test_must_fail git -c safe.directory="$(pwd)" status 2>err &&
-	grep "unsafe repository" err
+test_expect_success 'safe.directory on the command line' '
+	git -c safe.directory="$(pwd)" status
 '
 
-test_expect_success 'ignoring safe.directory in the environment' '
-	test_must_fail env GIT_CONFIG_COUNT=1 \
-		GIT_CONFIG_KEY_0="safe.directory" \
-		GIT_CONFIG_VALUE_0="$(pwd)" \
-		git status 2>err &&
-	grep "unsafe repository" err
+test_expect_success 'safe.directory in the environment' '
+	env GIT_CONFIG_COUNT=1 \
+	    GIT_CONFIG_KEY_0="safe.directory" \
+	    GIT_CONFIG_VALUE_0="$(pwd)" \
+	    git status
 '
 
-test_expect_success 'ignoring safe.directory in GIT_CONFIG_PARAMETERS' '
-	test_must_fail env \
-		GIT_CONFIG_PARAMETERS="${SQ}safe.directory${SQ}=${SQ}$(pwd)${SQ}" \
-		git status 2>err &&
-	grep "unsafe repository" err
+test_expect_success 'safe.directory in GIT_CONFIG_PARAMETERS' '
+	env GIT_CONFIG_PARAMETERS="${SQ}safe.directory${SQ}=${SQ}$(pwd)${SQ}" \
+	    git status
 '
 
 test_expect_success 'ignoring safe.directory in repo config' '
-- 
gitgitgadget



* [PATCH v6 5/5] setup.c: create `discovery.bare`
  2022-06-30 18:13         ` [PATCH v6 0/5] config: introduce discovery.bare and protected config Glen Choo via GitGitGadget
                             ` (3 preceding siblings ...)
  2022-06-30 18:13           ` [PATCH v6 4/5] safe.directory: use git_protected_config() Glen Choo via GitGitGadget
@ 2022-06-30 18:13           ` Glen Choo via GitGitGadget
  2022-07-01  1:30             ` Taylor Blau
  2022-06-30 22:13           ` [PATCH v6 0/5] config: introduce discovery.bare and protected config Taylor Blau
                             ` (2 subsequent siblings)
  7 siblings, 1 reply; 113+ messages in thread
From: Glen Choo via GitGitGadget @ 2022-06-30 18:13 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, brian m. carlson, Derrick Stolee, Junio C Hamano,
	Emily Shaffer, Jonathan Tan,
	Ævar Arnfjörð Bjarmason, Glen Choo, Glen Choo

From: Glen Choo <chooglen@google.com>

There is a known social engineering attack that takes advantage of the
fact that a working tree can include an entire bare repository,
including a config file. A user could run a Git command inside the bare
repository thinking that the config file of the 'outer' repository would
be used, but in reality, the bare repository's config file (which is
attacker-controlled) is used, which may result in arbitrary code
execution. See [1] for a fuller description and deeper discussion.

A simple mitigation is to forbid bare repositories unless specified via
`--git-dir` or `GIT_DIR`. In environments that don't use bare
repositories, this would be minimally disruptive.

Create a config variable, `discovery.bare`, that tells Git whether or
not to die() when it discovers a bare repository. This only affects
repository discovery, thus it has no effect if discovery was not
done, e.g. if the user passes `--git-dir=my-dir`, discovery will be
skipped and my-dir will be used as the repo regardless of the
`discovery.bare` value.
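
The intended behavior can be summarized by the following standalone
sketch. It is a simplified model, not the code in setup.c: the enum, the
function `example_repo_allowed()` and its parameters are hypothetical
names chosen for illustration.

```c
#include <stdbool.h>

enum example_discovery_bare {
	EXAMPLE_BARE_NEVER = 0,
	EXAMPLE_BARE_ALWAYS,
};

/*
 * Decide whether a candidate repository may be used.
 *   explicit_gitdir: the repo was named via --git-dir or GIT_DIR
 *   is_bare:         the candidate is a bare repository
 */
bool example_repo_allowed(bool explicit_gitdir, bool is_bare,
			  enum example_discovery_bare setting)
{
	if (explicit_gitdir)
		return true;	/* no discovery, so the setting is not consulted */
	if (is_bare && setting == EXAMPLE_BARE_NEVER)
		return false;	/* a discovered bare repo is rejected */
	return true;
}
```

Note that non-bare repositories are unaffected by the setting in this
model, and that an explicitly named repository is always accepted.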

This config is an enum of:

- "always": always allow bare repositories (this is the default)
- "never": never allow bare repositories
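
As a rough sketch of how such an enum might be parsed (hypothetical names
throughout, independent of the callback in the diff below), a parser can
accept the two known strings and reject everything else. Treating a
missing value as an error is an assumption of this sketch, not something
stated in this message.

```c
#include <stddef.h>
#include <string.h>

enum example_discovery_bare {
	EXAMPLE_BARE_NEVER = 0,
	EXAMPLE_BARE_ALWAYS,
};

/*
 * Parse one value. Returns 0 and updates *out on success, or -1 on a
 * missing or unknown value, in which case *out is left untouched.
 */
int example_parse_discovery_bare(const char *value,
				 enum example_discovery_bare *out)
{
	if (!value)
		return -1;	/* assumption: a valueless key is an error */
	if (!strcmp(value, "never")) {
		*out = EXAMPLE_BARE_NEVER;
		return 0;
	}
	if (!strcmp(value, "always")) {
		*out = EXAMPLE_BARE_ALWAYS;
		return 0;
	}
	return -1;
}
```

A caller would initialize the result to `EXAMPLE_BARE_ALWAYS` (the default
named above) and feed each configured value through the parser in order,
so that the last valid value wins.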

If we want to protect users from such attacks by default, neither value
will suffice - "always" provides no protection, but "never" is
impractical for bare repository users. A more usable default would be to
allow only non-embedded bare repositories ([2] contains one such
proposal), but detecting if a repository is embedded is potentially
non-trivial, so this work is not implemented in this series.

[1]: https://lore.kernel.org/git/kl6lsfqpygsj.fsf@chooglen-macbookpro.roam.corp.google.com
[2]: https://lore.kernel.org/git/5b969c5e-e802-c447-ad25-6acc0b784582@github.com

Signed-off-by: Glen Choo <chooglen@google.com>
---
 Documentation/config.txt           |  2 ++
 Documentation/config/discovery.txt | 23 ++++++++++++
 setup.c                            | 57 +++++++++++++++++++++++++++++-
 t/t0035-discovery-bare.sh          | 52 +++++++++++++++++++++++++++
 4 files changed, 133 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/config/discovery.txt
 create mode 100755 t/t0035-discovery-bare.sh

diff --git a/Documentation/config.txt b/Documentation/config.txt
index e284b042f22..9a5e1329772 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -409,6 +409,8 @@ include::config/diff.txt[]
 
 include::config/difftool.txt[]
 
+include::config/discovery.txt[]
+
 include::config/extensions.txt[]
 
 include::config/fastimport.txt[]
diff --git a/Documentation/config/discovery.txt b/Documentation/config/discovery.txt
new file mode 100644
index 00000000000..bbcf89bb0b5
--- /dev/null
+++ b/Documentation/config/discovery.txt
@@ -0,0 +1,23 @@
+discovery.bare::
+	Specifies whether Git will work with a bare repository that it
+	found during repository discovery. If the repository is
+	specified directly via the --git-dir command-line option or the
+	GIT_DIR environment variable (see linkgit:git[1]), Git will
+	always use the specified repository, regardless of this value.
++
+This config setting is only respected in protected configuration (see
+<<SCOPES>>). This prevents the untrusted repository from tampering with
+this value.
++
+The currently supported values are:
++
+* `always`: Git always works with bare repositories
+* `never`: Git never works with bare repositories
++
+This defaults to `always`, but this default may change in the future.
++
+If you do not use bare repositories in your workflow, then it may be
+beneficial to set `discovery.bare` to `never` in your global config.
+This will protect you from attacks that involve cloning a repository
+that contains a bare repository and running a Git command within that
+directory.
diff --git a/setup.c b/setup.c
index c8e3c32814d..16938fd5a24 100644
--- a/setup.c
+++ b/setup.c
@@ -10,6 +10,10 @@
 static int inside_git_dir = -1;
 static int inside_work_tree = -1;
 static int work_tree_config_is_bogus;
+enum discovery_bare_allowed {
+	DISCOVERY_BARE_NEVER = 0,
+	DISCOVERY_BARE_ALWAYS,
+};
 
 static struct startup_info the_startup_info;
 struct startup_info *startup_info = &the_startup_info;
@@ -1142,6 +1146,46 @@ static int ensure_valid_ownership(const char *path)
 	return data.is_safe;
 }
 
+static int discovery_bare_cb(const char *key, const char *value, void *d)
+{
+	enum discovery_bare_allowed *discovery_bare_allowed = d;
+
+	if (strcmp(key, "discovery.bare"))
+		return 0;
+
+	if (!strcmp(value, "never")) {
+		*discovery_bare_allowed = DISCOVERY_BARE_NEVER;
+		return 0;
+	}
+	if (!strcmp(value, "always")) {
+		*discovery_bare_allowed = DISCOVERY_BARE_ALWAYS;
+		return 0;
+	}
+	return -1;
+}
+
+static enum discovery_bare_allowed get_discovery_bare(void)
+{
+	enum discovery_bare_allowed result = DISCOVERY_BARE_ALWAYS;
+	git_protected_config(discovery_bare_cb, &result);
+	return result;
+}
+
+static const char *discovery_bare_allowed_to_string(
+	enum discovery_bare_allowed discovery_bare_allowed)
+{
+	switch (discovery_bare_allowed) {
+	case DISCOVERY_BARE_NEVER:
+		return "never";
+	case DISCOVERY_BARE_ALWAYS:
+		return "always";
+	default:
+		BUG("invalid discovery_bare_allowed %d",
+		    discovery_bare_allowed);
+	}
+	return NULL;
+}
+
 enum discovery_result {
 	GIT_DIR_NONE = 0,
 	GIT_DIR_EXPLICIT,
@@ -1151,7 +1195,8 @@ enum discovery_result {
 	GIT_DIR_HIT_CEILING = -1,
 	GIT_DIR_HIT_MOUNT_POINT = -2,
 	GIT_DIR_INVALID_GITFILE = -3,
-	GIT_DIR_INVALID_OWNERSHIP = -4
+	GIT_DIR_INVALID_OWNERSHIP = -4,
+	GIT_DIR_DISALLOWED_BARE = -5,
 };
 
 /*
@@ -1248,6 +1293,8 @@ static enum discovery_result setup_git_directory_gently_1(struct strbuf *dir,
 		}
 
 		if (is_git_directory(dir->buf)) {
+			if (!get_discovery_bare())
+				return GIT_DIR_DISALLOWED_BARE;
 			if (!ensure_valid_ownership(dir->buf))
 				return GIT_DIR_INVALID_OWNERSHIP;
 			strbuf_addstr(gitdir, ".");
@@ -1394,6 +1441,14 @@ const char *setup_git_directory_gently(int *nongit_ok)
 		}
 		*nongit_ok = 1;
 		break;
+	case GIT_DIR_DISALLOWED_BARE:
+		if (!nongit_ok) {
+			die(_("cannot use bare repository '%s' (discovery.bare is '%s')"),
+			    dir.buf,
+			    discovery_bare_allowed_to_string(get_discovery_bare()));
+		}
+		*nongit_ok = 1;
+		break;
 	case GIT_DIR_NONE:
 		/*
 		 * As a safeguard against setup_git_directory_gently_1 returning
diff --git a/t/t0035-discovery-bare.sh b/t/t0035-discovery-bare.sh
new file mode 100755
index 00000000000..8f802746530
--- /dev/null
+++ b/t/t0035-discovery-bare.sh
@@ -0,0 +1,52 @@
+#!/bin/sh
+
+test_description='verify discovery.bare checks'
+
+TEST_PASSES_SANITIZE_LEAK=true
+. ./test-lib.sh
+
+pwd="$(pwd)"
+
+expect_accepted () {
+	git "$@" rev-parse --git-dir
+}
+
+expect_rejected () {
+	test_must_fail git "$@" rev-parse --git-dir 2>err &&
+	grep -F "cannot use bare repository" err
+}
+
+test_expect_success 'setup bare repo in worktree' '
+	git init outer-repo &&
+	git init --bare outer-repo/bare-repo
+'
+
+test_expect_success 'discovery.bare unset' '
+	expect_accepted -C outer-repo/bare-repo
+'
+
+test_expect_success 'discovery.bare=always' '
+	test_config_global discovery.bare always &&
+	expect_accepted -C outer-repo/bare-repo
+'
+
+test_expect_success 'discovery.bare=never' '
+	test_config_global discovery.bare never &&
+	expect_rejected -C outer-repo/bare-repo
+'
+
+test_expect_success 'discovery.bare in the repository' '
+	# discovery.bare must not be "never", otherwise git config fails
+	# with "fatal: not in a git directory" (like safe.directory)
+	test_config -C outer-repo/bare-repo discovery.bare always &&
+	test_config_global discovery.bare never &&
+	expect_rejected -C outer-repo/bare-repo
+'
+
+test_expect_success 'discovery.bare on the command line' '
+	test_config_global discovery.bare never &&
+	expect_accepted -C outer-repo/bare-repo \
+		-c discovery.bare=always
+'
+
+test_done
-- 
gitgitgadget


* Re: [PATCH v6 0/5] config: introduce discovery.bare and protected config
  2022-06-30 18:13         ` [PATCH v6 0/5] config: introduce discovery.bare and protected config Glen Choo via GitGitGadget
                             ` (4 preceding siblings ...)
  2022-06-30 18:13           ` [PATCH v6 5/5] setup.c: create `discovery.bare` Glen Choo via GitGitGadget
@ 2022-06-30 22:13           ` Taylor Blau
  2022-06-30 23:07           ` Ævar Arnfjörð Bjarmason
  2022-07-07 23:01           ` [PATCH v7 " Glen Choo via GitGitGadget
  7 siblings, 0 replies; 113+ messages in thread
From: Taylor Blau @ 2022-06-30 22:13 UTC (permalink / raw)
  To: Glen Choo via GitGitGadget
  Cc: git, brian m. carlson, Derrick Stolee, Junio C Hamano,
	Emily Shaffer, Jonathan Tan,
	Ævar Arnfjörð Bjarmason, Glen Choo

On Thu, Jun 30, 2022 at 06:13:54PM +0000, Glen Choo via GitGitGadget wrote:
> This series does not change the default behavior, but in the long-run, a
> "no-embedded" option might be a safe and usable default [2]. "never" is too
> restrictive and unlikely to be the default.

Thanks for this summary, and sorry for not taking a look at this series
earlier. I spent quite a bit of time with v1 and the RFC patches but
haven't had as much time as I would have liked to devote to reviewing
the later rounds.

I'm happy with this direction. I think "never" is an understandable
direction to go in for compliance reasons, or in deployments where bare
repositories are not expected. And I am glad that we are treating it as
unlikely to be the default in the future; I agree that it is likely too
restrictive for that to make sense.

So leaving the default behavior unchanged, and providing a big hammer to
prevent bare repository discovery entirely seems like a good first step
for this feature. I'm interested to see a potential future no-embedded
implementation, since I think that has a reasonable chance of being a
sensible default, depending on how it's implemented.

> For security reasons, discovery.bare cannot be read from repository-level
> config (because we would end up trusting the embedded bare repository that
> we aren't supposed to trust to begin with). Since this would introduce a 3rd
> variable that is only read from 'protected/trusted configuration' (the
> others are safe.directory and uploadpack.packObjectsHook) this series also
> defines and creates a shared implementation for 'protected configuration'

This concept is new from the earlier rounds, so I'll be curious to see
how it plays out as I take a closer look at the patches.

Thanks again for your work on this complicated and tricky-to-get-right
feature ;-).

Thanks,
Taylor


* Re: [PATCH v6 1/5] Documentation/git-config.txt: add SCOPES section
  2022-06-30 18:13           ` [PATCH v6 1/5] Documentation/git-config.txt: add SCOPES section Glen Choo via GitGitGadget
@ 2022-06-30 22:32             ` Taylor Blau
  2022-07-06 17:44               ` Glen Choo
  0 siblings, 1 reply; 113+ messages in thread
From: Taylor Blau @ 2022-06-30 22:32 UTC (permalink / raw)
  To: Glen Choo via GitGitGadget
  Cc: git, brian m. carlson, Derrick Stolee, Junio C Hamano,
	Emily Shaffer, Jonathan Tan,
	Ævar Arnfjörð Bjarmason, Glen Choo

On Thu, Jun 30, 2022 at 06:13:55PM +0000, Glen Choo via GitGitGadget wrote:
> From: Glen Choo <chooglen@google.com>
>
> In a subsequent commit, we will introduce "protected configuration",
> which is easiest to describe in terms of configuration scopes (i.e. it's
> the union of the 'system', 'global', and 'command' scopes). This
> description is fine for ML discussions, but it's inadequate for end
> users because we don't provide a good description of "configuration
> scopes" in the public docs.
>
> 145d59f482 (config: add '--show-scope' to print the scope of a config
> value, 2020-02-10) introduced the word "scope" to our public docs, but
> that only enumerates the scopes and assumes the user can figure out
> what those values mean.

Thanks, I think that "scope" is an appropriate term here. When I
originally read this patch, I was thinking that "origin" would be more
appropriate, since I was recalling the `--show-origin` option to `git
config`. But that shows the file name, and `--show-scope` is a separate
option entirely.

The latter is definitely more appropriate here, so I think this choice
of naming is good and makes sense.

> diff --git a/Documentation/git-config.txt b/Documentation/git-config.txt
> index 9376e39aef2..f93d437b898 100644
> --- a/Documentation/git-config.txt
> +++ b/Documentation/git-config.txt
> @@ -297,8 +297,8 @@ The default is to use a pager.
>  FILES
>  -----
>
> -If not set explicitly with `--file`, there are four files where
> -'git config' will search for configuration options:
> +By default, 'git config' will read configuration options from multiple
> +files:
>
>  $(prefix)/etc/gitconfig::
>  	System-wide configuration file.
> @@ -322,27 +322,63 @@ $GIT_DIR/config.worktree::
>  	This is optional and is only searched when
>  	`extensions.worktreeConfig` is present in $GIT_DIR/config.
>
> -If no further options are given, all reading options will read all of these
> -files that are available. If the global or the system-wide configuration
> -file are not available they will be ignored. If the repository configuration
> -file is not available or readable, 'git config' will exit with a non-zero
> -error code. However, in neither case will an error message be issued.
> +You may also provide additional configuration parameters when running any
> +git command by using the `-c` option. See linkgit:git[1] for details.
> +
> +Options will be read from all of these files that are available. If the
> +global or the system-wide configuration file are not available they will be
> +ignored. If the repository configuration file is not available or readable,
> +'git config' will exit with a non-zero error code. However, in neither case
> +will an error message be issued.

Nit: the last sentence is a little awkwardly worded. Perhaps just:
"Note that neither case produces an error message".

> -All writing options will per default write to the repository specific
> +By default, options are only written to the repository specific
>  configuration file. Note that this also affects options like `--replace-all`

Should we mention that this is the same as the "local" scope below?

>  and `--unset`. *'git config' will only ever change one file at a time*.
>
> -You can override these rules using the `--global`, `--system`,
> -`--local`, `--worktree`, and `--file` command-line options; see
> -<<OPTIONS>> above.
> +You can change the way options are read/written by specifying the path to a
> +file (`--file`), or by specifying a configuration scope (`--system`,
> +`--global`, `--local`, `--worktree`); see <<OPTIONS>> above.

I think this paragraph could be slightly more descriptive about what
`--file` does while still linking out to <<OPTIONS>> above for more
detailed information. In the pre-image, we say:

    If not set explicitly with `--file`, there are four files where
    'git config' will search for configuration options:

So I wonder if something more descriptive in this section might be:

    You can limit which configuration sources are read from or written
    to by specifying the path of a file with the `--file` option, or
    by specifying a scope with `--system`, `--global`, `--local`, or
    `--worktree`. For more, see <<OPTIONS>> above.

I don't think that's so different from what you wrote, but I think it's
a little clearer, particularly about what `--file` does (instead of
"change the way options are read/written" it "limit[s] which
configuration sources are read from or written to").

> +
> +SCOPES
> +------
> +
> +Each configuration source falls within a configuration scope. The scopes
> +are:
> +
> +system::
> +	$(prefix)/etc/gitconfig
> +
> +global::
> +	$XDG_CONFIG_HOME/git/config
> ++
> +~/.gitconfig
> +
> +local::
> +	$GIT_DIR/config
> +
> +worktree::
> +	$GIT_DIR/config.worktree
> +
> +command::
> +	environment variables
> ++
> +the `-c` option
> +
> +With the exception of 'command', each scope corresponds to a command line
> +option - `--system`, `--global`, `--local`, `--worktree`.

I think a colon after "option" is more appropriate than a single "-"
dash character, but this is definitely a trivial matter that I have no
strong opinion on.

One thing that this reminds me of (which I don't think is worth taking
up here, but perhaps in a future series, or as #leftoverbits) would be
promoting these scopes behind a single option. Back in the day, you
could ask for values out of `git config` by specifying their type with
`--int`, `--bool`, or similar. In e3e042b185 (Merge branch
'tb/config-type', 2018-05-08), we changed to
`--type=<int|bool|color|etc>`, which unified things and made it clearer
which options were grouped together by a single concept.

I think a similar change would make sense here, that is to replace
`--system`, `--global` (and so on) with `--scope=system`,
`--scope=global`, etc.

But that's not material to this series, and just something to think
about for later on if you end up thinking it's a good idea.
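
A minimal sketch of what such a mapping could look like, assuming a
hypothetical `--scope=<name>` option that simply stands in for the
existing flags. The table and the function name are made up for
illustration and are not taken from Git's source:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table for illustration; these names are not Git's. */
struct demo_scope_flag {
	const char *scope; /* value a `--scope=<name>` option might accept */
	const char *flag;  /* existing option it would stand in for */
};

static const struct demo_scope_flag demo_scope_flags[] = {
	{ "system",   "--system" },
	{ "global",   "--global" },
	{ "local",    "--local" },
	{ "worktree", "--worktree" },
};

/* Return the existing flag for a scope name, or NULL if there is none. */
const char *demo_flag_for_scope(const char *scope)
{
	size_t i;

	assert(scope != NULL);
	for (i = 0; i < sizeof(demo_scope_flags) / sizeof(demo_scope_flags[0]); i++)
		if (!strcmp(scope, demo_scope_flags[i].scope))
			return demo_scope_flags[i].flag;
	return NULL; /* e.g. "command", which has no corresponding option */
}
```

The 'command' scope deliberately has no entry, matching the patch text
that says every scope except 'command' corresponds to an option.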

> +
> +When reading options, specifying a scope will only read options from the
> +files within that scope. When writing options, specifying a scope will write
> +to the files within that scope (instead of the repository specific
> +configuration file). See <<OPTIONS>> above for a complete description.
>
> +Most configuration options are respected regardless of the scope it is
> +defined in, but some options are only respected in certain scopes. See the
> +option's documentation for the full details.

I assume "the option's" is referring to whichever configuration variable
we're talking about. So it may be clearer to say "See the *respective*
option's documentation for more information" or similar.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH v6 0/5] config: introduce discovery.bare and protected config
  2022-06-30 18:13         ` [PATCH v6 0/5] config: introduce discovery.bare and protected config Glen Choo via GitGitGadget
                             ` (5 preceding siblings ...)
  2022-06-30 22:13           ` [PATCH v6 0/5] config: introduce discovery.bare and protected config Taylor Blau
@ 2022-06-30 23:07           ` Ævar Arnfjörð Bjarmason
  2022-07-01 17:37             ` Glen Choo
  2022-07-07 23:01           ` [PATCH v7 " Glen Choo via GitGitGadget
  7 siblings, 1 reply; 113+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-06-30 23:07 UTC (permalink / raw)
  To: Glen Choo via GitGitGadget
  Cc: git, Taylor Blau, brian m. carlson, Derrick Stolee,
	Junio C Hamano, Emily Shaffer, Jonathan Tan, Glen Choo


On Thu, Jun 30 2022, Glen Choo via GitGitGadget wrote:

> This is a quick re-roll to address Ævar's comments on the tests (thanks!).

Thanks!

> = Description

Just more generally on this series & approach. I know this is a v6 by
now, but I haven't kept up with this topic, but to be fair I did mention
pretty much this in:
https://lore.kernel.org/git/220407.86lewhc6bz.gmgdl@evledraar.gmail.com/

So...

> There is a known social engineering attack that takes advantage of the fact
> that a working tree can include an entire bare repository, including a
> config file. A user could run a Git command inside the bare repository
> thinking that the config file of the 'outer' repository would be used, but
> in reality, the bare repository's config file (which is attacker-controlled)
> is used, which may result in arbitrary code execution. See [1] for a fuller
> description and deeper discussion.
>
> This series implements a simple way of preventing such attacks: create a
> config option, discovery.bare, that tells Git whether or not to die when it
> finds a bare repository. discovery.bare has two values:
>
>  * "always": always allow bare repositories (default), identical to current
>    behavior
>  * "never": never allow bare repositories
>
> and users/system administrators who never expect to work with bare
> repositories can secure their environments using "never". discovery.bare has
> no effect if --git-dir or GIT_DIR is passed because we are confident that
> the user is not confused about which repository is being used.

I'm not insisting that the entire approach here should be changed, but
in the above exchange you seemed to have performance concerns about the
"just walk up in setup.c" approach I mentioned, but it's not clear if
that's still the only thing that necessitates taking this approach.

There may be security subtleties that I've missed, but from the
description here it seems like that would work equally well, and
wouldn't require configuration, except insofar as we'd need to opt-in to
reading config from bare repositories *that also exist in a parent tree*.

And it would be a more narrow & more secure solution, since it would
e.g. allow you to intentionally navigate to /var/repos/git/git.git in a
server setup and read the config there, which it could distinguish from
a case of /var/repos/.git existing, and git/git.git being brought in as
a part of that "parent" repo.

The "more narrow" and "more secure" go hand-in-hand, since if you work
on such servers you'd turn this to "always" because you want to read
such config, but then be left vulnerable to the actual (and much rarer)
exploit we're trying to prevent.

Which, it seems...

> This series does not change the default behavior, but in the long-run, a
> "no-embedded" option might be a safe and usable default [2]. "never" is too
> restrictive and unlikely to be the default.

This series has (since v3?) been noting aspirations to have a
"no-embedded" variant of this config, which your 5/5 here notes would be
better, but isn't implemented by this series.

But your 5/5 also notes:

    but detecting if a repository is embedded is potentially
    non-trivial, so this work is not implemented in this series.

Hrm, well, the diff-stat isn't quite that trivial either :) :

> [...]
>  upload-pack.c                       | 27 ++++++----
>  12 files changed, 304 insertions(+), 47 deletions(-)

In threads linked from the above ML link I linked to some POC code
showing how to hack a second .git discovery walk into setup.c. This was
as part of the "submodule parent dir" proposal, which is a different
feature, but also needs such "find the parent" code:
https://lore.kernel.org/git/211109.86v912dtfw.gmgdl@evledraar.gmail.com/

Now, obviously that's a dirty hack, but it's not that hard to just
change the part of setup.c where we're satisfied that we've found the
git dir, then walk up "$THAT_DIR/..", and start our search again.

Then:

	if (first_dir_was_bare() && found_parent_dir())
		enforce_no_embedded();

Isn't that what your proposed "no embedded" option would need to do?
Well, maybe we'd also check if the "first dir" is in the index of the
parent, as opposed to just being a bare .git somewhere in ~/Downloads,
e.g. if you have a ~/.git and keep your dot-files in git.

But I think for an initial implementation just doing the walk would be
good enough, and would have a more narrow scope than this configuration
setting.
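
A rough sketch of just the "walk up" part, under stated assumptions:
the helper names are hypothetical, the check for "does this directory
have a .git" is passed in as a callback so the example does not touch
the filesystem, and the input path is assumed to be absolute with no
trailing slash. It does not cover the "is it in the parent's index"
refinement mentioned above:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical callback: nonzero if `dir` is the root of a non-bare repo. */
typedef int (*demo_repo_root_fn)(const char *dir);

/* Strip the last path component in place; return 0 when nothing is left. */
static int demo_strip_last_component(char *path)
{
	char *slash = strrchr(path, '/');

	if (!slash)
		return 0;
	if (slash == path) {
		if (path[1] == '\0')
			return 0;   /* already at "/" */
		path[1] = '\0';     /* "/foo" becomes "/" */
		return 1;
	}
	*slash = '\0';
	return 1;
}

/*
 * Return 1 if any parent directory of `bare_dir` is a repository root,
 * 0 if none is, and -1 if the path does not fit in the local buffer.
 */
int demo_has_enclosing_repo(const char *bare_dir, demo_repo_root_fn is_repo_root)
{
	char buf[4096];
	size_t len;

	assert(bare_dir != NULL && is_repo_root != NULL);
	len = strlen(bare_dir);
	if (len >= sizeof(buf))
		return -1;
	memcpy(buf, bare_dir, len + 1);

	while (demo_strip_last_component(buf))
		if (is_repo_root(buf))
			return 1;
	return 0;
}

/* Stand-in for a real check: pretend only /var/repos contains a ".git". */
int demo_is_repo_root(const char *dir)
{
	return !strcmp(dir, "/var/repos");
}
```

With the stand-in callback, /var/repos/git/git.git is reported as
embedded (because /var/repos/.git is assumed to exist), while a bare
repository under an unrelated directory is not.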

AFAICT the performance concerns aren't supported by any data, in the
case of the "submodule superproject" feature it turned out to not be the
directory walk, but us shelling out in a loop in git-submodule.sh.

Well, *maybe* that's not the case, I think I have managed to read
between the lines of some of these past exchanges that there's some odd
proprietary internal NFS-like setup at Google where *parent dirs* are
auto-mounted and searched on access, so a "walk up" pattern would be
much more expensive.

I do worry a bit about us ending up with design choices in git that we
wouldn't have ended up with, if not to cater to some in-house setup
somewhere that 99.99% of git users will never see.

But I don't have the full picture on the "submodule superproject"
problem, or this one, and maybe I'm missing something. Just food for
thought, and wondering where we're eventually taking this.

Thanks!


^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH v6 2/5] Documentation: define protected configuration
  2022-06-30 18:13           ` [PATCH v6 2/5] Documentation: define protected configuration Glen Choo via GitGitGadget
@ 2022-06-30 23:49             ` Taylor Blau
  2022-07-06 18:21               ` Glen Choo
  0 siblings, 1 reply; 113+ messages in thread
From: Taylor Blau @ 2022-06-30 23:49 UTC (permalink / raw)
  To: Glen Choo via GitGitGadget
  Cc: git, brian m. carlson, Derrick Stolee, Junio C Hamano,
	Emily Shaffer, Jonathan Tan,
	Ævar Arnfjörð Bjarmason, Glen Choo

On Thu, Jun 30, 2022 at 06:13:56PM +0000, Glen Choo via GitGitGadget wrote:
> @@ -380,6 +381,18 @@ Most configuration options are respected regardless of the scope it is
>  defined in, but some options are only respected in certain scopes. See the
>  option's documentation for the full details.
>
> +Protected configuration
> +~~~~~~~~~~~~~~~~~~~~~~~
> +
> +Protected configuration refers to the 'system', 'global', and 'command' scopes.
> +For security reasons, certain options are only respected when they are
> +specified in protected configuration, and ignored otherwise.
> +
> +Git treats these scopes as if they are controlled by the user or a trusted
> +administrator. This is because an attacker who controls these scopes can do
> +substantial harm without using Git, so it is assumed that the user's environment
> +protects these scopes against attackers.
> +

I think this description is a good starting point, but I think I would
have liked to see some more from the commit description make it into the
documentation here.

One thing that I didn't see mentioned in either is that the list of
protected configuration is far from exhaustive. There are dozens upon
dozens of configuration values that Git will happily execute as a
subprocess (core.editor, core.pager, core.alternateRefsCommand, to name
just a few).

I don't think we should try and enumerate every possible path from
configuration to command execution. But it is worth noting in the
documentation that the list of configuration values which are only read
in the protected context is non-exhaustive and best-effort only.
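
For readers following along, here is a small sketch of the rule the
quoted documentation describes: the 'system', 'global', and 'command'
scopes are protected, and a protected-only option is ignored anywhere
else. The enum and function names are hypothetical and are not Git's
own:

```c
#include <assert.h>

/* Hypothetical enum for illustration; not Git's own scope constants. */
enum demo_scope {
	DEMO_SCOPE_SYSTEM,
	DEMO_SCOPE_GLOBAL,
	DEMO_SCOPE_LOCAL,
	DEMO_SCOPE_WORKTREE,
	DEMO_SCOPE_COMMAND
};

/* Nonzero if the scope counts as protected configuration. */
int demo_scope_is_protected(enum demo_scope scope)
{
	switch (scope) {
	case DEMO_SCOPE_SYSTEM:
	case DEMO_SCOPE_GLOBAL:
	case DEMO_SCOPE_COMMAND:
		return 1;
	case DEMO_SCOPE_LOCAL:
	case DEMO_SCOPE_WORKTREE:
		return 0;
	}
	assert(!"unknown scope");
	return 0;
}

/*
 * Nonzero if a value found in `scope` should be respected. Options that
 * are not protected-only are respected regardless of scope.
 */
int demo_value_is_respected(int protected_only, enum demo_scope scope)
{
	return !protected_only || demo_scope_is_protected(scope);
}
```

Note that this only models options that opt in to the restriction; as
said above, most options that can lead to command execution are still
read from every scope.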

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH v6 3/5] config: learn `git_protected_config()`
  2022-06-30 18:13           ` [PATCH v6 3/5] config: learn `git_protected_config()` Glen Choo via GitGitGadget
@ 2022-07-01  1:22             ` Taylor Blau
  2022-07-06 22:42               ` Glen Choo
  0 siblings, 1 reply; 113+ messages in thread
From: Taylor Blau @ 2022-07-01  1:22 UTC (permalink / raw)
  To: Glen Choo via GitGitGadget
  Cc: git, brian m. carlson, Derrick Stolee, Junio C Hamano,
	Emily Shaffer, Jonathan Tan,
	Ævar Arnfjörð Bjarmason, Glen Choo

On Thu, Jun 30, 2022 at 06:13:57PM +0000, Glen Choo via GitGitGadget wrote:
> In light of constraint 1, this implementation can still be improved
> since git_protected_config() iterates through every variable in
> protected_config, which may still be too expensive. There exist constant
> time lookup functions for non-protected configuration
> (repo_config_get_*()), but for simplicity, this commit does not
> implement similar functions for protected configuration.

I don't quite follow along with this paragraph: it sounds like reading
protected configuration is supposed to be as fast as possible. But you
note that only the slower variant of reading each configuration variable
one at a time is implemented.

If we care about speed (and I think we should here), then would it make
more sense to implement only the lookup functions like
repo_config_get_*() for protected context? That would encourage usage by
providing a more limited set of options to callers.

> Signed-off-by: Glen Choo <chooglen@google.com>
> ---
>  config.c                     | 51 ++++++++++++++++++++++++++++++++++++
>  config.h                     | 17 ++++++++++++
>  t/t5544-pack-objects-hook.sh |  7 ++++-
>  upload-pack.c                | 27 ++++++++++++-------
>  4 files changed, 91 insertions(+), 11 deletions(-)
>
> diff --git a/config.c b/config.c
> index 9b0e9c93285..29e62f5d0ed 100644
> --- a/config.c
> +++ b/config.c
> @@ -81,6 +81,18 @@ static enum config_scope current_parsing_scope;
>  static int pack_compression_seen;
>  static int zlib_compression_seen;
>
> +/*
> + * Config that comes from trusted sources, namely:

Should we be using the word "scope" here instead of sources? I think
it's clear enough from the context what you're referring to, but in the
spirit of being consistent...

> + * - system config files (e.g. /etc/gitconfig)
> + * - global config files (e.g. $HOME/.gitconfig,
> + *   $XDG_CONFIG_HOME/git)
> + * - the command line.
> + *
> + * This is declared here for code cleanliness, but unlike the other
> + * static variables, this does not hold config parser state.
> + */
> +static struct config_set protected_config;
> +
>  static int config_file_fgetc(struct config_source *conf)
>  {
>  	return getc_unlocked(conf->u.file);
> @@ -2378,6 +2390,11 @@ int git_configset_add_file(struct config_set *cs, const char *filename)
>  	return git_config_from_file(config_set_callback, filename, cs);
>  }
>
> +int git_configset_add_parameters(struct config_set *cs)
> +{
> +	return git_config_from_parameters(config_set_callback, cs);
> +}
> +
>  int git_configset_get_value(struct config_set *cs, const char *key, const char **value)
>  {
>  	const struct string_list *values = NULL;
> @@ -2619,6 +2636,40 @@ int repo_config_get_pathname(struct repository *repo,
>  	return ret;
>  }
>
> +/* Read values into protected_config. */
> +static void read_protected_config(void)
> +{
> +	char *xdg_config = NULL, *user_config = NULL, *system_config = NULL;
> +
> +	git_configset_init(&protected_config);
> +
> +	system_config = git_system_config();
> +	git_global_config(&user_config, &xdg_config);
> +
> +	git_configset_add_file(&protected_config, system_config);
> +	git_configset_add_file(&protected_config, xdg_config);
> +	git_configset_add_file(&protected_config, user_config);
> +	git_configset_add_parameters(&protected_config);
> +
> +	free(system_config);
> +	free(xdg_config);
> +	free(user_config);
> +}
> +
> +/* Ensure that protected_config has been initialized. */
> +static void git_protected_config_check_init(void)
> +{
> +	if (protected_config.hash_initialized)
> +		return;
> +	read_protected_config();
> +}
> +
> +void git_protected_config(config_fn_t fn, void *data)
> +{
> +	git_protected_config_check_init();

This may be copying from an existing pattern, but I think you could
avoid the extra function declaration by writing git_protected_config()
as:

    if (!protected_config.hash_initialized)
        read_protected_config();
    configset_iter(&protected_config, fn, data);

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH v6 5/5] setup.c: create `discovery.bare`
  2022-06-30 18:13           ` [PATCH v6 5/5] setup.c: create `discovery.bare` Glen Choo via GitGitGadget
@ 2022-07-01  1:30             ` Taylor Blau
  2022-07-07 19:55               ` Glen Choo
  0 siblings, 1 reply; 113+ messages in thread
From: Taylor Blau @ 2022-07-01  1:30 UTC (permalink / raw)
  To: Glen Choo via GitGitGadget
  Cc: git, brian m. carlson, Derrick Stolee, Junio C Hamano,
	Emily Shaffer, Jonathan Tan,
	Ævar Arnfjörð Bjarmason, Glen Choo

On Thu, Jun 30, 2022 at 06:13:59PM +0000, Glen Choo via GitGitGadget wrote:
> If we want to protect users from such attacks by default, neither value
> will suffice - "always" provides no protection, but "never" is
> impractical for bare repository users. A more usable default would be to
> allow only non-embedded bare repositories ([2] contains one such
> proposal), but detecting if a repository is embedded is potentially
> non-trivial, so this work is not implemented in this series.

I think that everything you said in your patch message makes sense, but
I appreciate this paragraph in particular. The historical record is
definitely important and worth preserving here, and I hope that it'll be
helpful to future readers who may wonder why the default wasn't chosen
as "never".

> [1]: https://lore.kernel.org/git/kl6lsfqpygsj.fsf@chooglen-macbookpro.roam.corp.google.com
> [2]: https://lore.kernel.org/git/5b969c5e-e802-c447-ad25-6acc0b784582@github.com
>
> Signed-off-by: Glen Choo <chooglen@google.com>
> ---
>  Documentation/config.txt           |  2 ++
>  Documentation/config/discovery.txt | 23 ++++++++++++
>  setup.c                            | 57 +++++++++++++++++++++++++++++-
>  t/t0035-discovery-bare.sh          | 52 +++++++++++++++++++++++++++
>  4 files changed, 133 insertions(+), 1 deletion(-)
>  create mode 100644 Documentation/config/discovery.txt
>  create mode 100755 t/t0035-discovery-bare.sh
>
> diff --git a/Documentation/config.txt b/Documentation/config.txt
> index e284b042f22..9a5e1329772 100644
> --- a/Documentation/config.txt
> +++ b/Documentation/config.txt
> @@ -409,6 +409,8 @@ include::config/diff.txt[]
>
>  include::config/difftool.txt[]
>
> +include::config/discovery.txt[]
> +
>  include::config/extensions.txt[]
>
>  include::config/fastimport.txt[]
> diff --git a/Documentation/config/discovery.txt b/Documentation/config/discovery.txt
> new file mode 100644
> index 00000000000..bbcf89bb0b5
> --- /dev/null
> +++ b/Documentation/config/discovery.txt
> @@ -0,0 +1,23 @@
> +discovery.bare::
> +	Specifies whether Git will work with a bare repository that it
> +	found during repository discovery. If the repository is

Is it clear from the context what "discovery" means here? It's probably
easier to describe what it isn't, which you kind of do in the next
sentence. But it may be clearer to say something like:

    Specifies whether Git will recognize bare repositories that aren't
    specified via the top-level `--git-dir` command-line option, or the
    `GIT_DIR` environment variable (see linkgit:git[1]).

> +This defaults to `always`, but this default may change in the future.

I think the default being subject to change is par for the course. It's
probably easy enough to just say "Defaults to 'always'" and leave it at
that.

> ++
> +If you do not use bare repositories in your workflow, then it may be
> +beneficial to set `discovery.bare` to `never` in your global config.
> +This will protect you from attacks that involve cloning a repository
> +that contains a bare repository and running a Git command within that
> +directory.

I think we still don't have a great answer for people who trust some
bare repositories (e.g., known-embedded repositories that are used for
testing) but not others. To be clear, I think that is a fine point to
concede with this direction.

But we should be clear about that limitation by stating that Git does
not support the "I trust some bare repositories to be safely
discoverable but not others" use case.

> +static enum discovery_bare_allowed get_discovery_bare(void)
> +{
> +	enum discovery_bare_allowed result = DISCOVERY_BARE_ALWAYS;
> +	git_protected_config(discovery_bare_cb, &result);
> +	return result;
> +}
> +
> +static const char *discovery_bare_allowed_to_string(
> +	enum discovery_bare_allowed discovery_bare_allowed)
> +{
> +	switch (discovery_bare_allowed) {
> +	case DISCOVERY_BARE_NEVER:
> +		return "never";
> +	case DISCOVERY_BARE_ALWAYS:
> +		return "always";

> +	default:
> +		BUG("invalid discovery_bare_allowed %d",
> +		    discovery_bare_allowed);

Should we have a default case here since the case arms above are
exhaustive?

> +	}
> +	return NULL;
> +}
> +
>  enum discovery_result {
>  	GIT_DIR_NONE = 0,
>  	GIT_DIR_EXPLICIT,
> @@ -1151,7 +1195,8 @@ enum discovery_result {
>  	GIT_DIR_HIT_CEILING = -1,
>  	GIT_DIR_HIT_MOUNT_POINT = -2,
>  	GIT_DIR_INVALID_GITFILE = -3,
> -	GIT_DIR_INVALID_OWNERSHIP = -4
> +	GIT_DIR_INVALID_OWNERSHIP = -4,
> +	GIT_DIR_DISALLOWED_BARE = -5,
>  };
>
>  /*
> @@ -1248,6 +1293,8 @@ static enum discovery_result setup_git_directory_gently_1(struct strbuf *dir,
>  		}
>
>  		if (is_git_directory(dir->buf)) {
> +			if (!get_discovery_bare())

Relying on NEVER being the zero value here seems fragile to me. Should
we check `if (get_discovery_bare() == DISCOVERY_BARE_NEVER)` instead, to
be more explicit here?
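
A small self-contained sketch of the explicit form, using hypothetical
names rather than the ones in the patch. The extra "unknown" member is
an assumption added here to show why a truthiness test is easy to get
wrong once the enum has more than two values:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical enum; the "unknown" member is added for illustration. */
enum demo_bare_allowed {
	DEMO_BARE_UNKNOWN = -1,
	DEMO_BARE_NEVER,
	DEMO_BARE_ALWAYS
};

/* Map a config value to the enum; unrecognized strings are "unknown". */
enum demo_bare_allowed demo_parse_bare(const char *value)
{
	assert(value != NULL);
	if (!strcmp(value, "never"))
		return DEMO_BARE_NEVER;
	if (!strcmp(value, "always"))
		return DEMO_BARE_ALWAYS;
	return DEMO_BARE_UNKNOWN;
}

/* Compare against the named constant instead of testing for zero. */
int demo_bare_is_disallowed(enum demo_bare_allowed allowed)
{
	return allowed == DEMO_BARE_NEVER;
}
```

The comparison keeps working if the members are reordered or given
different numeric values, whereas `!allowed` depends on "never" staying
at zero.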

> +				return GIT_DIR_DISALLOWED_BARE;
>  			if (!ensure_valid_ownership(dir->buf))
>  				return GIT_DIR_INVALID_OWNERSHIP;
>  			strbuf_addstr(gitdir, ".");
> @@ -1394,6 +1441,14 @@ const char *setup_git_directory_gently(int *nongit_ok)
>  		}
>  		*nongit_ok = 1;
>  		break;
> +	case GIT_DIR_DISALLOWED_BARE:
> +		if (!nongit_ok) {
> +			die(_("cannot use bare repository '%s' (discovery.bare is '%s')"),
> +			    dir.buf,
> +			    discovery_bare_allowed_to_string(get_discovery_bare()));
> +		}
> +		*nongit_ok = 1;
> +		break;
>  	case GIT_DIR_NONE:
>  		/*
>  		 * As a safeguard against setup_git_directory_gently_1 returning

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 113+ messages in thread

* Re: [PATCH v6 0/5] config: introduce discovery.bare and protected config
  2022-06-30 23:07           ` Ævar Arnfjörð Bjarmason
@ 2022-07-01 17:37             ` Glen Choo
  2022-07-08 21:58               ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 113+ messages in thread
From: Glen Choo @ 2022-07-01 17:37 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason, Glen Choo via GitGitGadget
  Cc: git, Taylor Blau, brian m. carlson, Derrick Stolee,
	Junio C Hamano, Emily Shaffer, Jonathan Tan

Thanks for weighing in :) Despite the different proposed approaches, I
think we actually are in broad agreement.

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> On Thu, Jun 30 2022, Glen Choo via GitGitGadget wrote:
>
>> This is a quick re-roll to address Ævar's comments on the tests (thanks!).
>
> Thanks!
>
>> = Description
>
> Just more generally on this series & approach. I know this is a v6 by
> now, but I haven't kept up with this topic, but to be fair I did mention
> pretty much this in:
> https://lore.kernel.org/git/220407.86lewhc6bz.gmgdl@evledraar.gmail.com/
>
> So...
>
>> There is a known social engineering attack that takes advantage of the fact
>> that a working tree can include an entire bare repository, including a
>> config file. A user could run a Git command inside the bare repository
>> thinking that the config file of the 'outer' repository would be used, but
>> in reality, the bare repository's config file (which is attacker-controlled)
>> is used, which may result in arbitrary code execution. See [1] for a fuller
>> description and deeper discussion.
>>
>> This series implements a simple way of preventing such attacks: create a
>> config option, discovery.bare, that tells Git whether or not to die when it
>> finds a bare repository. discovery.bare has two values:
>>
>>  * "always": always allow bare repositories (default), identical to current
>>    behavior
>>  * "never": never allow bare repositories
>>
>> and users/system administrators who never expect to work with bare
>> repositories can secure their environments using "never". discovery.bare has
>> no effect if --git-dir or GIT_DIR is passed because we are confident that
>> the user is not confused about which repository is being used.
>
> I'm not insisting that the entire approach here should be changed, but
> in the above exchange you seemed to have performance concerns about the
> "just walk up in setup.c" approach I mentioned, but it's not clear if
> that's still the only thing that necessitates taking this approach.
>
> There may be security subtleties that I've missed, but from the
> description here it seems like that would work equally well, and
> wouldn't require configuration, except insofar as we'd need to opt-in to
> reading config from bare repositories *that also exist in a parent tree*.
>
> And it would be a more narrow & more secure solution, since it would
> e.g. allow you to intentionally navigate to /var/repos/git/git.git in a
> server setup and read the config there, which it could distinguish from
> a case of /var/repos/.git existing, and git/git.git being brought in as
> a part of that "parent" repo.

Performance is one major concern, yes, and I agree that your findings
show that the "just walk up" approach is cheap enough to consider doing.
Though in the few cases where it isn't cheap to walk, wouldn't it still
be useful to be able to opt out of it?

The other concern is simplicity and correctness. Are we confident that
we'll get the design of "just walk up" correct (including edge cases
like "bare repo in bare repo in non bare repo")? I'm 100% confident that
we'll get it right eventually, and that this approach will be a good
default for all users. But in comparison, "never" is so much easier to
understand and implement that I don't see why we shouldn't start by
presenting this option to the 0.1-1% of users who would find it useful.

And on the topic of simplicity, there's significant interest in
maintaining backwards-compatibility with repos with workflows that
absolutely depend on embedded bare repos, e.g. libgit2 and Git-LFS.
That's yet another special case that we'd have to get right. Stolee's
"no-embedded" proposal [1] pretty much covers that, but I don't see the
harm in simplifying the design space by making bare repo support a
non-goal.

[1] https://lore.kernel.org/git/5b969c5e-e802-c447-ad25-6acc0b784582@github.com

> The "more narrow" and "more secure" go hand-in-hand, since if you work
> on such servers you'd turn this to "always" because you want to read
> such config, but then be left vulnerable to the actual (and much rarer)
> exploit we're trying to prevent.

The point that we're not defending bare repo users is fair, but maybe
the group we're trying to protect isn't really dedicated Git-serving
servers. This exploit requires you to have a bare repo inside the
working tree of a non-bare repo. So I think this is less of an issue for
a server, and more for "mixed-use" environments with both regular and
bare clones.

> Which, it seems...
>
>> This series does not change the default behavior, but in the long-run, a
>> "no-embedded" option might be a safe and usable default [2]. "never" is too
>> restrictive and unlikely to be the default.
>
> This series has (since v3?) been noting aspirations to have a
> "no-embedded" variant of this config, which your 5/5 here notes would be
> better, but isn't implemented by this series.
>
> But your 5/5 also notes:
>
>     but detecting if a repository is embedded is potentially
>     non-trivial, so this work is not implemented in this series.
>
> Hrm, well, the diff-stat isn't quite that trivial either :) :

Well.. a lot of it is refactoring :P

>> [...]
>>  upload-pack.c                       | 27 ++++++----
>>  12 files changed, 304 insertions(+), 47 deletions(-)
>
> In threads linked from the above ML link I linked to some POC code
> showing how to hack a second .git discovery walk into setup.c. This was
> as part of the "submodule parent dir" proposal, which is a different
> feature, but also needs such "find the parent" code:
> https://lore.kernel.org/git/211109.86v912dtfw.gmgdl@evledraar.gmail.com/
>
> Now, obviously that's a dirty hack, but it's not that hard to just
> change the part of setup.c where we're satisfied that we've found the
> git dir, then walk up "$THAT_DIR/..", and start our search again.
>
> Then:
>
> 	if (first_dir_was_bare() && found_parent_dir())
> 		enforce_no_embedded();
>
> Isn't that what your proposed "no embedded" option would need to do?
> Well, maybe we'd also check if the "first dir" is in the index of the
> parent, as opposed to just being a bare .git somewhere in ~/Downloads,
> e.g. if you have a ~/.git and keep your dot-files in git.
>
> But I think for an initial implementation just doing the walk would be
> good enough, and would have a more narrow scope than this configuration
> setting.

A narrow scope is good, but I don't agree on this definition of
"narrow". My preference is to give an obvious solution to a 'narrow'
group of users, instead of a more tricky solution that affects all users
in a 'narrow' set of cases.

> AFAICT the performance concerns aren't supported by any data, in the
> case of the "submodule superproject" feature it turned out to not be the
> directory walk, but us shelling out in a loop in git-submodule.sh.
>
> Well, *maybe* that's not the case, I think I have managed to read
> between the lines of some of these past exchanges that there's some odd
> proprietary internal NFS-like setup at Google where *parent dirs* are
> auto-mounted and searched on access, so a "walk up" pattern would be
> much more expensive.
>
> I do worry a bit about us ending up with design choices in git that we
> wouldn't have ended up with, if not to cater to some in-house setup
> somewhere that 99.99% of git users will never see.

At the very least, I don't think you're saying that it's a bad idea to
have "never", just that we might not have come up with it if not for
some Google NFS thing.

Another use case I can think of is CI bots, which have no need for bare
repos. To some folks (maybe in very security-sensitive environments),
"never" might give more peace of mind than "no-embedded".

> But I don't have the full picture on the "submodule superproject"
> problem, or this one, and maybe I'm missing something. Just food for
> thought, and wondering where we're eventually taking this.
>
> Thanks!


* Re: [PATCH v6 1/5] Documentation/git-config.txt: add SCOPES section
  2022-06-30 22:32             ` Taylor Blau
@ 2022-07-06 17:44               ` Glen Choo
  0 siblings, 0 replies; 113+ messages in thread
From: Glen Choo @ 2022-07-06 17:44 UTC (permalink / raw)
  To: Taylor Blau, Glen Choo via GitGitGadget
  Cc: git, brian m. carlson, Derrick Stolee, Junio C Hamano,
	Emily Shaffer, Jonathan Tan,
	Ævar Arnfjörð Bjarmason


Hi! Thanks so much for lending your attention to this version again. I
really appreciate the wording feedback in particular: the Review Club
reviewers and I agonized over the wording and couldn't come up with
great alternatives to what I wrote in the patch, so your suggestions
are super helpful.

Taylor Blau <me@ttaylorr.com> writes:

> On Thu, Jun 30, 2022 at 06:13:55PM +0000, Glen Choo via GitGitGadget wrote:
>> From: Glen Choo <chooglen@google.com>
>> diff --git a/Documentation/git-config.txt b/Documentation/git-config.txt
>> index 9376e39aef2..f93d437b898 100644
>> --- a/Documentation/git-config.txt
>> +++ b/Documentation/git-config.txt
>> @@ -297,8 +297,8 @@ The default is to use a pager.
>>  FILES
>>  -----
>>
>> -If not set explicitly with `--file`, there are four files where
>> -'git config' will search for configuration options:
>> +By default, 'git config' will read configuration options from multiple
>> +files:
>>
>>  $(prefix)/etc/gitconfig::
>>  	System-wide configuration file.
>> @@ -322,27 +322,63 @@ $GIT_DIR/config.worktree::
>>  	This is optional and is only searched when
>>  	`extensions.worktreeConfig` is present in $GIT_DIR/config.
>>
>> -If no further options are given, all reading options will read all of these
>> -files that are available. If the global or the system-wide configuration
>> -file are not available they will be ignored. If the repository configuration
>> -file is not available or readable, 'git config' will exit with a non-zero
>> -error code. However, in neither case will an error message be issued.
>> +You may also provide additional configuration parameters when running any
>> +git command by using the `-c` option. See linkgit:git[1] for details.
>> +
>> +Options will be read from all of these files that are available. If the
>> +global or the system-wide configuration file are not available they will be
>> +ignored. If the repository configuration file is not available or readable,
>> +'git config' will exit with a non-zero error code. However, in neither case
>> +will an error message be issued.
>
> Nit: the last sentence is a little awkwardly worded. Perhaps just:
> "Note that neither case produces an error message".

Good suggestion. I didn't change this sentence, but I agree that it's
worth improving.

>> -All writing options will per default write to the repository specific
>> +By default, options are only written to the repository specific
>>  configuration file. Note that this also affects options like `--replace-all`
>
> Should we mention that this is the same as the "local" scope below?

Also a good idea.

>>  and `--unset`. *'git config' will only ever change one file at a time*.
>>
>> -You can override these rules using the `--global`, `--system`,
>> -`--local`, `--worktree`, and `--file` command-line options; see
>> -<<OPTIONS>> above.
>> +You can change the way options are read/written by specifying the path to a
>> +file (`--file`), or by specifying a configuration scope (`--system`,
>> +`--global`, `--local`, `--worktree`); see <<OPTIONS>> above.
>
> I think this paragraph could be slightly more descriptive about what
> `--file` does while still linking out to <<OPTIONS>> above for more
> detailed information. In the pre-image, we say:
>
>     If not set explicitly with `--file`, there are four files where
>     `git config` will search.
>
> So I wonder if something more descriptive in this section might be:
>
>     You can limit which configuration sources are read from or written
>     to by specifying the path of a file with the `--file` option, or
>     by specifying a scope with `--system`, `--global`, `--local`, or
>     `--worktree`. For more, see <<OPTIONS>> above.
>
> I don't think that's so different from what you wrote, but I think it's
> a little clearer, particularly about what `--file` does (instead of
> "change the way options are read/written" it "limit[s] which
> configuration sources are read from or written to").

I think this is _much_ clearer, actually. Thanks!

>> +
>> +SCOPES
>> +------
>> +
>> +Each configuration source falls within a configuration scope. The scopes
>> +are:
>> +
>> +system::
>> +	$(prefix)/etc/gitconfig
>> +
>> +global::
>> +	$XDG_CONFIG_HOME/git/config
>> ++
>> +~/.gitconfig
>> +
>> +local::
>> +	$GIT_DIR/config
>> +
>> +worktree::
>> +	$GIT_DIR/config.worktree
>> +
>> +command::
>> +	environment variables
>> ++
>> +the `-c` option
>> +
>> +With the exception of 'command', each scope corresponds to a command line
>> +option - `--system`, `--global`, `--local`, `--worktree`.
>
> I think a colon after "option" is more appropriate than a single "-"
> dash character, but this is definitely a trivial matter that I have no
> strong opinion on.
>
> One thing that this reminds me of (which I don't think is worth taking
> up here, but perhaps in a future series, or as #leftoverbits) would be
> promoting these scopes behind a single option. Back in the day, you
> could ask for values out of `git config` by specifying their type with
> `--int`, `--bool`, or similar. In e3e042b185 (Merge branch
> 'tb/config-type', 2018-05-08), we changed to
> `--type=<int|bool|color|etc>`, which unified things and made it clearer
> which options were grouped together by a single concept.
>
> I think a similar change would make sense here, that is to replace
> `--system`, `--global` (and so on) with `--scope=system`,
> `--scope=global`, etc.
>
> But that's not material to this series, and just something to think
> about for later on if you end up thinking it's a good idea.

This sounds like a great idea, actually. I agree that `--scope` is
probably a lot easier to reason about than having N scope flags, and
that this probably belongs in a future series.

>> +
>> +When reading options, specifying a scope will only read options from the
>> +files within that scope. When writing options, specifying a scope will write
>> +to the files within that scope (instead of the repository specific
>> +configuration file). See <<OPTIONS>> above for a complete description.
>>
>> +Most configuration options are respected regardless of the scope it is
>> +defined in, but some options are only respected in certain scopes. See the
>> +option's documentation for the full details.
>
> I assume "the option's" is referring to whichever configuration variable
> we're talking about. So it may be clearer to say "See the *respective*
> option's documentation for more information" or similar.

Good idea. Thanks again!

>
> Thanks,
> Taylor


* Re: [PATCH v6 2/5] Documentation: define protected configuration
  2022-06-30 23:49             ` Taylor Blau
@ 2022-07-06 18:21               ` Glen Choo
  0 siblings, 0 replies; 113+ messages in thread
From: Glen Choo @ 2022-07-06 18:21 UTC (permalink / raw)
  To: Taylor Blau, Glen Choo via GitGitGadget
  Cc: git, brian m. carlson, Derrick Stolee, Junio C Hamano,
	Emily Shaffer, Jonathan Tan,
	Ævar Arnfjörð Bjarmason

Taylor Blau <me@ttaylorr.com> writes:

> On Thu, Jun 30, 2022 at 06:13:56PM +0000, Glen Choo via GitGitGadget wrote:
>> @@ -380,6 +381,18 @@ Most configuration options are respected regardless of the scope it is
>>  defined in, but some options are only respected in certain scopes. See the
>>  option's documentation for the full details.
>>
>> +Protected configuration
>> +~~~~~~~~~~~~~~~~~~~~~~~
>> +
>> +Protected configuration refers to the 'system', 'global', and 'command' scopes.
>> +For security reasons, certain options are only respected when they are
>> +specified in protected configuration, and ignored otherwise.
>> +
>> +Git treats these scopes as if they are controlled by the user or a trusted
>> +administrator. This is because an attacker who controls these scopes can do
>> +substantial harm without using Git, so it is assumed that the user's environment
>> +protects these scopes against attackers.
>> +
>
> I think this description is a good starting point, but I think I would
> have liked to see some more from the commit description make it into the
> documentation here.

Yeah, there's a bit of a tradeoff here. Glossing over some of the
details helps keep the documentation briefer and easier to understand
for the less experienced/invested, but is bound to frustrate others. I'd
appreciate any wording suggestions if you have any.

> One thing that I didn't see mentioned in either is that the list of
> protected configuration is far from exhaustive. There are dozens upon
> dozens of configuration values that Git will happily execute as a
> subprocess (core.editor, core.pager, core.alternateRefsCommand, to name
> just a few).
>
> I don't think we should try and enumerate every possible path from
> configuration to command execution. But it is worth noting in the
> documentation that the list of configuration values which are only read
> in the protected context is non-exhaustive and best-effort only.

By referencing command execution, I think you are alluding to Stolee's
Security Boundary discussion thread [1], and in particular, the "Example
Security Boundary Question: Unprotected Config and Executing Code"
section?

That section discusses the problem of arbitrary command execution based
on repository-local config, and how protected configuration might give
us a way to prevent that. That's a reasonable extension to this series,
though it seems a little premature to include allusions to command
execution, especially since I don't think we're anywhere close to a
long-term direction on what should/shouldn't be inside protected
configuration. For example, Stolee noted that most of the command
execution options really do want per-repository customization, so if we
want to continue to support that, we'll need to use protected
configuration in a somewhat sophisticated manner (and not, e.g. only
respect command execution options in protected configuration). Perhaps
we could shelve this wording change until we've committed to such a
direction.

[1] https://lore.kernel.org/git/6af83767-576b-75c4-c778-0284344a8fe7@github.com

> Thanks,
> Taylor


* Re: [PATCH v6 3/5] config: learn `git_protected_config()`
  2022-07-01  1:22             ` Taylor Blau
@ 2022-07-06 22:42               ` Glen Choo
  0 siblings, 0 replies; 113+ messages in thread
From: Glen Choo @ 2022-07-06 22:42 UTC (permalink / raw)
  To: Taylor Blau, Glen Choo via GitGitGadget
  Cc: git, brian m. carlson, Derrick Stolee, Junio C Hamano,
	Emily Shaffer, Jonathan Tan,
	Ævar Arnfjörð Bjarmason

Taylor Blau <me@ttaylorr.com> writes:

> On Thu, Jun 30, 2022 at 06:13:57PM +0000, Glen Choo via GitGitGadget wrote:
>> In light of constraint 1, this implementation can still be improved
>> since git_protected_config() iterates through every variable in
>> protected_config, which may still be too expensive. There exist constant
>> time lookup functions for non-protected configuration
>> (repo_config_get_*()), but for simplicity, this commit does not
>> implement similar functions for protected configuration.
>
> I don't quite follow along with this paragraph: it sounds like reading
> protected configuration is supposed to be as fast as possible. But you
> note that only the slower variant of reading each configuration variable
> one at a time is implemented.

Right. I should have been clearer that this implementation is "fast
enough without introducing too much noise/complexity", and not "as fast
as possible".

> If we care about speed (and I think we should here), then would it make
> more sense to implement only the lookup functions like
> repo_config_get_*() for protected context? That would encourage usage by
> providing a more limited set of options to callers.

I held off on implementing these functions because:

- It requires rewriting `safe.directory`, which reads a multivalued
  string using a config iterator. It's not onerous to do (I had a POC
  of this at some point), but it seemed pretty noisy.
- It seems too noisy to implement all of the protected_config_get_*()
  functions, and a little inconsistent to only implement the ones used
  in this series (but maybe a little inconsistency is ok?)

But maybe a little noise and inconsistency is worth the performance
improvement, especially since it's been brought up ~1.5 times before
this [1] [2]. I'll do this for sure if you feel strongly about it;
otherwise I'll try it out to see what I think about it.

[1] https://lore.kernel.org/git/802c3541-3301-43fc-c39e-edd44e61a4eb@github.com
[2] https://lore.kernel.org/git/xmqqbkv4t7gp.fsf@gitster.g
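
To make the trade-off concrete, here is a rough, self-contained sketch of
the two calling styles. It is not code from this series: every toy_* name
is hypothetical, and the "config set" is a plain array rather than Git's
hash-backed struct config_set. The iterator style hands every variable to
the callback, while the lookup style lets the caller name the single key
it cares about.

```c
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a config set; this is not Git's struct config_set. */
struct toy_entry {
	const char *key;
	const char *value;
};

typedef int (*toy_config_fn)(const char *key, const char *value, void *data);

/* Iterator style: the callback sees every entry, relevant or not. */
static int toy_config_iter(const struct toy_entry *set, size_t nr,
			   toy_config_fn fn, void *data)
{
	size_t i;

	for (i = 0; i < nr; i++)
		if (fn(set[i].key, set[i].value, data) < 0)
			return -1;
	return 0;
}

/* Callback that remembers the last value seen for "discovery.bare". */
static int toy_discovery_bare_cb(const char *key, const char *value, void *data)
{
	const char **out = data;

	if (!strcmp(key, "discovery.bare"))
		*out = value;
	return 0;
}

/*
 * Lookup style: the caller names one key and gets the last value back.
 * This toy still scans an array; a real config set would use a hash map.
 */
static const char *toy_config_get(const struct toy_entry *set, size_t nr,
				  const char *key)
{
	size_t i = nr;

	while (i--)
		if (!strcmp(set[i].key, key))
			return set[i].value;
	return NULL;
}
```

Both styles return the same answer under "last value wins"; the difference
is only in how much work the caller has to do and how many entries are
visited when the backing store supports keyed lookup.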

>> Signed-off-by: Glen Choo <chooglen@google.com>
>> ---
>>  config.c                     | 51 ++++++++++++++++++++++++++++++++++++
>>  config.h                     | 17 ++++++++++++
>>  t/t5544-pack-objects-hook.sh |  7 ++++-
>>  upload-pack.c                | 27 ++++++++++++-------
>>  4 files changed, 91 insertions(+), 11 deletions(-)
>>
>> diff --git a/config.c b/config.c
>> index 9b0e9c93285..29e62f5d0ed 100644
>> --- a/config.c
>> +++ b/config.c
>> @@ -81,6 +81,18 @@ static enum config_scope current_parsing_scope;
>>  static int pack_compression_seen;
>>  static int zlib_compression_seen;
>>
>> +/*
>> + * Config that comes from trusted sources, namely:
>
> Should we be using the word "scope" here instead of sources? I think
> it's clear enough from the context what you're referring to, but in the
> spirit of being consistent...

Good catch.

>> + * - system config files (e.g. /etc/gitconfig)
>> + * - global config files (e.g. $HOME/.gitconfig,
>> + *   $XDG_CONFIG_HOME/git)
>> + * - the command line.
>> + *
>> + * This is declared here for code cleanliness, but unlike the other
>> + * static variables, this does not hold config parser state.
>> + */
>> +static struct config_set protected_config;
>> +
>>  static int config_file_fgetc(struct config_source *conf)
>>  {
>>  	return getc_unlocked(conf->u.file);
>> @@ -2378,6 +2390,11 @@ int git_configset_add_file(struct config_set *cs, const char *filename)
>>  	return git_config_from_file(config_set_callback, filename, cs);
>>  }
>>
>> +int git_configset_add_parameters(struct config_set *cs)
>> +{
>> +	return git_config_from_parameters(config_set_callback, cs);
>> +}
>> +
>>  int git_configset_get_value(struct config_set *cs, const char *key, const char **value)
>>  {
>>  	const struct string_list *values = NULL;
>> @@ -2619,6 +2636,40 @@ int repo_config_get_pathname(struct repository *repo,
>>  	return ret;
>>  }
>>
>> +/* Read values into protected_config. */
>> +static void read_protected_config(void)
>> +{
>> +	char *xdg_config = NULL, *user_config = NULL, *system_config = NULL;
>> +
>> +	git_configset_init(&protected_config);
>> +
>> +	system_config = git_system_config();
>> +	git_global_config(&user_config, &xdg_config);
>> +
>> +	git_configset_add_file(&protected_config, system_config);
>> +	git_configset_add_file(&protected_config, xdg_config);
>> +	git_configset_add_file(&protected_config, user_config);
>> +	git_configset_add_parameters(&protected_config);
>> +
>> +	free(system_config);
>> +	free(xdg_config);
>> +	free(user_config);
>> +}
>> +
>> +/* Ensure that protected_config has been initialized. */
>> +static void git_protected_config_check_init(void)
>> +{
>> +	if (protected_config.hash_initialized)
>> +		return;
>> +	read_protected_config();
>> +}
>> +
>> +void git_protected_config(config_fn_t fn, void *data)
>> +{
>> +	git_protected_config_check_init();
>
> This may be copying from an existing pattern, but I think you could
> avoid the extra function declaration by writing git_protected_config()
> as:
>
>     if (!protected_config.hash_initialized)
>         read_protected_config();
>     configset_iter(&protected_config, fn, data);

You're right, I can drop this if I don't implement
protected_config_get_*(); this pattern only makes sense for
git_config_check_init() because it's called by multiple functions.
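
For illustration, the inlined form of that pattern can be sketched as
below. This is a toy under stated assumptions, not the code in config.c:
the toy_* names and the read counter are hypothetical and exist only so
the "read once" behavior can be observed.

```c
/* Toy state; the names are hypothetical and do not match config.c. */
static int toy_initialized;
static int toy_read_count;

/* Stand-in for the expensive step of reading the config files. */
static void toy_read_protected_config(void)
{
	toy_read_count++;
	toy_initialized = 1;
}

/*
 * Single entry point with the lazy-initialization check inlined.
 * Returns how many times the files have been read so far.
 */
static int toy_protected_config(void)
{
	if (!toy_initialized)
		toy_read_protected_config();
	return toy_read_count;
}
```

With only one caller, a separate check_init() helper adds nothing; it
starts to pay off once several entry points need the same guard.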

> Thanks,
> Taylor


* Re: [PATCH v6 5/5] setup.c: create `discovery.bare`
  2022-07-01  1:30             ` Taylor Blau
@ 2022-07-07 19:55               ` Glen Choo
  0 siblings, 0 replies; 113+ messages in thread
From: Glen Choo @ 2022-07-07 19:55 UTC (permalink / raw)
  To: Taylor Blau, Glen Choo via GitGitGadget
  Cc: git, brian m. carlson, Derrick Stolee, Junio C Hamano,
	Emily Shaffer, Jonathan Tan,
	Ævar Arnfjörð Bjarmason

Taylor Blau <me@ttaylorr.com> writes:

> On Thu, Jun 30, 2022 at 06:13:59PM +0000, Glen Choo via GitGitGadget wrote:
>> [1]: https://lore.kernel.org/git/kl6lsfqpygsj.fsf@chooglen-macbookpro.roam.corp.google.com
>> [2]: https://lore.kernel.org/git/5b969c5e-e802-c447-ad25-6acc0b784582@github.com
>>
>> Signed-off-by: Glen Choo <chooglen@google.com>
>> ---
>>  Documentation/config.txt           |  2 ++
>>  Documentation/config/discovery.txt | 23 ++++++++++++
>>  setup.c                            | 57 +++++++++++++++++++++++++++++-
>>  t/t0035-discovery-bare.sh          | 52 +++++++++++++++++++++++++++
>>  4 files changed, 133 insertions(+), 1 deletion(-)
>>  create mode 100644 Documentation/config/discovery.txt
>>  create mode 100755 t/t0035-discovery-bare.sh
>>
>> diff --git a/Documentation/config.txt b/Documentation/config.txt
>> index e284b042f22..9a5e1329772 100644
>> --- a/Documentation/config.txt
>> +++ b/Documentation/config.txt
>> @@ -409,6 +409,8 @@ include::config/diff.txt[]
>>
>>  include::config/difftool.txt[]
>>
>> +include::config/discovery.txt[]
>> +
>>  include::config/extensions.txt[]
>>
>>  include::config/fastimport.txt[]
>> diff --git a/Documentation/config/discovery.txt b/Documentation/config/discovery.txt
>> new file mode 100644
>> index 00000000000..bbcf89bb0b5
>> --- /dev/null
>> +++ b/Documentation/config/discovery.txt
>> @@ -0,0 +1,23 @@
>> +discovery.bare::
>> +	Specifies whether Git will work with a bare repository that it
>> +	found during repository discovery. If the repository is
>
> Is it clear from the context what "discovery" means here? It's probably
> easier to describe what it isn't, which you kind of do in the next
> sentence. But it may be clearer to say something like:
>
>     Specifies whether Git will recognize bare repositories that aren't
>     specified via the top-level `--git-dir` command-line option, or the
>     `GIT_DIR` environment variable (see linkgit:git[1]).

Hm, that's a good point, and the suggestion is very well-worded. In
addition to what you have, I think we should make reference to
"discovery" _somewhere_ in here since the option is named
`discovery.bare`, and this seems like a good teaching opportunity.

>> +This defaults to `always`, but this default may change in the future.
>
> I think the default being subject to change is par for the course. It's
> probably easy enough to just say "Defaults to 'always'" and leave it at
> that.

Makes sense.

>> +static enum discovery_bare_allowed get_discovery_bare(void)
>> +{
>> +	enum discovery_bare_allowed result = DISCOVERY_BARE_ALWAYS;
>> +	git_protected_config(discovery_bare_cb, &result);
>> +	return result;
>> +}
>> +
>> +static const char *discovery_bare_allowed_to_string(
>> +	enum discovery_bare_allowed discovery_bare_allowed)
>> +{
>> +	switch (discovery_bare_allowed) {
>> +	case DISCOVERY_BARE_NEVER:
>> +		return "never";
>> +	case DISCOVERY_BARE_ALWAYS:
>> +		return "always";
>
>> +	default:
>> +		BUG("invalid discovery_bare_allowed %d",
>> +		    discovery_bare_allowed);
>
> Should we have a default case here since the case arms above are
> exhaustive?

Ah, this "default:" was suggested by Stolee in
https://lore.kernel.org/git/7b37f3b7-58c5-1ac5-46eb-d995dc3cc33b@github.com

  This case should be a "default:" in case somehow an arbitrary integer
  value was placed in the variable. [...]

I'm not sure where we stand on this kind of defensiveness. It's not
really necessary, but I suppose having a "default:" won't hurt here,
especially if it BUG()-s instead of silently passing.

>> +	}
>> +	return NULL;
>> +}
>> +
>>  enum discovery_result {
>>  	GIT_DIR_NONE = 0,
>>  	GIT_DIR_EXPLICIT,
>> @@ -1151,7 +1195,8 @@ enum discovery_result {
>>  	GIT_DIR_HIT_CEILING = -1,
>>  	GIT_DIR_HIT_MOUNT_POINT = -2,
>>  	GIT_DIR_INVALID_GITFILE = -3,
>> -	GIT_DIR_INVALID_OWNERSHIP = -4
>> +	GIT_DIR_INVALID_OWNERSHIP = -4,
>> +	GIT_DIR_DISALLOWED_BARE = -5,
>>  };
>>
>>  /*
>> @@ -1248,6 +1293,8 @@ static enum discovery_result setup_git_directory_gently_1(struct strbuf *dir,
>>  		}
>>
>>  		if (is_git_directory(dir->buf)) {
>> +			if (!get_discovery_bare())
>
> Relying on NEVER being the zero value here seems fragile to me. Should
> we check that `if (get_discovery_bare() == DISCOVERY_BARE_NEVER)` to be
> more explicit here?

This was also originally suggested by Stolee in 
https://lore.kernel.org/git/7b37f3b7-58c5-1ac5-46eb-d995dc3cc33b@github.com

  With (some changes to return the enum), we can [...] let the caller
  treat the response as a simple boolean.

But your suggestion does seem less fragile. It won't really matter
when we add a third enum and replace the "if" with a "switch", but it
does matter if we ever muck around with the integer values of
DISCOVERY_BARE_*.
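
A small sketch of why the explicit comparison is sturdier, assuming a
hypothetical enum whose values have been reordered so that "never" is no
longer zero (the toy_* names are made up for this example and the abort()
is only a loose stand-in for BUG()):

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical enum; ALWAYS is deliberately the zero value here. */
enum toy_discovery_bare {
	TOY_DISCOVERY_BARE_ALWAYS = 0,
	TOY_DISCOVERY_BARE_NEVER,
};

/* Explicit comparison: correct no matter how the enum is numbered. */
static int toy_bare_is_disallowed(enum toy_discovery_bare allowed)
{
	return allowed == TOY_DISCOVERY_BARE_NEVER;
}

/* String form for messages; unknown values abort, loosely like BUG(). */
static const char *toy_discovery_bare_to_string(enum toy_discovery_bare allowed)
{
	switch (allowed) {
	case TOY_DISCOVERY_BARE_NEVER:
		return "never";
	case TOY_DISCOVERY_BARE_ALWAYS:
		return "always";
	default:
		fprintf(stderr, "BUG: invalid value %d\n", (int)allowed);
		abort();
	}
}
```

With this numbering, a bare "if (!value)" test would reject repositories
under "always" and accept them under "never", which is exactly the kind of
silent breakage the explicit comparison avoids.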

>> +				return GIT_DIR_DISALLOWED_BARE;
>>  			if (!ensure_valid_ownership(dir->buf))
>>  				return GIT_DIR_INVALID_OWNERSHIP;
>>  			strbuf_addstr(gitdir, ".");
>> @@ -1394,6 +1441,14 @@ const char *setup_git_directory_gently(int *nongit_ok)
>>  		}
>>  		*nongit_ok = 1;
>>  		break;
>> +	case GIT_DIR_DISALLOWED_BARE:
>> +		if (!nongit_ok) {
>> +			die(_("cannot use bare repository '%s' (discovery.bare is '%s')"),
>> +			    dir.buf,
>> +			    discovery_bare_allowed_to_string(get_discovery_bare()));
>> +		}
>> +		*nongit_ok = 1;
>> +		break;
>>  	case GIT_DIR_NONE:
>>  		/*
>>  		 * As a safeguard against setup_git_directory_gently_1 returning
>
> Thanks,
> Taylor


* [PATCH v7 0/5] config: introduce discovery.bare and protected config
  2022-06-30 18:13         ` [PATCH v6 0/5] config: introduce discovery.bare and protected config Glen Choo via GitGitGadget
                             ` (6 preceding siblings ...)
  2022-06-30 23:07           ` Ævar Arnfjörð Bjarmason
@ 2022-07-07 23:01           ` Glen Choo via GitGitGadget
  2022-07-07 23:01             ` [PATCH v7 1/5] Documentation/git-config.txt: add SCOPES section Glen Choo via GitGitGadget
                               ` (6 more replies)
  7 siblings, 7 replies; 113+ messages in thread
From: Glen Choo via GitGitGadget @ 2022-07-07 23:01 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, brian m. carlson, Derrick Stolee, Junio C Hamano,
	Emily Shaffer, Jonathan Tan,
	Ævar Arnfjörð Bjarmason, Glen Choo

This version incorporates most of Taylor's comments and suggestions. Thanks
especially for the wording suggestions, I struggled with those a lot :)

(I believe) I've responded upthread with my intention for each comment. The
only differences between that and the actual changes are:

 * In Documentation/git-config.txt, I dropped a suggestion to mention that
   "git config --local" is identical to the default behavior when writing
   options because I found it too hard to fit in.

 * In Documentation/config/discovery.txt, I took Taylor's suggestion, but
   didn't mention "discovery" for the same reasons.

 * I decided to leave out the protected config lookup functions. I made some
   POC patches at:
   
   https://github.com/chooglen/git/compare/setup/disable-bare-repo-config...chooglen:git:config/protected-config-lookup-fns
   
   which show that the code ends up cleaner with the lookup functions (in
   particular, it lets us remove struct safe_directory_data, which we only
   needed to maintain the state of the config iteration). But, I ultimately
   decided to leave them out of this series because the safe.directory
   conversion is pretty noisy and might end up becoming a distraction from
   the discussion here. If there are no strong objections, I'll send them as
   a follow up series instead.

= Description

There is a known social engineering attack that takes advantage of the fact
that a working tree can include an entire bare repository, including a
config file. A user could run a Git command inside the bare repository
thinking that the config file of the 'outer' repository would be used, but
in reality, the bare repository's config file (which is attacker-controlled)
is used, which may result in arbitrary code execution. See [1] for a fuller
description and deeper discussion.

This series implements a simple way of preventing such attacks: create a
config option, discovery.bare, that tells Git whether or not to die when it
finds a bare repository. discovery.bare has two values:

 * "always": always allow bare repositories (default), identical to current
   behavior
 * "never": never allow bare repositories

and users/system administrators who never expect to work with bare
repositories can secure their environments using "never". discovery.bare has
no effect if --git-dir or GIT_DIR is passed because we are confident that
the user is not confused about which repository is being used.
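
As a rough model of the rule described above (a sketch only, with
hypothetical toy_* names; it ignores every other check setup.c performs,
such as ownership validation):

```c
/* Hypothetical mirror of the two documented values. */
enum toy_discovery_bare {
	TOY_DISCOVERY_BARE_NEVER,
	TOY_DISCOVERY_BARE_ALWAYS,
};

struct toy_candidate {
	int is_bare;     /* the directory looks like a bare repository */
	int is_explicit; /* it was named via --git-dir or GIT_DIR */
};

/* Returns 1 if the repository would be used, 0 if it would be refused. */
static int toy_repo_allowed(const struct toy_candidate *c,
			    enum toy_discovery_bare setting)
{
	if (c->is_explicit)
		return 1; /* no discovery happened, so the setting is moot */
	if (!c->is_bare)
		return 1; /* the setting only concerns bare repositories */
	return setting == TOY_DISCOVERY_BARE_ALWAYS;
}
```

The only combination that is refused is a bare repository that was found
by discovery while the setting is "never".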

This series does not change the default behavior, but in the long run, a
"no-embedded" option might be a safe and usable default [2]. "never" is too
restrictive and unlikely to be the default.

For security reasons, discovery.bare cannot be read from repository-level
config (because we would end up trusting the embedded bare repository that
we aren't supposed to trust to begin with). Since this would introduce a 3rd
variable that is only read from 'protected/trusted configuration' (the
others are safe.directory and uploadpack.packObjectsHook) this series also
defines and creates a shared implementation for 'protected configuration'.
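
The idea can be sketched in a few lines, assuming a flat list of settings
tagged with their scope. The toy_* names and the array layout are
hypothetical and do not reflect how config.c stores anything; only the set
of protected scopes (system, global, command) comes from this series.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical scope tags, listed in the order the sources are read. */
enum toy_scope {
	TOY_SCOPE_SYSTEM,
	TOY_SCOPE_GLOBAL,
	TOY_SCOPE_LOCAL,
	TOY_SCOPE_WORKTREE,
	TOY_SCOPE_COMMAND,
};

struct toy_setting {
	enum toy_scope scope;
	const char *key;
	const char *value;
};

/* Protected configuration is the system, global and command scopes. */
static int toy_scope_is_protected(enum toy_scope scope)
{
	return scope == TOY_SCOPE_SYSTEM ||
	       scope == TOY_SCOPE_GLOBAL ||
	       scope == TOY_SCOPE_COMMAND;
}

/* Last protected value wins; entries from other scopes are skipped. */
static const char *toy_protected_get(const struct toy_setting *s, size_t nr,
				     const char *key)
{
	const char *found = NULL;
	size_t i;

	for (i = 0; i < nr; i++)
		if (toy_scope_is_protected(s[i].scope) && !strcmp(s[i].key, key))
			found = s[i].value;
	return found;
}
```

In this model, a "discovery.bare = always" planted in a repository's own
config is never consulted, so an embedded bare repository cannot opt
itself back in.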

= Patch organization

 * Patch 1 adds a section on configuration scopes to our docs
 * Patches 2-3 define 'protected configuration' and create a shared
   implementation.
 * Patch 4 refactors safe.directory to use protected configuration
 * Patch 5 adds discovery.bare

= Series history

Changes in v7:

 * Numerous docs improvements and code cleanup.
 * In 3/5's commit message, drop "as fast as possible" and allude to lookup
   functions coming in a later series.
 * Remove a comment in 3/5 about repository.protected_config. That was stale
   since v4, but slipped under the radar until now.
 * Fix some s/protected config/protected configuration (leftover from v5).

Changes in v6:

 * Add TEST_PASSES_SANITIZE_LEAK=true
 * Replace all sub-shells with -C and use test_config_global
 * Change the expect_rejected helper to use "grep -F" with a more specific
   message.
   * This reveals that the "-c discovery.bare=" assertion in the last test
     was passing for the wrong reason (because '' is an invalid value for
     "discovery.bare"). I removed it because it wasn't doing anything useful
     anyway - I was trying to make discovery.bare unset in the command line,
     but the whole point of that test is to assert that we respect the CLI
     arg.

Changes in v5:

 * Standardize the usage of "protected configuration" instead of mixing
   "config" and "configuration". This required some unfortunate rewrapping.
 * Remove mentions of "trustworthiness" when discussing protected
   configuration and focus on what Git does instead.
   * The rationale of protected vs non-protected is still kept.
 * Fix the stale documentation entry for discovery.bare.
 * Include a fuller description of how discovery.bare and "--git-dir"
   interact instead of saying "has no effect".

Changes in v4:

 * 2/5's commit message now justifies what scopes are included in protected
   config
 * The global configset is now a file-scope static inside config.c
   (previously it was a member of the_repository).
 * Rename discovery_bare_config to discovery_bare_allowed
 * Make discovery_bare_allowed function-scoped (instead of global).
 * Add an expect_accepted helper to the discovery.bare tests.
 * Add a helper to "upload-pack" that reads the protected and non-protected
   config

Changes in v3:

 * Rebase onto a more recent 'master'
 * Reframe this feature only in terms of the 'embedded bare repo' attack.
 * Other docs improvements (thanks Stolee in particular!)
 * Protected config no longer uses read_very_early_config() and is only read
   once
 * Protected config now includes "-c"
 * uploadpack.packObjectsHook now uses protected config instead of ignoring
   repo config using config scopes

Changes in v2:

 * Rename safe.barerepository to discovery.bare and make it die()
 * Move tests into t/t0034-discovery-bare.sh
 * Avoid unnecessary config reading by using a static variable
 * Add discovery.bare=cwd
 * Fix typos

= Future work

 * This series doesn't implement config lookup functions for protected
   config. This will be done in a follow up series.
 * This series does not implement the "no-embedded" option [2] and I won't
   work on it any time soon, but I'd be more than happy to review if someone
   sends patches.
 * With discovery.bare, if a builtin is marked RUN_SETUP_GENTLY, setup.c
   doesn't die() and we don't tell users why their repository was rejected,
   e.g. "git config" gives an opaque "fatal: not in a git directory". This
   isn't a new problem though, since safe.directory has the same issue.

[1]
https://lore.kernel.org/git/kl6lsfqpygsj.fsf@chooglen-macbookpro.roam.corp.google.com

[2] This was first suggested in
https://lore.kernel.org/git/5b969c5e-e802-c447-ad25-6acc0b784582@github.com

Glen Choo (5):
  Documentation/git-config.txt: add SCOPES section
  Documentation: define protected configuration
  config: learn `git_protected_config()`
  safe.directory: use git_protected_config()
  setup.c: create `discovery.bare`

 Documentation/config.txt            |  2 +
 Documentation/config/discovery.txt  | 21 ++++++++
 Documentation/config/safe.txt       |  6 +--
 Documentation/config/uploadpack.txt |  6 +--
 Documentation/git-config.txt        | 78 +++++++++++++++++++++++------
 config.c                            | 43 ++++++++++++++++
 config.h                            | 16 ++++++
 setup.c                             | 59 +++++++++++++++++++++-
 t/t0033-safe-directory.sh           | 24 ++++-----
 t/t0035-discovery-bare.sh           | 52 +++++++++++++++++++
 t/t5544-pack-objects-hook.sh        |  7 ++-
 upload-pack.c                       | 27 ++++++----
 12 files changed, 294 insertions(+), 47 deletions(-)
 create mode 100644 Documentation/config/discovery.txt
 create mode 100755 t/t0035-discovery-bare.sh


base-commit: f770e9f396d48b567ef7b37d273e91ad570a3522
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1261%2Fchooglen%2Fsetup%2Fdisable-bare-repo-config-v7
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1261/chooglen/setup/disable-bare-repo-config-v7
Pull-Request: https://github.com/git/git/pull/1261

Range-diff vs v6:

 1:  ee9619f6ec0 ! 1:  5c58db3bb21 Documentation/git-config.txt: add SCOPES section
     @@ Documentation/git-config.txt: $GIT_DIR/config.worktree::
      +Options will be read from all of these files that are available. If the
      +global or the system-wide configuration file are not available they will be
      +ignored. If the repository configuration file is not available or readable,
     -+'git config' will exit with a non-zero error code. However, in neither case
     -+will an error message be issued.
     ++'git config' will exit with a non-zero error code. Note that neither case
     ++produces an error message.
       
       The files are read in the order given above, with last value found taking
       precedence over values read earlier.  When multiple values are taken then all
     @@ Documentation/git-config.txt: $GIT_DIR/config.worktree::
      -You can override these rules using the `--global`, `--system`,
      -`--local`, `--worktree`, and `--file` command-line options; see
      -<<OPTIONS>> above.
     -+You can change the way options are read/written by specifying the path to a
     -+file (`--file`), or by specifying a configuration scope (`--system`,
     -+`--global`, `--local`, `--worktree`); see <<OPTIONS>> above.
     ++You can limit which configuration sources are read from or written to by
     ++specifying the path of a file with the `--file` option, or by specifying a
     ++configuration scope with `--system`, `--global`, `--local`, or `--worktree`.
     ++For more, see <<OPTIONS>> above.
      +
      +SCOPES
      +------
     @@ Documentation/git-config.txt: $GIT_DIR/config.worktree::
      +the `-c` option
      +
      +With the exception of 'command', each scope corresponds to a command line
     -+option - `--system`, `--global`, `--local`, `--worktree`.
     ++option: `--system`, `--global`, `--local`, `--worktree`.
      +
      +When reading options, specifying a scope will only read options from the
      +files within that scope. When writing options, specifying a scope will write
     @@ Documentation/git-config.txt: $GIT_DIR/config.worktree::
       
      +Most configuration options are respected regardless of the scope it is
      +defined in, but some options are only respected in certain scopes. See the
     -+option's documentation for the full details.
     ++respective option's documentation for the full details.
       
       ENVIRONMENT
       -----------
 2:  43627c05c0b ! 2:  58f25612aa3 Documentation: define protected configuration
     @@ Commit message
      
          In our documentation, define 'protected configuration' as the system,
          global and command config scopes. As a shorthand, I will refer to
     -    variables that are only respected in protected config as 'protected
     -    configuration only', but this term is not used in the documentation.
     +    variables that are only respected in protected configuration as
     +    'protected configuration only', but this term is not used in the
     +    documentation.
      
     -    This definition of protected config is based on whether or not Git can
     -    reasonably protect the user by ignoring the configuration scope:
     +    This definition of protected configuration is based on whether or not
     +    Git can reasonably protect the user by ignoring the configuration scope:
      
          - System, global and command line config are considered protected
            because an attacker who has control over any of those can do plenty of
     @@ Documentation/config/uploadpack.txt: uploadpack.packObjectsHook::
      -repository-level config (this is a safety measure against fetching from
      -untrusted repositories).
      +Note that this configuration variable is only respected when it is specified
     -+in protected config (see <<SCOPES>>). This is a safety measure against
     -+fetching from untrusted repositories.
     ++in protected configuration (see <<SCOPES>>). This is a safety measure
     ++against fetching from untrusted repositories.
       
       uploadpack.allowFilter::
       	If this option is set, `upload-pack` will support partial
      
       ## Documentation/git-config.txt ##
     -@@ Documentation/git-config.txt: You can change the way options are read/written by specifying the path to a
     - file (`--file`), or by specifying a configuration scope (`--system`,
     - `--global`, `--local`, `--worktree`); see <<OPTIONS>> above.
     +@@ Documentation/git-config.txt: specifying the path of a file with the `--file` option, or by specifying a
     + configuration scope with `--system`, `--global`, `--local`, or `--worktree`.
     + For more, see <<OPTIONS>> above.
       
      +[[SCOPES]]
       SCOPES
     @@ Documentation/git-config.txt: You can change the way options are read/written by
       
      @@ Documentation/git-config.txt: Most configuration options are respected regardless of the scope it is
       defined in, but some options are only respected in certain scopes. See the
     - option's documentation for the full details.
     + respective option's documentation for the full details.
       
      +Protected configuration
      +~~~~~~~~~~~~~~~~~~~~~~~
 3:  3efe282e6b9 ! 3:  3683d20f232 config: learn `git_protected_config()`
     @@ Commit message
          variable today, but we've noted that `safe.directory` and the upcoming
          `discovery.bare` should also be 'protected configuration only'. So, for
          consistency, we'd like to have a single implementation for protected
     -    config.
     +    configuration.
      
          The primary constraints are:
      
     -    1. Reading from protected configuration should be as fast as possible.
     -       Nearly all "git" commands inside a bare repository will read both
     -       `safe.directory` and `discovery.bare`, so we cannot afford to be
     -       slow.
     +    1. Reading from protected configuration should be fast. Nearly all "git"
     +       commands inside a bare repository will read both `safe.directory` and
     +       `discovery.bare`, so we cannot afford to be slow.
      
     -    2. Protected config must be readable when the gitdir is not known.
     -       `safe.directory` and `discovery.bare` both affect repository
     +    2. Protected configuration must be readable when the gitdir is not
     +       known. `safe.directory` and `discovery.bare` both affect repository
             discovery and the gitdir is not known at that point [1].
      
          The chosen implementation in this commit is to read protected
     @@ Commit message
          non-protected counterparts, e.g. git_protected_config_check_init() vs
          git_config_check_init().
      
     -    In light of constraint 1, this implementation can still be improved
     -    since git_protected_config() iterates through every variable in
     -    protected_config, which may still be too expensive. There exist constant
     -    time lookup functions for non-protected configuration
     -    (repo_config_get_*()), but for simplicity, this commit does not
     -    implement similar functions for protected configuration.
     +    In light of constraint 1, this implementation can still be improved.
     +    git_protected_config() iterates through every variable in
     +    protected_config, which is wasteful, but it makes the conversion simple
     +    because it matches existing patterns. We will likely implement constant
     +    time lookup functions for protected configuration in a future series
     +    (such functions already exist for non-protected configuration, i.e.
     +    repo_config_get_*()).
      
          An alternative that avoids introducing another configset is to continue
          to read all config using git_config(), but only accept values that have
     @@ config.c: static enum config_scope current_parsing_scope;
       static int zlib_compression_seen;
       
      +/*
     -+ * Config that comes from trusted sources, namely:
     -+ * - system config files (e.g. /etc/gitconfig)
     -+ * - global config files (e.g. $HOME/.gitconfig,
     -+ *   $XDG_CONFIG_HOME/git)
     -+ * - the command line.
     ++ * Config that comes from trusted scopes, namely:
     ++ * - CONFIG_SCOPE_SYSTEM (e.g. /etc/gitconfig)
     ++ * - CONFIG_SCOPE_GLOBAL (e.g. $HOME/.gitconfig, $XDG_CONFIG_HOME/git)
     ++ * - CONFIG_SCOPE_COMMAND (e.g. "-c" option, environment variables)
      + *
      + * This is declared here for code cleanliness, but unlike the other
      + * static variables, this does not hold config parser state.
     @@ config.c: int repo_config_get_pathname(struct repository *repo,
      +	free(user_config);
      +}
      +
     -+/* Ensure that protected_config has been initialized. */
     -+static void git_protected_config_check_init(void)
     -+{
     -+	if (protected_config.hash_initialized)
     -+		return;
     -+	read_protected_config();
     -+}
     -+
      +void git_protected_config(config_fn_t fn, void *data)
      +{
     -+	git_protected_config_check_init();
     ++	if (!protected_config.hash_initialized)
     ++		read_protected_config();
      +	configset_iter(&protected_config, fn, data);
      +}
      +
     @@ config.h: int repo_config_get_maybe_bool(struct repository *repo,
       
      +/*
      + * Functions for reading protected config. By definition, protected
     -+ * config ignores repository config, so it is unnecessary to read
     -+ * protected config from any `struct repository` other than
     -+ * the_repository.
     ++ * config ignores repository config, so these do not take a `struct
     ++ * repository` parameter.
      + */
      +void git_protected_config(config_fn_t fn, void *data);
      +
 4:  ec925823414 = 4:  6394818ffd8 safe.directory: use git_protected_config()
 5:  a1323d963f9 ! 5:  eff4b07480e setup.c: create `discovery.bare`
     @@ Documentation/config.txt: include::config/diff.txt[]
       ## Documentation/config/discovery.txt (new) ##
      @@
      +discovery.bare::
     -+	Specifies whether Git will work with a bare repository that it
     -+	found during repository discovery. If the repository is
     -+	specified directly via the --git-dir command-line option or the
     -+	GIT_DIR environment variable (see linkgit:git[1]), Git will
     -+	always use the specified repository, regardless of this value.
     ++	Specifies whether Git will work with a bare repository that
     ++	wasn't specified via the top-level `--git-dir` command-line
     ++	option, or the `GIT_DIR` environment variable (see
     ++	linkgit:git[1]). If the repository is specified, Git will always
     ++	use the specified repository, regardless of this value.
      ++
      +This config setting is only respected in protected configuration (see
      +<<SCOPES>>). This prevents the untrusted repository from tampering with
     @@ Documentation/config/discovery.txt (new)
      +* `always`: Git always works with bare repositories
      +* `never`: Git never works with bare repositories
      ++
     -+This defaults to `always`, but this default may change in the future.
     -++
      +If you do not use bare repositories in your workflow, then it may be
      +beneficial to set `discovery.bare` to `never` in your global config.
      +This will protect you from attacks that involve cloning a repository
     @@ setup.c: static enum discovery_result setup_git_directory_gently_1(struct strbuf
       		}
       
       		if (is_git_directory(dir->buf)) {
     -+			if (!get_discovery_bare())
     ++			if (get_discovery_bare() == DISCOVERY_BARE_NEVER)
      +				return GIT_DIR_DISALLOWED_BARE;
       			if (!ensure_valid_ownership(dir->buf))
       				return GIT_DIR_INVALID_OWNERSHIP;

-- 
gitgitgadget

* [PATCH v7 1/5] Documentation/git-config.txt: add SCOPES section
  2022-07-07 23:01           ` [PATCH v7 " Glen Choo via GitGitGadget
@ 2022-07-07 23:01             ` Glen Choo via GitGitGadget
  2022-07-07 23:43               ` Junio C Hamano
  2022-07-07 23:01             ` [PATCH v7 2/5] Documentation: define protected configuration Glen Choo via GitGitGadget
                               ` (5 subsequent siblings)
  6 siblings, 1 reply; 113+ messages in thread
From: Glen Choo via GitGitGadget @ 2022-07-07 23:01 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, brian m. carlson, Derrick Stolee, Junio C Hamano,
	Emily Shaffer, Jonathan Tan,
	Ævar Arnfjörð Bjarmason, Glen Choo, Glen Choo

From: Glen Choo <chooglen@google.com>

In a subsequent commit, we will introduce "protected configuration",
which is easiest to describe in terms of configuration scopes (i.e. it's
the union of the 'system', 'global', and 'command' scopes). This
description is fine for ML discussions, but it's inadequate for end
users because we don't provide a good description of "configuration
scopes" in the public docs.

145d59f482 (config: add '--show-scope' to print the scope of a config
value, 2020-02-10) introduced the word "scope" to our public docs, but
that only enumerates the scopes and assumes the user can figure out what
those values mean.

Add a SCOPES section to Documentation/git-config.txt that describes the
configuration scopes, their corresponding CLI options, and mentions that
some configuration options are only respected in certain scopes. Then,
use the word "scope" to simplify the FILES section and change some
confusing wording.

Signed-off-by: Glen Choo <chooglen@google.com>
---
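The read order and the "last value found" rule that the FILES and SCOPES
text describes can be illustrated with a small C sketch. This is not
Git's implementation: `enum scope`, `struct entry`, `scope_name()` and
`lookup_last()` are hypothetical names chosen for illustration, and the
sketch assumes entries arrive in read order with the 'command' scope read
after the file-based scopes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical scopes, listed in the order the SCOPES section names them. */
enum scope {
	SCOPE_SYSTEM,
	SCOPE_GLOBAL,
	SCOPE_LOCAL,
	SCOPE_WORKTREE,
	SCOPE_COMMAND
};

/* One key/value pair together with the scope it was read from. */
struct entry {
	enum scope scope;
	const char *key;
	const char *value;
};

/* Map a scope to the name used in the documentation. */
const char *scope_name(enum scope s)
{
	static const char *const names[] = {
		"system", "global", "local", "worktree", "command"
	};

	assert((size_t)s < sizeof(names) / sizeof(names[0]));
	return names[s];
}

/*
 * Return the last entry for `key`, or NULL if there is none.
 * Assumption: `entries` is already in read order, so the last
 * match is the one that takes precedence.
 */
const struct entry *lookup_last(const struct entry *entries, size_t n,
				const char *key)
{
	const struct entry *found = NULL;
	size_t i;

	for (i = 0; i < n; i++)
		if (!strcmp(entries[i].key, key))
			found = &entries[i];
	return found;
}
```

Given entries for the same key in the system, global and command scopes
(in that order), `lookup_last()` returns the command-scope entry, and it
returns NULL for a key that was never set.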
 Documentation/git-config.txt | 65 ++++++++++++++++++++++++++++--------
 1 file changed, 51 insertions(+), 14 deletions(-)

diff --git a/Documentation/git-config.txt b/Documentation/git-config.txt
index 9376e39aef2..c4ce61a0493 100644
--- a/Documentation/git-config.txt
+++ b/Documentation/git-config.txt
@@ -297,8 +297,8 @@ The default is to use a pager.
 FILES
 -----
 
-If not set explicitly with `--file`, there are four files where
-'git config' will search for configuration options:
+By default, 'git config' will read configuration options from multiple
+files:
 
 $(prefix)/etc/gitconfig::
 	System-wide configuration file.
@@ -322,27 +322,64 @@ $GIT_DIR/config.worktree::
 	This is optional and is only searched when
 	`extensions.worktreeConfig` is present in $GIT_DIR/config.
 
-If no further options are given, all reading options will read all of these
-files that are available. If the global or the system-wide configuration
-file are not available they will be ignored. If the repository configuration
-file is not available or readable, 'git config' will exit with a non-zero
-error code. However, in neither case will an error message be issued.
+You may also provide additional configuration parameters when running any
+git command by using the `-c` option. See linkgit:git[1] for details.
+
+Options will be read from all of these files that are available. If the
+global or the system-wide configuration file are not available they will be
+ignored. If the repository configuration file is not available or readable,
+'git config' will exit with a non-zero error code. Note that neither case
+produces an error message.
 
 The files are read in the order given above, with last value found taking
 precedence over values read earlier.  When multiple values are taken then all
 values of a key from all files will be used.
 
-You may override individual configuration parameters when running any git
-command by using the `-c` option. See linkgit:git[1] for details.
-
-All writing options will per default write to the repository specific
+By default, options are only written to the repository specific
 configuration file. Note that this also affects options like `--replace-all`
 and `--unset`. *'git config' will only ever change one file at a time*.
 
-You can override these rules using the `--global`, `--system`,
-`--local`, `--worktree`, and `--file` command-line options; see
-<<OPTIONS>> above.
+You can limit which configuration sources are read from or written to by
+specifying the path of a file with the `--file` option, or by specifying a
+configuration scope with `--system`, `--global`, `--local`, or `--worktree`.
+For more, see <<OPTIONS>> above.
+
+SCOPES
+------
+
+Each configuration source falls within a configuration scope. The scopes
+are:
+
+system::
+	$(prefix)/etc/gitconfig
+
+global::
+	$XDG_CONFIG_HOME/git/config
++
+~/.gitconfig
+
+local::
+	$GIT_DIR/config
+
+worktree::
+	$GIT_DIR/config.worktree
+
+command::
+	environment variables
++
+the `-c` option
+
+With the exception of 'command', each scope corresponds to a command line
+option: `--system`, `--global`, `--local`, `--worktree`.
+
+When reading options, specifying a scope will only read options from the
+files within that scope. When writing options, specifying a scope will write
+to the files within that scope (instead of the repository specific
+configuration file). See <<OPTIONS>> above for a complete description.
 
+Most configuration options are respected regardless of the scope it is
+defined in, but some options are only respected in certain scopes. See the
+respective option's documentation for the full details.
 
 ENVIRONMENT
 -----------
-- 
gitgitgadget


* [PATCH v7 2/5] Documentation: define protected configuration
  2022-07-07 23:01           ` [PATCH v7 " Glen Choo via GitGitGadget
  2022-07-07 23:01             ` [PATCH v7 1/5] Documentation/git-config.txt: add SCOPES section Glen Choo via GitGitGadget
@ 2022-07-07 23:01             ` Glen Choo via GitGitGadget
  2022-07-08  0:39               ` Junio C Hamano
  2022-07-07 23:01             ` [PATCH v7 3/5] config: learn `git_protected_config()` Glen Choo via GitGitGadget
                               ` (4 subsequent siblings)
  6 siblings, 1 reply; 113+ messages in thread
From: Glen Choo via GitGitGadget @ 2022-07-07 23:01 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, brian m. carlson, Derrick Stolee, Junio C Hamano,
	Emily Shaffer, Jonathan Tan,
	Ævar Arnfjörð Bjarmason, Glen Choo, Glen Choo

From: Glen Choo <chooglen@google.com>

For security reasons, there are config variables that are only trusted
when they are specified in certain configuration scopes, which are
sometimes referred to on-list as 'protected configuration' [1]. A future
commit will introduce another such variable, so let's define our terms
so that we can have consistent documentation and implementation.

In our documentation, define 'protected configuration' as the system,
global and command config scopes. As a shorthand, I will refer to
variables that are only respected in protected configuration as
'protected configuration only', but this term is not used in the
documentation.

This definition of protected configuration is based on whether or not
Git can reasonably protect the user by ignoring the configuration scope:

- System, global and command line config are considered protected
  because an attacker who has control over any of those can do plenty of
  harm without Git, so we gain very little by ignoring those scopes.
- On the other hand, local (and similarly, worktree) config are not
  considered protected because it is relatively easy for an attacker to
  control local config, e.g.:
  - On some shared user environments, a non-admin attacker can create a
    repository high up the directory hierarchy (e.g. C:\.git on
    Windows), and a user may accidentally use it when their PS1
    automatically invokes "git" commands.

    `safe.directory` prevents attacks of this form by making sure that
    the user intended to use the shared repository. It obviously
    shouldn't be read from the repository, because that would end up
    trusting the repository that Git was supposed to reject.
  - "git upload-pack" is expected to run in repositories that may not be
    controlled by the user. We cannot ignore all config in that
    repository (because "git upload-pack" would fail), but we can limit
    the risks by ignoring `uploadpack.packObjectsHook`.

Only `uploadpack.packObjectsHook` is 'protected configuration only'. The
following variables are intentionally excluded:

- `safe.directory` should be 'protected configuration only', but it does
  not technically fit the definition because it is not respected in the
  "command" scope. A future commit will fix this.

- `trace2.*` happens to read the same scopes as `safe.directory` because
  they share an implementation. However, this is not for security
  reasons; it is because we want to start tracing so early that
  repository-level config and "-c" are not available [2].

  This requirement is unique to `trace2.*`, so it does not make sense
  for protected configuration to be subject to the same constraints.

[1] For example,
https://lore.kernel.org/git/6af83767-576b-75c4-c778-0284344a8fe7@github.com/
[2] https://lore.kernel.org/git/a0c89d0d-669e-bf56-25d2-cbb09b012e70@jeffhostetler.com/

Signed-off-by: Glen Choo <chooglen@google.com>
---
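The definition above (protected configuration is the 'system', 'global'
and 'command' scopes, and 'protected configuration only' variables are
ignored elsewhere) can be sketched in C as follows. This is an
illustration only, not code from this series: `enum cfg_scope`,
`struct cfg_entry`, `is_protected_scope()` and `lookup_protected()` are
hypothetical names, and the entries are assumed to be in read order.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical scope enumeration mirroring the SCOPES section. */
enum cfg_scope {
	CFG_SYSTEM,
	CFG_GLOBAL,
	CFG_LOCAL,
	CFG_WORKTREE,
	CFG_COMMAND
};

struct cfg_entry {
	enum cfg_scope scope;
	const char *key;
	const char *value;
};

/* Protected configuration: system, global and command scopes only. */
int is_protected_scope(enum cfg_scope s)
{
	return s == CFG_SYSTEM || s == CFG_GLOBAL || s == CFG_COMMAND;
}

/*
 * Look up a 'protected configuration only' variable: entries from
 * the local and worktree scopes are skipped, and among the remaining
 * entries the last one wins. Returns NULL if no protected scope sets
 * the key.
 */
const char *lookup_protected(const struct cfg_entry *entries, size_t n,
			     const char *key)
{
	const char *value = NULL;
	size_t i;

	assert(key != NULL);
	for (i = 0; i < n; i++) {
		if (!is_protected_scope(entries[i].scope))
			continue; /* repository-controlled, not trusted */
		if (!strcmp(entries[i].key, key))
			value = entries[i].value;
	}
	return value;
}
```

With this filter, a hook path set only in a repository's local config is
never returned, while the same key set in global config is.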
 Documentation/config/uploadpack.txt |  6 +++---
 Documentation/git-config.txt        | 13 +++++++++++++
 2 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/Documentation/config/uploadpack.txt b/Documentation/config/uploadpack.txt
index 32fad5bbe81..16264d82a72 100644
--- a/Documentation/config/uploadpack.txt
+++ b/Documentation/config/uploadpack.txt
@@ -49,9 +49,9 @@ uploadpack.packObjectsHook::
 	`pack-objects` to the hook, and expects a completed packfile on
 	stdout.
 +
-Note that this configuration variable is ignored if it is seen in the
-repository-level config (this is a safety measure against fetching from
-untrusted repositories).
+Note that this configuration variable is only respected when it is specified
+in protected configuration (see <<SCOPES>>). This is a safety measure
+against fetching from untrusted repositories.
 
 uploadpack.allowFilter::
 	If this option is set, `upload-pack` will support partial
diff --git a/Documentation/git-config.txt b/Documentation/git-config.txt
index c4ce61a0493..2dc74f510f2 100644
--- a/Documentation/git-config.txt
+++ b/Documentation/git-config.txt
@@ -344,6 +344,7 @@ specifying the path of a file with the `--file` option, or by specifying a
 configuration scope with `--system`, `--global`, `--local`, or `--worktree`.
 For more, see <<OPTIONS>> above.
 
+[[SCOPES]]
 SCOPES
 ------
 
@@ -381,6 +382,18 @@ Most configuration options are respected regardless of the scope it is
 defined in, but some options are only respected in certain scopes. See the
 respective option's documentation for the full details.
 
+Protected configuration
+~~~~~~~~~~~~~~~~~~~~~~~
+
+Protected configuration refers to the 'system', 'global', and 'command' scopes.
+For security reasons, certain options are only respected when they are
+specified in protected configuration, and ignored otherwise.
+
+Git treats these scopes as if they are controlled by the user or a trusted
+administrator. This is because an attacker who controls these scopes can do
+substantial harm without using Git, so it is assumed that the user's environment
+protects these scopes against attackers.
+
 ENVIRONMENT
 -----------
 
-- 
gitgitgadget


* [PATCH v7 3/5] config: learn `git_protected_config()`
  2022-07-07 23:01           ` [PATCH v7 " Glen Choo via GitGitGadget
  2022-07-07 23:01             ` [PATCH v7 1/5] Documentation/git-config.txt: add SCOPES section Glen Choo via GitGitGadget
  2022-07-07 23:01             ` [PATCH v7 2/5] Documentation: define protected configuration Glen Choo via GitGitGadget
@ 2022-07-07 23:01             ` Glen Choo via GitGitGadget
  2022-07-07 23:01             ` [PATCH v7 4/5] safe.directory: use git_protected_config() Glen Choo via GitGitGadget
                               ` (3 subsequent siblings)
  6 siblings, 0 replies; 113+ messages in thread
From: Glen Choo via GitGitGadget @ 2022-07-07 23:01 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, brian m. carlson, Derrick Stolee, Junio C Hamano,
	Emily Shaffer, Jonathan Tan,
	Ævar Arnfjörð Bjarmason, Glen Choo, Glen Choo

From: Glen Choo <chooglen@google.com>

`uploadpack.packObjectsHook` is the only 'protected configuration only'
variable today, but we've noted that `safe.directory` and the upcoming
`discovery.bare` should also be 'protected configuration only'. So, for
consistency, we'd like to have a single implementation for protected
configuration.

The primary constraints are:

1. Reading from protected configuration should be fast. Nearly all "git"
   commands inside a bare repository will read both `safe.directory` and
   `discovery.bare`, so we cannot afford to be slow.

2. Protected configuration must be readable when the gitdir is not
   known. `safe.directory` and `discovery.bare` both affect repository
   discovery and the gitdir is not known at that point [1].

The chosen implementation in this commit is to read protected
configuration and cache the values in a global configset. This is
similar to the caching behavior we get with the_repository->config.

Introduce git_protected_config(), which reads protected configuration
and caches it in the global configset protected_config. Then, refactor
`uploadpack.packObjectsHook` to use git_protected_config().

The protected configuration functions are named similarly to their
non-protected counterparts, e.g. git_protected_config() vs
git_config().

In light of constraint 1, this implementation can still be improved.
git_protected_config() iterates through every variable in
protected_config, which is wasteful, but it makes the conversion simple
because it matches existing patterns. We will likely implement constant
time lookup functions for protected configuration in a future series
(such functions already exist for non-protected configuration, i.e.
repo_config_get_*()).

An alternative that avoids introducing another configset is to continue
to read all config using git_config(), but only accept values that have
the correct config scope [2]. This technically fulfills constraint 2,
because git_config() simply ignores the local and worktree config when
the gitdir is not known. However, this would read incomplete config into
the_repository->config, which would need to be reset when the gitdir is
known and git_config() needs to read the local and worktree config.
Resetting the_repository->config might be reasonable while we only have
these 'protected configuration only' variables, but it's not clear
whether this extends well to future variables.

[1] In this case, we do have a candidate gitdir though, so with a little
refactoring, it might be possible to provide a gitdir.
[2] This is how `uploadpack.packObjectsHook` was implemented prior to
this commit.

Signed-off-by: Glen Choo <chooglen@google.com>
---
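The approach described above (read protected configuration once, cache
it in a global set, then hand every cached variable to a callback) can
be sketched in standalone C. This is a simplified model, not the code in
this patch: `struct cache`, `load_protected()`, `for_each_protected()`,
`pick_hook()` and `protected_load_count()` are hypothetical names, and
the two cached values are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Callback type, shaped like a config callback: (var, value, data). */
typedef int (*config_fn)(const char *var, const char *value, void *data);

struct pair {
	const char *var;
	const char *value;
};

/* Hypothetical global cache standing in for a configset. */
struct cache {
	int initialized;
	int loads; /* how many times the expensive read ran */
	size_t nr;
	struct pair items[8];
};

static struct cache protected_cache;

/* Stand-in for reading system/global files and command-line parameters. */
void load_protected(struct cache *c)
{
	c->nr = 0;
	c->items[c->nr++] = (struct pair){ "uploadpack.packobjectshook", "./hook" };
	c->items[c->nr++] = (struct pair){ "safe.directory", "/srv/shared" };
	c->initialized = 1;
	c->loads++;
}

/* Lazily fill the cache, then iterate over every cached variable. */
void for_each_protected(config_fn fn, void *data)
{
	size_t i;

	if (!protected_cache.initialized)
		load_protected(&protected_cache);
	for (i = 0; i < protected_cache.nr; i++)
		fn(protected_cache.items[i].var,
		   protected_cache.items[i].value, data);
}

/* Example callback: remember the value of one variable, ignore the rest. */
int pick_hook(const char *var, const char *value, void *data)
{
	const char **out = data;

	if (!strcmp(var, "uploadpack.packobjectshook"))
		*out = value;
	return 0;
}

/* Expose the load counter so callers can observe the caching. */
int protected_load_count(void)
{
	return protected_cache.loads;
}
```

Calling `for_each_protected(pick_hook, &hook)` twice yields the same
value both times while the load counter stays at 1. Each call still
walks every cached variable, which is the linear cost the commit message
mentions as room for improvement.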
 config.c                     | 43 ++++++++++++++++++++++++++++++++++++
 config.h                     | 16 ++++++++++++++
 t/t5544-pack-objects-hook.sh |  7 +++++-
 upload-pack.c                | 27 +++++++++++++---------
 4 files changed, 82 insertions(+), 11 deletions(-)

diff --git a/config.c b/config.c
index 9b0e9c93285..015bec360f5 100644
--- a/config.c
+++ b/config.c
@@ -81,6 +81,17 @@ static enum config_scope current_parsing_scope;
 static int pack_compression_seen;
 static int zlib_compression_seen;
 
+/*
+ * Config that comes from trusted scopes, namely:
+ * - CONFIG_SCOPE_SYSTEM (e.g. /etc/gitconfig)
+ * - CONFIG_SCOPE_GLOBAL (e.g. $HOME/.gitconfig, $XDG_CONFIG_HOME/git)
+ * - CONFIG_SCOPE_COMMAND (e.g. "-c" option, environment variables)
+ *
+ * This is declared here for code cleanliness, but unlike the other
+ * static variables, this does not hold config parser state.
+ */
+static struct config_set protected_config;
+
 static int config_file_fgetc(struct config_source *conf)
 {
 	return getc_unlocked(conf->u.file);
@@ -2378,6 +2389,11 @@ int git_configset_add_file(struct config_set *cs, const char *filename)
 	return git_config_from_file(config_set_callback, filename, cs);
 }
 
+int git_configset_add_parameters(struct config_set *cs)
+{
+	return git_config_from_parameters(config_set_callback, cs);
+}
+
 int git_configset_get_value(struct config_set *cs, const char *key, const char **value)
 {
 	const struct string_list *values = NULL;
@@ -2619,6 +2635,33 @@ int repo_config_get_pathname(struct repository *repo,
 	return ret;
 }
 
+/* Read values into protected_config. */
+static void read_protected_config(void)
+{
+	char *xdg_config = NULL, *user_config = NULL, *system_config = NULL;
+
+	git_configset_init(&protected_config);
+
+	system_config = git_system_config();
+	git_global_config(&user_config, &xdg_config);
+
+	git_configset_add_file(&protected_config, system_config);
+	git_configset_add_file(&protected_config, xdg_config);
+	git_configset_add_file(&protected_config, user_config);
+	git_configset_add_parameters(&protected_config);
+
+	free(system_config);
+	free(xdg_config);
+	free(user_config);
+}
+
+void git_protected_config(config_fn_t fn, void *data)
+{
+	if (!protected_config.hash_initialized)
+		read_protected_config();
+	configset_iter(&protected_config, fn, data);
+}
+
 /* Functions used historically to read configuration from 'the_repository' */
 void git_config(config_fn_t fn, void *data)
 {
diff --git a/config.h b/config.h
index 7654f61c634..ca994d77147 100644
--- a/config.h
+++ b/config.h
@@ -446,6 +446,15 @@ void git_configset_init(struct config_set *cs);
  */
 int git_configset_add_file(struct config_set *cs, const char *filename);
 
+/**
+ * Parses command line options and environment variables, and adds the
+ * variable-value pairs to the `config_set`. Returns 0 on success, or -1
+ * if there is an error in parsing. The caller decides whether to free
+ * the incomplete configset or continue using it when the function
+ * returns -1.
+ */
+int git_configset_add_parameters(struct config_set *cs);
+
 /**
  * Finds and returns the value list, sorted in order of increasing priority
  * for the configuration variable `key` and config set `cs`. When the
@@ -505,6 +514,13 @@ int repo_config_get_maybe_bool(struct repository *repo,
 int repo_config_get_pathname(struct repository *repo,
 			     const char *key, const char **dest);
 
+/*
+ * Functions for reading protected config. By definition, protected
+ * config ignores repository config, so these do not take a `struct
+ * repository` parameter.
+ */
+void git_protected_config(config_fn_t fn, void *data);
+
 /**
  * Querying For Specific Variables
  * -------------------------------
diff --git a/t/t5544-pack-objects-hook.sh b/t/t5544-pack-objects-hook.sh
index dd5f44d986f..54f54f8d2eb 100755
--- a/t/t5544-pack-objects-hook.sh
+++ b/t/t5544-pack-objects-hook.sh
@@ -56,7 +56,12 @@ test_expect_success 'hook does not run from repo config' '
 	! grep "hook running" stderr &&
 	test_path_is_missing .git/hook.args &&
 	test_path_is_missing .git/hook.stdin &&
-	test_path_is_missing .git/hook.stdout
+	test_path_is_missing .git/hook.stdout &&
+
+	# check that global config is used instead
+	test_config_global uploadpack.packObjectsHook ./hook &&
+	git clone --no-local . dst2.git 2>stderr &&
+	grep "hook running" stderr
 '
 
 test_expect_success 'hook works with partial clone' '
diff --git a/upload-pack.c b/upload-pack.c
index 3a851b36066..09f48317b02 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -1321,18 +1321,27 @@ static int upload_pack_config(const char *var, const char *value, void *cb_data)
 		data->advertise_sid = git_config_bool(var, value);
 	}
 
-	if (current_config_scope() != CONFIG_SCOPE_LOCAL &&
-	    current_config_scope() != CONFIG_SCOPE_WORKTREE) {
-		if (!strcmp("uploadpack.packobjectshook", var))
-			return git_config_string(&data->pack_objects_hook, var, value);
-	}
-
 	if (parse_object_filter_config(var, value, data) < 0)
 		return -1;
 
 	return parse_hide_refs_config(var, value, "uploadpack");
 }
 
+static int upload_pack_protected_config(const char *var, const char *value, void *cb_data)
+{
+	struct upload_pack_data *data = cb_data;
+
+	if (!strcmp("uploadpack.packobjectshook", var))
+		return git_config_string(&data->pack_objects_hook, var, value);
+	return 0;
+}
+
+static void get_upload_pack_config(struct upload_pack_data *data)
+{
+	git_config(upload_pack_config, data);
+	git_protected_config(upload_pack_protected_config, data);
+}
+
 void upload_pack(const int advertise_refs, const int stateless_rpc,
 		 const int timeout)
 {
@@ -1340,8 +1349,7 @@ void upload_pack(const int advertise_refs, const int stateless_rpc,
 	struct upload_pack_data data;
 
 	upload_pack_data_init(&data);
-
-	git_config(upload_pack_config, &data);
+	get_upload_pack_config(&data);
 
 	data.stateless_rpc = stateless_rpc;
 	data.timeout = timeout;
@@ -1695,8 +1703,7 @@ int upload_pack_v2(struct repository *r, struct packet_reader *request)
 
 	upload_pack_data_init(&data);
 	data.use_sideband = LARGE_PACKET_MAX;
-
-	git_config(upload_pack_config, &data);
+	get_upload_pack_config(&data);
 
 	while (state != FETCH_DONE) {
 		switch (state) {
-- 
gitgitgadget



* [PATCH v7 4/5] safe.directory: use git_protected_config()
  2022-07-07 23:01           ` [PATCH v7 " Glen Choo via GitGitGadget
                               ` (2 preceding siblings ...)
  2022-07-07 23:01             ` [PATCH v7 3/5] config: learn `git_protected_config()` Glen Choo via GitGitGadget
@ 2022-07-07 23:01             ` Glen Choo via GitGitGadget
  2022-07-07 23:01             ` [PATCH v7 5/5] setup.c: create `discovery.bare` Glen Choo via GitGitGadget
                               ` (2 subsequent siblings)
  6 siblings, 0 replies; 113+ messages in thread
From: Glen Choo via GitGitGadget @ 2022-07-07 23:01 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, brian m. carlson, Derrick Stolee, Junio C Hamano,
	Emily Shaffer, Jonathan Tan,
	Ævar Arnfjörð Bjarmason, Glen Choo, Glen Choo

From: Glen Choo <chooglen@google.com>

Use git_protected_config() to read `safe.directory` instead of
read_very_early_config(), making it 'protected configuration only'.

As a result, `safe.directory` now respects "-c", so update the tests and
docs accordingly. It used to ignore "-c" due to how it was implemented,
not because of security or correctness concerns [1].
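
To illustrate the effect, here is a minimal, self-contained sketch of the
scope filtering. It is a model under simplifying assumptions, not the
actual implementation: the `scope` enum, `struct entry` and
`path_is_safe()` are hypothetical names used only for this example, while
the real code goes through git_protected_config() and safe_directory_cb().

```c
#include <string.h>

/* Simplified stand-ins for Git's config scopes (hypothetical names). */
enum scope {
	SCOPE_SYSTEM,
	SCOPE_GLOBAL,
	SCOPE_LOCAL,
	SCOPE_WORKTREE,
	SCOPE_COMMAND
};

struct entry {
	enum scope scope;
	const char *key;
	const char *value;
};

/* Protected configuration: the system, global and command scopes. */
static int is_protected_scope(enum scope s)
{
	return s == SCOPE_SYSTEM || s == SCOPE_GLOBAL || s == SCOPE_COMMAND;
}

/*
 * Model of the safe.directory lookup: return 1 if a protected entry
 * lists `path`. An empty value resets the list, as described in
 * Documentation/config/safe.txt.
 */
static int path_is_safe(const struct entry *entries, size_t nr,
			const char *path)
{
	int is_safe = 0;
	size_t i;

	for (i = 0; i < nr; i++) {
		if (!is_protected_scope(entries[i].scope))
			continue; /* repository config cannot vouch for itself */
		if (strcmp(entries[i].key, "safe.directory"))
			continue;
		if (!*entries[i].value)
			is_safe = 0; /* empty value clears earlier entries */
		else if (!strcmp(entries[i].value, path))
			is_safe = 1;
	}
	return is_safe;
}
```

In this model an entry from the 'local' scope never marks a path as safe,
while the same entry supplied via "-c" (the 'command' scope) does.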

[1] https://lore.kernel.org/git/xmqqlevabcsu.fsf@gitster.g/

Signed-off-by: Glen Choo <chooglen@google.com>
---
 Documentation/config/safe.txt |  6 +++---
 setup.c                       |  2 +-
 t/t0033-safe-directory.sh     | 24 ++++++++++--------------
 3 files changed, 14 insertions(+), 18 deletions(-)

diff --git a/Documentation/config/safe.txt b/Documentation/config/safe.txt
index fa02f3ccc54..f72b4408798 100644
--- a/Documentation/config/safe.txt
+++ b/Documentation/config/safe.txt
@@ -12,9 +12,9 @@ via `git config --add`. To reset the list of safe directories (e.g. to
 override any such directories specified in the system config), add a
 `safe.directory` entry with an empty value.
 +
-This config setting is only respected when specified in a system or global
-config, not when it is specified in a repository config, via the command
-line option `-c safe.directory=<path>`, or in environment variables.
+This config setting is only respected in protected configuration (see
+<<SCOPES>>). This prevents the untrusted repository from tampering with this
+value.
 +
 The value of this setting is interpolated, i.e. `~/<path>` expands to a
 path relative to the home directory and `%(prefix)/<path>` expands to a
diff --git a/setup.c b/setup.c
index faf5095e44d..c8e3c32814d 100644
--- a/setup.c
+++ b/setup.c
@@ -1137,7 +1137,7 @@ static int ensure_valid_ownership(const char *path)
 	    is_path_owned_by_current_user(path))
 		return 1;
 
-	read_very_early_config(safe_directory_cb, &data);
+	git_protected_config(safe_directory_cb, &data);
 
 	return data.is_safe;
 }
diff --git a/t/t0033-safe-directory.sh b/t/t0033-safe-directory.sh
index 238b25f91a3..5a1cd0d0947 100755
--- a/t/t0033-safe-directory.sh
+++ b/t/t0033-safe-directory.sh
@@ -16,24 +16,20 @@ test_expect_success 'safe.directory is not set' '
 	expect_rejected_dir
 '
 
-test_expect_success 'ignoring safe.directory on the command line' '
-	test_must_fail git -c safe.directory="$(pwd)" status 2>err &&
-	grep "unsafe repository" err
+test_expect_success 'safe.directory on the command line' '
+	git -c safe.directory="$(pwd)" status
 '
 
-test_expect_success 'ignoring safe.directory in the environment' '
-	test_must_fail env GIT_CONFIG_COUNT=1 \
-		GIT_CONFIG_KEY_0="safe.directory" \
-		GIT_CONFIG_VALUE_0="$(pwd)" \
-		git status 2>err &&
-	grep "unsafe repository" err
+test_expect_success 'safe.directory in the environment' '
+	env GIT_CONFIG_COUNT=1 \
+	    GIT_CONFIG_KEY_0="safe.directory" \
+	    GIT_CONFIG_VALUE_0="$(pwd)" \
+	    git status
 '
 
-test_expect_success 'ignoring safe.directory in GIT_CONFIG_PARAMETERS' '
-	test_must_fail env \
-		GIT_CONFIG_PARAMETERS="${SQ}safe.directory${SQ}=${SQ}$(pwd)${SQ}" \
-		git status 2>err &&
-	grep "unsafe repository" err
+test_expect_success 'safe.directory in GIT_CONFIG_PARAMETERS' '
+	env GIT_CONFIG_PARAMETERS="${SQ}safe.directory${SQ}=${SQ}$(pwd)${SQ}" \
+	    git status
 '
 
 test_expect_success 'ignoring safe.directory in repo config' '
-- 
gitgitgadget



* [PATCH v7 5/5] setup.c: create `discovery.bare`
  2022-07-07 23:01           ` [PATCH v7 " Glen Choo via GitGitGadget
                               ` (3 preceding siblings ...)
  2022-07-07 23:01             ` [PATCH v7 4/5] safe.directory: use git_protected_config() Glen Choo via GitGitGadget
@ 2022-07-07 23:01             ` Glen Choo via GitGitGadget
  2022-07-08  1:07             ` [PATCH v7 0/5] config: introduce discovery.bare and protected config Junio C Hamano
  2022-07-14 21:27             ` [PATCH v8 0/5] config: introduce safe.bareRepository " Glen Choo via GitGitGadget
  6 siblings, 0 replies; 113+ messages in thread
From: Glen Choo via GitGitGadget @ 2022-07-07 23:01 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, brian m. carlson, Derrick Stolee, Junio C Hamano,
	Emily Shaffer, Jonathan Tan,
	Ævar Arnfjörð Bjarmason, Glen Choo, Glen Choo

From: Glen Choo <chooglen@google.com>

There is a known social engineering attack that takes advantage of the
fact that a working tree can include an entire bare repository,
including a config file. A user could run a Git command inside the bare
repository thinking that the config file of the 'outer' repository would
be used, but in reality, the bare repository's config file (which is
attacker-controlled) is used, which may result in arbitrary code
execution. See [1] for a fuller description and deeper discussion.

A simple mitigation is to forbid bare repositories unless specified via
`--git-dir` or `GIT_DIR`. In environments that don't use bare
repositories, this would be minimally disruptive.

Create a config variable, `discovery.bare`, that tells Git whether or
not to die() when it discovers a bare repository. This only affects
repository discovery, thus it has no effect if discovery was not
done, e.g. if the user passes `--git-dir=my-dir`, discovery will be
skipped and my-dir will be used as the repo regardless of the
`discovery.bare` value.

This config is an enum of:

- "always": always allow bare repositories (this is the default)
- "never": never allow bare repositories
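
As a rough illustration of these semantics, the following self-contained
sketch models the decision. It is not the code in this patch:
`parse_discovery_bare()` and `bare_repo_usable()` are hypothetical helpers,
and `configured_value` stands in for whatever value protected config
yields, with NULL meaning "unset".

```c
#include <string.h>

enum discovery_bare_allowed {
	DISCOVERY_BARE_NEVER = 0,
	DISCOVERY_BARE_ALWAYS,
};

/* Map a config value to the enum; return -1 for anything unrecognized. */
static int parse_discovery_bare(const char *value,
				enum discovery_bare_allowed *out)
{
	if (!strcmp(value, "never")) {
		*out = DISCOVERY_BARE_NEVER;
		return 0;
	}
	if (!strcmp(value, "always")) {
		*out = DISCOVERY_BARE_ALWAYS;
		return 0;
	}
	return -1;
}

/*
 * Return 1 if a bare repository may be used, 0 if it must be rejected,
 * and -1 if the configured value is invalid. `explicit_git_dir` is
 * nonzero when the repository was named via --git-dir or GIT_DIR.
 */
static int bare_repo_usable(int explicit_git_dir, const char *configured_value)
{
	enum discovery_bare_allowed allowed = DISCOVERY_BARE_ALWAYS; /* default */

	if (explicit_git_dir)
		return 1; /* no discovery, so the setting is not consulted */
	if (configured_value &&
	    parse_discovery_bare(configured_value, &allowed) < 0)
		return -1;
	return allowed == DISCOVERY_BARE_ALWAYS;
}
```

With "never" configured, only the explicit case returns 1; how the real
code reports an invalid value is outside the scope of this sketch.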

If we want to protect users from such attacks by default, neither value
will suffice - "always" provides no protection, but "never" is
impractical for bare repository users. A more usable default would be to
allow only non-embedded bare repositories ([2] contains one such
proposal), but detecting if a repository is embedded is potentially
non-trivial, so this work is not implemented in this series.

[1]: https://lore.kernel.org/git/kl6lsfqpygsj.fsf@chooglen-macbookpro.roam.corp.google.com
[2]: https://lore.kernel.org/git/5b969c5e-e802-c447-ad25-6acc0b784582@github.com

Signed-off-by: Glen Choo <chooglen@google.com>
---
 Documentation/config.txt           |  2 ++
 Documentation/config/discovery.txt | 21 +++++++++++
 setup.c                            | 57 +++++++++++++++++++++++++++++-
 t/t0035-discovery-bare.sh          | 52 +++++++++++++++++++++++++++
 4 files changed, 131 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/config/discovery.txt
 create mode 100755 t/t0035-discovery-bare.sh

diff --git a/Documentation/config.txt b/Documentation/config.txt
index e284b042f22..9a5e1329772 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -409,6 +409,8 @@ include::config/diff.txt[]
 
 include::config/difftool.txt[]
 
+include::config/discovery.txt[]
+
 include::config/extensions.txt[]
 
 include::config/fastimport.txt[]
diff --git a/Documentation/config/discovery.txt b/Documentation/config/discovery.txt
new file mode 100644
index 00000000000..6f38b86884b
--- /dev/null
+++ b/Documentation/config/discovery.txt
@@ -0,0 +1,21 @@
+discovery.bare::
+	Specifies whether Git will work with a bare repository that
+	wasn't specified via the top-level `--git-dir` command-line
+	option, or the `GIT_DIR` environment variable (see
+	linkgit:git[1]). If the repository is specified, Git will always
+	use the specified repository, regardless of this value.
++
+This config setting is only respected in protected configuration (see
+<<SCOPES>>). This prevents the untrusted repository from tampering with
+this value.
++
+The currently supported values are:
++
+* `always`: Git always works with bare repositories
+* `never`: Git never works with bare repositories
++
+If you do not use bare repositories in your workflow, then it may be
+beneficial to set `discovery.bare` to `never` in your global config.
+This will protect you from attacks that involve cloning a repository
+that contains a bare repository and running a Git command within that
+directory.
diff --git a/setup.c b/setup.c
index c8e3c32814d..84cd02a1209 100644
--- a/setup.c
+++ b/setup.c
@@ -10,6 +10,10 @@
 static int inside_git_dir = -1;
 static int inside_work_tree = -1;
 static int work_tree_config_is_bogus;
+enum discovery_bare_allowed {
+	DISCOVERY_BARE_NEVER = 0,
+	DISCOVERY_BARE_ALWAYS,
+};
 
 static struct startup_info the_startup_info;
 struct startup_info *startup_info = &the_startup_info;
@@ -1142,6 +1146,46 @@ static int ensure_valid_ownership(const char *path)
 	return data.is_safe;
 }
 
+static int discovery_bare_cb(const char *key, const char *value, void *d)
+{
+	enum discovery_bare_allowed *discovery_bare_allowed = d;
+
+	if (strcmp(key, "discovery.bare"))
+		return 0;
+
+	if (!strcmp(value, "never")) {
+		*discovery_bare_allowed = DISCOVERY_BARE_NEVER;
+		return 0;
+	}
+	if (!strcmp(value, "always")) {
+		*discovery_bare_allowed = DISCOVERY_BARE_ALWAYS;
+		return 0;
+	}
+	return -1;
+}
+
+static enum discovery_bare_allowed get_discovery_bare(void)
+{
+	enum discovery_bare_allowed result = DISCOVERY_BARE_ALWAYS;
+	git_protected_config(discovery_bare_cb, &result);
+	return result;
+}
+
+static const char *discovery_bare_allowed_to_string(
+	enum discovery_bare_allowed discovery_bare_allowed)
+{
+	switch (discovery_bare_allowed) {
+	case DISCOVERY_BARE_NEVER:
+		return "never";
+	case DISCOVERY_BARE_ALWAYS:
+		return "always";
+	default:
+		BUG("invalid discovery_bare_allowed %d",
+		    discovery_bare_allowed);
+	}
+	return NULL;
+}
+
 enum discovery_result {
 	GIT_DIR_NONE = 0,
 	GIT_DIR_EXPLICIT,
@@ -1151,7 +1195,8 @@ enum discovery_result {
 	GIT_DIR_HIT_CEILING = -1,
 	GIT_DIR_HIT_MOUNT_POINT = -2,
 	GIT_DIR_INVALID_GITFILE = -3,
-	GIT_DIR_INVALID_OWNERSHIP = -4
+	GIT_DIR_INVALID_OWNERSHIP = -4,
+	GIT_DIR_DISALLOWED_BARE = -5,
 };
 
 /*
@@ -1248,6 +1293,8 @@ static enum discovery_result setup_git_directory_gently_1(struct strbuf *dir,
 		}
 
 		if (is_git_directory(dir->buf)) {
+			if (get_discovery_bare() == DISCOVERY_BARE_NEVER)
+				return GIT_DIR_DISALLOWED_BARE;
 			if (!ensure_valid_ownership(dir->buf))
 				return GIT_DIR_INVALID_OWNERSHIP;
 			strbuf_addstr(gitdir, ".");
@@ -1394,6 +1441,14 @@ const char *setup_git_directory_gently(int *nongit_ok)
 		}
 		*nongit_ok = 1;
 		break;
+	case GIT_DIR_DISALLOWED_BARE:
+		if (!nongit_ok) {
+			die(_("cannot use bare repository '%s' (discovery.bare is '%s')"),
+			    dir.buf,
+			    discovery_bare_allowed_to_string(get_discovery_bare()));
+		}
+		*nongit_ok = 1;
+		break;
 	case GIT_DIR_NONE:
 		/*
 		 * As a safeguard against setup_git_directory_gently_1 returning
diff --git a/t/t0035-discovery-bare.sh b/t/t0035-discovery-bare.sh
new file mode 100755
index 00000000000..8f802746530
--- /dev/null
+++ b/t/t0035-discovery-bare.sh
@@ -0,0 +1,52 @@
+#!/bin/sh
+
+test_description='verify discovery.bare checks'
+
+TEST_PASSES_SANITIZE_LEAK=true
+. ./test-lib.sh
+
+pwd="$(pwd)"
+
+expect_accepted () {
+	git "$@" rev-parse --git-dir
+}
+
+expect_rejected () {
+	test_must_fail git "$@" rev-parse --git-dir 2>err &&
+	grep -F "cannot use bare repository" err
+}
+
+test_expect_success 'setup bare repo in worktree' '
+	git init outer-repo &&
+	git init --bare outer-repo/bare-repo
+'
+
+test_expect_success 'discovery.bare unset' '
+	expect_accepted -C outer-repo/bare-repo
+'
+
+test_expect_success 'discovery.bare=always' '
+	test_config_global discovery.bare always &&
+	expect_accepted -C outer-repo/bare-repo
+'
+
+test_expect_success 'discovery.bare=never' '
+	test_config_global discovery.bare never &&
+	expect_rejected -C outer-repo/bare-repo
+'
+
+test_expect_success 'discovery.bare in the repository' '
+	# discovery.bare must not be "never", otherwise git config fails
+	# with "fatal: not in a git directory" (like safe.directory)
+	test_config -C outer-repo/bare-repo discovery.bare always &&
+	test_config_global discovery.bare never &&
+	expect_rejected -C outer-repo/bare-repo
+'
+
+test_expect_success 'discovery.bare on the command line' '
+	test_config_global discovery.bare never &&
+	expect_accepted -C outer-repo/bare-repo \
+		-c discovery.bare=always
+'
+
+test_done
-- 
gitgitgadget


* Re: [PATCH v7 1/5] Documentation/git-config.txt: add SCOPES section
  2022-07-07 23:01             ` [PATCH v7 1/5] Documentation/git-config.txt: add SCOPES section Glen Choo via GitGitGadget
@ 2022-07-07 23:43               ` Junio C Hamano
  2022-07-08 17:01                 ` Glen Choo
  0 siblings, 1 reply; 113+ messages in thread
From: Junio C Hamano @ 2022-07-07 23:43 UTC (permalink / raw)
  To: Glen Choo via GitGitGadget
  Cc: git, Taylor Blau, brian m. carlson, Derrick Stolee,
	Emily Shaffer, Jonathan Tan,
	Ævar Arnfjörð Bjarmason, Glen Choo

"Glen Choo via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Glen Choo <chooglen@google.com>
>
> In a subsequent commit, we will introduce "protected configuration",
> which is easiest to describe in terms of configuration scopes (i.e. it's
> the union of the 'system', 'global', and 'command' scopes). This
> description is fine for ML discussions, but it's inadequate for end
> users because we don't provide a good description of "configuration
> scopes" in the public docs.
>
> 145d59f482 (config: add '--show-scope' to print the scope of a config
> value, 2020-02-10) introduced the word "scope" to our public docs, but
> that only enumerates the scopes and assumes the user can figure out
> those values mean.

Probably: "figure out those values mean" -> "figure out what those
values mean"

> Add a SCOPES section to Documentation/git-config.txt that describes the
> configuration scopes, their corresponding CLI options, and mentions that
> some configuration options are only respected in certain scopes. Then,
> use the word "scope" to simplify the FILES section and change some
> confusing wording.
>
> Signed-off-by: Glen Choo <chooglen@google.com>
> ---
>  Documentation/git-config.txt | 65 ++++++++++++++++++++++++++++--------
>  1 file changed, 51 insertions(+), 14 deletions(-)
>
> diff --git a/Documentation/git-config.txt b/Documentation/git-config.txt
> index 9376e39aef2..c4ce61a0493 100644
> --- a/Documentation/git-config.txt
> +++ b/Documentation/git-config.txt
> @@ -297,8 +297,8 @@ The default is to use a pager.
>  FILES
>  -----
>  
> -If not set explicitly with `--file`, there are four files where
> -'git config' will search for configuration options:
> +By default, 'git config' will read configuration options from multiple
> +files:
>  
>  $(prefix)/etc/gitconfig::
>  	System-wide configuration file.
> @@ -322,27 +322,64 @@ $GIT_DIR/config.worktree::
>  	This is optional and is only searched when
>  	`extensions.worktreeConfig` is present in $GIT_DIR/config.
>  
> -If no further options are given, all reading options will read all of these
> -files that are available. If the global or the system-wide configuration
> -file are not available they will be ignored. If the repository configuration
> -file is not available or readable, 'git config' will exit with a non-zero
> -error code. However, in neither case will an error message be issued.
> +You may also provide additional configuration parameters when running any
> +git command by using the `-c` option. See linkgit:git[1] for details.

Listing "-c" as another one in addition to these files is probably a
good simplification, instead of "-c override others" in the original,
as we would need to say "more specific ones override less specific
ones" anyway.

> +Options will be read from all of these files that are available. If the
> +global or the system-wide configuration file are not available they will be
> +ignored. If the repository configuration file is not available or readable,
> +'git config' will exit with a non-zero error code. Note that neither case
> +produces an error message.

Problem inherited from the original, but I suspect that rephrasing
"not available" to "missing" (or "does not exist") may make it
easier to follow.  Also, "global" in the preceding description is
explained as one of the user-specific configuration files, so it may
be better to avoid it, e.g.

	If the user-specific or the system-wide configuration files
	are missing, they will be ignored.  If the repository
	configuration file is missing or unreadable, ...

Alternatively, we may want to tighten the description of
$XDG_CONFIG_HOME/git/config and ~/.gitconfig a bit better,
e.g. something along the lines of ...

	$XDG_CONFIG_HOME/git/config::
	$HOME/.gitconfig::
		User-specific configuration file.  When the
		XDG_CONFIG_HOME environment variable is not set or
		empty, $HOME/.config/ is used as $XDG_CONFIG_HOME.
	+
	These are often called the "global" configuration file.
	When either or both of them exist(s), both files are read.

Note that I deliberately omitted the mention that our $XDG support
may be too recent and $HOME/.gitconfig may be preferred for
portability.  It came from 21cf3227 (config: read (but not write)
from $XDG_CONFIG_HOME/git/config file, 2012-06-22) but a sufficient
number of years have passed.

Also note that I originally wrote the following immediately after
the above

	+
	When writing to the "--global" scope (see below),
	$XDG_CONFIG_HOME/git/config is used if it exists; otherwise
	$HOME/.gitconfig is used.

but decided to discard it, since the OPTIONS -> "--global" covers
it well enough.

With a tightening of the definition of what "global" is, we can
rephrase "If the global or the system configuration files are
missing...".
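
For reference, the fallback rule in the suggested wording above can be
sketched in C.  This is only an illustration: `xdg_git_config_path()` is a
hypothetical helper, not a function in Git, and it takes the values of
XDG_CONFIG_HOME and HOME as parameters instead of reading the environment
itself.

```c
#include <stdio.h>
#include <string.h>

/*
 * Write the XDG-style user configuration path into `buf`.
 * An unset (NULL) or empty `xdg_config_home` falls back to
 * "<home>/.config". Returns 0 on success, -1 if no usable base
 * directory is given or the result does not fit in `buf`.
 */
static int xdg_git_config_path(char *buf, size_t len,
			       const char *xdg_config_home, const char *home)
{
	int n;

	if (xdg_config_home && *xdg_config_home)
		n = snprintf(buf, len, "%s/git/config", xdg_config_home);
	else if (home && *home)
		n = snprintf(buf, len, "%s/.config/git/config", home);
	else
		return -1;

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

The $HOME/.gitconfig file is a separate path and is not covered by this
helper.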

> +You can limit which configuration sources are read from or written to by
> +specifying the path of a file with the `--file` option, or by specifying a
> +configuration scope with `--system`, `--global`, `--local`, or `--worktree`.
> +For more, see <<OPTIONS>> above.
> +
> +SCOPES
> +------
> +
> +Each configuration source falls within a configuration scope. The scopes
> +are:
> +
> +system::
> +	$(prefix)/etc/gitconfig
> +
> +global::
> +	$XDG_CONFIG_HOME/git/config
> ++
> +~/.gitconfig
> +
> +local::
> +	$GIT_DIR/config
> +
> +worktree::
> +	$GIT_DIR/config.worktree
> +
> +command::
> +	environment variables

We'd need to tighten this a bit, like:

	GIT_CONFIG_{COUNT,KEY,VALUE} environment variables (see below)

GIT_CONFIG_GLOBAL or GIT_CONFIG_SYSTEM environment variables and
others are listed in the ENVIRONMENT section that comes after this
section, but you do not want the readers to be confused into
thinking that we'd give them more precedence over others.

> ++
> +the `-c` option

> +With the exception of 'command', each scope corresponds to a command line
> +option: `--system`, `--global`, `--local`, `--worktree`.
> +
> +When reading options, specifying a scope will only read options from the
> +files within that scope. When writing options, specifying a scope will write
> +to the files within that scope (instead of the repository specific
> +configuration file). See <<OPTIONS>> above for a complete description.
>  
> +Most configuration options are respected regardless of the scope it is
> +defined in, but some options are only respected in certain scopes. See the
> +respective option's documentation for the full details.
>  
>  ENVIRONMENT
>  -----------

Overall it was a pleasant read.  Thanks, will queue.


* Re: [PATCH v7 2/5] Documentation: define protected configuration
  2022-07-07 23:01             ` [PATCH v7 2/5] Documentation: define protected configuration Glen Choo via GitGitGadget
@ 2022-07-08  0:39               ` Junio C Hamano
  0 siblings, 0 replies; 113+ messages in thread
From: Junio C Hamano @ 2022-07-08  0:39 UTC (permalink / raw)
  To: Glen Choo via GitGitGadget
  Cc: git, Taylor Blau, brian m. carlson, Derrick Stolee,
	Emily Shaffer, Jonathan Tan,
	Ævar Arnfjörð Bjarmason, Glen Choo

"Glen Choo via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Glen Choo <chooglen@google.com>
>
> For security reasons, there are config variables that are only trusted
> when they are specified in certain configuration scopes, which are
> sometimes referred to on-list as 'protected configuration' [1]. A future
> commit will introduce another such variable, so let's define our terms
> so that we can have consistent documentation and implementation.
>
> In our documentation, define 'protected configuration' as the system,
> global and command config scopes. As a shorthand, I will refer to
> variables that are only respected in protected configuration as
> 'protected configuration only', but this term is not used in the
> documentation.
>
> This definition of protected configuration is based on whether or not
> Git can reasonably protect the user by ignoring the configuration scope:
>
> - System, global and command line config are considered protected
>   because an attacker who has control over any of those can do plenty of
>   harm without Git, so we gain very little by ignoring those scopes.
> - On the other hand, local (and similarly, worktree) config are not
>   considered protected because it is relatively easy for an attacker to
>   control local config, e.g.:
>   - On some shared user environments, a non-admin attacker can create a
>     repository high up the directory hierarchy (e.g. C:\.git on
>     Windows), and a user may accidentally use it when their PS1
>     automatically invokes "git" commands.
>
>     `safe.directory` prevents attacks of this form by making sure that
>     the user intended to use the shared repository. It obviously
>     shouldn't be read from the repository, because that would end up
>     trusting the repository that Git was supposed to reject.
>   - "git upload-pack" is expected to run in repositories that may not be
>     controlled by the user. We cannot ignore all config in that
>     repository (because "git upload-pack" would fail), but we can limit
>     the risks by ignoring `uploadpack.packObjectsHook`.

This is only about the formatting, but have a blank line between
each bullet-point (e.g. before the line that talks about "On some
shared user environments, ..." and "git upload-pack").  A paragraph
break within a single bullet-point (i.e. the paragraph that talks
about `safe.directory` is a second paragraph of the same bullet
point as the paragraph before it) looks like a stronger break than
separation between each bullet-point, which you wrote without any
blank lines in between.

> Only `uploadpack.packObjectsHook` is 'protected configuration only'. The
> following variables are intentionally excluded:
>
> - `safe.directory` should be 'protected configuration only', but it does
>   not technically fit the definition because it is not respected in the
>   "command" scope. A future commit will fix this.
>
> - `trace2.*` happens to read the same scopes as `safe.directory` because
>   they share an implementation. However, this is not for security
>   reasons; it is because we want to start tracing so early that
>   repository-level config and "-c" are not available [2].
>
>   This requirement is unique to `trace2.*`, so it does not make sense
>   for protected configuration to be subject to the same constraints.

Very well reasoned.



* Re: [PATCH v7 0/5] config: introduce discovery.bare and protected config
  2022-07-07 23:01           ` [PATCH v7 " Glen Choo via GitGitGadget
                               ` (4 preceding siblings ...)
  2022-07-07 23:01             ` [PATCH v7 5/5] setup.c: create `discovery.bare` Glen Choo via GitGitGadget
@ 2022-07-08  1:07             ` Junio C Hamano
  2022-07-08 20:35               ` Glen Choo
  2022-07-14 21:27             ` [PATCH v8 0/5] config: introduce safe.bareRepository " Glen Choo via GitGitGadget
  6 siblings, 1 reply; 113+ messages in thread
From: Junio C Hamano @ 2022-07-08  1:07 UTC (permalink / raw)
  To: Glen Choo via GitGitGadget
  Cc: git, Taylor Blau, brian m. carlson, Derrick Stolee,
	Emily Shaffer, Jonathan Tan,
	Ævar Arnfjörð Bjarmason, Glen Choo

"Glen Choo via GitGitGadget" <gitgitgadget@gmail.com> writes:

> This version incorporates most of Taylor's comments and suggestions. Thanks
> especially for the wording suggestions, I struggled with those a lot :)
>
> (I believe) I've responded upthread with my intention for each comment. The
> only differences between that and the actual changes are:
>
>  * In Documentation/git-config.txt, I dropped a suggestion to mention that
>    "git config --local" is identical to the default behavior when writing
>    options because I found it too hard to fit in.
>
>  * In Documentation/config/discovery.txt, I took Taylor's suggestion, but
>    didn't mention "discovery" for the same reasons.
>
>  * I decided to leave out the protected config lookup functions. I made some
>    POC patches at:

These patches overall looked ok.  I am not very happy to see the
proliferation of namespaces like safe.* and discovery.* that are
not likely to get a second variable, though.


* Re: [PATCH v7 1/5] Documentation/git-config.txt: add SCOPES section
  2022-07-07 23:43               ` Junio C Hamano
@ 2022-07-08 17:01                 ` Glen Choo
  2022-07-08 19:01                   ` Junio C Hamano
  0 siblings, 1 reply; 113+ messages in thread
From: Glen Choo @ 2022-07-08 17:01 UTC (permalink / raw)
  To: Junio C Hamano, Glen Choo via GitGitGadget
  Cc: git, Taylor Blau, brian m. carlson, Derrick Stolee,
	Emily Shaffer, Jonathan Tan,
	Ævar Arnfjörð Bjarmason

Junio C Hamano <gitster@pobox.com> writes:

> "Glen Choo via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
>> From: Glen Choo <chooglen@google.com>
>>
>> In a subsequent commit, we will introduce "protected configuration",
>> which is easiest to describe in terms of configuration scopes (i.e. it's
>> the union of the 'system', 'global', and 'command' scopes). This
>> description is fine for ML discussions, but it's inadequate for end
>> users because we don't provide a good description of "configuration
>> scopes" in the public docs.
>>
>> 145d59f482 (config: add '--show-scope' to print the scope of a config
>> value, 2020-02-10) introduced the word "scope" to our public docs, but
>> that only enumerates the scopes and assumes the user can figure out
>> those values mean.
>
> Probably: "figure out those values mean" -> "figure out what those
> values mean"
>
>> Add a SCOPES section to Documentation/git-config.txt that describes the
>> configuration scopes, their corresponding CLI options, and mentions that
>> some configuration options are only respected in certain scopes. Then,
>> use the word "scope" to simplify the FILES section and change some
>> confusing wording.
>>
>> Signed-off-by: Glen Choo <chooglen@google.com>
>> ---
>>  Documentation/git-config.txt | 65 ++++++++++++++++++++++++++++--------
>>  1 file changed, 51 insertions(+), 14 deletions(-)
>>
>> diff --git a/Documentation/git-config.txt b/Documentation/git-config.txt
>> index 9376e39aef2..c4ce61a0493 100644
>> --- a/Documentation/git-config.txt
>> +++ b/Documentation/git-config.txt
>> @@ -297,8 +297,8 @@ The default is to use a pager.
>>  FILES
>>  -----
>>  
>> -If not set explicitly with `--file`, there are four files where
>> -'git config' will search for configuration options:
>> +By default, 'git config' will read configuration options from multiple
>> +files:
>>  
>>  $(prefix)/etc/gitconfig::
>>  	System-wide configuration file.
>> @@ -322,27 +322,64 @@ $GIT_DIR/config.worktree::
>>  	This is optional and is only searched when
>>  	`extensions.worktreeConfig` is present in $GIT_DIR/config.
>>  
>> -If no further options are given, all reading options will read all of these
>> -files that are available. If the global or the system-wide configuration
>> -file are not available they will be ignored. If the repository configuration
>> -file is not available or readable, 'git config' will exit with a non-zero
>> -error code. However, in neither case will an error message be issued.
>> +You may also provide additional configuration parameters when running any
>> +git command by using the `-c` option. See linkgit:git[1] for details.
>
> Listing "-c" as another one in addition to these files is probably a
> good simplification, instead of "-c override others" in the original,
> as we would need to say "more specific ones override less specific
> ones" anyway.
>
>> +Options will be read from all of these files that are available. If the
>> +global or the system-wide configuration file are not available they will be
>> +ignored. If the repository configuration file is not available or readable,
>> +'git config' will exit with a non-zero error code. Note that neither case
>> +produces an error message.
>
> Problem inherited from the original, but I suspect that rephrasing
> "not available" to "missing" (or "does not exist") may make it
> easier to follow.  Also, "global" in the preceding description is
> explained as one of the user-specific configuration files, so it may
> be better to avoid it, e.g.
>
> 	If the user-specific or the system-wide configuration files
> 	are missing, they will be ignored.  If the repository
> 	configuration file is missing or unreadable, ...
>
> Alternatively, we may want to tighten the description of
> $XDG_CONFIG_HOME/git/config and ~/.gitconfig a bit better,
> e.g. something along the lines of ...
>
> 	$XDG_CONFIG_HOME/git/config::
> 	$HOME/.gitconfig::
> 		User-specific configuration file.  When the
> 		XDG_CONFIG_HOME environment variable is not set or
> 		empty, $HOME/.config/ is used as $XDG_CONFIG_HOME.
> 	+
> 	These are often called the "global" configuration file.
> 	When either or both of them exist(s), both files are read.
>
> Note that I deliberately omitted the mention that our $XDG support
> may be too recent and $HOME/.gitconfig may be preferred for
> portability.  It came from 21cf3227 (config: read (but not write)
> from $XDG_CONFIG_HOME/git/config file, 2012-06-22) but a sufficient
> number of years have passed.
>
> Also note that I originally wrote the following immediately after
> the above
>
> 	+
> 	When writing to the "--global" scope (see below),
> 	$XDG_CONFIG_HOME/git/config is used if it exists; otherwise
> 	$HOME/.gitconfig is used.
>
> but decided to discard it, since the OPTIONS -> "--global" covers
> it well enough.
>
> With a tightening of the definition of what "global" is, we can
> rephrase "If the global or the system configuration files are
> missing...".

Makes sense. I think this is simpler and more coherent with your
suggested changes.

The only change I'd suggest is to expand "missing" -> "missing or
unreadable". The original wording is "not available", which could be
interpreted to cover both cases. We'd obviously also have to amend
"not available or readable" accordingly.

>> +You can limit which configuration sources are read from or written to by
>> +specifying the path of a file with the `--file` option, or by specifying a
>> +configuration scope with `--system`, `--global`, `--local`, or `--worktree`.
>> +For more, see <<OPTIONS>> above.
>> +
>> +SCOPES
>> +------
>> +
>> +Each configuration source falls within a configuration scope. The scopes
>> +are:
>> +
>> +system::
>> +	$(prefix)/etc/gitconfig
>> +
>> +global::
>> +	$XDG_CONFIG_HOME/git/config
>> ++
>> +~/.gitconfig
>> +
>> +local::
>> +	$GIT_DIR/config
>> +
>> +worktree::
>> +	$GIT_DIR/config.worktree
>> +
>> +command::
>> +	environment variables
>
> We'd need to tighten this a bit, like:
>
> 	GIT_CONFIG_{COUNT,KEY,VALUE} environment variables (see below)
>
> GIT_CONFIG_GLOBAL or GIT_CONFIG_SYSTEM environment variables and
> others are listed in the ENVIRONMENT section that comes after this
> section, but you do not want the readers to be confused into
> thinking that we'd give them more precedence over others.

Ah, good point.

>> ++
>> +the `-c` option
>
>> +With the exception of 'command', each scope corresponds to a command line
>> +option: `--system`, `--global`, `--local`, `--worktree`.
>> +
>> +When reading options, specifying a scope will only read options from the
>> +files within that scope. When writing options, specifying a scope will write
>> +to the files within that scope (instead of the repository specific
>> +configuration file). See <<OPTIONS>> above for a complete description.
>>  
>> +Most configuration options are respected regardless of the scope it is
>> +defined in, but some options are only respected in certain scopes. See the
>> +respective option's documentation for the full details.
>>  
>>  ENVIRONMENT
>>  -----------
>
> Overall it was a pleasant read.  Thanks, will queue.

Thanks! Shall I apply your suggestions, or were you planning to apply
them yourself?


* Re: [PATCH v7 1/5] Documentation/git-config.txt: add SCOPES section
  2022-07-08 17:01                 ` Glen Choo
@ 2022-07-08 19:01                   ` Junio C Hamano
  2022-07-08 21:38                     ` Glen Choo
  0 siblings, 1 reply; 113+ messages in thread
From: Junio C Hamano @ 2022-07-08 19:01 UTC (permalink / raw)
  To: Glen Choo
  Cc: Glen Choo via GitGitGadget, git, Taylor Blau, brian m. carlson,
	Derrick Stolee, Emily Shaffer, Jonathan Tan,
	Ævar Arnfjörð Bjarmason

Glen Choo <chooglen@google.com> writes:

>> Problem inherited from the original, but I suspect that rephrasing
>> "not available" to "missing" (or "does not exist") may make it
>> easier to follow.
>> ...
>
> The only change I'd suggest is to expand "missing" -> "missing or
> unreadable". The original wording is "not available", which could be
> interpreted to cover both cases. We'd obviously also have to amend
> "not available or readable" accordingly.

Probably.  I wonder if we should document that we at least warn when
a file we are expected to read exists but is not readable (instead
of simply saying "is ignored"), but other than that I agree with you.

> Thanks! Shall I apply your suggestions, or were you planning to apply
> them yourself?

Definitely not the latter ;-)


* Re: [PATCH v7 0/5] config: introduce discovery.bare and protected config
  2022-07-08  1:07             ` [PATCH v7 0/5] config: introduce discovery.bare and protected config Junio C Hamano
@ 2022-07-08 20:35               ` Glen Choo
  2022-07-12 22:11                 ` Glen Choo
  0 siblings, 1 reply; 113+ messages in thread
From: Glen Choo @ 2022-07-08 20:35 UTC (permalink / raw)
  To: Junio C Hamano, Glen Choo via GitGitGadget
  Cc: git, Taylor Blau, brian m. carlson, Derrick Stolee,
	Emily Shaffer, Jonathan Tan,
	Ævar Arnfjörð Bjarmason

Junio C Hamano <gitster@pobox.com> writes:

> "Glen Choo via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
>> This version incorporates most of Taylor's comments and suggestions. Thanks
>> especially for the wording suggestions, I struggled with those a lot :)
>>
>> (I believe) I've responded upthread with my intention for each comment. The
>> only differences between that and the actual changes are:
>>
>>  * In Documentation/git-config.txt, I dropped a suggestion to mention that
>>    "git config --local" is identical to the default behavior when writing
>>    options because I found it too hard to fit in.
>>
>>  * In Documentation/config/discovery.txt, I took Taylor's suggestion, but
>>    didn't mention "discovery" for the same reasons.
>>
>>  * I decided to leave out the protected config lookup functions. I made some
>>    POC patches at:
>
> These patches overall looked ok.  I am not very happy to see the
> proliferation of namespaces like safe.* and discovery.* that are
> not likely to get a second variable, though.

Fair. I think `discovery.bare` is similar enough to `safe.directory`
that it could belong in the safe.* namespace if we find a good name for
it.

We rejected "safe.bareRepository" earlier because of the insinuation
that bare repos are unsafe. Maybe:

- safe.bareDiscovery
- safe.bareRepositoryDiscovery
- safe.unspecifiedBareRepository
- safe.discoveredBareRepository

"safe.unspecifiedBareRepository" is sounding pretty good to me
actually.. Any thoughts?


* Re: [PATCH v7 1/5] Documentation/git-config.txt: add SCOPES section
  2022-07-08 19:01                   ` Junio C Hamano
@ 2022-07-08 21:38                     ` Glen Choo
  0 siblings, 0 replies; 113+ messages in thread
From: Glen Choo @ 2022-07-08 21:38 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Glen Choo via GitGitGadget, git, Taylor Blau, brian m. carlson,
	Derrick Stolee, Emily Shaffer, Jonathan Tan,
	Ævar Arnfjörð Bjarmason

Junio C Hamano <gitster@pobox.com> writes:

> Glen Choo <chooglen@google.com> writes:
>
>> Thanks! Shall I apply your suggestions, or were you planning to apply
>> them yourself?
>
> Definitely not the latter ;-)

;) Ok, I'll give others some time to weigh in before rerolling.


* Re: [PATCH v6 0/5] config: introduce discovery.bare and protected config
  2022-07-01 17:37             ` Glen Choo
@ 2022-07-08 21:58               ` Ævar Arnfjörð Bjarmason
  2022-07-12 20:47                 ` Glen Choo
  0 siblings, 1 reply; 113+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-07-08 21:58 UTC (permalink / raw)
  To: Glen Choo
  Cc: Glen Choo via GitGitGadget, git, Taylor Blau, brian m. carlson,
	Derrick Stolee, Junio C Hamano, Emily Shaffer, Jonathan Tan


On Fri, Jul 01 2022, Glen Choo wrote:

Sorry for the late reply, I see there's a v7, but this seems to also
apply to it, so...

> Thanks for weighing in :) Despite the different proposed approaches, I
> think we actually are in broad agreement.
>
> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>
>> On Thu, Jun 30 2022, Glen Choo via GitGitGadget wrote:
>>
>>> This is a quick re-roll to address Ævar's comments on the tests (thanks!).
>>
>> Thanks!
>>
>>> = Description
>>
>> Just more generally on this series & approach. I know this is a v6 by
>> now, but I haven't kept up with this topic, but to be fair I did mention
>> pretty much this in:
>> https://lore.kernel.org/git/220407.86lewhc6bz.gmgdl@evledraar.gmail.com/
>>
>> So...
>>
>>> There is a known social engineering attack that takes advantage of the fact
>>> that a working tree can include an entire bare repository, including a
>>> config file. A user could run a Git command inside the bare repository
>>> thinking that the config file of the 'outer' repository would be used, but
>>> in reality, the bare repository's config file (which is attacker-controlled)
>>> is used, which may result in arbitrary code execution. See [1] for a fuller
>>> description and deeper discussion.
>>>
>>> This series implements a simple way of preventing such attacks: create a
>>> config option, discovery.bare, that tells Git whether or not to die when it
>>> finds a bare repository. discovery.bare has two values:
>>>
>>>  * "always": always allow bare repositories (default), identical to current
>>>    behavior
>>>  * "never": never allow bare repositories
>>>
>>> and users/system administrators who never expect to work with bare
>>> repositories can secure their environments using "never". discovery.bare has
>>> no effect if --git-dir or GIT_DIR is passed because we are confident that
>>> the user is not confused about which repository is being used.
>>
>> I'm not insisting that the entire approach here should be changed, but
>> in the above exchange you seemed to have performance concerns about the
>> "just walk up in setup.c" approach I mentioned, but it's not clear if
>> that's still the only thing that necessitates taking this approach.
>>
>> There may be security subtleties that I've missed, but from the
>> description here it seems like that would work equally well, and
>> wouldn't require configuration, except insofar as we'd need to opt-in to
>> reading config from bare repositories *that also exist in a parent tree*.
>>
>> And it would be a more narrow & more secure solution, since it would
>> e.g. allow you to intentionally navigate to /var/repos/git/git.git in a
>> server setup and read the config there, which it could distinguish from
>> a case of /var/repos/.git existing, and git/git.git being brought in as
>> a part of that "parent" repo.
>
> Performance is one major concern, yes, and I agree that your findings
> show that the "just walk up" approach is cheap enough to consider doing.
> Though in the few cases where it isn't cheap to walk, wouldn't it still
> be useful to be able to opt out of it?

Maybe, but until we at least have a reason to think that there is a
performance concern this all seems rather hypothetical.

In the thread(s) I linked to I noted that I tested my POC implementation
on AIX, which has a sloooooooow filesystem.

Why are you concerned about this having a performance impact?

> The other concern is simplicity and correctness. Are we confident that
> we'll get the design of "just walk up" correct (including edge cases
> like "bare repo in bare repo in non bare repo")? I'm 100% confident that
> we'll get it right eventually, and that this approach will be a good
> default for all users. But in comparison, "never" is so much easier to
> understand and implement that I don't see why we shouldn't start by
> presenting this option to the 0.1-1% of users who would find it useful.

If you run "git status" or whatever in a directory anywhere on your FS
now it'll confidently tell you if you're not within a git repository.

We're just talking about re-using that code, why would we be concerned
that:

 1. Finding a repo at a given <PATH>
 2. Appending "/.." to that <PATH>
 3. Feeding it to a "given a path, find me the git repo" (code setup.c
    already has)

Is something we'd get wrong?
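
To make the three steps above concrete, here is a minimal sketch in C. It
is not setup.c's actual code or API: every name in it (is_repo_fn,
demo_is_repo, strip_last_component, discover_from, find_enclosing_repo) is
hypothetical, the "is this a repository?" check is a stand-in predicate
rather than a real filesystem probe, and paths are assumed to be absolute
and already normalized.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_PATH_MAX 4096

/* Hypothetical predicate: nonzero if `dir` is the root of a repository. */
typedef int (*is_repo_fn)(const char *dir);

/*
 * Toy predicate for illustration only: pretends that an outer repository
 * and a bare repository nested inside its working tree both exist.
 */
int demo_is_repo(const char *dir)
{
	return !strcmp(dir, "/srv/outer") ||
	       !strcmp(dir, "/srv/outer/vendor/inner.git");
}

/* Drop the last path component in place; returns 0 once "/" is reached. */
int strip_last_component(char *path)
{
	char *slash = strrchr(path, '/');

	if (!slash || !strcmp(path, "/"))
		return 0;
	if (slash == path)
		slash[1] = '\0';	/* "/a" becomes "/" */
	else
		*slash = '\0';		/* "/a/b" becomes "/a" */
	return 1;
}

/* Step 1 (and step 3): walk up from `start` until the predicate matches. */
int discover_from(const char *start, is_repo_fn is_repo,
		  char *out, size_t outlen)
{
	assert(start && is_repo && out);
	if (strlen(start) >= outlen)
		return 0;
	strcpy(out, start);
	do {
		if (is_repo(out))
			return 1;
	} while (strip_last_component(out));
	return 0;
}

/* Step 2, then step 3: take "<PATH>/.." and run the same discovery again. */
int find_enclosing_repo(const char *repo, is_repo_fn is_repo,
			char *out, size_t outlen)
{
	char parent[DEMO_PATH_MAX];

	if (strlen(repo) >= sizeof(parent))
		return 0;
	strcpy(parent, repo);
	if (!strip_last_component(parent))
		return 0;	/* already at the filesystem root */
	return discover_from(parent, is_repo, out, outlen);
}
```

With the toy predicate, discovery from inside inner.git finds inner.git
first, and find_enclosing_repo() on that result then reports /srv/outer, so
the bare repository would count as embedded. A standalone path such as
/var/repos/git/git.git has no enclosing repository and would be left alone.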

> And on the topic of simplicity, there's significant interest in
> maintaining backwards-compatibility with repos with workflows that
> absolutely depend on embedded bare repos, e.g. libgit2 and Git-LFS.
> That's yet another special case that we'd have to get right. Stolee's
> "no-embedded" proposal [1] pretty much covers that, but I don't see the
> harm in simplifying the design space by making bare repo support a
> non-goal.
>
> [1] https://lore.kernel.org/git/5b969c5e-e802-c447-ad25-6acc0b784582@github.com

Sure, and we could have a config knob to still use such repos (for
config or otherwise) with the "walk up" method, does that address your
concern here?

>> The "more narrow" and "more secure" go hand-in-hand, since if you work
>> on such servers you'd turn this to "always" because you want to read
>> such config, but then be left vulnerable to the actual (and much rarer)
>> exploit we're trying to prevent.
>
> The point that we're not defending bare repo users is fair, but maybe
> the group we're trying to protect isn't really dedicated Git-serving
> servers. This exploit requires you to have a bare repo inside the
> working tree of a non-bare repo. So I think this is less of an issue for
> a server, and more for "mixed-use" environments with both regular and
> bare clones.

Yes, but this is only something that's even a question because of an
artificial limitation your proposal here suffers from.

I.e. in trying to detect nefarious repos where you've got "looks like
bare" content *tracked* in another repo you're conflating it with *any
bare repo*.

And the only reason we're doing that seems to me to be a premature
optimization.

>> Which, it seems...
>>
>>> This series does not change the default behavior, but in the long-run, a
>>> "no-embedded" option might be a safe and usable default [2]. "never" is too
>>> restrictive and unlikely to be the default.
>>
>> This series has (since v3?) been noting aspirations to have a
>> "no-embedded" variant of this config, which your 5/5 here notes would be
>> better, but isn't implemented by this series.
>>
>> But your 5/5 also notes:
>>
>>     but detecting if a repository is embedded is potentially
>>     non-trivial, so this work is not implemented in this series.
>>
>> Hrm, well, the diff-stat isn't quite that trivial either :) :
>
> Well.. a lot of it is refactoring :P
>
>>> [...]
>>>  upload-pack.c                       | 27 ++++++----
>>>  12 files changed, 304 insertions(+), 47 deletions(-)
>>
>> In threads linked from the above ML link I linked to some POC code
>> showing how to hack a second .git discovery walk into setup.c. This was
>> as part of the "submodule parent dir" proposal, which is a different
>> feature, but also needs such "find the parent" code:
>> https://lore.kernel.org/git/211109.86v912dtfw.gmgdl@evledraar.gmail.com/
>>
>> Now, obviously that's a dirty hack, but it's not that hard to just
>> change the part of setup.c where we're satisfied that we've found the
>> git dir, then walk up "$THAT_DIR/..", and start our search again.
>>
>> Then:
>>
>> 	if (first_dir_was_bare() && found_parent_dir())
>> 		enforce_no_embedded();
>>
>> Isn't that what your proposed "no embedded" option would need to do?
>> Well, maybe we'd also check if the "first dir" is in the index of the
>> parent, as opposed to just being a bare .git somewhere in ~/Downloads,
>> e.g. if you have a ~/.git and keep your dot-files in git.
>>
>> But I think for an initial implementation just doing the walk would be
>> good enough, and would have a more narrow scope than this configuration
>> setting.
>
> A narrow scope is good, but I don't agree on this definition of
> "narrow". My preference is to give an obvious solution to a 'narrow'
> group of users, instead of a more tricky solution that affects all users
> in a 'narrow' set of cases.

We could still have an option to say "I never want to consider bare
repos from config", but if we're able to distinguish *tracked* bare
repos from non-tracked ones this is entirely unrelated to the initial
stated motivation for this series, isn't it?

I.e. to solve the security issue at hand.

I've got nothing against *also* providing a "I never want config from
xyz repos", but that seems to be orthogonal.

>> AFAICT the performance concerns aren't supported by any data, in the
>> case of the "submodule superproject" feature it turned out to not be the
>> directory walk, but us shelling out in a loop in git-submodule.sh.
>>
>> Well, *maybe* that's not the case, I think I have managed to read
>> between the lines of some of these past exchanges that there's some odd
>> proprietary internal NFS-like setup at Google where *parent dirs* are
>> auto-mounted and searched on access, so a "walk up" pattern would be
>> much more expensive.
>>
>> I do worry a bit about us ending up with design choices in git that we
>> wouldn't have ended up with, if not to cater to some in-house setup
>> somewhere that 99.99% of git users will never see.
>
> At the very least, I don't think you're saying that it's a bad idea to
> have "never", just that we might not have come up with it if not for
> some Google NFS thing.

I'm just speculating as to why we've ended up with this approach, but
maybe I'm wrong.

> Another use case I can think of is CI bots, which have no need for bare
> repos. To some folks (maybe in very security-sensitive environments),
> "never" might give more peace of mind than "no-embedded".

Sure, but again, I've got nothing against having *more config knobs*,
and I've often e.g. wanted some way to tell git to read no config at all
(which you can sort of do with the right environment variables, but it's
a hassle).

But I don't see how that's not unrelated to addressing the issue this
series aims to address, and what I'm pointing out is that it's doing so
with a method that's less accurate than the "walk up" method, and less
secure.


* Re: [PATCH v6 0/5] config: introduce discovery.bare and protected config
  2022-07-08 21:58               ` Ævar Arnfjörð Bjarmason
@ 2022-07-12 20:47                 ` Glen Choo
  2022-07-12 23:53                   ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 113+ messages in thread
From: Glen Choo @ 2022-07-12 20:47 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Glen Choo via GitGitGadget, git, Taylor Blau, brian m. carlson,
	Derrick Stolee, Junio C Hamano, Emily Shaffer, Jonathan Tan


Thanks for following up. I'm concerned that this thread will be
unproductive if all we're doing is reiterating our own opinions. I'm ok
if the conclusion is "agree to disagree", but let's not spend too much
time talking circles around one another (myself included, of course:)).

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> On Fri, Jul 01 2022, Glen Choo wrote:
>>> The "more narrow" and "more secure" go hand-in-hand, since if you work
>>> on such servers you'd turn this to "always" because you want to read
>>> such config, but then be left vulnerable to the actual (and much rarer)
>>> exploit we're trying to prevent.
>>
>> The point that we're not defending bare repo users is fair, but maybe
>> the group we're trying to protect isn't really dedicated Git-serving
>> servers. This exploit requires you to have a bare repo inside the
>> working tree of a non-bare repo. So I think this is less of an issue for
>> a server, and more for "mixed-use" environments with both regular and
>> bare clones.
>
> Yes, but this is only something that's even a question because of an
> artificial limitation your proposal here suffers from.
>
> I.e. in trying to detect nefarious repos where you've got "looks like
> bare" content *tracked* in another repo you're conflating it with *any
> bare repo*.
>
> And the only reason we're doing that seems to me to be a premature
> optimization.

Right, I hear you. Besides performance, let me offer the perspective
that I should have led with in the previous email. In this thread and
the original "embedded bare repo" one [1], there is a huge diversity of
opinion on what the default behavior should be, e.g.:

- How do we detect an embedded bare repo (fsck check? walk up [and check
  if it's tracked]?)
- What to do when we detect one (ignore the config? block the repo?)
- How to preserve workflows that rely on embedded bare repos (some kind
  of (global|per-repo) exception list? allow the repo but not the
  config?)

And rightfully so! There are a lot of options here, so we want to make
sure we get the defaults right. But at the same time that implies a
pretty slow, difficult process.

On the other hand, I haven't seen nearly as much disagreement on "just
refuse to work with bare repos" because it's so restrictive that it
probably won't be the default. So it'll have no effect on most users,
but still confers protection for the subset of users who can benefit
from it. For those who want the problem fixed _today_ (e.g. my
employer), this seems like simple, low-hanging fruit that buys time for
us to find a good default.

FWIW, when time permits I'd be happy to work on that good default (which
will probably be some variant of "walk up"), and to pay off the tech
debt introduced by this implementation (I have some ideas about how we
could improve the config API to achieve this [2]). Hopefully that helps
allay some of your concerns?

[1] https://lore.kernel.org/git/kl6lsfqpygsj.fsf@chooglen-macbookpro.roam.corp.google.com
[2] https://lore.kernel.org/git/kl6lr13fi9qn.fsf@chooglen-macbookpro.roam.corp.google.com


* Re: [PATCH v7 0/5] config: introduce discovery.bare and protected config
  2022-07-08 20:35               ` Glen Choo
@ 2022-07-12 22:11                 ` Glen Choo
  0 siblings, 0 replies; 113+ messages in thread
From: Glen Choo @ 2022-07-12 22:11 UTC (permalink / raw)
  To: Junio C Hamano, Glen Choo via GitGitGadget
  Cc: git, Taylor Blau, brian m. carlson, Derrick Stolee,
	Emily Shaffer, Jonathan Tan,
	Ævar Arnfjörð Bjarmason, Johannes Schindelin

Glen Choo <chooglen@google.com> writes:

> Junio C Hamano <gitster@pobox.com> writes:
>
>> "Glen Choo via GitGitGadget" <gitgitgadget@gmail.com> writes:
>>
>>> This version incorporates most of Taylor's comments and suggestions. Thanks
>>> especially for the wording suggestions, I struggled with those a lot :)
>>>
>>> (I believe) I've responded upthread with my intention for each comment. The
>>> only differences between that and the actual changes are:
>>>
>>>  * In Documentation/git-config.txt, I dropped a suggestion to mention that
>>>    "git config --local" is identical to the default behavior when writing
>>>    options because I found it too hard to fit in.
>>>
>>>  * In Documentation/config/discovery.txt, I took Taylor's suggestion, but
>>>    didn't mention "discovery" for the same reasons.
>>>
>>>  * I decided to leave out the protected config lookup functions. I made some
>>>    POC patches at:
>>
>> These patches overall looked ok.  I am not very happy to see the
>> proliferation of namespaces like safe.* and discovery.* that are
>> not likely to get a second variable, though.
>
> Fair. I think `discovery.bare` is similar enough to `safe.directory`
> that it could belong in the safe.* namespace if we find a good name for
> it.
>
> We rejected "safe.bareRepository" earlier because of the insinuation
> that bare repos are unsafe. Maybe:
>
> - safe.bareDiscovery
> - safe.bareRepositoryDiscovery
> - safe.unspecifiedBareRepository
> - safe.discoveredBareRepository
>
> "safe.unspecifiedBareRepository" is sounding pretty good to me
> actually.. Any thoughts?

(+CC Johannes Schindelin for thoughts on what should go into `safe.*`
and/or design considerations that went into it.)

Another thought is that `discovery.bare` and `safe.directory` should
both indeed live in the same namespace, but that namespace should be
named something other than `safe.*`, e.g. if we had
`allowedRepositories.otherOwner` instead of `safe.directory`, it would
have been a no-brainer for me to put this in the `allowedRepositories.*`
namespace.

So an alternative proposal would be:

- rename this to `allowedRepositories.discoveredBare`
- (possibly not in this series, but at some point) create a
  `safe.directory` alias in that namespace, e.g.
  `allowedRepositories.otherOwner`

*But* I don't see the former making sense without the latter (I really
think both should be in the same namespace), so if we think that's
unnecessary churn, I'll drop this idea entirely.


* Re: [PATCH v6 0/5] config: introduce discovery.bare and protected config
  2022-07-12 20:47                 ` Glen Choo
@ 2022-07-12 23:53                   ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 113+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-07-12 23:53 UTC (permalink / raw)
  To: Glen Choo
  Cc: Glen Choo via GitGitGadget, git, Taylor Blau, brian m. carlson,
	Derrick Stolee, Junio C Hamano, Emily Shaffer, Jonathan Tan


On Tue, Jul 12 2022, Glen Choo wrote:

> Thanks for following up. I'm concerned that this thread will be
> unproductive if all we're doing is reiterating our own opinions. I'm ok
> if the conclusion is "agree to disagree", but let's not spend too much
> time talking circles around one another (myself included, of course:)).

Yes, I have not been following up here to merely repeat what's been said
before, but...

> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>
>> On Fri, Jul 01 2022, Glen Choo wrote:
>>>> The "more narrow" and "more secure" go hand-in-hand, since if you work
>>>> on such servers you'd turn this to "always" because you want to read
>>>> such config, but then be left vulnerable to the actual (and much rarer)
>>>> exploit we're trying to prevent.
>>>
>>> The point that we're not defending bare repo users is fair, but maybe
>>> the group we're trying to protect isn't really dedicated Git-serving
>>> servers. This exploit requires you to have a bare repo inside the
>>> working tree of a non-bare repo. So I think this is less of an issue for
>>> a server, and more for "mixed-use" environments with both regular and
>>> bare clones.
>>
>> Yes, but this is only something that's even a question because of an
>> artificial limitation your proposal here suffers from.
>>
>> I.e. in trying to detect nefarious repos where you've got "looks like
>> bare" content *tracked* in another repo you're conflating it with *any
>> bare repo*.
>>
>> And the only reason we're doing that seems to me to be a premature
>> optimization.
>
> Right, I hear you. Besides performance,[...]

...have been following up because it's still genuinely unclear to me
what data or design constraints led to this solution. I.e. in [1] you
noted ("[...]" interjection is mine):

	"I don't see how we could implement this [the "walk-up" method]
	without imposing a big penalty to all bare repo users[...]."

[Continued below]

> let me offer the perspective
> that I should have led with in the previous email. In this thread and
> the original "embedded bare repo" one [1], there is a huge diversity of
> opinion on what the default behavior should be, e.g.:

I read that thread over again, and some of the highlights were:

 * brian asking if we can't basically do the "walk up" method:
   https://lore.kernel.org/git/Yk9hONuCIVIq6ieV@camp.crustytoothpaste.net/

 * Taylor wondering how much we need to worry about this attack (among
   other things) & worrying about legitimate "bare repo" workflows being
   broken: https://lore.kernel.org/git/YloTQH35r2xVdPm1@nand.local/ &
   https://lore.kernel.org/git/Ylobp7sntKeWTLDX@nand.local/

But most importantly, here's something I hadn't noticed before:

 * Emily talking about the supposed slowness of the "walk up" method:
   https://lore.kernel.org/git/CAJoAoZkgnnvdymuBsM9Ja3+eYSnyohr=FQZMVX_uzZ_pkQhgaw@mail.gmail.com/

I.e.:

	"wantonly scanning up the filesystem for any gitdir above the
	current one is really expensive. When I tried that approach for
	the purposes of including some shared config between
	superproject and submodules, it slowed down the Git test suite
	by something like 3-5x."

Which I'm now 99.99% certain based on past context[2] is a misstatement
or misrecollection about an early version of
submodule.superprojectGitDir vs. what setup.c would do.

I.e. that 3-5x slowness referred to git-submodule.sh shelling out to
"git rev-parse", it's not a reference to the expense of the few syscalls
we'd need to make to discover a parent git directory.

Did you hear about the directory walking being a performance concern
from Emily, or was it an independent discovery?

It seems as though this might have come about because of a
misrecollection about the git-rev-parse(1)/git-submodule.sh vs. setup.c
performance with reference to submodule.superprojectGitDir, and that
we've now got a design that's optimized to avoid a performance problem
that doesn't exist, at the cost of accuracy.

And not to reiterate, but I think the performance isn't a concern
per-se, but rather that performance concerns seem to have driven one
design over another.

> - How do we detect an embedded bare repo (fsck check? walk up [and check
>   if it's tracked]?)
> - What to do when we detect one (ignore the config? block the repo?)
> - How to preserve workflows that rely on embedded bare repos (some kind
>   of (global|per-repo) exception list? allow the repo but not the
>   config?)
>
> And rightfully so! There are a lot of options here, so we want to make
> sure we get the defaults right. But at the same time that implies a
> pretty slow, difficult process.

I saw some implementation discussion about how we'd do this with fsck,
which is one thing, but I don't really see the trickiness or ambiguity
on the client side.

I.e. we know when we'd "find a repo", so that's the criteria we'd use to
ignore such a contained repo or not. The only trickiness seems to come
about if the approach we pick is one where we conflate embedded bare
repos with non-embedded bare repos.

> On the other hand, I haven't seen nearly as much disagreement on "just
> refuse to work with bare repos" because it's so restrictive that it
> probably won't be the default. So it'll have no effect on most users,
> but still confers protection for the subset of users who can benefit
> from it. For those who want the problem fixed _today_ (e.g. my
> employer), this seems like simple, low-hanging fruit that buys time for
> us to find a good default.
>
> FWIW, when time permits I'd be happy to work on that good default (which
> will probably be some variant of "walk up"), and to pay off the tech
> debt introduced by this implementation (I have some ideas about how we
> could improve the config API to achieve this [2]). Hopefully that helps
> allay some of your concerns?

It really just seems like a dead end to me, sorry.

I.e. we know what the security problem is, but the side-effects of this
approach are such that we'll probably never turn it on by default.

So that'll mean that the vast majority of users who could benefit from
the security mitigation won't even know about the config, or if they do
might not have it turned on.

And yes, we might end up with a better design later, but then we'll have
to still support this config mechanism, potentially deprecate it etc.

> [1] https://lore.kernel.org/git/kl6lsfqpygsj.fsf@chooglen-macbookpro.roam.corp.google.com
> [2] https://lore.kernel.org/git/kl6lr13fi9qn.fsf@chooglen-macbookpro.roam.corp.google.com

1. https://lore.kernel.org/git/kl6lee1z8mcm.fsf@chooglen-macbookpro.roam.corp.google.com/
2. https://lore.kernel.org/git/211109.86v912dtfw.gmgdl@evledraar.gmail.com/


* [PATCH v8 0/5] config: introduce safe.bareRepository and protected config
  2022-07-07 23:01           ` [PATCH v7 " Glen Choo via GitGitGadget
                               ` (5 preceding siblings ...)
  2022-07-08  1:07             ` [PATCH v7 0/5] config: introduce discovery.bare and protected config Junio C Hamano
@ 2022-07-14 21:27             ` Glen Choo via GitGitGadget
  2022-07-14 21:27               ` [PATCH v8 1/5] Documentation/git-config.txt: add SCOPES section Glen Choo via GitGitGadget
                                 ` (4 more replies)
  6 siblings, 5 replies; 113+ messages in thread
From: Glen Choo via GitGitGadget @ 2022-07-14 21:27 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, Derrick Stolee, Junio C Hamano, Emily Shaffer,
	Jonathan Tan, Ævar Arnfjörð Bjarmason,
	Johannes Schindelin, Glen Choo

Thanks all! This version takes Junio's wording suggestions on the previous
round, renames the config variable and, because of conflicts with 2.37.1, is
newly rebased onto master.

The config variable is now named "safe.bareRepository"; I discarded this
name earlier in this series, but the intent behind the series has come
around to the point where "safe.bareRepository" makes sense again. And
thanks to Dscho's suggestion to explicitly create an option for "no
discovered bare repos" [1], the UI is self-documenting enough that we can
write terser docs.

As mentioned upthread, I would love to see the "safe.*" namespace renamed (it's
an adjective, not a Git command/entity), but I'm doubling down on it anyway.
This config option does such a similar thing to "safe.directory" that they
really should be siblings, and I don't foresee a world where
"safe.directory" gets renamed.

I've triple-checked to make sure I've scrubbed the commits of the old name,
but I'd appreciate the extra eyes :)

= Description

There is a known social engineering attack that takes advantage of the fact
that a working tree can include an entire bare repository, including a
config file. A user could run a Git command inside the bare repository
thinking that the config file of the 'outer' repository would be used, but
in reality, the bare repository's config file (which is attacker-controlled)
is used, which may result in arbitrary code execution. See [2] for a fuller
description and deeper discussion.

This series implements a simple way of preventing such attacks: create a
config option, safe.bareRepository, that tells Git whether or not to die
when it finds a bare repository. safe.bareRepository has two values:

 * "all": allow all bare repositories (default), identical to current
   behavior
 * "explicit": only allow bare repositories specified via --git-dir or
   GIT_DIR

and users/system administrators who never expect to work with bare
repositories can secure their environments using "explicit". We still trust
explicit bare repositories because we are confident that the user is not
confused about which repository is being used.

This series does not change the default behavior, but in the long run, a
"no-embedded" option might be a safe and usable default [3]. "explicit" is too
restrictive and unlikely to be the default.

For security reasons, safe.bareRepository cannot be read from
repository-level config (because we would end up trusting the embedded bare
repository that we aren't supposed to trust to begin with). Since this would
introduce a 3rd variable that is only read from 'protected/trusted
configuration' (the others are safe.directory and
uploadpack.packObjectsHook) this series also defines and creates a shared
implementation for 'protected configuration'.

= Patch organization

 * Patch 1 adds a section on configuration scopes to our docs
 * Patches 2-3 define 'protected configuration' and create a shared
   implementation.
 * Patch 4 refactors safe.directory to use protected configuration
 * Patch 5 adds safe.bareRepository

= Series history

Changes in v8:

 * Rename discovery.bare -> safe.bareRepository, change values from
   "always|never" -> "all|explicit"
 * Numerous docs improvements
 * Rebase onto post-2.37.1 master

Changes in v7:

 * Numerous docs improvements and code cleanup.
 * In 3/5's commit message, drop "as fast as possible" and allude to lookup
   functions coming in a later series.
 * Remove a comment in 3/5 about repository.protected_config. That was stale
   since v4, but slipped under the radar until now.
 * Fix some s/protected config/protected configuration (leftover from v5).

Changes in v6:

 * Add TEST_PASSES_SANITIZE_LEAK=true
 * Replace all sub-shells with -C and use test_config_global
 * Change the expect_rejected helper to use "grep -F" with a more specific
   message.
   * This reveals that the "-c discovery.bare=" assertion in the last test
     was passing for the wrong reason (because '' is an invalid value for
     "discovery.bare"). I removed it because it wasn't doing anything useful
     anyway - I was trying to make discovery.bare unset in the command line,
     but the whole point of that test is to assert that we respect the CLI
     arg.

Changes in v5:

 * Standardize the usage of "protected configuration" instead of mixing
   "config" and "configuration". This required some unfortunate rewrapping.
 * Remove mentions of "trustworthiness" when discussing protected
   configuration and focus on what Git does instead.
   * The rationale of protected vs non-protected is still kept.
 * Fix the stale documentation entry for discovery.bare.
 * Include a fuller description of how discovery.bare and "--git-dir"
   interact instead of saying "has no effect".

Changes in v4:

 * 2/5's commit message now justifies what scopes are included in protected
   config
 * The global configset is now a file-scope static inside config.c
   (previously it was a member of the_repository).
 * Rename discovery_bare_config to discovery_bare_allowed
 * Make discovery_bare_allowed function-scoped (instead of global).
 * Add an expect_accepted helper to the discovery.bare tests.
 * Add a helper to "upload-pack" that reads the protected and non-protected
   config

Changes in v3:

 * Rebase onto a more recent 'master'
 * Reframe this feature only in terms of the 'embedded bare repo' attack.
 * Other docs improvements (thanks Stolee in particular!)
 * Protected config no longer uses read_very_early_config() and is only read
   once
 * Protected config now includes "-c"
 * uploadpack.packObjectsHook now uses protected config instead of ignoring
   repo config using config scopes

Changes in v2:

 * Rename safe.barerepository to discovery.bare and make it die()
 * Move tests into t/t0034-discovery-bare.sh
 * Avoid unnecessary config reading by using a static variable
 * Add discovery.bare=cwd
 * Fix typos

= Future work

 * This series doesn't implement config lookup functions for protected
   config. This will be done in a follow-up series.
 * This series does not implement the "no-embedded" option [3] and I won't
   work on it any time soon, but I'd be more than happy to review if someone
   sends patches.
 * With safe.bareRepository, if a builtin is marked RUN_SETUP_GENTLY,
   setup.c doesn't die() and we don't tell users why their repository was
   rejected, e.g. "git config" gives an opaque "fatal: not in a git
   directory". This isn't a new problem though, since safe.directory has the
   same issue.

[1] https://lore.kernel.org/git/5ps2q552-1rr3-7161-4181-31556pp2ns12@tzk.qr

[2]
https://lore.kernel.org/git/kl6lsfqpygsj.fsf@chooglen-macbookpro.roam.corp.google.com

[3] This was first suggested in
https://lore.kernel.org/git/5b969c5e-e802-c447-ad25-6acc0b784582@github.com

Glen Choo (5):
  Documentation/git-config.txt: add SCOPES section
  Documentation: define protected configuration
  config: learn `git_protected_config()`
  safe.directory: use git_protected_config()
  setup.c: create `safe.bareRepository`

 Documentation/config/safe.txt       | 25 +++++++-
 Documentation/config/uploadpack.txt |  6 +-
 Documentation/git-config.txt        | 95 ++++++++++++++++++++++-------
 config.c                            | 43 +++++++++++++
 config.h                            | 16 +++++
 setup.c                             | 59 +++++++++++++++++-
 t/t0033-safe-directory.sh           | 24 +++-----
 t/t0035-safe-bare-repository.sh     | 54 ++++++++++++++++
 t/t5544-pack-objects-hook.sh        |  7 ++-
 upload-pack.c                       | 27 +++++---
 10 files changed, 300 insertions(+), 56 deletions(-)
 create mode 100755 t/t0035-safe-bare-repository.sh


base-commit: 4e2a4d1dd44367d7783f33b169698f2930ff13c0
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1261%2Fchooglen%2Fsetup%2Fdisable-bare-repo-config-v8
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1261/chooglen/setup/disable-bare-repo-config-v8
Pull-Request: https://github.com/git/git/pull/1261

Range-diff vs v7:

 1:  5c58db3bb21 ! 1:  6147751c9c1 Documentation/git-config.txt: add SCOPES section
     @@ Commit message
          145d59f482 (config: add '--show-scope' to print the scope of a config
          value, 2020-02-10) introduced the word "scope" to our public docs, but
          that only enumerates the scopes and assumes the user can figure out
     -    those values mean.
     +    what those values mean.
      
          Add a SCOPES section to Documentation/git-config.txt that describes the
          configuration scopes, their corresponding CLI options, and mentions that
     @@ Documentation/git-config.txt: The default is to use a pager.
       
       $(prefix)/etc/gitconfig::
       	System-wide configuration file.
     + 
     + $XDG_CONFIG_HOME/git/config::
     +-	Second user-specific configuration file. If $XDG_CONFIG_HOME is not set
     +-	or empty, `$HOME/.config/git/config` will be used. Any single-valued
     +-	variable set in this file will be overwritten by whatever is in
     +-	`~/.gitconfig`.  It is a good idea not to create this file if
     +-	you sometimes use older versions of Git, as support for this
     +-	file was added fairly recently.
     +-
     + ~/.gitconfig::
     +-	User-specific configuration file. Also called "global"
     +-	configuration file.
     ++	User-specific configuration files. When the XDG_CONFIG_HOME environment
     ++	variable is not set or empty, $HOME/.config/ is used as
     ++	$XDG_CONFIG_HOME.
     +++
     ++These are also called "global" configuration files. If both files exist, both
     ++files are read in the order given above.
     + 
     + $GIT_DIR/config::
     + 	Repository specific configuration file.
      @@ Documentation/git-config.txt: $GIT_DIR/config.worktree::
       	This is optional and is only searched when
       	`extensions.worktreeConfig` is present in $GIT_DIR/config.
     @@ Documentation/git-config.txt: $GIT_DIR/config.worktree::
      +git command by using the `-c` option. See linkgit:git[1] for details.
      +
      +Options will be read from all of these files that are available. If the
     -+global or the system-wide configuration file are not available they will be
     -+ignored. If the repository configuration file is not available or readable,
     -+'git config' will exit with a non-zero error code. Note that neither case
     -+produces an error message.
     ++global or the system-wide configuration files are missing or unreadable they
     ++will be ignored. If the repository configuration file is missing or unreadable,
     ++'git config' will exit with a non-zero error code. An error message is produced
     ++if the file is unreadable, but not if it is missing.
       
       The files are read in the order given above, with last value found taking
       precedence over values read earlier.  When multiple values are taken then all
     @@ Documentation/git-config.txt: $GIT_DIR/config.worktree::
      +	$GIT_DIR/config.worktree
      +
      +command::
     -+	environment variables
     ++	GIT_CONFIG_{COUNT,KEY,VALUE} environment variables (see <<ENVIRONMENT>>
     ++	below)
      ++
      +the `-c` option
      +
     @@ Documentation/git-config.txt: $GIT_DIR/config.worktree::
      +defined in, but some options are only respected in certain scopes. See the
      +respective option's documentation for the full details.
       
     ++[[ENVIRONMENT]]
       ENVIRONMENT
       -----------
     + 
 2:  58f25612aa3 ! 2:  df8a1a78d53 Documentation: define protected configuration
     @@ Commit message
          - System, global and command line config are considered protected
            because an attacker who has control over any of those can do plenty of
            harm without Git, so we gain very little by ignoring those scopes.
     +
          - On the other hand, local (and similarly, worktree) config are not
            considered protected because it is relatively easy for an attacker to
            control local config, e.g.:
     +
            - On some shared user environments, a non-admin attacker can create a
              repository high up the directory hierarchy (e.g. C:\.git on
              Windows), and a user may accidentally use it when their PS1
     @@ Commit message
              the user intended to use the shared repository. It obviously
              shouldn't be read from the repository, because that would end up
              trusting the repository that Git was supposed to reject.
     +
            - "git upload-pack" is expected to run in repositories that may not be
              controlled by the user. We cannot ignore all config in that
              repository (because "git upload-pack" would fail), but we can limit
     @@ Documentation/git-config.txt: Most configuration options are respected regardles
      +substantial harm without using Git, so it is assumed that the user's environment
      +protects these scopes against attackers.
      +
     + [[ENVIRONMENT]]
       ENVIRONMENT
       -----------
     - 
 3:  3683d20f232 ! 3:  30ac73716cb config: learn `git_protected_config()`
     @@ Commit message
      
          `uploadpack.packObjectsHook` is the only 'protected configuration only'
          variable today, but we've noted that `safe.directory` and the upcoming
     -    `discovery.bare` should also be 'protected configuration only'. So, for
     -    consistency, we'd like to have a single implementation for protected
     +    `safe.bareRepository` should also be 'protected configuration only'. So,
     +    for consistency, we'd like to have a single implementation for protected
          configuration.
      
          The primary constraints are:
      
          1. Reading from protected configuration should be fast. Nearly all "git"
             commands inside a bare repository will read both `safe.directory` and
     -       `discovery.bare`, so we cannot afford to be slow.
     +       `safe.bareRepository`, so we cannot afford to be slow.
      
          2. Protected configuration must be readable when the gitdir is not
     -       known. `safe.directory` and `discovery.bare` both affect repository
     -       discovery and the gitdir is not known at that point [1].
     +       known. `safe.directory` and `safe.bareRepository` both affect
     +       repository discovery and the gitdir is not known at that point [1].
      
          The chosen implementation in this commit is to read protected
          configuration and cache the values in a global configset. This is
 4:  6394818ffd8 ! 4:  b3256d68f84 safe.directory: use git_protected_config()
     @@ Documentation/config/safe.txt: via `git config --add`. To reset the list of safe
       path relative to the home directory and `%(prefix)/<path>` expands to a
      
       ## setup.c ##
     -@@ setup.c: static int ensure_valid_ownership(const char *path)
     - 	    is_path_owned_by_current_user(path))
     - 		return 1;
     - 
     +@@ setup.c: static int ensure_valid_ownership(const char *gitfile,
     + 	 * constant regardless of what failed above. data.is_safe should be
     + 	 * initialized to false, and might be changed by the callback.
     + 	 */
      -	read_very_early_config(safe_directory_cb, &data);
      +	git_protected_config(safe_directory_cb, &data);
       
     @@ t/t0033-safe-directory.sh: test_expect_success 'safe.directory is not set' '
       
      -test_expect_success 'ignoring safe.directory on the command line' '
      -	test_must_fail git -c safe.directory="$(pwd)" status 2>err &&
     --	grep "unsafe repository" err
     +-	grep "dubious ownership" err
      +test_expect_success 'safe.directory on the command line' '
      +	git -c safe.directory="$(pwd)" status
       '
     @@ t/t0033-safe-directory.sh: test_expect_success 'safe.directory is not set' '
      -		GIT_CONFIG_KEY_0="safe.directory" \
      -		GIT_CONFIG_VALUE_0="$(pwd)" \
      -		git status 2>err &&
     --	grep "unsafe repository" err
     +-	grep "dubious ownership" err
      +test_expect_success 'safe.directory in the environment' '
      +	env GIT_CONFIG_COUNT=1 \
      +	    GIT_CONFIG_KEY_0="safe.directory" \
     @@ t/t0033-safe-directory.sh: test_expect_success 'safe.directory is not set' '
      -	test_must_fail env \
      -		GIT_CONFIG_PARAMETERS="${SQ}safe.directory${SQ}=${SQ}$(pwd)${SQ}" \
      -		git status 2>err &&
     --	grep "unsafe repository" err
     +-	grep "dubious ownership" err
      +test_expect_success 'safe.directory in GIT_CONFIG_PARAMETERS' '
      +	env GIT_CONFIG_PARAMETERS="${SQ}safe.directory${SQ}=${SQ}$(pwd)${SQ}" \
      +	    git status
 5:  eff4b07480e ! 5:  50069bba9a5 setup.c: create `discovery.bare`
     @@ Metadata
      Author: Glen Choo <chooglen@google.com>
      
       ## Commit message ##
     -    setup.c: create `discovery.bare`
     +    setup.c: create `safe.bareRepository`
      
          There is a known social engineering attack that takes advantage of the
          fact that a working tree can include an entire bare repository,
     @@ Commit message
          `--git-dir` or `GIT_DIR`. In environments that don't use bare
          repositories, this would be minimally disruptive.
      
     -    Create a config variable, `discovery.bare`, that tells Git whether or
     -    not to die() when it discovers a bare repository. This only affects
     -    repository discovery, thus it has no effect if discovery was not
     -    done, e.g. if the user passes `--git-dir=my-dir`, discovery will be
     -    skipped and my-dir will be used as the repo regardless of the
     -    `discovery.bare` value.
     +    Create a config variable, `safe.bareRepository`, that tells Git whether
     +    or not to die() when working with a bare repository. This config is an
     +    enum of:
      
     -    This config is an enum of:
     -
     -    - "always": always allow bare repositories (this is the default)
     -    - "never": never allow bare repositories
     +    - "all": allow all bare repositories (this is the default)
     +    - "explicit": only allow bare repositories specified via --git-dir
     +      or GIT_DIR.
      
          If we want to protect users from such attacks by default, neither value
     -    will suffice - "always" provides no protection, but "never" is
     +    will suffice - "all" provides no protection, but "explicit" is
          impractical for bare repository users. A more usable default would be to
          allow only non-embedded bare repositories ([2] contains one such
          proposal), but detecting if a repository is embedded is potentially
     @@ Commit message
      
          Signed-off-by: Glen Choo <chooglen@google.com>
      
     - ## Documentation/config.txt ##
     -@@ Documentation/config.txt: include::config/diff.txt[]
     - 
     - include::config/difftool.txt[]
     - 
     -+include::config/discovery.txt[]
     -+
     - include::config/extensions.txt[]
     - 
     - include::config/fastimport.txt[]
     -
     - ## Documentation/config/discovery.txt (new) ##
     + ## Documentation/config/safe.txt ##
      @@
     -+discovery.bare::
     -+	Specifies whether Git will work with a bare repository that
     -+	wasn't specified via the top-level `--git-dir` command-line
     -+	option, or the `GIT_DIR` environment variable (see
     -+	linkgit:git[1]). If the repository is specified, Git will always
     -+	use the specified repository, regardless of this value.
     ++safe.bareRepository::
     ++	Specifies which bare repositories Git will work with. The currently
     ++	supported values are:
     +++
     ++* `all`: Git works with all bare repositories. This is the default.
     ++* `explicit`: Git only works with bare repositories specified via
     ++  the top-level `--git-dir` command-line option, or the `GIT_DIR`
     ++  environment variable (see linkgit:git[1]).
     +++
     ++If you do not use bare repositories in your workflow, then it may be
     ++beneficial to set `safe.bareRepository` to `explicit` in your global
     ++config. This will protect you from attacks that involve cloning a
     ++repository that contains a bare repository and running a Git command
     ++within that directory.
      ++
      +This config setting is only respected in protected configuration (see
      +<<SCOPES>>). This prevents the untrusted repository from tampering with
      +this value.
     -++
     -+The currently supported values are:
     -++
     -+* `always`: Git always works with bare repositories
     -+* `never`: Git never works with bare repositories
     -++
     -+If you do not use bare repositories in your workflow, then it may be
     -+beneficial to set `discovery.bare` to `never` in your global config.
     -+This will protect you from attacks that involve cloning a repository
     -+that contains a bare repository and running a Git command within that
     -+directory.
     ++
     + safe.directory::
     + 	These config entries specify Git-tracked directories that are
     + 	considered safe even if they are owned by someone other than the
      
       ## setup.c ##
      @@
       static int inside_git_dir = -1;
       static int inside_work_tree = -1;
       static int work_tree_config_is_bogus;
     -+enum discovery_bare_allowed {
     -+	DISCOVERY_BARE_NEVER = 0,
     -+	DISCOVERY_BARE_ALWAYS,
     ++enum allowed_bare_repo {
     ++	ALLOWED_BARE_REPO_EXPLICIT = 0,
     ++	ALLOWED_BARE_REPO_ALL,
      +};
       
       static struct startup_info the_startup_info;
       struct startup_info *startup_info = &the_startup_info;
     -@@ setup.c: static int ensure_valid_ownership(const char *path)
     +@@ setup.c: static int ensure_valid_ownership(const char *gitfile,
       	return data.is_safe;
       }
       
     -+static int discovery_bare_cb(const char *key, const char *value, void *d)
     ++static int allowed_bare_repo_cb(const char *key, const char *value, void *d)
      +{
     -+	enum discovery_bare_allowed *discovery_bare_allowed = d;
     ++	enum allowed_bare_repo *allowed_bare_repo = d;
      +
     -+	if (strcmp(key, "discovery.bare"))
     ++	if (strcasecmp(key, "safe.bareRepository"))
      +		return 0;
      +
     -+	if (!strcmp(value, "never")) {
     -+		*discovery_bare_allowed = DISCOVERY_BARE_NEVER;
     ++	if (!strcmp(value, "explicit")) {
     ++		*allowed_bare_repo = ALLOWED_BARE_REPO_EXPLICIT;
      +		return 0;
      +	}
     -+	if (!strcmp(value, "always")) {
     -+		*discovery_bare_allowed = DISCOVERY_BARE_ALWAYS;
     ++	if (!strcmp(value, "all")) {
     ++		*allowed_bare_repo = ALLOWED_BARE_REPO_ALL;
      +		return 0;
      +	}
      +	return -1;
      +}
      +
     -+static enum discovery_bare_allowed get_discovery_bare(void)
     ++static enum allowed_bare_repo get_allowed_bare_repo(void)
      +{
     -+	enum discovery_bare_allowed result = DISCOVERY_BARE_ALWAYS;
     -+	git_protected_config(discovery_bare_cb, &result);
     ++	enum allowed_bare_repo result = ALLOWED_BARE_REPO_ALL;
     ++	git_protected_config(allowed_bare_repo_cb, &result);
      +	return result;
      +}
      +
     -+static const char *discovery_bare_allowed_to_string(
     -+	enum discovery_bare_allowed discovery_bare_allowed)
     ++static const char *allowed_bare_repo_to_string(
     ++	enum allowed_bare_repo allowed_bare_repo)
      +{
     -+	switch (discovery_bare_allowed) {
     -+	case DISCOVERY_BARE_NEVER:
     -+		return "never";
     -+	case DISCOVERY_BARE_ALWAYS:
     -+		return "always";
     ++	switch (allowed_bare_repo) {
     ++	case ALLOWED_BARE_REPO_EXPLICIT:
     ++		return "explicit";
     ++	case ALLOWED_BARE_REPO_ALL:
     ++		return "all";
      +	default:
     -+		BUG("invalid discovery_bare_allowed %d",
     -+		    discovery_bare_allowed);
     ++		BUG("invalid allowed_bare_repo %d",
     ++		    allowed_bare_repo);
      +	}
      +	return NULL;
      +}
     @@ setup.c: static enum discovery_result setup_git_directory_gently_1(struct strbuf
       		}
       
       		if (is_git_directory(dir->buf)) {
     -+			if (get_discovery_bare() == DISCOVERY_BARE_NEVER)
     ++			if (get_allowed_bare_repo() == ALLOWED_BARE_REPO_EXPLICIT)
      +				return GIT_DIR_DISALLOWED_BARE;
     - 			if (!ensure_valid_ownership(dir->buf))
     + 			if (!ensure_valid_ownership(NULL, NULL, dir->buf))
       				return GIT_DIR_INVALID_OWNERSHIP;
       			strbuf_addstr(gitdir, ".");
      @@ setup.c: const char *setup_git_directory_gently(int *nongit_ok)
     @@ setup.c: const char *setup_git_directory_gently(int *nongit_ok)
       		break;
      +	case GIT_DIR_DISALLOWED_BARE:
      +		if (!nongit_ok) {
     -+			die(_("cannot use bare repository '%s' (discovery.bare is '%s')"),
     ++			die(_("cannot use bare repository '%s' (safe.bareRepository is '%s')"),
      +			    dir.buf,
     -+			    discovery_bare_allowed_to_string(get_discovery_bare()));
     ++			    allowed_bare_repo_to_string(get_allowed_bare_repo()));
      +		}
      +		*nongit_ok = 1;
      +		break;
     @@ setup.c: const char *setup_git_directory_gently(int *nongit_ok)
       		/*
       		 * As a safeguard against setup_git_directory_gently_1 returning
      
     - ## t/t0035-discovery-bare.sh (new) ##
     + ## t/t0035-safe-bare-repository.sh (new) ##
      @@
      +#!/bin/sh
      +
     -+test_description='verify discovery.bare checks'
     ++test_description='verify safe.bareRepository checks'
      +
      +TEST_PASSES_SANITIZE_LEAK=true
      +. ./test-lib.sh
     @@ t/t0035-discovery-bare.sh (new)
      +	git init --bare outer-repo/bare-repo
      +'
      +
     -+test_expect_success 'discovery.bare unset' '
     ++test_expect_success 'safe.bareRepository unset' '
      +	expect_accepted -C outer-repo/bare-repo
      +'
      +
     -+test_expect_success 'discovery.bare=always' '
     -+	test_config_global discovery.bare always &&
     ++test_expect_success 'safe.bareRepository=all' '
     ++	test_config_global safe.bareRepository all &&
      +	expect_accepted -C outer-repo/bare-repo
      +'
      +
     -+test_expect_success 'discovery.bare=never' '
     -+	test_config_global discovery.bare never &&
     ++test_expect_success 'safe.bareRepository=explicit' '
     ++	test_config_global safe.bareRepository explicit &&
      +	expect_rejected -C outer-repo/bare-repo
      +'
      +
     -+test_expect_success 'discovery.bare in the repository' '
     -+	# discovery.bare must not be "never", otherwise git config fails
     -+	# with "fatal: not in a git directory" (like safe.directory)
     -+	test_config -C outer-repo/bare-repo discovery.bare always &&
     -+	test_config_global discovery.bare never &&
     ++test_expect_success 'safe.bareRepository in the repository' '
     ++	# safe.bareRepository must not be "explicit", otherwise
     ++	# git config fails with "fatal: not in a git directory" (like
     ++	# safe.directory)
     ++	test_config -C outer-repo/bare-repo safe.bareRepository \
     ++		all &&
     ++	test_config_global safe.bareRepository explicit &&
      +	expect_rejected -C outer-repo/bare-repo
      +'
      +
     -+test_expect_success 'discovery.bare on the command line' '
     -+	test_config_global discovery.bare never &&
     ++test_expect_success 'safe.bareRepository on the command line' '
     ++	test_config_global safe.bareRepository explicit &&
      +	expect_accepted -C outer-repo/bare-repo \
     -+		-c discovery.bare=always
     ++		-c safe.bareRepository=all
      +'
      +
      +test_done

-- 
gitgitgadget


* [PATCH v8 1/5] Documentation/git-config.txt: add SCOPES section
  2022-07-14 21:27             ` [PATCH v8 0/5] config: introduce safe.bareRepository " Glen Choo via GitGitGadget
@ 2022-07-14 21:27               ` Glen Choo via GitGitGadget
  2022-07-14 21:27               ` [PATCH v8 2/5] Documentation: define protected configuration Glen Choo via GitGitGadget
                                 ` (3 subsequent siblings)
  4 siblings, 0 replies; 113+ messages in thread
From: Glen Choo via GitGitGadget @ 2022-07-14 21:27 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, Derrick Stolee, Junio C Hamano, Emily Shaffer,
	Jonathan Tan, Ævar Arnfjörð Bjarmason,
	Johannes Schindelin, Glen Choo, Glen Choo

From: Glen Choo <chooglen@google.com>

In a subsequent commit, we will introduce "protected configuration",
which is easiest to describe in terms of configuration scopes (i.e. it's
the union of the 'system', 'global', and 'command' scopes). This
description is fine for ML discussions, but it's inadequate for end
users because we don't provide a good description of "configuration
scopes" in the public docs.

145d59f482 (config: add '--show-scope' to print the scope of a config
value, 2020-02-10) introduced the word "scope" to our public docs, but
that only enumerates the scopes and assumes the user can figure out
what those values mean.

Add a SCOPES section to Documentation/git-config.txt that describes the
configuration scopes, their corresponding CLI options, and mentions that
some configuration options are only respected in certain scopes. Then,
use the word "scope" to simplify the FILES section and change some
confusing wording.

Signed-off-by: Glen Choo <chooglen@google.com>
---
 Documentation/git-config.txt | 82 ++++++++++++++++++++++++++----------
 1 file changed, 59 insertions(+), 23 deletions(-)

diff --git a/Documentation/git-config.txt b/Documentation/git-config.txt
index 9376e39aef2..53c7c65f9ed 100644
--- a/Documentation/git-config.txt
+++ b/Documentation/git-config.txt
@@ -297,23 +297,20 @@ The default is to use a pager.
 FILES
 -----
 
-If not set explicitly with `--file`, there are four files where
-'git config' will search for configuration options:
+By default, 'git config' will read configuration options from multiple
+files:
 
 $(prefix)/etc/gitconfig::
 	System-wide configuration file.
 
 $XDG_CONFIG_HOME/git/config::
-	Second user-specific configuration file. If $XDG_CONFIG_HOME is not set
-	or empty, `$HOME/.config/git/config` will be used. Any single-valued
-	variable set in this file will be overwritten by whatever is in
-	`~/.gitconfig`.  It is a good idea not to create this file if
-	you sometimes use older versions of Git, as support for this
-	file was added fairly recently.
-
 ~/.gitconfig::
-	User-specific configuration file. Also called "global"
-	configuration file.
+	User-specific configuration files. When the XDG_CONFIG_HOME environment
+	variable is not set or empty, $HOME/.config/ is used as
+	$XDG_CONFIG_HOME.
++
+These are also called "global" configuration files. If both files exist, both
+files are read in the order given above.
 
 $GIT_DIR/config::
 	Repository specific configuration file.
@@ -322,28 +319,67 @@ $GIT_DIR/config.worktree::
 	This is optional and is only searched when
 	`extensions.worktreeConfig` is present in $GIT_DIR/config.
 
-If no further options are given, all reading options will read all of these
-files that are available. If the global or the system-wide configuration
-file are not available they will be ignored. If the repository configuration
-file is not available or readable, 'git config' will exit with a non-zero
-error code. However, in neither case will an error message be issued.
+You may also provide additional configuration parameters when running any
+git command by using the `-c` option. See linkgit:git[1] for details.
+
+Options will be read from all of these files that are available. If the
+global or the system-wide configuration files are missing or unreadable they
+will be ignored. If the repository configuration file is missing or unreadable,
+'git config' will exit with a non-zero error code. An error message is produced
+if the file is unreadable, but not if it is missing.
 
 The files are read in the order given above, with last value found taking
 precedence over values read earlier.  When multiple values are taken then all
 values of a key from all files will be used.
 
-You may override individual configuration parameters when running any git
-command by using the `-c` option. See linkgit:git[1] for details.
-
-All writing options will per default write to the repository specific
+By default, options are only written to the repository specific
 configuration file. Note that this also affects options like `--replace-all`
 and `--unset`. *'git config' will only ever change one file at a time*.
 
-You can override these rules using the `--global`, `--system`,
-`--local`, `--worktree`, and `--file` command-line options; see
-<<OPTIONS>> above.
+You can limit which configuration sources are read from or written to by
+specifying the path of a file with the `--file` option, or by specifying a
+configuration scope with `--system`, `--global`, `--local`, or `--worktree`.
+For more, see <<OPTIONS>> above.
+
+SCOPES
+------
+
+Each configuration source falls within a configuration scope. The scopes
+are:
+
+system::
+	$(prefix)/etc/gitconfig
+
+global::
+	$XDG_CONFIG_HOME/git/config
++
+~/.gitconfig
+
+local::
+	$GIT_DIR/config
+
+worktree::
+	$GIT_DIR/config.worktree
+
+command::
+	GIT_CONFIG_{COUNT,KEY,VALUE} environment variables (see <<ENVIRONMENT>>
+	below)
++
+the `-c` option
+
+With the exception of 'command', each scope corresponds to a command line
+option: `--system`, `--global`, `--local`, `--worktree`.
+
+When reading options, specifying a scope will only read options from the
+files within that scope. When writing options, specifying a scope will write
+to the files within that scope (instead of the repository specific
+configuration file). See <<OPTIONS>> above for a complete description.
 
+Most configuration options are respected regardless of the scope it is
+defined in, but some options are only respected in certain scopes. See the
+respective option's documentation for the full details.
 
+[[ENVIRONMENT]]
 ENVIRONMENT
 -----------
 
-- 
gitgitgadget



* [PATCH v8 2/5] Documentation: define protected configuration
  2022-07-14 21:27             ` [PATCH v8 0/5] config: introduce safe.bareRepository " Glen Choo via GitGitGadget
  2022-07-14 21:27               ` [PATCH v8 1/5] Documentation/git-config.txt: add SCOPES section Glen Choo via GitGitGadget
@ 2022-07-14 21:27               ` Glen Choo via GitGitGadget
  2022-07-14 21:27               ` [PATCH v8 3/5] config: learn `git_protected_config()` Glen Choo via GitGitGadget
                                 ` (2 subsequent siblings)
  4 siblings, 0 replies; 113+ messages in thread
From: Glen Choo via GitGitGadget @ 2022-07-14 21:27 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, Derrick Stolee, Junio C Hamano, Emily Shaffer,
	Jonathan Tan, Ævar Arnfjörð Bjarmason,
	Johannes Schindelin, Glen Choo, Glen Choo

From: Glen Choo <chooglen@google.com>

For security reasons, there are config variables that are only trusted
when they are specified in certain configuration scopes, which are
sometimes referred to on-list as 'protected configuration' [1]. A future
commit will introduce another such variable, so let's define our terms
so that we can have consistent documentation and implementation.

In our documentation, define 'protected configuration' as the system,
global and command config scopes. As a shorthand, I will refer to
variables that are only respected in protected configuration as
'protected configuration only', but this term is not used in the
documentation.

This definition of protected configuration is based on whether or not
Git can reasonably protect the user by ignoring the configuration scope:

- System, global and command line config are considered protected
  because an attacker who has control over any of those can do plenty of
  harm without Git, so we gain very little by ignoring those scopes.

- On the other hand, local (and similarly, worktree) config are not
  considered protected because it is relatively easy for an attacker to
  control local config, e.g.:

  - On some shared user environments, a non-admin attacker can create a
    repository high up the directory hierarchy (e.g. C:\.git on
    Windows), and a user may accidentally use it when their PS1
    automatically invokes "git" commands.

    `safe.directory` prevents attacks of this form by making sure that
    the user intended to use the shared repository. It obviously
    shouldn't be read from the repository, because that would end up
    trusting the repository that Git was supposed to reject.

  - "git upload-pack" is expected to run in repositories that may not be
    controlled by the user. We cannot ignore all config in that
    repository (because "git upload-pack" would fail), but we can limit
    the risks by ignoring `uploadpack.packObjectsHook`.

Only `uploadpack.packObjectsHook` is 'protected configuration only'. The
following variables are intentionally excluded:

- `safe.directory` should be 'protected configuration only', but it does
  not technically fit the definition because it is not respected in the
  "command" scope. A future commit will fix this.

- `trace2.*` happens to read the same scopes as `safe.directory` because
  they share an implementation. However, this is not for security
  reasons; it is because we want to start tracing so early that
  repository-level config and "-c" are not available [2].

  This requirement is unique to `trace2.*`, so it does not make sense
  for protected configuration to be subject to the same constraints.
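
As an illustration only, the definition above can be sketched as a
predicate over config scopes. The enum and function names here are
hypothetical and are not the identifiers used in config.c; the sketch
only assumes the five scopes listed in the SCOPES section.

```c
#include <stdbool.h>

/* Hypothetical scope enum for illustration; not Git's `enum config_scope`. */
enum example_scope {
	EXAMPLE_SCOPE_SYSTEM,
	EXAMPLE_SCOPE_GLOBAL,
	EXAMPLE_SCOPE_LOCAL,
	EXAMPLE_SCOPE_WORKTREE,
	EXAMPLE_SCOPE_COMMAND,
};

/* Returns true for the scopes this commit message defines as protected. */
bool example_scope_is_protected(enum example_scope scope)
{
	switch (scope) {
	case EXAMPLE_SCOPE_SYSTEM:
	case EXAMPLE_SCOPE_GLOBAL:
	case EXAMPLE_SCOPE_COMMAND:
		return true;
	default:
		/* local and worktree config may be attacker-controlled */
		return false;
	}
}
```

A 'protected configuration only' variable would then be honored only
when the scope it was read from passes this check.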

[1] For example,
https://lore.kernel.org/git/6af83767-576b-75c4-c778-0284344a8fe7@github.com/
[2] https://lore.kernel.org/git/a0c89d0d-669e-bf56-25d2-cbb09b012e70@jeffhostetler.com/

Signed-off-by: Glen Choo <chooglen@google.com>
---
 Documentation/config/uploadpack.txt |  6 +++---
 Documentation/git-config.txt        | 13 +++++++++++++
 2 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/Documentation/config/uploadpack.txt b/Documentation/config/uploadpack.txt
index 32fad5bbe81..16264d82a72 100644
--- a/Documentation/config/uploadpack.txt
+++ b/Documentation/config/uploadpack.txt
@@ -49,9 +49,9 @@ uploadpack.packObjectsHook::
 	`pack-objects` to the hook, and expects a completed packfile on
 	stdout.
 +
-Note that this configuration variable is ignored if it is seen in the
-repository-level config (this is a safety measure against fetching from
-untrusted repositories).
+Note that this configuration variable is only respected when it is specified
+in protected configuration (see <<SCOPES>>). This is a safety measure
+against fetching from untrusted repositories.
 
 uploadpack.allowFilter::
 	If this option is set, `upload-pack` will support partial
diff --git a/Documentation/git-config.txt b/Documentation/git-config.txt
index 53c7c65f9ed..7a2bcb2f6cb 100644
--- a/Documentation/git-config.txt
+++ b/Documentation/git-config.txt
@@ -341,6 +341,7 @@ specifying the path of a file with the `--file` option, or by specifying a
 configuration scope with `--system`, `--global`, `--local`, or `--worktree`.
 For more, see <<OPTIONS>> above.
 
+[[SCOPES]]
 SCOPES
 ------
 
@@ -379,6 +380,18 @@ Most configuration options are respected regardless of the scope it is
 defined in, but some options are only respected in certain scopes. See the
 respective option's documentation for the full details.
 
+Protected configuration
+~~~~~~~~~~~~~~~~~~~~~~~
+
+Protected configuration refers to the 'system', 'global', and 'command' scopes.
+For security reasons, certain options are only respected when they are
+specified in protected configuration, and ignored otherwise.
+
+Git treats these scopes as if they are controlled by the user or a trusted
+administrator. This is because an attacker who controls these scopes can do
+substantial harm without using Git, so it is assumed that the user's environment
+protects these scopes against attackers.
+
 [[ENVIRONMENT]]
 ENVIRONMENT
 -----------
-- 
gitgitgadget



* [PATCH v8 3/5] config: learn `git_protected_config()`
  2022-07-14 21:27             ` [PATCH v8 0/5] config: introduce safe.bareRepository " Glen Choo via GitGitGadget
  2022-07-14 21:27               ` [PATCH v8 1/5] Documentation/git-config.txt: add SCOPES section Glen Choo via GitGitGadget
  2022-07-14 21:27               ` [PATCH v8 2/5] Documentation: define protected configuration Glen Choo via GitGitGadget
@ 2022-07-14 21:27               ` Glen Choo via GitGitGadget
  2022-07-25 18:26                 ` SANITIZE=address failure on master (was: [PATCH v8 3/5] config: learn `git_protected_config()`) Ævar Arnfjörð Bjarmason
  2022-07-14 21:28               ` [PATCH v8 4/5] safe.directory: use git_protected_config() Glen Choo via GitGitGadget
  2022-07-14 21:28               ` [PATCH v8 5/5] setup.c: create `safe.bareRepository` Glen Choo via GitGitGadget
  4 siblings, 1 reply; 113+ messages in thread
From: Glen Choo via GitGitGadget @ 2022-07-14 21:27 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, Derrick Stolee, Junio C Hamano, Emily Shaffer,
	Jonathan Tan, Ævar Arnfjörð Bjarmason,
	Johannes Schindelin, Glen Choo, Glen Choo

From: Glen Choo <chooglen@google.com>

`uploadpack.packObjectsHook` is the only 'protected configuration only'
variable today, but we've noted that `safe.directory` and the upcoming
`safe.bareRepository` should also be 'protected configuration only'. So,
for consistency, we'd like to have a single implementation for protected
configuration.

The primary constraints are:

1. Reading from protected configuration should be fast. Nearly all "git"
   commands inside a bare repository will read both `safe.directory` and
   `safe.bareRepository`, so we cannot afford to be slow.

2. Protected configuration must be readable when the gitdir is not
   known. `safe.directory` and `safe.bareRepository` both affect
   repository discovery and the gitdir is not known at that point [1].

The chosen implementation in this commit is to read protected
configuration and cache the values in a global configset. This is
similar to the caching behavior we get with the_repository->config.

Introduce git_protected_config(), which reads protected configuration
and caches the values in the global configset protected_config. Then,
refactor `uploadpack.packObjectsHook` to use git_protected_config().

The protected configuration functions are named similarly to their
non-protected counterparts, e.g. git_protected_config_check_init() vs
git_config_check_init().

In light of constraint 1, this implementation can still be improved.
git_protected_config() iterates through every variable in
protected_config, which is wasteful, but it makes the conversion simple
because it matches existing patterns. We will likely implement constant
time lookup functions for protected configuration in a future series
(such functions already exist for non-protected configuration, i.e.
repo_config_get_*()).
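
The read-once-then-iterate pattern described above can be sketched in
isolation as follows. This is a minimal stand-in, not the code in
config.c: every name is hypothetical, the "sources" are two hardcoded
pairs with made-up values, and a plain array takes the place of
`struct config_set`.

```c
#include <stddef.h>

/* Hypothetical callback type; Git's real type is config_fn_t. */
typedef int (*example_config_fn)(const char *key, const char *value,
				 void *data);

struct example_entry {
	const char *key;
	const char *value;
};

/* Stand-in for the global configset; filled on first use. */
static struct {
	int initialized;
	int load_count;	/* how many times the sources were read */
	struct example_entry entries[2];
	size_t nr;
} example_cache;

/* Pretend to read the system, global and command scopes (hardcoded). */
static void example_read_protected(void)
{
	example_cache.entries[0] =
		(struct example_entry){ "safe.directory", "/srv/repo" };
	example_cache.entries[1] =
		(struct example_entry){ "safe.barerepository", "explicit" };
	example_cache.nr = 2;
	example_cache.load_count++;
	example_cache.initialized = 1;
}

/* Load once, then pass every cached pair to the callback (linear scan). */
void example_protected_config(example_config_fn fn, void *data)
{
	size_t i;

	if (!example_cache.initialized)
		example_read_protected();
	for (i = 0; i < example_cache.nr; i++)
		fn(example_cache.entries[i].key,
		   example_cache.entries[i].value, data);
}

/* Example callback: counts the pairs it is shown. */
int example_count_cb(const char *key, const char *value, void *data)
{
	(void)key;
	(void)value;
	++*(int *)data;
	return 0;
}

/* Exposes the load counter so callers can observe the caching. */
int example_load_count(void)
{
	return example_cache.load_count;
}
```

Calling example_protected_config() twice visits every pair twice but
reads the sources only once, which is the trade-off noted above: the
read is cached, while each lookup is still a full iteration.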

An alternative that avoids introducing another configset is to continue
to read all config using git_config(), but only accept values that have
the correct config scope [2]. This technically fulfills constraint 2,
because git_config() simply ignores the local and worktree config when
the gitdir is not known. However, this would read incomplete config into
the_repository->config, which would need to be reset when the gitdir is
known and git_config() needs to read the local and worktree config.
Resetting the_repository->config might be reasonable while we only have
these 'protected configuration only' variables, but it's not clear
whether this extends well to future variables.

[1] In this case, we do have a candidate gitdir though, so with a little
refactoring, it might be possible to provide a gitdir.
[2] This is how `uploadpack.packObjectsHook` was implemented prior to
this commit.

Signed-off-by: Glen Choo <chooglen@google.com>
---
 config.c                     | 43 ++++++++++++++++++++++++++++++++++++
 config.h                     | 16 ++++++++++++++
 t/t5544-pack-objects-hook.sh |  7 +++++-
 upload-pack.c                | 27 +++++++++++++---------
 4 files changed, 82 insertions(+), 11 deletions(-)

diff --git a/config.c b/config.c
index 9b0e9c93285..015bec360f5 100644
--- a/config.c
+++ b/config.c
@@ -81,6 +81,17 @@ static enum config_scope current_parsing_scope;
 static int pack_compression_seen;
 static int zlib_compression_seen;
 
+/*
+ * Config that comes from trusted scopes, namely:
+ * - CONFIG_SCOPE_SYSTEM (e.g. /etc/gitconfig)
+ * - CONFIG_SCOPE_GLOBAL (e.g. $HOME/.gitconfig, $XDG_CONFIG_HOME/git)
+ * - CONFIG_SCOPE_COMMAND (e.g. "-c" option, environment variables)
+ *
+ * This is declared here for code cleanliness, but unlike the other
+ * static variables, this does not hold config parser state.
+ */
+static struct config_set protected_config;
+
 static int config_file_fgetc(struct config_source *conf)
 {
 	return getc_unlocked(conf->u.file);
@@ -2378,6 +2389,11 @@ int git_configset_add_file(struct config_set *cs, const char *filename)
 	return git_config_from_file(config_set_callback, filename, cs);
 }
 
+int git_configset_add_parameters(struct config_set *cs)
+{
+	return git_config_from_parameters(config_set_callback, cs);
+}
+
 int git_configset_get_value(struct config_set *cs, const char *key, const char **value)
 {
 	const struct string_list *values = NULL;
@@ -2619,6 +2635,33 @@ int repo_config_get_pathname(struct repository *repo,
 	return ret;
 }
 
+/* Read values into protected_config. */
+static void read_protected_config(void)
+{
+	char *xdg_config = NULL, *user_config = NULL, *system_config = NULL;
+
+	git_configset_init(&protected_config);
+
+	system_config = git_system_config();
+	git_global_config(&user_config, &xdg_config);
+
+	git_configset_add_file(&protected_config, system_config);
+	git_configset_add_file(&protected_config, xdg_config);
+	git_configset_add_file(&protected_config, user_config);
+	git_configset_add_parameters(&protected_config);
+
+	free(system_config);
+	free(xdg_config);
+	free(user_config);
+}
+
+void git_protected_config(config_fn_t fn, void *data)
+{
+	if (!protected_config.hash_initialized)
+		read_protected_config();
+	configset_iter(&protected_config, fn, data);
+}
+
 /* Functions used historically to read configuration from 'the_repository' */
 void git_config(config_fn_t fn, void *data)
 {
diff --git a/config.h b/config.h
index 7654f61c634..ca994d77147 100644
--- a/config.h
+++ b/config.h
@@ -446,6 +446,15 @@ void git_configset_init(struct config_set *cs);
  */
 int git_configset_add_file(struct config_set *cs, const char *filename);
 
+/**
+ * Parses command line options and environment variables, and adds the
+ * variable-value pairs to the `config_set`. Returns 0 on success, or -1
+ * if there is an error in parsing. The caller decides whether to free
+ * the incomplete configset or continue using it when the function
+ * returns -1.
+ */
+int git_configset_add_parameters(struct config_set *cs);
+
 /**
  * Finds and returns the value list, sorted in order of increasing priority
  * for the configuration variable `key` and config set `cs`. When the
@@ -505,6 +514,13 @@ int repo_config_get_maybe_bool(struct repository *repo,
 int repo_config_get_pathname(struct repository *repo,
 			     const char *key, const char **dest);
 
+/*
+ * Functions for reading protected config. By definition, protected
+ * config ignores repository config, so these do not take a `struct
+ * repository` parameter.
+ */
+void git_protected_config(config_fn_t fn, void *data);
+
 /**
  * Querying For Specific Variables
  * -------------------------------
diff --git a/t/t5544-pack-objects-hook.sh b/t/t5544-pack-objects-hook.sh
index dd5f44d986f..54f54f8d2eb 100755
--- a/t/t5544-pack-objects-hook.sh
+++ b/t/t5544-pack-objects-hook.sh
@@ -56,7 +56,12 @@ test_expect_success 'hook does not run from repo config' '
 	! grep "hook running" stderr &&
 	test_path_is_missing .git/hook.args &&
 	test_path_is_missing .git/hook.stdin &&
-	test_path_is_missing .git/hook.stdout
+	test_path_is_missing .git/hook.stdout &&
+
+	# check that global config is used instead
+	test_config_global uploadpack.packObjectsHook ./hook &&
+	git clone --no-local . dst2.git 2>stderr &&
+	grep "hook running" stderr
 '
 
 test_expect_success 'hook works with partial clone' '
diff --git a/upload-pack.c b/upload-pack.c
index 3a851b36066..09f48317b02 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -1321,18 +1321,27 @@ static int upload_pack_config(const char *var, const char *value, void *cb_data)
 		data->advertise_sid = git_config_bool(var, value);
 	}
 
-	if (current_config_scope() != CONFIG_SCOPE_LOCAL &&
-	    current_config_scope() != CONFIG_SCOPE_WORKTREE) {
-		if (!strcmp("uploadpack.packobjectshook", var))
-			return git_config_string(&data->pack_objects_hook, var, value);
-	}
-
 	if (parse_object_filter_config(var, value, data) < 0)
 		return -1;
 
 	return parse_hide_refs_config(var, value, "uploadpack");
 }
 
+static int upload_pack_protected_config(const char *var, const char *value, void *cb_data)
+{
+	struct upload_pack_data *data = cb_data;
+
+	if (!strcmp("uploadpack.packobjectshook", var))
+		return git_config_string(&data->pack_objects_hook, var, value);
+	return 0;
+}
+
+static void get_upload_pack_config(struct upload_pack_data *data)
+{
+	git_config(upload_pack_config, data);
+	git_protected_config(upload_pack_protected_config, data);
+}
+
 void upload_pack(const int advertise_refs, const int stateless_rpc,
 		 const int timeout)
 {
@@ -1340,8 +1349,7 @@ void upload_pack(const int advertise_refs, const int stateless_rpc,
 	struct upload_pack_data data;
 
 	upload_pack_data_init(&data);
-
-	git_config(upload_pack_config, &data);
+	get_upload_pack_config(&data);
 
 	data.stateless_rpc = stateless_rpc;
 	data.timeout = timeout;
@@ -1695,8 +1703,7 @@ int upload_pack_v2(struct repository *r, struct packet_reader *request)
 
 	upload_pack_data_init(&data);
 	data.use_sideband = LARGE_PACKET_MAX;
-
-	git_config(upload_pack_config, &data);
+	get_upload_pack_config(&data);
 
 	while (state != FETCH_DONE) {
 		switch (state) {
-- 
gitgitgadget



* [PATCH v8 4/5] safe.directory: use git_protected_config()
  2022-07-14 21:27             ` [PATCH v8 0/5] config: introduce safe.bareRepository " Glen Choo via GitGitGadget
                                 ` (2 preceding siblings ...)
  2022-07-14 21:27               ` [PATCH v8 3/5] config: learn `git_protected_config()` Glen Choo via GitGitGadget
@ 2022-07-14 21:28               ` Glen Choo via GitGitGadget
  2022-07-14 21:28               ` [PATCH v8 5/5] setup.c: create `safe.bareRepository` Glen Choo via GitGitGadget
  4 siblings, 0 replies; 113+ messages in thread
From: Glen Choo via GitGitGadget @ 2022-07-14 21:28 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, Derrick Stolee, Junio C Hamano, Emily Shaffer,
	Jonathan Tan, Ævar Arnfjörð Bjarmason,
	Johannes Schindelin, Glen Choo, Glen Choo

From: Glen Choo <chooglen@google.com>

Use git_protected_config() to read `safe.directory` instead of
read_very_early_config(), making it 'protected configuration only'.

As a result, `safe.directory` now respects "-c", so update the tests and
docs accordingly. It used to ignore "-c" due to how it was implemented,
not because of security or correctness concerns [1].

[1] https://lore.kernel.org/git/xmqqlevabcsu.fsf@gitster.g/

Signed-off-by: Glen Choo <chooglen@google.com>
---
 Documentation/config/safe.txt |  6 +++---
 setup.c                       |  2 +-
 t/t0033-safe-directory.sh     | 24 ++++++++++--------------
 3 files changed, 14 insertions(+), 18 deletions(-)

diff --git a/Documentation/config/safe.txt b/Documentation/config/safe.txt
index fa02f3ccc54..f72b4408798 100644
--- a/Documentation/config/safe.txt
+++ b/Documentation/config/safe.txt
@@ -12,9 +12,9 @@ via `git config --add`. To reset the list of safe directories (e.g. to
 override any such directories specified in the system config), add a
 `safe.directory` entry with an empty value.
 +
-This config setting is only respected when specified in a system or global
-config, not when it is specified in a repository config, via the command
-line option `-c safe.directory=<path>`, or in environment variables.
+This config setting is only respected in protected configuration (see
+<<SCOPES>>). This prevents the untrusted repository from tampering with this
+value.
 +
 The value of this setting is interpolated, i.e. `~/<path>` expands to a
 path relative to the home directory and `%(prefix)/<path>` expands to a
diff --git a/setup.c b/setup.c
index 09b6549ba9e..ec5b9139e32 100644
--- a/setup.c
+++ b/setup.c
@@ -1155,7 +1155,7 @@ static int ensure_valid_ownership(const char *gitfile,
 	 * constant regardless of what failed above. data.is_safe should be
 	 * initialized to false, and might be changed by the callback.
 	 */
-	read_very_early_config(safe_directory_cb, &data);
+	git_protected_config(safe_directory_cb, &data);
 
 	return data.is_safe;
 }
diff --git a/t/t0033-safe-directory.sh b/t/t0033-safe-directory.sh
index 3908597d42d..f4d737dadd0 100755
--- a/t/t0033-safe-directory.sh
+++ b/t/t0033-safe-directory.sh
@@ -16,24 +16,20 @@ test_expect_success 'safe.directory is not set' '
 	expect_rejected_dir
 '
 
-test_expect_success 'ignoring safe.directory on the command line' '
-	test_must_fail git -c safe.directory="$(pwd)" status 2>err &&
-	grep "dubious ownership" err
+test_expect_success 'safe.directory on the command line' '
+	git -c safe.directory="$(pwd)" status
 '
 
-test_expect_success 'ignoring safe.directory in the environment' '
-	test_must_fail env GIT_CONFIG_COUNT=1 \
-		GIT_CONFIG_KEY_0="safe.directory" \
-		GIT_CONFIG_VALUE_0="$(pwd)" \
-		git status 2>err &&
-	grep "dubious ownership" err
+test_expect_success 'safe.directory in the environment' '
+	env GIT_CONFIG_COUNT=1 \
+	    GIT_CONFIG_KEY_0="safe.directory" \
+	    GIT_CONFIG_VALUE_0="$(pwd)" \
+	    git status
 '
 
-test_expect_success 'ignoring safe.directory in GIT_CONFIG_PARAMETERS' '
-	test_must_fail env \
-		GIT_CONFIG_PARAMETERS="${SQ}safe.directory${SQ}=${SQ}$(pwd)${SQ}" \
-		git status 2>err &&
-	grep "dubious ownership" err
+test_expect_success 'safe.directory in GIT_CONFIG_PARAMETERS' '
+	env GIT_CONFIG_PARAMETERS="${SQ}safe.directory${SQ}=${SQ}$(pwd)${SQ}" \
+	    git status
 '
 
 test_expect_success 'ignoring safe.directory in repo config' '
-- 
gitgitgadget



* [PATCH v8 5/5] setup.c: create `safe.bareRepository`
  2022-07-14 21:27             ` [PATCH v8 0/5] config: introduce safe.bareRepository " Glen Choo via GitGitGadget
                                 ` (3 preceding siblings ...)
  2022-07-14 21:28               ` [PATCH v8 4/5] safe.directory: use git_protected_config() Glen Choo via GitGitGadget
@ 2022-07-14 21:28               ` Glen Choo via GitGitGadget
  4 siblings, 0 replies; 113+ messages in thread
From: Glen Choo via GitGitGadget @ 2022-07-14 21:28 UTC (permalink / raw)
  To: git
  Cc: Taylor Blau, Derrick Stolee, Junio C Hamano, Emily Shaffer,
	Jonathan Tan, Ævar Arnfjörð Bjarmason,
	Johannes Schindelin, Glen Choo, Glen Choo

From: Glen Choo <chooglen@google.com>

There is a known social engineering attack that takes advantage of the
fact that a working tree can include an entire bare repository,
including a config file. A user could run a Git command inside the bare
repository thinking that the config file of the 'outer' repository would
be used, but in reality, the bare repository's config file (which is
attacker-controlled) is used, which may result in arbitrary code
execution. See [1] for a fuller description and deeper discussion.

A simple mitigation is to forbid bare repositories unless specified via
`--git-dir` or `GIT_DIR`. In environments that don't use bare
repositories, this would be minimally disruptive.

Create a config variable, `safe.bareRepository`, that tells Git whether
or not to die() when working with a bare repository. This config is an
enum of:

- "all": allow all bare repositories (this is the default)
- "explicit": only allow bare repositories specified via --git-dir
  or GIT_DIR.

If we want to protect users from such attacks by default, neither value
will suffice - "all" provides no protection, but "explicit" is
impractical for bare repository users. A more usable default would be to
allow only non-embedded bare repositories ([2] contains one such
proposal), but detecting if a repository is embedded is potentially
non-trivial, so this work is not implemented in this series.
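
For readers skimming the thread, the intended behavior of the two
values can be summarized with a small sketch. The names are
hypothetical and the code is independent of setup.c; the NULL check on
the value is an addition of this sketch, not something taken from the
patch below.

```c
#include <string.h>

/* Hypothetical policy enum mirroring the two documented values. */
enum example_bare_policy {
	EXAMPLE_BARE_EXPLICIT,
	EXAMPLE_BARE_ALL,
};

/*
 * Parse a config value into a policy. Returns 0 on success, or -1 if
 * the value is missing or not one of the two documented strings.
 */
int example_parse_bare_policy(const char *value,
			      enum example_bare_policy *out)
{
	if (!value)
		return -1;
	if (!strcmp(value, "explicit")) {
		*out = EXAMPLE_BARE_EXPLICIT;
		return 0;
	}
	if (!strcmp(value, "all")) {
		*out = EXAMPLE_BARE_ALL;
		return 0;
	}
	return -1;
}

/*
 * A bare repository found by discovery is usable only under "all"; one
 * named explicitly (--git-dir or GIT_DIR) is usable under either value.
 */
int example_bare_repo_allowed(enum example_bare_policy policy,
			      int gitdir_was_explicit)
{
	return gitdir_was_explicit || policy == EXAMPLE_BARE_ALL;
}
```

With the default of "all", discovery behaves as before; with
"explicit", only the explicitly named case returns nonzero.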

[1]: https://lore.kernel.org/git/kl6lsfqpygsj.fsf@chooglen-macbookpro.roam.corp.google.com
[2]: https://lore.kernel.org/git/5b969c5e-e802-c447-ad25-6acc0b784582@github.com

Signed-off-by: Glen Choo <chooglen@google.com>
---
 Documentation/config/safe.txt   | 19 +++++++++++
 setup.c                         | 57 ++++++++++++++++++++++++++++++++-
 t/t0035-safe-bare-repository.sh | 54 +++++++++++++++++++++++++++++++
 3 files changed, 129 insertions(+), 1 deletion(-)
 create mode 100755 t/t0035-safe-bare-repository.sh

diff --git a/Documentation/config/safe.txt b/Documentation/config/safe.txt
index f72b4408798..bde7f31459b 100644
--- a/Documentation/config/safe.txt
+++ b/Documentation/config/safe.txt
@@ -1,3 +1,22 @@
+safe.bareRepository::
+	Specifies which bare repositories Git will work with. The currently
+	supported values are:
++
+* `all`: Git works with all bare repositories. This is the default.
+* `explicit`: Git only works with bare repositories specified via
+  the top-level `--git-dir` command-line option, or the `GIT_DIR`
+  environment variable (see linkgit:git[1]).
++
+If you do not use bare repositories in your workflow, then it may be
+beneficial to set `safe.bareRepository` to `explicit` in your global
+config. This will protect you from attacks that involve cloning a
+repository that contains a bare repository and running a Git command
+within that directory.
++
+This config setting is only respected in protected configuration (see
+<<SCOPES>>). This prevents the untrusted repository from tampering with
+this value.
+
 safe.directory::
 	These config entries specify Git-tracked directories that are
 	considered safe even if they are owned by someone other than the
diff --git a/setup.c b/setup.c
index ec5b9139e32..8c683e92b62 100644
--- a/setup.c
+++ b/setup.c
@@ -10,6 +10,10 @@
 static int inside_git_dir = -1;
 static int inside_work_tree = -1;
 static int work_tree_config_is_bogus;
+enum allowed_bare_repo {
+	ALLOWED_BARE_REPO_EXPLICIT = 0,
+	ALLOWED_BARE_REPO_ALL,
+};
 
 static struct startup_info the_startup_info;
 struct startup_info *startup_info = &the_startup_info;
@@ -1160,6 +1164,46 @@ static int ensure_valid_ownership(const char *gitfile,
 	return data.is_safe;
 }
 
+static int allowed_bare_repo_cb(const char *key, const char *value, void *d)
+{
+	enum allowed_bare_repo *allowed_bare_repo = d;
+
+	if (strcasecmp(key, "safe.bareRepository"))
+		return 0;
+
+	if (!strcmp(value, "explicit")) {
+		*allowed_bare_repo = ALLOWED_BARE_REPO_EXPLICIT;
+		return 0;
+	}
+	if (!strcmp(value, "all")) {
+		*allowed_bare_repo = ALLOWED_BARE_REPO_ALL;
+		return 0;
+	}
+	return -1;
+}
+
+static enum allowed_bare_repo get_allowed_bare_repo(void)
+{
+	enum allowed_bare_repo result = ALLOWED_BARE_REPO_ALL;
+	git_protected_config(allowed_bare_repo_cb, &result);
+	return result;
+}
+
+static const char *allowed_bare_repo_to_string(
+	enum allowed_bare_repo allowed_bare_repo)
+{
+	switch (allowed_bare_repo) {
+	case ALLOWED_BARE_REPO_EXPLICIT:
+		return "explicit";
+	case ALLOWED_BARE_REPO_ALL:
+		return "all";
+	default:
+		BUG("invalid allowed_bare_repo %d",
+		    allowed_bare_repo);
+	}
+	return NULL;
+}
+
 enum discovery_result {
 	GIT_DIR_NONE = 0,
 	GIT_DIR_EXPLICIT,
@@ -1169,7 +1213,8 @@ enum discovery_result {
 	GIT_DIR_HIT_CEILING = -1,
 	GIT_DIR_HIT_MOUNT_POINT = -2,
 	GIT_DIR_INVALID_GITFILE = -3,
-	GIT_DIR_INVALID_OWNERSHIP = -4
+	GIT_DIR_INVALID_OWNERSHIP = -4,
+	GIT_DIR_DISALLOWED_BARE = -5,
 };
 
 /*
@@ -1297,6 +1342,8 @@ static enum discovery_result setup_git_directory_gently_1(struct strbuf *dir,
 		}
 
 		if (is_git_directory(dir->buf)) {
+			if (get_allowed_bare_repo() == ALLOWED_BARE_REPO_EXPLICIT)
+				return GIT_DIR_DISALLOWED_BARE;
 			if (!ensure_valid_ownership(NULL, NULL, dir->buf))
 				return GIT_DIR_INVALID_OWNERSHIP;
 			strbuf_addstr(gitdir, ".");
@@ -1443,6 +1490,14 @@ const char *setup_git_directory_gently(int *nongit_ok)
 		}
 		*nongit_ok = 1;
 		break;
+	case GIT_DIR_DISALLOWED_BARE:
+		if (!nongit_ok) {
+			die(_("cannot use bare repository '%s' (safe.bareRepository is '%s')"),
+			    dir.buf,
+			    allowed_bare_repo_to_string(get_allowed_bare_repo()));
+		}
+		*nongit_ok = 1;
+		break;
 	case GIT_DIR_NONE:
 		/*
 		 * As a safeguard against setup_git_directory_gently_1 returning
diff --git a/t/t0035-safe-bare-repository.sh b/t/t0035-safe-bare-repository.sh
new file mode 100755
index 00000000000..ecbdc8238db
--- /dev/null
+++ b/t/t0035-safe-bare-repository.sh
@@ -0,0 +1,54 @@
+#!/bin/sh
+
+test_description='verify safe.bareRepository checks'
+
+TEST_PASSES_SANITIZE_LEAK=true
+. ./test-lib.sh
+
+pwd="$(pwd)"
+
+expect_accepted () {
+	git "$@" rev-parse --git-dir
+}
+
+expect_rejected () {
+	test_must_fail git "$@" rev-parse --git-dir 2>err &&
+	grep -F "cannot use bare repository" err
+}
+
+test_expect_success 'setup bare repo in worktree' '
+	git init outer-repo &&
+	git init --bare outer-repo/bare-repo
+'
+
+test_expect_success 'safe.bareRepository unset' '
+	expect_accepted -C outer-repo/bare-repo
+'
+
+test_expect_success 'safe.bareRepository=all' '
+	test_config_global safe.bareRepository all &&
+	expect_accepted -C outer-repo/bare-repo
+'
+
+test_expect_success 'safe.bareRepository=explicit' '
+	test_config_global safe.bareRepository explicit &&
+	expect_rejected -C outer-repo/bare-repo
+'
+
+test_expect_success 'safe.bareRepository in the repository' '
+	# safe.bareRepository must not be "explicit", otherwise
+	# git config fails with "fatal: not in a git directory" (like
+	# safe.directory)
+	test_config -C outer-repo/bare-repo safe.bareRepository \
+		all &&
+	test_config_global safe.bareRepository explicit &&
+	expect_rejected -C outer-repo/bare-repo
+'
+
+test_expect_success 'safe.bareRepository on the command line' '
+	test_config_global safe.bareRepository explicit &&
+	expect_accepted -C outer-repo/bare-repo \
+		-c safe.bareRepository=all
+'
+
+test_done
-- 
gitgitgadget


* SANITIZE=address failure on master (was: [PATCH v8 3/5] config: learn `git_protected_config()`)
  2022-07-14 21:27               ` [PATCH v8 3/5] config: learn `git_protected_config()` Glen Choo via GitGitGadget
@ 2022-07-25 18:26                 ` Ævar Arnfjörð Bjarmason
  2022-07-25 20:15                   ` Glen Choo
  0 siblings, 1 reply; 113+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-07-25 18:26 UTC (permalink / raw)
  To: Glen Choo via GitGitGadget
  Cc: git, Taylor Blau, Derrick Stolee, Junio C Hamano, Emily Shaffer,
	Jonathan Tan, Johannes Schindelin, Glen Choo


On Thu, Jul 14 2022, Glen Choo via GitGitGadget wrote:

> From: Glen Choo <chooglen@google.com>
>
> `uploadpack.packObjectsHook` is the only 'protected configuration only'
> variable today, but we've noted that `safe.directory` and the upcoming
> `safe.bareRepository` should also be 'protected configuration only'. So,
> for consistency, we'd like to have a single implementation for protected
> configuration.
>
> The primary constraints are:
>
> 1. Reading from protected configuration should be fast. Nearly all "git"
>    commands inside a bare repository will read both `safe.directory` and
>    `safe.bareRepository`, so we cannot afford to be slow.
>
> 2. Protected configuration must be readable when the gitdir is not
>    known. `safe.directory` and `safe.bareRepository` both affect
>    repository discovery and the gitdir is not known at that point [1].
>
> The chosen implementation in this commit is to read protected
> configuration and cache the values in a global configset. This is
> similar to the caching behavior we get with the_repository->config.
>
> Introduce git_protected_config(), which reads protected configuration
> and caches them in the global configset protected_config. Then, refactor
> `uploadpack.packObjectsHook` to use git_protected_config().
>
> The protected configuration functions are named similarly to their
> non-protected counterparts, e.g. git_protected_config_check_init() vs
> git_config_check_init().
>
> In light of constraint 1, this implementation can still be improved.
> git_protected_config() iterates through every variable in
> protected_config, which is wasteful, but it makes the conversion simple
> because it matches existing patterns. We will likely implement constant
> time lookup functions for protected configuration in a future series
> (such functions already exist for non-protected configuration, i.e.
> repo_config_get_*()).
>
> An alternative that avoids introducing another configset is to continue
> to read all config using git_config(), but only accept values that have
> the correct config scope [2]. This technically fulfills constraint 2,
> because git_config() simply ignores the local and worktree config when
> the gitdir is not known. However, this would read incomplete config into
> the_repository->config, which would need to be reset when the gitdir is
> known and git_config() needs to read the local and worktree config.
> Resetting the_repository->config might be reasonable while we only have
> these 'protected configuration only' variables, but it's not clear
> whether this extends well to future variables.
>
> [1] In this case, we do have a candidate gitdir though, so with a little
> refactoring, it might be possible to provide a gitdir.
> [2] This is how `uploadpack.packObjectsHook` was implemented prior to
> this commit.
>
> Signed-off-by: Glen Choo <chooglen@google.com>
> ---
>  config.c                     | 43 ++++++++++++++++++++++++++++++++++++
>  config.h                     | 16 ++++++++++++++
>  t/t5544-pack-objects-hook.sh |  7 +++++-
>  upload-pack.c                | 27 +++++++++++++---------
>  4 files changed, 82 insertions(+), 11 deletions(-)
>
> diff --git a/config.c b/config.c
> index 9b0e9c93285..015bec360f5 100644
> --- a/config.c
> +++ b/config.c
> @@ -81,6 +81,17 @@ static enum config_scope current_parsing_scope;
>  static int pack_compression_seen;
>  static int zlib_compression_seen;
>  
> +/*
> + * Config that comes from trusted scopes, namely:
> + * - CONFIG_SCOPE_SYSTEM (e.g. /etc/gitconfig)
> + * - CONFIG_SCOPE_GLOBAL (e.g. $HOME/.gitconfig, $XDG_CONFIG_HOME/git)
> + * - CONFIG_SCOPE_COMMAND (e.g. "-c" option, environment variables)
> + *
> + * This is declared here for code cleanliness, but unlike the other
> + * static variables, this does not hold config parser state.
> + */
> +static struct config_set protected_config;
> +
>  static int config_file_fgetc(struct config_source *conf)
>  {
>  	return getc_unlocked(conf->u.file);
> @@ -2378,6 +2389,11 @@ int git_configset_add_file(struct config_set *cs, const char *filename)
>  	return git_config_from_file(config_set_callback, filename, cs);
>  }
>  
> +int git_configset_add_parameters(struct config_set *cs)
> +{
> +	return git_config_from_parameters(config_set_callback, cs);
> +}
> +
>  int git_configset_get_value(struct config_set *cs, const char *key, const char **value)
>  {
>  	const struct string_list *values = NULL;
> @@ -2619,6 +2635,33 @@ int repo_config_get_pathname(struct repository *repo,
>  	return ret;
>  }
>  
> +/* Read values into protected_config. */
> +static void read_protected_config(void)
> +{
> +	char *xdg_config = NULL, *user_config = NULL, *system_config = NULL;
> +
> +	git_configset_init(&protected_config);
> +
> +	system_config = git_system_config();
> +	git_global_config(&user_config, &xdg_config);
> +
> +	git_configset_add_file(&protected_config, system_config);
> +	git_configset_add_file(&protected_config, xdg_config);
> +	git_configset_add_file(&protected_config, user_config);
> +	git_configset_add_parameters(&protected_config);
> +
> +	free(system_config);
> +	free(xdg_config);
> +	free(user_config);
> +}
> +
> +void git_protected_config(config_fn_t fn, void *data)
> +{
> +	if (!protected_config.hash_initialized)
> +		read_protected_config();
> +	configset_iter(&protected_config, fn, data);
> +}
> +
>  /* Functions used historically to read configuration from 'the_repository' */
>  void git_config(config_fn_t fn, void *data)
>  {
> diff --git a/config.h b/config.h
> index 7654f61c634..ca994d77147 100644
> --- a/config.h
> +++ b/config.h
> @@ -446,6 +446,15 @@ void git_configset_init(struct config_set *cs);
>   */
>  int git_configset_add_file(struct config_set *cs, const char *filename);
>  
> +/**
> + * Parses command line options and environment variables, and adds the
> + * variable-value pairs to the `config_set`. Returns 0 on success, or -1
> + * if there is an error in parsing. The caller decides whether to free
> + * the incomplete configset or continue using it when the function
> + * returns -1.
> + */
> +int git_configset_add_parameters(struct config_set *cs);
> +
>  /**
>   * Finds and returns the value list, sorted in order of increasing priority
>   * for the configuration variable `key` and config set `cs`. When the
> @@ -505,6 +514,13 @@ int repo_config_get_maybe_bool(struct repository *repo,
>  int repo_config_get_pathname(struct repository *repo,
>  			     const char *key, const char **dest);
>  
> +/*
> + * Functions for reading protected config. By definition, protected
> + * config ignores repository config, so these do not take a `struct
> + * repository` parameter.
> + */
> +void git_protected_config(config_fn_t fn, void *data);
> +
>  /**
>   * Querying For Specific Variables
>   * -------------------------------
> diff --git a/t/t5544-pack-objects-hook.sh b/t/t5544-pack-objects-hook.sh
> index dd5f44d986f..54f54f8d2eb 100755
> --- a/t/t5544-pack-objects-hook.sh
> +++ b/t/t5544-pack-objects-hook.sh
> @@ -56,7 +56,12 @@ test_expect_success 'hook does not run from repo config' '
>  	! grep "hook running" stderr &&
>  	test_path_is_missing .git/hook.args &&
>  	test_path_is_missing .git/hook.stdin &&
> -	test_path_is_missing .git/hook.stdout
> +	test_path_is_missing .git/hook.stdout &&
> +
> +	# check that global config is used instead
> +	test_config_global uploadpack.packObjectsHook ./hook &&
> +	git clone --no-local . dst2.git 2>stderr &&
> +	grep "hook running" stderr
>  '
>  
>  test_expect_success 'hook works with partial clone' '
> diff --git a/upload-pack.c b/upload-pack.c
> index 3a851b36066..09f48317b02 100644
> --- a/upload-pack.c
> +++ b/upload-pack.c
> @@ -1321,18 +1321,27 @@ static int upload_pack_config(const char *var, const char *value, void *cb_data)
>  		data->advertise_sid = git_config_bool(var, value);
>  	}
>  
> -	if (current_config_scope() != CONFIG_SCOPE_LOCAL &&
> -	    current_config_scope() != CONFIG_SCOPE_WORKTREE) {
> -		if (!strcmp("uploadpack.packobjectshook", var))
> -			return git_config_string(&data->pack_objects_hook, var, value);
> -	}
> -
>  	if (parse_object_filter_config(var, value, data) < 0)
>  		return -1;
>  
>  	return parse_hide_refs_config(var, value, "uploadpack");
>  }
>  
> +static int upload_pack_protected_config(const char *var, const char *value, void *cb_data)
> +{
> +	struct upload_pack_data *data = cb_data;
> +
> +	if (!strcmp("uploadpack.packobjectshook", var))
> +		return git_config_string(&data->pack_objects_hook, var, value);
> +	return 0;
> +}
> +
> +static void get_upload_pack_config(struct upload_pack_data *data)
> +{
> +	git_config(upload_pack_config, data);
> +	git_protected_config(upload_pack_protected_config, data);
> +}
> +
>  void upload_pack(const int advertise_refs, const int stateless_rpc,
>  		 const int timeout)
>  {
> @@ -1340,8 +1349,7 @@ void upload_pack(const int advertise_refs, const int stateless_rpc,
>  	struct upload_pack_data data;
>  
>  	upload_pack_data_init(&data);
> -
> -	git_config(upload_pack_config, &data);
> +	get_upload_pack_config(&data);
>  
>  	data.stateless_rpc = stateless_rpc;
>  	data.timeout = timeout;
> @@ -1695,8 +1703,7 @@ int upload_pack_v2(struct repository *r, struct packet_reader *request)
>  
>  	upload_pack_data_init(&data);
>  	data.use_sideband = LARGE_PACKET_MAX;
> -
> -	git_config(upload_pack_config, &data);
> +	get_upload_pack_config(&data);
>  
>  	while (state != FETCH_DONE) {
>  		switch (state) {

Noticed after it landed on master: This change fails with:

	make SANITIZE=address test T=t0410*.sh

Running that manually shows that we fail like this:
	
	$ cat trash\ directory.t0410-partial-clone/httpd/error.log | grep -o AH0.*
	AH00163: Apache/2.4.54 (Debian) configured -- resuming normal operations
	AH00094: Command line: '/usr/sbin/apache2 -d /home/avar/g/git/t/trash directory.t0410-partial-clone/httpd -f /home/avar/g/git/t/lib-httpd/apache.conf -c Listen 127.0.0.1:10410'
	AH01215: AddressSanitizer:DEADLYSIGNAL: /home/avar/g/git/git-http-backend
	AH01215: =================================================================: /home/avar/g/git/git-http-backend
	AH01215: ==27820==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x7f7af5dc0d66 bp 0x7fff11964450 sp 0x7fff11963be8 T0): /home/avar/g/git/git-http-backend
	AH01215: ==27820==The signal is caused by a READ memory access.: /home/avar/g/git/git-http-backend
	AH01215: ==27820==Hint: address points to the zero page.: /home/avar/g/git/git-http-backend
	AH01215:     #0 0x7f7af5dc0d66 in __sanitizer::internal_strlen(char const*) ../../../../src/libsanitizer/sanitizer_common/sanitizer_libc.cpp:167: /home/avar/g/git/git-http-backend
	AH01215:     #1 0x7f7af5d512f2 in __interceptor_fopen64 ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:6220: /home/avar/g/git/git-http-backend
	AH01215:     #2 0x562a65e37cc8 in git_fopen compat/fopen.c:22: /home/avar/g/git/git-http-backend
	AH01215:     #3 0x562a65df3879 in fopen_or_warn wrapper.c:431: /home/avar/g/git/git-http-backend
	AH01215:     #4 0x562a65a12476 in git_config_from_file_with_options config.c:1982: /home/avar/g/git/git-http-backend
	AH01215:     #5 0x562a65a124f4 in git_config_from_file config.c:1993: /home/avar/g/git/git-http-backend
	AH01215:     #6 0x562a65a15288 in git_configset_add_file config.c:2389: /home/avar/g/git/git-http-backend
	AH01215:     #7 0x562a65a16a37 in read_protected_config config.c:2649: /home/avar/g/git/git-http-backend
	AH01215:     #8 0x562a65a16b5c in git_protected_config config.c:2661: /home/avar/g/git/git-http-backend
	AH01215:     #9 0x562a65dd9f9a in get_upload_pack_config upload-pack.c:1342: /home/avar/g/git/git-http-backend
	AH01215:     #10 0x562a65ddc1cb in upload_pack_v2 upload-pack.c:1706: /home/avar/g/git/git-http-backend
	AH01215:     #11 0x562a65d2eb8a in process_request serve.c:308: /home/avar/g/git/git-http-backend
	AH01215:     #12 0x562a65d2ec18 in protocol_v2_serve_loop serve.c:323: /home/avar/g/git/git-http-backend
	AH01215:     #13 0x562a6593c5ae in cmd_upload_pack builtin/upload-pack.c:55: /home/avar/g/git/git-http-backend
	AH01215:     #14 0x562a656cf8ff in run_builtin git.c:466: /home/avar/g/git/git-http-backend
	AH01215:     #15 0x562a656d02ab in handle_builtin git.c:720: /home/avar/g/git/git-http-backend
	AH01215:     #16 0x562a656d09d5 in run_argv git.c:787: /home/avar/g/git/git-http-backend
	AH01215:     #17 0x562a656d174f in cmd_main git.c:920: /home/avar/g/git/git-http-backend
	AH01215:     #18 0x562a6594b0b9 in main common-main.c:56: /home/avar/g/git/git-http-backend
	AH01215:     #19 0x7f7af5a5681c in __libc_start_main ../csu/libc-start.c:332: /home/avar/g/git/git-http-backend
	AH01215:     #20 0x562a656cb209 in _start (git+0x1d1209): /home/avar/g/git/git-http-backend
	AH01215: : /home/avar/g/git/git-http-backend
	AH01215: AddressSanitizer can not provide additional info.: /home/avar/g/git/git-http-backend
	AH01215: SUMMARY: AddressSanitizer: SEGV ../../../../src/libsanitizer/sanitizer_common/sanitizer_libc.cpp:167 in __sanitizer::internal_strlen(char const*): /home/avar/g/git/git-http-backend
	AH01215: ==27820==ABORTING: /home/avar/g/git/git-http-backend
	AH01215: error: upload-pack died of signal 6: /home/avar/g/git/git-http-backend

(We really should have a SANITIZE=address in CI, but it takes a while...)


* Re: SANITIZE=address failure on master (was: [PATCH v8 3/5] config: learn `git_protected_config()`)
  2022-07-25 18:26                 ` SANITIZE=address failure on master (was: [PATCH v8 3/5] config: learn `git_protected_config()`) Ævar Arnfjörð Bjarmason
@ 2022-07-25 20:15                   ` Glen Choo
  2022-07-25 20:41                     ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 113+ messages in thread
From: Glen Choo @ 2022-07-25 20:15 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason, Glen Choo via GitGitGadget
  Cc: git, Taylor Blau, Derrick Stolee, Junio C Hamano, Emily Shaffer,
	Jonathan Tan, Johannes Schindelin

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> On Thu, Jul 14 2022, Glen Choo via GitGitGadget wrote:
>
>> +/* Read values into protected_config. */
>> +static void read_protected_config(void)
>> +{
>> +	char *xdg_config = NULL, *user_config = NULL, *system_config = NULL;
>> +
>> +	git_configset_init(&protected_config);
>> +
>> +	system_config = git_system_config();
>> +	git_global_config(&user_config, &xdg_config);
>> +
>> +	git_configset_add_file(&protected_config, system_config);
>> +	git_configset_add_file(&protected_config, xdg_config);
>> +	git_configset_add_file(&protected_config, user_config);
>> +	git_configset_add_parameters(&protected_config);
>> +
>> +	free(system_config);
>> +	free(xdg_config);
>> +	free(user_config);
>> +}
>
> Noticed after it landed on master: This change fails with:
>
> 	make SANITIZE=address test T=t0410*.sh
>
> Running that manually shows that we fail like this:
> 	
> 	$ cat trash\ directory.t0410-partial-clone/httpd/error.log | grep -o AH0.*
> 	AH00163: Apache/2.4.54 (Debian) configured -- resuming normal operations
> 	AH00094: Command line: '/usr/sbin/apache2 -d /home/avar/g/git/t/trash directory.t0410-partial-clone/httpd -f /home/avar/g/git/t/lib-httpd/apache.conf -c Listen 127.0.0.1:10410'
> 	AH01215: AddressSanitizer:DEADLYSIGNAL: /home/avar/g/git/git-http-backend
> 	AH01215: =================================================================: /home/avar/g/git/git-http-backend
> 	AH01215: ==27820==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x7f7af5dc0d66 bp 0x7fff11964450 sp 0x7fff11963be8 T0): /home/avar/g/git/git-http-backend
> 	AH01215: ==27820==The signal is caused by a READ memory access.: /home/avar/g/git/git-http-backend
> 	AH01215: ==27820==Hint: address points to the zero page.: /home/avar/g/git/git-http-backend
> 	AH01215:     #0 0x7f7af5dc0d66 in __sanitizer::internal_strlen(char const*) ../../../../src/libsanitizer/sanitizer_common/sanitizer_libc.cpp:167: /home/avar/g/git/git-http-backend
> 	AH01215:     #1 0x7f7af5d512f2 in __interceptor_fopen64 ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:6220: /home/avar/g/git/git-http-backend
> 	AH01215:     #2 0x562a65e37cc8 in git_fopen compat/fopen.c:22: /home/avar/g/git/git-http-backend
> 	AH01215:     #3 0x562a65df3879 in fopen_or_warn wrapper.c:431: /home/avar/g/git/git-http-backend
> 	AH01215:     #4 0x562a65a12476 in git_config_from_file_with_options config.c:1982: /home/avar/g/git/git-http-backend
> 	AH01215:     #5 0x562a65a124f4 in git_config_from_file config.c:1993: /home/avar/g/git/git-http-backend
> 	AH01215:     #6 0x562a65a15288 in git_configset_add_file config.c:2389: /home/avar/g/git/git-http-backend
> 	AH01215:     #7 0x562a65a16a37 in read_protected_config config.c:2649: /home/avar/g/git/git-http-backend
> 	AH01215:     #8 0x562a65a16b5c in git_protected_config config.c:2661: /home/avar/g/git/git-http-backend
> 	AH01215:     #9 0x562a65dd9f9a in get_upload_pack_config upload-pack.c:1342: /home/avar/g/git/git-http-backend
> 	AH01215:     #10 0x562a65ddc1cb in upload_pack_v2 upload-pack.c:1706: /home/avar/g/git/git-http-backend
> 	AH01215:     #11 0x562a65d2eb8a in process_request serve.c:308: /home/avar/g/git/git-http-backend
> 	AH01215:     #12 0x562a65d2ec18 in protocol_v2_serve_loop serve.c:323: /home/avar/g/git/git-http-backend
> 	AH01215:     #13 0x562a6593c5ae in cmd_upload_pack builtin/upload-pack.c:55: /home/avar/g/git/git-http-backend
> 	AH01215:     #14 0x562a656cf8ff in run_builtin git.c:466: /home/avar/g/git/git-http-backend
> 	AH01215:     #15 0x562a656d02ab in handle_builtin git.c:720: /home/avar/g/git/git-http-backend
> 	AH01215:     #16 0x562a656d09d5 in run_argv git.c:787: /home/avar/g/git/git-http-backend
> 	AH01215:     #17 0x562a656d174f in cmd_main git.c:920: /home/avar/g/git/git-http-backend
> 	AH01215:     #18 0x562a6594b0b9 in main common-main.c:56: /home/avar/g/git/git-http-backend
> 	AH01215:     #19 0x7f7af5a5681c in __libc_start_main ../csu/libc-start.c:332: /home/avar/g/git/git-http-backend
> 	AH01215:     #20 0x562a656cb209 in _start (git+0x1d1209): /home/avar/g/git/git-http-backend
> 	AH01215: : /home/avar/g/git/git-http-backend
> 	AH01215: AddressSanitizer can not provide additional info.: /home/avar/g/git/git-http-backend
> 	AH01215: SUMMARY: AddressSanitizer: SEGV ../../../../src/libsanitizer/sanitizer_common/sanitizer_libc.cpp:167 in __sanitizer::internal_strlen(char const*): /home/avar/g/git/git-http-backend
> 	AH01215: ==27820==ABORTING: /home/avar/g/git/git-http-backend
> 	AH01215: error: upload-pack died of signal 6: /home/avar/g/git/git-http-backend
>
> (We really should have a SANITIZE=address in CI, but it takes a while...)

Thanks. I narrowed the failure down to the hunk above, specifically this
line:

  git_configset_add_file(&protected_config, xdg_config);

Since xdg_config can be NULL, this results in the failing call
fopen_or_warn(NULL, "r").

This logic was lifted from do_git_config_sequence(), which checks that
each of the paths is not NULL. So a fix might be something like:

----- >8 --------- >8 --------- >8 --------- >8 --------- >8 ----

  diff --git a/config.c b/config.c
  index 015bec360f..208a3dd7a7 100644
  --- a/config.c
  +++ b/config.c
  @@ -2645,9 +2645,13 @@ static void read_protected_config(void)
    system_config = git_system_config();
    git_global_config(&user_config, &xdg_config);

  -	git_configset_add_file(&protected_config, system_config);
  -	git_configset_add_file(&protected_config, xdg_config);
  -	git_configset_add_file(&protected_config, user_config);
  +
  +	if (system_config)
  +		git_configset_add_file(&protected_config, system_config);
  +	if (xdg_config)
  +		git_configset_add_file(&protected_config, xdg_config);
  +	if (user_config)
  +		git_configset_add_file(&protected_config, user_config);
    git_configset_add_parameters(&protected_config);

    free(system_config);

----- >8 --------- >8 --------- >8 --------- >8 --------- >8 ----

I'm not sure if system_config can ever be NULL, but both xdg_config and
user_config are NULL when $HOME is unset, and xdg_config is also NULL if
$GIT_CONFIG_GLOBAL is set.
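
To illustrate the guard outside of Git, here is a minimal sketch. It
assumes a toy "config set" that only counts what it was asked to read;
every toy_* name is hypothetical and none of this is Git's actual API.
The point is only that an unset (NULL) path is skipped before it can
reach fopen().

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a config set: it only keeps counters. */
struct toy_config_set {
	int files_attempted;
	int files_skipped;
};

/*
 * Hypothetical stand-in for a "read this file" helper. A missing file
 * is tolerated, but `path` itself must not be NULL, because fopen(NULL)
 * is undefined behavior.
 */
static int toy_add_file(struct toy_config_set *cs, const char *path)
{
	FILE *f;

	assert(path);
	f = fopen(path, "r");
	if (f)
		fclose(f);
	cs->files_attempted++;
	return 0;
}

/* Guard: skip unset (NULL) paths instead of handing them to fopen(). */
static int toy_add_file_if_set(struct toy_config_set *cs, const char *path)
{
	if (!path) {
		cs->files_skipped++;
		return 0;
	}
	return toy_add_file(cs, path);
}
```

Wrapping the check in one helper keeps the three call sites identical,
at the cost of hiding the NULL case from the caller; the inline "if"
form in the diff above makes the same choice visible at each call.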


* Re: SANITIZE=address failure on master (was: [PATCH v8 3/5] config: learn `git_protected_config()`)
  2022-07-25 20:15                   ` Glen Choo
@ 2022-07-25 20:41                     ` Ævar Arnfjörð Bjarmason
  2022-07-25 20:56                       ` Glen Choo
  0 siblings, 1 reply; 113+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-07-25 20:41 UTC (permalink / raw)
  To: Glen Choo
  Cc: Glen Choo via GitGitGadget, git, Taylor Blau, Derrick Stolee,
	Junio C Hamano, Emily Shaffer, Jonathan Tan, Johannes Schindelin


On Mon, Jul 25 2022, Glen Choo wrote:

> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>
>> On Thu, Jul 14 2022, Glen Choo via GitGitGadget wrote:
>>
>>> +/* Read values into protected_config. */
>>> +static void read_protected_config(void)
>>> +{
>>> +	char *xdg_config = NULL, *user_config = NULL, *system_config = NULL;
>>> +
>>> +	git_configset_init(&protected_config);
>>> +
>>> +	system_config = git_system_config();
>>> +	git_global_config(&user_config, &xdg_config);
>>> +
>>> +	git_configset_add_file(&protected_config, system_config);
>>> +	git_configset_add_file(&protected_config, xdg_config);
>>> +	git_configset_add_file(&protected_config, user_config);
>>> +	git_configset_add_parameters(&protected_config);
>>> +
>>> +	free(system_config);
>>> +	free(xdg_config);
>>> +	free(user_config);
>>> +}
>>
>> Noticed after it landed on master: This change fails with:
>>
>> 	make SANITIZE=address test T=t0410*.sh
>>
>> Running that manually shows that we fail like this:
>> 	
>> 	$ cat trash\ directory.t0410-partial-clone/httpd/error.log | grep -o AH0.*
>> 	AH00163: Apache/2.4.54 (Debian) configured -- resuming normal operations
>> 	AH00094: Command line: '/usr/sbin/apache2 -d /home/avar/g/git/t/trash directory.t0410-partial-clone/httpd -f /home/avar/g/git/t/lib-httpd/apache.conf -c Listen 127.0.0.1:10410'
>> 	AH01215: AddressSanitizer:DEADLYSIGNAL: /home/avar/g/git/git-http-backend
>> 	AH01215: =================================================================: /home/avar/g/git/git-http-backend
>> 	AH01215: ==27820==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x7f7af5dc0d66 bp 0x7fff11964450 sp 0x7fff11963be8 T0): /home/avar/g/git/git-http-backend
>> 	AH01215: ==27820==The signal is caused by a READ memory access.: /home/avar/g/git/git-http-backend
>> 	AH01215: ==27820==Hint: address points to the zero page.: /home/avar/g/git/git-http-backend
>> 	AH01215:     #0 0x7f7af5dc0d66 in __sanitizer::internal_strlen(char const*) ../../../../src/libsanitizer/sanitizer_common/sanitizer_libc.cpp:167: /home/avar/g/git/git-http-backend
>> 	AH01215:     #1 0x7f7af5d512f2 in __interceptor_fopen64 ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:6220: /home/avar/g/git/git-http-backend
>> 	AH01215:     #2 0x562a65e37cc8 in git_fopen compat/fopen.c:22: /home/avar/g/git/git-http-backend
>> 	AH01215:     #3 0x562a65df3879 in fopen_or_warn wrapper.c:431: /home/avar/g/git/git-http-backend
>> 	AH01215:     #4 0x562a65a12476 in git_config_from_file_with_options config.c:1982: /home/avar/g/git/git-http-backend
>> 	AH01215:     #5 0x562a65a124f4 in git_config_from_file config.c:1993: /home/avar/g/git/git-http-backend
>> 	AH01215:     #6 0x562a65a15288 in git_configset_add_file config.c:2389: /home/avar/g/git/git-http-backend
>> 	AH01215:     #7 0x562a65a16a37 in read_protected_config config.c:2649: /home/avar/g/git/git-http-backend
>> 	AH01215:     #8 0x562a65a16b5c in git_protected_config config.c:2661: /home/avar/g/git/git-http-backend
>> 	AH01215:     #9 0x562a65dd9f9a in get_upload_pack_config upload-pack.c:1342: /home/avar/g/git/git-http-backend
>> 	AH01215:     #10 0x562a65ddc1cb in upload_pack_v2 upload-pack.c:1706: /home/avar/g/git/git-http-backend
>> 	AH01215:     #11 0x562a65d2eb8a in process_request serve.c:308: /home/avar/g/git/git-http-backend
>> 	AH01215:     #12 0x562a65d2ec18 in protocol_v2_serve_loop serve.c:323: /home/avar/g/git/git-http-backend
>> 	AH01215:     #13 0x562a6593c5ae in cmd_upload_pack builtin/upload-pack.c:55: /home/avar/g/git/git-http-backend
>> 	AH01215:     #14 0x562a656cf8ff in run_builtin git.c:466: /home/avar/g/git/git-http-backend
>> 	AH01215:     #15 0x562a656d02ab in handle_builtin git.c:720: /home/avar/g/git/git-http-backend
>> 	AH01215:     #16 0x562a656d09d5 in run_argv git.c:787: /home/avar/g/git/git-http-backend
>> 	AH01215:     #17 0x562a656d174f in cmd_main git.c:920: /home/avar/g/git/git-http-backend
>> 	AH01215:     #18 0x562a6594b0b9 in main common-main.c:56: /home/avar/g/git/git-http-backend
>> 	AH01215:     #19 0x7f7af5a5681c in __libc_start_main ../csu/libc-start.c:332: /home/avar/g/git/git-http-backend
>> 	AH01215:     #20 0x562a656cb209 in _start (git+0x1d1209): /home/avar/g/git/git-http-backend
>> 	AH01215: : /home/avar/g/git/git-http-backend
>> 	AH01215: AddressSanitizer can not provide additional info.: /home/avar/g/git/git-http-backend
>> 	AH01215: SUMMARY: AddressSanitizer: SEGV ../../../../src/libsanitizer/sanitizer_common/sanitizer_libc.cpp:167 in __sanitizer::internal_strlen(char const*): /home/avar/g/git/git-http-backend
>> 	AH01215: ==27820==ABORTING: /home/avar/g/git/git-http-backend
>> 	AH01215: error: upload-pack died of signal 6: /home/avar/g/git/git-http-backend
>>
>> (We really should have a SANITIZE=address in CI, but it takes a while...)
>
> Thanks. I narrowed the failure down to the hunk above, specifically this
> line:
>
>   git_configset_add_file(&protected_config, xdg_config);
>
> Since xdg_config can be NULL, this results in the failing call
> fopen_or_warn(NULL, "r").
>
> This logic was lifted from do_git_config_sequence(), which checks that
> each of the paths is not NULL. So a fix might be something like:
>
> ----- >8 --------- >8 --------- >8 --------- >8 --------- >8 ----
>
>   diff --git a/config.c b/config.c
>   index 015bec360f..208a3dd7a7 100644
>   --- a/config.c
>   +++ b/config.c
>   @@ -2645,9 +2645,13 @@ static void read_protected_config(void)
>     system_config = git_system_config();
>     git_global_config(&user_config, &xdg_config);
>
>   -	git_configset_add_file(&protected_config, system_config);
>   -	git_configset_add_file(&protected_config, xdg_config);
>   -	git_configset_add_file(&protected_config, user_config);
>   +
>   +	if (system_config)
>   +		git_configset_add_file(&protected_config, system_config);
>   +	if (xdg_config)
>   +		git_configset_add_file(&protected_config, xdg_config);
>   +	if (user_config)
>   +		git_configset_add_file(&protected_config, user_config);
>     git_configset_add_parameters(&protected_config);
>
>     free(system_config);
>
> ----- >8 --------- >8 --------- >8 --------- >8 --------- >8 ----
>
> I'm not sure if system_config can ever be NULL, but (xdg|user)_config is
> NULL when $HOME is unset, and xdg_config is also unset if
> $GIT_CONFIG_GLOBAL is set.

Not having looked into it much at all: doesn't this then introduce
another logic error where git_protected_config() is now buggy? I.e. it's
a "lazy load" method where we expect read_protected_config() to have
run first.

The assumption with that seems to have been that the result is invariant
within a single process. Is that still the case, or can e.g. HOME be set
during our runtime when we rely on these functions?

(I don't know)
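
To make the concern concrete, here is a minimal sketch of a lazily
loaded cache. It is not Git's code: the toy_* names are hypothetical,
and the cache stores only a copy of $HOME rather than parsed config.
It shows the general property being asked about, namely that the
environment is sampled on the first call and never again.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical cache, loosely modeled on a static, lazily filled set. */
struct toy_cache {
	int initialized;
	char home[256];
};

static struct toy_cache toy_protected;
static int toy_load_count;

/* Snapshot the environment once; an unset $HOME is stored as "". */
static void toy_read_protected(void)
{
	const char *home = getenv("HOME");

	toy_protected.home[0] = '\0';
	if (home)
		strncpy(toy_protected.home, home,
			sizeof(toy_protected.home) - 1);
	toy_protected.home[sizeof(toy_protected.home) - 1] = '\0';
	toy_protected.initialized = 1;
	toy_load_count++;
}

/* Lazy accessor: only the first call reads the environment. */
static const char *toy_protected_home(void)
{
	if (!toy_protected.initialized)
		toy_read_protected();
	assert(toy_protected.initialized);
	return toy_protected.home;
}
```

With this shape, a change to $HOME after the first toy_protected_home()
call is not observed by later calls. Whether that matters for
git_protected_config() depends on whether anything changes the
environment at runtime, which is the open question above.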


* Re: SANITIZE=address failure on master (was: [PATCH v8 3/5] config: learn `git_protected_config()`)
  2022-07-25 20:41                     ` Ævar Arnfjörð Bjarmason
@ 2022-07-25 20:56                       ` Glen Choo
  0 siblings, 0 replies; 113+ messages in thread
From: Glen Choo @ 2022-07-25 20:56 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Glen Choo via GitGitGadget, git, Taylor Blau, Derrick Stolee,
	Junio C Hamano, Emily Shaffer, Jonathan Tan, Johannes Schindelin

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> On Mon, Jul 25 2022, Glen Choo wrote:
>
>> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>>
>>> On Thu, Jul 14 2022, Glen Choo via GitGitGadget wrote:
>>>
>>>> +/* Read values into protected_config. */
>>>> +static void read_protected_config(void)
>>>> +{
>>>> +	char *xdg_config = NULL, *user_config = NULL, *system_config = NULL;
>>>> +
>>>> +	git_configset_init(&protected_config);
>>>> +
>>>> +	system_config = git_system_config();
>>>> +	git_global_config(&user_config, &xdg_config);
>>>> +
>>>> +	git_configset_add_file(&protected_config, system_config);
>>>> +	git_configset_add_file(&protected_config, xdg_config);
>>>> +	git_configset_add_file(&protected_config, user_config);
>>>> +	git_configset_add_parameters(&protected_config);
>>>> +
>>>> +	free(system_config);
>>>> +	free(xdg_config);
>>>> +	free(user_config);
>>>> +}
>>>
>>> Noticed after it landed on master: This change fails with:
>>>
>>> 	make SANITIZE=address test T=t0410*.sh
>>>
>>> Running that manually shows that we fail like this:
>>> 	
>>> 	$ cat trash\ directory.t0410-partial-clone/httpd/error.log | grep -o AH0.*
>>> 	AH00163: Apache/2.4.54 (Debian) configured -- resuming normal operations
>>> 	AH00094: Command line: '/usr/sbin/apache2 -d /home/avar/g/git/t/trash directory.t0410-partial-clone/httpd -f /home/avar/g/git/t/lib-httpd/apache.conf -c Listen 127.0.0.1:10410'
>>> 	AH01215: AddressSanitizer:DEADLYSIGNAL: /home/avar/g/git/git-http-backend
>>> 	AH01215: =================================================================: /home/avar/g/git/git-http-backend
>>> 	AH01215: ==27820==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x7f7af5dc0d66 bp 0x7fff11964450 sp 0x7fff11963be8 T0): /home/avar/g/git/git-http-backend
>>> 	AH01215: ==27820==The signal is caused by a READ memory access.: /home/avar/g/git/git-http-backend
>>> 	AH01215: ==27820==Hint: address points to the zero page.: /home/avar/g/git/git-http-backend
>>> 	AH01215:     #0 0x7f7af5dc0d66 in __sanitizer::internal_strlen(char const*) ../../../../src/libsanitizer/sanitizer_common/sanitizer_libc.cpp:167: /home/avar/g/git/git-http-backend
>>> 	AH01215:     #1 0x7f7af5d512f2 in __interceptor_fopen64 ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:6220: /home/avar/g/git/git-http-backend
>>> 	AH01215:     #2 0x562a65e37cc8 in git_fopen compat/fopen.c:22: /home/avar/g/git/git-http-backend
>>> 	AH01215:     #3 0x562a65df3879 in fopen_or_warn wrapper.c:431: /home/avar/g/git/git-http-backend
>>> 	AH01215:     #4 0x562a65a12476 in git_config_from_file_with_options config.c:1982: /home/avar/g/git/git-http-backend
>>> 	AH01215:     #5 0x562a65a124f4 in git_config_from_file config.c:1993: /home/avar/g/git/git-http-backend
>>> 	AH01215:     #6 0x562a65a15288 in git_configset_add_file config.c:2389: /home/avar/g/git/git-http-backend
>>> 	AH01215:     #7 0x562a65a16a37 in read_protected_config config.c:2649: /home/avar/g/git/git-http-backend
>>> 	AH01215:     #8 0x562a65a16b5c in git_protected_config config.c:2661: /home/avar/g/git/git-http-backend
>>> 	AH01215:     #9 0x562a65dd9f9a in get_upload_pack_config upload-pack.c:1342: /home/avar/g/git/git-http-backend
>>> 	AH01215:     #10 0x562a65ddc1cb in upload_pack_v2 upload-pack.c:1706: /home/avar/g/git/git-http-backend
>>> 	AH01215:     #11 0x562a65d2eb8a in process_request serve.c:308: /home/avar/g/git/git-http-backend
>>> 	AH01215:     #12 0x562a65d2ec18 in protocol_v2_serve_loop serve.c:323: /home/avar/g/git/git-http-backend
>>> 	AH01215:     #13 0x562a6593c5ae in cmd_upload_pack builtin/upload-pack.c:55: /home/avar/g/git/git-http-backend
>>> 	AH01215:     #14 0x562a656cf8ff in run_builtin git.c:466: /home/avar/g/git/git-http-backend
>>> 	AH01215:     #15 0x562a656d02ab in handle_builtin git.c:720: /home/avar/g/git/git-http-backend
>>> 	AH01215:     #16 0x562a656d09d5 in run_argv git.c:787: /home/avar/g/git/git-http-backend
>>> 	AH01215:     #17 0x562a656d174f in cmd_main git.c:920: /home/avar/g/git/git-http-backend
>>> 	AH01215:     #18 0x562a6594b0b9 in main common-main.c:56: /home/avar/g/git/git-http-backend
>>> 	AH01215:     #19 0x7f7af5a5681c in __libc_start_main ../csu/libc-start.c:332: /home/avar/g/git/git-http-backend
>>> 	AH01215:     #20 0x562a656cb209 in _start (git+0x1d1209): /home/avar/g/git/git-http-backend
>>> 	AH01215: : /home/avar/g/git/git-http-backend
>>> 	AH01215: AddressSanitizer can not provide additional info.: /home/avar/g/git/git-http-backend
>>> 	AH01215: SUMMARY: AddressSanitizer: SEGV ../../../../src/libsanitizer/sanitizer_common/sanitizer_libc.cpp:167 in __sanitizer::internal_strlen(char const*): /home/avar/g/git/git-http-backend
>>> 	AH01215: ==27820==ABORTING: /home/avar/g/git/git-http-backend
>>> 	AH01215: error: upload-pack died of signal 6: /home/avar/g/git/git-http-backend
>>>
>>> (We really should have a SANITIZE=address in CI, but it takes a while...)
>>
>> Thanks. I narrowed the failure down to the hunk above, specifically this
>> line:
>>
>>   git_configset_add_file(&protected_config, xdg_config);
>>
>> Since xdg_config can be NULL, this results in the failing call
>> fopen_or_warn(NULL, "r").
>>
>> This logic was lifted from do_git_config_sequence(), which checks that
>> each of the paths is not NULL. So a fix might be something like:
>>
>> ----- >8 --------- >8 --------- >8 --------- >8 --------- >8 ----
>>
>>   diff --git a/config.c b/config.c
>>   index 015bec360f..208a3dd7a7 100644
>>   --- a/config.c
>>   +++ b/config.c
>>   @@ -2645,9 +2645,13 @@ static void read_protected_config(void)
>>     system_config = git_system_config();
>>     git_global_config(&user_config, &xdg_config);
>>
>>   -	git_configset_add_file(&protected_config, system_config);
>>   -	git_configset_add_file(&protected_config, xdg_config);
>>   -	git_configset_add_file(&protected_config, user_config);
>>   +
>>   +	if (system_config)
>>   +		git_configset_add_file(&protected_config, system_config);
>>   +	if (xdg_config)
>>   +		git_configset_add_file(&protected_config, xdg_config);
>>   +	if (user_config)
>>   +		git_configset_add_file(&protected_config, user_config);
>>     git_configset_add_parameters(&protected_config);
>>
>>     free(system_config);
>>
>> ----- >8 --------- >8 --------- >8 --------- >8 --------- >8 ----
>>
>> I'm not sure if system_config can ever be NULL, but (xdg|user)_config is
>> NULL when $HOME is unset, and xdg_config is also NULL if
>> $GIT_CONFIG_GLOBAL is set.
>
> Not having looked into it much at all: doesn't this then introduce
> another logic error where git_protected_config() is now buggy? I.e. it's
> a "lazy load" method where we'll expect read_protected_config() to run
> first.
>
> The assumption with that seems to have been that it's invariant within a
> single process. Is that still the case, or can e.g. HOME be set during
> our runtime when we rely on these functions?
>
> (I don't know)

I don't think this introduces an error, or at least, not one that we
don't already have. This mimics do_git_config_sequence() (which also
assumes this invariant), which is used under the hood by
(git|repo)_read_config().

In retrospect, it might have been a good idea to implement
read_protected_config() using do_git_config_sequence() /
config_with_options(); those functions are a bit bloated, but at least
we'd only have one implementation.
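
To illustrate the "one implementation" idea, here is a minimal
standalone sketch. It is not Git's code: struct path_list,
path_list_add() and add_non_null_paths() are hypothetical names made up
for this example, standing in for the config set and for
git_configset_add_file(). The point is only that the NULL check lives in
one shared helper rather than being repeated at every call site.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PATHS 8

/* Hypothetical stand-in for a config set: it only records paths. */
struct path_list {
	const char *paths[MAX_PATHS];
	size_t nr;
};

/*
 * Hypothetical stand-in for git_configset_add_file(). Like the real
 * call chain that ends in fopen(), it must never be handed NULL.
 */
void path_list_add(struct path_list *list, const char *path)
{
	assert(path != NULL);
	assert(list->nr < MAX_PATHS);
	list->paths[list->nr++] = path;
}

/*
 * Add every candidate in order, skipping the ones that are NULL
 * (e.g. because $HOME is unset). Returns how many were added.
 */
size_t add_non_null_paths(struct path_list *list,
			  const char *const *candidates, size_t nr)
{
	size_t added = 0;

	for (size_t i = 0; i < nr; i++) {
		if (!candidates[i])
			continue;
		path_list_add(list, candidates[i]);
		added++;
	}
	return added;
}
```

A caller would pass the system, XDG and user paths as one array, and any
entry that happens to be NULL is simply skipped.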


end of thread, other threads:[~2022-07-25 20:56 UTC | newest]

Thread overview: 113+ messages
2022-05-06 18:30 [PATCH] [RFC] setup.c: make bare repo discovery optional Glen Choo via GitGitGadget
2022-05-06 20:33 ` Junio C Hamano
2022-05-09 21:42 ` Taylor Blau
2022-05-09 22:54   ` Junio C Hamano
2022-05-09 23:57     ` Taylor Blau
2022-05-10  0:23       ` Junio C Hamano
2022-05-10 22:00   ` Glen Choo
2022-05-13 23:37 ` [PATCH v2 0/2] " Glen Choo via GitGitGadget
2022-05-13 23:37   ` [PATCH v2 1/2] " Glen Choo via GitGitGadget
2022-05-16 18:12     ` Glen Choo
2022-05-16 18:46     ` Derrick Stolee
2022-05-16 22:25       ` Taylor Blau
2022-05-17 20:24       ` Glen Choo
2022-05-17 21:51         ` Glen Choo
2022-05-13 23:37   ` [PATCH v2 2/2] setup.c: learn discovery.bareRepository=cwd Glen Choo via GitGitGadget
2022-05-16 18:49     ` Derrick Stolee
2022-05-16 16:40   ` [PATCH v2 0/2] setup.c: make bare repo discovery optional Junio C Hamano
2022-05-16 18:36     ` Glen Choo
2022-05-16 19:16       ` Junio C Hamano
2022-05-16 20:27         ` Glen Choo
2022-05-16 22:16           ` Junio C Hamano
2022-05-16 16:43   ` Junio C Hamano
2022-05-16 19:07   ` Derrick Stolee
2022-05-16 22:43     ` Taylor Blau
2022-05-16 23:19     ` Junio C Hamano
2022-05-17 18:56     ` Glen Choo
2022-05-27 21:09   ` [PATCH v3 0/5] config: introduce discovery.bare and protected config Glen Choo via GitGitGadget
2022-05-27 21:09     ` [PATCH v3 1/5] Documentation: define protected configuration Glen Choo via GitGitGadget
2022-05-27 23:29       ` Junio C Hamano
2022-06-02 12:42         ` Derrick Stolee
2022-06-02 16:53           ` Junio C Hamano
2022-06-02 17:39             ` Glen Choo
2022-06-03 15:57         ` Glen Choo
2022-05-27 21:09     ` [PATCH v3 2/5] config: read protected config with `git_protected_config()` Glen Choo via GitGitGadget
2022-05-28  0:28       ` Junio C Hamano
2022-05-31 17:43         ` Glen Choo
2022-06-01 15:58           ` Junio C Hamano
2022-06-02 12:56       ` Derrick Stolee
2022-05-27 21:09     ` [PATCH v3 3/5] setup.c: create `discovery.bare` Glen Choo via GitGitGadget
2022-05-28  0:59       ` Junio C Hamano
2022-06-02 13:11       ` Derrick Stolee
2022-05-27 21:09     ` [PATCH v3 4/5] config: include "-c" in protected config Glen Choo via GitGitGadget
2022-06-02 13:15       ` Derrick Stolee
2022-05-27 21:09     ` [PATCH v3 5/5] upload-pack: make uploadpack.packObjectsHook protected Glen Choo via GitGitGadget
2022-06-02 13:18       ` Derrick Stolee
2022-06-07 20:57     ` [PATCH v4 0/5] config: introduce discovery.bare and protected config Glen Choo via GitGitGadget
2022-06-07 20:57       ` [PATCH v4 1/5] Documentation/git-config.txt: add SCOPES section Glen Choo via GitGitGadget
2022-06-07 20:57       ` [PATCH v4 2/5] Documentation: define protected configuration Glen Choo via GitGitGadget
2022-06-22 21:58         ` Jonathan Tan
2022-06-23 18:21           ` Glen Choo
2022-06-07 20:57       ` [PATCH v4 3/5] config: read protected config with `git_protected_config()` Glen Choo via GitGitGadget
2022-06-07 22:49         ` Junio C Hamano
2022-06-08  0:22           ` Glen Choo
2022-06-07 20:57       ` [PATCH v4 4/5] safe.directory: use git_protected_config() Glen Choo via GitGitGadget
2022-06-07 20:57       ` [PATCH v4 5/5] setup.c: create `discovery.bare` Glen Choo via GitGitGadget
2022-06-07 21:37         ` Glen Choo
2022-06-22 22:03       ` [PATCH v4 0/5] config: introduce discovery.bare and protected config Jonathan Tan
2022-06-23 17:13         ` Glen Choo
2022-06-23 18:32           ` Junio C Hamano
2022-06-27 17:34             ` Glen Choo
2022-06-27 18:19       ` Glen Choo
2022-06-27 18:36       ` [PATCH v5 " Glen Choo via GitGitGadget
2022-06-27 18:36         ` [PATCH v5 1/5] Documentation/git-config.txt: add SCOPES section Glen Choo via GitGitGadget
2022-06-27 18:36         ` [PATCH v5 2/5] Documentation: define protected configuration Glen Choo via GitGitGadget
2022-06-27 18:36         ` [PATCH v5 3/5] config: learn `git_protected_config()` Glen Choo via GitGitGadget
2022-06-27 18:36         ` [PATCH v5 4/5] safe.directory: use git_protected_config() Glen Choo via GitGitGadget
2022-06-27 18:36         ` [PATCH v5 5/5] setup.c: create `discovery.bare` Glen Choo via GitGitGadget
2022-06-30 13:20           ` Ævar Arnfjörð Bjarmason
2022-06-30 17:28             ` Glen Choo
2022-06-30 18:13         ` [PATCH v6 0/5] config: introduce discovery.bare and protected config Glen Choo via GitGitGadget
2022-06-30 18:13           ` [PATCH v6 1/5] Documentation/git-config.txt: add SCOPES section Glen Choo via GitGitGadget
2022-06-30 22:32             ` Taylor Blau
2022-07-06 17:44               ` Glen Choo
2022-06-30 18:13           ` [PATCH v6 2/5] Documentation: define protected configuration Glen Choo via GitGitGadget
2022-06-30 23:49             ` Taylor Blau
2022-07-06 18:21               ` Glen Choo
2022-06-30 18:13           ` [PATCH v6 3/5] config: learn `git_protected_config()` Glen Choo via GitGitGadget
2022-07-01  1:22             ` Taylor Blau
2022-07-06 22:42               ` Glen Choo
2022-06-30 18:13           ` [PATCH v6 4/5] safe.directory: use git_protected_config() Glen Choo via GitGitGadget
2022-06-30 18:13           ` [PATCH v6 5/5] setup.c: create `discovery.bare` Glen Choo via GitGitGadget
2022-07-01  1:30             ` Taylor Blau
2022-07-07 19:55               ` Glen Choo
2022-06-30 22:13           ` [PATCH v6 0/5] config: introduce discovery.bare and protected config Taylor Blau
2022-06-30 23:07           ` Ævar Arnfjörð Bjarmason
2022-07-01 17:37             ` Glen Choo
2022-07-08 21:58               ` Ævar Arnfjörð Bjarmason
2022-07-12 20:47                 ` Glen Choo
2022-07-12 23:53                   ` Ævar Arnfjörð Bjarmason
2022-07-07 23:01           ` [PATCH v7 " Glen Choo via GitGitGadget
2022-07-07 23:01             ` [PATCH v7 1/5] Documentation/git-config.txt: add SCOPES section Glen Choo via GitGitGadget
2022-07-07 23:43               ` Junio C Hamano
2022-07-08 17:01                 ` Glen Choo
2022-07-08 19:01                   ` Junio C Hamano
2022-07-08 21:38                     ` Glen Choo
2022-07-07 23:01             ` [PATCH v7 2/5] Documentation: define protected configuration Glen Choo via GitGitGadget
2022-07-08  0:39               ` Junio C Hamano
2022-07-07 23:01             ` [PATCH v7 3/5] config: learn `git_protected_config()` Glen Choo via GitGitGadget
2022-07-07 23:01             ` [PATCH v7 4/5] safe.directory: use git_protected_config() Glen Choo via GitGitGadget
2022-07-07 23:01             ` [PATCH v7 5/5] setup.c: create `discovery.bare` Glen Choo via GitGitGadget
2022-07-08  1:07             ` [PATCH v7 0/5] config: introduce discovery.bare and protected config Junio C Hamano
2022-07-08 20:35               ` Glen Choo
2022-07-12 22:11                 ` Glen Choo
2022-07-14 21:27             ` [PATCH v8 0/5] config: introduce safe.bareRepository " Glen Choo via GitGitGadget
2022-07-14 21:27               ` [PATCH v8 1/5] Documentation/git-config.txt: add SCOPES section Glen Choo via GitGitGadget
2022-07-14 21:27               ` [PATCH v8 2/5] Documentation: define protected configuration Glen Choo via GitGitGadget
2022-07-14 21:27               ` [PATCH v8 3/5] config: learn `git_protected_config()` Glen Choo via GitGitGadget
2022-07-25 18:26                 ` SANITIZE=address failure on master (was: [PATCH v8 3/5] config: learn `git_protected_config()`) Ævar Arnfjörð Bjarmason
2022-07-25 20:15                   ` Glen Choo
2022-07-25 20:41                     ` Ævar Arnfjörð Bjarmason
2022-07-25 20:56                       ` Glen Choo
2022-07-14 21:28               ` [PATCH v8 4/5] safe.directory: use git_protected_config() Glen Choo via GitGitGadget
2022-07-14 21:28               ` [PATCH v8 5/5] setup.c: create `safe.bareRepository` Glen Choo via GitGitGadget
