git.vger.kernel.org archive mirror
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Glen Choo <chooglen@google.com>
Cc: Derrick Stolee <derrickstolee@github.com>,
	Johannes Schindelin <Johannes.Schindelin@gmx.de>,
	git@vger.kernel.org
Subject: Re: [PATCH] fsck: detect bare repos in trees and warn
Date: Fri, 15 Apr 2022 14:46:15 +0200
Message-ID: <220415.861qxycx03.gmgdl@evledraar.gmail.com>
In-Reply-To: <kl6lee1z8mcm.fsf@chooglen-macbookpro.roam.corp.google.com>


On Thu, Apr 14 2022, Glen Choo wrote:

> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>
>> On Thu, Apr 07 2022, Derrick Stolee wrote:
>>
>>> A more complete protection here would be:
>>>
>>>  1. Warn when finding a bare repo as a tree (this patch).
>>>
>>>  2. Suppress warnings on trusted repos, scoped to a specific set of known
>>>     trees _or_ based on some set of known commits (in case the known trees
>>>     are too large).
>>>
>>>  3. Prevent writing a bare repo to the worktree, unless the user provided
>>>     an opt-in to that behavior.
>>>
>>> Since your patch is moving in the right direction here, I don't think
>>> steps (2) and (3) are required to move forward with your patch. However,
>>> it is a good opportunity to discuss the full repercussions of this issue.
>>
>> Isn't a gentler solution here to:
>>
>>  1. In setup.c, we detect a repo
>>  2. Walk up a directory
>>  3. Do we find a repo?
>>  4. Does that repo "contain" the first one?
>>     If yes: die on setup
>>     If no: it's OK
>>
>> It also seems to me that there's pretty much perfect overlap between
>> this and the long-discussed topic of marking a submodule with config
>> vs. detecting it on the fly.
>
> Your suggestion seems similar to:
>
>   == 3. Detect that we are in an embedded bare repo and ignore the embedded bare
>   repository in favor of the containing repo.
>
> which I also think is a simple, robust mitigation if we put aside the
> problem of walking up to the root in too many situations. I seem to
> recall that this problem has come up before in [1] (and possibly other
> topics? I wasn't really able to locate them through a cursory search..),
> so I assume that's what you're referring to by "long-discussed topic".

Yes, I mean the submodule.superprojectGitDir topic.

> (Forgive me if I'm asking you to repeat yourself yet another time) I
> seem to recall that we weren't able to reach consensus on whether it's
> okay for Git to opportunistically walk up the directory hierarchy during
> setup, especially since there are some situations where this is
> extremely expensive (VFS, network mount).

I'm not sure, but I think per the later
https://lore.kernel.org/git/220204.86pmo34d2m.gmgdl@evledraar.gmail.com/
and
https://lore.kernel.org/git/220311.8635joj0lf.gmgdl@evledraar.gmail.com/
that any optimization concerns were likely just "this is slow in
shellscript" and not at the FS level.

There were also passing references to some internal Google-specific
NFS-ish implementation that I know nothing about (but you might),
i.e. what I asked about in:
https://lore.kernel.org/git/220212.864k53yfws.gmgdl@evledraar.gmail.com/

But given that superprojectGitDir became a boolean instead of a path
in v9, I'm not sure/have no idea.

The only thing I'm sure of is that if past iterations of the series were
addressing such a problem as an optimization, that doesn't seem to be a
current goal.

As noted in those past exchanges, I have tested this method on e.g. AIX,
whose FS is unbelievably slow, and I couldn't even tell the difference.

That's because if you look at the total FS syscalls even for an
uninitialized repo just traversing .git, getting config etc. is going to
dwarf "walking up" in terms of number of calls.

Of course not all calls are going to be equal, and there's that
potential "I'm not NFS-y, but a parent is" case etc.

In any case, I think that even *if* we had such a case somewhere, this
plan would still make sense. Such users could simply set
GIT_CEILING_DIRECTORIES or something similar if they cared about the
performance.

But for everyone else we'd do the right thing, and not prematurely
optimize. I.e. we actually *are* concerned not with "does it look like a
bare repo?" but "is this thing that looks like a bare repo within our
current actual repo or not?".
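A rough sketch of that question, again in Python and not a description
of the current setup.c code. All function names are hypothetical. Two
simplifications are assumed: "looks like a bare repo" is reduced to
having HEAD, objects/ and refs/, and "contains" is approximated by
filesystem ancestry (an ancestor with a .git entry), whereas a real
check might also ask whether the outer repo tracks the inner path.

```python
import os


def looks_like_gitdir(path):
    """Heuristic: a directory holding HEAD, objects/ and refs/."""
    return (os.path.isfile(os.path.join(path, "HEAD"))
            and os.path.isdir(os.path.join(path, "objects"))
            and os.path.isdir(os.path.join(path, "refs")))


def find_enclosing_worktree(candidate):
    """Walk up from `candidate`; return the first ancestor with a .git entry."""
    current = os.path.dirname(os.path.abspath(candidate))
    while True:
        if os.path.exists(os.path.join(current, ".git")):
            return current
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            return None
        current = parent


def should_refuse(candidate):
    """True if `candidate` looks like a gitdir and sits inside another repo."""
    return (looks_like_gitdir(candidate)
            and find_enclosing_worktree(candidate) is not None)
```

A standalone bare repo with no repo above it would pass through
unchanged under this sketch; only the embedded case is refused.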

> I actually like this option quite a lot, but I don't see how we could
> implement this without imposing a big penalty to all bare repo users -
> they'd either be forced to set GIT_DIR or GIT_CEILING_DIRECTORIES, or
> take a (potentially big) performance hit. Hopefully I'm just framing
> this too narrowly and you're approaching this differently.

As noted in the [1] you quoted (link below) I tried to quantify that
potential penalty, and it seems to be a complete non-issue.

Of course there may be other scenarios where it matters, but I haven't
seen any concrete data to support that.

Doesn't pretty much everyone who cares about the performance of bare repos in any
capacity do so because they're running a server that's using
git-upload-pack and the like? Those require you to specify the exact
.git directory you want.

I.e. wouldn't this *only* apply to those doing the equivalent of "git -C
some-dir" to "cd" to a bare repo?

> PS: As an aside, wouldn't this also break libgit2? We could make this
> opt-out behavior, though that requires us to read system config _before_
> discovering the gitdir (as I discussed in [2]).

No it wouldn't? I don't use libgit2, but upthread there's concern that
banning things that look-like-a-repo from being tracked would break it.

Whereas I'm pointing out that we don't need to do that, we can just keep
searching upwards.

But yes, it would "break" anything that assumed you could cd to that
tracked-looks-like-or-is-a-gitdir and have e.g. "git config" pick up
its config instead of our "real repo" config, but that's exactly what we
want in this case isn't it?

I'm just pointing out that we can do it on the fly in setup.c, instead
of forbidding such content from ever being tracked within the
repository, which we'd be doing because we know we're doing the wrong
thing in that setup.c codepath.

Let's just fix that bit in setup.c instead.

> [1] https://lore.kernel.org/git/211109.86v912dtfw.gmgdl@evledraar.gmail.com/
> [2] https://lore.kernel.org/git/kl6lv8vc90ts.fsf@chooglen-macbookpro.roam.corp.google.com

