All of lore.kernel.org
 help / color / mirror / Atom feed
From: Elijah Newren <newren@gmail.com>
To: Jeff King <peff@peff.net>
Cc: Neeraj Singh <nksingh85@gmail.com>,
	Junio C Hamano <gitster@pobox.com>,
	Git Mailing List <git@vger.kernel.org>,
	Jonathan Nieder <jrnieder@gmail.com>
Subject: Re: What's cooking in git.git (Sep 2021, #08; Mon, 27)
Date: Fri, 1 Oct 2021 00:50:14 -0700	[thread overview]
Message-ID: <CABPp-BH6ZzC9p94xda3SqfL0JjxoVAb3oV57a9cpg2ZDc=5NNA@mail.gmail.com> (raw)
In-Reply-To: <YVVyUkwYNfkEqNfU@coredump.intra.peff.net>

On Thu, Sep 30, 2021 at 1:16 AM Jeff King <peff@peff.net> wrote:
>
> On Wed, Sep 29, 2021 at 11:43:39AM -0700, Neeraj Singh wrote:
>
> > It seems to me that one problem is that the new "primary" objdir code doesn't
> > disable ref updates since the GIT_QUARANTINE_ENVIRONMENT setting isn't set.
> > If we fix that we should be forbidding ref updates.
> >
> > I've included a path that fixes my test case. This is on top of:
> > https://lore.kernel.org/git/CANQDOddqwVtWfC4eEP3fJB4sUiszGX8bLqoEVLcMf=v+jzx19g@mail.gmail.com/
>
> Ah, right, I forgot we had that "forbid ref updates in quarantine" magic
> (despite being the one who added it).
>
> I do think this improves the safety, but things are still a bit more
> dangerous because we're handling it all in a single process, which sees
> both the quarantine and non-quarantine states. I wrote more details in
> this reply a few minutes ago:
>
>   https://lore.kernel.org/git/YVVmssXlaFM6yD5W@coredump.intra.peff.net/
>
> (sorry, it's long; search for the paragraph starting with "Whereas doing
> it in-process").
>
> > When creating a subprocess with a temporary ODB, we set the
> > GIT_QUARANTINE_ENVIRONMENT env var to tell child Git processes not
> > to update refs, since the tmp-objdir may go away.
> >
> > Introduce a similar mechanism for in-process temporary ODBs when
> > we call tmp_objdir_replace_primary_odb.
> >
> > Peff's test case was invoking ref updates via the cachetextconv
> > setting. That particular code silently does nothing when a ref
> > update is forbidden
>
> Oh good. I was worried that it would generate ugly errors, rather than
> silently skipping the update.
>
> > @@ -2126,7 +2146,7 @@ int ref_transaction_prepare(struct ref_transaction *transaction,
> >               break;
> >       }
> >
> > -     if (getenv(GIT_QUARANTINE_ENVIRONMENT)) {
> > +     if (getenv(GIT_QUARANTINE_ENVIRONMENT) || !ref_updates_are_enabled()) {
> >               strbuf_addstr(err,
> >                             _("ref updates forbidden inside quarantine environment"));
> >               return -1;
>
> I think the overall goal of this patch makes sense, but this
> conditional, along with tmp-objdir reaching out to the refs code, feels
> funny. Should we perhaps have a single:
>
>   int tmp_objdir_is_primary(struct repository *r)
>   {
>         if (the_tmp_objdir &&
>             !strcmp(the_tmp_objdir->path.buf, r->objects->odb->path))
>                 return 1; /* our temporary is the primary */
>
>         if (getenv(GIT_QUARANTINE_PATH))
>                 return 1; /* our caller put us in quarantine */
>
>         return 0;
>   }
>
> Then it's all nicely abstracted and the ref code does not have to know
> the details of GIT_QUARANTINE_PATH in the first place.
>
> If you do got that route, the strcmp() might need to be a little more
> careful about whether r->objects can be NULL (I didn't check).
> Alternatively, I kind of wonder if "struct object_directory" should just
> have a flag that says "is_temporary".

Actually, wouldn't this be the safest approach, for my particular
usecase?  By having the quarantine in place, no refs will be updated,
which removes the risk of new refs mentioning objects that will go
away.  The only issue that could arise would be from new objects
referencing objects that will go away.  But the new objects are also
written to the temporary object store when it remains the primary, and
thus those new objects will also be deleted when the temporary object
store is.

In contrast, moving the temporary object store to the end of the
alternates list would allow new objects to be written that reference
the objects that will go away.  Using pretend_object_file() will also
allow new objects to be written that reference the objects that will
go away.

Said another way, I don't think anything should be writing a critical
file that needs to be durable when we're in the middle of a
"read-only" process like git-log.  The only things written should be
temporary stuff, like the automatic remerge calculation from
merge-ort, the textconv cache optimization stuff, or perhaps future
gitattributes transformation caching.  All that stuff can safely be
blown away at the completion of each merge.

  reply	other threads:[~2021-10-01  7:50 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-28  0:52 What's cooking in git.git (Sep 2021, #08; Mon, 27) Junio C Hamano
2021-09-28  1:57 ` Ævar Arnfjörð Bjarmason
2021-09-28 20:52   ` Junio C Hamano
2021-09-28  6:46 ` Elijah Newren
2021-09-28  7:45   ` Ævar Arnfjörð Bjarmason
2021-09-28 17:25     ` Junio C Hamano
2021-09-28 21:00       ` Neeraj Singh
2021-09-28 23:34         ` Junio C Hamano
2021-09-28 23:53           ` Neeraj Singh
2021-10-07 22:01             ` Junio C Hamano
2021-10-08  6:51               ` Elijah Newren
2021-10-08 22:30                 ` Neeraj Singh
2021-10-08 23:01                 ` Junio C Hamano
2021-09-28  8:07   ` Ævar Arnfjörð Bjarmason
2021-09-28 17:27     ` Junio C Hamano
2021-09-28 13:31   ` Derrick Stolee
2021-09-28 17:33     ` Junio C Hamano
2021-09-28 20:16       ` Derrick Stolee
2021-09-28 17:16   ` Junio C Hamano
2021-09-29  6:42     ` Elijah Newren
2021-09-28 23:40   ` Jeff King
2021-09-28 23:49     ` Jeff King
2021-09-29 18:43     ` Neeraj Singh
2021-09-30  8:16       ` Jeff King
2021-10-01  7:50         ` Elijah Newren [this message]
2021-10-01 17:02           ` Junio C Hamano
2021-10-01 17:39             ` Neeraj Singh
2021-10-01 18:15               ` Elijah Newren
2021-10-01 18:12             ` Elijah Newren
2021-10-01 22:02               ` Junio C Hamano
2021-10-01 23:05                 ` Elijah Newren
2021-10-04 13:45     ` Elijah Newren
2021-09-28  8:22 ` da/difftool (was: Re: What's cooking in git.git (Sep 2021, #08; Mon, 27)) Ævar Arnfjörð Bjarmason
2021-09-28  8:23 ` ns/batched-fsync & en/remerge-diff (was " Ævar Arnfjörð Bjarmason
2021-09-28  8:31 ` sg/test-split-index-fix " Ævar Arnfjörð Bjarmason
2021-09-28  8:35 ` hn/reftable (Re: " Ævar Arnfjörð Bjarmason
2021-09-28 12:18   ` Han-Wen Nienhuys
2021-09-30  5:06     ` Carlo Arenas
2021-09-29  8:12 ` What's cooking in git.git (Sep 2021, #08; Mon, 27) Fabian Stelzer
2021-09-30 21:26   ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CABPp-BH6ZzC9p94xda3SqfL0JjxoVAb3oV57a9cpg2ZDc=5NNA@mail.gmail.com' \
    --to=newren@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jrnieder@gmail.com \
    --cc=nksingh85@gmail.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.