All of lore.kernel.org
 help / color / mirror / Atom feed
From: Neeraj Singh <nksingh85@gmail.com>
To: Jeff King <peff@peff.net>
Cc: Elijah Newren <newren@gmail.com>,
	Junio C Hamano <gitster@pobox.com>,
	Git Mailing List <git@vger.kernel.org>,
	Jonathan Nieder <jrnieder@gmail.com>
Subject: Re: What's cooking in git.git (Sep 2021, #08; Mon, 27)
Date: Wed, 29 Sep 2021 11:43:39 -0700	[thread overview]
Message-ID: <20210929184339.GA19712@neerajsi-x1.localdomain> (raw)
In-Reply-To: <YVOn3hDsb5pnxR53@coredump.intra.peff.net>

On Tue, Sep 28, 2021 at 07:40:14PM -0400, Jeff King wrote:
> On Mon, Sep 27, 2021 at 11:46:40PM -0700, Elijah Newren wrote:
> 
> > > * en/remerge-diff (2021-08-31) 7 commits
> > >  - doc/diff-options: explain the new --remerge-diff option
> > >  - show, log: provide a --remerge-diff capability
> > >  - tmp-objdir: new API for creating and removing primary object dirs
> > >  - merge-ort: capture and print ll-merge warnings in our preferred fashion
> > >  - ll-merge: add API for capturing warnings in a strbuf instead of stderr
> > >  - merge-ort: add ability to record conflict messages in a file
> > >  - merge-ort: mark a few more conflict messages as omittable
> > >
> > >  A new presentation for two-parent merge "--remerge-diff" can be
> > >  used to show the difference between mechanical (and possibly
> > >  conflicted) merge results and the recorded resolution.
> > >
> > >  Will merge to 'next'?
> > 
> > It has been a month that it's been cooking with no issues brought up,
> > and it's been in production for nearly a year...
> > 
> > But just this morning I pinged peff and jrnieder if they might have
> > time to respectively look at the tmp-objdir stuff (patch 5, plus its
> > integration into log-tree.c in patch 7) and the ll-merge.[ch] changes
> > (patch 3).  I don't know if either will have time to do it, but
> > perhaps wait half a week or so to see if they'll mention they have
> > time?  Otherwise, yeah, it's probably time to merge this down.
> 
> Sorry to take so long. I think this is a very exciting topic, and I
> appreciate being called into to look at tmp-objdir stuff, because it can
> be quite subtle.
> 
> I just left a rather long-ish mail in the thread, but the gist of it is
> that I'm worried that there's some possibility of corruption there if
> the diff code writes objects. I didn't do a proof-of-concept there, but
> I worked one up just now. Try this:
> 
>   # make a repo
>   git init repo
>   cd repo
>   echo base >file
>   git add file
>   git commit -m base
> 
>   # two diverging branches
>   echo main >file
>   git commit -am main
>   git checkout -b side HEAD^
>   echo side >file
>   git commit -am side
> 
>   # we get a conflict, and we resolve
>   git merge main
>   echo resolved >file
>   git commit -am merged
> 
>   # now imagine we had a file with a diff driver. I stuffed it
>   # in here after the fact, but it could have been here all along,
>   # or come as part of the merge, etc.
>   echo whatever >unrelated
>   echo "unrelated diff=foo" >.gitattributes
>   git add .
>   git commit --amend --no-edit
> 
>   # set up the diff driver not just to do a textconv, but to cache the
>   # result. This will require writing out new objects for the cache
>   # as part of the diff operation.
>   git config diff.foo.textconv "$PWD/upcase"
>   git config diff.foo.cachetextconv true
>   cat >upcase <<\EOF &&
>   #!/bin/sh
>   tr a-z A-Z <$1
>   EOF
>   chmod +x upcase
> 
>   # now show the diff
>   git log -1 --remerge-diff
> 
>   # and make sure the repo is still OK
>   git fsck
> 
> The remerge diff will correctly show the textconv'd content (because
> it's not in either parent, and hence an evil merge):
> 
>   diff --git a/unrelated b/unrelated
>   new file mode 100644
>   index 0000000..982793c
>   --- /dev/null
>   +++ b/unrelated
>   @@ -0,0 +1 @@
>   +WHATEVER
> 
> but then fsck shows that our cache is corrupted:
> 
>   Checking object directories: 100% (256/256), done.
>   error: refs/notes/textconv/foo: invalid sha1 pointer 1e9b3ecffca73411001043fe914a7ec0e7291043
>   error: refs/notes/textconv/foo: invalid reflog entry 1e9b3ecffca73411001043fe914a7ec0e7291043
>   dangling tree 569b6e414203d2f50bdaafc1585eae11e28afc37
> 
> Now I'll admit the textconv-cache is a pretty seldom-used feature. And
> that there _probably_ aren't a lot of other ways that the diff code
> would try to write objects or references. But I think it speaks to the
> kind of subtle interactions we should worry about (and certainly
> corrupting the repository is a pretty bad outcome, though at least in
> this case it's "just" a cache and we could blow away that ref manually).
> 
> -Peff

It seems to me that one problem is that the new "primary" objdir code doesn't
disable ref updates since the GIT_QUARANTINE_ENVIRONMENT setting isn't set.
If we fix that we should be forbidding ref updates.

I've included a path that fixes my test case. This is on top of:
https://lore.kernel.org/git/CANQDOddqwVtWfC4eEP3fJB4sUiszGX8bLqoEVLcMf=v+jzx19g@mail.gmail.com/

From 38be7f6d31da8df3f205b35d25dd0a505aa75a8a Mon Sep 17 00:00:00 2001
From: Neeraj Singh <neerajsi@microsoft.com>
Date: Wed, 29 Sep 2021 11:24:05 -0700
Subject: [PATCH] tmp-objdir: disable ref updates when replacing the primary
 odb

When creating a subprocess with a temporary ODB, we set the
GIT_QUARANTINE_ENVIRONMENT env var to tell child Git processes not
to update refs, since the tmp-objdir may go away.

Introduce a similar mechanism for in-process temporary ODBs when
we call tmp_objdir_replace_primary_odb.

Peff's test case was invoking ref updates via the cachetextconv
setting. That particular code silently does nothing when a ref
update is forbidden

Reported-by: Jeff King <peff@peff.net>

Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
---
 object-file.c |  7 +++++++
 refs.c        | 22 +++++++++++++++++++++-
 refs.h        |  5 +++++
 3 files changed, 33 insertions(+), 1 deletion(-)

diff --git a/object-file.c b/object-file.c
index 990381abee..e3e5cec3c8 100644
--- a/object-file.c
+++ b/object-file.c
@@ -770,6 +770,12 @@ struct object_directory *set_temporary_primary_odb(const char *dir, int will_des
 	new_odb->will_destroy = will_destroy;
 	new_odb->next = the_repository->objects->odb;
 	the_repository->objects->odb = new_odb;
+
+	/*
+	 * Disable ref updates while a temporary odb is active, since
+	 * the objects in the database may roll back.
+	 */
+	refs_disable_updates();
 	return new_odb->next;
 }
 
@@ -786,6 +792,7 @@ void restore_primary_odb(struct object_directory *restore_odb, const char *old_p
 
 	the_repository->objects->odb = restore_odb;
 	free_object_directory(cur_odb);
+	refs_enable_updates();
 }
 
 /*
diff --git a/refs.c b/refs.c
index 8b9f7c3a80..98026b7341 100644
--- a/refs.c
+++ b/refs.c
@@ -19,6 +19,8 @@
 #include "repository.h"
 #include "sigchain.h"
 
+static int ref_update_disabled_count;
+
 /*
  * List of all available backends
  */
@@ -1045,6 +1047,24 @@ void ref_transaction_free(struct ref_transaction *transaction)
 	free(transaction);
 }
 
+void refs_disable_updates(void)
+{
+	++ref_update_disabled_count;
+}
+
+void refs_enable_updates(void)
+{
+	if (!ref_update_disabled_count)
+		BUG("ref updates are not disabled");
+
+	--ref_update_disabled_count;
+}
+
+int ref_updates_are_enabled(void)
+{
+	return !ref_update_disabled_count;
+}
+
 struct ref_update *ref_transaction_add_update(
 		struct ref_transaction *transaction,
 		const char *refname, unsigned int flags,
@@ -2126,7 +2146,7 @@ int ref_transaction_prepare(struct ref_transaction *transaction,
 		break;
 	}
 
-	if (getenv(GIT_QUARANTINE_ENVIRONMENT)) {
+	if (getenv(GIT_QUARANTINE_ENVIRONMENT) || !ref_updates_are_enabled()) {
 		strbuf_addstr(err,
 			      _("ref updates forbidden inside quarantine environment"));
 		return -1;
diff --git a/refs.h b/refs.h
index 48970dfc7e..acd7e275c5 100644
--- a/refs.h
+++ b/refs.h
@@ -840,6 +840,11 @@ int ref_storage_backend_exists(const char *name);
 
 struct ref_store *get_main_ref_store(struct repository *r);
 
+/* Helpers to mark updates to the refs as forbidden */
+void refs_disable_updates(void);
+void refs_enable_updates(void);
+int ref_updates_are_enabled(void);
+
 /**
  * Submodules
  * ----------
-- 
2.25.1


  parent reply	other threads:[~2021-09-29 18:43 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-28  0:52 What's cooking in git.git (Sep 2021, #08; Mon, 27) Junio C Hamano
2021-09-28  1:57 ` Ævar Arnfjörð Bjarmason
2021-09-28 20:52   ` Junio C Hamano
2021-09-28  6:46 ` Elijah Newren
2021-09-28  7:45   ` Ævar Arnfjörð Bjarmason
2021-09-28 17:25     ` Junio C Hamano
2021-09-28 21:00       ` Neeraj Singh
2021-09-28 23:34         ` Junio C Hamano
2021-09-28 23:53           ` Neeraj Singh
2021-10-07 22:01             ` Junio C Hamano
2021-10-08  6:51               ` Elijah Newren
2021-10-08 22:30                 ` Neeraj Singh
2021-10-08 23:01                 ` Junio C Hamano
2021-09-28  8:07   ` Ævar Arnfjörð Bjarmason
2021-09-28 17:27     ` Junio C Hamano
2021-09-28 13:31   ` Derrick Stolee
2021-09-28 17:33     ` Junio C Hamano
2021-09-28 20:16       ` Derrick Stolee
2021-09-28 17:16   ` Junio C Hamano
2021-09-29  6:42     ` Elijah Newren
2021-09-28 23:40   ` Jeff King
2021-09-28 23:49     ` Jeff King
2021-09-29 18:43     ` Neeraj Singh [this message]
2021-09-30  8:16       ` Jeff King
2021-10-01  7:50         ` Elijah Newren
2021-10-01 17:02           ` Junio C Hamano
2021-10-01 17:39             ` Neeraj Singh
2021-10-01 18:15               ` Elijah Newren
2021-10-01 18:12             ` Elijah Newren
2021-10-01 22:02               ` Junio C Hamano
2021-10-01 23:05                 ` Elijah Newren
2021-10-04 13:45     ` Elijah Newren
2021-09-28  8:22 ` da/difftool (was: Re: What's cooking in git.git (Sep 2021, #08; Mon, 27)) Ævar Arnfjörð Bjarmason
2021-09-28  8:23 ` ns/batched-fsync & en/remerge-diff (was " Ævar Arnfjörð Bjarmason
2021-09-28  8:31 ` sg/test-split-index-fix " Ævar Arnfjörð Bjarmason
2021-09-28  8:35 ` hn/reftable (Re: " Ævar Arnfjörð Bjarmason
2021-09-28 12:18   ` Han-Wen Nienhuys
2021-09-30  5:06     ` Carlo Arenas
2021-09-29  8:12 ` What's cooking in git.git (Sep 2021, #08; Mon, 27) Fabian Stelzer
2021-09-30 21:26   ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210929184339.GA19712@neerajsi-x1.localdomain \
    --to=nksingh85@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jrnieder@gmail.com \
    --cc=newren@gmail.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.