From: Michael Haggerty <mhagger@alum.mit.edu>
To: git@vger.kernel.org, David Turner <dturner@twopensource.com>
Cc: "Junio C Hamano" <gitster@pobox.com>,
"Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>,
"Jeff King" <peff@peff.net>,
"Ramsay Jones" <ramsay@ramsayjones.plus.com>,
"Michael Haggerty" <mhagger@alum.mit.edu>
Subject: [PATCH 00/29] Yet more preparation for reference backends
Date: Wed, 27 Apr 2016 18:57:17 +0200 [thread overview]
Message-ID: <cover.1461768689.git.mhagger@alum.mit.edu> (raw)
This started as a modest attempt to move the last couple of patches
mentioned here [1] to before the vtable patches. I wanted to do that
because having real changes mixed with the vtable refactoring was
making rebasing and stuff awkward.
But then it snowballed. A lot of what's new is pretty trivial (though
in some cases fixes minor bugs). But a few of the later patches are
pretty major.
The two main problems addressed by this series are:
1. Reference transactions will, in the future, sometimes span two
completely different reference backends. For example, we already
have worktree-specific refs like `HEAD` and `refs/bisect/*`, which
are stored within the worktree, plus shared refs, which live in the
main repository. It is even possible for a symref in one reference
backend to point at a reference in another reference backend. Thus
we need to be able to split one main transaction into one
transaction per backend.
2. Currently, reference transactions that involve symbolic refs are
not atomic: the symref is not locked at all, even when its reflog
is being updated. This is a no-no. It also means that its referent
can change between the time that the symref is resolved to find out
its old_oid and the time that the referent is locked. These aren't
super serious races because symrefs are usually not modified very
often (especially on Git servers, which is mostly where concurrent
modifications are an issue). But still...
The approach taken to solve these problems is inspired by David
Turner's patch [2], whose approach was first discussed here [3]. David
wrote a separate function, dereference_symrefs(transaction, ...),
which scanned through the whole transaction and split up any symref
updates into one symref update with the REF_LOG_ONLY option (to update
the symrefs reflog), plus a second update that changes the underlying
reference. This approach solves problem 1 but not problem 2.
This patch series takes a slightly different approach. For each
reference update during the "lock references" part of
ref_transaction_commit(), it
1. Locks the named reference. (If it is a symref, lock *the symref*,
not its referent!)
2. Reads the reference non-recursively
3. If it is a symref, change the update to REF_LOG_ONLY and add a
second update to the transaction for the underlying reference.
4. If it is not a symref but was derived from a symref update, record
the reference's old_oid in the ref_update structure for the
original symref.
In this way, each reference in a symref chain is locked down *before*
we read it and the lock is held throughout the transaction. As a
bonus, we don't have to use resolve_ref_full() to find old_oid for the
references; it is enough to read the references *once*, because we do
it under lock.
The three patches starting with "refs: resolve symbolic refs first"
involve a lot of new code in an area that is pretty intricate. I've
reviewed them a few times, but it's quite possible I've made some
mistakes (especially in the error-handling code). Careful review here
would be much appreciated.
It's possible that people will think this is too much new code for
fixing a relatively unlikely race. I'm not absolutely convinced myself
that it's not overengineered. But splitting ref_updates is definitely
needed for supporting multiple backends, and I think the approach of
locking then following one reference at a time is more or less a
prerequisite for making symref locking work correctly with multiple
reference backends. So (I think) our choices are to accept this code
or something like it, or to give up the hope of correct atomicity of
transactions that span backends.
This patch series is also available from my GitHub repository [4] as
branch "split-under-lock". It applies to Junio's current master.
Incidentally, a draft of the *next* patch series, which adds a
ref_store vtable abstraction for managing reference backends, is
available as branch "ref-store" in my GitHub repo. That branch passes
all tests but is not yet carefully reviewed.
Following is a table of contents for this patch series...
Chapter 1: Prologue
This chapter consists of small cleanups that I noticed while working
on the code.
* safe_create_leading_directories(): improve docstring
* remove_dir_recursively(): add docstring
* refname_is_safe(): use skip_prefix()
This patch fixes a bug in refname_is_safe():
* refname_is_safe(): don't allow the empty string
This one makes refname_is_safe() a little stricter...it shouldn't
allow refnames like "refs/foo/../bar///baz" because the downstream
code isn't smart enough to handle them anyway. (At the latest if the
reference has to be sought in packed-refs, things would fail):
* refname_is_safe(): insist that the refname already be normalized
Chapter 2: Character development
Make sure error output goes the right place:
* commit_ref_update(): write error message to *err, not stderr
Tighten up some code, rename some parameters:
* rename_ref(): remove unneeded local variable
* ref_transaction_commit(): remove local variable n
* read_raw_ref(): rename flags argument to type
* read_raw_ref(): clear *type at start of function
* read_raw_ref(): rename symref argument to referent
* read_raw_ref(): improve docstring
* lock_ref_sha1_basic(): remove unneeded local variable
Chapter 3: Small adventures
Change error messages to adhere to our new policy of starting with
lower-case letters:
* refs: make error messages more consistent
Require REF_NODEREF if REF_ISPRUNING is set:
* ref_transaction_create(): disallow recursive pruning
Correctly handle an (unlikely) error path:
* ref_transaction_commit(): correctly report close_ref() failure
Oops, here's a problem that could have bit us later. (In fact, it
did bite me when I implemented one of the later patches):
* delete_branches(): use resolve_refdup()
Chapter 4: Flashback
The following two patches are split out of one of David Turner's
patches mentioned above:
* refs: allow log-only updates
* refs: don't dereference on rename
Chapter 5: The tension builds
The protagonist clearly has something planned...but what?
* verify_refname_available(): adjust constness in declaration
* add_update(): initialize the whole ref_update
* lock_ref_for_update(): new function
* unlock_ref(): move definition higher in the file
Here we force a little more refname safety while building up
ref_transactions:
* ref_transaction_update(): check refname_is_safe() at a minimum
Chapter 6: The climax
This is the guts of the patch series. Everything up to this point
should be pretty uncontroversial. These three patches are the main
subject of the discussion above. They are pretty large patches, but
I don't see a reasonable way to break them up more:
* refs: resolve symbolic refs first
* lock_ref_for_update(): don't re-read non-symbolic references
* lock_ref_for_update(): don't resolve symrefs
Chapter 7: Denouement
Remove some stuff that's not needed anymore.
* commit_ref_update(): remove the flags parameter
* lock_ref_sha1_basic(): only handle REF_NODEREF mode
Michael
[1] http://article.gmane.org/gmane.comp.version-control.git/289994
[2] http://thread.gmane.org/gmane.comp.version-control.git/287949/focus=287958
[3] http://article.gmane.org/gmane.comp.version-control.git/282927
[4] https://github.com/mhagger/git
David Turner (2):
refs: allow log-only updates
refs: don't dereference on rename
Michael Haggerty (27):
safe_create_leading_directories(): improve docstring
remove_dir_recursively(): add docstring
refname_is_safe(): use skip_prefix()
refname_is_safe(): don't allow the empty string
refname_is_safe(): insist that the refname already be normalized
commit_ref_update(): write error message to *err, not stderr
rename_ref(): remove unneeded local variable
ref_transaction_commit(): remove local variable n
read_raw_ref(): rename flags argument to type
read_raw_ref(): clear *type at start of function
read_raw_ref(): rename symref argument to referent
read_raw_ref(): improve docstring
lock_ref_sha1_basic(): remove unneeded local variable
refs: make error messages more consistent
ref_transaction_create(): disallow recursive pruning
ref_transaction_commit(): correctly report close_ref() failure
delete_branches(): use resolve_refdup()
verify_refname_available(): adjust constness in declaration
add_update(): initialize the whole ref_update
lock_ref_for_update(): new function
unlock_ref(): move definition higher in the file
ref_transaction_update(): check refname_is_safe() at a minimum
refs: resolve symbolic refs first
lock_ref_for_update(): don't re-read non-symbolic references
lock_ref_for_update(): don't resolve symrefs
commit_ref_update(): remove the flags parameter
lock_ref_sha1_basic(): only handle REF_NODEREF mode
builtin/branch.c | 19 +-
cache.h | 5 +
dir.h | 23 ++
refs.c | 67 ++--
refs/files-backend.c | 882 +++++++++++++++++++++++++++++++++++++-----------
refs/refs-internal.h | 55 ++-
t/t1400-update-ref.sh | 41 ++-
t/t1430-bad-ref-name.sh | 2 +-
t/t3200-branch.sh | 9 +
9 files changed, 869 insertions(+), 234 deletions(-)
--
2.8.1
next reply other threads:[~2016-04-27 17:05 UTC|newest]
Thread overview: 70+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-04-27 16:57 Michael Haggerty [this message]
2016-04-27 16:57 ` [PATCH 01/29] safe_create_leading_directories(): improve docstring Michael Haggerty
2016-04-27 16:57 ` [PATCH 02/29] remove_dir_recursively(): add docstring Michael Haggerty
2016-04-27 16:57 ` [PATCH 03/29] refname_is_safe(): use skip_prefix() Michael Haggerty
2016-04-27 16:57 ` [PATCH 04/29] refname_is_safe(): don't allow the empty string Michael Haggerty
2016-04-27 16:57 ` [PATCH 05/29] refname_is_safe(): insist that the refname already be normalized Michael Haggerty
2016-04-27 17:59 ` Junio C Hamano
2016-04-27 20:10 ` David Turner
2016-04-27 20:15 ` Jeff King
2016-04-27 20:34 ` David Turner
2016-04-27 20:37 ` Jeff King
2016-04-27 22:19 ` Jeff King
2016-04-28 17:44 ` David Turner
2016-04-27 16:57 ` [PATCH 06/29] commit_ref_update(): write error message to *err, not stderr Michael Haggerty
2016-04-27 16:57 ` [PATCH 07/29] rename_ref(): remove unneeded local variable Michael Haggerty
2016-04-27 16:57 ` [PATCH 08/29] ref_transaction_commit(): remove local variable n Michael Haggerty
2016-04-27 18:16 ` Junio C Hamano
2016-04-27 20:45 ` Junio C Hamano
2016-04-27 16:57 ` [PATCH 09/29] read_raw_ref(): rename flags argument to type Michael Haggerty
2016-04-27 16:57 ` [PATCH 10/29] read_raw_ref(): clear *type at start of function Michael Haggerty
2016-04-27 16:57 ` [PATCH 11/29] read_raw_ref(): rename symref argument to referent Michael Haggerty
2016-04-27 16:57 ` [PATCH 12/29] read_raw_ref(): improve docstring Michael Haggerty
2016-04-27 18:31 ` Junio C Hamano
2016-04-27 16:57 ` [PATCH 13/29] lock_ref_sha1_basic(): remove unneeded local variable Michael Haggerty
2016-04-27 16:57 ` [PATCH 14/29] refs: make error messages more consistent Michael Haggerty
2016-04-27 16:57 ` [PATCH 15/29] ref_transaction_create(): disallow recursive pruning Michael Haggerty
2016-04-27 18:47 ` Junio C Hamano
2016-04-27 20:23 ` David Turner
2016-04-27 20:45 ` Junio C Hamano
2016-04-27 21:15 ` Junio C Hamano
2016-04-28 17:48 ` David Turner
2016-04-28 20:15 ` Junio C Hamano
2016-04-29 6:56 ` Michael Haggerty
2016-04-29 8:19 ` Junio C Hamano
2016-04-29 8:41 ` Junio C Hamano
2016-04-29 14:29 ` Michael Haggerty
2016-04-27 16:57 ` [PATCH 16/29] ref_transaction_commit(): correctly report close_ref() failure Michael Haggerty
2016-04-27 16:57 ` [PATCH 17/29] delete_branches(): use resolve_refdup() Michael Haggerty
2016-04-27 16:57 ` [PATCH 18/29] refs: allow log-only updates Michael Haggerty
2016-04-27 16:57 ` [PATCH 19/29] refs: don't dereference on rename Michael Haggerty
2016-04-27 18:55 ` Junio C Hamano
2016-04-29 7:38 ` Michael Haggerty
2016-04-29 8:53 ` Junio C Hamano
2016-04-29 10:57 ` Michael Haggerty
2016-04-29 12:12 ` Jeff King
2016-04-29 13:55 ` Michael Haggerty
2016-04-29 14:08 ` Jeff King
2016-04-29 15:29 ` Junio C Hamano
2016-04-29 23:21 ` David Turner
2016-04-30 3:48 ` Michael Haggerty
2016-05-02 17:55 ` Junio C Hamano
2016-04-27 16:57 ` [PATCH 20/29] verify_refname_available(): adjust constness in declaration Michael Haggerty
2016-04-27 16:57 ` [PATCH 21/29] add_update(): initialize the whole ref_update Michael Haggerty
2016-04-27 16:57 ` [PATCH 22/29] lock_ref_for_update(): new function Michael Haggerty
2016-04-27 16:57 ` [PATCH 23/29] unlock_ref(): move definition higher in the file Michael Haggerty
2016-04-27 16:57 ` [PATCH 24/29] ref_transaction_update(): check refname_is_safe() at a minimum Michael Haggerty
2016-04-27 20:14 ` Junio C Hamano
2016-04-29 7:42 ` Michael Haggerty
2016-04-29 8:53 ` Junio C Hamano
2016-04-27 16:57 ` [PATCH 25/29] refs: resolve symbolic refs first Michael Haggerty
2016-04-28 23:40 ` David Turner
2016-04-29 9:51 ` Michael Haggerty
2016-04-29 23:14 ` David Turner
2016-05-02 18:06 ` Junio C Hamano
2016-04-27 16:57 ` [PATCH 26/29] lock_ref_for_update(): don't re-read non-symbolic references Michael Haggerty
2016-04-27 16:57 ` [PATCH 27/29] lock_ref_for_update(): don't resolve symrefs Michael Haggerty
2016-04-27 16:57 ` [PATCH 28/29] commit_ref_update(): remove the flags parameter Michael Haggerty
2016-04-27 16:57 ` [PATCH 29/29] lock_ref_sha1_basic(): only handle REF_NODEREF mode Michael Haggerty
2016-04-29 15:43 ` Junio C Hamano
2016-04-29 1:14 ` [PATCH 00/29] Yet more preparation for reference backends David Turner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=cover.1461768689.git.mhagger@alum.mit.edu \
--to=mhagger@alum.mit.edu \
--cc=dturner@twopensource.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=pclouds@gmail.com \
--cc=peff@peff.net \
--cc=ramsay@ramsayjones.plus.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).