* [PATCH 0/2] fix corner case with lock_ref_sha1_basic and REF_NODEREF @ 2016-01-11 15:46 Jeff King 2016-01-11 15:49 ` [PATCH 1/2] checkout,clone: check return value of create_symref Jeff King ` (2 more replies) 0 siblings, 3 replies; 23+ messages in thread From: Jeff King @ 2016-01-11 15:46 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, Michael Haggerty I came across an interesting regression case with the new create_symref() that uses lock_ref_sha1_basic(). It turns out that the bug is actually in the latter function, but that it's slightly easier to tickle it since we added new callers. The bug and fix are rather involved, so I won't repeat the explanation from patch 2/2 here. The first patch is just a cleanup necessary to accurately in test in the second. [1/2]: checkout,clone: check return value of create_symref [2/2]: lock_ref_sha1_basic: handle REF_NODEREF with invalid refs -Peff ^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH 1/2] checkout,clone: check return value of create_symref 2016-01-11 15:46 [PATCH 0/2] fix corner case with lock_ref_sha1_basic and REF_NODEREF Jeff King @ 2016-01-11 15:49 ` Jeff King 2016-01-12 4:09 ` Michael Haggerty 2016-01-11 15:52 ` [PATCH 2/2] lock_ref_sha1_basic: handle REF_NODEREF with invalid refs Jeff King 2016-01-12 9:56 ` [PATCH v2 0/2] fix corner case with lock_ref_sha1_basic and REF_NODEREF Jeff King 2 siblings, 1 reply; 23+ messages in thread From: Jeff King @ 2016-01-11 15:49 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, Michael Haggerty It's unlikely that we would fail to create or update a symbolic ref (especially HEAD), but if we do, we should notice and complain. Note that there's no need to give more details in our error message; create_symref will already have done so. While we're here, let's also fix a minor memory leak in clone. Signed-off-by: Jeff King <peff@peff.net> --- This patch could go to maint. I don't know if it's worth the trouble. I was unable to figure out a way to trigger this reliably (hence no tests). The two ways I considered were: - "chmod -w .git", but it results in a die() already - the bug I'm fixing in 2/2; but we don't want to rely on that in our test suite, since I'm about to fix it. :-/ builtin/checkout.c | 3 ++- builtin/clone.c | 11 +++++++---- 2 files changed, 9 insertions(+), 5 deletions(-) diff --git a/builtin/checkout.c b/builtin/checkout.c index e8110a9..5af84a3 100644 --- a/builtin/checkout.c +++ b/builtin/checkout.c @@ -661,7 +661,8 @@ static void update_refs_for_switch(const struct checkout_opts *opts, describe_detached_head(_("HEAD is now at"), new->commit); } } else if (new->path) { /* Switch branches. */ - create_symref("HEAD", new->path, msg.buf); + if (create_symref("HEAD", new->path, msg.buf) < 0) + die("unable to update HEAD"); if (!opts->quiet) { if (old->path && !strcmp(new->path, old->path)) { if (opts->new_branch_force) diff --git a/builtin/clone.c b/builtin/clone.c index a0b3cd9..a7c8def 100644 --- a/builtin/clone.c +++ b/builtin/clone.c @@ -636,9 +636,11 @@ static void update_remote_refs(const struct ref *refs, struct strbuf head_ref = STRBUF_INIT; strbuf_addstr(&head_ref, branch_top); strbuf_addstr(&head_ref, "HEAD"); - create_symref(head_ref.buf, - remote_head_points_at->peer_ref->name, - msg); + if (create_symref(head_ref.buf, + remote_head_points_at->peer_ref->name, + msg) < 0) + die("unable to update %s", head_ref.buf); + strbuf_release(&head_ref); } } @@ -648,7 +650,8 @@ static void update_head(const struct ref *our, const struct ref *remote, const char *head; if (our && skip_prefix(our->name, "refs/heads/", &head)) { /* Local default branch link */ - create_symref("HEAD", our->name, NULL); + if (create_symref("HEAD", our->name, NULL) < 0) + die("unable to update HEAD"); if (!option_bare) { update_ref(msg, "HEAD", our->old_oid.hash, NULL, 0, UPDATE_REFS_DIE_ON_ERR); -- 2.7.0.368.g04bc9ee ^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: [PATCH 1/2] checkout,clone: check return value of create_symref 2016-01-11 15:49 ` [PATCH 1/2] checkout,clone: check return value of create_symref Jeff King @ 2016-01-12 4:09 ` Michael Haggerty 2016-01-12 9:49 ` Jeff King 0 siblings, 1 reply; 23+ messages in thread From: Michael Haggerty @ 2016-01-12 4:09 UTC (permalink / raw) To: Jeff King, git; +Cc: Junio C Hamano On 01/11/2016 04:49 PM, Jeff King wrote: > It's unlikely that we would fail to create or update a > symbolic ref (especially HEAD), but if we do, we should > notice and complain. Note that there's no need to give more > details in our error message; create_symref will already > have done so. > > While we're here, let's also fix a minor memory leak in > clone. > > Signed-off-by: Jeff King <peff@peff.net> LGTM. > --- > This patch could go to maint. I don't know if it's worth the trouble. I > was unable to figure out a way to trigger this reliably (hence no > tests). The two ways I considered were: > > - "chmod -w .git", but it results in a die() already > > - the bug I'm fixing in 2/2; but we don't want to rely on that in our > test suite, since I'm about to fix it. :-/ A locking conflict is an easy way to trigger this error: $ git --version git version 2.7.0.rc2.31.g396da8f $ git branch foo $ touch .git/HEAD.lock $ git checkout foo error: Unable to create '/home/mhagger/tmp/brokhead/.git/HEAD.lock': File exists. If no other git process is currently running, this probably means a git process crashed in this repository earlier. Make sure no other git process is running and remove the file manually to continue. Switched to branch 'foo' $ echo $? 0 After this patch, the above "checkout" command fails with retcode=128. Michael -- Michael Haggerty mhagger@alum.mit.edu ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 1/2] checkout,clone: check return value of create_symref 2016-01-12 4:09 ` Michael Haggerty @ 2016-01-12 9:49 ` Jeff King 0 siblings, 0 replies; 23+ messages in thread From: Jeff King @ 2016-01-12 9:49 UTC (permalink / raw) To: Michael Haggerty; +Cc: git, Junio C Hamano On Tue, Jan 12, 2016 at 05:09:07AM +0100, Michael Haggerty wrote: > > This patch could go to maint. I don't know if it's worth the trouble. I > > was unable to figure out a way to trigger this reliably (hence no > > tests). The two ways I considered were: > > > > - "chmod -w .git", but it results in a die() already > > > > - the bug I'm fixing in 2/2; but we don't want to rely on that in our > > test suite, since I'm about to fix it. :-/ > > A locking conflict is an easy way to trigger this error: > [...] Thanks, that's a good suggestion; I'll add a test in the re-roll. -Peff ^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH 2/2] lock_ref_sha1_basic: handle REF_NODEREF with invalid refs 2016-01-11 15:46 [PATCH 0/2] fix corner case with lock_ref_sha1_basic and REF_NODEREF Jeff King 2016-01-11 15:49 ` [PATCH 1/2] checkout,clone: check return value of create_symref Jeff King @ 2016-01-11 15:52 ` Jeff King 2016-01-11 18:28 ` Junio C Hamano 2016-01-12 4:55 ` Michael Haggerty 2016-01-12 9:56 ` [PATCH v2 0/2] fix corner case with lock_ref_sha1_basic and REF_NODEREF Jeff King 2 siblings, 2 replies; 23+ messages in thread From: Jeff King @ 2016-01-11 15:52 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, Michael Haggerty We sometimes call lock_ref_sha1_basic both with REF_NODEREF to operate directly on a symbolic ref. This is used, for example, to move to a detached HEAD, or when updating the contents of HEAD via checkout or symbolic-ref. However, the first step of the function is to resolve the refname to get the "old" sha1, and we do so without telling resolve_ref_unsafe() that we are only interested in the symref. As a result, we may detect a problem there not with the symref itself, but with something it points to. The real-world example I found (and what is used in the test suite) is a HEAD pointing to a ref that cannot exist, because it would cause a directory/file conflict with other existing refs. This situation is somewhat broken, of course, as trying to _commit_ on that HEAD would fail. But it's not explicitly forbidden, and we should be able to move away from it. However, neither "git checkout" nor "git symbolic-ref" can do so. We try to take the lock on HEAD, which is pointing to a non-existent ref. We bail from resolve_ref_unsafe() with errno set to EISDIR, and the lock code thinks we are attempting to create a d/f conflict. Of course we're not. The problem is that the lock code has no idea what level we were at when we got EISDIR, so trying to diagnose or remove empty directories for HEAD is not useful. The most obvious solution would be to call resolve_ref_unsafe() with RESOLVE_REF_NO_RECURSE, so that we never look beyond the symref (and any problems we find must be attributable to it). However, that means we would not correctly gather the "old" sha1. We do not typically care about it for locking purposes with a symref (since the symref has no value on its own), but it does affect what we write into the HEAD reflog. Another possibility is to avoid the d/f check when REF_NORECURSE is set. But that would mean we fail to notice a real d/f conflict. This is impossible with HEAD, but we would not want to create refs/heads/origin/HEAD.lock if we already have refs/heads/origin/HEAD/foo. So instead, we attempt to resolve HEAD fully to get the old sha1, and only if that fails do we fallback to a non-recursive resolution. We lose nothing to the fallback, since we know the ref cannot be resolved, and thus we have no old sha1 in the first place. And we still get the benefit of the d/f-checking for the symref itself. This does mean an extra round of filesystem lookups in some cases, but they should be rare. It only kicks in with REF_NODEREF, and then only when the existing ref cannot be resolved. Signed-off-by: Jeff King <peff@peff.net> --- This is prepared on top of the jk/symbolic-ref topic. As shown by the tests, though, this can be triggered when checking out a detached HEAD, so it existed prior to that. The fix is independent, but some of the additions to the test suite rely on that topic. I can split it into separate patches, but I'm not sure it's worth the trouble. The only case I found that triggers it is quite obscure, and prior to the jk/symbolic-ref topic, you can recover using a non-detached checkout (or symbolic-ref). refs/files-backend.c | 4 ++++ t/t1401-symbolic-ref.sh | 7 +++++++ t/t2011-checkout-invalid-head.sh | 20 ++++++++++++++++++++ 3 files changed, 31 insertions(+) diff --git a/refs/files-backend.c b/refs/files-backend.c index 180c837..ea67d82 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -1901,6 +1901,10 @@ static struct ref_lock *lock_ref_sha1_basic(const char *refname, refname = resolve_ref_unsafe(refname, resolve_flags, lock->old_oid.hash, &type); + if (!refname && (flags & REF_NODEREF)) + refname = resolve_ref_unsafe(orig_refname, + resolve_flags | RESOLVE_REF_NO_RECURSE, + lock->old_oid.hash, &type); if (!refname && errno == EISDIR) { /* * we are trying to lock foo but we used to diff --git a/t/t1401-symbolic-ref.sh b/t/t1401-symbolic-ref.sh index 5db876c..a713766 100755 --- a/t/t1401-symbolic-ref.sh +++ b/t/t1401-symbolic-ref.sh @@ -122,4 +122,11 @@ test_expect_success 'symbolic-ref does not create ref d/f conflicts' ' test_must_fail git symbolic-ref refs/heads/df/conflict refs/heads/df ' +test_expect_success 'symbolic-ref handles existing pointer to invalid name' ' + head=$(git rev-parse HEAD) && + git symbolic-ref HEAD refs/heads/outer && + git update-ref refs/heads/outer/inner $head && + git symbolic-ref HEAD refs/heads/unrelated +' + test_done diff --git a/t/t2011-checkout-invalid-head.sh b/t/t2011-checkout-invalid-head.sh index 300f8bf..c6c11c4 100755 --- a/t/t2011-checkout-invalid-head.sh +++ b/t/t2011-checkout-invalid-head.sh @@ -19,4 +19,24 @@ test_expect_success 'checkout master from invalid HEAD' ' git checkout master -- ' +test_expect_success 'create ref directory/file conflict scenario' ' + git update-ref refs/heads/outer/inner master && + + # do not rely on symbolic-ref to get a known state, + # as it may use the same code we are testing + reset_to_df () { + echo "ref: refs/heads/outer" >.git/HEAD + } +' + +test_expect_failure 'checkout away from d/f HEAD (to branch)' ' + reset_to_df && + git checkout master +' + +test_expect_failure 'checkout away from d/f HEAD (to detached)' ' + reset_to_df && + git checkout --detach master +' + test_done -- 2.7.0.368.g04bc9ee ^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: [PATCH 2/2] lock_ref_sha1_basic: handle REF_NODEREF with invalid refs 2016-01-11 15:52 ` [PATCH 2/2] lock_ref_sha1_basic: handle REF_NODEREF with invalid refs Jeff King @ 2016-01-11 18:28 ` Junio C Hamano 2016-01-12 4:55 ` Michael Haggerty 1 sibling, 0 replies; 23+ messages in thread From: Junio C Hamano @ 2016-01-11 18:28 UTC (permalink / raw) To: Jeff King; +Cc: git, Michael Haggerty Jeff King <peff@peff.net> writes: > So instead, we attempt to resolve HEAD fully to get the old > sha1, and only if that fails do we fallback to a > non-recursive resolution. We lose nothing to the fallback, > since we know the ref cannot be resolved, and thus we have > no old sha1 in the first place. And we still get the benefit > of the d/f-checking for the symref itself. > > This does mean an extra round of filesystem lookups in some > cases, but they should be rare. It only kicks in with > REF_NODEREF, and then only when the existing ref cannot be > resolved. Makes perfect sense to me. Thanks. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 2/2] lock_ref_sha1_basic: handle REF_NODEREF with invalid refs 2016-01-11 15:52 ` [PATCH 2/2] lock_ref_sha1_basic: handle REF_NODEREF with invalid refs Jeff King 2016-01-11 18:28 ` Junio C Hamano @ 2016-01-12 4:55 ` Michael Haggerty 2016-01-12 9:52 ` Jeff King 1 sibling, 1 reply; 23+ messages in thread From: Michael Haggerty @ 2016-01-12 4:55 UTC (permalink / raw) To: Jeff King, git; +Cc: Junio C Hamano On 01/11/2016 04:52 PM, Jeff King wrote: > We sometimes call lock_ref_sha1_basic both with REF_NODEREF > to operate directly on a symbolic ref. ^^^ This sentence seems to be missing some words. > This is used, for > example, to move to a detached HEAD, or when updating > the contents of HEAD via checkout or symbolic-ref. > > However, the first step of the function is to resolve the > refname to get the "old" sha1, and we do so without telling > resolve_ref_unsafe() that we are only interested in the > symref. As a result, we may detect a problem there not with > the symref itself, but with something it points to. > > The real-world example I found (and what is used in the test > suite) is a HEAD pointing to a ref that cannot exist, > because it would cause a directory/file conflict with other > existing refs. This situation is somewhat broken, of > course, as trying to _commit_ on that HEAD would fail. But > it's not explicitly forbidden, and we should be able to move > away from it. However, neither "git checkout" nor "git > symbolic-ref" can do so. We try to take the lock on HEAD, > which is pointing to a non-existent ref. We bail from > resolve_ref_unsafe() with errno set to EISDIR, and the lock > code thinks we are attempting to create a d/f conflict. > > Of course we're not. The problem is that the lock code has > no idea what level we were at when we got EISDIR, so trying > to diagnose or remove empty directories for HEAD is not > useful. > > The most obvious solution would be to call > resolve_ref_unsafe() with RESOLVE_REF_NO_RECURSE, so that we > never look beyond the symref (and any problems we find must > be attributable to it). However, that means we would not > correctly gather the "old" sha1. We do not typically care > about it for locking purposes with a symref (since the > symref has no value on its own), but it does affect what we > write into the HEAD reflog. > > Another possibility is to avoid the d/f check when > REF_NORECURSE is set. But that would mean we fail to notice > a real d/f conflict. This is impossible with HEAD, but we > would not want to create refs/heads/origin/HEAD.lock if we > already have refs/heads/origin/HEAD/foo. > > So instead, we attempt to resolve HEAD fully to get the old > sha1, and only if that fails do we fallback to a > non-recursive resolution. We lose nothing to the fallback, > since we know the ref cannot be resolved, and thus we have > no old sha1 in the first place. And we still get the benefit > of the d/f-checking for the symref itself. > > This does mean an extra round of filesystem lookups in some > cases, but they should be rare. It only kicks in with > REF_NODEREF, and then only when the existing ref cannot be > resolved. > > Signed-off-by: Jeff King <peff@peff.net> > --- > This is prepared on top of the jk/symbolic-ref topic. As shown by the > tests, though, this can be triggered when checking out a detached HEAD, > so it existed prior to that. The fix is independent, but some of the > additions to the test suite rely on that topic. > > I can split it into separate patches, but I'm not sure it's worth the > trouble. The only case I found that triggers it is quite obscure, and > prior to the jk/symbolic-ref topic, you can recover using a non-detached > checkout (or symbolic-ref). > > refs/files-backend.c | 4 ++++ > t/t1401-symbolic-ref.sh | 7 +++++++ > t/t2011-checkout-invalid-head.sh | 20 ++++++++++++++++++++ > 3 files changed, 31 insertions(+) > > diff --git a/refs/files-backend.c b/refs/files-backend.c > index 180c837..ea67d82 100644 > --- a/refs/files-backend.c > +++ b/refs/files-backend.c > @@ -1901,6 +1901,10 @@ static struct ref_lock *lock_ref_sha1_basic(const char *refname, > > refname = resolve_ref_unsafe(refname, resolve_flags, > lock->old_oid.hash, &type); > + if (!refname && (flags & REF_NODEREF)) > + refname = resolve_ref_unsafe(orig_refname, > + resolve_flags | RESOLVE_REF_NO_RECURSE, > + lock->old_oid.hash, &type); > [...] The main risk for this change would be that this new recovery code allows the function to continue, but one of the outputs of the second function invocation is not correct for the code that follows. Let me think out loud: * refname -- now will be equal to orig_refname. I think the main effect is that it will be passed to verify_refname_available_dir(). This seems to be what we want. * type -- now reflects orig_refname; i.e., usually REF_ISSYMREF. This also seems correct. * lock->old_oid.hash -- is now ZEROS. This might get compared to the caller's old_sha1 in verify_lock(), and it will also be written to the reflog as the "old" value. I think this is also what we want. So this change looks good to me. Thanks for the good catch and especially the awesome commit message! Michael -- Michael Haggerty mhagger@alum.mit.edu ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 2/2] lock_ref_sha1_basic: handle REF_NODEREF with invalid refs 2016-01-12 4:55 ` Michael Haggerty @ 2016-01-12 9:52 ` Jeff King 2016-01-12 18:11 ` Junio C Hamano 0 siblings, 1 reply; 23+ messages in thread From: Jeff King @ 2016-01-12 9:52 UTC (permalink / raw) To: Michael Haggerty; +Cc: git, Junio C Hamano On Tue, Jan 12, 2016 at 05:55:25AM +0100, Michael Haggerty wrote: > On 01/11/2016 04:52 PM, Jeff King wrote: > > We sometimes call lock_ref_sha1_basic both with REF_NODEREF > > to operate directly on a symbolic ref. > > ^^^ This sentence seems to be missing some words. I think it has one too many. :) It was originally "both with a regular ref and with a symref", but I shortened it since we only care about the symref case. I think just getting rid of "both" is the right thing. > > diff --git a/refs/files-backend.c b/refs/files-backend.c > > index 180c837..ea67d82 100644 > > --- a/refs/files-backend.c > > +++ b/refs/files-backend.c > > @@ -1901,6 +1901,10 @@ static struct ref_lock *lock_ref_sha1_basic(const char *refname, > > > > refname = resolve_ref_unsafe(refname, resolve_flags, > > lock->old_oid.hash, &type); > > + if (!refname && (flags & REF_NODEREF)) > > + refname = resolve_ref_unsafe(orig_refname, > > + resolve_flags | RESOLVE_REF_NO_RECURSE, > > + lock->old_oid.hash, &type); > > [...] > > The main risk for this change would be that this new recovery code > allows the function to continue, but one of the outputs of the second > function invocation is not correct for the code that follows. Let me > think out loud: > > * refname -- now will be equal to orig_refname. I think the main effect > is that it will be passed to verify_refname_available_dir(). This seems > to be what we want. > > * type -- now reflects orig_refname; i.e., usually REF_ISSYMREF. This > also seems correct. > > * lock->old_oid.hash -- is now ZEROS. This might get compared to the > caller's old_sha1 in verify_lock(), and it will also be written to the > reflog as the "old" value. I think this is also what we want. > > So this change looks good to me. Thanks. I had a nagging feeling that I hadn't considered all cases, but the way you've framed it makes sense to me. -Peff ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 2/2] lock_ref_sha1_basic: handle REF_NODEREF with invalid refs 2016-01-12 9:52 ` Jeff King @ 2016-01-12 18:11 ` Junio C Hamano 0 siblings, 0 replies; 23+ messages in thread From: Junio C Hamano @ 2016-01-12 18:11 UTC (permalink / raw) To: Jeff King; +Cc: Michael Haggerty, git Jeff King <peff@peff.net> writes: > On Tue, Jan 12, 2016 at 05:55:25AM +0100, Michael Haggerty wrote: > >> On 01/11/2016 04:52 PM, Jeff King wrote: >> > We sometimes call lock_ref_sha1_basic both with REF_NODEREF >> > to operate directly on a symbolic ref. >> >> ^^^ This sentence seems to be missing some words. > > I think it has one too many. :) > > It was originally "both with a regular ref and with a symref", but I > shortened it since we only care about the symref case. I think just > getting rid of "both" is the right thing. Thanks, I did notice this and wondered the same while reviewing, but totally forgot about it when I queued X-<. >> > diff --git a/refs/files-backend.c b/refs/files-backend.c >> > index 180c837..ea67d82 100644 >> > --- a/refs/files-backend.c >> > +++ b/refs/files-backend.c >> > @@ -1901,6 +1901,10 @@ static struct ref_lock *lock_ref_sha1_basic(const char *refname, >> > >> > refname = resolve_ref_unsafe(refname, resolve_flags, >> > lock->old_oid.hash, &type); >> > + if (!refname && (flags & REF_NODEREF)) >> > + refname = resolve_ref_unsafe(orig_refname, >> > + resolve_flags | RESOLVE_REF_NO_RECURSE, >> > + lock->old_oid.hash, &type); >> > [...] >> >> The main risk for this change would be that this new recovery code >> allows the function to continue, but one of the outputs of the second >> function invocation is not correct for the code that follows. Let me >> think out loud: >> >> * refname -- now will be equal to orig_refname. I think the main effect >> is that it will be passed to verify_refname_available_dir(). This seems >> to be what we want. >> >> * type -- now reflects orig_refname; i.e., usually REF_ISSYMREF. This >> also seems correct. >> >> * lock->old_oid.hash -- is now ZEROS. This might get compared to the >> caller's old_sha1 in verify_lock(), and it will also be written to the >> reflog as the "old" value. I think this is also what we want. >> >> So this change looks good to me. > > Thanks. I had a nagging feeling that I hadn't considered all cases, but > the way you've framed it makes sense to me. Sounds good. Thanks, both. ^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH v2 0/2] fix corner case with lock_ref_sha1_basic and REF_NODEREF 2016-01-11 15:46 [PATCH 0/2] fix corner case with lock_ref_sha1_basic and REF_NODEREF Jeff King 2016-01-11 15:49 ` [PATCH 1/2] checkout,clone: check return value of create_symref Jeff King 2016-01-11 15:52 ` [PATCH 2/2] lock_ref_sha1_basic: handle REF_NODEREF with invalid refs Jeff King @ 2016-01-12 9:56 ` Jeff King 2016-01-12 9:57 ` [PATCH v2 1/2] checkout,clone: check return value of create_symref Jeff King 2016-01-12 9:58 ` [PATCH v2 2/2] lock_ref_sha1_basic: handle REF_NODEREF with invalid refs Jeff King 2 siblings, 2 replies; 23+ messages in thread From: Jeff King @ 2016-01-12 9:56 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, Michael Haggerty On Mon, Jan 11, 2016 at 10:46:51AM -0500, Jeff King wrote: > I came across an interesting regression case with the new > create_symref() that uses lock_ref_sha1_basic(). It turns out that the > bug is actually in the latter function, but that it's slightly easier to > tickle it since we added new callers. > > The bug and fix are rather involved, so I won't repeat the explanation > from patch 2/2 here. The first patch is just a cleanup necessary to > accurately in test in the second. > > [1/2]: checkout,clone: check return value of create_symref > [2/2]: lock_ref_sha1_basic: handle REF_NODEREF with invalid refs Here's a re-roll incorporating feedback from Michael. Patch 1 now has a test for git-checkout (doing one for clone seems rather hard without being racy). Patch 2 has a typo fix in the commit message, and the test script is adjusted to account for the new test. -Peff ^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH v2 1/2] checkout,clone: check return value of create_symref 2016-01-12 9:56 ` [PATCH v2 0/2] fix corner case with lock_ref_sha1_basic and REF_NODEREF Jeff King @ 2016-01-12 9:57 ` Jeff King 2016-01-12 9:58 ` [PATCH v2 2/2] lock_ref_sha1_basic: handle REF_NODEREF with invalid refs Jeff King 1 sibling, 0 replies; 23+ messages in thread From: Jeff King @ 2016-01-12 9:57 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, Michael Haggerty It's unlikely that we would fail to create or update a symbolic ref (especially HEAD), but if we do, we should notice and complain. Note that there's no need to give more details in our error message; create_symref will already have done so. While we're here, let's also fix a minor memory leak in clone. Signed-off-by: Jeff King <peff@peff.net> --- builtin/checkout.c | 3 ++- builtin/clone.c | 11 +++++++---- t/t2011-checkout-invalid-head.sh | 6 ++++++ 3 files changed, 15 insertions(+), 5 deletions(-) diff --git a/builtin/checkout.c b/builtin/checkout.c index e8110a9..5af84a3 100644 --- a/builtin/checkout.c +++ b/builtin/checkout.c @@ -661,7 +661,8 @@ static void update_refs_for_switch(const struct checkout_opts *opts, describe_detached_head(_("HEAD is now at"), new->commit); } } else if (new->path) { /* Switch branches. */ - create_symref("HEAD", new->path, msg.buf); + if (create_symref("HEAD", new->path, msg.buf) < 0) + die("unable to update HEAD"); if (!opts->quiet) { if (old->path && !strcmp(new->path, old->path)) { if (opts->new_branch_force) diff --git a/builtin/clone.c b/builtin/clone.c index a0b3cd9..a7c8def 100644 --- a/builtin/clone.c +++ b/builtin/clone.c @@ -636,9 +636,11 @@ static void update_remote_refs(const struct ref *refs, struct strbuf head_ref = STRBUF_INIT; strbuf_addstr(&head_ref, branch_top); strbuf_addstr(&head_ref, "HEAD"); - create_symref(head_ref.buf, - remote_head_points_at->peer_ref->name, - msg); + if (create_symref(head_ref.buf, + remote_head_points_at->peer_ref->name, + msg) < 0) + die("unable to update %s", head_ref.buf); + strbuf_release(&head_ref); } } @@ -648,7 +650,8 @@ static void update_head(const struct ref *our, const struct ref *remote, const char *head; if (our && skip_prefix(our->name, "refs/heads/", &head)) { /* Local default branch link */ - create_symref("HEAD", our->name, NULL); + if (create_symref("HEAD", our->name, NULL) < 0) + die("unable to update HEAD"); if (!option_bare) { update_ref(msg, "HEAD", our->old_oid.hash, NULL, 0, UPDATE_REFS_DIE_ON_ERR); diff --git a/t/t2011-checkout-invalid-head.sh b/t/t2011-checkout-invalid-head.sh index 300f8bf..d444d5e 100755 --- a/t/t2011-checkout-invalid-head.sh +++ b/t/t2011-checkout-invalid-head.sh @@ -19,4 +19,10 @@ test_expect_success 'checkout master from invalid HEAD' ' git checkout master -- ' +test_expect_success 'checkout notices failure to lock HEAD' ' + test_when_finished "rm -f .git/HEAD.lock" && + >.git/HEAD.lock && + test_must_fail git checkout -b other +' + test_done -- 2.7.0.368.g04bc9ee ^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH v2 2/2] lock_ref_sha1_basic: handle REF_NODEREF with invalid refs 2016-01-12 9:56 ` [PATCH v2 0/2] fix corner case with lock_ref_sha1_basic and REF_NODEREF Jeff King 2016-01-12 9:57 ` [PATCH v2 1/2] checkout,clone: check return value of create_symref Jeff King @ 2016-01-12 9:58 ` Jeff King 2016-01-12 13:26 ` Jeff King 1 sibling, 1 reply; 23+ messages in thread From: Jeff King @ 2016-01-12 9:58 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, Michael Haggerty We sometimes call lock_ref_sha1_basic with REF_NODEREF to operate directly on a symbolic ref. This is used, for example, to move to a detached HEAD, or when updating the contents of HEAD via checkout or symbolic-ref. However, the first step of the function is to resolve the refname to get the "old" sha1, and we do so without telling resolve_ref_unsafe() that we are only interested in the symref. As a result, we may detect a problem there not with the symref itself, but with something it points to. The real-world example I found (and what is used in the test suite) is a HEAD pointing to a ref that cannot exist, because it would cause a directory/file conflict with other existing refs. This situation is somewhat broken, of course, as trying to _commit_ on that HEAD would fail. But it's not explicitly forbidden, and we should be able to move away from it. However, neither "git checkout" nor "git symbolic-ref" can do so. We try to take the lock on HEAD, which is pointing to a non-existent ref. We bail from resolve_ref_unsafe() with errno set to EISDIR, and the lock code thinks we are attempting to create a d/f conflict. Of course we're not. The problem is that the lock code has no idea what level we were at when we got EISDIR, so trying to diagnose or remove empty directories for HEAD is not useful. The most obvious solution would be to call resolve_ref_unsafe() with RESOLVE_REF_NO_RECURSE, so that we never look beyond the symref (and any problems we find must be attributable to it). However, that means we would not correctly gather the "old" sha1. We do not typically care about it for locking purposes with a symref (since the symref has no value on its own), but it does affect what we write into the HEAD reflog. Another possibility is to avoid the d/f check when REF_NORECURSE is set. But that would mean we fail to notice a real d/f conflict. This is impossible with HEAD, but we would not want to create refs/heads/origin/HEAD.lock if we already have refs/heads/origin/HEAD/foo. So instead, we attempt to resolve HEAD fully to get the old sha1, and only if that fails do we fallback to a non-recursive resolution. We lose nothing to the fallback, since we know the ref cannot be resolved, and thus we have no old sha1 in the first place. And we still get the benefit of the d/f-checking for the symref itself. This does mean an extra round of filesystem lookups in some cases, but they should be rare. It only kicks in with REF_NODEREF, and then only when the existing ref cannot be resolved. Signed-off-by: Jeff King <peff@peff.net> --- refs/files-backend.c | 4 ++++ t/t1401-symbolic-ref.sh | 7 +++++++ t/t2011-checkout-invalid-head.sh | 20 ++++++++++++++++++++ 3 files changed, 31 insertions(+) diff --git a/refs/files-backend.c b/refs/files-backend.c index 180c837..ea67d82 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -1901,6 +1901,10 @@ static struct ref_lock *lock_ref_sha1_basic(const char *refname, refname = resolve_ref_unsafe(refname, resolve_flags, lock->old_oid.hash, &type); + if (!refname && (flags & REF_NODEREF)) + refname = resolve_ref_unsafe(orig_refname, + resolve_flags | RESOLVE_REF_NO_RECURSE, + lock->old_oid.hash, &type); if (!refname && errno == EISDIR) { /* * we are trying to lock foo but we used to diff --git a/t/t1401-symbolic-ref.sh b/t/t1401-symbolic-ref.sh index 5db876c..a713766 100755 --- a/t/t1401-symbolic-ref.sh +++ b/t/t1401-symbolic-ref.sh @@ -122,4 +122,11 @@ test_expect_success 'symbolic-ref does not create ref d/f conflicts' ' test_must_fail git symbolic-ref refs/heads/df/conflict refs/heads/df ' +test_expect_success 'symbolic-ref handles existing pointer to invalid name' ' + head=$(git rev-parse HEAD) && + git symbolic-ref HEAD refs/heads/outer && + git update-ref refs/heads/outer/inner $head && + git symbolic-ref HEAD refs/heads/unrelated +' + test_done diff --git a/t/t2011-checkout-invalid-head.sh b/t/t2011-checkout-invalid-head.sh index d444d5e..9c1fddf 100755 --- a/t/t2011-checkout-invalid-head.sh +++ b/t/t2011-checkout-invalid-head.sh @@ -25,4 +25,24 @@ test_expect_success 'checkout notices failure to lock HEAD' ' test_must_fail git checkout -b other ' +test_expect_success 'create ref directory/file conflict scenario' ' + git update-ref refs/heads/outer/inner master && + + # do not rely on symbolic-ref to get a known state, + # as it may use the same code we are testing + reset_to_df () { + echo "ref: refs/heads/outer" >.git/HEAD + } +' + +test_expect_failure 'checkout away from d/f HEAD (to branch)' ' + reset_to_df && + git checkout master +' + +test_expect_failure 'checkout away from d/f HEAD (to detached)' ' + reset_to_df && + git checkout --detach master +' + test_done -- 2.7.0.368.g04bc9ee ^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: [PATCH v2 2/2] lock_ref_sha1_basic: handle REF_NODEREF with invalid refs 2016-01-12 9:58 ` [PATCH v2 2/2] lock_ref_sha1_basic: handle REF_NODEREF with invalid refs Jeff King @ 2016-01-12 13:26 ` Jeff King 2016-01-12 13:55 ` [PATCH v3 " Jeff King 0 siblings, 1 reply; 23+ messages in thread From: Jeff King @ 2016-01-12 13:26 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, Michael Haggerty On Tue, Jan 12, 2016 at 04:58:04AM -0500, Jeff King wrote: > +test_expect_failure 'checkout away from d/f HEAD (to branch)' ' > + reset_to_df && > + git checkout master > +' > + > +test_expect_failure 'checkout away from d/f HEAD (to detached)' ' > + reset_to_df && > + git checkout --detach master > +' These should be expect_success, of course (I had originally planned to introduce the tests and then fix them later, but it all ended up in the same patch). Unfortunately, I think there is a case we're missing. I'm still digging, but hope to have something soon. In the meantime, don't bother with v2. :) -Peff ^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH v3 2/2] lock_ref_sha1_basic: handle REF_NODEREF with invalid refs 2016-01-12 13:26 ` Jeff King @ 2016-01-12 13:55 ` Jeff King 2016-01-12 19:41 ` Junio C Hamano 0 siblings, 1 reply; 23+ messages in thread From: Jeff King @ 2016-01-12 13:55 UTC (permalink / raw) To: git; +Cc: Junio C Hamano, Michael Haggerty On Tue, Jan 12, 2016 at 08:26:28AM -0500, Jeff King wrote: > On Tue, Jan 12, 2016 at 04:58:04AM -0500, Jeff King wrote: > > > +test_expect_failure 'checkout away from d/f HEAD (to branch)' ' > > + reset_to_df && > > + git checkout master > > +' > > + > > +test_expect_failure 'checkout away from d/f HEAD (to detached)' ' > > + reset_to_df && > > + git checkout --detach master > > +' > > These should be expect_success, of course (I had originally planned to > introduce the tests and then fix them later, but it all ended up in the > same patch). > > Unfortunately, I think there is a case we're missing. I'm still digging, > but hope to have something soon. In the meantime, don't bother with v2. This rabbit hole just keeps getting deeper. :) The behavior I tested and fixed earlier only covers the _loose_ ref case, where we actually see EISDIR, and resolve_ref returns NULL. But if the refs are all packed, resolve_ref will happily report the pointed-to ref, and we do not hit this fallback code path at all. Here's a v3 for the second patch (the first one is fine) that handles this case. See the amended description in the commit message, and I added some comments which hopefully make things more obvious. I also notice that if we are deleting, we _do_ set RESOLVE_REF_NO_RECURSE from the very beginning, which means we would generally not get a valid lock->old_oid.hash for a symref. But I'm not sure what it would mean to delete a symref while asking for its current value (it cannot have one!). So I don't think it is a bug. -- >8 -- Subject: lock_ref_sha1_basic: handle REF_NODEREF with invalid refs We sometimes call lock_ref_sha1_basic with REF_NODEREF to operate directly on a symbolic ref. This is used, for example, to move to a detached HEAD, or when updating the contents of HEAD via checkout or symbolic-ref. However, the first step of the function is to resolve the refname to get the "old" sha1, and we do so without telling resolve_ref_unsafe() that we are only interested in the symref. As a result, we may detect a problem there not with the symref itself, but with something it points to. The real-world example I found (and what is used in the test suite) is a HEAD pointing to a ref that cannot exist, because it would cause a directory/file conflict with other existing refs. This situation is somewhat broken, of course, as trying to _commit_ on that HEAD would fail. But it's not explicitly forbidden, and we should be able to move away from it. However, neither "git checkout" nor "git symbolic-ref" can do so. We try to take the lock on HEAD, which is pointing to a non-existent ref. We bail from resolve_ref_unsafe() with errno set to EISDIR, and the lock code thinks we are attempting to create a d/f conflict. Of course we're not. The problem is that the lock code has no idea what level we were at when we got EISDIR, so trying to diagnose or remove empty directories for HEAD is not useful. To make things even more complicated, we only get EISDIR in the loose-ref case. If the refs are packed, the resolution may "succeed", giving us the pointed-to ref in "refname", but a null oid. Later, we say "ah, the null oid means we are creating; let's make sure there is room for it", but mistakenly check against the _resolved_ refname, not the original. The most obvious solution would be to call resolve_ref_unsafe() with RESOLVE_REF_NO_RECURSE, so that we never look beyond the symref (and any problems we find must be attributable to it). However, that means we would not correctly gather the "old" sha1. We do not typically care about it for locking purposes with a symref (since the symref has no value on its own), but it does affect what we write into the HEAD reflog. Another possibility is to avoid the d/f check when REF_NORECURSE is set. But that would mean we fail to notice a real d/f conflict. This is impossible with HEAD, but we would not want to create refs/heads/origin/HEAD.lock if we already have refs/heads/origin/HEAD/foo. So instead, we attempt to resolve HEAD fully to get the old sha1, and only if that fails do we fallback to a non-recursive resolution. We lose nothing to the fallback, since we know the ref cannot be resolved, and thus we have no old sha1 in the first place. And we still get the benefit of the d/f-checking for the symref itself. This does mean an extra round of filesystem lookups in some cases, but they should be rare. It only kicks in with REF_NODEREF, and then only when the existing ref cannot be resolved. And if the lookup of HEAD _does_ succeed, we switch to using orig_refname much sooner, to cover the packed-refs case. Signed-off-by: Jeff King <peff@peff.net> --- refs/files-backend.c | 23 +++++++++++++++++++++++ t/t1401-symbolic-ref.sh | 7 +++++++ t/t2011-checkout-invalid-head.sh | 34 ++++++++++++++++++++++++++++++++++ 3 files changed, 64 insertions(+) diff --git a/refs/files-backend.c b/refs/files-backend.c index 180c837..291b18d 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -1899,8 +1899,31 @@ static struct ref_lock *lock_ref_sha1_basic(const char *refname, resolve_flags |= RESOLVE_REF_NO_RECURSE; } + /* + * Resolve with recursion, even if REF_NODEREF was set, + * as we want to make sure the "old" hash is filled in. + */ refname = resolve_ref_unsafe(refname, resolve_flags, lock->old_oid.hash, &type); + if (flags & REF_NODEREF) { + /* + * If we got a refname, then we know that the outer symref + * exists (though it might not point to an existing ref). + * From here out, we care only about the original name, because + * of REF_NODEREF. + * + * Otherwise, we know there was some lookup error, and we want + * to know whether it was due to orig_refname, or something it + * pointed to. Repeat our lookup without recursion. + */ + if (refname) + refname = orig_refname; + else + refname = resolve_ref_unsafe(orig_refname, + resolve_flags | RESOLVE_REF_NO_RECURSE, + lock->old_oid.hash, &type); + } + if (!refname && errno == EISDIR) { /* * we are trying to lock foo but we used to diff --git a/t/t1401-symbolic-ref.sh b/t/t1401-symbolic-ref.sh index 5db876c..a713766 100755 --- a/t/t1401-symbolic-ref.sh +++ b/t/t1401-symbolic-ref.sh @@ -122,4 +122,11 @@ test_expect_success 'symbolic-ref does not create ref d/f conflicts' ' test_must_fail git symbolic-ref refs/heads/df/conflict refs/heads/df ' +test_expect_success 'symbolic-ref handles existing pointer to invalid name' ' + head=$(git rev-parse HEAD) && + git symbolic-ref HEAD refs/heads/outer && + git update-ref refs/heads/outer/inner $head && + git symbolic-ref HEAD refs/heads/unrelated +' + test_done diff --git a/t/t2011-checkout-invalid-head.sh b/t/t2011-checkout-invalid-head.sh index d444d5e..762907d 100755 --- a/t/t2011-checkout-invalid-head.sh +++ b/t/t2011-checkout-invalid-head.sh @@ -25,4 +25,38 @@ test_expect_success 'checkout notices failure to lock HEAD' ' test_must_fail git checkout -b other ' +test_expect_success 'create ref directory/file conflict scenario' ' + git update-ref refs/heads/outer/inner master && + + # do not rely on symbolic-ref to get a known state, + # as it may use the same code we are testing + reset_to_df () { + echo "ref: refs/heads/outer" >.git/HEAD + } +' + +test_expect_success 'checkout away from d/f HEAD (unpacked, to branch)' ' + reset_to_df && + git checkout master +' + +test_expect_success 'checkout away from d/f HEAD (unpacked, to detached)' ' + reset_to_df && + git checkout --detach master +' + +test_expect_success 'pack refs' ' + git pack-refs --all --prune +' + +test_expect_success 'checkout away from d/f HEAD (packed, to branch)' ' + reset_to_df && + git checkout master +' + +test_expect_success 'checkout away from d/f HEAD (packed, to detached)' ' + reset_to_df && + git checkout --detach master +' + test_done -- 2.7.0.368.g04bc9ee ^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: [PATCH v3 2/2] lock_ref_sha1_basic: handle REF_NODEREF with invalid refs 2016-01-12 13:55 ` [PATCH v3 " Jeff King @ 2016-01-12 19:41 ` Junio C Hamano 2016-01-12 20:22 ` Jeff King 0 siblings, 1 reply; 23+ messages in thread From: Junio C Hamano @ 2016-01-12 19:41 UTC (permalink / raw) To: Jeff King; +Cc: git, Michael Haggerty Jeff King <peff@peff.net> writes: > I also notice that if we are deleting, we _do_ set > RESOLVE_REF_NO_RECURSE from the very beginning, which means we would > generally not get a valid lock->old_oid.hash for a symref. But I'm not > sure what it would mean to delete a symref while asking for its current > value (it cannot have one!). So I don't think it is a bug. I started scratching my head after noticing that the NO_RECURSE bit set in the DELETING codepath before reading the above, and I am still doing so. A transaction that attempts to delete an existing symref presumably wants to make sure that the "old" value it read hasn't changed, but ensuring the object name (obtained by reading the ref that is pointed by the symref by dereferencing) are the same is not the right way to ensure nobody raced with us in the meantime anyway (we should rather be making sure that the symref is still pointing at the same ref), so in that sense, in the context of acquiring the lock, old oid value is meaningless for symrefs. This patch is a strict improvement as the behaviour for REF_DELETING case is unchanged by it (an idempotent resolve-ref-unsafe may be called one more time in some cases), and other cases are better, I think. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v3 2/2] lock_ref_sha1_basic: handle REF_NODEREF with invalid refs 2016-01-12 19:41 ` Junio C Hamano @ 2016-01-12 20:22 ` Jeff King 2016-01-12 20:42 ` Jeff King 0 siblings, 1 reply; 23+ messages in thread From: Jeff King @ 2016-01-12 20:22 UTC (permalink / raw) To: Junio C Hamano; +Cc: git, Michael Haggerty On Tue, Jan 12, 2016 at 11:41:17AM -0800, Junio C Hamano wrote: > Jeff King <peff@peff.net> writes: > > > I also notice that if we are deleting, we _do_ set > > RESOLVE_REF_NO_RECURSE from the very beginning, which means we would > > generally not get a valid lock->old_oid.hash for a symref. But I'm not > > sure what it would mean to delete a symref while asking for its current > > value (it cannot have one!). So I don't think it is a bug. > > I started scratching my head after noticing that the NO_RECURSE bit > set in the DELETING codepath before reading the above, and I am > still doing so. > > A transaction that attempts to delete an existing symref presumably > wants to make sure that the "old" value it read hasn't changed, but > ensuring the object name (obtained by reading the ref that is > pointed by the symref by dereferencing) are the same is not the > right way to ensure nobody raced with us in the meantime anyway (we > should rather be making sure that the symref is still pointing at > the same ref), so in that sense, in the context of acquiring the > lock, old oid value is meaningless for symrefs. Right, that's the point I was trying to make. Though I think there is something even more subtle going in (see the end of this message). In theory you might want the old sha1 for logging purposes, but since we delete the reflog along with the symref, I don't think it matters there, either. I'm not sure we actually delete symrefs very often, though. Grepping for delete_ref and REF_NODEREF shows the callers expecting symrefs to mostly be "git remote", which never passes in an old_sha1. The only caller which does so is "branch -d", but I think it doesn't affect symrefs. It gets the "old" sha1 by calling resolve_ref_unsafe() itself with RESOLVE_REF_NO_RECURSE, so it will unconditionally remove a symref you ask it to, even if somebody else raced and put something in it. You can also call "update-ref --no-ref -d" with an "old" sha1, but I doubt anyone ever does so. > This patch is a strict improvement as the behaviour for REF_DELETING > case is unchanged by it (an idempotent resolve-ref-unsafe may be > called one more time in some cases), and other cases are better, I > think. Yeah. My gut feeling is that the REF_DELETING special-handling of REF_NODEREF could just be folded into what I've added in this series. But absent a case that is demonstrably broken, I'm inclined not to muck with it too much. I had thought that this: git init git commit --allow-empty -m foo git symbolic-ref refs/foo refs/heads/master old=$(git rev-parse foo) git update-ref --no-deref -d refs/foo $old might trigger a problem (because reading refs/foo with NODEREF will give us a blank sha1 to compare against). But of course that is nonsense. The actual lock verification is not done by this initial resolve_ref. It happens _after_ we take the lock (as it must to avoid races), when verify_lock() calls read_ref_full(). But now I'm doubly confused. When we call read_ref_full(), it is _also_ into lock->old_oid.hash. Which should be overwriting what the earlier resolve_ref_unsafe() wrote into it. Which would mean my whole commit is wrong; we can just unconditionally do a non-recursive resolution in the first place. But when I did so, I ended up with funny reflog values (which is why I wrote the patch as I did). Let me try to dig a little further into that case and see what is going on. -Peff ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH v3 2/2] lock_ref_sha1_basic: handle REF_NODEREF with invalid refs 2016-01-12 20:22 ` Jeff King @ 2016-01-12 20:42 ` Jeff King 2016-01-12 21:43 ` [PATCH v4 0/3] fix corner cases with lock_ref_sha1_basic Jeff King 0 siblings, 1 reply; 23+ messages in thread From: Jeff King @ 2016-01-12 20:42 UTC (permalink / raw) To: Junio C Hamano; +Cc: git, Michael Haggerty On Tue, Jan 12, 2016 at 03:22:51PM -0500, Jeff King wrote: > I had thought that this: > > git init > git commit --allow-empty -m foo > git symbolic-ref refs/foo refs/heads/master > old=$(git rev-parse foo) > git update-ref --no-deref -d refs/foo $old > > might trigger a problem (because reading refs/foo with NODEREF will give > us a blank sha1 to compare against). But of course that is nonsense. The > actual lock verification is not done by this initial resolve_ref. It > happens _after_ we take the lock (as it must to avoid races), when > verify_lock() calls read_ref_full(). > > But now I'm doubly confused. When we call read_ref_full(), it is _also_ > into lock->old_oid.hash. Which should be overwriting what the earlier > resolve_ref_unsafe() wrote into it. Which would mean my whole commit is > wrong; we can just unconditionally do a non-recursive resolution in the > first place. But when I did so, I ended up with funny reflog values > (which is why I wrote the patch as I did). > > Let me try to dig a little further into that case and see what is going > on. Ah, I see. When calling git-symbolic-ref, we don't provide an old_sha1, and therefore never call verify_lock(). And we get whatever value in lock->old_oid we happened to read earlier in resolve_ref_unsafe(). Which happened outside of a lock. Yikes. It seems like we could racily write the wrong reflog entry in such a case. So I think we'd want something like this: diff --git a/refs/files-backend.c b/refs/files-backend.c index 291b18d..c6ce503 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -1845,7 +1845,7 @@ static int verify_lock(struct ref_lock *lock, errno = save_errno; return -1; } - if (hashcmp(lock->old_oid.hash, old_sha1)) { + if (old_sha1 && hashcmp(lock->old_oid.hash, old_sha1)) { strbuf_addf(err, "ref %s is at %s but expected %s", lock->ref_name, sha1_to_hex(lock->old_oid.hash), @@ -2008,7 +1983,7 @@ static struct ref_lock *lock_ref_sha1_basic(const char *refname, goto error_return; } } - if (old_sha1 && verify_lock(lock, old_sha1, mustexist, err)) { + if (verify_lock(lock, old_sha1, mustexist, err)) { last_errno = errno; goto error_return; } to make sure that the value in lock->old_oid always comes from what we read under the lock. And then the resolve_ref() calls in lock_ref_sha1_basic() really don't matter. They are just about making sure there is space to create the lockfile. The patch above is not quite right; I'll work up a series that takes this approach. -Peff ^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH v4 0/3] fix corner cases with lock_ref_sha1_basic 2016-01-12 20:42 ` Jeff King @ 2016-01-12 21:43 ` Jeff King 2016-01-12 21:44 ` [PATCH 1/3] checkout,clone: check return value of create_symref Jeff King ` (2 more replies) 0 siblings, 3 replies; 23+ messages in thread From: Jeff King @ 2016-01-12 21:43 UTC (permalink / raw) To: Junio C Hamano; +Cc: git, Michael Haggerty On Tue, Jan 12, 2016 at 03:42:29PM -0500, Jeff King wrote: > Ah, I see. When calling git-symbolic-ref, we don't provide an old_sha1, > and therefore never call verify_lock(). And we get whatever value in > lock->old_oid we happened to read earlier in resolve_ref_unsafe(). Which > happened outside of a lock. Yikes. It seems like we could racily write > the wrong reflog entry in such a case. > > [...] > The patch above is not quite right; I'll work up a series that takes > this approach. OK, here it is. This replaces the top two patches of jk/symbolic-ref (i.e., everything in this thread I've sent in the last day or two). Besides fixing the race (which is detailed in patch 2/3), I think the resulting 3/3 is much cleaner. Sorry for all the false starts. The more I looked at this, the more complex it seemed to get. But I _think_ this is the right solution. :) [1/3]: checkout,clone: check return value of create_symref [2/3]: lock_ref_sha1_basic: always fill old_oid while holding lock [3/3]: lock_ref_sha1_basic: handle REF_NODEREF with invalid refs -Peff ^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH 1/3] checkout,clone: check return value of create_symref 2016-01-12 21:43 ` [PATCH v4 0/3] fix corner cases with lock_ref_sha1_basic Jeff King @ 2016-01-12 21:44 ` Jeff King 2016-01-12 21:44 ` [PATCH 2/3] lock_ref_sha1_basic: always fill old_oid while holding lock Jeff King 2016-01-12 21:45 ` [PATCH 3/3] lock_ref_sha1_basic: handle REF_NODEREF with invalid refs Jeff King 2 siblings, 0 replies; 23+ messages in thread From: Jeff King @ 2016-01-12 21:44 UTC (permalink / raw) To: Junio C Hamano; +Cc: git, Michael Haggerty It's unlikely that we would fail to create or update a symbolic ref (especially HEAD), but if we do, we should notice and complain. Note that there's no need to give more details in our error message; create_symref will already have done so. While we're here, let's also fix a minor memory leak in clone. Signed-off-by: Jeff King <peff@peff.net> --- Same as before (but with the extra test added in v2, in case you didn't that up yet). builtin/checkout.c | 3 ++- builtin/clone.c | 11 +++++++---- t/t2011-checkout-invalid-head.sh | 6 ++++++ 3 files changed, 15 insertions(+), 5 deletions(-) diff --git a/builtin/checkout.c b/builtin/checkout.c index e8110a9..5af84a3 100644 --- a/builtin/checkout.c +++ b/builtin/checkout.c @@ -661,7 +661,8 @@ static void update_refs_for_switch(const struct checkout_opts *opts, describe_detached_head(_("HEAD is now at"), new->commit); } } else if (new->path) { /* Switch branches. */ - create_symref("HEAD", new->path, msg.buf); + if (create_symref("HEAD", new->path, msg.buf) < 0) + die("unable to update HEAD"); if (!opts->quiet) { if (old->path && !strcmp(new->path, old->path)) { if (opts->new_branch_force) diff --git a/builtin/clone.c b/builtin/clone.c index a0b3cd9..a7c8def 100644 --- a/builtin/clone.c +++ b/builtin/clone.c @@ -636,9 +636,11 @@ static void update_remote_refs(const struct ref *refs, struct strbuf head_ref = STRBUF_INIT; strbuf_addstr(&head_ref, branch_top); strbuf_addstr(&head_ref, "HEAD"); - create_symref(head_ref.buf, - remote_head_points_at->peer_ref->name, - msg); + if (create_symref(head_ref.buf, + remote_head_points_at->peer_ref->name, + msg) < 0) + die("unable to update %s", head_ref.buf); + strbuf_release(&head_ref); } } @@ -648,7 +650,8 @@ static void update_head(const struct ref *our, const struct ref *remote, const char *head; if (our && skip_prefix(our->name, "refs/heads/", &head)) { /* Local default branch link */ - create_symref("HEAD", our->name, NULL); + if (create_symref("HEAD", our->name, NULL) < 0) + die("unable to update HEAD"); if (!option_bare) { update_ref(msg, "HEAD", our->old_oid.hash, NULL, 0, UPDATE_REFS_DIE_ON_ERR); diff --git a/t/t2011-checkout-invalid-head.sh b/t/t2011-checkout-invalid-head.sh index 300f8bf..d444d5e 100755 --- a/t/t2011-checkout-invalid-head.sh +++ b/t/t2011-checkout-invalid-head.sh @@ -19,4 +19,10 @@ test_expect_success 'checkout master from invalid HEAD' ' git checkout master -- ' +test_expect_success 'checkout notices failure to lock HEAD' ' + test_when_finished "rm -f .git/HEAD.lock" && + >.git/HEAD.lock && + test_must_fail git checkout -b other +' + test_done -- 2.7.0.368.g04bc9ee ^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH 2/3] lock_ref_sha1_basic: always fill old_oid while holding lock 2016-01-12 21:43 ` [PATCH v4 0/3] fix corner cases with lock_ref_sha1_basic Jeff King 2016-01-12 21:44 ` [PATCH 1/3] checkout,clone: check return value of create_symref Jeff King @ 2016-01-12 21:44 ` Jeff King 2016-01-13 1:25 ` Eric Sunshine 2016-01-12 21:45 ` [PATCH 3/3] lock_ref_sha1_basic: handle REF_NODEREF with invalid refs Jeff King 2 siblings, 1 reply; 23+ messages in thread From: Jeff King @ 2016-01-12 21:44 UTC (permalink / raw) To: Junio C Hamano; +Cc: git, Michael Haggerty Our basic strategy for taking a ref lock is: 1. Create $ref.lock to take the lock 2. Read the ref again while holding the lock (during which time we know that nobody else can be updating it). 3. Compare the value we read to the expected "old_sha1" The value we read in step (2) is returned to the caller via the lock->old_oid field, who may use it for other purposes (such as writing a reflog). If we have no "old_sha1" (i.e., we are unconditionally taking the lock), then we obviously must omit step 3. But we _also_ omit step 2. This seems like a nice optimization, but it means that the caller sees only whatever was left in lock->old_oid from previous calls to resolve_ref_unsafe(), which happened outside of the lock. We can demonstrate this race pretty easily. Imagine you have three commits, $one, $two, and $three. One script just flips between $one and $two, without providing an old-sha1: while true; do git update-ref -m one refs/heads/foo $one git update-ref -m two refs/heads/foo $two done Meanwhile, another script tries to set the value to $three, also not using an old-sha1: while true; do git update-ref -m three refs/heads/foo $three done If these run simultaneously, we'll see a lot of lock contention, but each of the writes will succeed some of the time. The reflog may record movements between any of the three refs, but we would expect it to provide a consistent log: the "from" field of each log entry should be the same as the "two" field of the previous one. But if we check this: perl -alne ' print "mismatch on line $." if defined $last && $F[0] ne $last; $last = $F[1]; ' .git/logs/refs/heads/foo we'll see many mismatches. Why? Because sometimes, in the time between lock_ref_sha1_basic filling lock->old_oid via resolve_ref_unsafe() and it taking the lock, there may be a complete write by another process. And the "from" field in our reflog entry will be wrong, and will refer to an older value. This is probably quite rare in practice. It requires writers which do not provide an old-sha1 value, and it is a very quick race. However, it is easy to fix: we simply perform step (2), the read-under-lock, whether we have an old-sha1 or not. Then the value we hand back to the caller is always atomic. Signed-off-by: Jeff King <peff@peff.net> --- refs/files-backend.c | 17 +++++++++++------ 1 file changed, 11 insertions(+), 6 deletions(-) diff --git a/refs/files-backend.c b/refs/files-backend.c index 180c837..69c3ecf 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -1840,12 +1840,17 @@ static int verify_lock(struct ref_lock *lock, if (read_ref_full(lock->ref_name, mustexist ? RESOLVE_REF_READING : 0, lock->old_oid.hash, NULL)) { - int save_errno = errno; - strbuf_addf(err, "can't verify ref %s", lock->ref_name); - errno = save_errno; - return -1; + if (old_sha1) { + int save_errno = errno; + strbuf_addf(err, "can't verify ref %s", lock->ref_name); + errno = save_errno; + return -1; + } else { + hashclr(lock->old_oid.hash); + return 0; + } } - if (hashcmp(lock->old_oid.hash, old_sha1)) { + if (old_sha1 && hashcmp(lock->old_oid.hash, old_sha1)) { strbuf_addf(err, "ref %s is at %s but expected %s", lock->ref_name, sha1_to_hex(lock->old_oid.hash), @@ -1985,7 +1990,7 @@ static struct ref_lock *lock_ref_sha1_basic(const char *refname, goto error_return; } } - if (old_sha1 && verify_lock(lock, old_sha1, mustexist, err)) { + if (verify_lock(lock, old_sha1, mustexist, err)) { last_errno = errno; goto error_return; } -- 2.7.0.368.g04bc9ee ^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: [PATCH 2/3] lock_ref_sha1_basic: always fill old_oid while holding lock 2016-01-12 21:44 ` [PATCH 2/3] lock_ref_sha1_basic: always fill old_oid while holding lock Jeff King @ 2016-01-13 1:25 ` Eric Sunshine 2016-01-13 11:38 ` Jeff King 0 siblings, 1 reply; 23+ messages in thread From: Eric Sunshine @ 2016-01-13 1:25 UTC (permalink / raw) To: Jeff King; +Cc: Junio C Hamano, Git List, Michael Haggerty On Tue, Jan 12, 2016 at 4:44 PM, Jeff King <peff@peff.net> wrote: > Our basic strategy for taking a ref lock is: > [...] > If these run simultaneously, we'll see a lot of lock > contention, but each of the writes will succeed some of the > time. The reflog may record movements between any of the > three refs, but we would expect it to provide a consistent > log: the "from" field of each log entry should be the same > as the "two" field of the previous one. s/two/to/ ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 2/3] lock_ref_sha1_basic: always fill old_oid while holding lock 2016-01-13 1:25 ` Eric Sunshine @ 2016-01-13 11:38 ` Jeff King 0 siblings, 0 replies; 23+ messages in thread From: Jeff King @ 2016-01-13 11:38 UTC (permalink / raw) To: Eric Sunshine; +Cc: Junio C Hamano, Git List, Michael Haggerty On Tue, Jan 12, 2016 at 08:25:45PM -0500, Eric Sunshine wrote: > On Tue, Jan 12, 2016 at 4:44 PM, Jeff King <peff@peff.net> wrote: > > Our basic strategy for taking a ref lock is: > > [...] > > If these run simultaneously, we'll see a lot of lock > > contention, but each of the writes will succeed some of the > > time. The reflog may record movements between any of the > > three refs, but we would expect it to provide a consistent > > log: the "from" field of each log entry should be the same > > as the "two" field of the previous one. > > s/two/to/ Whoops. Was tweaking my scripts "one" and "two" to test the race while I wrote up the commit message. :) -Peff ^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH 3/3] lock_ref_sha1_basic: handle REF_NODEREF with invalid refs 2016-01-12 21:43 ` [PATCH v4 0/3] fix corner cases with lock_ref_sha1_basic Jeff King 2016-01-12 21:44 ` [PATCH 1/3] checkout,clone: check return value of create_symref Jeff King 2016-01-12 21:44 ` [PATCH 2/3] lock_ref_sha1_basic: always fill old_oid while holding lock Jeff King @ 2016-01-12 21:45 ` Jeff King 2 siblings, 0 replies; 23+ messages in thread From: Jeff King @ 2016-01-12 21:45 UTC (permalink / raw) To: Junio C Hamano; +Cc: git, Michael Haggerty We sometimes call lock_ref_sha1_basic with REF_NODEREF to operate directly on a symbolic ref. This is used, for example, to move to a detached HEAD, or when updating the contents of HEAD via checkout or symbolic-ref. However, the first step of the function is to resolve the refname to get the "old" sha1, and we do so without telling resolve_ref_unsafe() that we are only interested in the symref. As a result, we may detect a problem there not with the symref itself, but with something it points to. The real-world example I found (and what is used in the test suite) is a HEAD pointing to a ref that cannot exist, because it would cause a directory/file conflict with other existing refs. This situation is somewhat broken, of course, as trying to _commit_ on that HEAD would fail. But it's not explicitly forbidden, and we should be able to move away from it. However, neither "git checkout" nor "git symbolic-ref" can do so. We try to take the lock on HEAD, which is pointing to a non-existent ref. We bail from resolve_ref_unsafe() with errno set to EISDIR, and the lock code thinks we are attempting to create a d/f conflict. Of course we're not. The problem is that the lock code has no idea what level we were at when we got EISDIR, so trying to diagnose or remove empty directories for HEAD is not useful. To make things even more complicated, we only get EISDIR in the loose-ref case. If the refs are packed, the resolution may "succeed", giving us the pointed-to ref in "refname", but a null oid. Later, we say "ah, the null oid means we are creating; let's make sure there is room for it", but mistakenly check against the _resolved_ refname, not the original. We can fix this by making two tweaks: 1. Call resolve_ref_unsafe() with RESOLVE_REF_NO_RECURSE when REF_NODEREF is set. This means any errors we get will be from the orig_refname, and we can act accordingly. We already do this in the REF_DELETING case, but we should do it for update, too. 2. If we do get a "refname" return from resolve_ref_unsafe(), even with RESOLVE_REF_NO_RECURSE it may be the name of the ref pointed-to by a symref. We already normalize this back to orig_refname before taking the lockfile, but we need to do so before the null_oid check. While we're rearranging the REF_NODEREF handling, we can also bump the initialization of lflags to the top of the function, where we are setting up other flags. This saves us from having yet another conditional block on REF_NODEREF just to set it later. Signed-off-by: Jeff King <peff@peff.net> --- refs/files-backend.c | 19 ++++++++++--------- t/t1401-symbolic-ref.sh | 7 +++++++ t/t2011-checkout-invalid-head.sh | 33 +++++++++++++++++++++++++++++++++ 3 files changed, 50 insertions(+), 9 deletions(-) diff --git a/refs/files-backend.c b/refs/files-backend.c index 69c3ecf..81c92b4 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -1887,7 +1887,8 @@ static struct ref_lock *lock_ref_sha1_basic(const char *refname, const char *orig_refname = refname; struct ref_lock *lock; int last_errno = 0; - int type, lflags; + int type; + int lflags = 0; int mustexist = (old_sha1 && !is_null_sha1(old_sha1)); int resolve_flags = 0; int attempts_remaining = 3; @@ -1898,10 +1899,11 @@ static struct ref_lock *lock_ref_sha1_basic(const char *refname, if (mustexist) resolve_flags |= RESOLVE_REF_READING; - if (flags & REF_DELETING) { + if (flags & REF_DELETING) resolve_flags |= RESOLVE_REF_ALLOW_BAD_NAME; - if (flags & REF_NODEREF) - resolve_flags |= RESOLVE_REF_NO_RECURSE; + if (flags & REF_NODEREF) { + resolve_flags |= RESOLVE_REF_NO_RECURSE; + lflags |= LOCK_NO_DEREF; } refname = resolve_ref_unsafe(refname, resolve_flags, @@ -1937,6 +1939,10 @@ static struct ref_lock *lock_ref_sha1_basic(const char *refname, goto error_return; } + + if (flags & REF_NODEREF) + refname = orig_refname; + /* * If the ref did not exist and we are creating it, make sure * there is no existing packed ref whose name begins with our @@ -1952,11 +1958,6 @@ static struct ref_lock *lock_ref_sha1_basic(const char *refname, lock->lk = xcalloc(1, sizeof(struct lock_file)); - lflags = 0; - if (flags & REF_NODEREF) { - refname = orig_refname; - lflags |= LOCK_NO_DEREF; - } lock->ref_name = xstrdup(refname); lock->orig_ref_name = xstrdup(orig_refname); strbuf_git_path(&ref_file, "%s", refname); diff --git a/t/t1401-symbolic-ref.sh b/t/t1401-symbolic-ref.sh index 5db876c..a713766 100755 --- a/t/t1401-symbolic-ref.sh +++ b/t/t1401-symbolic-ref.sh @@ -122,4 +122,11 @@ test_expect_success 'symbolic-ref does not create ref d/f conflicts' ' test_must_fail git symbolic-ref refs/heads/df/conflict refs/heads/df ' +test_expect_success 'symbolic-ref handles existing pointer to invalid name' ' + head=$(git rev-parse HEAD) && + git symbolic-ref HEAD refs/heads/outer && + git update-ref refs/heads/outer/inner $head && + git symbolic-ref HEAD refs/heads/unrelated +' + test_done diff --git a/t/t2011-checkout-invalid-head.sh b/t/t2011-checkout-invalid-head.sh index d444d5e..c5501b0 100755 --- a/t/t2011-checkout-invalid-head.sh +++ b/t/t2011-checkout-invalid-head.sh @@ -25,4 +25,37 @@ test_expect_success 'checkout notices failure to lock HEAD' ' test_must_fail git checkout -b other ' +test_expect_success 'create ref directory/file conflict scenario' ' + git update-ref refs/heads/outer/inner master && + + # do not rely on symbolic-ref to get a known state, + # as it may use the same code we are testing + reset_to_df () { + echo "ref: refs/heads/outer" >.git/HEAD + } +' + +test_expect_success 'checkout away from d/f HEAD (unpacked, to branch)' ' + reset_to_df && + git checkout master +' + +test_expect_success 'checkout away from d/f HEAD (unpacked, to detached)' ' + reset_to_df && + git checkout --detach master +' + +test_expect_success 'pack refs' ' + git pack-refs --all --prune +' + +test_expect_success 'checkout away from d/f HEAD (packed, to branch)' ' + reset_to_df && + git checkout master +' + +test_expect_success 'checkout away from d/f HEAD (packed, to detached)' ' + reset_to_df && + git checkout --detach master +' test_done -- 2.7.0.368.g04bc9ee ^ permalink raw reply related [flat|nested] 23+ messages in thread
end of thread, other threads:[~2016-01-13 11:38 UTC | newest] Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2016-01-11 15:46 [PATCH 0/2] fix corner case with lock_ref_sha1_basic and REF_NODEREF Jeff King 2016-01-11 15:49 ` [PATCH 1/2] checkout,clone: check return value of create_symref Jeff King 2016-01-12 4:09 ` Michael Haggerty 2016-01-12 9:49 ` Jeff King 2016-01-11 15:52 ` [PATCH 2/2] lock_ref_sha1_basic: handle REF_NODEREF with invalid refs Jeff King 2016-01-11 18:28 ` Junio C Hamano 2016-01-12 4:55 ` Michael Haggerty 2016-01-12 9:52 ` Jeff King 2016-01-12 18:11 ` Junio C Hamano 2016-01-12 9:56 ` [PATCH v2 0/2] fix corner case with lock_ref_sha1_basic and REF_NODEREF Jeff King 2016-01-12 9:57 ` [PATCH v2 1/2] checkout,clone: check return value of create_symref Jeff King 2016-01-12 9:58 ` [PATCH v2 2/2] lock_ref_sha1_basic: handle REF_NODEREF with invalid refs Jeff King 2016-01-12 13:26 ` Jeff King 2016-01-12 13:55 ` [PATCH v3 " Jeff King 2016-01-12 19:41 ` Junio C Hamano 2016-01-12 20:22 ` Jeff King 2016-01-12 20:42 ` Jeff King 2016-01-12 21:43 ` [PATCH v4 0/3] fix corner cases with lock_ref_sha1_basic Jeff King 2016-01-12 21:44 ` [PATCH 1/3] checkout,clone: check return value of create_symref Jeff King 2016-01-12 21:44 ` [PATCH 2/3] lock_ref_sha1_basic: always fill old_oid while holding lock Jeff King 2016-01-13 1:25 ` Eric Sunshine 2016-01-13 11:38 ` Jeff King 2016-01-12 21:45 ` [PATCH 3/3] lock_ref_sha1_basic: handle REF_NODEREF with invalid refs Jeff King
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.