* die("bad object.. for duplicate tagged tag in remote @ 2017-05-19 17:28 Chris West 2017-05-20 8:03 ` Jeff King 0 siblings, 1 reply; 3+ messages in thread From: Chris West @ 2017-05-19 17:28 UTC (permalink / raw) To: git [-- Attachment #1: Type: text/plain, Size: 1709 bytes --] Bear with me here, I hit this in a real repo. If you have an annotated tag of an annotated tag, and `remote update` elects not to fetch this tag (perhaps because it has a name collision locally), then the repo ends up corrupt: you can't gc it, but fsck doesn't notice. Two repos, named "bad" and "good": bad$ git tag -a inner bad$ git tag -a outer inner bad$ git tag -d inner bad$ git show outer tag outer Tagger: ... Date: ... This is the outer tag. tag inner Tagger: ... Date: ... This is the inner tag. commit 826365dcfec304a80b227a990f7d5c805bce3dd9 Author: ... ... bad$ git rev-parse outer 070707.. bad$ git cat-file tag outer object 03030303... good$ git tag -a outer # create a colliding tag good$ git remote add bad ../bad good$ git remote update warning: no common commits remote: Counting objects: 4, done. remote: Compressing objects: 100% (2/2), done. remote: Total 4 (delta 0), reused 0 (delta 0) Unpacking objects: 100% (4/4), done. From ../bad * [new branch] master -> bad/master Note how it has not fetched the tag ref, but it has fetched one of the tag objects: $ git show 07070 error: Could not read object 0303030.. tag outer Tagger: ... $ git fsck ... dangling tag 07070... I actually don't get that on the real repo, but do on this testcase. Hmm. `git fsck` exits with success here. This is bad, as the object graph is incomplete? $ git gc fatal: bad object 03030303... error: failed to run repack `git gc` fails with this meaningless error. The attached patch improves the error. I don't know where the rest of the problem lies. What's the expected behaviour when a tag already exists locally, but is different? Fetch the object anyway, but ignore it? [-- Attachment #2: 0001-print-tag-id-when-tagged-object-is-bad.patch --] [-- Type: text/x-diff, Size: 745 bytes --] From aa861789077012f78605431e1a1f191292693325 Mon Sep 17 00:00:00 2001 From: "Chris West (Faux)" <git@goeswhere.com> Date: Fri, 19 May 2017 19:24:03 +0200 Subject: [PATCH] print tag id when tagged object is bad --- revision.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/revision.c b/revision.c index 8a8c178..22b6021 100644 --- a/revision.c +++ b/revision.c @@ -232,7 +232,8 @@ static struct commit *handle_commit(struct rev_info *revs, if (!object) { if (flags & UNINTERESTING) return NULL; - die("bad object %s", oid_to_hex(&tag->tagged->oid)); + die("bad tagged object %s in %s", oid_to_hex(&tag->tagged->oid), + oid_to_hex(&tag->object.oid)); } object->flags |= flags; /* -- 2.7.4 ^ permalink raw reply related [flat|nested] 3+ messages in thread
* Re: die("bad object.. for duplicate tagged tag in remote 2017-05-19 17:28 die("bad object.. for duplicate tagged tag in remote Chris West @ 2017-05-20 8:03 ` Jeff King 2017-05-20 8:30 ` [PATCH] revision.c: ignore broken tags with ignore_missing_links Jeff King 0 siblings, 1 reply; 3+ messages in thread From: Jeff King @ 2017-05-20 8:03 UTC (permalink / raw) To: Chris West; +Cc: git On Fri, May 19, 2017 at 06:28:56PM +0100, Chris West wrote: > If you have an annotated tag of an annotated tag, and `remote update` > elects not to fetch this tag (perhaps because it has a name collision > locally), then the repo ends up corrupt: you can't gc it, but fsck > doesn't notice. > > Two repos, named "bad" and "good": > [...] What version of git are you using? If I run this script: -- >8 -- #!/bin/sh rm -rf good bad git init bad cd bad git commit --allow-empty -m bad git tag -m 'bad inner' inner git tag -m 'bad outer' outer inner outer=$(git -C ../bad rev-parse outer) inner=$(git -C ../bad rev-parse inner) git tag -d inner cd .. git init good cd good git commit --allow-empty -m good git tag -m 'good outer' outer git remote add bad ../bad git fetch bad echo "===> outer is $outer" git cat-file tag $outer echo "===> inner is $inner" git cat-file tag $inner -- >8 -- then prior to Git v2.10.1, the final cat-file fails. But after it is fine. This is due to b773ddea2 (pack-objects: walk tag chains for --include-tag, 2016-09-05), which dealt with this exact tag-of-tag case. In the real world, it would depend on which version of Git the server is running (the fix is on the pack-objects side). There's another interesting thing going on with the fsck/gc thing, though. The fetching repo isn't actually corrupt. The guarantee that Git makes is that we have the complete graph of anything that's reachable from a ref, not that we might not have stray objects (though it does try to avoid breaking even unreachable parts of history, as I'll explain in a moment). And that's what's happening here; the client gets "outer" but it's not actually reachable. So what happens in your case in more detail is: 1. git-fetch sends the include-tag capability to the server, asking it to include tags that point to whatever we're fetching (master in this case) 2. The server sees that "outer" eventually points to what the client is fetching and adds it. It doesn't do the same for "inner" because it no longer has a tag ref pointing at it. And because it is pre-v2.10.1, it doesn't walk the full chain of "outer", and so never considers "inner" at all. 3. The next step would normally be for git-fetch to realize that it's missing an object and backfill tags with a followup request. But since it already has its own unrelated "outer", it knows it doesn't need "outer" and doesn't bother looking at it. So now we have "outer" in the object store (because the server thought we might need it), but we never actually pointed a ref at it. And that explains your fsck result: > $ git fsck > ... > dangling tag 07070... We have the object, but nobody points to it. > I actually don't get that on the real repo, but do on this testcase. Hmm. > `git fsck` exits with success here. This is bad, as the object graph is > incomplete? No, that outcome is correct. The interesting thing is that your real-world case probably _does_ have a ref pointing at it (if it's not getting a dangling-tag). I don't know how that got there, though. The original case that motivated the fix in v2.10.1 was cloning with a single branch, like: git clone --single-branch --no-local bad broken but that results in the clone failing, not a corrupt repo. Is it possible you or somebody else then ran something like: git update-ref refs/tags/other-outer $outer_sha1 after the fetch? That would reference the broken part of the graph, and the repository is corrupt at that point. > $ git gc > fatal: bad object 03030303... > error: failed to run repack So this is where I think there might be room for improvement, even with current versions of git. Traditionally, we wouldn't try to traverse or pack that unreferenced part of the object graph. But since v2.2.0, we traverse any objects that are "recent" (within the 2-week prune-expiration timestamp) to try to keep whole chunks of the graph intact (ironically, to prevent problems like the update-ref I showed above). We use the "ignore_missing_links" flag to tell the traversal that this is best-effort (i.e., we try to retain unreachable history if we can, but if it's already broken there's nothing we can do). So I wouldn't be surprised to find that we correctly respect that flag when following parent and tree links, but not in tags. > diff --git a/revision.c b/revision.c > index 8a8c178..22b6021 100644 > --- a/revision.c > +++ b/revision.c > @@ -232,7 +232,8 @@ static struct commit *handle_commit(struct rev_info *revs, > if (!object) { > if (flags & UNINTERESTING) > return NULL; > - die("bad object %s", oid_to_hex(&tag->tagged->oid)); > + die("bad tagged object %s in %s", oid_to_hex(&tag->tagged->oid), > + oid_to_hex(&tag->object.oid)); > } I agree this is an improvement. And that "if" in the context lines is almost certainly the culprit. It should also trigger when revs->ignore_missing_links is set. -Peff ^ permalink raw reply [flat|nested] 3+ messages in thread
* [PATCH] revision.c: ignore broken tags with ignore_missing_links 2017-05-20 8:03 ` Jeff King @ 2017-05-20 8:30 ` Jeff King 0 siblings, 0 replies; 3+ messages in thread From: Jeff King @ 2017-05-20 8:30 UTC (permalink / raw) To: Chris West; +Cc: git When peeling a tag for prepare_revision_walk(), we do not respect the ignore_missing_links flag. This can lead to a bogus error when pack-objects walks the possibly-broken unreachable-but-recent part of the object graph. The other link-following all happens via traverse_commit_list(), which explains why this case was missed. And our tests covered only broken links from commits. Let's be more comprehensive and cover broken tree entries (which do work) and tags (which shows off this bug). Signed-off-by: Jeff King <peff@peff.net> --- So here's the fix I showed earlier polished up as a real patch. I still think your patch to improve the error message is a good suggestion. I was going to just re-send it out on top of this, but I realized it was missing your signoff. Do you want to resend? revision.c | 2 +- t/t6501-freshen-objects.sh | 27 ++++++++++++++++++++++++++- 2 files changed, 27 insertions(+), 2 deletions(-) diff --git a/revision.c b/revision.c index 8a8c1789c..610638f2e 100644 --- a/revision.c +++ b/revision.c @@ -230,7 +230,7 @@ static struct commit *handle_commit(struct rev_info *revs, die("bad tag"); object = parse_object(tag->tagged->oid.hash); if (!object) { - if (flags & UNINTERESTING) + if (revs->ignore_missing_links || (flags & UNINTERESTING)) return NULL; die("bad object %s", oid_to_hex(&tag->tagged->oid)); } diff --git a/t/t6501-freshen-objects.sh b/t/t6501-freshen-objects.sh index cf076dcd9..394b169ea 100755 --- a/t/t6501-freshen-objects.sh +++ b/t/t6501-freshen-objects.sh @@ -129,7 +129,7 @@ for repack in '' true; do ' done -test_expect_success 'do not complain about existing broken links' ' +test_expect_success 'do not complain about existing broken links (commit)' ' cat >broken-commit <<-\EOF && tree 0000000000000000000000000000000000000001 parent 0000000000000000000000000000000000000002 @@ -144,4 +144,29 @@ test_expect_success 'do not complain about existing broken links' ' test_must_be_empty stderr ' +test_expect_success 'do not complain about existing broken links (tree)' ' + cat >broken-tree <<-\EOF && + 100644 blob 0000000000000000000000000000000000000003 foo + EOF + tree=$(git mktree --missing <broken-tree) && + git gc 2>stderr && + git cat-file -e $tree && + test_must_be_empty stderr +' + +test_expect_success 'do not complain about existing broken links (tag)' ' + cat >broken-tag <<-\EOF && + object 0000000000000000000000000000000000000004 + type commit + tag broken + tagger whatever <whatever@example.com> 1234 -0000 + + this is a broken tag + EOF + tag=$(git hash-object -t tag -w broken-tag) && + git gc 2>stderr && + git cat-file -e $tag && + test_must_be_empty stderr +' + test_done -- 2.13.0.234.g029c1392a ^ permalink raw reply related [flat|nested] 3+ messages in thread
end of thread, other threads:[~2017-05-20 8:30 UTC | newest] Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2017-05-19 17:28 die("bad object.. for duplicate tagged tag in remote Chris West 2017-05-20 8:03 ` Jeff King 2017-05-20 8:30 ` [PATCH] revision.c: ignore broken tags with ignore_missing_links Jeff King
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.