All of lore.kernel.org
 help / color / mirror / Atom feed
* BUG: fetch in certain repo always gives "did not send all necessary objects"
@ 2018-02-06 23:04 Elijah Newren
  2018-02-06 23:20 ` Stefan Beller
  0 siblings, 1 reply; 10+ messages in thread
From: Elijah Newren @ 2018-02-06 23:04 UTC (permalink / raw)
  To: Git Mailing List, Nguyễn Thái Ngọc
  Cc: Jeff King, Stefan Beller

[I cannot share the local repo and had to modify output of commands
slightly to redact hostnames, branch names, etc.  I think I haven't
messed anything up, but it's possible...]

Two people in the last week have come to me after running into a case
where they could not update their repo because any call to git fetch
or git pull would error out with

  fatal: bad object HEAD
  error: gerrit.server:reponame did not send all necessary objects"

which apparently has been reported before at both of:

https://public-inbox.org/git/CAE5ih78nLL6UhKPObvFEA9xQZUtc1XpPvGJNaYTH9fJ0RyFRvA@mail.gmail.com/
and

https://public-inbox.org/git/CAGZ79kYP0z1G_H3nwfmSHraWHMBOcik5LepUXKj0nveeBrihiw@mail.gmail.com/

but without a resolution and it appears to have been difficult to
reproduce.  I can reproduce reliably with a copy of one of their
repos, and have dug in a little bit and may be able to provide more
info.  The pieces I've found that look relevant:

* Both users were using git-2.16.1 (suggesting they update and try new
things often)
* "git fsck --no-dangling" reports no problem on either client or server.
* The error bisects to commit d0c39a49cc ("revision.c: --all adds HEAD
from all worktrees", 2017-08-23).
* The developer has 3 directories in use corresponding to different
worktrees, but four entries under .git/worktrees, one of which appears
that it was once in use but no longer is
* The unused entry under .git/worktrees has a detached HEAD that
points to an object that doesn't exist; I suspect it was garbage
collected.

More details about the last two items:

$ ls ..
master/  1.0-release/  2.0-release/  3.0-release/

$ for i in ../*/.git; do echo $i $(cat $i/.git 2>&1); done
../master/.git cat: ../master/.git: Is a directory
../1.0-release/.git gitdir: /clone/master/.git/worktrees/1.0-release
../2.0-release/.git gitdir: /clone/master/.git/worktrees/2.0-release
../3.0-release/.git gitdir: /clone/master/.git/worktrees/3.0-release1

$ for i in .git/worktrees/*/gitdir; do echo $i $(cat $i); done
.git/worktrees/1.0-release/gitdir /clone/1.0-release/.git
.git/worktrees/2.0-release/gitdir /clone/2.0-release/.git
.git/worktrees/3.0-release1/gitdir /clone/3.0-release/.git
.git/worktrees/3.0-release/gitdir /clone/3.0-release/.git

$ (cd .git && for i in worktrees/*/HEAD; do echo $i $(git rev-parse
$i^{object}); done)
worktrees/1.0-release/HEAD c483c6806fe19c2fa9bcda892e0ddb470f8e9d65
worktrees/2.0-release/HEAD ab4c699ec9d1eab143c7e8ef51a0a6c451fcd4ea
worktrees/3.0-release1/HEAD 55e52e24b24b2f81484de9c6ea7d4520238c5fd5
fatal: ambiguous argument 'worktrees/3.0-release/HEAD^{object}':
unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'
worktrees/3.0-release/HEAD worktrees/3.0-release/HEAD^{object}


So, it appears both worktrees/3.0-release and worktrees/3.0-release1
believe they are used by /clone/3.0-release/, but /clone/3.0-release
uses worktree-3.0-release1.  That leaves worktrees/3.0-release unused,
and in fact its HEAD is some sha1 that does not exist in the repo --
possibly GC'ed.


Also, after the investigation above and based on the hint in one of
the other email threads linked above, we tried 'git worktree prune'
but it didn't remove the unused directory under .git/worktree (perhaps
because it thought something was using it?).  The fetch bug didn't go
away until doing an 'rm -rf .git/worktree/3.0-release'.

I have no experience prior to today with git worktree, and the
developer in question doesn't remember how things got to the current
state.  It was all a while ago.


Does anyone have an idea what may have happened here or how to avoid it?

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: BUG: fetch in certain repo always gives "did not send all necessary objects"
  2018-02-06 23:04 BUG: fetch in certain repo always gives "did not send all necessary objects" Elijah Newren
@ 2018-02-06 23:20 ` Stefan Beller
  2018-02-07  0:00   ` Elijah Newren
  0 siblings, 1 reply; 10+ messages in thread
From: Stefan Beller @ 2018-02-06 23:20 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Git Mailing List, Nguyễn Thái Ngọc, Jeff King

On Tue, Feb 6, 2018 at 3:04 PM, Elijah Newren <newren@gmail.com> wrote:

>
> Does anyone have an idea what may have happened here or how to avoid it?

According to Peff this got fixed
https://public-inbox.org/git/20171020031630.44zvzh3d2vlhglv4@sigill.intra.peff.net/
and but you've had a corrupted repo from back when you were using an older
version of Git.

Did that repo exist before d0c39a49cc was rolled out? Then we can keep that
hypothesis of "left-over corruption" as Peff put it.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: BUG: fetch in certain repo always gives "did not send all necessary objects"
  2018-02-06 23:20 ` Stefan Beller
@ 2018-02-07  0:00   ` Elijah Newren
  2018-02-07 11:08     ` Duy Nguyen
  2018-02-07 13:21     ` Jeff King
  0 siblings, 2 replies; 10+ messages in thread
From: Elijah Newren @ 2018-02-07  0:00 UTC (permalink / raw)
  To: Stefan Beller
  Cc: Git Mailing List, Nguyễn Thái Ngọc, Jeff King

On Tue, Feb 6, 2018 at 3:20 PM, Stefan Beller <sbeller@google.com> wrote:
> On Tue, Feb 6, 2018 at 3:04 PM, Elijah Newren <newren@gmail.com> wrote:
>
>>
>> Does anyone have an idea what may have happened here or how to avoid it?
>
> According to Peff this got fixed
> https://public-inbox.org/git/20171020031630.44zvzh3d2vlhglv4@sigill.intra.peff.net/
> and but you've had a corrupted repo from back when you were using an older
> version of Git.
>
> Did that repo exist before d0c39a49cc was rolled out? Then we can keep that
> hypothesis of "left-over corruption" as Peff put it.

I'm somewhat confused by this explanation.  That precise commit is the
one I bisected to that _caused_ the fetch to fail.  Also, there might
be one important difference here -- in the link you provide, it
suggests that you had a corrupted working directory that made use of a
now gc'ed commit.  In the case I was able to dig into, we did not.
(There was a left-over .git/worktree/<something> that had a now gc'ed
commit, but no working directory that used it.)

I suspect you mean that there was another previous bug that induced
corruption, that this commit fixed that other bug, while also
introducing this new bug that makes folks' clones unusable because the
error doesn't provide enough information for users to know how to fix.
It took me hours to figure it out, after users ran out of ideas and
came and asked me for help.  (Maybe if I was familiar with worktree,
and knew they had been using it, then I might have guessed that "HEAD"
meant "not your actual HEAD but the HEAD of the vestige of some other
worktree").

Does anyone have pointers about what might be doable in terms of
providing a more useful error message to allow users to recover?
And/or ideas of what steps could cause corruption so I can send out a
PSA to help users avoid it?

If not, I'll try to dig more, but I thought I'd ask others familiar
with this area.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: BUG: fetch in certain repo always gives "did not send all necessary objects"
  2018-02-07  0:00   ` Elijah Newren
@ 2018-02-07 11:08     ` Duy Nguyen
  2018-02-07 13:23       ` Jeff King
  2018-02-07 18:12       ` Elijah Newren
  2018-02-07 13:21     ` Jeff King
  1 sibling, 2 replies; 10+ messages in thread
From: Duy Nguyen @ 2018-02-07 11:08 UTC (permalink / raw)
  To: Elijah Newren; +Cc: Stefan Beller, Git Mailing List, Jeff King

On Wed, Feb 7, 2018 at 7:00 AM, Elijah Newren <newren@gmail.com> wrote:
> and knew they had been using it, then I might have guessed that "HEAD"
> meant "not your actual HEAD but the HEAD of the vestige of some other
> worktree").
>
> Does anyone have pointers about what might be doable in terms of
> providing a more useful error message to allow users to recover?

I noticed this too. I was working on improving this message a bit but
got side tracked and since I figured this did not happen often
anymore, this fix got lower priority than others. I'll resume that
work.

> And/or ideas of what steps could cause corruption so I can send out a
> PSA to help users avoid it?

There is another thing we could do. One bad HEAD should not abort the
entire operation (at least if it's not the current worktree's HEAD).
We could still give a warning and move on, or don't warn at all and
let "git worktree prune" collect it (which I see from your message
that it also fails to do).

I guess that's two more items on my todo list :) Sorry for all the
trouble because of this bug of mine.

> If not, I'll try to dig more, but I thought I'd ask others familiar
> with this area.
-- 
Duy

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: BUG: fetch in certain repo always gives "did not send all necessary objects"
  2018-02-07  0:00   ` Elijah Newren
  2018-02-07 11:08     ` Duy Nguyen
@ 2018-02-07 13:21     ` Jeff King
  2018-02-07 13:25       ` Jeff King
  2018-02-07 17:25       ` Elijah Newren
  1 sibling, 2 replies; 10+ messages in thread
From: Jeff King @ 2018-02-07 13:21 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Stefan Beller, Git Mailing List, Nguyễn Thái Ngọc

On Tue, Feb 06, 2018 at 04:00:32PM -0800, Elijah Newren wrote:

> > According to Peff this got fixed
> > https://public-inbox.org/git/20171020031630.44zvzh3d2vlhglv4@sigill.intra.peff.net/
> > and but you've had a corrupted repo from back when you were using an older
> > version of Git.
> >
> > Did that repo exist before d0c39a49cc was rolled out? Then we can keep that
> > hypothesis of "left-over corruption" as Peff put it.
> 
> I'm somewhat confused by this explanation.  That precise commit is the
> one I bisected to that _caused_ the fetch to fail.  Also, there might
> be one important difference here -- in the link you provide, it
> suggests that you had a corrupted working directory that made use of a
> now gc'ed commit.  In the case I was able to dig into, we did not.
> (There was a left-over .git/worktree/<something> that had a now gc'ed
> commit, but no working directory that used it.)

If you had a corrupted .git/worktree/<something>/HEAD, then that does
sound like the same problem. It's true that the commit you bisected to
caused "fetch" to fail, but only because it started looking at more of
your corrupted repository. The corruption happened long before (and I
don't know exactly when it was fixed, but I couldn't replicate it
anymore; it might even still exist).

In your case it sounds like you have the extra twist that the matching
working directory for "<something>" had gone away, but I don't think
that materially changes anything. Until you run "git worktree prune",
that HEAD file is still there and still supposed to be valid.

> I suspect you mean that there was another previous bug that induced
> corruption, that this commit fixed that other bug, while also
> introducing this new bug that makes folks' clones unusable because the
> error doesn't provide enough information for users to know how to fix.

If you want to call that last thing a bug, then I guess so. It's perhaps
a matter for the philosophers whether it is the fault of the new code to
start complaining about an existing on-disk corruption.

> It took me hours to figure it out, after users ran out of ideas and
> came and asked me for help.  (Maybe if I was familiar with worktree,
> and knew they had been using it, then I might have guessed that "HEAD"
> meant "not your actual HEAD but the HEAD of the vestige of some other
> worktree").

Yeah, this is the obvious thing that seems like it ought to be improved.

> Does anyone have pointers about what might be doable in terms of
> providing a more useful error message to allow users to recover?
> And/or ideas of what steps could cause corruption so I can send out a
> PSA to help users avoid it?

Here's a minimal manual reproduction:

  # new repo...
  git init
  git commit --allow-empty -m one

  # with a worktree...
  git worktree add foo
  git -C foo commit --allow-empty -m two
  obj=.git/objects/$(git rev-parse foo | sed 's#..#&/#')

  # now we stop using that worktree
  git -C foo checkout --detach
  git branch -f -D foo
  rm -rf foo

  # and this is the corruption; this might have happened ye olden days
  # because of a bug in the worktree code, but we'll assume that somehow
  # the object went away
  rm -f $obj

And now lots of commands may fail with confusing errors:

  $ git prune
  fatal: unable to parse object: HEAD

Unfortunately fixing that is a little tricky. In this case the stack
looks like:

  parse_object_or_die (oid=0x7fffffffd690, name=0x555555792880 "HEAD") at object.c:239
  add_one_ref (path=0x555555792880 "HEAD", oid=0x7fffffffd690, flag=0, cb_data=0x7fffffffd8e0) at reachable.c:38
  refs_head_ref (refs=0x555555a65430, fn=0x5555556b6ef5 <add_one_ref>, cb_data=0x7fffffffd8e0) at refs.c:1316
  other_head_refs (fn=0x5555556b6ef5 <add_one_ref>, cb_data=0x7fffffffd8e0) at worktree.c:404

So other_head_refs knows that it's looking at the worktrees. And it
passes the alternate ref-store to refs_head_ref(), with "add_one_ref" as
the callback. But the knowledge that we're not talking about the real
"HEAD" is lost as we cross that callback boundary. We'd need to either
add another parameter to the callback, or have some way of talking about
"HEAD in this worktree" as a refname (which AFAIK we don't have).

As for PSAs, my normal go-to in confusing matters like this is git-fsck.
But it seems that it does not check worktree HEADs. :(

  $ git fsck
  Checking object directories: 100% (256/256), done.

So that seems like another bug.

The best PSA for this particular bug may be "try pruning the worktrees":

  $ git worktree prune -v
  Removing worktrees/foo: gitdir file points to non-existent location

  $ git prune; echo $?
  0

-Peff

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: BUG: fetch in certain repo always gives "did not send all necessary objects"
  2018-02-07 11:08     ` Duy Nguyen
@ 2018-02-07 13:23       ` Jeff King
  2018-02-07 18:12       ` Elijah Newren
  1 sibling, 0 replies; 10+ messages in thread
From: Jeff King @ 2018-02-07 13:23 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: Elijah Newren, Stefan Beller, Git Mailing List

On Wed, Feb 07, 2018 at 06:08:40PM +0700, Duy Nguyen wrote:

> > And/or ideas of what steps could cause corruption so I can send out a
> > PSA to help users avoid it?
> 
> There is another thing we could do. One bad HEAD should not abort the
> entire operation (at least if it's not the current worktree's HEAD).
> We could still give a warning and move on, or don't warn at all and
> let "git worktree prune" collect it (which I see from your message
> that it also fails to do).

Whether that's safe or not depends very much on what the caller intends
to do with the ref. Skipping broken refs in "git prune" is a bad idea,
for instance.

-Peff

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: BUG: fetch in certain repo always gives "did not send all necessary objects"
  2018-02-07 13:21     ` Jeff King
@ 2018-02-07 13:25       ` Jeff King
  2018-02-07 17:25       ` Elijah Newren
  1 sibling, 0 replies; 10+ messages in thread
From: Jeff King @ 2018-02-07 13:25 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Stefan Beller, Git Mailing List, Nguyễn Thái Ngọc

On Wed, Feb 07, 2018 at 08:21:57AM -0500, Jeff King wrote:

> The best PSA for this particular bug may be "try pruning the worktrees":
> 
>   $ git worktree prune -v
>   Removing worktrees/foo: gitdir file points to non-existent location
> 
>   $ git prune; echo $?
>   0

Sorry, I just read Stefan's response and not your original before
responding. I see that you did try that. So IMHO that's a separate bug
that "worktree prune" did not do the right thing for your particular
case.

-Peff

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: BUG: fetch in certain repo always gives "did not send all necessary objects"
  2018-02-07 13:21     ` Jeff King
  2018-02-07 13:25       ` Jeff King
@ 2018-02-07 17:25       ` Elijah Newren
  2018-02-07 18:17         ` Jeff King
  1 sibling, 1 reply; 10+ messages in thread
From: Elijah Newren @ 2018-02-07 17:25 UTC (permalink / raw)
  To: Jeff King
  Cc: Stefan Beller, Git Mailing List, Nguyễn Thái Ngọc

On Wed, Feb 7, 2018 at 5:21 AM, Jeff King <peff@peff.net> wrote:
> On Tue, Feb 06, 2018 at 04:00:32PM -0800, Elijah Newren wrote:
>
>> It took me hours to figure it out, after users ran out of ideas and
>> came and asked me for help.  (Maybe if I was familiar with worktree,
>> and knew they had been using it, then I might have guessed that "HEAD"
>> meant "not your actual HEAD but the HEAD of the vestige of some other
>> worktree").
>
> Yeah, this is the obvious thing that seems like it ought to be improved.
<snip>
> Unfortunately fixing that is a little tricky. In this case the stack
> looks like:
>
>   parse_object_or_die (oid=0x7fffffffd690, name=0x555555792880 "HEAD") at object.c:239
>   add_one_ref (path=0x555555792880 "HEAD", oid=0x7fffffffd690, flag=0, cb_data=0x7fffffffd8e0) at reachable.c:38
>   refs_head_ref (refs=0x555555a65430, fn=0x5555556b6ef5 <add_one_ref>, cb_data=0x7fffffffd8e0) at refs.c:1316
>   other_head_refs (fn=0x5555556b6ef5 <add_one_ref>, cb_data=0x7fffffffd8e0) at worktree.c:404
>
> So other_head_refs knows that it's looking at the worktrees. And it
> passes the alternate ref-store to refs_head_ref(), with "add_one_ref" as
> the callback. But the knowledge that we're not talking about the real
> "HEAD" is lost as we cross that callback boundary. We'd need to either
> add another parameter to the callback, or have some way of talking about
> "HEAD in this worktree" as a refname (which AFAIK we don't have).

Can we use "worktrees/${WORKTREE}/HEAD"?  It already satisfies all the
necessary rev-parse rules...

(And on a slight tangent...do we want to start disallowing the
creation of branches/tags whose name starts with "worktrees/",
"refs/", "hooks/", or other paths that exists under gitdir?  Making a
branch named "refs/heads/foo" so that it fully-qualifies as
"refs/heads/refs/heads/foo" is always fun)

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: BUG: fetch in certain repo always gives "did not send all necessary objects"
  2018-02-07 11:08     ` Duy Nguyen
  2018-02-07 13:23       ` Jeff King
@ 2018-02-07 18:12       ` Elijah Newren
  1 sibling, 0 replies; 10+ messages in thread
From: Elijah Newren @ 2018-02-07 18:12 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: Stefan Beller, Git Mailing List, Jeff King

On Wed, Feb 7, 2018 at 3:08 AM, Duy Nguyen <pclouds@gmail.com> wrote:
> On Wed, Feb 7, 2018 at 7:00 AM, Elijah Newren <newren@gmail.com> wrote:
>> and knew they had been using it, then I might have guessed that "HEAD"
>> meant "not your actual HEAD but the HEAD of the vestige of some other
>> worktree").
>>
>> Does anyone have pointers about what might be doable in terms of
>> providing a more useful error message to allow users to recover?
>
> I noticed this too. I was working on improving this message a bit but
> got side tracked and since I figured this did not happen often
> anymore, this fix got lower priority than others. I'll resume that
> work.

Sweet, that'd be great.  Let me know if I can help.

>> And/or ideas of what steps could cause corruption so I can send out a
>> PSA to help users avoid it?
>
> There is another thing we could do. One bad HEAD should not abort the
> entire operation (at least if it's not the current worktree's HEAD).
> We could still give a warning and move on, or don't warn at all and
> let "git worktree prune" collect it (which I see from your message
> that it also fails to do).
>
> I guess that's two more items on my todo list :)

Cool, if you're taking a look at those, I'll take a look at the other
one mentioned by Peff in this thread -- "make fsck pay attention to
other worktree HEADs"  :-)

> Sorry for all the trouble because of this bug of mine.

Let me apologize if my tone came across too harshly in the report; I
sometimes accidentally do that.  Anyway, bugs happen.  People were
able to get back up and running with "just reclone somewhere else"
pretty quickly, and no one lost any important data here, it was just
jarring or perplexing enough that I felt the need to dig and figure
out an improvement before more reports come in.

Besides, you've made git a lot better.  Thanks for that.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: BUG: fetch in certain repo always gives "did not send all necessary objects"
  2018-02-07 17:25       ` Elijah Newren
@ 2018-02-07 18:17         ` Jeff King
  0 siblings, 0 replies; 10+ messages in thread
From: Jeff King @ 2018-02-07 18:17 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Stefan Beller, Git Mailing List, Nguyễn Thái Ngọc

On Wed, Feb 07, 2018 at 09:25:42AM -0800, Elijah Newren wrote:

> > So other_head_refs knows that it's looking at the worktrees. And it
> > passes the alternate ref-store to refs_head_ref(), with "add_one_ref" as
> > the callback. But the knowledge that we're not talking about the real
> > "HEAD" is lost as we cross that callback boundary. We'd need to either
> > add another parameter to the callback, or have some way of talking about
> > "HEAD in this worktree" as a refname (which AFAIK we don't have).
> 
> Can we use "worktrees/${WORKTREE}/HEAD"?  It already satisfies all the
> necessary rev-parse rules...

True, but it's mostly an accident that it works. And once we have ref
backends besides the filesystem, it will probably stop working.

I think there was discussion at some point of embedding worktree refs
into the normal ref namespace, but I don't know what came of it (it's
not a feature I've followed very closely).

> (And on a slight tangent...do we want to start disallowing the
> creation of branches/tags whose name starts with "worktrees/",
> "refs/", "hooks/", or other paths that exists under gitdir?  Making a
> branch named "refs/heads/foo" so that it fully-qualifies as
> "refs/heads/refs/heads/foo" is always fun)

We recently taught the porcelain to disallow a branch named "HEAD".
Though I think there are actually two related problems with different
solutions. One is saying something like:

  git checkout -b HEAD

or:

  git checkout -b refs/heads/foo

both of which will not do what you want, and leave you with a
funnily-named branch in the ref namespace.

But that's separate from the fact that:

  git rev-parse info/refs

will look at a file that is not a ref at all. Long-term I think the
solution is storage formats that don't mingle with other files. But we
could probably teach even the files-backend that any ref at the
top-level is supposed to be either in refs/, or to consist only of
"[A-Z_]".

-Peff

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2018-02-07 18:17 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-02-06 23:04 BUG: fetch in certain repo always gives "did not send all necessary objects" Elijah Newren
2018-02-06 23:20 ` Stefan Beller
2018-02-07  0:00   ` Elijah Newren
2018-02-07 11:08     ` Duy Nguyen
2018-02-07 13:23       ` Jeff King
2018-02-07 18:12       ` Elijah Newren
2018-02-07 13:21     ` Jeff King
2018-02-07 13:25       ` Jeff King
2018-02-07 17:25       ` Elijah Newren
2018-02-07 18:17         ` Jeff King

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.