git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Taylor Blau <me@ttaylorr.com>
To: Jonathan Nieder <jrnieder@gmail.com>
Cc: Taylor Blau <me@ttaylorr.com>,
	git@vger.kernel.org, jonathantanmy@google.com, gitster@pobox.com,
	newren@gmail.com, Jay Conrod <jayconrod@google.com>,
	Derrick Stolee <stolee@gmail.com>
Subject: Re: [PATCH v2 2/2] shallow.c: use '{commit,rollback}_shallow_file'
Date: Tue, 2 Jun 2020 22:52:48 -0600	[thread overview]
Message-ID: <20200603045248.GA20266@syl.local> (raw)
In-Reply-To: <20200603034213.GB253041@google.com>

Hi Jonathan,

On Tue, Jun 02, 2020 at 08:42:13PM -0700, Jonathan Nieder wrote:
> Hi,
>
> Taylor Blau wrote:
>
> > Signed-off-by: Taylor Blau <me@ttaylorr.com>
> > ---
> >  builtin/receive-pack.c   |  4 ++--
> >  commit.h                 |  2 ++
> >  fetch-pack.c             | 10 +++++-----
> >  shallow.c                | 30 +++++++++++++++++++++---------
> >  t/t5537-fetch-shallow.sh | 29 +++++++++++++++++++++++++++++
> >  5 files changed, 59 insertions(+), 16 deletions(-)
>
> I haven't investigated the cause yet, but I've run into an interesting
> bug that bisects to this commit.  Jay Conrod (cc-ed) reports:
>
> | I believe this is also the cause of Go toolchain test failures we've
> | been seeing. Go uses git to fetch dependencies.
> |
> | The problem we're seeing can be reproduced with the script below. It
> | should print "success". Instead, the git merge-base command fails
> | because the commit 7303f77963648d5f1ec5e55eccfad8e14035866c
> | (origin/master) has no history.
>
> -- 8< --
> #!/bin/bash
>
> set -euxo pipefail
> if [ -d legacytest ]; then
>   echo "legacytest directory already exists" >&2
>   exit 1
> fi
> mkdir legacytest
> cd legacytest
> git init --bare
> git config protocol.version 2
> git config fetch.writeCommitGraph true
> git remote add origin -- https://github.com/rsc/legacytest
> git fetch -f --depth=1 origin refs/heads/master:refs/heads/master
> git fetch -f origin 'refs/heads/*:refs/heads/*' 'refs/tags/*:refs/tags/*'
> git fetch --unshallow -f origin
> git merge-base --is-ancestor -- v2.0.0 7303f77963648d5f1ec5e55eccfad8e14035866c
> echo success
> -- >8 --

Thanks to you and Jay for the report and reproduction script. Indeed, I
can reproduce this on the tip of master (which is equivalent to v2.27.0
at the time of writing).

> The fetch.writeCommitGraph part is interesting.  When does a commit
> graph file get written in this sequence of operations?  In an
> unshallow operation, does the usual guard against writing a commit
> graph in a shallow repo get missed?

The last 'git fetch' is the one that writes the commit-graph. You can
verify this by sticking a 'ls objects/info' after each 'git' invocation
in your script.

Here's where things get weird, though. Prior to this patch, Git would
pick up that the repository is shallow before unshallowing, but never
invalidate this fact after unshallowing. That means that once we got to
'write_commit_graph', we'd exit immediately since it appears as if the
repository is shallow.

In this patch, we don't do that anymore, since we rightly unset the fact
that we are (were) shallow.

In a debugger, I ran your script and a 'git commit-graph write --split
--reachable' side-by-side, and found an interesting discrepancy: some
commits (loaded from 'copy_oids_to_commits') *don't* have their parents
set when invoked from 'git fetch', but *do* when invoked as 'git
commit-graph write ...'.

I'm not an expert in the object cache, but my hunch is that when we
fetch these objects they're marked as parsed without having loaded their
parents. When we load them again via 'lookup_object', we get objects
that look parsed, but don't have parents where they otherwise should.

I'm going to CC Stolee to see if he has any thoughts on how to handle
this and/or if my idea is on the right track.

> "rm -fr objects/info/commit-graphs" recovers the full history in the
> repo, so this is not a case of writing the wrong shallows --- it's
> only a commit graph issue.
>
> I'll take a closer look, but thought I'd give others a chance to look
> to in case there's something obvious. :)
>
> Thanks,
> Jonathan

Thanks,
Taylor

  reply	other threads:[~2020-06-03  4:52 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-04-21 18:09 [PATCH] shallow.c: use 'reset_repository_shallow' when appropriate Taylor Blau
2020-04-21 20:41 ` Junio C Hamano
2020-04-21 20:45   ` Taylor Blau
2020-04-21 20:52     ` Junio C Hamano
2020-04-21 22:21       ` Taylor Blau
2020-04-21 23:06         ` Junio C Hamano
2020-04-22 18:05       ` Jonathan Tan
2020-04-22 18:02 ` Jonathan Tan
2020-04-22 18:15   ` Junio C Hamano
2020-04-23  0:14     ` Taylor Blau
2020-04-23  0:25       ` [PATCH v2 0/2] shallow.c: reset shallow-ness after updating Taylor Blau
2020-04-23  0:25         ` [PATCH v2 1/2] t5537: use test_write_lines, indented heredocs for readability Taylor Blau
2020-04-23  1:14           ` Jonathan Nieder
2020-04-24 17:11             ` Taylor Blau
2020-04-24 17:17               ` Jonathan Nieder
2020-04-24 20:45               ` Junio C Hamano
2020-04-23  0:25         ` [PATCH v2 2/2] shallow.c: use '{commit,rollback}_shallow_file' Taylor Blau
2020-04-23  1:23           ` Jonathan Nieder
2020-04-23 18:09           ` Jonathan Tan
2020-04-23 20:40             ` Junio C Hamano
2020-04-24 17:13               ` Taylor Blau
2020-06-03  3:42           ` Jonathan Nieder
2020-06-03  4:52             ` Taylor Blau [this message]
2020-06-03  5:16               ` Taylor Blau
2020-06-03 13:08                 ` Derrick Stolee
2020-06-03 19:26                   ` Taylor Blau
2020-06-03 21:23                   ` Jonathan Nieder
2020-06-03 20:51                 ` Jonathan Nieder
2020-06-03 22:14                   ` Taylor Blau
2020-06-03 23:06                     ` Jonathan Nieder
2020-06-04 17:45                       ` Taylor Blau
2020-04-23 19:05       ` [PATCH] shallow.c: use 'reset_repository_shallow' when appropriate Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200603045248.GA20266@syl.local \
    --to=me@ttaylorr.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jayconrod@google.com \
    --cc=jonathantanmy@google.com \
    --cc=jrnieder@gmail.com \
    --cc=newren@gmail.com \
    --cc=stolee@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).