git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: sandals@crustytoothpaste.net, avarab@gmail.com,
	Junio C Hamano <gitster@pobox.com>
Subject: [PATCH v2 0/5] Create commit-graph file format v2
Date: Wed, 24 Apr 2019 12:58:02 -0700 (PDT)	[thread overview]
Message-ID: <pull.112.v2.git.gitgitgadget@gmail.com> (raw)
In-Reply-To: <pull.112.git.gitgitgadget@gmail.com>

The commit-graph file format has some shortcomings that were discussed
on-list:

 1. It doesn't use the 4-byte format ID from the_hash_algo.
    
    
 2. There is no way to change the reachability index from generation numbers
    to corrected commit date [1].
    
    
 3. The unused byte in the format could be used to signal the file is
    incremental, but current clients ignore the value even if it is
    non-zero.
    
    

This series adds a new version (2) to the commit-graph file. The fifth byte
already specified the file format, so existing clients will gracefully
respond to files with a different version number. The only real change now
is that the header takes 12 bytes instead of 8, due to using the 4-byte
format ID for the hash algorithm.

The new bytes reserved for the reachability index version and incremental
file formats are now expected to be equal to the defaults. When we update
these values to be flexible in the future, if a client understands
commit-graph v2 but not those new values, then it will fail gracefully.

NOTE: this series was rebased onto ab/commit-graph-fixes, as the conflicts
were significant and subtle.

Thanks, -Stolee

[1] 
https://public-inbox.org/git/6367e30a-1b3a-4fe9-611b-d931f51effef@gmail.com/

Derrick Stolee (5):
  commit-graph: return with errors during write
  commit-graph: collapse parameters into flags
  commit-graph: create new version flags
  commit-graph: add --version=<n> option
  commit-graph: implement file format version 2

 Documentation/git-commit-graph.txt            |   3 +
 .../technical/commit-graph-format.txt         |  26 ++-
 builtin/commit-graph.c                        |  43 +++--
 builtin/commit.c                              |   5 +-
 builtin/gc.c                                  |   7 +-
 commit-graph.c                                | 156 +++++++++++++-----
 commit-graph.h                                |  16 +-
 t/t5318-commit-graph.sh                       |  68 +++++++-
 8 files changed, 254 insertions(+), 70 deletions(-)


base-commit: 93b4405ffe4ad9308740e7c1c71383bfc369baaa
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-112%2Fderrickstolee%2Fgraph%2Fv2-head-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-112/derrickstolee/graph/v2-head-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/112

Range-diff vs v1:

 1:  e72498d0c5 ! 1:  91f300ec0a commit-graph: return with errors during write
     @@ -82,8 +82,8 @@
       --- a/builtin/gc.c
       +++ b/builtin/gc.c
      @@
     - 	if (pack_garbage.nr > 0)
       		clean_pack_garbage();
     + 	}
       
      -	if (gc_write_commit_graph)
      -		write_commit_graph_reachable(get_object_directory(), 0,
     @@ -182,9 +182,9 @@
       	}
       	stop_progress(&progress);
       
     --	if (count_distinct >= GRAPH_PARENT_MISSING)
     +-	if (count_distinct >= GRAPH_EDGE_LAST_MASK)
      -		die(_("the commit graph format cannot write %d commits"), count_distinct);
     -+	if (count_distinct >= GRAPH_PARENT_MISSING) {
     ++	if (count_distinct >= GRAPH_EDGE_LAST_MASK) {
      +		error(_("the commit graph format cannot write %d commits"), count_distinct);
      +		res = 1;
      +		goto cleanup;
     @@ -196,9 +196,9 @@
       	num_chunks = num_extra_edges ? 4 : 3;
       	stop_progress(&progress);
       
     --	if (commits.nr >= GRAPH_PARENT_MISSING)
     +-	if (commits.nr >= GRAPH_EDGE_LAST_MASK)
      -		die(_("too many commits to write graph"));
     -+	if (commits.nr >= GRAPH_PARENT_MISSING) {
     ++	if (commits.nr >= GRAPH_EDGE_LAST_MASK) {
      +		error(_("too many commits to write graph"));
      +		res = 1;
      +		goto cleanup;
 2:  43a40d0c43 ! 2:  04f5df1135 commit-graph: collapse parameters into flags
     @@ -66,7 +66,7 @@
       --- a/builtin/gc.c
       +++ b/builtin/gc.c
      @@
     - 		clean_pack_garbage();
     + 	}
       
       	if (gc_write_commit_graph &&
      -	    write_commit_graph_reachable(get_object_directory(), 0,
 3:  39319e36bc ! 3:  4ddb829163 commit-graph: create new version flags
     @@ -25,25 +25,25 @@
      -#define GRAPH_VERSION GRAPH_VERSION_1
      -
       #define GRAPH_EXTRA_EDGES_NEEDED 0x80000000
     - #define GRAPH_PARENT_MISSING 0x7fffffff
       #define GRAPH_EDGE_LAST_MASK 0x7fffffff
     + #define GRAPH_PARENT_NONE 0x70000000
      @@
       	}
       
       	graph_version = *(unsigned char*)(data + 4);
      -	if (graph_version != GRAPH_VERSION) {
      +	if (graph_version != 1) {
     - 		error(_("graph version %X does not match version %X"),
     + 		error(_("commit-graph version %X does not match version %X"),
      -		      graph_version, GRAPH_VERSION);
     --		goto cleanup_fail;
     +-		return NULL;
      -	}
      -
      -	hash_version = *(unsigned char*)(data + 5);
      -	if (hash_version != oid_version()) {
     --		error(_("hash version %X does not match version %X"),
     +-		error(_("commit-graph hash version %X does not match version %X"),
      -		      hash_version, oid_version());
      +		      graph_version, 1);
     - 		goto cleanup_fail;
     + 		return NULL;
       	}
       
       	graph = alloc_commit_graph();
     @@ -52,9 +52,9 @@
      +	case 1:
      +		hash_version = *(unsigned char*)(data + 5);
      +		if (hash_version != oid_version()) {
     -+			error(_("hash version %X does not match version %X"),
     ++			error(_("commit-graph hash version %X does not match version %X"),
      +			      hash_version, oid_version());
     -+			goto cleanup_fail;
     ++			return NULL;
      +		}
      +
      +		graph->num_chunks = *(unsigned char*)(data + 6);
     @@ -72,8 +72,8 @@
       	last_chunk_offset = 8;
      -	chunk_lookup = data + 8;
       	for (i = 0; i < graph->num_chunks; i++) {
     - 		uint32_t chunk_id = get_be32(chunk_lookup + 0);
     - 		uint64_t chunk_offset = get_be64(chunk_lookup + 4);
     + 		uint32_t chunk_id;
     + 		uint64_t chunk_offset;
      @@
       	int res = 0;
       	int append = flags & COMMIT_GRAPH_APPEND;
 4:  e7ae3007f5 ! 4:  b1b0c76eb4 commit-graph: add --version=<n> option
     @@ -2,7 +2,7 @@
      
          commit-graph: add --version=<n> option
      
     -    Allo the commit-graph builtin to specify the file format version
     +    Allow the commit-graph builtin to specify the file format version
          using the '--version=<n>' option. Specify the version exactly in
          the verification tests as using a different version would change
          the offsets used in those tests.
 5:  c55e0a738c ! 5:  09362bda1b commit-graph: implement file format version 2
     @@ -22,12 +22,20 @@
          We can also fail to read the commit-graph if the eighth byte is
          non-zero.
      
     -    The 'git commit-graph read' subcommand needs updating to show the
     -    new data.
     +    Update the 'git commit-graph read' subcommand to display the new
     +    data.
      
          Set the default file format version to 2, and adjust the tests to
          expect the new 'git commit-graph read' output.
      
     +    Add explicit tests for the upgrade path from version 1 to 2. Users
     +    with an existing commit-graph with version 1 will seamlessly
     +    upgrade to version 2 on their next write.
     +
     +    While we converted the existing 'verify' tests to use a version 1
     +    file to avoid recalculating data offsets, add explicit 'verify'
     +    tests on a version 2 file that corrupt the new header values.
     +
          Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
      
       diff --git a/Documentation/technical/commit-graph-format.txt b/Documentation/technical/commit-graph-format.txt
     @@ -118,21 +126,17 @@
      +	unsigned char graph_version, hash_version, reach_index_version;
      +	uint32_t hash_id;
       
     - 	if (fd < 0)
     + 	if (!graph_map)
       		return NULL;
      @@
       	}
       
       	graph_version = *(unsigned char*)(data + 4);
      -	if (graph_version != 1) {
     --		error(_("graph version %X does not match version %X"),
     --		      graph_version, 1);
      +	if (!graph_version || graph_version > 2) {
     -+		error(_("unsupported graph version %X"),
     -+		      graph_version);
     - 		goto cleanup_fail;
     - 	}
     - 
     + 		error(_("commit-graph version %X does not match version %X"),
     + 		      graph_version, 1);
     + 		return NULL;
      @@
       		graph->num_chunks = *(unsigned char*)(data + 6);
       		chunk_lookup = data + 8;
     @@ -145,18 +149,18 @@
      +		if (reach_index_version != 1) {
      +			error(_("unsupported reachability index version %d"),
      +			      reach_index_version);
     -+			goto cleanup_fail;
     ++			return NULL;
      +		}
      +
      +		if (*(unsigned char*)(data + 7)) {
      +			error(_("unsupported value in commit-graph header"));
     -+			goto cleanup_fail;
     ++			return NULL;
      +		}
      +
      +		hash_id = get_be32(data + 8);
      +		if (hash_id != the_hash_algo->format_id) {
      +			error(_("commit-graph hash algorithm does not match current algorithm"));
     -+			goto cleanup_fail;
     ++			return NULL;
      +		}
      +
      +		chunk_lookup = data + 12;
     @@ -209,6 +213,31 @@
       diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh
       --- a/t/t5318-commit-graph.sh
       +++ b/t/t5318-commit-graph.sh
     +@@
     + 	git repack
     + '
     + 
     +-graph_git_two_modes() {
     ++graph_git_two_modes () {
     + 	git -c core.commitGraph=true $1 >output
     + 	git -c core.commitGraph=false $1 >expect
     + 	test_cmp expect output
     + }
     + 
     +-graph_git_behavior() {
     ++graph_git_behavior () {
     + 	MSG=$1
     + 	DIR=$2
     + 	BRANCH=$3
     +@@
     + 
     + graph_git_behavior 'no graph' full commits/3 commits/1
     + 
     +-graph_read_expect() {
     ++graph_read_expect () {
     + 	OPTIONAL=""
     + 	NUM_CHUNKS=3
     + 	if test ! -z $2
      @@
       		NUM_CHUNKS=$((3 + $(echo "$2" | wc -w)))
       	fi
     @@ -219,6 +248,40 @@
       	num_commits: $1
       	chunks: oid_fanout oid_lookup commit_metadata$OPTIONAL
       	EOF
     +@@
     + 	)
     + '
     + 
     ++test_expect_success 'write v1 graph' '
     ++	git commit-graph write --reachable --version=1 &&
     ++	git commit-graph verify
     ++'
     ++
     ++graph_git_behavior 'version 1 graph, commit 8 vs merge 2' full commits/8 merge/2
     ++graph_git_behavior 'version 1 graph, commit 8 vs merge 2' full commits/8 merge/2
     ++
     ++test_expect_success 'upgrade from v1 to v2' '
     ++	git checkout -b new-commit-for-upgrade &&
     ++	test_commit force-upgrade &&
     ++	git commit-graph write --reachable --version=2 &&
     ++	git commit-graph verify
     ++'
     ++
     ++graph_git_behavior 'upgraded graph, commit 8 vs merge 2' full commits/8 merge/2
     ++graph_git_behavior 'upgraded graph, commit 8 vs merge 2' full commits/8 merge/2
     ++
     + # the verify tests below expect the commit-graph to contain
     + # exactly the commits reachable from the commits/8 branch.
     + # If the file changes the set of commits in the list, then the
     +@@
     + # starting at <zero_pos>, then runs 'git commit-graph verify'
     + # and places the output in the file 'err'. Test 'err' for
     + # the given string.
     +-corrupt_graph_and_verify() {
     ++corrupt_graph_and_verify () {
     + 	pos=$1
     + 	data="${2:-\0}"
     + 	grepstr=$3
      @@
       '
       
     @@ -235,3 +298,41 @@
       test_expect_success 'detect bad hash version' '
       	corrupt_graph_and_verify $GRAPH_BYTE_HASH "\02" \
       		"hash version"
     +@@
     + 	test_must_fail git fsck
     + '
     + 
     ++test_expect_success 'rewrite commmit-graph with version 2' '
     ++	rm -f .git/objects/info/commit-graph &&
     ++	git commit-graph write --reachable --version=2 &&
     ++	git commit-graph verify
     ++'
     ++
     ++GRAPH_BYTE_CHUNK_COUNT=5
     ++GRAPH_BYTE_REACH_INDEX=6
     ++GRAPH_BYTE_UNUSED=7
     ++GRAPH_BYTE_HASH=8
     ++
     ++test_expect_success 'detect low chunk count (v2)' '
     ++	corrupt_graph_and_verify $GRAPH_BYTE_CHUNK_COUNT "\02" \
     ++		"missing the .* chunk"
     ++'
     ++
     ++test_expect_success 'detect incorrect reachability index' '
     ++	corrupt_graph_and_verify $GRAPH_BYTE_REACH_INDEX "\03" \
     ++		"reachability index version"
     ++'
     ++
     ++test_expect_success 'detect non-zero unused byte' '
     ++	corrupt_graph_and_verify $GRAPH_BYTE_UNUSED "\01" \
     ++		"unsupported value"
     ++'
     ++
     ++test_expect_success 'detect bad hash version (v2)' '
     ++	corrupt_graph_and_verify $GRAPH_BYTE_HASH "\00" \
     ++		"hash algorithm"
     ++'
     ++
     + test_expect_success 'setup non-the_repository tests' '
     + 	rm -rf repo &&
     + 	git init repo &&
 6:  693900b4c5 < -:  ---------- commit-graph: test verifying a corrupt v2 header

-- 
gitgitgadget

  parent reply	other threads:[~2019-04-24 19:58 UTC|newest]

Thread overview: 89+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-01-23 21:59 [PATCH 0/6] Create commit-graph file format v2 Derrick Stolee via GitGitGadget
2019-01-23 21:59 ` [PATCH 1/6] commit-graph: return with errors during write Derrick Stolee via GitGitGadget
2019-01-23 21:59 ` [PATCH 2/6] commit-graph: collapse parameters into flags Derrick Stolee via GitGitGadget
2019-01-23 21:59 ` [PATCH 3/6] commit-graph: create new version flags Derrick Stolee via GitGitGadget
2019-01-23 21:59 ` [PATCH 4/6] commit-graph: add --version=<n> option Derrick Stolee via GitGitGadget
2019-01-24  9:31   ` Ævar Arnfjörð Bjarmason
2019-01-23 21:59 ` [PATCH 5/6] commit-graph: implement file format version 2 Derrick Stolee via GitGitGadget
2019-01-23 23:56   ` Jonathan Tan
2019-01-24  9:40   ` Ævar Arnfjörð Bjarmason
2019-01-24 14:34     ` Derrick Stolee
2019-03-21  9:21   ` Ævar Arnfjörð Bjarmason
2019-01-23 21:59 ` [PATCH 6/6] commit-graph: test verifying a corrupt v2 header Derrick Stolee via GitGitGadget
2019-01-23 23:59   ` Jonathan Tan
2019-01-24 23:05 ` [PATCH 0/6] Create commit-graph file format v2 Junio C Hamano
2019-01-24 23:39 ` Junio C Hamano
2019-01-25 13:54   ` Derrick Stolee
2019-04-24 19:58 ` Derrick Stolee via GitGitGadget [this message]
2019-04-24 19:58   ` [PATCH v2 1/5] commit-graph: return with errors during write Derrick Stolee via GitGitGadget
2019-04-24 19:58   ` [PATCH v2 2/5] commit-graph: collapse parameters into flags Derrick Stolee via GitGitGadget
2019-04-25  5:21     ` Junio C Hamano
2019-04-24 19:58   ` [PATCH v2 3/5] commit-graph: create new version flags Derrick Stolee via GitGitGadget
2019-04-25  5:29     ` Junio C Hamano
2019-04-25 11:09       ` Derrick Stolee
2019-04-25 21:31     ` Ævar Arnfjörð Bjarmason
2019-04-26  2:20       ` Junio C Hamano
2019-04-24 19:58   ` [PATCH v2 4/5] commit-graph: add --version=<n> option Derrick Stolee via GitGitGadget
2019-04-24 19:58   ` [PATCH v2 5/5] commit-graph: implement file format version 2 Derrick Stolee via GitGitGadget
2019-04-25 22:09   ` [PATCH v2 0/5] Create commit-graph file format v2 Ævar Arnfjörð Bjarmason
2019-04-26  2:28     ` Junio C Hamano
2019-04-26  8:33       ` Ævar Arnfjörð Bjarmason
2019-04-26 12:06         ` Derrick Stolee
2019-04-26 13:55           ` Ævar Arnfjörð Bjarmason
2019-04-27 12:57     ` Ævar Arnfjörð Bjarmason
2019-05-01 13:11   ` [PATCH v3 0/6] " Derrick Stolee via GitGitGadget
2019-05-01 13:11     ` [PATCH v3 1/6] commit-graph: return with errors during write Derrick Stolee via GitGitGadget
2019-05-01 14:46       ` Ævar Arnfjörð Bjarmason
2019-05-01 13:11     ` [PATCH v3 2/6] commit-graph: collapse parameters into flags Derrick Stolee via GitGitGadget
2019-05-01 13:11     ` [PATCH v3 3/6] commit-graph: create new version parameter Derrick Stolee via GitGitGadget
2019-05-01 13:11     ` [PATCH v3 4/6] commit-graph: add --version=<n> option Derrick Stolee via GitGitGadget
2019-05-01 13:11     ` [PATCH v3 5/6] commit-graph: implement file format version 2 Derrick Stolee via GitGitGadget
2019-05-01 19:12       ` Ævar Arnfjörð Bjarmason
2019-05-01 19:56         ` Derrick Stolee
2019-05-01 13:11     ` [PATCH v3 6/6] commit-graph: remove Future Work section Derrick Stolee via GitGitGadget
2019-05-01 14:58       ` Ævar Arnfjörð Bjarmason
2019-05-01 19:59         ` Derrick Stolee
2019-05-01 20:25     ` [PATCH v3 0/6] Create commit-graph file format v2 Ævar Arnfjörð Bjarmason
2019-05-02 13:26       ` Derrick Stolee
2019-05-02 18:02         ` Ævar Arnfjörð Bjarmason
2019-05-03 12:47           ` Derrick Stolee
2019-05-03 13:41             ` Ævar Arnfjörð Bjarmason
2019-05-06  8:27               ` Christian Couder
2019-05-06 13:47                 ` Derrick Stolee
2019-05-03 14:16             ` SZEDER Gábor
2019-05-03 15:11               ` Derrick Stolee
2019-05-09 14:22     ` [PATCH v4 00/11] Commit-graph write refactor (was: Create commit-graph file format v2) Derrick Stolee via GitGitGadget
2019-05-09 14:22       ` [PATCH v4 01/11] commit-graph: fix the_repository reference Derrick Stolee via GitGitGadget
2019-05-13  2:56         ` Junio C Hamano
2019-05-09 14:22       ` [PATCH v4 02/11] commit-graph: return with errors during write Derrick Stolee via GitGitGadget
2019-05-13  3:13         ` Junio C Hamano
2019-05-13 11:04           ` Derrick Stolee
2019-05-13 11:22             ` Derrick Stolee
2019-05-09 14:22       ` [PATCH v4 03/11] commit-graph: collapse parameters into flags Derrick Stolee via GitGitGadget
2019-05-13  3:44         ` Junio C Hamano
2019-05-13 11:07           ` Derrick Stolee
2019-05-09 14:22       ` [PATCH v4 04/11] commit-graph: remove Future Work section Derrick Stolee via GitGitGadget
2019-05-09 14:22       ` [PATCH v4 05/11] commit-graph: create write_commit_graph_context Derrick Stolee via GitGitGadget
2019-05-09 14:22       ` [PATCH v4 07/11] commit-graph: extract fill_oids_from_commit_hex() Derrick Stolee via GitGitGadget
2019-05-09 14:22       ` [PATCH v4 06/11] commit-graph: extract fill_oids_from_packs() Derrick Stolee via GitGitGadget
2019-05-13  5:05         ` Junio C Hamano
2019-05-09 14:22       ` [PATCH v4 08/11] commit-graph: extract fill_oids_from_all_packs() Derrick Stolee via GitGitGadget
2019-05-09 14:22       ` [PATCH v4 09/11] commit-graph: extract count_distinct_commits() Derrick Stolee via GitGitGadget
2019-05-09 14:22       ` [PATCH v4 10/11] commit-graph: extract copy_oids_to_commits() Derrick Stolee via GitGitGadget
2019-05-09 14:22       ` [PATCH v4 11/11] commit-graph: extract write_commit_graph_file() Derrick Stolee via GitGitGadget
2019-05-13  5:09         ` Junio C Hamano
2019-05-09 17:58       ` [PATCH v4 00/11] Commit-graph write refactor (was: Create commit-graph file format v2) Josh Steadmon
2019-06-12 13:29       ` [PATCH v5 " Derrick Stolee via GitGitGadget
2019-06-12 13:29         ` [PATCH v5 01/11] commit-graph: fix the_repository reference Derrick Stolee via GitGitGadget
2019-06-12 13:29         ` [PATCH v5 02/11] commit-graph: return with errors during write Derrick Stolee via GitGitGadget
2019-06-29 17:23           ` SZEDER Gábor
2019-07-01 12:19             ` Derrick Stolee
2019-06-12 13:29         ` [PATCH v5 03/11] commit-graph: collapse parameters into flags Derrick Stolee via GitGitGadget
2019-06-12 13:29         ` [PATCH v5 04/11] commit-graph: remove Future Work section Derrick Stolee via GitGitGadget
2019-06-12 13:29         ` [PATCH v5 05/11] commit-graph: create write_commit_graph_context Derrick Stolee via GitGitGadget
2019-06-12 13:29         ` [PATCH v5 06/11] commit-graph: extract fill_oids_from_packs() Derrick Stolee via GitGitGadget
2019-06-12 13:29         ` [PATCH v5 08/11] commit-graph: extract fill_oids_from_all_packs() Derrick Stolee via GitGitGadget
2019-06-12 13:29         ` [PATCH v5 07/11] commit-graph: extract fill_oids_from_commit_hex() Derrick Stolee via GitGitGadget
2019-06-12 13:29         ` [PATCH v5 09/11] commit-graph: extract count_distinct_commits() Derrick Stolee via GitGitGadget
2019-06-12 13:29         ` [PATCH v5 10/11] commit-graph: extract copy_oids_to_commits() Derrick Stolee via GitGitGadget
2019-06-12 13:29         ` [PATCH v5 11/11] commit-graph: extract write_commit_graph_file() Derrick Stolee via GitGitGadget

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=pull.112.v2.git.gitgitgadget@gmail.com \
    --to=gitgitgadget@gmail.com \
    --cc=avarab@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=sandals@crustytoothpaste.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).