From: "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: szeder.dev@gmail.com, me@ttaylorr.com,
	Derrick Stolee <derrickstolee@github.com>,
	Derrick Stolee <dstolee@microsoft.com>
Subject: [PATCH 14/15] commit-graph: restore duplicate chunk checks
Date: Thu, 03 Dec 2020 16:16:53 +0000
Message-ID: <106dd51f75699fbf4fc1e46687124995f5ef0278.1607012215.git.gitgitgadget@gmail.com>
In-Reply-To: <pull.804.git.1607012215.gitgitgadget@gmail.com>

From: Derrick Stolee <dstolee@microsoft.com>

The previous change introduced read_table_of_contents() in the
chunk-format API, but dropped the duplicate chunk check from the
commit-graph parsing logic. This was done to keep flexibility in the
chunk-format API.

One way to restore this check is to have each chunk_read_fn method check
whether it has already run. This is somewhat repetitive. If we later
decide that the chunk-format API should require that chunks are never
repeated, then these per-method checks could be replaced with a single
check in chunk-format.c.

For now, only restore the duplicate checks that previously existed in
the commit-graph parsing logic.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 commit-graph.c | 34 ++++++++++++++++++++++++++++++++++
 1 file changed, 34 insertions(+)
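
For comparison, the alternative mentioned in the commit message (a single
check in chunk-format.c instead of one guard per chunk_read_fn) might look
roughly like the sketch below. It is illustrative only and not part of this
patch: struct toc_entry and toc_has_duplicate() are hypothetical names, and
the real table-of-contents representation in chunk-format.c may differ.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical, simplified table-of-contents entry. This is an
 * assumption for illustration, not the struct used by chunk-format.c.
 */
struct toc_entry {
	uint32_t id;     /* four-byte chunk identifier */
	uint64_t offset; /* start of the chunk within the file */
};

/*
 * Return 1 if any chunk id appears more than once in the table of
 * contents, 0 otherwise. The scan is quadratic, which is acceptable
 * only because these files hold a small number of chunks.
 */
static int toc_has_duplicate(const struct toc_entry *toc, size_t nr)
{
	size_t i, j;

	for (i = 0; i < nr; i++)
		for (j = i + 1; j < nr; j++)
			if (toc[i].id == toc[j].id)
				return 1;
	return 0;
}
```

A reader could call such a helper once after parsing the table of contents
and reject the file before any chunk_read_fn runs. The trade-off is the one
described above: every format built on the API would lose the ability to
repeat a chunk.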

diff --git a/commit-graph.c b/commit-graph.c
index 0a3ba147df..c0102fceba 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -289,10 +289,20 @@ static int verify_commit_graph_lite(struct commit_graph *g)
 	return 0;
 }
 
+static int report_duplicate(void)
+{
+	warning(_("duplicate chunk detected"));
+	return 1;
+}
+
 static int graph_read_oid_fanout(const unsigned char *chunk_start,
 				 size_t chunk_size, void *data)
 {
 	struct commit_graph *g = (struct commit_graph *)data;
+
+	if (g->chunk_oid_fanout)
+		return report_duplicate();
+
 	g->chunk_oid_fanout = (uint32_t*)chunk_start;
 	return 0;
 }
@@ -301,6 +311,10 @@ static int graph_read_oid_lookup(const unsigned char *chunk_start,
 				 size_t chunk_size, void *data)
 {
 	struct commit_graph *g = (struct commit_graph *)data;
+
+	if (g->chunk_oid_lookup)
+		return report_duplicate();
+
 	g->chunk_oid_lookup = chunk_start;
 	g->num_commits = chunk_size / g->hash_len;
 	return 0;
@@ -310,6 +324,10 @@ static int graph_read_data(const unsigned char *chunk_start,
 				 size_t chunk_size, void *data)
 {
 	struct commit_graph *g = (struct commit_graph *)data;
+
+	if (g->chunk_commit_data)
+		return report_duplicate();
+
 	g->chunk_commit_data = chunk_start;
 	return 0;
 }
@@ -318,6 +336,10 @@ static int graph_read_extra_edges(const unsigned char *chunk_start,
 				  size_t chunk_size, void *data)
 {
 	struct commit_graph *g = (struct commit_graph *)data;
+
+	if (g->chunk_extra_edges)
+		return report_duplicate();
+
 	g->chunk_extra_edges = chunk_start;
 	return 0;
 }
@@ -326,6 +348,10 @@ static int graph_read_base_graphs(const unsigned char *chunk_start,
 				  size_t chunk_size, void *data)
 {
 	struct commit_graph *g = (struct commit_graph *)data;
+
+	if (g->chunk_base_graphs)
+		return report_duplicate();
+
 	g->chunk_base_graphs = chunk_start;
 	return 0;
 }
@@ -334,6 +360,10 @@ static int graph_read_bloom_indices(const unsigned char *chunk_start,
 				    size_t chunk_size, void *data)
 {
 	struct commit_graph *g = (struct commit_graph *)data;
+
+	if (g->chunk_bloom_indexes)
+		return report_duplicate();
+
 	g->chunk_bloom_indexes = chunk_start;
 	return 0;
 }
@@ -343,6 +373,10 @@ static int graph_read_bloom_data(const unsigned char *chunk_start,
 {
 	struct commit_graph *g = (struct commit_graph *)data;
 	uint32_t hash_version;
+
+	if (g->chunk_bloom_data)
+		return report_duplicate();
+
 	g->chunk_bloom_data = chunk_start;
 	hash_version = get_be32(chunk_start);
 
-- 
gitgitgadget

