git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Derrick Stolee <stolee@gmail.com>
To: Derrick Stolee via GitGitGadget <gitgitgadget@gmail.com>,
	git@vger.kernel.org
Cc: szeder.dev@gmail.com, me@ttaylorr.com,
	Derrick Stolee <derrickstolee@github.com>,
	Derrick Stolee <dstolee@microsoft.com>
Subject: Re: [PATCH 14/15] commit-graph: restore duplicate chunk checks
Date: Mon, 7 Dec 2020 08:43:24 -0500	[thread overview]
Message-ID: <86e1d647-2242-8a3b-d5a3-832e37e79ff8@gmail.com> (raw)
In-Reply-To: <106dd51f75699fbf4fc1e46687124995f5ef0278.1607012215.git.gitgitgadget@gmail.com>

On 12/3/2020 11:16 AM, Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee <dstolee@microsoft.com>
> 
> The previous change introduced read_table_of_contents() in the
> chunk-format API, but dropped the duplicate chunk check from the
> commit-graph parsing logic. This was done to keep flexibility in the
> chunk-format API.

"keep flexibility" is bogus. This is the biggest YAGNI of this
series. Instead, consider the patch below instead which restores
duplicate checks for the commit-graph file AND adds them to the
multi-pack-index file due to the shared API.

This is also roughly half of the added lines from the previous
patch.

Thanks,
-Stolee

-- >8 --

From 0df4959d59d7f9df3e9f6326bb0acb7b84f84980 Mon Sep 17 00:00:00 2001
From: Derrick Stolee <dstolee@microsoft.com>
Date: Mon, 7 Dec 2020 08:36:42 -0500
Subject: [PATCH] chunk-format: restore duplicate chunk checks

Before refactoring into the chunk-format API, the commit-graph parsing
logic included checks for duplicate chunks. It is unlikely that we would
desire a chunk-based file format that allows duplicate chunk IDs in the
table of contents, so add duplicate checks into
read_table_of_contents().

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 chunk-format.c | 15 ++++++++++++++-
 chunk-format.h |  3 +++
 2 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/chunk-format.c b/chunk-format.c
index d888ef6ec73..a4891bbd28a 100644
--- a/chunk-format.c
+++ b/chunk-format.c
@@ -57,9 +57,13 @@ int read_table_of_contents(const unsigned char *mfile,
 			   int nr,
 			   void *data)
 {
+	int i;
 	uint32_t chunk_id;
 	const unsigned char *table_of_contents = mfile + toc_offset;
 
+	for (i = 0; i < nr; i++)
+		chunks[i].found = 0;
+
 	while (toc_length--) {
 		int i;
 		uint64_t chunk_offset, next_chunk_offset;
@@ -83,7 +87,16 @@ int read_table_of_contents(const unsigned char *mfile,
 		}
 		for (i = 0; i < nr; i++) {
 			if (chunks[i].id == chunk_id) {
-				int result = chunks[i].read_fn(
+				int result;
+
+				if (chunks[i].found) {
+					error(_("duplicate chunk ID %"PRIx32" found"),
+					      chunk_id);
+					return 1;
+				}
+
+				chunks[i].found = 1;
+				result = chunks[i].read_fn(
 						mfile + chunk_offset,
 						next_chunk_offset - chunk_offset,
 						data);
diff --git a/chunk-format.h b/chunk-format.h
index 7049800f734..de45797223a 100644
--- a/chunk-format.h
+++ b/chunk-format.h
@@ -56,6 +56,9 @@ typedef int (*chunk_read_fn)(const unsigned char *chunk_start,
 struct read_chunk_info {
 	uint32_t id;
 	chunk_read_fn read_fn;
+
+	/* used internally */
+	unsigned found:1;
 };
 
 int read_table_of_contents(const unsigned char *mfile,
-- 
2.29.0.vfs.0.0




  reply	other threads:[~2020-12-07 13:44 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-12-03 16:16 [PATCH 00/15] Refactor chunk-format into an API Derrick Stolee via GitGitGadget
2020-12-03 16:16 ` [PATCH 01/15] commit-graph: anonymize data in chunk_write_fn Derrick Stolee via GitGitGadget
2020-12-03 16:16 ` [PATCH 02/15] chunk-format: add API for writing table of contents Derrick Stolee via GitGitGadget
2020-12-08 17:56   ` Taylor Blau
2020-12-03 16:16 ` [PATCH 03/15] midx: rename pack_info to write_midx_context Derrick Stolee via GitGitGadget
2020-12-03 16:16 ` [PATCH 04/15] midx: use context in write_midx_pack_names() Derrick Stolee via GitGitGadget
2020-12-03 16:16 ` [PATCH 05/15] midx: add entries to write_midx_context Derrick Stolee via GitGitGadget
2020-12-03 21:42   ` Junio C Hamano
2020-12-04 13:39     ` Derrick Stolee
2020-12-08 18:00   ` Taylor Blau
2020-12-03 16:16 ` [PATCH 06/15] midx: add pack_perm " Derrick Stolee via GitGitGadget
2020-12-03 16:16 ` [PATCH 07/15] midx: add num_large_offsets " Derrick Stolee via GitGitGadget
2020-12-03 16:16 ` [PATCH 08/15] midx: convert chunk write methods to return int Derrick Stolee via GitGitGadget
2020-12-03 21:50   ` Junio C Hamano
2020-12-04 13:40     ` Derrick Stolee
2020-12-03 16:16 ` [PATCH 09/15] midx: drop chunk progress during write Derrick Stolee via GitGitGadget
2020-12-03 16:16 ` [PATCH 10/15] midx: use chunk-format API in write_midx_internal() Derrick Stolee via GitGitGadget
2020-12-08 18:42   ` Taylor Blau
2020-12-10 14:36     ` Derrick Stolee
2020-12-03 16:16 ` [PATCH 11/15] midx: use 64-bit multiplication for chunk sizes Derrick Stolee via GitGitGadget
2020-12-03 22:00   ` Junio C Hamano
2020-12-08 18:43     ` Taylor Blau
2020-12-03 16:16 ` [PATCH 12/15] chunk-format: create write_chunks() Derrick Stolee via GitGitGadget
2020-12-08 18:45   ` Taylor Blau
2020-12-03 16:16 ` [PATCH 13/15] chunk-format: create chunk reading API Derrick Stolee via GitGitGadget
2020-12-03 22:17   ` Junio C Hamano
2020-12-04 13:47     ` Derrick Stolee
2020-12-04 20:17       ` Junio C Hamano
2020-12-03 22:43   ` Junio C Hamano
2020-12-04 13:45     ` Derrick Stolee
2020-12-03 16:16 ` [PATCH 14/15] commit-graph: restore duplicate chunk checks Derrick Stolee via GitGitGadget
2020-12-07 13:43   ` Derrick Stolee [this message]
2020-12-03 16:16 ` [PATCH 15/15] chunk-format: add technical docs Derrick Stolee via GitGitGadget
2020-12-04 12:48 ` [PATCH 00/15] Refactor chunk-format into an API René Scharfe
2020-12-04 13:57   ` Derrick Stolee
2020-12-04 19:42   ` Junio C Hamano
2020-12-08 18:49   ` Taylor Blau
2020-12-09 17:13     ` René Scharfe
2020-12-10  0:50       ` Taylor Blau
2020-12-10 14:30         ` Derrick Stolee

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=86e1d647-2242-8a3b-d5a3-832e37e79ff8@gmail.com \
    --to=stolee@gmail.com \
    --cc=derrickstolee@github.com \
    --cc=dstolee@microsoft.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=me@ttaylorr.com \
    --cc=szeder.dev@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).