From: "Martin Ågren" <martin.agren@gmail.com>
To: "brian m. carlson" <sandals@crustytoothpaste.net>
Cc: Git Mailing List <git@vger.kernel.org>,
Jonathan Tan <jonathantanmy@google.com>
Subject: Re: [PATCH 25/44] packfile: compute and use the index CRC offset
Date: Sat, 16 May 2020 13:12:26 +0200
Message-ID: <CAN0heSoozYnTJpiz3VZGq6XvrpmXenaf-5FiF=uts4Jhfb5c4g@mail.gmail.com>
In-Reply-To: <20200513005424.81369-26-sandals@crustytoothpaste.net>
On Wed, 13 May 2020 at 02:56, brian m. carlson
<sandals@crustytoothpaste.net> wrote:
>
> Both v2 pack index files and the v3 format specified as part of the
> NewHash work have similar data starting at the CRC table. Much of the
> existing code wants to read either this table or the offset entries
> following it, and in doing so computes the offset each time.
>
> In order to share as much code between v2 and v3, compute the offset of
> the CRC table and store it when the pack is opened. Use this value to
> compute offsets to not only the CRC table, but to the offset entries
> beyond it.
> --- a/builtin/index-pack.c
> +++ b/builtin/index-pack.c
> @@ -1555,13 +1555,9 @@ static void read_v2_anomalous_offsets(struct packed_git *p,
> {
> const uint32_t *idx1, *idx2;
> uint32_t i;
> - const uint32_t hashwords = the_hash_algo->rawsz / sizeof(uint32_t);
>
> /* The address of the 4-byte offset table */
> - idx1 = (((const uint32_t *)p->index_data)
> - + 2 /* 8-byte header */
> - + 256 /* fan out */
> - + hashwords * p->num_objects /* object ID table */
> + idx1 = (((const uint32_t *)((const uint8_t *)p->index_data + p->crc_offset))
> + p->num_objects /* CRC32 table */
> );
The old code counts in four-byte words (so `+ 2` skips ahead 8 bytes, as
the comment notes), which is why it needed `hashwords`, i.e. "rawsz/4".
Not new in this patch, but that outer pair of parentheses just makes
this harder to read, IMHO. I keep scanning back and forth wondering,
"where is this whole thing going to get multiplied or something?"
  idx1 = (const uint32_t *)((const uint8_t *)p->index_data + p->crc_offset)
         + p->num_objects /* CRC32 table */;
The double-casting can be avoided with something like this, but I'm not
sure it's really any better:
  idx1 = (const uint32_t *)p->index_data
         + p->crc_offset/sizeof(uint32_t)
         + p->num_objects /* CRC32 table */;
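For what it's worth, here is a minimal standalone sketch of why the two
spellings above land on the same address. `struct fake_pack` and both
function names are hypothetical stand-ins I made up for illustration,
not Git's real definitions. The single-cast form only works if
`crc_offset` is a multiple of four, which the sketch asserts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the few struct packed_git fields used here. */
struct fake_pack {
	const void *index_data;
	uint32_t num_objects;
	size_t crc_offset; /* byte offset of the CRC32 table */
};

/* Double cast: step in bytes to the CRC table, then in 4-byte words. */
static const uint32_t *offsets_via_bytes(const struct fake_pack *p)
{
	return (const uint32_t *)((const uint8_t *)p->index_data + p->crc_offset)
		+ p->num_objects /* CRC32 table */;
}

/* Single cast: convert the byte offset to a word count first. */
static const uint32_t *offsets_via_words(const struct fake_pack *p)
{
	/* The division below is only exact for word-aligned offsets. */
	assert(p->crc_offset % sizeof(uint32_t) == 0);
	return (const uint32_t *)p->index_data
		+ p->crc_offset / sizeof(uint32_t)
		+ p->num_objects /* CRC32 table */;
}
```

Called on the same struct, both functions should return the same
pointer, one CRC32 table past the start of the CRC32 table.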
> --- a/packfile.c
> +++ b/packfile.c
> @@ -178,6 +178,7 @@ int load_idx(const char *path, const unsigned int hashsz, void *idx_map,
> */
> (sizeof(off_t) <= 4))
> return error("pack too large for current definition of off_t in %s", path);
> + p->crc_offset = 8 + 4 * 256 + nr * hashsz;
> }
>
> p->index_version = version;
It doesn't fit in the context, but `nr` will be assigned to
`p->num_objects`. And now we can just use `hashsz` without dividing by
4, so this does the same calculation as the old one above.
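To double-check the arithmetic, here is a small sketch comparing the
byte-based value stored by the patch with the word-based sum the old
code used. The function names are hypothetical, and I am assuming
`hashsz` is a multiple of four (true for the 20-byte and 32-byte hash
sizes), since the old `hashwords` division relied on that too.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte offset of the CRC32 table, as the patch computes it in load_idx(). */
static size_t crc_offset_bytes(size_t nr, size_t hashsz)
{
	return 8 + 4 * 256 + nr * hashsz;
}

/* The same position counted in 4-byte words, as the old code did. */
static size_t crc_offset_words(size_t nr, size_t hashsz)
{
	/* hashwords was rawsz / sizeof(uint32_t); it must divide evenly. */
	assert(hashsz % sizeof(uint32_t) == 0);
	return 2 /* 8-byte header */
		+ 256 /* fan out */
		+ (hashsz / sizeof(uint32_t)) * nr /* object ID table */;
}
```

For any `nr`, `crc_offset_bytes(nr, hashsz)` should equal
`4 * crc_offset_words(nr, hashsz)`. With three objects and a 20-byte
hash, that is 1092 bytes, or 273 words.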
Martin