git.vger.kernel.org archive mirror
From: "Martin Ågren" <martin.agren@gmail.com>
To: "brian m. carlson" <sandals@crustytoothpaste.net>
Cc: Git Mailing List <git@vger.kernel.org>,
	Jonathan Tan <jonathantanmy@google.com>
Subject: Re: [PATCH 25/44] packfile: compute and use the index CRC offset
Date: Sat, 16 May 2020 13:12:26 +0200
Message-ID: <CAN0heSoozYnTJpiz3VZGq6XvrpmXenaf-5FiF=uts4Jhfb5c4g@mail.gmail.com>
In-Reply-To: <20200513005424.81369-26-sandals@crustytoothpaste.net>

On Wed, 13 May 2020 at 02:56, brian m. carlson
<sandals@crustytoothpaste.net> wrote:
>
> Both v2 pack index files and the v3 format specified as part of the
> NewHash work have similar data starting at the CRC table.  Much of the
> existing code wants to read either this table or the offset entries
> following it, and in doing so computes the offset each time.
>
> In order to share as much code between v2 and v3, compute the offset of
> the CRC table and store it when the pack is opened.  Use this value to
> compute offsets to not only the CRC table, but to the offset entries
> beyond it.

> --- a/builtin/index-pack.c
> +++ b/builtin/index-pack.c
> @@ -1555,13 +1555,9 @@ static void read_v2_anomalous_offsets(struct packed_git *p,
>  {
>         const uint32_t *idx1, *idx2;
>         uint32_t i;
> -       const uint32_t hashwords = the_hash_algo->rawsz / sizeof(uint32_t);
>
>         /* The address of the 4-byte offset table */
> -       idx1 = (((const uint32_t *)p->index_data)
> -               + 2 /* 8-byte header */
> -               + 256 /* fan out */
> -               + hashwords * p->num_objects /* object ID table */
> +       idx1 = (((const uint32_t *)((const uint8_t *)p->index_data + p->crc_offset))
>                 + p->num_objects /* CRC32 table */
>                 );

The old code counts in four-byte words (so `+ 2` skips ahead 8 bytes, as
the comment notes), which is why it needs `hashwords`, i.e., "rawsz/4".

Not new in this patch, but that outer pair of parentheses just makes
this harder to read, IMHO. I keep scanning back and forth wondering,
"where is this whole thing going to get multiplied or something?"

  idx1 = (const uint32_t *)((const uint8_t *)p->index_data + p->crc_offset)
         + p->num_objects /* CRC32 table */;

The double-casting can be avoided with something like this, but I'm not
sure it's really any better:

  idx1 = (const uint32_t *)p->index_data
         + p->crc_offset/sizeof(uint32_t)
         + p->num_objects /* CRC32 table */;
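
To convince oneself that the two spellings above land on the same
address, here is a minimal standalone sketch. It is not part of the
patch: `struct fake_pack` and both helper names are hypothetical
stand-ins that mirror only the three fields involved, and the word-based
variant assumes `crc_offset` is a multiple of four.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant fields of struct packed_git. */
struct fake_pack {
	const void *index_data;
	uint32_t num_objects;
	uint32_t crc_offset; /* bytes from the start of the index */
};

/* Step by a byte offset first, then switch to 4-byte words. */
static const uint32_t *offset_table_bytes(const struct fake_pack *p)
{
	return (const uint32_t *)((const uint8_t *)p->index_data + p->crc_offset)
	       + p->num_objects /* CRC32 table */;
}

/*
 * Stay in 4-byte words throughout. The division truncates, so this
 * only matches the byte-based form when crc_offset % 4 == 0.
 */
static const uint32_t *offset_table_words(const struct fake_pack *p)
{
	return (const uint32_t *)p->index_data
	       + p->crc_offset / sizeof(uint32_t)
	       + p->num_objects /* CRC32 table */;
}
```

With `crc_offset` computed as 8 + 4 * 256 + nr * hashsz and a hash size
of 20 or 32 bytes, the offset is always a multiple of four, so the
division loses nothing.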

> --- a/packfile.c
> +++ b/packfile.c
> @@ -178,6 +178,7 @@ int load_idx(const char *path, const unsigned int hashsz, void *idx_map,
>                      */
>                     (sizeof(off_t) <= 4))
>                         return error("pack too large for current definition of off_t in %s", path);
> +               p->crc_offset = 8 + 4 * 256 + nr * hashsz;
>         }
>
>         p->index_version = version;

It isn't visible in the diff context, but `nr` will be assigned to
`p->num_objects`. And since we now count in bytes, we can use `hashsz`
without dividing by 4, so this does the same calculation as the old one
above.
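
The equivalence can be spelled out as a small sketch. The two function
names below are made up for illustration and do not exist in git; the
first follows the removed word-based arithmetic from
read_v2_anomalous_offsets(), the second the new byte-based line in
load_idx().

```c
#include <stddef.h>
#include <stdint.h>

/* Old form: count in 4-byte words, then convert the total to bytes. */
static size_t crc_offset_from_words(size_t nr, size_t rawsz)
{
	size_t hashwords = rawsz / sizeof(uint32_t);
	size_t words = 2 /* 8-byte header */
		     + 256 /* fan out */
		     + hashwords * nr /* object ID table */;

	return words * sizeof(uint32_t);
}

/* New form: count in bytes directly, no division by 4 needed. */
static size_t crc_offset_from_bytes(size_t nr, size_t hashsz)
{
	return 8 + 4 * 256 + nr * hashsz;
}
```

For three objects with a 20-byte hash, both give 8 + 1024 + 60 = 1092
bytes. The two forms agree whenever the hash size is a multiple of four,
which holds for both 20 and 32.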


Martin

