From: "Martin Ågren" <martin.agren@gmail.com>
To: "brian m. carlson" <sandals@crustytoothpaste.net>,
	"Martin Ågren" <martin.agren@gmail.com>,
	"Git Mailing List" <git@vger.kernel.org>,
	"Jonathan Tan" <jonathantanmy@google.com>
Subject: Re: [PATCH 36/44] builtin/index-pack: add option to specify hash algorithm
Date: Sun, 17 May 2020 20:16:37 +0200
Message-ID: <CAN0heSqXSPXG38aqQggxA6yjkg_+PVVdh3M01RQKJM0gO0wAPA@mail.gmail.com>
In-Reply-To: <20200516204710.GI6362@camp.crustytoothpaste.net>

On Sat, 16 May 2020 at 22:47, brian m. carlson
<sandals@crustytoothpaste.net> wrote:
>
> On 2020-05-16 at 11:18:12, Martin Ågren wrote:
> > On Wed, 13 May 2020 at 02:56, brian m. carlson
> > <sandals@crustytoothpaste.net> wrote:
> > >
> > > git index-pack is usually run in a repository, but need not be. Since
> > > packs don't contains information on the algorithm in use, instead
> > > relying on context, add an option to index-pack to tell it which one
> > > we're using in case someone runs it outside of a repository.
>
> > Similar to an earlier patch where we modify `the_hash_algo` like this, I
> > feel a bit nervous. What happens if you pass in a "wrong" algo here,
> > i.e., SHA-1 in a SHA-256 repo? Or, given the motivation in the commit
> > message, should this only be allowed if we really *are* outside a repo?
>
> Unfortunately, we can't prevent the user from being inside repository A,
> which is SHA-1, while invoking git index-pack on repository B, which is
> SHA-256.

Ah, I see.

>  That is valid without --stdin, if uncommon, and it needs to be
> supported.  I can prevent it from being used with --stdin, though.

Hmm, that might make sense. I suppose it could quickly get out of
control with bug reports coming in along the lines of "if I do this
really crazy git index-pack invocation, I manage to mess things up". The
easiest way to address this might be through documentation, i.e., "don't
use this option", "for internal use" or even "to be used by the test
suite only", for which there is even precedent in git-index-pack(1).

On the other hand, if we need to detect such hash mismatch even once the
SHA-256 work is 100% complete, then I suppose we really should try a
bit to catch bad invocations.

As a tangent, I see that v2.27.0 will come with `git init
--object-format=<format>` and `GIT_DEFAULT_HASH_ALGORITHM`. The docs for
the former mention "(if enabled)". Should we add something more scary
to those to make it clear that they shouldn't be used and that you
basically shouldn't even try to figure out how to enable them? I can
already see the tweets and blog posts a few weeks from now about how you
can build Git from source setting a single switch, run

  git init --object-format=sha256

and you're in the future! Which will just lead to pain some days or
weeks later.... "I've done lots of work. How do I convert my repo to
SHA-1 so I can share it?"...

We've added "experimental" things before and tried to document the
experimental nature. Maybe here we're not even "experimental" -- more
like "if you use this in production, you *will* suffer"?

> If you pass in a wrong algorithm, we usually blow up with an inflate
> error because we consume more bytes than expected with our ref deltas.
> I'm not aware of any cases where we segfault or access invalid memory;
> we just blow up in a nonobvious way.  That's true, too, if you manually
> tamper with the algorithm in extensions.objectformat; usually we blow up
> (but not segfault) because the index is "corrupt".

Ok, I see. I suppose "some time", we could tweak error messages to hint
about an object-format mismatch, but I don't think that needs to block
your work here now.
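
To make the failure mode concrete, here is a minimal sketch of why a
wrong hash length tends to end in an inflate error. It is not Git's
code. It assumes a toy REF_DELTA body made of a raw base object ID
followed directly by zlib data, and leaves out the entry's type/size
header. The function names, the dummy base ID, and the payload are all
made up for illustration.

```python
import zlib

SHA1_LEN = 20    # bytes in a raw SHA-1 object ID
SHA256_LEN = 32  # bytes in a raw SHA-256 object ID


def make_ref_delta_body(base_id: bytes, delta: bytes) -> bytes:
    """Build a toy REF_DELTA body: raw base object ID, then zlib data."""
    # Level 0 (stored blocks) keeps the stream bytes predictable for the demo.
    return base_id + zlib.compress(delta, 0)


def read_ref_delta_body(body: bytes, hash_len: int) -> bytes:
    """Skip hash_len bytes of base object ID, then inflate what follows."""
    return zlib.decompress(body[hash_len:])


delta = b"fake delta instructions"  # hypothetical payload, not a real delta
body = make_ref_delta_body(b"\x11" * SHA1_LEN, delta)

# Reading with the right length recovers the payload.
assert read_ref_delta_body(body, SHA1_LEN) == delta

# Reading with the SHA-256 length eats 12 bytes of the zlib stream, so
# inflate starts in the middle of the data and gives up.
try:
    read_ref_delta_body(body, SHA256_LEN)
except zlib.error as err:
    print("inflate failed:", err)
```

Nothing in the body says which length is right, which matches the
"packs rely on context" point from the commit message: the reader only
finds out when zlib chokes.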

Martin


