* Preserving the ability to have both SHA1 and SHA256 signatures
@ 2021-05-08 2:22 dwh
2021-05-08 6:39 ` Christian Couder
2021-05-09 0:19 ` brian m. carlson
0 siblings, 2 replies; 19+ messages in thread
From: dwh @ 2021-05-08 2:22 UTC (permalink / raw)
To: git

Hi Everybody,

I was reading through the
Documentation/technical/hash-function-transition.txt doc and realized
that the plan is to support allowing BOTH SHA1 and SHA256 signatures to
exist in a single object:

> Signed Commits
> 1. using SHA-1 only, as in existing signed commit objects
> 2. using both SHA-1 and SHA-256, by using both gpgsig-sha256 and gpgsig
> fields.
> 3. using only SHA-256, by only using the gpgsig-sha256 field.
>
> Signed Tags
> 1. using SHA-1 only, as in existing signed tag objects
> 2. using both SHA-1 and SHA-256, by using gpgsig-sha256 and an in-body
> signature.
> 3. using only SHA-256, by only using the gpgsig-sha256 field.

The design that I'm working on only supports a single signature that
uses a combination of fields: one 'signtype', zero or more 'signoption'
and one 'sign' in objects. I am thinking that the best thing to do is
replace the gpgsig-sha256 fields in objects and allow old gpgsig (commits)
and in-body (tags) signatures to co-exist along side to give the same
functionality.

That not only paves the way forward but preserves the full backward
compatibility that is one of my top requirements.

Thoughts?

Cheers!
Dave

^ permalink raw reply [flat|nested] 19+ messages in thread
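
The three signing modes quoted in the message above can be told apart by looking at which fields an object carries. The following is a minimal Python sketch of that idea, not Git's own code: the helper names `header_names` and `signing_mode` are hypothetical, and it assumes the usual raw object layout in which the header block ends at the first empty line and continuation lines of a multi-line header start with a space.

```python
PGP_BEGIN = b"-----BEGIN PGP SIGNATURE-----"


def header_names(raw: bytes) -> set:
    """Return the header field names of a raw commit or tag object."""
    head, _, _ = raw.partition(b"\n\n")
    names = set()
    for line in head.split(b"\n"):
        # Continuation lines of a multi-line header start with a space.
        if line and not line.startswith(b" "):
            names.add(line.split(b" ", 1)[0])
    return names


def signing_mode(raw: bytes, kind: str) -> str:
    """Classify an object as unsigned, sha1-only, both, or sha256-only."""
    names = header_names(raw)
    has_sha256 = b"gpgsig-sha256" in names
    if kind == "commit":
        # Existing signed commits keep their signature in a gpgsig header.
        has_sha1 = b"gpgsig" in names
    elif kind == "tag":
        # Existing signed tags keep their signature in the message body.
        _, _, body = raw.partition(b"\n\n")
        has_sha1 = PGP_BEGIN in body
    else:
        raise ValueError("kind must be 'commit' or 'tag'")
    if has_sha1 and has_sha256:
        return "both"
    if has_sha1:
        return "sha1-only"
    if has_sha256:
        return "sha256-only"
    return "unsigned"
```

A replacement design built on 'signtype', 'signoption' and 'sign' fields would need an equivalent way to express the "both" case, which is the point the replies below keep returning to.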
* Re: Preserving the ability to have both SHA1 and SHA256 signatures
2021-05-08 2:22 Preserving the ability to have both SHA1 and SHA256 signatures dwh
@ 2021-05-08 6:39 ` Christian Couder
2021-05-08 6:56 ` Junio C Hamano
2021-05-09 0:19 ` brian m. carlson
1 sibling, 1 reply; 19+ messages in thread
From: Christian Couder @ 2021-05-08 6:39 UTC (permalink / raw)
To: dwh; +Cc: git

Hi,

(Not sure why, but, when using "Reply to all" in Gmail, it doesn't
actually reply to you (or Cc you), only to the mailing list. I had to
manually add your email back.)

On Sat, May 8, 2021 at 4:25 AM <dwh@linuxprogrammer.org> wrote:
>
> Hi Everybody,
>
> I was reading through the
> Documentation/technical/hash-function-transition.txt doc and realized
> that the plan is to support allowing BOTH SHA1 and SHA256 signatures to
> exist in a single object:
>
> > Signed Commits
> > 1. using SHA-1 only, as in existing signed commit objects
> > 2. using both SHA-1 and SHA-256, by using both gpgsig-sha256 and gpgsig
> > fields.
> > 3. using only SHA-256, by only using the gpgsig-sha256 field.
> >
> > Signed Tags
> > 1. using SHA-1 only, as in existing signed tag objects
> > 2. using both SHA-1 and SHA-256, by using gpgsig-sha256 and an in-body
> > signature.
> > 3. using only SHA-256, by only using the gpgsig-sha256 field.
>
> The design that I'm working on only supports a single signature that
> uses a combination of fields: one 'signtype', zero or more 'signoption'
> and one 'sign' in objects.

Here I understand that your design doesn't support both a SHA1 and a
SHA256 signature.

> I am thinking that the best thing to do is
> replace the gpgsig-sha256 fields in objects and allow old gpgsig (commits)
> and in-body (tags) signatures to co-exist along side to give the same
> functionality.

Is this part of your design, or a, maybe temporary, alternative to it?

> That not only paves the way forward but preserves the full backward
> compatibility that is one of my top requirements.

There have been patches and discussions quite recently about this, that
have been reported on in our Git Rev News newsletter:

https://git.github.io/rev_news/2021/02/27/edition-72/

You can see that, with the latest patches (not sure the documentation
is up-to-date though), signing both commits and tags can now be
round-tripped through both SHA-1 and SHA-256 conversions.

How isn't that fully backward compatible?

Best,
Christian.

^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Preserving the ability to have both SHA1 and SHA256 signatures
2021-05-08 6:39 ` Christian Couder
@ 2021-05-08 6:56 ` Junio C Hamano
2021-05-08 8:03 ` Felipe Contreras
0 siblings, 1 reply; 19+ messages in thread
From: Junio C Hamano @ 2021-05-08 6:56 UTC (permalink / raw)
To: Christian Couder; +Cc: dwh, git

Christian Couder <christian.couder@gmail.com> writes:

> Hi,
>
> (Not sure why, but, when using "Reply to all" in Gmail, it doesn't
> actually reply to you (or Cc you), only to the mailing list. I had to
> manually add your email back.)

I am sure why. DWH, please do not use mail-follow-up-to when
working with this list. It is rude and wastes people's time (like
the practice just did by stealing time from Christian).

Also cf.

https://lore.kernel.org/git/7v63l6f1mc.fsf@gitster.siamese.dyndns.org/
https://lore.kernel.org/git/7vk3zig92n.fsf@alter.siamese.dyndns.org/

^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Preserving the ability to have both SHA1 and SHA256 signatures
2021-05-08 6:56 ` Junio C Hamano
@ 2021-05-08 8:03 ` Felipe Contreras
2021-05-08 10:11 ` Stefan Moch
0 siblings, 1 reply; 19+ messages in thread
From: Felipe Contreras @ 2021-05-08 8:03 UTC (permalink / raw)
To: Junio C Hamano, Christian Couder; +Cc: dwh, git

Junio C Hamano wrote:
> Christian Couder <christian.couder@gmail.com> writes:
> > (Not sure why, but, when using "Reply to all" in Gmail, it doesn't
> > actually reply to you (or Cc you), only to the mailing list. I had to
> > manually add your email back.)
>
> I am sure why. DWH, please do not use mail-follow-up-to when
> working with this list. It is rude and wastes people's time (like
> the practice just did by stealing time from Christian).

I agree with this, but shouldn't this be written in some kind of mail
etiquette guideline? Along with a rationale.

--
Felipe Contreras

^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Preserving the ability to have both SHA1 and SHA256 signatures
2021-05-08 8:03 ` Felipe Contreras
@ 2021-05-08 10:11 ` Stefan Moch
2021-05-08 11:12 ` Junio C Hamano
0 siblings, 1 reply; 19+ messages in thread
From: Stefan Moch @ 2021-05-08 10:11 UTC (permalink / raw)
To: Felipe Contreras, Junio C Hamano, Christian Couder, dwh; +Cc: git

Felipe Contreras wrote:
> Junio C Hamano wrote:
>> Christian Couder <christian.couder@gmail.com> writes:
>>> (Not sure why, but, when using "Reply to all" in Gmail, it doesn't
>>> actually reply to you (or Cc you), only to the mailing list. I had to
>>> manually add your email back.)
>>
>> I am sure why. DWH, please do not use mail-follow-up-to when
>> working with this list. It is rude and wastes people's time (like
>> the practice just did by stealing time from Christian).
>
> I agree with this, but shouldn't this be written in some kind of mail
> etiquette guideline? Along with a rationale.

Good idea to write this down. How to use the mailing list is only
sparsely documented. The following files talk about sending to the
mailing list:

1. README.md
2. Documentation/SubmittingPatches
3. Documentation/MyFirstContribution.txt
4. MaintNotes (in Junio's “todo” branch, sent out to the list from
   time to time as “A note from the maintainer”)

2, 3 and 4 mention sending Cc to everyone involved.

2 is about new messages.

3 and 4 specifically talk about keeping everyone in Cc: in replies.
Both in the context of “you don't have to be subscribed and you
don't need to ask for Cc:”.

Please also note that mutt sets the “Mail-Followup-To:” header by
default when sending to known mailing lists, unless “followup_to” is
set to “no”. Whether or not it removes the sender address in this
header depends on whether the list address is known to be subscribed
to or simply known to be a mailing list. It also does not set this
header if no recipient address is known as a mailing list.

http://www.mutt.org/doc/manual/#followup-to
http://www.mutt.org/doc/manual/#using-lists

^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Preserving the ability to have both SHA1 and SHA256 signatures
2021-05-08 10:11 ` Stefan Moch
@ 2021-05-08 11:12 ` Junio C Hamano
0 siblings, 0 replies; 19+ messages in thread
From: Junio C Hamano @ 2021-05-08 11:12 UTC (permalink / raw)
To: Stefan Moch; +Cc: Felipe Contreras, Christian Couder, dwh, git

Stefan Moch <stefanmoch@mail.de> writes:

> Good idea to write this down. How to use the mailing list is only
> sparsely documented. The following files talk about sending to the
> mailing list:
>
> 1. README.md
> 2. Documentation/SubmittingPatches
> 3. Documentation/MyFirstContribution.txt
> 4. MaintNotes (in Junio's “todo” branch, sent out to the list from
> time to time as “A note from the maintainer”)
>
> 2, 3 and 4 mention sending Cc to everyone involved.
>
> 2 is about new messages.
>
> 3 and 4 specifically talk about keeping everyone in Cc: in replies.
> Both in the context of “you don't have to be subscribed and you
> don't need to ask for Cc:”.

In case somebody wants to write a doc, a better pair of references
than what I quoted earlier to draw material from are:

https://public-inbox.org/git/7v4pndfjym.fsf@assigned-by-dhcp.cox.net/
https://public-inbox.org/git/7vei7zjr3y.fsf@alter.siamese.dyndns.org/

^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Preserving the ability to have both SHA1 and SHA256 signatures
2021-05-08 2:22 Preserving the ability to have both SHA1 and SHA256 signatures dwh
2021-05-08 6:39 ` Christian Couder
@ 2021-05-09 0:19 ` brian m. carlson
2021-05-10 12:22 ` Is the sha256 object format experimental or not? Ævar Arnfjörð Bjarmason
1 sibling, 1 reply; 19+ messages in thread
From: brian m. carlson @ 2021-05-09 0:19 UTC (permalink / raw)
To: git

[-- Attachment #1: Type: text/plain, Size: 1689 bytes --]

On 2021-05-08 at 02:22:25, dwh@linuxprogrammer.org wrote:
> Hi Everybody,
>
> I was reading through the
> Documentation/technical/hash-function-transition.txt doc and realized
> that the plan is to support allowing BOTH SHA1 and SHA256 signatures to
> exist in a single object:
>
> > Signed Commits
> > 1. using SHA-1 only, as in existing signed commit objects
> > 2. using both SHA-1 and SHA-256, by using both gpgsig-sha256 and gpgsig
> > fields.
> > 3. using only SHA-256, by only using the gpgsig-sha256 field.
> >
> > Signed Tags
> > 1. using SHA-1 only, as in existing signed tag objects
> > 2. using both SHA-1 and SHA-256, by using gpgsig-sha256 and an in-body
> > signature.
> > 3. using only SHA-256, by only using the gpgsig-sha256 field.

Yes, this is the case. We have tests for this case.

> The design that I'm working on only supports a single signature that
> uses a combination of fields: one 'signtype', zero or more 'signoption'
> and one 'sign' in objects. I am thinking that the best thing to do is
> replace the gpgsig-sha256 fields in objects and allow old gpgsig (commits)
> and in-body (tags) signatures to co-exist along side to give the same
> functionality.

You can't do that. SHA-256 repositories already exist and that would
break compatibility.

> That not only paves the way forward but preserves the full backward
> compatibility that is one of my top requirements.

I've reviewed your proposed design and provided feedback that we need
to preserve this functionality in your new design as well. People will
want to have that functionality.

--
brian m. carlson (he/him or they/them)
Houston, Texas, US

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 262 bytes --]

^ permalink raw reply [flat|nested] 19+ messages in thread
* Is the sha256 object format experimental or not?
2021-05-09 0:19 ` brian m. carlson
@ 2021-05-10 12:22 ` Ævar Arnfjörð Bjarmason
2021-05-10 22:42 ` brian m. carlson
0 siblings, 1 reply; 19+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-05-10 12:22 UTC (permalink / raw)
To: brian m. carlson; +Cc: git

On Sun, May 09 2021, brian m. carlson wrote:

> [[PGP Signed Part:Undecided]]
> On 2021-05-08 at 02:22:25, dwh@linuxprogrammer.org wrote:
>> Hi Everybody,
>>
>> I was reading through the
>> Documentation/technical/hash-function-transition.txt doc and realized
>> that the plan is to support allowing BOTH SHA1 and SHA256 signatures to
>> exist in a single object:
>>
>> > Signed Commits
>> > 1. using SHA-1 only, as in existing signed commit objects
>> > 2. using both SHA-1 and SHA-256, by using both gpgsig-sha256 and gpgsig
>> > fields.
>> > 3. using only SHA-256, by only using the gpgsig-sha256 field.
>> >
>> > Signed Tags
>> > 1. using SHA-1 only, as in existing signed tag objects
>> > 2. using both SHA-1 and SHA-256, by using gpgsig-sha256 and an in-body
>> > signature.
>> > 3. using only SHA-256, by only using the gpgsig-sha256 field.
>
> Yes, this is the case. We have tests for this case.
>
>> The design that I'm working on only supports a single signature that
>> uses a combination of fields: one 'signtype', zero or more 'signoption'
>> and one 'sign' in objects. I am thinking that the best thing to do is
>> replace the gpgsig-sha256 fields in objects and allow old gpgsig (commits)
>> and in-body (tags) signatures to co-exist along side to give the same
>> functionality.
>
> You can't do that. SHA-256 repositories already exist and that would
> break compatibility.

From memory this is at least the second time you've brought up this
point on-list.

My feeling is that almost nobody's using sha256 currently, and we have a
very prominent ALL CAPS warning saying the format is experimental and
may change, see ff233d8dda1 (Documentation: mark
`--object-format=sha256` as experimental, 2020-08-16).

I agree with the docs as they stand, and don't think we should hold back
on changing the object format for sha256 in general if there's a
compelling reason to do so.

Whether this suggested change has a compelling reason is another matter
(I haven't reviewed it).

But it seems to me that if the main person pushing the sha256 effort
disagrees with the content of
Documentation/object-format-disclaimer.txt, we'd be better off at this
point discussing a patch to change the wording there to something to the
effect that we consider the format set in stone at this point.

^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Is the sha256 object format experimental or not?
2021-05-10 12:22 ` Is the sha256 object format experimental or not? Ævar Arnfjörð Bjarmason
@ 2021-05-10 22:42 ` brian m. carlson
2021-05-13 20:29 ` dwh
0 siblings, 1 reply; 19+ messages in thread
From: brian m. carlson @ 2021-05-10 22:42 UTC (permalink / raw)
To: Ævar Arnfjörð Bjarmason; +Cc: git

[-- Attachment #1: Type: text/plain, Size: 2480 bytes --]

On 2021-05-10 at 12:22:00, Ævar Arnfjörð Bjarmason wrote:
>
> On Sun, May 09 2021, brian m. carlson wrote:
> > You can't do that. SHA-256 repositories already exist and that would
> > break compatibility.
>
> From memory this is at least the second time you've brought up this
> point on-list.
>
> My feeling is that almost nobody's using sha256 currently, and we have a
> very prominent ALL CAPS warning saying the format is experimental and
> may change, see ff233d8dda1 (Documentation: mark
> `--object-format=sha256` as experimental, 2020-08-16).

Yes, I agreed to such text because others thought it was a good idea in
case we needed to make a change. However, we don't need to make an
incompatible change here, so we should avoid that if possible.

Almost nobody is using it because the main forges don't yet support it,
because it's going to be just as much work to support it there as it has
been in Git. We won't be making it easier by making deliberately
incompatible changes when we don't have to.

> I agree with the docs as they stand, and don't think we should hold back
> on changing the object format for sha256 in general if there's a
> compelling reason to do so.

I am using it and I know of other people who are using it. There are
people whose companies cannot use SHA-1 for compliance reasons and are
already making use of it.

The problem here is a chicken and egg: nobody's going to use SHA-256
support if it's experimental and their entire repo might end up totally
useless, and it's not going to become stable if nobody uses it.

> But it seems to me that if the main person pushing the sha256 effort
> disagrees with the content of
> Documentation/object-format-disclaimer.txt, we'd be better off at this
> point discussing a patch to change the wording there to something to the
> effect that we consider the format set in stone at this point.

I've been pretty clear up front that I thought the data was stable and
we should avoid making incompatible changes. It may be that it is still
experimental and may change incompatibly, but if we can avoid that
problem, we should.

I don't personally intend to send a patch removing the note about it
being experimental until I've finished getting object interop done,
since that's the major issue where we might need to make an incompatible
change, but that work is moving slowly.

--
brian m. carlson (he/him or they/them)
Houston, Texas, US

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 262 bytes --]

^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Is the sha256 object format experimental or not?
2021-05-10 22:42 ` brian m. carlson
@ 2021-05-13 20:29 ` dwh
2021-05-13 20:49 ` Konstantin Ryabitsev
` (2 more replies)
0 siblings, 3 replies; 19+ messages in thread
From: dwh @ 2021-05-13 20:29 UTC (permalink / raw)
To: brian m. carlson; +Cc: Ævar Arnfjörð Bjarmason, git

On 10.05.2021 22:42, brian m. carlson wrote:
>Almost nobody is using it because the main forges don't yet support it,
>because it's going to be just as much work to support it there as it has
>been in Git. We won't be making it easier by making deliberately
>incompatible changes when we don't have to.

I know that you said there is no reason to make a breaking change to the
SHA256 implementation now, but because of what you say above, I think we
still have the opportunity to make breaking changes. In any case I think
we only need to make one breaking change to gain algorithmic agility
going forward and avoid painful, multi-year transitions like the one
you've been executing.

My project to add universal cryptographic signing to Git by using a
standard protocol and generalized configuration to support any
cryptographic signing scheme could also apply to the digests as well,
and I think it should.

If object digests in Git were self-describing (i.e. they contain an
algorithm identifier as well as the digest) then repos gain "algorithmic
agility" and can change algorithms at any time to keep up as algorithms
grow stale and are replaced.

I think Git should externalize the calculation of object digests just
like it externalizes the calculation of object digital signatures.
Cryptography is very difficult to get correct and the dedicated tools
for that (e.g. OpenSSH, OpenSSL, GnuPG, etc) get lots of scrutiny and
have the best chance of getting it right. I don't think Git should try
to do cryptography at all. Object digests should just be names for
objects; Git doesn't really need to know anything more than "is this the
name for that object?". Answering that question can, and should, be done
by an external tool that is implemented correctly and hardened against
attack.

I think the only counter-argument for this approach is performance
related. Pipe-forking a child process and reading/writing over IPC pipes
is expensive in terms of context switching and process setup/teardown
but there are a number of mitigations I won't go into here.

I think we should make one last breaking change for digests and not go
with the existing SHA-256 implementation but instead switch to
self-describing digests and digital signatures and rely on external
tools that Git talks to using a standard protocol. We can maintain full
backward compatibility and even support full round tripping using some
of the similar techniques that Brian came up with.

A transitional half-old/half-new signed tag could look like:

```
object 04b871796dc0420f8e7561a895b52484b701d51a
obj 0ED_zgYrQg584bCrqKPoUvxaQ5aMis0GtnW_NrZFTTxUlHLUOyp77LanoZEGV6ajhYGLGTaTfCIQhryovyeNFJuG
type commit
tag signedtag
tagger C O Mitter <committer@example.com> 1465981006 +0000
signtype openpgp
sign LS0tLS1CRUdJTiBQR1AgU0lHTkFUVVJFLS0tLS0KVmVyc2lvbjogR251UEcgdjEKCmlRRWN
 CQUFCQWdBR0JRSlhZUmhPQUFvSkVHRUpMb1czSW5HSmtsa0lBSWNuaEw3UndFYi8rUWVYOWVua1
 hoeG4KcnhmZHFydldkMUs4MHNsMlRPdDhCZy9OWXdyVUJ3L1JXSitzZy9oaEhwNFd0dkUxSERHS
 GxrRXozeTExTGt1aAo4dFN4UzNxS1R4WFVHb3p5UEd1RTkwc0pmRXhoWmxXNGtuSVExd3QveVdx
 TSszM0U5cE40aHpQcUx3eXJkb2RzCnE4RldFcVBQVWJTSlhvTWJSUHcwNFM1anJMdFpTc1VXYlJ
 Zam1KQ0h6bGhTZkZXVzRlRmQzN3VxdUlhTFVCUzAKcmtDM0pyeDc0MjBqa0lwZ0ZjVEkyczYwdW
 hTUUx6Z2NDd2RBMnVrU1lJUm5qZy96RGtqOCszaC9HYVJPSjcyeApsWnlJNkhXaXhLSmtXdzhsR
 TlhQU9EOVRtVFc5c0ZKd2NWQXptQXVGWDJrVXJlRFVLTVpkdUdjb1JZR3BEN0U9Cj1qcFhhCi0t
 LS0tRU5EIFBHUCBTSUdOQVRVUkUtLS0tLQo

signed tag

signed tag message body
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1

iQEcBAABAgAGBQJXYRhOAAoJEGEJLoW3InGJklkIAIcnhL7RwEb/+QeX9enkXhxn
rxfdqrvWd1K80sl2TOt8Bg/NYwrUBw/RWJ+sg/hhHp4WtvE1HDGHlkEz3y11Lkuh
8tSxS3qKTxXUGozyPGuE90sJfExhZlW4knIQ1wt/yWqM+33E9pN4hzPqLwyrdods
q8FWEqPPUbSJXoMbRPw04S5jrLtZSsUWbRYjmJCHzlhSfFWW4eFd37uquIaLUBS0
rkC3Jrx7420jkIpgFcTI2s60uhSQLzgcCwdA2ukSYIRnjg/zDkj8+3h/GaROJ72x
lZyI6HWixKJkWw8lE9aAOD9TmTW9sFJwcVAzmAuFX2kUreDUKMZduGcoRYGpD7E=
=jpXa
-----END PGP SIGNATURE-----
```

I think a good move to make right now would be to add a general function
for stripping out any number of named fields from objects and also
stripping out in-body signatures found in tags. That way we can add
support in today's Git for stripping out fields/data for things like
creating/verifying the object digest and/or digital signature.

BTW, in the example above the 'obj' field is a self-describing, URL-safe
Base64 encoded Blake2b-512 digest encoded using the format described
[here][1]. The starting '0E' Base64 characters identify the digest as
Blake2b-512 and also specify that the length of the digest is 64 bytes.
If you Base64 decode the 'obj' field value you get 66 bytes; the digest
value is the last 64 bytes of the 66 bytes.

By going with self-describing digests we can have configuration files
that contain 'program' and 'options.*' for each external tool that can
create/validate digests of each type. So in this case there would be
something like:

```
[digest "blake2b"]
    program = "blake2bsum"
[digest "blake2b.options"]
    length = 64
```

Using self-describing cryptographic constructs for digests and
signatures and relying on external tools makes it trivial for Git to
walk the object graph and enumerate all of the digest types and
signature types in a given repo and determine if a user has their
configuration set up correctly to work with that repo. Projects can
declare which types they are using and recommend tools to use for those
types.

Cheers!
Dave

TL;DR

Let me try to lay out the case for making a breaking change to sha256
right now that will future-proof repos going forward.

It has been known for a few decades now that cryptography has a
shelf-life. By that I mean as technology and cryptanalysis improve we
have had to make keys larger and invent new algorithms that resist the
new attacks on cryptography. This has been true of digest algorithms
(i.e. hashes), digital signatures (i.e. non-repudiation), and encryption
(i.e. confidentiality).

The relevant case here is the fact that sha256 is vulnerable to
extension attacks and cryptographers have lost some confidence in it
after many Davies-Meyer (DM) structure and ARX network designs based on
MD4 were broken 20 years ago. SHA-256 uses DM plus a block cipher based
on an ARX network. The end result is that in high security software,
SHA-256 is being replaced with SHA-3 and Blake2 digests.

Another key thing to think about is that a git repo is a form of a
provenance log that could become the primary tool for securing the
software supply chain if we were to make some careful, well thought out
changes around the digests and digital signatures.

What changes exactly?

1. upgrade the digests to something cryptographically secure.
2. digitally sign all commits/merges/tags using...
3. key material tracked with cryptographically secure provenance logs
   inside of the repo itself.
4. switch to "late binding", "self describing" cryptographic constructs.

Let me go over these and describe how these fit together.

1. SHA-1 is not cryptographically secure and SHA-256 is already not
   being used in *new* systems and is being replaced in existing, high
   security systems. I think Git should move to more secure digest
   algorithms because the hashes in Git repos are used as naming
   identifiers for Git objects which gives them a higher security
   burden.

2. Digitally signing all commits/merges/tags is critical to tie
   contributions to contributors in a non-repudiable fashion. At the
   very least it is a more secure solution for S-o-b but it also opens
   up the possibility for cryptographically secure accountability. Banks
   and governments are already doing know-your-customer (KYC)
   verifications of identity that can be used to identify contributors
   and their contributions cryptographically. If privacy is a concern,
   zero-knowledge proofs, based on the KYC authentic data, can be used
   to create pseudonymous identities for contributors that can be linked
   to their real-world identity under judicial order. Essentially a
   developer can say, "you don't need to know my real world identity but
   here's proof that XYZ bank knows who I am and here is a large random
   number you can use to de-anonymize me with the help of a court if
   needed"

3. The key material used for identifying contributors needs to move into
   the repos themselves for many reasons but the most important two
   reasons are (1) the repo comes with all of the data necessary to
   verify all of the digital signatures (i.e. solving the PKI problem
   for a project) and (2) to track the provenance of the public keys and
   other related data that each contributor uses. If Git repos contain
   provenance logs that are controlled and maintained by each
   contributor, those logs can also contain digital signatures over the
   code of conduct and the developer certificate of origin and other
   governing documents for a project that are legally binding (i.e.
   follow eIDAS and other legal digital signature rules). Solving the
   PKI problem alone makes digitally signing commits infinitely more
   useful and will drive adoption. Solving the non-repudiable provenance
   problem is the raison d'être of organizations like the Linux
   Foundation. I think Git should align itself with where technology is
   heading on that front.

4. Currently Git uses "early-binding" for all cryptographic material.
   The digest algorithm is hard coded (SHA-1) and the new SHA-256 is as
   well. The digital signature algorithm is also hard coded as either
   GPG or GPGSM. Early-binding makes it very difficult to plan for the
   obsolescence of cryptographic algorithms. The solution is to move to
   "late-binding"/"self-describing" cryptographic constructs. If Git
   were to switch to self-describing digests and digital signatures,
   then Git could be entirely agnostic to cryptography and rely entirely
   upon external cryptographic tools for creating/verifying digests and
   digital signatures.

Instead of the direction we're taking on the SHA256 changeover, I think
Git should switch to self-describing digests and digital signatures and
use a standard protocol for talking to external cryptographic tools
instead of trying to get cryptography correct in its code.

Secure Scuttlebutt uses late-binding constructs that contain a type
"sigil", Base-64 encoded key/digest/blob followed by an algorithm
descriptor (e.g. ".sha256" or ".ed25519"). Other examples exist such as
the Multihash encoding scheme for self-describing hashes. All of my work
on secure provenance logs uses the emerging consensus encoding described
[here][1]. It uses Base64 encoded cryptographic data and it fills what
would be the padding bytes with type identifiers.

I'm not the only one thinking along these lines. So are the
[KERI project][2] at the Decentralized Identity Foundation and
[Konstantin][3].

[1]: https://github.com/decentralized-identity/keri/blob/master/kids/kid0001.md
[2]: https://identity.foundation/working-groups/keri.html
[3]: https://people.kernel.org/monsieuricon/patches-carved-into-developer-sigchains

^ permalink raw reply [flat|nested] 19+ messages in thread
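
The self-describing 'obj' digest layout described in the message above (a two-character Base64 code occupying what would otherwise be padding, followed by the digest) can be sketched in a few lines of Python. This is only an illustration built from the description in the message: the `CODES` table and the `encode`/`decode` helpers are hypothetical names, the zero-byte lead-in is an assumption about how the 66 bytes are formed, and none of it has been checked against the KERI document linked as [1].

```python
import base64
import hashlib

# Code table taken from the message: "0E" means Blake2b-512, 64 bytes.
# Any further entries would be assumptions, so none are listed.
CODES = {"0E": ("blake2b", 64)}


def encode(code: str, digest: bytes) -> str:
    """Prefix a digest with a two-character type code (URL-safe Base64)."""
    _, size = CODES[code]
    if len(digest) != size:
        raise ValueError("digest length does not match the type code")
    # Two zero lead bytes make 66 bytes, which encode to exactly 88
    # characters with no '=' padding. The first two characters carry
    # only lead-byte bits, so they can be overwritten with the code.
    text = base64.urlsafe_b64encode(b"\x00\x00" + digest).decode("ascii")
    return code + text[2:]


def decode(text: str) -> tuple:
    """Return (algorithm name, raw digest) for a self-describing digest."""
    name, size = CODES[text[:2]]
    raw = base64.urlsafe_b64decode(text)
    if len(raw) != size + 2:
        raise ValueError("unexpected length for this type code")
    # As the message says: the digest is the last 64 of the 66 bytes.
    return name, raw[-size:]


digest = hashlib.blake2b(b"example object contents", digest_size=64).digest()
name, recovered = decode(encode("0E", digest))
```

A caller that sees an unknown code gets a `KeyError` from the table lookup, which is where a real implementation would look up the configured external 'program' for that digest type instead.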
* Re: Is the sha256 object format experimental or not?
2021-05-13 20:29 ` dwh
@ 2021-05-13 20:49 ` Konstantin Ryabitsev
2021-05-13 23:47 ` dwh
2021-05-13 21:03 ` Junio C Hamano
2021-05-18 5:32 ` Jonathan Nieder
2 siblings, 1 reply; 19+ messages in thread
From: Konstantin Ryabitsev @ 2021-05-13 20:49 UTC (permalink / raw)
To: dwh; +Cc: brian m. carlson, Ævar Arnfjörð Bjarmason, git

On Thu, May 13, 2021 at 01:29:19PM -0700, dwh@linuxprogrammer.org wrote:
> 3. The key material used for identifying contributors needs to move into
>    the repos themselves for many reasons but the most important two
>    reasons are (1) the repo comes with all of the data necessary to
>    verify all of the digital signatures (i.e. solving the PKI problem
>    for a project) and (2) to track the provenance of the public keys and
>    other related data that each contributor uses. If Git repos contain
>    provenance logs that are controlled and maintained by each
>    contributor, those logs can also contain digital signatures over the
>    code of conduct and the developer certificate of origin and other
>    governing documents for a project that are legally binding (i.e.
>    follow eIDAS and other legal digital signature rules). Solving the
>    PKI problem alone makes digitally signing commits infinitely more
>    useful and will drive adoption. Solving the non-repudiable provenance
>    problem is the raison d'être of organizations like the Linux
>    Foundation. I think Git should align itself with where technology is
>    heading on that front.

Dave:

Check out what we're doing as part of patatt and b4:
https://pypi.org/project/patatt/

It takes your keyring-in-git idea and runs with it -- it would be good to have
your input while the project is still young and widely unknown. :)

-K

^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Is the sha256 object format experimental or not?
2021-05-13 20:49 ` Konstantin Ryabitsev
@ 2021-05-13 23:47 ` dwh
2021-05-14 13:45 ` Konstantin Ryabitsev
0 siblings, 1 reply; 19+ messages in thread
From: dwh @ 2021-05-13 23:47 UTC (permalink / raw)
To: Konstantin Ryabitsev
Cc: brian m. carlson, Ævar Arnfjörð Bjarmason, git

On 13.05.2021 16:49, Konstantin Ryabitsev wrote:
>Check out what we're doing as part of patatt and b4:
>https://pypi.org/project/patatt/
>
>It takes your keyring-in-git idea and runs with it -- it would be good to have
>your input while the project is still young and widely unknown. :)

Konstantin:

That's really clever. I especially love how you're using the list
archive as the provenance log of old keys developers used. That seems
like it would work although I have worries about the security of
X-Developer-Key and the lack of key history immediately available to
`git log` because it's in the list archive and not in the repo directly.

I guess the old keys would still be in your local keyring for `gpg` to
use but it would mark signatures created with old revoked keys as
invalid even though they are valid.

Old keys--even if revoked or compromised--matter in a world of digitally
signed data. As a matter of course, people should rotate their signing
keys on a regular basis. It's just good hygiene. That means that there
will always be old data signed with old keys and those old keys need to
be kept around to validate the old signatures.

My approach has been to move to cryptographically secure provenance logs
that contain key rotation events and commitments to future keys and also
cryptographically linking to arbitrary metadata (e.g. KYC proofs, etc).
The file format is documented using the Community Standard template from
the LF.

I'm hoping to move Git to use external tools for all digest and digital
signature operations. Then I can start putting provenance logs into a
".well-known" path in Git repos, maybe ".plogs" or something. Then I can
write/adapt a signing tool to understand provenance logs of public keys
in the repo instead of the GPG keyring stuff we have today.

Provenance logs accumulate the full key history of a developer over
time. It represents a second axis of time such that the HEAD of a repo
will have the full key history, for every contributor available to
cryptographic tools for verifying signatures. This makes
`git log --show-signature` operations maximally efficient because we
don't have to check out old keyrings from history to recreate the state
GPG was in when the signature was created.

I still like your approach purely for the "it works right now" aspect of
the solution. Good job. I can't wait to see it in action.

Cheers!
Dave

^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Is the sha256 object format experimental or not? 2021-05-13 23:47 ` dwh @ 2021-05-14 13:45 ` Konstantin Ryabitsev 2021-05-14 17:39 ` dwh 0 siblings, 1 reply; 19+ messages in thread From: Konstantin Ryabitsev @ 2021-05-14 13:45 UTC (permalink / raw) To: dwh; +Cc: brian m. carlson, Ævar Arnfjörð Bjarmason, git On Thu, May 13, 2021 at 04:47:06PM -0700, dwh@linuxprogrammer.org wrote: > On 13.05.2021 16:49, Konstantin Ryabitsev wrote: > > Check out what we're doing as part of patatt and b4: > > https://pypi.org/project/patatt/ > > > > It takes your keyring-in-git idea and runs with it -- it would be good to have > > your input while the project is still young and widely unknown. :) > > Konstantin: > > That's really clever. I especially love how you're using the list > archive as the provenance log of old keys developers used. That seems > like it would work although I have worries about the security of > X-Developer-Key and the lack of key history immediately available to > `git log` because it's in the list archive and not in the repo directly. > > I guess the old keys would still be in your local keyring for `gpg` to > use but it would mark signatures created with old revoked keys as > invalid even though they are valid. Thanks for taking a look at it. I don't view this as much of a problem, since the goal for the project is specifically end-to-end patch attestation. For git commits, if they are signed with a key from the in-git keyring, it would actually be really straightforward to get the valid key at the time of signing -- you just retrieve the keyring using the date of the commit. > My approach has been to move to cryptographically secure provenance logs > that contain key rotation events and commitments to future keys and also > cryptographically linking to arbitrary metadata (e.g. KYC proofs, etc). > The file format is documented using the Community Standard template from > the LF. 
I'm hoping to move Git to use external tools for all digest and > digital signature operations. Then I can start putting provenance logs > into a ".well-known" path in Git repos, maybe ".plogs" or something. > Then I can write/adapt a signing tool to understand provenance logs > of public keys in the repo instead of the GPG keyring stuff we have > today. > > Provenance logs accumulate the full key history of a developer over > time. It represents a second axis of time such that the HEAD of a repo > will have the full key history, for every contributor available to > cryptographic tools for verifying signatures. This makes `git log > --show-signature` operations maximally efficient because we don't have > to check out old keyrings from history to recreate the state GPG was in > when the signature was created. Hmm... I'm not sure if it's an inefficient operation in the first place. If the keyring is in the same branch as the commit itself, then you can retrieve the public key using "git show [commit-sha]:path/to/that/pubkey". If it's in a different branch, then it's slightly more complicated because then you have to find a keyring commit corresponding to the commit-date of the object you're checking. In any case, these are all pretty fast operations in git. > I still like your approach purely for the "it works right now" aspect of > the solution. Good job. I can't wait to see it in action. As you know, this is my third attempt at getting patch attestation off the ground. The first one I implemented using detached attestation documents and it was clever and neat, but it was too complicated and failed to take off -- I think mostly because a) it wasn't easy to understand what it's doing, and b) it required that people adjust their workflows too much. The second attempt was better, but I think it was still too complicated, because it required that we parse patch content, making it fragile and slow on very large patch sets. 
I'm hoping that this version resolves the downsides of the previous two attempts by both being dumb and simple and by only requiring a simple one-time setup (via the sendemail-validate hook) with no further changes to the usual git-send-email workflow after that. I've not yet widely promoted this, as patatt is a very new project, but I'm hoping to start reaching out to people to trial it out in the next few weeks. Thanks, -K ^ permalink raw reply [flat|nested] 19+ messages in thread
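The lookup Konstantin describes above (reading the key from the same commit, or from the keyring-branch commit that was current at the commit date) can be sketched roughly as follows. This is only an illustration: the helper names, the key path, and the keyring branch name are assumptions made for the example and are not taken from patatt or b4. It relies only on `git show` and `git rev-list`.

```python
import subprocess


def pubkey_at_commit_argv(commit, key_path):
    """Build `git show <commit>:<path>` for a key file stored in the tree."""
    return ["git", "show", f"{commit}:{key_path}"]


def commit_date_argv(commit):
    """Build a command printing the committer date of a commit (strict ISO 8601)."""
    return ["git", "show", "-s", "--format=%cI", commit]


def keyring_commit_argv(keyring_branch, commit_date):
    """Build a command naming the newest keyring-branch commit before a date."""
    return ["git", "rev-list", "-1", f"--before={commit_date}", keyring_branch]


def run(argv, repo="."):
    """Run a git command in `repo` and return its raw stdout."""
    return subprocess.run(argv, cwd=repo, check=True, capture_output=True).stdout


def pubkey_for(commit, key_path, keyring_branch=None, repo="."):
    """Return the public key bytes that were current when `commit` was made.

    keyring_branch=None assumes the keyring lives in the same branch as the
    commit; otherwise the keyring branch is searched by commit date.
    """
    if keyring_branch is None:
        return run(pubkey_at_commit_argv(commit, key_path), repo)
    date = run(commit_date_argv(commit), repo).decode().strip()
    keyring_commit = run(keyring_commit_argv(keyring_branch, date), repo).decode().strip()
    if not keyring_commit:
        raise LookupError(f"no keyring commit before {date}")
    return run(pubkey_at_commit_argv(keyring_commit, key_path), repo)
```

A call such as `pubkey_for("HEAD~3", "keys/dev.pub", keyring_branch="keyring")` would then return the key as it existed at that point, assuming such a path and branch exist in the repository.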
* Re: Is the sha256 object format experimental or not?
2021-05-14 13:45 ` Konstantin Ryabitsev
@ 2021-05-14 17:39 ` dwh
0 siblings, 0 replies; 19+ messages in thread
From: dwh @ 2021-05-14 17:39 UTC (permalink / raw)
To: Konstantin Ryabitsev
Cc: brian m. carlson, Ævar Arnfjörð Bjarmason, git

On 14.05.2021 09:45, Konstantin Ryabitsev wrote:
>As you know, this is my third attempt at getting patch attestation off the
>ground.

Yes, I've been following. It's been a long road.

>I'm hoping that this version resolves the downsides of the previous two
>attempts by both being dumb and simple and by only requiring a simple one-time
>setup (via the sendemail-validate hook) with no further changes to the usual
>git-send-email workflow after that.

I'm very interested in whether this one works. You and I are completely
aligned on this. I don't think I've paid as much attention to emailed
patch attestations as you have. I think I understand the requirements
but maybe not all of them. Do you have any threads on public-inbox where
you discuss them? I want to make sure that what I'm doing doesn't
undermine anything you're trying to do.

The end goal is to have air-tight provenance on all contributions and an
accountable/auditable software supply chain. We're all working towards
that.

>I've not yet widely promoted this, as patatt is a very new project, but I'm
>hoping to start reaching out to people to trial it out in the next few weeks.

Hopefully this approach strikes the right balance.

Cheers!
Dave

^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Is the sha256 object format experimental or not?
2021-05-13 20:29 ` dwh
2021-05-13 20:49 ` Konstantin Ryabitsev
@ 2021-05-13 21:03 ` Junio C Hamano
2021-05-13 23:26 ` dwh
2021-05-14 8:49 ` Ævar Arnfjörð Bjarmason
2021-05-18 5:32 ` Jonathan Nieder
2 siblings, 2 replies; 19+ messages in thread
From: Junio C Hamano @ 2021-05-13 21:03 UTC (permalink / raw)
To: dwh; +Cc: brian m. carlson, Ævar Arnfjörð Bjarmason, git

dwh@linuxprogrammer.org writes:

> I think Git should externalize the calculation of object digests just
> like it externalizes the calculation of object digital signatures.

The hashing algorithms used to generate object names have
requirements fundamentally different from those of digital
signatures. I strongly suspect that that fact would change the
equation when you rethink what you said above.

We can "upgrade" digital signature algorithms fairly easily---nobody
would complain if you suddenly choose a different signing algorithm
over a blob of data, as long as all project participants are aware
(and a self-describing datastream helps here) and are capable of
grokking the new algorithm we are adopting. But because object
names are used by one object to refer to another, and most
importantly, we do not want a single object to have multiple names,
we cannot afford to introduce a new hashing algorithm every time we
feel like it. In other words, diversity of object naming algorithms
is to be avoided as much as possible, while diversity of signature
algorithms is naturally expected.

^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: Is the sha256 object format experimental or not?
2021-05-13 21:03 ` Junio C Hamano
@ 2021-05-13 23:26 ` dwh
2021-05-14 8:49 ` Ævar Arnfjörð Bjarmason
1 sibling, 0 replies; 19+ messages in thread
From: dwh @ 2021-05-13 23:26 UTC (permalink / raw)
To: Junio C Hamano
Cc: brian m. carlson, Ævar Arnfjörð Bjarmason, git

On 14.05.2021 06:03, Junio C Hamano wrote:
>dwh@linuxprogrammer.org writes:
>
>> I think Git should externalize the calculation of object digests just
>> like it externalizes the calculation of object digital signatures.
>
>The hashing algorithms used to generate object names have
>requirements fundamentally different from those of digital
>signatures. I strongly suspect that that fact would change the
>equation when you rethink what you said above.

I agree with you. Object names are exactly that: names. Names for
resources/data must be persistent, as well as global in scope and
uniqueness, and autonomously assigned. What this means is that once an
object has a name, that name shall never change as long as the object
remains unchanged. The names must be unique in the scope of all objects
(e.g. all copies of a repo) and generated without coordination.

Calculating object names using a digest algorithm meets all of these
requirements. Choosing a strong digest algorithm creates a strong
cryptographic binding between the name and the object contents. Using
self-describing digests allows for a repo to switch digest algorithms at
arbitrary points in the history.

I think that objects named with SHA1 digests should remain named with
the SHA1 digest. I do *not* advocate going back and rewriting history to
change all of the object names to a digest with a different algorithm.
Git is a provenance log and history matters. I recommend preserving all
existing names, even if they were created with known-weak digest
algorithms, and making the change to a new algorithm at a specific point
in time (e.g. at a tag).
Using self-describing digest encoding and externalizing digest
calculation future-proofs repositories and allows for preservation of
history while allowing algorithm agility.

To illustrate my point, I envision that a repo could have a history like
this:

object 2923f6fa36614586ea09b4424b438915cc1b9b67 (naked SHA1)
|
<many objects named with SHA1>
|
object 5f167fb6b3e96273b564fff0b041fb94fee4d3de (naked SHA1)
|
<modify Git to ext. digest calculation and self-desc encoding>
|
object 98c2e1c0965e60b0f137577ac5dd0a5c96ce224d (naked SHA1)
|
<many objects named with SHA1>
|
<a project decides to switch to SHA2-256, maybe marked in a tag>
|
object IAOdLVxteOxQwKa-xn8yCBUkuPkjAqcuQ2V7fKAlao8o (self-desc.SHA2-256)
|
<many objects named with self-describing SHA2-256 digests>
|
<a project decides to switch to SHA3-256, maybe marked in a tag>
|
object EK832G0PFhBFf-Dfgr205UKpUMqmVXJX9ltLwQo4Awct (self-desc.SHA3-256)
|
<many objects named with self-describing SHA3-256 digests>
.
.
.

Neither decision to switch to SHA2-256 nor to SHA3-256 would require any
code changes. If we continue down the current SHA-256 road, we will have
to repeat that multi-year effort in the future to switch to SHA3 or
something else. Most importantly, the choice of digest algorithm would
be left up to the maintainers of a given repo and not limited to the
algorithms we have hard coded into Git.

Brian's work on the SHA-256 switch is valuable. We can leverage a lot of
it to switch to externalized digest calculation and self-describing
digests and never have to worry about doing that again.

Cheers!
Dave

^ permalink raw reply [flat|nested] 19+ messages in thread
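To make the idea of mixed "naked" and self-describing names concrete, here is a minimal sketch. It assumes a purely hypothetical `<algorithm>:<hex digest>` name format and a hypothetical algorithm registry; the compact encoding of the example names in the message above is a different format that this sketch does not attempt to reproduce. The only Git-specific fact it uses is that an object is hashed as `<type> <size>\0<content>`.

```python
import hashlib

# Hypothetical registry mapping a name prefix to a hashlib algorithm name.
ALGORITHMS = {"sha1": "sha1", "sha2-256": "sha256", "sha3-256": "sha3_256"}


def git_object_bytes(obj_type, content):
    """Return the bytes Git hashes for an object: '<type> <size>\\0<content>'."""
    return f"{obj_type} {len(content)}\0".encode() + content


def name_object(algo, obj_type, content):
    """Return a self-describing name of the (assumed) form '<algo>:<hex digest>'."""
    digest = hashlib.new(ALGORITHMS[algo], git_object_bytes(obj_type, content))
    return f"{algo}:{digest.hexdigest()}"


def verify_name(name, obj_type, content):
    """Check a name against content; a name without a prefix is a naked SHA-1."""
    if ":" not in name:
        return hashlib.sha1(git_object_bytes(obj_type, content)).hexdigest() == name
    algo = name.partition(":")[0]
    if algo not in ALGORITHMS:
        raise ValueError(f"unknown digest algorithm: {algo}")
    return name_object(algo, obj_type, content) == name


# The empty blob under its historical naked SHA-1 name and under a prefixed name.
print(verify_name("e69de29bb2d1d6434b8b29ae775ad8c2e48c5391", "blob", b""))
print(verify_name(name_object("sha3-256", "blob", b""), "blob", b""))
```

In this sketch, old names keep verifying unchanged, and the verifier picks the algorithm from the name itself rather than from a repository-wide setting.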
* Re: Is the sha256 object format experimental or not?
2021-05-13 21:03 ` Junio C Hamano
2021-05-13 23:26 ` dwh
@ 2021-05-14 8:49 ` Ævar Arnfjörð Bjarmason
2021-05-14 18:10 ` dwh
1 sibling, 1 reply; 19+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-05-14 8:49 UTC (permalink / raw)
To: Junio C Hamano; +Cc: dwh, brian m. carlson, git

On Fri, May 14 2021, Junio C Hamano wrote:

> dwh@linuxprogrammer.org writes:
>
>> I think Git should externalize the calculation of object digests just
>> like it externalizes the calculation of object digital signatures.
>
> The hashing algorithms used to generate object names have
> requirements fundamentally different from those of digital
> signatures. I strongly suspect that that fact would change the
> equation when you rethink what you said above.
>
> We can "upgrade" digital signature algorithms fairly easily---nobody
> would complain if you suddenly choose a different signing algorithm
> over a blob of data, as long as all project participants are aware
> (and a self-describing datastream helps here) and are capable of
> grokking the new algorithm we are adopting. But because object
> names are used by one object to refer to another, and most
> importantly, we do not want a single object to have multiple names,
> we cannot afford to introduce a new hashing algorithm every time we
> feel like it. In other words, diversity of object naming algorithms
> is to be avoided as much as possible, while diversity of signature
> algorithms is naturally expected.

I agree insofar as I don't see a good reason for us to support some
plethora of hash algorithms, but I wouldn't have objections to adding
more if people find them useful for some reason. See e.g. [1] for an
implementation.

But I really don't see how anything you've said would present a
technical hurdle once we have SHA-1<->SHA-256 interop in a good enough
state.
At that point we'll support re-hashing on arrival of content hashed with algorithm X into Y, with a local lookup table between X<=>Y. So if somebody wants to maintain content hashed with algorithm Z locally we should easily be able to support that. The "diversity of naming" won't matter past that local repository, any mention of Z will be translated to X or Y on fetch/push. 1. https://lore.kernel.org/git/20191222064809.35667-1-michaeljclark@mac.com/ ^ permalink raw reply [flat|nested] 19+ messages in thread
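The "local lookup table between X<=>Y" mentioned above can be pictured with the following in-memory sketch. It is an illustration only, not how Git stores or builds such a mapping: the class and method names are hypothetical, and it handles only blobs, whose content does not embed other object names (trees, commits, and tags would first need their embedded names rewritten before re-hashing).

```python
import hashlib


def object_name(hash_name, obj_type, content):
    """Hash '<type> <size>\\0<content>' with the given hashlib algorithm."""
    header = f"{obj_type} {len(content)}\0".encode()
    return hashlib.new(hash_name, header + content).hexdigest()


class TranslationTable:
    """Hypothetical bidirectional map between two names of the same blob."""

    def __init__(self, algo_x="sha1", algo_y="sha256"):
        self.algo_x, self.algo_y = algo_x, algo_y
        self._x_to_y = {}
        self._y_to_x = {}

    def record_blob(self, content):
        """Re-hash arriving blob content under both algorithms and remember the pair."""
        x = object_name(self.algo_x, "blob", content)
        y = object_name(self.algo_y, "blob", content)
        self._x_to_y[x] = y
        self._y_to_x[y] = x
        return x, y

    def translate(self, name):
        """Return the other algorithm's name for a known object."""
        if name in self._x_to_y:
            return self._x_to_y[name]
        if name in self._y_to_x:
            return self._y_to_x[name]
        raise KeyError(name)


table = TranslationTable()
sha1_name, sha256_name = table.record_blob(b"hello\n")
print(table.translate(sha1_name) == sha256_name)
```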
* Re: Is the sha256 object format experimental or not?
2021-05-14 8:49 ` Ævar Arnfjörð Bjarmason
@ 2021-05-14 18:10 ` dwh
0 siblings, 0 replies; 19+ messages in thread
From: dwh @ 2021-05-14 18:10 UTC (permalink / raw)
To: Ævar Arnfjörð Bjarmason
Cc: Junio C Hamano, brian m. carlson, git

On 14.05.2021 10:49, Ævar Arnfjörð Bjarmason wrote:
>I agree insofar as I don't see a good reason for us to support some
>plethora of hash algorithms, but I wouldn't have objections to adding
>more if people find them useful for some reason. See e.g. [1] for an
>implementation.

I think Git should not try to do any cryptographic operations at all and
rely on external tools that are implemented properly and hardened.
Implementing cryptography isn't just about translating the algorithm
into code but also getting memory security correct, file handling
correct, input security correct, control flow correct (equal cost
multi-path), etc, etc. Most of the cryptography libraries aren't
designed to be misuse resistant. The only one I know of that has that as
a top-line requirement is Hyperledger Ursa [1].

I would like to see us remove all cryptography code (e.g. digests,
digital signatures, etc) from Git and rely on external tools entirely.
If we store the cryptographic material in a self-describing format that
identifies the associated tool as well as the cryptographic data, then
Git can be completely agnostic.

>But I really don't see how anything you've said would present a
>technical hurdle once we have SHA-1<->SHA-256 interop in a good enough
>state. At that point we'll support re-hashing on arrival of content
>hashed with algorithm X into Y, with a local lookup table between X<=>Y.
>
>So if somebody wants to maintain content hashed with algorithm Z locally
>we should easily be able to support that. The "diversity of naming"
>won't matter past that local repository, any mention of Z will be
>translated to X or Y on fetch/push.
Using self-describing formats allows us to honor history and keep old
object names as they are, and to eliminate all of the added
complications you describe. I think there is a lot of room for errors to
creep in when collaborators have copies of the same repo and they have
local mappings between different hashing algorithms.

How is this not setting up for a combinatorial explosion of data? If the
canonical repo uses SHA1 and one contributor uses SHA2-512, another uses
Blake2b-256, and yet another uses SHA3-384, won't they all have to
maintain six different translation tables for all objects? SHA1 <=>
SHA2-512, SHA1 <=> Blake2b-256, SHA1 <=> SHA3-384, SHA2-512 <=>
Blake2b-256, SHA2-512 <=> SHA3-384, and Blake2b-256 <=> SHA3-384? I
guess that's your motivation for not allowing algorithmic agility.

The way around this is to use self-describing formats and external
tools. Git repo copies wouldn't be required to have only *one* algorithm
naming all objects, requiring the translation tables. Instead Git repos
would/could have heterogeneous object names, each object having a single
name that may be generated with a different digest algorithm. Git would
simply treat those names as plain strings; validating a name would mean
talking to the correct external tool, sending it the name string and the
object data, and reading back the result.

I think this is a much better approach because:

1. It creates algorithmic agility in a way that isn't top-down and heavy
handed.

2. It eliminates the need for all of the translation tables and
round-tripping complexity.

3. It empowers maintainers to decide which algorithms can/must be used
when naming objects in a given repo. Merge hooks, CI/CD checks and
etiquette guides can be used to enforce this.

4. Git's attack surface becomes smaller (a very good thing) and limited
to doing IPC to external tools correctly and securely (easy) instead of
trying to get cryptography client code correct (very difficult).
One other thing to consider is that there are new tools being developed that do similar things as Git that do have algorithmic agility and use self-describing cryptographic primitives. Late-binding trust is now a best practice and has been for quite some time. Many people rely upon Git and I think we should keep up with the best practices. Cheers! Dave ^ permalink raw reply [flat|nested] 19+ messages in thread
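The count of six translation tables in the message above follows from taking every unordered pair among four algorithms, i.e. n(n-1)/2 with n = 4. A small sketch of that arithmetic, using only the algorithm names from the message (the function name is made up for the example):

```python
from itertools import combinations


def pairwise_tables(algorithms):
    """List every unordered pair of algorithms that would need a mapping table."""
    return list(combinations(algorithms, 2))


algorithms = ["SHA1", "SHA2-512", "Blake2b-256", "SHA3-384"]
tables = pairwise_tables(algorithms)
for a, b in tables:
    print(f"{a} <=> {b}")
print(len(tables))  # 4 * 3 / 2 pairs
```

Under this all-pairs assumption a fifth algorithm would raise the count to ten, which is the growth the message is worried about.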
* Re: Is the sha256 object format experimental or not? 2021-05-13 20:29 ` dwh 2021-05-13 20:49 ` Konstantin Ryabitsev 2021-05-13 21:03 ` Junio C Hamano @ 2021-05-18 5:32 ` Jonathan Nieder 2 siblings, 0 replies; 19+ messages in thread From: Jonathan Nieder @ 2021-05-18 5:32 UTC (permalink / raw) To: dwh; +Cc: brian m. carlson, Ævar Arnfjörð Bjarmason, git Hi, dwh@linuxprogrammer.org wrote: > I think we should make one last breaking change for digests and not go > with the existing SHA-256 implementation but instead switch to > self-describing digests and digital signatures and rely on external > tools that Git talks to using a standard protocol. We can maintain full > backward compatibility and even support full round tripping using some > of the similar techniques that Brian came up with. Forgive my ignorance: can you describe what compatibility break you mean? Do you mean _removing_ support for gpgsig-sha256? If so, why --- couldn't you get the same benefit by introducing the new functionality you're describing without getting rid of historical functionality at the same time? A nice thing about signatures is that they don't change the semantics of the object. So some future version of Git can remove support for verifying them, if they turn out By the way, to be clear, the hash-function-transition doc in Documentation/technical/ is not by Brian alone. It is the result of collaboration by various people on list (see its git history for details). [...] 
> object 04b871796dc0420f8e7561a895b52484b701d51a
> obj 0ED_zgYrQg584bCrqKPoUvxaQ5aMis0GtnW_NrZFTTxUlHLUOyp77LanoZEGV6ajhYGLGTaTfCIQhryovyeNFJuG
> type commit
> tag signedtag
> tagger C O Mitter <committer@example.com> 1465981006 +0000
> signtype openpgp
> sign LS0tLS1CRUdJTiBQR1AgU0lHTkFUVVJFLS0tLS0KVmVyc2lvbjogR251UEcgdjEKCmlRRWN
> CQUFCQWdBR0JRSlhZUmhPQUFvSkVHRUpMb1czSW5HSmtsa0lBSWNuaEw3UndFYi8rUWVYOWVua1
> hoeG4KcnhmZHFydldkMUs4MHNsMlRPdDhCZy9OWXdyVUJ3L1JXSitzZy9oaEhwNFd0dkUxSERHS
> GxrRXozeTExTGt1aAo4dFN4UzNxS1R4WFVHb3p5UEd1RTkwc0pmRXhoWmxXNGtuSVExd3QveVdx
> TSszM0U5cE40aHpQcUx3eXJkb2RzCnE4RldFcVBQVWJTSlhvTWJSUHcwNFM1anJMdFpTc1VXYlJ
> Zam1KQ0h6bGhTZkZXVzRlRmQzN3VxdUlhTFVCUzAKcmtDM0pyeDc0MjBqa0lwZ0ZjVEkyczYwdW
> hTUUx6Z2NDd2RBMnVrU1lJUm5qZy96RGtqOCszaC9HYVJPSjcyeApsWnlJNkhXaXhLSmtXdzhsR
> TlhQU9EOVRtVFc5c0ZKd2NWQXptQXVGWDJrVXJlRFVLTVpkdUdjb1JZR3BEN0U9Cj1qcFhhCi0t
> LS0tRU5EIFBHUCBTSUdOQVRVUkUtLS0tLQo

[...]

> I think a good move to make right now would be to add a general function
> for stripping out any number of named fields from objects and also
> stripping out in-body signatures found in tags. That way we can add
> support in today's Git for stripping out fields/data for things like
> creating/verifying the object digest and/or digital signature.

Can you say a little more about the user-facing model here? How does a
user know whether the signature verification result they're looking at
describes the part of the object they care about or has stripped it out?

[...]

> Let me try to lay out the case for making a breaking change to sha256
> right now that will future-proof repos going forward.
>
> It has been known for a few decades now that cryptography has a
> shelf-life.

Yes, this is a key assumption of the hash function transition. It is
meant to be repeatable, so that we are not stuck on a particular
cryptographic hash.

[...]

> The end result is that in high
> security software, SHA-256 is being replaced with SHA-3 and Blake2
> digests.
Do you mean that practice is drifting away from the conclusion of
https://www.imperialviolet.org/2017/05/31/skipsha3.html? Where can I
read more?

It took a while to decide on sha256 as the hash for Git to use to
replace sha1. The process involved useful feedback from the Keccak team
and others, and I feel pretty comfortable with how thoroughly it was
discussed, though of course I wouldn't be surprised if the state of
cryptanalysis has changed in some way since then.

The front runners were from the SHA2, SHA3, and Blake2 families. The
main factor that led to deciding on SHA2 is the wide availability of
efficient and trustworthy implementations, in hardware and software. See
https://lore.kernel.org/git/alpine.DEB.2.21.1.1706151122180.4200@virtualbox/#t
and
https://lore.kernel.org/git/20180609224913.GC38834@genre.crustytoothpaste.net/#t
for some of the discussion that led there.

[...]

> 4. switch to "late binding", "self describing" cryptographic constructs.

As Junio mentioned, Git does not impose a requirement on the signature
algorithm used in a signature block, including the digest involved.
However, signing history typically involves signing object names, and
object names use a cryptographic hash for other reasons. If we want Git
to stop using a content addressable object store, that would be a more
fundamental change to its design.

Thanks and hope that helps,
Jonathan

^ permalink raw reply [flat|nested] 19+ messages in thread
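The "general function for stripping out any number of named fields from objects" quoted in the message above could look roughly like the sketch below. It assumes the usual layout of commit and tag objects (header lines up to the first blank line, with multi-line values continued on lines that start with a space); the function name is hypothetical, and in-body tag signatures are not handled here.

```python
def strip_header_fields(raw, names):
    """Return the object payload with the named header fields removed.

    raw   -- the object payload as bytes (headers, blank line, then body)
    names -- a set of header field names as bytes, e.g. {b"gpgsig"}
    """
    header, sep, body = raw.partition(b"\n\n")
    kept = []
    dropping = False
    for line in header.split(b"\n"):
        if line.startswith(b" "):
            # Continuation line: it shares the fate of the field it belongs to.
            if not dropping:
                kept.append(line)
            continue
        field = line.split(b" ", 1)[0]
        dropping = field in names
        if not dropping:
            kept.append(line)
    return b"\n".join(kept) + sep + body


commit = (
    b"tree abc\n"
    b"author A <a@example.com> 0 +0000\n"
    b"gpgsig -----BEGIN\n"
    b" line2\n"
    b" -----END\n"
    b"gpgsig-sha256 x\n"
    b"\n"
    b"subject\n"
)
print(strip_header_fields(commit, {b"gpgsig"}).decode())
```

Because the caller passes the set of names explicitly, the same helper could serve both digest calculation and signature verification, which also makes visible exactly which fields were left out of the signed data.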