From: Johannes Schindelin <Johannes.Schindelin@gmx.de>
To: Brandon Williams <firstname.lastname@example.org>
Cc: "brian m. carlson" <email@example.com>,
    Linus Torvalds <firstname.lastname@example.org>,
    Jonathan Nieder <email@example.com>,
    Git Mailing List <firstname.lastname@example.org>,
    Stefan Beller <email@example.com>,
    firstname.lastname@example.org,
    Jeff King <email@example.com>,
    Junio Hamano <firstname.lastname@example.org>
Subject: Which hash function to use, was Re: RFC: Another proposed hash function transition plan
Date: Thu, 15 Jun 2017 12:30:46 +0200 (CEST)
Message-ID: <alpine.DEB.22.214.171.1246151122180.4200@virtualbox>
In-Reply-To: <20170306182423.GB183239@google.com>

Hi,

I thought it better to revive this old thread rather than start a new
thread, so as to automatically reach everybody who chimed in originally.

On Mon, 6 Mar 2017, Brandon Williams wrote:

> On 03/06, brian m. carlson wrote:
>
> > On Sat, Mar 04, 2017 at 06:35:38PM -0800, Linus Torvalds wrote:
> >
> > > Btw, I do think the particular choice of hash should still be on the
> > > table. sha-256 may be the obvious first choice, but there are
> > > definitely a few reasons to consider alternatives, especially if
> > > it's a complete switch-over like this.
> > >
> > > One is large-file behavior - a parallel (or tree) mode could improve
> > > on that noticeably. BLAKE2 does have special support for that, for
> > > example. And SHA-256 does have known attacks compared to SHA-3-256
> > > or BLAKE2 - whether that is due to age or due to more effort, I
> > > can't really judge. But if we're switching away from SHA1 due to
> > > known attacks, it does feel like we should be careful.
> >
> > I agree with Linus on this. SHA-256 is the slowest option, and it's
> > the one with the most advanced cryptanalysis. SHA-3-256 is faster on
> > 64-bit machines (which, as we've seen on the list, is the overwhelming
> > majority of machines using Git), and even BLAKE2b-256 is stronger.
> >
> > Doing this all over again in another couple years should also be a
> > non-goal.
>
> I agree that when we decide to move to a new algorithm that we should
> select one which we plan on using for as long as possible (much longer
> than a couple years). While writing the document we simply used
> "sha256" because it was more tangible and easier to reference.

The SHA-1 transition *requires* a knob telling Git that the current
repository uses a hash function different from SHA-1.

It would make *a whole lot of sense* to make that knob *not* Boolean,
but to specify *which* hash function is in use.

That way, it will be easier to switch another time when it becomes
necessary.

And it will also make it easier for interested parties to use a
different hash function in their infrastructure if they want.

And it lifts part of the burden of having to consider *very carefully*
which function to pick. We still should be more careful than in 2005,
when Git was born, and when, incidentally, the first attacks on SHA-1
became known, of course. We were just lucky for almost 12 years.

Now, with Dunning-Kruger in mind, I feel that my degree in mathematics
equips me with *just enough* competence to know just how little *even I*
know about cryptography.

The smart thing to do, hence, was to get involved in this discussion and
act as Lt. Tawny Madison between us Git developers and experts in
cryptography.

It just so happens that I work at a company with access to excellent
cryptographers, and as we own the largest Git repository on the planet,
we have a vested interest in ensuring Git's continued success.

After a couple of conversations with a couple of experts whom I cannot
thank enough for their time and patience, let alone their knowledge
about this matter, it would appear that we may not have had a complete
enough picture yet to even start to make the decision on the hash
function to use.
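
The non-Boolean knob proposed in this message can be illustrated with a
minimal sketch. Everything named here is an assumption made for
illustration only: the setting name "hash-algorithm", the registry
HASH_ALGORITHMS and the helpers hasher_for() and object_id() are
hypothetical and are not taken from Git's code base. The point is merely
that a named algorithm, rather than an on/off flag, lets a later switch
reuse the same mechanism.

```python
import hashlib

# Hypothetical registry mapping a configured name to a hash constructor.
# The names are illustrative; they are not Git's actual settings.
HASH_ALGORITHMS = {
    "sha1": hashlib.sha1,
    "sha256": hashlib.sha256,
    "sha3-256": hashlib.sha3_256,
}


def hasher_for(config):
    """Return the hash constructor selected by a repository configuration.

    `config` is a plain dict standing in for repository settings. The key
    "hash-algorithm" is a made-up name; when it is absent, the legacy
    default (SHA-1) is assumed.
    """
    name = config.get("hash-algorithm", "sha1")
    try:
        return HASH_ALGORITHMS[name]
    except KeyError:
        raise ValueError(f"unsupported hash algorithm: {name!r}") from None


def object_id(config, data):
    """Hash `data` with whichever function the configuration names."""
    return hasher_for(config)(data).hexdigest()
```

With a Boolean knob, adding a third function would need a second flag;
here it would only need one more entry in the registry.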
From what I read, pretty much everybody who participated in the
discussion was aware that the essential question is: performance vs
security.

It turns out that we can have essentially both.

SHA-256 is most likely the best-studied hash function we currently know
about (*maybe* SHA3-256 has been studied slightly more, but only
slightly). All the experts in the field banged on it with multiple
sticks and other weapons. And so far, they only found one weakness that
does not even apply to Git's usage [*1*]. For cryptography experts, this
is the ultimate measure of security: if something has been attacked that
intensely, by that many experts, for that long, with that little effect,
it is the best we have at the time.

And since SHA-256 has become the standard, and more importantly: since
SHA-256 was explicitly designed to allow for relatively inexpensive
hardware acceleration, this is what we will soon have: hardware support
in the form of, say, special CPU instructions. (That is what I meant by:
we can have performance *and* security.)

This is a rather important point to stress, by the way: BLAKE's design
is apparently *not* friendly to CPU instruction implementations. Meaning
that SHA-256 will be faster than BLAKE (and even than BLAKE2) once the
Intel and AMD CPUs with hardware support for SHA-256 become common.

I also heard something really worrisome about BLAKE2 that makes me want
to stay away from it (in addition to the difficulty it poses for
hardware acceleration): to compete in the SHA-3 contest, BLAKE added
complexity so that it would be roughly on par with its competitors. To
allow for faster execution in software, this complexity was *removed*
from BLAKE to create BLAKE2, making it weaker than SHA-256.

Another important point to consider is that SHA-256 implementations are
everywhere. Think e.g. how difficult we would make it on, say, JGit or
go-git if we chose a less common hash function.
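
The performance side of this trade-off is easy to measure locally. The
following is a rough sketch, not a rigorous benchmark: the function name
compare_throughput() and the 8 MiB buffer size are arbitrary choices
made for illustration, and the numbers depend entirely on the CPU and on
how the local Python and its crypto library were built (including
whether they use any SHA-256 hardware instructions). No results are
claimed here.

```python
import hashlib
import time

# Candidate functions discussed in the thread, all with 256-bit output.
CANDIDATES = {
    "sha256": lambda data: hashlib.sha256(data),
    "sha3-256": lambda data: hashlib.sha3_256(data),
    "blake2b-256": lambda data: hashlib.blake2b(data, digest_size=32),
}


def compare_throughput(size=8 * 1024 * 1024):
    """Hash `size` zero bytes with each candidate; return MiB/s per name."""
    data = bytes(size)
    results = {}
    for name, make_hash in CANDIDATES.items():
        start = time.perf_counter()
        make_hash(data).digest()
        # Guard against a zero reading on very coarse clocks.
        elapsed = max(time.perf_counter() - start, 1e-9)
        results[name] = size / (1024 * 1024) / elapsed
    return results


if __name__ == "__main__":
    for name, rate in compare_throughput().items():
        print(f"{name:12s} {rate:10.1f} MiB/s")
```

A single pass over one buffer is noisy; repeating the measurement and
taking the best run would give steadier figures.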
As to KangarooTwelve: it has seen substantially less cryptanalysis than
SHA-256 and SHA3-256. That does not necessarily mean that it is weaker,
but it means that we simply cannot know whether it is as strong. On that
basis alone, I would already reject it, and then there are far fewer
implementations, too.

When it comes to choosing SHA-256 vs SHA3-256, I would like to point out
that hardware acceleration for SHA3-256 is a lot farther in the future
than for SHA-256. And according to the experts I asked, they are roughly
equally secure as far as Git's usage is concerned, even if the SHA-3
contest provided SHA3-256 with even fiercer cryptanalysis than SHA-256.

In short: my takeaway from the conversations with cryptography experts
was that SHA-256 would be the best choice for now, and that we should
make sure that the next switch is not as painful as this one (read: we
should not repeat the mistake of hard-coding the new hash function into
Git as much as we hard-coded SHA-1 into it).

Ciao,
Johannes

Footnote *1*: SHA-256, like all hash functions whose output is
essentially the entire internal state, is susceptible to a so-called
"length extension attack", where the hash of a secret+message can be
used to generate the hash of secret+message+piggyback without knowing
the secret. This is not a concern for Git: only visible data are hashed.
The type of attack Git has to worry about is very different from length
extension attacks, and it is highly unlikely that this weakness of
SHA-256 leads to, say, a collision attack.
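
To make "only visible data are hashed" concrete: a Git blob's name is
the hash of a short public header followed by the file content, with no
secret prefix whose hash an attacker could extend. The sketch below
assumes the header layout "blob <size>\0" used by git-hash-object(1) for
SHA-1; the helper name blob_id() and its `algo` parameter are
hypothetical and only show how the same input could be fed to a
different hash function.

```python
import hashlib


def blob_id(content: bytes, algo: str = "sha1") -> str:
    """Return the hex object name of `content` stored as a Git blob.

    The hashed input is the public header "blob <size>\\0" followed by the
    content itself; nothing secret is mixed in.
    """
    header = b"blob %d\0" % len(content)
    return hashlib.new(algo, header + content).hexdigest()


if __name__ == "__main__":
    print(blob_id(b""))            # SHA-1 name of the empty blob
    print(blob_id(b"", "sha256"))  # same input, different hash function
```

With the default algorithm, the first call should agree with what
`git hash-object` reports for an empty file.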