On Wed, Aug 29, 2018 at 11:32:08AM +0200, Ævar Arnfjörð Bjarmason wrote:
>
> On Wed, Aug 29 2018, brian m. carlson wrote:
>
> > SHA-1 is weak and we need to transition to a new hash function. For
> > some time, we have referred to this new function as NewHash.
> >
> > The selection criteria for NewHash specify that it should (a) be 256
> > bits in length, (b) have high quality implementations available, (c)
> > should match Git's needs in terms of security, and (d) ideally, be fast
> > to compute.
> >
> > SHA-256 has a variety of high quality implementations across various
> > libraries. It is implemented by every cryptographic library we support
> > and is available on every platform and in almost every programming
> > language. It is often highly optimized, since it is commonly used in
> > TLS and elsewhere. Additionally, there are various command line
> > utilities that implement it, which is useful for educational and testing
> > purposes.
> >
> > SHA-256 is presently considered secure and has received a reasonable
> > amount of cryptanalysis in the literature. It is, admittedly, not
> > resistant to length extension attacks, but Git object storage is immune
> > to those due to the length field at the beginning.
> >
> > SHA-256 is somewhat slower to compute than SHA-1 in software. However,
> > since our default SHA-1 implementation is collision-detecting, a
> > reasonable cryptographic library implementation of SHA-256 will actually
> > be faster than SHA-1. In addition, modern ARM and AMD processors (and
> > some Intel processors) contain instructions for implementing SHA-256 in
> > hardware, making it the fastest possible option.
> >
> > There are other reasons to select SHA-256. With signed commits and
> > tags, it's possible to use SHA-256 for signatures and therefore have to
> > rely on only one hash algorithm for security.
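
To make the length-field point in the quoted text concrete: Git names an
object by hashing a header of the form "<type> <size>\0" followed by the
content. What follows is only a rough Python sketch of that computation,
not Git's code. The helper name git_object_id is made up for
illustration, and it assumes Python's standard hashlib module.

```python
import hashlib


def git_object_id(data: bytes, obj_type: str = "blob", algo: str = "sha256") -> str:
    """Hash `data` the way Git names an object: header first, then content.

    `git_object_id` is a hypothetical helper for illustration only.
    """
    # The header records the object type and the exact content length in
    # decimal, terminated by a NUL byte, so the length is part of the
    # hashed input.
    header = f"{obj_type} {len(data)}".encode("ascii") + b"\0"
    h = hashlib.new(algo)
    h.update(header)
    h.update(data)
    return h.hexdigest()


if __name__ == "__main__":
    # Same content, named under the old and the new hash function.
    print(git_object_id(b"hello\n", algo="sha1"))
    print(git_object_id(b"hello\n", algo="sha256"))
```

With algo="sha1" this should agree with what "git hash-object --stdin"
prints for the same input in a SHA-1 repository.
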
>
> None of this is wrong, but I think this would be better off as a simple
> "See Documentation/technical/hash-function-transition.txt for why we're
> switching to SHA-256", and to the extent that something is said here
> that isn't said there it could be a patch to amend that document.

I can certainly shorten this somewhat. I wrote this back when there
wasn't a consensus on the hash algorithm and Junio was going to leave it
to me to make a decision. I was therefore obligated to provide a
coherent rationale for that decision.

> > Add a basic implementation of SHA-256 based off libtomcrypt, which is in
> > the public domain. Optimize it and tidy it somewhat.
>
> For future changes & maintenance of this, let's do that in two
> steps. One where we add the upstream code as-is, and another where the
> tidying / cleanup / git specific stuff is wired, which makes it easy to
> audit upstream as-is vs. our changes in isolation. Also in the first of
> those commits, say in the commit message "add a [libtomcrypt] copy from
> such-and-such a URL at such-and-such a version", so that it's easy to
> reproduce the import & find out how to re-update it.

Doing what you suggest basically means importing a large amount of
libtomcrypt into our codebase, since there are a large number of reused
macros all over libtomcrypt (such as for processing a generic hash and
for memcpy). This isn't surprising for a general purpose crypto
library, but I changed it a fair amount, condensed it into a small
number of files, and made it meet our code standards. The word
"somewhat" may have been an understatement.

This is also why I added tests: because I'm human and making a small
change in a crypto library can result in wrong output very quickly.

> Is this something we see ourselves perma-forking?
> Or as with sha1dc are we likely to pull in upstream changes from
> time-to-time? SHA256 obviously isn't under active development, but
> there's been some churn in the upstream code since it was added, and if
> you're doing some optimizing / tidying that's presumably something
> upstream could benefit from as well, as well as just us being nicer open
> source citizens feeding e.g. portability fixes to upstream (since git
> tends to get ported a lot).

This is a permafork. We need a basic SHA-256 implementation, and this
one was faster than the one I'd written some time ago. As with the
block-sha1 implementation, I see this as something that we'll be
shipping forever with little updating.

Given the number of changes we're making, I expect they're unlikely to
want our code. Also, any changes to our code would be under the GPLv2,
which would be unappealing to a public domain library.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204