* SHA-256 transition
@ 2022-06-20 22:51 Stephen Smith
  2022-06-20 23:13 ` rsbecker
  2022-06-21 10:25 ` Ævar Arnfjörð Bjarmason
  0 siblings, 2 replies; 21+ messages in thread
From: Stephen Smith @ 2022-06-20 22:51 UTC (permalink / raw)
To: git

What is the current status of the SHA-1 to SHA-256 transition? Is the transition far enough along that users should start changing over to the new format?

^ permalink raw reply [flat|nested] 21+ messages in thread

* RE: SHA-256 transition
From: rsbecker @ 2022-06-20 23:13 UTC
To: 'Stephen Smith', 'git'

On June 20, 2022 6:51 PM, Stephen Smith wrote:
>What is the current status of the SHA-1 to SHA-256 transition? Is the transition far enough along that users should start changing over to the new format?

I had the same question at a conference last week. Could not answer it so am curious about the plan.

* Re: SHA-256 transition
From: Ævar Arnfjörð Bjarmason @ 2022-06-21 10:25 UTC
To: Stephen Smith; +Cc: git, Jeff King

On Mon, Jun 20 2022, Stephen Smith wrote:

> What is the current status of the SHA-1 to SHA-256 transition? Is the transition far enough along that users should start changing over to the new format?

Just my 0.02, not the official project line or anything:

I wouldn't recommend that anyone use it for anything serious at the moment. As far as I can tell, the only current users (if any) are (some) people who work on git itself.

The status of it is, I think it's fair to say, that it /should/ work 100% (or at least 99.99%?) as far as git itself is concerned.

I.e. you can "init" a SHA-256 repository, and all our in-repo tooling etc. will work with it. We run full CI tests with a SHA-256 test suite, and it's passing.

But the reason I'd still say "no" on the technical/UX side is:

 * The inter-op between SHA-256 and SHA-1 repositories is still nonexistent, except for a one-off import. I.e. we don't have any graceful way to migrate an existing repository.

 * For new repositories I think you'll probably want to eventually push it to one of the online git hosting providers, none of which (as far as I'm aware) support SHA-256 repos.

 * Even if not, any local git tooling that's not part of git.git is likely to break, often for trivial reasons like expecting SHA-1 sized hashes in the output. If you start using it for your repositories and use such tools, you're very likely to be the first person to run into bugs in those areas.

But more importantly (and note that these views are definitely *not* shared by some other project members, so take it with a grain of salt): There just isn't any compelling selling point to migrate to SHA-256 in the near or foreseeable future for a given individual user of git.

The reason we started the SHA-1 -> $newhash transition (it wasn't known that it would be SHA-256 at the time) was in response to https://shattered.io, although it had been discussed before, e.g. in the thread starting at [1] in 2012.

We've since migrated our default hash function from SHA-1 to SHA-1DC (except on vanilla OSX, see [2]). It's a variant of SHA-1, implemented by the same researchers, that detects the SHAttered attack. I'm not aware of a current viable SHA-1 collision against the variant of SHA-1 that we actually use these days.

But even assuming for the sake of argument that we were using a much weaker and easier to break hash (say MD4 or MD5), most users still wouldn't have much, if anything, to worry about in practice.

Discovering a hash collision is only the first step in attacking a Git repository. This aspect has been discussed many times on-list, but e.g. [3] is one such thread.

The above is really *not* meant to poo-poo the whole notion of switching to a new hash. We're making good progress on it, although I think the really hard part UX-wise is left (online migration).

Likewise I'd be really surprised if, given the progress of that work, the average Git user isn't going to be using not-SHA-1 with Git in 15-20 years, if it's even still around at that time as a relevant VCS.

But should even advanced git users be spending time on migrating their data at this point?

No, I don't think so given all of the above, and I really think we should carefully consider all of the trade-offs involved before recommending that the average user of git migrate over.

1. https://lore.kernel.org/git/CA+EOSBncr=4a4d8n9xS4FNehyebpmX8JiUwCsXD47EQDE+DiUQ@mail.gmail.com/
2. https://lore.kernel.org/git/cover-0.5-00000000000-20220422T094624Z-avarab@gmail.com/
3. https://lore.kernel.org/git/CACBZZX65Kbp8N9X9UtBfJca7U1T0m-VtKZeKM5q9mhyCR7dwGg@mail.gmail.com/

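The "SHA-1 sized hashes" concern in the message above comes down to object names growing from 40 to 64 hex characters. A minimal Python sketch, assuming git's documented object naming scheme (the hash of a "<type> <size>" header, a NUL byte, then the content), and assuming a SHA-256 repository names a blob the same way with only the hash function swapped. The helper name git_object_id is hypothetical and is not part of git or of any library:

```python
import hashlib


def git_object_id(data: bytes, obj_type: str = "blob", algo: str = "sha1") -> str:
    """Name an object the way git does: hash of header + NUL + payload."""
    # Header is "<type> <size-in-bytes>" followed by a single NUL byte.
    header = f"{obj_type} {len(data)}\0".encode("ascii")
    return hashlib.new(algo, header + data).hexdigest()


content = b"hello\n"
sha1_name = git_object_id(content, algo="sha1")
sha256_name = git_object_id(content, algo="sha256")

# A tool that assumes 40-character names will mis-handle the second value.
print(len(sha1_name), sha1_name)
print(len(sha256_name), sha256_name)
```

The same bytes get two unrelated names of different lengths, which is why tooling that hardcodes the SHA-1 width tends to break on SHA-256 repositories.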
* RE: SHA-256 transition
From: rsbecker @ 2022-06-21 13:18 UTC
To: 'Ævar Arnfjörð Bjarmason', 'Stephen Smith'
Cc: 'git', 'Jeff King'

On June 21, 2022 6:25 AM, Ævar Arnfjörð Bjarmason wrote:
>On Mon, Jun 20 2022, Stephen Smith wrote:
>
>> What is the current status of the SHA-1 to SHA-256 transition? Is the
>> transition far enough along that users should start changing over to
>> the new format?
>
>[...]
>
>Discovering a hash collision is only the first step in attacking a Git repository. [...]
>
>Likewise I'd be really surprised if, given the progress of that work, the average Git
>user isn't going to be using not-SHA-1 with Git in 15-20 years, if it's even still
>around at that time as a relevant VCS.
>
>[...]

Adding my own 0.02, what some of us are facing is resistance to adopting git in our or client organizations because of the presence of SHA-1. There are organizations where SHA-1 is blanket banned across the board - regardless of its use. It is sometimes possible to educate our way out of the situation, as above, and show that SHA-1 is not really vulnerable except in the scenario described above (which arguably applies to any hash given enough computing power) and in in-flight communication scenarios and cryptographic use. But getting around this blanket ban is a serious amount of work, and I have very recently seen customers move to older, much less functional (or useful) VCS platforms just because of SHA-1.

I also think the comment about git in 15-20 years is a bit concerning if we are making decisions on that basis. Having written code in the mid 1980s that is still alive and relevant today, I can say that once processes are put in place, customers are very reluctant to move. I expect git to continue to be relevant for a long time, particularly if it is actively maintained by a motivated team.

IMO, the SHA-1 to SHA-256 (or other hash) migration should receive more attention, which I am willing to give, but I think it requires a deeper discussion.

Arguably, if GitHub were to offer SHA-256 repos, I am 99% certain you would see much wider adoption.

--Randall

* Re: SHA-256 transition
From: Ævar Arnfjörð Bjarmason @ 2022-06-21 18:14 UTC
To: rsbecker; +Cc: 'Stephen Smith', 'git', Jeff King

On Tue, Jun 21 2022, rsbecker@nexbridge.com wrote:

> On June 21, 2022 6:25 AM, Ævar Arnfjörð Bjarmason wrote:
>>On Mon, Jun 20 2022, Stephen Smith wrote:
>>
>>> What is the current status of the SHA-1 to SHA-256 transition? Is the
>>> transition far enough along that users should start changing over to
>>> the new format?
>>
>>[...]
>
> Adding my own 0.02, what some of us are facing is resistance to
> adopting git in our or client organizations because of the presence of
> SHA-1. There are organizations where SHA-1 is blanket banned across
> the board - regardless of its use. It is sometimes possible to
> educate our way out of the situation, as above, and show that SHA-1 is
> not really vulnerable except in the scenario described above (which
> arguably applies to any hash given enough computing power) and in
> in-flight communication scenarios and cryptographic use. But getting
> around this blanket ban is a serious amount of work, and I have very
> recently seen customers move to older, much less functional (or
> useful) VCS platforms just because of SHA-1.

I'm not sure if we're talking past one another, or if you're just using this thread to raise a tangential topic.

I understood the question to be closer to "is it ready for normal users, and should we generally recommend it?".

Not whether a fully functioning SHA-256 git, integrated into the wider ecosystem, would be useful to anyone.

Clearly it would be useful to you, but for that question I'd think that your experience here is one more datapoint in the "it's not really ready" column.

I.e. if SHA-1 is a pain for you why not just use SHA-256? That's of course rhetorical: you and I know why you and I are not using it, which is what I was trying to get across here.

> I also think the comment about git in 15-20 years is a bit concerning
> if we are making decisions on that basis. Having written code in the
> mid 1980s that is still alive and relevant today, once processes are
> put in place, customers are very reluctant to move.
> I expect git to continue to be relevant for a long time, particularly
> if it is actively maintained by a motivated team.

I meant that I hope to be using git with SHA-256 in my daily workflow around that time, at least.

I'd probably have been more optimistic in 2017, but it's now been around 5 years since SHAttered and well, here we are. So big migrations of infrastructure-level software take time.

But even if you read that as saying (which I didn't mean) that we couldn't expect git to be around by then, that probably also wouldn't be such a big deal.

Plenty of people were fully invested in Subversion around 2003 or so, and what system were those people using 15 years later in 2018? :)

I hope git has more staying power than that, but if it doesn't then it's probably for the best, as whatever new system replaces it will be worthwhile enough to justify the migration pain.

> IMO, the SHA-1 to SHA-256 (or other hash) migration should receive
> more attention, which I am willing to give, but I think it requires a
> deeper discussion.

I think the overall state at this point is more "requires work/patches" than "requires [deeper] discussion".

I.e. I think having some bidirectional mapping of SHA-1<->SHA-256 (as discussed in the hash transition doc) was up next, and hashing out all the UX issues around that. I'm not sure what the state of patches (if any) is in that area.

> Arguably, if GitHub were to offer SHA-256 repos, I am 99% certain you
> will see much wider adoption.

I hope you're right, but I'm really not so certain myself.

Even if we and the wider ecosystem magically get 100% of the technological aspect right, I think there'll still be emergent pain from any such transition that'll outweigh any gains for many existing repos.

E.g. if you'll need to store objects twice for existing clients and maintain a mapping, how is any hosting provider that charges you for storage space for your repositories going to handle that?

And there'll inevitably be some time of confusion etc.
as repositories are migrated. Anyone who's gone through e.g. a CVS->SVN->Git migration with a large organization will know what I mean. A Git->Git migration will be less painful, but probably never pain-free.

I think it says a lot that the people most concerned about this (and this may just be my confirmation bias) seem the least familiar with how any potential issues with SHA-1 might affect Git in particular.

Or, as in your case, people who are at the receiving end of "checklist compliance" droids :)

Which makes me wonder (and I am partially serious) whether it would help if we officially stated that we're simply not using SHA-1 anymore.

Which is the case both in the mathematical sense (sha1collisiondetection won't return the same outputs for the same inputs as "real" SHA-1), and in the sense that actually matters.

I.e. at least part of the urgency with SHA-1 migrations is because of SHAttered specifically, but not entirely, as it's thought that SHA-1 variants might have other future vulnerabilities.

But that last bit is an area where I'm way less comfortable giving anybody advice, so take that with an even bigger grain of salt.

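The "bidirectional mapping of SHA-1<->SHA-256" and the "store objects twice [...] and maintain a mapping" concern from the message above can be pictured with a small Python sketch. This is only an illustration under stated assumptions: it handles blobs alone (trees and commits embed other object names, so their content would have to be rewritten before hashing), it keeps the table in memory, and the names OidMap, add_blob and translate are hypothetical. It is not the on-disk format described in git's hash transition document.

```python
import hashlib


def blob_id(data: bytes, algo: str) -> str:
    """Git-style blob name: hash of "blob <size>", a NUL byte, then the content."""
    header = f"blob {len(data)}\0".encode("ascii")
    return hashlib.new(algo, header + data).hexdigest()


class OidMap:
    """Hypothetical in-memory two-way table between SHA-1 and SHA-256 names."""

    def __init__(self):
        self._sha1_to_sha256 = {}
        self._sha256_to_sha1 = {}

    def add_blob(self, data: bytes):
        # Every stored object costs two hashes and two table entries.
        sha1, sha256 = blob_id(data, "sha1"), blob_id(data, "sha256")
        self._sha1_to_sha256[sha1] = sha256
        self._sha256_to_sha1[sha256] = sha1
        return sha1, sha256

    def translate(self, oid: str) -> str:
        # Direction is inferred from the hex length: 40 vs 64 characters.
        if len(oid) == 40:
            return self._sha1_to_sha256[oid]
        if len(oid) == 64:
            return self._sha256_to_sha1[oid]
        raise ValueError(f"unexpected object name length: {len(oid)}")


oid_map = OidMap()
old_name, new_name = oid_map.add_blob(b"hello\n")
print(old_name, "<->", new_name)
```

Even in this toy form, every object needs both names computed and recorded, which hints at the storage and bookkeeping cost raised above.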
* Re: SHA-256 transition
From: brian m. carlson @ 2022-06-22 0:29 UTC
To: Ævar Arnfjörð Bjarmason; +Cc: Stephen Smith, git, Jeff King

On 2022-06-21 at 10:25:01, Ævar Arnfjörð Bjarmason wrote:
>
> But the reason I'd still say "no" on the technical/UX side is:
>
>  * The inter-op between SHA-256 and SHA-1 repositories is still
>    nonexistent, except for a one-off import. I.e. we don't have any
>    graceful way to migrate an existing repository.

True, but that doesn't mean that new repositories couldn't use SHA-256.

>  * For new repositories I think you'll probably want to eventually push
>    it to one of the online git hosting providers, none of which (as far
>    as I'm aware) support SHA-256 repos.

This, in my view, is the only compelling reason not to use it for new repositories.

>  * Even if not, any local git tooling that's not part of git.git is
>    likely to break, often for trivial reasons like expecting SHA-1 sized
>    hashes in the output. If you start using it for your repositories
>    and use such tools, you're very likely to be the first person to run
>    into bugs in those areas.

It's my hope to see libgit2 working on SHA-256 repositories in the relatively near future.

> But more importantly (and note that these views are definitely *not*
> shared by some other project members, so take it with a grain of salt):
> There just isn't any compelling selling point to migrate to SHA-256 in
> the near or foreseeable future for a given individual user of git.

I wholly disagree. SHA-1 is obsolete, and as soon as hosting providers support SHA-256, all new repositories should be SHA-256. There is no other defensible reason to continue to use SHA-1 today.

> The reason we started the SHA-1 -> $newhash transition (it wasn't known
> that it would be SHA-256 at the time) was in response to
> https://shattered.io, although it had been discussed before, e.g. in
> the thread starting at [1] in 2012.
>
> We've since migrated our default hash function from SHA-1 to SHA-1DC
> (except on vanilla OSX, see [2]). It's a variant of SHA-1, implemented
> by the same researchers, that detects the SHAttered attack. I'm not
> aware of a current viable SHA-1 collision against the variant of SHA-1
> that we actually use these days.

That's true, but that still doesn't let you store the data. There is some data that you can't store in a SHA-1 repository, and SHA-1DC is extremely slow. Using SHA-256 can make things like indexing packs substantially faster.

-- 
brian m. carlson (he/him or they/them)
Toronto, Ontario, CA

* Re: SHA-256 transition
From: Stephen Smith @ 2022-06-23 0:45 UTC
To: brian m. carlson, Ævar Arnfjörð Bjarmason, Stephen Smith, git, Jeff King

On Tuesday, June 21, 2022 5:29:59 PM MST brian m. carlson wrote:
> On 2022-06-21 at 10:25:01, Ævar Arnfjörð Bjarmason wrote:
> > But the reason I'd still say "no" on the technical/UX side is:
> >
> >  * The inter-op between SHA-256 and SHA-1 repositories is still
> >    nonexistent, except for a one-off import. I.e. we don't have any
> >    graceful way to migrate an existing repository.
>
> True, but that doesn't mean that new repositories couldn't use SHA-256.

So, any idea when a graceful way to migrate a repository might show up?

> >  * For new repositories I think you'll probably want to eventually push
> >    it to one of the online git hosting providers, none of which (as far
> >    as I'm aware) support SHA-256 repos.
>
> This, in my view, is the only compelling reason not to use it for new
> repositories.

Which is a reason to send patches by email.

* Re: SHA-256 transition
From: brian m. carlson @ 2022-06-23 1:44 UTC
To: Stephen Smith; +Cc: Ævar Arnfjörð Bjarmason, git, Jeff King

On 2022-06-23 at 00:45:40, Stephen Smith wrote:
> On Tuesday, June 21, 2022 5:29:59 PM MST brian m. carlson wrote:
> > On 2022-06-21 at 10:25:01, Ævar Arnfjörð Bjarmason wrote:
> > > But the reason I'd still say "no" on the technical/UX side is:
> > >
> > >  * The inter-op between SHA-256 and SHA-1 repositories is still
> > >    nonexistent, except for a one-off import. I.e. we don't have any
> > >    graceful way to migrate an existing repository.
> >
> > True, but that doesn't mean that new repositories couldn't use SHA-256.
>
> So, any idea when a graceful way to migrate a repository might show up?

I'm hoping that my employer will give me time to work on this in the future. Perhaps I'll have more to show on this closer to the last quarter of the year.

At the moment I happen to be very busy in my personal life, so I'm not finding a great deal of time to code much of anything. But if that changes, I'll try to get back to it.

-- 
brian m. carlson (he/him or they/them)
Toronto, Ontario, CA

* Re: SHA-256 transition
From: Junio C Hamano @ 2022-06-23 15:32 UTC
To: brian m. carlson
Cc: Stephen Smith, Ævar Arnfjörð Bjarmason, git, Jeff King

"brian m. carlson" <sandals@crustytoothpaste.net> writes:

> On 2022-06-23 at 00:45:40, Stephen Smith wrote:
>> On Tuesday, June 21, 2022 5:29:59 PM MST brian m. carlson wrote:
>> > On 2022-06-21 at 10:25:01, Ævar Arnfjörð Bjarmason wrote:
>> > > But the reason I'd still say "no" on the technical/UX side is:
>> > >
>> > >  * The inter-op between SHA-256 and SHA-1 repositories is still
>> > >    nonexistent, except for a one-off import. I.e. we don't have any
>> > >    graceful way to migrate an existing repository.
>> >
>> > True, but that doesn't mean that new repositories couldn't use SHA-256.
>>
>> So, any idea when a graceful way to migrate a repository might show up?
>
> I'm hoping that my employer will give me time to work on this in the
> future. Perhaps I'll have more to show on this closer to the last
> quarter of the year.
>
> At the moment I happen to be very busy in my personal life, so I'm not
> finding a great deal of time to code much of anything. But if that
> changes, I'll try to get back to it.

Great ;-). Thanks.

* Re: SHA-256 transition
From: Ævar Arnfjörð Bjarmason @ 2022-06-23 22:21 UTC
To: brian m. carlson; +Cc: Stephen Smith, git, Jeff King, Kyle Meyer

On Wed, Jun 22 2022, brian m. carlson wrote:

> On 2022-06-21 at 10:25:01, Ævar Arnfjörð Bjarmason wrote:
>>
>> But the reason I'd still say "no" on the technical/UX side is:
>>
>>  * The inter-op between SHA-256 and SHA-1 repositories is still
>>    nonexistent, except for a one-off import. I.e. we don't have any
>>    graceful way to migrate an existing repository.
>
> True, but that doesn't mean that new repositories couldn't use SHA-256.

Indeed, and people who know enough about its state can (and in some cases probably should) use it.

I took the start of the thread to be a question about the state of the SHA-1 -> SHA-256 transition, and what we should be generally recommending to users at this point.

>>  * For new repositories I think you'll probably want to eventually push
>>    it to one of the online git hosting providers, none of which (as far
>>    as I'm aware) support SHA-256 repos.
>
> This, in my view, is the only compelling reason not to use it for new
> repositories.

I think it's certainly the main one, given that most people's workflows around Git are heavily forge-based.

>>  * Even if not, any local git tooling that's not part of git.git is
>>    likely to break, often for trivial reasons like expecting SHA-1 sized
>>    hashes in the output. If you start using it for your repositories
>>    and use such tools, you're very likely to be the first person to run
>>    into bugs in those areas.
>
> It's my hope to see libgit2 working on SHA-256 repositories in the
> relatively near future.

I was referring to the very long tail of tooling here.
E.g. I use magit with Emacs, and last I checked it would puke on SHA-256. But checking again it seems someone patched it in January of this year to e.g. change "{40}" in regexes to "{40,}", so in theory it should work now (but I didn't try actually using it in that mode).

We even still have UI code shipped as part of git.git itself that only supports SHA-1, e.g. git-gui's "blame" feature. We were discussing some patches for that late last year, but they didn't make it in: https://lore.kernel.org/git/20211011121757.627-1-carenas@gmail.com/

Any individual tool like that isn't critical, but I'd think there's a large long tail of tooling that git users are likely to interact with, and which for the most part isn't ready. I looked at "tig"'s source now, which I only very occasionally use, and it still has SHA-1 sized constants hardcoded etc.

Of course that's a chicken & egg problem, and at some point we'll need more brave early adopters. I'm only trying to relay the ground truth of what the state is now, for someone who might not be aware of the potential trouble they're getting themselves into.

>> But more importantly (and note that these views are definitely *not*
>> shared by some other project members, so take it with a grain of salt):
>> There just isn't any compelling selling point to migrate to SHA-256 in
>> the near or foreseeable future for a given individual user of git.
>
> I wholly disagree. SHA-1 is obsolete, and as soon as hosting providers
> support SHA-256, all new repositories should be SHA-256. There is no
> other defensible reason to continue to use SHA-1 today.

I really don't think we disagree on the need to move away from SHA-1 to SHA-256. I'm only attempting to summarize the practical threat, and how users might rightly weigh that against other concerns.

NIST deprecated SHA-1 in 2011.
I think it's safe to say, given Git's growth, that most people who've used Git started using it after that date, so clearly there's a large disconnect between official hash algorithm recommendations and how that translates to practical concerns.

>> The reason we started the SHA-1 -> $newhash transition (it wasn't known
>> that it would be SHA-256 at the time) was in response to
>> https://shattered.io, although it had been discussed before, e.g. in
>> the thread starting at [1] in 2012.
>>
>> We've since migrated our default hash function from SHA-1 to SHA-1DC
>> (except on vanilla OSX, see [2]). It's a variant of SHA-1, implemented
>> by the same researchers, that detects the SHAttered attack. I'm not
>> aware of a current viable SHA-1 collision against the variant of SHA-1
>> that we actually use these days.
>
> That's true, but that still doesn't let you store the data. There is
> some data that you can't store in a SHA-1 repository, [...]

I don't think that's come up before. That's correct, but has anyone wanted to do that? I.e. people aren't generating these collisions accidentally, they're crafted.

If we did want to store those we could change the hardcoded -DSHA1DC_INIT_SAFE_HASH_DEFAULT=0 to "1". Right now it's set up to just die if it finds a collision, but it could be made to return the "safe hash".

Of course doing so would mean going all-in on SHA1DC, i.e. such a repository couldn't interop with our optional OpenSSL and other vanilla SHA-1 backends.

> [...] and SHA-1DC is extremely slow. Using SHA-256 can make things
> like indexing packs substantially faster.

Yeah, there's a lot of advantages. We could also safely use hardware acceleration. Really, I'm not meaning to poo-poo SHA-256 here, just to provide some summary of the current state a user might expect.

I do think even this is mostly a fringe benefit in practice. I feel that pain when I e.g. clone chromium.git, but once I pay that one-off cost it's mostly not a bottleneck you notice on incremental push/fetch.
You pay for it on "repack", but that's in the background for most users. It sure would make hosting providers happy though... We have discussed having our cake here & eating it too in the past. I.e. we could safely use, say, OpenSSL SHA-1 for "repack", as long as we kept state and only did so for objects reachable from tips that we'd already validated with SHA-1DC. I think it's a datapoint that even those of us who've noticed the hash slowdown have found it painful, but not *so* painful that we've invested the effort in even relatively low-hanging-fruit workarounds for the problem. ... Finally, I'd really like to thank you for all your work on SHA-256 so far, and really hope that none of what I've said here is discouraging in any way. This thread has received some attention outside this ML (on LWN), so I wanted to clarify some of the points above. Thanks! ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: SHA-256 transition 2022-06-23 22:21 ` Ævar Arnfjörð Bjarmason @ 2022-06-24 0:29 ` Kyle Meyer 2022-06-24 1:03 ` Stephen Smith 1 sibling, 0 replies; 21+ messages in thread From: Kyle Meyer @ 2022-06-24 0:29 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason Cc: brian m. carlson, Stephen Smith, git, Jeff King Ævar Arnfjörð Bjarmason writes: > E.g. I use magit with Emacs, and last I checked it would puke on > SHA-256. But checking again it seems someone patched it in January of > this year to e.g. change "{40}" in regexes to "{40,}", so in theory it > should work now (but I didn't try actually using it in that mode). Yeah, I gave it some testing as I made those adjustments [*], but "in theory it should work" is about my level of confidence too. If you're experimenting with SHA-256 repos and find spots where Magit chokes, opening issues on Magit's side would be very appreciated. [*] https://github.com/magit/magit/pull/4585 ^ permalink raw reply [flat|nested] 21+ messages in thread
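The "{40}" to "{40,}" change mentioned in this message can be illustrated with a small sketch. It assumes Git's usual blob naming (a hash over a "blob <size>" header, a NUL byte, and the content). The names `blob_oid`, `OLD_OID` and `NEW_OID` are hypothetical and are not taken from Magit's source.

```python
import hashlib
import re

def blob_oid(content: bytes, algo: str) -> str:
    # Hex object ID of a blob: hash of b"blob <size>\0" followed by the content.
    header = b"blob " + str(len(content)).encode() + b"\0"
    return hashlib.new(algo, header + content).hexdigest()

# Hypothetical patterns in the spirit of the "{40}" -> "{40,}" change.
OLD_OID = re.compile(r"[0-9a-f]{40}")    # exactly SHA-1 sized
NEW_OID = re.compile(r"[0-9a-f]{40,}")   # SHA-1 sized or longer

sha1_oid = blob_oid(b"", "sha1")      # 40 hex digits
sha256_oid = blob_oid(b"", "sha256")  # 64 hex digits

for oid in (sha1_oid, sha256_oid):
    # Columns: ID length, accepted by old pattern, accepted by new pattern.
    print(len(oid), bool(OLD_OID.fullmatch(oid)), bool(NEW_OID.fullmatch(oid)))
```

A tool that anchors on exactly 40 hex digits rejects the 64-digit ID outright, which is the kind of trivial breakage described earlier in the thread; the relaxed pattern accepts both lengths.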
* Re: SHA-256 transition 2022-06-23 22:21 ` Ævar Arnfjörð Bjarmason 2022-06-24 0:29 ` Kyle Meyer @ 2022-06-24 1:03 ` Stephen Smith 2022-06-24 1:19 ` Ævar Arnfjörð Bjarmason 1 sibling, 1 reply; 21+ messages in thread From: Stephen Smith @ 2022-06-24 1:03 UTC (permalink / raw) To: brian m. carlson, Ævar Arnfjörð Bjarmason Cc: git, Jeff King, Kyle Meyer On Thursday, June 23, 2022 3:21:05 PM MST Ævar Arnfjörð Bjarmason wrote: > Finally, I'd really like to thank you for all your work on SHA-256 so > far, and really hope that none of what I've said here is discouraging in > any way. This thread has received some attention outside this ML (on > LWN), so I wanted to clarify some of the points above. Thanks! I had looked on LWN before I started the thread to see if anything was being discussed and it wasn't. I tend to be an early adopter. I hadn't seen any new commits in the main git repository in a while and was beginning to wonder if it had been abandoned. This thread has convinced me that isn't the case; rather, the main person doing the development is busy. I too want to say thank you (Brian) for your hard work. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: SHA-256 transition 2022-06-24 1:03 ` Stephen Smith @ 2022-06-24 1:19 ` Ævar Arnfjörð Bjarmason 2022-06-24 14:42 ` Jonathan Corbet 0 siblings, 1 reply; 21+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2022-06-24 1:19 UTC (permalink / raw) To: Stephen Smith; +Cc: brian m. carlson, git, Jeff King, Kyle Meyer On Thu, Jun 23 2022, Stephen Smith wrote: > On Thursday, June 23, 2022 3:21:05 PM MST Ævar Arnfjörð Bjarmason wrote: >> Finally, I'd really like to thank you for all your work on SHA-256 so >> far, and really hope that none of what I've said here is discouraging in >> any way. This thread has received some attention outside this ML (on >> LWN), so I wanted to clarify some of the points above. Thanks! > > I had looked on LWN before I started the thread to see if anything was being > discussed and it wasn't. It wouldn't have helped, as I'm referring to LWN having written an article about this thread that you started :) It's part of an ongoing series they've had about Git's SHA-256 transition. Given how LWN makes money I don't know if it's OK to link to it, but it's easy enough to find and/or subscribe to LWN. > I tend to be an early adopter. I hadn't seen any new commits in the main git > repository in a while and was beginning to wonder if it had been abandoned. > This thread has convinced me that isn't the case, but the main person doing > the developing being busy. It was a good discussion, and I'm happy you started it. I think I've mentioned in some past discussions that it would be nice to have some "gitsecurity" user-facing documentation, and one thing such a thing could include is information that helped users to make an informed decision about how much (if at all) they should be worrying about issues arising from what hash they're using Git with. But some documentation on the questions raised here would also be good, i.e. "should I use the new hash?", which we could keep somewhat up-to-date, and e.g. 
talk about the approximate state of major third-party software, such as the forges. Currently the closest thing we have to that is the rather sparse and scary "THIS OPTION IS EXPERIMENTAL" in git-init(1) when talking about --object-format=sha256. > I too want to say thank you (Brian) for your hard work. And thank you for using & being interested in git, and contributing to the ML! ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: SHA-256 transition 2022-06-24 1:19 ` Ævar Arnfjörð Bjarmason @ 2022-06-24 14:42 ` Jonathan Corbet 0 siblings, 0 replies; 21+ messages in thread From: Jonathan Corbet @ 2022-06-24 14:42 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason, Stephen Smith Cc: brian m. carlson, git, Jeff King, Kyle Meyer Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes: > It wouldn't have helped, as I'm referring to LWN having written an > article about this thread that you started :) > > It's part of an ongoing series they've had about Git's SHA-256 > transition. > > Given how LWN makes money I don't know if it's OK to link to it, but > it's easy enough to find and/or subscribe to LWN. Heh...it's not like it hasn't been widely distributed thus far...:) https://lwn.net/SubscriberLink/898522/68ddb300e7eba05d/ Thanks, jon ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: SHA-256 transition 2022-06-22 0:29 ` brian m. carlson 2022-06-23 0:45 ` Stephen Smith 2022-06-23 22:21 ` Ævar Arnfjörð Bjarmason @ 2022-06-24 10:52 ` Jeff King 2022-06-24 15:49 ` Ævar Arnfjörð Bjarmason 2022-06-25 8:53 ` brian m. carlson 2 siblings, 2 replies; 21+ messages in thread From: Jeff King @ 2022-06-24 10:52 UTC (permalink / raw) To: brian m. carlson Cc: Ævar Arnfjörð Bjarmason, Stephen Smith, git On Wed, Jun 22, 2022 at 12:29:59AM +0000, brian m. carlson wrote: > > We've since migrated our default hash function from SHA-1 to SHA-1DC > > (except on vanilla OSX, see [2]). It's a variant SHA-1 that detects the > > SHAttered attack implemented by the same researchers. I'm not aware of a > > current viable SHA-1 collision against the variant of SHA-1 that we > > actually use these days. > > That's true, but that still doesn't let you store the data. There is > some data that you can't store in a SHA-1 repository, and SHA-1DC is > extremely slow. Using SHA-256 can make things like indexing packs > substantially faster. I'm curious if you have numbers on this. I naively converted linux.git to sha256 by doing "fast-export | fast-import" (the latter in a sha256 repo, of course, and then both repacked with "-f --window=250" to get reasonable apples-to-apples packs). Running "index-pack --verify" on the result takes about the same time (this is on an 8-core system, hence the real/user differences): [sha1dc] real 2m43.754s user 10m52.452s sys 0m36.745s [sha256] real 2m41.884s user 12m23.344s sys 0m35.222s The sha256 repo actually has about 10% fewer objects (I didn't investigate, but this is perhaps due to cutting out tags and a few other things to convince fast-export to finish running). I'm not sure about the extra user time (multicore timings here are funny because of frequency scaling, so I think the "real" line is more interesting). So sha256 actually comes out a bit worse here. On the other hand, this is just using our blk_SHA256 implementation. 
There may be faster alternatives (including ones with hardware support). I wouldn't be at all surprised if the difference isn't substantial in the long run, though. The repo is on the order of 100GB of object data. That's a lot to hash, but it's also just a lot to deal with at all (zlib inflating, applying deltas, etc). Anyway, this is a pretty rough cut at an experiment. I was mostly curious if you had done something more advanced, and/or gotten different results. -Peff ^ permalink raw reply [flat|nested] 21+ messages in thread
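For readers who want to check the comparison, here is a minimal sketch that converts the `time` figures quoted in this message to seconds and derives a few ratios. The helper `seconds` and the "busy cores" figure (CPU time divided by wall-clock time) are illustrative choices, not something the message itself computes.

```python
import re

def seconds(stamp: str) -> float:
    """Convert a bash `time` value such as '2m43.754s' to seconds."""
    match = re.fullmatch(r"(\d+)m([\d.]+)s", stamp)
    if match is None:
        raise ValueError(f"unrecognized time value: {stamp!r}")
    return int(match.group(1)) * 60 + float(match.group(2))

# Figures copied from the message above.
runs = {
    "sha1dc": {"real": "2m43.754s", "user": "10m52.452s", "sys": "0m36.745s"},
    "sha256": {"real": "2m41.884s", "user": "12m23.344s", "sys": "0m35.222s"},
}

parsed = {name: {k: seconds(v) for k, v in t.items()} for name, t in runs.items()}

for name, t in parsed.items():
    cpu = t["user"] + t["sys"]
    print(f"{name}: real={t['real']:.1f}s cpu={cpu:.1f}s busy_cores={cpu / t['real']:.2f}")

user_ratio = parsed["sha256"]["user"] / parsed["sha1dc"]["user"]
real_ratio = parsed["sha256"]["real"] / parsed["sha1dc"]["real"]
print(f"user-time ratio sha256/sha1dc: {user_ratio:.3f}")
print(f"wall-clock ratio sha256/sha1dc: {real_ratio:.3f}")
```

On these figures the SHA-256 run used roughly 14% more user CPU time while finishing about 1% sooner in wall-clock terms, which matches the description of the two runs taking "about the same time".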
* Re: SHA-256 transition 2022-06-24 10:52 ` Jeff King @ 2022-06-24 15:49 ` Ævar Arnfjörð Bjarmason 2022-06-25 8:53 ` brian m. carlson 1 sibling, 0 replies; 21+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2022-06-24 15:49 UTC (permalink / raw) To: Jeff King; +Cc: brian m. carlson, Stephen Smith, git On Fri, Jun 24 2022, Jeff King wrote: > On Wed, Jun 22, 2022 at 12:29:59AM +0000, brian m. carlson wrote: > >> > We've since migrated our default hash function from SHA-1 to SHA-1DC >> > (except on vanilla OSX, see [2]). It's a variant SHA-1 that detects the >> > SHAttered attack implemented by the same researchers. I'm not aware of a >> > current viable SHA-1 collision against the variant of SHA-1 that we >> > actually use these days. >> >> That's true, but that still doesn't let you store the data. There is >> some data that you can't store in a SHA-1 repository, and SHA-1DC is >> extremely slow. Using SHA-256 can make things like indexing packs >> substantially faster. > > I'm curious if you have numbers on this. I naively converted linux.git > to sha256 by doing "fast-export | fast-import" (the latter in a sha256 > repo, of course, and then both repacked with "-f --window=250" to get > reasonable apples-to-apples packs). > > Running "index-pack --verify" on the result takes about the same time > (this is on an 8-core system, hence the real/user differences): > > [sha1dc] > real 2m43.754s > user 10m52.452s > sys 0m36.745s > > [sha256] > real 2m41.884s > user 12m23.344s > sys 0m35.222s > > The sha256 repo actually has about 10% fewer objects (I didn't > investigate, but this is perhaps due to cutting out tags and a few other > things to convince fast-export to finish running). I'm not sure about > the extra user time (multicore timings here are funny because of > frequency scaling, so I think the "real" line is more interesting). So > sha256 actually comes out a bit worse here. On the other hand, this is > just using our blk_SHA256 implementation. 
There may be faster > alternatives (including ones with hardware support). > > I wouldn't be at all surprised if the difference isn't substantial in > the long run, though. The repo is on the order of 100GB of object data. > That's a lot to hash, but it's also just a lot to deal with at all (zlib > inflating, applying deltas, etc). > > Anyway, this is a pretty rough cut at an experiment. I was mostly > curious if you had done something more advanced, and/or gotten different > results. I haven't checked or verified this, but https://www.marc-stevens.nl/research/#software claims: Counter-cryptanalysis: New improved release SHA-1 collision detection library, which protects against twice as many SHA-1 attack classes (disturbance vectors), but is 9 times faster than previous version. Speed is now 1.87 times normal SHA-1. It is currently used among others by Git, GitHub, GMail, Google Drive and Microsoft OneDrive. And looking at the OID you initially imported for sha1dc (and my later submodule import) we've always had what seems to have been that performance improvement, which I think (but I didn't have time to benchmark) is: https://github.com/cr-marcstevens/sha1collisiondetection/pull/20 *But* there was also this later performance work: https://github.com/cr-marcstevens/sha1collisiondetection/pull/30; see also this comment: https://github.com/cr-marcstevens/sha1collisiondetection/commit/33a694a9ee1b79c24be45f9eab5ac0e1aeeaf271 And then if you look at the sha1collisiondetection repo the latest tag is stable-v1.0.3, which pre-dates that (but not the original perf work), and was tagged in 2017. There were a lot of commits since then. I wasn't able to find any third party package using DC_SHA1_EXTERNAL, but I wonder if any performance tests with sha1dc in the wild are using some older version, which from the looks of it might have had a performance regression on x86... ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: SHA-256 transition 2022-06-24 10:52 ` Jeff King 2022-06-24 15:49 ` Ævar Arnfjörð Bjarmason @ 2022-06-25 8:53 ` brian m. carlson 2022-06-26 0:09 ` Plan for SHA-256 repos to support SHA-1? Eric W. Biederman 2022-07-01 18:00 ` SHA-256 transition Jeff King 1 sibling, 2 replies; 21+ messages in thread From: brian m. carlson @ 2022-06-25 8:53 UTC (permalink / raw) To: Jeff King; +Cc: Ævar Arnfjörð Bjarmason, Stephen Smith, git [-- Attachment #1: Type: text/plain, Size: 2602 bytes --] On 2022-06-24 at 10:52:36, Jeff King wrote: > On Wed, Jun 22, 2022 at 12:29:59AM +0000, brian m. carlson wrote: > > > > We've since migrated our default hash function from SHA-1 to SHA-1DC > > > (except on vanilla OSX, see [2]). It's a variant SHA-1 that detects the > > > SHAttered attack implemented by the same researchers. I'm not aware of a > > > current viable SHA-1 collision against the variant of SHA-1 that we > > > actually use these days. > > > > That's true, but that still doesn't let you store the data. There is > > some data that you can't store in a SHA-1 repository, and SHA-1DC is > > extremely slow. Using SHA-256 can make things like indexing packs > > substantially faster. > > I'm curious if you have numbers on this. I naively converted linux.git > to sha256 by doing "fast-export | fast-import" (the latter in a sha256 > repo, of course, and then both repacked with "-f --window=250" to get > reasonable apples-to-apples packs). I did the same thing, except I just did a regular gc and not a custom repack, and I created both a SHA-1 and SHA-256 repo from the same original. 
> Running "index-pack --verify" on the result takes about the same time > (this is on an 8-core system, hence the real/user differences): > > [sha1dc] > real 2m43.754s > user 10m52.452s > sys 0m36.745s > > [sha256] > real 2m41.884s > user 12m23.344s > sys 0m35.222s Here are my results: [sha256] time ~/checkouts/git/git index-pack --verify .git/objects/pack/pack-*.pack ~/checkouts/git/git index-pack --verify .git/objects/pack/pack-*.pack 2768.42s user 181.00s system 185% cpu 26:31.70 total [sha1dc] time ~/checkouts/git/git index-pack --verify .git/objects/pack/pack-*.pack ~/checkouts/git/git index-pack --verify .git/objects/pack/pack-*.pack 3041.28s user 184.84s system 199% cpu 26:54.74 total Note that in my case, I'm using an accelerated hardware-based SHA-256 implementation (Nettle, which I will send a patch for soon). This is a brand new ThinkPad X1 Carbon Gen 10 with an i7-1280P (with 20 "cores" of different sizes). So this is about 9% faster in terms of total CPU usage on SHA-256 with that implementation. The wallclock time is less impressive here. Of course, it might be slower in software, but considering that AMD has had SHA-NI for some time, newer Intel processors have it, and ARM also has SHA-2 acceleration instructions, it's likely it will be faster on most recent machines assuming it's compiled appropriately. -- brian m. carlson (he/him or they/them) Toronto, Ontario, CA [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 263 bytes --] ^ permalink raw reply [flat|nested] 21+ messages in thread
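The "about 9%" figure can be reproduced from the two `time` lines quoted in this message. The sketch below is only arithmetic on those numbers; the helper name `saving` is hypothetical, and whether "total CPU usage" was meant as user time alone or user plus system time is an assumption, so both are shown.

```python
# Figures copied from the two `time` lines above (zsh format: user, system, total).
runs = {
    "sha256": {"user": 2768.42, "system": 181.00, "wall": 26 * 60 + 31.70},
    "sha1dc": {"user": 3041.28, "system": 184.84, "wall": 26 * 60 + 54.74},
}

def saving(metric) -> float:
    """Fraction by which the SHA-256 run undercuts the SHA-1DC run for a metric."""
    base = metric(runs["sha1dc"])
    return (base - metric(runs["sha256"])) / base

user_saving = saving(lambda r: r["user"])
cpu_saving = saving(lambda r: r["user"] + r["system"])
wall_saving = saving(lambda r: r["wall"])

print(f"user CPU:   {user_saving:.1%}")
print(f"user+sys:   {cpu_saving:.1%}")
print(f"wall clock: {wall_saving:.1%}")
```

User time alone comes out at about 9.0% lower for SHA-256, user plus system at about 8.6%, and wall-clock time at about 1.4%, in line with the remark that the wall-clock difference is less impressive.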
* Plan for SHA-256 repos to support SHA-1? 2022-06-25 8:53 ` brian m. carlson @ 2022-06-26 0:09 ` Eric W. Biederman 2022-06-26 0:27 ` Junio C Hamano 2022-07-01 18:00 ` SHA-256 transition Jeff King 1 sibling, 1 reply; 21+ messages in thread From: Eric W. Biederman @ 2022-06-26 0:09 UTC (permalink / raw) To: brian m. carlson Cc: Jeff King, Ævar Arnfjörð Bjarmason, Stephen Smith, git Is there at this point a solid plan for how SHA-256 repos will support access by SHA-1-only clients? I remember reading a discussion of having a table somewhere that would translate SHA-256 to SHA-1 when needed. I had a brainstorm which is probably the uninformed opinion of an outsider. I was thinking that in server settings a well-packed pack of all of the objects is kept to make it quick for git clone to do its work. I was thinking that perhaps in a repo that wanted to support access from SHA-1 clients it might make sense to have three packs instead of the standard one. A pack of all of the blobs with no oid references, so that either a SHA-256 or a SHA-1 client could consume it (modulo header changes that are needed). The pack of blobs could have both an ordinary SHA-256 index and a SHA-1 index. Then there could be two packs of metadata (aka trees and commits and tags that embed oids). One pack in SHA-256 and one pack in SHA-1. Then with a little header surgery git clone could be served with sendfile by gluing the pack of blobs and the pack of metadata objects together. In the normal end user client case that doesn't seem to make a lot of sense, as all that is needed is to figure out which oid to use and always display SHA-256. My naivete suggests that just keeping the SHA-1 metadata in a SHA-256 repo could be simple enough to implement that it would allow the transition to start happening, and it could be optimized away later. Eric ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Plan for SHA-256 repos to support SHA-1? 2022-06-26 0:09 ` Plan for SHA-256 repos to support SHA-1? Eric W. Biederman @ 2022-06-26 0:27 ` Junio C Hamano 2022-06-26 15:19 ` brian m. carlson 0 siblings, 1 reply; 21+ messages in thread From: Junio C Hamano @ 2022-06-26 0:27 UTC (permalink / raw) To: Eric W. Biederman Cc: brian m. carlson, Jeff King, Ævar Arnfjörð Bjarmason, Stephen Smith, git On Sat, Jun 25, 2022 at 5:10 PM Eric W. Biederman <ebiederm@xmission.com> wrote: > Is there at this point a solid plan for how SHA-256 repos will support > access SHA-1 only clients? > > I remember reading a discussion of having a table somewhere that would > translate SHA-256 to SHA-1 when needed. Documentation/technical/hash-function-transition.txt has fleshed out the necessary details? ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: Plan for SHA-256 repos to support SHA-1? 2022-06-26 0:27 ` Junio C Hamano @ 2022-06-26 15:19 ` brian m. carlson 0 siblings, 0 replies; 21+ messages in thread From: brian m. carlson @ 2022-06-26 15:19 UTC (permalink / raw) To: Junio C Hamano Cc: Eric W. Biederman, Jeff King, Ævar Arnfjörð Bjarmason, Stephen Smith, git [-- Attachment #1: Type: text/plain, Size: 782 bytes --] On 2022-06-26 at 00:27:57, Junio C Hamano wrote: > On Sat, Jun 25, 2022 at 5:10 PM Eric W. Biederman <ebiederm@xmission.com> wrote: > > Is there at this point a solid plan for how SHA-256 repos will support > > access SHA-1 only clients? > > > > I remember reading a discussion of having a table somewhere that would > > translate SHA-256 to SHA-1 when needed. > > Documentation/technical/hash-function-transition.txt has flushed out > the necessary details? Yup. The design there sounds very simple and it is, conceptually, but practically implementing it is quite complex. You can pull the in-progress work from transition-interop on my GitHub remote to see where some of the complexity lies. -- brian m. carlson (he/him or they/them) Toronto, Ontario, CA [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 263 bytes --] ^ permalink raw reply [flat|nested] 21+ messages in thread
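To make the "translation table" idea from this sub-thread concrete, here is a toy sketch. It is not the design in Documentation/technical/hash-function-transition.txt and not code from the transition-interop branch; it only assumes Git's usual object naming and the tree entry layout (mode, name, NUL byte, raw object ID). The names `oid`, `tree_body`, `files` and `table` are hypothetical.

```python
import hashlib

def oid(kind: str, body: bytes, algo: str) -> bytes:
    """Raw object ID: hash of b"<kind> <size>\\0" followed by the object body."""
    header = kind.encode() + b" " + str(len(body)).encode() + b"\0"
    return hashlib.new(algo, header + body).digest()

def tree_body(entries: dict, algo: str) -> bytes:
    """Serialize a flat tree of regular files; each entry embeds the blob's raw ID."""
    return b"".join(
        b"100644 " + name.encode() + b"\0" + oid("blob", data, algo)
        for name, data in sorted(entries.items())
    )

files = {"README": b"hello\n", "empty.txt": b""}

# Blobs: the stored bytes are identical, only the name differs per algorithm.
table = {oid("blob", d, "sha256").hex(): oid("blob", d, "sha1").hex() for d in files.values()}

# Trees: the bodies themselves differ, because they embed IDs of different lengths.
sha1_tree = tree_body(files, "sha1")
sha256_tree = tree_body(files, "sha256")
table[oid("tree", sha256_tree, "sha256").hex()] = oid("tree", sha1_tree, "sha1").hex()

for new, old in table.items():
    print(new, "->", old)
```

The sketch shows why blobs could in principle be shared between the two formats while trees (and, by the same reasoning, commits and tags) have to be rewritten per hash: their content embeds other objects' IDs. The practical complexity mentioned in the reply lies well beyond this toy.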
* Re: SHA-256 transition 2022-06-25 8:53 ` brian m. carlson 2022-06-26 0:09 ` Plan for SHA-256 repos to support SHA-1? Eric W. Biederman @ 2022-07-01 18:00 ` Jeff King 1 sibling, 0 replies; 21+ messages in thread From: Jeff King @ 2022-07-01 18:00 UTC (permalink / raw) To: brian m. carlson Cc: Ævar Arnfjörð Bjarmason, Stephen Smith, git On Sat, Jun 25, 2022 at 08:53:53AM +0000, brian m. carlson wrote: > > I'm curious if you have numbers on this. I naively converted linux.git > > to sha256 by doing "fast-export | fast-import" (the latter in a sha256 > > repo, of course, and then both repacked with "-f --window=250" to get > > reasonable apples-to-apples packs). > > I did the same thing, except I just did a regular gc and not a custom > repack, and I created both a SHA-1 and SHA-256 repo from the same > original. That _might_ influence your timings a bit, just because the fast-import packs have lousy deltas. I think my linux.git was something like 6GB from fast-import, packed down to 1.5GB after "repack -f". But I'm not sure if it would change the direction of the trend of what you were measuring, only the magnitude. We'll hash the same bytes in either case, but in the fast-import pack we'd spend more time on zlib inflating and less time on delta reconstruction. Which one is more expensive probably depends on a lot of factors, but it's entirely possible that running your test after a "repack -f" would actually show a greater change between the two cases. 
> Here are my results: > > [sha256] > time ~/checkouts/git/git index-pack --verify .git/objects/pack/pack-*.pack > ~/checkouts/git/git index-pack --verify .git/objects/pack/pack-*.pack 2768.42s user 181.00s system 185% cpu 26:31.70 total > > [sha1dc] > time ~/checkouts/git/git index-pack --verify .git/objects/pack/pack-*.pack > ~/checkouts/git/git index-pack --verify .git/objects/pack/pack-*.pack 3041.28s user 184.84s system 199% cpu 26:54.74 total > > Note that in my case, I'm using an accelerated hardware-based SHA-256 > implementation (Nettle, which I will send a patch for soon). This is a > brand new ThinkPad X1 Carbon Gen 10 with an i7-1280P (with 20 "cores" of > different sizes). OK, that probably explains the difference in results we saw. Thanks for sharing your numbers. I think that's pretty "apples to apples" since we'd hope that sha256 will eventually be accelerated, but sha1dc never will be. -Peff ^ permalink raw reply [flat|nested] 21+ messages in thread
end of thread, other threads:[~2022-07-01 18:00 UTC | newest]

Thread overview: 21+ messages
2022-06-20 22:51 SHA-256 transition Stephen Smith
2022-06-20 23:13 ` rsbecker
2022-06-21 10:25 ` Ævar Arnfjörð Bjarmason
2022-06-21 13:18 ` rsbecker
2022-06-21 18:14 ` Ævar Arnfjörð Bjarmason
2022-06-22 0:29 ` brian m. carlson
2022-06-23 0:45 ` Stephen Smith
2022-06-23 1:44 ` brian m. carlson
2022-06-23 15:32 ` Junio C Hamano
2022-06-23 22:21 ` Ævar Arnfjörð Bjarmason
2022-06-24 0:29 ` Kyle Meyer
2022-06-24 1:03 ` Stephen Smith
2022-06-24 1:19 ` Ævar Arnfjörð Bjarmason
2022-06-24 14:42 ` Jonathan Corbet
2022-06-24 10:52 ` Jeff King
2022-06-24 15:49 ` Ævar Arnfjörð Bjarmason
2022-06-25 8:53 ` brian m. carlson
2022-06-26 0:09 ` Plan for SHA-256 repos to support SHA-1? Eric W. Biederman
2022-06-26 0:27 ` Junio C Hamano
2022-06-26 15:19 ` brian m. carlson
2022-07-01 18:00 ` SHA-256 transition Jeff King