From: Jonathan Nieder <jrnieder@gmail.com> To: Jeff King <peff@peff.net> Cc: Shawn Pearce <spearce@spearce.org>, Linus Torvalds <torvalds@linux-foundation.org>, Git Mailing List <git@vger.kernel.org>, Stefan Beller <sbeller@google.com>, bmwill@google.com, Jonathan Tan <jonathantanmy@google.com>, David Lang <david@lang.hm>, "brian m. carlson" <sandals@crustytoothpaste.net> Subject: Re: RFC v3: Another proposed hash function transition plan Date: Fri, 10 Mar 2017 11:55:24 -0800 Message-ID: <20170310195523.GF26789@aiede.mtv.corp.google.com> (raw) In-Reply-To: <20170310193835.t7syswueuu7nmkjz@sigill.intra.peff.net> Jeff King wrote: > On Thu, Mar 09, 2017 at 12:24:08PM -0800, Jonathan Nieder wrote: >>> SHA-1 to SHA-3: lookup SHA-1 in .msha1, reverse .idx, find offset to >>> read the SHA-3. >>> SHA-3 to SHA-1: lookup SHA-3 in .idx, and reverse the .msha1 file to >>> translate offset to SHA-1. >> >> Thanks for this suggestion. I was initially vaguely nervous about >> lookup times in an idx-style file, but as you say, object reads from a >> packfile already have to deal with this kind of lookup and work fine. > > Not exactly. The "reverse .idx" step has to build the reverse mapping on > the fly, and it's non-trivial. Sure. To be clear, I was handwaving over that since adding an on-disk reverse .idx is a relatively small change. [...] > So I think it's solvable, but I suspect we would want an extension to > the .idx format to store the mapping array, in order to keep it log-n. i.e., this. The loose object side is the more worrying bit, since we currently don't have any practical bound on the number of loose objects. One way to deal with that is to disallow loose objects completely. Use packfiles for new objects, batching the objects produced by a single process into a single packfile. Teach "git gc --auto" a behavior similar to Martin Fick's "git exproll" to combine packfiles between full gcs to maintain reasonable performance. For unreachable objects, instead of using loose objects, use "unreachable garbage" packs explicitly labeled as such, with similar semantics to what JGit's DfsRepository backend uses (described in the discussion at https://git.eclipse.org/r/89455). That's a direction that I want in the long term anyway. I was hoping not to couple such changes with the hash transition but it might be one of the simpler ways to go. Jonathan
next prev parent reply index Thread overview: 111+ messages / expand[flat|nested] mbox.gz Atom feed top 2017-03-04 1:12 RFC: " Jonathan Nieder 2017-03-05 2:35 ` Linus Torvalds 2017-03-06 0:26 ` brian m. carlson 2017-03-06 18:24 ` Brandon Williams 2017-06-15 10:30 ` Which hash function to use, was " Johannes Schindelin 2017-06-15 11:05 ` Mike Hommey 2017-06-15 13:01 ` Jeff King 2017-06-15 16:30 ` Ævar Arnfjörð Bjarmason 2017-06-15 19:34 ` Johannes Schindelin 2017-06-15 21:59 ` Adam Langley 2017-06-15 22:41 ` brian m. carlson 2017-06-15 23:36 ` Ævar Arnfjörð Bjarmason 2017-06-16 0:17 ` brian m. carlson 2017-06-16 6:25 ` Ævar Arnfjörð Bjarmason 2017-06-16 13:24 ` Johannes Schindelin 2017-06-16 17:38 ` Adam Langley 2017-06-16 20:52 ` Junio C Hamano 2017-06-16 21:12 ` Junio C Hamano 2017-06-16 21:24 ` Jonathan Nieder 2017-06-16 21:39 ` Ævar Arnfjörð Bjarmason 2017-06-16 20:42 ` Jeff King 2017-06-19 9:26 ` Johannes Schindelin 2017-06-15 21:10 ` Mike Hommey 2017-06-16 4:30 ` Jeff King 2017-06-15 17:36 ` Brandon Williams 2017-06-15 19:20 ` Junio C Hamano 2017-06-15 19:13 ` Jonathan Nieder 2017-03-07 0:17 ` RFC v3: " Jonathan Nieder 2017-03-09 19:14 ` Shawn Pearce 2017-03-09 20:24 ` Jonathan Nieder 2017-03-10 19:38 ` Jeff King 2017-03-10 19:55 ` Jonathan Nieder [this message] 2017-09-28 4:43 ` [PATCH v4] technical doc: add a design doc for hash function transition Jonathan Nieder 2017-09-29 6:06 ` Junio C Hamano 2017-09-29 8:09 ` Junio C Hamano 2017-09-29 17:34 ` Jonathan Nieder 2017-10-02 8:25 ` Junio C Hamano 2017-10-02 19:41 ` Jason Cooper 2017-10-02 9:02 ` Junio C Hamano 2017-10-02 19:23 ` Jason Cooper 2017-10-03 5:40 ` Junio C Hamano 2017-10-03 13:08 ` Jason Cooper 2017-10-04 1:44 ` Junio C Hamano 2017-09-06 6:28 ` RFC v3: Another proposed hash function transition plan Junio C Hamano 2017-09-08 2:40 ` Junio C Hamano 2017-09-08 3:34 ` Jeff King 2017-09-11 18:59 ` Brandon Williams 2017-09-13 12:05 ` Johannes Schindelin 2017-09-13 13:43 ` demerphq 2017-09-13 22:51 ` Jonathan Nieder 2017-09-14 18:26 ` Johannes Schindelin 2017-09-14 18:40 ` Jonathan Nieder 2017-09-14 22:09 ` Johannes Schindelin 2017-09-13 23:30 ` Linus Torvalds 2017-09-14 18:45 ` Johannes Schindelin 2017-09-18 12:17 ` Gilles Van Assche 2017-09-18 22:16 ` Johannes Schindelin 2017-09-19 16:45 ` Gilles Van Assche 2017-09-29 13:17 ` Johannes Schindelin 2017-09-29 14:54 ` Joan Daemen 2017-09-29 22:33 ` Johannes Schindelin 2017-09-30 22:02 ` Joan Daemen 2017-10-02 14:26 ` Johannes Schindelin 2017-09-18 22:25 ` Jonathan Nieder 2017-09-26 17:05 ` Jason Cooper 2017-09-26 22:11 ` Johannes Schindelin 2017-09-26 22:25 ` [PATCH] technical doc: add a design doc for hash function transition Stefan Beller 2017-09-26 23:38 ` Jonathan Nieder 2017-09-26 23:51 ` RFC v3: Another proposed hash function transition plan Jonathan Nieder 2017-10-02 14:54 ` Jason Cooper 2017-10-02 16:50 ` Brandon Williams 2017-10-02 14:00 ` Jason Cooper 2017-10-02 17:18 ` Linus Torvalds 2017-10-02 19:37 ` Jeff King 2017-09-13 16:30 ` Jonathan Nieder 2017-09-13 21:52 ` Junio C Hamano 2017-09-13 22:07 ` Stefan Beller 2017-09-13 22:18 ` Jonathan Nieder 2017-09-14 2:13 ` Junio C Hamano 2017-09-14 15:23 ` Johannes Schindelin 2017-09-14 15:45 ` demerphq 2017-09-14 22:06 ` Johannes Schindelin 2017-09-13 22:15 ` Junio C Hamano 2017-09-13 22:27 ` Jonathan Nieder 2017-09-14 2:10 ` Junio C Hamano 2017-09-14 12:39 ` Johannes Schindelin 2017-09-14 16:36 ` Brandon Williams 2017-09-14 18:49 ` Jonathan Nieder 2017-09-15 20:42 ` Philip Oakley 2017-03-05 11:02 ` RFC: " David Lang [not found] ` <CA+dhYEXHbQfJ6KUB1tWS9u1MLEOJL81fTYkbxu4XO-i+379LPw@mail.gmail.com> 2017-03-06 9:43 ` Jeff King 2017-03-06 23:40 ` Jonathan Nieder 2017-03-07 0:03 ` Mike Hommey 2017-03-06 8:43 ` Jeff King 2017-03-06 18:39 ` Jonathan Tan 2017-03-06 19:22 ` Linus Torvalds 2017-03-06 19:59 ` Brandon Williams 2017-03-06 21:53 ` Junio C Hamano 2017-03-07 8:59 ` Jeff King 2017-03-06 18:43 ` Junio C Hamano 2017-03-07 18:57 ` Ian Jackson 2017-03-07 19:15 ` Linus Torvalds 2017-03-08 11:20 ` Ian Jackson 2017-03-08 15:37 ` Johannes Schindelin 2017-03-13 9:24 ` The Keccak Team 2017-03-13 17:48 ` Jonathan Nieder 2017-03-13 18:34 ` ankostis 2017-03-17 11:07 ` Johannes Schindelin 2017-03-08 15:40 Johannes Schindelin 2017-03-20 5:21 ` Use base32? Jason Hennessey 2017-03-20 5:58 ` Michael Steuer 2017-03-20 8:05 ` Jacob Keller 2017-03-21 3:07 ` Michael Steuer
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=20170310195523.GF26789@aiede.mtv.corp.google.com \ --to=jrnieder@gmail.com \ --cc=bmwill@google.com \ --cc=david@lang.hm \ --cc=git@vger.kernel.org \ --cc=jonathantanmy@google.com \ --cc=peff@peff.net \ --cc=sandals@crustytoothpaste.net \ --cc=sbeller@google.com \ --cc=spearce@spearce.org \ --cc=torvalds@linux-foundation.org \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
Git Mailing List Archive on lore.kernel.org Archives are clonable: git clone --mirror https://lore.kernel.org/git/0 git/git/0.git # If you have public-inbox 1.1+ installed, you may # initialize and index your mirror using the following commands: public-inbox-init -V2 git git/ https://lore.kernel.org/git \ git@vger.kernel.org public-inbox-index git Example config snippet for mirrors Newsgroup available over NNTP: nntp://nntp.lore.kernel.org/org.kernel.vger.git AGPL code for this site: git clone https://public-inbox.org/public-inbox.git