All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jeff King <peff@peff.net>
To: Taylor Blau <me@ttaylorr.com>
Cc: git@vger.kernel.org, avarab@gmail.com, dstolee@microsoft.com,
	gitster@pobox.com, jonathantanmy@google.com
Subject: Re: [PATCH v3 12/16] Documentation/technical: describe multi-pack reverse indexes
Date: Mon, 29 Mar 2021 08:12:39 -0400	[thread overview]
Message-ID: <YGHEN6rcao2tJ006@coredump.intra.peff.net> (raw)
In-Reply-To: <4745bb8590f5cdc24445618dd63ba6bd541227b4.1615482270.git.me@ttaylorr.com>

On Thu, Mar 11, 2021 at 12:05:25PM -0500, Taylor Blau wrote:

> As a prerequisite to implementing multi-pack bitmaps, motivate and
> describe the format and ordering of the multi-pack reverse index.

Nicely written overall. I found a few typos / formatting issues.

> +One solution is to let bits occupy the same position in the oid-sorted
> +index stored by the MIDX. But because oids are effectively random, there

s/there/their/

> +Given the list of packs and their counts of objects, you can
> +na&iuml;vely reconstruct that pseudo-pack ordering (e.g., the object at

An HTML entity seems to have snuck in. The source is utf8, so we can
just say ï.

> +position 27 must be (c,1) because packs "a" and "b" consumed 25 of the
> +slots). But there's a catch. Objects may be duplicated between packs, in
> +which case the MIDX only stores one pointer to the object (and thus we'd
> +want only one slot in the bitmap).
> +
> +Callers could handle duplicates themselves by reading objects in order
> +of their bit-position, but that's linear in the number of objects, and
> +much too expensive for ordinary bitmap lookups. Building a reverse index
> +solves this, since it is the logical inverse of the index, and that
> +index has already removed duplicates. But, building a reverse index on
> +the fly can be expensive. Since we already have an on-disk format for
> +pack-based reverse indexes, let's reuse it for the MIDX's pseudo-pack,
> +too.

Yep, I think this nicely builds up the logic explaining the need for the
midx .rev file.

> +Objects from the MIDX are ordered as follows to string together the
> +pseudo-pack. Let _pack(o)_ return the pack from which _o_ was selected
> +by the MIDX, and define an ordering of packs based on their numeric ID
> +(as stored by the MIDX). Let _offset(o)_ return the object offset of _o_
> +within _pack(o)_. Then, compare _o~1~_ and _o~2~_ as follows:

I guess the asciidoc-formatted version of this makes these nicely
italicized and subscripted. Personally I think pack(o) and o1 would be
more readable in the source (which is what I would tend to read). Or
maybe backticks if you want to be fancy.

> +  - If _pack(o~1~) &ne; pack(o~2~)_, then sort the two objects in
> +    descending order based on the pack ID.
> +
> +  - Otherwise, _pack(o~1~) &equals; pack(o~2~)_, and the objects are
> +    sorted in pack-order (i.e., _o~1~_ sorts ahead of _o~2~_ exactly
> +    when _offset(o~1~) &lt; offset(o~2~)_).

A few more HTML bits in the comparison operators.

-Peff

  reply	other threads:[~2021-03-29 12:13 UTC|newest]

Thread overview: 171+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-02-10 23:02 [PATCH 0/9] midx: implement a multi-pack reverse index Taylor Blau
2021-02-10 23:02 ` [PATCH 1/9] t/helper/test-read-midx.c: add '--show-objects' Taylor Blau
2021-02-11  2:27   ` Derrick Stolee
2021-02-11  2:34     ` Taylor Blau
2021-02-10 23:02 ` [PATCH 2/9] midx: allow marking a pack as preferred Taylor Blau
2021-02-11 19:33   ` SZEDER Gábor
2021-02-15 15:49     ` Taylor Blau
2021-02-15 17:01       ` Ævar Arnfjörð Bjarmason
2021-02-15 18:41         ` [PATCH 0/5] commit-graph: parse_options() cleanup Ævar Arnfjörð Bjarmason
2021-02-15 18:41         ` [PATCH 1/5] commit-graph: define common usage with a macro Ævar Arnfjörð Bjarmason
2021-02-16 11:33           ` Derrick Stolee
2021-02-15 18:41         ` [PATCH 2/5] commit-graph: remove redundant handling of -h Ævar Arnfjörð Bjarmason
2021-02-16 11:35           ` Derrick Stolee
2021-02-15 18:41         ` [PATCH 3/5] commit-graph: use parse_options_concat() Ævar Arnfjörð Bjarmason
2021-02-15 18:51           ` Taylor Blau
2021-02-15 19:53             ` Taylor Blau
2021-02-15 20:39             ` Ævar Arnfjörð Bjarmason
2021-09-17 21:13               ` SZEDER Gábor
2021-09-17 22:03                 ` Jeff King
2021-09-18  4:30                   ` Taylor Blau
2021-09-18  7:20                     ` Ævar Arnfjörð Bjarmason
2021-09-18 15:56                       ` Taylor Blau
2021-09-18 15:58                         ` Taylor Blau
2021-09-18  0:58                 ` Ævar Arnfjörð Bjarmason
2021-02-15 18:41         ` [PATCH 4/5] commit-graph: refactor dispatch loop for style Ævar Arnfjörð Bjarmason
2021-02-15 18:53           ` Taylor Blau
2021-02-16 11:40             ` Derrick Stolee
2021-02-16 12:02               ` Ævar Arnfjörð Bjarmason
2021-02-16 18:28                 ` Derrick Stolee
2021-02-15 18:41         ` [PATCH 5/5] commit-graph: show usage on "commit-graph [write|verify] garbage" Ævar Arnfjörð Bjarmason
2021-02-15 19:06           ` Taylor Blau
2021-02-16 11:43           ` Derrick Stolee
2021-02-15 21:01         ` [PATCH v2 0/4] midx: split out sub-commands Taylor Blau
2021-02-15 21:01           ` [PATCH v2 1/4] builtin/multi-pack-index.c: inline 'flags' with options Taylor Blau
2021-02-15 21:01           ` [PATCH v2 2/4] builtin/multi-pack-index.c: don't handle 'progress' separately Taylor Blau
2021-02-15 21:39             ` Ævar Arnfjörð Bjarmason
2021-02-15 21:45               ` Taylor Blau
2021-02-16 11:47                 ` Derrick Stolee
2021-02-15 21:01           ` [PATCH v2 3/4] builtin/multi-pack-index.c: define common usage with a macro Taylor Blau
2021-02-15 21:01           ` [PATCH v2 4/4] builtin/multi-pack-index.c: split sub-commands Taylor Blau
2021-02-15 21:54             ` Ævar Arnfjörð Bjarmason
2021-02-15 22:34               ` Taylor Blau
2021-02-15 23:11                 ` Ævar Arnfjörð Bjarmason
2021-02-15 23:49                   ` Taylor Blau
2021-02-16 11:50           ` [PATCH v2 0/4] midx: split out sub-commands Derrick Stolee
2021-02-16 14:28             ` Taylor Blau
2021-02-10 23:02 ` [PATCH 3/9] midx: don't free midx_name early Taylor Blau
2021-02-10 23:02 ` [PATCH 4/9] midx: keep track of the checksum Taylor Blau
2021-02-11  2:33   ` Derrick Stolee
2021-02-11  2:35     ` Taylor Blau
2021-02-10 23:03 ` [PATCH 5/9] midx: make some functions non-static Taylor Blau
2021-02-10 23:03 ` [PATCH 6/9] Documentation/technical: describe multi-pack reverse indexes Taylor Blau
2021-02-11  2:48   ` Derrick Stolee
2021-02-11  3:03     ` Taylor Blau
2021-02-10 23:03 ` [PATCH 7/9] pack-revindex: read " Taylor Blau
2021-02-11  2:53   ` Derrick Stolee
2021-02-11  3:04     ` Taylor Blau
2021-02-11  7:54   ` Junio C Hamano
2021-02-11 14:54     ` Taylor Blau
2021-02-10 23:03 ` [PATCH 8/9] pack-write.c: extract 'write_rev_file_order' Taylor Blau
2021-02-10 23:03 ` [PATCH 9/9] pack-revindex: write multi-pack reverse indexes Taylor Blau
2021-02-11  2:58 ` [PATCH 0/9] midx: implement a multi-pack reverse index Derrick Stolee
2021-02-11  3:06   ` Taylor Blau
2021-02-11  8:13 ` Junio C Hamano
2021-02-11 18:37   ` Derrick Stolee
2021-02-11 18:55     ` Junio C Hamano
2021-02-24 19:09 ` [PATCH v2 00/15] " Taylor Blau
2021-02-24 19:09   ` [PATCH v2 01/15] builtin/multi-pack-index.c: inline 'flags' with options Taylor Blau
2021-02-24 19:09   ` [PATCH v2 02/15] builtin/multi-pack-index.c: don't handle 'progress' separately Taylor Blau
2021-02-24 19:09   ` [PATCH v2 03/15] builtin/multi-pack-index.c: define common usage with a macro Taylor Blau
2021-02-24 19:09   ` [PATCH v2 04/15] builtin/multi-pack-index.c: split sub-commands Taylor Blau
2021-03-02  4:06     ` Jonathan Tan
2021-03-02 19:02       ` Taylor Blau
2021-03-04  1:54         ` Jonathan Tan
2021-03-04  3:02           ` Taylor Blau
2021-02-24 19:09   ` [PATCH v2 05/15] builtin/multi-pack-index.c: don't enter bogus cmd_mode Taylor Blau
2021-02-24 19:09   ` [PATCH v2 06/15] builtin/multi-pack-index.c: display usage on unrecognized command Taylor Blau
2021-02-24 19:09   ` [PATCH v2 07/15] t/helper/test-read-midx.c: add '--show-objects' Taylor Blau
2021-02-24 19:09   ` [PATCH v2 08/15] midx: allow marking a pack as preferred Taylor Blau
2021-03-02  4:17     ` Jonathan Tan
2021-03-02 19:09       ` Taylor Blau
2021-03-04  2:00         ` Jonathan Tan
2021-03-04  3:04           ` Taylor Blau
2021-02-24 19:09   ` [PATCH v2 09/15] midx: don't free midx_name early Taylor Blau
2021-02-24 19:10   ` [PATCH v2 10/15] midx: keep track of the checksum Taylor Blau
2021-02-24 19:10   ` [PATCH v2 11/15] midx: make some functions non-static Taylor Blau
2021-02-24 19:10   ` [PATCH v2 12/15] Documentation/technical: describe multi-pack reverse indexes Taylor Blau
2021-03-02  4:21     ` Jonathan Tan
2021-03-02  4:36       ` Taylor Blau
2021-03-02 19:15       ` Taylor Blau
2021-03-04  2:03         ` Jonathan Tan
2021-02-24 19:10   ` [PATCH v2 13/15] pack-revindex: read " Taylor Blau
2021-03-02 18:36     ` Jonathan Tan
2021-03-03 15:27       ` Taylor Blau
2021-02-24 19:10   ` [PATCH v2 14/15] pack-write.c: extract 'write_rev_file_order' Taylor Blau
2021-02-24 19:10   ` [PATCH v2 15/15] pack-revindex: write multi-pack reverse indexes Taylor Blau
2021-03-02 18:40     ` Jonathan Tan
2021-03-03 15:30       ` Taylor Blau
2021-03-04  2:04         ` Jonathan Tan
2021-03-04  3:06           ` Taylor Blau
2021-03-11 17:04 ` [PATCH v3 00/16] midx: implement a multi-pack reverse index Taylor Blau
2021-03-11 17:04   ` [PATCH v3 01/16] builtin/multi-pack-index.c: inline 'flags' with options Taylor Blau
2021-03-29 11:20     ` Jeff King
2021-03-11 17:04   ` [PATCH v3 02/16] builtin/multi-pack-index.c: don't handle 'progress' separately Taylor Blau
2021-03-29 11:22     ` Jeff King
2021-03-11 17:04   ` [PATCH v3 03/16] builtin/multi-pack-index.c: define common usage with a macro Taylor Blau
2021-03-11 17:04   ` [PATCH v3 04/16] builtin/multi-pack-index.c: split sub-commands Taylor Blau
2021-03-29 11:36     ` Jeff King
2021-03-29 20:38       ` Taylor Blau
2021-03-30  7:04         ` Jeff King
2021-03-11 17:04   ` [PATCH v3 05/16] builtin/multi-pack-index.c: don't enter bogus cmd_mode Taylor Blau
2021-03-11 17:04   ` [PATCH v3 06/16] builtin/multi-pack-index.c: display usage on unrecognized command Taylor Blau
2021-03-29 11:42     ` Jeff King
2021-03-29 20:41       ` Taylor Blau
2021-03-11 17:05   ` [PATCH v3 07/16] t/helper/test-read-midx.c: add '--show-objects' Taylor Blau
2021-03-11 17:05   ` [PATCH v3 08/16] midx: allow marking a pack as preferred Taylor Blau
2021-03-29 12:00     ` Jeff King
2021-03-29 21:15       ` Taylor Blau
2021-03-30  7:11         ` Jeff King
2021-03-11 17:05   ` [PATCH v3 09/16] midx: don't free midx_name early Taylor Blau
2021-03-11 17:05   ` [PATCH v3 10/16] midx: keep track of the checksum Taylor Blau
2021-03-11 17:05   ` [PATCH v3 11/16] midx: make some functions non-static Taylor Blau
2021-03-11 17:05   ` [PATCH v3 12/16] Documentation/technical: describe multi-pack reverse indexes Taylor Blau
2021-03-29 12:12     ` Jeff King [this message]
2021-03-29 21:22       ` Taylor Blau
2021-03-11 17:05   ` [PATCH v3 13/16] pack-revindex: read " Taylor Blau
2021-03-29 12:43     ` Jeff King
2021-03-29 21:27       ` Taylor Blau
2021-03-11 17:05   ` [PATCH v3 14/16] pack-write.c: extract 'write_rev_file_order' Taylor Blau
2021-03-11 17:05   ` [PATCH v3 15/16] pack-revindex: write multi-pack reverse indexes Taylor Blau
2021-03-29 12:53     ` Jeff King
2021-03-29 21:30       ` Taylor Blau
2021-03-11 17:05   ` [PATCH v3 16/16] midx.c: improve cache locality in midx_pack_order_cmp() Taylor Blau
2021-03-29 12:59     ` Jeff King
2021-03-29 21:34       ` Taylor Blau
2021-03-30  7:15         ` Jeff King
2021-03-12 15:16   ` [PATCH v3 00/16] midx: implement a multi-pack reverse index Derrick Stolee
2021-03-29 13:05   ` Jeff King
2021-03-29 21:30     ` Junio C Hamano
2021-03-29 21:37     ` Taylor Blau
2021-03-30  7:15       ` Jeff King
2021-03-30 13:37         ` Taylor Blau
2021-03-30 15:03 ` [PATCH v4 " Taylor Blau
2021-03-30 15:03   ` [PATCH v4 01/16] builtin/multi-pack-index.c: inline 'flags' with options Taylor Blau
2021-03-30 15:03   ` [PATCH v4 02/16] builtin/multi-pack-index.c: don't handle 'progress' separately Taylor Blau
2021-03-30 15:03   ` [PATCH v4 03/16] builtin/multi-pack-index.c: define common usage with a macro Taylor Blau
2021-03-30 15:03   ` [PATCH v4 04/16] builtin/multi-pack-index.c: split sub-commands Taylor Blau
2021-03-30 15:04   ` [PATCH v4 05/16] builtin/multi-pack-index.c: don't enter bogus cmd_mode Taylor Blau
2021-03-30 15:04   ` [PATCH v4 06/16] builtin/multi-pack-index.c: display usage on unrecognized command Taylor Blau
2021-03-30 15:04   ` [PATCH v4 07/16] t/helper/test-read-midx.c: add '--show-objects' Taylor Blau
2021-03-30 15:04   ` [PATCH v4 08/16] midx: allow marking a pack as preferred Taylor Blau
2021-04-01  0:32     ` Taylor Blau
2021-03-30 15:04   ` [PATCH v4 09/16] midx: don't free midx_name early Taylor Blau
2021-03-30 15:04   ` [PATCH v4 10/16] midx: keep track of the checksum Taylor Blau
2021-03-30 15:04   ` [PATCH v4 11/16] midx: make some functions non-static Taylor Blau
2021-03-30 15:04   ` [PATCH v4 12/16] Documentation/technical: describe multi-pack reverse indexes Taylor Blau
2021-03-30 15:04   ` [PATCH v4 13/16] pack-revindex: read " Taylor Blau
2021-03-30 15:04   ` [PATCH v4 14/16] pack-write.c: extract 'write_rev_file_order' Taylor Blau
2021-09-08  1:08     ` [PATCH] pack-write: skip *.rev work when not writing *.rev Ævar Arnfjörð Bjarmason
2021-09-08  1:35       ` Carlo Arenas
2021-09-08  2:42         ` Taylor Blau
2021-09-08 15:47           ` Junio C Hamano
2021-09-08  2:50       ` Taylor Blau
2021-09-08  3:50         ` Taylor Blau
2021-09-08 10:18           ` Ævar Arnfjörð Bjarmason
2021-09-08 16:32             ` Taylor Blau
2021-03-30 15:04   ` [PATCH v4 15/16] pack-revindex: write multi-pack reverse indexes Taylor Blau
2021-03-30 15:04   ` [PATCH v4 16/16] midx.c: improve cache locality in midx_pack_order_cmp() Taylor Blau
2021-03-30 15:45   ` [PATCH v4 00/16] midx: implement a multi-pack reverse index Jeff King
2021-03-30 15:49     ` Taylor Blau
2021-03-30 16:01       ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YGHEN6rcao2tJ006@coredump.intra.peff.net \
    --to=peff@peff.net \
    --cc=avarab@gmail.com \
    --cc=dstolee@microsoft.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jonathantanmy@google.com \
    --cc=me@ttaylorr.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.