Git Mailing List Archive on lore.kernel.org
 help / color / Atom feed
From: Jonathan Tan <jonathantanmy@google.com>
To: me@ttaylorr.com
Cc: git@vger.kernel.org, dstolee@microsoft.com, gitster@pobox.com,
	peff@peff.net, martin.agren@gmail.com, szeder.dev@gmail.com,
	Jonathan Tan <jonathantanmy@google.com>
Subject: Re: [PATCH v2 23/24] pack-bitmap-write: relax unique rewalk condition
Date: Tue,  1 Dec 2020 23:44:39 -0800
Message-ID: <20201202074439.3478090-1-jonathantanmy@google.com> (raw)
In-Reply-To: <1da4fa0fb85fe848aa86987e767b33d296f8f878.1605649533.git.me@ttaylorr.com>

> However, there was a (significant) drawback: wide histories with many
> refs had an explosion of memory costs to compute the commit bitmasks
> during the exploration that discovers these intermediate commits. Since
> these wide histories are unlikely to repeat walking objects, the benefit
> of walking objects multiple times was not expensive before. But now, the
> commit walk *before computing bitmaps* is incredibly expensive.

Do you have numbers of how large the commit bitmasks are?

> In an effort to discover a happy medium, this change reduces the walk
> for intermediate commits to only the first-parent history. This focuses
> the walk on how the histories converge, which still has significant
> reduction in repeat object walks. It is still possible to create
> quadratic behavior in this version, but it is probably less likely in
> realistic data shapes.

Would this work? I agree that the width of the commit bitmasks would go
down (and there would also be fewer commit bitmasks generated, further
increasing the memory savings). But intuitively, if there is a commit
that is selected and only accessible through non-1st-parent links, then
any bitmaps generated for it cannot be contributed to its descendants
(since there was no descendant-to-ancestor walk that could reach it in
order to form the reverse edge).

> Here is some data taken on a fresh clone of the kernel:
> 
>              |   runtime (sec)    |   peak heap (GB)   |
>              |                    |                    |
>              |   from  |   with   |   from  |   with   |
>              | scratch | existing | scratch | existing |
>   -----------+---------+----------+---------+-----------
>     original |  64.044 |   83.241 |   2.088 |    2.194 |
>   last patch |  44.811 |   27.828 |   2.289 |    2.358 |
>   this patch | 100.641 |   35.560 |   2.152 |    2.224 |

Hmm...the jump from 44 to 100 seems rather large.

  reply index

Thread overview: 174+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-11 19:41 [PATCH 00/23] pack-bitmap: bitmap generation improvements Taylor Blau
2020-11-11 19:41 ` [PATCH 01/23] ewah/ewah_bitmap.c: grow buffer past 1 Taylor Blau
2020-11-22 19:36   ` Junio C Hamano
2020-11-23 16:22     ` Taylor Blau
2020-11-24  2:48       ` Jeff King
2020-11-24  2:51         ` Jeff King
2020-12-01 22:56           ` Taylor Blau
2020-11-11 19:41 ` [PATCH 02/23] pack-bitmap: fix header size check Taylor Blau
2020-11-12 17:39   ` Martin Ågren
2020-11-11 19:42 ` [PATCH 03/23] pack-bitmap: bounds-check size of cache extension Taylor Blau
2020-11-12 17:47   ` Martin Ågren
2020-11-13  4:57     ` Jeff King
2020-11-13  5:26       ` Martin Ågren
2020-11-13 21:29       ` Taylor Blau
2020-11-13 21:39         ` Jeff King
2020-11-13 21:49           ` Taylor Blau
2020-11-13 22:11             ` Jeff King
2020-11-11 19:42 ` [PATCH 04/23] t5310: drop size of truncated ewah bitmap Taylor Blau
2020-11-11 19:42 ` [PATCH 05/23] rev-list: die when --test-bitmap detects a mismatch Taylor Blau
2020-11-11 19:42 ` [PATCH 06/23] ewah: factor out bitmap growth Taylor Blau
2020-11-11 19:42 ` [PATCH 07/23] ewah: make bitmap growth less aggressive Taylor Blau
2020-11-22 20:32   ` Junio C Hamano
2020-11-23 16:49     ` Taylor Blau
2020-11-24  3:00       ` Jeff King
2020-11-24 20:11         ` Junio C Hamano
2020-11-11 19:43 ` [PATCH 08/23] ewah: implement bitmap_or() Taylor Blau
2020-11-22 20:34   ` Junio C Hamano
2020-11-23 16:52     ` Taylor Blau
2020-11-11 19:43 ` [PATCH 09/23] ewah: add bitmap_dup() function Taylor Blau
2020-11-11 19:43 ` [PATCH 10/23] pack-bitmap-write: reimplement bitmap writing Taylor Blau
2020-11-11 19:43 ` [PATCH 11/23] pack-bitmap-write: pass ownership of intermediate bitmaps Taylor Blau
2020-11-11 19:43 ` [PATCH 12/23] pack-bitmap-write: fill bitmap with commit history Taylor Blau
2020-11-11 19:43 ` [PATCH 13/23] bitmap: add bitmap_diff_nonzero() Taylor Blau
2020-11-11 19:43 ` [PATCH 14/23] commit: implement commit_list_contains() Taylor Blau
2020-11-11 19:43 ` [PATCH 15/23] t5310: add branch-based checks Taylor Blau
2020-11-11 20:58   ` Derrick Stolee
2020-11-11 21:04     ` Junio C Hamano
2020-11-15 23:26       ` Johannes Schindelin
2020-11-11 19:43 ` [PATCH 16/23] pack-bitmap-write: rename children to reverse_edges Taylor Blau
2020-11-11 19:43 ` [PATCH 17/23] pack-bitmap-write: build fewer intermediate bitmaps Taylor Blau
2020-11-13 22:23   ` SZEDER Gábor
2020-11-13 23:03     ` Jeff King
2020-11-14  6:23       ` Jeff King
2020-11-11 19:43 ` [PATCH 18/23] pack-bitmap-write: ignore BITMAP_FLAG_REUSE Taylor Blau
2020-11-11 19:44 ` [PATCH 19/23] pack-bitmap: factor out 'bitmap_for_commit()' Taylor Blau
2020-11-11 19:44 ` [PATCH 20/23] pack-bitmap: factor out 'add_commit_to_bitmap()' Taylor Blau
2020-11-11 19:44 ` [PATCH 21/23] pack-bitmap-write: use existing bitmaps Taylor Blau
2020-11-11 19:44 ` [PATCH 22/23] pack-bitmap-write: relax unique rewalk condition Taylor Blau
2020-11-11 19:44 ` [PATCH 23/23] pack-bitmap-write: better reuse bitmaps Taylor Blau
2020-11-17 21:46 ` [PATCH v2 00/24] pack-bitmap: bitmap generation improvements Taylor Blau
2020-11-17 21:46   ` [PATCH v2 01/24] ewah/ewah_bitmap.c: grow buffer past 1 Taylor Blau
2020-11-17 21:46   ` [PATCH v2 02/24] pack-bitmap: fix header size check Taylor Blau
2020-11-17 21:46   ` [PATCH v2 03/24] pack-bitmap: bounds-check size of cache extension Taylor Blau
2020-11-17 21:46   ` [PATCH v2 04/24] t5310: drop size of truncated ewah bitmap Taylor Blau
2020-11-17 21:46   ` [PATCH v2 05/24] rev-list: die when --test-bitmap detects a mismatch Taylor Blau
2020-11-17 21:46   ` [PATCH v2 06/24] ewah: factor out bitmap growth Taylor Blau
2020-11-17 21:47   ` [PATCH v2 07/24] ewah: make bitmap growth less aggressive Taylor Blau
2020-11-17 21:47   ` [PATCH v2 08/24] ewah: implement bitmap_or() Taylor Blau
2020-11-17 21:47   ` [PATCH v2 09/24] ewah: add bitmap_dup() function Taylor Blau
2020-11-17 21:47   ` [PATCH v2 10/24] pack-bitmap-write: reimplement bitmap writing Taylor Blau
2020-11-25  0:53     ` Jonathan Tan
2020-11-28 17:27       ` Taylor Blau
2020-11-17 21:47   ` [PATCH v2 11/24] pack-bitmap-write: pass ownership of intermediate bitmaps Taylor Blau
2020-11-25  1:00     ` Jonathan Tan
2020-11-17 21:47   ` [PATCH v2 12/24] pack-bitmap-write: fill bitmap with commit history Taylor Blau
2020-11-22 21:50     ` Junio C Hamano
2020-11-23 14:54       ` Derrick Stolee
2020-11-25  1:14     ` Jonathan Tan
2020-11-28 17:21       ` Taylor Blau
2020-11-30 18:33         ` Jonathan Tan
2020-11-17 21:47   ` [PATCH v2 13/24] bitmap: add bitmap_diff_nonzero() Taylor Blau
2020-11-22 22:01     ` Junio C Hamano
2020-11-23 20:19       ` Taylor Blau
2020-11-17 21:47   ` [PATCH v2 14/24] commit: implement commit_list_contains() Taylor Blau
2020-11-17 21:47   ` [PATCH v2 15/24] t5310: add branch-based checks Taylor Blau
2020-11-25  1:17     ` Jonathan Tan
2020-11-28 17:30       ` Taylor Blau
2020-11-17 21:47   ` [PATCH v2 16/24] pack-bitmap-write: rename children to reverse_edges Taylor Blau
2020-11-17 21:47   ` [PATCH v2 17/24] pack-bitmap.c: check reads more aggressively when loading Taylor Blau
2020-11-17 21:48   ` [PATCH v2 18/24] pack-bitmap-write: build fewer intermediate bitmaps Taylor Blau
2020-11-24  6:07     ` Jonathan Tan
2020-11-25  1:46     ` Jonathan Tan
2020-11-30 18:41       ` Derrick Stolee
2020-11-17 21:48   ` [PATCH v2 19/24] pack-bitmap-write: ignore BITMAP_FLAG_REUSE Taylor Blau
2020-12-02  7:13     ` Jonathan Tan
2020-11-17 21:48   ` [PATCH v2 20/24] pack-bitmap: factor out 'bitmap_for_commit()' Taylor Blau
2020-12-02  7:17     ` Jonathan Tan
2020-11-17 21:48   ` [PATCH v2 21/24] pack-bitmap: factor out 'add_commit_to_bitmap()' Taylor Blau
2020-12-02  7:20     ` Jonathan Tan
2020-11-17 21:48   ` [PATCH v2 22/24] pack-bitmap-write: use existing bitmaps Taylor Blau
2020-12-02  7:28     ` Jonathan Tan
2020-12-02 16:21       ` Taylor Blau
2020-11-17 21:48   ` [PATCH v2 23/24] pack-bitmap-write: relax unique rewalk condition Taylor Blau
2020-12-02  7:44     ` Jonathan Tan [this message]
2020-12-02 16:30       ` Taylor Blau
2020-12-07 18:19         ` Jonathan Tan
2020-12-07 18:43           ` Derrick Stolee
2020-12-07 18:45             ` Derrick Stolee
2020-12-07 18:48           ` Jeff King
2020-11-17 21:48   ` [PATCH v2 24/24] pack-bitmap-write: better reuse bitmaps Taylor Blau
2020-12-02  8:08     ` Jonathan Tan
2020-12-02 16:35       ` Taylor Blau
2020-12-02 18:22         ` Derrick Stolee
2020-12-02 18:25           ` Taylor Blau
2020-12-07 18:26             ` Jonathan Tan
2020-12-07 18:24           ` Jonathan Tan
2020-12-07 19:20             ` Derrick Stolee
2020-11-18 18:32   ` [PATCH v2 00/24] pack-bitmap: bitmap generation improvements SZEDER Gábor
2020-11-18 19:51     ` Taylor Blau
2020-11-22  2:17       ` Taylor Blau
2020-11-22  2:28         ` Taylor Blau
2020-11-20  6:34   ` Martin Ågren
2020-11-21 19:37     ` Junio C Hamano
2020-11-21 20:11       ` Martin Ågren
2020-11-22  2:31         ` Taylor Blau
2020-11-24  2:43           ` Jeff King
2020-12-01 23:04             ` Taylor Blau
2020-12-01 23:37               ` Jonathan Tan
2020-12-01 23:43                 ` Taylor Blau
2020-12-02  8:11                   ` Jonathan Tan
2020-12-08  0:04 ` [PATCH v3 " Taylor Blau
2020-12-08  0:04   ` [PATCH v3 01/24] ewah/ewah_bitmap.c: avoid open-coding ALLOC_GROW() Taylor Blau
2020-12-08  0:04   ` [PATCH v3 02/24] pack-bitmap: fix header size check Taylor Blau
2020-12-08  0:04   ` [PATCH v3 03/24] pack-bitmap: bounds-check size of cache extension Taylor Blau
2020-12-08  0:04   ` [PATCH v3 04/24] t5310: drop size of truncated ewah bitmap Taylor Blau
2020-12-08  0:04   ` [PATCH v3 05/24] rev-list: die when --test-bitmap detects a mismatch Taylor Blau
2020-12-08  0:04   ` [PATCH v3 06/24] ewah: factor out bitmap growth Taylor Blau
2020-12-08  0:04   ` [PATCH v3 07/24] ewah: make bitmap growth less aggressive Taylor Blau
2020-12-08  0:04   ` [PATCH v3 08/24] ewah: implement bitmap_or() Taylor Blau
2020-12-08  0:04   ` [PATCH v3 09/24] ewah: add bitmap_dup() function Taylor Blau
2020-12-08  0:04   ` [PATCH v3 10/24] pack-bitmap-write: reimplement bitmap writing Taylor Blau
2020-12-08  0:05   ` [PATCH v3 11/24] pack-bitmap-write: pass ownership of intermediate bitmaps Taylor Blau
2020-12-08  0:05   ` [PATCH v3 12/24] pack-bitmap-write: fill bitmap with commit history Taylor Blau
2020-12-08  0:05   ` [PATCH v3 13/24] bitmap: implement bitmap_is_subset() Taylor Blau
2020-12-08  0:05   ` [PATCH v3 14/24] commit: implement commit_list_contains() Taylor Blau
2020-12-08  0:05   ` [PATCH v3 15/24] t5310: add branch-based checks Taylor Blau
2020-12-08  0:05   ` [PATCH v3 16/24] pack-bitmap-write: rename children to reverse_edges Taylor Blau
2020-12-08  0:05   ` [PATCH v3 17/24] pack-bitmap.c: check reads more aggressively when loading Taylor Blau
2020-12-08  0:05   ` [PATCH v3 18/24] pack-bitmap-write: build fewer intermediate bitmaps Taylor Blau
2020-12-08  0:05   ` [PATCH v3 19/24] pack-bitmap-write: ignore BITMAP_FLAG_REUSE Taylor Blau
2020-12-08  0:05   ` [PATCH v3 20/24] pack-bitmap: factor out 'bitmap_for_commit()' Taylor Blau
2020-12-08  0:05   ` [PATCH v3 21/24] pack-bitmap: factor out 'add_commit_to_bitmap()' Taylor Blau
2020-12-08  0:05   ` [PATCH v3 22/24] pack-bitmap-write: use existing bitmaps Taylor Blau
2020-12-08  0:05   ` [PATCH v3 23/24] pack-bitmap-write: relax unique rewalk condition Taylor Blau
2020-12-08  0:05   ` [PATCH v3 24/24] pack-bitmap-write: better reuse bitmaps Taylor Blau
2020-12-08 20:56   ` [PATCH v3 00/24] pack-bitmap: bitmap generation improvements Junio C Hamano
2020-12-08 21:03     ` Taylor Blau
2020-12-08 22:03       ` Junio C Hamano
2020-12-08 22:03 ` [PATCH v4 " Taylor Blau
2020-12-08 22:03   ` [PATCH v4 01/24] ewah/ewah_bitmap.c: avoid open-coding ALLOC_GROW() Taylor Blau
2020-12-08 22:03   ` [PATCH v4 02/24] pack-bitmap: fix header size check Taylor Blau
2020-12-08 22:03   ` [PATCH v4 03/24] pack-bitmap: bounds-check size of cache extension Taylor Blau
2020-12-08 22:03   ` [PATCH v4 04/24] t5310: drop size of truncated ewah bitmap Taylor Blau
2020-12-08 22:03   ` [PATCH v4 05/24] rev-list: die when --test-bitmap detects a mismatch Taylor Blau
2020-12-08 22:03   ` [PATCH v4 06/24] ewah: factor out bitmap growth Taylor Blau
2020-12-08 22:03   ` [PATCH v4 07/24] ewah: make bitmap growth less aggressive Taylor Blau
2020-12-08 22:03   ` [PATCH v4 08/24] ewah: implement bitmap_or() Taylor Blau
2020-12-08 22:03   ` [PATCH v4 09/24] ewah: add bitmap_dup() function Taylor Blau
2020-12-08 22:03   ` [PATCH v4 10/24] pack-bitmap-write: reimplement bitmap writing Taylor Blau
2020-12-08 22:03   ` [PATCH v4 11/24] pack-bitmap-write: pass ownership of intermediate bitmaps Taylor Blau
2020-12-08 22:04   ` [PATCH v4 12/24] pack-bitmap-write: fill bitmap with commit history Taylor Blau
2020-12-08 22:04   ` [PATCH v4 13/24] bitmap: implement bitmap_is_subset() Taylor Blau
2020-12-08 22:04   ` [PATCH v4 14/24] commit: implement commit_list_contains() Taylor Blau
2020-12-08 22:04   ` [PATCH v4 15/24] t5310: add branch-based checks Taylor Blau
2020-12-08 22:04   ` [PATCH v4 16/24] pack-bitmap-write: rename children to reverse_edges Taylor Blau
2020-12-08 22:04   ` [PATCH v4 17/24] pack-bitmap.c: check reads more aggressively when loading Taylor Blau
2020-12-08 22:04   ` [PATCH v4 18/24] pack-bitmap-write: build fewer intermediate bitmaps Taylor Blau
2020-12-08 22:04   ` [PATCH v4 19/24] pack-bitmap-write: ignore BITMAP_FLAG_REUSE Taylor Blau
2020-12-08 22:04   ` [PATCH v4 20/24] pack-bitmap: factor out 'bitmap_for_commit()' Taylor Blau
2020-12-08 22:05     ` Taylor Blau
2020-12-08 22:05   ` [PATCH v4 21/24] pack-bitmap: factor out 'add_commit_to_bitmap()' Taylor Blau
2020-12-08 22:05   ` [PATCH v4 22/24] pack-bitmap-write: use existing bitmaps Taylor Blau
2020-12-08 22:05   ` [PATCH v4 23/24] pack-bitmap-write: relax unique revwalk condition Taylor Blau
2020-12-08 22:05   ` [PATCH v4 24/24] pack-bitmap-write: better reuse bitmaps Taylor Blau

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20201202074439.3478090-1-jonathantanmy@google.com \
    --to=jonathantanmy@google.com \
    --cc=dstolee@microsoft.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=martin.agren@gmail.com \
    --cc=me@ttaylorr.com \
    --cc=peff@peff.net \
    --cc=szeder.dev@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Git Mailing List Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/git/0 git/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 git git/ https://lore.kernel.org/git \
		git@vger.kernel.org
	public-inbox-index git

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.git


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git