All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jeff King <peff@peff.net>
To: Taylor Blau <me@ttaylorr.com>
Cc: Junio C Hamano <gitster@pobox.com>,
	git@vger.kernel.org, dstolee@microsoft.com, jrnieder@gmail.com
Subject: Re: [PATCH v2 16/20] builtin/gc.c: guess the size of the revindex
Date: Thu, 14 Jan 2021 14:43:24 -0500	[thread overview]
Message-ID: <YACe3KBgdRmIZA3b@coredump.intra.peff.net> (raw)
In-Reply-To: <YAB3E6lgOQdNgGOr@nand.local>

On Thu, Jan 14, 2021 at 11:53:39AM -0500, Taylor Blau wrote:

> On Wed, Jan 13, 2021 at 10:33:01PM -0800, Junio C Hamano wrote:
> > Taylor Blau <me@ttaylorr.com> writes:
> >
> > > 'estimate_repack_memory()' takes into account the amount of memory
> > > required to load the reverse index in memory by multiplying the assumed
> > > number of objects by the size of the 'revindex_entry' struct.
> > >
> > > Prepare for hiding the definition of 'struct revindex_entry' by removing
> > > a 'sizeof()' of that type from outside of pack-revindex.c. Instead,
> > > guess that one off_t and one uint32_t are required per object. Strictly
> > > speaking, this is a worse guess than asking for 'sizeof(struct
> > > revindex_entry)' directly, since the true size of this struct is 16
> > > bytes with padding on the end of the struct in order to align the offset
> > > field.
> >
> > Meaning that we under-estimate by 25%?
> 
> In this area, yes. I'm skeptical that this estimate is all that
> important, since it doesn't seem to take into account the memory
> required to select delta/base candidates [1].

It has many other inaccuracies:

  - it assumes half of all objects are blobs, which is not really
    accurate (linux.git is more like 60% trees, 12% commits, 28% blobs).
    This underestimates because blobs are the smallest struct.

  - since we moved a bunch of stuff out of "struct object_entry" into
    lazily-initialized auxiliary structures, we are under-counting the
    per-object cost when we have to spill into this structures

So I'm rather skeptical that this number is close to accurate. But
since there's a bunch of leeway (we are looking to use half of the
system memory) I suspect it doesn't matter all that much. But I
definitely don't think it's worth trying to micro-optimize its accuracy.

-Peff

  reply	other threads:[~2021-01-14 19:44 UTC|newest]

Thread overview: 121+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-08 18:16 [PATCH 00/20] pack-revindex: prepare for on-disk reverse index Taylor Blau
2021-01-08 18:16 ` [PATCH 01/20] pack-revindex: introduce a new API Taylor Blau
2021-01-12  8:41   ` Jeff King
2021-01-12  9:41     ` Jeff King
2021-01-12 16:27     ` Taylor Blau
2021-01-13  8:06   ` Junio C Hamano
2021-01-13  8:54     ` Junio C Hamano
2021-01-13 13:17     ` Jeff King
2021-01-13 16:23     ` Taylor Blau
2021-01-08 18:16 ` [PATCH 02/20] write_reuse_object(): convert to new revindex API Taylor Blau
2021-01-12  8:47   ` Jeff King
2021-01-12 16:31     ` Taylor Blau
2021-01-13 13:02       ` Jeff King
2021-01-08 18:16 ` [PATCH 03/20] write_reused_pack_one(): " Taylor Blau
2021-01-12  8:50   ` Jeff King
2021-01-12 16:34     ` Taylor Blau
2021-01-08 18:16 ` [PATCH 04/20] write_reused_pack_verbatim(): " Taylor Blau
2021-01-12  8:50   ` Jeff King
2021-01-08 18:17 ` [PATCH 05/20] check_object(): " Taylor Blau
2021-01-11 11:43   ` Derrick Stolee
2021-01-11 16:15     ` Taylor Blau
2021-01-12  8:54       ` Jeff King
2021-01-12  8:51   ` Jeff King
2021-01-08 18:17 ` [PATCH 06/20] bitmap_position_packfile(): " Taylor Blau
2021-01-08 18:17 ` [PATCH 07/20] show_objects_for_type(): " Taylor Blau
2021-01-12  8:57   ` Jeff King
2021-01-12 16:35     ` Taylor Blau
2021-01-08 18:17 ` [PATCH 08/20] get_size_by_pos(): " Taylor Blau
2021-01-08 18:17 ` [PATCH 09/20] try_partial_reuse(): " Taylor Blau
2021-01-12  9:06   ` Jeff King
2021-01-12 16:47     ` Taylor Blau
2021-01-08 18:17 ` [PATCH 10/20] rebuild_existing_bitmaps(): " Taylor Blau
2021-01-08 18:17 ` [PATCH 11/20] get_delta_base_oid(): " Taylor Blau
2021-01-08 18:17 ` [PATCH 12/20] retry_bad_packed_offset(): " Taylor Blau
2021-01-08 18:17 ` [PATCH 13/20] packed_object_info(): " Taylor Blau
2021-01-12  9:11   ` Jeff King
2021-01-12 16:51     ` Taylor Blau
2021-01-08 18:17 ` [PATCH 14/20] unpack_entry(): " Taylor Blau
2021-01-12  9:22   ` Jeff King
2021-01-12 16:56     ` Taylor Blau
2021-01-08 18:17 ` [PATCH 15/20] for_each_object_in_pack(): " Taylor Blau
2021-01-08 18:17 ` [PATCH 16/20] builtin/gc.c: guess the size of the revindex Taylor Blau
2021-01-11 11:52   ` Derrick Stolee
2021-01-11 16:23     ` Taylor Blau
2021-01-11 17:09       ` Derrick Stolee
2021-01-12  9:28         ` Jeff King
2021-01-08 18:17 ` [PATCH 17/20] pack-revindex: remove unused 'find_pack_revindex()' Taylor Blau
2021-01-08 18:17 ` [PATCH 18/20] pack-revindex: remove unused 'find_revindex_position()' Taylor Blau
2021-01-11 11:57   ` Derrick Stolee
2021-01-11 16:27     ` Taylor Blau
2021-01-11 17:11       ` Derrick Stolee
2021-01-12  9:32   ` Jeff King
2021-01-12 16:59     ` Taylor Blau
2021-01-13 13:05       ` Jeff King
2021-01-08 18:18 ` [PATCH 19/20] pack-revindex: hide the definition of 'revindex_entry' Taylor Blau
2021-01-11 11:57   ` Derrick Stolee
2021-01-12  9:34   ` Jeff King
2021-01-08 18:18 ` [PATCH 20/20] pack-revindex.c: avoid direct revindex access in 'offset_to_pack_pos()' Taylor Blau
2021-01-12  9:37   ` Jeff King
2021-01-12 17:02     ` Taylor Blau
2021-01-11 12:07 ` [PATCH 00/20] pack-revindex: prepare for on-disk reverse index Derrick Stolee
2021-01-11 16:30   ` Taylor Blau
2021-01-11 17:15     ` Derrick Stolee
2021-01-11 17:29       ` Taylor Blau
2021-01-11 18:40       ` Junio C Hamano
2021-01-12  9:45 ` Jeff King
2021-01-12 17:07   ` Taylor Blau
2021-01-13  0:23 ` Junio C Hamano
2021-01-13  0:52   ` Taylor Blau
2021-01-13  2:15     ` Junio C Hamano
2021-01-13  3:23       ` Taylor Blau
2021-01-13  8:21         ` Junio C Hamano
2021-01-13 13:13           ` Jeff King
2021-01-13 15:34             ` Taylor Blau
2021-01-13 20:06               ` Junio C Hamano
2021-01-13 20:13                 ` Taylor Blau
2021-01-13 20:22                 ` Jeff King
2021-01-13 22:23 ` [PATCH v2 " Taylor Blau
2021-01-13 22:23   ` [PATCH v2 01/20] pack-revindex: introduce a new API Taylor Blau
2021-01-14  6:46     ` Junio C Hamano
2021-01-14 12:00       ` Derrick Stolee
2021-01-14 17:06       ` Taylor Blau
2021-01-14 19:19         ` Jeff King
2021-01-14 20:47           ` Junio C Hamano
2021-01-13 22:23   ` [PATCH v2 02/20] write_reuse_object(): convert to new revindex API Taylor Blau
2021-01-13 22:23   ` [PATCH v2 03/20] write_reused_pack_one(): " Taylor Blau
2021-01-13 22:23   ` [PATCH v2 04/20] write_reused_pack_verbatim(): " Taylor Blau
2021-01-13 22:23   ` [PATCH v2 05/20] check_object(): " Taylor Blau
2021-01-13 22:23   ` [PATCH v2 06/20] bitmap_position_packfile(): " Taylor Blau
2021-01-13 22:23   ` [PATCH v2 07/20] show_objects_for_type(): " Taylor Blau
2021-01-13 22:24   ` [PATCH v2 08/20] get_size_by_pos(): " Taylor Blau
2021-01-13 22:24   ` [PATCH v2 09/20] try_partial_reuse(): " Taylor Blau
2021-01-13 22:24   ` [PATCH v2 10/20] rebuild_existing_bitmaps(): " Taylor Blau
2021-01-13 22:24   ` [PATCH v2 11/20] get_delta_base_oid(): " Taylor Blau
2021-01-13 22:24   ` [PATCH v2 12/20] retry_bad_packed_offset(): " Taylor Blau
2021-01-13 22:24   ` [PATCH v2 13/20] packed_object_info(): " Taylor Blau
2021-01-13 22:24   ` [PATCH v2 14/20] unpack_entry(): " Taylor Blau
2021-01-13 22:24   ` [PATCH v2 15/20] for_each_object_in_pack(): " Taylor Blau
2021-01-14  6:43     ` Junio C Hamano
2021-01-14 17:00       ` Taylor Blau
2021-01-14 19:33       ` Jeff King
2021-01-14 20:11         ` Jeff King
2021-01-14 20:15           ` Taylor Blau
2021-01-15  2:22             ` Junio C Hamano
2021-01-15  2:29               ` Taylor Blau
2021-01-14 20:51         ` Junio C Hamano
2021-01-13 22:24   ` [PATCH v2 16/20] builtin/gc.c: guess the size of the revindex Taylor Blau
2021-01-14  6:33     ` Junio C Hamano
2021-01-14 16:53       ` Taylor Blau
2021-01-14 19:43         ` Jeff King [this message]
2021-01-13 22:24   ` [PATCH v2 17/20] pack-revindex: remove unused 'find_pack_revindex()' Taylor Blau
2021-01-13 22:25   ` [PATCH v2 18/20] pack-revindex: remove unused 'find_revindex_position()' Taylor Blau
2021-01-14  6:42     ` Junio C Hamano
2021-01-13 22:25   ` [PATCH v2 19/20] pack-revindex: hide the definition of 'revindex_entry' Taylor Blau
2021-01-13 22:25   ` [PATCH v2 20/20] pack-revindex.c: avoid direct revindex access in 'offset_to_pack_pos()' Taylor Blau
2021-01-14  6:42     ` Junio C Hamano
2021-01-14 16:56       ` Taylor Blau
2021-01-14 19:51   ` [PATCH v2 00/20] pack-revindex: prepare for on-disk reverse index Jeff King
2021-01-14 20:53     ` Junio C Hamano
2021-01-15  9:32       ` Jeff King
2021-01-15  9:33         ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YACe3KBgdRmIZA3b@coredump.intra.peff.net \
    --to=peff@peff.net \
    --cc=dstolee@microsoft.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jrnieder@gmail.com \
    --cc=me@ttaylorr.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.