git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Derrick Stolee <stolee@gmail.com>
To: Taylor Blau <me@ttaylorr.com>
Cc: git@vger.kernel.org, peff@peff.net, jrnieder@gmail.com
Subject: Re: [PATCH 16/20] builtin/gc.c: guess the size of the revindex
Date: Mon, 11 Jan 2021 12:09:27 -0500	[thread overview]
Message-ID: <13c28eca-81d5-10c9-c92c-162547416014@gmail.com> (raw)
In-Reply-To: <X/x7mrcwfxGO8xH7@nand.local>

On 1/11/2021 11:23 AM, Taylor Blau wrote:
> On Mon, Jan 11, 2021 at 06:52:24AM -0500, Derrick Stolee wrote:
>> This is so far the only not-completely-obvious change.
>>
>>> But, this is an approximation anyway, and it does remove a use of the
>>> 'struct revindex_entry' from outside of pack-revindex internals.
>>
>> And this might be enough justification for it, but...
>>
>>> -	heap += sizeof(struct revindex_entry) * nr_objects;
>>> +	heap += (sizeof(off_t) + sizeof(uint32_t)) * nr_objects;
>>
>> ...outside of the estimation change, will this need another change
>> when the rev-index is mmap'd? Should this instead be an API call,
>> such as estimate_rev_index_memory(nr_objects)? That would
>> centralize the estimate to be next to the code that currently
>> interacts with 'struct revindex_entry' and will later interact with
>> the mmap region.
> 
> I definitely did consider this, and it seems that I made a mistake in
> not documenting my consideration (since I assumed that it was so benign
> nobody would notice / care ;-)).
> 
> The reason I didn't pursue it here was that we haven't yet loaded the
> reverse index by this point. So, you'd want a function that at least
> stats the '*.rev' file (and either does or doesn't parse it [1]), or
> aborts early to indicate otherwise.

In this patch, I would expect it to use sizeof(struct revindex_entry).
Later, the method would know if a .rev file exists and do the right
thing instead. (Also, should mmap'd data count towards this estimate?)

> One would hope that 'load_pack_revindex()' would do just that, but it
> falls back to load a reverse index in memory, which involves exactly the
> slow sort that we're trying to avoid. (Of course, we're going to have to
> do it later anyway, but allocating many GB of heap just to provide an
> estimation seems ill-advised to me ;-).)
> 
> So, we'd have to expand the API in some way or another, and to me it
> didn't seem worth it. As I mentioned in the commit message, I'm
> skeptical of the value of being accurate here, since this is (after all)
> an estimation.

Yes, I'm probably just poking somewhere it was easy to poke. This is
probably not worth the time I'm spending asking about it.

Feel free to disregard.

-Stolee

  reply	other threads:[~2021-01-11 17:10 UTC|newest]

Thread overview: 121+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-08 18:16 [PATCH 00/20] pack-revindex: prepare for on-disk reverse index Taylor Blau
2021-01-08 18:16 ` [PATCH 01/20] pack-revindex: introduce a new API Taylor Blau
2021-01-12  8:41   ` Jeff King
2021-01-12  9:41     ` Jeff King
2021-01-12 16:27     ` Taylor Blau
2021-01-13  8:06   ` Junio C Hamano
2021-01-13  8:54     ` Junio C Hamano
2021-01-13 13:17     ` Jeff King
2021-01-13 16:23     ` Taylor Blau
2021-01-08 18:16 ` [PATCH 02/20] write_reuse_object(): convert to new revindex API Taylor Blau
2021-01-12  8:47   ` Jeff King
2021-01-12 16:31     ` Taylor Blau
2021-01-13 13:02       ` Jeff King
2021-01-08 18:16 ` [PATCH 03/20] write_reused_pack_one(): " Taylor Blau
2021-01-12  8:50   ` Jeff King
2021-01-12 16:34     ` Taylor Blau
2021-01-08 18:16 ` [PATCH 04/20] write_reused_pack_verbatim(): " Taylor Blau
2021-01-12  8:50   ` Jeff King
2021-01-08 18:17 ` [PATCH 05/20] check_object(): " Taylor Blau
2021-01-11 11:43   ` Derrick Stolee
2021-01-11 16:15     ` Taylor Blau
2021-01-12  8:54       ` Jeff King
2021-01-12  8:51   ` Jeff King
2021-01-08 18:17 ` [PATCH 06/20] bitmap_position_packfile(): " Taylor Blau
2021-01-08 18:17 ` [PATCH 07/20] show_objects_for_type(): " Taylor Blau
2021-01-12  8:57   ` Jeff King
2021-01-12 16:35     ` Taylor Blau
2021-01-08 18:17 ` [PATCH 08/20] get_size_by_pos(): " Taylor Blau
2021-01-08 18:17 ` [PATCH 09/20] try_partial_reuse(): " Taylor Blau
2021-01-12  9:06   ` Jeff King
2021-01-12 16:47     ` Taylor Blau
2021-01-08 18:17 ` [PATCH 10/20] rebuild_existing_bitmaps(): " Taylor Blau
2021-01-08 18:17 ` [PATCH 11/20] get_delta_base_oid(): " Taylor Blau
2021-01-08 18:17 ` [PATCH 12/20] retry_bad_packed_offset(): " Taylor Blau
2021-01-08 18:17 ` [PATCH 13/20] packed_object_info(): " Taylor Blau
2021-01-12  9:11   ` Jeff King
2021-01-12 16:51     ` Taylor Blau
2021-01-08 18:17 ` [PATCH 14/20] unpack_entry(): " Taylor Blau
2021-01-12  9:22   ` Jeff King
2021-01-12 16:56     ` Taylor Blau
2021-01-08 18:17 ` [PATCH 15/20] for_each_object_in_pack(): " Taylor Blau
2021-01-08 18:17 ` [PATCH 16/20] builtin/gc.c: guess the size of the revindex Taylor Blau
2021-01-11 11:52   ` Derrick Stolee
2021-01-11 16:23     ` Taylor Blau
2021-01-11 17:09       ` Derrick Stolee [this message]
2021-01-12  9:28         ` Jeff King
2021-01-08 18:17 ` [PATCH 17/20] pack-revindex: remove unused 'find_pack_revindex()' Taylor Blau
2021-01-08 18:17 ` [PATCH 18/20] pack-revindex: remove unused 'find_revindex_position()' Taylor Blau
2021-01-11 11:57   ` Derrick Stolee
2021-01-11 16:27     ` Taylor Blau
2021-01-11 17:11       ` Derrick Stolee
2021-01-12  9:32   ` Jeff King
2021-01-12 16:59     ` Taylor Blau
2021-01-13 13:05       ` Jeff King
2021-01-08 18:18 ` [PATCH 19/20] pack-revindex: hide the definition of 'revindex_entry' Taylor Blau
2021-01-11 11:57   ` Derrick Stolee
2021-01-12  9:34   ` Jeff King
2021-01-08 18:18 ` [PATCH 20/20] pack-revindex.c: avoid direct revindex access in 'offset_to_pack_pos()' Taylor Blau
2021-01-12  9:37   ` Jeff King
2021-01-12 17:02     ` Taylor Blau
2021-01-11 12:07 ` [PATCH 00/20] pack-revindex: prepare for on-disk reverse index Derrick Stolee
2021-01-11 16:30   ` Taylor Blau
2021-01-11 17:15     ` Derrick Stolee
2021-01-11 17:29       ` Taylor Blau
2021-01-11 18:40       ` Junio C Hamano
2021-01-12  9:45 ` Jeff King
2021-01-12 17:07   ` Taylor Blau
2021-01-13  0:23 ` Junio C Hamano
2021-01-13  0:52   ` Taylor Blau
2021-01-13  2:15     ` Junio C Hamano
2021-01-13  3:23       ` Taylor Blau
2021-01-13  8:21         ` Junio C Hamano
2021-01-13 13:13           ` Jeff King
2021-01-13 15:34             ` Taylor Blau
2021-01-13 20:06               ` Junio C Hamano
2021-01-13 20:13                 ` Taylor Blau
2021-01-13 20:22                 ` Jeff King
2021-01-13 22:23 ` [PATCH v2 " Taylor Blau
2021-01-13 22:23   ` [PATCH v2 01/20] pack-revindex: introduce a new API Taylor Blau
2021-01-14  6:46     ` Junio C Hamano
2021-01-14 12:00       ` Derrick Stolee
2021-01-14 17:06       ` Taylor Blau
2021-01-14 19:19         ` Jeff King
2021-01-14 20:47           ` Junio C Hamano
2021-01-13 22:23   ` [PATCH v2 02/20] write_reuse_object(): convert to new revindex API Taylor Blau
2021-01-13 22:23   ` [PATCH v2 03/20] write_reused_pack_one(): " Taylor Blau
2021-01-13 22:23   ` [PATCH v2 04/20] write_reused_pack_verbatim(): " Taylor Blau
2021-01-13 22:23   ` [PATCH v2 05/20] check_object(): " Taylor Blau
2021-01-13 22:23   ` [PATCH v2 06/20] bitmap_position_packfile(): " Taylor Blau
2021-01-13 22:23   ` [PATCH v2 07/20] show_objects_for_type(): " Taylor Blau
2021-01-13 22:24   ` [PATCH v2 08/20] get_size_by_pos(): " Taylor Blau
2021-01-13 22:24   ` [PATCH v2 09/20] try_partial_reuse(): " Taylor Blau
2021-01-13 22:24   ` [PATCH v2 10/20] rebuild_existing_bitmaps(): " Taylor Blau
2021-01-13 22:24   ` [PATCH v2 11/20] get_delta_base_oid(): " Taylor Blau
2021-01-13 22:24   ` [PATCH v2 12/20] retry_bad_packed_offset(): " Taylor Blau
2021-01-13 22:24   ` [PATCH v2 13/20] packed_object_info(): " Taylor Blau
2021-01-13 22:24   ` [PATCH v2 14/20] unpack_entry(): " Taylor Blau
2021-01-13 22:24   ` [PATCH v2 15/20] for_each_object_in_pack(): " Taylor Blau
2021-01-14  6:43     ` Junio C Hamano
2021-01-14 17:00       ` Taylor Blau
2021-01-14 19:33       ` Jeff King
2021-01-14 20:11         ` Jeff King
2021-01-14 20:15           ` Taylor Blau
2021-01-15  2:22             ` Junio C Hamano
2021-01-15  2:29               ` Taylor Blau
2021-01-14 20:51         ` Junio C Hamano
2021-01-13 22:24   ` [PATCH v2 16/20] builtin/gc.c: guess the size of the revindex Taylor Blau
2021-01-14  6:33     ` Junio C Hamano
2021-01-14 16:53       ` Taylor Blau
2021-01-14 19:43         ` Jeff King
2021-01-13 22:24   ` [PATCH v2 17/20] pack-revindex: remove unused 'find_pack_revindex()' Taylor Blau
2021-01-13 22:25   ` [PATCH v2 18/20] pack-revindex: remove unused 'find_revindex_position()' Taylor Blau
2021-01-14  6:42     ` Junio C Hamano
2021-01-13 22:25   ` [PATCH v2 19/20] pack-revindex: hide the definition of 'revindex_entry' Taylor Blau
2021-01-13 22:25   ` [PATCH v2 20/20] pack-revindex.c: avoid direct revindex access in 'offset_to_pack_pos()' Taylor Blau
2021-01-14  6:42     ` Junio C Hamano
2021-01-14 16:56       ` Taylor Blau
2021-01-14 19:51   ` [PATCH v2 00/20] pack-revindex: prepare for on-disk reverse index Jeff King
2021-01-14 20:53     ` Junio C Hamano
2021-01-15  9:32       ` Jeff King
2021-01-15  9:33         ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=13c28eca-81d5-10c9-c92c-162547416014@gmail.com \
    --to=stolee@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=jrnieder@gmail.com \
    --cc=me@ttaylorr.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).