From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org, Derrick Stolee <derrickstolee@github.com>,
	Bagas Sanjaya <bagasdotme@gmail.com>
Subject: Re: [PATCH] pack-objects: lazily set up "struct rev_info", don't leak
Date: Sat, 26 Mar 2022 02:09:08 +0100
Message-ID: <220326.86wnghjz2s.gmgdl@evledraar.gmail.com>
In-Reply-To: <xmqqmthdampa.fsf@gitster.g>


On Fri, Mar 25 2022, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:
>
>> In the preceding [1] (pack-objects: move revs out of
>> get_object_list(), 2022-03-22) the "repo_init_revisions()" was moved
>> to cmd_pack_objects() so that it unconditionally took place for all
>> invocations of "git pack-objects".
>>
>> We'd thus start leaking memory, which is easily reproduced in
>> e.g. git.git by feeding e83c5163316 (Initial revision of "git", the
>> information manager from hell, 2005-04-07) to "git pack-objects";
>> ...
>> Narrowly fixing that commit would have been easy: just add a call to
>> repo_init_revisions() right before get_object_list(), which is
>> effectively what was done before that commit.
>>
>> But an unstated constraint when setting it up early is that it was
>> needed for the subsequent [2] (pack-objects: parse --filter directly
>> into revs.filter, 2022-03-22), i.e. we might have a --filter
>> command-line option, and need to either have the "struct rev_info"
>> setup when we encounter that option, or later.
>>
>> Let's just change the control flow so that we'll instead set up the
>> "struct rev_info" only when we need it. Doing so leads to a bit more
>> verbosity, but it's a lot clearer what we're doing and why.
>
> Is this about "we take it as given that the use of rev_info leaks
> until we fix revisions API, so let's keep its use limited to avoid
> unnecessary leaks"?

Not exactly. When you use the revisions API to do "filter" stuff in
this codepath it leaks both before & after Derrick's patches, so nothing
has changed in that case, but...

> If so, it sort-of makes sense, but smells like a roundabout way to
> address the issue.  An obvious alternative is to wait until both the
> topic and the "plug revision API" topic graduate and then add a
> "release" call to release the resource in the same scope as the
> unconditional call to init_revisions at the end.  I do not quite get
> what on-demand lazy set-up buys us.  What we need to lazily set-up,
> when we do lazily set-up, needs to be released either way, no?

...We were doing lazy setup of "struct rev_info" before the parent
series. The series made that setup unconditional, and as a result it
introduces a new memory leak: we do a malloc() for some diff.c code
that revisions.c uses unconditionally, which we then don't use at all
in some common cases.

The patch I've submitted on top just restores the previous state of the
initialization being lazy, adapted to the other code changes the series
made.
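
To make the control flow concrete, here is a minimal sketch of the
lazy setup pattern in C. It does not use git's real revisions API:
"struct toy_revs", "struct lazy_revs", run_pack() and all the helpers
are hypothetical stand-ins, and the release step is an assumption of
the sketch rather than something this thread's patch provides. The
malloc() in toy_revs_init() plays the role of the allocation that
happens during setup whether or not the result is ever used.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for "struct rev_info"; not git's real type. */
struct toy_revs {
	char *scratch;       /* allocated during setup, used or not */
	const char *filter;  /* only set when a --filter option is seen */
};

/* Wrapper that remembers whether the setup has happened yet. */
struct lazy_revs {
	struct toy_revs revs;
	int have_revs;
};

static int init_calls; /* counts how often the setup actually ran */

static void toy_revs_init(struct toy_revs *revs)
{
	memset(revs, 0, sizeof(*revs));
	revs->scratch = malloc(64);
	assert(revs->scratch);
	init_calls++;
}

/* Set up on first use; later callers get the same instance. */
static struct toy_revs *lazy_revs_get(struct lazy_revs *lazy)
{
	if (!lazy->have_revs) {
		toy_revs_init(&lazy->revs);
		lazy->have_revs = 1;
	}
	return &lazy->revs;
}

/* Release only what was set up (assumed cleanup step for the sketch). */
static void lazy_revs_release(struct lazy_revs *lazy)
{
	if (!lazy->have_revs)
		return;
	free(lazy->revs.scratch);
	lazy->revs.scratch = NULL;
	lazy->have_revs = 0;
}

/*
 * Toy command body: returns how many times the setup ran for this
 * invocation. Both the option parser and the object-list path ask
 * for the struct, but only the first request pays for the setup.
 */
static int run_pack(const char *filter_arg, int need_object_list)
{
	struct lazy_revs lazy = { 0 };
	int before = init_calls;

	if (filter_arg)
		lazy_revs_get(&lazy)->filter = filter_arg;
	if (need_object_list)
		lazy_revs_get(&lazy);

	lazy_revs_release(&lazy);
	return init_calls - before;
}
```

With this shape, run_pack(NULL, 0) never runs the setup at all, which
is the case the unconditional call was paying for needlessly, while an
invocation that hits both the --filter and the object-list paths still
runs it only once.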
