git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "René Scharfe" <l.s.r@web.de>
To: "SZEDER Gábor" <szeder.dev@gmail.com>
Cc: Junio C Hamano <gitster@pobox.com>, git@vger.kernel.org
Subject: Re: [PATCH 08/15] name-rev: pull out deref handling from the recursion
Date: Mon, 23 Sep 2019 20:43:11 +0200	[thread overview]
Message-ID: <822d19dc-5202-b67f-5c0c-e5c69a679237@web.de> (raw)
In-Reply-To: <20190922190511.GB10866@szeder.dev>

Am 22.09.19 um 21:05 schrieb SZEDER Gábor:
> On Sat, Sep 21, 2019 at 02:37:18PM +0200, René Scharfe wrote:
>> Am 21.09.19 um 11:57 schrieb SZEDER Gábor:
>>> On Fri, Sep 20, 2019 at 08:14:07PM +0200, SZEDER Gábor wrote:
>>>> On Fri, Sep 20, 2019 at 08:13:02PM +0200, SZEDER Gábor wrote:
>>>>>> If the (re)introduced leak doesn't impact performance and memory
>>>>>> usage too much then duplicating tip_name again in name_rev() or
>>>>>> rather your new create_or_update_name() would likely make the
>>>>>> lifetimes of those string buffers easier to manage.
>>>>>
>>>>> Yeah, the easiest would be when each 'struct rev_name' in the commit
>>>>> slab would have its own 'tip_name' string, but that would result in
>>>>> a lot of duplicated strings and increased memory usage.
>>>>
>>>> I didn't measure how much more memory would be used, though.
>>>
>>> So, I tried the patch below to give each 'struct rev_name' instance
>>> its own copy of 'tip_name', and the memory usage of 'git name-rev
>>> --all' usually increased.
>>>
>>> The increase depends on how many merges and how many refs there are
>>> compared to the number of commits: the fewer merges and refs, the
>>> higher the more the memory usage increased:
>>>
>>>   linux:         +4.8%
>>>   gcc:           +7.2%
>>>   gecko-dev:     +9.2%
>>>   webkit:       +12.4%
>>>   llvm-project: +19.0%
>>
>> Is that the overall memory usage or just for struct rev_name instances
>> and tip_name strings?
>
> It's overall memory usage, the avarage of five runs of:
>
>   /usr/bin/time --format='%M' ~/src/git/git name-rev --all
>
>> And how much is that in absolute terms?
>
> git:     29801 ->  28514
> linux:  317018 -> 332218
> gcc:    106462 -> 114140
> gecko:  315448 -> 344486
> webkit:  55847 ->  62780
> llvm:   112867 -> 134384

I only have the first two handy, and I get numbers like this with
master:

git, lots of branches with long names: 3075476
git, local clone, single branch:       1349016
linux, single branch:                  1520468

O_o

>> (Perhaps
>> it's worth it to get the memory ownership question off the table at
>> least during the transformation to iterative processing.)
>
> I looked into it only because I got curious, but other than that I
> will definitely play the "beyond the scope of this patch series" card
> :)

Fair enough.

>> I wonder why regular commits even need a struct name_rev.  Shouldn't
>> only tips and roots need ones?  And perhaps merges and occasional
>> regular "checkpoint" commits, to avoid too many duplicate traversals.
>
> The 'struct rev_name' holds all info that's necessary to determine the
> commit's name.  It seems to be much simpler to just attach one to each
> commit and then retrieve it from the commit slab when printing the
> name of the commit than to come up with an algorithm where only a
> sleect set of commits get a 'struct rev_name', including how to access
> those when naming a commit that doesn't have one.

Sure, the lookup of individual commits is much easier once all commits
have name tags attached.  Preparing that sounds expensive, though.
It's a trade-off favoring looking up lots of names per program run.

>>>  --- >8 ---
>>>
>>> diff --git a/builtin/name-rev.c b/builtin/name-rev.c
>>> index 6969af76c4..62ab78242b 100644
>>> --- a/builtin/name-rev.c
>>> +++ b/builtin/name-rev.c
>>> @@ -88,6 +88,7 @@ static struct rev_name *create_or_update_name(struct commit *commit,
>>>  		set_commit_rev_name(commit, name);
>>>  		goto copy_data;
>>>  	} else if (is_better_name(name, taggerdate, distance, from_tag)) {
>>> +		free((char*) name->tip_name);
>>>  copy_data:
>>>  		name->tip_name = tip_name;
>>
>> I would have expected a xstrdup() call here.
>
> But then we'd needed to release the results of all those xstrfmt()
> calls at the callsites of create_or_update_name(), so instead of those
> strdup() calls that you deem unnecessary we would need additional
> free() calls.

Correct.  That would be simpler and shouldn't affect peak memory
usage.

René

  reply	other threads:[~2019-09-23 18:43 UTC|newest]

Thread overview: 98+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-09-19 21:46 [PATCH 00/15] name-rev: eliminate recursion SZEDER Gábor
2019-09-19 21:46 ` [PATCH 01/15] t6120-describe: correct test repo history graph in comment SZEDER Gábor
2019-09-20 21:47   ` Junio C Hamano
2019-09-20 22:29     ` SZEDER Gábor
2019-09-28  4:06       ` Junio C Hamano
2019-09-19 21:46 ` [PATCH 02/15] t6120-describe: modernize the 'check_describe' helper SZEDER Gábor
2019-09-20 21:49   ` Junio C Hamano
2019-09-19 21:46 ` [PATCH 03/15] name-rev: use strip_suffix() in get_rev_name() SZEDER Gábor
2019-09-20 16:36   ` René Scharfe
2019-09-20 17:10     ` SZEDER Gábor
2019-09-19 21:46 ` [PATCH 04/15] name-rev: avoid unnecessary cast in name_ref() SZEDER Gábor
2019-09-20 16:37   ` René Scharfe
2019-09-19 21:47 ` [PATCH 05/15] name-rev: use sizeof(*ptr) instead of sizeof(type) in allocation SZEDER Gábor
2019-09-20 15:11   ` Derrick Stolee
2019-09-20 15:40     ` SZEDER Gábor
2019-09-20 16:37   ` René Scharfe
2019-09-19 21:47 ` [PATCH 06/15] t6120: add a test to cover inner conditions in 'git name-rev's name_rev() SZEDER Gábor
2019-09-20 15:14   ` Derrick Stolee
2019-09-20 15:44     ` SZEDER Gábor
2019-09-19 21:47 ` [PATCH 07/15] name-rev: extract creating/updating a 'struct name_rev' into a helper SZEDER Gábor
2019-09-20 15:18   ` Derrick Stolee
2019-09-22  8:18   ` [PATCH] name-rev: rewrite create_or_update_name() Martin Ågren
2019-12-09 12:43     ` SZEDER Gábor
2019-09-19 21:47 ` [PATCH 08/15] name-rev: pull out deref handling from the recursion SZEDER Gábor
2019-09-20 15:21   ` Derrick Stolee
2019-09-20 17:42     ` SZEDER Gábor
2019-09-20 16:37   ` René Scharfe
2019-09-20 18:13     ` SZEDER Gábor
2019-09-20 18:14       ` SZEDER Gábor
2019-09-21  9:57         ` SZEDER Gábor
2019-09-21 12:37           ` René Scharfe
2019-09-22 19:05             ` SZEDER Gábor
2019-09-23 18:43               ` René Scharfe [this message]
2019-09-23 18:59                 ` SZEDER Gábor
2019-09-23 19:55                   ` René Scharfe
2019-09-23 20:47                     ` SZEDER Gábor
2019-09-24 17:03                       ` René Scharfe
2019-09-26 17:33                         ` SZEDER Gábor
2019-09-21 12:37       ` René Scharfe
2019-09-21 14:21         ` SZEDER Gábor
2019-09-21 15:52           ` René Scharfe
2019-09-19 21:47 ` [PATCH 09/15] name-rev: restructure parsing commits and applying date cutoff SZEDER Gábor
2019-09-21 12:37   ` René Scharfe
2019-09-19 21:47 ` [PATCH 10/15] name-rev: restructure creating/updating 'struct rev_name' instances SZEDER Gábor
2019-09-20 15:27   ` Derrick Stolee
2019-09-20 17:09     ` SZEDER Gábor
2019-09-19 21:47 ` [PATCH 11/15] name-rev: drop name_rev()'s 'generation' and 'distance' parameters SZEDER Gábor
2019-09-19 21:47 ` [PATCH 12/15] name-rev: eliminate recursion in name_rev() SZEDER Gábor
2019-09-19 21:47 ` [PATCH 13/15] name-rev: cleanup name_ref() SZEDER Gábor
2019-09-19 21:47 ` [PATCH 14/15] name-rev: plug a memory leak in name_rev() SZEDER Gábor
2019-09-19 21:47 ` [PATCH 14/15] name-rev: plug memory leak in name_rev() in the deref case SZEDER Gábor
2019-09-19 22:47   ` SZEDER Gábor
2019-09-19 21:47 ` [PATCH 15/15] name-rev: plug a " SZEDER Gábor
2019-09-20 15:35   ` Derrick Stolee
2019-09-19 21:47 ` [PATCH 15/15] name-rev: plug memory leak in name_rev() SZEDER Gábor
2019-09-19 22:48   ` SZEDER Gábor
2019-09-20 15:37 ` [PATCH 00/15] name-rev: eliminate recursion Derrick Stolee
2019-09-20 17:37   ` SZEDER Gábor
2019-11-12 10:38 ` [PATCH v2 00/13] " SZEDER Gábor
2019-11-12 10:38   ` [PATCH v2 01/13] t6120-describe: correct test repo history graph in comment SZEDER Gábor
2019-11-12 10:38   ` [PATCH v2 02/13] t6120-describe: modernize the 'check_describe' helper SZEDER Gábor
2019-11-27 18:02     ` Jonathan Tan
2019-11-12 10:38   ` [PATCH v2 03/13] name-rev: use strbuf_strip_suffix() in get_rev_name() SZEDER Gábor
2019-11-12 19:02     ` René Scharfe
2019-11-12 10:38   ` [PATCH v2 04/13] name-rev: avoid unnecessary cast in name_ref() SZEDER Gábor
2019-11-12 10:38   ` [PATCH v2 05/13] name-rev: use sizeof(*ptr) instead of sizeof(type) in allocation SZEDER Gábor
2019-11-12 10:38   ` [PATCH v2 06/13] t6120: add a test to cover inner conditions in 'git name-rev's name_rev() SZEDER Gábor
2019-11-12 10:38   ` [PATCH v2 07/13] name-rev: extract creating/updating a 'struct name_rev' into a helper SZEDER Gábor
2019-11-12 10:38   ` [PATCH v2 08/13] name-rev: pull out deref handling from the recursion SZEDER Gábor
2019-11-12 10:38   ` [PATCH v2 09/13] name-rev: restructure parsing commits and applying date cutoff SZEDER Gábor
2019-11-12 10:38   ` [PATCH v2 10/13] name-rev: restructure creating/updating 'struct rev_name' instances SZEDER Gábor
2019-11-12 10:38   ` [PATCH v2 11/13] name-rev: drop name_rev()'s 'generation' and 'distance' parameters SZEDER Gábor
2019-11-27 18:13     ` Jonathan Tan
2019-11-12 10:38   ` [PATCH v2 12/13] name-rev: eliminate recursion in name_rev() SZEDER Gábor
2019-11-27 17:57     ` Jonathan Tan
2019-12-09 12:22       ` SZEDER Gábor
2019-11-12 10:38   ` [PATCH v2 13/13] name-rev: cleanup name_ref() SZEDER Gábor
2019-11-27 18:01     ` Jonathan Tan
2019-12-09 12:32       ` SZEDER Gábor
2019-11-12 19:17   ` [PATCH v2 00/13] name-rev: eliminate recursion Johannes Schindelin
2019-11-13 19:25     ` Sebastiaan Dammann
2019-12-09 11:52   ` [PATCH v3 00/14] " SZEDER Gábor
2019-12-09 11:52     ` [PATCH v3 01/14] t6120-describe: correct test repo history graph in comment SZEDER Gábor
2019-12-09 11:52     ` [PATCH v3 02/14] t6120-describe: modernize the 'check_describe' helper SZEDER Gábor
2019-12-09 11:52     ` [PATCH v3 03/14] name-rev: use strbuf_strip_suffix() in get_rev_name() SZEDER Gábor
2019-12-09 11:52     ` [PATCH v3 04/14] name-rev: avoid unnecessary cast in name_ref() SZEDER Gábor
2019-12-09 11:52     ` [PATCH v3 05/14] name-rev: use sizeof(*ptr) instead of sizeof(type) in allocation SZEDER Gábor
2019-12-09 11:52     ` [PATCH v3 06/14] t6120: add a test to cover inner conditions in 'git name-rev's name_rev() SZEDER Gábor
2019-12-09 11:52     ` [PATCH v3 07/14] name-rev: extract creating/updating a 'struct name_rev' into a helper SZEDER Gábor
2019-12-09 11:52     ` [PATCH v3 08/14] name-rev: pull out deref handling from the recursion SZEDER Gábor
2019-12-09 11:52     ` [PATCH v3 09/14] name-rev: restructure parsing commits and applying date cutoff SZEDER Gábor
2019-12-09 11:52     ` [PATCH v3 10/14] name-rev: restructure creating/updating 'struct rev_name' instances SZEDER Gábor
2019-12-09 11:52     ` [PATCH v3 11/14] name-rev: drop name_rev()'s 'generation' and 'distance' parameters SZEDER Gábor
2019-12-09 11:52     ` [PATCH v3 12/14] name-rev: use 'name->tip_name' instead of 'tip_name' SZEDER Gábor
2019-12-09 11:52     ` [PATCH v3 13/14] name-rev: eliminate recursion in name_rev() SZEDER Gábor
2019-12-09 11:52     ` [PATCH v3 14/14] name-rev: cleanup name_ref() SZEDER Gábor
2019-12-09 15:08     ` [PATCH v3 00/14] name-rev: eliminate recursion Derrick Stolee
2019-12-11 17:33       ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=822d19dc-5202-b67f-5c0c-e5c69a679237@web.de \
    --to=l.s.r@web.de \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=szeder.dev@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).