git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "SZEDER Gábor" <szeder.dev@gmail.com>
To: Derrick Stolee <stolee@gmail.com>
Cc: Junio C Hamano <gitster@pobox.com>, git@vger.kernel.org
Subject: Re: [PATCH 10/15] name-rev: restructure creating/updating 'struct rev_name' instances
Date: Fri, 20 Sep 2019 19:09:36 +0200	[thread overview]
Message-ID: <20190920170936.GA32474@szeder.dev> (raw)
In-Reply-To: <515a4590-47e8-fe95-833b-307364fc8fa4@gmail.com>

On Fri, Sep 20, 2019 at 11:27:49AM -0400, Derrick Stolee wrote:
> On 9/19/2019 5:47 PM, SZEDER Gábor wrote:
> > At the beginning of the recursive name_rev() function it creates a new
> > 'struct rev_name' instance for each previously unvisited commit or, if
> > this visit results in better name for an already visited commit, then
> > updates the 'struct rev_name' instance attached to to the commit, or
> > returns early.
> > 
> > Restructure this so it's caller creates or updates the 'struct
> > rev_name' instance associated with the commit to be passed as
> > parameter, i.e. both name_ref() before calling name_rev() and
> > name_rev() itself as it iterates over the parent commits.
> > 
> > This makes eliminating the recursion a bit easier to follow, and it
> > will be moved back to name_rev() after the recursion is eliminated.
> > 
> > This change also plugs the memory leak that was temporarily unplugged
> > in the earlier "name-rev: pull out deref handling from the recursion"
> > patch in this series.
> [snip]
> >  
> > @@ -276,11 +277,17 @@ static int name_ref(const char *path, const struct object_id *oid, int flags, vo
> >  		path = name_ref_abbrev(path, can_abbreviate_output);
> >  		if (commit->date >= cutoff) {
> >  			const char *tip_name;
> > +			char *to_free = NULL;
> >  			if (deref)
> > -				tip_name = xstrfmt("%s^0", path);
> > +				tip_name = to_free = xstrfmt("%s^0", path);
> >  			else
> >  				tip_name = xstrdup(path);
> 
> So this xstrdup(path) is not a leak?

Well... yes, everything is leaked, eventually ;)

First of all, name_ref() is a callback function invoked by
for_each_ref(), and 'path' is one of its parameters.  This means
that we must duplicate this string, because we might need to access it
even after iterating over refs is over (when displaying the name of
the commits).

This copy then becomes the initial name_rev() invocation's 'tip_name'
parameter, and things get a bit harder to reason about because of
merges, the recursion, and being in the middle of the refactorings...

So for argument's sake look at current master's 'builtin/name-rev.c',
and assume that it has to deal with a linear history with a single
branch, and no tags.  When name_ref() is invoked with a branch, then
deref = 0, so in name_rev() the condition:

  if (deref) {
        tip_name = to_free = xstrfmt("%s^0", tip_name);

won't be fulfilled, and 'tip_name' remains unchanged.  Then comes the
initialization of the 'struct rev_name' associated with the tip commit
in the commit slab, including the

  name->tip_name = tip_name;

assignment, IOW we now have a pointer to the original 'tip_name' in
the commit slab.

Then name_rev() looks at the tip commit's parents, or rather, because
of the linear history, at its only parent.  This means that neither of
the other two xstrfmt() will be invoked, and name_rev() will be
recursively invoked with the original 'tip_name' pointer.  This will
then initialize another 'struct rev_name' instance, including yet
another pointer to the original 'tip_name'.

This will go on until the root commit is reached, and in the end every
commit will have an associated 'struct rev_name' instance with a
pointer to the original 'tip_name' string.

At this point we're done with the recursion, and since the repo has
only a single branch, we're done with for_each_ref() as well, and
finally print names of the commits.

Then it's time to clean up and free() memory, but:

  - we could only release the commit slab and free() all 'struct
    rev_name' instances within, but we can't free() the
    'name->tip_name' pointers, because, as shown above, we might have
    a bunch of pointers pointing to the same string.

  - we are at the end of cmd_name_rev() and are about to exit the
    program, so the OS will release any and all resources anyway.
    Yeah, it's far from ready to be turned into a library call...

Now, in a real repository we'll have multiple branches, tags, and
merges, so there will be cases when an existing 'struct rev_name'
instance is updated, and it's 'name->tip_name' is overwritten by a
better name.  At that point the old value is potentially leaked, but
we can't really do anything about it, because we don't know whether
any other 'struct rev_name' instances point to it.

Eliminating the recursion doesn't change anything in this respect.



  reply	other threads:[~2019-09-20 17:09 UTC|newest]

Thread overview: 98+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-09-19 21:46 [PATCH 00/15] name-rev: eliminate recursion SZEDER Gábor
2019-09-19 21:46 ` [PATCH 01/15] t6120-describe: correct test repo history graph in comment SZEDER Gábor
2019-09-20 21:47   ` Junio C Hamano
2019-09-20 22:29     ` SZEDER Gábor
2019-09-28  4:06       ` Junio C Hamano
2019-09-19 21:46 ` [PATCH 02/15] t6120-describe: modernize the 'check_describe' helper SZEDER Gábor
2019-09-20 21:49   ` Junio C Hamano
2019-09-19 21:46 ` [PATCH 03/15] name-rev: use strip_suffix() in get_rev_name() SZEDER Gábor
2019-09-20 16:36   ` René Scharfe
2019-09-20 17:10     ` SZEDER Gábor
2019-09-19 21:46 ` [PATCH 04/15] name-rev: avoid unnecessary cast in name_ref() SZEDER Gábor
2019-09-20 16:37   ` René Scharfe
2019-09-19 21:47 ` [PATCH 05/15] name-rev: use sizeof(*ptr) instead of sizeof(type) in allocation SZEDER Gábor
2019-09-20 15:11   ` Derrick Stolee
2019-09-20 15:40     ` SZEDER Gábor
2019-09-20 16:37   ` René Scharfe
2019-09-19 21:47 ` [PATCH 06/15] t6120: add a test to cover inner conditions in 'git name-rev's name_rev() SZEDER Gábor
2019-09-20 15:14   ` Derrick Stolee
2019-09-20 15:44     ` SZEDER Gábor
2019-09-19 21:47 ` [PATCH 07/15] name-rev: extract creating/updating a 'struct name_rev' into a helper SZEDER Gábor
2019-09-20 15:18   ` Derrick Stolee
2019-09-22  8:18   ` [PATCH] name-rev: rewrite create_or_update_name() Martin Ågren
2019-12-09 12:43     ` SZEDER Gábor
2019-09-19 21:47 ` [PATCH 08/15] name-rev: pull out deref handling from the recursion SZEDER Gábor
2019-09-20 15:21   ` Derrick Stolee
2019-09-20 17:42     ` SZEDER Gábor
2019-09-20 16:37   ` René Scharfe
2019-09-20 18:13     ` SZEDER Gábor
2019-09-20 18:14       ` SZEDER Gábor
2019-09-21  9:57         ` SZEDER Gábor
2019-09-21 12:37           ` René Scharfe
2019-09-22 19:05             ` SZEDER Gábor
2019-09-23 18:43               ` René Scharfe
2019-09-23 18:59                 ` SZEDER Gábor
2019-09-23 19:55                   ` René Scharfe
2019-09-23 20:47                     ` SZEDER Gábor
2019-09-24 17:03                       ` René Scharfe
2019-09-26 17:33                         ` SZEDER Gábor
2019-09-21 12:37       ` René Scharfe
2019-09-21 14:21         ` SZEDER Gábor
2019-09-21 15:52           ` René Scharfe
2019-09-19 21:47 ` [PATCH 09/15] name-rev: restructure parsing commits and applying date cutoff SZEDER Gábor
2019-09-21 12:37   ` René Scharfe
2019-09-19 21:47 ` [PATCH 10/15] name-rev: restructure creating/updating 'struct rev_name' instances SZEDER Gábor
2019-09-20 15:27   ` Derrick Stolee
2019-09-20 17:09     ` SZEDER Gábor [this message]
2019-09-19 21:47 ` [PATCH 11/15] name-rev: drop name_rev()'s 'generation' and 'distance' parameters SZEDER Gábor
2019-09-19 21:47 ` [PATCH 12/15] name-rev: eliminate recursion in name_rev() SZEDER Gábor
2019-09-19 21:47 ` [PATCH 13/15] name-rev: cleanup name_ref() SZEDER Gábor
2019-09-19 21:47 ` [PATCH 14/15] name-rev: plug a memory leak in name_rev() SZEDER Gábor
2019-09-19 21:47 ` [PATCH 14/15] name-rev: plug memory leak in name_rev() in the deref case SZEDER Gábor
2019-09-19 22:47   ` SZEDER Gábor
2019-09-19 21:47 ` [PATCH 15/15] name-rev: plug a " SZEDER Gábor
2019-09-20 15:35   ` Derrick Stolee
2019-09-19 21:47 ` [PATCH 15/15] name-rev: plug memory leak in name_rev() SZEDER Gábor
2019-09-19 22:48   ` SZEDER Gábor
2019-09-20 15:37 ` [PATCH 00/15] name-rev: eliminate recursion Derrick Stolee
2019-09-20 17:37   ` SZEDER Gábor
2019-11-12 10:38 ` [PATCH v2 00/13] " SZEDER Gábor
2019-11-12 10:38   ` [PATCH v2 01/13] t6120-describe: correct test repo history graph in comment SZEDER Gábor
2019-11-12 10:38   ` [PATCH v2 02/13] t6120-describe: modernize the 'check_describe' helper SZEDER Gábor
2019-11-27 18:02     ` Jonathan Tan
2019-11-12 10:38   ` [PATCH v2 03/13] name-rev: use strbuf_strip_suffix() in get_rev_name() SZEDER Gábor
2019-11-12 19:02     ` René Scharfe
2019-11-12 10:38   ` [PATCH v2 04/13] name-rev: avoid unnecessary cast in name_ref() SZEDER Gábor
2019-11-12 10:38   ` [PATCH v2 05/13] name-rev: use sizeof(*ptr) instead of sizeof(type) in allocation SZEDER Gábor
2019-11-12 10:38   ` [PATCH v2 06/13] t6120: add a test to cover inner conditions in 'git name-rev's name_rev() SZEDER Gábor
2019-11-12 10:38   ` [PATCH v2 07/13] name-rev: extract creating/updating a 'struct name_rev' into a helper SZEDER Gábor
2019-11-12 10:38   ` [PATCH v2 08/13] name-rev: pull out deref handling from the recursion SZEDER Gábor
2019-11-12 10:38   ` [PATCH v2 09/13] name-rev: restructure parsing commits and applying date cutoff SZEDER Gábor
2019-11-12 10:38   ` [PATCH v2 10/13] name-rev: restructure creating/updating 'struct rev_name' instances SZEDER Gábor
2019-11-12 10:38   ` [PATCH v2 11/13] name-rev: drop name_rev()'s 'generation' and 'distance' parameters SZEDER Gábor
2019-11-27 18:13     ` Jonathan Tan
2019-11-12 10:38   ` [PATCH v2 12/13] name-rev: eliminate recursion in name_rev() SZEDER Gábor
2019-11-27 17:57     ` Jonathan Tan
2019-12-09 12:22       ` SZEDER Gábor
2019-11-12 10:38   ` [PATCH v2 13/13] name-rev: cleanup name_ref() SZEDER Gábor
2019-11-27 18:01     ` Jonathan Tan
2019-12-09 12:32       ` SZEDER Gábor
2019-11-12 19:17   ` [PATCH v2 00/13] name-rev: eliminate recursion Johannes Schindelin
2019-11-13 19:25     ` Sebastiaan Dammann
2019-12-09 11:52   ` [PATCH v3 00/14] " SZEDER Gábor
2019-12-09 11:52     ` [PATCH v3 01/14] t6120-describe: correct test repo history graph in comment SZEDER Gábor
2019-12-09 11:52     ` [PATCH v3 02/14] t6120-describe: modernize the 'check_describe' helper SZEDER Gábor
2019-12-09 11:52     ` [PATCH v3 03/14] name-rev: use strbuf_strip_suffix() in get_rev_name() SZEDER Gábor
2019-12-09 11:52     ` [PATCH v3 04/14] name-rev: avoid unnecessary cast in name_ref() SZEDER Gábor
2019-12-09 11:52     ` [PATCH v3 05/14] name-rev: use sizeof(*ptr) instead of sizeof(type) in allocation SZEDER Gábor
2019-12-09 11:52     ` [PATCH v3 06/14] t6120: add a test to cover inner conditions in 'git name-rev's name_rev() SZEDER Gábor
2019-12-09 11:52     ` [PATCH v3 07/14] name-rev: extract creating/updating a 'struct name_rev' into a helper SZEDER Gábor
2019-12-09 11:52     ` [PATCH v3 08/14] name-rev: pull out deref handling from the recursion SZEDER Gábor
2019-12-09 11:52     ` [PATCH v3 09/14] name-rev: restructure parsing commits and applying date cutoff SZEDER Gábor
2019-12-09 11:52     ` [PATCH v3 10/14] name-rev: restructure creating/updating 'struct rev_name' instances SZEDER Gábor
2019-12-09 11:52     ` [PATCH v3 11/14] name-rev: drop name_rev()'s 'generation' and 'distance' parameters SZEDER Gábor
2019-12-09 11:52     ` [PATCH v3 12/14] name-rev: use 'name->tip_name' instead of 'tip_name' SZEDER Gábor
2019-12-09 11:52     ` [PATCH v3 13/14] name-rev: eliminate recursion in name_rev() SZEDER Gábor
2019-12-09 11:52     ` [PATCH v3 14/14] name-rev: cleanup name_ref() SZEDER Gábor
2019-12-09 15:08     ` [PATCH v3 00/14] name-rev: eliminate recursion Derrick Stolee
2019-12-11 17:33       ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190920170936.GA32474@szeder.dev \
    --to=szeder.dev@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=stolee@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).