From: Stefan Beller <sbeller@google.com>
To: "Martin Ågren" <martin.agren@gmail.com>
Cc: gitgitgadget@gmail.com, git <git@vger.kernel.org>,
	Junio C Hamano <gitster@pobox.com>,
	Derrick Stolee <dstolee@microsoft.com>
Subject: Re: [PATCH 1/2] commit-graph: clean up leaked memory during write
Date: Tue, 2 Oct 2018 12:44:09 -0700
Message-ID: <CAGZ79kb2pE3pFQx4A=oo-mYORjN1ubCgV6Gotc78i7d+BqZdBw@mail.gmail.com>
In-Reply-To: <CAN0heSqOjYDXRf4KE_C0GDnFW8r4qVfWnUVuW-Q+4D87nhFURQ@mail.gmail.com>

On Tue, Oct 2, 2018 at 12:09 PM Martin Ågren <martin.agren@gmail.com> wrote:
>
> On Tue, 2 Oct 2018 at 19:59, Stefan Beller <sbeller@google.com> wrote:
> > > > +
> > > > +       string_list_clear(&list, 0);
> > > >  }
> > >
> > > Nit: The blank line adds some asymmetry, IMVHO.
> >
> > I think these blank lines are super common, as in:
> >
> >     {
> >       declarations;
> >
> >       multiple;
> >       lines(of);
> >       code;
> >
> >       cleanup;
> >       and_frees;
> >     }
> >
> > (cf. display_table in column.c, which I admit to having
> > cherry-picked as an example).
> >
> > While in nit territory, I would rather move the string list init
> > into the first block:
> >
> >   {
> >     struct string_list list = STRING_LIST_INIT_DUP;
> >
> >     for_each_ref(add_ref_to_list, &list);
> >     write_commit_graph(obj_dir, NULL, &list, append);
> >
> >     string_list_clear(&list, 0);
> >   }
>
> Now this looks very symmetrical. :-)
>
> > > >  void write_commit_graph(const char *obj_dir,
> > > > @@ -846,9 +848,11 @@ void write_commit_graph(const char *obj_dir,
> > > >         compute_generation_numbers(&commits, report_progress);
> > > >
> > > >         graph_name = get_commit_graph_filename(obj_dir);
> > > > -       if (safe_create_leading_directories(graph_name))
> > > > +       if (safe_create_leading_directories(graph_name)) {
> > > > +               UNLEAK(graph_name);
> > > >                 die_errno(_("unable to create leading directories of %s"),
> > > >                           graph_name);
> > > > +       }
> > >
> > > Do you really need this hunk?
> >
> > graph_name is produced via xstrfmt in get_commit_graph_filename,
> > so it needs to be free'd in any return/exit path.
>
> Agreed. Although I am questioning that `die()` and its siblings count.
>
> > > In my testing with LeakSanitizer and
> > > valgrind, I don't need this hunk to be leak-free.
> >
> >
> > > Generally speaking, it
> > > seems impossible to UNLEAK when dying, since we don't know what we have
> > > allocated higher up in the call-stack.
> >
> > I do not understand; I thought UNLEAK was specifically for the purpose of
> > die() calls without imposing extra overhead; rereading 0e5bba53af
> > (add UNLEAK annotation for reducing leak false positives, 2017-09-08)
> > doesn't provide an example for prematurely die()ing, only for regular
> > program exit.
> >
> > > [...] With this hunk, I am
> > > puzzled and feel uneasy, both about having to UNLEAK before dying and
> > > about having to UNLEAK outside of builtin/.
> >
> > I am not uneasy about an UNLEAK before dying, but about dying outside
> > builtin/ in general
>
> Yeah, not dying would be even better (out of scope for this patch).
>
> > (but having a die call accompanied by UNLEAK seems
> > to be the right thing). Can you explain the worries you have regarding the
> > allocations on the call stack, as xstrfmt is allocating on the heap and we
> > only UNLEAK the pointer to that?
>
> I think we agree that leaking things "allocat[ed] on the call stack"
> isn't much of a worry. The reason I mentioned the call stack is that
> we've got any number of calls behind us on it, and we might have made
> all sorts of allocations on the heap, and at this point, we have no
> idea about what we should be UNLEAK-ing.

Wouldn't it be the responsibility of each function to make sure things
are UNLEAK'd or free'd before the function either returns or is cut
short (by a subroutine dying)?
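
To make the "each function cleans up its own locals" idea concrete, here
is a minimal sketch. It is not Git's implementation: `SKETCH_UNLEAK`,
`sketch_unleak()`, `sketch_graph_filename()` and `sketch_write_or_die()`
are hypothetical stand-ins, and the `simulate_failure` flag only exists so
the failing path can be shown. The assumed mechanism is that the
annotation copies the variable's bytes into a globally reachable list, so
a leak checker still sees the pointer after the function's stack frame is
gone.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an UNLEAK-style annotation (not Git's code). */
struct leak_root {
	struct leak_root *next;
	size_t len;
	unsigned char data[];	/* copy of the annotated variable's bytes */
};

/* Global list head: everything hanging off it stays reachable at exit. */
static struct leak_root *leak_roots;

void sketch_unleak(const void *var, size_t len)
{
	struct leak_root *root = malloc(sizeof(*root) + len);

	if (!root)
		return;	/* annotation only; nothing useful to do on failure */
	memcpy(root->data, var, len);
	root->len = len;
	root->next = leak_roots;
	leak_roots = root;
}

#define SKETCH_UNLEAK(var) sketch_unleak(&(var), sizeof(var))

/* Number of variables annotated so far (only here to observe the sketch). */
size_t sketch_unleak_count(void)
{
	size_t n = 0;

	for (struct leak_root *r = leak_roots; r; r = r->next)
		n++;
	return n;
}

/* Hypothetical helper: returns a heap-allocated "<obj_dir>/info/commit-graph". */
char *sketch_graph_filename(const char *obj_dir)
{
	const char *suffix = "/info/commit-graph";
	size_t len = strlen(obj_dir) + strlen(suffix) + 1;
	char *name = malloc(len);

	if (name)
		snprintf(name, len, "%s%s", obj_dir, suffix);
	return name;
}

/* The function owns graph_name, so it annotates or frees it on every path. */
void sketch_write_or_die(const char *obj_dir, int simulate_failure)
{
	char *graph_name = sketch_graph_filename(obj_dir);

	if (!graph_name || simulate_failure) {
		SKETCH_UNLEAK(graph_name);	/* about to exit: annotate, don't free */
		fprintf(stderr, "fatal: unable to create leading directories of %s\n",
			graph_name ? graph_name : "(null)");
		exit(128);
	}
	free(graph_name);	/* normal return path */
}
```

Note that this only covers `graph_name`, the one allocation this function
knows about; whatever callers further up the stack have allocated is
untouched, which is exactly the concern raised in the quoted text.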

In an ideal world we'd only ever exit/die in the functions high up
the call chain (which are in builtin/), and all other code would gracefully
return error codes or messages instead, or even cope with some failure
conditions.
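
As a rough sketch of that ideal (again with hypothetical names, not code
from the patch): a library-level function reports failure through a return
code and an error buffer and frees its own allocation on every path, while
only the top-level command decides to turn the error into an exit status.
`sketch_write_graph()`, `sketch_cmd_write()` and the `simulate_failure`
flag are assumptions made up for this illustration.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical library-level function: never exits, returns 0 on success
 * and -1 on failure with a message in err.
 */
int sketch_write_graph(const char *obj_dir, int simulate_failure,
		       char *err, size_t errlen)
{
	const char *suffix = "/info/commit-graph";
	size_t len = strlen(obj_dir) + strlen(suffix) + 1;
	char *graph_name = malloc(len);
	int ret = 0;

	if (!graph_name) {
		snprintf(err, errlen, "out of memory");
		return -1;
	}
	snprintf(graph_name, len, "%s%s", obj_dir, suffix);

	/* Stand-in for a step that can fail, e.g. creating leading directories. */
	if (simulate_failure) {
		snprintf(err, errlen,
			 "unable to create leading directories of %s", graph_name);
		ret = -1;
	}

	free(graph_name);	/* single cleanup point shared by both paths */
	return ret;
}

/*
 * Hypothetical top-level command (the builtin/ layer in this analogy):
 * the only place where an error becomes a non-zero exit status.
 */
int sketch_cmd_write(const char *obj_dir, int simulate_failure)
{
	char err[256];

	if (sketch_write_graph(obj_dir, simulate_failure, err, sizeof(err)) < 0) {
		fprintf(stderr, "fatal: %s\n", err);
		return 128;
	}
	return 0;
}
```

With this shape there is nothing left to UNLEAK in the lower layer,
because the function always reaches its own `free()` before returning.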

> My worry is that one of these would seem to be true:
>
> * UNLEAK is unsuitable for the job. Whenever we have a `die()` as we do
>   here, we can UNLEAK the variables we know of, but we can't do anything
>   about the allocations we have made higher up the call-chain.

IMHO that is an issue for the functions higher up the call chain and ought
not to affect this patch. By doing the right thing here locally, the code base
will approach a good state eventually.

> Our test
>   suite obviously provokes lots of calls to `die()` -- imagine that each
>   of those leaves a few leaked allocations behind. We'd have a semi-huge
>   number of leaks being reported. While we could mark with UNLEAK to
>   reduce that number, we wouldn't be able to bring the number of leaks
>   down to anywhere near manageable where we'd be able to find the last
>   few true positives.

Makes sense.

> * We add code with no purpose. In this case, we're not talking a lot of
>   lines, but across the code base, if they bring no gain, they are bound
>   to provide a negative net value given enough time.

I see. I did not estimate its negative impact to be that high, as the
UNLEAK near a die() call seemed an obviously good thing (locally).

I don't know what the best way to proceed is in this case.

Thanks,
Stefan
