All of lore.kernel.org
 help / color / mirror / Atom feed
From: Michael Haggerty <mhagger@alum.mit.edu>
To: "Martin Ågren" <martin.agren@gmail.com>
Cc: Junio C Hamano <gitster@pobox.com>,
	Git Mailing List <git@vger.kernel.org>
Subject: Re: [PATCH 1/2] refs/files-backend: duplicate strings added to affected_refnames
Date: Mon, 28 Aug 2017 10:06:57 +0200	[thread overview]
Message-ID: <CAMy9T_FUGehTbfVfVfi877Gi3DOR5Z45A-H-v92s7g=m7RVnSw@mail.gmail.com> (raw)
In-Reply-To: <CAN0heSq8RdH5vWFgq1UvJOfWerMJSZwhV4FMCjvA=XUqu2OQQQ@mail.gmail.com>

On Sat, Aug 26, 2017 at 12:16 PM, Martin Ågren <martin.agren@gmail.com> wrote:
> On 25 August 2017 at 23:00, Junio C Hamano <gitster@pobox.com> wrote:
>> Martin Ågren <martin.agren@gmail.com> writes:
>>
>>> files_transaction_prepare() and the functions it calls add strings to a
>>> string list without duplicating them, i.e., we keep the original raw
>>> pointers we were given. That is "ok", since we keep them only for a
>>> short-enough time, but we end up leaking some of them.
>>
>> [...]

Good find, Martin.

First of all, you are right that we don't want any memory leaks here.
Nobody promises that the program will end soon if a reference update
fails. (In fact, there are invocations of `git` that trigger multiple
reference updates.) This is a small leak, but we should fix it.

The problem (if I may take a stab at explaining it in my own words) is
that `files_transaction_prepare()` uses a `string_list` called
`affected_refnames` to ensure that the same reference name is not
modified more than once in a single reference transaction. Currently,
`affected_refnames` is configured *not* to duplicate the refnames that
are fed to it, which also means that it doesn't free the refnames when
it is cleared.

This is correct for most refnames, which are owned by entries in the
`ref_transaction`, and therefore have a longer lifetime than
`affected_refnames`.

But there is one code path that can add a refname to
`affected_refnames` without making a provision for the refname ever to
be freed:

* files_transaction_prepare()
  * lock_ref_for_update()
    * split_symref_update()
      * item = string_list_insert(affected_refnames, referent)

The `referent` in the last statement comes from a `strbuf` that is
created in `lock_ref_for_update()` then passed to `lock_raw_ref()`,
which fills it.

It would be a serious bug if `lock_ref_for_update()` would dispose of
`referent`, because the pointer in `affected_refnames` would point at
freed memory. But in fact we have the opposite problem;
`lock_ref_for_update()` never frees the memory (it doesn't even
`strbuf_detach()` it). So that memory is leaked.

Your proposed solution is to change `affected_refnames` to duplicate
the strings that are stored in it, in which case
`lock_ref_for_update()` can properly dispose of `referent`, fixing the
leak. This works, but at the price of having to allocate memory for
*all* references in the update, even though most of them are already
fine.

But note that `split_symref_update()` *also* passes a copy of
`referent` to `ref_transaction_add_update()`, which *already* makes a
copy of the reference name and adds an entry containing the name to
the `ref_transaction`. If we would store *that* copy to
`affected_refnames`, then it would get freed when the
`ref_transaction` is cleaned up, and we could fix the memory leak
without allocating any new memory. This requires a little reorg of
`split_symref_update()` but it's not too bad:

* Do the initial check using `string_list_has_string()` rather than
calling `string_list_insert()` already.
* After `new_update` has been created, call `string_list_insert()`,
passing it `new_update->refname` as the string.

If this is done in place of your first commit, then your second commit
could be left unchanged.

Michael

  reply	other threads:[~2017-08-28  9:18 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-08-25 18:49 [PATCH 1/2] refs/files-backend: duplicate strings added to affected_refnames Martin Ågren
2017-08-25 18:49 ` [PATCH 2/2] refs/files-backend: fix memory leak in lock_ref_for_update Martin Ågren
2017-08-25 21:00 ` [PATCH 1/2] refs/files-backend: duplicate strings added to affected_refnames Junio C Hamano
2017-08-26 10:16   ` Martin Ågren
2017-08-28  8:06     ` Michael Haggerty [this message]
2017-08-28 10:09       ` Martin Ågren
2017-08-28 20:32         ` [PATCH v2 1/2] refs/files-backend: add longer-scoped copy of string to list Martin Ågren
2017-08-29  8:33           ` Michael Haggerty
2017-08-28 20:32         ` [PATCH v2 2/2] refs/files-backend: fix memory leak in lock_ref_for_update Martin Ågren
2017-08-29  8:39           ` Michael Haggerty
2017-08-29 10:41             ` Martin Ågren
2017-08-29 17:18               ` [PATCH v3 1/3] refs/files-backend: add longer-scoped copy of string to list Martin Ågren
2017-08-30  2:52                 ` Michael Haggerty
2017-08-30 18:02                   ` Martin Ågren
2017-09-05 10:02                     ` Junio C Hamano
2017-09-05 17:24                       ` Martin Ågren
2017-09-05 20:36                         ` Jeff King
2017-09-05 21:26                           ` Junio C Hamano
2017-09-06 18:12                             ` Martin Ågren
2017-09-06 19:52                               ` Junio C Hamano
2017-09-06 23:45                               ` Jeff King
2017-09-09  6:57                               ` [PATCH v4 0/4] Rerolling ma/split-symref-update-fix Martin Ågren
2017-09-09  6:57                                 ` [PATCH v4 1/4] refs/files-backend: add longer-scoped copy of string to list Martin Ågren
2017-09-09  6:57                                 ` [PATCH v4 2/4] refs/files-backend: fix memory leak in lock_ref_for_update Martin Ågren
2017-09-09  6:57                                 ` [PATCH v4 3/4] refs/files-backend: correct return value " Martin Ågren
2017-09-09  6:57                                 ` [PATCH v4 4/4] refs/files-backend: add `refname`, not "HEAD", to list Martin Ågren
2017-09-09 10:47                                 ` [PATCH v4 0/4] Rerolling ma/split-symref-update-fix Jeff King
2017-09-05  8:45                 ` [PATCH v3 1/3] refs/files-backend: add longer-scoped copy of string to list Jeff King
2017-09-05  9:03                   ` Michael Haggerty
2017-09-05  9:04                     ` Jeff King
2017-08-29 17:18               ` [PATCH v3 2/3] refs/files-backend: fix memory leak in lock_ref_for_update Martin Ågren
2017-09-05  8:47                 ` Jeff King
2017-09-05 17:28                   ` Martin Ågren
2017-08-29 17:18               ` [PATCH v3 3/3] refs/files-backend: correct return value " Martin Ågren

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAMy9T_FUGehTbfVfVfi877Gi3DOR5Z45A-H-v92s7g=m7RVnSw@mail.gmail.com' \
    --to=mhagger@alum.mit.edu \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=martin.agren@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.