From: Andrzej Hunt <andrzej@ahunt.org>
To: phillip.wood@dunelm.org.uk, Elijah Newren <newren@gmail.com>
Cc: Git Mailing List <git@vger.kernel.org>
Subject: Re: [PATCH 11/12] builtin/rebase: fix options.strategy memory lifecycle
Date: Sun, 25 Jul 2021 15:03:21 +0200
Message-ID: <9f298c97-07d6-7117-baab-6a44359c44d2@ahunt.org>
In-Reply-To: <d1ef45c1-067e-abde-62a2-1df2c12ba3a3@gmail.com>



On 22/06/2021 11:02, Phillip Wood wrote:
> Hi Elijah
> 
> On 21/06/2021 22:39, Elijah Newren wrote:
>> On Sun, Jun 20, 2021 at 11:29 AM Phillip Wood <phillip.wood123@gmail.com> wrote:
>>>
>>> Hi Andrzej
>>>
>>> Thanks for working on removing memory leaks from git.
>>>
>>> On 20/06/2021 16:12, andrzej@ahunt.org wrote:
>>>> From: Andrzej Hunt <ajrhunt@google.com>
>>>>
>>>> This change:
>>>> - xstrdup()'s all strings being used for replace_opts.strategy, to
>>>
>>> I think you mean replay_opts rather than replace_opts.
>>>
>>>>     guarantee that replace_opts owns these strings. This is needed because
>>>>     sequencer_remove_state() will free replace_opts.strategy, and it's
>>>>     usually called as part of the usage of replace_opts.
>>>> - Removes xstrdup()'s being used to populate options.strategy in
>>>>     cmd_rebase(), which avoids leaking options.strategy, even in the
>>>>     case where strategy is never moved/copied into replace_opts.
>>>
>>>
>>>> These changes are needed because:
>>>> - We would always create a new string for options.strategy if we either
>>>>     get a strategy via options (OPT_STRING(...strategy...), or via
>>>>     GIT_TEST_MERGE_ALGORITHM.
>>>> - But only sometimes is this string copied into replace_opts - in which
>>>>     case it did get free()'d in sequencer_remove_state().
>>>> - The rest of the time, the newly allocated string would remain unused,
>>>>     causing a leak. But we can't just add a free because that can result
>>>>     in a double-free in those cases where replace_opts was populated.
>>>>
>>>> An alternative approach would be to set options.strategy to NULL when
>>>> moving the pointer to replace_opts.strategy, combined with always
>>>> free()'ing options.strategy, but that seems like a more
>>>> complicated and wasteful approach.
>>>
>>> read_basic_state() contains
>>>          if (file_exists(state_dir_path("strategy", opts))) {
>>>                  strbuf_reset(&buf);
>>>                  if (!read_oneliner(&buf, state_dir_path("strategy", opts),
>>>                                     READ_ONELINER_WARN_MISSING))
>>>                          return -1;
>>>                  free(opts->strategy);
>>>                  opts->strategy = xstrdup(buf.buf);
>>>          }
>>>
>>> So we do try to free opts->strategy when reading the state from disc and
>>> we allocate a new string. I suspect that opts->strategy is actually NULL
>>> when this function is called but I haven't checked.

Thank you for noticing this. I think you're right - running the whole
test suite against an ASAN build also didn't catch any double-frees,
which mostly confirms that opts->strategy is indeed always NULL here.
But that's not a good reason for taking the risk.

>>> Given that we are
>>> allocating a copy above I think maybe your alternative approach of
>>> always freeing opts->strategy would be better.

I will go down this route for V2. Although on further thought: instead 
of my original idea of moving the string to replay_opts (and NULL'ing 
out rebase_options->strategy), I think it's better to create a new copy 
when populating replay_opts. The move/NULL approach I suggested in V1 
happens to work OK, but I think it's non-obvious and could break if we 
ever wanted to use get_replay_opts() more than once - creating separate 
copies reduces the number of surprises.
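
To make the ownership rule concrete, here is a rough standalone sketch
of the copy-on-populate idea. It is not the actual V2 patch: the
*_sketch structs and functions and dup_or_null() are made-up stand-ins
(git's real helper would be xstrdup_or_null()), and everything except
the strategy field is left out.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the rebase options struct. */
struct rebase_options_sketch {
	char *strategy; /* owned by this struct, freed by its owner */
};

/* Hypothetical stand-in for struct replay_opts. */
struct replay_opts_sketch {
	char *strategy; /* owned by this struct, freed separately */
};

/* Simplified stand-in for xstrdup_or_null(): NULL stays NULL. */
static char *dup_or_null(const char *s)
{
	char *copy;

	if (!s)
		return NULL;
	copy = malloc(strlen(s) + 1);
	assert(copy); /* git's helper dies on allocation failure; we just assert */
	strcpy(copy, s);
	return copy;
}

/* Populate replay opts with its own copy; the source is left untouched. */
static struct replay_opts_sketch get_replay_opts_sketch(
		const struct rebase_options_sketch *opts)
{
	struct replay_opts_sketch replay = { NULL };

	replay.strategy = dup_or_null(opts->strategy);
	return replay;
}

/* Stand-in for the free(opts->strategy) done when sequencer state is removed. */
static void release_replay_opts_sketch(struct replay_opts_sketch *replay)
{
	free(replay->strategy);
	replay->strategy = NULL;
}
```

Because every caller gets an independent copy, calling
get_replay_opts_sketch() twice and releasing both results cannot free
the same pointer twice, and the rebase side can unconditionally free
its own string at the end.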

>>
>> Good catches.  sequencer_remove_state() in sequencer.c also has a
>> free(opts->strategy) call.
>>
>> To make things even more muddy, we have code like
>>      replay.strategy = replay.default_strategy;
>> or
>>      opts->strategy = opts->default_strategy;
>> which both will probably work really poorly with the calls to
>>      free(opts->default_strategy);
>>      free(opts->strategy);
>> from sequencer_remove_state().  I suspect we've got a few bugs here...
> 
> It's not immediately obvious but I think those are actually safe. 
> opts->default_strategy is allocated by sequencer_init_config() so it is 
> correct to free it and when we assign it in rebase.c we do
> 
>      else if (!replay.strategy && replay.default_strategy) {
>          replay.strategy = replay.default_strategy;
>          replay.default_strategy = NULL;
>      }
> 
> so there is no double free.

As mentioned above, ASAN isn't catching any double-frees here (but I 
guess that depends on whether or not you trust the test suite to be 
reasonably testing all permutations).

But it's still good to take note of sequencer_remove_state() free'ing 
opts->strategy, because I almost did manage to add a double free when I 
added a free(options.strategy) to cmd_rebase without also xstrdup'ing 
strategy in get_replay_opts().
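
For reference, this is how I read the move-and-NULL pattern quoted
above, reduced to a standalone sketch. The struct and function names
are hypothetical and only mirror the two fields under discussion;
remove_state_sketch() only imitates the two free() calls Elijah
pointed out, not the real sequencer_remove_state().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical reduction of replay_opts to the two fields in question. */
struct seq_opts_sketch {
	char *strategy;
	char *default_strategy;
};

/* Transfer ownership of the default to strategy, leaving a single owner. */
static void adopt_default_strategy(struct seq_opts_sketch *opts)
{
	assert(opts);
	if (!opts->strategy && opts->default_strategy) {
		opts->strategy = opts->default_strategy;
		opts->default_strategy = NULL; /* source gives up ownership */
	}
}

/* Imitates the two free() calls; free(NULL) is a no-op, so this is safe. */
static void remove_state_sketch(struct seq_opts_sketch *opts)
{
	assert(opts);
	free(opts->default_strategy);
	free(opts->strategy);
	opts->default_strategy = NULL;
	opts->strategy = NULL;
}
```

The hazard I nearly introduced is the variant without the NULL
assignment (or without a copy): two fields, or two structs, pointing at
one allocation, each freed by a different code path.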

> There is similar code in builtin/revert.c 
> which I think is where your other example came from. I think there is a 
> leak in builtin/revert.c though
> 
>      if (!opts->strategy && opts->default_strategy) {
>          opts->strategy = opts->default_strategy;
>          opts->default_strategy = NULL;
>      }
> 
>      /* do some other stuff */
> 
>      /* These option values will be free()d */
>      opts->gpg_sign = xstrdup_or_null(opts->gpg_sign);
>      opts->strategy = xstrdup_or_null(opts->strategy);
> 
> So we copy the default strategy, leaking the original copy from 
> sequencer_init_options() if --strategy isn't given on the command line. 
> I think it would be simple to fix this by making the copy earlier.
> 
>      if (!opts->strategy && opts->default_strategy) {
>          opts->strategy = opts->default_strategy;
>          opts->default_strategy = NULL;
>      } else if (opts->strategy) {
>      /* This option will be free()d in sequencer_remove_state() */
>          opts->strategy = xstrdup(opts->strategy);
>      }
> 

Nice find. I'm noticing a lot of interesting leaks in git's options 
handling, and those leaks also tend to be the trickiest ones to fix (as 
my blunder in the original version of this patch demonstrates :) ).
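
My reading of the ordering you suggest for builtin/revert.c, as a
standalone sketch (hypothetical names throughout; dup_string() stands
in for xstrdup(), and I am assuming a command-line strategy is a
borrowed pointer before this step):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical reduction of the options struct to the two fields involved. */
struct revert_opts_sketch {
	char *strategy;         /* borrowed (e.g. from argv) before the fixup */
	char *default_strategy; /* heap-allocated from config, or NULL */
};

/* Simplified stand-in for xstrdup(). */
static char *dup_string(const char *s)
{
	char *copy = malloc(strlen(s) + 1);

	assert(copy);
	strcpy(copy, s);
	return copy;
}

/*
 * After this call opts->strategy is either NULL or a heap string owned
 * by opts, and no allocation has been orphaned along the way.
 */
static void own_strategy(struct revert_opts_sketch *opts)
{
	if (!opts->strategy && opts->default_strategy) {
		/* Move the already-owned default; no copy, so no leak. */
		opts->strategy = opts->default_strategy;
		opts->default_strategy = NULL;
	} else if (opts->strategy) {
		/* Borrowed string: take a copy that can be free()d later. */
		opts->strategy = dup_string(opts->strategy);
	}
}
```

The point of doing the copy in the else branch is that the moved
default never gets duplicated, so the original allocation is not lost.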

ATB,

   Andrzej
