git.vger.kernel.org archive mirror
From: Mike Hommey <mh@glandium.org>
To: Jeff King <peff@peff.net>
Cc: Elijah Newren <newren@gmail.com>, git@vger.kernel.org, gitster@pobox.com
Subject: Re: [PATCH 2/2] fast-import: duplicate into history rather than passing ownership
Date: Sun, 25 Aug 2019 19:02:13 +0900
Message-ID: <20190825100213.fssjydohathfhhe5@glandium.org>
In-Reply-To: <20190825081055.GB31824@sigill.intra.peff.net>

On Sun, Aug 25, 2019 at 04:10:55AM -0400, Jeff King wrote:
> Fast-import's read_next_command() has somewhat odd memory ownership
> semantics for the command_buf strbuf. After reading a command, we copy
> the strbuf's pointer (without duplicating the string) into our cmd_hist
> array of recent commands. And then when we're about to read a new
> command, we clear the strbuf by calling strbuf_detach(), dropping
> ownership from the strbuf (leaving the cmd_hist reference as the
> remaining owner).
> 
> This has a few surprising implications:
> 
>   - if the strbuf hasn't been copied into cmd_hist (e.g., because we
>     haven't read any commands yet), then the strbuf_detach() will leak
>     the resulting string
> 
>   - any modification to command_buf risks invalidating the pointer held
>     by cmd_hist. There doesn't seem to be any way to trigger this
>     currently (since we tend to modify it only by detaching and reading
>     in a new value), but it's subtly dangerous.
> 
>   - any pointers into an input string will remain valid as long as
>     cmd_hist points to them. So in general, you can point into
>     command_buf.buf and call read_next_command() up to 100 times before
>     your string is cycled out and freed, leaving you with a dangling
>     pointer. This makes it easy to miss bugs during testing, as they
>     might trigger only for a sufficiently large commit (e.g., the bug
>     fixed in the previous commit).
> 
> Instead, let's make a new string to copy the command into the history
> array, rather than having dual ownership with the old. Then we can drop
> the strbuf_detach() calls entirely, and just reuse the same buffer
> within command_buf over and over. We'd normally have to strbuf_reset()
> it before using it again, but in both cases here we're using
> strbuf_getline(), which does it automatically for us.
> 
> This fixes the leak, and it means that even a single call to
> read_next_command() will invalidate any held pointers, making it easier
> to find bugs. In fact, we can drop the extra input lines added to the
> test case by the previous commit, as the unfixed bug would now trigger
> just from reading the commit message, even without any modified files in
> the commit.
> 
> Reported-by: Mike Hommey <mh@glandium.org>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
>  fast-import.c          | 4 +---
>  t/t9300-fast-import.sh | 5 -----
>  2 files changed, 1 insertion(+), 8 deletions(-)
> 
> diff --git a/fast-import.c b/fast-import.c
> index ee7258037a..1f9160b645 100644
> --- a/fast-import.c
> +++ b/fast-import.c
> @@ -1763,7 +1763,6 @@ static int read_next_command(void)
>  		} else {
>  			struct recent_command *rc;
>  
> -			strbuf_detach(&command_buf, NULL);
>  			stdin_eof = strbuf_getline_lf(&command_buf, stdin);
>  			if (stdin_eof)
>  				return EOF;
> @@ -1784,7 +1783,7 @@ static int read_next_command(void)
>  				free(rc->buf);
>  			}
>  
> -			rc->buf = command_buf.buf;
> +			rc->buf = xstrdup(command_buf.buf);

You could xstrndup(command_buf.buf, command_buf.len), which would avoid
a hidden strlen.
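
To illustrate what I mean by a hidden strlen, here is a minimal
standalone sketch. It does not use git's strbuf or its x* wrappers:
struct lenbuf, dup_by_scan() and dup_by_len() are hypothetical names
made up for this example. The first helper rescans the string for its
terminating NUL, as a strdup-style copy has to; the second reuses the
length the buffer already tracks.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a strbuf: a buffer plus its known length. */
struct lenbuf {
	char *buf;
	size_t len;
};

/* strdup-style copy: has to scan for the terminating NUL first. */
static char *dup_by_scan(const struct lenbuf *sb)
{
	size_t n = strlen(sb->buf);	/* the "hidden strlen" */
	char *out = malloc(n + 1);

	if (!out)
		return NULL;
	memcpy(out, sb->buf, n + 1);	/* copies the NUL as well */
	return out;
}

/* Length-based copy: reuses the tracked length, no scan of the input. */
static char *dup_by_len(const struct lenbuf *sb)
{
	char *out = malloc(sb->len + 1);

	if (!out)
		return NULL;
	memcpy(out, sb->buf, sb->len);
	out[sb->len] = '\0';		/* terminate explicitly */
	return out;
}
```

Both produce the same string whenever the buffer holds no embedded NUL.
One caveat: a strndup-style helper generally still has to look for a
NUL within the first len bytes, so a memcpy-based copy like the second
helper (in git, I believe xmemdupz() plays that role) is the variant
that skips the scan entirely.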

Mike
