From: Phillip Wood <phillip.wood@talktalk.net>
To: Eric Sunshine <sunshine@sunshineco.com>, git@vger.kernel.org
Cc: Johannes Schindelin <Johannes.Schindelin@gmx.de>,
	Junio C Hamano <gitster@pobox.com>,
	Akinori MUSHA <knu@idaemons.org>
Subject: Re: [PATCH v2 3/4] sequencer: fix "rebase -i --root" corrupting author header timestamp
Date: Tue, 31 Jul 2018 11:00:56 +0100
Message-ID: <31870bca-329f-2451-750a-56d917153844@talktalk.net>
In-Reply-To: <20180731073331.40007-4-sunshine@sunshineco.com>

On 31/07/18 08:33, Eric Sunshine wrote:
> When "git rebase -i --root" creates a new root commit, it corrupts the
> "author" header's timestamp by prepending a "@":
> 
>      author A U Thor <author@example.com> @1112912773 -0700
> 
> The commit parser is very strict about the format of the "author"
> header, and does not allow a "@" in that position.
> 
> The "@" comes from GIT_AUTHOR_DATE in "rebase-merge/author-script",
> signifying a Unix epoch-based timestamp, however, read_author_ident()
> incorrectly allows it to slip into the commit's "author" header, thus
> corrupting it.
> 
> One possible fix would be simply to filter out the "@" when constructing
> the "author" header timestamp, however, a more correct fix is to parse
> the GIT_AUTHOR_DATE date (via parse_date()) and format the parsed result
> into the "author" header. Since "rebase-merge/author-script" may be
> edited by the user, this approach has the extra benefit of catching
> other potential timestamp corruption due to hand-editing.
> 
> We can do better than calling parse_date() ourselves and constructing
> the "author" header manually, however, by instead taking advantage of
> fmt_ident() which does this work for us.
> 
> The benefits of using fmt_ident() are twofold. First, it simplifies the
> logic considerably by allowing us to avoid the complexity of building
> the "author" header in parallel with and in the same buffer from which
> "rebase-merge/author-script" is being parsed. Instead, fmt_ident() is
> invoked to compose the header after parsing is complete.
> 
> Second, fmt_ident() is careful to prevent "crud" from polluting the
> composed ident. As with validating GIT_AUTHOR_DATE, this "crud"
> avoidance prevents other (possibly hand-edited) bogus author information
> from "rebase-merge/author-script" from corrupting the commit object.
> 
> Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
> ---
>   sequencer.c                   | 23 +++++++++--------------
>   t/t3404-rebase-interactive.sh |  2 +-
>   2 files changed, 10 insertions(+), 15 deletions(-)
> 
> diff --git a/sequencer.c b/sequencer.c
> index 1008f6d71a..15a66a334c 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -709,14 +709,16 @@ static const char *read_author_ident(struct strbuf *buf)
>   	const char *keys[] = {
>   		"GIT_AUTHOR_NAME=", "GIT_AUTHOR_EMAIL=", "GIT_AUTHOR_DATE="
>   	};
> -	char *in, *out, *eol;
> -	int i = 0, len;
> +	struct strbuf out = STRBUF_INIT;
> +	char *in, *eol;
> +	const char *val[3];
> +	int i = 0;
>   
>   	if (strbuf_read_file(buf, rebase_path_author_script(), 256) <= 0)
>   		return NULL;
>   
>   	/* dequote values and construct ident line in-place */
> -	for (in = out = buf->buf; i < 3 && in - buf->buf < buf->len; i++) {
> +	for (in = buf->buf; i < 3 && in - buf->buf < buf->len; i++) {
>   		if (!skip_prefix(in, keys[i], (const char **)&in)) {
>   			warning("could not parse '%s' (looking for '%s'",
>   				rebase_path_author_script(), keys[i]);
> @@ -730,16 +732,7 @@ static const char *read_author_ident(struct strbuf *buf)
>   				keys[i], rebase_path_author_script());
>   			return NULL;
>   		}
> -		len = strlen(in);
> -
> -		if (i > 0) /* separate values by spaces */
> -			*(out++) = ' ';
> -		if (i == 1) /* email needs to be surrounded by <...> */
> -			*(out++) = '<';
> -		memmove(out, in, len);
> -		out += len;
> -		if (i == 1) /* email needs to be surrounded by <...> */
> -			*(out++) = '>';
> +		val[i] = in;
>   		in = eol + 1;
>   	}
>   
> @@ -749,7 +742,9 @@ static const char *read_author_ident(struct strbuf *buf)
>   		return NULL;
>   	}
>   
> -	strbuf_setlen(buf, out - buf->buf);
> +	strbuf_addstr(&out, fmt_ident(val[0], val[1], val[2], 0));
> +	strbuf_swap(buf, &out);
> +	strbuf_release(&out);
>   	return buf->buf;
>   }

This is a welcome simplification.

> diff --git a/t/t3404-rebase-interactive.sh b/t/t3404-rebase-interactive.sh
> index fd3a18154e..d340018781 100755
> --- a/t/t3404-rebase-interactive.sh
> +++ b/t/t3404-rebase-interactive.sh

Now that the author is correct, can we test_cmp() it against its
expected value to make sure there are no hidden surprises in the name
and email in the future? (It would be reassuring to test an author with
"'" in the name as well, but that is out of scope for this series.)

> @@ -1420,7 +1420,7 @@ test_expect_success 'valid author header after --root swap' '
+	git cat-file commit HEAD^ |grep ^author >expected &&
>   	set_fake_editor &&
>   	FAKE_LINES="2 1" git rebase -i --root &&
>   	git cat-file commit HEAD^ >out &&
-	git cat-file commit HEAD^ >out &&
> -	grep "^author ..*> @[0-9][0-9]* [-+][0-9][0-9][0-9][0-9]$" out
+	git cat-file commit HEAD^ |grep ^author >out &&
+	test_cmp expected out
> +	grep "^author ..*> [0-9][0-9]* [-+][0-9][0-9][0-9][0-9]$" out
>   '
>   test_done
> 

