git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jeff King <peff@peff.net>
To: Elijah Newren via GitGitGadget <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org, jrnieder@gmail.com,
	Elijah Newren <newren@gmail.com>
Subject: Re: [PATCH v2] fast-import: add new --date-format=raw-permissive format
Date: Fri, 29 May 2020 02:13:46 -0400	[thread overview]
Message-ID: <20200529061346.GB1294228@coredump.intra.peff.net> (raw)
In-Reply-To: <pull.795.v2.git.git.1590698437607.gitgitgadget@gmail.com>

On Thu, May 28, 2020 at 08:40:37PM +0000, Elijah Newren via GitGitGadget wrote:

>      * Instead of just allowing such timezones outright, did it behind a
>        --date-format=raw-permissive as suggested by Jonathan

Thanks, I like the safety this gives us against importer bugs. It does
mean that people doing "export | filter | import" may have to manually
loosen it, but it should be rare enough that this isn't a big burden
(and if they're rewriting anyway, maybe it gives them a chance to decide
how to fix it up).

>  fast-import.c          | 15 ++++++++++++---
>  t/t9300-fast-import.sh | 30 ++++++++++++++++++++++++++++++

Would we need a documentation update for "raw-permissive", too?

> @@ -1929,7 +1931,8 @@ static int validate_raw_date(const char *src, struct strbuf *result)
>  		return -1;
>  
>  	num = strtoul(src + 1, &endp, 10);
> -	if (errno || endp == src + 1 || *endp || 1400 < num)
> +	out_of_range_timezone = strict && (1400 < num);
> +	if (errno || endp == src + 1 || *endp || out_of_range_timezone)
>  		return -1;

It's a little funny to do computations on the result of a function
before we've checked whether it produced an error. But since the result
is just an integer, I don't think we'd do anything unexpected (we might
just throw away the value, though I imagine an optimizing compiler might
evaluate it lazily anyway).

> @@ -1963,7 +1966,11 @@ static char *parse_ident(const char *buf)
>  
>  	switch (whenspec) {
>  	case WHENSPEC_RAW:
> -		if (validate_raw_date(ltgt, &ident) < 0)
> +		if (validate_raw_date(ltgt, &ident, 1) < 0)
> +			die("Invalid raw date \"%s\" in ident: %s", ltgt, buf);
> +		break;
> +	case WHENSPEC_RAW_PERMISSIVE:
> +		if (validate_raw_date(ltgt, &ident, 0) < 0)
>  			die("Invalid raw date \"%s\" in ident: %s", ltgt, buf);
>  		break;

This looks simple enough. We have the bogus date in a buffer at that
point, and we stuff that string literally into the commit (or tag)
object. If we ever get more clever about storing the timestamp
internally as an integer, we may need to get more clever. But your new
test should alert us if that becomes the case.

> +test_expect_success 'B: accept invalid timezone with raw-permissive' '
> +	cat >input <<-INPUT_END &&
> +	commit refs/heads/invalid-timezone
> +	committer $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> 1234567890 +051800
> +	data <<COMMIT
> +	empty commit
> +	COMMIT
> +	INPUT_END
> +
> +	test_when_finished "git update-ref -d refs/heads/invalid-timezone
> +		git gc
> +		git prune" &&

We check the exit code of when_finished commands, so this should be
&&-chained as usual. And possibly indented like:

  test_when_finished "
    git update-ref -d refs/heads/invalid-timezone &&
    git gc &&
    git prune
  " &&

but I see this all is copying another nearby test. So I'm OK with
keeping it consistent for now and cleaning up the whole thing later.
Though if you want to do that step now, I have no objection. :)

I also also suspect this "gc" is not useful these days. For a small
input like this, fast-import will write loose objects, since d9545c7f46
(fast-import: implement unpack limit, 2016-04-25).

TBH, I think it would be easier to understand as:

  ...
  git init invalid-timezone &&
  git -C invalid-timezone fast-import <input &&
  git -C invalid-timezone cat-file -p master >out &&
  ...

and don't bother with the when_finished at all. Then you don't have to
wonder whether the cleanup was sufficient, and it's fewer processes
to boot (we'll leave the sub-repo cruft sitting around, but a single "rm
-rf" at the end of the test script will wipe them all out).

-Peff

  parent reply	other threads:[~2020-05-29  6:13 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-05-28 19:15 [PATCH] fast-import: accept invalid timezones so we can import existing repos Elijah Newren via GitGitGadget
2020-05-28 19:26 ` Jonathan Nieder
2020-05-28 20:40 ` [PATCH v2] fast-import: add new --date-format=raw-permissive format Elijah Newren via GitGitGadget
2020-05-28 23:08   ` Junio C Hamano
2020-05-29  0:20   ` Jonathan Nieder
2020-05-29  6:13   ` Jeff King [this message]
2020-05-29 17:19     ` Junio C Hamano
2020-05-30 20:25   ` [PATCH v3] " Elijah Newren via GitGitGadget
2020-05-30 23:13     ` Jeff King
2021-02-03 11:57     ` Why does fast-import need to check the validity of idents? + Other ident adventures =?utf-8?B?w4Z2YXIgQXJuZmrDtnLDsA==?= Bjarmason
2021-02-03 19:20       ` Junio C Hamano
2021-02-05 15:25         ` Ævar Arnfjörð Bjarmason

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200529061346.GB1294228@coredump.intra.peff.net \
    --to=peff@peff.net \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=jrnieder@gmail.com \
    --cc=newren@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).