From: "Ævar Arnfjörð Bjarmason" <>
To: Junio C Hamano <>
Cc: Elijah Newren via GitGitGadget <>,
	Elijah Newren <>
Subject: Re: Why does fast-import need to check the validity of idents? + Other ident adventures
Date: Fri, 05 Feb 2021 16:25:23 +0100
Message-ID: <>
In-Reply-To: <>

On Wed, Feb 03 2021, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason <>
> writes:
>> But I was wondering about fast-import.c in particular. I think Elijah's
>> patch here is obviously good as an incremental improvement. But stepping
>> back a bit: who cares about sort-of-fsck validation in fast-import.c
>> anyway?
> Those who want to notice and verify the procedure they used to
> produce the import data from the original before it is too late?
> I.e. data gets imported to Git, victory is declared, and then the old
> SCM gets turned off---and only then is the resulting imported
> repository found not to pass fsck.
>> Shouldn't it just pretty much be importing data as-is, and then we could
>> document "if you don't trust it, run fsck afterwards"?
> If it is a small import, the distinction does not matter, but for a
> huge import, the procedure to produce the data is likely to be
> mechanical, so even after processing just a very small portion of
> the early part of the datastream, systematic errors would be noticed
> before fast-import wastes time importing too much garbage that needs
> to be discarded after running such fsck.  So in that sense, I suspect that
> there is value in the early validation.

What I was fishing for here is this: in the time since fast-import was
originally written, the use-case of converting primary data in-place on
a server may have become too obscure to care about, as opposed to doing
the conversion locally and then "git push"-ing the result to a server
that has transfer.fsckObjects enabled.
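
To make the early-validation idea quoted above concrete, here is a minimal
sketch of a fail-fast pre-check over a fast-import stream. It is an
illustration under stated assumptions, not what fast-import.c does: the
names first_bad_ident and IDENT_RE are hypothetical, the pattern only covers
the raw date format ("<seconds> <+hhmm|-hhmm>"), and only the exact-byte-count
form of "data" is skipped (the delimited "data <<EOF" form is not handled).

```python
import io
import re

# Simplified ident pattern (an assumption for illustration only):
#   ("author" | "committer" | "tagger") [name] <email> <epoch seconds> <offset>
IDENT_RE = re.compile(
    rb"(?:author|committer|tagger) "
    rb"(?:[^<>\n]* )?"   # optional name, followed by a space
    rb"<[^<>\n]*> "      # email in angle brackets
    rb"\d+ [+-]\d{4}"    # raw date: epoch seconds and a +hhmm/-hhmm offset
)
IDENT_PREFIXES = (b"author ", b"committer ", b"tagger ")


def first_bad_ident(stream):
    """Return the first malformed ident line in a binary stream, or None.

    Stops at the first problem, so a systematic error in a huge stream
    is reported after reading only its early part.
    """
    while True:
        line = stream.readline()
        if not line:
            return None  # end of stream, nothing malformed was seen
        line = line.rstrip(b"\n")
        if line.startswith(b"data ") and line[5:].isdigit():
            # Skip the payload so its contents are never parsed as commands.
            stream.read(int(line[5:]))
            continue
        if line.startswith(IDENT_PREFIXES) and not IDENT_RE.fullmatch(line):
            return line
```

A converter could run something like first_bad_ident(io.BytesIO(chunk)) on
the first part of its output before handing the whole stream to fast-import.
The alternative discussed here is to skip such a check entirely and rely on
"git fsck" or a transfer.fsckObjects-enabled receiver afterwards.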

>> Or, if it's a use-case people actually care about, then I might see
>> about unifying some of these parser functions as part of a series I'm
>> preparing.
> I think allowing people to loosen particular checks for fast-import
> (or elsewhere for that matter) is a good idea, and you can do so
> more easily once the existing checking is migrated to your new
> scheme that shares code with the fsck machinery.

...all right, depending on how much of a hassle that is, I might just add
tests for the differences and leave this particular problem to someone
else :)


Thread overview: 12+ messages
2020-05-28 19:15 [PATCH] fast-import: accept invalid timezones so we can import existing repos Elijah Newren via GitGitGadget
2020-05-28 19:26 ` Jonathan Nieder
2020-05-28 20:40 ` [PATCH v2] fast-import: add new --date-format=raw-permissive format Elijah Newren via GitGitGadget
2020-05-28 23:08   ` Junio C Hamano
2020-05-29  0:20   ` Jonathan Nieder
2020-05-29  6:13   ` Jeff King
2020-05-29 17:19     ` Junio C Hamano
2020-05-30 20:25   ` [PATCH v3] " Elijah Newren via GitGitGadget
2020-05-30 23:13     ` Jeff King
2021-02-03 11:57     ` Why does fast-import need to check the validity of idents? + Other ident adventures Ævar Arnfjörð Bjarmason
2021-02-03 19:20       ` Junio C Hamano
2021-02-05 15:25         ` Ævar Arnfjörð Bjarmason [this message]
