From: Derrick Stolee <stolee@gmail.com>
To: esr@thyrsus.com, "Michal Suchánek" <msuchanek@suse.de>
Cc: Jason Pyeron <jpyeron@pdinc.us>, git@vger.kernel.org
Subject: Re: Finer timestamps and serialization in git
Date: Mon, 20 May 2019 13:22:15 -0400
Message-ID: <7e88805c-7e08-2631-599d-b47a098f1ce1@gmail.com>
In-Reply-To: <20190520163625.GA99397@thyrsus.com>

On 5/20/2019 12:36 PM, Eric S. Raymond wrote:
> Michal Suchánek <msuchanek@suse.de>:
>> On Wed, 15 May 2019 21:25:46 -0400
>> Derrick Stolee <stolee@gmail.com> wrote:
>>
>>> On 5/15/2019 8:28 PM, Eric S. Raymond wrote:
>>>> Derrick Stolee <stolee@gmail.com>:  
>>>>> What problem are you trying to solve where commit date is important?  
>>
>>>> B. Unique canonical form of import-stream representation.
>>>>
>>>> Reposurgeon is a very complex piece of software with subtle failure
>>>> modes.  I have a strong need to be able to regression-test its
>>>> operation.  Right now there are important cases in which I can't do
>>>> that because (a) the order in which it writes commits and (b) how it
>>>> colors branches, are both phase-of-moon dependent.  That is, the
>>>> algorithms may be deterministic but they're not documented and seem to
>>>> be dependent on variables that are hidden from me.
>>>>
>>>> Before import streams can have a canonical output order without hidden
>>>> variables (e.g. depending only on visible metadata) in practice, that
>>>> needs to be possible in principle. I've thought about this a lot and
>>>> not only are unique commit timestamps the most natural way to make
>>>> it possible, they're the only way consistent with the reality that
>>>> commit comments may be altered for various good reasons during
>>>> repository translation.  
>>>
>>> If you are trying to debug or test something, why don't you serialize
>>> the input you are using for your test?
>>
>> And that's the problem. Serialization of a git repository is not stable
>> because there is no total ordering on commits. And for testing you need
>> to serialize some 'before' and 'after' state and they can be totally
>> different. Not because the repository state is totally different but
>> because the serialization of the state is not stable.
> 
> Yes, msuchanek is right - that is exactly the problem.  Very well put.
> 
> git fast-import streams *are* the serialization; they're what reposurgeon
> ingests and emits.  The concrete problem I have is that there is no stable
> correspondence between a repository and one canonical fast-import
> serialization of it.
> 
> That is a bigger pain in the ass than you will be able to imagine unless
> and until you try writing surgical tools yourself and discover that you
> can't write tests for them.

What it sounds like you are doing is piping a 'git fast-import' process into
reposurgeon, and testing that reposurgeon does the same thing every time.
Of course this won't be consistent if 'git fast-import' isn't consistent.

But what you should do instead is store a fixed file from one run of
'git fast-import' and send that file to reposurgeon for the repeated test.
Don't rely on fast-import being consistent and instead use fixed input for
your test.
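
A minimal sketch of the fixed-input approach, in Python. Everything named
here is an assumption for illustration: `transform` is a toy stand-in for
the tool under test (it is not reposurgeon's API), and the stream is inlined
where a real suite would read a captured file checked in next to the test.

```python
# Hypothetical stand-in for the tool under test: rename one branch ref.
def transform(stream: bytes) -> bytes:
    return stream.replace(b"refs/heads/master", b"refs/heads/main")

# Input captured once from a single run and then frozen.
FIXED_INPUT = b"commit refs/heads/master\nmark :1\ndata 5\nhello\n"

# Expected output, recorded once and reviewed by hand.
EXPECTED_OUTPUT = b"commit refs/heads/main\nmark :1\ndata 5\nhello\n"

def regression_check(fn, fixed_input: bytes, expected: bytes) -> bool:
    """Return True if fn still maps the frozen input to the recorded output."""
    return fn(fixed_input) == expected
```

Because the input never changes, a failure here can only come from a change
in the tool under test, not from the order in which the stream was produced.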

If reposurgeon is providing the input to _and_ consuming the output from
'git fast-import', then yes you will need to have at least one integration
test that runs the full pipeline. But for regression tests covering complicated
logic in reposurgeon, you're better off splitting the test (or mocking out
'git fast-import' with something that provides consistent output given
fixed input).
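
One way to mock out the producer, sketched in Python under stated
assumptions: the logic under test takes the stream source as a parameter, so
a test can pass a canned function instead of a real git process.
`count_commits` and `fake_export` are hypothetical names, and the commit
counting is deliberately naive.

```python
from typing import Callable

def count_commits(export: Callable[[], bytes]) -> int:
    """Count lines that begin with 'commit ' in the stream export() returns.

    Naive on purpose: a real parser would skip over 'data' payloads.
    """
    return sum(1 for line in export().split(b"\n")
               if line.startswith(b"commit "))

def fake_export() -> bytes:
    # Canned stream standing in for the output of a real git subprocess.
    return (b"commit refs/heads/master\nmark :1\n"
            b"commit refs/heads/master\nmark :2\n")
```

A single integration test can then pass a function that runs the real
pipeline, while the bulk of the regression tests use the canned source.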

-Stolee
 


