From: "Eric S. Raymond" <esr@thyrsus.com>
To: Derrick Stolee <stolee@gmail.com>
Cc: git@vger.kernel.org
Subject: Re: Finer timestamps and serialization in git
Date: Wed, 15 May 2019 19:32:30 -0400
Message-ID: <20190515233230.GA124956@thyrsus.com>
In-Reply-To: <ae62476c-1642-0b9c-86a5-c2c8cddf9dfb@gmail.com>

Derrick Stolee <stolee@gmail.com>:
> On 5/15/2019 3:16 PM, Eric S. Raymond wrote:
> > The deeper problem is that I want something from Git that I cannot
> > have with 1-second granularity. That is: a unique timestamp on each
> > commit in a repository.
> 
> This is impossible in a distributed version control system like Git
> (where the commits are immutable). No matter your precision, there is
> a chance that two commits are created at the exact same moment on two different
> machines and then those commits are merged into the same branch.

It's easy to work around that problem. Each git daemon has to single-thread
its handling of incoming commits at some level, because you need a lock on the
file system to guarantee consistent updates to it.

So if a commit comes in with the same date as the previous commit on
the current branch, you bump the incoming commit's timestamp.
That's the simple case. The complicated case is checking for date
collisions on *other* branches. But there are ways to make that fast,
too. There's a very obvious one involving a presort that is O(log2
n) in the number of commits.
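
A minimal sketch of the bump-on-collision idea described above, not the
actual proposed implementation. It assumes timestamps are integers in the
smallest representable unit and that the repository keeps a presorted list
of every timestamp already in use. The name assign_unique_timestamp and the
list-based index are hypothetical. The lookup is a binary search; the list
insert itself is linear, so a balanced tree would be needed to keep the
whole update logarithmic.

```python
import bisect


def assign_unique_timestamp(seen, incoming):
    """Return the first unused timestamp >= incoming and record it in `seen`.

    `seen` is a sorted list of unique integer timestamps already in use
    anywhere in the repository (all branches), i.e. the presorted index.
    """
    ts = incoming
    # Binary search for the slot where `incoming` belongs: O(log n).
    i = bisect.bisect_left(seen, ts)
    # On a collision, bump by one unit and step past any run of
    # consecutive timestamps that are also taken.
    while i < len(seen) and seen[i] == ts:
        ts += 1
        i += 1
    seen.insert(i, ts)  # keeps `seen` sorted
    return ts
```

Under these assumptions, a commit arriving with timestamp 100 into a
repository that already holds 100, 101 and 103 would be stamped 102.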

I wouldn't have brought this up in the first place if I didn't have a
pretty clear idea how to do it in code!

> Even when you specify a committer, there are many environments where a set
> of parallel machines are creating commits with the same identity.

If those commit sets become the same commit in the final graph, this is
not a problem for total ordering.

> > Why do I want this? There are a number of reasons, all related to a
> > mathematical concept called "total ordering".  At present, commits in
> > a Git repository only have partial ordering. 
> 
> This is true of any directed acyclic graph. If you want a total ordering
> that is completely unambiguous, then you should think about maintaining
> a linear commit history by requiring rebasing instead of merging.

Excuse me, but your premise is incorrect.  A git DAG isn't just "any" DAG.
The presence of timestamps makes a total ordering possible.
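
As a small illustration of that claim (a sketch with made-up commit ids and
integer timestamps, not data from any real repository): sorting by timestamp
gives a total order exactly when no two commits share a timestamp. The
helper name is_total_order is hypothetical.

```python
def is_total_order(timestamps):
    """True when every timestamp is distinct, so sorting by it is unambiguous."""
    timestamps = list(timestamps)
    return len(set(timestamps)) == len(timestamps)


# Hypothetical commits: two share a timestamp, so their relative order
# cannot be decided from the timestamp alone.
tied = {"a": 100, "b": 100, "c": 101}

# The same commits after the tie has been broken by bumping one of them.
unique = {"a": 100, "b": 101, "c": 102}

# With unique timestamps, every pair of commits is comparable.
ordered = sorted(unique, key=unique.get)
```

Here is_total_order(tied.values()) is false and
is_total_order(unique.values()) is true.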

(I was a theoretical mathematician in a former life. This is all very
familiar ground to me.)

> > One consequence is that
> > action stamps - the committer/date pairs I use as VCS-independent commit
> > identifications in reposurgeon - are not unique.  When a patch sequence
> > is applied, it can easily happen fast enough to give several successive
> > commits the same committer-ID and timestamp.
> 
> Sorting by committer/date pairs sounds like an unhelpful idea, as that
> does not take any graph topology into account. It happens that a commit
> can actually have an _earlier_ commit date than its parent.

Yes, I'm aware of that.  The uniqueness properties that make a total
ordering desirable are not actually dependent on timestamp order
coinciding with topo order.
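
A sketch of why that holds, using invented commits and a plain
(committer, timestamp) tuple to stand in for an action stamp; the tuple
layout and all names here are assumptions for illustration. A child dated
earlier than its parent still gets a key no other commit has.

```python
# Hypothetical commits: (id, parent id, committer, commit timestamp).
commits = [
    ("c1", None, "alice@example.com", 200),
    ("c2", "c1", "bob@example.com", 150),  # dated earlier than its parent
    ("c3", "c2", "alice@example.com", 201),
]

# Index commits by their (committer, timestamp) pair. Parent links are
# never consulted, so date order disagreeing with topo order is harmless.
index = {}
for commit_id, _parent, committer, timestamp in commits:
    key = (committer, timestamp)
    if key in index:
        raise ValueError(f"duplicate action stamp: {key}")
    index[key] = commit_id
```

What would break the index is two commits with the same committer and the
same timestamp, which is the case finer-grained, unique timestamps are
meant to rule out.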

> Changing the granularity of timestamps requires changing the commit format,
> which is probably a non-starter.

That's why I started by noting that you're going to have to break the
format anyway to move to an ECDSA hash (or whatever you end up using).

I'm saying that *since you'll need to do that anyway*, it's a good time
to think about making timestamps finer-grained and unique.
-- 
		<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>


