All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Jakub Narębski" <jnareb@gmail.com>
To: git@vger.kernel.org
Cc: Junio C Hamano <gitster@pobox.com>,
	Stefan Beller <sbeller@google.com>,
	Marc Strapetz <marc.strapetz@syntevo.com>,
	Git Mailing List <git@vger.kernel.org>
Subject: Re: topological index field for commit objects
Date: Fri, 1 Jul 2016 11:48:40 +0200	[thread overview]
Message-ID: <57763C78.3020509@gmail.com> (raw)
In-Reply-To: <20160701031711.GA4832@sigill.intra.peff.net>

W dniu 2016-07-01 o 05:17, Jeff King pisze:
> On Thu, Jun 30, 2016 at 11:12:52AM -0700, Linus Torvalds wrote:
> 
>> I do think that it's ok to cache generation numbers somewhere if there
>> is an algorithm that can make use of them, but every time this comes
>> up, it's just not been important enough to make a big deal and a new
>> incompatible object format for it. The committer date is preexisting
>> and has existing pseudo-generation-number usage, so..improving on the
>> quality of it sounds like a good idea.
> 
> If you are OK with a cache, I don't think one needs to change the object
> format at all. It can be computed on the fly, and is purely a local
> optimization.

Note that if cache is generated during (re)packing, it could be optimized
for reading. If we want a cache that follow replacements, it would need
to be adjusted after each new or modified replacement; assuming that one
doesn't modify replacements by hand (then Git would need to detect changes,
and modify cache if needed, perhaps on query, e.g. "git log").

>> The first step should probably be to make fsck warn about the existing
>> cases of "commit has older date than parents". Something like the
>> attached patch, perhaps?
> 
> I have mixed feelings on this, because it forces the user to confront
> the issue at a time that's potentially very far from when it actually
> happened (and often when it is not their fault).

It would be nice to have as an optional check, like e.g. unreachable
objects, or use warning instead of error, like e.g. dangling objects
warning. This would be the default; it can always be changed using
the `fsck.<msg-id>` and `receive.fsck.<msg-id>` configuration variables.

This way we could use this mode to adjust clock-skew heuristics that
Git uses to speed up reachability queries (including time slop), be
it automatically or via a configuration variable.

[...]
> E.g., imagine somebody else has their clock set forward, and you fetch
> from them. Their object by itself is not broken. It is only when you
> want to commit on top of it, with the correct clock, that the broken
> state is created (and then, we cannot know whether it is your clock or
> the original committer's clock that is bad).
> 
> So I think it would be more productive to put a check like this in "git
> commit" rather than (or perhaps in addition to) fsck. That prevents
> us creating the broken relationship, but it does still mean the user may
> have to to go back and tell the original committer that their clock was
> broken.

Well, check at "git commit" time (perhaps with automatic fixing to used
committerdate for the commit being created, within specified conditions)
cannot distinguish between the two cases:

 * committerdate in parent commit(s) is correct,
   but the clock is in the past

 * committerdate in parent commit(s) is incorrect, and in the future,
   while the clock is correct
 
> You could also have the fsck check look not only for out-of-order
> commits, but also commits in the future (from the perspective of the
> receiver). That would reject such broken commits before they even hit
> your repository (though again, it is unclear in such a case if the
> commit is broken or the clock of the checker).

Well, one heuristics is that if commits in the future, and/or commits
with committerdate out of order have all the same committer, then
probably commits are broken. Another "heuristics" would be to assume
that if you invoked date-checking functionality, you have ensured that
your clock is correct.

-- 
Jakub Narębski



  parent reply	other threads:[~2016-07-01  9:49 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-06-29 18:31 topological index field for commit objects Marc Strapetz
2016-06-29 18:59 ` Junio C Hamano
2016-06-29 20:20   ` Stefan Beller
2016-06-29 20:39     ` Junio C Hamano
2016-06-29 20:54       ` Stefan Beller
2016-06-29 21:37         ` Stefan Beller
2016-06-29 21:43           ` Jeff King
2016-06-29 20:56       ` Jeff King
2016-06-29 21:49         ` Jakub Narębski
2016-06-29 22:00           ` Jeff King
2016-06-29 22:11             ` Junio C Hamano
2016-06-29 22:30               ` Jeff King
2016-07-05 11:43                 ` Johannes Schindelin
2016-07-05 12:59                   ` Jakub Narębski
2016-06-30 10:30             ` Jakub Narębski
2016-06-30 18:12               ` Linus Torvalds
2016-06-30 23:39                 ` Jakub Narębski
2016-06-30 23:59                 ` Mike Hommey
2016-07-01  3:17                 ` Jeff King
2016-07-01  6:45                   ` Marc Strapetz
2016-07-01  9:48                   ` Jakub Narębski [this message]
2016-07-01 16:08                   ` Junio C Hamano
2016-07-01  6:54               ` Jeff King
2016-07-01  9:59                 ` Jakub Narębski
2016-07-20  0:07             ` Jakub Narębski
2016-07-20 13:02               ` Jeff King
2017-02-04 13:43                 ` Jakub Narębski
2017-02-17  9:26                   ` Jeff King
2017-02-17  9:28                     ` Jakub Narębski
2016-06-29 22:15       ` Marc Strapetz
2016-06-29 21:00   ` Jakub Narębski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=57763C78.3020509@gmail.com \
    --to=jnareb@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=marc.strapetz@syntevo.com \
    --cc=sbeller@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.