From: Jakub Narebski <jnareb@gmail.com>
To: "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org, Junio C Hamano <gitster@pobox.com>,
Derrick Stolee <stolee@microsoft.com>
Subject: Re: [PATCH 0/1] v2.19.0-rc1 Performance Regression in 'git merge-base'
Date: Sat, 06 Oct 2018 18:09:40 +0200
Message-ID: <865zyfys3f.fsf@gmail.com>
In-Reply-To: <pull.28.git.gitgitgadget@gmail.com> (Derrick Stolee via GitGitGadget's message of "Thu, 30 Aug 2018 05:58:07 -0700 (PDT)")
"Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:
> As I was testing the release candidate, I stumbled across a regression in
> 'git merge-base' as a result of the switch to generation numbers. The commit
> message in [PATCH 1/1] describes the topology involved, but you can test it
> yourself by comparing 'git merge-base v4.8 v4.9' in the Linux kernel. The
> regression does not show up when running merge-base for tags at least v4.9,
> which is why I didn't see it when I was testing earlier.

Strange that it happens; I'll take a look.

> The solution is simple, but also will conflict with ds/reachable in next. I
> can send a similar patch that applies the same diff into commit-reach.c.
>
> With the integration of generation numbers into most commit walks coming to
> a close [1], it will be time to re-investigate other options for
> reachability indexes [2]. As I was digging into the issue with this
> regression, I discovered a way we can modify our generation numbers and pair
> them with commit dates to give us a simple-to-compute, immutable
> two-dimensional reachability index that would be immune to this regression.
> I will investigate that more and report back, but it is more important to
> fix this regression now.

I am interested in what you have come up with, especially since commit
creation dates are imperfect indicators because of clock skew, etc.
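
To make the clock-skew concern concrete, here is a minimal Python sketch. It
is an illustration only: it is neither git's C code nor the two-dimensional
index mentioned in the quoted text, and the toy history, the dates, and all
function names are hypothetical. It assumes the usual definition of a
generation number (one more than the largest generation number among the
parents, 1 for a root commit) and compares it with commit dates as a
"definitely cannot reach" filter.

```python
# Hypothetical toy history: commit -> list of parents.
PARENTS = {
    "A": [],
    "B": ["A"],
    "C": ["A"],
    "D": ["B", "C"],
}
# Hypothetical commit dates; "B" was made on a machine with a skewed clock,
# so it looks older than its own parent "A".
DATES = {"A": 100, "B": 90, "C": 110, "D": 120}


def generation_numbers(parents):
    """gen(c) = 1 + max(gen(p) for p in parents of c), and 1 for roots."""
    gen = {}
    for start in parents:
        stack = [start]
        while stack:
            c = stack[-1]
            missing = [p for p in parents[c] if p not in gen]
            if missing:
                stack.extend(missing)  # compute the parents first
            else:
                gen[c] = 1 + max((gen[p] for p in parents[c]), default=0)
                stack.pop()
    return gen


def can_reach(parents, src, dst):
    """Exact answer: walk parent links starting from src."""
    seen, todo = set(), [src]
    while todo:
        c = todo.pop()
        if c == dst:
            return True
        if c not in seen:
            seen.add(c)
            todo.extend(parents[c])
    return False


def gen_says_unreachable(gen, src, dst):
    # An ancestor always has a strictly smaller generation number,
    # so this filter never gives a wrong "unreachable" answer.
    return src != dst and gen[src] <= gen[dst]


def date_says_unreachable(dates, src, dst):
    # The same filter with commit dates; valid only if no clock is skewed.
    return src != dst and dates[src] < dates[dst]


GEN = generation_numbers(PARENTS)
for src, dst in [("B", "A"), ("B", "C"), ("D", "A")]:
    print(src, dst,
          "reachable:", can_reach(PARENTS, src, dst),
          "gen cut:", gen_says_unreachable(GEN, src, dst),
          "date cut:", date_says_unreachable(DATES, src, dst))
```

In this toy history the date filter wrongly rules out "A" as an ancestor of
"B", while the generation-number filter does not; that is the sense in which
dates are only an imperfect indicator.
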
>
> Thanks, -Stolee
>
> [1] https://public-inbox.org/git/pull.25.git.gitgitgadget@gmail.com/
>     [PATCH 0/6] Use generation numbers for --topo-order
>
> [2] https://public-inbox.org/git/86muxcuyod.fsf@gmail.com/
>     [RFC] Other chunks for commit-graph, part 2 - reachability indexes

Since then I have found a few more possible approaches:

- working with the repository metagraph[1], where every chain of commits is
  replaced by a single edge, perhaps together with the DAG Reduction[2]
  boosting framework applied to this metagraph
- using a FELINE-like index (the Graph+Label approach, also known as online
  search), and for those queries where this index gives false results,
  using a Labels-only approach[3]

[1] Marco Biazzini, Martin Monperrus, Benoit Baudry, "On Analyzing the
    Topology of Commit Histories in Decentralized Version Control
    Systems", ICSME 2014 (conference paper)
[2] Junfeng Zhou et al., "Accelerating reachability query processing
    based on DAG reduction", The VLDB Journal (2018) 27: 271
[3] Xian Tang et al., "An Optimized Labeling Scheme for Reachability
    Queries", CMC, vol.55, no.2, pp.267-283, 2018
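
As a rough illustration of the Graph+Label idea behind a FELINE-like index,
here is a minimal Python sketch. It is not taken from git or from any of the
papers above: the toy history, the function names, and the tie-breaking rule
used for the second topological order are assumptions made for the example.
Each commit gets a pair of ranks from two different topological orders; if
the source does not come before the target in both orders, the target cannot
be reachable, and only otherwise is a graph walk (the online search) needed.

```python
import heapq

# Hypothetical toy history: commit -> parents (edges point child -> parent).
PARENTS = {
    "E": ["C", "D"],
    "D": ["B"],
    "C": ["B"],
    "B": ["A"],
    "A": [],
    "F": ["A"],
}


def topo_rank(parents, key):
    """Kahn's algorithm, children before parents.

    Among the commits that are ready, the one with the smallest key goes
    first, so different keys give different topological orders.
    """
    indegree = {c: 0 for c in parents}
    for ps in parents.values():
        for p in ps:
            indegree[p] += 1
    ready = [(key(c), c) for c in parents if indegree[c] == 0]
    heapq.heapify(ready)
    rank = {}
    while ready:
        _, c = heapq.heappop(ready)
        rank[c] = len(rank)
        for p in parents[c]:
            indegree[p] -= 1
            if indegree[p] == 0:
                heapq.heappush(ready, (key(p), p))
    return rank


def build_index(parents):
    x = topo_rank(parents, key=lambda c: c)       # ties broken by name
    # Assumed heuristic: prefer commits that came late in the first order,
    # so that the two orders disagree as often as possible.
    y = topo_rank(parents, key=lambda c: -x[c])
    return {c: (x[c], y[c]) for c in parents}


def reaches(parents, index, src, dst):
    if src == dst:
        return True
    (xs, ys), (xd, yd) = index[src], index[dst]
    if xs > xd or ys > yd:
        return False  # negative cut: answered from the labels alone
    # The labels give only a necessary condition, so fall back to a walk.
    seen, todo = set(), [src]
    while todo:
        c = todo.pop()
        if c == dst:
            return True
        if c not in seen:
            seen.add(c)
            todo.extend(parents[c])
    return False


INDEX = build_index(PARENTS)
print(reaches(PARENTS, INDEX, "E", "A"))  # needs the walk
print(reaches(PARENTS, INDEX, "E", "F"))  # rejected by the labels
```

The walk in the fallback branch is the part a Labels-only scheme would
replace for the queries that the pair of ranks cannot decide.
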
--
Jakub Narębski