All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: peff@peff.net, Junio C Hamano <gitster@pobox.com>
Subject: [PATCH v3 0/7] Use generation numbers for --topo-order
Date: Fri, 21 Sep 2018 10:39:26 -0700 (PDT)	[thread overview]
Message-ID: <pull.25.v3.git.gitgitgadget@gmail.com> (raw)
In-Reply-To: <pull.25.v2.git.gitgitgadget@gmail.com>

This patch series performs a decently-sized refactoring of the revision-walk
machinery. Well, "refactoring" is probably the wrong word, as I don't
actually remove the old code. Instead, when we see certain options in the
'rev_info' struct, we redirect the commit-walk logic to a new set of methods
that distribute the workload differently. By using generation numbers in the
commit-graph, we can significantly improve 'git log --graph' commands (and
the underlying 'git rev-list --topo-order').

On the Linux repository, I got the following performance results when
comparing to the previous version with or without a commit-graph:

Test: git rev-list --topo-order -100 HEAD
HEAD~1, no commit-graph: 6.80 s
HEAD~1, w/ commit-graph: 0.77 s
  HEAD, w/ commit-graph: 0.02 s

Test: git rev-list --topo-order -100 HEAD -- tools
HEAD~1, no commit-graph: 9.63 s
HEAD~1, w/ commit-graph: 6.06 s
  HEAD, w/ commit-graph: 0.06 s

If you want to read this series but are unfamiliar with the commit-graph and
generation numbers, then I recommend reading 
Documentation/technical/commit-graph.txt or a blob post [1] I wrote on the
subject. In particular, the three-part walk described in "revision.c:
refactor basic topo-order logic" is present (but underexplained) as an
animated PNG [2].

Since revision.c is an incredibly important (and old) portion of the
codebase -- and because there are so many orthogonal options in 'struct
rev_info' -- I consider this submission to be "RFC quality". That is, I am
not confident that I am not missing anything, or that my solution is the
best it can be. I did merge this branch with ds/commit-graph-with-grafts and
the "DO-NOT-MERGE: write and read commit-graph always" commit that computes
a commit-graph with every 'git commit' command. The test suite passed with
that change, available on GitHub [3]. To ensure that I cover at least the
case I think are interesting, I added tests to t6600-test-reach.sh to verify
the walks report the correct results for the three cases there (no
commit-graph, full commit-graph, and a partial commit-graph so the walk
starts at GENERATION_NUMBER_INFINITY).

One notable case that is not included in this series is the case of a
history comparison such as 'git rev-list --topo-order A..B'. The existing
code in limit_list() has ways to cut the walk short when all pending commits
are UNINTERESTING. Since this code depends on commit_list instead of the
prio_queue we are using here, I chose to leave it untouched for now. We can
revisit it in a separate series later. Since handle_commit() turns on
revs->limited when a commit is UNINTERESTING, we do not hit the new code in
this case. Removing this 'revs->limited = 1;' line yields correct results,
but the performance is worse.

This series was based on ds/reachable, but is now based on 'master' to not
conflict with 182070 "commit: use timestamp_t for author_date_slab". There
is a small conflict with md/filter-trees, because it renamed a flag in
revisions.h in the line before I add new flags. Hopefully this conflict is
not too difficult to resolve.

Changes in V3: I added a new patch that updates the tab-alignment for flags
in revision.h before adding new ones (Thanks, Ævar!). Also, I squashed the
recommended changes to run_three_modes and test_three_modes from Szeder and
Junio. Thanks!

Thanks, -Stolee

[1] 
https://blogs.msdn.microsoft.com/devops/2018/07/09/supercharging-the-git-commit-graph-iii-generations/
Supercharging the Git Commit Graph III: Generations and Graph Algorithms

[2] 
https://msdnshared.blob.core.windows.net/media/2018/06/commit-graph-topo-order-b-a.png
Animation showing three-part walk

[3] https://github.com/derrickstolee/git/tree/topo-order/testA branch
containing this series along with commits to compute commit-graph in entire
test suite.

Cc: avarab@gmail.comCc: szeder.dev@gmail.com

Derrick Stolee (7):
  prio-queue: add 'peek' operation
  test-reach: add run_three_modes method
  test-reach: add rev-list tests
  revision.c: begin refactoring --topo-order logic
  commit/revisions: bookkeeping before refactoring
  revision.h: add whitespace in flag definitions
  revision.c: refactor basic topo-order logic

 commit.c                   |  11 +-
 commit.h                   |   8 ++
 object.h                   |   4 +-
 prio-queue.c               |   9 ++
 prio-queue.h               |   6 +
 revision.c                 | 232 ++++++++++++++++++++++++++++++++++++-
 revision.h                 |  34 +++---
 t/helper/test-prio-queue.c |  10 +-
 t/t6600-test-reach.sh      |  96 ++++++++++++++-
 9 files changed, 374 insertions(+), 36 deletions(-)


base-commit: 2d3b1c576c85b7f5db1f418907af00ab88e0c303
Published-As: https://github.com/gitgitgadget/git/releases/tags/pr-25%2Fderrickstolee%2Ftopo-order%2Fprogress-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-25/derrickstolee/topo-order/progress-v3
Pull-Request: https://github.com/gitgitgadget/git/pull/25

Range-diff vs v2:

 1:  cc1ec4c270 = 1:  cc1ec4c270 prio-queue: add 'peek' operation
 2:  404c918608 ! 2:  b2a1ade148 test-reach: add run_three_modes method
     @@ -11,10 +11,6 @@
          run_three_modes method that executes the given command and tests
          the actual output to the expected output.
      
     -    While inspecting this code, I realized that the final test for
     -    'commit_contains --tag' is silently dropping the '--tag' argument.
     -    It should be quoted to include both.
     -
          Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
      
      diff --git a/t/t6600-test-reach.sh b/t/t6600-test-reach.sh
     @@ -28,31 +24,22 @@
      +run_three_modes () {
       	test_when_finished rm -rf .git/objects/info/commit-graph &&
      -	test-tool reach $1 <input >actual &&
     -+	$1 <input >actual &&
     ++	"$@" <input >actual &&
       	test_cmp expect actual &&
       	cp commit-graph-full .git/objects/info/commit-graph &&
      -	test-tool reach $1 <input >actual &&
     -+	$1 <input >actual &&
     ++	"$@" <input >actual &&
       	test_cmp expect actual &&
       	cp commit-graph-half .git/objects/info/commit-graph &&
      -	test-tool reach $1 <input >actual &&
     -+	$1 <input >actual &&
     ++	"$@" <input >actual &&
       	test_cmp expect actual
       }
       
      +test_three_modes () {
     -+	run_three_modes "test-tool reach $1"
     ++	run_three_modes test-tool reach "$@"
      +}
      +
       test_expect_success 'ref_newer:miss' '
       	cat >input <<-\EOF &&
       	A:commit-5-7
     -@@
     - 	EOF
     - 	echo "commit_contains(_,A,X,_):1" >expect &&
     - 	test_three_modes commit_contains &&
     --	test_three_modes commit_contains --tag
     -+	test_three_modes "commit_contains --tag"
     - '
     - 
     - test_expect_success 'commit_contains:miss' '
 3:  30dee58c61 ! 3:  b0ceb96076 test-reach: add rev-list tests
     @@ -30,7 +30,7 @@
      +		commit-6-2 commit-5-2 commit-4-2 commit-3-2 commit-2-2 commit-1-2 \
      +		commit-6-1 commit-5-1 commit-4-1 commit-3-1 commit-2-1 commit-1-1 \
      +	>expect &&
     -+	run_three_modes "git rev-list --topo-order commit-6-6"
     ++	run_three_modes git rev-list --topo-order commit-6-6
      +'
      +
      +test_expect_success 'rev-list: first-parent topo-order' '
     @@ -42,7 +42,7 @@
      +		commit-6-2 \
      +		commit-6-1 commit-5-1 commit-4-1 commit-3-1 commit-2-1 commit-1-1 \
      +	>expect &&
     -+	run_three_modes "git rev-list --first-parent --topo-order commit-6-6"
     ++	run_three_modes git rev-list --first-parent --topo-order commit-6-6
      +'
      +
      +test_expect_success 'rev-list: range topo-order' '
     @@ -54,7 +54,7 @@
      +		commit-6-2 commit-5-2 commit-4-2 \
      +		commit-6-1 commit-5-1 commit-4-1 \
      +	>expect &&
     -+	run_three_modes "git rev-list --topo-order commit-3-3..commit-6-6"
     ++	run_three_modes git rev-list --topo-order commit-3-3..commit-6-6
      +'
      +
      +test_expect_success 'rev-list: range topo-order' '
     @@ -66,7 +66,7 @@
      +		commit-6-2 commit-5-2 commit-4-2 \
      +		commit-6-1 commit-5-1 commit-4-1 \
      +	>expect &&
     -+	run_three_modes "git rev-list --topo-order commit-3-8..commit-6-6"
     ++	run_three_modes git rev-list --topo-order commit-3-8..commit-6-6
      +'
      +
      +test_expect_success 'rev-list: first-parent range topo-order' '
     @@ -78,7 +78,7 @@
      +		commit-6-2 \
      +		commit-6-1 commit-5-1 commit-4-1 \
      +	>expect &&
     -+	run_three_modes "git rev-list --first-parent --topo-order commit-3-8..commit-6-6"
     ++	run_three_modes git rev-list --first-parent --topo-order commit-3-8..commit-6-6
      +'
      +
      +test_expect_success 'rev-list: ancestry-path topo-order' '
     @@ -88,7 +88,7 @@
      +		commit-6-4 commit-5-4 commit-4-4 commit-3-4 \
      +		commit-6-3 commit-5-3 commit-4-3 \
      +	>expect &&
     -+	run_three_modes "git rev-list --topo-order --ancestry-path commit-3-3..commit-6-6"
     ++	run_three_modes git rev-list --topo-order --ancestry-path commit-3-3..commit-6-6
      +'
      +
      +test_expect_success 'rev-list: symmetric difference topo-order' '
     @@ -102,7 +102,7 @@
      +		commit-3-8 commit-2-8 commit-1-8 \
      +		commit-3-7 commit-2-7 commit-1-7 \
      +	>expect &&
     -+	run_three_modes "git rev-list --topo-order commit-3-8...commit-6-6"
     ++	run_three_modes git rev-list --topo-order commit-3-8...commit-6-6
      +'
      +
       test_done
 4:  a74ae13d4e = 4:  fd1a0ab7cd revision.c: begin refactoring --topo-order logic
 5:  0e64fc144c = 5:  e86f304082 commit/revisions: bookkeeping before refactoring
 -:  ---------- > 6:  fa6d5ef152 revision.h: add whitespace in flag definitions
 6:  3b185ac3b1 ! 7:  020b2f50c5 revision.c: refactor basic topo-order logic
     @@ -404,11 +404,11 @@
      --- a/revision.h
      +++ b/revision.h
      @@
     - #define USER_GIVEN	(1u<<25) /* given directly by the user */
     - #define TRACK_LINEAR	(1u<<26)
     - #define ALL_REV_FLAGS	(((1u<<11)-1) | USER_GIVEN | TRACK_LINEAR)
     -+#define TOPO_WALK_EXPLORED (1u<<27)
     -+#define TOPO_WALK_INDEGREE (1u<<28)
     + #define USER_GIVEN		(1u<<25) /* given directly by the user */
     + #define TRACK_LINEAR		(1u<<26)
     + #define ALL_REV_FLAGS		(((1u<<11)-1) | USER_GIVEN | TRACK_LINEAR)
     ++#define TOPO_WALK_EXPLORED	(1u<<27)
     ++#define TOPO_WALK_INDEGREE	(1u<<28)
       
       #define DECORATE_SHORT_REFS	1
       #define DECORATE_FULL_REFS	2

-- 
gitgitgadget

  parent reply	other threads:[~2018-09-21 17:39 UTC|newest]

Thread overview: 87+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-08-27 20:41 [PATCH 0/6] Use generation numbers for --topo-order Derrick Stolee via GitGitGadget
2018-08-27 20:41 ` [PATCH 1/6] prio-queue: add 'peek' operation Derrick Stolee via GitGitGadget
2018-08-27 20:41 ` [PATCH 2/6] test-reach: add run_three_modes method Derrick Stolee via GitGitGadget
2018-08-27 20:41 ` [PATCH 3/6] test-reach: add rev-list tests Derrick Stolee via GitGitGadget
2018-08-27 20:41 ` [PATCH 4/6] revision.c: begin refactoring --topo-order logic Derrick Stolee via GitGitGadget
2018-08-27 20:41 ` [PATCH 5/6] commit/revisions: bookkeeping before refactoring Derrick Stolee via GitGitGadget
2018-08-27 20:41 ` [PATCH 6/6] revision.c: refactor basic topo-order logic Derrick Stolee via GitGitGadget
2018-08-27 21:23 ` [PATCH 0/6] Use generation numbers for --topo-order Junio C Hamano
2018-09-18  4:08 ` [PATCH v2 " Derrick Stolee via GitGitGadget
2018-09-18  4:08   ` [PATCH v2 1/6] prio-queue: add 'peek' operation Derrick Stolee via GitGitGadget
2018-09-18  4:08   ` [PATCH v2 2/6] test-reach: add run_three_modes method Derrick Stolee via GitGitGadget
2018-09-18 18:02     ` SZEDER Gábor
2018-09-19 19:31       ` Junio C Hamano
2018-09-19 19:38         ` Junio C Hamano
2018-09-20 21:18           ` Junio C Hamano
2018-09-18  4:08   ` [PATCH v2 3/6] test-reach: add rev-list tests Derrick Stolee via GitGitGadget
2018-09-18  4:08   ` [PATCH v2 4/6] revision.c: begin refactoring --topo-order logic Derrick Stolee via GitGitGadget
2018-09-18  4:08   ` [PATCH v2 5/6] commit/revisions: bookkeeping before refactoring Derrick Stolee via GitGitGadget
2018-09-18  4:08   ` [PATCH v2 6/6] revision.c: refactor basic topo-order logic Derrick Stolee via GitGitGadget
2018-09-18  5:51     ` Ævar Arnfjörð Bjarmason
2018-09-18  6:05   ` [PATCH v2 0/6] Use generation numbers for --topo-order Ævar Arnfjörð Bjarmason
2018-09-21 15:47     ` Derrick Stolee
2018-09-21 17:39   ` Derrick Stolee via GitGitGadget [this message]
2018-09-21 17:39     ` [PATCH v3 1/7] prio-queue: add 'peek' operation Derrick Stolee via GitGitGadget
2018-09-26 19:15       ` Derrick Stolee
2018-10-11 13:54       ` Jeff King
2018-09-21 17:39     ` [PATCH v3 2/7] test-reach: add run_three_modes method Derrick Stolee via GitGitGadget
2018-10-11 13:57       ` Jeff King
2018-09-21 17:39     ` [PATCH v3 3/7] test-reach: add rev-list tests Derrick Stolee via GitGitGadget
2018-10-11 13:58       ` Jeff King
2018-10-12  4:34         ` Junio C Hamano
2018-09-21 17:39     ` [PATCH v3 4/7] revision.c: begin refactoring --topo-order logic Derrick Stolee via GitGitGadget
2018-10-11 14:06       ` Jeff King
2018-10-12  6:33       ` Junio C Hamano
2018-10-12 12:32         ` Derrick Stolee
2018-10-12 16:15         ` Johannes Sixt
2018-10-13  8:05           ` Junio C Hamano
2018-09-21 17:39     ` [PATCH v3 5/7] commit/revisions: bookkeeping before refactoring Derrick Stolee via GitGitGadget
2018-10-11 14:21       ` Jeff King
2018-09-21 17:39     ` [PATCH v3 6/7] revision.h: add whitespace in flag definitions Derrick Stolee via GitGitGadget
2018-09-21 17:39     ` [PATCH v3 7/7] revision.c: refactor basic topo-order logic Derrick Stolee via GitGitGadget
2018-09-27 17:57       ` Derrick Stolee
2018-10-06 16:56         ` Jakub Narebski
2018-10-11 15:35       ` Jeff King
2018-10-11 16:21         ` Derrick Stolee
2018-10-25  9:43           ` Jeff King
2018-10-25 13:00             ` Derrick Stolee
2018-10-11 22:32       ` Stefan Beller
2018-09-21 21:22     ` [PATCH v3 0/7] Use generation numbers for --topo-order Junio C Hamano
2018-10-16 22:36     ` [PATCH v4 " Derrick Stolee via GitGitGadget
2018-10-16 22:36       ` [PATCH v4 1/7] prio-queue: add 'peek' operation Derrick Stolee via GitGitGadget
2018-10-16 22:36       ` [PATCH v4 2/7] test-reach: add run_three_modes method Derrick Stolee via GitGitGadget
2018-10-16 22:36       ` [PATCH v4 3/7] test-reach: add rev-list tests Derrick Stolee via GitGitGadget
2018-10-21 10:21         ` Jakub Narebski
2018-10-21 15:28           ` Derrick Stolee
2018-10-16 22:36       ` [PATCH v4 4/7] revision.c: begin refactoring --topo-order logic Derrick Stolee via GitGitGadget
2018-10-21 15:55         ` Jakub Narebski
2018-10-22  1:12           ` Junio C Hamano
2018-10-22  1:51             ` Derrick Stolee
2018-10-22  1:55               ` [RFC PATCH] revision.c: use new algorithm in A..B case Derrick Stolee
2018-10-25  8:28               ` [PATCH v4 4/7] revision.c: begin refactoring --topo-order logic Junio C Hamano
2018-10-26 20:56                 ` Jakub Narebski
2018-10-16 22:36       ` [PATCH v4 5/7] commit/revisions: bookkeeping before refactoring Derrick Stolee via GitGitGadget
2018-10-21 21:17         ` Jakub Narebski
2018-10-16 22:36       ` [PATCH v4 6/7] revision.c: generation-based topo-order algorithm Derrick Stolee via GitGitGadget
2018-10-22 13:37         ` Jakub Narebski
2018-10-23 13:54           ` Derrick Stolee
2018-10-26 16:55             ` Jakub Narebski
2018-10-16 22:36       ` [PATCH v4 7/7] t6012: make rev-list tests more interesting Derrick Stolee via GitGitGadget
2018-10-23 15:48         ` Jakub Narebski
2018-10-21 12:57       ` [PATCH v4 0/7] Use generation numbers for --topo-order Jakub Narebski
2018-11-01  5:21       ` Junio C Hamano
2018-11-01 13:49         ` Derrick Stolee
2018-11-01 23:54           ` Junio C Hamano
2018-11-01 13:46       ` [PATCH v5 " Derrick Stolee
2018-11-01 13:46         ` [PATCH v5 1/7] prio-queue: add 'peek' operation Derrick Stolee
2018-11-01 13:46         ` [PATCH v5 2/7] test-reach: add run_three_modes method Derrick Stolee
2018-11-01 13:46         ` [PATCH v5 3/7] test-reach: add rev-list tests Derrick Stolee
2018-11-01 13:46         ` [PATCH v5 4/7] revision.c: begin refactoring --topo-order logic Derrick Stolee
2018-11-01 13:46         ` [PATCH v5 5/7] commit/revisions: bookkeeping before refactoring Derrick Stolee
2018-11-01 13:46         ` [PATCH v5 6/7] revision.c: generation-based topo-order algorithm Derrick Stolee
2018-11-01 15:48           ` SZEDER Gábor
2018-11-01 16:12             ` Derrick Stolee
2019-11-08  2:50           ` Mike Hommey
2019-11-11  1:07             ` Derrick Stolee
2019-11-18 23:04               ` SZEDER Gábor
2018-11-01 13:46         ` [PATCH v5 7/7] t6012: make rev-list tests more interesting Derrick Stolee

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=pull.25.v3.git.gitgitgadget@gmail.com \
    --to=gitgitgadget@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.