From: "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: Derrick Stolee <dstolee@microsoft.com>,
	Jonathan Tan <jonathantanmy@google.com>,
	Taylor Blau <me@ttaylorr.com>, Junio C Hamano <gitster@pobox.com>,
	Jeff King <peff@peff.net>, Karsten Blees <blees@dcon.de>,
	Elijah Newren <newren@gmail.com>
Subject: [PATCH 0/2] Optimization batch 6: make full use of exact renames
Date: Wed, 03 Feb 2021 05:49:03 +0000
Message-ID: <pull.842.git.1612331345.gitgitgadget@gmail.com>

This series depends on en/merge-ort-perf.

For the very curious who are wondering about the first five optimization
batches, see the end of this email.

This series makes full use of exact renames; see the commit messages for
details. It represents "Optimization #1" from my Git Merge 2020 talk[1]. For
the testcases mentioned in commit 557ac0350d ("merge-ort: begin performance
work; instrument with trace2_region_* calls", 2020-10-28), the changes in
just this series improve the performance as follows:

                     Before Series           After Series
no-renames:       14.263 s ±  0.053 s    13.815 s ±  0.062 s
mega-renames:   5504.231 s ±  5.150 s  1799.937 s ±  0.493 s
just-one-mega:   158.534 s ±  0.498 s    51.289 s ±  0.019 s
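
To put the table in relative terms, here is a small sketch that divides each
"Before Series" time by the matching "After Series" time. The numbers are
copied from the table; the `timings` dictionary and the `speedup()` helper are
names made up for this illustration and are not part of the series.

```python
# Mean timings in seconds, copied from the table above: (before, after).
timings = {
    "no-renames": (14.263, 13.815),
    "mega-renames": (5504.231, 1799.937),
    "just-one-mega": (158.534, 51.289),
}


def speedup(before, after):
    """Return how many times faster the 'after' run is than the 'before' run."""
    return before / after


for name, (before, after) in timings.items():
    print(f"{name:15s} {speedup(before, after):.2f}x")
```

This works out to roughly 1.03x for no-renames, 3.06x for mega-renames, and
3.09x for just-one-mega, ignoring the reported run-to-run variation.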


As a reminder, before any merge-ort/diffcore-rename performance work, the
performance results we started with (as noted in the same commit message)
were:

no-renames-am:      6.940 s ±  0.485 s
no-renames:        18.912 s ±  0.174 s
mega-renames:    5964.031 s ± 10.459 s
just-one-mega:    149.583 s ±  0.751 s


[1]
https://github.com/newren/presentations/blob/pdfs/merge-performance/merge-performance-slides.pdf

=== Previous optimization batches ===

I'm labeling this as the "6th" batch, due to other optimizations submitted
previously, and a number of optimizations baked into the design of
fast-rebase and merge-ort.

 1. Previously submitted hashmap/strmap optimizations
    1a) 33f20d8217 (hashmap: introduce a new hashmap_partial_clear())
    1b) 6ccdfc2a20 (strmap: enable faster clearing and reusing of strmaps)
    1c) a208ec1f0b (strmap: enable allocations to come from a mem_pool)
    1d) 23a276a9c4 (strmap: take advantage of FLEXPTR_ALLOC_STR when relevant)

 2. Previously submitted diffcore-rename optimizations
    2a) b970b4ef62 (diffcore-rename: simplify and accelerate
        register_rename_src())
    2b) 9db2ac5616 (diffcore-rename: accelerate rename_dst setup)
    2c) 350410f6b1 (diffcore-rename: remove unnecessary duplicate entry
        checks)

 3. fast-rebase optimizations
    3a) Avoid updating working-tree/index with every intermediate patch
    3b) Avoid reading/writing rebase metadata until conflict or completion

 4. Small stuff baked into merge-ort design
    4a) Using pahole to note I can reduce size of merged_info by 8 bytes
    4b) Avoid recomparing hashes (due to use of match_masks)
    4c) Avoid unconditional dropping and re-reading of the index
    4d) Avoid checking index matches HEAD with every patch; do it at start
        only

 5. Big stuff baked into merge-ort design
    5a) Avoid quadratic behavior with O(N) insertions/removals of index
        entries
    5b) Avoid numerous expensive mini-tree traversals done by
        merge-recursive
    5c) Avoid recursing into trees where both sides match merge base

Elijah Newren (2):
  diffcore-rename: no point trying to find a match better than exact
  diffcore-rename: filter rename_src list when possible

 diffcore-rename.c | 69 ++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 62 insertions(+), 7 deletions(-)
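
As a rough illustration of the idea named in the two patch subjects, the
following sketch pairs files whose content hashes are identical and then
removes them from the candidate lists before any inexact comparison. It is an
assumption-laden toy model, not the code in diffcore-rename.c: the function
names, the dictionaries mapping paths to hashes, and the example paths are all
hypothetical, and the conditions behind "when possible" (described in the
commit messages) are not modeled here.

```python
def exact_renames(sources, dests):
    """Pair each source path with a destination path that has the same hash.

    Both arguments map path -> content hash. Hypothetical helper for
    illustration only.
    """
    by_hash = {}
    for path, oid in dests.items():
        by_hash.setdefault(oid, []).append(path)

    pairs = {}
    for path, oid in sources.items():
        candidates = by_hash.get(oid)
        if candidates:
            # An exact match cannot be beaten, so take it and stop looking.
            pairs[path] = candidates.pop(0)
    return pairs


def filter_candidates(sources, dests, pairs):
    """Drop exactly matched paths so inexact detection sees fewer files."""
    matched_dests = set(pairs.values())
    src_left = {p: o for p, o in sources.items() if p not in pairs}
    dst_left = {p: o for p, o in dests.items() if p not in matched_dests}
    return src_left, dst_left


# Made-up example: two files moved unchanged, one deleted, one added.
sources = {"a.c": "h1", "b.c": "h2", "c.c": "h3"}
dests = {"lib/a.c": "h1", "lib/b.c": "h2", "d.c": "h4"}

pairs = exact_renames(sources, dests)
src_left, dst_left = filter_candidates(sources, dests, pairs)

# Pairwise similarity checks needed without and with the filtering.
print(len(sources) * len(dests), "->", len(src_left) * len(dst_left))
```

In this toy case the number of source/destination pairs left for the expensive
similarity comparison drops from 9 to 1, which is the kind of saving the
series is after when most renames are exact.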


base-commit: 557ac0350d9efa1f59c708779ca3fb3aee121131
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-842%2Fnewren%2Fort-perf-batch-6-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-842/newren/ort-perf-batch-6-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/842
-- 
gitgitgadget
