git.vger.kernel.org archive mirror
* [PATCH 0/8] Optimization batch 10: avoid detecting even more irrelevant renames
@ 2021-03-13 22:22 Elijah Newren via GitGitGadget
  2021-03-13 22:22 ` [PATCH 1/8] diffcore-rename: take advantage of "majority rules" to skip more renames Elijah Newren via GitGitGadget
                   ` (8 more replies)
  0 siblings, 9 replies; 14+ messages in thread
From: Elijah Newren via GitGitGadget @ 2021-03-13 22:22 UTC (permalink / raw)
  To: git
  Cc: Derrick Stolee, Ævar Arnfjörð Bjarmason,
	Jonathan Tan, Taylor Blau, Elijah Newren

This series depends on ort-perf-batch-9.

=== Basic Optimization idea ===

This series adds additional special cases where detection of renames is
irrelevant, where the irrelevance is due to the fact that the merge
machinery will arrive at the same result regardless of whether a rename is
detected for any of those paths. That high-level wording makes it sound the
same as ort-perf-batch-9, and basically it is; this series just takes the
optimization a step further.

As noted in the last series, there are two reasons that the merge machinery
needs renames:

 * in order to do three-way content merging (pairing appropriate files)
 * in order to find where directories have been renamed

ort-perf-batch-9 provided a rough approximation for the second criterion that
was good enough, but which still left us detecting more renames than
necessary. This series focuses further on that criterion and finds ways to
avoid the need to detect as many renames while still detecting directory
renames identically to before. Thus, this series is an improvement on
"Optimization #2" from my Git Merge 2020 talk[1].
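
As a rough illustration of the "majority rules" idea named in patch 1: a
directory is treated as renamed to wherever most of its files went, so once
one destination leads by more than the number of files whose fate is still
unknown, detecting further renames cannot change the outcome. The sketch
below is an assumption-laden toy, not code from this series; the function
name, its parameters, and the exact (conservative) decision rule are all
hypothetical.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not taken from diffcore-rename.c.
 *
 * counts[i] is the number of files from one old directory already known
 * to have been renamed into candidate destination directory i.
 * unknown is the number of files from that old directory whose rename
 * status has not been determined yet.
 *
 * Returns 1 if the leading destination would still win even if every
 * unknown file turned out to be renamed into the runner-up, 0 otherwise.
 */
int dir_rename_is_decided(const int *counts, size_t n, int unknown)
{
	int best = 0, second = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (counts[i] > best) {
			second = best;
			best = counts[i];
		} else if (counts[i] > second) {
			second = counts[i];
		}
	}
	/* Strict inequality: a tie is treated as "not decided". */
	return best > 0 && best > second + unknown;
}
```

For example, with counts {5, 1} and 3 unknown files the leader cannot be
overtaken (5 > 1 + 3), so the remaining files in that directory would not
need rename detection for directory-rename purposes; with 4 unknown files
the result could still end in a tie, so they would.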

=== Results ===

For the testcases mentioned in commit 557ac03 ("merge-ort: begin performance
work; instrument with trace2_region_* calls", 2020-10-28), the changes in
just this series improve the performance as follows:

                     Before Series           After Series
no-renames:        5.680 s ±  0.096 s     5.665 s ±  0.129 s 
mega-renames:     13.812 s ±  0.162 s    11.435 s ±  0.158 s
just-one-mega:   506.0  ms ±  3.9  ms   494.2  ms ±  6.1  ms


While those results may look somewhat meager, it is important to note that
the previous optimizations have already reduced rename detection time to
nearly 0 for these particular testcases so there just isn't much left to
improve. The final patch in the series shows an alternate testcase where the
previous optimizations aren't as effective (a simple cherry-pick of a commit
that simply adds one new empty file), where there was a speedup factor of
approximately 3 due to this series:

                     Before Series           After Series
pick-empty:        1.936 s ±  0.024 s     688.1 ms ±  4.2 ms


There was also another testcase at $DAYJOB where I saw a factor of 7
improvement from this particular optimization, so it certainly has the
potential to help when the previous optimizations are not quite enough.

As a reminder, before any merge-ort/diffcore-rename performance work, the
performance results we started with (as noted in the same commit message)
were:

no-renames-am:      6.940 s ±  0.485 s
no-renames:        18.912 s ±  0.174 s
mega-renames:    5964.031 s ± 10.459 s
just-one-mega:    149.583 s ±  0.751 s
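
To put the starting numbers in perspective, the cumulative speedup can be
computed by dividing each original timing by the corresponding "After
Series" timing from the tables above. The helper below is only a small
arithmetic sketch; the function names are hypothetical and the inputs are
the mean timings quoted in this message (the ± spreads are ignored).

```c
#include <stdio.h>

/* Speedup factor: how many times faster the "after" run is.
 * Both arguments must be in the same unit (seconds here). */
double speedup(double before_s, double after_s)
{
	return before_s / after_s;
}

/* Print cumulative speedups using the mean timings quoted above. */
void print_cumulative_speedups(void)
{
	printf("no-renames:    %.1fx\n", speedup(18.912, 5.665));
	printf("mega-renames:  %.1fx\n", speedup(5964.031, 11.435));
	/* 494.2 ms expressed in seconds */
	printf("just-one-mega: %.1fx\n", speedup(149.583, 0.4942));
	/* pick-empty, this series only: 1.936 s vs 688.1 ms */
	printf("pick-empty:    %.1fx\n", speedup(1.936, 0.6881));
}
```

Calling print_cumulative_speedups() from a main() shows factors of roughly
3, 500, and 300 for the three original testcases, and a bit under 3 for
pick-empty from this series alone.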


[1]
https://github.com/newren/presentations/blob/pdfs/merge-performance/merge-performance-slides.pdf

Elijah Newren (8):
  diffcore-rename: take advantage of "majority rules" to skip more
    renames
  merge-ort, diffcore-rename: tweak dirs_removed and relevant_source
    type
  merge-ort: record the reason that we want a rename for a directory
  diffcore-rename: only compute dir_rename_count for relevant
    directories
  diffcore-rename: check if we have enough renames for directories early
    on
  diffcore-rename: add computation of number of unknown renames
  merge-ort: record the reason that we want a rename for a file
  diffcore-rename: determine which relevant_sources are no longer
    relevant

 diffcore-rename.c | 230 ++++++++++++++++++++++++++++++++++++++++------
 diffcore.h        |  19 +++-
 merge-ort.c       |  79 ++++++++++++----
 3 files changed, 281 insertions(+), 47 deletions(-)


base-commit: 98b0c7de5e70d62d47c3eeb3d290c6a234214f40
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-853%2Fnewren%2Fort-perf-batch-10-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-853/newren/ort-perf-batch-10-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/853
-- 
gitgitgadget


Thread overview: 14+ messages
2021-03-13 22:22 [PATCH 0/8] Optimization batch 10: avoid detecting even more irrelevant renames Elijah Newren via GitGitGadget
2021-03-13 22:22 ` [PATCH 1/8] diffcore-rename: take advantage of "majority rules" to skip more renames Elijah Newren via GitGitGadget
2021-03-13 22:22 ` [PATCH 2/8] merge-ort, diffcore-rename: tweak dirs_removed and relevant_source type Elijah Newren via GitGitGadget
2021-03-13 22:22 ` [PATCH 3/8] merge-ort: record the reason that we want a rename for a directory Elijah Newren via GitGitGadget
2021-03-15 14:31   ` Derrick Stolee
2021-03-15 15:27     ` Elijah Newren
2021-03-28  2:01       ` Junio C Hamano
2021-03-13 22:22 ` [PATCH 4/8] diffcore-rename: only compute dir_rename_count for relevant directories Elijah Newren via GitGitGadget
2021-03-13 22:22 ` [PATCH 5/8] diffcore-rename: check if we have enough renames for directories early on Elijah Newren via GitGitGadget
2021-03-13 22:22 ` [PATCH 6/8] diffcore-rename: add computation of number of unknown renames Elijah Newren via GitGitGadget
2021-03-13 22:22 ` [PATCH 7/8] merge-ort: record the reason that we want a rename for a file Elijah Newren via GitGitGadget
2021-03-13 22:22 ` [PATCH 8/8] diffcore-rename: determine which relevant_sources are no longer relevant Elijah Newren via GitGitGadget
2021-03-15 15:21 ` [PATCH 0/8] Optimization batch 10: avoid detecting even more irrelevant renames Derrick Stolee
2021-03-15 15:34   ` Elijah Newren
