* time needed to rebase shortend by using --onto?
@ 2021-05-26 10:09 Uwe Kleine-König
  2021-05-26 11:04 ` Bagas Sanjaya
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Uwe Kleine-König @ 2021-05-26 10:09 UTC
  To: git; +Cc: entwicklung

Hello,

I have a kernel topic branch containing 4 patches on top of Linux v5.4.
(I didn't speak to the affected customer, so I cannot easily share the
patch stack. If need be I can probably anonymize it or ask if I can
publish the patches.)

It rebases clean on v5.10:

    $ time git rebase v5.10
    Performing inexact rename detection: 100% (36806539/36806539), done.
    Performing inexact rename detection: 100% (36806539/36806539), done.
    Performing inexact rename detection: 100% (36806539/36806539), done.
    Performing inexact rename detection: 100% (36806539/36806539), done.
    Successfully rebased and updated detached HEAD.

    real    3m47.841s
    user    1m25.706s
    sys     0m11.181s

If I start with the same rev checked out and explicitly specify the
merge base, the rebase process is considerably faster:

    $ time git rebase --onto v5.10 v5.4
    Performing inexact rename detection: 100% (36806539/36806539), done.
    Performing inexact rename detection: 100% (36806539/36806539), done.
    Performing inexact rename detection: 100% (36806539/36806539), done.
    Performing inexact rename detection: 100% (36806539/36806539), done.
    Successfully rebased and updated detached HEAD.

    real    1m20.588s
    user    1m12.645s
    sys     0m6.733s

Is there some relevant complexity in the first invocation I'm not seeing
that explains it takes more than the double time? I would have expected
that

    git rebase v5.10

does the same as:

    git rebase --onto v5.10 $(git merge-base HEAD v5.10)

. (FTR:

    $ time git merge-base HEAD v5.10
    219d54332a09e8d8741c1e1982f5eae56099de85

    real    0m0.158s
    user    0m0.105s
    sys     0m0.052s

, 219d5433 is v5.4 as expected.

    $ git version
    git version 2.29.2

That's from the Debian package 1:2.29.2-1~bpo10+1 on a Debian 10 box.)

Best regards
Uwe

-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | https://www.pengutronix.de/ |

^ permalink raw reply	[flat|nested] 12+ messages in thread
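The expectation stated in the message above, that `git rebase <upstream>` and `git rebase --onto <upstream> $(git merge-base HEAD <upstream>)` replay the same commits, can be checked on a throwaway repository. The following is only a minimal sketch: the tag names old-base, new-base and topic-start are hypothetical stand-ins for v5.4, v5.10 and the topic branch, and it compares the resulting trees rather than timings. To measure, prefix the two rebase invocations with `time`. Note that the merge base is computed before the first rebase, while HEAD still points at the original topic commit.

```shell
#!/bin/sh
# Sketch on a scratch repository; all ref names are hypothetical stand-ins.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/upstream

echo base > file.txt
git add file.txt
git commit -q -m "base"
git tag old-base                       # plays the role of v5.4

git checkout -q -b topic
echo topic > topic.txt
git add topic.txt
git commit -q -m "topic patch"
git tag topic-start                    # the un-rebased topic commit

git checkout -q upstream
echo more >> file.txt
git commit -q -am "upstream work"
git tag new-base                       # plays the role of v5.10

# Variant 1: let rebase work out which commits to replay.
git checkout -q --detach topic-start
base=$(git merge-base HEAD new-base)   # computed BEFORE rebasing
git rebase -q new-base
tree_plain=$(git rev-parse 'HEAD^{tree}')

# Variant 2: name the merge base explicitly with --onto.
git checkout -q --detach topic-start
git rebase -q --onto new-base "$base"
tree_onto=$(git rev-parse 'HEAD^{tree}')

echo "merge base: $base"
echo "plain:      $tree_plain"
echo "--onto:     $tree_onto"
```

Identical trees only show that both invocations produce the same result; they say nothing about how much work each one does to get there, which is the open question in this thread.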
* Re: time needed to rebase shortend by using --onto?
  2021-05-26 10:09 time needed to rebase shortend by using --onto? Uwe Kleine-König
@ 2021-05-26 11:04 ` Bagas Sanjaya
  2021-05-26 14:38 ` Elijah Newren
  2021-05-26 22:18 ` Junio C Hamano
  2 siblings, 0 replies; 12+ messages in thread
From: Bagas Sanjaya @ 2021-05-26 11:04 UTC
  To: Uwe Kleine-König, git; +Cc: entwicklung

Hi Uwe,

On 26/05/21 17.09, Uwe Kleine-König wrote:
> Hello,
>
> I have a kernel topic branch containing 4 patches on top of Linux v5.4.
> (I didn't speak to the affected customer, so I cannot easily share the
> patch stack. If need be I can probably anonymize it or ask if I can
> publish the patches.)
>
> It rebases clean on v5.10:
>
>     $ time git rebase v5.10
>     Performing inexact rename detection: 100% (36806539/36806539), done.
>     Performing inexact rename detection: 100% (36806539/36806539), done.
>     Performing inexact rename detection: 100% (36806539/36806539), done.
>     Performing inexact rename detection: 100% (36806539/36806539), done.
>     Successfully rebased and updated detached HEAD.
>
>     real    3m47.841s
>     user    1m25.706s
>     sys     0m11.181s
>
> If I start with the same rev checked out and explicitly specify the
> merge base, the rebase process is considerably faster:
>
>     $ time git rebase --onto v5.10 v5.4
>     Performing inexact rename detection: 100% (36806539/36806539), done.
>     Performing inexact rename detection: 100% (36806539/36806539), done.
>     Performing inexact rename detection: 100% (36806539/36806539), done.
>     Performing inexact rename detection: 100% (36806539/36806539), done.
>     Successfully rebased and updated detached HEAD.
>
>     real    1m20.588s
>     user    1m12.645s
>     sys     0m6.733s
>
> Is there some relevant complexity in the first invocation I'm not seeing
> that explains it takes more than the double time? I would have expected
> that
>
>     git rebase v5.10
>
> does the same as:
>
>     git rebase --onto v5.10 $(git merge-base HEAD v5.10)
>
> . (FTR:
>
>     $ time git merge-base HEAD v5.10
>     219d54332a09e8d8741c1e1982f5eae56099de85
>
>     real    0m0.158s
>     user    0m0.105s
>     sys     0m0.052s
>
> , 219d5433 is v5.4 as expected.
>
>     $ git version
>     git version 2.29.2
>
> That's from the Debian package 1:2.29.2-1~bpo10+1 on a Debian 10 box.)
>
> Best regards
> Uwe
>

Can you reproduce your findings with latest version (v2.32.0-rc1) please?

-- 
An old man doll... just what I always wanted! - Clara
* Re: time needed to rebase shortend by using --onto?
  2021-05-26 10:09 time needed to rebase shortend by using --onto? Uwe Kleine-König
  2021-05-26 11:04 ` Bagas Sanjaya
@ 2021-05-26 14:38 ` Elijah Newren
  2021-05-27 21:59   ` Uwe Kleine-König
  2021-05-26 22:18 ` Junio C Hamano
  2 siblings, 1 reply; 12+ messages in thread
From: Elijah Newren @ 2021-05-26 14:38 UTC
  To: Uwe Kleine-König; +Cc: Git Mailing List, entwicklung

On Wed, May 26, 2021 at 3:13 AM Uwe Kleine-König
<u.kleine-koenig@pengutronix.de> wrote:
>
> Hello,
>
> I have a kernel topic branch containing 4 patches on top of Linux v5.4.
> (I didn't speak to the affected customer, so I cannot easily share the
> patch stack. If need be I can probably anonymize it or ask if I can
> publish the patches.)
>
> It rebases clean on v5.10:
>
>     $ time git rebase v5.10
>     Performing inexact rename detection: 100% (36806539/36806539), done.
>     Performing inexact rename detection: 100% (36806539/36806539), done.
>     Performing inexact rename detection: 100% (36806539/36806539), done.
>     Performing inexact rename detection: 100% (36806539/36806539), done.
>     Successfully rebased and updated detached HEAD.
>
>     real    3m47.841s
>     user    1m25.706s
>     sys     0m11.181s
>
> If I start with the same rev checked out and explicitly specify the
> merge base, the rebase process is considerably faster:
>
>     $ time git rebase --onto v5.10 v5.4
>     Performing inexact rename detection: 100% (36806539/36806539), done.
>     Performing inexact rename detection: 100% (36806539/36806539), done.
>     Performing inexact rename detection: 100% (36806539/36806539), done.
>     Performing inexact rename detection: 100% (36806539/36806539), done.
>     Successfully rebased and updated detached HEAD.
>
>     real    1m20.588s
>     user    1m12.645s
>     sys     0m6.733s
>
> Is there some relevant complexity in the first invocation I'm not seeing
> that explains it takes more than the double time? I would have expected
> that
>
>     git rebase v5.10
>
> does the same as:
>
>     git rebase --onto v5.10 $(git merge-base HEAD v5.10)
>
> . (FTR:
>
>     $ time git merge-base HEAD v5.10
>     219d54332a09e8d8741c1e1982f5eae56099de85
>
>     real    0m0.158s
>     user    0m0.105s
>     sys     0m0.052s
>
> , 219d5433 is v5.4 as expected.

That does seem surprising, though if an automatic gc completed between
the two commands that could certainly explain it.  If that theory is
correct, it would suggest that it'd be difficult for you to reproduce;
running again with either command would give you something closer to
the lower time both times.  Is that the case?  (Also, what's the
output of "git count-objects -v"?)

>
>     $ git version
>     git version 2.29.2
>
> That's from the Debian package 1:2.29.2-1~bpo10+1 on a Debian 10 box.)
>
> Best regards
> Uwe

I'd love to try this with git-2.32.0-rc1 (or even my not-yet-upstream
patches that optimize even further) with adding "--strategy=ort" to
your rebase command to see how much of a timing difference it makes.
Any chance the patches could either be published, or you could retry
with git-2.32.0-rc1 and add the --strategy=ort command line option to
your rebase command(s)?

Elijah
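The auto-gc theory in the reply above can be checked by comparing the `count` (loose objects) and `packs` fields of "git count-objects -v" before and after a repack. The sketch below does this on a scratch repository and triggers the repack explicitly with `git gc`; the helper name `field` is an assumption made for illustration and is not part of git.

```shell
#!/bin/sh
# Sketch on a scratch repository; "field" is a hypothetical helper.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

repo=$(mktemp -d)
cd "$repo"
git init -q .

# Create a few commits so that some loose objects exist.
for i in 1 2 3; do
    echo "$i" > "file$i.txt"
    git add "file$i.txt"
    git commit -q -m "commit $i"
done

# Print one numeric field from "git count-objects -v".
field() {
    git count-objects -v | awk -v key="$1:" '$1 == key { print $2 }'
}

loose_before=$(field count)
packs_before=$(field packs)

git gc -q                              # explicit stand-in for an automatic gc

loose_after=$(field count)
packs_after=$(field packs)

echo "loose objects: $loose_before -> $loose_after"
echo "packs:         $packs_before -> $packs_after"
```

When timing two commands against each other, running them as `git -c gc.auto=0 rebase ...` should rule out an automatic gc starting in the middle of a measurement, since gc.auto=0 disables automatic packing.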
* Re: time needed to rebase shortend by using --onto?
  2021-05-26 14:38 ` Elijah Newren
@ 2021-05-27 21:59   ` Uwe Kleine-König
  2021-05-27 22:15     ` Uwe Kleine-König
  2021-05-27 23:08     ` Elijah Newren
  0 siblings, 2 replies; 12+ messages in thread
From: Uwe Kleine-König @ 2021-05-27 21:59 UTC
  To: Elijah Newren; +Cc: Git Mailing List, entwicklung

Hello,

On Wed, May 26, 2021 at 07:38:08AM -0700, Elijah Newren wrote:
> On Wed, May 26, 2021 at 3:13 AM Uwe Kleine-König
> <u.kleine-koenig@pengutronix.de> wrote:
> > I have a kernel topic branch containing 4 patches on top of Linux v5.4.
> > (I didn't speak to the affected customer, so I cannot easily share the
> > patch stack. If need be I can probably anonymize it or ask if I can
> > publish the patches.)
> >
> > It rebases clean on v5.10:
> >
> >     $ time git rebase v5.10
> >     Performing inexact rename detection: 100% (36806539/36806539), done.
> >     Performing inexact rename detection: 100% (36806539/36806539), done.
> >     Performing inexact rename detection: 100% (36806539/36806539), done.
> >     Performing inexact rename detection: 100% (36806539/36806539), done.
> >     Successfully rebased and updated detached HEAD.
> >
> >     real    3m47.841s
> >     user    1m25.706s
> >     sys     0m11.181s
> >
> > If I start with the same rev checked out and explicitly specify the
> > merge base, the rebase process is considerably faster:
> >
> >     $ time git rebase --onto v5.10 v5.4
> >     Performing inexact rename detection: 100% (36806539/36806539), done.
> >     Performing inexact rename detection: 100% (36806539/36806539), done.
> >     Performing inexact rename detection: 100% (36806539/36806539), done.
> >     Performing inexact rename detection: 100% (36806539/36806539), done.
> >     Successfully rebased and updated detached HEAD.
> >
> >     real    1m20.588s
> >     user    1m12.645s
> >     sys     0m6.733s
> >
> > Is there some relevant complexity in the first invocation I'm not seeing
> > that explains it takes more than the double time? I would have expected
> > that
> >
> >     git rebase v5.10
> >
> > does the same as:
> >
> >     git rebase --onto v5.10 $(git merge-base HEAD v5.10)
> >
> > . (FTR:
> >
> >     $ time git merge-base HEAD v5.10
> >     219d54332a09e8d8741c1e1982f5eae56099de85
> >
> >     real    0m0.158s
> >     user    0m0.105s
> >     sys     0m0.052s
> >
> > , 219d5433 is v5.4 as expected.
>
> That does seem surprising, though if an automatic gc completed between
> the two commands that could certainly explain it.  If that theory is
> correct, it would suggest that it'd be difficult for you to reproduce;

This reproduces just fine. The repository is quite big and it is slow at
times. With the same tree on a different machine, the rebase is quicker,
but the factor 2 between the two different commands is visible there,
too:

    uwe@taurus:~/gsrc/linux$ git checkout bc2e99c9c9e0d29494b1739624554e4f5f979d32
    HEAD is now at bc2e99c9c9e0 [...]

    uwe@taurus:~/gsrc/linux$ time git rebase v5.10
    warning: inexact rename detection was skipped due to too many files.
    warning: you may want to set your merge.renamelimit variable to at least 8604 and retry the command.
    warning: inexact rename detection was skipped due to too many files.
    warning: you may want to set your merge.renamelimit variable to at least 8604 and retry the command.
    warning: inexact rename detection was skipped due to too many files.
    warning: you may want to set your merge.renamelimit variable to at least 8604 and retry the command.
    warning: inexact rename detection was skipped due to too many files.
    warning: you may want to set your merge.renamelimit variable to at least 8604 and retry the command.
    Successfully rebased and updated detached HEAD.

    real    0m20.737s
    user    0m14.188s
    sys     0m3.767s

    uwe@taurus:~/gsrc/linux$ git checkout bc2e99c9c9e0d29494b1739624554e4f5f979d32
    HEAD is now at bc2e99c9c9e0 [...]

    uwe@taurus:~/gsrc/linux$ time git rebase --onto v5.10 v5.4
    warning: inexact rename detection was skipped due to too many files.
    warning: you may want to set your merge.renamelimit variable to at least 8604 and retry the command.
    warning: inexact rename detection was skipped due to too many files.
    warning: you may want to set your merge.renamelimit variable to at least 8604 and retry the command.
    warning: inexact rename detection was skipped due to too many files.
    warning: you may want to set your merge.renamelimit variable to at least 8604 and retry the command.
    warning: inexact rename detection was skipped due to too many files.
    warning: you may want to set your merge.renamelimit variable to at least 8604 and retry the command.
    Successfully rebased and updated detached HEAD.

    real    0m12.129s
    user    0m7.196s
    sys     0m3.141s

(This is with a slightly newer git: 2.30.2-1 from Debian)

Then I repeated the test with git 2.32.0-rc1 (wgit is just calling
bin-wrappers/git in my git working copy):

    uwe@taurus:~/gsrc/linux$ wgit version
    git version 2.32.0.rc1

    uwe@taurus:~/gsrc/linux$ wgit checkout bc2e99c9c9e0d29494b1739624554e4f5f979d32
    HEAD is now at bc2e99c9c9e0 [...]

    uwe@taurus:~/gsrc/linux$ time wgit rebase v5.10
    warning: inexact rename detection was skipped due to too many files.
    warning: you may want to set your merge.renamelimit variable to at least 8024 and retry the command.
    warning: inexact rename detection was skipped due to too many files.
    warning: you may want to set your merge.renamelimit variable to at least 8024 and retry the command.
    warning: inexact rename detection was skipped due to too many files.
    warning: you may want to set your merge.renamelimit variable to at least 8024 and retry the command.
    warning: inexact rename detection was skipped due to too many files.
    warning: you may want to set your merge.renamelimit variable to at least 8024 and retry the command.
    Successfully rebased and updated detached HEAD.

    real    0m19.438s
    user    0m13.629s
    sys     0m3.299s

    uwe@taurus:~/gsrc/linux$ wgit checkout bc2e99c9c9e0d29494b1739624554e4f5f979d32
    HEAD is now at bc2e99c9c9e0 [...]

    uwe@taurus:~/gsrc/linux$ time wgit rebase --onto v5.10 v5.4
    warning: inexact rename detection was skipped due to too many files.
    warning: you may want to set your merge.renamelimit variable to at least 8024 and retry the command.
    warning: inexact rename detection was skipped due to too many files.
    warning: you may want to set your merge.renamelimit variable to at least 8024 and retry the command.
    warning: inexact rename detection was skipped due to too many files.
    warning: you may want to set your merge.renamelimit variable to at least 8024 and retry the command.
    warning: inexact rename detection was skipped due to too many files.
    warning: you may want to set your merge.renamelimit variable to at least 8024 and retry the command.
    Successfully rebased and updated detached HEAD.

    real    0m13.848s
    user    0m8.315s
    sys     0m3.182s

So the surprise persists.

> running again with either command would give you something closer to
> the lower time both times.  Is that the case?  (Also, what's the
> output of "git count-objects -v"?)

After the above commands I have:

    count: 3203
    size: 17664
    in-pack: 4763753
    packs: 11
    size-pack: 1273957
    prune-packable: 19
    garbage: 0
    size-garbage: 0
    alternate: /home/uwe/var/gitstore/linux.git/objects

(On the repository I did this initially I have:

    warning: garbage found: .git/objects/pack/pack-864148a84c0524073ed8c8aa1a76155d5c677879.pack.temp
    warning: garbage found: /ptx/src/git/linux.git/objects/pack/tmp_pack_X9gHnq
    count: 2652
    size: 14640
    in-pack: 2117015
    packs: 8
    size-pack: 574167
    prune-packable: 856
    garbage: 2
    size-garbage: 1114236
    alternate: /ptx/src/git/linux.git/objects

(Is the garbage a reason this is so slow? Can I just remove the two
files pointed out?)

> I'd love to try this with git-2.32.0-rc1 (or even my not-yet-upstream
> patches that optimize even further) with adding "--strategy=ort" to
> your rebase command to see how much of a timing difference it makes.
> Any chance the patches could either be published, or you could retry
> with git-2.32.0-rc1 and add the --strategy=ort command line option to
> your rebase command(s)?

With --strategy=ort added I have:

    uwe@taurus:~/gsrc/linux$ time wgit rebase --strategy=ort v5.10
    Successfully rebased and updated detached HEAD.

    real    0m19.202s
    user    0m12.724s
    sys     0m2.961s

    [...]

    uwe@taurus:~/gsrc/linux$ time wgit rebase --strategy=ort --onto v5.10 v5.4
    Successfully rebased and updated detached HEAD.

    real    0m12.395s
    user    0m6.638s
    sys     0m3.284s

So the warnings about inexact rename detection don't appear and it's a
bit faster, but I still see the timing difference between these two
commands.

I assume you are still interested in seeing this branch? I think
anonymising it shouldn't be so hard, the patches are not so big. I'll
modify the branch to make it shareable and assuming the problem still
reproduces with it will share it with you.

Best regards
Uwe

-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | https://www.pengutronix.de/ |
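The warnings quoted in the message above suggest raising merge.renamelimit. A minimal sketch of the two usual ways to do that follows: a one-off override with `git -c`, and a setting stored in the repository's config. The value 9999 is only an example that exceeds the 8604 named in the warnings, and the scratch repository is a stand-in for the kernel tree.

```shell
#!/bin/sh
# Sketch: raise merge.renamelimit for one command or for the whole repository.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q .

# One-off override: applies only to the git process it is passed to.
# In the kernel tree this would read: git -c merge.renamelimit=9999 rebase v5.10
one_off=$(git -c merge.renamelimit=9999 config --get merge.renamelimit)

# Persistent setting: stored in this repository's .git/config.
git config merge.renamelimit 9999
stored=$(git config --get merge.renamelimit)

echo "one-off value: $one_off"
echo "stored value:  $stored"
```

Using the same limit on both machines keeps rename detection either on or off for every run, so that the timings being compared cover the same work.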
* Re: time needed to rebase shortend by using --onto?
  2021-05-27 21:59   ` Uwe Kleine-König
@ 2021-05-27 22:15     ` Uwe Kleine-König
  2021-05-28  5:38       ` Elijah Newren
  2021-05-27 23:08     ` Elijah Newren
  1 sibling, 1 reply; 12+ messages in thread
From: Uwe Kleine-König @ 2021-05-27 22:15 UTC
  To: Elijah Newren; +Cc: Git Mailing List, entwicklung

On Thu, May 27, 2021 at 11:59:47PM +0200, Uwe Kleine-König wrote:
> I assume you are still interested in seeing this branch? I think
> anonymising it shouldn't be so hard, the patches are not so big. I'll
> modify the branch to make it shareable and assuming the problem still
> reproduces with it will share it with you.

You can find the anonymised branch at:

    https://git.pengutronix.de/git/ukl/linux rebase-timing

Best regards
Uwe

-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | https://www.pengutronix.de/ |
* Re: time needed to rebase shortend by using --onto?
  2021-05-27 22:15     ` Uwe Kleine-König
@ 2021-05-28  5:38       ` Elijah Newren
  0 siblings, 0 replies; 12+ messages in thread
From: Elijah Newren @ 2021-05-28 5:38 UTC
  To: Uwe Kleine-König; +Cc: Git Mailing List, entwicklung

On Thu, May 27, 2021 at 3:15 PM Uwe Kleine-König
<u.kleine-koenig@pengutronix.de> wrote:
>
> On Thu, May 27, 2021 at 11:59:47PM +0200, Uwe Kleine-König wrote:
> > I assume you are still interested in seeing this branch? I think
> > anonymising it shouldn't be so hard, the patches are not so big. I'll
> > modify the branch to make it shareable and assuming the problem still
> > reproduces with it will share it with you.
>
> You can find the anonymised branch at:
>
>     https://git.pengutronix.de/git/ukl/linux rebase-timing

Cool, this helps.

Short summary:
  * I can't reproduce your factor 2 timing difference when rename
    detection is active.  There still have to be other factors at play
    (e.g. auto-gc).  Can you reproduce those?
  * Your timing of merge-base was likely mistaken, due to where HEAD
    pointed (see below).
  * Without --onto, it looks like a huge chunk of time is spent checking
    whether any of the 4 patches happen to match one of the patches in
    v5.4..v5.10; passing --reapply-cherry-picks will save that time
    (that really ought to be the default IMO, but backward compatibility
    makes that impossible).
  * Adding --no-fork-point may have also sped up the timing for your
    original report (not your follow-up), depending on what's in your
    local reflog.
  * There are almost certainly other optimization opportunities
    available here.

More detailed investigation:

You did a few things that were quite a bit different between your
original report and your follow-up about reproducing.
So I'll try to be clear about what I tried with your repository, but
first let me point out possible points of confusion:

1) In the original report, you clearly had merge.renamelimit set high
enough to detect renames, whereas in the second you didn't.  I can
control for this and show both with and without having a high enough
limit.

2) It appears in the first report that you were likely on a branch
when you ran rebase, whereas in the second you were clearly using a
detached HEAD (by first checking out a specific commit).  Since you
didn't use --no-fork-point, your local reflog would be consulted and
the history of changes to the branch might add to the overall
computation time.  That's something I won't be able to reproduce since
I don't have your reflog.

3) It's hard for me to shake that there might have been an automatic
gc or something else running on the system that occurred between the
first and second runs of your original report.  I obviously can't
reproduce anything like that.

4) The timing of the "git merge-base HEAD v5.10" command you ran was
almost certainly done AFTER the rebase, which gives misleading
results.  When I run the same command AFTER rebasing, I see similarly
really low timings:

    $ time git merge-base HEAD v5.10
    2c85ebc57b3e1817b6ce1a6b703928e113a90442

    real    0m0.004s

Whereas BEFORE rebasing, I see significantly bigger times

    $ time git merge-base HEAD v5.10
    219d54332a09e8d8741c1e1982f5eae56099de85

    real    0m1.750s

It would have been clearer to just use the command "time git
merge-base v5.10 origin/rebase-timing" (or whatever the original
un-rebased branch was instead of using HEAD).

Okay, with all that out of the way, I cloned your repo and ran a bunch
of timings.  I first ran:

    $ git config merge.renamelimit 9999

to make sure renames are detected.  I'll override it below when I
don't want renames detected.
With this config, on my machine, using git v2.32.0-rc1:

    53.908s  git rebase v5.10
    47.668s  git rebase --onto v5.10 v5.4
    18.574s  git -c merge.renamelimit=1000 rebase v5.10
    11.800s  git -c merge.renamelimit=1000 rebase --onto v5.10 v5.4
    16.610s  git rebase -sort v5.10
    10.780s  git rebase -sort --onto v5.10 v5.4
    10.670s  git rebase -sort --reapply-cherry-picks v5.10
    10.589s  git rebase -sort --reapply-cherry-picks --onto v5.10 v5.4

Using my development version of git (has a few more optimizations):

    16.073s  git rebase -sort v5.10
     9.778s  git rebase -sort --onto v5.10 v5.4
     9.541s  git rebase -sort --reapply-cherry-picks v5.10
     9.062s  git rebase -sort --reapply-cherry-picks --onto v5.10 v5.4

Using my development version + replacing can_fast_forward() with "return 0":

     9.221s  git rebase -sort --reapply-cherry-picks v5.10
     8.124s  git rebase -sort --reapply-cherry-picks --onto v5.10 v5.4

Note the following timings too (git version doesn't really matter):

     6.495s  git switch --quiet --detach v5.10
     1.741s  git merge-base v5.10 origin/rebase-timing

So a theoretical lower bound is somewhere around 6.5s with --onto, and
8s without it, since these operations are just necessary.

fast-rebase gets really close to that theoretical lower bound; it
involves running all three of the following commands (because it won't
do the checkout for you and needs a branch name):

     6.495s  git switch --quiet --detach v5.10
     0.005s  git branch -f rebase-timing origin/rebase-timing
     0.176s  test-tool fast-rebase --onto HEAD v5.4 rebase-timing

for a combined time of 6.676s.

Going back to the real rebase command, though (with my
still-not-upstream git version), using trace2 and summing across
common region names, I saw the following timings:

    $ git switch --quiet --detach origin/rebase-timing && summarize-perf git rebase -sort --reapply-cherry-picks v5.10
    Successfully rebased and updated detached HEAD.
    Accumulated times:
    2.104 :        <unmeasured> (21.3%)
    6.628 :      7 : label:unpack_trees
    6.354 :        <unmeasured> (95.9%)
    0.192 :      7 : ..label:traverse_trees
    0.082 :      1 : ..label:update
    0.000 :      1 : ..label:Filtering content
    0.515 :      4 : label:refresh
    0.380 :      6 : label:do_write_index /home/newren/floss/uwe-linux/.git/index.lock
    0.375 :        <unmeasured> (98.7%)
    0.005 :      6 : ..label:write
    0.103 :      4 : label:preload
    0.081 :      4 : label:checkout
    0.000 :        <unmeasured> ( 0.2%)
    0.081 :      4 : ..label:unpack_trees
    0.014 :        <unmeasured> (17.7%)
    0.065 :      4 : ....label:traverse_trees
    0.002 :      4 : ....label:update
    0.000 :      4 : ....label:Filtering content
    0.069 :      7 : label:do_read_index .git/index
    0.060 :        <unmeasured> (88.0%)
    0.008 :      7 : ..label:read
    0.011 :      4 : label:incore_nonrecursive
    0.000 :        <unmeasured> ( 2.8%)
    0.009 :      4 : ..label:process_entries
    0.000 :        <unmeasured> ( 1.3%)
    0.008 :      4 : ....label:processing
    0.000 :      4 : ....label:process_entries setup
    0.000 :        <unmeasured> (21.9%)
    0.000 :      4 : ......label:plist special sort
    0.000 :      4 : ......label:plist copy
    0.000 :      4 : ......label:plist grow
    0.000 :      4 : ....label:process_entries cleanup
    0.002 :      4 : ..label:collect_merge_info
    0.000 :        <unmeasured> ( 6.5%)
    0.002 :      4 : ....label:traverse_trees
    0.000 :      4 : ..label:merge_start
    0.000 :        <unmeasured> (54.4%)
    0.000 :      4 : ....label:allocate/init
    0.000 :      4 : ....label:sanity checks
    0.000 :      4 : ..label:renames
    0.000 :      4 : label:write_auto_merge
    0.000 :      4 : label:record_conflicted
    Estimated measurement overhead (.010 ms/region-measure * 134): 0.00134
    Timing including forking: 9.917 (0.026 additional seconds)

From this, the things that stood out to me were:

    2.104 :        <unmeasured> (21.3%)

2.1 of the 9.9 seconds was in rebase somewhere without trace2 regions
to record it.  From above, clearly one big chunk of this time is from
can_fast_forward().  But that's only like 0.5-1.0s.  What's all the
rest?  And can we get rid of most of it somehow?
    6.628 :      7 : label:unpack_trees

This corresponds to the time to switch to v5.10 before starting to
apply patches

    0.515 :      4 : label:refresh

I think this is wasted time trying to re-sync the data from the fact
that rebase shells out to external processes to do work it should do
in-process (namely, "git commit").

    0.380 :      6 : label:do_write_index /home/newren/floss/uwe-linux/.git/index.lock
    0.081 :      4 : label:checkout
    0.069 :      7 : label:do_read_index .git/index

3/4 of this is wasted time from the fact that rebase updates the
working copy and index with every commit instead of just at the end of
the operation.  The "preload" marker may also belong here, or maybe up
with the preload.

    0.011 :      4 : label:incore_nonrecursive

It only took 11 milliseconds to do the actual merging and creating the
new blobs and trees -- and that includes the rename detection time.
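The per-region numbers in the message above come from git's trace2 performance output, summed by the author's own summarize-perf script, which is not shown in this thread. As a rough sketch of how the raw data can be captured, the snippet below runs a rebase with --reapply-cherry-picks on a scratch repository while GIT_TRACE2_PERF points at a file. The repository layout and branch names are hypothetical, and the final grep only counts `region_leave` events rather than reproducing the summary.

```shell
#!/bin/sh
# Sketch on a scratch repository; branch names are hypothetical stand-ins.
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/upstream

echo base > file.txt
git add file.txt
git commit -q -m "base"

git checkout -q -b topic
echo topic > topic.txt
git add topic.txt
git commit -q -m "topic patch"

git checkout -q upstream
echo more >> file.txt
git commit -q -am "upstream work"

git checkout -q --detach topic

# GIT_TRACE2_PERF takes an absolute path; git and its child processes
# append their performance events to that file.
trace=$(mktemp)
GIT_TRACE2_PERF="$trace" git rebase -q --reapply-cherry-picks upstream

echo "trace written to: $trace"
echo "region_leave events: $(grep -c region_leave "$trace" || true)"
```

The trace file holds one event per line; lines for `region_leave` events carry the elapsed time of the region they close, which is the raw material for a per-label summary like the one quoted above.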
* Re: time needed to rebase shortend by using --onto?
  2021-05-27 21:59   ` Uwe Kleine-König
  2021-05-27 22:15     ` Uwe Kleine-König
@ 2021-05-27 23:08     ` Elijah Newren
  2021-05-28 21:40       ` Uwe Kleine-König
  1 sibling, 1 reply; 12+ messages in thread
From: Elijah Newren @ 2021-05-27 23:08 UTC
  To: Uwe Kleine-König; +Cc: Git Mailing List, entwicklung

On Thu, May 27, 2021 at 2:59 PM Uwe Kleine-König
<u.kleine-koenig@pengutronix.de> wrote:
>
> Hello,
>
> On Wed, May 26, 2021 at 07:38:08AM -0700, Elijah Newren wrote:
> > On Wed, May 26, 2021 at 3:13 AM Uwe Kleine-König
> > <u.kleine-koenig@pengutronix.de> wrote:
> > > I have a kernel topic branch containing 4 patches on top of Linux v5.4.
> > > (I didn't speak to the affected customer, so I cannot easily share the
> > > patch stack. If need be I can probably anonymize it or ask if I can
> > > publish the patches.)
> > >
> > > It rebases clean on v5.10:
> > >
> > >     $ time git rebase v5.10
> > >     Performing inexact rename detection: 100% (36806539/36806539), done.
> > >     Performing inexact rename detection: 100% (36806539/36806539), done.
> > >     Performing inexact rename detection: 100% (36806539/36806539), done.
> > >     Performing inexact rename detection: 100% (36806539/36806539), done.
> > >     Successfully rebased and updated detached HEAD.
> > >
> > >     real    3m47.841s
> > >     user    1m25.706s
> > >     sys     0m11.181s
> > >
> > > If I start with the same rev checked out and explicitly specify the
> > > merge base, the rebase process is considerably faster:
> > >
> > >     $ time git rebase --onto v5.10 v5.4
> > >     Performing inexact rename detection: 100% (36806539/36806539), done.
> > >     Performing inexact rename detection: 100% (36806539/36806539), done.
> > >     Performing inexact rename detection: 100% (36806539/36806539), done.
> > >     Performing inexact rename detection: 100% (36806539/36806539), done.
> > >     Successfully rebased and updated detached HEAD.
> > >
> > >     real    1m20.588s
> > >     user    1m12.645s
> > >     sys     0m6.733s

Note: In your original report you had rename detection and it clearly
took a significant amount of time...

> > >
> > > Is there some relevant complexity in the first invocation I'm not seeing
> > > that explains it takes more than the double time? I would have expected
> > > that
> > >
> > >     git rebase v5.10
> > >
> > > does the same as:
> > >
> > >     git rebase --onto v5.10 $(git merge-base HEAD v5.10)
> > >
> > > . (FTR:
> > >
> > >     $ time git merge-base HEAD v5.10
> > >     219d54332a09e8d8741c1e1982f5eae56099de85
> > >
> > >     real    0m0.158s
> > >     user    0m0.105s
> > >     sys     0m0.052s
> > >
> > > , 219d5433 is v5.4 as expected.
> >
> > That does seem surprising, though if an automatic gc completed between
> > the two commands that could certainly explain it.  If that theory is
> > correct, it would suggest that it'd be difficult for you to reproduce;
>
> This reproduces just fine. The repository is quite big and it is slow at
> times. With the same tree on a different machine, the rebase is quicker,
> but the factor 2 between the two different commands is visible there,
> too:
>
>     uwe@taurus:~/gsrc/linux$ git checkout bc2e99c9c9e0d29494b1739624554e4f5f979d32
>     HEAD is now at bc2e99c9c9e0 [...]
>
>     uwe@taurus:~/gsrc/linux$ time git rebase v5.10
>     warning: inexact rename detection was skipped due to too many files.
>     warning: you may want to set your merge.renamelimit variable to at least 8604 and retry the command.
>     warning: inexact rename detection was skipped due to too many files.
>     warning: you may want to set your merge.renamelimit variable to at least 8604 and retry the command.
>     warning: inexact rename detection was skipped due to too many files.
>     warning: you may want to set your merge.renamelimit variable to at least 8604 and retry the command.
>     warning: inexact rename detection was skipped due to too many files.
>     warning: you may want to set your merge.renamelimit variable to at least 8604 and retry the command.
>     Successfully rebased and updated detached HEAD.
>
>     real    0m20.737s
>     user    0m14.188s
>     sys     0m3.767s
>
>     uwe@taurus:~/gsrc/linux$ git checkout bc2e99c9c9e0d29494b1739624554e4f5f979d32
>     HEAD is now at bc2e99c9c9e0 [...]
>
>     uwe@taurus:~/gsrc/linux$ time git rebase --onto v5.10 v5.4
>     warning: inexact rename detection was skipped due to too many files.
>     warning: you may want to set your merge.renamelimit variable to at least 8604 and retry the command.
>     warning: inexact rename detection was skipped due to too many files.
>     warning: you may want to set your merge.renamelimit variable to at least 8604 and retry the command.
>     warning: inexact rename detection was skipped due to too many files.
>     warning: you may want to set your merge.renamelimit variable to at least 8604 and retry the command.
>     warning: inexact rename detection was skipped due to too many files.
>     warning: you may want to set your merge.renamelimit variable to at least 8604 and retry the command.
>     Successfully rebased and updated detached HEAD.
>
>     real    0m12.129s
>     user    0m7.196s
>     sys     0m3.141s
>
> (This is with a slightly newer git: 2.30.2-1 from Debian)

And here, there was no rename detection so this isn't the same thing
anymore.  You could try setting merge.renameLimit higher.

However, the 7-8 second difference (and the likely large differences
between 5.4 and 5.10) do suggest that Junio's hunch that fork-point
behavior being at play could be an issue in these two commands.

> Then I repeated the test with git 2.32.0-rc1 (wgit is just calling
> bin-wrappers/git in my git working copy):
>
>     uwe@taurus:~/gsrc/linux$ wgit version
>     git version 2.32.0.rc1
>
>     uwe@taurus:~/gsrc/linux$ wgit checkout bc2e99c9c9e0d29494b1739624554e4f5f979d32
>     HEAD is now at bc2e99c9c9e0 [...]
>
>     uwe@taurus:~/gsrc/linux$ time wgit rebase v5.10
>     warning: inexact rename detection was skipped due to too many files.
>     warning: you may want to set your merge.renamelimit variable to at least 8024 and retry the command.
>     warning: inexact rename detection was skipped due to too many files.
>     warning: you may want to set your merge.renamelimit variable to at least 8024 and retry the command.
>     warning: inexact rename detection was skipped due to too many files.
>     warning: you may want to set your merge.renamelimit variable to at least 8024 and retry the command.
>     warning: inexact rename detection was skipped due to too many files.
>     warning: you may want to set your merge.renamelimit variable to at least 8024 and retry the command.
>     Successfully rebased and updated detached HEAD.
>
>     real    0m19.438s
>     user    0m13.629s
>     sys     0m3.299s
>
>     uwe@taurus:~/gsrc/linux$ wgit checkout bc2e99c9c9e0d29494b1739624554e4f5f979d32
>     HEAD is now at bc2e99c9c9e0 [...]
>
>     uwe@taurus:~/gsrc/linux$ time wgit rebase --onto v5.10 v5.4
>     warning: inexact rename detection was skipped due to too many files.
>     warning: you may want to set your merge.renamelimit variable to at least 8024 and retry the command.
>     warning: inexact rename detection was skipped due to too many files.
>     warning: you may want to set your merge.renamelimit variable to at least 8024 and retry the command.
>     warning: inexact rename detection was skipped due to too many files.
>     warning: you may want to set your merge.renamelimit variable to at least 8024 and retry the command.
>     warning: inexact rename detection was skipped due to too many files.
>     warning: you may want to set your merge.renamelimit variable to at least 8024 and retry the command.
>     Successfully rebased and updated detached HEAD.
>
>     real    0m13.848s
>     user    0m8.315s
>     sys     0m3.182s
>
> So the surprise persists.

Yeah, with no rename detection, the newer git version isn't going to
make a bit of difference.

> > running again with either command would give you something closer to
> > the lower time both times.  Is that the case?  (Also, what's the
> > output of "git count-objects -v"?)
>
> After the above commands I have:
>
>     count: 3203
>     size: 17664
>     in-pack: 4763753
>     packs: 11
>     size-pack: 1273957
>     prune-packable: 19
>     garbage: 0
>     size-garbage: 0

So, not freshly packed, but not in need of an automatic gc either.

>     alternate: /home/uwe/var/gitstore/linux.git/objects

You've got an alternate?  How well packed is it?  (What does "git
count-objects -v" in that other repo show?)

>
> (On the repository I did this initially I have:
>
>     warning: garbage found: .git/objects/pack/pack-864148a84c0524073ed8c8aa1a76155d5c677879.pack.temp
>     warning: garbage found: /ptx/src/git/linux.git/objects/pack/tmp_pack_X9gHnq
>     count: 2652
>     size: 14640
>     in-pack: 2117015
>     packs: 8
>     size-pack: 574167
>     prune-packable: 856
>     garbage: 2
>     size-garbage: 1114236
>     alternate: /ptx/src/git/linux.git/objects
>
> (Is the garbage a reason this is so slow? Can I just remove the two
> files pointed out?)

If there isn't some still-running git operation that is fetching and
writing to these files, then yes they can be cleaned out.  I doubt
they'd make too much of a difference, though.  I was more curious if
you went from say 10000 loose objects to ~0, or from 50+ packs down to
1 between operations due to an automatic gc completing.

> > I'd love to try this with git-2.32.0-rc1 (or even my not-yet-upstream
> > patches that optimize even further) with adding "--strategy=ort" to
> > your rebase command to see how much of a timing difference it makes.
> > Any chance the patches could either be published, or you could retry
> > with git-2.32.0-rc1 and add the --strategy=ort command line option to
> > your rebase command(s)?
>
> With --strategy=ort added I have:
>
>     uwe@taurus:~/gsrc/linux$ time wgit rebase --strategy=ort v5.10
>     Successfully rebased and updated detached HEAD.
>
>     real    0m19.202s
>     user    0m12.724s
>     sys     0m2.961s
>
>     [...]
>
> uwe@taurus:~/gsrc/linux$ time wgit rebase --strategy=ort --onto v5.10 v5.4
> Successfully rebased and updated detached HEAD.
>
> real    0m12.395s
> user    0m6.638s
> sys     0m3.284s
>
> So the warnings about inexact rename detection don't appear and it's a
> bit faster, but I still see the timing difference between these two
> commands.

Right, this says that --strategy=ort WITH rename detection is as fast
as the default --strategy=recursive WITHOUT rename detection. It's not
a fair comparison (you'd need to set merge.renameLimit higher and
re-run the cases where you had warnings), but is interesting
nonetheless. It basically suggests that rename detection comes for
free with the ort strategy.

> I assume you are still interested in seeing this branch? I think
> anonymising it shouldn't be so hard, the patches are not so big. I'll
> modify the branch to make it shareable and assuming the problem still
> reproduces with it will share it with you.

Thanks.
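What rename detection during a rebase buys can be seen in a throwaway repository. The following is only a sketch under stated assumptions: the repository, the branch names main and topic, and the files a.txt and b.txt are made up for illustration, and the default merge strategy of the installed git is used (the thread selects ort explicitly with --strategy=ort on git 2.32.0-rc1). The merge.renameLimit value is the one reported later in the thread; a toy repository never comes near any limit.

```shell
set -e
# Hypothetical scratch repository; nothing here touches a real kernel tree.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example"
git config user.email "example@example.invalid"

printf 'line 1\nline 2\nline 3\n' > a.txt
git add a.txt
git commit -q -m "base: add a.txt"
git branch topic

# Upstream renames the file ...
git mv a.txt b.txt
git commit -q -m "upstream: rename a.txt to b.txt"

# ... while the topic branch edits it under its old name.
git checkout -q topic
printf 'line 1\nline 2 changed on topic\nline 3\n' > a.txt
git commit -q -a -m "topic: edit a.txt"

# Raise the limit as suggested by the warnings quoted in the thread.
git config merge.renameLimit 10000

# With rename detection the edit follows the file to its new name.
git rebase -q main
cat b.txt
```

If rename detection were skipped, the same rebase would be expected to stop with a modify/delete conflict instead of carrying the edit over to b.txt.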
* Re: time needed to rebase shortend by using --onto? 2021-05-27 23:08 ` Elijah Newren @ 2021-05-28 21:40 ` Uwe Kleine-König 2021-05-28 22:26 ` Elijah Newren 2021-05-29 16:59 ` Felipe Contreras 0 siblings, 2 replies; 12+ messages in thread From: Uwe Kleine-König @ 2021-05-28 21:40 UTC (permalink / raw) To: Elijah Newren; +Cc: Git Mailing List, entwicklung [-- Attachment #1: Type: text/plain, Size: 15873 bytes --] Hello Elijah, On Thu, May 27, 2021 at 04:08:32PM -0700, Elijah Newren wrote: > On Thu, May 27, 2021 at 2:59 PM Uwe Kleine-König > <u.kleine-koenig@pengutronix.de> wrote: > > On Wed, May 26, 2021 at 07:38:08AM -0700, Elijah Newren wrote: > > > On Wed, May 26, 2021 at 3:13 AM Uwe Kleine-König > > > <u.kleine-koenig@pengutronix.de> wrote: > > > > I have a kernel topic branch containing 4 patches on top of Linux v5.4. > > > > (I didn't speak to the affected customer, so I cannot easily share the > > > > patch stack. If need be I can probably anonymize it or ask if I can > > > > publish the patches.) > > > > > > > > It rebases clean on v5.10: > > > > > > > > $ time git rebase v5.10 > > > > Performing inexact rename detection: 100% (36806539/36806539), done. > > > > Performing inexact rename detection: 100% (36806539/36806539), done. > > > > Performing inexact rename detection: 100% (36806539/36806539), done. > > > > Performing inexact rename detection: 100% (36806539/36806539), done. > > > > Successfully rebased and updated detached HEAD. > > > > > > > > real 3m47.841s > > > > user 1m25.706s > > > > sys 0m11.181s > > > > > > > > If I start with the same rev checked out and explicitly specify the > > > > merge base, the rebase process is considerably faster: > > > > > > > > $ time git rebase --onto v5.10 v5.4 > > > > Performing inexact rename detection: 100% (36806539/36806539), done. > > > > Performing inexact rename detection: 100% (36806539/36806539), done. > > > > Performing inexact rename detection: 100% (36806539/36806539), done. 
> > > > Performing inexact rename detection: 100% (36806539/36806539), done. > > > > Successfully rebased and updated detached HEAD. > > > > > > > > real 1m20.588s > > > > user 1m12.645s > > > > sys 0m6.733s > > Note: In your original report you had rename detection and it clearly > took a significant amount of time... FTR: My impression is that the repo I used for the first report is slow in general. Also git log sometimes takes a considerable time to start emitting output. > > > > Is there some relevant complexity in the first invocation I'm not seeing > > > > that explains it takes more than the double time? I would have expected > > > > that > > > > > > > > git rebase v5.10 > > > > > > > > does the same as: > > > > > > > > git rebase --onto v5.10 $(git merge-base HEAD v5.10) > > > > > > > > . (FTR: > > > > > > > > $ time git merge-base HEAD v5.10 > > > > 219d54332a09e8d8741c1e1982f5eae56099de85 > > > > > > > > real 0m0.158s > > > > user 0m0.105s > > > > sys 0m0.052s > > > > > > > > , 219d5433 is v5.4 as expected. > > > > > > That does seem surprising, though if an automatic gc completed between > > > the two commands that could certainly explain it. If that theory is > > > correct, it would suggest that it'd be difficult for you to reproduce; > > > > This reproduces just fine. The repository is quite big and it is slow at > > times. With the same tree on a different machine, the rebase is quicker, > > but the factor 2 between the two different commands is visible there, > > too: > > > > uwe@taurus:~/gsrc/linux$ git checkout bc2e99c9c9e0d29494b1739624554e4f5f979d32 > > HEAD is now at bc2e99c9c9e0 [...] > > > > uwe@taurus:~/gsrc/linux$ time git rebase v5.10 > > warning: inexact rename detection was skipped due to too many files. > > warning: you may want to set your merge.renamelimit variable to at least 8604 and retry the command. > > warning: inexact rename detection was skipped due to too many files. 
> > warning: you may want to set your merge.renamelimit variable to at least 8604 and retry the command. > > warning: inexact rename detection was skipped due to too many files. > > warning: you may want to set your merge.renamelimit variable to at least 8604 and retry the command. > > warning: inexact rename detection was skipped due to too many files. > > warning: you may want to set your merge.renamelimit variable to at least 8604 and retry the command. > > Successfully rebased and updated detached HEAD. > > > > real 0m20.737s > > user 0m14.188s > > sys 0m3.767s > > > > uwe@taurus:~/gsrc/linux$ git checkout bc2e99c9c9e0d29494b1739624554e4f5f979d32 > > HEAD is now at bc2e99c9c9e0 [...] > > > > uwe@taurus:~/gsrc/linux$ time git rebase --onto v5.10 v5.4 > > warning: inexact rename detection was skipped due to too many files. > > warning: you may want to set your merge.renamelimit variable to at least 8604 and retry the command. > > warning: inexact rename detection was skipped due to too many files. > > warning: you may want to set your merge.renamelimit variable to at least 8604 and retry the command. > > warning: inexact rename detection was skipped due to too many files. > > warning: you may want to set your merge.renamelimit variable to at least 8604 and retry the command. > > warning: inexact rename detection was skipped due to too many files. > > warning: you may want to set your merge.renamelimit variable to at least 8604 and retry the command. > > Successfully rebased and updated detached HEAD. > > > > real 0m12.129s > > user 0m7.196s > > sys 0m3.141s > > > > (This is with a slightly newer git: 2.30.2-1 from Debian) > > And here, there was no rename detection so this isn't the same thing > anymore. You could try setting merge.renameLimit higher. 
I learned a few things since my last mail, here comes an updated test again on the machine and repo used for the initial report: ukl@dude.ptx:~/gsrc/linux$ wgit version git version 2.32.0.rc1 ukl@dude.ptx:~/gsrc/linux$ cat rebasecheck #!/bin/bash set -e # do it once to heat the caches and ensure all objects are available already to have the next cycles identical. wgit checkout 0091ecb84cfdef0f4cb65810219f5ac9bb4341e5 wgit rebase v5.10 wgit checkout 0091ecb84cfdef0f4cb65810219f5ac9bb4341e5 echo "rebase v5.10" time wgit rebase v5.10 wgit checkout 0091ecb84cfdef0f4cb65810219f5ac9bb4341e5 echo "rebase --onto v5.10 v5.4" time wgit rebase --onto v5.10 v5.4 I do the rebase now once before the timing for the reasons described in the comment. The second identical command is quite a bit quicker. Also now that the commands are scripted they are done in a smaller time frame (which matters as the machine is used heavily among my colleagues and me). I run the script a few times in a row, after all colleagues are in their week-end: ukl@dude.ptx:~/gsrc/linux$ bash rebasecheck ... rebase v5.10 ... real 1m13.579s user 1m2.919s sys 0m6.220s ... rebase --onto v5.10 v5.4 ... real 1m2.852s user 0m53.780s sys 0m6.225s ukl@dude.ptx:~/gsrc/linux$ bash rebasecheck ... rebase v5.10 ... real 1m10.816s user 1m3.344s sys 0m6.991s ... rebase --onto v5.10 v5.4 ... real 0m59.695s user 0m53.510s sys 0m5.579s ukl@dude.ptx:~/gsrc/linux$ bash rebasecheck ... rebase v5.10 ... real 1m9.688s user 1m3.346s sys 0m6.105s ... rebase --onto v5.10 v5.4 ... real 0m59.981s user 0m52.931s sys 0m6.282s So it's not a factor 2 any more, but still reproducibly quicker when --onto is used. > However, the 7-8 second difference (and the likely large differences > between 5.4 and 5.10) do suggest that Junio's hunch that fork-point > behavior being at play could be an issue in these two commands. 
> > > Then I repeated the test with git 2.32.0-rc1 (wgit is just calling > > bin-wrappers/git in my git working copy): > > > > uwe@taurus:~/gsrc/linux$ wgit version > > git version 2.32.0.rc1 > > > > uwe@taurus:~/gsrc/linux$ wgit checkout bc2e99c9c9e0d29494b1739624554e4f5f979d32 > > HEAD is now at bc2e99c9c9e0 [...] > > > > uwe@taurus:~/gsrc/linux$ time wgit rebase v5.10 > > warning: inexact rename detection was skipped due to too many files. > > warning: you may want to set your merge.renamelimit variable to at least 8024 and retry the command. > > warning: inexact rename detection was skipped due to too many files. > > warning: you may want to set your merge.renamelimit variable to at least 8024 and retry the command. > > warning: inexact rename detection was skipped due to too many files. > > warning: you may want to set your merge.renamelimit variable to at least 8024 and retry the command. > > warning: inexact rename detection was skipped due to too many files. > > warning: you may want to set your merge.renamelimit variable to at least 8024 and retry the command. > > Successfully rebased and updated detached HEAD. > > > > real 0m19.438s > > user 0m13.629s > > sys 0m3.299s > > > > uwe@taurus:~/gsrc/linux$ wgit checkout bc2e99c9c9e0d29494b1739624554e4f5f979d32 > > HEAD is now at bc2e99c9c9e0 [...] > > > > uwe@taurus:~/gsrc/linux$ time wgit rebase --onto v5.10 v5.4 > > warning: inexact rename detection was skipped due to too many files. > > warning: you may want to set your merge.renamelimit variable to at least 8024 and retry the command. > > warning: inexact rename detection was skipped due to too many files. > > warning: you may want to set your merge.renamelimit variable to at least 8024 and retry the command. > > warning: inexact rename detection was skipped due to too many files. > > warning: you may want to set your merge.renamelimit variable to at least 8024 and retry the command. > > warning: inexact rename detection was skipped due to too many files. 
> > warning: you may want to set your merge.renamelimit variable to at least 8024 and retry the command. > > Successfully rebased and updated detached HEAD. > > > > real 0m13.848s > > user 0m8.315s > > sys 0m3.182s > > > > So the surprise persists. > > Yeah, with no rename detection, the newer git version isn't going to > make a bit of difference. > > > > running again with either command would give you something closer to > > > the lower time both times. Is that the case? (Also, what's the > > > output of "git count-objects -v"?) > > > > After the above commands I have: > > > > count: 3203 > > size: 17664 > > in-pack: 4763753 > > packs: 11 > > size-pack: 1273957 > > prune-packable: 19 > > garbage: 0 > > size-garbage: 0 > > So, not freshly packed, but not in need of an automatic gc either. > > > alternate: /home/uwe/var/gitstore/linux.git/objects > > You've got an alternate? How well packed is it? (What does "git > count-objects -v" in that other repo show?) > > > > > (On the repository I did this initially I have: > > > > warning: garbage found: .git/objects/pack/pack-864148a84c0524073ed8c8aa1a76155d5c677879.pack.temp > > warning: garbage found: /ptx/src/git/linux.git/objects/pack/tmp_pack_X9gHnq > > count: 2652 > > size: 14640 > > in-pack: 2117015 > > packs: 8 > > size-pack: 574167 > > prune-packable: 856 > > garbage: 2 > > size-garbage: 1114236 > > alternate: /ptx/src/git/linux.git/objects In the alternate I have: ukl@dude.ptx:/ptx/src/git/linux.git/objects$ wgit count-objects -v warning: garbage found: /ptx/work/user/git/linux.git/objects/pack/tmp_pack_X9gHnq count: 5035 size: 40720 in-pack: 87083076 packs: 1108 size-pack: 51109693 prune-packable: 3050 garbage: 1 size-garbage: 1112612 The alternate tracks git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git git://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-stable-rt.git 
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git (only the tags for the two latter). > > (Is the garbage a reason this is so slow? Can I just remove the two > > files pointed out?) > > If there isn't some still-running git operation that is fetching and > writing to these files, then yes they can be cleaned out. I doubt > they'd make too much of a difference, though. I was more curious if > you went from say 10000 loose objects to ~0, or from 50+ packs down to > 1 between operations due to an automatic gc completing. > > > > I'd love to try this with git-2.32.0-rc1 (or even my not-yet-upstream > > > patches that optimize even further) with adding "--strategy=ort" to > > > your rebase command to see how much of a timing difference it makes. > > > Any chance the patches could either be published, or you could retry > > > with git-2.32.0-rc1 and add the --strategy=ort command line option to > > > your rebase command(s)? > > > > With --strategy=ort added I have: > > > > uwe@taurus:~/gsrc/linux$ time wgit rebase --strategy=ort v5.10 > > Successfully rebased and updated detached HEAD. > > > > real 0m19.202s > > user 0m12.724s > > sys 0m2.961s > > > > [...] > > > > uwe@taurus:~/gsrc/linux$ time wgit rebase --strategy=ort --onto v5.10 v5.4 > > Successfully rebased and updated detached HEAD. > > > > real 0m12.395s > > user 0m6.638s > > sys 0m3.284s > > > > So the warnings about inexact rename detection don't appear and it's a > > bit faster, but I still see the timing difference between these two > > commands. > > Right, this says that --strategy=ort WITH rename detection is as fast > as the default --strategy=recursive WITHOUT rename detection. I rerun the script with -sort added: ukl@dude.ptx:~/gsrc/linux$ bash rebasecheck ... rebase v5.10 ... real 0m25.047s user 0m17.652s sys 0m5.802s ... rebase --onto v5.10 v5.4 ... real 0m12.471s user 0m7.854s sys 0m4.413s ukl@dude.ptx:~/gsrc/linux$ bash rebasecheck ... rebase v5.10 ... 
real    0m22.180s
user    0m17.219s
sys     0m4.701s
...
rebase --onto v5.10 v5.4
...
real    0m12.341s
user    0m7.308s
sys     0m4.632s

So -sort is quite a bit quicker, but the ~10s overhead when not using
--onto is visible there, too.

Looking at the timing of the output, the 10s time difference occurs
before "Rebasing (1/4)" is emitted.

        wgit rebase -sort --onto v5.10 v5.10

behaves like

        wgit rebase -sort v5.10

and if I only rebase the first two patches (instead of four) it still
takes nearly the same time. Another test I did was:

        time wgit rebase -sort --onto v5.10 v5.7

        real    0m17.712s
        user    0m11.570s
        sys     0m5.396s

So there seems to be something before the actual rebase is done that
takes longer when HEAD..$base contains more objects.

Given that

        ukl@dude.ptx:~/gsrc/linux$ time wgit log --oneline --cherry v5.10...0091ecb84cfdef0f4cb65810219f5ac9bb4341e5
        + 0091ecb84cfd (ptx/ukl/rebase-timing) nvmem: core: skip child nodes not matching binding
        + 38af1d38c542 spidev: add "hxxxxxxx,xxxxxx" compatible
        + a7edcfb6a968 regmap: fix memory leak in regmap_debugfs_init()
        + b1d90bc89408 pci: add quirk for txxxxx FPGA watchdog

        real    0m10.783s
        user    0m10.346s
        sys     0m0.436s

I guess this range is searched for commits that have the same patch id
as the patches to rebase?

> It's not a fair comparison (you'd need to set merge.renameLimit higher
> and re-run the cases where you had warnings), but is interesting
> nonetheless. It basically suggests that rename detection comes for
> free with the ort strategy.

FTR: In the above repo I have:

        ukl@dude.ptx:~/gsrc/linux$ wgit config merge.renameLimit
        10000

Best regards
Uwe

-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | https://www.pengutronix.de/ |
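To make concrete what "the same patch id" means in the guess above, here is a minimal sketch in a scratch repository. The repository and the branch and file names are hypothetical, and `git patch-id --stable` is used only as an illustration; it is an assumption that rebase's internal comparison behaves equivalently for this simple case.

```shell
set -e
# Hypothetical scratch repository for illustration only.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example"
git config user.email "example@example.invalid"

echo base > file.txt
git add file.txt
git commit -q -m "base"
git branch topic

# Upstream moves on with an unrelated commit.
echo other > other.txt
git add other.txt
git commit -q -m "upstream: unrelated change"

# The topic branch carries one fix.
git checkout -q topic
echo fix >> file.txt
git commit -q -a -m "topic: fix"

# Upstream picks up the same fix, which creates a different commit id.
git checkout -q main
git cherry-pick topic > /dev/null

# The commit ids differ, but the diffs hash to the same patch id.
id_topic=$(git show topic | git patch-id --stable | cut -d' ' -f1)
id_main=$(git show main | git patch-id --stable | cut -d' ' -f1)
echo "topic: $id_topic"
echo "main:  $id_main"

# The same kind of query as in the mail, on the toy history.
git log --oneline --cherry main...topic
```

Finding such matches requires computing patch ids for commits on the upstream side of the symmetric range, which would be consistent with the cost growing with the distance between v5.4 and v5.10.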
* Re: time needed to rebase shortend by using --onto? 2021-05-28 21:40 ` Uwe Kleine-König @ 2021-05-28 22:26 ` Elijah Newren 2021-05-29 16:59 ` Felipe Contreras 1 sibling, 0 replies; 12+ messages in thread From: Elijah Newren @ 2021-05-28 22:26 UTC (permalink / raw) To: Uwe Kleine-König; +Cc: Git Mailing List, entwicklung Hi Uwe, On Fri, May 28, 2021 at 2:40 PM Uwe Kleine-König <u.kleine-koenig@pengutronix.de> wrote: > > Hello Elijah, > > On Thu, May 27, 2021 at 04:08:32PM -0700, Elijah Newren wrote: > > On Thu, May 27, 2021 at 2:59 PM Uwe Kleine-König > > <u.kleine-koenig@pengutronix.de> wrote: > > > On Wed, May 26, 2021 at 07:38:08AM -0700, Elijah Newren wrote: > > > > On Wed, May 26, 2021 at 3:13 AM Uwe Kleine-König > > > > <u.kleine-koenig@pengutronix.de> wrote: ... > > Note: In your original report you had rename detection and it clearly > > took a significant amount of time... > > FTR: My impression is that the repo I used for the first report is slow > in general. Also git log sometimes takes a considerable time to start > emitting output. > ... > > I learned a few things since my last mail, here comes an updated test > again on the machine and repo used for the initial report: > > ukl@dude.ptx:~/gsrc/linux$ wgit version > git version 2.32.0.rc1 > > ukl@dude.ptx:~/gsrc/linux$ cat rebasecheck > #!/bin/bash > > set -e > > # do it once to heat the caches and ensure all objects are available already to have the next cycles identical. > wgit checkout 0091ecb84cfdef0f4cb65810219f5ac9bb4341e5 > wgit rebase v5.10 > > wgit checkout 0091ecb84cfdef0f4cb65810219f5ac9bb4341e5 > echo "rebase v5.10" > time wgit rebase v5.10 > > wgit checkout 0091ecb84cfdef0f4cb65810219f5ac9bb4341e5 > echo "rebase --onto v5.10 v5.4" > time wgit rebase --onto v5.10 v5.4 > > I do the rebase now once before the timing for the reasons described in > the comment. The second identical command is quite a bit quicker. 
Also > now that the commands are scripted they are done in a smaller time frame > (which matters as the machine is used heavily among my colleagues and > me). I run the script a few times in a row, after all colleagues are in > their week-end: > > ukl@dude.ptx:~/gsrc/linux$ bash rebasecheck > ... > rebase v5.10 > ... > real 1m13.579s > user 1m2.919s > sys 0m6.220s > ... > rebase --onto v5.10 v5.4 > ... > real 1m2.852s > user 0m53.780s > sys 0m6.225s > > ukl@dude.ptx:~/gsrc/linux$ bash rebasecheck > ... > rebase v5.10 > ... > real 1m10.816s > user 1m3.344s > sys 0m6.991s > ... > rebase --onto v5.10 v5.4 > ... > real 0m59.695s > user 0m53.510s > sys 0m5.579s > > ukl@dude.ptx:~/gsrc/linux$ bash rebasecheck > ... > rebase v5.10 > ... > real 1m9.688s > user 1m3.346s > sys 0m6.105s > ... > rebase --onto v5.10 v5.4 > ... > real 0m59.981s > user 0m52.931s > sys 0m6.282s > > So it's not a factor 2 any more, but still reproducibly quicker when > --onto is used. Yep, so that looks like the results I was getting. Adding --reapply-cherry-picks should remove most of that time difference as I stated in my previous email. > > However, the 7-8 second difference (and the likely large differences > > between 5.4 and 5.10) do suggest that Junio's hunch that fork-point > > behavior being at play could be an issue in these two commands. I don't think --no-fork-point will matter here since you are detaching HEAD before running rebase. fork-point is all about looking up the reflog of the current branch to find better matches. --reapply-cherry-picks should help you out and erase most of this 7-8 second difference. > > > > running again with either command would give you something closer to > > > > the lower time both times. Is that the case? (Also, what's the > > > > output of "git count-objects -v"?) 
> > > > > > After the above commands I have: > > > > > > count: 3203 > > > size: 17664 > > > in-pack: 4763753 > > > packs: 11 > > > size-pack: 1273957 > > > prune-packable: 19 > > > garbage: 0 > > > size-garbage: 0 > > > > So, not freshly packed, but not in need of an automatic gc either. > > > > > alternate: /home/uwe/var/gitstore/linux.git/objects > > > > You've got an alternate? How well packed is it? (What does "git > > count-objects -v" in that other repo show?) > > ... > > In the alternate I have: > > ukl@dude.ptx:/ptx/src/git/linux.git/objects$ wgit count-objects -v > warning: garbage found: /ptx/work/user/git/linux.git/objects/pack/tmp_pack_X9gHnq > count: 5035 This is really close to the threshold of needing repacking, but still okay. > size: 40720 > in-pack: 87083076 > packs: 1108 1108 packs!?!? This will make all kinds of operations slow. This explains your comment about operations with your original repo being slow in general, and why you feel you need to do a warmup run first to get a reasonable timing. 50 is the limit where repacking is deemed necessary; you're 2116% beyond that point. I've only seen repos with pack counts near this level a couple times and they are excruciatingly painful to deal with. However, be careful not to use "git gc" or "git prune" in this repo, since it's used as an alternate (doing so could corrupt the repos that depend on this one). Just use "git repack" with the appropriate flags instead. > size-pack: 51109693 51G. Wow. A fresh clone of linux is waaay smaller than that. 3 G, I think? I would have thought lots of your packs were small, but this suggests you probably have lots of duplicate objects in these packs. > prune-packable: 3050 > garbage: 1 > size-garbage: 1112612 And 1 G of garbage that could just be deleted. > I rerun the script with -sort added: > > ukl@dude.ptx:~/gsrc/linux$ bash rebasecheck > ... > rebase v5.10 > ... > real 0m25.047s > user 0m17.652s > sys 0m5.802s > ... > rebase --onto v5.10 v5.4 > ... 
> real 0m12.471s > user 0m7.854s > sys 0m4.413s > > ukl@dude.ptx:~/gsrc/linux$ bash rebasecheck > ... > rebase v5.10 > ... > real 0m22.180s > user 0m17.219s > sys 0m4.701s > ... > rebase --onto v5.10 v5.4 > ... > real 0m12.341s > user 0m7.308s > sys 0m4.632s > > So -sort is quite a bit quicker, but the ~10s overhead when not using > --onto is visible there, too. Yeah, try adding --reapply-cherry-picks; I think that flag should shrink most of the difference. > When looking at the timing of the output, the 10s time difference occur > before "Rebasing (1/4)" is emitted. > > wgit rebase -sort --onto v5.10 v5.10 > > behaves like > > wgit rebase -sort v5.10 > > and if I only rebase the first two patches (instead of four) it still > takes nearly the same time. Another test I did was: > > time wgit rebase -sort --onto v5.10 v5.7 > > real 0m17.712s > user 0m11.570s > sys 0m5.396s > > So there seems to be something before the actual rebase is done that > takes longer when HEAD..$base contains more objects. > Given that > > ukl@dude.ptx:~/gsrc/linux$ time wgit log --oneline --cherry v5.10...0091ecb84cfdef0f4cb65810219f5ac9bb4341e5 > + 0091ecb84cfd (ptx/ukl/rebase-timing) nvmem: core: skip child nodes not matching binding > + 38af1d38c542 spidev: add "hxxxxxxx,xxxxxx" compatible > + a7edcfb6a968 regmap: fix memory leak in regmap_debugfs_init() > + b1d90bc89408 pci: add quirk for txxxxx FPGA watchdog > > real 0m10.783s > user 0m10.346s > sys 0m0.436s > > I guess this range is searched for commits that have the same patch id > as the patches to rebase? Yep, and --reapply-cherry-picks removes this cherry-searching. Try it and see how it affects your results. I don't think it'll entirely eliminate the differences for you (it didn't for me), because there appears to be some other weird overhead -- part of it from can_fast_forward() and more that I didn't track down further. I do think that the --reapply-cherry-picks will remove most of the differences for you, though. 
> FTR: In the above repo I have:
>
>         ukl@dude.ptx:~/gsrc/linux$ wgit config merge.renameLimit
>         10000

Yep, so my choice of 9999 to try to reproduce your behavior was a
pretty good pick, eh? :-)

Hope that helps,
Elijah
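The pack-count observation above can be reproduced at toy scale. This is a sketch assuming a scratch repository in which every object is reachable, so `git repack -a -d` is safe there; which flags are appropriate for a repository that serves as an alternate is deliberately left open, as in the mail. File names and commit messages are made up.

```shell
set -e
# Hypothetical scratch repository; not an alternate for anything.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example"
git config user.email "example@example.invalid"

# Create several small packs, roughly what repeated fetches tend to leave behind.
for i in 1 2 3; do
    echo "content $i" > "file$i.txt"
    git add "file$i.txt"
    git commit -q -m "commit $i"
    git repack -q -d    # incremental: only the new loose objects are packed
done

packs_before=$(git count-objects -v | sed -n 's/^packs: //p')

# Consolidate everything reachable into a single pack and drop the old ones.
git repack -q -a -d

packs_after=$(git count-objects -v | sed -n 's/^packs: //p')
echo "packs before: $packs_before, after: $packs_after"
```

The "packs:" line of `git count-objects -v` is the figure being compared against the threshold of 50 mentioned above.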
* Re: time needed to rebase shortend by using --onto? 2021-05-28 21:40 ` Uwe Kleine-König 2021-05-28 22:26 ` Elijah Newren @ 2021-05-29 16:59 ` Felipe Contreras 1 sibling, 0 replies; 12+ messages in thread From: Felipe Contreras @ 2021-05-29 16:59 UTC (permalink / raw) To: Uwe Kleine-König, Elijah Newren; +Cc: Git Mailing List, entwicklung Uwe Kleine-König wrote: > I do the rebase now once before the timing for the reasons described in > the comment. The second identical command is quite a bit quicker. Also > now that the commands are scripted they are done in a smaller time frame > (which matters as the machine is used heavily among my colleagues and > me). I run the script a few times in a row, after all colleagues are in > their week-end: > > ukl@dude.ptx:~/gsrc/linux$ bash rebasecheck > ... > rebase v5.10 > ... > real 1m13.579s > user 1m2.919s > sys 0m6.220s > ... > rebase --onto v5.10 v5.4 > ... > real 1m2.852s > user 0m53.780s > sys 0m6.225s > > ukl@dude.ptx:~/gsrc/linux$ bash rebasecheck > ... > rebase v5.10 > ... > real 1m10.816s > user 1m3.344s > sys 0m6.991s > ... > rebase --onto v5.10 v5.4 > ... > real 0m59.695s > user 0m53.510s > sys 0m5.579s > > ukl@dude.ptx:~/gsrc/linux$ bash rebasecheck > ... > rebase v5.10 > ... > real 1m9.688s > user 1m3.346s > sys 0m6.105s > ... > rebase --onto v5.10 v5.4 > ... > real 0m59.981s > user 0m52.931s > sys 0m6.282s > > So it's not a factor 2 any more, but still reproducibly quicker when > --onto is used. Years ago I completely rewrote `git rebase` to use `git cherry-pick`, and the result is a very simple command: git checkout $onto git cherry-pick --no-merges --right-only --topo-order --do-walk @{upstream}..v5.4 The difference when you don't specify --onto is basically that both onto and upstream are considered the same: git checkout $onto git cherry-pick --no-merges --right-only --topo-order --do-walk $onto..v5.4 Therefore it should be more efficient to specify --onto. 
Except git tries to be smart and first tries to check if a fast-forward
is possible, even if you specify --no-ff (a mistake IMO).

To check for linear history the old code used to do:

  git rev-list --parents $onto..v5.4 | grep " .* "

Maybe that is too slow in your particular situation.

You could try --restrict-revisions=v5.10 (or anything other than the
merge base), but apparently that only works with --interactive.

Another option is to just hack git to disable the linear history check:

diff --git a/builtin/rebase.c b/builtin/rebase.c
index 12f093121d..bdbcfaa58e 100644
--- a/builtin/rebase.c
+++ b/builtin/rebase.c
@@ -1145,6 +1145,10 @@ static int can_fast_forward(struct commit *onto, struct commit *upstream,
 	}
 
 	oidcpy(merge_base, &merge_bases->item->object.oid);
+
+	/* Hack to avoid linear history check */
+	goto done;
+
 	if (!oideq(merge_base, &onto->object.oid))
 		goto done;

Cheers.

-- 
Felipe Contreras
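How the rev-list/grep pipeline quoted above detects non-linear history can be tried in a scratch repository. A sketch with made-up tag and branch names; the helper function has_merge is hypothetical and only wraps the quoted pipeline. With --parents each output line is a commit id followed by its parent ids, so a line containing two or more spaces belongs to a commit with two or more parents.

```shell
set -e
# Hypothetical scratch repository with one linear range and one merge.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example"
git config user.email "example@example.invalid"

git commit -q --allow-empty -m "base"
git tag base
git branch side

git commit -q --allow-empty -m "main: one"
git tag linear-tip

git checkout -q side
git commit -q --allow-empty -m "side: one"
git checkout -q main
git merge -q --no-ff -m "merge side" side

# Succeeds if the given range contains at least one merge commit.
has_merge() {
    git rev-list --parents "$1" | grep -q " .* "
}

if has_merge base..linear-tip; then linear_range=merge; else linear_range=linear; fi
if has_merge base..main; then merged_range=merge; else merged_range=linear; fi
echo "base..linear-tip: $linear_range"
echo "base..main:       $merged_range"
```

The pipeline has to list every commit in the range, so on a range as large as v5.4..v5.10 its cost would be expected to grow with the number of commits.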
* Re: time needed to rebase shortend by using --onto?
  2021-05-26 10:09 time needed to rebase shortend by using --onto? Uwe Kleine-König
  2021-05-26 11:04 ` Bagas Sanjaya
  2021-05-26 14:38 ` Elijah Newren
@ 2021-05-26 22:18 ` Junio C Hamano
  2021-05-27 22:16   ` Uwe Kleine-König
  2 siblings, 1 reply; 12+ messages in thread
From: Junio C Hamano @ 2021-05-26 22:18 UTC
  To: Uwe Kleine-König; +Cc: git, entwicklung

Uwe Kleine-König <u.kleine-koenig@pengutronix.de> writes:

> It rebases clean on v5.10:
>
> 	$ time git rebase v5.10
> 	Performing inexact rename detection: 100% (36806539/36806539), done.
> 	Performing inexact rename detection: 100% (36806539/36806539), done.
> 	Performing inexact rename detection: 100% (36806539/36806539), done.
> 	Performing inexact rename detection: 100% (36806539/36806539), done.
> 	Successfully rebased and updated detached HEAD.
>
> 	real	3m47.841s
> 	user	1m25.706s
> 	sys	0m11.181s
>
> If I start with the same rev checked out and explicitly specify the
> merge base, the rebase process is considerably faster:
>
> 	$ time git rebase --onto v5.10 v5.4
> 	Performing inexact rename detection: 100% (36806539/36806539), done.
> 	Performing inexact rename detection: 100% (36806539/36806539), done.
> 	Performing inexact rename detection: 100% (36806539/36806539), done.
> 	Performing inexact rename detection: 100% (36806539/36806539), done.
> 	Successfully rebased and updated detached HEAD.
>
> 	real	1m20.588s
> 	user	1m12.645s
> 	sys	0m6.733s
>
> Is there some relevant complexity in the first invocation I'm not seeing
> that explains it takes more than the double time? I would have expected
> that
>
> 	git rebase v5.10
>
> does the same as:
>
> 	git rebase --onto v5.10 $(git merge-base HEAD v5.10)

There is a voodoo called fork-point detection that walks back the
reflogs and repeatedly computes merge bases, and giving --onto to
explicitly give a commit on which the history is transplanted should
remove the need to do the computation, so that is a possibility.

But according to the manpage, it should not kick in for invocations
in the above example that specify the <upstream> (the
rebase.forkpoint configuration variable can clobber this default).
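To check whether fork-point mode could be involved in a given repository, something like the following sketch may help. It assumes a POSIX shell and git on PATH; the function names forkpoint_setting and show_bases, the helper commit, and the toy repository with its branch names are hypothetical. It only reads the configuration variable mentioned above and compares the plain merge base with what `git merge-base --fork-point` reports, which is not necessarily the exact code path rebase takes internally.

```shell
#!/bin/sh
# Print the configured rebase.forkPoint value, or "unset" if there is none.
forkpoint_setting() {
    git config --get rebase.forkPoint || echo unset
}

# Print the plain merge base and the reflog-based fork point of HEAD
# relative to the ref given as $1 ("none" if no fork point is reported).
show_bases() {
    echo "merge-base: $(git merge-base "$1" HEAD)"
    echo "fork-point: $(git merge-base --fork-point "$1" HEAD || echo none)"
}

# Demo in a throwaway repository; every name below is invented.
demo=$(mktemp -d)
cd "$demo" || exit 1
git init -q .

# Helper: create an empty commit without relying on global identity config.
commit() {
    git -c user.name=demo -c user.email=demo@example.invalid \
        commit -q --allow-empty -m "$1"
}

commit base
git branch upstream
git checkout -q -b topic
commit topic-work

echo "rebase.forkPoint: $(forkpoint_setting)"
show_bases upstream
```

If the two commits printed by show_bases differ, or the setting is not "unset", fork-point handling is a plausible suspect. Passing --no-fork-point to git rebase explicitly is one way to rule it out.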
* Re: time needed to rebase shortend by using --onto?
  2021-05-26 22:18 ` Junio C Hamano
@ 2021-05-27 22:16   ` Uwe Kleine-König
  0 siblings, 0 replies; 12+ messages in thread
From: Uwe Kleine-König @ 2021-05-27 22:16 UTC
  To: Junio C Hamano; +Cc: git, entwicklung

Hello Junio,

On Thu, May 27, 2021 at 07:18:52AM +0900, Junio C Hamano wrote:
> Uwe Kleine-König <u.kleine-koenig@pengutronix.de> writes:
>
> > It rebases clean on v5.10:
> >
> > 	$ time git rebase v5.10
> > 	Performing inexact rename detection: 100% (36806539/36806539), done.
> > 	Performing inexact rename detection: 100% (36806539/36806539), done.
> > 	Performing inexact rename detection: 100% (36806539/36806539), done.
> > 	Performing inexact rename detection: 100% (36806539/36806539), done.
> > 	Successfully rebased and updated detached HEAD.
> >
> > 	real	3m47.841s
> > 	user	1m25.706s
> > 	sys	0m11.181s
> >
> > If I start with the same rev checked out and explicitly specify the
> > merge base, the rebase process is considerably faster:
> >
> > 	$ time git rebase --onto v5.10 v5.4
> > 	Performing inexact rename detection: 100% (36806539/36806539), done.
> > 	Performing inexact rename detection: 100% (36806539/36806539), done.
> > 	Performing inexact rename detection: 100% (36806539/36806539), done.
> > 	Performing inexact rename detection: 100% (36806539/36806539), done.
> > 	Successfully rebased and updated detached HEAD.
> >
> > 	real	1m20.588s
> > 	user	1m12.645s
> > 	sys	0m6.733s
> >
> > Is there some relevant complexity in the first invocation I'm not seeing
> > that explains it takes more than the double time? I would have expected
> > that
> >
> > 	git rebase v5.10
> >
> > does the same as:
> >
> > 	git rebase --onto v5.10 $(git merge-base HEAD v5.10)
>
> There is a voodoo called fork-point detection that walks back the
> reflogs and repeatedly computes merge bases, and giving --onto to
> explicitly give a commit on which the history is transplanted should
> remove the need to do the computation, so that is a possibility.
>
> But according to the manpage, it should not kick in for invocations
> in the above example that specify the <upstream> (the
> rebase.forkpoint configuration variable can clobber this default).

FTR: I don't have this variable set in the two repositories that show
the different timings.

Best regards
Uwe

-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | https://www.pengutronix.de/ |
end of thread, other threads:[~2021-05-29 16:59 UTC | newest]

Thread overview: 12+ messages
2021-05-26 10:09 time needed to rebase shortend by using --onto? Uwe Kleine-König
2021-05-26 11:04 ` Bagas Sanjaya
2021-05-26 14:38 ` Elijah Newren
2021-05-27 21:59 ` Uwe Kleine-König
2021-05-27 22:15 ` Uwe Kleine-König
2021-05-28  5:38 ` Elijah Newren
2021-05-27 23:08 ` Elijah Newren
2021-05-28 21:40 ` Uwe Kleine-König
2021-05-28 22:26 ` Elijah Newren
2021-05-29 16:59 ` Felipe Contreras
2021-05-26 22:18 ` Junio C Hamano
2021-05-27 22:16 ` Uwe Kleine-König