All of lore.kernel.org
 help / color / mirror / Atom feed
From: Elijah Newren <newren@gmail.com>
To: Derrick Stolee <stolee@gmail.com>
Cc: Git Mailing List <git@vger.kernel.org>,
	Junio C Hamano <gitster@pobox.com>, Jeff King <peff@peff.net>,
	Derrick Stolee <dstolee@microsoft.com>
Subject: Re: [PATCH 0/3] Make add_missing_tags() linear
Date: Wed, 31 Oct 2018 23:52:01 -0700	[thread overview]
Message-ID: <CABPp-BHHG9K0869=4CYkqjN6rwLCzRBiF_Z94KFevSo3_FvYAw@mail.gmail.com> (raw)
In-Reply-To: <20181031120505.237235-1-dstolee@microsoft.com>

On Wed, Oct 31, 2018 at 5:05 AM Derrick Stolee <stolee@gmail.com> wrote:
>
> On 10/31/2018 2:04 AM, Elijah Newren wrote:
> > On Tue, Oct 30, 2018 at 7:16 AM Derrick Stolee via GitGitGadget
> > <gitgitgadget@gmail.com> wrote:
> >>
> >> As reported earlier [1], the add_missing_tags() method in remote.c has
> >> quadratic performance. Some of that performance is curbed due to the
> >> generation-number cutoff in in_merge_bases_many(). However, that fix doesn't
> >> help users without a commit-graph, and it can still be painful if that
> >> cutoff is sufficiently low compared to the tags we are using for
> >> reachability testing.
> >>
> >> Add a new method in commit-reach.c called get_reachable_subset() which does
> >> a many-to-many reachability test. Starting at the 'from' commits, walk until
> >> the generation is below the smallest generation in the 'to' commits, or all
> >> 'to' commits have been discovered. This performs only one commit walk for
> >> the entire add_missing_tags() method, giving linear performance in the worst
> >> case.
> >>
> >> Tests are added in t6600-test-reach.sh to ensure get_reachable_subset()
> >> works independently of its application in add_missing_tags().
> >
> > On the original repo where the topic was brought up, with commit-graph
> > NOT turned on and using origin/master, I see:
> >
> > $ time git push --dry-run --follow-tags /home/newren/repo-mirror
> > To /home/newren/repo-mirror
> >  * [new branch]       test5 -> test5
> >
> > real 1m20.081s
> > user 1m19.688s
> > sys 0m0.292s
> >
> > Merging this series in, I now get:
> >
> > $ time git push --dry-run --follow-tags /home/newren/repo-mirror
> > To /home/newren/repo-mirror
> >  * [new branch]       test5 -> test5
> >
> > real 0m2.857s
> > user 0m2.580s
> > sys 0m0.328s
> >
> > which provides a very nice speedup.
> >
> > Oddly enough, if I _also_ do the following:
> > $ git config core.commitgraph true
> > $ git config gc.writecommitgraph true
> > $ git gc
> >
> > then my timing actually slows down just slightly:
> > $ time git push --follow-tags --dry-run /home/newren/repo-mirror
> > To /home/newren/repo-mirror
> >  * [new branch]          test5 -> test5
> >
> > real 0m3.027s
> > user 0m2.696s
> > sys 0m0.400s
>
> So you say that the commit-graph is off in the 2.8s case, but not here
> in the 3.1s case? I would expect _at minimum_ that the cost of parsing
> commits would have a speedup in the commit-graph case.  There may be
> something else going on here, since you are timing a `push` event that
> is doing more than the current walk.
>
> > (run-to-run variation seems pretty consistent, < .1s variation, so
> > this difference is just enough to notice.)  I wouldn't be that
> > surprised if that means there's some really old tags with very small
> > generation numbers, meaning it's not gaining anything in this special
> > case from the commit-graph, but it does pay the cost of loading the
> > commit-graph.
>
> While you have this test environment, do you mind applying the diff
> below and re-running the tests? It will output a count for how many
> commits are walked by the algorithm. This should help us determine if
> this is another case where generation numbers are worse than commit-date,
> or if there is something else going on. Thanks!

I can do that, but wouldn't you want a similar patch for the old
get_merge_bases_many() in order to compare?  Does an absolute number
help by itself?
It's going to have to be tomorrow, though; not enough time tonight.

  reply	other threads:[~2018-11-01  6:52 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-10-30 14:16 [PATCH 0/3] Make add_missing_tags() linear Derrick Stolee via GitGitGadget
2018-10-30 14:16 ` [PATCH 1/3] commit-reach: implement get_reachable_subset Derrick Stolee via GitGitGadget
2018-10-31  3:35   ` Junio C Hamano
2018-10-31 12:01     ` Derrick Stolee
2018-11-02  1:51       ` Junio C Hamano
2018-10-31  6:07   ` Elijah Newren
2018-10-31 11:54     ` Derrick Stolee
2018-10-30 14:16 ` [PATCH 2/3] test-reach: test get_reachable_subset Derrick Stolee via GitGitGadget
2018-10-30 14:16 ` [PATCH 3/3] remote: make add_missing_tags() linear Derrick Stolee via GitGitGadget
2018-10-31  3:05 ` [PATCH 0/3] Make " Junio C Hamano
2018-10-31  6:04 ` Elijah Newren
2018-10-31 12:05   ` Derrick Stolee
2018-11-01  6:52     ` Elijah Newren [this message]
2018-11-01 12:32       ` Derrick Stolee
2018-11-01 18:57         ` Elijah Newren
2018-11-01 19:02           ` Derrick Stolee
2018-11-02 14:58             ` Elijah Newren
2018-11-02 15:38               ` Derrick Stolee
2018-11-02 13:14 ` [PATCH v2 " Derrick Stolee via GitGitGadget
2018-11-02 13:14   ` [PATCH v2 1/3] commit-reach: implement get_reachable_subset Derrick Stolee via GitGitGadget
2018-11-02 13:14   ` [PATCH v2 2/3] test-reach: test get_reachable_subset Derrick Stolee via GitGitGadget
2018-11-02 13:14   ` [PATCH v2 3/3] remote: make add_missing_tags() linear Derrick Stolee via GitGitGadget

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CABPp-BHHG9K0869=4CYkqjN6rwLCzRBiF_Z94KFevSo3_FvYAw@mail.gmail.com' \
    --to=newren@gmail.com \
    --cc=dstolee@microsoft.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=peff@peff.net \
    --cc=stolee@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.