All of lore.kernel.org
 help / color / mirror / Atom feed
From: Elijah Newren <newren@gmail.com>
To: Derrick Stolee <stolee@gmail.com>
Cc: Git Mailing List <git@vger.kernel.org>,
	Junio C Hamano <gitster@pobox.com>, Jeff King <peff@peff.net>,
	Derrick Stolee <dstolee@microsoft.com>
Subject: Re: [PATCH 0/3] Make add_missing_tags() linear
Date: Fri, 2 Nov 2018 07:58:21 -0700	[thread overview]
Message-ID: <CABPp-BEa3U6m8jWrHJXjFo77T32LqEKkdDs7H89m0y_gjHgOWw@mail.gmail.com> (raw)
In-Reply-To: <3aadf261-9ed7-6ded-4482-ab01d74802aa@gmail.com>

On Thu, Nov 1, 2018 at 12:02 PM Derrick Stolee <stolee@gmail.com> wrote:
>
> On 11/1/2018 2:57 PM, Elijah Newren wrote:
> > On Thu, Nov 1, 2018 at 5:32 AM Derrick Stolee <stolee@gmail.com> wrote:
> >> No rush. I'd just like to understand how removing the commit-graph file
> >> can make the new algorithm faster. Putting a similar count in the old
> >> algorithm would involve giving a count for every call to
> >> in_merge_bases_many(), which would be very noisy.
> > $ time git push --dry-run --follow-tags /home/newren/repo-mirror
> > count: 92912
> > To /home/newren/repo-mirror
> >   * [new branch]              test5 -> test5
> >
> > real    0m3.024s
> > user    0m2.752s
> > sys    0m0.320s
>
> Is the above test with or without the commit-graph file? Can you run it
> in the other mode, too? I'd like to see if the "count" value changes
> when the only difference is the presence of a commit-graph file.

I apologize for providing misleading information earlier; this was an
apples to oranges comparison.  Here's what I did:

<build a version of git with your fixes>
git clone coworker.bundle coworker-repo
cd coworker-repo
time git push --dry-run --follow-tags /home/newren/repo-mirror
git config core.commitgraph true
git config gc.writecommitgraph true
git gc
time git push --dry-run --follow-tags /home/newren/nucleus-mirror


I figured I had just done a fresh clone, so surely the gc wouldn't do
anything other than write the .git/objects/info/commit-graph file.
However, the original bundle contained many references outside of
refs/heads/ and refs/tags/:

$ git bundle list-heads ../coworker.bundle | grep -v -e refs/heads/ -e
refs/tags/ -e HEAD | wc -l
2396

These other refs apparently referred to objects not otherwise
referenced in refs/heads/ and refs/tags/, and caused the gc to explode
lots of loose objects:
$ git count-objects -v
count: 147604
size: 856416
in-pack: 1180692
packs: 1
size-pack: 796143
prune-packable: 0
garbage: 0
size-garbage: 0


The slowdown with commit-graph was entirely due to there being lots of
loose objects (147K of them).  If I add a git-prune before doing the
timing with commit-graph, then the timing with commit-graph is faster
than the run without a commit-graph.

Sorry for the wild goose chase.

And thanks for the fixes; get_reachable_subset() makes things much
faster even without a commit-graph, and the commit-graph just improves
it more.  :-)

  reply	other threads:[~2018-11-02 14:58 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-10-30 14:16 [PATCH 0/3] Make add_missing_tags() linear Derrick Stolee via GitGitGadget
2018-10-30 14:16 ` [PATCH 1/3] commit-reach: implement get_reachable_subset Derrick Stolee via GitGitGadget
2018-10-31  3:35   ` Junio C Hamano
2018-10-31 12:01     ` Derrick Stolee
2018-11-02  1:51       ` Junio C Hamano
2018-10-31  6:07   ` Elijah Newren
2018-10-31 11:54     ` Derrick Stolee
2018-10-30 14:16 ` [PATCH 2/3] test-reach: test get_reachable_subset Derrick Stolee via GitGitGadget
2018-10-30 14:16 ` [PATCH 3/3] remote: make add_missing_tags() linear Derrick Stolee via GitGitGadget
2018-10-31  3:05 ` [PATCH 0/3] Make " Junio C Hamano
2018-10-31  6:04 ` Elijah Newren
2018-10-31 12:05   ` Derrick Stolee
2018-11-01  6:52     ` Elijah Newren
2018-11-01 12:32       ` Derrick Stolee
2018-11-01 18:57         ` Elijah Newren
2018-11-01 19:02           ` Derrick Stolee
2018-11-02 14:58             ` Elijah Newren [this message]
2018-11-02 15:38               ` Derrick Stolee
2018-11-02 13:14 ` [PATCH v2 " Derrick Stolee via GitGitGadget
2018-11-02 13:14   ` [PATCH v2 1/3] commit-reach: implement get_reachable_subset Derrick Stolee via GitGitGadget
2018-11-02 13:14   ` [PATCH v2 2/3] test-reach: test get_reachable_subset Derrick Stolee via GitGitGadget
2018-11-02 13:14   ` [PATCH v2 3/3] remote: make add_missing_tags() linear Derrick Stolee via GitGitGadget

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CABPp-BEa3U6m8jWrHJXjFo77T32LqEKkdDs7H89m0y_gjHgOWw@mail.gmail.com \
    --to=newren@gmail.com \
    --cc=dstolee@microsoft.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=peff@peff.net \
    --cc=stolee@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.