From: Clement Moyroud <clement.moyroud@gmail.com>
To: stolee@gmail.com
Cc: git@vger.kernel.org
Subject: Re: Git blame performance on files with a lot of history
Date: Mon, 17 Dec 2018 12:59:33 -0800
Message-ID: <CABXAcUxNAr0H+EOmzv_TsW026Y-yDET01PeU3PtgcOmuFE5rjQ@mail.gmail.com>
In-Reply-To: <3f3e7b11-19ef-cc2f-3bd4-e03d9ba8dc91@gmail.com>

On Fri, Dec 14, 2018 at 1:31 PM Derrick Stolee <stolee@gmail.com> wrote:
>
> Please double-check that you have the 'core.commitGraph' config setting
> enabled, or you will not read the commit-graph at run-time:
>
>      git config core.commitGraph true
>

Yeah, this is what happens when trying too many things at once :( I had
removed it to get with/without numbers, and forgot to re-enable it before
running my last set of experiments.

Here are the results with it enabled:
> time GIT_TRACE_BLOOM_FILTER=2 GIT_USE_POC_BLOOM_FILTER=y /path/to/git rev-list --count --full-history HEAD -- important/file.C
10:32:06.665057 revision.c:483          bloom filter total queries: 286363 definitely not: 234605 maybe: 51758 false positives: 48212 fp ratio: 0.168360
GIT_TRACE_BLOOM_FILTER=2 GIT_USE_POC_BLOOM_FILTER=y  rev-list --count HEAD -  2.62s user 0.14s system 97% cpu 2.830 total
> time /path/to/git rev-list --count --full-history HEAD -- ic/lv/src/iclv/drc_compiler.C
3576
/path/to/git rev-list      8.86s user 0.15s system 99% cpu 9.031 total

So I'm getting roughly a 3x benefit (2.83s vs. 9.03s wall-clock), not bad!
This is on the re-repacked repo, which is why I ran it again both with and
without the Bloom filter.
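
For reference, the arithmetic behind those figures can be checked with a short Python sketch. The variable names are mine, not git's; the values are copied from the trace and `time` output above. It also suggests that the printed "fp ratio" is false positives divided by total queries, since that quotient reproduces the traced value.

```python
# Values copied from the rev-list runs above; variable names are illustrative.
total_queries = 286363
definitely_not = 234605
maybe = 51758
false_positives = 48212

wall_with_filter = 2.830      # seconds, "total" column of the time output
wall_without_filter = 9.031

# The two filter outcomes should account for every query.
assert definitely_not + maybe == total_queries

speedup = wall_without_filter / wall_with_filter
fp_ratio = false_positives / total_queries
skipped_share = definitely_not / total_queries

print(f"speedup: {speedup:.2f}x")
print(f"fp ratio: {fp_ratio:.6f}")
print(f"queries answered 'definitely not': {skipped_share:.1%}")
```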

Let's see what this does for blame:
> time GIT_TRACE_BLOOM_FILTER=2 GIT_USE_POC_BLOOM_FILTER=y /path/to/git blame master -- important/file.C > /tmp/foo
Blaming lines: 100% (33179/33179), done.
12:50:42.703522 revision.c:483          bloom filter total queries: 0 definitely not: 0 maybe: 0 false positives: 0 fp ratio: -nan
GIT_TRACE_BLOOM_FILTER=2 GIT_USE_POC_BLOOM_FILTER=y  blame master -- >   132.59s user 2.15s system 99% cpu 2:14.95 total

With zero queries recorded, it seems the Bloom filter is not implemented for
blame operations. I'll be happy to test any implementation.
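
The "-nan" in the blame trace is consistent with the ratio being computed as 0/0 in floating point when no queries are recorded. Here is a minimal Python sketch of that case. The helper is hypothetical and is not git's actual code. Python raises an error on division by zero rather than yielding NaN, so the guard makes the undefined case explicit.

```python
import math

def fp_ratio(false_positives: int, total_queries: int) -> float:
    """Hypothetical helper mirroring the traced ratio; not git's actual code."""
    if total_queries == 0:
        # No queries recorded: the ratio is undefined, so report NaN.
        return math.nan
    return false_positives / total_queries

blame_ratio = fp_ratio(0, 0)               # counters from the blame trace
revlist_ratio = fp_ratio(48212, 286363)    # counters from the rev-list trace

print(math.isnan(blame_ratio), f"{revlist_ratio:.6f}")
```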

Take care,

Clément
