All of lore.kernel.org
 help / color / mirror / Atom feed
From: Matheus Tavares <matheus.bernardino@usp.br>
To: vleschuk@gmail.com
Cc: alexhenrie24@gmail.com, artagnon@gmail.com,
	bwilliams.eng@gmail.com, christian.couder@gmail.com, e@80x24.org,
	git@matthieu-moy.fr, git@vger.kernel.org, gitster@pobox.com,
	jonathantanmy@google.com, jrnieder@gmail.com,
	matheus.bernardino@usp.br, olyatelezhnaya@gmail.com,
	pclouds@gmail.com, peff@peff.net, vleschuk@accesssoftek.com
Subject: Re: [PATCH] grep: use no. of cores as the default no. of threads
Date: Thu, 16 Jan 2020 11:47:29 -0300	[thread overview]
Message-ID: <20200116144729.8033-1-matheus.bernardino@usp.br> (raw)
In-Reply-To: <CAGuA69ujsOBm2+RKEkGu8wLoEVvKxivY762Zokf9MWxDWrwWFQ@mail.gmail.com>

Hi, Victor

On Thu, Jan 16, 2020 at 10:11 AM Victor Leschuk <vleschuk@gmail.com> wrote:
>
> Grepping bottleneck is not cpu, but IO. Maybe it is more reasonable to
> use not online_cpus() but online_cpus()*2?

I also tried this approach, but the tests I ran with online_cpus() * 2
only showed slowdowns. The results can be seen in the commit message:

> On Thu, Jan 16, 2020 at 5:41 AM Matheus Tavares
> <matheus.bernardino@usp.br> wrote:
> >
[...]
> > The following measurements correspond to the
> > mean elapsed times for 30 git-grep executions in chromium's
> > repository[1] with a 95% confidence interval (each set of 30 were
> > performed after 2 warmup runs). Regex 1 is 'abcd[02]' and Regex 2 is
> > '(static|extern) (int|double) \*'.
> >
> >       |          Working tree         |           Object Store
> > ------|-------------------------------|--------------------------------
> >  #ths |  Regex 1      |  Regex 2      |   Regex 1      |   Regex 2
> > ------|---------------|---------------|----------------|---------------
> >   32  |  2.92s ± 0.01 |  3.72s ± 0.21 |   5.36s ± 0.01 |   6.07s ± 0.01
> >   16  |  2.84s ± 0.01 |  3.57s ± 0.21 |   5.05s ± 0.01 |   5.71s ± 0.01
> > >  8  |  2.53s ± 0.00 |  3.24s ± 0.21 |   4.86s ± 0.01 |   5.48s ± 0.01
> >    4  |  2.43s ± 0.02 |  3.22s ± 0.20 |   5.22s ± 0.02 |   6.03s ± 0.02
> >    2  |  3.06s ± 0.20 |  4.52s ± 0.01 |   7.52s ± 0.01 |   9.06s ± 0.01
> >    1  |  6.16s ± 0.01 |  9.25s ± 0.02 |  14.10s ± 0.01 |  17.22s ± 0.01
> >
> > The above tests were performed in a desktop running Debian 10.0 with
> > Intel(R) Xeon(R) CPU E3-1230 V2 (4 cores w/ hyper-threading), 32GB of
> > RAM and a 7200 rpm, SATA 3.1 HDD.
> >
> > Bellow, the tests were repeated for a machine with SSD: a Manjaro laptop
> > with Intel(R) i7-7700HQ (4 cores w/ hyper-threading) and 16GB of RAM:
> >
> >       |          Working tree          |           Object Store
> > ------|--------------------------------|--------------------------------
> >  #ths |  Regex 1      |  Regex 2       |   Regex 1      |   Regex 2
> > ------|---------------|----------------|----------------|---------------
> >   32  |  3.29s ± 0.21 |   4.30s ± 0.01 |   6.30s ± 0.01 |   7.30s ± 0.02
> >   16  |  3.19s ± 0.20 |   4.14s ± 0.02 |   5.91s ± 0.01 |   6.83s ± 0.01
> > >  8  |  2.90s ± 0.04 |   3.82s ± 0.20 |   5.70s ± 0.02 |   6.53s ± 0.01
> >    4  |  2.84s ± 0.02 |   3.77s ± 0.20 |   6.19s ± 0.02 |   7.18s ± 0.02
> >    2  |  3.73s ± 0.21 |   5.57s ± 0.02 |   9.28s ± 0.01 |  11.22s ± 0.01
> >    1  |  7.48s ± 0.02 |  11.36s ± 0.03 |  17.75s ± 0.01 |  21.87s ± 0.08

I deliberately used somehow complex regexes for these tests. So I
decided to do one more test with a very simple fixed string ("abc"),
allowing git-grep to spend less time in the cpu-bound regex searching.
The results can be seen bellow (the metodology is the same as described
above and the machine is the Manjaro laptop, for which online_cpus()
returns 8):

  #ths |  Working Three |  Object Store
 ------|----------------|---------------
    16 |  3.22s ± 0.20  |  5.96s ± 0.06
     8 |  2.92s ± 0.01  |  5.73s ± 0.02


      reply	other threads:[~2020-01-16 14:47 UTC|newest]

Thread overview: 47+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-08-10 20:27 [GSoC][PATCH 0/4] grep: re-enable threads when cached, w/ parallel inflation Matheus Tavares
2019-08-10 20:27 ` [GSoC][PATCH 1/4] object-store: add lock to read_object_file_extended() Matheus Tavares
2019-08-10 20:27 ` [GSoC][PATCH 2/4] grep: allow locks to be enabled individually Matheus Tavares
2019-08-10 20:27 ` [GSoC][PATCH 3/4] grep: disable grep_read_mutex when possible Matheus Tavares
2019-08-10 20:27 ` [GSoC][PATCH 4/4] grep: re-enable threads in some non-worktree cases Matheus Tavares
2019-09-30  1:50 ` [PATCH v2 00/11] grep: improve threading and fix race conditions Matheus Tavares
2019-09-30  1:50   ` [PATCH v2 01/11] grep: fix race conditions on userdiff calls Matheus Tavares
2019-09-30  1:50   ` [PATCH v2 02/11] grep: fix race conditions at grep_submodule() Matheus Tavares
2019-09-30  1:50   ` [PATCH v2 03/11] grep: fix racy calls in grep_objects() Matheus Tavares
2019-09-30  1:50   ` [PATCH v2 04/11] replace-object: make replace operations thread-safe Matheus Tavares
2019-09-30  1:50   ` [PATCH v2 05/11] object-store: allow threaded access to object reading Matheus Tavares
2019-11-12  2:54     ` Jonathan Tan
2019-11-13  5:20       ` Jeff King
2019-11-14  5:57         ` Matheus Tavares Bernardino
2019-11-14  6:01           ` Jeff King
2019-11-14 18:15             ` Jonathan Tan
2019-11-15  4:12               ` Jeff King
2019-12-19 22:27                 ` Matheus Tavares Bernardino
2020-01-09 22:02                   ` Matheus Tavares Bernardino
2020-01-10 19:07                     ` Christian Couder
2019-09-30  1:50   ` [PATCH v2 06/11] grep: replace grep_read_mutex by internal obj read lock Matheus Tavares
2019-10-01 19:23     ` [PATCH] squash! " Matheus Tavares
2019-09-30  1:50   ` [PATCH v2 07/11] submodule-config: add skip_if_read option to repo_read_gitmodules() Matheus Tavares
2019-09-30  1:50   ` [PATCH v2 08/11] grep: allow submodule functions to run in parallel Matheus Tavares
2019-09-30  1:50   ` [PATCH v2 09/11] grep: protect packed_git [re-]initialization Matheus Tavares
2019-09-30  1:50   ` [PATCH v2 10/11] grep: re-enable threads in non-worktree case Matheus Tavares
2019-09-30  1:50   ` [PATCH v2 11/11] grep: move driver pre-load out of critical section Matheus Tavares
2020-01-16  2:39   ` [PATCH v3 00/12] grep: improve threading and fix race conditions Matheus Tavares
2020-01-16  2:39     ` [PATCH v3 01/12] grep: fix race conditions on userdiff calls Matheus Tavares
2020-01-16  2:39     ` [PATCH v3 02/12] grep: fix race conditions at grep_submodule() Matheus Tavares
2020-01-16  2:39     ` [PATCH v3 03/12] grep: fix racy calls in grep_objects() Matheus Tavares
2020-01-16  2:39     ` [PATCH v3 04/12] replace-object: make replace operations thread-safe Matheus Tavares
2020-01-16  2:39     ` [PATCH v3 05/12] object-store: allow threaded access to object reading Matheus Tavares
2020-01-16  2:39     ` [PATCH v3 06/12] grep: replace grep_read_mutex by internal obj read lock Matheus Tavares
2020-01-16  2:39     ` [PATCH v3 07/12] submodule-config: add skip_if_read option to repo_read_gitmodules() Matheus Tavares
2020-01-16  2:39     ` [PATCH v3 08/12] grep: allow submodule functions to run in parallel Matheus Tavares
2020-01-29 11:26       ` SZEDER Gábor
2020-01-29 18:49         ` Junio C Hamano
2020-01-29 18:57         ` Junio C Hamano
2020-01-29 20:42           ` Matheus Tavares Bernardino
2020-01-30 13:28             ` Philippe Blain
2020-01-16  2:39     ` [PATCH v3 09/12] grep: protect packed_git [re-]initialization Matheus Tavares
2020-01-16  2:39     ` [PATCH v3 10/12] grep: re-enable threads in non-worktree case Matheus Tavares
2020-01-16  2:39     ` [PATCH v3 11/12] grep: move driver pre-load out of critical section Matheus Tavares
2020-01-16  2:40     ` [PATCH v3 12/12] grep: use no. of cores as the default no. of threads Matheus Tavares
2020-01-16 13:11       ` Victor Leschuk
2020-01-16 14:47         ` Matheus Tavares [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200116144729.8033-1-matheus.bernardino@usp.br \
    --to=matheus.bernardino@usp.br \
    --cc=alexhenrie24@gmail.com \
    --cc=artagnon@gmail.com \
    --cc=bwilliams.eng@gmail.com \
    --cc=christian.couder@gmail.com \
    --cc=e@80x24.org \
    --cc=git@matthieu-moy.fr \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jonathantanmy@google.com \
    --cc=jrnieder@gmail.com \
    --cc=olyatelezhnaya@gmail.com \
    --cc=pclouds@gmail.com \
    --cc=peff@peff.net \
    --cc=vleschuk@accesssoftek.com \
    --cc=vleschuk@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.