Git Mailing List Archive on lore.kernel.org
 help / color / Atom feed
From: Victor Leschuk <vleschuk@gmail.com>
To: Matheus Tavares <matheus.bernardino@usp.br>
Cc: "Git List" <git@vger.kernel.org>,
	christian.couder@gmail.com, "Junio C Hamano" <gitster@pobox.com>,
	jrnieder@gmail.com, olyatelezhnaya@gmail.com,
	"Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>,
	"Jonathan Tan" <jonathantanmy@google.com>,
	"Jeff King" <peff@peff.net>,
	"Brandon Williams" <bwilliams.eng@gmail.com>,
	"Alex Henrie" <alexhenrie24@gmail.com>,
	"Ramkumar Ramachandra" <artagnon@gmail.com>,
	"Victor Leschuk" <vleschuk@accesssoftek.com>,
	"Eric Wong" <e@80x24.org>, "Matthieu Moy" <git@matthieu-moy.fr>
Subject: Re: [PATCH v3 12/12] grep: use no. of cores as the default no. of threads
Date: Thu, 16 Jan 2020 16:11:02 +0300
Message-ID: <CAGuA69ujsOBm2+RKEkGu8wLoEVvKxivY762Zokf9MWxDWrwWFQ@mail.gmail.com> (raw)
In-Reply-To: <a5891176d7778b98ac35c756170dd334b8ee21c7.1579141989.git.matheus.bernardino@usp.br>

Grepping bottleneck is not cpu, but IO. Maybe it is more reasonable to
use not online_cpus() but online_cpus()*2?

--
Victor


On Thu, Jan 16, 2020 at 5:41 AM Matheus Tavares
<matheus.bernardino@usp.br> wrote:
>
> When --threads is not specified, git-grep will use 8 threads by default.
> This fixed number may be too many for machines with fewer cores and too
> little for machines with more cores. So, instead, use the number of
> logical cores available in the machine, which seems to result in the
> best overall performance: The following measurements correspond to the
> mean elapsed times for 30 git-grep executions in chromium's
> repository[1] with a 95% confidence interval (each set of 30 were
> performed after 2 warmup runs). Regex 1 is 'abcd[02]' and Regex 2 is
> '(static|extern) (int|double) \*'.
>
>       |          Working tree         |           Object Store
> ------|-------------------------------|--------------------------------
>  #ths |  Regex 1      |  Regex 2      |   Regex 1      |   Regex 2
> ------|---------------|---------------|----------------|---------------
>   32  |  2.92s ± 0.01 |  3.72s ± 0.21 |   5.36s ± 0.01 |   6.07s ± 0.01
>   16  |  2.84s ± 0.01 |  3.57s ± 0.21 |   5.05s ± 0.01 |   5.71s ± 0.01
> >  8  |  2.53s ± 0.00 |  3.24s ± 0.21 |   4.86s ± 0.01 |   5.48s ± 0.01
>    4  |  2.43s ± 0.02 |  3.22s ± 0.20 |   5.22s ± 0.02 |   6.03s ± 0.02
>    2  |  3.06s ± 0.20 |  4.52s ± 0.01 |   7.52s ± 0.01 |   9.06s ± 0.01
>    1  |  6.16s ± 0.01 |  9.25s ± 0.02 |  14.10s ± 0.01 |  17.22s ± 0.01
>
> The above tests were performed in a desktop running Debian 10.0 with
> Intel(R) Xeon(R) CPU E3-1230 V2 (4 cores w/ hyper-threading), 32GB of
> RAM and a 7200 rpm, SATA 3.1 HDD.
>
> Bellow, the tests were repeated for a machine with SSD: a Manjaro laptop
> with Intel(R) i7-7700HQ (4 cores w/ hyper-threading) and 16GB of RAM:
>
>       |          Working tree          |           Object Store
> ------|--------------------------------|--------------------------------
>  #ths |  Regex 1      |  Regex 2       |   Regex 1      |   Regex 2
> ------|---------------|----------------|----------------|---------------
>   32  |  3.29s ± 0.21 |   4.30s ± 0.01 |   6.30s ± 0.01 |   7.30s ± 0.02
>   16  |  3.19s ± 0.20 |   4.14s ± 0.02 |   5.91s ± 0.01 |   6.83s ± 0.01
> >  8  |  2.90s ± 0.04 |   3.82s ± 0.20 |   5.70s ± 0.02 |   6.53s ± 0.01
>    4  |  2.84s ± 0.02 |   3.77s ± 0.20 |   6.19s ± 0.02 |   7.18s ± 0.02
>    2  |  3.73s ± 0.21 |   5.57s ± 0.02 |   9.28s ± 0.01 |  11.22s ± 0.01
>    1  |  7.48s ± 0.02 |  11.36s ± 0.03 |  17.75s ± 0.01 |  21.87s ± 0.08
>
> [1]: chromium’s repo at commit 03ae96f (“Add filters testing at DSF=2”,
>      04-06-2019), after a 'git gc' execution.
>
> Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
> ---
>  Documentation/git-grep.txt | 4 ++--
>  builtin/grep.c             | 3 +--
>  2 files changed, 3 insertions(+), 4 deletions(-)
>
> diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt
> index de628741fa..eb5412724f 100644
> --- a/Documentation/git-grep.txt
> +++ b/Documentation/git-grep.txt
> @@ -59,8 +59,8 @@ grep.extendedRegexp::
>         other than 'default'.
>
>  grep.threads::
> -       Number of grep worker threads to use.  If unset (or set to 0),
> -       8 threads are used by default (for now).
> +       Number of grep worker threads to use. If unset (or set to 0), Git will
> +       use as many threads as the number of logical cores available.
>
>  grep.fullName::
>         If set to true, enable `--full-name` option by default.
> diff --git a/builtin/grep.c b/builtin/grep.c
> index a85b710b48..629eaf5dbc 100644
> --- a/builtin/grep.c
> +++ b/builtin/grep.c
> @@ -33,7 +33,6 @@ static char const * const grep_usage[] = {
>
>  static int recurse_submodules;
>
> -#define GREP_NUM_THREADS_DEFAULT 8
>  static int num_threads;
>
>  static pthread_t *threads;
> @@ -1064,7 +1063,7 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
>         } else if (num_threads < 0)
>                 die(_("invalid number of threads specified (%d)"), num_threads);
>         else if (num_threads == 0)
> -               num_threads = HAVE_THREADS ? GREP_NUM_THREADS_DEFAULT : 1;
> +               num_threads = HAVE_THREADS ? online_cpus() : 1;
>
>         if (num_threads > 1) {
>                 if (!HAVE_THREADS)
> --
> 2.24.1
>

  reply index

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-08-10 20:27 [GSoC][PATCH 0/4] grep: re-enable threads when cached, w/ parallel inflation Matheus Tavares
2019-09-30  1:50 ` [PATCH v2 00/11] grep: improve threading and fix race conditions Matheus Tavares
2020-01-16  2:39   ` [PATCH v3 00/12] " Matheus Tavares
2020-01-16  2:39     ` [PATCH v3 01/12] grep: fix race conditions on userdiff calls Matheus Tavares
2020-01-16  2:39     ` [PATCH v3 02/12] grep: fix race conditions at grep_submodule() Matheus Tavares
2020-01-16  2:39     ` [PATCH v3 03/12] grep: fix racy calls in grep_objects() Matheus Tavares
2020-01-16  2:39     ` [PATCH v3 04/12] replace-object: make replace operations thread-safe Matheus Tavares
2020-01-16  2:39     ` [PATCH v3 05/12] object-store: allow threaded access to object reading Matheus Tavares
2020-01-16  2:39     ` [PATCH v3 06/12] grep: replace grep_read_mutex by internal obj read lock Matheus Tavares
2020-01-16  2:39     ` [PATCH v3 07/12] submodule-config: add skip_if_read option to repo_read_gitmodules() Matheus Tavares
2020-01-16  2:39     ` [PATCH v3 08/12] grep: allow submodule functions to run in parallel Matheus Tavares
2020-01-16  2:39     ` [PATCH v3 09/12] grep: protect packed_git [re-]initialization Matheus Tavares
2020-01-16  2:39     ` [PATCH v3 10/12] grep: re-enable threads in non-worktree case Matheus Tavares
2020-01-16  2:39     ` [PATCH v3 11/12] grep: move driver pre-load out of critical section Matheus Tavares
2020-01-16  2:40     ` [PATCH v3 12/12] grep: use no. of cores as the default no. of threads Matheus Tavares
2020-01-16 13:11       ` Victor Leschuk [this message]
2020-01-16 14:47         ` [PATCH] " Matheus Tavares

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAGuA69ujsOBm2+RKEkGu8wLoEVvKxivY762Zokf9MWxDWrwWFQ@mail.gmail.com \
    --to=vleschuk@gmail.com \
    --cc=alexhenrie24@gmail.com \
    --cc=artagnon@gmail.com \
    --cc=bwilliams.eng@gmail.com \
    --cc=christian.couder@gmail.com \
    --cc=e@80x24.org \
    --cc=git@matthieu-moy.fr \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jonathantanmy@google.com \
    --cc=jrnieder@gmail.com \
    --cc=matheus.bernardino@usp.br \
    --cc=olyatelezhnaya@gmail.com \
    --cc=pclouds@gmail.com \
    --cc=peff@peff.net \
    --cc=vleschuk@accesssoftek.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Git Mailing List Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/git/0 git/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 git git/ https://lore.kernel.org/git \
		git@vger.kernel.org
	public-inbox-index git

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.git


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git