From: Matheus Tavares <matheus.bernardino@usp.br>
To: git@vger.kernel.org
Cc: "Christian Couder" <christian.couder@gmail.com>,
	"Olga Telezhnaya" <olyatelezhnaya@gmail.com>,
	"Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>,
	"Junio C Hamano" <gitster@pobox.com>, "Jeff King" <peff@peff.net>,
	"Brandon Williams" <bwilliams.eng@gmail.com>,
	"Manav Rathi" <mnvrth@gmail.com>
Subject: [GSoC][PATCH 4/4] grep: re-enable threads in some non-worktree cases
Date: Sat, 10 Aug 2019 17:27:30 -0300
Message-ID: <8c26abe9156e069ad4d19e9f0ce131cd1453f030.1565468806.git.matheus.bernardino@usp.br>
In-Reply-To: <cover.1565468806.git.matheus.bernardino@usp.br>

Threads were disabled in the non-worktree case by 53b8d93 ("grep:
disable threading in non-worktree case", 12-12-2011), due to observable
performance drops. But now that zlib inflation can be performed in
parallel, we can regain the speedup for some of git-grep's options.

Grepping 'abcd[02]' ("Regex 1") and '(static|extern) (int|double) \*'
("Regex 2") in chromium's repository[1], I got:

 Threads |   Regex 1  |  Regex 2
---------|------------|-----------
    1    |  17.3557s  |  20.8410s
    2    |   9.7170s  |  11.2415s
    8    |   6.1723s  |   6.9378s

These are all means of 30 executions after 2 warmup runs. All tests were
executed on an i7-7700HQ with 16GB of RAM and SSD. But to make sure the
optimization also performs well on HDD, the tests were repeated on an
AMD Turion 64 X2 TL-62 (dual-core) with 4GB of RAM and HDD (SATA-150,
5400 rpm):

 Threads |   Regex 1  |  Regex 2
---------|------------|-----------
    1    |  40.3347s  |  47.6173s
    2    |  27.6547s  |  35.1797s

Unfortunately, textconv and submodules' operations remain thread-unsafe,
needing locks to be safely executed when threaded. Because of that, it's
not currently worthwhile to grep in parallel with them. So, when
--textconv or --recurse-submodules is given for a non-worktree case,
threads are kept disabled. In order to clarify this behavior, let's also
add a "NOTES" section to Documentation/git-grep.txt explaining the thread
usage details.

[1]: chromium's repo at commit 03ae96f ("Add filters testing at DSF=2",
     04-06-2019), after a 'git gc' execution.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 Documentation/git-grep.txt | 12 ++++++++++++
 builtin/grep.c             |  3 ++-
 2 files changed, 14 insertions(+), 1 deletion(-)
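
As a reading aid for the builtin/grep.c hunk in this patch, the new
condition can be restated as a standalone predicate. This is only a
sketch under stated assumptions: `struct grep_thread_inputs` and
`must_run_single_threaded()` are hypothetical names that do not exist in
git; the fields merely mirror the variables used in cmd_grep().

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for the relevant cmd_grep() state. The fields
 * mirror the variables tested in the hunk, but this struct is not
 * part of git.
 */
struct grep_thread_inputs {
	bool show_in_pager;      /* --open-files-in-pager */
	bool cached;             /* --cached */
	int nr_objects;          /* list.nr: objects given on the command line */
	bool recurse_submodules; /* --recurse-submodules */
	bool allow_textconv;     /* --textconv */
};

/* Returns true when the patched condition would force num_threads = 1. */
static bool must_run_single_threaded(const struct grep_thread_inputs *in)
{
	/* Grepping objects or the index rather than worktree files. */
	bool non_worktree = in->nr_objects > 0 || in->cached;
	/* Operations that still need locks to run safely when threaded. */
	bool needs_locks = in->recurse_submodules || in->allow_textconv;

	return in->show_in_pager || (non_worktree && needs_locks);
}
```

Under this reading, --cached alone keeps threads enabled, while
--cached together with --textconv (or --recurse-submodules) falls back
to a single thread, and --open-files-in-pager always does.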

diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt
index 2d27969057..9686875fbc 100644
--- a/Documentation/git-grep.txt
+++ b/Documentation/git-grep.txt
@@ -330,6 +330,18 @@ EXAMPLES
 `git grep solution -- :^Documentation`::
 	Looks for `solution`, excluding files in `Documentation`.
 
+NOTES
+-----
+
+The --threads option (and grep.threads configuration) will be ignored when
+--open-files-in-pager is used, forcing a single-threaded execution.
+
+When grepping the object store (with --cached or giving tree objects), the
+following options will also suppress thread creation:
+
+	--recurse-submodules
+	--textconv
+
 GIT
 ---
 Part of the linkgit:git[1] suite
diff --git a/builtin/grep.c b/builtin/grep.c
index fa51392222..e5a9da471a 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -1073,7 +1073,8 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 	pathspec.recursive = 1;
 	pathspec.recurse_submodules = !!recurse_submodules;
 
-	if (list.nr || cached || show_in_pager) {
+	if (show_in_pager ||
+	   ((list.nr || cached) && (recurse_submodules || opt.allow_textconv))) {
 		if (num_threads > 1)
 			warning(_("invalid option combination, ignoring --threads"));
 		num_threads = 1;
-- 
2.22.0

