All of lore.kernel.org
 help / color / mirror / Atom feed
* [GSoC][PATCH 0/4] grep: re-enable threads when cached, w/ parallel inflation
@ 2019-08-10 20:27 Matheus Tavares
  2019-08-10 20:27 ` [GSoC][PATCH 1/4] object-store: add lock to read_object_file_extended() Matheus Tavares
                   ` (4 more replies)
  0 siblings, 5 replies; 47+ messages in thread
From: Matheus Tavares @ 2019-08-10 20:27 UTC (permalink / raw)
  To: git
  Cc: Christian Couder, Olga Telezhnaya, Nguyễn Thái Ngọc Duy

This series focuses on allowing parallel access to zlib inflation and
using that to perform a faster git-grep in the non-worktree case.

Threads were disabled for this case at 53b8d93 ("grep: disable
threading in non-worktree case", 12-12-2011), due to performance drops.

However, by allowing threads to perform inflation in parallel, we can
regain the speedup. This is a good hotspot for parallelism as some test
cases[1] showed that it can account for up to 48% of execution time.
And besides that, inflation tasks are already independent of each other.

As a result, grepping 'abcd[02]' ("Regex 1") and
'(static|extern) (int|double) \*' ("Regex 2") at chromium's
repository[2], I got (means of 30 executions):

     Threads |   Regex 1  |  Regex 2
    ---------|------------|-----------
        1    |  17.3557s  | 20.8410s
        2    |   9.7170s  | 11.2415s
        8    |   6.1723s  |  6.9378s

As a reference, just enabling threads in the non-worktree case,
without parallel inflation, I got:

     Threads |   Regex 1  |  Regex 2
    ---------|------------|-----------
        1    |  17.1359s  | 20.8306s
        2    |  14.5036s  | 15.4172s
        8    |  13.6304s  | 13.8659s

For now, the optimization is not supported when --textconv or
--recurse-submodules are used, but I hope to send another patchset for
that still during GSoC. We may also try to allow even more parallelism,
refining the added 'obj_read_lock'.

[1]: https://matheustavares.gitlab.io/posts/week-6-working-at-zlib-inflation#multithreading-zlib-inflation
[2]: chromium’s repo at commit 03ae96f (“Add filters testing at DSF=2”,
     04-06-2019), after a 'git gc' execution.

travis build: https://travis-ci.org/matheustavares/git/builds/570255029

Matheus Tavares (4):
  object-store: add lock to read_object_file_extended()
  grep: allow locks to be enabled individually
  grep: disable grep_read_mutex when possible
  grep: re-enable threads in some non-worktree cases

 Documentation/git-grep.txt | 12 ++++++++
 builtin/grep.c             | 22 +++++++++++---
 grep.c                     |  4 +--
 grep.h                     |  8 +++--
 object-store.h             |  4 +++
 packfile.c                 |  7 +++++
 sha1-file.c                | 61 ++++++++++++++++++++++++++++++++++----
 7 files changed, 105 insertions(+), 13 deletions(-)

-- 
2.22.0


^ permalink raw reply	[flat|nested] 47+ messages in thread

end of thread, other threads:[~2020-01-30 13:28 UTC | newest]

Thread overview: 47+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-08-10 20:27 [GSoC][PATCH 0/4] grep: re-enable threads when cached, w/ parallel inflation Matheus Tavares
2019-08-10 20:27 ` [GSoC][PATCH 1/4] object-store: add lock to read_object_file_extended() Matheus Tavares
2019-08-10 20:27 ` [GSoC][PATCH 2/4] grep: allow locks to be enabled individually Matheus Tavares
2019-08-10 20:27 ` [GSoC][PATCH 3/4] grep: disable grep_read_mutex when possible Matheus Tavares
2019-08-10 20:27 ` [GSoC][PATCH 4/4] grep: re-enable threads in some non-worktree cases Matheus Tavares
2019-09-30  1:50 ` [PATCH v2 00/11] grep: improve threading and fix race conditions Matheus Tavares
2019-09-30  1:50   ` [PATCH v2 01/11] grep: fix race conditions on userdiff calls Matheus Tavares
2019-09-30  1:50   ` [PATCH v2 02/11] grep: fix race conditions at grep_submodule() Matheus Tavares
2019-09-30  1:50   ` [PATCH v2 03/11] grep: fix racy calls in grep_objects() Matheus Tavares
2019-09-30  1:50   ` [PATCH v2 04/11] replace-object: make replace operations thread-safe Matheus Tavares
2019-09-30  1:50   ` [PATCH v2 05/11] object-store: allow threaded access to object reading Matheus Tavares
2019-11-12  2:54     ` Jonathan Tan
2019-11-13  5:20       ` Jeff King
2019-11-14  5:57         ` Matheus Tavares Bernardino
2019-11-14  6:01           ` Jeff King
2019-11-14 18:15             ` Jonathan Tan
2019-11-15  4:12               ` Jeff King
2019-12-19 22:27                 ` Matheus Tavares Bernardino
2020-01-09 22:02                   ` Matheus Tavares Bernardino
2020-01-10 19:07                     ` Christian Couder
2019-09-30  1:50   ` [PATCH v2 06/11] grep: replace grep_read_mutex by internal obj read lock Matheus Tavares
2019-10-01 19:23     ` [PATCH] squash! " Matheus Tavares
2019-09-30  1:50   ` [PATCH v2 07/11] submodule-config: add skip_if_read option to repo_read_gitmodules() Matheus Tavares
2019-09-30  1:50   ` [PATCH v2 08/11] grep: allow submodule functions to run in parallel Matheus Tavares
2019-09-30  1:50   ` [PATCH v2 09/11] grep: protect packed_git [re-]initialization Matheus Tavares
2019-09-30  1:50   ` [PATCH v2 10/11] grep: re-enable threads in non-worktree case Matheus Tavares
2019-09-30  1:50   ` [PATCH v2 11/11] grep: move driver pre-load out of critical section Matheus Tavares
2020-01-16  2:39   ` [PATCH v3 00/12] grep: improve threading and fix race conditions Matheus Tavares
2020-01-16  2:39     ` [PATCH v3 01/12] grep: fix race conditions on userdiff calls Matheus Tavares
2020-01-16  2:39     ` [PATCH v3 02/12] grep: fix race conditions at grep_submodule() Matheus Tavares
2020-01-16  2:39     ` [PATCH v3 03/12] grep: fix racy calls in grep_objects() Matheus Tavares
2020-01-16  2:39     ` [PATCH v3 04/12] replace-object: make replace operations thread-safe Matheus Tavares
2020-01-16  2:39     ` [PATCH v3 05/12] object-store: allow threaded access to object reading Matheus Tavares
2020-01-16  2:39     ` [PATCH v3 06/12] grep: replace grep_read_mutex by internal obj read lock Matheus Tavares
2020-01-16  2:39     ` [PATCH v3 07/12] submodule-config: add skip_if_read option to repo_read_gitmodules() Matheus Tavares
2020-01-16  2:39     ` [PATCH v3 08/12] grep: allow submodule functions to run in parallel Matheus Tavares
2020-01-29 11:26       ` SZEDER Gábor
2020-01-29 18:49         ` Junio C Hamano
2020-01-29 18:57         ` Junio C Hamano
2020-01-29 20:42           ` Matheus Tavares Bernardino
2020-01-30 13:28             ` Philippe Blain
2020-01-16  2:39     ` [PATCH v3 09/12] grep: protect packed_git [re-]initialization Matheus Tavares
2020-01-16  2:39     ` [PATCH v3 10/12] grep: re-enable threads in non-worktree case Matheus Tavares
2020-01-16  2:39     ` [PATCH v3 11/12] grep: move driver pre-load out of critical section Matheus Tavares
2020-01-16  2:40     ` [PATCH v3 12/12] grep: use no. of cores as the default no. of threads Matheus Tavares
2020-01-16 13:11       ` Victor Leschuk
2020-01-16 14:47         ` [PATCH] " Matheus Tavares

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.