Git Mailing List Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH v2 00/11] grep: improve threading and fix race conditions
  2019-08-10 20:27 [GSoC][PATCH 0/4] grep: re-enable threads when cached, w/ parallel inflation Matheus Tavares
@ 2019-09-30  1:50 ` Matheus Tavares
  2020-01-16  2:39   ` [PATCH v3 00/12] " Matheus Tavares
  0 siblings, 1 reply; 21+ messages in thread
From: Matheus Tavares @ 2019-09-30  1:50 UTC (permalink / raw)
  To: git; +Cc: christian.couder, olyatelezhnaya, pclouds, gitster, jrnieder

This series focus on re-enabling threads at git-grep for the
non-worktree case. They are currently disabled due to being slower than
single-threaded grep in this case. However, by allowing parallel zlib
inflation when reading objects, speedups of up to 3x were observed.

The patchset also contains some fixes for race conditions found in the
worktree git-grep and thread optimizations to hopefully increase
overall performance.

This version was almost entirely re-written from scratch so I thought a
range-diff wouldn't be very useful. The major differences from the first
one are the race condition fixes and being able to run --textconv and
--recurse-submodules threaded now.

Matheus Tavares (11):
  grep: fix race conditions on userdiff calls
  grep: fix race conditions at grep_submodule()
  grep: fix racy calls in grep_objects()
  replace-object: make replace operations thread-safe
  object-store: allow threaded access to object reading
  grep: replace grep_read_mutex by internal obj read lock
  submodule-config: add skip_if_read option to repo_read_gitmodules()
  grep: allow submodule functions to run in parallel
  grep: protect packed_git [re-]initialization
  grep: re-enable threads in non-worktree case
  grep: move driver pre-load out of critical section

 .tsan-suppressions         |  6 +++
 Documentation/git-grep.txt | 11 +++++
 builtin/grep.c             | 90 +++++++++++++++++++-------------------
 grep.c                     | 32 ++++++++------
 grep.h                     | 13 ------
 object-store.h             | 37 ++++++++++++++++
 object.c                   |  2 +
 packfile.c                 |  9 ++++
 replace-object.c           | 11 ++++-
 replace-object.h           |  7 ++-
 sha1-file.c                | 57 +++++++++++++++++++++---
 submodule-config.c         | 18 +++-----
 submodule-config.h         |  2 +-
 unpack-trees.c             |  4 +-
 14 files changed, 205 insertions(+), 94 deletions(-)

-- 
2.23.0


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH v3 00/12] grep: improve threading and fix race conditions
  2019-09-30  1:50 ` [PATCH v2 00/11] grep: improve threading and fix race conditions Matheus Tavares
@ 2020-01-16  2:39   ` " Matheus Tavares
  2020-01-16  2:39     ` [PATCH v3 01/12] grep: fix race conditions on userdiff calls Matheus Tavares
                       ` (11 more replies)
  0 siblings, 12 replies; 21+ messages in thread
From: Matheus Tavares @ 2020-01-16  2:39 UTC (permalink / raw)
  To: git
  Cc: christian.couder, gitster, jrnieder, olyatelezhnaya, pclouds,
	jonathantanmy, peff

This series focus on re-enabling threads at git-grep for the
object store case. They are currently disabled due to being slower than
single-threaded grep in this case. However, by allowing parallel zlib
inflation when reading objects, speedups of up to 3.3x were observed.

The patchset also contains some fixes for race conditions found in the
working tree git-grep and other thread optimizations. With them, the
working tree case reached speedups of up to 1.52x.

Changes since v2:
- Updated commit message of patch 6 and added comment in the code to
  document how we ensure that the unlocked inflation code is thread-safe
  (as suggested by Christian[1]).
- Added patch to adjust the default number of threads in git-grep
  according to the number of available cores.
- Fixed some typos in the commit messages
- Rebased with master (d0654dc)

Note: this patchset is part of my GSoC project, and also of my
bachelor's capstone project (which had the same theme/goal). In case
someone might want to take a look, the project's final essay can be
found here[2] :) Besides extra details, it contains more performance
tests and plots of the resulting speedups.


[1]: https://public-inbox.org/git/CAP8UFD3QHNoePGsc9t0fVxTJpPy+KBCbPfXKGoNOx_2bCf1Hxg@mail.gmail.com/
[2]: https://matheustavares.gitlab.io/assets/tavares-final-essay.pdf
travis build: https://travis-ci.org/matheustavares/git/builds/637708218

Matheus Tavares (12):
  grep: fix race conditions on userdiff calls
  grep: fix race conditions at grep_submodule()
  grep: fix racy calls in grep_objects()
  replace-object: make replace operations thread-safe
  object-store: allow threaded access to object reading
  grep: replace grep_read_mutex by internal obj read lock
  submodule-config: add skip_if_read option to repo_read_gitmodules()
  grep: allow submodule functions to run in parallel
  grep: protect packed_git [re-]initialization
  grep: re-enable threads in non-worktree case
  grep: move driver pre-load out of critical section
  grep: use no. of cores as the default no. of threads

 .tsan-suppressions         |  6 +++
 Documentation/git-grep.txt | 15 +++++-
 builtin/grep.c             | 93 +++++++++++++++++++-------------------
 grep.c                     | 32 +++++++------
 grep.h                     | 13 ------
 object-store.h             | 37 +++++++++++++++
 object.c                   |  2 +
 packfile.c                 | 34 ++++++++++++++
 replace-object.c           | 11 ++++-
 replace-object.h           |  7 ++-
 sha1-file.c                | 57 +++++++++++++++++++++--
 submodule-config.c         | 18 +++-----
 submodule-config.h         |  2 +-
 unpack-trees.c             |  4 +-
 14 files changed, 233 insertions(+), 98 deletions(-)

Range-diff against v2:
 1:  0f31cb0c12 !  1:  e2f3d377f5 grep: fix race conditions on userdiff calls
    @@ Commit message
         git-grep uses an internal grep_read_mutex to protect object reading
         operations. Similarly, there's a grep_attr_mutex to protect calls to the
         gitattributes machinery. However, two of the three functions protected
    -    by the last mutex may also perform object reading, as seen bellow:
    +    by the last mutex may also perform object reading, as seen below:
     
         - userdiff_get_textconv() > notes_cache_init() >
           notes_cache_match_validity() > lookup_commit_reference_gently() >
    @@ Commit message
         that, let's make sure to acquire the lock before both of these calls.
     
         Note: this patch might slow down the threaded grep in worktree, for the
    -    sake of thread-safeness. However, in the following patches we should
    +    sake of thread-safeness. However, in the following patches, we should
         regain performance by replacing grep_read_mutex for an internal object
         reading lock and allowing parallel inflation during object reading.
     
 2:  be32683f1d =  2:  6f0899701b grep: fix race conditions at grep_submodule()
 3:  34aeb218bf !  3:  5295c892ee grep: fix racy calls in grep_objects()
    @@ Commit message
         condition with object reading operations (such as the ones internally
         performed by fill_textconv(), called at fill_textconv_grep()). The same
         problem happens with the call to gitmodules_config_oid() which also has
    -    parse_object() in its call stack. Fix that protecting both call with the
    -    said grep_read_mutex.
    +    parse_object() in its call stack. Fix that protecting both calls with
    +    the said grep_read_mutex.
     
         Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
     
 4:  5deee3cf11 !  4:  d7f739bc57 replace-object: make replace operations thread-safe
    @@ object-store.h: struct raw_object_store {
     
      ## object.c ##
     @@ object.c: struct raw_object_store *raw_object_store_new(void)
    - 
      	memset(o, 0, sizeof(*o));
      	INIT_LIST_HEAD(&o->packed_git_mru);
    + 	hashmap_init(&o->pack_map, pack_map_entry_cmp, NULL, 0);
     +	pthread_mutex_init(&o->replace_mutex, NULL);
      	return o;
      }
 5:  4c5652ab34 !  5:  b72e90f229 object-store: allow threaded access to object reading
    @@ Commit message
         object-store: allow threaded access to object reading
     
         Allow object reading to be performed by multiple threads protecting it
    -    with an internal lock. The lock usage can be toggled with
    -    enable_obj_read_lock() and disable_obj_read_lock(). Currently, the
    +    with an internal lock, the obj_read_mutex. The lock usage can be toggled
    +    with enable_obj_read_lock() and disable_obj_read_lock(). Currently, the
         functions which can be safely called in parallel are:
         read_object_file_extended(), repo_read_object_file(),
         read_object_file(), read_object_with_reference(), read_object(),
    -    oid_object_info() and oid_object_info_extended(). It's also possible to
    -    use obj_read_lock() and obj_read_unlock() to protect other sections that
    -    cannot execute in parallel with object reading.
    +    oid_object_info() and oid_object_info_extended(). It's also possible
    +    to use obj_read_lock() and obj_read_unlock() to protect other sections
    +    that cannot execute in parallel with object reading.
     
         Probably there are many spots in the functions listed above that could
         be executed unlocked (and thus, in parallel). But, for now, we are most
         interested in allowing parallel access to zlib inflation. This is one of
    -    the sections where object reading spends most of the time and it's
    -    already thread-safe. So, to take advantage of that, the respective lock
    -    is released when calling git_inflate() and re-acquired right after, for
    -    every calling spot in oid_object_info_extended()'s call chain. We may
    -    refine the lock to also exploit other possible parallel spots in the
    -    future, but threaded zlib inflation should already give great speedups
    -    for now.
    +    the sections where object reading spends most of the time in (e.g. up to
    +    one-third of git-grep's execution time in the chromium repo corresponds
    +    to inflation) and it's already thread-safe. So, to take advantage of
    +    that, the obj_read_mutex is released when calling git_inflate() and
    +    re-acquired right after, for every calling spot in
    +    oid_object_info_extended()'s call chain. We may refine this lock to also
    +    exploit other possible parallel spots in the future, but for now,
    +    threaded zlib inflation should already give great speedups for threaded
    +    object reading callers.
     
         Note that add_delta_base_cache() was also modified to skip adding
         already present entries to the cache. This wasn't possible before, but
    -    now it is since phase I and phase III of unpack_entry() may execute
    -    concurrently.
    +    it would be now, with the parallel inflation. Take for example the
    +    following situation, where two threads - A and B - are executing the
    +    code at unpack_entry():
     
    -    Another important thing to notice is that the object reading lock only
    -    works in conjunction with the 'struct raw_object_store's replace_mutex.
    -    Otherwise, there would still be racy spots in object reading
    -    functions.
    +    1. Thread A is performing the decompression of a base O (which is not
    +       yet in the cache) at PHASE II. Thread B is simultaneously trying to
    +       unpack O, but just starting at PHASE I.
    +    2. Since O is not yet in the cache, B will go to PHASE II to also
    +       perform the decompression.
    +    3. When they finish decompressing, one of them will get the object
    +       reading mutex and go to PHASE III while the other waits for the
    +       mutex. Let’s say A got the mutex first.
    +    4. Thread A will add O to the cache, go throughout the rest of PHASE III
    +       and return.
    +    5. Thread B gets the mutex, also add O to the cache (if the check wasn't
    +       there) and returns.
    +
    +    Finally, it is also important to highlight that the object reading lock
    +    can only ensure thread-safety in the mentioned functions thanks to two
    +    complementary mechanisms: the use of 'struct raw_object_store's
    +    replace_mutex, which guards sections in the object reading machinery
    +    that would otherwise be thread-unsafe; and the 'struct pack_window's
    +    inuse_cnt, which protects window reading operations (such as the one
    +    performed during the inflation of a packed object), allowing them to
    +    execute without the acquisition of the obj_read_mutex.
     
         Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
     
    @@ packfile.c: unsigned long get_size_from_delta(struct packed_git *p,
      	do {
      		in = use_pack(p, w_curs, curpos, &stream.avail_in);
      		stream.next_in = in;
    ++		/*
    ++		 * Note: the window section returned by use_pack() must be
    ++		 * available throughout git_inflate()'s unlocked execution. To
    ++		 * ensure no other thread will modify the window in the
    ++		 * meantime, we rely on the packed_window.inuse_cnt. This
    ++		 * counter is incremented before window reading and checked
    ++		 * before window disposal.
    ++		 *
    ++		 * Other worrying sections could be the call to close_pack_fd(),
    ++		 * which can close packs even with in-use windows, and to
    ++		 * reprepare_packed_git(). Regarding the former, mmap doc says:
    ++		 * "closing the file descriptor does not unmap the region". And
    ++		 * for the latter, it won't re-open already available packs.
    ++		 */
     +		obj_read_unlock();
      		st = git_inflate(&stream, Z_FINISH);
     +		obj_read_lock();
    @@ packfile.c: static void add_delta_base_cache(struct packed_git *p, off_t base_of
      	struct delta_base_cache_entry *ent = xmalloc(sizeof(*ent));
      	struct list_head *lru, *tmp;
      
    -+	if (get_delta_base_cache_entry(p, base_offset))
    ++	/*
    ++	 * Check required to avoid redundant entries when more than one thread
    ++	 * is unpacking the same object, in unpack_entry() (since its phases I
    ++	 * and III might run concurrently across multiple threads).
    ++	 */
    ++	if (in_delta_base_cache(p, base_offset))
     +		return;
     +
      	delta_base_cached += base_size;
    @@ packfile.c: static void *unpack_compressed_entry(struct packed_git *p,
      	do {
      		in = use_pack(p, w_curs, curpos, &stream.avail_in);
      		stream.next_in = in;
    ++		/*
    ++		 * Note: we must ensure the window section returned by
    ++		 * use_pack() will be available throughout git_inflate()'s
    ++		 * unlocked execution. Please refer to the comment at
    ++		 * get_size_from_delta() to see how this is done.
    ++		 */
     +		obj_read_unlock();
      		st = git_inflate(&stream, Z_FINISH);
     +		obj_read_lock();
 6:  b439bdeed3 =  6:  fc1200bb07 grep: replace grep_read_mutex by internal obj read lock
 7:  d759f1e8c7 !  7:  d39d2ce9c4 submodule-config: add skip_if_read option to repo_read_gitmodules()
    @@ Metadata
      ## Commit message ##
         submodule-config: add skip_if_read option to repo_read_gitmodules()
     
    -    Currently, submodule-config.c doesn't have an externally acessible
    +    Currently, submodule-config.c doesn't have an externally accessible
         function to read gitmodules only if it wasn't already read. But this
    -    exactly behavior is internally implemented by gitmodules_read_check(),
    -    to perform a lazy load. Let's merge this function with
    -    repo_read_gitmodules() adding an 'skip_if_read' which allow both
    +    exact behavior is internally implemented by gitmodules_read_check(), to
    +    perform a lazy load. Let's merge this function with
    +    repo_read_gitmodules() adding a 'skip_if_read' which allows both
         internal and external callers to access this functionality. This
    -    simplifies a little the code. The added option will also be used in the
    -    following patch.
    +    simplifies a little the code. The added option will also be used in
    +    the following patch.
     
         Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
     
    @@ submodule-config.h: int option_fetch_parse_recurse_submodules(const struct optio
     -void repo_read_gitmodules(struct repository *repo);
     +void repo_read_gitmodules(struct repository *repo, int skip_if_read);
      void gitmodules_config_oid(const struct object_id *commit_oid);
    - const struct submodule *submodule_from_name(struct repository *r,
    - 					    const struct object_id *commit_or_tree,
    + 
    + /**
     
      ## unpack-trees.c ##
     @@ unpack-trees.c: static void load_gitmodules_file(struct index_state *index,
 8:  2e76847ec9 !  8:  af8ad95d41 grep: allow submodule functions to run in parallel
    @@ Commit message
           race condition with object reading (for example parse_object() and
           is_promisor_remote()). However, they only call repo_read_gitmodules()
           if it wasn't read before. So let's pre-read it before firing the
    -      threads and allow these two functions to safelly be called in
    +      threads and allow these two functions to safely be called in
           parallel.
     
         - repo_submodule_init() is already thread-safe, so remove it from the
    @@ Commit message
           protected for now. Let's add a "NEEDSWORK" to it, informing why it
           cannot be removed from the critical section yet.
     
    -    - Finally, add_to_alternates_memory() must be kept protected by the same
    -      reason of the above item.
    +    - Finally, add_to_alternates_memory() must be kept protected for the
    +      same reason as the item above.
     
         Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
     
 9:  134114b001 !  9:  0ccf79ba86 grep: protect packed_git [re-]initialization
    @@ Commit message
         happen often, it could be hard to detect. So to prevent future
         headaches, let's force eager initialization of packed_git when setting
         git-grep up. There'll be a small overhead in the cases where we didn't
    -    really needed to prepare packed_git during execution but this shouldn't
    -    be very noticeable.
    +    really need to prepare packed_git during execution but this shouldn't be
    +    very noticeable.
     
         Also, packed_git may be re-initialized by
         packfile.c:reprepare_packed_git(). Again, all paths to it in git-grep
10:  04a4d81ff5 ! 10:  6c09e9169d grep: re-enable threads in non-worktree case
    @@ Commit message
             8**  |   6.2886s  |   6.9843s
     
         These are all means of 30 executions after 2 warmup runs. All tests were
    -    executed on an i7-7700HQ (quad core w/ hyper-threading), 16GB of RAM and
    +    executed on an i7-7700HQ (quad-core w/ hyper-threading), 16GB of RAM and
         SSD, running Manjaro Linux. But to make sure the optimization also
         performs well on HDD, the tests were repeated on another machine with an
    -    i5-4210U (dual core w/ hyper-threading), 8GB of RAM and HDD (SATA III,
    +    i5-4210U (dual-core w/ hyper-threading), 8GB of RAM and HDD (SATA III,
         5400 rpm), also running Manjaro Linux:
     
          Threads |   Regex 1  |  Regex 2
    @@ Commit message
         case when --textconv is used and there're too many text conversions.
         Probably the reason for this is that the object read lock is used to
         protect fill_textconv() and therefore there is a mutual exclusion
    -    between textconv execution and object reading. Because both are time
    -    consuming operations, not being able to perform them in parallel can
    -    cause performance drops. To inform the users about this (and other
    -    threading detais), let's also add a "NOTES ON THREADS" section to
    +    between textconv execution and object reading. Because both are
    +    time-consuming operations, not being able to perform them in parallel
    +    can cause performance drops. To inform the users about this (and other
    +    threading details), let's also add a "NOTES ON THREADS" section to
         Documentation/git-grep.txt.
     
         [1]: chromium’s repo at commit 03ae96f (“Add filters testing at DSF=2”,
11:  4f6f8b611c = 11:  2f72f30341 grep: move driver pre-load out of critical section
 -:  ---------- > 12:  a5891176d7 grep: use no. of cores as the default no. of threads
-- 
2.24.1


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH v3 01/12] grep: fix race conditions on userdiff calls
  2020-01-16  2:39   ` [PATCH v3 00/12] " Matheus Tavares
@ 2020-01-16  2:39     ` Matheus Tavares
  2020-01-16  2:39     ` [PATCH v3 02/12] grep: fix race conditions at grep_submodule() Matheus Tavares
                       ` (10 subsequent siblings)
  11 siblings, 0 replies; 21+ messages in thread
From: Matheus Tavares @ 2020-01-16  2:39 UTC (permalink / raw)
  To: git
  Cc: christian.couder, gitster, jrnieder, olyatelezhnaya, pclouds,
	jonathantanmy, peff, Ramkumar Ramachandra, Eric Wong

git-grep uses an internal grep_read_mutex to protect object reading
operations. Similarly, there's a grep_attr_mutex to protect calls to the
gitattributes machinery. However, two of the three functions protected
by the last mutex may also perform object reading, as seen below:

- userdiff_get_textconv() > notes_cache_init() >
  notes_cache_match_validity() > lookup_commit_reference_gently() >
  parse_object() > repo_has_object_file() >
  repo_has_object_file_with_flags() > oid_object_info_extended()

- userdiff_find_by_path() > git_check_attr() > collect_some_attrs() >
  prepare_attr_stack() > read_attr() > read_attr_from_index() >
  read_blob_data_from_index() > read_object_file()

As these calls are not protected by grep_read_mutex, there might be race
conditions with other threads performing object reading (e.g. threads
calling fill_textconv() at grep.c:fill_textconv_grep()). To prevent
that, let's make sure to acquire the lock before both of these calls.

Note: this patch might slow down the threaded grep in worktree, for the
sake of thread-safeness. However, in the following patches, we should
regain performance by replacing grep_read_mutex for an internal object
reading lock and allowing parallel inflation during object reading.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 grep.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/grep.c b/grep.c
index 0552b127c1..c028f70aba 100644
--- a/grep.c
+++ b/grep.c
@@ -1816,7 +1816,9 @@ static int grep_source_1(struct grep_opt *opt, struct grep_source *gs, int colle
 		 * is not thread-safe.
 		 */
 		grep_attr_lock();
+		grep_read_lock();
 		textconv = userdiff_get_textconv(opt->repo, gs->driver);
+		grep_read_unlock();
 		grep_attr_unlock();
 	}
 
@@ -2184,8 +2186,11 @@ void grep_source_load_driver(struct grep_source *gs,
 		return;
 
 	grep_attr_lock();
-	if (gs->path)
+	if (gs->path) {
+		grep_read_lock();
 		gs->driver = userdiff_find_by_path(istate, gs->path);
+		grep_read_unlock();
+	}
 	if (!gs->driver)
 		gs->driver = userdiff_find_by_name("default");
 	grep_attr_unlock();
-- 
2.24.1


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH v3 02/12] grep: fix race conditions at grep_submodule()
  2020-01-16  2:39   ` [PATCH v3 00/12] " Matheus Tavares
  2020-01-16  2:39     ` [PATCH v3 01/12] grep: fix race conditions on userdiff calls Matheus Tavares
@ 2020-01-16  2:39     ` Matheus Tavares
  2020-01-16  2:39     ` [PATCH v3 03/12] grep: fix racy calls in grep_objects() Matheus Tavares
                       ` (9 subsequent siblings)
  11 siblings, 0 replies; 21+ messages in thread
From: Matheus Tavares @ 2020-01-16  2:39 UTC (permalink / raw)
  To: git
  Cc: christian.couder, gitster, jrnieder, olyatelezhnaya, pclouds,
	jonathantanmy, peff, Brandon Williams, Antonio Ospite,
	Stefan Beller

There're currently two function calls in builtin/grep.c:grep_submodule()
which might result in race conditions:

- submodule_from_path(): it has config_with_options() in its call stack
  which, in turn, may have read_object_file() in its own. Therefore,
  calling the first function without acquiring grep_read_mutex may end
  up causing a race condition with other object read operations
  performed by worker threads (for example, at the fill_textconv()
  call in grep.c:fill_textconv_grep()).
- parse_object_or_die(): it falls into the same problem, having
  repo_has_object_file(the_repository, ...) in its call stack. Besides
  that, parse_object(), which is also called by parse_object_or_die(),
  is thread-unsafe and also called by object reading functions.

It's unlikely to really fall into a data race with these operations as
the volume of calls to them is usually very low. But we better protect
ourselves against this possibility, anyway. So, to solve these issues,
move both of these function calls into the critical section of
grep_read_mutex.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 builtin/grep.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 50ce8d9461..896e7effce 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -407,8 +407,7 @@ static int grep_submodule(struct grep_opt *opt,
 {
 	struct repository subrepo;
 	struct repository *superproject = opt->repo;
-	const struct submodule *sub = submodule_from_path(superproject,
-							  &null_oid, path);
+	const struct submodule *sub;
 	struct grep_opt subopt;
 	int hit;
 
@@ -419,6 +418,7 @@ static int grep_submodule(struct grep_opt *opt,
 	 * object.
 	 */
 	grep_read_lock();
+	sub = submodule_from_path(superproject, &null_oid, path);
 
 	if (!is_submodule_active(superproject, path)) {
 		grep_read_unlock();
@@ -455,9 +455,8 @@ static int grep_submodule(struct grep_opt *opt,
 		unsigned long size;
 		struct strbuf base = STRBUF_INIT;
 
-		object = parse_object_or_die(oid, oid_to_hex(oid));
-
 		grep_read_lock();
+		object = parse_object_or_die(oid, oid_to_hex(oid));
 		data = read_object_with_reference(&subrepo,
 						  &object->oid, tree_type,
 						  &size, NULL);
-- 
2.24.1


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH v3 03/12] grep: fix racy calls in grep_objects()
  2020-01-16  2:39   ` [PATCH v3 00/12] " Matheus Tavares
  2020-01-16  2:39     ` [PATCH v3 01/12] grep: fix race conditions on userdiff calls Matheus Tavares
  2020-01-16  2:39     ` [PATCH v3 02/12] grep: fix race conditions at grep_submodule() Matheus Tavares
@ 2020-01-16  2:39     ` Matheus Tavares
  2020-01-16  2:39     ` [PATCH v3 04/12] replace-object: make replace operations thread-safe Matheus Tavares
                       ` (8 subsequent siblings)
  11 siblings, 0 replies; 21+ messages in thread
From: Matheus Tavares @ 2020-01-16  2:39 UTC (permalink / raw)
  To: git
  Cc: christian.couder, gitster, jrnieder, olyatelezhnaya, pclouds,
	jonathantanmy, peff, Matthieu Moy, brian m. carlson, Alex Henrie,
	Brandon Williams, Stefan Beller

deref_tag() calls is_promisor_object() and parse_object(), both of which
perform lazy initializations and other thread-unsafe operations. If it
was only called by grep_objects() this wouldn't be a problem as the
latter is only executed by the main thread. However, deref_tag() is also
present in read_object_file()'s call stack. So calling deref_tag() in
grep_objects() without acquiring the grep_read_mutex may incur in a race
condition with object reading operations (such as the ones internally
performed by fill_textconv(), called at fill_textconv_grep()). The same
problem happens with the call to gitmodules_config_oid() which also has
parse_object() in its call stack. Fix that protecting both calls with
the said grep_read_mutex.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 builtin/grep.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/builtin/grep.c b/builtin/grep.c
index 896e7effce..91fc032a32 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -658,13 +658,18 @@ static int grep_objects(struct grep_opt *opt, const struct pathspec *pathspec,
 
 	for (i = 0; i < nr; i++) {
 		struct object *real_obj;
+
+		grep_read_lock();
 		real_obj = deref_tag(opt->repo, list->objects[i].item,
 				     NULL, 0);
+		grep_read_unlock();
 
 		/* load the gitmodules file for this rev */
 		if (recurse_submodules) {
 			submodule_free(opt->repo);
+			grep_read_lock();
 			gitmodules_config_oid(&real_obj->oid);
+			grep_read_unlock();
 		}
 		if (grep_object(opt, pathspec, real_obj, list->objects[i].name,
 				list->objects[i].path)) {
-- 
2.24.1


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH v3 04/12] replace-object: make replace operations thread-safe
  2020-01-16  2:39   ` [PATCH v3 00/12] " Matheus Tavares
                       ` (2 preceding siblings ...)
  2020-01-16  2:39     ` [PATCH v3 03/12] grep: fix racy calls in grep_objects() Matheus Tavares
@ 2020-01-16  2:39     ` Matheus Tavares
  2020-01-16  2:39     ` [PATCH v3 05/12] object-store: allow threaded access to object reading Matheus Tavares
                       ` (7 subsequent siblings)
  11 siblings, 0 replies; 21+ messages in thread
From: Matheus Tavares @ 2020-01-16  2:39 UTC (permalink / raw)
  To: git
  Cc: christian.couder, gitster, jrnieder, olyatelezhnaya, pclouds,
	jonathantanmy, peff, Stefan Beller

replace-object functions are very close to being thread-safe: the only
current racy section is the lazy initialization at
prepare_replace_object(). The following patches will protect some object
reading operations to be called threaded, but before that, replace
functions must be protected. To do so, add a mutex to struct
raw_object_store and acquire it before lazy initializing the
replace_map. This won't cause any noticeable performance drop as the
mutex will no longer be used after the replace_map is initialized.

Later, when the replace functions are called in parallel, thread
debuggers might point our use of the added replace_map_initialized flag
as a data race. However, as this boolean variable is initialized as
false and it's only updated once, there's no real harm. It's perfectly
fine if the value is updated right after a thread read it in
replace-map.h:lookup_replace_object() (there'll only be a performance
penalty for the affected threads at that moment). We could cease the
debugger warning protecting the variable reading at the said function.
However, this would negatively affect performance for all threads
calling it, at any time, so it's not really worthy since the warning
doesn't represent a real problem. Instead, to make sure we don't get
false positives (at ThreadSanitizer, at least) an entry for the
respective function is added to .tsan-suppressions.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 .tsan-suppressions |  6 ++++++
 object-store.h     |  2 ++
 object.c           |  2 ++
 replace-object.c   | 11 ++++++++++-
 replace-object.h   |  7 ++++++-
 5 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/.tsan-suppressions b/.tsan-suppressions
index 8c85014a0a..5ba86d6845 100644
--- a/.tsan-suppressions
+++ b/.tsan-suppressions
@@ -8,3 +8,9 @@
 # in practice it (hopefully!) doesn't matter.
 race:^want_color$
 race:^transfer_debug$
+
+# A boolean value, which tells whether the replace_map has been initialized or
+# not, is read racily with an update. As this variable is written to only once,
+# and it's OK if the value change right after reading it, this shouldn't be a
+# problem.
+race:^lookup_replace_object$
diff --git a/object-store.h b/object-store.h
index 55ee639350..33739c9dee 100644
--- a/object-store.h
+++ b/object-store.h
@@ -125,6 +125,8 @@ struct raw_object_store {
 	 * (see git-replace(1)).
 	 */
 	struct oidmap *replace_map;
+	unsigned replace_map_initialized : 1;
+	pthread_mutex_t replace_mutex; /* protect object replace functions */
 
 	struct commit_graph *commit_graph;
 	unsigned commit_graph_attempted : 1; /* if loading has been attempted */
diff --git a/object.c b/object.c
index 142ef69399..b4e1d3db3c 100644
--- a/object.c
+++ b/object.c
@@ -480,6 +480,7 @@ struct raw_object_store *raw_object_store_new(void)
 	memset(o, 0, sizeof(*o));
 	INIT_LIST_HEAD(&o->packed_git_mru);
 	hashmap_init(&o->pack_map, pack_map_entry_cmp, NULL, 0);
+	pthread_mutex_init(&o->replace_mutex, NULL);
 	return o;
 }
 
@@ -507,6 +508,7 @@ void raw_object_store_clear(struct raw_object_store *o)
 
 	oidmap_free(o->replace_map, 1);
 	FREE_AND_NULL(o->replace_map);
+	pthread_mutex_destroy(&o->replace_mutex);
 
 	free_commit_graph(o->commit_graph);
 	o->commit_graph = NULL;
diff --git a/replace-object.c b/replace-object.c
index e295e87943..7bd9aba6ee 100644
--- a/replace-object.c
+++ b/replace-object.c
@@ -34,14 +34,23 @@ static int register_replace_ref(struct repository *r,
 
 void prepare_replace_object(struct repository *r)
 {
-	if (r->objects->replace_map)
+	if (r->objects->replace_map_initialized)
 		return;
 
+	pthread_mutex_lock(&r->objects->replace_mutex);
+	if (r->objects->replace_map_initialized) {
+		pthread_mutex_unlock(&r->objects->replace_mutex);
+		return;
+	}
+
 	r->objects->replace_map =
 		xmalloc(sizeof(*r->objects->replace_map));
 	oidmap_init(r->objects->replace_map, 0);
 
 	for_each_replace_ref(r, register_replace_ref, NULL);
+	r->objects->replace_map_initialized = 1;
+
+	pthread_mutex_unlock(&r->objects->replace_mutex);
 }
 
 /* We allow "recursive" replacement. Only within reason, though */
diff --git a/replace-object.h b/replace-object.h
index 04ed7a85a2..3fbc32eb7b 100644
--- a/replace-object.h
+++ b/replace-object.h
@@ -24,12 +24,17 @@ const struct object_id *do_lookup_replace_object(struct repository *r,
  * name (replaced recursively, if necessary).  The return value is
  * either sha1 or a pointer to a permanently-allocated value.  When
  * object replacement is suppressed, always return sha1.
+ *
+ * Note: some thread debuggers might point a data race on the
+ * replace_map_initialized reading in this function. However, we know there's no
+ * problem in the value being updated by one thread right after another one read
+ * it here (and it should be written to only once, anyway).
  */
 static inline const struct object_id *lookup_replace_object(struct repository *r,
 							    const struct object_id *oid)
 {
 	if (!read_replace_refs ||
-	    (r->objects->replace_map &&
+	    (r->objects->replace_map_initialized &&
 	     r->objects->replace_map->map.tablesize == 0))
 		return oid;
 	return do_lookup_replace_object(r, oid);
-- 
2.24.1


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH v3 05/12] object-store: allow threaded access to object reading
  2020-01-16  2:39   ` [PATCH v3 00/12] " Matheus Tavares
                       ` (3 preceding siblings ...)
  2020-01-16  2:39     ` [PATCH v3 04/12] replace-object: make replace operations thread-safe Matheus Tavares
@ 2020-01-16  2:39     ` Matheus Tavares
  2020-01-16  2:39     ` [PATCH v3 06/12] grep: replace grep_read_mutex by internal obj read lock Matheus Tavares
                       ` (6 subsequent siblings)
  11 siblings, 0 replies; 21+ messages in thread
From: Matheus Tavares @ 2020-01-16  2:39 UTC (permalink / raw)
  To: git
  Cc: christian.couder, gitster, jrnieder, olyatelezhnaya, pclouds,
	jonathantanmy, peff, Stefan Beller

Allow object reading to be performed by multiple threads protecting it
with an internal lock, the obj_read_mutex. The lock usage can be toggled
with enable_obj_read_lock() and disable_obj_read_lock(). Currently, the
functions which can be safely called in parallel are:
read_object_file_extended(), repo_read_object_file(),
read_object_file(), read_object_with_reference(), read_object(),
oid_object_info() and oid_object_info_extended(). It's also possible
to use obj_read_lock() and obj_read_unlock() to protect other sections
that cannot execute in parallel with object reading.

Probably there are many spots in the functions listed above that could
be executed unlocked (and thus, in parallel). But, for now, we are most
interested in allowing parallel access to zlib inflation. This is one of
the sections where object reading spends most of the time in (e.g. up to
one-third of git-grep's execution time in the chromium repo corresponds
to inflation) and it's already thread-safe. So, to take advantage of
that, the obj_read_mutex is released when calling git_inflate() and
re-acquired right after, for every calling spot in
oid_object_info_extended()'s call chain. We may refine this lock to also
exploit other possible parallel spots in the future, but for now,
threaded zlib inflation should already give great speedups for threaded
object reading callers.

Note that add_delta_base_cache() was also modified to skip adding
already present entries to the cache. This wasn't possible before, but
it would be now, with the parallel inflation. Take for example the
following situation, where two threads - A and B - are executing the
code at unpack_entry():

1. Thread A is performing the decompression of a base O (which is not
   yet in the cache) at PHASE II. Thread B is simultaneously trying to
   unpack O, but just starting at PHASE I.
2. Since O is not yet in the cache, B will go to PHASE II to also
   perform the decompression.
3. When they finish decompressing, one of them will get the object
   reading mutex and go to PHASE III while the other waits for the
   mutex. Let’s say A got the mutex first.
4. Thread A will add O to the cache, go throughout the rest of PHASE III
   and return.
5. Thread B gets the mutex, also add O to the cache (if the check wasn't
   there) and returns.

Finally, it is also important to highlight that the object reading lock
can only ensure thread-safety in the mentioned functions thanks to two
complementary mechanisms: the use of 'struct raw_object_store's
replace_mutex, which guards sections in the object reading machinery
that would otherwise be thread-unsafe; and the 'struct pack_window's
inuse_cnt, which protects window reading operations (such as the one
performed during the inflation of a packed object), allowing them to
execute without the acquisition of the obj_read_mutex.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 object-store.h | 35 +++++++++++++++++++++++++++++++
 packfile.c     | 32 ++++++++++++++++++++++++++++
 sha1-file.c    | 57 +++++++++++++++++++++++++++++++++++++++++++++-----
 3 files changed, 119 insertions(+), 5 deletions(-)

diff --git a/object-store.h b/object-store.h
index 33739c9dee..7c80e0d64c 100644
--- a/object-store.h
+++ b/object-store.h
@@ -6,6 +6,7 @@
 #include "list.h"
 #include "sha1-array.h"
 #include "strbuf.h"
+#include "thread-utils.h"
 
 struct object_directory {
 	struct object_directory *next;
@@ -251,6 +252,40 @@ int has_loose_object_nonlocal(const struct object_id *);
 
 void assert_oid_type(const struct object_id *oid, enum object_type expect);
 
+/*
+ * Enabling the object read lock allows multiple threads to safely call the
+ * following functions in parallel: repo_read_object_file(), read_object_file(),
+ * read_object_file_extended(), read_object_with_reference(), read_object(),
+ * oid_object_info() and oid_object_info_extended().
+ *
+ * obj_read_lock() and obj_read_unlock() may also be used to protect other
+ * section which cannot execute in parallel with object reading. Since the used
+ * lock is a recursive mutex, these sections can even contain calls to object
+ * reading functions. However, beware that in these cases zlib inflation won't
+ * be performed in parallel, losing performance.
+ *
+ * TODO: oid_object_info_extended()'s call stack has a recursive behavior. If
+ * any of its callees end up calling it, this recursive call won't benefit from
+ * parallel inflation.
+ */
+void enable_obj_read_lock(void);
+void disable_obj_read_lock(void);
+
+extern int obj_read_use_lock;
+extern pthread_mutex_t obj_read_mutex;
+
+static inline void obj_read_lock(void)
+{
+	if(obj_read_use_lock)
+		pthread_mutex_lock(&obj_read_mutex);
+}
+
+static inline void obj_read_unlock(void)
+{
+	if(obj_read_use_lock)
+		pthread_mutex_unlock(&obj_read_mutex);
+}
+
 struct object_info {
 	/* Request */
 	enum object_type *typep;
diff --git a/packfile.c b/packfile.c
index 7e7c04e4d8..24a73fc33a 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1086,7 +1086,23 @@ unsigned long get_size_from_delta(struct packed_git *p,
 	do {
 		in = use_pack(p, w_curs, curpos, &stream.avail_in);
 		stream.next_in = in;
+		/*
+		 * Note: the window section returned by use_pack() must be
+		 * available throughout git_inflate()'s unlocked execution. To
+		 * ensure no other thread will modify the window in the
+		 * meantime, we rely on the packed_window.inuse_cnt. This
+		 * counter is incremented before window reading and checked
+		 * before window disposal.
+		 *
+		 * Other worrying sections could be the call to close_pack_fd(),
+		 * which can close packs even with in-use windows, and to
+		 * reprepare_packed_git(). Regarding the former, mmap doc says:
+		 * "closing the file descriptor does not unmap the region". And
+		 * for the latter, it won't re-open already available packs.
+		 */
+		obj_read_unlock();
 		st = git_inflate(&stream, Z_FINISH);
+		obj_read_lock();
 		curpos += stream.next_in - in;
 	} while ((st == Z_OK || st == Z_BUF_ERROR) &&
 		 stream.total_out < sizeof(delta_head));
@@ -1445,6 +1461,14 @@ static void add_delta_base_cache(struct packed_git *p, off_t base_offset,
 	struct delta_base_cache_entry *ent = xmalloc(sizeof(*ent));
 	struct list_head *lru, *tmp;
 
+	/*
+	 * Check required to avoid redundant entries when more than one thread
+	 * is unpacking the same object, in unpack_entry() (since its phases I
+	 * and III might run concurrently across multiple threads).
+	 */
+	if (in_delta_base_cache(p, base_offset))
+		return;
+
 	delta_base_cached += base_size;
 
 	list_for_each_safe(lru, tmp, &delta_base_cache_lru) {
@@ -1574,7 +1598,15 @@ static void *unpack_compressed_entry(struct packed_git *p,
 	do {
 		in = use_pack(p, w_curs, curpos, &stream.avail_in);
 		stream.next_in = in;
+		/*
+		 * Note: we must ensure the window section returned by
+		 * use_pack() will be available throughout git_inflate()'s
+		 * unlocked execution. Please refer to the comment at
+		 * get_size_from_delta() to see how this is done.
+		 */
+		obj_read_unlock();
 		st = git_inflate(&stream, Z_FINISH);
+		obj_read_lock();
 		if (!stream.avail_out)
 			break; /* the payload is larger than it should be */
 		curpos += stream.next_in - in;
diff --git a/sha1-file.c b/sha1-file.c
index 188de57634..9dc0649748 100644
--- a/sha1-file.c
+++ b/sha1-file.c
@@ -1147,6 +1147,8 @@ static int unpack_loose_short_header(git_zstream *stream,
 				     unsigned char *map, unsigned long mapsize,
 				     void *buffer, unsigned long bufsiz)
 {
+	int ret;
+
 	/* Get the data stream */
 	memset(stream, 0, sizeof(*stream));
 	stream->next_in = map;
@@ -1155,7 +1157,11 @@ static int unpack_loose_short_header(git_zstream *stream,
 	stream->avail_out = bufsiz;
 
 	git_inflate_init(stream);
-	return git_inflate(stream, 0);
+	obj_read_unlock();
+	ret = git_inflate(stream, 0);
+	obj_read_lock();
+
+	return ret;
 }
 
 int unpack_loose_header(git_zstream *stream,
@@ -1200,7 +1206,9 @@ static int unpack_loose_header_to_strbuf(git_zstream *stream, unsigned char *map
 	stream->avail_out = bufsiz;
 
 	do {
+		obj_read_unlock();
 		status = git_inflate(stream, 0);
+		obj_read_lock();
 		strbuf_add(header, buffer, stream->next_out - (unsigned char *)buffer);
 		if (memchr(buffer, '\0', stream->next_out - (unsigned char *)buffer))
 			return 0;
@@ -1240,8 +1248,11 @@ static void *unpack_loose_rest(git_zstream *stream,
 		 */
 		stream->next_out = buf + bytes;
 		stream->avail_out = size - bytes;
-		while (status == Z_OK)
+		while (status == Z_OK) {
+			obj_read_unlock();
 			status = git_inflate(stream, Z_FINISH);
+			obj_read_lock();
+		}
 	}
 	if (status == Z_STREAM_END && !stream->avail_in) {
 		git_inflate_end(stream);
@@ -1411,10 +1422,32 @@ static int loose_object_info(struct repository *r,
 	return (status < 0) ? status : 0;
 }
 
+int obj_read_use_lock = 0;
+pthread_mutex_t obj_read_mutex;
+
+void enable_obj_read_lock(void)
+{
+	if (obj_read_use_lock)
+		return;
+
+	obj_read_use_lock = 1;
+	init_recursive_mutex(&obj_read_mutex);
+}
+
+void disable_obj_read_lock(void)
+{
+	if (!obj_read_use_lock)
+		return;
+
+	obj_read_use_lock = 0;
+	pthread_mutex_destroy(&obj_read_mutex);
+}
+
 int fetch_if_missing = 1;
 
-int oid_object_info_extended(struct repository *r, const struct object_id *oid,
-			     struct object_info *oi, unsigned flags)
+static int do_oid_object_info_extended(struct repository *r,
+				       const struct object_id *oid,
+				       struct object_info *oi, unsigned flags)
 {
 	static struct object_info blank_oi = OBJECT_INFO_INIT;
 	struct pack_entry e;
@@ -1422,6 +1455,7 @@ int oid_object_info_extended(struct repository *r, const struct object_id *oid,
 	const struct object_id *real = oid;
 	int already_retried = 0;
 
+
 	if (flags & OBJECT_INFO_LOOKUP_REPLACE)
 		real = lookup_replace_object(r, oid);
 
@@ -1497,7 +1531,7 @@ int oid_object_info_extended(struct repository *r, const struct object_id *oid,
 	rtype = packed_object_info(r, e.p, e.offset, oi);
 	if (rtype < 0) {
 		mark_bad_packed_object(e.p, real->hash);
-		return oid_object_info_extended(r, real, oi, 0);
+		return do_oid_object_info_extended(r, real, oi, 0);
 	} else if (oi->whence == OI_PACKED) {
 		oi->u.packed.offset = e.offset;
 		oi->u.packed.pack = e.p;
@@ -1508,6 +1542,17 @@ int oid_object_info_extended(struct repository *r, const struct object_id *oid,
 	return 0;
 }
 
+int oid_object_info_extended(struct repository *r, const struct object_id *oid,
+			     struct object_info *oi, unsigned flags)
+{
+	int ret;
+	obj_read_lock();
+	ret = do_oid_object_info_extended(r, oid, oi, flags);
+	obj_read_unlock();
+	return ret;
+}
+
+
 /* returns enum object_type or negative */
 int oid_object_info(struct repository *r,
 		    const struct object_id *oid,
@@ -1580,6 +1625,7 @@ void *read_object_file_extended(struct repository *r,
 	if (data)
 		return data;
 
+	obj_read_lock();
 	if (errno && errno != ENOENT)
 		die_errno(_("failed to read object %s"), oid_to_hex(oid));
 
@@ -1595,6 +1641,7 @@ void *read_object_file_extended(struct repository *r,
 	if ((p = has_packed_and_bad(r, repl->hash)) != NULL)
 		die(_("packed object %s (stored in %s) is corrupt"),
 		    oid_to_hex(repl), p->pack_name);
+	obj_read_unlock();
 
 	return NULL;
 }
-- 
2.24.1


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH v3 06/12] grep: replace grep_read_mutex by internal obj read lock
  2020-01-16  2:39   ` [PATCH v3 00/12] " Matheus Tavares
                       ` (4 preceding siblings ...)
  2020-01-16  2:39     ` [PATCH v3 05/12] object-store: allow threaded access to object reading Matheus Tavares
@ 2020-01-16  2:39     ` Matheus Tavares
  2020-01-16  2:39     ` [PATCH v3 07/12] submodule-config: add skip_if_read option to repo_read_gitmodules() Matheus Tavares
                       ` (5 subsequent siblings)
  11 siblings, 0 replies; 21+ messages in thread
From: Matheus Tavares @ 2020-01-16  2:39 UTC (permalink / raw)
  To: git
  Cc: christian.couder, gitster, jrnieder, olyatelezhnaya, pclouds,
	jonathantanmy, peff, brian m. carlson, Stefan Beller,
	Brandon Williams

git-grep uses 'grep_read_mutex' to protect its calls to object reading
operations. But these have their own internal lock now, which ensures a
better performance (allowing parallel access to more regions). So, let's
remove the former and, instead, activate the latter with
enable_obj_read_lock().

Sections that are currently protected by 'grep_read_mutex' but are not
internally protected by the object reading lock should be surrounded by
obj_read_lock() and obj_read_unlock(). These guarantee mutual exclusion
with object reading operations, keeping the current behavior and
avoiding race conditions. Namely, these places are:

  In grep.c:

  - fill_textconv() at fill_textconv_grep().
  - userdiff_get_textconv() at grep_source_1().

  In builtin/grep.c:

  - parse_object_or_die() and the submodule functions at
    grep_submodule().
  - deref_tag() and gitmodules_config_oid() at grep_objects().

If these functions become thread-safe, in the future, we might remove
the locking and probably get some speedup.

Note that some of the submodule functions will already be thread-safe
(or close to being thread-safe) with the internal object reading lock.
However, as some of them will require additional modifications to be
removed from the critical section, this will be done in its own patch.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 builtin/grep.c | 46 ++++++++++++++++------------------------------
 grep.c         | 39 +++++++++++++++++++--------------------
 grep.h         | 13 -------------
 3 files changed, 35 insertions(+), 63 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 91fc032a32..4a436d6c99 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -200,12 +200,12 @@ static void start_threads(struct grep_opt *opt)
 	int i;
 
 	pthread_mutex_init(&grep_mutex, NULL);
-	pthread_mutex_init(&grep_read_mutex, NULL);
 	pthread_mutex_init(&grep_attr_mutex, NULL);
 	pthread_cond_init(&cond_add, NULL);
 	pthread_cond_init(&cond_write, NULL);
 	pthread_cond_init(&cond_result, NULL);
 	grep_use_locks = 1;
+	enable_obj_read_lock();
 
 	for (i = 0; i < ARRAY_SIZE(todo); i++) {
 		strbuf_init(&todo[i].out, 0);
@@ -257,12 +257,12 @@ static int wait_all(void)
 	free(threads);
 
 	pthread_mutex_destroy(&grep_mutex);
-	pthread_mutex_destroy(&grep_read_mutex);
 	pthread_mutex_destroy(&grep_attr_mutex);
 	pthread_cond_destroy(&cond_add);
 	pthread_cond_destroy(&cond_write);
 	pthread_cond_destroy(&cond_result);
 	grep_use_locks = 0;
+	disable_obj_read_lock();
 
 	return hit;
 }
@@ -295,16 +295,6 @@ static int grep_cmd_config(const char *var, const char *value, void *cb)
 	return st;
 }
 
-static void *lock_and_read_oid_file(const struct object_id *oid, enum object_type *type, unsigned long *size)
-{
-	void *data;
-
-	grep_read_lock();
-	data = read_object_file(oid, type, size);
-	grep_read_unlock();
-	return data;
-}
-
 static int grep_oid(struct grep_opt *opt, const struct object_id *oid,
 		     const char *filename, int tree_name_len,
 		     const char *path)
@@ -413,20 +403,20 @@ static int grep_submodule(struct grep_opt *opt,
 
 	/*
 	 * NEEDSWORK: submodules functions need to be protected because they
-	 * access the object store via config_from_gitmodules(): the latter
-	 * uses get_oid() which, for now, relies on the global the_repository
-	 * object.
+	 * call config_from_gitmodules(): the latter contains in its call stack
+	 * many thread-unsafe operations that are racy with object reading, such
+	 * as parse_object() and is_promisor_object().
 	 */
-	grep_read_lock();
+	obj_read_lock();
 	sub = submodule_from_path(superproject, &null_oid, path);
 
 	if (!is_submodule_active(superproject, path)) {
-		grep_read_unlock();
+		obj_read_unlock();
 		return 0;
 	}
 
 	if (repo_submodule_init(&subrepo, superproject, sub)) {
-		grep_read_unlock();
+		obj_read_unlock();
 		return 0;
 	}
 
@@ -443,7 +433,7 @@ static int grep_submodule(struct grep_opt *opt,
 	 * object.
 	 */
 	add_to_alternates_memory(subrepo.objects->odb->path);
-	grep_read_unlock();
+	obj_read_unlock();
 
 	memcpy(&subopt, opt, sizeof(subopt));
 	subopt.repo = &subrepo;
@@ -455,13 +445,12 @@ static int grep_submodule(struct grep_opt *opt,
 		unsigned long size;
 		struct strbuf base = STRBUF_INIT;
 
-		grep_read_lock();
+		obj_read_lock();
 		object = parse_object_or_die(oid, oid_to_hex(oid));
+		obj_read_unlock();
 		data = read_object_with_reference(&subrepo,
 						  &object->oid, tree_type,
 						  &size, NULL);
-		grep_read_unlock();
-
 		if (!data)
 			die(_("unable to read tree (%s)"), oid_to_hex(&object->oid));
 
@@ -586,7 +575,7 @@ static int grep_tree(struct grep_opt *opt, const struct pathspec *pathspec,
 			void *data;
 			unsigned long size;
 
-			data = lock_and_read_oid_file(&entry.oid, &type, &size);
+			data = read_object_file(&entry.oid, &type, &size);
 			if (!data)
 				die(_("unable to read tree (%s)"),
 				    oid_to_hex(&entry.oid));
@@ -624,12 +613,9 @@ static int grep_object(struct grep_opt *opt, const struct pathspec *pathspec,
 		struct strbuf base;
 		int hit, len;
 
-		grep_read_lock();
 		data = read_object_with_reference(opt->repo,
 						  &obj->oid, tree_type,
 						  &size, NULL);
-		grep_read_unlock();
-
 		if (!data)
 			die(_("unable to read tree (%s)"), oid_to_hex(&obj->oid));
 
@@ -659,17 +645,17 @@ static int grep_objects(struct grep_opt *opt, const struct pathspec *pathspec,
 	for (i = 0; i < nr; i++) {
 		struct object *real_obj;
 
-		grep_read_lock();
+		obj_read_lock();
 		real_obj = deref_tag(opt->repo, list->objects[i].item,
 				     NULL, 0);
-		grep_read_unlock();
+		obj_read_unlock();
 
 		/* load the gitmodules file for this rev */
 		if (recurse_submodules) {
 			submodule_free(opt->repo);
-			grep_read_lock();
+			obj_read_lock();
 			gitmodules_config_oid(&real_obj->oid);
-			grep_read_unlock();
+			obj_read_unlock();
 		}
 		if (grep_object(opt, pathspec, real_obj, list->objects[i].name,
 				list->objects[i].path)) {
diff --git a/grep.c b/grep.c
index c028f70aba..13232a904a 100644
--- a/grep.c
+++ b/grep.c
@@ -1540,11 +1540,6 @@ static inline void grep_attr_unlock(void)
 		pthread_mutex_unlock(&grep_attr_mutex);
 }
 
-/*
- * Same as git_attr_mutex, but protecting the thread-unsafe object db access.
- */
-pthread_mutex_t grep_read_mutex;
-
 static int match_funcname(struct grep_opt *opt, struct grep_source *gs, char *bol, char *eol)
 {
 	xdemitconf_t *xecfg = opt->priv;
@@ -1741,13 +1736,20 @@ static int fill_textconv_grep(struct repository *r,
 	}
 
 	/*
-	 * fill_textconv is not remotely thread-safe; it may load objects
-	 * behind the scenes, and it modifies the global diff tempfile
-	 * structure.
+	 * fill_textconv is not remotely thread-safe; it modifies the global
+	 * diff tempfile structure, writes to the_repo's odb and might
+	 * internally call thread-unsafe functions such as the
+	 * prepare_packed_git() lazy-initializator. Because of the last two, we
+	 * must ensure mutual exclusion between this call and the object reading
+	 * API, thus we use obj_read_lock() here.
+	 *
+	 * TODO: allowing text conversion to run in parallel with object
+	 * reading operations might increase performance in the multithreaded
+	 * non-worktreee git-grep with --textconv.
 	 */
-	grep_read_lock();
+	obj_read_lock();
 	size = fill_textconv(r, driver, df, &buf);
-	grep_read_unlock();
+	obj_read_unlock();
 	free_filespec(df);
 
 	/*
@@ -1813,12 +1815,15 @@ static int grep_source_1(struct grep_opt *opt, struct grep_source *gs, int colle
 		grep_source_load_driver(gs, opt->repo->index);
 		/*
 		 * We might set up the shared textconv cache data here, which
-		 * is not thread-safe.
+		 * is not thread-safe. Also, get_oid_with_context() and
+		 * parse_object() might be internally called. As they are not
+		 * currenty thread-safe and might be racy with object reading,
+		 * obj_read_lock() must be called.
 		 */
 		grep_attr_lock();
-		grep_read_lock();
+		obj_read_lock();
 		textconv = userdiff_get_textconv(opt->repo, gs->driver);
-		grep_read_unlock();
+		obj_read_unlock();
 		grep_attr_unlock();
 	}
 
@@ -2118,10 +2123,7 @@ static int grep_source_load_oid(struct grep_source *gs)
 {
 	enum object_type type;
 
-	grep_read_lock();
 	gs->buf = read_object_file(gs->identifier, &type, &gs->size);
-	grep_read_unlock();
-
 	if (!gs->buf)
 		return error(_("'%s': unable to read %s"),
 			     gs->name,
@@ -2186,11 +2188,8 @@ void grep_source_load_driver(struct grep_source *gs,
 		return;
 
 	grep_attr_lock();
-	if (gs->path) {
-		grep_read_lock();
+	if (gs->path)
 		gs->driver = userdiff_find_by_path(istate, gs->path);
-		grep_read_unlock();
-	}
 	if (!gs->driver)
 		gs->driver = userdiff_find_by_name("default");
 	grep_attr_unlock();
diff --git a/grep.h b/grep.h
index 811fd274c9..9115db8515 100644
--- a/grep.h
+++ b/grep.h
@@ -220,18 +220,5 @@ int grep_threads_ok(const struct grep_opt *opt);
  */
 extern int grep_use_locks;
 extern pthread_mutex_t grep_attr_mutex;
-extern pthread_mutex_t grep_read_mutex;
-
-static inline void grep_read_lock(void)
-{
-	if (grep_use_locks)
-		pthread_mutex_lock(&grep_read_mutex);
-}
-
-static inline void grep_read_unlock(void)
-{
-	if (grep_use_locks)
-		pthread_mutex_unlock(&grep_read_mutex);
-}
 
 #endif
-- 
2.24.1


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH v3 07/12] submodule-config: add skip_if_read option to repo_read_gitmodules()
  2020-01-16  2:39   ` [PATCH v3 00/12] " Matheus Tavares
                       ` (5 preceding siblings ...)
  2020-01-16  2:39     ` [PATCH v3 06/12] grep: replace grep_read_mutex by internal obj read lock Matheus Tavares
@ 2020-01-16  2:39     ` Matheus Tavares
  2020-01-16  2:39     ` [PATCH v3 08/12] grep: allow submodule functions to run in parallel Matheus Tavares
                       ` (4 subsequent siblings)
  11 siblings, 0 replies; 21+ messages in thread
From: Matheus Tavares @ 2020-01-16  2:39 UTC (permalink / raw)
  To: git
  Cc: christian.couder, gitster, jrnieder, olyatelezhnaya, pclouds,
	jonathantanmy, peff, Brandon Williams, Stefan Beller

Currently, submodule-config.c doesn't have an externally accessible
function to read gitmodules only if it wasn't already read. But this
exact behavior is internally implemented by gitmodules_read_check(), to
perform a lazy load. Let's merge this function with
repo_read_gitmodules() adding a 'skip_if_read' which allows both
internal and external callers to access this functionality. This
simplifies a little the code. The added option will also be used in
the following patch.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 builtin/grep.c     |  2 +-
 submodule-config.c | 18 ++++++------------
 submodule-config.h |  2 +-
 unpack-trees.c     |  4 ++--
 4 files changed, 10 insertions(+), 16 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 4a436d6c99..d3ed05c1da 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -420,7 +420,7 @@ static int grep_submodule(struct grep_opt *opt,
 		return 0;
 	}
 
-	repo_read_gitmodules(&subrepo);
+	repo_read_gitmodules(&subrepo, 0);
 
 	/*
 	 * NEEDSWORK: This adds the submodule's object directory to the list of
diff --git a/submodule-config.c b/submodule-config.c
index 85064810b2..bd5e14ab20 100644
--- a/submodule-config.c
+++ b/submodule-config.c
@@ -674,10 +674,13 @@ static int gitmodules_cb(const char *var, const char *value, void *data)
 	return parse_config(var, value, &parameter);
 }
 
-void repo_read_gitmodules(struct repository *repo)
+void repo_read_gitmodules(struct repository *repo, int skip_if_read)
 {
 	submodule_cache_check_init(repo);
 
+	if (repo->submodule_cache->gitmodules_read && skip_if_read)
+		return;
+
 	if (repo_read_index(repo) < 0)
 		return;
 
@@ -703,20 +706,11 @@ void gitmodules_config_oid(const struct object_id *commit_oid)
 	the_repository->submodule_cache->gitmodules_read = 1;
 }
 
-static void gitmodules_read_check(struct repository *repo)
-{
-	submodule_cache_check_init(repo);
-
-	/* read the repo's .gitmodules file if it hasn't been already */
-	if (!repo->submodule_cache->gitmodules_read)
-		repo_read_gitmodules(repo);
-}
-
 const struct submodule *submodule_from_name(struct repository *r,
 					    const struct object_id *treeish_name,
 		const char *name)
 {
-	gitmodules_read_check(r);
+	repo_read_gitmodules(r, 1);
 	return config_from(r->submodule_cache, treeish_name, name, lookup_name);
 }
 
@@ -724,7 +718,7 @@ const struct submodule *submodule_from_path(struct repository *r,
 					    const struct object_id *treeish_name,
 		const char *path)
 {
-	gitmodules_read_check(r);
+	repo_read_gitmodules(r, 1);
 	return config_from(r->submodule_cache, treeish_name, path, lookup_path);
 }
 
diff --git a/submodule-config.h b/submodule-config.h
index 42918b55e8..c11e22cf50 100644
--- a/submodule-config.h
+++ b/submodule-config.h
@@ -61,7 +61,7 @@ int option_fetch_parse_recurse_submodules(const struct option *opt,
 					  const char *arg, int unset);
 int parse_update_recurse_submodules_arg(const char *opt, const char *arg);
 int parse_push_recurse_submodules_arg(const char *opt, const char *arg);
-void repo_read_gitmodules(struct repository *repo);
+void repo_read_gitmodules(struct repository *repo, int skip_if_read);
 void gitmodules_config_oid(const struct object_id *commit_oid);
 
 /**
diff --git a/unpack-trees.c b/unpack-trees.c
index 2399b6818b..f5a8051803 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -291,11 +291,11 @@ static void load_gitmodules_file(struct index_state *index,
 	if (pos >= 0) {
 		struct cache_entry *ce = index->cache[pos];
 		if (!state && ce->ce_flags & CE_WT_REMOVE) {
-			repo_read_gitmodules(the_repository);
+			repo_read_gitmodules(the_repository, 0);
 		} else if (state && (ce->ce_flags & CE_UPDATE)) {
 			submodule_free(the_repository);
 			checkout_entry(ce, state, NULL, NULL);
-			repo_read_gitmodules(the_repository);
+			repo_read_gitmodules(the_repository, 0);
 		}
 	}
 }
-- 
2.24.1


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH v3 08/12] grep: allow submodule functions to run in parallel
  2020-01-16  2:39   ` [PATCH v3 00/12] " Matheus Tavares
                       ` (6 preceding siblings ...)
  2020-01-16  2:39     ` [PATCH v3 07/12] submodule-config: add skip_if_read option to repo_read_gitmodules() Matheus Tavares
@ 2020-01-16  2:39     ` Matheus Tavares
  2020-01-29 11:26       ` SZEDER Gábor
  2020-01-16  2:39     ` [PATCH v3 09/12] grep: protect packed_git [re-]initialization Matheus Tavares
                       ` (3 subsequent siblings)
  11 siblings, 1 reply; 21+ messages in thread
From: Matheus Tavares @ 2020-01-16  2:39 UTC (permalink / raw)
  To: git
  Cc: christian.couder, gitster, jrnieder, olyatelezhnaya, pclouds,
	jonathantanmy, peff, Brandon Williams, Stefan Beller

Now that object reading operations are internally protected, the
submodule initialization functions at builtin/grep.c:grep_submodule()
are very close to being thread-safe. Let's take a look at each call and
remove from the critical section what we can, for better performance:

- submodule_from_path() and is_submodule_active() cannot be called in
  parallel yet only because they call repo_read_gitmodules() which
  contains, in its call stack, operations that would otherwise be in
  race condition with object reading (for example parse_object() and
  is_promisor_remote()). However, they only call repo_read_gitmodules()
  if it wasn't read before. So let's pre-read it before firing the
  threads and allow these two functions to safely be called in
  parallel.

- repo_submodule_init() is already thread-safe, so remove it from the
  critical section without other necessary changes.

- The repo_read_gitmodules(&subrepo) call at grep_submodule() is safe as
  no other thread is performing object reading operations in the subrepo
  yet. However, threads might be working in the superproject, and this
  function calls add_to_alternates_memory() internally, which is racy
  with object readings in the superproject. So it must be kept
  protected for now. Let's add a "NEEDSWORK" to it, informing why it
  cannot be removed from the critical section yet.

- Finally, add_to_alternates_memory() must be kept protected for the
  same reason as the item above.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 builtin/grep.c | 38 ++++++++++++++++++++++----------------
 1 file changed, 22 insertions(+), 16 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index d3ed05c1da..ac3d86c2e5 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -401,25 +401,23 @@ static int grep_submodule(struct grep_opt *opt,
 	struct grep_opt subopt;
 	int hit;
 
-	/*
-	 * NEEDSWORK: submodules functions need to be protected because they
-	 * call config_from_gitmodules(): the latter contains in its call stack
-	 * many thread-unsafe operations that are racy with object reading, such
-	 * as parse_object() and is_promisor_object().
-	 */
-	obj_read_lock();
 	sub = submodule_from_path(superproject, &null_oid, path);
 
-	if (!is_submodule_active(superproject, path)) {
-		obj_read_unlock();
+	if (!is_submodule_active(superproject, path))
 		return 0;
-	}
 
-	if (repo_submodule_init(&subrepo, superproject, sub)) {
-		obj_read_unlock();
+	if (repo_submodule_init(&subrepo, superproject, sub))
 		return 0;
-	}
 
+	/*
+	 * NEEDSWORK: repo_read_gitmodules() might call
+	 * add_to_alternates_memory() via config_from_gitmodules(). This
+	 * operation causes a race condition with concurrent object readings
+	 * performed by the worker threads. That's why we need obj_read_lock()
+	 * here. It should be removed once it's no longer necessary to add the
+	 * subrepo's odbs to the in-memory alternates list.
+	 */
+	obj_read_lock();
 	repo_read_gitmodules(&subrepo, 0);
 
 	/*
@@ -1052,6 +1050,9 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 	pathspec.recursive = 1;
 	pathspec.recurse_submodules = !!recurse_submodules;
 
+	if (recurse_submodules && (!use_index || untracked))
+		die(_("option not supported with --recurse-submodules"));
+
 	if (list.nr || cached || show_in_pager) {
 		if (num_threads > 1)
 			warning(_("invalid option combination, ignoring --threads"));
@@ -1071,6 +1072,14 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 		    && (opt.pre_context || opt.post_context ||
 			opt.file_break || opt.funcbody))
 			skip_first_line = 1;
+
+		/*
+		 * Pre-read gitmodules (if not read already) to prevent racy
+		 * lazy reading in worker threads.
+		 */
+		if (recurse_submodules)
+			repo_read_gitmodules(the_repository, 1);
+
 		start_threads(&opt);
 	} else {
 		/*
@@ -1105,9 +1114,6 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 		}
 	}
 
-	if (recurse_submodules && (!use_index || untracked))
-		die(_("option not supported with --recurse-submodules"));
-
 	if (!show_in_pager && !opt.status_only)
 		setup_pager();
 
-- 
2.24.1


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH v3 09/12] grep: protect packed_git [re-]initialization
  2020-01-16  2:39   ` [PATCH v3 00/12] " Matheus Tavares
                       ` (7 preceding siblings ...)
  2020-01-16  2:39     ` [PATCH v3 08/12] grep: allow submodule functions to run in parallel Matheus Tavares
@ 2020-01-16  2:39     ` Matheus Tavares
  2020-01-16  2:39     ` [PATCH v3 10/12] grep: re-enable threads in non-worktree case Matheus Tavares
                       ` (2 subsequent siblings)
  11 siblings, 0 replies; 21+ messages in thread
From: Matheus Tavares @ 2020-01-16  2:39 UTC (permalink / raw)
  To: git
  Cc: christian.couder, gitster, jrnieder, olyatelezhnaya, pclouds,
	jonathantanmy, peff, René Scharfe, Stefan Beller,
	Brandon Williams

Some fields in struct raw_object_store are lazy initialized by the
thread-unsafe packfile.c:prepare_packed_git(). Although this function is
present in the call stack of git-grep threads, all paths to it are
currently protected by obj_read_lock() (and the main thread usually
indirectly calls it before firing the worker threads, anyway). However,
it's possible that future modifications add new unprotected paths to it,
introducing a race condition. Because errors derived from it wouldn't
happen often, it could be hard to detect. So to prevent future
headaches, let's force eager initialization of packed_git when setting
git-grep up. There'll be a small overhead in the cases where we didn't
really need to prepare packed_git during execution but this shouldn't be
very noticeable.

Also, packed_git may be re-initialized by
packfile.c:reprepare_packed_git(). Again, all paths to it in git-grep
are already protected by obj_read_lock() but it may suffer from the same
problem in the future. So let's also internally protect it with
obj_read_lock() (which is a recursive mutex).

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 builtin/grep.c | 8 ++++++--
 packfile.c     | 2 ++
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index ac3d86c2e5..1535fd50f8 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -24,6 +24,7 @@
 #include "submodule.h"
 #include "submodule-config.h"
 #include "object-store.h"
+#include "packfile.h"
 
 static char const * const grep_usage[] = {
 	N_("git grep [<options>] [-e] <pattern> [<rev>...] [[--] <path>...]"),
@@ -1074,11 +1075,14 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 			skip_first_line = 1;
 
 		/*
-		 * Pre-read gitmodules (if not read already) to prevent racy
-		 * lazy reading in worker threads.
+		 * Pre-read gitmodules (if not read already) and force eager
+		 * initialization of packed_git to prevent racy lazy
+		 * reading/initialization once worker threads are started.
 		 */
 		if (recurse_submodules)
 			repo_read_gitmodules(the_repository, 1);
+		if (startup_info->have_repository)
+			(void)get_packed_git(the_repository);
 
 		start_threads(&opt);
 	} else {
diff --git a/packfile.c b/packfile.c
index 24a73fc33a..946ca83e7a 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1004,12 +1004,14 @@ void reprepare_packed_git(struct repository *r)
 {
 	struct object_directory *odb;
 
+	obj_read_lock();
 	for (odb = r->objects->odb; odb; odb = odb->next)
 		odb_clear_loose_cache(odb);
 
 	r->objects->approximate_object_count_valid = 0;
 	r->objects->packed_git_initialized = 0;
 	prepare_packed_git(r);
+	obj_read_unlock();
 }
 
 struct packed_git *get_packed_git(struct repository *r)
-- 
2.24.1


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH v3 10/12] grep: re-enable threads in non-worktree case
  2020-01-16  2:39   ` [PATCH v3 00/12] " Matheus Tavares
                       ` (8 preceding siblings ...)
  2020-01-16  2:39     ` [PATCH v3 09/12] grep: protect packed_git [re-]initialization Matheus Tavares
@ 2020-01-16  2:39     ` Matheus Tavares
  2020-01-16  2:39     ` [PATCH v3 11/12] grep: move driver pre-load out of critical section Matheus Tavares
  2020-01-16  2:40     ` [PATCH v3 12/12] grep: use no. of cores as the default no. of threads Matheus Tavares
  11 siblings, 0 replies; 21+ messages in thread
From: Matheus Tavares @ 2020-01-16  2:39 UTC (permalink / raw)
  To: git
  Cc: christian.couder, gitster, jrnieder, olyatelezhnaya, pclouds,
	jonathantanmy, peff, Brandon Williams, Eric Wong,
	Ramkumar Ramachandra, Manav Rathi

They were disabled at 53b8d93 ("grep: disable threading in non-worktree
case", 12-12-2011), due to observable performance drops (to the point
that using a single thread would be faster than multiple threads). But
now that zlib inflation can be performed in parallel we can regain the
speedup, so let's re-enable threads in non-worktree grep.

Grepping 'abcd[02]' ("Regex 1") and '(static|extern) (int|double) \*'
("Regex 2") at chromium's repository[1] I got:

 Threads |   Regex 1  |  Regex 2
---------|------------|-----------
    1    |  17.2920s  |  20.9624s
    2    |   9.6512s  |  11.3184s
    4    |   6.7723s  |   7.6268s
    8**  |   6.2886s  |   6.9843s

These are all means of 30 executions after 2 warmup runs. All tests were
executed on an i7-7700HQ (quad-core w/ hyper-threading), 16GB of RAM and
SSD, running Manjaro Linux. But to make sure the optimization also
performs well on HDD, the tests were repeated on another machine with an
i5-4210U (dual-core w/ hyper-threading), 8GB of RAM and HDD (SATA III,
5400 rpm), also running Manjaro Linux:

 Threads |   Regex 1  |  Regex 2
---------|------------|-----------
    1    |  18.4035s  |  22.5368s
    2    |  12.5063s  |  14.6409s
    4**  |  10.9136s  |  12.7106s

** Note that in these cases we relied on hyper-threading, and that's
   probably why we don't see a big difference in time.

Unfortunately, multithreaded git-grep might be slow in the non-worktree
case when --textconv is used and there're too many text conversions.
Probably the reason for this is that the object read lock is used to
protect fill_textconv() and therefore there is a mutual exclusion
between textconv execution and object reading. Because both are
time-consuming operations, not being able to perform them in parallel
can cause performance drops. To inform the users about this (and other
threading details), let's also add a "NOTES ON THREADS" section to
Documentation/git-grep.txt.

[1]: chromium’s repo at commit 03ae96f (“Add filters testing at DSF=2”,
     04-06-2019), after a 'git gc' execution.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 Documentation/git-grep.txt | 11 +++++++++++
 builtin/grep.c             |  2 +-
 2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt
index c89fb569e3..de628741fa 100644
--- a/Documentation/git-grep.txt
+++ b/Documentation/git-grep.txt
@@ -347,6 +347,17 @@ EXAMPLES
 `git grep solution -- :^Documentation`::
 	Looks for `solution`, excluding files in `Documentation`.
 
+NOTES ON THREADS
+----------------
+
+The `--threads` option (and the grep.threads configuration) will be ignored when
+`--open-files-in-pager` is used, forcing a single-threaded execution.
+
+When grepping the object store (with `--cached` or giving tree objects), running
+with multiple threads might perform slower than single threaded if `--textconv`
+is given and there're too many text conversions. So if you experience low
+performance in this case, it might be desirable to use `--threads=1`.
+
 GIT
 ---
 Part of the linkgit:git[1] suite
diff --git a/builtin/grep.c b/builtin/grep.c
index 1535fd50f8..6aaa8d4406 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -1054,7 +1054,7 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 	if (recurse_submodules && (!use_index || untracked))
 		die(_("option not supported with --recurse-submodules"));
 
-	if (list.nr || cached || show_in_pager) {
+	if (show_in_pager) {
 		if (num_threads > 1)
 			warning(_("invalid option combination, ignoring --threads"));
 		num_threads = 1;
-- 
2.24.1


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH v3 11/12] grep: move driver pre-load out of critical section
  2020-01-16  2:39   ` [PATCH v3 00/12] " Matheus Tavares
                       ` (9 preceding siblings ...)
  2020-01-16  2:39     ` [PATCH v3 10/12] grep: re-enable threads in non-worktree case Matheus Tavares
@ 2020-01-16  2:39     ` Matheus Tavares
  2020-01-16  2:40     ` [PATCH v3 12/12] grep: use no. of cores as the default no. of threads Matheus Tavares
  11 siblings, 0 replies; 21+ messages in thread
From: Matheus Tavares @ 2020-01-16  2:39 UTC (permalink / raw)
  To: git
  Cc: christian.couder, gitster, jrnieder, olyatelezhnaya, pclouds,
	jonathantanmy, peff, Alex Henrie, Rasmus Villemoes, Matthieu Moy

In builtin/grep.c:add_work() we pre-load the userdiff drivers before
adding the grep_source in the todo list. This operation is currently
being performed after acquiring the grep_mutex, but as it's already
thread-safe, we don't need to protect it here. So let's move it out of
the critical section which should avoid thread contention and improve
performance.

Running[1] `git grep --threads=8 abcd[02] HEAD` on chromium's
repository[2], I got the following mean times for 30 executions after 2
warmups:

        Original         |  6.2886s
-------------------------|-----------
 Out of critical section |  5.7852s

[1]: Tests performed on an i7-7700HQ with 16GB of RAM and SSD, running
     Manjaro Linux.
[2]: chromium’s repo at commit 03ae96f (“Add filters testing at DSF=2”,
         04-06-2019), after a 'git gc' execution.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 builtin/grep.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 6aaa8d4406..a85b710b48 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -92,8 +92,11 @@ static pthread_cond_t cond_result;
 
 static int skip_first_line;
 
-static void add_work(struct grep_opt *opt, const struct grep_source *gs)
+static void add_work(struct grep_opt *opt, struct grep_source *gs)
 {
+	if (opt->binary != GREP_BINARY_TEXT)
+		grep_source_load_driver(gs, opt->repo->index);
+
 	grep_lock();
 
 	while ((todo_end+1) % ARRAY_SIZE(todo) == todo_done) {
@@ -101,9 +104,6 @@ static void add_work(struct grep_opt *opt, const struct grep_source *gs)
 	}
 
 	todo[todo_end].source = *gs;
-	if (opt->binary != GREP_BINARY_TEXT)
-		grep_source_load_driver(&todo[todo_end].source,
-					opt->repo->index);
 	todo[todo_end].done = 0;
 	strbuf_reset(&todo[todo_end].out);
 	todo_end = (todo_end + 1) % ARRAY_SIZE(todo);
-- 
2.24.1


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH v3 12/12] grep: use no. of cores as the default no. of threads
  2020-01-16  2:39   ` [PATCH v3 00/12] " Matheus Tavares
                       ` (10 preceding siblings ...)
  2020-01-16  2:39     ` [PATCH v3 11/12] grep: move driver pre-load out of critical section Matheus Tavares
@ 2020-01-16  2:40     ` Matheus Tavares
  2020-01-16 13:11       ` Victor Leschuk
  11 siblings, 1 reply; 21+ messages in thread
From: Matheus Tavares @ 2020-01-16  2:40 UTC (permalink / raw)
  To: git
  Cc: christian.couder, gitster, jrnieder, olyatelezhnaya, pclouds,
	jonathantanmy, peff, Brandon Williams, Alex Henrie,
	Victor Leschuk, Ramkumar Ramachandra, Victor Leschuk, Eric Wong,
	Matthieu Moy

When --threads is not specified, git-grep will use 8 threads by default.
This fixed number may be too many for machines with fewer cores and too
little for machines with more cores. So, instead, use the number of
logical cores available in the machine, which seems to result in the
best overall performance: The following measurements correspond to the
mean elapsed times for 30 git-grep executions in chromium's
repository[1] with a 95% confidence interval (each set of 30 were
performed after 2 warmup runs). Regex 1 is 'abcd[02]' and Regex 2 is
'(static|extern) (int|double) \*'.

      |          Working tree         |           Object Store
------|-------------------------------|--------------------------------
 #ths |  Regex 1      |  Regex 2      |   Regex 1      |   Regex 2
------|---------------|---------------|----------------|---------------
  32  |  2.92s ± 0.01 |  3.72s ± 0.21 |   5.36s ± 0.01 |   6.07s ± 0.01
  16  |  2.84s ± 0.01 |  3.57s ± 0.21 |   5.05s ± 0.01 |   5.71s ± 0.01
>  8  |  2.53s ± 0.00 |  3.24s ± 0.21 |   4.86s ± 0.01 |   5.48s ± 0.01
   4  |  2.43s ± 0.02 |  3.22s ± 0.20 |   5.22s ± 0.02 |   6.03s ± 0.02
   2  |  3.06s ± 0.20 |  4.52s ± 0.01 |   7.52s ± 0.01 |   9.06s ± 0.01
   1  |  6.16s ± 0.01 |  9.25s ± 0.02 |  14.10s ± 0.01 |  17.22s ± 0.01

The above tests were performed in a desktop running Debian 10.0 with
Intel(R) Xeon(R) CPU E3-1230 V2 (4 cores w/ hyper-threading), 32GB of
RAM and a 7200 rpm, SATA 3.1 HDD.

Bellow, the tests were repeated for a machine with SSD: a Manjaro laptop
with Intel(R) i7-7700HQ (4 cores w/ hyper-threading) and 16GB of RAM:

      |          Working tree          |           Object Store
------|--------------------------------|--------------------------------
 #ths |  Regex 1      |  Regex 2       |   Regex 1      |   Regex 2
------|---------------|----------------|----------------|---------------
  32  |  3.29s ± 0.21 |   4.30s ± 0.01 |   6.30s ± 0.01 |   7.30s ± 0.02
  16  |  3.19s ± 0.20 |   4.14s ± 0.02 |   5.91s ± 0.01 |   6.83s ± 0.01
>  8  |  2.90s ± 0.04 |   3.82s ± 0.20 |   5.70s ± 0.02 |   6.53s ± 0.01
   4  |  2.84s ± 0.02 |   3.77s ± 0.20 |   6.19s ± 0.02 |   7.18s ± 0.02
   2  |  3.73s ± 0.21 |   5.57s ± 0.02 |   9.28s ± 0.01 |  11.22s ± 0.01
   1  |  7.48s ± 0.02 |  11.36s ± 0.03 |  17.75s ± 0.01 |  21.87s ± 0.08

[1]: chromium’s repo at commit 03ae96f (“Add filters testing at DSF=2”,
     04-06-2019), after a 'git gc' execution.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
---
 Documentation/git-grep.txt | 4 ++--
 builtin/grep.c             | 3 +--
 2 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt
index de628741fa..eb5412724f 100644
--- a/Documentation/git-grep.txt
+++ b/Documentation/git-grep.txt
@@ -59,8 +59,8 @@ grep.extendedRegexp::
 	other than 'default'.
 
 grep.threads::
-	Number of grep worker threads to use.  If unset (or set to 0),
-	8 threads are used by default (for now).
+	Number of grep worker threads to use. If unset (or set to 0), Git will
+	use as many threads as the number of logical cores available.
 
 grep.fullName::
 	If set to true, enable `--full-name` option by default.
diff --git a/builtin/grep.c b/builtin/grep.c
index a85b710b48..629eaf5dbc 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -33,7 +33,6 @@ static char const * const grep_usage[] = {
 
 static int recurse_submodules;
 
-#define GREP_NUM_THREADS_DEFAULT 8
 static int num_threads;
 
 static pthread_t *threads;
@@ -1064,7 +1063,7 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 	} else if (num_threads < 0)
 		die(_("invalid number of threads specified (%d)"), num_threads);
 	else if (num_threads == 0)
-		num_threads = HAVE_THREADS ? GREP_NUM_THREADS_DEFAULT : 1;
+		num_threads = HAVE_THREADS ? online_cpus() : 1;
 
 	if (num_threads > 1) {
 		if (!HAVE_THREADS)
-- 
2.24.1


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v3 12/12] grep: use no. of cores as the default no. of threads
  2020-01-16  2:40     ` [PATCH v3 12/12] grep: use no. of cores as the default no. of threads Matheus Tavares
@ 2020-01-16 13:11       ` Victor Leschuk
  2020-01-16 14:47         ` [PATCH] " Matheus Tavares
  0 siblings, 1 reply; 21+ messages in thread
From: Victor Leschuk @ 2020-01-16 13:11 UTC (permalink / raw)
  To: Matheus Tavares
  Cc: Git List, christian.couder, Junio C Hamano, jrnieder,
	olyatelezhnaya, Nguyễn Thái Ngọc Duy,
	Jonathan Tan, Jeff King, Brandon Williams, Alex Henrie,
	Ramkumar Ramachandra, Victor Leschuk, Eric Wong, Matthieu Moy

Grepping bottleneck is not cpu, but IO. Maybe it is more reasonable to
use not online_cpus() but online_cpus()*2?

--
Victor


On Thu, Jan 16, 2020 at 5:41 AM Matheus Tavares
<matheus.bernardino@usp.br> wrote:
>
> When --threads is not specified, git-grep will use 8 threads by default.
> This fixed number may be too many for machines with fewer cores and too
> little for machines with more cores. So, instead, use the number of
> logical cores available in the machine, which seems to result in the
> best overall performance: The following measurements correspond to the
> mean elapsed times for 30 git-grep executions in chromium's
> repository[1] with a 95% confidence interval (each set of 30 were
> performed after 2 warmup runs). Regex 1 is 'abcd[02]' and Regex 2 is
> '(static|extern) (int|double) \*'.
>
>       |          Working tree         |           Object Store
> ------|-------------------------------|--------------------------------
>  #ths |  Regex 1      |  Regex 2      |   Regex 1      |   Regex 2
> ------|---------------|---------------|----------------|---------------
>   32  |  2.92s ± 0.01 |  3.72s ± 0.21 |   5.36s ± 0.01 |   6.07s ± 0.01
>   16  |  2.84s ± 0.01 |  3.57s ± 0.21 |   5.05s ± 0.01 |   5.71s ± 0.01
> >  8  |  2.53s ± 0.00 |  3.24s ± 0.21 |   4.86s ± 0.01 |   5.48s ± 0.01
>    4  |  2.43s ± 0.02 |  3.22s ± 0.20 |   5.22s ± 0.02 |   6.03s ± 0.02
>    2  |  3.06s ± 0.20 |  4.52s ± 0.01 |   7.52s ± 0.01 |   9.06s ± 0.01
>    1  |  6.16s ± 0.01 |  9.25s ± 0.02 |  14.10s ± 0.01 |  17.22s ± 0.01
>
> The above tests were performed in a desktop running Debian 10.0 with
> Intel(R) Xeon(R) CPU E3-1230 V2 (4 cores w/ hyper-threading), 32GB of
> RAM and a 7200 rpm, SATA 3.1 HDD.
>
> Bellow, the tests were repeated for a machine with SSD: a Manjaro laptop
> with Intel(R) i7-7700HQ (4 cores w/ hyper-threading) and 16GB of RAM:
>
>       |          Working tree          |           Object Store
> ------|--------------------------------|--------------------------------
>  #ths |  Regex 1      |  Regex 2       |   Regex 1      |   Regex 2
> ------|---------------|----------------|----------------|---------------
>   32  |  3.29s ± 0.21 |   4.30s ± 0.01 |   6.30s ± 0.01 |   7.30s ± 0.02
>   16  |  3.19s ± 0.20 |   4.14s ± 0.02 |   5.91s ± 0.01 |   6.83s ± 0.01
> >  8  |  2.90s ± 0.04 |   3.82s ± 0.20 |   5.70s ± 0.02 |   6.53s ± 0.01
>    4  |  2.84s ± 0.02 |   3.77s ± 0.20 |   6.19s ± 0.02 |   7.18s ± 0.02
>    2  |  3.73s ± 0.21 |   5.57s ± 0.02 |   9.28s ± 0.01 |  11.22s ± 0.01
>    1  |  7.48s ± 0.02 |  11.36s ± 0.03 |  17.75s ± 0.01 |  21.87s ± 0.08
>
> [1]: chromium’s repo at commit 03ae96f (“Add filters testing at DSF=2”,
>      04-06-2019), after a 'git gc' execution.
>
> Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
> ---
>  Documentation/git-grep.txt | 4 ++--
>  builtin/grep.c             | 3 +--
>  2 files changed, 3 insertions(+), 4 deletions(-)
>
> diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt
> index de628741fa..eb5412724f 100644
> --- a/Documentation/git-grep.txt
> +++ b/Documentation/git-grep.txt
> @@ -59,8 +59,8 @@ grep.extendedRegexp::
>         other than 'default'.
>
>  grep.threads::
> -       Number of grep worker threads to use.  If unset (or set to 0),
> -       8 threads are used by default (for now).
> +       Number of grep worker threads to use. If unset (or set to 0), Git will
> +       use as many threads as the number of logical cores available.
>
>  grep.fullName::
>         If set to true, enable `--full-name` option by default.
> diff --git a/builtin/grep.c b/builtin/grep.c
> index a85b710b48..629eaf5dbc 100644
> --- a/builtin/grep.c
> +++ b/builtin/grep.c
> @@ -33,7 +33,6 @@ static char const * const grep_usage[] = {
>
>  static int recurse_submodules;
>
> -#define GREP_NUM_THREADS_DEFAULT 8
>  static int num_threads;
>
>  static pthread_t *threads;
> @@ -1064,7 +1063,7 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
>         } else if (num_threads < 0)
>                 die(_("invalid number of threads specified (%d)"), num_threads);
>         else if (num_threads == 0)
> -               num_threads = HAVE_THREADS ? GREP_NUM_THREADS_DEFAULT : 1;
> +               num_threads = HAVE_THREADS ? online_cpus() : 1;
>
>         if (num_threads > 1) {
>                 if (!HAVE_THREADS)
> --
> 2.24.1
>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] grep: use no. of cores as the default no. of threads
  2020-01-16 13:11       ` Victor Leschuk
@ 2020-01-16 14:47         ` " Matheus Tavares
  0 siblings, 0 replies; 21+ messages in thread
From: Matheus Tavares @ 2020-01-16 14:47 UTC (permalink / raw)
  To: vleschuk
  Cc: alexhenrie24, artagnon, bwilliams.eng, christian.couder, e, git,
	git, gitster, jonathantanmy, jrnieder, matheus.bernardino,
	olyatelezhnaya, pclouds, peff, vleschuk

Hi, Victor

On Thu, Jan 16, 2020 at 10:11 AM Victor Leschuk <vleschuk@gmail.com> wrote:
>
> Grepping bottleneck is not cpu, but IO. Maybe it is more reasonable to
> use not online_cpus() but online_cpus()*2?

I also tried this approach, but the tests I ran with online_cpus() * 2
only showed slowdowns. The results can be seen in the commit message:

> On Thu, Jan 16, 2020 at 5:41 AM Matheus Tavares
> <matheus.bernardino@usp.br> wrote:
> >
[...]
> > The following measurements correspond to the
> > mean elapsed times for 30 git-grep executions in chromium's
> > repository[1] with a 95% confidence interval (each set of 30 were
> > performed after 2 warmup runs). Regex 1 is 'abcd[02]' and Regex 2 is
> > '(static|extern) (int|double) \*'.
> >
> >       |          Working tree         |           Object Store
> > ------|-------------------------------|--------------------------------
> >  #ths |  Regex 1      |  Regex 2      |   Regex 1      |   Regex 2
> > ------|---------------|---------------|----------------|---------------
> >   32  |  2.92s ± 0.01 |  3.72s ± 0.21 |   5.36s ± 0.01 |   6.07s ± 0.01
> >   16  |  2.84s ± 0.01 |  3.57s ± 0.21 |   5.05s ± 0.01 |   5.71s ± 0.01
> > >  8  |  2.53s ± 0.00 |  3.24s ± 0.21 |   4.86s ± 0.01 |   5.48s ± 0.01
> >    4  |  2.43s ± 0.02 |  3.22s ± 0.20 |   5.22s ± 0.02 |   6.03s ± 0.02
> >    2  |  3.06s ± 0.20 |  4.52s ± 0.01 |   7.52s ± 0.01 |   9.06s ± 0.01
> >    1  |  6.16s ± 0.01 |  9.25s ± 0.02 |  14.10s ± 0.01 |  17.22s ± 0.01
> >
> > The above tests were performed in a desktop running Debian 10.0 with
> > Intel(R) Xeon(R) CPU E3-1230 V2 (4 cores w/ hyper-threading), 32GB of
> > RAM and a 7200 rpm, SATA 3.1 HDD.
> >
> > Bellow, the tests were repeated for a machine with SSD: a Manjaro laptop
> > with Intel(R) i7-7700HQ (4 cores w/ hyper-threading) and 16GB of RAM:
> >
> >       |          Working tree          |           Object Store
> > ------|--------------------------------|--------------------------------
> >  #ths |  Regex 1      |  Regex 2       |   Regex 1      |   Regex 2
> > ------|---------------|----------------|----------------|---------------
> >   32  |  3.29s ± 0.21 |   4.30s ± 0.01 |   6.30s ± 0.01 |   7.30s ± 0.02
> >   16  |  3.19s ± 0.20 |   4.14s ± 0.02 |   5.91s ± 0.01 |   6.83s ± 0.01
> > >  8  |  2.90s ± 0.04 |   3.82s ± 0.20 |   5.70s ± 0.02 |   6.53s ± 0.01
> >    4  |  2.84s ± 0.02 |   3.77s ± 0.20 |   6.19s ± 0.02 |   7.18s ± 0.02
> >    2  |  3.73s ± 0.21 |   5.57s ± 0.02 |   9.28s ± 0.01 |  11.22s ± 0.01
> >    1  |  7.48s ± 0.02 |  11.36s ± 0.03 |  17.75s ± 0.01 |  21.87s ± 0.08

I deliberately used somehow complex regexes for these tests. So I
decided to do one more test with a very simple fixed string ("abc"),
allowing git-grep to spend less time in the cpu-bound regex searching.
The results can be seen bellow (the metodology is the same as described
above and the machine is the Manjaro laptop, for which online_cpus()
returns 8):

  #ths |  Working Three |  Object Store
 ------|----------------|---------------
    16 |  3.22s ± 0.20  |  5.96s ± 0.06
     8 |  2.92s ± 0.01  |  5.73s ± 0.02


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v3 08/12] grep: allow submodule functions to run in parallel
  2020-01-16  2:39     ` [PATCH v3 08/12] grep: allow submodule functions to run in parallel Matheus Tavares
@ 2020-01-29 11:26       ` SZEDER Gábor
  2020-01-29 18:49         ` Junio C Hamano
  2020-01-29 18:57         ` Junio C Hamano
  0 siblings, 2 replies; 21+ messages in thread
From: SZEDER Gábor @ 2020-01-29 11:26 UTC (permalink / raw)
  To: Matheus Tavares, gitster, Philippe Blain
  Cc: git, christian.couder, gitster, jrnieder, olyatelezhnaya,
	pclouds, jonathantanmy, peff, Brandon Williams, Stefan Beller

Junio, Matheus, Philippe,

this patch below and a7f3240877 (grep: ignore --recurse-submodules if
--no-index is given, 2020-01-26) on topic
'pb/do-not-recurse-grep-no-index' don't work well together, and cause
failure of the test 'grep --recurse-submodules --no-index ignores
--recurse-submodules' in 't7814-grep-recurse-submodules.sh', i.e. in
the new test added in a7f3240877.

More below.

On Wed, Jan 15, 2020 at 11:39:56PM -0300, Matheus Tavares wrote:
> Now that object reading operations are internally protected, the
> submodule initialization functions at builtin/grep.c:grep_submodule()
> are very close to being thread-safe. Let's take a look at each call and
> remove from the critical section what we can, for better performance:
> 
> - submodule_from_path() and is_submodule_active() cannot be called in
>   parallel yet only because they call repo_read_gitmodules() which
>   contains, in its call stack, operations that would otherwise be in
>   race condition with object reading (for example parse_object() and
>   is_promisor_remote()). However, they only call repo_read_gitmodules()
>   if it wasn't read before. So let's pre-read it before firing the
>   threads and allow these two functions to safely be called in
>   parallel.
> 
> - repo_submodule_init() is already thread-safe, so remove it from the
>   critical section without other necessary changes.
> 
> - The repo_read_gitmodules(&subrepo) call at grep_submodule() is safe as
>   no other thread is performing object reading operations in the subrepo
>   yet. However, threads might be working in the superproject, and this
>   function calls add_to_alternates_memory() internally, which is racy
>   with object readings in the superproject. So it must be kept
>   protected for now. Let's add a "NEEDSWORK" to it, informing why it
>   cannot be removed from the critical section yet.
> 
> - Finally, add_to_alternates_memory() must be kept protected for the
>   same reason as the item above.
> 
> Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
> ---
>  builtin/grep.c | 38 ++++++++++++++++++++++----------------
>  1 file changed, 22 insertions(+), 16 deletions(-)
> 
> diff --git a/builtin/grep.c b/builtin/grep.c
> index d3ed05c1da..ac3d86c2e5 100644
> --- a/builtin/grep.c
> +++ b/builtin/grep.c
> @@ -401,25 +401,23 @@ static int grep_submodule(struct grep_opt *opt,
>  	struct grep_opt subopt;
>  	int hit;
>  
> -	/*
> -	 * NEEDSWORK: submodules functions need to be protected because they
> -	 * call config_from_gitmodules(): the latter contains in its call stack
> -	 * many thread-unsafe operations that are racy with object reading, such
> -	 * as parse_object() and is_promisor_object().
> -	 */
> -	obj_read_lock();
>  	sub = submodule_from_path(superproject, &null_oid, path);
>  
> -	if (!is_submodule_active(superproject, path)) {
> -		obj_read_unlock();
> +	if (!is_submodule_active(superproject, path))
>  		return 0;
> -	}
>  
> -	if (repo_submodule_init(&subrepo, superproject, sub)) {
> -		obj_read_unlock();
> +	if (repo_submodule_init(&subrepo, superproject, sub))
>  		return 0;
> -	}
>  
> +	/*
> +	 * NEEDSWORK: repo_read_gitmodules() might call
> +	 * add_to_alternates_memory() via config_from_gitmodules(). This
> +	 * operation causes a race condition with concurrent object readings
> +	 * performed by the worker threads. That's why we need obj_read_lock()
> +	 * here. It should be removed once it's no longer necessary to add the
> +	 * subrepo's odbs to the in-memory alternates list.
> +	 */
> +	obj_read_lock();
>  	repo_read_gitmodules(&subrepo, 0);
>  
>  	/*
> @@ -1052,6 +1050,9 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
>  	pathspec.recursive = 1;
>  	pathspec.recurse_submodules = !!recurse_submodules;
>  
> +	if (recurse_submodules && (!use_index || untracked))
> +		die(_("option not supported with --recurse-submodules"));

So this patch moves this condition here, expecting git to die with 
'--recurse-submodules --no-index'.  However, a7f3240877 removes the
'!use_index' part of the condition, so we won't die here ...

>  	if (list.nr || cached || show_in_pager) {
>  		if (num_threads > 1)
>  			warning(_("invalid option combination, ignoring --threads"));
> @@ -1071,6 +1072,14 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
>  		    && (opt.pre_context || opt.post_context ||
>  			opt.file_break || opt.funcbody))
>  			skip_first_line = 1;
> +
> +		/*
> +		 * Pre-read gitmodules (if not read already) to prevent racy
> +		 * lazy reading in worker threads.
> +		 */
> +		if (recurse_submodules)
> +			repo_read_gitmodules(the_repository, 1);

... and eventually reach this condition, which then reads the
submodules even with '--no-index', which is just what a7f3240877 tried
to avoid, thus triggering the test failure.

It might be that all we need is changing this condition to:

  if (recurse_submodules && use_index)

Or maybe not, but this change on top of 'pu' makes t7814 succeed
again.

However, I'm not familiar with the intricacies of either threaded grep
or submodules, much less the combination of the two...  so just an
idea.

>  		start_threads(&opt);
>  	} else {
>  		/*
> @@ -1105,9 +1114,6 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
>  		}
>  	}
>  
> -	if (recurse_submodules && (!use_index || untracked))
> -		die(_("option not supported with --recurse-submodules"));
> -
>  	if (!show_in_pager && !opt.status_only)
>  		setup_pager();
>  
> -- 
> 2.24.1
> 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v3 08/12] grep: allow submodule functions to run in parallel
  2020-01-29 11:26       ` SZEDER Gábor
@ 2020-01-29 18:49         ` Junio C Hamano
  2020-01-29 18:57         ` Junio C Hamano
  1 sibling, 0 replies; 21+ messages in thread
From: Junio C Hamano @ 2020-01-29 18:49 UTC (permalink / raw)
  To: SZEDER Gábor; +Cc: Matheus Tavares, Philippe Blain, git

SZEDER Gábor <szeder.dev@gmail.com> writes:

> this patch below and a7f3240877 (grep: ignore --recurse-submodules if
> --no-index is given, 2020-01-26) on topic
> 'pb/do-not-recurse-grep-no-index' don't work well together, and cause
> failure of the test 'grep --recurse-submodules --no-index ignores
> --recurse-submodules' in 't7814-grep-recurse-submodules.sh', i.e. in
> the new test added in a7f3240877.

Yup, I was looking at it and trying to see what was going on.

>> +	if (recurse_submodules && (!use_index || untracked))
>> +		die(_("option not supported with --recurse-submodules"));
>
> So this patch moves this condition here, expecting git to die with 
> '--recurse-submodules --no-index'.  However, a7f3240877 removes the
> '!use_index' part of the condition, so we won't die here ...
>
>>  	if (list.nr || cached || show_in_pager) {
>>  		if (num_threads > 1)
>>  			warning(_("invalid option combination, ignoring --threads"));
>> @@ -1071,6 +1072,14 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
>>  		    && (opt.pre_context || opt.post_context ||
>>  			opt.file_break || opt.funcbody))
>>  			skip_first_line = 1;
>> +
>> +		/*
>> +		 * Pre-read gitmodules (if not read already) to prevent racy
>> +		 * lazy reading in worker threads.
>> +		 */
>> +		if (recurse_submodules)
>> +			repo_read_gitmodules(the_repository, 1);
>
> ... and eventually reach this condition, which then reads the
> submodules even with '--no-index', which is just what a7f3240877 tried
> to avoid, thus triggering the test failure.
>
> It might be that all we need is changing this condition to:
>
>   if (recurse_submodules && use_index)
>
> Or maybe not, but this change on top of 'pu' makes t7814 succeed
> again.

Sounds like a sensible idea.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v3 08/12] grep: allow submodule functions to run in parallel
  2020-01-29 11:26       ` SZEDER Gábor
  2020-01-29 18:49         ` Junio C Hamano
@ 2020-01-29 18:57         ` Junio C Hamano
  2020-01-29 20:42           ` Matheus Tavares Bernardino
  1 sibling, 1 reply; 21+ messages in thread
From: Junio C Hamano @ 2020-01-29 18:57 UTC (permalink / raw)
  To: SZEDER Gábor
  Cc: Matheus Tavares, Philippe Blain, git, christian.couder, jrnieder,
	olyatelezhnaya, pclouds, jonathantanmy, peff, Brandon Williams,
	Stefan Beller

SZEDER Gábor <szeder.dev@gmail.com> writes:

> Junio, Matheus, Philippe,
>
> this patch below and a7f3240877 (grep: ignore --recurse-submodules if
> --no-index is given, 2020-01-26) on topic
> 'pb/do-not-recurse-grep-no-index' don't work well together, and cause
> failure of the test 'grep --recurse-submodules --no-index ignores
> --recurse-submodules' in 't7814-grep-recurse-submodules.sh', i.e. in
> the new test added in a7f3240877.

Hmph, I wonder if "ignore --recurse-submodules if --no-index" should
have been done as a single liner patch, something along the lines of
"after parse_options() returns, drop recurse_submodules if no-index
was given", i.e.

@@ -958,6 +946,8 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 			/* die the same way as if we did it at the beginning */
 			setup_git_directory();
 	}
+	if (!use_index)
+		recurse_submodules = 0; /* ignore */
 
 	/*
 	 * skip a -- separator; we know it cannot be

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v3 08/12] grep: allow submodule functions to run in parallel
  2020-01-29 18:57         ` Junio C Hamano
@ 2020-01-29 20:42           ` Matheus Tavares Bernardino
  2020-01-30 13:28             ` Philippe Blain
  0 siblings, 1 reply; 21+ messages in thread
From: Matheus Tavares Bernardino @ 2020-01-29 20:42 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: SZEDER Gábor, Philippe Blain, git, Christian Couder,
	Jonathan Nieder,
	Оля
	Тележная,
	Nguyễn Thái Ngọc Duy, Jonathan Tan, Jeff King,
	Brandon Williams, Stefan Beller

On Wed, Jan 29, 2020 at 3:57 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> SZEDER Gábor <szeder.dev@gmail.com> writes:
> >
[...]
> > @@ -1071,6 +1072,14 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
> >                   && (opt.pre_context || opt.post_context ||
> >                       opt.file_break || opt.funcbody))
> >                       skip_first_line = 1;
> > +
> > +             /*
> > +              * Pre-read gitmodules (if not read already) to prevent racy
> > +              * lazy reading in worker threads.
> > +              */
> > +             if (recurse_submodules)
> > +                     repo_read_gitmodules(the_repository, 1);
> >
> > ... and eventually reach this condition, which then reads the
> > submodules even with '--no-index', which is just what a7f3240877 tried
> > to avoid, thus triggering the test failure.
> >
> > It might be that all we need is changing this condition to:
> >
> >  if (recurse_submodules && use_index)

Yes, I think that would work. I was only worried that, in case of
!use_index, the path taken could somehow lead to an unprotected call
to repo_read_gitmodules() (with threads spawned).Then, since the file
would not have been pre-loaded by the sequential code, we could
encounter a race condition. But by what I've inspected, when use_index
is false, grep_directory() will be called to traverse the files, and
it does not have repo_read_gitmodules() in its call graph[1]. So the
solution should be fine in the point of view of thread-safeness.

> Hmph, I wonder if "ignore --recurse-submodules if --no-index" should
> have been done as a single liner patch, something along the lines of
> "after parse_options() returns, drop recurse_submodules if no-index
> was given", i.e.
>
> @@ -958,6 +946,8 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
>                         /* die the same way as if we did it at the beginning */
>                         setup_git_directory();
>         }
> +       if (!use_index)
> +               recurse_submodules = 0; /* ignore */
>
>         /*
>          * skip a -- separator; we know it cannot be

Yeah, this seems more meaningful, IMHO, as we can easily see that the
recurse_submodules option was dropped in favor of using --no-index.

[1]: Well, in fact repo_read_gitmodules() *is* in grep_directory()'s
call graph, but the only path to it is through the
fill_textconv_grep() > fill_textconv() call, which is already guarded
by the obj_read_mutex. So there is no problem here.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v3 08/12] grep: allow submodule functions to run in parallel
  2020-01-29 20:42           ` Matheus Tavares Bernardino
@ 2020-01-30 13:28             ` Philippe Blain
  0 siblings, 0 replies; 21+ messages in thread
From: Philippe Blain @ 2020-01-30 13:28 UTC (permalink / raw)
  To: Matheus Tavares Bernardino
  Cc: Junio C Hamano, SZEDER Gábor, git, Christian Couder,
	Jonathan Nieder,
	Оля
	Тележная,
	Nguyễn Thái Ngọc Duy, Jonathan Tan, Jeff King,
	Brandon Williams, Stefan Beller

Hi everyone,
>> 
>> @@ -958,6 +946,8 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
>>                        /* die the same way as if we did it at the beginning */
>>                        setup_git_directory();
>>        }
>> +       if (!use_index)
>> +               recurse_submodules = 0; /* ignore */
>> 
>>        /*
>>         * skip a -- separator; we know it cannot be
> 
> Yeah, this seems more meaningful, IMHO, as we can easily see that the
> recurse_submodules option was dropped in favor of using --no-index.
> 
I agree. I’ll send a v2 of my patch with this added.

Philippe.


^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, back to index

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-08-10 20:27 [GSoC][PATCH 0/4] grep: re-enable threads when cached, w/ parallel inflation Matheus Tavares
2019-09-30  1:50 ` [PATCH v2 00/11] grep: improve threading and fix race conditions Matheus Tavares
2020-01-16  2:39   ` [PATCH v3 00/12] " Matheus Tavares
2020-01-16  2:39     ` [PATCH v3 01/12] grep: fix race conditions on userdiff calls Matheus Tavares
2020-01-16  2:39     ` [PATCH v3 02/12] grep: fix race conditions at grep_submodule() Matheus Tavares
2020-01-16  2:39     ` [PATCH v3 03/12] grep: fix racy calls in grep_objects() Matheus Tavares
2020-01-16  2:39     ` [PATCH v3 04/12] replace-object: make replace operations thread-safe Matheus Tavares
2020-01-16  2:39     ` [PATCH v3 05/12] object-store: allow threaded access to object reading Matheus Tavares
2020-01-16  2:39     ` [PATCH v3 06/12] grep: replace grep_read_mutex by internal obj read lock Matheus Tavares
2020-01-16  2:39     ` [PATCH v3 07/12] submodule-config: add skip_if_read option to repo_read_gitmodules() Matheus Tavares
2020-01-16  2:39     ` [PATCH v3 08/12] grep: allow submodule functions to run in parallel Matheus Tavares
2020-01-29 11:26       ` SZEDER Gábor
2020-01-29 18:49         ` Junio C Hamano
2020-01-29 18:57         ` Junio C Hamano
2020-01-29 20:42           ` Matheus Tavares Bernardino
2020-01-30 13:28             ` Philippe Blain
2020-01-16  2:39     ` [PATCH v3 09/12] grep: protect packed_git [re-]initialization Matheus Tavares
2020-01-16  2:39     ` [PATCH v3 10/12] grep: re-enable threads in non-worktree case Matheus Tavares
2020-01-16  2:39     ` [PATCH v3 11/12] grep: move driver pre-load out of critical section Matheus Tavares
2020-01-16  2:40     ` [PATCH v3 12/12] grep: use no. of cores as the default no. of threads Matheus Tavares
2020-01-16 13:11       ` Victor Leschuk
2020-01-16 14:47         ` [PATCH] " Matheus Tavares

Git Mailing List Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/git/0 git/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 git git/ https://lore.kernel.org/git \
		git@vger.kernel.org
	public-inbox-index git

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.git


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git