From: Matheus Tavares <matheus.bernardino@usp.br>
To: gitster@pobox.com
Cc: git@vger.kernel.org, christian.couder@gmail.com, git@jeffhostetler.com
Subject: [PATCH v4 0/5] Parallel Checkout (part 2)
Date: Mon, 19 Apr 2021 16:53:30 -0300
Message-ID: <cover.1618861380.git.matheus.bernardino@usp.br>
In-Reply-To: <cover.1618790794.git.matheus.bernardino@usp.br>

This version is almost identical to v3, but the last patch incorporates
the typo fixes and other rewording suggestions Christian made about the
design doc in the last round.

I also removed the sentence about step 3 dominating the execution time,
as that's not always the case, e.g. on a non-local clone or a
sparse checkout.

Matheus Tavares (5):
  unpack-trees: add basic support for parallel checkout
  parallel-checkout: make it truly parallel
  parallel-checkout: add configuration options
  parallel-checkout: support progress displaying
  parallel-checkout: add design documentation

 .gitignore                                    |   1 +
 Documentation/Makefile                        |   1 +
 Documentation/config/checkout.txt             |  21 +
 Documentation/technical/parallel-checkout.txt | 270 ++++++++
 Makefile                                      |   2 +
 builtin.h                                     |   1 +
 builtin/checkout--worker.c                    | 145 ++++
 entry.c                                       |  17 +-
 git.c                                         |   2 +
 parallel-checkout.c                           | 655 ++++++++++++++++++
 parallel-checkout.h                           | 111 +++
 unpack-trees.c                                |  19 +-
 12 files changed, 1240 insertions(+), 5 deletions(-)
 create mode 100644 Documentation/technical/parallel-checkout.txt
 create mode 100644 builtin/checkout--worker.c
 create mode 100644 parallel-checkout.c
 create mode 100644 parallel-checkout.h

Range-diff against v3:
1:  7096822c14 = 1:  7096822c14 unpack-trees: add basic support for parallel checkout
2:  4526516ea0 = 2:  4526516ea0 parallel-checkout: make it truly parallel
3:  ad165c0637 = 3:  ad165c0637 parallel-checkout: add configuration options
4:  cf9e28dc0e = 4:  cf9e28dc0e parallel-checkout: support progress displaying
5:  415d4114aa ! 5:  fd929f072c parallel-checkout: add design documentation
    @@ Documentation/technical/parallel-checkout.txt (new)
     +* Step 4: Write the new index to disk.
     +
     +Step 3 is the focus of the "parallel checkout" effort described here.
    -+It dominates the execution time for most of the above command types.
     +
     +Sequential Implementation
     +-------------------------
    @@ Documentation/technical/parallel-checkout.txt (new)
     +It wouldn't be safe to perform Step 3b in parallel, as there could be
     +race conditions between file creations and removals. Instead, the
     +parallel checkout framework lets the sequential code handle Step 3b,
    -+and use parallel workers to replace the sequential
    ++and uses parallel workers to replace the sequential
     +`entry.c:write_entry()` calls from Step 3c.
     +
     +Rejected Multi-Threaded Solution
    @@ Documentation/technical/parallel-checkout.txt (new)
     +warning for the user, like the classic sequential checkout does.
     +
     +The workers are able to detect both collisions among the entries being
    -+concurrently written and collisions among parallel-eligible and
    -+ineligible entries. The general idea for collision detection is quite
    -+straightforward: for each parallel-eligible entry, the main process must
    -+remove all files that prevent this entry from being written (before
    -+enqueueing it). This includes any non-directory file in the leading path
    -+of the entry. Later, when a worker gets assigned the entry, it looks
    -+again for the non-directories files and for an already existing file at
    -+the entry's path. If any of these checks finds something, the worker
    -+knows that there was a path collision.
    ++concurrently written and collisions between a parallel-eligible entry
    ++and an ineligible entry. The general idea for collision detection is
    ++quite straightforward: for each parallel-eligible entry, the main
    ++process must remove all files that prevent this entry from being written
    ++(before enqueueing it). This includes any non-directory file in the
    ++leading path of the entry. Later, when a worker gets assigned the entry,
    ++it looks again for the non-directories files and for an already existing
    ++file at the entry's path. If any of these checks finds something, the
    ++worker knows that there was a path collision.
     +
     +Because parallel checkout can distinguish path collisions from the case
     +where the file was already present in the working tree before checkout,
    @@ Documentation/technical/parallel-checkout.txt (new)
     +Besides, long-running filters may use the delayed checkout feature to
     +postpone the return of some filtered blobs. The delayed checkout queue
     +and the parallel checkout queue are not compatible and should remain
    -+separated.
    ++separate.
     ++
     +Note: regular files that only require internal filters, like end-of-line
     +conversion and re-encoding, are eligible for parallel checkout.
    @@ Documentation/technical/parallel-checkout.txt (new)
     +The API
     +-------
     +
    -+The parallel checkout API was designed with the goal to minimize changes
    -+to the current users of the checkout machinery. This means that they
    -+don't have to call a different function for sequential or parallel
    ++The parallel checkout API was designed with the goal of minimizing
    ++changes to the current users of the checkout machinery. This means that
    ++they don't have to call a different function for sequential or parallel
     +checkout. As already mentioned, `checkout_entry()` will automatically
     +insert the given entry in the parallel checkout queue when this feature
     +is enabled and the entry is eligible; otherwise, it will just write the
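
For readers skimming the range-diff, the worker-side collision check that
the design doc hunk above describes (look again for non-directory files in
the leading path, then for an already existing file at the entry's path)
can be sketched roughly as follows. This is only an illustration under
stated assumptions, not the code from parallel-checkout.c: the helper name
has_path_collision() and its return convention are hypothetical, and it
assumes normalized relative paths such as those of index entries.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/*
 * Hypothetical helper (not part of git): returns 1 if writing `path`
 * would collide with something already on disk, 0 if the path is free,
 * and -1 on an unexpected error or an empty/over-long path.
 */
static int has_path_collision(const char *path)
{
	char buf[4096];
	struct stat st;
	size_t len, i;

	assert(path != NULL);
	len = strlen(path);
	if (len == 0 || len >= sizeof(buf))
		return -1;
	memcpy(buf, path, len + 1);

	/* Check 1: every existing leading component must be a directory. */
	for (i = 1; i < len; i++) {
		if (buf[i] != '/')
			continue;
		buf[i] = '\0';
		if (lstat(buf, &st) == 0) {
			if (!S_ISDIR(st.st_mode))
				return 1; /* a file or symlink is in the way */
		} else if (errno == ENOENT) {
			return 0; /* nothing exists from here down */
		} else if (errno == ENOTDIR) {
			return 1; /* a parent turned into a non-directory */
		} else {
			return -1;
		}
		buf[i] = '/';
	}

	/* Check 2: nothing may already exist at the entry's own path. */
	if (lstat(path, &st) == 0)
		return 1;
	if (errno == ENOENT)
		return 0;
	return errno == ENOTDIR ? 1 : -1;
}
```

Since the main process is expected to have cleared these paths before
enqueueing the entry, a worker that sees a return value of 1 from such a
check can attribute it to a path collision rather than to a file that was
already in the working tree before the checkout started.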
-- 
2.30.1

