From: Matheus Tavares <matheus.bernardino@usp.br>
To: git@vger.kernel.org
Cc: gitster@pobox.com, git@jeffhostetler.com,
	chriscool@tuxfamily.org, peff@peff.net, newren@gmail.com,
	jrnieder@gmail.com, martin.agren@gmail.com
Subject: [PATCH v3 00/19] Parallel Checkout (part I)
Date: Wed, 28 Oct 2020 23:14:37 -0300
Message-ID: <cover.1603937110.git.matheus.bernardino@usp.br>
In-Reply-To: <cover.1600814153.git.matheus.bernardino@usp.br>

There were some semantic conflicts between this series and
jk/checkout-index-errors, so I rebased my series on top of that branch.

Also, I'd ask reviewers to please confirm that the file descriptor
redirection in git_pc() (patch 17) is correct, as I'm not very
familiar with the test suite's descriptors.
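
For reference, here is my reading of what the redirection is meant to
achieve. This is only a sketch, and it assumes that fd 4 is the
descriptor test-lib.sh reserves for the tests' original stderr, which
is precisely the part I'd like to have confirmed:

	# Stand-alone sketch of the fd dance used by git_pc():
	helper () {
		echo "helper chatter" >&2 &&  # fd 2 points at fd 4 here
		"$@" 2>&8                     # fd 8 is the caller's stderr
	} 8>&2 2>&4
	# 8>&2: save the stderr the caller set up (e.g. "2>cmd.stderr"),
	#       so that only the wrapped command writes to it.
	# 2>&4: route the helper's own stderr to the test framework's
	#       descriptor instead of polluting the caller's redirection.

	exec 4>/dev/null                  # what test-lib.sh does when not verbose
	helper sh -c 'echo oops >&2' 2>cmd.stderr
	cat cmd.stderr                    # contains only "oops"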

Main changes since v2:

Patch 10:
  - Squashed Peff's patch removing a useless function parameter.

Patch 11:
  - Valgrind used to complain about send_one_item() passing
    uninitialized bytes to a syscall (write(2)). The bytes in question
    come from the unused positions in oid->hash[] when the hash is
    SHA-1. Since the workers never use these bytes, there is no real
    harm. But the warning could cause confusion and even get in the way
    of detecting real errors, so I replaced the oidcpy() call with
    hashcpy().

Patch 16:
  - Replaced the use of the non-portable '\+' in grep with '..*' (in
    t/lib-parallel-checkout.sh); a short illustration of the
    portability issue follows after this list.

  - Properly quoted function parameters in t/lib-parallel-checkout.sh,
    as Jonathan pointed out.

  - In t2080, dropped tests that used git.git as test data, and added
    two more tests to check clone with parallel-checkout using the
    artificial repo already created for other tests.

  - No longer skip the clone tests when GIT_TEST_DEFAULT_HASH is
    sha256. A bug in clone used to make these tests fail with that
    setting, but it was fixed in 47ac970309 ("builtin/clone: avoid
    failure with GIT_DEFAULT_HASH", 2020-09-20).
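
A quick illustration of the '\+' portability issue mentioned in the
first item above: '\+' is undefined in POSIX basic regular expressions
and only means "one or more" as a GNU extension, whereas "an atom
followed by zero or more of it" is the portable spelling:

	echo "child_start[10] git checkout--helper" >trace
	grep "child_start\[.\+\] git checkout--helper" trace   # GNU grep only
	grep "child_start\[..*\] git checkout--helper" trace   # portable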

Patch 17:
  - The test t2081-parallel-checkout-collisions.sh had a bug in which
    the filter options were wrongly passed to git. These options were
    conditionally defined through a shell variable whose quoting was
    wrong. This should have made the test fail, but another bug (using
    the numeric comparison operator `-eq` on strings) prevented the
    problematic section from ever running. Both bugs are now fixed, and
    the test script was also simplified by using lib-parallel-checkout.sh
    and eliminating the helper function. (A minimal illustration of the
    `-eq` issue follows after this list.)

  - Use "$TEST_ROOT/logger_script" instead of "../logger_script", to be
    on the safe side.
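
To illustrate the `-eq` problem mentioned above, here is a minimal
sketch (not the actual test code):

	filter=use_filter
	if test "$filter" -eq "use_filter"   # error: -eq compares integers
	then
		filter_opts='-c filter.logger.smudge=...'   # never reached
	fi
	echo "$?"   # prints 0: test's error status is swallowed by the
	            # 'if', so the script went on with an empty $filter_opts
	            # and the filter section was silently skipped

	# What was intended is a string comparison:
	if test "$filter" = "use_filter"
	then
		filter_opts='-c filter.logger.smudge=...'   # now reached
	fi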


Jeff Hostetler (4):
  convert: make convert_attrs() and convert structs public
  convert: add [async_]convert_to_working_tree_ca() variants
  convert: add get_stream_filter_ca() variant
  convert: add conv_attrs classification

Matheus Tavares (15):
  entry: extract a header file for entry.c functions
  entry: make fstat_output() and read_blob_entry() public
  entry: extract cache_entry update from write_entry()
  entry: move conv_attrs lookup up to checkout_entry()
  entry: add checkout_entry_ca() which takes preloaded conv_attrs
  unpack-trees: add basic support for parallel checkout
  parallel-checkout: make it truly parallel
  parallel-checkout: support progress displaying
  make_transient_cache_entry(): optionally alloc from mem_pool
  builtin/checkout.c: complete parallel checkout support
  checkout-index: add parallel checkout support
  parallel-checkout: add tests for basic operations
  parallel-checkout: add tests related to clone collisions
  parallel-checkout: add tests related to .gitattributes
  ci: run test round with parallel-checkout enabled

 .gitignore                              |   1 +
 Documentation/config/checkout.txt       |  21 +
 Makefile                                |   2 +
 apply.c                                 |   1 +
 builtin.h                               |   1 +
 builtin/checkout--helper.c              | 142 ++++++
 builtin/checkout-index.c                |  22 +-
 builtin/checkout.c                      |  21 +-
 builtin/difftool.c                      |   3 +-
 cache.h                                 |  34 +-
 ci/run-build-and-tests.sh               |   1 +
 convert.c                               | 121 +++--
 convert.h                               |  68 +++
 entry.c                                 | 102 ++--
 entry.h                                 |  54 ++
 git.c                                   |   2 +
 parallel-checkout.c                     | 638 ++++++++++++++++++++++++
 parallel-checkout.h                     | 103 ++++
 read-cache.c                            |  12 +-
 t/README                                |   4 +
 t/lib-encoding.sh                       |  25 +
 t/lib-parallel-checkout.sh              |  46 ++
 t/t0028-working-tree-encoding.sh        |  25 +-
 t/t2080-parallel-checkout-basics.sh     | 170 +++++++
 t/t2081-parallel-checkout-collisions.sh |  98 ++++
 t/t2082-parallel-checkout-attributes.sh | 174 +++++++
 unpack-trees.c                          |  22 +-
 27 files changed, 1758 insertions(+), 155 deletions(-)
 create mode 100644 builtin/checkout--helper.c
 create mode 100644 entry.h
 create mode 100644 parallel-checkout.c
 create mode 100644 parallel-checkout.h
 create mode 100644 t/lib-encoding.sh
 create mode 100644 t/lib-parallel-checkout.sh
 create mode 100755 t/t2080-parallel-checkout-basics.sh
 create mode 100755 t/t2081-parallel-checkout-collisions.sh
 create mode 100755 t/t2082-parallel-checkout-attributes.sh

Range-diff against v2:
 1:  b9d2a329d3 =  1:  dfc3e0fd62 convert: make convert_attrs() and convert structs public
 2:  313c3bcbeb =  2:  c5fbd1e16d convert: add [async_]convert_to_working_tree_ca() variants
 3:  29bbdb78e9 =  3:  c77b16f694 convert: add get_stream_filter_ca() variant
 4:  a1cf5df961 =  4:  18c3f4247e convert: add conv_attrs classification
 5:  25b311745a =  5:  2caa2c4345 entry: extract a header file for entry.c functions
 6:  dbee09e936 =  6:  bfa52df9e2 entry: make fstat_output() and read_blob_entry() public
 7:  b61b5c44f0 =  7:  91ef17f533 entry: extract cache_entry update from write_entry()
 8:  667ad0dea7 =  8:  81e03baab1 entry: move conv_attrs lookup up to checkout_entry()
 9:  4ddb34209e =  9:  e1b886f823 entry: add checkout_entry_ca() which takes preloaded conv_attrs
10:  af0d790973 ! 10:  2bdc13664e unpack-trees: add basic support for parallel checkout
    @@ parallel-checkout.c (new)
     +}
     +
     +static int write_pc_item_to_fd(struct parallel_checkout_item *pc_item, int fd,
    -+			       const char *path, struct checkout *state)
    ++			       const char *path)
     +{
     +	int ret;
     +	struct stream_filter *filter;
    @@ parallel-checkout.c (new)
     +		goto out;
     +	}
     +
    -+	if (write_pc_item_to_fd(pc_item, fd, path.buf, state)) {
    ++	if (write_pc_item_to_fd(pc_item, fd, path.buf)) {
     +		/* Error was already reported. */
     +		pc_item->status = PC_ITEM_FAILED;
     +		goto out;
11:  991169488b ! 11:  096e543fd2 parallel-checkout: make it truly parallel
    @@ Documentation/config/checkout.txt: will checkout the '<something>' branch on ano
     +	The number of parallel workers to use when updating the working tree.
     +	The default is one, i.e. sequential execution. If set to a value less
     +	than one, Git will use as many workers as the number of logical cores
    -+	available. This setting and checkout.thresholdForParallelism affect all
    -+	commands that perform checkout. E.g. checkout, switch, clone, reset,
    -+	sparse-checkout, read-tree, etc.
    ++	available. This setting and `checkout.thresholdForParallelism` affect
    ++	all commands that perform checkout. E.g. checkout, clone, reset,
    ++	sparse-checkout, etc.
     ++
     +Note: parallel checkout usually delivers better performance for repositories
     +located on SSDs or over NFS. For repositories on spinning disks and/or machines
    @@ parallel-checkout.c: static void write_pc_item(struct parallel_checkout_item *pc
     +
     +	fixed_portion = (struct pc_item_fixed_portion *)data;
     +	fixed_portion->id = pc_item->id;
    -+	oidcpy(&fixed_portion->oid, &pc_item->ce->oid);
     +	fixed_portion->ce_mode = pc_item->ce->ce_mode;
     +	fixed_portion->crlf_action = pc_item->ca.crlf_action;
     +	fixed_portion->ident = pc_item->ca.ident;
     +	fixed_portion->name_len = name_len;
     +	fixed_portion->working_tree_encoding_len = working_tree_encoding_len;
    ++	/*
    ++	 * We use hashcpy() instead of oidcpy() because the hash[] positions
    ++	 * after `the_hash_algo->rawsz` might not be initialized. And Valgrind
    ++	 * would complain about passing uninitialized bytes to a syscall
    ++	 * (write(2)). There is no real harm in this case, but the warning could
    ++	 * hinder the detection of actual errors.
    ++	 */
    ++	hashcpy(fixed_portion->oid.hash, pc_item->ce->oid.hash);
     +
     +	variant = data + sizeof(*fixed_portion);
     +	if (working_tree_encoding_len) {
12:  7ceadf2427 = 12:  9cfeb4821c parallel-checkout: support progress displaying
13:  f13b4c17f4 = 13:  da99b671e6 make_transient_cache_entry(): optionally alloc from mem_pool
14:  d7885a1130 = 14:  d3d561754a builtin/checkout.c: complete parallel checkout support
15:  1cf9b807f7 ! 15:  ee34c6e149 checkout-index: add parallel checkout support
    @@ builtin/checkout-index.c
      #define CHECKOUT_ALL 4
      static int nul_term_line;
     @@ builtin/checkout-index.c: int cmd_checkout_index(int argc, const char **argv, const char *prefix)
    - 	int prefix_length;
      	int force = 0, quiet = 0, not_new = 0;
      	int index_opt = 0;
    + 	int err = 0;
     +	int pc_workers, pc_threshold;
      	struct option builtin_checkout_index_options[] = {
      		OPT_BOOL('a', "all", &all,
    @@ builtin/checkout-index.c: int cmd_checkout_index(int argc, const char **argv, co
      	for (i = 0; i < argc; i++) {
      		const char *arg = argv[i];
     @@ builtin/checkout-index.c: int cmd_checkout_index(int argc, const char **argv, const char *prefix)
    + 		strbuf_release(&buf);
    + 	}
    + 
    +-	if (err)
    +-		return 1;
    +-
      	if (all)
      		checkout_all(prefix, prefix_length);
      
     +	if (pc_workers > 1) {
    -+		/* Errors were already reported */
    -+		run_parallel_checkout(&state, pc_workers, pc_threshold,
    -+				      NULL, NULL);
    ++		err |= run_parallel_checkout(&state, pc_workers, pc_threshold,
    ++					     NULL, NULL);
     +	}
    ++
    ++	if (err)
    ++		return 1;
     +
      	if (is_lock_file_locked(&lock_file) &&
      	    write_locked_index(&the_index, &lock_file, COMMIT_LOCK))
16:  64b41d537e ! 16:  05299a3cc0 parallel-checkout: add tests for basic operations
    @@ Commit message
         for symlinks in the leading directories and the abidance to --force.
     
         Note: some helper functions are added to a common lib file which is only
    -    included by t2080 for now. But it will also be used by another
    -    parallel-checkout test in a following patch.
    +    included by t2080 for now. But it will also be used by other
    +    parallel-checkout tests in the following patches.
     
         Original-patch-by: Jeff Hostetler <jeffhost@microsoft.com>
         Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
    @@ t/lib-parallel-checkout.sh (new)
     +
     +# Runs `git -c checkout.workers=$1 -c checkout.thesholdForParallelism=$2 ${@:4}`
     +# and checks that the number of workers spawned is equal to $3.
    ++#
     +git_pc()
     +{
     +	if test $# -lt 4
     +	then
     +		BUG "too few arguments to git_pc()"
    -+	fi
    ++	fi &&
     +
     +	workers=$1 threshold=$2 expected_workers=$3 &&
    -+	shift && shift && shift &&
    ++	shift 3 &&
     +
     +	rm -f trace &&
     +	GIT_TRACE2="$(pwd)/trace" git \
     +		-c checkout.workers=$workers \
     +		-c checkout.thresholdForParallelism=$threshold \
     +		-c advice.detachedHead=0 \
    -+		$@ &&
    ++		"$@" &&
     +
     +	# Check that the expected number of workers has been used. Note that it
    -+	# can be different than the requested number in two cases: when the
    -+	# quantity of entries to be checked out is less than the number of
    -+	# workers; and when the threshold has not been reached.
    ++	# can be different from the requested number in two cases: when the
    ++	# threshold is not reached; and when there are not enough
    ++	# parallel-eligible entries for all workers.
     +	#
    -+	local workers_in_trace=$(grep "child_start\[.\+\] git checkout--helper" trace | wc -l) &&
    ++	local workers_in_trace=$(grep "child_start\[..*\] git checkout--helper" trace | wc -l) &&
     +	test $workers_in_trace -eq $expected_workers &&
     +	rm -f trace
     +}
    @@ t/lib-parallel-checkout.sh (new)
     +# Verify that both the working tree and the index were created correctly
     +verify_checkout()
     +{
    -+	git -C $1 diff-index --quiet HEAD -- &&
    -+	git -C $1 diff-index --quiet --cached HEAD -- &&
    -+	git -C $1 status --porcelain >$1.status &&
    -+	test_must_be_empty $1.status
    ++	git -C "$1" diff-index --quiet HEAD -- &&
    ++	git -C "$1" diff-index --quiet --cached HEAD -- &&
    ++	git -C "$1" status --porcelain >"$1".status &&
    ++	test_must_be_empty "$1".status
     +}
     
      ## t/t2080-parallel-checkout-basics.sh (new) ##
    @@ t/t2080-parallel-checkout-basics.sh (new)
     +. ./test-lib.sh
     +. "$TEST_DIRECTORY/lib-parallel-checkout.sh"
     +
    -+# NEEDSWORK: cloning a SHA1 repo with GIT_TEST_DEFAULT_HASH set to "sha256"
    -+# currently produces a wrong result (See
    -+# https://lore.kernel.org/git/20200911151717.43475-1-matheus.bernardino@usp.br/).
    -+# So we skip the "parallel-checkout during clone" tests when this test flag is
    -+# set to "sha256". Remove this when the bug is fixed.
    -+#
    -+if test "$GIT_TEST_DEFAULT_HASH" = "sha256"
    -+then
    -+	skip_all="t2080 currently don't work with GIT_TEST_DEFAULT_HASH=sha256"
    -+	test_done
    -+fi
    -+
    -+R_BASE=$GIT_BUILD_DIR
    -+
    -+test_expect_success 'sequential clone' '
    -+	git_pc 1 0 0 clone --quiet -- $R_BASE r_sequential &&
    -+	verify_checkout r_sequential
    -+'
    -+
    -+test_expect_success 'parallel clone' '
    -+	git_pc 2 0 2 clone --quiet -- $R_BASE r_parallel &&
    -+	verify_checkout r_parallel
    -+'
    -+
    -+test_expect_success 'fallback to sequential clone (threshold)' '
    -+	git -C $R_BASE ls-files >files &&
    -+	nr_files=$(wc -l <files) &&
    -+	threshold=$(($nr_files + 1)) &&
    -+
    -+	git_pc 2 $threshold 0 clone --quiet -- $R_BASE r_sequential_fallback &&
    -+	verify_checkout r_sequential_fallback
    -+'
    -+
    -+# Just to be paranoid, actually compare the contents of the worktrees directly.
    -+test_expect_success 'compare working trees from clones' '
    -+	rm -rf r_sequential/.git &&
    -+	rm -rf r_parallel/.git &&
    -+	rm -rf r_sequential_fallback/.git &&
    -+	diff -qr r_sequential r_parallel &&
    -+	diff -qr r_sequential r_sequential_fallback
    -+'
    -+
     +# Test parallel-checkout with different operations (creation, deletion,
     +# modification) and entry types. A branch switch from B1 to B2 will contain:
     +#
    @@ t/t2080-parallel-checkout-basics.sh (new)
     +	verify_checkout various_sequential_fallback
     +'
     +
    -+test_expect_success SYMLINKS 'compare working trees from checkouts' '
    -+	rm -rf various_sequential/.git &&
    -+	rm -rf various_parallel/.git &&
    -+	rm -rf various_sequential_fallback/.git &&
    -+	diff -qr various_sequential various_parallel &&
    -+	diff -qr various_sequential various_sequential_fallback
    ++test_expect_success SYMLINKS 'parallel checkout on clone' '
    ++	git -C various checkout --recurse-submodules B2 &&
    ++	git_pc 2 0 2 clone --recurse-submodules various various_parallel_clone  &&
    ++	verify_checkout various_parallel_clone
    ++'
    ++
    ++test_expect_success SYMLINKS 'fallback to sequential checkout on clone (threshold)' '
    ++	git -C various checkout --recurse-submodules B2 &&
    ++	git_pc 2 100 0 clone --recurse-submodules various various_sequential_fallback_clone &&
    ++	verify_checkout various_sequential_fallback_clone
    ++'
    ++
    ++# Just to be paranoid, actually compare the working trees' contents directly.
    ++test_expect_success SYMLINKS 'compare the working trees' '
    ++	rm -rf various_*/.git &&
    ++	rm -rf various_*/d/.git &&
    ++
    ++	diff -r various_sequential various_parallel &&
    ++	diff -r various_sequential various_sequential_fallback &&
    ++	diff -r various_sequential various_parallel_clone &&
    ++	diff -r various_sequential various_sequential_fallback_clone
     +'
     +
     +test_cmp_str()
17:  70708d3e31 ! 17:  3d140dcacb parallel-checkout: add tests related to clone collisions
    @@ Commit message
         Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
         Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
     
    + ## t/lib-parallel-checkout.sh ##
    +@@ t/lib-parallel-checkout.sh: git_pc()
    + 		-c checkout.workers=$workers \
    + 		-c checkout.thresholdForParallelism=$threshold \
    + 		-c advice.detachedHead=0 \
    +-		"$@" &&
    ++		"$@" 2>&8 &&
    + 
    + 	# Check that the expected number of workers has been used. Note that it
    + 	# can be different from the requested number in two cases: when the
    +@@ t/lib-parallel-checkout.sh: git_pc()
    + 	local workers_in_trace=$(grep "child_start\[..*\] git checkout--helper" trace | wc -l) &&
    + 	test $workers_in_trace -eq $expected_workers &&
    + 	rm -f trace
    +-}
    ++} 8>&2 2>&4
    + 
    + # Verify that both the working tree and the index were created correctly
    + verify_checkout()
    +
      ## t/t2081-parallel-checkout-collisions.sh (new) ##
     @@
     +#!/bin/sh
     +
    -+test_description='parallel-checkout collisions'
    ++test_description='parallel-checkout collisions
    ++
    ++When there are path collisions during a clone, Git should report a warning
    ++listing all of the colliding entries. The sequential code detects a collision
    ++by calling lstat() before trying to open(O_CREAT) the file. Then, to find the
    ++colliding pair of an item k, it searches cache_entry[0, k-1].
    ++
    ++This is not sufficient in parallel checkout since:
    ++
    ++- A colliding file may be created between the lstat() and open() calls;
    ++- A colliding entry might appear in the second half of the cache_entry array.
    ++
    ++The tests in this file make sure that the collision detection code is extended
    ++for parallel checkout.
    ++'
     +
     +. ./test-lib.sh
    ++. "$TEST_DIRECTORY/lib-parallel-checkout.sh"
     +
    -+# When there are pathname collisions during a clone, Git should report a warning
    -+# listing all of the colliding entries. The sequential code detects a collision
    -+# by calling lstat() before trying to open(O_CREAT) the file. Then, to find the
    -+# colliding pair of an item k, it searches cache_entry[0, k-1].
    -+#
    -+# This is not sufficient in parallel-checkout mode since colliding files may be
    -+# created in a racy order. The tests in this file make sure the collision
    -+# detection code is extended for parallel-checkout. This is done in two parts:
    -+#
    -+# - First, two parallel workers create four colliding files racily.
    -+# - Then this exercise is repeated but forcing the colliding pair to appear in
    -+#   the second half of the cache_entry's array.
    -+#
    -+# The second item uses the fact that files with clean/smudge filters are not
    -+# parallel-eligible; and that they are processed sequentially *before* any
    -+# worker is spawned. We set a filter attribute to the last entry in the
    -+# cache_entry[] array, making it non-eligible, so that it is populated first.
    -+# This way, we can test if the collision detection code is correctly looking
    -+# for collision pairs in the second half of the array.
    ++TEST_ROOT="$PWD"
     +
     +test_expect_success CASE_INSENSITIVE_FS 'setup' '
    -+	file_hex=$(git hash-object -w --stdin </dev/null) &&
    -+	file_oct=$(echo $file_hex | hex2oct) &&
    ++	file_x_hex=$(git hash-object -w --stdin </dev/null) &&
    ++	file_x_oct=$(echo $file_x_hex | hex2oct) &&
     +
     +	attr_hex=$(echo "file_x filter=logger" | git hash-object -w --stdin) &&
     +	attr_oct=$(echo $attr_hex | hex2oct) &&
     +
    -+	printf "100644 FILE_X\0${file_oct}" >tree &&
    -+	printf "100644 FILE_x\0${file_oct}" >>tree &&
    -+	printf "100644 file_X\0${file_oct}" >>tree &&
    -+	printf "100644 file_x\0${file_oct}" >>tree &&
    ++	printf "100644 FILE_X\0${file_x_oct}" >tree &&
    ++	printf "100644 FILE_x\0${file_x_oct}" >>tree &&
    ++	printf "100644 file_X\0${file_x_oct}" >>tree &&
    ++	printf "100644 file_x\0${file_x_oct}" >>tree &&
     +	printf "100644 .gitattributes\0${attr_oct}" >>tree &&
     +
     +	tree_hex=$(git hash-object -w -t tree --stdin <tree) &&
     +	commit_hex=$(git commit-tree -m collisions $tree_hex) &&
     +	git update-ref refs/heads/collisions $commit_hex &&
     +
    -+	write_script logger_script <<-\EOF
    ++	write_script "$TEST_ROOT"/logger_script <<-\EOF
     +	echo "$@" >>filter.log
     +	EOF
     +'
     +
    -+clone_and_check_collision()
    -+{
    -+	id=$1 workers=$2 threshold=$3 expected_workers=$4 filter=$5 &&
    -+
    -+	filter_opts=
    -+	if test "$filter" -eq "use_filter"
    -+	then
    -+		# We use `core.ignoreCase=0` so that only `file_x`
    -+		# matches the pattern in .gitattributes.
    -+		#
    -+		filter_opts='-c filter.logger.smudge="../logger_script %f" -c core.ignoreCase=0'
    -+	fi &&
    -+
    -+	test_path_is_missing $id.trace &&
    -+	GIT_TRACE2="$(pwd)/$id.trace" git \
    -+		-c checkout.workers=$workers \
    -+		-c checkout.thresholdForParallelism=$threshold \
    -+		$filter_opts clone --branch=collisions -- . r_$id 2>$id.warning &&
    -+
    -+	# Check that checkout spawned the right number of workers
    -+	workers_in_trace=$(grep "child_start\[.\] git checkout--helper" $id.trace | wc -l) &&
    -+	test $workers_in_trace -eq $expected_workers &&
    -+
    -+	if test $filter -eq "use_filter"
    -+	then
    -+		#  Make sure only 'file_x' was filtered
    -+		test_path_is_file r_$id/filter.log &&
    ++for mode in parallel sequential-fallback
    ++do
    ++
    ++	case $mode in
    ++	parallel)		workers=2 threshold=0 expected_workers=2 ;;
    ++	sequential-fallback)	workers=2 threshold=100 expected_workers=0 ;;
    ++	esac
    ++
    ++	test_expect_success CASE_INSENSITIVE_FS "collision detection on $mode clone" '
    ++		git_pc $workers $threshold $expected_workers \
    ++			clone --branch=collisions . $mode 2>$mode.stderr &&
    ++
    ++		grep FILE_X $mode.stderr &&
    ++		grep FILE_x $mode.stderr &&
    ++		grep file_X $mode.stderr &&
    ++		grep file_x $mode.stderr &&
    ++		test_i18ngrep "the following paths have collided" $mode.stderr
    ++	'
    ++
    ++	# The following test ensures that the collision detection code is
    ++	# correctly looking for colliding peers in the second half of the
    ++	# cache_entry array. This is done by defining a smudge command for the
    ++	# *last* array entry, which makes it non-eligible for parallel-checkout.
    ++	# The last entry is then checked out *before* any worker is spawned,
    ++	# making it succeed and the workers' entries collide.
    ++	#
    ++	# Note: this test don't work on Windows because, on this system,
    ++	# collision detection uses strcmp() when core.ignoreCase=false. And we
    ++	# have to set core.ignoreCase=false so that only 'file_x' matches the
    ++	# pattern of the filter attribute. But it works on OSX, where collision
    ++	# detection uses inode.
    ++	#
    ++	test_expect_success CASE_INSENSITIVE_FS,!MINGW,!CYGWIN "collision detection on $mode clone w/ filter" '
    ++		git_pc $workers $threshold $expected_workers \
    ++			-c core.ignoreCase=false \
    ++			-c filter.logger.smudge="\"$TEST_ROOT/logger_script\" %f" \
    ++			clone --branch=collisions . ${mode}_with_filter \
    ++			2>${mode}_with_filter.stderr &&
    ++
    ++		grep FILE_X ${mode}_with_filter.stderr &&
    ++		grep FILE_x ${mode}_with_filter.stderr &&
    ++		grep file_X ${mode}_with_filter.stderr &&
    ++		grep file_x ${mode}_with_filter.stderr &&
    ++		test_i18ngrep "the following paths have collided" ${mode}_with_filter.stderr &&
    ++
    ++		# Make sure only "file_x" was filtered
    ++		test_path_is_file ${mode}_with_filter/filter.log &&
     +		echo file_x >expected.filter.log &&
    -+		test_cmp r_$id/filter.log expected.filter.log
    -+	else
    -+		test_path_is_missing r_$id/filter.log
    -+	fi &&
    -+
    -+	grep FILE_X $id.warning &&
    -+	grep FILE_x $id.warning &&
    -+	grep file_X $id.warning &&
    -+	grep file_x $id.warning &&
    -+	test_i18ngrep "the following paths have collided" $id.warning
    -+}
    -+
    -+test_expect_success CASE_INSENSITIVE_FS 'collision detection on parallel clone' '
    -+	clone_and_check_collision parallel 2 0 2
    -+'
    -+
    -+test_expect_success CASE_INSENSITIVE_FS 'collision detection on fallback to sequential clone' '
    -+	git ls-tree --name-only -r collisions >files &&
    -+	nr_files=$(wc -l <files) &&
    -+	threshold=$(($nr_files + 1)) &&
    -+	clone_and_check_collision sequential 2 $threshold 0
    -+'
    -+
    -+# The next two tests don't work on Windows because, on this system, collision
    -+# detection uses strcmp() (when core.ignoreCase=0) to find the colliding pair.
    -+# But they work on OSX, where collision detection uses inode.
    -+
    -+test_expect_success CASE_INSENSITIVE_FS,!MINGW,!CYGWIN 'collision detection on parallel clone w/ filter' '
    -+	clone_and_check_collision parallel-with-filter 2 0 2 use_filter
    -+'
    -+
    -+test_expect_success CASE_INSENSITIVE_FS,!MINGW,!CYGWIN 'collision detection on fallback to sequential clone w/ filter' '
    -+	git ls-tree --name-only -r collisions >files &&
    -+	nr_files=$(wc -l <files) &&
    -+	threshold=$(($nr_files + 1)) &&
    -+	clone_and_check_collision sequential-with-filter 2 $threshold 0 use_filter
    -+'
    ++		test_cmp ${mode}_with_filter/filter.log expected.filter.log
    ++	'
    ++done
     +
     +test_done
18:  ece38f0483 = 18:  b26f676cae parallel-checkout: add tests related to .gitattributes
19:  b4cb5905d2 ! 19:  641c61f9b6 ci: run test round with parallel-checkout enabled
    @@ t/lib-parallel-checkout.sh
     +
      # Runs `git -c checkout.workers=$1 -c checkout.thesholdForParallelism=$2 ${@:4}`
      # and checks that the number of workers spawned is equal to $3.
    - git_pc()
    -
    - ## t/t2081-parallel-checkout-collisions.sh ##
    -@@
    - test_description='parallel-checkout collisions'
    - 
    - . ./test-lib.sh
    -+. "$TEST_DIRECTORY/lib-parallel-checkout.sh"
    - 
    - # When there are pathname collisions during a clone, Git should report a warning
    - # listing all of the colliding entries. The sequential code detects a collision
    + #
-- 
2.28.0



Thread overview: 154+ messages
2020-08-10 21:33 [RFC PATCH 00/21] [RFC] Parallel checkout Matheus Tavares
2020-08-10 21:33 ` [RFC PATCH 01/21] convert: make convert_attrs() and convert structs public Matheus Tavares
2020-08-10 21:33 ` [RFC PATCH 02/21] convert: add [async_]convert_to_working_tree_ca() variants Matheus Tavares
2020-08-10 21:33 ` [RFC PATCH 03/21] convert: add get_stream_filter_ca() variant Matheus Tavares
2020-08-10 21:33 ` [RFC PATCH 04/21] convert: add conv_attrs classification Matheus Tavares
2020-08-10 21:33 ` [RFC PATCH 05/21] entry: extract a header file for entry.c functions Matheus Tavares
2020-08-10 21:33 ` [RFC PATCH 06/21] entry: make fstat_output() and read_blob_entry() public Matheus Tavares
2020-08-10 21:33 ` [RFC PATCH 07/21] entry: extract cache_entry update from write_entry() Matheus Tavares
2020-08-10 21:33 ` [RFC PATCH 08/21] entry: move conv_attrs lookup up to checkout_entry() Matheus Tavares
2020-08-10 21:33 ` [RFC PATCH 09/21] entry: add checkout_entry_ca() which takes preloaded conv_attrs Matheus Tavares
2020-08-10 21:33 ` [RFC PATCH 10/21] unpack-trees: add basic support for parallel checkout Matheus Tavares
2020-08-10 21:33 ` [RFC PATCH 11/21] parallel-checkout: make it truly parallel Matheus Tavares
2020-08-19 21:34   ` Jeff Hostetler
2020-08-20  1:33     ` Matheus Tavares Bernardino
2020-08-20 14:39       ` Jeff Hostetler
2020-08-10 21:33 ` [RFC PATCH 12/21] parallel-checkout: add configuration options Matheus Tavares
2020-08-10 21:33 ` [RFC PATCH 13/21] parallel-checkout: support progress displaying Matheus Tavares
2020-08-10 21:33 ` [RFC PATCH 14/21] make_transient_cache_entry(): optionally alloc from mem_pool Matheus Tavares
2020-08-10 21:33 ` [RFC PATCH 15/21] builtin/checkout.c: complete parallel checkout support Matheus Tavares
2020-08-10 21:33 ` [RFC PATCH 16/21] checkout-index: add " Matheus Tavares
2020-08-10 21:33 ` [RFC PATCH 17/21] parallel-checkout: avoid stat() calls in workers Matheus Tavares
2020-08-10 21:33 ` [RFC PATCH 18/21] entry: use is_dir_sep() when checking leading dirs Matheus Tavares
2020-08-10 21:33 ` [RFC PATCH 19/21] symlinks: make has_dirs_only_path() track FL_NOENT Matheus Tavares
2020-08-10 21:33 ` [RFC PATCH 20/21] parallel-checkout: create leading dirs in workers Matheus Tavares
2020-08-10 21:33 ` [RFC PATCH 21/21] parallel-checkout: skip checking the working tree on clone Matheus Tavares
2020-08-12 16:57 ` [RFC PATCH 00/21] [RFC] Parallel checkout Jeff Hostetler
2020-09-22 22:49 ` [PATCH v2 00/19] Parallel Checkout (part I) Matheus Tavares
2020-09-22 22:49   ` [PATCH v2 01/19] convert: make convert_attrs() and convert structs public Matheus Tavares
2020-09-22 22:49   ` [PATCH v2 02/19] convert: add [async_]convert_to_working_tree_ca() variants Matheus Tavares
2020-09-22 22:49   ` [PATCH v2 03/19] convert: add get_stream_filter_ca() variant Matheus Tavares
2020-09-22 22:49   ` [PATCH v2 04/19] convert: add conv_attrs classification Matheus Tavares
2020-09-22 22:49   ` [PATCH v2 05/19] entry: extract a header file for entry.c functions Matheus Tavares
2020-09-22 22:49   ` [PATCH v2 06/19] entry: make fstat_output() and read_blob_entry() public Matheus Tavares
2020-09-22 22:49   ` [PATCH v2 07/19] entry: extract cache_entry update from write_entry() Matheus Tavares
2020-09-22 22:49   ` [PATCH v2 08/19] entry: move conv_attrs lookup up to checkout_entry() Matheus Tavares
2020-10-01 15:53     ` Jeff Hostetler
2020-10-01 15:59       ` Jeff Hostetler
2020-09-22 22:49   ` [PATCH v2 09/19] entry: add checkout_entry_ca() which takes preloaded conv_attrs Matheus Tavares
2020-09-22 22:49   ` [PATCH v2 10/19] unpack-trees: add basic support for parallel checkout Matheus Tavares
2020-10-05  6:17     ` [PATCH] parallel-checkout: drop unused checkout state parameter Jeff King
2020-10-05 13:13       ` Matheus Tavares Bernardino
2020-10-05 13:45         ` Jeff King
2020-09-22 22:49   ` [PATCH v2 11/19] parallel-checkout: make it truly parallel Matheus Tavares
2020-09-29 19:52     ` Martin Ågren
2020-09-30 14:02       ` Matheus Tavares Bernardino
2020-09-22 22:49   ` [PATCH v2 12/19] parallel-checkout: support progress displaying Matheus Tavares
2020-09-22 22:49   ` [PATCH v2 13/19] make_transient_cache_entry(): optionally alloc from mem_pool Matheus Tavares
2020-09-22 22:49   ` [PATCH v2 14/19] builtin/checkout.c: complete parallel checkout support Matheus Tavares
2020-09-22 22:49   ` [PATCH v2 15/19] checkout-index: add " Matheus Tavares
2020-09-22 22:49   ` [PATCH v2 16/19] parallel-checkout: add tests for basic operations Matheus Tavares
2020-10-20  1:35     ` Jonathan Nieder
2020-10-20  2:55       ` Taylor Blau
2020-10-20 13:18         ` Matheus Tavares Bernardino
2020-10-20 19:09           ` Junio C Hamano
2020-10-20  3:18       ` Matheus Tavares Bernardino
2020-10-20  4:16         ` Jonathan Nieder
2020-10-20 19:14         ` Junio C Hamano
2020-09-22 22:49   ` [PATCH v2 17/19] parallel-checkout: add tests related to clone collisions Matheus Tavares
2020-09-22 22:49   ` [PATCH v2 18/19] parallel-checkout: add tests related to .gitattributes Matheus Tavares
2020-09-22 22:49   ` [PATCH v2 19/19] ci: run test round with parallel-checkout enabled Matheus Tavares
2020-10-29  2:14   ` Matheus Tavares [this message]
2020-10-29  2:14     ` [PATCH v3 01/19] convert: make convert_attrs() and convert structs public Matheus Tavares
2020-10-29 23:40       ` Junio C Hamano
2020-10-30 17:01         ` Matheus Tavares Bernardino
2020-10-30 17:38           ` Junio C Hamano
2020-10-29  2:14     ` [PATCH v3 02/19] convert: add [async_]convert_to_working_tree_ca() variants Matheus Tavares
2020-10-29 23:48       ` Junio C Hamano
2020-10-29  2:14     ` [PATCH v3 03/19] convert: add get_stream_filter_ca() variant Matheus Tavares
2020-10-29 23:51       ` Junio C Hamano
2020-10-29  2:14     ` [PATCH v3 04/19] convert: add conv_attrs classification Matheus Tavares
2020-10-29 23:53       ` Junio C Hamano
2020-10-29  2:14     ` [PATCH v3 05/19] entry: extract a header file for entry.c functions Matheus Tavares
2020-10-30 21:36       ` Junio C Hamano
2020-10-29  2:14     ` [PATCH v3 06/19] entry: make fstat_output() and read_blob_entry() public Matheus Tavares
2020-10-29  2:14     ` [PATCH v3 07/19] entry: extract cache_entry update from write_entry() Matheus Tavares
2020-10-29  2:14     ` [PATCH v3 08/19] entry: move conv_attrs lookup up to checkout_entry() Matheus Tavares
2020-10-30 21:58       ` Junio C Hamano
2020-10-29  2:14     ` [PATCH v3 09/19] entry: add checkout_entry_ca() which takes preloaded conv_attrs Matheus Tavares
2020-10-30 22:02       ` Junio C Hamano
2020-10-29  2:14     ` [PATCH v3 10/19] unpack-trees: add basic support for parallel checkout Matheus Tavares
2020-11-02 19:35       ` Junio C Hamano
2020-11-03  3:48         ` Matheus Tavares Bernardino
2020-10-29  2:14     ` [PATCH v3 11/19] parallel-checkout: make it truly parallel Matheus Tavares
2020-10-29  2:14     ` [PATCH v3 12/19] parallel-checkout: support progress displaying Matheus Tavares
2020-10-29  2:14     ` [PATCH v3 13/19] make_transient_cache_entry(): optionally alloc from mem_pool Matheus Tavares
2020-10-29  2:14     ` [PATCH v3 14/19] builtin/checkout.c: complete parallel checkout support Matheus Tavares
2020-10-29  2:14     ` [PATCH v3 15/19] checkout-index: add " Matheus Tavares
2020-10-29  2:14     ` [PATCH v3 16/19] parallel-checkout: add tests for basic operations Matheus Tavares
2020-10-29  2:14     ` [PATCH v3 17/19] parallel-checkout: add tests related to clone collisions Matheus Tavares
2020-10-29  2:14     ` [PATCH v3 18/19] parallel-checkout: add tests related to .gitattributes Matheus Tavares
2020-10-29  2:14     ` [PATCH v3 19/19] ci: run test round with parallel-checkout enabled Matheus Tavares
2020-10-29 19:48     ` [PATCH v3 00/19] Parallel Checkout (part I) Junio C Hamano
2020-10-30 15:58     ` Jeff Hostetler
2020-11-04 20:32     ` [PATCH v4 " Matheus Tavares
2020-11-04 20:33       ` [PATCH v4 01/19] convert: make convert_attrs() and convert structs public Matheus Tavares
2020-12-05 10:40         ` Christian Couder
2020-12-05 21:53           ` Matheus Tavares Bernardino
2020-11-04 20:33       ` [PATCH v4 02/19] convert: add [async_]convert_to_working_tree_ca() variants Matheus Tavares
2020-12-05 11:10         ` Christian Couder
2020-12-05 22:20           ` Matheus Tavares Bernardino
2020-11-04 20:33       ` [PATCH v4 03/19] convert: add get_stream_filter_ca() variant Matheus Tavares
2020-12-05 11:45         ` Christian Couder
2020-11-04 20:33       ` [PATCH v4 04/19] convert: add conv_attrs classification Matheus Tavares
2020-12-05 12:07         ` Christian Couder
2020-12-05 22:08           ` Matheus Tavares Bernardino
2020-11-04 20:33       ` [PATCH v4 05/19] entry: extract a header file for entry.c functions Matheus Tavares
2020-12-06  8:31         ` Christian Couder
2020-11-04 20:33       ` [PATCH v4 06/19] entry: make fstat_output() and read_blob_entry() public Matheus Tavares
2020-11-04 20:33       ` [PATCH v4 07/19] entry: extract cache_entry update from write_entry() Matheus Tavares
2020-12-06  8:53         ` Christian Couder
2020-11-04 20:33       ` [PATCH v4 08/19] entry: move conv_attrs lookup up to checkout_entry() Matheus Tavares
2020-12-06  9:35         ` Christian Couder
2020-12-07 13:52           ` Matheus Tavares Bernardino
2020-11-04 20:33       ` [PATCH v4 09/19] entry: add checkout_entry_ca() which takes preloaded conv_attrs Matheus Tavares
2020-12-06 10:02         ` Christian Couder
2020-12-07 16:47           ` Matheus Tavares Bernardino
2020-11-04 20:33       ` [PATCH v4 10/19] unpack-trees: add basic support for parallel checkout Matheus Tavares
2020-12-06 11:36         ` Christian Couder
2020-12-07 19:06           ` Matheus Tavares Bernardino
2020-11-04 20:33       ` [PATCH v4 11/19] parallel-checkout: make it truly parallel Matheus Tavares
2020-12-16 22:31         ` Emily Shaffer
2020-12-17 15:00           ` Matheus Tavares Bernardino
2020-11-04 20:33       ` [PATCH v4 12/19] parallel-checkout: support progress displaying Matheus Tavares
2020-11-04 20:33       ` [PATCH v4 13/19] make_transient_cache_entry(): optionally alloc from mem_pool Matheus Tavares
2020-11-04 20:33       ` [PATCH v4 14/19] builtin/checkout.c: complete parallel checkout support Matheus Tavares
2020-11-04 20:33       ` [PATCH v4 15/19] checkout-index: add " Matheus Tavares
2020-11-04 20:33       ` [PATCH v4 16/19] parallel-checkout: add tests for basic operations Matheus Tavares
2020-11-04 20:33       ` [PATCH v4 17/19] parallel-checkout: add tests related to clone collisions Matheus Tavares
2020-11-04 20:33       ` [PATCH v4 18/19] parallel-checkout: add tests related to .gitattributes Matheus Tavares
2020-11-04 20:33       ` [PATCH v4 19/19] ci: run test round with parallel-checkout enabled Matheus Tavares
2020-12-16 14:50       ` [PATCH v5 0/9] Parallel Checkout (part I) Matheus Tavares
2020-12-16 14:50         ` [PATCH v5 1/9] convert: make convert_attrs() and convert structs public Matheus Tavares
2020-12-16 14:50         ` [PATCH v5 2/9] convert: add [async_]convert_to_working_tree_ca() variants Matheus Tavares
2020-12-16 14:50         ` [PATCH v5 3/9] convert: add get_stream_filter_ca() variant Matheus Tavares
2020-12-16 14:50         ` [PATCH v5 4/9] convert: add classification for conv_attrs struct Matheus Tavares
2020-12-16 14:50         ` [PATCH v5 5/9] entry: extract a header file for entry.c functions Matheus Tavares
2020-12-16 14:50         ` [PATCH v5 6/9] entry: make fstat_output() and read_blob_entry() public Matheus Tavares
2020-12-16 14:50         ` [PATCH v5 7/9] entry: extract update_ce_after_write() from write_entry() Matheus Tavares
2020-12-16 14:50         ` [PATCH v5 8/9] entry: move conv_attrs lookup up to checkout_entry() Matheus Tavares
2020-12-16 14:50         ` [PATCH v5 9/9] entry: add checkout_entry_ca() taking preloaded conv_attrs Matheus Tavares
2020-12-16 15:27         ` [PATCH v5 0/9] Parallel Checkout (part I) Christian Couder
2020-12-17  1:11         ` Junio C Hamano
2021-03-23 14:19         ` [PATCH v6 0/9] Parallel Checkout (part 1) Matheus Tavares
2021-03-23 14:19           ` [PATCH v6 1/9] convert: make convert_attrs() and convert structs public Matheus Tavares
2021-03-23 14:19           ` [PATCH v6 2/9] convert: add [async_]convert_to_working_tree_ca() variants Matheus Tavares
2021-03-23 14:19           ` [PATCH v6 3/9] convert: add get_stream_filter_ca() variant Matheus Tavares
2021-03-23 14:19           ` [PATCH v6 4/9] convert: add classification for conv_attrs struct Matheus Tavares
2021-03-23 14:19           ` [PATCH v6 5/9] entry: extract a header file for entry.c functions Matheus Tavares
2021-03-23 14:19           ` [PATCH v6 6/9] entry: make fstat_output() and read_blob_entry() public Matheus Tavares
2021-03-23 14:19           ` [PATCH v6 7/9] entry: extract update_ce_after_write() from write_entry() Matheus Tavares
2021-03-23 14:19           ` [PATCH v6 8/9] entry: move conv_attrs lookup up to checkout_entry() Matheus Tavares
2021-03-23 14:19           ` [PATCH v6 9/9] entry: add checkout_entry_ca() taking preloaded conv_attrs Matheus Tavares
2021-03-23 17:34           ` [PATCH v6 0/9] Parallel Checkout (part 1) Junio C Hamano
2020-10-01 16:42 ` [RFC PATCH 00/21] [RFC] Parallel checkout Jeff Hostetler
