All of lore.kernel.org
 help / color / mirror / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: Ben Peart <peartben@gmail.com>
Cc: benpeart@microsoft.com, git@vger.kernel.org
Subject: Re: [PATCH v1] add: speed up cmd_add() by utilizing read_cache_preload()
Date: Sat, 03 Nov 2018 00:23:32 +0900	[thread overview]
Message-ID: <xmqqy3abo64r.fsf@gitster-ct.c.googlers.com> (raw)
In-Reply-To: <20181102133050.10756-1-peartben@gmail.com> (Ben Peart's message of "Fri, 2 Nov 2018 09:30:50 -0400")

Ben Peart <peartben@gmail.com> writes:

> From: Ben Peart <benpeart@microsoft.com>
>
> During an "add", a call is made to run_diff_files() which calls
> check_remove() for each index-entry.  The preload_index() code
> distributes some of the costs across multiple threads.

Nice.  I peeked around and noticed that we already do this in
builtin_diff_index() before running run_diff_index() when !cached,
and builtin_diff_files(), of course.

> Because the files checked are restricted to pathspec, adding individual
> files makes no measurable impact but on a Windows repo with ~200K files,
> 'git add .' drops from 6.3 seconds to 3.3 seconds for a 47% savings.

;-)

> diff --git a/builtin/add.c b/builtin/add.c
> index ad49806ebf..f65c172299 100644
> --- a/builtin/add.c
> +++ b/builtin/add.c
> @@ -445,11 +445,6 @@ int cmd_add(int argc, const char **argv, const char *prefix)
>  		return 0;
>  	}
>  
> -	if (read_cache() < 0)
> -		die(_("index file corrupt"));
> -
> -	die_in_unpopulated_submodule(&the_index, prefix);
> -
>  	/*
>  	 * Check the "pathspec '%s' did not match any files" block
>  	 * below before enabling new magic.

It is not explained why this is not a mere s/read_cache/&_preload/
in the log message.  I can see it is because you wanted to make the
pathspec available to preload to further cut down the preloaded
paths, and I do not think it has any unintended (negatie) side
effect to parse the pathspec before populating the in-core index,
but that would have been a good thing to mention in the proposed log
message.

> @@ -459,6 +454,10 @@ int cmd_add(int argc, const char **argv, const char *prefix)
>  		       PATHSPEC_SYMLINK_LEADING_PATH,
>  		       prefix, argv);
>  
> +	if (read_cache_preload(&pathspec) < 0)
> +		die(_("index file corrupt"));
> +
> +	die_in_unpopulated_submodule(&the_index, prefix);
>  	die_path_inside_submodule(&the_index, &pathspec);
>  
>  	if (add_new_files) {
>
> base-commit: 4ede3d42dfb57f9a41ac96a1f216c62eb7566cc2

  reply	other threads:[~2018-11-02 15:23 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-11-02 13:30 [PATCH v1] add: speed up cmd_add() by utilizing read_cache_preload() Ben Peart
2018-11-02 15:23 ` Junio C Hamano [this message]
2018-11-02 16:14   ` Ben Peart
2018-11-02 15:49 ` Duy Nguyen
2018-11-03  0:38   ` Junio C Hamano
2018-11-03  4:47     ` Duy Nguyen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqqy3abo64r.fsf@gitster-ct.c.googlers.com \
    --to=gitster@pobox.com \
    --cc=benpeart@microsoft.com \
    --cc=git@vger.kernel.org \
    --cc=peartben@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.