git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Ben Peart <peartben@gmail.com>
To: Eric Sunshine <sunshine@sunshineco.com>,
	Ben Peart <Ben.Peart@microsoft.com>
Cc: Git List <git@vger.kernel.org>, Junio C Hamano <gitster@pobox.com>
Subject: Re: [PATCH v1] checkout: optionally speed up "git checkout -b foo"
Date: Tue, 24 Jul 2018 15:45:19 -0400	[thread overview]
Message-ID: <960c32c0-5b10-6854-73c9-392814dddbb3@gmail.com> (raw)
In-Reply-To: <CAPig+cQZ4g-6uT3zB0n2XWb-68DUUBZdaimTb6_Y4DNZrLUdyQ@mail.gmail.com>



On 7/24/2018 2:42 PM, Eric Sunshine wrote:
> On Tue, Jul 24, 2018 at 2:01 PM Ben Peart <Ben.Peart@microsoft.com> wrote:
>> If the new core.optimizecheckout config setting is set to true, speed up
> 
> Maybe:
> 
>      Add core.optimizeCheckout config setting which, when true, speeds up
> 

Sure

>> "git checkout -b foo" by avoiding the work to merge the working tree.  This
>> is valid because no merge needs to occur - only creating the new branch/
>> updating the refs. Any other options force it through the old code path.
>>
>> This change in behavior is off by default and behind the config setting so
>> that users have to opt-in to the optimized behavior.
>>
>> We've been running with this patch internally for a long time but it was
>> rejected when I submitted it to the mailing list before because it
>> implicitly changes the behavior of checkout -b. Trying it again configured
>> behind a config setting as a potential solution for other optimizations to
>> checkout that could change the behavior as well.
> 
> This paragraph is mere commentary which probably belongs below the
> "---" line following your sign-off.
> 

Hopefully this commentary (I'll move it below the --- line) is clearer:

We've been running with this patch internally for a long time but it was
rejected when I submitted it to the mailing list before [1] because it
implicitly changes the behavior of checkout -b as it no longer updates
the working directory.

I'm submitting it again behind a config setting so that it doesn't cause
any back compat issues unless the user explicitly opts in to the new
behavior. My hope is this same setting and model can be used if/when we
make other performance optimizations to checkout like using the cache
tree to avoid having to traverse the entire tree being discussed [2].

[1] 
https://public-inbox.org/git/20160909192520.4812-1-benpeart@microsoft.com/
[2] 
https://public-inbox.org/git/20180724042740.GB13248@sigill.intra.peff.net/T/#m75afe3ab318d23f36334cf3a6e3d058839592469

>> https://public-inbox.org/git/20180724042740.GB13248@sigill.intra.peff.net/T/#m75afe3ab318d23f36334cf3a6e3d058839592469
> 
> Is this link meant to reference the previous attempt of optimizing
> "checkout -b"? Although there's a single mention of "checkout -b" in
> that discussion, it doesn't seem to be the previous attempt or explain
> why it was rejected.
> 
> It would be quite nice to see a discussion in both the commit message
> and the documentation about the pros and cons of enabling this
> optimization. That it was previously rejected suggests that there may
> be serious or unexpected consequences. How will a typical user know
> whether its use is desirable or not?
> 
>> Signed-off-by: Ben Peart <Ben.Peart@microsoft.com>
>> ---
>> diff --git a/Documentation/config.txt b/Documentation/config.txt
>> @@ -911,6 +911,12 @@ core.commitGraph::
>> +core.optimizedCheckout
>> +       Speed up "git checkout -b foo" by skipping much of the work of a
>> +       full checkout command.  This changs the behavior as it will skip
> 
> s/changs/changes/
> 
>> +       merging the trees and updating the index and instead only create
>> +       and switch to the new ref.
>> diff --git a/builtin/checkout.c b/builtin/checkout.c
>> @@ -471,6 +475,88 @@ static void setup_branch_path(struct branch_info *branch)
>> +static int needs_working_tree_merge(const struct checkout_opts *opts,
>> +       const struct branch_info *old_branch_info,
>> +       const struct branch_info *new_branch_info)
>> +{
>> +       /*
>> +        * We must do the merge if we are actually moving to a new
>> +        * commit tree.
>> +        */
>> +       if (!old_branch_info->commit || !new_branch_info->commit ||
>> +               oidcmp(&old_branch_info->commit->object.oid, &new_branch_info->commit->object.oid))
>> +               return 1;
>> +       [...]
>> +       return 0;
>> +}
> 
> This long list of special-case checks doesn't leave me too enthused,
> however, that aside, this approach seems backward. Rather than erring
> on the side of safety by falling back to the merging behavior, it errs
> in the other direction, which may be a problem if this list of
> special-case checks ever gets out of sync with 'checkout_opts'. That
> is, if someone adds a new option which ought to employ the merging
> behavior, but forgets to update this function, then this function will
> incorrectly default to using the optimization.
> 

I'm not thrilled with the long list either (the plethora of comments 
probably makes it appear worse than it is) but I don't see how flipping 
the logic around makes it fail if someone adds a new option.  The "meets 
all criteria for optimization" code can only test existing options.

> A safer approach would be the inverse, namely:
> 
>      static int skip_worktree_merge(...)
>      {
>          if (...meets all criteria for optimization...)
>              return 1;
>          return 0;
>      }
> 
>>   static int merge_working_tree(const struct checkout_opts *opts,
>>                                struct branch_info *old_branch_info,
>>                                struct branch_info *new_branch_info,
>> {
>> +       /*
>> +        * Skip merging the trees, updating the index, and work tree only if we
>> +        * are simply creating a new branch via "git checkout -b foo."  Any
>> +        * other options or usage will continue to do all these steps.
>> +        */
>> +       if (core_optimize_checkout && !needs_working_tree_merge(opts, old_branch_info, new_branch_info))
>> +               return 0;
> 
> This seems a somewhat odd place to hook in this optimization,
> especially as there is only a single caller of this function. Instead,
> one might expect the caller itself to make this judgment and avoid
> trying the merge in the first place if not needed. That is, in
> switch_branches:
> 
>      if (!skip_worktree_merge(...))
>          ret = merge_working_tree(...);
> 

I personally agree, it was moved to its current location per feedback 
the first time around.  Perhaps with the addition of the config setting 
it will be better received moved out to the caller.

  reply	other threads:[~2018-07-24 19:45 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-07-24 18:01 [PATCH v1] checkout: optionally speed up "git checkout -b foo" Ben Peart
2018-07-24 18:42 ` Eric Sunshine
2018-07-24 19:45   ` Ben Peart [this message]
2018-07-26 15:04     ` Junio C Hamano
2018-07-26 18:59       ` Eric Sunshine
2018-07-26 19:08         ` Eric Sunshine
2018-07-24 19:21 ` Junio C Hamano
2018-07-24 20:47   ` Ben Peart
2018-07-31 16:39 ` [PATCH v2] checkout: optimize "git checkout -b <new_branch>" Ben Peart
2018-07-31 20:01   ` Junio C Hamano
2018-08-01 15:10   ` Duy Nguyen
2018-08-02 18:02     ` Ben Peart
2018-08-03 15:58       ` Duy Nguyen
2018-08-06 14:25         ` Ben Peart
2018-08-15 21:05           ` Ben Peart
2018-08-05  8:57       ` Duy Nguyen
2018-08-16 18:27 ` [PATCH v3] " Ben Peart
2018-08-16 18:37   ` Duy Nguyen
2018-08-17 12:37     ` Ben Peart
2018-08-19  1:44       ` Elijah Newren
2018-08-20 13:40         ` Ben Peart
2018-08-20 18:16           ` Elijah Newren
2018-08-21 14:51             ` Duy Nguyen
2018-08-30 17:22               ` Elijah Newren
2018-09-04 16:46                 ` Duy Nguyen
2018-08-20 18:31         ` Junio C Hamano
2018-09-18  5:34   ` [PATCH] config doc: add missing list separator for checkout.optimizeNewBranch Ævar Arnfjörð Bjarmason
2018-09-18 16:57     ` Taylor Blau
2018-09-18 17:16       ` Jeff King
2018-09-18 17:20         ` Taylor Blau
2018-09-18 17:13     ` Jeff King
2018-09-19  4:41       ` Ævar Arnfjörð Bjarmason

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=960c32c0-5b10-6854-73c9-392814dddbb3@gmail.com \
    --to=peartben@gmail.com \
    --cc=Ben.Peart@microsoft.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=sunshine@sunshineco.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).