git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Ben Peart <peartben@gmail.com>
To: Duy Nguyen <pclouds@gmail.com>
Cc: Ben Peart <Ben.Peart@microsoft.com>,
	Git Mailing List <git@vger.kernel.org>,
	Junio C Hamano <gitster@pobox.com>, Jeff King <peff@peff.net>
Subject: Re: [PATCH v2 0/4] Speed up unpack_trees()
Date: Thu, 9 Aug 2018 04:16:03 -0400	[thread overview]
Message-ID: <eb39eecf-81b0-e937-d686-47b7565d6511@gmail.com> (raw)
In-Reply-To: <ccec34c9-b81a-bcb4-7d05-48dccc059cc8@gmail.com>



On 8/8/2018 4:53 PM, Ben Peart wrote:
> 
> 
> On 8/1/2018 12:38 PM, Duy Nguyen wrote:
>> On Tue, Jul 31, 2018 at 01:31:31PM -0400, Ben Peart wrote:
>>>
>>>
>>> On 7/31/2018 12:50 PM, Ben Peart wrote:
>>>>
>>>>
>>>> On 7/31/2018 11:31 AM, Duy Nguyen wrote:
>>>
>>>>>
>>>>>> In the performance game of whack-a-mole, that call to repair 
>>>>>> cache-tree
>>>>>> is now looking quite expensive...
>>>>>
>>>>> Yeah and I think we can whack that mole too. I did some measurement.
>>>>> Best case possible, we just need to scan through two indexes (one with
>>>>> many good cache-tree, one with no cache-tree), compare and copy
>>>>> cache-tree over. The scanning takes like 1% time of current repair
>>>>> step and I suspect it's the hashing that takes most of the time. Of
>>>>> course real world won't have such nice numbers, but I guess we could
>>>>> maybe half cache-tree update/repair time.
>>>>>
>>>>
>>>> I have some great profiling tools available so will take a look at this
>>>> next and see exactly where the time is being spent.
>>>
>>> Good instincts.  In cache_tree_update, the heavy hitter is definitely
>>> hash_object_file followed by has_object_file.
>>>
>>> Name                                   Inc %         Inc
>>> + git!cache_tree_update                 12.4       4,935
>>> |+ git!update_one                       11.8       4,706
>>> | + git!update_one                      11.8       4,706
>>> |  + git!hash_object_file                6.1       2,406
>>> |  + git!has_object_file                 2.0         813
>>> |  + OTHER <<vcruntime140d!strchr>>      0.5         203
>>> |  + git!strbuf_addf                     0.4         155
>>> |  + git!strbuf_release                  0.4         143
>>> |  + git!strbuf_add                      0.3         121
>>> |  + OTHER <<vcruntime140d!memcmp>>      0.2          93
>>> |  + git!strbuf_grow                     0.1          25
>>
>> Ben, if you work on this, this could be a good starting point. I will
>> not work on this because I still have some other things to catch up
>> and follow through. You can have my sign off if you reuse something
>> from this patch
>>
>> Even if it's a naive implementation, the initial numbers look pretty
>> good. Without the patch we have
>>
>> 18:31:05.970621 unpack-trees.c:1437     performance: 0.000001029 s: copy
>> 18:31:05.975729 unpack-trees.c:1444     performance: 0.005082004 s: 
>> update
>>
>> And with the patch
>>
>> 18:31:13.295655 unpack-trees.c:1437     performance: 0.000198017 s: copy
>> 18:31:13.296757 unpack-trees.c:1444     performance: 0.001075935 s: 
>> update
>>
>> Time saving is about 80% by the look of this (best possible case
>> because only the top tree needs to be hashed and written out).
>>
>> -- 8< --
>> diff --git a/cache-tree.c b/cache-tree.c
>> index 6b46711996..67a4a93100 100644
>> --- a/cache-tree.c
>> +++ b/cache-tree.c
>> @@ -440,6 +440,147 @@ int cache_tree_update(struct index_state 
>> *istate, int flags)
>>       return 0;
>>   }
>> +static int same(const struct cache_entry *a, const struct cache_entry 
>> *b)
>> +{
>> +    if (ce_stage(a) || ce_stage(b))
>> +        return 0;
>> +    if ((a->ce_flags | b->ce_flags) & CE_CONFLICTED)
>> +        return 0;
>> +    return a->ce_mode == b->ce_mode &&
>> +           !oidcmp(&a->oid, &b->oid);
>> +}
>> +
>> +static int cache_tree_name_pos(const struct index_state *istate,
>> +                   const struct strbuf *path)
>> +{
>> +    int pos;
>> +
>> +    if (!path->len)
>> +        return 0;
>> +
>> +    pos = index_name_pos(istate, path->buf, path->len);
>> +    if (pos >= 0)
>> +        BUG("No no no, directory path must not exist in index");
>> +    return -pos - 1;
>> +}
>> +
>> +/*
>> + * Locate the same cache-tree in two separate indexes. Check the
>> + * cache-tree is still valid for the "to" index (i.e. it contains the
>> + * same set of entries in the "from" index).
>> + */
>> +static int verify_one_cache_tree(const struct index_state *to,
>> +                 const struct index_state *from,
>> +                 const struct cache_tree *it,
>> +                 const struct strbuf *path)
>> +{
>> +    int i, spos, dpos;
>> +
>> +    spos = cache_tree_name_pos(from, path);
>> +    if (spos + it->entry_count > from->cache_nr)
>> +        return -1;
>> +
>> +    dpos = cache_tree_name_pos(to, path);
>> +    if (dpos + it->entry_count > to->cache_nr)
>> +        return -1;
>> +
>> +    /* Can we quickly check head and tail and bail out early */
>> +    if (!same(from->cache[spos], to->cache[spos]) ||
>> +        !same(from->cache[spos + it->entry_count - 1],
>> +          to->cache[spos + it->entry_count - 1]))
>> +        return -1;
>> +
>> +    for (i = 1; i < it->entry_count - 1; i++)
>> +        if (!same(from->cache[spos + i],
>> +              to->cache[dpos + i]))
>> +            return -1;
>> +
>> +    return 0;
>> +}
>> +
>> +static int verify_and_invalidate(struct index_state *to,
>> +                 const struct index_state *from,
>> +                 struct cache_tree *it,
>> +                 struct strbuf *path)
>> +{
>> +    /*
>> +     * Optimistically verify the current tree first. Alternatively
>> +     * we could verify all the subtrees first then do this
>> +     * last. Any invalid subtree would also invalidates its
>> +     * ancestors.
>> +     */
>> +    if (it->entry_count != -1 &&
>> +        verify_one_cache_tree(to, from, it, path))
>> +        it->entry_count = -1;
>> +
>> +    /*
>> +     * If the current tree is valid, don't bother checking
>> +     * inside. All subtrees _should_ also be valid
>> +     */
>> +    if (it->entry_count == -1) {
>> +        int i, len = path->len;
>> +
>> +        for (i = 0; i < it->subtree_nr; i++) {
>> +            struct cache_tree_sub *down = it->down[i];
>> +
>> +            if (!down || !down->cache_tree)
>> +                continue;
>> +
>> +            strbuf_setlen(path, len);
>> +            strbuf_add(path, down->name, down->namelen);
>> +            strbuf_addch(path, '/');
>> +            if (verify_and_invalidate(to, from,
>> +                          down->cache_tree, path))
>> +                return -1;
>> +        }
>> +        strbuf_setlen(path, len);
>> +    }
>> +    return 0;
>> +}
>> +
>> +static struct cache_tree *duplicate_cache_tree(const struct 
>> cache_tree *src)
>> +{
>> +    struct cache_tree *dst;
>> +    int i;
>> +
>> +    if (!src)
>> +        return NULL;
>> +
>> +    dst = xmalloc(sizeof(*dst));
>> +    dst->entry_count = src->entry_count;
>> +    oidcpy(&dst->oid, &src->oid);
>> +    dst->subtree_nr = src->subtree_nr;
>> +    dst->subtree_alloc = dst->subtree_nr;
>> +    ALLOC_ARRAY(dst->down, dst->subtree_alloc);
>> +    for (i = 0; i < src->subtree_nr; i++) {
>> +        struct cache_tree_sub *dsrc = src->down[i];
>> +        struct cache_tree_sub *down;
>> +
>> +        FLEX_ALLOC_MEM(down, name, dsrc->name, dsrc->namelen);
>> +        down->count = dsrc->count;
>> +        down->namelen = dsrc->namelen;
>> +        down->used = dsrc->used;
>> +        down->cache_tree = duplicate_cache_tree(dsrc->cache_tree);
>> +        dst->down[i] = down;
>> +    }
>> +    return dst;
>> +}
>> +
>> +int cache_tree_copy(struct index_state *to, const struct index_state 
>> *from)
>> +{
>> +    struct cache_tree *it = duplicate_cache_tree(from->cache_tree);
>> +    struct strbuf path = STRBUF_INIT;
>> +    int ret;
>> +
>> +    if (to->cache_tree)
>> +        BUG("Sorry merging cache-tree is not supported yet");
>> +    ret = verify_and_invalidate(to, from, it, &path);
>> +    to->cache_tree = it;
>> +    to->cache_changed |= CACHE_TREE_CHANGED;
>> +    strbuf_release(&path);
>> +    return ret;
>> +}
>> +
>>   static void write_one(struct strbuf *buffer, struct cache_tree *it,
>>                         const char *path, int pathlen)
>>   {
>> diff --git a/cache-tree.h b/cache-tree.h
>> index cfd5328cc9..6981da8e0d 100644
>> --- a/cache-tree.h
>> +++ b/cache-tree.h
>> @@ -53,4 +53,6 @@ void prime_cache_tree(struct index_state *, struct 
>> tree *);
>>   extern int cache_tree_matches_traversal(struct cache_tree *, struct 
>> name_entry *ent, struct traverse_info *info);
>> +int cache_tree_copy(struct index_state *to, const struct index_state 
>> *from);
>> +
>>   #endif
>> diff --git a/unpack-trees.c b/unpack-trees.c
>> index cd0680f11e..cb3fdd42a6 100644
>> --- a/unpack-trees.c
>> +++ b/unpack-trees.c
>> @@ -1427,12 +1427,22 @@ int unpack_trees(unsigned len, struct 
>> tree_desc *t, struct unpack_trees_options
>>       ret = check_updates(o) ? (-2) : 0;
>>       if (o->dst_index) {
>>           if (!ret) {
>> -            if (!o->result.cache_tree)
>> +            if (!o->result.cache_tree) {
>> +                uint64_t start = getnanotime();
>> +#if 0
>>                   o->result.cache_tree = cache_tree();
>> -            if (!cache_tree_fully_valid(o->result.cache_tree))
>> +#else
>> +                cache_tree_copy(&o->result, o->src_index);
>> +#endif
>> +                trace_performance_since(start, "copy");
>> +            }
>> +            if (!cache_tree_fully_valid(o->result.cache_tree)) {
>> +                uint64_t start = getnanotime();
>>                   cache_tree_update(&o->result,
>>                             WRITE_TREE_SILENT |
>>                             WRITE_TREE_REPAIR);
>> +                trace_performance_since(start, "update");
>> +            }
>>           }
>>           move_index_extensions(&o->result, o->src_index);
>>           discard_index(o->dst_index);
>> -- 8< --
>>
> 
> I like the idea (and the perf win!) but it seems like there is an 
> important piece missing.  If I'm reading this correctly, unpack_trees() 
> will copy the source cache tree (instead of creating a new one) and then 
> verify_and_invalidate() will walk the cache tree and for any tree that 
> is dirty, it will flag its ancestors as dirty as well.
> 
> What I don't understand is how any cache tree entries that became 
> invalid as a result of the merge of the n-trees are marked as invalid. 
> It seems like something needs to walk the cache tree and call 
> cache_tree_invalidate_path() for all entries that changed as a result of 
> the merge before the call to verify_and_invalidate().
> 
> I thought at first cache_tree_fully_valid() might do that but it only 
> looks for entries that are already marked as invalid (or are missing 
> their corresponding object in the object store).  It assumes something 
> else has marked the invalid paths already.

In fact, in the other [1] patch series, we're detecting the number of 
cache entries that are the same as the cache tree and using that to 
traverse_by_cache_tree().  At that point, couldn't we copy the 
corresponding cache tree entries over to the destination so that those 
don't have to get recreated in the later call to cache_tree_update()?

[1] 
https://public-inbox.org/git/20180727154241.GA21288@duynguyen.home/T/#mad6b94733dcf16c29350cbad4beccd9ca93beaed

  reply	other threads:[~2018-08-09  8:16 UTC|newest]

Thread overview: 121+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-07-18 20:45 [PATCH v1 0/3] [RFC] Speeding up checkout (and merge, rebase, etc) Ben Peart
2018-07-18 20:45 ` [PATCH v1 1/3] add unbounded Multi-Producer-Multi-Consumer queue Ben Peart
2018-07-18 20:57   ` Stefan Beller
2018-07-19 19:11   ` Junio C Hamano
2018-07-18 20:45 ` [PATCH v1 2/3] add performance tracing around traverse_trees() in unpack_trees() Ben Peart
2018-07-18 20:45 ` [PATCH v1 3/3] Add initial parallel version of unpack_trees() Ben Peart
2018-07-18 22:56   ` Junio C Hamano
2018-07-18 21:02 ` [PATCH v1 0/3] [RFC] Speeding up checkout (and merge, rebase, etc) Stefan Beller
2018-07-18 21:34 ` Jeff King
2018-07-23 15:48   ` Ben Peart
2018-07-23 17:03     ` Duy Nguyen
2018-07-23 20:51       ` Ben Peart
2018-07-24  4:20         ` Jeff King
2018-07-24 15:33           ` Duy Nguyen
2018-07-25 20:56             ` Ben Peart
2018-07-26  5:30               ` Duy Nguyen
2018-07-26 16:30                 ` Duy Nguyen
2018-07-26 19:40                   ` Junio C Hamano
2018-07-27 15:42                     ` Duy Nguyen
2018-07-27 16:22                       ` Ben Peart
2018-07-27 18:00                         ` Duy Nguyen
2018-07-27 17:14                       ` Junio C Hamano
2018-07-27 17:52                         ` Duy Nguyen
2018-07-29  6:24                           ` Duy Nguyen
2018-07-29 10:33                       ` [PATCH v2 0/4] Speed up unpack_trees() Nguyễn Thái Ngọc Duy
2018-07-29 10:33                         ` [PATCH v2 1/4] unpack-trees.c: add performance tracing Nguyễn Thái Ngọc Duy
2018-07-30 20:16                           ` Ben Peart
2018-07-29 10:33                         ` [PATCH v2 2/4] unpack-trees: optimize walking same trees with cache-tree Nguyễn Thái Ngọc Duy
2018-07-30 20:52                           ` Ben Peart
2018-07-29 10:33                         ` [PATCH v2 3/4] unpack-trees: reduce malloc in cache-tree walk Nguyễn Thái Ngọc Duy
2018-07-30 20:58                           ` Ben Peart
2018-07-29 10:33                         ` [PATCH v2 4/4] unpack-trees: cheaper index update when walking by cache-tree Nguyễn Thái Ngọc Duy
2018-08-08 18:46                           ` Elijah Newren
2018-08-10 16:39                             ` Duy Nguyen
2018-08-10 18:39                               ` Elijah Newren
2018-08-10 19:30                                 ` Duy Nguyen
2018-08-10 19:40                                   ` Elijah Newren
2018-08-10 19:48                                     ` Duy Nguyen
2018-07-30 18:10                         ` [PATCH v2 0/4] Speed up unpack_trees() Ben Peart
2018-07-31 15:31                           ` Duy Nguyen
2018-07-31 16:50                             ` Ben Peart
2018-07-31 17:31                               ` Ben Peart
2018-08-01 16:38                                 ` Duy Nguyen
2018-08-08 20:53                                   ` Ben Peart
2018-08-09  8:16                                     ` Ben Peart [this message]
2018-08-10 16:08                                       ` Duy Nguyen
2018-08-10 15:51                                     ` Duy Nguyen
2018-07-30 21:04                         ` Ben Peart
2018-08-04  5:37                         ` [PATCH v3 " Nguyễn Thái Ngọc Duy
2018-08-04  5:37                           ` [PATCH v3 1/4] unpack-trees: add performance tracing Nguyễn Thái Ngọc Duy
2018-08-04  5:37                           ` [PATCH v3 2/4] unpack-trees: optimize walking same trees with cache-tree Nguyễn Thái Ngọc Duy
2018-08-08 18:23                             ` Elijah Newren
2018-08-10 16:29                               ` Duy Nguyen
2018-08-10 18:48                                 ` Elijah Newren
2018-08-04  5:37                           ` [PATCH v3 3/4] unpack-trees: reduce malloc in cache-tree walk Nguyễn Thái Ngọc Duy
2018-08-08 18:30                             ` Elijah Newren
2018-08-04  5:37                           ` [PATCH v3 4/4] unpack-trees: cheaper index update when walking by cache-tree Nguyễn Thái Ngọc Duy
2018-08-06 15:48                           ` [PATCH v3 0/4] Speed up unpack_trees() Junio C Hamano
2018-08-06 15:59                             ` Duy Nguyen
2018-08-06 18:59                               ` Junio C Hamano
2018-08-08 17:00                                 ` Ben Peart
2018-08-08 17:46                               ` Junio C Hamano
2018-08-08 18:12                                 ` Junio C Hamano
2018-08-08 18:39                                   ` Junio C Hamano
2018-08-10 16:53                                     ` Duy Nguyen
2018-08-12  8:15                           ` [PATCH v4 0/5] " Nguyễn Thái Ngọc Duy
2018-08-12  8:15                             ` [PATCH v4 1/5] trace.h: support nested performance tracing Nguyễn Thái Ngọc Duy
2018-08-13 18:39                               ` Ben Peart
2018-08-12  8:15                             ` [PATCH v4 2/5] unpack-trees: add " Nguyễn Thái Ngọc Duy
2018-08-12 10:05                               ` Thomas Adam
2018-08-13 18:50                                 ` Junio C Hamano
2018-08-13 18:44                               ` Ben Peart
2018-08-13 19:25                               ` Jeff King
2018-08-13 19:36                                 ` Stefan Beller
2018-08-13 20:11                                   ` Ben Peart
2018-08-13 19:52                                 ` Duy Nguyen
2018-08-13 21:47                                   ` Jeff King
2018-08-13 22:41                                 ` Junio C Hamano
2018-08-14 18:19                                   ` Jeff Hostetler
2018-08-14 18:32                                     ` Duy Nguyen
2018-08-14 18:44                                       ` Stefan Beller
2018-08-14 18:51                                         ` Duy Nguyen
2018-08-14 19:54                                           ` Jeff King
2018-08-14 20:52                                           ` Junio C Hamano
2018-08-15 16:32                                             ` Duy Nguyen
2018-08-15 18:28                                               ` Junio C Hamano
2018-08-14 20:14                                         ` Jeff Hostetler
2018-08-12  8:15                             ` [PATCH v4 3/5] unpack-trees: optimize walking same trees with cache-tree Nguyễn Thái Ngọc Duy
2018-08-13 18:58                               ` Ben Peart
2018-08-15 16:38                                 ` Duy Nguyen
2018-08-12  8:15                             ` [PATCH v4 4/5] unpack-trees: reduce malloc in cache-tree walk Nguyễn Thái Ngọc Duy
2018-08-12  8:15                             ` [PATCH v4 5/5] unpack-trees: reuse (still valid) cache-tree from src_index Nguyễn Thái Ngọc Duy
2018-08-13 15:48                               ` Elijah Newren
2018-08-13 15:57                                 ` Duy Nguyen
2018-08-13 16:05                                 ` Ben Peart
2018-08-13 16:25                                   ` Duy Nguyen
2018-08-13 17:15                                     ` Ben Peart
2018-08-13 19:01                             ` [PATCH v4 0/5] Speed up unpack_trees() Junio C Hamano
2018-08-14 19:19                             ` Ben Peart
2018-08-18 14:41                             ` [PATCH v5 0/7] " Nguyễn Thái Ngọc Duy
2018-08-18 14:41                               ` [PATCH v5 1/7] trace.h: support nested performance tracing Nguyễn Thái Ngọc Duy
2018-08-18 14:41                               ` [PATCH v5 2/7] unpack-trees: add " Nguyễn Thái Ngọc Duy
2018-08-18 14:41                               ` [PATCH v5 3/7] unpack-trees: optimize walking same trees with cache-tree Nguyễn Thái Ngọc Duy
2018-08-20 12:43                                 ` Ben Peart
2018-08-18 14:41                               ` [PATCH v5 4/7] unpack-trees: reduce malloc in cache-tree walk Nguyễn Thái Ngọc Duy
2018-08-18 14:41                               ` [PATCH v5 5/7] unpack-trees: reuse (still valid) cache-tree from src_index Nguyễn Thái Ngọc Duy
2018-08-18 14:41                               ` [PATCH v5 6/7] unpack-trees: add missing cache invalidation Nguyễn Thái Ngọc Duy
2018-08-18 14:41                               ` [PATCH v5 7/7] cache-tree: verify valid cache-tree in the test suite Nguyễn Thái Ngọc Duy
2018-08-18 21:45                                 ` Elijah Newren
2018-08-18 22:01                               ` [PATCH v5 0/7] Speed up unpack_trees() Elijah Newren
2018-08-19  5:09                                 ` Duy Nguyen
2018-08-25 12:18                               ` [PATCH] Document update for nd/unpack-trees-with-cache-tree Nguyễn Thái Ngọc Duy
2018-08-25 12:31                                 ` Martin Ågren
2018-08-25 13:02                                 ` [PATCH v2] " Nguyễn Thái Ngọc Duy
2018-07-27 15:50                     ` [PATCH v1 0/3] [RFC] Speeding up checkout (and merge, rebase, etc) Ben Peart
2018-07-26 16:35               ` Duy Nguyen
2018-07-24  5:54         ` Junio C Hamano
2018-07-24 15:13         ` Duy Nguyen
2018-07-24 21:21           ` Jeff King
2018-07-25 16:09           ` Ben Peart
2018-07-24  4:27       ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=eb39eecf-81b0-e937-d686-47b7565d6511@gmail.com \
    --to=peartben@gmail.com \
    --cc=Ben.Peart@microsoft.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=pclouds@gmail.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).