* [PATCH v4] unpack-trees: avoid duplicate ODB lookups during checkout

From: git @ 2017-04-14 19:25 UTC
To: git; +Cc: gitster, peff, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Version 4 cleans up the buf[] allocation and freeing as suggested on
the mailing list.

Jeff Hostetler (1):
  unpack-trees: avoid duplicate ODB lookups during checkout

 unpack-trees.c | 38 +++++++++++++++++++++++++++++++++-----
 1 file changed, 33 insertions(+), 5 deletions(-)

--
2.9.3
* [PATCH v4] unpack-trees: avoid duplicate ODB lookups during checkout

From: git @ 2017-04-14 19:25 UTC
To: git; +Cc: gitster, peff, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Teach traverse_trees_recursive() to not do redundant ODB lookups when
both directories refer to the same OID.

In operations such as read-tree and checkout, there will likely be many
peer directories that have the same OID when the differences between
the commits are relatively small.  In these cases we can avoid hitting
the ODB multiple times for the same OID.

This patch handles the n=2 and n=3 cases and simply copies the data
rather than repeating the fill_tree_descriptor().

================
On the Windows repo (500K trees, 3.1M files, 450MB index), this
reduced the overall time by 0.75 seconds when cycling between 2
commits with a single file difference.

    (avg) before: 22.699
    (avg) after:  21.955
================

================
On Linux using p0006-read-tree-checkout.sh with linux.git:

Test                                                          HEAD^             HEAD
-------------------------------------------------------------------------------------------------------
0006.2: read-tree br_base br_ballast (57994)                  0.24(0.20+0.03)   0.24(0.22+0.01) +0.0%
0006.3: switch between br_base br_ballast (57994)             10.58(6.23+2.86)  10.67(5.94+2.87) +0.9%
0006.4: switch between br_ballast br_ballast_plus_1 (57994)   0.60(0.44+0.17)   0.57(0.44+0.14) -5.0%
0006.5: switch between aliases (57994)                        0.59(0.48+0.13)   0.57(0.44+0.15) -3.4%
================

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 unpack-trees.c | 38 +++++++++++++++++++++++++++++++++-----
 1 file changed, 33 insertions(+), 5 deletions(-)

diff --git a/unpack-trees.c b/unpack-trees.c
index 3a8ee19..07b0f11 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -531,12 +531,18 @@ static int switch_cache_bottom(struct traverse_info *info)
 	return ret;
 }
 
+static inline int are_same_oid(struct name_entry *name_j, struct name_entry *name_k)
+{
+	return name_j->oid && name_k->oid && !oidcmp(name_j->oid, name_k->oid);
+}
+
 static int traverse_trees_recursive(int n, unsigned long dirmask,
 				    unsigned long df_conflicts,
 				    struct name_entry *names,
 				    struct traverse_info *info)
 {
 	int i, ret, bottom;
+	int nr_buf = 0;
 	struct tree_desc t[MAX_UNPACK_TREES];
 	void *buf[MAX_UNPACK_TREES];
 	struct traverse_info newinfo;
@@ -553,18 +559,40 @@ static int traverse_trees_recursive(int n, unsigned long dirmask,
 	newinfo.pathlen += tree_entry_len(p) + 1;
 	newinfo.df_conflicts |= df_conflicts;
 
+	/*
+	 * Fetch the tree from the ODB for each peer directory in the
+	 * n commits.
+	 *
+	 * For 2- and 3-way traversals, we try to avoid hitting the
+	 * ODB twice for the same OID.  This should yield a nice speed
+	 * up in checkouts and merges when the commits are similar.
+	 *
+	 * We don't bother doing the full O(n^2) search for larger n,
+	 * because wider traversals don't happen that often and we
+	 * avoid the search setup.
+	 *
+	 * When 2 peer OIDs are the same, we just copy the tree
+	 * descriptor data.  This implicitly borrows the buffer
+	 * data from the earlier cell.
+	 */
 	for (i = 0; i < n; i++, dirmask >>= 1) {
-		const unsigned char *sha1 = NULL;
-		if (dirmask & 1)
-			sha1 = names[i].oid->hash;
-		buf[i] = fill_tree_descriptor(t+i, sha1);
+		if (i > 0 && are_same_oid(&names[i], &names[i - 1]))
+			t[i] = t[i - 1];
+		else if (i > 1 && are_same_oid(&names[i], &names[i - 2]))
+			t[i] = t[i - 2];
+		else {
+			const unsigned char *sha1 = NULL;
+			if (dirmask & 1)
+				sha1 = names[i].oid->hash;
+			buf[nr_buf++] = fill_tree_descriptor(t+i, sha1);
+		}
 	}
 
 	bottom = switch_cache_bottom(&newinfo);
 	ret = traverse_trees(n, t, &newinfo);
 	restore_cache_bottom(&newinfo, bottom);
 
-	for (i = 0; i < nr_buf; i++)
+	for (i = 0; i < nr_buf; i++)
 		free(buf[i]);
 
 	return ret;
--
2.9.3
* Re: [PATCH v4] unpack-trees: avoid duplicate ODB lookups during checkout

From: Jeff King @ 2017-04-14 19:52 UTC
To: git; +Cc: git, gitster, Jeff Hostetler

On Fri, Apr 14, 2017 at 07:25:54PM +0000, git@jeffhostetler.com wrote:

> 	for (i = 0; i < n; i++, dirmask >>= 1) {
> -		const unsigned char *sha1 = NULL;
> -		if (dirmask & 1)
> -			sha1 = names[i].oid->hash;
> -		buf[i] = fill_tree_descriptor(t+i, sha1);
> +		if (i > 0 && are_same_oid(&names[i], &names[i - 1]))
> +			t[i] = t[i - 1];
> +		else if (i > 1 && are_same_oid(&names[i], &names[i - 2]))
> +			t[i] = t[i - 2];
> +		else {
> +			const unsigned char *sha1 = NULL;
> +			if (dirmask & 1)
> +				sha1 = names[i].oid->hash;
> +			buf[nr_buf++] = fill_tree_descriptor(t+i, sha1);
> +		}

This looks fine to me.

Just musing (and I do not think we need to go further than your patch),
we're slowly walking towards an actual object-content cache.  The "buf"
array is now essentially a cache of all the trees we've loaded, but it
doesn't know their sha1s.  So we could actually encapsulate all of the
caching:

  struct object_cache {
	int nr_entries;
	struct object_cache_entry {
		struct object_id oid;
		void *data;
	} cache[MAX_UNPACK_TREES];
  };

and then ask it "have you seen oid X" rather than playing games with
looking at "i - 1".  Of course it would have to do a linear search, so
the next step is to replace its array with a hashmap.

And now suddenly we have a reusable object-content cache that you could
use like:

  struct object_cache cache = {0};
  for (...) {
	/* maybe reads fresh, or maybe gets it from the cache */
	void *data = read_object_data_cached(&oid, &cache);
  }
  /* operation done, release the cache */
  clear_object_cache(&cache);

which would work anywhere you expect to load N objects and see some
overlap.

Of course it would be nicer still if this all just happened
automatically behind the scenes of read_object_data().  But it would
have to keep an _extra_ copy of each object, since the caller expects
to be able to free it.  We'd probably have to return instead a struct
with buffer/size in it along with a reference counter.

I don't think any of that is worth it unless there are spots where we
really expect to hit the same objects in rapid succession.  I don't
think there should be, though.  Our usual "caching" mechanism is to
create a "struct object", which is enough to perform most operations
(and has a much smaller memory footprint).

So again, just musing.  I think your patch is fine as-is.

-Peff
* Re: [PATCH v4] unpack-trees: avoid duplicate ODB lookups during checkout

From: Jeff Hostetler @ 2017-04-14 21:06 UTC
To: Jeff King; +Cc: git, gitster, Jeff Hostetler

On 4/14/2017 3:52 PM, Jeff King wrote:
> Just musing (and I do not think we need to go further than your patch),
> we're slowly walking towards an actual object-content cache.
> [...]
> So again, just musing.  I think your patch is fine as-is.

Thanks for your help on this one.

Before I tried to do a cache at this layer, I would like to look at
(or have a brave volunteer look at) the recursive tree traversal
itself.  In my Windows tree I have 500K directories, so the full
recursive tree traversal touches all of them.  My change cuts the ODB
lookups (on a "checkout -b") roughly from 1M to 500K, but I still have
to do 500K strcmp's to get those savings.

What would be nice would be an alternate callback -- one which knows
the peers (and everything under them) are equal and lets the caller
short cut as much as it can.  The alternate version of the above
routine would be able to avoid the strcmp's, but I'm guessing that
there would also be savings when we look within a treenode -- the
oidcmp's and some of the n-way parallel sub-treenode iteration.

I'm just swag'ing here, but there might be something to it.

Jeff