* [PATCH v4] unpack-trees: avoid duplicate ODB lookups during checkout
From: git @ 2017-04-14 19:25 UTC
  To: git; +Cc: gitster, peff, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Version 4 cleans up the buf[] allocation and freeing as
suggested on the mailing list.

Jeff Hostetler (1):
  unpack-trees: avoid duplicate ODB lookups during checkout

 unpack-trees.c | 38 +++++++++++++++++++++++++++++++++-----
 1 file changed, 33 insertions(+), 5 deletions(-)

-- 
2.9.3



* [PATCH v4] unpack-trees: avoid duplicate ODB lookups during checkout
From: git @ 2017-04-14 19:25 UTC
  To: git; +Cc: gitster, peff, Jeff Hostetler

From: Jeff Hostetler <jeffhost@microsoft.com>

Teach traverse_trees_recursive() to avoid redundant ODB
lookups when two peer directories refer to the same OID.

In operations such as read-tree and checkout, there will
likely be many peer directories that have the same OID when
the differences between the commits are relatively small.
In these cases we can avoid hitting the ODB multiple times
for the same OID.

This patch handles the n=2 and n=3 cases, simply copying the
tree descriptor data rather than repeating the
fill_tree_descriptor() call.

================
On the Windows repo (500K trees, 3.1M files, 450MB index),
this reduced the overall time by 0.75 seconds when cycling
between 2 commits with a single file difference.

(avg) before: 22.699
(avg) after:  21.955
===============

================
On Linux using p0006-read-tree-checkout.sh with linux.git:

Test                                                          HEAD^              HEAD
-------------------------------------------------------------------------------------------------------
0006.2: read-tree br_base br_ballast (57994)                  0.24(0.20+0.03)    0.24(0.22+0.01) +0.0%
0006.3: switch between br_base br_ballast (57994)             10.58(6.23+2.86)   10.67(5.94+2.87) +0.9%
0006.4: switch between br_ballast br_ballast_plus_1 (57994)   0.60(0.44+0.17)    0.57(0.44+0.14) -5.0%
0006.5: switch between aliases (57994)                        0.59(0.48+0.13)    0.57(0.44+0.15) -3.4%
================

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
---
 unpack-trees.c | 38 +++++++++++++++++++++++++++++++++-----
 1 file changed, 33 insertions(+), 5 deletions(-)

diff --git a/unpack-trees.c b/unpack-trees.c
index 3a8ee19..07b0f11 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -531,12 +531,18 @@ static int switch_cache_bottom(struct traverse_info *info)
 	return ret;
 }
 
+static inline int are_same_oid(struct name_entry *name_j, struct name_entry *name_k)
+{
+	return name_j->oid && name_k->oid && !oidcmp(name_j->oid, name_k->oid);
+}
+
 static int traverse_trees_recursive(int n, unsigned long dirmask,
 				    unsigned long df_conflicts,
 				    struct name_entry *names,
 				    struct traverse_info *info)
 {
 	int i, ret, bottom;
+	int nr_buf = 0;
 	struct tree_desc t[MAX_UNPACK_TREES];
 	void *buf[MAX_UNPACK_TREES];
 	struct traverse_info newinfo;
@@ -553,18 +559,40 @@ static int traverse_trees_recursive(int n, unsigned long dirmask,
 	newinfo.pathlen += tree_entry_len(p) + 1;
 	newinfo.df_conflicts |= df_conflicts;
 
+	/*
+	 * Fetch the tree from the ODB for each peer directory in the
+	 * n commits.
+	 *
+	 * For 2- and 3-way traversals, we try to avoid hitting the
+	 * ODB twice for the same OID.  This should yield a nice speed
+	 * up in checkouts and merges when the commits are similar.
+	 *
+	 * We don't bother doing the full O(n^2) search for larger n,
+	 * because wider traversals don't happen that often and we
+	 * avoid the search setup.
+	 *
+	 * When 2 peer OIDs are the same, we just copy the tree
+	 * descriptor data.  This implicitly borrows the buffer
+	 * data from the earlier cell.
+	 */
 	for (i = 0; i < n; i++, dirmask >>= 1) {
-		const unsigned char *sha1 = NULL;
-		if (dirmask & 1)
-			sha1 = names[i].oid->hash;
-		buf[i] = fill_tree_descriptor(t+i, sha1);
+		if (i > 0 && are_same_oid(&names[i], &names[i - 1]))
+			t[i] = t[i - 1];
+		else if (i > 1 && are_same_oid(&names[i], &names[i - 2]))
+			t[i] = t[i - 2];
+		else {
+			const unsigned char *sha1 = NULL;
+			if (dirmask & 1)
+				sha1 = names[i].oid->hash;
+			buf[nr_buf++] = fill_tree_descriptor(t+i, sha1);
+		}
 	}
 
 	bottom = switch_cache_bottom(&newinfo);
 	ret = traverse_trees(n, t, &newinfo);
 	restore_cache_bottom(&newinfo, bottom);
 
-	for (i = 0; i < n; i++)
+	for (i = 0; i < nr_buf; i++)
 		free(buf[i]);
 
 	return ret;
-- 
2.9.3



* Re: [PATCH v4] unpack-trees: avoid duplicate ODB lookups during checkout
From: Jeff King @ 2017-04-14 19:52 UTC
  To: git; +Cc: git, gitster, Jeff Hostetler

On Fri, Apr 14, 2017 at 07:25:54PM +0000, git@jeffhostetler.com wrote:

>  	for (i = 0; i < n; i++, dirmask >>= 1) {
> -		const unsigned char *sha1 = NULL;
> -		if (dirmask & 1)
> -			sha1 = names[i].oid->hash;
> -		buf[i] = fill_tree_descriptor(t+i, sha1);
> +		if (i > 0 && are_same_oid(&names[i], &names[i - 1]))
> +			t[i] = t[i - 1];
> +		else if (i > 1 && are_same_oid(&names[i], &names[i - 2]))
> +			t[i] = t[i - 2];
> +		else {
> +			const unsigned char *sha1 = NULL;
> +			if (dirmask & 1)
> +				sha1 = names[i].oid->hash;
> +			buf[nr_buf++] = fill_tree_descriptor(t+i, sha1);
> +		}

This looks fine to me.

Just musing (and I do not think we need to go further than your patch),
we're slowly walking towards an actual object-content cache. The "buf"
array is now essentially a cache of all the objects we've loaded, but it
doesn't know their sha1s. So we could actually encapsulate all of the
caching:

  struct object_cache {
	  int nr_entries;
	  struct object_cache_entry {
		  struct object_id oid;
		  void *data;
	  } cache[MAX_UNPACK_TREES];
  };

and then ask it "have you seen oid X" rather than playing games with
looking at "i - 1". Of course it would have to do a linear search, so
the next step is to replace its array with a hashmap.

And now suddenly we have a reusable object-content cache that you could
use like:

  struct object_cache cache = {0};
  for (...) {
    /* maybe reads fresh, or maybe gets it from the cache */
    void *data = read_object_data_cached(&oid, &cache);
  }
  /* operation done, release the cache */
  clear_object_cache(&cache);

which would work anywhere you expect to load N objects and see some
overlap.

Of course it would be nicer still if this all just happened
automatically behind the scenes of read_object_data(). But it would have
to keep an _extra_ copy of each object, since the caller expects to be
able to free it. We'd probably have to return instead a struct with
buffer/size in it along with a reference counter.
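
A sketch of that return type (names invented here):

  struct object_buffer {
	  void *data;
	  unsigned long size;
	  int refcount;
  };

  /* the caller's "free" just drops its reference */
  static void object_buffer_put(struct object_buffer *obj)
  {
	  if (!--obj->refcount) {
		  free(obj->data);
		  free(obj);
	  }
  }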

I don't think any of that is worth it unless there are spots where we
really expect there to be a lot of cases where we hit the same objects
in rapid succession. I don't think there should be, though. Our usual
"caching" mechanism is to create a "struct object", which is enough to
perform most operations (and has a much smaller memory footprint).

So again, just musing. I think your patch is fine as-is.

-Peff


* Re: [PATCH v4] unpack-trees: avoid duplicate ODB lookups during checkout
From: Jeff Hostetler @ 2017-04-14 21:06 UTC
  To: Jeff King; +Cc: git, gitster, Jeff Hostetler

On 4/14/2017 3:52 PM, Jeff King wrote:
> [...]
>
> I don't think any of that is worth it unless there are spots where we
> really expect there to be a lot of cases where we hit the same objects
> in rapid succession. I don't think there should be, though. Our usual
> "caching" mechanism is to create a "struct object", which is enough to
> perform most operations (and has a much smaller memory footprint).
>
> So again, just musing. I think your patch is fine as-is.


Thanks for your help on this one.

I think before I tried to do a cache at this layer,
I would like to look at (or have a brave volunteer
look at) the recursive tree traversal.  In my Windows
tree I have 500K directories, so the full recursive
tree traversal touches all of them.  My change cuts the
ODB lookups (on a "checkout -b") from roughly 1M to
500K, but I still have to do 500K strcmp's to get those
savings.  What would be nice would be an alternate
callback -- one which knows the peers (and everything
under them) are equal and can short-cut as much as
possible.  The alternate version of the above routine
would be able to avoid the strcmp's, but I'm guessing
that there would also be savings when we look within a
treenode -- the oidcmp's and some of the n-way parallel
sub-treenode iteration.  I'm just swag'ing here, but
there might be something to it.
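
To make the swag a little more concrete, the short-cut
test itself is cheap to express in terms of the
are_same_oid() helper from the patch (just a sketch,
not tested):

  static int all_trees_same(int n, unsigned long dirmask,
			    struct name_entry *names)
  {
	  int i;

	  /* every side must actually have this directory... */
	  if (dirmask != (1UL << n) - 1)
		  return 0;
	  /* ...and every peer OID must match the first */
	  for (i = 1; i < n; i++)
		  if (!are_same_oid(&names[i], &names[0]))
			  return 0;
	  return 1;
  }

Whether the traversal can really skip the whole subtree
when this returns true still depends on what the index
says underneath it, which is where the brave volunteer
comes in.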

Jeff



