git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC] log(n)-transmissions common commit handshake
@ 2008-09-17 23:00 Thomas Rast
  2008-09-17 23:01 ` [RFC PATCH] fetch-pack: log(n)-transmission find_common() Thomas Rast
  2008-09-18  8:18 ` [RFC] log(n)-transmissions common commit handshake Thomas Rast
  0 siblings, 2 replies; 13+ messages in thread
From: Thomas Rast @ 2008-09-17 23:00 UTC (permalink / raw)
  To: git

[-- Attachment #1: Type: text/plain, Size: 8601 bytes --]

Hi *

This is a fairly long mail, so here's a brief abstract:

  Using two passes, first exponential stride through history, then
  bisection on the "right" windows, we implement the find_common
  handshake in log(history_size) many transmissions.  The followup
  patch partially achieves that, but a few points are open and strict
  backwards compatibility is missing (but its absence is Mostly
  Harmless).

Terminology
===========

The sending side (running upload-pack) is called server, and the
receiving side (running fetch-pack) is the client.  Commits fall into
the following categories:

'want':
	Requested by the client for transmission.

'have':
	In the client's object storage.

'common':
	In both client's and server's object storage.

'private':
	have, but not in the server's storage.

'required':
	Part of the history graph "between" commons and wants; minimal
	subset that needs to be sent over the wire as part of the
	transmission.

("Verbing weirds language." -- Calvin)

When I say "a commit is not _yet_ X", I mean "it is not yet known to
be X", usually from the client's POV.


Motivation
==========

This spawned out of the O(#haves*#wants*#history) analysis late last
week.  As a brief summary, the server uses the fairly expensive test
ok_to_give_up() to check whether it can cut the handshake short since
it knows enough commons to cover all wants.  The thread is here:

  http://article.gmane.org/gmane.comp.version-control.git/95787

(Be sure to read the follow-up regarding the first '*' bullet point.)

It does this shortcut because the client risks getting side-tracked on
a private chunk of history leading to a private root.  (According to
the commit message.  I'm not sure how effective it really is.)

This check by far dominates the rest of the server's work, which is
mostly lookup_object(): it only has to take each "have" line sha1, see
if it is present, and if so, ACK it.  The client announces when it
considers the handshake finished, so there is no other requirement to
test for it.

A more efficient handshake could -- in addition to being more
efficient :-) -- solve this problem simply by hitting the private root
very quickly and proceeding elsewhere, avoiding the server-side check.


Outline of current algorithm
============================

The client sends all its 'have' sha1s, ordered by date, in chunks of
32.  The server replies with an ACK for each 'common', i.e., for any
sha1 it finds in its own object store.

[This is complicated a bit by the exchange of refs, see "Open
questions".]

As it goes, the client marks the commons, and as soon as it runs out
of unmarket commits, it sends a "done".  The server then knows the
entire set of commons too, and can proceed to generate a pack file of
the requireds.

There is a hard limit of 256 commits that go un-ACKed until the client
gives up and declares that there are no common commits.


Idea
====

The proposed algorithm scans the haves in a different order, using two
stages: exponential stride and bisection.

(1) Exponential stride:
-----------------------

Emit the first commit, then the 2nd from that, then the 4th from that,
and so on.  As a boundary case, always emit the root.

If the history has "bounded width" (the meaning should be fairly
obvious), then this stage emits O(log(n)) commits up to the root.

(2) Bisection:
--------------

Call a pair of (A,B) where A is an ancestor of B "interesting" if B is
not yet common and A is not yet private.

Then the bisection step is: between any interesting pair of commits
(A,B) emitted so far, emit the one in the middle M=M(A,B).

["Cut points" between last common and first private commits clearly
lie only between such pairs.]

Assuming for a moment that the server responds immediately, this also
takes only O(log(n)) commits since M must be either common or private,
and thus A..M or M..B is no longer interesting, respectively.

To deal with the delayed responses in the actual protocol, at each
pass 'i' emit M(A,B) for all (A,B) in pairs(i) only if the pair is
still interesting.  Add both (A,M) and (M,B) to pairs(i+1).  Iterate.
pairs(0) is constructed from pass (1).

Intuitively, this means we emit more than log(n) only if we have
nothing else left to do.

----

Between the two stages, we achieve O(log(n)) complexity.  (This
notably means that we hit private roots in far less transmitted sha1s
than the chunk size of 32.)

Unfortunately, this ordering sometimes conflicts with the server's
ok_to_give_up() shortcut mentioned in "Motivation": Consider a linear
branch 'master', and suppose the only 'common' sent out in the first
batch was the root.  ok_to_give_up() detects that it has a common on
the only ancestry line of the want, and trigger a fake ACK for all
further commits.  This becomes a problem when we later emit, say,
'master~1' and the server ACKs it even though it is private.  (It does
not affect correctness, but can increase the depth of history
transmitted by a factor of 2 at worst.)


RFC Implementation
==================

I'll just highlight a few design decisions.  It's my first serious C
coding in years, so warm up the flamethrowers...

* The work_list (replacing rev_list) is the "recursion stack" for
  get_rev().  New "frames" are inserted such that the order of emitted
  commits is
   (a) refs
   (b) stage (1) in date order
   (b) stage (2) in bisection depth order

  get_rev() is split across functions for each case.

  Refs are moved to front because the current code emits them anyway
  (see "Open questions" below), mostly to help fill the first 64
  "have" lines with things we are going to send anyway, as opposed to
  the "real" walk where we make choices.

* The bisection does not actually remember pairs, but flags commits
  with the directions it has already started a bisection from them.
  Combined with the existing POPPED flag for commits that were
  emitted, the pair (A,B) can be determined from (A,direction) by a
  parent/children search through the history.  This simplifies the
  implementation.

  The idea is then to scan with two pointers 'half' and 'full' in
  parallel, where the first one makes only half as many steps, until
  we hit a POPPED commit.  This must then be the interval we want to
  bisect, and we can check whether it is still interesting, emit M,
  and queue it for recursion at one level deeper.

* We maintain a list 'outstanding' of all commits that have been sent,
  so that an "implied not-ACK" can be inferred from the next ACK.
  This is used to do a child-recursive NOTCOMMON flagging, analogous
  to the parent-recursive COMMON flagging on an ACK.  The bisection
  uses these flags to determine ranges.

* Yes, commit_info is a pointless level of indirection.  I had more
  tracking information in it before the last draft, and left it in for
  the time being.

[Hint: when playing with the patch, command lines such as

   git fetch-pack -v -k <url> <head> 2>&1 | git name-rev --stdin

help see the effects.]


Open Problems
=============

* Effect of branch/merge points

  I do not see a better solution to deal with branch/merge points than
  recursing into them.  There may be examples of history where this
  extra level of recursion spoils the log(n) bound.  Carefully
  choosing which side to scan first might help.

* Impact/clever use of refs

  For some reason, current git sends all refs to the server, even if
  the server should already know about lots of them.  For example, in
  git.git, emitting v1.6.0.2 covers almost all tags in the repository
  by simple ancestor scanning.

  Is there a reason for this behaviour?  Otherwise it would be better
  to emit them in date order and intelligently handle commons.  (In
  fact this does not depend on the discussed change.)

* Backwards compatibility

  As mentioned above, ok_to_give_up() is not really compatible with
  this algorithm.  I suppose we should introduce a new capability
  identifier that has the client enable the log(n) walker, and the
  server disable ok_to_give_up()?  Is there a better approach, or are
  there problems with it?  (I haven't worked with the protocol enough
  to say, though it seems harmless.)

... and of course everything that I missed. ;-)


I'd greatly appreciate comments, especially from people who know the
protocol side of git.

- Thomas


-- 
Thomas Rast
trast@student.ethz.ch





[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 197 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [RFC PATCH] fetch-pack: log(n)-transmission find_common()
  2008-09-17 23:00 [RFC] log(n)-transmissions common commit handshake Thomas Rast
@ 2008-09-17 23:01 ` Thomas Rast
  2008-09-24  0:48   ` [RFC PATCH v2] " Thomas Rast
  2008-09-18  8:18 ` [RFC] log(n)-transmissions common commit handshake Thomas Rast
  1 sibling, 1 reply; 13+ messages in thread
From: Thomas Rast @ 2008-09-17 23:01 UTC (permalink / raw)
  To: git

See thread starter for discussion.

Signed-off-by: Thomas Rast <trast@student.ethz.ch>

---
 builtin-fetch-pack.c |  333 +++++++++++++++++++++++++++++++++++++++++++------
 upload-pack.c        |    4 +-
 2 files changed, 294 insertions(+), 43 deletions(-)

diff --git a/builtin-fetch-pack.c b/builtin-fetch-pack.c
index 4dfef29..9f010fe 100644
--- a/builtin-fetch-pack.c
+++ b/builtin-fetch-pack.c
@@ -25,6 +25,11 @@ static const char fetch_pack_usage[] =
 #define COMMON_REF	(1U << 2)
 #define SEEN		(1U << 3)
 #define POPPED		(1U << 4)
+#define BISECTED_FW	(1U << 5)
+#define BISECTED_BW	(1U << 6)
+#define NOTCOMMON       (1U << 7)
+#define HAS_INFO        (1U << 8)
+#define CLEAN_MARKS     (COMMON | COMMON_REF | SEEN | POPPED | BISECTED_FW | BISECTED_BW | NOTCOMMON)
 
 static int marked;
 
@@ -34,19 +39,78 @@ static int marked;
  */
 #define MAX_IN_VAIN 256
 
-static struct commit_list *rev_list;
+struct work_list
+{
+	struct work_list *next;
+	struct commit *commit;
+	unsigned int stride;    /* current stride length between commits */
+	unsigned int steps;     /* steps left to go before emitting again */
+	unsigned int bisect;
+	unsigned int is_ref :1; /* prioritise refs */
+};
+
+static struct work_list *work_list = NULL;
+
+static struct commit_list *outstanding = NULL; /* list of commits not ACKed yet */
+static struct commit_list *outstanding_end = NULL;
+
+struct commit_info
+{
+	struct commit_list *children;
+};
+
+
 static int non_common_revs, multi_ack, use_sideband;
 
-static void rev_list_push(struct commit *commit, int mark)
+static struct commit_info *get_commit_info(struct commit *commit)
+{
+	if (commit->object.flags & HAS_INFO)
+		return commit->util;
+
+	struct commit_info *info = xmalloc(sizeof(struct commit_info));
+	info->children = NULL;
+	commit->util = info;
+	commit->object.flags |= HAS_INFO;
+	return info;
+}
+
+static void work_list_insert(struct work_list *work_item)
 {
+	struct work_list **pp = &work_list;
+	struct work_list *p;
+
+	while ((p = *pp)) {
+		if ((work_item->is_ref && (!p->is_ref || p->commit->date < work_item->commit->date))
+		    || (!work_item->bisect && (p->bisect || p->commit->date < work_item->commit->date))
+		    || p->bisect > work_item->bisect)
+			break;
+		pp = &p->next;
+	}
+
+	work_item->next = p;
+	*pp = work_item;
+}
+
+
+static void rev_list_push(struct commit *commit, int mark, unsigned int stride, unsigned int steps, unsigned int is_ref)
+{
+	struct work_list *work_item;
+
 	if (!(commit->object.flags & mark)) {
 		commit->object.flags |= mark;
 
-		if (!(commit->object.parsed))
+		if (!(commit->object.parsed)) {
 			if (parse_commit(commit))
 				return;
+		}
 
-		insert_by_date(commit, &rev_list);
+		work_item = xmalloc(sizeof(struct work_list));
+		work_item->commit = commit;
+		work_item->stride = stride;
+		work_item->steps = steps;
+		work_item->bisect = 0;
+		work_item->is_ref = is_ref;
+		work_list_insert(work_item);
 
 		if (!(commit->object.flags & COMMON))
 			non_common_revs++;
@@ -58,7 +122,7 @@ static int rev_list_insert_ref(const char *path, const unsigned char *sha1, int
 	struct object *o = deref_tag(parse_object(sha1), path, 0);
 
 	if (o && o->type == OBJ_COMMIT)
-		rev_list_push((struct commit *)o, SEEN);
+		rev_list_push((struct commit *)o, SEEN, 1, 1, 1);
 
 	return 0;
 }
@@ -68,8 +132,7 @@ static int clear_marks(const char *path, const unsigned char *sha1, int flag, vo
 	struct object *o = deref_tag(parse_object(sha1), path, 0);
 
 	if (o && o->type == OBJ_COMMIT)
-		clear_commit_marks((struct commit *)o,
-				   COMMON | COMMON_REF | SEEN | POPPED);
+		clear_commit_marks((struct commit *)o, CLEAN_MARKS);
 	return 0;
 }
 
@@ -89,7 +152,7 @@ static void mark_common(struct commit *commit,
 			o->flags |= COMMON;
 
 		if (!(o->flags & SEEN))
-			rev_list_push(commit, SEEN);
+			rev_list_push(commit, SEEN, 1, 1, 0);
 		else {
 			struct commit_list *parents;
 
@@ -111,49 +174,216 @@ static void mark_common(struct commit *commit,
   Get the next rev to send, ignoring the common.
 */
 
-static const unsigned char* get_rev(void)
+
+static void get_rev_queue_parents(struct commit *commit, int mark, unsigned int stride, unsigned int steps)
+{
+	struct commit_list *parents = commit->parents;
+
+	while (parents) {
+		struct commit_info *info = get_commit_info(parents->item);
+		struct commit_list *entry;
+		for (entry = info->children; entry; entry = entry->next)
+			if (entry->item == commit)
+				break;
+		if (!entry)
+			commit_list_insert(commit, &(info->children));
+		if (!(parents->item->object.flags & SEEN))
+			rev_list_push(parents->item, mark, stride, steps, 0);
+		if (mark & COMMON)
+			mark_common(parents->item, 1, 0);
+		parents = parents->next;
+	}
+}
+
+
+static struct commit* get_rev_stride_skip(struct work_list *work_item)
+{
+	get_rev_queue_parents(work_item->commit, SEEN, work_item->stride,
+			      work_item->steps + 1);
+
+	free(work_item);
+	return NULL;
+}
+
+static void setup_bisect(struct commit *commit, int depth)
+{
+	struct work_list *work_item = xmalloc(sizeof(struct work_list));
+
+	work_item->commit = commit;
+	work_item->bisect = 1+depth;
+	work_item->is_ref = 0;
+	work_list_insert(work_item);
+}
+
+static void setup_bisect_all (struct commit_list* list, int depth)
+{
+	while (list) {
+		setup_bisect(list->item, depth);
+		list = list->next;
+	}
+}
+
+static struct commit* get_rev_stride_emit(struct work_list *work_item)
+{
+	struct commit *commit = work_item->commit;
+	unsigned int mark;
+
+	commit->object.flags |= POPPED;
+
+	/* This one got hit by our walk! */
+	if (!(commit->object.flags & COMMON))
+		non_common_revs--;
+
+	if (commit->object.flags & COMMON) {
+		/* do not send "have", and ignore ancestors */
+		commit = NULL;
+		mark = COMMON | SEEN;
+	} else if (commit->object.flags & COMMON_REF) {
+		/* send "have", and ignore ancestors */
+		mark = COMMON | SEEN;
+	} else {
+		/* send "have", also for its ancestors */
+		mark = SEEN;
+		setup_bisect(commit, 0);
+	}
+
+	if (commit)
+		get_rev_queue_parents(commit, mark, work_item->stride*2, 0);
+
+	free(work_item);
+
+	return commit;
+}
+
+static struct commit* get_rev_bisect(struct work_list *work_item)
+{
+	struct commit *full;
+	struct commit *half;
+	int half_step;
+	struct commit_info *info;
+	unsigned int flags = work_item->commit->object.flags;
+
+	if (flags & NOTCOMMON) {
+		/* we inferred that this side of the bisection is not
+		 * interesting any longer */
+		free(work_item);
+		return NULL;
+	}
+
+
+	if (!(flags & (COMMON|BISECTED_BW))) {
+		/* Server does not have this. Search backward in history */
+
+		full = work_item->commit;
+		full->object.flags |= BISECTED_BW;
+		half = work_item->commit;
+		half_step = 0;
+
+		while (full && full->parents) {
+			setup_bisect_all(full->parents->next, work_item->bisect);
+			full = full->parents->item;
+			half_step ^= 1;
+			if (half_step)
+				half = half->parents->item;
+			if (full->object.flags & (POPPED|COMMON))
+				break;
+		}
+
+		/* also insert the same bisection again so we can try forward too */
+		work_list_insert(work_item);
+
+		if (full->object.flags & POPPED
+		    && !(full->object.flags & NOTCOMMON)
+		    && !(half->object.flags & (COMMON|POPPED))) {
+			setup_bisect(half, work_item->bisect);
+			half->object.flags |= POPPED;
+			info = get_commit_info(half);
+			return half;
+		} else {
+			return NULL;
+		}
+	}
+
+	if (!(flags & (NOTCOMMON|BISECTED_FW))) {
+		/* We have not seen this yet when bisecting, search forward */
+
+		full = work_item->commit;
+		full->object.flags |= BISECTED_FW;
+		half = work_item->commit;
+		half_step = 0;
+
+		while (full && (info=get_commit_info(full))->children) {
+			setup_bisect_all(info->children->next, work_item->bisect);
+			full = info->children->item;
+			half_step ^= 1;
+			if (half_step) {
+				info = get_commit_info(half);
+				half = info->children->item;
+			}
+			if (full->object.flags & (POPPED|NOTCOMMON))
+				break;
+		}
+
+		if (full->object.flags & POPPED
+		    && !(full->object.flags & COMMON)
+		    && !(half->object.flags & POPPED)) {
+			setup_bisect(half, work_item->bisect);
+			half->object.flags |= POPPED;
+			info = get_commit_info(half);
+			free(work_item);
+			return half;
+		}
+	}
+
+	free(work_item);
+	return NULL;
+}
+
+
+static void mark_not_common(struct commit *commit)
+{
+	struct commit_info *info = get_commit_info(commit);
+	struct commit_list *child;
+
+	if (commit->object.flags & COMMON)
+		/* this has already been acked earlier */
+		return;
+
+	commit->object.flags |= NOTCOMMON;
+
+	for (child = info->children; child; child = child->next)
+		mark_not_common(child->item);
+}
+
+static struct commit *get_rev(void)
 {
 	struct commit *commit = NULL;
 
 	while (commit == NULL) {
-		unsigned int mark;
-		struct commit_list *parents;
+		struct work_list *work_item = NULL;
 
-		if (rev_list == NULL || non_common_revs == 0)
+		if (work_list == NULL || non_common_revs == 0)
 			return NULL;
 
-		commit = rev_list->item;
-		if (!commit->object.parsed)
+		work_item = work_list;
+		work_list = work_item->next;
+
+		commit = work_item->commit;
+
+		if (commit && !commit->object.parsed)
 			parse_commit(commit);
-		parents = commit->parents;
 
-		commit->object.flags |= POPPED;
-		if (!(commit->object.flags & COMMON))
-			non_common_revs--;
-
-		if (commit->object.flags & COMMON) {
-			/* do not send "have", and ignore ancestors */
-			commit = NULL;
-			mark = COMMON | SEEN;
-		} else if (commit->object.flags & COMMON_REF)
-			/* send "have", and ignore ancestors */
-			mark = COMMON | SEEN;
-		else
-			/* send "have", also for its ancestors */
-			mark = SEEN;
-
-		while (parents) {
-			if (!(parents->item->object.flags & SEEN))
-				rev_list_push(parents->item, mark);
-			if (mark & COMMON)
-				mark_common(parents->item, 1, 0);
-			parents = parents->next;
+		if (work_item->bisect) {
+			commit = get_rev_bisect(work_item);
+		} else if (work_item->steps >= work_item->stride-1
+			   || (commit && !commit->parents)) {
+			commit = get_rev_stride_emit(work_item);
+		} else {
+			commit = get_rev_stride_skip(work_item);
 		}
-
-		rev_list = rev_list->next;
 	}
 
-	return commit->object.sha1;
+	return commit;
 }
 
 static int find_common(int fd[2], unsigned char *result_sha1,
@@ -161,6 +391,7 @@ static int find_common(int fd[2], unsigned char *result_sha1,
 {
 	int fetching;
 	int count = 0, flushes = 0, retval;
+	struct commit *commit;
 	const unsigned char *sha1;
 	unsigned in_vain = 0;
 	int got_continue = 0;
@@ -243,11 +474,21 @@ static int find_common(int fd[2], unsigned char *result_sha1,
 
 	flushes = 0;
 	retval = -1;
-	while ((sha1 = get_rev())) {
+	while ((commit = get_rev())) {
+		sha1 = commit->object.sha1;
 		packet_write(fd[1], "have %s\n", sha1_to_hex(sha1));
 		if (args.verbose)
 			fprintf(stderr, "have %s\n", sha1_to_hex(sha1));
 		in_vain++;
+
+		if (outstanding) {
+			commit_list_insert(commit, &(outstanding_end->next));
+			outstanding_end = outstanding_end->next;
+		} else {
+			commit_list_insert(commit, &outstanding);
+			outstanding_end = outstanding;
+		}
+
 		if (!(31 & ++count)) {
 			int ack;
 
@@ -274,6 +515,16 @@ static int find_common(int fd[2], unsigned char *result_sha1,
 				} else if (ack == 2) {
 					struct commit *commit =
 						lookup_commit(result_sha1);
+					struct commit_list *item;
+					while (commit != outstanding->item) {
+						mark_not_common(commit);
+						item = outstanding;
+						outstanding = item->next;
+						free(item);
+					}
+					item = outstanding;
+					outstanding = item->next;
+					free(item);
 					mark_common(commit, 0, 1);
 					retval = 0;
 					in_vain = 0;
@@ -445,7 +696,7 @@ static int everything_local(struct ref **refs, int nr_match, char **match)
 			continue;
 
 		if (!(o->flags & SEEN)) {
-			rev_list_push((struct commit *)o, COMMON_REF | SEEN);
+			rev_list_push((struct commit *)o, COMMON_REF | SEEN, 1, 1, 1);
 
 			mark_common((struct commit *)o, 1, 1);
 		}
diff --git a/upload-pack.c b/upload-pack.c
index e5adbc0..c6dfb32 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -414,9 +414,9 @@ static int get_common_commits(void)
 		if (!prefixcmp(line, "have ")) {
 			switch (got_sha1(line+5, sha1)) {
 			case -1: /* they have what we do not */
-				if (multi_ack && ok_to_give_up())
+/*				if (multi_ack && ok_to_give_up())
 					packet_write(1, "ACK %s continue\n",
-						     sha1_to_hex(sha1));
+					sha1_to_hex(sha1));*/
 				break;
 			default:
 				memcpy(hex, sha1_to_hex(sha1), 41);
-- 
tg: (1293c95..) t/fetch-pack-speedup (depends on: origin/master)

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [RFC] log(n)-transmissions common commit handshake
  2008-09-17 23:00 [RFC] log(n)-transmissions common commit handshake Thomas Rast
  2008-09-17 23:01 ` [RFC PATCH] fetch-pack: log(n)-transmission find_common() Thomas Rast
@ 2008-09-18  8:18 ` Thomas Rast
  1 sibling, 0 replies; 13+ messages in thread
From: Thomas Rast @ 2008-09-18  8:18 UTC (permalink / raw)
  To: git

[-- Attachment #1: Type: text/plain, Size: 1446 bytes --]

I wrote:
> * Impact/clever use of refs
> 
>   For some reason, current git sends all refs to the server, even if
>   the server should already know about lots of them.  For example, in
>   git.git, emitting v1.6.0.2 covers almost all tags in the repository
>   by simple ancestor scanning.
> 
>   Is there a reason for this behaviour?  Otherwise it would be better
>   to emit them in date order and intelligently handle commons.  (In
>   fact this does not depend on the discussed change.)

As an addendum, I think the following would be a good way to cleverly
use refs to reduce work:

Cache a "reduced" DAG which just maps the ref'd commit relationships,
i.e., shows the reachability of refs only.  This needs to be written
out to disk between invocations.

At the start of the protocol, the server announces all its refs.  We
can use the reduced DAG to infer the minimal set of ref heads we need
to announce to have the server know all common ones.  We can also mark
all the other refs as "common but not announced yet", so that the
backwards marking and searching routines know to stop there.

This should reduce the number of refs listed back to the server to
only a handful, and at the same time, stop the client from searching
backwards through _all_ history (which can take a bit of time, and is
one of the weaknesses of my proposal) in most cases.

- Thomas

-- 
Thomas Rast
trast@student.ethz.ch



[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 197 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [RFC PATCH v2] fetch-pack: log(n)-transmission find_common()
  2008-09-17 23:01 ` [RFC PATCH] fetch-pack: log(n)-transmission find_common() Thomas Rast
@ 2008-09-24  0:48   ` Thomas Rast
  2008-10-23 19:38     ` Thomas Rast
  0 siblings, 1 reply; 13+ messages in thread
From: Thomas Rast @ 2008-09-24  0:48 UTC (permalink / raw)
  To: git; +Cc: Clemens Buchacher

Replaces the existing simple history search with a more sophisticated
algorithm:

1) Walk history with exponentially increasing stride lengths; i.e.,
   send the 1st commit, then the 2nd after that, then the 4th after
   that, and so on.

2) Bisect the resulting intervals.

Combined with tracking the "outstanding haves" so that we can detect
which sha1s were never ACKed by upload-pack (and hence not common),
this gives O(log(n)) required "have" lines.

Unfortunately this cannot work if the server sends fake ACKs, so we
introduce a capability 'exp-stride' which instructs upload-pack to
disable ok_to_give_up().  (Which incidentally saves the server a lot
of work.)

Signed-off-by: Thomas Rast <trast@student.ethz.ch>

---

So, after the resounding success that v1 of this patch was, I went
through and actually made it work, or at least I hope so.

I eventually gave up on trying to understand the exact interaction of
mark_common(), which sometimes pushes, and rev_list_push(), which
sometimes marks, and just rewrote them to each only do one thing.  I
also rewrote the old strategy in terms of the new get_rev_stride
(always keeping stride length at 1) because that was far less work
than integrating the existing code with the new one.

For the timings below, I ran the (new) daemon locally, with two repos:

- a 'clone --mirror' of my own git.git, with a branch 'next' 525
  commits from the above cc185a6

- a 'clone --mirror' of the completely unrelated adps.git, which
  shares no objects with git.git

Some not really statistically sound timings, usually best of 3-5:

  ## A stripped down git.git, with only one branch and no tags
  $ git branch -a
  * master
  $ git tag
  $ git log -1 --pretty=oneline master
  cc185a6a8ac24737a26ec4b40cc401c2db8b2e97 Start draft release notes for 1.6.0.3
  
  ## I still have the old (dashed) forms in /usr/bin from RPMs
  $ /usr/bin/git --version
  git version 1.5.6

  ## (1a) "Updating" from the git.git that is further ahead (OLD)
  $ time /usr/bin/git-fetch-pack -k git://localhost/git next
  [...]
  real    0m1.004s
  user    0m0.228s
  sys     0m0.028s

  ## (1b) "Updating" from the git.git that is further ahead (NEW)
  $ time git fetch-pack -k git://localhost/git next
  [...]
  real    0m0.977s
  user    0m0.208s
  sys     0m0.068s

  ## (2a) Fetching the unrelated repo from scratch (OLD)
  $ time /usr/bin/git-fetch-pack -k git://localhost/adps master
  [...]
  real    0m9.560s
  user    0m0.720s
  sys     0m0.128s

  ## (2b) Fetching the unrelated repo from scratch (NEW)
  $ time git fetch-pack -k git://localhost/adps master
  [...]
  real    0m0.596s
  user    0m0.372s
  sys     0m0.008s

  ## (3a) Fetching over the network from real repo (OLD)
  $ time /usr/bin/git-fetch-pack git://git.kernel.org/pub/scm/git/git.git next
  [...]
  user    0m0.528s
  sys     0m0.136s

  ## (3b) Fetching over the network from real repo (NEW)
  $ time git fetch-pack git://git.kernel.org/pub/scm/git/git.git next
  [...]
  user    0m0.540s
  sys     0m0.180s

  ## Add more refs to make it more interesting
  $ git remote add origin ~/dev/git
  $ git fetch --tags origin

  ## (4a) like 1a, but with lots of tags
  $ time /usr/bin/git-fetch-pack -k git://localhost/git next
  [...]
  real    0m1.075s
  user    0m0.452s
  sys     0m0.048s

  ## (4b) like 1b, but with lots of tags
  $ time git fetch-pack -k git://localhost/git next
  [...]
  real    0m0.837s
  user    0m0.236s
  sys     0m0.040s

Clearly, this approach does solve the issue I mentioned in the
motivation earlier in the thread, where the initial fetch of
completely disjoint repositories takes _ages_.  Somewhat to my own
surprise, it seems to do quite well in _all_ cases even though it
aggressively digs through history.

Note, however, that the timings aren't really solid; the final write
of the pack, which I assume is fsync()ed, seems to have a big impact
on the execution time.  Test (3) obviously depends more on network
performance than anything else; and it uses the old (linear)
algorithm, but in the new implementation.

I'd appreciate testers and eyeballs on the patch.  I'll think about
the ref heads "reduced DAG" again, to possibly shave off some more
(client side cpu) time when I get the time, but that likely won't be
in the next few days.

- Thomas


PS: Thanks drizzd for the discussion of the topic on IRC.

PPS: Junio, enjoy your holidays :-)



 builtin-fetch-pack.c |  477 +++++++++++++++++++++++++++++++++++++++++---------
 upload-pack.c        |    7 +-
 2 files changed, 400 insertions(+), 84 deletions(-)

diff --git a/builtin-fetch-pack.c b/builtin-fetch-pack.c
index 4dfef29..721fe08 100644
--- a/builtin-fetch-pack.c
+++ b/builtin-fetch-pack.c
@@ -25,6 +25,12 @@ static const char fetch_pack_usage[] =
 #define COMMON_REF	(1U << 2)
 #define SEEN		(1U << 3)
 #define POPPED		(1U << 4)
+#define BISECTED_FW	(1U << 5)
+#define BISECTED_BW	(1U << 6)
+#define NOTCOMMON       (1U << 7)
+#define HAS_INFO        (1U << 8)
+#define CLEAN_MARKS     (COMMON | COMMON_REF | SEEN | POPPED | BISECTED_FW \
+			 | BISECTED_BW | NOTCOMMON)
 
 static int marked;
 
@@ -34,133 +40,412 @@ static int marked;
  */
 #define MAX_IN_VAIN 256
 
-static struct commit_list *rev_list;
-static int non_common_revs, multi_ack, use_sideband;
+struct work_list
+{
+	struct work_list *next;
+	struct commit *commit;
+	unsigned int stride;    /* current stride length between commits */
+	unsigned int steps;     /* steps left to go before emitting again */
+	unsigned int depth;     /* sideline depth */
+	unsigned int bisect;
+};
+
+/* "recursion stack" of our strategy of choosing sha1s to transmit */
+static struct work_list *work_list = NULL;
 
-static void rev_list_push(struct commit *commit, int mark)
+/* list of commits not ACKed yet */
+static struct commit_list *outstanding = NULL;
+static struct commit_list *outstanding_end = NULL;
+
+struct commit_info
 {
-	if (!(commit->object.flags & mark)) {
-		commit->object.flags |= mark;
+	struct commit_list *children;
+};
 
-		if (!(commit->object.parsed))
-			if (parse_commit(commit))
-				return;
 
-		insert_by_date(commit, &rev_list);
+static int non_common_revs = 1, multi_ack, use_sideband, exp_stride_algorithm;
 
-		if (!(commit->object.flags & COMMON))
-			non_common_revs++;
+static struct commit_info *get_commit_info(struct commit *commit)
+{
+	if (commit->object.flags & HAS_INFO)
+		return commit->util;
+
+	struct commit_info *info = xmalloc(sizeof(struct commit_info));
+	info->children = NULL;
+	commit->util = info;
+	commit->object.flags |= HAS_INFO;
+	return info;
+}
+
+static void work_list_insert(struct work_list *work_item)
+{
+	struct work_list **pp = &work_list;
+	struct work_list *p;
+
+	while ((p = *pp)) {
+		if (p->bisect || work_item->bisect) {
+			if (p->bisect > work_item->bisect)
+				break;
+		} else {
+			if (p->depth > work_item->depth)
+				break;
+			else if (p->commit->date < work_item->commit->date)
+				break;
+		}
+
+		pp = &p->next;
 	}
+
+	work_item->next = p;
+	*pp = work_item;
 }
 
-static int rev_list_insert_ref(const char *path, const unsigned char *sha1, int flag, void *cb_data)
+static void add_child(struct commit *parent, struct commit *child)
+{
+	struct commit_info *info = get_commit_info(parent);
+	struct commit_list *item;
+	for (item = info->children; item; item = item->next)
+		if (item->item == child)
+			break;
+	if (!item)
+		commit_list_insert(child, &(info->children));
+}
+
+
+/*
+  Marks all (known) children of this commit NOTCOMMON.  We call this
+  for all sha1s the server did not ACK.
+*/
+
+static void mark_not_common(struct commit *commit)
+{
+	struct commit_info *info = get_commit_info(commit);
+	struct commit_list *child;
+
+	assert(!(commit->object.flags & COMMON));
+
+	if (commit->object.flags & NOTCOMMON)
+		return;
+
+	commit->object.flags |= NOTCOMMON;
+
+	for (child = info->children; child; child = child->next)
+		mark_not_common(child->item);
+}
+
+
+static void check_parsed_and_mark(struct commit *commit)
+{
+	struct commit_list *parent;
+
+	if (!commit && commit->object.parsed)
+		return;
+
+	parse_commit(commit);
+
+	for (parent = commit->parents; parent; parent = parent->next) {
+		add_child(parent->item, commit);
+		if (parent->item->object.flags & NOTCOMMON)
+			mark_not_common(commit);
+	}
+}
+
+
+static void rev_list_push(struct commit *commit, unsigned int stride,
+			  unsigned int steps, unsigned int depth)
+{
+	struct work_list *work_item;
+
+	check_parsed_and_mark(commit);
+
+	work_item = xmalloc(sizeof(struct work_list));
+	work_item->commit = commit;
+	work_item->stride = stride;
+	work_item->steps = steps;
+	work_item->depth = depth;
+	work_item->bisect = 0;
+	work_list_insert(work_item);
+}
+
+static int rev_list_insert_ref(const char *path, const unsigned char *sha1,
+			       int flag, void *cb_data)
 {
 	struct object *o = deref_tag(parse_object(sha1), path, 0);
 
-	if (o && o->type == OBJ_COMMIT)
-		rev_list_push((struct commit *)o, SEEN);
+	if (o && o->type == OBJ_COMMIT && !(o->flags & SEEN))
+		rev_list_push((struct commit *)o, 1, 1, 0);
 
 	return 0;
 }
 
-static int clear_marks(const char *path, const unsigned char *sha1, int flag, void *cb_data)
+static int clear_marks(const char *path, const unsigned char *sha1, int flag,
+		       void *cb_data)
 {
 	struct object *o = deref_tag(parse_object(sha1), path, 0);
 
 	if (o && o->type == OBJ_COMMIT)
-		clear_commit_marks((struct commit *)o,
-				   COMMON | COMMON_REF | SEEN | POPPED);
+		clear_commit_marks((struct commit *)o, CLEAN_MARKS);
 	return 0;
 }
 
+
 /*
-   This function marks a rev and its ancestors as common.
-   In some cases, it is desirable to mark only the ancestors (for example
-   when only the server does not yet know that they are common).
+  Mark commits backwards through history as COMMON.
 */
 
-static void mark_common(struct commit *commit,
-		int ancestors_only, int dont_parse)
+static void mark_common(struct commit *commit, int ancestors_only, int dont_parse)
 {
-	if (commit != NULL && !(commit->object.flags & COMMON)) {
-		struct object *o = (struct object *)commit;
+	struct commit_list *parents;
+	struct object *o;
 
-		if (!ancestors_only)
-			o->flags |= COMMON;
+	if (commit == NULL || commit->object.flags & (COMMON_REF|COMMON))
+		return;
 
-		if (!(o->flags & SEEN))
-			rev_list_push(commit, SEEN);
-		else {
-			struct commit_list *parents;
-
-			if (!ancestors_only && !(o->flags & POPPED))
-				non_common_revs--;
-			if (!o->parsed && !dont_parse)
-				if (parse_commit(commit))
-					return;
-
-			for (parents = commit->parents;
-					parents;
-					parents = parents->next)
-				mark_common(parents->item, 0, dont_parse);
-		}
+	if (!dont_parse && !(commit->object.parsed))
+		check_parsed_and_mark(commit);
+
+	o = (struct object *)commit;
+
+	if (!ancestors_only)
+		o->flags |= COMMON;
+
+	for (parents = commit->parents; parents; parents = parents->next) {
+		add_child(parents->item, commit);
+		mark_common(parents->item, 0, dont_parse);
 	}
 }
 
+
+
 /*
-  Get the next rev to send, ignoring the common.
+  Queue all revs in the list for simple backward search. Does not
+  requeue already SEEN commits.
 */
 
-static const unsigned char* get_rev(void)
+static void queue_list(struct commit *child, struct commit_list *list,
+		       unsigned int stride, unsigned int steps,
+		       unsigned int depth)
 {
-	struct commit *commit = NULL;
+	while (list) {
+		add_child(list->item, child);
+		if (!(list->item->object.flags & SEEN))
+			rev_list_push(list->item, stride, steps, depth);
+		list = list->next;
+	}
+}
+
+
+/*
+  Queues a commit for bisection
+*/
+
+static void setup_bisect(struct commit *commit, int depth)
+{
+	struct work_list *work_item = xmalloc(sizeof(struct work_list));
+
+	work_item->commit = commit;
+	work_item->bisect = 1+depth;
+	work_item->depth = 0;
+	work_list_insert(work_item);
+}
+
+
+/*
+  Queues a list of commits for bisection
+*/
+
+static void setup_bisect_all (struct commit_list* list, int depth)
+{
+	while (list) {
+		setup_bisect(list->item, depth);
+		list = list->next;
+	}
+}
+
+
+/*
+  Backwards search. If we use exp_stride_algorithm, it takes
+  exponential strides between commits chosen.
+*/
+
+static struct commit* get_rev_stride(struct work_list *work_item)
+{
+	struct commit *commit = work_item->commit;
+	unsigned int steps = work_item->steps;
+
+	while (steps++ < work_item->stride-1 && commit->parents
+	       && !(commit->object.flags & (SEEN|COMMON|COMMON_REF))) {
+		/* all but the first parent line are queued for later */
+		if (commit->parents)
+			queue_list(commit, commit->parents->next,
+				   work_item->stride, steps,
+				   work_item->depth+1);
+
+		/* follow the first parent line directly */
+		commit->object.flags |= SEEN;
+		add_child(commit->parents->item, commit);
+		commit = commit->parents->item;
+		check_parsed_and_mark(commit);
+	}
+
+	if (commit->object.flags & (SEEN|COMMON)) {
+		/* already been here! */
+		commit = NULL;
+	} else if (commit->object.flags & COMMON_REF) {
+		/* this came from a ref; we stop here, but need to
+		 * emit it so the server knows too */
+		commit->object.flags |= POPPED|SEEN|BISECTED_BW;
+		if (exp_stride_algorithm)
+			setup_bisect(commit, 0);
+	} else {
+		/* usual case: this is news to us, so we scan further
+		 * back.  note that the first-depth bisection only
+		 * needs to go forward */
+		commit->object.flags |= POPPED|SEEN|BISECTED_BW;
+		queue_list(commit, commit->parents,
+			   exp_stride_algorithm ? work_item->stride*2 : 1,
+			   0, work_item->depth);
+		if (exp_stride_algorithm)
+			setup_bisect(commit, 0);
+	}
+
+	free(work_item);
+
+	return commit;
+}
 
-	while (commit == NULL) {
-		unsigned int mark;
-		struct commit_list *parents;
 
-		if (rev_list == NULL || non_common_revs == 0)
+/*
+  Bisection state of the exponential stride algorithm: scans forward
+  (backward) for a non-common/already-sent (common/already-sent,
+  resp.) commit, then emits the middle of that range, and re-queues.
+
+  This means we bisect the discovered range in log(n) transmissions,
+  although we do make O(n) steps through history.
+*/
+
+static struct commit* get_rev_bisect(struct work_list *work_item)
+{
+	struct commit *full;
+	struct commit *half;
+	int half_step;
+	struct commit_info *info;
+	unsigned int flags = work_item->commit->object.flags;
+
+	if (!(flags & (COMMON|BISECTED_BW))) {
+		/* We don't know anything about the history _backward_
+		 * from this, so search it */
+
+		full = work_item->commit;
+		full->object.flags |= BISECTED_BW;
+		half = work_item->commit;
+		half_step = 0;
+
+		while (full && full->parents) {
+			full = full->parents->item;
+			half_step ^= 1;
+			if (half_step)
+				half = half->parents->item;
+			if (full->object.flags & (POPPED|COMMON))
+				break;
+		}
+
+		/* also insert the same bisection again so we can try
+		 * forward too */
+		work_list_insert(work_item);
+
+		if (full->object.flags & POPPED
+		    && !(full->object.flags & NOTCOMMON)
+		    && !(half->object.flags & (COMMON|POPPED))) {
+			setup_bisect(half, work_item->bisect);
+			half->object.flags |= POPPED;
+			return half;
+		} else {
 			return NULL;
+		}
+	}
 
-		commit = rev_list->item;
-		if (!commit->object.parsed)
-			parse_commit(commit);
-		parents = commit->parents;
-
-		commit->object.flags |= POPPED;
-		if (!(commit->object.flags & COMMON))
-			non_common_revs--;
-
-		if (commit->object.flags & COMMON) {
-			/* do not send "have", and ignore ancestors */
-			commit = NULL;
-			mark = COMMON | SEEN;
-		} else if (commit->object.flags & COMMON_REF)
-			/* send "have", and ignore ancestors */
-			mark = COMMON | SEEN;
-		else
-			/* send "have", also for its ancestors */
-			mark = SEEN;
-
-		while (parents) {
-			if (!(parents->item->object.flags & SEEN))
-				rev_list_push(parents->item, mark);
-			if (mark & COMMON)
-				mark_common(parents->item, 1, 0);
-			parents = parents->next;
+	if (!(flags & (NOTCOMMON|BISECTED_FW))) {
+		/* We don't know anything about the history _forward_
+		 * from this, so search it */
+
+		full = work_item->commit;
+		full->object.flags |= BISECTED_FW;
+		half = work_item->commit;
+		half_step = 0;
+
+		while (full && (info=get_commit_info(full))->children) {
+			setup_bisect_all(info->children->next, work_item->bisect);
+			full = info->children->item;
+			half_step ^= 1;
+			if (half_step) {
+				info = get_commit_info(half);
+				half = info->children->item;
+			}
+			if (full->object.flags & (POPPED|NOTCOMMON))
+				break;
 		}
 
-		rev_list = rev_list->next;
+		if (full->object.flags & POPPED
+		    && !(full->object.flags & COMMON)
+		    && !(half->object.flags & (NOTCOMMON|POPPED))) {
+			setup_bisect(half, work_item->bisect);
+			half->object.flags |= POPPED;
+			free(work_item);
+			return half;
+		}
 	}
 
-	return commit->object.sha1;
+	free(work_item);
+	return NULL;
 }
 
+
+/*
+  Get the next revision to send.
+*/
+
+static struct commit *get_rev(void)
+{
+	struct commit *commit = NULL;
+
+	while (commit == NULL && work_list) {
+		struct work_list *work_item = NULL;
+
+		work_item = work_list;
+		/* we update the pointer early so the get_rev
+		 * functions can free() or reuse the work_item as
+		 * required */
+		work_list = work_item->next;
+
+		commit = work_item->commit;
+
+		check_parsed_and_mark(commit);
+
+		if (work_item->bisect) {
+			commit = get_rev_bisect(work_item);
+		} else {
+			commit = get_rev_stride(work_item);
+		}
+	}
+
+	return commit;
+}
+
+static void pop_outstanding()
+{
+	struct commit_list *item = outstanding;
+	outstanding = item->next;
+	free(item);
+}
+
+
 static int find_common(int fd[2], unsigned char *result_sha1,
 		       struct ref *refs)
 {
 	int fetching;
 	int count = 0, flushes = 0, retval;
+	struct commit *commit;
 	const unsigned char *sha1;
 	unsigned in_vain = 0;
 	int got_continue = 0;
@@ -192,7 +477,7 @@ static int find_common(int fd[2], unsigned char *result_sha1,
 		}
 
 		if (!fetching)
-			packet_write(fd[1], "want %s%s%s%s%s%s%s%s\n",
+			packet_write(fd[1], "want %s%s%s%s%s%s%s%s%s\n",
 				     sha1_to_hex(remote),
 				     (multi_ack ? " multi_ack" : ""),
 				     (use_sideband == 2 ? " side-band-64k" : ""),
@@ -200,6 +485,7 @@ static int find_common(int fd[2], unsigned char *result_sha1,
 				     (args.use_thin_pack ? " thin-pack" : ""),
 				     (args.no_progress ? " no-progress" : ""),
 				     (args.include_tag ? " include-tag" : ""),
+				     (exp_stride_algorithm ? " exp-stride" : ""),
 				     " ofs-delta");
 		else
 			packet_write(fd[1], "want %s\n", sha1_to_hex(remote));
@@ -243,13 +529,25 @@ static int find_common(int fd[2], unsigned char *result_sha1,
 
 	flushes = 0;
 	retval = -1;
-	while ((sha1 = get_rev())) {
+	while ((commit = get_rev())) {
+		sha1 = commit->object.sha1;
 		packet_write(fd[1], "have %s\n", sha1_to_hex(sha1));
 		if (args.verbose)
 			fprintf(stderr, "have %s\n", sha1_to_hex(sha1));
 		in_vain++;
+
+		if (outstanding) {
+			commit_list_insert(commit, &(outstanding_end->next));
+			outstanding_end = outstanding_end->next;
+		} else {
+			commit_list_insert(commit, &outstanding);
+			outstanding_end = outstanding;
+		}
+
 		if (!(31 & ++count)) {
 			int ack;
+			int unwound = 0;
+			struct commit_list *item;
 
 			packet_flush(fd[1]);
 			flushes++;
@@ -274,12 +572,23 @@ static int find_common(int fd[2], unsigned char *result_sha1,
 				} else if (ack == 2) {
 					struct commit *commit =
 						lookup_commit(result_sha1);
+					while (commit != outstanding->item) {
+						mark_not_common(outstanding->item);
+						pop_outstanding();
+						unwound++;
+					}
+					pop_outstanding();
+					unwound++;
 					mark_common(commit, 0, 1);
 					retval = 0;
 					in_vain = 0;
 					got_continue = 1;
 				}
 			} while (ack);
+			while (unwound++ < 32) {
+				mark_not_common(outstanding->item);
+				pop_outstanding();
+			}
 			flushes--;
 			if (got_continue && MAX_IN_VAIN < in_vain) {
 				if (args.verbose)
@@ -445,9 +754,8 @@ static int everything_local(struct ref **refs, int nr_match, char **match)
 			continue;
 
 		if (!(o->flags & SEEN)) {
-			rev_list_push((struct commit *)o, COMMON_REF | SEEN);
-
-			mark_common((struct commit *)o, 1, 1);
+			rev_list_push((struct commit *)o, 1, 0, 0);
+			o->flags |= COMMON_REF;
 		}
 	}
 
@@ -597,6 +905,11 @@ static struct ref *do_fetch_pack(int fd[2],
 			fprintf(stderr, "Server supports side-band\n");
 		use_sideband = 1;
 	}
+	if (server_supports("exp-stride")) {
+		if (args.verbose)
+			fprintf(stderr, "Server supports exp-stride\n");
+		exp_stride_algorithm = 1;
+	}
 	if (everything_local(&ref, nr_match, match)) {
 		packet_flush(fd[1]);
 		goto all_done;
diff --git a/upload-pack.c b/upload-pack.c
index e5adbc0..14012ee 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -37,6 +37,7 @@ static unsigned int timeout;
  */
 static int use_sideband;
 static int debug_fd;
+static int exp_stride_algorithm;
 
 static void reset_timeout(void)
 {
@@ -414,7 +415,7 @@ static int get_common_commits(void)
 		if (!prefixcmp(line, "have ")) {
 			switch (got_sha1(line+5, sha1)) {
 			case -1: /* they have what we do not */
-				if (multi_ack && ok_to_give_up())
+				if (multi_ack && !exp_stride_algorithm && ok_to_give_up())
 					packet_write(1, "ACK %s continue\n",
 						     sha1_to_hex(sha1));
 				break;
@@ -501,6 +502,8 @@ static void receive_needs(void)
 			no_progress = 1;
 		if (strstr(line+45, "include-tag"))
 			use_include_tag = 1;
+		if (strstr(line+45, "exp-stride"))
+			exp_stride_algorithm = 1;
 
 		/* We have sent all our refs already, and the other end
 		 * should have chosen out of them; otherwise they are
@@ -573,7 +576,7 @@ static int send_ref(const char *refname, const unsigned char *sha1, int flag, vo
 {
 	static const char *capabilities = "multi_ack thin-pack side-band"
 		" side-band-64k ofs-delta shallow no-progress"
-		" include-tag";
+		" include-tag exp-stride";
 	struct object *o = parse_object(sha1);
 
 	if (!o)
-- 
tg: (1293c95..) t/fetch-pack-speedup (depends on: origin/master)

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [RFC PATCH v2] fetch-pack: log(n)-transmission find_common()
  2008-09-24  0:48   ` [RFC PATCH v2] " Thomas Rast
@ 2008-10-23 19:38     ` Thomas Rast
  2008-10-24 15:49       ` Nicolas Pitre
  2008-10-27 10:29       ` Nanako Shiraishi
  0 siblings, 2 replies; 13+ messages in thread
From: Thomas Rast @ 2008-10-23 19:38 UTC (permalink / raw)
  To: git

Replaces the existing simple history search with a more sophisticated
algorithm:

1) Walk history with exponentially increasing stride lengths; i.e.,
   send the 1st commit, then the 2nd after that, then the 4th after
   that, and so on.

2) Bisect the resulting intervals.

Combined with tracking the "outstanding haves" so that we can detect
which sha1s were never ACKed by upload-pack (and hence not common),
this gives O(log(n)) required "have" lines.

Unfortunately this cannot work if the server sends fake ACKs, so we
introduce a capability 'exp-stride' which instructs upload-pack to
disable ok_to_give_up().  (Which incidentally saves the server a lot
of work.)

Signed-off-by: Thomas Rast <trast@student.ethz.ch>

---

This is a simple resend of v2, in the hope of attracting some
discussion or at least encouraging words this time around.

Here are the timings I took when I wrote the last mail:

-- 8< --
For the timings below, I ran the (new) daemon locally, with two repos:

- a 'clone --mirror' of my own git.git, with a branch 'next' 525
  commits from the above cc185a6

- a 'clone --mirror' of the completely unrelated adps.git, which
  shares no objects with git.git

Some not really statistically sound timings, usually best of 3-5:

  ## A stripped down git.git, with only one branch and no tags
  $ git branch -a
  * master
  $ git tag
  $ git log -1 --pretty=oneline master
  cc185a6a8ac24737a26ec4b40cc401c2db8b2e97 Start draft release notes for 1.6.0.3
  
  ## I still have the old (dashed) forms in /usr/bin from RPMs
  $ /usr/bin/git --version
  git version 1.5.6

  ## (1a) "Updating" from the git.git that is further ahead (OLD)
  $ time /usr/bin/git-fetch-pack -k git://localhost/git next
  [...]
  real    0m1.004s
  user    0m0.228s
  sys     0m0.028s

  ## (1b) "Updating" from the git.git that is further ahead (NEW)
  $ time git fetch-pack -k git://localhost/git next
  [...]
  real    0m0.977s
  user    0m0.208s
  sys     0m0.068s

  ## (2a) Fetching the unrelated repo from scratch (OLD)
  $ time /usr/bin/git-fetch-pack -k git://localhost/adps master
  [...]
  real    0m9.560s
  user    0m0.720s
  sys     0m0.128s

  ## (2b) Fetching the unrelated repo from scratch (NEW)
  $ time git fetch-pack -k git://localhost/adps master
  [...]
  real    0m0.596s
  user    0m0.372s
  sys     0m0.008s

  ## (3a) Fetching over the network from real repo (OLD)
  $ time /usr/bin/git-fetch-pack git://git.kernel.org/pub/scm/git/git.git next
  [...]
  user    0m0.528s
  sys     0m0.136s

  ## (3b) Fetching over the network from real repo (NEW)
  $ time git fetch-pack git://git.kernel.org/pub/scm/git/git.git next
  [...]
  user    0m0.540s
  sys     0m0.180s

  ## Add more refs to make it more interesting
  $ git remote add origin ~/dev/git
  $ git fetch --tags origin

  ## (4a) like 1a, but with lots of tags
  $ time /usr/bin/git-fetch-pack -k git://localhost/git next
  [...]
  real    0m1.075s
  user    0m0.452s
  sys     0m0.048s

  ## (4b) like 1b, but with lots of tags
  $ time git fetch-pack -k git://localhost/git next
  [...]
  real    0m0.837s
  user    0m0.236s
  sys     0m0.040s

Clearly, this approach does solve the issue I mentioned in the
motivation earlier in the thread, where the initial fetch of
completely disjoint repositories takes _ages_.  Somewhat to my own
surprise, it seems to do quite well in _all_ cases even though it
aggressively digs through history.

Note, however, that the timings aren't really solid; the final write
of the pack, which I assume is fsync()ed, seems to have a big impact
on the execution time.  Test (3) obviously depends more on network
performance than anything else; and it uses the old (linear)
algorithm, but in the new implementation.
-- >8 --

 builtin-fetch-pack.c |  477 +++++++++++++++++++++++++++++++++++++++++---------
 upload-pack.c        |    7 +-
 2 files changed, 400 insertions(+), 84 deletions(-)

diff --git a/builtin-fetch-pack.c b/builtin-fetch-pack.c
index 372bfa2..9ab84ab 100644
--- a/builtin-fetch-pack.c
+++ b/builtin-fetch-pack.c
@@ -25,6 +25,12 @@ static const char fetch_pack_usage[] =
 #define COMMON_REF	(1U << 2)
 #define SEEN		(1U << 3)
 #define POPPED		(1U << 4)
+#define BISECTED_FW	(1U << 5)
+#define BISECTED_BW	(1U << 6)
+#define NOTCOMMON       (1U << 7)
+#define HAS_INFO        (1U << 8)
+#define CLEAN_MARKS     (COMMON | COMMON_REF | SEEN | POPPED | BISECTED_FW \
+			 | BISECTED_BW | NOTCOMMON)
 
 static int marked;
 
@@ -34,133 +40,412 @@ static int marked;
  */
 #define MAX_IN_VAIN 256
 
-static struct commit_list *rev_list;
-static int non_common_revs, multi_ack, use_sideband;
+struct work_list
+{
+	struct work_list *next;
+	struct commit *commit;
+	unsigned int stride;    /* current stride length between commits */
+	unsigned int steps;     /* steps left to go before emitting again */
+	unsigned int depth;     /* sideline depth */
+	unsigned int bisect;
+};
+
+/* "recursion stack" of our strategy of choosing sha1s to transmit */
+static struct work_list *work_list = NULL;
 
-static void rev_list_push(struct commit *commit, int mark)
+/* list of commits not ACKed yet */
+static struct commit_list *outstanding = NULL;
+static struct commit_list *outstanding_end = NULL;
+
+struct commit_info
 {
-	if (!(commit->object.flags & mark)) {
-		commit->object.flags |= mark;
+	struct commit_list *children;
+};
 
-		if (!(commit->object.parsed))
-			if (parse_commit(commit))
-				return;
 
-		insert_by_date(commit, &rev_list);
+static int non_common_revs = 1, multi_ack, use_sideband, exp_stride_algorithm;
 
-		if (!(commit->object.flags & COMMON))
-			non_common_revs++;
+static struct commit_info *get_commit_info(struct commit *commit)
+{
+	if (commit->object.flags & HAS_INFO)
+		return commit->util;
+
+	struct commit_info *info = xmalloc(sizeof(struct commit_info));
+	info->children = NULL;
+	commit->util = info;
+	commit->object.flags |= HAS_INFO;
+	return info;
+}
+
+static void work_list_insert(struct work_list *work_item)
+{
+	struct work_list **pp = &work_list;
+	struct work_list *p;
+
+	while ((p = *pp)) {
+		if (p->bisect || work_item->bisect) {
+			if (p->bisect > work_item->bisect)
+				break;
+		} else {
+			if (p->depth > work_item->depth)
+				break;
+			else if (p->commit->date < work_item->commit->date)
+				break;
+		}
+
+		pp = &p->next;
 	}
+
+	work_item->next = p;
+	*pp = work_item;
 }
 
-static int rev_list_insert_ref(const char *path, const unsigned char *sha1, int flag, void *cb_data)
+static void add_child(struct commit *parent, struct commit *child)
+{
+	struct commit_info *info = get_commit_info(parent);
+	struct commit_list *item;
+	for (item = info->children; item; item = item->next)
+		if (item->item == child)
+			break;
+	if (!item)
+		commit_list_insert(child, &(info->children));
+}
+
+
+/*
+  Marks all (known) children of this commit NOTCOMMON.  We call this
+  for all sha1s the server did not ACK.
+*/
+
+static void mark_not_common(struct commit *commit)
+{
+	struct commit_info *info = get_commit_info(commit);
+	struct commit_list *child;
+
+	assert(!(commit->object.flags & COMMON));
+
+	if (commit->object.flags & NOTCOMMON)
+		return;
+
+	commit->object.flags |= NOTCOMMON;
+
+	for (child = info->children; child; child = child->next)
+		mark_not_common(child->item);
+}
+
+
+static void check_parsed_and_mark(struct commit *commit)
+{
+	struct commit_list *parent;
+
+	if (!commit && commit->object.parsed)
+		return;
+
+	parse_commit(commit);
+
+	for (parent = commit->parents; parent; parent = parent->next) {
+		add_child(parent->item, commit);
+		if (parent->item->object.flags & NOTCOMMON)
+			mark_not_common(commit);
+	}
+}
+
+
+static void rev_list_push(struct commit *commit, unsigned int stride,
+			  unsigned int steps, unsigned int depth)
+{
+	struct work_list *work_item;
+
+	check_parsed_and_mark(commit);
+
+	work_item = xmalloc(sizeof(struct work_list));
+	work_item->commit = commit;
+	work_item->stride = stride;
+	work_item->steps = steps;
+	work_item->depth = depth;
+	work_item->bisect = 0;
+	work_list_insert(work_item);
+}
+
+static int rev_list_insert_ref(const char *path, const unsigned char *sha1,
+			       int flag, void *cb_data)
 {
 	struct object *o = deref_tag(parse_object(sha1), path, 0);
 
-	if (o && o->type == OBJ_COMMIT)
-		rev_list_push((struct commit *)o, SEEN);
+	if (o && o->type == OBJ_COMMIT && !(o->flags & SEEN))
+		rev_list_push((struct commit *)o, 1, 1, 0);
 
 	return 0;
 }
 
-static int clear_marks(const char *path, const unsigned char *sha1, int flag, void *cb_data)
+static int clear_marks(const char *path, const unsigned char *sha1, int flag,
+		       void *cb_data)
 {
 	struct object *o = deref_tag(parse_object(sha1), path, 0);
 
 	if (o && o->type == OBJ_COMMIT)
-		clear_commit_marks((struct commit *)o,
-				   COMMON | COMMON_REF | SEEN | POPPED);
+		clear_commit_marks((struct commit *)o, CLEAN_MARKS);
 	return 0;
 }
 
+
 /*
-   This function marks a rev and its ancestors as common.
-   In some cases, it is desirable to mark only the ancestors (for example
-   when only the server does not yet know that they are common).
+  Mark commits backwards through history as COMMON.
 */
 
-static void mark_common(struct commit *commit,
-		int ancestors_only, int dont_parse)
+static void mark_common(struct commit *commit, int ancestors_only, int dont_parse)
 {
-	if (commit != NULL && !(commit->object.flags & COMMON)) {
-		struct object *o = (struct object *)commit;
+	struct commit_list *parents;
+	struct object *o;
 
-		if (!ancestors_only)
-			o->flags |= COMMON;
+	if (commit == NULL || commit->object.flags & (COMMON_REF|COMMON))
+		return;
 
-		if (!(o->flags & SEEN))
-			rev_list_push(commit, SEEN);
-		else {
-			struct commit_list *parents;
-
-			if (!ancestors_only && !(o->flags & POPPED))
-				non_common_revs--;
-			if (!o->parsed && !dont_parse)
-				if (parse_commit(commit))
-					return;
-
-			for (parents = commit->parents;
-					parents;
-					parents = parents->next)
-				mark_common(parents->item, 0, dont_parse);
-		}
+	if (!dont_parse && !(commit->object.parsed))
+		check_parsed_and_mark(commit);
+
+	o = (struct object *)commit;
+
+	if (!ancestors_only)
+		o->flags |= COMMON;
+
+	for (parents = commit->parents; parents; parents = parents->next) {
+		add_child(parents->item, commit);
+		mark_common(parents->item, 0, dont_parse);
 	}
 }
 
+
+
 /*
-  Get the next rev to send, ignoring the common.
+  Queue all revs in the list for simple backward search. Does not
+  requeue already SEEN commits.
 */
 
-static const unsigned char* get_rev(void)
+static void queue_list(struct commit *child, struct commit_list *list,
+		       unsigned int stride, unsigned int steps,
+		       unsigned int depth)
 {
-	struct commit *commit = NULL;
+	while (list) {
+		add_child(list->item, child);
+		if (!(list->item->object.flags & SEEN))
+			rev_list_push(list->item, stride, steps, depth);
+		list = list->next;
+	}
+}
+
+
+/*
+  Queues a commit for bisection
+*/
+
+static void setup_bisect(struct commit *commit, int depth)
+{
+	struct work_list *work_item = xmalloc(sizeof(struct work_list));
+
+	work_item->commit = commit;
+	work_item->bisect = 1+depth;
+	work_item->depth = 0;
+	work_list_insert(work_item);
+}
+
+
+/*
+  Queues a list of commits for bisection
+*/
+
+static void setup_bisect_all (struct commit_list* list, int depth)
+{
+	while (list) {
+		setup_bisect(list->item, depth);
+		list = list->next;
+	}
+}
+
+
+/*
+  Backwards search. If we use exp_stride_algorithm, it takes
+  exponential strides between commits chosen.
+*/
+
+static struct commit* get_rev_stride(struct work_list *work_item)
+{
+	struct commit *commit = work_item->commit;
+	unsigned int steps = work_item->steps;
+
+	while (steps++ < work_item->stride-1 && commit->parents
+	       && !(commit->object.flags & (SEEN|COMMON|COMMON_REF))) {
+		/* all but the first parent line are queued for later */
+		if (commit->parents)
+			queue_list(commit, commit->parents->next,
+				   work_item->stride, steps,
+				   work_item->depth+1);
+
+		/* follow the first parent line directly */
+		commit->object.flags |= SEEN;
+		add_child(commit->parents->item, commit);
+		commit = commit->parents->item;
+		check_parsed_and_mark(commit);
+	}
+
+	if (commit->object.flags & (SEEN|COMMON)) {
+		/* already been here! */
+		commit = NULL;
+	} else if (commit->object.flags & COMMON_REF) {
+		/* this came from a ref; we stop here, but need to
+		 * emit it so the server knows too */
+		commit->object.flags |= POPPED|SEEN|BISECTED_BW;
+		if (exp_stride_algorithm)
+			setup_bisect(commit, 0);
+	} else {
+		/* usual case: this is news to us, so we scan further
+		 * back.  note that the first-depth bisection only
+		 * needs to go forward */
+		commit->object.flags |= POPPED|SEEN|BISECTED_BW;
+		queue_list(commit, commit->parents,
+			   exp_stride_algorithm ? work_item->stride*2 : 1,
+			   0, work_item->depth);
+		if (exp_stride_algorithm)
+			setup_bisect(commit, 0);
+	}
+
+	free(work_item);
+
+	return commit;
+}
 
-	while (commit == NULL) {
-		unsigned int mark;
-		struct commit_list *parents;
 
-		if (rev_list == NULL || non_common_revs == 0)
+/*
+  Bisection state of the exponential stride algorithm: scans forward
+  (backward) for a non-common/already-sent (common/already-sent,
+  resp.) commit, then emits the middle of that range, and re-queues.
+
+  This means we bisect the discovered range in log(n) transmissions,
+  although we do make O(n) steps through history.
+*/
+
+static struct commit* get_rev_bisect(struct work_list *work_item)
+{
+	struct commit *full;
+	struct commit *half;
+	int half_step;
+	struct commit_info *info;
+	unsigned int flags = work_item->commit->object.flags;
+
+	if (!(flags & (COMMON|BISECTED_BW))) {
+		/* We don't know anything about the history _backward_
+		 * from this, so search it */
+
+		full = work_item->commit;
+		full->object.flags |= BISECTED_BW;
+		half = work_item->commit;
+		half_step = 0;
+
+		while (full && full->parents) {
+			full = full->parents->item;
+			half_step ^= 1;
+			if (half_step)
+				half = half->parents->item;
+			if (full->object.flags & (POPPED|COMMON))
+				break;
+		}
+
+		/* also insert the same bisection again so we can try
+		 * forward too */
+		work_list_insert(work_item);
+
+		if (full->object.flags & POPPED
+		    && !(full->object.flags & NOTCOMMON)
+		    && !(half->object.flags & (COMMON|POPPED))) {
+			setup_bisect(half, work_item->bisect);
+			half->object.flags |= POPPED;
+			return half;
+		} else {
 			return NULL;
+		}
+	}
 
-		commit = rev_list->item;
-		if (!commit->object.parsed)
-			parse_commit(commit);
-		parents = commit->parents;
-
-		commit->object.flags |= POPPED;
-		if (!(commit->object.flags & COMMON))
-			non_common_revs--;
-
-		if (commit->object.flags & COMMON) {
-			/* do not send "have", and ignore ancestors */
-			commit = NULL;
-			mark = COMMON | SEEN;
-		} else if (commit->object.flags & COMMON_REF)
-			/* send "have", and ignore ancestors */
-			mark = COMMON | SEEN;
-		else
-			/* send "have", also for its ancestors */
-			mark = SEEN;
-
-		while (parents) {
-			if (!(parents->item->object.flags & SEEN))
-				rev_list_push(parents->item, mark);
-			if (mark & COMMON)
-				mark_common(parents->item, 1, 0);
-			parents = parents->next;
+	if (!(flags & (NOTCOMMON|BISECTED_FW))) {
+		/* We don't know anything about the history _forward_
+		 * from this, so search it */
+
+		full = work_item->commit;
+		full->object.flags |= BISECTED_FW;
+		half = work_item->commit;
+		half_step = 0;
+
+		while (full && (info=get_commit_info(full))->children) {
+			setup_bisect_all(info->children->next, work_item->bisect);
+			full = info->children->item;
+			half_step ^= 1;
+			if (half_step) {
+				info = get_commit_info(half);
+				half = info->children->item;
+			}
+			if (full->object.flags & (POPPED|NOTCOMMON))
+				break;
 		}
 
-		rev_list = rev_list->next;
+		if (full->object.flags & POPPED
+		    && !(full->object.flags & COMMON)
+		    && !(half->object.flags & (NOTCOMMON|POPPED))) {
+			setup_bisect(half, work_item->bisect);
+			half->object.flags |= POPPED;
+			free(work_item);
+			return half;
+		}
 	}
 
-	return commit->object.sha1;
+	free(work_item);
+	return NULL;
 }
 
+
+/*
+  Get the next revision to send.
+*/
+
+static struct commit *get_rev(void)
+{
+	struct commit *commit = NULL;
+
+	while (commit == NULL && work_list) {
+		struct work_list *work_item = NULL;
+
+		work_item = work_list;
+		/* we update the pointer early so the get_rev
+		 * functions can free() or reuse the work_item as
+		 * required */
+		work_list = work_item->next;
+
+		commit = work_item->commit;
+
+		check_parsed_and_mark(commit);
+
+		if (work_item->bisect) {
+			commit = get_rev_bisect(work_item);
+		} else {
+			commit = get_rev_stride(work_item);
+		}
+	}
+
+	return commit;
+}
+
+static void pop_outstanding()
+{
+	struct commit_list *item = outstanding;
+	outstanding = item->next;
+	free(item);
+}
+
+
 static int find_common(int fd[2], unsigned char *result_sha1,
 		       struct ref *refs)
 {
 	int fetching;
 	int count = 0, flushes = 0, retval;
+	struct commit *commit;
 	const unsigned char *sha1;
 	unsigned in_vain = 0;
 	int got_continue = 0;
@@ -192,7 +477,7 @@ static int find_common(int fd[2], unsigned char *result_sha1,
 		}
 
 		if (!fetching)
-			packet_write(fd[1], "want %s%s%s%s%s%s%s%s\n",
+			packet_write(fd[1], "want %s%s%s%s%s%s%s%s%s\n",
 				     sha1_to_hex(remote),
 				     (multi_ack ? " multi_ack" : ""),
 				     (use_sideband == 2 ? " side-band-64k" : ""),
@@ -200,6 +485,7 @@ static int find_common(int fd[2], unsigned char *result_sha1,
 				     (args.use_thin_pack ? " thin-pack" : ""),
 				     (args.no_progress ? " no-progress" : ""),
 				     (args.include_tag ? " include-tag" : ""),
+				     (exp_stride_algorithm ? " exp-stride" : ""),
 				     " ofs-delta");
 		else
 			packet_write(fd[1], "want %s\n", sha1_to_hex(remote));
@@ -243,13 +529,25 @@ static int find_common(int fd[2], unsigned char *result_sha1,
 
 	flushes = 0;
 	retval = -1;
-	while ((sha1 = get_rev())) {
+	while ((commit = get_rev())) {
+		sha1 = commit->object.sha1;
 		packet_write(fd[1], "have %s\n", sha1_to_hex(sha1));
 		if (args.verbose)
 			fprintf(stderr, "have %s\n", sha1_to_hex(sha1));
 		in_vain++;
+
+		if (outstanding) {
+			commit_list_insert(commit, &(outstanding_end->next));
+			outstanding_end = outstanding_end->next;
+		} else {
+			commit_list_insert(commit, &outstanding);
+			outstanding_end = outstanding;
+		}
+
 		if (!(31 & ++count)) {
 			int ack;
+			int unwound = 0;
+			struct commit_list *item;
 
 			packet_flush(fd[1]);
 			flushes++;
@@ -274,12 +572,23 @@ static int find_common(int fd[2], unsigned char *result_sha1,
 				} else if (ack == 2) {
 					struct commit *commit =
 						lookup_commit(result_sha1);
+					while (commit != outstanding->item) {
+						mark_not_common(outstanding->item);
+						pop_outstanding();
+						unwound++;
+					}
+					pop_outstanding();
+					unwound++;
 					mark_common(commit, 0, 1);
 					retval = 0;
 					in_vain = 0;
 					got_continue = 1;
 				}
 			} while (ack);
+			while (unwound++ < 32) {
+				mark_not_common(outstanding->item);
+				pop_outstanding();
+			}
 			flushes--;
 			if (got_continue && MAX_IN_VAIN < in_vain) {
 				if (args.verbose)
@@ -445,9 +754,8 @@ static int everything_local(struct ref **refs, int nr_match, char **match)
 			continue;
 
 		if (!(o->flags & SEEN)) {
-			rev_list_push((struct commit *)o, COMMON_REF | SEEN);
-
-			mark_common((struct commit *)o, 1, 1);
+			rev_list_push((struct commit *)o, 1, 0, 0);
+			o->flags |= COMMON_REF;
 		}
 	}
 
@@ -597,6 +905,11 @@ static struct ref *do_fetch_pack(int fd[2],
 			fprintf(stderr, "Server supports side-band\n");
 		use_sideband = 1;
 	}
+	if (server_supports("exp-stride")) {
+		if (args.verbose)
+			fprintf(stderr, "Server supports exp-stride\n");
+		exp_stride_algorithm = 1;
+	}
 	if (everything_local(&ref, nr_match, match)) {
 		packet_flush(fd[1]);
 		goto all_done;
diff --git a/upload-pack.c b/upload-pack.c
index e5adbc0..14012ee 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -37,6 +37,7 @@ static unsigned int timeout;
  */
 static int use_sideband;
 static int debug_fd;
+static int exp_stride_algorithm;
 
 static void reset_timeout(void)
 {
@@ -414,7 +415,7 @@ static int get_common_commits(void)
 		if (!prefixcmp(line, "have ")) {
 			switch (got_sha1(line+5, sha1)) {
 			case -1: /* they have what we do not */
-				if (multi_ack && ok_to_give_up())
+				if (multi_ack && !exp_stride_algorithm && ok_to_give_up())
 					packet_write(1, "ACK %s continue\n",
 						     sha1_to_hex(sha1));
 				break;
@@ -501,6 +502,8 @@ static void receive_needs(void)
 			no_progress = 1;
 		if (strstr(line+45, "include-tag"))
 			use_include_tag = 1;
+		if (strstr(line+45, "exp-stride"))
+			exp_stride_algorithm = 1;
 
 		/* We have sent all our refs already, and the other end
 		 * should have chosen out of them; otherwise they are
@@ -573,7 +576,7 @@ static int send_ref(const char *refname, const unsigned char *sha1, int flag, vo
 {
 	static const char *capabilities = "multi_ack thin-pack side-band"
 		" side-band-64k ofs-delta shallow no-progress"
-		" include-tag";
+		" include-tag exp-stride";
 	struct object *o = parse_object(sha1);
 
 	if (!o)
-- 
tg: (759ad19..) t/fetch-pack-speedup (depends on: origin/master)

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [RFC PATCH v2] fetch-pack: log(n)-transmission find_common()
  2008-10-23 19:38     ` Thomas Rast
@ 2008-10-24 15:49       ` Nicolas Pitre
  2008-10-27 10:29       ` Nanako Shiraishi
  1 sibling, 0 replies; 13+ messages in thread
From: Nicolas Pitre @ 2008-10-24 15:49 UTC (permalink / raw)
  To: Thomas Rast; +Cc: git

On Thu, 23 Oct 2008, Thomas Rast wrote:

> Replaces the existing simple history search with a more sophisticated
> algorithm:
> 
> 1) Walk history with exponentially increasing stride lengths; i.e.,
>    send the 1st commit, then the 2nd after that, then the 4th after
>    that, and so on.
> 
> 2) Bisect the resulting intervals.
> 
> Combined with tracking the "outstanding haves" so that we can detect
> which sha1s were never ACKed by upload-pack (and hence not common),
> this gives O(log(n)) required "have" lines.
> 
> Unfortunately this cannot work if the server sends fake ACKs, so we
> introduce a capability 'exp-stride' which instructs upload-pack to
> disable ok_to_give_up().  (Which incidentally saves the server a lot
> of work.)
> 
> Signed-off-by: Thomas Rast <trast@student.ethz.ch>
> 
> ---
> 
> This is a simple resend of v2, in the hope of attracting some
> discussion or at least encouraging words this time around.

OK, I gave this a quick try, and fetch operations appear to make their 
mind about what to do quicker.  Some fetch requests which used to take 
up to 5 seconds are somewhat faster.  I have no formal measurements 
though.


Nicolas

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC PATCH v2] fetch-pack: log(n)-transmission find_common()
  2008-10-23 19:38     ` Thomas Rast
  2008-10-24 15:49       ` Nicolas Pitre
@ 2008-10-27 10:29       ` Nanako Shiraishi
  2008-10-28  3:24         ` Junio C Hamano
  1 sibling, 1 reply; 13+ messages in thread
From: Nanako Shiraishi @ 2008-10-27 10:29 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Thomas Rast, git

Quoting Thomas Rast <trast@student.ethz.ch>:

> Replaces the existing simple history search with a more sophisticated
> algorithm:
> 
> 1) Walk history with exponentially increasing stride lengths; i.e.,
>    send the 1st commit, then the 2nd after that, then the 4th after
>    that, and so on.
> 
> 2) Bisect the resulting intervals.

Junio, may I ask what the status of this patch is? I see Nicolas responded and said "I gave this a quick try". Wasn't it a good enough review?

-- 
Nanako Shiraishi
http://ivory.ap.teacup.com/nanako3/

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC PATCH v2] fetch-pack: log(n)-transmission find_common()
  2008-10-27 10:29       ` Nanako Shiraishi
@ 2008-10-28  3:24         ` Junio C Hamano
  2008-10-28 14:46           ` Nicolas Pitre
  0 siblings, 1 reply; 13+ messages in thread
From: Junio C Hamano @ 2008-10-28  3:24 UTC (permalink / raw)
  To: Nanako Shiraishi; +Cc: Thomas Rast, git

Nanako Shiraishi <nanako3@lavabit.com> writes:

> Quoting Thomas Rast <trast@student.ethz.ch>:
>
>> Replaces the existing simple history search with a more sophisticated
>> algorithm:
>> 
>> 1) Walk history with exponentially increasing stride lengths; i.e.,
>>    send the 1st commit, then the 2nd after that, then the 4th after
>>    that, and so on.
>> 
>> 2) Bisect the resulting intervals.
>
> Junio, may I ask what the status of this patch is? I see Nicolas responded and said "I gave this a quick try". Wasn't it a good enough review?

I took the "quick try" more about "first feel in performance" and not
"code review concentrating on correctness and trying to catch mistakes".

I like what the patch tries to solve, and the approach it takes to solve
it.  I haven't had a chance to have a block of time for me to concentrate
on this patch to assess where it could go wrong yet.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC PATCH v2] fetch-pack: log(n)-transmission find_common()
  2008-10-28  3:24         ` Junio C Hamano
@ 2008-10-28 14:46           ` Nicolas Pitre
  2008-10-28 19:37             ` Thomas Rast
  2008-12-06 23:20             ` [RFC PATCH v3 0/2] " Thomas Rast
  0 siblings, 2 replies; 13+ messages in thread
From: Nicolas Pitre @ 2008-10-28 14:46 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Nanako Shiraishi, Thomas Rast, git

On Mon, 27 Oct 2008, Junio C Hamano wrote:

> Nanako Shiraishi <nanako3@lavabit.com> writes:
> 
> > Quoting Thomas Rast <trast@student.ethz.ch>:
> >
> >> Replaces the existing simple history search with a more sophisticated
> >> algorithm:
> >> 
> >> 1) Walk history with exponentially increasing stride lengths; i.e.,
> >>    send the 1st commit, then the 2nd after that, then the 4th after
> >>    that, and so on.
> >> 
> >> 2) Bisect the resulting intervals.
> >
> > Junio, may I ask what the status of this patch is? I see Nicolas responded and said "I gave this a quick try". Wasn't it a good enough review?
> 
> I took the "quick try" more about "first feel in performance" and not
> "code review concentrating on correctness and trying to catch mistakes".

Exact.

FWIW, I had to back this patch out from my version as things seemed to 
fall into an infinite loop of ref negotiation while fetching the Linux 
kernel repository at some point.  Doing a "git fetch -v -v" turned up an 
endless stream of "got" and "have" lines.  I was in a hurry for $work so 
didn't think of preserving my local refs for reproduction of the 
problem.

Sorry for not being more helpful.  This is some part of git that I know 
f-all about.


Nicolas

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC PATCH v2] fetch-pack: log(n)-transmission find_common()
  2008-10-28 14:46           ` Nicolas Pitre
@ 2008-10-28 19:37             ` Thomas Rast
  2008-12-06 23:20             ` [RFC PATCH v3 0/2] " Thomas Rast
  1 sibling, 0 replies; 13+ messages in thread
From: Thomas Rast @ 2008-10-28 19:37 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: Junio C Hamano, Nanako Shiraishi, git

[-- Attachment #1: Type: text/plain, Size: 678 bytes --]

Nicolas Pitre wrote:
> 
> FWIW, I had to back this patch out from my version as things seemed to 
> fall into an infinite loop of ref negotiation while fetching the Linux 
> kernel repository at some point.  Doing a "git fetch -v -v" turned up an 
> endless stream of "got" and "have" lines.  I was in a hurry for $work so 
> didn't think of preserving my local refs for reproduction of the 
> problem.

Ah.  Thanks for the report, though having a test case would be great.
I've been running it without problems since I completed v2 more than a
month ago, so I don't expect it to be an obvious mistake.

- Thomas

-- 
Thomas Rast
trast@{inf,student}.ethz.ch


[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 197 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [RFC PATCH v3 0/2] fetch-pack: log(n)-transmission find_common()
  2008-10-28 14:46           ` Nicolas Pitre
  2008-10-28 19:37             ` Thomas Rast
@ 2008-12-06 23:20             ` Thomas Rast
  2008-12-06 23:20               ` [RFC PATCH v3 1/2] fetch-pack: rearrange main loop Thomas Rast
  1 sibling, 1 reply; 13+ messages in thread
From: Thomas Rast @ 2008-12-06 23:20 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nicolas Pitre, Nanako Shiraishi

If you still know what this was about, you can skip ahead to the "=="
marker below.


The mail that started it all was

  http://kerneltrap.org/mailarchive/git/2008/9/17/3328364

  [Abstract:]
  Using two passes, first exponential stride through history, then
  bisection on the "right" windows, we implement the find_common
  handshake in log(history_size) many transmissions.  The followup
  patch partially achieves that, but a few points are open and strict
  backwards compatibility is missing (but its absence is Mostly
  Harmless).

This spawned out of a problem originally reported on IRC where pulling
several disjoint histories together takes ages (due to the load on the
server caused by ok_to_give_up()).  Apparently this is a common
workflow in the Ruby-on-Rails.  The bound on the number of
transmissions holds in other cases too, of course.

The idea attracted some interest, but to my knowledge nobody actually
reviewed the code, and it eventually died because of Nicholas's
finding:

Nicolas Pitre wrote:
> FWIW, I had to back this patch out from my version as things seemed to 
> fall into an infinite loop of ref negotiation while fetching the Linux 
> kernel repository at some point.  Doing a "git fetch -v -v" turned up an 
> endless stream of "got" and "have" lines.  I was in a hurry for $work so 
> didn't think of preserving my local refs for reproduction of the 
> problem.

Unfortunately, I was never able to reliably reproduce this problem.
(IIRC I did once when running an automated hammering script but it
vanished when I tried again manually.)



==

After a long break, I decided to pick this up again.  I'm not sure
this is the best time to do so, but I had the time to spare, and
(unless there are more serious bugs, and Gods and Junio willing) we
might get it into 'next' sometime early in the next cycle for better
testing.

I rewrote the core of the algorithm, though some helpers and most of
the glue to surrounding routines are still the same.  I took the
following design decisions that weren't in v2:

0. Lots of comments.  Well, by my standards in any case.

1. The old code stays.  I mostly want to make it (far) easier to be
   certain that it behaves exactly as before if the server does not
   support disabling ok_to_give_up().  [The idea is that it should be
   easy to verify that 1/2 is a no-change refactoring and 2/2 only
   does something if the server supports it.]

2. The work list now uses a Fibonacci heap to order jobs.  I'm not
   religious about the specific flavour of heap, but I had the
   description of these handy.  (The work list can get rather large.)

3. The bisection is essentially precomputed.  By staring at the binary
   representation of the distance from the starting commit hard
   enough, it can be constructed during the stride pass.  Barring a
   circular bisection "tree", the algorithm is now guaranteed to
   terminate even if it, say, emits too many or too few sha1s due to
   some other bug.

The last point of course means that the main work is now in the
tangled mess that is get_rev_new_stride(), and "staring hard enough"
frequently enough meant grabbing a notepad.  YMMV.

While I did run a lot of tests, including 'make test' and some
automated hammering, it seems quite hard to make sure it _really_
bisects correctly (except manually in toy examples, such as two linear
histories with a common base).  Suggestions for automated tests
welcome.

To see the effects of the patch, try

  git fetch-pack -v -k <url> <head> 2>&1 \
    | git name-rev --stdin refs/heads/master

with a daemon that supports the feature.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [RFC PATCH v3 1/2] fetch-pack: rearrange main loop
  2008-12-06 23:20             ` [RFC PATCH v3 0/2] " Thomas Rast
@ 2008-12-06 23:20               ` Thomas Rast
  2008-12-06 23:20                 ` [RFC PATCH v3 2/2] fetch-pack: log(n)-transmission find_common() Thomas Rast
  0 siblings, 1 reply; 13+ messages in thread
From: Thomas Rast @ 2008-12-06 23:20 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nicolas Pitre, Nanako Shiraishi

This patch does not change the results (nor any of the semantics
except for the get_rev return type), but we need the changed layout
for the exponential-stride feature.

Signed-off-by: Thomas Rast <trast@student.ethz.ch>

---
 builtin-fetch-pack.c |  108 +++++++++++++++++++++++++++++--------------------
 1 files changed, 64 insertions(+), 44 deletions(-)

diff --git a/builtin-fetch-pack.c b/builtin-fetch-pack.c
index 372bfa2..ae0a67a 100644
--- a/builtin-fetch-pack.c
+++ b/builtin-fetch-pack.c
@@ -111,7 +111,7 @@ static void mark_common(struct commit *commit,
   Get the next rev to send, ignoring the common.
 */
 
-static const unsigned char* get_rev(void)
+static struct commit* get_rev(void)
 {
 	struct commit *commit = NULL;
 
@@ -153,15 +153,41 @@ static const unsigned char* get_rev(void)
 		rev_list = rev_list->next;
 	}
 
-	return commit->object.sha1;
+	return commit;
 }
 
+/*
+ * Send 'have' for the next batch of revisions.  Returns 0 if we ran
+ * out of commits to send, 1 otherwise.
+ */
+
+static int send_have_lines(int fd[2], int *flushes, unsigned *in_vain)
+{
+	struct commit *commit;
+	int i;
+
+	for (i = 0; i < 32; i++) {
+		commit = get_rev();
+		if (!commit)
+			return 0;
+		packet_write(fd[1], "have %s\n",
+			     sha1_to_hex(commit->object.sha1));
+		if (args.verbose)
+			fprintf(stderr, "have %s\n",
+				sha1_to_hex(commit->object.sha1));
+	}
+	packet_flush(fd[1]);
+	*flushes += 1;
+	*in_vain += 32;
+	return 1;
+}
+
+
 static int find_common(int fd[2], unsigned char *result_sha1,
 		       struct ref *refs)
 {
 	int fetching;
 	int count = 0, flushes = 0, retval;
-	const unsigned char *sha1;
 	unsigned in_vain = 0;
 	int got_continue = 0;
 
@@ -243,51 +269,45 @@ static int find_common(int fd[2], unsigned char *result_sha1,
 
 	flushes = 0;
 	retval = -1;
-	while ((sha1 = get_rev())) {
-		packet_write(fd[1], "have %s\n", sha1_to_hex(sha1));
-		if (args.verbose)
-			fprintf(stderr, "have %s\n", sha1_to_hex(sha1));
-		in_vain++;
-		if (!(31 & ++count)) {
-			int ack;
 
-			packet_flush(fd[1]);
-			flushes++;
+	/*
+	 * We keep one window "ahead" of the other side, and
+	 * will wait for an ACK only on the next one
+	 */
+	if (!send_have_lines(fd, &flushes, &in_vain))
+		goto done;
 
-			/*
-			 * We keep one window "ahead" of the other side, and
-			 * will wait for an ACK only on the next one
-			 */
-			if (count == 32)
-				continue;
+	while (send_have_lines(fd, &flushes, &in_vain)) {
+		int ack;
+		int unwound = 0;
 
-			do {
-				ack = get_ack(fd[0], result_sha1);
-				if (args.verbose && ack)
-					fprintf(stderr, "got ack %d %s\n", ack,
-							sha1_to_hex(result_sha1));
-				if (ack == 1) {
-					flushes = 0;
-					multi_ack = 0;
-					retval = 0;
-					goto done;
-				} else if (ack == 2) {
-					struct commit *commit =
-						lookup_commit(result_sha1);
-					mark_common(commit, 0, 1);
-					retval = 0;
-					in_vain = 0;
-					got_continue = 1;
-				}
-			} while (ack);
-			flushes--;
-			if (got_continue && MAX_IN_VAIN < in_vain) {
-				if (args.verbose)
-					fprintf(stderr, "giving up\n");
-				break; /* give up */
+		do {
+			ack = get_ack(fd[0], result_sha1);
+			if (args.verbose && ack)
+				fprintf(stderr, "got ack %d %s\n", ack,
+					sha1_to_hex(result_sha1));
+			if (ack == 1) {
+				flushes = 0;
+				multi_ack = 0;
+				retval = 0;
+				goto done;
+			} else if (ack == 2) {
+				struct commit *commit =
+					lookup_commit(result_sha1);
+				mark_common(commit, 0, 1);
+				retval = 0;
+				in_vain = 0;
+				got_continue = 1;
 			}
+		} while (ack);
+		flushes--;
+		if (got_continue && MAX_IN_VAIN < in_vain) {
+			if (args.verbose)
+				fprintf(stderr, "giving up\n");
+			break; /* give up */
 		}
 	}
+
 done:
 	packet_write(fd[1], "done\n");
 	if (args.verbose)
-- 
tg: (7f705dc..) t/fp-refactor (depends on: origin/master)

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [RFC PATCH v3 2/2] fetch-pack: log(n)-transmission find_common()
  2008-12-06 23:20               ` [RFC PATCH v3 1/2] fetch-pack: rearrange main loop Thomas Rast
@ 2008-12-06 23:20                 ` Thomas Rast
  0 siblings, 0 replies; 13+ messages in thread
From: Thomas Rast @ 2008-12-06 23:20 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Nicolas Pitre, Nanako Shiraishi

Replaces the existing simple history search with a more sophisticated
algorithm:

1) Walk history with exponentially increasing stride lengths; i.e.,
   send the 1st commit, then the 2nd after that, then the 4th after
   that, and so on.

2) Bisect the resulting intervals.

Combined with tracking the "outstanding haves" so that we can detect
which sha1s were never ACKed by upload-pack (and hence not common),
this gives O(log(n)) required "have" lines.

Unfortunately this cannot work if the server sends fake ACKs, so we
introduce a capability 'no-giveup' which instructs upload-pack to
disable ok_to_give_up().  (Which incidentally saves the server a lot
of work.)

Signed-off-by: Thomas Rast <trast@student.ethz.ch>

---
 builtin-fetch-pack.c |  789 +++++++++++++++++++++++++++++++++++++++++++++++++-
 upload-pack.c        |    7 +-
 2 files changed, 787 insertions(+), 9 deletions(-)

diff --git a/builtin-fetch-pack.c b/builtin-fetch-pack.c
index ae0a67a..ea88e28 100644
--- a/builtin-fetch-pack.c
+++ b/builtin-fetch-pack.c
@@ -9,6 +9,7 @@
 #include "fetch-pack.h"
 #include "remote.h"
 #include "run-command.h"
+#include <string.h>
 
 static int transfer_unpack_limit = -1;
 static int fetch_unpack_limit = -1;
@@ -25,6 +26,14 @@ static const char fetch_pack_usage[] =
 #define COMMON_REF	(1U << 2)
 #define SEEN		(1U << 3)
 #define POPPED		(1U << 4)
+#define BISECTED_FW	(1U << 5)
+#define BISECTED_BW	(1U << 6)
+#define NOTCOMMON       (1U << 7)
+#define HAS_INFO        (1U << 8)
+#define PARSED_WITH_PARENTS (1U << 9)
+#define CLEAN_MARKS     (COMMON | COMMON_REF | SEEN | POPPED | BISECTED_FW \
+			 | BISECTED_BW | NOTCOMMON | HAS_INFO \
+			 | PARSED_WITH_PARENTS)
 
 static int marked;
 
@@ -34,8 +43,61 @@ static int marked;
  */
 #define MAX_IN_VAIN 256
 
+struct work_heap
+{
+	struct work_heap *next;
+	struct work_heap *prev;
+	struct work_item *item;
+};
+
+struct work_item
+{
+	/* Fibonacci heap data */
+	/* we don't need a parent pointer because we don't support changing keys */
+	struct work_item *child;
+	struct work_item *sibling;
+	int rank;
+	int marked;
+
+	/* actual content */
+	struct commit *commit;
+	unsigned int stride;  /* for stride mode */
+	unsigned int steps;
+	unsigned int depth;
+	unsigned int bisect;  /* for bisection mode */
+
+	/*
+	 * Helper array to generate the bisection order.  A size of 32
+	 * is ok as long as the history length is less than 2**32.
+	 */
+	struct commit **ladder;
+};
+
+/*
+ * "Recursion stack" of the exponential-stride algorithm
+ *
+ * This is a priority queue implemented as a Fibonacci heap since we
+ * expect to stuff a lot of commits into it.
+ */
+static struct work_heap *work_heap = NULL;
+static struct work_heap *work_min = NULL;
+
+/* list of commits not ACKed yet */
+static struct commit_list *outstanding = NULL;
+static struct commit_list *outstanding_end = NULL;
+
+struct commit_info
+{
+	struct commit_list *children;
+	struct commit_list *bisect_forward;
+	struct commit_list *bisect_backward;
+	struct commit_list *forward_bound;
+	struct commit_list *backward_bound;
+};
+
 static struct commit_list *rev_list;
 static int non_common_revs, multi_ack, use_sideband;
+static int exp_stride_algorithm;
 
 static void rev_list_push(struct commit *commit, int mark)
 {
@@ -68,8 +130,7 @@ static int clear_marks(const char *path, const unsigned char *sha1, int flag, vo
 	struct object *o = deref_tag(parse_object(sha1), path, 0);
 
 	if (o && o->type == OBJ_COMMIT)
-		clear_commit_marks((struct commit *)o,
-				   COMMON | COMMON_REF | SEEN | POPPED);
+		clear_commit_marks((struct commit *)o, CLEAN_MARKS);
 	return 0;
 }
 
@@ -107,6 +168,354 @@ static void mark_common(struct commit *commit,
 	}
 }
 
+static struct commit_info *get_commit_info(struct commit *commit)
+{
+	if (commit->object.flags & HAS_INFO)
+		return commit->util;
+
+	struct commit_info *info = xmalloc(sizeof(struct commit_info));
+	info->children = NULL;
+	info->bisect_forward = NULL;
+	info->bisect_backward = NULL;
+	info->forward_bound = NULL;
+	info->backward_bound = NULL;
+	commit->util = info;
+	commit->object.flags |= HAS_INFO;
+	return info;
+}
+
+static void add_child(struct commit *parent, struct commit *child)
+{
+	struct commit_info *info = get_commit_info(parent);
+	struct commit_list *item;
+	for (item = info->children; item; item = item->next)
+		if (item->item == child)
+			break;
+	if (!item)
+		commit_list_insert(child, &(info->children));
+}
+
+/* implemented further below */
+static void mark_not_common(struct commit *commit);
+
+static void check_parsed_and_mark(struct commit *commit)
+{
+	struct commit_list *parent;
+
+	if (!commit || commit->object.flags & PARSED_WITH_PARENTS)
+		return;
+
+	parse_commit(commit);
+
+	commit->object.flags |= PARSED_WITH_PARENTS;
+
+	for (parent = commit->parents; parent; parent = parent->next) {
+		add_child(parent->item, commit);
+		if (parent->item->object.flags & NOTCOMMON)
+			mark_not_common(commit);
+	}
+}
+
+static void work_item_free(struct work_item *item)
+{
+	if (!item->bisect)
+		free(item->ladder);
+	free(item);
+}
+
+/*
+ * Ordering imposed on the "recursion" that generates the commits.
+ * The smallest item is processed first.  To maintain the O(log n)
+ * transmissions bound we want to achieve, bisections must come after
+ * (compare >0) strides.
+ */
+
+static int work_item_cmp(struct work_item *a, struct work_item *b)
+{
+	if (a->bisect || b->bisect) {
+		/* bisections last */
+		if (a->bisect > b->bisect)
+			return 1;
+		else if (a->bisect < b->bisect)
+			return -1;
+
+		/* older commits later */
+		if (a->commit->date < b->commit->date)
+			return 1;
+		else if (a->commit->date == b->commit->date)
+			return 0;
+		else
+			return -1;
+	} else {
+		/* deeper sidelines of history later */
+		if (a->depth > b->depth)
+			return 1;
+		else if (a->depth < b->depth)
+			return -1;
+
+		/* older commits later */
+		if (a->commit->date < b->commit->date)
+			return 1;
+		else if (a->commit->date == b->commit->date)
+			return 0;
+		else
+			return -1;
+	}
+}
+
+/*
+ * Insert/remove items from the work_heap.  _pop() always returns the
+ * smallest one.
+ */
+
+static void work_heap_insert_internal(struct work_item *item)
+{
+	struct work_heap *entry = xmalloc(sizeof(struct work_heap));
+	entry->item = item;
+
+	if (!work_heap) {
+		entry->prev = entry;
+		entry->next = entry;
+		work_heap = entry;
+		work_min = work_heap;
+		return;
+	}
+
+	entry->next = work_heap;
+	entry->prev = work_heap->prev;
+	work_heap->prev = entry;
+	entry->prev->next = entry;
+	if (!work_min || work_item_cmp(item, work_min->item) < 0)
+		work_min = entry;
+	work_heap = entry;
+}
+
+static void work_heap_insert(struct work_item *item)
+{
+	item->marked = 0;
+	item->child = NULL;
+	item->sibling = NULL;
+	item->rank = 0;
+
+	work_heap_insert_internal(item);
+}
+
+static struct work_item *work_heap_pop(void)
+{
+	struct work_heap *entry = work_min;
+	struct work_item *x, *y, *t;
+	struct work_item *child, *item;
+	/* The rank of any child in a Fib heap of size n can be at
+	 * most log_{3/2}(n), and we only support 2**32 items
+	 * anyway */
+	struct work_item *ranks[55] = {0,};
+	int i;
+
+	if (!entry)
+		return NULL;
+
+	if (entry == entry->prev) {
+		work_heap = NULL;
+		work_min = NULL;
+		item = entry->item;
+		free(entry);
+		for (child = item->child; child; child = child->sibling) {
+			child->marked = 0;
+			work_heap_insert_internal(child);
+		}
+		return item;
+	}
+
+	work_heap = entry->next;
+	entry->next->prev = entry->prev;
+	entry->prev->next = entry->next;
+	item = entry->item;
+	free(entry);
+	work_min = NULL;
+
+	for (child = item->child; child; child = child->sibling) {
+		child->marked = 0;
+		work_heap_insert_internal(child);
+	}
+
+	while (work_heap) {
+		entry = work_heap;
+		if (entry == entry->prev) {
+			work_heap = NULL;
+		} else {
+			work_heap = entry->next;
+			entry->next->prev = entry->prev;
+			entry->prev->next = entry->next;
+		}
+		x = entry->item;
+		free(entry);
+		while ((y = ranks[x->rank])) {
+			ranks[x->rank] = NULL;
+			if (work_item_cmp(x, y) > 0) {
+				t = x;
+				x = y;
+				y = t;
+			}
+			y->sibling = x->child;
+			x->child = y;
+			x->rank++;
+		}
+		ranks[x->rank] = x;
+	}
+
+	for (i = 0; i < 55; i++)
+		if (ranks[i])
+			work_heap_insert_internal(ranks[i]);
+
+	return item;
+}
+
+static void work_insert_stride(struct commit *commit, int stride,
+			       int steps, int depth, struct commit **ladder)
+{
+	struct work_item *item = xmalloc(sizeof(struct work_item));
+	check_parsed_and_mark(commit);
+	item->commit = commit;
+	item->stride = stride;
+	item->steps = steps;
+	item->depth = depth;
+	item->bisect = 0;
+	item->ladder = xmalloc(32*sizeof(struct commit *));
+	if (ladder)
+		memcpy(item->ladder, ladder, 32*sizeof(struct commit *));
+	work_heap_insert(item);
+}
+
+static void work_insert_bisect(struct commit *commit, int bisect)
+{
+	struct work_item *item = xmalloc(sizeof(struct work_item));
+	item->commit = commit;
+	item->stride = 0;
+	item->steps = 0;
+	item->depth = 0;
+	item->bisect = bisect;
+	item->ladder = NULL;
+	work_heap_insert(item);
+}
+
+static void work_insert_strides(struct commit_list *commits,
+				int stride, int steps, int depth, int depth2,
+				struct commit **ladder,
+				struct work_item *reusable_item)
+{
+	struct work_item *item = reusable_item;
+
+	while (commits) {
+		if (!(commits->item->object.flags &= SEEN)) {
+			if (!item)
+				item = xmalloc(sizeof(struct work_item));
+
+			item->commit = commits->item;
+			item->stride = stride;
+			item->steps = steps;
+			item->depth = depth;
+			item->bisect = 0;
+			item->ladder = xmalloc(32*sizeof(struct commit *));
+			if (ladder)
+				memcpy(item->ladder, ladder, 32*sizeof(struct commit *));
+			work_heap_insert(item);
+		}
+		commits = commits->next;
+		item = NULL;
+		depth = depth2;
+	}
+
+	if (item)
+		work_item_free(item);
+}
+
+static void work_insert_bisects(struct commit_list *commits,
+				int bisect,
+				struct work_item *reusable_item)
+{
+	struct work_item *item = reusable_item;
+
+	while (commits) {
+		if (!item)
+			item = xmalloc(sizeof(struct work_item));
+
+		item->commit = commits->item;
+		item->stride = 0;
+		item->steps = 0;
+		item->depth = 0;
+		item->bisect = bisect;
+		work_heap_insert(item);
+
+		commits = commits->next;
+		item = NULL;
+	}
+
+	if (item)
+		work_item_free(item);
+}
+
+static int work_heap_insert_ref(const char *path, const unsigned char *sha1,
+			       int flag, void *cb_data)
+{
+	struct object *o = deref_tag(parse_object(sha1), path, 0);
+
+	if (o && o->type == OBJ_COMMIT && !(o->flags & SEEN))
+		work_insert_stride((struct commit *)o, 0, 0, 0, NULL);
+
+	return 0;
+}
+
+
+/*
+ * Marks all (known) children of this commit NOTCOMMON.  We call this
+ * for all sha1s the server did not ACK.
+ */
+
+static void mark_not_common(struct commit *commit)
+{
+	struct commit_info *info = get_commit_info(commit);
+	struct commit_list *child;
+
+	assert(!(commit->object.flags & COMMON));
+
+	if (commit->object.flags & NOTCOMMON)
+		return;
+
+	commit->object.flags |= NOTCOMMON;
+
+	for (child = info->children; child; child = child->next)
+		mark_not_common(child->item);
+}
+
+/*
+ * Similar for the ACKed ones.  We have a _new() version because the
+ * old one interacts with the work queue.
+ */
+
+static void mark_common_new(struct commit *commit, int ancestors_only, int dont_parse)
+{
+	struct commit_list *parents;
+	struct object *o;
+
+	if (commit == NULL || commit->object.flags & (COMMON_REF|COMMON))
+		return;
+
+	if (!dont_parse && !(commit->object.parsed))
+		check_parsed_and_mark(commit);
+
+	o = (struct object *)commit;
+
+	if (!ancestors_only)
+		o->flags |= COMMON;
+
+	for (parents = commit->parents; parents; parents = parents->next) {
+		add_child(parents->item, commit);
+		mark_common_new(parents->item, 0, dont_parse);
+	}
+}
+
+
+
 /*
   Get the next rev to send, ignoring the common.
 */
@@ -156,6 +565,330 @@ static struct commit* get_rev(void)
 	return commit;
 }
 
+
+int flag_in_list(int flag, struct commit_list *list)
+{
+	while (list) {
+		if (list->item->object.flags & flag)
+			return 1;
+		list = list->next;
+	}
+	return 0;
+}
+
+
+/*
+ * Find the second bit set in x, similar to ffs() from string.h.
+ * Assumes that x>0.
+ */
+
+int fss(int x)
+{
+	return ffs(x - (1<<(ffs(x)-1)));
+}
+
+/*
+ * The exponential-stride algorithm, part 1 and 1.5.
+ *
+ * We take exponential strides through history, that is we emit the
+ * 1st commit, the 2nd, the 4th, etc.  But we also simultaneously
+ * build pointers that help us bisect later: Every commit has a
+ * bisect_forward and bisect_backward list (if history is linear, both
+ * have at most 1 element) that tells us where to continue bisection
+ * after this.
+ *
+ * To achieve this with good running time, we think of the commits in
+ * a line of history as being numbered in binary:
+ *
+ *        o 1000
+ *        o  111
+ *        o  110   <- current   [1]
+ *        o  101                [0]
+ *  t |	  o  100                [2]
+ *  i |	  o   11
+ *  m |	  o   10
+ *  e v	  o    1
+ *
+ * Every commit of the form 10...0 is emitted, since it is an
+ * exponential step.  Between that, we want to have
+ * (100).bisect_backward=[110], (110).bisect_forward=[101],
+ * (110).bisect_backward=[111], etc.  Define the lowest bit set (as
+ * per ffs()-1) in a commit as its "order", then we can achieve this
+ * by keeping track of the last commit of every order (marked with [n]
+ * in the diagram) as we go through history.  Then [n] must be on
+ * [n+1].bisect_backward and [n-1] must be on [n].bisect_forward.
+ *
+ * Similarly, we track the boundaries forward_bound and
+ * backward_bound, which indicate the furthest commit(s) that can be
+ * reached by bisecting this one.  We use them to detect when it
+ * becomes pointless to pursue this branch of the bisection because
+ * the far end is already COMMON (NOTCOMMON) in forward (backward)
+ * mode.  In the scheme chosen, a current commit [n] numbered N is a
+ * backward boundary of [i] for all i=0..n, and [fss(N)-1] is a
+ * forward boundary of the current [n].
+ *
+ * The tough problem is if we stop (because we hit a root, or a part
+ * of history that we've already seen) at a non-exp-stride commit.
+ * Then we must ensure that all history up to the current commit can
+ * be reached by bisecting.  Consider
+ *
+ *        o 1000
+ *        o  111
+ *        o  110
+ *        o  101   <- current   [0]
+ *  t |	  o  100                [2]
+ *  i |	  o   11
+ *  m |	  o   10                [1]
+ *  e v	  o    1
+ *
+ * then stopping at the commit 101 without any corrective measures
+ * would leave it unreachable because the intermediate step 110 has
+ * not been seen yet.
+ *
+ * Observe that any given commit x10z, where x is arbitrary and z only
+ * zeros, has outgoing pointers to x01z (forward) and x11z (backward).
+ * Conversely this means x01z needs an incoming forward pointer from
+ * x10z, even if we haven't seen it yet.  So there are two cases:
+ *
+ * SEEN CASE: We stopped because we hit a SEEN commit.  Assume that
+ * the last commit not SEEN is numbered x1y1z, where x is arbitrary
+ * and y and z are only zeros (z possibly empty).  Then unless y is
+ * empty, x1y'01z needs an incoming forward pointer from x1y'10z
+ * (where y'0=y), which we haven't seen yet.  Repeat until y is empty.
+ * Note that during this repetition, we might hit a root commit.
+ *
+ * ROOT CASE: We cannot fix this with forward pointers since there are
+ * no commits left to put them in.  Assume that the root commit is
+ * numbered x1y1z as in the SEEN case.  Then we fix by pointing a
+ * backwards pointer from x1y'00z to x1y'01z, then iterate with
+ * x1y'00z.  Note that x1y'00z is the last seen commit of order
+ * fss(x1y1z).
+ *
+ * Note that we only need to bisect (first level) exp-stride commits
+ * in one direction.  We choose backward.  This choice needs to be
+ * consistent among get_rev_new_stride() and get_rev_new(), and it
+ * happens to match the faster code path of get_rev_new_bisect().
+ */
+
+static struct commit *get_rev_new_stride(struct work_item *item)
+{
+	struct commit_info *info;
+	struct commit *commit = item->commit;
+	int steps = item->steps;
+	int stride = item->stride;
+	int depth = item->depth;
+	int order = -1;
+	struct commit **ladder = item->ladder;
+	int i;
+	/* The next two are for the pointer-fixing iterations */
+	int order2;
+	struct commit *commit2;
+
+	while (!(commit->object.flags & SEEN)) {
+		steps++;
+		order = ffs(steps)-1;
+
+		commit->object.flags |= SEEN;
+
+		if (order > 0 && stride == order) {
+			info = get_commit_info(ladder[order-1]);
+			commit_list_insert(commit, &info->backward_bound);
+			for (i = order-2; i>=0; i--) {
+				info = get_commit_info(ladder[i]);
+				commit_list_insert(commit, &info->backward_bound);
+			}
+		}
+
+		ladder[order] = commit;
+
+		if (stride == order) {
+			/* this is precisely a 2**k'th commit, emit it */
+			work_insert_strides(commit->parents, stride+1, steps,
+					    depth, depth+1, ladder, item);
+			return commit;
+		}
+
+		/*
+		 * We are the next step in the higher-order commit's
+		 * backward (in time) bisection
+		 */
+		info = get_commit_info(ladder[order+1]);
+		commit_list_insert(commit, &info->bisect_backward);
+		info = get_commit_info(commit);
+		commit_list_insert(ladder[fss(steps)-1], &info->forward_bound);
+
+		if (order > 0) {
+			/*
+			 * The lower-order commit is our next step for
+			 * forward (in time) bisection
+			 */
+			info = get_commit_info(commit);
+			commit_list_insert(ladder[order-1], &info->bisect_forward);
+			for (i = order-1; i >= 0; i--) {
+				info = get_commit_info(ladder[i]);
+				commit_list_insert(commit, &info->backward_bound);
+			}
+		}
+
+		if (commit->parents) {
+			work_insert_strides(commit->parents->next, stride,
+					    steps, depth+1, depth+1, ladder,
+					    NULL);
+			commit = commit->parents->item;
+			check_parsed_and_mark(commit);
+		} else {
+			/* ROOT CASE: fix missing pointers */
+			order2 = fss(steps)-1;
+			commit2 = commit;
+			while (order != order2-1 && order2 > 0) {
+				info = get_commit_info(ladder[order2]);
+				commit_list_insert(commit2, &info->bisect_backward);
+				info = get_commit_info(commit2);
+				commit_list_insert(ladder[order2], &info->forward_bound);
+				commit2 = ladder[order2];
+				steps -= 1 << order;
+				order = order2;
+				order2 = fss(steps)-1;
+			}
+			work_item_free(item);
+			return commit;
+		}
+	}
+
+	if (order == -1) {
+		/* this just means we hit a SEEN commit immediately */
+		work_item_free(item);
+		return  NULL;
+	}
+
+	/* SEEN CASE: fix missing pointers w.r.t. the _last_ commit
+	 * before this */
+
+	/* unless/until we hit a root, 'commit' tracks the steps
+	 * backward through the SEEN commits that we stick the
+	 * pointers in.  'order2' only serves to figure out if the 'y'
+	 * (see comment at function head) of 'commit' is empty. */
+	order2 = fss(steps)-1;
+	assert(order2 > order || order2 == 0);
+	while (order != order2-1 && order2 > 0) {
+		int i;
+		/* skip ahead to the next commit2 of order 'order+1' */
+		for (i = (1<<order); i > 0; i--) {
+			steps++;
+			assert (steps < (1<<stride));
+			if (commit->parents) {
+				commit = commit->parents->item;
+				check_parsed_and_mark(commit);
+				continue;
+			}
+			/* we hit a root commit! use the root strategy
+			 * for the rest */
+			order2 = fss(steps)-1;
+			commit2 = commit;
+			while (order != order2-1 && order2 > 0) {
+				info = get_commit_info(ladder[order2]);
+				commit_list_insert(commit2, &info->bisect_backward);
+				info = get_commit_info(commit2);
+				commit_list_insert(ladder[order2], &info->forward_bound);
+				commit2 = ladder[order2];
+				steps -= 1 << order;
+				order = order2;
+				order2 = fss(steps)-1;
+			}
+			/* short-cut out of the big loop */
+			work_item_free(item);
+			return NULL;
+		}
+		info = get_commit_info(commit);
+		commit_list_insert(ladder[order], &info->bisect_forward);
+		for (i = order; i >= 0; i--) {
+			info = get_commit_info(ladder[i]);
+			commit_list_insert(commit, &info->backward_bound);
+		}
+		order++;
+		/* this has not changed order2.  we still keep track
+		 * of 'steps' carefully in case we hit a root
+		 * commit. */
+	}
+
+	work_item_free(item);
+	return NULL;
+}
+
+/*
+ * The exponential-stride algorithm, part 2.
+ *
+ * Bisecting is now easy: just follow the forward/backward pointers,
+ * stopping when we already know enough to not search any further.
+ */
+
+static struct commit *get_rev_new_bisect(struct work_item *item)
+{
+	struct commit *commit = item->commit;
+	struct commit_info *info = get_commit_info(commit);
+	int flags = commit->object.flags;
+
+	if (!(flags & (BISECTED_FW|NOTCOMMON))
+	    && !flag_in_list(COMMON, info->forward_bound)) {
+		commit->object.flags |= BISECTED_FW;
+		work_insert_bisects(info->bisect_forward, item->bisect+1, NULL);
+		if (!(flags & (POPPED|COMMON))) {
+			/* re-queue for backward bisection */
+			work_heap_insert(item);
+			item = NULL;
+			return commit;
+		}
+	}
+
+	if (!(flags & (BISECTED_BW|COMMON))
+	    && !flag_in_list(NOTCOMMON, info->backward_bound)) {
+		commit->object.flags |= BISECTED_BW;
+		work_insert_bisects(info->bisect_backward, item->bisect+1, item);
+		item = NULL;
+		if (!(flags & (POPPED|NOTCOMMON)))
+			return commit;
+	}
+
+	if (item)
+		free(item);
+	return NULL;
+}
+
+
+static struct commit* get_rev_new(void)
+{
+	struct commit *commit = NULL;
+	struct work_item *item;
+
+	while (commit == NULL) {
+		item = work_heap_pop();
+		if (!item)
+			return NULL;
+		check_parsed_and_mark(item->commit);
+		if (item->bisect) {
+			commit = get_rev_new_bisect(item);
+		} else {
+			commit = get_rev_new_stride(item);
+			if (commit) {
+				/* there's no point in bisecting the
+				 * first level both ways */
+				commit->object.flags |= BISECTED_FW;
+				work_insert_bisect(commit, 1);
+			}
+		}
+	}
+
+	commit->object.flags |= POPPED;
+	return commit;
+}
+
+static void pop_outstanding(void)
+{
+	struct commit_list *item = outstanding;
+	outstanding = item->next;
+	free(item);
+}
+
 /*
  * Send 'have' for the next batch of revisions.  Returns 0 if we ran
  * out of commits to send, 1 otherwise.
@@ -167,7 +900,10 @@ static int send_have_lines(int fd[2], int *flushes, unsigned *in_vain)
 	int i;
 
 	for (i = 0; i < 32; i++) {
-		commit = get_rev();
+		if (exp_stride_algorithm)
+			commit = get_rev_new();
+		else
+			commit = get_rev();
 		if (!commit)
 			return 0;
 		packet_write(fd[1], "have %s\n",
@@ -175,6 +911,14 @@ static int send_have_lines(int fd[2], int *flushes, unsigned *in_vain)
 		if (args.verbose)
 			fprintf(stderr, "have %s\n",
 				sha1_to_hex(commit->object.sha1));
+
+		if (outstanding) {
+			commit_list_insert(commit, &(outstanding_end->next));
+			outstanding_end = outstanding_end->next;
+		} else {
+			commit_list_insert(commit, &outstanding);
+			outstanding_end = outstanding;
+		}
 	}
 	packet_flush(fd[1]);
 	*flushes += 1;
@@ -195,7 +939,10 @@ static int find_common(int fd[2], unsigned char *result_sha1,
 		for_each_ref(clear_marks, NULL);
 	marked = 1;
 
-	for_each_ref(rev_list_insert_ref, NULL);
+	if (exp_stride_algorithm)
+		for_each_ref(work_heap_insert_ref, NULL);
+	else
+		for_each_ref(rev_list_insert_ref, NULL);
 
 	fetching = 0;
 	for ( ; refs ; refs = refs->next) {
@@ -218,7 +965,7 @@ static int find_common(int fd[2], unsigned char *result_sha1,
 		}
 
 		if (!fetching)
-			packet_write(fd[1], "want %s%s%s%s%s%s%s%s\n",
+			packet_write(fd[1], "want %s%s%s%s%s%s%s%s%s\n",
 				     sha1_to_hex(remote),
 				     (multi_ack ? " multi_ack" : ""),
 				     (use_sideband == 2 ? " side-band-64k" : ""),
@@ -226,6 +973,7 @@ static int find_common(int fd[2], unsigned char *result_sha1,
 				     (args.use_thin_pack ? " thin-pack" : ""),
 				     (args.no_progress ? " no-progress" : ""),
 				     (args.include_tag ? " include-tag" : ""),
+				     (exp_stride_algorithm ? " no-giveup" : ""),
 				     " ofs-delta");
 		else
 			packet_write(fd[1], "want %s\n", sha1_to_hex(remote));
@@ -294,12 +1042,31 @@ static int find_common(int fd[2], unsigned char *result_sha1,
 			} else if (ack == 2) {
 				struct commit *commit =
 					lookup_commit(result_sha1);
-				mark_common(commit, 0, 1);
+				if (exp_stride_algorithm) {
+					while (commit != outstanding->item) {
+						if (args.verbose)
+							fprintf(stderr, "unwinding: %s\n", sha1_to_hex(outstanding->item->object.sha1));
+						mark_not_common(outstanding->item);
+						pop_outstanding();
+						unwound++;
+					}
+					pop_outstanding();
+					unwound++;
+					mark_common_new(commit, 0, 1);
+				} else {
+					mark_common(commit, 0, 1);
+				}
 				retval = 0;
 				in_vain = 0;
 				got_continue = 1;
 			}
 		} while (ack);
+		while (exp_stride_algorithm && unwound++ < 32) {
+			if (args.verbose)
+				fprintf(stderr, "unwinding: %s\n", sha1_to_hex(outstanding->item->object.sha1));
+			mark_not_common(outstanding->item);
+			pop_outstanding();
+		}
 		flushes--;
 		if (got_continue && MAX_IN_VAIN < in_vain) {
 			if (args.verbose)
@@ -467,7 +1234,10 @@ static int everything_local(struct ref **refs, int nr_match, char **match)
 		if (!(o->flags & SEEN)) {
 			rev_list_push((struct commit *)o, COMMON_REF | SEEN);
 
-			mark_common((struct commit *)o, 1, 1);
+			if (exp_stride_algorithm)
+				mark_common_new((struct commit *)o, 1, 1);
+			else
+				mark_common((struct commit *)o, 1, 1);
 		}
 	}
 
@@ -617,6 +1387,11 @@ static struct ref *do_fetch_pack(int fd[2],
 			fprintf(stderr, "Server supports side-band\n");
 		use_sideband = 1;
 	}
+	if (server_supports("no-giveup")) {
+		if (args.verbose)
+			fprintf(stderr, "Server supports no-giveup\n");
+		exp_stride_algorithm = 1;
+	}
 	if (everything_local(&ref, nr_match, match)) {
 		packet_flush(fd[1]);
 		goto all_done;
diff --git a/upload-pack.c b/upload-pack.c
index e5adbc0..8f05dbf 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -37,6 +37,7 @@ static unsigned int timeout;
  */
 static int use_sideband;
 static int debug_fd;
+static int disable_give_up;
 
 static void reset_timeout(void)
 {
@@ -414,7 +415,7 @@ static int get_common_commits(void)
 		if (!prefixcmp(line, "have ")) {
 			switch (got_sha1(line+5, sha1)) {
 			case -1: /* they have what we do not */
-				if (multi_ack && ok_to_give_up())
+				if (multi_ack && !disable_give_up && ok_to_give_up())
 					packet_write(1, "ACK %s continue\n",
 						     sha1_to_hex(sha1));
 				break;
@@ -501,6 +502,8 @@ static void receive_needs(void)
 			no_progress = 1;
 		if (strstr(line+45, "include-tag"))
 			use_include_tag = 1;
+		if (strstr(line+45, "no-giveup"))
+			disable_give_up = 1;
 
 		/* We have sent all our refs already, and the other end
 		 * should have chosen out of them; otherwise they are
@@ -573,7 +576,7 @@ static int send_ref(const char *refname, const unsigned char *sha1, int flag, vo
 {
 	static const char *capabilities = "multi_ack thin-pack side-band"
 		" side-band-64k ofs-delta shallow no-progress"
-		" include-tag";
+		" include-tag no-giveup";
 	struct object *o = parse_object(sha1);
 
 	if (!o)
-- 
tg: (73ef856..) t/fp-speedup (depends on: origin/master t/fp-refactor)

^ permalink raw reply related	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2008-12-06 23:22 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-09-17 23:00 [RFC] log(n)-transmissions common commit handshake Thomas Rast
2008-09-17 23:01 ` [RFC PATCH] fetch-pack: log(n)-transmission find_common() Thomas Rast
2008-09-24  0:48   ` [RFC PATCH v2] " Thomas Rast
2008-10-23 19:38     ` Thomas Rast
2008-10-24 15:49       ` Nicolas Pitre
2008-10-27 10:29       ` Nanako Shiraishi
2008-10-28  3:24         ` Junio C Hamano
2008-10-28 14:46           ` Nicolas Pitre
2008-10-28 19:37             ` Thomas Rast
2008-12-06 23:20             ` [RFC PATCH v3 0/2] " Thomas Rast
2008-12-06 23:20               ` [RFC PATCH v3 1/2] fetch-pack: rearrange main loop Thomas Rast
2008-12-06 23:20                 ` [RFC PATCH v3 2/2] fetch-pack: log(n)-transmission find_common() Thomas Rast
2008-09-18  8:18 ` [RFC] log(n)-transmissions common commit handshake Thomas Rast

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).