All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jeff King <peff@peff.net>
To: git@vger.kernel.org
Subject: [PATCH 4/4] fetch-pack: use priority queue for mark_complete
Date: Thu, 19 May 2011 17:26:57 -0400	[thread overview]
Message-ID: <20110519212657.GD29584@sigill.intra.peff.net> (raw)
In-Reply-To: <20110519212349.GA28589@sigill.intra.peff.net>

The regular commit_list has quadratic insertion complexity.
Fetch-pack will insert an item for each ref we have.  For a
"normal" number of refs, this is not a big deal, but it is
quite inefficient when you have tens or hundreds of
thousands of refs.

Instead, let's use a priority queue with better insertion
performance. For example, timings from one repo (which
stores alternates for all of the Rails forks at GitHub, with
mirrored refs from each individual repo to manage object
pruning):

  $ git for-each-ref | wc -l
  112514

  [before]
  $ time git fetch --no-tags ../remote.git
  63.52user 0.12system 1:03.68elapsed 99%CPU (0avgtext+0avgdata 137648maxresident)k
  1856inputs+48outputs (11major+19603minor)pagefaults 0swaps

  [after]
  $ time git fetch --no-tags ../remote.git
  6.20user 0.03system 0:06.25elapsed 99%CPU (0avgtext+0avgdata 123712maxresident)k
  0inputs+40outputs (0major+18885minor)pagefaults 0swaps

Signed-off-by: Jeff King <peff@peff.net>
---
So this is the big performance-win patch. But if you didn't read my
other "fetch: avoid repeated commits in mark_complete", go do that. It's
a much simpler solution that yields the same speedup in practice.

 builtin/fetch-pack.c |   11 ++++++-----
 1 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
index 85aff02..9a951f3 100644
--- a/builtin/fetch-pack.c
+++ b/builtin/fetch-pack.c
@@ -457,7 +457,7 @@ done:
 	return count ? retval : 0;
 }
 
-static struct commit_list *complete;
+static struct queue complete = COMMIT_QUEUE_INIT;
 
 static int mark_complete(const char *path, const unsigned char *sha1, int flag, void *cb_data)
 {
@@ -473,18 +473,19 @@ static int mark_complete(const char *path, const unsigned char *sha1, int flag,
 	if (o && o->type == OBJ_COMMIT) {
 		struct commit *commit = (struct commit *)o;
 		commit->object.flags |= COMPLETE;
-		commit_list_insert_by_date(commit, &complete);
+		queue_insert(&complete, commit);
 	}
 	return 0;
 }
 
 static void mark_recent_complete_commits(unsigned long cutoff)
 {
-	while (complete && cutoff <= complete->item->date) {
+	struct commit *c;
+	while ((c = queue_peek(&complete)) && cutoff <= c->date) {
 		if (args.verbose)
 			fprintf(stderr, "Marking %s as complete\n",
-				sha1_to_hex(complete->item->object.sha1));
-		pop_most_recent_commit(&complete, COMPLETE);
+				sha1_to_hex(c->object.sha1));
+		pop_commit_from_queue(&complete, COMPLETE);
 	}
 }
 
-- 
1.7.5.8.ga95ab

  parent reply	other threads:[~2011-05-19 21:27 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-05-19 20:48 [PATCH] fetch: avoid repeated commits in mark_complete Jeff King
2011-05-19 21:23 ` [RFC/PATCH 0/4] commit lists as priority queues Jeff King
2011-05-19 21:24   ` [PATCH 1/4] Makefile: sort TEST_PROGRAMS list Jeff King
2011-05-19 21:24   ` [PATCH 2/4] basic priority queue implementation Jeff King
2011-05-20  0:47     ` Thiago Farina
2011-05-20  7:38       ` Jeff King
2011-05-20 13:13         ` Thiago Farina
2011-05-20 13:23           ` Jeff King
2011-05-19 21:25   ` [PATCH 3/4] commit: add infrastructure for priority queues of commits Jeff King
2011-05-19 21:26   ` Jeff King [this message]
2011-05-20 22:03   ` [RFC/PATCH 0/4] commit lists as priority queues Junio C Hamano
2011-05-20  1:42 ` [PATCH] fetch: avoid repeated commits in mark_complete Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20110519212657.GD29584@sigill.intra.peff.net \
    --to=peff@peff.net \
    --cc=git@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.