* [PATCH/RFC 0/6] reuse deltas found by bitmaps
@ 2014-03-26  7:22 Jeff King
  2014-03-26  7:22 ` [PATCH 1/6] t/perf-lib: factor boilerplate out of test_perf Jeff King
                   ` (7 more replies)
  0 siblings, 8 replies; 21+ messages in thread
From: Jeff King @ 2014-03-26  7:22 UTC (permalink / raw)
  To: git; +Cc: Ben Maurer, Siddharth Agarwal

[tl;dr the patch is the same as before, but there is a script to measure
       its effects; please try it out on your repos]

This is a continuation of the discussion here:

  http://thread.gmane.org/gmane.comp.version-control.git/239647

I'll summarize the story so far.

Basically, the problem is that while pack bitmaps make the "Counting
objects" phase of a fetch fast (which in turn makes clones very fast),
they do not help as much for an incremental fetch. One, because we have
a lot fewer objects to count, so there is less for us to speed up there.
And two, we spend a lot more effort on delta compression for a fetch,
because we are often throwing out on-disk deltas that we are not sure
the other side has.

One common case is that the server side is mostly packed and has
bitmaps, and the fetching client has some subset of its objects, but
wants to bring itself up to the tip. Any delta that the server has on
disk is generally eligible for reuse, because the base is either:

  1. An object that the client already has.

  2. An object that we are going to send as part of the new pack.

However, we quite often do not notice case (1), because we only consider
a subset of the client's objects as "preferred bases" for thin-packing.
The reason for this is that traditionally it was expensive to enumerate
all of the objects we know the client has. But with bitmaps, this is
very cheap. So the basic idea is to better notice which objects the
client has and change our delta reuse and preferred-base rules.
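Since patch 4 below drives pack-objects and index-pack directly, here is a
minimal, self-contained sketch of that same thin-pack round trip (the
throwaway repo, file names, and commit contents are invented for
illustration):

```shell
# Sketch of the thin-pack round trip that delta reuse rides on: build a
# throwaway repo with two commits, pack only what a client stuck at the
# old tip is missing, then resolve the thin pack as the client would.
set -e
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git config user.name tester
git config user.email tester@example.com
echo one >file && git add file && git commit -qm one
old=$(git rev-parse HEAD)              # what the client already has
echo two >file && git commit -qam two  # the new tip the client wants
{
	echo HEAD
	echo "^$old"
} | git pack-objects --stdout --revs --thin --delta-base-offset >thin.pack
# the receiver must append any delta bases it was assumed to have:
git index-pack --stdin --fix-thin <thin.pack
```

The thin pack may refer to bases (objects reachable from $old) that are
not in the pack itself, which is why the receiving side has to run
index-pack with --fix-thin.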

There are three basic strategies I can think of for doing this:

  1. Add every object the client has to the packing list as a preferred
     base.

  2. When considering whether a delta can be reused, check the bitmaps
     to see if the client has the base. If so, allow reuse.

  3. Do (2), but also add the object as a preferred base.

I think that option (1) is probably a bad idea; we'll spend a lot of
time generating a large packing list, and the huge number of objects
will probably clog the delta window.

Option (3) might or might not be a good idea. It would hopefully give us
some relevant subset of the objects to use as preferred bases, in case
other objects also need deltas.

The implementation I'm including here is the one I've shown before,
which does (2). Part of the reason that I'm reposting it before looking
further into these options is that I've written a t/perf script that can
help with the analysis.

I've run it against git.git and linux.git. I'd be curious to see the
results on other repositories that might have different delta patterns.

The script simulates a fetch from a fully-packed server by a client who
has not fetched in N days, for several values of N. It measures the time
taken on the server (to run pack-objects), the time taken on the client
(to run index-pack), and the size of the resulting pack. Here are the
results for running it on linux.git (note that the script output has the
tests interleaved, but I've rearranged it here for clarity):

    Test                         origin             HEAD                   
    -----------------------------------------------------------------------
    5311.4: size     (1 days)               1.0M              59.5K -94.1% 
    5311.8: size     (2 days)               1.0M              59.5K -94.1% 
    5311.12: size     (4 days)              1.1M              67.9K -93.5% 
    5311.16: size     (8 days)              1.5M             130.1K -91.4% 
    5311.20: size    (16 days)              3.7M             359.8K -90.3% 
    5311.24: size    (32 days)              8.6M               1.4M -84.3% 
    5311.28: size    (64 days)             68.3M              23.0M -66.3% 
    5311.32: size   (128 days)             83.1M              35.1M -57.7% 

You can see that it produces much smaller packs, because we're getting
better delta reuse (and bitmaps generally don't produce preferred bases,
so we were probably failing to make thin packs at all). The
percentage benefit lessens as we go further back, of course, because the
thin objects will be at the "edge" of the pack (and the bigger the pack,
the less of it is edge).

So far, so good. Here are the server timings:

    Test                         origin             HEAD                   
    -----------------------------------------------------------------------
    5311.3: server   (1 days)    0.29(0.33+0.03)    0.14(0.11+0.02) -51.7% 
    5311.7: server   (2 days)    0.29(0.33+0.04)    0.14(0.11+0.02) -51.7% 
    5311.11: server   (4 days)   0.29(0.32+0.04)    0.14(0.11+0.02) -51.7% 
    5311.15: server   (8 days)   0.36(0.45+0.04)    0.14(0.11+0.02) -61.1% 
    5311.19: server  (16 days)   0.74(1.17+0.05)    0.26(0.23+0.02) -64.9% 
    5311.23: server  (32 days)   1.38(2.53+0.06)    0.29(0.25+0.03) -79.0% 
    5311.27: server  (64 days)   7.12(15.70+0.18)   0.43(0.38+0.07) -94.0% 
    5311.31: server (128 days)   7.60(16.72+0.19)   0.52(0.48+0.07) -93.2% 

Again, looking good. We're cutting out the whole delta-search phase on
the server, because we can reuse all of our deltas. Note that these
measurements represent the optimal case: a fully packed repository with
a client that wants every object we have. So we should get full delta
reuse. That also means that this test probably won't show any advantages
for option (3) above, where we would add more preferred bases. We don't
have any candidates for delta compression, so it doesn't matter how many
preferred bases we have.

And here's the bad(ish) news:

    Test                         origin             HEAD                   
    -----------------------------------------------------------------------
    5311.5: client   (1 days)    0.03(0.04+0.00)    0.08(0.07+0.00) +166.7%
    5311.9: client   (2 days)    0.03(0.04+0.00)    0.09(0.08+0.00) +200.0%
    5311.13: client   (4 days)   0.04(0.03+0.01)    0.09(0.08+0.00) +125.0%
    5311.17: client   (8 days)   0.05(0.06+0.00)    0.12(0.11+0.00) +140.0%
    5311.21: client  (16 days)   0.13(0.14+0.00)    0.29(0.28+0.01) +123.1%
    5311.25: client  (32 days)   0.32(0.37+0.01)    0.67(0.66+0.01) +109.4%
    5311.29: client  (64 days)   3.14(4.52+0.19)    6.27(6.50+0.13) +99.7% 
    5311.33: client (128 days)   4.16(6.06+0.23)    7.17(7.86+0.16) +72.4% 

The client has to spend more effort, because it now has to "fix" the
thin pack it receives. But there's a silver lining. Even though the
percent change looks bad, that's only because the client was already
spending much less time than the server. In absolute numbers, we should
come out ahead.

Let's look at the 128-day case as an example.  Before this series, we
spent 7.6s on the server and 4.2s on the client. We sent an 83M pack. If
we assume a 25Mbit connection, that's 26.5s of transfer. The total is
38.3s for the whole fetch (this is slightly inaccurate, because
index-pack does some work while the pack is streaming in from the
network, but most of it does happen afterwards).

After the series, we spend only 0.5s on the server, but 7.2s on the
client. Our pack shrinks to 35M, or 11.2s of transfer time. Our total is
now 18.9s (about half).
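For a quick sanity check, the same arithmetic can be replayed from the
numbers above (the 25Mbit link speed is the assumption here, and pack
sizes are decimal megabytes as the perf harness prints them):

```shell
# recompute the 128-day wall-clock estimates from the measurements above
awk 'BEGIN {
	mbps = 25 / 8                       # 25 Mbit/s expressed in MB/s
	before = 7.6 + 4.2 + 83 / mbps      # server + client + transfer
	after  = 0.5 + 7.2 + 35 / mbps
	printf "before: %.1fs  after: %.1fs  saved: %.0f%%\n",
	       before, after, 100 * (1 - after / before)
}'
```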

The other cases show a similar effect, though sometimes less pronounced
(but the very small fetches are already so fast that they are not really
that interesting). Even without the net win in wall-clock time, I think
the reduced server and network load are worth it, but it's nice that it
ends up faster overall, too.

The other interesting thing is to compare CPU and wall-clock time. In
the 128-day case we only saved 7s of wall-clock time, but we saved
almost 17s of CPU time (this is on a HT quad-core machine). On the
client side, we added 3s of wall-clock time, but we spent only 1.8s of
extra CPU time.

This is because most of index-pack is multi-threaded, but the
thin-object resolution is not. So the more thin objects we send, the
less effectively index-pack can use multiple cores. This might be worth
looking into as a next step.

I'd be curious to see numbers for other repositories. I know that Ben
mentioned in our initial discussion that while they saw a server CPU
time improvement on their repo at Facebook, they didn't see the same
pack-size savings.

If you want to run the test yourself, you can either apply these
patches, or fetch from:

  git://github.com/peff/git.git jk/bitmap-reuse-delta

and then run:

  cd t/perf
  export GIT_PERF_REPO=/path/to/your/repo
  ./run origin HEAD p5311-pack-bitmaps-fetch.sh

If you want to re-show the results without re-running the test, you can
do:

  perl aggregate.perl origin HEAD p5311-pack-bitmaps-fetch.sh

-Peff

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH 1/6] t/perf-lib: factor boilerplate out of test_perf
  2014-03-26  7:22 [PATCH/RFC 0/6] reuse deltas found by bitmaps Jeff King
@ 2014-03-26  7:22 ` Jeff King
  2014-03-26 22:34   ` Junio C Hamano
  2014-03-26  7:22 ` [PATCH 2/6] t/perf/aggregate: factor out percent calculations Jeff King
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 21+ messages in thread
From: Jeff King @ 2014-03-26  7:22 UTC (permalink / raw)
  To: git; +Cc: Ben Maurer, Siddharth Agarwal

About half of test_perf() is boilerplate, and half is
actually related to running the perf test. Let's split it
into two functions, so that we can reuse the boilerplate in
future commits.

Signed-off-by: Jeff King <peff@peff.net>
---
 t/perf/perf-lib.sh | 61 +++++++++++++++++++++++++++++++-----------------------
 1 file changed, 35 insertions(+), 26 deletions(-)

diff --git a/t/perf/perf-lib.sh b/t/perf/perf-lib.sh
index a8c9574..20f306a 100644
--- a/t/perf/perf-lib.sh
+++ b/t/perf/perf-lib.sh
@@ -150,8 +150,8 @@ exit $ret' >&3 2>&4
 	return "$eval_ret"
 }
 
-
-test_perf () {
+test_wrapper_ () {
+	test_wrapper_func_=$1; shift
 	test_start_
 	test "$#" = 3 && { test_prereq=$1; shift; } || test_prereq=
 	test "$#" = 2 ||
@@ -162,35 +162,44 @@ test_perf () {
 		base=$(basename "$0" .sh)
 		echo "$test_count" >>"$perf_results_dir"/$base.subtests
 		echo "$1" >"$perf_results_dir"/$base.$test_count.descr
-		if test -z "$verbose"; then
-			printf "%s" "perf $test_count - $1:"
-		else
-			echo "perf $test_count - $1:"
-		fi
-		for i in $(test_seq 1 $GIT_PERF_REPEAT_COUNT); do
-			say >&3 "running: $2"
-			if test_run_perf_ "$2"
-			then
-				if test -z "$verbose"; then
-					printf " %s" "$i"
-				else
-					echo "* timing run $i/$GIT_PERF_REPEAT_COUNT:"
-				fi
+		base="$perf_results_dir"/"$perf_results_prefix$(basename "$0" .sh)"."$test_count"
+		"$test_wrapper_func_" "$@"
+	fi
+
+	test_finish_
+}
+
+test_perf_ () {
+	if test -z "$verbose"; then
+		printf "%s" "perf $test_count - $1:"
+	else
+		echo "perf $test_count - $1:"
+	fi
+	for i in $(test_seq 1 $GIT_PERF_REPEAT_COUNT); do
+		say >&3 "running: $2"
+		if test_run_perf_ "$2"
+		then
+			if test -z "$verbose"; then
+				printf " %s" "$i"
 			else
-				test -z "$verbose" && echo
-				test_failure_ "$@"
-				break
+				echo "* timing run $i/$GIT_PERF_REPEAT_COUNT:"
 			fi
-		done
-		if test -z "$verbose"; then
-			echo " ok"
 		else
-			test_ok_ "$1"
+			test -z "$verbose" && echo
+			test_failure_ "$@"
+			break
 		fi
-		base="$perf_results_dir"/"$perf_results_prefix$(basename "$0" .sh)"."$test_count"
-		"$TEST_DIRECTORY"/perf/min_time.perl test_time.* >"$base".times
+	done
+	if test -z "$verbose"; then
+		echo " ok"
+	else
+		test_ok_ "$1"
 	fi
-	test_finish_
+	"$TEST_DIRECTORY"/perf/min_time.perl test_time.* >"$base".times
+}
+
+test_perf () {
+	test_wrapper_ test_perf_ "$@"
 }
 
 # We extend test_done to print timings at the end (./run disables this
-- 
1.9.1.601.g7ec968e


* [PATCH 2/6] t/perf/aggregate: factor out percent calculations
  2014-03-26  7:22 [PATCH/RFC 0/6] reuse deltas found by bitmaps Jeff King
  2014-03-26  7:22 ` [PATCH 1/6] t/perf-lib: factor boilerplate out of test_perf Jeff King
@ 2014-03-26  7:22 ` Jeff King
  2014-03-26  7:22 ` [PATCH 3/6] t/perf: add infrastructure for measuring sizes Jeff King
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 21+ messages in thread
From: Jeff King @ 2014-03-26  7:22 UTC (permalink / raw)
  To: git; +Cc: Ben Maurer, Siddharth Agarwal

This will let us reuse the code when we add new values to
aggregate besides times.

Signed-off-by: Jeff King <peff@peff.net>
---
 t/perf/aggregate.perl | 21 ++++++++++++---------
 1 file changed, 12 insertions(+), 9 deletions(-)

diff --git a/t/perf/aggregate.perl b/t/perf/aggregate.perl
index 15f7fc1..690cd8c 100755
--- a/t/perf/aggregate.perl
+++ b/t/perf/aggregate.perl
@@ -16,21 +16,24 @@ sub get_times {
 	return ($rt, $4, $5);
 }
 
+sub relative_change {
+	my ($r, $firstr) = @_;
+	if ($firstr > 0) {
+		return sprintf "%+.1f%%", 100.0*($r-$firstr)/$firstr;
+	} elsif ($r == 0) {
+		return "=";
+	} else {
+		return "+inf";
+	}
+}
+
 sub format_times {
 	my ($r, $u, $s, $firstr) = @_;
 	if (!defined $r) {
 		return "<missing>";
 	}
 	my $out = sprintf "%.2f(%.2f+%.2f)", $r, $u, $s;
-	if (defined $firstr) {
-		if ($firstr > 0) {
-			$out .= sprintf " %+.1f%%", 100.0*($r-$firstr)/$firstr;
-		} elsif ($r == 0) {
-			$out .= " =";
-		} else {
-			$out .= " +inf";
-		}
-	}
+	$out .= ' ' . relative_change($r, $firstr) if defined $firstr;
 	return $out;
 }
 
-- 
1.9.1.601.g7ec968e


* [PATCH 3/6] t/perf: add infrastructure for measuring sizes
  2014-03-26  7:22 [PATCH/RFC 0/6] reuse deltas found by bitmaps Jeff King
  2014-03-26  7:22 ` [PATCH 1/6] t/perf-lib: factor boilerplate out of test_perf Jeff King
  2014-03-26  7:22 ` [PATCH 2/6] t/perf/aggregate: factor out percent calculations Jeff King
@ 2014-03-26  7:22 ` Jeff King
  2014-03-26  7:22 ` [PATCH 4/6] t/perf: add perf tests for fetches from a bitmapped server Jeff King
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 21+ messages in thread
From: Jeff King @ 2014-03-26  7:22 UTC (permalink / raw)
  To: git; +Cc: Ben Maurer, Siddharth Agarwal

The main objective of scripts in the perf framework is to
run "test_perf", which measures the time it takes to run
some operation. However, it can also be interesting to see
the change in the output size of certain operations.

This patch introduces test_size, which records a single
numeric output from the test and shows it in the aggregated
output (with pretty printing and relative size comparison).

Signed-off-by: Jeff King <peff@peff.net>
---
 t/perf/README         | 20 ++++++++++++++++++++
 t/perf/aggregate.perl | 48 +++++++++++++++++++++++++++++++++++++++++++-----
 t/perf/perf-lib.sh    | 13 +++++++++++++
 3 files changed, 76 insertions(+), 5 deletions(-)

diff --git a/t/perf/README b/t/perf/README
index 8848c14..09c400f 100644
--- a/t/perf/README
+++ b/t/perf/README
@@ -144,3 +144,23 @@ that
   While we have tried to make sure that it can cope with embedded
   whitespace and other special characters, it will not work with
   multi-line data.
+
+Rather than tracking the performance by run-time as `test_perf` does, you
+may also track output size by using `test_size`. The stdout of the
+function should be a single numeric value, which will be captured and
+shown in the aggregated output. For example:
+
+	test_perf 'time foo' '
+		./foo >foo.out
+	'
+
+	test_size 'output size' '
+		wc -c <foo.out
+	'
+
+might produce output like:
+
+	Test                origin           HEAD
+	-------------------------------------------------------------
+	1234.1 time foo     0.37(0.79+0.02)  0.26(0.51+0.02) -29.7%
+	1234.2 output size             4.3M             3.6M -14.7%
diff --git a/t/perf/aggregate.perl b/t/perf/aggregate.perl
index 690cd8c..42739a5 100755
--- a/t/perf/aggregate.perl
+++ b/t/perf/aggregate.perl
@@ -10,10 +10,16 @@ sub get_times {
 	my $line = <$fh>;
 	return undef if not defined $line;
 	close $fh or die "cannot close $name: $!";
-	$line =~ /^(?:(\d+):)?(\d+):(\d+(?:\.\d+)?) (\d+(?:\.\d+)?) (\d+(?:\.\d+)?)$/
-		or die "bad input line: $line";
-	my $rt = ((defined $1 ? $1 : 0.0)*60+$2)*60+$3;
-	return ($rt, $4, $5);
+	# times
+	if ($line =~ /^(?:(\d+):)?(\d+):(\d+(?:\.\d+)?) (\d+(?:\.\d+)?) (\d+(?:\.\d+)?)$/) {
+		my $rt = ((defined $1 ? $1 : 0.0)*60+$2)*60+$3;
+		return ($rt, $4, $5);
+	# size
+	} elsif ($line =~ /^\d+$/) {
+		return $&;
+	} else {
+		die "bad input line: $line";
+	}
 }
 
 sub relative_change {
@@ -29,14 +35,39 @@ sub relative_change {
 
 sub format_times {
 	my ($r, $u, $s, $firstr) = @_;
+	# no value means we did not finish the test
 	if (!defined $r) {
 		return "<missing>";
 	}
+	# a single value means we have a size, not times
+	if (!defined $u) {
+		return format_size($r, $firstr);
+	}
+	# otherwise, we have real/user/system times
 	my $out = sprintf "%.2f(%.2f+%.2f)", $r, $u, $s;
 	$out .= ' ' . relative_change($r, $firstr) if defined $firstr;
 	return $out;
 }
 
+sub human_size {
+	my $n = shift;
+	my @units = ('', qw(K M G));
+	while ($n > 900 && @units > 1) {
+		$n /= 1000;
+		shift @units;
+	}
+	return $n unless length $units[0];
+	return sprintf '%.1f%s', $n, $units[0];
+}
+
+sub format_size {
+	my ($size, $first) = @_;
+	# match the width of a time: 0.00(0.00+0.00)
+	my $out = sprintf '%15s', human_size($size);
+	$out .= ' ' . relative_change($size, $first) if defined $first;
+	return $out;
+}
+
 my (@dirs, %dirnames, %dirabbrevs, %prefixes, @tests);
 while (scalar @ARGV) {
 	my $arg = $ARGV[0];
@@ -139,7 +170,14 @@ sub have_slash {
 	my $firstr;
 	for my $i (0..$#dirs) {
 		my $d = $dirs[$i];
-		$times{$prefixes{$d}.$t} = [get_times("test-results/$prefixes{$d}$t.times")];
+		my $base = "test-results/$prefixes{$d}$t";
+		$times{$prefixes{$d}.$t} = [];
+		foreach my $type (qw(times size)) {
+			if (-e "$base.$type") {
+				$times{$prefixes{$d}.$t} = [get_times("$base.$type")];
+				last;
+			}
+		}
 		my ($r,$u,$s) = @{$times{$prefixes{$d}.$t}};
 		my $w = length format_times($r,$u,$s,$firstr);
 		$colwidth[$i] = $w if $w > $colwidth[$i];
diff --git a/t/perf/perf-lib.sh b/t/perf/perf-lib.sh
index 20f306a..fb8e017 100644
--- a/t/perf/perf-lib.sh
+++ b/t/perf/perf-lib.sh
@@ -202,6 +202,19 @@ test_perf () {
 	test_wrapper_ test_perf_ "$@"
 }
 
+test_size_ () {
+	say >&3 "running: $2"
+	if test_eval_ "$2" 3>"$base".size; then
+		test_ok_ "$1"
+	else
+		test_failure_ "$@"
+	fi
+}
+
+test_size () {
+	test_wrapper_ test_size_ "$@"
+}
+
 # We extend test_done to print timings at the end (./run disables this
 # and does it after running everything)
 test_at_end_hook_ () {
-- 
1.9.1.601.g7ec968e


* [PATCH 4/6] t/perf: add perf tests for fetches from a bitmapped server
  2014-03-26  7:22 [PATCH/RFC 0/6] reuse deltas found by bitmaps Jeff King
                   ` (2 preceding siblings ...)
  2014-03-26  7:22 ` [PATCH 3/6] t/perf: add infrastructure for measuring sizes Jeff King
@ 2014-03-26  7:22 ` Jeff King
  2014-03-26  7:23 ` [PATCH 5/6] pack-bitmap: save "have" bitmap from walk Jeff King
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 21+ messages in thread
From: Jeff King @ 2014-03-26  7:22 UTC (permalink / raw)
  To: git; +Cc: Ben Maurer, Siddharth Agarwal

A server with bitmapped packs can serve a clone very
quickly. However, fetches are not necessarily made any
faster, because we spend a lot less time in object traversal
(which is what bitmaps help with) and more time finding
deltas (because we may have to throw out on-disk deltas if
the client does not have the other side).

As a first step to making this faster, this patch introduces
a new perf script to measure fetches into a repo of various
ages from a fully-bitmapped server.

We separately measure the work done by the server (in
pack-objects) and that done by the client (in index-pack).
Furthermore, we measure the size of the resulting pack.

Breaking it down like this (instead of just doing a regular
"git fetch") lets us see how much each side benefits from
any changes. And since we know the pack size, if we estimate
the network speed, then we can calculate a complete
wall-clock time for the operation.

Signed-off-by: Jeff King <peff@peff.net>
---
 t/perf/p5311-pack-bitmaps-fetch.sh | 45 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 45 insertions(+)
 create mode 100755 t/perf/p5311-pack-bitmaps-fetch.sh

diff --git a/t/perf/p5311-pack-bitmaps-fetch.sh b/t/perf/p5311-pack-bitmaps-fetch.sh
new file mode 100755
index 0000000..b045759
--- /dev/null
+++ b/t/perf/p5311-pack-bitmaps-fetch.sh
@@ -0,0 +1,45 @@
+#!/bin/sh
+
+test_description='performance of fetches from bitmapped packs'
+. ./perf-lib.sh
+
+test_perf_default_repo
+
+test_expect_success 'create bitmapped server repo' '
+	git config pack.writebitmaps true &&
+	git config pack.writebitmaphashcache true &&
+	git repack -ad
+'
+
+# simulate a fetch from a repository that last fetched N days ago, for
+# various values of N. We do so by following the first-parent chain,
+# and assume the first entry in the chain that is N days older than the current
+# HEAD is where the HEAD would have been then.
+for days in 1 2 4 8 16 32 64 128; do
+	title=$(printf '%10s' "($days days)")
+	test_expect_success "setup revs from $days days ago" '
+		now=$(git log -1 --format=%ct HEAD) &&
+		then=$(($now - ($days * 86400))) &&
+		tip=$(git rev-list -1 --first-parent --until=$then HEAD) &&
+		{
+			echo HEAD &&
+			echo ^$tip
+		} >revs
+	'
+
+	test_perf "server $title" '
+		git pack-objects --stdout --revs \
+				 --thin --delta-base-offset \
+				 <revs >tmp.pack
+	'
+
+	test_size "size   $title" '
+		wc -c <tmp.pack
+	'
+
+	test_perf "client $title" '
+		git index-pack --stdin --fix-thin <tmp.pack
+	'
+done
+
+test_done
-- 
1.9.1.601.g7ec968e


* [PATCH 5/6] pack-bitmap: save "have" bitmap from walk
  2014-03-26  7:22 [PATCH/RFC 0/6] reuse deltas found by bitmaps Jeff King
                   ` (3 preceding siblings ...)
  2014-03-26  7:22 ` [PATCH 4/6] t/perf: add perf tests for fetches from a bitmapped server Jeff King
@ 2014-03-26  7:23 ` Jeff King
  2014-03-26  7:23 ` [PATCH 6/6] pack-objects: reuse deltas for thin "have" objects Jeff King
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 21+ messages in thread
From: Jeff King @ 2014-03-26  7:23 UTC (permalink / raw)
  To: git; +Cc: Ben Maurer, Siddharth Agarwal

When we do a bitmap walk, we save the result, which
represents (WANTs & ~HAVEs); i.e., every object we care
about visiting in our walk. However, we throw away the
haves bitmap, which can sometimes be useful, too. Save it
and provide an access function so code which has performed a
walk can query it.

Signed-off-by: Jeff King <peff@peff.net>
---
 pack-bitmap.c | 21 ++++++++++++++++++++-
 pack-bitmap.h |  2 ++
 2 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/pack-bitmap.c b/pack-bitmap.c
index a31e529..f7d417b 100644
--- a/pack-bitmap.c
+++ b/pack-bitmap.c
@@ -86,6 +86,9 @@ static struct bitmap_index {
 	/* Bitmap result of the last performed walk */
 	struct bitmap *result;
 
+	/* "have" bitmap from the last performed walk */
+	struct bitmap *haves;
+
 	/* Version of the bitmap index */
 	unsigned int version;
 
@@ -769,8 +772,8 @@ int prepare_bitmap_walk(struct rev_info *revs)
 		bitmap_and_not(wants_bitmap, haves_bitmap);
 
 	bitmap_git.result = wants_bitmap;
+	bitmap_git.haves = haves_bitmap;
 
-	bitmap_free(haves_bitmap);
 	return 0;
 }
 
@@ -1097,3 +1100,19 @@ int rebuild_existing_bitmaps(struct packing_data *mapping,
 	bitmap_free(rebuild);
 	return 0;
 }
+
+int bitmap_have(const unsigned char *sha1)
+{
+	int pos;
+
+	if (!bitmap_git.loaded)
+		return 0; /* no bitmap loaded */
+	if (!bitmap_git.haves)
+		return 0; /* walk had no "haves" */
+
+	pos = bitmap_position_packfile(sha1);
+	if (pos < 0)
+		return 0;
+
+	return bitmap_get(bitmap_git.haves, pos);
+}
diff --git a/pack-bitmap.h b/pack-bitmap.h
index 8b7f4e9..a63ee6b 100644
--- a/pack-bitmap.h
+++ b/pack-bitmap.h
@@ -49,6 +49,8 @@ int prepare_bitmap_walk(struct rev_info *revs);
 int reuse_partial_packfile_from_bitmap(struct packed_git **packfile, uint32_t *entries, off_t *up_to);
 int rebuild_existing_bitmaps(struct packing_data *mapping, khash_sha1 *reused_bitmaps, int show_progress);
 
+int bitmap_have(const unsigned char *sha1);
+
 void bitmap_writer_show_progress(int show);
 void bitmap_writer_set_checksum(unsigned char *sha1);
 void bitmap_writer_build_type_index(struct pack_idx_entry **index, uint32_t index_nr);
-- 
1.9.1.601.g7ec968e


* [PATCH 6/6] pack-objects: reuse deltas for thin "have" objects
  2014-03-26  7:22 [PATCH/RFC 0/6] reuse deltas found by bitmaps Jeff King
                   ` (4 preceding siblings ...)
  2014-03-26  7:23 ` [PATCH 5/6] pack-bitmap: save "have" bitmap from walk Jeff King
@ 2014-03-26  7:23 ` Jeff King
  2014-03-28  4:23   ` Eric Sunshine
  2014-03-26 17:33 ` [PATCH/RFC 0/6] reuse deltas found by bitmaps Junio C Hamano
  2014-03-26 22:40 ` Siddharth Agarwal
  7 siblings, 1 reply; 21+ messages in thread
From: Jeff King @ 2014-03-26  7:23 UTC (permalink / raw)
  To: git; +Cc: Ben Maurer, Siddharth Agarwal

When we calculate the "wants" and "haves" for a pack, we
only add the objects in the boundary commits as preferred
bases. However, we know that every object reachable from the
"haves" could be a preferred base.

We probably don't want to add these to our preferred base
list, because they would clog the delta-search window.
However, there is no reason we cannot reuse an on-disk delta
against such a deep "have" base, avoiding the delta search
for that object altogether.

Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/pack-objects.c | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 7950c43..92bd682 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -53,6 +53,7 @@ static unsigned long pack_size_limit;
 static int depth = 50;
 static int delta_search_threads;
 static int pack_to_stdout;
+static int thin;
 static int num_preferred_base;
 static struct progress *progress_state;
 static int pack_compression_level = Z_DEFAULT_COMPRESSION;
@@ -1419,6 +1420,20 @@ static void check_object(struct object_entry *entry)
 			base_entry->delta_child = entry;
 			unuse_pack(&w_curs);
 			return;
+		} else if (thin && base_ref && bitmap_have(base_ref)) {
+			entry->type = entry->in_pack_type;
+			entry->delta_size = entry->size;
+			/*
+			 * XXX we'll leak this, but it's probably OK
+			 * since we'll exit immediately after the packing
+			 * is done
+			 */
+			entry->delta = xcalloc(1, sizeof(*entry->delta));
+			hashcpy(entry->delta->idx.sha1, base_ref);
+			entry->delta->preferred_base = 1;
+			entry->delta->filled = 1;
+			unuse_pack(&w_curs);
+			return;
 		}
 
 		if (entry->type) {
@@ -2559,7 +2574,6 @@ static int option_parse_ulong(const struct option *opt,
 int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 {
 	int use_internal_rev_list = 0;
-	int thin = 0;
 	int all_progress_implied = 0;
 	const char *rp_av[6];
 	int rp_ac = 0;
-- 
1.9.1.601.g7ec968e


* Re: [PATCH/RFC 0/6] reuse deltas found by bitmaps
  2014-03-26  7:22 [PATCH/RFC 0/6] reuse deltas found by bitmaps Jeff King
                   ` (5 preceding siblings ...)
  2014-03-26  7:23 ` [PATCH 6/6] pack-objects: reuse deltas for thin "have" objects Jeff King
@ 2014-03-26 17:33 ` Junio C Hamano
  2014-03-26 18:13   ` Jeff King
  2014-03-26 22:40 ` Siddharth Agarwal
  7 siblings, 1 reply; 21+ messages in thread
From: Junio C Hamano @ 2014-03-26 17:33 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Ben Maurer, Siddharth Agarwal

Jeff King <peff@peff.net> writes:

>   2. When considering whether a delta can be reused, check the bitmaps
>      to see if the client has the base. If so, allow reuse.
> ...
> The implementation I'm including here is the one I've shown before,
> which does (2). Part of the reason that I'm reposting it before looking
> further into these options is that I've written a t/perf script that can
> help with the analysis.

Conceptually, that feels like a natural extension for the "thin
pack" transfer.  I wouldn't foresee a correctness issue, as long as
the fetcher runs "index-pack --fix-thin" at their end.


* Re: [PATCH/RFC 0/6] reuse deltas found by bitmaps
  2014-03-26 17:33 ` [PATCH/RFC 0/6] reuse deltas found by bitmaps Junio C Hamano
@ 2014-03-26 18:13   ` Jeff King
  2014-03-26 22:31     ` Junio C Hamano
  2014-03-27  1:13     ` Jeff King
  0 siblings, 2 replies; 21+ messages in thread
From: Jeff King @ 2014-03-26 18:13 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Ben Maurer, Siddharth Agarwal

On Wed, Mar 26, 2014 at 10:33:41AM -0700, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> >   2. When considering whether a delta can be reused, check the bitmaps
> >      to see if the client has the base. If so, allow reuse.
> > ...
> > The implementation I'm including here is the one I've shown before,
> > which does (2). Part of the reason that I'm reposting it before looking
> > further into these options is that I've written a t/perf script that can
> > help with the analysis.
> 
> Conceptually, that feels like a natural extension for the "thin
> pack" transfer.  I wouldn't foresee a correctness issue, as long as
> the fetcher runs "index-pack --fix-thin" at their end.

Right, this is only enabled when --thin is negotiated during the
protocol transfer. I don't think there's any correctness problem here.
But I want to make sure we are doing the fastest thing.

Here's another set of results comparing three runs: v1.9.0 (with no
bitmaps), origin/master (bitmaps), and this series. Apologies for the
long lines.

  Test                         v1.9.0            origin                     HEAD                   
  -------------------------------------------------------------------------------------------------
  5311.3: server   (1 days)    0.20(0.17+0.02)   0.29(0.34+0.04) +45.0%     0.14(0.12+0.02) -30.0% 
  5311.4: size     (1 days)              56.2K              1.0M +1694.4%             59.5K +5.9%  
  5311.5: client   (1 days)    0.08(0.07+0.00)   0.03(0.04+0.00) -62.5%     0.08(0.07+0.00) +0.0%  
  5311.7: server   (2 days)    0.06(0.06+0.00)   0.28(0.33+0.04) +366.7%    0.14(0.11+0.02) +133.3%
  5311.8: size     (2 days)              56.2K              1.0M +1694.4%             59.5K +5.9%  
  5311.9: client   (2 days)    0.09(0.08+0.00)   0.03(0.04+0.00) -66.7%     0.09(0.08+0.00) +0.0%  
  5311.11: server   (4 days)   0.21(0.18+0.02)   0.29(0.33+0.03) +38.1%     0.14(0.11+0.02) -33.3% 
  5311.12: size     (4 days)             64.2K              1.1M +1536.6%             67.9K +5.8%  
  5311.13: client   (4 days)   0.08(0.08+0.00)   0.04(0.03+0.01) -50.0%     0.09(0.07+0.01) +12.5% 
  5311.15: server   (8 days)   0.22(0.21+0.02)   0.36(0.47+0.03) +63.6%     0.14(0.11+0.02) -36.4% 
  5311.16: size     (8 days)            125.7K              1.5M +1100.8%            130.1K +3.5%  
  5311.17: client   (8 days)   0.11(0.10+0.00)   0.05(0.05+0.00) -54.5%     0.13(0.11+0.01) +18.2% 
  5311.19: server  (16 days)   0.26(0.26+0.03)   0.76(1.20+0.04) +192.3%    0.25(0.21+0.04) -3.8%  
  5311.20: size    (16 days)            358.6K              3.7M +931.7%             359.8K +0.3%  
  5311.21: client  (16 days)   0.28(0.27+0.01)   0.13(0.15+0.00) -53.6%     0.30(0.28+0.02) +7.1%  
  5311.23: server  (32 days)   0.44(0.54+0.07)   1.39(2.54+0.04) +215.9%    0.28(0.23+0.04) -36.4% 
  5311.24: size    (32 days)              1.1M              8.6M +652.4%               2.1M +83.9% 
  5311.25: client  (32 days)   0.66(0.64+0.02)   0.32(0.38+0.01) -51.5%     0.69(0.67+0.04) +4.5%  
  5311.27: server  (64 days)   2.78(4.02+0.17)   7.02(15.59+0.16) +152.5%   0.43(0.37+0.07) -84.5% 
  5311.28: size    (64 days)             18.7M             68.3M +265.5%              24.5M +31.3% 
  5311.29: client  (64 days)   6.25(6.29+0.16)   3.21(4.76+0.14) -48.6%     6.27(6.46+0.18) +0.3%  
  5311.31: server (128 days)   4.48(7.32+0.21)   7.56(16.60+0.16) +68.8%    0.51(0.45+0.10) -88.6% 
  5311.32: size   (128 days)             25.8M             83.1M +222.3%              35.9M +39.2% 
  5311.33: client (128 days)   7.32(7.58+0.17)   3.94(5.87+0.18) -46.2%     7.15(7.80+0.20) -2.3%  

My previous results showed that this series was an improvement over
straight bitmaps. But what they did not show is that bitmaps can
actually cause a regression for some fetches (I think because we do not
find the boundary commits at all, and therefore have no preferred bases).

Just looking at the 128-day case again, using bitmaps increased our
server CPU time _and_ made a much bigger pack. This series not only
fixes the CPU time regression, but it also drops the server CPU time to
almost nothing. That's a nice improvement, and it makes perfect sense:
we are reusing on-disk deltas instead of on-the-fly computing deltas
against the preferred bases.
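The reuse rule at work here can be sketched in a few lines (a toy model
with illustrative names, not git's actual code): an on-disk delta can be
shipped verbatim when its base is either in the pack being sent or among
the objects the client's bitmap says it already has.

```python
# Toy model of the delta-reuse check discussed above. "to_send" stands in
# for the objects going into the new pack; "client_haves" for the bitmap
# of objects the client is known to have. All names are illustrative.

def can_reuse_delta(base_id, to_send, client_haves):
    """True if an on-disk delta against base_id needs no recomputation."""
    return base_id in to_send or base_id in client_haves

to_send = {"obj_new"}
client_haves = {"obj_old"}

assert can_reuse_delta("obj_old", to_send, client_haves)      # client has base
assert can_reuse_delta("obj_new", to_send, client_haves)      # base in new pack
assert not can_reuse_delta("obj_unknown", to_send, client_haves)
```

In either case the server skips the delta search entirely, which is where
the CPU savings above come from.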

And it does help with the size regression, though the end result is
still a bit worse than v1.9.0. I'm not exactly sure what's going on
here. My guess is that we have objects to send that are not deltas on
disk (probably because they are at the tip of history and other things
are deltas against them). We would still want to do delta compression of
them against preferred bases, but we don't have any preferred bases in
our list.

So I think we still want to add back in some list of preferred base
objects when bitmaps are in use. The question is, which objects?
With bitmaps, it's easy to add in all of the objects the client has, but
that is probably going to be counterproductive (I still need to measure
it, though).

I think we could still add the objects from the tip of the client's HAVE
list. This patch would still be a CPU win on top of that, because it
would reduce the number of objects which need a delta search in the
first place.

But it also creates worse packs on the client side after --fix-thin is
run. Any preferred base we use is something the client may have to add
back in. So once a thin base is used, we would want to use it again to
reduce the work the client has to do to fix it. And in that sense, it
would be helpful to add in bases that we reuse via this patch as
preferred bases for further delta searches.

But solely relying on that would give us a rather incomplete set of
bases. If we are sending a blob for "foo.c" that has an on-disk delta
against an older version that the client has, we'd reuse that delta. And
then further versions of foo.c might be able to use that same base,
which is good. But if we are sending one version of "bar.c" that has no
on-disk delta, then it has no useful preferred bases.
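The gap described above can be sketched concretely (illustrative object
names only): only objects that already have a reusable on-disk delta
contribute a base, so a full object like bar.c ends up with no preferred
base at all.

```python
# Sketch of the "reused deltas become preferred bases" idea. Objects whose
# on-disk delta base is in the client's haves seed the preferred-base list;
# objects stored whole (bar.c) contribute nothing. Names are hypothetical.

on_disk_delta_base = {"foo.c@v2": "foo.c@v1", "foo.c@v3": "foo.c@v1"}
to_send = ["foo.c@v2", "foo.c@v3", "bar.c@v1"]
client_haves = {"foo.c@v1"}

preferred_bases = set()
for obj in to_send:
    base = on_disk_delta_base.get(obj)
    if base in client_haves:
        preferred_bases.add(base)  # reuse the delta, remember its base

assert preferred_bases == {"foo.c@v1"}  # bar.c got no preferred base
```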

So I think the next steps are probably:

  1. Measure the "all objects are preferred bases" approach and confirm
     that it is bad.

  2. Measure the "reused deltas become preferred bases" approach. I
     expect the resulting size to be slightly better than what I have
     now, but not as good as v1.9.0's size (but taking less CPU time).

  3. Measure the "figure out boundaries and add them as preferred bases,
     like we do without bitmaps" approach. I expect this to behave
     similarly to v1.9.0.

  4. Combine (2) and (3) and measure them. I'm _hoping_ this will give
     us the best of both worlds, but I still want to do the individual
     measurements so we can see where any improvement is coming from.

-Peff

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH/RFC 0/6] reuse deltas found by bitmaps
  2014-03-26 18:13   ` Jeff King
@ 2014-03-26 22:31     ` Junio C Hamano
  2014-03-26 22:36       ` Jeff King
  2014-03-27  1:13     ` Jeff King
  1 sibling, 1 reply; 21+ messages in thread
From: Junio C Hamano @ 2014-03-26 22:31 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Ben Maurer, Siddharth Agarwal

Jeff King <peff@peff.net> writes:

> Just looking at the 128-day case again, using bitmaps increased our
> server CPU time _and_ made a much bigger pack. This series not only
> fixes the CPU time regression, but it also drops the server CPU time to
> almost nothing. That's a nice improvement, and it makes perfect sense:
> we are reusing on-disk deltas instead of on-the-fly computing deltas
> against the preferred bases.

True.

> I think we could still add the objects from the tip of the client's HAVE
> list.

That should make the result at least on par with the non-bitmap case,
right?

> This patch would still be a CPU win on top of that, because it
> would reduce the number of objects which need a delta search in the
> first place.

Yes.

> So I think the next steps are probably:
>
>   1. Measure the "all objects are preferred bases" approach and confirm
>      that it is bad.

;-)

>   2. Measure the "reused deltas become preferred bases" approach. I
>      expect the resulting size to be slightly better than what I have
>      now, but not as good as v1.9.0's size (but taking less CPU time).

Do you mean "the bases for reused deltas become preferred bases, so
that we can deltify more objects off of them"?

>   3. Measure the "figure out boundaries and add them as preferred bases,
>      like we do without bitmaps" approach. I expect this to behave
>      similarly to v1.9.0.

Yes.

>   4. Combine (2) and (3) and measure them. I'm _hoping_ this will give
>      us the best of both worlds, but I still want to do the individual
>      measurements so we can see where any improvement is coming from.

Sensible.  Thanks.

* Re: [PATCH 1/6] t/perf-lib: factor boilerplate out of test_perf
  2014-03-26  7:22 ` [PATCH 1/6] t/perf-lib: factor boilerplate out of test_perf Jeff King
@ 2014-03-26 22:34   ` Junio C Hamano
  0 siblings, 0 replies; 21+ messages in thread
From: Junio C Hamano @ 2014-03-26 22:34 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Ben Maurer, Siddharth Agarwal

Jeff King <peff@peff.net> writes:

> About half of test_perf() is boilerplate, and half is
> actually related to running the perf test. Let's split it
> into two functions, so that we can reuse the boilerplate in
> future commits.
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
>  t/perf/perf-lib.sh | 61 +++++++++++++++++++++++++++++++-----------------------
>  1 file changed, 35 insertions(+), 26 deletions(-)

These early steps somewhat conflict with another topic that is
stalled (due to having no real users) on 'pu'.  I do not think we
would terribly mind dropping tg/perf-lib-test-perf-cleanup and have
it rebased if the author or somebody else wants to have it in my
tree later, but just FYI.

>
> diff --git a/t/perf/perf-lib.sh b/t/perf/perf-lib.sh
> index a8c9574..20f306a 100644
> --- a/t/perf/perf-lib.sh
> +++ b/t/perf/perf-lib.sh
> @@ -150,8 +150,8 @@ exit $ret' >&3 2>&4
>  	return "$eval_ret"
>  }
>  
> -
> -test_perf () {
> +test_wrapper_ () {
> +	test_wrapper_func_=$1; shift
>  	test_start_
>  	test "$#" = 3 && { test_prereq=$1; shift; } || test_prereq=
>  	test "$#" = 2 ||
> @@ -162,35 +162,44 @@ test_perf () {
>  		base=$(basename "$0" .sh)
>  		echo "$test_count" >>"$perf_results_dir"/$base.subtests
>  		echo "$1" >"$perf_results_dir"/$base.$test_count.descr
> -		if test -z "$verbose"; then
> -			printf "%s" "perf $test_count - $1:"
> -		else
> -			echo "perf $test_count - $1:"
> -		fi
> -		for i in $(test_seq 1 $GIT_PERF_REPEAT_COUNT); do
> -			say >&3 "running: $2"
> -			if test_run_perf_ "$2"
> -			then
> -				if test -z "$verbose"; then
> -					printf " %s" "$i"
> -				else
> -					echo "* timing run $i/$GIT_PERF_REPEAT_COUNT:"
> -				fi
> +		base="$perf_results_dir"/"$perf_results_prefix$(basename "$0" .sh)"."$test_count"
> +		"$test_wrapper_func_" "$@"
> +	fi
> +
> +	test_finish_
> +}
> +
> +test_perf_ () {
> +	if test -z "$verbose"; then
> +		printf "%s" "perf $test_count - $1:"
> +	else
> +		echo "perf $test_count - $1:"
> +	fi
> +	for i in $(test_seq 1 $GIT_PERF_REPEAT_COUNT); do
> +		say >&3 "running: $2"
> +		if test_run_perf_ "$2"
> +		then
> +			if test -z "$verbose"; then
> +				printf " %s" "$i"
>  			else
> -				test -z "$verbose" && echo
> -				test_failure_ "$@"
> -				break
> +				echo "* timing run $i/$GIT_PERF_REPEAT_COUNT:"
>  			fi
> -		done
> -		if test -z "$verbose"; then
> -			echo " ok"
>  		else
> -			test_ok_ "$1"
> +			test -z "$verbose" && echo
> +			test_failure_ "$@"
> +			break
>  		fi
> -		base="$perf_results_dir"/"$perf_results_prefix$(basename "$0" .sh)"."$test_count"
> -		"$TEST_DIRECTORY"/perf/min_time.perl test_time.* >"$base".times
> +	done
> +	if test -z "$verbose"; then
> +		echo " ok"
> +	else
> +		test_ok_ "$1"
>  	fi
> -	test_finish_
> +	"$TEST_DIRECTORY"/perf/min_time.perl test_time.* >"$base".times
> +}
> +
> +test_perf () {
> +	test_wrapper_ test_perf_ "$@"
>  }
>  
>  # We extend test_done to print timings at the end (./run disables this

* Re: [PATCH/RFC 0/6] reuse deltas found by bitmaps
  2014-03-26 22:31     ` Junio C Hamano
@ 2014-03-26 22:36       ` Jeff King
  0 siblings, 0 replies; 21+ messages in thread
From: Jeff King @ 2014-03-26 22:36 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Ben Maurer, Siddharth Agarwal

On Wed, Mar 26, 2014 at 03:31:41PM -0700, Junio C Hamano wrote:

> > I think we could still add the objects from the tip of the client's HAVE
> > list.
> 
> That should make the result at least on par with the non-bitmap case,
> right?

That's my expectation.

> 
> >   2. Measure the "reused deltas become preferred bases" approach. I
> >      expect the resulting size to be slightly better than what I have
> >      now, but not as good as v1.9.0's size (but taking less CPU time).
> 
> Do you mean "the bases for reused deltas become preferred bases, so
> that we can deltify more objects off of them"?

Exactly.

I'll update more when I have results for some of these. :)

-Peff

* Re: [PATCH/RFC 0/6] reuse deltas found by bitmaps
  2014-03-26  7:22 [PATCH/RFC 0/6] reuse deltas found by bitmaps Jeff King
                   ` (6 preceding siblings ...)
  2014-03-26 17:33 ` [PATCH/RFC 0/6] reuse deltas found by bitmaps Junio C Hamano
@ 2014-03-26 22:40 ` Siddharth Agarwal
  2014-03-27 14:09   ` Siddharth Agarwal
  7 siblings, 1 reply; 21+ messages in thread
From: Siddharth Agarwal @ 2014-03-26 22:40 UTC (permalink / raw)
  To: Jeff King, git; +Cc: Ben Maurer

On 03/26/2014 12:22 AM, Jeff King wrote:
> [tl;dr the patch is the same as before, but there is a script to measure
>         its effects; please try it out on your repos]

Here are results from one of our repos:

Test                         origin              HEAD
-------------------------------------------------------------------------
5311.3: server   (1 days)    1.77(2.49+0.15)     0.76(0.65+0.12) -57.1%
5311.4: size     (1 days)               7.8M                3.1M -60.2%
5311.5: client   (1 days)    0.56(0.48+0.03)     1.18(0.97+0.05) +110.7%
5311.7: server   (2 days)    2.79(3.68+0.25)     0.77(0.65+0.14) -72.4%
5311.8: size     (2 days)              24.2M                6.8M -71.8%
5311.9: client   (2 days)    1.14(0.95+0.10)     2.72(2.33+0.13) +138.6%
5311.11: server   (4 days)   3.70(4.76+0.28)     0.84(0.72+0.14) -77.3%
5311.12: size     (4 days)             51.2M               18.9M -63.0%
5311.13: client   (4 days)   2.29(2.02+0.29)     4.58(3.91+0.30) +100.0%
5311.15: server   (8 days)   5.16(7.48+0.38)     1.20(1.18+0.21) -76.7%
5311.16: size     (8 days)             78.3M               42.6M -45.5%
5311.17: client   (8 days)   3.73(3.67+0.40)     6.59(5.95+0.51) +76.7%
5311.19: server  (16 days)   6.48(10.27+0.52)    1.60(1.85+0.27) -75.3%
5311.20: size    (16 days)             97.5M               64.0M -34.4%
5311.21: client  (16 days)   5.31(5.76+0.57)     8.99(8.61+0.68) +69.3%
5311.23: server  (32 days)   8.56(14.00+0.67)    2.31(2.91+0.37) -73.0%
5311.24: size    (32 days)            124.7M               92.0M -26.2%
5311.25: client  (32 days)   7.69(9.20+0.76)     12.80(13.07+0.91) +66.4%
5311.27: server  (64 days)   37.47(45.48+1.55)   29.75(29.58+0.96) -20.6%
5311.28: size    (64 days)            245.5M              207.1M -15.6%
5311.29: client  (64 days)   15.11(18.26+1.58)   25.02(25.90+2.03) +65.6%
5311.31: server (128 days)   43.85(57.02+2.57)   29.88(29.54+1.35) -31.9%
5311.32: size   (128 days)            507.3M              461.6M -9.0%
5311.33: client (128 days)   27.13(31.54+3.32)   37.76(39.49+3.16) +39.2%

I'm currently running tests on a much larger repo than that. The full 
perf run will take several hours on that, so the soonest I can likely 
publish results for that is tomorrow.

* Re: [PATCH/RFC 0/6] reuse deltas found by bitmaps
  2014-03-26 18:13   ` Jeff King
  2014-03-26 22:31     ` Junio C Hamano
@ 2014-03-27  1:13     ` Jeff King
  2014-03-27 16:36       ` Junio C Hamano
  1 sibling, 1 reply; 21+ messages in thread
From: Jeff King @ 2014-03-27  1:13 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Ben Maurer, Siddharth Agarwal

On Wed, Mar 26, 2014 at 02:13:00PM -0400, Jeff King wrote:

> So I think the next steps are probably:
> 
>   1. Measure the "all objects are preferred bases" approach and confirm
>      that it is bad.

Below is a very rough patch to accomplish this. It just traverses the
"have" bitmap and adds every object with the "exclude" flag. The result
is as comically bad as I expected.

For a big fetch, it seems like it's working (numbers against v1.9.0):

  5311.31: server (128 days)   4.49(7.35+0.23)   4.98(6.82+3.31) +10.9%
  5311.32: size   (128 days)             25.8M             32.0M +24.2%
  5311.33: client (128 days)   7.17(7.38+0.20)   7.33(7.90+0.20) +2.2%

A modest increase in CPU time, and we get back most of our size
(remember that our "bad" case here is ~80M).

But for a small fetch...

  5311.3: server   (1 days)    0.20(0.17+0.03)   4.39(4.03+6.59) +2095.0%
  5311.4: size     (1 days)              57.2K             59.5K +4.1%
  5311.5: client   (1 days)    0.08(0.08+0.00)   0.08(0.08+0.00) +0.0%

Yikes. Besides spending lots of CPU on handling the enlarged object
list, notice that we still only dropped the size in the 128-day case to
32M. Which is almost exactly what the earlier "reuse on-disk deltas"
patch achieved.

What I think is happening is that we manage to reuse those on-disk
deltas (since they are now preferred bases, and we know we can). But we
never actually come up with any _new_ deltas, because the search window
is overwhelmed with candidates. So not only do we waste a huge amount of
CPU, but we just end up at the same not-quite-optimal result as before.
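That window-overwhelming effect can be shown with a toy model (this is
not git's actual object ordering, just an illustration): the delta
search compares each object against a fixed-size window of prior
candidates, so flooding the list with "have" objects both multiplies the
comparisons and crowds the genuinely useful candidates out of every
window.

```python
# Toy model of a fixed-size delta search window. Interleave a few "want"
# objects among a flood of "have" preferred bases, spaced as the sort
# might scatter them; every want's window then holds only "have" filler.

WINDOW = 10

# one "want" every 501 slots, the rest "have" preferred bases
mixed = ["want" if i % 501 == 250 else "have" for i in range(10020)]

def window_of(i):
    """Candidates the delta search would try for the object at index i."""
    return mixed[max(0, i - WINDOW):i]

want_indices = [i for i, o in enumerate(mixed) if o == "want"]
# Every slot in each want's window is a "have": wants are never
# delta-compressed against each other, so the pack stays suboptimal...
assert all(all(c == "have" for c in window_of(i)) for i in want_indices)
# ...while the total comparison work scales with the inflated list.
assert len(mixed) * WINDOW > 100000
```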

So this is a dead end, but I think it was good to double-check that.

The patch below is messy and would probably be better split into a few
patches, but I don't expect anyone to apply it (or even read it,
really). It's just for reference.

---
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 0ee5f1f..1a5d401 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -1026,7 +1026,7 @@ static int add_object_entry_from_bitmap(const unsigned char *sha1,
 	if (have_duplicate_entry(sha1, 0, &index_pos))
 		return 0;
 
-	create_object_entry(sha1, type, name_hash, 0, 0, index_pos, pack, offset);
+	create_object_entry(sha1, type, name_hash, flags, 0, index_pos, pack, offset);
 
 	display_progress(progress_state, to_pack.nr_objects);
 	return 1;
@@ -2436,6 +2436,7 @@ static int get_object_list_from_bitmap(struct rev_info *revs)
 	}
 
 	traverse_bitmap_commit_list(&add_object_entry_from_bitmap);
+	bitmap_have_foreach(&add_object_entry_from_bitmap);
 	return 0;
 }
 
diff --git a/pack-bitmap.c b/pack-bitmap.c
index 1bae7e8..f4e30f5 100644
--- a/pack-bitmap.c
+++ b/pack-bitmap.c
@@ -605,6 +605,7 @@ static void show_objects_for_type(
 	struct bitmap *objects,
 	struct ewah_bitmap *type_filter,
 	enum object_type object_type,
+	int flags,
 	show_reachable_fn show_reach)
 {
 	size_t pos = 0, i = 0;
@@ -613,9 +614,6 @@ static void show_objects_for_type(
 	struct ewah_iterator it;
 	eword_t filter;
 
-	if (bitmap_git.reuse_objects == bitmap_git.pack->num_objects)
-		return;
-
 	ewah_iterator_init(&it, type_filter);
 
 	while (i < objects->word_alloc && ewah_iterator_next(&filter, &it)) {
@@ -640,7 +638,7 @@ static void show_objects_for_type(
 			if (bitmap_git.hashes)
 				hash = ntohl(bitmap_git.hashes[entry->nr]);
 
-			show_reach(sha1, object_type, 0, hash, bitmap_git.pack, entry->offset);
+			show_reach(sha1, object_type, flags, hash, bitmap_git.pack, entry->offset);
 		}
 
 		pos += BITS_IN_WORD;
@@ -816,14 +814,17 @@ void traverse_bitmap_commit_list(show_reachable_fn show_reachable)
 {
 	assert(bitmap_git.result);
 
+	if (bitmap_git.reuse_objects == bitmap_git.pack->num_objects)
+		return;
+
 	show_objects_for_type(bitmap_git.result, bitmap_git.commits,
-		OBJ_COMMIT, show_reachable);
+		OBJ_COMMIT, 0, show_reachable);
 	show_objects_for_type(bitmap_git.result, bitmap_git.trees,
-		OBJ_TREE, show_reachable);
+		OBJ_TREE, 0, show_reachable);
 	show_objects_for_type(bitmap_git.result, bitmap_git.blobs,
-		OBJ_BLOB, show_reachable);
+		OBJ_BLOB, 0, show_reachable);
 	show_objects_for_type(bitmap_git.result, bitmap_git.tags,
-		OBJ_TAG, show_reachable);
+		OBJ_TAG, 0, show_reachable);
 
 	show_extended_objects(bitmap_git.result, show_reachable);
 
@@ -1090,3 +1091,18 @@ int bitmap_have(const unsigned char *sha1)
 
 	return bitmap_get(bitmap_git.haves, pos);
 }
+
+void bitmap_have_foreach(show_reachable_fn show_reachable)
+{
+	if (!bitmap_git.haves)
+		return;
+
+	show_objects_for_type(bitmap_git.haves, bitmap_git.commits,
+		OBJ_COMMIT, 1, show_reachable);
+	show_objects_for_type(bitmap_git.haves, bitmap_git.trees,
+		OBJ_TREE, 1, show_reachable);
+	show_objects_for_type(bitmap_git.haves, bitmap_git.blobs,
+		OBJ_BLOB, 1, show_reachable);
+	show_objects_for_type(bitmap_git.haves, bitmap_git.tags,
+		OBJ_TAG, 1, show_reachable);
+}
diff --git a/pack-bitmap.h b/pack-bitmap.h
index a63ee6b..02c08f8 100644
--- a/pack-bitmap.h
+++ b/pack-bitmap.h
@@ -50,6 +50,7 @@ int reuse_partial_packfile_from_bitmap(struct packed_git **packfile, uint32_t *e
 int rebuild_existing_bitmaps(struct packing_data *mapping, khash_sha1 *reused_bitmaps, int show_progress);
 
 int bitmap_have(const unsigned char *sha1);
+void bitmap_have_foreach(show_reachable_fn);
 
 void bitmap_writer_show_progress(int show);
 void bitmap_writer_set_checksum(unsigned char *sha1);

* Re: [PATCH/RFC 0/6] reuse deltas found by bitmaps
  2014-03-26 22:40 ` Siddharth Agarwal
@ 2014-03-27 14:09   ` Siddharth Agarwal
  0 siblings, 0 replies; 21+ messages in thread
From: Siddharth Agarwal @ 2014-03-27 14:09 UTC (permalink / raw)
  To: Jeff King, git; +Cc: Ben Maurer

On 03/26/2014 03:40 PM, Siddharth Agarwal wrote:
> On 03/26/2014 12:22 AM, Jeff King wrote:
>> [tl;dr the patch is the same as before, but there is a script to measure
>>         its effects; please try it out on your repos]

Here are the numbers from another, much larger repo:

Test                         origin                    HEAD
------------------------------------------------------------------------------------
5311.3: server   (1 days)    11.72(14.02+1.44)         6.33(5.87+0.75) -46.0%
5311.4: size     (1 days)                19.4M                   15.3M -21.3%
5311.5: client   (1 days)    6.99(8.06+1.50)           10.60(10.34+1.83) +51.6%
5311.7: server   (2 days)    39.82(40.56+3.05)         33.94(31.21+3.10) -14.8%
5311.8: size     (2 days)                26.5M                   22.8M -14.1%
5311.9: client   (2 days)    15.81(16.48+4.29)         19.20(18.20+4.19) +21.4%
5311.11: server   (4 days)   37.21(39.75+3.73)         28.01(26.97+1.75) -24.7%
5311.12: size     (4 days)               37.5M                   34.1M -9.0%
5311.13: client   (4 days)   24.24(26.43+6.76)         31.14(30.75+5.96) +28.5%
5311.15: server   (8 days)   33.74(40.47+3.39)         22.42(22.25+1.51) -33.6%
5311.16: size     (8 days)               81.9M                   78.4M -4.2%
5311.17: client   (8 days)   81.96(91.07+22.35)        88.03(89.45+17.11) +7.4%
5311.19: server  (16 days)   30.87(34.57+2.78)         27.03(25.93+2.73) -12.4%
5311.20: size    (16 days)              153.2M                  150.9M -1.5%
5311.21: client  (16 days)   169.99(183.57+46.96)      177.12(177.17+37.93) +4.2%
5311.23: server  (32 days)   51.00(75.49+4.62)         19.52(19.28+1.93) -61.7%
5311.24: size    (32 days)              279.4M                  283.0M +1.3%
5311.25: client  (32 days)   272.43(296.17+76.48)      284.75(285.98+63.19) +4.5%
5311.27: server  (64 days)   51.73(97.88+6.40)         37.32(32.63+5.05) -27.9%
5311.28: size    (64 days)                1.7G                    1.8G +5.0%
5311.29: client  (64 days)   2600.42(2751.56+718.10)   2429.06(2501.29+651.56) -6.6%
5311.31: server (128 days)   51.33(95.33+6.91)         37.73(32.98+5.09) -26.5%
5311.32: size   (128 days)                1.7G                    1.8G +5.0%
5311.33: client (128 days)   2595.68(2739.45+729.07)   2386.99(2524.54+583.07) -8.0%

* Re: [PATCH/RFC 0/6] reuse deltas found by bitmaps
  2014-03-27  1:13     ` Jeff King
@ 2014-03-27 16:36       ` Junio C Hamano
  0 siblings, 0 replies; 21+ messages in thread
From: Junio C Hamano @ 2014-03-27 16:36 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Ben Maurer, Siddharth Agarwal

Jeff King <peff@peff.net> writes:

> But for a small fetch...
>
>   5311.3: server   (1 days)    0.20(0.17+0.03)   4.39(4.03+6.59) +2095.0%
>   5311.4: size     (1 days)              57.2K             59.5K +4.1%
>   5311.5: client   (1 days)    0.08(0.08+0.00)   0.08(0.08+0.00) +0.0%

Nice ;-)

> So this is a dead end, but I think it was good to double-check that.

Thanks.

* Re: [PATCH 6/6] pack-objects: reuse deltas for thin "have" objects
  2014-03-26  7:23 ` [PATCH 6/6] pack-objects: reuse deltas for thin "have" objects Jeff King
@ 2014-03-28  4:23   ` Eric Sunshine
  0 siblings, 0 replies; 21+ messages in thread
From: Eric Sunshine @ 2014-03-28  4:23 UTC (permalink / raw)
  To: Jeff King; +Cc: Git List, Ben Maurer, Siddharth Agarwal

On Wed, Mar 26, 2014 at 3:23 AM, Jeff King <peff@peff.net> wrote:
> When we calculate the "wants" and "haves" for a pack, we
> only add the objects in the boundary commits as preferred
> bases. However, we know that every object reachable from the
> "haves" could be a preferred base.
>
> We probably don't want to add these to our preferred base
> list, because they would clog the delta-search window.
> However, there is no reason we cannot reuse an on-disk delta
> against such a deep "have" base, avoiding the delta search
> for that object altogether.
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
>  builtin/pack-objects.c | 16 +++++++++++++++-
>  1 file changed, 15 insertions(+), 1 deletion(-)
>
> diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
> index 7950c43..92bd682 100644
> --- a/builtin/pack-objects.c
> +++ b/builtin/pack-objects.c
> @@ -53,6 +53,7 @@ static unsigned long pack_size_limit;
>  static int depth = 50;
>  static int delta_search_threads;
>  static int pack_to_stdout;
> +static int thin = 0;
>  static int num_preferred_base;
>  static struct progress *progress_state;
>  static int pack_compression_level = Z_DEFAULT_COMPRESSION;
> @@ -1419,6 +1420,20 @@ static void check_object(struct object_entry *entry)
>                         base_entry->delta_child = entry;
>                         unuse_pack(&w_curs);
>                         return;
> +               } else if(thin && base_ref && bitmap_have(base_ref)) {

Missing space after 'if'.

> +                       entry->type = entry->in_pack_type;
> +                       entry->delta_size = entry->size;
> +                       /*
> +                        * XXX we'll leak this, but it's probably OK
> +                        * since we'll exit immediately after the packing
> +                        * is done
> +                        */
> +                       entry->delta = xcalloc(1, sizeof(*entry->delta));
> +                       hashcpy(entry->delta->idx.sha1, base_ref);
> +                       entry->delta->preferred_base = 1;
> +                       entry->delta->filled = 1;
> +                       unuse_pack(&w_curs);
> +                       return;
>                 }
>
>                 if (entry->type) {
> @@ -2559,7 +2574,6 @@ static int option_parse_ulong(const struct option *opt,
>  int cmd_pack_objects(int argc, const char **argv, const char *prefix)
>  {
>         int use_internal_rev_list = 0;
> -       int thin = 0;
>         int all_progress_implied = 0;
>         const char *rp_av[6];
>         int rp_ac = 0;
> --
> 1.9.1.601.g7ec968e

* Re: [PATCH 3/6] t/perf: add infrastructure for measuring sizes
  2018-08-22 13:40   ` Derrick Stolee
@ 2018-08-22 15:31     ` Jeff King
  0 siblings, 0 replies; 21+ messages in thread
From: Jeff King @ 2018-08-22 15:31 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: git

On Wed, Aug 22, 2018 at 09:40:55AM -0400, Derrick Stolee wrote:

> On 8/21/2018 3:06 PM, Jeff King wrote:
> > The main objective of scripts in the perf framework is to
> > run "test_perf", which measures the time it takes to run
> > some operation. However, it can also be interesting to see
> > the change in the output size of certain operations.
> > 
> > This patch introduces test_size, which records a single
> > numeric output from the test and shows it in the aggregated
> > output (with pretty printing and relative size comparison).
> 
> I'm interested in exploring this test_size mechanism. The other area that
> could benefit from size testing is 'git repack', but I don't have any plans
> to change our compression or delta strategies. If we _did_ look into that,
> then using test_size would be a natural fit.

Yeah, I agree it would be a good tool for showing off improvements
there.

It may also be useful for catching regressions in topics that are trying
to speed things up, but _don't_ intend to change the size. We could even
do that proactively now. I.e., something like:

  test_perf 'repack' 'git repack -adf'
  test_size 'pack size' 'wc -c <.git/objects/pack/*.pack'

just to see if it ever changes.  But I suspect its usefulness may depend
on how you are packing (e.g., is "-f" more likely to catch issues than
without?).

The new tests I added in this series cover packs created for fetches.
There's no guarantee that will overlap with the behavior of an on-disk
repack, but at least we have some generic coverage of pack-objects
output sizes now. Absent any suspicion of a regression for a particular
case, that's probably an acceptable canary in the coal mine.

-Peff

* Re: [PATCH 3/6] t/perf: add infrastructure for measuring sizes
  2018-08-21 19:06 ` [PATCH 3/6] t/perf: add infrastructure for measuring sizes Jeff King
@ 2018-08-22 13:40   ` Derrick Stolee
  2018-08-22 15:31     ` Jeff King
  0 siblings, 1 reply; 21+ messages in thread
From: Derrick Stolee @ 2018-08-22 13:40 UTC (permalink / raw)
  To: Jeff King, git

On 8/21/2018 3:06 PM, Jeff King wrote:
> The main objective of scripts in the perf framework is to
> run "test_perf", which measures the time it takes to run
> some operation. However, it can also be interesting to see
> the change in the output size of certain operations.
>
> This patch introduces test_size, which records a single
> numeric output from the test and shows it in the aggregated
> output (with pretty printing and relative size comparison).

I'm interested in exploring this test_size mechanism. The other area 
that could benefit from size testing is 'git repack', but I don't have 
any plans to change our compression or delta strategies. If we _did_ 
look into that, then using test_size would be a natural fit.

Overall, I really like this series!

Thanks
-Stolee


* [PATCH 3/6] t/perf: add infrastructure for measuring sizes
  2018-08-21 18:41 [PATCH] test-tool.h: include git-compat-util.h Jeff King
@ 2018-08-21 19:06 ` Jeff King
  2018-08-22 13:40   ` Derrick Stolee
  0 siblings, 1 reply; 21+ messages in thread
From: Jeff King @ 2018-08-21 19:06 UTC (permalink / raw)
  To: git

The main objective of scripts in the perf framework is to
run "test_perf", which measures the time it takes to run
some operation. However, it can also be interesting to see
the change in the output size of certain operations.

This patch introduces test_size, which records a single
numeric output from the test and shows it in the aggregated
output (with pretty printing and relative size comparison).

Signed-off-by: Jeff King <peff@peff.net>
---
 t/perf/README         | 25 ++++++++++++++++++++++
 t/perf/aggregate.perl | 48 ++++++++++++++++++++++++++++++++++++++-----
 t/perf/perf-lib.sh    | 13 ++++++++++++
 3 files changed, 81 insertions(+), 5 deletions(-)

diff --git a/t/perf/README b/t/perf/README
index 21321a0f36..be12090c38 100644
--- a/t/perf/README
+++ b/t/perf/README
@@ -168,3 +168,28 @@ that
   While we have tried to make sure that it can cope with embedded
   whitespace and other special characters, it will not work with
   multi-line data.
+
+Rather than tracking the performance by run-time as `test_perf` does, you
+may also track output size by using `test_size`. The stdout of the
+function should be a single numeric value, which will be captured and
+shown in the aggregated output. For example:
+
+	test_perf 'time foo' '
+		./foo >foo.out
+	'
+
> +	test_size 'output size' '
+		wc -c <foo.out
+	'
+
+might produce output like:
+
+	Test                origin           HEAD
+	-------------------------------------------------------------
+	1234.1 time foo     0.37(0.79+0.02)  0.26(0.51+0.02) -29.7%
+	1234.2 output size             4.3M             3.6M -14.7%
+
+The item being measured (and its units) is up to the test; the context
+and the test title should make it clear to the user whether bigger or
+smaller numbers are better. Unlike test_perf, the test code will only be
+run once, since output sizes tend to be more deterministic than timings.
diff --git a/t/perf/aggregate.perl b/t/perf/aggregate.perl
index 3181b087ab..494907a892 100755
--- a/t/perf/aggregate.perl
+++ b/t/perf/aggregate.perl
@@ -13,10 +13,16 @@ sub get_times {
 	my $line = <$fh>;
 	return undef if not defined $line;
 	close $fh or die "cannot close $name: $!";
-	$line =~ /^(?:(\d+):)?(\d+):(\d+(?:\.\d+)?) (\d+(?:\.\d+)?) (\d+(?:\.\d+)?)$/
-		or die "bad input line: $line";
-	my $rt = ((defined $1 ? $1 : 0.0)*60+$2)*60+$3;
-	return ($rt, $4, $5);
+	# times
+	if ($line =~ /^(?:(\d+):)?(\d+):(\d+(?:\.\d+)?) (\d+(?:\.\d+)?) (\d+(?:\.\d+)?)$/) {
+		my $rt = ((defined $1 ? $1 : 0.0)*60+$2)*60+$3;
+		return ($rt, $4, $5);
+	# size
+	} elsif ($line =~ /^\d+$/) {
+		return $&;
+	} else {
+		die "bad input line: $line";
+	}
 }
 
 sub relative_change {
@@ -32,9 +38,15 @@ sub relative_change {
 
 sub format_times {
 	my ($r, $u, $s, $firstr) = @_;
+	# no value means we did not finish the test
 	if (!defined $r) {
 		return "<missing>";
 	}
+	# a single value means we have a size, not times
+	if (!defined $u) {
+		return format_size($r, $firstr);
+	}
+	# otherwise, we have real/user/system times
 	my $out = sprintf "%.2f(%.2f+%.2f)", $r, $u, $s;
 	$out .= ' ' . relative_change($r, $firstr) if defined $firstr;
 	return $out;
@@ -54,6 +66,25 @@ sub usage {
 	exit(1);
 }
 
+sub human_size {
+	my $n = shift;
+	my @units = ('', qw(K M G));
+	while ($n > 900 && @units > 1) {
+		$n /= 1000;
+		shift @units;
+	}
+	return $n unless length $units[0];
+	return sprintf '%.1f%s', $n, $units[0];
+}
+
+sub format_size {
+	my ($size, $first) = @_;
+	# match the width of a time: 0.00(0.00+0.00)
+	my $out = sprintf '%15s', human_size($size);
+	$out .= ' ' . relative_change($size, $first) if defined $first;
+	return $out;
+}
+
 my (@dirs, %dirnames, %dirabbrevs, %prefixes, @tests,
     $codespeed, $sortby, $subsection, $reponame);
 
@@ -184,7 +215,14 @@ sub print_default_results {
 		my $firstr;
 		for my $i (0..$#dirs) {
 			my $d = $dirs[$i];
-			$times{$prefixes{$d}.$t} = [get_times("$resultsdir/$prefixes{$d}$t.times")];
+			my $base = "$resultsdir/$prefixes{$d}$t";
+			$times{$prefixes{$d}.$t} = [];
+			foreach my $type (qw(times size)) {
+				if (-e "$base.$type") {
+					$times{$prefixes{$d}.$t} = [get_times("$base.$type")];
+					last;
+				}
+			}
 			my ($r,$u,$s) = @{$times{$prefixes{$d}.$t}};
 			my $w = length format_times($r,$u,$s,$firstr);
 			$colwidth[$i] = $w if $w > $colwidth[$i];
diff --git a/t/perf/perf-lib.sh b/t/perf/perf-lib.sh
index a54be09516..11d1922cf5 100644
--- a/t/perf/perf-lib.sh
+++ b/t/perf/perf-lib.sh
@@ -231,6 +231,19 @@ test_perf () {
 	test_wrapper_ test_perf_ "$@"
 }
 
+test_size_ () {
+	say >&3 "running: $2"
+	if test_eval_ "$2" 3>"$base".size; then
+		test_ok_ "$1"
+	else
+		test_failure_ "$@"
+	fi
+}
+
+test_size () {
+	test_wrapper_ test_size_ "$@"
+}
+
 # We extend test_done to print timings at the end (./run disables this
 # and does it after running everything)
 test_at_end_hook_ () {
-- 
2.19.0.rc0.398.g138a08f6f6


^ permalink raw reply related	[flat|nested] 21+ messages in thread
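[Editor's aside, not part of the patch: the test_size contract shown in the perf-lib.sh hunk above can be sketched standalone. Inside perf-lib the test body's stdout travels over fd 3 into "$base.size"; this minimal sketch, assuming only a POSIX shell and wc, just writes the single numeric value to a local demo.size file the way the real helper would.]

```shell
#!/bin/sh
# Editor's sketch (not part of the patch): emulate what test_size_
# captures. In perf-lib the test body's stdout is redirected via fd 3
# into "$base.size"; here we write the number to demo.size directly.

printf 'hello world' >foo.out

# A test_size body must emit exactly one numeric value on stdout.
# The $((...)) arithmetic expansion strips the whitespace padding
# that some wc implementations prepend.
size=$(($(wc -c <foo.out)))
echo "$size" >demo.size

cat demo.size
```

aggregate.perl then distinguishes such a bare-integer line (a size) from the h:m:s times format, as the get_times hunk above shows.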

* [PATCH 3/6] t/perf: add infrastructure for measuring sizes
  2018-08-17 20:54 [PATCH 0/6] reuse on-disk deltas for fetches with bitmaps Jeff King
@ 2018-08-17 20:56 ` Jeff King
  0 siblings, 0 replies; 21+ messages in thread
From: Jeff King @ 2018-08-17 20:56 UTC (permalink / raw)
  To: git

The main objective of scripts in the perf framework is to
run "test_perf", which measures the time it takes to run
some operation. However, it can also be interesting to see
the change in the output size of certain operations.

This patch introduces test_size, which records a single
numeric output from the test and shows it in the aggregated
output (with pretty printing and relative size comparison).

Signed-off-by: Jeff King <peff@peff.net>
---
 t/perf/README         | 25 ++++++++++++++++++++++
 t/perf/aggregate.perl | 48 ++++++++++++++++++++++++++++++++++++++-----
 t/perf/perf-lib.sh    | 13 ++++++++++++
 3 files changed, 81 insertions(+), 5 deletions(-)

diff --git a/t/perf/README b/t/perf/README
index 21321a0f36..be12090c38 100644
--- a/t/perf/README
+++ b/t/perf/README
@@ -168,3 +168,28 @@ that
   While we have tried to make sure that it can cope with embedded
   whitespace and other special characters, it will not work with
   multi-line data.
+
+Rather than tracking the performance by run-time as `test_perf` does, you
+may also track output size by using `test_size`. The stdout of the
+function should be a single numeric value, which will be captured and
+shown in the aggregated output. For example:
+
+	test_perf 'time foo' '
+		./foo >foo.out
+	'
+
+	test_size 'output size' '
+		wc -c <foo.out
+	'
+
+might produce output like:
+
+	Test                origin           HEAD
+	-------------------------------------------------------------
+	1234.1 time foo     0.37(0.79+0.02)  0.26(0.51+0.02) -29.7%
+	1234.2 output size             4.3M             3.6M -14.7%
+
+The item being measured (and its units) is up to the test; the context
+and the test title should make it clear to the user whether bigger or
+smaller numbers are better. Unlike test_perf, the test code will only be
+run once, since output sizes tend to be more deterministic than timings.
diff --git a/t/perf/aggregate.perl b/t/perf/aggregate.perl
index 3181b087ab..494907a892 100755
--- a/t/perf/aggregate.perl
+++ b/t/perf/aggregate.perl
@@ -13,10 +13,16 @@ sub get_times {
 	my $line = <$fh>;
 	return undef if not defined $line;
 	close $fh or die "cannot close $name: $!";
-	$line =~ /^(?:(\d+):)?(\d+):(\d+(?:\.\d+)?) (\d+(?:\.\d+)?) (\d+(?:\.\d+)?)$/
-		or die "bad input line: $line";
-	my $rt = ((defined $1 ? $1 : 0.0)*60+$2)*60+$3;
-	return ($rt, $4, $5);
+	# times
+	if ($line =~ /^(?:(\d+):)?(\d+):(\d+(?:\.\d+)?) (\d+(?:\.\d+)?) (\d+(?:\.\d+)?)$/) {
+		my $rt = ((defined $1 ? $1 : 0.0)*60+$2)*60+$3;
+		return ($rt, $4, $5);
+	# size
+	} elsif ($line =~ /^\d+$/) {
+		return $&;
+	} else {
+		die "bad input line: $line";
+	}
 }
 
 sub relative_change {
@@ -32,9 +38,15 @@ sub relative_change {
 
 sub format_times {
 	my ($r, $u, $s, $firstr) = @_;
+	# no value means we did not finish the test
 	if (!defined $r) {
 		return "<missing>";
 	}
+	# a single value means we have a size, not times
+	if (!defined $u) {
+		return format_size($r, $firstr);
+	}
+	# otherwise, we have real/user/system times
 	my $out = sprintf "%.2f(%.2f+%.2f)", $r, $u, $s;
 	$out .= ' ' . relative_change($r, $firstr) if defined $firstr;
 	return $out;
@@ -54,6 +66,25 @@ sub usage {
 	exit(1);
 }
 
+sub human_size {
+	my $n = shift;
+	my @units = ('', qw(K M G));
+	while ($n > 900 && @units > 1) {
+		$n /= 1000;
+		shift @units;
+	}
+	return $n unless length $units[0];
+	return sprintf '%.1f%s', $n, $units[0];
+}
+
+sub format_size {
+	my ($size, $first) = @_;
+	# match the width of a time: 0.00(0.00+0.00)
+	my $out = sprintf '%15s', human_size($size);
+	$out .= ' ' . relative_change($size, $first) if defined $first;
+	return $out;
+}
+
 my (@dirs, %dirnames, %dirabbrevs, %prefixes, @tests,
     $codespeed, $sortby, $subsection, $reponame);
 
@@ -184,7 +215,14 @@ sub print_default_results {
 		my $firstr;
 		for my $i (0..$#dirs) {
 			my $d = $dirs[$i];
-			$times{$prefixes{$d}.$t} = [get_times("$resultsdir/$prefixes{$d}$t.times")];
+			my $base = "$resultsdir/$prefixes{$d}$t";
+			$times{$prefixes{$d}.$t} = [];
+			foreach my $type (qw(times size)) {
+				if (-e "$base.$type") {
+					$times{$prefixes{$d}.$t} = [get_times("$base.$type")];
+					last;
+				}
+			}
 			my ($r,$u,$s) = @{$times{$prefixes{$d}.$t}};
 			my $w = length format_times($r,$u,$s,$firstr);
 			$colwidth[$i] = $w if $w > $colwidth[$i];
diff --git a/t/perf/perf-lib.sh b/t/perf/perf-lib.sh
index a54be09516..11d1922cf5 100644
--- a/t/perf/perf-lib.sh
+++ b/t/perf/perf-lib.sh
@@ -231,6 +231,19 @@ test_perf () {
 	test_wrapper_ test_perf_ "$@"
 }
 
+test_size_ () {
+	say >&3 "running: $2"
+	if test_eval_ "$2" 3>"$base".size; then
+		test_ok_ "$1"
+	else
+		test_failure_ "$@"
+	fi
+}
+
+test_size () {
+	test_wrapper_ test_size_ "$@"
+}
+
 # We extend test_done to print timings at the end (./run disables this
 # and does it after running everything)
 test_at_end_hook_ () {
-- 
2.18.0.1205.g3878b1e64a


^ permalink raw reply related	[flat|nested] 21+ messages in thread
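[Editor's aside, not part of the patch: for readers skimming the Perl, the scaling logic of human_size in the aggregate.perl hunk above can be read as this rough Python translation. Names mirror the Perl; nothing here ships with the patch.]

```python
def human_size(n):
    """Mirror of human_size from t/perf/aggregate.perl: divide the raw
    count by 1000 until it fits, attaching the matching K/M/G suffix."""
    units = ['', 'K', 'M', 'G']
    while n > 900 and len(units) > 1:
        n /= 1000
        units.pop(0)
    if not units[0]:
        # small values pass through unscaled, with no suffix
        return str(n)
    return '%.1f%s' % (n, units[0])

print(human_size(512))        # prints "512"
print(human_size(4_300_000))  # prints "4.3M", matching the sample output
```

The scaling by powers of 1000 with K/M/G suffixes is what produces the "4.3M" and "3.6M" figures in the README's sample table above.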

end of thread, other threads:[~2018-08-22 15:31 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-03-26  7:22 [PATCH/RFC 0/6] reuse deltas found by bitmaps Jeff King
2014-03-26  7:22 ` [PATCH 1/6] t/perf-lib: factor boilerplate out of test_perf Jeff King
2014-03-26 22:34   ` Junio C Hamano
2014-03-26  7:22 ` [PATCH 2/6] t/perf/aggregate: factor our percent calculations Jeff King
2014-03-26  7:22 ` [PATCH 3/6] t/perf: add infrastructure for measuring sizes Jeff King
2014-03-26  7:22 ` [PATCH 4/6] t/perf: add perf tests for fetches from a bitmapped server Jeff King
2014-03-26  7:23 ` [PATCH 5/6] pack-bitmap: save "have" bitmap from walk Jeff King
2014-03-26  7:23 ` [PATCH 6/6] pack-objects: reuse deltas for thin "have" objects Jeff King
2014-03-28  4:23   ` Eric Sunshine
2014-03-26 17:33 ` [PATCH/RFC 0/6] reuse deltas found by bitmaps Junio C Hamano
2014-03-26 18:13   ` Jeff King
2014-03-26 22:31     ` Junio C Hamano
2014-03-26 22:36       ` Jeff King
2014-03-27  1:13     ` Jeff King
2014-03-27 16:36       ` Junio C Hamano
2014-03-26 22:40 ` Siddharth Agarwal
2014-03-27 14:09   ` Siddharth Agarwal
2018-08-17 20:54 [PATCH 0/6] reuse on-disk deltas for fetches with bitmaps Jeff King
2018-08-17 20:56 ` [PATCH 3/6] t/perf: add infrastructure for measuring sizes Jeff King
2018-08-21 18:41 [PATCH] test-tool.h: include git-compat-util.h Jeff King
2018-08-21 19:06 ` [PATCH 3/6] t/perf: add infrastructure for measuring sizes Jeff King
2018-08-22 13:40   ` Derrick Stolee
2018-08-22 15:31     ` Jeff King
