All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/8] Resumable clone revisited, proof of concept
@ 2016-02-05  8:57 Nguyễn Thái Ngọc Duy
  2016-02-05  8:57 ` [PATCH 1/8] pack-objects: add --skip and --skip-hash Nguyễn Thái Ngọc Duy
                   ` (7 more replies)
  0 siblings, 8 replies; 22+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2016-02-05  8:57 UTC (permalink / raw)
  To: git; +Cc: Nguyễn Thái Ngọc Duy

I was reminded by LWN [1] about this. Annoyed in fact because it's
called a bug while it looks more like an elephant. So let's see if we
can agree this is a good way to go.

This is the stable pack approach. The client initiates "resumable
mode" and forces the server to generate a resumable, even though less
efficient, pack. Compared to the "download from a cached bundle first"
approach, this requires no extra setup for caching. But it wastes more
cpu.

The UI is not there yet, and you can only resume a fetch, not a clone.
What remains to do are:

 - Add UI to "git fetch" to retry
 - Make "git clone" use "git fetch --resume" to get the pack

[1] Quote from https://lwn.net/Articles/674392/ "she noted that Git
    still does not support interrupting and resuming download operations,
    which is an important bug to fix."

Nguyễn Thái Ngọc Duy (8):
  pack-objects: add --skip and --skip-hash
  pack-objects: produce a stable pack when --skip is given
  index-pack: add --append-pack=<path>
  upload-pack: new capability to pass --skip* to pack-objects
  fetch-pack.c: send "skip" line to pack-objects
  fetch: add --resume-pack=<path>
  index-pack: --append-pack implies --strict
  one ugly test to verify basic functionality

 Documentation/technical/pack-protocol.txt         |  2 +
 Documentation/technical/protocol-capabilities.txt |  9 ++++
 builtin/fetch-pack.c                              |  4 ++
 builtin/fetch.c                                   |  5 +++
 builtin/index-pack.c                              | 52 ++++++++++++++++++++++-
 builtin/pack-objects.c                            | 26 ++++++++++++
 csum-file.c                                       | 52 +++++++++++++++++++++--
 csum-file.h                                       |  3 ++
 fetch-pack.c                                      | 40 +++++++++++++++++
 fetch-pack.h                                      |  3 ++
 remote-curl.c                                     |  8 +++-
 t/t5544-fetch-resume.sh (new +x)                  | 42 ++++++++++++++++++
 transport.c                                       |  4 ++
 transport.h                                       |  4 ++
 upload-pack.c                                     | 30 ++++++++++++-
 15 files changed, 275 insertions(+), 9 deletions(-)
 create mode 100755 t/t5544-fetch-resume.sh

-- 
2.7.0.377.g4cd97dd

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 1/8] pack-objects: add --skip and --skip-hash
  2016-02-05  8:57 [PATCH 0/8] Resumable clone revisited, proof of concept Nguyễn Thái Ngọc Duy
@ 2016-02-05  8:57 ` Nguyễn Thái Ngọc Duy
  2016-02-05  9:20   ` Johannes Schindelin
  2016-02-05  8:57 ` [PATCH 2/8] pack-objects: produce a stable pack when --skip is given Nguyễn Thái Ngọc Duy
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 22+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2016-02-05  8:57 UTC (permalink / raw)
  To: git; +Cc: Nguyễn Thái Ngọc Duy

The idea is, a pack is requested the first time with --skip=0. If pack
transfer is interrupted, the client will ask for the same pack again,
but this time it asks the server not to send what it already has. The
client hashes what it has and sends the SHA-1 to the server. If the
server finds out the skipped part does not match, it can abort early.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 builtin/pack-objects.c | 21 ++++++++++++++++++++
 csum-file.c            | 52 ++++++++++++++++++++++++++++++++++++++++++++++----
 csum-file.h            |  3 +++
 3 files changed, 72 insertions(+), 4 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 4dae5b1..417c830 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -75,6 +75,9 @@ static unsigned long cache_max_small_delta_size = 1000;
 
 static unsigned long window_memory_limit = 0;
 
+static struct object_id skip_hash;
+static int skip_opt = -1;
+
 /*
  * stats
  */
@@ -781,6 +784,11 @@ static void write_pack_file(void)
 		else
 			f = create_tmp_packfile(&pack_tmp_name);
 
+		if (skip_opt > 0) {
+			f->skip = skip_opt;
+			hashcpy(f->skip_hash, skip_hash.hash);
+		}
+
 		offset = write_pack_header(f, nr_remaining);
 
 		if (reuse_packfile) {
@@ -2597,6 +2605,7 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 	struct argv_array rp = ARGV_ARRAY_INIT;
 	int rev_list_unpacked = 0, rev_list_all = 0, rev_list_reflog = 0;
 	int rev_list_index = 0;
+	const char *skip_hash_hex = NULL;
 	struct option pack_objects_options[] = {
 		OPT_SET_INT('q', "quiet", &progress,
 			    N_("do not show progress meter"), 0),
@@ -2669,6 +2678,10 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 			 N_("use a bitmap index if available to speed up counting objects")),
 		OPT_BOOL(0, "write-bitmap-index", &write_bitmap_index,
 			 N_("write a bitmap index together with the pack index")),
+		OPT_STRING(0, "skip-hash", &skip_hash_hex, "hash",
+			   N_("hash of the skipped part of the pack")),
+		OPT_INTEGER(0, "skip", &skip_opt,
+			   N_("do not produce the first <n> bytes of the pack")),
 		OPT_END(),
 	};
 
@@ -2689,6 +2702,14 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 	}
 	if (pack_to_stdout != !base_name || argc)
 		usage_with_options(pack_usage, pack_objects_options);
+	if (skip_opt >= 0) {
+		if (skip_opt > 0) {
+			if (!skip_hash_hex)
+				die(_("--skip-hash is required if --skip is non-zero"));
+			if (get_oid_hex(skip_hash_hex, &skip_hash))
+				die(_("%s is not SHA-1"), skip_hash_hex);
+		}
+	}
 
 	argv_array_push(&rp, "pack-objects");
 	if (thin) {
diff --git a/csum-file.c b/csum-file.c
index a172199..284847f 100644
--- a/csum-file.c
+++ b/csum-file.c
@@ -11,8 +11,27 @@
 #include "progress.h"
 #include "csum-file.h"
 
+static void check_skip_hash(struct sha1file *f, off_t old_total)
+{
+	if (old_total < f->skip && f->total > f->skip)
+		die("BUG: flush() must stop at skip boundary");
+
+	if (f->total == f->skip) {
+		git_SHA_CTX ctx;
+		unsigned char hash[20];
+
+		ctx = f->ctx;
+		git_SHA1_Final(hash, &ctx);
+		if (hashcmp(hash, f->skip_hash))
+			die("skip hash does not match, expected %s got %s",
+			    sha1_to_hex(f->skip_hash), sha1_to_hex(hash));
+	}
+}
+
 static void flush(struct sha1file *f, const void *buf, unsigned int count)
 {
+	off_t old_total = f->total;
+
 	if (0 <= f->check_fd && count)  {
 		unsigned char check_buffer[8192];
 		ssize_t ret = read_in_full(f->check_fd, check_buffer, count);
@@ -26,7 +45,11 @@ static void flush(struct sha1file *f, const void *buf, unsigned int count)
 	}
 
 	for (;;) {
-		int ret = xwrite(f->fd, buf, count);
+		int ret;
+		if (f->total + count <= f->skip)
+			ret = count;
+		else
+			ret = xwrite(f->fd, buf, count);
 		if (ret > 0) {
 			f->total += ret;
 			display_throughput(f->tp, f->total);
@@ -34,6 +57,8 @@ static void flush(struct sha1file *f, const void *buf, unsigned int count)
 			count -= ret;
 			if (count)
 				continue;
+			if (f->skip > 0)
+				check_skip_hash(f, old_total);
 			return;
 		}
 		if (!ret)
@@ -45,12 +70,25 @@ static void flush(struct sha1file *f, const void *buf, unsigned int count)
 void sha1flush(struct sha1file *f)
 {
 	unsigned offset = f->offset;
+	const unsigned char *buffer = f->buffer;
+
+	if (!offset)
+		return;
+
+	if (f->total < f->skip && f->skip < f->total + offset) {
+		unsigned size = f->skip - f->total;
+		git_SHA1_Update(&f->ctx, buffer, size);
+		flush(f, buffer, size);
+		buffer += size;
+		offset -= size;
+	}
 
 	if (offset) {
-		git_SHA1_Update(&f->ctx, f->buffer, offset);
-		flush(f, f->buffer, offset);
-		f->offset = 0;
+		git_SHA1_Update(&f->ctx, buffer, offset);
+		flush(f, buffer, offset);
 	}
+
+	f->offset = 0;
 }
 
 int sha1close(struct sha1file *f, unsigned char *result, unsigned int flags)
@@ -62,6 +100,8 @@ int sha1close(struct sha1file *f, unsigned char *result, unsigned int flags)
 	if (result)
 		hashcpy(result, f->buffer);
 	if (flags & (CSUM_CLOSE | CSUM_FSYNC)) {
+		if (f->skip > f->total)
+			die(_("can't skip in the middle of, or beyond the trailing SHA-1"));
 		/* write checksum and close fd */
 		flush(f, f->buffer, 20);
 		if (flags & CSUM_FSYNC)
@@ -94,6 +134,9 @@ void sha1write(struct sha1file *f, const void *buf, unsigned int count)
 		unsigned nr = count > left ? left : count;
 		const void *data;
 
+		if (f->total < f->skip && f->skip - f->total < nr)
+			nr = f->skip - f->total;
+
 		if (f->do_crc)
 			f->crc32 = crc32(f->crc32, buf, nr);
 
@@ -149,6 +192,7 @@ struct sha1file *sha1fd_throughput(int fd, const char *name, struct progress *tp
 	f->tp = tp;
 	f->name = name;
 	f->do_crc = 0;
+	f->skip = 0;
 	git_SHA1_Init(&f->ctx);
 	return f;
 }
diff --git a/csum-file.h b/csum-file.h
index 7530927..5a9475f 100644
--- a/csum-file.h
+++ b/csum-file.h
@@ -15,6 +15,9 @@ struct sha1file {
 	int do_crc;
 	uint32_t crc32;
 	unsigned char buffer[8192];
+
+	off_t skip;
+	unsigned char skip_hash[20];
 };
 
 /* Checkpoint */
-- 
2.7.0.377.g4cd97dd

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 2/8] pack-objects: produce a stable pack when --skip is given
  2016-02-05  8:57 [PATCH 0/8] Resumable clone revisited, proof of concept Nguyễn Thái Ngọc Duy
  2016-02-05  8:57 ` [PATCH 1/8] pack-objects: add --skip and --skip-hash Nguyễn Thái Ngọc Duy
@ 2016-02-05  8:57 ` Nguyễn Thái Ngọc Duy
  2016-02-05 18:43   ` Junio C Hamano
  2016-02-05  8:57 ` [PATCH 3/8] index-pack: add --append-pack=<path> Nguyễn Thái Ngọc Duy
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 22+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2016-02-05  8:57 UTC (permalink / raw)
  To: git; +Cc: Nguyễn Thái Ngọc Duy

Parallel delta search does not produce a stable pack so it's disabled
when --skip is used. zlib compression algorithm is stable. Ref
negotiation should be stable (at least on smart http). Ref changes
will be addressed separately.

So unless configuration files or git binary is changed, we should be
able to produce a stable pack. And if config or binaries are changed,
it's ok to abort and fetch a new (resumable) pack again.

Alternatively (and not implemented), pack-objects can be taught to
save the output pack when --skip=0 is passed in, then somehow find the
cached version and send the remaining when --skip=<non-zero> is
given. This saves processing power at the cost of disk and cache
management.

The key thing for caching though, is "find the cached version". We do
not provide any way for pack-objects to send a "key", like http
cookies, to the client, so that the client can pass it back on the
next fetch. Regardless, the input from client must be identical
between fetches, the server should be able to extract a signature from
that and use it to associate with a cached pack. So no need for
sending keys from the server side.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 builtin/pack-objects.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 417c830..c58a9cb 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -2709,6 +2709,11 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 			if (get_oid_hex(skip_hash_hex, &skip_hash))
 				die(_("%s is not SHA-1"), skip_hash_hex);
 		}
+
+		/*
+		 * Parallel delta search can't produce stable packs.
+		 */
+		delta_search_threads = 1;
 	}
 
 	argv_array_push(&rp, "pack-objects");
-- 
2.7.0.377.g4cd97dd

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 3/8] index-pack: add --append-pack=<path>
  2016-02-05  8:57 [PATCH 0/8] Resumable clone revisited, proof of concept Nguyễn Thái Ngọc Duy
  2016-02-05  8:57 ` [PATCH 1/8] pack-objects: add --skip and --skip-hash Nguyễn Thái Ngọc Duy
  2016-02-05  8:57 ` [PATCH 2/8] pack-objects: produce a stable pack when --skip is given Nguyễn Thái Ngọc Duy
@ 2016-02-05  8:57 ` Nguyễn Thái Ngọc Duy
  2016-02-05  8:57 ` [PATCH 4/8] upload-pack: new capability to pass --skip* to pack-objects Nguyễn Thái Ngọc Duy
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 22+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2016-02-05  8:57 UTC (permalink / raw)
  To: git; +Cc: Nguyễn Thái Ngọc Duy

In this mode (--append-pack --stdin), index-pack consumes current
content in <path> first without producing anything else, then
continues to read from stdin and append to <path>. This is the
consumer end of "pack-objects --skip".

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 builtin/index-pack.c | 50 ++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 48 insertions(+), 2 deletions(-)

diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 6a01509..f099ac2 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -73,6 +73,7 @@ static int nr_resolved_deltas;
 static int nr_threads;
 
 static int from_stdin;
+static int skip_mode;
 static int strict;
 static int do_fsck_object;
 static struct fsck_options fsck_options = FSCK_OPTIONS_STRICT;
@@ -272,7 +273,29 @@ static void *fill(int min)
 	do {
 		ssize_t ret = xread(input_fd, input_buffer + input_len,
 				sizeof(input_buffer) - input_len);
-		if (ret <= 0) {
+
+		if (ret == 0 && skip_mode) {
+			output_fd = input_fd;
+			input_fd = 0;
+			skip_mode = 0;
+			/*
+			 * At this point we have 'input_len' bytes in
+			 * input_buffer, which is from the existing pack.
+			 * Assuming that we still need to read more, the
+			 * next loop will read from stdin instead.
+			 *
+			 * What's read from stdin must be written down. The
+			 * next flush() will write from &input_buffer[0],
+			 * which appends the 'input_len' bytes from existing
+			 * pack to the pack again.
+			 *
+			 * Seek back to make this an overwrite (of same
+			 * content) instead of an append.
+			 */
+			if (input_len && lseek(output_fd, -input_len, SEEK_CUR))
+				die_errno(_("cannot seek back %d bytes"), input_len);
+		}
+		else if (ret <= 0) {
 			if (!ret)
 				die(_("early EOF"));
 			die_errno(_("read error on input"));
@@ -323,6 +346,26 @@ static const char *open_pack_file(const char *pack_name)
 	return pack_name;
 }
 
+static const char *open_pack_for_append(const char *path)
+{
+	if (!from_stdin)
+		die(_("--append-pack must be used with --stdin"));
+
+	input_fd = open(path, O_CREAT|O_RDWR, 0600);
+	if (input_fd < 0)
+		die_errno(_("cannot open packfile '%s'"), path);
+	output_fd = -1;
+	nothread_data.pack_fd = input_fd;
+	git_SHA1_Init(&input_ctx);
+
+	/*
+	 * fill() will eventually turn this flag off and set output_fd
+	 * after reading everything
+	 */
+	skip_mode = 1;
+	return xstrdup(path);
+}
+
 static void parse_pack_header(void)
 {
 	struct pack_header *hdr = fill(sizeof(struct pack_header));
@@ -1692,6 +1735,8 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix)
 					opts.off32_limit = strtoul(c+1, &c, 0);
 				if (*c || opts.off32_limit & 0x80000000)
 					die(_("bad %s"), arg);
+			} else if (skip_prefix(arg, "--append-pack=", &arg)) {
+				curr_pack = open_pack_for_append(arg);
 			} else
 				usage(index_pack_usage);
 			continue;
@@ -1742,7 +1787,8 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix)
 	}
 #endif
 
-	curr_pack = open_pack_file(pack_name);
+	if (!curr_pack)
+		curr_pack = open_pack_file(pack_name);
 	parse_pack_header();
 	objects = xcalloc(nr_objects + 1, sizeof(struct object_entry));
 	if (show_stat)
-- 
2.7.0.377.g4cd97dd

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 4/8] upload-pack: new capability to pass --skip* to pack-objects
  2016-02-05  8:57 [PATCH 0/8] Resumable clone revisited, proof of concept Nguyễn Thái Ngọc Duy
                   ` (2 preceding siblings ...)
  2016-02-05  8:57 ` [PATCH 3/8] index-pack: add --append-pack=<path> Nguyễn Thái Ngọc Duy
@ 2016-02-05  8:57 ` Nguyễn Thái Ngọc Duy
  2016-02-05  9:18   ` Johannes Schindelin
  2016-02-05  8:57 ` [PATCH 5/8] fetch-pack.c: send "skip" line " Nguyễn Thái Ngọc Duy
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 22+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2016-02-05  8:57 UTC (permalink / raw)
  To: git; +Cc: Nguyễn Thái Ngọc Duy

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 Documentation/technical/pack-protocol.txt         |  2 ++
 Documentation/technical/protocol-capabilities.txt |  9 +++++++
 upload-pack.c                                     | 30 +++++++++++++++++++++--
 3 files changed, 39 insertions(+), 2 deletions(-)

diff --git a/Documentation/technical/pack-protocol.txt b/Documentation/technical/pack-protocol.txt
index c6977bb..5207d08 100644
--- a/Documentation/technical/pack-protocol.txt
+++ b/Documentation/technical/pack-protocol.txt
@@ -167,6 +167,7 @@ MUST peel the ref if it's an annotated tag.
 ----
   advertised-refs  =  (no-refs / list-of-refs)
 		      *shallow
+		      skip
 		      flush-pkt
 
   no-refs          =  PKT-LINE(zero-id SP "capabilities^{}"
@@ -181,6 +182,7 @@ MUST peel the ref if it's an annotated tag.
   other-peeled     =  obj-id SP refname "^{}"
 
   shallow          =  PKT-LINE("shallow" SP obj-id)
+  skip             =  PKT-LINE("skip" SP obj-id SP 1*DIGIT)
 
   capability-list  =  capability *(SP capability)
   capability       =  1*(LC_ALPHA / DIGIT / "-" / "_")
diff --git a/Documentation/technical/protocol-capabilities.txt b/Documentation/technical/protocol-capabilities.txt
index eaab6b4..0567970 100644
--- a/Documentation/technical/protocol-capabilities.txt
+++ b/Documentation/technical/protocol-capabilities.txt
@@ -275,3 +275,12 @@ to accept a signed push certificate, and asks the <nonce> to be
 included in the push certificate.  A send-pack client MUST NOT
 send a push-cert packet unless the receive-pack server advertises
 this capability.
+
+partial
+------
+
+This capability adds "skip" line to the protocol, which passes --skip
+and --skip-hash to pack-objects. When "skip" line is present, given
+the same set of input from the client (e.g. have, want and shallow
+lines, "skip" line excluded), the exact same pack must be produced.
+
diff --git a/upload-pack.c b/upload-pack.c
index b3f6653..5565afe 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -53,6 +53,9 @@ static int use_sideband;
 static int advertise_refs;
 static int stateless_rpc;
 
+static struct object_id skip_hash;
+static int skip_opt = -1;
+
 static void reset_timeout(void)
 {
 	alarm(timeout);
@@ -90,7 +93,7 @@ static void create_pack_file(void)
 		"corruption on the remote side.";
 	int buffered = -1;
 	ssize_t sz;
-	const char *argv[13];
+	const char *argv[17];
 	int i, arg = 0;
 	FILE *pipe_fd;
 
@@ -112,6 +115,14 @@ static void create_pack_file(void)
 		argv[arg++] = "--delta-base-offset";
 	if (use_include_tag)
 		argv[arg++] = "--include-tag";
+	if (skip_opt >= 0) {
+		argv[arg++] = "--skip";
+		argv[arg++] = xstrfmt("%d", skip_opt);
+		if (skip_opt > 0) {
+			argv[arg++] = "--skip-hash";
+			argv[arg++] = xstrdup(oid_to_hex(&skip_hash));
+		}
+	}
 	argv[arg++] = NULL;
 
 	pack_objects.in = -1;
@@ -550,6 +561,8 @@ static void receive_needs(void)
 		const char *features;
 		unsigned char sha1_buf[20];
 		char *line = packet_read_line(0, NULL);
+		const char *arg;
+
 		reset_timeout();
 		if (!line)
 			break;
@@ -577,6 +590,19 @@ static void receive_needs(void)
 				die("Invalid deepen: %s", line);
 			continue;
 		}
+		if (skip_prefix(line, "skip ", &arg)) {
+			char *end = NULL;
+
+			if (get_oid_hex(arg, &skip_hash))
+				die("invalid skip line: %s", line);
+			arg += 40;
+			if (*arg++ != ' ')
+				die("invalid skip line: %s", line);
+			skip_opt = strtol(arg, &end, 0);
+			if (!end || *end || skip_opt < 0)
+				die("Invalid skip line: %s", line);
+			continue;
+		}
 		if (!starts_with(line, "want ") ||
 		    get_sha1_hex(line+5, sha1_buf))
 			die("git upload-pack: protocol error, "
@@ -725,7 +751,7 @@ static int send_ref(const char *refname, const struct object_id *oid,
 {
 	static const char *capabilities = "multi_ack thin-pack side-band"
 		" side-band-64k ofs-delta shallow no-progress"
-		" include-tag multi_ack_detailed";
+		" include-tag multi_ack_detailed partial";
 	const char *refname_nons = strip_namespace(refname);
 	struct object_id peeled;
 
-- 
2.7.0.377.g4cd97dd

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 5/8] fetch-pack.c: send "skip" line to pack-objects
  2016-02-05  8:57 [PATCH 0/8] Resumable clone revisited, proof of concept Nguyễn Thái Ngọc Duy
                   ` (3 preceding siblings ...)
  2016-02-05  8:57 ` [PATCH 4/8] upload-pack: new capability to pass --skip* to pack-objects Nguyễn Thái Ngọc Duy
@ 2016-02-05  8:57 ` Nguyễn Thái Ngọc Duy
  2016-02-07  8:57   ` Eric Sunshine
  2016-02-05  8:57 ` [PATCH 6/8] fetch: add --resume-pack=<path> Nguyễn Thái Ngọc Duy
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 22+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2016-02-05  8:57 UTC (permalink / raw)
  To: git; +Cc: Nguyễn Thái Ngọc Duy

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 fetch-pack.c | 40 ++++++++++++++++++++++++++++++++++++++++
 fetch-pack.h |  3 +++
 2 files changed, 43 insertions(+)

diff --git a/fetch-pack.c b/fetch-pack.c
index 01e34b6..ffb5254 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -15,6 +15,7 @@
 #include "version.h"
 #include "prio-queue.h"
 #include "sha1-array.h"
+#include "dir.h"
 
 static int transfer_unpack_limit = -1;
 static int fetch_unpack_limit = -1;
@@ -330,6 +331,10 @@ static int find_common(struct fetch_pack_args *args,
 		write_shallow_commits(&req_buf, 1, NULL);
 	if (args->depth > 0)
 		packet_buf_write(&req_buf, "deepen %d", args->depth);
+	if (args->resume_path)
+		packet_buf_write(&req_buf, "skip %s %d",
+				 sha1_to_hex(args->skip_hash),
+				 args->skip_offset);
 	packet_buf_flush(&req_buf);
 	state_len = req_buf.len;
 
@@ -730,6 +735,9 @@ static int get_pack(struct fetch_pack_args *args,
 			argv_array_push(&cmd.args, "-v");
 		if (args->use_thin_pack)
 			argv_array_push(&cmd.args, "--fix-thin");
+		if (args->resume_path && args->skip_offset)
+			argv_array_pushf(&cmd.args, "--append-pack=%s",
+					 args->resume_path);
 		if (args->lock_pack || unpack_limit) {
 			char hostname[256];
 			if (gethostname(hostname, sizeof(hostname)))
@@ -856,6 +864,8 @@ static struct ref *do_fetch_pack(struct fetch_pack_args *args,
 			fprintf(stderr, "Server supports ofs-delta\n");
 	} else
 		prefer_ofs_delta = 0;
+	if (args->resume_path && !server_supports("partial"))
+		die(_("Server does not support resume capability\n"));
 
 	if ((agent_feature = server_feature_value("agent", &agent_len))) {
 		agent_supported = 1;
@@ -1030,6 +1040,35 @@ static void update_shallow(struct fetch_pack_args *args,
 	sha1_array_clear(&ref);
 }
 
+static void prepare_resume_fetch(struct fetch_pack_args *args)
+{
+	git_SHA_CTX c;
+	char buf[8192];
+	int fd, len;
+
+	if (!args->resume_path)
+		return;
+
+	args->skip_offset = 0;
+	args->keep_pack = 1;
+	if (!file_exists(args->resume_path))
+		return;
+
+	fd = open(args->resume_path, O_RDONLY);
+	if (fd == -1)
+		die_errno(_("failed to open '%s'"), args->resume_path);
+
+	git_SHA1_Init(&c);
+	while ((len = xread(fd, buf, sizeof(buf))) > 0) {
+		git_SHA1_Update(&c, buf, len);
+		args->skip_offset += len;
+	}
+	if (len < 0)
+		die_errno(_("failed to read '%s'"), args->resume_path);
+	git_SHA1_Final(args->skip_hash, &c);
+	close(fd);
+}
+
 struct ref *fetch_pack(struct fetch_pack_args *args,
 		       int fd[], struct child_process *conn,
 		       const struct ref *ref,
@@ -1050,6 +1089,7 @@ struct ref *fetch_pack(struct fetch_pack_args *args,
 		die("no matching remote head");
 	}
 	prepare_shallow_info(&si, shallow);
+	prepare_resume_fetch(args);
 	ref_cpy = do_fetch_pack(args, fd, ref, sought, nr_sought,
 				&si, pack_lockfile);
 	reprepare_packed_git();
diff --git a/fetch-pack.h b/fetch-pack.h
index bb7fd76..46f0997 100644
--- a/fetch-pack.h
+++ b/fetch-pack.h
@@ -10,6 +10,9 @@ struct fetch_pack_args {
 	const char *uploadpack;
 	int unpacklimit;
 	int depth;
+	const char *resume_path;
+	int skip_offset;
+	unsigned char skip_hash[20];
 	unsigned quiet:1;
 	unsigned keep_pack:1;
 	unsigned lock_pack:1;
-- 
2.7.0.377.g4cd97dd

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 6/8] fetch: add --resume-pack=<path>
  2016-02-05  8:57 [PATCH 0/8] Resumable clone revisited, proof of concept Nguyễn Thái Ngọc Duy
                   ` (4 preceding siblings ...)
  2016-02-05  8:57 ` [PATCH 5/8] fetch-pack.c: send "skip" line " Nguyễn Thái Ngọc Duy
@ 2016-02-05  8:57 ` Nguyễn Thái Ngọc Duy
  2016-02-05  8:57 ` [PATCH 7/8] index-pack: --append-pack implies --strict Nguyễn Thái Ngọc Duy
  2016-02-05  8:57 ` [PATCH 8/8] one ugly test to verify basic functionality Nguyễn Thái Ngọc Duy
  7 siblings, 0 replies; 22+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2016-02-05  8:57 UTC (permalink / raw)
  To: git; +Cc: Nguyễn Thái Ngọc Duy

This is a low-level option for resumable fetch. You start a resumable
fetch with

    git fetch --resume-pack=blah <host>

where "blah" file does not exist. If the fetch is interrupted, "blah"
will contain what's been fetched so far. Run the same command again.

On the server side, pack-objects performs the exact same operation to
produce full pack again, but it will not send what is already in
"blah". pack-objects does check if the skipped part is the same between
two sides, in case of configuration change or whatever, and abort early.

On the client side, index-pack feeds itself with what's in "blah",
then the input stream from pack-objects. index-pack does strict
verification as usual. Even if pack-objects fails to produce a stable
pack, index-pack should catch it and complain loudly. If everything
goes well, "blah" is removed and a new good pack is put in $GIT_DIR.

Improvement point. We should be able to perform some heavy operations
locally before connecting to the server (e.g. produce the skip hash
from "blah", or index-pack consuming "blah").

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 builtin/fetch-pack.c | 4 ++++
 builtin/fetch.c      | 5 +++++
 remote-curl.c        | 8 +++++++-
 transport.c          | 4 ++++
 transport.h          | 4 ++++
 5 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
index 9b2a514..996ad30 100644
--- a/builtin/fetch-pack.c
+++ b/builtin/fetch-pack.c
@@ -129,6 +129,10 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix)
 			args.update_shallow = 1;
 			continue;
 		}
+		if (skip_prefix(arg, "--resume-path=", &arg)) {
+			args.resume_path = arg;
+			continue;
+		}
 		usage(fetch_pack_usage);
 	}
 
diff --git a/builtin/fetch.c b/builtin/fetch.c
index 8e74213..34f32c6 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -48,6 +48,7 @@ static const char *recurse_submodules_default;
 static int shown_url = 0;
 static int refmap_alloc, refmap_nr;
 static const char **refmap_array;
+static const char *resume_path;
 
 static int option_parse_recurse_submodules(const struct option *opt,
 				   const char *arg, int unset)
@@ -115,6 +116,8 @@ static struct option builtin_fetch_options[] = {
 	OPT_BOOL(0, "progress", &progress, N_("force progress reporting")),
 	OPT_STRING(0, "depth", &depth, N_("depth"),
 		   N_("deepen history of shallow clone")),
+	OPT_FILENAME(0, "resume-pack", &resume_path,
+		     N_("perform resumable fetch on the given pack")),
 	{ OPTION_SET_INT, 0, "unshallow", &unshallow, NULL,
 		   N_("convert to a complete repository"),
 		   PARSE_OPT_NONEG | PARSE_OPT_NOARG, NULL, 1 },
@@ -872,6 +875,8 @@ static struct transport *prepare_transport(struct remote *remote)
 		set_option(transport, TRANS_OPT_DEPTH, depth);
 	if (update_shallow)
 		set_option(transport, TRANS_OPT_UPDATE_SHALLOW, "yes");
+	if (resume_path)
+		set_option(transport, TRANS_OPT_RESUME_PATH, resume_path);
 	return transport;
 }
 
diff --git a/remote-curl.c b/remote-curl.c
index c704857..36835fb 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -20,6 +20,7 @@ static struct strbuf url = STRBUF_INIT;
 struct options {
 	int verbosity;
 	unsigned long depth;
+	const char *resume_path;
 	unsigned progress : 1,
 		check_self_contained_and_connected : 1,
 		cloning : 1,
@@ -119,6 +120,9 @@ static int set_option(const char *name, const char *value)
 		else
 			return -1;
 		return 0;
+	} else if (!strcmp(name, "resume-path")) {
+		options.resume_path = xstrdup(value);
+		return 0;
 	} else {
 		return 1 /* unsupported */;
 	}
@@ -727,7 +731,7 @@ static int fetch_git(struct discovery *heads,
 	struct strbuf preamble = STRBUF_INIT;
 	char *depth_arg = NULL;
 	int argc = 0, i, err;
-	const char *argv[17];
+	const char *argv[18];
 
 	argv[argc++] = "fetch-pack";
 	argv[argc++] = "--stateless-rpc";
@@ -755,6 +759,8 @@ static int fetch_git(struct discovery *heads,
 		depth_arg = strbuf_detach(&buf, NULL);
 		argv[argc++] = depth_arg;
 	}
+	if (options.resume_path)
+		argv[argc++] = xstrfmt("--resume-path=%s", options.resume_path);
 	argv[argc++] = url.buf;
 	argv[argc++] = NULL;
 
diff --git a/transport.c b/transport.c
index 67f3666..6378bed 100644
--- a/transport.c
+++ b/transport.c
@@ -467,6 +467,9 @@ static int set_git_option(struct git_transport_options *opts,
 	} else if (!strcmp(name, TRANS_OPT_UPDATE_SHALLOW)) {
 		opts->update_shallow = !!value;
 		return 0;
+	} else if (!strcmp(name, TRANS_OPT_RESUME_PATH)) {
+		opts->resume_path = value;
+		return 0;
 	} else if (!strcmp(name, TRANS_OPT_DEPTH)) {
 		if (!value)
 			opts->depth = 0;
@@ -534,6 +537,7 @@ static int fetch_refs_via_pack(struct transport *transport,
 		data->options.check_self_contained_and_connected;
 	args.cloning = transport->cloning;
 	args.update_shallow = data->options.update_shallow;
+	args.resume_path = data->options.resume_path;
 
 	if (!data->got_remote_heads) {
 		connect_setup(transport, 0, 0);
diff --git a/transport.h b/transport.h
index 8ebaaf2..765e4e5 100644
--- a/transport.h
+++ b/transport.h
@@ -16,6 +16,7 @@ struct git_transport_options {
 	const char *uploadpack;
 	const char *receivepack;
 	struct push_cas_option *cas;
+	const char *resume_path;
 };
 
 struct transport {
@@ -180,6 +181,9 @@ int transport_restrict_protocols(void);
 /* Send push certificates */
 #define TRANS_OPT_PUSH_CERT "pushcert"
 
+/* Resumable fetch */
+#define TRANS_OPT_RESUME_PATH "resume-path"
+
 /**
  * Returns 0 if the option was used, non-zero otherwise. Prints a
  * message to stderr if the option is not used.
-- 
2.7.0.377.g4cd97dd

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 7/8] index-pack: --append-pack implies --strict
  2016-02-05  8:57 [PATCH 0/8] Resumable clone revisited, proof of concept Nguyễn Thái Ngọc Duy
                   ` (5 preceding siblings ...)
  2016-02-05  8:57 ` [PATCH 6/8] fetch: add --resume-pack=<path> Nguyễn Thái Ngọc Duy
@ 2016-02-05  8:57 ` Nguyễn Thái Ngọc Duy
  2016-02-07  9:02   ` Eric Sunshine
  2016-02-05  8:57 ` [PATCH 8/8] one ugly test to verify basic functionality Nguyễn Thái Ngọc Duy
  7 siblings, 1 reply; 22+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2016-02-05  8:57 UTC (permalink / raw)
  To: git; +Cc: Nguyễn Thái Ngọc Duy

A frankenstein pack, generated by multiple pack-objects runs,
certainly has higher risk of broken, especially when the server side
could be some other implementation than pack-objects. Be safe and
strict.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 builtin/index-pack.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index f099ac2..cc60aee 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1736,6 +1736,8 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix)
 				if (*c || opts.off32_limit & 0x80000000)
 					die(_("bad %s"), arg);
 			} else if (skip_prefix(arg, "--append-pack=", &arg)) {
+				strict = 1;
+				do_fsck_object = 1;
 				curr_pack = open_pack_for_append(arg);
 			} else
 				usage(index_pack_usage);
-- 
2.7.0.377.g4cd97dd

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 8/8] one ugly test to verify basic functionality
  2016-02-05  8:57 [PATCH 0/8] Resumable clone revisited, proof of concept Nguyễn Thái Ngọc Duy
                   ` (6 preceding siblings ...)
  2016-02-05  8:57 ` [PATCH 7/8] index-pack: --append-pack implies --strict Nguyễn Thái Ngọc Duy
@ 2016-02-05  8:57 ` Nguyễn Thái Ngọc Duy
  2016-02-05 11:57   ` Elia Pinto
  7 siblings, 1 reply; 22+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2016-02-05  8:57 UTC (permalink / raw)
  To: git; +Cc: Nguyễn Thái Ngọc Duy

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 t/t5544-fetch-resume.sh (new +x) | 42 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 42 insertions(+)
 create mode 100755 t/t5544-fetch-resume.sh

diff --git a/t/t5544-fetch-resume.sh b/t/t5544-fetch-resume.sh
new file mode 100755
index 0000000..dfa033d
--- /dev/null
+++ b/t/t5544-fetch-resume.sh
@@ -0,0 +1,42 @@
+#!/bin/sh
+
+test_description='what'
+
+. ./test-lib.sh
+
+test_expect_success 'what' '
+	test_commit one &&
+	git clone --no-local .git abc &&
+	(
+	cd abc &&
+	mv `ls .git/objects/pack/*.pack` pack &&
+	git unpack-objects < pack &&
+	rm pack &&
+	git fsck
+	) &&
+	test_commit two &&
+	test_commit three &&
+	(
+	cd abc &&
+	git fetch --resume-pack=foo origin HEAD &&
+	git log --format=%s origin/master >actual &&
+	echo one >expected &&
+	test_cmp expected actual &&
+	rm .git/FETCH_HEAD &&
+	mv `ls .git/objects/pack/*.pack` pack &&
+	head -c 123 pack >tmp &&
+	git fetch --resume-pack=tmp origin &&
+	test_path_is_missing tmp &&
+	cmp pack .git/objects/pack/*.pack &&
+	git fsck &&
+	git log --format=%s origin/master >actual &&
+	cat >expected <<EOF &&
+three
+two
+one
+EOF
+	test_cmp expected actual
+	)
+'
+
+test_done
-- 
2.7.0.377.g4cd97dd

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH 4/8] upload-pack: new capability to pass --skip* to pack-objects
  2016-02-05  8:57 ` [PATCH 4/8] upload-pack: new capability to pass --skip* to pack-objects Nguyễn Thái Ngọc Duy
@ 2016-02-05  9:18   ` Johannes Schindelin
  0 siblings, 0 replies; 22+ messages in thread
From: Johannes Schindelin @ 2016-02-05  9:18 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy; +Cc: git

[-- Attachment #1: Type: text/plain, Size: 1235 bytes --]

Hi Duy,

On Fri, 5 Feb 2016, Nguyễn Thái Ngọc Duy wrote:

> [... empty commit message body...]
>
> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
> ---
>
> [...]
> diff --git a/Documentation/technical/protocol-capabilities.txt b/Documentation/technical/protocol-capabilities.txt
> index eaab6b4..0567970 100644
> --- a/Documentation/technical/protocol-capabilities.txt
> +++ b/Documentation/technical/protocol-capabilities.txt
> @@ -275,3 +275,12 @@ to accept a signed push certificate, and asks the <nonce> to be
>  included in the push certificate.  A send-pack client MUST NOT
>  send a push-cert packet unless the receive-pack server advertises
>  this capability.
> +
> +partial
> +------
> +
> +This capability adds "skip" line to the protocol, which passes --skip
> +and --skip-hash to pack-objects. When "skip" line is present, given
> +the same set of input from the client (e.g. have, want and shallow
> +lines, "skip" line excluded), the exact same pack must be produced.
> +

Would you care to elaborate on this idea a bit more? The commit message
would make an excellent place for describing the underlying idea, as well
as comparing it to the alternatives.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 1/8] pack-objects: add --skip and --skip-hash
  2016-02-05  8:57 ` [PATCH 1/8] pack-objects: add --skip and --skip-hash Nguyễn Thái Ngọc Duy
@ 2016-02-05  9:20   ` Johannes Schindelin
  0 siblings, 0 replies; 22+ messages in thread
From: Johannes Schindelin @ 2016-02-05  9:20 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy; +Cc: git

[-- Attachment #1: Type: text/plain, Size: 595 bytes --]

Hi Duy,

On Fri, 5 Feb 2016, Nguyễn Thái Ngọc Duy wrote:

> The idea is, a pack is requested the first time with --skip=0. If pack
> transfer is interrupted, the client will ask for the same pack again,
> but this time it asks the server not to send what it already has. The
> client hashes what it has and sends the SHA-1 to the server. If the
> server finds out the skipped part does not match, it can abort early.

Ah, here it is. This description should definitely go into Documentation/,
methinks. Maybe elaborate a little bit more on the "what it has" part?

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 8/8] one ugly test to verify basic functionality
  2016-02-05  8:57 ` [PATCH 8/8] one ugly test to verify basic functionality Nguyễn Thái Ngọc Duy
@ 2016-02-05 11:57   ` Elia Pinto
  2016-02-05 13:02     ` Duy Nguyen
  2016-02-05 13:20     ` Johannes Schindelin
  0 siblings, 2 replies; 22+ messages in thread
From: Elia Pinto @ 2016-02-05 11:57 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy; +Cc: git

2016-02-05 9:57 GMT+01:00 Nguyễn Thái Ngọc Duy <pclouds@gmail.com>:
> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
> ---
>  t/t5544-fetch-resume.sh (new +x) | 42 ++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 42 insertions(+)
>  create mode 100755 t/t5544-fetch-resume.sh
>
> diff --git a/t/t5544-fetch-resume.sh b/t/t5544-fetch-resume.sh
> new file mode 100755
> index 0000000..dfa033d
> --- /dev/null
> +++ b/t/t5544-fetch-resume.sh
> @@ -0,0 +1,42 @@
> +#!/bin/sh
> +
> +test_description='what'
> +
> +. ./test-lib.sh
> +
> +test_expect_success 'what' '
> +       test_commit one &&
> +       git clone --no-local .git abc &&
> +       (
> +       cd abc &&
> +       mv `ls .git/objects/pack/*.pack` pack &&

No, please. From the git coding guideline : "We prefer $( ... ) for
command substitution; unlike ``, it properly nests.
It should have been the way Bourne spelled it from day one, but
unfortunately isn't."

http://stackoverflow.com/questions/4708549/whats-the-difference-between-command-and-command-in-shell-programming

Thank you
> +       git unpack-objects < pack &&
> +       rm pack &&
> +       git fsck
> +       ) &&
> +       test_commit two &&
> +       test_commit three &&
> +       (
> +       cd abc &&
> +       git fetch --resume-pack=foo origin HEAD &&
> +       git log --format=%s origin/master >actual &&
> +       echo one >expected &&
> +       test_cmp expected actual &&
> +       rm .git/FETCH_HEAD &&
> +       mv `ls .git/objects/pack/*.pack` pack &&
> +       head -c 123 pack >tmp &&
> +       git fetch --resume-pack=tmp origin &&
> +       test_path_is_missing tmp &&
> +       cmp pack .git/objects/pack/*.pack &&
> +       git fsck &&
> +       git log --format=%s origin/master >actual &&
> +       cat >expected <<EOF &&
> +three
> +two
> +one
> +EOF
> +       test_cmp expected actual
> +       )
> +'
> +
> +test_done
> --
> 2.7.0.377.g4cd97dd
>
> --
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 8/8] one ugly test to verify basic functionality
  2016-02-05 11:57   ` Elia Pinto
@ 2016-02-05 13:02     ` Duy Nguyen
  2016-02-05 13:33       ` Elia Pinto
  2016-02-05 13:20     ` Johannes Schindelin
  1 sibling, 1 reply; 22+ messages in thread
From: Duy Nguyen @ 2016-02-05 13:02 UTC (permalink / raw)
  To: Elia Pinto; +Cc: git

On Fri, Feb 5, 2016 at 6:57 PM, Elia Pinto <gitter.spiros@gmail.com> wrote:
>> +       mv `ls .git/objects/pack/*.pack` pack &&
>
> No, please. From the git coding guideline : "We prefer $( ... ) for
> command substitution; unlike ``, it properly nests.

I was being too subtle with the word "ugly". I assure you this test
cannot be merged in its current form.

But that's not important. It makes me think, can we catch `..`
automatically? I know the test suite can catch broken && command
sequence. Maybe we can do the same for`..`?

> It should have been the way Bourne spelled it from day one, but
> unfortunately isn't."
>
> http://stackoverflow.com/questions/4708549/whats-the-difference-between-command-and-command-in-shell-programming
-- 
Duy

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 8/8] one ugly test to verify basic functionality
  2016-02-05 11:57   ` Elia Pinto
  2016-02-05 13:02     ` Duy Nguyen
@ 2016-02-05 13:20     ` Johannes Schindelin
  2016-02-05 13:38       ` Elia Pinto
  1 sibling, 1 reply; 22+ messages in thread
From: Johannes Schindelin @ 2016-02-05 13:20 UTC (permalink / raw)
  To: Elia Pinto; +Cc: Nguyễn Thái Ngọc Duy, git

Hi Elia,

On Fri, 5 Feb 2016, Elia Pinto wrote:

> From the git coding guideline : "We prefer $( ... ) for command
> substitution; unlike ``, it properly nests.  It should have been the way
> Bourne spelled it from day one, but unfortunately isn't."

There was only one time in my life as a developer when I had to change a
$(...) to a `...` construct. I do not really remember all the details, but
I think it was on MacOSX, and it was inside a case ... esac and the
closing parentheses was mistaken for a case arm (the $(...) construct
might have been multi-line, is the only thing that would make sense in my
mind.

Did you hear of similar problems?

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 8/8] one ugly test to verify basic functionality
  2016-02-05 13:02     ` Duy Nguyen
@ 2016-02-05 13:33       ` Elia Pinto
  0 siblings, 0 replies; 22+ messages in thread
From: Elia Pinto @ 2016-02-05 13:33 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: git

2016-02-05 14:02 GMT+01:00 Duy Nguyen <pclouds@gmail.com>:
> On Fri, Feb 5, 2016 at 6:57 PM, Elia Pinto <gitter.spiros@gmail.com> wrote:
>>> +       mv `ls .git/objects/pack/*.pack` pack &&
>>
>> No, please. From the git coding guideline : "We prefer $( ... ) for
>> command substitution; unlike ``, it properly nests.
>
> I was being too subtle with the word "ugly". I assure you this test
> cannot be merged in its current form.
>
> But that's not important. It makes me think, can we catch `..`
> automatically? I know the test suite can catch broken && command
> sequence. Maybe we can do the same for`..`?
I dunno. But it might be a good idea
>
>> It should have been the way Bourne spelled it from day one, but
>> unfortunately isn't."
>>
>> http://stackoverflow.com/questions/4708549/whats-the-difference-between-command-and-command-in-shell-programming
> --
> Duy

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 8/8] one ugly test to verify basic functionality
  2016-02-05 13:20     ` Johannes Schindelin
@ 2016-02-05 13:38       ` Elia Pinto
  0 siblings, 0 replies; 22+ messages in thread
From: Elia Pinto @ 2016-02-05 13:38 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Nguyễn Thái Ngọc Duy, git

2016-02-05 14:20 GMT+01:00 Johannes Schindelin <Johannes.Schindelin@gmx.de>:
> Hi Elia,
>
> On Fri, 5 Feb 2016, Elia Pinto wrote:
>
>> From the git coding guideline : "We prefer $( ... ) for command
>> substitution; unlike ``, it properly nests.  It should have been the way
>> Bourne spelled it from day one, but unfortunately isn't."
>
> There was only one time in my life as a developer when I had to change a
> $(...) to a `...` construct. I do not really remember all the details, but
> I think it was on MacOSX, and it was inside a case ... esac and the
> closing parentheses was mistaken for a case arm (the $(...) construct
> might have been multi-line, is the only thing that would make sense in my
> mind.
>
> Did you hear of similar problems?

I not personally. The autoconf manual does not seem to me (for that i
can remember) to mention this possible incompatibility, for example.

Best

>
> Ciao,
> Dscho

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/8] pack-objects: produce a stable pack when --skip is given
  2016-02-05  8:57 ` [PATCH 2/8] pack-objects: produce a stable pack when --skip is given Nguyễn Thái Ngọc Duy
@ 2016-02-05 18:43   ` Junio C Hamano
  2016-02-05 23:25     ` Duy Nguyen
  0 siblings, 1 reply; 22+ messages in thread
From: Junio C Hamano @ 2016-02-05 18:43 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy; +Cc: git

Nguyễn Thái Ngọc Duy  <pclouds@gmail.com> writes:

> diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
> index 417c830..c58a9cb 100644
> --- a/builtin/pack-objects.c
> +++ b/builtin/pack-objects.c
> @@ -2709,6 +2709,11 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
>  			if (get_oid_hex(skip_hash_hex, &skip_hash))
>  				die(_("%s is not SHA-1"), skip_hash_hex);
>  		}
> +
> +		/*
> +		 * Parallel delta search can't produce stable packs.
> +		 */
> +		delta_search_threads = 1;
>  	}
>  
>  	argv_array_push(&rp, "pack-objects");

A multi-threaded packing is _one_ source of regenerating the same
pack for the same set of objects, but we shouldn't be tying our
hands by promising it will forever be the _only_ source of it by
doing things like this.  We may want to dynamically tweak the
packing behaviour depending on the load of the minute and such for
example.

This is an indication that the approach this series takes is taking
us in a wrong direction.

I think a more sensible approach for "resuming" is to attack cloning
first.  Take a reasonable baseline snapshot periodically (depending
on the activity level of the project, the interval may span between
12 hours to 2 weeks and you would want to make it configurable) to
create a bundle, teach "clone" to check the bundle first and perform
a resumable and bulk transfer for the stable part of the history
(e.g. up to the latest tag or a branch that does not rewind, the set
of refs to use as the stable anchors you would want to make
configurable), and then fill the gap between that baseline snapshot
and up-to-date state by doing another round of "git fetch" and then
you'd have solved the most serious problem already.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/8] pack-objects: produce a stable pack when --skip is given
  2016-02-05 18:43   ` Junio C Hamano
@ 2016-02-05 23:25     ` Duy Nguyen
  2016-02-06  0:48       ` Junio C Hamano
  0 siblings, 1 reply; 22+ messages in thread
From: Duy Nguyen @ 2016-02-05 23:25 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git Mailing List

On Sat, Feb 6, 2016 at 1:43 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Nguyễn Thái Ngọc Duy  <pclouds@gmail.com> writes:
>
>> diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
>> index 417c830..c58a9cb 100644
>> --- a/builtin/pack-objects.c
>> +++ b/builtin/pack-objects.c
>> @@ -2709,6 +2709,11 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
>>                       if (get_oid_hex(skip_hash_hex, &skip_hash))
>>                               die(_("%s is not SHA-1"), skip_hash_hex);
>>               }
>> +
>> +             /*
>> +              * Parallel delta search can't produce stable packs.
>> +              */
>> +             delta_search_threads = 1;
>>       }
>>
>>       argv_array_push(&rp, "pack-objects");
>
> A multi-threaded packing is _one_ source of regenerating the same
> pack for the same set of objects, but we shouldn't be tying our
> hands by promising it will forever be the _only_ source of it by
> doing things like this.  We may want to dynamically tweak the
> packing behaviour depending on the load of the minute and such for
> example.

You noticed that tying the behavior only happens when the user asks
for it, right? I don't expect people to do resumable fetch/clone by
default. There are tradeoffs to make and they decide it, we offer
options. So, it does not really tie our hands in the normal case.

> I think a more sensible approach for "resuming" is to attack cloning
> first.  Take a reasonable baseline snapshot periodically (depending
> on the activity level of the project, the interval may span between
> 12 hours to 2 weeks and you would want to make it configurable) to
> create a bundle, teach "clone" to check the bundle first and perform
> a resumable and bulk transfer for the stable part of the history
> (e.g. up to the latest tag or a branch that does not rewind, the set
> of refs to use as the stable anchors you would want to make
> configurable), and then fill the gap between that baseline snapshot
> and up-to-date state by doing another round of "git fetch" and then
> you'd have solved the most serious problem already.

_most_. On a troubled connection, fetch can fail as well, especially
when the repo is not uptodate. What about shallow clone? I don't think
you want to prepare snapshots for all depths (or some "popular"
depths). Cloning full then cutting it shallow locally might work, but
we're wasting bandwidth and disk space.

> This is an indication that the approach this series takes is taking
> us in a wrong direction.

The way I see it, the two approaches complement each other. Snapshots,
like pack bitmaps, helps tremendously, but it has corner cases that
can be covered by this.
-- 
Duy

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/8] pack-objects: produce a stable pack when --skip is given
  2016-02-05 23:25     ` Duy Nguyen
@ 2016-02-06  0:48       ` Junio C Hamano
  2016-02-08  5:23         ` Duy Nguyen
  0 siblings, 1 reply; 22+ messages in thread
From: Junio C Hamano @ 2016-02-06  0:48 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: Git Mailing List

> You noticed that tying the behavior only happens when the user asks
> for it, right? I don't expect people to do resumable fetch/clone by
> default. There are tradeoffs to make and they decide it, we offer
> options. So, it does not really tie our hands in the normal case.

You misread me. I do not want to see us having to add
"disable_this_feature = 1" next to that "delta_search_thread = 1"
in this block, and then supporting code to actually disable that
feature, only to support this block. You are declaring that "to
support this mode, we must always have a way to produce a
byte-for-byte identical output and keep supporting it forever".

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 5/8] fetch-pack.c: send "skip" line to pack-objects
  2016-02-05  8:57 ` [PATCH 5/8] fetch-pack.c: send "skip" line " Nguyễn Thái Ngọc Duy
@ 2016-02-07  8:57   ` Eric Sunshine
  0 siblings, 0 replies; 22+ messages in thread
From: Eric Sunshine @ 2016-02-07  8:57 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy; +Cc: Git List

On Fri, Feb 5, 2016 at 3:57 AM, Nguyễn Thái Ngọc Duy <pclouds@gmail.com> wrote:
> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
> ---
> diff --git a/fetch-pack.c b/fetch-pack.c
> @@ -856,6 +864,8 @@ static struct ref *do_fetch_pack(struct fetch_pack_args *args,
>                         fprintf(stderr, "Server supports ofs-delta\n");
>         } else
>                 prefer_ofs_delta = 0;
> +       if (args->resume_path && !server_supports("partial"))
> +               die(_("Server does not support resume capability\n"));

Drop "\n" from die() message.

>         if ((agent_feature = server_feature_value("agent", &agent_len))) {
>                 agent_supported = 1;

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 7/8] index-pack: --append-pack implies --strict
  2016-02-05  8:57 ` [PATCH 7/8] index-pack: --append-pack implies --strict Nguyễn Thái Ngọc Duy
@ 2016-02-07  9:02   ` Eric Sunshine
  0 siblings, 0 replies; 22+ messages in thread
From: Eric Sunshine @ 2016-02-07  9:02 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy; +Cc: Git List

On Fri, Feb 5, 2016 at 3:57 AM, Nguyễn Thái Ngọc Duy <pclouds@gmail.com> wrote:
> A frankenstein pack, generated by multiple pack-objects runs,
> certainly has higher risk of broken, especially when the server side

s/of/& being/

> could be some other implementation than pack-objects. Be safe and
> strict.
>
> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 2/8] pack-objects: produce a stable pack when --skip is given
  2016-02-06  0:48       ` Junio C Hamano
@ 2016-02-08  5:23         ` Duy Nguyen
  0 siblings, 0 replies; 22+ messages in thread
From: Duy Nguyen @ 2016-02-08  5:23 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git Mailing List

On Sat, Feb 6, 2016 at 7:48 AM, Junio C Hamano <gitster@pobox.com> wrote:
>> You noticed that tying the behavior only happens when the user asks
>> for it, right? I don't expect people to do resumable fetch/clone by
>> default. There are tradeoffs to make and they decide it, we offer
>> options. So, it does not really tie our hands in the normal case.
>
> You misread me. I do not want to see us having to add
> "disable_this_feature = 1" next to that "delta_search_thread = 1"
> in this block, and then supporting code to actually disable that
> feature, only to support this block. You are declaring that "to
> support this mode, we must always have a way to produce a
> byte-for-byte identical output and keep supporting it forever".

My last defense is, this is an extension, not part of the core
protocol. If the burden becomes real we can always remove it.
Modem-quality connection users may yell a bit, but the majority of
broadband users won't notice a thing. But I understand if the answer
is still 'no'.

I find it unlikely that this cause much maintenance burden though, the
packing algorithm has not changed for a very long time (most related
change is pack bitmaps, which technically happen before pack-objects).
The next big change (at least in public) may be pack v4. We haven't
found a good algorithm for pack-objects on v4 yet, so there's a chance
of problems there.
-- 
Duy

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2016-02-08  5:24 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-02-05  8:57 [PATCH 0/8] Resumable clone revisited, proof of concept Nguyễn Thái Ngọc Duy
2016-02-05  8:57 ` [PATCH 1/8] pack-objects: add --skip and --skip-hash Nguyễn Thái Ngọc Duy
2016-02-05  9:20   ` Johannes Schindelin
2016-02-05  8:57 ` [PATCH 2/8] pack-objects: produce a stable pack when --skip is given Nguyễn Thái Ngọc Duy
2016-02-05 18:43   ` Junio C Hamano
2016-02-05 23:25     ` Duy Nguyen
2016-02-06  0:48       ` Junio C Hamano
2016-02-08  5:23         ` Duy Nguyen
2016-02-05  8:57 ` [PATCH 3/8] index-pack: add --append-pack=<path> Nguyễn Thái Ngọc Duy
2016-02-05  8:57 ` [PATCH 4/8] upload-pack: new capability to pass --skip* to pack-objects Nguyễn Thái Ngọc Duy
2016-02-05  9:18   ` Johannes Schindelin
2016-02-05  8:57 ` [PATCH 5/8] fetch-pack.c: send "skip" line " Nguyễn Thái Ngọc Duy
2016-02-07  8:57   ` Eric Sunshine
2016-02-05  8:57 ` [PATCH 6/8] fetch: add --resume-pack=<path> Nguyễn Thái Ngọc Duy
2016-02-05  8:57 ` [PATCH 7/8] index-pack: --append-pack implies --strict Nguyễn Thái Ngọc Duy
2016-02-07  9:02   ` Eric Sunshine
2016-02-05  8:57 ` [PATCH 8/8] one ugly test to verify basic functionality Nguyễn Thái Ngọc Duy
2016-02-05 11:57   ` Elia Pinto
2016-02-05 13:02     ` Duy Nguyen
2016-02-05 13:33       ` Elia Pinto
2016-02-05 13:20     ` Johannes Schindelin
2016-02-05 13:38       ` Elia Pinto

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.