git.vger.kernel.org archive mirror
* [PATCH 00/26] protocol version 2
@ 2018-01-03  0:18 Brandon Williams
  2018-01-03  0:18 ` [PATCH 01/26] pkt-line: introduce packet_read_with_status Brandon Williams
                   ` (27 more replies)
  0 siblings, 28 replies; 362+ messages in thread
From: Brandon Williams @ 2018-01-03  0:18 UTC (permalink / raw)
  To: git
  Cc: sbeller, gitster, peff, philipoakley, stolee, jrnieder, Brandon Williams

The following patches extend what I sent out as a WIP
(https://public-inbox.org/git/20171204235823.63299-1-bmwill@google.com/) and
implement protocol version 2.

Some changes from that series are as follows:
 * Lots of various cleanup on the ls-refs and fetch command code, both server
   and client.
 * Fetch command now supports a stateless-rpc mode which enables communicating
   with a half-duplex connection.
 * Introduce a new remote-helper command 'connect-half-duplex' which is
   implemented by remote-curl (the http remote-helper).  This allows for a
   client to establish a half-duplex connection and use remote-curl as a proxy
   to wrap requests in http before sending them to the remote end and
   unwrapping the responses and sending them back to the client's stdin.
 * The transport code is refactored for ls-remote, fetch, and push to provide a
   list of ref-patterns (based on the refspec being used) when requesting refs
   from the remote end.  This allows the ls-refs code to send this list of
   patterns so the remote end can filter the refs it sends back.

This series effectively implements protocol version 2 for listing a remote's
refs (ls-remote) as well as for fetch for the builtin transports (ssh, git,
file) and for the http/https transports.  Push is not implemented yet and
doesn't need to be implemented at the same time as fetch since the
receive-pack code can default to using protocol v0 when v2 is requested by the
client.

Any feedback is appreciated! Thanks!
	Brandon

Brandon Williams (26):
  pkt-line: introduce packet_read_with_status
  pkt-line: introduce struct packet_reader
  pkt-line: add delim packet support
  upload-pack: convert to a builtin
  upload-pack: factor out processing lines
  transport: use get_refs_via_connect to get refs
  connect: convert get_remote_heads to use struct packet_reader
  connect: discover protocol version outside of get_remote_heads
  transport: store protocol version
  protocol: introduce enum protocol_version value protocol_v2
  serve: introduce git-serve
  ls-refs: introduce ls-refs server command
  connect: request remote refs using v2
  transport: convert get_refs_list to take a list of ref patterns
  transport: convert transport_get_remote_refs to take a list of ref
    patterns
  ls-remote: pass ref patterns when requesting a remote's refs
  fetch: pass ref patterns when fetching
  push: pass ref patterns when pushing
  upload-pack: introduce fetch server command
  fetch-pack: perform a fetch using v2
  transport-helper: remove name parameter
  transport-helper: refactor process_connect_service
  transport-helper: introduce connect-half-duplex
  pkt-line: add packet_buf_write_len function
  remote-curl: create copy of the service name
  remote-curl: implement connect-half-duplex command

 .gitignore                              |   1 +
 Documentation/technical/protocol-v2.txt | 131 ++++++++++
 Makefile                                |   6 +-
 builtin.h                               |   2 +
 builtin/clone.c                         |   2 +-
 builtin/fetch-pack.c                    |  21 +-
 builtin/fetch.c                         |  14 +-
 builtin/ls-remote.c                     |   7 +-
 builtin/receive-pack.c                  |   6 +
 builtin/remote.c                        |   2 +-
 builtin/send-pack.c                     |  20 +-
 builtin/serve.c                         |  30 +++
 connect.c                               | 226 +++++++++++++-----
 connect.h                               |   3 +
 fetch-pack.c                            | 267 ++++++++++++++++++++-
 fetch-pack.h                            |   4 +-
 git.c                                   |   2 +
 ls-refs.c                               |  97 ++++++++
 ls-refs.h                               |   9 +
 pkt-line.c                              | 147 +++++++++++-
 pkt-line.h                              |  76 ++++++
 protocol.c                              |   2 +
 protocol.h                              |   1 +
 remote-curl.c                           | 209 +++++++++++++++-
 remote.h                                |   9 +-
 serve.c                                 | 243 +++++++++++++++++++
 serve.h                                 |  15 ++
 t/t5701-protocol-v2.sh                  | 117 +++++++++
 transport-helper.c                      |  84 ++++---
 transport-internal.h                    |   4 +-
 transport.c                             | 119 ++++++---
 transport.h                             |   9 +-
 upload-pack.c                           | 412 ++++++++++++++++++++++++++++----
 upload-pack.h                           |   9 +
 34 files changed, 2108 insertions(+), 198 deletions(-)
 create mode 100644 Documentation/technical/protocol-v2.txt
 create mode 100644 builtin/serve.c
 create mode 100644 ls-refs.c
 create mode 100644 ls-refs.h
 create mode 100644 serve.c
 create mode 100644 serve.h
 create mode 100755 t/t5701-protocol-v2.sh
 create mode 100644 upload-pack.h

-- 
2.15.1.620.gb9897f4670-goog



* [PATCH 01/26] pkt-line: introduce packet_read_with_status
  2018-01-03  0:18 [PATCH 00/26] protocol version 2 Brandon Williams
@ 2018-01-03  0:18 ` Brandon Williams
  2018-01-03 19:27   ` Stefan Beller
  2018-01-09 18:04   ` Jonathan Tan
  2018-01-03  0:18 ` [PATCH 02/26] pkt-line: introduce struct packet_reader Brandon Williams
                   ` (26 subsequent siblings)
  27 siblings, 2 replies; 362+ messages in thread
From: Brandon Williams @ 2018-01-03  0:18 UTC (permalink / raw)
  To: git
  Cc: sbeller, gitster, peff, philipoakley, stolee, jrnieder, Brandon Williams

The current pkt-line API encodes the status of a pkt-line read in the
length of the read content.  An error is indicated with '-1', a flush
with '0' (which can be confusing since a return value of '0' can also
indicate an empty pkt-line), and a positive integer for the length of
the read content otherwise.  This leaves little room for adding new
special packets in the future.

To solve this, introduce 'packet_read_with_status()', which reads a packet
and returns the status of the read encoded as an 'enum packet_read_status'
type.  This makes it easy to distinguish between special and normal
packets as well as errors.  It also makes it easy to add a new special
packet in the future.
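
As a rough illustration of why a separate status helps, here is a
minimal standalone sketch that parses pkt-lines from an in-memory
buffer.  All 'demo_*' names are hypothetical and are not part of this
patch; unlike the patch, the sketch reports malformed input as
DEMO_EOF instead of calling die().

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical status type; it mirrors, but is not, the patch's enum. */
enum demo_status {
	DEMO_EOF = -1,
	DEMO_NORMAL,
	DEMO_FLUSH,
};

/* Convert one hex digit to its value, or -1 if it is not a hex digit. */
static int demo_hexval(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/*
 * Read one pkt-line from the in-memory stream *src / *src_len into out.
 * On DEMO_NORMAL, *pktlen holds the payload length; otherwise it is
 * left untouched.  Malformed or truncated input yields DEMO_EOF.
 */
static enum demo_status demo_read(const char **src, size_t *src_len,
				  char *out, size_t out_size, int *pktlen)
{
	int len = 0, i;

	if (*src_len < 4)
		return DEMO_EOF;
	for (i = 0; i < 4; i++) {
		int v = demo_hexval((*src)[i]);
		if (v < 0)
			return DEMO_EOF;
		len = (len << 4) | v;
	}
	*src += 4;
	*src_len -= 4;

	if (len == 0)
		return DEMO_FLUSH;
	if (len < 4)
		return DEMO_EOF;	/* lengths 1-3 are invalid here */

	len -= 4;
	if ((size_t)len >= out_size || (size_t)len > *src_len)
		return DEMO_EOF;
	memcpy(out, *src, (size_t)len);
	out[len] = '\0';
	*src += len;
	*src_len -= (size_t)len;
	*pktlen = len;
	return DEMO_NORMAL;
}
```

With this shape, an empty pkt-line ("0004") comes back as DEMO_NORMAL
with a length of 0, while a flush ("0000") comes back as DEMO_FLUSH, so
the two can no longer be confused.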

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 pkt-line.c | 55 ++++++++++++++++++++++++++++++++++++++++++-------------
 pkt-line.h | 15 +++++++++++++++
 2 files changed, 57 insertions(+), 13 deletions(-)

diff --git a/pkt-line.c b/pkt-line.c
index 2827ca772..8d7cd389f 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -280,28 +280,33 @@ static int packet_length(const char *linelen)
 	return (val < 0) ? val : (val << 8) | hex2chr(linelen + 2);
 }
 
-int packet_read(int fd, char **src_buf, size_t *src_len,
-		char *buffer, unsigned size, int options)
+enum packet_read_status packet_read_with_status(int fd, char **src_buffer, size_t *src_len,
+						char *buffer, unsigned size, int *pktlen,
+						int options)
 {
-	int len, ret;
+	int len;
 	char linelen[4];
 
-	ret = get_packet_data(fd, src_buf, src_len, linelen, 4, options);
-	if (ret < 0)
-		return ret;
+	if (get_packet_data(fd, src_buffer, src_len, linelen, 4, options) < 0)
+		return PACKET_READ_EOF;
+
 	len = packet_length(linelen);
 	if (len < 0)
 		die("protocol error: bad line length character: %.4s", linelen);
-	if (!len) {
+
+	if (len == 0) {
 		packet_trace("0000", 4, 0);
-		return 0;
+		return PACKET_READ_FLUSH;
+	} else if (len >= 1 && len <= 3) {
+		die("protocol error: bad line length character: %.4s", linelen);
 	}
+
 	len -= 4;
-	if (len >= size)
+	if ((len < 0) || ((unsigned)len >= size))
 		die("protocol error: bad line length %d", len);
-	ret = get_packet_data(fd, src_buf, src_len, buffer, len, options);
-	if (ret < 0)
-		return ret;
+
+	if (get_packet_data(fd, src_buffer, src_len, buffer, len, options) < 0)
+		return PACKET_READ_EOF;
 
 	if ((options & PACKET_READ_CHOMP_NEWLINE) &&
 	    len && buffer[len-1] == '\n')
@@ -309,7 +314,31 @@ int packet_read(int fd, char **src_buf, size_t *src_len,
 
 	buffer[len] = 0;
 	packet_trace(buffer, len, 0);
-	return len;
+	*pktlen = len;
+	return PACKET_READ_NORMAL;
+}
+
+int packet_read(int fd, char **src_buffer, size_t *src_len,
+		char *buffer, unsigned size, int options)
+{
+	enum packet_read_status status;
+	int pktlen;
+
+	status = packet_read_with_status(fd, src_buffer, src_len,
+					 buffer, size, &pktlen,
+					 options);
+	switch (status) {
+	case PACKET_READ_EOF:
+		pktlen = -1;
+		break;
+	case PACKET_READ_NORMAL:
+		break;
+	case PACKET_READ_FLUSH:
+		pktlen = 0;
+		break;
+	}
+
+	return pktlen;
 }
 
 static char *packet_read_line_generic(int fd,
diff --git a/pkt-line.h b/pkt-line.h
index 3dad583e2..06c468927 100644
--- a/pkt-line.h
+++ b/pkt-line.h
@@ -65,6 +65,21 @@ int write_packetized_from_buf(const char *src_in, size_t len, int fd_out);
 int packet_read(int fd, char **src_buffer, size_t *src_len, char
 		*buffer, unsigned size, int options);
 
+/*
+ * Read a packetized line into a buffer like the 'packet_read()' function but
+ * returns an 'enum packet_read_status' which indicates the status of the read.
+ * The number of bytes read will be assigned to *pktlen if the status of the
+ * read was 'PACKET_READ_NORMAL'.
+ */
+enum packet_read_status {
+	PACKET_READ_EOF = -1,
+	PACKET_READ_NORMAL,
+	PACKET_READ_FLUSH,
+};
+enum packet_read_status packet_read_with_status(int fd, char **src_buffer, size_t *src_len,
+						char *buffer, unsigned size, int *pktlen,
+						int options);
+
 /*
  * Convenience wrapper for packet_read that is not gentle, and sets the
  * CHOMP_NEWLINE option. The return value is NULL for a flush packet,
-- 
2.15.1.620.gb9897f4670-goog



* [PATCH 02/26] pkt-line: introduce struct packet_reader
  2018-01-03  0:18 [PATCH 00/26] protocol version 2 Brandon Williams
  2018-01-03  0:18 ` [PATCH 01/26] pkt-line: introduce packet_read_with_status Brandon Williams
@ 2018-01-03  0:18 ` Brandon Williams
  2018-01-09 18:08   ` Jonathan Tan
  2018-01-03  0:18 ` [PATCH 03/26] pkt-line: add delim packet support Brandon Williams
                   ` (25 subsequent siblings)
  27 siblings, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-01-03  0:18 UTC (permalink / raw)
  To: git
  Cc: sbeller, gitster, peff, philipoakley, stolee, jrnieder, Brandon Williams

Sometimes it is advantageous to be able to peek the next packet line
without consuming it (e.g. to be able to determine the protocol version
a server is speaking).  To do that, introduce 'struct
packet_reader', which is an abstraction around the normal packet reading
logic.  This enables a caller to peek a single line at a time
using 'packet_reader_peek()' and to consume a line by
calling 'packet_reader_read()'.
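
The peek/read relationship can be sketched in isolation as a one-slot
lookahead.  The snippet below is only an illustration under simplifying
assumptions: it reads from a NULL-terminated array of strings rather
than a file descriptor, and every 'demo_*' name is hypothetical.

```c
#include <stddef.h>

/* Hypothetical one-line lookahead reader over an array of lines. */
struct demo_reader {
	const char **lines;	/* NULL-terminated input */
	size_t next;		/* index of the next unread line */
	const char *line;	/* last line read or peeked; NULL at end */
	int line_peeked;	/* set while 'line' is peeked but unconsumed */
};

static void demo_reader_init(struct demo_reader *r, const char **lines)
{
	r->lines = lines;
	r->next = 0;
	r->line = NULL;
	r->line_peeked = 0;
}

/* Consume a line: return the peeked one if present, else fetch a new one. */
static const char *demo_reader_read(struct demo_reader *r)
{
	if (r->line_peeked) {
		r->line_peeked = 0;
		return r->line;
	}
	r->line = r->lines[r->next];
	if (r->line)
		r->next++;
	return r->line;
}

/* Look at the next line without consuming it; repeated peeks are idempotent. */
static const char *demo_reader_peek(struct demo_reader *r)
{
	if (r->line_peeked)
		return r->line;
	demo_reader_read(r);
	r->line_peeked = 1;
	return r->line;
}
```

Peeking twice in a row returns the same line, and the following read
returns that line once more before the reader advances.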

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 pkt-line.c | 59 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 pkt-line.h | 57 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 116 insertions(+)

diff --git a/pkt-line.c b/pkt-line.c
index 8d7cd389f..98c2d7d68 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -406,3 +406,62 @@ ssize_t read_packetized_to_strbuf(int fd_in, struct strbuf *sb_out)
 	}
 	return sb_out->len - orig_len;
 }
+
+/* Packet Reader Functions */
+void packet_reader_init(struct packet_reader *reader, int fd,
+			char *src_buffer, size_t src_len,
+			int options)
+{
+	memset(reader, 0, sizeof(*reader));
+
+	reader->fd = fd;
+	reader->src_buffer = src_buffer;
+	reader->src_len = src_len;
+	reader->buffer = packet_buffer;
+	reader->buffer_size = sizeof(packet_buffer);
+	reader->options = options;
+}
+
+enum packet_read_status packet_reader_read(struct packet_reader *reader)
+{
+	if (reader->line_peeked) {
+		reader->line_peeked = 0;
+		return reader->status;
+	}
+
+	reader->status = packet_read_with_status(reader->fd,
+						 &reader->src_buffer,
+						 &reader->src_len,
+						 reader->buffer,
+						 reader->buffer_size,
+						 &reader->pktlen,
+						 reader->options);
+
+	switch (reader->status) {
+	case PACKET_READ_EOF:
+		reader->pktlen = -1;
+		reader->line = NULL;
+		break;
+	case PACKET_READ_NORMAL:
+		reader->line = reader->buffer;
+		break;
+	case PACKET_READ_FLUSH:
+		reader->pktlen = 0;
+		reader->line = NULL;
+		break;
+	}
+
+	return reader->status;
+}
+
+enum packet_read_status packet_reader_peek(struct packet_reader *reader)
+{
+	/* Only allow peeking a single line */
+	if (reader->line_peeked)
+		return reader->status;
+
+	/* Peek a line by reading it and setting peeked flag */
+	packet_reader_read(reader);
+	reader->line_peeked = 1;
+	return reader->status;
+}
diff --git a/pkt-line.h b/pkt-line.h
index 06c468927..c446e886a 100644
--- a/pkt-line.h
+++ b/pkt-line.h
@@ -111,6 +111,63 @@ char *packet_read_line_buf(char **src_buf, size_t *src_len, int *size);
  */
 ssize_t read_packetized_to_strbuf(int fd_in, struct strbuf *sb_out);
 
+struct packet_reader {
+	/* source file descriptor */
+	int fd;
+
+	/* source buffer and its size */
+	char *src_buffer;
+	size_t src_len;
+
+	/* buffer that pkt-lines are read into and its size */
+	char *buffer;
+	unsigned buffer_size;
+
+	/* options to be used during reads */
+	int options;
+
+	/* status of the last read */
+	enum packet_read_status status;
+
+	/* length of data read during the last read */
+	int pktlen;
+
+	/* the last line read */
+	const char *line;
+
+	/* indicates if a line has been peeked */
+	int line_peeked;
+};
+
+/*
+ * Initialize a 'struct packet_reader' object which is an
+ * abstraction around the 'packet_read_with_status()' function.
+ */
+extern void packet_reader_init(struct packet_reader *reader, int fd,
+			       char *src_buffer, size_t src_len,
+			       int options);
+
+/*
+ * Perform a packet read and return the status of the read.
+ * The values of 'pktlen' and 'line' are updated based on the status of the
+ * read as follows:
+ *
+ * PACKET_READ_EOF: 'pktlen' is set to '-1' and 'line' is set to NULL
+ * PACKET_READ_NORMAL: 'pktlen' is set to the number of bytes read
+ *		       'line' is set to point at the read line
+ * PACKET_READ_FLUSH: 'pktlen' is set to '0' and 'line' is set to NULL
+ */
+extern enum packet_read_status packet_reader_read(struct packet_reader *reader);
+
+/*
+ * Peek the next packet line without consuming it and return the status.
+ * The next call to 'packet_reader_read()' will perform a read of the same line
+ * that was peeked, consuming the line.
+ *
+ * Only a single line can be peeked at a time.
+ */
+extern enum packet_read_status packet_reader_peek(struct packet_reader *reader);
+
 #define DEFAULT_PACKET_MAX 1000
 #define LARGE_PACKET_MAX 65520
 #define LARGE_PACKET_DATA_MAX (LARGE_PACKET_MAX - 4)
-- 
2.15.1.620.gb9897f4670-goog



* [PATCH 03/26] pkt-line: add delim packet support
  2018-01-03  0:18 [PATCH 00/26] protocol version 2 Brandon Williams
  2018-01-03  0:18 ` [PATCH 01/26] pkt-line: introduce packet_read_with_status Brandon Williams
  2018-01-03  0:18 ` [PATCH 02/26] pkt-line: introduce struct packet_reader Brandon Williams
@ 2018-01-03  0:18 ` Brandon Williams
  2018-01-03  0:18 ` [PATCH 04/26] upload-pack: convert to a builtin Brandon Williams
                   ` (24 subsequent siblings)
  27 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-01-03  0:18 UTC (permalink / raw)
  To: git
  Cc: sbeller, gitster, peff, philipoakley, stolee, jrnieder, Brandon Williams

One of the design goals of protocol-v2 is to improve the semantics of
flush packets.  Currently in protocol-v1, flush packets are used both to
indicate a break in a list of packet lines as well as an indication that
one side has finished speaking.  This makes it particularly difficult
to implement proxies as a proxy would need to completely understand git
protocol instead of simply looking for a flush packet.

To do this, introduce the special delimiter packet '0001'.  A delim
packet can then be used as a delimiter between lists of packet lines
while flush packets can be reserved to indicate the end of a response.
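
To illustrate the proxy case, the sketch below finds the end of a
message by looking only at pkt-line length headers: a delim packet is
skipped as a section break and the first flush packet ends the message.
This is a hypothetical helper ('demo_message_len' is not part of this
patch) and it assumes the whole message is already in memory.

```c
#include <stddef.h>

/* Convert one hex digit to its value, or -1 if it is not a hex digit. */
static int demo_hex_digit(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/*
 * Return the number of bytes up to and including the first flush packet
 * ("0000"), or 0 if the buffer holds no complete, well-formed message.
 * A delim packet ("0001") is skipped as a four-byte separator.
 */
static size_t demo_message_len(const char *buf, size_t len)
{
	size_t pos = 0;

	while (len - pos >= 4) {
		int n = 0, i;

		for (i = 0; i < 4; i++) {
			int v = demo_hex_digit(buf[pos + i]);
			if (v < 0)
				return 0;
			n = (n << 4) | v;
		}
		if (n == 0)
			return pos + 4;		/* flush: end of message */
		if (n == 1) {
			pos += 4;		/* delim: section break only */
			continue;
		}
		if (n < 4 || (size_t)n > len - pos)
			return 0;		/* malformed or truncated */
		pos += (size_t)n;
	}
	return 0;
}
```

The helper never inspects a payload, which is the property the delim
packet is meant to give intermediaries.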

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 pkt-line.c | 19 ++++++++++++++++++-
 pkt-line.h |  3 +++
 2 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/pkt-line.c b/pkt-line.c
index 98c2d7d68..3159cbe10 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -91,6 +91,12 @@ void packet_flush(int fd)
 	write_or_die(fd, "0000", 4);
 }
 
+void packet_delim(int fd)
+{
+	packet_trace("0001", 4, 1);
+	write_or_die(fd, "0001", 4);
+}
+
 int packet_flush_gently(int fd)
 {
 	packet_trace("0000", 4, 1);
@@ -105,6 +111,12 @@ void packet_buf_flush(struct strbuf *buf)
 	strbuf_add(buf, "0000", 4);
 }
 
+void packet_buf_delim(struct strbuf *buf)
+{
+	packet_trace("0001", 4, 1);
+	strbuf_add(buf, "0001", 4);
+}
+
 static void set_packet_header(char *buf, const int size)
 {
 	static char hexchar[] = "0123456789abcdef";
@@ -297,7 +309,10 @@ enum packet_read_status packet_read_with_status(int fd, char **src_buffer, size_
 	if (len == 0) {
 		packet_trace("0000", 4, 0);
 		return PACKET_READ_FLUSH;
-	} else if (len >= 1 && len <= 3) {
+	} else if (len == 1) {
+		packet_trace("0001", 4, 0);
+		return PACKET_READ_DELIM;
+	} else if (len >= 2 && len <= 3) {
 		die("protocol error: bad line length character: %.4s", linelen);
 	}
 
@@ -333,6 +348,7 @@ int packet_read(int fd, char **src_buffer, size_t *src_len,
 		break;
 	case PACKET_READ_NORMAL:
 		break;
+	case PACKET_READ_DELIM:
 	case PACKET_READ_FLUSH:
 		pktlen = 0;
 		break;
@@ -445,6 +461,7 @@ enum packet_read_status packet_reader_read(struct packet_reader *reader)
 	case PACKET_READ_NORMAL:
 		reader->line = reader->buffer;
 		break;
+	case PACKET_READ_DELIM:
 	case PACKET_READ_FLUSH:
 		reader->pktlen = 0;
 		reader->line = NULL;
diff --git a/pkt-line.h b/pkt-line.h
index c446e886a..97b6dd1c7 100644
--- a/pkt-line.h
+++ b/pkt-line.h
@@ -20,8 +20,10 @@
  * side can't, we stay with pure read/write interfaces.
  */
 void packet_flush(int fd);
+void packet_delim(int fd);
 void packet_write_fmt(int fd, const char *fmt, ...) __attribute__((format (printf, 2, 3)));
 void packet_buf_flush(struct strbuf *buf);
+void packet_buf_delim(struct strbuf *buf);
 void packet_write(int fd_out, const char *buf, size_t size);
 void packet_buf_write(struct strbuf *buf, const char *fmt, ...) __attribute__((format (printf, 2, 3)));
 int packet_flush_gently(int fd);
@@ -75,6 +77,7 @@ enum packet_read_status {
 	PACKET_READ_EOF = -1,
 	PACKET_READ_NORMAL,
 	PACKET_READ_FLUSH,
+	PACKET_READ_DELIM,
 };
 enum packet_read_status packet_read_with_status(int fd, char **src_buffer, size_t *src_len,
 						char *buffer, unsigned size, int *pktlen,
-- 
2.15.1.620.gb9897f4670-goog



* [PATCH 04/26] upload-pack: convert to a builtin
  2018-01-03  0:18 [PATCH 00/26] protocol version 2 Brandon Williams
                   ` (2 preceding siblings ...)
  2018-01-03  0:18 ` [PATCH 03/26] pkt-line: add delim packet support Brandon Williams
@ 2018-01-03  0:18 ` Brandon Williams
  2018-01-03 20:33   ` Stefan Beller
  2018-01-03  0:18 ` [PATCH 05/26] upload-pack: factor out processing lines Brandon Williams
                   ` (23 subsequent siblings)
  27 siblings, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-01-03  0:18 UTC (permalink / raw)
  To: git
  Cc: sbeller, gitster, peff, philipoakley, stolee, jrnieder, Brandon Williams

In order to allow for code sharing with the server side of fetch in
protocol-v2, convert upload-pack to be a builtin.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 Makefile      | 3 ++-
 builtin.h     | 1 +
 git.c         | 1 +
 upload-pack.c | 2 +-
 4 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/Makefile b/Makefile
index 2a81ae22e..e0740b452 100644
--- a/Makefile
+++ b/Makefile
@@ -636,7 +636,6 @@ PROGRAM_OBJS += imap-send.o
 PROGRAM_OBJS += sh-i18n--envsubst.o
 PROGRAM_OBJS += shell.o
 PROGRAM_OBJS += show-index.o
-PROGRAM_OBJS += upload-pack.o
 PROGRAM_OBJS += remote-testsvn.o
 
 # Binary suffix, set to .exe for Windows builds
@@ -701,6 +700,7 @@ BUILT_INS += git-merge-subtree$X
 BUILT_INS += git-show$X
 BUILT_INS += git-stage$X
 BUILT_INS += git-status$X
+BUILT_INS += git-upload-pack$X
 BUILT_INS += git-whatchanged$X
 
 # what 'all' will build and 'install' will install in gitexecdir,
@@ -904,6 +904,7 @@ LIB_OBJS += tree-diff.o
 LIB_OBJS += tree.o
 LIB_OBJS += tree-walk.o
 LIB_OBJS += unpack-trees.o
+LIB_OBJS += upload-pack.o
 LIB_OBJS += url.o
 LIB_OBJS += urlmatch.o
 LIB_OBJS += usage.o
diff --git a/builtin.h b/builtin.h
index 42378f3aa..f332a1257 100644
--- a/builtin.h
+++ b/builtin.h
@@ -231,6 +231,7 @@ extern int cmd_update_ref(int argc, const char **argv, const char *prefix);
 extern int cmd_update_server_info(int argc, const char **argv, const char *prefix);
 extern int cmd_upload_archive(int argc, const char **argv, const char *prefix);
 extern int cmd_upload_archive_writer(int argc, const char **argv, const char *prefix);
+extern int cmd_upload_pack(int argc, const char **argv, const char *prefix);
 extern int cmd_var(int argc, const char **argv, const char *prefix);
 extern int cmd_verify_commit(int argc, const char **argv, const char *prefix);
 extern int cmd_verify_tag(int argc, const char **argv, const char *prefix);
diff --git a/git.c b/git.c
index c870b9719..f71073dc8 100644
--- a/git.c
+++ b/git.c
@@ -478,6 +478,7 @@ static struct cmd_struct commands[] = {
 	{ "update-server-info", cmd_update_server_info, RUN_SETUP },
 	{ "upload-archive", cmd_upload_archive },
 	{ "upload-archive--writer", cmd_upload_archive_writer },
+	{ "upload-pack", cmd_upload_pack },
 	{ "var", cmd_var, RUN_SETUP_GENTLY },
 	{ "verify-commit", cmd_verify_commit, RUN_SETUP },
 	{ "verify-pack", cmd_verify_pack },
diff --git a/upload-pack.c b/upload-pack.c
index d5de18127..20acaa49d 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -1032,7 +1032,7 @@ static int upload_pack_config(const char *var, const char *value, void *unused)
 	return parse_hide_refs_config(var, value, "uploadpack");
 }
 
-int cmd_main(int argc, const char **argv)
+int cmd_upload_pack(int argc, const char **argv, const char *prefix)
 {
 	const char *dir;
 	int strict = 0;
-- 
2.15.1.620.gb9897f4670-goog



* [PATCH 05/26] upload-pack: factor out processing lines
  2018-01-03  0:18 [PATCH 00/26] protocol version 2 Brandon Williams
                   ` (3 preceding siblings ...)
  2018-01-03  0:18 ` [PATCH 04/26] upload-pack: convert to a builtin Brandon Williams
@ 2018-01-03  0:18 ` Brandon Williams
  2018-01-03 20:38   ` Stefan Beller
  2018-01-03  0:18 ` [PATCH 06/26] transport: use get_refs_via_connect to get refs Brandon Williams
                   ` (22 subsequent siblings)
  27 siblings, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-01-03  0:18 UTC (permalink / raw)
  To: git
  Cc: sbeller, gitster, peff, philipoakley, stolee, jrnieder, Brandon Williams

Factor out the logic for processing shallow, deepen, deepen_since, and
deepen_not lines into their own functions to simplify the
'receive_needs()' function in addition to making it easier to reuse some
of this logic when implementing protocol_v2.
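
The calling convention used by the new helpers (return 1 when the line
was handled, 0 when it is some other kind of line) can be sketched
standalone as follows.  'demo_skip_prefix' is only a stand-in for git's
skip_prefix(), and the sketch returns -1 on malformed input where the
real code calls die(); both are assumptions made for illustration.

```c
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for git's skip_prefix(): on a match, point *out past the prefix. */
static int demo_skip_prefix(const char *str, const char *prefix,
			    const char **out)
{
	size_t n = strlen(prefix);

	if (strncmp(str, prefix, n))
		return 0;
	*out = str + n;
	return 1;
}

/*
 * Returns 1 if the line was a "deepen" line and stores its value in
 * *depth, 0 if the line is something else, and -1 if it is malformed.
 */
static int demo_process_deepen(const char *line, int *depth)
{
	const char *arg;
	char *end = NULL;
	long val;

	if (!demo_skip_prefix(line, "deepen ", &arg))
		return 0;
	val = strtol(arg, &end, 0);
	/* Validate the parsed value itself, not the output pointer. */
	if (end == arg || *end || val <= 0 || val > INT_MAX)
		return -1;
	*depth = (int)val;
	return 1;
}
```

A caller can then chain such helpers and 'continue' as soon as one of
them reports that it handled the line.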

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 upload-pack.c | 113 ++++++++++++++++++++++++++++++++++++++--------------------
 1 file changed, 74 insertions(+), 39 deletions(-)

diff --git a/upload-pack.c b/upload-pack.c
index 20acaa49d..9a507ae53 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -731,6 +731,75 @@ static void deepen_by_rev_list(int ac, const char **av,
 	packet_flush(1);
 }
 
+static int process_shallow(const char *line, struct object_array *shallows)
+{
+	const char *arg;
+	if (skip_prefix(line, "shallow ", &arg)) {
+		struct object_id oid;
+		struct object *object;
+		if (get_oid_hex(arg, &oid))
+			die("invalid shallow line: %s", line);
+		object = parse_object(&oid);
+		if (!object)
+			return 1;
+		if (object->type != OBJ_COMMIT)
+			die("invalid shallow object %s", oid_to_hex(&oid));
+		if (!(object->flags & CLIENT_SHALLOW)) {
+			object->flags |= CLIENT_SHALLOW;
+			add_object_array(object, NULL, shallows);
+		}
+		return 1;
+	}
+
+	return 0;
+}
+
+static int process_deepen(const char *line, int *depth)
+{
+	const char *arg;
+	if (skip_prefix(line, "deepen ", &arg)) {
+		char *end = NULL;
+		*depth = strtol(arg, &end, 0);
+		if (!end || *end || *depth <= 0)
+			die("Invalid deepen: %s", line);
+		return 1;
+	}
+
+	return 0;
+}
+
+static int process_deepen_since(const char *line, timestamp_t *deepen_since, int *deepen_rev_list)
+{
+	const char *arg;
+	if (skip_prefix(line, "deepen-since ", &arg)) {
+		char *end = NULL;
+		*deepen_since = parse_timestamp(arg, &end, 0);
+		if (!end || *end || !*deepen_since ||
+		    /* revisions.c's max_age -1 is special */
+		    *deepen_since == -1)
+			die("Invalid deepen-since: %s", line);
+		*deepen_rev_list = 1;
+		return 1;
+	}
+	return 0;
+}
+
+static int process_deepen_not(const char *line, struct string_list *deepen_not, int *deepen_rev_list)
+{
+	const char *arg;
+	if (skip_prefix(line, "deepen-not ", &arg)) {
+		char *ref = NULL;
+		struct object_id oid;
+		if (expand_ref(arg, strlen(arg), &oid, &ref) != 1)
+			die("git upload-pack: ambiguous deepen-not: %s", line);
+		string_list_append(deepen_not, ref);
+		free(ref);
+		*deepen_rev_list = 1;
+		return 1;
+	}
+	return 0;
+}
+
 static void receive_needs(void)
 {
 	struct object_array shallows = OBJECT_ARRAY_INIT;
@@ -752,49 +821,15 @@ static void receive_needs(void)
 		if (!line)
 			break;
 
-		if (skip_prefix(line, "shallow ", &arg)) {
-			struct object_id oid;
-			struct object *object;
-			if (get_oid_hex(arg, &oid))
-				die("invalid shallow line: %s", line);
-			object = parse_object(&oid);
-			if (!object)
-				continue;
-			if (object->type != OBJ_COMMIT)
-				die("invalid shallow object %s", oid_to_hex(&oid));
-			if (!(object->flags & CLIENT_SHALLOW)) {
-				object->flags |= CLIENT_SHALLOW;
-				add_object_array(object, NULL, &shallows);
-			}
+		if (process_shallow(line, &shallows))
 			continue;
-		}
-		if (skip_prefix(line, "deepen ", &arg)) {
-			char *end = NULL;
-			depth = strtol(arg, &end, 0);
-			if (!end || *end || depth <= 0)
-				die("Invalid deepen: %s", line);
+		if (process_deepen(line, &depth))
 			continue;
-		}
-		if (skip_prefix(line, "deepen-since ", &arg)) {
-			char *end = NULL;
-			deepen_since = parse_timestamp(arg, &end, 0);
-			if (!end || *end || !deepen_since ||
-			    /* revisions.c's max_age -1 is special */
-			    deepen_since == -1)
-				die("Invalid deepen-since: %s", line);
-			deepen_rev_list = 1;
+		if (process_deepen_since(line, &deepen_since, &deepen_rev_list))
 			continue;
-		}
-		if (skip_prefix(line, "deepen-not ", &arg)) {
-			char *ref = NULL;
-			struct object_id oid;
-			if (expand_ref(arg, strlen(arg), &oid, &ref) != 1)
-				die("git upload-pack: ambiguous deepen-not: %s", line);
-			string_list_append(&deepen_not, ref);
-			free(ref);
-			deepen_rev_list = 1;
+		if (process_deepen_not(line, &deepen_not, &deepen_rev_list))
 			continue;
-		}
+
 		if (!skip_prefix(line, "want ", &arg) ||
 		    get_oid_hex(arg, &oid_buf))
 			die("git upload-pack: protocol error, "
-- 
2.15.1.620.gb9897f4670-goog



* [PATCH 06/26] transport: use get_refs_via_connect to get refs
  2018-01-03  0:18 [PATCH 00/26] protocol version 2 Brandon Williams
                   ` (4 preceding siblings ...)
  2018-01-03  0:18 ` [PATCH 05/26] upload-pack: factor out processing lines Brandon Williams
@ 2018-01-03  0:18 ` Brandon Williams
  2018-01-03 21:20   ` Stefan Beller
  2018-01-03  0:18 ` [PATCH 07/26] connect: convert get_remote_heads to use struct packet_reader Brandon Williams
                   ` (21 subsequent siblings)
  27 siblings, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-01-03  0:18 UTC (permalink / raw)
  To: git
  Cc: sbeller, gitster, peff, philipoakley, stolee, jrnieder, Brandon Williams

Remove code duplication and use the existing 'get_refs_via_connect()'
function to retrieve a remote's heads in 'fetch_refs_via_pack()' and
'git_transport_push()'.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 transport.c | 18 ++++--------------
 1 file changed, 4 insertions(+), 14 deletions(-)

diff --git a/transport.c b/transport.c
index fc802260f..8e8779096 100644
--- a/transport.c
+++ b/transport.c
@@ -230,12 +230,8 @@ static int fetch_refs_via_pack(struct transport *transport,
 	args.cloning = transport->cloning;
 	args.update_shallow = data->options.update_shallow;
 
-	if (!data->got_remote_heads) {
-		connect_setup(transport, 0);
-		get_remote_heads(data->fd[0], NULL, 0, &refs_tmp, 0,
-				 NULL, &data->shallow);
-		data->got_remote_heads = 1;
-	}
+	if (!data->got_remote_heads)
+		refs_tmp = get_refs_via_connect(transport, 0);
 
 	refs = fetch_pack(&args, data->fd, data->conn,
 			  refs_tmp ? refs_tmp : transport->remote_refs,
@@ -541,14 +537,8 @@ static int git_transport_push(struct transport *transport, struct ref *remote_re
 	struct send_pack_args args;
 	int ret;
 
-	if (!data->got_remote_heads) {
-		struct ref *tmp_refs;
-		connect_setup(transport, 1);
-
-		get_remote_heads(data->fd[0], NULL, 0, &tmp_refs, REF_NORMAL,
-				 NULL, &data->shallow);
-		data->got_remote_heads = 1;
-	}
+	if (!data->got_remote_heads)
+		get_refs_via_connect(transport, 1);
 
 	memset(&args, 0, sizeof(args));
 	args.send_mirror = !!(flags & TRANSPORT_PUSH_MIRROR);
-- 
2.15.1.620.gb9897f4670-goog



* [PATCH 07/26] connect: convert get_remote_heads to use struct packet_reader
  2018-01-03  0:18 [PATCH 00/26] protocol version 2 Brandon Williams
                   ` (5 preceding siblings ...)
  2018-01-03  0:18 ` [PATCH 06/26] transport: use get_refs_via_connect to get refs Brandon Williams
@ 2018-01-03  0:18 ` Brandon Williams
  2018-01-09 18:27   ` Jonathan Tan
  2018-01-03  0:18 ` [PATCH 08/26] connect: discover protocol version outside of get_remote_heads Brandon Williams
                   ` (20 subsequent siblings)
  27 siblings, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-01-03  0:18 UTC (permalink / raw)
  To: git
  Cc: sbeller, gitster, peff, philipoakley, stolee, jrnieder, Brandon Williams

In order to allow for better control flow when protocol_v2 is introduced
convert 'get_remote_heads()' to use 'struct packet_reader' to read
packet lines.  This enables a client to peek the first line
of a server's response (without consuming it) in order to determine the
protocol version it's speaking and then pass control to the
appropriate handler.

This is needed because the initial response from a server speaking
protocol_v0 includes the first ref, while subsequent protocol versions
respond with a version line.  We want to be able to read this first line
without consuming the first ref sent in the protocol_v0 case so that the
protocol version the server is speaking can be determined outside of
'get_remote_heads()' in a future patch.
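
As a sketch of the decision made on that first line: a line starting
with "version " names the protocol version, and anything else is
treated as the first ref of a protocol_v0 advertisement.  The names
below are hypothetical, and the handling of unrecognized version
numbers is an assumption rather than a description of
'determine_protocol_version_client()'.

```c
#include <stddef.h>
#include <string.h>

enum demo_version {
	DEMO_UNKNOWN = -1,
	DEMO_V0 = 0,
	DEMO_V1 = 1,
};

/*
 * Classify the first line of a server response.  A NULL line stands for
 * a flush packet, which is treated as v0 (an empty ref advertisement).
 */
static enum demo_version demo_version_from_line(const char *first_line)
{
	if (!first_line)
		return DEMO_V0;
	if (strncmp(first_line, "version ", 8))
		return DEMO_V0;		/* no version line: first ref of v0 */
	if (!strcmp(first_line + 8, "1"))
		return DEMO_V1;
	return DEMO_UNKNOWN;
}
```

Because the line is only peeked, the v0 case leaves the first ref in
place for the ref-parsing code, while the v1 case consumes the version
line before continuing.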

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 connect.c | 127 +++++++++++++++++++++++++++++++++++---------------------------
 1 file changed, 72 insertions(+), 55 deletions(-)

diff --git a/connect.c b/connect.c
index c3a014c5b..03bbb74e4 100644
--- a/connect.c
+++ b/connect.c
@@ -48,6 +48,12 @@ int check_ref_type(const struct ref *ref, int flags)
 
 static void die_initial_contact(int unexpected)
 {
+	/*
+	 * A hang-up after seeing some response from the other end
+	 * means that it is unexpected, as we know the other end is
+	 * willing to talk to us.  A hang-up before seeing any
+	 * response does not necessarily mean an ACL problem, though.
+	 */
 	if (unexpected)
 		die(_("The remote end hung up upon initial contact"));
 	else
@@ -56,6 +62,41 @@ static void die_initial_contact(int unexpected)
 		      "and the repository exists."));
 }
 
+static enum protocol_version discover_version(struct packet_reader *reader)
+{
+	enum protocol_version version = protocol_unknown_version;
+
+	/*
+	 * Peek the first line of the server's response to
+	 * determine the protocol version the server is speaking.
+	 */
+	switch (packet_reader_peek(reader)) {
+	case PACKET_READ_EOF:
+		die_initial_contact(0);
+	case PACKET_READ_FLUSH:
+	case PACKET_READ_DELIM:
+		version = protocol_v0;
+		break;
+	case PACKET_READ_NORMAL:
+		version = determine_protocol_version_client(reader->line);
+		break;
+	}
+
+	/* Maybe process capabilities here, at least for v2 */
+	switch (version) {
+	case protocol_v1:
+		/* Read the peeked version line */
+		packet_reader_read(reader);
+		break;
+	case protocol_v0:
+		break;
+	case protocol_unknown_version:
+		die("unknown protocol version: '%s'", reader->line);
+	}
+
+	return version;
+}
+
 static void parse_one_symref_info(struct string_list *symref, const char *val, int len)
 {
 	char *sym, *target;
@@ -109,44 +150,10 @@ static void annotate_refs_with_symref_info(struct ref *ref)
 	string_list_clear(&symref, 0);
 }
 
-/*
- * Read one line of a server's ref advertisement into packet_buffer.
- */
-static int read_remote_ref(int in, char **src_buf, size_t *src_len,
-			   int *responded)
-{
-	int len = packet_read(in, src_buf, src_len,
-			      packet_buffer, sizeof(packet_buffer),
-			      PACKET_READ_GENTLE_ON_EOF |
-			      PACKET_READ_CHOMP_NEWLINE);
-	const char *arg;
-	if (len < 0)
-		die_initial_contact(*responded);
-	if (len > 4 && skip_prefix(packet_buffer, "ERR ", &arg))
-		die("remote error: %s", arg);
-
-	*responded = 1;
-
-	return len;
-}
-
-#define EXPECTING_PROTOCOL_VERSION 0
-#define EXPECTING_FIRST_REF 1
-#define EXPECTING_REF 2
-#define EXPECTING_SHALLOW 3
-
-/* Returns 1 if packet_buffer is a protocol version pkt-line, 0 otherwise. */
-static int process_protocol_version(void)
-{
-	switch (determine_protocol_version_client(packet_buffer)) {
-	case protocol_v1:
-		return 1;
-	case protocol_v0:
-		return 0;
-	default:
-		die("server is speaking an unknown protocol");
-	}
-}
+#define EXPECTING_FIRST_REF 0
+#define EXPECTING_REF 1
+#define EXPECTING_SHALLOW 2
+#define EXPECTING_DONE 3
 
 static void process_capabilities(int *len)
 {
@@ -230,28 +237,36 @@ struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
 			      struct oid_array *shallow_points)
 {
 	struct ref **orig_list = list;
+	int len = 0;
+	int state = EXPECTING_FIRST_REF;
+	struct packet_reader reader;
+	const char *arg;
 
-	/*
-	 * A hang-up after seeing some response from the other end
-	 * means that it is unexpected, as we know the other end is
-	 * willing to talk to us.  A hang-up before seeing any
-	 * response does not necessarily mean an ACL problem, though.
-	 */
-	int responded = 0;
-	int len;
-	int state = EXPECTING_PROTOCOL_VERSION;
+	packet_reader_init(&reader, in, src_buf, src_len,
+			   PACKET_READ_CHOMP_NEWLINE |
+			   PACKET_READ_GENTLE_ON_EOF);
+
+	discover_version(&reader);
 
 	*list = NULL;
 
-	while ((len = read_remote_ref(in, &src_buf, &src_len, &responded))) {
+	while (state != EXPECTING_DONE) {
+		switch (packet_reader_read(&reader)) {
+		case PACKET_READ_EOF:
+			die_initial_contact(1);
+		case PACKET_READ_NORMAL:
+			len = reader.pktlen;
+			if (len > 4 && skip_prefix(packet_buffer, "ERR ", &arg))
+				die("remote error: %s", arg);
+			break;
+		case PACKET_READ_FLUSH:
+			state = EXPECTING_DONE;
+			break;
+		case PACKET_READ_DELIM:
+			die("invalid packet\n");
+		}
+
 		switch (state) {
-		case EXPECTING_PROTOCOL_VERSION:
-			if (process_protocol_version()) {
-				state = EXPECTING_FIRST_REF;
-				break;
-			}
-			state = EXPECTING_FIRST_REF;
-			/* fallthrough */
 		case EXPECTING_FIRST_REF:
 			process_capabilities(&len);
 			if (process_dummy_ref()) {
@@ -269,6 +284,8 @@ struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
 			if (process_shallow(len, shallow_points))
 				break;
 			die("protocol error: unexpected '%s'", packet_buffer);
+		case EXPECTING_DONE:
+			break;
 		default:
 			die("unexpected state %d", state);
 		}
-- 
2.15.1.620.gb9897f4670-goog



* [PATCH 08/26] connect: discover protocol version outside of get_remote_heads
  2018-01-03  0:18 [PATCH 00/26] protocol version 2 Brandon Williams
                   ` (6 preceding siblings ...)
  2018-01-03  0:18 ` [PATCH 07/26] connect: convert get_remote_heads to use struct packet_reader Brandon Williams
@ 2018-01-03  0:18 ` Brandon Williams
  2018-01-03  0:18 ` [PATCH 09/26] transport: store protocol version Brandon Williams
                   ` (19 subsequent siblings)
  27 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-01-03  0:18 UTC (permalink / raw)
  To: git
  Cc: sbeller, gitster, peff, philipoakley, stolee, jrnieder, Brandon Williams

In order to prepare for the addition of protocol_v2, push the protocol
version discovery outside of 'get_remote_heads()'.  This will allow for
keeping the logic for processing the reference advertisement for
protocol_v1 and protocol_v0 separate from the logic for protocol_v2.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 builtin/fetch-pack.c | 16 +++++++++++++++-
 builtin/send-pack.c  | 17 +++++++++++++++--
 connect.c            | 15 ++++-----------
 connect.h            |  3 +++
 remote-curl.c        | 20 ++++++++++++++++++--
 remote.h             |  5 +++--
 transport.c          | 24 +++++++++++++++++++-----
 7 files changed, 77 insertions(+), 23 deletions(-)

diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
index 366b9d13f..85d4faf76 100644
--- a/builtin/fetch-pack.c
+++ b/builtin/fetch-pack.c
@@ -4,6 +4,7 @@
 #include "remote.h"
 #include "connect.h"
 #include "sha1-array.h"
+#include "protocol.h"
 
 static const char fetch_pack_usage[] =
 "git fetch-pack [--all] [--stdin] [--quiet | -q] [--keep | -k] [--thin] "
@@ -52,6 +53,7 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix)
 	struct fetch_pack_args args;
 	struct oid_array shallow = OID_ARRAY_INIT;
 	struct string_list deepen_not = STRING_LIST_INIT_DUP;
+	struct packet_reader reader;
 
 	packet_trace_identity("fetch-pack");
 
@@ -193,7 +195,19 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix)
 		if (!conn)
 			return args.diag_url ? 0 : 1;
 	}
-	get_remote_heads(fd[0], NULL, 0, &ref, 0, NULL, &shallow);
+
+	packet_reader_init(&reader, fd[0], NULL, 0,
+			   PACKET_READ_CHOMP_NEWLINE |
+			   PACKET_READ_GENTLE_ON_EOF);
+
+	switch (discover_version(&reader)) {
+	case protocol_v1:
+	case protocol_v0:
+		get_remote_heads(&reader, &ref, 0, NULL, &shallow);
+		break;
+	case protocol_unknown_version:
+		BUG("unknown protocol version");
+	}
 
 	ref = fetch_pack(&args, fd, conn, ref, dest, sought, nr_sought,
 			 &shallow, pack_lockfile_ptr);
diff --git a/builtin/send-pack.c b/builtin/send-pack.c
index fc4f0bb5f..83cb125a6 100644
--- a/builtin/send-pack.c
+++ b/builtin/send-pack.c
@@ -14,6 +14,7 @@
 #include "sha1-array.h"
 #include "gpg-interface.h"
 #include "gettext.h"
+#include "protocol.h"
 
 static const char * const send_pack_usage[] = {
 	N_("git send-pack [--all | --mirror] [--dry-run] [--force] "
@@ -154,6 +155,7 @@ int cmd_send_pack(int argc, const char **argv, const char *prefix)
 	int progress = -1;
 	int from_stdin = 0;
 	struct push_cas_option cas = {0};
+	struct packet_reader reader;
 
 	struct option options[] = {
 		OPT__VERBOSITY(&verbose),
@@ -256,8 +258,19 @@ int cmd_send_pack(int argc, const char **argv, const char *prefix)
 			args.verbose ? CONNECT_VERBOSE : 0);
 	}
 
-	get_remote_heads(fd[0], NULL, 0, &remote_refs, REF_NORMAL,
-			 &extra_have, &shallow);
+	packet_reader_init(&reader, fd[0], NULL, 0,
+			   PACKET_READ_CHOMP_NEWLINE |
+			   PACKET_READ_GENTLE_ON_EOF);
+
+	switch (discover_version(&reader)) {
+	case protocol_v1:
+	case protocol_v0:
+		get_remote_heads(&reader, &remote_refs, REF_NORMAL,
+				 &extra_have, &shallow);
+		break;
+	case protocol_unknown_version:
+		BUG("unknown protocol version");
+	}
 
 	transport_verify_remote_names(nr_refspecs, refspecs);
 
diff --git a/connect.c b/connect.c
index 03bbb74e4..1787b0212 100644
--- a/connect.c
+++ b/connect.c
@@ -62,7 +62,7 @@ static void die_initial_contact(int unexpected)
 		      "and the repository exists."));
 }
 
-static enum protocol_version discover_version(struct packet_reader *reader)
+enum protocol_version discover_version(struct packet_reader *reader)
 {
 	enum protocol_version version = protocol_unknown_version;
 
@@ -231,7 +231,7 @@ static int process_shallow(int len, struct oid_array *shallow_points)
 /*
  * Read all the refs from the other end
  */
-struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
+struct ref **get_remote_heads(struct packet_reader *reader,
 			      struct ref **list, unsigned int flags,
 			      struct oid_array *extra_have,
 			      struct oid_array *shallow_points)
@@ -239,23 +239,16 @@ struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
 	struct ref **orig_list = list;
 	int len = 0;
 	int state = EXPECTING_FIRST_REF;
-	struct packet_reader reader;
 	const char *arg;
 
-	packet_reader_init(&reader, in, src_buf, src_len,
-			   PACKET_READ_CHOMP_NEWLINE |
-			   PACKET_READ_GENTLE_ON_EOF);
-
-	discover_version(&reader);
-
 	*list = NULL;
 
 	while (state != EXPECTING_DONE) {
-		switch (packet_reader_read(&reader)) {
+		switch (packet_reader_read(reader)) {
 		case PACKET_READ_EOF:
 			die_initial_contact(1);
 		case PACKET_READ_NORMAL:
-			len = reader.pktlen;
+			len = reader->pktlen;
 			if (len > 4 && skip_prefix(packet_buffer, "ERR ", &arg))
 				die("remote error: %s", arg);
 			break;
diff --git a/connect.h b/connect.h
index 01f14cdf3..cdb8979dc 100644
--- a/connect.h
+++ b/connect.h
@@ -13,4 +13,7 @@ extern int parse_feature_request(const char *features, const char *feature);
 extern const char *server_feature_value(const char *feature, int *len_ret);
 extern int url_is_local_not_ssh(const char *url);
 
+struct packet_reader;
+extern enum protocol_version discover_version(struct packet_reader *reader);
+
 #endif
diff --git a/remote-curl.c b/remote-curl.c
index 0053b0954..9f6d07683 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -1,6 +1,7 @@
 #include "cache.h"
 #include "config.h"
 #include "remote.h"
+#include "connect.h"
 #include "strbuf.h"
 #include "walker.h"
 #include "http.h"
@@ -13,6 +14,7 @@
 #include "credential.h"
 #include "sha1-array.h"
 #include "send-pack.h"
+#include "protocol.h"
 
 static struct remote *remote;
 /* always ends with a trailing slash */
@@ -176,8 +178,22 @@ static struct discovery *last_discovery;
 static struct ref *parse_git_refs(struct discovery *heads, int for_push)
 {
 	struct ref *list = NULL;
-	get_remote_heads(-1, heads->buf, heads->len, &list,
-			 for_push ? REF_NORMAL : 0, NULL, &heads->shallow);
+	struct packet_reader reader;
+
+	packet_reader_init(&reader, -1, heads->buf, heads->len,
+			   PACKET_READ_CHOMP_NEWLINE |
+			   PACKET_READ_GENTLE_ON_EOF);
+
+	switch (discover_version(&reader)) {
+	case protocol_v1:
+	case protocol_v0:
+		get_remote_heads(&reader, &list, for_push ? REF_NORMAL : 0,
+				 NULL, &heads->shallow);
+		break;
+	case protocol_unknown_version:
+		BUG("unknown protocol version");
+	}
+
 	return list;
 }
 
diff --git a/remote.h b/remote.h
index 1f6611be2..2016461df 100644
--- a/remote.h
+++ b/remote.h
@@ -150,10 +150,11 @@ int check_ref_type(const struct ref *ref, int flags);
 void free_refs(struct ref *ref);
 
 struct oid_array;
-extern struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
+struct packet_reader;
+extern struct ref **get_remote_heads(struct packet_reader *reader,
 				     struct ref **list, unsigned int flags,
 				     struct oid_array *extra_have,
-				     struct oid_array *shallow);
+				     struct oid_array *shallow_points);
 
 int resolve_remote_symref(struct ref *ref, struct ref *list);
 int ref_newer(const struct object_id *new_oid, const struct object_id *old_oid);
diff --git a/transport.c b/transport.c
index 8e8779096..63c3dbab9 100644
--- a/transport.c
+++ b/transport.c
@@ -18,6 +18,7 @@
 #include "sha1-array.h"
 #include "sigchain.h"
 #include "transport-internal.h"
+#include "protocol.h"
 
 static void set_upstreams(struct transport *transport, struct ref *refs,
 	int pretend)
@@ -190,13 +191,26 @@ static int connect_setup(struct transport *transport, int for_push)
 static struct ref *get_refs_via_connect(struct transport *transport, int for_push)
 {
 	struct git_transport_data *data = transport->data;
-	struct ref *refs;
+	struct ref *refs = NULL;
+	struct packet_reader reader;
 
 	connect_setup(transport, for_push);
-	get_remote_heads(data->fd[0], NULL, 0, &refs,
-			 for_push ? REF_NORMAL : 0,
-			 &data->extra_have,
-			 &data->shallow);
+
+	packet_reader_init(&reader, data->fd[0], NULL, 0,
+			   PACKET_READ_CHOMP_NEWLINE |
+			   PACKET_READ_GENTLE_ON_EOF);
+
+	switch (discover_version(&reader)) {
+	case protocol_v1:
+	case protocol_v0:
+		get_remote_heads(&reader, &refs,
+				 for_push ? REF_NORMAL : 0,
+				 &data->extra_have,
+				 &data->shallow);
+		break;
+	case protocol_unknown_version:
+		BUG("unknown protocol version");
+	}
 	data->got_remote_heads = 1;
 
 	return refs;
-- 
2.15.1.620.gb9897f4670-goog



* [PATCH 09/26] transport: store protocol version
  2018-01-03  0:18 [PATCH 00/26] protocol version 2 Brandon Williams
                   ` (7 preceding siblings ...)
  2018-01-03  0:18 ` [PATCH 08/26] connect: discover protocol version outside of get_remote_heads Brandon Williams
@ 2018-01-03  0:18 ` Brandon Williams
  2018-01-09 18:41   ` Jonathan Tan
  2018-01-03  0:18 ` [PATCH 10/26] protocol: introduce enum protocol_version value protocol_v2 Brandon Williams
                   ` (18 subsequent siblings)
  27 siblings, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-01-03  0:18 UTC (permalink / raw)
  To: git
  Cc: sbeller, gitster, peff, philipoakley, stolee, jrnieder, Brandon Williams

Once protocol_v2 is introduced, requesting a fetch or a push will need to
be handled differently depending on the protocol version.  Store the
protocol version the server is speaking in 'struct git_transport_data'
and use it to determine what to do in the case of a fetch or a push.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 transport.c | 35 ++++++++++++++++++++++++++---------
 1 file changed, 26 insertions(+), 9 deletions(-)

diff --git a/transport.c b/transport.c
index 63c3dbab9..2378dcb38 100644
--- a/transport.c
+++ b/transport.c
@@ -118,6 +118,7 @@ struct git_transport_data {
 	struct child_process *conn;
 	int fd[2];
 	unsigned got_remote_heads : 1;
+	enum protocol_version version;
 	struct oid_array extra_have;
 	struct oid_array shallow;
 };
@@ -200,7 +201,8 @@ static struct ref *get_refs_via_connect(struct transport *transport, int for_pus
 			   PACKET_READ_CHOMP_NEWLINE |
 			   PACKET_READ_GENTLE_ON_EOF);
 
-	switch (discover_version(&reader)) {
+	data->version = discover_version(&reader);
+	switch (data->version) {
 	case protocol_v1:
 	case protocol_v0:
 		get_remote_heads(&reader, &refs,
@@ -221,7 +223,7 @@ static int fetch_refs_via_pack(struct transport *transport,
 {
 	int ret = 0;
 	struct git_transport_data *data = transport->data;
-	struct ref *refs;
+	struct ref *refs = NULL;
 	char *dest = xstrdup(transport->url);
 	struct fetch_pack_args args;
 	struct ref *refs_tmp = NULL;
@@ -247,10 +249,18 @@ static int fetch_refs_via_pack(struct transport *transport,
 	if (!data->got_remote_heads)
 		refs_tmp = get_refs_via_connect(transport, 0);
 
-	refs = fetch_pack(&args, data->fd, data->conn,
-			  refs_tmp ? refs_tmp : transport->remote_refs,
-			  dest, to_fetch, nr_heads, &data->shallow,
-			  &transport->pack_lockfile);
+	switch (data->version) {
+	case protocol_v1:
+	case protocol_v0:
+		refs = fetch_pack(&args, data->fd, data->conn,
+				  refs_tmp ? refs_tmp : transport->remote_refs,
+				  dest, to_fetch, nr_heads, &data->shallow,
+				  &transport->pack_lockfile);
+		break;
+	case protocol_unknown_version:
+		BUG("unknown protocol version");
+	}
+
 	close(data->fd[0]);
 	close(data->fd[1]);
 	if (finish_connect(data->conn))
@@ -549,7 +559,7 @@ static int git_transport_push(struct transport *transport, struct ref *remote_re
 {
 	struct git_transport_data *data = transport->data;
 	struct send_pack_args args;
-	int ret;
+	int ret = 0;
 
 	if (!data->got_remote_heads)
 		get_refs_via_connect(transport, 1);
@@ -574,8 +584,15 @@ static int git_transport_push(struct transport *transport, struct ref *remote_re
 	else
 		args.push_cert = SEND_PACK_PUSH_CERT_NEVER;
 
-	ret = send_pack(&args, data->fd, data->conn, remote_refs,
-			&data->extra_have);
+	switch (data->version) {
+	case protocol_v1:
+	case protocol_v0:
+		ret = send_pack(&args, data->fd, data->conn, remote_refs,
+				&data->extra_have);
+		break;
+	case protocol_unknown_version:
+		BUG("unknown protocol version");
+	}
 
 	close(data->fd[1]);
 	close(data->fd[0]);
-- 
2.15.1.620.gb9897f4670-goog



* [PATCH 10/26] protocol: introduce enum protocol_version value protocol_v2
  2018-01-03  0:18 [PATCH 00/26] protocol version 2 Brandon Williams
                   ` (8 preceding siblings ...)
  2018-01-03  0:18 ` [PATCH 09/26] transport: store protocol version Brandon Williams
@ 2018-01-03  0:18 ` Brandon Williams
  2018-01-03  0:18 ` [PATCH 11/26] serve: introduce git-serve Brandon Williams
                   ` (17 subsequent siblings)
  27 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-01-03  0:18 UTC (permalink / raw)
  To: git
  Cc: sbeller, gitster, peff, philipoakley, stolee, jrnieder, Brandon Williams

Introduce protocol_v2, a new value for 'enum protocol_version'.
Subsequent patches will fill in the implementation of protocol_v2.
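
For illustration only, here is one way a client might map the first
line of a response onto this enum.  The helper name
'version_from_first_line()' is hypothetical, and treating any line
without a "version " prefix as v0 is an assumption of this sketch,
based on v0 servers opening with a ref line instead of a version line.

```c
#include <assert.h>
#include <string.h>

/* Mirrors the values in protocol.h after this patch. */
enum protocol_version {
	protocol_unknown_version = -1,
	protocol_v0 = 0,
	protocol_v1 = 1,
	protocol_v2 = 2,
};

/*
 * Hypothetical helper: classify the first line of a server response.
 * 'line' is expected to have its trailing newline already removed.
 */
static enum protocol_version version_from_first_line(const char *line)
{
	static const char prefix[] = "version ";
	const char *value;

	assert(line != NULL);
	if (strncmp(line, prefix, sizeof(prefix) - 1))
		return protocol_v0;     /* no version line: assume a v0 ref */
	value = line + sizeof(prefix) - 1;
	if (!strcmp(value, "1"))
		return protocol_v1;
	if (!strcmp(value, "2"))
		return protocol_v2;
	return protocol_unknown_version;
}
```

A caller would then switch on the result, much like the call sites
touched by this patch, and die on protocol_unknown_version.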

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 builtin/fetch-pack.c   | 3 +++
 builtin/receive-pack.c | 6 ++++++
 builtin/send-pack.c    | 3 +++
 connect.c              | 3 +++
 protocol.c             | 2 ++
 protocol.h             | 1 +
 remote-curl.c          | 3 +++
 transport.c            | 9 +++++++++
 upload-pack.c          | 6 ++++++
 9 files changed, 36 insertions(+)

diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
index 85d4faf76..f492e8abd 100644
--- a/builtin/fetch-pack.c
+++ b/builtin/fetch-pack.c
@@ -201,6 +201,9 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix)
 			   PACKET_READ_GENTLE_ON_EOF);
 
 	switch (discover_version(&reader)) {
+	case protocol_v2:
+		die("support for protocol v2 not implemented yet");
+		break;
 	case protocol_v1:
 	case protocol_v0:
 		get_remote_heads(&reader, &ref, 0, NULL, &shallow);
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index b7ce7c7f5..3656e94fd 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -1963,6 +1963,12 @@ int cmd_receive_pack(int argc, const char **argv, const char *prefix)
 		unpack_limit = receive_unpack_limit;
 
 	switch (determine_protocol_version_server()) {
+	case protocol_v2:
+		/*
+		 * push support for protocol v2 has not been implemented yet,
+		 * so ignore the request to use v2 and fall back to using v0.
+		 */
+		break;
 	case protocol_v1:
 		/*
 		 * v1 is just the original protocol with a version string,
diff --git a/builtin/send-pack.c b/builtin/send-pack.c
index 83cb125a6..b5427f75e 100644
--- a/builtin/send-pack.c
+++ b/builtin/send-pack.c
@@ -263,6 +263,9 @@ int cmd_send_pack(int argc, const char **argv, const char *prefix)
 			   PACKET_READ_GENTLE_ON_EOF);
 
 	switch (discover_version(&reader)) {
+	case protocol_v2:
+		die("support for protocol v2 not implemented yet");
+		break;
 	case protocol_v1:
 	case protocol_v0:
 		get_remote_heads(&reader, &remote_refs, REF_NORMAL,
diff --git a/connect.c b/connect.c
index 1787b0212..caa539b75 100644
--- a/connect.c
+++ b/connect.c
@@ -84,6 +84,9 @@ enum protocol_version discover_version(struct packet_reader *reader)
 
 	/* Maybe process capabilities here, at least for v2 */
 	switch (version) {
+	case protocol_v2:
+		die("support for protocol v2 not implemented yet");
+		break;
 	case protocol_v1:
 		/* Read the peeked version line */
 		packet_reader_read(reader);
diff --git a/protocol.c b/protocol.c
index 43012b7eb..5e636785d 100644
--- a/protocol.c
+++ b/protocol.c
@@ -8,6 +8,8 @@ static enum protocol_version parse_protocol_version(const char *value)
 		return protocol_v0;
 	else if (!strcmp(value, "1"))
 		return protocol_v1;
+	else if (!strcmp(value, "2"))
+		return protocol_v2;
 	else
 		return protocol_unknown_version;
 }
diff --git a/protocol.h b/protocol.h
index 1b2bc94a8..2ad35e433 100644
--- a/protocol.h
+++ b/protocol.h
@@ -5,6 +5,7 @@ enum protocol_version {
 	protocol_unknown_version = -1,
 	protocol_v0 = 0,
 	protocol_v1 = 1,
+	protocol_v2 = 2,
 };
 
 /*
diff --git a/remote-curl.c b/remote-curl.c
index 9f6d07683..dae8a4a48 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -185,6 +185,9 @@ static struct ref *parse_git_refs(struct discovery *heads, int for_push)
 			   PACKET_READ_GENTLE_ON_EOF);
 
 	switch (discover_version(&reader)) {
+	case protocol_v2:
+		die("support for protocol v2 not implemented yet");
+		break;
 	case protocol_v1:
 	case protocol_v0:
 		get_remote_heads(&reader, &list, for_push ? REF_NORMAL : 0,
diff --git a/transport.c b/transport.c
index 2378dcb38..83d9dd1df 100644
--- a/transport.c
+++ b/transport.c
@@ -203,6 +203,9 @@ static struct ref *get_refs_via_connect(struct transport *transport, int for_pus
 
 	data->version = discover_version(&reader);
 	switch (data->version) {
+	case protocol_v2:
+		die("support for protocol v2 not implemented yet");
+		break;
 	case protocol_v1:
 	case protocol_v0:
 		get_remote_heads(&reader, &refs,
@@ -250,6 +253,9 @@ static int fetch_refs_via_pack(struct transport *transport,
 		refs_tmp = get_refs_via_connect(transport, 0);
 
 	switch (data->version) {
+	case protocol_v2:
+		die("support for protocol v2 not implemented yet");
+		break;
 	case protocol_v1:
 	case protocol_v0:
 		refs = fetch_pack(&args, data->fd, data->conn,
@@ -585,6 +591,9 @@ static int git_transport_push(struct transport *transport, struct ref *remote_re
 		args.push_cert = SEND_PACK_PUSH_CERT_NEVER;
 
 	switch (data->version) {
+	case protocol_v2:
+		die("support for protocol v2 not implemented yet");
+		break;
 	case protocol_v1:
 	case protocol_v0:
 		ret = send_pack(&args, data->fd, data->conn, remote_refs,
diff --git a/upload-pack.c b/upload-pack.c
index 9a507ae53..2bc888fc1 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -1104,6 +1104,12 @@ int cmd_upload_pack(int argc, const char **argv, const char *prefix)
 	git_config(upload_pack_config, NULL);
 
 	switch (determine_protocol_version_server()) {
+	case protocol_v2:
+		/*
+		 * fetch support for protocol v2 has not been implemented yet,
+		 * so ignore the request to use v2 and fall back to using v0.
+		 */
+		break;
 	case protocol_v1:
 		/*
 		 * v1 is just the original protocol with a version string,
-- 
2.15.1.620.gb9897f4670-goog



* [PATCH 11/26] serve: introduce git-serve
  2018-01-03  0:18 [PATCH 00/26] protocol version 2 Brandon Williams
                   ` (9 preceding siblings ...)
  2018-01-03  0:18 ` [PATCH 10/26] protocol: introduce enum protocol_version value protocol_v2 Brandon Williams
@ 2018-01-03  0:18 ` Brandon Williams
  2018-01-09 20:24   ` Jonathan Tan
  2018-02-01 18:48   ` Jeff Hostetler
  2018-01-03  0:18 ` [PATCH 12/26] ls-refs: introduce ls-refs server command Brandon Williams
                   ` (16 subsequent siblings)
  27 siblings, 2 replies; 362+ messages in thread
From: Brandon Williams @ 2018-01-03  0:18 UTC (permalink / raw)
  To: git
  Cc: sbeller, gitster, peff, philipoakley, stolee, jrnieder, Brandon Williams

Introduce git-serve, the base server for protocol version 2.

Protocol version 2 is intended to be a replacement for Git's current
wire protocol.  The intention is that it will be a simpler, less
wasteful protocol which can evolve over time.

Protocol version 2 improves upon version 1 by eliminating the initial
ref advertisement.  In its place a server will export a list of
capabilities and commands which it supports in a capability
advertisement.  A client can then request that a particular command be
executed by providing a number of capabilities and command specific
parameters.  At the completion of a command, a client can request that
another command be executed or can terminate the connection by sending a
flush packet.
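
To make the framing concrete, the sketch below builds the smallest
possible command request: a single "command=<name>" pkt-line followed
by a flush packet.  It assumes the usual pkt-line framing (four hex
digits giving the total length, header included) and omits the
capability list and the optional delim-pkt plus arguments.  The helper
names 'pkt_line()' and 'command_request()' are made up for this
example and do not exist in the series.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Write one pkt-line (4 hex digits of total length, then the payload)
 * into 'out'.  Returns the number of bytes written, or 0 if the
 * packet does not fit in 'cap' bytes including the trailing NUL.
 */
static size_t pkt_line(char *out, size_t cap, const char *payload)
{
	size_t total;

	assert(out && payload);
	total = strlen(payload) + 4;
	if (total > 0xffff || total + 1 > cap)
		return 0;
	snprintf(out, cap, "%04zx%s", total, payload);
	return total;
}

/*
 * Build "command=<name>" LF as one pkt-line followed by a flush-pkt.
 * Returns the request length, or 0 if it does not fit.
 */
static size_t command_request(char *out, size_t cap, const char *name)
{
	char payload[128];
	size_t n;

	if (snprintf(payload, sizeof(payload), "command=%s\n", name) >=
	    (int)sizeof(payload))
		return 0;
	n = pkt_line(out, cap, payload);
	if (!n || n + 4 + 1 > cap)
		return 0;
	memcpy(out + n, "0000", 5);     /* flush-pkt plus terminating NUL */
	return n + 4;
}
```

For the command name "ls-refs" this produces the 24 bytes
"0014command=ls-refs\n0000": the payload is 16 bytes, so the length
header is 0x0014.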

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 .gitignore                              |   1 +
 Documentation/technical/protocol-v2.txt |  91 ++++++++++++
 Makefile                                |   2 +
 builtin.h                               |   1 +
 builtin/serve.c                         |  30 ++++
 git.c                                   |   1 +
 serve.c                                 | 239 ++++++++++++++++++++++++++++++++
 serve.h                                 |  15 ++
 8 files changed, 380 insertions(+)
 create mode 100644 Documentation/technical/protocol-v2.txt
 create mode 100644 builtin/serve.c
 create mode 100644 serve.c
 create mode 100644 serve.h

diff --git a/.gitignore b/.gitignore
index 833ef3b0b..2d0450c26 100644
--- a/.gitignore
+++ b/.gitignore
@@ -140,6 +140,7 @@
 /git-rm
 /git-send-email
 /git-send-pack
+/git-serve
 /git-sh-i18n
 /git-sh-i18n--envsubst
 /git-sh-setup
diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
new file mode 100644
index 000000000..b87ba3816
--- /dev/null
+++ b/Documentation/technical/protocol-v2.txt
@@ -0,0 +1,91 @@
+ Git Wire Protocol, Version 2
+==============================
+
+This document presents a specification for a version 2 of Git's wire
+protocol.  Protocol v2 will improve upon v1 in the following ways:
+
+  * Instead of multiple service names, multiple commands will be
+    supported by a single service.
+  * Easily extendable as capabilities are moved into their own section
+    of the protocol, no longer being hidden behind a NUL byte and
+    limited by the size of a pkt-line (as there will be a single
+    capability per pkt-line).
+  * Separate out other information hidden behind NUL bytes (e.g. agent
+    string as a capability and symrefs can be requested using 'ls-refs')
+  * Reference advertisement will be omitted unless explicitly requested
+  * ls-refs command to explicitly request some refs
+
+ Detailed Design
+=================
+
+A client can request to speak protocol v2 by sending `version=2` in the
+side-channel `GIT_PROTOCOL` in the initial request to the server.
+
+In protocol v2 communication is command oriented.  When first contacting a
+server a list of capabilities will be advertised.  Some of these capabilities
+will be commands which a client can request be executed.  Once a command
+has completed, a client can reuse the connection and request that other
+commands be executed.
+
+ Special Packets
+-----------------
+
+In protocol v2 these special packets will have the following semantics:
+
+  * '0000' Flush Packet (flush-pkt) - indicates the end of a message
+  * '0001' Delimiter Packet (delim-pkt) - separates sections of a message
+
+ Capability Advertisement
+--------------------------
+
+A server which decides to communicate (based on a request from a client)
+using protocol version 2, notifies the client by sending a version string
+in its initial response followed by an advertisement of its capabilities.
+Each capability is a key with an optional value.  Clients must ignore all
+unknown keys.  Semantics of unknown values are left to the definition of
+each key.  Some capabilities will describe commands which can be requested
+to be executed by the client.
+
+    capability-advertisement = protocol-version
+			       capability-list
+			       flush-pkt
+
+    protocol-version = PKT-LINE("version 2" LF)
+    capability-list = *capability
+    capability = PKT-LINE(key[=value] LF)
+
+    key = 1*CHAR
+    value = 1*CHAR
+    CHAR = 1*(ALPHA / DIGIT / "-" / "_")
+
+A client then responds to select the command it wants with any particular
+capabilities or arguments.  There is then an optional section where the
+client can provide any command specific parameters or queries.
+
+    command-request = command
+		      capability-list
+		      (command-args)
+		      flush-pkt
+    command = PKT-LINE("command=" key LF)
+    command-args = delim-pkt
+		   *arg
+    arg = 1*CHAR
+
+The server will then check to ensure that the client's request is
+comprised of a valid command as well as valid capabilities which were
+advertised.  If the request is valid the server will then execute the
+command.
+
+A particular command can last for as many rounds as are required to
+complete the service (multiple for negotiation during fetch or no
+additional trips in the case of ls-refs).
+
+When finished a client should send an empty request of just a flush-pkt to
+terminate the connection.
+
+ Commands in v2
+~~~~~~~~~~~~~~~~
+
+Commands are the core actions that a client wants to perform (fetch, push,
+etc).  Each command will be provided with a list of capabilities and
+arguments as requested by a client.
diff --git a/Makefile b/Makefile
index e0740b452..5f3b5fe8b 100644
--- a/Makefile
+++ b/Makefile
@@ -876,6 +876,7 @@ LIB_OBJS += revision.o
 LIB_OBJS += run-command.o
 LIB_OBJS += send-pack.o
 LIB_OBJS += sequencer.o
+LIB_OBJS += serve.o
 LIB_OBJS += server-info.o
 LIB_OBJS += setup.o
 LIB_OBJS += sha1-array.o
@@ -1009,6 +1010,7 @@ BUILTIN_OBJS += builtin/rev-parse.o
 BUILTIN_OBJS += builtin/revert.o
 BUILTIN_OBJS += builtin/rm.o
 BUILTIN_OBJS += builtin/send-pack.o
+BUILTIN_OBJS += builtin/serve.o
 BUILTIN_OBJS += builtin/shortlog.o
 BUILTIN_OBJS += builtin/show-branch.o
 BUILTIN_OBJS += builtin/show-ref.o
diff --git a/builtin.h b/builtin.h
index f332a1257..3f3fdfc28 100644
--- a/builtin.h
+++ b/builtin.h
@@ -215,6 +215,7 @@ extern int cmd_rev_parse(int argc, const char **argv, const char *prefix);
 extern int cmd_revert(int argc, const char **argv, const char *prefix);
 extern int cmd_rm(int argc, const char **argv, const char *prefix);
 extern int cmd_send_pack(int argc, const char **argv, const char *prefix);
+extern int cmd_serve(int argc, const char **argv, const char *prefix);
 extern int cmd_shortlog(int argc, const char **argv, const char *prefix);
 extern int cmd_show(int argc, const char **argv, const char *prefix);
 extern int cmd_show_branch(int argc, const char **argv, const char *prefix);
diff --git a/builtin/serve.c b/builtin/serve.c
new file mode 100644
index 000000000..bb726786a
--- /dev/null
+++ b/builtin/serve.c
@@ -0,0 +1,30 @@
+#include "cache.h"
+#include "builtin.h"
+#include "parse-options.h"
+#include "serve.h"
+
+static char const * const serve_usage[] = {
+	N_("git serve [<options>]"),
+	NULL
+};
+
+int cmd_serve(int argc, const char **argv, const char *prefix)
+{
+	struct serve_options opts = SERVE_OPTIONS_INIT;
+
+	struct option options[] = {
+		OPT_BOOL(0, "stateless-rpc", &opts.stateless_rpc,
+			 N_("quit after a single request/response exchange")),
+		OPT_BOOL(0, "advertise-capabilities", &opts.advertise_capabilities,
+			 N_("exit immediately after advertising capabilities")),
+		OPT_END()
+	};
+
+	/* ignore all unknown cmdline switches for now */
+	argc = parse_options(argc, argv, prefix, options, serve_usage,
+			     PARSE_OPT_KEEP_DASHDASH |
+			     PARSE_OPT_KEEP_UNKNOWN);
+	serve(&opts);
+
+	return 0;
+}
diff --git a/git.c b/git.c
index f71073dc8..f85d682b6 100644
--- a/git.c
+++ b/git.c
@@ -461,6 +461,7 @@ static struct cmd_struct commands[] = {
 	{ "revert", cmd_revert, RUN_SETUP | NEED_WORK_TREE },
 	{ "rm", cmd_rm, RUN_SETUP },
 	{ "send-pack", cmd_send_pack, RUN_SETUP },
+	{ "serve", cmd_serve, RUN_SETUP },
 	{ "shortlog", cmd_shortlog, RUN_SETUP_GENTLY | USE_PAGER },
 	{ "show", cmd_show, RUN_SETUP },
 	{ "show-branch", cmd_show_branch, RUN_SETUP },
diff --git a/serve.c b/serve.c
new file mode 100644
index 000000000..da8127775
--- /dev/null
+++ b/serve.c
@@ -0,0 +1,239 @@
+#include "cache.h"
+#include "repository.h"
+#include "config.h"
+#include "pkt-line.h"
+#include "version.h"
+#include "argv-array.h"
+#include "serve.h"
+
+static int always_advertise(struct repository *r,
+			    struct strbuf *value)
+{
+	return 1;
+}
+
+static int agent_advertise(struct repository *r,
+			   struct strbuf *value)
+{
+	if (value)
+		strbuf_addstr(value, git_user_agent_sanitized());
+	return 1;
+}
+
+struct protocol_capability {
+	const char *name; /* capability name */
+
+	/*
+	 * Function queried to see if a capability should be advertised.
+	 * Optionally a value can be specified by adding it to 'value'.
+	 */
+	int (*advertise)(struct repository *r, struct strbuf *value);
+
+	/*
+	 * Function called when a client requests the capability as a command.
+	 * The command request will be provided to the function via 'keys', the
+	 * capabilities requested, and 'args', the command specific parameters.
+	 *
+	 * This field should be NULL for capabilities which are not commands.
+	 */
+	int (*command)(struct repository *r,
+		       struct argv_array *keys,
+		       struct argv_array *args);
+};
+
+static struct protocol_capability capabilities[] = {
+	{ "agent", agent_advertise, NULL },
+	{ "stateless-rpc", always_advertise, NULL },
+};
+
+static void advertise_capabilities(void)
+{
+	struct strbuf capability = STRBUF_INIT;
+	struct strbuf value = STRBUF_INIT;
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(capabilities); i++) {
+		struct protocol_capability *c = &capabilities[i];
+
+		if (c->advertise(the_repository, &value)) {
+			strbuf_addstr(&capability, c->name);
+
+			if (value.len) {
+				strbuf_addch(&capability, '=');
+				strbuf_addbuf(&capability, &value);
+			}
+
+			strbuf_addch(&capability, '\n');
+			packet_write(1, capability.buf, capability.len);
+		}
+
+		strbuf_reset(&capability);
+		strbuf_reset(&value);
+	}
+
+	packet_flush(1);
+	strbuf_release(&capability);
+	strbuf_release(&value);
+}
+
+static struct protocol_capability *get_capability(const char *key)
+{
+	int i;
+
+	if (!key)
+		return NULL;
+
+	for (i = 0; i < ARRAY_SIZE(capabilities); i++) {
+		struct protocol_capability *c = &capabilities[i];
+		const char *out;
+		if (skip_prefix(key, c->name, &out) && (!*out || *out == '='))
+			return c;
+	}
+
+	return NULL;
+}
+
+static int is_valid_capability(const char *key)
+{
+	const struct protocol_capability *c = get_capability(key);
+
+	return c && c->advertise(the_repository, NULL);
+}
+
+static int is_command(const char *key, struct protocol_capability **command)
+{
+	const char *out;
+
+	if (skip_prefix(key, "command=", &out)) {
+		struct protocol_capability *cmd = get_capability(out);
+
+		if (!cmd || !cmd->advertise(the_repository, NULL) || !cmd->command)
+			die("invalid cmd '%s'", out);
+		if (*command)
+			die("command already requested");
+
+		*command = cmd;
+		return 1;
+	}
+
+	return 0;
+}
+
+int has_capability(const struct argv_array *keys, const char *capability,
+		   const char **value)
+{
+	int i;
+	for (i = 0; i < keys->argc; i++) {
+		const char *out;
+		if (skip_prefix(keys->argv[i], capability, &out) &&
+		    (!*out || *out == '=')) {
+			if (value) {
+				if (*out == '=')
+					out++;
+				*value = out;
+			}
+			return 1;
+		}
+	}
+
+	return 0;
+}
+
+#define PROCESS_REQUEST_KEYS 0
+#define PROCESS_REQUEST_ARGS 1
+#define PROCESS_REQUEST_DONE 2
+
+static int process_request(void)
+{
+	int state = PROCESS_REQUEST_KEYS;
+	struct packet_reader reader;
+	struct argv_array keys = ARGV_ARRAY_INIT;
+	struct argv_array args = ARGV_ARRAY_INIT;
+	struct protocol_capability *command = NULL;
+
+	packet_reader_init(&reader, 0, NULL, 0,
+			   PACKET_READ_CHOMP_NEWLINE);
+
+	while (state != PROCESS_REQUEST_DONE) {
+		switch (packet_reader_read(&reader)) {
+		case PACKET_READ_EOF:
+			BUG("Should have already died when seeing EOF");
+		case PACKET_READ_NORMAL:
+			break;
+		case PACKET_READ_FLUSH:
+			state = PROCESS_REQUEST_DONE;
+			continue;
+		case PACKET_READ_DELIM:
+			if (state != PROCESS_REQUEST_KEYS)
+				die("protocol error");
+			state = PROCESS_REQUEST_ARGS;
+			/*
+			 * maybe include a check to make sure that a
+			 * command/capabilities were given.
+			 */
+			continue;
+		}
+
+		switch (state) {
+		case PROCESS_REQUEST_KEYS:
+			/* collect request; a sequence of keys and values */
+			if (is_command(reader.line, &command) ||
+			    is_valid_capability(reader.line))
+				argv_array_push(&keys, reader.line);
+			break;
+		case PROCESS_REQUEST_ARGS:
+			/* collect arguments for the requested command */
+			argv_array_push(&args, reader.line);
+			break;
+		case PROCESS_REQUEST_DONE:
+			continue;
+		default:
+			BUG("invalid state");
+		}
+	}
+
+	/*
+	 * If no keys and no arguments were given (an empty request), then
+	 * the client wants to terminate the connection.
+	 */
+	if (!keys.argc && !args.argc)
+		return 1;
+
+	if (!command)
+		die("no command requested");
+
+	command->command(the_repository, &keys, &args);
+
+	argv_array_clear(&keys);
+	argv_array_clear(&args);
+	return 0;
+}
+
+/* Main serve loop for protocol version 2 */
+void serve(struct serve_options *options)
+{
+	if (options->advertise_capabilities || !options->stateless_rpc) {
+		/* serve by default supports v2 */
+		packet_write_fmt(1, "version 2\n");
+
+		advertise_capabilities();
+		/*
+		 * If only the list of capabilities was requested exit
+		 * immediately after advertising capabilities
+		 */
+		if (options->advertise_capabilities)
+			return;
+	}
+
+	/*
+	 * If stateless-rpc was requested then exit after
+	 * a single request/response exchange
+	 */
+	if (options->stateless_rpc) {
+		process_request();
+	} else {
+		for (;;)
+			if (process_request())
+				break;
+	}
+}
diff --git a/serve.h b/serve.h
new file mode 100644
index 000000000..fe65ba9f4
--- /dev/null
+++ b/serve.h
@@ -0,0 +1,15 @@
+#ifndef SERVE_H
+#define SERVE_H
+
+struct argv_array;
+extern int has_capability(const struct argv_array *keys, const char *capability,
+			  const char **value);
+
+struct serve_options {
+	unsigned advertise_capabilities;
+	unsigned stateless_rpc;
+};
+#define SERVE_OPTIONS_INIT { 0 }
+extern void serve(struct serve_options *options);
+
+#endif /* SERVE_H */
-- 
2.15.1.620.gb9897f4670-goog



* [PATCH 12/26] ls-refs: introduce ls-refs server command
  2018-01-03  0:18 [PATCH 00/26] protocol version 2 Brandon Williams
                   ` (10 preceding siblings ...)
  2018-01-03  0:18 ` [PATCH 11/26] serve: introduce git-serve Brandon Williams
@ 2018-01-03  0:18 ` Brandon Williams
  2018-01-04  0:17   ` Stefan Beller
                     ` (2 more replies)
  2018-01-03  0:18 ` [PATCH 13/26] connect: request remote refs using v2 Brandon Williams
                   ` (15 subsequent siblings)
  27 siblings, 3 replies; 362+ messages in thread
From: Brandon Williams @ 2018-01-03  0:18 UTC (permalink / raw)
  To: git
  Cc: sbeller, gitster, peff, philipoakley, stolee, jrnieder, Brandon Williams

Introduce the ls-refs server command.  In protocol v2, the ls-refs
command is used to request the ref advertisement from the server.  Since
it is a command which can be requested (as opposed to mandatory in v1),
a client can send a number of parameters in its request to limit the ref
advertisement based on provided ref-patterns.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
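For illustration only (not part of the patch): a rough sketch of what an
ls-refs request could look like on the wire.  The helper names
(pkt_buf, pkt_raw, pkt_line, build_ls_refs_request) are made up for
this note and are not git's pkt-line API.  It assumes the usual
pkt-line framing (four hex digits giving the total length, header
included), "0000" as the flush-pkt, and "0001" as the delim-pkt that
separates the capability list from the command arguments.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical fixed-size output buffer; not git's strbuf. */
struct pkt_buf {
	char data[512];
	size_t len;
};

/* Append a NUL-terminated string verbatim; returns -1 if it does not fit. */
static int pkt_raw(struct pkt_buf *b, const char *s)
{
	size_t n = strlen(s);

	if (b->len + n + 1 > sizeof(b->data))
		return -1;
	memcpy(b->data + b->len, s, n + 1);
	b->len += n;
	return 0;
}

/* Append one pkt-line: 4 hex digits (payload length + 4), then the payload. */
static int pkt_line(struct pkt_buf *b, const char *payload)
{
	char hdr[24];
	size_t total = strlen(payload) + 4;

	if (total > 0xffff)	/* must fit in four hex digits */
		return -1;
	snprintf(hdr, sizeof(hdr), "%04zx", total);
	return (pkt_raw(b, hdr) || pkt_raw(b, payload)) ? -1 : 0;
}

/* Build: command, delim-pkt, arguments, flush-pkt. */
static int build_ls_refs_request(struct pkt_buf *b, const char *pattern)
{
	char arg[256];

	b->len = 0;
	b->data[0] = '\0';
	if (snprintf(arg, sizeof(arg), "ref-pattern %s\n", pattern) >=
	    (int)sizeof(arg))
		return -1;
	if (pkt_line(b, "command=ls-refs\n") ||
	    pkt_raw(b, "0001") ||		/* delim-pkt (assumed value) */
	    pkt_line(b, "peel\n") ||
	    pkt_line(b, "symrefs\n") ||
	    pkt_line(b, arg) ||
	    pkt_raw(b, "0000"))			/* flush-pkt */
		return -1;
	return 0;
}
```

Calling build_ls_refs_request(&b, "master") leaves the framed request in
b.data; the server side in this patch would see "peel", "symrefs" and
"ref-pattern master" as the command arguments.
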
 Documentation/technical/protocol-v2.txt | 26 +++++++++
 Makefile                                |  1 +
 ls-refs.c                               | 97 +++++++++++++++++++++++++++++++++
 ls-refs.h                               |  9 +++
 serve.c                                 |  2 +
 5 files changed, 135 insertions(+)
 create mode 100644 ls-refs.c
 create mode 100644 ls-refs.h

diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
index b87ba3816..5f4d0e719 100644
--- a/Documentation/technical/protocol-v2.txt
+++ b/Documentation/technical/protocol-v2.txt
@@ -89,3 +89,29 @@ terminate the connection.
 Commands are the core actions that a client wants to perform (fetch, push,
 etc).  Each command will be provided with a list of capabilities and
 arguments as requested by a client.
+
+ Ls-refs
+---------
+
+Ls-refs is the command used to request a reference advertisement in v2.
+Unlike the current reference advertisement, ls-refs takes in parameters
+which can be used to limit the refs sent from the server.
+
+Ls-refs takes in the following parameters wrapped in packet-lines:
+
+  symrefs: When showing a symbolic ref, show the underlying ref it
+	   points to in addition to the object it points to.
+  peel: Show peeled tags.
+  ref-pattern <pattern>: When specified, only references matching the
+			 given patterns are displayed.
+
+The output of ls-refs is as follows:
+
+    output = *ref
+	     flush-pkt
+    ref = PKT-LINE((tip | peeled) LF)
+    tip = obj-id SP refname [SP symref-target]
+    peeled = obj-id SP refname "^{}"
+
+    symref = PKT-LINE("symref" SP symbolic-ref SP resolved-ref LF)
+    shallow = PKT-LINE("shallow" SP obj-id LF)
diff --git a/Makefile b/Makefile
index 5f3b5fe8b..152a73bec 100644
--- a/Makefile
+++ b/Makefile
@@ -820,6 +820,7 @@ LIB_OBJS += list-objects-filter-options.o
 LIB_OBJS += ll-merge.o
 LIB_OBJS += lockfile.o
 LIB_OBJS += log-tree.o
+LIB_OBJS += ls-refs.o
 LIB_OBJS += mailinfo.o
 LIB_OBJS += mailmap.o
 LIB_OBJS += match-trees.o
diff --git a/ls-refs.c b/ls-refs.c
new file mode 100644
index 000000000..ac4904a40
--- /dev/null
+++ b/ls-refs.c
@@ -0,0 +1,97 @@
+#include "cache.h"
+#include "repository.h"
+#include "refs.h"
+#include "remote.h"
+#include "argv-array.h"
+#include "ls-refs.h"
+#include "pkt-line.h"
+
+struct ls_refs_data {
+	unsigned peel;
+	unsigned symrefs;
+	struct argv_array patterns;
+};
+
+/*
+ * Check if one of the patterns matches the tail part of the ref.
+ * If no patterns were provided, all refs match.
+ */
+static int ref_match(const struct argv_array *patterns, const char *refname)
+{
+	char *pathbuf;
+	int i;
+
+	if (!patterns->argc)
+		return 1; /* no restriction */
+
+	pathbuf = xstrfmt("/%s", refname);
+	for (i = 0; i < patterns->argc; i++) {
+		if (!wildmatch(patterns->argv[i], pathbuf, 0)) {
+			free(pathbuf);
+			return 1;
+		}
+	}
+	free(pathbuf);
+	return 0;
+}
+
+static int send_ref(const char *refname, const struct object_id *oid,
+		    int flag, void *cb_data)
+{
+	struct ls_refs_data *data = cb_data;
+	const char *refname_nons = strip_namespace(refname);
+	struct strbuf refline = STRBUF_INIT;
+
+	if (!ref_match(&data->patterns, refname))
+		return 0;
+
+	strbuf_addf(&refline, "%s %s", oid_to_hex(oid), refname_nons);
+	if (data->symrefs && flag & REF_ISSYMREF) {
+		struct object_id unused;
+		const char *symref_target = resolve_ref_unsafe(refname, 0,
+							       &unused,
+							       &flag);
+
+		if (!symref_target)
+			die("'%s' is a symref but it is not?", refname);
+
+		strbuf_addf(&refline, " %s", symref_target);
+	}
+
+	strbuf_addch(&refline, '\n');
+
+	packet_write(1, refline.buf, refline.len);
+	if (data->peel) {
+		struct object_id peeled;
+		if (!peel_ref(refname, &peeled))
+			packet_write_fmt(1, "%s %s^{}\n", oid_to_hex(&peeled),
+					 refname_nons);
+	}
+
+	strbuf_release(&refline);
+	return 0;
+}
+
+int ls_refs(struct repository *r, struct argv_array *keys, struct argv_array *args)
+{
+	int i;
+	struct ls_refs_data data = { 0, 0, ARGV_ARRAY_INIT };
+
+	for (i = 0; i < args->argc; i++) {
+		const char *arg = args->argv[i];
+		const char *out;
+
+		if (!strcmp("peel", arg))
+			data.peel = 1;
+		else if (!strcmp("symrefs", arg))
+			data.symrefs = 1;
+		else if (skip_prefix(arg, "ref-pattern ", &out))
+			argv_array_pushf(&data.patterns, "*/%s", out);
+	}
+
+	head_ref_namespaced(send_ref, &data);
+	for_each_namespaced_ref(send_ref, &data);
+	packet_flush(1);
+	argv_array_clear(&data.patterns);
+	return 0;
+}
diff --git a/ls-refs.h b/ls-refs.h
new file mode 100644
index 000000000..9e4c57bfe
--- /dev/null
+++ b/ls-refs.h
@@ -0,0 +1,9 @@
+#ifndef LS_REFS_H
+#define LS_REFS_H
+
+struct repository;
+struct argv_array;
+extern int ls_refs(struct repository *r, struct argv_array *keys,
+		   struct argv_array *args);
+
+#endif /* LS_REFS_H */
diff --git a/serve.c b/serve.c
index da8127775..88d548410 100644
--- a/serve.c
+++ b/serve.c
@@ -4,6 +4,7 @@
 #include "pkt-line.h"
 #include "version.h"
 #include "argv-array.h"
+#include "ls-refs.h"
 #include "serve.h"
 
 static int always_advertise(struct repository *r,
@@ -44,6 +45,7 @@ struct protocol_capability {
 static struct protocol_capability capabilities[] = {
 	{ "agent", agent_advertise, NULL },
 	{ "stateless-rpc", always_advertise, NULL },
+	{ "ls-refs", always_advertise, ls_refs },
 };
 
 static void advertise_capabilities(void)
-- 
2.15.1.620.gb9897f4670-goog



* [PATCH 13/26] connect: request remote refs using v2
  2018-01-03  0:18 [PATCH 00/26] protocol version 2 Brandon Williams
                   ` (11 preceding siblings ...)
  2018-01-03  0:18 ` [PATCH 12/26] ls-refs: introduce ls-refs server command Brandon Williams
@ 2018-01-03  0:18 ` Brandon Williams
  2018-01-09 22:24   ` Jonathan Tan
  2018-01-03  0:18 ` [PATCH 14/26] transport: convert get_refs_list to take a list of ref patterns Brandon Williams
                   ` (14 subsequent siblings)
  27 siblings, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-01-03  0:18 UTC (permalink / raw)
  To: git
  Cc: sbeller, gitster, peff, philipoakley, stolee, jrnieder, Brandon Williams

Teach the client to be able to request a remote's refs using protocol
v2.  This is done by having a client issue a 'ls-refs' request to a v2
server.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
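For illustration only (not part of the patch): a standalone sketch of
how one line of the ls-refs response, "obj-id SP refname" optionally
followed by "SP symref-target", could be split into its fields.  The
struct ref_line type and parse_ref_line() are hypothetical names for
this note; the patch itself does this in process_ref_v2() using
string_list_split().  The sketch assumes 40-character lowercase hex
object ids.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical holder for one parsed response line. */
struct ref_line {
	char oid[41];
	char refname[256];
	char symref[256];	/* empty string when no symref target was sent */
};

/* Returns 0 on success, -1 if the line is malformed. */
static int parse_ref_line(const char *line, struct ref_line *out)
{
	const char *sp = strchr(line, ' ');
	const char *rest;
	size_t n;

	memset(out, 0, sizeof(*out));

	/* The object id is exactly 40 hex digits followed by a space. */
	if (!sp || sp - line != 40)
		return -1;
	if (strspn(line, "0123456789abcdef") != 40)
		return -1;
	memcpy(out->oid, line, 40);

	/* The refname runs to the next space or to the end of the line. */
	rest = sp + 1;
	sp = strchr(rest, ' ');
	n = sp ? (size_t)(sp - rest) : strlen(rest);
	if (!n || n >= sizeof(out->refname))
		return -1;
	memcpy(out->refname, rest, n);

	/* Anything after a second space is the symref target. */
	if (sp) {
		n = strlen(sp + 1);
		if (!n || n >= sizeof(out->symref))
			return -1;
		memcpy(out->symref, sp + 1, n);
	}
	return 0;
}
```

Peeled entries ("obj-id SP refname^{}") go through the same path; the
"^{}" suffix simply stays part of refname in this sketch.
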
 connect.c              | 101 ++++++++++++++++++++++++++++++++++++++++++++++++-
 remote.h               |   4 ++
 t/t5701-protocol-v2.sh |  28 ++++++++++++++
 transport.c            |   2 +-
 upload-pack.c          |   9 +++--
 5 files changed, 138 insertions(+), 6 deletions(-)
 create mode 100755 t/t5701-protocol-v2.sh

diff --git a/connect.c b/connect.c
index caa539b75..9badd403f 100644
--- a/connect.c
+++ b/connect.c
@@ -12,9 +12,11 @@
 #include "sha1-array.h"
 #include "transport.h"
 #include "strbuf.h"
+#include "version.h"
 #include "protocol.h"
 
 static char *server_capabilities;
+static struct argv_array server_capabilities_v2 = ARGV_ARRAY_INIT;
 static const char *parse_feature_value(const char *, const char *, int *);
 
 static int check_ref(const char *name, unsigned int flags)
@@ -62,6 +64,33 @@ static void die_initial_contact(int unexpected)
 		      "and the repository exists."));
 }
 
+static int server_supports_v2(const char *c, int die_on_error)
+{
+	int i;
+
+	for (i = 0; i < server_capabilities_v2.argc; i++) {
+		const char *out;
+		if (skip_prefix(server_capabilities_v2.argv[i], c, &out) &&
+		    (!*out || *out == '='))
+			return 1;
+	}
+
+	if (die_on_error)
+		die("server doesn't support '%s'", c);
+
+	return 0;
+}
+
+static void process_capabilities_v2(struct packet_reader *reader)
+{
+	while (packet_reader_read(reader) == PACKET_READ_NORMAL) {
+		argv_array_push(&server_capabilities_v2, reader->line);
+	}
+
+	if (reader->status != PACKET_READ_FLUSH)
+		die("protocol error");
+}
+
 enum protocol_version discover_version(struct packet_reader *reader)
 {
 	enum protocol_version version = protocol_unknown_version;
@@ -85,7 +114,7 @@ enum protocol_version discover_version(struct packet_reader *reader)
 	/* Maybe process capabilities here, at least for v2 */
 	switch (version) {
 	case protocol_v2:
-		die("support for protocol v2 not implemented yet");
+		process_capabilities_v2(reader);
 		break;
 	case protocol_v1:
 		/* Read the peeked version line */
@@ -292,6 +321,76 @@ struct ref **get_remote_heads(struct packet_reader *reader,
 	return list;
 }
 
+static int process_ref_v2(const char *line, struct ref ***list)
+{
+	int ret = 1;
+	int i = 0;
+	struct object_id old_oid;
+	struct ref *ref;
+	struct string_list line_sections = STRING_LIST_INIT_DUP;
+
+	if (string_list_split(&line_sections, line, ' ', -1) < 2) {
+		ret = 0;
+		goto out;
+	}
+
+	if (get_oid_hex(line_sections.items[i++].string, &old_oid)) {
+		ret = 0;
+		goto out;
+	}
+
+	ref = alloc_ref(line_sections.items[i++].string);
+
+	if (i < line_sections.nr)
+		ref->symref = xstrdup(line_sections.items[i++].string);
+
+	oidcpy(&ref->old_oid, &old_oid);
+	**list = ref;
+	*list = &ref->next;
+
+out:
+	string_list_clear(&line_sections, 0);
+	return ret;
+}
+
+struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
+			     struct ref **list, int for_push,
+			     const struct argv_array *ref_patterns)
+{
+	int i;
+	*list = NULL;
+
+	/* Check that the server supports the ls-refs command */
+	/* Issue request for ls-refs */
+	if (server_supports_v2("ls-refs", 1))
+		packet_write_fmt(fd_out, "command=ls-refs\n");
+
+	if (server_supports_v2("agent", 0))
+	    packet_write_fmt(fd_out, "agent=%s", git_user_agent_sanitized());
+
+	packet_delim(fd_out);
+	/* When pushing we don't want to request the peeled tags */
+	if (!for_push)
+		packet_write_fmt(fd_out, "peel\n");
+	packet_write_fmt(fd_out, "symrefs\n");
+	for (i = 0; ref_patterns && i < ref_patterns->argc; i++) {
+		packet_write_fmt(fd_out, "ref-pattern %s\n",
+				 ref_patterns->argv[i]);
+	}
+	packet_flush(fd_out);
+
+	/* Process response from server */
+	while (packet_reader_read(reader) == PACKET_READ_NORMAL) {
+		if (!process_ref_v2(reader->line, &list))
+			die("invalid ls-refs response: %s", reader->line);
+	}
+
+	if (reader->status != PACKET_READ_FLUSH)
+		die("protocol error");
+
+	return list;
+}
+
 static const char *parse_feature_value(const char *feature_list, const char *feature, int *lenp)
 {
 	int len;
diff --git a/remote.h b/remote.h
index 2016461df..21d0c776c 100644
--- a/remote.h
+++ b/remote.h
@@ -151,10 +151,14 @@ void free_refs(struct ref *ref);
 
 struct oid_array;
 struct packet_reader;
+struct argv_array;
 extern struct ref **get_remote_heads(struct packet_reader *reader,
 				     struct ref **list, unsigned int flags,
 				     struct oid_array *extra_have,
 				     struct oid_array *shallow_points);
+extern struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
+				    struct ref **list, int for_push,
+				    const struct argv_array *ref_patterns);
 
 int resolve_remote_symref(struct ref *ref, struct ref *list);
 int ref_newer(const struct object_id *new_oid, const struct object_id *old_oid);
diff --git a/t/t5701-protocol-v2.sh b/t/t5701-protocol-v2.sh
new file mode 100755
index 000000000..4bf4d61ac
--- /dev/null
+++ b/t/t5701-protocol-v2.sh
@@ -0,0 +1,28 @@
+#!/bin/sh
+
+test_description='test git wire-protocol version 2'
+
+TEST_NO_CREATE_REPO=1
+
+. ./test-lib.sh
+
+# Test protocol v2 with 'file://' transport
+#
+test_expect_success 'create repo to be served by file:// transport' '
+	git init file_parent &&
+	test_commit -C file_parent one
+'
+
+test_expect_success 'list refs with file:// using protocol v2' '
+	GIT_TRACE_PACKET=1 git -c protocol.version=2 \
+		ls-remote --symref "file://$(pwd)/file_parent" >actual 2>log &&
+
+	# Server responded using protocol v2
+	cat log &&
+	grep "git< version 2" log &&
+
+	git ls-remote --symref "file://$(pwd)/file_parent" >expect &&
+	test_cmp actual expect
+'
+
+test_done
diff --git a/transport.c b/transport.c
index 83d9dd1df..ffc6b2614 100644
--- a/transport.c
+++ b/transport.c
@@ -204,7 +204,7 @@ static struct ref *get_refs_via_connect(struct transport *transport, int for_pus
 	data->version = discover_version(&reader);
 	switch (data->version) {
 	case protocol_v2:
-		die("support for protocol v2 not implemented yet");
+		get_remote_refs(data->fd[1], &reader, &refs, for_push, NULL);
 		break;
 	case protocol_v1:
 	case protocol_v0:
diff --git a/upload-pack.c b/upload-pack.c
index 2bc888fc1..2ca60d27c 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -19,6 +19,7 @@
 #include "argv-array.h"
 #include "prio-queue.h"
 #include "protocol.h"
+#include "serve.h"
 
 static const char * const upload_pack_usage[] = {
 	N_("git upload-pack [<options>] <dir>"),
@@ -1071,6 +1072,7 @@ int cmd_upload_pack(int argc, const char **argv, const char *prefix)
 {
 	const char *dir;
 	int strict = 0;
+	struct serve_options opts = SERVE_OPTIONS_INIT;
 	struct option options[] = {
 		OPT_BOOL(0, "stateless-rpc", &stateless_rpc,
 			 N_("quit after a single request/response exchange")),
@@ -1105,10 +1107,9 @@ int cmd_upload_pack(int argc, const char **argv, const char *prefix)
 
 	switch (determine_protocol_version_server()) {
 	case protocol_v2:
-		/*
-		 * fetch support for protocol v2 has not been implemented yet,
-		 * so ignore the request to use v2 and fallback to using v0.
-		 */
+		opts.advertise_capabilities = advertise_refs;
+		opts.stateless_rpc = stateless_rpc;
+		serve(&opts);
 		break;
 	case protocol_v1:
 		/*
-- 
2.15.1.620.gb9897f4670-goog



* [PATCH 14/26] transport: convert get_refs_list to take a list of ref patterns
  2018-01-03  0:18 [PATCH 00/26] protocol version 2 Brandon Williams
                   ` (12 preceding siblings ...)
  2018-01-03  0:18 ` [PATCH 13/26] connect: request remote refs using v2 Brandon Williams
@ 2018-01-03  0:18 ` Brandon Williams
  2018-01-03  0:18 ` [PATCH 15/26] transport: convert transport_get_remote_refs " Brandon Williams
                   ` (13 subsequent siblings)
  27 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-01-03  0:18 UTC (permalink / raw)
  To: git
  Cc: sbeller, gitster, peff, philipoakley, stolee, jrnieder, Brandon Williams

Convert the 'struct transport' virtual function 'get_refs_list()' to
optionally take an argv_array of ref patterns.  When communicating with
a server using protocol v2, these ref patterns can be sent when
requesting a listing of its refs, allowing the server to filter the
refs it sends based on those patterns.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
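For illustration only (not part of the patch): a reduced model of the
idea that the patterns are an optional hint passed through the vtable.
All names here (pattern_list, transport_sketch, list_filtered,
list_unfiltered) are invented for this note and the "refs" are plain
strings; the point is only that a NULL or empty list means "no
restriction" and that a backend which cannot filter may ignore the
argument, so callers still have to filter the result themselves.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an argv_array of patterns. */
struct pattern_list {
	const char **items;
	int nr;
};

struct transport_sketch;

struct transport_sketch_vtable {
	/* ref_patterns may be NULL, meaning "no restriction". */
	int (*get_refs_list)(struct transport_sketch *t, int for_push,
			     const struct pattern_list *ref_patterns);
};

struct transport_sketch {
	const struct transport_sketch_vtable *vtable;
	const char **refs;
	int nr;
};

static int ends_with(const char *s, const char *suffix)
{
	size_t n = strlen(s), m = strlen(suffix);

	return n >= m && !strcmp(s + n - m, suffix);
}

/* Backend that honors the hint: counts refs ending in any pattern. */
static int list_filtered(struct transport_sketch *t, int for_push,
			 const struct pattern_list *p)
{
	int i, j, n = 0;

	(void)for_push;
	for (i = 0; i < t->nr; i++) {
		if (!p || !p->nr) {
			n++;	/* no patterns: everything matches */
			continue;
		}
		for (j = 0; j < p->nr; j++) {
			if (ends_with(t->refs[i], p->items[j])) {
				n++;
				break;
			}
		}
	}
	return n;
}

/* Backend that cannot filter (think of a bundle): the hint is ignored. */
static int list_unfiltered(struct transport_sketch *t, int for_push,
			   const struct pattern_list *p)
{
	(void)for_push;
	(void)p;
	return t->nr;
}
```

Both functions fit the same get_refs_list slot, which is why the
signature change in this patch has to touch every transport even when a
transport does nothing with the new argument.
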
 transport-helper.c   |  5 +++--
 transport-internal.h |  4 +++-
 transport.c          | 16 +++++++++-------
 3 files changed, 15 insertions(+), 10 deletions(-)

diff --git a/transport-helper.c b/transport-helper.c
index 508015023..4c334b5ee 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -1026,7 +1026,8 @@ static int has_attribute(const char *attrs, const char *attr) {
 	}
 }
 
-static struct ref *get_refs_list(struct transport *transport, int for_push)
+static struct ref *get_refs_list(struct transport *transport, int for_push,
+				 const struct argv_array *ref_patterns)
 {
 	struct helper_data *data = transport->data;
 	struct child_process *helper;
@@ -1039,7 +1040,7 @@ static struct ref *get_refs_list(struct transport *transport, int for_push)
 
 	if (process_connect(transport, for_push)) {
 		do_take_over(transport);
-		return transport->vtable->get_refs_list(transport, for_push);
+		return transport->vtable->get_refs_list(transport, for_push, ref_patterns);
 	}
 
 	if (data->push && for_push)
diff --git a/transport-internal.h b/transport-internal.h
index 3c1a29d72..a67657ce3 100644
--- a/transport-internal.h
+++ b/transport-internal.h
@@ -3,6 +3,7 @@
 
 struct ref;
 struct transport;
+struct argv_array;
 
 struct transport_vtable {
 	/**
@@ -21,7 +22,8 @@ struct transport_vtable {
 	 * the ref without a huge amount of effort, it should store it
 	 * in the ref's old_sha1 field; otherwise it should be all 0.
 	 **/
-	struct ref *(*get_refs_list)(struct transport *transport, int for_push);
+	struct ref *(*get_refs_list)(struct transport *transport, int for_push,
+				     const struct argv_array *ref_patterns);
 
 	/**
 	 * Fetch the objects for the given refs. Note that this gets
diff --git a/transport.c b/transport.c
index ffc6b2614..c54a44630 100644
--- a/transport.c
+++ b/transport.c
@@ -72,7 +72,7 @@ struct bundle_transport_data {
 	struct bundle_header header;
 };
 
-static struct ref *get_refs_from_bundle(struct transport *transport, int for_push)
+static struct ref *get_refs_from_bundle(struct transport *transport, int for_push, const struct argv_array *ref_patterns)
 {
 	struct bundle_transport_data *data = transport->data;
 	struct ref *result = NULL;
@@ -189,7 +189,8 @@ static int connect_setup(struct transport *transport, int for_push)
 	return 0;
 }
 
-static struct ref *get_refs_via_connect(struct transport *transport, int for_push)
+static struct ref *get_refs_via_connect(struct transport *transport, int for_push,
+					const struct argv_array *ref_patterns)
 {
 	struct git_transport_data *data = transport->data;
 	struct ref *refs = NULL;
@@ -204,7 +205,8 @@ static struct ref *get_refs_via_connect(struct transport *transport, int for_pus
 	data->version = discover_version(&reader);
 	switch (data->version) {
 	case protocol_v2:
-		get_remote_refs(data->fd[1], &reader, &refs, for_push, NULL);
+		get_remote_refs(data->fd[1], &reader, &refs, for_push,
+				ref_patterns);
 		break;
 	case protocol_v1:
 	case protocol_v0:
@@ -250,7 +252,7 @@ static int fetch_refs_via_pack(struct transport *transport,
 	args.update_shallow = data->options.update_shallow;
 
 	if (!data->got_remote_heads)
-		refs_tmp = get_refs_via_connect(transport, 0);
+		refs_tmp = get_refs_via_connect(transport, 0, NULL);
 
 	switch (data->version) {
 	case protocol_v2:
@@ -568,7 +570,7 @@ static int git_transport_push(struct transport *transport, struct ref *remote_re
 	int ret = 0;
 
 	if (!data->got_remote_heads)
-		get_refs_via_connect(transport, 1);
+		get_refs_via_connect(transport, 1, NULL);
 
 	memset(&args, 0, sizeof(args));
 	args.send_mirror = !!(flags & TRANSPORT_PUSH_MIRROR);
@@ -1028,7 +1030,7 @@ int transport_push(struct transport *transport,
 		if (check_push_refs(local_refs, refspec_nr, refspec) < 0)
 			return -1;
 
-		remote_refs = transport->vtable->get_refs_list(transport, 1);
+		remote_refs = transport->vtable->get_refs_list(transport, 1, NULL);
 
 		if (flags & TRANSPORT_PUSH_ALL)
 			match_flags |= MATCH_REFS_ALL;
@@ -1137,7 +1139,7 @@ int transport_push(struct transport *transport,
 const struct ref *transport_get_remote_refs(struct transport *transport)
 {
 	if (!transport->got_remote_refs) {
-		transport->remote_refs = transport->vtable->get_refs_list(transport, 0);
+		transport->remote_refs = transport->vtable->get_refs_list(transport, 0, NULL);
 		transport->got_remote_refs = 1;
 	}
 
-- 
2.15.1.620.gb9897f4670-goog



* [PATCH 15/26] transport: convert transport_get_remote_refs to take a list of ref patterns
  2018-01-03  0:18 [PATCH 00/26] protocol version 2 Brandon Williams
                   ` (13 preceding siblings ...)
  2018-01-03  0:18 ` [PATCH 14/26] transport: convert get_refs_list to take a list of ref patterns Brandon Williams
@ 2018-01-03  0:18 ` Brandon Williams
  2018-01-03  0:18 ` [PATCH 16/26] ls-remote: pass ref patterns when requesting a remote's refs Brandon Williams
                   ` (12 subsequent siblings)
  27 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-01-03  0:18 UTC (permalink / raw)
  To: git
  Cc: sbeller, gitster, peff, philipoakley, stolee, jrnieder, Brandon Williams

Convert 'transport_get_remote_refs()' to optionally take a list of ref
patterns.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 builtin/clone.c     | 2 +-
 builtin/fetch.c     | 4 ++--
 builtin/ls-remote.c | 2 +-
 builtin/remote.c    | 2 +-
 transport.c         | 7 +++++--
 transport.h         | 3 ++-
 6 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/builtin/clone.c b/builtin/clone.c
index 2da71db10..4db3079ac 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -1104,7 +1104,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	if (transport->smart_options && !deepen)
 		transport->smart_options->check_self_contained_and_connected = 1;
 
-	refs = transport_get_remote_refs(transport);
+	refs = transport_get_remote_refs(transport, NULL);
 
 	if (refs) {
 		mapped_refs = wanted_peer_refs(refs, refspec);
diff --git a/builtin/fetch.c b/builtin/fetch.c
index 7bbcd26fa..850382f55 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -250,7 +250,7 @@ static void find_non_local_tags(struct transport *transport,
 	struct string_list_item *item = NULL;
 
 	for_each_ref(add_existing, &existing_refs);
-	for (ref = transport_get_remote_refs(transport); ref; ref = ref->next) {
+	for (ref = transport_get_remote_refs(transport, NULL); ref; ref = ref->next) {
 		if (!starts_with(ref->name, "refs/tags/"))
 			continue;
 
@@ -336,7 +336,7 @@ static struct ref *get_ref_map(struct transport *transport,
 	/* opportunistically-updated references: */
 	struct ref *orefs = NULL, **oref_tail = &orefs;
 
-	const struct ref *remote_refs = transport_get_remote_refs(transport);
+	const struct ref *remote_refs = transport_get_remote_refs(transport, NULL);
 
 	if (refspec_count) {
 		struct refspec *fetch_refspec;
diff --git a/builtin/ls-remote.c b/builtin/ls-remote.c
index c4be98ab9..c6e9847c5 100644
--- a/builtin/ls-remote.c
+++ b/builtin/ls-remote.c
@@ -96,7 +96,7 @@ int cmd_ls_remote(int argc, const char **argv, const char *prefix)
 	if (uploadpack != NULL)
 		transport_set_option(transport, TRANS_OPT_UPLOADPACK, uploadpack);
 
-	ref = transport_get_remote_refs(transport);
+	ref = transport_get_remote_refs(transport, NULL);
 	if (transport_disconnect(transport))
 		return 1;
 
diff --git a/builtin/remote.c b/builtin/remote.c
index d95bf904c..d0b6ff6e2 100644
--- a/builtin/remote.c
+++ b/builtin/remote.c
@@ -862,7 +862,7 @@ static int get_remote_ref_states(const char *name,
 	if (query) {
 		transport = transport_get(states->remote, states->remote->url_nr > 0 ?
 			states->remote->url[0] : NULL);
-		remote_refs = transport_get_remote_refs(transport);
+		remote_refs = transport_get_remote_refs(transport, NULL);
 		transport_disconnect(transport);
 
 		states->queried = 1;
diff --git a/transport.c b/transport.c
index c54a44630..dfc603b36 100644
--- a/transport.c
+++ b/transport.c
@@ -1136,10 +1136,13 @@ int transport_push(struct transport *transport,
 	return 1;
 }
 
-const struct ref *transport_get_remote_refs(struct transport *transport)
+const struct ref *transport_get_remote_refs(struct transport *transport,
+					    const struct argv_array *ref_patterns)
 {
 	if (!transport->got_remote_refs) {
-		transport->remote_refs = transport->vtable->get_refs_list(transport, 0, NULL);
+		transport->remote_refs =
+			transport->vtable->get_refs_list(transport, 0,
+							 ref_patterns);
 		transport->got_remote_refs = 1;
 	}
 
diff --git a/transport.h b/transport.h
index 731c78b67..4b656f315 100644
--- a/transport.h
+++ b/transport.h
@@ -178,7 +178,8 @@ int transport_push(struct transport *connection,
 		   int refspec_nr, const char **refspec, int flags,
 		   unsigned int * reject_reasons);
 
-const struct ref *transport_get_remote_refs(struct transport *transport);
+const struct ref *transport_get_remote_refs(struct transport *transport,
+					    const struct argv_array *ref_patterns);
 
 int transport_fetch_refs(struct transport *transport, struct ref *refs);
 void transport_unlock_pack(struct transport *transport);
-- 
2.15.1.620.gb9897f4670-goog



* [PATCH 16/26] ls-remote: pass ref patterns when requesting a remote's refs
  2018-01-03  0:18 [PATCH 00/26] protocol version 2 Brandon Williams
                   ` (14 preceding siblings ...)
  2018-01-03  0:18 ` [PATCH 15/26] transport: convert transport_get_remote_refs " Brandon Williams
@ 2018-01-03  0:18 ` Brandon Williams
  2018-01-03  0:18 ` [PATCH 17/26] fetch: pass ref patterns when fetching Brandon Williams
                   ` (11 subsequent siblings)
  27 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-01-03  0:18 UTC (permalink / raw)
  To: git
  Cc: sbeller, gitster, peff, philipoakley, stolee, jrnieder, Brandon Williams

Construct an argv_array of the ref patterns supplied via the command
line and pass them to 'transport_get_remote_refs()' to be used when
communicating protocol v2 so that the server can limit the ref
advertisement based on the supplied patterns.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
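For illustration only: the test added below greps the packet trace for "ref-pattern master", so each command-line pattern ends up as one pkt-line argument of the ls-refs request. A minimal sketch of that framing, assuming the usual pkt-line layout (four hex digits for the total length, then the payload) and using hypothetical helper names (pkt_line_format, ref_pattern_pkt) that are not part of git:

```c
#include <stdio.h>
#include <string.h>

/*
 * Frame `payload` as a single pkt-line in `out`: four lowercase hex digits
 * giving the total length (payload plus the 4-byte header), then the payload.
 * Returns the number of bytes written, or -1 if it does not fit.
 * Hypothetical helper for illustration; not a git function.
 */
static int pkt_line_format(char *out, size_t outsz, const char *payload)
{
	size_t len = strlen(payload) + 4;
	int n;

	if (len > 0xffff || len + 1 > outsz)
		return -1;
	n = snprintf(out, outsz, "%04zx%s", len, payload);
	return (n < 0 || (size_t)n >= outsz) ? -1 : n;
}

/* Frame one "ref-pattern <pattern>" argument for an ls-refs request. */
static int ref_pattern_pkt(char *out, size_t outsz, const char *pattern)
{
	char payload[256];
	int n = snprintf(payload, sizeof(payload), "ref-pattern %s", pattern);

	if (n < 0 || (size_t)n >= sizeof(payload))
		return -1;
	return pkt_line_format(out, outsz, payload);
}
```

Calling ref_pattern_pkt(buf, sizeof(buf), "master") fills buf with the 22-byte line "0016ref-pattern master".  Whether the real client appends a trailing newline to the payload is not shown here.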
 builtin/ls-remote.c    | 7 +++++--
 t/t5701-protocol-v2.sh | 8 ++++++++
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/builtin/ls-remote.c b/builtin/ls-remote.c
index c6e9847c5..caf1051f3 100644
--- a/builtin/ls-remote.c
+++ b/builtin/ls-remote.c
@@ -43,6 +43,7 @@ int cmd_ls_remote(int argc, const char **argv, const char *prefix)
 	int show_symref_target = 0;
 	const char *uploadpack = NULL;
 	const char **pattern = NULL;
+	struct argv_array ref_patterns = ARGV_ARRAY_INIT;
 
 	struct remote *remote;
 	struct transport *transport;
@@ -74,8 +75,10 @@ int cmd_ls_remote(int argc, const char **argv, const char *prefix)
 	if (argc > 1) {
 		int i;
 		pattern = xcalloc(argc, sizeof(const char *));
-		for (i = 1; i < argc; i++)
+		for (i = 1; i < argc; i++) {
 			pattern[i - 1] = xstrfmt("*/%s", argv[i]);
+			argv_array_push(&ref_patterns, argv[i]);
+		}
 	}
 
 	remote = remote_get(dest);
@@ -96,7 +99,7 @@ int cmd_ls_remote(int argc, const char **argv, const char *prefix)
 	if (uploadpack != NULL)
 		transport_set_option(transport, TRANS_OPT_UPLOADPACK, uploadpack);
 
-	ref = transport_get_remote_refs(transport, NULL);
+	ref = transport_get_remote_refs(transport, &ref_patterns);
 	if (transport_disconnect(transport))
 		return 1;
 
diff --git a/t/t5701-protocol-v2.sh b/t/t5701-protocol-v2.sh
index 4bf4d61ac..7d8aeb766 100755
--- a/t/t5701-protocol-v2.sh
+++ b/t/t5701-protocol-v2.sh
@@ -25,4 +25,12 @@ test_expect_success 'list refs with file:// using protocol v2' '
 	test_cmp actual expect
 '
 
+test_expect_success 'ref advertisement is filtered with ls-remote using protocol v2' '
+	GIT_TRACE_PACKET=1 git -c protocol.version=2 \
+		ls-remote "file://$(pwd)/file_parent" master 2>log &&
+
+	grep "ref-pattern master" log &&
+	! grep "refs/tags/" log
+'
+
 test_done
-- 
2.15.1.620.gb9897f4670-goog



* [PATCH 17/26] fetch: pass ref patterns when fetching
  2018-01-03  0:18 [PATCH 00/26] protocol version 2 Brandon Williams
                   ` (15 preceding siblings ...)
  2018-01-03  0:18 ` [PATCH 16/26] ls-remote: pass ref patterns when requesting a remote's refs Brandon Williams
@ 2018-01-03  0:18 ` Brandon Williams
  2018-01-03  0:18 ` [PATCH 18/26] push: pass ref patterns when pushing Brandon Williams
                   ` (10 subsequent siblings)
  27 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-01-03  0:18 UTC (permalink / raw)
  To: git
  Cc: sbeller, gitster, peff, philipoakley, stolee, jrnieder, Brandon Williams

Construct a list of ref patterns to be passed to
'transport_get_remote_refs()' from the refspec to be used during the
fetch.  This list of ref patterns will be used to allow the server to
filter the ref advertisement when communicating using protocol v2.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 builtin/fetch.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index 850382f55..8128450bf 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -332,11 +332,21 @@ static struct ref *get_ref_map(struct transport *transport,
 	struct ref *rm;
 	struct ref *ref_map = NULL;
 	struct ref **tail = &ref_map;
+	struct argv_array ref_patterns = ARGV_ARRAY_INIT;
 
 	/* opportunistically-updated references: */
 	struct ref *orefs = NULL, **oref_tail = &orefs;
 
-	const struct ref *remote_refs = transport_get_remote_refs(transport, NULL);
+	const struct ref *remote_refs;
+
+	for (i = 0; i < refspec_count; i++) {
+		if (!refspecs[i].exact_sha1)
+			argv_array_push(&ref_patterns, refspecs[i].src);
+	}
+
+	remote_refs = transport_get_remote_refs(transport, &ref_patterns);
+
+	argv_array_clear(&ref_patterns);
 
 	if (refspec_count) {
 		struct refspec *fetch_refspec;
-- 
2.15.1.620.gb9897f4670-goog



* [PATCH 18/26] push: pass ref patterns when pushing
  2018-01-03  0:18 [PATCH 00/26] protocol version 2 Brandon Williams
                   ` (16 preceding siblings ...)
  2018-01-03  0:18 ` [PATCH 17/26] fetch: pass ref patterns when fetching Brandon Williams
@ 2018-01-03  0:18 ` Brandon Williams
  2018-01-03  0:18 ` [PATCH 19/26] upload-pack: introduce fetch server command Brandon Williams
                   ` (9 subsequent siblings)
  27 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-01-03  0:18 UTC (permalink / raw)
  To: git
  Cc: sbeller, gitster, peff, philipoakley, stolee, jrnieder, Brandon Williams

Construct a list of ref patterns to be passed to 'get_refs_list()' from
the refspec to be used during the push.  This list of ref patterns will
be used to allow the server to filter the ref advertisement when
communicating using protocol v2.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
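For illustration only, the selection rule used in the hunk below can be sketched in isolation: a push refspec contributes its destination when it has one, otherwise its source unless that source is a bare object id.  The struct mini_refspec type and both function names are stand-ins invented for this sketch; they only mirror the three fields of git's struct refspec that the patch reads.

```c
#include <stddef.h>

/* Reduced stand-in for git's struct refspec; only the fields used here. */
struct mini_refspec {
	const char *src;
	const char *dst;
	int exact_sha1;
};

/*
 * Pick the pattern to send for one push refspec: the destination when
 * present, otherwise the source unless it is an exact object id.
 * Returns NULL when the refspec contributes no pattern.
 */
static const char *push_ref_pattern(const struct mini_refspec *rs)
{
	if (rs->dst)
		return rs->dst;
	if (rs->src && !rs->exact_sha1)
		return rs->src;
	return NULL;
}

/* Collect patterns for `nr` refspecs into `out`; returns how many were stored. */
static size_t collect_push_patterns(const struct mini_refspec *rs, size_t nr,
				    const char **out, size_t out_cap)
{
	size_t i, n = 0;

	for (i = 0; i < nr && n < out_cap; i++) {
		const char *p = push_ref_pattern(&rs[i]);
		if (p)
			out[n++] = p;
	}
	return n;
}
```

With the refspecs "refs/heads/topic:refs/heads/master", "refs/heads/dev" and a bare object id, the sketch yields the two patterns refs/heads/master and refs/heads/dev.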
 transport.c | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/transport.c b/transport.c
index dfc603b36..6ea3905e3 100644
--- a/transport.c
+++ b/transport.c
@@ -1026,11 +1026,26 @@ int transport_push(struct transport *transport,
 		int porcelain = flags & TRANSPORT_PUSH_PORCELAIN;
 		int pretend = flags & TRANSPORT_PUSH_DRY_RUN;
 		int push_ret, ret, err;
+		struct refspec *tmp_rs;
+		struct argv_array ref_patterns = ARGV_ARRAY_INIT;
+		int i;
 
 		if (check_push_refs(local_refs, refspec_nr, refspec) < 0)
 			return -1;
 
-		remote_refs = transport->vtable->get_refs_list(transport, 1, NULL);
+		tmp_rs = parse_push_refspec(refspec_nr, refspec);
+		for (i = 0; i < refspec_nr; i++) {
+			if (tmp_rs[i].dst)
+				argv_array_push(&ref_patterns, tmp_rs[i].dst);
+			else if (tmp_rs[i].src && !tmp_rs[i].exact_sha1)
+				argv_array_push(&ref_patterns, tmp_rs[i].src);
+		}
+
+		remote_refs = transport->vtable->get_refs_list(transport, 1,
+							       &ref_patterns);
+
+		argv_array_clear(&ref_patterns);
+		free_refspec(refspec_nr, tmp_rs);
 
 		if (flags & TRANSPORT_PUSH_ALL)
 			match_flags |= MATCH_REFS_ALL;
-- 
2.15.1.620.gb9897f4670-goog



* [PATCH 19/26] upload-pack: introduce fetch server command
  2018-01-03  0:18 [PATCH 00/26] protocol version 2 Brandon Williams
                   ` (17 preceding siblings ...)
  2018-01-03  0:18 ` [PATCH 18/26] push: pass ref patterns when pushing Brandon Williams
@ 2018-01-03  0:18 ` Brandon Williams
  2018-01-04  1:07   ` Stefan Beller
  2018-01-03  0:18 ` [PATCH 20/26] fetch-pack: perform a fetch using v2 Brandon Williams
                   ` (8 subsequent siblings)
  27 siblings, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-01-03  0:18 UTC (permalink / raw)
  To: git
  Cc: sbeller, gitster, peff, philipoakley, stolee, jrnieder, Brandon Williams

Introduce the 'fetch' server command.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
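For illustration only, the state machine that upload_pack_v2() walks through in the diff below can be reduced to a pure transition function.  This is a sketch under simplifying assumptions, not the patch's code: the enum mirrors the FETCH_* macros, struct fetch_inputs and next_state() are invented names, and pack_ready collapses the two cases in which process_haves_and_send_acks() returns 1 (the client sent "done", or the server sent "ready").

```c
/* States, mirroring the FETCH_* macros introduced by the patch. */
enum fetch_state {
	FETCH_PROCESS_ARGS,
	FETCH_READ_HAVES,
	FETCH_SEND_ACKS,
	FETCH_SEND_PACK,
	FETCH_DONE
};

/* Inputs that decide a transition (hypothetical grouping for this sketch). */
struct fetch_inputs {
	int nr_wants;      /* number of 'want' lines seen */
	int nr_haves;      /* number of 'have' lines seen */
	int pack_ready;    /* client sent "done" or server sent "ready" */
	int stateless_rpc; /* request arrived over a half-duplex connection */
};

static enum fetch_state next_state(enum fetch_state s,
				   const struct fetch_inputs *in)
{
	switch (s) {
	case FETCH_PROCESS_ARGS:
		if (!in->nr_wants)
			return FETCH_DONE;       /* nothing was requested */
		return in->nr_haves ? FETCH_SEND_ACKS : FETCH_SEND_PACK;
	case FETCH_READ_HAVES:
		return FETCH_SEND_ACKS;
	case FETCH_SEND_ACKS:
		if (in->pack_ready)
			return FETCH_SEND_PACK;
		/* stateless clients start a new request for the next round */
		return in->stateless_rpc ? FETCH_DONE : FETCH_READ_HAVES;
	case FETCH_SEND_PACK:
	case FETCH_DONE:
		break;
	}
	return FETCH_DONE;
}
```

A request with wants but no haves goes straight from FETCH_PROCESS_ARGS to FETCH_SEND_PACK; a stateful negotiation loops between FETCH_SEND_ACKS and FETCH_READ_HAVES until a pack can be sent.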
 Documentation/technical/protocol-v2.txt |  14 ++
 serve.c                                 |   2 +
 upload-pack.c                           | 290 ++++++++++++++++++++++++++++++++
 upload-pack.h                           |   9 +
 4 files changed, 315 insertions(+)
 create mode 100644 upload-pack.h

diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
index 5f4d0e719..2a8e2f226 100644
--- a/Documentation/technical/protocol-v2.txt
+++ b/Documentation/technical/protocol-v2.txt
@@ -115,3 +115,17 @@ The output of ls-refs is as follows:
 
     symref = PKT-LINE("symref" SP symbolic-ref SP resolved-ref LF)
     shallow = PKT-LINE("shallow" SP obj-id LF)
+
+ Fetch
+-------
+
+Fetch will need to be a modified version of the v1 fetch protocol.  Some
+potential areas for improvement are: Ref-in-want, CDN offloading,
+Fetch-options.
+
+Since we'll have an 'ls-refs' service we can eliminate the need for fetch
+to perform a ref-advertisement.  Instead, a client can run the 'ls-refs'
+service first, in order to find out what refs the server has, and then
+request those refs directly using the fetch service.
+
+//TODO Flesh out the design
diff --git a/serve.c b/serve.c
index 88d548410..ca3bb7190 100644
--- a/serve.c
+++ b/serve.c
@@ -6,6 +6,7 @@
 #include "argv-array.h"
 #include "ls-refs.h"
 #include "serve.h"
+#include "upload-pack.h"
 
 static int always_advertise(struct repository *r,
 			    struct strbuf *value)
@@ -46,6 +47,7 @@ static struct protocol_capability capabilities[] = {
 	{ "agent", agent_advertise, NULL },
 	{ "stateless-rpc", always_advertise, NULL },
 	{ "ls-refs", always_advertise, ls_refs },
+	{ "fetch", always_advertise, upload_pack_v2 },
 };
 
 static void advertise_capabilities(void)
diff --git a/upload-pack.c b/upload-pack.c
index 2ca60d27c..c41f6f528 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -20,6 +20,7 @@
 #include "prio-queue.h"
 #include "protocol.h"
 #include "serve.h"
+#include "upload-pack.h"
 
 static const char * const upload_pack_usage[] = {
 	N_("git upload-pack [<options>] <dir>"),
@@ -1040,6 +1041,295 @@ static void upload_pack(void)
 	}
 }
 
+struct upload_pack_data {
+	struct object_array wants;
+	struct oid_array haves;
+
+	unsigned stateless_rpc : 1;
+
+	unsigned use_thin_pack : 1;
+	unsigned use_ofs_delta : 1;
+	unsigned no_progress : 1;
+	unsigned use_include_tag : 1;
+	unsigned done : 1;
+};
+
+#define UPLOAD_PACK_DATA_INIT { OBJECT_ARRAY_INIT, OID_ARRAY_INIT, 0, 0, 0, 0, 0, 0 }
+
+static void upload_pack_data_clear(struct upload_pack_data *data)
+{
+	object_array_clear(&data->wants);
+	oid_array_clear(&data->haves);
+}
+
+static int parse_want(const char *line)
+{
+	const char *arg;
+	if (skip_prefix(line, "want ", &arg)) {
+		struct object_id oid;
+		struct object *o;
+
+		if (get_oid_hex(arg, &oid))
+			die("git upload-pack: protocol error, "
+			    "expected to get oid, not '%s'", line);
+
+		o = parse_object(&oid);
+		if (!o) {
+			packet_write_fmt(1,
+					 "ERR upload-pack: not our ref %s",
+					 oid_to_hex(&oid));
+			die("git upload-pack: not our ref %s",
+			    oid_to_hex(&oid));
+		}
+
+		if (!(o->flags & WANTED)) {
+			o->flags |= WANTED;
+			add_object_array(o, NULL, &want_obj);
+		}
+
+		return 1;
+	}
+
+	return 0;
+}
+
+static int parse_have(const char *line, struct oid_array *haves)
+{
+	const char *arg;
+	if (skip_prefix(line, "have ", &arg)) {
+		struct object_id oid;
+
+		if (get_oid_hex(arg, &oid))
+			die("git upload-pack: expected SHA1 object, got '%s'", arg);
+		oid_array_append(haves, &oid);
+		return 1;
+	}
+
+	return 0;
+}
+
+static void process_args(struct argv_array *args, struct upload_pack_data *data)
+{
+	int i;
+
+	for (i = 0; i < args->argc; i++) {
+		const char *arg = args->argv[i];
+
+		/* process want */
+		if (parse_want(arg))
+			continue;
+		/* process have line */
+		if (parse_have(arg, &data->haves))
+			continue;
+
+		/* process args like thin-pack */
+		if (!strcmp(arg, "thin-pack")) {
+			use_thin_pack = 1;
+			continue;
+		}
+		if (!strcmp(arg, "ofs-delta")) {
+			use_ofs_delta = 1;
+			continue;
+		}
+		if (!strcmp(arg, "no-progress")) {
+			no_progress = 1;
+			continue;
+		}
+		if (!strcmp(arg, "include-tag")) {
+			use_include_tag = 1;
+			continue;
+		}
+		if (!strcmp(arg, "done")) {
+			data->done = 1;
+			continue;
+		}
+
+		/* ignore unknown lines maybe? */
+		die("unexpected line: '%s'", arg);
+	}
+}
+
+static void read_haves(struct upload_pack_data *data)
+{
+	struct packet_reader reader;
+	packet_reader_init(&reader, 0, NULL, 0,
+			   PACKET_READ_CHOMP_NEWLINE);
+
+	while (packet_reader_read(&reader) == PACKET_READ_NORMAL) {
+
+		if (parse_have(reader.line, &data->haves))
+			continue;
+		if (!strcmp(reader.line, "done")) {
+			data->done = 1;
+			continue;
+		}
+	}
+	if (reader.status != PACKET_READ_FLUSH)
+		die("git upload-pack: expected flush packet after 'have' lines");
+}
+
+static int process_haves(struct oid_array *haves, struct oid_array *common)
+{
+	int i;
+
+	/* Process haves */
+	for (i = 0; i < haves->nr; i++) {
+		const struct object_id *oid = &haves->oid[i];
+		struct object *o;
+		int we_knew_they_have = 0;
+
+		if (!has_object_file(oid))
+			continue;
+
+		oid_array_append(common, oid);
+
+		o = parse_object(oid);
+		if (!o)
+			die("oops (%s)", oid_to_hex(oid));
+		if (o->type == OBJ_COMMIT) {
+			struct commit_list *parents;
+			struct commit *commit = (struct commit *)o;
+			if (o->flags & THEY_HAVE)
+				we_knew_they_have = 1;
+			else
+				o->flags |= THEY_HAVE;
+			if (!oldest_have || (commit->date < oldest_have))
+				oldest_have = commit->date;
+			for (parents = commit->parents;
+			     parents;
+			     parents = parents->next)
+				parents->item->object.flags |= THEY_HAVE;
+		}
+		if (!we_knew_they_have)
+			add_object_array(o, NULL, &have_obj);
+	}
+
+	return 0;
+}
+
+static int send_acks(struct oid_array *acks, struct strbuf *response)
+{
+	int i;
+	/* Send Acks */
+	if (!acks->nr)
+		packet_buf_write(response, "NAK\n");
+
+	for (i = 0; i < acks->nr; i++) {
+		packet_buf_write(response, "ACK %s common\n",
+				 oid_to_hex(&acks->oid[i]));
+	}
+
+	if (ok_to_give_up()) {
+		/* Send Ready */
+		packet_buf_write(response, "ACK %s ready\n",
+				 oid_to_hex(&acks->oid[i-1]));
+		return 1;
+	}
+
+	return 0;
+}
+
+static int process_haves_and_send_acks(struct upload_pack_data *data)
+{
+	struct oid_array common = OID_ARRAY_INIT;
+	struct strbuf response = STRBUF_INIT;
+	int ret = 0;
+
+	process_haves(&data->haves, &common);
+	if (data->done) {
+		ret = 1;
+	} else if (send_acks(&common, &response)) {
+		packet_buf_delim(&response);
+		ret = 1;
+	} else {
+		/* Add Flush */
+		packet_buf_flush(&response);
+		ret = 0;
+	}
+
+	/* Send response */
+	write_or_die(1, response.buf, response.len);
+	strbuf_release(&response);
+
+	oid_array_clear(&data->haves);
+	oid_array_clear(&common);
+	return ret;
+}
+
+#define FETCH_PROCESS_ARGS 0
+#define FETCH_READ_HAVES 1
+#define FETCH_SEND_ACKS 2
+#define FETCH_SEND_PACK 3
+#define FETCH_DONE 4
+
+int upload_pack_v2(struct repository *r, struct argv_array *keys,
+		   struct argv_array *args)
+{
+	int state = FETCH_PROCESS_ARGS;
+	struct upload_pack_data data = UPLOAD_PACK_DATA_INIT;
+	const char *out;
+	use_sideband = LARGE_PACKET_MAX;
+
+	/* Check if cmd is being run as a stateless-rpc */
+	if (has_capability(keys, "stateless-rpc", &out))
+		if (!strcmp(out, "true"))
+			data.stateless_rpc = 1;
+
+	while (state != FETCH_DONE) {
+		switch (state) {
+		case FETCH_PROCESS_ARGS:
+			process_args(args, &data);
+
+			if (!want_obj.nr) {
+				/*
+				 * Request didn't contain any 'want' lines,
+				 * guess they didn't want anything.
+				 */
+				state = FETCH_DONE;
+			} else if (data.haves.nr) {
+				/*
+				 * Request had 'have' lines, so let's ACK them.
+				 */
+				state = FETCH_SEND_ACKS;
+			} else {
+				/*
+				 * Request had 'want's but no 'have's so we can
+				 * immediately go to construct and send a pack.
+				 */
+				state = FETCH_SEND_PACK;
+			}
+			break;
+		case FETCH_READ_HAVES:
+			read_haves(&data);
+			state = FETCH_SEND_ACKS;
+			break;
+		case FETCH_SEND_ACKS:
+			if (process_haves_and_send_acks(&data))
+				state = FETCH_SEND_PACK;
+			else if (data.stateless_rpc)
+				/*
+				 * Request was made via stateless-rpc and a
+				 * packfile isn't ready to be created and sent.
+				 */
+				state = FETCH_DONE;
+			else
+				state = FETCH_READ_HAVES;
+			break;
+		case FETCH_SEND_PACK:
+			create_pack_file();
+			state = FETCH_DONE;
+			break;
+		case FETCH_DONE:
+			break;
+		default:
+			BUG("invalid state");
+		}
+	}
+
+	upload_pack_data_clear(&data);
+	return 0;
+}
+
 static int upload_pack_config(const char *var, const char *value, void *unused)
 {
 	if (!strcmp("uploadpack.allowtipsha1inwant", var)) {
diff --git a/upload-pack.h b/upload-pack.h
new file mode 100644
index 000000000..54c429563
--- /dev/null
+++ b/upload-pack.h
@@ -0,0 +1,9 @@
+#ifndef UPLOAD_PACK_H
+#define UPLOAD_PACK_H
+
+struct repository;
+struct argv_array;
+extern int upload_pack_v2(struct repository *r, struct argv_array *keys,
+			  struct argv_array *args);
+
+#endif /* UPLOAD_PACK_H */
-- 
2.15.1.620.gb9897f4670-goog



* [PATCH 20/26] fetch-pack: perform a fetch using v2
  2018-01-03  0:18 [PATCH 00/26] protocol version 2 Brandon Williams
                   ` (18 preceding siblings ...)
  2018-01-03  0:18 ` [PATCH 19/26] upload-pack: introduce fetch server command Brandon Williams
@ 2018-01-03  0:18 ` Brandon Williams
  2018-01-04  1:23   ` Stefan Beller
  2018-01-10  0:05   ` Jonathan Tan
  2018-01-03  0:18 ` [PATCH 21/26] transport-helper: remove name parameter Brandon Williams
                   ` (7 subsequent siblings)
  27 siblings, 2 replies; 362+ messages in thread
From: Brandon Williams @ 2018-01-03  0:18 UTC (permalink / raw)
  To: git
  Cc: sbeller, gitster, peff, philipoakley, stolee, jrnieder, Brandon Williams

When communicating with a v2 server, perform a fetch by requesting the
'fetch' command.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
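For illustration only, the request that send_fetch_request() assembles in the diff below has three parts: capability lines such as "command=fetch", a delimiter packet, and then the command arguments ('thin-pack', 'want', 'have', ...) closed by a flush packet.  The sketch below builds a reduced request under these assumptions: standard pkt-line framing, "0001" as the delimiter and "0000" as the flush, and the agent and stateless-rpc lines left out.  req_add(), req_add_special() and build_fetch_request() are hypothetical names, not git functions.

```c
#include <stdio.h>
#include <string.h>

/* Append one pkt-line (4 hex length digits + payload) to the string in buf. */
static int req_add(char *buf, size_t cap, const char *payload)
{
	size_t used = strlen(buf);
	size_t len = strlen(payload) + 4;

	if (len > 0xffff || used + len + 1 > cap)
		return -1;
	snprintf(buf + used, cap - used, "%04zx%s", len, payload);
	return 0;
}

/* Append a 4-byte special packet: "0001" (delim) or "0000" (flush). */
static int req_add_special(char *buf, size_t cap, const char *marker)
{
	size_t used = strlen(buf);

	if (used + 4 + 1 > cap)
		return -1;
	memcpy(buf + used, marker, 5); /* copies the terminating NUL too */
	return 0;
}

/* Build "command=fetch", delim, optional thin-pack, one want, flush. */
static int build_fetch_request(char *buf, size_t cap,
			       const char *want_hex, int thin_pack)
{
	char line[128];
	int n;

	if (!cap)
		return -1;
	buf[0] = '\0';
	if (req_add(buf, cap, "command=fetch") ||
	    req_add_special(buf, cap, "0001"))
		return -1;
	if (thin_pack && req_add(buf, cap, "thin-pack"))
		return -1;
	n = snprintf(line, sizeof(line), "want %s\n", want_hex);
	if (n < 0 || (size_t)n >= sizeof(line) || req_add(buf, cap, line))
		return -1;
	return req_add_special(buf, cap, "0000");
}
```

For a 40-character object id the want line is framed as "0032want <oid>\n", and the whole request with thin-pack enabled is 88 bytes long.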
 builtin/fetch-pack.c   |   2 +-
 fetch-pack.c           | 267 ++++++++++++++++++++++++++++++++++++++++++++++++-
 fetch-pack.h           |   4 +-
 t/t5701-protocol-v2.sh |  40 ++++++++
 transport.c            |   8 +-
 5 files changed, 314 insertions(+), 7 deletions(-)

diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
index f492e8abd..867dd3cc7 100644
--- a/builtin/fetch-pack.c
+++ b/builtin/fetch-pack.c
@@ -213,7 +213,7 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix)
 	}
 
 	ref = fetch_pack(&args, fd, conn, ref, dest, sought, nr_sought,
-			 &shallow, pack_lockfile_ptr);
+			 &shallow, pack_lockfile_ptr, protocol_v0);
 	if (pack_lockfile) {
 		printf("lock %s\n", pack_lockfile);
 		fflush(stdout);
diff --git a/fetch-pack.c b/fetch-pack.c
index 9f6b07ad9..c26fdc539 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1008,6 +1008,262 @@ static struct ref *do_fetch_pack(struct fetch_pack_args *args,
 	return ref;
 }
 
+static void add_wants(const struct ref *wants, struct strbuf *req_buf)
+{
+	for ( ; wants ; wants = wants->next) {
+		const struct object_id *remote = &wants->old_oid;
+		const char *remote_hex;
+		struct object *o;
+
+		/*
+		 * If that object is complete (i.e. it is an ancestor of a
+		 * local ref), we tell them we have it but do not have to
+		 * tell them about its ancestors, which they already know
+		 * about.
+		 *
+		 * We use lookup_object here because we are only
+		 * interested in the case we *know* the object is
+		 * reachable and we have already scanned it.
+		 */
+		if (((o = lookup_object(remote->hash)) != NULL) &&
+		    (o->flags & COMPLETE)) {
+			continue;
+		}
+
+		remote_hex = oid_to_hex(remote);
+		packet_buf_write(req_buf, "want %s\n", remote_hex);
+	}
+}
+
+static int add_haves(struct strbuf *req_buf, int *in_vain)
+{
+	int ret = 0;
+	int haves_added = 0;
+	const struct object_id *oid;
+
+	while ((oid = get_rev())) {
+		packet_buf_write(req_buf, "have %s\n", oid_to_hex(oid));
+		if (++haves_added >= INITIAL_FLUSH)
+			break;
+	}
+
+	*in_vain += haves_added;
+	if (!haves_added || *in_vain >= MAX_IN_VAIN) {
+		/* Send Done */
+		packet_buf_write(req_buf, "done\n");
+		ret = 1;
+	}
+
+	return ret;
+}
+
+static int send_haves(int fd_out, int *in_vain)
+{
+	int ret = 0;
+	struct strbuf req_buf = STRBUF_INIT;
+
+	ret = add_haves(&req_buf, in_vain);
+
+	/* Send request */
+	packet_buf_flush(&req_buf);
+	write_or_die(fd_out, req_buf.buf, req_buf.len);
+
+	strbuf_release(&req_buf);
+	return ret;
+}
+
+static int send_fetch_request(int fd_out, const struct fetch_pack_args *args,
+			      const struct ref *wants, struct oidset *common,
+			      int *in_vain)
+{
+	int ret = 0;
+	struct strbuf req_buf = STRBUF_INIT;
+
+	packet_buf_write(&req_buf, "command=fetch");
+	packet_buf_write(&req_buf, "agent=%s", git_user_agent_sanitized());
+	if (args->stateless_rpc)
+		packet_buf_write(&req_buf, "stateless-rpc=true");
+
+	packet_buf_delim(&req_buf);
+	if (args->use_thin_pack)
+		packet_buf_write(&req_buf, "thin-pack");
+	if (args->no_progress)
+		packet_buf_write(&req_buf, "no-progress");
+	if (args->include_tag)
+		packet_buf_write(&req_buf, "include-tag");
+	if (prefer_ofs_delta)
+		packet_buf_write(&req_buf, "ofs-delta");
+
+	/* add wants */
+	add_wants(wants, &req_buf);
+
+	/*
+	 * If we are running stateless-rpc we need to add all the common
+	 * commits we've found in previous rounds
+	 */
+	if (args->stateless_rpc) {
+		struct oidset_iter iter;
+		const struct object_id *oid;
+		oidset_iter_init(common, &iter);
+
+		while ((oid = oidset_iter_next(&iter))) {
+			packet_buf_write(&req_buf, "have %s\n", oid_to_hex(oid));
+		}
+	}
+
+	/* Add initial haves */
+	ret = add_haves(&req_buf, in_vain);
+
+	/* Send request */
+	packet_buf_flush(&req_buf);
+	write_or_die(fd_out, req_buf.buf, req_buf.len);
+
+	strbuf_release(&req_buf);
+	return ret;
+}
+
+static enum ack_type process_ack(const char *line, struct object_id *oid)
+{
+	const char *arg;
+
+	if (!strcmp(line, "NAK"))
+		return NAK;
+	if (skip_prefix(line, "ACK ", &arg)) {
+		if (!parse_oid_hex(arg, oid, &arg)) {
+			if (strstr(arg, "continue"))
+				return ACK_continue;
+			if (strstr(arg, "common"))
+				return ACK_common;
+			if (strstr(arg, "ready"))
+				return ACK_ready;
+			return ACK;
+		}
+	}
+	if (skip_prefix(line, "ERR ", &arg))
+		die(_("remote error: %s"), arg);
+	die(_("git fetch-pack: expected ACK/NAK, got '%s'"), line);
+}
+
+static int process_acks(struct packet_reader *reader, struct oidset *common)
+{
+	int got_ready = 0;
+	int got_common = 0;
+	while (packet_reader_read(reader) == PACKET_READ_NORMAL) {
+		struct object_id oid;
+		struct commit *commit;
+		enum ack_type ack = process_ack(reader->line, &oid);
+
+		switch (ack) {
+		case ACK_ready:
+			clear_prio_queue(&rev_list);
+			got_ready = 1;
+			/* fallthrough */
+		case ACK_common:
+			oidset_insert(common, &oid);
+			commit = lookup_commit(&oid);
+			mark_common(commit, 0, 1);
+			got_common = 1;
+			break;
+		case NAK:
+			break;
+		case ACK:
+		case ACK_continue:
+			die("ACK/ACK_continue not supported");
+		}
+	}
+
+	if (reader->status != PACKET_READ_FLUSH &&
+	    reader->status != PACKET_READ_DELIM)
+		die("Error during processing acks: %d", reader->status);
+
+	/* return 0 if no common, 1 if there are common, or 2 if ready */
+	return got_ready + got_common;
+}
+
+#define FETCH_CHECK_LOCAL 0
+#define FETCH_SEND_REQUEST 1
+#define FETCH_PROCESS_ACKS 2
+#define FETCH_SEND_HAVES 3
+#define FETCH_GET_PACK 4
+#define FETCH_DONE 5
+
+static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
+				    int fd[2],
+				    const struct ref *orig_ref,
+				    struct ref **sought, int nr_sought,
+				    char **pack_lockfile)
+{
+	struct ref *ref = copy_ref_list(orig_ref);
+	int state = FETCH_CHECK_LOCAL;
+	struct oidset common = OIDSET_INIT;
+	struct packet_reader reader;
+	int in_vain = 0;
+	packet_reader_init(&reader, fd[0], NULL, 0,
+			   PACKET_READ_CHOMP_NEWLINE);
+
+	while (state != FETCH_DONE) {
+		switch (state) {
+		case FETCH_CHECK_LOCAL:
+			sort_ref_list(&ref, ref_compare_name);
+			QSORT(sought, nr_sought, cmp_ref_by_name);
+
+			/* v2 supports these by default */
+			allow_unadvertised_object_request |= ALLOW_REACHABLE_SHA1;
+			use_sideband = 2;
+
+			/* Filter 'ref' by 'sought' and those that aren't local */
+			if (everything_local(args, &ref, sought, nr_sought))
+				state = FETCH_DONE;
+			else
+				state = FETCH_SEND_REQUEST;
+			break;
+		case FETCH_SEND_REQUEST:
+			if (send_fetch_request(fd[1], args, ref, &common, &in_vain))
+				state = FETCH_GET_PACK;
+			else
+				state = FETCH_PROCESS_ACKS;
+			break;
+		case FETCH_PROCESS_ACKS:
+			/* Process ACKs/NAKs */
+			switch (process_acks(&reader, &common)) {
+			case 2:
+				state = FETCH_GET_PACK;
+				break;
+			case 1:
+				in_vain = 0;
+				/* fallthrough */
+			default:
+				if (args->stateless_rpc)
+					state = FETCH_SEND_REQUEST;
+				else
+					state = FETCH_SEND_HAVES;
+				break;
+			}
+			break;
+		case FETCH_SEND_HAVES:
+			if (send_haves(fd[1], &in_vain))
+				state = FETCH_GET_PACK;
+			else
+				state = FETCH_PROCESS_ACKS;
+			break;
+		case FETCH_GET_PACK:
+			/* get the pack */
+			if (get_pack(args, fd, pack_lockfile))
+				die(_("git fetch-pack: fetch failed."));
+
+			state = FETCH_DONE;
+			break;
+		case FETCH_DONE:
+			break;
+		default:
+			die("invalid state");
+		}
+	}
+
+	oidset_clear(&common);
+	return ref;
+}
+
 static void fetch_pack_config(void)
 {
 	git_config_get_int("fetch.unpacklimit", &fetch_unpack_limit);
@@ -1153,7 +1409,8 @@ struct ref *fetch_pack(struct fetch_pack_args *args,
 		       const char *dest,
 		       struct ref **sought, int nr_sought,
 		       struct oid_array *shallow,
-		       char **pack_lockfile)
+		       char **pack_lockfile,
+		       enum protocol_version version)
 {
 	struct ref *ref_cpy;
 	struct shallow_info si;
@@ -1167,8 +1424,12 @@ struct ref *fetch_pack(struct fetch_pack_args *args,
 		die(_("no matching remote head"));
 	}
 	prepare_shallow_info(&si, shallow);
-	ref_cpy = do_fetch_pack(args, fd, ref, sought, nr_sought,
-				&si, pack_lockfile);
+	if (version == protocol_v2)
+		ref_cpy = do_fetch_pack_v2(args, fd, ref, sought, nr_sought,
+					   pack_lockfile);
+	else
+		ref_cpy = do_fetch_pack(args, fd, ref, sought, nr_sought,
+					&si, pack_lockfile);
 	reprepare_packed_git();
 	update_shallow(args, sought, nr_sought, &si);
 	clear_shallow_info(&si);
diff --git a/fetch-pack.h b/fetch-pack.h
index b6aeb43a8..7afca7305 100644
--- a/fetch-pack.h
+++ b/fetch-pack.h
@@ -3,6 +3,7 @@
 
 #include "string-list.h"
 #include "run-command.h"
+#include "protocol.h"
 
 struct oid_array;
 
@@ -43,7 +44,8 @@ struct ref *fetch_pack(struct fetch_pack_args *args,
 		       struct ref **sought,
 		       int nr_sought,
 		       struct oid_array *shallow,
-		       char **pack_lockfile);
+		       char **pack_lockfile,
+		       enum protocol_version version);
 
 /*
  * Print an appropriate error message for each sought ref that wasn't
diff --git a/t/t5701-protocol-v2.sh b/t/t5701-protocol-v2.sh
index 7d8aeb766..3e411e178 100755
--- a/t/t5701-protocol-v2.sh
+++ b/t/t5701-protocol-v2.sh
@@ -33,4 +33,44 @@ test_expect_success 'ref advertisement is filtered with ls-remote using protocol
 	! grep "refs/tags/" log
 '
 
+test_expect_success 'clone with file:// using protocol v2' '
+	GIT_TRACE_PACKET=1 git -c protocol.version=2 \
+		clone "file://$(pwd)/file_parent" file_child 2>log &&
+
+	git -C file_child log -1 --format=%s >actual &&
+	git -C file_parent log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+
+	# Server responded using protocol v2
+	grep "clone< version 2" log
+'
+
+test_expect_success 'fetch with file:// using protocol v2' '
+	test_commit -C file_parent two &&
+
+	GIT_TRACE_PACKET=1 git -C file_child -c protocol.version=2 \
+		fetch origin 2>log &&
+
+	git -C file_child log -1 --format=%s origin/master >actual &&
+	git -C file_parent log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+
+	# Server responded using protocol v2
+	grep "fetch< version 2" log
+'
+
+test_expect_success 'ref advertisement is filtered during fetch using protocol v2' '
+	test_commit -C file_parent three &&
+
+	GIT_TRACE_PACKET=1 git -C file_child -c protocol.version=2 \
+		fetch origin master 2>log &&
+
+	git -C file_child log -1 --format=%s origin/master >actual &&
+	git -C file_parent log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+
+	grep "ref-pattern master" log &&
+	! grep "refs/tags/" log
+'
+
 test_done
diff --git a/transport.c b/transport.c
index 6ea3905e3..4fdbd9adc 100644
--- a/transport.c
+++ b/transport.c
@@ -256,14 +256,18 @@ static int fetch_refs_via_pack(struct transport *transport,
 
 	switch (data->version) {
 	case protocol_v2:
-		die("support for protocol v2 not implemented yet");
+		refs = fetch_pack(&args, data->fd, data->conn,
+				  refs_tmp ? refs_tmp : transport->remote_refs,
+				  dest, to_fetch, nr_heads, &data->shallow,
+				  &transport->pack_lockfile, data->version);
+		packet_flush(data->fd[1]);
 		break;
 	case protocol_v1:
 	case protocol_v0:
 		refs = fetch_pack(&args, data->fd, data->conn,
 				  refs_tmp ? refs_tmp : transport->remote_refs,
 				  dest, to_fetch, nr_heads, &data->shallow,
-				  &transport->pack_lockfile);
+				  &transport->pack_lockfile, data->version);
 		break;
 	case protocol_unknown_version:
 		BUG("unknown protocol version");
-- 
2.15.1.620.gb9897f4670-goog



* [PATCH 21/26] transport-helper: remove name parameter
  2018-01-03  0:18 [PATCH 00/26] protocol version 2 Brandon Williams
                   ` (19 preceding siblings ...)
  2018-01-03  0:18 ` [PATCH 20/26] fetch-pack: perform a fetch using v2 Brandon Williams
@ 2018-01-03  0:18 ` Brandon Williams
  2018-01-03  0:18 ` [PATCH 22/26] transport-helper: refactor process_connect_service Brandon Williams
                   ` (6 subsequent siblings)
  27 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-01-03  0:18 UTC (permalink / raw)
  To: git
  Cc: sbeller, gitster, peff, philipoakley, stolee, jrnieder, Brandon Williams

Commit 266f1fdfa (transport-helper: be quiet on read errors from
helpers, 2013-06-21) removed a call to 'die()' which printed the name of
the remote helper passed in to the 'recvline_fh()' function using the
'name' parameter.  Once the call to 'die()' was removed, the parameter
was no longer necessary but was left in place.  Clean up the
'recvline_fh()' parameter list by removing the 'name' parameter.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 transport-helper.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/transport-helper.c b/transport-helper.c
index 4c334b5ee..d72155768 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -49,7 +49,7 @@ static void sendline(struct helper_data *helper, struct strbuf *buffer)
 		die_errno("Full write to remote helper failed");
 }
 
-static int recvline_fh(FILE *helper, struct strbuf *buffer, const char *name)
+static int recvline_fh(FILE *helper, struct strbuf *buffer)
 {
 	strbuf_reset(buffer);
 	if (debug)
@@ -67,7 +67,7 @@ static int recvline_fh(FILE *helper, struct strbuf *buffer, const char *name)
 
 static int recvline(struct helper_data *helper, struct strbuf *buffer)
 {
-	return recvline_fh(helper->out, buffer, helper->name);
+	return recvline_fh(helper->out, buffer);
 }
 
 static void write_constant(int fd, const char *str)
@@ -586,7 +586,7 @@ static int process_connect_service(struct transport *transport,
 		goto exit;
 
 	sendline(data, &cmdbuf);
-	if (recvline_fh(input, &cmdbuf, name))
+	if (recvline_fh(input, &cmdbuf))
 		exit(128);
 
 	if (!strcmp(cmdbuf.buf, "")) {
-- 
2.15.1.620.gb9897f4670-goog



* [PATCH 22/26] transport-helper: refactor process_connect_service
  2018-01-03  0:18 [PATCH 00/26] protocol version 2 Brandon Williams
                   ` (20 preceding siblings ...)
  2018-01-03  0:18 ` [PATCH 21/26] transport-helper: remove name parameter Brandon Williams
@ 2018-01-03  0:18 ` Brandon Williams
  2018-01-03  0:18 ` [PATCH 23/26] transport-helper: introduce connect-half-duplex Brandon Williams
                   ` (5 subsequent siblings)
  27 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-01-03  0:18 UTC (permalink / raw)
  To: git
  Cc: sbeller, gitster, peff, philipoakley, stolee, jrnieder, Brandon Williams

A future patch will need to reuse the logic that runs the connect
command on a remote helper and processes its response, so factor this
logic out of 'process_connect_service()' and place it into a helper
function, 'run_connect()'.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 transport-helper.c | 67 +++++++++++++++++++++++++++++++-----------------------
 1 file changed, 38 insertions(+), 29 deletions(-)

diff --git a/transport-helper.c b/transport-helper.c
index d72155768..c032a2a87 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -545,14 +545,13 @@ static int fetch_with_import(struct transport *transport,
 	return 0;
 }
 
-static int process_connect_service(struct transport *transport,
-				   const char *name, const char *exec)
+static int run_connect(struct transport *transport, struct strbuf *cmdbuf)
 {
 	struct helper_data *data = transport->data;
-	struct strbuf cmdbuf = STRBUF_INIT;
-	struct child_process *helper;
-	int r, duped, ret = 0;
+	int ret = 0;
+	int duped;
 	FILE *input;
+	struct child_process *helper;
 
 	helper = get_helper(transport);
 
@@ -568,44 +567,54 @@ static int process_connect_service(struct transport *transport,
 	input = xfdopen(duped, "r");
 	setvbuf(input, NULL, _IONBF, 0);
 
+	sendline(data, cmdbuf);
+	if (recvline_fh(input, cmdbuf))
+		exit(128);
+
+	if (!strcmp(cmdbuf->buf, "")) {
+		data->no_disconnect_req = 1;
+		if (debug)
+			fprintf(stderr, "Debug: Smart transport connection "
+				"ready.\n");
+		ret = 1;
+	} else if (!strcmp(cmdbuf->buf, "fallback")) {
+		if (debug)
+			fprintf(stderr, "Debug: Falling back to dumb "
+				"transport.\n");
+	} else {
+		die("Unknown response to connect: %s",
+			cmdbuf->buf);
+	}
+
+	fclose(input);
+	return ret;
+}
+
+static int process_connect_service(struct transport *transport,
+				   const char *name, const char *exec)
+{
+	struct helper_data *data = transport->data;
+	struct strbuf cmdbuf = STRBUF_INIT;
+	int ret = 0;
+
 	/*
 	 * Handle --upload-pack and friends. This is fire and forget...
 	 * just warn if it fails.
 	 */
 	if (strcmp(name, exec)) {
-		r = set_helper_option(transport, "servpath", exec);
+		int r = set_helper_option(transport, "servpath", exec);
 		if (r > 0)
 			warning("Setting remote service path not supported by protocol.");
 		else if (r < 0)
 			warning("Invalid remote service path.");
 	}
 
-	if (data->connect)
+	if (data->connect) {
 		strbuf_addf(&cmdbuf, "connect %s\n", name);
-	else
-		goto exit;
-
-	sendline(data, &cmdbuf);
-	if (recvline_fh(input, &cmdbuf))
-		exit(128);
-
-	if (!strcmp(cmdbuf.buf, "")) {
-		data->no_disconnect_req = 1;
-		if (debug)
-			fprintf(stderr, "Debug: Smart transport connection "
-				"ready.\n");
-		ret = 1;
-	} else if (!strcmp(cmdbuf.buf, "fallback")) {
-		if (debug)
-			fprintf(stderr, "Debug: Falling back to dumb "
-				"transport.\n");
-	} else
-		die("Unknown response to connect: %s",
-			cmdbuf.buf);
+		ret = run_connect(transport, &cmdbuf);
+	}
 
-exit:
 	strbuf_release(&cmdbuf);
-	fclose(input);
 	return ret;
 }
 
-- 
2.15.1.620.gb9897f4670-goog



* [PATCH 23/26] transport-helper: introduce connect-half-duplex
  2018-01-03  0:18 [PATCH 00/26] protocol version 2 Brandon Williams
                   ` (21 preceding siblings ...)
  2018-01-03  0:18 ` [PATCH 22/26] transport-helper: refactor process_connect_service Brandon Williams
@ 2018-01-03  0:18 ` Brandon Williams
  2018-01-03  0:18 ` [PATCH 24/26] pkt-line: add packet_buf_write_len function Brandon Williams
                   ` (4 subsequent siblings)
  27 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-01-03  0:18 UTC (permalink / raw)
  To: git
  Cc: sbeller, gitster, peff, philipoakley, stolee, jrnieder, Brandon Williams

Introduce the transport-helper capability 'connect-half-duplex'.  This
capability indicates that the transport-helper can be requested to run
the 'connect-half-duplex' command which should attempt to make a
half-duplex connection with a remote end.  Once established, the
half-duplex connection can be used by the git client to communicate with
the remote end natively in a stateless-rpc manner as supported by
protocol v2.

If a half-duplex connection cannot be established, the remote-helper
responds in the same manner as for the 'connect' command, indicating
that the client should fall back to using the dumb remote-helper commands.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 transport-helper.c | 8 ++++++++
 transport.c        | 1 +
 transport.h        | 6 ++++++
 3 files changed, 15 insertions(+)
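
The reply handling described above can be sketched as follows.  This is
a minimal illustration, not the code in this patch: the enum and both
function names are hypothetical, and the real implementation die()s on
an unknown reply instead of returning a value.

```c
#include <string.h>

/* Hypothetical outcome names, used only for this illustration. */
enum connect_result {
	CONNECT_UNKNOWN = -1,	/* anything else; the real code die()s here */
	CONNECT_FALLBACK = 0,	/* helper answered "fallback" */
	CONNECT_ESTABLISHED = 1	/* helper answered with an empty line */
};

/*
 * Classify the single line a remote helper sends back after 'connect' or
 * 'connect-half-duplex'.  'line' is the reply with its trailing newline
 * already removed.
 */
static enum connect_result classify_connect_reply(const char *line)
{
	if (!strcmp(line, ""))
		return CONNECT_ESTABLISHED;
	if (!strcmp(line, "fallback"))
		return CONNECT_FALLBACK;
	return CONNECT_UNKNOWN;
}

/*
 * Mirror of the new branch in process_connect_service(): stateless-rpc is
 * switched on only when a half-duplex connection was established.
 */
static int wants_stateless_rpc(int used_half_duplex, const char *line)
{
	return used_half_duplex &&
	       classify_connect_reply(line) == CONNECT_ESTABLISHED;
}
```

An empty reply after 'connect-half-duplex' therefore enables
stateless-rpc, while "fallback" leaves the transport in its default mode.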

diff --git a/transport-helper.c b/transport-helper.c
index c032a2a87..d037609bc 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -26,6 +26,7 @@ struct helper_data {
 		option : 1,
 		push : 1,
 		connect : 1,
+		connect_half_duplex : 1,
 		signed_tags : 1,
 		check_connectivity : 1,
 		no_disconnect_req : 1,
@@ -188,6 +189,8 @@ static struct child_process *get_helper(struct transport *transport)
 			refspecs[refspec_nr++] = xstrdup(arg);
 		} else if (!strcmp(capname, "connect")) {
 			data->connect = 1;
+		} else if (!strcmp(capname, "connect-half-duplex")) {
+			data->connect_half_duplex = 1;
 		} else if (!strcmp(capname, "signed-tags")) {
 			data->signed_tags = 1;
 		} else if (skip_prefix(capname, "export-marks ", &arg)) {
@@ -612,6 +615,11 @@ static int process_connect_service(struct transport *transport,
 	if (data->connect) {
 		strbuf_addf(&cmdbuf, "connect %s\n", name);
 		ret = run_connect(transport, &cmdbuf);
+	} else if (data->connect_half_duplex) {
+		strbuf_addf(&cmdbuf, "connect-half-duplex %s\n", name);
+		ret = run_connect(transport, &cmdbuf);
+		if (ret)
+			transport->stateless_rpc = 1;
 	}
 
 	strbuf_release(&cmdbuf);
diff --git a/transport.c b/transport.c
index 4fdbd9adc..aafb8fbb4 100644
--- a/transport.c
+++ b/transport.c
@@ -250,6 +250,7 @@ static int fetch_refs_via_pack(struct transport *transport,
 		data->options.check_self_contained_and_connected;
 	args.cloning = transport->cloning;
 	args.update_shallow = data->options.update_shallow;
+	args.stateless_rpc = transport->stateless_rpc;
 
 	if (!data->got_remote_heads)
 		refs_tmp = get_refs_via_connect(transport, 0, NULL);
diff --git a/transport.h b/transport.h
index 4b656f315..9eac809ee 100644
--- a/transport.h
+++ b/transport.h
@@ -55,6 +55,12 @@ struct transport {
 	 */
 	unsigned cloning : 1;
 
+	/*
+	 * Indicates that the transport is connected via a half-duplex
+	 * connection and should operate in stateless-rpc mode.
+	 */
+	unsigned stateless_rpc : 1;
+
 	/*
 	 * These strings will be passed to the {pre, post}-receive hook,
 	 * on the remote side, if both sides support the push options capability.
-- 
2.15.1.620.gb9897f4670-goog



* [PATCH 24/26] pkt-line: add packet_buf_write_len function
  2018-01-03  0:18 [PATCH 00/26] protocol version 2 Brandon Williams
                   ` (22 preceding siblings ...)
  2018-01-03  0:18 ` [PATCH 23/26] transport-helper: introduce connect-half-duplex Brandon Williams
@ 2018-01-03  0:18 ` Brandon Williams
  2018-01-03  0:18 ` [PATCH 25/26] remote-curl: create copy of the service name Brandon Williams
                   ` (3 subsequent siblings)
  27 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-01-03  0:18 UTC (permalink / raw)
  To: git
  Cc: sbeller, gitster, peff, philipoakley, stolee, jrnieder, Brandon Williams

Add the 'packet_buf_write_len()' function, which writes an
arbitrary-length buffer into a 'struct strbuf', formatting it in
packet-line format.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 pkt-line.c | 16 ++++++++++++++++
 pkt-line.h |  1 +
 2 files changed, 17 insertions(+)
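
For readers unfamiliar with the framing, here is a minimal sketch of
packet-line encoding on a plain char array.  It is not the function
added by this patch (which appends to a 'struct strbuf'); the name
'pkt_line_frame' and the SKETCH_PACKET_MAX constant are hypothetical,
with the constant assumed to match LARGE_PACKET_MAX.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_PACKET_MAX 65520	/* assumed to match LARGE_PACKET_MAX */

/*
 * Frame 'len' bytes of 'data' as one pkt-line in 'out'.  Returns the
 * total number of bytes written (4-byte header plus payload), or 0 if
 * the packet would be too long or 'out' is too small.
 */
static size_t pkt_line_frame(char *out, size_t outsz,
			     const char *data, size_t len)
{
	char hdr[5];
	size_t n = len + 4;	/* the length field counts itself */

	if (n > SKETCH_PACKET_MAX || n > outsz)
		return 0;

	/* four lowercase hex digits, e.g. "000a" for a 6-byte payload */
	snprintf(hdr, sizeof(hdr), "%04x", (unsigned int)n);
	memcpy(out, hdr, 4);
	memcpy(out + 4, data, len);
	return n;
}
```

Framing the 6-byte payload "hello\n" this way yields the 10 bytes
"000ahello\n".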

diff --git a/pkt-line.c b/pkt-line.c
index 3159cbe10..e9968b7df 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -215,6 +215,22 @@ void packet_buf_write(struct strbuf *buf, const char *fmt, ...)
 	va_end(args);
 }
 
+void packet_buf_write_len(struct strbuf *buf, const char *data, size_t len)
+{
+	size_t orig_len, n;
+
+	orig_len = buf->len;
+	strbuf_addstr(buf, "0000");
+	strbuf_add(buf, data, len);
+	n = buf->len - orig_len;
+
+	if (n > LARGE_PACKET_MAX)
+		die("protocol error: impossibly long line");
+
+	set_packet_header(&buf->buf[orig_len], n);
+	packet_trace(buf->buf + orig_len + 4, n - 4, 1);
+}
+
 int write_packetized_from_fd(int fd_in, int fd_out)
 {
 	static char buf[LARGE_PACKET_DATA_MAX];
diff --git a/pkt-line.h b/pkt-line.h
index 97b6dd1c7..d411fcb30 100644
--- a/pkt-line.h
+++ b/pkt-line.h
@@ -26,6 +26,7 @@ void packet_buf_flush(struct strbuf *buf);
 void packet_buf_delim(struct strbuf *buf);
 void packet_write(int fd_out, const char *buf, size_t size);
 void packet_buf_write(struct strbuf *buf, const char *fmt, ...) __attribute__((format (printf, 2, 3)));
+void packet_buf_write_len(struct strbuf *buf, const char *data, size_t len);
 int packet_flush_gently(int fd);
 int packet_write_fmt_gently(int fd, const char *fmt, ...) __attribute__((format (printf, 2, 3)));
 int write_packetized_from_fd(int fd_in, int fd_out);
-- 
2.15.1.620.gb9897f4670-goog



* [PATCH 25/26] remote-curl: create copy of the service name
  2018-01-03  0:18 [PATCH 00/26] protocol version 2 Brandon Williams
                   ` (23 preceding siblings ...)
  2018-01-03  0:18 ` [PATCH 24/26] pkt-line: add packet_buf_write_len function Brandon Williams
@ 2018-01-03  0:18 ` Brandon Williams
  2018-01-03  0:18 ` [PATCH 26/26] remote-curl: implement connect-half-duplex command Brandon Williams
                   ` (2 subsequent siblings)
  27 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-01-03  0:18 UTC (permalink / raw)
  To: git
  Cc: sbeller, gitster, peff, philipoakley, stolee, jrnieder, Brandon Williams

Make a copy of the service name being requested instead of relying on
the buffer pointed to by the passed in 'const char *' to remain
unchanged.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 remote-curl.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)
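
The ownership change can be illustrated with a small standalone sketch.
The struct and function names below are hypothetical stand-ins for
'struct discovery', 'discover_refs()' and 'free_discovery()', and plain
malloc() is used where the patch uses xstrdup().

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for remote-curl's 'struct discovery'. */
struct discovery_sketch {
	char *service;	/* owned copy, released in discovery_sketch_free() */
};

static struct discovery_sketch *discovery_sketch_new(const char *service)
{
	struct discovery_sketch *d = calloc(1, sizeof(*d));
	size_t n;

	if (!d)
		return NULL;
	n = strlen(service) + 1;
	d->service = malloc(n);
	if (!d->service) {
		free(d);
		return NULL;
	}
	/* copy the name so later changes to the caller's buffer are harmless */
	memcpy(d->service, service, n);
	return d;
}

static void discovery_sketch_free(struct discovery_sketch *d)
{
	if (d) {
		free(d->service);
		free(d);
	}
}
```

With the copy in place, the caller may reuse or overwrite its own buffer
without affecting the stored service name.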

diff --git a/remote-curl.c b/remote-curl.c
index dae8a4a48..4086aa733 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -165,7 +165,7 @@ static int set_option(const char *name, const char *value)
 }
 
 struct discovery {
-	const char *service;
+	char *service;
 	char *buf_alloc;
 	char *buf;
 	size_t len;
@@ -257,6 +257,7 @@ static void free_discovery(struct discovery *d)
 		free(d->shallow.oid);
 		free(d->buf_alloc);
 		free_refs(d->refs);
+		free(d->service);
 		free(d);
 	}
 }
@@ -343,7 +344,7 @@ static struct discovery *discover_refs(const char *service, int for_push)
 		warning(_("redirecting to %s"), url.buf);
 
 	last= xcalloc(1, sizeof(*last_discovery));
-	last->service = service;
+	last->service = xstrdup(service);
 	last->buf_alloc = strbuf_detach(&buffer, &last->len);
 	last->buf = last->buf_alloc;
 
-- 
2.15.1.620.gb9897f4670-goog



* [PATCH 26/26] remote-curl: implement connect-half-duplex command
  2018-01-03  0:18 [PATCH 00/26] protocol version 2 Brandon Williams
                   ` (24 preceding siblings ...)
  2018-01-03  0:18 ` [PATCH 25/26] remote-curl: create copy of the service name Brandon Williams
@ 2018-01-03  0:18 ` Brandon Williams
  2018-01-10  0:10   ` Jonathan Tan
  2018-01-10 17:57   ` Jonathan Tan
  2018-01-09 17:55 ` [PATCH 00/26] protocol version 2 Jonathan Tan
  2018-01-25 23:58 ` [PATCH v2 00/27] " Brandon Williams
  27 siblings, 2 replies; 362+ messages in thread
From: Brandon Williams @ 2018-01-03  0:18 UTC (permalink / raw)
  To: git
  Cc: sbeller, gitster, peff, philipoakley, stolee, jrnieder, Brandon Williams

Teach remote-curl the 'connect-half-duplex' command which is used to
establish a half-duplex connection with servers which support protocol
version 2.  This allows remote-curl to act as a proxy, allowing the git
client to communicate natively with a remote end, simply using
remote-curl as a pass-through to convert requests to http.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 remote-curl.c          | 185 ++++++++++++++++++++++++++++++++++++++++++++++++-
 t/t5701-protocol-v2.sh |  41 +++++++++++
 2 files changed, 224 insertions(+), 2 deletions(-)
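
The request side of the proxy relies on libcurl calling the read
callback repeatedly with a bounded buffer until the callback returns 0.
The chunked hand-off at the end of 'proxy_in()' can be sketched in
isolation as below; 'struct drain_state' and 'drain()' are hypothetical
names for this illustration and omit the packet reading and flush
tracking done in the patch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, trimmed-down version of the patch's 'struct proxy_state'. */
struct drain_state {
	const char *buf;	/* bytes of the current request packet */
	size_t len;		/* number of bytes in 'buf' */
	size_t pos;		/* how many bytes have been handed out */
};

/*
 * Copy up to 'max' pending bytes into 'dst' and advance the cursor.
 * Returns the number of bytes copied; 0 means nothing is left.
 */
static size_t drain(struct drain_state *s, char *dst, size_t max)
{
	size_t avail = s->len - s->pos;

	if (max < avail)
		avail = max;
	memcpy(dst, s->buf + s->pos, avail);
	s->pos += avail;
	return avail;
}
```

A 9-byte packet drained through a 4-byte buffer is thus delivered as
4, 4 and 1 bytes, followed by a 0 that marks the end of the data.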

diff --git a/remote-curl.c b/remote-curl.c
index 4086aa733..b63b06398 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -171,6 +171,7 @@ struct discovery {
 	size_t len;
 	struct ref *refs;
 	struct oid_array shallow;
+	enum protocol_version version;
 	unsigned proto_git : 1;
 };
 static struct discovery *last_discovery;
@@ -184,9 +185,13 @@ static struct ref *parse_git_refs(struct discovery *heads, int for_push)
 			   PACKET_READ_CHOMP_NEWLINE |
 			   PACKET_READ_GENTLE_ON_EOF);
 
-	switch (discover_version(&reader)) {
+	heads->version = discover_version(&reader);
+	switch (heads->version) {
 	case protocol_v2:
-		die("support for protocol v2 not implemented yet");
+		/*
+		 * Do nothing.  Client should run 'connect-half-duplex' and
+		 * request the refs themselves.
+		 */
 		break;
 	case protocol_v1:
 	case protocol_v0:
@@ -1047,6 +1052,178 @@ static void parse_push(struct strbuf *buf)
 	free(specs);
 }
 
+struct proxy_state {
+	char *service_name;
+	char *service_url;
+	char *hdr_content_type;
+	char *hdr_accept;
+	struct strbuf request_buffer;
+	int in;
+	int out;
+	struct packet_reader reader;
+	size_t pos;
+	int seen_flush;
+};
+
+static void proxy_state_init(struct proxy_state *p, const char *service_name)
+{
+	struct strbuf buf = STRBUF_INIT;
+
+	memset(p, 0, sizeof(*p));
+	p->service_name = xstrdup(service_name);
+
+	p->in = 0;
+	p->out = 1;
+	strbuf_init(&p->request_buffer, 0);
+
+	strbuf_addf(&buf, "%s%s", url.buf, p->service_name);
+	p->service_url = strbuf_detach(&buf, NULL);
+
+	strbuf_addf(&buf, "Content-Type: application/x-%s-request", p->service_name);
+	p->hdr_content_type = strbuf_detach(&buf, NULL);
+
+	strbuf_addf(&buf, "Accept: application/x-%s-result", p->service_name);
+	p->hdr_accept = strbuf_detach(&buf, NULL);
+
+	packet_reader_init(&p->reader, p->in, NULL, 0,
+			   PACKET_READ_GENTLE_ON_EOF);
+}
+
+static void proxy_state_clear(struct proxy_state *p)
+{
+	free(p->service_name);
+	free(p->service_url);
+	free(p->hdr_content_type);
+	free(p->hdr_accept);
+	strbuf_release(&p->request_buffer);
+}
+
+static size_t proxy_in(void *ptr, size_t eltsize,
+		       size_t nmemb, void *buffer_)
+{
+	size_t max = eltsize * nmemb;
+	struct proxy_state *p = buffer_;
+	size_t avail = p->request_buffer.len - p->pos;
+
+	if (!avail) {
+		if (p->seen_flush) {
+			p->seen_flush = 0;
+			return 0;
+		}
+
+		strbuf_reset(&p->request_buffer);
+		switch (packet_reader_read(&p->reader)) {
+		case PACKET_READ_EOF:
+			die("error reading request from parent process");
+		case PACKET_READ_NORMAL:
+			packet_buf_write_len(&p->request_buffer, p->reader.line,
+					     p->reader.pktlen);
+			break;
+		case PACKET_READ_DELIM:
+			packet_buf_delim(&p->request_buffer);
+			break;
+		case PACKET_READ_FLUSH:
+			packet_buf_flush(&p->request_buffer);
+			p->seen_flush = 1;
+			break;
+		}
+		p->pos = 0;
+		avail = p->request_buffer.len;
+	}
+
+	if (max < avail)
+		avail = max;
+	memcpy(ptr, p->request_buffer.buf + p->pos, avail);
+	p->pos += avail;
+	return avail;
+}
+static size_t proxy_out(char *ptr, size_t eltsize,
+			size_t nmemb, void *buffer_)
+{
+	size_t size = eltsize * nmemb;
+	struct proxy_state *p = buffer_;
+
+	write_or_die(p->out, ptr, size);
+	return size;
+}
+
+static int proxy_post(struct proxy_state *p)
+{
+	struct active_request_slot *slot;
+	struct curl_slist *headers = http_copy_default_headers();
+	int err;
+
+	headers = curl_slist_append(headers, p->hdr_content_type);
+	headers = curl_slist_append(headers, p->hdr_accept);
+	headers = curl_slist_append(headers, "Transfer-Encoding: chunked");
+
+	slot = get_active_slot();
+
+	curl_easy_setopt(slot->curl, CURLOPT_NOBODY, 0);
+	curl_easy_setopt(slot->curl, CURLOPT_POST, 1);
+	curl_easy_setopt(slot->curl, CURLOPT_URL, p->service_url);
+	curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, headers);
+
+	/* Setup function to read request from client */
+	curl_easy_setopt(slot->curl, CURLOPT_READFUNCTION, proxy_in);
+	curl_easy_setopt(slot->curl, CURLOPT_READDATA, p);
+
+	/* Setup function to write server response to client */
+	curl_easy_setopt(slot->curl, CURLOPT_WRITEFUNCTION, proxy_out);
+	curl_easy_setopt(slot->curl, CURLOPT_WRITEDATA, p);
+
+	err = run_slot(slot, NULL);
+
+	if (err != HTTP_OK)
+		err = -1;
+
+	curl_slist_free_all(headers);
+	return err;
+}
+
+static int connect_half_duplex(const char *service_name)
+{
+	struct discovery *discover;
+	struct proxy_state p;
+
+	/*
+	 * Run the info/refs request and see if the server supports protocol
+	 * v2.  If and only if the server supports v2 can we successfully
+	 * establish a half-duplex connection, otherwise we need to tell the
+	 * client to fallback to using other transport helper functions to
+	 * complete their request.
+	 */
+	discover = discover_refs(service_name, 0);
+	if (discover->version != protocol_v2) {
+		printf("fallback\n");
+		fflush(stdout);
+		return -1;
+	} else {
+		/* Half-Duplex Connection established */
+		printf("\n");
+		fflush(stdout);
+	}
+
+	proxy_state_init(&p, service_name);
+
+	/*
+	 * Dump the capability listing that we got from the server earlier
+	 * during the info/refs request.
+	 */
+	write_or_die(p.out, discover->buf, discover->len);
+
+	/* Peek the next packet line.  Until we see EOF keep sending POSTs */
+	while (packet_reader_peek(&p.reader) != PACKET_READ_EOF) {
+		if (proxy_post(&p)) {
+			/* We would have an err here */
+			break;
+		}
+	}
+
+	proxy_state_clear(&p);
+	return 0;
+}
+
 int cmd_main(int argc, const char **argv)
 {
 	struct strbuf buf = STRBUF_INIT;
@@ -1115,12 +1292,16 @@ int cmd_main(int argc, const char **argv)
 			fflush(stdout);
 
 		} else if (!strcmp(buf.buf, "capabilities")) {
+			printf("connect-half-duplex\n");
 			printf("fetch\n");
 			printf("option\n");
 			printf("push\n");
 			printf("check-connectivity\n");
 			printf("\n");
 			fflush(stdout);
+		} else if (skip_prefix(buf.buf, "connect-half-duplex ", &arg)) {
+			if (!connect_half_duplex(arg))
+				break;
 		} else {
 			error("remote-curl: unknown command '%s' from git", buf.buf);
 			return 1;
diff --git a/t/t5701-protocol-v2.sh b/t/t5701-protocol-v2.sh
index 3e411e178..ada69ac09 100755
--- a/t/t5701-protocol-v2.sh
+++ b/t/t5701-protocol-v2.sh
@@ -73,4 +73,45 @@ test_expect_success 'ref advertisment is filtered during fetch using protocol v2
 	! grep "refs/tags/" log
 '
 
+# Test protocol v2 with 'http://' transport
+#
+. "$TEST_DIRECTORY"/lib-httpd.sh
+start_httpd
+
+test_expect_success 'create repo to be served by http:// transport' '
+	git init "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" config http.receivepack true &&
+	test_commit -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" one
+'
+
+test_expect_success 'clone with http:// using protocol v2' '
+	GIT_TRACE_PACKET=1 GIT_TRACE_CURL=1 git -c protocol.version=2 \
+		clone "$HTTPD_URL/smart/http_parent" http_child 2>log &&
+
+	git -C http_child log -1 --format=%s >actual &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+
+	# Client requested to use protocol v2
+	grep "Git-Protocol: version=2" log &&
+	# Server responded using protocol v2
+	grep "git< version 2" log
+'
+
+test_expect_success 'fetch with http:// using protocol v2' '
+	test_commit -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" two &&
+
+	GIT_TRACE_PACKET=1 git -C http_child -c protocol.version=2 \
+		fetch 2>log &&
+
+	git -C http_child log -1 --format=%s origin/master >actual &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+
+	# Server responded using protocol v2
+	grep "git< version 2" log
+'
+
+stop_httpd
+
 test_done
-- 
2.15.1.620.gb9897f4670-goog



* Re: [PATCH 01/26] pkt-line: introduce packet_read_with_status
  2018-01-03  0:18 ` [PATCH 01/26] pkt-line: introduce packet_read_with_status Brandon Williams
@ 2018-01-03 19:27   ` Stefan Beller
  2018-01-05 23:41     ` Brandon Williams
  2018-01-09 18:04   ` Jonathan Tan
  1 sibling, 1 reply; 362+ messages in thread
From: Stefan Beller @ 2018-01-03 19:27 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, Junio C Hamano, Jeff King, Philip Oakley, Derrick Stolee,
	Jonathan Nieder

On Tue, Jan 2, 2018 at 4:18 PM, Brandon Williams <bmwill@google.com> wrote:
> The current pkt-line API encodes the status of a pkt-line read in the
> length of the read content.  An error is indicated with '-1', a flush
> with '0' (which can be confusing since a return value of '0' can also
> indicate an empty pkt-line), and a positive integer for the length of
> the read content otherwise.  This doesn't leave much room for allowing
> the addition of additional special packets in the future.
>
> To solve this introduce 'packet_read_with_status()' which reads a packet
> and returns the status of the read encoded as an 'enum packet_status'
> type.  This allows for easily identifying between special and normal
> packets as well as errors.  It also enables easily adding a new special
> packet in the future.
>
> Signed-off-by: Brandon Williams <bmwill@google.com>
> ---
>  pkt-line.c | 55 ++++++++++++++++++++++++++++++++++++++++++-------------
>  pkt-line.h | 15 +++++++++++++++
>  2 files changed, 57 insertions(+), 13 deletions(-)
>
> diff --git a/pkt-line.c b/pkt-line.c
> index 2827ca772..8d7cd389f 100644
> --- a/pkt-line.c
> +++ b/pkt-line.c
> @@ -280,28 +280,33 @@ static int packet_length(const char *linelen)
>         return (val < 0) ? val : (val << 8) | hex2chr(linelen + 2);
>  }
>
> -int packet_read(int fd, char **src_buf, size_t *src_len,
> -               char *buffer, unsigned size, int options)
> +enum packet_read_status packet_read_with_status(int fd, char **src_buffer, size_t *src_len,
> +                                               char *buffer, unsigned size, int *pktlen,
> +                                               int options)
>  {
> -       int len, ret;
> +       int len;
>         char linelen[4];
>
> -       ret = get_packet_data(fd, src_buf, src_len, linelen, 4, options);
> -       if (ret < 0)
> -               return ret;
> +       if (get_packet_data(fd, src_buffer, src_len, linelen, 4, options) < 0)
> +               return PACKET_READ_EOF;
> +
>         len = packet_length(linelen);
>         if (len < 0)
>                 die("protocol error: bad line length character: %.4s", linelen);
> -       if (!len) {
> +
> +       if (len == 0) {
>                 packet_trace("0000", 4, 0);
> -               return 0;
> +               return PACKET_READ_FLUSH;
> +       } else if (len >= 1 && len <= 3) {
> +               die("protocol error: bad line length character: %.4s", linelen);

I wonder how much libified code we want here already, maybe we could
have PACKET_READ_ERROR as a return value here instead of die()ing.
There could also be an option that tells this code to die on error, this reminds
me of the repository discovery as well as the refs code, both of which have
this pattern.

Currently this series is only upgrading commands that use the network
anyway, so I guess die()ing in an ls-remote or fetch is no big deal,
but it could be interesting to keep going once we have more of the
partial clone stuff working (e.g. remote assisted log/blame would want
to gracefully fall back instead of die()ing without any useful output,
I would think.)
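
A minimal sketch of that idea, decoding only the 4-byte length header
and reporting malformed input as a status instead of die()ing.  All
names here are hypothetical (SKETCH_READ_ERROR stands in for the
suggested PACKET_READ_ERROR), and this is not code from the series.

```c
#include <stddef.h>

/* SKETCH_READ_ERROR plays the role of the suggested PACKET_READ_ERROR. */
enum sketch_read_status {
	SKETCH_READ_ERROR = -2,
	SKETCH_READ_EOF = -1,
	SKETCH_READ_NORMAL,
	SKETCH_READ_FLUSH
};

static int hexval(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/*
 * Decode the 4-byte pkt-line length header found in 'hdr', of which
 * 'avail' bytes are readable.  On SKETCH_READ_NORMAL the payload length
 * (header excluded) is stored in *payload_len.
 */
static enum sketch_read_status classify_header(const char *hdr, size_t avail,
					       int *payload_len)
{
	int len = 0, i;

	if (avail < 4)
		return SKETCH_READ_EOF;
	for (i = 0; i < 4; i++) {
		int v = hexval(hdr[i]);
		if (v < 0)
			return SKETCH_READ_ERROR;
		len = (len << 4) | v;
	}
	if (len == 0)
		return SKETCH_READ_FLUSH;
	if (len < 4)
		return SKETCH_READ_ERROR;	/* 0001..0003 are invalid here */
	*payload_len = len - 4;
	return SKETCH_READ_NORMAL;
}
```

A caller could then decide per command whether an error status should
die() or trigger a graceful fallback.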

>         }
> +
>         len -= 4;
> -       if (len >= size)
> +       if ((len < 0) || ((unsigned)len >= size))
>                 die("protocol error: bad line length %d", len);
> -       ret = get_packet_data(fd, src_buf, src_len, buffer, len, options);
> -       if (ret < 0)
> -               return ret;
> +
> +       if (get_packet_data(fd, src_buffer, src_len, buffer, len, options) < 0)
> +               return PACKET_READ_EOF;
>
>         if ((options & PACKET_READ_CHOMP_NEWLINE) &&
>             len && buffer[len-1] == '\n')
> @@ -309,7 +314,31 @@ int packet_read(int fd, char **src_buf, size_t *src_len,
>
>         buffer[len] = 0;
>         packet_trace(buffer, len, 0);
> -       return len;
> +       *pktlen = len;
> +       return PACKET_READ_NORMAL;
> +}
> +
> +int packet_read(int fd, char **src_buffer, size_t *src_len,
> +               char *buffer, unsigned size, int options)
> +{
> +       enum packet_read_status status;
> +       int pktlen;
> +
> +       status = packet_read_with_status(fd, src_buffer, src_len,
> +                                        buffer, size, &pktlen,
> +                                        options);
> +       switch (status) {
> +       case PACKET_READ_EOF:
> +               pktlen = -1;
> +               break;
> +       case PACKET_READ_NORMAL:
> +               break;
> +       case PACKET_READ_FLUSH:
> +               pktlen = 0;
> +               break;
> +       }
> +
> +       return pktlen;
>  }
>
>  static char *packet_read_line_generic(int fd,
> diff --git a/pkt-line.h b/pkt-line.h
> index 3dad583e2..06c468927 100644
> --- a/pkt-line.h
> +++ b/pkt-line.h
> @@ -65,6 +65,21 @@ int write_packetized_from_buf(const char *src_in, size_t len, int fd_out);
>  int packet_read(int fd, char **src_buffer, size_t *src_len, char
>                 *buffer, unsigned size, int options);
>
> +/*
> + * Read a packetized line into a buffer like the 'packet_read()' function but
> + * returns an 'enum packet_read_status' which indicates the status of the read.
> + * The number of bytes read will be assigined to *pktlen if the status of the
> + * read was 'PACKET_READ_NORMAL'.
> + */
> +enum packet_read_status {
> +       PACKET_READ_EOF = -1,
> +       PACKET_READ_NORMAL,
> +       PACKET_READ_FLUSH,
> +};
> +enum packet_read_status packet_read_with_status(int fd, char **src_buffer, size_t *src_len,
> +                                               char *buffer, unsigned size, int *pktlen,
> +                                               int options);
> +
>  /*
>   * Convenience wrapper for packet_read that is not gentle, and sets the
>   * CHOMP_NEWLINE option. The return value is NULL for a flush packet,
> --
> 2.15.1.620.gb9897f4670-goog
>


* Re: [PATCH 04/26] upload-pack: convert to a builtin
  2018-01-03  0:18 ` [PATCH 04/26] upload-pack: convert to a builtin Brandon Williams
@ 2018-01-03 20:33   ` Stefan Beller
  2018-01-03 20:39     ` Brandon Williams
  0 siblings, 1 reply; 362+ messages in thread
From: Stefan Beller @ 2018-01-03 20:33 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, Junio C Hamano, Jeff King, Philip Oakley, Derrick Stolee,
	Jonathan Nieder

On Tue, Jan 2, 2018 at 4:18 PM, Brandon Williams <bmwill@google.com> wrote:
> In order to allow for code sharing with the server-side of fetch in
> protocol-v2 convert upload-pack to be a builtin.

What is the security aspect of this patch?

By making upload-pack builtin, it gains additional abilities,
such as answers to '-h' or '--help' (which would start a pager).
Is there an easy way to soothe my concerns? (best put into the
commit message)

Thanks,
Stefan

>
> Signed-off-by: Brandon Williams <bmwill@google.com>
> ---
>  Makefile      | 3 ++-
>  builtin.h     | 1 +
>  git.c         | 1 +
>  upload-pack.c | 2 +-
>  4 files changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/Makefile b/Makefile
> index 2a81ae22e..e0740b452 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -636,7 +636,6 @@ PROGRAM_OBJS += imap-send.o
>  PROGRAM_OBJS += sh-i18n--envsubst.o
>  PROGRAM_OBJS += shell.o
>  PROGRAM_OBJS += show-index.o
> -PROGRAM_OBJS += upload-pack.o
>  PROGRAM_OBJS += remote-testsvn.o
>
>  # Binary suffix, set to .exe for Windows builds
> @@ -701,6 +700,7 @@ BUILT_INS += git-merge-subtree$X
>  BUILT_INS += git-show$X
>  BUILT_INS += git-stage$X
>  BUILT_INS += git-status$X
> +BUILT_INS += git-upload-pack$X
>  BUILT_INS += git-whatchanged$X
>
>  # what 'all' will build and 'install' will install in gitexecdir,
> @@ -904,6 +904,7 @@ LIB_OBJS += tree-diff.o
>  LIB_OBJS += tree.o
>  LIB_OBJS += tree-walk.o
>  LIB_OBJS += unpack-trees.o
> +LIB_OBJS += upload-pack.o
>  LIB_OBJS += url.o
>  LIB_OBJS += urlmatch.o
>  LIB_OBJS += usage.o
> diff --git a/builtin.h b/builtin.h
> index 42378f3aa..f332a1257 100644
> --- a/builtin.h
> +++ b/builtin.h
> @@ -231,6 +231,7 @@ extern int cmd_update_ref(int argc, const char **argv, const char *prefix);
>  extern int cmd_update_server_info(int argc, const char **argv, const char *prefix);
>  extern int cmd_upload_archive(int argc, const char **argv, const char *prefix);
>  extern int cmd_upload_archive_writer(int argc, const char **argv, const char *prefix);
> +extern int cmd_upload_pack(int argc, const char **argv, const char *prefix);
>  extern int cmd_var(int argc, const char **argv, const char *prefix);
>  extern int cmd_verify_commit(int argc, const char **argv, const char *prefix);
>  extern int cmd_verify_tag(int argc, const char **argv, const char *prefix);
> diff --git a/git.c b/git.c
> index c870b9719..f71073dc8 100644
> --- a/git.c
> +++ b/git.c
> @@ -478,6 +478,7 @@ static struct cmd_struct commands[] = {
>         { "update-server-info", cmd_update_server_info, RUN_SETUP },
>         { "upload-archive", cmd_upload_archive },
>         { "upload-archive--writer", cmd_upload_archive_writer },
> +       { "upload-pack", cmd_upload_pack },
>         { "var", cmd_var, RUN_SETUP_GENTLY },
>         { "verify-commit", cmd_verify_commit, RUN_SETUP },
>         { "verify-pack", cmd_verify_pack },
> diff --git a/upload-pack.c b/upload-pack.c
> index d5de18127..20acaa49d 100644
> --- a/upload-pack.c
> +++ b/upload-pack.c
> @@ -1032,7 +1032,7 @@ static int upload_pack_config(const char *var, const char *value, void *unused)
>         return parse_hide_refs_config(var, value, "uploadpack");
>  }
>
> -int cmd_main(int argc, const char **argv)
> +int cmd_upload_pack(int argc, const char **argv, const char *prefix)
>  {
>         const char *dir;
>         int strict = 0;
> --
> 2.15.1.620.gb9897f4670-goog
>


* Re: [PATCH 05/26] upload-pack: factor out processing lines
  2018-01-03  0:18 ` [PATCH 05/26] upload-pack: factor out processing lines Brandon Williams
@ 2018-01-03 20:38   ` Stefan Beller
  0 siblings, 0 replies; 362+ messages in thread
From: Stefan Beller @ 2018-01-03 20:38 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, Junio C Hamano, Jeff King, Philip Oakley, Derrick Stolee,
	Jonathan Nieder

On Tue, Jan 2, 2018 at 4:18 PM, Brandon Williams <bmwill@google.com> wrote:
> Factor out the logic for processing shallow, deepen, deepen_since, and
> deepen_not lines into their own functions to simplify the
> 'receive_needs()' function in addition to making it easier to reuse some
> of this logic when implementing protocol_v2.
>
> Signed-off-by: Brandon Williams <bmwill@google.com>
> ---
>  upload-pack.c | 113 ++++++++++++++++++++++++++++++++++++++--------------------
>  1 file changed, 74 insertions(+), 39 deletions(-)
>
> diff --git a/upload-pack.c b/upload-pack.c
> index 20acaa49d..9a507ae53 100644
> --- a/upload-pack.c
> +++ b/upload-pack.c
> @@ -731,6 +731,75 @@ static void deepen_by_rev_list(int ac, const char **av,
>         packet_flush(1);
>  }
>
> +static int process_shallow(const char *line, struct object_array *shallows)
> +{
> +       const char *arg;
> +       if (skip_prefix(line, "shallow ", &arg)) {

stylistic nit:

    You could invert the condition in each of the process_* functions
    to just have

        if (!skip_prefix(...))
            return 0;

        /* less indented code goes here */

        return 1;

    That way we have less indentation as well as code that is easier to follow.
    (The reader doesn't need to keep in mind what the else
    part is about; it is a rather local decision to bail out instead
    of having the return at the end of the function.)


> +               struct object_id oid;
> +               struct object *object;
> +               if (get_oid_hex(arg, &oid))
> +                       die("invalid shallow line: %s", line);
> +               object = parse_object(&oid);
> +               if (!object)
> +                       return 1;
> +               if (object->type != OBJ_COMMIT)
> +                       die("invalid shallow object %s", oid_to_hex(&oid));
> +               if (!(object->flags & CLIENT_SHALLOW)) {
> +                       object->flags |= CLIENT_SHALLOW;
> +                       add_object_array(object, NULL, shallows);
> +               }
> +               return 1;
> +       }
> +
> +       return 0;
> +}
> +
> +static int process_deepen(const char *line, int *depth)
> +{
> +       const char *arg;
> +       if (skip_prefix(line, "deepen ", &arg)) {
> +               char *end = NULL;
> +               *depth = strtol(arg, &end, 0);
> +               if (!end || *end || *depth <= 0)
> +                       die("Invalid deepen: %s", line);
> +               return 1;
> +       }
> +
> +       return 0;
> +}
> +
> +static int process_deepen_since(const char *line, timestamp_t *deepen_since, int *deepen_rev_list)
> +{
> +       const char *arg;
> +       if (skip_prefix(line, "deepen-since ", &arg)) {
> +               char *end = NULL;
> +               *deepen_since = parse_timestamp(arg, &end, 0);
> +               if (!end || *end || !*deepen_since ||
> +                   /* revisions.c's max_age -1 is special */
> +                   *deepen_since == -1)
> +                       die("Invalid deepen-since: %s", line);
> +               *deepen_rev_list = 1;
> +               return 1;
> +       }
> +       return 0;
> +}
> +
> +static int process_deepen_not(const char *line, struct string_list *deepen_not, int *deepen_rev_list)
> +{
> +       const char *arg;
> +       if (skip_prefix(line, "deepen-not ", &arg)) {
> +               char *ref = NULL;
> +               struct object_id oid;
> +               if (expand_ref(arg, strlen(arg), &oid, &ref) != 1)
> +                       die("git upload-pack: ambiguous deepen-not: %s", line);
> +               string_list_append(deepen_not, ref);
> +               free(ref);
> +               *deepen_rev_list = 1;
> +               return 1;
> +       }
> +       return 0;
> +}
> +
>  static void receive_needs(void)
>  {
>         struct object_array shallows = OBJECT_ARRAY_INIT;
> @@ -752,49 +821,15 @@ static void receive_needs(void)
>                 if (!line)
>                         break;
>
> -               if (skip_prefix(line, "shallow ", &arg)) {
> -                       struct object_id oid;
> -                       struct object *object;
> -                       if (get_oid_hex(arg, &oid))
> -                               die("invalid shallow line: %s", line);
> -                       object = parse_object(&oid);
> -                       if (!object)
> -                               continue;
> -                       if (object->type != OBJ_COMMIT)
> -                               die("invalid shallow object %s", oid_to_hex(&oid));
> -                       if (!(object->flags & CLIENT_SHALLOW)) {
> -                               object->flags |= CLIENT_SHALLOW;
> -                               add_object_array(object, NULL, &shallows);
> -                       }
> +               if (process_shallow(line, &shallows))
>                         continue;
> -               }
> -               if (skip_prefix(line, "deepen ", &arg)) {
> -                       char *end = NULL;
> -                       depth = strtol(arg, &end, 0);
> -                       if (!end || *end || depth <= 0)
> -                               die("Invalid deepen: %s", line);
> +               if (process_deepen(line, &depth))
>                         continue;
> -               }
> -               if (skip_prefix(line, "deepen-since ", &arg)) {
> -                       char *end = NULL;
> -                       deepen_since = parse_timestamp(arg, &end, 0);
> -                       if (!end || *end || !deepen_since ||
> -                           /* revisions.c's max_age -1 is special */
> -                           deepen_since == -1)
> -                               die("Invalid deepen-since: %s", line);
> -                       deepen_rev_list = 1;
> +               if (process_deepen_since(line, &deepen_since, &deepen_rev_list))
>                         continue;
> -               }
> -               if (skip_prefix(line, "deepen-not ", &arg)) {
> -                       char *ref = NULL;
> -                       struct object_id oid;
> -                       if (expand_ref(arg, strlen(arg), &oid, &ref) != 1)
> -                               die("git upload-pack: ambiguous deepen-not: %s", line);
> -                       string_list_append(&deepen_not, ref);
> -                       free(ref);
> -                       deepen_rev_list = 1;
> +               if (process_deepen_not(line, &deepen_not, &deepen_rev_list))
>                         continue;
> -               }
> +
>                 if (!skip_prefix(line, "want ", &arg) ||
>                     get_oid_hex(arg, &oid_buf))
>                         die("git upload-pack: protocol error, "
> --
> 2.15.1.620.gb9897f4670-goog
>


* Re: [PATCH 04/26] upload-pack: convert to a builtin
  2018-01-03 20:33   ` Stefan Beller
@ 2018-01-03 20:39     ` Brandon Williams
  2018-02-21 21:47       ` Jonathan Nieder
  0 siblings, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-01-03 20:39 UTC (permalink / raw)
  To: Stefan Beller
  Cc: git, Junio C Hamano, Jeff King, Philip Oakley, Derrick Stolee,
	Jonathan Nieder

On 01/03, Stefan Beller wrote:
> On Tue, Jan 2, 2018 at 4:18 PM, Brandon Williams <bmwill@google.com> wrote:
> > In order to allow for code sharing with the server-side of fetch in
> > protocol-v2 convert upload-pack to be a builtin.
> 
> What is the security aspect of this patch?
> 
> By making upload-pack builtin, it gains additional abilities,
> such as answers to '-h' or '--help' (which would start a pager).
> Is there an easy way to soothe my concerns? (best put into the
> commit message)

receive-pack is already a builtin, so there's that.

> 
> Thanks,
> Stefan
> 

-- 
Brandon Williams


* Re: [PATCH 06/26] transport: use get_refs_via_connect to get refs
  2018-01-03  0:18 ` [PATCH 06/26] transport: use get_refs_via_connect to get refs Brandon Williams
@ 2018-01-03 21:20   ` Stefan Beller
  0 siblings, 0 replies; 362+ messages in thread
From: Stefan Beller @ 2018-01-03 21:20 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, Junio C Hamano, Jeff King, Philip Oakley, Derrick Stolee,
	Jonathan Nieder

On Tue, Jan 2, 2018 at 4:18 PM, Brandon Williams <bmwill@google.com> wrote:
> Remove code duplication and use the existing 'get_refs_via_connect()'
> function to retrieve a remote's heads in 'fetch_refs_via_pack()' and
> 'git_transport_push()'.
>
> Signed-off-by: Brandon Williams <bmwill@google.com>

Reviewed-by: Stefan Beller <sbeller@google.com>

> ---
>  transport.c | 18 ++++--------------
>  1 file changed, 4 insertions(+), 14 deletions(-)
>
> diff --git a/transport.c b/transport.c
> index fc802260f..8e8779096 100644
> --- a/transport.c
> +++ b/transport.c
> @@ -230,12 +230,8 @@ static int fetch_refs_via_pack(struct transport *transport,
>         args.cloning = transport->cloning;
>         args.update_shallow = data->options.update_shallow;
>
> -       if (!data->got_remote_heads) {
> -               connect_setup(transport, 0);
> -               get_remote_heads(data->fd[0], NULL, 0, &refs_tmp, 0,
> -                                NULL, &data->shallow);
> -               data->got_remote_heads = 1;
> -       }
> +       if (!data->got_remote_heads)
> +               refs_tmp = get_refs_via_connect(transport, 0);
>
>         refs = fetch_pack(&args, data->fd, data->conn,
>                           refs_tmp ? refs_tmp : transport->remote_refs,
> @@ -541,14 +537,8 @@ static int git_transport_push(struct transport *transport, struct ref *remote_re
>         struct send_pack_args args;
>         int ret;
>
> -       if (!data->got_remote_heads) {
> -               struct ref *tmp_refs;
> -               connect_setup(transport, 1);
> -
> -               get_remote_heads(data->fd[0], NULL, 0, &tmp_refs, REF_NORMAL,
> -                                NULL, &data->shallow);
> -               data->got_remote_heads = 1;
> -       }
> +       if (!data->got_remote_heads)
> +               get_refs_via_connect(transport, 1);
>
>         memset(&args, 0, sizeof(args));
>         args.send_mirror = !!(flags & TRANSPORT_PUSH_MIRROR);
> --
> 2.15.1.620.gb9897f4670-goog
>


* Re: [PATCH 12/26] ls-refs: introduce ls-refs server command
  2018-01-03  0:18 ` [PATCH 12/26] ls-refs: introduce ls-refs server command Brandon Williams
@ 2018-01-04  0:17   ` Stefan Beller
  2018-01-05 23:49     ` Brandon Williams
  2018-01-09 20:50   ` Jonathan Tan
  2018-02-01 19:16   ` Jeff Hostetler
  2 siblings, 1 reply; 362+ messages in thread
From: Stefan Beller @ 2018-01-04  0:17 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, Junio C Hamano, Jeff King, Philip Oakley, Derrick Stolee,
	Jonathan Nieder

On Tue, Jan 2, 2018 at 4:18 PM, Brandon Williams <bmwill@google.com> wrote:
> Introduce the ls-refs server command.  In protocol v2, the ls-refs
> command is used to request the ref advertisement from the server.  Since
> it is a command which can be requested (as opposed to mandatory in v1),
> a client can send a number of parameters in its request to limit the ref
> advertisement based on provided ref-patterns.
>
> Signed-off-by: Brandon Williams <bmwill@google.com>
> ---
>  Documentation/technical/protocol-v2.txt | 26 +++++++++
>  Makefile                                |  1 +
>  ls-refs.c                               | 97 +++++++++++++++++++++++++++++++++
>  ls-refs.h                               |  9 +++

Maybe consider putting any served command into a sub directory?

For example the code in builtin/ has laxer rules w.r.t. die()ing
as it is a user facing command, whereas some devs want to see
code at the root of the repo to not die() at all as the eventual goal
is to have a library there.
All this code is on the remote side, which also has different traits than
the code at the root of the git.git repo; non-localisation comes to mind,
but there might be other aspects as well (security?).


>  serve.c                                 |  2 +
>  5 files changed, 135 insertions(+)
>  create mode 100644 ls-refs.c
>  create mode 100644 ls-refs.h
>
> diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
> index b87ba3816..5f4d0e719 100644
> --- a/Documentation/technical/protocol-v2.txt
> +++ b/Documentation/technical/protocol-v2.txt
> @@ -89,3 +89,29 @@ terminate the connection.
>  Commands are the core actions that a client wants to perform (fetch, push,
>  etc).  Each command will be provided with a list capabilities and
>  arguments as requested by a client.
> +
> + Ls-refs

So is it ls-refs or Ls-refs or is any capitalization valid?

> +---------
> +
> +Ls-refs is the command used to request a reference advertisement in v2.
> +Unlike the current reference advertisement, ls-refs takes in parameters
> +which can be used to limit the refs sent from the server.
> +
> +Ls-refs takes in the following parameters wrapped in packet-lines:
> +
> +  symrefs: In addition to the object pointed by it, show the underlying
> +          ref pointed by it when showing a symbolic ref.
> +  peel: Show peeled tags.
> +  ref-pattern <pattern>: When specified, only references matching the
> +                        given patterns are displayed.

What kind of pattern matching is allowed here?
strictly prefix only, or globbing, regexes?
Is there a given grammar to follow? Maybe a link to the git
glossary or somewhere else would be fine.

Seeing that we do wildmatch() down there (as opposed to regexes),
I wonder if it provides an entry for a denial of service attack, by crafting
a pattern that is very expensive for the server to compute but cheap to
ask for from a client. (cf. 94da9193a6 (grep: add support for PCRE v2,
2017-06-01), but that is regexes!)

> +The output of ls-refs is as follows:
> +
> +    output = *ref
> +            flush-pkt
> +    ref = PKT-LINE((tip | peeled) LF)
> +    tip = obj-id SP refname (SP symref-target)
> +    peeled = obj-id SP refname "^{}"
> +
> +    symref = PKT-LINE("symref" SP symbolic-ref SP resolved-ref LF)
> +    shallow = PKT-LINE("shallow" SP obj-id LF)
> diff --git a/Makefile b/Makefile
> index 5f3b5fe8b..152a73bec 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -820,6 +820,7 @@ LIB_OBJS += list-objects-filter-options.o
>  LIB_OBJS += ll-merge.o
>  LIB_OBJS += lockfile.o
>  LIB_OBJS += log-tree.o
> +LIB_OBJS += ls-refs.o
>  LIB_OBJS += mailinfo.o
>  LIB_OBJS += mailmap.o
>  LIB_OBJS += match-trees.o
> diff --git a/ls-refs.c b/ls-refs.c
> new file mode 100644
> index 000000000..ac4904a40
> --- /dev/null
> +++ b/ls-refs.c
> @@ -0,0 +1,97 @@
> +#include "cache.h"
> +#include "repository.h"
> +#include "refs.h"
> +#include "remote.h"
> +#include "argv-array.h"
> +#include "ls-refs.h"
> +#include "pkt-line.h"
> +
> +struct ls_refs_data {
> +       unsigned peel;
> +       unsigned symrefs;
> +       struct argv_array patterns;
> +};
> +
> +/*
> + * Check if one of the patterns matches the tail part of the ref.
> + * If no patterns were provided, all refs match.
> + */
> +static int ref_match(const struct argv_array *patterns, const char *refname)
> +{
> +       char *pathbuf;
> +       int i;
> +
> +       if (!patterns->argc)
> +               return 1; /* no restriction */
> +
> +       pathbuf = xstrfmt("/%s", refname);
> +       for (i = 0; i < patterns->argc; i++) {
> +               if (!wildmatch(patterns->argv[i], pathbuf, 0)) {
> +                       free(pathbuf);
> +                       return 1;
> +               }
> +       }
> +       free(pathbuf);
> +       return 0;
> +}
> +
> +static int send_ref(const char *refname, const struct object_id *oid,
> +                   int flag, void *cb_data)
> +{
> +       struct ls_refs_data *data = cb_data;
> +       const char *refname_nons = strip_namespace(refname);
> +       struct strbuf refline = STRBUF_INIT;
> +
> +       if (!ref_match(&data->patterns, refname))
> +               return 0;
> +
> +       strbuf_addf(&refline, "%s %s", oid_to_hex(oid), refname_nons);
> +       if (data->symrefs && flag & REF_ISSYMREF) {
> +               struct object_id unused;
> +               const char *symref_target = resolve_ref_unsafe(refname, 0,
> +                                                              &unused,
> +                                                              &flag);
> +
> +               if (!symref_target)
> +                       die("'%s' is a symref but it is not?", refname);
> +
> +               strbuf_addf(&refline, " %s", symref_target);
> +       }
> +
> +       strbuf_addch(&refline, '\n');
> +
> +       packet_write(1, refline.buf, refline.len);
> +       if (data->peel) {
> +               struct object_id peeled;
> +               if (!peel_ref(refname, &peeled))
> +                       packet_write_fmt(1, "%s %s^{}\n", oid_to_hex(&peeled),
> +                                        refname_nons);
> +       }
> +
> +       strbuf_release(&refline);
> +       return 0;
> +}
> +
> +int ls_refs(struct repository *r, struct argv_array *keys, struct argv_array *args)
> +{
> +       int i;
> +       struct ls_refs_data data = { 0, 0, ARGV_ARRAY_INIT };
> +
> +       for (i = 0; i < args->argc; i++) {
> +               const char *arg = args->argv[i];
> +               const char *out;
> +
> +               if (!strcmp("peel", arg))
> +                       data.peel = 1;
> +               else if (!strcmp("symrefs", arg))
> +                       data.symrefs = 1;
> +               else if (skip_prefix(arg, "ref-pattern ", &out))
> +                       argv_array_pushf(&data.patterns, "*/%s", out);
> +       }
> +
> +       head_ref_namespaced(send_ref, &data);
> +       for_each_namespaced_ref(send_ref, &data);
> +       packet_flush(1);
> +       argv_array_clear(&data.patterns);
> +       return 0;
> +}
> diff --git a/ls-refs.h b/ls-refs.h
> new file mode 100644
> index 000000000..9e4c57bfe
> --- /dev/null
> +++ b/ls-refs.h
> @@ -0,0 +1,9 @@
> +#ifndef LS_REFS_H
> +#define LS_REFS_H
> +
> +struct repository;
> +struct argv_array;
> +extern int ls_refs(struct repository *r, struct argv_array *keys,
> +                  struct argv_array *args);
> +
> +#endif /* LS_REFS_H */
> diff --git a/serve.c b/serve.c
> index da8127775..88d548410 100644
> --- a/serve.c
> +++ b/serve.c
> @@ -4,6 +4,7 @@
>  #include "pkt-line.h"
>  #include "version.h"
>  #include "argv-array.h"
> +#include "ls-refs.h"
>  #include "serve.h"
>
>  static int always_advertise(struct repository *r,
> @@ -44,6 +45,7 @@ struct protocol_capability {
>  static struct protocol_capability capabilities[] = {
>         { "agent", agent_advertise, NULL },
>         { "stateless-rpc", always_advertise, NULL },
> +       { "ls-refs", always_advertise, ls_refs },
>  };
>
>  static void advertise_capabilities(void)
> --
> 2.15.1.620.gb9897f4670-goog
>


* Re: [PATCH 19/26] upload-pack: introduce fetch server command
  2018-01-03  0:18 ` [PATCH 19/26] upload-pack: introduce fetch server command Brandon Williams
@ 2018-01-04  1:07   ` Stefan Beller
  0 siblings, 0 replies; 362+ messages in thread
From: Stefan Beller @ 2018-01-04  1:07 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, Junio C Hamano, Jeff King, Philip Oakley, Derrick Stolee,
	Jonathan Nieder

On Tue, Jan 2, 2018 at 4:18 PM, Brandon Williams <bmwill@google.com> wrote:
> Introduce the 'fetch' server command.
>
> Signed-off-by: Brandon Williams <bmwill@google.com>
> ---
>  Documentation/technical/protocol-v2.txt |  14 ++
>  serve.c                                 |   2 +
>  upload-pack.c                           | 290 ++++++++++++++++++++++++++++++++
>  upload-pack.h                           |   9 +
>  4 files changed, 315 insertions(+)
>  create mode 100644 upload-pack.h
>
> diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
> index 5f4d0e719..2a8e2f226 100644
> --- a/Documentation/technical/protocol-v2.txt
> +++ b/Documentation/technical/protocol-v2.txt
> @@ -115,3 +115,17 @@ The output of ls-refs is as follows:
>
>      symref = PKT-LINE("symref" SP symbolic-ref SP resolved-ref LF)
>      shallow = PKT-LINE("shallow" SP obj-id LF)
> +
> + Fetch
> +-------
> +
> +Fetch will need to be a modified version of the v1 fetch protocol.  Some
> +potential areas for improvement are: Ref-in-want, CDN offloading,
> +Fetch-options.
> +
> +Since we'll have an 'ls-refs' service we can eliminate the need of fetch
> +to perform a ref-advertisement, instead a client can run the 'ls-refs'
> +service first, in order to find out what refs the server has, and then
> +request those refs directly using the fetch service.
> +
> +//TODO Flesh out the design

TODO: actually do it. ;)

A couple of notes from the discussion in the office:
* Could we split fetch into multiple phases
  (negotiation + getting the pack)?
* negotiation could be reused in forced push to
  minimize the pack to be sent
* negotiation in a half duplex is actually better
  called 'discovery', which discovers the set
  of objects available on the remote side.
  (the opposite would be reveal, or 'ask-for-discovery', which
  could be used for a symmetric design of fetch and push)


> diff --git a/serve.c b/serve.c
> index 88d548410..ca3bb7190 100644
> --- a/serve.c
> +++ b/serve.c
> @@ -6,6 +6,7 @@
>  #include "argv-array.h"
>  #include "ls-refs.h"
>  #include "serve.h"
> +#include "upload-pack.h"
>
>  static int always_advertise(struct repository *r,
>                             struct strbuf *value)
> @@ -46,6 +47,7 @@ static struct protocol_capability capabilities[] = {
>         { "agent", agent_advertise, NULL },
>         { "stateless-rpc", always_advertise, NULL },
>         { "ls-refs", always_advertise, ls_refs },
> +       { "fetch", always_advertise, upload_pack_v2 },
>  };
>
>  static void advertise_capabilities(void)
> diff --git a/upload-pack.c b/upload-pack.c
> index 2ca60d27c..c41f6f528 100644
> --- a/upload-pack.c
> +++ b/upload-pack.c
> @@ -20,6 +20,7 @@
>  #include "prio-queue.h"
>  #include "protocol.h"
>  #include "serve.h"
> +#include "upload-pack.h"
>
>  static const char * const upload_pack_usage[] = {
>         N_("git upload-pack [<options>] <dir>"),
> @@ -1040,6 +1041,295 @@ static void upload_pack(void)
>         }
>  }
>
> +struct upload_pack_data {
> +       struct object_array wants;
> +       struct oid_array haves;
> +
> +       unsigned stateless_rpc : 1;
> +
> +       unsigned use_thin_pack : 1;
> +       unsigned use_ofs_delta : 1;
> +       unsigned no_progress : 1;
> +       unsigned use_include_tag : 1;
> +       unsigned done : 1;
> +};
> +
> +#define UPLOAD_PACK_DATA_INIT { OBJECT_ARRAY_INIT, OID_ARRAY_INIT, 0, 0, 0, 0, 0, 0 }
> +
> +static void upload_pack_data_clear(struct upload_pack_data *data)
> +{
> +       object_array_clear(&data->wants);
> +       oid_array_clear(&data->haves);
> +}
> +
> +static int parse_want(const char *line)
> +{
> +       const char *arg;
> +       if (skip_prefix(line, "want ", &arg)) {
> +               struct object_id oid;
> +               struct object *o;
> +
> +               if (get_oid_hex(arg, &oid))
> +                       die("git upload-pack: protocol error, "
> +                           "expected to get oid, not '%s'", line);
> +
> +               o = parse_object(&oid);
> +               if (!o) {
> +                       packet_write_fmt(1,
> +                                        "ERR upload-pack: not our ref %s",
> +                                        oid_to_hex(&oid));
> +                       die("git upload-pack: not our ref %s",
> +                           oid_to_hex(&oid));
> +               }
> +
> +               if (!(o->flags & WANTED)) {
> +                       o->flags |= WANTED;
> +                       add_object_array(o, NULL, &want_obj);
> +               }
> +
> +               return 1;
> +       }
> +
> +       return 0;
> +}
> +
> +static int parse_have(const char *line, struct oid_array *haves)
> +{
> +       const char *arg;
> +       if (skip_prefix(line, "have ", &arg)) {
> +               struct object_id oid;
> +
> +               if (get_oid_hex(arg, &oid))
> +                       die("git upload-pack: expected SHA1 object, got '%s'", arg);
> +               oid_array_append(haves, &oid);
> +               return 1;
> +       }

The same stylistic nit as in an earlier patch: Maybe
short-circuit by inverting the condition?



> +
> +       return 0;
> +}
> +
> +static void process_args(struct argv_array *args, struct upload_pack_data *data)
> +{
> +       int i;
> +
> +       for (i = 0; i < args->argc; i++) {
> +               const char *arg = args->argv[i];
> +
> +               /* process want */
> +               if (parse_want(arg))
> +                       continue;
> +               /* process have line */
> +               if (parse_have(arg, &data->haves))
> +                       continue;
> +
> +               /* process args like thin-pack */
> +               if (!strcmp(arg, "thin-pack")) {
> +                       use_thin_pack = 1;
> +                       continue;
> +               }
> +               if (!strcmp(arg, "ofs-delta")) {
> +                       use_ofs_delta = 1;
> +                       continue;
> +               }
> +               if (!strcmp(arg, "no-progress")) {
> +                       no_progress = 1;
> +                       continue;
> +               }
> +               if (!strcmp(arg, "include-tag")) {
> +                       use_include_tag = 1;
> +                       continue;
> +               }
> +               if (!strcmp(arg, "done")) {
> +                       data->done = 1;
> +                       continue;
> +               }
> +
> +               /* ignore unknown lines maybe? */
> +               die("unexpected line: '%s'", arg);
> +       }
> +}
> +
> +static void read_haves(struct upload_pack_data *data)
> +{
> +       struct packet_reader reader;
> +       packet_reader_init(&reader, 0, NULL, 0,
> +                          PACKET_READ_CHOMP_NEWLINE);
> +
> +       while (packet_reader_read(&reader) == PACKET_READ_NORMAL) {
> +
> +               if (parse_have(reader.line, &data->haves))
> +                       continue;
> +               if (!strcmp(reader.line, "done")) {
> +                       data->done = 1;
> +                       continue;
> +               }
> +       }
> +       if (reader.status != PACKET_READ_FLUSH)
> +               die("ERROR");
> +}
> +
> +static int process_haves(struct oid_array *haves, struct oid_array *common)
> +{
> +       int i;
> +
> +       /* Process haves */
> +       for (i = 0; i < haves->nr; i++) {
> +               const struct object_id *oid = &haves->oid[i];
> +               struct object *o;
> +               int we_knew_they_have = 0;
> +
> +               if (!has_object_file(oid))
> +                       continue;
> +
> +               oid_array_append(common, oid);
> +
> +               o = parse_object(oid);
> +               if (!o)
> +                       die("oops (%s)", oid_to_hex(oid));
> +               if (o->type == OBJ_COMMIT) {
> +                       struct commit_list *parents;
> +                       struct commit *commit = (struct commit *)o;
> +                       if (o->flags & THEY_HAVE)
> +                               we_knew_they_have = 1;
> +                       else
> +                               o->flags |= THEY_HAVE;
> +                       if (!oldest_have || (commit->date < oldest_have))
> +                               oldest_have = commit->date;
> +                       for (parents = commit->parents;
> +                            parents;
> +                            parents = parents->next)
> +                               parents->item->object.flags |= THEY_HAVE;
> +               }
> +               if (!we_knew_they_have)
> +                       add_object_array(o, NULL, &have_obj);
> +       }
> +
> +       return 0;
> +}
> +
> +static int send_acks(struct oid_array *acks, struct strbuf *response)
> +{
> +       int i;
> +       /* Send Acks */
> +       if (!acks->nr)
> +               packet_buf_write(response, "NAK\n");
> +
> +       for (i = 0; i < acks->nr; i++) {
> +               packet_buf_write(response, "ACK %s common\n",
> +                                oid_to_hex(&acks->oid[i]));
> +       }
> +
> +       if (ok_to_give_up()) {
> +               /* Send Ready */
> +               packet_buf_write(response, "ACK %s ready\n",
> +                                oid_to_hex(&acks->oid[i-1]));
> +               return 1;
> +       }
> +
> +       return 0;
> +}
> +
> +static int process_haves_and_send_acks(struct upload_pack_data *data)
> +{
> +       struct oid_array common = OID_ARRAY_INIT;
> +       struct strbuf response = STRBUF_INIT;
> +       int ret = 0;
> +
> +       process_haves(&data->haves, &common);
> +       if (data->done) {
> +               ret = 1;
> +       } else if (send_acks(&common, &response)) {
> +               packet_buf_delim(&response);
> +               ret = 1;
> +       } else {
> +               /* Add Flush */
> +               packet_buf_flush(&response);
> +               ret = 0;
> +       }
> +
> +       /* Send response */
> +       write_or_die(1, response.buf, response.len);
> +       strbuf_release(&response);
> +
> +       oid_array_clear(&data->haves);
> +       oid_array_clear(&common);
> +       return ret;
> +}
> +
> +#define FETCH_PROCESS_ARGS 0
> +#define FETCH_READ_HAVES 1
> +#define FETCH_SEND_ACKS 2
> +#define FETCH_SEND_PACK 3
> +#define FETCH_DONE 4
> +
> +int upload_pack_v2(struct repository *r, struct argv_array *keys,
> +                  struct argv_array *args)
> +{
> +       int state = FETCH_PROCESS_ARGS;
> +       struct upload_pack_data data = UPLOAD_PACK_DATA_INIT;
> +       const char *out;
> +       use_sideband = LARGE_PACKET_MAX;
> +
> +       /* Check if cmd is being run as a stateless-rpc */
> +       if (has_capability(keys, "stateless-rpc", &out))
> +               if (!strcmp(out, "true"))
> +                       data.stateless_rpc = 1;
> +
> +       while (state != FETCH_DONE) {
> +               switch (state) {
> +               case FETCH_PROCESS_ARGS:
> +                       process_args(args, &data);
> +
> +                       if (!want_obj.nr) {
> +                               /*
> +                                * Request didn't contain any 'want' lines,
> +                                * guess they didn't want anything.
> +                                */
> +                               state = FETCH_DONE;
> +                       } else if (data.haves.nr) {
> +                               /*
> +                                * Request had 'have' lines, so lets ACK them.
> +                                */
> +                               state = FETCH_SEND_ACKS;
> +                       } else {
> +                               /*
> +                                * Request had 'want's but no 'have's so we can
> +                                * immediately go to construct and send a pack.
> +                                */
> +                               state = FETCH_SEND_PACK;
> +                       }
> +                       break;
> +               case FETCH_READ_HAVES:
> +                       read_haves(&data);
> +                       state = FETCH_SEND_ACKS;
> +                       break;
> +               case FETCH_SEND_ACKS:
> +                       if (process_haves_and_send_acks(&data))
> +                               state = FETCH_SEND_PACK;
> +                       else if (data.stateless_rpc)
> +                               /*
> +                                * Request was made via stateless-rpc and a
> +                                * packfile isn't ready to be created and sent.
> +                                */
> +                               state = FETCH_DONE;
> +                       else
> +                               state = FETCH_READ_HAVES;
> +                       break;
> +               case FETCH_SEND_PACK:
> +                       create_pack_file();
> +                       state = FETCH_DONE;
> +                       break;
> +               case FETCH_DONE:
> +                       break;
> +               default:
> +                       BUG("invalid state");
> +               }
> +       }
> +
> +       upload_pack_data_clear(&data);
> +       return 0;
> +}
> +
>  static int upload_pack_config(const char *var, const char *value, void *unused)
>  {
>         if (!strcmp("uploadpack.allowtipsha1inwant", var)) {
> diff --git a/upload-pack.h b/upload-pack.h
> new file mode 100644
> index 000000000..54c429563
> --- /dev/null
> +++ b/upload-pack.h
> @@ -0,0 +1,9 @@
> +#ifndef UPLOAD_PACK_H
> +#define UPLOAD_PACK_H
> +
> +struct repository;
> +struct argv_array;
> +extern int upload_pack_v2(struct repository *r, struct argv_array *keys,
> +                         struct argv_array *args);
> +
> +#endif /* UPLOAD_PACK_H */
> --
> 2.15.1.620.gb9897f4670-goog
>

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH 20/26] fetch-pack: perform a fetch using v2
  2018-01-03  0:18 ` [PATCH 20/26] fetch-pack: perform a fetch using v2 Brandon Williams
@ 2018-01-04  1:23   ` Stefan Beller
  2018-01-05 23:55     ` Brandon Williams
  2018-01-10  0:05   ` Jonathan Tan
  1 sibling, 1 reply; 362+ messages in thread
From: Stefan Beller @ 2018-01-04  1:23 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, Junio C Hamano, Jeff King, Philip Oakley, Derrick Stolee,
	Jonathan Nieder

On Tue, Jan 2, 2018 at 4:18 PM, Brandon Williams <bmwill@google.com> wrote:
> When communicating with a v2 server, perform a fetch by requesting the
> 'fetch' command.
>
> Signed-off-by: Brandon Williams <bmwill@google.com>
> ---
>  builtin/fetch-pack.c   |   2 +-
>  fetch-pack.c           | 267 ++++++++++++++++++++++++++++++++++++++++++++++++-
>  fetch-pack.h           |   4 +-
>  t/t5701-protocol-v2.sh |  40 ++++++++
>  transport.c            |   8 +-
>  5 files changed, 314 insertions(+), 7 deletions(-)
>
> diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
> index f492e8abd..867dd3cc7 100644
> --- a/builtin/fetch-pack.c
> +++ b/builtin/fetch-pack.c
> @@ -213,7 +213,7 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix)
>         }
>
>         ref = fetch_pack(&args, fd, conn, ref, dest, sought, nr_sought,
> -                        &shallow, pack_lockfile_ptr);
> +                        &shallow, pack_lockfile_ptr, protocol_v0);
>         if (pack_lockfile) {
>                 printf("lock %s\n", pack_lockfile);
>                 fflush(stdout);
> diff --git a/fetch-pack.c b/fetch-pack.c
> index 9f6b07ad9..c26fdc539 100644
> --- a/fetch-pack.c
> +++ b/fetch-pack.c
> @@ -1008,6 +1008,262 @@ static struct ref *do_fetch_pack(struct fetch_pack_args *args,
>         return ref;
>  }
>
> +static void add_wants(const struct ref *wants, struct strbuf *req_buf)
> +{
> +       for ( ; wants ; wants = wants->next) {
> +               const struct object_id *remote = &wants->old_oid;
> +               const char *remote_hex;
> +               struct object *o;
> +
> +               /*
> +                * If that object is complete (i.e. it is an ancestor of a
> +                * local ref), we tell them we have it but do not have to
> +                * tell them about its ancestors, which they already know
> +                * about.
> +                *
> +                * We use lookup_object here because we are only
> +                * interested in the case we *know* the object is
> +                * reachable and we have already scanned it.
> +                */
> +               if (((o = lookup_object(remote->hash)) != NULL) &&
> +                   (o->flags & COMPLETE)) {
> +                       continue;
> +               }
> +
> +               remote_hex = oid_to_hex(remote);
> +               packet_buf_write(req_buf, "want %s\n", remote_hex);
> +       }
> +}
> +
> +static int add_haves(struct strbuf *req_buf, int *in_vain)
> +{
> +       int ret = 0;
> +       int haves_added = 0;
> +       const struct object_id *oid;
> +
> +       while ((oid = get_rev())) {
> +               packet_buf_write(req_buf, "have %s\n", oid_to_hex(oid));
> +               if (++haves_added >= INITIAL_FLUSH)
> +                       break;
> +       };
> +
> +       *in_vain += haves_added;
> +       if (!haves_added || *in_vain >= MAX_IN_VAIN) {
> +               /* Send Done */
> +               packet_buf_write(req_buf, "done\n");
> +               ret = 1;
> +       }
> +
> +       return ret;
> +}
> +
> +static int send_haves(int fd_out, int *in_vain)
> +{
> +       int ret = 0;
> +       struct strbuf req_buf = STRBUF_INIT;
> +
> +       ret = add_haves(&req_buf, in_vain);
> +
> +       /* Send request */
> +       packet_buf_flush(&req_buf);
> +       write_or_die(fd_out, req_buf.buf, req_buf.len);
> +
> +       strbuf_release(&req_buf);
> +       return ret;
> +}
> +
> +static int send_fetch_request(int fd_out, const struct fetch_pack_args *args,
> +                             const struct ref *wants, struct oidset *common,
> +                             int *in_vain)
> +{
> +       int ret = 0;
> +       struct strbuf req_buf = STRBUF_INIT;
> +
> +       packet_buf_write(&req_buf, "command=fetch");
> +       packet_buf_write(&req_buf, "agent=%s", git_user_agent_sanitized());
> +       if (args->stateless_rpc)
> +               packet_buf_write(&req_buf, "stateless-rpc=true");
> +
> +       packet_buf_delim(&req_buf);
> +       if (args->use_thin_pack)
> +               packet_buf_write(&req_buf, "thin-pack");
> +       if (args->no_progress)
> +               packet_buf_write(&req_buf, "no-progress");
> +       if (args->include_tag)
> +               packet_buf_write(&req_buf, "include-tag");
> +       if (prefer_ofs_delta)
> +               packet_buf_write(&req_buf, "ofs-delta");
> +
> +       /* add wants */
> +       add_wants(wants, &req_buf);
> +
> +       /*
> +        * If we are running stateless-rpc we need to add all the common
> +        * commits we've found in previous rounds
> +        */
> +       if (args->stateless_rpc) {
> +               struct oidset_iter iter;
> +               const struct object_id *oid;
> +               oidset_iter_init(common, &iter);
> +
> +               while ((oid = oidset_iter_next(&iter))) {
> +                       packet_buf_write(&req_buf, "have %s\n", oid_to_hex(oid));
> +               }
> +       }
> +
> +       /* Add initial haves */
> +       ret = add_haves(&req_buf, in_vain);
> +
> +       /* Send request */
> +       packet_buf_flush(&req_buf);
> +       write_or_die(fd_out, req_buf.buf, req_buf.len);
> +
> +       strbuf_release(&req_buf);
> +       return ret;
> +}
> +
> +static enum ack_type process_ack(const char *line, struct object_id *oid)
> +{
> +       const char *arg;
> +
> +       if (!strcmp(line, "NAK"))
> +               return NAK;
> +       if (skip_prefix(line, "ACK ", &arg)) {
> +               if (!parse_oid_hex(arg, oid, &arg)) {
> +                       if (strstr(arg, "continue"))
> +                               return ACK_continue;
> +                       if (strstr(arg, "common"))
> +                               return ACK_common;
> +                       if (strstr(arg, "ready"))
> +                               return ACK_ready;
> +                       return ACK;
> +               }
> +       }
> +       if (skip_prefix(line, "ERR ", &arg))
> +               die(_("remote error: %s"), arg);
> +       die(_("git fetch-pack: expected ACK/NAK, got '%s'"), line);
> +}
> +
> +static int process_acks(struct packet_reader *reader, struct oidset *common)
> +{
> +       int got_ready = 0;
> +       int got_common = 0;
> +       while (packet_reader_read(reader) == PACKET_READ_NORMAL) {
> +               struct object_id oid;
> +               struct commit *commit;
> +               enum ack_type ack = process_ack(reader->line, &oid);
> +
> +               switch (ack) {
> +               case ACK_ready:
> +                       clear_prio_queue(&rev_list);
> +                       got_ready = 1;
> +                       /* fallthrough */
> +               case ACK_common:
> +                       oidset_insert(common, &oid);
> +                       commit = lookup_commit(&oid);
> +                       mark_common(commit, 0, 1);
> +                       got_common = 1;
> +                       break;
> +               case NAK:
> +                       break;
> +               case ACK:
> +               case ACK_continue:
> +                       die("ACK/ACK_continue not supported");
> +               }
> +       }
> +
> +       if (reader->status != PACKET_READ_FLUSH &&
> +           reader->status != PACKET_READ_DELIM)
> +               die("Error during processing acks: %d", reader->status);
> +
> +       /* return 0 if no common, 1 if there are common, or 2 if ready */
> +       return got_ready + got_common;
> +}
> +
> +#define FETCH_CHECK_LOCAL 0
> +#define FETCH_SEND_REQUEST 1
> +#define FETCH_PROCESS_ACKS 2
> +#define FETCH_SEND_HAVES 3
> +#define FETCH_GET_PACK 4
> +#define FETCH_DONE 5
> +
> +static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
> +                                   int fd[2],
> +                                   const struct ref *orig_ref,
> +                                   struct ref **sought, int nr_sought,
> +                                   char **pack_lockfile)
> +{
> +       struct ref *ref = copy_ref_list(orig_ref);
> +       int state = FETCH_CHECK_LOCAL;

Is there any reason to use #defines over an enum here?
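
A rough sketch of the enum variant, for illustration only.  The
'enum fetch_state' tag and the fetch_state_name() helper are made-up
names, they are not in the patch:

```c
#include <assert.h>

/* Hypothetical enum replacing the FETCH_* #defines quoted above. */
enum fetch_state {
	FETCH_CHECK_LOCAL = 0,
	FETCH_SEND_REQUEST,
	FETCH_PROCESS_ACKS,
	FETCH_SEND_HAVES,
	FETCH_GET_PACK,
	FETCH_DONE
};

/*
 * Hypothetical helper (not in the patch): map a state to a name, e.g.
 * for tracing.  There is deliberately no "default:" label, so that
 * -Wswitch can flag an enumerator that is not handled.
 */
static const char *fetch_state_name(enum fetch_state state)
{
	assert(state <= FETCH_DONE);

	switch (state) {
	case FETCH_CHECK_LOCAL:  return "check-local";
	case FETCH_SEND_REQUEST: return "send-request";
	case FETCH_PROCESS_ACKS: return "process-acks";
	case FETCH_SEND_HAVES:   return "send-haves";
	case FETCH_GET_PACK:     return "get-pack";
	case FETCH_DONE:         return "done";
	}
	return "unknown";
}
```

With an enum-typed 'state' variable the compiler can point out a state
that the switch forgets to handle, which plain ints cannot give us.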

> +       struct oidset common = OIDSET_INIT;
> +       struct packet_reader reader;
> +       int in_vain = 0;
> +       packet_reader_init(&reader, fd[0], NULL, 0,
> +                          PACKET_READ_CHOMP_NEWLINE);
> +
> +       while (state != FETCH_DONE) {
> +               switch (state) {
> +               case FETCH_CHECK_LOCAL:
> +                       sort_ref_list(&ref, ref_compare_name);
> +                       QSORT(sought, nr_sought, cmp_ref_by_name);
> +
> +                       /* v2 supports these by default */
> +                       allow_unadvertised_object_request |= ALLOW_REACHABLE_SHA1;
> +                       use_sideband = 2;
> +
> +                       /* Filter 'ref' by 'sought' and those that aren't local */
> +                       if (everything_local(args, &ref, sought, nr_sought))
> +                               state = FETCH_DONE;
> +                       else
> +                               state = FETCH_SEND_REQUEST;
> +                       break;
> +               case FETCH_SEND_REQUEST:
> +                       if (send_fetch_request(fd[1], args, ref, &common, &in_vain))
> +                               state = FETCH_GET_PACK;
> +                       else
> +                               state = FETCH_PROCESS_ACKS;
> +                       break;
> +               case FETCH_PROCESS_ACKS:
> +                       /* Process ACKs/NAKs */
> +                       switch (process_acks(&reader, &common)) {
> +                       case 2:
> +                               state = FETCH_GET_PACK;
> +                               break;
> +                       case 1:
> +                               in_vain = 0;
> +                               /* fallthrough */
> +                       default:
> +                               if (args->stateless_rpc)
> +                                       state = FETCH_SEND_REQUEST;
> +                               else
> +                                       state = FETCH_SEND_HAVES;
> +                               break;
> +                       }
> +                       break;
> +               case FETCH_SEND_HAVES:
> +                       if (send_haves(fd[1], &in_vain))
> +                               state = FETCH_GET_PACK;
> +                       else
> +                               state = FETCH_PROCESS_ACKS;
> +                       break;
> +               case FETCH_GET_PACK:
> +                       /* get the pack */
> +                       if (get_pack(args, fd, pack_lockfile))
> +                               die(_("git fetch-pack: fetch failed."));
> +
> +                       state = FETCH_DONE;
> +                       break;
> +               case FETCH_DONE:
> +                       break;
> +               default:
> +                       die("invalid state");
> +               }
> +       }
> +
> +       oidset_clear(&common);
> +       return ref;
> +}
> +
>  static void fetch_pack_config(void)
>  {
>         git_config_get_int("fetch.unpacklimit", &fetch_unpack_limit);
> @@ -1153,7 +1409,8 @@ struct ref *fetch_pack(struct fetch_pack_args *args,
>                        const char *dest,
>                        struct ref **sought, int nr_sought,
>                        struct oid_array *shallow,
> -                      char **pack_lockfile)
> +                      char **pack_lockfile,
> +                      enum protocol_version version)
>  {
>         struct ref *ref_cpy;
>         struct shallow_info si;
> @@ -1167,8 +1424,12 @@ struct ref *fetch_pack(struct fetch_pack_args *args,
>                 die(_("no matching remote head"));
>         }
>         prepare_shallow_info(&si, shallow);
> -       ref_cpy = do_fetch_pack(args, fd, ref, sought, nr_sought,
> -                               &si, pack_lockfile);
> +       if (version == protocol_v2)
> +               ref_cpy = do_fetch_pack_v2(args, fd, ref, sought, nr_sought,
> +                                          pack_lockfile);
> +       else
> +               ref_cpy = do_fetch_pack(args, fd, ref, sought, nr_sought,
> +                                       &si, pack_lockfile);
>         reprepare_packed_git();
>         update_shallow(args, sought, nr_sought, &si);
>         clear_shallow_info(&si);
> diff --git a/fetch-pack.h b/fetch-pack.h
> index b6aeb43a8..7afca7305 100644
> --- a/fetch-pack.h
> +++ b/fetch-pack.h
> @@ -3,6 +3,7 @@
>
>  #include "string-list.h"
>  #include "run-command.h"
> +#include "protocol.h"
>
>  struct oid_array;
>
> @@ -43,7 +44,8 @@ struct ref *fetch_pack(struct fetch_pack_args *args,
>                        struct ref **sought,
>                        int nr_sought,
>                        struct oid_array *shallow,
> -                      char **pack_lockfile);
> +                      char **pack_lockfile,
> +                      enum protocol_version version);
>
>  /*
>   * Print an appropriate error message for each sought ref that wasn't
> diff --git a/t/t5701-protocol-v2.sh b/t/t5701-protocol-v2.sh
> index 7d8aeb766..3e411e178 100755
> --- a/t/t5701-protocol-v2.sh
> +++ b/t/t5701-protocol-v2.sh
> @@ -33,4 +33,44 @@ test_expect_success 'ref advertisment is filtered with ls-remote using protocol
>         ! grep "refs/tags/" log
>  '
>
> +test_expect_success 'clone with file:// using protocol v2' '
> +       GIT_TRACE_PACKET=1 git -c protocol.version=2 \
> +               clone "file://$(pwd)/file_parent" file_child 2>log &&
> +
> +       git -C file_child log -1 --format=%s >actual &&
> +       git -C file_parent log -1 --format=%s >expect &&
> +       test_cmp expect actual &&
> +
> +       # Server responded using protocol v2
> +       grep "clone< version 2" log
> +'
> +
> +test_expect_success 'fetch with file:// using protocol v2' '
> +       test_commit -C file_parent two &&
> +
> +       GIT_TRACE_PACKET=1 git -C file_child -c protocol.version=2 \
> +               fetch origin 2>log &&
> +
> +       git -C file_child log -1 --format=%s origin/master >actual &&
> +       git -C file_parent log -1 --format=%s >expect &&
> +       test_cmp expect actual &&
> +
> +       # Server responded using protocol v2
> +       grep "fetch< version 2" log
> +'
> +
> +test_expect_success 'ref advertisment is filtered during fetch using protocol v2' '
> +       test_commit -C file_parent three &&
> +
> +       GIT_TRACE_PACKET=1 git -C file_child -c protocol.version=2 \
> +               fetch origin master 2>log &&
> +
> +       git -C file_child log -1 --format=%s origin/master >actual &&
> +       git -C file_parent log -1 --format=%s >expect &&
> +       test_cmp expect actual &&
> +
> +       grep "ref-pattern master" log &&
> +       ! grep "refs/tags/" log
> +'
> +
>  test_done
> diff --git a/transport.c b/transport.c
> index 6ea3905e3..4fdbd9adc 100644
> --- a/transport.c
> +++ b/transport.c
> @@ -256,14 +256,18 @@ static int fetch_refs_via_pack(struct transport *transport,
>
>         switch (data->version) {
>         case protocol_v2:
> -               die("support for protocol v2 not implemented yet");
> +               refs = fetch_pack(&args, data->fd, data->conn,
> +                                 refs_tmp ? refs_tmp : transport->remote_refs,
> +                                 dest, to_fetch, nr_heads, &data->shallow,
> +                                 &transport->pack_lockfile, data->version);
> +               packet_flush(data->fd[1]);
>                 break;
>         case protocol_v1:
>         case protocol_v0:
>                 refs = fetch_pack(&args, data->fd, data->conn,
>                                   refs_tmp ? refs_tmp : transport->remote_refs,
>                                   dest, to_fetch, nr_heads, &data->shallow,
> -                                 &transport->pack_lockfile);
> +                                 &transport->pack_lockfile, data->version);
>                 break;
>         case protocol_unknown_version:
>                 BUG("unknown protocol version");
> --
> 2.15.1.620.gb9897f4670-goog
>

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH 01/26] pkt-line: introduce packet_read_with_status
  2018-01-03 19:27   ` Stefan Beller
@ 2018-01-05 23:41     ` Brandon Williams
  0 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-01-05 23:41 UTC (permalink / raw)
  To: Stefan Beller
  Cc: git, Junio C Hamano, Jeff King, Philip Oakley, Derrick Stolee,
	Jonathan Nieder

On 01/03, Stefan Beller wrote:
> On Tue, Jan 2, 2018 at 4:18 PM, Brandon Williams <bmwill@google.com> wrote:
> > The current pkt-line API encodes the status of a pkt-line read in the
> > length of the read content.  An error is indicated with '-1', a flush
> > with '0' (which can be confusing since a return value of '0' can also
> > indicate an empty pkt-line), and a positive integer for the length of
> > the read content otherwise.  This doesn't leave much room for allowing
> > the addition of additional special packets in the future.
> >
> > To solve this introduce 'packet_read_with_status()' which reads a packet
> > and returns the status of the read encoded as an 'enum packet_status'
> > type.  This allows for easily identifying between special and normal
> > packets as well as errors.  It also enables easily adding a new special
> > packet in the future.
> >
> > Signed-off-by: Brandon Williams <bmwill@google.com>
> > ---
> >  pkt-line.c | 55 ++++++++++++++++++++++++++++++++++++++++++-------------
> >  pkt-line.h | 15 +++++++++++++++
> >  2 files changed, 57 insertions(+), 13 deletions(-)
> >
> > diff --git a/pkt-line.c b/pkt-line.c
> > index 2827ca772..8d7cd389f 100644
> > --- a/pkt-line.c
> > +++ b/pkt-line.c
> > @@ -280,28 +280,33 @@ static int packet_length(const char *linelen)
> >         return (val < 0) ? val : (val << 8) | hex2chr(linelen + 2);
> >  }
> >
> > -int packet_read(int fd, char **src_buf, size_t *src_len,
> > -               char *buffer, unsigned size, int options)
> > +enum packet_read_status packet_read_with_status(int fd, char **src_buffer, size_t *src_len,
> > +                                               char *buffer, unsigned size, int *pktlen,
> > +                                               int options)
> >  {
> > -       int len, ret;
> > +       int len;
> >         char linelen[4];
> >
> > -       ret = get_packet_data(fd, src_buf, src_len, linelen, 4, options);
> > -       if (ret < 0)
> > -               return ret;
> > +       if (get_packet_data(fd, src_buffer, src_len, linelen, 4, options) < 0)
> > +               return PACKET_READ_EOF;
> > +
> >         len = packet_length(linelen);
> >         if (len < 0)
> >                 die("protocol error: bad line length character: %.4s", linelen);
> > -       if (!len) {
> > +
> > +       if (len == 0) {
> >                 packet_trace("0000", 4, 0);
> > -               return 0;
> > +               return PACKET_READ_FLUSH;
> > +       } else if (len >= 1 && len <= 3) {
> > +               die("protocol error: bad line length character: %.4s", linelen);
> 
> I wonder how much libified code we want here already, maybe we could
> have PACKET_READ_ERROR as a return value here instead of die()ing.
> There could also be an option that tells this code to die on error, this reminds
> me of the repository discovery as well as the refs code, both of which have
> this pattern.
> 
> Currently this series is only upgrading commands that use the network
> anyway, so I guess die()ing in an ls-remote or fetch is no big deal,
> but it could be interesting to keep going once we have more of the
> partial clone stuff working (e.g. remote assisted log/blame would want
> to gracefully fall back instead of die()ing without any useful output,
> I would think.)

These are all things we could do, but the current code just dies and it
may be more hassle right now to change all the uses of packet_read to
handle errors gracefully.  But it's definitely something we can do in the
future.
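
For reference, a non-die()ing variant could look roughly like the sketch
below.  It is standalone and only covers parsing of the 4-byte length
header; 'enum pkt_status', PKT_ERROR and parse_pkt_header() are
hypothetical names, not part of pkt-line.c or of this series:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical status codes; PKT_ERROR is the new "gentle" outcome. */
enum pkt_status {
	PKT_ERROR = -2,
	PKT_EOF = -1,	/* what a failed read would report; unused here */
	PKT_NORMAL,
	PKT_FLUSH
};

/* Value of one hex digit, or -1 if 'c' is not a hex digit. */
static int hexval(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/*
 * Parse the 4-byte hex length header of a pkt-line.  On PKT_NORMAL the
 * payload length (header excluded) is stored in *payload_len.  Malformed
 * input yields PKT_ERROR instead of terminating the process.
 */
static enum pkt_status parse_pkt_header(const char *hdr, size_t bufsize,
					int *payload_len)
{
	int len = 0;
	int i;

	assert(hdr && payload_len);

	for (i = 0; i < 4; i++) {
		int v = hexval(hdr[i]);
		if (v < 0)
			return PKT_ERROR;
		len = (len << 4) | v;
	}

	if (len == 0)
		return PKT_FLUSH;
	if (len < 4)
		return PKT_ERROR; /* the length includes the header itself */

	len -= 4;
	if ((size_t)len >= bufsize)
		return PKT_ERROR; /* no room for payload plus trailing NUL */

	*payload_len = len;
	return PKT_NORMAL;
}
```

Callers that still want the current behaviour would simply die() when
they see PKT_ERROR, so the conversion could happen one caller at a time.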

> 
> >         }
> > +
> >         len -= 4;
> > -       if (len >= size)
> > +       if ((len < 0) || ((unsigned)len >= size))
> >                 die("protocol error: bad line length %d", len);
> > -       ret = get_packet_data(fd, src_buf, src_len, buffer, len, options);
> > -       if (ret < 0)
> > -               return ret;
> > +
> > +       if (get_packet_data(fd, src_buffer, src_len, buffer, len, options) < 0)
> > +               return PACKET_READ_EOF;
> >
> >         if ((options & PACKET_READ_CHOMP_NEWLINE) &&
> >             len && buffer[len-1] == '\n')
> > @@ -309,7 +314,31 @@ int packet_read(int fd, char **src_buf, size_t *src_len,
> >
> >         buffer[len] = 0;
> >         packet_trace(buffer, len, 0);
> > -       return len;
> > +       *pktlen = len;
> > +       return PACKET_READ_NORMAL;
> > +}
> > +
> > +int packet_read(int fd, char **src_buffer, size_t *src_len,
> > +               char *buffer, unsigned size, int options)
> > +{
> > +       enum packet_read_status status;
> > +       int pktlen;
> > +
> > +       status = packet_read_with_status(fd, src_buffer, src_len,
> > +                                        buffer, size, &pktlen,
> > +                                        options);
> > +       switch (status) {
> > +       case PACKET_READ_EOF:
> > +               pktlen = -1;
> > +               break;
> > +       case PACKET_READ_NORMAL:
> > +               break;
> > +       case PACKET_READ_FLUSH:
> > +               pktlen = 0;
> > +               break;
> > +       }
> > +
> > +       return pktlen;
> >  }
> >
> >  static char *packet_read_line_generic(int fd,
> > diff --git a/pkt-line.h b/pkt-line.h
> > index 3dad583e2..06c468927 100644
> > --- a/pkt-line.h
> > +++ b/pkt-line.h
> > @@ -65,6 +65,21 @@ int write_packetized_from_buf(const char *src_in, size_t len, int fd_out);
> >  int packet_read(int fd, char **src_buffer, size_t *src_len, char
> >                 *buffer, unsigned size, int options);
> >
> > +/*
> > + * Read a packetized line into a buffer like the 'packet_read()' function but
> > + * returns an 'enum packet_read_status' which indicates the status of the read.
> > + * The number of bytes read will be assigned to *pktlen if the status of the
> > + * read was 'PACKET_READ_NORMAL'.
> > + */
> > +enum packet_read_status {
> > +       PACKET_READ_EOF = -1,
> > +       PACKET_READ_NORMAL,
> > +       PACKET_READ_FLUSH,
> > +};
> > +enum packet_read_status packet_read_with_status(int fd, char **src_buffer, size_t *src_len,
> > +                                               char *buffer, unsigned size, int *pktlen,
> > +                                               int options);
> > +
> >  /*
> >   * Convenience wrapper for packet_read that is not gentle, and sets the
> >   * CHOMP_NEWLINE option. The return value is NULL for a flush packet,
> > --
> > 2.15.1.620.gb9897f4670-goog
> >

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH 12/26] ls-refs: introduce ls-refs server command
  2018-01-04  0:17   ` Stefan Beller
@ 2018-01-05 23:49     ` Brandon Williams
  0 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-01-05 23:49 UTC (permalink / raw)
  To: Stefan Beller
  Cc: git, Junio C Hamano, Jeff King, Philip Oakley, Derrick Stolee,
	Jonathan Nieder

On 01/03, Stefan Beller wrote:
> On Tue, Jan 2, 2018 at 4:18 PM, Brandon Williams <bmwill@google.com> wrote:
> > Introduce the ls-refs server command.  In protocol v2, the ls-refs
> > command is used to request the ref advertisement from the server.  Since
> > it is a command which can be requested (as opposed to mandatory in v1),
> > a client can send a number of parameters in its request to limit the ref
> > advertisement based on provided ref-patterns.
> >
> > Signed-off-by: Brandon Williams <bmwill@google.com>
> > ---
> >  Documentation/technical/protocol-v2.txt | 26 +++++++++
> >  Makefile                                |  1 +
> >  ls-refs.c                               | 97 +++++++++++++++++++++++++++++++++
> >  ls-refs.h                               |  9 +++
> 
> Maybe consider putting any served command into a sub directory?
> 
> For example the code in builtin/ has laxer rules w.r.t. die()ing
> as it is a user facing command, whereas some devs want to see
> code at the root of the repo to not die() at all as the eventual goal
> is to have a library there.
> All this code is on the remote side, which also has different traits than
> the code at the root of the git.git repo; non-localisation comes to mind,
> but there might be other aspects as well (security?).

Well if we were to do this then we should move upload-pack and
receive-pack into this same "server code" directory.

> 
> 
> >  serve.c                                 |  2 +
> >  5 files changed, 135 insertions(+)
> >  create mode 100644 ls-refs.c
> >  create mode 100644 ls-refs.h
> >
> > diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
> > index b87ba3816..5f4d0e719 100644
> > --- a/Documentation/technical/protocol-v2.txt
> > +++ b/Documentation/technical/protocol-v2.txt
> > @@ -89,3 +89,29 @@ terminate the connection.
> >  Commands are the core actions that a client wants to perform (fetch, push,
> >  etc).  Each command will be provided with a list capabilities and
> >  arguments as requested by a client.
> > +
> > + Ls-refs
> 
> So is it ls-refs or Ls-refs or is any capitalization valid?

"ls-refs".  I'll make sure to change this.
> 
> > +---------
> > +
> > +Ls-refs is the command used to request a reference advertisement in v2.
> > +Unlike the current reference advertisement, ls-refs takes in parameters
> > +which can be used to limit the refs sent from the server.
> > +
> > +Ls-refs takes in the following parameters wrapped in packet-lines:
> > +
> > +  symrefs: In addition to the object pointed by it, show the underlying
> > +          ref pointed by it when showing a symbolic ref.
> > +  peel: Show peeled tags.
> > +  ref-pattern <pattern>: When specified, only references matching the
> > +                        given patterns are displayed.
> 
> What kind of pattern matching is allowed here?
> strictly prefix only, or globbing, regexes?
> Is there a given grammar to follow? Maybe a link to the git
> glossary or somewhere else might be fine.
> 
> Seeing that we do wildmatch() down there (as opposed to regexes),
> I wonder if it provides an entry for a denial of service attack, by crafting
> a pattern that is very expensive for the server to compute but cheap to
> ask for from a client. (cf. 94da9193a6 (grep: add support for PCRE v2,
> 2017-06-01), but that is regexes!)
> 
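
To make the matching question concrete: going by the ref_match() and
ls_refs() code quoted further down, each pattern gets "*/" prepended and
is matched as a glob against "/<refname>".  The sketch below mimics that
outside of git.  It uses POSIX fnmatch(3) as a stand-in for wildmatch(),
on the assumption that with flags == 0 both let '*' match across '/';
ref_matches() is a made-up name:

```c
#include <assert.h>
#include <fnmatch.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for the quoted ref_match(): returns 1 if
 * 'refname' matches the client-supplied 'user_pattern', 0 otherwise.
 */
static int ref_matches(const char *user_pattern, const char *refname)
{
	char pattern[256], path[256];
	int n;

	assert(user_pattern && refname);

	/* ls_refs() stores each pattern as "*" "/" <pattern> */
	n = snprintf(pattern, sizeof(pattern), "*/%s", user_pattern);
	if (n < 0 || (size_t)n >= sizeof(pattern))
		return 0; /* too long for this sketch */

	/* ref_match() matches against "/" <refname> */
	n = snprintf(path, sizeof(path), "/%s", refname);
	if (n < 0 || (size_t)n >= sizeof(path))
		return 0;

	/* flags == 0: '*' may span '/' (assumed to mirror wildmatch) */
	return fnmatch(pattern, path, 0) == 0;
}
```

So "master" selects refs/heads/master but not refs/heads/notmaster, and
"refs/tags/*" selects everything under refs/tags/.  In other words it is
tail matching with globs, not regexes.
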
> > +The output of ls-refs is as follows:
> > +
> > +    output = *ref
> > +            flush-pkt
> > +    ref = PKT-LINE((tip | peeled) LF)
> > +    tip = obj-id SP refname (SP symref-target)
> > +    peeled = obj-id SP refname "^{}"
> > +
> > +    symref = PKT-LINE("symref" SP symbolic-ref SP resolved-ref LF)
> > +    shallow = PKT-LINE("shallow" SP obj-id LF)
> > diff --git a/Makefile b/Makefile
> > index 5f3b5fe8b..152a73bec 100644
> > --- a/Makefile
> > +++ b/Makefile
> > @@ -820,6 +820,7 @@ LIB_OBJS += list-objects-filter-options.o
> >  LIB_OBJS += ll-merge.o
> >  LIB_OBJS += lockfile.o
> >  LIB_OBJS += log-tree.o
> > +LIB_OBJS += ls-refs.o
> >  LIB_OBJS += mailinfo.o
> >  LIB_OBJS += mailmap.o
> >  LIB_OBJS += match-trees.o
> > diff --git a/ls-refs.c b/ls-refs.c
> > new file mode 100644
> > index 000000000..ac4904a40
> > --- /dev/null
> > +++ b/ls-refs.c
> > @@ -0,0 +1,97 @@
> > +#include "cache.h"
> > +#include "repository.h"
> > +#include "refs.h"
> > +#include "remote.h"
> > +#include "argv-array.h"
> > +#include "ls-refs.h"
> > +#include "pkt-line.h"
> > +
> > +struct ls_refs_data {
> > +       unsigned peel;
> > +       unsigned symrefs;
> > +       struct argv_array patterns;
> > +};
> > +
> > +/*
> > + * Check if one of the patterns matches the tail part of the ref.
> > + * If no patterns were provided, all refs match.
> > + */
> > +static int ref_match(const struct argv_array *patterns, const char *refname)
> > +{
> > +       char *pathbuf;
> > +       int i;
> > +
> > +       if (!patterns->argc)
> > +               return 1; /* no restriction */
> > +
> > +       pathbuf = xstrfmt("/%s", refname);
> > +       for (i = 0; i < patterns->argc; i++) {
> > +               if (!wildmatch(patterns->argv[i], pathbuf, 0)) {
> > +                       free(pathbuf);
> > +                       return 1;
> > +               }
> > +       }
> > +       free(pathbuf);
> > +       return 0;
> > +}
> > +
> > +static int send_ref(const char *refname, const struct object_id *oid,
> > +                   int flag, void *cb_data)
> > +{
> > +       struct ls_refs_data *data = cb_data;
> > +       const char *refname_nons = strip_namespace(refname);
> > +       struct strbuf refline = STRBUF_INIT;
> > +
> > +       if (!ref_match(&data->patterns, refname))
> > +               return 0;
> > +
> > +       strbuf_addf(&refline, "%s %s", oid_to_hex(oid), refname_nons);
> > +       if (data->symrefs && flag & REF_ISSYMREF) {
> > +               struct object_id unused;
> > +               const char *symref_target = resolve_ref_unsafe(refname, 0,
> > +                                                              &unused,
> > +                                                              &flag);
> > +
> > +               if (!symref_target)
> > +                       die("'%s' is a symref but it is not?", refname);
> > +
> > +               strbuf_addf(&refline, " %s", symref_target);
> > +       }
> > +
> > +       strbuf_addch(&refline, '\n');
> > +
> > +       packet_write(1, refline.buf, refline.len);
> > +       if (data->peel) {
> > +               struct object_id peeled;
> > +               if (!peel_ref(refname, &peeled))
> > +                       packet_write_fmt(1, "%s %s^{}\n", oid_to_hex(&peeled),
> > +                                        refname_nons);
> > +       }
> > +
> > +       strbuf_release(&refline);
> > +       return 0;
> > +}
> > +
> > +int ls_refs(struct repository *r, struct argv_array *keys, struct argv_array *args)
> > +{
> > +       int i;
> > +       struct ls_refs_data data = { 0, 0, ARGV_ARRAY_INIT };
> > +
> > +       for (i = 0; i < args->argc; i++) {
> > +               const char *arg = args->argv[i];
> > +               const char *out;
> > +
> > +               if (!strcmp("peel", arg))
> > +                       data.peel = 1;
> > +               else if (!strcmp("symrefs", arg))
> > +                       data.symrefs = 1;
> > +               else if (skip_prefix(arg, "ref-pattern ", &out))
> > +                       argv_array_pushf(&data.patterns, "*/%s", out);
> > +       }
> > +
> > +       head_ref_namespaced(send_ref, &data);
> > +       for_each_namespaced_ref(send_ref, &data);
> > +       packet_flush(1);
> > +       argv_array_clear(&data.patterns);
> > +       return 0;
> > +}
> > diff --git a/ls-refs.h b/ls-refs.h
> > new file mode 100644
> > index 000000000..9e4c57bfe
> > --- /dev/null
> > +++ b/ls-refs.h
> > @@ -0,0 +1,9 @@
> > +#ifndef LS_REFS_H
> > +#define LS_REFS_H
> > +
> > +struct repository;
> > +struct argv_array;
> > +extern int ls_refs(struct repository *r, struct argv_array *keys,
> > +                  struct argv_array *args);
> > +
> > +#endif /* LS_REFS_H */
> > diff --git a/serve.c b/serve.c
> > index da8127775..88d548410 100644
> > --- a/serve.c
> > +++ b/serve.c
> > @@ -4,6 +4,7 @@
> >  #include "pkt-line.h"
> >  #include "version.h"
> >  #include "argv-array.h"
> > +#include "ls-refs.h"
> >  #include "serve.h"
> >
> >  static int always_advertise(struct repository *r,
> > @@ -44,6 +45,7 @@ struct protocol_capability {
> >  static struct protocol_capability capabilities[] = {
> >         { "agent", agent_advertise, NULL },
> >         { "stateless-rpc", always_advertise, NULL },
> > +       { "ls-refs", always_advertise, ls_refs },
> >  };
> >
> >  static void advertise_capabilities(void)
> > --
> > 2.15.1.620.gb9897f4670-goog
> >

-- 
Brandon Williams

* Re: [PATCH 20/26] fetch-pack: perform a fetch using v2
  2018-01-04  1:23   ` Stefan Beller
@ 2018-01-05 23:55     ` Brandon Williams
  0 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-01-05 23:55 UTC (permalink / raw)
  To: Stefan Beller
  Cc: git, Junio C Hamano, Jeff King, Philip Oakley, Derrick Stolee,
	Jonathan Nieder

On 01/03, Stefan Beller wrote:
> On Tue, Jan 2, 2018 at 4:18 PM, Brandon Williams <bmwill@google.com> wrote:
> > +
> > +#define FETCH_CHECK_LOCAL 0
> > +#define FETCH_SEND_REQUEST 1
> > +#define FETCH_PROCESS_ACKS 2
> > +#define FETCH_SEND_HAVES 3
> > +#define FETCH_GET_PACK 4
> > +#define FETCH_DONE 5
> > +
> > +static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
> > +                                   int fd[2],
> > +                                   const struct ref *orig_ref,
> > +                                   struct ref **sought, int nr_sought,
> > +                                   char **pack_lockfile)
> > +{
> > +       struct ref *ref = copy_ref_list(orig_ref);
> > +       int state = FETCH_CHECK_LOCAL;
> 
> Is there any reason to use #defines over an enum here?
> 

No, it would probably be better to use an enum; that would also get rid
of the default case of the switch statement.
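
A minimal sketch of what the enum version could look like, assuming the
same six state names as the quoted #defines. The enum tag fetch_state
and the helper fetch_state_name() are hypothetical and only serve to
show a switch without a default case:

```c
#include <stddef.h>

/* Same state names as the #defines in the quoted patch, as an enum. */
enum fetch_state {
    FETCH_CHECK_LOCAL = 0,
    FETCH_SEND_REQUEST,
    FETCH_PROCESS_ACKS,
    FETCH_SEND_HAVES,
    FETCH_GET_PACK,
    FETCH_DONE
};

/* Hypothetical helper: every enumerator is handled, so no default case. */
static const char *fetch_state_name(enum fetch_state state)
{
    switch (state) {
    case FETCH_CHECK_LOCAL:
        return "check-local";
    case FETCH_SEND_REQUEST:
        return "send-request";
    case FETCH_PROCESS_ACKS:
        return "process-acks";
    case FETCH_SEND_HAVES:
        return "send-haves";
    case FETCH_GET_PACK:
        return "get-pack";
    case FETCH_DONE:
        return "done";
    }
    return NULL; /* only reached if state holds a value outside the enum */
}
```

Without a default label, GCC and Clang can warn (-Wswitch) when a newly
added state is not handled in the switch.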

-- 
Brandon Williams

* Re: [PATCH 00/26] protocol version 2
  2018-01-03  0:18 [PATCH 00/26] protocol version 2 Brandon Williams
                   ` (25 preceding siblings ...)
  2018-01-03  0:18 ` [PATCH 26/26] remote-curl: implement connect-half-duplex command Brandon Williams
@ 2018-01-09 17:55 ` Jonathan Tan
  2018-01-11  0:23   ` Brandon Williams
  2018-01-25 23:58 ` [PATCH v2 00/27] " Brandon Williams
  27 siblings, 1 reply; 362+ messages in thread
From: Jonathan Tan @ 2018-01-09 17:55 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, sbeller, gitster, peff, philipoakley, stolee, jrnieder

On Tue,  2 Jan 2018 16:18:02 -0800
Brandon Williams <bmwill@google.com> wrote:

> The following patches extend what I sent out as an WIP
> (https://public-inbox.org/git/20171204235823.63299-1-bmwill@google.com/) and
> implement protocol version 2.

Summarizing (for myself) the rationale for protocol version 2:

The existing protocol has a few pain points: (a) limit on the length of
the capability line (the capability line can be used to include
additional parameters in a backwards-compatible way), (b) difficulty in
creating proxies because of inconsistent flush semantics, and (c) the
need to implement clients twice - once for HTTP and once for
connect-supporting transports. To which we can add another: (d) if we
want to support something entirely new (for example, a server-side "git
log"), we will need a new protocol anyway.

The new functionality introduced in this patch set is probably best done
using a new protocol. If it were done using the existing protocol (by
adding a parameter in the capabilities line), we would still run into
(a) and (c), so we might as well introduce the new protocol now.

Some of the above points are repeats from my previous e-mail:
https://public-inbox.org/git/20171110121347.1f7c184c543622b60164e9fb@google.com/

> Some changes from that series are as follows:
>  * Lots of various cleanup on the ls-refs and fetch command code, both server
>    and client.
>  * Fetch command now supports a stateless-rpc mode which enables communicating
>    with a half-duplex connection.

Good to hear about fetch support.

>  * Introduce a new remote-helper command 'connect-half-duplex' which is
>    implemented by remote-curl (the http remote-helper).  This allows for a
>    client to establish a half-duplex connection and use remote-curl as a proxy
>    to wrap requests in http before sending them to the remote end and
>    unwrapping the responses and sending them back to the client's stdin.

I'm not sure about the "half-duplex" name - it is half-duplex in that
each side must terminate their communications with a flush, but not
half-duplex in that request-response pairs can overlap each other (e.g.
during negotiation during fetch - there is an optimization in which the
client tries to keep two requests pending at a time). I think that the
idea we want to communicate is that requests and responses are always
packetized, stateless, and always happen as a pair.

I wonder if "stateless-connect" is a better keyword - it makes sense to
me (once described) that "stateless" implies that the client sends
everything the server needs at once (thus, in a packet), the server
sends everything the client needs back at once (thus, in a packet), and
then the client must not assume any state-storing on the part of the
server or transport.

>  * The transport code is refactored for ls-remote, fetch, and push to provide a
>    list of ref-patterns (based on the refspec being used) when requesting refs
>    from the remote end.  This allows the ls-refs code to send this list of
>    patterns so the remote end and filter the refs it sends back.

Briefly looking at the implementation, the client seems to incur an
extra roundtrip when using ls-remote (and others) with a v2-supporting
server. I initially didn't like this, but upon further reflection, this
is probably fine for now. The client can be upgraded later, and I think
that clients will eventually want to query git-serve directly for
"ls-refs" first, and then fall back to v0 for ancient servers, instead
of checking git-upload-pack first (as in this patch set) - so, the
support for "ls-refs" here won't be carried forward merely for backwards
compatibility, but will eventually be actively used.

As for the decision to use a new endpoint "git-serve" instead of reusing
"git-upload-pack" (or "git-receive-pack"), reusing the existing one
might allow some sort of optimization later in which the first
"git-upload-pack" query immediately returns with the v2 answer (instead
of redirecting the client to "git-serve"), but this probably doesn't
matter in practice (as I stated above, I think that eventually clients
will query git-serve first).

> This series effectively implements protocol version 2 for listing a remotes
> refs (ls-remote) as well as for fetch for the builtin transports (ssh, git,
> file) and for the http/https transports.  Push is not implemented yet and
> doesn't need to be implemented at the same time as fetch since the
> receive-pack code can default to using protocol v0 when v2 is requested by the
> client.

Agreed - push can be done later.

* Re: [PATCH 01/26] pkt-line: introduce packet_read_with_status
  2018-01-03  0:18 ` [PATCH 01/26] pkt-line: introduce packet_read_with_status Brandon Williams
  2018-01-03 19:27   ` Stefan Beller
@ 2018-01-09 18:04   ` Jonathan Tan
  2018-01-09 19:28     ` Brandon Williams
  1 sibling, 1 reply; 362+ messages in thread
From: Jonathan Tan @ 2018-01-09 18:04 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, sbeller, gitster, peff, philipoakley, stolee, jrnieder

On Tue,  2 Jan 2018 16:18:03 -0800
Brandon Williams <bmwill@google.com> wrote:

> -int packet_read(int fd, char **src_buf, size_t *src_len,
> -		char *buffer, unsigned size, int options)
> +enum packet_read_status packet_read_with_status(int fd, char **src_buffer, size_t *src_len,
> +						char *buffer, unsigned size, int *pktlen,
> +						int options)
>  {
> -	int len, ret;
> +	int len;
>  	char linelen[4];
>  
> -	ret = get_packet_data(fd, src_buf, src_len, linelen, 4, options);
> -	if (ret < 0)
> -		return ret;
> +	if (get_packet_data(fd, src_buffer, src_len, linelen, 4, options) < 0)
> +		return PACKET_READ_EOF;
> +
>  	len = packet_length(linelen);
>  	if (len < 0)
>  		die("protocol error: bad line length character: %.4s", linelen);
> -	if (!len) {
> +
> +	if (len == 0) {

This change (replacing "!len" with "len == 0") is unnecessary, I think.

>  		packet_trace("0000", 4, 0);
> -		return 0;
> +		return PACKET_READ_FLUSH;
> +	} else if (len >= 1 && len <= 3) {
> +		die("protocol error: bad line length character: %.4s", linelen);
>  	}

This seems to be more of a "bad line length" than a "bad line length
character".

Also, some of the checks are redundant. Above, it is probably better to
delete "len >= 1", and optionally write "len < 4" instead of "len <= 3"
(to emphasize that the subtraction of 4 below does not result in a
negative value).

> +
>  	len -= 4;
> -	if (len >= size)
> +	if ((len < 0) || ((unsigned)len >= size))
>  		die("protocol error: bad line length %d", len);

The "len < 0" check is redundant.
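
To make the ordering of the checks concrete, here is a standalone
sketch, not the code from the patch. The names pkt_status,
parse_pkt_length() and check_pkt_length() are hypothetical. Once lengths
1 to 3 are rejected with "len < 4", the subtraction of 4 cannot go
negative, so no separate "len < 0" test is needed afterwards:

```c
#include <stddef.h>

/* Hypothetical status values, loosely modeled on the ones in the patch. */
enum pkt_status { PKT_FLUSH, PKT_NORMAL, PKT_BAD_LENGTH };

static int hexval(char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    if (c >= 'a' && c <= 'f')
        return c - 'a' + 10;
    if (c >= 'A' && c <= 'F')
        return c - 'A' + 10;
    return -1;
}

/* Parse the 4-hex-digit length header; -1 means a non-hex character. */
static int parse_pkt_length(const char *linelen)
{
    int i, len = 0;

    for (i = 0; i < 4; i++) {
        int v = hexval(linelen[i]);
        if (v < 0)
            return -1;
        len = (len << 4) | v;
    }
    return len;
}

static enum pkt_status check_pkt_length(const char *linelen, unsigned size,
                                        int *payload_len)
{
    int len = parse_pkt_length(linelen);

    if (len < 0)
        return PKT_BAD_LENGTH; /* bad line length character */
    if (!len)
        return PKT_FLUSH;      /* "0000" */
    if (len < 4)
        return PKT_BAD_LENGTH; /* 1..3 cannot even cover the header */

    len -= 4;                  /* cannot be negative: len >= 4 here */
    if ((unsigned)len >= size)
        return PKT_BAD_LENGTH; /* payload does not fit in the buffer */

    *payload_len = len;
    return PKT_NORMAL;
}
```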

> -	ret = get_packet_data(fd, src_buf, src_len, buffer, len, options);
> -	if (ret < 0)
> -		return ret;
> +
> +	if (get_packet_data(fd, src_buffer, src_len, buffer, len, options) < 0)
> +		return PACKET_READ_EOF;
>  
>  	if ((options & PACKET_READ_CHOMP_NEWLINE) &&
>  	    len && buffer[len-1] == '\n')

* Re: [PATCH 02/26] pkt-line: introduce struct packet_reader
  2018-01-03  0:18 ` [PATCH 02/26] pkt-line: introduce struct packet_reader Brandon Williams
@ 2018-01-09 18:08   ` Jonathan Tan
  2018-01-09 19:19     ` Brandon Williams
  0 siblings, 1 reply; 362+ messages in thread
From: Jonathan Tan @ 2018-01-09 18:08 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, sbeller, gitster, peff, philipoakley, stolee, jrnieder

On Tue,  2 Jan 2018 16:18:04 -0800
Brandon Williams <bmwill@google.com> wrote:

> diff --git a/pkt-line.h b/pkt-line.h
> index 06c468927..c446e886a 100644
> --- a/pkt-line.h
> +++ b/pkt-line.h
> @@ -111,6 +111,63 @@ char *packet_read_line_buf(char **src_buf, size_t *src_len, int *size);
>   */
>  ssize_t read_packetized_to_strbuf(int fd_in, struct strbuf *sb_out);
>  
> +struct packet_reader {
> +	/* source file descriptor */
> +	int fd;
> +
> +	/* source buffer and its size */
> +	char *src_buffer;
> +	size_t src_len;
> +
> +	/* buffer that pkt-lines are read into and its size */
> +	char *buffer;
> +	unsigned buffer_size;

Is the intention to support different buffers in the future?

[snip]

> +/*
> + * Peek the next packet line without consuming it and return the status.
> + * The next call to 'packet_reader_read()' will perform a read of the same line
> + * that was peeked, consuming the line.
> + *
> + * Only a single line can be peeked at a time.

It is logical to me that if you peeked at a line, and then peeked at it
again, you will get the same line - I would phrase this not as a
restriction ("only a single line") but just as a statement of fact (e.g.
"Peeking at the same line multiple times without an intervening
packet_reader_read will return the same result").
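
The peek semantics described here can be sketched with a toy in-memory
reader. This is an illustration under assumptions, not the patch's
struct packet_reader: line_reader, reader_read() and reader_peek() are
hypothetical names, and the input is a NULL-terminated array of strings
rather than pkt-lines from a file descriptor:

```c
#include <stddef.h>

/* Hypothetical in-memory reader over a NULL-terminated list of lines. */
struct line_reader {
    const char **lines;
    size_t next;        /* index of the next unread line */
    const char *line;   /* most recently read or peeked line */
    int line_peeked;    /* set while 'line' has been peeked, not consumed */
};

/* Return the next line, consuming it; NULL at end of input. */
static const char *reader_read(struct line_reader *r)
{
    if (r->line_peeked) {
        r->line_peeked = 0; /* consume the line handed out by peek */
        return r->line;
    }
    r->line = r->lines[r->next];
    if (r->line)
        r->next++;
    return r->line;
}

/* Return the next line without consuming it. */
static const char *reader_peek(struct line_reader *r)
{
    if (!r->line_peeked) {
        r->line = reader_read(r);
        r->line_peeked = 1;
    }
    return r->line;
}
```

Repeated calls to reader_peek() return the same line until
reader_read() consumes it.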

> + */
> +extern enum packet_read_status packet_reader_peek(struct packet_reader *reader);
> +
>  #define DEFAULT_PACKET_MAX 1000
>  #define LARGE_PACKET_MAX 65520
>  #define LARGE_PACKET_DATA_MAX (LARGE_PACKET_MAX - 4)

* Re: [PATCH 07/26] connect: convert get_remote_heads to use struct packet_reader
  2018-01-03  0:18 ` [PATCH 07/26] connect: convert get_remote_heads to use struct packet_reader Brandon Williams
@ 2018-01-09 18:27   ` Jonathan Tan
  2018-01-09 19:09     ` Brandon Williams
  0 siblings, 1 reply; 362+ messages in thread
From: Jonathan Tan @ 2018-01-09 18:27 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, sbeller, gitster, peff, philipoakley, stolee, jrnieder

On Tue,  2 Jan 2018 16:18:09 -0800
Brandon Williams <bmwill@google.com> wrote:

> -	while ((len = read_remote_ref(in, &src_buf, &src_len, &responded))) {
> +	while (state != EXPECTING_DONE) {
> +		switch (packet_reader_read(&reader)) {
> +		case PACKET_READ_EOF:
> +			die_initial_contact(1);
> +		case PACKET_READ_NORMAL:
> +			len = reader.pktlen;
> +			if (len > 4 && skip_prefix(packet_buffer, "ERR ", &arg))

This should be a field in reader, not the global packet_buffer, I think.

Also, I did a search of usages of packet_buffer, and there are just a
few of them - it might be worthwhile to eliminate it, and have each
component using it allocate its own buffer. But this can be done in a
separate patch set.

> @@ -269,6 +284,8 @@ struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
>  			if (process_shallow(len, shallow_points))
>  				break;
>  			die("protocol error: unexpected '%s'", packet_buffer);

Here too.

* Re: [PATCH 09/26] transport: store protocol version
  2018-01-03  0:18 ` [PATCH 09/26] transport: store protocol version Brandon Williams
@ 2018-01-09 18:41   ` Jonathan Tan
  2018-01-09 19:15     ` Brandon Williams
  0 siblings, 1 reply; 362+ messages in thread
From: Jonathan Tan @ 2018-01-09 18:41 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, sbeller, gitster, peff, philipoakley, stolee, jrnieder

On Tue,  2 Jan 2018 16:18:11 -0800
Brandon Williams <bmwill@google.com> wrote:

> diff --git a/transport.c b/transport.c
> index 63c3dbab9..2378dcb38 100644
> --- a/transport.c
> +++ b/transport.c
> @@ -118,6 +118,7 @@ struct git_transport_data {
>  	struct child_process *conn;
>  	int fd[2];
>  	unsigned got_remote_heads : 1;
> +	enum protocol_version version;

Should this be initialized to protocol_unknown_version? Right now, as
far as I can tell, it is zero-initialized, which means protocol_v0.
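
A small sketch of the concern, assuming the enum values are laid out
with protocol_unknown_version as -1 and protocol_v0 as 0. The struct
transport_data and transport_data_init() below are hypothetical
stand-ins, not the code in transport.c:

```c
#include <string.h>

/* Assumed layout: "unknown" is -1, so all-zero memory reads as v0. */
enum protocol_version {
    protocol_unknown_version = -1,
    protocol_v0 = 0,
    protocol_v1 = 1
};

/* Hypothetical stand-in for struct git_transport_data. */
struct transport_data {
    unsigned got_remote_heads : 1;
    enum protocol_version version;
};

static void transport_data_init(struct transport_data *data)
{
    memset(data, 0, sizeof(*data));           /* version == protocol_v0 */
    data->version = protocol_unknown_version; /* explicit "not discovered" */
}
```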

* Re: [PATCH 07/26] connect: convert get_remote_heads to use struct packet_reader
  2018-01-09 18:27   ` Jonathan Tan
@ 2018-01-09 19:09     ` Brandon Williams
  0 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-01-09 19:09 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, sbeller, gitster, peff, philipoakley, stolee, jrnieder

On 01/09, Jonathan Tan wrote:
> On Tue,  2 Jan 2018 16:18:09 -0800
> Brandon Williams <bmwill@google.com> wrote:
> 
> > -	while ((len = read_remote_ref(in, &src_buf, &src_len, &responded))) {
> > +	while (state != EXPECTING_DONE) {
> > +		switch (packet_reader_read(&reader)) {
> > +		case PACKET_READ_EOF:
> > +			die_initial_contact(1);
> > +		case PACKET_READ_NORMAL:
> > +			len = reader.pktlen;
> > +			if (len > 4 && skip_prefix(packet_buffer, "ERR ", &arg))
> 
> This should be a field in reader, not the global packet_buffer, I think.

Thanks for catching that.

> 
> Also, I did a search of usages of packet_buffer, and there are just a
> few of them - it might be worthwhile to eliminate it, and have each
> component using it allocate its own buffer. But this can be done in a
> separate patch set.

I'll go through and eliminate the references to packet_buffer by passing
in the buffer explicitly.

> 
> > @@ -269,6 +284,8 @@ struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
> >  			if (process_shallow(len, shallow_points))
> >  				break;
> >  			die("protocol error: unexpected '%s'", packet_buffer);
> 
> Here too.

-- 
Brandon Williams

* Re: [PATCH 09/26] transport: store protocol version
  2018-01-09 18:41   ` Jonathan Tan
@ 2018-01-09 19:15     ` Brandon Williams
  0 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-01-09 19:15 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, sbeller, gitster, peff, philipoakley, stolee, jrnieder

On 01/09, Jonathan Tan wrote:
> On Tue,  2 Jan 2018 16:18:11 -0800
> Brandon Williams <bmwill@google.com> wrote:
> 
> > diff --git a/transport.c b/transport.c
> > index 63c3dbab9..2378dcb38 100644
> > --- a/transport.c
> > +++ b/transport.c
> > @@ -118,6 +118,7 @@ struct git_transport_data {
> >  	struct child_process *conn;
> >  	int fd[2];
> >  	unsigned got_remote_heads : 1;
> > +	enum protocol_version version;
> 
> Should this be initialized to protocol_unknown_version? Right now, as
> far as I can tell, it is zero-initialized, which means protocol_v0.

I don't think it matters as the value isn't used until after the
version has already been discovered.

-- 
Brandon Williams

* Re: [PATCH 02/26] pkt-line: introduce struct packet_reader
  2018-01-09 18:08   ` Jonathan Tan
@ 2018-01-09 19:19     ` Brandon Williams
  0 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-01-09 19:19 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, sbeller, gitster, peff, philipoakley, stolee, jrnieder

On 01/09, Jonathan Tan wrote:
> On Tue,  2 Jan 2018 16:18:04 -0800
> Brandon Williams <bmwill@google.com> wrote:
> 
> > diff --git a/pkt-line.h b/pkt-line.h
> > index 06c468927..c446e886a 100644
> > --- a/pkt-line.h
> > +++ b/pkt-line.h
> > @@ -111,6 +111,63 @@ char *packet_read_line_buf(char **src_buf, size_t *src_len, int *size);
> >   */
> >  ssize_t read_packetized_to_strbuf(int fd_in, struct strbuf *sb_out);
> >  
> > +struct packet_reader {
> > +	/* source file descriptor */
> > +	int fd;
> > +
> > +	/* source buffer and its size */
> > +	char *src_buffer;
> > +	size_t src_len;
> > +
> > +	/* buffer that pkt-lines are read into and its size */
> > +	char *buffer;
> > +	unsigned buffer_size;
> 
> Is the intention to support different buffers in the future?

Potentially at some point.

> 
> [snip]
> 
> > +/*
> > + * Peek the next packet line without consuming it and return the status.
> > + * The next call to 'packet_reader_read()' will perform a read of the same line
> > + * that was peeked, consuming the line.
> > + *
> > + * Only a single line can be peeked at a time.
> 
> It is logical to me that if you peeked at a line, and then peeked at it
> again, you will get the same line - I would phrase this not as a
> restriction ("only a single line") but just as a statement of fact (e.g.
> "Peeking at the same line multiple times without an intervening
> packet_reader_read will return the same result").

Fair enough, I'll change the wording.

> 
> > + */
> > +extern enum packet_read_status packet_reader_peek(struct packet_reader *reader);
> > +
> >  #define DEFAULT_PACKET_MAX 1000
> >  #define LARGE_PACKET_MAX 65520
> >  #define LARGE_PACKET_DATA_MAX (LARGE_PACKET_MAX - 4)

-- 
Brandon Williams

* Re: [PATCH 01/26] pkt-line: introduce packet_read_with_status
  2018-01-09 18:04   ` Jonathan Tan
@ 2018-01-09 19:28     ` Brandon Williams
  0 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-01-09 19:28 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, sbeller, gitster, peff, philipoakley, stolee, jrnieder

On 01/09, Jonathan Tan wrote:
> On Tue,  2 Jan 2018 16:18:03 -0800
> Brandon Williams <bmwill@google.com> wrote:
> 
> > -int packet_read(int fd, char **src_buf, size_t *src_len,
> > -		char *buffer, unsigned size, int options)
> > +enum packet_read_status packet_read_with_status(int fd, char **src_buffer, size_t *src_len,
> > +						char *buffer, unsigned size, int *pktlen,
> > +						int options)
> >  {
> > -	int len, ret;
> > +	int len;
> >  	char linelen[4];
> >  
> > -	ret = get_packet_data(fd, src_buf, src_len, linelen, 4, options);
> > -	if (ret < 0)
> > -		return ret;
> > +	if (get_packet_data(fd, src_buffer, src_len, linelen, 4, options) < 0)
> > +		return PACKET_READ_EOF;
> > +
> >  	len = packet_length(linelen);
> >  	if (len < 0)
> >  		die("protocol error: bad line length character: %.4s", linelen);
> > -	if (!len) {
> > +
> > +	if (len == 0) {
> 
> This change (replacing "!len" with "len == 0") is unnecessary, I think.
> 
> >  		packet_trace("0000", 4, 0);
> > -		return 0;
> > +		return PACKET_READ_FLUSH;
> > +	} else if (len >= 1 && len <= 3) {
> > +		die("protocol error: bad line length character: %.4s", linelen);
> >  	}
> 
> This seems to be more of a "bad line length" than a "bad line length
> character".

I'll make these changes, though I do think this needs to stay as a "bad
line length character" as the len could be negative, which indicates
that parsing the linelen characters failed.

> 
> Also, some of the checks are redundant. Above, it is probably better to
> delete "len >= 1", and optionally write "len < 4" instead of "len <= 3"
> (to emphasize that the subtraction of 4 below does not result in a
> negative value).
> 
> > +
> >  	len -= 4;
> > -	if (len >= size)
> > +	if ((len < 0) || ((unsigned)len >= size))
> >  		die("protocol error: bad line length %d", len);
> 
> The "len < 0" check is redundant.
> 
> > -	ret = get_packet_data(fd, src_buf, src_len, buffer, len, options);
> > -	if (ret < 0)
> > -		return ret;
> > +
> > +	if (get_packet_data(fd, src_buffer, src_len, buffer, len, options) < 0)
> > +		return PACKET_READ_EOF;
> >  
> >  	if ((options & PACKET_READ_CHOMP_NEWLINE) &&
> >  	    len && buffer[len-1] == '\n')

-- 
Brandon Williams

* Re: [PATCH 11/26] serve: introduce git-serve
  2018-01-03  0:18 ` [PATCH 11/26] serve: introduce git-serve Brandon Williams
@ 2018-01-09 20:24   ` Jonathan Tan
  2018-01-09 22:16     ` Brandon Williams
  2018-02-01 18:48   ` Jeff Hostetler
  1 sibling, 1 reply; 362+ messages in thread
From: Jonathan Tan @ 2018-01-09 20:24 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, sbeller, gitster, peff, philipoakley, stolee, jrnieder

On Tue,  2 Jan 2018 16:18:13 -0800
Brandon Williams <bmwill@google.com> wrote:

> diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
> new file mode 100644
> index 000000000..b87ba3816
> --- /dev/null
> +++ b/Documentation/technical/protocol-v2.txt

I'll review the documentation later, once there is some consensus that
the overall design is OK. (Or maybe there already is consensus?)

> diff --git a/builtin/serve.c b/builtin/serve.c
> new file mode 100644
> index 000000000..bb726786a
> --- /dev/null
> +++ b/builtin/serve.c
> @@ -0,0 +1,30 @@
> +#include "cache.h"
> +#include "builtin.h"
> +#include "parse-options.h"
> +#include "serve.h"
> +
> +static char const * const grep_usage[] = {

Should be serve_usage.

> diff --git a/serve.c b/serve.c
> new file mode 100644
> index 000000000..da8127775
> --- /dev/null
> +++ b/serve.c

[snip]

> +struct protocol_capability {
> +	const char *name; /* capability name */

Maybe document as:

  The name of the capability. The server uses this name when advertising
  this capability, and the client uses this name to invoke the command
  corresponding to this capability.

> +	/*
> +	 * Function queried to see if a capability should be advertised.
> +	 * Optionally a value can be specified by adding it to 'value'.
> +	 */
> +	int (*advertise)(struct repository *r, struct strbuf *value);

Document what happens when value is appended to. For example:

  ... If value is appended to, the server will advertise this capability
  as <name>=<value> instead of <name>.

> +	/*
> +	 * Function called when a client requests the capability as a command.
> +	 * The command request will be provided to the function via 'keys', the
> +	 * capabilities requested, and 'args', the command specific parameters.
> +	 *
> +	 * This field should be NULL for capabilities which are not commands.
> +	 */
> +	int (*command)(struct repository *r,
> +		       struct argv_array *keys,
> +		       struct argv_array *args);

Looking at the code below, I see that the command is not executed unless
advertise returns true - this means that a command cannot be both
supported and unadvertised. Would this be too restrictive? For example,
this would disallow a gradual across-multiple-servers rollout in which
we allow but not advertise a capability, and then after some time,
advertise the capability.

If we change this, then the value parameter of advertise can be
mandatory instead of optional.

* Re: [PATCH 12/26] ls-refs: introduce ls-refs server command
  2018-01-03  0:18 ` [PATCH 12/26] ls-refs: introduce ls-refs server command Brandon Williams
  2018-01-04  0:17   ` Stefan Beller
@ 2018-01-09 20:50   ` Jonathan Tan
  2018-01-16 19:23     ` Brandon Williams
  2018-02-01 19:16   ` Jeff Hostetler
  2 siblings, 1 reply; 362+ messages in thread
From: Jonathan Tan @ 2018-01-09 20:50 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, sbeller, gitster, peff, philipoakley, stolee, jrnieder

On Tue,  2 Jan 2018 16:18:14 -0800
Brandon Williams <bmwill@google.com> wrote:

> +  symrefs: In addition to the object pointed by it, show the underlying
> +	   ref pointed by it when showing a symbolic ref.
> +  peel: Show peeled tags.
> +  ref-pattern <pattern>: When specified, only references matching the
> +			 given patterns are displayed.

I notice "symrefs" being tested in patch 13 and "ref-pattern" being
tested in patch 16. Is it possible to make a test for "peel" as well?
(Or is it being tested somewhere I didn't notice?)

* Re: [PATCH 11/26] serve: introduce git-serve
  2018-01-09 20:24   ` Jonathan Tan
@ 2018-01-09 22:16     ` Brandon Williams
  2018-01-09 22:28       ` Jonathan Tan
  0 siblings, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-01-09 22:16 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, sbeller, gitster, peff, philipoakley, stolee, jrnieder

On 01/09, Jonathan Tan wrote:
> On Tue,  2 Jan 2018 16:18:13 -0800
> Brandon Williams <bmwill@google.com> wrote:
> 
> > diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
> > new file mode 100644
> > index 000000000..b87ba3816
> > --- /dev/null
> > +++ b/Documentation/technical/protocol-v2.txt
> 
> I'll review the documentation later, once there is some consensus that
> the overall design is OK. (Or maybe there already is consensus?)
> 
> > diff --git a/builtin/serve.c b/builtin/serve.c
> > new file mode 100644
> > index 000000000..bb726786a
> > --- /dev/null
> > +++ b/builtin/serve.c
> > @@ -0,0 +1,30 @@
> > +#include "cache.h"
> > +#include "builtin.h"
> > +#include "parse-options.h"
> > +#include "serve.h"
> > +
> > +static char const * const grep_usage[] = {
> 
> Should be serve_usage.
> 
> > diff --git a/serve.c b/serve.c
> > new file mode 100644
> > index 000000000..da8127775
> > --- /dev/null
> > +++ b/serve.c
> 
> [snip]
> 
> > +struct protocol_capability {
> > +	const char *name; /* capability name */
> 
> Maybe document as:
> 
>   The name of the capability. The server uses this name when advertising
>   this capability, and the client uses this name to invoke the command
>   corresponding to this capability.
> 
> > +	/*
> > +	 * Function queried to see if a capability should be advertised.
> > +	 * Optionally a value can be specified by adding it to 'value'.
> > +	 */
> > +	int (*advertise)(struct repository *r, struct strbuf *value);
> 
> Document what happens when value is appended to. For example:
> 
>   ... If value is appended to, the server will advertise this capability
>   as <name>=<value> instead of <name>.
> 

All good documentation changes.

> > +	/*
> > +	 * Function called when a client requests the capability as a command.
> > +	 * The command request will be provided to the function via 'keys', the
> > +	 * capabilities requested, and 'args', the command specific parameters.
> > +	 *
> > +	 * This field should be NULL for capabilities which are not commands.
> > +	 */
> > +	int (*command)(struct repository *r,
> > +		       struct argv_array *keys,
> > +		       struct argv_array *args);
> 
> Looking at the code below, I see that the command is not executed unless
> advertise returns true - this means that a command cannot be both
> supported and unadvertised. Would this be too restrictive? For example,
> this would disallow a gradual across-multiple-servers rollout in which
> we allow but not advertise a capability, and then after some time,
> advertise the capability.

One way to change this would be to just add another function to the
struct which is called to check if the command is allowed, instead of
relying on the same function to do that for both advertise and
allow...though I don't see a big win for allowing a command but not
advertising it.
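
For illustration only, a separate check could look roughly like the
sketch below. The simplified struct capability, the 'allowed' member
and the "new-command" entry are all hypothetical; the callbacks take no
arguments here, unlike the ones in the patch:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified table entry; not the struct from the patch. */
struct capability {
    const char *name;
    int (*advertise)(void); /* include in the capability advertisement? */
    int (*allowed)(void);   /* accept it when a client requests it? */
    int (*command)(void);   /* NULL for capabilities that are not commands */
};

static int yes(void) { return 1; }
static int no(void) { return 0; }
static int run_ok(void) { return 0; }

static const struct capability capabilities[] = {
    { "ls-refs", yes, yes, run_ok },
    { "new-command", no, yes, run_ok }, /* allowed but not yet advertised */
};

#define NR_CAPABILITIES (sizeof(capabilities) / sizeof(capabilities[0]))

static int is_advertised(const char *name)
{
    size_t i;

    for (i = 0; i < NR_CAPABILITIES; i++)
        if (!strcmp(capabilities[i].name, name))
            return capabilities[i].advertise();
    return 0;
}

/* Look up a command by name, gated on 'allowed' rather than 'advertise'. */
static const struct capability *find_command(const char *name)
{
    size_t i;

    for (i = 0; i < NR_CAPABILITIES; i++) {
        const struct capability *c = &capabilities[i];
        if (!strcmp(c->name, name) && c->command && c->allowed())
            return c;
    }
    return NULL;
}
```

With this split, a server in the middle of a rollout could accept
"new-command" from clients that already know about it while leaving it
out of the advertisement.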

> 
> If we change this, then the value parameter of advertise can be
> mandatory instead of optional.

I don't see how this fixes the issue you bring up.

-- 
Brandon Williams

* Re: [PATCH 13/26] connect: request remote refs using v2
  2018-01-03  0:18 ` [PATCH 13/26] connect: request remote refs using v2 Brandon Williams
@ 2018-01-09 22:24   ` Jonathan Tan
  0 siblings, 0 replies; 362+ messages in thread
From: Jonathan Tan @ 2018-01-09 22:24 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, sbeller, gitster, peff, philipoakley, stolee, jrnieder

On Tue,  2 Jan 2018 16:18:15 -0800
Brandon Williams <bmwill@google.com> wrote:

> diff --git a/connect.c b/connect.c
> index caa539b75..9badd403f 100644
> --- a/connect.c
> +++ b/connect.c
> @@ -12,9 +12,11 @@
>  #include "sha1-array.h"
>  #include "transport.h"
>  #include "strbuf.h"
> +#include "version.h"
>  #include "protocol.h"
>  
>  static char *server_capabilities;
> +static struct argv_array server_capabilities_v2 = ARGV_ARRAY_INIT;
>  static const char *parse_feature_value(const char *, const char *, int *);
>  
>  static int check_ref(const char *name, unsigned int flags)
> @@ -62,6 +64,33 @@ static void die_initial_contact(int unexpected)
>  		      "and the repository exists."));
>  }
>  
> +static int server_supports_v2(const char *c, int die_on_error)

Document what "c" means.

[snip]

> +static void process_capabilities_v2(struct packet_reader *reader)
> +{
> +	while (packet_reader_read(reader) == PACKET_READ_NORMAL) {
> +		argv_array_push(&server_capabilities_v2, reader->line);
> +	}

No need for braces on single-line blocks.

> +static int process_ref_v2(const char *line, struct ref ***list)

The "list" is the tail of a linked list, so maybe name it "tail"
instead.
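
For readers unfamiliar with the idiom, a triple pointer like this is
typically used to append in constant time: the "tail" points at the
next pointer that should receive the new element. A generic sketch,
using a hypothetical struct node rather than struct ref:

```c
#include <stdlib.h>

struct node {
    int value;
    struct node *next;
};

/*
 * Append a node. '*tail' points at the pointer to fill in: the list
 * head while the list is empty, afterwards the last node's 'next'.
 */
static int append(struct node ***tail, int value)
{
    struct node *n = calloc(1, sizeof(*n));

    if (!n)
        return -1;
    n->value = value;
    **tail = n;        /* link the new node into the list */
    *tail = &n->next;  /* advance the tail to the new end */
    return 0;
}
```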

* Re: [PATCH 11/26] serve: introduce git-serve
  2018-01-09 22:16     ` Brandon Williams
@ 2018-01-09 22:28       ` Jonathan Tan
  2018-01-09 22:34         ` Brandon Williams
  0 siblings, 1 reply; 362+ messages in thread
From: Jonathan Tan @ 2018-01-09 22:28 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, sbeller, gitster, peff, philipoakley, stolee, jrnieder

On Tue, 9 Jan 2018 14:16:42 -0800
Brandon Williams <bmwill@google.com> wrote:

> All good documentation changes.

Thanks!

> > > +	/*
> > > +	 * Function called when a client requests the capability as a command.
> > > +	 * The command request will be provided to the function via 'keys', the
> > > +	 * capabilities requested, and 'args', the command specific parameters.
> > > +	 *
> > > +	 * This field should be NULL for capabilities which are not commands.
> > > +	 */
> > > +	int (*command)(struct repository *r,
> > > +		       struct argv_array *keys,
> > > +		       struct argv_array *args);
> > 
> > Looking at the code below, I see that the command is not executed unless
> > advertise returns true - this means that a command cannot be both
> > supported and unadvertised. Would this be too restrictive? For example,
> > this would disallow a gradual across-multiple-servers rollout in which
> > we allow but not advertise a capability, and then after some time,
> > advertise the capability.
> 
> One way to change this would be to just add another function to the
> struct which is called to check if the command is allowed, instead of
> relying on the same function to do that for both advertise and
> allow...though I don't see a big win for allowing a command but not
> advertising it.

My rationale for allowing a command but not advertising it is in the
paragraph above (that you quoted), but if that is insufficient
rationale, then I agree that we don't need to do this.

> > If we change this, then the value parameter of advertise can be
> > mandatory instead of optional.
> 
> I don't see how this fixes the issue you bring up.

This is a consequence, not a fix - if we were to do as I suggested, then
we no longer need to invoke advertise to check whether something is
advertised except when we are advertising them, in which case "value"
never needs to be NULL.
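
To make the consequence concrete, here is a minimal sketch of the split
being discussed, not the code from the patch.  Every name in it
(capability_sketch, allowed, is_allowed_sketch, and so on) is
hypothetical, and the table is a simplified stand-in for the real one.
With a separate "allowed" callback, "advertise" is only ever called
while advertising, so it can require a non-NULL value buffer:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a capability table entry. */
struct capability_sketch {
	const char *name;
	/* Only called while building the advertisement, so value is never NULL. */
	int (*advertise)(char *value, size_t value_len);
	/* Called when a client requests the capability as a command. */
	int (*allowed)(void);
};

static int advertise_yes(char *value, size_t value_len)
{
	assert(value && value_len > 0);
	value[0] = '\0'; /* no value to report */
	return 1;
}

static int advertise_no(char *value, size_t value_len)
{
	assert(value && value_len > 0);
	value[0] = '\0';
	return 0;
}

static int allowed_yes(void)
{
	return 1;
}

static const struct capability_sketch capabilities_sketch[] = {
	{ "ls-refs", advertise_yes, allowed_yes },
	/* Mid-rollout: accepted when requested, but not yet advertised. */
	{ "new-command", advertise_no, allowed_yes },
};

static const struct capability_sketch *find_capability_sketch(const char *name)
{
	size_t i;
	size_t nr = sizeof(capabilities_sketch) / sizeof(capabilities_sketch[0]);

	for (i = 0; i < nr; i++)
		if (!strcmp(capabilities_sketch[i].name, name))
			return &capabilities_sketch[i];
	return NULL;
}

/* Used only when generating the capability advertisement. */
static int is_advertised_sketch(const char *name)
{
	char value[64];
	const struct capability_sketch *c = find_capability_sketch(name);

	return c && c->advertise(value, sizeof(value));
}

/* Used only when deciding whether to run a requested command. */
static int is_allowed_sketch(const char *name)
{
	const struct capability_sketch *c = find_capability_sketch(name);

	return c && c->allowed();
}
```

In this sketch "new-command" can be requested by a client that already
knows about it, while servers that have not finished rolling out simply
do not list it.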


* Re: [PATCH 11/26] serve: introduce git-serve
  2018-01-09 22:28       ` Jonathan Tan
@ 2018-01-09 22:34         ` Brandon Williams
  0 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-01-09 22:34 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, sbeller, gitster, peff, philipoakley, stolee, jrnieder

On 01/09, Jonathan Tan wrote:
> On Tue, 9 Jan 2018 14:16:42 -0800
> Brandon Williams <bmwill@google.com> wrote:
> 
> > All good documentation changes.
> 
> Thanks!
> 
> > > > +	/*
> > > > +	 * Function called when a client requests the capability as a command.
> > > > +	 * The command request will be provided to the function via 'keys', the
> > > > +	 * capabilities requested, and 'args', the command specific parameters.
> > > > +	 *
> > > > +	 * This field should be NULL for capabilities which are not commands.
> > > > +	 */
> > > > +	int (*command)(struct repository *r,
> > > > +		       struct argv_array *keys,
> > > > +		       struct argv_array *args);
> > > 
> > > Looking at the code below, I see that the command is not executed unless
> > > advertise returns true - this means that a command cannot be both
> > > supported and unadvertised. Would this be too restrictive? For example,
> > > this would disallow a gradual across-multiple-servers rollout in which
> > > we allow but not advertise a capability, and then after some time,
> > > advertise the capability.
> > 
> > One way to change this would be to just add another function to the
> > struct which is called to check if the command is allowed, instead of
> > relying on the same function to do that for both advertise and
> > allow...though I don't see a big win for allowing a command but not
> > advertising it.
> 
> My rationale for allowing a command but not advertising it is in the
> paragraph above (that you quoted), but if that is insufficient
> rationale, then I agree that we don't need to do this.

I have no issues with adding that functionality; I don't really feel
that strongly one way or another.  Just seemed like additional work for
not much gain right now, key being right now.  It very well may be worth
it for the use case you specified.  If so I can definitely make the
change.

> 
> > > If we change this, then the value parameter of advertise can be
> > > mandatory instead of optional.
> > 
> > I don't see how this fixes the issue you bring up.
> 
> This is a consequence, not a fix - if we were to do as I suggested, then
> we no longer need to invoke advertise to check whether something is
> advertised except when we are advertising them, in which case "value"
> never needs to be NULL.

Oh I understand what you are trying to explain, yes you're right.

-- 
Brandon Williams


* Re: [PATCH 20/26] fetch-pack: perform a fetch using v2
  2018-01-03  0:18 ` [PATCH 20/26] fetch-pack: perform a fetch using v2 Brandon Williams
  2018-01-04  1:23   ` Stefan Beller
@ 2018-01-10  0:05   ` Jonathan Tan
  1 sibling, 0 replies; 362+ messages in thread
From: Jonathan Tan @ 2018-01-10  0:05 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, sbeller, gitster, peff, philipoakley, stolee, jrnieder

On Tue,  2 Jan 2018 16:18:22 -0800
Brandon Williams <bmwill@google.com> wrote:

> +static enum ack_type process_ack(const char *line, struct object_id *oid)
> +{
> +	const char *arg;
> +
> +	if (!strcmp(line, "NAK"))
> +		return NAK;
> +	if (skip_prefix(line, "ACK ", &arg)) {
> +		if (!parse_oid_hex(arg, oid, &arg)) {
> +			if (strstr(arg, "continue"))
> +				return ACK_continue;

This function seems to be only used for v2, so I don't think we need to
parse "continue".

Also, maybe describe the plan for supporting functionality not supported
yet (e.g. server-side declaration of shallows and client-side "deepen").

It may be possible to delay support for server-side shallows on the
server (that is, only implement support for it in the client) since the
server can just declare that it doesn't support protocol v2 when serving
such repos (although it might just be easier to implement server-side
support in this case).

For "deepen", we need support for it both on the client and the server
now unless we plan to declare a "deepen" capability in the future (then,
as of these patches, clients that require "deepen" will use protocol v1;
when a new server declares "deepen", old clients will ignore it and keep
the status quo, and new clients can then use "deepen").

There may be others that I've missed.


* Re: [PATCH 26/26] remote-curl: implement connect-half-duplex command
  2018-01-03  0:18 ` [PATCH 26/26] remote-curl: implement connect-half-duplex command Brandon Williams
@ 2018-01-10  0:10   ` Jonathan Tan
  2018-01-10 17:57   ` Jonathan Tan
  1 sibling, 0 replies; 362+ messages in thread
From: Jonathan Tan @ 2018-01-10  0:10 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, sbeller, gitster, peff, philipoakley, stolee, jrnieder

On Tue,  2 Jan 2018 16:18:28 -0800
Brandon Williams <bmwill@google.com> wrote:

> Teach remote-curl the 'connect-half-duplex' command which is used to
> establish a half-duplex connection with servers which support protocol
> version 2.  This allows remote-curl to act as a proxy, allowing the git
> client to communicate natively with a remote end, simply using
> remote-curl as a pass through to convert requests to http.
> 
> Signed-off-by: Brandon Williams <bmwill@google.com>
> ---
>  remote-curl.c          | 185 ++++++++++++++++++++++++++++++++++++++++++++++++-
>  t/t5701-protocol-v2.sh |  41 +++++++++++
>  2 files changed, 224 insertions(+), 2 deletions(-)

I didn't look at the usage of the curl API in detail, but overall this
looks good. I'm pleasantly surprised that it didn't take as many lines
of code as I expected.

Overall everything looks good, except for the points that I have brought
up in my other e-mails.

> diff --git a/remote-curl.c b/remote-curl.c
> index 4086aa733..b63b06398 100644
> --- a/remote-curl.c
> +++ b/remote-curl.c

[snip]

> +struct proxy_state {
> +	char *service_name;
> +	char *service_url;
> +	char *hdr_content_type;
> +	char *hdr_accept;

Maybe document that the above 3 fields (service_url to hdr_accept) are
cached because we need to pass them to curl_easy_setopt() for every
request.


* Re: [PATCH 26/26] remote-curl: implement connect-half-duplex command
  2018-01-03  0:18 ` [PATCH 26/26] remote-curl: implement connect-half-duplex command Brandon Williams
  2018-01-10  0:10   ` Jonathan Tan
@ 2018-01-10 17:57   ` Jonathan Tan
  2018-01-11  1:09     ` Brandon Williams
  1 sibling, 1 reply; 362+ messages in thread
From: Jonathan Tan @ 2018-01-10 17:57 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, sbeller, gitster, peff, philipoakley, stolee, jrnieder

On Tue,  2 Jan 2018 16:18:28 -0800
Brandon Williams <bmwill@google.com> wrote:

> +static size_t proxy_in(void *ptr, size_t eltsize,
> +		       size_t nmemb, void *buffer_)

OK, I managed to look at the Curl stuff in more detail.

I know that these parameter names are what remote-curl.c has been using
for its callbacks, but I find them confusing (in particular, some Curl
documentation rightly refers to the 1st parameter as a buffer, and the
4th parameter is actually userdata). Also, according to the Curl
documentation, the type of the first parameter is "char *". Could we
change the type of the first parameter to "char *", and the name of the
fourth parameter either to "proxy_state_" or "userdata"?

> +{
> +	size_t max = eltsize * nmemb;
> +	struct proxy_state *p = buffer_;
> +	size_t avail = p->request_buffer.len - p->pos;
> +
> +	if (!avail) {
> +		if (p->seen_flush) {
> +			p->seen_flush = 0;
> +			return 0;
> +		}
> +
> +		strbuf_reset(&p->request_buffer);
> +		switch (packet_reader_read(&p->reader)) {
> +		case PACKET_READ_EOF:
> +			die("error reading request from parent process");

This should say "BUG:", I think. I'm not sure what the best way of
explaining it is, but basically connect_half_duplex is supposed to
ensure (by peeking) that there is no EOF when proxy_in() is called.

> +		case PACKET_READ_NORMAL:
> +			packet_buf_write_len(&p->request_buffer, p->reader.line,
> +					     p->reader.pktlen);
> +			break;
> +		case PACKET_READ_DELIM:
> +			packet_buf_delim(&p->request_buffer);
> +			break;
> +		case PACKET_READ_FLUSH:
> +			packet_buf_flush(&p->request_buffer);
> +			p->seen_flush = 1;
> +			break;
> +		}
> +		p->pos = 0;
> +		avail = p->request_buffer.len;
> +	}
> +
> +	if (max < avail)
> +		avail = max;
> +	memcpy(ptr, p->request_buffer.buf + p->pos, avail);
> +	p->pos += avail;
> +	return avail;

Thanks, this looks correct. I wish that the Curl API had a way for us to
say "here are 4 more bytes, and that is all" instead of us having to
make a note (p->seen_flush) to remember to return 0 on the next call,
but that's the way it is.
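
For readers less familiar with this callback contract, the pattern can
be shown without Curl at all.  The following is a minimal sketch, not
the patch's code: read_state_sketch and read_cb_sketch are hypothetical
names, and the packet source is a plain array of already-framed
pkt-lines instead of a packet_reader.  It only assumes the contract
described above, namely that the callback returns the number of bytes
copied and that a return of 0 ends the request body:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical state; the fields loosely mirror the quoted proxy_state. */
struct read_state_sketch {
	const char **packets; /* already-framed pkt-lines */
	size_t nr, next;      /* number of packets, index of the next one */
	const char *cur;      /* packet currently being handed out */
	size_t cur_len, pos;
	int seen_flush;       /* note to self: return 0 on the next call */
};

/* Read callback in the style of a CURLOPT_READFUNCTION handler. */
static size_t read_cb_sketch(char *buffer, size_t size, size_t nmemb,
			     void *userdata)
{
	struct read_state_sketch *s = userdata;
	size_t max = size * nmemb;
	size_t avail;

	assert(s && buffer);
	avail = s->cur_len - s->pos;

	if (!avail) {
		if (s->seen_flush || s->next >= s->nr) {
			/* The flush went out last time; end this request. */
			s->seen_flush = 0;
			return 0;
		}

		s->cur = s->packets[s->next++];
		s->cur_len = strlen(s->cur);
		s->pos = 0;
		avail = s->cur_len;
		if (!strcmp(s->cur, "0000"))
			s->seen_flush = 1;
	}

	if (max < avail)
		avail = max;
	memcpy(buffer, s->cur + s->pos, avail);
	s->pos += avail;
	return avail;
}
```

The flush bytes and the end-of-body signal need two separate calls,
which is why the flag has to survive between them.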

> +}
> +static size_t proxy_out(char *ptr, size_t eltsize,
> +			size_t nmemb, void *buffer_)

Add a blank line before proxy_out. Also, same comment as proxy_in()
about the function signature.

> +{
> +	size_t size = eltsize * nmemb;
> +	struct proxy_state *p = buffer_;
> +
> +	write_or_die(p->out, ptr, size);
> +	return size;
> +}
> +
> +static int proxy_post(struct proxy_state *p)
> +{
> +	struct active_request_slot *slot;
> +	struct curl_slist *headers = http_copy_default_headers();
> +	int err;
> +
> +	headers = curl_slist_append(headers, p->hdr_content_type);
> +	headers = curl_slist_append(headers, p->hdr_accept);
> +	headers = curl_slist_append(headers, "Transfer-Encoding: chunked");
> +
> +	slot = get_active_slot();
> +
> +	curl_easy_setopt(slot->curl, CURLOPT_NOBODY, 0);
> +	curl_easy_setopt(slot->curl, CURLOPT_POST, 1);
> +	curl_easy_setopt(slot->curl, CURLOPT_URL, p->service_url);
> +	curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, headers);

I looked at the Curl documentation for CURLOPT_HTTPHEADER and
curl_easy_setopt doesn't consume the argument here (in fact, it asks us
to keep "headers" around), so it might be possible to just generate the
headers once in proxy_state_init().

> +
> +	/* Setup function to read request from client */
> +	curl_easy_setopt(slot->curl, CURLOPT_READFUNCTION, proxy_in);
> +	curl_easy_setopt(slot->curl, CURLOPT_READDATA, p);
> +
> +	/* Setup function to write server response to client */
> +	curl_easy_setopt(slot->curl, CURLOPT_WRITEFUNCTION, proxy_out);
> +	curl_easy_setopt(slot->curl, CURLOPT_WRITEDATA, p);
> +
> +	err = run_slot(slot, NULL);
> +
> +	if (err != HTTP_OK)
> +		err = -1;

This seems to mean that we cannot have two requests in flight at the
same time even while there is no response (from the fact that we have a
HTTP status code after returning from run_slot()).

I thought that git fetch over HTTP uses the two-requests-in-flight
optimization that it also does over other protocols like SSH, but I see
that that code path (fetch_git() in remote-curl.c) also uses run_slot()
indirectly, so maybe my assumption is wrong. Anyway, this is outside the
scope of this patch.

> +
> +	curl_slist_free_all(headers);
> +	return err;
> +}
> +
> +static int connect_half_duplex(const char *service_name)
> +{
> +	struct discovery *discover;
> +	struct proxy_state p;
> +
> +	/*
> +	 * Run the info/refs request and see if the server supports protocol
> +	 * v2.  If and only if the server supports v2 can we successfully
> +	 * establish a half-duplex connection, otherwise we need to tell the
> +	 * client to fallback to using other transport helper functions to
> +	 * complete their request.
> +	 */
> +	discover = discover_refs(service_name, 0);
> +	if (discover->version != protocol_v2) {
> +		printf("fallback\n");
> +		fflush(stdout);
> +		return -1;
> +	} else {
> +		/* Half-Duplex Connection established */
> +		printf("\n");
> +		fflush(stdout);
> +	}
> +
> +	proxy_state_init(&p, service_name);
> +
> +	/*
> +	 * Dump the capability listing that we got from the server earlier
> +	 * during the info/refs request.
> +	 */
> +	write_or_die(p.out, discover->buf, discover->len);
> +
> +	/* Peek the next packet line.  Until we see EOF keep sending POSTs */
> +	while (packet_reader_peek(&p.reader) != PACKET_READ_EOF) {
> +		if (proxy_post(&p)) {
> +			/* We would have an err here */

Probably better to comment "Error message already printed by
proxy_post".


* Re: [PATCH 00/26] protocol version 2
  2018-01-09 17:55 ` [PATCH 00/26] protocol version 2 Jonathan Tan
@ 2018-01-11  0:23   ` Brandon Williams
  0 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-01-11  0:23 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, sbeller, gitster, peff, philipoakley, stolee, jrnieder

On 01/09, Jonathan Tan wrote:
> On Tue,  2 Jan 2018 16:18:02 -0800
> Brandon Williams <bmwill@google.com> wrote:
> 
> >  * Introduce a new remote-helper command 'connect-half-duplex' which is
> >    implemented by remote-curl (the http remote-helper).  This allows for a
> >    client to establish a half-duplex connection and use remote-curl as a proxy
> >    to wrap requests in http before sending them to the remote end and
> >    unwrapping the responses and sending them back to the client's stdin.
> 
> I'm not sure about the "half-duplex" name - it is half-duplex in that
> each side must terminate their communications with a flush, but not
> half-duplex in that request-response pairs can overlap each other (e.g.
> during negotiation during fetch - there is an optimization in which the
> client tries to keep two requests pending at a time). I think that the
> idea we want to communicate is that requests and responses are always
> packetized, stateless, and always happen as a pair.
> 
> I wonder if "stateless-connect" is a better keyword - it makes sense to
> me (once described) that "stateless" implies that the client sends
> everything the server needs at once (thus, in a packet), the server
> sends everything the client needs back at once (thus, in a packet), and
> then the client must not assume any state-storing on the part of the
> server or transport.

I like that name much better, I think I'll change it to use
'stateless-connect'.  Thanks :)


-- 
Brandon Williams


* Re: [PATCH 26/26] remote-curl: implement connect-half-duplex command
  2018-01-10 17:57   ` Jonathan Tan
@ 2018-01-11  1:09     ` Brandon Williams
  0 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-01-11  1:09 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, sbeller, gitster, peff, philipoakley, stolee, jrnieder

On 01/10, Jonathan Tan wrote:
> On Tue,  2 Jan 2018 16:18:28 -0800
> Brandon Williams <bmwill@google.com> wrote:
> 
> > +static size_t proxy_in(void *ptr, size_t eltsize,
> > +		       size_t nmemb, void *buffer_)
> 
> OK, I managed to look at the Curl stuff in more detail.
> 
> I know that these parameter names are what remote-curl.c has been using
> for its callbacks, but I find them confusing (in particular, some Curl
> documentation rightly refers to the 1st parameter as a buffer, and the
> 4th parameter is actually userdata). Also, according to the Curl
> documentation, the type of the first parameter is "char *". Could we
> change the type of the first parameter to "char *", and the name of the
> fourth parameter either to "proxy_state_" or "userdata"?

Sounds good, I'll make the change.

> 
> > +{
> > +	size_t max = eltsize * nmemb;
> > +	struct proxy_state *p = buffer_;
> > +	size_t avail = p->request_buffer.len - p->pos;
> > +
> > +	if (!avail) {
> > +		if (p->seen_flush) {
> > +			p->seen_flush = 0;
> > +			return 0;
> > +		}
> > +
> > +		strbuf_reset(&p->request_buffer);
> > +		switch (packet_reader_read(&p->reader)) {
> > +		case PACKET_READ_EOF:
> > +			die("error reading request from parent process");
> 
> This should say "BUG:", I think. I'm not sure what the best way of
> explaining it is, but basically connect_half_duplex is supposed to
> ensure (by peeking) that there is no EOF when proxy_in() is called.

This wouldn't necessarily be a bug if the parent dies early for some
reason though right?

> 
> > +		case PACKET_READ_NORMAL:
> > +			packet_buf_write_len(&p->request_buffer, p->reader.line,
> > +					     p->reader.pktlen);
> > +			break;
> > +		case PACKET_READ_DELIM:
> > +			packet_buf_delim(&p->request_buffer);
> > +			break;
> > +		case PACKET_READ_FLUSH:
> > +			packet_buf_flush(&p->request_buffer);
> > +			p->seen_flush = 1;
> > +			break;
> > +		}
> > +		p->pos = 0;
> > +		avail = p->request_buffer.len;
> > +	}
> > +
> > +	if (max < avail)
> > +		avail = max;
> > +	memcpy(ptr, p->request_buffer.buf + p->pos, avail);
> > +	p->pos += avail;
> > +	return avail;
> 
> Thanks, this looks correct. I wish that the Curl API had a way for us to
> say "here are 4 more bytes, and that is all" instead of us having to
> make a note (p->seen_flush) to remember to return 0 on the next call,
> but that's the way it is.
> 
> > +}
> > +static size_t proxy_out(char *ptr, size_t eltsize,
> > +			size_t nmemb, void *buffer_)
> 
> Add a blank line before proxy_out. Also, same comment as proxy_in()
> about the function signature.

I'll change this function too.

> 
> > +{
> > +	size_t size = eltsize * nmemb;
> > +	struct proxy_state *p = buffer_;
> > +
> > +	write_or_die(p->out, ptr, size);
> > +	return size;
> > +}
> > +
> > +static int proxy_post(struct proxy_state *p)
> > +{
> > +	struct active_request_slot *slot;
> > +	struct curl_slist *headers = http_copy_default_headers();
> > +	int err;
> > +
> > +	headers = curl_slist_append(headers, p->hdr_content_type);
> > +	headers = curl_slist_append(headers, p->hdr_accept);
> > +	headers = curl_slist_append(headers, "Transfer-Encoding: chunked");
> > +
> > +	slot = get_active_slot();
> > +
> > +	curl_easy_setopt(slot->curl, CURLOPT_NOBODY, 0);
> > +	curl_easy_setopt(slot->curl, CURLOPT_POST, 1);
> > +	curl_easy_setopt(slot->curl, CURLOPT_URL, p->service_url);
> > +	curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, headers);
> 
> I looked at the Curl documentation for CURLOPT_HTTPHEADER and
> curl_easy_setopt doesn't consume the argument here (in fact, it asks us
> to keep "headers" around), so it might be possible to just generate the
> headers once in proxy_state_init().

Yeah I'll go ahead and do that, it'll make the post function a bit
cleaner too.

> 
> > +
> > +	/* Setup function to read request from client */
> > +	curl_easy_setopt(slot->curl, CURLOPT_READFUNCTION, proxy_in);
> > +	curl_easy_setopt(slot->curl, CURLOPT_READDATA, p);
> > +
> > +	/* Setup function to write server response to client */
> > +	curl_easy_setopt(slot->curl, CURLOPT_WRITEFUNCTION, proxy_out);
> > +	curl_easy_setopt(slot->curl, CURLOPT_WRITEDATA, p);
> > +
> > +	err = run_slot(slot, NULL);
> > +
> > +	if (err != HTTP_OK)
> > +		err = -1;
> 
> This seems to mean that we cannot have two requests in flight at the
> same time even while there is no response (from the fact that we have a
> HTTP status code after returning from run_slot()).
> 
> I thought that git fetch over HTTP uses the two-requests-in-flight
> optimization that it also does over other protocols like SSH, but I see
> that that code path (fetch_git() in remote-curl.c) also uses run_slot()
> indirectly, so maybe my assumption is wrong. Anyway, this is outside the
> scope of this patch.
> 
> > +
> > +	curl_slist_free_all(headers);
> > +	return err;
> > +}
> > +
> > +static int connect_half_duplex(const char *service_name)
> > +{
> > +	struct discovery *discover;
> > +	struct proxy_state p;
> > +
> > +	/*
> > +	 * Run the info/refs request and see if the server supports protocol
> > +	 * v2.  If and only if the server supports v2 can we successfully
> > +	 * establish a half-duplex connection, otherwise we need to tell the
> > +	 * client to fallback to using other transport helper functions to
> > +	 * complete their request.
> > +	 */
> > +	discover = discover_refs(service_name, 0);
> > +	if (discover->version != protocol_v2) {
> > +		printf("fallback\n");
> > +		fflush(stdout);
> > +		return -1;
> > +	} else {
> > +		/* Half-Duplex Connection established */
> > +		printf("\n");
> > +		fflush(stdout);
> > +	}
> > +
> > +	proxy_state_init(&p, service_name);
> > +
> > +	/*
> > +	 * Dump the capability listing that we got from the server earlier
> > +	 * during the info/refs request.
> > +	 */
> > +	write_or_die(p.out, discover->buf, discover->len);
> > +
> > +	/* Peek the next packet line.  Until we see EOF keep sending POSTs */
> > +	while (packet_reader_peek(&p.reader) != PACKET_READ_EOF) {
> > +		if (proxy_post(&p)) {
> > +			/* We would have an err here */
> 
> Probably better to comment "Error message already printed by
> proxy_post".

-- 
Brandon Williams


* Re: [PATCH 12/26] ls-refs: introduce ls-refs server command
  2018-01-09 20:50   ` Jonathan Tan
@ 2018-01-16 19:23     ` Brandon Williams
  0 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-01-16 19:23 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, sbeller, gitster, peff, philipoakley, stolee, jrnieder

On 01/09, Jonathan Tan wrote:
> On Tue,  2 Jan 2018 16:18:14 -0800
> Brandon Williams <bmwill@google.com> wrote:
> 
> > +  symrefs: In addition to the object pointed by it, show the underlying
> > +	   ref pointed by it when showing a symbolic ref.
> > +  peel: Show peeled tags.
> > +  ref-pattern <pattern>: When specified, only references matching the
> > +			 given patterns are displayed.
> 
> I notice "symrefs" being tested in patch 13 and "ref-pattern" being
> tested in patch 16. Is it possible to make a test for "peel" as well?
> (Or is it being tested somewhere I didn't notice?)

Really good suggestion.  I'll introduce unit tests for both the
git-serve cmdline as well as for more simple server commands (ls-refs).

-- 
Brandon Williams


* [PATCH v2 00/27] protocol version 2
  2018-01-03  0:18 [PATCH 00/26] protocol version 2 Brandon Williams
                   ` (26 preceding siblings ...)
  2018-01-09 17:55 ` [PATCH 00/26] protocol version 2 Jonathan Tan
@ 2018-01-25 23:58 ` Brandon Williams
  2018-01-25 23:58   ` [PATCH v2 01/27] pkt-line: introduce packet_read_with_status Brandon Williams
                     ` (29 more replies)
  27 siblings, 30 replies; 362+ messages in thread
From: Brandon Williams @ 2018-01-25 23:58 UTC (permalink / raw)
  To: git
  Cc: sbeller, gitster, peff, philipoakley, stolee, jrnieder, Brandon Williams

Changes in v2:
 * Added documentation for fetch
 * Changed #defines for state variables to be enums
 * A couple of code changes to pkt-line functions and documentation
 * Added unit tests for the git-serve binary as well as for ls-refs

Areas for improvement
 * Push isn't implemented, right now this is ok because if v2 is requested the
   server can just default to v0.  Before this can be merged we may want to
   change how the client requests a new protocol, and not allow for sending
   "version=2" when pushing even though the user has it configured.  Or maybe
   it's fine to just have an older client who doesn't understand how to push
   (and request v2) to die if the server tries to speak v2 at it.

   Fixing this essentially would just require piping through a bit more
   information to the function which ultimately runs connect (for both builtins
   and remote-curl)

 * I want to make sure that the docs are well written before this gets merged
   so I'm hoping that someone can do a thorough review on the docs themselves to
   make sure they are clear.

 * Right now there is a capability 'stateless-rpc' which essentially makes sure
   that a server command completes after a single round (this is to make sure
   http works cleanly).  After talking with some folks it may make more sense
   to just have v2 be stateless in nature so that all commands terminate after
   a single round trip.  This makes things a bit easier if a server wants to
   have ssh just be a proxy for http.

   One potential thing would be to flip this so that by default the protocol is
   stateless and if a server/command has a stateful mode that can be
   implemented as a capability at a later point.  Thoughts?

 * Shallow repositories and shallow clones aren't supported yet.  I'm working
   on it and it can be either added to v2 by default if people think it needs
   to be in there from the start, or we can add it as a capability at a later
   point.

Series can also be found on github: https://github.com/bmwill/git/tree/protocol-v2

Brandon Williams (27):
  pkt-line: introduce packet_read_with_status
  pkt-line: introduce struct packet_reader
  pkt-line: add delim packet support
  upload-pack: convert to a builtin
  upload-pack: factor out processing lines
  transport: use get_refs_via_connect to get refs
  connect: convert get_remote_heads to use struct packet_reader
  connect: discover protocol version outside of get_remote_heads
  transport: store protocol version
  protocol: introduce enum protocol_version value protocol_v2
  test-pkt-line: introduce a packet-line test helper
  serve: introduce git-serve
  ls-refs: introduce ls-refs server command
  connect: request remote refs using v2
  transport: convert get_refs_list to take a list of ref patterns
  transport: convert transport_get_remote_refs to take a list of ref
    patterns
  ls-remote: pass ref patterns when requesting a remote's refs
  fetch: pass ref patterns when fetching
  push: pass ref patterns when pushing
  upload-pack: introduce fetch server command
  fetch-pack: perform a fetch using v2
  transport-helper: remove name parameter
  transport-helper: refactor process_connect_service
  transport-helper: introduce stateless-connect
  pkt-line: add packet_buf_write_len function
  remote-curl: create copy of the service name
  remote-curl: implement stateless-connect command

 .gitignore                              |   1 +
 Documentation/technical/protocol-v2.txt | 270 +++++++++++++++++
 Makefile                                |   7 +-
 builtin.h                               |   2 +
 builtin/clone.c                         |   2 +-
 builtin/fetch-pack.c                    |  21 +-
 builtin/fetch.c                         |  14 +-
 builtin/ls-remote.c                     |   7 +-
 builtin/receive-pack.c                  |   6 +
 builtin/remote.c                        |   2 +-
 builtin/send-pack.c                     |  20 +-
 builtin/serve.c                         |  30 ++
 builtin/upload-pack.c                   |  74 +++++
 connect.c                               | 295 ++++++++++++++-----
 connect.h                               |   3 +
 fetch-pack.c                            | 277 +++++++++++++++++-
 fetch-pack.h                            |   4 +-
 git.c                                   |   2 +
 ls-refs.c                               |  96 ++++++
 ls-refs.h                               |   9 +
 pkt-line.c                              | 149 +++++++++-
 pkt-line.h                              |  77 +++++
 protocol.c                              |   2 +
 protocol.h                              |   1 +
 remote-curl.c                           | 209 ++++++++++++-
 remote.h                                |   9 +-
 serve.c                                 | 253 ++++++++++++++++
 serve.h                                 |  15 +
 t/helper/test-pkt-line.c                |  62 ++++
 t/t5701-git-serve.sh                    | 172 +++++++++++
 t/t5702-protocol-v2.sh                  | 117 ++++++++
 transport-helper.c                      |  84 +++---
 transport-internal.h                    |   4 +-
 transport.c                             | 119 ++++++--
 transport.h                             |   9 +-
 upload-pack.c                           | 501 ++++++++++++++++++++++++--------
 upload-pack.h                           |  18 ++
 37 files changed, 2646 insertions(+), 297 deletions(-)
 create mode 100644 Documentation/technical/protocol-v2.txt
 create mode 100644 builtin/serve.c
 create mode 100644 builtin/upload-pack.c
 create mode 100644 ls-refs.c
 create mode 100644 ls-refs.h
 create mode 100644 serve.c
 create mode 100644 serve.h
 create mode 100644 t/helper/test-pkt-line.c
 create mode 100755 t/t5701-git-serve.sh
 create mode 100755 t/t5702-protocol-v2.sh
 create mode 100644 upload-pack.h

-- 
2.16.0.rc1.238.g530d649a79-goog



* [PATCH v2 01/27] pkt-line: introduce packet_read_with_status
  2018-01-25 23:58 ` [PATCH v2 00/27] " Brandon Williams
@ 2018-01-25 23:58   ` Brandon Williams
  2018-01-25 23:58   ` [PATCH v2 02/27] pkt-line: introduce struct packet_reader Brandon Williams
                     ` (28 subsequent siblings)
  29 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-01-25 23:58 UTC (permalink / raw)
  To: git
  Cc: sbeller, gitster, peff, philipoakley, stolee, jrnieder, Brandon Williams

The current pkt-line API encodes the status of a pkt-line read in the
length of the read content.  An error is indicated with '-1', a flush
with '0' (which can be confusing since a return value of '0' can also
indicate an empty pkt-line), and a positive integer for the length of
the read content otherwise.  This doesn't leave much room for allowing
the addition of additional special packets in the future.

To solve this, introduce 'packet_read_with_status()' which reads a packet
and returns the status of the read encoded as an 'enum packet_read_status'
type.  This allows for easily distinguishing between special and normal
packets as well as errors.  It also enables easily adding a new special
packet in the future.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 pkt-line.c | 57 +++++++++++++++++++++++++++++++++++++++++++--------------
 pkt-line.h | 15 +++++++++++++++
 2 files changed, 58 insertions(+), 14 deletions(-)

diff --git a/pkt-line.c b/pkt-line.c
index 2827ca772..af0d2430f 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -280,28 +280,33 @@ static int packet_length(const char *linelen)
 	return (val < 0) ? val : (val << 8) | hex2chr(linelen + 2);
 }
 
-int packet_read(int fd, char **src_buf, size_t *src_len,
-		char *buffer, unsigned size, int options)
+enum packet_read_status packet_read_with_status(int fd, char **src_buffer, size_t *src_len,
+						char *buffer, unsigned size, int *pktlen,
+						int options)
 {
-	int len, ret;
+	int len;
 	char linelen[4];
 
-	ret = get_packet_data(fd, src_buf, src_len, linelen, 4, options);
-	if (ret < 0)
-		return ret;
+	if (get_packet_data(fd, src_buffer, src_len, linelen, 4, options) < 0)
+		return PACKET_READ_EOF;
+
 	len = packet_length(linelen);
-	if (len < 0)
+
+	if (len < 0) {
 		die("protocol error: bad line length character: %.4s", linelen);
-	if (!len) {
+	} else if (!len) {
 		packet_trace("0000", 4, 0);
-		return 0;
+		return PACKET_READ_FLUSH;
+	} else if (len < 4) {
+		die("protocol error: bad line length %d", len);
 	}
+
 	len -= 4;
-	if (len >= size)
+	if ((unsigned)len >= size)
 		die("protocol error: bad line length %d", len);
-	ret = get_packet_data(fd, src_buf, src_len, buffer, len, options);
-	if (ret < 0)
-		return ret;
+
+	if (get_packet_data(fd, src_buffer, src_len, buffer, len, options) < 0)
+		return PACKET_READ_EOF;
 
 	if ((options & PACKET_READ_CHOMP_NEWLINE) &&
 	    len && buffer[len-1] == '\n')
@@ -309,7 +314,31 @@ int packet_read(int fd, char **src_buf, size_t *src_len,
 
 	buffer[len] = 0;
 	packet_trace(buffer, len, 0);
-	return len;
+	*pktlen = len;
+	return PACKET_READ_NORMAL;
+}
+
+int packet_read(int fd, char **src_buffer, size_t *src_len,
+		char *buffer, unsigned size, int options)
+{
+	enum packet_read_status status;
+	int pktlen;
+
+	status = packet_read_with_status(fd, src_buffer, src_len,
+					 buffer, size, &pktlen,
+					 options);
+	switch (status) {
+	case PACKET_READ_EOF:
+		pktlen = -1;
+		break;
+	case PACKET_READ_NORMAL:
+		break;
+	case PACKET_READ_FLUSH:
+		pktlen = 0;
+		break;
+	}
+
+	return pktlen;
 }
 
 static char *packet_read_line_generic(int fd,
diff --git a/pkt-line.h b/pkt-line.h
index 3dad583e2..06c468927 100644
--- a/pkt-line.h
+++ b/pkt-line.h
@@ -65,6 +65,21 @@ int write_packetized_from_buf(const char *src_in, size_t len, int fd_out);
 int packet_read(int fd, char **src_buffer, size_t *src_len, char
 		*buffer, unsigned size, int options);
 
+/*
+ * Read a packetized line into a buffer like the 'packet_read()' function but
+ * returns an 'enum packet_read_status' which indicates the status of the read.
+ * The number of bytes read will be assigned to *pktlen if the status of the
+ * read was 'PACKET_READ_NORMAL'.
+ */
+enum packet_read_status {
+	PACKET_READ_EOF = -1,
+	PACKET_READ_NORMAL,
+	PACKET_READ_FLUSH,
+};
+enum packet_read_status packet_read_with_status(int fd, char **src_buffer, size_t *src_len,
+						char *buffer, unsigned size, int *pktlen,
+						int options);
+
 /*
  * Convenience wrapper for packet_read that is not gentle, and sets the
  * CHOMP_NEWLINE option. The return value is NULL for a flush packet,
-- 
2.16.0.rc1.238.g530d649a79-goog
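
To illustrate the idea in the patch above, here is a minimal sketch that
reads pkt-lines from an in-memory buffer and reports a status separately
from the payload length.  It is not the code in pkt-line.c: all names
prefixed with 'sketch_' are hypothetical, and SKETCH_BAD stands in for the
places where the real code calls die().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical status type, modelled on 'enum packet_read_status'. */
enum sketch_status {
	SKETCH_EOF = -1,
	SKETCH_NORMAL,
	SKETCH_FLUSH,
	SKETCH_BAD	/* assumption: the real code die()s instead */
};

/* Convert one hex digit, or return -1 if it is not a hex digit. */
static int sketch_hexval(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/*
 * Read one pkt-line from the buffer at *src (with *src_len bytes left)
 * into 'out'.  The payload length is stored in *pktlen only when the
 * status is SKETCH_NORMAL.
 */
static enum sketch_status sketch_read(const char **src, size_t *src_len,
				      char *out, size_t out_size, int *pktlen)
{
	int len = 0, i;

	assert(src && *src && src_len && out && pktlen);

	if (*src_len < 4)
		return SKETCH_EOF;
	for (i = 0; i < 4; i++) {
		int v = sketch_hexval((*src)[i]);
		if (v < 0)
			return SKETCH_BAD;
		len = (len << 4) | v;
	}
	*src += 4;
	*src_len -= 4;

	if (len == 0)
		return SKETCH_FLUSH;	/* "0000" */
	if (len < 4)
		return SKETCH_BAD;	/* lengths 1..3 are not valid here */

	len -= 4;			/* length header counts itself */
	if ((size_t)len >= out_size)
		return SKETCH_BAD;
	if (*src_len < (size_t)len)
		return SKETCH_EOF;

	memcpy(out, *src, (size_t)len);
	out[len] = '\0';
	*src += len;
	*src_len -= (size_t)len;
	*pktlen = len;
	return SKETCH_NORMAL;
}
```

With a status returned separately, an empty pkt-line ("0004") reports
SKETCH_NORMAL with a length of 0, while a flush packet ("0000") reports
SKETCH_FLUSH, so the two are no longer confused.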



* [PATCH v2 02/27] pkt-line: introduce struct packet_reader
  2018-01-25 23:58 ` [PATCH v2 00/27] " Brandon Williams
  2018-01-25 23:58   ` [PATCH v2 01/27] pkt-line: introduce packet_read_with_status Brandon Williams
@ 2018-01-25 23:58   ` Brandon Williams
  2018-01-25 23:58   ` [PATCH v2 03/27] pkt-line: add delim packet support Brandon Williams
                     ` (27 subsequent siblings)
  29 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-01-25 23:58 UTC (permalink / raw)
  To: git
  Cc: sbeller, gitster, peff, philipoakley, stolee, jrnieder, Brandon Williams

Sometimes it is advantageous to be able to peek the next packet line
without consuming it (e.g. to be able to determine the protocol version
a server is speaking).  In order to do that, introduce 'struct
packet_reader', which is an abstraction around the normal packet reading
logic.  This lets a caller peek a single line at a time using
'packet_reader_peek()' and consume a line by calling
'packet_reader_read()'.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 pkt-line.c | 59 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 pkt-line.h | 58 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 117 insertions(+)

diff --git a/pkt-line.c b/pkt-line.c
index af0d2430f..4fc9ad4b0 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -406,3 +406,62 @@ ssize_t read_packetized_to_strbuf(int fd_in, struct strbuf *sb_out)
 	}
 	return sb_out->len - orig_len;
 }
+
+/* Packet Reader Functions */
+void packet_reader_init(struct packet_reader *reader, int fd,
+			char *src_buffer, size_t src_len,
+			int options)
+{
+	memset(reader, 0, sizeof(*reader));
+
+	reader->fd = fd;
+	reader->src_buffer = src_buffer;
+	reader->src_len = src_len;
+	reader->buffer = packet_buffer;
+	reader->buffer_size = sizeof(packet_buffer);
+	reader->options = options;
+}
+
+enum packet_read_status packet_reader_read(struct packet_reader *reader)
+{
+	if (reader->line_peeked) {
+		reader->line_peeked = 0;
+		return reader->status;
+	}
+
+	reader->status = packet_read_with_status(reader->fd,
+						 &reader->src_buffer,
+						 &reader->src_len,
+						 reader->buffer,
+						 reader->buffer_size,
+						 &reader->pktlen,
+						 reader->options);
+
+	switch (reader->status) {
+	case PACKET_READ_EOF:
+		reader->pktlen = -1;
+		reader->line = NULL;
+		break;
+	case PACKET_READ_NORMAL:
+		reader->line = reader->buffer;
+		break;
+	case PACKET_READ_FLUSH:
+		reader->pktlen = 0;
+		reader->line = NULL;
+		break;
+	}
+
+	return reader->status;
+}
+
+enum packet_read_status packet_reader_peek(struct packet_reader *reader)
+{
+	/* Only allow peeking a single line */
+	if (reader->line_peeked)
+		return reader->status;
+
+	/* Peek a line by reading it and setting peeked flag */
+	packet_reader_read(reader);
+	reader->line_peeked = 1;
+	return reader->status;
+}
diff --git a/pkt-line.h b/pkt-line.h
index 06c468927..7d9f0e537 100644
--- a/pkt-line.h
+++ b/pkt-line.h
@@ -111,6 +111,64 @@ char *packet_read_line_buf(char **src_buf, size_t *src_len, int *size);
  */
 ssize_t read_packetized_to_strbuf(int fd_in, struct strbuf *sb_out);
 
+struct packet_reader {
+	/* source file descriptor */
+	int fd;
+
+	/* source buffer and its size */
+	char *src_buffer;
+	size_t src_len;
+
+	/* buffer that pkt-lines are read into and its size */
+	char *buffer;
+	unsigned buffer_size;
+
+	/* options to be used during reads */
+	int options;
+
+	/* status of the last read */
+	enum packet_read_status status;
+
+	/* length of data read during the last read */
+	int pktlen;
+
+	/* the last line read */
+	const char *line;
+
+	/* indicates if a line has been peeked */
+	int line_peeked;
+};
+
+/*
+ * Initialize a 'struct packet_reader' object which is an
+ * abstraction around the 'packet_read_with_status()' function.
+ */
+extern void packet_reader_init(struct packet_reader *reader, int fd,
+			       char *src_buffer, size_t src_len,
+			       int options);
+
+/*
+ * Perform a packet read and return the status of the read.
+ * The values of 'pktlen' and 'line' are updated based on the status of the
+ * read as follows:
+ *
+ * PACKET_READ_EOF: 'pktlen' is set to '-1' and 'line' is set to NULL
+ * PACKET_READ_NORMAL: 'pktlen' is set to the number of bytes read
+ *		       'line' is set to point at the read line
+ * PACKET_READ_FLUSH: 'pktlen' is set to '0' and 'line' is set to NULL
+ */
+extern enum packet_read_status packet_reader_read(struct packet_reader *reader);
+
+/*
+ * Peek the next packet line without consuming it and return the status.
+ * The next call to 'packet_reader_read()' will perform a read of the same line
+ * that was peeked, consuming the line.
+ *
+ * Peeking multiple times without calling 'packet_reader_read()' will return
+ * the same result.
+ */
+extern enum packet_read_status packet_reader_peek(struct packet_reader *reader);
+
 #define DEFAULT_PACKET_MAX 1000
 #define LARGE_PACKET_MAX 65520
 #define LARGE_PACKET_DATA_MAX (LARGE_PACKET_MAX - 4)
-- 
2.16.0.rc1.238.g530d649a79-goog
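
The peek/read interplay in the patch above is a one-slot lookahead.  A
minimal sketch of that pattern follows; it reads from an array of strings
rather than a file descriptor, and every 'sketch_' name is hypothetical
rather than part of the pkt-line API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical one-slot lookahead reader over a NULL-terminated array. */
struct sketch_reader {
	const char **lines;	/* input, terminated by a NULL entry */
	size_t next;		/* index of the next unread line */
	const char *line;	/* last line read or peeked, NULL at end */
	int line_peeked;	/* set if 'line' was peeked but not consumed */
};

static void sketch_reader_init(struct sketch_reader *r, const char **lines)
{
	assert(r && lines);
	r->lines = lines;
	r->next = 0;
	r->line = NULL;
	r->line_peeked = 0;
}

/* Consume a line: hand out the peeked one first, otherwise advance. */
static const char *sketch_reader_read(struct sketch_reader *r)
{
	if (r->line_peeked) {
		r->line_peeked = 0;
		return r->line;
	}

	r->line = r->lines[r->next];
	if (r->line)
		r->next++;
	return r->line;
}

/* Look at the next line without consuming it; repeated peeks agree. */
static const char *sketch_reader_peek(struct sketch_reader *r)
{
	if (r->line_peeked)
		return r->line;

	/* Peek by reading and then marking the result as unconsumed */
	sketch_reader_read(r);
	r->line_peeked = 1;
	return r->line;
}
```

Because only one line is buffered, peeking twice returns the same line, and
the next read returns that line again before moving on.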



* [PATCH v2 03/27] pkt-line: add delim packet support
  2018-01-25 23:58 ` [PATCH v2 00/27] " Brandon Williams
  2018-01-25 23:58   ` [PATCH v2 01/27] pkt-line: introduce packet_read_with_status Brandon Williams
  2018-01-25 23:58   ` [PATCH v2 02/27] pkt-line: introduce struct packet_reader Brandon Williams
@ 2018-01-25 23:58   ` Brandon Williams
  2018-01-25 23:58   ` [PATCH v2 04/27] upload-pack: convert to a builtin Brandon Williams
                     ` (26 subsequent siblings)
  29 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-01-25 23:58 UTC (permalink / raw)
  To: git
  Cc: sbeller, gitster, peff, philipoakley, stolee, jrnieder, Brandon Williams

One of the design goals of protocol-v2 is to improve the semantics of
flush packets.  Currently in protocol-v1, flush packets are used both to
indicate a break in a list of packet lines as well as an indication that
one side has finished speaking.  This makes it particularly difficult
to implement proxies as a proxy would need to completely understand git
protocol instead of simply looking for a flush packet.

To do this, introduce the special delimiter packet '0001'.  A delim
packet can then be used as a delimiter between lists of packet lines
while flush packets can be reserved to indicate the end of a response.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 pkt-line.c | 17 +++++++++++++++++
 pkt-line.h |  3 +++
 2 files changed, 20 insertions(+)

diff --git a/pkt-line.c b/pkt-line.c
index 4fc9ad4b0..726e109ca 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -91,6 +91,12 @@ void packet_flush(int fd)
 	write_or_die(fd, "0000", 4);
 }
 
+void packet_delim(int fd)
+{
+	packet_trace("0001", 4, 1);
+	write_or_die(fd, "0001", 4);
+}
+
 int packet_flush_gently(int fd)
 {
 	packet_trace("0000", 4, 1);
@@ -105,6 +111,12 @@ void packet_buf_flush(struct strbuf *buf)
 	strbuf_add(buf, "0000", 4);
 }
 
+void packet_buf_delim(struct strbuf *buf)
+{
+	packet_trace("0001", 4, 1);
+	strbuf_add(buf, "0001", 4);
+}
+
 static void set_packet_header(char *buf, const int size)
 {
 	static char hexchar[] = "0123456789abcdef";
@@ -297,6 +309,9 @@ enum packet_read_status packet_read_with_status(int fd, char **src_buffer, size_
 	} else if (!len) {
 		packet_trace("0000", 4, 0);
 		return PACKET_READ_FLUSH;
+	} else if (len == 1) {
+		packet_trace("0001", 4, 0);
+		return PACKET_READ_DELIM;
 	} else if (len < 4) {
 		die("protocol error: bad line length %d", len);
 	}
@@ -333,6 +348,7 @@ int packet_read(int fd, char **src_buffer, size_t *src_len,
 		break;
 	case PACKET_READ_NORMAL:
 		break;
+	case PACKET_READ_DELIM:
 	case PACKET_READ_FLUSH:
 		pktlen = 0;
 		break;
@@ -445,6 +461,7 @@ enum packet_read_status packet_reader_read(struct packet_reader *reader)
 	case PACKET_READ_NORMAL:
 		reader->line = reader->buffer;
 		break;
+	case PACKET_READ_DELIM:
 	case PACKET_READ_FLUSH:
 		reader->pktlen = 0;
 		reader->line = NULL;
diff --git a/pkt-line.h b/pkt-line.h
index 7d9f0e537..16fe8bdbf 100644
--- a/pkt-line.h
+++ b/pkt-line.h
@@ -20,8 +20,10 @@
  * side can't, we stay with pure read/write interfaces.
  */
 void packet_flush(int fd);
+void packet_delim(int fd);
 void packet_write_fmt(int fd, const char *fmt, ...) __attribute__((format (printf, 2, 3)));
 void packet_buf_flush(struct strbuf *buf);
+void packet_buf_delim(struct strbuf *buf);
 void packet_write(int fd_out, const char *buf, size_t size);
 void packet_buf_write(struct strbuf *buf, const char *fmt, ...) __attribute__((format (printf, 2, 3)));
 int packet_flush_gently(int fd);
@@ -75,6 +77,7 @@ enum packet_read_status {
 	PACKET_READ_EOF = -1,
 	PACKET_READ_NORMAL,
 	PACKET_READ_FLUSH,
+	PACKET_READ_DELIM,
 };
 enum packet_read_status packet_read_with_status(int fd, char **src_buffer, size_t *src_len,
 						char *buffer, unsigned size, int *pktlen,
-- 
2.16.0.rc1.238.g530d649a79-goog
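
As a rough illustration of why this helps proxies, the sketch below finds
the end of one message by scanning pkt-line length headers only, treating
'0001' as an internal separator and '0000' as the end.  It does not
interpret any payload.  The 'sketch_' functions are hypothetical and are
not part of pkt-line.c.

```c
#include <assert.h>
#include <stddef.h>

/* Parse a 4-digit hex length header; return -1 if it is malformed. */
static int sketch_hex4(const char *p)
{
	int i, len = 0;

	for (i = 0; i < 4; i++) {
		char c = p[i];
		int v;

		if (c >= '0' && c <= '9')
			v = c - '0';
		else if (c >= 'a' && c <= 'f')
			v = c - 'a' + 10;
		else if (c >= 'A' && c <= 'F')
			v = c - 'A' + 10;
		else
			return -1;
		len = (len << 4) | v;
	}
	return len;
}

/*
 * Return the number of bytes in 'buf' up to and including the first
 * flush packet, or 0 if the message is incomplete or malformed.  The
 * number of delim packets seen before the flush is stored in *delims.
 */
static size_t sketch_message_len(const char *buf, size_t len, int *delims)
{
	size_t pos = 0;

	assert(buf && delims);
	*delims = 0;

	while (len - pos >= 4) {
		int pktlen = sketch_hex4(buf + pos);

		if (pktlen < 0)
			return 0;
		if (pktlen == 0)		/* flush: end of message */
			return pos + 4;
		if (pktlen == 1) {		/* delim: section break */
			(*delims)++;
			pos += 4;
			continue;
		}
		if (pktlen < 4 || (size_t)pktlen > len - pos)
			return 0;
		pos += (size_t)pktlen;		/* skip header and payload */
	}
	return 0;
}
```

A caller that forwards bytes can stop at the returned offset without
knowing which command the payload belongs to.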



* [PATCH v2 04/27] upload-pack: convert to a builtin
  2018-01-25 23:58 ` [PATCH v2 00/27] " Brandon Williams
                     ` (2 preceding siblings ...)
  2018-01-25 23:58   ` [PATCH v2 03/27] pkt-line: add delim packet support Brandon Williams
@ 2018-01-25 23:58   ` Brandon Williams
  2018-01-25 23:58   ` [PATCH v2 05/27] upload-pack: factor out processing lines Brandon Williams
                     ` (25 subsequent siblings)
  29 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-01-25 23:58 UTC (permalink / raw)
  To: git
  Cc: sbeller, gitster, peff, philipoakley, stolee, jrnieder, Brandon Williams

In order to allow for code sharing with the server side of fetch in
protocol-v2, convert upload-pack to be a builtin.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 Makefile              |   3 +-
 builtin.h             |   1 +
 builtin/upload-pack.c |  67 +++++++++++++++++++++++++++++++
 git.c                 |   1 +
 upload-pack.c         | 107 ++++++++++++--------------------------------------
 upload-pack.h         |  13 ++++++
 6 files changed, 109 insertions(+), 83 deletions(-)
 create mode 100644 builtin/upload-pack.c
 create mode 100644 upload-pack.h

diff --git a/Makefile b/Makefile
index 1a9b23b67..b7ccc05fa 100644
--- a/Makefile
+++ b/Makefile
@@ -639,7 +639,6 @@ PROGRAM_OBJS += imap-send.o
 PROGRAM_OBJS += sh-i18n--envsubst.o
 PROGRAM_OBJS += shell.o
 PROGRAM_OBJS += show-index.o
-PROGRAM_OBJS += upload-pack.o
 PROGRAM_OBJS += remote-testsvn.o
 
 # Binary suffix, set to .exe for Windows builds
@@ -909,6 +908,7 @@ LIB_OBJS += tree-diff.o
 LIB_OBJS += tree.o
 LIB_OBJS += tree-walk.o
 LIB_OBJS += unpack-trees.o
+LIB_OBJS += upload-pack.o
 LIB_OBJS += url.o
 LIB_OBJS += urlmatch.o
 LIB_OBJS += usage.o
@@ -1026,6 +1026,7 @@ BUILTIN_OBJS += builtin/update-index.o
 BUILTIN_OBJS += builtin/update-ref.o
 BUILTIN_OBJS += builtin/update-server-info.o
 BUILTIN_OBJS += builtin/upload-archive.o
+BUILTIN_OBJS += builtin/upload-pack.o
 BUILTIN_OBJS += builtin/var.o
 BUILTIN_OBJS += builtin/verify-commit.o
 BUILTIN_OBJS += builtin/verify-pack.o
diff --git a/builtin.h b/builtin.h
index 42378f3aa..f332a1257 100644
--- a/builtin.h
+++ b/builtin.h
@@ -231,6 +231,7 @@ extern int cmd_update_ref(int argc, const char **argv, const char *prefix);
 extern int cmd_update_server_info(int argc, const char **argv, const char *prefix);
 extern int cmd_upload_archive(int argc, const char **argv, const char *prefix);
 extern int cmd_upload_archive_writer(int argc, const char **argv, const char *prefix);
+extern int cmd_upload_pack(int argc, const char **argv, const char *prefix);
 extern int cmd_var(int argc, const char **argv, const char *prefix);
 extern int cmd_verify_commit(int argc, const char **argv, const char *prefix);
 extern int cmd_verify_tag(int argc, const char **argv, const char *prefix);
diff --git a/builtin/upload-pack.c b/builtin/upload-pack.c
new file mode 100644
index 000000000..2cb5cb35b
--- /dev/null
+++ b/builtin/upload-pack.c
@@ -0,0 +1,67 @@
+#include "cache.h"
+#include "builtin.h"
+#include "exec_cmd.h"
+#include "pkt-line.h"
+#include "parse-options.h"
+#include "protocol.h"
+#include "upload-pack.h"
+
+static const char * const upload_pack_usage[] = {
+	N_("git upload-pack [<options>] <dir>"),
+	NULL
+};
+
+int cmd_upload_pack(int argc, const char **argv, const char *prefix)
+{
+	const char *dir;
+	int strict = 0;
+	struct upload_pack_options opts = { 0 };
+	struct option options[] = {
+		OPT_BOOL(0, "stateless-rpc", &opts.stateless_rpc,
+			 N_("quit after a single request/response exchange")),
+		OPT_BOOL(0, "advertise-refs", &opts.advertise_refs,
+			 N_("exit immediately after initial ref advertisement")),
+		OPT_BOOL(0, "strict", &strict,
+			 N_("do not try <directory>/.git/ if <directory> is no Git directory")),
+		OPT_INTEGER(0, "timeout", &opts.timeout,
+			    N_("interrupt transfer after <n> seconds of inactivity")),
+		OPT_END()
+	};
+
+	packet_trace_identity("upload-pack");
+	check_replace_refs = 0;
+
+	argc = parse_options(argc, argv, NULL, options, upload_pack_usage, 0);
+
+	if (argc != 1)
+		usage_with_options(upload_pack_usage, options);
+
+	if (opts.timeout)
+		opts.daemon_mode = 1;
+
+	setup_path();
+
+	dir = argv[0];
+
+	if (!enter_repo(dir, strict))
+		die("'%s' does not appear to be a git repository", dir);
+
+	switch (determine_protocol_version_server()) {
+	case protocol_v1:
+		/*
+		 * v1 is just the original protocol with a version string,
+		 * so just fall through after writing the version string.
+		 */
+		if (opts.advertise_refs || !opts.stateless_rpc)
+			packet_write_fmt(1, "version 1\n");
+
+		/* fallthrough */
+	case protocol_v0:
+		upload_pack(&opts);
+		break;
+	case protocol_unknown_version:
+		BUG("unknown protocol version");
+	}
+
+	return 0;
+}
diff --git a/git.c b/git.c
index c870b9719..f71073dc8 100644
--- a/git.c
+++ b/git.c
@@ -478,6 +478,7 @@ static struct cmd_struct commands[] = {
 	{ "update-server-info", cmd_update_server_info, RUN_SETUP },
 	{ "upload-archive", cmd_upload_archive },
 	{ "upload-archive--writer", cmd_upload_archive_writer },
+	{ "upload-pack", cmd_upload_pack },
 	{ "var", cmd_var, RUN_SETUP_GENTLY },
 	{ "verify-commit", cmd_verify_commit, RUN_SETUP },
 	{ "verify-pack", cmd_verify_pack },
diff --git a/upload-pack.c b/upload-pack.c
index d5de18127..2ad73a98b 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -6,7 +6,6 @@
 #include "tag.h"
 #include "object.h"
 #include "commit.h"
-#include "exec_cmd.h"
 #include "diff.h"
 #include "revision.h"
 #include "list-objects.h"
@@ -15,15 +14,10 @@
 #include "sigchain.h"
 #include "version.h"
 #include "string-list.h"
-#include "parse-options.h"
 #include "argv-array.h"
 #include "prio-queue.h"
 #include "protocol.h"
-
-static const char * const upload_pack_usage[] = {
-	N_("git upload-pack [<options>] <dir>"),
-	NULL
-};
+#include "upload-pack.h"
 
 /* Remember to update object flag allocation in object.h */
 #define THEY_HAVE	(1u << 11)
@@ -61,7 +55,6 @@ static int keepalive = 5;
  * otherwise maximum packet size (up to 65520 bytes).
  */
 static int use_sideband;
-static int advertise_refs;
 static int stateless_rpc;
 static const char *pack_objects_hook;
 
@@ -977,33 +970,6 @@ static int find_symref(const char *refname, const struct object_id *oid,
 	return 0;
 }
 
-static void upload_pack(void)
-{
-	struct string_list symref = STRING_LIST_INIT_DUP;
-
-	head_ref_namespaced(find_symref, &symref);
-
-	if (advertise_refs || !stateless_rpc) {
-		reset_timeout();
-		head_ref_namespaced(send_ref, &symref);
-		for_each_namespaced_ref(send_ref, &symref);
-		advertise_shallow_grafts(1);
-		packet_flush(1);
-	} else {
-		head_ref_namespaced(check_ref, NULL);
-		for_each_namespaced_ref(check_ref, NULL);
-	}
-	string_list_clear(&symref, 1);
-	if (advertise_refs)
-		return;
-
-	receive_needs();
-	if (want_obj.nr) {
-		get_common_commits();
-		create_pack_file();
-	}
-}
-
 static int upload_pack_config(const char *var, const char *value, void *unused)
 {
 	if (!strcmp("uploadpack.allowtipsha1inwant", var)) {
@@ -1032,58 +998,35 @@ static int upload_pack_config(const char *var, const char *value, void *unused)
 	return parse_hide_refs_config(var, value, "uploadpack");
 }
 
-int cmd_main(int argc, const char **argv)
+void upload_pack(struct upload_pack_options *options)
 {
-	const char *dir;
-	int strict = 0;
-	struct option options[] = {
-		OPT_BOOL(0, "stateless-rpc", &stateless_rpc,
-			 N_("quit after a single request/response exchange")),
-		OPT_BOOL(0, "advertise-refs", &advertise_refs,
-			 N_("exit immediately after initial ref advertisement")),
-		OPT_BOOL(0, "strict", &strict,
-			 N_("do not try <directory>/.git/ if <directory> is no Git directory")),
-		OPT_INTEGER(0, "timeout", &timeout,
-			    N_("interrupt transfer after <n> seconds of inactivity")),
-		OPT_END()
-	};
-
-	packet_trace_identity("upload-pack");
-	check_replace_refs = 0;
-
-	argc = parse_options(argc, argv, NULL, options, upload_pack_usage, 0);
-
-	if (argc != 1)
-		usage_with_options(upload_pack_usage, options);
-
-	if (timeout)
-		daemon_mode = 1;
-
-	setup_path();
-
-	dir = argv[0];
+	struct string_list symref = STRING_LIST_INIT_DUP;
 
-	if (!enter_repo(dir, strict))
-		die("'%s' does not appear to be a git repository", dir);
+	stateless_rpc = options->stateless_rpc;
+	timeout = options->timeout;
+	daemon_mode = options->daemon_mode;
 
 	git_config(upload_pack_config, NULL);
 
-	switch (determine_protocol_version_server()) {
-	case protocol_v1:
-		/*
-		 * v1 is just the original protocol with a version string,
-		 * so just fall through after writing the version string.
-		 */
-		if (advertise_refs || !stateless_rpc)
-			packet_write_fmt(1, "version 1\n");
-
-		/* fallthrough */
-	case protocol_v0:
-		upload_pack();
-		break;
-	case protocol_unknown_version:
-		BUG("unknown protocol version");
+	head_ref_namespaced(find_symref, &symref);
+
+	if (options->advertise_refs || !stateless_rpc) {
+		reset_timeout();
+		head_ref_namespaced(send_ref, &symref);
+		for_each_namespaced_ref(send_ref, &symref);
+		advertise_shallow_grafts(1);
+		packet_flush(1);
+	} else {
+		head_ref_namespaced(check_ref, NULL);
+		for_each_namespaced_ref(check_ref, NULL);
 	}
+	string_list_clear(&symref, 1);
+	if (options->advertise_refs)
+		return;
 
-	return 0;
+	receive_needs();
+	if (want_obj.nr) {
+		get_common_commits();
+		create_pack_file();
+	}
 }
diff --git a/upload-pack.h b/upload-pack.h
new file mode 100644
index 000000000..a71e4dc7e
--- /dev/null
+++ b/upload-pack.h
@@ -0,0 +1,13 @@
+#ifndef UPLOAD_PACK_H
+#define UPLOAD_PACK_H
+
+struct upload_pack_options {
+	int stateless_rpc;
+	int advertise_refs;
+	unsigned int timeout;
+	int daemon_mode;
+};
+
+void upload_pack(struct upload_pack_options *options);
+
+#endif /* UPLOAD_PACK_H */
-- 
2.16.0.rc1.238.g530d649a79-goog



* [PATCH v2 05/27] upload-pack: factor out processing lines
  2018-01-25 23:58 ` [PATCH v2 00/27] " Brandon Williams
                     ` (3 preceding siblings ...)
  2018-01-25 23:58   ` [PATCH v2 04/27] upload-pack: convert to a builtin Brandon Williams
@ 2018-01-25 23:58   ` Brandon Williams
  2018-01-26 20:12     ` Stefan Beller
  2018-01-25 23:58   ` [PATCH v2 06/27] transport: use get_refs_via_connect to get refs Brandon Williams
                     ` (24 subsequent siblings)
  29 siblings, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-01-25 23:58 UTC (permalink / raw)
  To: git
  Cc: sbeller, gitster, peff, philipoakley, stolee, jrnieder, Brandon Williams

Factor out the logic for processing shallow, deepen, deepen_since, and
deepen_not lines into their own functions to simplify the
'receive_needs()' function in addition to making it easier to reuse some
of this logic when implementing protocol_v2.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 upload-pack.c | 113 ++++++++++++++++++++++++++++++++++++++--------------------
 1 file changed, 74 insertions(+), 39 deletions(-)

diff --git a/upload-pack.c b/upload-pack.c
index 2ad73a98b..42d83d5b1 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -724,6 +724,75 @@ static void deepen_by_rev_list(int ac, const char **av,
 	packet_flush(1);
 }
 
+static int process_shallow(const char *line, struct object_array *shallows)
+{
+	const char *arg;
+	if (skip_prefix(line, "shallow ", &arg)) {
+		struct object_id oid;
+		struct object *object;
+		if (get_oid_hex(arg, &oid))
+			die("invalid shallow line: %s", line);
+		object = parse_object(&oid);
+		if (!object)
+			return 1;
+		if (object->type != OBJ_COMMIT)
+			die("invalid shallow object %s", oid_to_hex(&oid));
+		if (!(object->flags & CLIENT_SHALLOW)) {
+			object->flags |= CLIENT_SHALLOW;
+			add_object_array(object, NULL, shallows);
+		}
+		return 1;
+	}
+
+	return 0;
+}
+
+static int process_deepen(const char *line, int *depth)
+{
+	const char *arg;
+	if (skip_prefix(line, "deepen ", &arg)) {
+		char *end = NULL;
+		*depth = (int) strtol(arg, &end, 0);
+		if (!end || *end || *depth <= 0)
+			die("Invalid deepen: %s", line);
+		return 1;
+	}
+
+	return 0;
+}
+
+static int process_deepen_since(const char *line, timestamp_t *deepen_since, int *deepen_rev_list)
+{
+	const char *arg;
+	if (skip_prefix(line, "deepen-since ", &arg)) {
+		char *end = NULL;
+		*deepen_since = parse_timestamp(arg, &end, 0);
+		if (!end || *end || !*deepen_since ||
+		    /* revisions.c's max_age -1 is special */
+		    *deepen_since == -1)
+			die("Invalid deepen-since: %s", line);
+		*deepen_rev_list = 1;
+		return 1;
+	}
+	return 0;
+}
+
+static int process_deepen_not(const char *line, struct string_list *deepen_not, int *deepen_rev_list)
+{
+	const char *arg;
+	if (skip_prefix(line, "deepen-not ", &arg)) {
+		char *ref = NULL;
+		struct object_id oid;
+		if (expand_ref(arg, strlen(arg), &oid, &ref) != 1)
+			die("git upload-pack: ambiguous deepen-not: %s", line);
+		string_list_append(deepen_not, ref);
+		free(ref);
+		*deepen_rev_list = 1;
+		return 1;
+	}
+	return 0;
+}
+
 static void receive_needs(void)
 {
 	struct object_array shallows = OBJECT_ARRAY_INIT;
@@ -745,49 +814,15 @@ static void receive_needs(void)
 		if (!line)
 			break;
 
-		if (skip_prefix(line, "shallow ", &arg)) {
-			struct object_id oid;
-			struct object *object;
-			if (get_oid_hex(arg, &oid))
-				die("invalid shallow line: %s", line);
-			object = parse_object(&oid);
-			if (!object)
-				continue;
-			if (object->type != OBJ_COMMIT)
-				die("invalid shallow object %s", oid_to_hex(&oid));
-			if (!(object->flags & CLIENT_SHALLOW)) {
-				object->flags |= CLIENT_SHALLOW;
-				add_object_array(object, NULL, &shallows);
-			}
+		if (process_shallow(line, &shallows))
 			continue;
-		}
-		if (skip_prefix(line, "deepen ", &arg)) {
-			char *end = NULL;
-			depth = strtol(arg, &end, 0);
-			if (!end || *end || depth <= 0)
-				die("Invalid deepen: %s", line);
+		if (process_deepen(line, &depth))
 			continue;
-		}
-		if (skip_prefix(line, "deepen-since ", &arg)) {
-			char *end = NULL;
-			deepen_since = parse_timestamp(arg, &end, 0);
-			if (!end || *end || !deepen_since ||
-			    /* revisions.c's max_age -1 is special */
-			    deepen_since == -1)
-				die("Invalid deepen-since: %s", line);
-			deepen_rev_list = 1;
+		if (process_deepen_since(line, &deepen_since, &deepen_rev_list))
 			continue;
-		}
-		if (skip_prefix(line, "deepen-not ", &arg)) {
-			char *ref = NULL;
-			struct object_id oid;
-			if (expand_ref(arg, strlen(arg), &oid, &ref) != 1)
-				die("git upload-pack: ambiguous deepen-not: %s", line);
-			string_list_append(&deepen_not, ref);
-			free(ref);
-			deepen_rev_list = 1;
+		if (process_deepen_not(line, &deepen_not, &deepen_rev_list))
 			continue;
-		}
+
 		if (!skip_prefix(line, "want ", &arg) ||
 		    get_oid_hex(arg, &oid_buf))
 			die("git upload-pack: protocol error, "
-- 
2.16.0.rc1.238.g530d649a79-goog
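
The helpers factored out above share one convention: return 1 if the line
was recognized and consumed, and 0 if the caller should try the next
handler.  A minimal standalone sketch of that convention is below.
'sketch_skip_prefix()' is a stand-in for git's 'skip_prefix()', and the -1
return is an assumption made so the sketch can report malformed input
instead of calling die().

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for git's skip_prefix(); hypothetical name. */
static int sketch_skip_prefix(const char *str, const char *prefix,
			      const char **out)
{
	size_t n = strlen(prefix);

	if (strncmp(str, prefix, n))
		return 0;
	*out = str + n;
	return 1;
}

/*
 * Returns 1 if 'line' was a valid "deepen <n>" line (storing n in
 * *depth), 0 if it is some other kind of line, and -1 if it is a
 * malformed deepen line (the real code die()s in that case).
 */
static int sketch_process_deepen(const char *line, int *depth)
{
	const char *arg;
	char *end = NULL;
	long val;

	assert(line && depth);

	if (!sketch_skip_prefix(line, "deepen ", &arg))
		return 0;

	val = strtol(arg, &end, 0);
	if (end == arg || *end || val <= 0 || val > INT_MAX)
		return -1;

	*depth = (int)val;
	return 1;
}
```

A caller can then chain such handlers with 'if (handler(line, ...))
continue;' as 'receive_needs()' does after this patch.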



* [PATCH v2 06/27] transport: use get_refs_via_connect to get refs
  2018-01-25 23:58 ` [PATCH v2 00/27] " Brandon Williams
                     ` (4 preceding siblings ...)
  2018-01-25 23:58   ` [PATCH v2 05/27] upload-pack: factor out processing lines Brandon Williams
@ 2018-01-25 23:58   ` Brandon Williams
  2018-01-25 23:58   ` [PATCH v2 07/27] connect: convert get_remote_heads to use struct packet_reader Brandon Williams
                     ` (23 subsequent siblings)
  29 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-01-25 23:58 UTC (permalink / raw)
  To: git
  Cc: sbeller, gitster, peff, philipoakley, stolee, jrnieder, Brandon Williams

Remove code duplication and use the existing 'get_refs_via_connect()'
function to retrieve a remote's heads in 'fetch_refs_via_pack()' and
'git_transport_push()'.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 transport.c | 18 ++++--------------
 1 file changed, 4 insertions(+), 14 deletions(-)

diff --git a/transport.c b/transport.c
index fc802260f..8e8779096 100644
--- a/transport.c
+++ b/transport.c
@@ -230,12 +230,8 @@ static int fetch_refs_via_pack(struct transport *transport,
 	args.cloning = transport->cloning;
 	args.update_shallow = data->options.update_shallow;
 
-	if (!data->got_remote_heads) {
-		connect_setup(transport, 0);
-		get_remote_heads(data->fd[0], NULL, 0, &refs_tmp, 0,
-				 NULL, &data->shallow);
-		data->got_remote_heads = 1;
-	}
+	if (!data->got_remote_heads)
+		refs_tmp = get_refs_via_connect(transport, 0);
 
 	refs = fetch_pack(&args, data->fd, data->conn,
 			  refs_tmp ? refs_tmp : transport->remote_refs,
@@ -541,14 +537,8 @@ static int git_transport_push(struct transport *transport, struct ref *remote_re
 	struct send_pack_args args;
 	int ret;
 
-	if (!data->got_remote_heads) {
-		struct ref *tmp_refs;
-		connect_setup(transport, 1);
-
-		get_remote_heads(data->fd[0], NULL, 0, &tmp_refs, REF_NORMAL,
-				 NULL, &data->shallow);
-		data->got_remote_heads = 1;
-	}
+	if (!data->got_remote_heads)
+		get_refs_via_connect(transport, 1);
 
 	memset(&args, 0, sizeof(args));
 	args.send_mirror = !!(flags & TRANSPORT_PUSH_MIRROR);
-- 
2.16.0.rc1.238.g530d649a79-goog



* [PATCH v2 07/27] connect: convert get_remote_heads to use struct packet_reader
  2018-01-25 23:58 ` [PATCH v2 00/27] " Brandon Williams
                     ` (5 preceding siblings ...)
  2018-01-25 23:58   ` [PATCH v2 06/27] transport: use get_refs_via_connect to get refs Brandon Williams
@ 2018-01-25 23:58   ` Brandon Williams
  2018-01-25 23:58   ` [PATCH v2 08/27] connect: discover protocol version outside of get_remote_heads Brandon Williams
                     ` (22 subsequent siblings)
  29 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-01-25 23:58 UTC (permalink / raw)
  To: git
  Cc: sbeller, gitster, peff, philipoakley, stolee, jrnieder, Brandon Williams

In order to allow for better control flow when protocol_v2 is introduced,
convert 'get_remote_heads()' to use 'struct packet_reader' to read
packet lines.  This enables a client to peek the first line of a
server's response (without consuming it) in order to determine the
protocol version it's speaking and then pass control to the
appropriate handler.

This is needed because the initial response from a server speaking
protocol_v0 includes the first ref, while subsequent protocol versions
respond with a version line.  We want to be able to read this first line
without consuming the first ref sent in the protocol_v0 case so that the
protocol version the server is speaking can be determined outside of
'get_remote_heads()' in a future patch.
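
As a rough sketch of the decision made on the peeked line (hypothetical
names, not the code in connect.c or protocol.c): a line starting with
"version " identifies the protocol, and anything else is assumed to be the
first ref of a protocol_v0 advertisement, which must stay unconsumed.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical version type for this sketch only. */
enum sketch_version {
	SKETCH_UNKNOWN = -1,
	SKETCH_V0 = 0,
	SKETCH_V1 = 1
};

/*
 * Classify the first line of a server response.  A caller would only
 * consume the line when the result is SKETCH_V1; for SKETCH_V0 the
 * line is a ref and is left for the ref parser.
 */
static enum sketch_version sketch_classify(const char *first_line)
{
	static const char prefix[] = "version ";
	const char *arg;

	assert(first_line);

	if (strncmp(first_line, prefix, strlen(prefix)))
		return SKETCH_V0;

	arg = first_line + strlen(prefix);
	if (!strcmp(arg, "1"))
		return SKETCH_V1;
	return SKETCH_UNKNOWN;
}
```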

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 connect.c | 174 ++++++++++++++++++++++++++++++++++----------------------------
 1 file changed, 96 insertions(+), 78 deletions(-)

diff --git a/connect.c b/connect.c
index c3a014c5b..00e90075c 100644
--- a/connect.c
+++ b/connect.c
@@ -48,6 +48,12 @@ int check_ref_type(const struct ref *ref, int flags)
 
 static void die_initial_contact(int unexpected)
 {
+	/*
+	 * A hang-up after seeing some response from the other end
+	 * means that it is unexpected, as we know the other end is
+	 * willing to talk to us.  A hang-up before seeing any
+	 * response does not necessarily mean an ACL problem, though.
+	 */
 	if (unexpected)
 		die(_("The remote end hung up upon initial contact"));
 	else
@@ -56,6 +62,41 @@ static void die_initial_contact(int unexpected)
 		      "and the repository exists."));
 }
 
+static enum protocol_version discover_version(struct packet_reader *reader)
+{
+	enum protocol_version version = protocol_unknown_version;
+
+	/*
+	 * Peek the first line of the server's response to
+	 * determine the protocol version the server is speaking.
+	 */
+	switch (packet_reader_peek(reader)) {
+	case PACKET_READ_EOF:
+		die_initial_contact(0);
+	case PACKET_READ_FLUSH:
+	case PACKET_READ_DELIM:
+		version = protocol_v0;
+		break;
+	case PACKET_READ_NORMAL:
+		version = determine_protocol_version_client(reader->line);
+		break;
+	}
+
+	/* Maybe process capabilities here, at least for v2 */
+	switch (version) {
+	case protocol_v1:
+		/* Read the peeked version line */
+		packet_reader_read(reader);
+		break;
+	case protocol_v0:
+		break;
+	case protocol_unknown_version:
+		die("unknown protocol version: '%s'\n", reader->line);
+	}
+
+	return version;
+}
+
 static void parse_one_symref_info(struct string_list *symref, const char *val, int len)
 {
 	char *sym, *target;
@@ -109,60 +150,21 @@ static void annotate_refs_with_symref_info(struct ref *ref)
 	string_list_clear(&symref, 0);
 }
 
-/*
- * Read one line of a server's ref advertisement into packet_buffer.
- */
-static int read_remote_ref(int in, char **src_buf, size_t *src_len,
-			   int *responded)
+static void process_capabilities(const char *line, int *len)
 {
-	int len = packet_read(in, src_buf, src_len,
-			      packet_buffer, sizeof(packet_buffer),
-			      PACKET_READ_GENTLE_ON_EOF |
-			      PACKET_READ_CHOMP_NEWLINE);
-	const char *arg;
-	if (len < 0)
-		die_initial_contact(*responded);
-	if (len > 4 && skip_prefix(packet_buffer, "ERR ", &arg))
-		die("remote error: %s", arg);
-
-	*responded = 1;
-
-	return len;
-}
-
-#define EXPECTING_PROTOCOL_VERSION 0
-#define EXPECTING_FIRST_REF 1
-#define EXPECTING_REF 2
-#define EXPECTING_SHALLOW 3
-
-/* Returns 1 if packet_buffer is a protocol version pkt-line, 0 otherwise. */
-static int process_protocol_version(void)
-{
-	switch (determine_protocol_version_client(packet_buffer)) {
-	case protocol_v1:
-		return 1;
-	case protocol_v0:
-		return 0;
-	default:
-		die("server is speaking an unknown protocol");
-	}
-}
-
-static void process_capabilities(int *len)
-{
-	int nul_location = strlen(packet_buffer);
+	int nul_location = strlen(line);
 	if (nul_location == *len)
 		return;
-	server_capabilities = xstrdup(packet_buffer + nul_location + 1);
+	server_capabilities = xstrdup(line + nul_location + 1);
 	*len = nul_location;
 }
 
-static int process_dummy_ref(void)
+static int process_dummy_ref(const char *line)
 {
 	struct object_id oid;
 	const char *name;
 
-	if (parse_oid_hex(packet_buffer, &oid, &name))
+	if (parse_oid_hex(line, &oid, &name))
 		return 0;
 	if (*name != ' ')
 		return 0;
@@ -171,20 +173,20 @@ static int process_dummy_ref(void)
 	return !oidcmp(&null_oid, &oid) && !strcmp(name, "capabilities^{}");
 }
 
-static void check_no_capabilities(int len)
+static void check_no_capabilities(const char *line, int len)
 {
-	if (strlen(packet_buffer) != len)
+	if (strlen(line) != len)
 		warning("Ignoring capabilities after first line '%s'",
-			packet_buffer + strlen(packet_buffer));
+			line + strlen(line));
 }
 
-static int process_ref(int len, struct ref ***list, unsigned int flags,
-		       struct oid_array *extra_have)
+static int process_ref(const char *line, int len, struct ref ***list,
+		       unsigned int flags, struct oid_array *extra_have)
 {
 	struct object_id old_oid;
 	const char *name;
 
-	if (parse_oid_hex(packet_buffer, &old_oid, &name))
+	if (parse_oid_hex(line, &old_oid, &name))
 		return 0;
 	if (*name != ' ')
 		return 0;
@@ -200,16 +202,17 @@ static int process_ref(int len, struct ref ***list, unsigned int flags,
 		**list = ref;
 		*list = &ref->next;
 	}
-	check_no_capabilities(len);
+	check_no_capabilities(line, len);
 	return 1;
 }
 
-static int process_shallow(int len, struct oid_array *shallow_points)
+static int process_shallow(const char *line, int len,
+			   struct oid_array *shallow_points)
 {
 	const char *arg;
 	struct object_id old_oid;
 
-	if (!skip_prefix(packet_buffer, "shallow ", &arg))
+	if (!skip_prefix(line, "shallow ", &arg))
 		return 0;
 
 	if (get_oid_hex(arg, &old_oid))
@@ -217,10 +220,17 @@ static int process_shallow(int len, struct oid_array *shallow_points)
 	if (!shallow_points)
 		die("repository on the other end cannot be shallow");
 	oid_array_append(shallow_points, &old_oid);
-	check_no_capabilities(len);
+	check_no_capabilities(line, len);
 	return 1;
 }
 
+enum get_remote_heads_state {
+	EXPECTING_FIRST_REF = 0,
+	EXPECTING_REF,
+	EXPECTING_SHALLOW,
+	EXPECTING_DONE,
+};
+
 /*
  * Read all the refs from the other end
  */
@@ -230,47 +240,55 @@ struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
 			      struct oid_array *shallow_points)
 {
 	struct ref **orig_list = list;
+	int len = 0;
+	enum get_remote_heads_state state = EXPECTING_FIRST_REF;
+	struct packet_reader reader;
+	const char *arg;
 
-	/*
-	 * A hang-up after seeing some response from the other end
-	 * means that it is unexpected, as we know the other end is
-	 * willing to talk to us.  A hang-up before seeing any
-	 * response does not necessarily mean an ACL problem, though.
-	 */
-	int responded = 0;
-	int len;
-	int state = EXPECTING_PROTOCOL_VERSION;
+	packet_reader_init(&reader, in, src_buf, src_len,
+			   PACKET_READ_CHOMP_NEWLINE |
+			   PACKET_READ_GENTLE_ON_EOF);
+
+	discover_version(&reader);
 
 	*list = NULL;
 
-	while ((len = read_remote_ref(in, &src_buf, &src_len, &responded))) {
+	while (state != EXPECTING_DONE) {
+		switch (packet_reader_read(&reader)) {
+		case PACKET_READ_EOF:
+			die_initial_contact(1);
+		case PACKET_READ_NORMAL:
+			len = reader.pktlen;
+			if (len > 4 && skip_prefix(reader.line, "ERR ", &arg))
+				die("remote error: %s", arg);
+			break;
+		case PACKET_READ_FLUSH:
+			state = EXPECTING_DONE;
+			break;
+		case PACKET_READ_DELIM:
+			die("invalid packet");
+		}
+
 		switch (state) {
-		case EXPECTING_PROTOCOL_VERSION:
-			if (process_protocol_version()) {
-				state = EXPECTING_FIRST_REF;
-				break;
-			}
-			state = EXPECTING_FIRST_REF;
-			/* fallthrough */
 		case EXPECTING_FIRST_REF:
-			process_capabilities(&len);
-			if (process_dummy_ref()) {
+			process_capabilities(reader.line, &len);
+			if (process_dummy_ref(reader.line)) {
 				state = EXPECTING_SHALLOW;
 				break;
 			}
 			state = EXPECTING_REF;
 			/* fallthrough */
 		case EXPECTING_REF:
-			if (process_ref(len, &list, flags, extra_have))
+			if (process_ref(reader.line, len, &list, flags, extra_have))
 				break;
 			state = EXPECTING_SHALLOW;
 			/* fallthrough */
 		case EXPECTING_SHALLOW:
-			if (process_shallow(len, shallow_points))
+			if (process_shallow(reader.line, len, shallow_points))
 				break;
-			die("protocol error: unexpected '%s'", packet_buffer);
-		default:
-			die("unexpected state %d", state);
+			die("protocol error: unexpected '%s'", reader.line);
+		case EXPECTING_DONE:
+			break;
 		}
 	}
 
-- 
2.16.0.rc1.238.g530d649a79-goog



* [PATCH v2 08/27] connect: discover protocol version outside of get_remote_heads
  2018-01-25 23:58 ` [PATCH v2 00/27] " Brandon Williams
                     ` (6 preceding siblings ...)
  2018-01-25 23:58   ` [PATCH v2 07/27] connect: convert get_remote_heads to use struct packet_reader Brandon Williams
@ 2018-01-25 23:58   ` Brandon Williams
  2018-01-31 14:40     ` Derrick Stolee
  2018-01-25 23:58   ` [PATCH v2 09/27] transport: store protocol version Brandon Williams
                     ` (21 subsequent siblings)
  29 siblings, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-01-25 23:58 UTC (permalink / raw)
  To: git
  Cc: sbeller, gitster, peff, philipoakley, stolee, jrnieder, Brandon Williams

In order to prepare for the addition of protocol_v2, push the protocol
version discovery outside of 'get_remote_heads()'.  This will allow for
keeping the logic for processing the reference advertisement for
protocol_v1 and protocol_v0 separate from the logic for protocol_v2.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 builtin/fetch-pack.c | 16 +++++++++++++++-
 builtin/send-pack.c  | 17 +++++++++++++++--
 connect.c            | 27 ++++++++++-----------------
 connect.h            |  3 +++
 remote-curl.c        | 20 ++++++++++++++++++--
 remote.h             |  5 +++--
 transport.c          | 24 +++++++++++++++++++-----
 7 files changed, 83 insertions(+), 29 deletions(-)

diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
index 366b9d13f..85d4faf76 100644
--- a/builtin/fetch-pack.c
+++ b/builtin/fetch-pack.c
@@ -4,6 +4,7 @@
 #include "remote.h"
 #include "connect.h"
 #include "sha1-array.h"
+#include "protocol.h"
 
 static const char fetch_pack_usage[] =
 "git fetch-pack [--all] [--stdin] [--quiet | -q] [--keep | -k] [--thin] "
@@ -52,6 +53,7 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix)
 	struct fetch_pack_args args;
 	struct oid_array shallow = OID_ARRAY_INIT;
 	struct string_list deepen_not = STRING_LIST_INIT_DUP;
+	struct packet_reader reader;
 
 	packet_trace_identity("fetch-pack");
 
@@ -193,7 +195,19 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix)
 		if (!conn)
 			return args.diag_url ? 0 : 1;
 	}
-	get_remote_heads(fd[0], NULL, 0, &ref, 0, NULL, &shallow);
+
+	packet_reader_init(&reader, fd[0], NULL, 0,
+			   PACKET_READ_CHOMP_NEWLINE |
+			   PACKET_READ_GENTLE_ON_EOF);
+
+	switch (discover_version(&reader)) {
+	case protocol_v1:
+	case protocol_v0:
+		get_remote_heads(&reader, &ref, 0, NULL, &shallow);
+		break;
+	case protocol_unknown_version:
+		BUG("unknown protocol version");
+	}
 
 	ref = fetch_pack(&args, fd, conn, ref, dest, sought, nr_sought,
 			 &shallow, pack_lockfile_ptr);
diff --git a/builtin/send-pack.c b/builtin/send-pack.c
index fc4f0bb5f..83cb125a6 100644
--- a/builtin/send-pack.c
+++ b/builtin/send-pack.c
@@ -14,6 +14,7 @@
 #include "sha1-array.h"
 #include "gpg-interface.h"
 #include "gettext.h"
+#include "protocol.h"
 
 static const char * const send_pack_usage[] = {
 	N_("git send-pack [--all | --mirror] [--dry-run] [--force] "
@@ -154,6 +155,7 @@ int cmd_send_pack(int argc, const char **argv, const char *prefix)
 	int progress = -1;
 	int from_stdin = 0;
 	struct push_cas_option cas = {0};
+	struct packet_reader reader;
 
 	struct option options[] = {
 		OPT__VERBOSITY(&verbose),
@@ -256,8 +258,19 @@ int cmd_send_pack(int argc, const char **argv, const char *prefix)
 			args.verbose ? CONNECT_VERBOSE : 0);
 	}
 
-	get_remote_heads(fd[0], NULL, 0, &remote_refs, REF_NORMAL,
-			 &extra_have, &shallow);
+	packet_reader_init(&reader, fd[0], NULL, 0,
+			   PACKET_READ_CHOMP_NEWLINE |
+			   PACKET_READ_GENTLE_ON_EOF);
+
+	switch (discover_version(&reader)) {
+	case protocol_v1:
+	case protocol_v0:
+		get_remote_heads(&reader, &remote_refs, REF_NORMAL,
+				 &extra_have, &shallow);
+		break;
+	case protocol_unknown_version:
+		BUG("unknown protocol version");
+	}
 
 	transport_verify_remote_names(nr_refspecs, refspecs);
 
diff --git a/connect.c b/connect.c
index 00e90075c..db3c9d24c 100644
--- a/connect.c
+++ b/connect.c
@@ -62,7 +62,7 @@ static void die_initial_contact(int unexpected)
 		      "and the repository exists."));
 }
 
-static enum protocol_version discover_version(struct packet_reader *reader)
+enum protocol_version discover_version(struct packet_reader *reader)
 {
 	enum protocol_version version = protocol_unknown_version;
 
@@ -234,7 +234,7 @@ enum get_remote_heads_state {
 /*
  * Read all the refs from the other end
  */
-struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
+struct ref **get_remote_heads(struct packet_reader *reader,
 			      struct ref **list, unsigned int flags,
 			      struct oid_array *extra_have,
 			      struct oid_array *shallow_points)
@@ -242,24 +242,17 @@ struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
 	struct ref **orig_list = list;
 	int len = 0;
 	enum get_remote_heads_state state = EXPECTING_FIRST_REF;
-	struct packet_reader reader;
 	const char *arg;
 
-	packet_reader_init(&reader, in, src_buf, src_len,
-			   PACKET_READ_CHOMP_NEWLINE |
-			   PACKET_READ_GENTLE_ON_EOF);
-
-	discover_version(&reader);
-
 	*list = NULL;
 
 	while (state != EXPECTING_DONE) {
-		switch (packet_reader_read(&reader)) {
+		switch (packet_reader_read(reader)) {
 		case PACKET_READ_EOF:
 			die_initial_contact(1);
 		case PACKET_READ_NORMAL:
-			len = reader.pktlen;
-			if (len > 4 && skip_prefix(reader.line, "ERR ", &arg))
+			len = reader->pktlen;
+			if (len > 4 && skip_prefix(reader->line, "ERR ", &arg))
 				die("remote error: %s", arg);
 			break;
 		case PACKET_READ_FLUSH:
@@ -271,22 +264,22 @@ struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
 
 		switch (state) {
 		case EXPECTING_FIRST_REF:
-			process_capabilities(reader.line, &len);
-			if (process_dummy_ref(reader.line)) {
+			process_capabilities(reader->line, &len);
+			if (process_dummy_ref(reader->line)) {
 				state = EXPECTING_SHALLOW;
 				break;
 			}
 			state = EXPECTING_REF;
 			/* fallthrough */
 		case EXPECTING_REF:
-			if (process_ref(reader.line, len, &list, flags, extra_have))
+			if (process_ref(reader->line, len, &list, flags, extra_have))
 				break;
 			state = EXPECTING_SHALLOW;
 			/* fallthrough */
 		case EXPECTING_SHALLOW:
-			if (process_shallow(reader.line, len, shallow_points))
+			if (process_shallow(reader->line, len, shallow_points))
 				break;
-			die("protocol error: unexpected '%s'", reader.line);
+			die("protocol error: unexpected '%s'", reader->line);
 		case EXPECTING_DONE:
 			break;
 		}
diff --git a/connect.h b/connect.h
index 01f14cdf3..cdb8979dc 100644
--- a/connect.h
+++ b/connect.h
@@ -13,4 +13,7 @@ extern int parse_feature_request(const char *features, const char *feature);
 extern const char *server_feature_value(const char *feature, int *len_ret);
 extern int url_is_local_not_ssh(const char *url);
 
+struct packet_reader;
+extern enum protocol_version discover_version(struct packet_reader *reader);
+
 #endif
diff --git a/remote-curl.c b/remote-curl.c
index 0053b0954..9f6d07683 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -1,6 +1,7 @@
 #include "cache.h"
 #include "config.h"
 #include "remote.h"
+#include "connect.h"
 #include "strbuf.h"
 #include "walker.h"
 #include "http.h"
@@ -13,6 +14,7 @@
 #include "credential.h"
 #include "sha1-array.h"
 #include "send-pack.h"
+#include "protocol.h"
 
 static struct remote *remote;
 /* always ends with a trailing slash */
@@ -176,8 +178,22 @@ static struct discovery *last_discovery;
 static struct ref *parse_git_refs(struct discovery *heads, int for_push)
 {
 	struct ref *list = NULL;
-	get_remote_heads(-1, heads->buf, heads->len, &list,
-			 for_push ? REF_NORMAL : 0, NULL, &heads->shallow);
+	struct packet_reader reader;
+
+	packet_reader_init(&reader, -1, heads->buf, heads->len,
+			   PACKET_READ_CHOMP_NEWLINE |
+			   PACKET_READ_GENTLE_ON_EOF);
+
+	switch (discover_version(&reader)) {
+	case protocol_v1:
+	case protocol_v0:
+		get_remote_heads(&reader, &list, for_push ? REF_NORMAL : 0,
+				 NULL, &heads->shallow);
+		break;
+	case protocol_unknown_version:
+		BUG("unknown protocol version");
+	}
+
 	return list;
 }
 
diff --git a/remote.h b/remote.h
index 1f6611be2..2016461df 100644
--- a/remote.h
+++ b/remote.h
@@ -150,10 +150,11 @@ int check_ref_type(const struct ref *ref, int flags);
 void free_refs(struct ref *ref);
 
 struct oid_array;
-extern struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
+struct packet_reader;
+extern struct ref **get_remote_heads(struct packet_reader *reader,
 				     struct ref **list, unsigned int flags,
 				     struct oid_array *extra_have,
-				     struct oid_array *shallow);
+				     struct oid_array *shallow_points);
 
 int resolve_remote_symref(struct ref *ref, struct ref *list);
 int ref_newer(const struct object_id *new_oid, const struct object_id *old_oid);
diff --git a/transport.c b/transport.c
index 8e8779096..63c3dbab9 100644
--- a/transport.c
+++ b/transport.c
@@ -18,6 +18,7 @@
 #include "sha1-array.h"
 #include "sigchain.h"
 #include "transport-internal.h"
+#include "protocol.h"
 
 static void set_upstreams(struct transport *transport, struct ref *refs,
 	int pretend)
@@ -190,13 +191,26 @@ static int connect_setup(struct transport *transport, int for_push)
 static struct ref *get_refs_via_connect(struct transport *transport, int for_push)
 {
 	struct git_transport_data *data = transport->data;
-	struct ref *refs;
+	struct ref *refs = NULL;
+	struct packet_reader reader;
 
 	connect_setup(transport, for_push);
-	get_remote_heads(data->fd[0], NULL, 0, &refs,
-			 for_push ? REF_NORMAL : 0,
-			 &data->extra_have,
-			 &data->shallow);
+
+	packet_reader_init(&reader, data->fd[0], NULL, 0,
+			   PACKET_READ_CHOMP_NEWLINE |
+			   PACKET_READ_GENTLE_ON_EOF);
+
+	switch (discover_version(&reader)) {
+	case protocol_v1:
+	case protocol_v0:
+		get_remote_heads(&reader, &refs,
+				 for_push ? REF_NORMAL : 0,
+				 &data->extra_have,
+				 &data->shallow);
+		break;
+	case protocol_unknown_version:
+		BUG("unknown protocol version");
+	}
 	data->got_remote_heads = 1;
 
 	return refs;
-- 
2.16.0.rc1.238.g530d649a79-goog



* [PATCH v2 09/27] transport: store protocol version
  2018-01-25 23:58 ` [PATCH v2 00/27] " Brandon Williams
                     ` (7 preceding siblings ...)
  2018-01-25 23:58   ` [PATCH v2 08/27] connect: discover protocol version outside of get_remote_heads Brandon Williams
@ 2018-01-25 23:58   ` Brandon Williams
  2018-01-31 14:45     ` Derrick Stolee
  2018-01-25 23:58   ` [PATCH v2 10/27] protocol: introduce enum protocol_version value protocol_v2 Brandon Williams
                     ` (20 subsequent siblings)
  29 siblings, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-01-25 23:58 UTC (permalink / raw)
  To: git
  Cc: sbeller, gitster, peff, philipoakley, stolee, jrnieder, Brandon Williams

Once protocol_v2 is introduced, requesting a fetch or a push will need to
be handled differently depending on the protocol version.  Store the
protocol version the server is speaking in 'struct git_transport_data'
and use it to determine what to do in the case of a fetch or a push.
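
As a rough illustration of the pattern (record the version once during
ref discovery, consult it in later operations), consider the sketch
below.  All names in it are hypothetical; 'struct sketch_transport' is a
much-reduced stand-in for 'struct git_transport_data', and the returned
strings merely name the code path that would be taken.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum sketch_version {
	SKETCH_UNKNOWN = -1,
	SKETCH_V0 = 0,
	SKETCH_V1 = 1,
};

/* Hypothetical, much-reduced stand-in for 'struct git_transport_data'. */
struct sketch_transport {
	unsigned got_remote_heads : 1;
	enum sketch_version version;	/* recorded during ref discovery */
};

/* Record the discovered version while the ref advertisement is read. */
void sketch_get_refs(struct sketch_transport *t, enum sketch_version found)
{
	assert(t);
	t->version = found;
	t->got_remote_heads = 1;
}

/* A later operation picks its code path from the stored version. */
const char *sketch_fetch(const struct sketch_transport *t)
{
	assert(t && t->got_remote_heads);
	switch (t->version) {
	case SKETCH_V1:
	case SKETCH_V0:
		return "fetch_pack";
	case SKETCH_UNKNOWN:
		break;
	}
	return "bug: unknown protocol version";
}
```

Listing every enum value in the switch, without a default case, lets the
compiler warn when a new version is added to the enum but not handled.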

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 transport.c | 35 ++++++++++++++++++++++++++---------
 1 file changed, 26 insertions(+), 9 deletions(-)

diff --git a/transport.c b/transport.c
index 63c3dbab9..2378dcb38 100644
--- a/transport.c
+++ b/transport.c
@@ -118,6 +118,7 @@ struct git_transport_data {
 	struct child_process *conn;
 	int fd[2];
 	unsigned got_remote_heads : 1;
+	enum protocol_version version;
 	struct oid_array extra_have;
 	struct oid_array shallow;
 };
@@ -200,7 +201,8 @@ static struct ref *get_refs_via_connect(struct transport *transport, int for_pus
 			   PACKET_READ_CHOMP_NEWLINE |
 			   PACKET_READ_GENTLE_ON_EOF);
 
-	switch (discover_version(&reader)) {
+	data->version = discover_version(&reader);
+	switch (data->version) {
 	case protocol_v1:
 	case protocol_v0:
 		get_remote_heads(&reader, &refs,
@@ -221,7 +223,7 @@ static int fetch_refs_via_pack(struct transport *transport,
 {
 	int ret = 0;
 	struct git_transport_data *data = transport->data;
-	struct ref *refs;
+	struct ref *refs = NULL;
 	char *dest = xstrdup(transport->url);
 	struct fetch_pack_args args;
 	struct ref *refs_tmp = NULL;
@@ -247,10 +249,18 @@ static int fetch_refs_via_pack(struct transport *transport,
 	if (!data->got_remote_heads)
 		refs_tmp = get_refs_via_connect(transport, 0);
 
-	refs = fetch_pack(&args, data->fd, data->conn,
-			  refs_tmp ? refs_tmp : transport->remote_refs,
-			  dest, to_fetch, nr_heads, &data->shallow,
-			  &transport->pack_lockfile);
+	switch (data->version) {
+	case protocol_v1:
+	case protocol_v0:
+		refs = fetch_pack(&args, data->fd, data->conn,
+				  refs_tmp ? refs_tmp : transport->remote_refs,
+				  dest, to_fetch, nr_heads, &data->shallow,
+				  &transport->pack_lockfile);
+		break;
+	case protocol_unknown_version:
+		BUG("unknown protocol version");
+	}
+
 	close(data->fd[0]);
 	close(data->fd[1]);
 	if (finish_connect(data->conn))
@@ -549,7 +559,7 @@ static int git_transport_push(struct transport *transport, struct ref *remote_re
 {
 	struct git_transport_data *data = transport->data;
 	struct send_pack_args args;
-	int ret;
+	int ret = 0;
 
 	if (!data->got_remote_heads)
 		get_refs_via_connect(transport, 1);
@@ -574,8 +584,15 @@ static int git_transport_push(struct transport *transport, struct ref *remote_re
 	else
 		args.push_cert = SEND_PACK_PUSH_CERT_NEVER;
 
-	ret = send_pack(&args, data->fd, data->conn, remote_refs,
-			&data->extra_have);
+	switch (data->version) {
+	case protocol_v1:
+	case protocol_v0:
+		ret = send_pack(&args, data->fd, data->conn, remote_refs,
+				&data->extra_have);
+		break;
+	case protocol_unknown_version:
+		BUG("unknown protocol version");
+	}
 
 	close(data->fd[1]);
 	close(data->fd[0]);
-- 
2.16.0.rc1.238.g530d649a79-goog



* [PATCH v2 10/27] protocol: introduce enum protocol_version value protocol_v2
  2018-01-25 23:58 ` [PATCH v2 00/27] " Brandon Williams
                     ` (8 preceding siblings ...)
  2018-01-25 23:58   ` [PATCH v2 09/27] transport: store protocol version Brandon Williams
@ 2018-01-25 23:58   ` Brandon Williams
  2018-01-31 14:54     ` Derrick Stolee
  2018-01-25 23:58   ` [PATCH v2 11/27] test-pkt-line: introduce a packet-line test helper Brandon Williams
                     ` (19 subsequent siblings)
  29 siblings, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-01-25 23:58 UTC (permalink / raw)
  To: git
  Cc: sbeller, gitster, peff, philipoakley, stolee, jrnieder, Brandon Williams

Introduce protocol_v2, a new value for 'enum protocol_version'.
Subsequent patches will fill in the implementation of protocol_v2.
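
For readers unfamiliar with how a version value ends up in the enum, a
minimal sketch follows.  It mirrors the string comparison visible in the
protocol.c hunk below, but the function names are hypothetical, and the
assumption that a server's version line has the form "version N" (with
any other first line treated as v0) belongs to this sketch.

```c
#include <assert.h>
#include <string.h>

enum sketch_version {
	SKETCH_UNKNOWN = -1,
	SKETCH_V0 = 0,
	SKETCH_V1 = 1,
	SKETCH_V2 = 2,
};

/* Map the textual version number onto the enum. */
enum sketch_version sketch_parse_version(const char *value)
{
	assert(value);
	if (!strcmp(value, "0"))
		return SKETCH_V0;
	else if (!strcmp(value, "1"))
		return SKETCH_V1;
	else if (!strcmp(value, "2"))
		return SKETCH_V2;
	else
		return SKETCH_UNKNOWN;
}

/*
 * Assumption of this sketch: a line that does not start with
 * "version " is a v0 ref advertisement line.
 */
enum sketch_version sketch_version_from_line(const char *line)
{
	static const char prefix[] = "version ";

	assert(line);
	if (strncmp(line, prefix, strlen(prefix)))
		return SKETCH_V0;
	return sketch_parse_version(line + strlen(prefix));
}
```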

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 builtin/fetch-pack.c   | 3 +++
 builtin/receive-pack.c | 6 ++++++
 builtin/send-pack.c    | 3 +++
 builtin/upload-pack.c  | 7 +++++++
 connect.c              | 3 +++
 protocol.c             | 2 ++
 protocol.h             | 1 +
 remote-curl.c          | 3 +++
 transport.c            | 9 +++++++++
 9 files changed, 37 insertions(+)

diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
index 85d4faf76..f492e8abd 100644
--- a/builtin/fetch-pack.c
+++ b/builtin/fetch-pack.c
@@ -201,6 +201,9 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix)
 			   PACKET_READ_GENTLE_ON_EOF);
 
 	switch (discover_version(&reader)) {
+	case protocol_v2:
+		die("support for protocol v2 not implemented yet");
+		break;
 	case protocol_v1:
 	case protocol_v0:
 		get_remote_heads(&reader, &ref, 0, NULL, &shallow);
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index b7ce7c7f5..3656e94fd 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -1963,6 +1963,12 @@ int cmd_receive_pack(int argc, const char **argv, const char *prefix)
 		unpack_limit = receive_unpack_limit;
 
 	switch (determine_protocol_version_server()) {
+	case protocol_v2:
+		/*
+		 * push support for protocol v2 has not been implemented yet,
+		 * so ignore the request to use v2 and fallback to using v0.
+		 */
+		break;
 	case protocol_v1:
 		/*
 		 * v1 is just the original protocol with a version string,
diff --git a/builtin/send-pack.c b/builtin/send-pack.c
index 83cb125a6..b5427f75e 100644
--- a/builtin/send-pack.c
+++ b/builtin/send-pack.c
@@ -263,6 +263,9 @@ int cmd_send_pack(int argc, const char **argv, const char *prefix)
 			   PACKET_READ_GENTLE_ON_EOF);
 
 	switch (discover_version(&reader)) {
+	case protocol_v2:
+		die("support for protocol v2 not implemented yet");
+		break;
 	case protocol_v1:
 	case protocol_v0:
 		get_remote_heads(&reader, &remote_refs, REF_NORMAL,
diff --git a/builtin/upload-pack.c b/builtin/upload-pack.c
index 2cb5cb35b..8d53e9794 100644
--- a/builtin/upload-pack.c
+++ b/builtin/upload-pack.c
@@ -47,6 +47,13 @@ int cmd_upload_pack(int argc, const char **argv, const char *prefix)
 		die("'%s' does not appear to be a git repository", dir);
 
 	switch (determine_protocol_version_server()) {
+	case protocol_v2:
+		/*
+		 * fetch support for protocol v2 has not been implemented yet,
+		 * so ignore the request to use v2 and fallback to using v0.
+		 */
+		upload_pack(&opts);
+		break;
 	case protocol_v1:
 		/*
 		 * v1 is just the original protocol with a version string,
diff --git a/connect.c b/connect.c
index db3c9d24c..f2157a821 100644
--- a/connect.c
+++ b/connect.c
@@ -84,6 +84,9 @@ enum protocol_version discover_version(struct packet_reader *reader)
 
 	/* Maybe process capabilities here, at least for v2 */
 	switch (version) {
+	case protocol_v2:
+		die("support for protocol v2 not implemented yet");
+		break;
 	case protocol_v1:
 		/* Read the peeked version line */
 		packet_reader_read(reader);
diff --git a/protocol.c b/protocol.c
index 43012b7eb..5e636785d 100644
--- a/protocol.c
+++ b/protocol.c
@@ -8,6 +8,8 @@ static enum protocol_version parse_protocol_version(const char *value)
 		return protocol_v0;
 	else if (!strcmp(value, "1"))
 		return protocol_v1;
+	else if (!strcmp(value, "2"))
+		return protocol_v2;
 	else
 		return protocol_unknown_version;
 }
diff --git a/protocol.h b/protocol.h
index 1b2bc94a8..2ad35e433 100644
--- a/protocol.h
+++ b/protocol.h
@@ -5,6 +5,7 @@ enum protocol_version {
 	protocol_unknown_version = -1,
 	protocol_v0 = 0,
 	protocol_v1 = 1,
+	protocol_v2 = 2,
 };
 
 /*
diff --git a/remote-curl.c b/remote-curl.c
index 9f6d07683..dae8a4a48 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -185,6 +185,9 @@ static struct ref *parse_git_refs(struct discovery *heads, int for_push)
 			   PACKET_READ_GENTLE_ON_EOF);
 
 	switch (discover_version(&reader)) {
+	case protocol_v2:
+		die("support for protocol v2 not implemented yet");
+		break;
 	case protocol_v1:
 	case protocol_v0:
 		get_remote_heads(&reader, &list, for_push ? REF_NORMAL : 0,
diff --git a/transport.c b/transport.c
index 2378dcb38..83d9dd1df 100644
--- a/transport.c
+++ b/transport.c
@@ -203,6 +203,9 @@ static struct ref *get_refs_via_connect(struct transport *transport, int for_pus
 
 	data->version = discover_version(&reader);
 	switch (data->version) {
+	case protocol_v2:
+		die("support for protocol v2 not implemented yet");
+		break;
 	case protocol_v1:
 	case protocol_v0:
 		get_remote_heads(&reader, &refs,
@@ -250,6 +253,9 @@ static int fetch_refs_via_pack(struct transport *transport,
 		refs_tmp = get_refs_via_connect(transport, 0);
 
 	switch (data->version) {
+	case protocol_v2:
+		die("support for protocol v2 not implemented yet");
+		break;
 	case protocol_v1:
 	case protocol_v0:
 		refs = fetch_pack(&args, data->fd, data->conn,
@@ -585,6 +591,9 @@ static int git_transport_push(struct transport *transport, struct ref *remote_re
 		args.push_cert = SEND_PACK_PUSH_CERT_NEVER;
 
 	switch (data->version) {
+	case protocol_v2:
+		die("support for protocol v2 not implemented yet");
+		break;
 	case protocol_v1:
 	case protocol_v0:
 		ret = send_pack(&args, data->fd, data->conn, remote_refs,
-- 
2.16.0.rc1.238.g530d649a79-goog



* [PATCH v2 11/27] test-pkt-line: introduce a packet-line test helper
  2018-01-25 23:58 ` [PATCH v2 00/27] " Brandon Williams
                     ` (9 preceding siblings ...)
  2018-01-25 23:58   ` [PATCH v2 10/27] protocol: introduce enum protocol_version value protocol_v2 Brandon Williams
@ 2018-01-25 23:58   ` Brandon Williams
  2018-01-25 23:58   ` [PATCH v2 12/27] serve: introduce git-serve Brandon Williams
                     ` (18 subsequent siblings)
  29 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-01-25 23:58 UTC (permalink / raw)
  To: git
  Cc: sbeller, gitster, peff, philipoakley, stolee, jrnieder, Brandon Williams

Introduce a packet-line test helper which can either pack an input
stream into packet-lines or unpack packet-lines from it, and which
writes the result to stdout.
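
For context, packing a line means prefixing it with a four-digit hex
length that counts the prefix itself, while "0000" (flush) and "0001"
(delim) are special markers that carry no payload.  The sketch below
shows that framing on its own.  It is not the helper added by this
patch; the function names are hypothetical and the 0xffff bound is just
the largest value four hex digits can express, not a limit taken from
pkt-line.h.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

enum sketch_pkt_kind { SKETCH_FLUSH, SKETCH_DELIM, SKETCH_NORMAL };

/*
 * Encode 'payload' as one pkt-line into 'out'.
 * Returns the number of bytes written, or -1 if it does not fit.
 */
int sketch_pkt_encode(const char *payload, char *out, size_t outsz)
{
	size_t len;

	assert(payload && out);
	len = strlen(payload) + 4;	/* length counts its own 4 digits */
	if (len > 0xffff || len + 1 > outsz)
		return -1;
	snprintf(out, outsz, "%04x%s", (unsigned int)len, payload);
	return (int)len;
}

/* Classify a packet by its four-byte header. */
enum sketch_pkt_kind sketch_pkt_classify(const char *hdr)
{
	assert(hdr);
	if (!strncmp(hdr, "0000", 4))
		return SKETCH_FLUSH;
	if (!strncmp(hdr, "0001", 4))
		return SKETCH_DELIM;
	return SKETCH_NORMAL;
}
```

For example, the six-byte payload "hello\n" becomes "000ahello\n", since
6 + 4 = 10 = 0x000a.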

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 Makefile                 |  1 +
 t/helper/test-pkt-line.c | 62 ++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 63 insertions(+)
 create mode 100644 t/helper/test-pkt-line.c

diff --git a/Makefile b/Makefile
index b7ccc05fa..3b849c060 100644
--- a/Makefile
+++ b/Makefile
@@ -669,6 +669,7 @@ TEST_PROGRAMS_NEED_X += test-mktemp
 TEST_PROGRAMS_NEED_X += test-online-cpus
 TEST_PROGRAMS_NEED_X += test-parse-options
 TEST_PROGRAMS_NEED_X += test-path-utils
+TEST_PROGRAMS_NEED_X += test-pkt-line
 TEST_PROGRAMS_NEED_X += test-prio-queue
 TEST_PROGRAMS_NEED_X += test-read-cache
 TEST_PROGRAMS_NEED_X += test-write-cache
diff --git a/t/helper/test-pkt-line.c b/t/helper/test-pkt-line.c
new file mode 100644
index 000000000..5df32b4cb
--- /dev/null
+++ b/t/helper/test-pkt-line.c
@@ -0,0 +1,62 @@
+#include "pkt-line.h"
+
+static void pack_line(const char *line)
+{
+	if (!strcmp(line, "0000") || !strcmp(line, "0000\n"))
+		packet_flush(1);
+	else if (!strcmp(line, "0001") || !strcmp(line, "0001\n"))
+		packet_delim(1);
+	else
+		packet_write_fmt(1, "%s", line);
+}
+
+static void pack(int argc, const char **argv)
+{
+	if (argc) { /* read from argv */
+		int i;
+		for (i = 0; i < argc; i++)
+			pack_line(argv[i]);
+	} else { /* read from stdin */
+		char line[LARGE_PACKET_MAX];
+		while (fgets(line, sizeof(line), stdin)) {
+			pack_line(line);
+		}
+	}
+}
+
+static void unpack(void)
+{
+	struct packet_reader reader;
+	packet_reader_init(&reader, 0, NULL, 0,
+			   PACKET_READ_GENTLE_ON_EOF |
+			   PACKET_READ_CHOMP_NEWLINE);
+
+	while (packet_reader_read(&reader) != PACKET_READ_EOF) {
+		switch (reader.status) {
+		case PACKET_READ_EOF:
+			break;
+		case PACKET_READ_NORMAL:
+			printf("%s\n", reader.line);
+			break;
+		case PACKET_READ_FLUSH:
+			printf("0000\n");
+			break;
+		case PACKET_READ_DELIM:
+			printf("0001\n");
+			break;
+		}
+	}
+}
+
+int cmd_main(int argc, const char **argv)
+{
+	if (argc < 2)
+		die("too few arguments");
+
+	if (!strcmp(argv[1], "pack"))
+		pack(argc - 2, argv + 2);
+	else if (!strcmp(argv[1], "unpack"))
+		unpack();
+
+	return 0;
+}
-- 
2.16.0.rc1.238.g530d649a79-goog



* [PATCH v2 12/27] serve: introduce git-serve
  2018-01-25 23:58 ` [PATCH v2 00/27] " Brandon Williams
                     ` (10 preceding siblings ...)
  2018-01-25 23:58   ` [PATCH v2 11/27] test-pkt-line: introduce a packet-line test helper Brandon Williams
@ 2018-01-25 23:58   ` Brandon Williams
  2018-01-26 10:39     ` Duy Nguyen
  2018-01-31 15:39     ` Derrick Stolee
  2018-01-25 23:58   ` [PATCH v2 13/27] ls-refs: introduce ls-refs server command Brandon Williams
                     ` (17 subsequent siblings)
  29 siblings, 2 replies; 362+ messages in thread
From: Brandon Williams @ 2018-01-25 23:58 UTC (permalink / raw)
  To: git
  Cc: sbeller, gitster, peff, philipoakley, stolee, jrnieder, Brandon Williams

Introduce git-serve, the base server for protocol version 2.

Protocol version 2 is intended to be a replacement for Git's current
wire protocol.  The intention is that it will be a simpler, less
wasteful protocol which can evolve over time.

Protocol version 2 improves upon version 1 by eliminating the initial
ref advertisement.  In its place a server will export a list of
capabilities and commands which it supports in a capability
advertisement.  A client can then request that a particular command be
executed by providing a number of capabilities and command specific
parameters.  At the completion of a command, a client can request that
another command be executed or can terminate the connection by sending a
flush packet.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
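As an illustration of the request format described above, here is a minimal
sketch (not part of this patch) of how a client might frame an `ls-refs`
request in pkt-lines.  The helper names `append`, `append_pkt` and
`build_ls_refs_request` are hypothetical, and the 65520 limit is assumed to
match git's maximum pkt-line length.  Each pkt-line carries a 4-hex-digit
length that counts the header itself; '0001' and '0000' are the delim-pkt and
flush-pkt.

```c
#include <stdio.h>
#include <string.h>

/* Append a NUL-terminated string to buf; returns -1 if it does not fit. */
static int append(char *buf, size_t bufsz, size_t *used, const char *s)
{
	size_t n = strlen(s);

	if (*used + n + 1 > bufsz)
		return -1;
	memcpy(buf + *used, s, n + 1);
	*used += n;
	return 0;
}

/* Append one pkt-line: 4 hex digits of total length, then the payload. */
static int append_pkt(char *buf, size_t bufsz, size_t *used,
		      const char *payload)
{
	char hdr[5];
	size_t len = strlen(payload) + 4; /* the length includes the header */

	if (len > 65520) /* assumed maximum pkt-line length */
		return -1;
	snprintf(hdr, sizeof(hdr), "%04zx", len);
	if (append(buf, bufsz, used, hdr))
		return -1;
	return append(buf, bufsz, used, payload);
}

/*
 * Build: command, delim-pkt, command arguments, flush-pkt.
 * Returns the number of bytes written, or -1 if buf is too small.
 */
static int build_ls_refs_request(char *buf, size_t bufsz)
{
	size_t used = 0;

	if (bufsz)
		buf[0] = '\0';
	if (append_pkt(buf, bufsz, &used, "command=ls-refs\n") ||
	    append(buf, bufsz, &used, "0001") ||
	    append_pkt(buf, bufsz, &used, "symrefs\n") ||
	    append_pkt(buf, bufsz, &used, "ref-pattern refs/heads/*\n") ||
	    append(buf, bufsz, &used, "0000"))
		return -1;
	return (int)used;
}
```

The resulting buffer could be fed to `git serve --stateless-rpc` on stdin,
much like the tests do with `test-pkt-line pack`.
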
 .gitignore                              |   1 +
 Documentation/technical/protocol-v2.txt | 117 +++++++++++++++
 Makefile                                |   2 +
 builtin.h                               |   1 +
 builtin/serve.c                         |  30 ++++
 git.c                                   |   1 +
 serve.c                                 | 249 ++++++++++++++++++++++++++++++++
 serve.h                                 |  15 ++
 t/t5701-git-serve.sh                    |  56 +++++++
 9 files changed, 472 insertions(+)
 create mode 100644 Documentation/technical/protocol-v2.txt
 create mode 100644 builtin/serve.c
 create mode 100644 serve.c
 create mode 100644 serve.h
 create mode 100755 t/t5701-git-serve.sh

diff --git a/.gitignore b/.gitignore
index 833ef3b0b..2d0450c26 100644
--- a/.gitignore
+++ b/.gitignore
@@ -140,6 +140,7 @@
 /git-rm
 /git-send-email
 /git-send-pack
+/git-serve
 /git-sh-i18n
 /git-sh-i18n--envsubst
 /git-sh-setup
diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
new file mode 100644
index 000000000..7f619a76c
--- /dev/null
+++ b/Documentation/technical/protocol-v2.txt
@@ -0,0 +1,117 @@
+ Git Wire Protocol, Version 2
+==============================
+
+This document presents a specification for a version 2 of Git's wire
+protocol.  Protocol v2 will improve upon v1 in the following ways:
+
+  * Instead of multiple service names, multiple commands will be
+    supported by a single service.
+  * Easily extendable as capabilities are moved into their own section
+    of the protocol, no longer being hidden behind a NUL byte and
+    limited by the size of a pkt-line (as there will be a single
+    capability per pkt-line).
+  * Separate out other information hidden behind NUL bytes (e.g. agent
+    string as a capability and symrefs can be requested using 'ls-refs')
+  * Reference advertisement will be omitted unless explicitly requested
+  * ls-refs command to explicitly request some refs
+
+ Detailed Design
+=================
+
+A client can request to speak protocol v2 by sending `version=2` in the
+side-channel `GIT_PROTOCOL` in the initial request to the server.
+
+In protocol v2 communication is command oriented.  When first contacting a
+server a list of capabilities will be advertised.  Some of these capabilities
+will be commands which a client can request be executed.  Once a command
+has completed, a client can reuse the connection and request that other
+commands be executed.
+
+ Special Packets
+-----------------
+
+In protocol v2 these special packets will have the following semantics:
+
+  * '0000' Flush Packet (flush-pkt) - indicates the end of a message
+  * '0001' Delimiter Packet (delim-pkt) - separates sections of a message
+
+ Capability Advertisement
+--------------------------
+
+A server which decides to communicate (based on a request from a client)
+using protocol version 2, notifies the client by sending a version string
+in its initial response followed by an advertisement of its capabilities.
+Each capability is a key with an optional value.  Clients must ignore all
+unknown keys.  Semantics of unknown values are left to the definition of
+each key.  Some capabilities will describe commands which can be requested
+to be executed by the client.
+
+    capability-advertisement = protocol-version
+			       capability-list
+			       flush-pkt
+
+    protocol-version = PKT-LINE("version 2" LF)
+    capability-list = *capability
+    capability = PKT-LINE(key[=value] LF)
+
+    key = 1*CHAR
+    value = 1*CHAR
+    CHAR = 1*(ALPHA / DIGIT / "-" / "_")
+
+A client then responds to select the command it wants with any particular
+capabilities or arguments.  There is then an optional section where the
+client can provide any command specific parameters or queries.
+
+    command-request = command
+		      capability-list
+		      (command-args)
+		      flush-pkt
+    command = PKT-LINE("command=" key LF)
+    command-args = delim-pkt
+		   *arg
+    arg = 1*CHAR
+
+The server will then check to ensure that the client's request is
+comprised of a valid command as well as valid capabilities which were
+advertised.  If the request is valid the server will then execute the
+command.
+
+When a command has finished a client can either request that another
+command be executed or can terminate the connection by sending an empty
+request consisting of just a flush-pkt.
+
+ Capabilities
+~~~~~~~~~~~~~~
+
+There are two different types of capabilities: normal capabilities,
+which can be used to convey information or alter the behavior of a
+request, and command capabilities, which are the core actions that a
+client wants to perform (fetch, push, etc).
+
+ agent
+-------
+
+The server can advertise the `agent` capability with a value `X` (in the
+form `agent=X`) to notify the client that the server is running version
+`X`.  The client may optionally send its own agent string by including
+the `agent` capability with a value `Y` (in the form `agent=Y`) in its
+request to the server (but it MUST NOT do so if the server did not
+advertise the agent capability). The `X` and `Y` strings may contain any
+printable ASCII characters except space (i.e., the byte range 32 < x <
+127), and are typically of the form "package/version" (e.g.,
+"git/1.8.3.1"). The agent strings are purely informative for statistics
+and debugging purposes, and MUST NOT be used to programmatically assume
+the presence or absence of particular features.
+
+ stateless-rpc
+---------------
+
+If advertised, the `stateless-rpc` capability indicates that the server
+supports running commands in a stateless-rpc mode, which means that a
+command lasts for only a single request-response round.
+
+Normally a command can last for as many rounds as are required to
+complete it (multiple for negotiation during fetch or no additional
+trips in the case of ls-refs).  If the client sends the `stateless-rpc`
+capability with a value of `true` (in the form `stateless-rpc=true`)
+then the invoked command must only last a single round.
diff --git a/Makefile b/Makefile
index 3b849c060..18c255428 100644
--- a/Makefile
+++ b/Makefile
@@ -881,6 +881,7 @@ LIB_OBJS += revision.o
 LIB_OBJS += run-command.o
 LIB_OBJS += send-pack.o
 LIB_OBJS += sequencer.o
+LIB_OBJS += serve.o
 LIB_OBJS += server-info.o
 LIB_OBJS += setup.o
 LIB_OBJS += sha1-array.o
@@ -1014,6 +1015,7 @@ BUILTIN_OBJS += builtin/rev-parse.o
 BUILTIN_OBJS += builtin/revert.o
 BUILTIN_OBJS += builtin/rm.o
 BUILTIN_OBJS += builtin/send-pack.o
+BUILTIN_OBJS += builtin/serve.o
 BUILTIN_OBJS += builtin/shortlog.o
 BUILTIN_OBJS += builtin/show-branch.o
 BUILTIN_OBJS += builtin/show-ref.o
diff --git a/builtin.h b/builtin.h
index f332a1257..3f3fdfc28 100644
--- a/builtin.h
+++ b/builtin.h
@@ -215,6 +215,7 @@ extern int cmd_rev_parse(int argc, const char **argv, const char *prefix);
 extern int cmd_revert(int argc, const char **argv, const char *prefix);
 extern int cmd_rm(int argc, const char **argv, const char *prefix);
 extern int cmd_send_pack(int argc, const char **argv, const char *prefix);
+extern int cmd_serve(int argc, const char **argv, const char *prefix);
 extern int cmd_shortlog(int argc, const char **argv, const char *prefix);
 extern int cmd_show(int argc, const char **argv, const char *prefix);
 extern int cmd_show_branch(int argc, const char **argv, const char *prefix);
diff --git a/builtin/serve.c b/builtin/serve.c
new file mode 100644
index 000000000..d3fd240bb
--- /dev/null
+++ b/builtin/serve.c
@@ -0,0 +1,30 @@
+#include "cache.h"
+#include "builtin.h"
+#include "parse-options.h"
+#include "serve.h"
+
+static char const * const serve_usage[] = {
+	N_("git serve [<options>]"),
+	NULL
+};
+
+int cmd_serve(int argc, const char **argv, const char *prefix)
+{
+	struct serve_options opts = SERVE_OPTIONS_INIT;
+
+	struct option options[] = {
+		OPT_BOOL(0, "stateless-rpc", &opts.stateless_rpc,
+			 N_("quit after a single request/response exchange")),
+		OPT_BOOL(0, "advertise-capabilities", &opts.advertise_capabilities,
+			 N_("exit immediately after advertising capabilities")),
+		OPT_END()
+	};
+
+	/* ignore all unknown cmdline switches for now */
+	argc = parse_options(argc, argv, prefix, options, serve_usage,
+			     PARSE_OPT_KEEP_DASHDASH |
+			     PARSE_OPT_KEEP_UNKNOWN);
+	serve(&opts);
+
+	return 0;
+}
diff --git a/git.c b/git.c
index f71073dc8..f85d682b6 100644
--- a/git.c
+++ b/git.c
@@ -461,6 +461,7 @@ static struct cmd_struct commands[] = {
 	{ "revert", cmd_revert, RUN_SETUP | NEED_WORK_TREE },
 	{ "rm", cmd_rm, RUN_SETUP },
 	{ "send-pack", cmd_send_pack, RUN_SETUP },
+	{ "serve", cmd_serve, RUN_SETUP },
 	{ "shortlog", cmd_shortlog, RUN_SETUP_GENTLY | USE_PAGER },
 	{ "show", cmd_show, RUN_SETUP },
 	{ "show-branch", cmd_show_branch, RUN_SETUP },
diff --git a/serve.c b/serve.c
new file mode 100644
index 000000000..90e3defe8
--- /dev/null
+++ b/serve.c
@@ -0,0 +1,249 @@
+#include "cache.h"
+#include "repository.h"
+#include "config.h"
+#include "pkt-line.h"
+#include "version.h"
+#include "argv-array.h"
+#include "serve.h"
+
+static int always_advertise(struct repository *r,
+			    struct strbuf *value)
+{
+	return 1;
+}
+
+static int agent_advertise(struct repository *r,
+			   struct strbuf *value)
+{
+	if (value)
+		strbuf_addstr(value, git_user_agent_sanitized());
+	return 1;
+}
+
+struct protocol_capability {
+	/*
+	 * The name of the capability.  The server uses this name when
+	 * advertising this capability, and the client uses this name to
+	 * specify this capability.
+	 */
+	const char *name;
+
+	/*
+	 * Function queried to see if a capability should be advertised.
+	 * Optionally a value can be specified by adding it to 'value'.
+	 * If a value is added to 'value', the server will advertise this
+	 * capability as "<name>=<value>" instead of "<name>".
+	 */
+	int (*advertise)(struct repository *r, struct strbuf *value);
+
+	/*
+	 * Function called when a client requests the capability as a command.
+	 * The command request will be provided to the function via 'keys', the
+	 * capabilities requested, and 'args', the command specific parameters.
+	 *
+	 * This field should be NULL for capabilities which are not commands.
+	 */
+	int (*command)(struct repository *r,
+		       struct argv_array *keys,
+		       struct argv_array *args);
+};
+
+static struct protocol_capability capabilities[] = {
+	{ "agent", agent_advertise, NULL },
+	{ "stateless-rpc", always_advertise, NULL },
+};
+
+static void advertise_capabilities(void)
+{
+	struct strbuf capability = STRBUF_INIT;
+	struct strbuf value = STRBUF_INIT;
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(capabilities); i++) {
+		struct protocol_capability *c = &capabilities[i];
+
+		if (c->advertise(the_repository, &value)) {
+			strbuf_addstr(&capability, c->name);
+
+			if (value.len) {
+				strbuf_addch(&capability, '=');
+				strbuf_addbuf(&capability, &value);
+			}
+
+			strbuf_addch(&capability, '\n');
+			packet_write(1, capability.buf, capability.len);
+		}
+
+		strbuf_reset(&capability);
+		strbuf_reset(&value);
+	}
+
+	packet_flush(1);
+	strbuf_release(&capability);
+	strbuf_release(&value);
+}
+
+static struct protocol_capability *get_capability(const char *key)
+{
+	int i;
+
+	if (!key)
+		return NULL;
+
+	for (i = 0; i < ARRAY_SIZE(capabilities); i++) {
+		struct protocol_capability *c = &capabilities[i];
+		const char *out;
+		if (skip_prefix(key, c->name, &out) && (!*out || *out == '='))
+			return c;
+	}
+
+	return NULL;
+}
+
+static int is_valid_capability(const char *key)
+{
+	const struct protocol_capability *c = get_capability(key);
+
+	return c && c->advertise(the_repository, NULL);
+}
+
+static int is_command(const char *key, struct protocol_capability **command)
+{
+	const char *out;
+
+	if (skip_prefix(key, "command=", &out)) {
+		struct protocol_capability *cmd = get_capability(out);
+
+		if (!cmd || !cmd->advertise(the_repository, NULL) || !cmd->command)
+			die("invalid command '%s'", out);
+		if (*command)
+			die("command already requested");
+
+		*command = cmd;
+		return 1;
+	}
+
+	return 0;
+}
+
+int has_capability(const struct argv_array *keys, const char *capability,
+		   const char **value)
+{
+	int i;
+	for (i = 0; i < keys->argc; i++) {
+		const char *out;
+		if (skip_prefix(keys->argv[i], capability, &out) &&
+		    (!*out || *out == '=')) {
+			if (value) {
+				if (*out == '=')
+					out++;
+				*value = out;
+			}
+			return 1;
+		}
+	}
+
+	return 0;
+}
+
+enum request_state {
+	PROCESS_REQUEST_KEYS = 0,
+	PROCESS_REQUEST_ARGS,
+	PROCESS_REQUEST_DONE,
+};
+
+static int process_request(void)
+{
+	enum request_state state = PROCESS_REQUEST_KEYS;
+	char *buffer = packet_buffer;
+	unsigned buffer_size = sizeof(packet_buffer);
+	int pktlen;
+	struct argv_array keys = ARGV_ARRAY_INIT;
+	struct argv_array args = ARGV_ARRAY_INIT;
+	struct protocol_capability *command = NULL;
+
+	while (state != PROCESS_REQUEST_DONE) {
+		switch (packet_read_with_status(0, NULL, NULL, buffer,
+						buffer_size, &pktlen,
+						PACKET_READ_CHOMP_NEWLINE)) {
+		case PACKET_READ_EOF:
+			BUG("Should have already died when seeing EOF");
+		case PACKET_READ_NORMAL:
+			break;
+		case PACKET_READ_FLUSH:
+			state = PROCESS_REQUEST_DONE;
+			continue;
+		case PACKET_READ_DELIM:
+			if (state != PROCESS_REQUEST_KEYS)
+				die("protocol error");
+			state = PROCESS_REQUEST_ARGS;
+			/*
+			 * maybe include a check to make sure that a
+			 * command/capabilities were given.
+			 */
+			continue;
+		}
+
+		switch (state) {
+		case PROCESS_REQUEST_KEYS:
+			/* collect request; a sequence of keys and values */
+			if (is_command(buffer, &command) ||
+			    is_valid_capability(buffer))
+				argv_array_push(&keys, buffer);
+			else
+				die("unknown capability '%s'", buffer);
+			break;
+		case PROCESS_REQUEST_ARGS:
+			/* collect arguments for the requested command */
+			argv_array_push(&args, buffer);
+			break;
+		case PROCESS_REQUEST_DONE:
+			continue;
+		}
+	}
+
+	/*
+	 * If no command and no keys were given then the client wanted to
+	 * terminate the connection.
+	 */
+	if (!keys.argc && !args.argc)
+		return 1;
+
+	if (!command)
+		die("no command requested");
+
+	command->command(the_repository, &keys, &args);
+
+	argv_array_clear(&keys);
+	argv_array_clear(&args);
+	return 0;
+}
+
+/* Main serve loop for protocol version 2 */
+void serve(struct serve_options *options)
+{
+	if (options->advertise_capabilities || !options->stateless_rpc) {
+		/* serve by default supports v2 */
+		packet_write_fmt(1, "version 2\n");
+
+		advertise_capabilities();
+		/*
+		 * If only the list of capabilities was requested exit
+		 * immediately after advertising capabilities
+		 */
+		if (options->advertise_capabilities)
+			return;
+	}
+
+	/*
+	 * If stateless-rpc was requested then exit after
+	 * a single request/response exchange
+	 */
+	if (options->stateless_rpc) {
+		process_request();
+	} else {
+		for (;;)
+			if (process_request())
+				break;
+	}
+}
diff --git a/serve.h b/serve.h
new file mode 100644
index 000000000..fe65ba9f4
--- /dev/null
+++ b/serve.h
@@ -0,0 +1,15 @@
+#ifndef SERVE_H
+#define SERVE_H
+
+struct argv_array;
+extern int has_capability(const struct argv_array *keys, const char *capability,
+			  const char **value);
+
+struct serve_options {
+	unsigned advertise_capabilities;
+	unsigned stateless_rpc;
+};
+#define SERVE_OPTIONS_INIT { 0 }
+extern void serve(struct serve_options *options);
+
+#endif /* SERVE_H */
diff --git a/t/t5701-git-serve.sh b/t/t5701-git-serve.sh
new file mode 100755
index 000000000..b5cc049e5
--- /dev/null
+++ b/t/t5701-git-serve.sh
@@ -0,0 +1,56 @@
+#!/bin/sh
+
+test_description='test git-serve and server commands'
+
+. ./test-lib.sh
+
+test_expect_success 'test capability advertisement' '
+	cat >expect <<-EOF &&
+	version 2
+	agent=git/$(git version | cut -d" " -f3)
+	stateless-rpc
+	0000
+	EOF
+
+	git serve --advertise-capabilities >out &&
+	test-pkt-line unpack <out >actual &&
+	test_cmp actual expect
+'
+
+test_expect_success 'stateless-rpc flag does not list capabilities' '
+	test-pkt-line pack >in <<-EOF &&
+	0000
+	EOF
+	git serve --stateless-rpc >out <in &&
+	test_must_be_empty out
+'
+
+test_expect_success 'request invalid capability' '
+	test-pkt-line pack >in <<-EOF &&
+	foobar
+	0000
+	EOF
+	test_must_fail git serve --stateless-rpc 2>err <in &&
+	test_i18ngrep "unknown capability" err
+'
+
+test_expect_success 'request with no command' '
+	test-pkt-line pack >in <<-EOF &&
+	agent=git/test
+	0000
+	EOF
+	test_must_fail git serve --stateless-rpc 2>err <in &&
+	test_i18ngrep "no command requested" err
+'
+
+test_expect_success 'request invalid command' '
+	test-pkt-line pack >in <<-EOF &&
+	command=foo
+	agent=git/test
+	0000
+	EOF
+	test_must_fail git serve --stateless-rpc 2>err <in &&
+	test_i18ngrep "invalid command" err
+'
+
+test_done
-- 
2.16.0.rc1.238.g530d649a79-goog



* [PATCH v2 13/27] ls-refs: introduce ls-refs server command
  2018-01-25 23:58 ` [PATCH v2 00/27] " Brandon Williams
                     ` (11 preceding siblings ...)
  2018-01-25 23:58   ` [PATCH v2 12/27] serve: introduce git-serve Brandon Williams
@ 2018-01-25 23:58   ` Brandon Williams
  2018-01-26 22:20     ` Stefan Beller
  2018-01-25 23:58   ` [PATCH v2 14/27] connect: request remote refs using v2 Brandon Williams
                     ` (16 subsequent siblings)
  29 siblings, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-01-25 23:58 UTC (permalink / raw)
  To: git
  Cc: sbeller, gitster, peff, philipoakley, stolee, jrnieder, Brandon Williams

Introduce the ls-refs server command.  In protocol v2, the ls-refs
command is used to request the ref advertisement from the server.  Since
it is a command which can be requested (as opposed to mandatory in v1),
a client can send a number of parameters in its request to limit the ref
advertisement based on provided ref-patterns.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
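To illustrate the ls-refs output grammar added to protocol-v2.txt in this
patch, here is a rough sketch of parsing one ref line
(obj-id SP refname *(SP ref-attribute)).  It is not the parser used by git;
`struct ref_line`, `is_hex40` and `parse_ref_line` are hypothetical names, the
line is assumed to have its trailing LF already removed, and object ids are
assumed to be 40-hex-digit SHA-1 values.

```c
#include <ctype.h>
#include <string.h>

struct ref_line {
	char oid[41];
	char refname[256];
	char symref_target[256]; /* empty if no symref-target attribute */
	char peeled[41];         /* empty if no peeled attribute */
};

/* Return 1 if s[0..n) is exactly 40 hex digits (assumed SHA-1). */
static int is_hex40(const char *s, size_t n)
{
	size_t i;

	if (n != 40)
		return 0;
	for (i = 0; i < n; i++)
		if (!isxdigit((unsigned char)s[i]))
			return 0;
	return 1;
}

/* Parse a newline-chomped ref line; returns 0 on success, -1 on error. */
static int parse_ref_line(const char *line, struct ref_line *out)
{
	const char *p = line, *sp;
	size_t n;

	memset(out, 0, sizeof(*out));

	sp = strchr(p, ' ');
	if (!sp || !is_hex40(p, (size_t)(sp - p)))
		return -1;
	memcpy(out->oid, p, 40);
	p = sp + 1;

	n = strcspn(p, " ");
	if (!n || n >= sizeof(out->refname))
		return -1;
	memcpy(out->refname, p, n);
	p += n;

	/* Zero or more space-separated attributes follow the refname. */
	while (*p == ' ') {
		p++;
		n = strcspn(p, " ");
		if (!strncmp(p, "symref-target:", 14)) {
			if (n - 14 >= sizeof(out->symref_target))
				return -1;
			memcpy(out->symref_target, p + 14, n - 14);
		} else if (!strncmp(p, "peeled:", 7)) {
			if (!is_hex40(p + 7, n - 7))
				return -1;
			memcpy(out->peeled, p + 7, 40);
		}
		/* attributes not recognized here are skipped */
		p += n;
	}
	return 0;
}
```

Skipping unrecognized attributes is a design choice of this sketch, so that
attributes added later would not break an older reader.
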
 Documentation/technical/protocol-v2.txt |  32 +++++++++
 Makefile                                |   1 +
 ls-refs.c                               |  96 ++++++++++++++++++++++++++
 ls-refs.h                               |   9 +++
 serve.c                                 |   2 +
 t/t5701-git-serve.sh                    | 115 ++++++++++++++++++++++++++++++++
 6 files changed, 255 insertions(+)
 create mode 100644 ls-refs.c
 create mode 100644 ls-refs.h

diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
index 7f619a76c..4683d41ac 100644
--- a/Documentation/technical/protocol-v2.txt
+++ b/Documentation/technical/protocol-v2.txt
@@ -115,3 +115,35 @@ complete it (multiple for negotiation during fetch or no additional
 trips in the case of ls-refs).  If the client sends the `stateless-rpc`
 capability with a value of `true` (in the form `stateless-rpc=true`)
 then the invoked command must only last a single round.
+
+ ls-refs
+---------
+
+`ls-refs` is the command used to request a reference advertisement in v2.
+Unlike the current reference advertisement, ls-refs takes in parameters
+which can be used to limit the refs sent from the server.
+
+Additional features not supported in the base command will be advertised
+as the value of the command in the capability advertisement in the form
+of a space separated list of features, e.g.  "<command>=<feature 1>
+<feature 2>".
+
+ls-refs takes in the following parameters wrapped in packet-lines:
+
+    symrefs
+	In addition to the object pointed to by it, show the underlying ref
+	pointed to by it when showing a symbolic ref.
+    peel
+	Show peeled tags.
+    ref-pattern <pattern>
+	When specified, only references matching one of the provided
+	patterns are displayed.
+
+The output of ls-refs is as follows:
+
+    output = *ref
+	     flush-pkt
+    ref = PKT-LINE(obj-id SP refname *(SP ref-attribute) LF)
+    ref-attribute = (symref | peeled)
+    symref = "symref-target:" symref-target
+    peeled = "peeled:" obj-id
diff --git a/Makefile b/Makefile
index 18c255428..e50927cfb 100644
--- a/Makefile
+++ b/Makefile
@@ -825,6 +825,7 @@ LIB_OBJS += list-objects-filter-options.o
 LIB_OBJS += ll-merge.o
 LIB_OBJS += lockfile.o
 LIB_OBJS += log-tree.o
+LIB_OBJS += ls-refs.o
 LIB_OBJS += mailinfo.o
 LIB_OBJS += mailmap.o
 LIB_OBJS += match-trees.o
diff --git a/ls-refs.c b/ls-refs.c
new file mode 100644
index 000000000..70682b4f7
--- /dev/null
+++ b/ls-refs.c
@@ -0,0 +1,96 @@
+#include "cache.h"
+#include "repository.h"
+#include "refs.h"
+#include "remote.h"
+#include "argv-array.h"
+#include "ls-refs.h"
+#include "pkt-line.h"
+
+struct ls_refs_data {
+	unsigned peel;
+	unsigned symrefs;
+	struct argv_array patterns;
+};
+
+/*
+ * Check if one of the patterns matches the tail part of the ref.
+ * If no patterns were provided, all refs match.
+ */
+static int ref_match(const struct argv_array *patterns, const char *refname)
+{
+	char *pathbuf;
+	int i;
+
+	if (!patterns->argc)
+		return 1; /* no restriction */
+
+	pathbuf = xstrfmt("/%s", refname);
+	for (i = 0; i < patterns->argc; i++) {
+		if (!wildmatch(patterns->argv[i], pathbuf, 0)) {
+			free(pathbuf);
+			return 1;
+		}
+	}
+	free(pathbuf);
+	return 0;
+}
+
+static int send_ref(const char *refname, const struct object_id *oid,
+		    int flag, void *cb_data)
+{
+	struct ls_refs_data *data = cb_data;
+	const char *refname_nons = strip_namespace(refname);
+	struct strbuf refline = STRBUF_INIT;
+
+	if (!ref_match(&data->patterns, refname))
+		return 0;
+
+	strbuf_addf(&refline, "%s %s", oid_to_hex(oid), refname_nons);
+	if (data->symrefs && flag & REF_ISSYMREF) {
+		struct object_id unused;
+		const char *symref_target = resolve_ref_unsafe(refname, 0,
+							       &unused,
+							       &flag);
+
+		if (!symref_target)
+			die("'%s' is a symref but it is not?", refname);
+
+		strbuf_addf(&refline, " symref-target:%s", symref_target);
+	}
+
+	if (data->peel) {
+		struct object_id peeled;
+		if (!peel_ref(refname, &peeled))
+			strbuf_addf(&refline, " peeled:%s", oid_to_hex(&peeled));
+	}
+
+	strbuf_addch(&refline, '\n');
+	packet_write(1, refline.buf, refline.len);
+
+	strbuf_release(&refline);
+	return 0;
+}
+
+int ls_refs(struct repository *r, struct argv_array *keys, struct argv_array *args)
+{
+	int i;
+	struct ls_refs_data data = { 0, 0, ARGV_ARRAY_INIT };
+
+	for (i = 0; i < args->argc; i++) {
+		const char *arg = args->argv[i];
+		const char *out;
+
+		if (!strcmp("peel", arg))
+			data.peel = 1;
+		else if (!strcmp("symrefs", arg))
+			data.symrefs = 1;
+		else if (skip_prefix(arg, "ref-pattern ", &out))
+			argv_array_pushf(&data.patterns, "*/%s", out);
+	}
+
+	head_ref_namespaced(send_ref, &data);
+	for_each_namespaced_ref(send_ref, &data);
+	packet_flush(1);
+	argv_array_clear(&data.patterns);
+	return 0;
+}
diff --git a/ls-refs.h b/ls-refs.h
new file mode 100644
index 000000000..9e4c57bfe
--- /dev/null
+++ b/ls-refs.h
@@ -0,0 +1,9 @@
+#ifndef LS_REFS_H
+#define LS_REFS_H
+
+struct repository;
+struct argv_array;
+extern int ls_refs(struct repository *r, struct argv_array *keys,
+		   struct argv_array *args);
+
+#endif /* LS_REFS_H */
diff --git a/serve.c b/serve.c
index 90e3defe8..2f404154a 100644
--- a/serve.c
+++ b/serve.c
@@ -4,6 +4,7 @@
 #include "pkt-line.h"
 #include "version.h"
 #include "argv-array.h"
+#include "ls-refs.h"
 #include "serve.h"
 
 static int always_advertise(struct repository *r,
@@ -51,6 +52,7 @@ struct protocol_capability {
 static struct protocol_capability capabilities[] = {
 	{ "agent", agent_advertise, NULL },
 	{ "stateless-rpc", always_advertise, NULL },
+	{ "ls-refs", always_advertise, ls_refs },
 };
 
 static void advertise_capabilities(void)
diff --git a/t/t5701-git-serve.sh b/t/t5701-git-serve.sh
index b5cc049e5..debdc1b8d 100755
--- a/t/t5701-git-serve.sh
+++ b/t/t5701-git-serve.sh
@@ -9,6 +9,7 @@ test_expect_success 'test capability advertisement' '
 	version 2
 	agent=git/$(git version | cut -d" " -f3)
 	stateless-rpc
+	ls-refs
 	0000
 	EOF
 
@@ -53,4 +54,118 @@ test_expect_success 'request invalid command' '
 	test_i18ngrep "invalid command" err
 '
 
+# Test the basics of ls-refs
+#
+test_expect_success 'setup some refs and tags' '
+	test_commit one &&
+	git branch dev master &&
+	test_commit two &&
+	git symbolic-ref refs/heads/release refs/heads/master &&
+	git tag -a -m "annotated tag" annotated-tag
+'
+
+test_expect_success 'basics of ls-refs' '
+	test-pkt-line pack >in <<-EOF &&
+	command=ls-refs
+	0000
+	EOF
+
+	cat >expect <<-EOF &&
+	$(git rev-parse HEAD) HEAD
+	$(git rev-parse refs/heads/dev) refs/heads/dev
+	$(git rev-parse refs/heads/master) refs/heads/master
+	$(git rev-parse refs/heads/release) refs/heads/release
+	$(git rev-parse refs/tags/annotated-tag) refs/tags/annotated-tag
+	$(git rev-parse refs/tags/one) refs/tags/one
+	$(git rev-parse refs/tags/two) refs/tags/two
+	0000
+	EOF
+
+	git serve --stateless-rpc <in >out &&
+	test-pkt-line unpack <out >actual &&
+	test_cmp actual expect
+'
+
+test_expect_success 'basic ref-patterns' '
+	test-pkt-line pack >in <<-EOF &&
+	command=ls-refs
+	0001
+	ref-pattern master
+	ref-pattern one
+	0000
+	EOF
+
+	cat >expect <<-EOF &&
+	$(git rev-parse refs/heads/master) refs/heads/master
+	$(git rev-parse refs/tags/one) refs/tags/one
+	0000
+	EOF
+
+	git serve --stateless-rpc <in >out &&
+	test-pkt-line unpack <out >actual &&
+	test_cmp actual expect
+'
+
+test_expect_success 'wildcard ref-patterns' '
+	test-pkt-line pack >in <<-EOF &&
+	command=ls-refs
+	0001
+	ref-pattern refs/heads/*
+	0000
+	EOF
+
+	cat >expect <<-EOF &&
+	$(git rev-parse refs/heads/dev) refs/heads/dev
+	$(git rev-parse refs/heads/master) refs/heads/master
+	$(git rev-parse refs/heads/release) refs/heads/release
+	0000
+	EOF
+
+	git serve --stateless-rpc <in >out &&
+	test-pkt-line unpack <out >actual &&
+	test_cmp actual expect
+'
+
+test_expect_success 'peel parameter' '
+	test-pkt-line pack >in <<-EOF &&
+	command=ls-refs
+	0001
+	peel
+	ref-pattern refs/tags/*
+	0000
+	EOF
+
+	cat >expect <<-EOF &&
+	$(git rev-parse refs/tags/annotated-tag) refs/tags/annotated-tag peeled:$(git rev-parse refs/tags/annotated-tag^{})
+	$(git rev-parse refs/tags/one) refs/tags/one
+	$(git rev-parse refs/tags/two) refs/tags/two
+	0000
+	EOF
+
+	git serve --stateless-rpc <in >out &&
+	test-pkt-line unpack <out >actual &&
+	test_cmp actual expect
+'
+
+test_expect_success 'symrefs parameter' '
+	test-pkt-line pack >in <<-EOF &&
+	command=ls-refs
+	0001
+	symrefs
+	ref-pattern refs/heads/*
+	0000
+	EOF
+
+	cat >expect <<-EOF &&
+	$(git rev-parse refs/heads/dev) refs/heads/dev
+	$(git rev-parse refs/heads/master) refs/heads/master
+	$(git rev-parse refs/heads/release) refs/heads/release symref-target:refs/heads/master
+	0000
+	EOF
+
+	git serve --stateless-rpc <in >out &&
+	test-pkt-line unpack <out >actual &&
+	test_cmp actual expect
+'
+
 test_done
-- 
2.16.0.rc1.238.g530d649a79-goog



* [PATCH v2 14/27] connect: request remote refs using v2
  2018-01-25 23:58 ` [PATCH v2 00/27] " Brandon Williams
                     ` (12 preceding siblings ...)
  2018-01-25 23:58   ` [PATCH v2 13/27] ls-refs: introduce ls-refs server command Brandon Williams
@ 2018-01-25 23:58   ` Brandon Williams
  2018-01-31 15:22     ` Derrick Stolee
  2018-01-25 23:58   ` [PATCH v2 15/27] transport: convert get_refs_list to take a list of ref patterns Brandon Williams
                     ` (15 subsequent siblings)
  29 siblings, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-01-25 23:58 UTC (permalink / raw)
  To: git
  Cc: sbeller, gitster, peff, philipoakley, stolee, jrnieder, Brandon Williams

Teach the client to request a remote's refs using protocol
v2.  This is done by having a client issue a 'ls-refs' request to a v2
server.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
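The capability check that server_supports_v2() performs below can be sketched
in standalone C as follows.  This is only an illustration of the matching
rule (a name matches when it is followed by end of string or by '='), not the
code in this patch; `find_capability` is a hypothetical name, and unlike
has_capability() in serve.c it reports a missing value as NULL rather than as
an empty string.

```c
#include <stddef.h>
#include <string.h>

/*
 * Look up 'name' in a list of "key" or "key=value" capability strings.
 * Returns 1 if found, 0 otherwise.  If 'value' is non-NULL it is set to
 * the text after '=', or to NULL when the capability has no value.
 */
static int find_capability(const char *const *caps, size_t ncaps,
			   const char *name, const char **value)
{
	size_t i, len = strlen(name);

	for (i = 0; i < ncaps; i++) {
		const char *c = caps[i];

		if (strncmp(c, name, len))
			continue;
		/* reject prefix-only matches such as "ls" vs "ls-refs" */
		if (c[len] == '\0' || c[len] == '=') {
			if (value)
				*value = c[len] == '=' ? c + len + 1 : NULL;
			return 1;
		}
	}
	return 0;
}
```

Checking the character after the prefix is what keeps a request for "ls" from
matching an advertised "ls-refs".
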
 builtin/upload-pack.c  |  10 ++--
 connect.c              | 123 ++++++++++++++++++++++++++++++++++++++++++++++++-
 remote.h               |   4 ++
 t/t5702-protocol-v2.sh |  28 +++++++++++
 transport.c            |   2 +-
 5 files changed, 160 insertions(+), 7 deletions(-)
 create mode 100755 t/t5702-protocol-v2.sh

diff --git a/builtin/upload-pack.c b/builtin/upload-pack.c
index 8d53e9794..a757df8da 100644
--- a/builtin/upload-pack.c
+++ b/builtin/upload-pack.c
@@ -5,6 +5,7 @@
 #include "parse-options.h"
 #include "protocol.h"
 #include "upload-pack.h"
+#include "serve.h"
 
 static const char * const upload_pack_usage[] = {
 	N_("git upload-pack [<options>] <dir>"),
@@ -16,6 +17,7 @@ int cmd_upload_pack(int argc, const char **argv, const char *prefix)
 	const char *dir;
 	int strict = 0;
 	struct upload_pack_options opts = { 0 };
+	struct serve_options serve_opts = SERVE_OPTIONS_INIT;
 	struct option options[] = {
 		OPT_BOOL(0, "stateless-rpc", &opts.stateless_rpc,
 			 N_("quit after a single request/response exchange")),
@@ -48,11 +50,9 @@ int cmd_upload_pack(int argc, const char **argv, const char *prefix)
 
 	switch (determine_protocol_version_server()) {
 	case protocol_v2:
-		/*
-		 * fetch support for protocol v2 has not been implemented yet,
-		 * so ignore the request to use v2 and fallback to using v0.
-		 */
-		upload_pack(&opts);
+		serve_opts.advertise_capabilities = opts.advertise_refs;
+		serve_opts.stateless_rpc = opts.stateless_rpc;
+		serve(&serve_opts);
 		break;
 	case protocol_v1:
 		/*
diff --git a/connect.c b/connect.c
index f2157a821..3c653b65b 100644
--- a/connect.c
+++ b/connect.c
@@ -12,9 +12,11 @@
 #include "sha1-array.h"
 #include "transport.h"
 #include "strbuf.h"
+#include "version.h"
 #include "protocol.h"
 
 static char *server_capabilities;
+static struct argv_array server_capabilities_v2 = ARGV_ARRAY_INIT;
 static const char *parse_feature_value(const char *, const char *, int *);
 
 static int check_ref(const char *name, unsigned int flags)
@@ -62,6 +64,33 @@ static void die_initial_contact(int unexpected)
 		      "and the repository exists."));
 }
 
+/* Checks if the server supports the capability 'c' */
+static int server_supports_v2(const char *c, int die_on_error)
+{
+	int i;
+
+	for (i = 0; i < server_capabilities_v2.argc; i++) {
+		const char *out;
+		if (skip_prefix(server_capabilities_v2.argv[i], c, &out) &&
+		    (!*out || *out == '='))
+			return 1;
+	}
+
+	if (die_on_error)
+		die("server doesn't support '%s'", c);
+
+	return 0;
+}
+
+static void process_capabilities_v2(struct packet_reader *reader)
+{
+	while (packet_reader_read(reader) == PACKET_READ_NORMAL)
+		argv_array_push(&server_capabilities_v2, reader->line);
+
+	if (reader->status != PACKET_READ_FLUSH)
+		die("protocol error");
+}
+
 enum protocol_version discover_version(struct packet_reader *reader)
 {
 	enum protocol_version version = protocol_unknown_version;
@@ -85,7 +114,7 @@ enum protocol_version discover_version(struct packet_reader *reader)
 	/* Maybe process capabilities here, at least for v2 */
 	switch (version) {
 	case protocol_v2:
-		die("support for protocol v2 not implemented yet");
+		process_capabilities_v2(reader);
 		break;
 	case protocol_v1:
 		/* Read the peeked version line */
@@ -293,6 +322,98 @@ struct ref **get_remote_heads(struct packet_reader *reader,
 	return list;
 }
 
+static int process_ref_v2(const char *line, struct ref ***list)
+{
+	int ret = 1;
+	int i = 0;
+	struct object_id old_oid;
+	struct ref *ref;
+	struct string_list line_sections = STRING_LIST_INIT_DUP;
+
+	if (string_list_split(&line_sections, line, ' ', -1) < 2) {
+		ret = 0;
+		goto out;
+	}
+
+	if (get_oid_hex(line_sections.items[i++].string, &old_oid)) {
+		ret = 0;
+		goto out;
+	}
+
+	ref = alloc_ref(line_sections.items[i++].string);
+
+	oidcpy(&ref->old_oid, &old_oid);
+	**list = ref;
+	*list = &ref->next;
+
+	for (; i < line_sections.nr; i++) {
+		const char *arg = line_sections.items[i].string;
+		if (skip_prefix(arg, "symref-target:", &arg))
+			ref->symref = xstrdup(arg);
+
+		if (skip_prefix(arg, "peeled:", &arg)) {
+			struct object_id peeled_oid;
+			char *peeled_name;
+			struct ref *peeled;
+			if (get_oid_hex(arg, &peeled_oid)) {
+				ret = 0;
+				goto out;
+			}
+
+			peeled_name = xstrfmt("%s^{}", ref->name);
+			peeled = alloc_ref(peeled_name);
+
+			oidcpy(&peeled->old_oid, &peeled_oid);
+			**list = peeled;
+			*list = &peeled->next;
+
+			free(peeled_name);
+		}
+	}
+
+out:
+	string_list_clear(&line_sections, 0);
+	return ret;
+}
+
+struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
+			     struct ref **list, int for_push,
+			     const struct argv_array *ref_patterns)
+{
+	int i;
+	*list = NULL;
+
+	/* Check that the server supports the ls-refs command */
+	/* Issue request for ls-refs */
+	if (server_supports_v2("ls-refs", 1))
+		packet_write_fmt(fd_out, "command=ls-refs\n");
+
+	if (server_supports_v2("agent", 0))
+		packet_write_fmt(fd_out, "agent=%s", git_user_agent_sanitized());
+
+	packet_delim(fd_out);
+	/* When pushing we don't want to request the peeled tags */
+	if (!for_push)
+		packet_write_fmt(fd_out, "peel\n");
+	packet_write_fmt(fd_out, "symrefs\n");
+	for (i = 0; ref_patterns && i < ref_patterns->argc; i++) {
+		packet_write_fmt(fd_out, "ref-pattern %s\n",
+				 ref_patterns->argv[i]);
+	}
+	packet_flush(fd_out);
+
+	/* Process response from server */
+	while (packet_reader_read(reader) == PACKET_READ_NORMAL) {
+		if (!process_ref_v2(reader->line, &list))
+			die("invalid ls-refs response: %s", reader->line);
+	}
+
+	if (reader->status != PACKET_READ_FLUSH)
+		die("protocol error");
+
+	return list;
+}
+
 static const char *parse_feature_value(const char *feature_list, const char *feature, int *lenp)
 {
 	int len;
diff --git a/remote.h b/remote.h
index 2016461df..21d0c776c 100644
--- a/remote.h
+++ b/remote.h
@@ -151,10 +151,14 @@ void free_refs(struct ref *ref);
 
 struct oid_array;
 struct packet_reader;
+struct argv_array;
 extern struct ref **get_remote_heads(struct packet_reader *reader,
 				     struct ref **list, unsigned int flags,
 				     struct oid_array *extra_have,
 				     struct oid_array *shallow_points);
+extern struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
+				    struct ref **list, int for_push,
+				    const struct argv_array *ref_patterns);
 
 int resolve_remote_symref(struct ref *ref, struct ref *list);
 int ref_newer(const struct object_id *new_oid, const struct object_id *old_oid);
diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh
new file mode 100755
index 000000000..4bf4d61ac
--- /dev/null
+++ b/t/t5702-protocol-v2.sh
@@ -0,0 +1,28 @@
+#!/bin/sh
+
+test_description='test git wire-protocol version 2'
+
+TEST_NO_CREATE_REPO=1
+
+. ./test-lib.sh
+
+# Test protocol v2 with 'file://' transport
+#
+test_expect_success 'create repo to be served by file:// transport' '
+	git init file_parent &&
+	test_commit -C file_parent one
+'
+
+test_expect_success 'list refs with file:// using protocol v2' '
+	GIT_TRACE_PACKET=1 git -c protocol.version=2 \
+		ls-remote --symref "file://$(pwd)/file_parent" >actual 2>log &&
+
+	# Server responded using protocol v2
+	cat log &&
+	grep "git< version 2" log &&
+
+	git ls-remote --symref "file://$(pwd)/file_parent" >expect &&
+	test_cmp actual expect
+'
+
+test_done
diff --git a/transport.c b/transport.c
index 83d9dd1df..ffc6b2614 100644
--- a/transport.c
+++ b/transport.c
@@ -204,7 +204,7 @@ static struct ref *get_refs_via_connect(struct transport *transport, int for_pus
 	data->version = discover_version(&reader);
 	switch (data->version) {
 	case protocol_v2:
-		die("support for protocol v2 not implemented yet");
+		get_remote_refs(data->fd[1], &reader, &refs, for_push, NULL);
 		break;
 	case protocol_v1:
 	case protocol_v0:
-- 
2.16.0.rc1.238.g530d649a79-goog
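
For readers following along, the matching rule used by
server_supports_v2() in the patch above can be sketched standalone.
This is only an illustration under assumptions: has_capability() and
the flat string array are hypothetical stand-ins for the argv_array
and skip_prefix() that the patch uses. It shows why a capability name
must be followed by end-of-string or '=' to count as a match.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return 1 if 'name' is advertised in 'caps', either bare ("ls-refs")
 * or with a value ("agent=..."); return 0 otherwise.
 * Hypothetical helper, not part of git.
 */
int has_capability(const char *const *caps, size_t ncaps, const char *name)
{
	size_t len, i;

	assert(caps != NULL && name != NULL);
	len = strlen(name);

	for (i = 0; i < ncaps; i++) {
		if (strncmp(caps[i], name, len) != 0)
			continue;
		/* The prefix matched; require end-of-string or '=' next. */
		if (caps[i][len] == '\0' || caps[i][len] == '=')
			return 1;
	}
	return 0;
}
```

With an advertisement of "agent=git/2.16.0", "ls-refs" and a made-up
"fetch-extra", looking up "agent" and "ls-refs" succeeds, while "fetch"
does not, because the character after the prefix is '-'.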



* [PATCH v2 15/27] transport: convert get_refs_list to take a list of ref patterns
  2018-01-25 23:58 ` [PATCH v2 00/27] " Brandon Williams
                     ` (13 preceding siblings ...)
  2018-01-25 23:58   ` [PATCH v2 14/27] connect: request remote refs using v2 Brandon Williams
@ 2018-01-25 23:58   ` Brandon Williams
  2018-01-25 23:58   ` [PATCH v2 16/27] transport: convert transport_get_remote_refs " Brandon Williams
                     ` (14 subsequent siblings)
  29 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-01-25 23:58 UTC (permalink / raw)
  To: git
  Cc: sbeller, gitster, peff, philipoakley, stolee, jrnieder, Brandon Williams

Convert the 'struct transport' virtual function 'get_refs_list()' to
optionally take an argv_array of ref patterns.  When communicating with
a server using protocol v2, these ref patterns can be sent when
requesting a listing of its refs, allowing the server to filter the
refs it sends based on the patterns.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 transport-helper.c   |  5 +++--
 transport-internal.h |  4 +++-
 transport.c          | 16 +++++++++-------
 3 files changed, 15 insertions(+), 10 deletions(-)
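
As an illustration of what a ref pattern looks like on the wire, here
is a minimal sketch of pkt-line framing. It assumes the usual pkt-line
layout (four hex digits giving the total length, header included, then
the payload); pkt_line_format() and pkt_ref_pattern() are hypothetical
names, not git functions, and the fixed 256-byte scratch buffer is an
arbitrary choice for the sketch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Frame 'payload' as one pkt-line in 'out': four hex digits giving the
 * total length (payload plus the 4-byte header), then the payload.
 * Returns the number of bytes written (excluding the NUL), or -1 if
 * the result would not fit in 'out' or in four hex digits.
 */
int pkt_line_format(char *out, size_t outsz, const char *payload)
{
	size_t total;

	assert(out != NULL && payload != NULL);
	total = strlen(payload) + 4;
	if (total > 0xffff || total + 1 > outsz)
		return -1;
	snprintf(out, outsz, "%04zx%s", total, payload);
	return (int)total;
}

/* Frame "ref-pattern <pattern>\n", the line sent for each pattern. */
int pkt_ref_pattern(char *out, size_t outsz, const char *pattern)
{
	char line[256];
	int n;

	assert(pattern != NULL);
	n = snprintf(line, sizeof(line), "ref-pattern %s\n", pattern);
	if (n < 0 || (size_t)n >= sizeof(line))
		return -1;
	return pkt_line_format(out, outsz, line);
}
```

For the pattern "master" this produces "0017ref-pattern master\n": the
payload is 19 bytes, plus 4 for the header, which is 0x17.  Flush and
delimiter packets are the special values "0000" and "0001" and carry
no payload.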

diff --git a/transport-helper.c b/transport-helper.c
index 508015023..4c334b5ee 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -1026,7 +1026,8 @@ static int has_attribute(const char *attrs, const char *attr) {
 	}
 }
 
-static struct ref *get_refs_list(struct transport *transport, int for_push)
+static struct ref *get_refs_list(struct transport *transport, int for_push,
+				 const struct argv_array *ref_patterns)
 {
 	struct helper_data *data = transport->data;
 	struct child_process *helper;
@@ -1039,7 +1040,7 @@ static struct ref *get_refs_list(struct transport *transport, int for_push)
 
 	if (process_connect(transport, for_push)) {
 		do_take_over(transport);
-		return transport->vtable->get_refs_list(transport, for_push);
+		return transport->vtable->get_refs_list(transport, for_push, ref_patterns);
 	}
 
 	if (data->push && for_push)
diff --git a/transport-internal.h b/transport-internal.h
index 3c1a29d72..a67657ce3 100644
--- a/transport-internal.h
+++ b/transport-internal.h
@@ -3,6 +3,7 @@
 
 struct ref;
 struct transport;
+struct argv_array;
 
 struct transport_vtable {
 	/**
@@ -21,7 +22,8 @@ struct transport_vtable {
 	 * the ref without a huge amount of effort, it should store it
 	 * in the ref's old_sha1 field; otherwise it should be all 0.
 	 **/
-	struct ref *(*get_refs_list)(struct transport *transport, int for_push);
+	struct ref *(*get_refs_list)(struct transport *transport, int for_push,
+				     const struct argv_array *ref_patterns);
 
 	/**
 	 * Fetch the objects for the given refs. Note that this gets
diff --git a/transport.c b/transport.c
index ffc6b2614..c54a44630 100644
--- a/transport.c
+++ b/transport.c
@@ -72,7 +72,7 @@ struct bundle_transport_data {
 	struct bundle_header header;
 };
 
-static struct ref *get_refs_from_bundle(struct transport *transport, int for_push)
+static struct ref *get_refs_from_bundle(struct transport *transport, int for_push, const struct argv_array *ref_patterns)
 {
 	struct bundle_transport_data *data = transport->data;
 	struct ref *result = NULL;
@@ -189,7 +189,8 @@ static int connect_setup(struct transport *transport, int for_push)
 	return 0;
 }
 
-static struct ref *get_refs_via_connect(struct transport *transport, int for_push)
+static struct ref *get_refs_via_connect(struct transport *transport, int for_push,
+					const struct argv_array *ref_patterns)
 {
 	struct git_transport_data *data = transport->data;
 	struct ref *refs = NULL;
@@ -204,7 +205,8 @@ static struct ref *get_refs_via_connect(struct transport *transport, int for_pus
 	data->version = discover_version(&reader);
 	switch (data->version) {
 	case protocol_v2:
-		get_remote_refs(data->fd[1], &reader, &refs, for_push, NULL);
+		get_remote_refs(data->fd[1], &reader, &refs, for_push,
+				ref_patterns);
 		break;
 	case protocol_v1:
 	case protocol_v0:
@@ -250,7 +252,7 @@ static int fetch_refs_via_pack(struct transport *transport,
 	args.update_shallow = data->options.update_shallow;
 
 	if (!data->got_remote_heads)
-		refs_tmp = get_refs_via_connect(transport, 0);
+		refs_tmp = get_refs_via_connect(transport, 0, NULL);
 
 	switch (data->version) {
 	case protocol_v2:
@@ -568,7 +570,7 @@ static int git_transport_push(struct transport *transport, struct ref *remote_re
 	int ret = 0;
 
 	if (!data->got_remote_heads)
-		get_refs_via_connect(transport, 1);
+		get_refs_via_connect(transport, 1, NULL);
 
 	memset(&args, 0, sizeof(args));
 	args.send_mirror = !!(flags & TRANSPORT_PUSH_MIRROR);
@@ -1028,7 +1030,7 @@ int transport_push(struct transport *transport,
 		if (check_push_refs(local_refs, refspec_nr, refspec) < 0)
 			return -1;
 
-		remote_refs = transport->vtable->get_refs_list(transport, 1);
+		remote_refs = transport->vtable->get_refs_list(transport, 1, NULL);
 
 		if (flags & TRANSPORT_PUSH_ALL)
 			match_flags |= MATCH_REFS_ALL;
@@ -1137,7 +1139,7 @@ int transport_push(struct transport *transport,
 const struct ref *transport_get_remote_refs(struct transport *transport)
 {
 	if (!transport->got_remote_refs) {
-		transport->remote_refs = transport->vtable->get_refs_list(transport, 0);
+		transport->remote_refs = transport->vtable->get_refs_list(transport, 0, NULL);
 		transport->got_remote_refs = 1;
 	}
 
-- 
2.16.0.rc1.238.g530d649a79-goog



* [PATCH v2 16/27] transport: convert transport_get_remote_refs to take a list of ref patterns
  2018-01-25 23:58 ` [PATCH v2 00/27] " Brandon Williams
                     ` (14 preceding siblings ...)
  2018-01-25 23:58   ` [PATCH v2 15/27] transport: convert get_refs_list to take a list of ref patterns Brandon Williams
@ 2018-01-25 23:58   ` Brandon Williams
  2018-01-25 23:58   ` [PATCH v2 17/27] ls-remote: pass ref patterns when requesting a remote's refs Brandon Williams
                     ` (13 subsequent siblings)
  29 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-01-25 23:58 UTC (permalink / raw)
  To: git
  Cc: sbeller, gitster, peff, philipoakley, stolee, jrnieder, Brandon Williams

Convert 'transport_get_remote_refs()' to optionally take a list of ref
patterns.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 builtin/clone.c     | 2 +-
 builtin/fetch.c     | 4 ++--
 builtin/ls-remote.c | 2 +-
 builtin/remote.c    | 2 +-
 transport.c         | 7 +++++--
 transport.h         | 3 ++-
 6 files changed, 12 insertions(+), 8 deletions(-)
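
One consequence of the change to transport_get_remote_refs() below is
worth spelling out: because the result is cached behind
'got_remote_refs', only the patterns passed on the first call reach
the remote.  A rough sketch of that behaviour, using a hypothetical
'struct mini_transport' and a fake backend in place of the real vtable:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the 'struct transport' fields involved. */
struct mini_transport {
	const char *remote_refs;	/* stands in for the ref list */
	int got_remote_refs;
	int requests;			/* times the remote was asked */
};

/* Fake backend: "returns" the pattern it was asked for. */
static const char *fake_get_refs_list(struct mini_transport *t,
				      const char *pattern)
{
	t->requests++;
	return pattern ? pattern : "(all refs)";
}

/* Same shape as transport_get_remote_refs(): ask once, then cache. */
const char *mini_get_remote_refs(struct mini_transport *t,
				 const char *pattern)
{
	assert(t != NULL);
	if (!t->got_remote_refs) {
		t->remote_refs = fake_get_refs_list(t, pattern);
		t->got_remote_refs = 1;
	}
	return t->remote_refs;
}
```

Calling mini_get_remote_refs() first with "master" and then with
"refs/tags/*" returns the "master" result both times and contacts the
fake remote only once, so callers that need different patterns on the
same transport would have to account for that.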

diff --git a/builtin/clone.c b/builtin/clone.c
index 284651797..6e77d993f 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -1121,7 +1121,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	if (transport->smart_options && !deepen)
 		transport->smart_options->check_self_contained_and_connected = 1;
 
-	refs = transport_get_remote_refs(transport);
+	refs = transport_get_remote_refs(transport, NULL);
 
 	if (refs) {
 		mapped_refs = wanted_peer_refs(refs, refspec);
diff --git a/builtin/fetch.c b/builtin/fetch.c
index 7bbcd26fa..850382f55 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -250,7 +250,7 @@ static void find_non_local_tags(struct transport *transport,
 	struct string_list_item *item = NULL;
 
 	for_each_ref(add_existing, &existing_refs);
-	for (ref = transport_get_remote_refs(transport); ref; ref = ref->next) {
+	for (ref = transport_get_remote_refs(transport, NULL); ref; ref = ref->next) {
 		if (!starts_with(ref->name, "refs/tags/"))
 			continue;
 
@@ -336,7 +336,7 @@ static struct ref *get_ref_map(struct transport *transport,
 	/* opportunistically-updated references: */
 	struct ref *orefs = NULL, **oref_tail = &orefs;
 
-	const struct ref *remote_refs = transport_get_remote_refs(transport);
+	const struct ref *remote_refs = transport_get_remote_refs(transport, NULL);
 
 	if (refspec_count) {
 		struct refspec *fetch_refspec;
diff --git a/builtin/ls-remote.c b/builtin/ls-remote.c
index c4be98ab9..c6e9847c5 100644
--- a/builtin/ls-remote.c
+++ b/builtin/ls-remote.c
@@ -96,7 +96,7 @@ int cmd_ls_remote(int argc, const char **argv, const char *prefix)
 	if (uploadpack != NULL)
 		transport_set_option(transport, TRANS_OPT_UPLOADPACK, uploadpack);
 
-	ref = transport_get_remote_refs(transport);
+	ref = transport_get_remote_refs(transport, NULL);
 	if (transport_disconnect(transport))
 		return 1;
 
diff --git a/builtin/remote.c b/builtin/remote.c
index d95bf904c..d0b6ff6e2 100644
--- a/builtin/remote.c
+++ b/builtin/remote.c
@@ -862,7 +862,7 @@ static int get_remote_ref_states(const char *name,
 	if (query) {
 		transport = transport_get(states->remote, states->remote->url_nr > 0 ?
 			states->remote->url[0] : NULL);
-		remote_refs = transport_get_remote_refs(transport);
+		remote_refs = transport_get_remote_refs(transport, NULL);
 		transport_disconnect(transport);
 
 		states->queried = 1;
diff --git a/transport.c b/transport.c
index c54a44630..dfc603b36 100644
--- a/transport.c
+++ b/transport.c
@@ -1136,10 +1136,13 @@ int transport_push(struct transport *transport,
 	return 1;
 }
 
-const struct ref *transport_get_remote_refs(struct transport *transport)
+const struct ref *transport_get_remote_refs(struct transport *transport,
+					    const struct argv_array *ref_patterns)
 {
 	if (!transport->got_remote_refs) {
-		transport->remote_refs = transport->vtable->get_refs_list(transport, 0, NULL);
+		transport->remote_refs =
+			transport->vtable->get_refs_list(transport, 0,
+							 ref_patterns);
 		transport->got_remote_refs = 1;
 	}
 
diff --git a/transport.h b/transport.h
index 731c78b67..4b656f315 100644
--- a/transport.h
+++ b/transport.h
@@ -178,7 +178,8 @@ int transport_push(struct transport *connection,
 		   int refspec_nr, const char **refspec, int flags,
 		   unsigned int * reject_reasons);
 
-const struct ref *transport_get_remote_refs(struct transport *transport);
+const struct ref *transport_get_remote_refs(struct transport *transport,
+					    const struct argv_array *ref_patterns);
 
 int transport_fetch_refs(struct transport *transport, struct ref *refs);
 void transport_unlock_pack(struct transport *transport);
-- 
2.16.0.rc1.238.g530d649a79-goog



* [PATCH v2 17/27] ls-remote: pass ref patterns when requesting a remote's refs
  2018-01-25 23:58 ` [PATCH v2 00/27] " Brandon Williams
                     ` (15 preceding siblings ...)
  2018-01-25 23:58   ` [PATCH v2 16/27] transport: convert transport_get_remote_refs " Brandon Williams
@ 2018-01-25 23:58   ` Brandon Williams
  2018-01-25 23:58   ` [PATCH v2 18/27] fetch: pass ref patterns when fetching Brandon Williams
                     ` (12 subsequent siblings)
  29 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-01-25 23:58 UTC (permalink / raw)
  To: git
  Cc: sbeller, gitster, peff, philipoakley, stolee, jrnieder, Brandon Williams

Construct an argv_array of the ref patterns supplied via the command
line and pass them to 'transport_get_remote_refs()' to be used when
communicating using protocol v2 so that the server can limit the ref
advertisement based on the supplied patterns.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 builtin/ls-remote.c    | 7 +++++--
 t/t5702-protocol-v2.sh | 8 ++++++++
 2 files changed, 13 insertions(+), 2 deletions(-)
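
Note that ls-remote still keeps its own "*/<pattern>" list next to the
new argv_array, so the client can keep filtering locally whatever the
server sends.  The sketch below shows the idea of that tail match.  It
is an approximation under assumptions: POSIX fnmatch() stands in for
git's wildmatch(), the leading "/" added to the ref name is assumed
(the matching code is not part of this patch), and tail_matches() is a
hypothetical name.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stdio.h>

/*
 * Return 1 if 'refname' ends in the path component(s) named by
 * 'pattern'.  The pattern gets a star and a slash prepended and the
 * ref name gets a slash prepended before the two are compared.
 */
int tail_matches(const char *refname, const char *pattern)
{
	char pat[256], path[256];
	int n;

	assert(refname != NULL && pattern != NULL);

	n = snprintf(pat, sizeof(pat), "*/%s", pattern);
	if (n < 0 || (size_t)n >= sizeof(pat))
		return 0;
	n = snprintf(path, sizeof(path), "/%s", refname);
	if (n < 0 || (size_t)n >= sizeof(path))
		return 0;

	/* Flags are 0, so the star may also match '/' characters. */
	return fnmatch(pat, path, 0) == 0;
}
```

With this, "master" matches "refs/heads/master" but neither
"refs/heads/notmaster" nor "refs/tags/v1.0".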

diff --git a/builtin/ls-remote.c b/builtin/ls-remote.c
index c6e9847c5..caf1051f3 100644
--- a/builtin/ls-remote.c
+++ b/builtin/ls-remote.c
@@ -43,6 +43,7 @@ int cmd_ls_remote(int argc, const char **argv, const char *prefix)
 	int show_symref_target = 0;
 	const char *uploadpack = NULL;
 	const char **pattern = NULL;
+	struct argv_array ref_patterns = ARGV_ARRAY_INIT;
 
 	struct remote *remote;
 	struct transport *transport;
@@ -74,8 +75,10 @@ int cmd_ls_remote(int argc, const char **argv, const char *prefix)
 	if (argc > 1) {
 		int i;
 		pattern = xcalloc(argc, sizeof(const char *));
-		for (i = 1; i < argc; i++)
+		for (i = 1; i < argc; i++) {
 			pattern[i - 1] = xstrfmt("*/%s", argv[i]);
+			argv_array_push(&ref_patterns, argv[i]);
+		}
 	}
 
 	remote = remote_get(dest);
@@ -96,7 +99,7 @@ int cmd_ls_remote(int argc, const char **argv, const char *prefix)
 	if (uploadpack != NULL)
 		transport_set_option(transport, TRANS_OPT_UPLOADPACK, uploadpack);
 
-	ref = transport_get_remote_refs(transport, NULL);
+	ref = transport_get_remote_refs(transport, &ref_patterns);
 	if (transport_disconnect(transport))
 		return 1;
 
diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh
index 4bf4d61ac..7d8aeb766 100755
--- a/t/t5702-protocol-v2.sh
+++ b/t/t5702-protocol-v2.sh
@@ -25,4 +25,12 @@ test_expect_success 'list refs with file:// using protocol v2' '
 	test_cmp actual expect
 '
 
+test_expect_success 'ref advertisement is filtered with ls-remote using protocol v2' '
+	GIT_TRACE_PACKET=1 git -c protocol.version=2 \
+		ls-remote "file://$(pwd)/file_parent" master 2>log &&
+
+	grep "ref-pattern master" log &&
+	! grep "refs/tags/" log
+'
+
 test_done
-- 
2.16.0.rc1.238.g530d649a79-goog



* [PATCH v2 18/27] fetch: pass ref patterns when fetching
  2018-01-25 23:58 ` [PATCH v2 00/27] " Brandon Williams
                     ` (16 preceding siblings ...)
  2018-01-25 23:58   ` [PATCH v2 17/27] ls-remote: pass ref patterns when requesting a remote's refs Brandon Williams
@ 2018-01-25 23:58   ` Brandon Williams
  2018-01-25 23:58   ` [PATCH v2 19/27] push: pass ref patterns when pushing Brandon Williams
                     ` (11 subsequent siblings)
  29 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-01-25 23:58 UTC (permalink / raw)
  To: git
  Cc: sbeller, gitster, peff, philipoakley, stolee, jrnieder, Brandon Williams

Construct a list of ref patterns to be passed to
'transport_get_remote_refs()' from the refspec to be used during the
fetch.  This list of ref patterns will be used to allow the server to
filter the ref advertisement when communicating using protocol v2.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 builtin/fetch.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index 850382f55..8128450bf 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -332,11 +332,21 @@ static struct ref *get_ref_map(struct transport *transport,
 	struct ref *rm;
 	struct ref *ref_map = NULL;
 	struct ref **tail = &ref_map;
+	struct argv_array ref_patterns = ARGV_ARRAY_INIT;
 
 	/* opportunistically-updated references: */
 	struct ref *orefs = NULL, **oref_tail = &orefs;
 
-	const struct ref *remote_refs = transport_get_remote_refs(transport, NULL);
+	const struct ref *remote_refs;
+
+	for (i = 0; i < refspec_count; i++) {
+		if (!refspecs[i].exact_sha1)
+			argv_array_push(&ref_patterns, refspecs[i].src);
+	}
+
+	remote_refs = transport_get_remote_refs(transport, &ref_patterns);
+
+	argv_array_clear(&ref_patterns);
 
 	if (refspec_count) {
 		struct refspec *fetch_refspec;
-- 
2.16.0.rc1.238.g530d649a79-goog



* [PATCH v2 19/27] push: pass ref patterns when pushing
  2018-01-25 23:58 ` [PATCH v2 00/27] " Brandon Williams
                     ` (17 preceding siblings ...)
  2018-01-25 23:58   ` [PATCH v2 18/27] fetch: pass ref patterns when fetching Brandon Williams
@ 2018-01-25 23:58   ` Brandon Williams
  2018-01-25 23:58   ` [PATCH v2 20/27] upload-pack: introduce fetch server command Brandon Williams
                     ` (10 subsequent siblings)
  29 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-01-25 23:58 UTC (permalink / raw)
  To: git
  Cc: sbeller, gitster, peff, philipoakley, stolee, jrnieder, Brandon Williams

Construct a list of ref patterns to be passed to 'get_refs_list()' from
the refspec to be used during the push.  This list of ref patterns will
be used to allow the server to filter the ref advertisement when
communicating using protocol v2.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 transport.c | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)
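
The selection rule in the loop below (prefer the destination, fall
back to the source unless it is a raw object id) can be summarised in
a few lines.  This is a sketch only: 'struct mini_refspec' is a
hypothetical, reduced view of a parsed push refspec, and
push_ref_pattern() is not a git function.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced view of a parsed push refspec. */
struct mini_refspec {
	const char *src;
	const char *dst;
	int exact_sha1;	/* src is a raw object id, not a ref name */
};

/* Return the name to send as a ref pattern, or NULL to send none. */
const char *push_ref_pattern(const struct mini_refspec *rs)
{
	assert(rs != NULL);
	if (rs->dst)
		return rs->dst;
	if (rs->src && !rs->exact_sha1)
		return rs->src;
	return NULL;
}
```

So "topic:master" contributes "master", a bare "topic" contributes
"topic", and a bare object id contributes nothing, since an object id
is not a ref name the server could match against.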

diff --git a/transport.c b/transport.c
index dfc603b36..6ea3905e3 100644
--- a/transport.c
+++ b/transport.c
@@ -1026,11 +1026,26 @@ int transport_push(struct transport *transport,
 		int porcelain = flags & TRANSPORT_PUSH_PORCELAIN;
 		int pretend = flags & TRANSPORT_PUSH_DRY_RUN;
 		int push_ret, ret, err;
+		struct refspec *tmp_rs;
+		struct argv_array ref_patterns = ARGV_ARRAY_INIT;
+		int i;
 
 		if (check_push_refs(local_refs, refspec_nr, refspec) < 0)
 			return -1;
 
-		remote_refs = transport->vtable->get_refs_list(transport, 1, NULL);
+		tmp_rs = parse_push_refspec(refspec_nr, refspec);
+		for (i = 0; i < refspec_nr; i++) {
+			if (tmp_rs[i].dst)
+				argv_array_push(&ref_patterns, tmp_rs[i].dst);
+			else if (tmp_rs[i].src && !tmp_rs[i].exact_sha1)
+				argv_array_push(&ref_patterns, tmp_rs[i].src);
+		}
+
+		remote_refs = transport->vtable->get_refs_list(transport, 1,
+							       &ref_patterns);
+
+		argv_array_clear(&ref_patterns);
+		free_refspec(refspec_nr, tmp_rs);
 
 		if (flags & TRANSPORT_PUSH_ALL)
 			match_flags |= MATCH_REFS_ALL;
-- 
2.16.0.rc1.238.g530d649a79-goog



* [PATCH v2 20/27] upload-pack: introduce fetch server command
  2018-01-25 23:58 ` [PATCH v2 00/27] " Brandon Williams
                     ` (18 preceding siblings ...)
  2018-01-25 23:58   ` [PATCH v2 19/27] push: pass ref patterns when pushing Brandon Williams
@ 2018-01-25 23:58   ` Brandon Williams
  2018-01-25 23:58   ` [PATCH v2 21/27] fetch-pack: perform a fetch using v2 Brandon Williams
                     ` (9 subsequent siblings)
  29 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-01-25 23:58 UTC (permalink / raw)
  To: git
  Cc: sbeller, gitster, peff, philipoakley, stolee, jrnieder, Brandon Williams

Introduce the 'fetch' server command.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 Documentation/technical/protocol-v2.txt | 121 +++++++++++++
 serve.c                                 |   2 +
 t/t5701-git-serve.sh                    |   1 +
 upload-pack.c                           | 293 ++++++++++++++++++++++++++++++++
 upload-pack.h                           |   5 +
 5 files changed, 422 insertions(+)
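
To make the acknowledgments grammar documented below a little more
concrete, here is a minimal sketch of how a client might classify the
lines of that section.  It assumes the trailing LF has already been
chomped and does not validate the object id; 'enum ack_kind' and
classify_ack_line() are hypothetical names, not part of this series.

```c
#include <assert.h>
#include <string.h>

enum ack_kind { ACK_INVALID, ACK_NAK, ACK_ACK, ACK_READY };

/*
 * Classify one line of the acknowledgments section.  For an
 * "ACK <obj-id>" line, '*oid' is set to point at the object id inside
 * 'line'; otherwise it is set to NULL.
 */
enum ack_kind classify_ack_line(const char *line, const char **oid)
{
	assert(line != NULL && oid != NULL);
	*oid = NULL;

	if (!strcmp(line, "NAK"))
		return ACK_NAK;
	if (!strcmp(line, "ready"))
		return ACK_READY;
	if (!strncmp(line, "ACK ", 4) && line[4] != '\0') {
		*oid = line + 4;
		return ACK_ACK;
	}
	return ACK_INVALID;
}
```

A caller would read pkt-lines after the "acknowledgments" header and
stop at the first delimiter or flush packet; seeing "ready" tells it
that a packfile section follows in the same response.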

diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
index 4683d41ac..ca09a2cfe 100644
--- a/Documentation/technical/protocol-v2.txt
+++ b/Documentation/technical/protocol-v2.txt
@@ -147,3 +147,124 @@ The output of ls-refs is as follows:
     ref-attribute = (symref | peeled)
     symref = "symref-target:" symref-target
     peeled = "peeled:" obj-id
+
+ fetch
+-------
+
+`fetch` is the command used to fetch a packfile in v2.  It can be looked
+at as a modified version of the v1 fetch where the ref-advertisement is
+stripped out (since the `ls-refs` command fills that role) and the
+message format is tweaked to eliminate redundancies and permit easy
+addition of future extensions.
+
+Additional features not supported in the base command will be advertised
+as the value of the command in the capability advertisement in the form
+of a space separated list of features, e.g.  "<command>=<feature 1>
+<feature 2>".
+
+A `fetch` request can take the following parameters wrapped in
+packet-lines:
+
+    want <oid>
+	Indicates to the server an object which the client wants to
+	retrieve.
+
+    have <oid>
+	Indicates to the server an object which the client has locally.
+	This allows the server to make a packfile which only contains
+	the objects that the client needs. Multiple 'have' lines can be
+	supplied.
+
+    done
+	Indicates to the server that negotiation should terminate (or
+	not even begin if performing a clone) and that the server should
+	use the information supplied in the request to construct the
+	packfile.
+
+    thin-pack
+	Request that a thin pack be sent, which is a pack with deltas
+	which reference base objects not contained within the pack (but
+	are known to exist at the receiving end). This can reduce the
+	network traffic significantly, but it requires the receiving end
+	to know how to "thicken" these packs by adding the missing bases
+	to the pack.
+
+    no-progress
+	Request that progress information that would normally be sent on
+	side-band channel 2, during the packfile transfer, should not be
+	sent.  However, the side-band channel 3 is still used for error
+	responses.
+
+    include-tag
+	Request that annotated tags should be sent if the objects they
+	point to are being sent.
+
+    ofs-delta
+	Indicate that the client understands PACKv2 with delta referring
+	to its base by position in pack rather than by an oid.  That is,
+	they can read OBJ_OFS_DELTA (aka type 6) in a packfile.
+
+The response of `fetch` is broken into a number of sections separated by
+delimiter packets (0001), with each section beginning with its section
+header.
+
+    output = *section
+    section = (acknowledgments | packfile)
+	      (flush-pkt | delim-pkt)
+
+    acknowledgments = PKT-LINE("acknowledgments" LF)
+		      *(ready | nak | ack)
+    ready = PKT-LINE("ready" LF)
+    nak = PKT-LINE("NAK" LF)
+    ack = PKT-LINE("ACK" SP obj-id LF)
+
+    packfile = PKT-LINE("packfile" LF)
+	       [PACKFILE]
+
+----
+    acknowledgments section
+	* Always begins with the section header "acknowledgments"
+
+	* The server will respond with "NAK" if none of the object ids sent
+	  as have lines were common.
+
+	* The server will respond with "ACK obj-id" for all of the
+	  object ids sent as have lines which are common.
+
+	* A response cannot have both "ACK" lines as well as a "NAK"
+	  line.
+
+	* The server will respond with a "ready" line indicating that
+	  the server has found an acceptable common base and is ready to
+	  make and send a packfile (which will be found in the packfile
+	  section of the same response)
+
+	* If the client determines that it is finished with negotiations
+	  by sending a "done" line, the acknowledgments section can be
+	  omitted from the server's response as an optimization.
+
+	* If the server has found a suitable cut point and has decided
+	  to send a "ready" line, then the server can decide to (as an
+	  optimization) omit any "ACK" lines it would have sent during
+	  its response.  This is because the server will have already
+	  determined the objects it plans to send to the client and no
+	  further negotiation is needed.
+
+----
+    packfile section
+	* Always begins with the section header "packfile"
+
+	* The transmission of the packfile begins immediately after the
+	  section header
+
+	* The data transfer of the packfile is always multiplexed, using
+	  the same semantics as the 'side-band-64k' capability from
+	  protocol version 1.  This means that each packet, during the
+	  packfile data stream, is made up of a leading 4-byte pkt-line
+	  length (typical of the pkt-line format), followed by a 1-byte
+	  stream code, followed by the actual data.
+
+	  The stream code can be one of:
+		1 - pack data
+		2 - progress messages
+		3 - fatal error message just before stream aborts
diff --git a/serve.c b/serve.c
index 2f404154a..e0235e2bc 100644
--- a/serve.c
+++ b/serve.c
@@ -6,6 +6,7 @@
 #include "argv-array.h"
 #include "ls-refs.h"
 #include "serve.h"
+#include "upload-pack.h"
 
 static int always_advertise(struct repository *r,
 			    struct strbuf *value)
@@ -53,6 +54,7 @@ static struct protocol_capability capabilities[] = {
 	{ "agent", agent_advertise, NULL },
 	{ "stateless-rpc", always_advertise, NULL },
 	{ "ls-refs", always_advertise, ls_refs },
+	{ "fetch", always_advertise, upload_pack_v2 },
 };
 
 static void advertise_capabilities(void)
diff --git a/t/t5701-git-serve.sh b/t/t5701-git-serve.sh
index debdc1b8d..e3bc08667 100755
--- a/t/t5701-git-serve.sh
+++ b/t/t5701-git-serve.sh
@@ -10,6 +10,7 @@ test_expect_success 'test capability advertisement' '
 	agent=git/$(git version | cut -d" " -f3)
 	stateless-rpc
 	ls-refs
+	fetch
 	0000
 	EOF
 
diff --git a/upload-pack.c b/upload-pack.c
index 42d83d5b1..f7944ffdc 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -18,6 +18,7 @@
 #include "prio-queue.h"
 #include "protocol.h"
 #include "upload-pack.h"
+#include "serve.h"
 
 /* Remember to update object flag allocation in object.h */
 #define THEY_HAVE	(1u << 11)
@@ -1065,3 +1066,295 @@ void upload_pack(struct upload_pack_options *options)
 		create_pack_file();
 	}
 }
+
+struct upload_pack_data {
+	struct object_array wants;
+	struct oid_array haves;
+
+	unsigned stateless_rpc : 1;
+
+	unsigned use_thin_pack : 1;
+	unsigned use_ofs_delta : 1;
+	unsigned no_progress : 1;
+	unsigned use_include_tag : 1;
+	unsigned done : 1;
+};
+
+#define UPLOAD_PACK_DATA_INIT { OBJECT_ARRAY_INIT, OID_ARRAY_INIT, 0, 0, 0, 0, 0, 0 }
+
+static void upload_pack_data_clear(struct upload_pack_data *data)
+{
+	object_array_clear(&data->wants);
+	oid_array_clear(&data->haves);
+}
+
+static int parse_want(const char *line)
+{
+	const char *arg;
+	if (skip_prefix(line, "want ", &arg)) {
+		struct object_id oid;
+		struct object *o;
+
+		if (get_oid_hex(arg, &oid))
+			die("git upload-pack: protocol error, "
+			    "expected to get oid, not '%s'", line);
+
+		o = parse_object(&oid);
+		if (!o) {
+			packet_write_fmt(1,
+					 "ERR upload-pack: not our ref %s",
+					 oid_to_hex(&oid));
+			die("git upload-pack: not our ref %s",
+			    oid_to_hex(&oid));
+		}
+
+		if (!(o->flags & WANTED)) {
+			o->flags |= WANTED;
+			add_object_array(o, NULL, &want_obj);
+		}
+
+		return 1;
+	}
+
+	return 0;
+}
+
+static int parse_have(const char *line, struct oid_array *haves)
+{
+	const char *arg;
+	if (skip_prefix(line, "have ", &arg)) {
+		struct object_id oid;
+
+		if (get_oid_hex(arg, &oid))
+			die("git upload-pack: expected SHA1 object, got '%s'", arg);
+		oid_array_append(haves, &oid);
+		return 1;
+	}
+
+	return 0;
+}
+
+static void process_args(struct argv_array *args, struct upload_pack_data *data)
+{
+	int i;
+
+	for (i = 0; i < args->argc; i++) {
+		const char *arg = args->argv[i];
+
+		/* process want */
+		if (parse_want(arg))
+			continue;
+		/* process have line */
+		if (parse_have(arg, &data->haves))
+			continue;
+
+		/* process args like thin-pack */
+		if (!strcmp(arg, "thin-pack")) {
+			use_thin_pack = 1;
+			continue;
+		}
+		if (!strcmp(arg, "ofs-delta")) {
+			use_ofs_delta = 1;
+			continue;
+		}
+		if (!strcmp(arg, "no-progress")) {
+			no_progress = 1;
+			continue;
+		}
+		if (!strcmp(arg, "include-tag")) {
+			use_include_tag = 1;
+			continue;
+		}
+		if (!strcmp(arg, "done")) {
+			data->done = 1;
+			continue;
+		}
+
+		/* ignore unknown lines maybe? */
+		die("unexpected line: '%s'", arg);
+	}
+}
+
+static void read_haves(struct upload_pack_data *data)
+{
+	struct packet_reader reader;
+	packet_reader_init(&reader, 0, NULL, 0,
+			   PACKET_READ_CHOMP_NEWLINE);
+
+	while (packet_reader_read(&reader) == PACKET_READ_NORMAL) {
+
+		if (parse_have(reader.line, &data->haves))
+			continue;
+		if (!strcmp(reader.line, "done")) {
+			data->done = 1;
+			continue;
+		}
+	}
+	if (reader.status != PACKET_READ_FLUSH)
+		die("ERROR");
+}
+
+static int process_haves(struct oid_array *haves, struct oid_array *common)
+{
+	int i;
+
+	/* Process haves */
+	for (i = 0; i < haves->nr; i++) {
+		const struct object_id *oid = &haves->oid[i];
+		struct object *o;
+		int we_knew_they_have = 0;
+
+		if (!has_object_file(oid))
+			continue;
+
+		oid_array_append(common, oid);
+
+		o = parse_object(oid);
+		if (!o)
+			die("oops (%s)", oid_to_hex(oid));
+		if (o->type == OBJ_COMMIT) {
+			struct commit_list *parents;
+			struct commit *commit = (struct commit *)o;
+			if (o->flags & THEY_HAVE)
+				we_knew_they_have = 1;
+			else
+				o->flags |= THEY_HAVE;
+			if (!oldest_have || (commit->date < oldest_have))
+				oldest_have = commit->date;
+			for (parents = commit->parents;
+			     parents;
+			     parents = parents->next)
+				parents->item->object.flags |= THEY_HAVE;
+		}
+		if (!we_knew_they_have)
+			add_object_array(o, NULL, &have_obj);
+	}
+
+	return 0;
+}
+
+static int send_acks(struct oid_array *acks, struct strbuf *response)
+{
+	int i;
+
+	packet_buf_write(response, "acknowledgments\n");
+
+	/* Send Acks */
+	if (!acks->nr)
+		packet_buf_write(response, "NAK\n");
+
+	for (i = 0; i < acks->nr; i++) {
+		packet_buf_write(response, "ACK %s\n",
+				 oid_to_hex(&acks->oid[i]));
+	}
+
+	if (ok_to_give_up()) {
+		/* Send Ready */
+		packet_buf_write(response, "ready\n");
+		return 1;
+	}
+
+	return 0;
+}
+
+static int process_haves_and_send_acks(struct upload_pack_data *data)
+{
+	struct oid_array common = OID_ARRAY_INIT;
+	struct strbuf response = STRBUF_INIT;
+	int ret = 0;
+
+	process_haves(&data->haves, &common);
+	if (data->done) {
+		ret = 1;
+	} else if (send_acks(&common, &response)) {
+		packet_buf_delim(&response);
+		ret = 1;
+	} else {
+		/* Add Flush */
+		packet_buf_flush(&response);
+		ret = 0;
+	}
+
+	/* Send response */
+	write_or_die(1, response.buf, response.len);
+	strbuf_release(&response);
+
+	oid_array_clear(&data->haves);
+	oid_array_clear(&common);
+	return ret;
+}
+
+enum fetch_state {
+	FETCH_PROCESS_ARGS = 0,
+	FETCH_READ_HAVES,
+	FETCH_SEND_ACKS,
+	FETCH_SEND_PACK,
+	FETCH_DONE,
+};
+
+int upload_pack_v2(struct repository *r, struct argv_array *keys,
+		   struct argv_array *args)
+{
+	enum fetch_state state = FETCH_PROCESS_ARGS;
+	struct upload_pack_data data = UPLOAD_PACK_DATA_INIT;
+	const char *out;
+	use_sideband = LARGE_PACKET_MAX;
+
+	/* Check if cmd is being run as a stateless-rpc */
+	if (has_capability(keys, "stateless-rpc", &out))
+		if (!strcmp(out, "true"))
+			data.stateless_rpc = 1;
+
+	while (state != FETCH_DONE) {
+		switch (state) {
+		case FETCH_PROCESS_ARGS:
+			process_args(args, &data);
+
+			if (!want_obj.nr) {
+				/*
+				 * Request didn't contain any 'want' lines,
+				 * guess they didn't want anything.
+				 */
+				state = FETCH_DONE;
+			} else if (data.haves.nr) {
+				/*
+				 * Request had 'have' lines, so let's ACK them.
+				 */
+				state = FETCH_SEND_ACKS;
+			} else {
+				/*
+				 * Request had 'want's but no 'have's so we can
+				 * immediately construct and send a pack.
+				 */
+				state = FETCH_SEND_PACK;
+			}
+			break;
+		case FETCH_READ_HAVES:
+			read_haves(&data);
+			state = FETCH_SEND_ACKS;
+			break;
+		case FETCH_SEND_ACKS:
+			if (process_haves_and_send_acks(&data))
+				state = FETCH_SEND_PACK;
+			else if (data.stateless_rpc)
+				/*
+				 * Request was made via stateless-rpc and a
+				 * packfile isn't ready to be created and sent.
+				 */
+				state = FETCH_DONE;
+			else
+				state = FETCH_READ_HAVES;
+			break;
+		case FETCH_SEND_PACK:
+			packet_write_fmt(1, "packfile\n");
+			create_pack_file();
+			state = FETCH_DONE;
+			break;
+		case FETCH_DONE:
+			continue;
+		}
+	}
+
+	upload_pack_data_clear(&data);
+	return 0;
+}
diff --git a/upload-pack.h b/upload-pack.h
index a71e4dc7e..6b7890238 100644
--- a/upload-pack.h
+++ b/upload-pack.h
@@ -10,4 +10,9 @@ struct upload_pack_options {
 
 void upload_pack(struct upload_pack_options *options);
 
+struct repository;
+struct argv_array;
+extern int upload_pack_v2(struct repository *r, struct argv_array *keys,
+			  struct argv_array *args);
+
 #endif /* UPLOAD_PACK_H */
-- 
2.16.0.rc1.238.g530d649a79-goog



* [PATCH v2 21/27] fetch-pack: perform a fetch using v2
  2018-01-25 23:58 ` [PATCH v2 00/27] " Brandon Williams
                     ` (19 preceding siblings ...)
  2018-01-25 23:58   ` [PATCH v2 20/27] upload-pack: introduce fetch server command Brandon Williams
@ 2018-01-25 23:58   ` Brandon Williams
  2018-01-25 23:58   ` [PATCH v2 22/27] transport-helper: remove name parameter Brandon Williams
                     ` (8 subsequent siblings)
  29 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-01-25 23:58 UTC (permalink / raw)
  To: git
  Cc: sbeller, gitster, peff, philipoakley, stolee, jrnieder, Brandon Williams

When communicating with a v2 server, perform a fetch by requesting the
'fetch' command.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
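A note for readers, not part of the commit: the client-side negotiation
loop in do_fetch_pack_v2() can be read as two transition rules.  The
following is a minimal sketch under that reading.  The names
after_send(), after_acks() and 'enum ack_result' are hypothetical and
chosen for illustration only; the real function also performs the
local-ref check and downloads the packfile.

```c
/* Simplified model of the client negotiation loop; names are illustrative. */
enum fetch_state {
	FETCH_CHECK_LOCAL = 0,
	FETCH_SEND_REQUEST,
	FETCH_PROCESS_ACKS,
	FETCH_SEND_HAVES,
	FETCH_GET_PACK,
	FETCH_DONE
};

/* Mirrors the 0/1/2 return values documented in process_acks(). */
enum ack_result { ACK_NONE = 0, ACK_COMMON = 1, ACK_READY = 2 };

/*
 * Next state after a request or a batch of haves has been written.
 * 'sent_done' is nonzero when the client wrote "done", in which case
 * the server answers with a packfile instead of acknowledgments.
 */
enum fetch_state after_send(int sent_done)
{
	return sent_done ? FETCH_GET_PACK : FETCH_PROCESS_ACKS;
}

/*
 * Next state after the acknowledgments section has been read.
 * In stateless-rpc mode every round is a complete new request;
 * otherwise only further haves are sent.
 */
enum fetch_state after_acks(enum ack_result res, int stateless_rpc)
{
	if (res == ACK_READY)
		return FETCH_GET_PACK;
	return stateless_rpc ? FETCH_SEND_REQUEST : FETCH_SEND_HAVES;
}
```

The stateless-rpc branch is the interesting one: because the server
keeps no state between requests, the client goes back to
FETCH_SEND_REQUEST and resends its wants along with the common commits
found so far.
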
 builtin/fetch-pack.c   |   2 +-
 fetch-pack.c           | 277 ++++++++++++++++++++++++++++++++++++++++++++++++-
 fetch-pack.h           |   4 +-
 t/t5702-protocol-v2.sh |  40 +++++++
 transport.c            |   8 +-
 5 files changed, 324 insertions(+), 7 deletions(-)

diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
index f492e8abd..867dd3cc7 100644
--- a/builtin/fetch-pack.c
+++ b/builtin/fetch-pack.c
@@ -213,7 +213,7 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix)
 	}
 
 	ref = fetch_pack(&args, fd, conn, ref, dest, sought, nr_sought,
-			 &shallow, pack_lockfile_ptr);
+			 &shallow, pack_lockfile_ptr, protocol_v0);
 	if (pack_lockfile) {
 		printf("lock %s\n", pack_lockfile);
 		fflush(stdout);
diff --git a/fetch-pack.c b/fetch-pack.c
index 9f6b07ad9..17927ae99 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1008,6 +1008,272 @@ static struct ref *do_fetch_pack(struct fetch_pack_args *args,
 	return ref;
 }
 
+static void add_wants(const struct ref *wants, struct strbuf *req_buf)
+{
+	for ( ; wants ; wants = wants->next) {
+		const struct object_id *remote = &wants->old_oid;
+		const char *remote_hex;
+		struct object *o;
+
+		/*
+		 * If that object is complete (i.e. it is an ancestor of a
+		 * local ref), we tell them we have it but do not have to
+		 * tell them about its ancestors, which they already know
+		 * about.
+		 *
+		 * We use lookup_object here because we are only
+		 * interested in the case we *know* the object is
+		 * reachable and we have already scanned it.
+		 */
+		if (((o = lookup_object(remote->hash)) != NULL) &&
+		    (o->flags & COMPLETE)) {
+			continue;
+		}
+
+		remote_hex = oid_to_hex(remote);
+		packet_buf_write(req_buf, "want %s\n", remote_hex);
+	}
+}
+
+static int add_haves(struct strbuf *req_buf, int *in_vain)
+{
+	int ret = 0;
+	int haves_added = 0;
+	const struct object_id *oid;
+
+	while ((oid = get_rev())) {
+		packet_buf_write(req_buf, "have %s\n", oid_to_hex(oid));
+		if (++haves_added >= INITIAL_FLUSH)
+			break;
+	}
+
+	*in_vain += haves_added;
+	if (!haves_added || *in_vain >= MAX_IN_VAIN) {
+		/* Send Done */
+		packet_buf_write(req_buf, "done\n");
+		ret = 1;
+	}
+
+	return ret;
+}
+
+static int send_haves(int fd_out, int *in_vain)
+{
+	int ret = 0;
+	struct strbuf req_buf = STRBUF_INIT;
+
+	ret = add_haves(&req_buf, in_vain);
+
+	/* Send request */
+	packet_buf_flush(&req_buf);
+	write_or_die(fd_out, req_buf.buf, req_buf.len);
+
+	strbuf_release(&req_buf);
+	return ret;
+}
+
+static int send_fetch_request(int fd_out, const struct fetch_pack_args *args,
+			      const struct ref *wants, struct oidset *common,
+			      int *in_vain)
+{
+	int ret = 0;
+	struct strbuf req_buf = STRBUF_INIT;
+
+	packet_buf_write(&req_buf, "command=fetch");
+	packet_buf_write(&req_buf, "agent=%s", git_user_agent_sanitized());
+	if (args->stateless_rpc)
+		packet_buf_write(&req_buf, "stateless-rpc=true");
+
+	packet_buf_delim(&req_buf);
+	if (args->use_thin_pack)
+		packet_buf_write(&req_buf, "thin-pack");
+	if (args->no_progress)
+		packet_buf_write(&req_buf, "no-progress");
+	if (args->include_tag)
+		packet_buf_write(&req_buf, "include-tag");
+	if (prefer_ofs_delta)
+		packet_buf_write(&req_buf, "ofs-delta");
+
+	/* add wants */
+	add_wants(wants, &req_buf);
+
+	/*
+	 * If we are running stateless-rpc we need to add all the common
+	 * commits we've found in previous rounds
+	 */
+	if (args->stateless_rpc) {
+		struct oidset_iter iter;
+		const struct object_id *oid;
+		oidset_iter_init(common, &iter);
+
+		while ((oid = oidset_iter_next(&iter))) {
+			packet_buf_write(&req_buf, "have %s\n", oid_to_hex(oid));
+		}
+	}
+
+	/* Add initial haves */
+	ret = add_haves(&req_buf, in_vain);
+
+	/* Send request */
+	packet_buf_flush(&req_buf);
+	write_or_die(fd_out, req_buf.buf, req_buf.len);
+
+	strbuf_release(&req_buf);
+	return ret;
+}
+
+/*
+ * Processes a section header in a server's response and checks if it matches
+ * `section`.  If the value of `peek` is 1, the header line will be peeked (and
+ * not consumed); if 0, the line will be consumed and the function will die if
+ * the section header doesn't match what was expected.
+ */
+static int process_section_header(struct packet_reader *reader,
+				  const char *section, int peek)
+{
+	int ret;
+
+	if (packet_reader_peek(reader) != PACKET_READ_NORMAL)
+		die("error reading packet");
+
+	ret = !strcmp(reader->line, section);
+
+	if (!peek) {
+		if (!ret)
+			die("expected '%s', received '%s'",
+			    section, reader->line);
+		packet_reader_read(reader);
+	}
+
+	return ret;
+}
+
+static int process_acks(struct packet_reader *reader, struct oidset *common)
+{
+	/* received */
+	int received_ready = 0;
+	int received_ack = 0;
+
+	process_section_header(reader, "acknowledgments", 0);
+	while (packet_reader_read(reader) == PACKET_READ_NORMAL) {
+		const char *arg;
+
+		if (!strcmp(reader->line, "NAK"))
+			continue;
+
+		if (skip_prefix(reader->line, "ACK ", &arg)) {
+			struct object_id oid;
+			if (!get_oid_hex(arg, &oid)) {
+				struct commit *commit;
+				oidset_insert(common, &oid);
+				commit = lookup_commit(&oid);
+				mark_common(commit, 0, 1);
+			}
+			continue;
+		}
+
+		if (!strcmp(reader->line, "ready")) {
+			clear_prio_queue(&rev_list);
+			received_ready = 1;
+			continue;
+		}
+
+		die(_("git fetch-pack: expected ACK/NAK, got '%s'"), reader->line);
+	}
+
+	if (reader->status != PACKET_READ_FLUSH &&
+	    reader->status != PACKET_READ_DELIM)
+		die("Error during processing acks: %d", reader->status);
+
+	/* return 0 if no common, 1 if there are common, or 2 if ready */
+	return received_ready ? 2 : (received_ack ? 1 : 0);
+}
+
+enum fetch_state {
+	FETCH_CHECK_LOCAL = 0,
+	FETCH_SEND_REQUEST,
+	FETCH_PROCESS_ACKS,
+	FETCH_SEND_HAVES,
+	FETCH_GET_PACK,
+	FETCH_DONE,
+};
+
+static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
+				    int fd[2],
+				    const struct ref *orig_ref,
+				    struct ref **sought, int nr_sought,
+				    char **pack_lockfile)
+{
+	struct ref *ref = copy_ref_list(orig_ref);
+	enum fetch_state state = FETCH_CHECK_LOCAL;
+	struct oidset common = OIDSET_INIT;
+	struct packet_reader reader;
+	int in_vain = 0;
+	packet_reader_init(&reader, fd[0], NULL, 0,
+			   PACKET_READ_CHOMP_NEWLINE);
+
+	while (state != FETCH_DONE) {
+		switch (state) {
+		case FETCH_CHECK_LOCAL:
+			sort_ref_list(&ref, ref_compare_name);
+			QSORT(sought, nr_sought, cmp_ref_by_name);
+
+			/* v2 supports these by default */
+			allow_unadvertised_object_request |= ALLOW_REACHABLE_SHA1;
+			use_sideband = 2;
+
+			/* Filter 'ref' by 'sought' and those that aren't local */
+			if (everything_local(args, &ref, sought, nr_sought))
+				state = FETCH_DONE;
+			else
+				state = FETCH_SEND_REQUEST;
+			break;
+		case FETCH_SEND_REQUEST:
+			if (send_fetch_request(fd[1], args, ref, &common, &in_vain))
+				state = FETCH_GET_PACK;
+			else
+				state = FETCH_PROCESS_ACKS;
+			break;
+		case FETCH_PROCESS_ACKS:
+			/* Process ACKs/NAKs */
+			switch (process_acks(&reader, &common)) {
+			case 2:
+				state = FETCH_GET_PACK;
+				break;
+			case 1:
+				in_vain = 0;
+				/* fallthrough */
+			default:
+				if (args->stateless_rpc)
+					state = FETCH_SEND_REQUEST;
+				else
+					state = FETCH_SEND_HAVES;
+				break;
+			}
+			break;
+		case FETCH_SEND_HAVES:
+			if (send_haves(fd[1], &in_vain))
+				state = FETCH_GET_PACK;
+			else
+				state = FETCH_PROCESS_ACKS;
+			break;
+		case FETCH_GET_PACK:
+			/* get the pack */
+			process_section_header(&reader, "packfile", 0);
+			if (get_pack(args, fd, pack_lockfile))
+				die(_("git fetch-pack: fetch failed."));
+
+			state = FETCH_DONE;
+			break;
+		case FETCH_DONE:
+			continue;
+		}
+	}
+
+	oidset_clear(&common);
+	return ref;
+}
+
 static void fetch_pack_config(void)
 {
 	git_config_get_int("fetch.unpacklimit", &fetch_unpack_limit);
@@ -1153,7 +1419,8 @@ struct ref *fetch_pack(struct fetch_pack_args *args,
 		       const char *dest,
 		       struct ref **sought, int nr_sought,
 		       struct oid_array *shallow,
-		       char **pack_lockfile)
+		       char **pack_lockfile,
+		       enum protocol_version version)
 {
 	struct ref *ref_cpy;
 	struct shallow_info si;
@@ -1167,8 +1434,12 @@ struct ref *fetch_pack(struct fetch_pack_args *args,
 		die(_("no matching remote head"));
 	}
 	prepare_shallow_info(&si, shallow);
-	ref_cpy = do_fetch_pack(args, fd, ref, sought, nr_sought,
-				&si, pack_lockfile);
+	if (version == protocol_v2)
+		ref_cpy = do_fetch_pack_v2(args, fd, ref, sought, nr_sought,
+					   pack_lockfile);
+	else
+		ref_cpy = do_fetch_pack(args, fd, ref, sought, nr_sought,
+					&si, pack_lockfile);
 	reprepare_packed_git();
 	update_shallow(args, sought, nr_sought, &si);
 	clear_shallow_info(&si);
diff --git a/fetch-pack.h b/fetch-pack.h
index b6aeb43a8..7afca7305 100644
--- a/fetch-pack.h
+++ b/fetch-pack.h
@@ -3,6 +3,7 @@
 
 #include "string-list.h"
 #include "run-command.h"
+#include "protocol.h"
 
 struct oid_array;
 
@@ -43,7 +44,8 @@ struct ref *fetch_pack(struct fetch_pack_args *args,
 		       struct ref **sought,
 		       int nr_sought,
 		       struct oid_array *shallow,
-		       char **pack_lockfile);
+		       char **pack_lockfile,
+		       enum protocol_version version);
 
 /*
  * Print an appropriate error message for each sought ref that wasn't
diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh
index 7d8aeb766..3e411e178 100755
--- a/t/t5702-protocol-v2.sh
+++ b/t/t5702-protocol-v2.sh
@@ -33,4 +33,44 @@ test_expect_success 'ref advertisment is filtered with ls-remote using protocol
 	! grep "refs/tags/" log
 '
 
+test_expect_success 'clone with file:// using protocol v2' '
+	GIT_TRACE_PACKET=1 git -c protocol.version=2 \
+		clone "file://$(pwd)/file_parent" file_child 2>log &&
+
+	git -C file_child log -1 --format=%s >actual &&
+	git -C file_parent log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+
+	# Server responded using protocol v2
+	grep "clone< version 2" log
+'
+
+test_expect_success 'fetch with file:// using protocol v2' '
+	test_commit -C file_parent two &&
+
+	GIT_TRACE_PACKET=1 git -C file_child -c protocol.version=2 \
+		fetch origin 2>log &&
+
+	git -C file_child log -1 --format=%s origin/master >actual &&
+	git -C file_parent log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+
+	# Server responded using protocol v2
+	grep "fetch< version 2" log
+'
+
+test_expect_success 'ref advertisement is filtered during fetch using protocol v2' '
+	test_commit -C file_parent three &&
+
+	GIT_TRACE_PACKET=1 git -C file_child -c protocol.version=2 \
+		fetch origin master 2>log &&
+
+	git -C file_child log -1 --format=%s origin/master >actual &&
+	git -C file_parent log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+
+	grep "ref-pattern master" log &&
+	! grep "refs/tags/" log
+'
+
 test_done
diff --git a/transport.c b/transport.c
index 6ea3905e3..4fdbd9adc 100644
--- a/transport.c
+++ b/transport.c
@@ -256,14 +256,18 @@ static int fetch_refs_via_pack(struct transport *transport,
 
 	switch (data->version) {
 	case protocol_v2:
-		die("support for protocol v2 not implemented yet");
+		refs = fetch_pack(&args, data->fd, data->conn,
+				  refs_tmp ? refs_tmp : transport->remote_refs,
+				  dest, to_fetch, nr_heads, &data->shallow,
+				  &transport->pack_lockfile, data->version);
+		packet_flush(data->fd[1]);
 		break;
 	case protocol_v1:
 	case protocol_v0:
 		refs = fetch_pack(&args, data->fd, data->conn,
 				  refs_tmp ? refs_tmp : transport->remote_refs,
 				  dest, to_fetch, nr_heads, &data->shallow,
-				  &transport->pack_lockfile);
+				  &transport->pack_lockfile, data->version);
 		break;
 	case protocol_unknown_version:
 		BUG("unknown protocol version");
-- 
2.16.0.rc1.238.g530d649a79-goog



* [PATCH v2 22/27] transport-helper: remove name parameter
  2018-01-25 23:58 ` [PATCH v2 00/27] " Brandon Williams
                     ` (20 preceding siblings ...)
  2018-01-25 23:58   ` [PATCH v2 21/27] fetch-pack: perform a fetch using v2 Brandon Williams
@ 2018-01-25 23:58   ` Brandon Williams
  2018-01-25 23:58   ` [PATCH v2 23/27] transport-helper: refactor process_connect_service Brandon Williams
                     ` (7 subsequent siblings)
  29 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-01-25 23:58 UTC (permalink / raw)
  To: git
  Cc: sbeller, gitster, peff, philipoakley, stolee, jrnieder, Brandon Williams

Commit 266f1fdfa (transport-helper: be quiet on read errors from
helpers, 2013-06-21) removed a call to 'die()' which printed the name of
the remote helper passed in to the 'recvline_fh()' function using the
'name' parameter.  Once the call to 'die()' was removed the parameter
was no longer necessary but wasn't removed.  Clean up the parameter
list of 'recvline_fh()' by removing the 'name' parameter.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 transport-helper.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/transport-helper.c b/transport-helper.c
index 4c334b5ee..d72155768 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -49,7 +49,7 @@ static void sendline(struct helper_data *helper, struct strbuf *buffer)
 		die_errno("Full write to remote helper failed");
 }
 
-static int recvline_fh(FILE *helper, struct strbuf *buffer, const char *name)
+static int recvline_fh(FILE *helper, struct strbuf *buffer)
 {
 	strbuf_reset(buffer);
 	if (debug)
@@ -67,7 +67,7 @@ static int recvline_fh(FILE *helper, struct strbuf *buffer, const char *name)
 
 static int recvline(struct helper_data *helper, struct strbuf *buffer)
 {
-	return recvline_fh(helper->out, buffer, helper->name);
+	return recvline_fh(helper->out, buffer);
 }
 
 static void write_constant(int fd, const char *str)
@@ -586,7 +586,7 @@ static int process_connect_service(struct transport *transport,
 		goto exit;
 
 	sendline(data, &cmdbuf);
-	if (recvline_fh(input, &cmdbuf, name))
+	if (recvline_fh(input, &cmdbuf))
 		exit(128);
 
 	if (!strcmp(cmdbuf.buf, "")) {
-- 
2.16.0.rc1.238.g530d649a79-goog



* [PATCH v2 23/27] transport-helper: refactor process_connect_service
  2018-01-25 23:58 ` [PATCH v2 00/27] " Brandon Williams
                     ` (21 preceding siblings ...)
  2018-01-25 23:58   ` [PATCH v2 22/27] transport-helper: remove name parameter Brandon Williams
@ 2018-01-25 23:58   ` Brandon Williams
  2018-01-25 23:58   ` [PATCH v2 24/27] transport-helper: introduce stateless-connect Brandon Williams
                     ` (6 subsequent siblings)
  29 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-01-25 23:58 UTC (permalink / raw)
  To: git
  Cc: sbeller, gitster, peff, philipoakley, stolee, jrnieder, Brandon Williams

A future patch will need the logic that runs the connect command on a
remote helper and processes its response, so factor this logic out of
'process_connect_service()' into a helper function 'run_connect()'.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 transport-helper.c | 67 +++++++++++++++++++++++++++++++-----------------------
 1 file changed, 38 insertions(+), 29 deletions(-)

diff --git a/transport-helper.c b/transport-helper.c
index d72155768..c032a2a87 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -545,14 +545,13 @@ static int fetch_with_import(struct transport *transport,
 	return 0;
 }
 
-static int process_connect_service(struct transport *transport,
-				   const char *name, const char *exec)
+static int run_connect(struct transport *transport, struct strbuf *cmdbuf)
 {
 	struct helper_data *data = transport->data;
-	struct strbuf cmdbuf = STRBUF_INIT;
-	struct child_process *helper;
-	int r, duped, ret = 0;
+	int ret = 0;
+	int duped;
 	FILE *input;
+	struct child_process *helper;
 
 	helper = get_helper(transport);
 
@@ -568,44 +567,54 @@ static int process_connect_service(struct transport *transport,
 	input = xfdopen(duped, "r");
 	setvbuf(input, NULL, _IONBF, 0);
 
+	sendline(data, cmdbuf);
+	if (recvline_fh(input, cmdbuf))
+		exit(128);
+
+	if (!strcmp(cmdbuf->buf, "")) {
+		data->no_disconnect_req = 1;
+		if (debug)
+			fprintf(stderr, "Debug: Smart transport connection "
+				"ready.\n");
+		ret = 1;
+	} else if (!strcmp(cmdbuf->buf, "fallback")) {
+		if (debug)
+			fprintf(stderr, "Debug: Falling back to dumb "
+				"transport.\n");
+	} else {
+		die("Unknown response to connect: %s",
+			cmdbuf->buf);
+	}
+
+	fclose(input);
+	return ret;
+}
+
+static int process_connect_service(struct transport *transport,
+				   const char *name, const char *exec)
+{
+	struct helper_data *data = transport->data;
+	struct strbuf cmdbuf = STRBUF_INIT;
+	int ret = 0;
+
 	/*
 	 * Handle --upload-pack and friends. This is fire and forget...
 	 * just warn if it fails.
 	 */
 	if (strcmp(name, exec)) {
-		r = set_helper_option(transport, "servpath", exec);
+		int r = set_helper_option(transport, "servpath", exec);
 		if (r > 0)
 			warning("Setting remote service path not supported by protocol.");
 		else if (r < 0)
 			warning("Invalid remote service path.");
 	}
 
-	if (data->connect)
+	if (data->connect) {
 		strbuf_addf(&cmdbuf, "connect %s\n", name);
-	else
-		goto exit;
-
-	sendline(data, &cmdbuf);
-	if (recvline_fh(input, &cmdbuf))
-		exit(128);
-
-	if (!strcmp(cmdbuf.buf, "")) {
-		data->no_disconnect_req = 1;
-		if (debug)
-			fprintf(stderr, "Debug: Smart transport connection "
-				"ready.\n");
-		ret = 1;
-	} else if (!strcmp(cmdbuf.buf, "fallback")) {
-		if (debug)
-			fprintf(stderr, "Debug: Falling back to dumb "
-				"transport.\n");
-	} else
-		die("Unknown response to connect: %s",
-			cmdbuf.buf);
+		ret = run_connect(transport, &cmdbuf);
+	}
 
-exit:
 	strbuf_release(&cmdbuf);
-	fclose(input);
 	return ret;
 }
 
-- 
2.16.0.rc1.238.g530d649a79-goog



* [PATCH v2 24/27] transport-helper: introduce stateless-connect
  2018-01-25 23:58 ` [PATCH v2 00/27] " Brandon Williams
                     ` (22 preceding siblings ...)
  2018-01-25 23:58   ` [PATCH v2 23/27] transport-helper: refactor process_connect_service Brandon Williams
@ 2018-01-25 23:58   ` Brandon Williams
  2018-01-25 23:58   ` [PATCH v2 25/27] pkt-line: add packet_buf_write_len function Brandon Williams
                     ` (5 subsequent siblings)
  29 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-01-25 23:58 UTC (permalink / raw)
  To: git
  Cc: sbeller, gitster, peff, philipoakley, stolee, jrnieder, Brandon Williams

Introduce the transport-helper capability 'stateless-connect'.  This
capability indicates that the transport-helper can be requested to run
the 'stateless-connect' command which should attempt to make a
stateless connection with a remote end.  Once established, the
connection can be used by the git client to communicate with
the remote end natively in a stateless-rpc manner as supported by
protocol v2.  This means that the client must send everything the server
needs in a single request as the client must not assume any
state-storing on the part of the server or transport.

If a stateless connection cannot be established, the remote-helper
will respond in the same manner as for the 'connect' command, indicating
that the client should fall back to using the dumb remote-helper commands.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
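A note for readers, not part of the commit: 'stateless-connect' reuses
the reply convention of 'connect', so the helper's answer falls into
one of three cases.  The sketch below models the checks made in
run_connect(); classify_connect_response() and 'enum connect_result'
are hypothetical names, and where this sketch returns CONNECT_UNKNOWN
the real code calls die().

```c
#include <string.h>

enum connect_result {
	CONNECT_READY,    /* empty line: the connection is established */
	CONNECT_FALLBACK, /* "fallback": use the dumb helper commands */
	CONNECT_UNKNOWN   /* anything else is treated as an error */
};

/* Classify the single response line read back from the remote helper. */
enum connect_result classify_connect_response(const char *line)
{
	if (!strcmp(line, ""))
		return CONNECT_READY;
	if (!strcmp(line, "fallback"))
		return CONNECT_FALLBACK;
	return CONNECT_UNKNOWN;
}
```

Only the CONNECT_READY case of a 'stateless-connect' request leads to
transport->stateless_rpc being set in the diff below.
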
 transport-helper.c | 8 ++++++++
 transport.c        | 1 +
 transport.h        | 6 ++++++
 3 files changed, 15 insertions(+)

diff --git a/transport-helper.c b/transport-helper.c
index c032a2a87..82eb57c4a 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -26,6 +26,7 @@ struct helper_data {
 		option : 1,
 		push : 1,
 		connect : 1,
+		stateless_connect : 1,
 		signed_tags : 1,
 		check_connectivity : 1,
 		no_disconnect_req : 1,
@@ -188,6 +189,8 @@ static struct child_process *get_helper(struct transport *transport)
 			refspecs[refspec_nr++] = xstrdup(arg);
 		} else if (!strcmp(capname, "connect")) {
 			data->connect = 1;
+		} else if (!strcmp(capname, "stateless-connect")) {
+			data->stateless_connect = 1;
 		} else if (!strcmp(capname, "signed-tags")) {
 			data->signed_tags = 1;
 		} else if (skip_prefix(capname, "export-marks ", &arg)) {
@@ -612,6 +615,11 @@ static int process_connect_service(struct transport *transport,
 	if (data->connect) {
 		strbuf_addf(&cmdbuf, "connect %s\n", name);
 		ret = run_connect(transport, &cmdbuf);
+	} else if (data->stateless_connect) {
+		strbuf_addf(&cmdbuf, "stateless-connect %s\n", name);
+		ret = run_connect(transport, &cmdbuf);
+		if (ret)
+			transport->stateless_rpc = 1;
 	}
 
 	strbuf_release(&cmdbuf);
diff --git a/transport.c b/transport.c
index 4fdbd9adc..aafb8fbb4 100644
--- a/transport.c
+++ b/transport.c
@@ -250,6 +250,7 @@ static int fetch_refs_via_pack(struct transport *transport,
 		data->options.check_self_contained_and_connected;
 	args.cloning = transport->cloning;
 	args.update_shallow = data->options.update_shallow;
+	args.stateless_rpc = transport->stateless_rpc;
 
 	if (!data->got_remote_heads)
 		refs_tmp = get_refs_via_connect(transport, 0, NULL);
diff --git a/transport.h b/transport.h
index 4b656f315..9eac809ee 100644
--- a/transport.h
+++ b/transport.h
@@ -55,6 +55,12 @@ struct transport {
 	 */
 	unsigned cloning : 1;
 
+	/*
+	 * Indicates that the transport is connected via a half-duplex
+	 * connection and should operate in stateless-rpc mode.
+	 */
+	unsigned stateless_rpc : 1;
+
 	/*
 	 * These strings will be passed to the {pre, post}-receive hook,
 	 * on the remote side, if both sides support the push options capability.
-- 
2.16.0.rc1.238.g530d649a79-goog



* [PATCH v2 25/27] pkt-line: add packet_buf_write_len function
  2018-01-25 23:58 ` [PATCH v2 00/27] " Brandon Williams
                     ` (23 preceding siblings ...)
  2018-01-25 23:58   ` [PATCH v2 24/27] transport-helper: introduce stateless-connect Brandon Williams
@ 2018-01-25 23:58   ` Brandon Williams
  2018-01-25 23:58   ` [PATCH v2 26/27] remote-curl: create copy of the service name Brandon Williams
                     ` (4 subsequent siblings)
  29 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-01-25 23:58 UTC (permalink / raw)
  To: git
  Cc: sbeller, gitster, peff, philipoakley, stolee, jrnieder, Brandon Williams

Add the 'packet_buf_write_len()' function which allows for writing an
arbitrary length buffer into a 'struct strbuf' and formatting it in
packet-line format.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
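A note for readers, not part of the commit: in packet-line format each
packet starts with a four-digit hexadecimal length that counts the
header itself.  The sketch below shows that framing on a plain char
buffer instead of a 'struct strbuf'.  pkt_frame() and PKT_MAX are
hypothetical names; PKT_MAX is assumed to match LARGE_PACKET_MAX.

```c
#include <stdio.h>
#include <string.h>

#define PKT_HEADER_LEN 4
#define PKT_MAX 65520 /* assumed to match LARGE_PACKET_MAX */

/*
 * Frame 'len' bytes of 'data' as one pkt-line in 'out'.
 * Returns the total number of bytes written, or -1 if the packet
 * would exceed PKT_MAX or does not fit into 'out'.
 */
int pkt_frame(char *out, size_t out_size, const char *data, size_t len)
{
	size_t total = len + PKT_HEADER_LEN;
	char header[PKT_HEADER_LEN + 1];

	if (total > PKT_MAX || total > out_size)
		return -1;

	/* The length field covers the four header bytes plus the payload. */
	snprintf(header, sizeof(header), "%04x", (unsigned int)total);
	memcpy(out, header, PKT_HEADER_LEN);
	memcpy(out + PKT_HEADER_LEN, data, len);
	return (int)total;
}
```

Because the payload is copied by length rather than formatted, it may
contain NUL bytes, which is the case the printf-style
packet_buf_write() cannot handle.
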
 pkt-line.c | 16 ++++++++++++++++
 pkt-line.h |  1 +
 2 files changed, 17 insertions(+)

diff --git a/pkt-line.c b/pkt-line.c
index 726e109ca..5a8a17ecc 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -215,6 +215,22 @@ void packet_buf_write(struct strbuf *buf, const char *fmt, ...)
 	va_end(args);
 }
 
+void packet_buf_write_len(struct strbuf *buf, const char *data, size_t len)
+{
+	size_t orig_len, n;
+
+	orig_len = buf->len;
+	strbuf_addstr(buf, "0000");
+	strbuf_add(buf, data, len);
+	n = buf->len - orig_len;
+
+	if (n > LARGE_PACKET_MAX)
+		die("protocol error: impossibly long line");
+
+	set_packet_header(&buf->buf[orig_len], n);
+	packet_trace(buf->buf + orig_len + 4, n - 4, 1);
+}
+
 int write_packetized_from_fd(int fd_in, int fd_out)
 {
 	static char buf[LARGE_PACKET_DATA_MAX];
diff --git a/pkt-line.h b/pkt-line.h
index 16fe8bdbf..63724d4bf 100644
--- a/pkt-line.h
+++ b/pkt-line.h
@@ -26,6 +26,7 @@ void packet_buf_flush(struct strbuf *buf);
 void packet_buf_delim(struct strbuf *buf);
 void packet_write(int fd_out, const char *buf, size_t size);
 void packet_buf_write(struct strbuf *buf, const char *fmt, ...) __attribute__((format (printf, 2, 3)));
+void packet_buf_write_len(struct strbuf *buf, const char *data, size_t len);
 int packet_flush_gently(int fd);
 int packet_write_fmt_gently(int fd, const char *fmt, ...) __attribute__((format (printf, 2, 3)));
 int write_packetized_from_fd(int fd_in, int fd_out);
-- 
2.16.0.rc1.238.g530d649a79-goog



* [PATCH v2 26/27] remote-curl: create copy of the service name
  2018-01-25 23:58 ` [PATCH v2 00/27] " Brandon Williams
                     ` (24 preceding siblings ...)
  2018-01-25 23:58   ` [PATCH v2 25/27] pkt-line: add packet_buf_write_len function Brandon Williams
@ 2018-01-25 23:58   ` Brandon Williams
  2018-01-25 23:58   ` [PATCH v2 27/27] remote-curl: implement stateless-connect command Brandon Williams
                     ` (3 subsequent siblings)
  29 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-01-25 23:58 UTC (permalink / raw)
  To: git
  Cc: sbeller, gitster, peff, philipoakley, stolee, jrnieder, Brandon Williams

Make a copy of the service name being requested instead of relying on
the buffer pointed to by the passed-in 'const char *' to remain
unchanged.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
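A note for readers, not part of the commit: the hazard being avoided is
a struct that keeps a borrowed pointer into a buffer the caller may
later overwrite.  The sketch below illustrates the owning-copy pattern
with plain libc calls in place of xstrdup(); 'struct discovery_sketch',
discovery_new() and discovery_free() are hypothetical names.

```c
#include <stdlib.h>
#include <string.h>

struct discovery_sketch {
	char *service; /* owned copy, released in discovery_free() */
};

/* Allocate a discovery record holding its own copy of 'service'. */
struct discovery_sketch *discovery_new(const char *service)
{
	struct discovery_sketch *d = calloc(1, sizeof(*d));

	if (!d)
		return NULL;
	/* Copy the name so later changes to the caller's buffer don't matter. */
	d->service = malloc(strlen(service) + 1);
	if (!d->service) {
		free(d);
		return NULL;
	}
	strcpy(d->service, service);
	return d;
}

/* Release the record together with the copied service name. */
void discovery_free(struct discovery_sketch *d)
{
	if (d) {
		free(d->service);
		free(d);
	}
}
```

Owning the string is also why the field type changes from
'const char *' to 'char *' and why free_discovery() gains a free().
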
 remote-curl.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/remote-curl.c b/remote-curl.c
index dae8a4a48..4086aa733 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -165,7 +165,7 @@ static int set_option(const char *name, const char *value)
 }
 
 struct discovery {
-	const char *service;
+	char *service;
 	char *buf_alloc;
 	char *buf;
 	size_t len;
@@ -257,6 +257,7 @@ static void free_discovery(struct discovery *d)
 		free(d->shallow.oid);
 		free(d->buf_alloc);
 		free_refs(d->refs);
+		free(d->service);
 		free(d);
 	}
 }
@@ -343,7 +344,7 @@ static struct discovery *discover_refs(const char *service, int for_push)
 		warning(_("redirecting to %s"), url.buf);
 
 	last= xcalloc(1, sizeof(*last_discovery));
-	last->service = service;
+	last->service = xstrdup(service);
 	last->buf_alloc = strbuf_detach(&buffer, &last->len);
 	last->buf = last->buf_alloc;
 
-- 
2.16.0.rc1.238.g530d649a79-goog



* [PATCH v2 27/27] remote-curl: implement stateless-connect command
  2018-01-25 23:58 ` [PATCH v2 00/27] " Brandon Williams
                     ` (25 preceding siblings ...)
  2018-01-25 23:58   ` [PATCH v2 26/27] remote-curl: create copy of the service name Brandon Williams
@ 2018-01-25 23:58   ` Brandon Williams
  2018-01-31 16:00   ` [PATCH v2 00/27] protocol version 2 Derrick Stolee
                     ` (2 subsequent siblings)
  29 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-01-25 23:58 UTC (permalink / raw)
  To: git
  Cc: sbeller, gitster, peff, philipoakley, stolee, jrnieder, Brandon Williams

Teach remote-curl the 'stateless-connect' command, which is used to
establish a stateless connection with servers that support protocol
version 2.  This allows remote-curl to act as a proxy: the git client
communicates natively with the remote end, using remote-curl simply as
a pass-through that converts requests to http.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 remote-curl.c          | 185 ++++++++++++++++++++++++++++++++++++++++++++++++-
 t/t5702-protocol-v2.sh |  41 +++++++++++
 2 files changed, 224 insertions(+), 2 deletions(-)

diff --git a/remote-curl.c b/remote-curl.c
index 4086aa733..a17c7e228 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -171,6 +171,7 @@ struct discovery {
 	size_t len;
 	struct ref *refs;
 	struct oid_array shallow;
+	enum protocol_version version;
 	unsigned proto_git : 1;
 };
 static struct discovery *last_discovery;
@@ -184,9 +185,13 @@ static struct ref *parse_git_refs(struct discovery *heads, int for_push)
 			   PACKET_READ_CHOMP_NEWLINE |
 			   PACKET_READ_GENTLE_ON_EOF);
 
-	switch (discover_version(&reader)) {
+	heads->version = discover_version(&reader);
+	switch (heads->version) {
 	case protocol_v2:
-		die("support for protocol v2 not implemented yet");
+		/*
+		 * Do nothing.  Client should run 'stateless-connect' and
+		 * request the refs themselves.
+		 */
 		break;
 	case protocol_v1:
 	case protocol_v0:
@@ -1047,6 +1052,178 @@ static void parse_push(struct strbuf *buf)
 	free(specs);
 }
 
+struct proxy_state {
+	char *service_name;
+	char *service_url;
+	struct curl_slist *headers;
+	struct strbuf request_buffer;
+	int in;
+	int out;
+	struct packet_reader reader;
+	size_t pos;
+	int seen_flush;
+};
+
+static void proxy_state_init(struct proxy_state *p, const char *service_name)
+{
+	struct strbuf buf = STRBUF_INIT;
+
+	memset(p, 0, sizeof(*p));
+	p->service_name = xstrdup(service_name);
+
+	p->in = 0;
+	p->out = 1;
+	strbuf_init(&p->request_buffer, 0);
+
+	strbuf_addf(&buf, "%s%s", url.buf, p->service_name);
+	p->service_url = strbuf_detach(&buf, NULL);
+
+	p->headers = http_copy_default_headers();
+
+	strbuf_addf(&buf, "Content-Type: application/x-%s-request", p->service_name);
+	p->headers = curl_slist_append(p->headers, buf.buf);
+	strbuf_reset(&buf);
+
+	strbuf_addf(&buf, "Accept: application/x-%s-result", p->service_name);
+	p->headers = curl_slist_append(p->headers, buf.buf);
+
+	p->headers = curl_slist_append(p->headers, "Transfer-Encoding: chunked");
+
+	packet_reader_init(&p->reader, p->in, NULL, 0,
+			   PACKET_READ_GENTLE_ON_EOF);
+
+	strbuf_release(&buf);
+}
+
+static void proxy_state_clear(struct proxy_state *p)
+{
+	free(p->service_name);
+	free(p->service_url);
+	curl_slist_free_all(p->headers);
+	strbuf_release(&p->request_buffer);
+}
+
+static size_t proxy_in(char *buffer, size_t eltsize,
+		       size_t nmemb, void *userdata)
+{
+	size_t max = eltsize * nmemb;
+	struct proxy_state *p = userdata;
+	size_t avail = p->request_buffer.len - p->pos;
+
+	if (!avail) {
+		if (p->seen_flush) {
+			p->seen_flush = 0;
+			return 0;
+		}
+
+		strbuf_reset(&p->request_buffer);
+		switch (packet_reader_read(&p->reader)) {
+		case PACKET_READ_EOF:
+			die("unexpected EOF when reading from parent process");
+		case PACKET_READ_NORMAL:
+			packet_buf_write_len(&p->request_buffer, p->reader.line,
+					     p->reader.pktlen);
+			break;
+		case PACKET_READ_DELIM:
+			packet_buf_delim(&p->request_buffer);
+			break;
+		case PACKET_READ_FLUSH:
+			packet_buf_flush(&p->request_buffer);
+			p->seen_flush = 1;
+			break;
+		}
+		p->pos = 0;
+		avail = p->request_buffer.len;
+	}
+
+	if (max < avail)
+		avail = max;
+	memcpy(buffer, p->request_buffer.buf + p->pos, avail);
+	p->pos += avail;
+	return avail;
+}
+
+static size_t proxy_out(char *buffer, size_t eltsize,
+			size_t nmemb, void *userdata)
+{
+	size_t size = eltsize * nmemb;
+	struct proxy_state *p = userdata;
+
+	write_or_die(p->out, buffer, size);
+	return size;
+}
+
+static int proxy_post(struct proxy_state *p)
+{
+	struct active_request_slot *slot;
+	int err;
+
+	slot = get_active_slot();
+
+	curl_easy_setopt(slot->curl, CURLOPT_NOBODY, 0);
+	curl_easy_setopt(slot->curl, CURLOPT_POST, 1);
+	curl_easy_setopt(slot->curl, CURLOPT_URL, p->service_url);
+	curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, p->headers);
+
+	/* Setup function to read request from client */
+	curl_easy_setopt(slot->curl, CURLOPT_READFUNCTION, proxy_in);
+	curl_easy_setopt(slot->curl, CURLOPT_READDATA, p);
+
+	/* Setup function to write server response to client */
+	curl_easy_setopt(slot->curl, CURLOPT_WRITEFUNCTION, proxy_out);
+	curl_easy_setopt(slot->curl, CURLOPT_WRITEDATA, p);
+
+	err = run_slot(slot, NULL);
+
+	if (err != HTTP_OK)
+		err = -1;
+
+	return err;
+}
+
+static int stateless_connect(const char *service_name)
+{
+	struct discovery *discover;
+	struct proxy_state p;
+
+	/*
+	 * Run the info/refs request and see if the server supports protocol
+	 * v2.  If and only if the server supports v2 can we successfully
+	 * establish a stateless connection, otherwise we need to tell the
+	 * client to fallback to using other transport helper functions to
+	 * complete their request.
+	 */
+	discover = discover_refs(service_name, 0);
+	if (discover->version != protocol_v2) {
+		printf("fallback\n");
+		fflush(stdout);
+		return -1;
+	} else {
+		/* Stateless Connection established */
+		printf("\n");
+		fflush(stdout);
+	}
+
+	proxy_state_init(&p, service_name);
+
+	/*
+	 * Dump the capability listing that we got from the server earlier
+	 * during the info/refs request.
+	 */
+	write_or_die(p.out, discover->buf, discover->len);
+
+	/* Peek the next packet line.  Until we see EOF keep sending POSTs */
+	while (packet_reader_peek(&p.reader) != PACKET_READ_EOF) {
+		if (proxy_post(&p)) {
+			/* We would have an err here */
+			break;
+		}
+	}
+
+	proxy_state_clear(&p);
+	return 0;
+}
+
 int cmd_main(int argc, const char **argv)
 {
 	struct strbuf buf = STRBUF_INIT;
@@ -1115,12 +1292,16 @@ int cmd_main(int argc, const char **argv)
 			fflush(stdout);
 
 		} else if (!strcmp(buf.buf, "capabilities")) {
+			printf("stateless-connect\n");
 			printf("fetch\n");
 			printf("option\n");
 			printf("push\n");
 			printf("check-connectivity\n");
 			printf("\n");
 			fflush(stdout);
+		} else if (skip_prefix(buf.buf, "stateless-connect ", &arg)) {
+			if (!stateless_connect(arg))
+				break;
 		} else {
 			error("remote-curl: unknown command '%s' from git", buf.buf);
 			return 1;
diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh
index 3e411e178..ada69ac09 100755
--- a/t/t5702-protocol-v2.sh
+++ b/t/t5702-protocol-v2.sh
@@ -73,4 +73,45 @@ test_expect_success 'ref advertisment is filtered during fetch using protocol v2
 	! grep "refs/tags/" log
 '
 
+# Test protocol v2 with 'http://' transport
+#
+. "$TEST_DIRECTORY"/lib-httpd.sh
+start_httpd
+
+test_expect_success 'create repo to be served by http:// transport' '
+	git init "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" config http.receivepack true &&
+	test_commit -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" one
+'
+
+test_expect_success 'clone with http:// using protocol v2' '
+	GIT_TRACE_PACKET=1 GIT_TRACE_CURL=1 git -c protocol.version=2 \
+		clone "$HTTPD_URL/smart/http_parent" http_child 2>log &&
+
+	git -C http_child log -1 --format=%s >actual &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+
+	# Client requested to use protocol v2
+	grep "Git-Protocol: version=2" log &&
+	# Server responded using protocol v2
+	grep "git< version 2" log
+'
+
+test_expect_success 'fetch with http:// using protocol v2' '
+	test_commit -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" two &&
+
+	GIT_TRACE_PACKET=1 git -C http_child -c protocol.version=2 \
+		fetch 2>log &&
+
+	git -C http_child log -1 --format=%s origin/master >actual &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+
+	# Server responded using protocol v2
+	grep "git< version 2" log
+'
+
+stop_httpd
+
 test_done
-- 
2.16.0.rc1.238.g530d649a79-goog



* Re: [PATCH v2 12/27] serve: introduce git-serve
  2018-01-25 23:58   ` [PATCH v2 12/27] serve: introduce git-serve Brandon Williams
@ 2018-01-26 10:39     ` Duy Nguyen
  2018-02-27  5:46       ` Jonathan Nieder
  2018-01-31 15:39     ` Derrick Stolee
  1 sibling, 1 reply; 362+ messages in thread
From: Duy Nguyen @ 2018-01-26 10:39 UTC (permalink / raw)
  To: Brandon Williams
  Cc: Git Mailing List, Stefan Beller, Junio C Hamano, Jeff King,
	Philip Oakley, stolee, Jonathan Nieder

On Fri, Jan 26, 2018 at 6:58 AM, Brandon Williams <bmwill@google.com> wrote:
> + Detailed Design
> +=================
> +
> +A client can request to speak protocol v2 by sending `version=2` in the
> +side-channel `GIT_PROTOCOL` in the initial request to the server.
> +
> +In protocol v2 communication is command oriented.  When first contacting a
> +server a list of capabilities will advertised.  Some of these capabilities

s/will advertised/will be advertised/

> + Capability Advertisement
> +--------------------------
> +
> +A server which decides to communicate (based on a request from a client)
> +using protocol version 2, notifies the client by sending a version string
> +in its initial response followed by an advertisement of its capabilities.
> +Each capability is a key with an optional value.  Clients must ignore all
> +unknown keys.

We have a convention in the $GIT_DIR/index file format that's probably a
good thing to follow here: lowercase keys are optional, so such unknown
keys can (and must) be ignored. Uppercase keys are mandatory. If a
client can't understand one of those keys, it aborts. This gives the
server a way to "select" clients and introduce incompatible changes if
we ever have to.
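
A minimal sketch of how a client could apply that rule to one advertised
line, following the mapping proposed above (unknown lowercase keys are
ignored, unknown uppercase keys abort). The names `classify_key`,
`known_keys` and the `KEY_*` values are hypothetical and do not come from
the series:

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

enum key_action { KEY_KNOWN, KEY_IGNORE, KEY_ABORT };

/* Hypothetical list of keys this client understands. */
static const char *known_keys[] = { "agent", "ls-refs", "fetch", NULL };

/*
 * Classify one capability line of the form "key" or "key=value".
 * Only the key part (everything before the first '=') is inspected.
 */
static enum key_action classify_key(const char *line)
{
	size_t keylen;
	int i;

	assert(line != NULL);
	keylen = strcspn(line, "=");

	for (i = 0; known_keys[i]; i++)
		if (strlen(known_keys[i]) == keylen &&
		    !strncmp(line, known_keys[i], keylen))
			return KEY_KNOWN;

	/* Unknown key: the case of its first character decides. */
	return isupper((unsigned char)line[0]) ? KEY_ABORT : KEY_IGNORE;
}
```

A caller would run this over every line of the advertisement and stop
talking to the server as soon as one line yields KEY_ABORT.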

> Semantics of unknown values are left to the definition of
> +each key.  Some capabilities will describe commands which can be requested
> +to be executed by the client.
> +
> +    capability-advertisement = protocol-version
> +                              capability-list
> +                              flush-pkt
> +
> +    protocol-version = PKT-LINE("version 2" LF)
> +    capability-list = *capability
> +    capability = PKT-LINE(key[=value] LF)
> +
> +    key = 1*CHAR
> +    value = 1*CHAR
> +    CHAR = 1*(ALPHA / DIGIT / "-" / "_")

Is this a bit too restrictive for "value"? It rules out something like
"." (e.g. in a version) or "@" (I wonder if anybody will add a capability
that contains an email address). Unless there's a good reason to limit it,
should we just go full ASCII (without control codes)?
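
As a rough illustration of the relaxed rule, a value check could accept
any non-empty string of printable ASCII. This is only a sketch:
`is_valid_value` is a hypothetical name, and whether space (0x20) should
be allowed is an assumption made here (the `agent` text in the patch
excludes it):

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return 1 if value is non-empty and consists only of bytes in the
 * range 0x20..0x7e (printable ASCII, no control codes), else 0.
 */
static int is_valid_value(const char *value)
{
	const unsigned char *p = (const unsigned char *)value;

	assert(value != NULL);
	if (!*p)
		return 0;	/* the grammar requires at least one character */

	for (; *p; p++)
		if (*p < 0x20 || *p > 0x7e)
			return 0;
	return 1;
}
```

With this check, values such as "git/1.8.3.1" or "user@example.com" pass,
while anything containing a tab or newline is rejected.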

> +A client then responds to select the command it wants with any particular
> +capabilities or arguments.  There is then an optional section where the
> +client can provide any command specific parameters or queries.
> +
> +    command-request = command
> +                     capability-list
> +                     (command-args)
> +                     flush-pkt
> +    command = PKT-LINE("command=" key LF)
> +    command-args = delim-pkt
> +                  *arg
> +    arg = 1*CHAR
> +
> +The server will then check to ensure that the client's request is
> +comprised of a valid command as well as valid capabilities which were
> +advertised.  If the request is valid the server will then execute the
> +command.

What happens when the request is not valid? Or..

> +When a command has finished

How does the client know a command has finished? Is it up to each
command design?

More or less related: it bugs me that I have a translated git client,
but I still receive remote error messages in English. It's a hard
problem, but I'm hoping that we won't need to change the core protocol
to support that someday. Although we could make a rule now that side
channel messages could be sent in "printf"-like form, where the client
can translate the format string and substitute placeholders with real
values afterward...

> a client can either request that another
> +command be executed or can terminate the connection by sending an empty
> +request consisting of just a flush-pkt.
> +
> + Capabilities
> +~~~~~~~~~~~~~~
> +
> +There are two different types of capabilities: normal capabilities,
> +which can be used to to convey information or alter the behavior of a
> +request, and command capabilities, which are the core actions that a
> +client wants to perform (fetch, push, etc).
> +
> + agent
> +-------
> +
> +The server can advertise the `agent` capability with a value `X` (in the
> +form `agent=X`) to notify the client that the server is running version
> +`X`.  The client may optionally send its own agent string by including
> +the `agent` capability with a value `Y` (in the form `agent=Y`) in its
> +request to the server (but it MUST NOT do so if the server did not
> +advertise the agent capability). The `X` and `Y` strings may contain any
> +printable ASCII characters except space (i.e., the byte range 32 < x <
> +127), and are typically of the form "package/version" (e.g.,
> +"git/1.8.3.1"). The agent strings are purely informative for statistics
> +and debugging purposes, and MUST NOT be used to programmatically assume
> +the presence or absence of particular features.
> +
> + stateless-rpc
> +---------------
> +
> +If advertised, the `stateless-rpc` capability indicates that the server
> +supports running commands in a stateless-rpc mode, which means that a
> +command lasts for only a single request-response round.
> +
> +Normally a command can last for as many rounds as are required to
> +complete it (multiple for negotiation during fetch or no additional
> +trips in the case of ls-refs).  If the client sends the `stateless-rpc`
> +capability with a value of `true` (in the form `stateless-rpc=true`)
> +then the invoked command must only last a single round.

Speaking of stateless-rpc, I remember last time this topic was brought
up, there was some discussion to kind of optimize it for http as well,
to fit the "client sends request, server responds data" model and
avoid too many round trips (ideally everything happens in one round
trip). Did it evolve into anything real? All the cool stuff happened
while I was away, sorry if this was discussed and settled.
-- 
Duy


* Re: [PATCH v2 05/27] upload-pack: factor out processing lines
  2018-01-25 23:58   ` [PATCH v2 05/27] upload-pack: factor out processing lines Brandon Williams
@ 2018-01-26 20:12     ` Stefan Beller
  2018-01-26 21:33       ` Brandon Williams
  0 siblings, 1 reply; 362+ messages in thread
From: Stefan Beller @ 2018-01-26 20:12 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, Junio C Hamano, Jeff King, Philip Oakley, Derrick Stolee,
	Jonathan Nieder

On Thu, Jan 25, 2018 at 3:58 PM, Brandon Williams <bmwill@google.com> wrote:
> Factor out the logic for processing shallow, deepen, deepen_since, and
> deepen_not lines into their own functions to simplify the
> 'receive_needs()' function in addition to making it easier to reuse some
> of this logic when implementing protocol_v2.
>
> Signed-off-by: Brandon Williams <bmwill@google.com>
> ---
>  upload-pack.c | 113 ++++++++++++++++++++++++++++++++++++++--------------------
>  1 file changed, 74 insertions(+), 39 deletions(-)
>
> diff --git a/upload-pack.c b/upload-pack.c
> index 2ad73a98b..42d83d5b1 100644
> --- a/upload-pack.c
> +++ b/upload-pack.c
> @@ -724,6 +724,75 @@ static void deepen_by_rev_list(int ac, const char **av,
>         packet_flush(1);
>  }
>
> +static int process_shallow(const char *line, struct object_array *shallows)
> +{
> +       const char *arg;
> +       if (skip_prefix(line, "shallow ", &arg)) {

stylistic nit:

    You could invert the condition in each of the process_* functions
    to just have

        if (!skip_prefix...))
            return 0

        /* less indented code goes here */

        return 1;

    That way we have less indentation as well as easier code.
    (The reader doesn't need to keep in mind what the else
    part is about; it is a rather local decision to bail out instead
    of having the return at the end of the function.)
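
For illustration, here is roughly what the inverted form could look like
for process_deepen(). This is a standalone sketch, not the patch:
`skip_prefix_sketch` is a stand-in for git's skip_prefix(), and the
function returns -1 on a malformed line where the patch calls die():

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for git's skip_prefix(); sets *out past the prefix on a match. */
static int skip_prefix_sketch(const char *str, const char *prefix,
			      const char **out)
{
	size_t len = strlen(prefix);

	if (strncmp(str, prefix, len))
		return 0;
	*out = str + len;
	return 1;
}

/*
 * Returns 1 if the line was a valid "deepen" line, 0 if it is some
 * other kind of line, and -1 if it is a malformed "deepen" line.
 */
static int process_deepen_sketch(const char *line, int *depth)
{
	const char *arg;
	char *end = NULL;
	long value;

	assert(line != NULL && depth != NULL);

	/* Bail out early so the main path needs no extra indentation. */
	if (!skip_prefix_sketch(line, "deepen ", &arg))
		return 0;

	value = strtol(arg, &end, 0);
	if (!end || *end || value <= 0)
		return -1;

	*depth = (int)value;
	return 1;
}
```

The other process_* helpers would follow the same shape.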

> +               struct object_id oid;
> +               struct object *object;
> +               if (get_oid_hex(arg, &oid))
> +                       die("invalid shallow line: %s", line);
> +               object = parse_object(&oid);
> +               if (!object)
> +                       return 1;
> +               if (object->type != OBJ_COMMIT)
> +                       die("invalid shallow object %s", oid_to_hex(&oid));
> +               if (!(object->flags & CLIENT_SHALLOW)) {
> +                       object->flags |= CLIENT_SHALLOW;
> +                       add_object_array(object, NULL, shallows);
> +               }
> +               return 1;
> +       }
> +
> +       return 0;
> +}
> +
> +static int process_deepen(const char *line, int *depth)
> +{
> +       const char *arg;
> +       if (skip_prefix(line, "deepen ", &arg)) {
> +               char *end = NULL;
> +               *depth = (int) strtol(arg, &end, 0);
> +               if (!end || *end || *depth <= 0)
> +                       die("Invalid deepen: %s", line);
> +               return 1;
> +       }
> +
> +       return 0;
> +}
> +
> +static int process_deepen_since(const char *line, timestamp_t *deepen_since, int *deepen_rev_list)
> +{
> +       const char *arg;
> +       if (skip_prefix(line, "deepen-since ", &arg)) {
> +               char *end = NULL;
> +               *deepen_since = parse_timestamp(arg, &end, 0);
> +               if (!end || *end || !deepen_since ||
> +                   /* revisions.c's max_age -1 is special */
> +                   *deepen_since == -1)
> +                       die("Invalid deepen-since: %s", line);
> +               *deepen_rev_list = 1;
> +               return 1;
> +       }
> +       return 0;
> +}
> +
> +static int process_deepen_not(const char *line, struct string_list *deepen_not, int *deepen_rev_list)
> +{
> +       const char *arg;
> +       if (skip_prefix(line, "deepen-not ", &arg)) {
> +               char *ref = NULL;
> +               struct object_id oid;
> +               if (expand_ref(arg, strlen(arg), &oid, &ref) != 1)
> +                       die("git upload-pack: ambiguous deepen-not: %s", line);
> +               string_list_append(deepen_not, ref);
> +               free(ref);
> +               *deepen_rev_list = 1;
> +               return 1;
> +       }
> +       return 0;
> +}
> +
>  static void receive_needs(void)
>  {
>         struct object_array shallows = OBJECT_ARRAY_INIT;
> @@ -745,49 +814,15 @@ static void receive_needs(void)
>                 if (!line)
>                         break;
>
> -               if (skip_prefix(line, "shallow ", &arg)) {
> -                       struct object_id oid;
> -                       struct object *object;
> -                       if (get_oid_hex(arg, &oid))
> -                               die("invalid shallow line: %s", line);
> -                       object = parse_object(&oid);
> -                       if (!object)
> -                               continue;
> -                       if (object->type != OBJ_COMMIT)
> -                               die("invalid shallow object %s", oid_to_hex(&oid));
> -                       if (!(object->flags & CLIENT_SHALLOW)) {
> -                               object->flags |= CLIENT_SHALLOW;
> -                               add_object_array(object, NULL, &shallows);
> -                       }
> +               if (process_shallow(line, &shallows))
>                         continue;
> -               }
> -               if (skip_prefix(line, "deepen ", &arg)) {
> -                       char *end = NULL;
> -                       depth = strtol(arg, &end, 0);
> -                       if (!end || *end || depth <= 0)
> -                               die("Invalid deepen: %s", line);
> +               if (process_deepen(line, &depth))
>                         continue;
> -               }
> -               if (skip_prefix(line, "deepen-since ", &arg)) {
> -                       char *end = NULL;
> -                       deepen_since = parse_timestamp(arg, &end, 0);
> -                       if (!end || *end || !deepen_since ||
> -                           /* revisions.c's max_age -1 is special */
> -                           deepen_since == -1)
> -                               die("Invalid deepen-since: %s", line);
> -                       deepen_rev_list = 1;
> +               if (process_deepen_since(line, &deepen_since, &deepen_rev_list))
>                         continue;
> -               }
> -               if (skip_prefix(line, "deepen-not ", &arg)) {
> -                       char *ref = NULL;
> -                       struct object_id oid;
> -                       if (expand_ref(arg, strlen(arg), &oid, &ref) != 1)
> -                               die("git upload-pack: ambiguous deepen-not: %s", line);
> -                       string_list_append(&deepen_not, ref);
> -                       free(ref);
> -                       deepen_rev_list = 1;
> +               if (process_deepen_not(line, &deepen_not, &deepen_rev_list))
>                         continue;
> -               }
> +
>                 if (!skip_prefix(line, "want ", &arg) ||
>                     get_oid_hex(arg, &oid_buf))
>                         die("git upload-pack: protocol error, "
> --
> 2.16.0.rc1.238.g530d649a79-goog
>


* Re: [PATCH v2 05/27] upload-pack: factor out processing lines
  2018-01-26 20:12     ` Stefan Beller
@ 2018-01-26 21:33       ` Brandon Williams
  2018-01-31 14:08         ` Derrick Stolee
  0 siblings, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-01-26 21:33 UTC (permalink / raw)
  To: Stefan Beller
  Cc: git, Junio C Hamano, Jeff King, Philip Oakley, Derrick Stolee,
	Jonathan Nieder

On 01/26, Stefan Beller wrote:
> On Thu, Jan 25, 2018 at 3:58 PM, Brandon Williams <bmwill@google.com> wrote:
> > Factor out the logic for processing shallow, deepen, deepen_since, and
> > deepen_not lines into their own functions to simplify the
> > 'receive_needs()' function in addition to making it easier to reuse some
> > of this logic when implementing protocol_v2.
> >
> > Signed-off-by: Brandon Williams <bmwill@google.com>
> > ---
> >  upload-pack.c | 113 ++++++++++++++++++++++++++++++++++++++--------------------
> >  1 file changed, 74 insertions(+), 39 deletions(-)
> >
> > diff --git a/upload-pack.c b/upload-pack.c
> > index 2ad73a98b..42d83d5b1 100644
> > --- a/upload-pack.c
> > +++ b/upload-pack.c
> > @@ -724,6 +724,75 @@ static void deepen_by_rev_list(int ac, const char **av,
> >         packet_flush(1);
> >  }
> >
> > +static int process_shallow(const char *line, struct object_array *shallows)
> > +{
> > +       const char *arg;
> > +       if (skip_prefix(line, "shallow ", &arg)) {
> 
> stylistic nit:
> 
>     You could invert the condition in each of the process_* functions
>     to just have
> 
>         if (!skip_prefix...))
>             return 0
> 
>         /* less indented code goes here */
> 
>         return 1;
> 
>     That way we have less indentation as well as easier code.
>     (The reader doesn't need to keep in mind what the else
>     part is about; it is a rather local decision to bail out instead
>     of having the return at the end of the function.)

I was trying to move the existing code into helper functions so
rewriting them in transit may make it less reviewable?

> 
> > +               struct object_id oid;
> > +               struct object *object;
> > +               if (get_oid_hex(arg, &oid))
> > +                       die("invalid shallow line: %s", line);
> > +               object = parse_object(&oid);
> > +               if (!object)
> > +                       return 1;
> > +               if (object->type != OBJ_COMMIT)
> > +                       die("invalid shallow object %s", oid_to_hex(&oid));
> > +               if (!(object->flags & CLIENT_SHALLOW)) {
> > +                       object->flags |= CLIENT_SHALLOW;
> > +                       add_object_array(object, NULL, shallows);
> > +               }
> > +               return 1;
> > +       }
> > +
> > +       return 0;
> > +}
> > +
> > +static int process_deepen(const char *line, int *depth)
> > +{
> > +       const char *arg;
> > +       if (skip_prefix(line, "deepen ", &arg)) {
> > +               char *end = NULL;
> > +               *depth = (int) strtol(arg, &end, 0);
> > +               if (!end || *end || *depth <= 0)
> > +                       die("Invalid deepen: %s", line);
> > +               return 1;
> > +       }
> > +
> > +       return 0;
> > +}
> > +
> > +static int process_deepen_since(const char *line, timestamp_t *deepen_since, int *deepen_rev_list)
> > +{
> > +       const char *arg;
> > +       if (skip_prefix(line, "deepen-since ", &arg)) {
> > +               char *end = NULL;
> > +               *deepen_since = parse_timestamp(arg, &end, 0);
> > +               if (!end || *end || !deepen_since ||
> > +                   /* revisions.c's max_age -1 is special */
> > +                   *deepen_since == -1)
> > +                       die("Invalid deepen-since: %s", line);
> > +               *deepen_rev_list = 1;
> > +               return 1;
> > +       }
> > +       return 0;
> > +}
> > +
> > +static int process_deepen_not(const char *line, struct string_list *deepen_not, int *deepen_rev_list)
> > +{
> > +       const char *arg;
> > +       if (skip_prefix(line, "deepen-not ", &arg)) {
> > +               char *ref = NULL;
> > +               struct object_id oid;
> > +               if (expand_ref(arg, strlen(arg), &oid, &ref) != 1)
> > +                       die("git upload-pack: ambiguous deepen-not: %s", line);
> > +               string_list_append(deepen_not, ref);
> > +               free(ref);
> > +               *deepen_rev_list = 1;
> > +               return 1;
> > +       }
> > +       return 0;
> > +}
> > +
> >  static void receive_needs(void)
> >  {
> >         struct object_array shallows = OBJECT_ARRAY_INIT;
> > @@ -745,49 +814,15 @@ static void receive_needs(void)
> >                 if (!line)
> >                         break;
> >
> > -               if (skip_prefix(line, "shallow ", &arg)) {
> > -                       struct object_id oid;
> > -                       struct object *object;
> > -                       if (get_oid_hex(arg, &oid))
> > -                               die("invalid shallow line: %s", line);
> > -                       object = parse_object(&oid);
> > -                       if (!object)
> > -                               continue;
> > -                       if (object->type != OBJ_COMMIT)
> > -                               die("invalid shallow object %s", oid_to_hex(&oid));
> > -                       if (!(object->flags & CLIENT_SHALLOW)) {
> > -                               object->flags |= CLIENT_SHALLOW;
> > -                               add_object_array(object, NULL, &shallows);
> > -                       }
> > +               if (process_shallow(line, &shallows))
> >                         continue;
> > -               }
> > -               if (skip_prefix(line, "deepen ", &arg)) {
> > -                       char *end = NULL;
> > -                       depth = strtol(arg, &end, 0);
> > -                       if (!end || *end || depth <= 0)
> > -                               die("Invalid deepen: %s", line);
> > +               if (process_deepen(line, &depth))
> >                         continue;
> > -               }
> > -               if (skip_prefix(line, "deepen-since ", &arg)) {
> > -                       char *end = NULL;
> > -                       deepen_since = parse_timestamp(arg, &end, 0);
> > -                       if (!end || *end || !deepen_since ||
> > -                           /* revisions.c's max_age -1 is special */
> > -                           deepen_since == -1)
> > -                               die("Invalid deepen-since: %s", line);
> > -                       deepen_rev_list = 1;
> > +               if (process_deepen_since(line, &deepen_since, &deepen_rev_list))
> >                         continue;
> > -               }
> > -               if (skip_prefix(line, "deepen-not ", &arg)) {
> > -                       char *ref = NULL;
> > -                       struct object_id oid;
> > -                       if (expand_ref(arg, strlen(arg), &oid, &ref) != 1)
> > -                               die("git upload-pack: ambiguous deepen-not: %s", line);
> > -                       string_list_append(&deepen_not, ref);
> > -                       free(ref);
> > -                       deepen_rev_list = 1;
> > +               if (process_deepen_not(line, &deepen_not, &deepen_rev_list))
> >                         continue;
> > -               }
> > +
> >                 if (!skip_prefix(line, "want ", &arg) ||
> >                     get_oid_hex(arg, &oid_buf))
> >                         die("git upload-pack: protocol error, "
> > --
> > 2.16.0.rc1.238.g530d649a79-goog
> >

-- 
Brandon Williams


* Re: [PATCH v2 13/27] ls-refs: introduce ls-refs server command
  2018-01-25 23:58   ` [PATCH v2 13/27] ls-refs: introduce ls-refs server command Brandon Williams
@ 2018-01-26 22:20     ` Stefan Beller
  2018-02-02 22:31       ` Brandon Williams
  0 siblings, 1 reply; 362+ messages in thread
From: Stefan Beller @ 2018-01-26 22:20 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, Junio C Hamano, Jeff King, Philip Oakley, Derrick Stolee,
	Jonathan Nieder

On Thu, Jan 25, 2018 at 3:58 PM, Brandon Williams <bmwill@google.com> wrote:

> +ls-refs takes in the following parameters wrapped in packet-lines:
> +
> +    symrefs
> +       In addition to the object pointed by it, show the underlying ref
> +       pointed by it when showing a symbolic ref.
> +    peel
> +       Show peeled tags.

Would it make sense to default these two to on, and instead have
optional no-symrefs and no-peel?

That would save bandwidth in the default case, I would think.

> +       cat >expect <<-EOF &&
> +       $(git rev-parse HEAD) HEAD
> +       $(git rev-parse refs/heads/dev) refs/heads/dev
> +       $(git rev-parse refs/heads/master) refs/heads/master
> +       $(git rev-parse refs/heads/release) refs/heads/release
> +       $(git rev-parse refs/tags/annotated-tag) refs/tags/annotated-tag
> +       $(git rev-parse refs/tags/one) refs/tags/one
> +       $(git rev-parse refs/tags/two) refs/tags/two

Invoking rev-parse quite a few times? I think the test suite is a
trade-off between readability ("what we expect the test to do and test")
and speed (specifically on Windows, where forking is expensive).

I tried to come up with a more concise way to create this expectation
using git-rev-parse, but did not find a good way to do so.

However maybe

  git for-each-ref --format='%(*objectname) %(*refname)' >expect

might help in reproducing the expected message? The downside
of this would be to have to closely guard which refs are there though.
I guess the '--pattern' could help there as it may be the same pattern
as the input to the ls-refs. This might be too abstract for a test though.

I dunno.

Stefan


* Re: [PATCH v2 05/27] upload-pack: factor out processing lines
  2018-01-26 21:33       ` Brandon Williams
@ 2018-01-31 14:08         ` Derrick Stolee
  0 siblings, 0 replies; 362+ messages in thread
From: Derrick Stolee @ 2018-01-31 14:08 UTC (permalink / raw)
  To: Brandon Williams, Stefan Beller
  Cc: git, Junio C Hamano, Jeff King, Philip Oakley, Jonathan Nieder

On 1/26/2018 4:33 PM, Brandon Williams wrote:
> On 01/26, Stefan Beller wrote:
>> On Thu, Jan 25, 2018 at 3:58 PM, Brandon Williams <bmwill@google.com> wrote:
>>> Factor out the logic for processing shallow, deepen, deepen_since, and
>>> deepen_not lines into their own functions to simplify the
>>> 'receive_needs()' function in addition to making it easier to reuse some
>>> of this logic when implementing protocol_v2.
>>>
>>> Signed-off-by: Brandon Williams <bmwill@google.com>
>>> ---
>>>   upload-pack.c | 113 ++++++++++++++++++++++++++++++++++++++--------------------
>>>   1 file changed, 74 insertions(+), 39 deletions(-)
>>>
>>> diff --git a/upload-pack.c b/upload-pack.c
>>> index 2ad73a98b..42d83d5b1 100644
>>> --- a/upload-pack.c
>>> +++ b/upload-pack.c
>>> @@ -724,6 +724,75 @@ static void deepen_by_rev_list(int ac, const char **av,
>>>          packet_flush(1);
>>>   }
>>>
>>> +static int process_shallow(const char *line, struct object_array *shallows)
>>> +{
>>> +       const char *arg;
>>> +       if (skip_prefix(line, "shallow ", &arg)) {
>> stylistic nit:
>>
>>      You could invert the condition in each of the process_* functions
>>      to just have
>>
>>          if (!skip_prefix(...))
>>              return 0;
>>
>>          /* less indented code goes here */
>>
>>          return 1;
>>
>>      That way we have less indentation as well as easier code.
>>      (The reader doesn't need to keep in mind what the else
>>      part is about; it is a rather local decision to bail out instead
>>      of having the return at the end of the function.)
> I was trying to move the existing code into helper functions so
> rewriting them in transit may make it less reviewable?

I think the way you kept to the existing code as much as possible is 
good and easier to review. Perhaps a style pass after the patch lands is 
good for #leftoverbits.
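
For reference, the inverted-condition shape Stefan describes could look
roughly like this for process_deepen(). It is only a sketch for a later
style pass, not part of the patch: skip_prefix_sketch() is a stand-in for
git's skip_prefix(), and the die() call is replaced by a -1 return so the
snippet compiles on its own.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for git's skip_prefix(); not the real implementation. */
static int skip_prefix_sketch(const char *str, const char *prefix,
			      const char **out)
{
	size_t len = strlen(prefix);

	if (strncmp(str, prefix, len))
		return 0;
	*out = str + len;
	return 1;
}

/*
 * Returns 1 if the line was a "deepen" line, 0 if it was not, and
 * -1 where upload-pack.c would die("Invalid deepen: %s", line).
 */
static int process_deepen_sketch(const char *line, int *depth)
{
	const char *arg;
	char *end = NULL;

	assert(line && depth);

	/* Bail out early so the main path needs no extra indentation. */
	if (!skip_prefix_sketch(line, "deepen ", &arg))
		return 0;

	*depth = (int)strtol(arg, &end, 0);
	if (!end || *end || *depth <= 0)
		return -1;

	return 1;
}
```

The behavior is meant to match the quoted helper; only the control flow
changes, which is why it seems better suited to a follow-up than to this
code-movement patch.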

>>
>>> +               struct object_id oid;
>>> +               struct object *object;
>>> +               if (get_oid_hex(arg, &oid))
>>> +                       die("invalid shallow line: %s", line);
>>> +               object = parse_object(&oid);
>>> +               if (!object)
>>> +                       return 1;
>>> +               if (object->type != OBJ_COMMIT)
>>> +                       die("invalid shallow object %s", oid_to_hex(&oid));
>>> +               if (!(object->flags & CLIENT_SHALLOW)) {
>>> +                       object->flags |= CLIENT_SHALLOW;
>>> +                       add_object_array(object, NULL, shallows);
>>> +               }
>>> +               return 1;
>>> +       }
>>> +
>>> +       return 0;
>>> +}
>>> +
>>> +static int process_deepen(const char *line, int *depth)
>>> +{
>>> +       const char *arg;
>>> +       if (skip_prefix(line, "deepen ", &arg)) {
>>> +               char *end = NULL;
>>> +               *depth = (int) strtol(arg, &end, 0);

nit: space between (int) and strtol?

>>> +               if (!end || *end || *depth <= 0)
>>> +                       die("Invalid deepen: %s", line);
>>> +               return 1;
>>> +       }
>>> +
>>> +       return 0;
>>> +}
>>> +
>>> +static int process_deepen_since(const char *line, timestamp_t *deepen_since, int *deepen_rev_list)
>>> +{
>>> +       const char *arg;
>>> +       if (skip_prefix(line, "deepen-since ", &arg)) {
>>> +               char *end = NULL;
>>> +               *deepen_since = parse_timestamp(arg, &end, 0);
>>> +               if (!end || *end || !deepen_since ||
>>> +                   /* revisions.c's max_age -1 is special */
>>> +                   *deepen_since == -1)
>>> +                       die("Invalid deepen-since: %s", line);
>>> +               *deepen_rev_list = 1;
>>> +               return 1;
>>> +       }
>>> +       return 0;
>>> +}
>>> +
>>> +static int process_deepen_not(const char *line, struct string_list *deepen_not, int *deepen_rev_list)
>>> +{
>>> +       const char *arg;
>>> +       if (skip_prefix(line, "deepen-not ", &arg)) {
>>> +               char *ref = NULL;
>>> +               struct object_id oid;
>>> +               if (expand_ref(arg, strlen(arg), &oid, &ref) != 1)
>>> +                       die("git upload-pack: ambiguous deepen-not: %s", line);
>>> +               string_list_append(deepen_not, ref);
>>> +               free(ref);
>>> +               *deepen_rev_list = 1;
>>> +               return 1;
>>> +       }
>>> +       return 0;
>>> +}
>>> +
>>>   static void receive_needs(void)
>>>   {
>>>          struct object_array shallows = OBJECT_ARRAY_INIT;
>>> @@ -745,49 +814,15 @@ static void receive_needs(void)
>>>                  if (!line)
>>>                          break;
>>>
>>> -               if (skip_prefix(line, "shallow ", &arg)) {
>>> -                       struct object_id oid;
>>> -                       struct object *object;
>>> -                       if (get_oid_hex(arg, &oid))
>>> -                               die("invalid shallow line: %s", line);
>>> -                       object = parse_object(&oid);
>>> -                       if (!object)
>>> -                               continue;
>>> -                       if (object->type != OBJ_COMMIT)
>>> -                               die("invalid shallow object %s", oid_to_hex(&oid));
>>> -                       if (!(object->flags & CLIENT_SHALLOW)) {
>>> -                               object->flags |= CLIENT_SHALLOW;
>>> -                               add_object_array(object, NULL, &shallows);
>>> -                       }
>>> +               if (process_shallow(line, &shallows))
>>>                          continue;
>>> -               }
>>> -               if (skip_prefix(line, "deepen ", &arg)) {
>>> -                       char *end = NULL;
>>> -                       depth = strtol(arg, &end, 0);
>>> -                       if (!end || *end || depth <= 0)
>>> -                               die("Invalid deepen: %s", line);
>>> +               if (process_deepen(line, &depth))
>>>                          continue;
>>> -               }
>>> -               if (skip_prefix(line, "deepen-since ", &arg)) {
>>> -                       char *end = NULL;
>>> -                       deepen_since = parse_timestamp(arg, &end, 0);
>>> -                       if (!end || *end || !deepen_since ||
>>> -                           /* revisions.c's max_age -1 is special */
>>> -                           deepen_since == -1)
>>> -                               die("Invalid deepen-since: %s", line);
>>> -                       deepen_rev_list = 1;
>>> +               if (process_deepen_since(line, &deepen_since, &deepen_rev_list))
>>>                          continue;
>>> -               }
>>> -               if (skip_prefix(line, "deepen-not ", &arg)) {
>>> -                       char *ref = NULL;
>>> -                       struct object_id oid;
>>> -                       if (expand_ref(arg, strlen(arg), &oid, &ref) != 1)
>>> -                               die("git upload-pack: ambiguous deepen-not: %s", line);
>>> -                       string_list_append(&deepen_not, ref);
>>> -                       free(ref);
>>> -                       deepen_rev_list = 1;
>>> +               if (process_deepen_not(line, &deepen_not, &deepen_rev_list))
>>>                          continue;
>>> -               }
>>> +
>>>                  if (!skip_prefix(line, "want ", &arg) ||
>>>                      get_oid_hex(arg, &oid_buf))
>>>                          die("git upload-pack: protocol error, "
>>> --
>>> 2.16.0.rc1.238.g530d649a79-goog
>>>


^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v2 08/27] connect: discover protocol version outside of get_remote_heads
  2018-01-25 23:58   ` [PATCH v2 08/27] connect: discover protocol version outside of get_remote_heads Brandon Williams
@ 2018-01-31 14:40     ` Derrick Stolee
  2018-02-01 17:57       ` Brandon Williams
  0 siblings, 1 reply; 362+ messages in thread
From: Derrick Stolee @ 2018-01-31 14:40 UTC (permalink / raw)
  To: Brandon Williams, git; +Cc: sbeller, gitster, peff, philipoakley, jrnieder

On 1/25/2018 6:58 PM, Brandon Williams wrote:
> In order to prepare for the addition of protocol_v2, push the protocol
> version discovery outside of 'get_remote_heads()'.  This will allow for
> keeping the logic for processing the reference advertisement for
> protocol_v1 and protocol_v0 separate from the logic for protocol_v2.
>
> Signed-off-by: Brandon Williams <bmwill@google.com>
> ---
>   builtin/fetch-pack.c | 16 +++++++++++++++-
>   builtin/send-pack.c  | 17 +++++++++++++++--
>   connect.c            | 27 ++++++++++-----------------
>   connect.h            |  3 +++
>   remote-curl.c        | 20 ++++++++++++++++++--
>   remote.h             |  5 +++--
>   transport.c          | 24 +++++++++++++++++++-----
>   7 files changed, 83 insertions(+), 29 deletions(-)
>
> diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
> index 366b9d13f..85d4faf76 100644
> --- a/builtin/fetch-pack.c
> +++ b/builtin/fetch-pack.c
> @@ -4,6 +4,7 @@
>   #include "remote.h"
>   #include "connect.h"
>   #include "sha1-array.h"
> +#include "protocol.h"
>   
>   static const char fetch_pack_usage[] =
>   "git fetch-pack [--all] [--stdin] [--quiet | -q] [--keep | -k] [--thin] "
> @@ -52,6 +53,7 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix)
>   	struct fetch_pack_args args;
>   	struct oid_array shallow = OID_ARRAY_INIT;
>   	struct string_list deepen_not = STRING_LIST_INIT_DUP;
> +	struct packet_reader reader;
>   
>   	packet_trace_identity("fetch-pack");
>   
> @@ -193,7 +195,19 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix)
>   		if (!conn)
>   			return args.diag_url ? 0 : 1;
>   	}
> -	get_remote_heads(fd[0], NULL, 0, &ref, 0, NULL, &shallow);
> +
> +	packet_reader_init(&reader, fd[0], NULL, 0,
> +			   PACKET_READ_CHOMP_NEWLINE |
> +			   PACKET_READ_GENTLE_ON_EOF);
> +
> +	switch (discover_version(&reader)) {
> +	case protocol_v1:
> +	case protocol_v0:
> +		get_remote_heads(&reader, &ref, 0, NULL, &shallow);
> +		break;
> +	case protocol_unknown_version:
> +		BUG("unknown protocol version");

Is this really a BUG in the client, or a bug/incompatibility in the server?

Perhaps I'm misunderstanding, but it looks like discover_version() will 
die() on an unknown version (the die() is in 
protocol.c:determine_protocol_version_client()). So maybe that's why 
this is a BUG()?

If there is something to change here, this BUG() appears three more times.
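
To spell out my reading: if version discovery already rejects unknown
input, the unknown_version case in each caller is unreachable, and BUG()
is the right marker for it. A minimal sketch of that shape, assuming the
enum values from protocol.h; the *_sketch names are hypothetical, and a
'rejected' flag stands in for the die() so the snippet can run on its own.

```c
#include <assert.h>
#include <string.h>

enum protocol_version {
	protocol_unknown_version = -1,
	protocol_v0 = 0,
	protocol_v1 = 1,
};

/* Modeled on parse_protocol_version(); the "0" comparison is assumed. */
static enum protocol_version parse_version_sketch(const char *value)
{
	if (!strcmp(value, "0"))
		return protocol_v0;
	else if (!strcmp(value, "1"))
		return protocol_v1;
	else
		return protocol_unknown_version;
}

/*
 * Hypothetical stand-in for discover_version(): unknown input is
 * rejected here (where git would die()), so callers should only ever
 * see a known version. *rejected is set instead of exiting.
 */
static enum protocol_version discover_version_sketch(const char *value,
						     int *rejected)
{
	enum protocol_version version = parse_version_sketch(value);

	assert(rejected);
	*rejected = (version == protocol_unknown_version);
	return version;
}

/* Returns 1 for the v0/v1 path and -1 where the callers have BUG(). */
static int handle_version_sketch(enum protocol_version version)
{
	switch (version) {
	case protocol_v1:
	case protocol_v0:
		return 1;
	case protocol_unknown_version:
		return -1; /* BUG("unknown protocol version") in the patch */
	}
	return -1;
}
```
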

> +	}
>   
>   	ref = fetch_pack(&args, fd, conn, ref, dest, sought, nr_sought,
>   			 &shallow, pack_lockfile_ptr);
> diff --git a/builtin/send-pack.c b/builtin/send-pack.c
> index fc4f0bb5f..83cb125a6 100644
> --- a/builtin/send-pack.c
> +++ b/builtin/send-pack.c
> @@ -14,6 +14,7 @@
>   #include "sha1-array.h"
>   #include "gpg-interface.h"
>   #include "gettext.h"
> +#include "protocol.h"
>   
>   static const char * const send_pack_usage[] = {
>   	N_("git send-pack [--all | --mirror] [--dry-run] [--force] "
> @@ -154,6 +155,7 @@ int cmd_send_pack(int argc, const char **argv, const char *prefix)
>   	int progress = -1;
>   	int from_stdin = 0;
>   	struct push_cas_option cas = {0};
> +	struct packet_reader reader;
>   
>   	struct option options[] = {
>   		OPT__VERBOSITY(&verbose),
> @@ -256,8 +258,19 @@ int cmd_send_pack(int argc, const char **argv, const char *prefix)
>   			args.verbose ? CONNECT_VERBOSE : 0);
>   	}
>   
> -	get_remote_heads(fd[0], NULL, 0, &remote_refs, REF_NORMAL,
> -			 &extra_have, &shallow);
> +	packet_reader_init(&reader, fd[0], NULL, 0,
> +			   PACKET_READ_CHOMP_NEWLINE |
> +			   PACKET_READ_GENTLE_ON_EOF);
> +
> +	switch (discover_version(&reader)) {
> +	case protocol_v1:
> +	case protocol_v0:
> +		get_remote_heads(&reader, &remote_refs, REF_NORMAL,
> +				 &extra_have, &shallow);
> +		break;
> +	case protocol_unknown_version:
> +		BUG("unknown protocol version");
> +	}
>   
>   	transport_verify_remote_names(nr_refspecs, refspecs);
>   
> diff --git a/connect.c b/connect.c
> index 00e90075c..db3c9d24c 100644
> --- a/connect.c
> +++ b/connect.c
> @@ -62,7 +62,7 @@ static void die_initial_contact(int unexpected)
>   		      "and the repository exists."));
>   }
>   
> -static enum protocol_version discover_version(struct packet_reader *reader)
> +enum protocol_version discover_version(struct packet_reader *reader)
>   {
>   	enum protocol_version version = protocol_unknown_version;
>   
> @@ -234,7 +234,7 @@ enum get_remote_heads_state {
>   /*
>    * Read all the refs from the other end
>    */
> -struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
> +struct ref **get_remote_heads(struct packet_reader *reader,
>   			      struct ref **list, unsigned int flags,
>   			      struct oid_array *extra_have,
>   			      struct oid_array *shallow_points)
> @@ -242,24 +242,17 @@ struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
>   	struct ref **orig_list = list;
>   	int len = 0;
>   	enum get_remote_heads_state state = EXPECTING_FIRST_REF;
> -	struct packet_reader reader;
>   	const char *arg;
>   
> -	packet_reader_init(&reader, in, src_buf, src_len,
> -			   PACKET_READ_CHOMP_NEWLINE |
> -			   PACKET_READ_GENTLE_ON_EOF);
> -
> -	discover_version(&reader);
> -
>   	*list = NULL;
>   
>   	while (state != EXPECTING_DONE) {
> -		switch (packet_reader_read(&reader)) {
> +		switch (packet_reader_read(reader)) {
>   		case PACKET_READ_EOF:
>   			die_initial_contact(1);
>   		case PACKET_READ_NORMAL:
> -			len = reader.pktlen;
> -			if (len > 4 && skip_prefix(reader.line, "ERR ", &arg))
> +			len = reader->pktlen;
> +			if (len > 4 && skip_prefix(reader->line, "ERR ", &arg))
>   				die("remote error: %s", arg);
>   			break;
>   		case PACKET_READ_FLUSH:
> @@ -271,22 +264,22 @@ struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
>   
>   		switch (state) {
>   		case EXPECTING_FIRST_REF:
> -			process_capabilities(reader.line, &len);
> -			if (process_dummy_ref(reader.line)) {
> +			process_capabilities(reader->line, &len);
> +			if (process_dummy_ref(reader->line)) {
>   				state = EXPECTING_SHALLOW;
>   				break;
>   			}
>   			state = EXPECTING_REF;
>   			/* fallthrough */
>   		case EXPECTING_REF:
> -			if (process_ref(reader.line, len, &list, flags, extra_have))
> +			if (process_ref(reader->line, len, &list, flags, extra_have))
>   				break;
>   			state = EXPECTING_SHALLOW;
>   			/* fallthrough */
>   		case EXPECTING_SHALLOW:
> -			if (process_shallow(reader.line, len, shallow_points))
> +			if (process_shallow(reader->line, len, shallow_points))
>   				break;
> -			die("protocol error: unexpected '%s'", reader.line);
> +			die("protocol error: unexpected '%s'", reader->line);
>   		case EXPECTING_DONE:
>   			break;
>   		}
> diff --git a/connect.h b/connect.h
> index 01f14cdf3..cdb8979dc 100644
> --- a/connect.h
> +++ b/connect.h
> @@ -13,4 +13,7 @@ extern int parse_feature_request(const char *features, const char *feature);
>   extern const char *server_feature_value(const char *feature, int *len_ret);
>   extern int url_is_local_not_ssh(const char *url);
>   
> +struct packet_reader;
> +extern enum protocol_version discover_version(struct packet_reader *reader);
> +
>   #endif
> diff --git a/remote-curl.c b/remote-curl.c
> index 0053b0954..9f6d07683 100644
> --- a/remote-curl.c
> +++ b/remote-curl.c
> @@ -1,6 +1,7 @@
>   #include "cache.h"
>   #include "config.h"
>   #include "remote.h"
> +#include "connect.h"
>   #include "strbuf.h"
>   #include "walker.h"
>   #include "http.h"
> @@ -13,6 +14,7 @@
>   #include "credential.h"
>   #include "sha1-array.h"
>   #include "send-pack.h"
> +#include "protocol.h"
>   
>   static struct remote *remote;
>   /* always ends with a trailing slash */
> @@ -176,8 +178,22 @@ static struct discovery *last_discovery;
>   static struct ref *parse_git_refs(struct discovery *heads, int for_push)
>   {
>   	struct ref *list = NULL;
> -	get_remote_heads(-1, heads->buf, heads->len, &list,
> -			 for_push ? REF_NORMAL : 0, NULL, &heads->shallow);
> +	struct packet_reader reader;
> +
> +	packet_reader_init(&reader, -1, heads->buf, heads->len,
> +			   PACKET_READ_CHOMP_NEWLINE |
> +			   PACKET_READ_GENTLE_ON_EOF);
> +
> +	switch (discover_version(&reader)) {
> +	case protocol_v1:
> +	case protocol_v0:
> +		get_remote_heads(&reader, &list, for_push ? REF_NORMAL : 0,
> +				 NULL, &heads->shallow);
> +		break;
> +	case protocol_unknown_version:
> +		BUG("unknown protocol version");
> +	}
> +
>   	return list;
>   }
>   
> diff --git a/remote.h b/remote.h
> index 1f6611be2..2016461df 100644
> --- a/remote.h
> +++ b/remote.h
> @@ -150,10 +150,11 @@ int check_ref_type(const struct ref *ref, int flags);
>   void free_refs(struct ref *ref);
>   
>   struct oid_array;
> -extern struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
> +struct packet_reader;
> +extern struct ref **get_remote_heads(struct packet_reader *reader,
>   				     struct ref **list, unsigned int flags,
>   				     struct oid_array *extra_have,
> -				     struct oid_array *shallow);
> +				     struct oid_array *shallow_points);
>   
>   int resolve_remote_symref(struct ref *ref, struct ref *list);
>   int ref_newer(const struct object_id *new_oid, const struct object_id *old_oid);
> diff --git a/transport.c b/transport.c
> index 8e8779096..63c3dbab9 100644
> --- a/transport.c
> +++ b/transport.c
> @@ -18,6 +18,7 @@
>   #include "sha1-array.h"
>   #include "sigchain.h"
>   #include "transport-internal.h"
> +#include "protocol.h"
>   
>   static void set_upstreams(struct transport *transport, struct ref *refs,
>   	int pretend)
> @@ -190,13 +191,26 @@ static int connect_setup(struct transport *transport, int for_push)
>   static struct ref *get_refs_via_connect(struct transport *transport, int for_push)
>   {
>   	struct git_transport_data *data = transport->data;
> -	struct ref *refs;
> +	struct ref *refs = NULL;
> +	struct packet_reader reader;
>   
>   	connect_setup(transport, for_push);
> -	get_remote_heads(data->fd[0], NULL, 0, &refs,
> -			 for_push ? REF_NORMAL : 0,
> -			 &data->extra_have,
> -			 &data->shallow);
> +
> +	packet_reader_init(&reader, data->fd[0], NULL, 0,
> +			   PACKET_READ_CHOMP_NEWLINE |
> +			   PACKET_READ_GENTLE_ON_EOF);
> +
> +	switch (discover_version(&reader)) {
> +	case protocol_v1:
> +	case protocol_v0:
> +		get_remote_heads(&reader, &refs,
> +				 for_push ? REF_NORMAL : 0,
> +				 &data->extra_have,
> +				 &data->shallow);
> +		break;
> +	case protocol_unknown_version:
> +		BUG("unknown protocol version");
> +	}
>   	data->got_remote_heads = 1;
>   
>   	return refs;


^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v2 09/27] transport: store protocol version
  2018-01-25 23:58   ` [PATCH v2 09/27] transport: store protocol version Brandon Williams
@ 2018-01-31 14:45     ` Derrick Stolee
  0 siblings, 0 replies; 362+ messages in thread
From: Derrick Stolee @ 2018-01-31 14:45 UTC (permalink / raw)
  To: Brandon Williams, git; +Cc: sbeller, gitster, peff, philipoakley, jrnieder

On 1/25/2018 6:58 PM, Brandon Williams wrote:
> +	switch (data->version) {
> +	case protocol_v1:
> +	case protocol_v0:
> +		refs = fetch_pack(&args, data->fd, data->conn,
> +				  refs_tmp ? refs_tmp : transport->remote_refs,
> +				  dest, to_fetch, nr_heads, &data->shallow,
> +				  &transport->pack_lockfile);
> +		break;
> +	case protocol_unknown_version:
> +		BUG("unknown protocol version");
> +	}

After seeing this pattern a few times, I think it would be good to
convert it to a macro that runs one statement for protocol_v1/v0 (and
later a different one for protocol_v2). At minimum, it would reduce
the duplicated handling of unknown_version, and we would have one place
that makes clear this BUG() is due to an unexpected response from
discover_version().


^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v2 10/27] protocol: introduce enum protocol_version value protocol_v2
  2018-01-25 23:58   ` [PATCH v2 10/27] protocol: introduce enum protocol_version value protocol_v2 Brandon Williams
@ 2018-01-31 14:54     ` Derrick Stolee
  2018-02-02 22:44       ` Brandon Williams
  0 siblings, 1 reply; 362+ messages in thread
From: Derrick Stolee @ 2018-01-31 14:54 UTC (permalink / raw)
  To: Brandon Williams, git; +Cc: sbeller, gitster, peff, philipoakley, jrnieder

On 1/25/2018 6:58 PM, Brandon Williams wrote:
> Introduce protocol_v2, a new value for 'enum protocol_version'.
> Subsequent patches will fill in the implementation of protocol_v2.
>
> Signed-off-by: Brandon Williams <bmwill@google.com>
> ---
>   builtin/fetch-pack.c   | 3 +++
>   builtin/receive-pack.c | 6 ++++++
>   builtin/send-pack.c    | 3 +++
>   builtin/upload-pack.c  | 7 +++++++
>   connect.c              | 3 +++
>   protocol.c             | 2 ++
>   protocol.h             | 1 +
>   remote-curl.c          | 3 +++
>   transport.c            | 9 +++++++++
>   9 files changed, 37 insertions(+)
>
> diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
> index 85d4faf76..f492e8abd 100644
> --- a/builtin/fetch-pack.c
> +++ b/builtin/fetch-pack.c
> @@ -201,6 +201,9 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix)
>   			   PACKET_READ_GENTLE_ON_EOF);
>   
>   	switch (discover_version(&reader)) {
> +	case protocol_v2:
> +		die("support for protocol v2 not implemented yet");
> +		break;
>   	case protocol_v1:
>   	case protocol_v0:
>   		get_remote_heads(&reader, &ref, 0, NULL, &shallow);
> diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
> index b7ce7c7f5..3656e94fd 100644
> --- a/builtin/receive-pack.c
> +++ b/builtin/receive-pack.c
> @@ -1963,6 +1963,12 @@ int cmd_receive_pack(int argc, const char **argv, const char *prefix)
>   		unpack_limit = receive_unpack_limit;
>   
>   	switch (determine_protocol_version_server()) {
> +	case protocol_v2:
> +		/*
> +		 * push support for protocol v2 has not been implemented yet,
> +		 * so ignore the request to use v2 and fallback to using v0.
> +		 */
> +		break;
>   	case protocol_v1:
>   		/*
>   		 * v1 is just the original protocol with a version string,
> diff --git a/builtin/send-pack.c b/builtin/send-pack.c
> index 83cb125a6..b5427f75e 100644
> --- a/builtin/send-pack.c
> +++ b/builtin/send-pack.c
> @@ -263,6 +263,9 @@ int cmd_send_pack(int argc, const char **argv, const char *prefix)
>   			   PACKET_READ_GENTLE_ON_EOF);
>   
>   	switch (discover_version(&reader)) {
> +	case protocol_v2:
> +		die("support for protocol v2 not implemented yet");
> +		break;
>   	case protocol_v1:
>   	case protocol_v0:
>   		get_remote_heads(&reader, &remote_refs, REF_NORMAL,
> diff --git a/builtin/upload-pack.c b/builtin/upload-pack.c
> index 2cb5cb35b..8d53e9794 100644
> --- a/builtin/upload-pack.c
> +++ b/builtin/upload-pack.c
> @@ -47,6 +47,13 @@ int cmd_upload_pack(int argc, const char **argv, const char *prefix)
>   		die("'%s' does not appear to be a git repository", dir);
>   
>   	switch (determine_protocol_version_server()) {
> +	case protocol_v2:
> +		/*
> +		 * fetch support for protocol v2 has not been implemented yet,
> +		 * so ignore the request to use v2 and fallback to using v0.
> +		 */
> +		upload_pack(&opts);
> +		break;
>   	case protocol_v1:
>   		/*
>   		 * v1 is just the original protocol with a version string,
> diff --git a/connect.c b/connect.c
> index db3c9d24c..f2157a821 100644
> --- a/connect.c
> +++ b/connect.c
> @@ -84,6 +84,9 @@ enum protocol_version discover_version(struct packet_reader *reader)
>   
>   	/* Maybe process capabilities here, at least for v2 */
>   	switch (version) {
> +	case protocol_v2:
> +		die("support for protocol v2 not implemented yet");
> +		break;
>   	case protocol_v1:
>   		/* Read the peeked version line */
>   		packet_reader_read(reader);
> diff --git a/protocol.c b/protocol.c
> index 43012b7eb..5e636785d 100644
> --- a/protocol.c
> +++ b/protocol.c
> @@ -8,6 +8,8 @@ static enum protocol_version parse_protocol_version(const char *value)
>   		return protocol_v0;
>   	else if (!strcmp(value, "1"))
>   		return protocol_v1;
> +	else if (!strcmp(value, "2"))
> +		return protocol_v2;
>   	else
>   		return protocol_unknown_version;
>   }
> diff --git a/protocol.h b/protocol.h
> index 1b2bc94a8..2ad35e433 100644
> --- a/protocol.h
> +++ b/protocol.h
> @@ -5,6 +5,7 @@ enum protocol_version {
>   	protocol_unknown_version = -1,
>   	protocol_v0 = 0,
>   	protocol_v1 = 1,
> +	protocol_v2 = 2,
>   };
>   
>   /*
> diff --git a/remote-curl.c b/remote-curl.c
> index 9f6d07683..dae8a4a48 100644
> --- a/remote-curl.c
> +++ b/remote-curl.c
> @@ -185,6 +185,9 @@ static struct ref *parse_git_refs(struct discovery *heads, int for_push)
>   			   PACKET_READ_GENTLE_ON_EOF);
>   
>   	switch (discover_version(&reader)) {
> +	case protocol_v2:
> +		die("support for protocol v2 not implemented yet");
> +		break;
>   	case protocol_v1:
>   	case protocol_v0:
>   		get_remote_heads(&reader, &list, for_push ? REF_NORMAL : 0,
> diff --git a/transport.c b/transport.c
> index 2378dcb38..83d9dd1df 100644
> --- a/transport.c
> +++ b/transport.c
> @@ -203,6 +203,9 @@ static struct ref *get_refs_via_connect(struct transport *transport, int for_pus
>   
>   	data->version = discover_version(&reader);
>   	switch (data->version) {
> +	case protocol_v2:
> +		die("support for protocol v2 not implemented yet");
> +		break;
>   	case protocol_v1:
>   	case protocol_v0:
>   		get_remote_heads(&reader, &refs,
> @@ -250,6 +253,9 @@ static int fetch_refs_via_pack(struct transport *transport,
>   		refs_tmp = get_refs_via_connect(transport, 0);
>   
>   	switch (data->version) {
> +	case protocol_v2:
> +		die("support for protocol v2 not implemented yet");
> +		break;
>   	case protocol_v1:
>   	case protocol_v0:
>   		refs = fetch_pack(&args, data->fd, data->conn,
> @@ -585,6 +591,9 @@ static int git_transport_push(struct transport *transport, struct ref *remote_re
>   		args.push_cert = SEND_PACK_PUSH_CERT_NEVER;
>   
>   	switch (data->version) {
> +	case protocol_v2:
> +		die("support for protocol v2 not implemented yet");
> +		break;
>   	case protocol_v1:
>   	case protocol_v0:
>   		ret = send_pack(&args, data->fd, data->conn, remote_refs,

With a macro approach to version selection, this change becomes simpler 
in some ways and harder in others.

It is simpler in that we can have the macro from the previous commits 
just fall back to version 0 behavior.

It is harder in that this commit would need one of two options:

1. A macro that performs an arbitrary statement when given v2, which 
would be the die() for these actions not in v2.
2. A macro that clearly states v2 is not supported and calls die() on v2.

Here is my simple, untested attempt at a union of these options:

#define ON_PROTOCOL_VERSION(version,v0,v2) switch(version) {\
case protocol_v2:\
     (v2);\
     break;\
case protocol_v1:\
case protocol_v0:\
     (v0);\
     break;\
case protocol_unknown_version:\
     BUG("unknown protocol version");\
}
#define ON_PROTOCOL_VERSION_V0_FALLBACK(version,v0) switch(version) {\
case protocol_v2:\
case protocol_v1:\
case protocol_v0:\
     (v0);\
     break;\
case protocol_unknown_version:\
     BUG("unknown protocol version");\
}
#define ON_PROTOCOL_VERSION_V0_ONLY(version,v0) \
     ON_PROTOCOL_VERSION(version,v0,\
                 die("support for protocol v2 not implemented yet"))
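
If the macro route is taken, a do { ... } while (0) wrapper would let the
macro act as a single statement, for example in an unbraced if/else. The
following is a sketch under a few assumptions rather than a tested
proposal: the enum mirrors protocol.h as quoted above, BUG_SKETCH() stands
in for git's BUG(), and which_path_sketch() is a made-up caller. Unlike
the macros above, the arguments are pasted without parentheses, so they
may be statements as well as expressions.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

enum protocol_version {
	protocol_unknown_version = -1,
	protocol_v0 = 0,
	protocol_v1 = 1,
	protocol_v2 = 2,
};

/* Stand-in for git's BUG(); prints a message and aborts. */
#define BUG_SKETCH(msg) \
	do { \
		fprintf(stderr, "BUG: %s\n", (msg)); \
		abort(); \
	} while (0)

/* Runs 'v0' for protocol v0/v1 and 'v2' for protocol v2. */
#define ON_PROTOCOL_VERSION(version, v0, v2) \
	do { \
		switch (version) { \
		case protocol_v2: \
			v2; \
			break; \
		case protocol_v1: \
		case protocol_v0: \
			v0; \
			break; \
		case protocol_unknown_version: \
			BUG_SKETCH("unknown protocol version"); \
		} \
	} while (0)

/* Hypothetical caller: yields 0 for the v0/v1 path and 2 for v2. */
static int which_path_sketch(enum protocol_version version, int enabled)
{
	int path = -1;

	assert(version != protocol_unknown_version);

	/* The do/while wrapper lets the macro sit in an unbraced if/else. */
	if (enabled)
		ON_PROTOCOL_VERSION(version, path = 0, path = 2);
	else
		path = -1;

	return path;
}
```

A small static helper per call site might read better than macros with
statement arguments, but that is a matter of taste.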

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v2 14/27] connect: request remote refs using v2
  2018-01-25 23:58   ` [PATCH v2 14/27] connect: request remote refs using v2 Brandon Williams
@ 2018-01-31 15:22     ` Derrick Stolee
  2018-01-31 20:10       ` Eric Sunshine
  0 siblings, 1 reply; 362+ messages in thread
From: Derrick Stolee @ 2018-01-31 15:22 UTC (permalink / raw)
  To: Brandon Williams, git; +Cc: sbeller, gitster, peff, philipoakley, jrnieder

On 1/25/2018 6:58 PM, Brandon Williams wrote:
> Teach the client to be able to request a remote's refs using protocol
> v2.  This is done by having a client issue a 'ls-refs' request to a v2
> server.
>
> Signed-off-by: Brandon Williams <bmwill@google.com>
> ---
>   builtin/upload-pack.c  |  10 ++--
>   connect.c              | 123 ++++++++++++++++++++++++++++++++++++++++++++++++-
>   remote.h               |   4 ++
>   t/t5702-protocol-v2.sh |  28 +++++++++++
>   transport.c            |   2 +-
>   5 files changed, 160 insertions(+), 7 deletions(-)
>   create mode 100755 t/t5702-protocol-v2.sh
>
> diff --git a/builtin/upload-pack.c b/builtin/upload-pack.c
> index 8d53e9794..a757df8da 100644
> --- a/builtin/upload-pack.c
> +++ b/builtin/upload-pack.c
> @@ -5,6 +5,7 @@
>   #include "parse-options.h"
>   #include "protocol.h"
>   #include "upload-pack.h"
> +#include "serve.h"
>   
>   static const char * const upload_pack_usage[] = {
>   	N_("git upload-pack [<options>] <dir>"),
> @@ -16,6 +17,7 @@ int cmd_upload_pack(int argc, const char **argv, const char *prefix)
>   	const char *dir;
>   	int strict = 0;
>   	struct upload_pack_options opts = { 0 };
> +	struct serve_options serve_opts = SERVE_OPTIONS_INIT;
>   	struct option options[] = {
>   		OPT_BOOL(0, "stateless-rpc", &opts.stateless_rpc,
>   			 N_("quit after a single request/response exchange")),
> @@ -48,11 +50,9 @@ int cmd_upload_pack(int argc, const char **argv, const char *prefix)
>   
>   	switch (determine_protocol_version_server()) {
>   	case protocol_v2:
> -		/*
> -		 * fetch support for protocol v2 has not been implemented yet,
> -		 * so ignore the request to use v2 and fallback to using v0.
> -		 */
> -		upload_pack(&opts);
> +		serve_opts.advertise_capabilities = opts.advertise_refs;
> +		serve_opts.stateless_rpc = opts.stateless_rpc;
> +		serve(&serve_opts);
>   		break;
>   	case protocol_v1:
>   		/*
> diff --git a/connect.c b/connect.c
> index f2157a821..3c653b65b 100644
> --- a/connect.c
> +++ b/connect.c
> @@ -12,9 +12,11 @@
>   #include "sha1-array.h"
>   #include "transport.h"
>   #include "strbuf.h"
> +#include "version.h"
>   #include "protocol.h"
>   
>   static char *server_capabilities;
> +static struct argv_array server_capabilities_v2 = ARGV_ARRAY_INIT;
>   static const char *parse_feature_value(const char *, const char *, int *);
>   
>   static int check_ref(const char *name, unsigned int flags)
> @@ -62,6 +64,33 @@ static void die_initial_contact(int unexpected)
>   		      "and the repository exists."));
>   }
>   
> +/* Checks if the server supports the capability 'c' */
> +static int server_supports_v2(const char *c, int die_on_error)
> +{
> +	int i;
> +
> +	for (i = 0; i < server_capabilities_v2.argc; i++) {
> +		const char *out;
> +		if (skip_prefix(server_capabilities_v2.argv[i], c, &out) &&
> +		    (!*out || *out == '='))
> +			return 1;
> +	}
> +
> +	if (die_on_error)
> +		die("server doesn't support '%s'", c);
> +
> +	return 0;
> +}
> +
> +static void process_capabilities_v2(struct packet_reader *reader)
> +{
> +	while (packet_reader_read(reader) == PACKET_READ_NORMAL)
> +		argv_array_push(&server_capabilities_v2, reader->line);
> +
> +	if (reader->status != PACKET_READ_FLUSH)
> +		die("protocol error");
> +}
> +
>   enum protocol_version discover_version(struct packet_reader *reader)
>   {
>   	enum protocol_version version = protocol_unknown_version;
> @@ -85,7 +114,7 @@ enum protocol_version discover_version(struct packet_reader *reader)
>   	/* Maybe process capabilities here, at least for v2 */
>   	switch (version) {
>   	case protocol_v2:
> -		die("support for protocol v2 not implemented yet");
> +		process_capabilities_v2(reader);
>   		break;
>   	case protocol_v1:
>   		/* Read the peeked version line */
> @@ -293,6 +322,98 @@ struct ref **get_remote_heads(struct packet_reader *reader,
>   	return list;
>   }
>   
> +static int process_ref_v2(const char *line, struct ref ***list)
> +{
> +	int ret = 1;
> +	int i = 0;

nit: you set 'i' here, but first use it in a for loop with blank 
initializer. Perhaps keep the first assignment closer to the first use?

> +	struct object_id old_oid;
> +	struct ref *ref;
> +	struct string_list line_sections = STRING_LIST_INIT_DUP;
> +
> +	if (string_list_split(&line_sections, line, ' ', -1) < 2) {
> +		ret = 0;
> +		goto out;
> +	}
> +
> +	if (get_oid_hex(line_sections.items[i++].string, &old_oid)) {
> +		ret = 0;
> +		goto out;
> +	}
> +
> +	ref = alloc_ref(line_sections.items[i++].string);
> +
> +	oidcpy(&ref->old_oid, &old_oid);
> +	**list = ref;
> +	*list = &ref->next;
> +
> +	for (; i < line_sections.nr; i++) {
> +		const char *arg = line_sections.items[i].string;
> +		if (skip_prefix(arg, "symref-target:", &arg))
> +			ref->symref = xstrdup(arg);
> +
> +		if (skip_prefix(arg, "peeled:", &arg)) {
> +			struct object_id peeled_oid;
> +			char *peeled_name;
> +			struct ref *peeled;
> +			if (get_oid_hex(arg, &peeled_oid)) {
> +				ret = 0;
> +				goto out;
> +			}
> +
> +			peeled_name = xstrfmt("%s^{}", ref->name);
> +			peeled = alloc_ref(peeled_name);
> +
> +			oidcpy(&peeled->old_oid, &peeled_oid);
> +			**list = peeled;
> +			*list = &peeled->next;
> +
> +			free(peeled_name);
> +		}
> +	}
> +
> +out:
> +	string_list_clear(&line_sections, 0);
> +	return ret;
> +}
> +
> +struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
> +			     struct ref **list, int for_push,
> +			     const struct argv_array *ref_patterns)
> +{
> +	int i;
> +	*list = NULL;
> +
> +	/* Check that the server supports the ls-refs command */
> +	/* Issue request for ls-refs */
> +	if (server_supports_v2("ls-refs", 1))
> +		packet_write_fmt(fd_out, "command=ls-refs\n");
> +
> +	if (server_supports_v2("agent", 0))
> +	    packet_write_fmt(fd_out, "agent=%s", git_user_agent_sanitized());
> +
> +	packet_delim(fd_out);
> +	/* When pushing we don't want to request the peeled tags */
> +	if (!for_push)
> +		packet_write_fmt(fd_out, "peel\n");
> +	packet_write_fmt(fd_out, "symrefs\n");
> +	for (i = 0; ref_patterns && i < ref_patterns->argc; i++) {
> +		packet_write_fmt(fd_out, "ref-pattern %s\n",
> +				 ref_patterns->argv[i]);
> +	}
> +	packet_flush(fd_out);
> +
> +	/* Process response from server */
> +	while (packet_reader_read(reader) == PACKET_READ_NORMAL) {
> +		if (!process_ref_v2(reader->line, &list))
> +			die("invalid ls-refs response: %s", reader->line);
> +	}
> +
> +	if (reader->status != PACKET_READ_FLUSH)
> +		die("protocol error");
> +
> +	return list;
> +}
> +
>   static const char *parse_feature_value(const char *feature_list, const char *feature, int *lenp)
>   {
>   	int len;
> diff --git a/remote.h b/remote.h
> index 2016461df..21d0c776c 100644
> --- a/remote.h
> +++ b/remote.h
> @@ -151,10 +151,14 @@ void free_refs(struct ref *ref);
>   
>   struct oid_array;
>   struct packet_reader;
> +struct argv_array;
>   extern struct ref **get_remote_heads(struct packet_reader *reader,
>   				     struct ref **list, unsigned int flags,
>   				     struct oid_array *extra_have,
>   				     struct oid_array *shallow_points);
> +extern struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
> +				    struct ref **list, int for_push,
> +				    const struct argv_array *ref_patterns);
>   
>   int resolve_remote_symref(struct ref *ref, struct ref *list);
>   int ref_newer(const struct object_id *new_oid, const struct object_id *old_oid);
> diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh
> new file mode 100755
> index 000000000..4bf4d61ac
> --- /dev/null
> +++ b/t/t5702-protocol-v2.sh
> @@ -0,0 +1,28 @@
> +#!/bin/sh
> +
> +test_description='test git wire-protocol version 2'
> +
> +TEST_NO_CREATE_REPO=1
> +
> +. ./test-lib.sh
> +
> +# Test protocol v2 with 'file://' transport
> +#
> +test_expect_success 'create repo to be served by file:// transport' '
> +	git init file_parent &&
> +	test_commit -C file_parent one
> +'
> +
> +test_expect_success 'list refs with file:// using protocol v2' '
> +	GIT_TRACE_PACKET=1 git -c protocol.version=2 \
> +		ls-remote --symref "file://$(pwd)/file_parent" >actual 2>log &&
> +
> +	# Server responded using protocol v2
> +	cat log &&
> +	grep "git< version 2" log &&
> +
> +	git ls-remote --symref "file://$(pwd)/file_parent" >expect &&
> +	test_cmp actual expect
> +'
> +
> +test_done
> diff --git a/transport.c b/transport.c
> index 83d9dd1df..ffc6b2614 100644
> --- a/transport.c
> +++ b/transport.c
> @@ -204,7 +204,7 @@ static struct ref *get_refs_via_connect(struct transport *transport, int for_pus
>   	data->version = discover_version(&reader);
>   	switch (data->version) {
>   	case protocol_v2:
> -		die("support for protocol v2 not implemented yet");
> +		get_remote_refs(data->fd[1], &reader, &refs, for_push, NULL);
>   		break;
>   	case protocol_v1:
>   	case protocol_v0:



* Re: [PATCH v2 12/27] serve: introduce git-serve
  2018-01-25 23:58   ` [PATCH v2 12/27] serve: introduce git-serve Brandon Williams
  2018-01-26 10:39     ` Duy Nguyen
@ 2018-01-31 15:39     ` Derrick Stolee
  1 sibling, 0 replies; 362+ messages in thread
From: Derrick Stolee @ 2018-01-31 15:39 UTC (permalink / raw)
  To: Brandon Williams, git; +Cc: sbeller, gitster, peff, philipoakley, jrnieder

On 1/25/2018 6:58 PM, Brandon Williams wrote:
> Introduce git-serve, the base server for protocol version 2.
>
> Protocol version 2 is intended to be a replacement for Git's current
> wire protocol.  The intention is that it will be a simpler, less
> wasteful protocol which can evolve over time.
>
> Protocol version 2 improves upon version 1 by eliminating the initial
> ref advertisement.  In its place a server will export a list of
> capabilities and commands which it supports in a capability
> advertisement.  A client can then request that a particular command be
> executed by providing a number of capabilities and command specific
> parameters.  At the completion of a command, a client can request that
> another command be executed or can terminate the connection by sending a
> flush packet.
>
> Signed-off-by: Brandon Williams <bmwill@google.com>
> ---
>   .gitignore                              |   1 +
>   Documentation/technical/protocol-v2.txt | 117 +++++++++++++++
>   Makefile                                |   2 +
>   builtin.h                               |   1 +
>   builtin/serve.c                         |  30 ++++
>   git.c                                   |   1 +
>   serve.c                                 | 249 ++++++++++++++++++++++++++++++++
>   serve.h                                 |  15 ++
>   t/t5701-git-serve.sh                    |  56 +++++++
>   9 files changed, 472 insertions(+)
>   create mode 100644 Documentation/technical/protocol-v2.txt
>   create mode 100644 builtin/serve.c
>   create mode 100644 serve.c
>   create mode 100644 serve.h
>   create mode 100755 t/t5701-git-serve.sh
>
> diff --git a/.gitignore b/.gitignore
> index 833ef3b0b..2d0450c26 100644
> --- a/.gitignore
> +++ b/.gitignore
> @@ -140,6 +140,7 @@
>   /git-rm
>   /git-send-email
>   /git-send-pack
> +/git-serve
>   /git-sh-i18n
>   /git-sh-i18n--envsubst
>   /git-sh-setup
> diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
> new file mode 100644
> index 000000000..7f619a76c
> --- /dev/null
> +++ b/Documentation/technical/protocol-v2.txt
> @@ -0,0 +1,117 @@
> + Git Wire Protocol, Version 2
> +==============================
> +
> +This document presents a specification for a version 2 of Git's wire
> +protocol.  Protocol v2 will improve upon v1 in the following ways:
> +
> +  * Instead of multiple service names, multiple commands will be
> +    supported by a single service.

As someone unfamiliar with the old protocol code, this statement is 
underselling the architectural significance of your change. The new 
model allows a single service to handle all different wire protocols 
(git://, ssh://, https://) while being agnostic to the command-specific 
logic. It also hides the protocol negotiation away from these consumers.

The ease with which you are adding new commands in later commits really 
demonstrates the value of this patch. To make that point here, you would 
almost need to document the old model to show how it was difficult to 
use and extend. Perhaps this document will not need expanding since the 
code speaks for itself.

I just wanted to state for the record that the new architecture is a big 
improvement and will make more commands much easier to implement.
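
To make the point about extensibility concrete, here is a minimal sketch of the table-driven dispatch that the patch's `capabilities[]` array enables. It is not the code from serve.c: the handler type, the `find_capability()` and `dispatch()` names, and the stand-in `ls-refs`/`fetch` handlers are all hypothetical and exist only for illustration. The matching rule (a name followed by end-of-string or '=') mirrors what the quoted get_capability() does.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical handler type: receives the argument lines of a request. */
typedef int (*command_fn)(const char **args, size_t nargs);

struct capability {
	const char *name;   /* advertised key, e.g. "ls-refs" */
	command_fn command; /* NULL for capabilities that are not commands */
};

/* Stand-in handlers; real ones would write a response to the client. */
static int cmd_ls_refs(const char **args, size_t nargs)
{
	(void)args;
	return (int)nargs;
}

static int cmd_fetch(const char **args, size_t nargs)
{
	(void)args;
	return (int)nargs + 100;
}

/* Adding a command means adding one row to this table. */
static const struct capability capabilities[] = {
	{ "agent", NULL },
	{ "ls-refs", cmd_ls_refs },
	{ "fetch", cmd_fetch },
};

/* Match "name" or "name=value", but not a longer key sharing the prefix. */
static const struct capability *find_capability(const char *key)
{
	size_t i;

	for (i = 0; i < sizeof(capabilities) / sizeof(capabilities[0]); i++) {
		size_t len = strlen(capabilities[i].name);

		if (!strncmp(key, capabilities[i].name, len) &&
		    (key[len] == '\0' || key[len] == '='))
			return &capabilities[i];
	}
	return NULL;
}

/* Returns the handler's result, or -1 if the line is not a valid command. */
static int dispatch(const char *line, const char **args, size_t nargs)
{
	const struct capability *c;

	if (strncmp(line, "command=", 8))
		return -1;
	c = find_capability(line + 8);
	if (!c || !c->command)
		return -1;
	return c->command(args, nargs);
}
```

With this shape, the transport-facing code only ever calls something like `dispatch()`, and a request naming a non-command capability (here `command=agent`) is rejected by the same lookup that serves valid commands.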

> +  * Easily extendable as capabilities are moved into their own section
> +    of the protocol, no longer being hidden behind a NUL byte and
> +    limited by the size of a pkt-line (as there will be a single
> +    capability per pkt-line).
> +  * Separate out other information hidden behind NUL bytes (e.g. agent
> +    string as a capability and symrefs can be requested using 'ls-refs')
> +  * Reference advertisement will be omitted unless explicitly requested
> +  * ls-refs command to explicitly request some refs
> +

nit: some bullets have full stops (.) and others do not.

> + Detailed Design
> +=================
> +
> +A client can request to speak protocol v2 by sending `version=2` in the
> +side-channel `GIT_PROTOCOL` in the initial request to the server.
> +
> +In protocol v2 communication is command oriented.  When first contacting a
> +server a list of capabilities will be advertised.  Some of these capabilities
> +will be commands which a client can request be executed.  Once a command
> +has completed, a client can reuse the connection and request that other
> +commands be executed.
> +
> + Special Packets
> +-----------------
> +
> +In protocol v2 these special packets will have the following semantics:
> +
> +  * '0000' Flush Packet (flush-pkt) - indicates the end of a message
> +  * '0001' Delimiter Packet (delim-pkt) - separates sections of a message
> +
> + Capability Advertisement
> +--------------------------
> +
> +A server which decides to communicate (based on a request from a client)
> +using protocol version 2, notifies the client by sending a version string
> +in its initial response followed by an advertisement of its capabilities.
> +Each capability is a key with an optional value.  Clients must ignore all
> +unknown keys.  Semantics of unknown values are left to the definition of
> +each key.  Some capabilities will describe commands which can be requested
> +to be executed by the client.
> +
> +    capability-advertisement = protocol-version
> +			       capability-list
> +			       flush-pkt
> +
> +    protocol-version = PKT-LINE("version 2" LF)
> +    capability-list = *capability
> +    capability = PKT-LINE(key[=value] LF)
> +
> +    key = 1*CHAR
> +    value = 1*CHAR
> +    CHAR = 1*(ALPHA / DIGIT / "-" / "_")
> +
> +A client then responds to select the command it wants with any particular
> +capabilities or arguments.  There is then an optional section where the
> +client can provide any command specific parameters or queries.
> +
> +    command-request = command
> +		      capability-list
> +		      (command-args)
> +		      flush-pkt
> +    command = PKT-LINE("command=" key LF)
> +    command-args = delim-pkt
> +		   *arg
> +    arg = 1*CHAR
> +
> +The server will then check to ensure that the client's request is
> +comprised of a valid command as well as valid capabilities which were
> +advertised.  If the request is valid the server will then execute the
> +command.
> +
> +When a command has finished a client can either request that another
> +command be executed or can terminate the connection by sending an empty
> +request consisting of just a flush-pkt.
> +
> + Capabilities
> +~~~~~~~~~~~~~~
> +
> +There are two different types of capabilities: normal capabilities,
> +which can be used to convey information or alter the behavior of a
> +request, and command capabilities, which are the core actions that a
> +client wants to perform (fetch, push, etc).
> +
> + agent
> +-------
> +
> +The server can advertise the `agent` capability with a value `X` (in the
> +form `agent=X`) to notify the client that the server is running version
> +`X`.  The client may optionally send its own agent string by including
> +the `agent` capability with a value `Y` (in the form `agent=Y`) in its
> +request to the server (but it MUST NOT do so if the server did not
> +advertise the agent capability). The `X` and `Y` strings may contain any
> +printable ASCII characters except space (i.e., the byte range 32 < x <
> +127), and are typically of the form "package/version" (e.g.,
> +"git/1.8.3.1"). The agent strings are purely informative for statistics
> +and debugging purposes, and MUST NOT be used to programmatically assume
> +the presence or absence of particular features.
> +
> + stateless-rpc
> +---------------
> +
> +If advertised, the `stateless-rpc` capability indicates that the server
> +supports running commands in a stateless-rpc mode, which means that a
> +command lasts for only a single request-response round.
> +
> +Normally a command can last for as many rounds as are required to
> +complete it (multiple for negotiation during fetch or no additional
> +trips in the case of ls-refs).  If the client sends the `stateless-rpc`
> +capability with a value of `true` (in the form `stateless-rpc=true`)
> +then the invoked command must only last a single round.
> diff --git a/Makefile b/Makefile
> index 3b849c060..18c255428 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -881,6 +881,7 @@ LIB_OBJS += revision.o
>   LIB_OBJS += run-command.o
>   LIB_OBJS += send-pack.o
>   LIB_OBJS += sequencer.o
> +LIB_OBJS += serve.o
>   LIB_OBJS += server-info.o
>   LIB_OBJS += setup.o
>   LIB_OBJS += sha1-array.o
> @@ -1014,6 +1015,7 @@ BUILTIN_OBJS += builtin/rev-parse.o
>   BUILTIN_OBJS += builtin/revert.o
>   BUILTIN_OBJS += builtin/rm.o
>   BUILTIN_OBJS += builtin/send-pack.o
> +BUILTIN_OBJS += builtin/serve.o
>   BUILTIN_OBJS += builtin/shortlog.o
>   BUILTIN_OBJS += builtin/show-branch.o
>   BUILTIN_OBJS += builtin/show-ref.o
> diff --git a/builtin.h b/builtin.h
> index f332a1257..3f3fdfc28 100644
> --- a/builtin.h
> +++ b/builtin.h
> @@ -215,6 +215,7 @@ extern int cmd_rev_parse(int argc, const char **argv, const char *prefix);
>   extern int cmd_revert(int argc, const char **argv, const char *prefix);
>   extern int cmd_rm(int argc, const char **argv, const char *prefix);
>   extern int cmd_send_pack(int argc, const char **argv, const char *prefix);
> +extern int cmd_serve(int argc, const char **argv, const char *prefix);
>   extern int cmd_shortlog(int argc, const char **argv, const char *prefix);
>   extern int cmd_show(int argc, const char **argv, const char *prefix);
>   extern int cmd_show_branch(int argc, const char **argv, const char *prefix);
> diff --git a/builtin/serve.c b/builtin/serve.c
> new file mode 100644
> index 000000000..d3fd240bb
> --- /dev/null
> +++ b/builtin/serve.c
> @@ -0,0 +1,30 @@
> +#include "cache.h"
> +#include "builtin.h"
> +#include "parse-options.h"
> +#include "serve.h"
> +
> +static char const * const serve_usage[] = {
> +	N_("git serve [<options>]"),
> +	NULL
> +};
> +
> +int cmd_serve(int argc, const char **argv, const char *prefix)
> +{
> +	struct serve_options opts = SERVE_OPTIONS_INIT;
> +
> +	struct option options[] = {
> +		OPT_BOOL(0, "stateless-rpc", &opts.stateless_rpc,
> +			 N_("quit after a single request/response exchange")),
> +		OPT_BOOL(0, "advertise-capabilities", &opts.advertise_capabilities,
> +			 N_("exit immediately after advertising capabilities")),
> +		OPT_END()
> +	};
> +
> +	/* ignore all unknown cmdline switches for now */
> +	argc = parse_options(argc, argv, prefix, options, serve_usage,
> +			     PARSE_OPT_KEEP_DASHDASH |
> +			     PARSE_OPT_KEEP_UNKNOWN);
> +	serve(&opts);
> +
> +	return 0;
> +}
> diff --git a/git.c b/git.c
> index f71073dc8..f85d682b6 100644
> --- a/git.c
> +++ b/git.c
> @@ -461,6 +461,7 @@ static struct cmd_struct commands[] = {
>   	{ "revert", cmd_revert, RUN_SETUP | NEED_WORK_TREE },
>   	{ "rm", cmd_rm, RUN_SETUP },
>   	{ "send-pack", cmd_send_pack, RUN_SETUP },
> +	{ "serve", cmd_serve, RUN_SETUP },
>   	{ "shortlog", cmd_shortlog, RUN_SETUP_GENTLY | USE_PAGER },
>   	{ "show", cmd_show, RUN_SETUP },
>   	{ "show-branch", cmd_show_branch, RUN_SETUP },
> diff --git a/serve.c b/serve.c
> new file mode 100644
> index 000000000..90e3defe8
> --- /dev/null
> +++ b/serve.c
> @@ -0,0 +1,249 @@
> +#include "cache.h"
> +#include "repository.h"
> +#include "config.h"
> +#include "pkt-line.h"
> +#include "version.h"
> +#include "argv-array.h"
> +#include "serve.h"
> +
> +static int always_advertise(struct repository *r,
> +			    struct strbuf *value)
> +{
> +	return 1;
> +}
> +
> +static int agent_advertise(struct repository *r,
> +			   struct strbuf *value)
> +{
> +	if (value)
> +		strbuf_addstr(value, git_user_agent_sanitized());
> +	return 1;
> +}
> +
> +struct protocol_capability {
> +	/*
> +	 * The name of the capability.  The server uses this name when
> +	 * advertising this capability, and the client uses this name to
> +	 * specify this capability.
> +	 */
> +	const char *name;
> +
> +	/*
> +	 * Function queried to see if a capability should be advertised.
> +	 * Optionally a value can be specified by adding it to 'value'.
> +	 * If a value is added to 'value', the server will advertise this
> +	 * capability as "<name>=<value>" instead of "<name>".
> +	 */
> +	int (*advertise)(struct repository *r, struct strbuf *value);
> +
> +	/*
> +	 * Function called when a client requests the capability as a command.
> +	 * The command request will be provided to the function via 'keys', the
> +	 * capabilities requested, and 'args', the command specific parameters.
> +	 *
> +	 * This field should be NULL for capabilities which are not commands.
> +	 */
> +	int (*command)(struct repository *r,
> +		       struct argv_array *keys,
> +		       struct argv_array *args);
> +};
> +
> +static struct protocol_capability capabilities[] = {
> +	{ "agent", agent_advertise, NULL },
> +	{ "stateless-rpc", always_advertise, NULL },
> +};
> +
> +static void advertise_capabilities(void)
> +{
> +	struct strbuf capability = STRBUF_INIT;
> +	struct strbuf value = STRBUF_INIT;
> +	int i;
> +
> +	for (i = 0; i < ARRAY_SIZE(capabilities); i++) {
> +		struct protocol_capability *c = &capabilities[i];
> +
> +		if (c->advertise(the_repository, &value)) {
> +			strbuf_addstr(&capability, c->name);
> +
> +			if (value.len) {
> +				strbuf_addch(&capability, '=');
> +				strbuf_addbuf(&capability, &value);
> +			}
> +
> +			strbuf_addch(&capability, '\n');
> +			packet_write(1, capability.buf, capability.len);
> +		}
> +
> +		strbuf_reset(&capability);
> +		strbuf_reset(&value);
> +	}
> +
> +	packet_flush(1);
> +	strbuf_release(&capability);
> +	strbuf_release(&value);
> +}
> +
> +static struct protocol_capability *get_capability(const char *key)
> +{
> +	int i;
> +
> +	if (!key)
> +		return NULL;
> +
> +	for (i = 0; i < ARRAY_SIZE(capabilities); i++) {
> +		struct protocol_capability *c = &capabilities[i];
> +		const char *out;
> +		if (skip_prefix(key, c->name, &out) && (!*out || *out == '='))
> +			return c;
> +	}
> +
> +	return NULL;
> +}
> +
> +static int is_valid_capability(const char *key)
> +{
> +	const struct protocol_capability *c = get_capability(key);
> +
> +	return c && c->advertise(the_repository, NULL);
> +}
> +
> +static int is_command(const char *key, struct protocol_capability **command)
> +{
> +	const char *out;
> +
> +	if (skip_prefix(key, "command=", &out)) {
> +		struct protocol_capability *cmd = get_capability(out);
> +
> +		if (!cmd || !cmd->advertise(the_repository, NULL) || !cmd->command)
> +			die("invalid command '%s'", out);
> +		if (*command)
> +			die("command already requested");
> +
> +		*command = cmd;
> +		return 1;
> +	}
> +
> +	return 0;
> +}
> +
> +int has_capability(const struct argv_array *keys, const char *capability,
> +		   const char **value)
> +{
> +	int i;
> +	for (i = 0; i < keys->argc; i++) {
> +		const char *out;
> +		if (skip_prefix(keys->argv[i], capability, &out) &&
> +		    (!*out || *out == '=')) {
> +			if (value) {
> +				if (*out == '=')
> +					out++;
> +				*value = out;
> +			}
> +			return 1;
> +		}
> +	}
> +
> +	return 0;
> +}
> +
> +enum request_state {
> +	PROCESS_REQUEST_KEYS = 0,
> +	PROCESS_REQUEST_ARGS,
> +	PROCESS_REQUEST_DONE,
> +};
> +
> +static int process_request(void)
> +{
> +	enum request_state state = PROCESS_REQUEST_KEYS;
> +	char *buffer = packet_buffer;
> +	unsigned buffer_size = sizeof(packet_buffer);
> +	int pktlen;
> +	struct argv_array keys = ARGV_ARRAY_INIT;
> +	struct argv_array args = ARGV_ARRAY_INIT;
> +	struct protocol_capability *command = NULL;
> +
> +	while (state != PROCESS_REQUEST_DONE) {
> +		switch (packet_read_with_status(0, NULL, NULL, buffer,
> +						buffer_size, &pktlen,
> +						PACKET_READ_CHOMP_NEWLINE)) {
> +		case PACKET_READ_EOF:
> +			BUG("Should have already died when seeing EOF");
> +		case PACKET_READ_NORMAL:
> +			break;
> +		case PACKET_READ_FLUSH:
> +			state = PROCESS_REQUEST_DONE;
> +			continue;
> +		case PACKET_READ_DELIM:
> +			if (state != PROCESS_REQUEST_KEYS)
> +				die("protocol error");
> +			state = PROCESS_REQUEST_ARGS;
> +			/*
> +			 * maybe include a check to make sure that a
> +			 * command/capabilities were given.
> +			 */
> +			continue;
> +		}
> +
> +		switch (state) {
> +		case PROCESS_REQUEST_KEYS:
> +			/* collect request; a sequence of keys and values */
> +			if (is_command(buffer, &command) ||
> +			    is_valid_capability(buffer))
> +				argv_array_push(&keys, buffer);
> +			else
> +				die("unknown capability '%s'", buffer);
> +			break;
> +		case PROCESS_REQUEST_ARGS:
> +			/* collect arguments for the requested command */
> +			argv_array_push(&args, buffer);
> +			break;
> +		case PROCESS_REQUEST_DONE:
> +			continue;
> +		}
> +	}
> +
> +	/*
> +	 * If no command and no keys were given then the client wanted to
> +	 * terminate the connection.
> +	 */
> +	if (!keys.argc && !args.argc)
> +		return 1;
> +
> +	if (!command)
> +		die("no command requested");
> +
> +	command->command(the_repository, &keys, &args);
> +
> +	argv_array_clear(&keys);
> +	argv_array_clear(&args);
> +	return 0;
> +}
> +
> +/* Main serve loop for protocol version 2 */
> +void serve(struct serve_options *options)
> +{
> +	if (options->advertise_capabilities || !options->stateless_rpc) {
> +		/* serve by default supports v2 */
> +		packet_write_fmt(1, "version 2\n");
> +
> +		advertise_capabilities();
> +		/*
> +		 * If only the list of capabilities was requested exit
> +		 * immediately after advertising capabilities
> +		 */
> +		if (options->advertise_capabilities)
> +			return;
> +	}
> +
> +	/*
> +	 * If stateless-rpc was requested then exit after
> +	 * a single request/response exchange
> +	 */
> +	if (options->stateless_rpc) {
> +		process_request();
> +	} else {
> +		for (;;)
> +			if (process_request())
> +				break;
> +	}
> +}
> diff --git a/serve.h b/serve.h
> new file mode 100644
> index 000000000..fe65ba9f4
> --- /dev/null
> +++ b/serve.h
> @@ -0,0 +1,15 @@
> +#ifndef SERVE_H
> +#define SERVE_H
> +
> +struct argv_array;
> +extern int has_capability(const struct argv_array *keys, const char *capability,
> +			  const char **value);
> +
> +struct serve_options {
> +	unsigned advertise_capabilities;
> +	unsigned stateless_rpc;
> +};
> +#define SERVE_OPTIONS_INIT { 0 }
> +extern void serve(struct serve_options *options);
> +
> +#endif /* SERVE_H */
> diff --git a/t/t5701-git-serve.sh b/t/t5701-git-serve.sh
> new file mode 100755
> index 000000000..b5cc049e5
> --- /dev/null
> +++ b/t/t5701-git-serve.sh
> @@ -0,0 +1,56 @@
> +#!/bin/sh
> +
> +test_description='test git-serve and server commands'
> +
> +. ./test-lib.sh
> +
> +test_expect_success 'test capability advertisement' '
> +	cat >expect <<-EOF &&
> +	version 2
> +	agent=git/$(git version | cut -d" " -f3)
> +	stateless-rpc
> +	0000
> +	EOF
> +
> +	git serve --advertise-capabilities >out &&
> +	test-pkt-line unpack <out >actual &&
> +	test_cmp actual expect
> +'
> +
> +test_expect_success 'stateless-rpc flag does not list capabilities' '
> +	test-pkt-line pack >in <<-EOF &&
> +	0000
> +	EOF
> +	git serve --stateless-rpc >out <in &&
> +	test_must_be_empty out
> +'
> +
> +test_expect_success 'request invalid capability' '
> +	test-pkt-line pack >in <<-EOF &&
> +	foobar
> +	0000
> +	EOF
> +	test_must_fail git serve --stateless-rpc 2>err <in &&
> +	test_i18ngrep "unknown capability" err
> +'
> +
> +test_expect_success 'request with no command' '
> +	test-pkt-line pack >in <<-EOF &&
> +	agent=git/test
> +	0000
> +	EOF
> +	test_must_fail git serve --stateless-rpc 2>err <in &&
> +	test_i18ngrep "no command requested" err
> +'
> +
> +test_expect_success 'request invalid command' '
> +	test-pkt-line pack >in <<-EOF &&
> +	command=foo
> +	agent=git/test
> +	0000
> +	EOF
> +	test_must_fail git serve --stateless-rpc 2>err <in &&
> +	test_i18ngrep "invalid command" err
> +'
> +
> +test_done
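
The request framing that the quoted document and tests rely on (capability lines, an optional delim-pkt, then a flush-pkt) can be sketched as follows. This is an illustrative encoder under stated assumptions, not Git's pkt-line.c: `pkt_append()`, `pkt_special()` and `build_ls_refs_request()` are hypothetical names, and the request assembled here is just one example of the `command-request` grammar. The only wire-format facts it depends on are that each pkt-line starts with a four-digit hex length that counts the four header bytes, and that '0000' and '0001' are the flush and delimiter packets.

```c
#include <stdio.h>
#include <string.h>

/*
 * Append one pkt-line to buf at *off: a 4-digit hex length (which
 * includes the 4 header bytes) followed by the payload.  Returns 0 on
 * success, -1 if the packet would not fit in cap bytes including the
 * terminating NUL.
 */
static int pkt_append(char *buf, size_t cap, size_t *off, const char *payload)
{
	size_t len = strlen(payload) + 4;

	/* Four hex digits cannot express more than 0xffff. */
	if (len > 0xffff || *off + len + 1 > cap)
		return -1;
	snprintf(buf + *off, cap - *off, "%04zx%s", len, payload);
	*off += len;
	return 0;
}

/* Append a special packet such as "0000" (flush) or "0001" (delim). */
static int pkt_special(char *buf, size_t cap, size_t *off, const char *marker)
{
	if (*off + 4 + 1 > cap)
		return -1;
	memcpy(buf + *off, marker, 4);
	*off += 4;
	buf[*off] = '\0';
	return 0;
}

/* Assemble: command, delim-pkt, command arguments, flush-pkt. */
static int build_ls_refs_request(char *buf, size_t cap)
{
	size_t off = 0;

	if (pkt_append(buf, cap, &off, "command=ls-refs\n") ||
	    pkt_special(buf, cap, &off, "0001") || /* delim-pkt */
	    pkt_append(buf, cap, &off, "peel\n") ||
	    pkt_append(buf, cap, &off, "symrefs\n") ||
	    pkt_special(buf, cap, &off, "0000"))   /* flush-pkt */
		return -1;
	return 0;
}
```

Keeping the special packets in a separate helper mirrors the distinction the document draws: they carry no payload, so they are never produced by the length-prefixing path.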



* Re: [PATCH v2 00/27] protocol version 2
  2018-01-25 23:58 ` [PATCH v2 00/27] " Brandon Williams
                     ` (26 preceding siblings ...)
  2018-01-25 23:58   ` [PATCH v2 27/27] remote-curl: implement stateless-connect command Brandon Williams
@ 2018-01-31 16:00   ` Derrick Stolee
  2018-02-07  0:58     ` Brandon Williams
  2018-02-01 19:40   ` Jeff Hostetler
  2018-02-07  1:12   ` [PATCH v3 00/35] " Brandon Williams
  29 siblings, 1 reply; 362+ messages in thread
From: Derrick Stolee @ 2018-01-31 16:00 UTC (permalink / raw)
  To: Brandon Williams, git; +Cc: sbeller, gitster, peff, philipoakley, jrnieder

Sorry for chiming in with mostly nitpicks so late since sending this 
version. Mostly, I tried to read it to see if I could understand the 
scope of the patch and how this code worked before. It looks very 
polished, so the nits were the best I could do.

On 1/25/2018 6:58 PM, Brandon Williams wrote:
> Changes in v2:
>   * Added documentation for fetch
>   * changes #defines for state variables to be enums
>   * couple code changes to pkt-line functions and documentation
>   * Added unit tests for the git-serve binary as well as for ls-refs

I'm a fan of more unit-level testing, and I think that will be more 
important as we go on with these multiple configuration options.

> Areas for improvement
>   * Push isn't implemented, right now this is ok because if v2 is requested the
>     server can just default to v0.  Before this can be merged we may want to
>     change how the client requests a new protocol, and not allow for sending
>     "version=2" when pushing even though the user has it configured.  Or maybe
>     it's fine to just have an older client who doesn't understand how to push
>     (and request v2) to die if the server tries to speak v2 at it.
>
>     Fixing this essentially would just require piping through a bit more
>     information to the function which ultimately runs connect (for both builtins
>     and remote-curl)

Definitely save push for a later patch. Getting 'fetch' online did 
require 'ls-refs' at the same time. Future reviews will be easier when 
adding one command at a time.

>
>   * I want to make sure that the docs are well written before this gets merged
>     so I'm hoping that someone can do a thorough review on the docs themselves to
>     make sure they are clear.

I made a comment in the docs about the architectural changes. While I 
think a discussion on that topic would be valuable, I'm not sure that's 
the point of the document (i.e. documenting what v2 does versus selling 
the value of the patch). I thought the docs were clear for how the 
commands work.

>   * Right now there is a capability 'stateless-rpc' which essentially makes sure
>     that a server command completes after a single round (this is to make sure
>     http works cleanly).  After talking with some folks it may make more sense
>     to just have v2 be stateless in nature so that all commands terminate after
>     a single round trip.  This makes things a bit easier if a server wants to
>     have ssh just be a proxy for http.
>
>     One potential thing would be to flip this so that by default the protocol is
>     stateless and if a server/command has a state-full mode that can be
>     implemented as a capability at a later point.  Thoughts?

At minimum, all commands should be designed with a "stateless first" 
philosophy since a large number of users communicate via HTTP[S] and any 
decisions that make stateless communication painful should be rejected.

>   * Shallow repositories and shallow clones aren't supported yet.  I'm working
>     on it and it can be either added to v2 by default if people think it needs
>     to be in there from the start, or we can add it as a capability at a later
>     point.

I'm happy to say the following:

1. Shallow repositories should not be used for servers, since they 
cannot service all requests.

2. Since v2 has easy capability features, I'm happy to leave shallow for 
later. We will want to verify that a shallow clone command reverts to v1.


I fetched bw/protocol-v2 with tip 13c70148, built, set 
'protocol.version=2' in the config, and tested fetches against GitHub 
and VSTS just as a compatibility test. Everything worked just fine.

Is there an easy way to test the existing test suite for clone and fetch 
using protocol v2 to make sure there are no regressions with 
protocol.version=2 in the config?

Thanks,
-Stolee


* Re: [PATCH v2 14/27] connect: request remote refs using v2
  2018-01-31 15:22     ` Derrick Stolee
@ 2018-01-31 20:10       ` Eric Sunshine
  2018-01-31 22:14         ` Derrick Stolee
  0 siblings, 1 reply; 362+ messages in thread
From: Eric Sunshine @ 2018-01-31 20:10 UTC (permalink / raw)
  To: Derrick Stolee
  Cc: Brandon Williams, Git List, Stefan Beller, Junio C Hamano,
	Jeff King, Philip Oakley, Jonathan Nieder

On Wed, Jan 31, 2018 at 10:22 AM, Derrick Stolee <stolee@gmail.com> wrote:
> On 1/25/2018 6:58 PM, Brandon Williams wrote:
>>  +static int process_ref_v2(const char *line, struct ref ***list)
>> +{
>> +       int ret = 1;
>> +       int i = 0;
>
> nit: you set 'i' here, but first use it in a for loop with blank
> initializer. Perhaps keep the first assignment closer to the first use?

Hmm, I see 'i' being incremented a couple times before the loop...

>> +       if (string_list_split(&line_sections, line, ' ', -1) < 2) {
>> +               ret = 0;
>> +               goto out;
>> +       }
>> +
>> +       if (get_oid_hex(line_sections.items[i++].string, &old_oid)) {

here...

>> +               ret = 0;
>> +               goto out;
>> +       }
>> +
>> +       ref = alloc_ref(line_sections.items[i++].string);

and here...

>> +
>> +       oidcpy(&ref->old_oid, &old_oid);
>> +       **list = ref;
>> +       *list = &ref->next;
>> +
>> +       for (; i < line_sections.nr; i++) {

then it is used in the loop.

>> +               const char *arg = line_sections.items[i].string;
>> +               if (skip_prefix(arg, "symref-target:", &arg))
>> +                       ref->symref = xstrdup(arg);
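
The index handling pointed out above can be shown in isolation. The following is a simplified stand-in for process_ref_v2(), not the patch's code: `struct parsed_ref` and `parse_ref_line()` are hypothetical, it copies strings into fixed buffers instead of allocating refs, and it uses strtok() (which collapses repeated spaces) rather than string_list_split(). What it keeps is the pattern under discussion: `i` starts at 0, is post-incremented for the two positional fields, and the loop then picks up at the first attribute.

```c
#include <stdio.h>
#include <string.h>

#define MAX_FIELDS 8

struct parsed_ref {
	char oid[65];     /* hex object id as sent, not validated here */
	char name[256];
	char symref[256]; /* empty if no symref-target attribute */
	char peeled[65];  /* empty if no peeled attribute */
};

/* Returns 1 on success, 0 if the line is too long or has < 2 fields. */
static int parse_ref_line(const char *line, struct parsed_ref *out)
{
	char copy[1024];
	char *fields[MAX_FIELDS];
	size_t nr = 0;
	size_t i = 0; /* consumed by the two positional fields below */
	char *tok;

	if (strlen(line) >= sizeof(copy))
		return 0;
	strcpy(copy, line);
	for (tok = strtok(copy, " "); tok && nr < MAX_FIELDS;
	     tok = strtok(NULL, " "))
		fields[nr++] = tok;
	if (nr < 2)
		return 0;

	memset(out, 0, sizeof(*out));
	snprintf(out->oid, sizeof(out->oid), "%s", fields[i++]);   /* i: 0 -> 1 */
	snprintf(out->name, sizeof(out->name), "%s", fields[i++]); /* i: 1 -> 2 */

	/* No loop initializer: i already points at the first attribute. */
	for (; i < nr; i++) {
		if (!strncmp(fields[i], "symref-target:", 14))
			snprintf(out->symref, sizeof(out->symref), "%s",
				 fields[i] + 14);
		else if (!strncmp(fields[i], "peeled:", 7))
			snprintf(out->peeled, sizeof(out->peeled), "%s",
				 fields[i] + 7);
	}
	return 1;
}
```

Moving the `i = 0` assignment into the loop header would skip nothing at all and re-read the object id and ref name as attributes, which is why the initialization has to stay at the declaration.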


* Re: [PATCH v2 14/27] connect: request remote refs using v2
  2018-01-31 20:10       ` Eric Sunshine
@ 2018-01-31 22:14         ` Derrick Stolee
  0 siblings, 0 replies; 362+ messages in thread
From: Derrick Stolee @ 2018-01-31 22:14 UTC (permalink / raw)
  To: Eric Sunshine
  Cc: Brandon Williams, Git List, Stefan Beller, Junio C Hamano,
	Jeff King, Philip Oakley, Jonathan Nieder



On 1/31/2018 3:10 PM, Eric Sunshine wrote:
> On Wed, Jan 31, 2018 at 10:22 AM, Derrick Stolee <stolee@gmail.com> wrote:
>> On 1/25/2018 6:58 PM, Brandon Williams wrote:
>>>   +static int process_ref_v2(const char *line, struct ref ***list)
>>> +{
>>> +       int ret = 1;
>>> +       int i = 0;
>> nit: you set 'i' here, but first use it in a for loop with blank
>> initializer. Perhaps keep the first assignment closer to the first use?
> Hmm, I see 'i' being incremented a couple times before the loop...
>
>>> +       if (string_list_split(&line_sections, line, ' ', -1) < 2) {
>>> +               ret = 0;
>>> +               goto out;
>>> +       }
>>> +
>>> +       if (get_oid_hex(line_sections.items[i++].string, &old_oid)) {
> here...
>
>>> +               ret = 0;
>>> +               goto out;
>>> +       }
>>> +
>>> +       ref = alloc_ref(line_sections.items[i++].string);
> and here...
>
>>> +
>>> +       oidcpy(&ref->old_oid, &old_oid);
>>> +       **list = ref;
>>> +       *list = &ref->next;
>>> +
>>> +       for (; i < line_sections.nr; i++) {
> then it is used in the loop.
>
>>> +               const char *arg = line_sections.items[i].string;
>>> +               if (skip_prefix(arg, "symref-target:", &arg))
>>> +                       ref->symref = xstrdup(arg);

Thanks! Sorry I missed this.

-Stolee

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v2 08/27] connect: discover protocol version outside of get_remote_heads
  2018-01-31 14:40     ` Derrick Stolee
@ 2018-02-01 17:57       ` Brandon Williams
  0 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-02-01 17:57 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: git, sbeller, gitster, peff, philipoakley, jrnieder

On 01/31, Derrick Stolee wrote:
> On 1/25/2018 6:58 PM, Brandon Williams wrote:
> > In order to prepare for the addition of protocol_v2 push the protocol
> > version discovery outside of 'get_remote_heads()'.  This will allow for
> > keeping the logic for processing the reference advertisement for
> > protocol_v1 and protocol_v0 separate from the logic for protocol_v2.
> > 
> > Signed-off-by: Brandon Williams <bmwill@google.com>
> > ---
> >   builtin/fetch-pack.c | 16 +++++++++++++++-
> >   builtin/send-pack.c  | 17 +++++++++++++++--
> >   connect.c            | 27 ++++++++++-----------------
> >   connect.h            |  3 +++
> >   remote-curl.c        | 20 ++++++++++++++++++--
> >   remote.h             |  5 +++--
> >   transport.c          | 24 +++++++++++++++++++-----
> >   7 files changed, 83 insertions(+), 29 deletions(-)
> > 
> > diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
> > index 366b9d13f..85d4faf76 100644
> > --- a/builtin/fetch-pack.c
> > +++ b/builtin/fetch-pack.c
> > @@ -4,6 +4,7 @@
> >   #include "remote.h"
> >   #include "connect.h"
> >   #include "sha1-array.h"
> > +#include "protocol.h"
> >   static const char fetch_pack_usage[] =
> >   "git fetch-pack [--all] [--stdin] [--quiet | -q] [--keep | -k] [--thin] "
> > @@ -52,6 +53,7 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix)
> >   	struct fetch_pack_args args;
> >   	struct oid_array shallow = OID_ARRAY_INIT;
> >   	struct string_list deepen_not = STRING_LIST_INIT_DUP;
> > +	struct packet_reader reader;
> >   	packet_trace_identity("fetch-pack");
> > @@ -193,7 +195,19 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix)
> >   		if (!conn)
> >   			return args.diag_url ? 0 : 1;
> >   	}
> > -	get_remote_heads(fd[0], NULL, 0, &ref, 0, NULL, &shallow);
> > +
> > +	packet_reader_init(&reader, fd[0], NULL, 0,
> > +			   PACKET_READ_CHOMP_NEWLINE |
> > +			   PACKET_READ_GENTLE_ON_EOF);
> > +
> > +	switch (discover_version(&reader)) {
> > +	case protocol_v1:
> > +	case protocol_v0:
> > +		get_remote_heads(&reader, &ref, 0, NULL, &shallow);
> > +		break;
> > +	case protocol_unknown_version:
> > +		BUG("unknown protocol version");
> 
> Is this really a BUG in the client, or a bug/incompatibility in the server?
> 
> Perhaps I'm misunderstanding, but it looks like discover_version() will
> die() on an unknown version (the die() is in
> protocol.c:determine_protocol_version_client()). So maybe that's why this is
> a BUG()?
> 
> If there is something to change here, this BUG() appears three more times.

Yes, I have it labeled as a BUG because discover_version can't return an
unknown protocol version.  If the server actually returns an unknown
protocol version then it should be handled in
protocol.c:determine_protocol_version_client() as you mentioned.
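
To spell out that reasoning, here is a simplified sketch (not the actual
connect.c/protocol.c code; parse_version_line(), discover_version_sketch()
and describe_version() are made-up names, and abort() stands in for BUG()).
The layer that reads the server's response dies on an unrecognized version,
so a caller's 'unknown' case can only be reached through a programming
error.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for git's enum protocol_version. */
enum protocol_version {
	protocol_unknown_version = -1,
	protocol_v0 = 0,
	protocol_v1 = 1
};

/* Map the first line of a server response to a protocol version. */
static enum protocol_version parse_version_line(const char *line)
{
	if (!line || strncmp(line, "version ", 8))
		return protocol_v0;	/* no version line: plain v0 advertisement */
	if (!strcmp(line + 8, "1"))
		return protocol_v1;
	if (!strcmp(line + 8, "0"))
		return protocol_v0;
	return protocol_unknown_version;
}

/* Stand-in for discover_version(): never returns the unknown value. */
static enum protocol_version discover_version_sketch(const char *line)
{
	enum protocol_version v = parse_version_line(line);
	if (v == protocol_unknown_version) {
		fprintf(stderr, "fatal: server sent an unknown protocol version\n");
		exit(128);
	}
	return v;
}

/* Caller: the unknown case exists only to keep the switch exhaustive. */
static const char *describe_version(const char *line)
{
	switch (discover_version_sketch(line)) {
	case protocol_v1:
		return "v1";
	case protocol_v0:
		return "v0";
	case protocol_unknown_version:
		abort();	/* plays the role of BUG() */
	}
	return NULL;	/* not reached */
}
```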

> 
> > +	}
> >   	ref = fetch_pack(&args, fd, conn, ref, dest, sought, nr_sought,
> >   			 &shallow, pack_lockfile_ptr);
> > diff --git a/builtin/send-pack.c b/builtin/send-pack.c
> > index fc4f0bb5f..83cb125a6 100644
> > --- a/builtin/send-pack.c
> > +++ b/builtin/send-pack.c
> > @@ -14,6 +14,7 @@
> >   #include "sha1-array.h"
> >   #include "gpg-interface.h"
> >   #include "gettext.h"
> > +#include "protocol.h"
> >   static const char * const send_pack_usage[] = {
> >   	N_("git send-pack [--all | --mirror] [--dry-run] [--force] "
> > @@ -154,6 +155,7 @@ int cmd_send_pack(int argc, const char **argv, const char *prefix)
> >   	int progress = -1;
> >   	int from_stdin = 0;
> >   	struct push_cas_option cas = {0};
> > +	struct packet_reader reader;
> >   	struct option options[] = {
> >   		OPT__VERBOSITY(&verbose),
> > @@ -256,8 +258,19 @@ int cmd_send_pack(int argc, const char **argv, const char *prefix)
> >   			args.verbose ? CONNECT_VERBOSE : 0);
> >   	}
> > -	get_remote_heads(fd[0], NULL, 0, &remote_refs, REF_NORMAL,
> > -			 &extra_have, &shallow);
> > +	packet_reader_init(&reader, fd[0], NULL, 0,
> > +			   PACKET_READ_CHOMP_NEWLINE |
> > +			   PACKET_READ_GENTLE_ON_EOF);
> > +
> > +	switch (discover_version(&reader)) {
> > +	case protocol_v1:
> > +	case protocol_v0:
> > +		get_remote_heads(&reader, &remote_refs, REF_NORMAL,
> > +				 &extra_have, &shallow);
> > +		break;
> > +	case protocol_unknown_version:
> > +		BUG("unknown protocol version");
> > +	}
> >   	transport_verify_remote_names(nr_refspecs, refspecs);
> > diff --git a/connect.c b/connect.c
> > index 00e90075c..db3c9d24c 100644
> > --- a/connect.c
> > +++ b/connect.c
> > @@ -62,7 +62,7 @@ static void die_initial_contact(int unexpected)
> >   		      "and the repository exists."));
> >   }
> > -static enum protocol_version discover_version(struct packet_reader *reader)
> > +enum protocol_version discover_version(struct packet_reader *reader)
> >   {
> >   	enum protocol_version version = protocol_unknown_version;
> > @@ -234,7 +234,7 @@ enum get_remote_heads_state {
> >   /*
> >    * Read all the refs from the other end
> >    */
> > -struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
> > +struct ref **get_remote_heads(struct packet_reader *reader,
> >   			      struct ref **list, unsigned int flags,
> >   			      struct oid_array *extra_have,
> >   			      struct oid_array *shallow_points)
> > @@ -242,24 +242,17 @@ struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
> >   	struct ref **orig_list = list;
> >   	int len = 0;
> >   	enum get_remote_heads_state state = EXPECTING_FIRST_REF;
> > -	struct packet_reader reader;
> >   	const char *arg;
> > -	packet_reader_init(&reader, in, src_buf, src_len,
> > -			   PACKET_READ_CHOMP_NEWLINE |
> > -			   PACKET_READ_GENTLE_ON_EOF);
> > -
> > -	discover_version(&reader);
> > -
> >   	*list = NULL;
> >   	while (state != EXPECTING_DONE) {
> > -		switch (packet_reader_read(&reader)) {
> > +		switch (packet_reader_read(reader)) {
> >   		case PACKET_READ_EOF:
> >   			die_initial_contact(1);
> >   		case PACKET_READ_NORMAL:
> > -			len = reader.pktlen;
> > -			if (len > 4 && skip_prefix(reader.line, "ERR ", &arg))
> > +			len = reader->pktlen;
> > +			if (len > 4 && skip_prefix(reader->line, "ERR ", &arg))
> >   				die("remote error: %s", arg);
> >   			break;
> >   		case PACKET_READ_FLUSH:
> > @@ -271,22 +264,22 @@ struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
> >   		switch (state) {
> >   		case EXPECTING_FIRST_REF:
> > -			process_capabilities(reader.line, &len);
> > -			if (process_dummy_ref(reader.line)) {
> > +			process_capabilities(reader->line, &len);
> > +			if (process_dummy_ref(reader->line)) {
> >   				state = EXPECTING_SHALLOW;
> >   				break;
> >   			}
> >   			state = EXPECTING_REF;
> >   			/* fallthrough */
> >   		case EXPECTING_REF:
> > -			if (process_ref(reader.line, len, &list, flags, extra_have))
> > +			if (process_ref(reader->line, len, &list, flags, extra_have))
> >   				break;
> >   			state = EXPECTING_SHALLOW;
> >   			/* fallthrough */
> >   		case EXPECTING_SHALLOW:
> > -			if (process_shallow(reader.line, len, shallow_points))
> > +			if (process_shallow(reader->line, len, shallow_points))
> >   				break;
> > -			die("protocol error: unexpected '%s'", reader.line);
> > +			die("protocol error: unexpected '%s'", reader->line);
> >   		case EXPECTING_DONE:
> >   			break;
> >   		}
> > diff --git a/connect.h b/connect.h
> > index 01f14cdf3..cdb8979dc 100644
> > --- a/connect.h
> > +++ b/connect.h
> > @@ -13,4 +13,7 @@ extern int parse_feature_request(const char *features, const char *feature);
> >   extern const char *server_feature_value(const char *feature, int *len_ret);
> >   extern int url_is_local_not_ssh(const char *url);
> > +struct packet_reader;
> > +extern enum protocol_version discover_version(struct packet_reader *reader);
> > +
> >   #endif
> > diff --git a/remote-curl.c b/remote-curl.c
> > index 0053b0954..9f6d07683 100644
> > --- a/remote-curl.c
> > +++ b/remote-curl.c
> > @@ -1,6 +1,7 @@
> >   #include "cache.h"
> >   #include "config.h"
> >   #include "remote.h"
> > +#include "connect.h"
> >   #include "strbuf.h"
> >   #include "walker.h"
> >   #include "http.h"
> > @@ -13,6 +14,7 @@
> >   #include "credential.h"
> >   #include "sha1-array.h"
> >   #include "send-pack.h"
> > +#include "protocol.h"
> >   static struct remote *remote;
> >   /* always ends with a trailing slash */
> > @@ -176,8 +178,22 @@ static struct discovery *last_discovery;
> >   static struct ref *parse_git_refs(struct discovery *heads, int for_push)
> >   {
> >   	struct ref *list = NULL;
> > -	get_remote_heads(-1, heads->buf, heads->len, &list,
> > -			 for_push ? REF_NORMAL : 0, NULL, &heads->shallow);
> > +	struct packet_reader reader;
> > +
> > +	packet_reader_init(&reader, -1, heads->buf, heads->len,
> > +			   PACKET_READ_CHOMP_NEWLINE |
> > +			   PACKET_READ_GENTLE_ON_EOF);
> > +
> > +	switch (discover_version(&reader)) {
> > +	case protocol_v1:
> > +	case protocol_v0:
> > +		get_remote_heads(&reader, &list, for_push ? REF_NORMAL : 0,
> > +				 NULL, &heads->shallow);
> > +		break;
> > +	case protocol_unknown_version:
> > +		BUG("unknown protocol version");
> > +	}
> > +
> >   	return list;
> >   }
> > diff --git a/remote.h b/remote.h
> > index 1f6611be2..2016461df 100644
> > --- a/remote.h
> > +++ b/remote.h
> > @@ -150,10 +150,11 @@ int check_ref_type(const struct ref *ref, int flags);
> >   void free_refs(struct ref *ref);
> >   struct oid_array;
> > -extern struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
> > +struct packet_reader;
> > +extern struct ref **get_remote_heads(struct packet_reader *reader,
> >   				     struct ref **list, unsigned int flags,
> >   				     struct oid_array *extra_have,
> > -				     struct oid_array *shallow);
> > +				     struct oid_array *shallow_points);
> >   int resolve_remote_symref(struct ref *ref, struct ref *list);
> >   int ref_newer(const struct object_id *new_oid, const struct object_id *old_oid);
> > diff --git a/transport.c b/transport.c
> > index 8e8779096..63c3dbab9 100644
> > --- a/transport.c
> > +++ b/transport.c
> > @@ -18,6 +18,7 @@
> >   #include "sha1-array.h"
> >   #include "sigchain.h"
> >   #include "transport-internal.h"
> > +#include "protocol.h"
> >   static void set_upstreams(struct transport *transport, struct ref *refs,
> >   	int pretend)
> > @@ -190,13 +191,26 @@ static int connect_setup(struct transport *transport, int for_push)
> >   static struct ref *get_refs_via_connect(struct transport *transport, int for_push)
> >   {
> >   	struct git_transport_data *data = transport->data;
> > -	struct ref *refs;
> > +	struct ref *refs = NULL;
> > +	struct packet_reader reader;
> >   	connect_setup(transport, for_push);
> > -	get_remote_heads(data->fd[0], NULL, 0, &refs,
> > -			 for_push ? REF_NORMAL : 0,
> > -			 &data->extra_have,
> > -			 &data->shallow);
> > +
> > +	packet_reader_init(&reader, data->fd[0], NULL, 0,
> > +			   PACKET_READ_CHOMP_NEWLINE |
> > +			   PACKET_READ_GENTLE_ON_EOF);
> > +
> > +	switch (discover_version(&reader)) {
> > +	case protocol_v1:
> > +	case protocol_v0:
> > +		get_remote_heads(&reader, &refs,
> > +				 for_push ? REF_NORMAL : 0,
> > +				 &data->extra_have,
> > +				 &data->shallow);
> > +		break;
> > +	case protocol_unknown_version:
> > +		BUG("unknown protocol version");
> > +	}
> >   	data->got_remote_heads = 1;
> >   	return refs;
> 

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH 11/26] serve: introduce git-serve
  2018-01-03  0:18 ` [PATCH 11/26] serve: introduce git-serve Brandon Williams
  2018-01-09 20:24   ` Jonathan Tan
@ 2018-02-01 18:48   ` Jeff Hostetler
  2018-02-01 18:57     ` Stefan Beller
  1 sibling, 1 reply; 362+ messages in thread
From: Jeff Hostetler @ 2018-02-01 18:48 UTC (permalink / raw)
  To: Brandon Williams, git
  Cc: sbeller, gitster, peff, philipoakley, stolee, jrnieder



On 1/2/2018 7:18 PM, Brandon Williams wrote:
> Introduce git-serve, the base server for protocol version 2.
> 
> Protocol version 2 is intended to be a replacement for Git's current
> wire protocol.  The intention is that it will be a simpler, less
> wasteful protocol which can evolve over time.
> 
> Protocol version 2 improves upon version 1 by eliminating the initial
> ref advertisement.  In its place a server will export a list of
> capabilities and commands which it supports in a capability
> advertisement.  A client can then request that a particular command be
> executed by providing a number of capabilities and command specific
> parameters.  At the completion of a command, a client can request that
> another command be executed or can terminate the connection by sending a
> flush packet.
> 
> Signed-off-by: Brandon Williams <bmwill@google.com>
> ---
>   .gitignore                              |   1 +
>   Documentation/technical/protocol-v2.txt |  91 ++++++++++++
>   Makefile                                |   2 +
>   builtin.h                               |   1 +
>   builtin/serve.c                         |  30 ++++
>   git.c                                   |   1 +
>   serve.c                                 | 239 ++++++++++++++++++++++++++++++++
>   serve.h                                 |  15 ++
>   8 files changed, 380 insertions(+)
>   create mode 100644 Documentation/technical/protocol-v2.txt
>   create mode 100644 builtin/serve.c
>   create mode 100644 serve.c
>   create mode 100644 serve.h
> 
> diff --git a/.gitignore b/.gitignore
> index 833ef3b0b..2d0450c26 100644
> --- a/.gitignore
> +++ b/.gitignore
> @@ -140,6 +140,7 @@
>   /git-rm
>   /git-send-email
>   /git-send-pack
> +/git-serve
>   /git-sh-i18n
>   /git-sh-i18n--envsubst
>   /git-sh-setup
> diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
> new file mode 100644
> index 000000000..b87ba3816
> --- /dev/null
> +++ b/Documentation/technical/protocol-v2.txt
> @@ -0,0 +1,91 @@
> + Git Wire Protocol, Version 2
> +==============================
> +
> +This document presents a specification for a version 2 of Git's wire
> +protocol.  Protocol v2 will improve upon v1 in the following ways:
> +
> +  * Instead of multiple service names, multiple commands will be
> +    supported by a single service.
> +  * Easily extendable as capabilities are moved into their own section
> +    of the protocol, no longer being hidden behind a NUL byte and
> +    limited by the size of a pkt-line (as there will be a single
> +    capability per pkt-line).
> +  * Separate out other information hidden behind NUL bytes (e.g. agent
> +    string as a capability and symrefs can be requested using 'ls-refs')
> +  * Reference advertisement will be omitted unless explicitly requested
> +  * ls-refs command to explicitly request some refs
> +
> + Detailed Design
> +=================
> +
> +A client can request to speak protocol v2 by sending `version=2` in the
> +side-channel `GIT_PROTOCOL` in the initial request to the server.
> +
> +In protocol v2 communication is command oriented.  When first contacting a
> +server a list of capabilities will be advertised.  Some of these capabilities
> +will be commands which a client can request be executed.  Once a command
> +has completed, a client can reuse the connection and request that other
> +commands be executed.
> +
> + Special Packets
> +-----------------
> +
> +In protocol v2 these special packets will have the following semantics:
> +
> +  * '0000' Flush Packet (flush-pkt) - indicates the end of a message
> +  * '0001' Delimiter Packet (delim-pkt) - separates sections of a message

Previously, a 0001 pkt-line meant that there was 1 byte of data
following, right?  Does this change that and/or prevent 1 byte
packets?  (Not sure if it is likely, but the odd-tail of a packfile
might get sent in a 0001 line, right?)  Or is it that 0001 is only
special during the V2 negotiation stuff, but not during the packfile
transmission?

(I'm not against having this delimiter -- I think it is useful, but
just curious if it will cause problems elsewhere.)

Should we also consider increasing the pkt-line limit to 5 hex-digits
while we're at it?  That would let us have 1MB buffers if that would
help with large packfiles.  Granted, we're throttled by the network,
so it might not matter.  Would it be interesting to have a 5 digit
prefix with parts of the high bits of the first digit being flags?
Or is this too radical of a change?


> +
> + Capability Advertisement
> +--------------------------
> +
> +A server which decides to communicate (based on a request from a client)
> +using protocol version 2, notifies the client by sending a version string
> +in its initial response followed by an advertisement of its capabilities.
> +Each capability is a key with an optional value.  Clients must ignore all
> +unknown keys.  Semantics of unknown values are left to the definition of
> +each key.  Some capabilities will describe commands which can be requested
> +to be executed by the client.
> +
> +    capability-advertisement = protocol-version
> +			       capability-list
> +			       flush-pkt
> +
> +    protocol-version = PKT-LINE("version 2" LF)
> +    capability-list = *capability
> +    capability = PKT-LINE(key[=value] LF)
> +
> +    key = 1*CHAR
> +    value = 1*CHAR
> +    CHAR = 1*(ALPHA / DIGIT / "-" / "_")
> +
> +A client then responds to select the command it wants with any particular
> +capabilities or arguments.  There is then an optional section where the
> +client can provide any command specific parameters or queries.
> +
> +    command-request = command
> +		      capability-list
> +		      (command-args)
> +		      flush-pkt
> +    command = PKT-LINE("command=" key LF)
> +    command-args = delim-pkt
> +		   *arg
> +    arg = 1*CHAR
> +
> +The server will then check to ensure that the client's request is
> +comprised of a valid command as well as valid capabilities which were
> +advertised.  If the request is valid the server will then execute the
> +command.
> +
> +A particular command can last for as many rounds as are required to
> +complete the service (multiple for negotiation during fetch or no
> +additional trips in the case of ls-refs).
> +
> +When finished a client should send an empty request of just a flush-pkt to
> +terminate the connection.
> +
> + Commands in v2
> +~~~~~~~~~~~~~~~~
> +
> +Commands are the core actions that a client wants to perform (fetch, push,
> +etc).  Each command will be provided with a list of capabilities and
> +arguments as requested by a client.


^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH 11/26] serve: introduce git-serve
  2018-02-01 18:48   ` Jeff Hostetler
@ 2018-02-01 18:57     ` Stefan Beller
  2018-02-01 19:09       ` Jeff Hostetler
  2018-02-01 19:45       ` Randall S. Becker
  0 siblings, 2 replies; 362+ messages in thread
From: Stefan Beller @ 2018-02-01 18:57 UTC (permalink / raw)
  To: Jeff Hostetler
  Cc: Brandon Williams, git, Junio C Hamano, Jeff King, Philip Oakley,
	Derrick Stolee, Jonathan Nieder

On Thu, Feb 1, 2018 at 10:48 AM, Jeff Hostetler <git@jeffhostetler.com> wrote:
>
>
> On 1/2/2018 7:18 PM, Brandon Williams wrote:
>>
>> Introduce git-serve, the base server for protocol version 2.
>>
>> Protocol version 2 is intended to be a replacement for Git's current
>> wire protocol.  The intention is that it will be a simpler, less
>> wasteful protocol which can evolve over time.
>>
>> Protocol version 2 improves upon version 1 by eliminating the initial
>> ref advertisement.  In its place a server will export a list of
>> capabilities and commands which it supports in a capability
>> advertisement.  A client can then request that a particular command be
>> executed by providing a number of capabilities and command specific
>> parameters.  At the completion of a command, a client can request that
>> another command be executed or can terminate the connection by sending a
>> flush packet.
>>
>> Signed-off-by: Brandon Williams <bmwill@google.com>
>> ---
>>   .gitignore                              |   1 +
>>   Documentation/technical/protocol-v2.txt |  91 ++++++++++++
>>   Makefile                                |   2 +
>>   builtin.h                               |   1 +
>>   builtin/serve.c                         |  30 ++++
>>   git.c                                   |   1 +
>>   serve.c                                 | 239 ++++++++++++++++++++++++++++++++
>>   serve.h                                 |  15 ++
>>   8 files changed, 380 insertions(+)
>>   create mode 100644 Documentation/technical/protocol-v2.txt
>>   create mode 100644 builtin/serve.c
>>   create mode 100644 serve.c
>>   create mode 100644 serve.h
>>
>> diff --git a/.gitignore b/.gitignore
>> index 833ef3b0b..2d0450c26 100644
>> --- a/.gitignore
>> +++ b/.gitignore
>> @@ -140,6 +140,7 @@
>>   /git-rm
>>   /git-send-email
>>   /git-send-pack
>> +/git-serve
>>   /git-sh-i18n
>>   /git-sh-i18n--envsubst
>>   /git-sh-setup
>> diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
>> new file mode 100644
>> index 000000000..b87ba3816
>> --- /dev/null
>> +++ b/Documentation/technical/protocol-v2.txt
>> @@ -0,0 +1,91 @@
>> + Git Wire Protocol, Version 2
>> +==============================
>> +
>> +This document presents a specification for a version 2 of Git's wire
>> +protocol.  Protocol v2 will improve upon v1 in the following ways:
>> +
>> +  * Instead of multiple service names, multiple commands will be
>> +    supported by a single service.
>> +  * Easily extendable as capabilities are moved into their own section
>> +    of the protocol, no longer being hidden behind a NUL byte and
>> +    limited by the size of a pkt-line (as there will be a single
>> +    capability per pkt-line).
>> +  * Separate out other information hidden behind NUL bytes (e.g. agent
>> +    string as a capability and symrefs can be requested using 'ls-refs')
>> +  * Reference advertisement will be omitted unless explicitly requested
>> +  * ls-refs command to explicitly request some refs
>> +
>> + Detailed Design
>> +=================
>> +
>> +A client can request to speak protocol v2 by sending `version=2` in the
>> +side-channel `GIT_PROTOCOL` in the initial request to the server.
>> +
>> +In protocol v2 communication is command oriented.  When first contacting a
>> +server a list of capabilities will be advertised.  Some of these capabilities
>> +will be commands which a client can request be executed.  Once a command
>> +has completed, a client can reuse the connection and request that other
>> +commands be executed.
>> +
>> + Special Packets
>> +-----------------
>> +
>> +In protocol v2 these special packets will have the following semantics:
>> +
>> +  * '0000' Flush Packet (flush-pkt) - indicates the end of a message
>> +  * '0001' Delimiter Packet (delim-pkt) - separates sections of a message
>
>
> Previously, a 0001 pkt-line meant that there was 1 byte of data
> following, right?

No, the length includes the 4-byte length field itself, so 0005 would
indicate that there is one byte of payload following (the 4 bytes of
"0005" plus 1).

> Does this change that and/or prevent 1 byte
> packets?  (Not sure if it is likely, but the odd-tail of a packfile
> might get sent in a 0001 line, right?)  Or is it that 0001 is only
> special during the V2 negotiation stuff, but not during the packfile
> transmission?

0001 is invalid in the current protocol v0.
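
As a rough sketch of that rule (illustrative only, not git's pkt-line code;
classify_pkt_header(), hex_value() and enum pkt_kind are made-up names), a
reader could classify the 4-hex-digit header like this, treating 0001 as
the delimiter proposed for v2:

```c
#include <stddef.h>

enum pkt_kind {
	PKT_FLUSH,	/* "0000" */
	PKT_DELIM,	/* "0001", proposed for v2 */
	PKT_INVALID,	/* malformed header, or lengths 2 and 3 */
	PKT_DATA	/* length >= 4: (length - 4) payload bytes follow */
};

static int hex_value(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/*
 * Classify the 4-hex-digit header in 'hdr'.  For PKT_DATA, store the
 * number of payload bytes that follow the header in '*payload_len'.
 */
static enum pkt_kind classify_pkt_header(const char hdr[4], size_t *payload_len)
{
	size_t len = 0;
	int i;

	*payload_len = 0;
	for (i = 0; i < 4; i++) {
		int v = hex_value(hdr[i]);
		if (v < 0)
			return PKT_INVALID;
		len = (len << 4) | (size_t)v;
	}

	if (len == 0)
		return PKT_FLUSH;
	if (len == 1)
		return PKT_DELIM;
	if (len < 4)
		return PKT_INVALID;	/* cannot even cover the header itself */

	*payload_len = len - 4;
	return PKT_DATA;
}
```

Since the length counts its own four bytes, the values 0001 through 0003
can never describe a data packet, which is what frees 0001 up for use as a
delimiter.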

>
> (I'm not against having this delimiter -- I think it is useful, but
> just curious if will cause problems elsewhere.)
>
> Should we also consider increasing the pkt-line limit to 5 hex-digits
> while we're at it ?   That would let us have 1MB buffers if that would
> help with large packfiles.

AFAICT there is a static allocation of one pkt-line (of maximum size),
such that the code can read in a full packet and then process it.
If we'd increase the packet size we'd need the static buffer to be 1MB,
which sounds good for my developer machine. But I suspect it may be
too much for people using git on embedded devices?

Pack files larger than 64k are split across multiple pkt-lines, which is
not a big deal, as the overhead of 4 bytes per 64k is negligible.
(Also, there is progress information in the side channel, which
comes in as a special packet in between real packets,
so that every 64k transmitted you can update your progress
meter; not sure I feel strongly about fewer progress updates.)
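
To put a number on the overhead, here is a back-of-the-envelope sketch.  It
assumes the current maximum pkt-line length of 65520 bytes (header
included) and ignores the extra band byte that sideband packets carry;
pkt_count() and pkt_overhead() are made-up helper names.

```c
#include <stddef.h>

#define PKT_MAX_LEN     65520u	/* largest pkt-line, header included */
#define PKT_HEADER_LEN  4u
#define PKT_MAX_PAYLOAD (PKT_MAX_LEN - PKT_HEADER_LEN)

/* Number of pkt-lines needed to carry 'nbytes' of payload. */
static size_t pkt_count(size_t nbytes)
{
	return (nbytes + PKT_MAX_PAYLOAD - 1) / PKT_MAX_PAYLOAD;
}

/* Total header bytes added when framing 'nbytes' of payload. */
static size_t pkt_overhead(size_t nbytes)
{
	return pkt_count(nbytes) * PKT_HEADER_LEN;
}
```

Under those assumptions a 100 MiB pack needs 1601 pkt-lines, i.e. 6404
bytes of length headers, or roughly 0.006% of the payload.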

>  Granted, we're throttled by the network,
> so it might not matter.  Would it be interesting to have a 5 digit
> prefix with parts of the high bits of first digit being flags ?
> Or is this too radical of a change?

What would the flags be for?

As an alternative we could put the channel number in one byte,
such that we can have a side channel not just while streaming the
pack but all the time. (Again, not sure if that buys a lot for us)

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH 11/26] serve: introduce git-serve
  2018-02-01 18:57     ` Stefan Beller
@ 2018-02-01 19:09       ` Jeff Hostetler
  2018-02-01 20:05         ` Brandon Williams
  2018-02-01 19:45       ` Randall S. Becker
  1 sibling, 1 reply; 362+ messages in thread
From: Jeff Hostetler @ 2018-02-01 19:09 UTC (permalink / raw)
  To: Stefan Beller
  Cc: Brandon Williams, git, Junio C Hamano, Jeff King, Philip Oakley,
	Derrick Stolee, Jonathan Nieder



On 2/1/2018 1:57 PM, Stefan Beller wrote:
> On Thu, Feb 1, 2018 at 10:48 AM, Jeff Hostetler <git@jeffhostetler.com> wrote:
>>
>>
>> On 1/2/2018 7:18 PM, Brandon Williams wrote:
>>>
>>> Introduce git-serve, the base server for protocol version 2.
[...]
>>> + Special Packets
>>> +-----------------
>>> +
>>> +In protocol v2 these special packets will have the following semantics:
>>> +
>>> +  * '0000' Flush Packet (flush-pkt) - indicates the end of a message
>>> +  * '0001' Delimiter Packet (delim-pkt) - separates sections of a message
>>
>>
>> Previously, a 0001 pkt-line meant that there was 1 byte of data
>> following, right?
> 
> No, the length was including the length field, so 0005 would indicate that
> there is one byte following, (+4 bytes of "0005" included)

d'oh.  right.  thanks!

>> Should we also consider increasing the pkt-line limit to 5 hex-digits
>> while we're at it ?   That would let us have 1MB buffers if that would
>> help with large packfiles.
> 
> AFAICT there is a static allocation of one pkt-line (of maximum size),
> such that the code can read in a full packet and then process it.
> If we'd increase the packet size we'd need the static buffer to be 1MB,
> which sounds good for my developer machine. But I suspect it may be
> too much for people using git on embedded devices?

I got burned by that static buffer once upon a time when I wanted
to have 2 streams going at the same time.  Hopefully, we can move
that into the new reader structure at some point (if it isn't already).

> 
> pack files larger than 64k are put into multiple pkt-lines, which is
> not a big deal, as the overhead of 4bytes per 64k is negligible.
> (also there is progress information in the side channel, which
> would come in as a special packet in between real packets,
> such that every 64k transmitted you can update your progress
> meter; Not sure I feel strongly on fewer progress updates)
> 
>>   Granted, we're throttled by the network,
>> so it might not matter.  Would it be interesting to have a 5 digit
>> prefix with parts of the high bits of first digit being flags ?
>> Or is this too radical of a change?
> 
> What would the flags be for?
> 
> As an alternative we could put the channel number in one byte,
> such that we can have a side channel not just while streaming the
> pack but all the time. (Again, not sure if that buys a lot for us)
> 

Delimiters like the 0001 and the side channel are a couple of
ideas, but I was just thinking out loud.  And right, I'm not sure
it gets us much right now.

Jeff

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH 12/26] ls-refs: introduce ls-refs server command
  2018-01-03  0:18 ` [PATCH 12/26] ls-refs: introduce ls-refs server command Brandon Williams
  2018-01-04  0:17   ` Stefan Beller
  2018-01-09 20:50   ` Jonathan Tan
@ 2018-02-01 19:16   ` Jeff Hostetler
  2018-02-07  0:55     ` Brandon Williams
  2 siblings, 1 reply; 362+ messages in thread
From: Jeff Hostetler @ 2018-02-01 19:16 UTC (permalink / raw)
  To: Brandon Williams, git
  Cc: sbeller, gitster, peff, philipoakley, stolee, jrnieder



On 1/2/2018 7:18 PM, Brandon Williams wrote:
> Introduce the ls-refs server command.  In protocol v2, the ls-refs
> command is used to request the ref advertisement from the server.  Since
> it is a command which can be requested (as opposed to mandatory in v1),
> a client can send a number of parameters in its request to limit the ref
> advertisement based on provided ref-patterns.
> 
> Signed-off-by: Brandon Williams <bmwill@google.com>
> ---
>   Documentation/technical/protocol-v2.txt | 26 +++++++++
>   Makefile                                |  1 +
>   ls-refs.c                               | 97 +++++++++++++++++++++++++++++++++
>   ls-refs.h                               |  9 +++
>   serve.c                                 |  2 +
>   5 files changed, 135 insertions(+)
>   create mode 100644 ls-refs.c
>   create mode 100644 ls-refs.h
> 
> diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
> index b87ba3816..5f4d0e719 100644
> --- a/Documentation/technical/protocol-v2.txt
> +++ b/Documentation/technical/protocol-v2.txt
> @@ -89,3 +89,29 @@ terminate the connection.
>   Commands are the core actions that a client wants to perform (fetch, push,
>   etc).  Each command will be provided with a list of capabilities and
>   arguments as requested by a client.
> +
> + Ls-refs
> +---------
> +
> +Ls-refs is the command used to request a reference advertisement in v2.
> +Unlike the current reference advertisement, ls-refs takes in parameters
> +which can be used to limit the refs sent from the server.
> +
> +Ls-refs takes in the following parameters wrapped in packet-lines:
> +
> +  symrefs: In addition to the object pointed by it, show the underlying
> +	   ref pointed by it when showing a symbolic ref.
> +  peel: Show peeled tags.
> +  ref-pattern <pattern>: When specified, only references matching the
> +			 given patterns are displayed.
> +
> +The output of ls-refs is as follows:
> +
> +    output = *ref
> +	     flush-pkt
> +    ref = PKT-LINE((tip | peeled) LF)
> +    tip = obj-id SP refname (SP symref-target)
> +    peeled = obj-id SP refname "^{}"
> +
> +    symref = PKT-LINE("symref" SP symbolic-ref SP resolved-ref LF)
> +    shallow = PKT-LINE("shallow" SP obj-id LF)

Do you want to talk about ordering requirements on this?
I think packed-refs has one, but I'm not sure it matters here
where the client or server sorts it.

Are there any provisions for compressing the refnames, like in the
reftable spec or in index-v4?

It doesn't need to be in the initial version.  Just asking.  We could
always add a "ls-refs-2" command that builds upon this.

Thanks,
Jeff
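
For illustration, one `tip` line of the draft grammar quoted above could be
produced roughly as in the sketch below.  This is not code from the series:
`format_ref_line` and its fixed 1024-byte scratch buffer are hypothetical, and
the layout (obj-id, SP, refname, optional SP symref-target, LF, all wrapped in
a pkt-line) is taken from the grammar as quoted, which may change in later
revisions.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Format one "tip" line of the draft ls-refs grammar:
 *   PKT-LINE(obj-id SP refname [SP symref-target] LF)
 * symref_target may be NULL.  Returns the total pkt-line length
 * (4-byte header included), or 0 if the line does not fit.
 * Hypothetical helper, for illustration only.
 */
static size_t format_ref_line(char *out, size_t outsz, const char *oid_hex,
			      const char *refname, const char *symref_target)
{
	char body[1024];
	int n;

	if (symref_target)
		n = snprintf(body, sizeof(body), "%s %s %s\n",
			     oid_hex, refname, symref_target);
	else
		n = snprintf(body, sizeof(body), "%s %s\n", oid_hex, refname);

	/* Reject truncated bodies and outputs too small for header + body + NUL. */
	if (n < 0 || (size_t)n >= sizeof(body) || (size_t)n + 5 > outsz)
		return 0;

	/* The length prefix counts its own four hex digits. */
	snprintf(out, outsz, "%04x%s", (unsigned)n + 4, body);
	return (size_t)n + 4;
}
```

With a 40-hex-digit object id and `refs/heads/master`, the resulting line is
63 bytes long and starts with `003f`.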

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v2 00/27] protocol version 2
  2018-01-25 23:58 ` [PATCH v2 00/27] " Brandon Williams
                     ` (27 preceding siblings ...)
  2018-01-31 16:00   ` [PATCH v2 00/27] protocol version 2 Derrick Stolee
@ 2018-02-01 19:40   ` Jeff Hostetler
  2018-02-07  1:12   ` [PATCH v3 00/35] " Brandon Williams
  29 siblings, 0 replies; 362+ messages in thread
From: Jeff Hostetler @ 2018-02-01 19:40 UTC (permalink / raw)
  To: Brandon Williams, git
  Cc: sbeller, gitster, peff, philipoakley, stolee, jrnieder



On 1/25/2018 6:58 PM, Brandon Williams wrote:
> Changes in v2:
>   * Added documentation for fetch
>   * changes #defines for state variables to be enums
>   * couple code changes to pkt-line functions and documentation
>   * Added unit tests for the git-serve binary as well as for ls-refs
[...]


This looks really nice.  I'm eager to get this in so we can do some
additional commands to help make partial clone more efficient.

Thanks,
Jeff


^ permalink raw reply	[flat|nested] 362+ messages in thread

* RE: [PATCH 11/26] serve: introduce git-serve
  2018-02-01 18:57     ` Stefan Beller
  2018-02-01 19:09       ` Jeff Hostetler
@ 2018-02-01 19:45       ` Randall S. Becker
  2018-02-01 20:08         ` 'Brandon Williams'
  1 sibling, 1 reply; 362+ messages in thread
From: Randall S. Becker @ 2018-02-01 19:45 UTC (permalink / raw)
  To: 'Stefan Beller', 'Jeff Hostetler'
  Cc: 'Brandon Williams', 'git',
	'Junio C Hamano', 'Jeff King',
	'Philip Oakley', 'Derrick Stolee',
	'Jonathan Nieder'

On February 1, 2018 1:58 PM, Stefan Beller wrote:
> On Thu, Feb 1, 2018 at 10:48 AM, Jeff Hostetler <git@jeffhostetler.com>
> wrote:
> >
> >
> > On 1/2/2018 7:18 PM, Brandon Williams wrote:
> >>
> >> Introduce git-serve, the base server for protocol version 2.
> >>
> >> Protocol version 2 is intended to be a replacement for Git's current
> >> wire protocol.  The intention is that it will be a simpler, less
> >> wasteful protocol which can evolve over time.
> >>
> >> Protocol version 2 improves upon version 1 by eliminating the initial
> >> ref advertisement.  In its place a server will export a list of
> >> capabilities and commands which it supports in a capability
> >> advertisement.  A client can then request that a particular command
> >> be executed by providing a number of capabilities and command
> >> specific parameters.  At the completion of a command, a client can
> >> request that another command be executed or can terminate the
> >> connection by sending a flush packet.
> >>
> >> Signed-off-by: Brandon Williams <bmwill@google.com>
> >> ---
> >>   .gitignore                              |   1 +
> >>   Documentation/technical/protocol-v2.txt |  91 ++++++++++++
> >>   Makefile                                |   2 +
> >>   builtin.h                               |   1 +
> >>   builtin/serve.c                         |  30 ++++
> >>   git.c                                   |   1 +
> >>   serve.c                                 | 239 ++++++++++++++++++++++++++++++++
> >>   serve.h                                 |  15 ++
> >>   8 files changed, 380 insertions(+)
> >>   create mode 100644 Documentation/technical/protocol-v2.txt
> >>   create mode 100644 builtin/serve.c
> >>   create mode 100644 serve.c
> >>   create mode 100644 serve.h
> >>
> >> diff --git a/.gitignore b/.gitignore
> >> index 833ef3b0b..2d0450c26 100644
> >> --- a/.gitignore
> >> +++ b/.gitignore
> >> @@ -140,6 +140,7 @@
> >>   /git-rm
> >>   /git-send-email
> >>   /git-send-pack
> >> +/git-serve
> >>   /git-sh-i18n
> >>   /git-sh-i18n--envsubst
> >>   /git-sh-setup
> >> diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
> >> new file mode 100644
> >> index 000000000..b87ba3816
> >> --- /dev/null
> >> +++ b/Documentation/technical/protocol-v2.txt
> >> @@ -0,0 +1,91 @@
> >> + Git Wire Protocol, Version 2
> >> +==============================
> >> +
> >> +This document presents a specification for a version 2 of Git's wire
> >> +protocol.  Protocol v2 will improve upon v1 in the following ways:
> >> +
> >> +  * Instead of multiple service names, multiple commands will be
> >> +    supported by a single service.
> >> +  * Easily extendable as capabilities are moved into their own section
> >> +    of the protocol, no longer being hidden behind a NUL byte and
> >> +    limited by the size of a pkt-line (as there will be a single
> >> +    capability per pkt-line).
> >> +  * Separate out other information hidden behind NUL bytes (e.g. agent
> >> +    string as a capability and symrefs can be requested using 'ls-refs')
> >> +  * Reference advertisement will be omitted unless explicitly requested
> >> +  * ls-refs command to explicitly request some refs
> >> +
> >> + Detailed Design
> >> +=================
> >> +
> >> +A client can request to speak protocol v2 by sending `version=2` in
> >> +the side-channel `GIT_PROTOCOL` in the initial request to the server.
> >> +
> >> +In protocol v2 communication is command oriented.  When first contacting a
> >> +server a list of capabilities will advertised.  Some of these capabilities
> >> +will be commands which a client can request be executed.  Once a
> >> +command has completed, a client can reuse the connection and request
> >> +that other commands be executed.
> >> +
> >> + Special Packets
> >> +-----------------
> >> +
> >> +In protocol v2 these special packets will have the following semantics:
> >> +
> >> +  * '0000' Flush Packet (flush-pkt) - indicates the end of a message
> >> +  * '0001' Delimiter Packet (delim-pkt) - separates sections of a message
> >
> >
> > Previously, a 0001 pkt-line meant that there was 1 byte of data
> > following, right?
> 
> No, the length was including the length field, so 0005 would indicate that
> there is one byte following, (+4 bytes of "0005" included)
> 
> > Does this change that and/or prevent 1 byte packets?  (Not sure if it
> > is likely, but the odd-tail of a packfile might get sent in a 0001
> > line, right?)  Or is it that 0001 is only special during the V2
> > negotiation stuff, but not during the packfile transmission?
> 
> 0001 is invalid in the current protocol v0.
> 
> >
> > (I'm not against having this delimiter -- I think it is useful, but
> > just curious if will cause problems elsewhere.)
> >
> > Should we also consider increasing the pkt-line limit to 5 hex-digits
> > while we're at it ?   That would let us have 1MB buffers if that would
> > help with large packfiles.
> 
> AFAICT there is a static allocation of one pkt-line (of maximum size), such
> that the code can read in a full packet and then process it.
> If we'd increase the packet size we'd need the static buffer to be 1MB, which
> sounds good for my developer machine. But I suspect it may be too much for
> people using git on embedded devices?
> 
> pack files larger than 64k are put into multiple pkt-lines, which is not a big
> deal, as the overhead of 4bytes per 64k is negligible.
> (also there is progress information in the side channel, which would come in
> as a special packet in between real packets, such that every 64k transmitted
> you can update your progress meter; Not sure I feel strongly on fewer
> progress updates)

Can I request, selfishly from my own platform's (NonStop) performance
heartache, that we don't require 1MB? We're not embedded on this platform,
but there is an optimized message system packet size down at 50KB that I
would like to stay under. Going above that works, but there is a significant
cost incurred above that size point. And please make sure xread/xwrite are
used in any event.

> >  Granted, we're throttled by the network, so it might not matter.
> > Would it be interesting to have a 5 digit prefix with parts of the
> > high bits of first digit being flags ?
> > Or is this too radical of a change?
> 
> What would the flags be for?
> 
> As an alternative we could put the channel number in one byte, such that we
> can have a side channel not just while streaming the pack but all the time.
> (Again, not sure if that buys a lot for us)

Cheers,
Randall

-- Brief whoami:
 NonStop developer since approximately 211288444200000000
 UNIX developer since approximately 421664400
-- In my real life, I talk too much.




^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH 11/26] serve: introduce git-serve
  2018-02-01 19:09       ` Jeff Hostetler
@ 2018-02-01 20:05         ` Brandon Williams
  0 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-02-01 20:05 UTC (permalink / raw)
  To: Jeff Hostetler
  Cc: Stefan Beller, git, Junio C Hamano, Jeff King, Philip Oakley,
	Derrick Stolee, Jonathan Nieder

On 02/01, Jeff Hostetler wrote:
> 
> 
> On 2/1/2018 1:57 PM, Stefan Beller wrote:
> > On Thu, Feb 1, 2018 at 10:48 AM, Jeff Hostetler <git@jeffhostetler.com> wrote:
> > > 
> > > 
> > > On 1/2/2018 7:18 PM, Brandon Williams wrote:
> > > > 
> > > > Introduce git-serve, the base server for protocol version 2.
> [...]
> > > > + Special Packets
> > > > +-----------------
> > > > +
> > > > +In protocol v2 these special packets will have the following semantics:
> > > > +
> > > > +  * '0000' Flush Packet (flush-pkt) - indicates the end of a message
> > > > +  * '0001' Delimiter Packet (delim-pkt) - separates sections of a message
> > > 
> > > 
> > > Previously, a 0001 pkt-line meant that there was 1 byte of data
> > > following, right?
> > 
> > No, the length was including the length field, so 0005 would indicate that
> > there is one byte following, (+4 bytes of "0005" included)
> 
> d'oh.  right.  thanks!
> 
> > > Should we also consider increasing the pkt-line limit to 5 hex-digits
> > > while we're at it ?   That would let us have 1MB buffers if that would
> > > help with large packfiles.
> > 
> > AFAICT there is a static allocation of one pkt-line (of maximum size),
> > such that the code can read in a full packet and then process it.
> > If we'd increase the packet size we'd need the static buffer to be 1MB,
> > which sounds good for my developer machine. But I suspect it may be
> > too much for people using git on embedded devices?
> 
> I got burned by that static buffer once upon a time when I wanted
> to have 2 streams going at the same time.  Hopefully, we can move
> that into the new reader structure at some point (if it isn't already).

Yeah the reader struct could easily be extended to take in the
buffer to read the data into.  Because I'm not trying to do any of that
atm I decided to have it default to using the static buffer, but it
would be as simple as changing the reader->buffer variable to use a
different buffer.

> 
> > 
> > pack files larger than 64k are put into multiple pkt-lines, which is
> > not a big deal, as the overhead of 4bytes per 64k is negligible.
> > (also there is progress information in the side channel, which
> > would come in as a special packet in between real packets,
> > such that every 64k transmitted you can update your progress
> > meter; Not sure I feel strongly on fewer progress updates)
> > 
> > >   Granted, we're throttled by the network,
> > > so it might not matter.  Would it be interesting to have a 5 digit
> > > prefix with parts of the high bits of first digit being flags ?
> > > Or is this too radical of a change?
> > 
> > What would the flags be for?
> > 
> > As an alternative we could put the channel number in one byte,
> > such that we can have a side channel not just while streaming the
> > pack but all the time. (Again, not sure if that buys a lot for us)
> > 
> 
> Delimiters like the 0001 and the side channel are a couple of
> ideas, but I was just thinking out loud.  And right, I'm not sure
> it gets us much right now.
> 
> Jeff

-- 
Brandon Williams
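
The length rule settled above (the four hex digits count themselves, so
`0005` carries one byte of data, while `0000` and `0001` carry none and act as
flush and delimiter) can be sketched as follows.  The names `pkt_kind`,
`pkt_parse_header` and `pkt_format` are made up for this example and are not
the functions in pkt-line.c; treating `0002` and `0003` as invalid is an
assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Classification of a 4-byte pkt-line header (illustrative names). */
enum pkt_kind { PKT_FLUSH, PKT_DELIM, PKT_DATA, PKT_INVALID };

/* Convert one hex digit to its value, or -1 if it is not a hex digit. */
static int hex_val(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/*
 * Decode the four hex digits in hdr.  For PKT_DATA, *payload_len is set to
 * the number of bytes following the header (total length minus 4); for every
 * other kind it is set to 0.
 */
static enum pkt_kind pkt_parse_header(const char *hdr, size_t *payload_len)
{
	size_t len = 0;
	int i;

	*payload_len = 0;
	for (i = 0; i < 4; i++) {
		int v = hex_val(hdr[i]);
		if (v < 0)
			return PKT_INVALID;
		len = (len << 4) | (size_t)v;
	}
	if (len == 0)
		return PKT_FLUSH;	/* "0000" */
	if (len == 1)
		return PKT_DELIM;	/* "0001", special only in v2 */
	if (len < 4)
		return PKT_INVALID;	/* "0002", "0003": assumed invalid */
	*payload_len = len - 4;
	return PKT_DATA;
}

/*
 * Write header + payload (NUL-terminated for convenience) into out.
 * Returns the total pkt-line length, or 0 if it does not fit.
 */
static size_t pkt_format(char *out, size_t outsz, const char *payload, size_t n)
{
	size_t total = n + 4;

	if (total > 0xffff || total + 1 > outsz)
		return 0;
	snprintf(out, outsz, "%04zx", total);
	memcpy(out + 4, payload, n);
	out[total] = '\0';
	return total;
}
```

Formatting the single byte "a" gives `0005a`, matching the "0005 means one
byte follows" reading.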

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH 11/26] serve: introduce git-serve
  2018-02-01 19:45       ` Randall S. Becker
@ 2018-02-01 20:08         ` 'Brandon Williams'
  2018-02-01 20:37           ` Randall S. Becker
  0 siblings, 1 reply; 362+ messages in thread
From: 'Brandon Williams' @ 2018-02-01 20:08 UTC (permalink / raw)
  To: Randall S. Becker
  Cc: 'Stefan Beller', 'Jeff Hostetler', 'git',
	'Junio C Hamano', 'Jeff King',
	'Philip Oakley', 'Derrick Stolee',
	'Jonathan Nieder'

On 02/01, Randall S. Becker wrote:
> On February 1, 2018 1:58 PM, Stefan Beller wrote:
> > On Thu, Feb 1, 2018 at 10:48 AM, Jeff Hostetler <git@jeffhostetler.com>
> > wrote:
> > >
> > >
> > > On 1/2/2018 7:18 PM, Brandon Williams wrote:
> > >>
> > >> Introduce git-serve, the base server for protocol version 2.
> > >>
> > >> Protocol version 2 is intended to be a replacement for Git's current
> > >> wire protocol.  The intention is that it will be a simpler, less
> > >> wasteful protocol which can evolve over time.
> > >>
> > >> Protocol version 2 improves upon version 1 by eliminating the initial
> > >> ref advertisement.  In its place a server will export a list of
> > >> capabilities and commands which it supports in a capability
> > >> advertisement.  A client can then request that a particular command
> > >> be executed by providing a number of capabilities and command
> > >> specific parameters.  At the completion of a command, a client can
> > >> request that another command be executed or can terminate the
> > >> connection by sending a flush packet.
> > >>
> > >> Signed-off-by: Brandon Williams <bmwill@google.com>
> > >> ---
> > >>   .gitignore                              |   1 +
> > >>   Documentation/technical/protocol-v2.txt |  91 ++++++++++++
> > >>   Makefile                                |   2 +
> > >>   builtin.h                               |   1 +
> > >>   builtin/serve.c                         |  30 ++++
> > >>   git.c                                   |   1 +
> > >>   serve.c                                 | 239 ++++++++++++++++++++++++++++++++
> > >>   serve.h                                 |  15 ++
> > >>   8 files changed, 380 insertions(+)
> > >>   create mode 100644 Documentation/technical/protocol-v2.txt
> > >>   create mode 100644 builtin/serve.c
> > >>   create mode 100644 serve.c
> > >>   create mode 100644 serve.h
> > >>
> > >> diff --git a/.gitignore b/.gitignore
> > >> index 833ef3b0b..2d0450c26 100644
> > >> --- a/.gitignore
> > >> +++ b/.gitignore
> > >> @@ -140,6 +140,7 @@
> > >>   /git-rm
> > >>   /git-send-email
> > >>   /git-send-pack
> > >> +/git-serve
> > >>   /git-sh-i18n
> > >>   /git-sh-i18n--envsubst
> > >>   /git-sh-setup
> > >> diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
> > >> new file mode 100644
> > >> index 000000000..b87ba3816
> > >> --- /dev/null
> > >> +++ b/Documentation/technical/protocol-v2.txt
> > >> @@ -0,0 +1,91 @@
> > >> + Git Wire Protocol, Version 2
> > >> +==============================
> > >> +
> > >> +This document presents a specification for a version 2 of Git's wire
> > >> +protocol.  Protocol v2 will improve upon v1 in the following ways:
> > >> +
> > >> +  * Instead of multiple service names, multiple commands will be
> > >> +    supported by a single service.
> > >> +  * Easily extendable as capabilities are moved into their own section
> > >> +    of the protocol, no longer being hidden behind a NUL byte and
> > >> +    limited by the size of a pkt-line (as there will be a single
> > >> +    capability per pkt-line).
> > >> +  * Separate out other information hidden behind NUL bytes (e.g. agent
> > >> +    string as a capability and symrefs can be requested using 'ls-refs')
> > >> +  * Reference advertisement will be omitted unless explicitly requested
> > >> +  * ls-refs command to explicitly request some refs
> > >> +
> > >> + Detailed Design
> > >> +=================
> > >> +
> > >> +A client can request to speak protocol v2 by sending `version=2` in
> > >> +the side-channel `GIT_PROTOCOL` in the initial request to the server.
> > >> +
> > >> +In protocol v2 communication is command oriented.  When first contacting a
> > >> +server a list of capabilities will advertised.  Some of these capabilities
> > >> +will be commands which a client can request be executed.  Once a
> > >> +command has completed, a client can reuse the connection and request
> > >> +that other commands be executed.
> > >> +
> > >> + Special Packets
> > >> +-----------------
> > >> +
> > >> +In protocol v2 these special packets will have the following semantics:
> > >> +
> > >> +  * '0000' Flush Packet (flush-pkt) - indicates the end of a message
> > >> +  * '0001' Delimiter Packet (delim-pkt) - separates sections of a message
> > >
> > >
> > > Previously, a 0001 pkt-line meant that there was 1 byte of data
> > > following, right?
> > 
> > No, the length was including the length field, so 0005 would indicate that
> > there is one byte following, (+4 bytes of "0005" included)
> > 
> > > Does this change that and/or prevent 1 byte packets?  (Not sure if it
> > > is likely, but the odd-tail of a packfile might get sent in a 0001
> > > line, right?)  Or is it that 0001 is only special during the V2
> > > negotiation stuff, but not during the packfile transmission?
> > 
> > 0001 is invalid in the current protocol v0.
> > 
> > >
> > > (I'm not against having this delimiter -- I think it is useful, but
> > > just curious if will cause problems elsewhere.)
> > >
> > > Should we also consider increasing the pkt-line limit to 5 hex-digits
> > > while we're at it ?   That would let us have 1MB buffers if that would
> > > help with large packfiles.
> > 
> > AFAICT there is a static allocation of one pkt-line (of maximum size), such
> > that the code can read in a full packet and then process it.
> > If we'd increase the packet size we'd need the static buffer to be 1MB, which
> > sounds good for my developer machine. But I suspect it may be too much for
> > people using git on embedded devices?
> > 
> > pack files larger than 64k are put into multiple pkt-lines, which is not a big
> > deal, as the overhead of 4bytes per 64k is negligible.
> > (also there is progress information in the side channel, which would come in
> > as a special packet in between real packets, such that every 64k transmitted
> > you can update your progress meter; Not sure I feel strongly on fewer
> > progress updates)
> 
> Can I request, selfishly from my own platform's (NonStop) performance heartache, that we don't require 1Mb? We're not embedded on this platform, but there is an optimized message system packet size down at 50Kb that I would like to stay under. Although above that is no problem, there is a significant cost incurred above that size point. And please make sure xread/xwrite are used in any event.

I think that it would be too much of a change to go up to 1MB lines at the
moment so I'm planning on leaving it right where it is :)

> 
> > >  Granted, we're throttled by the network, so it might not matter.
> > > Would it be interesting to have a 5 digit prefix with parts of the
> > > high bits of first digit being flags ?
> > > Or is this too radical of a change?
> > 
> > What would the flags be for?
> > 
> > As an alternative we could put the channel number in one byte, such that we
> > can have a side channel not just while streaming the pack but all the time.
> > (Again, not sure if that buys a lot for us)
> 
> Cheers,
> Randall
> 
> -- Brief whoami:
>  NonStop developer since approximately 211288444200000000
>  UNIX developer since approximately 421664400
> -- In my real life, I talk too much.
> 
> 
> 

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 362+ messages in thread

* RE: [PATCH 11/26] serve: introduce git-serve
  2018-02-01 20:08         ` 'Brandon Williams'
@ 2018-02-01 20:37           ` Randall S. Becker
  2018-02-01 20:50             ` Stefan Beller
  0 siblings, 1 reply; 362+ messages in thread
From: Randall S. Becker @ 2018-02-01 20:37 UTC (permalink / raw)
  To: 'Brandon Williams'
  Cc: 'Stefan Beller', 'Jeff Hostetler', 'git',
	'Junio C Hamano', 'Jeff King',
	'Philip Oakley', 'Derrick Stolee',
	'Jonathan Nieder'

On February 1, 2018 3:08 PM, Brandon Williams wrote:
> On 02/01, Randall S. Becker wrote:
> > On February 1, 2018 1:58 PM, Stefan Beller wrote:
> > > On Thu, Feb 1, 2018 at 10:48 AM, Jeff Hostetler
> > > <git@jeffhostetler.com>
> > > wrote:
> > > >
> > > >
> > > > On 1/2/2018 7:18 PM, Brandon Williams wrote:
> > > >>
> > > >> Introduce git-serve, the base server for protocol version 2.
> > > >>
> > > >> Protocol version 2 is intended to be a replacement for Git's
> > > >> current wire protocol.  The intention is that it will be a
> > > >> simpler, less wasteful protocol which can evolve over time.
> > > >>
> > > >> Protocol version 2 improves upon version 1 by eliminating the
> > > >> initial ref advertisement.  In its place a server will export a
> > > >> list of capabilities and commands which it supports in a
> > > >> capability advertisement.  A client can then request that a
> > > >> particular command be executed by providing a number of
> > > >> capabilities and command specific parameters.  At the completion
> > > >> of a command, a client can request that another command be
> > > >> executed or can terminate the connection by sending a flush packet.
> > > >>
> > > >> Signed-off-by: Brandon Williams <bmwill@google.com>
> > > >> ---
> > > >>   .gitignore                              |   1 +
> > > >>   Documentation/technical/protocol-v2.txt |  91 ++++++++++++
> > > >>   Makefile                                |   2 +
> > > >>   builtin.h                               |   1 +
> > > >>   builtin/serve.c                         |  30 ++++
> > > >>   git.c                                   |   1 +
> > > >>   serve.c                                 | 239 ++++++++++++++++++++++++++++++++
> > > >>   serve.h                                 |  15 ++
> > > >>   8 files changed, 380 insertions(+)
> > > >>   create mode 100644 Documentation/technical/protocol-v2.txt
> > > >>   create mode 100644 builtin/serve.c
> > > >>   create mode 100644 serve.c
> > > >>   create mode 100644 serve.h
> > > >>
> > > >> diff --git a/.gitignore b/.gitignore
> > > >> index 833ef3b0b..2d0450c26 100644
> > > >> --- a/.gitignore
> > > >> +++ b/.gitignore
> > > >> @@ -140,6 +140,7 @@
> > > >>   /git-rm
> > > >>   /git-send-email
> > > >>   /git-send-pack
> > > >> +/git-serve
> > > >>   /git-sh-i18n
> > > >>   /git-sh-i18n--envsubst
> > > >>   /git-sh-setup
> > > >> diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
> > > >> new file mode 100644
> > > >> index 000000000..b87ba3816
> > > >> --- /dev/null
> > > >> +++ b/Documentation/technical/protocol-v2.txt
> > > >> @@ -0,0 +1,91 @@
> > > >> + Git Wire Protocol, Version 2
> > > >> +==============================
> > > >> +
> > > >> +This document presents a specification for a version 2 of Git's wire
> > > >> +protocol.  Protocol v2 will improve upon v1 in the following ways:
> > > >> +
> > > >> +  * Instead of multiple service names, multiple commands will be
> > > >> +    supported by a single service.
> > > >> +  * Easily extendable as capabilities are moved into their own section
> > > >> +    of the protocol, no longer being hidden behind a NUL byte and
> > > >> +    limited by the size of a pkt-line (as there will be a single
> > > >> +    capability per pkt-line).
> > > >> +  * Separate out other information hidden behind NUL bytes (e.g. agent
> > > >> +    string as a capability and symrefs can be requested using 'ls-refs')
> > > >> +  * Reference advertisement will be omitted unless explicitly requested
> > > >> +  * ls-refs command to explicitly request some refs
> > > >> +
> > > >> + Detailed Design
> > > >> +=================
> > > >> +
> > > >> +A client can request to speak protocol v2 by sending `version=2` in
> > > >> +the side-channel `GIT_PROTOCOL` in the initial request to the server.
> > > >> +
> > > >> +In protocol v2 communication is command oriented.  When first contacting a
> > > >> +server a list of capabilities will advertised.  Some of these capabilities
> > > >> +will be commands which a client can request be executed.  Once a
> > > >> +command has completed, a client can reuse the connection and request
> > > >> +that other commands be executed.
> > > >> +
> > > >> + Special Packets
> > > >> +-----------------
> > > >> +
> > > >> +In protocol v2 these special packets will have the following semantics:
> > > >> +
> > > >> +  * '0000' Flush Packet (flush-pkt) - indicates the end of a message
> > > >> +  * '0001' Delimiter Packet (delim-pkt) - separates sections of a message
> > > >
> > > >
> > > > Previously, a 0001 pkt-line meant that there was 1 byte of data
> > > > following, right?
> > >
> > > No, the length was including the length field, so 0005 would
> > > indicate that there is one byte following, (+4 bytes of "0005"
> > > included)
> > >
> > > > Does this change that and/or prevent 1 byte packets?  (Not sure if
> > > > it is likely, but the odd-tail of a packfile might get sent in a
> > > > 0001 line, right?)  Or is it that 0001 is only special during the
> > > > V2 negotiation stuff, but not during the packfile transmission?
> > >
> > > 0001 is invalid in the current protocol v0.
> > >
> > > >
> > > > (I'm not against having this delimiter -- I think it is useful,
> > > > but just curious if will cause problems elsewhere.)
> > > >
> > > > Should we also consider increasing the pkt-line limit to 5 hex-digits
> > > > while we're at it ?   That would let us have 1MB buffers if that would
> > > > help with large packfiles.
> > >
> > > AFAICT there is a static allocation of one pkt-line (of maximum
> > > size), such that the code can read in a full packet and then process it.
> > > If we'd increase the packet size we'd need the static buffer to be
> > > 1MB, which sounds good for my developer machine. But I suspect it
> > > may be too much for people using git on embedded devices?
> > >
> > > pack files larger than 64k are put into multiple pkt-lines, which is
> > > not a big deal, as the overhead of 4bytes per 64k is negligible.
> > > (also there is progress information in the side channel, which would
> > > come in as a special packet in between real packets, such that every
> > > 64k transmitted you can update your progress meter; Not sure I feel
> > > strongly on fewer progress updates)
> >
> > Can I request, selfishly from my own platform's (NonStop) performance
> > heartache, that we don't require 1Mb? We're not embedded on this
> > platform, but there is an optimized message system packet size down at
> > 50Kb that I would like to stay under. Although above that is no problem,
> > there is a significant cost incurred above that size point. And please
> > make sure xread/xwrite are used in any event.
> 
> I think that it would be too much of a change to up to 1MB lines at the
> moment so I'm planning on leaving it right where it is :)

In for a kilo, in for a tonne. Once we're way up there, it's not a problem
or much of a difference. :)

> > > >  Granted, we're throttled by the network, so it might not matter.
> > > > Would it be interesting to have a 5 digit prefix with parts of the
> > > > high bits of first digit being flags ?
> > > > Or is this too radical of a change?
> > >
> > > What would the flags be for?
> > >
> > > As an alternative we could put the channel number in one byte, such
> > > that we can have a side channel not just while streaming the pack but
> > > all the time.
> > > (Again, not sure if that buys a lot for us)


^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH 11/26] serve: introduce git-serve
  2018-02-01 20:37           ` Randall S. Becker
@ 2018-02-01 20:50             ` Stefan Beller
  0 siblings, 0 replies; 362+ messages in thread
From: Stefan Beller @ 2018-02-01 20:50 UTC (permalink / raw)
  To: Randall S. Becker
  Cc: Brandon Williams, Jeff Hostetler, git, Junio C Hamano, Jeff King,
	Philip Oakley, Derrick Stolee, Jonathan Nieder

On Thu, Feb 1, 2018 at 12:37 PM, Randall S. Becker
<rsbecker@nexbridge.com> wrote:

>> I think that it would be too much of a change to up to 1MB lines at the
>> moment so I'm planning on leaving it right where it is :)
>
> In for a kilo, in for a tonne. Once we're way up there, it's not a problem
> or much of a difference. :)

What benefit does a larger buffer have?

I outlined the negatives above (large static buffer, issues with
progress meter).

And it seems to me that Brandon wants to keep this series as small as possible
w.r.t. bait for endless discussions and only deliver innovation, that solves the
immediate needs. Are there issues with too small buffers? (Can you link to
the performance measurements or an analysis?)
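
For a rough sense of scale, the framing cost mentioned earlier in the thread
("4 bytes per 64k") can be estimated with a small sketch.  The helper names
are invented for this example; the 65516-byte payload assumes Git's usual cap
of 65520 bytes per pkt-line minus the 4-byte header and ignores the side-band
byte, and the 5-digit variant (payload 0xfffff - 5) is purely hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Number of pkt-lines needed to carry `total` bytes, `max_payload` bytes each. */
static uint64_t pkt_count(uint64_t total, uint64_t max_payload)
{
	return (total + max_payload - 1) / max_payload;	/* ceiling division */
}

/* Bytes spent on length prefixes: one `hdr_len`-byte header per pkt-line. */
static uint64_t framing_overhead(uint64_t total, uint64_t max_payload,
				 uint64_t hdr_len)
{
	return hdr_len * pkt_count(total, max_payload);
}
```

For a 1 GiB pack this gives 16390 pkt-lines and 65560 header bytes with
4-digit framing, versus 1025 pkt-lines and 5125 header bytes with a
hypothetical 5-digit prefix.  Either way the headers are a tiny fraction of
the transfer, which is consistent with the overhead being called negligible.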

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v2 13/27] ls-refs: introduce ls-refs server command
  2018-01-26 22:20     ` Stefan Beller
@ 2018-02-02 22:31       ` Brandon Williams
  0 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-02-02 22:31 UTC (permalink / raw)
  To: Stefan Beller
  Cc: git, Junio C Hamano, Jeff King, Philip Oakley, Derrick Stolee,
	Jonathan Nieder

On 01/26, Stefan Beller wrote:
> On Thu, Jan 25, 2018 at 3:58 PM, Brandon Williams <bmwill@google.com> wrote:
> 
> > +ls-refs takes in the following parameters wrapped in packet-lines:
> > +
> > +    symrefs
> > +       In addition to the object pointed by it, show the underlying ref
> > +       pointed by it when showing a symbolic ref.
> > +    peel
> > +       Show peeled tags.
> 
> Would it make sense to default these two to on, and rather have
> optional no-symrefs and no-peel ?
> 
> That would save bandwidth in the default case, I would think.

Maybe?  That would save sending those strings for each request.

-- 
Brandon Williams
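
To put a number on "those strings": assuming each ls-refs argument travels as
its own pkt-line terminated by LF (an assumption, the draft only says the
parameters are wrapped in packet-lines), the per-request cost can be computed
with a one-line helper.  `pkt_line_cost` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Bytes on the wire for one argument sent as PKT-LINE(arg LF). */
static size_t pkt_line_cost(const char *arg)
{
	return 4 + strlen(arg) + 1;	/* 4-byte length + text + LF */
}
```

Under that assumption `symrefs` costs 12 bytes and `peel` costs 9, so making
both the default would save 21 bytes per ls-refs request.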

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v2 10/27] protocol: introduce enum protocol_version value protocol_v2
  2018-01-31 14:54     ` Derrick Stolee
@ 2018-02-02 22:44       ` Brandon Williams
  2018-02-05 14:14         ` Derrick Stolee
  0 siblings, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-02-02 22:44 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: git, sbeller, gitster, peff, philipoakley, jrnieder

On 01/31, Derrick Stolee wrote:
> On 1/25/2018 6:58 PM, Brandon Williams wrote:
> > Introduce protocol_v2, a new value for 'enum protocol_version'.
> > Subsequent patches will fill in the implementation of protocol_v2.
> > 
> > Signed-off-by: Brandon Williams <bmwill@google.com>
> > ---
> >   builtin/fetch-pack.c   | 3 +++
> >   builtin/receive-pack.c | 6 ++++++
> >   builtin/send-pack.c    | 3 +++
> >   builtin/upload-pack.c  | 7 +++++++
> >   connect.c              | 3 +++
> >   protocol.c             | 2 ++
> >   protocol.h             | 1 +
> >   remote-curl.c          | 3 +++
> >   transport.c            | 9 +++++++++
> >   9 files changed, 37 insertions(+)
> > 
> > diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
> > index 85d4faf76..f492e8abd 100644
> > --- a/builtin/fetch-pack.c
> > +++ b/builtin/fetch-pack.c
> > @@ -201,6 +201,9 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix)
> >   			   PACKET_READ_GENTLE_ON_EOF);
> >   	switch (discover_version(&reader)) {
> > +	case protocol_v2:
> > +		die("support for protocol v2 not implemented yet");
> > +		break;
> >   	case protocol_v1:
> >   	case protocol_v0:
> >   		get_remote_heads(&reader, &ref, 0, NULL, &shallow);
> > diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
> > index b7ce7c7f5..3656e94fd 100644
> > --- a/builtin/receive-pack.c
> > +++ b/builtin/receive-pack.c
> > @@ -1963,6 +1963,12 @@ int cmd_receive_pack(int argc, const char **argv, const char *prefix)
> >   		unpack_limit = receive_unpack_limit;
> >   	switch (determine_protocol_version_server()) {
> > +	case protocol_v2:
> > +		/*
> > +		 * push support for protocol v2 has not been implemented yet,
> > +		 * so ignore the request to use v2 and fallback to using v0.
> > +		 */
> > +		break;
> >   	case protocol_v1:
> >   		/*
> >   		 * v1 is just the original protocol with a version string,
> > diff --git a/builtin/send-pack.c b/builtin/send-pack.c
> > index 83cb125a6..b5427f75e 100644
> > --- a/builtin/send-pack.c
> > +++ b/builtin/send-pack.c
> > @@ -263,6 +263,9 @@ int cmd_send_pack(int argc, const char **argv, const char *prefix)
> >   			   PACKET_READ_GENTLE_ON_EOF);
> >   	switch (discover_version(&reader)) {
> > +	case protocol_v2:
> > +		die("support for protocol v2 not implemented yet");
> > +		break;
> >   	case protocol_v1:
> >   	case protocol_v0:
> >   		get_remote_heads(&reader, &remote_refs, REF_NORMAL,
> > diff --git a/builtin/upload-pack.c b/builtin/upload-pack.c
> > index 2cb5cb35b..8d53e9794 100644
> > --- a/builtin/upload-pack.c
> > +++ b/builtin/upload-pack.c
> > @@ -47,6 +47,13 @@ int cmd_upload_pack(int argc, const char **argv, const char *prefix)
> >   		die("'%s' does not appear to be a git repository", dir);
> >   	switch (determine_protocol_version_server()) {
> > +	case protocol_v2:
> > +		/*
> > +		 * fetch support for protocol v2 has not been implemented yet,
> > +		 * so ignore the request to use v2 and fallback to using v0.
> > +		 */
> > +		upload_pack(&opts);
> > +		break;
> >   	case protocol_v1:
> >   		/*
> >   		 * v1 is just the original protocol with a version string,
> > diff --git a/connect.c b/connect.c
> > index db3c9d24c..f2157a821 100644
> > --- a/connect.c
> > +++ b/connect.c
> > @@ -84,6 +84,9 @@ enum protocol_version discover_version(struct packet_reader *reader)
> >   	/* Maybe process capabilities here, at least for v2 */
> >   	switch (version) {
> > +	case protocol_v2:
> > +		die("support for protocol v2 not implemented yet");
> > +		break;
> >   	case protocol_v1:
> >   		/* Read the peeked version line */
> >   		packet_reader_read(reader);
> > diff --git a/protocol.c b/protocol.c
> > index 43012b7eb..5e636785d 100644
> > --- a/protocol.c
> > +++ b/protocol.c
> > @@ -8,6 +8,8 @@ static enum protocol_version parse_protocol_version(const char *value)
> >   		return protocol_v0;
> >   	else if (!strcmp(value, "1"))
> >   		return protocol_v1;
> > +	else if (!strcmp(value, "2"))
> > +		return protocol_v2;
> >   	else
> >   		return protocol_unknown_version;
> >   }
> > diff --git a/protocol.h b/protocol.h
> > index 1b2bc94a8..2ad35e433 100644
> > --- a/protocol.h
> > +++ b/protocol.h
> > @@ -5,6 +5,7 @@ enum protocol_version {
> >   	protocol_unknown_version = -1,
> >   	protocol_v0 = 0,
> >   	protocol_v1 = 1,
> > +	protocol_v2 = 2,
> >   };
> >   /*
> > diff --git a/remote-curl.c b/remote-curl.c
> > index 9f6d07683..dae8a4a48 100644
> > --- a/remote-curl.c
> > +++ b/remote-curl.c
> > @@ -185,6 +185,9 @@ static struct ref *parse_git_refs(struct discovery *heads, int for_push)
> >   			   PACKET_READ_GENTLE_ON_EOF);
> >   	switch (discover_version(&reader)) {
> > +	case protocol_v2:
> > +		die("support for protocol v2 not implemented yet");
> > +		break;
> >   	case protocol_v1:
> >   	case protocol_v0:
> >   		get_remote_heads(&reader, &list, for_push ? REF_NORMAL : 0,
> > diff --git a/transport.c b/transport.c
> > index 2378dcb38..83d9dd1df 100644
> > --- a/transport.c
> > +++ b/transport.c
> > @@ -203,6 +203,9 @@ static struct ref *get_refs_via_connect(struct transport *transport, int for_pus
> >   	data->version = discover_version(&reader);
> >   	switch (data->version) {
> > +	case protocol_v2:
> > +		die("support for protocol v2 not implemented yet");
> > +		break;
> >   	case protocol_v1:
> >   	case protocol_v0:
> >   		get_remote_heads(&reader, &refs,
> > @@ -250,6 +253,9 @@ static int fetch_refs_via_pack(struct transport *transport,
> >   		refs_tmp = get_refs_via_connect(transport, 0);
> >   	switch (data->version) {
> > +	case protocol_v2:
> > +		die("support for protocol v2 not implemented yet");
> > +		break;
> >   	case protocol_v1:
> >   	case protocol_v0:
> >   		refs = fetch_pack(&args, data->fd, data->conn,
> > @@ -585,6 +591,9 @@ static int git_transport_push(struct transport *transport, struct ref *remote_re
> >   		args.push_cert = SEND_PACK_PUSH_CERT_NEVER;
> >   	switch (data->version) {
> > +	case protocol_v2:
> > +		die("support for protocol v2 not implemented yet");
> > +		break;
> >   	case protocol_v1:
> >   	case protocol_v0:
> >   		ret = send_pack(&args, data->fd, data->conn, remote_refs,
> 
> With a macro approach to version selection, this change becomes simpler in
> some ways and harder in others.
> 
> It is simpler in that we can have the macro from the previous commits just
> fall back to version 0 behavior.
> 
> It is harder in that this commit would need one of two options:
> 
> 1. A macro that performs an arbitrary statement when given v2, which would
> be the die() for these actions not in v2.
> 2. A macro that clearly states v2 is not supported and calls die() on v2.
> 
> Here is my simple, untested attempt at a union of these options:
> 
> #define ON_PROTOCOL_VERSION(version,v0,v2) switch(version) {\
> case protocol_v2:\
>     (v2);\
>     break;\
> case protocol_v1:\
> case protocol_v0:\
>     (v0);\
>     break;\
> case protocol_unknown_version:\
>     BUG("unknown protocol version");\
> }
> #define ON_PROTOCOL_VERSION_V0_FALLBACK(version,v0) switch(version) {\
> case protocol_v2:\
> case protocol_v1:\
> case protocol_v0:\
>     (v0);\
>     break;\
> case protocol_unknown_version:\
>     BUG("unknown protocol version");\
> }
> #define ON_PROTOCOL_VERSION_V0_ONLY(version,v0) \
>     ON_PROTOCOL_VERSION(version,v0,\
>                 BUG("support for protocol v2 not implemented yet"))


While I understand wanting to isolate the switch statement code, I think
that creating such a macro would make reading the code much more
difficult (and a pain to get right).  Really I don't want to try my hand
at crafting such a macro :D
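
To illustrate the readability point, here is a minimal standalone sketch of
the plain-switch style.  The enum mirrors the quoted protocol.h hunk;
`classify_version()` and `enum dispatch_result` are hypothetical names made
up for this example and are not part of the series.

```c
/* Mirrors the enum from the quoted protocol.h hunk. */
enum protocol_version {
	protocol_unknown_version = -1,
	protocol_v0 = 0,
	protocol_v1 = 1,
	protocol_v2 = 2,
};

/* Hypothetical result codes, used only by this sketch. */
enum dispatch_result {
	DISPATCH_LEGACY,      /* take the existing v0/v1 code path */
	DISPATCH_UNSUPPORTED, /* v2 was requested but is not implemented */
	DISPATCH_BUG,         /* version was never determined */
};

/*
 * Every enum value is listed explicitly and there is no default label,
 * so a compiler that warns about unhandled enum values in a switch can
 * flag this function when a new protocol version is added.
 */
static enum dispatch_result classify_version(enum protocol_version version)
{
	switch (version) {
	case protocol_v2:
		return DISPATCH_UNSUPPORTED;
	case protocol_v1:
	case protocol_v0:
		return DISPATCH_LEGACY;
	case protocol_unknown_version:
		return DISPATCH_BUG;
	}
	/* Not reached for valid enum values. */
	return DISPATCH_BUG;
}
```

Each call site in the patch keeps its own switch like this one, so what
happens for every version stays visible where it is used instead of being
hidden behind macro arguments.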

-- 
Brandon Williams

* Re: [PATCH v2 10/27] protocol: introduce enum protocol_version value protocol_v2
  2018-02-02 22:44       ` Brandon Williams
@ 2018-02-05 14:14         ` Derrick Stolee
  0 siblings, 0 replies; 362+ messages in thread
From: Derrick Stolee @ 2018-02-05 14:14 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, sbeller, gitster, peff, philipoakley, jrnieder

On 2/2/2018 5:44 PM, Brandon Williams wrote:
> On 01/31, Derrick Stolee wrote:
>> With a macro approach to version selection, this change becomes simpler in
>> some ways and harder in others.
>>
>> It is simpler in that we can have the macro from the previous commits just
>> fall back to version 0 behavior.
>>
>> It is harder in that this commit would need one of two options:
>>
>> 1. A macro that performs an arbitrary statement when given v2, which would
>> be the die() for these actions not in v2.
>> 2. A macro that clearly states v2 is not supported and calls die() on v2.
>>
>> Here is my simple, untested attempt at a union of these options:
>>
>> #define ON_PROTOCOL_VERSION(version,v0,v2) switch(version) {\
>> case protocol_v2:\
>>      (v2);\
>>      break;\
>> case protocol_v1:\
>> case protocol_v0:\
>>      (v0);\
>>      break;\
>> case protocol_unknown_version:\
>>      BUG("unknown protocol version");\
>> }
>> #define ON_PROTOCOL_VERSION_V0_FALLBACK(version,v0) switch(version) {\
>> case protocol_v2:\
>> case protocol_v1:\
>> case protocol_v0:\
>>      (v0);\
>>      break;\
>> case protocol_unknown_version:\
>>      BUG("unknown protocol version");\
>> }
>> #define ON_PROTOCOL_VERSION_V0_ONLY(version,v0) \
>>      ON_PROTOCOL_VERSION(version,v0,\
>>                  BUG("support for protocol v2 not implemented yet"))
>
> While I understand wanting to isolate the switch statement code, I think
> that creating such a macro would make reading the code much more
> difficult (and a pain to get right).  Really I don't want to try my hand
> at crafting such a macro :D
>

Sounds good. You're right that the macro approach is more likely to be 
used incorrectly.

-Stolee

* Re: [PATCH 12/26] ls-refs: introduce ls-refs server command
  2018-02-01 19:16   ` Jeff Hostetler
@ 2018-02-07  0:55     ` Brandon Williams
  0 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-02-07  0:55 UTC (permalink / raw)
  To: Jeff Hostetler
  Cc: git, sbeller, gitster, peff, philipoakley, stolee, jrnieder

On 02/01, Jeff Hostetler wrote:
> 
> 
> On 1/2/2018 7:18 PM, Brandon Williams wrote:
> > Introduce the ls-refs server command.  In protocol v2, the ls-refs
> > command is used to request the ref advertisement from the server.  Since
> > it is a command which can be requested (as opposed to mandatory in v1),
> > a client can send a number of parameters in its request to limit the ref
> > advertisement based on provided ref-patterns.
> > 
> > Signed-off-by: Brandon Williams <bmwill@google.com>
> > ---
> >   Documentation/technical/protocol-v2.txt | 26 +++++++++
> >   Makefile                                |  1 +
> >   ls-refs.c                               | 97 +++++++++++++++++++++++++++++++++
> >   ls-refs.h                               |  9 +++
> >   serve.c                                 |  2 +
> >   5 files changed, 135 insertions(+)
> >   create mode 100644 ls-refs.c
> >   create mode 100644 ls-refs.h
> > 
> > diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
> > index b87ba3816..5f4d0e719 100644
> > --- a/Documentation/technical/protocol-v2.txt
> > +++ b/Documentation/technical/protocol-v2.txt
> > @@ -89,3 +89,29 @@ terminate the connection.
> >   Commands are the core actions that a client wants to perform (fetch, push,
> >   etc).  Each command will be provided with a list capabilities and
> >   arguments as requested by a client.
> > +
> > + Ls-refs
> > +---------
> > +
> > +Ls-refs is the command used to request a reference advertisement in v2.
> > +Unlike the current reference advertisement, ls-refs takes in parameters
> > +which can be used to limit the refs sent from the server.
> > +
> > +Ls-refs takes in the following parameters wrapped in packet-lines:
> > +
> > +  symrefs: In addition to the object pointed by it, show the underlying
> > +	   ref pointed by it when showing a symbolic ref.
> > +  peel: Show peeled tags.
> > +  ref-pattern <pattern>: When specified, only references matching the
> > +			 given patterns are displayed.
> > +
> > +The output of ls-refs is as follows:
> > +
> > +    output = *ref
> > +	     flush-pkt
> > +    ref = PKT-LINE((tip | peeled) LF)
> > +    tip = obj-id SP refname (SP symref-target)
> > +    peeled = obj-id SP refname "^{}"
> > +
> > +    symref = PKT-LINE("symref" SP symbolic-ref SP resolved-ref LF)
> > +    shallow = PKT-LINE("shallow" SP obj-id LF)
> 
> Do you want to talk about ordering requirements on this?
> I think packed-refs has one, but I'm not sure it matters here
> where the client or server sorts it.
> 
> Are there any provisions for compressing the renames, like in the
> reftable spec or in index-v4 ?

Not currently, but it would be rather easy to add a feature to ls-refs
that transmits the resulting list of refs in a format like reftable.
So this is something that can be added later.
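
For reference, the uncompressed ref line from the grammar quoted above
(`obj-id SP refname`, optionally followed by `SP symref-target`, or with a
`^{}` suffix for a peeled tag) could be parsed roughly as follows.  This is
only a sketch: `struct ref_line` and `parse_ref_line()` are hypothetical
names, and it assumes 40-character hex object ids and a payload whose
trailing LF has already been chomped.

```c
#include <string.h>

/* Hypothetical container for one parsed ls-refs line. */
struct ref_line {
	char oid[41];            /* hex object id, NUL-terminated */
	char refname[256];       /* refname without any "^{}" suffix */
	char symref_target[256]; /* empty string when absent */
	int peeled;              /* 1 if the refname carried "^{}" */
};

/*
 * Parse "obj-id SP refname [SP symref-target]".
 * Returns 0 on success and -1 on malformed input.  The object id is
 * only checked for length, not for valid hex digits.
 */
static int parse_ref_line(const char *line, struct ref_line *out)
{
	const char *sp1, *sp2, *name;
	size_t name_len;

	memset(out, 0, sizeof(*out));

	sp1 = strchr(line, ' ');
	if (!sp1 || (size_t)(sp1 - line) != 40)
		return -1;
	memcpy(out->oid, line, 40);

	name = sp1 + 1;
	sp2 = strchr(name, ' ');
	name_len = sp2 ? (size_t)(sp2 - name) : strlen(name);
	if (!name_len || name_len >= sizeof(out->refname))
		return -1;
	memcpy(out->refname, name, name_len);

	if (sp2) {
		size_t target_len = strlen(sp2 + 1);
		if (!target_len || target_len >= sizeof(out->symref_target))
			return -1;
		memcpy(out->symref_target, sp2 + 1, target_len);
	}

	/* A peeled entry is marked by a "^{}" suffix on the refname. */
	if (name_len > 3 && !strcmp(out->refname + name_len - 3, "^{}")) {
		out->peeled = 1;
		out->refname[name_len - 3] = '\0';
	}
	return 0;
}
```

A compressed encoding would replace this per-line text format, which is
why it fits naturally as a later, separately negotiated feature.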

> 
> It doesn't need to be in the initial version.  Just asking.  We could
> always add a "ls-refs-2" command that builds upon this.
> 
> Thanks,
> Jeff

-- 
Brandon Williams

* Re: [PATCH v2 00/27] protocol version 2
  2018-01-31 16:00   ` [PATCH v2 00/27] protocol version 2 Derrick Stolee
@ 2018-02-07  0:58     ` Brandon Williams
  0 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-02-07  0:58 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: git, sbeller, gitster, peff, philipoakley, jrnieder

On 01/31, Derrick Stolee wrote:
> Sorry for chiming in with mostly nitpicks so late since sending this
> version. Mostly, I tried to read it to see if I could understand the scope
> of the patch and how this code worked before. It looks very polished, so
> the nits were the best I could do.
> 
> On 1/25/2018 6:58 PM, Brandon Williams wrote:
> > Changes in v2:
> >   * Added documentation for fetch
> >   * changes #defines for state variables to be enums
> >   * couple code changes to pkt-line functions and documentation
> >   * Added unit tests for the git-serve binary as well as for ls-refs
> 
> I'm a fan of more unit-level testing, and I think that will be more
> important as we go on with these multiple configuration options.
> 
> > Areas for improvement
> >   * Push isn't implemented, right now this is ok because if v2 is requested the
> >     server can just default to v0.  Before this can be merged we may want to
> >     change how the client requests a new protocol, and not allow for sending
> >     "version=2" when pushing even though the user has it configured.  Or maybe
> >     it's fine to just have an older client who doesn't understand how to push
> >     (and request v2) to die if the server tries to speak v2 at it.
> > 
> >     Fixing this essentially would just require piping through a bit more
> >     information to the function which ultimately runs connect (for both builtins
> >     and remote-curl)
> 
> Definitely save push for a later patch. Getting 'fetch' online did require
> 'ls-refs' at the same time. Future reviews will be easier when adding one
> command at a time.
> 
> > 
> >   * I want to make sure that the docs are well written before this gets merged
> >     so I'm hoping that someone can do a thorough review on the docs themselves to
> >     make sure they are clear.
> 
> I made a comment in the docs about the architectural changes. While I think
> a discussion on that topic would be valuable, I'm not sure that's the point
> of the document (i.e. documenting what v2 does versus selling the value of
> the patch). I thought the docs were clear for how the commands work.
> 
> >   * Right now there is a capability 'stateless-rpc' which essentially makes sure
> >     that a server command completes after a single round (this is to make sure
> >     http works cleanly).  After talking with some folks it may make more sense
> >     to just have v2 be stateless in nature so that all commands terminate after
> >     a single round trip.  This makes things a bit easier if a server wants to
> >     have ssh just be a proxy for http.
> > 
> >     One potential thing would be to flip this so that by default the protocol is
> >     stateless and if a server/command has a state-full mode that can be
> >     implemented as a capability at a later point.  Thoughts?
> 
> At minimum, all commands should be designed with a "stateless first"
> philosophy since a large number of users communicate via HTTP[S] and any
> decisions that make stateless communication painful should be rejected.

I agree with this and my next version will run with this philosophy in
mind (v2 will be stateless by default).

> 
> >   * Shallow repositories and shallow clones aren't supported yet.  I'm working
> >     on it and it can be either added to v2 by default if people think it needs
> >     to be in there from the start, or we can add it as a capability at a later
> >     point.
> 
> I'm happy to say the following:
> 
> 1. Shallow repositories should not be used for servers, since they cannot
> service all requests.
> 
> 2. Since v2 has easy capability features, I'm happy to leave shallow for
> later. We will want to verify that a shallow clone command reverts to v1.
> 
> 
> I fetched bw/protocol-v2 with tip 13c70148, built, set 'protocol.version=2'
> in the config, and tested fetches against GitHub and VSTS just as a
> compatibility test. Everything worked just fine.
> 
> Is there an easy way to test the existing test suite for clone and fetch
> using protocol v2 to make sure there are no regressions with
> protocol.version=2 in the config?

Yes, there are already interop tests that cover requesting a new
protocol version, at //t/interop/i5700-protocol-transition.sh

> 
> Thanks,
> -Stolee

-- 
Brandon Williams

* [PATCH v3 00/35] protocol version 2
  2018-01-25 23:58 ` [PATCH v2 00/27] " Brandon Williams
                     ` (28 preceding siblings ...)
  2018-02-01 19:40   ` Jeff Hostetler
@ 2018-02-07  1:12   ` Brandon Williams
  2018-02-07  1:12     ` [PATCH v3 01/35] pkt-line: introduce packet_read_with_status Brandon Williams
                       ` (37 more replies)
  29 siblings, 38 replies; 362+ messages in thread
From: Brandon Williams @ 2018-02-07  1:12 UTC (permalink / raw)
  To: git
  Cc: sbeller, peff, gitster, jrnieder, stolee, git, pclouds, Brandon Williams

Changes in v3:
 * There were some comments about how the protocol should be designed
   stateless first.  I've made this change and instead of having to
   supply the `stateless-rpc=true` capability to force stateless
   behavior, the protocol just requires all commands to be stateless.
 
 * Added some patches towards the end of the series to force the client
   to not request protocol v2 when pushing (even if configured to use
   v2).  This is to ease the roll-out process of a push command in
   protocol v2.  This way, when servers gain the ability to accept
   pushes in v2 (and start responding using v2 when requests are sent
   to the git-receive-pack endpoint), clients that still don't
   understand how to push using v2 won't request v2 and then die when
   they discover that the server does indeed know how to accept a push
   under v2.

 * I implemented the `shallow` feature for fetch.  This feature
   encapsulates the existing functionality of all the shallow/deepen
   capabilities in v0.  So now a server can process shallow requests.

 * Various other small tweaks that I can't remember :)

After all of that I think the series is in a pretty good state, barring
any more critical review feedback.

Thanks!

Brandon Williams (35):
  pkt-line: introduce packet_read_with_status
  pkt-line: introduce struct packet_reader
  pkt-line: add delim packet support
  upload-pack: convert to a builtin
  upload-pack: factor out processing lines
  transport: use get_refs_via_connect to get refs
  connect: convert get_remote_heads to use struct packet_reader
  connect: discover protocol version outside of get_remote_heads
  transport: store protocol version
  protocol: introduce enum protocol_version value protocol_v2
  test-pkt-line: introduce a packet-line test helper
  serve: introduce git-serve
  ls-refs: introduce ls-refs server command
  connect: request remote refs using v2
  transport: convert get_refs_list to take a list of ref patterns
  transport: convert transport_get_remote_refs to take a list of ref
    patterns
  ls-remote: pass ref patterns when requesting a remote's refs
  fetch: pass ref patterns when fetching
  push: pass ref patterns when pushing
  upload-pack: introduce fetch server command
  fetch-pack: perform a fetch using v2
  upload-pack: support shallow requests
  fetch-pack: support shallow requests
  connect: refactor git_connect to only get the protocol version once
  connect: don't request v2 when pushing
  transport-helper: remove name parameter
  transport-helper: refactor process_connect_service
  transport-helper: introduce stateless-connect
  pkt-line: add packet_buf_write_len function
  remote-curl: create copy of the service name
  remote-curl: store the protocol version the server responded with
  http: allow providing extra headers for http requests
  http: don't always add Git-Protocol header
  remote-curl: implement stateless-connect command
  remote-curl: don't request v2 when pushing

 .gitignore                              |   1 +
 Documentation/technical/protocol-v2.txt | 338 +++++++++++++++++
 Makefile                                |   7 +-
 builtin.h                               |   2 +
 builtin/clone.c                         |   2 +-
 builtin/fetch-pack.c                    |  21 +-
 builtin/fetch.c                         |  14 +-
 builtin/ls-remote.c                     |   7 +-
 builtin/receive-pack.c                  |   6 +
 builtin/remote.c                        |   2 +-
 builtin/send-pack.c                     |  20 +-
 builtin/serve.c                         |  30 ++
 builtin/upload-pack.c                   |  74 ++++
 connect.c                               | 352 +++++++++++++-----
 connect.h                               |   7 +
 fetch-pack.c                            | 319 +++++++++++++++-
 fetch-pack.h                            |   4 +-
 git.c                                   |   2 +
 http.c                                  |  25 +-
 http.h                                  |   2 +
 ls-refs.c                               |  96 +++++
 ls-refs.h                               |   9 +
 pkt-line.c                              | 149 +++++++-
 pkt-line.h                              |  77 ++++
 protocol.c                              |   2 +
 protocol.h                              |   1 +
 remote-curl.c                           | 257 ++++++++++++-
 remote.h                                |   9 +-
 serve.c                                 | 260 +++++++++++++
 serve.h                                 |  15 +
 t/helper/test-pkt-line.c                |  64 ++++
 t/t5701-git-serve.sh                    | 176 +++++++++
 t/t5702-protocol-v2.sh                  | 239 ++++++++++++
 transport-helper.c                      |  84 +++--
 transport-internal.h                    |   4 +-
 transport.c                             | 116 ++++--
 transport.h                             |   9 +-
 upload-pack.c                           | 625 ++++++++++++++++++++++++--------
 upload-pack.h                           |  21 ++
 39 files changed, 3088 insertions(+), 360 deletions(-)
 create mode 100644 Documentation/technical/protocol-v2.txt
 create mode 100644 builtin/serve.c
 create mode 100644 builtin/upload-pack.c
 create mode 100644 ls-refs.c
 create mode 100644 ls-refs.h
 create mode 100644 serve.c
 create mode 100644 serve.h
 create mode 100644 t/helper/test-pkt-line.c
 create mode 100755 t/t5701-git-serve.sh
 create mode 100755 t/t5702-protocol-v2.sh
 create mode 100644 upload-pack.h

-- 
2.16.0.rc1.238.g530d649a79-goog


* [PATCH v3 01/35] pkt-line: introduce packet_read_with_status
  2018-02-07  1:12   ` [PATCH v3 00/35] " Brandon Williams
@ 2018-02-07  1:12     ` Brandon Williams
  2018-02-13  0:25       ` Jonathan Nieder
  2018-02-07  1:12     ` [PATCH v3 02/35] pkt-line: introduce struct packet_reader Brandon Williams
                       ` (36 subsequent siblings)
  37 siblings, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-02-07  1:12 UTC (permalink / raw)
  To: git
  Cc: sbeller, peff, gitster, jrnieder, stolee, git, pclouds, Brandon Williams

The current pkt-line API encodes the status of a pkt-line read in the
length of the read content.  An error is indicated with '-1', a flush
with '0' (which can be confusing since a return value of '0' can also
indicate an empty pkt-line), and a positive integer for the length of
the read content otherwise.  This leaves little room for adding new
special packets in the future.

To solve this, introduce 'packet_read_with_status()', which reads a
packet and returns the status of the read encoded as an
'enum packet_read_status' type.  This makes it easy to distinguish
between special and normal packets as well as errors.  It also makes it
easy to add a new special packet in the future.
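
As a rough illustration of the ambiguity described above, the following
standalone sketch classifies a pkt-line held in memory.  The enum values
match the ones this patch adds to pkt-line.h; `hex_digit()` and
`classify_pkt_line()` are hypothetical helpers written for this example.
Unlike the real function, which calls die() on a malformed length, the
sketch reports malformed or truncated input as PACKET_READ_EOF.

```c
#include <stddef.h>

/* Same values as the enum this patch adds to pkt-line.h. */
enum packet_read_status {
	PACKET_READ_EOF = -1,
	PACKET_READ_NORMAL,
	PACKET_READ_FLUSH,
};

/* Hypothetical helper: value of one hex digit, or -1 if not hex. */
static int hex_digit(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/*
 * Hypothetical in-memory analogue of packet_read_with_status().
 * Classifies the pkt-line at the start of buf.  On PACKET_READ_NORMAL,
 * *pktlen is set to the payload length (excluding the 4-byte header).
 */
static enum packet_read_status classify_pkt_line(const char *buf,
						 size_t buflen, int *pktlen)
{
	int len = 0;
	int i;

	if (buflen < 4)
		return PACKET_READ_EOF;
	for (i = 0; i < 4; i++) {
		int digit = hex_digit(buf[i]);
		if (digit < 0)
			return PACKET_READ_EOF; /* the patch die()s here */
		len = (len << 4) | digit;
	}
	if (len == 0)
		return PACKET_READ_FLUSH;       /* "0000" */
	if (len < 4)
		return PACKET_READ_EOF;         /* the patch die()s here */
	if ((size_t)len > buflen)
		return PACKET_READ_EOF;         /* truncated input */
	*pktlen = len - 4;
	return PACKET_READ_NORMAL;
}
```

With this shape, "0000" (flush) and "0004" (an empty pkt-line) yield
different statuses, whereas the length-only interface returns 0 for both.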

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 pkt-line.c | 57 +++++++++++++++++++++++++++++++++++++++++++--------------
 pkt-line.h | 15 +++++++++++++++
 2 files changed, 58 insertions(+), 14 deletions(-)

diff --git a/pkt-line.c b/pkt-line.c
index 2827ca772..af0d2430f 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -280,28 +280,33 @@ static int packet_length(const char *linelen)
 	return (val < 0) ? val : (val << 8) | hex2chr(linelen + 2);
 }
 
-int packet_read(int fd, char **src_buf, size_t *src_len,
-		char *buffer, unsigned size, int options)
+enum packet_read_status packet_read_with_status(int fd, char **src_buffer, size_t *src_len,
+						char *buffer, unsigned size, int *pktlen,
+						int options)
 {
-	int len, ret;
+	int len;
 	char linelen[4];
 
-	ret = get_packet_data(fd, src_buf, src_len, linelen, 4, options);
-	if (ret < 0)
-		return ret;
+	if (get_packet_data(fd, src_buffer, src_len, linelen, 4, options) < 0)
+		return PACKET_READ_EOF;
+
 	len = packet_length(linelen);
-	if (len < 0)
+
+	if (len < 0) {
 		die("protocol error: bad line length character: %.4s", linelen);
-	if (!len) {
+	} else if (!len) {
 		packet_trace("0000", 4, 0);
-		return 0;
+		return PACKET_READ_FLUSH;
+	} else if (len < 4) {
+		die("protocol error: bad line length %d", len);
 	}
+
 	len -= 4;
-	if (len >= size)
+	if ((unsigned)len >= size)
 		die("protocol error: bad line length %d", len);
-	ret = get_packet_data(fd, src_buf, src_len, buffer, len, options);
-	if (ret < 0)
-		return ret;
+
+	if (get_packet_data(fd, src_buffer, src_len, buffer, len, options) < 0)
+		return PACKET_READ_EOF;
 
 	if ((options & PACKET_READ_CHOMP_NEWLINE) &&
 	    len && buffer[len-1] == '\n')
@@ -309,7 +314,31 @@ int packet_read(int fd, char **src_buf, size_t *src_len,
 
 	buffer[len] = 0;
 	packet_trace(buffer, len, 0);
-	return len;
+	*pktlen = len;
+	return PACKET_READ_NORMAL;
+}
+
+int packet_read(int fd, char **src_buffer, size_t *src_len,
+		char *buffer, unsigned size, int options)
+{
+	enum packet_read_status status;
+	int pktlen;
+
+	status = packet_read_with_status(fd, src_buffer, src_len,
+					 buffer, size, &pktlen,
+					 options);
+	switch (status) {
+	case PACKET_READ_EOF:
+		pktlen = -1;
+		break;
+	case PACKET_READ_NORMAL:
+		break;
+	case PACKET_READ_FLUSH:
+		pktlen = 0;
+		break;
+	}
+
+	return pktlen;
 }
 
 static char *packet_read_line_generic(int fd,
diff --git a/pkt-line.h b/pkt-line.h
index 3dad583e2..06c468927 100644
--- a/pkt-line.h
+++ b/pkt-line.h
@@ -65,6 +65,21 @@ int write_packetized_from_buf(const char *src_in, size_t len, int fd_out);
 int packet_read(int fd, char **src_buffer, size_t *src_len, char
 		*buffer, unsigned size, int options);
 
+/*
+ * Read a packetized line into a buffer like the 'packet_read()' function but
+ * returns an 'enum packet_read_status' which indicates the status of the read.
+ * The number of bytes read will be assigned to *pktlen if the status of the
+ * read was 'PACKET_READ_NORMAL'.
+ */
+enum packet_read_status {
+	PACKET_READ_EOF = -1,
+	PACKET_READ_NORMAL,
+	PACKET_READ_FLUSH,
+};
+enum packet_read_status packet_read_with_status(int fd, char **src_buffer, size_t *src_len,
+						char *buffer, unsigned size, int *pktlen,
+						int options);
+
 /*
  * Convenience wrapper for packet_read that is not gentle, and sets the
  * CHOMP_NEWLINE option. The return value is NULL for a flush packet,
-- 
2.16.0.rc1.238.g530d649a79-goog



* [PATCH v3 02/35] pkt-line: introduce struct packet_reader
  2018-02-07  1:12   ` [PATCH v3 00/35] " Brandon Williams
  2018-02-07  1:12     ` [PATCH v3 01/35] pkt-line: introduce packet_read_with_status Brandon Williams
@ 2018-02-07  1:12     ` Brandon Williams
  2018-02-13  0:49       ` Jonathan Nieder
  2018-02-27  5:57       ` Jonathan Nieder
  2018-02-07  1:12     ` [PATCH v3 03/35] pkt-line: add delim packet support Brandon Williams
                       ` (35 subsequent siblings)
  37 siblings, 2 replies; 362+ messages in thread
From: Brandon Williams @ 2018-02-07  1:12 UTC (permalink / raw)
  To: git
  Cc: sbeller, peff, gitster, jrnieder, stolee, git, pclouds, Brandon Williams

Sometimes it is advantageous to be able to peek the next packet line
without consuming it (e.g. to be able to determine the protocol version
a server is speaking).  In order to do that, introduce 'struct
packet_reader', which is an abstraction around the normal packet reading
logic.  A caller can peek a single line at a time using
'packet_reader_peek()' and consume a line by calling
'packet_reader_read()'.
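
For illustration, the peek/read contract can be sketched with a reader
over a NULL-terminated array of strings.  This is a simplified model
under stated assumptions: 'struct sketch_reader' and its functions are
hypothetical names, and the real reader works on pkt-lines and statuses
rather than plain strings.

```c
#include <stddef.h>

/* Hypothetical reader over a NULL-terminated array of lines. */
struct sketch_reader {
	const char **next;	/* next unread line */
	const char *line;	/* last line read or peeked, NULL at the end */
	int line_peeked;	/* set while 'line' has not been consumed yet */
};

static void sketch_reader_init(struct sketch_reader *r, const char **lines)
{
	r->next = lines;
	r->line = NULL;
	r->line_peeked = 0;
}

/* Consume a line: hand back the peeked one if any, else fetch a new one. */
static const char *sketch_reader_read(struct sketch_reader *r)
{
	if (r->line_peeked) {
		r->line_peeked = 0;
		return r->line;
	}
	r->line = *r->next;
	if (r->line)
		r->next++;
	return r->line;
}

/* Look at the next line without consuming it; only one line of lookahead. */
static const char *sketch_reader_peek(struct sketch_reader *r)
{
	if (r->line_peeked)
		return r->line;
	sketch_reader_read(r);
	r->line_peeked = 1;
	return r->line;
}
```

Peeking is implemented as a read plus a flag, so the following read
returns the cached line instead of touching the source again.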

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 pkt-line.c | 59 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 pkt-line.h | 58 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 117 insertions(+)

diff --git a/pkt-line.c b/pkt-line.c
index af0d2430f..4fc9ad4b0 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -406,3 +406,62 @@ ssize_t read_packetized_to_strbuf(int fd_in, struct strbuf *sb_out)
 	}
 	return sb_out->len - orig_len;
 }
+
+/* Packet Reader Functions */
+void packet_reader_init(struct packet_reader *reader, int fd,
+			char *src_buffer, size_t src_len,
+			int options)
+{
+	memset(reader, 0, sizeof(*reader));
+
+	reader->fd = fd;
+	reader->src_buffer = src_buffer;
+	reader->src_len = src_len;
+	reader->buffer = packet_buffer;
+	reader->buffer_size = sizeof(packet_buffer);
+	reader->options = options;
+}
+
+enum packet_read_status packet_reader_read(struct packet_reader *reader)
+{
+	if (reader->line_peeked) {
+		reader->line_peeked = 0;
+		return reader->status;
+	}
+
+	reader->status = packet_read_with_status(reader->fd,
+						 &reader->src_buffer,
+						 &reader->src_len,
+						 reader->buffer,
+						 reader->buffer_size,
+						 &reader->pktlen,
+						 reader->options);
+
+	switch (reader->status) {
+	case PACKET_READ_EOF:
+		reader->pktlen = -1;
+		reader->line = NULL;
+		break;
+	case PACKET_READ_NORMAL:
+		reader->line = reader->buffer;
+		break;
+	case PACKET_READ_FLUSH:
+		reader->pktlen = 0;
+		reader->line = NULL;
+		break;
+	}
+
+	return reader->status;
+}
+
+enum packet_read_status packet_reader_peek(struct packet_reader *reader)
+{
+	/* Only allow peeking a single line */
+	if (reader->line_peeked)
+		return reader->status;
+
+	/* Peek a line by reading it and setting peeked flag */
+	packet_reader_read(reader);
+	reader->line_peeked = 1;
+	return reader->status;
+}
diff --git a/pkt-line.h b/pkt-line.h
index 06c468927..7d9f0e537 100644
--- a/pkt-line.h
+++ b/pkt-line.h
@@ -111,6 +111,64 @@ char *packet_read_line_buf(char **src_buf, size_t *src_len, int *size);
  */
 ssize_t read_packetized_to_strbuf(int fd_in, struct strbuf *sb_out);
 
+struct packet_reader {
+	/* source file descriptor */
+	int fd;
+
+	/* source buffer and its size */
+	char *src_buffer;
+	size_t src_len;
+
+	/* buffer that pkt-lines are read into and its size */
+	char *buffer;
+	unsigned buffer_size;
+
+	/* options to be used during reads */
+	int options;
+
+	/* status of the last read */
+	enum packet_read_status status;
+
+	/* length of data read during the last read */
+	int pktlen;
+
+	/* the last line read */
+	const char *line;
+
+	/* indicates if a line has been peeked */
+	int line_peeked;
+};
+
+/*
+ * Initialize a 'struct packet_reader' object which is an
+ * abstraction around the 'packet_read_with_status()' function.
+ */
+extern void packet_reader_init(struct packet_reader *reader, int fd,
+			       char *src_buffer, size_t src_len,
+			       int options);
+
+/*
+ * Perform a packet read and return the status of the read.
+ * The values of 'pktlen' and 'line' are updated based on the status of the
+ * read as follows:
+ *
+ * PACKET_READ_EOF: 'pktlen' is set to '-1' and 'line' is set to NULL
+ * PACKET_READ_NORMAL: 'pktlen' is set to the number of bytes read
+ *		       'line' is set to point at the read line
+ * PACKET_READ_FLUSH: 'pktlen' is set to '0' and 'line' is set to NULL
+ */
+extern enum packet_read_status packet_reader_read(struct packet_reader *reader);
+
+/*
+ * Peek the next packet line without consuming it and return the status.
+ * The next call to 'packet_reader_read()' will perform a read of the same line
+ * that was peeked, consuming the line.
+ *
+ * Peeking multiple times without calling 'packet_reader_read()' will return
+ * the same result.
+ */
+extern enum packet_read_status packet_reader_peek(struct packet_reader *reader);
+
 #define DEFAULT_PACKET_MAX 1000
 #define LARGE_PACKET_MAX 65520
 #define LARGE_PACKET_DATA_MAX (LARGE_PACKET_MAX - 4)
-- 
2.16.0.rc1.238.g530d649a79-goog



* [PATCH v3 03/35] pkt-line: add delim packet support
  2018-02-07  1:12   ` [PATCH v3 00/35] " Brandon Williams
  2018-02-07  1:12     ` [PATCH v3 01/35] pkt-line: introduce packet_read_with_status Brandon Williams
  2018-02-07  1:12     ` [PATCH v3 02/35] pkt-line: introduce struct packet_reader Brandon Williams
@ 2018-02-07  1:12     ` Brandon Williams
  2018-02-22 19:13       ` Stefan Beller
  2018-02-07  1:12     ` [PATCH v3 04/35] upload-pack: convert to a builtin Brandon Williams
                       ` (34 subsequent siblings)
  37 siblings, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-02-07  1:12 UTC (permalink / raw)
  To: git
  Cc: sbeller, peff, gitster, jrnieder, stolee, git, pclouds, Brandon Williams

One of the design goals of protocol-v2 is to improve the semantics of
flush packets.  Currently in protocol-v1, flush packets are used both to
indicate a break in a list of packet lines and to indicate that one side
has finished speaking.  This makes it particularly difficult to
implement proxies, as a proxy would need to completely understand the git
protocol instead of simply looking for a flush packet.

To fix this, introduce the special delimiter packet '0001'.  A delim
packet can then be used as a delimiter between lists of packet lines
while flush packets can be reserved to indicate the end of a response.
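
To illustrate why this helps a proxy, here is a rough sketch (not part
of this patch) of finding the end of a response by framing alone,
without interpreting any payload.  'sketch_response_len()' is a
hypothetical helper and assumes lowercase hex lengths.

```c
#include <stddef.h>

/* Parse a 4-digit lowercase hex length; -1 on a bad character. */
static int sketch_hex4(const char *s)
{
	int i, val = 0;

	for (i = 0; i < 4; i++) {
		char c = s[i];
		int d;

		if (c >= '0' && c <= '9')
			d = c - '0';
		else if (c >= 'a' && c <= 'f')
			d = c - 'a' + 10;
		else
			return -1;
		val = (val << 4) | d;
	}
	return val;
}

/*
 * Return the number of bytes making up the first complete response in
 * buf, i.e. everything up to and including the first flush packet, or
 * -1 if the response is incomplete or malformed.  *delims receives the
 * number of delim packets seen before the flush.
 */
static long sketch_response_len(const char *buf, size_t len, int *delims)
{
	size_t pos = 0;

	*delims = 0;
	while (len - pos >= 4) {
		int n = sketch_hex4(buf + pos);

		if (n < 0)
			return -1;
		if (n == 0)		/* "0000": end of the response */
			return (long)(pos + 4);
		if (n == 1) {		/* "0001": section break only */
			(*delims)++;
			pos += 4;
			continue;
		}
		if (n < 4 || (size_t)n > len - pos)
			return -1;
		pos += n;		/* skip header and payload unread */
	}
	return -1;
}
```

With flush reserved for "end of response", the loop never has to know
which command is being spoken.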

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 pkt-line.c | 17 +++++++++++++++++
 pkt-line.h |  3 +++
 2 files changed, 20 insertions(+)

diff --git a/pkt-line.c b/pkt-line.c
index 4fc9ad4b0..726e109ca 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -91,6 +91,12 @@ void packet_flush(int fd)
 	write_or_die(fd, "0000", 4);
 }
 
+void packet_delim(int fd)
+{
+	packet_trace("0001", 4, 1);
+	write_or_die(fd, "0001", 4);
+}
+
 int packet_flush_gently(int fd)
 {
 	packet_trace("0000", 4, 1);
@@ -105,6 +111,12 @@ void packet_buf_flush(struct strbuf *buf)
 	strbuf_add(buf, "0000", 4);
 }
 
+void packet_buf_delim(struct strbuf *buf)
+{
+	packet_trace("0001", 4, 1);
+	strbuf_add(buf, "0001", 4);
+}
+
 static void set_packet_header(char *buf, const int size)
 {
 	static char hexchar[] = "0123456789abcdef";
@@ -297,6 +309,9 @@ enum packet_read_status packet_read_with_status(int fd, char **src_buffer, size_
 	} else if (!len) {
 		packet_trace("0000", 4, 0);
 		return PACKET_READ_FLUSH;
+	} else if (len == 1) {
+		packet_trace("0001", 4, 0);
+		return PACKET_READ_DELIM;
 	} else if (len < 4) {
 		die("protocol error: bad line length %d", len);
 	}
@@ -333,6 +348,7 @@ int packet_read(int fd, char **src_buffer, size_t *src_len,
 		break;
 	case PACKET_READ_NORMAL:
 		break;
+	case PACKET_READ_DELIM:
 	case PACKET_READ_FLUSH:
 		pktlen = 0;
 		break;
@@ -445,6 +461,7 @@ enum packet_read_status packet_reader_read(struct packet_reader *reader)
 	case PACKET_READ_NORMAL:
 		reader->line = reader->buffer;
 		break;
+	case PACKET_READ_DELIM:
 	case PACKET_READ_FLUSH:
 		reader->pktlen = 0;
 		reader->line = NULL;
diff --git a/pkt-line.h b/pkt-line.h
index 7d9f0e537..16fe8bdbf 100644
--- a/pkt-line.h
+++ b/pkt-line.h
@@ -20,8 +20,10 @@
  * side can't, we stay with pure read/write interfaces.
  */
 void packet_flush(int fd);
+void packet_delim(int fd);
 void packet_write_fmt(int fd, const char *fmt, ...) __attribute__((format (printf, 2, 3)));
 void packet_buf_flush(struct strbuf *buf);
+void packet_buf_delim(struct strbuf *buf);
 void packet_write(int fd_out, const char *buf, size_t size);
 void packet_buf_write(struct strbuf *buf, const char *fmt, ...) __attribute__((format (printf, 2, 3)));
 int packet_flush_gently(int fd);
@@ -75,6 +77,7 @@ enum packet_read_status {
 	PACKET_READ_EOF = -1,
 	PACKET_READ_NORMAL,
 	PACKET_READ_FLUSH,
+	PACKET_READ_DELIM,
 };
 enum packet_read_status packet_read_with_status(int fd, char **src_buffer, size_t *src_len,
 						char *buffer, unsigned size, int *pktlen,
-- 
2.16.0.rc1.238.g530d649a79-goog



* [PATCH v3 04/35] upload-pack: convert to a builtin
  2018-02-07  1:12   ` [PATCH v3 00/35] " Brandon Williams
                       ` (2 preceding siblings ...)
  2018-02-07  1:12     ` [PATCH v3 03/35] pkt-line: add delim packet support Brandon Williams
@ 2018-02-07  1:12     ` Brandon Williams
  2018-02-21 21:44       ` Jonathan Tan
  2018-02-07  1:12     ` [PATCH v3 05/35] upload-pack: factor out processing lines Brandon Williams
                       ` (33 subsequent siblings)
  37 siblings, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-02-07  1:12 UTC (permalink / raw)
  To: git
  Cc: sbeller, peff, gitster, jrnieder, stolee, git, pclouds, Brandon Williams

In order to allow for code sharing with the server side of fetch in
protocol-v2, convert upload-pack to be a builtin.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 Makefile              |   3 +-
 builtin.h             |   1 +
 builtin/upload-pack.c |  67 +++++++++++++++++++++++++++++++
 git.c                 |   1 +
 upload-pack.c         | 107 ++++++++++++--------------------------------------
 upload-pack.h         |  13 ++++++
 6 files changed, 109 insertions(+), 83 deletions(-)
 create mode 100644 builtin/upload-pack.c
 create mode 100644 upload-pack.h

diff --git a/Makefile b/Makefile
index 1a9b23b67..b7ccc05fa 100644
--- a/Makefile
+++ b/Makefile
@@ -639,7 +639,6 @@ PROGRAM_OBJS += imap-send.o
 PROGRAM_OBJS += sh-i18n--envsubst.o
 PROGRAM_OBJS += shell.o
 PROGRAM_OBJS += show-index.o
-PROGRAM_OBJS += upload-pack.o
 PROGRAM_OBJS += remote-testsvn.o
 
 # Binary suffix, set to .exe for Windows builds
@@ -909,6 +908,7 @@ LIB_OBJS += tree-diff.o
 LIB_OBJS += tree.o
 LIB_OBJS += tree-walk.o
 LIB_OBJS += unpack-trees.o
+LIB_OBJS += upload-pack.o
 LIB_OBJS += url.o
 LIB_OBJS += urlmatch.o
 LIB_OBJS += usage.o
@@ -1026,6 +1026,7 @@ BUILTIN_OBJS += builtin/update-index.o
 BUILTIN_OBJS += builtin/update-ref.o
 BUILTIN_OBJS += builtin/update-server-info.o
 BUILTIN_OBJS += builtin/upload-archive.o
+BUILTIN_OBJS += builtin/upload-pack.o
 BUILTIN_OBJS += builtin/var.o
 BUILTIN_OBJS += builtin/verify-commit.o
 BUILTIN_OBJS += builtin/verify-pack.o
diff --git a/builtin.h b/builtin.h
index 42378f3aa..f332a1257 100644
--- a/builtin.h
+++ b/builtin.h
@@ -231,6 +231,7 @@ extern int cmd_update_ref(int argc, const char **argv, const char *prefix);
 extern int cmd_update_server_info(int argc, const char **argv, const char *prefix);
 extern int cmd_upload_archive(int argc, const char **argv, const char *prefix);
 extern int cmd_upload_archive_writer(int argc, const char **argv, const char *prefix);
+extern int cmd_upload_pack(int argc, const char **argv, const char *prefix);
 extern int cmd_var(int argc, const char **argv, const char *prefix);
 extern int cmd_verify_commit(int argc, const char **argv, const char *prefix);
 extern int cmd_verify_tag(int argc, const char **argv, const char *prefix);
diff --git a/builtin/upload-pack.c b/builtin/upload-pack.c
new file mode 100644
index 000000000..2cb5cb35b
--- /dev/null
+++ b/builtin/upload-pack.c
@@ -0,0 +1,67 @@
+#include "cache.h"
+#include "builtin.h"
+#include "exec_cmd.h"
+#include "pkt-line.h"
+#include "parse-options.h"
+#include "protocol.h"
+#include "upload-pack.h"
+
+static const char * const upload_pack_usage[] = {
+	N_("git upload-pack [<options>] <dir>"),
+	NULL
+};
+
+int cmd_upload_pack(int argc, const char **argv, const char *prefix)
+{
+	const char *dir;
+	int strict = 0;
+	struct upload_pack_options opts = { 0 };
+	struct option options[] = {
+		OPT_BOOL(0, "stateless-rpc", &opts.stateless_rpc,
+			 N_("quit after a single request/response exchange")),
+		OPT_BOOL(0, "advertise-refs", &opts.advertise_refs,
+			 N_("exit immediately after initial ref advertisement")),
+		OPT_BOOL(0, "strict", &strict,
+			 N_("do not try <directory>/.git/ if <directory> is no Git directory")),
+		OPT_INTEGER(0, "timeout", &opts.timeout,
+			    N_("interrupt transfer after <n> seconds of inactivity")),
+		OPT_END()
+	};
+
+	packet_trace_identity("upload-pack");
+	check_replace_refs = 0;
+
+	argc = parse_options(argc, argv, NULL, options, upload_pack_usage, 0);
+
+	if (argc != 1)
+		usage_with_options(upload_pack_usage, options);
+
+	if (opts.timeout)
+		opts.daemon_mode = 1;
+
+	setup_path();
+
+	dir = argv[0];
+
+	if (!enter_repo(dir, strict))
+		die("'%s' does not appear to be a git repository", dir);
+
+	switch (determine_protocol_version_server()) {
+	case protocol_v1:
+		/*
+		 * v1 is just the original protocol with a version string,
+		 * so just fall through after writing the version string.
+		 */
+		if (opts.advertise_refs || !opts.stateless_rpc)
+			packet_write_fmt(1, "version 1\n");
+
+		/* fallthrough */
+	case protocol_v0:
+		upload_pack(&opts);
+		break;
+	case protocol_unknown_version:
+		BUG("unknown protocol version");
+	}
+
+	return 0;
+}
diff --git a/git.c b/git.c
index c870b9719..f71073dc8 100644
--- a/git.c
+++ b/git.c
@@ -478,6 +478,7 @@ static struct cmd_struct commands[] = {
 	{ "update-server-info", cmd_update_server_info, RUN_SETUP },
 	{ "upload-archive", cmd_upload_archive },
 	{ "upload-archive--writer", cmd_upload_archive_writer },
+	{ "upload-pack", cmd_upload_pack },
 	{ "var", cmd_var, RUN_SETUP_GENTLY },
 	{ "verify-commit", cmd_verify_commit, RUN_SETUP },
 	{ "verify-pack", cmd_verify_pack },
diff --git a/upload-pack.c b/upload-pack.c
index d5de18127..2ad73a98b 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -6,7 +6,6 @@
 #include "tag.h"
 #include "object.h"
 #include "commit.h"
-#include "exec_cmd.h"
 #include "diff.h"
 #include "revision.h"
 #include "list-objects.h"
@@ -15,15 +14,10 @@
 #include "sigchain.h"
 #include "version.h"
 #include "string-list.h"
-#include "parse-options.h"
 #include "argv-array.h"
 #include "prio-queue.h"
 #include "protocol.h"
-
-static const char * const upload_pack_usage[] = {
-	N_("git upload-pack [<options>] <dir>"),
-	NULL
-};
+#include "upload-pack.h"
 
 /* Remember to update object flag allocation in object.h */
 #define THEY_HAVE	(1u << 11)
@@ -61,7 +55,6 @@ static int keepalive = 5;
  * otherwise maximum packet size (up to 65520 bytes).
  */
 static int use_sideband;
-static int advertise_refs;
 static int stateless_rpc;
 static const char *pack_objects_hook;
 
@@ -977,33 +970,6 @@ static int find_symref(const char *refname, const struct object_id *oid,
 	return 0;
 }
 
-static void upload_pack(void)
-{
-	struct string_list symref = STRING_LIST_INIT_DUP;
-
-	head_ref_namespaced(find_symref, &symref);
-
-	if (advertise_refs || !stateless_rpc) {
-		reset_timeout();
-		head_ref_namespaced(send_ref, &symref);
-		for_each_namespaced_ref(send_ref, &symref);
-		advertise_shallow_grafts(1);
-		packet_flush(1);
-	} else {
-		head_ref_namespaced(check_ref, NULL);
-		for_each_namespaced_ref(check_ref, NULL);
-	}
-	string_list_clear(&symref, 1);
-	if (advertise_refs)
-		return;
-
-	receive_needs();
-	if (want_obj.nr) {
-		get_common_commits();
-		create_pack_file();
-	}
-}
-
 static int upload_pack_config(const char *var, const char *value, void *unused)
 {
 	if (!strcmp("uploadpack.allowtipsha1inwant", var)) {
@@ -1032,58 +998,35 @@ static int upload_pack_config(const char *var, const char *value, void *unused)
 	return parse_hide_refs_config(var, value, "uploadpack");
 }
 
-int cmd_main(int argc, const char **argv)
+void upload_pack(struct upload_pack_options *options)
 {
-	const char *dir;
-	int strict = 0;
-	struct option options[] = {
-		OPT_BOOL(0, "stateless-rpc", &stateless_rpc,
-			 N_("quit after a single request/response exchange")),
-		OPT_BOOL(0, "advertise-refs", &advertise_refs,
-			 N_("exit immediately after initial ref advertisement")),
-		OPT_BOOL(0, "strict", &strict,
-			 N_("do not try <directory>/.git/ if <directory> is no Git directory")),
-		OPT_INTEGER(0, "timeout", &timeout,
-			    N_("interrupt transfer after <n> seconds of inactivity")),
-		OPT_END()
-	};
-
-	packet_trace_identity("upload-pack");
-	check_replace_refs = 0;
-
-	argc = parse_options(argc, argv, NULL, options, upload_pack_usage, 0);
-
-	if (argc != 1)
-		usage_with_options(upload_pack_usage, options);
-
-	if (timeout)
-		daemon_mode = 1;
-
-	setup_path();
-
-	dir = argv[0];
+	struct string_list symref = STRING_LIST_INIT_DUP;
 
-	if (!enter_repo(dir, strict))
-		die("'%s' does not appear to be a git repository", dir);
+	stateless_rpc = options->stateless_rpc;
+	timeout = options->timeout;
+	daemon_mode = options->daemon_mode;
 
 	git_config(upload_pack_config, NULL);
 
-	switch (determine_protocol_version_server()) {
-	case protocol_v1:
-		/*
-		 * v1 is just the original protocol with a version string,
-		 * so just fall through after writing the version string.
-		 */
-		if (advertise_refs || !stateless_rpc)
-			packet_write_fmt(1, "version 1\n");
-
-		/* fallthrough */
-	case protocol_v0:
-		upload_pack();
-		break;
-	case protocol_unknown_version:
-		BUG("unknown protocol version");
+	head_ref_namespaced(find_symref, &symref);
+
+	if (options->advertise_refs || !stateless_rpc) {
+		reset_timeout();
+		head_ref_namespaced(send_ref, &symref);
+		for_each_namespaced_ref(send_ref, &symref);
+		advertise_shallow_grafts(1);
+		packet_flush(1);
+	} else {
+		head_ref_namespaced(check_ref, NULL);
+		for_each_namespaced_ref(check_ref, NULL);
 	}
+	string_list_clear(&symref, 1);
+	if (options->advertise_refs)
+		return;
 
-	return 0;
+	receive_needs();
+	if (want_obj.nr) {
+		get_common_commits();
+		create_pack_file();
+	}
 }
diff --git a/upload-pack.h b/upload-pack.h
new file mode 100644
index 000000000..a71e4dc7e
--- /dev/null
+++ b/upload-pack.h
@@ -0,0 +1,13 @@
+#ifndef UPLOAD_PACK_H
+#define UPLOAD_PACK_H
+
+struct upload_pack_options {
+	int stateless_rpc;
+	int advertise_refs;
+	unsigned int timeout;
+	int daemon_mode;
+};
+
+void upload_pack(struct upload_pack_options *options);
+
+#endif /* UPLOAD_PACK_H */
-- 
2.16.0.rc1.238.g530d649a79-goog



* [PATCH v3 05/35] upload-pack: factor out processing lines
  2018-02-07  1:12   ` [PATCH v3 00/35] " Brandon Williams
                       ` (3 preceding siblings ...)
  2018-02-07  1:12     ` [PATCH v3 04/35] upload-pack: convert to a builtin Brandon Williams
@ 2018-02-07  1:12     ` Brandon Williams
  2018-02-22 19:31       ` Stefan Beller
  2018-02-07  1:12     ` [PATCH v3 06/35] transport: use get_refs_via_connect to get refs Brandon Williams
                       ` (32 subsequent siblings)
  37 siblings, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-02-07  1:12 UTC (permalink / raw)
  To: git
  Cc: sbeller, peff, gitster, jrnieder, stolee, git, pclouds, Brandon Williams

Factor out the logic for processing shallow, deepen, deepen_since, and
deepen_not lines into their own functions.  This simplifies
'receive_needs()' and makes it easier to reuse some of this logic when
implementing protocol_v2.
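
The shape of such a helper can be sketched standalone as follows.  This
is an illustration under assumptions: 'sketch_skip_prefix()' is a
minimal stand-in for git's skip_prefix(), and the sketch reports
malformed input with -1 where the real code calls die().

```c
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Minimal stand-in for git's skip_prefix() helper. */
static int sketch_skip_prefix(const char *str, const char *prefix,
			      const char **out)
{
	size_t n = strlen(prefix);

	if (strncmp(str, prefix, n))
		return 0;
	*out = str + n;
	return 1;
}

/*
 * Returns 1 if the line was a "deepen " line and *depth was set,
 * 0 if the line is something else (the caller tries the next helper),
 * and -1 if it is a malformed deepen line.
 */
static int sketch_process_deepen(const char *line, int *depth)
{
	const char *arg;
	char *end = NULL;
	long val;

	if (!sketch_skip_prefix(line, "deepen ", &arg))
		return 0;

	val = strtol(arg, &end, 0);
	if (end == arg || *end || val <= 0 || val > INT_MAX)
		return -1;

	*depth = (int)val;
	return 1;
}
```

Because each helper reports whether it recognized the line, the caller
reduces to a chain of "if (helper(...)) continue;" checks.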

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 upload-pack.c | 113 ++++++++++++++++++++++++++++++++++++++--------------------
 1 file changed, 74 insertions(+), 39 deletions(-)

diff --git a/upload-pack.c b/upload-pack.c
index 2ad73a98b..1e8a9e1ca 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -724,6 +724,75 @@ static void deepen_by_rev_list(int ac, const char **av,
 	packet_flush(1);
 }
 
+static int process_shallow(const char *line, struct object_array *shallows)
+{
+	const char *arg;
+	if (skip_prefix(line, "shallow ", &arg)) {
+		struct object_id oid;
+		struct object *object;
+		if (get_oid_hex(arg, &oid))
+			die("invalid shallow line: %s", line);
+		object = parse_object(&oid);
+		if (!object)
+			return 1;
+		if (object->type != OBJ_COMMIT)
+			die("invalid shallow object %s", oid_to_hex(&oid));
+		if (!(object->flags & CLIENT_SHALLOW)) {
+			object->flags |= CLIENT_SHALLOW;
+			add_object_array(object, NULL, shallows);
+		}
+		return 1;
+	}
+
+	return 0;
+}
+
+static int process_deepen(const char *line, int *depth)
+{
+	const char *arg;
+	if (skip_prefix(line, "deepen ", &arg)) {
+		char *end = NULL;
+		*depth = (int)strtol(arg, &end, 0);
+		if (!end || *end || *depth <= 0)
+			die("Invalid deepen: %s", line);
+		return 1;
+	}
+
+	return 0;
+}
+
+static int process_deepen_since(const char *line, timestamp_t *deepen_since, int *deepen_rev_list)
+{
+	const char *arg;
+	if (skip_prefix(line, "deepen-since ", &arg)) {
+		char *end = NULL;
+		*deepen_since = parse_timestamp(arg, &end, 0);
+		if (!end || *end || !*deepen_since ||
+		    /* revisions.c's max_age -1 is special */
+		    *deepen_since == -1)
+			die("Invalid deepen-since: %s", line);
+		*deepen_rev_list = 1;
+		return 1;
+	}
+	return 0;
+}
+
+static int process_deepen_not(const char *line, struct string_list *deepen_not, int *deepen_rev_list)
+{
+	const char *arg;
+	if (skip_prefix(line, "deepen-not ", &arg)) {
+		char *ref = NULL;
+		struct object_id oid;
+		if (expand_ref(arg, strlen(arg), &oid, &ref) != 1)
+			die("git upload-pack: ambiguous deepen-not: %s", line);
+		string_list_append(deepen_not, ref);
+		free(ref);
+		*deepen_rev_list = 1;
+		return 1;
+	}
+	return 0;
+}
+
 static void receive_needs(void)
 {
 	struct object_array shallows = OBJECT_ARRAY_INIT;
@@ -745,49 +814,15 @@ static void receive_needs(void)
 		if (!line)
 			break;
 
-		if (skip_prefix(line, "shallow ", &arg)) {
-			struct object_id oid;
-			struct object *object;
-			if (get_oid_hex(arg, &oid))
-				die("invalid shallow line: %s", line);
-			object = parse_object(&oid);
-			if (!object)
-				continue;
-			if (object->type != OBJ_COMMIT)
-				die("invalid shallow object %s", oid_to_hex(&oid));
-			if (!(object->flags & CLIENT_SHALLOW)) {
-				object->flags |= CLIENT_SHALLOW;
-				add_object_array(object, NULL, &shallows);
-			}
+		if (process_shallow(line, &shallows))
 			continue;
-		}
-		if (skip_prefix(line, "deepen ", &arg)) {
-			char *end = NULL;
-			depth = strtol(arg, &end, 0);
-			if (!end || *end || depth <= 0)
-				die("Invalid deepen: %s", line);
+		if (process_deepen(line, &depth))
 			continue;
-		}
-		if (skip_prefix(line, "deepen-since ", &arg)) {
-			char *end = NULL;
-			deepen_since = parse_timestamp(arg, &end, 0);
-			if (!end || *end || !deepen_since ||
-			    /* revisions.c's max_age -1 is special */
-			    deepen_since == -1)
-				die("Invalid deepen-since: %s", line);
-			deepen_rev_list = 1;
+		if (process_deepen_since(line, &deepen_since, &deepen_rev_list))
 			continue;
-		}
-		if (skip_prefix(line, "deepen-not ", &arg)) {
-			char *ref = NULL;
-			struct object_id oid;
-			if (expand_ref(arg, strlen(arg), &oid, &ref) != 1)
-				die("git upload-pack: ambiguous deepen-not: %s", line);
-			string_list_append(&deepen_not, ref);
-			free(ref);
-			deepen_rev_list = 1;
+		if (process_deepen_not(line, &deepen_not, &deepen_rev_list))
 			continue;
-		}
+
 		if (!skip_prefix(line, "want ", &arg) ||
 		    get_oid_hex(arg, &oid_buf))
 			die("git upload-pack: protocol error, "
-- 
2.16.0.rc1.238.g530d649a79-goog



* [PATCH v3 06/35] transport: use get_refs_via_connect to get refs
  2018-02-07  1:12   ` [PATCH v3 00/35] " Brandon Williams
                       ` (4 preceding siblings ...)
  2018-02-07  1:12     ` [PATCH v3 05/35] upload-pack: factor out processing lines Brandon Williams
@ 2018-02-07  1:12     ` Brandon Williams
  2018-02-27  6:08       ` Jonathan Nieder
  2018-02-07  1:12     ` [PATCH v3 07/35] connect: convert get_remote_heads to use struct packet_reader Brandon Williams
                       ` (31 subsequent siblings)
  37 siblings, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-02-07  1:12 UTC (permalink / raw)
  To: git
  Cc: sbeller, peff, gitster, jrnieder, stolee, git, pclouds, Brandon Williams

Remove code duplication and use the existing 'get_refs_via_connect()'
function to retrieve a remote's heads in 'fetch_refs_via_pack()' and
'git_transport_push()'.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 transport.c | 18 ++++--------------
 1 file changed, 4 insertions(+), 14 deletions(-)

diff --git a/transport.c b/transport.c
index fc802260f..8e8779096 100644
--- a/transport.c
+++ b/transport.c
@@ -230,12 +230,8 @@ static int fetch_refs_via_pack(struct transport *transport,
 	args.cloning = transport->cloning;
 	args.update_shallow = data->options.update_shallow;
 
-	if (!data->got_remote_heads) {
-		connect_setup(transport, 0);
-		get_remote_heads(data->fd[0], NULL, 0, &refs_tmp, 0,
-				 NULL, &data->shallow);
-		data->got_remote_heads = 1;
-	}
+	if (!data->got_remote_heads)
+		refs_tmp = get_refs_via_connect(transport, 0);
 
 	refs = fetch_pack(&args, data->fd, data->conn,
 			  refs_tmp ? refs_tmp : transport->remote_refs,
@@ -541,14 +537,8 @@ static int git_transport_push(struct transport *transport, struct ref *remote_re
 	struct send_pack_args args;
 	int ret;
 
-	if (!data->got_remote_heads) {
-		struct ref *tmp_refs;
-		connect_setup(transport, 1);
-
-		get_remote_heads(data->fd[0], NULL, 0, &tmp_refs, REF_NORMAL,
-				 NULL, &data->shallow);
-		data->got_remote_heads = 1;
-	}
+	if (!data->got_remote_heads)
+		get_refs_via_connect(transport, 1);
 
 	memset(&args, 0, sizeof(args));
 	args.send_mirror = !!(flags & TRANSPORT_PUSH_MIRROR);
-- 
2.16.0.rc1.238.g530d649a79-goog



* [PATCH v3 07/35] connect: convert get_remote_heads to use struct packet_reader
  2018-02-07  1:12   ` [PATCH v3 00/35] " Brandon Williams
                       ` (5 preceding siblings ...)
  2018-02-07  1:12     ` [PATCH v3 06/35] transport: use get_refs_via_connect to get refs Brandon Williams
@ 2018-02-07  1:12     ` Brandon Williams
  2018-02-22 19:52       ` Stefan Beller
  2018-02-22 20:09       ` Stefan Beller
  2018-02-07  1:12     ` [PATCH v3 08/35] connect: discover protocol version outside of get_remote_heads Brandon Williams
                       ` (30 subsequent siblings)
  37 siblings, 2 replies; 362+ messages in thread
From: Brandon Williams @ 2018-02-07  1:12 UTC (permalink / raw)
  To: git
  Cc: sbeller, peff, gitster, jrnieder, stolee, git, pclouds, Brandon Williams

In order to allow for better control flow when protocol_v2 is introduced,
convert 'get_remote_heads()' to use 'struct packet_reader' to read
packet lines.  This enables a client to peek the first line of a
server's response (without consuming it) in order to determine the
protocol version it's speaking and then pass control to the appropriate
handler.

This is needed because the initial response from a server speaking
protocol_v0 includes the first ref, while subsequent protocol versions
respond with a version line.  We want to be able to read this first line
without consuming the first ref sent in the protocol_v0 case so that the
protocol version the server is speaking can be determined outside of
'get_remote_heads()' in a future patch.
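
As a rough sketch of the decision being made on the peeked line (the
'sketch_*' names are hypothetical, and anything other than "version 1"
after the prefix is simply treated as unknown here, which is a
simplification):

```c
#include <string.h>

enum sketch_version {
	SKETCH_UNKNOWN = -1,
	SKETCH_V0 = 0,
	SKETCH_V1 = 1
};

/* Classify the first line of a server's response. */
static enum sketch_version sketch_version_from_line(const char *first_line)
{
	static const char prefix[] = "version ";
	const char *value;

	/* No version line: this is already the first ref, i.e. v0. */
	if (strncmp(first_line, prefix, sizeof(prefix) - 1))
		return SKETCH_V0;

	value = first_line + sizeof(prefix) - 1;
	if (!strcmp(value, "1"))
		return SKETCH_V1;
	return SKETCH_UNKNOWN;
}

/*
 * Returns 1 if the peeked line is a version line that must be consumed
 * before ref parsing starts, 0 if it has to stay in the reader because
 * it is the first ref.
 */
static int sketch_must_consume_first_line(enum sketch_version v)
{
	return v == SKETCH_V1;
}
```

The v0 case is the reason for peeking: the line that reveals the
version is also data that the ref parser still needs to see.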

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 connect.c | 174 ++++++++++++++++++++++++++++++++++----------------------------
 1 file changed, 96 insertions(+), 78 deletions(-)

diff --git a/connect.c b/connect.c
index c3a014c5b..00e90075c 100644
--- a/connect.c
+++ b/connect.c
@@ -48,6 +48,12 @@ int check_ref_type(const struct ref *ref, int flags)
 
 static void die_initial_contact(int unexpected)
 {
+	/*
+	 * A hang-up after seeing some response from the other end
+	 * means that it is unexpected, as we know the other end is
+	 * willing to talk to us.  A hang-up before seeing any
+	 * response does not necessarily mean an ACL problem, though.
+	 */
 	if (unexpected)
 		die(_("The remote end hung up upon initial contact"));
 	else
@@ -56,6 +62,41 @@ static void die_initial_contact(int unexpected)
 		      "and the repository exists."));
 }
 
+static enum protocol_version discover_version(struct packet_reader *reader)
+{
+	enum protocol_version version = protocol_unknown_version;
+
+	/*
+	 * Peek the first line of the server's response to
+	 * determine the protocol version the server is speaking.
+	 */
+	switch (packet_reader_peek(reader)) {
+	case PACKET_READ_EOF:
+		die_initial_contact(0);
+	case PACKET_READ_FLUSH:
+	case PACKET_READ_DELIM:
+		version = protocol_v0;
+		break;
+	case PACKET_READ_NORMAL:
+		version = determine_protocol_version_client(reader->line);
+		break;
+	}
+
+	/* Maybe process capabilities here, at least for v2 */
+	switch (version) {
+	case protocol_v1:
+		/* Read the peeked version line */
+		packet_reader_read(reader);
+		break;
+	case protocol_v0:
+		break;
+	case protocol_unknown_version:
+		die("unknown protocol version: '%s'\n", reader->line);
+	}
+
+	return version;
+}
+
 static void parse_one_symref_info(struct string_list *symref, const char *val, int len)
 {
 	char *sym, *target;
@@ -109,60 +150,21 @@ static void annotate_refs_with_symref_info(struct ref *ref)
 	string_list_clear(&symref, 0);
 }
 
-/*
- * Read one line of a server's ref advertisement into packet_buffer.
- */
-static int read_remote_ref(int in, char **src_buf, size_t *src_len,
-			   int *responded)
+static void process_capabilities(const char *line, int *len)
 {
-	int len = packet_read(in, src_buf, src_len,
-			      packet_buffer, sizeof(packet_buffer),
-			      PACKET_READ_GENTLE_ON_EOF |
-			      PACKET_READ_CHOMP_NEWLINE);
-	const char *arg;
-	if (len < 0)
-		die_initial_contact(*responded);
-	if (len > 4 && skip_prefix(packet_buffer, "ERR ", &arg))
-		die("remote error: %s", arg);
-
-	*responded = 1;
-
-	return len;
-}
-
-#define EXPECTING_PROTOCOL_VERSION 0
-#define EXPECTING_FIRST_REF 1
-#define EXPECTING_REF 2
-#define EXPECTING_SHALLOW 3
-
-/* Returns 1 if packet_buffer is a protocol version pkt-line, 0 otherwise. */
-static int process_protocol_version(void)
-{
-	switch (determine_protocol_version_client(packet_buffer)) {
-	case protocol_v1:
-		return 1;
-	case protocol_v0:
-		return 0;
-	default:
-		die("server is speaking an unknown protocol");
-	}
-}
-
-static void process_capabilities(int *len)
-{
-	int nul_location = strlen(packet_buffer);
+	int nul_location = strlen(line);
 	if (nul_location == *len)
 		return;
-	server_capabilities = xstrdup(packet_buffer + nul_location + 1);
+	server_capabilities = xstrdup(line + nul_location + 1);
 	*len = nul_location;
 }
 
-static int process_dummy_ref(void)
+static int process_dummy_ref(const char *line)
 {
 	struct object_id oid;
 	const char *name;
 
-	if (parse_oid_hex(packet_buffer, &oid, &name))
+	if (parse_oid_hex(line, &oid, &name))
 		return 0;
 	if (*name != ' ')
 		return 0;
@@ -171,20 +173,20 @@ static int process_dummy_ref(void)
 	return !oidcmp(&null_oid, &oid) && !strcmp(name, "capabilities^{}");
 }
 
-static void check_no_capabilities(int len)
+static void check_no_capabilities(const char *line, int len)
 {
-	if (strlen(packet_buffer) != len)
+	if (strlen(line) != len)
 		warning("Ignoring capabilities after first line '%s'",
-			packet_buffer + strlen(packet_buffer));
+			line + strlen(line));
 }
 
-static int process_ref(int len, struct ref ***list, unsigned int flags,
-		       struct oid_array *extra_have)
+static int process_ref(const char *line, int len, struct ref ***list,
+		       unsigned int flags, struct oid_array *extra_have)
 {
 	struct object_id old_oid;
 	const char *name;
 
-	if (parse_oid_hex(packet_buffer, &old_oid, &name))
+	if (parse_oid_hex(line, &old_oid, &name))
 		return 0;
 	if (*name != ' ')
 		return 0;
@@ -200,16 +202,17 @@ static int process_ref(int len, struct ref ***list, unsigned int flags,
 		**list = ref;
 		*list = &ref->next;
 	}
-	check_no_capabilities(len);
+	check_no_capabilities(line, len);
 	return 1;
 }
 
-static int process_shallow(int len, struct oid_array *shallow_points)
+static int process_shallow(const char *line, int len,
+			   struct oid_array *shallow_points)
 {
 	const char *arg;
 	struct object_id old_oid;
 
-	if (!skip_prefix(packet_buffer, "shallow ", &arg))
+	if (!skip_prefix(line, "shallow ", &arg))
 		return 0;
 
 	if (get_oid_hex(arg, &old_oid))
@@ -217,10 +220,17 @@ static int process_shallow(int len, struct oid_array *shallow_points)
 	if (!shallow_points)
 		die("repository on the other end cannot be shallow");
 	oid_array_append(shallow_points, &old_oid);
-	check_no_capabilities(len);
+	check_no_capabilities(line, len);
 	return 1;
 }
 
+enum get_remote_heads_state {
+	EXPECTING_FIRST_REF = 0,
+	EXPECTING_REF,
+	EXPECTING_SHALLOW,
+	EXPECTING_DONE,
+};
+
 /*
  * Read all the refs from the other end
  */
@@ -230,47 +240,55 @@ struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
 			      struct oid_array *shallow_points)
 {
 	struct ref **orig_list = list;
+	int len = 0;
+	enum get_remote_heads_state state = EXPECTING_FIRST_REF;
+	struct packet_reader reader;
+	const char *arg;
 
-	/*
-	 * A hang-up after seeing some response from the other end
-	 * means that it is unexpected, as we know the other end is
-	 * willing to talk to us.  A hang-up before seeing any
-	 * response does not necessarily mean an ACL problem, though.
-	 */
-	int responded = 0;
-	int len;
-	int state = EXPECTING_PROTOCOL_VERSION;
+	packet_reader_init(&reader, in, src_buf, src_len,
+			   PACKET_READ_CHOMP_NEWLINE |
+			   PACKET_READ_GENTLE_ON_EOF);
+
+	discover_version(&reader);
 
 	*list = NULL;
 
-	while ((len = read_remote_ref(in, &src_buf, &src_len, &responded))) {
+	while (state != EXPECTING_DONE) {
+		switch (packet_reader_read(&reader)) {
+		case PACKET_READ_EOF:
+			die_initial_contact(1);
+		case PACKET_READ_NORMAL:
+			len = reader.pktlen;
+			if (len > 4 && skip_prefix(reader.line, "ERR ", &arg))
+				die("remote error: %s", arg);
+			break;
+		case PACKET_READ_FLUSH:
+			state = EXPECTING_DONE;
+			break;
+		case PACKET_READ_DELIM:
+			die("invalid packet");
+		}
+
 		switch (state) {
-		case EXPECTING_PROTOCOL_VERSION:
-			if (process_protocol_version()) {
-				state = EXPECTING_FIRST_REF;
-				break;
-			}
-			state = EXPECTING_FIRST_REF;
-			/* fallthrough */
 		case EXPECTING_FIRST_REF:
-			process_capabilities(&len);
-			if (process_dummy_ref()) {
+			process_capabilities(reader.line, &len);
+			if (process_dummy_ref(reader.line)) {
 				state = EXPECTING_SHALLOW;
 				break;
 			}
 			state = EXPECTING_REF;
 			/* fallthrough */
 		case EXPECTING_REF:
-			if (process_ref(len, &list, flags, extra_have))
+			if (process_ref(reader.line, len, &list, flags, extra_have))
 				break;
 			state = EXPECTING_SHALLOW;
 			/* fallthrough */
 		case EXPECTING_SHALLOW:
-			if (process_shallow(len, shallow_points))
+			if (process_shallow(reader.line, len, shallow_points))
 				break;
-			die("protocol error: unexpected '%s'", packet_buffer);
-		default:
-			die("unexpected state %d", state);
+			die("protocol error: unexpected '%s'", reader.line);
+		case EXPECTING_DONE:
+			break;
 		}
 	}
 
-- 
2.16.0.rc1.238.g530d649a79-goog


* [PATCH v3 08/35] connect: discover protocol version outside of get_remote_heads
  2018-02-07  1:12   ` [PATCH v3 00/35] " Brandon Williams
                       ` (6 preceding siblings ...)
  2018-02-07  1:12     ` [PATCH v3 07/35] connect: convert get_remote_heads to use struct packet_reader Brandon Williams
@ 2018-02-07  1:12     ` Brandon Williams
  2018-02-21 22:11       ` Jonathan Tan
  2018-02-07  1:12     ` [PATCH v3 09/35] transport: store protocol version Brandon Williams
                       ` (29 subsequent siblings)
  37 siblings, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-02-07  1:12 UTC (permalink / raw)
  To: git
  Cc: sbeller, peff, gitster, jrnieder, stolee, git, pclouds, Brandon Williams

In order to prepare for the addition of protocol_v2, push the protocol
version discovery outside of 'get_remote_heads()'.  This will allow for
keeping the logic for processing the reference advertisement for
protocol_v1 and protocol_v0 separate from the logic for protocol_v2.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
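For illustration only, a minimal standalone sketch of the
peek-then-dispatch idea described above.  It does not use git's real
'struct packet_reader'; every 'sketch_*' name is hypothetical and the
reader is modelled as an array of already-split lines.  The point it
tries to show is why the version line is peeked rather than read: for
v0 the first line is already ref data and must be left for the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names throughout; none of these exist in git. */
enum sketch_version {
	SKETCH_UNKNOWN = -1,
	SKETCH_V0 = 0,
	SKETCH_V1 = 1,
};

/* Toy stand-in for 'struct packet_reader': pkt-lines already split. */
struct sketch_reader {
	const char **lines;
	size_t nr;
	size_t pos;
};

/* Look at the next line without consuming it; NULL when none is left. */
static const char *sketch_peek(const struct sketch_reader *r)
{
	return r->pos < r->nr ? r->lines[r->pos] : NULL;
}

/* Consume and return the next line; NULL when none is left. */
static const char *sketch_read(struct sketch_reader *r)
{
	return r->pos < r->nr ? r->lines[r->pos++] : NULL;
}

/*
 * Decide the version from the first line.  Only a "version 1" line is
 * consumed; anything that is not a version line is treated as v0 and
 * left in place (assumption: a missing first line is treated as v0 too).
 */
static enum sketch_version sketch_discover_version(struct sketch_reader *r)
{
	const char *line;

	assert(r);
	line = sketch_peek(r);
	if (!line || strncmp(line, "version ", 8))
		return SKETCH_V0;
	if (strcmp(line + 8, "1"))
		return SKETCH_UNKNOWN;
	sketch_read(r);
	return SKETCH_V1;
}

/* Stand-in for get_remote_heads(): counts the remaining lines. */
static size_t sketch_count_refs(struct sketch_reader *r)
{
	size_t n = 0;

	while (sketch_read(r))
		n++;
	return n;
}
```

A caller would run sketch_discover_version() first and hand the same
reader to sketch_count_refs() only in the v0/v1 cases, mirroring the
switch statements added in this patch.
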
 builtin/fetch-pack.c | 16 +++++++++++++++-
 builtin/send-pack.c  | 17 +++++++++++++++--
 connect.c            | 27 ++++++++++-----------------
 connect.h            |  3 +++
 remote-curl.c        | 20 ++++++++++++++++++--
 remote.h             |  5 +++--
 transport.c          | 24 +++++++++++++++++++-----
 7 files changed, 83 insertions(+), 29 deletions(-)

diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
index 366b9d13f..85d4faf76 100644
--- a/builtin/fetch-pack.c
+++ b/builtin/fetch-pack.c
@@ -4,6 +4,7 @@
 #include "remote.h"
 #include "connect.h"
 #include "sha1-array.h"
+#include "protocol.h"
 
 static const char fetch_pack_usage[] =
 "git fetch-pack [--all] [--stdin] [--quiet | -q] [--keep | -k] [--thin] "
@@ -52,6 +53,7 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix)
 	struct fetch_pack_args args;
 	struct oid_array shallow = OID_ARRAY_INIT;
 	struct string_list deepen_not = STRING_LIST_INIT_DUP;
+	struct packet_reader reader;
 
 	packet_trace_identity("fetch-pack");
 
@@ -193,7 +195,19 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix)
 		if (!conn)
 			return args.diag_url ? 0 : 1;
 	}
-	get_remote_heads(fd[0], NULL, 0, &ref, 0, NULL, &shallow);
+
+	packet_reader_init(&reader, fd[0], NULL, 0,
+			   PACKET_READ_CHOMP_NEWLINE |
+			   PACKET_READ_GENTLE_ON_EOF);
+
+	switch (discover_version(&reader)) {
+	case protocol_v1:
+	case protocol_v0:
+		get_remote_heads(&reader, &ref, 0, NULL, &shallow);
+		break;
+	case protocol_unknown_version:
+		BUG("unknown protocol version");
+	}
 
 	ref = fetch_pack(&args, fd, conn, ref, dest, sought, nr_sought,
 			 &shallow, pack_lockfile_ptr);
diff --git a/builtin/send-pack.c b/builtin/send-pack.c
index fc4f0bb5f..83cb125a6 100644
--- a/builtin/send-pack.c
+++ b/builtin/send-pack.c
@@ -14,6 +14,7 @@
 #include "sha1-array.h"
 #include "gpg-interface.h"
 #include "gettext.h"
+#include "protocol.h"
 
 static const char * const send_pack_usage[] = {
 	N_("git send-pack [--all | --mirror] [--dry-run] [--force] "
@@ -154,6 +155,7 @@ int cmd_send_pack(int argc, const char **argv, const char *prefix)
 	int progress = -1;
 	int from_stdin = 0;
 	struct push_cas_option cas = {0};
+	struct packet_reader reader;
 
 	struct option options[] = {
 		OPT__VERBOSITY(&verbose),
@@ -256,8 +258,19 @@ int cmd_send_pack(int argc, const char **argv, const char *prefix)
 			args.verbose ? CONNECT_VERBOSE : 0);
 	}
 
-	get_remote_heads(fd[0], NULL, 0, &remote_refs, REF_NORMAL,
-			 &extra_have, &shallow);
+	packet_reader_init(&reader, fd[0], NULL, 0,
+			   PACKET_READ_CHOMP_NEWLINE |
+			   PACKET_READ_GENTLE_ON_EOF);
+
+	switch (discover_version(&reader)) {
+	case protocol_v1:
+	case protocol_v0:
+		get_remote_heads(&reader, &remote_refs, REF_NORMAL,
+				 &extra_have, &shallow);
+		break;
+	case protocol_unknown_version:
+		BUG("unknown protocol version");
+	}
 
 	transport_verify_remote_names(nr_refspecs, refspecs);
 
diff --git a/connect.c b/connect.c
index 00e90075c..db3c9d24c 100644
--- a/connect.c
+++ b/connect.c
@@ -62,7 +62,7 @@ static void die_initial_contact(int unexpected)
 		      "and the repository exists."));
 }
 
-static enum protocol_version discover_version(struct packet_reader *reader)
+enum protocol_version discover_version(struct packet_reader *reader)
 {
 	enum protocol_version version = protocol_unknown_version;
 
@@ -234,7 +234,7 @@ enum get_remote_heads_state {
 /*
  * Read all the refs from the other end
  */
-struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
+struct ref **get_remote_heads(struct packet_reader *reader,
 			      struct ref **list, unsigned int flags,
 			      struct oid_array *extra_have,
 			      struct oid_array *shallow_points)
@@ -242,24 +242,17 @@ struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
 	struct ref **orig_list = list;
 	int len = 0;
 	enum get_remote_heads_state state = EXPECTING_FIRST_REF;
-	struct packet_reader reader;
 	const char *arg;
 
-	packet_reader_init(&reader, in, src_buf, src_len,
-			   PACKET_READ_CHOMP_NEWLINE |
-			   PACKET_READ_GENTLE_ON_EOF);
-
-	discover_version(&reader);
-
 	*list = NULL;
 
 	while (state != EXPECTING_DONE) {
-		switch (packet_reader_read(&reader)) {
+		switch (packet_reader_read(reader)) {
 		case PACKET_READ_EOF:
 			die_initial_contact(1);
 		case PACKET_READ_NORMAL:
-			len = reader.pktlen;
-			if (len > 4 && skip_prefix(reader.line, "ERR ", &arg))
+			len = reader->pktlen;
+			if (len > 4 && skip_prefix(reader->line, "ERR ", &arg))
 				die("remote error: %s", arg);
 			break;
 		case PACKET_READ_FLUSH:
@@ -271,22 +264,22 @@ struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
 
 		switch (state) {
 		case EXPECTING_FIRST_REF:
-			process_capabilities(reader.line, &len);
-			if (process_dummy_ref(reader.line)) {
+			process_capabilities(reader->line, &len);
+			if (process_dummy_ref(reader->line)) {
 				state = EXPECTING_SHALLOW;
 				break;
 			}
 			state = EXPECTING_REF;
 			/* fallthrough */
 		case EXPECTING_REF:
-			if (process_ref(reader.line, len, &list, flags, extra_have))
+			if (process_ref(reader->line, len, &list, flags, extra_have))
 				break;
 			state = EXPECTING_SHALLOW;
 			/* fallthrough */
 		case EXPECTING_SHALLOW:
-			if (process_shallow(reader.line, len, shallow_points))
+			if (process_shallow(reader->line, len, shallow_points))
 				break;
-			die("protocol error: unexpected '%s'", reader.line);
+			die("protocol error: unexpected '%s'", reader->line);
 		case EXPECTING_DONE:
 			break;
 		}
diff --git a/connect.h b/connect.h
index 01f14cdf3..cdb8979dc 100644
--- a/connect.h
+++ b/connect.h
@@ -13,4 +13,7 @@ extern int parse_feature_request(const char *features, const char *feature);
 extern const char *server_feature_value(const char *feature, int *len_ret);
 extern int url_is_local_not_ssh(const char *url);
 
+struct packet_reader;
+extern enum protocol_version discover_version(struct packet_reader *reader);
+
 #endif
diff --git a/remote-curl.c b/remote-curl.c
index 0053b0954..9f6d07683 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -1,6 +1,7 @@
 #include "cache.h"
 #include "config.h"
 #include "remote.h"
+#include "connect.h"
 #include "strbuf.h"
 #include "walker.h"
 #include "http.h"
@@ -13,6 +14,7 @@
 #include "credential.h"
 #include "sha1-array.h"
 #include "send-pack.h"
+#include "protocol.h"
 
 static struct remote *remote;
 /* always ends with a trailing slash */
@@ -176,8 +178,22 @@ static struct discovery *last_discovery;
 static struct ref *parse_git_refs(struct discovery *heads, int for_push)
 {
 	struct ref *list = NULL;
-	get_remote_heads(-1, heads->buf, heads->len, &list,
-			 for_push ? REF_NORMAL : 0, NULL, &heads->shallow);
+	struct packet_reader reader;
+
+	packet_reader_init(&reader, -1, heads->buf, heads->len,
+			   PACKET_READ_CHOMP_NEWLINE |
+			   PACKET_READ_GENTLE_ON_EOF);
+
+	switch (discover_version(&reader)) {
+	case protocol_v1:
+	case protocol_v0:
+		get_remote_heads(&reader, &list, for_push ? REF_NORMAL : 0,
+				 NULL, &heads->shallow);
+		break;
+	case protocol_unknown_version:
+		BUG("unknown protocol version");
+	}
+
 	return list;
 }
 
diff --git a/remote.h b/remote.h
index 1f6611be2..2016461df 100644
--- a/remote.h
+++ b/remote.h
@@ -150,10 +150,11 @@ int check_ref_type(const struct ref *ref, int flags);
 void free_refs(struct ref *ref);
 
 struct oid_array;
-extern struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
+struct packet_reader;
+extern struct ref **get_remote_heads(struct packet_reader *reader,
 				     struct ref **list, unsigned int flags,
 				     struct oid_array *extra_have,
-				     struct oid_array *shallow);
+				     struct oid_array *shallow_points);
 
 int resolve_remote_symref(struct ref *ref, struct ref *list);
 int ref_newer(const struct object_id *new_oid, const struct object_id *old_oid);
diff --git a/transport.c b/transport.c
index 8e8779096..63c3dbab9 100644
--- a/transport.c
+++ b/transport.c
@@ -18,6 +18,7 @@
 #include "sha1-array.h"
 #include "sigchain.h"
 #include "transport-internal.h"
+#include "protocol.h"
 
 static void set_upstreams(struct transport *transport, struct ref *refs,
 	int pretend)
@@ -190,13 +191,26 @@ static int connect_setup(struct transport *transport, int for_push)
 static struct ref *get_refs_via_connect(struct transport *transport, int for_push)
 {
 	struct git_transport_data *data = transport->data;
-	struct ref *refs;
+	struct ref *refs = NULL;
+	struct packet_reader reader;
 
 	connect_setup(transport, for_push);
-	get_remote_heads(data->fd[0], NULL, 0, &refs,
-			 for_push ? REF_NORMAL : 0,
-			 &data->extra_have,
-			 &data->shallow);
+
+	packet_reader_init(&reader, data->fd[0], NULL, 0,
+			   PACKET_READ_CHOMP_NEWLINE |
+			   PACKET_READ_GENTLE_ON_EOF);
+
+	switch (discover_version(&reader)) {
+	case protocol_v1:
+	case protocol_v0:
+		get_remote_heads(&reader, &refs,
+				 for_push ? REF_NORMAL : 0,
+				 &data->extra_have,
+				 &data->shallow);
+		break;
+	case protocol_unknown_version:
+		BUG("unknown protocol version");
+	}
 	data->got_remote_heads = 1;
 
 	return refs;
-- 
2.16.0.rc1.238.g530d649a79-goog


* [PATCH v3 09/35] transport: store protocol version
  2018-02-07  1:12   ` [PATCH v3 00/35] " Brandon Williams
                       ` (7 preceding siblings ...)
  2018-02-07  1:12     ` [PATCH v3 08/35] connect: discover protocol version outside of get_remote_heads Brandon Williams
@ 2018-02-07  1:12     ` Brandon Williams
  2018-02-07  1:12     ` [PATCH v3 10/35] protocol: introduce enum protocol_version value protocol_v2 Brandon Williams
                       ` (28 subsequent siblings)
  37 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-02-07  1:12 UTC (permalink / raw)
  To: git
  Cc: sbeller, peff, gitster, jrnieder, stolee, git, pclouds, Brandon Williams

Once protocol_v2 is introduced, requesting a fetch or a push will need to
be handled differently depending on the protocol version.  Store the
protocol version the server is speaking in 'struct git_transport_data'
and use it to determine what to do in the case of a fetch or a push.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
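For illustration only, a minimal sketch of the caching described above,
assuming a toy transport struct rather than the real
'struct git_transport_data'.  All 'sketch_*' names are hypothetical,
and the 'nr_discoveries' counter exists only to make the behaviour
observable.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout; none of these exist in git. */
enum sketch_version {
	SKETCH_UNKNOWN = -1,
	SKETCH_V0 = 0,
	SKETCH_V1 = 1,
};

struct sketch_transport_data {
	unsigned got_remote_heads : 1;
	enum sketch_version version;
	int nr_discoveries;	/* only here to make the caching visible */
};

/* Stand-in for get_refs_via_connect(): remembers what the server speaks. */
static void sketch_get_refs(struct sketch_transport_data *data,
			    enum sketch_version server_speaks)
{
	data->version = server_speaks;
	data->got_remote_heads = 1;
	data->nr_discoveries++;
}

/*
 * Stand-in for fetch_refs_via_pack(): connects only if nobody has yet,
 * then picks a code path from the stored version.  Returns the name of
 * the path taken, or NULL for an unknown version.
 */
static const char *sketch_fetch(struct sketch_transport_data *data,
				enum sketch_version server_speaks)
{
	assert(data);
	if (!data->got_remote_heads)
		sketch_get_refs(data, server_speaks);

	switch (data->version) {
	case SKETCH_V1:
	case SKETCH_V0:
		return "fetch_pack";
	case SKETCH_UNKNOWN:
		break;
	}
	return NULL;
}
```

The second call to sketch_fetch() on the same struct does not discover
the version again; it reuses the stored value.
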
 transport.c | 35 ++++++++++++++++++++++++++---------
 1 file changed, 26 insertions(+), 9 deletions(-)

diff --git a/transport.c b/transport.c
index 63c3dbab9..2378dcb38 100644
--- a/transport.c
+++ b/transport.c
@@ -118,6 +118,7 @@ struct git_transport_data {
 	struct child_process *conn;
 	int fd[2];
 	unsigned got_remote_heads : 1;
+	enum protocol_version version;
 	struct oid_array extra_have;
 	struct oid_array shallow;
 };
@@ -200,7 +201,8 @@ static struct ref *get_refs_via_connect(struct transport *transport, int for_pus
 			   PACKET_READ_CHOMP_NEWLINE |
 			   PACKET_READ_GENTLE_ON_EOF);
 
-	switch (discover_version(&reader)) {
+	data->version = discover_version(&reader);
+	switch (data->version) {
 	case protocol_v1:
 	case protocol_v0:
 		get_remote_heads(&reader, &refs,
@@ -221,7 +223,7 @@ static int fetch_refs_via_pack(struct transport *transport,
 {
 	int ret = 0;
 	struct git_transport_data *data = transport->data;
-	struct ref *refs;
+	struct ref *refs = NULL;
 	char *dest = xstrdup(transport->url);
 	struct fetch_pack_args args;
 	struct ref *refs_tmp = NULL;
@@ -247,10 +249,18 @@ static int fetch_refs_via_pack(struct transport *transport,
 	if (!data->got_remote_heads)
 		refs_tmp = get_refs_via_connect(transport, 0);
 
-	refs = fetch_pack(&args, data->fd, data->conn,
-			  refs_tmp ? refs_tmp : transport->remote_refs,
-			  dest, to_fetch, nr_heads, &data->shallow,
-			  &transport->pack_lockfile);
+	switch (data->version) {
+	case protocol_v1:
+	case protocol_v0:
+		refs = fetch_pack(&args, data->fd, data->conn,
+				  refs_tmp ? refs_tmp : transport->remote_refs,
+				  dest, to_fetch, nr_heads, &data->shallow,
+				  &transport->pack_lockfile);
+		break;
+	case protocol_unknown_version:
+		BUG("unknown protocol version");
+	}
+
 	close(data->fd[0]);
 	close(data->fd[1]);
 	if (finish_connect(data->conn))
@@ -549,7 +559,7 @@ static int git_transport_push(struct transport *transport, struct ref *remote_re
 {
 	struct git_transport_data *data = transport->data;
 	struct send_pack_args args;
-	int ret;
+	int ret = 0;
 
 	if (!data->got_remote_heads)
 		get_refs_via_connect(transport, 1);
@@ -574,8 +584,15 @@ static int git_transport_push(struct transport *transport, struct ref *remote_re
 	else
 		args.push_cert = SEND_PACK_PUSH_CERT_NEVER;
 
-	ret = send_pack(&args, data->fd, data->conn, remote_refs,
-			&data->extra_have);
+	switch (data->version) {
+	case protocol_v1:
+	case protocol_v0:
+		ret = send_pack(&args, data->fd, data->conn, remote_refs,
+				&data->extra_have);
+		break;
+	case protocol_unknown_version:
+		BUG("unknown protocol version");
+	}
 
 	close(data->fd[1]);
 	close(data->fd[0]);
-- 
2.16.0.rc1.238.g530d649a79-goog


* [PATCH v3 10/35] protocol: introduce enum protocol_version value protocol_v2
  2018-02-07  1:12   ` [PATCH v3 00/35] " Brandon Williams
                       ` (8 preceding siblings ...)
  2018-02-07  1:12     ` [PATCH v3 09/35] transport: store protocol version Brandon Williams
@ 2018-02-07  1:12     ` Brandon Williams
  2018-02-27  6:18       ` Jonathan Nieder
  2018-02-07  1:12     ` [PATCH v3 11/35] test-pkt-line: introduce a packet-line test helper Brandon Williams
                       ` (27 subsequent siblings)
  37 siblings, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-02-07  1:12 UTC (permalink / raw)
  To: git
  Cc: sbeller, peff, gitster, jrnieder, stolee, git, pclouds, Brandon Williams

Introduce protocol_v2, a new value for 'enum protocol_version'.
Subsequent patches will fill in the implementation of protocol_v2.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
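For illustration only, a minimal standalone sketch of the two pieces
this patch touches: parsing the version string (mirroring the
protocol.c hunk below) and the server-side fallback that receive-pack
applies while v2 push is unimplemented.  The 'sketch_*' names are
hypothetical and the fallback helper is a simplification of the switch
statement, not code from the patch.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical names throughout; values mirror 'enum protocol_version'. */
enum sketch_version {
	SKETCH_UNKNOWN = -1,
	SKETCH_V0 = 0,
	SKETCH_V1 = 1,
	SKETCH_V2 = 2,
};

/* Map the textual version ("0", "1", "2") to the enum. */
static enum sketch_version sketch_parse_version(const char *value)
{
	assert(value);
	if (!strcmp(value, "0"))
		return SKETCH_V0;
	else if (!strcmp(value, "1"))
		return SKETCH_V1;
	else if (!strcmp(value, "2"))
		return SKETCH_V2;
	else
		return SKETCH_UNKNOWN;
}

/*
 * What a server without v2 push support ends up speaking: a request
 * for v2 is ignored and treated as v0; everything else is unchanged.
 */
static enum sketch_version sketch_push_effective(enum sketch_version requested)
{
	return requested == SKETCH_V2 ? SKETCH_V0 : requested;
}
```

Note the asymmetry in the patch itself: the servers fall back quietly,
while the clients die() if a server actually answers with v2.
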
 builtin/fetch-pack.c   | 3 +++
 builtin/receive-pack.c | 6 ++++++
 builtin/send-pack.c    | 3 +++
 builtin/upload-pack.c  | 7 +++++++
 connect.c              | 3 +++
 protocol.c             | 2 ++
 protocol.h             | 1 +
 remote-curl.c          | 3 +++
 transport.c            | 9 +++++++++
 9 files changed, 37 insertions(+)

diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
index 85d4faf76..f492e8abd 100644
--- a/builtin/fetch-pack.c
+++ b/builtin/fetch-pack.c
@@ -201,6 +201,9 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix)
 			   PACKET_READ_GENTLE_ON_EOF);
 
 	switch (discover_version(&reader)) {
+	case protocol_v2:
+		die("support for protocol v2 not implemented yet");
+		break;
 	case protocol_v1:
 	case protocol_v0:
 		get_remote_heads(&reader, &ref, 0, NULL, &shallow);
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index b7ce7c7f5..3656e94fd 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -1963,6 +1963,12 @@ int cmd_receive_pack(int argc, const char **argv, const char *prefix)
 		unpack_limit = receive_unpack_limit;
 
 	switch (determine_protocol_version_server()) {
+	case protocol_v2:
+		/*
+		 * push support for protocol v2 has not been implemented yet,
+		 * so ignore the request to use v2 and fallback to using v0.
+		 */
+		break;
 	case protocol_v1:
 		/*
 		 * v1 is just the original protocol with a version string,
diff --git a/builtin/send-pack.c b/builtin/send-pack.c
index 83cb125a6..b5427f75e 100644
--- a/builtin/send-pack.c
+++ b/builtin/send-pack.c
@@ -263,6 +263,9 @@ int cmd_send_pack(int argc, const char **argv, const char *prefix)
 			   PACKET_READ_GENTLE_ON_EOF);
 
 	switch (discover_version(&reader)) {
+	case protocol_v2:
+		die("support for protocol v2 not implemented yet");
+		break;
 	case protocol_v1:
 	case protocol_v0:
 		get_remote_heads(&reader, &remote_refs, REF_NORMAL,
diff --git a/builtin/upload-pack.c b/builtin/upload-pack.c
index 2cb5cb35b..8d53e9794 100644
--- a/builtin/upload-pack.c
+++ b/builtin/upload-pack.c
@@ -47,6 +47,13 @@ int cmd_upload_pack(int argc, const char **argv, const char *prefix)
 		die("'%s' does not appear to be a git repository", dir);
 
 	switch (determine_protocol_version_server()) {
+	case protocol_v2:
+		/*
+		 * fetch support for protocol v2 has not been implemented yet,
+		 * so ignore the request to use v2 and fallback to using v0.
+		 */
+		upload_pack(&opts);
+		break;
 	case protocol_v1:
 		/*
 		 * v1 is just the original protocol with a version string,
diff --git a/connect.c b/connect.c
index db3c9d24c..f2157a821 100644
--- a/connect.c
+++ b/connect.c
@@ -84,6 +84,9 @@ enum protocol_version discover_version(struct packet_reader *reader)
 
 	/* Maybe process capabilities here, at least for v2 */
 	switch (version) {
+	case protocol_v2:
+		die("support for protocol v2 not implemented yet");
+		break;
 	case protocol_v1:
 		/* Read the peeked version line */
 		packet_reader_read(reader);
diff --git a/protocol.c b/protocol.c
index 43012b7eb..5e636785d 100644
--- a/protocol.c
+++ b/protocol.c
@@ -8,6 +8,8 @@ static enum protocol_version parse_protocol_version(const char *value)
 		return protocol_v0;
 	else if (!strcmp(value, "1"))
 		return protocol_v1;
+	else if (!strcmp(value, "2"))
+		return protocol_v2;
 	else
 		return protocol_unknown_version;
 }
diff --git a/protocol.h b/protocol.h
index 1b2bc94a8..2ad35e433 100644
--- a/protocol.h
+++ b/protocol.h
@@ -5,6 +5,7 @@ enum protocol_version {
 	protocol_unknown_version = -1,
 	protocol_v0 = 0,
 	protocol_v1 = 1,
+	protocol_v2 = 2,
 };
 
 /*
diff --git a/remote-curl.c b/remote-curl.c
index 9f6d07683..dae8a4a48 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -185,6 +185,9 @@ static struct ref *parse_git_refs(struct discovery *heads, int for_push)
 			   PACKET_READ_GENTLE_ON_EOF);
 
 	switch (discover_version(&reader)) {
+	case protocol_v2:
+		die("support for protocol v2 not implemented yet");
+		break;
 	case protocol_v1:
 	case protocol_v0:
 		get_remote_heads(&reader, &list, for_push ? REF_NORMAL : 0,
diff --git a/transport.c b/transport.c
index 2378dcb38..83d9dd1df 100644
--- a/transport.c
+++ b/transport.c
@@ -203,6 +203,9 @@ static struct ref *get_refs_via_connect(struct transport *transport, int for_pus
 
 	data->version = discover_version(&reader);
 	switch (data->version) {
+	case protocol_v2:
+		die("support for protocol v2 not implemented yet");
+		break;
 	case protocol_v1:
 	case protocol_v0:
 		get_remote_heads(&reader, &refs,
@@ -250,6 +253,9 @@ static int fetch_refs_via_pack(struct transport *transport,
 		refs_tmp = get_refs_via_connect(transport, 0);
 
 	switch (data->version) {
+	case protocol_v2:
+		die("support for protocol v2 not implemented yet");
+		break;
 	case protocol_v1:
 	case protocol_v0:
 		refs = fetch_pack(&args, data->fd, data->conn,
@@ -585,6 +591,9 @@ static int git_transport_push(struct transport *transport, struct ref *remote_re
 		args.push_cert = SEND_PACK_PUSH_CERT_NEVER;
 
 	switch (data->version) {
+	case protocol_v2:
+		die("support for protocol v2 not implemented yet");
+		break;
 	case protocol_v1:
 	case protocol_v0:
 		ret = send_pack(&args, data->fd, data->conn, remote_refs,
-- 
2.16.0.rc1.238.g530d649a79-goog


* [PATCH v3 11/35] test-pkt-line: introduce a packet-line test helper
  2018-02-07  1:12   ` [PATCH v3 00/35] " Brandon Williams
                       ` (9 preceding siblings ...)
  2018-02-07  1:12     ` [PATCH v3 10/35] protocol: introduce enum protocol_version value protocol_v2 Brandon Williams
@ 2018-02-07  1:12     ` Brandon Williams
  2018-02-22 20:40       ` Stefan Beller
  2018-02-07  1:12     ` [PATCH v3 12/35] serve: introduce git-serve Brandon Williams
                       ` (26 subsequent siblings)
  37 siblings, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-02-07  1:12 UTC (permalink / raw)
  To: git
  Cc: sbeller, peff, gitster, jrnieder, stolee, git, pclouds, Brandon Williams

Introduce a packet-line test helper which can either pack an input
stream into packet-lines or unpack packet-lines from an input stream,
writing the result to stdout.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
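For illustration only, a minimal sketch of what "packing" a line means,
written against plain buffers instead of git's packet_write_fmt().
The 'sketch_*' names are hypothetical, and SKETCH_PKT_MAX is assumed
to match LARGE_PACKET_MAX.  A pkt-line is a four-digit hex length
(counting the four length bytes themselves) followed by the payload;
"0000" and "0001" are passed through as flush and delim packets, as in
pack_line() below.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumption: same limit as git's LARGE_PACKET_MAX. */
#define SKETCH_PKT_MAX 65520

/*
 * Encode 'line' as one pkt-line into 'out'.  Returns the number of
 * bytes written (excluding the trailing NUL), or -1 if it does not fit.
 */
static int sketch_pack_line(const char *line, char *out, size_t outsz)
{
	size_t len;

	assert(line && out);
	if (outsz < 5)
		return -1;
	if (!strcmp(line, "0000") || !strcmp(line, "0000\n"))
		return snprintf(out, outsz, "0000");	/* flush-pkt */
	if (!strcmp(line, "0001") || !strcmp(line, "0001\n"))
		return snprintf(out, outsz, "0001");	/* delim-pkt */

	len = strlen(line) + 4;		/* payload plus the length header */
	if (len > SKETCH_PKT_MAX || len >= outsz)
		return -1;
	return snprintf(out, outsz, "%04zx%s", len, line);
}
```

With this sketch, "hello\n" (6 bytes) becomes "000ahello\n", since
6 + 4 = 10 = 0x000a.
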
 Makefile                 |  1 +
 t/helper/test-pkt-line.c | 64 ++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 65 insertions(+)
 create mode 100644 t/helper/test-pkt-line.c

diff --git a/Makefile b/Makefile
index b7ccc05fa..3b849c060 100644
--- a/Makefile
+++ b/Makefile
@@ -669,6 +669,7 @@ TEST_PROGRAMS_NEED_X += test-mktemp
 TEST_PROGRAMS_NEED_X += test-online-cpus
 TEST_PROGRAMS_NEED_X += test-parse-options
 TEST_PROGRAMS_NEED_X += test-path-utils
+TEST_PROGRAMS_NEED_X += test-pkt-line
 TEST_PROGRAMS_NEED_X += test-prio-queue
 TEST_PROGRAMS_NEED_X += test-read-cache
 TEST_PROGRAMS_NEED_X += test-write-cache
diff --git a/t/helper/test-pkt-line.c b/t/helper/test-pkt-line.c
new file mode 100644
index 000000000..0f19e53c7
--- /dev/null
+++ b/t/helper/test-pkt-line.c
@@ -0,0 +1,64 @@
+#include "pkt-line.h"
+
+static void pack_line(const char *line)
+{
+	if (!strcmp(line, "0000") || !strcmp(line, "0000\n"))
+		packet_flush(1);
+	else if (!strcmp(line, "0001") || !strcmp(line, "0001\n"))
+		packet_delim(1);
+	else
+		packet_write_fmt(1, "%s", line);
+}
+
+static void pack(int argc, const char **argv)
+{
+	if (argc) { /* read from argv */
+		int i;
+		for (i = 0; i < argc; i++)
+			pack_line(argv[i]);
+	} else { /* read from stdin */
+		char line[LARGE_PACKET_MAX];
+		while (fgets(line, sizeof(line), stdin)) {
+			pack_line(line);
+		}
+	}
+}
+
+static void unpack(void)
+{
+	struct packet_reader reader;
+	packet_reader_init(&reader, 0, NULL, 0,
+			   PACKET_READ_GENTLE_ON_EOF |
+			   PACKET_READ_CHOMP_NEWLINE);
+
+	while (packet_reader_read(&reader) != PACKET_READ_EOF) {
+		switch (reader.status) {
+		case PACKET_READ_EOF:
+			break;
+		case PACKET_READ_NORMAL:
+			printf("%s\n", reader.line);
+			break;
+		case PACKET_READ_FLUSH:
+			printf("0000\n");
+			break;
+		case PACKET_READ_DELIM:
+			printf("0001\n");
+			break;
+		}
+	}
+}
+
+int cmd_main(int argc, const char **argv)
+{
+	if (argc < 2)
+		die("too few arguments");
+
+	if (!strcmp(argv[1], "pack"))
+		pack(argc - 2, argv + 2);
+	else if (!strcmp(argv[1], "unpack"))
+		unpack();
+	else
+		die("invalid argument '%s'", argv[1]);
+
+	return 0;
+}
-- 
2.16.0.rc1.238.g530d649a79-goog


* [PATCH v3 12/35] serve: introduce git-serve
  2018-02-07  1:12   ` [PATCH v3 00/35] " Brandon Williams
                       ` (10 preceding siblings ...)
  2018-02-07  1:12     ` [PATCH v3 11/35] test-pkt-line: introduce a packet-line test helper Brandon Williams
@ 2018-02-07  1:12     ` Brandon Williams
  2018-02-21 22:45       ` Jonathan Tan
  2018-02-22  9:33       ` Jeff King
  2018-02-07  1:12     ` [PATCH v3 13/35] ls-refs: introduce ls-refs server command Brandon Williams
                       ` (25 subsequent siblings)
  37 siblings, 2 replies; 362+ messages in thread
From: Brandon Williams @ 2018-02-07  1:12 UTC (permalink / raw)
  To: git
  Cc: sbeller, peff, gitster, jrnieder, stolee, git, pclouds, Brandon Williams

Introduce git-serve, the base server for protocol version 2.

Protocol version 2 is intended to be a replacement for Git's current
wire protocol.  The intention is that it will be a simpler, less
wasteful protocol which can evolve over time.

Protocol version 2 improves upon version 1 by eliminating the initial
ref advertisement.  In its place a server will export a list of
capabilities and commands which it supports in a capability
advertisement.  A client can then request that a particular command be
executed by providing a number of capabilities and command specific
parameters.  At the completion of a command, a client can request that
another command be executed or can terminate the connection by sending a
flush packet.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
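For illustration only, a rough sketch of the command loop described
above: the client sends requests, each ended by a flush packet, and a
lone flush packet ends the session.  This is a toy model over
pre-parsed packets, not serve.c; the 'sketch_*' names are hypothetical,
and the "command=" key and the "exactly one command per request" rule
are assumptions made for the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names throughout; none of these exist in git. */
enum sketch_pkt_type {
	SKETCH_PKT_NORMAL,
	SKETCH_PKT_DELIM,
	SKETCH_PKT_FLUSH,
};

struct sketch_pkt {
	enum sketch_pkt_type type;
	const char *line;	/* only meaningful for SKETCH_PKT_NORMAL */
};

/*
 * Walk a client's packet stream.  Returns the number of commands
 * served before the terminating flush, or -1 on a malformed request.
 */
static int sketch_serve(const struct sketch_pkt *pkts, size_t nr)
{
	size_t i = 0;
	int served = 0;

	assert(pkts || !nr);
	while (i < nr) {
		int nr_commands = 0, in_args = 0;

		if (pkts[i].type == SKETCH_PKT_FLUSH)
			return served;	/* lone flush: client is done */

		/* Keys come before the delim packet, arguments after it. */
		for (; i < nr && pkts[i].type != SKETCH_PKT_FLUSH; i++) {
			if (pkts[i].type == SKETCH_PKT_DELIM)
				in_args = 1;
			else if (!in_args &&
				 !strncmp(pkts[i].line, "command=", 8))
				nr_commands++;
		}
		if (i == nr || nr_commands != 1)
			return -1;	/* unterminated, or not one command */
		i++;			/* consume the request's flush */
		served++;
	}
	return served;	/* input ended without a terminating flush */
}
```

A session with an ls-refs request, a fetch request and a final flush
would count as two served commands in this model.
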
 .gitignore                              |   1 +
 Documentation/technical/protocol-v2.txt | 114 +++++++++++++++
 Makefile                                |   2 +
 builtin.h                               |   1 +
 builtin/serve.c                         |  30 ++++
 git.c                                   |   1 +
 serve.c                                 | 250 ++++++++++++++++++++++++++++++++
 serve.h                                 |  15 ++
 t/t5701-git-serve.sh                    |  60 ++++++++
 9 files changed, 474 insertions(+)
 create mode 100644 Documentation/technical/protocol-v2.txt
 create mode 100644 builtin/serve.c
 create mode 100644 serve.c
 create mode 100644 serve.h
 create mode 100755 t/t5701-git-serve.sh

diff --git a/.gitignore b/.gitignore
index 833ef3b0b..2d0450c26 100644
--- a/.gitignore
+++ b/.gitignore
@@ -140,6 +140,7 @@
 /git-rm
 /git-send-email
 /git-send-pack
+/git-serve
 /git-sh-i18n
 /git-sh-i18n--envsubst
 /git-sh-setup
diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
new file mode 100644
index 000000000..f87372f9b
--- /dev/null
+++ b/Documentation/technical/protocol-v2.txt
@@ -0,0 +1,114 @@
+ Git Wire Protocol, Version 2
+==============================
+
+This document presents a specification for a version 2 of Git's wire
+protocol.  Protocol v2 will improve upon v1 in the following ways:
+
+  * Instead of multiple service names, multiple commands will be
+    supported by a single service
+  * Easily extendable as capabilities are moved into their own section
+    of the protocol, no longer being hidden behind a NUL byte and
+    limited by the size of a pkt-line (as there will be a single
+    capability per pkt-line)
+  * Separate out other information hidden behind NUL bytes (e.g. agent
+    string as a capability and symrefs can be requested using 'ls-refs')
+  * Reference advertisement will be omitted unless explicitly requested
+  * ls-refs command to explicitly request some refs
+  * Designed with http and stateless-rpc in mind.  With clear flush
+    semantics the http remote helper can simply act as a proxy.
+
+ Detailed Design
+=================
+
+A client can request to speak protocol v2 by sending `version=2` in the
+side-channel `GIT_PROTOCOL` in the initial request to the server.
+
+In protocol v2 communication is command oriented.  When first contacting a
+server a list of capabilities will be advertised.  Some of these capabilities
+will be commands which a client can request be executed.  Once a command
+has completed, a client can reuse the connection and request that other
+commands be executed.
+
+ Special Packets
+-----------------
+
+In protocol v2 these special packets will have the following semantics:
+
+  * '0000' Flush Packet (flush-pkt) - indicates the end of a message
+  * '0001' Delimiter Packet (delim-pkt) - separates sections of a message
+
+ Capability Advertisement
+--------------------------
+
+A server which decides to communicate (based on a request from a client)
+using protocol version 2, notifies the client by sending a version string
+in its initial response followed by an advertisement of its capabilities.
+Each capability is a key with an optional value.  Clients must ignore all
+unknown keys.  Semantics of unknown values are left to the definition of
+each key.  Some capabilities will describe commands which can be requested
+to be executed by the client.
+
+    capability-advertisement = protocol-version
+			       capability-list
+			       flush-pkt
+
+    protocol-version = PKT-LINE("version 2" LF)
+    capability-list = *capability
+    capability = PKT-LINE(key[=value] LF)
+
+    key = 1*CHAR
+    value = 1*CHAR
+    CHAR = 1*(ALPHA / DIGIT / "-" / "_")
+
+A client then responds to select the command it wants with any particular
+capabilities or arguments.  There is then an optional section where the
+client can provide any command specific parameters or queries.
+
+    command-request = command
+		      capability-list
+		      (command-args)
+		      flush-pkt
+    command = PKT-LINE("command=" key LF)
+    command-args = delim-pkt
+		   *arg
+    arg = 1*CHAR
+
+The server will then check to ensure that the client's request is
+comprised of a valid command as well as valid capabilities which were
+advertised.  If the request is valid the server will then execute the
+command.
+
+When a command has finished a client can either request that another
+command be executed or can terminate the connection by sending an empty
+request consisting of just a flush-pkt.
+
+ Capabilities
+~~~~~~~~~~~~~~
+
+There are two different types of capabilities: normal capabilities,
+which can be used to convey information or alter the behavior of a
+request, and command capabilities, which are the core actions that a
+client wants to perform (fetch, push, etc).
+
+All commands must only last a single round and be stateless from the
+perspective of the server side.  All state MUST be retained and managed
+by the client process.  This permits simple round-robin load-balancing
+on the server side, without needing to worry about state management.
+
+Clients MUST NOT require state management on the server side in order to
+function correctly.
+
+ agent
+-------
+
+The server can advertise the `agent` capability with a value `X` (in the
+form `agent=X`) to notify the client that the server is running version
+`X`.  The client may optionally send its own agent string by including
+the `agent` capability with a value `Y` (in the form `agent=Y`) in its
+request to the server (but it MUST NOT do so if the server did not
+advertise the agent capability). The `X` and `Y` strings may contain any
+printable ASCII characters except space (i.e., the byte range 32 < x <
+127), and are typically of the form "package/version" (e.g.,
+"git/1.8.3.1"). The agent strings are purely informative for statistics
+and debugging purposes, and MUST NOT be used to programmatically assume
+the presence or absence of particular features.
diff --git a/Makefile b/Makefile
index 3b849c060..18c255428 100644
--- a/Makefile
+++ b/Makefile
@@ -881,6 +881,7 @@ LIB_OBJS += revision.o
 LIB_OBJS += run-command.o
 LIB_OBJS += send-pack.o
 LIB_OBJS += sequencer.o
+LIB_OBJS += serve.o
 LIB_OBJS += server-info.o
 LIB_OBJS += setup.o
 LIB_OBJS += sha1-array.o
@@ -1014,6 +1015,7 @@ BUILTIN_OBJS += builtin/rev-parse.o
 BUILTIN_OBJS += builtin/revert.o
 BUILTIN_OBJS += builtin/rm.o
 BUILTIN_OBJS += builtin/send-pack.o
+BUILTIN_OBJS += builtin/serve.o
 BUILTIN_OBJS += builtin/shortlog.o
 BUILTIN_OBJS += builtin/show-branch.o
 BUILTIN_OBJS += builtin/show-ref.o
diff --git a/builtin.h b/builtin.h
index f332a1257..3f3fdfc28 100644
--- a/builtin.h
+++ b/builtin.h
@@ -215,6 +215,7 @@ extern int cmd_rev_parse(int argc, const char **argv, const char *prefix);
 extern int cmd_revert(int argc, const char **argv, const char *prefix);
 extern int cmd_rm(int argc, const char **argv, const char *prefix);
 extern int cmd_send_pack(int argc, const char **argv, const char *prefix);
+extern int cmd_serve(int argc, const char **argv, const char *prefix);
 extern int cmd_shortlog(int argc, const char **argv, const char *prefix);
 extern int cmd_show(int argc, const char **argv, const char *prefix);
 extern int cmd_show_branch(int argc, const char **argv, const char *prefix);
diff --git a/builtin/serve.c b/builtin/serve.c
new file mode 100644
index 000000000..d3fd240bb
--- /dev/null
+++ b/builtin/serve.c
@@ -0,0 +1,30 @@
+#include "cache.h"
+#include "builtin.h"
+#include "parse-options.h"
+#include "serve.h"
+
+static char const * const serve_usage[] = {
+	N_("git serve [<options>]"),
+	NULL
+};
+
+int cmd_serve(int argc, const char **argv, const char *prefix)
+{
+	struct serve_options opts = SERVE_OPTIONS_INIT;
+
+	struct option options[] = {
+		OPT_BOOL(0, "stateless-rpc", &opts.stateless_rpc,
+			 N_("quit after a single request/response exchange")),
+		OPT_BOOL(0, "advertise-capabilities", &opts.advertise_capabilities,
+			 N_("exit immediately after advertising capabilities")),
+		OPT_END()
+	};
+
+	/* ignore all unknown cmdline switches for now */
+	argc = parse_options(argc, argv, prefix, options, serve_usage,
+			     PARSE_OPT_KEEP_DASHDASH |
+			     PARSE_OPT_KEEP_UNKNOWN);
+	serve(&opts);
+
+	return 0;
+}
diff --git a/git.c b/git.c
index f71073dc8..f85d682b6 100644
--- a/git.c
+++ b/git.c
@@ -461,6 +461,7 @@ static struct cmd_struct commands[] = {
 	{ "revert", cmd_revert, RUN_SETUP | NEED_WORK_TREE },
 	{ "rm", cmd_rm, RUN_SETUP },
 	{ "send-pack", cmd_send_pack, RUN_SETUP },
+	{ "serve", cmd_serve, RUN_SETUP },
 	{ "shortlog", cmd_shortlog, RUN_SETUP_GENTLY | USE_PAGER },
 	{ "show", cmd_show, RUN_SETUP },
 	{ "show-branch", cmd_show_branch, RUN_SETUP },
diff --git a/serve.c b/serve.c
new file mode 100644
index 000000000..cf23179b9
--- /dev/null
+++ b/serve.c
@@ -0,0 +1,250 @@
+#include "cache.h"
+#include "repository.h"
+#include "config.h"
+#include "pkt-line.h"
+#include "version.h"
+#include "argv-array.h"
+#include "serve.h"
+
+static int agent_advertise(struct repository *r,
+			   struct strbuf *value)
+{
+	if (value)
+		strbuf_addstr(value, git_user_agent_sanitized());
+	return 1;
+}
+
+struct protocol_capability {
+	/*
+	 * The name of the capability.  The server uses this name when
+	 * advertising this capability, and the client uses this name to
+	 * specify this capability.
+	 */
+	const char *name;
+
+	/*
+	 * Function queried to see if a capability should be advertised.
+	 * Optionally a value can be specified by adding it to 'value'.
+	 * If a value is added to 'value', the server will advertise this
+	 * capability as "<name>=<value>" instead of "<name>".
+	 */
+	int (*advertise)(struct repository *r, struct strbuf *value);
+
+	/*
+	 * Function called when a client requests the capability as a command.
+	 * The command request will be provided to the function via 'keys', the
+	 * capabilities requested, and 'args', the command specific parameters.
+	 *
+	 * This field should be NULL for capabilities which are not commands.
+	 */
+	int (*command)(struct repository *r,
+		       struct argv_array *keys,
+		       struct argv_array *args);
+};
+
+static struct protocol_capability capabilities[] = {
+	{ "agent", agent_advertise, NULL },
+};
+
+static void advertise_capabilities(void)
+{
+	struct strbuf capability = STRBUF_INIT;
+	struct strbuf value = STRBUF_INIT;
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(capabilities); i++) {
+		struct protocol_capability *c = &capabilities[i];
+
+		if (c->advertise(the_repository, &value)) {
+			strbuf_addstr(&capability, c->name);
+
+			if (value.len) {
+				strbuf_addch(&capability, '=');
+				strbuf_addbuf(&capability, &value);
+			}
+
+			strbuf_addch(&capability, '\n');
+			packet_write(1, capability.buf, capability.len);
+		}
+
+		strbuf_reset(&capability);
+		strbuf_reset(&value);
+	}
+
+	packet_flush(1);
+	strbuf_release(&capability);
+	strbuf_release(&value);
+}
+
+static struct protocol_capability *get_capability(const char *key)
+{
+	int i;
+
+	if (!key)
+		return NULL;
+
+	for (i = 0; i < ARRAY_SIZE(capabilities); i++) {
+		struct protocol_capability *c = &capabilities[i];
+		const char *out;
+		if (skip_prefix(key, c->name, &out) && (!*out || *out == '='))
+			return c;
+	}
+
+	return NULL;
+}
+
+static int is_valid_capability(const char *key)
+{
+	const struct protocol_capability *c = get_capability(key);
+
+	return c && c->advertise(the_repository, NULL);
+}
+
+static int is_command(const char *key, struct protocol_capability **command)
+{
+	const char *out;
+
+	if (skip_prefix(key, "command=", &out)) {
+		struct protocol_capability *cmd = get_capability(out);
+
+		if (!cmd || !cmd->advertise(the_repository, NULL) || !cmd->command)
+			die("invalid command '%s'", out);
+		if (*command)
+			die("command already requested");
+
+		*command = cmd;
+		return 1;
+	}
+
+	return 0;
+}
+
+int has_capability(const struct argv_array *keys, const char *capability,
+		   const char **value)
+{
+	int i;
+	for (i = 0; i < keys->argc; i++) {
+		const char *out;
+		if (skip_prefix(keys->argv[i], capability, &out) &&
+		    (!*out || *out == '=')) {
+			if (value) {
+				if (*out == '=')
+					out++;
+				*value = out;
+			}
+			return 1;
+		}
+	}
+
+	return 0;
+}
+
+enum request_state {
+	PROCESS_REQUEST_KEYS = 0,
+	PROCESS_REQUEST_ARGS,
+	PROCESS_REQUEST_DONE,
+};
+
+static int process_request(void)
+{
+	enum request_state state = PROCESS_REQUEST_KEYS;
+	struct packet_reader reader;
+	struct argv_array keys = ARGV_ARRAY_INIT;
+	struct argv_array args = ARGV_ARRAY_INIT;
+	struct protocol_capability *command = NULL;
+
+	packet_reader_init(&reader, 0, NULL, 0,
+			   PACKET_READ_CHOMP_NEWLINE |
+			   PACKET_READ_GENTLE_ON_EOF);
+
+	/*
+	 * Check to see if the client closed their end before sending another
+	 * request.  If so we can terminate the connection.
+	 */
+	if (packet_reader_peek(&reader) == PACKET_READ_EOF)
+		return 1;
+	reader.options = PACKET_READ_CHOMP_NEWLINE;
+
+	while (state != PROCESS_REQUEST_DONE) {
+		switch (packet_reader_read(&reader)) {
+		case PACKET_READ_EOF:
+			BUG("Should have already died when seeing EOF");
+		case PACKET_READ_NORMAL:
+			break;
+		case PACKET_READ_FLUSH:
+			state = PROCESS_REQUEST_DONE;
+			continue;
+		case PACKET_READ_DELIM:
+			if (state != PROCESS_REQUEST_KEYS)
+				die("protocol error");
+			state = PROCESS_REQUEST_ARGS;
+			/*
+			 * maybe include a check to make sure that a
+			 * command/capabilities were given.
+			 */
+			continue;
+		}
+
+		switch (state) {
+		case PROCESS_REQUEST_KEYS:
+			/* collect request; a sequence of keys and values */
+			if (is_command(reader.line, &command) ||
+			    is_valid_capability(reader.line))
+				argv_array_push(&keys, reader.line);
+			else
+				die("unknown capability '%s'", reader.line);
+			break;
+		case PROCESS_REQUEST_ARGS:
+			/* collect arguments for the requested command */
+			argv_array_push(&args, reader.line);
+			break;
+		case PROCESS_REQUEST_DONE:
+			continue;
+		}
+	}
+
+	/*
+	 * If no command and no keys were given then the client wanted to
+	 * terminate the connection.
+	 */
+	if (!keys.argc && !args.argc)
+		return 1;
+
+	if (!command)
+		die("no command requested");
+
+	command->command(the_repository, &keys, &args);
+
+	argv_array_clear(&keys);
+	argv_array_clear(&args);
+	return 0;
+}
+
+/* Main serve loop for protocol version 2 */
+void serve(struct serve_options *options)
+{
+	if (options->advertise_capabilities || !options->stateless_rpc) {
+		/* serve by default supports v2 */
+		packet_write_fmt(1, "version 2\n");
+
+		advertise_capabilities();
+		/*
+		 * If only the list of capabilities was requested exit
+		 * immediately after advertising capabilities
+		 */
+		if (options->advertise_capabilities)
+			return;
+	}
+
+	/*
+	 * If stateless-rpc was requested then exit after
+	 * a single request/response exchange
+	 */
+	if (options->stateless_rpc) {
+		process_request();
+	} else {
+		for (;;)
+			if (process_request())
+				break;
+	}
+}
diff --git a/serve.h b/serve.h
new file mode 100644
index 000000000..fe65ba9f4
--- /dev/null
+++ b/serve.h
@@ -0,0 +1,15 @@
+#ifndef SERVE_H
+#define SERVE_H
+
+struct argv_array;
+extern int has_capability(const struct argv_array *keys, const char *capability,
+			  const char **value);
+
+struct serve_options {
+	unsigned advertise_capabilities;
+	unsigned stateless_rpc;
+};
+#define SERVE_OPTIONS_INIT { 0 }
+extern void serve(struct serve_options *options);
+
+#endif /* SERVE_H */
diff --git a/t/t5701-git-serve.sh b/t/t5701-git-serve.sh
new file mode 100755
index 000000000..affbad097
--- /dev/null
+++ b/t/t5701-git-serve.sh
@@ -0,0 +1,60 @@
+#!/bin/sh
+
+test_description='test git-serve and server commands'
+
+. ./test-lib.sh
+
+test_expect_success 'test capability advertisement' '
+	cat >expect <<-EOF &&
+	version 2
+	agent=git/$(git version | cut -d" " -f3)
+	0000
+	EOF
+
+	git serve --advertise-capabilities >out &&
+	test-pkt-line unpack <out >actual &&
+	test_cmp actual expect
+'
+
+test_expect_success 'stateless-rpc flag does not list capabilities' '
+	# Empty request
+	test-pkt-line pack >in <<-EOF &&
+	0000
+	EOF
+	git serve --stateless-rpc >out <in &&
+	test_must_be_empty out &&
+
+	# EOF
+	git serve --stateless-rpc >out &&
+	test_must_be_empty out
+'
+
+test_expect_success 'request invalid capability' '
+	test-pkt-line pack >in <<-EOF &&
+	foobar
+	0000
+	EOF
+	test_must_fail git serve --stateless-rpc 2>err <in &&
+	test_i18ngrep "unknown capability" err
+'
+
+test_expect_success 'request with no command' '
+	test-pkt-line pack >in <<-EOF &&
+	agent=git/test
+	0000
+	EOF
+	test_must_fail git serve --stateless-rpc 2>err <in &&
+	test_i18ngrep "no command requested" err
+'
+
+test_expect_success 'request invalid command' '
+	test-pkt-line pack >in <<-EOF &&
+	command=foo
+	agent=git/test
+	0000
+	EOF
+	test_must_fail git serve --stateless-rpc 2>err <in &&
+	test_i18ngrep "invalid command" err
+'
+
+test_done
-- 
2.16.0.rc1.238.g530d649a79-goog


* [PATCH v3 13/35] ls-refs: introduce ls-refs server command
  2018-02-07  1:12   ` [PATCH v3 00/35] " Brandon Williams
                       ` (11 preceding siblings ...)
  2018-02-07  1:12     ` [PATCH v3 12/35] serve: introduce git-serve Brandon Williams
@ 2018-02-07  1:12     ` Brandon Williams
  2018-02-22  9:48       ` Jeff King
  2018-02-07  1:12     ` [PATCH v3 14/35] connect: request remote refs using v2 Brandon Williams
                       ` (24 subsequent siblings)
  37 siblings, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-02-07  1:12 UTC (permalink / raw)
  To: git
  Cc: sbeller, peff, gitster, jrnieder, stolee, git, pclouds, Brandon Williams

Introduce the ls-refs server command.  In protocol v2, the ls-refs
command is used to request the ref advertisement from the server.  Since
it is a command which can be requested (as opposed to mandatory in v1),
a client can send a number of parameters in its request to limit the ref
advertisement based on provided ref-patterns.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 Documentation/technical/protocol-v2.txt |  32 +++++++++
 Makefile                                |   1 +
 ls-refs.c                               |  96 ++++++++++++++++++++++++++
 ls-refs.h                               |   9 +++
 serve.c                                 |   8 +++
 t/t5701-git-serve.sh                    | 115 ++++++++++++++++++++++++++++++++
 6 files changed, 261 insertions(+)
 create mode 100644 ls-refs.c
 create mode 100644 ls-refs.h

diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
index f87372f9b..ef81df868 100644
--- a/Documentation/technical/protocol-v2.txt
+++ b/Documentation/technical/protocol-v2.txt
@@ -112,3 +112,35 @@ printable ASCII characters except space (i.e., the byte range 32 < x <
 "git/1.8.3.1"). The agent strings are purely informative for statistics
 and debugging purposes, and MUST NOT be used to programmatically assume
 the presence or absence of particular features.
+
+ ls-refs
+---------
+
+`ls-refs` is the command used to request a reference advertisement in v2.
+Unlike the current reference advertisement, ls-refs takes in parameters
+which can be used to limit the refs sent from the server.
+
+Additional features not supported in the base command will be advertised
+as the value of the command in the capability advertisement in the form
+of a space separated list of features, e.g.  "<command>=<feature 1>
+<feature 2>".
+
+ls-refs takes in the following parameters wrapped in packet-lines:
+
+    symrefs
+	When showing a symbolic ref, show the underlying ref it points to
+	in addition to the object it points to.
+    peel
+	Show peeled tags.
+    ref-pattern <pattern>
+	When specified, only references matching one of the provided
+	patterns are displayed.
+
+The output of ls-refs is as follows:
+
+    output = *ref
+	     flush-pkt
+    ref = PKT-LINE(obj-id SP refname *(SP ref-attribute) LF)
+    ref-attribute = (symref | peeled)
+    symref = "symref-target:" symref-target
+    peeled = "peeled:" obj-id
diff --git a/Makefile b/Makefile
index 18c255428..e50927cfb 100644
--- a/Makefile
+++ b/Makefile
@@ -825,6 +825,7 @@ LIB_OBJS += list-objects-filter-options.o
 LIB_OBJS += ll-merge.o
 LIB_OBJS += lockfile.o
 LIB_OBJS += log-tree.o
+LIB_OBJS += ls-refs.o
 LIB_OBJS += mailinfo.o
 LIB_OBJS += mailmap.o
 LIB_OBJS += match-trees.o
diff --git a/ls-refs.c b/ls-refs.c
new file mode 100644
index 000000000..70682b4f7
--- /dev/null
+++ b/ls-refs.c
@@ -0,0 +1,96 @@
+#include "cache.h"
+#include "repository.h"
+#include "refs.h"
+#include "remote.h"
+#include "argv-array.h"
+#include "ls-refs.h"
+#include "pkt-line.h"
+
+struct ls_refs_data {
+	unsigned peel;
+	unsigned symrefs;
+	struct argv_array patterns;
+};
+
+/*
+ * Check if one of the patterns matches the tail part of the ref.
+ * If no patterns were provided, all refs match.
+ */
+static int ref_match(const struct argv_array *patterns, const char *refname)
+{
+	char *pathbuf;
+	int i;
+
+	if (!patterns->argc)
+		return 1; /* no restriction */
+
+	pathbuf = xstrfmt("/%s", refname);
+	for (i = 0; i < patterns->argc; i++) {
+		if (!wildmatch(patterns->argv[i], pathbuf, 0)) {
+			free(pathbuf);
+			return 1;
+		}
+	}
+	free(pathbuf);
+	return 0;
+}
+
+static int send_ref(const char *refname, const struct object_id *oid,
+		    int flag, void *cb_data)
+{
+	struct ls_refs_data *data = cb_data;
+	const char *refname_nons = strip_namespace(refname);
+	struct strbuf refline = STRBUF_INIT;
+
+	if (!ref_match(&data->patterns, refname))
+		return 0;
+
+	strbuf_addf(&refline, "%s %s", oid_to_hex(oid), refname_nons);
+	if (data->symrefs && flag & REF_ISSYMREF) {
+		struct object_id unused;
+		const char *symref_target = resolve_ref_unsafe(refname, 0,
+							       &unused,
+							       &flag);
+
+		if (!symref_target)
+			die("'%s' is a symref but it is not?", refname);
+
+		strbuf_addf(&refline, " symref-target:%s", symref_target);
+	}
+
+	if (data->peel) {
+		struct object_id peeled;
+		if (!peel_ref(refname, &peeled))
+			strbuf_addf(&refline, " peeled:%s", oid_to_hex(&peeled));
+	}
+
+	strbuf_addch(&refline, '\n');
+	packet_write(1, refline.buf, refline.len);
+
+	strbuf_release(&refline);
+	return 0;
+}
+
+int ls_refs(struct repository *r, struct argv_array *keys, struct argv_array *args)
+{
+	int i;
+	struct ls_refs_data data = { 0, 0, ARGV_ARRAY_INIT };
+
+	for (i = 0; i < args->argc; i++) {
+		const char *arg = args->argv[i];
+		const char *out;
+
+		if (!strcmp("peel", arg))
+			data.peel = 1;
+		else if (!strcmp("symrefs", arg))
+			data.symrefs = 1;
+		else if (skip_prefix(arg, "ref-pattern ", &out))
+			argv_array_pushf(&data.patterns, "*/%s", out);
+	}
+
+	head_ref_namespaced(send_ref, &data);
+	for_each_namespaced_ref(send_ref, &data);
+	packet_flush(1);
+	argv_array_clear(&data.patterns);
+	return 0;
+}
diff --git a/ls-refs.h b/ls-refs.h
new file mode 100644
index 000000000..9e4c57bfe
--- /dev/null
+++ b/ls-refs.h
@@ -0,0 +1,9 @@
+#ifndef LS_REFS_H
+#define LS_REFS_H
+
+struct repository;
+struct argv_array;
+extern int ls_refs(struct repository *r, struct argv_array *keys,
+		   struct argv_array *args);
+
+#endif /* LS_REFS_H */
diff --git a/serve.c b/serve.c
index cf23179b9..c7925c5c7 100644
--- a/serve.c
+++ b/serve.c
@@ -4,8 +4,15 @@
 #include "pkt-line.h"
 #include "version.h"
 #include "argv-array.h"
+#include "ls-refs.h"
 #include "serve.h"
 
+static int always_advertise(struct repository *r,
+			    struct strbuf *value)
+{
+	return 1;
+}
+
 static int agent_advertise(struct repository *r,
 			   struct strbuf *value)
 {
@@ -44,6 +51,7 @@ struct protocol_capability {
 
 static struct protocol_capability capabilities[] = {
 	{ "agent", agent_advertise, NULL },
+	{ "ls-refs", always_advertise, ls_refs },
 };
 
 static void advertise_capabilities(void)
diff --git a/t/t5701-git-serve.sh b/t/t5701-git-serve.sh
index affbad097..33536254e 100755
--- a/t/t5701-git-serve.sh
+++ b/t/t5701-git-serve.sh
@@ -8,6 +8,7 @@ test_expect_success 'test capability advertisement' '
 	cat >expect <<-EOF &&
 	version 2
 	agent=git/$(git version | cut -d" " -f3)
+	ls-refs
 	0000
 	EOF
 
@@ -57,4 +58,118 @@ test_expect_success 'request invalid command' '
 	test_i18ngrep "invalid command" err
 '
 
+# Test the basics of ls-refs
+#
+test_expect_success 'setup some refs and tags' '
+	test_commit one &&
+	git branch dev master &&
+	test_commit two &&
+	git symbolic-ref refs/heads/release refs/heads/master &&
+	git tag -a -m "annotated tag" annotated-tag
+'
+
+test_expect_success 'basics of ls-refs' '
+	test-pkt-line pack >in <<-EOF &&
+	command=ls-refs
+	0000
+	EOF
+
+	cat >expect <<-EOF &&
+	$(git rev-parse HEAD) HEAD
+	$(git rev-parse refs/heads/dev) refs/heads/dev
+	$(git rev-parse refs/heads/master) refs/heads/master
+	$(git rev-parse refs/heads/release) refs/heads/release
+	$(git rev-parse refs/tags/annotated-tag) refs/tags/annotated-tag
+	$(git rev-parse refs/tags/one) refs/tags/one
+	$(git rev-parse refs/tags/two) refs/tags/two
+	0000
+	EOF
+
+	git serve --stateless-rpc <in >out &&
+	test-pkt-line unpack <out >actual &&
+	test_cmp actual expect
+'
+
+test_expect_success 'basic ref-patterns' '
+	test-pkt-line pack >in <<-EOF &&
+	command=ls-refs
+	0001
+	ref-pattern master
+	ref-pattern one
+	0000
+	EOF
+
+	cat >expect <<-EOF &&
+	$(git rev-parse refs/heads/master) refs/heads/master
+	$(git rev-parse refs/tags/one) refs/tags/one
+	0000
+	EOF
+
+	git serve --stateless-rpc <in >out &&
+	test-pkt-line unpack <out >actual &&
+	test_cmp actual expect
+'
+
+test_expect_success 'wildcard ref-patterns' '
+	test-pkt-line pack >in <<-EOF &&
+	command=ls-refs
+	0001
+	ref-pattern refs/heads/*
+	0000
+	EOF
+
+	cat >expect <<-EOF &&
+	$(git rev-parse refs/heads/dev) refs/heads/dev
+	$(git rev-parse refs/heads/master) refs/heads/master
+	$(git rev-parse refs/heads/release) refs/heads/release
+	0000
+	EOF
+
+	git serve --stateless-rpc <in >out &&
+	test-pkt-line unpack <out >actual &&
+	test_cmp actual expect
+'
+
+test_expect_success 'peel parameter' '
+	test-pkt-line pack >in <<-EOF &&
+	command=ls-refs
+	0001
+	peel
+	ref-pattern refs/tags/*
+	0000
+	EOF
+
+	cat >expect <<-EOF &&
+	$(git rev-parse refs/tags/annotated-tag) refs/tags/annotated-tag peeled:$(git rev-parse refs/tags/annotated-tag^{})
+	$(git rev-parse refs/tags/one) refs/tags/one
+	$(git rev-parse refs/tags/two) refs/tags/two
+	0000
+	EOF
+
+	git serve --stateless-rpc <in >out &&
+	test-pkt-line unpack <out >actual &&
+	test_cmp actual expect
+'
+
+test_expect_success 'symrefs parameter' '
+	test-pkt-line pack >in <<-EOF &&
+	command=ls-refs
+	0001
+	symrefs
+	ref-pattern refs/heads/*
+	0000
+	EOF
+
+	cat >expect <<-EOF &&
+	$(git rev-parse refs/heads/dev) refs/heads/dev
+	$(git rev-parse refs/heads/master) refs/heads/master
+	$(git rev-parse refs/heads/release) refs/heads/release symref-target:refs/heads/master
+	0000
+	EOF
+
+	git serve --stateless-rpc <in >out &&
+	test-pkt-line unpack <out >actual &&
+	test_cmp actual expect
+'
+
 test_done
-- 
2.16.0.rc1.238.g530d649a79-goog


* [PATCH v3 14/35] connect: request remote refs using v2
  2018-02-07  1:12   ` [PATCH v3 00/35] " Brandon Williams
                       ` (12 preceding siblings ...)
  2018-02-07  1:12     ` [PATCH v3 13/35] ls-refs: introduce ls-refs server command Brandon Williams
@ 2018-02-07  1:12     ` Brandon Williams
  2018-02-21 22:54       ` Jonathan Tan
  2018-02-27  6:51       ` Jonathan Nieder
  2018-02-07  1:12     ` [PATCH v3 15/35] transport: convert get_refs_list to take a list of ref patterns Brandon Williams
                       ` (23 subsequent siblings)
  37 siblings, 2 replies; 362+ messages in thread
From: Brandon Williams @ 2018-02-07  1:12 UTC (permalink / raw)
  To: git
  Cc: sbeller, peff, gitster, jrnieder, stolee, git, pclouds, Brandon Williams

Teach the client to be able to request a remote's refs using protocol
v2.  This is done by having a client issue a 'ls-refs' request to a v2
server.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
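As a reading aid for the response handling, here is a minimal standalone
sketch of parsing one ls-refs response line of the form
"obj-id SP refname *(SP ref-attribute)".  It is not the code in this
patch: struct ref_line, is_hex_oid() and parse_ref_line() are
hypothetical names, the buffers are fixed-size, and object ids are
assumed to be 40 lowercase hex digits.  Unknown attributes are skipped.

```c
#include <stdio.h>
#include <string.h>

struct ref_line {
	char oid[41];		/* hex object id, assuming 40-char SHA-1 */
	char name[256];
	char symref[256];	/* empty if no symref-target attribute */
	char peeled[41];	/* empty if no peeled attribute */
};

static int is_hex_oid(const char *s)
{
	return strlen(s) == 40 && strspn(s, "0123456789abcdef") == 40;
}

/* Returns 0 on success, -1 if the line is malformed or too long. */
static int parse_ref_line(const char *line, struct ref_line *out)
{
	char buf[1024];
	char *field, *next;
	int idx = 0;

	if (strlen(line) >= sizeof(buf))
		return -1;
	strcpy(buf, line);
	memset(out, 0, sizeof(*out));

	/* split on single spaces; field 0 is the oid, field 1 the name */
	for (field = buf; field; field = next, idx++) {
		next = strchr(field, ' ');
		if (next)
			*next++ = '\0';

		if (idx == 0) {
			if (!is_hex_oid(field))
				return -1;
			strcpy(out->oid, field);
		} else if (idx == 1) {
			if (!*field || strlen(field) >= sizeof(out->name))
				return -1;
			strcpy(out->name, field);
		} else if (!strncmp(field, "symref-target:", 14)) {
			if (strlen(field + 14) >= sizeof(out->symref))
				return -1;
			strcpy(out->symref, field + 14);
		} else if (!strncmp(field, "peeled:", 7)) {
			if (!is_hex_oid(field + 7))
				return -1;
			strcpy(out->peeled, field + 7);
		}
	}

	return idx >= 2 ? 0 : -1;
}
```

A caller would feed it each pkt-line payload (with the trailing newline
already removed) until a flush-pkt is seen, and treat a -1 return as a
protocol error.
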
 builtin/upload-pack.c  |  10 ++--
 connect.c              | 123 ++++++++++++++++++++++++++++++++++++++++++++++++-
 connect.h              |   2 +
 remote.h               |   4 ++
 t/t5702-protocol-v2.sh |  53 +++++++++++++++++++++
 transport.c            |   2 +-
 6 files changed, 187 insertions(+), 7 deletions(-)
 create mode 100755 t/t5702-protocol-v2.sh

diff --git a/builtin/upload-pack.c b/builtin/upload-pack.c
index 8d53e9794..a757df8da 100644
--- a/builtin/upload-pack.c
+++ b/builtin/upload-pack.c
@@ -5,6 +5,7 @@
 #include "parse-options.h"
 #include "protocol.h"
 #include "upload-pack.h"
+#include "serve.h"
 
 static const char * const upload_pack_usage[] = {
 	N_("git upload-pack [<options>] <dir>"),
@@ -16,6 +17,7 @@ int cmd_upload_pack(int argc, const char **argv, const char *prefix)
 	const char *dir;
 	int strict = 0;
 	struct upload_pack_options opts = { 0 };
+	struct serve_options serve_opts = SERVE_OPTIONS_INIT;
 	struct option options[] = {
 		OPT_BOOL(0, "stateless-rpc", &opts.stateless_rpc,
 			 N_("quit after a single request/response exchange")),
@@ -48,11 +50,9 @@ int cmd_upload_pack(int argc, const char **argv, const char *prefix)
 
 	switch (determine_protocol_version_server()) {
 	case protocol_v2:
-		/*
-		 * fetch support for protocol v2 has not been implemented yet,
-		 * so ignore the request to use v2 and fallback to using v0.
-		 */
-		upload_pack(&opts);
+		serve_opts.advertise_capabilities = opts.advertise_refs;
+		serve_opts.stateless_rpc = opts.stateless_rpc;
+		serve(&serve_opts);
 		break;
 	case protocol_v1:
 		/*
diff --git a/connect.c b/connect.c
index f2157a821..7cb1f1df7 100644
--- a/connect.c
+++ b/connect.c
@@ -12,9 +12,11 @@
 #include "sha1-array.h"
 #include "transport.h"
 #include "strbuf.h"
+#include "version.h"
 #include "protocol.h"
 
 static char *server_capabilities;
+static struct argv_array server_capabilities_v2 = ARGV_ARRAY_INIT;
 static const char *parse_feature_value(const char *, const char *, int *);
 
 static int check_ref(const char *name, unsigned int flags)
@@ -62,6 +64,33 @@ static void die_initial_contact(int unexpected)
 		      "and the repository exists."));
 }
 
+/* Checks if the server supports the capability 'c' */
+int server_supports_v2(const char *c, int die_on_error)
+{
+	int i;
+
+	for (i = 0; i < server_capabilities_v2.argc; i++) {
+		const char *out;
+		if (skip_prefix(server_capabilities_v2.argv[i], c, &out) &&
+		    (!*out || *out == '='))
+			return 1;
+	}
+
+	if (die_on_error)
+		die("server doesn't support '%s'", c);
+
+	return 0;
+}
+
+static void process_capabilities_v2(struct packet_reader *reader)
+{
+	while (packet_reader_read(reader) == PACKET_READ_NORMAL)
+		argv_array_push(&server_capabilities_v2, reader->line);
+
+	if (reader->status != PACKET_READ_FLUSH)
+		die("protocol error");
+}
+
 enum protocol_version discover_version(struct packet_reader *reader)
 {
 	enum protocol_version version = protocol_unknown_version;
@@ -85,7 +114,7 @@ enum protocol_version discover_version(struct packet_reader *reader)
 	/* Maybe process capabilities here, at least for v2 */
 	switch (version) {
 	case protocol_v2:
-		die("support for protocol v2 not implemented yet");
+		process_capabilities_v2(reader);
 		break;
 	case protocol_v1:
 		/* Read the peeked version line */
@@ -293,6 +322,98 @@ struct ref **get_remote_heads(struct packet_reader *reader,
 	return list;
 }
 
+static int process_ref_v2(const char *line, struct ref ***list)
+{
+	int ret = 1;
+	int i = 0;
+	struct object_id old_oid;
+	struct ref *ref;
+	struct string_list line_sections = STRING_LIST_INIT_DUP;
+
+	if (string_list_split(&line_sections, line, ' ', -1) < 2) {
+		ret = 0;
+		goto out;
+	}
+
+	if (get_oid_hex(line_sections.items[i++].string, &old_oid)) {
+		ret = 0;
+		goto out;
+	}
+
+	ref = alloc_ref(line_sections.items[i++].string);
+
+	oidcpy(&ref->old_oid, &old_oid);
+	**list = ref;
+	*list = &ref->next;
+
+	for (; i < line_sections.nr; i++) {
+		const char *arg = line_sections.items[i].string;
+		if (skip_prefix(arg, "symref-target:", &arg))
+			ref->symref = xstrdup(arg);
+
+		if (skip_prefix(arg, "peeled:", &arg)) {
+			struct object_id peeled_oid;
+			char *peeled_name;
+			struct ref *peeled;
+			if (get_oid_hex(arg, &peeled_oid)) {
+				ret = 0;
+				goto out;
+			}
+
+			peeled_name = xstrfmt("%s^{}", ref->name);
+			peeled = alloc_ref(peeled_name);
+
+			oidcpy(&peeled->old_oid, &peeled_oid);
+			**list = peeled;
+			*list = &peeled->next;
+
+			free(peeled_name);
+		}
+	}
+
+out:
+	string_list_clear(&line_sections, 0);
+	return ret;
+}
+
+struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
+			     struct ref **list, int for_push,
+			     const struct argv_array *ref_patterns)
+{
+	int i;
+	*list = NULL;
+
+	/* Check that the server supports the ls-refs command */
+	/* Issue request for ls-refs */
+	if (server_supports_v2("ls-refs", 1))
+		packet_write_fmt(fd_out, "command=ls-refs\n");
+
+	if (server_supports_v2("agent", 0))
+		packet_write_fmt(fd_out, "agent=%s", git_user_agent_sanitized());
+
+	packet_delim(fd_out);
+	/* When pushing we don't want to request the peeled tags */
+	if (!for_push)
+		packet_write_fmt(fd_out, "peel\n");
+	packet_write_fmt(fd_out, "symrefs\n");
+	for (i = 0; ref_patterns && i < ref_patterns->argc; i++) {
+		packet_write_fmt(fd_out, "ref-pattern %s\n",
+				 ref_patterns->argv[i]);
+	}
+	packet_flush(fd_out);
+
+	/* Process response from server */
+	while (packet_reader_read(reader) == PACKET_READ_NORMAL) {
+		if (!process_ref_v2(reader->line, &list))
+			die("invalid ls-refs response: %s", reader->line);
+	}
+
+	if (reader->status != PACKET_READ_FLUSH)
+		die("protocol error");
+
+	return list;
+}
+
 static const char *parse_feature_value(const char *feature_list, const char *feature, int *lenp)
 {
 	int len;
diff --git a/connect.h b/connect.h
index cdb8979dc..8898d4495 100644
--- a/connect.h
+++ b/connect.h
@@ -16,4 +16,6 @@ extern int url_is_local_not_ssh(const char *url);
 struct packet_reader;
 extern enum protocol_version discover_version(struct packet_reader *reader);
 
+extern int server_supports_v2(const char *c, int die_on_error);
+
 #endif
diff --git a/remote.h b/remote.h
index 2016461df..21d0c776c 100644
--- a/remote.h
+++ b/remote.h
@@ -151,10 +151,14 @@ void free_refs(struct ref *ref);
 
 struct oid_array;
 struct packet_reader;
+struct argv_array;
 extern struct ref **get_remote_heads(struct packet_reader *reader,
 				     struct ref **list, unsigned int flags,
 				     struct oid_array *extra_have,
 				     struct oid_array *shallow_points);
+extern struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
+				    struct ref **list, int for_push,
+				    const struct argv_array *ref_patterns);
 
 int resolve_remote_symref(struct ref *ref, struct ref *list);
 int ref_newer(const struct object_id *new_oid, const struct object_id *old_oid);
diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh
new file mode 100755
index 000000000..1e42b5588
--- /dev/null
+++ b/t/t5702-protocol-v2.sh
@@ -0,0 +1,53 @@
+#!/bin/sh
+
+test_description='test git wire-protocol version 2'
+
+TEST_NO_CREATE_REPO=1
+
+. ./test-lib.sh
+
+# Test protocol v2 with 'git://' transport
+#
+. "$TEST_DIRECTORY"/lib-git-daemon.sh
+start_git_daemon --export-all --enable=receive-pack
+daemon_parent=$GIT_DAEMON_DOCUMENT_ROOT_PATH/parent
+
+test_expect_success 'create repo to be served by git-daemon' '
+	git init "$daemon_parent" &&
+	test_commit -C "$daemon_parent" one
+'
+
+test_expect_success 'list refs with git:// using protocol v2' '
+	GIT_TRACE_PACKET=1 git -c protocol.version=2 \
+		ls-remote --symref "$GIT_DAEMON_URL/parent" >actual 2>log &&
+
+	# Client requested to use protocol v2
+	grep "git> .*\\\0\\\0version=2\\\0$" log &&
+	# Server responded using protocol v2
+	grep "git< version 2" log &&
+
+	git ls-remote --symref "$GIT_DAEMON_URL/parent" >expect &&
+	test_cmp actual expect
+'
+
+stop_git_daemon
+
+# Test protocol v2 with 'file://' transport
+#
+test_expect_success 'create repo to be served by file:// transport' '
+	git init file_parent &&
+	test_commit -C file_parent one
+'
+
+test_expect_success 'list refs with file:// using protocol v2' '
+	GIT_TRACE_PACKET=1 git -c protocol.version=2 \
+		ls-remote --symref "file://$(pwd)/file_parent" >actual 2>log &&
+
+	# Server responded using protocol v2
+	grep "git< version 2" log &&
+
+	git ls-remote --symref "file://$(pwd)/file_parent" >expect &&
+	test_cmp actual expect
+'
+
+test_done
diff --git a/transport.c b/transport.c
index 83d9dd1df..ffc6b2614 100644
--- a/transport.c
+++ b/transport.c
@@ -204,7 +204,7 @@ static struct ref *get_refs_via_connect(struct transport *transport, int for_pus
 	data->version = discover_version(&reader);
 	switch (data->version) {
 	case protocol_v2:
-		die("support for protocol v2 not implemented yet");
+		get_remote_refs(data->fd[1], &reader, &refs, for_push, NULL);
 		break;
 	case protocol_v1:
 	case protocol_v0:
-- 
2.16.0.rc1.238.g530d649a79-goog
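
For readers following along, the line format that process_ref_v2() consumes
("<oid> <refname>" followed by optional "symref-target:" and "peeled:"
attributes) can be sketched outside of git as below.  This is only an
illustration under stated assumptions: 'struct ref_line',
'parse_ls_refs_line()' and the fixed-size buffers are hypothetical names
invented for the example, object ids are assumed to be 40 hex digits, and
the trailing newline is assumed to have been chomped already.  Unlike the
patch, it does not emit a separate "^{}" ref for peeled tags.

```c
#include <ctype.h>
#include <string.h>

#define OID_HEXSZ 40	/* assumption: SHA-1 sized object ids */

/* Hypothetical holder for one parsed ls-refs line (not a git type). */
struct ref_line {
	char oid[OID_HEXSZ + 1];
	char name[256];
	char symref[256];		/* empty when no symref-target */
	char peeled[OID_HEXSZ + 1];	/* empty when no peeled attribute */
};

static int is_hex_oid(const char *s, size_t len)
{
	size_t i;

	if (len != OID_HEXSZ)
		return 0;
	for (i = 0; i < len; i++)
		if (!isxdigit((unsigned char)s[i]))
			return 0;
	return 1;
}

/* Returns 1 on success, 0 if the line is malformed. */
static int parse_ls_refs_line(const char *line, struct ref_line *out)
{
	const char *p = line;
	int field = 0;

	memset(out, 0, sizeof(*out));	/* also NUL-terminates every buffer */
	while (*p) {
		size_t len = strcspn(p, " ");

		if (field == 0) {
			if (!is_hex_oid(p, len))
				return 0;
			memcpy(out->oid, p, len);
		} else if (field == 1) {
			if (!len || len >= sizeof(out->name))
				return 0;
			memcpy(out->name, p, len);
		} else if (len > 14 && !strncmp(p, "symref-target:", 14)) {
			if (len - 14 >= sizeof(out->symref))
				return 0;
			memcpy(out->symref, p + 14, len - 14);
		} else if (len > 7 && !strncmp(p, "peeled:", 7)) {
			if (!is_hex_oid(p + 7, len - 7))
				return 0;
			memcpy(out->peeled, p + 7, len - 7);
		}
		/* unknown attributes are skipped */
		field++;
		p += len;
		if (*p == ' ')
			p++;
	}
	return field >= 2;	/* need at least an oid and a refname */
}
```

The attributes are handled in an else-if chain so that a symref target is
never re-examined as a possible "peeled:" attribute.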



* [PATCH v3 15/35] transport: convert get_refs_list to take a list of ref patterns
  2018-02-07  1:12   ` [PATCH v3 00/35] " Brandon Williams
                       ` (13 preceding siblings ...)
  2018-02-07  1:12     ` [PATCH v3 14/35] connect: request remote refs using v2 Brandon Williams
@ 2018-02-07  1:12     ` Brandon Williams
  2018-02-21 22:56       ` Jonathan Tan
  2018-02-07  1:12     ` [PATCH v3 16/35] transport: convert transport_get_remote_refs " Brandon Williams
                       ` (22 subsequent siblings)
  37 siblings, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-02-07  1:12 UTC (permalink / raw)
  To: git
  Cc: sbeller, peff, gitster, jrnieder, stolee, git, pclouds, Brandon Williams

Convert the 'struct transport' virtual function 'get_refs_list()' to
optionally take an argv_array of ref patterns.  When communicating with
a server using protocol v2, these ref patterns can be sent along with
the request for a listing of its refs, allowing the server to filter
the refs it sends based on the patterns.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
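
To make the commit message above concrete, here is a rough sketch of what
an ls-refs request carrying ref patterns looks like on the wire.  It
assumes the usual pkt-line framing (four hex digits giving the length of
the line including those four bytes, "0000" for a flush packet and "0001"
for a delimiter packet).  The helper names 'append_pkt()', 'append_raw()'
and 'build_ls_refs_request()' are made up for the example and are not part
of git; the optional "agent=" line is left out for brevity.

```c
#include <stdio.h>
#include <string.h>

/* Append one length-prefixed pkt-line; returns 1 on success, 0 on overflow. */
static int append_pkt(char *buf, size_t cap, size_t *off, const char *payload)
{
	size_t len = strlen(payload) + 4;	/* length covers the header too */

	if (len > 0xffff || *off + len + 1 > cap)
		return 0;
	snprintf(buf + *off, cap - *off, "%04zx%s", len, payload);
	*off += len;
	return 1;
}

/* Append a special packet such as "0000" (flush) or "0001" (delim). */
static int append_raw(char *buf, size_t cap, size_t *off, const char *marker)
{
	size_t len = strlen(marker);

	if (*off + len + 1 > cap)
		return 0;
	memcpy(buf + *off, marker, len + 1);
	*off += len;
	return 1;
}

/* Returns the request length, or 0 if it does not fit in buf. */
static size_t build_ls_refs_request(char *buf, size_t cap, int for_push,
				    const char *const *patterns, int nr)
{
	char line[512];
	size_t off = 0;
	int i;

	if (!append_pkt(buf, cap, &off, "command=ls-refs\n") ||
	    !append_raw(buf, cap, &off, "0001"))
		return 0;
	/* peeled tags are not requested when pushing */
	if (!for_push && !append_pkt(buf, cap, &off, "peel\n"))
		return 0;
	if (!append_pkt(buf, cap, &off, "symrefs\n"))
		return 0;
	/* a NULL pattern list means "no filtering" */
	for (i = 0; patterns && i < nr; i++) {
		int n = snprintf(line, sizeof(line), "ref-pattern %s\n",
				 patterns[i]);
		if (n < 0 || (size_t)n >= sizeof(line))
			return 0;
		if (!append_pkt(buf, cap, &off, line))
			return 0;
	}
	if (!append_raw(buf, cap, &off, "0000"))
		return 0;
	return off;
}
```

With a single pattern "master" and for_push == 0 the sketch yields
"0014command=ls-refs\n", "0001", "0009peel\n", "000csymrefs\n",
"0017ref-pattern master\n" and "0000", in that order.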
 transport-helper.c   |  5 +++--
 transport-internal.h |  4 +++-
 transport.c          | 16 +++++++++-------
 3 files changed, 15 insertions(+), 10 deletions(-)

diff --git a/transport-helper.c b/transport-helper.c
index 508015023..4c334b5ee 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -1026,7 +1026,8 @@ static int has_attribute(const char *attrs, const char *attr) {
 	}
 }
 
-static struct ref *get_refs_list(struct transport *transport, int for_push)
+static struct ref *get_refs_list(struct transport *transport, int for_push,
+				 const struct argv_array *ref_patterns)
 {
 	struct helper_data *data = transport->data;
 	struct child_process *helper;
@@ -1039,7 +1040,7 @@ static struct ref *get_refs_list(struct transport *transport, int for_push)
 
 	if (process_connect(transport, for_push)) {
 		do_take_over(transport);
-		return transport->vtable->get_refs_list(transport, for_push);
+		return transport->vtable->get_refs_list(transport, for_push, ref_patterns);
 	}
 
 	if (data->push && for_push)
diff --git a/transport-internal.h b/transport-internal.h
index 3c1a29d72..a67657ce3 100644
--- a/transport-internal.h
+++ b/transport-internal.h
@@ -3,6 +3,7 @@
 
 struct ref;
 struct transport;
+struct argv_array;
 
 struct transport_vtable {
 	/**
@@ -21,7 +22,8 @@ struct transport_vtable {
 	 * the ref without a huge amount of effort, it should store it
 	 * in the ref's old_sha1 field; otherwise it should be all 0.
 	 **/
-	struct ref *(*get_refs_list)(struct transport *transport, int for_push);
+	struct ref *(*get_refs_list)(struct transport *transport, int for_push,
+				     const struct argv_array *ref_patterns);
 
 	/**
 	 * Fetch the objects for the given refs. Note that this gets
diff --git a/transport.c b/transport.c
index ffc6b2614..c54a44630 100644
--- a/transport.c
+++ b/transport.c
@@ -72,7 +72,7 @@ struct bundle_transport_data {
 	struct bundle_header header;
 };
 
-static struct ref *get_refs_from_bundle(struct transport *transport, int for_push)
+static struct ref *get_refs_from_bundle(struct transport *transport, int for_push, const struct argv_array *ref_patterns)
 {
 	struct bundle_transport_data *data = transport->data;
 	struct ref *result = NULL;
@@ -189,7 +189,8 @@ static int connect_setup(struct transport *transport, int for_push)
 	return 0;
 }
 
-static struct ref *get_refs_via_connect(struct transport *transport, int for_push)
+static struct ref *get_refs_via_connect(struct transport *transport, int for_push,
+					const struct argv_array *ref_patterns)
 {
 	struct git_transport_data *data = transport->data;
 	struct ref *refs = NULL;
@@ -204,7 +205,8 @@ static struct ref *get_refs_via_connect(struct transport *transport, int for_pus
 	data->version = discover_version(&reader);
 	switch (data->version) {
 	case protocol_v2:
-		get_remote_refs(data->fd[1], &reader, &refs, for_push, NULL);
+		get_remote_refs(data->fd[1], &reader, &refs, for_push,
+				ref_patterns);
 		break;
 	case protocol_v1:
 	case protocol_v0:
@@ -250,7 +252,7 @@ static int fetch_refs_via_pack(struct transport *transport,
 	args.update_shallow = data->options.update_shallow;
 
 	if (!data->got_remote_heads)
-		refs_tmp = get_refs_via_connect(transport, 0);
+		refs_tmp = get_refs_via_connect(transport, 0, NULL);
 
 	switch (data->version) {
 	case protocol_v2:
@@ -568,7 +570,7 @@ static int git_transport_push(struct transport *transport, struct ref *remote_re
 	int ret = 0;
 
 	if (!data->got_remote_heads)
-		get_refs_via_connect(transport, 1);
+		get_refs_via_connect(transport, 1, NULL);
 
 	memset(&args, 0, sizeof(args));
 	args.send_mirror = !!(flags & TRANSPORT_PUSH_MIRROR);
@@ -1028,7 +1030,7 @@ int transport_push(struct transport *transport,
 		if (check_push_refs(local_refs, refspec_nr, refspec) < 0)
 			return -1;
 
-		remote_refs = transport->vtable->get_refs_list(transport, 1);
+		remote_refs = transport->vtable->get_refs_list(transport, 1, NULL);
 
 		if (flags & TRANSPORT_PUSH_ALL)
 			match_flags |= MATCH_REFS_ALL;
@@ -1137,7 +1139,7 @@ int transport_push(struct transport *transport,
 const struct ref *transport_get_remote_refs(struct transport *transport)
 {
 	if (!transport->got_remote_refs) {
-		transport->remote_refs = transport->vtable->get_refs_list(transport, 0);
+		transport->remote_refs = transport->vtable->get_refs_list(transport, 0, NULL);
 		transport->got_remote_refs = 1;
 	}
 
-- 
2.16.0.rc1.238.g530d649a79-goog



* [PATCH v3 16/35] transport: convert transport_get_remote_refs to take a list of ref patterns
  2018-02-07  1:12   ` [PATCH v3 00/35] " Brandon Williams
                       ` (14 preceding siblings ...)
  2018-02-07  1:12     ` [PATCH v3 15/35] transport: convert get_refs_list to take a list of ref patterns Brandon Williams
@ 2018-02-07  1:12     ` Brandon Williams
  2018-02-21 22:58       ` Jonathan Tan
  2018-02-07  1:12     ` [PATCH v3 17/35] ls-remote: pass ref patterns when requesting a remote's refs Brandon Williams
                       ` (21 subsequent siblings)
  37 siblings, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-02-07  1:12 UTC (permalink / raw)
  To: git
  Cc: sbeller, peff, gitster, jrnieder, stolee, git, pclouds, Brandon Williams

Convert 'transport_get_remote_refs()' to optionally take a list of ref
patterns.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
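
One consequence of the change below that may be worth keeping in mind: the
result is cached behind 'got_remote_refs', so only the patterns passed on
the first call appear to have any effect.  The toy model below tries to
show that behaviour.  'struct fake_transport' and both functions are
stand-ins invented for this note (a string takes the place of the ref
list); they are not git's real types.

```c
#include <stddef.h>

/* Stand-in for struct transport; only the caching fields are modelled. */
struct fake_transport {
	int got_remote_refs;
	const char *remote_refs;	/* a string stands in for the ref list */
	int fetch_count;		/* how often the "server" was asked */
};

/* Stand-in for vtable->get_refs_list(): echoes the filter it was given. */
static const char *fake_get_refs_list(struct fake_transport *t,
				      const char *pattern)
{
	t->fetch_count++;
	return pattern ? pattern : "(all refs)";
}

/* Same shape as transport_get_remote_refs() in the hunk below. */
static const char *fake_get_remote_refs(struct fake_transport *t,
					const char *pattern)
{
	if (!t->got_remote_refs) {
		t->remote_refs = fake_get_refs_list(t, pattern);
		t->got_remote_refs = 1;
	}
	return t->remote_refs;
}
```

Calling fake_get_remote_refs() first with "refs/heads/*" and then with
"refs/tags/*" returns the "refs/heads/*" result both times, and the
"server" is only asked once.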
 builtin/clone.c     | 2 +-
 builtin/fetch.c     | 4 ++--
 builtin/ls-remote.c | 2 +-
 builtin/remote.c    | 2 +-
 transport.c         | 7 +++++--
 transport.h         | 3 ++-
 6 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/builtin/clone.c b/builtin/clone.c
index 284651797..6e77d993f 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -1121,7 +1121,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	if (transport->smart_options && !deepen)
 		transport->smart_options->check_self_contained_and_connected = 1;
 
-	refs = transport_get_remote_refs(transport);
+	refs = transport_get_remote_refs(transport, NULL);
 
 	if (refs) {
 		mapped_refs = wanted_peer_refs(refs, refspec);
diff --git a/builtin/fetch.c b/builtin/fetch.c
index 7bbcd26fa..850382f55 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -250,7 +250,7 @@ static void find_non_local_tags(struct transport *transport,
 	struct string_list_item *item = NULL;
 
 	for_each_ref(add_existing, &existing_refs);
-	for (ref = transport_get_remote_refs(transport); ref; ref = ref->next) {
+	for (ref = transport_get_remote_refs(transport, NULL); ref; ref = ref->next) {
 		if (!starts_with(ref->name, "refs/tags/"))
 			continue;
 
@@ -336,7 +336,7 @@ static struct ref *get_ref_map(struct transport *transport,
 	/* opportunistically-updated references: */
 	struct ref *orefs = NULL, **oref_tail = &orefs;
 
-	const struct ref *remote_refs = transport_get_remote_refs(transport);
+	const struct ref *remote_refs = transport_get_remote_refs(transport, NULL);
 
 	if (refspec_count) {
 		struct refspec *fetch_refspec;
diff --git a/builtin/ls-remote.c b/builtin/ls-remote.c
index c4be98ab9..c6e9847c5 100644
--- a/builtin/ls-remote.c
+++ b/builtin/ls-remote.c
@@ -96,7 +96,7 @@ int cmd_ls_remote(int argc, const char **argv, const char *prefix)
 	if (uploadpack != NULL)
 		transport_set_option(transport, TRANS_OPT_UPLOADPACK, uploadpack);
 
-	ref = transport_get_remote_refs(transport);
+	ref = transport_get_remote_refs(transport, NULL);
 	if (transport_disconnect(transport))
 		return 1;
 
diff --git a/builtin/remote.c b/builtin/remote.c
index d95bf904c..d0b6ff6e2 100644
--- a/builtin/remote.c
+++ b/builtin/remote.c
@@ -862,7 +862,7 @@ static int get_remote_ref_states(const char *name,
 	if (query) {
 		transport = transport_get(states->remote, states->remote->url_nr > 0 ?
 			states->remote->url[0] : NULL);
-		remote_refs = transport_get_remote_refs(transport);
+		remote_refs = transport_get_remote_refs(transport, NULL);
 		transport_disconnect(transport);
 
 		states->queried = 1;
diff --git a/transport.c b/transport.c
index c54a44630..dfc603b36 100644
--- a/transport.c
+++ b/transport.c
@@ -1136,10 +1136,13 @@ int transport_push(struct transport *transport,
 	return 1;
 }
 
-const struct ref *transport_get_remote_refs(struct transport *transport)
+const struct ref *transport_get_remote_refs(struct transport *transport,
+					    const struct argv_array *ref_patterns)
 {
 	if (!transport->got_remote_refs) {
-		transport->remote_refs = transport->vtable->get_refs_list(transport, 0, NULL);
+		transport->remote_refs =
+			transport->vtable->get_refs_list(transport, 0,
+							 ref_patterns);
 		transport->got_remote_refs = 1;
 	}
 
diff --git a/transport.h b/transport.h
index 731c78b67..4b656f315 100644
--- a/transport.h
+++ b/transport.h
@@ -178,7 +178,8 @@ int transport_push(struct transport *connection,
 		   int refspec_nr, const char **refspec, int flags,
 		   unsigned int * reject_reasons);
 
-const struct ref *transport_get_remote_refs(struct transport *transport);
+const struct ref *transport_get_remote_refs(struct transport *transport,
+					    const struct argv_array *ref_patterns);
 
 int transport_fetch_refs(struct transport *transport, struct ref *refs);
 void transport_unlock_pack(struct transport *transport);
-- 
2.16.0.rc1.238.g530d649a79-goog



* [PATCH v3 17/35] ls-remote: pass ref patterns when requesting a remote's refs
  2018-02-07  1:12   ` [PATCH v3 00/35] " Brandon Williams
                       ` (15 preceding siblings ...)
  2018-02-07  1:12     ` [PATCH v3 16/35] transport: convert transport_get_remote_refs " Brandon Williams
@ 2018-02-07  1:12     ` Brandon Williams
  2018-02-07  1:12     ` [PATCH v3 18/35] fetch: pass ref patterns when fetching Brandon Williams
                       ` (20 subsequent siblings)
  37 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-02-07  1:12 UTC (permalink / raw)
  To: git
  Cc: sbeller, peff, gitster, jrnieder, stolee, git, pclouds, Brandon Williams

Construct an argv_array of the ref patterns supplied via the command
line and pass them to 'transport_get_remote_refs()' to be used when
communicating using protocol v2 so that the server can limit the ref
advertisement based on the supplied patterns.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 builtin/ls-remote.c    |  7 +++++--
 t/t5702-protocol-v2.sh | 16 ++++++++++++++++
 2 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/builtin/ls-remote.c b/builtin/ls-remote.c
index c6e9847c5..caf1051f3 100644
--- a/builtin/ls-remote.c
+++ b/builtin/ls-remote.c
@@ -43,6 +43,7 @@ int cmd_ls_remote(int argc, const char **argv, const char *prefix)
 	int show_symref_target = 0;
 	const char *uploadpack = NULL;
 	const char **pattern = NULL;
+	struct argv_array ref_patterns = ARGV_ARRAY_INIT;
 
 	struct remote *remote;
 	struct transport *transport;
@@ -74,8 +75,10 @@ int cmd_ls_remote(int argc, const char **argv, const char *prefix)
 	if (argc > 1) {
 		int i;
 		pattern = xcalloc(argc, sizeof(const char *));
-		for (i = 1; i < argc; i++)
+		for (i = 1; i < argc; i++) {
 			pattern[i - 1] = xstrfmt("*/%s", argv[i]);
+			argv_array_push(&ref_patterns, argv[i]);
+		}
 	}
 
 	remote = remote_get(dest);
@@ -96,7 +99,7 @@ int cmd_ls_remote(int argc, const char **argv, const char *prefix)
 	if (uploadpack != NULL)
 		transport_set_option(transport, TRANS_OPT_UPLOADPACK, uploadpack);
 
-	ref = transport_get_remote_refs(transport, NULL);
+	ref = transport_get_remote_refs(transport, &ref_patterns);
 	if (transport_disconnect(transport))
 		return 1;
 
diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh
index 1e42b5588..a33ff6597 100755
--- a/t/t5702-protocol-v2.sh
+++ b/t/t5702-protocol-v2.sh
@@ -30,6 +30,14 @@ test_expect_success 'list refs with git:// using protocol v2' '
 	test_cmp actual expect
 '
 
+test_expect_success 'ref advertisement is filtered with ls-remote using protocol v2' '
+	GIT_TRACE_PACKET=1 git -c protocol.version=2 \
+		ls-remote "$GIT_DAEMON_URL/parent" master 2>log &&
+
+	grep "ref-pattern master" log &&
+	! grep "refs/tags/" log
+'
+
 stop_git_daemon
 
 # Test protocol v2 with 'file://' transport
@@ -50,4 +58,12 @@ test_expect_success 'list refs with file:// using protocol v2' '
 	test_cmp actual expect
 '
 
+test_expect_success 'ref advertisement is filtered with ls-remote using protocol v2' '
+	GIT_TRACE_PACKET=1 git -c protocol.version=2 \
+		ls-remote "file://$(pwd)/file_parent" master 2>log &&
+
+	grep "ref-pattern master" log &&
+	! grep "refs/tags/" log
+'
+
 test_done
-- 
2.16.0.rc1.238.g530d649a79-goog



* [PATCH v3 18/35] fetch: pass ref patterns when fetching
  2018-02-07  1:12   ` [PATCH v3 00/35] " Brandon Williams
                       ` (16 preceding siblings ...)
  2018-02-07  1:12     ` [PATCH v3 17/35] ls-remote: pass ref patterns when requesting a remote's refs Brandon Williams
@ 2018-02-07  1:12     ` Brandon Williams
  2018-02-27  6:53       ` Jonathan Nieder
  2018-02-07  1:12     ` [PATCH v3 19/35] push: pass ref patterns when pushing Brandon Williams
                       ` (19 subsequent siblings)
  37 siblings, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-02-07  1:12 UTC (permalink / raw)
  To: git
  Cc: sbeller, peff, gitster, jrnieder, stolee, git, pclouds, Brandon Williams

Construct a list of ref patterns to be passed to
'transport_get_remote_refs()' from the refspec to be used during the
fetch.  This list of ref patterns will be used to allow the server to
filter the ref advertisement when communicating using protocol v2.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
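
As an illustration of the mapping described above, the sketch below turns
a textual fetch refspec into the pattern that would be sent, and skips
refspecs whose source is an exact object id.  It works on the refspec
string directly rather than on git's parsed 'struct refspec', and it
assumes that "exact_sha1" means "the source is exactly 40 hex digits";
'fetch_refspec_to_pattern()' is a hypothetical helper, not a git function.

```c
#include <ctype.h>
#include <string.h>

/*
 * Copies the source side of a fetch refspec into out.  Returns 1 if it
 * should be sent as a ref pattern, 0 if not (exact object id, empty
 * source, or source too long for out).
 */
static int fetch_refspec_to_pattern(const char *refspec, char *out, size_t cap)
{
	const char *src = refspec;
	size_t len, i;
	int all_hex = 1;

	if (*src == '+')		/* the force marker is not part of the name */
		src++;
	len = strcspn(src, ":");	/* source ends at ':' or end of string */
	if (!len || len >= cap)
		return 0;
	for (i = 0; i < len; i++)
		if (!isxdigit((unsigned char)src[i]))
			all_hex = 0;
	if (all_hex && len == 40)	/* assumption: this is an exact SHA-1 */
		return 0;
	memcpy(out, src, len);
	out[len] = '\0';
	return 1;
}
```

For "+refs/heads/*:refs/remotes/origin/*" this produces "refs/heads/*";
a refspec whose source is a full object id produces no pattern, since the
server has no ref name to filter on.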
 builtin/fetch.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index 850382f55..8128450bf 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -332,11 +332,21 @@ static struct ref *get_ref_map(struct transport *transport,
 	struct ref *rm;
 	struct ref *ref_map = NULL;
 	struct ref **tail = &ref_map;
+	struct argv_array ref_patterns = ARGV_ARRAY_INIT;
 
 	/* opportunistically-updated references: */
 	struct ref *orefs = NULL, **oref_tail = &orefs;
 
-	const struct ref *remote_refs = transport_get_remote_refs(transport, NULL);
+	const struct ref *remote_refs;
+
+	for (i = 0; i < refspec_count; i++) {
+		if (!refspecs[i].exact_sha1)
+			argv_array_push(&ref_patterns, refspecs[i].src);
+	}
+
+	remote_refs = transport_get_remote_refs(transport, &ref_patterns);
+
+	argv_array_clear(&ref_patterns);
 
 	if (refspec_count) {
 		struct refspec *fetch_refspec;
-- 
2.16.0.rc1.238.g530d649a79-goog



* [PATCH v3 19/35] push: pass ref patterns when pushing
  2018-02-07  1:12   ` [PATCH v3 00/35] " Brandon Williams
                       ` (17 preceding siblings ...)
  2018-02-07  1:12     ` [PATCH v3 18/35] fetch: pass ref patterns when fetching Brandon Williams
@ 2018-02-07  1:12     ` Brandon Williams
  2018-02-27 18:23       ` Stefan Beller
  2018-02-07  1:12     ` [PATCH v3 20/35] upload-pack: introduce fetch server command Brandon Williams
                       ` (18 subsequent siblings)
  37 siblings, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-02-07  1:12 UTC (permalink / raw)
  To: git
  Cc: sbeller, peff, gitster, jrnieder, stolee, git, pclouds, Brandon Williams

Construct a list of ref patterns to be passed to 'get_refs_list()' from
the refspec to be used during the push.  This list of ref patterns will
be used to allow the server to filter the ref advertisement when
communicating using protocol v2.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 transport.c | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/transport.c b/transport.c
index dfc603b36..6ea3905e3 100644
--- a/transport.c
+++ b/transport.c
@@ -1026,11 +1026,26 @@ int transport_push(struct transport *transport,
 		int porcelain = flags & TRANSPORT_PUSH_PORCELAIN;
 		int pretend = flags & TRANSPORT_PUSH_DRY_RUN;
 		int push_ret, ret, err;
+		struct refspec *tmp_rs;
+		struct argv_array ref_patterns = ARGV_ARRAY_INIT;
+		int i;
 
 		if (check_push_refs(local_refs, refspec_nr, refspec) < 0)
 			return -1;
 
-		remote_refs = transport->vtable->get_refs_list(transport, 1, NULL);
+		tmp_rs = parse_push_refspec(refspec_nr, refspec);
+		for (i = 0; i < refspec_nr; i++) {
+			if (tmp_rs[i].dst)
+				argv_array_push(&ref_patterns, tmp_rs[i].dst);
+			else if (tmp_rs[i].src && !tmp_rs[i].exact_sha1)
+				argv_array_push(&ref_patterns, tmp_rs[i].src);
+		}
+
+		remote_refs = transport->vtable->get_refs_list(transport, 1,
+							       &ref_patterns);
+
+		argv_array_clear(&ref_patterns);
+		free_refspec(refspec_nr, tmp_rs);
 
 		if (flags & TRANSPORT_PUSH_ALL)
 			match_flags |= MATCH_REFS_ALL;
-- 
2.16.0.rc1.238.g530d649a79-goog



* [PATCH v3 20/35] upload-pack: introduce fetch server command
  2018-02-07  1:12   ` [PATCH v3 00/35] " Brandon Williams
                       ` (18 preceding siblings ...)
  2018-02-07  1:12     ` [PATCH v3 19/35] push: pass ref patterns when pushing Brandon Williams
@ 2018-02-07  1:12     ` Brandon Williams
  2018-02-21 23:46       ` Jonathan Tan
  2018-02-07  1:12     ` [PATCH v3 21/35] fetch-pack: perform a fetch using v2 Brandon Williams
                       ` (17 subsequent siblings)
  37 siblings, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-02-07  1:12 UTC (permalink / raw)
  To: git
  Cc: sbeller, peff, gitster, jrnieder, stolee, git, pclouds, Brandon Williams

Introduce the 'fetch' server command.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
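
For reviewers who want to poke at the response format documented below,
here is a small stand-alone reader for the "acknowledgments" section.  It
is a sketch under a few assumptions: the usual pkt-line framing (four hex
digits of length, "0000" flush, "0001" delim), the whole section already
sitting in one buffer, and names ('read_pkt()', 'parse_acks()',
'struct ack_summary') that are invented here and do not exist in git.

```c
#include <string.h>

enum pkt_type { PKT_DATA, PKT_FLUSH, PKT_DELIM, PKT_ERROR };

struct ack_summary { int acks, nak, ready; };

static int hexval(char c)
{
	if (c >= '0' && c <= '9') return c - '0';
	if (c >= 'a' && c <= 'f') return c - 'a' + 10;
	if (c >= 'A' && c <= 'F') return c - 'A' + 10;
	return -1;
}

/* Reads one packet at *p and advances *p past it. */
static enum pkt_type read_pkt(const char **p, const char *end,
			      const char **data, size_t *len)
{
	size_t n = 0;
	int i;

	if (end - *p < 4)
		return PKT_ERROR;
	for (i = 0; i < 4; i++) {
		int v = hexval((*p)[i]);
		if (v < 0)
			return PKT_ERROR;
		n = (n << 4) | (size_t)v;
	}
	*p += 4;
	if (n == 0)
		return PKT_FLUSH;
	if (n == 1)
		return PKT_DELIM;
	if (n < 4 || n - 4 > (size_t)(end - *p))
		return PKT_ERROR;
	*data = *p;
	*len = n - 4;
	*p += n - 4;
	return PKT_DATA;
}

/* Compares a packet payload with want, ignoring one trailing LF. */
static int line_is(const char *data, size_t len, const char *want)
{
	if (len && data[len - 1] == '\n')
		len--;
	return len == strlen(want) && !memcmp(data, want, len);
}

/* Returns 1 for a well-formed section ended by a flush or delim packet. */
static int parse_acks(const char *buf, size_t buflen, struct ack_summary *out)
{
	const char *p = buf, *end = buf + buflen, *data = NULL;
	size_t len = 0;
	enum pkt_type t;

	memset(out, 0, sizeof(*out));
	if (read_pkt(&p, end, &data, &len) != PKT_DATA ||
	    !line_is(data, len, "acknowledgments"))
		return 0;
	while ((t = read_pkt(&p, end, &data, &len)) == PKT_DATA) {
		if (line_is(data, len, "NAK"))
			out->nak = 1;
		else if (line_is(data, len, "ready"))
			out->ready = 1;
		else if (len > 4 && !memcmp(data, "ACK ", 4))
			out->acks++;
		else
			return 0;
	}
	if (t == PKT_ERROR)
		return 0;
	/* the document forbids mixing "ACK" lines with a "NAK" line */
	return !(out->nak && out->acks);
}
```

A section ending in a delimiter packet would be followed by the packfile
section in the same response; the sketch stops at the terminator and
leaves that to the caller.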
 Documentation/technical/protocol-v2.txt | 127 +++++++++++++++
 serve.c                                 |   2 +
 t/t5701-git-serve.sh                    |   1 +
 upload-pack.c                           | 281 ++++++++++++++++++++++++++++++++
 upload-pack.h                           |   5 +
 5 files changed, 416 insertions(+)

diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
index ef81df868..4d5096dae 100644
--- a/Documentation/technical/protocol-v2.txt
+++ b/Documentation/technical/protocol-v2.txt
@@ -144,3 +144,130 @@ The output of ls-refs is as follows:
     ref-attribute = (symref | peeled)
     symref = "symref-target:" symref-target
     peeled = "peeled:" obj-id
+
+ fetch
+-------
+
+`fetch` is the command used to fetch a packfile in v2.  It can be looked
+at as a modified version of the v1 fetch where the ref-advertisement is
+stripped out (since the `ls-refs` command fills that role) and the
+message format is tweaked to eliminate redundancies and permit easy
+addition of future extensions.
+
+Additional features not supported in the base command will be advertised
+as the value of the command in the capability advertisement in the form
+of a space separated list of features, e.g.  "<command>=<feature 1>
+<feature 2>".
+
+A `fetch` request can take the following parameters wrapped in
+packet-lines:
+
+    want <oid>
+	Indicates to the server an object which the client wants to
+	retrieve.
+
+    have <oid>
+	Indicates to the server an object which the client has locally.
+	This allows the server to make a packfile which only contains
+	the objects that the client needs. Multiple 'have' lines can be
+	supplied.
+
+    done
+	Indicates to the server that negotiation should terminate (or
+	not even begin if performing a clone) and that the server should
+	use the information supplied in the request to construct the
+	packfile.
+
+    thin-pack
+	Request that a thin pack be sent, which is a pack with deltas
+	which reference base objects not contained within the pack (but
+	are known to exist at the receiving end). This can reduce the
+	network traffic significantly, but it requires the receiving end
+	to know how to "thicken" these packs by adding the missing bases
+	to the pack.
+
+    no-progress
+	Request that progress information that would normally be sent on
+	side-band channel 2, during the packfile transfer, should not be
+	sent.  However, the side-band channel 3 is still used for error
+	responses.
+
+    include-tag
+	Request that annotated tags should be sent if the objects they
+	point to are being sent.
+
+    ofs-delta
+	Indicate that the client understands PACKv2 with delta referring
+	to its base by position in pack rather than by an oid.  That is,
+	they can read OBJ_OFS_DELTA (aka type 6) in a packfile.
+
+The response of `fetch` is broken into a number of sections separated by
+delimiter packets (0001), with each section beginning with its section
+header.
+
+    output = *section
+    section = (acknowledgments | packfile)
+	      (flush-pkt | delim-pkt)
+
+    acknowledgments = PKT-LINE("acknowledgments" LF)
+		      *(ready | nak | ack)
+    ready = PKT-LINE("ready" LF)
+    nak = PKT-LINE("NAK" LF)
+    ack = PKT-LINE("ACK" SP obj-id LF)
+
+    packfile = PKT-LINE("packfile" LF)
+	       [PACKFILE]
+
+----
+    acknowledgments section
+	* Always begins with the section header "acknowledgments"
+
+	* The server will respond with "NAK" if none of the object ids sent
+	  as have lines were common.
+
+	* The server will respond with "ACK obj-id" for all of the
+	  object ids sent as have lines which are common.
+
+	* A response cannot have both "ACK" lines as well as a "NAK"
+	  line.
+
+	* The server will respond with a "ready" line indicating that
+	  the server has found an acceptable common base and is ready to
+	  make and send a packfile (which will be found in the packfile
+	  section of the same response)
+
+	* If the client determines that it is finished with negotiations
+	  by sending a "done" line, the acknowledgments section can be
+	  omitted from the server's response as an optimization.
+
+	* If the server has found a suitable cut point and has decided
+	  to send a "ready" line, then the server can decide to (as an
+	  optimization) omit any "ACK" lines it would have sent during
+	  its response.  This is because the server will have already
+	  determined the objects it plans to send to the client and no
+	  further negotiation is needed.
+
+----
+    packfile section
+	* Always begins with the section header "packfile"
+
+	* The transmission of the packfile begins immediately after the
+	  section header
+
+	* The data transfer of the packfile is always multiplexed, using
+	  the same semantics as the 'side-band-64k' capability from
+	  protocol version 1.  This means that each packet, during the
+	  packfile data stream, is made up of a leading 4-byte pkt-line
+	  length (typical of the pkt-line format), followed by a 1-byte
+	  stream code, followed by the actual data.
+
+	  The stream code can be one of:
+		1 - pack data
+		2 - progress messages
+		3 - fatal error message just before stream aborts
+
+	* This section is only included if the client has sent 'want'
+	  lines in its request and either requested that no more
+	  negotiation be done by sending 'done' or if the server has
+	  decided it has found a sufficient cut point to produce a
+	  packfile.
diff --git a/serve.c b/serve.c
index c7925c5c7..05cc434cf 100644
--- a/serve.c
+++ b/serve.c
@@ -6,6 +6,7 @@
 #include "argv-array.h"
 #include "ls-refs.h"
 #include "serve.h"
+#include "upload-pack.h"
 
 static int always_advertise(struct repository *r,
 			    struct strbuf *value)
@@ -52,6 +53,7 @@ struct protocol_capability {
 static struct protocol_capability capabilities[] = {
 	{ "agent", agent_advertise, NULL },
 	{ "ls-refs", always_advertise, ls_refs },
+	{ "fetch", always_advertise, upload_pack_v2 },
 };
 
 static void advertise_capabilities(void)
diff --git a/t/t5701-git-serve.sh b/t/t5701-git-serve.sh
index 33536254e..202cb782d 100755
--- a/t/t5701-git-serve.sh
+++ b/t/t5701-git-serve.sh
@@ -9,6 +9,7 @@ test_expect_success 'test capability advertisement' '
 	version 2
 	agent=git/$(git version | cut -d" " -f3)
 	ls-refs
+	fetch
 	0000
 	EOF
 
diff --git a/upload-pack.c b/upload-pack.c
index 1e8a9e1ca..c6518a24d 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -18,6 +18,7 @@
 #include "prio-queue.h"
 #include "protocol.h"
 #include "upload-pack.h"
+#include "serve.h"
 
 /* Remember to update object flag allocation in object.h */
 #define THEY_HAVE	(1u << 11)
@@ -1065,3 +1066,283 @@ void upload_pack(struct upload_pack_options *options)
 		create_pack_file();
 	}
 }
+
+struct upload_pack_data {
+	struct object_array wants;
+	struct oid_array haves;
+
+	unsigned stateless_rpc : 1;
+
+	unsigned use_thin_pack : 1;
+	unsigned use_ofs_delta : 1;
+	unsigned no_progress : 1;
+	unsigned use_include_tag : 1;
+	unsigned done : 1;
+};
+
+#define UPLOAD_PACK_DATA_INIT { OBJECT_ARRAY_INIT, OID_ARRAY_INIT, 0, 0, 0, 0, 0, 0 }
+
+static void upload_pack_data_clear(struct upload_pack_data *data)
+{
+	object_array_clear(&data->wants);
+	oid_array_clear(&data->haves);
+}
+
+static int parse_want(const char *line)
+{
+	const char *arg;
+	if (skip_prefix(line, "want ", &arg)) {
+		struct object_id oid;
+		struct object *o;
+
+		if (get_oid_hex(arg, &oid))
+			die("git upload-pack: protocol error, "
+			    "expected to get oid, not '%s'", line);
+
+		o = parse_object(&oid);
+		if (!o) {
+			packet_write_fmt(1,
+					 "ERR upload-pack: not our ref %s",
+					 oid_to_hex(&oid));
+			die("git upload-pack: not our ref %s",
+			    oid_to_hex(&oid));
+		}
+
+		if (!(o->flags & WANTED)) {
+			o->flags |= WANTED;
+			add_object_array(o, NULL, &want_obj);
+		}
+
+		return 1;
+	}
+
+	return 0;
+}
+
+static int parse_have(const char *line, struct oid_array *haves)
+{
+	const char *arg;
+	if (skip_prefix(line, "have ", &arg)) {
+		struct object_id oid;
+
+		if (get_oid_hex(arg, &oid))
+			die("git upload-pack: expected SHA1 object, got '%s'", arg);
+		oid_array_append(haves, &oid);
+		return 1;
+	}
+
+	return 0;
+}
+
+static void process_args(struct argv_array *args, struct upload_pack_data *data)
+{
+	int i;
+
+	for (i = 0; i < args->argc; i++) {
+		const char *arg = args->argv[i];
+
+		/* process want */
+		if (parse_want(arg))
+			continue;
+		/* process have line */
+		if (parse_have(arg, &data->haves))
+			continue;
+
+		/* process args like thin-pack */
+		if (!strcmp(arg, "thin-pack")) {
+			use_thin_pack = 1;
+			continue;
+		}
+		if (!strcmp(arg, "ofs-delta")) {
+			use_ofs_delta = 1;
+			continue;
+		}
+		if (!strcmp(arg, "no-progress")) {
+			no_progress = 1;
+			continue;
+		}
+		if (!strcmp(arg, "include-tag")) {
+			use_include_tag = 1;
+			continue;
+		}
+		if (!strcmp(arg, "done")) {
+			data->done = 1;
+			continue;
+		}
+
+		/* ignore unknown lines maybe? */
+		die("unexpect line: '%s'", arg);
+	}
+}
+
+static void read_haves(struct upload_pack_data *data)
+{
+	struct packet_reader reader;
+	packet_reader_init(&reader, 0, NULL, 0,
+			   PACKET_READ_CHOMP_NEWLINE);
+
+	while (packet_reader_read(&reader) == PACKET_READ_NORMAL) {
+
+		if (parse_have(reader.line, &data->haves))
+			continue;
+		if (!strcmp(reader.line, "done")) {
+			data->done = 1;
+			continue;
+		}
+	}
+	if (reader.status != PACKET_READ_FLUSH)
+		die("ERROR");
+}
+
+static int process_haves(struct oid_array *haves, struct oid_array *common)
+{
+	int i;
+
+	/* Process haves */
+	for (i = 0; i < haves->nr; i++) {
+		const struct object_id *oid = &haves->oid[i];
+		struct object *o;
+		int we_knew_they_have = 0;
+
+		if (!has_object_file(oid))
+			continue;
+
+		oid_array_append(common, oid);
+
+		o = parse_object(oid);
+		if (!o)
+			die("oops (%s)", oid_to_hex(oid));
+		if (o->type == OBJ_COMMIT) {
+			struct commit_list *parents;
+			struct commit *commit = (struct commit *)o;
+			if (o->flags & THEY_HAVE)
+				we_knew_they_have = 1;
+			else
+				o->flags |= THEY_HAVE;
+			if (!oldest_have || (commit->date < oldest_have))
+				oldest_have = commit->date;
+			for (parents = commit->parents;
+			     parents;
+			     parents = parents->next)
+				parents->item->object.flags |= THEY_HAVE;
+		}
+		if (!we_knew_they_have)
+			add_object_array(o, NULL, &have_obj);
+	}
+
+	return 0;
+}
+
+static int send_acks(struct oid_array *acks, struct strbuf *response)
+{
+	int i;
+
+	packet_buf_write(response, "acknowledgments\n");
+
+	/* Send Acks */
+	if (!acks->nr)
+		packet_buf_write(response, "NAK\n");
+
+	for (i = 0; i < acks->nr; i++) {
+		packet_buf_write(response, "ACK %s\n",
+				 oid_to_hex(&acks->oid[i]));
+	}
+
+	if (ok_to_give_up()) {
+		/* Send Ready */
+		packet_buf_write(response, "ready\n");
+		return 1;
+	}
+
+	return 0;
+}
+
+static int process_haves_and_send_acks(struct upload_pack_data *data)
+{
+	struct oid_array common = OID_ARRAY_INIT;
+	struct strbuf response = STRBUF_INIT;
+	int ret = 0;
+
+	process_haves(&data->haves, &common);
+	if (data->done) {
+		ret = 1;
+	} else if (send_acks(&common, &response)) {
+		packet_buf_delim(&response);
+		ret = 1;
+	} else {
+		/* Add Flush */
+		packet_buf_flush(&response);
+		ret = 0;
+	}
+
+	/* Send response */
+	write_or_die(1, response.buf, response.len);
+	strbuf_release(&response);
+
+	oid_array_clear(&data->haves);
+	oid_array_clear(&common);
+	return ret;
+}
+
+enum fetch_state {
+	FETCH_PROCESS_ARGS = 0,
+	FETCH_READ_HAVES,
+	FETCH_SEND_ACKS,
+	FETCH_SEND_PACK,
+	FETCH_DONE,
+};
+
+int upload_pack_v2(struct repository *r, struct argv_array *keys,
+		   struct argv_array *args)
+{
+	enum fetch_state state = FETCH_PROCESS_ARGS;
+	struct upload_pack_data data = UPLOAD_PACK_DATA_INIT;
+	use_sideband = LARGE_PACKET_MAX;
+
+	while (state != FETCH_DONE) {
+		switch (state) {
+		case FETCH_PROCESS_ARGS:
+			process_args(args, &data);
+
+			if (!want_obj.nr) {
+				/*
+				 * Request didn't contain any 'want' lines,
+				 * guess they didn't want anything.
+				 */
+				state = FETCH_DONE;
+			} else if (data.haves.nr) {
+				/*
+				 * Request had 'have' lines, so let's ACK them.
+				 */
+				state = FETCH_SEND_ACKS;
+			} else {
+				/*
+				 * Request had 'want's but no 'have's so we can
+				 * immediately go on to construct and send a pack.
+				 */
+				state = FETCH_SEND_PACK;
+			}
+			break;
+		case FETCH_READ_HAVES:
+			read_haves(&data);
+			state = FETCH_SEND_ACKS;
+			break;
+		case FETCH_SEND_ACKS:
+			if (process_haves_and_send_acks(&data))
+				state = FETCH_SEND_PACK;
+			else
+				state = FETCH_DONE;
+			break;
+		case FETCH_SEND_PACK:
+			packet_write_fmt(1, "packfile\n");
+			create_pack_file();
+			state = FETCH_DONE;
+			break;
+		case FETCH_DONE:
+			continue;
+		}
+	}
+
+	upload_pack_data_clear(&data);
+	return 0;
+}
diff --git a/upload-pack.h b/upload-pack.h
index a71e4dc7e..6b7890238 100644
--- a/upload-pack.h
+++ b/upload-pack.h
@@ -10,4 +10,9 @@ struct upload_pack_options {
 
 void upload_pack(struct upload_pack_options *options);
 
+struct repository;
+struct argv_array;
+extern int upload_pack_v2(struct repository *r, struct argv_array *keys,
+			  struct argv_array *args);
+
 #endif /* UPLOAD_PACK_H */
-- 
2.16.0.rc1.238.g530d649a79-goog



* [PATCH v3 21/35] fetch-pack: perform a fetch using v2
  2018-02-07  1:12   ` [PATCH v3 00/35] " Brandon Williams
                       ` (19 preceding siblings ...)
  2018-02-07  1:12     ` [PATCH v3 20/35] upload-pack: introduce fetch server command Brandon Williams
@ 2018-02-07  1:12     ` Brandon Williams
  2018-02-24  0:54       ` Jonathan Tan
  2018-02-27 19:27       ` Stefan Beller
  2018-02-07  1:12     ` [PATCH v3 22/35] upload-pack: support shallow requests Brandon Williams
                       ` (16 subsequent siblings)
  37 siblings, 2 replies; 362+ messages in thread
From: Brandon Williams @ 2018-02-07  1:12 UTC (permalink / raw)
  To: git
  Cc: sbeller, peff, gitster, jrnieder, stolee, git, pclouds, Brandon Williams

When communicating with a v2 server, perform a fetch by requesting the
'fetch' command.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
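For readers following along, here is a rough sketch of what a v2 'fetch'
request looks like on the wire once send_fetch_request() has framed it.
It is not the code from this patch: pkt_append(), pkt_special() and
build_fetch_request() are hypothetical stand-ins for the pkt-line
helpers, and the sketch assumes a single want, at most one have, and no
capability or argument lines such as agent= or thin-pack.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Append one pkt-line: four hex digits giving the total length
 * (the four header bytes included), followed by the payload.
 */
static size_t pkt_append(char *buf, size_t len, size_t cap, const char *payload)
{
	size_t n = strlen(payload) + 4;

	assert(n <= 65520 && len + n < cap);
	snprintf(buf + len, cap - len, "%04zx%s", n, payload);
	return len + n;
}

/* Append a special packet: flush ("0000") or delim ("0001"). */
static size_t pkt_special(char *buf, size_t len, size_t cap, const char *pkt)
{
	assert(strlen(pkt) == 4 && len + 4 < cap);
	memcpy(buf + len, pkt, 5);
	return len + 4;
}

/*
 * Hypothetical, simplified request builder: the command line, a delim
 * packet, the want/have/done lines, then a flush packet.
 */
static size_t build_fetch_request(char *buf, size_t cap, const char *want_hex,
				  const char *have_hex, int done)
{
	char line[64];
	size_t len = 0;

	len = pkt_append(buf, len, cap, "command=fetch");
	len = pkt_special(buf, len, cap, "0001");

	snprintf(line, sizeof(line), "want %s\n", want_hex);
	len = pkt_append(buf, len, cap, line);
	if (have_hex) {
		snprintf(line, sizeof(line), "have %s\n", have_hex);
		len = pkt_append(buf, len, cap, line);
	}
	if (done)
		len = pkt_append(buf, len, cap, "done\n");

	return pkt_special(buf, len, cap, "0000");
}
```

With no haves the client sends 'done' right away, which is the case
where do_fetch_pack_v2() goes straight from FETCH_SEND_REQUEST to
FETCH_GET_PACK.
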
 builtin/fetch-pack.c   |   2 +-
 fetch-pack.c           | 252 ++++++++++++++++++++++++++++++++++++++++++++++++-
 fetch-pack.h           |   4 +-
 t/t5702-protocol-v2.sh |  84 +++++++++++++++++
 transport.c            |   7 +-
 5 files changed, 342 insertions(+), 7 deletions(-)

diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
index f492e8abd..867dd3cc7 100644
--- a/builtin/fetch-pack.c
+++ b/builtin/fetch-pack.c
@@ -213,7 +213,7 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix)
 	}
 
 	ref = fetch_pack(&args, fd, conn, ref, dest, sought, nr_sought,
-			 &shallow, pack_lockfile_ptr);
+			 &shallow, pack_lockfile_ptr, protocol_v0);
 	if (pack_lockfile) {
 		printf("lock %s\n", pack_lockfile);
 		fflush(stdout);
diff --git a/fetch-pack.c b/fetch-pack.c
index 9f6b07ad9..4fb5805dd 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1008,6 +1008,247 @@ static struct ref *do_fetch_pack(struct fetch_pack_args *args,
 	return ref;
 }
 
+static void add_wants(const struct ref *wants, struct strbuf *req_buf)
+{
+	for ( ; wants ; wants = wants->next) {
+		const struct object_id *remote = &wants->old_oid;
+		const char *remote_hex;
+		struct object *o;
+
+		/*
+		 * If that object is complete (i.e. it is an ancestor of a
+		 * local ref), we tell them we have it but do not have to
+		 * tell them about its ancestors, which they already know
+		 * about.
+		 *
+		 * We use lookup_object here because we are only
+		 * interested in the case we *know* the object is
+		 * reachable and we have already scanned it.
+		 */
+		if (((o = lookup_object(remote->hash)) != NULL) &&
+		    (o->flags & COMPLETE)) {
+			continue;
+		}
+
+		remote_hex = oid_to_hex(remote);
+		packet_buf_write(req_buf, "want %s\n", remote_hex);
+	}
+}
+
+static void add_common(struct strbuf *req_buf, struct oidset *common)
+{
+	struct oidset_iter iter;
+	const struct object_id *oid;
+	oidset_iter_init(common, &iter);
+
+	while ((oid = oidset_iter_next(&iter))) {
+		packet_buf_write(req_buf, "have %s\n", oid_to_hex(oid));
+	}
+}
+
+static int add_haves(struct strbuf *req_buf, int *in_vain)
+{
+	int ret = 0;
+	int haves_added = 0;
+	const struct object_id *oid;
+
+	while ((oid = get_rev())) {
+		packet_buf_write(req_buf, "have %s\n", oid_to_hex(oid));
+		if (++haves_added >= INITIAL_FLUSH)
+			break;
+	}
+
+	*in_vain += haves_added;
+	if (!haves_added || *in_vain >= MAX_IN_VAIN) {
+		/* Send Done */
+		packet_buf_write(req_buf, "done\n");
+		ret = 1;
+	}
+
+	return ret;
+}
+
+static int send_fetch_request(int fd_out, const struct fetch_pack_args *args,
+			      const struct ref *wants, struct oidset *common,
+			      int *in_vain)
+{
+	int ret = 0;
+	struct strbuf req_buf = STRBUF_INIT;
+
+	if (server_supports_v2("fetch", 1))
+		packet_buf_write(&req_buf, "command=fetch");
+	if (server_supports_v2("agent", 0))
+		packet_buf_write(&req_buf, "agent=%s", git_user_agent_sanitized());
+
+	packet_buf_delim(&req_buf);
+	if (args->use_thin_pack)
+		packet_buf_write(&req_buf, "thin-pack");
+	if (args->no_progress)
+		packet_buf_write(&req_buf, "no-progress");
+	if (args->include_tag)
+		packet_buf_write(&req_buf, "include-tag");
+	if (prefer_ofs_delta)
+		packet_buf_write(&req_buf, "ofs-delta");
+
+	/* add wants */
+	add_wants(wants, &req_buf);
+
+	/* Add all of the common commits we've found in previous rounds */
+	add_common(&req_buf, common);
+
+	/* Add initial haves */
+	ret = add_haves(&req_buf, in_vain);
+
+	/* Send request */
+	packet_buf_flush(&req_buf);
+	write_or_die(fd_out, req_buf.buf, req_buf.len);
+
+	strbuf_release(&req_buf);
+	return ret;
+}
+
+/*
+ * Processes a section header in a server's response and checks if it matches
+ * `section`.  If the value of `peek` is 1, the header line will be peeked (and
+ * not consumed); if 0, the line will be consumed and the function will die if
+ * the section header doesn't match what was expected.
+ */
+static int process_section_header(struct packet_reader *reader,
+				  const char *section, int peek)
+{
+	int ret;
+
+	if (packet_reader_peek(reader) != PACKET_READ_NORMAL)
+		die("error reading packet");
+
+	ret = !strcmp(reader->line, section);
+
+	if (!peek) {
+		if (!ret)
+			die("expected '%s', received '%s'",
+			    section, reader->line);
+		packet_reader_read(reader);
+	}
+
+	return ret;
+}
+
+static int process_acks(struct packet_reader *reader, struct oidset *common)
+{
+	int received_ready = 0;
+	int received_ack = 0;
+
+	process_section_header(reader, "acknowledgments", 0);
+	while (packet_reader_read(reader) == PACKET_READ_NORMAL) {
+		const char *arg;
+
+		if (!strcmp(reader->line, "NAK"))
+			continue;
+
+		if (skip_prefix(reader->line, "ACK ", &arg)) {
+			struct object_id oid;
+			if (!get_oid_hex(arg, &oid)) {
+				struct commit *commit;
+				oidset_insert(common, &oid);
+				commit = lookup_commit(&oid);
+				mark_common(commit, 0, 1);
+				received_ack = 1;
+			}
+			continue;
+		}
+
+		if (!strcmp(reader->line, "ready")) {
+			clear_prio_queue(&rev_list);
+			received_ready = 1;
+			continue;
+		}
+
+		die(_("git fetch-pack: expected ACK/NAK, got '%s'"), reader->line);
+	}
+
+	if (reader->status != PACKET_READ_FLUSH &&
+	    reader->status != PACKET_READ_DELIM)
+		die("Error during processing acks: %d", reader->status);
+
+	/* return 0 if no common, 1 if there are common, or 2 if ready */
+	return received_ready ? 2 : (received_ack ? 1 : 0);
+}
+
+enum fetch_state {
+	FETCH_CHECK_LOCAL = 0,
+	FETCH_SEND_REQUEST,
+	FETCH_PROCESS_ACKS,
+	FETCH_GET_PACK,
+	FETCH_DONE,
+};
+
+static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
+				    int fd[2],
+				    const struct ref *orig_ref,
+				    struct ref **sought, int nr_sought,
+				    char **pack_lockfile)
+{
+	struct ref *ref = copy_ref_list(orig_ref);
+	enum fetch_state state = FETCH_CHECK_LOCAL;
+	struct oidset common = OIDSET_INIT;
+	struct packet_reader reader;
+	int in_vain = 0;
+	packet_reader_init(&reader, fd[0], NULL, 0,
+			   PACKET_READ_CHOMP_NEWLINE);
+
+	while (state != FETCH_DONE) {
+		switch (state) {
+		case FETCH_CHECK_LOCAL:
+			sort_ref_list(&ref, ref_compare_name);
+			QSORT(sought, nr_sought, cmp_ref_by_name);
+
+			/* v2 supports these by default */
+			allow_unadvertised_object_request |= ALLOW_REACHABLE_SHA1;
+			use_sideband = 2;
+
+			/* Filter 'ref' by 'sought' and those that aren't local */
+			if (everything_local(args, &ref, sought, nr_sought))
+				state = FETCH_DONE;
+			else
+				state = FETCH_SEND_REQUEST;
+			break;
+		case FETCH_SEND_REQUEST:
+			if (send_fetch_request(fd[1], args, ref, &common, &in_vain))
+				state = FETCH_GET_PACK;
+			else
+				state = FETCH_PROCESS_ACKS;
+			break;
+		case FETCH_PROCESS_ACKS:
+			/* Process ACKs/NAKs */
+			switch (process_acks(&reader, &common)) {
+			case 2:
+				state = FETCH_GET_PACK;
+				break;
+			case 1:
+				in_vain = 0;
+				/* fallthrough */
+			default:
+				state = FETCH_SEND_REQUEST;
+				break;
+			}
+			break;
+		case FETCH_GET_PACK:
+			/* get the pack */
+			process_section_header(&reader, "packfile", 0);
+			if (get_pack(args, fd, pack_lockfile))
+				die(_("git fetch-pack: fetch failed."));
+
+			state = FETCH_DONE;
+			break;
+		case FETCH_DONE:
+			continue;
+		}
+	}
+
+	oidset_clear(&common);
+	return ref;
+}
+
 static void fetch_pack_config(void)
 {
 	git_config_get_int("fetch.unpacklimit", &fetch_unpack_limit);
@@ -1153,7 +1394,8 @@ struct ref *fetch_pack(struct fetch_pack_args *args,
 		       const char *dest,
 		       struct ref **sought, int nr_sought,
 		       struct oid_array *shallow,
-		       char **pack_lockfile)
+		       char **pack_lockfile,
+		       enum protocol_version version)
 {
 	struct ref *ref_cpy;
 	struct shallow_info si;
@@ -1167,8 +1409,12 @@ struct ref *fetch_pack(struct fetch_pack_args *args,
 		die(_("no matching remote head"));
 	}
 	prepare_shallow_info(&si, shallow);
-	ref_cpy = do_fetch_pack(args, fd, ref, sought, nr_sought,
-				&si, pack_lockfile);
+	if (version == protocol_v2)
+		ref_cpy = do_fetch_pack_v2(args, fd, ref, sought, nr_sought,
+					   pack_lockfile);
+	else
+		ref_cpy = do_fetch_pack(args, fd, ref, sought, nr_sought,
+					&si, pack_lockfile);
 	reprepare_packed_git();
 	update_shallow(args, sought, nr_sought, &si);
 	clear_shallow_info(&si);
diff --git a/fetch-pack.h b/fetch-pack.h
index b6aeb43a8..7afca7305 100644
--- a/fetch-pack.h
+++ b/fetch-pack.h
@@ -3,6 +3,7 @@
 
 #include "string-list.h"
 #include "run-command.h"
+#include "protocol.h"
 
 struct oid_array;
 
@@ -43,7 +44,8 @@ struct ref *fetch_pack(struct fetch_pack_args *args,
 		       struct ref **sought,
 		       int nr_sought,
 		       struct oid_array *shallow,
-		       char **pack_lockfile);
+		       char **pack_lockfile,
+		       enum protocol_version version);
 
 /*
  * Print an appropriate error message for each sought ref that wasn't
diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh
index a33ff6597..16304d1c8 100755
--- a/t/t5702-protocol-v2.sh
+++ b/t/t5702-protocol-v2.sh
@@ -38,6 +38,50 @@ test_expect_success 'ref advertisment is filtered with ls-remote using protocol
 	! grep "refs/tags/" log
 '
 
+test_expect_success 'clone with git:// using protocol v2' '
+	GIT_TRACE_PACKET=1 git -c protocol.version=2 \
+		clone "$GIT_DAEMON_URL/parent" daemon_child 2>log &&
+
+	git -C daemon_child log -1 --format=%s >actual &&
+	git -C "$daemon_parent" log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+
+	# Client requested to use protocol v2
+	grep "clone> .*\\\0\\\0version=2\\\0$" log &&
+	# Server responded using protocol v2
+	grep "clone< version 2" log
+'
+
+test_expect_success 'fetch with git:// using protocol v2' '
+	test_commit -C "$daemon_parent" two &&
+
+	GIT_TRACE_PACKET=1 git -C daemon_child -c protocol.version=2 \
+		fetch 2>log &&
+
+	git -C daemon_child log -1 --format=%s origin/master >actual &&
+	git -C "$daemon_parent" log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+
+	# Client requested to use protocol v2
+	grep "fetch> .*\\\0\\\0version=2\\\0$" log &&
+	# Server responded using protocol v2
+	grep "fetch< version 2" log
+'
+
+test_expect_success 'pull with git:// using protocol v2' '
+	GIT_TRACE_PACKET=1 git -C daemon_child -c protocol.version=2 \
+		pull 2>log &&
+
+	git -C daemon_child log -1 --format=%s >actual &&
+	git -C "$daemon_parent" log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+
+	# Client requested to use protocol v2
+	grep "fetch> .*\\\0\\\0version=2\\\0$" log &&
+	# Server responded using protocol v2
+	grep "fetch< version 2" log
+'
+
 stop_git_daemon
 
 # Test protocol v2 with 'file://' transport
@@ -66,4 +110,44 @@ test_expect_success 'ref advertisment is filtered with ls-remote using protocol
 	! grep "refs/tags/" log
 '
 
+test_expect_success 'clone with file:// using protocol v2' '
+	GIT_TRACE_PACKET=1 git -c protocol.version=2 \
+		clone "file://$(pwd)/file_parent" file_child 2>log &&
+
+	git -C file_child log -1 --format=%s >actual &&
+	git -C file_parent log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+
+	# Server responded using protocol v2
+	grep "clone< version 2" log
+'
+
+test_expect_success 'fetch with file:// using protocol v2' '
+	test_commit -C file_parent two &&
+
+	GIT_TRACE_PACKET=1 git -C file_child -c protocol.version=2 \
+		fetch origin 2>log &&
+
+	git -C file_child log -1 --format=%s origin/master >actual &&
+	git -C file_parent log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+
+	# Server responded using protocol v2
+	grep "fetch< version 2" log
+'
+
+test_expect_success 'ref advertisement is filtered during fetch using protocol v2' '
+	test_commit -C file_parent three &&
+
+	GIT_TRACE_PACKET=1 git -C file_child -c protocol.version=2 \
+		fetch origin master 2>log &&
+
+	git -C file_child log -1 --format=%s origin/master >actual &&
+	git -C file_parent log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+
+	grep "ref-pattern master" log &&
+	! grep "refs/tags/" log
+'
+
 test_done
diff --git a/transport.c b/transport.c
index 6ea3905e3..c275f46ed 100644
--- a/transport.c
+++ b/transport.c
@@ -256,14 +256,17 @@ static int fetch_refs_via_pack(struct transport *transport,
 
 	switch (data->version) {
 	case protocol_v2:
-		die("support for protocol v2 not implemented yet");
+		refs = fetch_pack(&args, data->fd, data->conn,
+				  refs_tmp ? refs_tmp : transport->remote_refs,
+				  dest, to_fetch, nr_heads, &data->shallow,
+				  &transport->pack_lockfile, data->version);
 		break;
 	case protocol_v1:
 	case protocol_v0:
 		refs = fetch_pack(&args, data->fd, data->conn,
 				  refs_tmp ? refs_tmp : transport->remote_refs,
 				  dest, to_fetch, nr_heads, &data->shallow,
-				  &transport->pack_lockfile);
+				  &transport->pack_lockfile, data->version);
 		break;
 	case protocol_unknown_version:
 		BUG("unknown protocol version");
-- 
2.16.0.rc1.238.g530d649a79-goog



* [PATCH v3 22/35] upload-pack: support shallow requests
  2018-02-07  1:12   ` [PATCH v3 00/35] " Brandon Williams
                       ` (20 preceding siblings ...)
  2018-02-07  1:12     ` [PATCH v3 21/35] fetch-pack: perform a fetch using v2 Brandon Williams
@ 2018-02-07  1:12     ` Brandon Williams
  2018-02-07 19:00       ` Stefan Beller
  2018-02-27 18:29       ` Jonathan Nieder
  2018-02-07  1:13     ` [PATCH v3 23/35] fetch-pack: " Brandon Williams
                       ` (15 subsequent siblings)
  37 siblings, 2 replies; 362+ messages in thread
From: Brandon Williams @ 2018-02-07  1:12 UTC (permalink / raw)
  To: git
  Cc: sbeller, peff, gitster, jrnieder, stolee, git, pclouds, Brandon Williams

Add the 'shallow' feature to the protocol version 2 command 'fetch'
which indicates that the server supports shallow clients and deepen
requests.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
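As a reading aid for the refactored send_shallow_list(), the sketch
below renders the rev-list invocation that a deepen-since / deepen-not
request is turned into. It is only an illustration under assumptions:
shallow_rev_list_cmd() and append() are hypothetical names, the result
is a flat string rather than an argv_array, and the timestamp is a
plain long instead of timestamp_t.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Append s to out, tracking the used length; -1 if it does not fit. */
static int append(char *out, size_t cap, size_t *len, const char *s)
{
	size_t n = strlen(s);

	if (*len + n >= cap)
		return -1;
	memcpy(out + *len, s, n + 1);
	*len += n;
	return 0;
}

/*
 * Hypothetical helper: build the rev-list command line for a request.
 * Returns 0 on success, or -1 if "deepen" is combined with
 * deepen-since/deepen-not or if the buffer is too small.
 */
static int shallow_rev_list_cmd(char *out, size_t cap, int depth, long since,
				const char **nots, int nr_nots,
				const char **wants, int nr_wants)
{
	char num[32];
	size_t len = 0;
	int i;

	assert(out && cap > 0);
	if (depth > 0 && (since || nr_nots))
		return -1;

	if (append(out, cap, &len, "rev-list"))
		return -1;
	if (since) {
		snprintf(num, sizeof(num), " --max-age=%ld", since);
		if (append(out, cap, &len, num))
			return -1;
	}
	if (nr_nots) {
		/* deepen-not revisions are bracketed by a pair of --not */
		if (append(out, cap, &len, " --not"))
			return -1;
		for (i = 0; i < nr_nots; i++)
			if (append(out, cap, &len, " ") ||
			    append(out, cap, &len, nots[i]))
				return -1;
		if (append(out, cap, &len, " --not"))
			return -1;
	}
	for (i = 0; i < nr_wants; i++)
		if (append(out, cap, &len, " ") ||
		    append(out, cap, &len, wants[i]))
			return -1;
	return 0;
}
```

The early -1 corresponds to the die() for "deepen and deepen-since (or
deepen-not) cannot be used together"; deepen-since and deepen-not may
be combined with each other.
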
 Documentation/technical/protocol-v2.txt |  67 +++++++++++++++-
 serve.c                                 |   2 +-
 t/t5701-git-serve.sh                    |   2 +-
 upload-pack.c                           | 138 +++++++++++++++++++++++---------
 upload-pack.h                           |   3 +
 5 files changed, 173 insertions(+), 39 deletions(-)

diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
index 4d5096dae..fedeb6b77 100644
--- a/Documentation/technical/protocol-v2.txt
+++ b/Documentation/technical/protocol-v2.txt
@@ -201,12 +201,42 @@ packet-lines:
 	to its base by position in pack rather than by an oid.  That is,
 	they can read OBJ_OFS_DELTA (ake type 6) in a packfile.
 
+    shallow <oid>
+	A client must notify the server of all commits for which it only
+	has shallow copies (meaning that it doesn't have the parents of
+	a commit) by supplying a 'shallow <oid>' line for each such
+	object so that the server is aware of the limitations of the
+	client's history.
+
+    deepen <depth>
+	Requests that the fetch/clone should be shallow, having a commit
+	depth of <depth> relative to the remote side.
+
+    deepen-relative
+	Requests that the semantics of the "deepen" command be changed
+	to indicate that the depth requested is relative to the client's
+	current shallow boundary, instead of relative to the remote
+	refs.
+
+    deepen-since <timestamp>
+	Requests that the shallow clone/fetch should be cut at a
+	specific time, instead of depth.  Internally it's the equivalent
+	of doing "rev-list --max-age=<timestamp>". Cannot be used with
+	"deepen".
+
+    deepen-not <rev>
+	Requests that the shallow clone/fetch should be cut at a
+	specific revision specified by '<rev>', instead of a depth.
+	Internally it's the equivalent of doing "rev-list --not <rev>".
+	Cannot be used with "deepen", but can be used with
+	"deepen-since".
+
 The response of `fetch` is broken into a number of sections separated by
 delimiter packets (0001), with each section beginning with its section
 header.
 
     output = *section
-    section = (acknowledgments | packfile)
+    section = (acknowledgments | shallow-info | packfile)
 	      (flush-pkt | delim-pkt)
 
     acknowledgments = PKT-LINE("acknowledgments" LF)
@@ -215,6 +245,11 @@ header.
     nak = PKT-LINE("NAK" LF)
     ack = PKT-LINE("ACK" SP obj-id LF)
 
+    shallow-info = PKT-LINE("shallow-info" LF)
+		   *PKT-LINE((shallow | unshallow) LF)
+    shallow = "shallow" SP obj-id
+    unshallow = "unshallow" SP obj-id
+
     packfile = PKT-LINE("packfile" LF)
 	       [PACKFILE]
 
@@ -247,6 +282,36 @@ header.
 	  determined the objects it plans to send to the client and no
 	  further negotiation is needed.
 
+----
+    shallow-info section
+	If the client has requested a shallow fetch/clone, a shallow
+	client requests a fetch, or the server is shallow, then the
+	server's response may include a shallow-info section.  The
+	shallow-info section will be included if (due to one of the above
+	conditions) the server needs to inform the client of any shallow
+	boundaries or adjustments to the client's already existing
+	shallow boundaries.
+
+	* Always begins with the section header "shallow-info"
+
+	* If a positive depth is requested, the server will compute the
+	  set of commits which are no deeper than the desired depth.
+
+	* The server sends a "shallow obj-id" line for each commit whose
+	  parents will not be sent in the following packfile.
+
+	* The server sends an "unshallow obj-id" line for each commit
+	  which the client has indicated is shallow, but is no longer
+	  shallow as a result of the fetch (due to its parents being
+	  sent in the following packfile).
+
+	* The server MUST NOT send any "unshallow" lines for anything
+	  which the client has not indicated was shallow as a part of
+	  its request.
+
+	* This section is only included if a packfile section is also
+	  included in the response.
+
 ----
     packfile section
 	* Always begins with the section header "packfile"
diff --git a/serve.c b/serve.c
index 05cc434cf..c3e58c1e7 100644
--- a/serve.c
+++ b/serve.c
@@ -53,7 +53,7 @@ struct protocol_capability {
 static struct protocol_capability capabilities[] = {
 	{ "agent", agent_advertise, NULL },
 	{ "ls-refs", always_advertise, ls_refs },
-	{ "fetch", always_advertise, upload_pack_v2 },
+	{ "fetch", upload_pack_advertise, upload_pack_v2 },
 };
 
 static void advertise_capabilities(void)
diff --git a/t/t5701-git-serve.sh b/t/t5701-git-serve.sh
index 202cb782d..491adc693 100755
--- a/t/t5701-git-serve.sh
+++ b/t/t5701-git-serve.sh
@@ -9,7 +9,7 @@ test_expect_success 'test capability advertisement' '
 	version 2
 	agent=git/$(git version | cut -d" " -f3)
 	ls-refs
-	fetch
+	fetch=shallow
 	0000
 	EOF
 
diff --git a/upload-pack.c b/upload-pack.c
index c6518a24d..a7e4f9e9c 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -710,7 +710,6 @@ static void deepen(int depth, int deepen_relative,
 	}
 
 	send_unshallow(shallows);
-	packet_flush(1);
 }
 
 static void deepen_by_rev_list(int ac, const char **av,
@@ -722,7 +721,52 @@ static void deepen_by_rev_list(int ac, const char **av,
 	send_shallow(result);
 	free_commit_list(result);
 	send_unshallow(shallows);
-	packet_flush(1);
+}
+
+static int send_shallow_list(int depth, int deepen_rev_list,
+			     timestamp_t deepen_since,
+			     struct string_list *deepen_not,
+			     struct object_array *shallows)
+{
+	int ret = 0;
+
+	if (depth > 0 && deepen_rev_list)
+		die("git upload-pack: deepen and deepen-since (or deepen-not) cannot be used together");
+	if (depth > 0) {
+		deepen(depth, deepen_relative, shallows);
+		ret = 1;
+	} else if (deepen_rev_list) {
+		struct argv_array av = ARGV_ARRAY_INIT;
+		int i;
+
+		argv_array_push(&av, "rev-list");
+		if (deepen_since)
+			argv_array_pushf(&av, "--max-age=%"PRItime, deepen_since);
+		if (deepen_not->nr) {
+			argv_array_push(&av, "--not");
+			for (i = 0; i < deepen_not->nr; i++) {
+				struct string_list_item *s = deepen_not->items + i;
+				argv_array_push(&av, s->string);
+			}
+			argv_array_push(&av, "--not");
+		}
+		for (i = 0; i < want_obj.nr; i++) {
+			struct object *o = want_obj.objects[i].item;
+			argv_array_push(&av, oid_to_hex(&o->oid));
+		}
+		deepen_by_rev_list(av.argc, av.argv, shallows);
+		argv_array_clear(&av);
+		ret = 1;
+	} else {
+		if (shallows->nr > 0) {
+			int i;
+			for (i = 0; i < shallows->nr; i++)
+				register_shallow(&shallows->objects[i].item->oid);
+		}
+	}
+
+	shallow_nr += shallows->nr;
+	return ret;
 }
 
 static int process_shallow(const char *line, struct object_array *shallows)
@@ -884,40 +928,10 @@ static void receive_needs(void)
 
 	if (depth == 0 && !deepen_rev_list && shallows.nr == 0)
 		return;
-	if (depth > 0 && deepen_rev_list)
-		die("git upload-pack: deepen and deepen-since (or deepen-not) cannot be used together");
-	if (depth > 0)
-		deepen(depth, deepen_relative, &shallows);
-	else if (deepen_rev_list) {
-		struct argv_array av = ARGV_ARRAY_INIT;
-		int i;
 
-		argv_array_push(&av, "rev-list");
-		if (deepen_since)
-			argv_array_pushf(&av, "--max-age=%"PRItime, deepen_since);
-		if (deepen_not.nr) {
-			argv_array_push(&av, "--not");
-			for (i = 0; i < deepen_not.nr; i++) {
-				struct string_list_item *s = deepen_not.items + i;
-				argv_array_push(&av, s->string);
-			}
-			argv_array_push(&av, "--not");
-		}
-		for (i = 0; i < want_obj.nr; i++) {
-			struct object *o = want_obj.objects[i].item;
-			argv_array_push(&av, oid_to_hex(&o->oid));
-		}
-		deepen_by_rev_list(av.argc, av.argv, &shallows);
-		argv_array_clear(&av);
-	}
-	else
-		if (shallows.nr > 0) {
-			int i;
-			for (i = 0; i < shallows.nr; i++)
-				register_shallow(&shallows.objects[i].item->oid);
-		}
-
-	shallow_nr += shallows.nr;
+	if (send_shallow_list(depth, deepen_rev_list, deepen_since,
+			      &deepen_not, &shallows))
+		packet_flush(1);
 	object_array_clear(&shallows);
 }
 
@@ -1071,6 +1085,13 @@ struct upload_pack_data {
 	struct object_array wants;
 	struct oid_array haves;
 
+	struct object_array shallows;
+	struct string_list deepen_not;
+	int depth;
+	timestamp_t deepen_since;
+	int deepen_rev_list;
+	int deepen_relative;
+
 	unsigned stateless_rpc : 1;
 
 	unsigned use_thin_pack : 1;
@@ -1080,12 +1101,14 @@ struct upload_pack_data {
 	unsigned done : 1;
 };
 
-#define UPLOAD_PACK_DATA_INIT { OBJECT_ARRAY_INIT, OID_ARRAY_INIT, 0, 0, 0, 0, 0, 0 }
+#define UPLOAD_PACK_DATA_INIT { OBJECT_ARRAY_INIT, OID_ARRAY_INIT, OBJECT_ARRAY_INIT, STRING_LIST_INIT_DUP, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }
 
 static void upload_pack_data_clear(struct upload_pack_data *data)
 {
 	object_array_clear(&data->wants);
 	oid_array_clear(&data->haves);
+	object_array_clear(&data->shallows);
+	string_list_clear(&data->deepen_not, 0);
 }
 
 static int parse_want(const char *line)
@@ -1170,6 +1193,22 @@ static void process_args(struct argv_array *args, struct upload_pack_data *data)
 			continue;
 		}
 
+		/* Shallow related arguments */
+		if (process_shallow(arg, &data->shallows))
+			continue;
+		if (process_deepen(arg, &data->depth))
+			continue;
+		if (process_deepen_since(arg, &data->deepen_since,
+					 &data->deepen_rev_list))
+			continue;
+		if (process_deepen_not(arg, &data->deepen_not,
+				       &data->deepen_rev_list))
+			continue;
+		if (!strcmp(arg, "deepen-relative")) {
+			data->deepen_relative = 1;
+			continue;
+		}
+
 		/* ignore unknown lines maybe? */
 		die("unexpect line: '%s'", arg);
 	}
@@ -1284,6 +1323,23 @@ static int process_haves_and_send_acks(struct upload_pack_data *data)
 	return ret;
 }
 
+static void send_shallow_info(struct upload_pack_data *data)
+{
+	/* No shallow info needs to be sent */
+	if (!data->depth && !data->deepen_rev_list && !data->shallows.nr &&
+	    !is_repository_shallow())
+		return;
+
+	packet_write_fmt(1, "shallow-info\n");
+
+	if (!send_shallow_list(data->depth, data->deepen_rev_list,
+			       data->deepen_since, &data->deepen_not,
+			       &data->shallows) && is_repository_shallow())
+		deepen(INFINITE_DEPTH, data->deepen_relative, &data->shallows);
+
+	packet_delim(1);
+}
+
 enum fetch_state {
 	FETCH_PROCESS_ARGS = 0,
 	FETCH_READ_HAVES,
@@ -1334,6 +1390,8 @@ int upload_pack_v2(struct repository *r, struct argv_array *keys,
 				state = FETCH_DONE;
 			break;
 		case FETCH_SEND_PACK:
+			send_shallow_info(&data);
+
 			packet_write_fmt(1, "packfile\n");
 			create_pack_file();
 			state = FETCH_DONE;
@@ -1346,3 +1404,11 @@ int upload_pack_v2(struct repository *r, struct argv_array *keys,
 	upload_pack_data_clear(&data);
 	return 0;
 }
+
+int upload_pack_advertise(struct repository *r,
+			  struct strbuf *value)
+{
+	if (value)
+		strbuf_addstr(value, "shallow");
+	return 1;
+}
diff --git a/upload-pack.h b/upload-pack.h
index 6b7890238..7720f2142 100644
--- a/upload-pack.h
+++ b/upload-pack.h
@@ -14,5 +14,8 @@ struct repository;
 struct argv_array;
 extern int upload_pack_v2(struct repository *r, struct argv_array *keys,
 			  struct argv_array *args);
+struct strbuf;
+extern int upload_pack_advertise(struct repository *r,
+				 struct strbuf *value);
 
 #endif /* UPLOAD_PACK_H */
-- 
2.16.0.rc1.238.g530d649a79-goog



* [PATCH v3 23/35] fetch-pack: support shallow requests
  2018-02-07  1:12   ` [PATCH v3 00/35] " Brandon Williams
                       ` (21 preceding siblings ...)
  2018-02-07  1:12     ` [PATCH v3 22/35] upload-pack: support shallow requests Brandon Williams
@ 2018-02-07  1:13     ` Brandon Williams
  2018-02-23 19:37       ` Jonathan Tan
  2018-02-07  1:13     ` [PATCH v3 24/35] connect: refactor git_connect to only get the protocol version once Brandon Williams
                       ` (14 subsequent siblings)
  37 siblings, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-02-07  1:13 UTC (permalink / raw)
  To: git
  Cc: sbeller, peff, gitster, jrnieder, stolee, git, pclouds, Brandon Williams

Enable shallow clones and deepen requests using protocol version 2 if
the server's 'fetch' command supports the 'shallow' feature.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
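To make the capability check concrete, here is a standalone sketch of
how a client might decide that the advertised line "fetch=shallow"
grants the 'shallow' feature. It is an approximation, not the code in
this patch: has_feature() and server_has_feature() are hypothetical
names, and the value is assumed to be a space-separated list matched by
exact token, whereas the real code defers to parse_feature_request().

```c
#include <assert.h>
#include <string.h>

/*
 * Return 1 if the capability line `cap` is "<cmd>=<features>" and
 * `feature` appears as a whole token in the space-separated list.
 */
static int has_feature(const char *cap, const char *cmd, const char *feature)
{
	size_t cmd_len = strlen(cmd), feat_len = strlen(feature);
	const char *p;

	assert(cmd_len > 0 && feat_len > 0);
	if (strncmp(cap, cmd, cmd_len) != 0 || cap[cmd_len] != '=')
		return 0;

	for (p = cap + cmd_len + 1; *p; ) {
		size_t tok = strcspn(p, " ");

		if (tok == feat_len && !memcmp(p, feature, feat_len))
			return 1;
		p += tok;
		while (*p == ' ')
			p++;
	}
	return 0;
}

/* Scan every advertised capability line for cmd's feature. */
static int server_has_feature(const char **caps, int nr,
			      const char *cmd, const char *feature)
{
	int i;

	for (i = 0; i < nr; i++)
		if (has_feature(caps[i], cmd, feature))
			return 1;
	return 0;
}
```

A bare "fetch" line (no '=' and no feature list) reports no features,
so a client built along these lines would refrain from sending any
shallow or deepen lines to a server that only advertises plain 'fetch'.
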
 connect.c    | 22 +++++++++++++++++++
 connect.h    |  2 ++
 fetch-pack.c | 69 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 3 files changed, 92 insertions(+), 1 deletion(-)

diff --git a/connect.c b/connect.c
index 7cb1f1df7..9577528f3 100644
--- a/connect.c
+++ b/connect.c
@@ -82,6 +82,28 @@ int server_supports_v2(const char *c, int die_on_error)
 	return 0;
 }
 
+int server_supports_feature(const char *c, const char *feature,
+			    int die_on_error)
+{
+	int i;
+
+	for (i = 0; i < server_capabilities_v2.argc; i++) {
+		const char *out;
+		if (skip_prefix(server_capabilities_v2.argv[i], c, &out) &&
+		    (!*out || *(out++) == '=')) {
+			if (parse_feature_request(out, feature))
+				return 1;
+			else
+				break;
+		}
+	}
+
+	if (die_on_error)
+		die("server doesn't support feature '%s'", feature);
+
+	return 0;
+}
+
 static void process_capabilities_v2(struct packet_reader *reader)
 {
 	while (packet_reader_read(reader) == PACKET_READ_NORMAL)
diff --git a/connect.h b/connect.h
index 8898d4495..0e69c6709 100644
--- a/connect.h
+++ b/connect.h
@@ -17,5 +17,7 @@ struct packet_reader;
 extern enum protocol_version discover_version(struct packet_reader *reader);
 
 extern int server_supports_v2(const char *c, int die_on_error);
+extern int server_supports_feature(const char *c, const char *feature,
+				   int die_on_error);
 
 #endif
diff --git a/fetch-pack.c b/fetch-pack.c
index 4fb5805dd..c0807e219 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1008,6 +1008,26 @@ static struct ref *do_fetch_pack(struct fetch_pack_args *args,
 	return ref;
 }
 
+static void add_shallow_requests(struct strbuf *req_buf,
+				 const struct fetch_pack_args *args)
+{
+	if (is_repository_shallow())
+		write_shallow_commits(req_buf, 1, NULL);
+	if (args->depth > 0)
+		packet_buf_write(req_buf, "deepen %d", args->depth);
+	if (args->deepen_since) {
+		timestamp_t max_age = approxidate(args->deepen_since);
+		packet_buf_write(req_buf, "deepen-since %"PRItime, max_age);
+	}
+	if (args->deepen_not) {
+		int i;
+		for (i = 0; i < args->deepen_not->nr; i++) {
+			struct string_list_item *s = args->deepen_not->items + i;
+			packet_buf_write(req_buf, "deepen-not %s", s->string);
+		}
+	}
+}
+
 static void add_wants(const struct ref *wants, struct strbuf *req_buf)
 {
 	for ( ; wants ; wants = wants->next) {
@@ -1090,6 +1110,10 @@ static int send_fetch_request(int fd_out, const struct fetch_pack_args *args,
 	if (prefer_ofs_delta)
 		packet_buf_write(&req_buf, "ofs-delta");
 
+	/* Add shallow-info and deepen request */
+	if (server_supports_feature("fetch", "shallow", 1))
+		add_shallow_requests(&req_buf, args);
+
 	/* add wants */
 	add_wants(wants, &req_buf);
 
@@ -1119,7 +1143,7 @@ static int process_section_header(struct packet_reader *reader,
 	int ret;
 
 	if (packet_reader_peek(reader) != PACKET_READ_NORMAL)
-		die("error reading packet");
+		die("error reading section header '%s'", section);
 
 	ret = !strcmp(reader->line, section);
 
@@ -1174,6 +1198,43 @@ static int process_acks(struct packet_reader *reader, struct oidset *common)
 	return received_ready ? 2 : (received_ack ? 1 : 0);
 }
 
+static void receive_shallow_info(struct fetch_pack_args *args,
+				 struct packet_reader *reader)
+{
+	process_section_header(reader, "shallow-info", 0);
+	while (packet_reader_read(reader) == PACKET_READ_NORMAL) {
+		const char *arg;
+		struct object_id oid;
+
+		if (skip_prefix(reader->line, "shallow ", &arg)) {
+			if (get_oid_hex(arg, &oid))
+				die(_("invalid shallow line: %s"), reader->line);
+			register_shallow(&oid);
+			continue;
+		}
+		if (skip_prefix(reader->line, "unshallow ", &arg)) {
+			if (get_oid_hex(arg, &oid))
+				die(_("invalid unshallow line: %s"), reader->line);
+			if (!lookup_object(oid.hash))
+				die(_("object not found: %s"), reader->line);
+			/* make sure that it is parsed as shallow */
+			if (!parse_object(&oid))
+				die(_("error in object: %s"), reader->line);
+			if (unregister_shallow(&oid))
+				die(_("no shallow found: %s"), reader->line);
+			continue;
+		}
+		die(_("expected shallow/unshallow, got %s"), reader->line);
+	}
+
+	if (reader->status != PACKET_READ_FLUSH &&
+	    reader->status != PACKET_READ_DELIM)
+		die("error processing shallow info: %d", reader->status);
+
+	setup_alternate_shallow(&shallow_lock, &alternate_shallow_file, NULL);
+	args->deepen = 1;
+}
+
 enum fetch_state {
 	FETCH_CHECK_LOCAL = 0,
 	FETCH_SEND_REQUEST,
@@ -1205,6 +1266,8 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
 			/* v2 supports these by default */
 			allow_unadvertised_object_request |= ALLOW_REACHABLE_SHA1;
 			use_sideband = 2;
+			if (args->depth > 0 || args->deepen_since || args->deepen_not)
+				args->deepen = 1;
 
 			/* Filter 'ref' by 'sought' and those that aren't local */
 			if (everything_local(args, &ref, sought, nr_sought))
@@ -1233,6 +1296,10 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
 			}
 			break;
 		case FETCH_GET_PACK:
+			/* Check for shallow-info section */
+			if (process_section_header(&reader, "shallow-info", 1))
+				receive_shallow_info(args, &reader);
+
 			/* get the pack */
 			process_section_header(&reader, "packfile", 0);
 			if (get_pack(args, fd, pack_lockfile))
-- 
2.16.0.rc1.238.g530d649a79-goog



* [PATCH v3 24/35] connect: refactor git_connect to only get the protocol version once
  2018-02-07  1:12   ` [PATCH v3 00/35] " Brandon Williams
                       ` (22 preceding siblings ...)
  2018-02-07  1:13     ` [PATCH v3 23/35] fetch-pack: " Brandon Williams
@ 2018-02-07  1:13     ` Brandon Williams
  2018-02-21 23:51       ` Jonathan Tan
  2018-02-07  1:13     ` [PATCH v3 25/35] connect: don't request v2 when pushing Brandon Williams
                       ` (13 subsequent siblings)
  37 siblings, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-02-07  1:13 UTC (permalink / raw)
  To: git
  Cc: sbeller, peff, gitster, jrnieder, stolee, git, pclouds, Brandon Williams

Instead of having each builtin transport ask which protocol version
the user has configured in 'protocol.version' by calling
`get_protocol_version_config()` multiple times, factor this logic out
so there is just a single call at the beginning of `git_connect()`.

This will be helpful in the next patch where we can have centralized
logic which determines if we need to request a different protocol
version than what the user has configured.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
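As a reminder of what the 'version' argument ends up controlling for the
git:// transport, below is a rough standalone sketch of the request that
'git_connect_git()' assembles: the service and path, a NUL, "host=...",
a NUL, and (only when a version greater than 0 is requested) a second NUL
followed by "version=N" and a final NUL.  The function name and the fixed
output buffer are assumptions made for the example; the real code uses a
'struct strbuf' and sends the result with 'packet_write()'.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Build the payload of a git:// request into `out`.
 * Returns the number of bytes used (including the embedded NULs),
 * or 0 if `out` is too small.
 */
static size_t sketch_git_request(char *out, size_t out_size,
				 const char *prog, const char *path,
				 const char *host, int version)
{
	size_t len;
	int n;

	assert(out && prog && path && host);

	n = snprintf(out, out_size, "%s %s", prog, path);
	if (n < 0 || (size_t)n >= out_size)
		return 0;
	len = (size_t)n + 1;	/* keep the NUL as a separator */

	n = snprintf(out + len, out_size - len, "host=%s", host);
	if (n < 0 || (size_t)n >= out_size - len)
		return 0;
	len += (size_t)n + 1;

	if (version > 0) {
		if (len >= out_size)
			return 0;
		out[len++] = '\0';	/* second NUL starts extra parameters */
		n = snprintf(out + len, out_size - len, "version=%d", version);
		if (n < 0 || (size_t)n >= out_size - len)
			return 0;
		len += (size_t)n + 1;
	}
	return len;
}
```

Passing the version in once from 'git_connect()' means a caller can
override it (for example, forcing 0) without every transport having to
re-read the configuration.
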
 connect.c | 27 +++++++++++++++------------
 1 file changed, 15 insertions(+), 12 deletions(-)

diff --git a/connect.c b/connect.c
index 9577528f3..dbf4def65 100644
--- a/connect.c
+++ b/connect.c
@@ -1029,6 +1029,7 @@ static enum ssh_variant determine_ssh_variant(const char *ssh_command,
  */
 static struct child_process *git_connect_git(int fd[2], char *hostandport,
 					     const char *path, const char *prog,
+					     enum protocol_version version,
 					     int flags)
 {
 	struct child_process *conn;
@@ -1067,10 +1068,10 @@ static struct child_process *git_connect_git(int fd[2], char *hostandport,
 		    target_host, 0);
 
 	/* If using a new version put that stuff here after a second null byte */
-	if (get_protocol_version_config() > 0) {
+	if (version > 0) {
 		strbuf_addch(&request, '\0');
 		strbuf_addf(&request, "version=%d%c",
-			    get_protocol_version_config(), '\0');
+			    version, '\0');
 	}
 
 	packet_write(fd[1], request.buf, request.len);
@@ -1086,14 +1087,14 @@ static struct child_process *git_connect_git(int fd[2], char *hostandport,
  */
 static void push_ssh_options(struct argv_array *args, struct argv_array *env,
 			     enum ssh_variant variant, const char *port,
-			     int flags)
+			     enum protocol_version version, int flags)
 {
 	if (variant == VARIANT_SSH &&
-	    get_protocol_version_config() > 0) {
+	    version > 0) {
 		argv_array_push(args, "-o");
 		argv_array_push(args, "SendEnv=" GIT_PROTOCOL_ENVIRONMENT);
 		argv_array_pushf(env, GIT_PROTOCOL_ENVIRONMENT "=version=%d",
-				 get_protocol_version_config());
+				 version);
 	}
 
 	if (flags & CONNECT_IPV4) {
@@ -1146,7 +1147,8 @@ static void push_ssh_options(struct argv_array *args, struct argv_array *env,
 
 /* Prepare a child_process for use by Git's SSH-tunneled transport. */
 static void fill_ssh_args(struct child_process *conn, const char *ssh_host,
-			  const char *port, int flags)
+			  const char *port, enum protocol_version version,
+			  int flags)
 {
 	const char *ssh;
 	enum ssh_variant variant;
@@ -1180,14 +1182,14 @@ static void fill_ssh_args(struct child_process *conn, const char *ssh_host,
 		argv_array_push(&detect.args, ssh);
 		argv_array_push(&detect.args, "-G");
 		push_ssh_options(&detect.args, &detect.env_array,
-				 VARIANT_SSH, port, flags);
+				 VARIANT_SSH, port, version, flags);
 		argv_array_push(&detect.args, ssh_host);
 
 		variant = run_command(&detect) ? VARIANT_SIMPLE : VARIANT_SSH;
 	}
 
 	argv_array_push(&conn->args, ssh);
-	push_ssh_options(&conn->args, &conn->env_array, variant, port, flags);
+	push_ssh_options(&conn->args, &conn->env_array, variant, port, version, flags);
 	argv_array_push(&conn->args, ssh_host);
 }
 
@@ -1208,6 +1210,7 @@ struct child_process *git_connect(int fd[2], const char *url,
 	char *hostandport, *path;
 	struct child_process *conn;
 	enum protocol protocol;
+	enum protocol_version version = get_protocol_version_config();
 
 	/* Without this we cannot rely on waitpid() to tell
 	 * what happened to our children.
@@ -1222,7 +1225,7 @@ struct child_process *git_connect(int fd[2], const char *url,
 		printf("Diag: path=%s\n", path ? path : "NULL");
 		conn = NULL;
 	} else if (protocol == PROTO_GIT) {
-		conn = git_connect_git(fd, hostandport, path, prog, flags);
+		conn = git_connect_git(fd, hostandport, path, prog, version, flags);
 	} else {
 		struct strbuf cmd = STRBUF_INIT;
 		const char *const *var;
@@ -1265,12 +1268,12 @@ struct child_process *git_connect(int fd[2], const char *url,
 				strbuf_release(&cmd);
 				return NULL;
 			}
-			fill_ssh_args(conn, ssh_host, port, flags);
+			fill_ssh_args(conn, ssh_host, port, version, flags);
 		} else {
 			transport_check_allowed("file");
-			if (get_protocol_version_config() > 0) {
+			if (version > 0) {
 				argv_array_pushf(&conn->env_array, GIT_PROTOCOL_ENVIRONMENT "=version=%d",
-						 get_protocol_version_config());
+						 version);
 			}
 		}
 		argv_array_push(&conn->args, cmd.buf);
-- 
2.16.0.rc1.238.g530d649a79-goog



* [PATCH v3 25/35] connect: don't request v2 when pushing
  2018-02-07  1:12   ` [PATCH v3 00/35] " Brandon Williams
                       ` (23 preceding siblings ...)
  2018-02-07  1:13     ` [PATCH v3 24/35] connect: refactor git_connect to only get the protocol version once Brandon Williams
@ 2018-02-07  1:13     ` Brandon Williams
  2018-02-07  1:13     ` [PATCH v3 26/35] transport-helper: remove name parameter Brandon Williams
                       ` (12 subsequent siblings)
  37 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-02-07  1:13 UTC (permalink / raw)
  To: git
  Cc: sbeller, peff, gitster, jrnieder, stolee, git, pclouds, Brandon Williams

In order to be able to ship protocol v2 with only fetch supported, we
need clients to not request protocol v2 when pushing (since the client
currently doesn't know how to push using protocol v2).  This allows a
client to have protocol v2 configured in `protocol.version`, take
advantage of v2 for fetch, and fall back to v0 when pushing while v2
for push is being designed.

We could run into issues if we didn't fall back to protocol v0 when
pushing right now.  This is because currently a server will ignore a request to
use v2 when contacting the 'receive-pack' endpoint and fall back to
using v0, but when push v2 is rolled out to servers, the 'receive-pack'
endpoint will start responding using v2.  So we don't want to get into a
state where a client is requesting to push with v2 before they actually
know how to push using v2.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
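The fallback rule itself is small enough to state as a pure function.
The sketch below is only meant to make the intended behaviour easy to
check; the enum and function names are made up for the example and are
not the ones used in connect.c.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical mirror of the protocol versions discussed here. */
enum sketch_version {
	SKETCH_V0 = 0,
	SKETCH_V1 = 1,
	SKETCH_V2 = 2
};

/*
 * Decide which version to request, given the configured version and
 * the remote program about to be run.  A push (git-receive-pack)
 * never requests v2, since the client cannot push with v2 yet.
 */
static enum sketch_version sketch_version_to_request(
		enum sketch_version configured, const char *prog)
{
	assert(prog != NULL);
	if (configured == SKETCH_V2 && !strcmp(prog, "git-receive-pack"))
		return SKETCH_V0;
	return configured;
}
```

Only the combination of v2 and 'git-receive-pack' is rewritten; a client
configured for v1 keeps requesting v1 when pushing.
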
 connect.c              |  8 ++++++++
 t/t5702-protocol-v2.sh | 22 ++++++++++++++++++++++
 2 files changed, 30 insertions(+)

diff --git a/connect.c b/connect.c
index dbf4def65..37a6a8935 100644
--- a/connect.c
+++ b/connect.c
@@ -1212,6 +1212,14 @@ struct child_process *git_connect(int fd[2], const char *url,
 	enum protocol protocol;
 	enum protocol_version version = get_protocol_version_config();
 
+	/*
+	 * NEEDSWORK: If we are trying to use protocol v2 and we are planning
+	 * to perform a push, then fallback to v0 since the client doesn't know
+	 * how to push yet using v2.
+	 */
+	if (version == protocol_v2 && !strcmp("git-receive-pack", prog))
+		version = protocol_v0;
+
 	/* Without this we cannot rely on waitpid() to tell
 	 * what happened to our children.
 	 */
diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh
index 16304d1c8..60e43bcf5 100755
--- a/t/t5702-protocol-v2.sh
+++ b/t/t5702-protocol-v2.sh
@@ -82,6 +82,28 @@ test_expect_success 'pull with git:// using protocol v2' '
 	grep "fetch< version 2" log
 '
 
+test_expect_success 'push with git:// and a config of v2 does not request v2' '
+	# Till v2 for push is designed, make sure that if a client has
+	# protocol.version configured to use v2, that the client instead falls
+	# back and uses v0.
+
+	test_commit -C daemon_child three &&
+
+	# Push to another branch, as the target repository has the
+	# master branch checked out and we cannot push into it.
+	GIT_TRACE_PACKET=1 git -C daemon_child -c protocol.version=2 \
+		push origin HEAD:client_branch 2>log &&
+
+	git -C daemon_child log -1 --format=%s >actual &&
+	git -C "$daemon_parent" log -1 --format=%s client_branch >expect &&
+	test_cmp expect actual &&
+
+	# Client requested to use protocol v2
+	! grep "push> .*\\\0\\\0version=2\\\0$" log &&
+	# Server responded using protocol v2
+	! grep "push< version 2" log
+'
+
 stop_git_daemon
 
 # Test protocol v2 with 'file://' transport
-- 
2.16.0.rc1.238.g530d649a79-goog



* [PATCH v3 26/35] transport-helper: remove name parameter
  2018-02-07  1:12   ` [PATCH v3 00/35] " Brandon Williams
                       ` (24 preceding siblings ...)
  2018-02-07  1:13     ` [PATCH v3 25/35] connect: don't request v2 when pushing Brandon Williams
@ 2018-02-07  1:13     ` Brandon Williams
  2018-02-27 23:03       ` Jonathan Nieder
  2018-02-07  1:13     ` [PATCH v3 27/35] transport-helper: refactor process_connect_service Brandon Williams
                       ` (11 subsequent siblings)
  37 siblings, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-02-07  1:13 UTC (permalink / raw)
  To: git
  Cc: sbeller, peff, gitster, jrnieder, stolee, git, pclouds, Brandon Williams

Commit 266f1fdfa (transport-helper: be quiet on read errors from
helpers, 2013-06-21) removed a call to 'die()' which printed the name of
the remote helper passed in to the 'recvline_fh()' function using the
'name' parameter.  Once the call to 'die()' was removed the parameter
was no longer necessary but wasn't removed.  Clean up 'recvline_fh()'
parameter list by removing the 'name' parameter.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 transport-helper.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/transport-helper.c b/transport-helper.c
index 4c334b5ee..d72155768 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -49,7 +49,7 @@ static void sendline(struct helper_data *helper, struct strbuf *buffer)
 		die_errno("Full write to remote helper failed");
 }
 
-static int recvline_fh(FILE *helper, struct strbuf *buffer, const char *name)
+static int recvline_fh(FILE *helper, struct strbuf *buffer)
 {
 	strbuf_reset(buffer);
 	if (debug)
@@ -67,7 +67,7 @@ static int recvline_fh(FILE *helper, struct strbuf *buffer, const char *name)
 
 static int recvline(struct helper_data *helper, struct strbuf *buffer)
 {
-	return recvline_fh(helper->out, buffer, helper->name);
+	return recvline_fh(helper->out, buffer);
 }
 
 static void write_constant(int fd, const char *str)
@@ -586,7 +586,7 @@ static int process_connect_service(struct transport *transport,
 		goto exit;
 
 	sendline(data, &cmdbuf);
-	if (recvline_fh(input, &cmdbuf, name))
+	if (recvline_fh(input, &cmdbuf))
 		exit(128);
 
 	if (!strcmp(cmdbuf.buf, "")) {
-- 
2.16.0.rc1.238.g530d649a79-goog



* [PATCH v3 27/35] transport-helper: refactor process_connect_service
  2018-02-07  1:12   ` [PATCH v3 00/35] " Brandon Williams
                       ` (25 preceding siblings ...)
  2018-02-07  1:13     ` [PATCH v3 26/35] transport-helper: remove name parameter Brandon Williams
@ 2018-02-07  1:13     ` Brandon Williams
  2018-02-07  1:13     ` [PATCH v3 28/35] transport-helper: introduce stateless-connect Brandon Williams
                       ` (10 subsequent siblings)
  37 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-02-07  1:13 UTC (permalink / raw)
  To: git
  Cc: sbeller, peff, gitster, jrnieder, stolee, git, pclouds, Brandon Williams

A future patch will need to reuse the logic which runs the connect
command on a remote helper and processes its response, so factor this
logic out of 'process_connect_service()' and place it into a helper
function 'run_connect()'.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 transport-helper.c | 67 +++++++++++++++++++++++++++++++-----------------------
 1 file changed, 38 insertions(+), 29 deletions(-)

diff --git a/transport-helper.c b/transport-helper.c
index d72155768..c032a2a87 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -545,14 +545,13 @@ static int fetch_with_import(struct transport *transport,
 	return 0;
 }
 
-static int process_connect_service(struct transport *transport,
-				   const char *name, const char *exec)
+static int run_connect(struct transport *transport, struct strbuf *cmdbuf)
 {
 	struct helper_data *data = transport->data;
-	struct strbuf cmdbuf = STRBUF_INIT;
-	struct child_process *helper;
-	int r, duped, ret = 0;
+	int ret = 0;
+	int duped;
 	FILE *input;
+	struct child_process *helper;
 
 	helper = get_helper(transport);
 
@@ -568,44 +567,54 @@ static int process_connect_service(struct transport *transport,
 	input = xfdopen(duped, "r");
 	setvbuf(input, NULL, _IONBF, 0);
 
+	sendline(data, cmdbuf);
+	if (recvline_fh(input, cmdbuf))
+		exit(128);
+
+	if (!strcmp(cmdbuf->buf, "")) {
+		data->no_disconnect_req = 1;
+		if (debug)
+			fprintf(stderr, "Debug: Smart transport connection "
+				"ready.\n");
+		ret = 1;
+	} else if (!strcmp(cmdbuf->buf, "fallback")) {
+		if (debug)
+			fprintf(stderr, "Debug: Falling back to dumb "
+				"transport.\n");
+	} else {
+		die("Unknown response to connect: %s",
+			cmdbuf->buf);
+	}
+
+	fclose(input);
+	return ret;
+}
+
+static int process_connect_service(struct transport *transport,
+				   const char *name, const char *exec)
+{
+	struct helper_data *data = transport->data;
+	struct strbuf cmdbuf = STRBUF_INIT;
+	int ret = 0;
+
 	/*
 	 * Handle --upload-pack and friends. This is fire and forget...
 	 * just warn if it fails.
 	 */
 	if (strcmp(name, exec)) {
-		r = set_helper_option(transport, "servpath", exec);
+		int r = set_helper_option(transport, "servpath", exec);
 		if (r > 0)
 			warning("Setting remote service path not supported by protocol.");
 		else if (r < 0)
 			warning("Invalid remote service path.");
 	}
 
-	if (data->connect)
+	if (data->connect) {
 		strbuf_addf(&cmdbuf, "connect %s\n", name);
-	else
-		goto exit;
-
-	sendline(data, &cmdbuf);
-	if (recvline_fh(input, &cmdbuf))
-		exit(128);
-
-	if (!strcmp(cmdbuf.buf, "")) {
-		data->no_disconnect_req = 1;
-		if (debug)
-			fprintf(stderr, "Debug: Smart transport connection "
-				"ready.\n");
-		ret = 1;
-	} else if (!strcmp(cmdbuf.buf, "fallback")) {
-		if (debug)
-			fprintf(stderr, "Debug: Falling back to dumb "
-				"transport.\n");
-	} else
-		die("Unknown response to connect: %s",
-			cmdbuf.buf);
+		ret = run_connect(transport, &cmdbuf);
+	}
 
-exit:
 	strbuf_release(&cmdbuf);
-	fclose(input);
 	return ret;
 }
 
-- 
2.16.0.rc1.238.g530d649a79-goog



* [PATCH v3 28/35] transport-helper: introduce stateless-connect
  2018-02-07  1:12   ` [PATCH v3 00/35] " Brandon Williams
                       ` (26 preceding siblings ...)
  2018-02-07  1:13     ` [PATCH v3 27/35] transport-helper: refactor process_connect_service Brandon Williams
@ 2018-02-07  1:13     ` Brandon Williams
  2018-02-22  0:01       ` Jonathan Tan
  2018-02-27 23:30       ` Jonathan Nieder
  2018-02-07  1:13     ` [PATCH v3 29/35] pkt-line: add packet_buf_write_len function Brandon Williams
                       ` (9 subsequent siblings)
  37 siblings, 2 replies; 362+ messages in thread
From: Brandon Williams @ 2018-02-07  1:13 UTC (permalink / raw)
  To: git
  Cc: sbeller, peff, gitster, jrnieder, stolee, git, pclouds, Brandon Williams

Introduce the transport-helper capability 'stateless-connect'.  This
capability indicates that the transport-helper can be requested to run
the 'stateless-connect' command which should attempt to make a
stateless connection with a remote end.  Once established, the
connection can be used by the git client to communicate with
the remote end natively in a stateless-rpc manner as supported by
protocol v2.  This means that the client must send everything the server
needs in a single request as the client must not assume any
state-storing on the part of the server or transport.

If a stateless connection cannot be established then the remote-helper
will respond in the same manner as the 'connect' command indicating that
the client should fall back to using the dumb remote-helper commands.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
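To spell out the decision described above, here is a small standalone
sketch of how the helper's one-line reply is interpreted and when the
transport ends up in stateless-rpc mode.  The names are hypothetical and
the helper I/O is left out; the reply is assumed to be the line already
read back from the remote helper.

```c
#include <assert.h>
#include <string.h>

enum sketch_reply {
	SKETCH_CONNECTED,	/* empty line: connection established */
	SKETCH_FALLBACK,	/* "fallback": use the dumb commands */
	SKETCH_UNKNOWN		/* anything else is an error */
};

/* Classify the line a helper sends back after a connect-style command. */
static enum sketch_reply sketch_classify_reply(const char *line)
{
	assert(line != NULL);
	if (!strcmp(line, ""))
		return SKETCH_CONNECTED;
	if (!strcmp(line, "fallback"))
		return SKETCH_FALLBACK;
	return SKETCH_UNKNOWN;
}

/*
 * Return 1 if the transport should operate in stateless-rpc mode.
 * As in this patch, 'connect' is preferred when the helper has it,
 * and stateless-rpc is only enabled after a successful
 * 'stateless-connect'.
 */
static int sketch_wants_stateless_rpc(int has_connect,
				      int has_stateless_connect,
				      const char *reply)
{
	if (has_connect)
		return 0;
	if (!has_stateless_connect)
		return 0;
	return sketch_classify_reply(reply) == SKETCH_CONNECTED;
}
```

A "fallback" reply leaves the transport out of stateless-rpc mode, so
the caller continues with the dumb remote-helper commands.
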
 transport-helper.c | 8 ++++++++
 transport.c        | 1 +
 transport.h        | 6 ++++++
 3 files changed, 15 insertions(+)

diff --git a/transport-helper.c b/transport-helper.c
index c032a2a87..82eb57c4a 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -26,6 +26,7 @@ struct helper_data {
 		option : 1,
 		push : 1,
 		connect : 1,
+		stateless_connect : 1,
 		signed_tags : 1,
 		check_connectivity : 1,
 		no_disconnect_req : 1,
@@ -188,6 +189,8 @@ static struct child_process *get_helper(struct transport *transport)
 			refspecs[refspec_nr++] = xstrdup(arg);
 		} else if (!strcmp(capname, "connect")) {
 			data->connect = 1;
+		} else if (!strcmp(capname, "stateless-connect")) {
+			data->stateless_connect = 1;
 		} else if (!strcmp(capname, "signed-tags")) {
 			data->signed_tags = 1;
 		} else if (skip_prefix(capname, "export-marks ", &arg)) {
@@ -612,6 +615,11 @@ static int process_connect_service(struct transport *transport,
 	if (data->connect) {
 		strbuf_addf(&cmdbuf, "connect %s\n", name);
 		ret = run_connect(transport, &cmdbuf);
+	} else if (data->stateless_connect) {
+		strbuf_addf(&cmdbuf, "stateless-connect %s\n", name);
+		ret = run_connect(transport, &cmdbuf);
+		if (ret)
+			transport->stateless_rpc = 1;
 	}
 
 	strbuf_release(&cmdbuf);
diff --git a/transport.c b/transport.c
index c275f46ed..9125174f7 100644
--- a/transport.c
+++ b/transport.c
@@ -250,6 +250,7 @@ static int fetch_refs_via_pack(struct transport *transport,
 		data->options.check_self_contained_and_connected;
 	args.cloning = transport->cloning;
 	args.update_shallow = data->options.update_shallow;
+	args.stateless_rpc = transport->stateless_rpc;
 
 	if (!data->got_remote_heads)
 		refs_tmp = get_refs_via_connect(transport, 0, NULL);
diff --git a/transport.h b/transport.h
index 4b656f315..9eac809ee 100644
--- a/transport.h
+++ b/transport.h
@@ -55,6 +55,12 @@ struct transport {
 	 */
 	unsigned cloning : 1;
 
+	/*
+	 * Indicates that the transport is connected via a half-duplex
+	 * connection and should operate in stateless-rpc mode.
+	 */
+	unsigned stateless_rpc : 1;
+
 	/*
 	 * These strings will be passed to the {pre, post}-receive hook,
 	 * on the remote side, if both sides support the push options capability.
-- 
2.16.0.rc1.238.g530d649a79-goog



* [PATCH v3 29/35] pkt-line: add packet_buf_write_len function
  2018-02-07  1:12   ` [PATCH v3 00/35] " Brandon Williams
                       ` (27 preceding siblings ...)
  2018-02-07  1:13     ` [PATCH v3 28/35] transport-helper: introduce stateless-connect Brandon Williams
@ 2018-02-07  1:13     ` Brandon Williams
  2018-02-27 23:11       ` Jonathan Nieder
  2018-02-07  1:13     ` [PATCH v3 30/35] remote-curl: create copy of the service name Brandon Williams
                       ` (8 subsequent siblings)
  37 siblings, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-02-07  1:13 UTC (permalink / raw)
  To: git
  Cc: sbeller, peff, gitster, jrnieder, stolee, git, pclouds, Brandon Williams

Add the 'packet_buf_write_len()' function which allows for writing an
arbitrary length buffer into a 'struct strbuf' and formatting it in
packet-line format.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
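For anyone unfamiliar with the framing, the following standalone sketch
shows the packet-line layout that 'packet_buf_write_len()' produces: a
four-digit lowercase hex length that counts the header itself, followed
by the raw payload.  The function name, the fixed output buffer, and the
SKETCH_PACKET_MAX constant (assumed to match LARGE_PACKET_MAX) are made
up for the example; the real function appends to a 'struct strbuf' and
dies on overlong input.

```c
#include <assert.h>
#include <string.h>

/* Assumed to mirror LARGE_PACKET_MAX from pkt-line.h. */
#define SKETCH_PACKET_MAX 65520

/*
 * Frame `len` bytes of `data` as one pkt-line in `out`.
 * Returns the total number of bytes written (header plus payload),
 * or 0 if the packet would be too long or `out` is too small.
 */
static size_t sketch_pkt_line(char *out, size_t out_size,
			      const char *data, size_t len)
{
	static const char hex[] = "0123456789abcdef";
	size_t n = len + 4;	/* the length field counts itself */

	assert(out && data);
	if (n > SKETCH_PACKET_MAX || n > out_size)
		return 0;

	out[0] = hex[(n >> 12) & 0xf];
	out[1] = hex[(n >> 8) & 0xf];
	out[2] = hex[(n >> 4) & 0xf];
	out[3] = hex[n & 0xf];
	memcpy(out + 4, data, len);
	return n;
}
```

For example, the six bytes "hello\n" are framed as "000ahello\n", since
6 bytes of payload plus the 4-byte header is 10, or 0x000a.
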
 pkt-line.c | 16 ++++++++++++++++
 pkt-line.h |  1 +
 2 files changed, 17 insertions(+)

diff --git a/pkt-line.c b/pkt-line.c
index 726e109ca..5a8a17ecc 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -215,6 +215,22 @@ void packet_buf_write(struct strbuf *buf, const char *fmt, ...)
 	va_end(args);
 }
 
+void packet_buf_write_len(struct strbuf *buf, const char *data, size_t len)
+{
+	size_t orig_len, n;
+
+	orig_len = buf->len;
+	strbuf_addstr(buf, "0000");
+	strbuf_add(buf, data, len);
+	n = buf->len - orig_len;
+
+	if (n > LARGE_PACKET_MAX)
+		die("protocol error: impossibly long line");
+
+	set_packet_header(&buf->buf[orig_len], n);
+	packet_trace(buf->buf + orig_len + 4, n - 4, 1);
+}
+
 int write_packetized_from_fd(int fd_in, int fd_out)
 {
 	static char buf[LARGE_PACKET_DATA_MAX];
diff --git a/pkt-line.h b/pkt-line.h
index 16fe8bdbf..63724d4bf 100644
--- a/pkt-line.h
+++ b/pkt-line.h
@@ -26,6 +26,7 @@ void packet_buf_flush(struct strbuf *buf);
 void packet_buf_delim(struct strbuf *buf);
 void packet_write(int fd_out, const char *buf, size_t size);
 void packet_buf_write(struct strbuf *buf, const char *fmt, ...) __attribute__((format (printf, 2, 3)));
+void packet_buf_write_len(struct strbuf *buf, const char *data, size_t len);
 int packet_flush_gently(int fd);
 int packet_write_fmt_gently(int fd, const char *fmt, ...) __attribute__((format (printf, 2, 3)));
 int write_packetized_from_fd(int fd_in, int fd_out);
-- 
2.16.0.rc1.238.g530d649a79-goog



* [PATCH v3 30/35] remote-curl: create copy of the service name
  2018-02-07  1:12   ` [PATCH v3 00/35] " Brandon Williams
                       ` (28 preceding siblings ...)
  2018-02-07  1:13     ` [PATCH v3 29/35] pkt-line: add packet_buf_write_len function Brandon Williams
@ 2018-02-07  1:13     ` Brandon Williams
  2018-02-22  0:06       ` Jonathan Tan
  2018-02-07  1:13     ` [PATCH v3 31/35] remote-curl: store the protocol version the server responded with Brandon Williams
                       ` (7 subsequent siblings)
  37 siblings, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-02-07  1:13 UTC (permalink / raw)
  To: git
  Cc: sbeller, peff, gitster, jrnieder, stolee, git, pclouds, Brandon Williams

Make a copy of the service name being requested instead of relying on
the buffer pointed to by the passed in 'const char *' to remain
unchanged.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 remote-curl.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/remote-curl.c b/remote-curl.c
index dae8a4a48..4086aa733 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -165,7 +165,7 @@ static int set_option(const char *name, const char *value)
 }
 
 struct discovery {
-	const char *service;
+	char *service;
 	char *buf_alloc;
 	char *buf;
 	size_t len;
@@ -257,6 +257,7 @@ static void free_discovery(struct discovery *d)
 		free(d->shallow.oid);
 		free(d->buf_alloc);
 		free_refs(d->refs);
+		free(d->service);
 		free(d);
 	}
 }
@@ -343,7 +344,7 @@ static struct discovery *discover_refs(const char *service, int for_push)
 		warning(_("redirecting to %s"), url.buf);
 
 	last= xcalloc(1, sizeof(*last_discovery));
-	last->service = service;
+	last->service = xstrdup(service);
 	last->buf_alloc = strbuf_detach(&buffer, &last->len);
 	last->buf = last->buf_alloc;
 
-- 
2.16.0.rc1.238.g530d649a79-goog



* [PATCH v3 31/35] remote-curl: store the protocol version the server responded with
  2018-02-07  1:12   ` [PATCH v3 00/35] " Brandon Williams
                       ` (29 preceding siblings ...)
  2018-02-07  1:13     ` [PATCH v3 30/35] remote-curl: create copy of the service name Brandon Williams
@ 2018-02-07  1:13     ` Brandon Williams
  2018-02-27 23:17       ` Jonathan Nieder
  2018-02-07  1:13     ` [PATCH v3 32/35] http: allow providing extra headers for http requests Brandon Williams
                       ` (6 subsequent siblings)
  37 siblings, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-02-07  1:13 UTC (permalink / raw)
  To: git
  Cc: sbeller, peff, gitster, jrnieder, stolee, git, pclouds, Brandon Williams

Store the protocol version the server responded with when performing
discovery.  This will be used in a future patch to either change the
'Git-Protocol' header sent in subsequent requests or to determine if a
client needs to fall back to using a different protocol version.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 remote-curl.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/remote-curl.c b/remote-curl.c
index 4086aa733..c54035843 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -171,6 +171,7 @@ struct discovery {
 	size_t len;
 	struct ref *refs;
 	struct oid_array shallow;
+	enum protocol_version version;
 	unsigned proto_git : 1;
 };
 static struct discovery *last_discovery;
@@ -184,7 +185,8 @@ static struct ref *parse_git_refs(struct discovery *heads, int for_push)
 			   PACKET_READ_CHOMP_NEWLINE |
 			   PACKET_READ_GENTLE_ON_EOF);
 
-	switch (discover_version(&reader)) {
+	heads->version = discover_version(&reader);
+	switch (heads->version) {
 	case protocol_v2:
 		die("support for protocol v2 not implemented yet");
 		break;
-- 
2.16.0.rc1.238.g530d649a79-goog



* [PATCH v3 32/35] http: allow providing extra headers for http requests
  2018-02-07  1:12   ` [PATCH v3 00/35] " Brandon Williams
                       ` (30 preceding siblings ...)
  2018-02-07  1:13     ` [PATCH v3 31/35] remote-curl: store the protocol version the server responded with Brandon Williams
@ 2018-02-07  1:13     ` Brandon Williams
  2018-02-22  0:09       ` Jonathan Tan
  2018-02-07  1:13     ` [PATCH v3 33/35] http: don't always add Git-Protocol header Brandon Williams
                       ` (5 subsequent siblings)
  37 siblings, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-02-07  1:13 UTC (permalink / raw)
  To: git
  Cc: sbeller, peff, gitster, jrnieder, stolee, git, pclouds, Brandon Williams

Add a way for callers to request that extra headers be included when
making http requests.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 http.c | 8 ++++++++
 http.h | 2 ++
 2 files changed, 10 insertions(+)

diff --git a/http.c b/http.c
index 597771271..e1757d62b 100644
--- a/http.c
+++ b/http.c
@@ -1723,6 +1723,14 @@ static int http_request(const char *url,
 
 	headers = curl_slist_append(headers, buf.buf);
 
+	/* Add additional headers here */
+	if (options && options->extra_headers) {
+		const struct string_list_item *item;
+		for_each_string_list_item(item, options->extra_headers) {
+			headers = curl_slist_append(headers, item->string);
+		}
+	}
+
 	curl_easy_setopt(slot->curl, CURLOPT_URL, url);
 	curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, headers);
 	curl_easy_setopt(slot->curl, CURLOPT_ENCODING, "gzip");
diff --git a/http.h b/http.h
index f7bd3b26b..a113915c7 100644
--- a/http.h
+++ b/http.h
@@ -172,6 +172,8 @@ struct http_get_options {
 	 * for details.
 	 */
 	struct strbuf *base_url;
+
+	struct string_list *extra_headers;
 };
 
 /* Return values for http_get_*() */
-- 
2.16.0.rc1.238.g530d649a79-goog


^ permalink raw reply related	[flat|nested] 362+ messages in thread

* [PATCH v3 33/35] http: don't always add Git-Protocol header
  2018-02-07  1:12   ` [PATCH v3 00/35] " Brandon Williams
                       ` (31 preceding siblings ...)
  2018-02-07  1:13     ` [PATCH v3 32/35] http: allow providing extra headers for http requests Brandon Williams
@ 2018-02-07  1:13     ` Brandon Williams
  2018-02-07  1:13     ` [PATCH v3 34/35] remote-curl: implement stateless-connect command Brandon Williams
                       ` (4 subsequent siblings)
  37 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-02-07  1:13 UTC (permalink / raw)
  To: git
  Cc: sbeller, peff, gitster, jrnieder, stolee, git, pclouds, Brandon Williams

Instead of always sending the Git-Protocol header with the configured
version with every http request, explicitly send it when discovering
refs and then only send it on subsequent http requests if the server
understood the version requested.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 http.c        | 17 -----------------
 remote-curl.c | 33 +++++++++++++++++++++++++++++++++
 2 files changed, 33 insertions(+), 17 deletions(-)

diff --git a/http.c b/http.c
index e1757d62b..8f1129ac7 100644
--- a/http.c
+++ b/http.c
@@ -904,21 +904,6 @@ static void set_from_env(const char **var, const char *envname)
 		*var = val;
 }
 
-static void protocol_http_header(void)
-{
-	if (get_protocol_version_config() > 0) {
-		struct strbuf protocol_header = STRBUF_INIT;
-
-		strbuf_addf(&protocol_header, GIT_PROTOCOL_HEADER ": version=%d",
-			    get_protocol_version_config());
-
-
-		extra_http_headers = curl_slist_append(extra_http_headers,
-						       protocol_header.buf);
-		strbuf_release(&protocol_header);
-	}
-}
-
 void http_init(struct remote *remote, const char *url, int proactive_auth)
 {
 	char *low_speed_limit;
@@ -949,8 +934,6 @@ void http_init(struct remote *remote, const char *url, int proactive_auth)
 	if (remote)
 		var_override(&http_proxy_authmethod, remote->http_proxy_authmethod);
 
-	protocol_http_header();
-
 	pragma_header = curl_slist_append(http_copy_default_headers(),
 		"Pragma: no-cache");
 	no_pragma_header = curl_slist_append(http_copy_default_headers(),
diff --git a/remote-curl.c b/remote-curl.c
index c54035843..b4e9db85b 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -291,6 +291,19 @@ static int show_http_message(struct strbuf *type, struct strbuf *charset,
 	return 0;
 }
 
+static int get_protocol_http_header(enum protocol_version version,
+				    struct strbuf *header)
+{
+	if (version > 0) {
+		strbuf_addf(header, GIT_PROTOCOL_HEADER ": version=%d",
+			    version);
+
+		return 1;
+	}
+
+	return 0;
+}
+
 static struct discovery *discover_refs(const char *service, int for_push)
 {
 	struct strbuf exp = STRBUF_INIT;
@@ -299,6 +312,8 @@ static struct discovery *discover_refs(const char *service, int for_push)
 	struct strbuf buffer = STRBUF_INIT;
 	struct strbuf refs_url = STRBUF_INIT;
 	struct strbuf effective_url = STRBUF_INIT;
+	struct strbuf protocol_header = STRBUF_INIT;
+	struct string_list extra_headers = STRING_LIST_INIT_DUP;
 	struct discovery *last = last_discovery;
 	int http_ret, maybe_smart = 0;
 	struct http_get_options http_options;
@@ -318,11 +333,16 @@ static struct discovery *discover_refs(const char *service, int for_push)
 		strbuf_addf(&refs_url, "service=%s", service);
 	}
 
+	/* Add the extra Git-Protocol header */
+	if (get_protocol_http_header(get_protocol_version_config(), &protocol_header))
+		string_list_append(&extra_headers, protocol_header.buf);
+
 	memset(&http_options, 0, sizeof(http_options));
 	http_options.content_type = &type;
 	http_options.charset = &charset;
 	http_options.effective_url = &effective_url;
 	http_options.base_url = &url;
+	http_options.extra_headers = &extra_headers;
 	http_options.initial_request = 1;
 	http_options.no_cache = 1;
 	http_options.keep_error = 1;
@@ -389,6 +409,8 @@ static struct discovery *discover_refs(const char *service, int for_push)
 	strbuf_release(&charset);
 	strbuf_release(&effective_url);
 	strbuf_release(&buffer);
+	strbuf_release(&protocol_header);
+	string_list_clear(&extra_headers, 0);
 	last_discovery = last;
 	return last;
 }
@@ -425,6 +447,7 @@ struct rpc_state {
 	char *service_url;
 	char *hdr_content_type;
 	char *hdr_accept;
+	char *protocol_header;
 	char *buf;
 	size_t alloc;
 	size_t len;
@@ -611,6 +634,10 @@ static int post_rpc(struct rpc_state *rpc)
 	headers = curl_slist_append(headers, needs_100_continue ?
 		"Expect: 100-continue" : "Expect:");
 
+	/* Add the extra Git-Protocol header */
+	if (rpc->protocol_header)
+		headers = curl_slist_append(headers, rpc->protocol_header);
+
 retry:
 	slot = get_active_slot();
 
@@ -751,6 +778,11 @@ static int rpc_service(struct rpc_state *rpc, struct discovery *heads)
 	strbuf_addf(&buf, "Accept: application/x-%s-result", svc);
 	rpc->hdr_accept = strbuf_detach(&buf, NULL);
 
+	if (get_protocol_http_header(heads->version, &buf))
+		rpc->protocol_header = strbuf_detach(&buf, NULL);
+	else
+		rpc->protocol_header = NULL;
+
 	while (!err) {
 		int n = packet_read(rpc->out, NULL, NULL, rpc->buf, rpc->alloc, 0);
 		if (!n)
@@ -778,6 +810,7 @@ static int rpc_service(struct rpc_state *rpc, struct discovery *heads)
 	free(rpc->service_url);
 	free(rpc->hdr_content_type);
 	free(rpc->hdr_accept);
+	free(rpc->protocol_header);
 	free(rpc->buf);
 	strbuf_release(&buf);
 	return err;
-- 
2.16.0.rc1.238.g530d649a79-goog


^ permalink raw reply related	[flat|nested] 362+ messages in thread

* [PATCH v3 34/35] remote-curl: implement stateless-connect command
  2018-02-07  1:12   ` [PATCH v3 00/35] " Brandon Williams
                       ` (32 preceding siblings ...)
  2018-02-07  1:13     ` [PATCH v3 33/35] http: don't always add Git-Protocol header Brandon Williams
@ 2018-02-07  1:13     ` Brandon Williams
  2018-02-28  0:05       ` Jonathan Nieder
  2018-02-07  1:13     ` [PATCH v3 35/35] remote-curl: don't request v2 when pushing Brandon Williams
                       ` (3 subsequent siblings)
  37 siblings, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-02-07  1:13 UTC (permalink / raw)
  To: git
  Cc: sbeller, peff, gitster, jrnieder, stolee, git, pclouds, Brandon Williams

Teach remote-curl the 'stateless-connect' command which is used to
establish a stateless connection with servers which support protocol
version 2.  This allows remote-curl to act as a proxy, allowing the git
client to communicate natively with a remote end, simply using
remote-curl as a pass through to convert requests to http.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 remote-curl.c          | 187 ++++++++++++++++++++++++++++++++++++++++++++++++-
 t/t5702-protocol-v2.sh |  41 +++++++++++
 2 files changed, 227 insertions(+), 1 deletion(-)

diff --git a/remote-curl.c b/remote-curl.c
index b4e9db85b..af431b658 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -188,7 +188,10 @@ static struct ref *parse_git_refs(struct discovery *heads, int for_push)
 	heads->version = discover_version(&reader);
 	switch (heads->version) {
 	case protocol_v2:
-		die("support for protocol v2 not implemented yet");
+		/*
+		 * Do nothing.  Client should run 'stateless-connect' and
+		 * request the refs themselves.
+		 */
 		break;
 	case protocol_v1:
 	case protocol_v0:
@@ -1082,6 +1085,184 @@ static void parse_push(struct strbuf *buf)
 	free(specs);
 }
 
+struct proxy_state {
+	char *service_name;
+	char *service_url;
+	struct curl_slist *headers;
+	struct strbuf request_buffer;
+	int in;
+	int out;
+	struct packet_reader reader;
+	size_t pos;
+	int seen_flush;
+};
+
+static void proxy_state_init(struct proxy_state *p, const char *service_name,
+			     enum protocol_version version)
+{
+	struct strbuf buf = STRBUF_INIT;
+
+	memset(p, 0, sizeof(*p));
+	p->service_name = xstrdup(service_name);
+
+	p->in = 0;
+	p->out = 1;
+	strbuf_init(&p->request_buffer, 0);
+
+	strbuf_addf(&buf, "%s%s", url.buf, p->service_name);
+	p->service_url = strbuf_detach(&buf, NULL);
+
+	p->headers = http_copy_default_headers();
+
+	strbuf_addf(&buf, "Content-Type: application/x-%s-request", p->service_name);
+	p->headers = curl_slist_append(p->headers, buf.buf);
+	strbuf_reset(&buf);
+
+	strbuf_addf(&buf, "Accept: application/x-%s-result", p->service_name);
+	p->headers = curl_slist_append(p->headers, buf.buf);
+	strbuf_reset(&buf);
+
+	p->headers = curl_slist_append(p->headers, "Transfer-Encoding: chunked");
+
+	/* Add the Git-Protocol header */
+	if (get_protocol_http_header(version, &buf))
+		p->headers = curl_slist_append(p->headers, buf.buf);
+
+	packet_reader_init(&p->reader, p->in, NULL, 0,
+			   PACKET_READ_GENTLE_ON_EOF);
+
+	strbuf_release(&buf);
+}
+
+static void proxy_state_clear(struct proxy_state *p)
+{
+	free(p->service_name);
+	free(p->service_url);
+	curl_slist_free_all(p->headers);
+	strbuf_release(&p->request_buffer);
+}
+
+static size_t proxy_in(char *buffer, size_t eltsize,
+		       size_t nmemb, void *userdata)
+{
+	size_t max = eltsize * nmemb;
+	struct proxy_state *p = userdata;
+	size_t avail = p->request_buffer.len - p->pos;
+
+	if (!avail) {
+		if (p->seen_flush) {
+			p->seen_flush = 0;
+			return 0;
+		}
+
+		strbuf_reset(&p->request_buffer);
+		switch (packet_reader_read(&p->reader)) {
+		case PACKET_READ_EOF:
+			die("unexpected EOF when reading from parent process");
+		case PACKET_READ_NORMAL:
+			packet_buf_write_len(&p->request_buffer, p->reader.line,
+					     p->reader.pktlen);
+			break;
+		case PACKET_READ_DELIM:
+			packet_buf_delim(&p->request_buffer);
+			break;
+		case PACKET_READ_FLUSH:
+			packet_buf_flush(&p->request_buffer);
+			p->seen_flush = 1;
+			break;
+		}
+		p->pos = 0;
+		avail = p->request_buffer.len;
+	}
+
+	if (max < avail)
+		avail = max;
+	memcpy(buffer, p->request_buffer.buf + p->pos, avail);
+	p->pos += avail;
+	return avail;
+}
+
+static size_t proxy_out(char *buffer, size_t eltsize,
+			size_t nmemb, void *userdata)
+{
+	size_t size = eltsize * nmemb;
+	struct proxy_state *p = userdata;
+
+	write_or_die(p->out, buffer, size);
+	return size;
+}
+
+static int proxy_post(struct proxy_state *p)
+{
+	struct active_request_slot *slot;
+	int err;
+
+	slot = get_active_slot();
+
+	curl_easy_setopt(slot->curl, CURLOPT_NOBODY, 0);
+	curl_easy_setopt(slot->curl, CURLOPT_POST, 1);
+	curl_easy_setopt(slot->curl, CURLOPT_URL, p->service_url);
+	curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, p->headers);
+
+	/* Setup function to read request from client */
+	curl_easy_setopt(slot->curl, CURLOPT_READFUNCTION, proxy_in);
+	curl_easy_setopt(slot->curl, CURLOPT_READDATA, p);
+
+	/* Setup function to write server response to client */
+	curl_easy_setopt(slot->curl, CURLOPT_WRITEFUNCTION, proxy_out);
+	curl_easy_setopt(slot->curl, CURLOPT_WRITEDATA, p);
+
+	err = run_slot(slot, NULL);
+
+	if (err != HTTP_OK)
+		err = -1;
+
+	return err;
+}
+
+static int stateless_connect(const char *service_name)
+{
+	struct discovery *discover;
+	struct proxy_state p;
+
+	/*
+	 * Run the info/refs request and see if the server supports protocol
+	 * v2.  If and only if the server supports v2 can we successfully
+	 * establish a stateless connection, otherwise we need to tell the
+	 * client to fallback to using other transport helper functions to
+	 * complete their request.
+	 */
+	discover = discover_refs(service_name, 0);
+	if (discover->version != protocol_v2) {
+		printf("fallback\n");
+		fflush(stdout);
+		return -1;
+	} else {
+		/* Stateless Connection established */
+		printf("\n");
+		fflush(stdout);
+	}
+
+	proxy_state_init(&p, service_name, discover->version);
+
+	/*
+	 * Dump the capability listing that we got from the server earlier
+	 * during the info/refs request.
+	 */
+	write_or_die(p.out, discover->buf, discover->len);
+
+	/* Peek the next packet line.  Until we see EOF keep sending POSTs */
+	while (packet_reader_peek(&p.reader) != PACKET_READ_EOF) {
+		if (proxy_post(&p)) {
+			/* We would have an err here */
+			break;
+		}
+	}
+
+	proxy_state_clear(&p);
+	return 0;
+}
+
 int cmd_main(int argc, const char **argv)
 {
 	struct strbuf buf = STRBUF_INIT;
@@ -1150,12 +1331,16 @@ int cmd_main(int argc, const char **argv)
 			fflush(stdout);
 
 		} else if (!strcmp(buf.buf, "capabilities")) {
+			printf("stateless-connect\n");
 			printf("fetch\n");
 			printf("option\n");
 			printf("push\n");
 			printf("check-connectivity\n");
 			printf("\n");
 			fflush(stdout);
+		} else if (skip_prefix(buf.buf, "stateless-connect ", &arg)) {
+			if (!stateless_connect(arg))
+				break;
 		} else {
 			error("remote-curl: unknown command '%s' from git", buf.buf);
 			return 1;
diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh
index 60e43bcf5..c2c39fe0c 100755
--- a/t/t5702-protocol-v2.sh
+++ b/t/t5702-protocol-v2.sh
@@ -172,4 +172,45 @@ test_expect_success 'ref advertisment is filtered during fetch using protocol v2
 	! grep "refs/tags/" log
 '
 
+# Test protocol v2 with 'http://' transport
+#
+. "$TEST_DIRECTORY"/lib-httpd.sh
+start_httpd
+
+test_expect_success 'create repo to be served by http:// transport' '
+	git init "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" config http.receivepack true &&
+	test_commit -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" one
+'
+
+test_expect_success 'clone with http:// using protocol v2' '
+	GIT_TRACE_PACKET=1 GIT_TRACE_CURL=1 git -c protocol.version=2 \
+		clone "$HTTPD_URL/smart/http_parent" http_child 2>log &&
+
+	git -C http_child log -1 --format=%s >actual &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+
+	# Client requested to use protocol v2
+	grep "Git-Protocol: version=2" log &&
+	# Server responded using protocol v2
+	grep "git< version 2" log
+'
+
+test_expect_success 'fetch with http:// using protocol v2' '
+	test_commit -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" two &&
+
+	GIT_TRACE_PACKET=1 git -C http_child -c protocol.version=2 \
+		fetch 2>log &&
+
+	git -C http_child log -1 --format=%s origin/master >actual &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+
+	# Server responded using protocol v2
+	grep "git< version 2" log
+'
+
+stop_httpd
+
 test_done
-- 
2.16.0.rc1.238.g530d649a79-goog


^ permalink raw reply related	[flat|nested] 362+ messages in thread

* [PATCH v3 35/35] remote-curl: don't request v2 when pushing
  2018-02-07  1:12   ` [PATCH v3 00/35] " Brandon Williams
                       ` (33 preceding siblings ...)
  2018-02-07  1:13     ` [PATCH v3 34/35] remote-curl: implement stateless-connect command Brandon Williams
@ 2018-02-07  1:13     ` Brandon Williams
  2018-02-22  0:12       ` Jonathan Tan
  2018-02-12 14:50     ` [PATCH v3 00/35] protocol version 2 Derrick Stolee
                       ` (2 subsequent siblings)
  37 siblings, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-02-07  1:13 UTC (permalink / raw)
  To: git
  Cc: sbeller, peff, gitster, jrnieder, stolee, git, pclouds, Brandon Williams

In order to be able to ship protocol v2 with only supporting fetch, we
need clients to not issue a request to use protocol v2 when pushing
(since the client currently doesn't know how to push using protocol v2).
This allows a client to have protocol v2 configured in
`protocol.version` and take advantage of using v2 for fetch and falling
back to using v0 when pushing while v2 for push is being designed.

We could run into issues if we didn't fall back to protocol v0 when
pushing right now.  This is because currently a server will ignore a request to
use v2 when contacting the 'receive-pack' endpoint and fall back to
using v0, but when push v2 is rolled out to servers, the 'receive-pack'
endpoint will start responding using v2.  So we don't want to get into a
state where a client is requesting to push with v2 before they actually
know how to push using v2.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 remote-curl.c          | 11 ++++++++++-
 t/t5702-protocol-v2.sh | 23 +++++++++++++++++++++++
 2 files changed, 33 insertions(+), 1 deletion(-)

diff --git a/remote-curl.c b/remote-curl.c
index af431b658..c39b6ece6 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -320,6 +320,7 @@ static struct discovery *discover_refs(const char *service, int for_push)
 	struct discovery *last = last_discovery;
 	int http_ret, maybe_smart = 0;
 	struct http_get_options http_options;
+	enum protocol_version version = get_protocol_version_config();
 
 	if (last && !strcmp(service, last->service))
 		return last;
@@ -336,8 +337,16 @@ static struct discovery *discover_refs(const char *service, int for_push)
 		strbuf_addf(&refs_url, "service=%s", service);
 	}
 
+	/*
+	 * NEEDSWORK: If we are trying to use protocol v2 and we are planning
+	 * to perform a push, then fallback to v0 since the client doesn't know
+	 * how to push yet using v2.
+	 */
+	if (version == protocol_v2 && !strcmp("git-receive-pack", service))
+		version = protocol_v0;
+
 	/* Add the extra Git-Protocol header */
-	if (get_protocol_http_header(get_protocol_version_config(), &protocol_header))
+	if (get_protocol_http_header(version, &protocol_header))
 		string_list_append(&extra_headers, protocol_header.buf);
 
 	memset(&http_options, 0, sizeof(http_options));
diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh
index c2c39fe0c..14d589a7f 100755
--- a/t/t5702-protocol-v2.sh
+++ b/t/t5702-protocol-v2.sh
@@ -211,6 +211,29 @@ test_expect_success 'fetch with http:// using protocol v2' '
 	grep "git< version 2" log
 '
 
+test_expect_success 'push with http:// and a config of v2 does not request v2' '
+	# Till v2 for push is designed, make sure that if a client has
+	# protocol.version configured to use v2, that the client instead falls
+	# back and uses v0.
+
+	test_commit -C http_child three &&
+
+	# Push to another branch, as the target repository has the
+	# master branch checked out and we cannot push into it.
+	GIT_TRACE_PACKET=1 git -C http_child -c protocol.version=2 \
+		push origin HEAD:client_branch 2>log &&
+
+	git -C http_child log -1 --format=%s >actual &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" log -1 --format=%s client_branch >expect &&
+	test_cmp expect actual &&
+
+	# Client didnt request to use protocol v2
+	! grep "Git-Protocol: version=2" log &&
+	# Server didnt respond using protocol v2
+	! grep "git< version 2" log
+'
+
+
 stop_httpd
 
 test_done
-- 
2.16.0.rc1.238.g530d649a79-goog


^ permalink raw reply related	[flat|nested] 362+ messages in thread

* Re: [PATCH v3 22/35] upload-pack: support shallow requests
  2018-02-07  1:12     ` [PATCH v3 22/35] upload-pack: support shallow requests Brandon Williams
@ 2018-02-07 19:00       ` Stefan Beller
  2018-02-10 10:23         ` Duy Nguyen
  2018-02-13 17:06         ` Brandon Williams
  2018-02-27 18:29       ` Jonathan Nieder
  1 sibling, 2 replies; 362+ messages in thread
From: Stefan Beller @ 2018-02-07 19:00 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, Jeff King, Junio C Hamano, Jonathan Nieder, Derrick Stolee,
	Jeff Hostetler, Duy Nguyen

On Tue, Feb 6, 2018 at 5:12 PM, Brandon Williams <bmwill@google.com> wrote:
> Add the 'shallow' feature to the protocol version 2 command 'fetch'
> which indicates that the server supports shallow clients and deepen
> requests.
>
> Signed-off-by: Brandon Williams <bmwill@google.com>
> ---
>  Documentation/technical/protocol-v2.txt |  67 +++++++++++++++-
>  serve.c                                 |   2 +-
>  t/t5701-git-serve.sh                    |   2 +-
>  upload-pack.c                           | 138 +++++++++++++++++++++++---------
>  upload-pack.h                           |   3 +
>  5 files changed, 173 insertions(+), 39 deletions(-)
>
> diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
> index 4d5096dae..fedeb6b77 100644
> --- a/Documentation/technical/protocol-v2.txt
> +++ b/Documentation/technical/protocol-v2.txt
> @@ -201,12 +201,42 @@ packet-lines:
>         to its base by position in pack rather than by an oid.  That is,
>         they can read OBJ_OFS_DELTA (aka type 6) in a packfile.
>
> +    shallow <oid>
> +       A client must notify the server of all objects for which it only

s/all objects/all commits/ for preciseness

> +       has shallow copies of (meaning that it doesn't have the parents
> +       of a commit) by supplying a 'shallow <oid>' line for each such
> +       object so that the server is aware of the limitations of the
> +       client's history.
> +
> +    deepen <depth>
> +       Request that the fetch/clone should be shallow having a commit depth of
> +       <depth> relative to the remote side.

What does depth mean? number of commits, or number of edges?
Are there any special numbers (-1, 0, 1, max int) ?

From reading ahead: "Cannot be used with deepen-since, but
can be combined with deepen-relative" ?


> +
> +    deepen-relative
> +       Requests that the semantics of the "deepen" command be changed
> +       to indicate that the depth requested is relative to the client's
> +       current shallow boundary, instead of relative to the remote
> +       refs.
> +
> +    deepen-since <timestamp>
> +       Requests that the shallow clone/fetch should be cut at a
> +       specific time, instead of depth.  Internally it's equivalent of
> +       doing "rev-list --max-age=<timestamp>". Cannot be used with
> +       "deepen".
> +
> +    deepen-not <rev>
> +       Requests that the shallow clone/fetch should be cut at a
> +       specific revision specified by '<rev>', instead of a depth.
> +       Internally it's equivalent of doing "rev-list --not <rev>".
> +       Cannot be used with "deepen", but can be used with
> +       "deepen-since".

What happens if those are given in combination?

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v3 22/35] upload-pack: support shallow requests
  2018-02-07 19:00       ` Stefan Beller
@ 2018-02-10 10:23         ` Duy Nguyen
  2018-02-13 17:06         ` Brandon Williams
  1 sibling, 0 replies; 362+ messages in thread
From: Duy Nguyen @ 2018-02-10 10:23 UTC (permalink / raw)
  To: Stefan Beller
  Cc: Brandon Williams, git, Jeff King, Junio C Hamano,
	Jonathan Nieder, Derrick Stolee, Jeff Hostetler

On Thu, Feb 8, 2018 at 2:00 AM, Stefan Beller <sbeller@google.com> wrote:
>> +
>> +    deepen-relative
>> +       Requests that the semantics of the "deepen" command be changed
>> +       to indicate that the depth requested is relative to the client's
>> +       current shallow boundary, instead of relative to the remote
>> +       refs.
>> +
>> +    deepen-since <timestamp>
>> +       Requests that the shallow clone/fetch should be cut at a
>> +       specific time, instead of depth.  Internally it's equivalent of
>> +       doing "rev-list --max-age=<timestamp>". Cannot be used with
>> +       "deepen".
>> +
>> +    deepen-not <rev>
>> +       Requests that the shallow clone/fetch should be cut at a
>> +       specific revision specified by '<rev>', instead of a depth.
>> +       Internally it's equivalent of doing "rev-list --not <rev>".
>> +       Cannot be used with "deepen", but can be used with
>> +       "deepen-since".
>
> What happens if those are given in combination?

It should be described in the old protocol document or I did a bad job
documenting it. Some of these can be combined (I think it's AND logic
from rev-list point of view), with the exception of --depth which does
not use rev-list underneath and cannot be combined with the others.
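
To make the rule concrete, here is a minimal sketch of the check as I
understand it from the descriptions quoted above.  It is not the code in
upload-pack.c; the struct and function names are hypothetical, and it
covers only the one constraint stated in this thread: "deepen" cannot be
combined with the rev-list based options, while "deepen-since" and
"deepen-not" may be combined with each other.

```c
/*
 * Hypothetical summary of the shallow options seen in one fetch request.
 * A value of 0 means "option not given".
 */
struct sketch_shallow_req {
	int depth;            /* from "deepen <depth>" */
	long deepen_since;    /* from "deepen-since <timestamp>" */
	int deepen_not_count; /* number of "deepen-not <rev>" lines */
};

/*
 * Return 1 if the combination is allowed, 0 otherwise.
 * "deepen" does not go through rev-list, so it cannot be mixed with
 * "deepen-since" or "deepen-not"; those two may be used together.
 */
static int sketch_shallow_req_valid(const struct sketch_shallow_req *r)
{
	int uses_rev_list = r->deepen_since > 0 || r->deepen_not_count > 0;

	if (r->depth > 0 && uses_rev_list)
		return 0;
	return 1;
}
```

What the server should do with a rejected combination (die, ignore one
option, ...) is not settled by this sketch.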
-- 
Duy

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v3 00/35] protocol version 2
  2018-02-07  1:12   ` [PATCH v3 00/35] " Brandon Williams
                       ` (34 preceding siblings ...)
  2018-02-07  1:13     ` [PATCH v3 35/35] remote-curl: don't request v2 when pushing Brandon Williams
@ 2018-02-12 14:50     ` Derrick Stolee
  2018-02-21 20:01     ` Brandon Williams
  2018-02-28 23:22     ` [PATCH v4 " Brandon Williams
  37 siblings, 0 replies; 362+ messages in thread
From: Derrick Stolee @ 2018-02-12 14:50 UTC (permalink / raw)
  To: Brandon Williams, git; +Cc: sbeller, peff, gitster, jrnieder, git, pclouds

On 2/6/2018 8:12 PM, Brandon Williams wrote:
> Changes in v3:
>   * There were some comments about how the protocol should be designed
>     stateless first.  I've made this change and instead of having to
>     supply the `stateless-rpc=true` capability to force stateless
>     behavior, the protocol just requires all commands to be stateless.
>   
>   * Added some patches towards the end of the series to force the client
>     to not request to use protocol v2 when pushing (even if configured to
>     use v2).  This is to ease the roll-out process of a push command in
>     protocol v2.  This way when servers gain the ability to accept
>     pushing in v2 (and they start responding using v2 when requests are
>     sent to the git-receive-pack endpoint) that clients who still don't
>     understand how to push using v2 won't request to use v2 and then die
>     when they recognize that the server does indeed know how to accept a
>     push under v2.
>
>   * I implemented the `shallow` feature for fetch.  This feature
>     encapsulates the existing functionality of all the shallow/deepen
>     capabilities in v0.  So now a server can process shallow requests.
>
>   * Various other small tweaks that I can't remember :)
>
> After all of that I think the series is in a pretty good state, barring
> any more critical reviewing feedback.
>
> Thanks!
>
> Brandon Williams (35):
>    pkt-line: introduce packet_read_with_status
>    pkt-line: introduce struct packet_reader
>    pkt-line: add delim packet support
>    upload-pack: convert to a builtin
>    upload-pack: factor out processing lines
>    transport: use get_refs_via_connect to get refs
>    connect: convert get_remote_heads to use struct packet_reader
>    connect: discover protocol version outside of get_remote_heads
>    transport: store protocol version
>    protocol: introduce enum protocol_version value protocol_v2
>    test-pkt-line: introduce a packet-line test helper
>    serve: introduce git-serve
>    ls-refs: introduce ls-refs server command
>    connect: request remote refs using v2
>    transport: convert get_refs_list to take a list of ref patterns
>    transport: convert transport_get_remote_refs to take a list of ref
>      patterns
>    ls-remote: pass ref patterns when requesting a remote's refs
>    fetch: pass ref patterns when fetching
>    push: pass ref patterns when pushing
>    upload-pack: introduce fetch server command
>    fetch-pack: perform a fetch using v2
>    upload-pack: support shallow requests
>    fetch-pack: support shallow requests
>    connect: refactor git_connect to only get the protocol version once
>    connect: don't request v2 when pushing
>    transport-helper: remove name parameter
>    transport-helper: refactor process_connect_service
>    transport-helper: introduce stateless-connect
>    pkt-line: add packet_buf_write_len function
>    remote-curl: create copy of the service name
>    remote-curl: store the protocol version the server responded with
>    http: allow providing extra headers for http requests
>    http: don't always add Git-Protocol header
>    remote-curl: implement stateless-connect command
>    remote-curl: don't request v2 when pushing
>
>   .gitignore                              |   1 +
>   Documentation/technical/protocol-v2.txt | 338 +++++++++++++++++
>   Makefile                                |   7 +-
>   builtin.h                               |   2 +
>   builtin/clone.c                         |   2 +-
>   builtin/fetch-pack.c                    |  21 +-
>   builtin/fetch.c                         |  14 +-
>   builtin/ls-remote.c                     |   7 +-
>   builtin/receive-pack.c                  |   6 +
>   builtin/remote.c                        |   2 +-
>   builtin/send-pack.c                     |  20 +-
>   builtin/serve.c                         |  30 ++
>   builtin/upload-pack.c                   |  74 ++++
>   connect.c                               | 352 +++++++++++++-----
>   connect.h                               |   7 +
>   fetch-pack.c                            | 319 +++++++++++++++-
>   fetch-pack.h                            |   4 +-
>   git.c                                   |   2 +
>   http.c                                  |  25 +-
>   http.h                                  |   2 +
>   ls-refs.c                               |  96 +++++
>   ls-refs.h                               |   9 +
>   pkt-line.c                              | 149 +++++++-
>   pkt-line.h                              |  77 ++++
>   protocol.c                              |   2 +
>   protocol.h                              |   1 +
>   remote-curl.c                           | 257 ++++++++++++-
>   remote.h                                |   9 +-
>   serve.c                                 | 260 +++++++++++++
>   serve.h                                 |  15 +
>   t/helper/test-pkt-line.c                |  64 ++++
>   t/t5701-git-serve.sh                    | 176 +++++++++
>   t/t5702-protocol-v2.sh                  | 239 ++++++++++++
>   transport-helper.c                      |  84 +++--
>   transport-internal.h                    |   4 +-
>   transport.c                             | 116 ++++--
>   transport.h                             |   9 +-
>   upload-pack.c                           | 625 ++++++++++++++++++++++++--------
>   upload-pack.h                           |  21 ++
>   39 files changed, 3088 insertions(+), 360 deletions(-)
>   create mode 100644 Documentation/technical/protocol-v2.txt
>   create mode 100644 builtin/serve.c
>   create mode 100644 builtin/upload-pack.c
>   create mode 100644 ls-refs.c
>   create mode 100644 ls-refs.h
>   create mode 100644 serve.c
>   create mode 100644 serve.h
>   create mode 100644 t/helper/test-pkt-line.c
>   create mode 100755 t/t5701-git-serve.sh
>   create mode 100755 t/t5702-protocol-v2.sh
>   create mode 100644 upload-pack.h
>

I inspected the diff between v2 and v3 and found the changes to be good.

Reviewed-By: Derrick Stolee <dstolee@microsoft.com>

Thanks,
-Stolee


^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v3 01/35] pkt-line: introduce packet_read_with_status
  2018-02-07  1:12     ` [PATCH v3 01/35] pkt-line: introduce packet_read_with_status Brandon Williams
@ 2018-02-13  0:25       ` Jonathan Nieder
  0 siblings, 0 replies; 362+ messages in thread
From: Jonathan Nieder @ 2018-02-13  0:25 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, sbeller, peff, gitster, stolee, git, pclouds

Hi,

Brandon Williams wrote:

> The current pkt-line API encodes the status of a pkt-line read in the
> length of the read content.  An error is indicated with '-1', a flush
> with '0' (which can be confusing since a return value of '0' can also
> indicate an empty pkt-line), and a positive integer for the length of
> the read content otherwise.  This doesn't leave much room for allowing
> the addition of additional special packets in the future.
>
> To solve this introduce 'packet_read_with_status()' which reads a packet
> and returns the status of the read encoded as an 'enum packet_status'
> type.  This allows for easily identifying between special and normal
> packets as well as errors.  It also enables easily adding a new special
> packet in the future.

Makes sense, thanks.  Using an enum return value is less opaque, too.
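
To illustrate the ambiguity the commit message describes, here is a minimal
sketch of a status-returning reader over an in-memory buffer.  The names
(pkt_status, pkt_read) are made up for this example and are not the API in
the patch; unlike the real code, malformed input is reported as EOF instead
of calling die().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names for illustration; not git's pkt-line API. */
enum pkt_status {
	PKT_EOF,
	PKT_NORMAL,
	PKT_FLUSH
};

static int hexval(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/*
 * Read one pkt-line from *src.  On PKT_NORMAL the payload is copied
 * into 'out' (NUL-terminated) and its length stored in *pktlen.
 * Sketch only: truncated or malformed input is reported as PKT_EOF.
 */
static enum pkt_status pkt_read(const char **src, size_t *src_len,
				char *out, size_t out_size, int *pktlen)
{
	int len = 0, i;

	assert(out_size > 0);
	if (*src_len < 4)
		return PKT_EOF;
	for (i = 0; i < 4; i++) {
		int v = hexval((*src)[i]);
		if (v < 0)
			return PKT_EOF;
		len = (len << 4) | v;
	}
	*src += 4;
	*src_len -= 4;

	if (!len)
		return PKT_FLUSH;	/* "0000" */

	len -= 4;			/* the length includes the 4-byte header */
	if (len < 0 || (size_t)len > *src_len || (size_t)len >= out_size)
		return PKT_EOF;

	memcpy(out, *src, len);
	out[len] = '\0';
	*src += len;
	*src_len -= len;
	*pktlen = len;
	return PKT_NORMAL;
}
```

With this shape an empty pkt-line ("0004") yields PKT_NORMAL with a length
of 0, while a flush ("0000") yields PKT_FLUSH, so the caller no longer has
to guess what a return value of 0 means.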

[...]
> --- a/pkt-line.c
> +++ b/pkt-line.c
> @@ -280,28 +280,33 @@ static int packet_length(const char *linelen)
>  	return (val < 0) ? val : (val << 8) | hex2chr(linelen + 2);
>  }
>  
> -int packet_read(int fd, char **src_buf, size_t *src_len,
> -		char *buffer, unsigned size, int options)
> +enum packet_read_status packet_read_with_status(int fd, char **src_buffer, size_t *src_len,
> +						char *buffer, unsigned size, int *pktlen,
> +						int options)

This function definition straddles two worlds: it is line-wrapped as
though there are a limited number of columns, but it goes far past 80
columns.

Can "make style" or a similar tool take care of rewrapping it?


>  {
> -	int len, ret;
> +	int len;
>  	char linelen[4];
>  
> -	ret = get_packet_data(fd, src_buf, src_len, linelen, 4, options);
> -	if (ret < 0)
> -		return ret;
> +	if (get_packet_data(fd, src_buffer, src_len, linelen, 4, options) < 0)
> +		return PACKET_READ_EOF;
> +

EOF is indeed the only error that get_packet_data can return.

Could be worth a doc comment on get_packet_data to make that clearer.
It's not too important since it's static, though.

>  	len = packet_length(linelen);
> -	if (len < 0)
> +
> +	if (len < 0) {
>  		die("protocol error: bad line length character: %.4s", linelen);
> -	if (!len) {
> +	} else if (!len) {
>  		packet_trace("0000", 4, 0);
> -		return 0;
> +		return PACKET_READ_FLUSH;

The advertised change. Makes sense.

[...]
> -	if (len >= size)
> +	if ((unsigned)len >= size)
>  		die("protocol error: bad line length %d", len);

The comparison is safe since we just checked that len >= 0.

Is there some static analysis that can make this kind of operation
easier?
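
As a small illustration of why the earlier "len < 0" check matters, the
sketch below (hypothetical helper names) compares a signed length against
an unsigned size with and without that guard.  An uncast "len < size"
converts len to unsigned in the same way; GCC and Clang can flag that
implicit form with -Wsign-compare.

```c
#include <assert.h>
#include <stdbool.h>

/* Cast without a prior sign check: -1 turns into UINT_MAX. */
static bool naive_less(int len, unsigned size)
{
	return (unsigned)len < size;
}

/* The cast is value-preserving only once len is known to be >= 0. */
static bool checked_less(int len, unsigned size)
{
	if (len < 0)
		return true;	/* any negative value is below an unsigned size */
	return (unsigned)len < size;
}
```

Both helpers agree for non-negative lengths; they differ only for negative
ones, which is exactly the case the patch rules out before the cast.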

[...]
> @@ -309,7 +314,31 @@ int packet_read(int fd, char **src_buf, size_t *src_len,
>  
>  	buffer[len] = 0;
>  	packet_trace(buffer, len, 0);
> -	return len;
> +	*pktlen = len;
> +	return PACKET_READ_NORMAL;
> +}
> +
> +int packet_read(int fd, char **src_buffer, size_t *src_len,
> +		char *buffer, unsigned size, int options)
> +{
> +	enum packet_read_status status;
> +	int pktlen;
> +
> +	status = packet_read_with_status(fd, src_buffer, src_len,
> +					 buffer, size, &pktlen,
> +					 options);
> +	switch (status) {
> +	case PACKET_READ_EOF:
> +		pktlen = -1;
> +		break;
> +	case PACKET_READ_NORMAL:
> +		break;
> +	case PACKET_READ_FLUSH:
> +		pktlen = 0;
> +		break;
> +	}
> +
> +	return pktlen;

nit: can simplify by avoiding the status temporary:

	int pktlen;

	switch (packet_read_with_status(...)) {
	case PACKET_READ_EOF:
		return -1;
	case PACKET_READ_FLUSH:
		return 0;
	case PACKET_READ_NORMAL:
		return pktlen;
	}

As a bonus, that lets static analyzers check that the cases are
exhaustive.  (On the other hand, C doesn't guarantee that an enum can
only have the values listed as enumerators.  Did we end up figuring
out a way to handle that, beyond always including a 'default: BUG()'?)
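
One possible shape for that, sketched with made-up names (this is not what
the patch does): return from every case, leave out 'default' so that
-Wswitch can still report a missing enumerator, and put the BUG-style abort
after the switch to catch values that are not enumerators at all.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical enum mirroring the three states discussed above. */
enum read_status {
	READ_EOF,
	READ_NORMAL,
	READ_FLUSH
};

/* Map a status back to the legacy "length or special value" convention. */
static int legacy_length(enum read_status status, int pktlen)
{
	switch (status) {
	case READ_EOF:
		return -1;
	case READ_FLUSH:
		return 0;
	case READ_NORMAL:
		return pktlen;
	}

	/* Only reachable if 'status' holds a value that is not an enumerator. */
	fprintf(stderr, "BUG: unexpected read status %d\n", (int)status);
	abort();
}
```

The cost is that the out-of-range check lives outside the switch, which is
easy to forget when copying the pattern.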

> --- a/pkt-line.h
> +++ b/pkt-line.h
> @@ -65,6 +65,21 @@ int write_packetized_from_buf(const char *src_in, size_t len, int fd_out);
>  int packet_read(int fd, char **src_buffer, size_t *src_len, char
>  		*buffer, unsigned size, int options);
>  
> +/*
> + * Read a packetized line into a buffer like the 'packet_read()' function but
> + * returns an 'enum packet_read_status' which indicates the status of the read.
> + * The number of bytes read will be assigined to *pktlen if the status of the
> + * read was 'PACKET_READ_NORMAL'.
> + */
> +enum packet_read_status {
> +	PACKET_READ_EOF = -1,
> +	PACKET_READ_NORMAL,
> +	PACKET_READ_FLUSH,
> +};

nit: do any callers treat the return value as a number?  It would be
less magical if the numbering were left to the compiler (0, 1, 2).

Thanks,
Jonathan


* Re: [PATCH v3 02/35] pkt-line: introduce struct packet_reader
  2018-02-07  1:12     ` [PATCH v3 02/35] pkt-line: introduce struct packet_reader Brandon Williams
@ 2018-02-13  0:49       ` Jonathan Nieder
  2018-02-27 18:14         ` Brandon Williams
  2018-02-27  5:57       ` Jonathan Nieder
  1 sibling, 1 reply; 362+ messages in thread
From: Jonathan Nieder @ 2018-02-13  0:49 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, sbeller, peff, gitster, stolee, git, pclouds

Hi,

Brandon Williams wrote:

> Subject: pkt-line: introduce struct packet_reader

nit: this subject line doesn't describe what the purpose/intent behind
the patch is.  Maybe something like

	pkt-line: allow peeking at a packet line without consuming it

would make it clearer.

> Sometimes it is advantageous to be able to peek the next packet line
> without consuming it (e.g. to be able to determine the protocol version
> a server is speaking).  In order to do that introduce 'struct
> packet_reader' which is an abstraction around the normal packet reading
> logic.  This enables a caller to be able to peek a single line at a time
> using 'packet_reader_peek()' and having a caller consume a line by
> calling 'packet_reader_read()'.

Makes sense.  The packet_reader owns a buffer to support the peek
operation and make buffer reuse a little easier.
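
For readers following along, a stripped-down sketch of the peek/read
contract.  It reads from an array of strings rather than a file descriptor,
and every name in it (line_reader, reader_read, reader_peek) is
hypothetical; a NULL entry in the source array stands in for a flush-pkt.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum status { ST_EOF, ST_NORMAL, ST_FLUSH };

struct line_reader {
	const char **lines;	/* source lines; NULL entry means flush */
	size_t nr, pos;

	const char *line;	/* last line read, or NULL */
	enum status status;	/* status of the last read, kept for peeking */
	int line_peeked;
};

static enum status reader_read(struct line_reader *r)
{
	/* A peeked line is handed out once more, then consumed. */
	if (r->line_peeked) {
		r->line_peeked = 0;
		return r->status;
	}

	if (r->pos >= r->nr) {
		r->line = NULL;
		r->status = ST_EOF;
	} else if (!r->lines[r->pos]) {
		r->pos++;
		r->line = NULL;
		r->status = ST_FLUSH;
	} else {
		r->line = r->lines[r->pos++];
		r->status = ST_NORMAL;
	}
	return r->status;
}

static enum status reader_peek(struct line_reader *r)
{
	/* Peeking twice in a row returns the same line. */
	if (r->line_peeked)
		return r->status;

	reader_read(r);
	r->line_peeked = 1;
	return r->status;
}
```

A caller could peek at the first line to see whether it reads "version 2"
and then hand the same reader, with that line still unconsumed, to
whichever code handles the detected protocol version.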

[...]
> --- a/pkt-line.h
> +++ b/pkt-line.h
> @@ -111,6 +111,64 @@ char *packet_read_line_buf(char **src_buf, size_t *src_len, int *size);
>   */
>  ssize_t read_packetized_to_strbuf(int fd_in, struct strbuf *sb_out);
>  
> +struct packet_reader {
> +	/* source file descriptor */
> +	int fd;
> +
> +	/* source buffer and its size */
> +	char *src_buffer;
> +	size_t src_len;

Can or should this be a strbuf?

> +
> +	/* buffer that pkt-lines are read into and its size */
> +	char *buffer;
> +	unsigned buffer_size;

Likewise.

> +
> +	/* options to be used during reads */
> +	int options;

What option values are possible?

> +
> +	/* status of the last read */
> +	enum packet_read_status status;

This reminds me of FILE*'s status value, ferror, etc.  I'm mildly
nervous about it --- it encourages a style of error handling where you
ignore errors from an individual operation and hope the recorded
status later has the most relevant error.

I think it is being used to support peek --- you need to record the
error to replay it.  Is that right?  I wonder if it would make sense to
structure as

		struct packet_read_result last_line_read;
	};

	struct packet_read_result {
		enum packet_read_status status;

		const char *line;
		int len;
	};

What you have here also seems fine.  I think what would help most
for readability is if the comment mentioned the purpose --- e.g.

	/* status of the last read, to support peeking */

Or if the contract were tied to the purpose:

	/* status of the last read, only valid if line_peeked is true */

[...]
> +/*
> + * Initialize a 'struct packet_reader' object which is an
> + * abstraction around the 'packet_read_with_status()' function.
> + */
> +extern void packet_reader_init(struct packet_reader *reader, int fd,
> +			       char *src_buffer, size_t src_len,
> +			       int options);

This comment doesn't describe how I should use the function.  Is the
intent to point the reader to packet_read_with_status for more details
about the arguments?

Can src_buffer be a const char *?

[...]
> +/*
> + * Perform a packet read and return the status of the read.

nit: s/Perform a packet read/Read one pkt-line/

> + * The values of 'pktlen' and 'line' are updated based on the status of the
> + * read as follows:
> + *
> + * PACKET_READ_ERROR: 'pktlen' is set to '-1' and 'line' is set to NULL
> + * PACKET_READ_NORMAL: 'pktlen' is set to the number of bytes read
> + *		       'line' is set to point at the read line
> + * PACKET_READ_FLUSH: 'pktlen' is set to '0' and 'line' is set to NULL
> + */
> +extern enum packet_read_status packet_reader_read(struct packet_reader *reader);

This is reasonable.  As described above, an alternative would be to
have a separate packet_read_result output parameter, but the interface
described here looks pretty easy/pleasant to use.

> +
> +/*
> + * Peek the next packet line without consuming it and return the status.

nit: s/Peek/Peek at/, or s/Peek/Read/

> + * The next call to 'packet_reader_read()' will perform a read of the same line
> + * that was peeked, consuming the line.

nit: s/peeked/peeked at/

> + *
> + * Peeking multiple times without calling 'packet_reader_read()' will return
> + * the same result.
> + */
> +extern enum packet_read_status packet_reader_peek(struct packet_reader *reader);

Nice.

[...]
> --- a/pkt-line.c
> +++ b/pkt-line.c
> @@ -406,3 +406,62 @@ ssize_t read_packetized_to_strbuf(int fd_in, struct strbuf *sb_out)
>  	}
>  	return sb_out->len - orig_len;
>  }
> +
> +/* Packet Reader Functions */
> +void packet_reader_init(struct packet_reader *reader, int fd,
> +			char *src_buffer, size_t src_len,
> +			int options)

This comment looks like it's attached to packet_reader_init, but it's
meant to be attached to the whole collection.  It's possible that this
title-above-multiple-functions won't be maintained, but that's okay.

> +{
> +	memset(reader, 0, sizeof(*reader));
> +
> +	reader->fd = fd;
> +	reader->src_buffer = src_buffer;
> +	reader->src_len = src_len;
> +	reader->buffer = packet_buffer;
> +	reader->buffer_size = sizeof(packet_buffer);

Looks like this is very non-reentrant.  Can the doc comment warn about
that?  Or even better, can it be made reentrant by owning its own
strbuf?
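
To make the concern concrete, a small sketch with hypothetical names:
two readers initialized against one static buffer (standing in for
packet_buffer) clobber each other's 'line', while readers that are handed
their own storage do not.

```c
#include <assert.h>
#include <string.h>

/* Stands in for a process-wide buffer such as packet_buffer. */
static char shared_buffer[64];

struct reader {
	char *buffer;
	size_t buffer_size;
	const char *line;
};

static void reader_init_shared(struct reader *r)
{
	r->buffer = shared_buffer;
	r->buffer_size = sizeof(shared_buffer);
	r->line = NULL;
}

static void reader_init_owned(struct reader *r, char *buf, size_t size)
{
	r->buffer = buf;
	r->buffer_size = size;
	r->line = NULL;
}

/* Pretend to read a line: copy 'text' into the reader's buffer. */
static void reader_load(struct reader *r, const char *text)
{
	assert(r->buffer_size > 0);
	strncpy(r->buffer, text, r->buffer_size - 1);
	r->buffer[r->buffer_size - 1] = '\0';
	r->line = r->buffer;
}
```

With the shared initializer, loading "second" into one reader changes what
the other reader's 'line' points at; with owned buffers each line survives.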

> +	reader->options = options;
> +}
> +
> +enum packet_read_status packet_reader_read(struct packet_reader *reader)
> +{
> +	if (reader->line_peeked) {
> +		reader->line_peeked = 0;
> +		return reader->status;
> +	}

Nice.

> +
> +	reader->status = packet_read_with_status(reader->fd,
> +						 &reader->src_buffer,
> +						 &reader->src_len,
> +						 reader->buffer,
> +						 reader->buffer_size,
> +						 &reader->pktlen,
> +						 reader->options);
> +
> +	switch (reader->status) {
> +	case PACKET_READ_EOF:
> +		reader->pktlen = -1;
> +		reader->line = NULL;
> +		break;
> +	case PACKET_READ_NORMAL:
> +		reader->line = reader->buffer;
> +		break;
> +	case PACKET_READ_FLUSH:
> +		reader->pktlen = 0;
> +		reader->line = NULL;
> +		break;
> +	}
> +
> +	return reader->status;
> +}
> +
> +enum packet_read_status packet_reader_peek(struct packet_reader *reader)
> +{
> +	/* Only allow peeking a single line */

nit: s/peeking/peeking at/

> +	if (reader->line_peeked)
> +		return reader->status;
> +
> +	/* Peek a line by reading it and setting peeked flag */

nit: s/Peek/Peek at/

> +	packet_reader_read(reader);
> +	reader->line_peeked = 1;
> +	return reader->status;
> +}

Thanks for a pleasant read.

Jonathan


* Re: [PATCH v3 22/35] upload-pack: support shallow requests
  2018-02-07 19:00       ` Stefan Beller
  2018-02-10 10:23         ` Duy Nguyen
@ 2018-02-13 17:06         ` Brandon Williams
  1 sibling, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-02-13 17:06 UTC (permalink / raw)
  To: Stefan Beller
  Cc: git, Jeff King, Junio C Hamano, Jonathan Nieder, Derrick Stolee,
	Jeff Hostetler, Duy Nguyen

On 02/07, Stefan Beller wrote:
> On Tue, Feb 6, 2018 at 5:12 PM, Brandon Williams <bmwill@google.com> wrote:
> > Add the 'shallow' feature to the protocol version 2 command 'fetch'
> > which indicates that the server supports shallow clients and deepen
> > requests.
> >
> > Signed-off-by: Brandon Williams <bmwill@google.com>
> > ---
> >  Documentation/technical/protocol-v2.txt |  67 +++++++++++++++-
> >  serve.c                                 |   2 +-
> >  t/t5701-git-serve.sh                    |   2 +-
> >  upload-pack.c                           | 138 +++++++++++++++++++++++---------
> >  upload-pack.h                           |   3 +
> >  5 files changed, 173 insertions(+), 39 deletions(-)
> >
> > diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
> > index 4d5096dae..fedeb6b77 100644
> > --- a/Documentation/technical/protocol-v2.txt
> > +++ b/Documentation/technical/protocol-v2.txt
> > @@ -201,12 +201,42 @@ packet-lines:
> >         to its base by position in pack rather than by an oid.  That is,
> >         they can read OBJ_OFS_DELTA (aka type 6) in a packfile.
> >
> > +    shallow <oid>
> > +       A client must notify the server of all objects for which it only
> 
> s/all objects/all commits/ for preciseness
> 
> > +       has shallow copies of (meaning that it doesn't have the parents
> > +       of a commit) by supplying a 'shallow <oid>' line for each such
> > +       object so that the server is aware of the limitations of the
> > +       client's history.
> > +
> > +    deepen <depth>
> > +       Request that the fetch/clone should be shallow having a commit depth of
> > +       <depth> relative to the remote side.
> 
> What does depth mean? number of commits, or number of edges?
> Are there any special numbers (-1, 0, 1, max int) ?
> 
> From reading ahead: "Cannot be used with deepen-since, but
> can be combined with deepen-relative" ?

It just uses the current logic, which has no documentation on any of
that so...I'm not really sure?

> 
> 
> > +
> > +    deepen-relative
> > +       Requests that the semantics of the "deepen" command be changed
> > +       to indicate that the depth requested is relative to the client's
> > +       current shallow boundary, instead of relative to the remote
> > +       refs.
> > +
> > +    deepen-since <timestamp>
> > +       Requests that the shallow clone/fetch should be cut at a
> > +       specific time, instead of depth.  Internally it's equivalent of
> > +       doing "rev-list --max-age=<timestamp>". Cannot be used with
> > +       "deepen".
> > +
> > +    deepen-not <rev>
> > +       Requests that the shallow clone/fetch should be cut at a
> > +       specific revision specified by '<rev>', instead of a depth.
> > +       Internally it's equivalent of doing "rev-list --not <rev>".
> > +       Cannot be used with "deepen", but can be used with
> > +       "deepen-since".
> 
> What happens if those are given in combination?

Should act as an AND, it uses the old logic and there isn't very much
documentation on that...
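
Going only by the wording in the quoted documentation ("deepen" cannot be
used with "deepen-since" or "deepen-not", while those two can be used
together), the exclusion rule could be sketched as below.  The struct and
function are hypothetical, and this says nothing about what the existing
upload-pack logic actually enforces.

```c
#include <assert.h>

/* Hypothetical summary of the shallow-related lines seen in a request. */
struct shallow_args {
	int depth;		/* from "deepen <depth>", 0 if absent */
	long deepen_since;	/* from "deepen-since <timestamp>", 0 if absent */
	int deepen_not_nr;	/* number of "deepen-not <rev>" lines */
};

/* Return 1 if the combination is allowed by the documented wording. */
static int shallow_args_valid(const struct shallow_args *args)
{
	int by_depth = args->depth > 0;
	int by_rev_list = args->deepen_since != 0 || args->deepen_not_nr > 0;

	/* "deepen" excludes both "deepen-since" and "deepen-not". */
	return !(by_depth && by_rev_list);
}
```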

-- 
Brandon Williams


* Re: [PATCH v3 00/35] protocol version 2
  2018-02-07  1:12   ` [PATCH v3 00/35] " Brandon Williams
                       ` (35 preceding siblings ...)
  2018-02-12 14:50     ` [PATCH v3 00/35] protocol version 2 Derrick Stolee
@ 2018-02-21 20:01     ` Brandon Williams
  2018-02-28 23:22     ` [PATCH v4 " Brandon Williams
  37 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-02-21 20:01 UTC (permalink / raw)
  To: git; +Cc: sbeller, peff, gitster, jrnieder, stolee, git, pclouds

On 02/06, Brandon Williams wrote:
> Changes in v3:
>  * There were some comments about how the protocol should be designed
>    stateless first.  I've made this change and instead of having to
>    supply the `stateless-rpc=true` capability to force stateless
>    behavior, the protocol just requires all commands to be stateless.
>  
>  * Added some patches towards the end of the series to force the client
>    to not request to use protocol v2 when pushing (even if configured to
>    use v2).  This is to ease the roll-out process of a push command in
>    protocol v2.  This way when servers gain the ability to accept
>    pushing in v2 (and they start responding using v2 when requests are
>    sent to the git-receive-pack endpoint) that clients who still don't
>    understand how to push using v2 won't request to use v2 and then die
>    when they recognize that the server does indeed know how to accept a
>    push under v2.
> 
>  * I implemented the `shallow` feature for fetch.  This feature
>    encapsulates the existing functionality of all the shallow/deepen
>    capabilities in v0.  So now a server can process shallow requests.
> 
>  * Various other small tweaks that I can't remember :)
> 
> After all of that I think the series is in a pretty good state, barring
> any more critical reviewing feedback.
> 
> Thanks!

I'm hoping to get some more in depth review before I do any more
re-rolls, but for those interested I will need to do a re-roll to
eliminate the prelude from the http transport.  This is the prelude
which includes the service line followed by any number of packet lines
culminating in a flush-pkt like so:

  # service=git-upload-pack
  some
  other
  optional
  lines
  0000

With this eliminated all transports will be exactly the same, the only
difference will be how the protocol is tunneled.
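
For reference, skipping such a prelude on the client side amounts to
walking pkt-lines until the first flush-pkt.  A rough sketch over an
in-memory response follows; the function names are made up, and it assumes
no delim-pkt appears inside the prelude.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Parse a 4-digit hex pkt-line length; -1 on a non-hex character. */
static int hex4(const char *s)
{
	int i, val = 0;

	for (i = 0; i < 4; i++) {
		char c = s[i];
		int d;

		if (c >= '0' && c <= '9')
			d = c - '0';
		else if (c >= 'a' && c <= 'f')
			d = c - 'a' + 10;
		else if (c >= 'A' && c <= 'F')
			d = c - 'A' + 10;
		else
			return -1;
		val = (val << 4) | d;
	}
	return val;
}

/*
 * Return the offset just past the first flush-pkt ("0000"), or -1 if
 * the buffer is malformed or contains no flush-pkt.
 */
static long skip_prelude(const char *buf, size_t len)
{
	size_t pos = 0;

	while (len - pos >= 4) {
		int n = hex4(buf + pos);

		if (n < 0)
			return -1;
		if (n == 0)
			return (long)(pos + 4);
		if (n < 4 || (size_t)n > len - pos)
			return -1;
		pos += n;
	}
	return -1;
}
```

Once the prelude is gone from the HTTP transport, a helper like this
becomes unnecessary, which is the simplification being proposed.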

> 
> Brandon Williams (35):
>   pkt-line: introduce packet_read_with_status
>   pkt-line: introduce struct packet_reader
>   pkt-line: add delim packet support
>   upload-pack: convert to a builtin
>   upload-pack: factor out processing lines
>   transport: use get_refs_via_connect to get refs
>   connect: convert get_remote_heads to use struct packet_reader
>   connect: discover protocol version outside of get_remote_heads
>   transport: store protocol version
>   protocol: introduce enum protocol_version value protocol_v2
>   test-pkt-line: introduce a packet-line test helper
>   serve: introduce git-serve
>   ls-refs: introduce ls-refs server command
>   connect: request remote refs using v2
>   transport: convert get_refs_list to take a list of ref patterns
>   transport: convert transport_get_remote_refs to take a list of ref
>     patterns
>   ls-remote: pass ref patterns when requesting a remote's refs
>   fetch: pass ref patterns when fetching
>   push: pass ref patterns when pushing
>   upload-pack: introduce fetch server command
>   fetch-pack: perform a fetch using v2
>   upload-pack: support shallow requests
>   fetch-pack: support shallow requests
>   connect: refactor git_connect to only get the protocol version once
>   connect: don't request v2 when pushing
>   transport-helper: remove name parameter
>   transport-helper: refactor process_connect_service
>   transport-helper: introduce stateless-connect
>   pkt-line: add packet_buf_write_len function
>   remote-curl: create copy of the service name
>   remote-curl: store the protocol version the server responded with
>   http: allow providing extra headers for http requests
>   http: don't always add Git-Protocol header
>   remote-curl: implement stateless-connect command
>   remote-curl: don't request v2 when pushing
> 
>  .gitignore                              |   1 +
>  Documentation/technical/protocol-v2.txt | 338 +++++++++++++++++
>  Makefile                                |   7 +-
>  builtin.h                               |   2 +
>  builtin/clone.c                         |   2 +-
>  builtin/fetch-pack.c                    |  21 +-
>  builtin/fetch.c                         |  14 +-
>  builtin/ls-remote.c                     |   7 +-
>  builtin/receive-pack.c                  |   6 +
>  builtin/remote.c                        |   2 +-
>  builtin/send-pack.c                     |  20 +-
>  builtin/serve.c                         |  30 ++
>  builtin/upload-pack.c                   |  74 ++++
>  connect.c                               | 352 +++++++++++++-----
>  connect.h                               |   7 +
>  fetch-pack.c                            | 319 +++++++++++++++-
>  fetch-pack.h                            |   4 +-
>  git.c                                   |   2 +
>  http.c                                  |  25 +-
>  http.h                                  |   2 +
>  ls-refs.c                               |  96 +++++
>  ls-refs.h                               |   9 +
>  pkt-line.c                              | 149 +++++++-
>  pkt-line.h                              |  77 ++++
>  protocol.c                              |   2 +
>  protocol.h                              |   1 +
>  remote-curl.c                           | 257 ++++++++++++-
>  remote.h                                |   9 +-
>  serve.c                                 | 260 +++++++++++++
>  serve.h                                 |  15 +
>  t/helper/test-pkt-line.c                |  64 ++++
>  t/t5701-git-serve.sh                    | 176 +++++++++
>  t/t5702-protocol-v2.sh                  | 239 ++++++++++++
>  transport-helper.c                      |  84 +++--
>  transport-internal.h                    |   4 +-
>  transport.c                             | 116 ++++--
>  transport.h                             |   9 +-
>  upload-pack.c                           | 625 ++++++++++++++++++++++++--------
>  upload-pack.h                           |  21 ++
>  39 files changed, 3088 insertions(+), 360 deletions(-)
>  create mode 100644 Documentation/technical/protocol-v2.txt
>  create mode 100644 builtin/serve.c
>  create mode 100644 builtin/upload-pack.c
>  create mode 100644 ls-refs.c
>  create mode 100644 ls-refs.h
>  create mode 100644 serve.c
>  create mode 100644 serve.h
>  create mode 100644 t/helper/test-pkt-line.c
>  create mode 100755 t/t5701-git-serve.sh
>  create mode 100755 t/t5702-protocol-v2.sh
>  create mode 100644 upload-pack.h
> 
> -- 
> 2.16.0.rc1.238.g530d649a79-goog
> 

-- 
Brandon Williams


* Re: [PATCH v3 04/35] upload-pack: convert to a builtin
  2018-02-07  1:12     ` [PATCH v3 04/35] upload-pack: convert to a builtin Brandon Williams
@ 2018-02-21 21:44       ` Jonathan Tan
  2018-02-22  9:58         ` Jeff King
  0 siblings, 1 reply; 362+ messages in thread
From: Jonathan Tan @ 2018-02-21 21:44 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, sbeller, peff, gitster, jrnieder, stolee, git, pclouds

On Tue,  6 Feb 2018 17:12:41 -0800
Brandon Williams <bmwill@google.com> wrote:

> In order to allow for code sharing with the server-side of fetch in
> protocol-v2 convert upload-pack to be a builtin.
> 
> Signed-off-by: Brandon Williams <bmwill@google.com>

As Stefan mentioned in [1], also mention in the commit message that this
means that the "git-upload-pack" invocation gains additional
capabilities (for example, invoking a pager for --help).

Having said that, the main purpose of this patch seems to be to libify
upload-pack, and the move to builtin is just a way of putting the
program somewhere - we could have easily renamed upload-pack.c and
created a new upload-pack.c containing the main(), preserving the
non-builtin-ness of upload-pack, while still gaining the benefits of
libifying upload-pack.

If the community does want to make upload-pack a builtin, I would write
the commit message this way:

  upload-pack: libify

  Libify upload-pack. The main() function is moved to
  builtin/upload-pack.c, thus making upload-pack a builtin. Note that
  this means that "git-upload-pack" gains functionality such as the
  ability to invoke a pager when passed "--help".

And if not:

  upload-pack: libify

  Libify upload-pack by moving most of the functionality in
  upload-pack.c into a file upload-pack-lib.c (or some other name),
  to be used in subsequent patches.

[1] https://public-inbox.org/git/CAGZ79kb2=uU0_K8wr27gNdNX-T+P+7gVdgc5EBdYc3zBobsR8w@mail.gmail.com/

> -static void upload_pack(void)
> -{
> -	struct string_list symref = STRING_LIST_INIT_DUP;
> -
> -	head_ref_namespaced(find_symref, &symref);
> -
> -	if (advertise_refs || !stateless_rpc) {
> -		reset_timeout();
> -		head_ref_namespaced(send_ref, &symref);
> -		for_each_namespaced_ref(send_ref, &symref);
> -		advertise_shallow_grafts(1);
> -		packet_flush(1);
> -	} else {
> -		head_ref_namespaced(check_ref, NULL);
> -		for_each_namespaced_ref(check_ref, NULL);
> -	}
> -	string_list_clear(&symref, 1);
> -	if (advertise_refs)
> -		return;
> -
> -	receive_needs();
> -	if (want_obj.nr) {
> -		get_common_commits();
> -		create_pack_file();
> -	}
> -}

I see that this function had to be moved to the bottom because it now
also needs to make use of functions like upload_pack_config() - that's
fine.

> +struct upload_pack_options {
> +	int stateless_rpc;
> +	int advertise_refs;
> +	unsigned int timeout;
> +	int daemon_mode;
> +};

I would have expected "unsigned stateless_rpc : 1" etc., but I see that
this makes it easier to use with OPT_BOOL (which needs us to pass it a
pointer-to-int).

As for what existing code does, files like fetch-pack and diff use
"unsigned : 1", but they also process arguments without OPT_, so I don't
think they are relevant.

I think that we should decide if we're going to prefer "unsigned : 1" or
"int" for flags in new code. Personally, I prefer "unsigned : 1"
(despite the slight inconvenience in that argument parsers will need to
declare their own temporary "int" and then assign its contents to the
options struct) because of the stronger type, but I'm OK either way.
Whatever the decision, I don't think it needs to block the review of
this patch set.
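
To show the inconvenience I mean, here is a sketch with a stand-in parser
that, like OPT_BOOL, stores its result through a pointer-to-int.  All names
are hypothetical.  With plain int fields the address can be passed
directly; with bit-fields a temporary is needed because C does not allow
taking the address of a bit-field.

```c
#include <assert.h>
#include <string.h>

struct opts_int {
	int stateless_rpc;
	int advertise_refs;
};

struct opts_bits {
	unsigned stateless_rpc : 1;
	unsigned advertise_refs : 1;
};

/* Stand-in for an option parser that writes through an int pointer. */
static void parse_bool(const char *arg, const char *name, int *out)
{
	if (!strcmp(arg, name))
		*out = 1;
}

static void parse_into_int(struct opts_int *o, const char *arg)
{
	parse_bool(arg, "--stateless-rpc", &o->stateless_rpc);
	parse_bool(arg, "--advertise-refs", &o->advertise_refs);
}

static void parse_into_bits(struct opts_bits *o, const char *arg)
{
	/* "&o->stateless_rpc" would not compile, so go through temporaries. */
	int stateless_rpc = o->stateless_rpc;
	int advertise_refs = o->advertise_refs;

	parse_bool(arg, "--stateless-rpc", &stateless_rpc);
	parse_bool(arg, "--advertise-refs", &advertise_refs);

	o->stateless_rpc = stateless_rpc;
	o->advertise_refs = advertise_refs;
}
```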


* Re: [PATCH 04/26] upload-pack: convert to a builtin
  2018-01-03 20:39     ` Brandon Williams
@ 2018-02-21 21:47       ` Jonathan Nieder
  2018-02-21 23:35         ` Junio C Hamano
  0 siblings, 1 reply; 362+ messages in thread
From: Jonathan Nieder @ 2018-02-21 21:47 UTC (permalink / raw)
  To: Brandon Williams
  Cc: Stefan Beller, git, Junio C Hamano, Jeff King, Philip Oakley,
	Derrick Stolee, Sitaram Chamarty

Brandon Williams wrote:
> On 01/03, Stefan Beller wrote:
> > On Tue, Jan 2, 2018 at 4:18 PM, Brandon Williams <bmwill@google.com> wrote:

>>> In order to allow for code sharing with the server-side of fetch in
>>> protocol-v2 convert upload-pack to be a builtin.
>>
>> What is the security aspect of this patch?
>>
>> By making upload-pack builtin, it gains additional abilities,
>> such as answers to '-h' or '--help' (which would start a pager).
>> Is there an easy way to soothe my concerns? (best put into the
>> commit message)
>
> receive-pack is already a builtin, so there's that.

*nod*

Since v2.4.12~1^2 (shell: disallow repo names beginning with dash,
2017-04-29), git-shell refuses to pass --help to upload-pack, limiting
the security impact in configurations that use git-shell (e.g.
gitolite installations).

If you're not using git-shell, then hopefully you have some other form
of filtering preventing arbitrary options being passed to
git-upload-pack.  If you don't, then you're in trouble, for the
reasons described in that commit.

Since some installations may be allowing access to git-upload-pack
(for read-only access) without allowing access to git-receive-pack,
this does increase the chance of attack.  On the other hand, I suspect
the maintainability benefit is worth it.

For defense in depth, it would be comforting if the git wrapper had
some understanding of "don't support --help in handle_builtin when
invoked as a dashed command".  That is, I don't expect that anyone has
been relying on

	git-add --help

acting like

	git help add

instead of printing the usage message from

	git add -h

It's a little fussy because today we rewrite "git add --help" to
"git-add --help" before rewriting it to "git help add"; we'd have to
skip that middle hop for this to work.

I don't think that has to block this patch or series, though --- it's
just a separate thought about hardening.

Cc-ing Sitaram Chamarty since he tends to be wiser about this kind of
thing than I am.

What do you think?

Thanks,
Jonathan


* Re: [PATCH v3 08/35] connect: discover protocol version outside of get_remote_heads
  2018-02-07  1:12     ` [PATCH v3 08/35] connect: discover protocol version outside of get_remote_heads Brandon Williams
@ 2018-02-21 22:11       ` Jonathan Tan
  2018-02-22 18:17         ` Brandon Williams
  0 siblings, 1 reply; 362+ messages in thread
From: Jonathan Tan @ 2018-02-21 22:11 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, sbeller, peff, gitster, jrnieder, stolee, git, pclouds

On Tue,  6 Feb 2018 17:12:45 -0800
Brandon Williams <bmwill@google.com> wrote:

> -	get_remote_heads(fd[0], NULL, 0, &ref, 0, NULL, &shallow);
> +
> +	packet_reader_init(&reader, fd[0], NULL, 0,
> +			   PACKET_READ_CHOMP_NEWLINE |
> +			   PACKET_READ_GENTLE_ON_EOF);
> +
> +	switch (discover_version(&reader)) {
> +	case protocol_v1:
> +	case protocol_v0:
> +		get_remote_heads(&reader, &ref, 0, NULL, &shallow);
> +		break;
> +	case protocol_unknown_version:
> +		BUG("unknown protocol version");
> +	}

This inlining is repeated a few times, which raises the question: if the
intention is to keep the v0/1 logic separate from v2, why not have a
single function that wraps them all? Looking at the end result (after
all the patches in this patch set are applied), it seems that the v2
version does not have extra_have or shallow parameters, which is a good
enough reason for me (I don't think functions that take in many
arguments and then selectively use them are a good idea). I think that
other reviewers will have this question too, so maybe discuss this in
the commit message.

> diff --git a/remote.h b/remote.h
> index 1f6611be2..2016461df 100644
> --- a/remote.h
> +++ b/remote.h
> @@ -150,10 +150,11 @@ int check_ref_type(const struct ref *ref, int flags);
>  void free_refs(struct ref *ref);
>  
>  struct oid_array;
> -extern struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
> +struct packet_reader;
> +extern struct ref **get_remote_heads(struct packet_reader *reader,
>  				     struct ref **list, unsigned int flags,
>  				     struct oid_array *extra_have,
> -				     struct oid_array *shallow);
> +				     struct oid_array *shallow_points);

This change probably does not belong in this patch, especially since
remote.c is unchanged.


* Re: [PATCH v3 12/35] serve: introduce git-serve
  2018-02-07  1:12     ` [PATCH v3 12/35] serve: introduce git-serve Brandon Williams
@ 2018-02-21 22:45       ` Jonathan Tan
  2018-02-23 21:33         ` Brandon Williams
  2018-02-22  9:33       ` Jeff King
  1 sibling, 1 reply; 362+ messages in thread
From: Jonathan Tan @ 2018-02-21 22:45 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, sbeller, peff, gitster, jrnieder, stolee, git, pclouds

On Tue,  6 Feb 2018 17:12:49 -0800
Brandon Williams <bmwill@google.com> wrote:

>  .gitignore                              |   1 +
>  Documentation/technical/protocol-v2.txt | 114 +++++++++++++++
>  Makefile                                |   2 +
>  builtin.h                               |   1 +
>  builtin/serve.c                         |  30 ++++
>  git.c                                   |   1 +
>  serve.c                                 | 250 ++++++++++++++++++++++++++++++++
>  serve.h                                 |  15 ++
>  t/t5701-git-serve.sh                    |  60 ++++++++
>  9 files changed, 474 insertions(+)
>  create mode 100644 Documentation/technical/protocol-v2.txt
>  create mode 100644 builtin/serve.c
>  create mode 100644 serve.c
>  create mode 100644 serve.h
>  create mode 100755 t/t5701-git-serve.sh

As someone who is implementing the server side of protocol V2 in JGit, I
now have a bit more insight into this :-)

First of all, I used to not have a strong opinion on the existence of a
new endpoint, but now I think that it's better to *not* have git-serve.
As it is, as far as I can tell, upload-pack also needs to support (and
does support, as of the end of this patch set) protocol v2 anyway, so it
might be better to merely upgrade upload-pack.

> +A client then responds to select the command it wants with any particular
> +capabilities or arguments.  There is then an optional section where the
> +client can provide any command specific parameters or queries.
> +
> +    command-request = command
> +		      capability-list
> +		      (command-args)

If you are stating that this is optional, write "*1command-args". (RFC
5234 also supports square brackets, but "*1" is already used in
pack-protocol.txt and http-protocol.txt.)

> +		      flush-pkt
> +    command = PKT-LINE("command=" key LF)
> +    command-args = delim-pkt
> +		   *arg
> +    arg = 1*CHAR

arg should be wrapped in PKT-LINE, I think, and terminated by an LF.
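
To make the framing concrete, here is a minimal sketch of what wrapping an
arg in a pkt-line would look like, assuming the usual framing from
pack-protocol.txt (four hex digits giving the total length, including the
four length bytes themselves, then the payload). pkt_line_format() is a
hypothetical helper written for this illustration; it is not git's
packet_write_fmt().

```c
#include <stdio.h>
#include <string.h>

/*
 * Write one pkt-line into out: four lowercase hex digits giving the total
 * length (payload plus the four length bytes), followed by the payload.
 * Returns the number of bytes written, or -1 if it does not fit.
 * Hypothetical helper for illustration only.
 */
static int pkt_line_format(char *out, size_t outsz, const char *payload)
{
	size_t len = strlen(payload) + 4;

	/* 65520 is the total pkt-line size limit assumed here. */
	if (len > 65520 || len + 1 > outsz)
		return -1;
	snprintf(out, outsz, "%04x%s", (unsigned int)len, payload);
	return (int)len;
}
```

With this, "command=ls-refs" LF (16 bytes) would go out as
"0014command=ls-refs" LF, and each arg would be framed the same way.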

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v3 14/35] connect: request remote refs using v2
  2018-02-07  1:12     ` [PATCH v3 14/35] connect: request remote refs using v2 Brandon Williams
@ 2018-02-21 22:54       ` Jonathan Tan
  2018-02-22 18:19         ` Brandon Williams
  2018-02-27  6:51       ` Jonathan Nieder
  1 sibling, 1 reply; 362+ messages in thread
From: Jonathan Tan @ 2018-02-21 22:54 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, sbeller, peff, gitster, jrnieder, stolee, git, pclouds

On Tue,  6 Feb 2018 17:12:51 -0800
Brandon Williams <bmwill@google.com> wrote:

> +extern struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
> +				    struct ref **list, int for_push,
> +				    const struct argv_array *ref_patterns);

I haven't looked at the rest of this patch in detail, but the type of
ref_patterns is probably better as struct string_list, since this is not
a true argument array (e.g. with flags starting with --). Same comment
for the next few patches that deal with ref patterns.

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v3 15/35] transport: convert get_refs_list to take a list of ref patterns
  2018-02-07  1:12     ` [PATCH v3 15/35] transport: convert get_refs_list to take a list of ref patterns Brandon Williams
@ 2018-02-21 22:56       ` Jonathan Tan
  2018-02-22 18:25         ` Brandon Williams
  0 siblings, 1 reply; 362+ messages in thread
From: Jonathan Tan @ 2018-02-21 22:56 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, sbeller, peff, gitster, jrnieder, stolee, git, pclouds

On Tue,  6 Feb 2018 17:12:52 -0800
Brandon Williams <bmwill@google.com> wrote:

> @@ -21,7 +22,8 @@ struct transport_vtable {
>  	 * the ref without a huge amount of effort, it should store it
>  	 * in the ref's old_sha1 field; otherwise it should be all 0.
>  	 **/
> -	struct ref *(*get_refs_list)(struct transport *transport, int for_push);
> +	struct ref *(*get_refs_list)(struct transport *transport, int for_push,
> +				     const struct argv_array *ref_patterns);

Also mention in the documentation that this function is allowed to
return refs that do not match the ref patterns.

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v3 16/35] transport: convert transport_get_remote_refs to take a list of ref patterns
  2018-02-07  1:12     ` [PATCH v3 16/35] transport: convert transport_get_remote_refs " Brandon Williams
@ 2018-02-21 22:58       ` Jonathan Tan
  2018-02-22 18:26         ` Brandon Williams
  0 siblings, 1 reply; 362+ messages in thread
From: Jonathan Tan @ 2018-02-21 22:58 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, sbeller, peff, gitster, jrnieder, stolee, git, pclouds

On Tue,  6 Feb 2018 17:12:53 -0800
Brandon Williams <bmwill@google.com> wrote:

> -const struct ref *transport_get_remote_refs(struct transport *transport)
> +const struct ref *transport_get_remote_refs(struct transport *transport,
> +					    const struct argv_array *ref_patterns)
>  {
>  	if (!transport->got_remote_refs) {
> -		transport->remote_refs = transport->vtable->get_refs_list(transport, 0, NULL);
> +		transport->remote_refs =
> +			transport->vtable->get_refs_list(transport, 0,
> +							 ref_patterns);
>  		transport->got_remote_refs = 1;
>  	}

Should we do our own client-side filtering if the server side cannot do
it for us (because it doesn't support protocol v2)? Either way, this
decision should be mentioned in the commit message.

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH 04/26] upload-pack: convert to a builtin
  2018-02-21 21:47       ` Jonathan Nieder
@ 2018-02-21 23:35         ` Junio C Hamano
  0 siblings, 0 replies; 362+ messages in thread
From: Junio C Hamano @ 2018-02-21 23:35 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: Brandon Williams, Stefan Beller, git, Jeff King, Philip Oakley,
	Derrick Stolee, Sitaram Chamarty

Jonathan Nieder <jrnieder@gmail.com> writes:

> For defense in depth, it would be comforting if the git wrapper had
> some understanding of "don't support --help in handle_builtin when
> invoked as a dashed command".  That is, I don't expect that anyone has
> been relying on
>
> 	git-add --help
>
> acting like
>
> 	git help add
>
> instead of printing the usage message from
>
> 	git add -h

Sounds like a neat trick.

> It's a little fussy because today we rewrite "git add --help" to
> "git-add --help" before rewriting it to "git help add"; we'd have to
> skip that middle hop for this to work.

I do not quite get this part.  "git add --help" goes through run_argv()
and then to handle_builtin() which is what does this "git help add"
swapping.

"git-add --help" does get thrown into the same codepath by
pretending as if we got "add --help" as an argument to "git"
command, and that happens without going through run_argv(),
so presumably we can add another parameter to handle_builtin()
so that the callee can tell these two invocation sites apart, no?

> I don't think that has to block this patch or series, though --- it's
> just a separate thought about hardening.

Yeah, I agree with this assessment.

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v3 20/35] upload-pack: introduce fetch server command
  2018-02-07  1:12     ` [PATCH v3 20/35] upload-pack: introduce fetch server command Brandon Williams
@ 2018-02-21 23:46       ` Jonathan Tan
  2018-02-22 18:48         ` Brandon Williams
  0 siblings, 1 reply; 362+ messages in thread
From: Jonathan Tan @ 2018-02-21 23:46 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, sbeller, peff, gitster, jrnieder, stolee, git, pclouds

On Tue,  6 Feb 2018 17:12:57 -0800
Brandon Williams <bmwill@google.com> wrote:

> +    want <oid>
> +	Indicates to the server an object which the client wants to
> +	retrieve.

Mention that the client can "want" anything even if not advertised by
the server (like uploadpack.allowanysha1inwant).

> +    output = *section
> +    section = (acknowledgments | packfile)
> +	      (flush-pkt | delim-pkt)
> +
> +    acknowledgments = PKT-LINE("acknowledgments" LF)
> +		      *(ready | nak | ack)

Can this part be described more precisely in the BNF section? I see that
you describe later that there can be multiple ACKs or one NAK (but not
both), and "ready" can be sent regardless of whether ACKs or a NAK is
sent.
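
As a rough illustration of the constraint I have in mind, here is a sketch
of a checker for the lines following the "acknowledgments" header. It
assumes two things that the quoted text does not state: that "ready", when
present, comes last, and that a section must contain either a NAK or at
least one ACK. acks_section_valid() is a made-up name.

```c
#include <string.h>

/*
 * Check the lines that follow the "acknowledgments" header (newlines already
 * chomped). Accepts either exactly one "NAK" or one or more "ACK <obj-id>"
 * lines, optionally followed by a single "ready" line.
 * Returns 1 if the sequence is valid under those assumptions, 0 otherwise.
 */
static int acks_section_valid(const char **lines, int nr)
{
	int i = 0, acks = 0;

	if (i < nr && !strcmp(lines[i], "NAK")) {
		i++;
	} else {
		while (i < nr && !strncmp(lines[i], "ACK ", 4)) {
			acks++;
			i++;
		}
		if (!acks)
			return 0;
	}
	if (i < nr && !strcmp(lines[i], "ready"))
		i++;
	return i == nr; /* anything left over is out of place */
}
```

In grammar terms that would be roughly (nak / 1*ack) followed by *1ready.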

> +    ready = PKT-LINE("ready" LF)
> +    nak = PKT-LINE("NAK" LF)
> +    ack = PKT-LINE("ACK" SP obj-id LF)
> +
> +    packfile = PKT-LINE("packfile" LF)
> +	       [PACKFILE]
> +
> +----
> +    acknowledgments section
> +	* Always begins with the section header "acknowledgments"
> +
> +	* The server will respond with "NAK" if none of the object ids sent
> +	  as have lines were common.
> +
> +	* The server will respond with "ACK obj-id" for all of the
> +	  object ids sent as have lines which are common.
> +
> +	* A response cannot have both "ACK" lines as well as a "NAK"
> +	  line.
> +
> +	* The server will respond with a "ready" line indicating that
> +	  the server has found an acceptable common base and is ready to
> +	  make and send a packfile (which will be found in the packfile
> +	  section of the same response)
> +
> +	* If the client determines that it is finished with negotiations
> +	  by sending a "done" line, the acknowledgments sections can be
> +	  omitted from the server's response as an optimization.

Should this be changed to "must"? The current implementation does not
support it (on the client side).

> +#define UPLOAD_PACK_DATA_INIT { OBJECT_ARRAY_INIT, OID_ARRAY_INIT, 0, 0, 0, 0, 0, 0 }

Optional: the trailing zeroes can be omitted. (That's shorter, and also
easier to maintain when we add new fields.)
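
A small sketch of why this is safe: in C, members left out of a
brace-enclosed initializer are implicitly zero-initialized. The struct
below is a hypothetical stand-in for upload_pack_data; its field names are
illustrative and not taken from the patch.

```c
#include <stddef.h>

/* Hypothetical stand-ins; field names are illustrative only. */
struct demo_array { void *items; size_t nr, alloc; };
#define DEMO_ARRAY_INIT { NULL, 0, 0 }

struct demo_data {
	struct demo_array wants;
	struct demo_array haves;
	unsigned stateless_rpc : 1;
	unsigned use_thin_pack : 1;
	int depth;
};

/* Only the leading members are spelled out; the rest default to zero. */
#define DEMO_DATA_INIT { DEMO_ARRAY_INIT, DEMO_ARRAY_INIT }

static struct demo_data demo_data_new(void)
{
	struct demo_data data = DEMO_DATA_INIT;
	return data;
}
```

Adding a new trailing field to the struct then needs no change to the
initializer macro.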

> +int upload_pack_v2(struct repository *r, struct argv_array *keys,
> +		   struct argv_array *args)
> +{
> +	enum fetch_state state = FETCH_PROCESS_ARGS;
> +	struct upload_pack_data data = UPLOAD_PACK_DATA_INIT;
> +	use_sideband = LARGE_PACKET_MAX;
> +
> +	while (state != FETCH_DONE) {
> +		switch (state) {
> +		case FETCH_PROCESS_ARGS:
> +			process_args(args, &data);
> +
> +			if (!want_obj.nr) {
> +				/*
> +				 * Request didn't contain any 'want' lines,
> +				 * guess they didn't want anything.
> +				 */
> +				state = FETCH_DONE;
> +			} else if (data.haves.nr) {
> +				/*
> +				 * Request had 'have' lines, so lets ACK them.
> +				 */
> +				state = FETCH_SEND_ACKS;
> +			} else {
> +				/*
> +				 * Request had 'want's but no 'have's so we can
> +				 * immedietly go to construct and send a pack.
> +				 */
> +				state = FETCH_SEND_PACK;
> +			}
> +			break;
> +		case FETCH_READ_HAVES:
> +			read_haves(&data);
> +			state = FETCH_SEND_ACKS;
> +			break;

This branch seems to never be taken?

> +		case FETCH_SEND_ACKS:
> +			if (process_haves_and_send_acks(&data))
> +				state = FETCH_SEND_PACK;
> +			else
> +				state = FETCH_DONE;
> +			break;
> +		case FETCH_SEND_PACK:
> +			packet_write_fmt(1, "packfile\n");
> +			create_pack_file();
> +			state = FETCH_DONE;
> +			break;
> +		case FETCH_DONE:
> +			continue;
> +		}
> +	}
> +
> +	upload_pack_data_clear(&data);
> +	return 0;
> +}

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v3 24/35] connect: refactor git_connect to only get the protocol version once
  2018-02-07  1:13     ` [PATCH v3 24/35] connect: refactor git_connect to only get the protocol version once Brandon Williams
@ 2018-02-21 23:51       ` Jonathan Tan
  0 siblings, 0 replies; 362+ messages in thread
From: Jonathan Tan @ 2018-02-21 23:51 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, sbeller, peff, gitster, jrnieder, stolee, git, pclouds

On Tue,  6 Feb 2018 17:13:01 -0800
Brandon Williams <bmwill@google.com> wrote:

> Instead of having each builtin transport asking for which protocol
> version the user has configured in 'protocol.version' by calling
> `get_protocol_version_config()` multiple times, factor this logic out
> so there is just a single call at the beginning of `git_connect()`.
> 
> This will be helpful in the next patch where we can have centralized
> logic which determines if we need to request a different protocol
> version than what the user has configured.
> 
> Signed-off-by: Brandon Williams <bmwill@google.com>

Reviewed-by: Jonathan Tan <jonathantanmy@google.com>

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v3 28/35] transport-helper: introduce stateless-connect
  2018-02-07  1:13     ` [PATCH v3 28/35] transport-helper: introduce stateless-connect Brandon Williams
@ 2018-02-22  0:01       ` Jonathan Tan
  2018-02-22 18:53         ` Brandon Williams
  2018-02-27 23:30       ` Jonathan Nieder
  1 sibling, 1 reply; 362+ messages in thread
From: Jonathan Tan @ 2018-02-22  0:01 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, sbeller, peff, gitster, jrnieder, stolee, git, pclouds

On Tue,  6 Feb 2018 17:13:05 -0800
Brandon Williams <bmwill@google.com> wrote:

> Introduce the transport-helper capability 'stateless-connect'.  This
> capability indicates that the transport-helper can be requested to run
> the 'stateless-connect' command which should attempt to make a
> stateless connection with a remote end.  Once established, the
> connection can be used by the git client to communicate with
> the remote end natively in a stateless-rpc manner as supported by
> protocol v2.  This means that the client must send everything the server
> needs in a single request as the client must not assume any
> state-storing on the part of the server or transport.

Maybe it's worth mentioning that support in the actual remote helpers
will be added in a subsequent patch.

> If a stateless connection cannot be established then the remote-helper
> will respond in the same manner as the 'connect' command indicating that
> the client should fallback to using the dumb remote-helper commands.

This makes sense, but there doesn't seem to be any code in this patch
that implements this.

> @@ -612,6 +615,11 @@ static int process_connect_service(struct transport *transport,
>  	if (data->connect) {
>  		strbuf_addf(&cmdbuf, "connect %s\n", name);
>  		ret = run_connect(transport, &cmdbuf);
> +	} else if (data->stateless_connect) {
> +		strbuf_addf(&cmdbuf, "stateless-connect %s\n", name);
> +		ret = run_connect(transport, &cmdbuf);
> +		if (ret)
> +			transport->stateless_rpc = 1;

Why is process_connect_service() falling back to stateless_connect if
connect doesn't work? I don't think this fallback would work, as a
client that needs "connect" might need its full capabilities.

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v3 30/35] remote-curl: create copy of the service name
  2018-02-07  1:13     ` [PATCH v3 30/35] remote-curl: create copy of the service name Brandon Williams
@ 2018-02-22  0:06       ` Jonathan Tan
  2018-02-22 18:56         ` Brandon Williams
  0 siblings, 1 reply; 362+ messages in thread
From: Jonathan Tan @ 2018-02-22  0:06 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, sbeller, peff, gitster, jrnieder, stolee, git, pclouds

On Tue,  6 Feb 2018 17:13:07 -0800
Brandon Williams <bmwill@google.com> wrote:

> Make a copy of the service name being requested instead of relying on
> the buffer pointed to by the passed in 'const char *' to remain
> unchanged.
> 
> Signed-off-by: Brandon Williams <bmwill@google.com>

Probably worth mentioning in the commit message:

  Currently, all service names are string constants, but a subsequent
  patch will introduce service names from external sources.

Other than that,

Reviewed-by: Jonathan Tan <jonathantanmy@google.com>

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v3 32/35] http: allow providing extra headers for http requests
  2018-02-07  1:13     ` [PATCH v3 32/35] http: allow providing extra headers for http requests Brandon Williams
@ 2018-02-22  0:09       ` Jonathan Tan
  2018-02-22 18:58         ` Brandon Williams
  0 siblings, 1 reply; 362+ messages in thread
From: Jonathan Tan @ 2018-02-22  0:09 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, sbeller, peff, gitster, jrnieder, stolee, git, pclouds

On Tue,  6 Feb 2018 17:13:09 -0800
Brandon Williams <bmwill@google.com> wrote:

> @@ -172,6 +172,8 @@ struct http_get_options {
>  	 * for details.
>  	 */
>  	struct strbuf *base_url;
> +
> +	struct string_list *extra_headers;

Document this? For example:

  If not NULL, additional HTTP headers to be sent with the request. The
  strings in the list must not be freed until after the request.

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v3 35/35] remote-curl: don't request v2 when pushing
  2018-02-07  1:13     ` [PATCH v3 35/35] remote-curl: don't request v2 when pushing Brandon Williams
@ 2018-02-22  0:12       ` Jonathan Tan
  2018-02-22 18:59         ` Brandon Williams
  0 siblings, 1 reply; 362+ messages in thread
From: Jonathan Tan @ 2018-02-22  0:12 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, sbeller, peff, gitster, jrnieder, stolee, git, pclouds

On Tue,  6 Feb 2018 17:13:12 -0800
Brandon Williams <bmwill@google.com> wrote:

> +test_expect_success 'push with http:// and a config of v2 does not request v2' '
> +	# Till v2 for push is designed, make sure that if a client has
> +	# protocol.version configured to use v2, that the client instead falls
> +	# back and uses v0.
> +
> +	test_commit -C http_child three &&
> +
> +	# Push to another branch, as the target repository has the
> +	# master branch checked out and we cannot push into it.
> +	GIT_TRACE_PACKET=1 git -C http_child -c protocol.version=1 \
> +		push origin HEAD:client_branch && 2>log &&

Should it be protocol.version=2? Also, two double ampersands?

Also, optionally, it might be better to do
GIT_TRACE_PACKET="$(pwd)/log", so that it does not get mixed with other
stderr output.

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v3 12/35] serve: introduce git-serve
  2018-02-07  1:12     ` [PATCH v3 12/35] serve: introduce git-serve Brandon Williams
  2018-02-21 22:45       ` Jonathan Tan
@ 2018-02-22  9:33       ` Jeff King
  2018-02-23 21:45         ` Brandon Williams
  1 sibling, 1 reply; 362+ messages in thread
From: Jeff King @ 2018-02-22  9:33 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, sbeller, gitster, jrnieder, stolee, git, pclouds

On Tue, Feb 06, 2018 at 05:12:49PM -0800, Brandon Williams wrote:

> +In protocol v2 communication is command oriented.  When first contacting a
> +server a list of capabilities will advertised.  Some of these capabilities
> +will be commands which a client can request be executed.  Once a command
> +has completed, a client can reuse the connection and request that other
> +commands be executed.

If I understand this correctly, we'll potentially have a lot more
round-trips between the client and server (one per "command"). And for
git-over-http, each one will be its own HTTP request?

We've traditionally tried to minimize HTTP requests, but I guess it's
not too bad if we can keep the connection open in most cases. Then we
just suffer some extra framing bytes, but we don't have to re-establish
the TCP connection each time.

I do wonder if the extra round trips will be noticeable in high-latency
conditions. E.g., if I'm 200ms away, converting the current
ref-advertisement spew to "capabilities, then the client asks for refs,
then we spew the refs" is going to cost an extra 200ms, even if the
fetch just ends up being a noop. I'm not sure how bad that is in the
grand scheme of things (after all, the TCP handshake involves some
round-trips, too).

> + Capability Advertisement
> +--------------------------
> +
> +A server which decides to communicate (based on a request from a client)
> +using protocol version 2, notifies the client by sending a version string
> +in its initial response followed by an advertisement of its capabilities.
> +Each capability is a key with an optional value.  Clients must ignore all
> +unknown keys.  Semantics of unknown values are left to the definition of
> +each key.  Some capabilities will describe commands which can be requested
> +to be executed by the client.
> +
> +    capability-advertisement = protocol-version
> +			       capability-list
> +			       flush-pkt
> +
> +    protocol-version = PKT-LINE("version 2" LF)
> +    capability-list = *capability
> +    capability = PKT-LINE(key[=value] LF)
> +
> +    key = 1*CHAR
> +    value = 1*CHAR
> +    CHAR = 1*(ALPHA / DIGIT / "-" / "_")
> +
> +A client then responds to select the command it wants with any particular
> +capabilities or arguments.  There is then an optional section where the
> +client can provide any command specific parameters or queries.
> +
> +    command-request = command
> +		      capability-list
> +		      (command-args)
> +		      flush-pkt
> +    command = PKT-LINE("command=" key LF)
> +    command-args = delim-pkt
> +		   *arg
> +    arg = 1*CHAR

For a single stateful TCP connection like git:// or git-over-ssh, the
client would get the capabilities once and then issue a series of
commands. For git-over-http, how does it work?

The client speaks first in HTTP, so we'd first make a request to get
just the capabilities from the server? And then proceed from there with
a series of requests, assuming that the capabilities for each server we
subsequently contact are the same? That's probably reasonable (and
certainly the existing http protocol makes that capabilities
assumption).

I don't see any documentation on how this all works with http. But
reading patch 34, it looks like we just do the usual
service=git-upload-pack request (with the magic request for v2), and
then the server would send us capabilities. Which follows my line of
thinking in the paragraph above.

-Peff

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v3 13/35] ls-refs: introduce ls-refs server command
  2018-02-07  1:12     ` [PATCH v3 13/35] ls-refs: introduce ls-refs server command Brandon Williams
@ 2018-02-22  9:48       ` Jeff King
  2018-02-23  0:45         ` Brandon Williams
  0 siblings, 1 reply; 362+ messages in thread
From: Jeff King @ 2018-02-22  9:48 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, sbeller, gitster, jrnieder, stolee, git, pclouds

On Tue, Feb 06, 2018 at 05:12:50PM -0800, Brandon Williams wrote:

> +ls-refs takes in the following parameters wrapped in packet-lines:
> +
> +    symrefs
> +	In addition to the object pointed by it, show the underlying ref
> +	pointed by it when showing a symbolic ref.
> +    peel
> +	Show peeled tags.
> +    ref-pattern <pattern>
> +	When specified, only references matching the one of the provided
> +	patterns are displayed.

How do we match those patterns? That's probably an important thing to
include in the spec.

Looking at the code, I see:

> +/*
> + * Check if one of the patterns matches the tail part of the ref.
> + * If no patterns were provided, all refs match.
> + */
> +static int ref_match(const struct argv_array *patterns, const char *refname)

This kind of tail matching can't quite implement all of the current
behavior. Because we actually do the normal dwim_ref() matching, which
includes stuff like "refs/remotes/%s/HEAD".

The other problem with tail-matching is that it's inefficient on the
server. Ideally we could get a request for "master" and only look up
refs/heads/master, refs/tags/master, etc. And if there are 50,000 refs
in refs/pull, we wouldn't have to process those at all. Of course this
is no worse than the current code, which not only looks at each ref but
actually _sends_ it. But it would be nice if we could fix this.

There's some more discussion in this old thread:

  https://public-inbox.org/git/20161024132932.i42rqn2vlpocqmkq@sigill.intra.peff.net/

> +{
> +	char *pathbuf;
> +	int i;
> +
> +	if (!patterns->argc)
> +		return 1; /* no restriction */
> +
> +	pathbuf = xstrfmt("/%s", refname);
> +	for (i = 0; i < patterns->argc; i++) {
> +		if (!wildmatch(patterns->argv[i], pathbuf, 0)) {
> +			free(pathbuf);
> +			return 1;
> +		}
> +	}
> +	free(pathbuf);
> +	return 0;
> +}

Does the client have to be aware that we're using wildmatch? I think
they'd need "refs/heads/**" to actually implement what we usually
specify in refspecs as "refs/heads/*". Or does the lack of WM_PATHNAME
make this work with just "*"?

Do we anticipate that the client would left-anchor the refspec like
"/refs/heads/*" so that in theory the server could avoid looking outside
of /refs/heads/?
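
To show the kind of difference I mean, here is a sketch of the quoted
ref_match() logic with POSIX fnmatch() standing in for wildmatch(). That
substitution is an assumption made purely for illustration: fnmatch() has
no "**", and wildmatch() may differ in other details. ref_match_sketch()
and its fixed-size buffer are made up for this example.

```c
#include <fnmatch.h>
#include <stdio.h>

/*
 * Prefix the refname with "/" and try each pattern, as the quoted code does.
 * fnmatch() is a stand-in for wildmatch(); flags is 0 or FNM_PATHNAME.
 * Returns 1 on a match (or when there are no patterns), 0 otherwise.
 */
static int ref_match_sketch(const char **patterns, int nr,
			    const char *refname, int flags)
{
	char pathbuf[256];
	int i, n;

	if (!nr)
		return 1; /* no restriction */

	n = snprintf(pathbuf, sizeof(pathbuf), "/%s", refname);
	if (n < 0 || (size_t)n >= sizeof(pathbuf))
		return 0; /* refname too long for this sketch */

	for (i = 0; i < nr; i++)
		if (!fnmatch(patterns[i], pathbuf, flags))
			return 1;
	return 0;
}
```

With flags of 0, "/refs/heads/*" matches "refs/heads/topic/sub" because
"*" is allowed to cross a slash; with FNM_PATHNAME it does not. If
wildmatch() without WM_PATHNAME behaves the same way, a plain "*" would be
enough for the client.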

-Peff

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v3 04/35] upload-pack: convert to a builtin
  2018-02-21 21:44       ` Jonathan Tan
@ 2018-02-22  9:58         ` Jeff King
  2018-02-22 18:07           ` Brandon Williams
  0 siblings, 1 reply; 362+ messages in thread
From: Jeff King @ 2018-02-22  9:58 UTC (permalink / raw)
  To: Jonathan Tan
  Cc: Brandon Williams, git, sbeller, gitster, jrnieder, stolee, git, pclouds

On Wed, Feb 21, 2018 at 01:44:22PM -0800, Jonathan Tan wrote:

> On Tue,  6 Feb 2018 17:12:41 -0800
> Brandon Williams <bmwill@google.com> wrote:
> 
> > In order to allow for code sharing with the server-side of fetch in
> > protocol-v2 convert upload-pack to be a builtin.
> > 
> > Signed-off-by: Brandon Williams <bmwill@google.com>
> 
> As Stefan mentioned in [1], also mention in the commit message that this
> means that the "git-upload-pack" invocation gains additional
> capabilities (for example, invoking a pager for --help).

And possibly respecting pager.upload-pack, which would violate our rule
that it is safe to run upload-pack in untrusted repositories.

(This actually doesn't work right now because pager.* is broken for
builtins that don't specify RUN_SETUP; but I think with the fixes last
year to the config code, we can now drop that restriction).

Obviously we can work around this with an extra RUN_NO_PAGER_CONFIG
flag. But I think it points to a general danger in making upload-pack a
builtin. I'm not sure what other features it would want to avoid (or
what might grow in the future).

> Having said that, the main purpose of this patch seems to be to libify
> upload-pack, and the move to builtin is just a way of putting the
> program somewhere - we could have easily renamed upload-pack.c and
> created a new upload-pack.c containing the main(), preserving the
> non-builtin-ness of upload-pack, while still gaining the benefits of
> libifying upload-pack.

Yeah, this seems like a better route to me.

-Peff

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v3 04/35] upload-pack: convert to a builtin
  2018-02-22  9:58         ` Jeff King
@ 2018-02-22 18:07           ` Brandon Williams
  2018-02-22 18:14             ` Jeff King
  0 siblings, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-02-22 18:07 UTC (permalink / raw)
  To: Jeff King
  Cc: Jonathan Tan, git, sbeller, gitster, jrnieder, stolee, git, pclouds

On 02/22, Jeff King wrote:
> On Wed, Feb 21, 2018 at 01:44:22PM -0800, Jonathan Tan wrote:
> 
> > On Tue,  6 Feb 2018 17:12:41 -0800
> > Brandon Williams <bmwill@google.com> wrote:
> > 
> > > In order to allow for code sharing with the server-side of fetch in
> > > protocol-v2 convert upload-pack to be a builtin.
> > > 
> > > Signed-off-by: Brandon Williams <bmwill@google.com>
> > 
> > As Stefan mentioned in [1], also mention in the commit message that this
> > means that the "git-upload-pack" invocation gains additional
> > capabilities (for example, invoking a pager for --help).
> 
> And possibly respecting pager.upload-pack, which would violate our rule
> that it is safe to run upload-pack in untrusted repositories.

And this isn't an issue with receive-pack because this same guarantee
doesn't exist?

> 
> (This actually doesn't work right now because pager.* is broken for
> builtins that don't specify RUN_SETUP; but I think with the fixes last
> year to the config code, we can now drop that restriction).
> 
> Obviously we can work around this with an extra RUN_NO_PAGER_CONFIG
> flag. But I think it points to a general danger in making upload-pack a
> builtin. I'm not sure what other features it would want to avoid (or
> what might grow in the future).
> 
> > Having said that, the main purpose of this patch seems to be to libify
> > upload-pack, and the move to builtin is just a way of putting the
> > program somewhere - we could have easily renamed upload-pack.c and
> > created a new upload-pack.c containing the main(), preserving the
> > non-builtin-ness of upload-pack, while still gaining the benefits of
> > libifying upload-pack.
> 
> Yeah, this seems like a better route to me.
> 
> -Peff

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v3 04/35] upload-pack: convert to a builtin
  2018-02-22 18:07           ` Brandon Williams
@ 2018-02-22 18:14             ` Jeff King
  2018-02-22 19:38               ` Jonathan Nieder
  0 siblings, 1 reply; 362+ messages in thread
From: Jeff King @ 2018-02-22 18:14 UTC (permalink / raw)
  To: Brandon Williams
  Cc: Jonathan Tan, git, sbeller, gitster, jrnieder, stolee, git, pclouds

On Thu, Feb 22, 2018 at 10:07:15AM -0800, Brandon Williams wrote:

> On 02/22, Jeff King wrote:
> > On Wed, Feb 21, 2018 at 01:44:22PM -0800, Jonathan Tan wrote:
> > 
> > > On Tue,  6 Feb 2018 17:12:41 -0800
> > > Brandon Williams <bmwill@google.com> wrote:
> > > 
> > > > In order to allow for code sharing with the server-side of fetch in
> > > > protocol-v2 convert upload-pack to be a builtin.
> > > > 
> > > > Signed-off-by: Brandon Williams <bmwill@google.com>
> > > 
> > > As Stefan mentioned in [1], also mention in the commit message that this
> > > means that the "git-upload-pack" invocation gains additional
> > > capabilities (for example, invoking a pager for --help).
> > 
> > And possibly respecting pager.upload-pack, which would violate our rule
> > that it is safe to run upload-pack in untrusted repositories.
> 
> And this isn't an issue with receive-pack because this same guarantee
> doesn't exist?

Yes, exactly (which is confusing and weird, yes, but that's how it is).

-Peff

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v3 08/35] connect: discover protocol version outside of get_remote_heads
  2018-02-21 22:11       ` Jonathan Tan
@ 2018-02-22 18:17         ` Brandon Williams
  2018-02-22 19:22           ` Jonathan Tan
  0 siblings, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-02-22 18:17 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, sbeller, peff, gitster, jrnieder, stolee, git, pclouds

On 02/21, Jonathan Tan wrote:
> On Tue,  6 Feb 2018 17:12:45 -0800
> Brandon Williams <bmwill@google.com> wrote:
> 
> > -	get_remote_heads(fd[0], NULL, 0, &ref, 0, NULL, &shallow);
> > +
> > +	packet_reader_init(&reader, fd[0], NULL, 0,
> > +			   PACKET_READ_CHOMP_NEWLINE |
> > +			   PACKET_READ_GENTLE_ON_EOF);
> > +
> > +	switch (discover_version(&reader)) {
> > +	case protocol_v1:
> > +	case protocol_v0:
> > +		get_remote_heads(&reader, &ref, 0, NULL, &shallow);
> > +		break;
> > +	case protocol_unknown_version:
> > +		BUG("unknown protocol version");
> > +	}
> 
> This inlining is repeated a few times, which raises the question: if the
> intention is to keep the v0/1 logic separately from v2, why not have a
> single function that wraps them all? Looking at the end result (after
> all the patches in this patch set are applied), it seems that the v2
> version does not have extra_have or shallow parameters, which is a good
> enough reason for me (I don't think functions that take in many
> arguments and then selectively use them is a good idea). I think that
> other reviewers will have this question too, so maybe discuss this in
> the commit message.

Yes this sort of switch statement appears a few times but really there
isn't a good way to "have one function to wrap it all" with the current
state of the code. That sort of change would take tons of refactoring to
get into a state where we could do that, and is outside the scope of
this series.

> 
> > diff --git a/remote.h b/remote.h
> > index 1f6611be2..2016461df 100644
> > --- a/remote.h
> > +++ b/remote.h
> > @@ -150,10 +150,11 @@ int check_ref_type(const struct ref *ref, int flags);
> >  void free_refs(struct ref *ref);
> >  
> >  struct oid_array;
> > -extern struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
> > +struct packet_reader;
> > +extern struct ref **get_remote_heads(struct packet_reader *reader,
> >  				     struct ref **list, unsigned int flags,
> >  				     struct oid_array *extra_have,
> > -				     struct oid_array *shallow);
> > +				     struct oid_array *shallow_points);
> 
> This change probably does not belong in this patch, especially since
> remote.c is unchanged.

Yes, this hunk is needed: the signature of get_remote_heads changes.  It
may be difficult to see that due to the fact that we don't really have a
clear story on how header files are divided up within the project.

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v3 14/35] connect: request remote refs using v2
  2018-02-21 22:54       ` Jonathan Tan
@ 2018-02-22 18:19         ` Brandon Williams
  2018-02-22 18:26           ` Jeff King
  0 siblings, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-02-22 18:19 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, sbeller, peff, gitster, jrnieder, stolee, git, pclouds

On 02/21, Jonathan Tan wrote:
> On Tue,  6 Feb 2018 17:12:51 -0800
> Brandon Williams <bmwill@google.com> wrote:
> 
> > +extern struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
> > +				    struct ref **list, int for_push,
> > +				    const struct argv_array *ref_patterns);
> 
> I haven't looked at the rest of this patch in detail, but the type of
> ref_patterns is probably better as struct string_list, since this is not
> a true argument array (e.g. with flags starting with --). Same comment
> for the next few patches that deal with ref patterns.

It's just a list of strings which don't require having a util pointer
hanging around, so actually using an argv_array would be more memory
efficient than a string_list.  But either way I don't think it matters
much.

-- 
Brandon Williams


* Re: [PATCH v3 15/35] transport: convert get_refs_list to take a list of ref patterns
  2018-02-21 22:56       ` Jonathan Tan
@ 2018-02-22 18:25         ` Brandon Williams
  0 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-02-22 18:25 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, sbeller, peff, gitster, jrnieder, stolee, git, pclouds

On 02/21, Jonathan Tan wrote:
> On Tue,  6 Feb 2018 17:12:52 -0800
> Brandon Williams <bmwill@google.com> wrote:
> 
> > @@ -21,7 +22,8 @@ struct transport_vtable {
> >  	 * the ref without a huge amount of effort, it should store it
> >  	 * in the ref's old_sha1 field; otherwise it should be all 0.
> >  	 **/
> > -	struct ref *(*get_refs_list)(struct transport *transport, int for_push);
> > +	struct ref *(*get_refs_list)(struct transport *transport, int for_push,
> > +				     const struct argv_array *ref_patterns);
> 
> Also mention in the documentation that this function is allowed to
> return refs that do not match the ref patterns.

I'll add a comment.

-- 
Brandon Williams


* Re: [PATCH v3 16/35] transport: convert transport_get_remote_refs to take a list of ref patterns
  2018-02-21 22:58       ` Jonathan Tan
@ 2018-02-22 18:26         ` Brandon Williams
  2018-02-22 19:32           ` Jonathan Tan
  0 siblings, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-02-22 18:26 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, sbeller, peff, gitster, jrnieder, stolee, git, pclouds

On 02/21, Jonathan Tan wrote:
> On Tue,  6 Feb 2018 17:12:53 -0800
> Brandon Williams <bmwill@google.com> wrote:
> 
> > -const struct ref *transport_get_remote_refs(struct transport *transport)
> > +const struct ref *transport_get_remote_refs(struct transport *transport,
> > +					    const struct argv_array *ref_patterns)
> >  {
> >  	if (!transport->got_remote_refs) {
> > -		transport->remote_refs = transport->vtable->get_refs_list(transport, 0, NULL);
> > +		transport->remote_refs =
> > +			transport->vtable->get_refs_list(transport, 0,
> > +							 ref_patterns);
> >  		transport->got_remote_refs = 1;
> >  	}
> 
> Should we do our own client-side filtering if the server side cannot do
> it for us (because it doesn't support protocol v2)? Either way, this
> decision should be mentioned in the commit message.

If someone wants to add this in the future they can, but that is outside
the scope of this series.

-- 
Brandon Williams


* Re: [PATCH v3 14/35] connect: request remote refs using v2
  2018-02-22 18:19         ` Brandon Williams
@ 2018-02-22 18:26           ` Jeff King
  2018-02-22 19:25             ` Jonathan Tan
  0 siblings, 1 reply; 362+ messages in thread
From: Jeff King @ 2018-02-22 18:26 UTC (permalink / raw)
  To: Brandon Williams
  Cc: Jonathan Tan, git, sbeller, gitster, jrnieder, stolee, git, pclouds

On Thu, Feb 22, 2018 at 10:19:22AM -0800, Brandon Williams wrote:

> On 02/21, Jonathan Tan wrote:
> > On Tue,  6 Feb 2018 17:12:51 -0800
> > Brandon Williams <bmwill@google.com> wrote:
> > 
> > > +extern struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
> > > +				    struct ref **list, int for_push,
> > > +				    const struct argv_array *ref_patterns);
> > 
> > I haven't looked at the rest of this patch in detail, but the type of
> > ref_patterns is probably better as struct string_list, since this is not
> > a true argument array (e.g. with flags starting with --). Same comment
> > for the next few patches that deal with ref patterns.
> 
> Its just a list of strings which don't require having a util pointer
> hanging around so actually using an argv_array would be more memory
> efficient than a string_list.  But either way I don't think it matters
> much.

I agree that it shouldn't matter much here. But if the name argv_array
is standing in the way of using it, I think we should consider giving it
a more general name. I picked that not to evoke "this must be arguments"
but "this is terminated by a single NULL".

In general I think it should be the preferred structure for string
lists, just because it actually converts for free to the "other" common
format (whereas you can never pass string_list.items to a function that
doesn't know about string lists).
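
To make the "converts for free" point concrete, here is a minimal sketch.
It is not git's argv_array code: str_vec, str_vec_push, str_vec_clear and
count_strings are hypothetical names used only for illustration.  The idea
is that a container which keeps its array NULL-terminated can hand that
array directly to any function taking a plain "char **".

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an argv_array-like container; not git's code. */
struct str_vec {
	char **v;      /* NULL-terminated once at least one item is pushed */
	size_t nr;     /* number of strings, excluding the terminator */
	size_t alloc;  /* number of slots allocated in v */
};

/* Append a copy of s and keep the array NULL-terminated. */
static void str_vec_push(struct str_vec *sv, const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy;

	if (sv->nr + 2 > sv->alloc) {	/* room for the new item plus NULL */
		sv->alloc = sv->alloc ? 2 * sv->alloc : 8;
		sv->v = realloc(sv->v, sv->alloc * sizeof(*sv->v));
		assert(sv->v);		/* sketch only: no real OOM handling */
	}
	copy = malloc(len);
	assert(copy);
	memcpy(copy, s, len);
	sv->v[sv->nr++] = copy;
	sv->v[sv->nr] = NULL;
}

/* Free all strings and reset the container to its empty state. */
static void str_vec_clear(struct str_vec *sv)
{
	size_t i;

	for (i = 0; i < sv->nr; i++)
		free(sv->v[i]);
	free(sv->v);
	sv->v = NULL;
	sv->nr = sv->alloc = 0;
}

/* A consumer that knows nothing about str_vec, only "char **". */
static size_t count_strings(char *const *list)
{
	size_t n = 0;

	while (list && list[n])
		n++;
	return n;
}
```

After pushing two patterns into a zero-initialized str_vec, passing sv.v to
count_strings() works without any conversion step, which is the property
being argued for above.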

-Peff


* Re: [PATCH v3 20/35] upload-pack: introduce fetch server command
  2018-02-21 23:46       ` Jonathan Tan
@ 2018-02-22 18:48         ` Brandon Williams
  0 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-02-22 18:48 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, sbeller, peff, gitster, jrnieder, stolee, git, pclouds

On 02/21, Jonathan Tan wrote:
> On Tue,  6 Feb 2018 17:12:57 -0800
> Brandon Williams <bmwill@google.com> wrote:
> 
> > +    want <oid>
> > +	Indicates to the server an object which the client wants to
> > +	retrieve.
> 
> Mention that the client can "want" anything even if not advertised by
> the server (like uploadpack.allowanysha1inwant).

Will do.

> 
> > +    output = *section
> > +    section = (acknowledgments | packfile)
> > +	      (flush-pkt | delim-pkt)
> > +
> > +    acknowledgments = PKT-LINE("acknowledgments" LF)
> > +		      *(ready | nak | ack)
> 
> Can this part be described more precisely in the BNF section? I see that
> you describe later that there can be multiple ACKs or one NAK (but not
> both), and "ready" can be sent regardless of whether ACKs or a NAK is
> sent.

Yep I'll fix that.

> 
> > +    ready = PKT-LINE("ready" LF)
> > +    nak = PKT-LINE("NAK" LF)
> > +    ack = PKT-LINE("ACK" SP obj-id LF)
> > +
> > +    packfile = PKT-LINE("packfile" LF)
> > +	       [PACKFILE]
> > +
> > +----
> > +    acknowledgments section
> > +	* Always begins with the section header "acknowledgments"
> > +
> > +	* The server will respond with "NAK" if none of the object ids sent
> > +	  as have lines were common.
> > +
> > +	* The server will respond with "ACK obj-id" for all of the
> > +	  object ids sent as have lines which are common.
> > +
> > +	* A response cannot have both "ACK" lines as well as a "NAK"
> > +	  line.
> > +
> > +	* The server will respond with a "ready" line indicating that
> > +	  the server has found an acceptable common base and is ready to
> > +	  make and send a packfile (which will be found in the packfile
> > +	  section of the same response)
> > +
> > +	* If the client determines that it is finished with negotiations
> > +	  by sending a "done" line, the acknowledgments sections can be
> > +	  omitted from the server's response as an optimization.
> 
> Should this be changed to "must"? The current implementation does not
> support it (on the client side).

This is actually a great question and one which may need to be thought
about in terms of its application to future extensions to the fetch
command.  Since fetch's response is now broken up into sections we may
want the client to cope with sections being in any order and maybe even
skipping sections it doesn't know about.  Not sure if it's necessary, but
it's an idea.

> 
> > +#define UPLOAD_PACK_DATA_INIT { OBJECT_ARRAY_INIT, OID_ARRAY_INIT, 0, 0, 0, 0, 0, 0 }
> 
> Optional: the trailing zeroes can be omitted. (That's shorter, and also
> easier to maintain when we add new fields.)
> 
> > +int upload_pack_v2(struct repository *r, struct argv_array *keys,
> > +		   struct argv_array *args)
> > +{
> > +	enum fetch_state state = FETCH_PROCESS_ARGS;
> > +	struct upload_pack_data data = UPLOAD_PACK_DATA_INIT;
> > +	use_sideband = LARGE_PACKET_MAX;
> > +
> > +	while (state != FETCH_DONE) {
> > +		switch (state) {
> > +		case FETCH_PROCESS_ARGS:
> > +			process_args(args, &data);
> > +
> > +			if (!want_obj.nr) {
> > +				/*
> > +				 * Request didn't contain any 'want' lines,
> > +				 * guess they didn't want anything.
> > +				 */
> > +				state = FETCH_DONE;
> > +			} else if (data.haves.nr) {
> > +				/*
> > +				 * Request had 'have' lines, so lets ACK them.
> > +				 */
> > +				state = FETCH_SEND_ACKS;
> > +			} else {
> > +				/*
> > +				 * Request had 'want's but no 'have's so we can
> > +				 * immedietly go to construct and send a pack.
> > +				 */
> > +				state = FETCH_SEND_PACK;
> > +			}
> > +			break;
> > +		case FETCH_READ_HAVES:
> > +			read_haves(&data);
> > +			state = FETCH_SEND_ACKS;
> > +			break;
> 
> This branch seems to never be taken?

Must be left over from another version, I'll remove it.

> 
> > +		case FETCH_SEND_ACKS:
> > +			if (process_haves_and_send_acks(&data))
> > +				state = FETCH_SEND_PACK;
> > +			else
> > +				state = FETCH_DONE;
> > +			break;
> > +		case FETCH_SEND_PACK:
> > +			packet_write_fmt(1, "packfile\n");
> > +			create_pack_file();
> > +			state = FETCH_DONE;
> > +			break;
> > +		case FETCH_DONE:
> > +			continue;
> > +		}
> > +	}
> > +
> > +	upload_pack_data_clear(&data);
> > +	return 0;
> > +}

-- 
Brandon Williams


* Re: [PATCH v3 28/35] transport-helper: introduce stateless-connect
  2018-02-22  0:01       ` Jonathan Tan
@ 2018-02-22 18:53         ` Brandon Williams
  2018-02-22 21:55           ` Jonathan Tan
  0 siblings, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-02-22 18:53 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, sbeller, peff, gitster, jrnieder, stolee, git, pclouds

On 02/21, Jonathan Tan wrote:
> On Tue,  6 Feb 2018 17:13:05 -0800
> Brandon Williams <bmwill@google.com> wrote:
> 
> > Introduce the transport-helper capability 'stateless-connect'.  This
> > capability indicates that the transport-helper can be requested to run
> > the 'stateless-connect' command which should attempt to make a
> > stateless connection with a remote end.  Once established, the
> > connection can be used by the git client to communicate with
> > the remote end natively in a stateless-rpc manner as supported by
> > protocol v2.  This means that the client must send everything the server
> > needs in a single request as the client must not assume any
> > state-storing on the part of the server or transport.
> 
> Maybe it's worth mentioning that support in the actual remote helpers
> will be added in a subsequent patch.

I can mention that.

> 
> > If a stateless connection cannot be established then the remote-helper
> > will respond in the same manner as the 'connect' command indicating that
> > the client should fallback to using the dumb remote-helper commands.
> 
> This makes sense, but there doesn't seem to be any code in this patch
> that implements this.
> 
> > @@ -612,6 +615,11 @@ static int process_connect_service(struct transport *transport,
> >  	if (data->connect) {
> >  		strbuf_addf(&cmdbuf, "connect %s\n", name);
> >  		ret = run_connect(transport, &cmdbuf);
> > +	} else if (data->stateless_connect) {
> > +		strbuf_addf(&cmdbuf, "stateless-connect %s\n", name);
> > +		ret = run_connect(transport, &cmdbuf);
> > +		if (ret)
> > +			transport->stateless_rpc = 1;
> 
> Why is process_connect_service() falling back to stateless_connect if
> connect doesn't work? I don't think this fallback would work, as a
> client that needs "connect" might need its full capabilities.

Right now there isn't really a notion of "needing" connect, since if
connect fails then you need to fall back to doing the dumb thing.  Also
note that there isn't any fallback from connect to stateless-connect
here.  If the remote helper advertises connect, only connect will be
tried even if stateless-connect is advertised.  So this only really
works in the case where stateless-connect is advertised and connect
isn't, as is the case with our http remote-helper.
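
The selection rule described above can be sketched as follows.  This is
only an illustration under assumed names (helper_caps, connect_cmd and
pick_connect_cmd are hypothetical and do not appear in transport-helper.c);
it shows that stateless-connect is considered only when connect is not
advertised, and that there is no fallback from one to the other.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names; a sketch of the rule, not the transport-helper code. */
enum connect_cmd {
	CMD_NONE,               /* caller falls back to the dumb commands */
	CMD_CONNECT,
	CMD_STATELESS_CONNECT
};

struct helper_caps {
	int connect;            /* helper advertised "connect" */
	int stateless_connect;  /* helper advertised "stateless-connect" */
};

/*
 * Pick the single command to try.  If "connect" is advertised it always
 * wins; "stateless-connect" is chosen only when "connect" is absent.
 */
static enum connect_cmd pick_connect_cmd(const struct helper_caps *caps)
{
	assert(caps != NULL);

	if (caps->connect)
		return CMD_CONNECT;
	if (caps->stateless_connect)
		return CMD_STATELESS_CONNECT;
	return CMD_NONE;
}
```

A helper that advertises both capabilities gets CMD_CONNECT; one that
advertises only stateless-connect, as described for the http
remote-helper, gets CMD_STATELESS_CONNECT.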

-- 
Brandon Williams


* Re: [PATCH v3 30/35] remote-curl: create copy of the service name
  2018-02-22  0:06       ` Jonathan Tan
@ 2018-02-22 18:56         ` Brandon Williams
  0 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-02-22 18:56 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, sbeller, peff, gitster, jrnieder, stolee, git, pclouds

On 02/21, Jonathan Tan wrote:
> On Tue,  6 Feb 2018 17:13:07 -0800
> Brandon Williams <bmwill@google.com> wrote:
> 
> > Make a copy of the service name being requested instead of relying on
> > the buffer pointed to by the passed in 'const char *' to remain
> > unchanged.
> > 
> > Signed-off-by: Brandon Williams <bmwill@google.com>
> 
> Probably worth mentioning in the commit message:
> 
>   Currently, all service names are string constants, but a subsequent
>   patch will introduce service names from external sources.
> 
> Other than that,
> 
> Reviewed-by: Jonathan Tan <jonathantanmy@google.com>

I'll add that.

-- 
Brandon Williams


* Re: [PATCH v3 32/35] http: allow providing extra headers for http requests
  2018-02-22  0:09       ` Jonathan Tan
@ 2018-02-22 18:58         ` Brandon Williams
  0 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-02-22 18:58 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, sbeller, peff, gitster, jrnieder, stolee, git, pclouds

On 02/21, Jonathan Tan wrote:
> On Tue,  6 Feb 2018 17:13:09 -0800
> Brandon Williams <bmwill@google.com> wrote:
> 
> > @@ -172,6 +172,8 @@ struct http_get_options {
> >  	 * for details.
> >  	 */
> >  	struct strbuf *base_url;
> > +
> > +	struct string_list *extra_headers;
> 
> Document this? For example:
> 
>   If not NULL, additional HTTP headers to be sent with the request. The
>   strings in the list must not be freed until after the request.

I'll add that.

-- 
Brandon Williams


* Re: [PATCH v3 35/35] remote-curl: don't request v2 when pushing
  2018-02-22  0:12       ` Jonathan Tan
@ 2018-02-22 18:59         ` Brandon Williams
  2018-02-22 19:09           ` Brandon Williams
  0 siblings, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-02-22 18:59 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, sbeller, peff, gitster, jrnieder, stolee, git, pclouds

On 02/21, Jonathan Tan wrote:
> On Tue,  6 Feb 2018 17:13:12 -0800
> Brandon Williams <bmwill@google.com> wrote:
> 
> > +test_expect_success 'push with http:// and a config of v2 does not request v2' '
> > +	# Till v2 for push is designed, make sure that if a client has
> > +	# protocol.version configured to use v2, that the client instead falls
> > +	# back and uses v0.
> > +
> > +	test_commit -C http_child three &&
> > +
> > +	# Push to another branch, as the target repository has the
> > +	# master branch checked out and we cannot push into it.
> > +	GIT_TRACE_PACKET=1 git -C http_child -c protocol.version=1 \
> > +		push origin HEAD:client_branch && 2>log &&
> 
> Should it be protocol.version=2? Also, two double ampersands?
> 
> Also, optionally, it might be better to do
> GIT_TRACE_PACKET="$(pwd)/log", so that it does not get mixed with other
> stderr output.

Wow thanks for catching that, let me fix that.

-- 
Brandon Williams


* Re: [PATCH v3 35/35] remote-curl: don't request v2 when pushing
  2018-02-22 18:59         ` Brandon Williams
@ 2018-02-22 19:09           ` Brandon Williams
  0 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-02-22 19:09 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, sbeller, peff, gitster, jrnieder, stolee, git, pclouds

On 02/22, Brandon Williams wrote:
> On 02/21, Jonathan Tan wrote:
> > On Tue,  6 Feb 2018 17:13:12 -0800
> > Brandon Williams <bmwill@google.com> wrote:
> > 
> > > +test_expect_success 'push with http:// and a config of v2 does not request v2' '
> > > +	# Till v2 for push is designed, make sure that if a client has
> > > +	# protocol.version configured to use v2, that the client instead falls
> > > +	# back and uses v0.
> > > +
> > > +	test_commit -C http_child three &&
> > > +
> > > +	# Push to another branch, as the target repository has the
> > > +	# master branch checked out and we cannot push into it.
> > > +	GIT_TRACE_PACKET=1 git -C http_child -c protocol.version=1 \
> > > +		push origin HEAD:client_branch && 2>log &&
> > 
> > Should it be protocol.version=2? Also, two double ampersands?
> > 
> > Also, optionally, it might be better to do
> > GIT_TRACE_PACKET="$(pwd)/log", so that it does not get mixed with other
> > stderr output.
> 
> Wow thanks for catching that, let me fix that.

I like setting the log via GIT_TRACE_PACKET="$(pwd)/log", but it turns
out that tracing appends to the file if it already exists, which means
the previous tests need to clean up after themselves.  That is probably
preferable anyway.

> 
> -- 
> Brandon Williams

-- 
Brandon Williams


* Re: [PATCH v3 03/35] pkt-line: add delim packet support
  2018-02-07  1:12     ` [PATCH v3 03/35] pkt-line: add delim packet support Brandon Williams
@ 2018-02-22 19:13       ` Stefan Beller
  2018-02-22 19:37         ` Brandon Williams
  0 siblings, 1 reply; 362+ messages in thread
From: Stefan Beller @ 2018-02-22 19:13 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, Jeff King, Junio C Hamano, Jonathan Nieder, Derrick Stolee,
	Jeff Hostetler, Duy Nguyen

On Tue, Feb 6, 2018 at 5:12 PM, Brandon Williams <bmwill@google.com> wrote:
> One of the design goals of protocol-v2 is to improve the semantics of
> flush packets.  Currently in protocol-v1, flush packets are used both to
> indicate a break in a list of packet lines as well as an indication that
> one side has finished speaking.  This makes it particularly difficult
> to implement proxies as a proxy would need to completely understand git
> protocol instead of simply looking for a flush packet.
>
> To do this, introduce the special deliminator packet '0001'.  A delim
> packet can then be used as a deliminator between lists of packet lines
> while flush packets can be reserved to indicate the end of a response.

Please mention where this can be found in the documentation.
(Defer to later patch?)
As the commit message states, this is only to be used for v2,
in v0 it is still an illegal pkt.
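
For readers following along, the distinction can be sketched like this.
It is a simplified illustration, not the pkt-line.c implementation:
pkt_kind, hexval and classify_pkt_header are hypothetical names, and the
allow_delim flag is an assumption standing in for "we are speaking v2".

```c
#include <assert.h>
#include <stddef.h>

enum pkt_kind { PKT_INVALID, PKT_FLUSH, PKT_DELIM, PKT_DATA };

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int hexval(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/*
 * Classify a 4-byte pkt-line length header.  "0000" is a flush packet
 * and "0001" is a delim packet (accepted only when allow_delim is set).
 * Any other length counts the 4 header bytes itself, so values below 4
 * cannot describe a data packet.
 */
static enum pkt_kind classify_pkt_header(const char hdr[4], int allow_delim,
					 size_t *payload_len)
{
	unsigned int len = 0;
	int i;

	assert(hdr != NULL && payload_len != NULL);
	*payload_len = 0;

	for (i = 0; i < 4; i++) {
		int v = hexval(hdr[i]);
		if (v < 0)
			return PKT_INVALID;
		len = (len << 4) | (unsigned int)v;
	}

	if (len == 0)
		return PKT_FLUSH;
	if (len == 1)
		return allow_delim ? PKT_DELIM : PKT_INVALID;
	if (len < 4)
		return PKT_INVALID;

	*payload_len = len - 4;
	return PKT_DATA;
}
```

With allow_delim unset, "0001" is rejected, which matches the point that
outside of v2 the delim packet is still an illegal pkt.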

>
> Signed-off-by: Brandon Williams <bmwill@google.com>

The code is
Reviewed-by: Stefan Beller <sbeller@google.com>


* Re: [PATCH v3 08/35] connect: discover protocol version outside of get_remote_heads
  2018-02-22 18:17         ` Brandon Williams
@ 2018-02-22 19:22           ` Jonathan Tan
  0 siblings, 0 replies; 362+ messages in thread
From: Jonathan Tan @ 2018-02-22 19:22 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, sbeller, peff, gitster, jrnieder, stolee, git, pclouds

On Thu, 22 Feb 2018 10:17:39 -0800
Brandon Williams <bmwill@google.com> wrote:

> > > diff --git a/remote.h b/remote.h
> > > index 1f6611be2..2016461df 100644
> > > --- a/remote.h
> > > +++ b/remote.h
> > > @@ -150,10 +150,11 @@ int check_ref_type(const struct ref *ref, int flags);
> > >  void free_refs(struct ref *ref);
> > >  
> > >  struct oid_array;
> > > -extern struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
> > > +struct packet_reader;
> > > +extern struct ref **get_remote_heads(struct packet_reader *reader,
> > >  				     struct ref **list, unsigned int flags,
> > >  				     struct oid_array *extra_have,
> > > -				     struct oid_array *shallow);
> > > +				     struct oid_array *shallow_points);
> > 
> > This change probably does not belong in this patch, especially since
> > remote.c is unchanged.
> 
> Yes this hunk is needed, the signature of get_remote_heads changes.  It
> may be difficult to see that due to the fact that we don't really have a
> clear story on how header files are divided up within the project.

Thanks - I indeed didn't notice that the implementation of
get_remote_heads() is modified too in this patch.

My initial comment was about just the renaming of "shallow" to
"shallow_points", but yes, you're right - I see in the implementation
that it is indeed named "shallow_points" there, so this change is
justified, especially since you're already changing the signature of
this function.


* Re: [PATCH v3 14/35] connect: request remote refs using v2
  2018-02-22 18:26           ` Jeff King
@ 2018-02-22 19:25             ` Jonathan Tan
  2018-02-27  6:21               ` Jonathan Nieder
  0 siblings, 1 reply; 362+ messages in thread
From: Jonathan Tan @ 2018-02-22 19:25 UTC (permalink / raw)
  To: Jeff King
  Cc: Brandon Williams, git, sbeller, gitster, jrnieder, stolee, git, pclouds

On Thu, 22 Feb 2018 13:26:58 -0500
Jeff King <peff@peff.net> wrote:

> On Thu, Feb 22, 2018 at 10:19:22AM -0800, Brandon Williams wrote:
> 
> > On 02/21, Jonathan Tan wrote:
> > > On Tue,  6 Feb 2018 17:12:51 -0800
> > > Brandon Williams <bmwill@google.com> wrote:
> > > 
> > > > +extern struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
> > > > +				    struct ref **list, int for_push,
> > > > +				    const struct argv_array *ref_patterns);
> > > 
> > > I haven't looked at the rest of this patch in detail, but the type of
> > > ref_patterns is probably better as struct string_list, since this is not
> > > a true argument array (e.g. with flags starting with --). Same comment
> > > for the next few patches that deal with ref patterns.
> > 
> > Its just a list of strings which don't require having a util pointer
> > hanging around so actually using an argv_array would be more memory
> > efficient than a string_list.  But either way I don't think it matters
> > much.
> 
> I agree that it shouldn't matter much here. But if the name argv_array
> is standing in the way of using it, I think we should consider giving it
> a more general name. I picked that not to evoke "this must be arguments"
> but "this is terminated by a single NULL".
> 
> In general I think it should be the preferred structure for string
> lists, just because it actually converts for free to the "other" common
> format (whereas you can never pass string_list.items to a function that
> doesn't know about string lists).

This sounds reasonable - I withdraw my comment about using struct
string_list.


* Re: [PATCH v3 05/35] upload-pack: factor out processing lines
  2018-02-07  1:12     ` [PATCH v3 05/35] upload-pack: factor out processing lines Brandon Williams
@ 2018-02-22 19:31       ` Stefan Beller
  2018-02-22 19:39         ` Brandon Williams
  0 siblings, 1 reply; 362+ messages in thread
From: Stefan Beller @ 2018-02-22 19:31 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, Jeff King, Junio C Hamano, Jonathan Nieder, Derrick Stolee,
	Jeff Hostetler, Duy Nguyen

On Tue, Feb 6, 2018 at 5:12 PM, Brandon Williams <bmwill@google.com> wrote:
> Factor out the logic for processing shallow, deepen, deepen_since, and
> deepen_not lines into their own functions to simplify the
> 'receive_needs()' function in addition to making it easier to reuse some
> of this logic when implementing protocol_v2.
>
> Signed-off-by: Brandon Williams <bmwill@google.com>

Reviewed-by: Stefan Beller <sbeller@google.com>
for the stated purpose of just refactoring existing code for better reuse later.

I do have a few comments on the code in general,
which might be out of scope for this series.

A close review would have been fastest if we had some sort of
https://public-inbox.org/git/20171025224620.27657-1-sbeller@google.com/
which I might revive soon for this purpose. (it showed that I would need it)


> +               *depth = (int)strtol(arg, &end, 0);

strtol is not used quite correctly here IMHO, as we do not
inspect errno for ERANGE.
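
A minimal sketch of the stricter pattern, assuming a hypothetical helper
named parse_depth (it is not a function in upload-pack.c): clear errno
before the call, check it for ERANGE afterwards, and verify that the
result fits in an int before narrowing.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse arg as an integer (base auto-detected, as
 * with the "0" base in the quoted line).  Returns 0 and stores the value
 * in *out on success, or -1 if arg is empty, has trailing junk, or is
 * out of range for long or int.
 */
static int parse_depth(const char *arg, int *out)
{
	char *end;
	long val;

	assert(arg != NULL && out != NULL);

	errno = 0;			/* strtol only sets errno on failure */
	val = strtol(arg, &end, 0);
	if (errno == ERANGE)
		return -1;		/* overflowed or underflowed long */
	if (end == arg || *end != '\0')
		return -1;		/* no digits, or trailing characters */
	if (val < INT_MIN || val > INT_MAX)
		return -1;		/* would not survive the cast to int */

	*out = (int)val;
	return 0;
}
```

Whether rejecting trailing characters is wanted depends on the caller; the
sketch is strict about it only to keep the example self-contained.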

Thanks,
Stefan


* Re: [PATCH v3 16/35] transport: convert transport_get_remote_refs to take a list of ref patterns
  2018-02-22 18:26         ` Brandon Williams
@ 2018-02-22 19:32           ` Jonathan Tan
  2018-02-22 19:51             ` Brandon Williams
  0 siblings, 1 reply; 362+ messages in thread
From: Jonathan Tan @ 2018-02-22 19:32 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, sbeller, peff, gitster, jrnieder, stolee, git, pclouds

On Thu, 22 Feb 2018 10:26:47 -0800
Brandon Williams <bmwill@google.com> wrote:

> On 02/21, Jonathan Tan wrote:
> > On Tue,  6 Feb 2018 17:12:53 -0800
> > Brandon Williams <bmwill@google.com> wrote:
> > 
> > > -const struct ref *transport_get_remote_refs(struct transport *transport)
> > > +const struct ref *transport_get_remote_refs(struct transport *transport,
> > > +					    const struct argv_array *ref_patterns)
> > >  {
> > >  	if (!transport->got_remote_refs) {
> > > -		transport->remote_refs = transport->vtable->get_refs_list(transport, 0, NULL);
> > > +		transport->remote_refs =
> > > +			transport->vtable->get_refs_list(transport, 0,
> > > +							 ref_patterns);
> > >  		transport->got_remote_refs = 1;
> > >  	}
> > 
> > Should we do our own client-side filtering if the server side cannot do
> > it for us (because it doesn't support protocol v2)? Either way, this
> > decision should be mentioned in the commit message.
> 
> If someone wants to add this in the future they can, but that is outside
> the scope of this series.

In that case, also document that this function is allowed to return refs
that do not match the ref patterns.

Unlike in patch 15 (which deals with the interface between the transport
code and transport vtables, which can be changed as long as the
transport code is aware of it, as I wrote in [1]), this may result in
user-visible differences depending on which protocol is used. But after
more thinking, I don't think we're in a situation yet where having extra
refs shown/written is harmful, and if it comes to that, we can tighten
this code later without backwards incompatibility. So, OK, this is fine.

[1] https://public-inbox.org/git/20180221145639.c6cf2409ce2120109bdd169f@google.com/


* Re: [PATCH v3 03/35] pkt-line: add delim packet support
  2018-02-22 19:13       ` Stefan Beller
@ 2018-02-22 19:37         ` Brandon Williams
  0 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-02-22 19:37 UTC (permalink / raw)
  To: Stefan Beller
  Cc: git, Jeff King, Junio C Hamano, Jonathan Nieder, Derrick Stolee,
	Jeff Hostetler, Duy Nguyen

On 02/22, Stefan Beller wrote:
> On Tue, Feb 6, 2018 at 5:12 PM, Brandon Williams <bmwill@google.com> wrote:
> > One of the design goals of protocol-v2 is to improve the semantics of
> > flush packets.  Currently in protocol-v1, flush packets are used both to
> > indicate a break in a list of packet lines as well as an indication that
> > one side has finished speaking.  This makes it particularly difficult
> > to implement proxies as a proxy would need to completely understand git
> > protocol instead of simply looking for a flush packet.
> >
> > To do this, introduce the special deliminator packet '0001'.  A delim
> > packet can then be used as a deliminator between lists of packet lines
> > while flush packets can be reserved to indicate the end of a response.
> 
> Please mention where this can be found in the documentation.
> (Defer to later patch?)

Yeah the documentation does get added in a future patch, I'll make a
comment to that effect.

> As the commit message states, this is only to be used for v2,
> in v0 it is still an illegal pkt.
> 
> >
> > Signed-off-by: Brandon Williams <bmwill@google.com>
> 
> The code is
> Reviewed-by: Stefan Beller <sbeller@google.com>

-- 
Brandon Williams


* Re: [PATCH v3 04/35] upload-pack: convert to a builtin
  2018-02-22 18:14             ` Jeff King
@ 2018-02-22 19:38               ` Jonathan Nieder
  2018-02-22 20:19                 ` Jeff King
  0 siblings, 1 reply; 362+ messages in thread
From: Jonathan Nieder @ 2018-02-22 19:38 UTC (permalink / raw)
  To: Jeff King
  Cc: Brandon Williams, Jonathan Tan, git, sbeller, gitster, stolee,
	git, pclouds

Hi,

Jeff King wrote:
> On Thu, Feb 22, 2018 at 10:07:15AM -0800, Brandon Williams wrote:
>> On 02/22, Jeff King wrote:
>>> On Wed, Feb 21, 2018 at 01:44:22PM -0800, Jonathan Tan wrote:
>>>> On Tue,  6 Feb 2018 17:12:41 -0800
>>>> Brandon Williams <bmwill@google.com> wrote:

>>>>> In order to allow for code sharing with the server-side of fetch in
>>>>> protocol-v2 convert upload-pack to be a builtin.
[...]
>>>> As Stefan mentioned in [1], also mention in the commit message that this
>>>> means that the "git-upload-pack" invocation gains additional
>>>> capabilities (for example, invoking a pager for --help).
>>>
>>> And possibly respecting pager.upload-pack, which would violate our rule
>>> that it is safe to run upload-pack in untrusted repositories.
>>
>> And this isn't an issue with receive-pack because this same guarantee
>> doesn't exist?
>
> Yes, exactly (which is confusing and weird, yes, but that's how it is).

To be clear, which of the following are you (most) worried about?

 1. being invoked with --help and spawning a pager
 2. receiving and acting on options between 'git' and 'upload-pack'
 3. repository discovery
 4. pager config
 5. alias discovery
 6. increased code surface / unknown threats

For (1), "--help" has to be the first argument.  "git daemon" passes
--strict so it doesn't happen there.  "git http-backend" passes
--stateless-rpc so it doesn't happen there.  "git shell" sanitizes
input to avoid it happening there.

A custom setup could provide its own entry point that doesn't do
such sanitization.  I think that in some sense it's out of our hands,
but it would be nice to harden as described upthread.

For (2), I am having trouble imagining a setup where it would happen.

upload-pack doesn't have the RUN_SETUP or RUN_SETUP_GENTLY flag set,
so (3) doesn't apply.

Although in most setups the user does not control the config files on
a server, item (4) looks like a real issue worth solving.  I think we
should introduce a flag to skip looking for pager config.  We could
use it for receive-pack, too.

Builtins are handled before aliases, so (5) doesn't apply.

(6) is a real issue: it is why "git shell" is not a builtin, for
example.  But I would rather that we use appropriate sanitization
before upload-pack is invoked than live in fear.  git upload-pack is
sufficiently complicated that I don't think the git.c wrapper
increases the complexity by a significant amount.

That leaves me with a personal answer of only being worried about (4)
and not the rest.  What do you think?  Is one of the other items I
listed above worrisome, or is there another item I missed?

Thanks,
Jonathan


* Re: [PATCH v3 05/35] upload-pack: factor out processing lines
  2018-02-22 19:31       ` Stefan Beller
@ 2018-02-22 19:39         ` Brandon Williams
  0 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-02-22 19:39 UTC (permalink / raw)
  To: Stefan Beller
  Cc: git, Jeff King, Junio C Hamano, Jonathan Nieder, Derrick Stolee,
	Jeff Hostetler, Duy Nguyen

On 02/22, Stefan Beller wrote:
> On Tue, Feb 6, 2018 at 5:12 PM, Brandon Williams <bmwill@google.com> wrote:
> > Factor out the logic for processing shallow, deepen, deepen_since, and
> > deepen_not lines into their own functions to simplify the
> > 'receive_needs()' function in addition to making it easier to reuse some
> > of this logic when implementing protocol_v2.
> >
> > Signed-off-by: Brandon Williams <bmwill@google.com>
> 
> Reviewed-by: Stefan Beller <sbeller@google.com>
> for the stated purpose of just refactoring existing code for better reuse later.
> 
> I do have a few comments on the code in general,
> which might be out of scope for this series.

Yeah you mentioned some comments in a previous round based on style
preference.  I'm going to refrain from changing the style of this patch
since it is a matter of preference.

> 
> A close review would have been fastest if we had some sort of
> https://public-inbox.org/git/20171025224620.27657-1-sbeller@google.com/
> which I might revive soon for this purpose. (it showed that I would need it)
> 
> 
> > +               *depth = (int)strtol(arg, &end, 0);
> 
> strtol is not used quite correctly here IMHO, as we do not
> inspect errno for ERANGE
> 
> Thanks,
> Stefan
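
For reference, a minimal sketch of a parse that does inspect errno.
parse_depth() is a hypothetical helper, not part of the patch, and it is
stricter than the quoted line: it also rejects trailing characters and
values that do not fit in an int.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not from the patch): parse a depth argument.
 * Returns 0 and stores the value in *depth on success, -1 on error.
 */
static int parse_depth(const char *arg, int *depth)
{
	char *end;
	long value;

	assert(arg && depth);

	errno = 0;	/* strtol only sets errno on failure, so clear it first */
	value = strtol(arg, &end, 0);
	if (end == arg || *end != '\0')
		return -1;	/* no digits at all, or trailing garbage */
	if (errno == ERANGE || value < INT_MIN || value > INT_MAX)
		return -1;	/* out of range for long, or for the int cast */

	*depth = (int)value;
	return 0;
}
```

A caller would die() on a -1 return instead of casting the result
blindly.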

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v3 16/35] transport: convert transport_get_remote_refs to take a list of ref patterns
  2018-02-22 19:32           ` Jonathan Tan
@ 2018-02-22 19:51             ` Brandon Williams
  0 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-02-22 19:51 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, sbeller, peff, gitster, jrnieder, stolee, git, pclouds

On 02/22, Jonathan Tan wrote:
> On Thu, 22 Feb 2018 10:26:47 -0800
> Brandon Williams <bmwill@google.com> wrote:
> 
> > On 02/21, Jonathan Tan wrote:
> > > On Tue,  6 Feb 2018 17:12:53 -0800
> > > Brandon Williams <bmwill@google.com> wrote:
> > > 
> > > > -const struct ref *transport_get_remote_refs(struct transport *transport)
> > > > +const struct ref *transport_get_remote_refs(struct transport *transport,
> > > > +					    const struct argv_array *ref_patterns)
> > > >  {
> > > >  	if (!transport->got_remote_refs) {
> > > > -		transport->remote_refs = transport->vtable->get_refs_list(transport, 0, NULL);
> > > > +		transport->remote_refs =
> > > > +			transport->vtable->get_refs_list(transport, 0,
> > > > +							 ref_patterns);
> > > >  		transport->got_remote_refs = 1;
> > > >  	}
> > > 
> > > Should we do our own client-side filtering if the server side cannot do
> > > it for us (because it doesn't support protocol v2)? Either way, this
> > > decision should be mentioned in the commit message.
> > 
> > If someone wants to add this in the future they can, but that is outside
> > the scope of this series.
> 
> In that case, also document that this function is allowed to return refs
> that do not match the ref patterns.
> 
> Unlike in patch 15 (which deals with the interface between the transport
> code and transport vtables, which can be changed as long as the
> transport code is aware of it, as I wrote in [1]), this may result in
> user-visible differences depending on which protocol is used. But after
> more thinking, I don't think we're in a situation yet where having extra
> refs shown/written are harmful, and if it comes to that, we can tighten
> this code later without backwards incompatibility. So, OK, this is fine.
> 
> [1] https://public-inbox.org/git/20180221145639.c6cf2409ce2120109bdd169f@google.com/

I'll add the documentation.
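
For whoever picks up the client-side filtering later, a rough sketch of
what it could look like.  ref_matches_any() is a hypothetical name, and
treating each pattern as a plain prefix is an assumption made for
illustration; the series does not define the matching rules this way.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical client-side filter: returns 1 if refname should be kept.
 * Each pattern is treated as a plain prefix (an assumption), and an
 * empty pattern list keeps every ref.
 */
static int ref_matches_any(const char *refname,
			   const char *const *patterns, size_t nr)
{
	size_t i;

	assert(refname);

	if (!nr)
		return 1;
	for (i = 0; i < nr; i++)
		if (!strncmp(refname, patterns[i], strlen(patterns[i])))
			return 1;
	return 0;
}
```

Without something like this, a server that cannot filter will hand back
refs outside the requested patterns, which is the behavior to be
documented.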

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v3 07/35] connect: convert get_remote_heads to use struct packet_reader
  2018-02-07  1:12     ` [PATCH v3 07/35] connect: convert get_remote_heads to use struct packet_reader Brandon Williams
@ 2018-02-22 19:52       ` Stefan Beller
  2018-02-22 20:09       ` Stefan Beller
  1 sibling, 0 replies; 362+ messages in thread
From: Stefan Beller @ 2018-02-22 19:52 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, Jeff King, Junio C Hamano, Jonathan Nieder, Derrick Stolee,
	Jeff Hostetler, Duy Nguyen

On Tue, Feb 6, 2018 at 5:12 PM, Brandon Williams <bmwill@google.com> wrote:

> @@ -56,6 +62,41 @@ static void die_initial_contact(int unexpected)
>                       "and the repository exists."));
>  }
>
> +static enum protocol_version discover_version(struct packet_reader *reader)
> +{
> +       enum protocol_version version = protocol_unknown_version;
> +
> +       /*
> +        * Peek the first line of the server's response to
> +        * determine the protocol version the server is speaking.
> +        */
> +       switch (packet_reader_peek(reader)) {
> +       case PACKET_READ_EOF:
> +               die_initial_contact(0);
> +       case PACKET_READ_FLUSH:
> +       case PACKET_READ_DELIM:
> +               version = protocol_v0;
> +               break;
> +       case PACKET_READ_NORMAL:
> +               version = determine_protocol_version_client(reader->line);
> +               break;
> +       }
> +
> +       /* Maybe process capabilities here, at least for v2 */

We do not (yet) react to v2, so this comment only makes
sense after a later patch? If so please include it later,
as this is confusing for now.


> +       switch (version) {
> +       case protocol_v1:
> +               /* Read the peeked version line */
> +               packet_reader_read(reader);

I wonder if we want to assign version to v0 here,
as now all v1 is done and we could treat the remaining
communication as a v0. Not sure if that helps with some
switch/cases, but as we'd give all cases to have the compiler
not yell at us, this would be no big deal. So I guess we can keep
it v1.

With or without the comment nit, this patch is
Reviewed-by: Stefan Beller <sbeller@google.com>
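
To make the peek-then-classify step concrete, here is a small
standalone sketch.  The enum and version_from_first_line() are
hypothetical stand-ins, not the functions in the patch, and the sketch
assumes the trailing newline has already been chomped from the line.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for enum protocol_version. */
enum sketch_version {
	SKETCH_UNKNOWN = -1,
	SKETCH_V0 = 0,
	SKETCH_V1 = 1
};

/*
 * Classify the first line of a server response.  A line that does not
 * start with "version " is taken to be a v0 ref advertisement.
 */
static enum sketch_version version_from_first_line(const char *line)
{
	static const char prefix[] = "version ";

	assert(line);

	if (strncmp(line, prefix, strlen(prefix)))
		return SKETCH_V0;
	if (!strcmp(line + strlen(prefix), "1"))
		return SKETCH_V1;
	return SKETCH_UNKNOWN;	/* a version this client cannot speak */
}
```

Only the v1 case needs to consume the peeked line afterwards; for v0
the peeked line is already the first ref.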

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v3 07/35] connect: convert get_remote_heads to use struct packet_reader
  2018-02-07  1:12     ` [PATCH v3 07/35] connect: convert get_remote_heads to use struct packet_reader Brandon Williams
  2018-02-22 19:52       ` Stefan Beller
@ 2018-02-22 20:09       ` Stefan Beller
  2018-02-23 21:30         ` Brandon Williams
  1 sibling, 1 reply; 362+ messages in thread
From: Stefan Beller @ 2018-02-22 20:09 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, Jeff King, Junio C Hamano, Jonathan Nieder, Derrick Stolee,
	Jeff Hostetler, Duy Nguyen

> +static enum protocol_version discover_version(struct packet_reader *reader)
> +{
...
> +
> +       /* Maybe process capabilities here, at least for v2 */
> +       switch (version) {
> +       case protocol_v1:
> +               /* Read the peeked version line */
> +               packet_reader_read(reader);
> +               break;
> +       case protocol_v0:
> +               break;
> +       case protocol_unknown_version:
> +               die("unknown protocol version: '%s'\n", reader->line);

The following patches introduce more of the switch(version) cases.
And there it actually is a
    BUG("protocol version unknown? should have been set in discover_version")
but here it is a mere
  die (_("The server uses a different protocol version than we can
speak: %s\n"),
      reader->line);
so I would think here it is reasonable to add _(translation).

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v3 04/35] upload-pack: convert to a builtin
  2018-02-22 19:38               ` Jonathan Nieder
@ 2018-02-22 20:19                 ` Jeff King
  2018-02-22 20:21                   ` Jeff King
  2018-02-22 21:24                   ` Jonathan Nieder
  0 siblings, 2 replies; 362+ messages in thread
From: Jeff King @ 2018-02-22 20:19 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: Brandon Williams, Jonathan Tan, git, sbeller, gitster, stolee,
	git, pclouds

On Thu, Feb 22, 2018 at 11:38:14AM -0800, Jonathan Nieder wrote:

> >>> And possibly respecting pager.upload-pack, which would violate our rule
> >>> that it is safe to run upload-pack in untrusted repositories.
> >>
> >> And this isn't an issue with receive-pack because this same guarantee
> >> doesn't exist?
> >
> > Yes, exactly (which is confusing and weird, yes, but that's how it is).
> 
> To be clear, which of the following are you (most) worried about?
> 
>  1. being invoked with --help and spawning a pager
>  2. receiving and acting on options between 'git' and 'upload-pack'
>  3. repository discovery
>  4. pager config
>  5. alias discovery
>  6. increased code surface / unknown threats

My immediate concern is (4). But my greater concern is that people who
work on git.c should not have to worry about accidentally violating this
principle when they add a new feature or config option.

In other words, it seems like an accident waiting to happen. I'd be more
amenable to it if there was some compelling reason for it to be a
builtin, but I don't see one listed in the commit message. I see only
"let's make it easier to share the code", which AFAICT is equally served
by just lib-ifying the code and calling it from the standalone
upload-pack.c.

> Although in most setups the user does not control the config files on
> a server, item (4) looks like a real issue worth solving.  I think we
> should introduce a flag to skip looking for pager config.  We could
> use it for receive-pack, too.

There's not much point for receive-pack. It respects hooks, so any
security ship has already sailed there.

-Peff

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v3 04/35] upload-pack: convert to a builtin
  2018-02-22 20:19                 ` Jeff King
@ 2018-02-22 20:21                   ` Jeff King
  2018-02-22 21:26                     ` Jonathan Nieder
  2018-02-23 21:09                     ` Brandon Williams
  2018-02-22 21:24                   ` Jonathan Nieder
  1 sibling, 2 replies; 362+ messages in thread
From: Jeff King @ 2018-02-22 20:21 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: Brandon Williams, Jonathan Tan, git, sbeller, gitster, stolee,
	git, pclouds

On Thu, Feb 22, 2018 at 03:19:40PM -0500, Jeff King wrote:

> > To be clear, which of the following are you (most) worried about?
> > 
> >  1. being invoked with --help and spawning a pager
> >  2. receiving and acting on options between 'git' and 'upload-pack'
> >  3. repository discovery
> >  4. pager config
> >  5. alias discovery
> >  6. increased code surface / unknown threats
> 
> My immediate concern is (4). But my greater concern is that people who
> work on git.c should not have to worry about accidentally violating this
> principle when they add a new feature or config option.
> 
> In other words, it seems like an accident waiting to happen. I'd be more
> amenable to it if there was some compelling reason for it to be a
> builtin, but I don't see one listed in the commit message. I see only
> "let's make it easier to share the code", which AFAICT is equally served
> by just lib-ifying the code and calling it from the standalone
> upload-pack.c.

By the way, any decision here would presumably need to be extended to
git-serve, etc. The current property is that it's safe to fetch from an
untrusted repository, even over ssh. If we're keeping that for protocol
v1, we'd want it to apply to protocol v2, as well.

-Peff

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v3 11/35] test-pkt-line: introduce a packet-line test helper
  2018-02-07  1:12     ` [PATCH v3 11/35] test-pkt-line: introduce a packet-line test helper Brandon Williams
@ 2018-02-22 20:40       ` Stefan Beller
  2018-02-23 21:22         ` Brandon Williams
  0 siblings, 1 reply; 362+ messages in thread
From: Stefan Beller @ 2018-02-22 20:40 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, Jeff King, Junio C Hamano, Jonathan Nieder, Derrick Stolee,
	Jeff Hostetler, Duy Nguyen

On Tue, Feb 6, 2018 at 5:12 PM, Brandon Williams <bmwill@google.com> wrote:

> +static void pack_line(const char *line)
> +{
> +       if (!strcmp(line, "0000") || !strcmp(line, "0000\n"))

From our in-office discussion:
v1/v0 packs pktlines twice in http, which is not possible to
construct using this test helper when using the same string
for the packed and unpacked representation of flush and delim packets,
i.e. test-pkt-line --pack $(test-pkt-line --pack 0000) would produce
'0000' instead of '00090000\n'.
To fix it we'd have to replace the unpacked versions of these pkts with
something else, such as "FLUSH" and "DELIM".

However as we do not anticipate the test helper to be used in further
tests for v0, this ought to be no big issue.
Maybe someone else cares though?
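
To spell out the framing: the 4-digit hex prefix counts itself, so
packing the 5-byte payload "0000\n" as data gives a length of 9.  A
minimal sketch, where pkt_line_pack() is a hypothetical helper and not
the test helper from the patch:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: write the pkt-line encoding of payload into out.
 * Every payload is framed as data; "0000" is not treated as a flush.
 * Returns the encoded length, or -1 if it does not fit.
 */
static int pkt_line_pack(const char *payload, char *out, size_t outsz)
{
	size_t len;

	assert(payload && out);

	len = strlen(payload) + 4;	/* the length prefix counts itself */
	if (len > 65520 || len + 1 > outsz)
		return -1;	/* over the pkt-line maximum, or no room */
	snprintf(out, outsz, "%04zx%s", len, payload);
	return (int)len;
}
```

With this, packing "0000\n" yields "00090000\n", which is the doubly
packed form the helper in the patch cannot produce.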

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v3 04/35] upload-pack: convert to a builtin
  2018-02-22 20:19                 ` Jeff King
  2018-02-22 20:21                   ` Jeff King
@ 2018-02-22 21:24                   ` Jonathan Nieder
  2018-02-22 21:44                     ` Jeff King
  1 sibling, 1 reply; 362+ messages in thread
From: Jonathan Nieder @ 2018-02-22 21:24 UTC (permalink / raw)
  To: Jeff King
  Cc: Brandon Williams, Jonathan Tan, git, sbeller, gitster, stolee,
	git, pclouds

Hi,

Jeff King wrote:
> On Thu, Feb 22, 2018 at 11:38:14AM -0800, Jonathan Nieder wrote:

>> To be clear, which of the following are you (most) worried about?
>>
>>  1. being invoked with --help and spawning a pager
>>  2. receiving and acting on options between 'git' and 'upload-pack'
>>  3. repository discovery
>>  4. pager config
>>  5. alias discovery
>>  6. increased code surface / unknown threats
>
> My immediate concern is (4).

Thanks for clarifying.

>                              But my greater concern is that people who
> work on git.c should not have to worry about accidentally violating this
> principle when they add a new feature or config option.

That sounds like a combination of (6) and insufficient documentation
or tests.  Ideas for how we can help prevent such accidents?

> In other words, it seems like an accident waiting to happen. I'd be more
> amenable to it if there was some compelling reason for it to be a
> builtin, but I don't see one listed in the commit message. I see only
> "let's make it easier to share the code", which AFAICT is equally served
> by just lib-ifying the code and calling it from the standalone
> upload-pack.c.

If we have so little control of the common code used by git commands
that could be invoked by a remote user, I think we're in trouble
already.  I don't think being a builtin vs not makes that
significantly different, since there are plenty of builtins that can
be triggered by remote users.  Further, if we have so little control
over the security properties of git.c, what hope do we have of making
the rest of libgit.a usable in secure code?

In other words, having to pay more attention to the git wrapper from a
security pov actually feels to me like a *good* thing.  The git
wrapper is the entry point to almost all git commands.  If it is an
accident waiting to happen, then anything that calls git commands is
already an accident waiting to happen.  So how can we make it not an
accident waiting to happen? :)

>> Although in most setups the user does not control the config files on
>> a server, item (4) looks like a real issue worth solving.  I think we
>> should introduce a flag to skip looking for pager config.  We could
>> use it for receive-pack, too.
>
> There's not much point for receive-pack. It respects hooks, so any
> security ship has already sailed there.

Yet there are plenty of cases where people who can push are not
supposed to have root privilege.  I am not worried about hooks
specifically (although the changes described at [1] might help and I
still plan to work on those) but I am worried about e.g. commandline
injection issues.  I don't think we can treat receive-pack as out of
scope.

And to be clear, I don't think you were saying receive-pack *is* out
of scope.  But you seem to be trying to draw some boundary, where I
only see something fuzzier (e.g. if a bug only applies to
receive-pack, then that certainly decreases its impact, but it doesn't
make the impact go away).

Thanks,
Jonathan

[1] https://public-inbox.org/git/20171002234517.GV19555@aiede.mtv.corp.google.com/

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v3 04/35] upload-pack: convert to a builtin
  2018-02-22 20:21                   ` Jeff King
@ 2018-02-22 21:26                     ` Jonathan Nieder
  2018-02-22 21:44                       ` Jeff King
  2018-02-23 21:09                     ` Brandon Williams
  1 sibling, 1 reply; 362+ messages in thread
From: Jonathan Nieder @ 2018-02-22 21:26 UTC (permalink / raw)
  To: Jeff King
  Cc: Brandon Williams, Jonathan Tan, git, sbeller, gitster, stolee,
	git, pclouds

Jeff King wrote:

>                 The current property is that it's safe to fetch from an
> untrusted repository, even over ssh. If we're keeping that for protocol
> v1, we'd want it to apply to protocol v2, as well.

Ah, this is what I had been missing (the non-ssh case).

I see your point.  I think we need to fix the pager config issue and add
some clarifying documentation to git.c so that people know what to look
out for.

Keep in mind that git upload-archive (a read-only command, just like
git upload-pack) also already has the same issues.

Thanks,
Jonathan

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v3 04/35] upload-pack: convert to a builtin
  2018-02-22 21:24                   ` Jonathan Nieder
@ 2018-02-22 21:44                     ` Jeff King
  2018-02-22 22:21                       ` Jeff King
  0 siblings, 1 reply; 362+ messages in thread
From: Jeff King @ 2018-02-22 21:44 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: Brandon Williams, Jonathan Tan, git, sbeller, gitster, stolee,
	git, pclouds

On Thu, Feb 22, 2018 at 01:24:02PM -0800, Jonathan Nieder wrote:

> >                              But my greater concern is that people who
> > work on git.c should not have to worry about accidentally violating this
> > principle when they add a new feature or config option.
> 
> That sounds like a combination of (6) and insufficient documentation
> or tests.  Ideas for how we can help prevent such accidents?

I don't think it's insufficient tests. How can we test against some
problem in the untrusted repository when the feature that would trigger
it has not been written yet?

E.g., imagine we begin to allow alias.* to override command names in
git.c. Now suddenly setting "alias.upload-pack" is a vulnerability.
Should we add a test for that _now_ as a precaution? I don't think so,
because we can't guess what new features are going to be added. So we'd
be lucky if such a test actually did anything useful.

A comment could help, but it seems quite likely that whatever feature
somebody is adding might not be right next to that comment (and thus not
seen). I think we're mostly relying on institutional knowledge during
review to catch these things. Which is not great, but I'm not sure where
we'd document that knowledge so that people would actually see it at the
right time.

> > In other words, it seems like an accident waiting to happen. I'd be more
> > amenable to it if there was some compelling reason for it to be a
> > builtin, but I don't see one listed in the commit message. I see only
> > "let's make it easier to share the code", which AFAICT is equally served
> > by just lib-ifying the code and calling it from the standalone
> > upload-pack.c.
> 
> If we have so little control of the common code used by git commands
> that could be invoked by a remote user, I think we're in trouble
> already.  I don't think being a builtin vs not makes that
> significantly different, since there are plenty of builtins that can
> be triggered by remote users.  Further, if we have so little control
> over the security properties of git.c, what hope do we have of making
> the rest of libgit.a usable in secure code?

I agree that the situation is already pretty dicey. But I also think
that using the git wrapper is more risky than the rest of libgit.a.
There's tons of dangerous code in libgit.a, but upload-pack is smart
enough not to call it. And people modifying upload-pack have a greater
chance of thinking about the security implications, because they know
they're working with upload-pack. Whereas people are likely to touch
git.c without considering upload-pack at all.

The big danger in libgit.a is from modifying some low-level code called
by upload-pack in a way that trusts the on-disk contents more than it
should. My gut says that's less likely, though certainly not impossible
(a likely candidate would perhaps be a ref backend config that opens up
holes; e.g., if you could point a database backend at some random path).

> In other words, having to pay more attention to the git wrapper from a
> security pov actually feels to me like a *good* thing.  The git
> wrapper is the entry point to almost all git commands.  If it is an
> accident waiting to happen, then anything that calls git commands is
> already an accident waiting to happen.  So how can we make it not an
> accident waiting to happen? :)

But I don't think it _is_ an accident waiting to happen for the rest of
the commands. upload-pack is special. The point is that people may touch
git.c thinking they are adding a nice new feature (like pager config, or
aliases, or default options, or whatever). And it _would_ be a nice new
feature for most commands, but not for upload-pack, because its
requirements are different.

So thinking about security in the git wrapper is just a burden for those
other commands.

> > There's not much point for receive-pack. It respects hooks, so any
> > security ship has already sailed there.
> 
> Yet there are plenty of cases where people who can push are not
> supposed to have root privilege.  I am not worried about hooks
> specifically (although the changes described at [1] might help and I
> still plan to work on those) but I am worried about e.g. commandline
> injection issues.  I don't think we can treat receive-pack as out of
> scope.
> 
> And to be clear, I don't think you were saying receive-pack *is* out
> of scope.  But you seem to be trying to draw some boundary, where I
> only see something fuzzier (e.g. if a bug only applies to
> receive-pack, then that certainly decreases its impact, but it doesn't
> make the impact go away).

Right, I think command-line injection is a separate issue. My concern is
_just_ about "can we be run against on-disk repo contents". And nothing
matters for receive-pack there, because you can already execute
arbitrary code with hooks.

-Peff

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v3 04/35] upload-pack: convert to a builtin
  2018-02-22 21:26                     ` Jonathan Nieder
@ 2018-02-22 21:44                       ` Jeff King
  2018-03-12 22:43                         ` Jonathan Nieder
  0 siblings, 1 reply; 362+ messages in thread
From: Jeff King @ 2018-02-22 21:44 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: Brandon Williams, Jonathan Tan, git, sbeller, gitster, stolee,
	git, pclouds

On Thu, Feb 22, 2018 at 01:26:34PM -0800, Jonathan Nieder wrote:

> Keep in mind that git upload-archive (a read-only command, just like
> git upload-pack) also already has the same issues.

Yuck. I don't think we've ever made a historical promise about that. But
then, I don't think the promise about upload-pack has ever really been
documented, except in mailing list discussions.

-Peff

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v3 28/35] transport-helper: introduce stateless-connect
  2018-02-22 18:53         ` Brandon Williams
@ 2018-02-22 21:55           ` Jonathan Tan
  0 siblings, 0 replies; 362+ messages in thread
From: Jonathan Tan @ 2018-02-22 21:55 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, sbeller, peff, gitster, jrnieder, stolee, git, pclouds

On Thu, 22 Feb 2018 10:53:53 -0800
Brandon Williams <bmwill@google.com> wrote:

> > > @@ -612,6 +615,11 @@ static int process_connect_service(struct transport *transport,
> > >  	if (data->connect) {
> > >  		strbuf_addf(&cmdbuf, "connect %s\n", name);
> > >  		ret = run_connect(transport, &cmdbuf);
> > > +	} else if (data->stateless_connect) {
> > > +		strbuf_addf(&cmdbuf, "stateless-connect %s\n", name);
> > > +		ret = run_connect(transport, &cmdbuf);
> > > +		if (ret)
> > > +			transport->stateless_rpc = 1;
> > 
> > Why is process_connect_service() falling back to stateless_connect if
> > connect doesn't work? I don't think this fallback would work, as a
> > client that needs "connect" might need its full capabilities.
> 
> Right now there isn't really a notion of "needing" connect since if
> > connect fails then you need to fall back to doing the dumb thing.  Also
> > note that there isn't a fallback from connect to stateless-connect
> here.  If the remote helper advertises connect, only connect will be
> tried even if stateless-connect is advertised.  So this only really
> works in the case where stateless-connect is advertised and connect
> isn't, as is with our http remote-helper.

After some in-office discussion, I think I understand how this works.
Assuming an HTTP server that supports protocol v2 (at least for
ls-refs/fetch):

 1. Fetch, which supports protocol v2, will (indirectly) call
    process_connect_service. If it learns that it supports v2, it must
    know that what's returned may not be a fully bidirectional channel,
    but may only be a stateless-connect channel (and it does know).
 2. Archive/upload-archive, which does not support protocol v2, will
    (indirectly) call process_connect_service. stateless_connect checks
    info/refs and observes that the server supports protocol v2, so it
    returns a stateless-connect channel. The user, being unaware of
    protocol versions, tries to use it, and it doesn't work. (This is a
    slight regression in that previously, it would fail more quickly -
    archive/upload-archive has never supported HTTP because HTTP
    doesn't support connect.)

I still think that it's too confusing for process_connect_service() to
attempt to fall back to stateless-connect, at least because the user must
remember that process_connect_service() returns such a channel if
protocol v2 is used (and existing code must be updated to know this).
It's probably better to have a new API that can return either a connect
channel or a stateless-connect channel, and the user will always use it
as if it was a stateless-connect channel. The old API then can be
separately deprecated and removed, if desired.
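
A rough sketch of the selection order discussed above, with a single
entry point that can return either kind of channel.  All names here are
hypothetical; none of them exist in transport-helper.c.

```c
#include <assert.h>

/* Hypothetical capability bits advertised by a remote helper. */
#define SKETCH_CAP_CONNECT		(1u << 0)
#define SKETCH_CAP_STATELESS_CONNECT	(1u << 1)

enum sketch_channel {
	SKETCH_CHANNEL_NONE,
	SKETCH_CHANNEL_CONNECT,
	SKETCH_CHANNEL_STATELESS
};

/*
 * connect wins whenever it is advertised; stateless-connect is used
 * only when connect is not advertised at all.
 */
static enum sketch_channel pick_channel(unsigned int caps)
{
	/* reject capability bits this sketch does not know about */
	assert(!(caps & ~(SKETCH_CAP_CONNECT | SKETCH_CAP_STATELESS_CONNECT)));

	if (caps & SKETCH_CAP_CONNECT)
		return SKETCH_CHANNEL_CONNECT;
	if (caps & SKETCH_CAP_STATELESS_CONNECT)
		return SKETCH_CHANNEL_STATELESS;
	return SKETCH_CHANNEL_NONE;
}
```

Callers of such an API would treat any channel they get back as
stateless, so code that needs a fully bidirectional channel would keep
using the old entry point.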

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v3 04/35] upload-pack: convert to a builtin
  2018-02-22 21:44                     ` Jeff King
@ 2018-02-22 22:21                       ` Jeff King
  2018-02-22 22:42                         ` Jonathan Nieder
  0 siblings, 1 reply; 362+ messages in thread
From: Jeff King @ 2018-02-22 22:21 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: Brandon Williams, Jonathan Tan, git, sbeller, gitster, stolee,
	git, pclouds

On Thu, Feb 22, 2018 at 04:44:02PM -0500, Jeff King wrote:

> But I don't think it _is_ an accident waiting to happen for the rest of
> the commands. upload-pack is special. The point is that people may touch
> git.c thinking they are adding a nice new feature (like pager config, or
> aliases, or default options, or whatever). And it _would_ be a nice new
> feature for most commands, but not for upload-pack, because its
> requirements are different.
> 
> So thinking about security in the git wrapper is just a burden for those
> other commands.

All of that said, I think the current code is quite dangerous already,
and maybe even broken.  upload-pack may run sub-commands like rev-list
or pack-objects, which are themselves builtins.

For example:

  git init --bare evil.git
  git -C evil.git --work-tree=. commit --allow-empty -m foo
  git -C evil.git config pager.pack-objects 'echo >&2 oops'
  git clone --no-local evil.git victim

That doesn't _quite_ work, because we route pack-objects' stderr into a
pipe, which suppresses the pager. But we don't for rev-list, which we
call when checking reachability. It's a bit tricky to get a client to
trigger those for a vanilla fetch, though. Here's the best I could come
up with:

  git init --bare evil.git
  git -C evil.git --work-tree=. commit --allow-empty -m one
  git -C evil.git config pager.rev-list 'echo >&2 oops'

  git init super
  (
	cd super
	# obviously use host:path if you're attacking somebody over ssh
	git submodule add ../evil.git evil
	git commit -am 'add evil submodule'
  )
  git -C evil.git config uploadpack.allowReachableSHA1InWant true
  git -C evil.git update-ref -d refs/heads/master

  git clone --recurse-submodules super victim

I couldn't quite get it to work, but I think it's because I'm doing
something wrong with the submodules. But I also think this attack would
_have_ to be done over ssh, because on a local system the submodule
clone would be a hard-link rather than a real fetch.

-Peff

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v3 04/35] upload-pack: convert to a builtin
  2018-02-22 22:21                       ` Jeff King
@ 2018-02-22 22:42                         ` Jonathan Nieder
  2018-02-22 23:05                           ` Jeff King
  0 siblings, 1 reply; 362+ messages in thread
From: Jonathan Nieder @ 2018-02-22 22:42 UTC (permalink / raw)
  To: Jeff King
  Cc: Brandon Williams, Jonathan Tan, git, sbeller, gitster, stolee,
	git, pclouds

Jeff King wrote:

> All of that said, I think the current code is quite dangerous already,
> and maybe even broken.  upload-pack may run sub-commands like rev-list
> or pack-objects, which are themselves builtins.

Sounds like more commands to set the IGNORE_PAGER_CONFIG flag for in
git.c.
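
Roughly what I have in mind, as a standalone sketch.  The table, the
SKETCH_ prefix, and may_read_pager_config() are all made up for
illustration; this is not the actual command table in git.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-command flag: never consult pager.<cmd> config. */
#define SKETCH_IGNORE_PAGER_CONFIG (1u << 0)

struct sketch_cmd {
	const char *name;
	unsigned int flags;
};

static const struct sketch_cmd sketch_cmds[] = {
	{ "log", 0 },
	{ "pack-objects", SKETCH_IGNORE_PAGER_CONFIG },
	{ "rev-list", SKETCH_IGNORE_PAGER_CONFIG },
	{ "upload-pack", SKETCH_IGNORE_PAGER_CONFIG },
};

/* Returns 1 if pager.<name> may be read from config for this command. */
static int may_read_pager_config(const char *name)
{
	size_t i;

	assert(name);

	for (i = 0; i < sizeof(sketch_cmds) / sizeof(sketch_cmds[0]); i++)
		if (!strcmp(sketch_cmds[i].name, name))
			return !(sketch_cmds[i].flags & SKETCH_IGNORE_PAGER_CONFIG);
	return 1;	/* commands not in the table keep today's behavior */
}
```

The wrapper would consult this before looking up pager.<cmd>, so a
hostile repository config never gets a say for the flagged commands.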

Thanks for looking this over thoughtfully.

[...]
> I couldn't quite get it to work, but I think it's because I'm doing
> something wrong with the submodules. But I also think this attack would
> _have_ to be done over ssh, because on a local system the submodule
> clone would be a hard-link rather than a real fetch.

What happens if the submodule URL starts with file://?

Thanks,
Jonathan

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v3 04/35] upload-pack: convert to a builtin
  2018-02-22 22:42                         ` Jonathan Nieder
@ 2018-02-22 23:05                           ` Jeff King
  2018-02-22 23:23                             ` Jeff King
  0 siblings, 1 reply; 362+ messages in thread
From: Jeff King @ 2018-02-22 23:05 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: Brandon Williams, Jonathan Tan, git, sbeller, gitster, stolee,
	git, pclouds

On Thu, Feb 22, 2018 at 02:42:35PM -0800, Jonathan Nieder wrote:

> > I couldn't quite get it to work, but I think it's because I'm doing
> > something wrong with the submodules. But I also think this attack would
> > _have_ to be done over ssh, because on a local system the submodule
> > clone would be a hard-link rather than a real fetch.
> 
> What happens if the submodule URL starts with file://?

Ah, that would do it. Or I guess any follow-up fetch.

I'm still having trouble convincing submodules to fetch _just_ the
desired sha1, though. It always just fetches everything. I know there's
a way that this kicks in (that's why we have things like
allowReachableSHA1InWant), but I'm not sufficiently well-versed in
submodules to know how to trigger it.

-Peff

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v3 04/35] upload-pack: convert to a builtin
  2018-02-22 23:05                           ` Jeff King
@ 2018-02-22 23:23                             ` Jeff King
  0 siblings, 0 replies; 362+ messages in thread
From: Jeff King @ 2018-02-22 23:23 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: Brandon Williams, Jonathan Tan, git, sbeller, gitster, stolee,
	git, pclouds

On Thu, Feb 22, 2018 at 06:05:15PM -0500, Jeff King wrote:

> On Thu, Feb 22, 2018 at 02:42:35PM -0800, Jonathan Nieder wrote:
> 
> > > I couldn't quite get it to work, but I think it's because I'm doing
> > > something wrong with the submodules. But I also think this attack would
> > > _have_ to be done over ssh, because on a local system the submodule
> > > clone would a hard-link rather than a real fetch.
> > 
> > What happens if the submodule URL starts with file://?
> 
> Ah, that would do it. Or I guess any follow-up fetch.
> 
> I'm still having trouble convincing submodules to fetch _just_ the
> desired sha1, though. It always just fetches everything. I know there's
> a way that this kicks in (that's why we have things like
> allowReachableSHA1InWant), but I'm not sufficiently well-versed in
> submodules to know how to trigger it.

<facepalm> This won't work anyway. I was right when I said that we don't
redirect stderr for rev-list, but of course it's stdout that determines
the pager behavior. So I don't think you could get rev-list to trigger a
pager here.

I don't think there's currently any vulnerability, but it's more to do
with luck than any amount of carefulness on our part.

-Peff


* Re: [PATCH v3 13/35] ls-refs: introduce ls-refs server command
  2018-02-22  9:48       ` Jeff King
@ 2018-02-23  0:45         ` Brandon Williams
  2018-02-24  0:19           ` Brandon Williams
  2018-02-24  4:01           ` Jeff King
  0 siblings, 2 replies; 362+ messages in thread
From: Brandon Williams @ 2018-02-23  0:45 UTC (permalink / raw)
  To: Jeff King; +Cc: git, sbeller, gitster, jrnieder, stolee, git, pclouds

On 02/22, Jeff King wrote:
> On Tue, Feb 06, 2018 at 05:12:50PM -0800, Brandon Williams wrote:
> 
> > +ls-refs takes in the following parameters wrapped in packet-lines:
> > +
> > +    symrefs
> > +	In addition to the object pointed by it, show the underlying ref
> > +	pointed by it when showing a symbolic ref.
> > +    peel
> > +	Show peeled tags.
> > +    ref-pattern <pattern>
> > +	When specified, only references matching the one of the provided
> > +	patterns are displayed.
> 
> How do we match those patterns? That's probably an important thing to
> include in the spec.

Yeah I thought about it when I first wrote it and was hoping that
someone would nudge me in the right direction :)

> 
> Looking at the code, I see:
> 
> > +/*
> > + * Check if one of the patterns matches the tail part of the ref.
> > + * If no patterns were provided, all refs match.
> > + */
> > +static int ref_match(const struct argv_array *patterns, const char *refname)
> 
> This kind of tail matching can't quite implement all of the current
> behavior. Because we actually do the normal dwim_ref() matching, which
> includes stuff like "refs/remotes/%s/HEAD".
> 
> The other problem with tail-matching is that it's inefficient on the
> server. Ideally we could get a request for "master" and only look up
> refs/heads/master, refs/tags/master, etc. And if there are 50,000 refs
> in refs/pull, we wouldn't have to process those at all. Of course this
> is no worse than the current code, which not only looks at each ref but
> actually _sends_ it. But it would be nice if we could fix this.
> 
> There's some more discussion in this old thread:
> 
>   https://public-inbox.org/git/20161024132932.i42rqn2vlpocqmkq@sigill.intra.peff.net/

Thanks for the pointer.  I was told to be wary a while back about
performance implications on the server but no discussion ensued till now
about it :)

We always have the ability to extend the patterns accepted via a feature
(or capability) to ls-refs, so maybe the best thing to do now would be to
only support a few patterns with specific semantics.  Something like: if
you say "master", only match against refs/heads/ and refs/tags/, and if
you want something else you would need to specify "refs/pull/master"?

That way we could support globs only at the end, e.g. "master*", where *
can match anything (including slashes).

> 
> > +{
> > +	char *pathbuf;
> > +	int i;
> > +
> > +	if (!patterns->argc)
> > +		return 1; /* no restriction */
> > +
> > +	pathbuf = xstrfmt("/%s", refname);
> > +	for (i = 0; i < patterns->argc; i++) {
> > +		if (!wildmatch(patterns->argv[i], pathbuf, 0)) {
> > +			free(pathbuf);
> > +			return 1;
> > +		}
> > +	}
> > +	free(pathbuf);
> > +	return 0;
> > +}
> 
> Does the client have to be aware that we're using wildmatch? I think
> they'd need "refs/heads/**" to actually implement what we usually
> specify in refspecs as "refs/heads/*". Or does the lack of WM_PATHNAME
> make this work with just "*"?
> 
> Do we anticipate that the client would left-anchor the refspec like
> "/refs/heads/*" so that in theory the server could avoid looking outside
> of /refs/heads/?

Yeah we may want to anchor it by providing the leading '/' instead of
just "refs/<blah>".

> 
> -Peff

I need to read over the discussion you linked to more but what sort of
ref patterns do you believe we should support as part of the initial
release of v2?  It seems like you wanted this at some point in the past
so I assume you have an idea of what sort of filtering would be
beneficial.

-- 
Brandon Williams


* Re: [PATCH v3 23/35] fetch-pack: support shallow requests
  2018-02-07  1:13     ` [PATCH v3 23/35] fetch-pack: " Brandon Williams
@ 2018-02-23 19:37       ` Jonathan Tan
  2018-02-23 19:56         ` Brandon Williams
  0 siblings, 1 reply; 362+ messages in thread
From: Jonathan Tan @ 2018-02-23 19:37 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, sbeller, peff, gitster, jrnieder, stolee, git, pclouds

On Tue,  6 Feb 2018 17:13:00 -0800
Brandon Williams <bmwill@google.com> wrote:

> @@ -1090,6 +1110,10 @@ static int send_fetch_request(int fd_out, const struct fetch_pack_args *args,
>  	if (prefer_ofs_delta)
>  		packet_buf_write(&req_buf, "ofs-delta");
>  
> +	/* Add shallow-info and deepen request */
> +	if (server_supports_feature("fetch", "shallow", 1))
> +		add_shallow_requests(&req_buf, args);

One more thing I observed when trying to implement the server side in
JGit - the last argument should be 0, not 1, right? I don't think that
"shallow" should be required on the server unless we really need it.


* Re: [PATCH v3 23/35] fetch-pack: support shallow requests
  2018-02-23 19:37       ` Jonathan Tan
@ 2018-02-23 19:56         ` Brandon Williams
  0 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-02-23 19:56 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, sbeller, peff, gitster, jrnieder, stolee, git, pclouds

On 02/23, Jonathan Tan wrote:
> On Tue,  6 Feb 2018 17:13:00 -0800
> Brandon Williams <bmwill@google.com> wrote:
> 
> > @@ -1090,6 +1110,10 @@ static int send_fetch_request(int fd_out, const struct fetch_pack_args *args,
> >  	if (prefer_ofs_delta)
> >  		packet_buf_write(&req_buf, "ofs-delta");
> >  
> > +	/* Add shallow-info and deepen request */
> > +	if (server_supports_feature("fetch", "shallow", 1))
> > +		add_shallow_requests(&req_buf, args);
> 
> One more thing I observed when trying to implement the server side in
> JGit - the last argument should be 0, not 1, right? I don't think that
> "shallow" should be required on the server unless we really need it.

Good catch, I'll fix that.

-- 
Brandon Williams


* Re: [PATCH v3 04/35] upload-pack: convert to a builtin
  2018-02-22 20:21                   ` Jeff King
  2018-02-22 21:26                     ` Jonathan Nieder
@ 2018-02-23 21:09                     ` Brandon Williams
  2018-03-03  4:24                       ` Jeff King
  1 sibling, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-02-23 21:09 UTC (permalink / raw)
  To: Jeff King
  Cc: Jonathan Nieder, Jonathan Tan, git, sbeller, gitster, stolee,
	git, pclouds

On 02/22, Jeff King wrote:
> On Thu, Feb 22, 2018 at 03:19:40PM -0500, Jeff King wrote:
> 
> > > To be clear, which of the following are you (most) worried about?
> > > 
> > >  1. being invoked with --help and spawning a pager
> > >  2. receiving and acting on options between 'git' and 'upload-pack'
> > >  3. repository discovery
> > >  4. pager config
> > >  5. alias discovery
> > >  6. increased code surface / unknown threats
> > 
> > My immediate concern is (4). But my greater concern is that people who
> > work on git.c should not have to worry about accidentally violating this
> > principle when they add a new feature or config option.
> > 
> > In other words, it seems like an accident waiting to happen. I'd be more
> > amenable to it if there was some compelling reason for it to be a
> > builtin, but I don't see one listed in the commit message. I see only
> > "let's make it easier to share the code", which AFAICT is equally served
> > by just lib-ifying the code and calling it from the standalone
> > upload-pack.c.
> 
> By the way, any decision here would presumably need to be extended to
> git-serve, etc. The current property is that it's safe to fetch from an
> untrusted repository, even over ssh. If we're keeping that for protocol
> v1, we'd want it to apply to protocol v2, as well.
> 
> -Peff

This may be more complicated.  Right now (for backward compatibility)
all fetches for v2 are issued to the upload-pack endpoint. So even
though I've introduced git-serve it doesn't have requests issued to it
and no requests can be issued to it currently (support isn't added to
http-backend or git-daemon).  This just means that the command already
exists to make it easy for testing specific v2 stuff and if we want to
expose it as an endpoint (like when we have a brand new server command
that is completely incompatible with v1) it's already there and support
just needs to be plumbed in.

This whole notion of treating upload-pack differently from receive-pack
has bad consequences for v2 though.  The idea for v2 is to be able to
run any number of commands via the same endpoint, so at the end of the
day the endpoint you used is irrelevant.  So you could issue both fetch
and push commands via the same endpoint in v2 whether it's git-serve,
receive-pack, or upload-pack.  So really, like Jonathan has said
elsewhere, we need to figure out how to be ok with having receive-pack
and upload-pack builtins, or having neither of them builtins, because it
doesn't make much sense for v2 to straddle that line.  I mean you could
do some complicated advertising of commands based on the endpoint you
hit, but then what does that mean if you're hitting the git-serve
endpoint where you should presumably be able to do any operation?

-- 
Brandon Williams


* Re: [PATCH v3 11/35] test-pkt-line: introduce a packet-line test helper
  2018-02-22 20:40       ` Stefan Beller
@ 2018-02-23 21:22         ` Brandon Williams
  2018-03-03  4:25           ` Jeff King
  0 siblings, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-02-23 21:22 UTC (permalink / raw)
  To: Stefan Beller
  Cc: git, Jeff King, Junio C Hamano, Jonathan Nieder, Derrick Stolee,
	Jeff Hostetler, Duy Nguyen

On 02/22, Stefan Beller wrote:
> On Tue, Feb 6, 2018 at 5:12 PM, Brandon Williams <bmwill@google.com> wrote:
> 
> > +static void pack_line(const char *line)
> > +{
> > +       if (!strcmp(line, "0000") || !strcmp(line, "0000\n"))
> 
> From our in-office discussion:
> v1/v0 packs pktlines twice in http, which is not possible to
> construct using this test helper when using the same string
> for the packed and unpacked representation of flush and delim packets,
> i.e. test-pkt-line --pack $(test-pkt-line --pack 0000) would produce
> '0000' instead of '00090000\n'.
> To fix it we'd have to replace the unpacked versions of these pkts to
> something else such as "FLUSH" "DELIM".
> 
> However as we do not anticipate the test helper to be used in further
> tests for v0, this ought to be no big issue.
> Maybe someone else cares though?

I'm going to punt and say, if someone cares enough they can update this
test-helper when they want to use it for v1/v0 stuff.
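
For readers who want to see where the '00090000\n' above comes from, here
is a minimal sketch of pkt-line framing in C.  It is not the test helper
from the patch; the function name pkt_line_frame and the constant
SKETCH_PKT_MAX are made up for illustration.  A data pkt-line carries a
four-hex-digit length that counts the prefix itself plus the payload, so
framing the five bytes "0000\n" yields a nine-byte line, which is distinct
from a flush-pkt (the bare four bytes "0000").

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Maximum total pkt-line length (4-byte prefix + payload). */
#define SKETCH_PKT_MAX 65520

/*
 * Hypothetical helper, for illustration only: frame `payload` as a
 * data pkt-line in `out`.  Returns the number of bytes in the framed
 * line, or -1 if it would exceed the protocol limit or the buffer.
 */
int pkt_line_frame(char *out, size_t outsz, const char *payload)
{
	size_t total;

	assert(out && payload);
	total = strlen(payload) + 4;	/* prefix counts itself */
	if (total > SKETCH_PKT_MAX || total + 1 > outsz)
		return -1;
	snprintf(out, outsz, "%04zx%s", total, payload);
	return (int)total;
}
```

With this framing, a helper that treats the input string "0000" as "emit a
flush-pkt" can never be asked to emit the data line above, which is the
limitation being discussed.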

-- 
Brandon Williams


* Re: [PATCH v3 07/35] connect: convert get_remote_heads to use struct packet_reader
  2018-02-22 20:09       ` Stefan Beller
@ 2018-02-23 21:30         ` Brandon Williams
  2018-02-23 21:48           ` Stefan Beller
  0 siblings, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-02-23 21:30 UTC (permalink / raw)
  To: Stefan Beller
  Cc: git, Jeff King, Junio C Hamano, Jonathan Nieder, Derrick Stolee,
	Jeff Hostetler, Duy Nguyen

On 02/22, Stefan Beller wrote:
> > +static enum protocol_version discover_version(struct packet_reader *reader)
> > +{
> ...
> > +
> > +       /* Maybe process capabilities here, at least for v2 */
> > +       switch (version) {
> > +       case protocol_v1:
> > +               /* Read the peeked version line */
> > +               packet_reader_read(reader);
> > +               break;
> > +       case protocol_v0:
> > +               break;
> > +       case protocol_unknown_version:
> > +               die("unknown protocol version: '%s'\n", reader->line);
> 
> The following patches introduce more of the switch(version) cases.
> And there it actually is a
>     BUG("protocol version unknown? should have been set in discover_version")
> but here it is a mere
>   die (_("The server uses a different protocol version than we can
> speak: %s\n"),
>       reader->line);
> so I would think here it is reasonable to add _(translation).

This should be a BUG as it shouldn't ever be unknown at this point.  And
I'll also drop that comment.

-- 
Brandon Williams


* Re: [PATCH v3 12/35] serve: introduce git-serve
  2018-02-21 22:45       ` Jonathan Tan
@ 2018-02-23 21:33         ` Brandon Williams
  2018-02-27 18:05           ` Jonathan Tan
  0 siblings, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-02-23 21:33 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, sbeller, peff, gitster, jrnieder, stolee, git, pclouds

On 02/21, Jonathan Tan wrote:
> On Tue,  6 Feb 2018 17:12:49 -0800
> Brandon Williams <bmwill@google.com> wrote:
> 
> >  .gitignore                              |   1 +
> >  Documentation/technical/protocol-v2.txt | 114 +++++++++++++++
> >  Makefile                                |   2 +
> >  builtin.h                               |   1 +
> >  builtin/serve.c                         |  30 ++++
> >  git.c                                   |   1 +
> >  serve.c                                 | 250 ++++++++++++++++++++++++++++++++
> >  serve.h                                 |  15 ++
> >  t/t5701-git-serve.sh                    |  60 ++++++++
> >  9 files changed, 474 insertions(+)
> >  create mode 100644 Documentation/technical/protocol-v2.txt
> >  create mode 100644 builtin/serve.c
> >  create mode 100644 serve.c
> >  create mode 100644 serve.h
> >  create mode 100755 t/t5701-git-serve.sh
> 
> As someone who is implementing the server side of protocol V2 in JGit, I
> now have a bit more insight into this :-)
> 
> First of all, I used to not have a strong opinion on the existence of a
> new endpoint, but now I think that it's better to *not* have git-serve.
> As it is, as far as I can tell, upload-pack also needs to support (and
> does support, as of the end of this patch set) protocol v2 anyway, so it
> might be better to merely upgrade upload-pack.

Having it allows for easier testing and the easy ability to make it a
true endpoint when we want to.  As of right now, git-serve isn't an
endpoint as you can't issue requests there via http-backend or
git-daemon.

> 
> > +A client then responds to select the command it wants with any particular
> > +capabilities or arguments.  There is then an optional section where the
> > +client can provide any command specific parameters or queries.
> > +
> > +    command-request = command
> > +		      capability-list
> > +		      (command-args)
> 
> If you are stating that this is optional, write "*1command-args". (RFC
> 5234 also supports square brackets, but "*1" is already used in
> pack-protocol.txt and http-protocol.txt.)
> 
> > +		      flush-pkt
> > +    command = PKT-LINE("command=" key LF)
> > +    command-args = delim-pkt
> > +		   *arg
> > +    arg = 1*CHAR
> 
> arg should be wrapped in PKT-LINE, I think, and terminated by an LF.

-- 
Brandon Williams


* Re: [PATCH v3 12/35] serve: introduce git-serve
  2018-02-22  9:33       ` Jeff King
@ 2018-02-23 21:45         ` Brandon Williams
  2018-03-03  4:33           ` Jeff King
  0 siblings, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-02-23 21:45 UTC (permalink / raw)
  To: Jeff King; +Cc: git, sbeller, gitster, jrnieder, stolee, git, pclouds

On 02/22, Jeff King wrote:
> On Tue, Feb 06, 2018 at 05:12:49PM -0800, Brandon Williams wrote:
> 
> > +In protocol v2 communication is command oriented.  When first contacting a
> > +server a list of capabilities will advertised.  Some of these capabilities
> > +will be commands which a client can request be executed.  Once a command
> > +has completed, a client can reuse the connection and request that other
> > +commands be executed.
> 
> If I understand this correctly, we'll potentially have a lot more
> round-trips between the client and server (one per "command"). And for
> git-over-http, each one will be its own HTTP request?
> 
> We've traditionally tried to minimize HTTP requests, but I guess it's
> not too bad if we can keep the connection open in most cases. Then we
> just suffer some extra framing bytes, but we don't have to re-establish
> the TCP connection each time.
> 
> I do wonder if the extra round trips will be noticeable in high-latency
> conditions. E.g., if I'm 200ms away, converting the current
> ref-advertisement spew to "capabilities, then the client asks for refs,
> then we spew the refs" is going to cost an extra 200ms, even if the
> fetch just ends up being a noop. I'm not sure how bad that is in the
> grand scheme of things (after all, the TCP handshake involves some
> round-trips, too).

I think this is the price of extending the protocol in a backward
compatible way.  If we don't want to be backwards compatible (allowing
for graceful fallback to v1) then we could design this differently.
Even so we're not completely out of luck just yet.

Back when I introduced the GIT_PROTOCOL side-channel I was able to
demonstrate that arbitrary data could be sent to the server and it would
only respect the stuff it knows about.  This means that we can do a
follow up to v2 at some point to introduce an optimization where we can
stuff a request into GIT_PROTOCOL and short-circuit the first round-trip
if the server supports it.

> 
> > + Capability Advertisement
> > +--------------------------
> > +
> > +A server which decides to communicate (based on a request from a client)
> > +using protocol version 2, notifies the client by sending a version string
> > +in its initial response followed by an advertisement of its capabilities.
> > +Each capability is a key with an optional value.  Clients must ignore all
> > +unknown keys.  Semantics of unknown values are left to the definition of
> > +each key.  Some capabilities will describe commands which can be requested
> > +to be executed by the client.
> > +
> > +    capability-advertisement = protocol-version
> > +			       capability-list
> > +			       flush-pkt
> > +
> > +    protocol-version = PKT-LINE("version 2" LF)
> > +    capability-list = *capability
> > +    capability = PKT-LINE(key[=value] LF)
> > +
> > +    key = 1*CHAR
> > +    value = 1*CHAR
> > +    CHAR = 1*(ALPHA / DIGIT / "-" / "_")
> > +
> > +A client then responds to select the command it wants with any particular
> > +capabilities or arguments.  There is then an optional section where the
> > +client can provide any command specific parameters or queries.
> > +
> > +    command-request = command
> > +		      capability-list
> > +		      (command-args)
> > +		      flush-pkt
> > +    command = PKT-LINE("command=" key LF)
> > +    command-args = delim-pkt
> > +		   *arg
> > +    arg = 1*CHAR
> 
> For a single stateful TCP connection like git:// or git-over-ssh, the
> client would get the capabilities once and then issue a series of
> commands. For git-over-http, how does it work?
> 
> The client speaks first in HTTP, so we'd first make a request to get
> just the capabilities from the server? And then proceed from there with
> a series of requests, assuming that the capabilities for each server we
> subsequently contact are the same? That's probably reasonable (and
> certainly the existing http protocol makes that capabilities
> assumption).
> 
> I don't see any documentation on how this all works with http. But

I can add in a bit for the initial request when using http, but the rest
of it should function the same.

> reading patch 34, it looks like we just do the usual
> service=git-upload-pack request (with the magic request for v2), and
> then the server would send us capabilities. Which follows my line of
> thinking in the paragraph above.

Yes this is exactly how it should work.  First we make an info/refs
request and if the server speaks v2 then instead of a refs request we
should get back a capability listing.  Then subsequent requests are made
assuming the capabilities are the same like we've done with the
existing protocol.
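
As a rough illustration of what a client does with that capability
listing, the sketch below splits one advertised line of the form
key[=value] LF (per the grammar quoted above) into its key and optional
value.  The function name split_capability is hypothetical and this is
not the parsing code from the series; it assumes the pkt-line length
prefix has already been stripped.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper, for illustration only: split a capability line
 * "key[=value]" in place.  Returns the key and stores a pointer to the
 * value in *value, or NULL there when the capability has no value.
 */
char *split_capability(char *line, char **value)
{
	char *eq;

	assert(line && value);
	line[strcspn(line, "\n")] = '\0';	/* drop the trailing LF, if any */
	eq = strchr(line, '=');
	if (eq) {
		*eq = '\0';
		*value = eq + 1;
	} else {
		*value = NULL;
	}
	return line;
}
```

A client would ignore any key it does not recognize, as the quoted spec
text requires, and interpret the value according to the definition of
that key.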

The great thing about this is that from the POV of the git-client, it
doesn't care if it's speaking using the git://, ssh://, file://, or
http:// transport; it's all the same protocol.  In my next re-roll I'll
even drop the "# service" bit from the http server response and then the
responses will truly be identical in all cases.

-- 
Brandon Williams


* Re: [PATCH v3 07/35] connect: convert get_remote_heads to use struct packet_reader
  2018-02-23 21:30         ` Brandon Williams
@ 2018-02-23 21:48           ` Stefan Beller
  2018-02-23 22:56             ` Brandon Williams
  0 siblings, 1 reply; 362+ messages in thread
From: Stefan Beller @ 2018-02-23 21:48 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, Jeff King, Junio C Hamano, Jonathan Nieder, Derrick Stolee,
	Jeff Hostetler, Duy Nguyen

On Fri, Feb 23, 2018 at 1:30 PM, Brandon Williams <bmwill@google.com> wrote:
> On 02/22, Stefan Beller wrote:
>> > +static enum protocol_version discover_version(struct packet_reader *reader)
>> > +{
>> ...
>> > +
>> > +       /* Maybe process capabilities here, at least for v2 */
>> > +       switch (version) {
>> > +       case protocol_v1:
>> > +               /* Read the peeked version line */
>> > +               packet_reader_read(reader);
>> > +               break;
>> > +       case protocol_v0:
>> > +               break;
>> > +       case protocol_unknown_version:
>> > +               die("unknown protocol version: '%s'\n", reader->line);
>>
>> The following patches introduce more of the switch(version) cases.
>> And there it actually is a
>>     BUG("protocol version unknown? should have been set in discover_version")
>> but here it is a mere
>>   die (_("The server uses a different protocol version than we can
>> speak: %s\n"),
>>       reader->line);
>> so I would think here it is reasonable to add _(translation).
>
> This should be a BUG as it shouldn't ever be unknown at this point.  And
> I'll also drop that comment.

Huh?
Then I misunderstood the flow of the code. When the server announces its
answer is version 42, but the client cannot handle it, which die call is
responsible for reporting it to the user?
(That is technically a BUG on the server side, as we probably never
asked for v42, so I would not want to print BUG locally at the client?)


* Re: [PATCH v3 07/35] connect: convert get_remote_heads to use struct packet_reader
  2018-02-23 21:48           ` Stefan Beller
@ 2018-02-23 22:56             ` Brandon Williams
  0 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-02-23 22:56 UTC (permalink / raw)
  To: Stefan Beller
  Cc: git, Jeff King, Junio C Hamano, Jonathan Nieder, Derrick Stolee,
	Jeff Hostetler, Duy Nguyen

On 02/23, Stefan Beller wrote:
> On Fri, Feb 23, 2018 at 1:30 PM, Brandon Williams <bmwill@google.com> wrote:
> > On 02/22, Stefan Beller wrote:
> >> > +static enum protocol_version discover_version(struct packet_reader *reader)
> >> > +{
> >> ...
> >> > +
> >> > +       /* Maybe process capabilities here, at least for v2 */
> >> > +       switch (version) {
> >> > +       case protocol_v1:
> >> > +               /* Read the peeked version line */
> >> > +               packet_reader_read(reader);
> >> > +               break;
> >> > +       case protocol_v0:
> >> > +               break;
> >> > +       case protocol_unknown_version:
> >> > +               die("unknown protocol version: '%s'\n", reader->line);
> >>
> >> The following patches introduce more of the switch(version) cases.
> >> And there it actually is a
> >>     BUG("protocol version unknown? should have been set in discover_version")
> >> but here it is a mere
> >>   die (_("The server uses a different protocol version than we can
> >> speak: %s\n"),
> >>       reader->line);
> >> so I would think here it is reasonable to add _(translation).
> >
> > This should be a BUG as it shouldn't ever be unknown at this point.  And
> > I'll also drop that comment.
> 
> Huh?
> Then I miss-understood the flow of code. When the server announces its
> answer is version 42, but the client cannot handle it, which die call is
> responsible for reporting it to the user?
> (That is technically a BUG on the server side, as we probably never
> asked for v42, so I would not want to print BUG locally at the client?)

This is handled in
`determine_protocol_version_client(const char *server_response)`,
which is just a few lines out of context here.
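
To show the split of responsibilities being described, here is a
simplified sketch, not the actual function from the series: the first
line of the server's response is classified once, and an unrecognized
version is surfaced there as an ordinary error, so later switch
statements can treat "unknown" as a programming bug.  The enum and
function names below are invented for the example, and the line is
assumed to have had its trailing newline removed.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical names, for illustration only. */
enum sketch_version {
	SKETCH_UNKNOWN = -1,
	SKETCH_V0 = 0,
	SKETCH_V1 = 1,
	SKETCH_V2 = 2
};

/*
 * Classify the first line of a server response.  A line that does not
 * start with "version " is treated as v0 (a plain ref advertisement).
 */
enum sketch_version sketch_version_from_line(const char *line)
{
	static const char prefix[] = "version ";
	const char *v;

	assert(line);
	if (strncmp(line, prefix, sizeof(prefix) - 1))
		return SKETCH_V0;
	v = line + sizeof(prefix) - 1;
	if (!strcmp(v, "1"))
		return SKETCH_V1;
	if (!strcmp(v, "2"))
		return SKETCH_V2;
	/* e.g. "version 42": the caller reports this to the user */
	return SKETCH_UNKNOWN;
}
```

In this arrangement the caller of the classifier would die() with a
user-facing message on SKETCH_UNKNOWN, and code running after that point
could reasonably use BUG() if it ever saw the unknown value.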

-- 
Brandon Williams


* Re: [PATCH v3 13/35] ls-refs: introduce ls-refs server command
  2018-02-23  0:45         ` Brandon Williams
@ 2018-02-24  0:19           ` Brandon Williams
  2018-02-24  4:03             ` Jeff King
  2018-02-24  4:01           ` Jeff King
  1 sibling, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-02-24  0:19 UTC (permalink / raw)
  To: Jeff King; +Cc: git, sbeller, gitster, jrnieder, stolee, git, pclouds

On 02/22, Brandon Williams wrote:
> On 02/22, Jeff King wrote:
> > On Tue, Feb 06, 2018 at 05:12:50PM -0800, Brandon Williams wrote:
> > 
> > > +ls-refs takes in the following parameters wrapped in packet-lines:
> > > +
> > > +    symrefs
> > > +	In addition to the object pointed by it, show the underlying ref
> > > +	pointed by it when showing a symbolic ref.
> > > +    peel
> > > +	Show peeled tags.
> > > +    ref-pattern <pattern>
> > > +	When specified, only references matching the one of the provided
> > > +	patterns are displayed.
> > 
> > How do we match those patterns? That's probably an important thing to
> > include in the spec.
> 
> Yeah I thought about it when I first wrote it and was hoping that
> someone who nudge me in the right direction :)
> 
> > 
> > Looking at the code, I see:
> > 
> > > +/*
> > > + * Check if one of the patterns matches the tail part of the ref.
> > > + * If no patterns were provided, all refs match.
> > > + */
> > > +static int ref_match(const struct argv_array *patterns, const char *refname)
> > 
> > This kind of tail matching can't quite implement all of the current
> > behavior. Because we actually do the normal dwim_ref() matching, which
> > includes stuff like "refs/remotes/%s/HEAD".
> > 
> > The other problem with tail-matching is that it's inefficient on the
> > server. Ideally we could get a request for "master" and only look up
> > refs/heads/master, refs/tags/master, etc. And if there are 50,000 refs
> > in refs/pull, we wouldn't have to process those at all. Of course this
> > is no worse than the current code, which not only looks at each ref but
> > actually _sends_ it. But it would be nice if we could fix this.
> > 
> > There's some more discussion in this old thread:
> > 
> >   https://public-inbox.org/git/20161024132932.i42rqn2vlpocqmkq@sigill.intra.peff.net/
> 
> Thanks for the pointer.  I was told to be wary a while about about
> performance implications on the server but no discussion ensued till now
> about it :)
> 
> We always have the ability to extend the patterns accepted via a feature
> (or capability) to ls-refs, so maybe the best thing to do now would only
> support a few patterns with specific semantics.  Something like if you
> say "master" only match against refs/heads/ and refs/tags/ and if you
> want something else you would need to specify "refs/pull/master"?
> 
> That way we could only support globs at the end "master*" where * can
> match anything (including slashes)

After some in-office discussion, it seems the best thing to do for now
(if we change our minds later we can introduce a capability that extends
the supported patterns) is to left-anchor the ref-patterns and allow only
a single wildcard character '*', which matches zero or more characters
(including slashes '/').  The wildcard would be supported only at the end
of the ref pattern.  This means that a client that wants 'master' would
need to specify 'refs/heads/master' (and the other ref_rev_parse_rules
expansions) as ref patterns, but it could say "refs/heads/*" for all refs
under refs/heads.

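A minimal sketch of the matching rule proposed in the paragraph above, in
C.  This is only an illustration under the stated assumptions (patterns
are left-anchored, and '*' is honored solely as the final character); the
function name ref_pattern_match is hypothetical and is not the ref_match()
from the patch, which uses wildmatch.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper, for illustration only: returns 1 if `refname`
 * matches the left-anchored `pattern`, 0 otherwise.  A '*' as the last
 * character of the pattern matches zero or more characters, including
 * '/'.  Without a trailing '*', the names must be identical.
 */
int ref_pattern_match(const char *pattern, const char *refname)
{
	size_t len;

	assert(pattern && refname);
	len = strlen(pattern);

	if (len > 0 && pattern[len - 1] == '*')
		/* Trailing wildcard: compare only the prefix before '*'. */
		return !strncmp(pattern, refname, len - 1);

	/* No wildcard: the whole name must match exactly. */
	return !strcmp(pattern, refname);
}
```

Because the comparison is a plain prefix check, a server could in
principle restrict its ref iteration to the prefix and never visit, say,
refs/pull/ when asked for "refs/heads/*".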
> 
> > 
> > > +{
> > > +	char *pathbuf;
> > > +	int i;
> > > +
> > > +	if (!patterns->argc)
> > > +		return 1; /* no restriction */
> > > +
> > > +	pathbuf = xstrfmt("/%s", refname);
> > > +	for (i = 0; i < patterns->argc; i++) {
> > > +		if (!wildmatch(patterns->argv[i], pathbuf, 0)) {
> > > +			free(pathbuf);
> > > +			return 1;
> > > +		}
> > > +	}
> > > +	free(pathbuf);
> > > +	return 0;
> > > +}
> > 
> > Does the client have to be aware that we're using wildmatch? I think
> > they'd need "refs/heads/**" to actually implement what we usually
> > specify in refspecs as "refs/heads/*". Or does the lack of WM_PATHNAME
> > make this work with just "*"?
> > 
> > Do we anticipate that the client would left-anchor the refspec like
> > "/refs/heads/*" so that in theory the server could avoid looking outside
> > of /refs/heads/?
> 
> Yeah we may want to anchor it by providing the leading '/' instead of
> just "refs/<blah>".
> 
> > 
> > -Peff
> 
> I need to read over the discussion you linked to more but what sort of
> ref patterns do you believe we should support as part of the initial
> release of v2?  It seems like you wanted this at some point in the past
> so I assume you have an idea of what sort of filtering would be
> beneficial.
> 
> -- 
> Brandon Williams

-- 
Brandon Williams


* Re: [PATCH v3 21/35] fetch-pack: perform a fetch using v2
  2018-02-07  1:12     ` [PATCH v3 21/35] fetch-pack: perform a fetch using v2 Brandon Williams
@ 2018-02-24  0:54       ` Jonathan Tan
  2018-02-26 22:23         ` Brandon Williams
  2018-02-27 19:27       ` Stefan Beller
  1 sibling, 1 reply; 362+ messages in thread
From: Jonathan Tan @ 2018-02-24  0:54 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, sbeller, peff, gitster, jrnieder, stolee, git, pclouds

On Tue,  6 Feb 2018 17:12:58 -0800
Brandon Williams <bmwill@google.com> wrote:

> +	while ((oid = get_rev())) {
> +		packet_buf_write(req_buf, "have %s\n", oid_to_hex(oid));
> +		if (++haves_added >= INITIAL_FLUSH)
> +			break;
> +	};

Unnecessary semicolon after closing brace.

> +			/* Filter 'ref' by 'sought' and those that aren't local */
> +			if (everything_local(args, &ref, sought, nr_sought))
> +				state = FETCH_DONE;
> +			else
> +				state = FETCH_SEND_REQUEST;
> +			break;

I haven't looked at this patch in detail, but I found a bug that can be
reproduced if you patch the following onto this patch:

    --- a/t/t5702-protocol-v2.sh
    +++ b/t/t5702-protocol-v2.sh
    @@ -124,6 +124,7 @@ test_expect_success 'clone with file:// using protocol v2' '
     
     test_expect_success 'fetch with file:// using protocol v2' '
            test_commit -C file_parent two &&
    +       git -C file_parent tag -d one &&
     
            GIT_TRACE_PACKET=1 git -C file_child -c protocol.version=2 \
                    fetch origin 2>log &&
    @@ -133,7 +134,8 @@ test_expect_success 'fetch with file:// using protocol v2' '
            test_cmp expect actual &&
     
            # Server responded using protocol v2
    -       grep "fetch< version 2" log
    +       grep "fetch< version 2" log &&
    +       grep "have " log
     '

Merely including the second hunk (the one with 'grep "have "') does not
make the test fail, but including both the first and second hunks does.
That is, fetch v2 emits "have" only for remote refs that point to
objects we already have, not for local refs.

Everything still appears to work, except that packfiles are usually much
larger than they need to be.

I did some digging in the code and found out that the equivalent of
find_common() (which calls `for_each_ref(rev_list_insert_ref_oid,
NULL)`) was not called in v2. In v1, find_common() is called immediately
after everything_local(), but there is no equivalent in v2. (I quoted
the invocation of everything_local() in v2 above.)
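
The difference described above can be modelled with a toy sketch. It is
not the fetch-pack code: object ids are short strings, and the names
toy_ref, haves_from_remote_refs() and haves_from_local_refs() are
hypothetical. It only shows why deleting the tag on the server leaves
the v2 client with nothing to offer as a "have":

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy model: "oid" is any short string, not a real object id. */
struct toy_ref {
	const char *name;
	const char *oid;
};

static int have_object(const char *const *objects, size_t nr, const char *oid)
{
	size_t i;

	for (i = 0; i < nr; i++)
		if (!strcmp(objects[i], oid))
			return 1;
	return 0;
}

/*
 * Behaviour described in the report: only remote ref tips that already
 * exist locally become "have" candidates.
 */
static size_t haves_from_remote_refs(const struct toy_ref *remote, size_t nr_remote,
				     const char *const *objects, size_t nr_objects,
				     const char **haves)
{
	size_t i, n = 0;

	assert(haves);
	for (i = 0; i < nr_remote; i++)
		if (have_object(objects, nr_objects, remote[i].oid))
			haves[n++] = remote[i].oid;
	return n;
}

/* Expected behaviour: every local ref tip is a "have" candidate. */
static size_t haves_from_local_refs(const struct toy_ref *local, size_t nr_local,
				    const char **haves)
{
	size_t i, n = 0;

	assert(haves);
	for (i = 0; i < nr_local; i++)
		haves[n++] = local[i].oid;
	return n;
}
```

With a server that advertises only refs pointing at the new commit, the
first function yields no candidates, while the second still offers the
old commit through the client's own refs.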

* Re: [PATCH v3 13/35] ls-refs: introduce ls-refs server command
  2018-02-23  0:45         ` Brandon Williams
  2018-02-24  0:19           ` Brandon Williams
@ 2018-02-24  4:01           ` Jeff King
  2018-02-26 22:33             ` Junio C Hamano
  2018-02-27  0:02             ` Ævar Arnfjörð Bjarmason
  1 sibling, 2 replies; 362+ messages in thread
From: Jeff King @ 2018-02-24  4:01 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, sbeller, gitster, jrnieder, stolee, git, pclouds

On Thu, Feb 22, 2018 at 04:45:14PM -0800, Brandon Williams wrote:

> > This kind of tail matching can't quite implement all of the current
> > behavior. Because we actually do the normal dwim_ref() matching, which
> > includes stuff like "refs/remotes/%s/HEAD".
> > 
> > The other problem with tail-matching is that it's inefficient on the
> > server. Ideally we could get a request for "master" and only look up
> > refs/heads/master, refs/tags/master, etc. And if there are 50,000 refs
> > in refs/pull, we wouldn't have to process those at all. Of course this
> > is no worse than the current code, which not only looks at each ref but
> > actually _sends_ it. But it would be nice if we could fix this.
> > 
> > There's some more discussion in this old thread:
> > 
> >   https://public-inbox.org/git/20161024132932.i42rqn2vlpocqmkq@sigill.intra.peff.net/
> 
> Thanks for the pointer.  I was told to be wary a while ago about
> performance implications on the server but no discussion ensued till now
> about it :)
> 
> We always have the ability to extend the patterns accepted via a feature
> (or capability) to ls-refs, so maybe the best thing to do now would only
> support a few patterns with specific semantics.  Something like if you
> say "master" only match against refs/heads/ and refs/tags/ and if you
> want something else you would need to specify "refs/pull/master"?

The big question is whether you want to break compatibility with the
existing program behavior. If not, then I think you have to ask for
every variant in ref_rev_parse_rules (of which there are 6 variants).

Which sounds pretty gross, but it actually may not be _too_ bad. Most
fetches tend to ask for either a single name, or they use left-anchored
wildcards. So it would work to just have the client expand all of the
possibilities itself into fully-qualified refs, and keep the server as
dumb as possible.
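
A minimal sketch of that client-side expansion, assuming the six rules
are the ones in ref_rev_parse_rules (the list below is written from
memory and should be checked against refs.c). The helper name
expand_ref_name() and the fixed buffer size are hypothetical:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define NR_RULES 6
#define MAX_REF 256

/*
 * Assumed to mirror ref_rev_parse_rules in refs.c; check the tree
 * before relying on the exact list or its order.
 */
static const struct {
	const char *prefix;
	const char *suffix;
} rules[NR_RULES] = {
	{ "", "" },
	{ "refs/", "" },
	{ "refs/tags/", "" },
	{ "refs/heads/", "" },
	{ "refs/remotes/", "" },
	{ "refs/remotes/", "/HEAD" },
};

/*
 * Expand a short name such as "master" into every fully-qualified
 * candidate. Returns 0 on success, -1 if a candidate does not fit.
 */
static int expand_ref_name(const char *name, char out[NR_RULES][MAX_REF])
{
	int i;

	assert(name && *name);
	for (i = 0; i < NR_RULES; i++) {
		int n = snprintf(out[i], MAX_REF, "%s%s%s",
				 rules[i].prefix, name, rules[i].suffix);
		if (n < 0 || n >= MAX_REF)
			return -1;
	}
	return 0;
}
```

The client would then send each expanded string as its own pattern, so
the server never needs to know about the dwim rules.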

And then the server for now can just cull based on the pattern list,
like you have here. But later, we could optimize it to look up the
individual patterns, which should be cheaper, since we'd generally have
many fewer patterns than total refs.

> > Does the client have to be aware that we're using wildmatch? I think
> > they'd need "refs/heads/**" to actually implement what we usually
> > specify in refspecs as "refs/heads/*". Or does the lack of WM_PATHNAME
> > make this work with just "*"?
> > 
> > Do we anticipate that the client would left-anchor the refspec like
> > "/refs/heads/*" so that in theory the server could avoid looking outside
> > of /refs/heads/?
> 
> Yeah we may want to anchor it by providing the leading '/' instead of
> just "refs/<blah>".

I actually wonder if we should just specify that the patterns must
_always_ be fully-qualified, but may end with a single "/*" to iterate
over wildcards. Or even simpler, that "refs/heads/foo" would find that
ref itself, and anything under it.

That drops any question about how wildcards work (e.g., does "refs/foo*"
work to find "refs/foobar"?).
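
As a sketch of the "that ref itself, and anything under it" rule, under
the assumption that "under" means a '/' boundary (the function name
ref_under_prefix() is hypothetical):

```c
#include <assert.h>
#include <string.h>

/*
 * Returns 1 if refname is the prefix itself or lives below it in the
 * ref hierarchy, 0 otherwise. "refs/heads/foo" therefore covers
 * "refs/heads/foo/bar" but not "refs/heads/foobar".
 */
static int ref_under_prefix(const char *prefix, const char *refname)
{
	size_t len;

	assert(prefix && refname);
	len = strlen(prefix);
	if (!len)
		return 1; /* empty prefix: no restriction */
	if (strncmp(refname, prefix, len))
		return 0;
	if (prefix[len - 1] == '/')
		return 1; /* prefix already ends on a boundary */
	return refname[len] == '\0' || refname[len] == '/';
}
```

Under this rule the "refs/foo*" question does not arise, because there
is no wildcard character to interpret.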

> I need to read over the discussion you linked to more but what sort of
> ref patterns do you believe we should support as part of the initial
> release of v2?  It seems like you wanted this at some point in the past
> so I assume you have an idea of what sort of filtering would be
> beneficial.

My goals were just optimizing:

  1. Don't send all the refs across the wire if we can avoid it.

  2. Don't even iterate over all the refs internally if we can avoid it.

Especially with the new binary-searching packed-refs code, we should be
able to serve a request like "ls-refs refs/heads/*" without looking into
"refs/pull" or "refs/changes" at all.
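
To illustrate the second goal, here is a sketch over a plain sorted
array of ref names. It is not the packed-refs code; lower_bound() and
refs_with_prefix() are hypothetical names. The point is that a binary
search finds the start of the matching range, so refs outside the
prefix are never visited:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* First index in the sorted array whose name is >= key (bytewise). */
static size_t lower_bound(const char *const *refs, size_t nr, const char *key)
{
	size_t lo = 0, hi = nr;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (strcmp(refs[mid], key) < 0)
			lo = mid + 1;
		else
			hi = mid;
	}
	return lo;
}

/*
 * Count the refs that start with prefix and report where the range
 * begins. Names sharing a prefix are contiguous in sorted order, so
 * the loop stops at the first non-matching name.
 */
static size_t refs_with_prefix(const char *const *refs, size_t nr,
			       const char *prefix, size_t *first)
{
	size_t len = strlen(prefix);
	size_t i = lower_bound(refs, nr, prefix);
	size_t count = 0;

	assert(first);
	*first = i;
	while (i < nr && !strncmp(refs[i], prefix, len)) {
		i++;
		count++;
	}
	return count;
}
```

The cost is a logarithmic search plus the size of the matching range,
regardless of how many refs live under other hierarchies.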

-Peff

* Re: [PATCH v3 13/35] ls-refs: introduce ls-refs server command
  2018-02-24  0:19           ` Brandon Williams
@ 2018-02-24  4:03             ` Jeff King
  0 siblings, 0 replies; 362+ messages in thread
From: Jeff King @ 2018-02-24  4:03 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, sbeller, gitster, jrnieder, stolee, git, pclouds

On Fri, Feb 23, 2018 at 04:19:54PM -0800, Brandon Williams wrote:

> > We always have the ability to extend the patterns accepted via a feature
> > (or capability) to ls-refs, so maybe the best thing to do now would only
> > support a few patterns with specific semantics.  Something like if you
> > say "master" only match against refs/heads/ and refs/tags/ and if you
> > want something else you would need to specify "refs/pull/master"?
> > 
> > That way we could only support globs at the end "master*" where * can
> > match anything (including slashes)
> 
> After some in-office discussion it seems like the best thing to do for
> this (right now since if we change our mind we can just introduce a
> capability which extends the patterns supported) would be to left-anchor
> the ref-patterns and only allow for a single wildcard character '*'
> which matches zero or more characters (and doesn't care about slashes
> '/').  This wildcard character should only be supported at the end of
> the ref pattern.  This means that if a client wants 'master' then they
> would need to specify 'refs/heads/master' (and the other
> ref_rev_parse_rules expansions) as a ref pattern. But they could say
> "refs/heads/*" for all refs under refs/heads.

Heh, I just responded without having read this and came up with the same
suggestion.

So I agree that is the right path. Or the simplification I mentioned
that "refs/heads/master" would return that ref or possibly
"refs/heads/master/foo" if it exists. Remember that it's fine to be
overly broad here. This is purely an optimization in the advertisement,
as we'd still pick out the refs we care about in a separate step.
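
The left-anchored rule with an optional trailing '*' could look roughly
like this. It is a sketch only: match_tail_glob() is a hypothetical
name, and a '*' anywhere other than the end is simply compared
literally rather than rejected:

```c
#include <assert.h>
#include <string.h>

/*
 * Left-anchored match. A trailing '*' matches zero or more characters,
 * slashes included; without it the pattern must equal the ref name.
 */
static int match_tail_glob(const char *pattern, const char *refname)
{
	size_t len;

	assert(pattern && refname);
	len = strlen(pattern);
	if (len && pattern[len - 1] == '*')
		return !strncmp(refname, pattern, len - 1);
	return !strcmp(refname, pattern);
}
```

Because the result only narrows the advertisement, a match that is too
broad is harmless; the client still picks the refs it wants afterwards.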

-Peff

* Re: [PATCH v3 21/35] fetch-pack: perform a fetch using v2
  2018-02-24  0:54       ` Jonathan Tan
@ 2018-02-26 22:23         ` Brandon Williams
  0 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-02-26 22:23 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, sbeller, peff, gitster, jrnieder, stolee, git, pclouds

On 02/23, Jonathan Tan wrote:
> On Tue,  6 Feb 2018 17:12:58 -0800
> Brandon Williams <bmwill@google.com> wrote:
> 
> > +	while ((oid = get_rev())) {
> > +		packet_buf_write(req_buf, "have %s\n", oid_to_hex(oid));
> > +		if (++haves_added >= INITIAL_FLUSH)
> > +			break;
> > +	};
> 
> Unnecessary semicolon after closing brace.

Thanks, I'll remove that.

> 
> > +			/* Filter 'ref' by 'sought' and those that aren't local */
> > +			if (everything_local(args, &ref, sought, nr_sought))
> > +				state = FETCH_DONE;
> > +			else
> > +				state = FETCH_SEND_REQUEST;
> > +			break;
> 
> I haven't looked at this patch in detail, but I found a bug that can be
> reproduced if you patch the following onto this patch:
> 
>     --- a/t/t5702-protocol-v2.sh
>     +++ b/t/t5702-protocol-v2.sh
>     @@ -124,6 +124,7 @@ test_expect_success 'clone with file:// using protocol v2' '
>      
>      test_expect_success 'fetch with file:// using protocol v2' '
>             test_commit -C file_parent two &&
>     +       git -C file_parent tag -d one &&
>      
>             GIT_TRACE_PACKET=1 git -C file_child -c protocol.version=2 \
>                     fetch origin 2>log &&
>     @@ -133,7 +134,8 @@ test_expect_success 'fetch with file:// using protocol v2' '
>             test_cmp expect actual &&
>      
>             # Server responded using protocol v2
>     -       grep "fetch< version 2" log
>     +       grep "fetch< version 2" log &&
>     +       grep "have " log
>      '
> 
> Merely including the second hunk (the one with 'grep "have "') does not
> make the test fail, but including both the first and second hunks does.
> That is, fetch v2 emits "have" only for remote refs that point to
> objects we already have, not for local refs.
> 
> Everything still appears to work, except that packfiles are usually much
> larger than they need to be.
> 
> I did some digging in the code and found out that the equivalent of
> find_common() (which calls `for_each_ref(rev_list_insert_ref_oid,
> NULL)`) was not called in v2. In v1, find_common() is called immediately
> after everything_local(), but there is no equivalent in v2. (I quoted
> the invocation of everything_local() in v2 above.)

I actually caught this Friday morning when I realized that fetching from
a referenced repository would replicate objects instead of using them
from the referenced repository.  Thanks for pointing this out :)

-- 
Brandon Williams

* Re: [PATCH v3 13/35] ls-refs: introduce ls-refs server command
  2018-02-24  4:01           ` Jeff King
@ 2018-02-26 22:33             ` Junio C Hamano
  2018-02-27  0:02             ` Ævar Arnfjörð Bjarmason
  1 sibling, 0 replies; 362+ messages in thread
From: Junio C Hamano @ 2018-02-26 22:33 UTC (permalink / raw)
  To: Jeff King; +Cc: Brandon Williams, git, sbeller, jrnieder, stolee, git, pclouds

Jeff King <peff@peff.net> writes:

> I actually wonder if we should just specify that the patterns must
> _always_ be fully-qualified, but may end with a single "/*" to iterate
> over wildcards. Or even simpler, that "refs/heads/foo" would find that
> ref itself, and anything under it.
>
> That drops any question about how wildcards work (e.g., does "refs/foo*"
> work to find "refs/foobar"?).

Sounds quite sensible to me.


* Re: [PATCH v3 13/35] ls-refs: introduce ls-refs server command
  2018-02-24  4:01           ` Jeff King
  2018-02-26 22:33             ` Junio C Hamano
@ 2018-02-27  0:02             ` Ævar Arnfjörð Bjarmason
  2018-02-27  5:15               ` Jonathan Nieder
  1 sibling, 1 reply; 362+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2018-02-27  0:02 UTC (permalink / raw)
  To: Jeff King
  Cc: Brandon Williams, git, sbeller, gitster, jrnieder, stolee, git, pclouds


On Sat, Feb 24 2018, Jeff King jotted:

> On Thu, Feb 22, 2018 at 04:45:14PM -0800, Brandon Williams wrote:
>> > Does the client have to be aware that we're using wildmatch? I think
>> > they'd need "refs/heads/**" to actually implement what we usually
>> > specify in refspecs as "refs/heads/*". Or does the lack of WM_PATHNAME
>> > make this work with just "*"?
>> >
>> > Do we anticipate that the client would left-anchor the refspec like
>> > "/refs/heads/*" so that in theory the server could avoid looking outside
>> > of /refs/heads/?
>>
>> Yeah we may want to anchor it by providing the leading '/' instead of
>> just "refs/<blah>".
>
> I actually wonder if we should just specify that the patterns must
> _always_ be fully-qualified, but may end with a single "/*" to iterate
> over wildcards. Or even simpler, that "refs/heads/foo" would find that
> ref itself, and anything under it.

I agree that this is a very good trade-off for now, but I think having
an escape hatch makes sense. It looks like the protocol is implicitly
extendible since another parameter could be added, but maybe having such
a parameter from the get-go would make sense:

    pattern-type [simple|wildmatch|pcre|...]
    ref-pattern <pattern>

E.g.:

    pattern-type simple
    ref-pattern refs/tags/*
    ref-pattern refs/pull/*
    pattern-type wildmatch
    ref-pattern refs/**/2018
    pattern-type pcre
    ref-pattern ^refs/release/v-201[56789]-\d+$

I.e. each ref-pattern is typed by the pattern-type in play, with just
"simple" (with the behavior being discussed here) for now, anything else
(wildmatch, pcre etc.) would be an error.
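
A sketch of how a server might track the type in play while reading
such a request, assuming one argument per line as in the example above.
The struct and function names are hypothetical, and nothing here is
part of the posted series:

```c
#include <assert.h>
#include <string.h>

enum pattern_type { PATTERN_SIMPLE, PATTERN_UNSUPPORTED };

struct ls_refs_request {
	enum pattern_type type;	/* applies to the ref-pattern lines that follow */
	int nr_patterns;	/* accepted patterns so far */
};

static int skip_prefix_str(const char *str, const char *prefix, const char **out)
{
	size_t len = strlen(prefix);

	if (strncmp(str, prefix, len))
		return 0;
	*out = str + len;
	return 1;
}

/* Returns 0 if the line was accepted, -1 if it must be rejected. */
static int parse_ls_refs_arg(struct ls_refs_request *req, const char *line)
{
	const char *arg;

	assert(req && line);
	if (skip_prefix_str(line, "pattern-type ", &arg)) {
		req->type = !strcmp(arg, "simple") ?
			PATTERN_SIMPLE : PATTERN_UNSUPPORTED;
		return req->type == PATTERN_SIMPLE ? 0 : -1;
	}
	if (skip_prefix_str(line, "ref-pattern ", &arg)) {
		/* a pattern is only valid under a supported type */
		if (req->type != PATTERN_SIMPLE || !*arg)
			return -1;
		req->nr_patterns++;
		return 0;
	}
	return -1; /* unknown argument */
}
```

Starting the request in the "simple" state is what would let the
parameter be introduced later without breaking older clients.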

But it allows for adding more pattern types down the line, and it lets
e.g. in-house setups of git, where you control both the server and the
clients, trade a bit more work on the server (e.g. to match dated tags
created in the last 3 months) for more expressive patterns by setting
some config option.

The discussion upthread about:

> The other problem with tail-matching is that it's inefficient on the
> server[...]

Is also something that's only true in the current implementation, but
doesn't need to be, so it would be unfortunate to not work in an escape
hatch for that limitation.

E.g. if the refs were stored indexed using the method described at
https://swtch.com/~rsc/regexp/regexp4.html tail matching becomes no less
efficient than prefix matching, but a function of how many trigrams in
your index match the pattern given.

* Re: [PATCH v3 13/35] ls-refs: introduce ls-refs server command
  2018-02-27  0:02             ` Ævar Arnfjörð Bjarmason
@ 2018-02-27  5:15               ` Jonathan Nieder
  2018-02-27 18:02                 ` Brandon Williams
  0 siblings, 1 reply; 362+ messages in thread
From: Jonathan Nieder @ 2018-02-27  5:15 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Jeff King, Brandon Williams, git, sbeller, gitster, stolee, git, pclouds

Ævar Arnfjörð Bjarmason wrote:
> On Sat, Feb 24 2018, Jeff King jotted:

>> I actually wonder if we should just specify that the patterns must
>> _always_ be fully-qualified, but may end with a single "/*" to iterate
>> over wildcards. Or even simpler, that "refs/heads/foo" would find that
>> ref itself, and anything under it.
>
> I agree that this is a very good trade-off for now, but I think having
> an escape hatch makes sense. It looks like the protocol is implicitly
> extendible since another parameter could be added, but maybe having such
> a parameter from the get-go would make sense:

I prefer to rely on the implicit extensibility (following the general
YAGNI principle).

In other words, we can introduce a pattern-type later and make the
current pattern-type the default.

Thanks for looking to the future.

[...]
> E.g. if the refs were stored indexed using the method described at
> https://swtch.com/~rsc/regexp/regexp4.html tail matching becomes no less
> efficient than prefix matching, but a function of how many trigrams in
> your index match the pattern given.

I think the nearest planned change to ref storage is [1], which is
still optimized for prefix matching.  Longer term, maybe some day
we'll want a secondary index that supports infix matching, or maybe
we'll never need it. :)

Sincerely,
Jonathan

[1] https://public-inbox.org/git/CAJo=hJsZcAM9sipdVr7TMD-FD2V2W6_pvMQ791EGCDsDkQ033w@mail.gmail.com/#t

* Re: [PATCH v2 12/27] serve: introduce git-serve
  2018-01-26 10:39     ` Duy Nguyen
@ 2018-02-27  5:46       ` Jonathan Nieder
  0 siblings, 0 replies; 362+ messages in thread
From: Jonathan Nieder @ 2018-02-27  5:46 UTC (permalink / raw)
  To: Duy Nguyen
  Cc: Brandon Williams, Git Mailing List, Stefan Beller,
	Junio C Hamano, Jeff King, Philip Oakley, stolee

Hi Duy,

Duy Nguyen wrote:
> On Fri, Jan 26, 2018 at 6:58 AM, Brandon Williams <bmwill@google.com> wrote:

>> + stateless-rpc
>> +---------------
>> +
>> +If advertised, the `stateless-rpc` capability indicates that the server
>> +supports running commands in a stateless-rpc mode, which means that a
>> +command lasts for only a single request-response round.
>> +
>> +Normally a command can last for as many rounds as are required to
>> +complete it (multiple for negotiation during fetch or no additional
>> +trips in the case of ls-refs).  If the client sends the `stateless-rpc`
>> +capability with a value of `true` (in the form `stateless-rpc=true`)
>> +then the invoked command must only last a single round.
>
> Speaking of stateless-rpc, I remember last time this topic was brought
> up, there was some discussion to kind of optimize it for http as well,
> to fit the "client sends request, server responds data" model and
> avoid too many round trips (ideally everything happens in one round
> trip). Does it evolve to anything real? All the cool stuff happened
> while I was away, sorry if this was discussed and settled.

We have a few different ideas for improving negotiation.  They were
speculative enough that we didn't want to make them part of the
baseline protocol v2.  Feel free to poke me in a new thread. :)

Some teasers:

- allow both client and server to suggest commits in negotiation,
  instead of just the client?

- send a bloom filter for the peer to filter their suggestions
  against?

- send other basic information like maximum generation number or
  maximum commit date?

- exponential backoff in negotiation instead of linear walking?
  prioritizing ref tips?  Imitating the bitmap selection algorithm?

- at the "end" of negotiation, sending a graph data structure instead
  of a pack, to allow an extra round trip to produce a truly minimal
  pack?

Those are some initial ideas, but it's also likely that someone can
come up with some other experiments to try, too.  (E.g. we've looked
at various papers on set reconciliation, but they don't make enough
use of the graph structure to help much.)

Thanks,
Jonathan

* Re: [PATCH v3 02/35] pkt-line: introduce struct packet_reader
  2018-02-07  1:12     ` [PATCH v3 02/35] pkt-line: introduce struct packet_reader Brandon Williams
  2018-02-13  0:49       ` Jonathan Nieder
@ 2018-02-27  5:57       ` Jonathan Nieder
  2018-02-27  6:12         ` Jonathan Nieder
  1 sibling, 1 reply; 362+ messages in thread
From: Jonathan Nieder @ 2018-02-27  5:57 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, sbeller, peff, gitster, stolee, git, pclouds

Hi,

Brandon Williams wrote:

> Sometimes it is advantageous to be able to peek the next packet line
> without consuming it (e.g. to be able to determine the protocol version
> a server is speaking).  In order to do that introduce 'struct
> packet_reader' which is an abstraction around the normal packet reading
> logic.  This enables a caller to be able to peek a single line at a time
> using 'packet_reader_peek()' and having a caller consume a line by
> calling 'packet_reader_read()'.
>
> Signed-off-by: Brandon Williams <bmwill@google.com>
> ---
>  pkt-line.c | 59 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  pkt-line.h | 58 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 117 insertions(+)

I like it!

The questions and nits from
https://public-inbox.org/git/20180213004937.GB42272@aiede.svl.corp.google.com/
still apply.  In particular, the ownership of the buffers inside the
'struct packet_reader' is still unclear; could the packet_reader create
its own (strbuf) buffers so that the contract around them (who is allowed
to write to them; who is responsible for freeing them) is more obvious?
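
One way to picture a reader that owns its buffer is the sketch below.
It is not the patch's 'struct packet_reader': it reads from memory
rather than a file descriptor, handles only data and flush packets, and
every name in it is hypothetical. The input is borrowed and never
written; the decoded line belongs to the reader and is freed by it:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

enum pkt_status { PKT_EOF, PKT_NORMAL, PKT_FLUSH, PKT_ERROR };

struct peek_reader {
	const char *src;	/* borrowed input, never written or freed */
	size_t src_len;
	char *line;		/* owned by the reader, NUL-terminated payload */
	enum pkt_status status;
	int peeked;
};

static void peek_reader_init(struct peek_reader *r, const char *src, size_t len)
{
	memset(r, 0, sizeof(*r));
	r->src = src;
	r->src_len = len;
}

static void peek_reader_release(struct peek_reader *r)
{
	free(r->line);
	r->line = NULL;
}

static int hexval(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/* Decode one pkt-line: 4 hex digits of total length, then the payload. */
static enum pkt_status read_one(struct peek_reader *r)
{
	unsigned int len = 0;
	int i;

	free(r->line);
	r->line = NULL;
	if (!r->src_len)
		return PKT_EOF;
	if (r->src_len < 4)
		return PKT_ERROR;
	for (i = 0; i < 4; i++) {
		int v = hexval(r->src[i]);

		if (v < 0)
			return PKT_ERROR;
		len = len * 16 + (unsigned int)v;
	}
	if (!len) {
		r->src += 4;
		r->src_len -= 4;
		return PKT_FLUSH;
	}
	if (len < 4 || len > r->src_len)
		return PKT_ERROR;
	r->line = malloc(len - 4 + 1);
	if (!r->line)
		return PKT_ERROR;
	memcpy(r->line, r->src + 4, len - 4);
	r->line[len - 4] = '\0';
	r->src += len;
	r->src_len -= len;
	return PKT_NORMAL;
}

/* Look at the next line without consuming it. */
static enum pkt_status peek_reader_peek(struct peek_reader *r)
{
	if (!r->peeked) {
		r->status = read_one(r);
		r->peeked = 1;
	}
	return r->status;
}

/* Consume the next line, reusing a previously peeked one if present. */
static enum pkt_status peek_reader_read(struct peek_reader *r)
{
	if (r->peeked)
		r->peeked = 0;
	else
		r->status = read_one(r);
	return r->status;
}
```

With this shape the contract is short: callers may read r->line until
the next read or peek, and only peek_reader_release() frees it.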

Thanks,
Jonathan

* Re: [PATCH v3 06/35] transport: use get_refs_via_connect to get refs
  2018-02-07  1:12     ` [PATCH v3 06/35] transport: use get_refs_via_connect to get refs Brandon Williams
@ 2018-02-27  6:08       ` Jonathan Nieder
  2018-02-27 18:17         ` Brandon Williams
  0 siblings, 1 reply; 362+ messages in thread
From: Jonathan Nieder @ 2018-02-27  6:08 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, sbeller, peff, gitster, stolee, git, pclouds

Brandon Williams wrote:

> Remove code duplication and use the existing 'get_refs_via_connect()'
> function to retrieve a remote's heads in 'fetch_refs_via_pack()' and
> 'git_transport_push()'.
> 
> Signed-off-by: Brandon Williams <bmwill@google.com>
> ---
>  transport.c | 18 ++++--------------
>  1 file changed, 4 insertions(+), 14 deletions(-)

I like the diffstat.

[...]
> +++ b/transport.c
> @@ -230,12 +230,8 @@ static int fetch_refs_via_pack(struct transport *transport,
>  	args.cloning = transport->cloning;
>  	args.update_shallow = data->options.update_shallow;
>  
> -	if (!data->got_remote_heads) {
> -		connect_setup(transport, 0);
> -		get_remote_heads(data->fd[0], NULL, 0, &refs_tmp, 0,
> -				 NULL, &data->shallow);
> -		data->got_remote_heads = 1;
> -	}
> +	if (!data->got_remote_heads)
> +		refs_tmp = get_refs_via_connect(transport, 0);

The only difference between the old and new code is that the old code
passes NULL as 'extra_have' and the new code passes &data->extra_have.

That means this populates the data->extra_have oid_array.  Does it
matter?

> @@ -541,14 +537,8 @@ static int git_transport_push(struct transport *transport, struct ref *remote_re
>  	struct send_pack_args args;
>  	int ret;
>  
> -	if (!data->got_remote_heads) {
> -		struct ref *tmp_refs;
> -		connect_setup(transport, 1);
> -
> -		get_remote_heads(data->fd[0], NULL, 0, &tmp_refs, REF_NORMAL,
> -				 NULL, &data->shallow);
> -		data->got_remote_heads = 1;
> -	}
> +	if (!data->got_remote_heads)
> +		get_refs_via_connect(transport, 1);

not a new problem, just curious: Does this leak tmp_refs?

Same question as the other caller about whether we mind getting
extra_have populated.

Thanks,
Jonathan

* Re: [PATCH v3 02/35] pkt-line: introduce struct packet_reader
  2018-02-27  5:57       ` Jonathan Nieder
@ 2018-02-27  6:12         ` Jonathan Nieder
  0 siblings, 0 replies; 362+ messages in thread
From: Jonathan Nieder @ 2018-02-27  6:12 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, sbeller, peff, gitster, stolee, git, pclouds

Jonathan Nieder wrote:
> Brandon Williams wrote:

>> Sometimes it is advantageous to be able to peek the next packet line
>> without consuming it (e.g. to be able to determine the protocol version
>> a server is speaking).  In order to do that introduce 'struct
>> packet_reader' which is an abstraction around the normal packet reading
>> logic.  This enables a caller to be able to peek a single line at a time
>> using 'packet_reader_peek()' and having a caller consume a line by
>> calling 'packet_reader_read()'.
>>
>> Signed-off-by: Brandon Williams <bmwill@google.com>
>> ---
>>  pkt-line.c | 59 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>  pkt-line.h | 58 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>  2 files changed, 117 insertions(+)
>
> I like it!
>
> The questions and nits from
> https://public-inbox.org/git/20180213004937.GB42272@aiede.svl.corp.google.com/
> still apply.  In particular, the ownership of the buffers inside the
> 'struct packet_reader' is still unclear; could the packet_reader create
> its own (strbuf) buffers so that the contract around them (who is allowed
> to write to them; who is responsible for freeing them) is more obvious?

Just to be clear: I sent that review after you sent this patch, so
there should not have been any reason for me to expect the q's and
nits to magically not apply. ;-)

* Re: [PATCH v3 10/35] protocol: introduce enum protocol_version value protocol_v2
  2018-02-07  1:12     ` [PATCH v3 10/35] protocol: introduce enum protocol_version value protocol_v2 Brandon Williams
@ 2018-02-27  6:18       ` Jonathan Nieder
  2018-02-27 18:41         ` Brandon Williams
  0 siblings, 1 reply; 362+ messages in thread
From: Jonathan Nieder @ 2018-02-27  6:18 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, sbeller, peff, gitster, stolee, git, pclouds

Hi,

Brandon Williams wrote:

> Introduce protocol_v2, a new value for 'enum protocol_version'.
> Subsequent patches will fill in the implementation of protocol_v2.
>
> Signed-off-by: Brandon Williams <bmwill@google.com>
> ---

Yay!

[...]
> +++ b/builtin/fetch-pack.c
> @@ -201,6 +201,9 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix)
>  			   PACKET_READ_GENTLE_ON_EOF);
>  
>  	switch (discover_version(&reader)) {
> +	case protocol_v2:
> +		die("support for protocol v2 not implemented yet");
> +		break;

This code goes away in a later patch, so no need to do anything about
this, but the 'break' is redundant after the 'die'.

[...]
> --- a/builtin/receive-pack.c
> +++ b/builtin/receive-pack.c
> @@ -1963,6 +1963,12 @@ int cmd_receive_pack(int argc, const char **argv, const char *prefix)
>  		unpack_limit = receive_unpack_limit;
>  
>  	switch (determine_protocol_version_server()) {
> +	case protocol_v2:
> +		/*
> +		 * push support for protocol v2 has not been implemented yet,
> +		 * so ignore the request to use v2 and fallback to using v0.
> +		 */
> +		break;

As you mentioned in the cover letter, it's probably worth doing the
same fallback on the client side (send-pack), too.

Otherwise when this client talks to a new-enough server, it would
request protocol v2 and then get confused when the server responds
with the protocol v2 it requested.

Thanks,
Jonathan

* Re: [PATCH v3 14/35] connect: request remote refs using v2
  2018-02-22 19:25             ` Jonathan Tan
@ 2018-02-27  6:21               ` Jonathan Nieder
  2018-02-27 21:58                 ` Junio C Hamano
  0 siblings, 1 reply; 362+ messages in thread
From: Jonathan Nieder @ 2018-02-27  6:21 UTC (permalink / raw)
  To: Jonathan Tan
  Cc: Jeff King, Brandon Williams, git, sbeller, gitster, stolee, git, pclouds

Jonathan Tan wrote:
> On Thu, 22 Feb 2018 13:26:58 -0500
> Jeff King <peff@peff.net> wrote:

>> I agree that it shouldn't matter much here. But if the name argv_array
>> is standing in the way of using it, I think we should consider giving it
>> a more general name. I picked that not to evoke "this must be arguments"
>> but "this is terminated by a single NULL".
[...]
> This sounds reasonable - I withdraw my comment about using struct
> string_list.

Marking with #leftoverbits as a reminder to think about what such a
more general name would be (or what kind of docs to put in
argv-array.h) and make it so the next time I do a search for that
keyword.

Thanks,
Jonathan

* Re: [PATCH v3 14/35] connect: request remote refs using v2
  2018-02-07  1:12     ` [PATCH v3 14/35] connect: request remote refs using v2 Brandon Williams
  2018-02-21 22:54       ` Jonathan Tan
@ 2018-02-27  6:51       ` Jonathan Nieder
  2018-02-27 19:30         ` Brandon Williams
  1 sibling, 1 reply; 362+ messages in thread
From: Jonathan Nieder @ 2018-02-27  6:51 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, sbeller, peff, gitster, stolee, git, pclouds

Brandon Williams wrote:

> Teach the client to be able to request a remote's refs using protocol
> v2.  This is done by having a client issue a 'ls-refs' request to a v2
> server.

Yay, ls-remote support!

[...]
> --- a/builtin/upload-pack.c
> +++ b/builtin/upload-pack.c
> @@ -5,6 +5,7 @@
>  #include "parse-options.h"
>  #include "protocol.h"
>  #include "upload-pack.h"
> +#include "serve.h"

nit, no change needed in this patch: What is a good logical order for
the #includes here?  Bonus points if there's a tool to make it happen
automatically.

Asking since adding #includes like this at the end tends to result in
a harder-to-read list of #includes, sometimes with duplicates, and
often producing conflicts when multiple patches in flight add a
#include to the same file.

[...]
> @@ -48,11 +50,9 @@ int cmd_upload_pack(int argc, const char **argv, const char *prefix)
>  
>  	switch (determine_protocol_version_server()) {
>  	case protocol_v2:
> -		/*
> -		 * fetch support for protocol v2 has not been implemented yet,
> -		 * so ignore the request to use v2 and fallback to using v0.
> -		 */
> -		upload_pack(&opts);
> +		serve_opts.advertise_capabilities = opts.advertise_refs;
> +		serve_opts.stateless_rpc = opts.stateless_rpc;
> +		serve(&serve_opts);

Interesting!  daemon.c has its own (file-local) serve() function;
can one of the two be renamed?

I actually like both names: serve() is a traditional name for the
function a server calls when it's done setting up and is ready to
serve.  But the clash is confusing.

[...]
> +++ b/connect.c
> @@ -12,9 +12,11 @@
>  #include "sha1-array.h"
>  #include "transport.h"
>  #include "strbuf.h"
> +#include "version.h"
>  #include "protocol.h"
>  
>  static char *server_capabilities;
> +static struct argv_array server_capabilities_v2 = ARGV_ARRAY_INIT;

Can a quick doc comment describe these and how they relate?

Is only one of them set, based on which protocol version is in use?
Should server_capabilities be renamed to server_capabilities_v1?

>  static const char *parse_feature_value(const char *, const char *, int *);
>  
>  static int check_ref(const char *name, unsigned int flags)
> @@ -62,6 +64,33 @@ static void die_initial_contact(int unexpected)
>  		      "and the repository exists."));
>  }
>  
> +/* Checks if the server supports the capability 'c' */
> +int server_supports_v2(const char *c, int die_on_error)
> +{
> +	int i;
> +
> +	for (i = 0; i < server_capabilities_v2.argc; i++) {
> +		const char *out;
> +		if (skip_prefix(server_capabilities_v2.argv[i], c, &out) &&
> +		    (!*out || *out == '='))
> +			return 1;
> +	}
> +
> +	if (die_on_error)
> +		die("server doesn't support '%s'", c);
> +
> +	return 0;
> +}

Nice.
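
For readers following along, the check after skip_prefix() is what
keeps a short name from matching a longer capability. A standalone
sketch of the same idea, with a hypothetical has_capability() over a
plain string array and made-up capability values:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

static int skip_prefix_str(const char *str, const char *prefix, const char **out)
{
	size_t len = strlen(prefix);

	if (strncmp(str, prefix, len))
		return 0;
	*out = str + len;
	return 1;
}

/*
 * A capability is advertised either as "name" or as "name=value", so
 * the name must be followed by end-of-string or '='. Without that
 * check, asking for "ls" would wrongly match "ls-refs".
 */
static int has_capability(const char *const *caps, size_t nr, const char *name)
{
	size_t i;

	assert(name && *name);
	for (i = 0; i < nr; i++) {
		const char *rest;

		if (skip_prefix_str(caps[i], name, &rest) &&
		    (!*rest || *rest == '='))
			return 1;
	}
	return 0;
}
```
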
> +
> +static void process_capabilities_v2(struct packet_reader *reader)
> +{
> +	while (packet_reader_read(reader) == PACKET_READ_NORMAL)
> +		argv_array_push(&server_capabilities_v2, reader->line);
> +
> +	if (reader->status != PACKET_READ_FLUSH)
> +		die("protocol error");

Can this say more?  E.g. "expected flush after capabilities, got <foo>"?

[...]
> @@ -85,7 +114,7 @@ enum protocol_version discover_version(struct packet_reader *reader)
>  	/* Maybe process capabilities here, at least for v2 */

Is this comment out of date now?

>  	switch (version) {
>  	case protocol_v2:
> -		die("support for protocol v2 not implemented yet");
> +		process_capabilities_v2(reader);
>  		break;
>  	case protocol_v1:
>  		/* Read the peeked version line */
> @@ -293,6 +322,98 @@ struct ref **get_remote_heads(struct packet_reader *reader,
>  	return list;
>  }
>  
> +static int process_ref_v2(const char *line, struct ref ***list)

What does the return value represent?

Could it return the more typical 0 on success, -1 on error?

> +{
> +	int ret = 1;
> +	int i = 0;
> +	struct object_id old_oid;
> +	struct ref *ref;
> +	struct string_list line_sections = STRING_LIST_INIT_DUP;
> +
> +	if (string_list_split(&line_sections, line, ' ', -1) < 2) {

Can there be a comment describing the expected format?

> +		ret = 0;
> +		goto out;
> +	}
> +
> +	if (get_oid_hex(line_sections.items[i++].string, &old_oid)) {
> +		ret = 0;
> +		goto out;
> +	}
> +
> +	ref = alloc_ref(line_sections.items[i++].string);

Ref names cannot contain a space, so this is safe.  Good.

> +
> +	oidcpy(&ref->old_oid, &old_oid);
> +	**list = ref;
> +	*list = &ref->next;
> +
> +	for (; i < line_sections.nr; i++) {
> +		const char *arg = line_sections.items[i].string;
> +		if (skip_prefix(arg, "symref-target:", &arg))
> +			ref->symref = xstrdup(arg);

Using space-delimited fields in a single pkt-line means that
- values cannot contain a space
- total length is limited by the size of a pkt-line

Given the context, I think that's fine.  More generally it is tempting
to use a pkt-line per field to avoid the trouble v1 had with
capability lists crammed into a pkt-line, but I see why you used a
pkt-line per ref to avoid having to have sections-within-a-section.

My only potential worry is the length part: do we have an explicit
limit on the length of a ref name?  git-check-ref-format(1) doesn't
mention one.  A 32k ref name would be a bit ridiculous, though.

> +
> +		if (skip_prefix(arg, "peeled:", &arg)) {
> +			struct object_id peeled_oid;
> +			char *peeled_name;
> +			struct ref *peeled;
> +			if (get_oid_hex(arg, &peeled_oid)) {
> +				ret = 0;
> +				goto out;
> +			}

Can this also check that there's no trailing garbage after the oid?
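
To make the last few comments concrete (documented format, 0/-1
return convention, no trailing garbage after an oid), here is a
stand-alone sketch of a validator for one ls-refs response line.  It
is not the patch's code: check_ref_line() and is_hex_oid_prefix() are
hypothetical names, a 40-character SHA-1 hex oid is assumed, and
unlike process_ref_v2() it only validates and does not build a
struct ref.

```c
#include <ctype.h>
#include <string.h>

#define HEXSZ 40	/* SHA-1 hex length; an assumption of this sketch */

/* Returns 1 if the first HEXSZ bytes of 's' are all hex digits. */
static int is_hex_oid_prefix(const char *s)
{
	int i;

	/* isxdigit('\0') is false, so short strings stop at the NUL. */
	for (i = 0; i < HEXSZ; i++)
		if (!isxdigit((unsigned char)s[i]))
			return 0;
	return 1;
}

/*
 * Expected format, fields separated by a single space:
 *   <oid> <refname> [symref-target:<target>] [peeled:<oid>]
 * Returns 0 if the line is well formed, -1 otherwise.
 */
static int check_ref_line(const char *line)
{
	const char *p = line;

	if (!is_hex_oid_prefix(p) || p[HEXSZ] != ' ')
		return -1;
	p += HEXSZ + 1;

	/* refname: non-empty, runs to the next space or end of line */
	if (!*p || *p == ' ')
		return -1;
	p += strcspn(p, " ");

	while (*p == ' ') {
		p++;
		if (!strncmp(p, "peeled:", 7)) {
			p += 7;
			if (!is_hex_oid_prefix(p))
				return -1;
			p += HEXSZ;
			/* reject trailing garbage after the oid */
			if (*p && *p != ' ')
				return -1;
		} else {
			/* symref-target:<target> or an unknown attribute */
			if (!*p || *p == ' ')
				return -1;
			p += strcspn(p, " ");
		}
	}
	return 0;
}
```

Unknown attributes are skipped rather than rejected here, on the
assumption that new attributes may be added later; empty fields
(two spaces in a row) are treated as errors.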

[...]
> +
> +			peeled_name = xstrfmt("%s^{}", ref->name);

optional: can reuse a buffer to avoid allocation churn:

	struct strbuf peeled_name = STRBUF_INIT;

			strbuf_reset(&peeled_name);
			strbuf_addf(&peeled_name, "%s^{}", ref->name);
			/* or: strbuf_addstr(&peeled_name, ref->name);
			 *     strbuf_addstr(&peeled_name, "^{}"); */

 out:
 	strbuf_release(&peeled_name);

[...]
> +struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
> +			     struct ref **list, int for_push,
> +			     const struct argv_array *ref_patterns)
> +{
> +	int i;
> +	*list = NULL;
> +
> +	/* Check that the server supports the ls-refs command */
> +	/* Issue request for ls-refs */
> +	if (server_supports_v2("ls-refs", 1))
> +		packet_write_fmt(fd_out, "command=ls-refs\n");

Since the code is so clear, I don't think the above two comments are
helping.

> +
> +	if (server_supports_v2("agent", 0))
> +	    packet_write_fmt(fd_out, "agent=%s", git_user_agent_sanitized());

whitespace nit: mixing tabs and spaces.  Does "make style" catch this?

> +
> +	packet_delim(fd_out);
> +	/* When pushing we don't want to request the peeled tags */

Can you say more about this?  In theory it would be nice to have the
peeled tags since they name commits whose history can be excluded from
the pack.

> +	if (!for_push)
> +		packet_write_fmt(fd_out, "peel\n");
> +	packet_write_fmt(fd_out, "symrefs\n");

Are symrefs useful during push?

> +	for (i = 0; ref_patterns && i < ref_patterns->argc; i++) {
> +		packet_write_fmt(fd_out, "ref-pattern %s\n",
> +				 ref_patterns->argv[i]);
> +	}

The exciting part.

Why do these pkts end with \n?  I would have expected the pkt-line
framing to work to delimit them.
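
To spell out what I mean by the framing: each pkt-line starts with a
4-digit hex length that counts the 4 length bytes plus the payload, so
the receiver knows where the payload ends whether or not it has a
trailing LF.  A stand-alone sketch (pkt_line_format() is a made-up
helper for illustration, not a function in pkt-line.c):

```c
#include <stdio.h>
#include <string.h>

/*
 * Writes one pkt-line carrying 'payload' into 'out' as a
 * NUL-terminated string.  The 4-digit hex length covers the 4 length
 * bytes themselves plus the payload.  Returns the pkt-line length, or
 * -1 if it exceeds the 65520-byte limit or does not fit in 'out'.
 */
static int pkt_line_format(char *out, size_t outsz, const char *payload)
{
	size_t len = strlen(payload) + 4;

	if (len > 65520 || len + 1 > outsz)
		return -1;
	snprintf(out, outsz, "%04x%s", (unsigned)len, payload);
	return (int)len;
}
```

With this, "command=ls-refs\n" is framed as "0014command=ls-refs\n"
and "command=ls-refs" as "0013command=ls-refs"; either one is
unambiguous to the reader.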

> +	packet_flush(fd_out);
> +
> +	/* Process response from server */
> +	while (packet_reader_read(reader) == PACKET_READ_NORMAL) {
> +		if (!process_ref_v2(reader->line, &list))
> +			die("invalid ls-refs response: %s", reader->line);
> +	}
> +
> +	if (reader->status != PACKET_READ_FLUSH)
> +		die("protocol error");

Can this protocol error give more detail?  When diagnosing an error in
servers, proxies, or lower-level networking issues, informative protocol
errors can be very helpful (similar to syntax errors from a compiler).

[...]
> --- a/connect.h
> +++ b/connect.h
> @@ -16,4 +16,6 @@ extern int url_is_local_not_ssh(const char *url);
>  struct packet_reader;
>  extern enum protocol_version discover_version(struct packet_reader *reader);
>  
> +extern int server_supports_v2(const char *c, int die_on_error);

const char *cap, maybe?

[...]
> --- a/remote.h
> +++ b/remote.h
> @@ -151,10 +151,14 @@ void free_refs(struct ref *ref);
>  
>  struct oid_array;
>  struct packet_reader;
> +struct argv_array;
>  extern struct ref **get_remote_heads(struct packet_reader *reader,
>  				     struct ref **list, unsigned int flags,
>  				     struct oid_array *extra_have,
>  				     struct oid_array *shallow_points);
> +extern struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
> +				    struct ref **list, int for_push,
> +				    const struct argv_array *ref_patterns);

What is the difference between get_remote_heads and get_remote_refs?
A comment might help.  (BTW, thanks for making the new saner name to
replace get_remote_heads!)

[...]
> --- /dev/null
> +++ b/t/t5702-protocol-v2.sh
> @@ -0,0 +1,53 @@
> +#!/bin/sh
> +
> +test_description='test git wire-protocol version 2'

Woot!

[...]
> +test_expect_success 'list refs with git:// using protocol v2' '
> +	GIT_TRACE_PACKET=1 git -c protocol.version=2 \
> +		ls-remote --symref "$GIT_DAEMON_URL/parent" >actual 2>log &&
> +
> +	# Client requested to use protocol v2
> +	grep "git> .*\\\0\\\0version=2\\\0$" log &&
> +	# Server responded using protocol v2
> +	grep "git< version 2" log &&

optional: Could anchor these greps to make the test tighter (e.g. to
not match "version 20").

Thanks,
Jonathan


* Re: [PATCH v3 18/35] fetch: pass ref patterns when fetching
  2018-02-07  1:12     ` [PATCH v3 18/35] fetch: pass ref patterns when fetching Brandon Williams
@ 2018-02-27  6:53       ` Jonathan Nieder
  0 siblings, 0 replies; 362+ messages in thread
From: Jonathan Nieder @ 2018-02-27  6:53 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, sbeller, peff, gitster, stolee, git, pclouds

Brandon Williams wrote:

> Construct a list of ref patterns to be passed to
> 'transport_get_remote_refs()' from the refspec to be used during the
> fetch.  This list of ref patterns will be used to allow the server to
> filter the ref advertisement when communicating using protocol v2.
>
> Signed-off-by: Brandon Williams <bmwill@google.com>
> ---
>  builtin/fetch.c | 12 +++++++++++-
>  1 file changed, 11 insertions(+), 1 deletion(-)

Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Nice.

I take it that tests covering this come later in the series?


* Re: [PATCH v3 13/35] ls-refs: introduce ls-refs server command
  2018-02-27  5:15               ` Jonathan Nieder
@ 2018-02-27 18:02                 ` Brandon Williams
  0 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-02-27 18:02 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: Ævar Arnfjörð Bjarmason, Jeff King, git, sbeller,
	gitster, stolee, git, pclouds

On 02/26, Jonathan Nieder wrote:
> Ævar Arnfjörð Bjarmason wrote:
> > On Sat, Feb 24 2018, Jeff King jotted:
> 
> >> I actually wonder if we should just specify that the patterns must
> >> _always_ be fully-qualified, but may end with a single "/*" to iterate
> >> over wildcards. Or even simpler, that "refs/heads/foo" would find that
> >> ref itself, and anything under it.
> >
> > I agree that this is a very good trade-off for now, but I think having
> > an escape hatch makes sense. It looks like the protocol is implicitly
> > extendible since another parameter could be added, but maybe having such
> > a parameter from the get-go would make sense:
> 
> I prefer to rely on the implicit extensibility (following the general
> YAGNI principle).
> 
> In other words, we can introduce a pattern-type later and make the
> current pattern-type the default.

Yeah this is what I'm going to do for the next re-roll of the series,
make the pattern matching simple and later we can extend it if we want
since we already have the ability to add new features to commands (you
can see how I added shallow to fetch for an example).
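
As a rough illustration of what "simple" could look like, here is one
reading of Peff's suggestion above: a ref matches if it is exactly the
pattern or lives underneath it.  This is only a sketch, the helper
name ref_matches_prefix() is made up, and the re-roll may well end up
with different semantics.

```c
#include <string.h>

/*
 * Returns 1 if 'refname' is exactly 'pattern' or lives under it
 * (pattern followed by '/'), 0 otherwise.
 */
static int ref_matches_prefix(const char *refname, const char *pattern)
{
	size_t len = strlen(pattern);

	if (strncmp(refname, pattern, len))
		return 0;
	return refname[len] == '\0' || refname[len] == '/';
}
```

Under this reading "refs/heads/foo" matches "refs/heads/foo" and
"refs/heads/foo/bar" but not "refs/heads/foobar".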

> 
> Thanks for looking to the future.
> 
> [...]
> > E.g. if the refs were stored indexed using the method described at
> > https://swtch.com/~rsc/regexp/regexp4.html tail matching becomes no less
> > efficient than prefix matching, but a function of how many trigrams in
> > your index match the pattern given.
> 
> I think the nearest planned change to ref storage is [1], which is
> still optimized for prefix matching.  Longer term, maybe some day
> we'll want a secondary index that supports infix matching, or maybe
> we'll never need it. :)
> 
> Sincerely,
> Jonathan
> 
> [1] https://public-inbox.org/git/CAJo=hJsZcAM9sipdVr7TMD-FD2V2W6_pvMQ791EGCDsDkQ033w@mail.gmail.com/#t

-- 
Brandon Williams


* Re: [PATCH v3 12/35] serve: introduce git-serve
  2018-02-23 21:33         ` Brandon Williams
@ 2018-02-27 18:05           ` Jonathan Tan
  2018-02-27 18:34             ` Brandon Williams
  0 siblings, 1 reply; 362+ messages in thread
From: Jonathan Tan @ 2018-02-27 18:05 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, sbeller, peff, gitster, jrnieder, stolee, git, pclouds

On Fri, 23 Feb 2018 13:33:15 -0800
Brandon Williams <bmwill@google.com> wrote:

> On 02/21, Jonathan Tan wrote:
> > As someone who is implementing the server side of protocol V2 in JGit, I
> > now have a bit more insight into this :-)
> > 
> > First of all, I used to not have a strong opinion on the existence of a
> > new endpoint, but now I think that it's better to *not* have git-serve.
> > As it is, as far as I can tell, upload-pack also needs to support (and
> > does support, as of the end of this patch set) protocol v2 anyway, so it
> > might be better to merely upgrade upload-pack.
> 
> Having it allows for easier testing and the easy ability to make it a
> true endpoint when we want to.  As of right now, git-serve isn't an
> endpoint as you can't issue requests there via http-backend or
> git-daemon.

Is git-serve planned to be a new endpoint?

If yes, I now don't think it's a good idea - it's an extra burden to
reimplementors without much benefit (to have a new endpoint that does
the same things as upload-pack).

If not, I don't think that easier testing makes it worth having an extra
binary. Couldn't the same tests be done by running upload-pack directly?


* Re: [PATCH v3 02/35] pkt-line: introduce struct packet_reader
  2018-02-13  0:49       ` Jonathan Nieder
@ 2018-02-27 18:14         ` Brandon Williams
  2018-02-27 19:20           ` Jonathan Nieder
  0 siblings, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-02-27 18:14 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: git, sbeller, peff, gitster, stolee, git, pclouds

On 02/12, Jonathan Nieder wrote:
> [...]
> > --- a/pkt-line.h
> > +++ b/pkt-line.h
> > @@ -111,6 +111,64 @@ char *packet_read_line_buf(char **src_buf, size_t *src_len, int *size);
> >   */
> >  ssize_t read_packetized_to_strbuf(int fd_in, struct strbuf *sb_out);
> >  
> > +struct packet_reader {
> > +	/* source file descriptor */
> > +	int fd;
> > +
> > +	/* source buffer and its size */
> > +	char *src_buffer;
> > +	size_t src_len;
> 
> Can or should this be a strbuf?
> 
> > +
> > +	/* buffer that pkt-lines are read into and its size */
> > +	char *buffer;
> > +	unsigned buffer_size;
> 
> Likewise.
> 

This struct is set up to be a drop-in replacement for the existing
packet_read() family of functions.  Because of this, I tried to keep the
interface as similar as possible, both to make converting callers easy
and to avoid needing any cleanup (the struct is really just a wrapper
and doesn't own anything).

-- 
Brandon Williams


* Re: [PATCH v3 06/35] transport: use get_refs_via_connect to get refs
  2018-02-27  6:08       ` Jonathan Nieder
@ 2018-02-27 18:17         ` Brandon Williams
  2018-02-27 19:25           ` Jonathan Nieder
  0 siblings, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-02-27 18:17 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: git, sbeller, peff, gitster, stolee, git, pclouds

On 02/26, Jonathan Nieder wrote:
> Brandon Williams wrote:
> 
> > Remove code duplication and use the existing 'get_refs_via_connect()'
> > function to retrieve a remote's heads in 'fetch_refs_via_pack()' and
> > 'git_transport_push()'.
> > 
> > Signed-off-by: Brandon Williams <bmwill@google.com>
> > ---
> >  transport.c | 18 ++++--------------
> >  1 file changed, 4 insertions(+), 14 deletions(-)
> 
> I like the diffstat.
> 
> [...]
> > +++ b/transport.c
> > @@ -230,12 +230,8 @@ static int fetch_refs_via_pack(struct transport *transport,
> >  	args.cloning = transport->cloning;
> >  	args.update_shallow = data->options.update_shallow;
> >  
> > -	if (!data->got_remote_heads) {
> > -		connect_setup(transport, 0);
> > -		get_remote_heads(data->fd[0], NULL, 0, &refs_tmp, 0,
> > -				 NULL, &data->shallow);
> > -		data->got_remote_heads = 1;
> > -	}
> > +	if (!data->got_remote_heads)
> > +		refs_tmp = get_refs_via_connect(transport, 0);
> 
> The only difference between the old and new code is that the old code
> passes NULL as 'extra_have' and the new code passes &data->extra_have.
> 
> That means this populates the data->extra_have oid_array.  Does it
> matter?
> 
> > @@ -541,14 +537,8 @@ static int git_transport_push(struct transport *transport, struct ref *remote_re
> >  	struct send_pack_args args;
> >  	int ret;
> >  
> > -	if (!data->got_remote_heads) {
> > -		struct ref *tmp_refs;
> > -		connect_setup(transport, 1);
> > -
> > -		get_remote_heads(data->fd[0], NULL, 0, &tmp_refs, REF_NORMAL,
> > -				 NULL, &data->shallow);
> > -		data->got_remote_heads = 1;
> > -	}
> > +	if (!data->got_remote_heads)
> > +		get_refs_via_connect(transport, 1);
> 
> not a new problem, just curious: Does this leak tmp_refs?

Maybe, though it's removed by this patch.

> 
> Same question as the other caller about whether we mind getting
> extra_have populated.

I don't think it's a problem to have extra_have populated; at least I
haven't seen anything to lead me to believe it would be a problem.

-- 
Brandon Williams


* Re: [PATCH v3 19/35] push: pass ref patterns when pushing
  2018-02-07  1:12     ` [PATCH v3 19/35] push: pass ref patterns when pushing Brandon Williams
@ 2018-02-27 18:23       ` Stefan Beller
  0 siblings, 0 replies; 362+ messages in thread
From: Stefan Beller @ 2018-02-27 18:23 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, Jeff King, Junio C Hamano, Jonathan Nieder, Derrick Stolee,
	Jeff Hostetler, Duy Nguyen

On Tue, Feb 6, 2018 at 5:12 PM, Brandon Williams <bmwill@google.com> wrote:
> Construct a list of ref patterns to be passed to 'get_refs_list()' from
> the refspec to be used during the push.  This list of ref patterns will
> be used to allow the server to filter the ref advertisement when
> communicating using protocol v2.
>
> Signed-off-by: Brandon Williams <bmwill@google.com>
> ---
>  transport.c | 17 ++++++++++++++++-
>  1 file changed, 16 insertions(+), 1 deletion(-)
>
> diff --git a/transport.c b/transport.c
> index dfc603b36..6ea3905e3 100644
> --- a/transport.c
> +++ b/transport.c
> @@ -1026,11 +1026,26 @@ int transport_push(struct transport *transport,
>                 int porcelain = flags & TRANSPORT_PUSH_PORCELAIN;
>                 int pretend = flags & TRANSPORT_PUSH_DRY_RUN;
>                 int push_ret, ret, err;
> +               struct refspec *tmp_rs;
> +               struct argv_array ref_patterns = ARGV_ARRAY_INIT;
> +               int i;
>
>                 if (check_push_refs(local_refs, refspec_nr, refspec) < 0)
>                         return -1;
>
> -               remote_refs = transport->vtable->get_refs_list(transport, 1, NULL);
> +               tmp_rs = parse_push_refspec(refspec_nr, refspec);
> +               for (i = 0; i < refspec_nr; i++) {
> +                       if (tmp_rs[i].dst)
> +                               argv_array_push(&ref_patterns, tmp_rs[i].dst);
> +                       else if (tmp_rs[i].src && !tmp_rs[i].exact_sha1)
> +                               argv_array_push(&ref_patterns, tmp_rs[i].src);

else /* !tmp_rs[i].dst && (!tmp_rs[i].src || tmp_rs[i].exact_sha1) */

I would think the case of !dst && !src cannot happen, as then there is
no refspec, but what about the !!exact_sha1 case ?

I'd think that is something like

    git push origin $(git rev-parse HEAD)

for which I'd think we'd bail out anyway?
But that would happen at a different place, here we can ignore
the exact hashes for listing refs purposes.

Can you add a comment or rather explain in the commit
message to make this less confusing?

Stefan


* Re: [PATCH v3 22/35] upload-pack: support shallow requests
  2018-02-07  1:12     ` [PATCH v3 22/35] upload-pack: support shallow requests Brandon Williams
  2018-02-07 19:00       ` Stefan Beller
@ 2018-02-27 18:29       ` Jonathan Nieder
  2018-02-27 18:57         ` Brandon Williams
  1 sibling, 1 reply; 362+ messages in thread
From: Jonathan Nieder @ 2018-02-27 18:29 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, sbeller, peff, gitster, stolee, git, pclouds

Hi,

Brandon Williams wrote:

> Add the 'shallow' feature to the protocol version 2 command 'fetch'
> which indicates that the server supports shallow clients and deepen
> requets.
>
> Signed-off-by: Brandon Williams <bmwill@google.com>
> ---
>  Documentation/technical/protocol-v2.txt |  67 +++++++++++++++-
>  serve.c                                 |   2 +-
>  t/t5701-git-serve.sh                    |   2 +-
>  upload-pack.c                           | 138 +++++++++++++++++++++++---------
>  upload-pack.h                           |   3 +
>  5 files changed, 173 insertions(+), 39 deletions(-)

Yay!  We've been running with this for a while at Google (for file://
fetches, at least) and it's been working well.

[...]
> --- a/Documentation/technical/protocol-v2.txt
> +++ b/Documentation/technical/protocol-v2.txt
> @@ -201,12 +201,42 @@ packet-lines:
>  	to its base by position in pack rather than by an oid.  That is,
>  	they can read OBJ_OFS_DELTA (ake type 6) in a packfile.
>  
> +    shallow <oid>
> +	A client must notify the server of all objects for which it only
> +	has shallow copies of (meaning that it doesn't have the parents

Grammar nit: "for which it only has shallow copies of" should be e.g.
"for which it only has shallow copies" or "that it only has shallow
copies of" or "that it only has shallow copies for".

I think s/objects/commits/ would also help clarify.

> +	of a commit) by supplying a 'shallow <oid>' line for each such
> +	object so that the serve is aware of the limitations of the

s/serve/server/

> +	client's history.

Is it worth mentioning that this is about negotiation?  E.g. "so that
the server is aware that the client may not have all objects reachable
from such commits".

> +
> +    deepen <depth>
> +	Request that the fetch/clone should be shallow having a commit depth of

nit: s/Request/Requests/, for consistency with the others?

> +	<depth> relative to the remote side.

What does the value of <depth> mean? E.g. does a depth of 1 mean to
fetch only the commits named in "have", 2 to fetch those commits plus
their parents, etc, or am I off by one?

Is <depth> always a positive number?

What happens if <depth> starts with a 0?  Is that a client error?

> +
> +    deepen-relative
> +	Requests that the semantics of the "deepen" command be changed
> +	to indicate that the depth requested is relative to the clients
> +	current shallow boundary, instead of relative to the remote
> +	refs.

s/clients/client's/

s/remote refs/requested commits/ or "wants" or something.

> +
> +    deepen-since <timestamp>
> +	Requests that the shallow clone/fetch should be cut at a
> +	specific time, instead of depth.  Internally it's equivalent of
> +	doing "rev-list --max-age=<timestamp>". Cannot be used with
> +	"deepen".

Nits:
  s/rev-list/git rev-list/
  s/equivalent of/equivalent to/ or 'the equivalent of'.

Since the git-rev-list(1) manpage doesn't tell me: what is the format
of <timestamp>?  And is the requested time interval inclusive or
exclusive?

> +
> +    deepen-not <rev>
> +	Requests that the shallow clone/fetch should be cut at a
> +	specific revision specified by '<rev>', instead of a depth.
> +	Internally it's equivalent of doing "rev-list --not <rev>".
> +	Cannot be used with "deepen", but can be used with
> +	"deepen-since".

Interesting.

nit: s/rev-list/git rev-list/

What is the format of <rev>?  E.g. can it be an arbitrary revision
specifier or is it an oid?

[...]
>      output = *section
> -    section = (acknowledgments | packfile)
> +    section = (acknowledgments | shallow-info | packfile)
>  	      (flush-pkt | delim-pkt)

It looks like sections can go in an arbitrary order.  Are there
tests to make sure the server can cope with reordering?  (I ask
not because I mistrust the server but because I have some vague
hope that other server implementations might be inspired by our
tests.)

[...]
> @@ -215,6 +245,11 @@ header.
>      nak = PKT-LINE("NAK" LF)
>      ack = PKT-LINE("ACK" SP obj-id LF)
>  
> +    shallow-info = PKT-LINE("shallow-info" LF)
> +		   *PKT-LINE((shallow | unshallow) LF)
> +    shallow = "shallow" SP obj-id
> +    unshallow = "unshallow" SP obj-id

Likewise: it looks like shallows and unshallows can be intermixed; can
this be either (a) tightened or (b) covered by tests to make sure a
later refactoring doesn't accidentally tighten it?
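
For (b), the kind of client-side handling I have in mind treats each
line independently, so the order of "shallow" and "unshallow" lines
does not matter.  A stand-alone sketch, assuming 40-character SHA-1
hex oids and that the trailing LF has already been stripped; the enum
and function names are made up for illustration:

```c
#include <ctype.h>
#include <string.h>

enum shallow_kind {
	SHALLOW_INVALID = -1,
	SHALLOW_ADD,	/* "shallow <oid>" */
	SHALLOW_REMOVE	/* "unshallow <oid>" */
};

/* Classifies one line of a shallow-info section. */
static enum shallow_kind classify_shallow_line(const char *line)
{
	const char *oid;
	enum shallow_kind kind;
	int i;

	if (!strncmp(line, "shallow ", 8)) {
		oid = line + 8;
		kind = SHALLOW_ADD;
	} else if (!strncmp(line, "unshallow ", 10)) {
		oid = line + 10;
		kind = SHALLOW_REMOVE;
	} else {
		return SHALLOW_INVALID;
	}

	/* The rest of the line must be exactly 40 hex digits. */
	for (i = 0; i < 40; i++)
		if (!isxdigit((unsigned char)oid[i]))
			return SHALLOW_INVALID;
	return oid[40] == '\0' ? kind : SHALLOW_INVALID;
}
```

A test could then feed the same set of lines in several orders and
check that the resulting shallow set is identical.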

[...]
> @@ -247,6 +282,36 @@ header.
>  	  determined the objects it plans to send to the client and no
>  	  further negotiation is needed.
>  
> +----
> +    shallow-info section
> +	If the client has requested a shallow fetch/clone, a shallow
> +	client requests a fetch or the server is shallow then the
> +	server's response may include a shallow-info section.

I'm having trouble parsing this sentence.

>                                                              The
> +	shallow-info section will be include if (due to one of the above

nit: s/include/included/

> +	conditions) the server needs to inform the client of any shallow
> +	boundaries or adjustments to the clients already existing
> +	shallow boundaries.
> +
> +	* Always begins with the section header "shallow-info"
> +
> +	* If a positive depth is requested, the server will compute the
> +	  set of commits which are no deeper than the desired depth.
> +
> +	* The server sends a "shallow obj-id" line for each commit whose
> +	  parents will not be sent in the following packfile.
> +
> +	* The server sends an "unshallow obj-id" line for each commit
> +	  which the client has indicated is shallow, but is no longer
> +	  shallow as a result of the fetch (due to its parents being
> +	  sent in the following packfile).
> +
> +	* The server MUST NOT send any "unshallow" lines for anything
> +	  which the client has not indicated was shallow as a part of
> +	  its request.
> +
> +	* This section is only included if a packfile section is also
> +	  included in the response.

Neat.

I wonder if shallow information is also useful for negotiation.  I
guess mostly not today, since the client is the only one that suggests
"have"s for negotiation and although it could accidentally exclude too
many "have"s by going down a path from a server's ack, that feels like
a rare case.

[...]
> --- a/upload-pack.c
> +++ b/upload-pack.c
> @@ -710,7 +710,6 @@ static void deepen(int depth, int deepen_relative,
>  	}
>  
>  	send_unshallow(shallows);
> -	packet_flush(1);

What does this part do?
>  }
>  
>  static void deepen_by_rev_list(int ac, const char **av,
> @@ -722,7 +721,52 @@ static void deepen_by_rev_list(int ac, const char **av,
>  	send_shallow(result);
>  	free_commit_list(result);
>  	send_unshallow(shallows);
> -	packet_flush(1);

Same question.

> +}
> +
> +static int send_shallow_list(int depth, int deepen_rev_list,
> +			     timestamp_t deepen_since,
> +			     struct string_list *deepen_not,
> +			     struct object_array *shallows)

What does the return value from this function represent?  It doesn't
appear to be the usual "0 means success, -1 means failure" so a
comment would help.

> +{
> +	int ret = 0;
> +
> +	if (depth > 0 && deepen_rev_list)
> +		die("git upload-pack: deepen and deepen-since (or deepen-not) cannot be used together");

nit: long line (can/should "make style" find these?)

The error message is pretty long, longer than a typical 80-column
terminal, so probably best to find a way to make the message shorter.
E.g.

		die("upload-pack: deepen cannot be combined with other deepen-* options");

That still would be >80 columns with the indent, so the usual style
would be to break it into multiple strings and rely on C's adjacent
string literal concatenation (yuck):

		die("upload-pack: "
		    "deepen cannot be combined with other deepen-* options");

[...]
> +	if (depth > 0) {
> +		deepen(depth, deepen_relative, shallows);
> +		ret = 1;
> +	} else if (deepen_rev_list) {
> +		struct argv_array av = ARGV_ARRAY_INIT;
> +		int i;
> +
> +		argv_array_push(&av, "rev-list");
> +		if (deepen_since)
> +			argv_array_pushf(&av, "--max-age=%"PRItime, deepen_since);
> +		if (deepen_not->nr) {
> +			argv_array_push(&av, "--not");
> +			for (i = 0; i < deepen_not->nr; i++) {
> +				struct string_list_item *s = deepen_not->items + i;
> +				argv_array_push(&av, s->string);

This accepts arbitrary rev-list arguments, which feels dangerous
(could end up doing an expensive operation or reading arbitrary files
or finding a way to execute arbitrary code).

[...]
> -		if (deepen_not.nr) {
> -			argv_array_push(&av, "--not");
> -			for (i = 0; i < deepen_not.nr; i++) {
> -				struct string_list_item *s = deepen_not.items + i;
> -				argv_array_push(&av, s->string);

Huh.  Looks like some of the above comments are better addressed to an
earlier patch.

[...]
> @@ -1071,6 +1085,13 @@ struct upload_pack_data {
>  	struct object_array wants;
>  	struct oid_array haves;
>  
> +	struct object_array shallows;
> +	struct string_list deepen_not;
> +	int depth;
> +	timestamp_t deepen_since;
> +	int deepen_rev_list;
> +	int deepen_relative;

Nice.

Comments describing deepen_rev_list and deepen_relative would be nice.

Are those boolean?  Can they be unsigned:1 to make that
self-explanatory?
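
Something along these lines, shown as a stand-alone struct so it
compiles on its own.  The struct name, the helper, and the comments
are my guesses from reading the patch (in particular, that
deepen_rev_list is set when deepen-since or deepen-not was
requested), so please correct them where they are wrong:

```c
/* Hypothetical subset of the deepen-related request state. */
struct deepen_request {
	int depth;			/* "deepen <depth>", 0 if absent */
	unsigned deepen_rev_list : 1;	/* deepen-since or deepen-not seen */
	unsigned deepen_relative : 1;	/* "deepen-relative" seen */
};

/*
 * Mirrors the check in send_shallow_list(): a depth cannot be
 * combined with the rev-list based deepen options.
 * Returns 1 if the combination is allowed, 0 otherwise.
 */
static int deepen_request_is_valid(const struct deepen_request *req)
{
	return !(req->depth > 0 && req->deepen_rev_list);
}
```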

[...]
> @@ -1080,12 +1101,14 @@ struct upload_pack_data {
>  	unsigned done : 1;
>  };
>  
> -#define UPLOAD_PACK_DATA_INIT { OBJECT_ARRAY_INIT, OID_ARRAY_INIT, 0, 0, 0, 0, 0, 0 }
> +#define UPLOAD_PACK_DATA_INIT { OBJECT_ARRAY_INIT, OID_ARRAY_INIT, OBJECT_ARRAY_INIT, STRING_LIST_INIT_DUP, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }

Long line, "make style" should be able to fix it.

[...]
>  
>  static void upload_pack_data_clear(struct upload_pack_data *data)
>  {
>  	object_array_clear(&data->wants);
>  	oid_array_clear(&data->haves);
> +	object_array_clear(&data->shallows);
> +	string_list_clear(&data->deepen_not, 0);

Thanks for remembering to clean up.

[...]
> @@ -1284,6 +1323,23 @@ static int process_haves_and_send_acks(struct upload_pack_data *data)
>  	return ret;
>  }
>  
> +static void send_shallow_info(struct upload_pack_data *data)
> +{
> +	/* No shallow info needs to be sent */
> +	if (!data->depth && !data->deepen_rev_list && !data->shallows.nr &&
> +	    !is_repository_shallow())
> +		return;
> +
> +	packet_write_fmt(1, "shallow-info\n");
> +
> +	if (!send_shallow_list(data->depth, data->deepen_rev_list,
> +			       data->deepen_since, &data->deepen_not,
> +			       &data->shallows) && is_repository_shallow())
> +		deepen(INFINITE_DEPTH, data->deepen_relative, &data->shallows);
> +
> +	packet_delim(1);
> +}

Nice.

[...]
> @@ -1346,3 +1404,11 @@ int upload_pack_v2(struct repository *r, struct argv_array *keys,
>  	upload_pack_data_clear(&data);
>  	return 0;
>  }
> +
> +int upload_pack_advertise(struct repository *r,
> +			  struct strbuf *value)
> +{
> +	if (value)
> +		strbuf_addstr(value, "shallow");
> +	return 1;
> +}

This is about capabilities?

Maybe a doc comment in the header file would help.

Thanks,
Jonathan


* Re: [PATCH v3 12/35] serve: introduce git-serve
  2018-02-27 18:05           ` Jonathan Tan
@ 2018-02-27 18:34             ` Brandon Williams
  0 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-02-27 18:34 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, sbeller, peff, gitster, jrnieder, stolee, git, pclouds

On 02/27, Jonathan Tan wrote:
> On Fri, 23 Feb 2018 13:33:15 -0800
> Brandon Williams <bmwill@google.com> wrote:
> 
> > On 02/21, Jonathan Tan wrote:
> > > As someone who is implementing the server side of protocol V2 in JGit, I
> > > now have a bit more insight into this :-)
> > > 
> > > First of all, I used to not have a strong opinion on the existence of a
> > > new endpoint, but now I think that it's better to *not* have git-serve.
> > > As it is, as far as I can tell, upload-pack also needs to support (and
> > > does support, as of the end of this patch set) protocol v2 anyway, so it
> > > might be better to merely upgrade upload-pack.
> > 
> > Having it allows for easier testing and the easy ability to make it a
> > true endpoint when we want to.  As of right now, git-serve isn't an
> > endpoint as you can't issue requests there via http-backend or
> > git-daemon.
> 
> Is git-serve planned to be a new endpoint?
> 
> If yes, I now don't think it's a good idea - it's an extra burden to
> reimplementors without much benefit (to have a new endpoint that does
> the same things as upload-pack).

I'm still going to include it, with the potential for it to become an
endpoint if we so choose (it isn't now), because when we start to
introduce more things to v2 (push or other commands we haven't dreamed
up yet) it just makes more sense to contact an endpoint that doesn't
explicitly say what it does.

> 
> If not, I don't think that easier testing makes it worth having an extra
> binary. Couldn't the same tests be done by running upload-pack directly?

It's a builtin and not a new binary, and yes, it makes testing much
easier because it assumes v2 from the start instead of v0.

-- 
Brandon Williams


* Re: [PATCH v3 10/35] protocol: introduce enum protocol_version value protocol_v2
  2018-02-27  6:18       ` Jonathan Nieder
@ 2018-02-27 18:41         ` Brandon Williams
  0 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-02-27 18:41 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: git, sbeller, peff, gitster, stolee, git, pclouds

On 02/26, Jonathan Nieder wrote:
> Hi,
> 
> Brandon Williams wrote:
> 
> > Introduce protocol_v2, a new value for 'enum protocol_version'.
> > Subsequent patches will fill in the implementation of protocol_v2.
> >
> > Signed-off-by: Brandon Williams <bmwill@google.com>
> > ---
> 
> Yay!
> 
> [...]
> > +++ b/builtin/fetch-pack.c
> > @@ -201,6 +201,9 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix)
> >  			   PACKET_READ_GENTLE_ON_EOF);
> >  
> >  	switch (discover_version(&reader)) {
> > +	case protocol_v2:
> > +		die("support for protocol v2 not implemented yet");
> > +		break;
> 
> This code goes away in a later patch, so no need to do anything about
> this, but the 'break' is redundant after the 'die'.

I'll fix that.
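
A minimal sketch of why the 'break' is redundant, assuming die() is
declared as not returning.  The names die_sketch() and check_version()
are hypothetical stand-ins for illustration, not code from the series:

```c
#include <stdio.h>
#include <stdlib.h>

enum protocol_version { protocol_v0, protocol_v1, protocol_v2 };

/* Hypothetical stand-in for die(): prints a message and never returns. */
static _Noreturn void die_sketch(const char *msg)
{
	fprintf(stderr, "fatal: %s\n", msg);
	exit(128);
}

/* Returns 1 for a supported version; never returns for protocol_v2. */
static int check_version(enum protocol_version v)
{
	switch (v) {
	case protocol_v2:
		die_sketch("support for protocol v2 not implemented yet");
		/* no 'break' needed: die_sketch() does not return */
	case protocol_v1:
	case protocol_v0:
		return 1;
	}
	return 0;
}
```

Because the compiler knows the call cannot return, control never falls
through into the next case label.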

> 
> [...]
> > --- a/builtin/receive-pack.c
> > +++ b/builtin/receive-pack.c
> > @@ -1963,6 +1963,12 @@ int cmd_receive_pack(int argc, const char **argv, const char *prefix)
> >  		unpack_limit = receive_unpack_limit;
> >  
> >  	switch (determine_protocol_version_server()) {
> > +	case protocol_v2:
> > +		/*
> > +		 * push support for protocol v2 has not been implemented yet,
> > +		 * so ignore the request to use v2 and fallback to using v0.
> > +		 */
> > +		break;
> 
> As you mentioned in the cover letter, it's probably worth doing the
> same fallback on the client side (send-pack), too.
> 
> Otherwise when this client talks to a new-enough server, it would
> request protocol v2 and then get confused when the server responds
> with the protocol v2 it requested.

Some patches later on ensure this.
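
As a rough illustration of the fallback being discussed (a sketch only;
effective_version() and its service_supports_v2 parameter are
hypothetical and do not appear in the series), both sides could apply
the same rule so that they agree on what is spoken for push:

```c
enum protocol_version { protocol_v0 = 0, protocol_v1 = 1, protocol_v2 = 2 };

/*
 * Hypothetical helper: pick the version a service will actually speak.
 * A service without v2 support ignores a v2 request and uses v0.
 */
static enum protocol_version
effective_version(enum protocol_version requested, int service_supports_v2)
{
	if (requested == protocol_v2 && !service_supports_v2)
		return protocol_v0;
	return requested;
}
```

If only the server applied this rule, a client that asked for v2 would
still expect a v2 response, which is the mismatch described above.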

> 
> Thanks,
> Jonathan

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v3 22/35] upload-pack: support shallow requests
  2018-02-27 18:29       ` Jonathan Nieder
@ 2018-02-27 18:57         ` Brandon Williams
  0 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-02-27 18:57 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: git, sbeller, peff, gitster, stolee, git, pclouds

On 02/27, Jonathan Nieder wrote:

I'll make the documentation changes you suggested.

> > +    deepen <depth>
> > +	Request that the fetch/clone should be shallow having a commit depth of
> 
> nit: s/Request/Requests/, for consistency with the others?
> 
> > +	<depth> relative to the remote side.
> 
> What does the value of <depth> mean? E.g. does a depth of 1 mean to
> fetch only the commits named in "have", 2 to fetch those commits plus
> their parents, etc, or am I off by one?

Honestly I have no clue.  What does the current protocol do?  There isn't
any documentation about it and this just reuses the logic from that.

> 
> Is <depth> always a positive number?
> 
> What happens if <depth> starts with a 0?  Is that a client error?
> 

> >      output = *section
> > -    section = (acknowledgments | packfile)
> > +    section = (acknowledgments | shallow-info | packfile)
> >  	      (flush-pkt | delim-pkt)
> 
> It looks like sections can go in an arbitrary order.  Are there
> tests to make sure the server can cope with reordering?  (I ask
> not because I mistrust the server but because I have some vague
> hope that other server implementations might be inspired by our
> tests.)

I'll fix this so that they don't come in arbitrary order.

> 
> [...]
> > @@ -215,6 +245,11 @@ header.
> >      nak = PKT-LINE("NAK" LF)
> >      ack = PKT-LINE("ACK" SP obj-id LF)
> >  
> > +    shallow-info = PKT-LINE("shallow-info" LF)
> > +		   *PKT-LINE((shallow | unshallow) LF)
> > +    shallow = "shallow" SP obj-id
> > +    unshallow = "unshallow" SP obj-id
> 
> Likewise: it looks like shallows and unshallows can be intermixed; can
> this be either (a) tightened or (b) covered by tests to make sure a
> later refactoring doesn't accidentally tighten it?

This reuses the existing logic from v0, so it's due to that spec.
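
To make the grammar above concrete, here is a small sketch of a reader
that accepts "shallow" and "unshallow" lines in any order.  It assumes
the trailing LF has already been chomped; parse_shallow_line() and the
enum are hypothetical names, not part of upload-pack or fetch-pack:

```c
#include <string.h>

enum shallow_kind { SHALLOW_LINE, UNSHALLOW_LINE, SHALLOW_INVALID };

/*
 * Classify one chomped line from the shallow-info section and point
 * *oid_hex at the object id text (NULL if the line is not recognized).
 */
static enum shallow_kind
parse_shallow_line(const char *line, const char **oid_hex)
{
	static const char shallow[] = "shallow ";
	static const char unshallow[] = "unshallow ";

	if (!strncmp(line, shallow, sizeof(shallow) - 1)) {
		*oid_hex = line + sizeof(shallow) - 1;
		return SHALLOW_LINE;
	}
	if (!strncmp(line, unshallow, sizeof(unshallow) - 1)) {
		*oid_hex = line + sizeof(unshallow) - 1;
		return UNSHALLOW_LINE;
	}
	*oid_hex = NULL;
	return SHALLOW_INVALID;
}
```

Since each line is classified on its own, intermixed lines need no
special handling in a reader written this way.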

> > --- a/upload-pack.c
> > +++ b/upload-pack.c
> > @@ -710,7 +710,6 @@ static void deepen(int depth, int deepen_relative,
> >  	}
> >  
> >  	send_unshallow(shallows);
> > -	packet_flush(1);
> 
> What does this part do?
> >  }
> >  
> >  static void deepen_by_rev_list(int ac, const char **av,
> > @@ -722,7 +721,52 @@ static void deepen_by_rev_list(int ac, const char **av,
> >  	send_shallow(result);
> >  	free_commit_list(result);
> >  	send_unshallow(shallows);
> > -	packet_flush(1);
> 
> Same question.

This pulls out the flush packet so that the logic can be reused for v2; the
flush is added back in for the v0 case but not for the v2 case.

> 
> > +}
> > +
> > +static int send_shallow_list(int depth, int deepen_rev_list,
> > +			     timestamp_t deepen_since,
> > +			     struct string_list *deepen_not,
> > +			     struct object_array *shallows)
> 
> What does the return value from this function represent?  It doesn't
> appear to be the usual "0 means success, -1 means failure" so a
> comment would help.

I'll add a comment.

> 
> > +{
> > +	int ret = 0;
> > +
> > +	if (depth > 0 && deepen_rev_list)
> > +		die("git upload-pack: deepen and deepen-since (or deepen-not) cannot be used together");
> 
> nit: long line (can/should "make style" find these?)
> 
> The error message is pretty long, longer than a typical 80-column
> terminal, so probably best to find a way to make the message shorter.
> E.g.
> 
> 		die("upload-pack: deepen cannot be combined with other deepen-* options");
> 
> That still would be >80 columns with the indent, so the usual style
> would be to break it into multiple strings and use C preprocessor
> concatenation (yuck):
> 
> 		die("upload-pack: "
> 		    "deepen cannot be combined with other deepen-* options");
> 

> [...]
> > +	if (depth > 0) {
> > +		deepen(depth, deepen_relative, shallows);
> > +		ret = 1;
> > +	} else if (deepen_rev_list) {
> > +		struct argv_array av = ARGV_ARRAY_INIT;
> > +		int i;
> > +
> > +		argv_array_push(&av, "rev-list");
> > +		if (deepen_since)
> > +			argv_array_pushf(&av, "--max-age=%"PRItime, deepen_since);
> > +		if (deepen_not->nr) {
> > +			argv_array_push(&av, "--not");
> > +			for (i = 0; i < deepen_not->nr; i++) {
> > +				struct string_list_item *s = deepen_not->items + i;
> > +				argv_array_push(&av, s->string);
> 
> This accepts arbitrary rev-list arguments, which feels dangerous
> (could end up doing an expensive operation or reading arbitrary files
> or finding a way to execute arbitrary code).
> 
> [...]
> > -		if (deepen_not.nr) {
> > -			argv_array_push(&av, "--not");
> > -			for (i = 0; i < deepen_not.nr; i++) {
> > -				struct string_list_item *s = deepen_not.items + i;
> > -				argv_array_push(&av, s->string);
> 
> Huh.  Looks like some of the above comments are better addressed to an
> earlier patch.

If someone wants to fix this after the fact they can, I just moved this
logic, I didn't add it.

> 
> [...]
> > @@ -1071,6 +1085,13 @@ struct upload_pack_data {
> >  	struct object_array wants;
> >  	struct oid_array haves;
> >  
> > +	struct object_array shallows;
> > +	struct string_list deepen_not;
> > +	int depth;
> > +	timestamp_t deepen_since;
> > +	int deepen_rev_list;
> > +	int deepen_relative;
> 
> Nice.
> 
> Comments describing deepen_rev_list and deepen_relative would be nice.
> 
> Are those boolean?  Can they be unsigned:1 to make that
> self-explanatory?

They are boolean but are passed via reference at some points so they
can't be bit flags.
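
For readers unfamiliar with the restriction: C does not allow taking
the address of a bit-field, so a flag that is filled in through a
pointer has to be a full object.  A minimal sketch (flags_sketch,
set_flag() and demo() are made-up names for illustration):

```c
struct flags_sketch {
	unsigned done : 1;	/* bit-field: &f.done would not compile */
	int deepen_relative;	/* plain int: can be passed by pointer */
};

/* Sets the flag through a pointer, as an option parser might. */
static void set_flag(int *flag)
{
	*flag = 1;
}

static int demo(void)
{
	struct flags_sketch f = { 0, 0 };

	set_flag(&f.deepen_relative);	/* fine: addressable member */
	f.done = 1;			/* bit-fields need direct assignment */
	return f.deepen_relative && f.done;
}
```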

> 
> [...]
> > @@ -1080,12 +1101,14 @@ struct upload_pack_data {
> >  	unsigned done : 1;
> >  };
> >  
> > -#define UPLOAD_PACK_DATA_INIT { OBJECT_ARRAY_INIT, OID_ARRAY_INIT, 0, 0, 0, 0, 0, 0 }
> > +#define UPLOAD_PACK_DATA_INIT { OBJECT_ARRAY_INIT, OID_ARRAY_INIT, OBJECT_ARRAY_INIT, STRING_LIST_INIT_DUP, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }
> 
> Long line, "make style" should be able to fix it.
> 

I'll fix this.

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v3 02/35] pkt-line: introduce struct packet_reader
  2018-02-27 18:14         ` Brandon Williams
@ 2018-02-27 19:20           ` Jonathan Nieder
  0 siblings, 0 replies; 362+ messages in thread
From: Jonathan Nieder @ 2018-02-27 19:20 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, sbeller, peff, gitster, stolee, git, pclouds

Brandon Williams wrote:
> On 02/12, Jonathan Nieder wrote:

>>> --- a/pkt-line.h
>>> +++ b/pkt-line.h
>>> @@ -111,6 +111,64 @@ char *packet_read_line_buf(char **src_buf, size_t *src_len, int *size);
>>>   */
>>>  ssize_t read_packetized_to_strbuf(int fd_in, struct strbuf *sb_out);
>>>  
>>> +struct packet_reader {
>>> +	/* source file descriptor */
>>> +	int fd;
>>> +
>>> +	/* source buffer and its size */
>>> +	char *src_buffer;
>>> +	size_t src_len;
>>
>> Can or should this be a strbuf?
>>
>>> +
>>> +	/* buffer that pkt-lines are read into and its size */
>>> +	char *buffer;
>>> +	unsigned buffer_size;
>>
>> Likewise.
>
> This struct is set up to be a drop-in replacement for the existing
> read_packet() family of functions.  Because of this I tried to make the
> interface as similar as possible to make it easy to convert to using it
> as well as having no need to clean anything up (because the struct is
> really just a wrapper and doesn't own anything).

Sorry, I don't completely follow.  Are you saying some callers play
with the buffer, or are you saying you haven't checked?  (If the
latter, that's perfectly fine; I'm just trying to understand the API.)

Either way, can you add some comments about ownership / who is allowed
to write to it / etc to make it easier to clean up later?

Thanks,
Jonathan

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v3 06/35] transport: use get_refs_via_connect to get refs
  2018-02-27 18:17         ` Brandon Williams
@ 2018-02-27 19:25           ` Jonathan Nieder
  2018-02-27 19:46             ` Brandon Williams
  0 siblings, 1 reply; 362+ messages in thread
From: Jonathan Nieder @ 2018-02-27 19:25 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, sbeller, peff, gitster, stolee, git, pclouds

Brandon Williams wrote:
> On 02/26, Jonathan Nieder wrote:
>> Brandon Williams wrote:

>>> +++ b/transport.c
>>> @@ -230,12 +230,8 @@ static int fetch_refs_via_pack(struct transport *transport,
>>>  	args.cloning = transport->cloning;
>>>  	args.update_shallow = data->options.update_shallow;
>>>  
>>> -	if (!data->got_remote_heads) {
>>> -		connect_setup(transport, 0);
>>> -		get_remote_heads(data->fd[0], NULL, 0, &refs_tmp, 0,
>>> -				 NULL, &data->shallow);
>>> -		data->got_remote_heads = 1;
>>> -	}
>>> +	if (!data->got_remote_heads)
>>> +		refs_tmp = get_refs_via_connect(transport, 0);
>>
>> The only difference between the old and new code is that the old code
>> passes NULL as 'extra_have' and the new code passes &data->extra_have.
>>
>> That means this populates the data->extra_have oid_array.  Does it
>> matter?
[...]
> I don't think it's a problem to have extra_have populated; at least I
> haven't seen anything to lead me to believe it would be a problem.

Assuming it gets properly freed later, the only effect I can imagine
is some increased memory usage.

I'm inclined to agree with you that the simplicity is worth it.  It
seems worth mentioning in the commit message, though.

[...]
>>> @@ -541,14 +537,8 @@ static int git_transport_push(struct transport *transport, struct ref *remote_re
>>>  	struct send_pack_args args;
>>>  	int ret;
>>>  
>>> -	if (!data->got_remote_heads) {
>>> -		struct ref *tmp_refs;
>>> -		connect_setup(transport, 1);
>>> -
>>> -		get_remote_heads(data->fd[0], NULL, 0, &tmp_refs, REF_NORMAL,
>>> -				 NULL, &data->shallow);
>>> -		data->got_remote_heads = 1;
>>> -	}
>>> +	if (!data->got_remote_heads)
>>> +		get_refs_via_connect(transport, 1);
>>
>> not a new problem, just curious: Does this leak tmp_refs?
>
> Maybe, though it's removed by this patch.

Sorry for the lack of clarity.  If it was leaked before, then it is
still leaked now, via the discarded return value from
get_refs_via_connect.

Any idea how we can track that down?  E.g. are there ways to tell leak
checkers "just tell me about this particular allocation"?

Thanks,
Jonathan

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v3 21/35] fetch-pack: perform a fetch using v2
  2018-02-07  1:12     ` [PATCH v3 21/35] fetch-pack: perform a fetch using v2 Brandon Williams
  2018-02-24  0:54       ` Jonathan Tan
@ 2018-02-27 19:27       ` Stefan Beller
  2018-02-27 19:40         ` Brandon Williams
  1 sibling, 1 reply; 362+ messages in thread
From: Stefan Beller @ 2018-02-27 19:27 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, Jeff King, Junio C Hamano, Jonathan Nieder, Derrick Stolee,
	Jeff Hostetler, Duy Nguyen

On Tue, Feb 6, 2018 at 5:12 PM, Brandon Williams <bmwill@google.com> wrote:
> When communicating with a v2 server, perform a fetch by requesting the
> 'fetch' command.
>
> Signed-off-by: Brandon Williams <bmwill@google.com>
> ---
>  builtin/fetch-pack.c   |   2 +-
>  fetch-pack.c           | 252 ++++++++++++++++++++++++++++++++++++++++++++++++-
>  fetch-pack.h           |   4 +-
>  t/t5702-protocol-v2.sh |  84 +++++++++++++++++
>  transport.c            |   7 +-
>  5 files changed, 342 insertions(+), 7 deletions(-)
>
> diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
> index f492e8abd..867dd3cc7 100644
> --- a/builtin/fetch-pack.c
> +++ b/builtin/fetch-pack.c
> @@ -213,7 +213,7 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix)
>         }
>
>         ref = fetch_pack(&args, fd, conn, ref, dest, sought, nr_sought,
> -                        &shallow, pack_lockfile_ptr);
> +                        &shallow, pack_lockfile_ptr, protocol_v0);
>         if (pack_lockfile) {
>                 printf("lock %s\n", pack_lockfile);
>                 fflush(stdout);
> diff --git a/fetch-pack.c b/fetch-pack.c
> index 9f6b07ad9..4fb5805dd 100644
> --- a/fetch-pack.c
> +++ b/fetch-pack.c
> @@ -1008,6 +1008,247 @@ static struct ref *do_fetch_pack(struct fetch_pack_args *args,
>         return ref;
>  }
>
> +static void add_wants(const struct ref *wants, struct strbuf *req_buf)
> +{
> +       for ( ; wants ; wants = wants->next) {
> +               const struct object_id *remote = &wants->old_oid;
> +               const char *remote_hex;
> +               struct object *o;
> +
> +               /*
> +                * If that object is complete (i.e. it is an ancestor of a
> +                * local ref), we tell them we have it but do not have to
> +                * tell them about its ancestors, which they already know
> +                * about.
> +                *
> +                * We use lookup_object here because we are only
> +                * interested in the case we *know* the object is
> +                * reachable and we have already scanned it.
> +                */
> +               if (((o = lookup_object(remote->hash)) != NULL) &&
> +                   (o->flags & COMPLETE)) {
> +                       continue;
> +               }
> +
> +               remote_hex = oid_to_hex(remote);
> +               packet_buf_write(req_buf, "want %s\n", remote_hex);
> +       }
> +}
> +
> +static void add_common(struct strbuf *req_buf, struct oidset *common)
> +{
> +       struct oidset_iter iter;
> +       const struct object_id *oid;
> +       oidset_iter_init(common, &iter);
> +
> +       while ((oid = oidset_iter_next(&iter))) {
> +               packet_buf_write(req_buf, "have %s\n", oid_to_hex(oid));
> +       }
> +}
> +
> +static int add_haves(struct strbuf *req_buf, int *in_vain)
> +{
> +       int ret = 0;
> +       int haves_added = 0;
> +       const struct object_id *oid;
> +
> +       while ((oid = get_rev())) {
> +               packet_buf_write(req_buf, "have %s\n", oid_to_hex(oid));
> +               if (++haves_added >= INITIAL_FLUSH)
> +                       break;
> +       };
> +
> +       *in_vain += haves_added;
> +       if (!haves_added || *in_vain >= MAX_IN_VAIN) {
> +               /* Send Done */
> +               packet_buf_write(req_buf, "done\n");
> +               ret = 1;
> +       }
> +
> +       return ret;
> +}
> +
> +static int send_fetch_request(int fd_out, const struct fetch_pack_args *args,
> +                             const struct ref *wants, struct oidset *common,
> +                             int *in_vain)
> +{
> +       int ret = 0;
> +       struct strbuf req_buf = STRBUF_INIT;
> +
> +       if (server_supports_v2("fetch", 1))
> +               packet_buf_write(&req_buf, "command=fetch");
> +       if (server_supports_v2("agent", 0))
> +               packet_buf_write(&req_buf, "agent=%s", git_user_agent_sanitized());
> +
> +       packet_buf_delim(&req_buf);
> +       if (args->use_thin_pack)
> +               packet_buf_write(&req_buf, "thin-pack");
> +       if (args->no_progress)
> +               packet_buf_write(&req_buf, "no-progress");
> +       if (args->include_tag)
> +               packet_buf_write(&req_buf, "include-tag");
> +       if (prefer_ofs_delta)
> +               packet_buf_write(&req_buf, "ofs-delta");
> +
> +       /* add wants */
> +       add_wants(wants, &req_buf);

The comment might convey too much redundant information
instead of helping the reader gain more understanding. ;)

> +
> +       /* Add all of the common commits we've found in previous rounds */
> +       add_common(&req_buf, common);

nit:
Maybe s/add_common/add_common_haves/ or
add_previous_haves ?

> +
> +       /* Add initial haves */
> +       ret = add_haves(&req_buf, in_vain);

I like the shortness and conciseness of this send_fetch_request
function as it makes clear what is happening over the wire, however
I wonder if we can improve on that, still.

The functions 'add_common' and 'add_haves' seem like they do the
same (sending haves), except for different sets of oids.

So I would imagine that a structure like

  {
    struct set haves = compute_haves_from(in_vain, common, ...);
    struct set wants = compute_wants(&wants);

    request_capabilities(args);
    send_haves(&haves);
    send_wants(&wants);
    flush();
  }

That way we would have an even more concise way of writing
one request, and factoring out the business logic. (Coming up
with the "right" haves is a heuristic that we plan on changing in
the future, so we'd want to have that encapsulated into one function
that computes all the haves?)

> +
> +/*
> + * Processes a section header in a server's response and checks if it matches
> + * `section`.  If the value of `peek` is 1, the header line will be peeked (and
> + * not consumed); if 0, the line will be consumed and the function will die if
> + * the section header doesn't match what was expected.
> + */
> +static int process_section_header(struct packet_reader *reader,
> +                                 const char *section, int peek)
> +{
> +       int ret;
> +
> +       if (packet_reader_peek(reader) != PACKET_READ_NORMAL)
> +               die("error reading packet");
> +
> +       ret = !strcmp(reader->line, section);
> +
> +       if (!peek) {
> +               if (!ret)
> +                       die("expected '%s', received '%s'",
> +                           section, reader->line);
> +               packet_reader_read(reader);
> +       }
> +
> +       return ret;
> +}
> +
> +static int process_acks(struct packet_reader *reader, struct oidset *common)
> +{
> +       /* received */
> +       int received_ready = 0;
> +       int received_ack = 0;
> +
> +       process_section_header(reader, "acknowledgments", 0);
> +       while (packet_reader_read(reader) == PACKET_READ_NORMAL) {
> +               const char *arg;
> +
> +               if (!strcmp(reader->line, "NAK"))
> +                       continue;
> +
> +               if (skip_prefix(reader->line, "ACK ", &arg)) {
> +                       struct object_id oid;
> +                       if (!get_oid_hex(arg, &oid)) {
> +                               struct commit *commit;
> +                               oidset_insert(common, &oid);
> +                               commit = lookup_commit(&oid);
> +                               mark_common(commit, 0, 1);
> +                       }
> +                       continue;
> +               }
> +
> +               if (!strcmp(reader->line, "ready")) {
> +                       clear_prio_queue(&rev_list);
> +                       received_ready = 1;
> +                       continue;
> +               }
> +
> +               die(_("git fetch-pack: expected ACK/NAK, got '%s'"), reader->line);

This is slightly misleading, it could also expect "ready" ?


> +       }
> +
> +       if (reader->status != PACKET_READ_FLUSH &&
> +           reader->status != PACKET_READ_DELIM)
> +               die("Error during processing acks: %d", reader->status);

Why is this not translated unlike the one 5 lines prior to this?
Do we expect these conditions to come up due to different
root causes?

> +static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
> +                                   int fd[2],
> +                                   const struct ref *orig_ref,
> +                                   struct ref **sought, int nr_sought,
> +                                   char **pack_lockfile)
> +{
> +       struct ref *ref = copy_ref_list(orig_ref);
> +       enum fetch_state state = FETCH_CHECK_LOCAL;
> +       struct oidset common = OIDSET_INIT;
> +       struct packet_reader reader;
> +       int in_vain = 0;
> +       packet_reader_init(&reader, fd[0], NULL, 0,
> +                          PACKET_READ_CHOMP_NEWLINE);
> +
> +       while (state != FETCH_DONE) {
> +               switch (state) {
> +               case FETCH_CHECK_LOCAL:
> +                       sort_ref_list(&ref, ref_compare_name);
> +                       QSORT(sought, nr_sought, cmp_ref_by_name);
> +
> +                       /* v2 supports these by default */

Is there a doc that says what is all on by default?

Thanks,
Stefan

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v3 14/35] connect: request remote refs using v2
  2018-02-27  6:51       ` Jonathan Nieder
@ 2018-02-27 19:30         ` Brandon Williams
  0 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-02-27 19:30 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: git, sbeller, peff, gitster, stolee, git, pclouds

On 02/26, Jonathan Nieder wrote:
> Brandon Williams wrote:
> >  static char *server_capabilities;
> > +static struct argv_array server_capabilities_v2 = ARGV_ARRAY_INIT;
> 
> Can a quick doc comment describe these and how they relate?
> 
> Is only one of them set, based on which protocol version is in use?
> Should server_capabilities be renamed to server_capabilities_v1?

Yes, that's correct.  I can rename it.

> > +{
> > +	int ret = 1;
> > +	int i = 0;
> > +	struct object_id old_oid;
> > +	struct ref *ref;
> > +	struct string_list line_sections = STRING_LIST_INIT_DUP;
> > +
> > +	if (string_list_split(&line_sections, line, ' ', -1) < 2) {
> 
> Can there be a comment describing the expected format?

Yep I'll write a comment up.

> > +
> > +	oidcpy(&ref->old_oid, &old_oid);
> > +	**list = ref;
> > +	*list = &ref->next;
> > +
> > +	for (; i < line_sections.nr; i++) {
> > +		const char *arg = line_sections.items[i].string;
> > +		if (skip_prefix(arg, "symref-target:", &arg))
> > +			ref->symref = xstrdup(arg);
> 
> Using space-delimited fields in a single pkt-line means that
> - values cannot contain a space
> - total length is limited by the size of a pkt-line
> 
> Given the context, I think that's fine.  More generally it is tempting
> to use a pkt-line per field to avoid the trouble v1 had with
> capability lists crammed into a pkt-line, but I see why you used a
> pkt-line per ref to avoid having to have sections-within-a-section.
> 
> My only potential worry is the length part: do we have an explicit
> limit on the length of a ref name?  git-check-ref-format(1) doesn't
> mention one.  A 32k ref name would be a bit ridiculous, though.

Yeah I think we're fine for now, mostly because we're out of luck with
the current protocol as it is.

> 
> > +
> > +		if (skip_prefix(arg, "peeled:", &arg)) {
> > +			struct object_id peeled_oid;
> > +			char *peeled_name;
> > +			struct ref *peeled;
> > +			if (get_oid_hex(arg, &peeled_oid)) {
> > +				ret = 0;
> > +				goto out;
> > +			}
> 
> Can this also check that there's no trailing garbage after the oid?

Yeah I do that.
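
For illustration, a strict check for a "peeled:<oid>" field could look
like the sketch below.  It assumes 40-hex-digit SHA-1 object ids;
HEX_OID_LEN, is_exact_hex_oid() and is_valid_peeled_field() are
hypothetical names and not the helpers used in connect.c:

```c
#include <ctype.h>
#include <string.h>

#define HEX_OID_LEN 40 /* assumption: SHA-1 object ids, 40 hex digits */

/* Return 1 if `arg` is exactly HEX_OID_LEN hex digits and nothing else. */
static int is_exact_hex_oid(const char *arg)
{
	size_t i;

	if (strlen(arg) != HEX_OID_LEN)
		return 0;	/* too short, or trailing garbage */
	for (i = 0; i < HEX_OID_LEN; i++)
		if (!isxdigit((unsigned char)arg[i]))
			return 0;
	return 1;
}

/* Return 1 if `field` is "peeled:<oid>" with no trailing garbage. */
static int is_valid_peeled_field(const char *field)
{
	static const char prefix[] = "peeled:";

	if (strncmp(field, prefix, sizeof(prefix) - 1))
		return 0;
	return is_exact_hex_oid(field + sizeof(prefix) - 1);
}
```

Checking the total length first is what rejects input such as a valid
oid followed by extra characters.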

> 
> > +
> > +	packet_delim(fd_out);
> > +	/* When pushing we don't want to request the peeled tags */
> 
> Can you say more about this?  In theory it would be nice to have the
> peeled tags since they name commits whose history can be excluded from
> the pack.

I don't believe peeled refs are sent now in v0 for push.

> 
> > +	if (!for_push)
> > +		packet_write_fmt(fd_out, "peel\n");
> > +	packet_write_fmt(fd_out, "symrefs\n");
> 
> Are symrefs useful during push?

They may be at a later point in time when you want to update a symref :)

> 
> > +	for (i = 0; ref_patterns && i < ref_patterns->argc; i++) {
> > +		packet_write_fmt(fd_out, "ref-pattern %s\n",
> > +				 ref_patterns->argv[i]);
> > +	}
> 
> The exciting part.
> 
> Why do these pkts end with \n?  I would have expected the pkt-line
> framing to work to delimit them.

All pkts end with \n; that's just how it's been since v0.  Though the
server isn't supposed to complain if they don't contain newlines.
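
To show how the framing and the trailing newline relate, here is a
sketch of pkt-line encoding: four hex digits giving the total length
(header included), followed by the payload.  pkt_line_format() is a
hypothetical helper written for this note, not the pkt-line.c API:

```c
#include <stdio.h>
#include <string.h>

/*
 * Format `payload` as one pkt-line into `out`: four hex digits giving
 * the total length (payload plus the 4-byte header), then the payload.
 * Returns the number of bytes written (excluding the terminating NUL),
 * or -1 if the result does not fit.
 */
static int pkt_line_format(char *out, size_t out_size, const char *payload)
{
	size_t len = strlen(payload) + 4;

	/* 65520 is the largest total pkt-line length */
	if (len > 65520 || len + 1 > out_size)
		return -1;
	snprintf(out, out_size, "%04x%s", (unsigned int)len, payload);
	return (int)len;
}
```

With this framing "symrefs\n" is sent as "000csymrefs\n"; the length
prefix already delimits the packet, so the newline is a convention
rather than a delimiter.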

> 
> > +	packet_flush(fd_out);
> > +
> > +	/* Process response from server */
> > +	while (packet_reader_read(reader) == PACKET_READ_NORMAL) {
> > +		if (!process_ref_v2(reader->line, &list))
> > +			die("invalid ls-refs response: %s", reader->line);
> > +	}
> > +
> > +	if (reader->status != PACKET_READ_FLUSH)
> > +		die("protocol error");
> 
> Can this protocol error give more detail?  When diagnosing an error in
> servers, proxies, or lower-level networking issues, informative protocol
> errors can be very helpful (similar to syntax errors from a compiler).

I'll update the error msg.

> [...]
> > --- a/remote.h
> > +++ b/remote.h
> > @@ -151,10 +151,14 @@ void free_refs(struct ref *ref);
> >  
> >  struct oid_array;
> >  struct packet_reader;
> > +struct argv_array;
> >  extern struct ref **get_remote_heads(struct packet_reader *reader,
> >  				     struct ref **list, unsigned int flags,
> >  				     struct oid_array *extra_have,
> >  				     struct oid_array *shallow_points);
> > +extern struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
> > +				    struct ref **list, int for_push,
> > +				    const struct argv_array *ref_patterns);
> 
> What is the difference between get_remote_heads and get_remote_refs?
> A comment might help.  (BTW, thanks for making the new saner name to
> replace get_remote_heads!)

I'll add a comment saying it's used in v2 to retrieve a list of refs from
the remote.

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v3 21/35] fetch-pack: perform a fetch using v2
  2018-02-27 19:27       ` Stefan Beller
@ 2018-02-27 19:40         ` Brandon Williams
  0 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-02-27 19:40 UTC (permalink / raw)
  To: Stefan Beller
  Cc: git, Jeff King, Junio C Hamano, Jonathan Nieder, Derrick Stolee,
	Jeff Hostetler, Duy Nguyen

On 02/27, Stefan Beller wrote:
> > +
> > +       /* Add initial haves */
> > +       ret = add_haves(&req_buf, in_vain);
> 
> I like the shortness and conciseness of this send_fetch_request
> function as it makes clear what is happening over the wire, however
> I wonder if we can improve on that, still.

I'm sure there is more that can be micro optimized but I don't really
want to get distracted by redesigning this logic another time right now.

> 
> The functions 'add_common' and 'add_haves' seem like they do the
> same (sending haves), except for different sets of oids.
> 
> So I would imagine that a structure like
> 
>   {
>     struct set haves = compute_haves_from(in_vain, common, ...);
>     struct set wants = compute_wants(&wants);
> 
>     request_capabilities(args);
>     send_haves(&haves);
>     send_wants(&wants);
>     flush();
>   }
> 
> That way we would have an even more concise way of writing
> one request, and factoring out the business logic. (Coming up
> with the "right" haves is a heuristic that we plan on changing in
> the future, so we'd want to have that encapsulated into one function
> that computes all the haves?)
> 
> > +
> > +/*
> > + * Processes a section header in a server's response and checks if it matches
> > + * `section`.  If the value of `peek` is 1, the header line will be peeked (and
> > + * not consumed); if 0, the line will be consumed and the function will die if
> > + * the section header doesn't match what was expected.
> > + */
> > +static int process_section_header(struct packet_reader *reader,
> > +                                 const char *section, int peek)
> > +{
> > +       int ret;
> > +
> > +       if (packet_reader_peek(reader) != PACKET_READ_NORMAL)
> > +               die("error reading packet");
> > +
> > +       ret = !strcmp(reader->line, section);
> > +
> > +       if (!peek) {
> > +               if (!ret)
> > +                       die("expected '%s', received '%s'",
> > +                           section, reader->line);
> > +               packet_reader_read(reader);
> > +       }
> > +
> > +       return ret;
> > +}
> > +
> > +static int process_acks(struct packet_reader *reader, struct oidset *common)
> > +{
> > +       /* received */
> > +       int received_ready = 0;
> > +       int received_ack = 0;
> > +
> > +       process_section_header(reader, "acknowledgments", 0);
> > +       while (packet_reader_read(reader) == PACKET_READ_NORMAL) {
> > +               const char *arg;
> > +
> > +               if (!strcmp(reader->line, "NAK"))
> > +                       continue;
> > +
> > +               if (skip_prefix(reader->line, "ACK ", &arg)) {
> > +                       struct object_id oid;
> > +                       if (!get_oid_hex(arg, &oid)) {
> > +                               struct commit *commit;
> > +                               oidset_insert(common, &oid);
> > +                               commit = lookup_commit(&oid);
> > +                               mark_common(commit, 0, 1);
> > +                       }
> > +                       continue;
> > +               }
> > +
> > +               if (!strcmp(reader->line, "ready")) {
> > +                       clear_prio_queue(&rev_list);
> > +                       received_ready = 1;
> > +                       continue;
> > +               }
> > +
> > +               die(_("git fetch-pack: expected ACK/NAK, got '%s'"), reader->line);
> 
> This is slightly misleading, it could also expect "ready" ?

I'll update this.
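
For reference, the three line types accepted in the acknowledgments
section can be summarized with a small classifier.  This is a sketch
only: classify_ack_line() and the enum are hypothetical, and lines are
assumed to have had their trailing newline chomped:

```c
#include <string.h>

enum ack_line {
	ACK_LINE_NAK,
	ACK_LINE_ACK,
	ACK_LINE_READY,
	ACK_LINE_UNKNOWN
};

/*
 * Classify one chomped line.  For "ACK <oid>", *oid_hex points at the
 * object id text; otherwise it is set to NULL.
 */
static enum ack_line
classify_ack_line(const char *line, const char **oid_hex)
{
	*oid_hex = NULL;
	if (!strcmp(line, "NAK"))
		return ACK_LINE_NAK;
	if (!strncmp(line, "ACK ", 4)) {
		*oid_hex = line + 4;
		return ACK_LINE_ACK;
	}
	if (!strcmp(line, "ready"))
		return ACK_LINE_READY;
	return ACK_LINE_UNKNOWN;	/* caller reports ACK/NAK/ready */
}
```

An error message built around the unknown case can then name all three
expected tokens.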

> 
> 
> > +       }
> > +
> > +       if (reader->status != PACKET_READ_FLUSH &&
> > +           reader->status != PACKET_READ_DELIM)
> > +               die("Error during processing acks: %d", reader->status);
> 
> Why is this not translated unlike the one 5 lines prior to this?
> Do we expect these conditions to come up due to different
> root causes?
> 
> > +static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
> > +                                   int fd[2],
> > +                                   const struct ref *orig_ref,
> > +                                   struct ref **sought, int nr_sought,
> > +                                   char **pack_lockfile)
> > +{
> > +       struct ref *ref = copy_ref_list(orig_ref);
> > +       enum fetch_state state = FETCH_CHECK_LOCAL;
> > +       struct oidset common = OIDSET_INIT;
> > +       struct packet_reader reader;
> > +       int in_vain = 0;
> > +       packet_reader_init(&reader, fd[0], NULL, 0,
> > +                          PACKET_READ_CHOMP_NEWLINE);
> > +
> > +       while (state != FETCH_DONE) {
> > +               switch (state) {
> > +               case FETCH_CHECK_LOCAL:
> > +                       sort_ref_list(&ref, ref_compare_name);
> > +                       QSORT(sought, nr_sought, cmp_ref_by_name);
> > +
> > +                       /* v2 supports these by default */
> 
> Is there a doc that says what is all on by default?

Yeah protocol-v2.txt should say all of that.

> 
> Thanks,
> Stefan

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v3 06/35] transport: use get_refs_via_connect to get refs
  2018-02-27 19:25           ` Jonathan Nieder
@ 2018-02-27 19:46             ` Brandon Williams
  0 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-02-27 19:46 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: git, sbeller, peff, gitster, stolee, git, pclouds

On 02/27, Jonathan Nieder wrote:
> Brandon Williams wrote:
> > On 02/26, Jonathan Nieder wrote:
> >> Brandon Williams wrote:
> 
> >>> +++ b/transport.c
> >>> @@ -230,12 +230,8 @@ static int fetch_refs_via_pack(struct transport *transport,
> >>>  	args.cloning = transport->cloning;
> >>>  	args.update_shallow = data->options.update_shallow;
> >>>  
> >>> -	if (!data->got_remote_heads) {
> >>> -		connect_setup(transport, 0);
> >>> -		get_remote_heads(data->fd[0], NULL, 0, &refs_tmp, 0,
> >>> -				 NULL, &data->shallow);
> >>> -		data->got_remote_heads = 1;
> >>> -	}
> >>> +	if (!data->got_remote_heads)
> >>> +		refs_tmp = get_refs_via_connect(transport, 0);
> >>
> >> The only difference between the old and new code is that the old code
> >> passes NULL as 'extra_have' and the new code passes &data->extra_have.
> >>
> >> That means this populates the data->extra_have oid_array.  Does it
> >> matter?
> [...]
> > I don't think it's a problem to have extra_have populated, at least I
> > haven't seen anything to lead me to believe it would be a problem.
> 
> Assuming it gets properly freed later, the only effect I can imagine
> is some increased memory usage.
> 
> I'm inclined to agree with you that the simplicity is worth it.  It
> seems worth mentioning in the commit message, though.
> 
> [...]
> >>> @@ -541,14 +537,8 @@ static int git_transport_push(struct transport *transport, struct ref *remote_re
> >>>  	struct send_pack_args args;
> >>>  	int ret;
> >>>  
> >>> -	if (!data->got_remote_heads) {
> >>> -		struct ref *tmp_refs;
> >>> -		connect_setup(transport, 1);
> >>> -
> >>> -		get_remote_heads(data->fd[0], NULL, 0, &tmp_refs, REF_NORMAL,
> >>> -				 NULL, &data->shallow);
> >>> -		data->got_remote_heads = 1;
> >>> -	}
> >>> +	if (!data->got_remote_heads)
> >>> +		get_refs_via_connect(transport, 1);
> >>
> >> not a new problem, just curious: Does this leak tmp_refs?
> >
> > Maybe, though it's removed by this patch.
> 
> Sorry for the lack of clarity.  If it was leaked before, then it is
> still leaked now, via the discarded return value from
> get_refs_via_connect.
> 
> Any idea how we can track that down?  E.g. are there ways to tell leak
> checkers "just tell me about this particular allocation"?

Hmm I wonder if that code path is even used, because it just throws away
the result.

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v3 14/35] connect: request remote refs using v2
  2018-02-27  6:21               ` Jonathan Nieder
@ 2018-02-27 21:58                 ` Junio C Hamano
  2018-02-27 22:04                   ` Jeff King
  0 siblings, 1 reply; 362+ messages in thread
From: Junio C Hamano @ 2018-02-27 21:58 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: Jonathan Tan, Jeff King, Brandon Williams, git, sbeller, stolee,
	git, pclouds

Jonathan Nieder <jrnieder@gmail.com> writes:

> Jonathan Tan wrote:
>> On Thu, 22 Feb 2018 13:26:58 -0500
>> Jeff King <peff@peff.net> wrote:
>
>>> I agree that it shouldn't matter much here. But if the name argv_array
>>> is standing in the way of using it, I think we should consider giving it
>>> a more general name. I picked that not to evoke "this must be arguments"
>>> but "this is terminated by a single NULL".
> [...]
>> This sounds reasonable - I withdraw my comment about using struct
>> string_list.
>
> Marking with #leftoverbits as a reminder to think about what such a
> more general name would be (or what kind of docs to put in
> argv-array.h) and make it so the next time I do a search for that
> keyword.

So are we looking for a natural name to call an array of strings?  I
personally do not mind argv_array too much, but perhaps we can call
it a string_array and then everybody will be happy?

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v3 14/35] connect: request remote refs using v2
  2018-02-27 21:58                 ` Junio C Hamano
@ 2018-02-27 22:04                   ` Jeff King
  2018-02-27 22:10                     ` Eric Sunshine
  0 siblings, 1 reply; 362+ messages in thread
From: Jeff King @ 2018-02-27 22:04 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Jonathan Nieder, Jonathan Tan, Brandon Williams, git, sbeller,
	stolee, git, pclouds

On Tue, Feb 27, 2018 at 01:58:00PM -0800, Junio C Hamano wrote:

> Jonathan Nieder <jrnieder@gmail.com> writes:
> 
> > Jonathan Tan wrote:
> >> On Thu, 22 Feb 2018 13:26:58 -0500
> >> Jeff King <peff@peff.net> wrote:
> >
> >>> I agree that it shouldn't matter much here. But if the name argv_array
> >>> is standing in the way of using it, I think we should consider giving it
> >>> a more general name. I picked that not to evoke "this must be arguments"
> >>> but "this is terminated by a single NULL".
> > [...]
> >> This sounds reasonable - I withdraw my comment about using struct
> >> string_list.
> >
> > Marking with #leftoverbits as a reminder to think about what such a
> > more general name would be (or what kind of docs to put in
> > argv-array.h) and make it so the next time I do a search for that
> > keyword.
> 
> So are we looking for a natural name to call an array of strings?  I
> personally do not mind argv_array too much, but perhaps we can call
> it a string_array and then everybody will be happy?

That would be fine with me. Though I would love it if we could find a
shorter name for the associated functions. For example,
argv_array_pushf() can make lines quite long, and something like
argv_pushf() is easier to read (in my opinion). And that might work
because "argv" is pretty unique by itself, but "string" is not.

Some one-word name like "strarray" might work, though I find that is not
quite catchy. I guess "strv" is short if you assume that people know the
"v" suffix means "vector".

It may not be worth worrying too much about, though. We already have
24-character monstrosities like string_list_append_nodup(). ;)

-Peff

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v3 14/35] connect: request remote refs using v2
  2018-02-27 22:04                   ` Jeff King
@ 2018-02-27 22:10                     ` Eric Sunshine
  2018-02-27 22:18                       ` Jeff King
  0 siblings, 1 reply; 362+ messages in thread
From: Eric Sunshine @ 2018-02-27 22:10 UTC (permalink / raw)
  To: Jeff King
  Cc: Junio C Hamano, Jonathan Nieder, Jonathan Tan, Brandon Williams,
	Git List, Stefan Beller, Derrick Stolee, Jeff Hostetler,
	Nguyễn Thái Ngọc Duy

On Tue, Feb 27, 2018 at 5:04 PM, Jeff King <peff@peff.net> wrote:
> On Tue, Feb 27, 2018 at 01:58:00PM -0800, Junio C Hamano wrote:
>> So are we looking for a natural name to call an array of strings?  I
>> personally do not mind argv_array too much, but perhaps we can call
>> it a string_array and then everybody will be happy?
>
> That would be fine with me. Though I would love it if we could find a
> shorter name for the associated functions. For example,
> argv_array_pushf() can make lines quite long, and something like
> argv_pushf() is easier to read (in my opinion). And that might work
> because "argv" is pretty unique by itself, but "string" is not.
>
> Some one-word name like "strarray" might work, though I find that is not
> quite catchy. I guess "strv" is short if you assume that people know the
> "v" suffix means "vector".

struct strs {...};

void strs_init(struct strs *);
void strs_push(struct strs *, const char *);
void strs_pushf(struct strs *, const char *fmt, ...);
void strs_pushl(struct strs *, ...);
void strs_pushv(struct strs *, const char **);
void strs_pop(struct strs *);
void strs_clear(struct strs *);
const char **strs_detach(struct strs *);

...is short, feels pretty natural, and doesn't require understanding
"v" for "vector".

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v3 14/35] connect: request remote refs using v2
  2018-02-27 22:10                     ` Eric Sunshine
@ 2018-02-27 22:18                       ` Jeff King
  2018-02-27 23:32                         ` Junio C Hamano
  0 siblings, 1 reply; 362+ messages in thread
From: Jeff King @ 2018-02-27 22:18 UTC (permalink / raw)
  To: Eric Sunshine
  Cc: Junio C Hamano, Jonathan Nieder, Jonathan Tan, Brandon Williams,
	Git List, Stefan Beller, Derrick Stolee, Jeff Hostetler,
	Nguyễn Thái Ngọc Duy

On Tue, Feb 27, 2018 at 05:10:09PM -0500, Eric Sunshine wrote:

> > That would be fine with me. Though I would love it if we could find a
> > shorter name for the associated functions. For example,
> > argv_array_pushf() can make lines quite long, and something like
> > argv_pushf() is easier to read (in my opinion). And that might work
> > because "argv" is pretty unique by itself, but "string" is not.
> >
> > Some one-word name like "strarray" might work, though I find that is not
> > quite catchy. I guess "strv" is short if you assume that people know the
> > "v" suffix means "vector".
> 
> struct strs {...};
> 
> void strs_init(struct strs *);
> void strs_push(struct strs *, const char *);
> void strs_pushf(struct strs *, const char *fmt, ...);
> void strs_pushl(struct strs *, ...);
> void strs_pushv(struct strs *, const char **);
> void strs_pop(struct strs *);
> void strs_clear(struct strs *);
> const char **strs_detach(struct strs *);
> 
> ...is short, feels pretty natural, and doesn't require understanding
> "v" for "vector".

Not bad. The "v" carries the information that it _is_ a NULL-terminated
vector and not some other list-like structure (and so is suitable for
feeding to execv, etc). But that may just be obvious from looking at its
uses and documentation.
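
For illustration only, here is a minimal sketch of how the proposed
strs_* names could behave while keeping the array NULL-terminated after
every operation.  It is not git's argv_array code: the struct layout,
strs_grow() and strs_push_nodup() are hypothetical, allocation failures
simply abort(), and strs_pushl()/strs_pushv() are left out.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical sketch of the proposed API; not git's argv_array code. */
struct strs {
	char **v;      /* NULL-terminated whenever it is non-NULL */
	size_t nr;     /* number of strings, excluding the NULL */
	size_t alloc;  /* number of allocated slots in v */
};

void strs_init(struct strs *s)
{
	s->v = NULL;
	s->nr = 0;
	s->alloc = 0;
}

/* Make room for one more string plus the terminating NULL. */
void strs_grow(struct strs *s)
{
	if (s->nr + 2 > s->alloc) {
		size_t alloc = s->alloc ? s->alloc * 2 : 8;
		char **v = realloc(s->v, alloc * sizeof(*v));
		if (!v)
			abort();
		s->v = v;
		s->alloc = alloc;
	}
}

/* Append a heap-allocated string and take ownership of it. */
void strs_push_nodup(struct strs *s, char *owned)
{
	strs_grow(s);
	assert(s->nr + 1 < s->alloc);
	s->v[s->nr++] = owned;
	s->v[s->nr] = NULL;
}

/* Append a copy of str. */
void strs_push(struct strs *s, const char *str)
{
	size_t len = strlen(str) + 1;
	char *copy = malloc(len);
	if (!copy)
		abort();
	memcpy(copy, str, len);
	strs_push_nodup(s, copy);
}

/* Append a printf-style formatted string. */
void strs_pushf(struct strs *s, const char *fmt, ...)
{
	va_list ap, ap2;
	int len;
	char *buf;

	va_start(ap, fmt);
	va_copy(ap2, ap);
	len = vsnprintf(NULL, 0, fmt, ap);
	va_end(ap);
	if (len < 0)
		abort();
	buf = malloc((size_t)len + 1);
	if (!buf)
		abort();
	vsnprintf(buf, (size_t)len + 1, fmt, ap2);
	va_end(ap2);
	strs_push_nodup(s, buf);
}

/* Drop the last string, if any. */
void strs_pop(struct strs *s)
{
	if (!s->nr)
		return;
	free(s->v[--s->nr]);
	s->v[s->nr] = NULL;
}

/* Free every string and the array itself. */
void strs_clear(struct strs *s)
{
	while (s->nr)
		strs_pop(s);
	free(s->v);
	strs_init(s);
}

/*
 * Hand the vector to the caller, who then owns it and its strings.
 * In this sketch an empty list detaches as NULL.
 */
char **strs_detach(struct strs *s)
{
	char **v = s->v;
	strs_init(s);
	return v;
}
```

Because s.v[s.nr] is always NULL, the array can be passed directly to
anything that expects an argv-style vector, which is the property the
"v" suffix was meant to advertise.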

-Peff

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v3 26/35] transport-helper: remove name parameter
  2018-02-07  1:13     ` [PATCH v3 26/35] transport-helper: remove name parameter Brandon Williams
@ 2018-02-27 23:03       ` Jonathan Nieder
  0 siblings, 0 replies; 362+ messages in thread
From: Jonathan Nieder @ 2018-02-27 23:03 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, sbeller, peff, gitster, stolee, git, pclouds

Brandon Williams wrote:

> Commit 266f1fdfa (transport-helper: be quiet on read errors from
> helpers, 2013-06-21) removed a call to 'die()' which printed the name of
> the remote helper passed in to the 'recvline_fh()' function using the
> 'name' parameter.  Once the call to 'die()' was removed the parameter
> was no longer necessary but wasn't removed.  Clean up 'recvline_fh()'
> parameter list by removing the 'name' parameter.
>
> Signed-off-by: Brandon Williams <bmwill@google.com>
> ---
>  transport-helper.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)

Nice.

Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v3 29/35] pkt-line: add packet_buf_write_len function
  2018-02-07  1:13     ` [PATCH v3 29/35] pkt-line: add packet_buf_write_len function Brandon Williams
@ 2018-02-27 23:11       ` Jonathan Nieder
  2018-02-28  1:08         ` Brandon Williams
  0 siblings, 1 reply; 362+ messages in thread
From: Jonathan Nieder @ 2018-02-27 23:11 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, sbeller, peff, gitster, stolee, git, pclouds

Brandon Williams wrote:

> Add the 'packet_buf_write_len()' function which allows for writing an
> arbitrary length buffer into a 'struct strbuf' and formatting it in
> packet-line format.

Makes sense.

[...]
> --- a/pkt-line.h
> +++ b/pkt-line.h
> @@ -26,6 +26,7 @@ void packet_buf_flush(struct strbuf *buf);
>  void packet_buf_delim(struct strbuf *buf);
>  void packet_write(int fd_out, const char *buf, size_t size);
>  void packet_buf_write(struct strbuf *buf, const char *fmt, ...) __attribute__((format (printf, 2, 3)));
> +void packet_buf_write_len(struct strbuf *buf, const char *data, size_t len);

I wonder if we should rename packet_buf_write to something like
packet_buf_writef.  Right now there's a kind of confusing collection of
functions without much symmetry.

Alternatively, the _buf_ ones could become strbuf_* functions:

	strbuf_add_packet(&buf, data, len);
	strbuf_addf_packet(&buf, fmt, ...);

That would make it clearer that these append to buf.

I'm just thinking out loud.  For this series, the API you have here
looks fine, even if it is a bit inconsistent.  (In other words, even
if you agree with me, this would probably be best addressed as a patch
on top.)

[...]
> --- a/pkt-line.c
> +++ b/pkt-line.c
> @@ -215,6 +215,22 @@ void packet_buf_write(struct strbuf *buf, const char *fmt, ...)
>  	va_end(args);
>  }
>  
> +void packet_buf_write_len(struct strbuf *buf, const char *data, size_t len)
> +{
> +	size_t orig_len, n;
> +
> +	orig_len = buf->len;
> +	strbuf_addstr(buf, "0000");
> +	strbuf_add(buf, data, len);
> +	n = buf->len - orig_len;
> +
> +	if (n > LARGE_PACKET_MAX)
> +		die("protocol error: impossibly long line");

Could the error message describe the long line (e.g.

		...impossibly long line %.*s...", 256, data);

)?

> +
> +	set_packet_header(&buf->buf[orig_len], n);
> +	packet_trace(buf->buf + orig_len + 4, n - 4, 1);

Could do, more simply:

	packet_trace(data, len, 1);
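
As a reference point for the framing being discussed, here is a small
standalone sketch of writing one pkt-line: a four-hex-digit length that
counts the header itself, followed by the payload.  It writes into a
plain char buffer instead of a strbuf, returns 0 instead of calling
die(), and pkt_line_frame() and PKT_MAX are made-up names (PKT_MAX is
assumed to match LARGE_PACKET_MAX).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PKT_MAX 65520  /* assumed to mirror git's LARGE_PACKET_MAX */

/*
 * Frame `len` bytes of `data` as one pkt-line into `out`.
 * The 4-hex-digit header counts itself plus the payload.
 * Returns the framed size, or 0 if the packet would be too long
 * or `out` (capacity `cap`) is too small.
 */
size_t pkt_line_frame(char *out, size_t cap, const char *data, size_t len)
{
	static const char hex[] = "0123456789abcdef";
	size_t n;

	assert(out && data);
	if (len > PKT_MAX - 4)
		return 0;
	n = len + 4;
	if (n > cap)
		return 0;

	/* Header: total length as four lowercase hex digits. */
	out[0] = hex[(n >> 12) & 0xf];
	out[1] = hex[(n >> 8) & 0xf];
	out[2] = hex[(n >> 4) & 0xf];
	out[3] = hex[n & 0xf];
	memcpy(out + 4, data, len);
	return n;
}
```

Note that the payload is exactly (data, len), which is why tracing
those two values directly is equivalent to tracing the buffer offset by
the four header bytes.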

Thanks,
Jonathan

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v3 31/35] remote-curl: store the protocol version the server responded with
  2018-02-07  1:13     ` [PATCH v3 31/35] remote-curl: store the protocol version the server responded with Brandon Williams
@ 2018-02-27 23:17       ` Jonathan Nieder
  0 siblings, 0 replies; 362+ messages in thread
From: Jonathan Nieder @ 2018-02-27 23:17 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, sbeller, peff, gitster, stolee, git, pclouds

Brandon Williams wrote:

> Store the protocol version the server responded with when performing
> discovery.  This will be used in a future patch to either change the
> 'Git-Protocol' header sent in subsequent requests or to determine if a
> client needs to fallback to using a different protocol version.

nit: s/fallback/fall back/ (fallback is the noun/adjective, fall back
the verb)

> Signed-off-by: Brandon Williams <bmwill@google.com>

With or without that tweak,
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>

Thanks.

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v3 28/35] transport-helper: introduce stateless-connect
  2018-02-07  1:13     ` [PATCH v3 28/35] transport-helper: introduce stateless-connect Brandon Williams
  2018-02-22  0:01       ` Jonathan Tan
@ 2018-02-27 23:30       ` Jonathan Nieder
  2018-02-28 19:09         ` Brandon Williams
  1 sibling, 1 reply; 362+ messages in thread
From: Jonathan Nieder @ 2018-02-27 23:30 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, sbeller, peff, gitster, stolee, git, pclouds

Brandon Williams wrote:

> Introduce the transport-helper capability 'stateless-connect'.  This
> capability indicates that the transport-helper can be requested to run
> the 'stateless-connect' command which should attempt to make a
> stateless connection with a remote end.  Once established, the
> connection can be used by the git client to communicate with
> the remote end natively in a stateless-rpc manner as supported by
> protocol v2.  This means that the client must send everything the server
> needs in a single request as the client must not assume any
> state-storing on the part of the server or transport.
>
> If a stateless connection cannot be established then the remote-helper
> will respond in the same manner as the 'connect' command indicating that
> the client should fallback to using the dumb remote-helper commands.
>
> Signed-off-by: Brandon Williams <bmwill@google.com>
> ---
>  transport-helper.c | 8 ++++++++
>  transport.c        | 1 +
>  transport.h        | 6 ++++++
>  3 files changed, 15 insertions(+)

Please add documentation for this command to
Documentation/gitremote-helpers.txt.

That helps reviewers, since it means reviewers can get a sense of what
the interface is meant to be.  It helps remote helper implementers as
well: it tells them what they can rely on and what they can't rely on in
this interface.  For the same reason it helps remote helper callers
as well.

Thanks,
Jonathan

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v3 14/35] connect: request remote refs using v2
  2018-02-27 22:18                       ` Jeff King
@ 2018-02-27 23:32                         ` Junio C Hamano
  0 siblings, 0 replies; 362+ messages in thread
From: Junio C Hamano @ 2018-02-27 23:32 UTC (permalink / raw)
  To: Jeff King
  Cc: Eric Sunshine, Jonathan Nieder, Jonathan Tan, Brandon Williams,
	Git List, Stefan Beller, Derrick Stolee, Jeff Hostetler,
	Nguyễn Thái Ngọc Duy

Jeff King <peff@peff.net> writes:

>> struct strs {...};
>> 
>> void strs_init(struct strs *);
>> void strs_push(struct strs *, const char *);
>> void strs_pushf(struct strs *, const char *fmt, ...);
>> void strs_pushl(struct strs *, ...);
>> void strs_pushv(struct strs *, const char **);
>> void strs_pop(struct strs *);
>> void strs_clear(struct strs *);
>> const char **strs_detach(struct strs *);
>> 
>> ...is short, feels pretty natural, and doesn't require understanding
>> "v" for "vector".
>
> Not bad. The "v" carries the information that it _is_ a NULL-terminated
> vector and not some other list-like structure (and so is suitable for
> feeding to execv, etc). But that may just be obvious from looking at its
> uses and documentation.

And with "v", it probably is obvious without looking at its uses and
documentation, so... ;-)

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v3 34/35] remote-curl: implement stateless-connect command
  2018-02-07  1:13     ` [PATCH v3 34/35] remote-curl: implement stateless-connect command Brandon Williams
@ 2018-02-28  0:05       ` Jonathan Nieder
  2018-02-28 20:21         ` Brandon Williams
  0 siblings, 1 reply; 362+ messages in thread
From: Jonathan Nieder @ 2018-02-28  0:05 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, sbeller, peff, gitster, stolee, git, pclouds

Hi,

Brandon Williams wrote:

> Teach remote-curl the 'stateless-connect' command which is used to
> establish a stateless connection with servers which support protocol
> version 2.  This allows remote-curl to act as a proxy, allowing the git
> client to communicate natively with a remote end, simply using
> remote-curl as a pass through to convert requests to http.

Cool!  I better look at the spec for that first.

*looks at the previous patch*

Oh, there is no documented spec. :/  I'll muddle through this instead,
then.

[...]
> --- a/remote-curl.c
> +++ b/remote-curl.c
> @@ -188,7 +188,10 @@ static struct ref *parse_git_refs(struct discovery *heads, int for_push)
>  	heads->version = discover_version(&reader);
>  	switch (heads->version) {
>  	case protocol_v2:
> -		die("support for protocol v2 not implemented yet");
> +		/*
> +		 * Do nothing.  Client should run 'stateless-connect' and
> +		 * request the refs themselves.
> +		 */
>  		break;

This is the 'list' command, right?  Since we expect the client to run
'stateless-connect' instead, can we make it error out?

[...]
> @@ -1082,6 +1085,184 @@ static void parse_push(struct strbuf *buf)
>  	free(specs);
>  }
>  
> +struct proxy_state {
> +	char *service_name;
> +	char *service_url;
> +	struct curl_slist *headers;
> +	struct strbuf request_buffer;
> +	int in;
> +	int out;
> +	struct packet_reader reader;
> +	size_t pos;
> +	int seen_flush;
> +};

Can this have a comment describing what it is/does?  It's not obvious
to me at first glance.

It doesn't have to go in a lot of detail since this is an internal
implementation detail, but something saying e.g. that this represents
a connection to an HTTP server (that's an example; I'm not saying
that's what it represents :)) would help.

> +
> +static void proxy_state_init(struct proxy_state *p, const char *service_name,
> +			     enum protocol_version version)
[...]
> +static void proxy_state_clear(struct proxy_state *p)

Looks sensible.

[...]
> +static size_t proxy_in(char *buffer, size_t eltsize,
> +		       size_t nmemb, void *userdata)

Can this have a comment describing the interface?  (E.g. does it read
a single pkt_line?  How is the caller expected to use it?  Does it
satisfy the interface of some callback?)

libcurl's example https://curl.haxx.se/libcurl/c/ftpupload.html just
calls this read_callback.  Such a name plus a pointer to
CURLOPT_READFUNCTION should do the trick; bonus points if the comment 
says what our implementation of the callback does.

Is this about having peek ability?

> +{
> +	size_t max = eltsize * nmemb;

Can this overflow?  st_mult can avoid having to worry about that.

> +	struct proxy_state *p = userdata;
> +	size_t avail = p->request_buffer.len - p->pos;
> +
> +	if (!avail) {
> +		if (p->seen_flush) {
> +			p->seen_flush = 0;
> +			return 0;
> +		}
> +
> +		strbuf_reset(&p->request_buffer);
> +		switch (packet_reader_read(&p->reader)) {
> +		case PACKET_READ_EOF:
> +			die("unexpected EOF when reading from parent process");
> +		case PACKET_READ_NORMAL:
> +			packet_buf_write_len(&p->request_buffer, p->reader.line,
> +					     p->reader.pktlen);
> +			break;
> +		case PACKET_READ_DELIM:
> +			packet_buf_delim(&p->request_buffer);
> +			break;
> +		case PACKET_READ_FLUSH:
> +			packet_buf_flush(&p->request_buffer);
> +			p->seen_flush = 1;
> +			break;
> +		}
> +		p->pos = 0;
> +		avail = p->request_buffer.len;
> +	}
> +
> +	if (max < avail)
> +		avail = max;
> +	memcpy(buffer, p->request_buffer.buf + p->pos, avail);
> +	p->pos += avail;
> +	return avail;

This is a number of bytes, but CURLOPT_READFUNCTION expects a number
of items, fread-style.  That is:

	if (avail < eltsize)
		... handle somehow, maybe by reading in more? ...

	avail_memb = avail / eltsize;
	memcpy(buffer,
	       p->request_buffer.buf + p->pos,
	       st_mult(avail_memb, eltsize));
	p->pos += st_mult(avail_memb, eltsize);
	return avail_memb;

But https://curl.haxx.se/libcurl/c/CURLOPT_READFUNCTION.html says

	Your function must then return the actual number of bytes that
	it stored in that memory area.

Does this mean eltsize is always 1?  This is super confusing...

... ok, a quick grep for fread_func in libcurl reveals that eltsize is
indeed always 1.  Can we add an assertion so we notice if that
changes?

	if (eltsize != 1)
		BUG("curl read callback called with size = %zu != 1", eltsize);
	max = nmemb;
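
Put together, a read callback with that check might look roughly like
the sketch below.  It has the same parameter list as a
CURLOPT_READFUNCTION callback but does not touch libcurl or pkt-line
parsing; struct pending and read_pending() are hypothetical names, and
assert() stands in for BUG().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the pending request bytes in proxy_state. */
struct pending {
	const char *buf;
	size_t len;
	size_t pos;
};

/*
 * Same shape as a CURLOPT_READFUNCTION callback: copy up to
 * size * nitems bytes into `buffer` and return the number of bytes
 * copied, with 0 signalling the end of the data.
 */
size_t read_pending(char *buffer, size_t size, size_t nitems, void *userdata)
{
	struct pending *p = userdata;
	size_t avail = p->len - p->pos;

	/* Stands in for the suggested BUG() check on the element size. */
	assert(size == 1);

	if (nitems < avail)
		avail = nitems;
	memcpy(buffer, p->buf + p->pos, avail);
	p->pos += avail;
	return avail;
}
```

With size pinned to 1, the byte count and the item count are the same
number, so the fread-style and "bytes stored" readings of the return
value agree.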

[...]
> +static size_t proxy_out(char *buffer, size_t eltsize,
> +			size_t nmemb, void *userdata)
> +{
> +	size_t size = eltsize * nmemb;
> +	struct proxy_state *p = userdata;
> +
> +	write_or_die(p->out, buffer, size);
> +	return size;
> +}

Nice.  Same questions about st_mult or just asserting on eltsize apply
here, too.
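
For the overflow half of that question, the check itself is small.  The
sketch below is a standalone illustration rather than git's st_mult():
checked_mult() is a made-up name, and it reports overflow through its
return value where st_mult() would die.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Store a * b in *out and return 0, or return -1 (leaving *out
 * untouched) if the product would not fit in a size_t.
 */
int checked_mult(size_t a, size_t b, size_t *out)
{
	assert(out);
	/* a * b overflows exactly when b > SIZE_MAX / a (for a != 0). */
	if (a && b > SIZE_MAX / a)
		return -1;
	*out = a * b;
	return 0;
}
```

In the callbacks above this would guard eltsize * nmemb; asserting
eltsize == 1 makes the multiplication unnecessary altogether.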

[...]
> +static int proxy_post(struct proxy_state *p)

What does this function do?  Can it get a brief comment?

> +{
> +	struct active_request_slot *slot;
> +	int err;
> +
> +	slot = get_active_slot();
> +
> +	curl_easy_setopt(slot->curl, CURLOPT_NOBODY, 0);
> +	curl_easy_setopt(slot->curl, CURLOPT_POST, 1);
> +	curl_easy_setopt(slot->curl, CURLOPT_URL, p->service_url);
> +	curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, p->headers);
> +
> +	/* Setup function to read request from client */
> +	curl_easy_setopt(slot->curl, CURLOPT_READFUNCTION, proxy_in);
> +	curl_easy_setopt(slot->curl, CURLOPT_READDATA, p);
> +
> +	/* Setup function to write server response to client */
> +	curl_easy_setopt(slot->curl, CURLOPT_WRITEFUNCTION, proxy_out);
> +	curl_easy_setopt(slot->curl, CURLOPT_WRITEDATA, p);
> +
> +	err = run_slot(slot, NULL);
> +
> +	if (err != HTTP_OK)
> +		err = -1;
> +
> +	return err;

HTTP_OK is 0 but kind of obscures that.  How about something like the
following?

	if (run_slot(slot, NULL))
		return -1;
	return 0;

or

	if (run_slot(slot, NULL) != HTTP_OK)
		return -1;
	return 0;

That way it's clearer that this always returns 0 or -1.

[...]
> +static int stateless_connect(const char *service_name)
> +{
> +	struct discovery *discover;
> +	struct proxy_state p;
> +
> +	/*
> +	 * Run the info/refs request and see if the server supports protocol
> +	 * v2.  If and only if the server supports v2 can we successfully
> +	 * establish a stateless connection, otherwise we need to tell the
> +	 * client to fallback to using other transport helper functions to
> +	 * complete their request.
> +	 */
> +	discover = discover_refs(service_name, 0);
> +	if (discover->version != protocol_v2) {
> +		printf("fallback\n");
> +		fflush(stdout);
> +		return -1;

Interesting.  I wonder if we can make remote-curl less smart and drive
this more from the caller.

E.g. if the caller could do a single stateless request, they could do:

	option git-protocol version=2
	stateless-request GET info/refs?service=git-upload-pack
	[pkt-lines, ending with a flush-pkt]

The git-protocol option in this hypothetical example is the value to
be passed in the Git-Protocol header.

Then based on the response, the caller could decide to keep using
stateless-request for further requests or fall back to "fetch".
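
On the caller side, that decision could be as small as the sketch
below.  It assumes the helper answers the way the quoted patch and the
existing 'connect' command do, with an empty line on success and a
"fallback" line otherwise; the enum and classify_helper_reply() are
hypothetical names, and the line is expected to arrive with its
trailing newline already stripped.

```c
#include <assert.h>
#include <string.h>

enum helper_reply {
	HELPER_CONNECTED,  /* empty line: keep talking to the remote natively */
	HELPER_FALLBACK,   /* "fallback": use the other helper commands */
	HELPER_UNKNOWN     /* anything else: treat as a protocol error */
};

/* Classify one reply line from the remote helper. */
enum helper_reply classify_helper_reply(const char *line)
{
	assert(line);
	if (!*line)
		return HELPER_CONNECTED;
	if (!strcmp(line, "fallback"))
		return HELPER_FALLBACK;
	return HELPER_UNKNOWN;
}
```

Keeping this choice in the caller is what would let a later protocol
version change the request format without the helper having to know
about it.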

That way, if we implement some protocol v3, the remote helper would
not have to be changed at all to handle it: the caller would instead
make the new v3-format request and remote-curl would be able to oblige
without knowing why they're doing it.

What do you think?

Thanks,
Jonathan

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v3 29/35] pkt-line: add packet_buf_write_len function
  2018-02-27 23:11       ` Jonathan Nieder
@ 2018-02-28  1:08         ` Brandon Williams
  0 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-02-28  1:08 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: git, sbeller, peff, gitster, stolee, git, pclouds

On 02/27, Jonathan Nieder wrote:
> Brandon Williams wrote:
> 
> > Add the 'packet_buf_write_len()' function which allows for writing an
> > arbitrary length buffer into a 'struct strbuf' and formatting it in
> > packet-line format.
> 
> Makes sense.
> 
> [...]
> > --- a/pkt-line.h
> > +++ b/pkt-line.h
> > @@ -26,6 +26,7 @@ void packet_buf_flush(struct strbuf *buf);
> >  void packet_buf_delim(struct strbuf *buf);
> >  void packet_write(int fd_out, const char *buf, size_t size);
> >  void packet_buf_write(struct strbuf *buf, const char *fmt, ...) __attribute__((format (printf, 2, 3)));
> > +void packet_buf_write_len(struct strbuf *buf, const char *data, size_t len);
> 
> I wonder if we should rename packet_buf_write to something like
> packet_buf_writef.  Right now there's a kind of confusing collection of
> functions without much symmetry.
> 
> Alternatively, the _buf_ ones could become strbuf_* functions:
> 
> 	strbuf_add_packet(&buf, data, len);
> 	strbuf_addf_packet(&buf, fmt, ...);
> 
> That would make it clearer that these append to buf.
> 
> I'm just thinking out loud.  For this series, the API you have here
> looks fine, even if it is a bit inconsistent.  (In other words, even
> if you agree with me, this would probably be best addressed as a patch
> on top.)

Yeah, I agree that an API change is needed, but it can be done on
top of this series.

> 
> [...]
> > --- a/pkt-line.c
> > +++ b/pkt-line.c
> > @@ -215,6 +215,22 @@ void packet_buf_write(struct strbuf *buf, const char *fmt, ...)
> >  	va_end(args);
> >  }
> >  
> > +void packet_buf_write_len(struct strbuf *buf, const char *data, size_t len)
> > +{
> > +	size_t orig_len, n;
> > +
> > +	orig_len = buf->len;
> > +	strbuf_addstr(buf, "0000");
> > +	strbuf_add(buf, data, len);
> > +	n = buf->len - orig_len;
> > +
> > +	if (n > LARGE_PACKET_MAX)
> > +		die("protocol error: impossibly long line");
> 
> Could the error message describe the long line (e.g.
> 
> 		...impossibly long line %.*s...", 256, data);
> 

I was reusing the error msg as it appears in another part of this file.

> )?
> 
> > +
> > +	set_packet_header(&buf->buf[orig_len], n);
> > +	packet_trace(buf->buf + orig_len + 4, n - 4, 1);
> 
> Could do, more simply:
> 
> 	packet_trace(data, len, 1);

I'll change this.

> 
> Thanks,
> Jonathan

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v3 28/35] transport-helper: introduce stateless-connect
  2018-02-27 23:30       ` Jonathan Nieder
@ 2018-02-28 19:09         ` Brandon Williams
  0 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-02-28 19:09 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: git, sbeller, peff, gitster, stolee, git, pclouds

On 02/27, Jonathan Nieder wrote:
> Brandon Williams wrote:
> 
> > Introduce the transport-helper capability 'stateless-connect'.  This
> > capability indicates that the transport-helper can be requested to run
> > the 'stateless-connect' command which should attempt to make a
> > stateless connection with a remote end.  Once established, the
> > connection can be used by the git client to communicate with
> > the remote end natively in a stateless-rpc manner as supported by
> > protocol v2.  This means that the client must send everything the server
> > needs in a single request as the client must not assume any
> > state-storing on the part of the server or transport.
> >
> > If a stateless connection cannot be established then the remote-helper
> > will respond in the same manner as the 'connect' command indicating that
> > the client should fallback to using the dumb remote-helper commands.
> >
> > Signed-off-by: Brandon Williams <bmwill@google.com>
> > ---
> >  transport-helper.c | 8 ++++++++
> >  transport.c        | 1 +
> >  transport.h        | 6 ++++++
> >  3 files changed, 15 insertions(+)
> 
> Please add documentation for this command to
> Documentation/gitremote-helpers.txt.
> 
> That helps reviewers, since it means reviewers can get a sense of what
> the interface is meant to be.  It helps remote helper implementers as
> well: it tells them what they can rely on and what they can't rely on in
> this interface.  For the same reason it helps remote helper callers
> as well.
> 
> Thanks,
> Jonathan

Thanks for reminding me.  I had intended to at some point but had
forgotten to do so.  I'm going to mark this as experimental and for
internal use only so that we can still tweak the interface if we want
before it becomes stable.

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v3 34/35] remote-curl: implement stateless-connect command
  2018-02-28  0:05       ` Jonathan Nieder
@ 2018-02-28 20:21         ` Brandon Williams
  0 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-02-28 20:21 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: git, sbeller, peff, gitster, stolee, git, pclouds

On 02/27, Jonathan Nieder wrote:
> Hi,
> 
> Brandon Williams wrote:
> 
> > Teach remote-curl the 'stateless-connect' command which is used to
> > establish a stateless connection with servers which support protocol
> > version 2.  This allows remote-curl to act as a proxy, allowing the git
> > client to communicate natively with a remote end, simply using
> > remote-curl as a pass through to convert requests to http.
> 
> Cool!  I better look at the spec for that first.
> 
> *looks at the previous patch*
> 
> Oh, there is no documented spec. :/  I'll muddle through this instead,
> then.

I'll make sure to add one :)

> 
> [...]
> > --- a/remote-curl.c
> > +++ b/remote-curl.c
> > @@ -188,7 +188,10 @@ static struct ref *parse_git_refs(struct discovery *heads, int for_push)
> >  	heads->version = discover_version(&reader);
> >  	switch (heads->version) {
> >  	case protocol_v2:
> > -		die("support for protocol v2 not implemented yet");
> > +		/*
> > +		 * Do nothing.  Client should run 'stateless-connect' and
> > +		 * request the refs themselves.
> > +		 */
> >  		break;
> 
> This is the 'list' command, right?  Since we expect the client to run
> 'stateless-connect' instead, can we make it error out?

Yes and no.  Remote-curl will run this when trying to establish a
stateless-connection.  If the response is v2 then this is a capability
list and not refs.  So the capabilities will be dumped to the client and
they will be able to request the refs themselves at a later point.  The
comment here is just misleading, so I'll make sure to fix it.

> 
> [...]
> > @@ -1082,6 +1085,184 @@ static void parse_push(struct strbuf *buf)
> >  	free(specs);
> >  }
> >  
> > +struct proxy_state {
> > +	char *service_name;
> > +	char *service_url;
> > +	struct curl_slist *headers;
> > +	struct strbuf request_buffer;
> > +	int in;
> > +	int out;
> > +	struct packet_reader reader;
> > +	size_t pos;
> > +	int seen_flush;
> > +};
> 
> Can this have a comment describing what it is/does?  It's not obvious
> to me at first glance.
> 
> It doesn't have to go in a lot of detail since this is an internal
> implementation detail, but something saying e.g. that this represents
> a connection to an HTTP server (that's an example; I'm not saying
> that's what it represents :)) would help.

Always making new code have higher standards than the existing code ;)
Haha, I'll add a simple comment explaining it.

> 
> > +
> > +static void proxy_state_init(struct proxy_state *p, const char *service_name,
> > +			     enum protocol_version version)
> [...]
> > +static void proxy_state_clear(struct proxy_state *p)
> 
> Looks sensible.
> 
> [...]
> > +static size_t proxy_in(char *buffer, size_t eltsize,
> > +		       size_t nmemb, void *userdata)
> 
> Can this have a comment describing the interface?  (E.g. does it read
> a single pkt_line?  How is the caller expected to use it?  Does it
> satisfy the interface of some callback?)

I'll add a comment that it's used as a READFUNCTION callback for curl and
that it tries to copy over a packet-line at a time.

> 
> libcurl's example https://curl.haxx.se/libcurl/c/ftpupload.html just
> calls this read_callback.  Such a name plus a pointer to
> CURLOPT_READFUNCTION should do the trick; bonus points if the comment 
> says what our implementation of the callback does.
> 
> Is this about having peek ability?

No, it's just that curl only requests a set amount of data at a time, so you
need to be able to buffer the data that can't be read yet.

> 
> > +	struct proxy_state *p = userdata;
> > +	size_t avail = p->request_buffer.len - p->pos;
> > +
> > +	if (!avail) {
> > +		if (p->seen_flush) {
> > +			p->seen_flush = 0;
> > +			return 0;
> > +		}
> > +
> > +		strbuf_reset(&p->request_buffer);
> > +		switch (packet_reader_read(&p->reader)) {
> > +		case PACKET_READ_EOF:
> > +			die("unexpected EOF when reading from parent process");
> > +		case PACKET_READ_NORMAL:
> > +			packet_buf_write_len(&p->request_buffer, p->reader.line,
> > +					     p->reader.pktlen);
> > +			break;
> > +		case PACKET_READ_DELIM:
> > +			packet_buf_delim(&p->request_buffer);
> > +			break;
> > +		case PACKET_READ_FLUSH:
> > +			packet_buf_flush(&p->request_buffer);
> > +			p->seen_flush = 1;
> > +			break;
> > +		}
> > +		p->pos = 0;
> > +		avail = p->request_buffer.len;
> > +	}
> > +
> > +	if (max < avail)
> > +		avail = max;
> > +	memcpy(buffer, p->request_buffer.buf + p->pos, avail);
> > +	p->pos += avail;
> > +	return avail;
> 
> This is a number of bytes, but CURLOPT_READFUNCTION expects a number
> of items, fread-style.  That is:
> 
> 	if (avail < eltsize)
> 		... handle somehow, maybe by reading in more? ...
> 
> 	avail_memb = avail / eltsize;
> 	memcpy(buffer,
> 	       p->request_buffer.buf + p->pos,
> 	       st_mult(avail_memb, eltsize));
> 	p->pos += st_mult(avail_memb, eltsize);
> 	return avail_memb;
> 
> But https://curl.haxx.se/libcurl/c/CURLOPT_READFUNCTION.html says
> 
> 	Your function must then return the actual number of bytes that
> 	it stored in that memory area.
> 
> Does this mean eltsize is always 1?  This is super confusing...
> 
> ... ok, a quick grep for fread_func in libcurl reveals that eltsize is
> indeed always 1.  Can we add an assertion so we notice if that
> changes?
> 
> 	if (eltsize != 1)
> 		BUG("curl read callback called with size = %zu != 1", eltsize);
> 	max = nmemb;

Yeah, I can go ahead and do this.  Just note that the v1 path uses logic
identical to this, so it would be a problem there as well.
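
To make the buffering point above concrete, here is a minimal sketch of
a callback with the same shape as a CURLOPT_READFUNCTION callback.  It
is not the remote-curl code: 'struct pending_buf' and
'buffered_read_cb' are hypothetical names, and the sketch does not
include curl at all so that it stays self-contained.  It relies only on
the documented contract that the callback may store at most
size * nitems bytes and returns the number of bytes stored, with 0
meaning end of data.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical state: a pending request and how much was handed out. */
struct pending_buf {
	const char *data;
	size_t len;
	size_t pos;
};

/*
 * Copy at most size * nitems bytes of the pending data into 'buffer'
 * and return the number of bytes stored.  Whatever does not fit stays
 * buffered for the next call; returning 0 signals end of data.
 */
static size_t buffered_read_cb(char *buffer, size_t size, size_t nitems,
			       void *userdata)
{
	struct pending_buf *p = userdata;
	size_t max = size * nitems;
	size_t avail = p->len - p->pos;

	if (max < avail)
		avail = max;
	memcpy(buffer, p->data + p->pos, avail);
	p->pos += avail;
	return avail;
}
```

With an 8-byte window, a 13-byte request is handed out as 8 bytes, then
5 bytes, then 0.  The sketch multiplies size by nitems rather than
asserting that size is 1; either choice keeps the return value a byte
count.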

> 
> [...]
> > +static size_t proxy_out(char *buffer, size_t eltsize,
> > +			size_t nmemb, void *userdata)
> > +{
> > +	size_t size = eltsize * nmemb;
> > +	struct proxy_state *p = userdata;
> > +
> > +	write_or_die(p->out, buffer, size);
> > +	return size;
> > +}
> 
> Nice.  Same questions about st_mult or just asserting on eltsize apply
> here, too.
> 
> [...]
> > +static int proxy_post(struct proxy_state *p)
> 
> What does this function do?  Can it get a brief comment?

Will do.

> 
> > +{
> > +	struct active_request_slot *slot;
> > +	int err;
> > +
> > +	slot = get_active_slot();
> > +
> > +	curl_easy_setopt(slot->curl, CURLOPT_NOBODY, 0);
> > +	curl_easy_setopt(slot->curl, CURLOPT_POST, 1);
> > +	curl_easy_setopt(slot->curl, CURLOPT_URL, p->service_url);
> > +	curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, p->headers);
> > +
> > +	/* Setup function to read request from client */
> > +	curl_easy_setopt(slot->curl, CURLOPT_READFUNCTION, proxy_in);
> > +	curl_easy_setopt(slot->curl, CURLOPT_READDATA, p);
> > +
> > +	/* Setup function to write server response to client */
> > +	curl_easy_setopt(slot->curl, CURLOPT_WRITEFUNCTION, proxy_out);
> > +	curl_easy_setopt(slot->curl, CURLOPT_WRITEDATA, p);
> > +
> > +	err = run_slot(slot, NULL);
> > +
> > +	if (err != HTTP_OK)
> > +		err = -1;
> > +
> > +	return err;
> 
> HTTP_OK is 0 but kind of obscures that.  How about something like the
> following?
> 
> 	if (run_slot(slot, NULL))
> 		return -1;
> 	return 0;
> 
> or
> 
> 	if (run_slot(slot, NULL) != HTTP_OK)
> 		return -1;
> 	return 0;
> 
> That way it's clearer that this always returns 0 or -1.

Sounds good.

> 
> [...]
> > +static int stateless_connect(const char *service_name)
> > +{
> > +	struct discovery *discover;
> > +	struct proxy_state p;
> > +
> > +	/*
> > +	 * Run the info/refs request and see if the server supports protocol
> > +	 * v2.  If and only if the server supports v2 can we successfully
> > +	 * establish a stateless connection, otherwise we need to tell the
> > +	 * client to fallback to using other transport helper functions to
> > +	 * complete their request.
> > +	 */
> > +	discover = discover_refs(service_name, 0);
> > +	if (discover->version != protocol_v2) {
> > +		printf("fallback\n");
> > +		fflush(stdout);
> > +		return -1;
> 
> Interesting.  I wonder if we can make remote-curl less smart and drive
> this more from the caller.
> 
> E.g. if the caller could do a single stateless request, they could do:
> 
> 	option git-protocol version=2
> 	stateless-request GET info/refs?service=git-upload-pack
> 	[pkt-lines, ending with a flush-pkt]
> 
> The git-protocol option in this hypothetical example is the value to
> be passed in the Git-Protocol header.
> 
> Then based on the response, the caller could decide to keep using
> stateless-request for further requests or fall back to "fetch".
> 
> That way, if we implement some protocol v3, the remote helper would
> not have to be changed at all to handle it: the caller would instead
> make the new v3-format request and remote-curl would be able to oblige
> without knowing why they're doing it.
> 
> What do you think?

I do see the draw for wanting this.  I think a change like this requires
a lot more refactoring, simply because with the current setup the
fetch/ls-refs logic doesn't care that it's talking through a
remote-helper, whereas if we went down that route it would need to be
aware of that.

-- 
Brandon Williams


* [PATCH v4 00/35] protocol version 2
  2018-02-07  1:12   ` [PATCH v3 00/35] " Brandon Williams
                       ` (36 preceding siblings ...)
  2018-02-21 20:01     ` Brandon Williams
@ 2018-02-28 23:22     ` Brandon Williams
  2018-02-28 23:22       ` [PATCH v4 01/35] pkt-line: introduce packet_read_with_status Brandon Williams
                         ` (35 more replies)
  37 siblings, 36 replies; 362+ messages in thread
From: Brandon Williams @ 2018-02-28 23:22 UTC (permalink / raw)
  To: git
  Cc: git, gitster, jrnieder, pclouds, peff, sbeller, stolee, Brandon Williams

Lots of changes since v3 (well more than between v2 and v3).  Thanks for
all of the reviews on the last round, the series is getting more
polished.

 * Eliminated the "# service" line from the response from an HTTP
   server.  This means that the response to a v2 request is exactly the
   same regardless of which transport you use!  Docs for this have been
   added as well.
 * Changed how ref-patterns work with the `ls-refs` command.  Instead of
   using wildmatch all patterns must either match exactly or they can
   contain a single '*' character at the end to mean that the prefix
   must match.  Docs for this have also been added.
 * Lots of updates to the docs.  Including documenting the
   `stateless-connect` remote-helper command used by remote-curl to
   handle the http transport.
 * Fixed a number of bugs with the `fetch` command, one of which didn't
   use objects from configured alternates.
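
The ref-pattern rule described in the list above (exact match, or a
single trailing '*' meaning "prefix must match") can be sketched as
follows.  This is an illustration of the rule as stated in this cover
letter, not the code from ls-refs.c; the function name
'ref_matches_pattern' is made up for the example.

```c
#include <string.h>

/*
 * Return 1 if 'refname' matches 'pattern', 0 otherwise.  A pattern
 * either matches exactly or ends in a single '*', in which case
 * everything before the '*' must be a prefix of the refname.
 */
static int ref_matches_pattern(const char *refname, const char *pattern)
{
	size_t len = strlen(pattern);

	if (len && pattern[len - 1] == '*')
		return !strncmp(refname, pattern, len - 1);
	return !strcmp(refname, pattern);
}
```

For example, "refs/heads/*" matches "refs/heads/master" but not
"refs/tags/v1.0", and "refs/heads/master" without a '*' does not match
"refs/heads/master-next".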

Brandon Williams (35):
  pkt-line: introduce packet_read_with_status
  pkt-line: allow peeking a packet line without consuming it
  pkt-line: add delim packet support
  upload-pack: convert to a builtin
  upload-pack: factor out processing lines
  transport: use get_refs_via_connect to get refs
  connect: convert get_remote_heads to use struct packet_reader
  connect: discover protocol version outside of get_remote_heads
  transport: store protocol version
  protocol: introduce enum protocol_version value protocol_v2
  test-pkt-line: introduce a packet-line test helper
  serve: introduce git-serve
  ls-refs: introduce ls-refs server command
  connect: request remote refs using v2
  transport: convert get_refs_list to take a list of ref patterns
  transport: convert transport_get_remote_refs to take a list of ref
    patterns
  ls-remote: pass ref patterns when requesting a remote's refs
  fetch: pass ref patterns when fetching
  push: pass ref patterns when pushing
  upload-pack: introduce fetch server command
  fetch-pack: perform a fetch using v2
  fetch-pack: support shallow requests
  connect: refactor git_connect to only get the protocol version once
  connect: don't request v2 when pushing
  transport-helper: remove name parameter
  transport-helper: refactor process_connect_service
  transport-helper: introduce stateless-connect
  pkt-line: add packet_buf_write_len function
  remote-curl: create copy of the service name
  remote-curl: store the protocol version the server responded with
  http: allow providing extra headers for http requests
  http: don't always add Git-Protocol header
  http: eliminate "# service" line when using protocol v2
  remote-curl: implement stateless-connect command
  remote-curl: don't request v2 when pushing

 .gitignore                              |   1 +
 Documentation/gitremote-helpers.txt     |  32 ++
 Documentation/technical/protocol-v2.txt | 401 +++++++++++++++
 Makefile                                |   7 +-
 builtin.h                               |   2 +
 builtin/clone.c                         |   2 +-
 builtin/fetch-pack.c                    |  20 +-
 builtin/fetch.c                         |  18 +-
 builtin/ls-remote.c                     |  12 +-
 builtin/receive-pack.c                  |   6 +
 builtin/remote.c                        |   2 +-
 builtin/send-pack.c                     |  20 +-
 builtin/serve.c                         |  30 ++
 builtin/upload-pack.c                   |  74 +++
 connect.c                               | 364 ++++++++++----
 connect.h                               |   7 +
 fetch-pack.c                            | 339 ++++++++++++-
 fetch-pack.h                            |   4 +-
 git.c                                   |   2 +
 http-backend.c                          |   8 +-
 http.c                                  |  25 +-
 http.h                                  |   7 +
 ls-refs.c                               | 144 ++++++
 ls-refs.h                               |   9 +
 pkt-line.c                              | 147 +++++-
 pkt-line.h                              |  78 +++
 protocol.c                              |   2 +
 protocol.h                              |   1 +
 refs.c                                  |  14 +
 refs.h                                  |   7 +
 remote-curl.c                           | 278 ++++++++++-
 remote.h                                |  11 +-
 serve.c                                 | 260 ++++++++++
 serve.h                                 |  15 +
 t/helper/test-pkt-line.c                |  64 +++
 t/t5701-git-serve.sh                    | 176 +++++++
 t/t5702-protocol-v2.sh                  | 273 +++++++++++
 transport-helper.c                      |  87 ++--
 transport-internal.h                    |   9 +-
 transport.c                             | 125 +++--
 transport.h                             |  18 +-
 upload-pack.c                           | 616 ++++++++++++++++++------
 upload-pack.h                           |  21 +
 43 files changed, 3370 insertions(+), 368 deletions(-)
 create mode 100644 Documentation/technical/protocol-v2.txt
 create mode 100644 builtin/serve.c
 create mode 100644 builtin/upload-pack.c
 create mode 100644 ls-refs.c
 create mode 100644 ls-refs.h
 create mode 100644 serve.c
 create mode 100644 serve.h
 create mode 100644 t/helper/test-pkt-line.c
 create mode 100755 t/t5701-git-serve.sh
 create mode 100755 t/t5702-protocol-v2.sh
 create mode 100644 upload-pack.h

-- 
2.16.2.395.g2e18187dfd-goog



* [PATCH v4 01/35] pkt-line: introduce packet_read_with_status
  2018-02-28 23:22     ` [PATCH v4 " Brandon Williams
@ 2018-02-28 23:22       ` Brandon Williams
  2018-03-13 19:35         ` Jonathan Tan
  2018-02-28 23:22       ` [PATCH v4 02/35] pkt-line: allow peeking a packet line without consuming it Brandon Williams
                         ` (34 subsequent siblings)
  35 siblings, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-02-28 23:22 UTC (permalink / raw)
  To: git
  Cc: git, gitster, jrnieder, pclouds, peff, sbeller, stolee, Brandon Williams

The current pkt-line API encodes the status of a pkt-line read in the
length of the read content.  An error is indicated with '-1', a flush
with '0' (which can be confusing since a return value of '0' can also
indicate an empty pkt-line), and a positive integer for the length of
the read content otherwise.  This doesn't leave much room for allowing
the addition of additional special packets in the future.

To solve this, introduce 'packet_read_with_status()' which reads a packet
and returns the status of the read encoded as an 'enum packet_read_status'
type.  This allows for easily distinguishing between special and normal
packets as well as errors.  It also enables easily adding a new special
packet in the future.
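
As a rough illustration of why a separate status helps, the sketch
below reads pkt-lines from an in-memory buffer and reports a status
alongside the length.  It is a simplified stand-in, not the function
added by this patch: the names 'read_status', 'hexval' and 'read_pkt'
are hypothetical, and malformed input is reported as READ_EOF instead
of calling die().

```c
#include <stddef.h>
#include <string.h>

enum read_status {
	READ_EOF = -1,
	READ_NORMAL,
	READ_FLUSH,
};

static int hexval(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/*
 * Read one pkt-line from *src, advance *src and *src_len past it, copy
 * the payload into 'out' (NUL-terminated) and store its length in
 * *pktlen.  Short or malformed input yields READ_EOF in this sketch.
 */
static enum read_status read_pkt(const char **src, size_t *src_len,
				 char *out, size_t out_size, int *pktlen)
{
	int len = 0, i;

	if (*src_len < 4)
		return READ_EOF;
	for (i = 0; i < 4; i++) {
		int v = hexval((*src)[i]);
		if (v < 0)
			return READ_EOF;
		len = (len << 4) | v;
	}
	if (!len) {
		/* "0000": flush packet, no payload */
		*src += 4;
		*src_len -= 4;
		return READ_FLUSH;
	}
	if (len < 4)
		return READ_EOF;
	len -= 4;
	if ((size_t)len >= out_size || (size_t)len > *src_len - 4)
		return READ_EOF;
	memcpy(out, *src + 4, len);
	out[len] = '\0';
	*src += 4 + len;
	*src_len -= 4 + len;
	*pktlen = len;
	return READ_NORMAL;
}
```

With this shape an empty pkt-line ("0004") comes back as READ_NORMAL
with a length of 0, while a flush ("0000") comes back as READ_FLUSH, so
the two are no longer conflated in a single return value.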

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 pkt-line.c | 55 ++++++++++++++++++++++++++++++++++++++++--------------
 pkt-line.h | 16 ++++++++++++++++
 2 files changed, 57 insertions(+), 14 deletions(-)

diff --git a/pkt-line.c b/pkt-line.c
index 2827ca772..08e5ba44c 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -280,28 +280,34 @@ static int packet_length(const char *linelen)
 	return (val < 0) ? val : (val << 8) | hex2chr(linelen + 2);
 }
 
-int packet_read(int fd, char **src_buf, size_t *src_len,
-		char *buffer, unsigned size, int options)
+enum packet_read_status packet_read_with_status(int fd, char **src_buffer,
+						size_t *src_len, char *buffer,
+						unsigned size, int *pktlen,
+						int options)
 {
-	int len, ret;
+	int len;
 	char linelen[4];
 
-	ret = get_packet_data(fd, src_buf, src_len, linelen, 4, options);
-	if (ret < 0)
-		return ret;
+	if (get_packet_data(fd, src_buffer, src_len, linelen, 4, options) < 0)
+		return PACKET_READ_EOF;
+
 	len = packet_length(linelen);
-	if (len < 0)
+
+	if (len < 0) {
 		die("protocol error: bad line length character: %.4s", linelen);
-	if (!len) {
+	} else if (!len) {
 		packet_trace("0000", 4, 0);
-		return 0;
+		return PACKET_READ_FLUSH;
+	} else if (len < 4) {
+		die("protocol error: bad line length %d", len);
 	}
+
 	len -= 4;
-	if (len >= size)
+	if ((unsigned)len >= size)
 		die("protocol error: bad line length %d", len);
-	ret = get_packet_data(fd, src_buf, src_len, buffer, len, options);
-	if (ret < 0)
-		return ret;
+
+	if (get_packet_data(fd, src_buffer, src_len, buffer, len, options) < 0)
+		return PACKET_READ_EOF;
 
 	if ((options & PACKET_READ_CHOMP_NEWLINE) &&
 	    len && buffer[len-1] == '\n')
@@ -309,7 +315,28 @@ int packet_read(int fd, char **src_buf, size_t *src_len,
 
 	buffer[len] = 0;
 	packet_trace(buffer, len, 0);
-	return len;
+	*pktlen = len;
+	return PACKET_READ_NORMAL;
+}
+
+int packet_read(int fd, char **src_buffer, size_t *src_len,
+		char *buffer, unsigned size, int options)
+{
+	int pktlen;
+
+	switch (packet_read_with_status(fd, src_buffer, src_len, buffer, size,
+					&pktlen, options)) {
+	case PACKET_READ_EOF:
+		pktlen = -1;
+		break;
+	case PACKET_READ_NORMAL:
+		break;
+	case PACKET_READ_FLUSH:
+		pktlen = 0;
+		break;
+	}
+
+	return pktlen;
 }
 
 static char *packet_read_line_generic(int fd,
diff --git a/pkt-line.h b/pkt-line.h
index 3dad583e2..0be691116 100644
--- a/pkt-line.h
+++ b/pkt-line.h
@@ -65,6 +65,22 @@ int write_packetized_from_buf(const char *src_in, size_t len, int fd_out);
 int packet_read(int fd, char **src_buffer, size_t *src_len, char
 		*buffer, unsigned size, int options);
 
+/*
+ * Read a packetized line into a buffer like the 'packet_read()' function but
+ * returns an 'enum packet_read_status' which indicates the status of the read.
+ * The number of bytes read will be assigned to *pktlen if the status of the
+ * read was 'PACKET_READ_NORMAL'.
+ */
+enum packet_read_status {
+	PACKET_READ_EOF = -1,
+	PACKET_READ_NORMAL,
+	PACKET_READ_FLUSH,
+};
+enum packet_read_status packet_read_with_status(int fd, char **src_buffer,
+						size_t *src_len, char *buffer,
+						unsigned size, int *pktlen,
+						int options);
+
 /*
  * Convenience wrapper for packet_read that is not gentle, and sets the
  * CHOMP_NEWLINE option. The return value is NULL for a flush packet,
-- 
2.16.2.395.g2e18187dfd-goog



* [PATCH v4 02/35] pkt-line: allow peeking a packet line without consuming it
  2018-02-28 23:22     ` [PATCH v4 " Brandon Williams
  2018-02-28 23:22       ` [PATCH v4 01/35] pkt-line: introduce packet_read_with_status Brandon Williams
@ 2018-02-28 23:22       ` Brandon Williams
  2018-03-01 20:48         ` Junio C Hamano
  2018-02-28 23:22       ` [PATCH v4 03/35] pkt-line: add delim packet support Brandon Williams
                         ` (33 subsequent siblings)
  35 siblings, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-02-28 23:22 UTC (permalink / raw)
  To: git
  Cc: git, gitster, jrnieder, pclouds, peff, sbeller, stolee, Brandon Williams

Sometimes it is advantageous to be able to peek the next packet line
without consuming it (e.g. to be able to determine the protocol version
a server is speaking).  In order to do that, introduce 'struct
packet_reader' which is an abstraction around the normal packet reading
logic.  This enables a caller to peek a single line at a time using
'packet_reader_peek()' and to consume a line by calling
'packet_reader_read()'.
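
The peek/read relationship can be sketched in isolation with a reader
over an in-memory list of lines.  This is only an illustration of the
"peeked" flag idea; 'struct line_reader', 'reader_read', 'reader_peek'
and 'next_line_is' are hypothetical names and do not appear in this
patch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical reader over a NULL-terminated array of lines. */
struct line_reader {
	const char **next;	/* next unread line */
	const char *line;	/* last line read, or NULL at the end */
	int line_peeked;	/* set if 'line' was peeked, not consumed */
};

/* Consume and return the next line, or NULL at the end. */
static const char *reader_read(struct line_reader *r)
{
	if (r->line_peeked) {
		/* hand out the line that a previous peek already fetched */
		r->line_peeked = 0;
		return r->line;
	}
	r->line = *r->next;
	if (r->line)
		r->next++;
	return r->line;
}

/* Return the next line without consuming it. */
static const char *reader_peek(struct line_reader *r)
{
	if (r->line_peeked)
		return r->line;
	reader_read(r);
	r->line_peeked = 1;
	return r->line;
}

/* Example use: check what the other side said first without eating it. */
static int next_line_is(struct line_reader *r, const char *expected)
{
	const char *line = reader_peek(r);

	return line && !strcmp(line, expected);
}
```

Peeking any number of times returns the same line, and the following
read returns that line once before moving on.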

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 pkt-line.c | 59 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 pkt-line.h | 58 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 117 insertions(+)

diff --git a/pkt-line.c b/pkt-line.c
index 08e5ba44c..6307fa4a3 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -404,3 +404,62 @@ ssize_t read_packetized_to_strbuf(int fd_in, struct strbuf *sb_out)
 	}
 	return sb_out->len - orig_len;
 }
+
+/* Packet Reader Functions */
+void packet_reader_init(struct packet_reader *reader, int fd,
+			char *src_buffer, size_t src_len,
+			int options)
+{
+	memset(reader, 0, sizeof(*reader));
+
+	reader->fd = fd;
+	reader->src_buffer = src_buffer;
+	reader->src_len = src_len;
+	reader->buffer = packet_buffer;
+	reader->buffer_size = sizeof(packet_buffer);
+	reader->options = options;
+}
+
+enum packet_read_status packet_reader_read(struct packet_reader *reader)
+{
+	if (reader->line_peeked) {
+		reader->line_peeked = 0;
+		return reader->status;
+	}
+
+	reader->status = packet_read_with_status(reader->fd,
+						 &reader->src_buffer,
+						 &reader->src_len,
+						 reader->buffer,
+						 reader->buffer_size,
+						 &reader->pktlen,
+						 reader->options);
+
+	switch (reader->status) {
+	case PACKET_READ_EOF:
+		reader->pktlen = -1;
+		reader->line = NULL;
+		break;
+	case PACKET_READ_NORMAL:
+		reader->line = reader->buffer;
+		break;
+	case PACKET_READ_FLUSH:
+		reader->pktlen = 0;
+		reader->line = NULL;
+		break;
+	}
+
+	return reader->status;
+}
+
+enum packet_read_status packet_reader_peek(struct packet_reader *reader)
+{
+	/* Only allow peeking a single line */
+	if (reader->line_peeked)
+		return reader->status;
+
+	/* Peek a line by reading it and setting peeked flag */
+	packet_reader_read(reader);
+	reader->line_peeked = 1;
+	return reader->status;
+}
diff --git a/pkt-line.h b/pkt-line.h
index 0be691116..f2edfae9a 100644
--- a/pkt-line.h
+++ b/pkt-line.h
@@ -112,6 +112,64 @@ char *packet_read_line_buf(char **src_buf, size_t *src_len, int *size);
  */
 ssize_t read_packetized_to_strbuf(int fd_in, struct strbuf *sb_out);
 
+struct packet_reader {
+	/* source file descriptor */
+	int fd;
+
+	/* source buffer and its size */
+	char *src_buffer;
+	size_t src_len;
+
+	/* buffer that pkt-lines are read into and its size */
+	char *buffer;
+	unsigned buffer_size;
+
+	/* options to be used during reads */
+	int options;
+
+	/* status of the last read */
+	enum packet_read_status status;
+
+	/* length of data read during the last read */
+	int pktlen;
+
+	/* the last line read */
+	const char *line;
+
+	/* indicates if a line has been peeked */
+	int line_peeked;
+};
+
+/*
+ * Initialize a 'struct packet_reader' object which is an
+ * abstraction around the 'packet_read_with_status()' function.
+ */
+extern void packet_reader_init(struct packet_reader *reader, int fd,
+			       char *src_buffer, size_t src_len,
+			       int options);
+
+/*
+ * Perform a packet read and return the status of the read.
+ * The values of 'pktlen' and 'line' are updated based on the status of the
+ * read as follows:
+ *
+ * PACKET_READ_EOF: 'pktlen' is set to '-1' and 'line' is set to NULL
+ * PACKET_READ_NORMAL: 'pktlen' is set to the number of bytes read
+ *		       'line' is set to point at the read line
+ * PACKET_READ_FLUSH: 'pktlen' is set to '0' and 'line' is set to NULL
+ */
+extern enum packet_read_status packet_reader_read(struct packet_reader *reader);
+
+/*
+ * Peek the next packet line without consuming it and return the status.
+ * The next call to 'packet_reader_read()' will perform a read of the same line
+ * that was peeked, consuming the line.
+ *
+ * Peeking multiple times without calling 'packet_reader_read()' will return
+ * the same result.
+ */
+extern enum packet_read_status packet_reader_peek(struct packet_reader *reader);
+
 #define DEFAULT_PACKET_MAX 1000
 #define LARGE_PACKET_MAX 65520
 #define LARGE_PACKET_DATA_MAX (LARGE_PACKET_MAX - 4)
-- 
2.16.2.395.g2e18187dfd-goog



* [PATCH v4 03/35] pkt-line: add delim packet support
  2018-02-28 23:22     ` [PATCH v4 " Brandon Williams
  2018-02-28 23:22       ` [PATCH v4 01/35] pkt-line: introduce packet_read_with_status Brandon Williams
  2018-02-28 23:22       ` [PATCH v4 02/35] pkt-line: allow peeking a packet line without consuming it Brandon Williams
@ 2018-02-28 23:22       ` Brandon Williams
  2018-03-01 20:50         ` Junio C Hamano
  2018-02-28 23:22       ` [PATCH v4 04/35] upload-pack: convert to a builtin Brandon Williams
                         ` (32 subsequent siblings)
  35 siblings, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-02-28 23:22 UTC (permalink / raw)
  To: git
  Cc: git, gitster, jrnieder, pclouds, peff, sbeller, stolee, Brandon Williams

One of the design goals of protocol-v2 is to improve the semantics of
flush packets.  Currently in protocol-v1, flush packets are used both to
indicate a break in a list of packet lines as well as an indication that
one side has finished speaking.  This makes it particularly difficult
to implement proxies as a proxy would need to completely understand git
protocol instead of simply looking for a flush packet.

To do this, introduce the special delimiter packet '0001'.  A delim
packet can then be used as a delimiter between lists of packet lines
while flush packets can be reserved to indicate the end of a response.

Documentation for how this packet will be used in protocol v2 will be
included in a future patch.
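
To illustrate what this buys a proxy, the sketch below finds the end of
a message by walking pkt-line headers only, treating '0001' as a
section break and '0000' as the end.  It is a hypothetical helper, not
code from this series; 'hexval' and 'message_length' are made-up names.

```c
#include <stddef.h>

static int hexval(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/*
 * Return the number of bytes up to and including the first flush-pkt
 * ("0000") in 'buf', or 0 if 'buf' holds no complete message (or is
 * malformed).  Payloads are skipped without being inspected.
 */
static size_t message_length(const char *buf, size_t len)
{
	size_t pos = 0;

	while (len - pos >= 4) {
		size_t pkt = 0;
		int i;

		for (i = 0; i < 4; i++) {
			int v = hexval(buf[pos + i]);
			if (v < 0)
				return 0;
			pkt = (pkt << 4) | (size_t)v;
		}
		if (pkt == 0)
			return pos + 4;	/* flush-pkt: end of message */
		if (pkt == 1) {
			pos += 4;	/* delim-pkt: section break only */
			continue;
		}
		if (pkt < 4 || pkt > len - pos)
			return 0;	/* malformed or truncated */
		pos += pkt;
	}
	return 0;
}
```

Because a flush can only mean "end of message" under this scheme, the
helper never needs to know which command is being proxied.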

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 pkt-line.c | 17 +++++++++++++++++
 pkt-line.h |  3 +++
 2 files changed, 20 insertions(+)

diff --git a/pkt-line.c b/pkt-line.c
index 6307fa4a3..87a24bd17 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -91,6 +91,12 @@ void packet_flush(int fd)
 	write_or_die(fd, "0000", 4);
 }
 
+void packet_delim(int fd)
+{
+	packet_trace("0001", 4, 1);
+	write_or_die(fd, "0001", 4);
+}
+
 int packet_flush_gently(int fd)
 {
 	packet_trace("0000", 4, 1);
@@ -105,6 +111,12 @@ void packet_buf_flush(struct strbuf *buf)
 	strbuf_add(buf, "0000", 4);
 }
 
+void packet_buf_delim(struct strbuf *buf)
+{
+	packet_trace("0001", 4, 1);
+	strbuf_add(buf, "0001", 4);
+}
+
 static void set_packet_header(char *buf, const int size)
 {
 	static char hexchar[] = "0123456789abcdef";
@@ -298,6 +310,9 @@ enum packet_read_status packet_read_with_status(int fd, char **src_buffer,
 	} else if (!len) {
 		packet_trace("0000", 4, 0);
 		return PACKET_READ_FLUSH;
+	} else if (len == 1) {
+		packet_trace("0001", 4, 0);
+		return PACKET_READ_DELIM;
 	} else if (len < 4) {
 		die("protocol error: bad line length %d", len);
 	}
@@ -331,6 +346,7 @@ int packet_read(int fd, char **src_buffer, size_t *src_len,
 		break;
 	case PACKET_READ_NORMAL:
 		break;
+	case PACKET_READ_DELIM:
 	case PACKET_READ_FLUSH:
 		pktlen = 0;
 		break;
@@ -443,6 +459,7 @@ enum packet_read_status packet_reader_read(struct packet_reader *reader)
 	case PACKET_READ_NORMAL:
 		reader->line = reader->buffer;
 		break;
+	case PACKET_READ_DELIM:
 	case PACKET_READ_FLUSH:
 		reader->pktlen = 0;
 		reader->line = NULL;
diff --git a/pkt-line.h b/pkt-line.h
index f2edfae9a..3f836f01a 100644
--- a/pkt-line.h
+++ b/pkt-line.h
@@ -20,8 +20,10 @@
  * side can't, we stay with pure read/write interfaces.
  */
 void packet_flush(int fd);
+void packet_delim(int fd);
 void packet_write_fmt(int fd, const char *fmt, ...) __attribute__((format (printf, 2, 3)));
 void packet_buf_flush(struct strbuf *buf);
+void packet_buf_delim(struct strbuf *buf);
 void packet_write(int fd_out, const char *buf, size_t size);
 void packet_buf_write(struct strbuf *buf, const char *fmt, ...) __attribute__((format (printf, 2, 3)));
 int packet_flush_gently(int fd);
@@ -75,6 +77,7 @@ enum packet_read_status {
 	PACKET_READ_EOF = -1,
 	PACKET_READ_NORMAL,
 	PACKET_READ_FLUSH,
+	PACKET_READ_DELIM,
 };
 enum packet_read_status packet_read_with_status(int fd, char **src_buffer,
 						size_t *src_len, char *buffer,
-- 
2.16.2.395.g2e18187dfd-goog



* [PATCH v4 04/35] upload-pack: convert to a builtin
  2018-02-28 23:22     ` [PATCH v4 " Brandon Williams
                         ` (2 preceding siblings ...)
  2018-02-28 23:22       ` [PATCH v4 03/35] pkt-line: add delim packet support Brandon Williams
@ 2018-02-28 23:22       ` Brandon Williams
  2018-03-13 16:40         ` Jonathan Tan
  2018-02-28 23:22       ` [PATCH v4 05/35] upload-pack: factor out processing lines Brandon Williams
                         ` (31 subsequent siblings)
  35 siblings, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-02-28 23:22 UTC (permalink / raw)
  To: git
  Cc: git, gitster, jrnieder, pclouds, peff, sbeller, stolee, Brandon Williams

In order to allow for code sharing with the server-side of fetch in
protocol-v2, convert upload-pack to be a builtin.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 Makefile              |   3 +-
 builtin.h             |   1 +
 builtin/upload-pack.c |  67 ++++++++++++++++++++++++++
 git.c                 |   1 +
 upload-pack.c         | 107 ++++++++++--------------------------------
 upload-pack.h         |  13 +++++
 6 files changed, 109 insertions(+), 83 deletions(-)
 create mode 100644 builtin/upload-pack.c
 create mode 100644 upload-pack.h

diff --git a/Makefile b/Makefile
index 1a9b23b67..b7ccc05fa 100644
--- a/Makefile
+++ b/Makefile
@@ -639,7 +639,6 @@ PROGRAM_OBJS += imap-send.o
 PROGRAM_OBJS += sh-i18n--envsubst.o
 PROGRAM_OBJS += shell.o
 PROGRAM_OBJS += show-index.o
-PROGRAM_OBJS += upload-pack.o
 PROGRAM_OBJS += remote-testsvn.o
 
 # Binary suffix, set to .exe for Windows builds
@@ -909,6 +908,7 @@ LIB_OBJS += tree-diff.o
 LIB_OBJS += tree.o
 LIB_OBJS += tree-walk.o
 LIB_OBJS += unpack-trees.o
+LIB_OBJS += upload-pack.o
 LIB_OBJS += url.o
 LIB_OBJS += urlmatch.o
 LIB_OBJS += usage.o
@@ -1026,6 +1026,7 @@ BUILTIN_OBJS += builtin/update-index.o
 BUILTIN_OBJS += builtin/update-ref.o
 BUILTIN_OBJS += builtin/update-server-info.o
 BUILTIN_OBJS += builtin/upload-archive.o
+BUILTIN_OBJS += builtin/upload-pack.o
 BUILTIN_OBJS += builtin/var.o
 BUILTIN_OBJS += builtin/verify-commit.o
 BUILTIN_OBJS += builtin/verify-pack.o
diff --git a/builtin.h b/builtin.h
index 42378f3aa..f332a1257 100644
--- a/builtin.h
+++ b/builtin.h
@@ -231,6 +231,7 @@ extern int cmd_update_ref(int argc, const char **argv, const char *prefix);
 extern int cmd_update_server_info(int argc, const char **argv, const char *prefix);
 extern int cmd_upload_archive(int argc, const char **argv, const char *prefix);
 extern int cmd_upload_archive_writer(int argc, const char **argv, const char *prefix);
+extern int cmd_upload_pack(int argc, const char **argv, const char *prefix);
 extern int cmd_var(int argc, const char **argv, const char *prefix);
 extern int cmd_verify_commit(int argc, const char **argv, const char *prefix);
 extern int cmd_verify_tag(int argc, const char **argv, const char *prefix);
diff --git a/builtin/upload-pack.c b/builtin/upload-pack.c
new file mode 100644
index 000000000..2cb5cb35b
--- /dev/null
+++ b/builtin/upload-pack.c
@@ -0,0 +1,67 @@
+#include "cache.h"
+#include "builtin.h"
+#include "exec_cmd.h"
+#include "pkt-line.h"
+#include "parse-options.h"
+#include "protocol.h"
+#include "upload-pack.h"
+
+static const char * const upload_pack_usage[] = {
+	N_("git upload-pack [<options>] <dir>"),
+	NULL
+};
+
+int cmd_upload_pack(int argc, const char **argv, const char *prefix)
+{
+	const char *dir;
+	int strict = 0;
+	struct upload_pack_options opts = { 0 };
+	struct option options[] = {
+		OPT_BOOL(0, "stateless-rpc", &opts.stateless_rpc,
+			 N_("quit after a single request/response exchange")),
+		OPT_BOOL(0, "advertise-refs", &opts.advertise_refs,
+			 N_("exit immediately after initial ref advertisement")),
+		OPT_BOOL(0, "strict", &strict,
+			 N_("do not try <directory>/.git/ if <directory> is no Git directory")),
+		OPT_INTEGER(0, "timeout", &opts.timeout,
+			    N_("interrupt transfer after <n> seconds of inactivity")),
+		OPT_END()
+	};
+
+	packet_trace_identity("upload-pack");
+	check_replace_refs = 0;
+
+	argc = parse_options(argc, argv, NULL, options, upload_pack_usage, 0);
+
+	if (argc != 1)
+		usage_with_options(upload_pack_usage, options);
+
+	if (opts.timeout)
+		opts.daemon_mode = 1;
+
+	setup_path();
+
+	dir = argv[0];
+
+	if (!enter_repo(dir, strict))
+		die("'%s' does not appear to be a git repository", dir);
+
+	switch (determine_protocol_version_server()) {
+	case protocol_v1:
+		/*
+		 * v1 is just the original protocol with a version string,
+		 * so just fall through after writing the version string.
+		 */
+		if (opts.advertise_refs || !opts.stateless_rpc)
+			packet_write_fmt(1, "version 1\n");
+
+		/* fallthrough */
+	case protocol_v0:
+		upload_pack(&opts);
+		break;
+	case protocol_unknown_version:
+		BUG("unknown protocol version");
+	}
+
+	return 0;
+}
diff --git a/git.c b/git.c
index c870b9719..f71073dc8 100644
--- a/git.c
+++ b/git.c
@@ -478,6 +478,7 @@ static struct cmd_struct commands[] = {
 	{ "update-server-info", cmd_update_server_info, RUN_SETUP },
 	{ "upload-archive", cmd_upload_archive },
 	{ "upload-archive--writer", cmd_upload_archive_writer },
+	{ "upload-pack", cmd_upload_pack },
 	{ "var", cmd_var, RUN_SETUP_GENTLY },
 	{ "verify-commit", cmd_verify_commit, RUN_SETUP },
 	{ "verify-pack", cmd_verify_pack },
diff --git a/upload-pack.c b/upload-pack.c
index d5de18127..2ad73a98b 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -6,7 +6,6 @@
 #include "tag.h"
 #include "object.h"
 #include "commit.h"
-#include "exec_cmd.h"
 #include "diff.h"
 #include "revision.h"
 #include "list-objects.h"
@@ -15,15 +14,10 @@
 #include "sigchain.h"
 #include "version.h"
 #include "string-list.h"
-#include "parse-options.h"
 #include "argv-array.h"
 #include "prio-queue.h"
 #include "protocol.h"
-
-static const char * const upload_pack_usage[] = {
-	N_("git upload-pack [<options>] <dir>"),
-	NULL
-};
+#include "upload-pack.h"
 
 /* Remember to update object flag allocation in object.h */
 #define THEY_HAVE	(1u << 11)
@@ -61,7 +55,6 @@ static int keepalive = 5;
  * otherwise maximum packet size (up to 65520 bytes).
  */
 static int use_sideband;
-static int advertise_refs;
 static int stateless_rpc;
 static const char *pack_objects_hook;
 
@@ -977,33 +970,6 @@ static int find_symref(const char *refname, const struct object_id *oid,
 	return 0;
 }
 
-static void upload_pack(void)
-{
-	struct string_list symref = STRING_LIST_INIT_DUP;
-
-	head_ref_namespaced(find_symref, &symref);
-
-	if (advertise_refs || !stateless_rpc) {
-		reset_timeout();
-		head_ref_namespaced(send_ref, &symref);
-		for_each_namespaced_ref(send_ref, &symref);
-		advertise_shallow_grafts(1);
-		packet_flush(1);
-	} else {
-		head_ref_namespaced(check_ref, NULL);
-		for_each_namespaced_ref(check_ref, NULL);
-	}
-	string_list_clear(&symref, 1);
-	if (advertise_refs)
-		return;
-
-	receive_needs();
-	if (want_obj.nr) {
-		get_common_commits();
-		create_pack_file();
-	}
-}
-
 static int upload_pack_config(const char *var, const char *value, void *unused)
 {
 	if (!strcmp("uploadpack.allowtipsha1inwant", var)) {
@@ -1032,58 +998,35 @@ static int upload_pack_config(const char *var, const char *value, void *unused)
 	return parse_hide_refs_config(var, value, "uploadpack");
 }
 
-int cmd_main(int argc, const char **argv)
+void upload_pack(struct upload_pack_options *options)
 {
-	const char *dir;
-	int strict = 0;
-	struct option options[] = {
-		OPT_BOOL(0, "stateless-rpc", &stateless_rpc,
-			 N_("quit after a single request/response exchange")),
-		OPT_BOOL(0, "advertise-refs", &advertise_refs,
-			 N_("exit immediately after initial ref advertisement")),
-		OPT_BOOL(0, "strict", &strict,
-			 N_("do not try <directory>/.git/ if <directory> is no Git directory")),
-		OPT_INTEGER(0, "timeout", &timeout,
-			    N_("interrupt transfer after <n> seconds of inactivity")),
-		OPT_END()
-	};
-
-	packet_trace_identity("upload-pack");
-	check_replace_refs = 0;
-
-	argc = parse_options(argc, argv, NULL, options, upload_pack_usage, 0);
-
-	if (argc != 1)
-		usage_with_options(upload_pack_usage, options);
-
-	if (timeout)
-		daemon_mode = 1;
-
-	setup_path();
-
-	dir = argv[0];
+	struct string_list symref = STRING_LIST_INIT_DUP;
 
-	if (!enter_repo(dir, strict))
-		die("'%s' does not appear to be a git repository", dir);
+	stateless_rpc = options->stateless_rpc;
+	timeout = options->timeout;
+	daemon_mode = options->daemon_mode;
 
 	git_config(upload_pack_config, NULL);
 
-	switch (determine_protocol_version_server()) {
-	case protocol_v1:
-		/*
-		 * v1 is just the original protocol with a version string,
-		 * so just fall through after writing the version string.
-		 */
-		if (advertise_refs || !stateless_rpc)
-			packet_write_fmt(1, "version 1\n");
-
-		/* fallthrough */
-	case protocol_v0:
-		upload_pack();
-		break;
-	case protocol_unknown_version:
-		BUG("unknown protocol version");
+	head_ref_namespaced(find_symref, &symref);
+
+	if (options->advertise_refs || !stateless_rpc) {
+		reset_timeout();
+		head_ref_namespaced(send_ref, &symref);
+		for_each_namespaced_ref(send_ref, &symref);
+		advertise_shallow_grafts(1);
+		packet_flush(1);
+	} else {
+		head_ref_namespaced(check_ref, NULL);
+		for_each_namespaced_ref(check_ref, NULL);
 	}
+	string_list_clear(&symref, 1);
+	if (options->advertise_refs)
+		return;
 
-	return 0;
+	receive_needs();
+	if (want_obj.nr) {
+		get_common_commits();
+		create_pack_file();
+	}
 }
diff --git a/upload-pack.h b/upload-pack.h
new file mode 100644
index 000000000..a71e4dc7e
--- /dev/null
+++ b/upload-pack.h
@@ -0,0 +1,13 @@
+#ifndef UPLOAD_PACK_H
+#define UPLOAD_PACK_H
+
+struct upload_pack_options {
+	int stateless_rpc;
+	int advertise_refs;
+	unsigned int timeout;
+	int daemon_mode;
+};
+
+void upload_pack(struct upload_pack_options *options);
+
+#endif /* UPLOAD_PACK_H */
-- 
2.16.2.395.g2e18187dfd-goog


* [PATCH v4 05/35] upload-pack: factor out processing lines
  2018-02-28 23:22     ` [PATCH v4 " Brandon Williams
                         ` (3 preceding siblings ...)
  2018-02-28 23:22       ` [PATCH v4 04/35] upload-pack: convert to a builtin Brandon Williams
@ 2018-02-28 23:22       ` Brandon Williams
  2018-03-01 21:25         ` Junio C Hamano
  2018-02-28 23:22       ` [PATCH v4 06/35] transport: use get_refs_via_connect to get refs Brandon Williams
                         ` (30 subsequent siblings)
  35 siblings, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-02-28 23:22 UTC (permalink / raw)
  To: git
  Cc: git, gitster, jrnieder, pclouds, peff, sbeller, stolee, Brandon Williams

Factor out the logic for processing shallow, deepen, deepen-since, and
deepen-not lines into their own functions.  This simplifies
'receive_needs()' and makes it easier to reuse this logic when
implementing protocol_v2.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 upload-pack.c | 113 +++++++++++++++++++++++++++++++++-----------------
 1 file changed, 74 insertions(+), 39 deletions(-)
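
As a rough illustration of the shape these helpers take, here is a minimal
standalone sketch of a "deepen <n>" line parser.  It is not the code from
this patch: 'skip_prefix_sketch()' and 'parse_deepen_sketch()' are
hypothetical names, and the sketch returns -1 on malformed input where the
real helper calls die().

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for Git's skip_prefix(): if 'str' starts with
 * 'prefix', store a pointer to the remainder in '*out' and return 1.
 */
static int skip_prefix_sketch(const char *str, const char *prefix,
			      const char **out)
{
	size_t n = strlen(prefix);

	if (strncmp(str, prefix, n))
		return 0;
	*out = str + n;
	return 1;
}

/*
 * Returns 1 if 'line' is a well-formed "deepen <n>" line (storing n in
 * '*depth'), 0 if it is some other kind of line, and -1 if it is a
 * malformed deepen line.
 */
static int parse_deepen_sketch(const char *line, int *depth)
{
	const char *arg;
	char *end = NULL;
	long value;

	assert(line && depth);
	if (!skip_prefix_sketch(line, "deepen ", &arg))
		return 0;
	value = strtol(arg, &end, 0);
	/* Reject empty numbers, trailing garbage, and non-positive depths. */
	if (end == arg || *end || value <= 0 || value > INT_MAX)
		return -1;
	*depth = (int)value;
	return 1;
}
```

Returning 0 for "not my line" lets a caller chain several such helpers and
move on to the next input line as soon as one of them claims it, which is
the pattern 'receive_needs()' follows after this change.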

diff --git a/upload-pack.c b/upload-pack.c
index 2ad73a98b..1e8a9e1ca 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -724,6 +724,75 @@ static void deepen_by_rev_list(int ac, const char **av,
 	packet_flush(1);
 }
 
+static int process_shallow(const char *line, struct object_array *shallows)
+{
+	const char *arg;
+	if (skip_prefix(line, "shallow ", &arg)) {
+		struct object_id oid;
+		struct object *object;
+		if (get_oid_hex(arg, &oid))
+			die("invalid shallow line: %s", line);
+		object = parse_object(&oid);
+		if (!object)
+			return 1;
+		if (object->type != OBJ_COMMIT)
+			die("invalid shallow object %s", oid_to_hex(&oid));
+		if (!(object->flags & CLIENT_SHALLOW)) {
+			object->flags |= CLIENT_SHALLOW;
+			add_object_array(object, NULL, shallows);
+		}
+		return 1;
+	}
+
+	return 0;
+}
+
+static int process_deepen(const char *line, int *depth)
+{
+	const char *arg;
+	if (skip_prefix(line, "deepen ", &arg)) {
+		char *end = NULL;
+		*depth = (int)strtol(arg, &end, 0);
+		if (!end || *end || *depth <= 0)
+			die("Invalid deepen: %s", line);
+		return 1;
+	}
+
+	return 0;
+}
+
+static int process_deepen_since(const char *line, timestamp_t *deepen_since, int *deepen_rev_list)
+{
+	const char *arg;
+	if (skip_prefix(line, "deepen-since ", &arg)) {
+		char *end = NULL;
+		*deepen_since = parse_timestamp(arg, &end, 0);
+		if (!end || *end || !*deepen_since ||
+		    /* revisions.c's max_age -1 is special */
+		    *deepen_since == -1)
+			die("Invalid deepen-since: %s", line);
+		*deepen_rev_list = 1;
+		return 1;
+	}
+	return 0;
+}
+
+static int process_deepen_not(const char *line, struct string_list *deepen_not, int *deepen_rev_list)
+{
+	const char *arg;
+	if (skip_prefix(line, "deepen-not ", &arg)) {
+		char *ref = NULL;
+		struct object_id oid;
+		if (expand_ref(arg, strlen(arg), &oid, &ref) != 1)
+			die("git upload-pack: ambiguous deepen-not: %s", line);
+		string_list_append(deepen_not, ref);
+		free(ref);
+		*deepen_rev_list = 1;
+		return 1;
+	}
+	return 0;
+}
+
 static void receive_needs(void)
 {
 	struct object_array shallows = OBJECT_ARRAY_INIT;
@@ -745,49 +814,15 @@ static void receive_needs(void)
 		if (!line)
 			break;
 
-		if (skip_prefix(line, "shallow ", &arg)) {
-			struct object_id oid;
-			struct object *object;
-			if (get_oid_hex(arg, &oid))
-				die("invalid shallow line: %s", line);
-			object = parse_object(&oid);
-			if (!object)
-				continue;
-			if (object->type != OBJ_COMMIT)
-				die("invalid shallow object %s", oid_to_hex(&oid));
-			if (!(object->flags & CLIENT_SHALLOW)) {
-				object->flags |= CLIENT_SHALLOW;
-				add_object_array(object, NULL, &shallows);
-			}
+		if (process_shallow(line, &shallows))
 			continue;
-		}
-		if (skip_prefix(line, "deepen ", &arg)) {
-			char *end = NULL;
-			depth = strtol(arg, &end, 0);
-			if (!end || *end || depth <= 0)
-				die("Invalid deepen: %s", line);
+		if (process_deepen(line, &depth))
 			continue;
-		}
-		if (skip_prefix(line, "deepen-since ", &arg)) {
-			char *end = NULL;
-			deepen_since = parse_timestamp(arg, &end, 0);
-			if (!end || *end || !deepen_since ||
-			    /* revisions.c's max_age -1 is special */
-			    deepen_since == -1)
-				die("Invalid deepen-since: %s", line);
-			deepen_rev_list = 1;
+		if (process_deepen_since(line, &deepen_since, &deepen_rev_list))
 			continue;
-		}
-		if (skip_prefix(line, "deepen-not ", &arg)) {
-			char *ref = NULL;
-			struct object_id oid;
-			if (expand_ref(arg, strlen(arg), &oid, &ref) != 1)
-				die("git upload-pack: ambiguous deepen-not: %s", line);
-			string_list_append(&deepen_not, ref);
-			free(ref);
-			deepen_rev_list = 1;
+		if (process_deepen_not(line, &deepen_not, &deepen_rev_list))
 			continue;
-		}
+
 		if (!skip_prefix(line, "want ", &arg) ||
 		    get_oid_hex(arg, &oid_buf))
 			die("git upload-pack: protocol error, "
-- 
2.16.2.395.g2e18187dfd-goog


* [PATCH v4 06/35] transport: use get_refs_via_connect to get refs
  2018-02-28 23:22     ` [PATCH v4 " Brandon Williams
                         ` (4 preceding siblings ...)
  2018-02-28 23:22       ` [PATCH v4 05/35] upload-pack: factor out processing lines Brandon Williams
@ 2018-02-28 23:22       ` Brandon Williams
  2018-03-01 21:25         ` Junio C Hamano
  2018-02-28 23:22       ` [PATCH v4 07/35] connect: convert get_remote_heads to use struct packet_reader Brandon Williams
                         ` (29 subsequent siblings)
  35 siblings, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-02-28 23:22 UTC (permalink / raw)
  To: git
  Cc: git, gitster, jrnieder, pclouds, peff, sbeller, stolee, Brandon Williams

Remove code duplication and use the existing 'get_refs_via_connect()'
function to retrieve a remote's heads in 'fetch_refs_via_pack()' and
'git_transport_push()'.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 transport.c | 18 ++++--------------
 1 file changed, 4 insertions(+), 14 deletions(-)

diff --git a/transport.c b/transport.c
index fc802260f..8e8779096 100644
--- a/transport.c
+++ b/transport.c
@@ -230,12 +230,8 @@ static int fetch_refs_via_pack(struct transport *transport,
 	args.cloning = transport->cloning;
 	args.update_shallow = data->options.update_shallow;
 
-	if (!data->got_remote_heads) {
-		connect_setup(transport, 0);
-		get_remote_heads(data->fd[0], NULL, 0, &refs_tmp, 0,
-				 NULL, &data->shallow);
-		data->got_remote_heads = 1;
-	}
+	if (!data->got_remote_heads)
+		refs_tmp = get_refs_via_connect(transport, 0);
 
 	refs = fetch_pack(&args, data->fd, data->conn,
 			  refs_tmp ? refs_tmp : transport->remote_refs,
@@ -541,14 +537,8 @@ static int git_transport_push(struct transport *transport, struct ref *remote_re
 	struct send_pack_args args;
 	int ret;
 
-	if (!data->got_remote_heads) {
-		struct ref *tmp_refs;
-		connect_setup(transport, 1);
-
-		get_remote_heads(data->fd[0], NULL, 0, &tmp_refs, REF_NORMAL,
-				 NULL, &data->shallow);
-		data->got_remote_heads = 1;
-	}
+	if (!data->got_remote_heads)
+		get_refs_via_connect(transport, 1);
 
 	memset(&args, 0, sizeof(args));
 	args.send_mirror = !!(flags & TRANSPORT_PUSH_MIRROR);
-- 
2.16.2.395.g2e18187dfd-goog


* [PATCH v4 07/35] connect: convert get_remote_heads to use struct packet_reader
  2018-02-28 23:22     ` [PATCH v4 " Brandon Williams
                         ` (5 preceding siblings ...)
  2018-02-28 23:22       ` [PATCH v4 06/35] transport: use get_refs_via_connect to get refs Brandon Williams
@ 2018-02-28 23:22       ` Brandon Williams
  2018-02-28 23:22       ` [PATCH v4 08/35] connect: discover protocol version outside of get_remote_heads Brandon Williams
                         ` (28 subsequent siblings)
  35 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-02-28 23:22 UTC (permalink / raw)
  To: git
  Cc: git, gitster, jrnieder, pclouds, peff, sbeller, stolee, Brandon Williams

In order to allow for better control flow when protocol_v2 is introduced,
convert 'get_remote_heads()' to use 'struct packet_reader' to read
packet lines.  This enables a client to peek at the first line of a
server's response (without consuming it) in order to determine the
protocol version it's speaking and then pass control to the
appropriate handler.

This is needed because the initial response from a server speaking
protocol_v0 includes the first ref, while subsequent protocol versions
respond with a version line.  We want to be able to read this first line
without consuming the first ref sent in the protocol_v0 case so that the
protocol version the server is speaking can be determined outside of
'get_remote_heads()' in a future patch.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 connect.c | 173 ++++++++++++++++++++++++++++++------------------------
 1 file changed, 95 insertions(+), 78 deletions(-)
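
To make the peek-without-consuming idea concrete, here is a minimal sketch
of a reader with that behaviour.  It works on an in-memory array of lines
rather than pkt-lines, and 'struct line_reader' and its functions are
hypothetical names used only for illustration; they are not the
'struct packet_reader' API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reader over an in-memory array of lines. */
struct line_reader {
	const char **lines;	/* remaining input */
	size_t nr;		/* number of entries left in 'lines' */
	const char *line;	/* last line read or peeked, NULL at EOF */
	int peeked;		/* set if 'line' was peeked but not consumed */
};

/* Return the next line, or the pending peeked one; NULL at end of input. */
static const char *line_reader_read(struct line_reader *r)
{
	assert(r);
	if (r->peeked) {
		r->peeked = 0;
		return r->line;
	}
	r->line = r->nr ? *r->lines : NULL;
	if (r->nr) {
		r->lines++;
		r->nr--;
	}
	return r->line;
}

/* Look at the next line without consuming it. */
static const char *line_reader_peek(struct line_reader *r)
{
	const char *line;

	assert(r);
	if (r->peeked)
		return r->line;
	line = line_reader_read(r);
	r->peeked = 1;
	return line;
}
```

With a reader like this, a caller can peek at the first line, decide
whether it is a version line or already the first ref, and only consume it
in the former case, so a v0 parser still sees the first ref.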

diff --git a/connect.c b/connect.c
index c3a014c5b..c82c90b7c 100644
--- a/connect.c
+++ b/connect.c
@@ -48,6 +48,12 @@ int check_ref_type(const struct ref *ref, int flags)
 
 static void die_initial_contact(int unexpected)
 {
+	/*
+	 * A hang-up after seeing some response from the other end
+	 * means that it is unexpected, as we know the other end is
+	 * willing to talk to us.  A hang-up before seeing any
+	 * response does not necessarily mean an ACL problem, though.
+	 */
 	if (unexpected)
 		die(_("The remote end hung up upon initial contact"));
 	else
@@ -56,6 +62,40 @@ static void die_initial_contact(int unexpected)
 		      "and the repository exists."));
 }
 
+static enum protocol_version discover_version(struct packet_reader *reader)
+{
+	enum protocol_version version = protocol_unknown_version;
+
+	/*
+	 * Peek the first line of the server's response to
+	 * determine the protocol version the server is speaking.
+	 */
+	switch (packet_reader_peek(reader)) {
+	case PACKET_READ_EOF:
+		die_initial_contact(0);
+	case PACKET_READ_FLUSH:
+	case PACKET_READ_DELIM:
+		version = protocol_v0;
+		break;
+	case PACKET_READ_NORMAL:
+		version = determine_protocol_version_client(reader->line);
+		break;
+	}
+
+	switch (version) {
+	case protocol_v1:
+		/* Read the peeked version line */
+		packet_reader_read(reader);
+		break;
+	case protocol_v0:
+		break;
+	case protocol_unknown_version:
+		BUG("unknown protocol version");
+	}
+
+	return version;
+}
+
 static void parse_one_symref_info(struct string_list *symref, const char *val, int len)
 {
 	char *sym, *target;
@@ -109,60 +149,21 @@ static void annotate_refs_with_symref_info(struct ref *ref)
 	string_list_clear(&symref, 0);
 }
 
-/*
- * Read one line of a server's ref advertisement into packet_buffer.
- */
-static int read_remote_ref(int in, char **src_buf, size_t *src_len,
-			   int *responded)
+static void process_capabilities(const char *line, int *len)
 {
-	int len = packet_read(in, src_buf, src_len,
-			      packet_buffer, sizeof(packet_buffer),
-			      PACKET_READ_GENTLE_ON_EOF |
-			      PACKET_READ_CHOMP_NEWLINE);
-	const char *arg;
-	if (len < 0)
-		die_initial_contact(*responded);
-	if (len > 4 && skip_prefix(packet_buffer, "ERR ", &arg))
-		die("remote error: %s", arg);
-
-	*responded = 1;
-
-	return len;
-}
-
-#define EXPECTING_PROTOCOL_VERSION 0
-#define EXPECTING_FIRST_REF 1
-#define EXPECTING_REF 2
-#define EXPECTING_SHALLOW 3
-
-/* Returns 1 if packet_buffer is a protocol version pkt-line, 0 otherwise. */
-static int process_protocol_version(void)
-{
-	switch (determine_protocol_version_client(packet_buffer)) {
-	case protocol_v1:
-		return 1;
-	case protocol_v0:
-		return 0;
-	default:
-		die("server is speaking an unknown protocol");
-	}
-}
-
-static void process_capabilities(int *len)
-{
-	int nul_location = strlen(packet_buffer);
+	int nul_location = strlen(line);
 	if (nul_location == *len)
 		return;
-	server_capabilities = xstrdup(packet_buffer + nul_location + 1);
+	server_capabilities = xstrdup(line + nul_location + 1);
 	*len = nul_location;
 }
 
-static int process_dummy_ref(void)
+static int process_dummy_ref(const char *line)
 {
 	struct object_id oid;
 	const char *name;
 
-	if (parse_oid_hex(packet_buffer, &oid, &name))
+	if (parse_oid_hex(line, &oid, &name))
 		return 0;
 	if (*name != ' ')
 		return 0;
@@ -171,20 +172,20 @@ static int process_dummy_ref(void)
 	return !oidcmp(&null_oid, &oid) && !strcmp(name, "capabilities^{}");
 }
 
-static void check_no_capabilities(int len)
+static void check_no_capabilities(const char *line, int len)
 {
-	if (strlen(packet_buffer) != len)
+	if (strlen(line) != len)
 		warning("Ignoring capabilities after first line '%s'",
-			packet_buffer + strlen(packet_buffer));
+			line + strlen(line));
 }
 
-static int process_ref(int len, struct ref ***list, unsigned int flags,
-		       struct oid_array *extra_have)
+static int process_ref(const char *line, int len, struct ref ***list,
+		       unsigned int flags, struct oid_array *extra_have)
 {
 	struct object_id old_oid;
 	const char *name;
 
-	if (parse_oid_hex(packet_buffer, &old_oid, &name))
+	if (parse_oid_hex(line, &old_oid, &name))
 		return 0;
 	if (*name != ' ')
 		return 0;
@@ -200,16 +201,17 @@ static int process_ref(int len, struct ref ***list, unsigned int flags,
 		**list = ref;
 		*list = &ref->next;
 	}
-	check_no_capabilities(len);
+	check_no_capabilities(line, len);
 	return 1;
 }
 
-static int process_shallow(int len, struct oid_array *shallow_points)
+static int process_shallow(const char *line, int len,
+			   struct oid_array *shallow_points)
 {
 	const char *arg;
 	struct object_id old_oid;
 
-	if (!skip_prefix(packet_buffer, "shallow ", &arg))
+	if (!skip_prefix(line, "shallow ", &arg))
 		return 0;
 
 	if (get_oid_hex(arg, &old_oid))
@@ -217,10 +219,17 @@ static int process_shallow(int len, struct oid_array *shallow_points)
 	if (!shallow_points)
 		die("repository on the other end cannot be shallow");
 	oid_array_append(shallow_points, &old_oid);
-	check_no_capabilities(len);
+	check_no_capabilities(line, len);
 	return 1;
 }
 
+enum get_remote_heads_state {
+	EXPECTING_FIRST_REF = 0,
+	EXPECTING_REF,
+	EXPECTING_SHALLOW,
+	EXPECTING_DONE,
+};
+
 /*
  * Read all the refs from the other end
  */
@@ -230,47 +239,55 @@ struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
 			      struct oid_array *shallow_points)
 {
 	struct ref **orig_list = list;
+	int len = 0;
+	enum get_remote_heads_state state = EXPECTING_FIRST_REF;
+	struct packet_reader reader;
+	const char *arg;
 
-	/*
-	 * A hang-up after seeing some response from the other end
-	 * means that it is unexpected, as we know the other end is
-	 * willing to talk to us.  A hang-up before seeing any
-	 * response does not necessarily mean an ACL problem, though.
-	 */
-	int responded = 0;
-	int len;
-	int state = EXPECTING_PROTOCOL_VERSION;
+	packet_reader_init(&reader, in, src_buf, src_len,
+			   PACKET_READ_CHOMP_NEWLINE |
+			   PACKET_READ_GENTLE_ON_EOF);
+
+	discover_version(&reader);
 
 	*list = NULL;
 
-	while ((len = read_remote_ref(in, &src_buf, &src_len, &responded))) {
+	while (state != EXPECTING_DONE) {
+		switch (packet_reader_read(&reader)) {
+		case PACKET_READ_EOF:
+			die_initial_contact(1);
+		case PACKET_READ_NORMAL:
+			len = reader.pktlen;
+			if (len > 4 && skip_prefix(reader.line, "ERR ", &arg))
+				die("remote error: %s", arg);
+			break;
+		case PACKET_READ_FLUSH:
+			state = EXPECTING_DONE;
+			break;
+		case PACKET_READ_DELIM:
+			die("invalid packet");
+		}
+
 		switch (state) {
-		case EXPECTING_PROTOCOL_VERSION:
-			if (process_protocol_version()) {
-				state = EXPECTING_FIRST_REF;
-				break;
-			}
-			state = EXPECTING_FIRST_REF;
-			/* fallthrough */
 		case EXPECTING_FIRST_REF:
-			process_capabilities(&len);
-			if (process_dummy_ref()) {
+			process_capabilities(reader.line, &len);
+			if (process_dummy_ref(reader.line)) {
 				state = EXPECTING_SHALLOW;
 				break;
 			}
 			state = EXPECTING_REF;
 			/* fallthrough */
 		case EXPECTING_REF:
-			if (process_ref(len, &list, flags, extra_have))
+			if (process_ref(reader.line, len, &list, flags, extra_have))
 				break;
 			state = EXPECTING_SHALLOW;
 			/* fallthrough */
 		case EXPECTING_SHALLOW:
-			if (process_shallow(len, shallow_points))
+			if (process_shallow(reader.line, len, shallow_points))
 				break;
-			die("protocol error: unexpected '%s'", packet_buffer);
-		default:
-			die("unexpected state %d", state);
+			die("protocol error: unexpected '%s'", reader.line);
+		case EXPECTING_DONE:
+			break;
 		}
 	}
 
-- 
2.16.2.395.g2e18187dfd-goog


* [PATCH v4 08/35] connect: discover protocol version outside of get_remote_heads
  2018-02-28 23:22     ` [PATCH v4 " Brandon Williams
                         ` (6 preceding siblings ...)
  2018-02-28 23:22       ` [PATCH v4 07/35] connect: convert get_remote_heads to use struct packet_reader Brandon Williams
@ 2018-02-28 23:22       ` Brandon Williams
  2018-03-13 15:49         ` Jonathan Tan
  2018-02-28 23:22       ` [PATCH v4 09/35] transport: store protocol version Brandon Williams
                         ` (27 subsequent siblings)
  35 siblings, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-02-28 23:22 UTC (permalink / raw)
  To: git
  Cc: git, gitster, jrnieder, pclouds, peff, sbeller, stolee, Brandon Williams

In order to prepare for the addition of protocol_v2, push the protocol
version discovery outside of 'get_remote_heads()'.  This will allow for
keeping the logic for processing the reference advertisement for
protocol_v1 and protocol_v0 separate from the logic for protocol_v2.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 builtin/fetch-pack.c | 16 +++++++++++++++-
 builtin/send-pack.c  | 17 +++++++++++++++--
 connect.c            | 27 ++++++++++-----------------
 connect.h            |  3 +++
 remote-curl.c        | 20 ++++++++++++++++++--
 remote.h             |  5 +++--
 transport.c          | 24 +++++++++++++++++++-----
 7 files changed, 83 insertions(+), 29 deletions(-)
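
The caller-side pattern this introduces can be sketched as follows:
classify the first line of the response, then dispatch on the result.
This is a simplified illustration, not the logic of
'determine_protocol_version_client()'.  The enum, both function names, and
the classification rules (a line starting with "version " names a version,
anything else is treated as v0, and the trailing newline is assumed to have
been chomped already) are assumptions made for the example.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical version enum for this sketch. */
enum version_sketch {
	VERSION_UNKNOWN = -1,
	VERSION_0 = 0,
	VERSION_1 = 1
};

/*
 * Simplified classification of the first response line:
 * "version 1" is v1, any other "version <x>" is unknown, and anything
 * else (for example a ref line) is taken to mean v0.
 */
static enum version_sketch classify_first_line(const char *line)
{
	static const char prefix[] = "version ";

	assert(line);
	if (strncmp(line, prefix, sizeof(prefix) - 1))
		return VERSION_0;
	if (!strcmp(line + sizeof(prefix) - 1, "1"))
		return VERSION_1;
	return VERSION_UNKNOWN;
}

/* Caller-side dispatch: v0 and v1 share the ref advertisement parser. */
static const char *handler_for(const char *first_line)
{
	switch (classify_first_line(first_line)) {
	case VERSION_1:
	case VERSION_0:
		return "ref advertisement";
	case VERSION_UNKNOWN:
		break;
	}
	return "unsupported";
}
```

Keeping the switch in each caller means a later version can get its own
case there without the v0/v1 advertisement parser having to know about it.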

diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
index 366b9d13f..85d4faf76 100644
--- a/builtin/fetch-pack.c
+++ b/builtin/fetch-pack.c
@@ -4,6 +4,7 @@
 #include "remote.h"
 #include "connect.h"
 #include "sha1-array.h"
+#include "protocol.h"
 
 static const char fetch_pack_usage[] =
 "git fetch-pack [--all] [--stdin] [--quiet | -q] [--keep | -k] [--thin] "
@@ -52,6 +53,7 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix)
 	struct fetch_pack_args args;
 	struct oid_array shallow = OID_ARRAY_INIT;
 	struct string_list deepen_not = STRING_LIST_INIT_DUP;
+	struct packet_reader reader;
 
 	packet_trace_identity("fetch-pack");
 
@@ -193,7 +195,19 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix)
 		if (!conn)
 			return args.diag_url ? 0 : 1;
 	}
-	get_remote_heads(fd[0], NULL, 0, &ref, 0, NULL, &shallow);
+
+	packet_reader_init(&reader, fd[0], NULL, 0,
+			   PACKET_READ_CHOMP_NEWLINE |
+			   PACKET_READ_GENTLE_ON_EOF);
+
+	switch (discover_version(&reader)) {
+	case protocol_v1:
+	case protocol_v0:
+		get_remote_heads(&reader, &ref, 0, NULL, &shallow);
+		break;
+	case protocol_unknown_version:
+		BUG("unknown protocol version");
+	}
 
 	ref = fetch_pack(&args, fd, conn, ref, dest, sought, nr_sought,
 			 &shallow, pack_lockfile_ptr);
diff --git a/builtin/send-pack.c b/builtin/send-pack.c
index fc4f0bb5f..83cb125a6 100644
--- a/builtin/send-pack.c
+++ b/builtin/send-pack.c
@@ -14,6 +14,7 @@
 #include "sha1-array.h"
 #include "gpg-interface.h"
 #include "gettext.h"
+#include "protocol.h"
 
 static const char * const send_pack_usage[] = {
 	N_("git send-pack [--all | --mirror] [--dry-run] [--force] "
@@ -154,6 +155,7 @@ int cmd_send_pack(int argc, const char **argv, const char *prefix)
 	int progress = -1;
 	int from_stdin = 0;
 	struct push_cas_option cas = {0};
+	struct packet_reader reader;
 
 	struct option options[] = {
 		OPT__VERBOSITY(&verbose),
@@ -256,8 +258,19 @@ int cmd_send_pack(int argc, const char **argv, const char *prefix)
 			args.verbose ? CONNECT_VERBOSE : 0);
 	}
 
-	get_remote_heads(fd[0], NULL, 0, &remote_refs, REF_NORMAL,
-			 &extra_have, &shallow);
+	packet_reader_init(&reader, fd[0], NULL, 0,
+			   PACKET_READ_CHOMP_NEWLINE |
+			   PACKET_READ_GENTLE_ON_EOF);
+
+	switch (discover_version(&reader)) {
+	case protocol_v1:
+	case protocol_v0:
+		get_remote_heads(&reader, &remote_refs, REF_NORMAL,
+				 &extra_have, &shallow);
+		break;
+	case protocol_unknown_version:
+		BUG("unknown protocol version");
+	}
 
 	transport_verify_remote_names(nr_refspecs, refspecs);
 
diff --git a/connect.c b/connect.c
index c82c90b7c..0b111e62d 100644
--- a/connect.c
+++ b/connect.c
@@ -62,7 +62,7 @@ static void die_initial_contact(int unexpected)
 		      "and the repository exists."));
 }
 
-static enum protocol_version discover_version(struct packet_reader *reader)
+enum protocol_version discover_version(struct packet_reader *reader)
 {
 	enum protocol_version version = protocol_unknown_version;
 
@@ -233,7 +233,7 @@ enum get_remote_heads_state {
 /*
  * Read all the refs from the other end
  */
-struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
+struct ref **get_remote_heads(struct packet_reader *reader,
 			      struct ref **list, unsigned int flags,
 			      struct oid_array *extra_have,
 			      struct oid_array *shallow_points)
@@ -241,24 +241,17 @@ struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
 	struct ref **orig_list = list;
 	int len = 0;
 	enum get_remote_heads_state state = EXPECTING_FIRST_REF;
-	struct packet_reader reader;
 	const char *arg;
 
-	packet_reader_init(&reader, in, src_buf, src_len,
-			   PACKET_READ_CHOMP_NEWLINE |
-			   PACKET_READ_GENTLE_ON_EOF);
-
-	discover_version(&reader);
-
 	*list = NULL;
 
 	while (state != EXPECTING_DONE) {
-		switch (packet_reader_read(&reader)) {
+		switch (packet_reader_read(reader)) {
 		case PACKET_READ_EOF:
 			die_initial_contact(1);
 		case PACKET_READ_NORMAL:
-			len = reader.pktlen;
-			if (len > 4 && skip_prefix(reader.line, "ERR ", &arg))
+			len = reader->pktlen;
+			if (len > 4 && skip_prefix(reader->line, "ERR ", &arg))
 				die("remote error: %s", arg);
 			break;
 		case PACKET_READ_FLUSH:
@@ -270,22 +263,22 @@ struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
 
 		switch (state) {
 		case EXPECTING_FIRST_REF:
-			process_capabilities(reader.line, &len);
-			if (process_dummy_ref(reader.line)) {
+			process_capabilities(reader->line, &len);
+			if (process_dummy_ref(reader->line)) {
 				state = EXPECTING_SHALLOW;
 				break;
 			}
 			state = EXPECTING_REF;
 			/* fallthrough */
 		case EXPECTING_REF:
-			if (process_ref(reader.line, len, &list, flags, extra_have))
+			if (process_ref(reader->line, len, &list, flags, extra_have))
 				break;
 			state = EXPECTING_SHALLOW;
 			/* fallthrough */
 		case EXPECTING_SHALLOW:
-			if (process_shallow(reader.line, len, shallow_points))
+			if (process_shallow(reader->line, len, shallow_points))
 				break;
-			die("protocol error: unexpected '%s'", reader.line);
+			die("protocol error: unexpected '%s'", reader->line);
 		case EXPECTING_DONE:
 			break;
 		}
diff --git a/connect.h b/connect.h
index 01f14cdf3..cdb8979dc 100644
--- a/connect.h
+++ b/connect.h
@@ -13,4 +13,7 @@ extern int parse_feature_request(const char *features, const char *feature);
 extern const char *server_feature_value(const char *feature, int *len_ret);
 extern int url_is_local_not_ssh(const char *url);
 
+struct packet_reader;
+extern enum protocol_version discover_version(struct packet_reader *reader);
+
 #endif
diff --git a/remote-curl.c b/remote-curl.c
index 0053b0954..9f6d07683 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -1,6 +1,7 @@
 #include "cache.h"
 #include "config.h"
 #include "remote.h"
+#include "connect.h"
 #include "strbuf.h"
 #include "walker.h"
 #include "http.h"
@@ -13,6 +14,7 @@
 #include "credential.h"
 #include "sha1-array.h"
 #include "send-pack.h"
+#include "protocol.h"
 
 static struct remote *remote;
 /* always ends with a trailing slash */
@@ -176,8 +178,22 @@ static struct discovery *last_discovery;
 static struct ref *parse_git_refs(struct discovery *heads, int for_push)
 {
 	struct ref *list = NULL;
-	get_remote_heads(-1, heads->buf, heads->len, &list,
-			 for_push ? REF_NORMAL : 0, NULL, &heads->shallow);
+	struct packet_reader reader;
+
+	packet_reader_init(&reader, -1, heads->buf, heads->len,
+			   PACKET_READ_CHOMP_NEWLINE |
+			   PACKET_READ_GENTLE_ON_EOF);
+
+	switch (discover_version(&reader)) {
+	case protocol_v1:
+	case protocol_v0:
+		get_remote_heads(&reader, &list, for_push ? REF_NORMAL : 0,
+				 NULL, &heads->shallow);
+		break;
+	case protocol_unknown_version:
+		BUG("unknown protocol version");
+	}
+
 	return list;
 }
 
diff --git a/remote.h b/remote.h
index 1f6611be2..2016461df 100644
--- a/remote.h
+++ b/remote.h
@@ -150,10 +150,11 @@ int check_ref_type(const struct ref *ref, int flags);
 void free_refs(struct ref *ref);
 
 struct oid_array;
-extern struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
+struct packet_reader;
+extern struct ref **get_remote_heads(struct packet_reader *reader,
 				     struct ref **list, unsigned int flags,
 				     struct oid_array *extra_have,
-				     struct oid_array *shallow);
+				     struct oid_array *shallow_points);
 
 int resolve_remote_symref(struct ref *ref, struct ref *list);
 int ref_newer(const struct object_id *new_oid, const struct object_id *old_oid);
diff --git a/transport.c b/transport.c
index 8e8779096..63c3dbab9 100644
--- a/transport.c
+++ b/transport.c
@@ -18,6 +18,7 @@
 #include "sha1-array.h"
 #include "sigchain.h"
 #include "transport-internal.h"
+#include "protocol.h"
 
 static void set_upstreams(struct transport *transport, struct ref *refs,
 	int pretend)
@@ -190,13 +191,26 @@ static int connect_setup(struct transport *transport, int for_push)
 static struct ref *get_refs_via_connect(struct transport *transport, int for_push)
 {
 	struct git_transport_data *data = transport->data;
-	struct ref *refs;
+	struct ref *refs = NULL;
+	struct packet_reader reader;
 
 	connect_setup(transport, for_push);
-	get_remote_heads(data->fd[0], NULL, 0, &refs,
-			 for_push ? REF_NORMAL : 0,
-			 &data->extra_have,
-			 &data->shallow);
+
+	packet_reader_init(&reader, data->fd[0], NULL, 0,
+			   PACKET_READ_CHOMP_NEWLINE |
+			   PACKET_READ_GENTLE_ON_EOF);
+
+	switch (discover_version(&reader)) {
+	case protocol_v1:
+	case protocol_v0:
+		get_remote_heads(&reader, &refs,
+				 for_push ? REF_NORMAL : 0,
+				 &data->extra_have,
+				 &data->shallow);
+		break;
+	case protocol_unknown_version:
+		BUG("unknown protocol version");
+	}
 	data->got_remote_heads = 1;
 
 	return refs;
-- 
2.16.2.395.g2e18187dfd-goog



* [PATCH v4 09/35] transport: store protocol version
  2018-02-28 23:22     ` [PATCH v4 " Brandon Williams
                         ` (7 preceding siblings ...)
  2018-02-28 23:22       ` [PATCH v4 08/35] connect: discover protocol version outside of get_remote_heads Brandon Williams
@ 2018-02-28 23:22       ` Brandon Williams
  2018-02-28 23:22       ` [PATCH v4 10/35] protocol: introduce enum protocol_version value protocol_v2 Brandon Williams
                         ` (26 subsequent siblings)
  35 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-02-28 23:22 UTC (permalink / raw)
  To: git
  Cc: git, gitster, jrnieder, pclouds, peff, sbeller, stolee, Brandon Williams

Once protocol_v2 is introduced, requesting a fetch or a push will need to
be handled differently depending on the protocol version.  Store the
protocol version the server is speaking in 'struct git_transport_data'
and use it to determine what to do in the case of a fetch or a push.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
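Not part of the commit: a minimal standalone sketch of the pattern
described above, assuming a simplified stand-in for 'struct
git_transport_data'.  The names 'conn_state', 'handshake' and 'do_fetch'
are hypothetical and exist only for illustration; the real code calls
discover_version() and fetch_pack() instead.

```c
#include <stdio.h>

enum protocol_version {
	protocol_unknown_version = -1,
	protocol_v0 = 0,
	protocol_v1 = 1,
};

/* Hypothetical stand-in for 'struct git_transport_data'. */
struct conn_state {
	unsigned got_remote_heads : 1;
	enum protocol_version version;
};

/* Record the version discovered during the initial ref handshake. */
static void handshake(struct conn_state *st, enum protocol_version discovered)
{
	st->version = discovered;
	st->got_remote_heads = 1;
}

/*
 * Later operations dispatch on the stored version rather than
 * rediscovering it.  Returns 0 on success and -1 otherwise.
 */
static int do_fetch(const struct conn_state *st)
{
	switch (st->version) {
	case protocol_v1:
	case protocol_v0:
		puts("fetching with the v0/v1 code path");
		return 0;
	case protocol_unknown_version:
		break;
	}
	fprintf(stderr, "unknown protocol version\n");
	return -1;
}
```

Keeping the version next to the file descriptors means a later fetch or
push on the same connection can pick its code path without reading from
the wire again.
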
 transport.c | 35 ++++++++++++++++++++++++++---------
 1 file changed, 26 insertions(+), 9 deletions(-)

diff --git a/transport.c b/transport.c
index 63c3dbab9..2378dcb38 100644
--- a/transport.c
+++ b/transport.c
@@ -118,6 +118,7 @@ struct git_transport_data {
 	struct child_process *conn;
 	int fd[2];
 	unsigned got_remote_heads : 1;
+	enum protocol_version version;
 	struct oid_array extra_have;
 	struct oid_array shallow;
 };
@@ -200,7 +201,8 @@ static struct ref *get_refs_via_connect(struct transport *transport, int for_pus
 			   PACKET_READ_CHOMP_NEWLINE |
 			   PACKET_READ_GENTLE_ON_EOF);
 
-	switch (discover_version(&reader)) {
+	data->version = discover_version(&reader);
+	switch (data->version) {
 	case protocol_v1:
 	case protocol_v0:
 		get_remote_heads(&reader, &refs,
@@ -221,7 +223,7 @@ static int fetch_refs_via_pack(struct transport *transport,
 {
 	int ret = 0;
 	struct git_transport_data *data = transport->data;
-	struct ref *refs;
+	struct ref *refs = NULL;
 	char *dest = xstrdup(transport->url);
 	struct fetch_pack_args args;
 	struct ref *refs_tmp = NULL;
@@ -247,10 +249,18 @@ static int fetch_refs_via_pack(struct transport *transport,
 	if (!data->got_remote_heads)
 		refs_tmp = get_refs_via_connect(transport, 0);
 
-	refs = fetch_pack(&args, data->fd, data->conn,
-			  refs_tmp ? refs_tmp : transport->remote_refs,
-			  dest, to_fetch, nr_heads, &data->shallow,
-			  &transport->pack_lockfile);
+	switch (data->version) {
+	case protocol_v1:
+	case protocol_v0:
+		refs = fetch_pack(&args, data->fd, data->conn,
+				  refs_tmp ? refs_tmp : transport->remote_refs,
+				  dest, to_fetch, nr_heads, &data->shallow,
+				  &transport->pack_lockfile);
+		break;
+	case protocol_unknown_version:
+		BUG("unknown protocol version");
+	}
+
 	close(data->fd[0]);
 	close(data->fd[1]);
 	if (finish_connect(data->conn))
@@ -549,7 +559,7 @@ static int git_transport_push(struct transport *transport, struct ref *remote_re
 {
 	struct git_transport_data *data = transport->data;
 	struct send_pack_args args;
-	int ret;
+	int ret = 0;
 
 	if (!data->got_remote_heads)
 		get_refs_via_connect(transport, 1);
@@ -574,8 +584,15 @@ static int git_transport_push(struct transport *transport, struct ref *remote_re
 	else
 		args.push_cert = SEND_PACK_PUSH_CERT_NEVER;
 
-	ret = send_pack(&args, data->fd, data->conn, remote_refs,
-			&data->extra_have);
+	switch (data->version) {
+	case protocol_v1:
+	case protocol_v0:
+		ret = send_pack(&args, data->fd, data->conn, remote_refs,
+				&data->extra_have);
+		break;
+	case protocol_unknown_version:
+		BUG("unknown protocol version");
+	}
 
 	close(data->fd[1]);
 	close(data->fd[0]);
-- 
2.16.2.395.g2e18187dfd-goog



* [PATCH v4 10/35] protocol: introduce enum protocol_version value protocol_v2
  2018-02-28 23:22     ` [PATCH v4 " Brandon Williams
                         ` (8 preceding siblings ...)
  2018-02-28 23:22       ` [PATCH v4 09/35] transport: store protocol version Brandon Williams
@ 2018-02-28 23:22       ` Brandon Williams
  2018-02-28 23:22       ` [PATCH v4 11/35] test-pkt-line: introduce a packet-line test helper Brandon Williams
                         ` (25 subsequent siblings)
  35 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-02-28 23:22 UTC (permalink / raw)
  To: git
  Cc: git, gitster, jrnieder, pclouds, peff, sbeller, stolee, Brandon Williams

Introduce protocol_v2, a new value for 'enum protocol_version'.
Subsequent patches will fill in the implementation of protocol_v2.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
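Not part of the commit: a small standalone sketch of how a "version=N"
entry could be mapped onto the enum extended here.  It assumes the
side-channel value is a colon-separated list of key=value entries and
that the highest recognized version wins; both are assumptions made for
illustration.  'version_from_list' and 'parse_version' are hypothetical
names, not functions from this series.

```c
#include <stddef.h>
#include <string.h>

enum protocol_version {
	protocol_unknown_version = -1,
	protocol_v0 = 0,
	protocol_v1 = 1,
	protocol_v2 = 2,
};

/* Map a single-digit version string of length 'len' onto the enum. */
static enum protocol_version parse_version(const char *value, size_t len)
{
	if (len == 1 && value[0] == '0')
		return protocol_v0;
	if (len == 1 && value[0] == '1')
		return protocol_v1;
	if (len == 1 && value[0] == '2')
		return protocol_v2;
	return protocol_unknown_version;
}

/*
 * Scan a colon-separated list such as "foo=bar:version=2" and return
 * the highest recognized version.  Unknown versions are ignored and
 * the default is protocol_v0.
 */
static enum protocol_version version_from_list(const char *list)
{
	enum protocol_version best = protocol_v0;
	const char *p = list;

	while (*p) {
		size_t len = strcspn(p, ":");

		if (len > 8 && !strncmp(p, "version=", 8)) {
			enum protocol_version v = parse_version(p + 8, len - 8);
			if (v > best)
				best = v;
		}
		p += len;
		if (*p == ':')
			p++;
	}
	return best;
}
```

Because an unrecognized version never raises the result, a peer asking
for a version this sketch does not know simply gets protocol_v0.
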
 builtin/fetch-pack.c   | 2 ++
 builtin/receive-pack.c | 6 ++++++
 builtin/send-pack.c    | 3 +++
 builtin/upload-pack.c  | 7 +++++++
 connect.c              | 3 +++
 protocol.c             | 2 ++
 protocol.h             | 1 +
 remote-curl.c          | 3 +++
 transport.c            | 9 +++++++++
 9 files changed, 36 insertions(+)

diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
index 85d4faf76..b2374ddbb 100644
--- a/builtin/fetch-pack.c
+++ b/builtin/fetch-pack.c
@@ -201,6 +201,8 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix)
 			   PACKET_READ_GENTLE_ON_EOF);
 
 	switch (discover_version(&reader)) {
+	case protocol_v2:
+		die("support for protocol v2 not implemented yet");
 	case protocol_v1:
 	case protocol_v0:
 		get_remote_heads(&reader, &ref, 0, NULL, &shallow);
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index b7ce7c7f5..3656e94fd 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -1963,6 +1963,12 @@ int cmd_receive_pack(int argc, const char **argv, const char *prefix)
 		unpack_limit = receive_unpack_limit;
 
 	switch (determine_protocol_version_server()) {
+	case protocol_v2:
+		/*
+		 * push support for protocol v2 has not been implemented yet,
+		 * so ignore the request to use v2 and fall back to using v0.
+		 */
+		break;
 	case protocol_v1:
 		/*
 		 * v1 is just the original protocol with a version string,
diff --git a/builtin/send-pack.c b/builtin/send-pack.c
index 83cb125a6..b5427f75e 100644
--- a/builtin/send-pack.c
+++ b/builtin/send-pack.c
@@ -263,6 +263,9 @@ int cmd_send_pack(int argc, const char **argv, const char *prefix)
 			   PACKET_READ_GENTLE_ON_EOF);
 
 	switch (discover_version(&reader)) {
+	case protocol_v2:
+		die("support for protocol v2 not implemented yet");
+		break;
 	case protocol_v1:
 	case protocol_v0:
 		get_remote_heads(&reader, &remote_refs, REF_NORMAL,
diff --git a/builtin/upload-pack.c b/builtin/upload-pack.c
index 2cb5cb35b..8d53e9794 100644
--- a/builtin/upload-pack.c
+++ b/builtin/upload-pack.c
@@ -47,6 +47,13 @@ int cmd_upload_pack(int argc, const char **argv, const char *prefix)
 		die("'%s' does not appear to be a git repository", dir);
 
 	switch (determine_protocol_version_server()) {
+	case protocol_v2:
+		/*
+		 * fetch support for protocol v2 has not been implemented yet,
+		 * so ignore the request to use v2 and fall back to using v0.
+		 */
+		upload_pack(&opts);
+		break;
 	case protocol_v1:
 		/*
 		 * v1 is just the original protocol with a version string,
diff --git a/connect.c b/connect.c
index 0b111e62d..4b89b984c 100644
--- a/connect.c
+++ b/connect.c
@@ -83,6 +83,9 @@ enum protocol_version discover_version(struct packet_reader *reader)
 	}
 
 	switch (version) {
+	case protocol_v2:
+		die("support for protocol v2 not implemented yet");
+		break;
 	case protocol_v1:
 		/* Read the peeked version line */
 		packet_reader_read(reader);
diff --git a/protocol.c b/protocol.c
index 43012b7eb..5e636785d 100644
--- a/protocol.c
+++ b/protocol.c
@@ -8,6 +8,8 @@ static enum protocol_version parse_protocol_version(const char *value)
 		return protocol_v0;
 	else if (!strcmp(value, "1"))
 		return protocol_v1;
+	else if (!strcmp(value, "2"))
+		return protocol_v2;
 	else
 		return protocol_unknown_version;
 }
diff --git a/protocol.h b/protocol.h
index 1b2bc94a8..2ad35e433 100644
--- a/protocol.h
+++ b/protocol.h
@@ -5,6 +5,7 @@ enum protocol_version {
 	protocol_unknown_version = -1,
 	protocol_v0 = 0,
 	protocol_v1 = 1,
+	protocol_v2 = 2,
 };
 
 /*
diff --git a/remote-curl.c b/remote-curl.c
index 9f6d07683..dae8a4a48 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -185,6 +185,9 @@ static struct ref *parse_git_refs(struct discovery *heads, int for_push)
 			   PACKET_READ_GENTLE_ON_EOF);
 
 	switch (discover_version(&reader)) {
+	case protocol_v2:
+		die("support for protocol v2 not implemented yet");
+		break;
 	case protocol_v1:
 	case protocol_v0:
 		get_remote_heads(&reader, &list, for_push ? REF_NORMAL : 0,
diff --git a/transport.c b/transport.c
index 2378dcb38..83d9dd1df 100644
--- a/transport.c
+++ b/transport.c
@@ -203,6 +203,9 @@ static struct ref *get_refs_via_connect(struct transport *transport, int for_pus
 
 	data->version = discover_version(&reader);
 	switch (data->version) {
+	case protocol_v2:
+		die("support for protocol v2 not implemented yet");
+		break;
 	case protocol_v1:
 	case protocol_v0:
 		get_remote_heads(&reader, &refs,
@@ -250,6 +253,9 @@ static int fetch_refs_via_pack(struct transport *transport,
 		refs_tmp = get_refs_via_connect(transport, 0);
 
 	switch (data->version) {
+	case protocol_v2:
+		die("support for protocol v2 not implemented yet");
+		break;
 	case protocol_v1:
 	case protocol_v0:
 		refs = fetch_pack(&args, data->fd, data->conn,
@@ -585,6 +591,9 @@ static int git_transport_push(struct transport *transport, struct ref *remote_re
 		args.push_cert = SEND_PACK_PUSH_CERT_NEVER;
 
 	switch (data->version) {
+	case protocol_v2:
+		die("support for protocol v2 not implemented yet");
+		break;
 	case protocol_v1:
 	case protocol_v0:
 		ret = send_pack(&args, data->fd, data->conn, remote_refs,
-- 
2.16.2.395.g2e18187dfd-goog



* [PATCH v4 11/35] test-pkt-line: introduce a packet-line test helper
  2018-02-28 23:22     ` [PATCH v4 " Brandon Williams
                         ` (9 preceding siblings ...)
  2018-02-28 23:22       ` [PATCH v4 10/35] protocol: introduce enum protocol_version value protocol_v2 Brandon Williams
@ 2018-02-28 23:22       ` Brandon Williams
  2018-02-28 23:22       ` [PATCH v4 12/35] serve: introduce git-serve Brandon Williams
                         ` (24 subsequent siblings)
  35 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-02-28 23:22 UTC (permalink / raw)
  To: git
  Cc: git, gitster, jrnieder, pclouds, peff, sbeller, stolee, Brandon Williams

Introduce a packet-line test helper which can either pack an input
stream into packet-lines or unpack packet-lines from it, writing the
result to stdout.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
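Not part of the commit: a standalone sketch of what "unpack" has to do,
working on an in-memory buffer instead of a file descriptor.  'pkt_next'
and 'enum pkt_status' are hypothetical names; the helper in this patch
uses packet_reader_read() from pkt-line.h instead.

```c
#include <stddef.h>

enum pkt_status {
	PKT_ERROR = -1,
	PKT_EOF,
	PKT_NORMAL,
	PKT_FLUSH,
	PKT_DELIM,
};

/* Value of one hex digit, or -1 if 'c' is not a hex digit. */
static int hexval(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/*
 * Consume one pkt-line from '*buf'.  The 4-digit hex length counts the
 * length header itself, so the payload is 'len - 4' bytes.  "0000" is a
 * flush-pkt and "0001" a delim-pkt; neither carries a payload.
 */
static enum pkt_status pkt_next(const char **buf, size_t *remaining,
				const char **payload, size_t *payload_len)
{
	size_t len = 0;
	int i;

	if (!*remaining)
		return PKT_EOF;
	if (*remaining < 4)
		return PKT_ERROR;

	for (i = 0; i < 4; i++) {
		int v = hexval((*buf)[i]);
		if (v < 0)
			return PKT_ERROR;
		len = (len << 4) | (size_t)v;
	}

	if (len == 0 || len == 1) {
		*buf += 4;
		*remaining -= 4;
		return len ? PKT_DELIM : PKT_FLUSH;
	}
	if (len < 4 || len > *remaining)
		return PKT_ERROR;

	*payload = *buf + 4;
	*payload_len = len - 4;
	*buf += len;
	*remaining -= len;
	return PKT_NORMAL;
}
```

Calling pkt_next() in a loop until it returns PKT_EOF mirrors the
structure of unpack() in the helper.
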
 Makefile                 |  1 +
 t/helper/test-pkt-line.c | 64 ++++++++++++++++++++++++++++++++++++++++
 2 files changed, 65 insertions(+)
 create mode 100644 t/helper/test-pkt-line.c

diff --git a/Makefile b/Makefile
index b7ccc05fa..3b849c060 100644
--- a/Makefile
+++ b/Makefile
@@ -669,6 +669,7 @@ TEST_PROGRAMS_NEED_X += test-mktemp
 TEST_PROGRAMS_NEED_X += test-online-cpus
 TEST_PROGRAMS_NEED_X += test-parse-options
 TEST_PROGRAMS_NEED_X += test-path-utils
+TEST_PROGRAMS_NEED_X += test-pkt-line
 TEST_PROGRAMS_NEED_X += test-prio-queue
 TEST_PROGRAMS_NEED_X += test-read-cache
 TEST_PROGRAMS_NEED_X += test-write-cache
diff --git a/t/helper/test-pkt-line.c b/t/helper/test-pkt-line.c
new file mode 100644
index 000000000..0f19e53c7
--- /dev/null
+++ b/t/helper/test-pkt-line.c
@@ -0,0 +1,64 @@
+#include "pkt-line.h"
+
+static void pack_line(const char *line)
+{
+	if (!strcmp(line, "0000") || !strcmp(line, "0000\n"))
+		packet_flush(1);
+	else if (!strcmp(line, "0001") || !strcmp(line, "0001\n"))
+		packet_delim(1);
+	else
+		packet_write_fmt(1, "%s", line);
+}
+
+static void pack(int argc, const char **argv)
+{
+	if (argc) { /* read from argv */
+		int i;
+		for (i = 0; i < argc; i++)
+			pack_line(argv[i]);
+	} else { /* read from stdin */
+		char line[LARGE_PACKET_MAX];
+		while (fgets(line, sizeof(line), stdin)) {
+			pack_line(line);
+		}
+	}
+}
+
+static void unpack(void)
+{
+	struct packet_reader reader;
+	packet_reader_init(&reader, 0, NULL, 0,
+			   PACKET_READ_GENTLE_ON_EOF |
+			   PACKET_READ_CHOMP_NEWLINE);
+
+	while (packet_reader_read(&reader) != PACKET_READ_EOF) {
+		switch (reader.status) {
+		case PACKET_READ_EOF:
+			break;
+		case PACKET_READ_NORMAL:
+			printf("%s\n", reader.line);
+			break;
+		case PACKET_READ_FLUSH:
+			printf("0000\n");
+			break;
+		case PACKET_READ_DELIM:
+			printf("0001\n");
+			break;
+		}
+	}
+}
+
+int cmd_main(int argc, const char **argv)
+{
+	if (argc < 2)
+		die("too few arguments");
+
+	if (!strcmp(argv[1], "pack"))
+		pack(argc - 2, argv + 2);
+	else if (!strcmp(argv[1], "unpack"))
+		unpack();
+	else
+		die("invalid argument '%s'", argv[1]);
+
+	return 0;
+}
-- 
2.16.2.395.g2e18187dfd-goog



* [PATCH v4 12/35] serve: introduce git-serve
  2018-02-28 23:22     ` [PATCH v4 " Brandon Williams
                         ` (10 preceding siblings ...)
  2018-02-28 23:22       ` [PATCH v4 11/35] test-pkt-line: introduce a packet-line test helper Brandon Williams
@ 2018-02-28 23:22       ` Brandon Williams
  2018-03-01 23:11         ` Junio C Hamano
                           ` (2 more replies)
  2018-02-28 23:22       ` [PATCH v4 13/35] ls-refs: introduce ls-refs server command Brandon Williams
                         ` (23 subsequent siblings)
  35 siblings, 3 replies; 362+ messages in thread
From: Brandon Williams @ 2018-02-28 23:22 UTC (permalink / raw)
  To: git
  Cc: git, gitster, jrnieder, pclouds, peff, sbeller, stolee, Brandon Williams

Introduce git-serve, the base server for protocol version 2.

Protocol version 2 is intended to be a replacement for Git's current
wire protocol.  The intention is that it will be a simpler, less
wasteful protocol which can evolve over time.

Protocol version 2 improves upon version 1 by eliminating the initial
ref advertisement.  In its place a server will export a list of
capabilities and commands which it supports in a capability
advertisement.  A client can then request that a particular command be
executed by providing a number of capabilities and command specific
parameters.  At the completion of a command, a client can request that
another command be executed or can terminate the connection by sending a
flush packet.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
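Not part of the commit: a standalone sketch of how one line of the
capability advertisement is framed.  'pkt_format' is a hypothetical
helper written for illustration (the series itself uses
packet_write_fmt()); the 65520-byte limit is the maximum total pkt-line
length used by the existing protocol.

```c
#include <stdio.h>
#include <string.h>

/*
 * Format 'payload' as a pkt-line into 'out': four hex digits giving the
 * total length (header included) followed by the payload.  Returns the
 * number of bytes written, or -1 if the packet would be too long or
 * would not fit into 'out' together with its terminating NUL.
 */
static int pkt_format(char *out, size_t out_size, const char *payload)
{
	size_t total = strlen(payload) + 4;

	if (total > 65520 || total + 1 > out_size)
		return -1;
	snprintf(out, out_size, "%04x%s", (unsigned int)total, payload);
	return (int)total;
}
```

For example, "version 2\n" is 10 bytes, so it goes out as
"000eversion 2\n" (0x000e = 14), which matches the version line in the
HTTP example in protocol-v2.txt.
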
 .gitignore                              |   1 +
 Documentation/technical/protocol-v2.txt | 171 ++++++++++++++++
 Makefile                                |   2 +
 builtin.h                               |   1 +
 builtin/serve.c                         |  30 +++
 git.c                                   |   1 +
 serve.c                                 | 250 ++++++++++++++++++++++++
 serve.h                                 |  15 ++
 t/t5701-git-serve.sh                    |  60 ++++++
 9 files changed, 531 insertions(+)
 create mode 100644 Documentation/technical/protocol-v2.txt
 create mode 100644 builtin/serve.c
 create mode 100644 serve.c
 create mode 100644 serve.h
 create mode 100755 t/t5701-git-serve.sh
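
Not part of the commit: a standalone sketch of the request grammar
(capability keys, an optional delim-pkt followed by arguments, then a
flush-pkt).  It works on packets that have already been unpacked into
strings, with "0000" and "0001" standing in for the flush and delimiter
packets as in the test helper's text output.  'parse_request' and
'struct request' are hypothetical names; the real state machine is
process_request() in serve.c, which also validates each key against the
advertised capabilities.

```c
#include <stddef.h>
#include <string.h>

enum request_state { REQ_KEYS, REQ_ARGS, REQ_DONE };

struct request {
	const char *command;	/* text after "command=", or NULL */
	int nr_keys;		/* lines seen before the delim-pkt */
	int nr_args;		/* lines seen after the delim-pkt */
};

/* Returns 0 for a well-formed request and -1 on a protocol error. */
static int parse_request(const char **pkts, int nr, struct request *req)
{
	enum request_state state = REQ_KEYS;
	int i;

	req->command = NULL;
	req->nr_keys = 0;
	req->nr_args = 0;

	for (i = 0; i < nr && state != REQ_DONE; i++) {
		const char *line = pkts[i];

		if (!strcmp(line, "0000")) {
			state = REQ_DONE;
		} else if (!strcmp(line, "0001")) {
			if (state != REQ_KEYS)
				return -1;	/* at most one delim-pkt */
			state = REQ_ARGS;
		} else if (state == REQ_KEYS) {
			if (!strncmp(line, "command=", 8)) {
				if (req->command)
					return -1;	/* one command only */
				req->command = line + 8;
			}
			req->nr_keys++;
		} else {
			req->nr_args++;
		}
	}

	/* A request must be terminated by a flush-pkt. */
	return state == REQ_DONE ? 0 : -1;
}
```

A lone "0000" parses successfully with no command and no keys, which
corresponds to the empty request a client may send to end the
conversation.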

diff --git a/.gitignore b/.gitignore
index 833ef3b0b..2d0450c26 100644
--- a/.gitignore
+++ b/.gitignore
@@ -140,6 +140,7 @@
 /git-rm
 /git-send-email
 /git-send-pack
+/git-serve
 /git-sh-i18n
 /git-sh-i18n--envsubst
 /git-sh-setup
diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
new file mode 100644
index 000000000..b02eefc21
--- /dev/null
+++ b/Documentation/technical/protocol-v2.txt
@@ -0,0 +1,171 @@
+ Git Wire Protocol, Version 2
+==============================
+
+This document presents a specification for a version 2 of Git's wire
+protocol.  Protocol v2 will improve upon v1 in the following ways:
+
+  * Instead of multiple service names, multiple commands will be
+    supported by a single service
+  * Easily extendable as capabilities are moved into their own section
+    of the protocol, no longer being hidden behind a NUL byte and
+    limited by the size of a pkt-line
+  * Separate out other information hidden behind NUL bytes (e.g. agent
+    string as a capability and symrefs can be requested using 'ls-refs')
+  * Reference advertisement will be omitted unless explicitly requested
+  * ls-refs command to explicitly request some refs
+  * Designed with http and stateless-rpc in mind.  With clear flush
+    semantics the http remote helper can simply act as a proxy
+
+ Detailed Design
+=================
+
+In protocol v2 communication is command oriented.  When first contacting a
+server a list of capabilities will be advertised.  Some of these capabilities
+will be commands which a client can request be executed.  Once a command
+has completed, a client can reuse the connection and request that other
+commands be executed.
+
+ Packet-Line Framing
+---------------------
+
+All communication is done using packet-line framing, just as in v1.  See
+`Documentation/technical/pack-protocol.txt` and
+`Documentation/technical/protocol-common.txt` for more information.
+
+In protocol v2 these special packets will have the following semantics:
+
+  * '0000' Flush Packet (flush-pkt) - indicates the end of a message
+  * '0001' Delimiter Packet (delim-pkt) - separates sections of a message
+
+ Initial Client Request
+------------------------
+
+In general a client can request to speak protocol v2 by sending
+`version=2` through the respective side-channel for the transport being
+used which inevitably sets `GIT_PROTOCOL`.  More information can be
+found in `pack-protocol.txt` and `http-protocol.txt`.  In all cases the
+response from the server is the capability advertisement.
+
+ Git Transport
+~~~~~~~~~~~~~~~
+
+When using the git:// transport, you can request to use protocol v2 by
+sending "version=2" as an extra parameter:
+
+   003egit-upload-pack /project.git\0host=myserver.com\0\0version=2\0
+
+ SSH and File Transport
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+When using either the ssh:// or file:// transport, the GIT_PROTOCOL
+environment variable must be set explicitly to include "version=2".
+
+ HTTP Transport
+~~~~~~~~~~~~~~~~
+
+When using the http:// or https:// transport a client makes a "smart"
+info/refs request as described in `http-protocol.txt` and requests that
+v2 be used by supplying "version=2" in the `Git-Protocol` header.
+
+   C: GET $GIT_URL/info/refs?service=git-upload-pack HTTP/1.0
+   C: Git-Protocol: version=2
+   C:
+
+A v2 server would reply:
+
+   S: 200 OK
+   S: <Some headers>
+   S: ...
+   S:
+   S: 000eversion 2\n
+   S: <capability-advertisement>
+
+Subsequent requests are then made directly to the service
+`$GIT_URL/git-upload-pack`. (This works the same for git-receive-pack.)
+
+ Capability Advertisement
+--------------------------
+
+A server which decides to communicate (based on a request from a client)
+using protocol version 2, notifies the client by sending a version string
+in its initial response followed by an advertisement of its capabilities.
+Each capability is a key with an optional value.  Clients must ignore all
+unknown keys.  Semantics of unknown values are left to the definition of
+each key.  Some capabilities will describe commands which can be requested
+to be executed by the client.
+
+    capability-advertisement = protocol-version
+			       capability-list
+			       flush-pkt
+
+    protocol-version = PKT-LINE("version 2" LF)
+    capability-list = *capability
+    capability = PKT-LINE(key[=value] LF)
+
+    key = 1*(ALPHA | DIGIT | "-_")
+    value = 1*(ALPHA | DIGIT | " -_.,?\/{}[]()<>!@#$%^&*+=:;")
+
+ Command Request
+-----------------
+
+After receiving the capability advertisement, a client can then issue a
+request to select the command it wants with any particular capabilities
+or arguments.  There is then an optional section where the client can
+provide any command specific parameters or queries.  Only a single
+command can be requested at a time.
+
+    request = empty-request | command-request
+    empty-request = flush-pkt
+    command-request = command
+		      capability-list
+		      (command-args)
+		      flush-pkt
+    command = PKT-LINE("command=" key LF)
+    command-args = delim-pkt
+		   *PKT-LINE(arg LF)
+    arg = 1*(ALPHA | DIGIT | " -_.,?\/{}[]()<>!@#$%^&*+=:;")
+
+The server will then check to ensure that the client's request is
+comprised of a valid command as well as valid capabilities which were
+advertised.  If the request is valid the server will then execute the
+command.  A server MUST wait till it has received the client's entire
+request before issuing a response.  The format of the response is
+determined by the command being executed, but in all cases a flush-pkt
+indicates the end of the response.
+
+When a command has finished, and the client has received the entire
+response from the server, a client can either request that another
+command be executed or can terminate the connection.  A client may
+optionally send an empty request consisting of just a flush-pkt to
+indicate that no more requests will be made.
+
+ Capabilities
+~~~~~~~~~~~~~~
+
+There are two different types of capabilities: normal capabilities,
+which can be used to convey information or alter the behavior of a
+request, and commands, which are the core actions that a client wants to
+perform (fetch, push, etc).
+
+All commands must only last a single round and be stateless from the
+perspective of the server side.  All state MUST be retained and managed
+by the client process.  This permits simple round-robin load-balancing
+on the server side, without needing to worry about state management.
+
+Clients MUST NOT require state management on the server side in order to
+function correctly.
+
+ agent
+-------
+
+The server can advertise the `agent` capability with a value `X` (in the
+form `agent=X`) to notify the client that the server is running version
+`X`.  The client may optionally send its own agent string by including
+the `agent` capability with a value `Y` (in the form `agent=Y`) in its
+request to the server (but it MUST NOT do so if the server did not
+advertise the agent capability). The `X` and `Y` strings may contain any
+printable ASCII characters except space (i.e., the byte range 32 < x <
+127), and are typically of the form "package/version" (e.g.,
+"git/1.8.3.1"). The agent strings are purely informative for statistics
+and debugging purposes, and MUST NOT be used to programmatically assume
+the presence or absence of particular features.
diff --git a/Makefile b/Makefile
index 3b849c060..18c255428 100644
--- a/Makefile
+++ b/Makefile
@@ -881,6 +881,7 @@ LIB_OBJS += revision.o
 LIB_OBJS += run-command.o
 LIB_OBJS += send-pack.o
 LIB_OBJS += sequencer.o
+LIB_OBJS += serve.o
 LIB_OBJS += server-info.o
 LIB_OBJS += setup.o
 LIB_OBJS += sha1-array.o
@@ -1014,6 +1015,7 @@ BUILTIN_OBJS += builtin/rev-parse.o
 BUILTIN_OBJS += builtin/revert.o
 BUILTIN_OBJS += builtin/rm.o
 BUILTIN_OBJS += builtin/send-pack.o
+BUILTIN_OBJS += builtin/serve.o
 BUILTIN_OBJS += builtin/shortlog.o
 BUILTIN_OBJS += builtin/show-branch.o
 BUILTIN_OBJS += builtin/show-ref.o
diff --git a/builtin.h b/builtin.h
index f332a1257..3f3fdfc28 100644
--- a/builtin.h
+++ b/builtin.h
@@ -215,6 +215,7 @@ extern int cmd_rev_parse(int argc, const char **argv, const char *prefix);
 extern int cmd_revert(int argc, const char **argv, const char *prefix);
 extern int cmd_rm(int argc, const char **argv, const char *prefix);
 extern int cmd_send_pack(int argc, const char **argv, const char *prefix);
+extern int cmd_serve(int argc, const char **argv, const char *prefix);
 extern int cmd_shortlog(int argc, const char **argv, const char *prefix);
 extern int cmd_show(int argc, const char **argv, const char *prefix);
 extern int cmd_show_branch(int argc, const char **argv, const char *prefix);
diff --git a/builtin/serve.c b/builtin/serve.c
new file mode 100644
index 000000000..d3fd240bb
--- /dev/null
+++ b/builtin/serve.c
@@ -0,0 +1,30 @@
+#include "cache.h"
+#include "builtin.h"
+#include "parse-options.h"
+#include "serve.h"
+
+static char const * const serve_usage[] = {
+	N_("git serve [<options>]"),
+	NULL
+};
+
+int cmd_serve(int argc, const char **argv, const char *prefix)
+{
+	struct serve_options opts = SERVE_OPTIONS_INIT;
+
+	struct option options[] = {
+		OPT_BOOL(0, "stateless-rpc", &opts.stateless_rpc,
+			 N_("quit after a single request/response exchange")),
+		OPT_BOOL(0, "advertise-capabilities", &opts.advertise_capabilities,
+			 N_("exit immediately after advertising capabilities")),
+		OPT_END()
+	};
+
+	/* ignore all unknown cmdline switches for now */
+	argc = parse_options(argc, argv, prefix, options, serve_usage,
+			     PARSE_OPT_KEEP_DASHDASH |
+			     PARSE_OPT_KEEP_UNKNOWN);
+	serve(&opts);
+
+	return 0;
+}
diff --git a/git.c b/git.c
index f71073dc8..f85d682b6 100644
--- a/git.c
+++ b/git.c
@@ -461,6 +461,7 @@ static struct cmd_struct commands[] = {
 	{ "revert", cmd_revert, RUN_SETUP | NEED_WORK_TREE },
 	{ "rm", cmd_rm, RUN_SETUP },
 	{ "send-pack", cmd_send_pack, RUN_SETUP },
+	{ "serve", cmd_serve, RUN_SETUP },
 	{ "shortlog", cmd_shortlog, RUN_SETUP_GENTLY | USE_PAGER },
 	{ "show", cmd_show, RUN_SETUP },
 	{ "show-branch", cmd_show_branch, RUN_SETUP },
diff --git a/serve.c b/serve.c
new file mode 100644
index 000000000..cf23179b9
--- /dev/null
+++ b/serve.c
@@ -0,0 +1,250 @@
+#include "cache.h"
+#include "repository.h"
+#include "config.h"
+#include "pkt-line.h"
+#include "version.h"
+#include "argv-array.h"
+#include "serve.h"
+
+static int agent_advertise(struct repository *r,
+			   struct strbuf *value)
+{
+	if (value)
+		strbuf_addstr(value, git_user_agent_sanitized());
+	return 1;
+}
+
+struct protocol_capability {
+	/*
+	 * The name of the capability.  The server uses this name when
+	 * advertising this capability, and the client uses this name to
+	 * specify this capability.
+	 */
+	const char *name;
+
+	/*
+	 * Function queried to see if a capability should be advertised.
+	 * Optionally a value can be specified by adding it to 'value'.
+	 * If a value is added to 'value', the server will advertise this
+	 * capability as "<name>=<value>" instead of "<name>".
+	 */
+	int (*advertise)(struct repository *r, struct strbuf *value);
+
+	/*
+	 * Function called when a client requests the capability as a command.
+	 * The command request will be provided to the function via 'keys', the
+	 * capabilities requested, and 'args', the command specific parameters.
+	 *
+	 * This field should be NULL for capabilities which are not commands.
+	 */
+	int (*command)(struct repository *r,
+		       struct argv_array *keys,
+		       struct argv_array *args);
+};
+
+static struct protocol_capability capabilities[] = {
+	{ "agent", agent_advertise, NULL },
+};
+
+static void advertise_capabilities(void)
+{
+	struct strbuf capability = STRBUF_INIT;
+	struct strbuf value = STRBUF_INIT;
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(capabilities); i++) {
+		struct protocol_capability *c = &capabilities[i];
+
+		if (c->advertise(the_repository, &value)) {
+			strbuf_addstr(&capability, c->name);
+
+			if (value.len) {
+				strbuf_addch(&capability, '=');
+				strbuf_addbuf(&capability, &value);
+			}
+
+			strbuf_addch(&capability, '\n');
+			packet_write(1, capability.buf, capability.len);
+		}
+
+		strbuf_reset(&capability);
+		strbuf_reset(&value);
+	}
+
+	packet_flush(1);
+	strbuf_release(&capability);
+	strbuf_release(&value);
+}
+
+static struct protocol_capability *get_capability(const char *key)
+{
+	int i;
+
+	if (!key)
+		return NULL;
+
+	for (i = 0; i < ARRAY_SIZE(capabilities); i++) {
+		struct protocol_capability *c = &capabilities[i];
+		const char *out;
+		if (skip_prefix(key, c->name, &out) && (!*out || *out == '='))
+			return c;
+	}
+
+	return NULL;
+}
+
+static int is_valid_capability(const char *key)
+{
+	const struct protocol_capability *c = get_capability(key);
+
+	return c && c->advertise(the_repository, NULL);
+}
+
+static int is_command(const char *key, struct protocol_capability **command)
+{
+	const char *out;
+
+	if (skip_prefix(key, "command=", &out)) {
+		struct protocol_capability *cmd = get_capability(out);
+
+		if (!cmd || !cmd->advertise(the_repository, NULL) || !cmd->command)
+			die("invalid command '%s'", out);
+		if (*command)
+			die("command already requested");
+
+		*command = cmd;
+		return 1;
+	}
+
+	return 0;
+}
+
+int has_capability(const struct argv_array *keys, const char *capability,
+		   const char **value)
+{
+	int i;
+	for (i = 0; i < keys->argc; i++) {
+		const char *out;
+		if (skip_prefix(keys->argv[i], capability, &out) &&
+		    (!*out || *out == '=')) {
+			if (value) {
+				if (*out == '=')
+					out++;
+				*value = out;
+			}
+			return 1;
+		}
+	}
+
+	return 0;
+}
+
+enum request_state {
+	PROCESS_REQUEST_KEYS = 0,
+	PROCESS_REQUEST_ARGS,
+	PROCESS_REQUEST_DONE,
+};
+
+static int process_request(void)
+{
+	enum request_state state = PROCESS_REQUEST_KEYS;
+	struct packet_reader reader;
+	struct argv_array keys = ARGV_ARRAY_INIT;
+	struct argv_array args = ARGV_ARRAY_INIT;
+	struct protocol_capability *command = NULL;
+
+	packet_reader_init(&reader, 0, NULL, 0,
+			   PACKET_READ_CHOMP_NEWLINE |
+			   PACKET_READ_GENTLE_ON_EOF);
+
+	/*
+	 * Check to see if the client closed their end before sending another
+	 * request.  If so we can terminate the connection.
+	 */
+	if (packet_reader_peek(&reader) == PACKET_READ_EOF)
+		return 1;
+	reader.options = PACKET_READ_CHOMP_NEWLINE;
+
+	while (state != PROCESS_REQUEST_DONE) {
+		switch (packet_reader_read(&reader)) {
+		case PACKET_READ_EOF:
+			BUG("Should have already died when seeing EOF");
+		case PACKET_READ_NORMAL:
+			break;
+		case PACKET_READ_FLUSH:
+			state = PROCESS_REQUEST_DONE;
+			continue;
+		case PACKET_READ_DELIM:
+			if (state != PROCESS_REQUEST_KEYS)
+				die("protocol error");
+			state = PROCESS_REQUEST_ARGS;
+			/*
+			 * maybe include a check to make sure that a
+			 * command/capabilities were given.
+			 */
+			continue;
+		}
+
+		switch (state) {
+		case PROCESS_REQUEST_KEYS:
+			/* collect request; a sequence of keys and values */
+			if (is_command(reader.line, &command) ||
+			    is_valid_capability(reader.line))
+				argv_array_push(&keys, reader.line);
+			else
+				die("unknown capability '%s'", reader.line);
+			break;
+		case PROCESS_REQUEST_ARGS:
+			/* collect arguments for the requested command */
+			argv_array_push(&args, reader.line);
+			break;
+		case PROCESS_REQUEST_DONE:
+			continue;
+		}
+	}
+
+	/*
+	 * If no command and no keys were given then the client wanted to
+	 * terminate the connection.
+	 */
+	if (!keys.argc && !args.argc)
+		return 1;
+
+	if (!command)
+		die("no command requested");
+
+	command->command(the_repository, &keys, &args);
+
+	argv_array_clear(&keys);
+	argv_array_clear(&args);
+	return 0;
+}
+
+/* Main serve loop for protocol version 2 */
+void serve(struct serve_options *options)
+{
+	if (options->advertise_capabilities || !options->stateless_rpc) {
+		/* serve by default supports v2 */
+		packet_write_fmt(1, "version 2\n");
+
+		advertise_capabilities();
+		/*
+		 * If only the list of capabilities was requested exit
+		 * immediately after advertising capabilities
+		 */
+		if (options->advertise_capabilities)
+			return;
+	}
+
+	/*
+	 * If stateless-rpc was requested then exit after
+	 * a single request/response exchange
+	 */
+	if (options->stateless_rpc) {
+		process_request();
+	} else {
+		for (;;)
+			if (process_request())
+				break;
+	}
+}
diff --git a/serve.h b/serve.h
new file mode 100644
index 000000000..fe65ba9f4
--- /dev/null
+++ b/serve.h
@@ -0,0 +1,15 @@
+#ifndef SERVE_H
+#define SERVE_H
+
+struct argv_array;
+extern int has_capability(const struct argv_array *keys, const char *capability,
+			  const char **value);
+
+struct serve_options {
+	unsigned advertise_capabilities;
+	unsigned stateless_rpc;
+};
+#define SERVE_OPTIONS_INIT { 0 }
+extern void serve(struct serve_options *options);
+
+#endif /* SERVE_H */
diff --git a/t/t5701-git-serve.sh b/t/t5701-git-serve.sh
new file mode 100755
index 000000000..affbad097
--- /dev/null
+++ b/t/t5701-git-serve.sh
@@ -0,0 +1,60 @@
+#!/bin/sh
+
+test_description='test git-serve and server commands'
+
+. ./test-lib.sh
+
+test_expect_success 'test capability advertisement' '
+	cat >expect <<-EOF &&
+	version 2
+	agent=git/$(git version | cut -d" " -f3)
+	0000
+	EOF
+
+	git serve --advertise-capabilities >out &&
+	test-pkt-line unpack <out >actual &&
+	test_cmp actual expect
+'
+
+test_expect_success 'stateless-rpc flag does not list capabilities' '
+	# Empty request
+	test-pkt-line pack >in <<-EOF &&
+	0000
+	EOF
+	git serve --stateless-rpc >out <in &&
+	test_must_be_empty out &&
+
+	# EOF
+	git serve --stateless-rpc >out &&
+	test_must_be_empty out
+'
+
+test_expect_success 'request invalid capability' '
+	test-pkt-line pack >in <<-EOF &&
+	foobar
+	0000
+	EOF
+	test_must_fail git serve --stateless-rpc 2>err <in &&
+	test_i18ngrep "unknown capability" err
+'
+
+test_expect_success 'request with no command' '
+	test-pkt-line pack >in <<-EOF &&
+	agent=git/test
+	0000
+	EOF
+	test_must_fail git serve --stateless-rpc 2>err <in &&
+	test_i18ngrep "no command requested" err
+'
+
+test_expect_success 'request invalid command' '
+	test-pkt-line pack >in <<-EOF &&
+	command=foo
+	agent=git/test
+	0000
+	EOF
+	test_must_fail git serve --stateless-rpc 2>err <in &&
+	test_i18ngrep "invalid command" err
+'
+
+test_done
-- 
2.16.2.395.g2e18187dfd-goog



* [PATCH v4 13/35] ls-refs: introduce ls-refs server command
  2018-02-28 23:22     ` [PATCH v4 " Brandon Williams
                         ` (11 preceding siblings ...)
  2018-02-28 23:22       ` [PATCH v4 12/35] serve: introduce git-serve Brandon Williams
@ 2018-02-28 23:22       ` Brandon Williams
  2018-03-02 21:13         ` Junio C Hamano
  2018-03-03  4:43         ` Jeff King
  2018-02-28 23:22       ` [PATCH v4 14/35] connect: request remote refs using v2 Brandon Williams
                         ` (22 subsequent siblings)
  35 siblings, 2 replies; 362+ messages in thread
From: Brandon Williams @ 2018-02-28 23:22 UTC (permalink / raw)
  To: git
  Cc: git, gitster, jrnieder, pclouds, peff, sbeller, stolee, Brandon Williams

Introduce the ls-refs server command.  In protocol v2, the ls-refs
command is used to request the ref advertisement from the server.  Since
it is a command which can be requested (as opposed to mandatory in v1),
a client can send a number of parameters in its request to limit the ref
advertisement based on provided ref-patterns.
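
As a rough illustration of the matching rule described above (an exact
match, or a prefix match up to the first '*'), here is a minimal
standalone sketch.  The helper name ref_pattern_matches() is
hypothetical and is not part of this patch; the patch itself precomputes
the wildcard position in add_pattern() instead of searching for it on
every comparison.

```c
#include <string.h>

/*
 * Hypothetical helper, for illustration only.
 *
 * Returns 1 if `refname` matches `pattern`, 0 otherwise.  A pattern
 * without a '*' must equal the refname exactly; otherwise everything
 * before the first '*' must be a prefix of the refname.
 */
static int ref_pattern_matches(const char *pattern, const char *refname)
{
	const char *wildcard = strchr(pattern, '*');

	/* No wildcard: exact match expected */
	if (!wildcard)
		return !strcmp(pattern, refname);

	/* Wildcard: compare only the bytes before the '*' */
	return !strncmp(pattern, refname, (size_t)(wildcard - pattern));
}
```

With this rule "refs/heads/*" accepts refs/heads/dev but not
refs/tags/one, and "refs/heads/master" accepts only that exact ref.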

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 Documentation/technical/protocol-v2.txt |  36 ++++++
 Makefile                                |   1 +
 ls-refs.c                               | 144 ++++++++++++++++++++++++
 ls-refs.h                               |   9 ++
 serve.c                                 |   8 ++
 t/t5701-git-serve.sh                    | 115 +++++++++++++++++++
 6 files changed, 313 insertions(+)
 create mode 100644 ls-refs.c
 create mode 100644 ls-refs.h

diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
index b02eefc21..7f50e6462 100644
--- a/Documentation/technical/protocol-v2.txt
+++ b/Documentation/technical/protocol-v2.txt
@@ -169,3 +169,39 @@ printable ASCII characters except space (i.e., the byte range 32 < x <
 "git/1.8.3.1"). The agent strings are purely informative for statistics
 and debugging purposes, and MUST NOT be used to programmatically assume
 the presence or absence of particular features.
+
+ ls-refs
+---------
+
+`ls-refs` is the command used to request a reference advertisement in v2.
+Unlike the current reference advertisement, ls-refs takes in arguments
+which can be used to limit the refs sent from the server.
+
+Additional features not supported in the base command will be advertised
+as the value of the command in the capability advertisement in the form
+of a space separated list of features, e.g.  "<command>=<feature 1>
+<feature 2>".
+
+ls-refs takes in the following arguments:
+
+    symrefs
+	In addition to the object it points to, show the underlying ref
+	that a symbolic ref points to.
+    peel
+	Show peeled tags.
+    ref-pattern <pattern>
+	When specified, only references matching one of the provided
+	patterns are displayed.  A pattern is either a valid refname
+	(e.g. refs/heads/master), in which case a ref must match the
+	pattern exactly, or a prefix of a ref followed by a single '*'
+	wildcard character (e.g. refs/heads/*), in which case a ref must
+	have a prefix equal to the pattern up to the wildcard character.
+
+The output of ls-refs is as follows:
+
+    output = *ref
+	     flush-pkt
+    ref = PKT-LINE(obj-id SP refname *(SP ref-attribute) LF)
+    ref-attribute = (symref | peeled)
+    symref = "symref-target:" symref-target
+    peeled = "peeled:" obj-id
diff --git a/Makefile b/Makefile
index 18c255428..e50927cfb 100644
--- a/Makefile
+++ b/Makefile
@@ -825,6 +825,7 @@ LIB_OBJS += list-objects-filter-options.o
 LIB_OBJS += ll-merge.o
 LIB_OBJS += lockfile.o
 LIB_OBJS += log-tree.o
+LIB_OBJS += ls-refs.o
 LIB_OBJS += mailinfo.o
 LIB_OBJS += mailmap.o
 LIB_OBJS += match-trees.o
diff --git a/ls-refs.c b/ls-refs.c
new file mode 100644
index 000000000..91d7deb34
--- /dev/null
+++ b/ls-refs.c
@@ -0,0 +1,144 @@
+#include "cache.h"
+#include "repository.h"
+#include "refs.h"
+#include "remote.h"
+#include "argv-array.h"
+#include "ls-refs.h"
+#include "pkt-line.h"
+
+struct ref_pattern {
+	char *pattern;
+	int wildcard_pos; /* If >= 0, the position of the wildcard; -1 if none */
+};
+
+struct pattern_list {
+	struct ref_pattern *patterns;
+	int nr;
+	int alloc;
+};
+
+static void add_pattern(struct pattern_list *patterns, const char *pattern)
+{
+	struct ref_pattern p;
+	const char *wildcard;
+
+	p.pattern = xstrdup(pattern);
+
+	wildcard = strchr(pattern, '*');
+	if (wildcard) {
+		p.wildcard_pos = wildcard - pattern;
+	} else {
+		p.wildcard_pos = -1;
+	}
+
+	ALLOC_GROW(patterns->patterns,
+		   patterns->nr + 1,
+		   patterns->alloc);
+	patterns->patterns[patterns->nr++] = p;
+}
+
+static void clear_patterns(struct pattern_list *patterns)
+{
+	int i;
+	for (i = 0; i < patterns->nr; i++)
+		free(patterns->patterns[i].pattern);
+	FREE_AND_NULL(patterns->patterns);
+	patterns->nr = 0;
+	patterns->alloc = 0;
+}
+
+/*
+ * Check if one of the patterns matches the ref, exactly or by prefix.
+ * If no patterns were provided, all refs match.
+ */
+static int ref_match(const struct pattern_list *patterns, const char *refname)
+{
+	int i;
+
+	if (!patterns->nr)
+		return 1; /* no restriction */
+
+	for (i = 0; i < patterns->nr; i++) {
+		const struct ref_pattern *p = &patterns->patterns[i];
+
+		/* No wildcard, exact match expected */
+		if (p->wildcard_pos < 0) {
+			if (!strcmp(refname, p->pattern))
+				return 1;
+		} else {
+			/* Wildcard, prefix match until the wildcard */
+			if (!strncmp(refname, p->pattern, p->wildcard_pos))
+				return 1;
+		}
+	}
+
+	return 0;
+}
+
+struct ls_refs_data {
+	unsigned peel;
+	unsigned symrefs;
+	struct pattern_list patterns;
+};
+
+static int send_ref(const char *refname, const struct object_id *oid,
+		    int flag, void *cb_data)
+{
+	struct ls_refs_data *data = cb_data;
+	const char *refname_nons = strip_namespace(refname);
+	struct strbuf refline = STRBUF_INIT;
+
+	if (!ref_match(&data->patterns, refname))
+		return 0;
+
+	strbuf_addf(&refline, "%s %s", oid_to_hex(oid), refname_nons);
+	if (data->symrefs && flag & REF_ISSYMREF) {
+		struct object_id unused;
+		const char *symref_target = resolve_ref_unsafe(refname, 0,
+							       &unused,
+							       &flag);
+
+		if (!symref_target)
+			die("'%s' is a symref but it is not?", refname);
+
+		strbuf_addf(&refline, " symref-target:%s", symref_target);
+	}
+
+	if (data->peel) {
+		struct object_id peeled;
+		if (!peel_ref(refname, &peeled))
+			strbuf_addf(&refline, " peeled:%s", oid_to_hex(&peeled));
+	}
+
+	strbuf_addch(&refline, '\n');
+	packet_write(1, refline.buf, refline.len);
+
+	strbuf_release(&refline);
+	return 0;
+}
+
+int ls_refs(struct repository *r, struct argv_array *keys, struct argv_array *args)
+{
+	int i;
+	struct ls_refs_data data;
+
+	memset(&data, 0, sizeof(data));
+
+	for (i = 0; i < args->argc; i++) {
+		const char *arg = args->argv[i];
+		const char *out;
+
+		if (!strcmp("peel", arg))
+			data.peel = 1;
+		else if (!strcmp("symrefs", arg))
+			data.symrefs = 1;
+		else if (skip_prefix(arg, "ref-pattern ", &out))
+			add_pattern(&data.patterns, out);
+	}
+
+	head_ref_namespaced(send_ref, &data);
+	for_each_namespaced_ref(send_ref, &data);
+	packet_flush(1);
+	clear_patterns(&data.patterns);
+	return 0;
+}
diff --git a/ls-refs.h b/ls-refs.h
new file mode 100644
index 000000000..9e4c57bfe
--- /dev/null
+++ b/ls-refs.h
@@ -0,0 +1,9 @@
+#ifndef LS_REFS_H
+#define LS_REFS_H
+
+struct repository;
+struct argv_array;
+extern int ls_refs(struct repository *r, struct argv_array *keys,
+		   struct argv_array *args);
+
+#endif /* LS_REFS_H */
diff --git a/serve.c b/serve.c
index cf23179b9..c7925c5c7 100644
--- a/serve.c
+++ b/serve.c
@@ -4,8 +4,15 @@
 #include "pkt-line.h"
 #include "version.h"
 #include "argv-array.h"
+#include "ls-refs.h"
 #include "serve.h"
 
+static int always_advertise(struct repository *r,
+			    struct strbuf *value)
+{
+	return 1;
+}
+
 static int agent_advertise(struct repository *r,
 			   struct strbuf *value)
 {
@@ -44,6 +51,7 @@ struct protocol_capability {
 
 static struct protocol_capability capabilities[] = {
 	{ "agent", agent_advertise, NULL },
+	{ "ls-refs", always_advertise, ls_refs },
 };
 
 static void advertise_capabilities(void)
diff --git a/t/t5701-git-serve.sh b/t/t5701-git-serve.sh
index affbad097..11aeb0541 100755
--- a/t/t5701-git-serve.sh
+++ b/t/t5701-git-serve.sh
@@ -8,6 +8,7 @@ test_expect_success 'test capability advertisement' '
 	cat >expect <<-EOF &&
 	version 2
 	agent=git/$(git version | cut -d" " -f3)
+	ls-refs
 	0000
 	EOF
 
@@ -57,4 +58,118 @@ test_expect_success 'request invalid command' '
 	test_i18ngrep "invalid command" err
 '
 
+# Test the basics of ls-refs
+#
+test_expect_success 'setup some refs and tags' '
+	test_commit one &&
+	git branch dev master &&
+	test_commit two &&
+	git symbolic-ref refs/heads/release refs/heads/master &&
+	git tag -a -m "annotated tag" annotated-tag
+'
+
+test_expect_success 'basics of ls-refs' '
+	test-pkt-line pack >in <<-EOF &&
+	command=ls-refs
+	0000
+	EOF
+
+	cat >expect <<-EOF &&
+	$(git rev-parse HEAD) HEAD
+	$(git rev-parse refs/heads/dev) refs/heads/dev
+	$(git rev-parse refs/heads/master) refs/heads/master
+	$(git rev-parse refs/heads/release) refs/heads/release
+	$(git rev-parse refs/tags/annotated-tag) refs/tags/annotated-tag
+	$(git rev-parse refs/tags/one) refs/tags/one
+	$(git rev-parse refs/tags/two) refs/tags/two
+	0000
+	EOF
+
+	git serve --stateless-rpc <in >out &&
+	test-pkt-line unpack <out >actual &&
+	test_cmp actual expect
+'
+
+test_expect_success 'basic ref-patterns' '
+	test-pkt-line pack >in <<-EOF &&
+	command=ls-refs
+	0001
+	ref-pattern refs/heads/master
+	ref-pattern refs/tags/one
+	0000
+	EOF
+
+	cat >expect <<-EOF &&
+	$(git rev-parse refs/heads/master) refs/heads/master
+	$(git rev-parse refs/tags/one) refs/tags/one
+	0000
+	EOF
+
+	git serve --stateless-rpc <in >out &&
+	test-pkt-line unpack <out >actual &&
+	test_cmp actual expect
+'
+
+test_expect_success 'wildcard ref-patterns' '
+	test-pkt-line pack >in <<-EOF &&
+	command=ls-refs
+	0001
+	ref-pattern refs/heads/*
+	0000
+	EOF
+
+	cat >expect <<-EOF &&
+	$(git rev-parse refs/heads/dev) refs/heads/dev
+	$(git rev-parse refs/heads/master) refs/heads/master
+	$(git rev-parse refs/heads/release) refs/heads/release
+	0000
+	EOF
+
+	git serve --stateless-rpc <in >out &&
+	test-pkt-line unpack <out >actual &&
+	test_cmp actual expect
+'
+
+test_expect_success 'peel parameter' '
+	test-pkt-line pack >in <<-EOF &&
+	command=ls-refs
+	0001
+	peel
+	ref-pattern refs/tags/*
+	0000
+	EOF
+
+	cat >expect <<-EOF &&
+	$(git rev-parse refs/tags/annotated-tag) refs/tags/annotated-tag peeled:$(git rev-parse refs/tags/annotated-tag^{})
+	$(git rev-parse refs/tags/one) refs/tags/one
+	$(git rev-parse refs/tags/two) refs/tags/two
+	0000
+	EOF
+
+	git serve --stateless-rpc <in >out &&
+	test-pkt-line unpack <out >actual &&
+	test_cmp actual expect
+'
+
+test_expect_success 'symrefs parameter' '
+	test-pkt-line pack >in <<-EOF &&
+	command=ls-refs
+	0001
+	symrefs
+	ref-pattern refs/heads/*
+	0000
+	EOF
+
+	cat >expect <<-EOF &&
+	$(git rev-parse refs/heads/dev) refs/heads/dev
+	$(git rev-parse refs/heads/master) refs/heads/master
+	$(git rev-parse refs/heads/release) refs/heads/release symref-target:refs/heads/master
+	0000
+	EOF
+
+	git serve --stateless-rpc <in >out &&
+	test-pkt-line unpack <out >actual &&
+	test_cmp actual expect
+'
+
 test_done
-- 
2.16.2.395.g2e18187dfd-goog



* [PATCH v4 14/35] connect: request remote refs using v2
  2018-02-28 23:22     ` [PATCH v4 " Brandon Williams
                         ` (12 preceding siblings ...)
  2018-02-28 23:22       ` [PATCH v4 13/35] ls-refs: introduce ls-refs server command Brandon Williams
@ 2018-02-28 23:22       ` Brandon Williams
  2018-02-28 23:22       ` [PATCH v4 15/35] transport: convert get_refs_list to take a list of ref patterns Brandon Williams
                         ` (21 subsequent siblings)
  35 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-02-28 23:22 UTC (permalink / raw)
  To: git
  Cc: git, gitster, jrnieder, pclouds, peff, sbeller, stolee, Brandon Williams

Teach the client to request a remote's refs using protocol v2.  This is
done by having the client issue an 'ls-refs' request to a v2
server.
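
For reference, a request like the one described above can be sketched
as a sequence of pkt-lines: the command, a delim-pkt (0001), the
arguments, and a flush-pkt (0000).  The sketch below is standalone and
only approximates what get_remote_refs() sends; struct pkt_buf,
pkt_raw(), pkt_line() and build_ls_refs_request() are hypothetical
names, the buffer size is arbitrary, and the optional agent line is
left out.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical fixed-size buffer holding an encoded request. */
struct pkt_buf {
	char data[256];
	size_t len;
};

/* Append raw bytes (e.g. "0000" or "0001"); returns -1 if they don't fit. */
static int pkt_raw(struct pkt_buf *b, const char *bytes, size_t n)
{
	if (b->len + n >= sizeof(b->data))
		return -1;
	memcpy(b->data + b->len, bytes, n);
	b->len += n;
	b->data[b->len] = '\0';
	return 0;
}

/*
 * Append one pkt-line: a 4-digit hex length, which counts the 4 length
 * bytes themselves, followed by the payload.
 */
static int pkt_line(struct pkt_buf *b, const char *payload)
{
	char hdr[5];
	size_t n = strlen(payload);

	if (b->len + 4 + n >= sizeof(b->data))
		return -1;
	snprintf(hdr, sizeof(hdr), "%04x", (unsigned)(n + 4));
	pkt_raw(b, hdr, 4);
	return pkt_raw(b, payload, n);
}

/* Build "command=ls-refs", delim, arguments, flush. */
static int build_ls_refs_request(struct pkt_buf *b, int want_peel,
				 const char *pattern)
{
	char line[128];

	b->len = 0;
	b->data[0] = '\0';

	if (pkt_line(b, "command=ls-refs\n") || pkt_raw(b, "0001", 4))
		return -1;
	if (want_peel && pkt_line(b, "peel\n"))
		return -1;
	if (pkt_line(b, "symrefs\n"))
		return -1;
	if (pattern) {
		int n = snprintf(line, sizeof(line), "ref-pattern %s\n", pattern);
		if (n < 0 || (size_t)n >= sizeof(line) || pkt_line(b, line))
			return -1;
	}
	return pkt_raw(b, "0000", 4);
}
```

Calling build_ls_refs_request(&b, 1, "refs/heads/*") leaves the encoded
request in b.data; the first pkt-line is "0014command=ls-refs\n", since
the 16-byte payload plus the 4 length bytes is 20, or 0x14.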

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 builtin/upload-pack.c  |  10 +--
 connect.c              | 138 +++++++++++++++++++++++++++++++++++++++--
 connect.h              |   2 +
 remote.h               |   6 ++
 t/t5702-protocol-v2.sh |  57 +++++++++++++++++
 transport.c            |   2 +-
 6 files changed, 204 insertions(+), 11 deletions(-)
 create mode 100755 t/t5702-protocol-v2.sh

diff --git a/builtin/upload-pack.c b/builtin/upload-pack.c
index 8d53e9794..a757df8da 100644
--- a/builtin/upload-pack.c
+++ b/builtin/upload-pack.c
@@ -5,6 +5,7 @@
 #include "parse-options.h"
 #include "protocol.h"
 #include "upload-pack.h"
+#include "serve.h"
 
 static const char * const upload_pack_usage[] = {
 	N_("git upload-pack [<options>] <dir>"),
@@ -16,6 +17,7 @@ int cmd_upload_pack(int argc, const char **argv, const char *prefix)
 	const char *dir;
 	int strict = 0;
 	struct upload_pack_options opts = { 0 };
+	struct serve_options serve_opts = SERVE_OPTIONS_INIT;
 	struct option options[] = {
 		OPT_BOOL(0, "stateless-rpc", &opts.stateless_rpc,
 			 N_("quit after a single request/response exchange")),
@@ -48,11 +50,9 @@ int cmd_upload_pack(int argc, const char **argv, const char *prefix)
 
 	switch (determine_protocol_version_server()) {
 	case protocol_v2:
-		/*
-		 * fetch support for protocol v2 has not been implemented yet,
-		 * so ignore the request to use v2 and fallback to using v0.
-		 */
-		upload_pack(&opts);
+		serve_opts.advertise_capabilities = opts.advertise_refs;
+		serve_opts.stateless_rpc = opts.stateless_rpc;
+		serve(&serve_opts);
 		break;
 	case protocol_v1:
 		/*
diff --git a/connect.c b/connect.c
index 4b89b984c..6203ce576 100644
--- a/connect.c
+++ b/connect.c
@@ -12,9 +12,11 @@
 #include "sha1-array.h"
 #include "transport.h"
 #include "strbuf.h"
+#include "version.h"
 #include "protocol.h"
 
-static char *server_capabilities;
+static char *server_capabilities_v1;
+static struct argv_array server_capabilities_v2 = ARGV_ARRAY_INIT;
 static const char *parse_feature_value(const char *, const char *, int *);
 
 static int check_ref(const char *name, unsigned int flags)
@@ -62,6 +64,33 @@ static void die_initial_contact(int unexpected)
 		      "and the repository exists."));
 }
 
+/* Checks if the server supports the capability 'c' */
+int server_supports_v2(const char *c, int die_on_error)
+{
+	int i;
+
+	for (i = 0; i < server_capabilities_v2.argc; i++) {
+		const char *out;
+		if (skip_prefix(server_capabilities_v2.argv[i], c, &out) &&
+		    (!*out || *out == '='))
+			return 1;
+	}
+
+	if (die_on_error)
+		die("server doesn't support '%s'", c);
+
+	return 0;
+}
+
+static void process_capabilities_v2(struct packet_reader *reader)
+{
+	while (packet_reader_read(reader) == PACKET_READ_NORMAL)
+		argv_array_push(&server_capabilities_v2, reader->line);
+
+	if (reader->status != PACKET_READ_FLUSH)
+		die("expected flush after capabilities");
+}
+
 enum protocol_version discover_version(struct packet_reader *reader)
 {
 	enum protocol_version version = protocol_unknown_version;
@@ -84,7 +113,7 @@ enum protocol_version discover_version(struct packet_reader *reader)
 
 	switch (version) {
 	case protocol_v2:
-		die("support for protocol v2 not implemented yet");
+		process_capabilities_v2(reader);
 		break;
 	case protocol_v1:
 		/* Read the peeked version line */
@@ -128,7 +157,7 @@ static void parse_one_symref_info(struct string_list *symref, const char *val, i
 static void annotate_refs_with_symref_info(struct ref *ref)
 {
 	struct string_list symref = STRING_LIST_INIT_DUP;
-	const char *feature_list = server_capabilities;
+	const char *feature_list = server_capabilities_v1;
 
 	while (feature_list) {
 		int len;
@@ -157,7 +186,7 @@ static void process_capabilities(const char *line, int *len)
 	int nul_location = strlen(line);
 	if (nul_location == *len)
 		return;
-	server_capabilities = xstrdup(line + nul_location + 1);
+	server_capabilities_v1 = xstrdup(line + nul_location + 1);
 	*len = nul_location;
 }
 
@@ -292,6 +321,105 @@ struct ref **get_remote_heads(struct packet_reader *reader,
 	return list;
 }
 
+/* Returns 1 when a valid ref has been added to `list`, 0 otherwise */
+static int process_ref_v2(const char *line, struct ref ***list)
+{
+	int ret = 1;
+	int i = 0;
+	struct object_id old_oid;
+	struct ref *ref;
+	struct string_list line_sections = STRING_LIST_INIT_DUP;
+	const char *end;
+
+	/*
+	 * Ref lines have a number of fields which are space delimited.  The
+	 * first field is the OID of the ref.  The second field is the ref
+	 * name.  Subsequent fields (symref-target and peeled) are optional and
+	 * don't have a particular order.
+	 */
+	if (string_list_split(&line_sections, line, ' ', -1) < 2) {
+		ret = 0;
+		goto out;
+	}
+
+	if (parse_oid_hex(line_sections.items[i++].string, &old_oid, &end) ||
+	    *end) {
+		ret = 0;
+		goto out;
+	}
+
+	ref = alloc_ref(line_sections.items[i++].string);
+
+	oidcpy(&ref->old_oid, &old_oid);
+	**list = ref;
+	*list = &ref->next;
+
+	for (; i < line_sections.nr; i++) {
+		const char *arg = line_sections.items[i].string;
+		if (skip_prefix(arg, "symref-target:", &arg))
+			ref->symref = xstrdup(arg);
+
+		if (skip_prefix(arg, "peeled:", &arg)) {
+			struct object_id peeled_oid;
+			char *peeled_name;
+			struct ref *peeled;
+			if (parse_oid_hex(arg, &peeled_oid, &end) || *end) {
+				ret = 0;
+				goto out;
+			}
+
+			peeled_name = xstrfmt("%s^{}", ref->name);
+			peeled = alloc_ref(peeled_name);
+
+			oidcpy(&peeled->old_oid, &peeled_oid);
+			**list = peeled;
+			*list = &peeled->next;
+
+			free(peeled_name);
+		}
+	}
+
+out:
+	string_list_clear(&line_sections, 0);
+	return ret;
+}
+
+struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
+			     struct ref **list, int for_push,
+			     const struct argv_array *ref_patterns)
+{
+	int i;
+	*list = NULL;
+
+	if (server_supports_v2("ls-refs", 1))
+		packet_write_fmt(fd_out, "command=ls-refs\n");
+
+	if (server_supports_v2("agent", 0))
+		packet_write_fmt(fd_out, "agent=%s", git_user_agent_sanitized());
+
+	packet_delim(fd_out);
+	/* When pushing we don't want to request the peeled tags */
+	if (!for_push)
+		packet_write_fmt(fd_out, "peel\n");
+	packet_write_fmt(fd_out, "symrefs\n");
+	for (i = 0; ref_patterns && i < ref_patterns->argc; i++) {
+		packet_write_fmt(fd_out, "ref-pattern %s\n",
+				 ref_patterns->argv[i]);
+	}
+	packet_flush(fd_out);
+
+	/* Process response from server */
+	while (packet_reader_read(reader) == PACKET_READ_NORMAL) {
+		if (!process_ref_v2(reader->line, &list))
+			die("invalid ls-refs response: %s", reader->line);
+	}
+
+	if (reader->status != PACKET_READ_FLUSH)
+		die("expected flush after ref listing");
+
+	return list;
+}
+
 static const char *parse_feature_value(const char *feature_list, const char *feature, int *lenp)
 {
 	int len;
@@ -336,7 +464,7 @@ int parse_feature_request(const char *feature_list, const char *feature)
 
 const char *server_feature_value(const char *feature, int *len)
 {
-	return parse_feature_value(server_capabilities, feature, len);
+	return parse_feature_value(server_capabilities_v1, feature, len);
 }
 
 int server_supports(const char *feature)
diff --git a/connect.h b/connect.h
index cdb8979dc..8898d4495 100644
--- a/connect.h
+++ b/connect.h
@@ -16,4 +16,6 @@ extern int url_is_local_not_ssh(const char *url);
 struct packet_reader;
 extern enum protocol_version discover_version(struct packet_reader *reader);
 
+extern int server_supports_v2(const char *c, int die_on_error);
+
 #endif
diff --git a/remote.h b/remote.h
index 2016461df..3a9db30cf 100644
--- a/remote.h
+++ b/remote.h
@@ -151,11 +151,17 @@ void free_refs(struct ref *ref);
 
 struct oid_array;
 struct packet_reader;
+struct argv_array;
 extern struct ref **get_remote_heads(struct packet_reader *reader,
 				     struct ref **list, unsigned int flags,
 				     struct oid_array *extra_have,
 				     struct oid_array *shallow_points);
 
+/* Used for protocol v2 in order to retrieve refs from a remote */
+extern struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
+				    struct ref **list, int for_push,
+				    const struct argv_array *ref_patterns);
+
 int resolve_remote_symref(struct ref *ref, struct ref *list);
 int ref_newer(const struct object_id *new_oid, const struct object_id *old_oid);
 
diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh
new file mode 100755
index 000000000..dc5f813be
--- /dev/null
+++ b/t/t5702-protocol-v2.sh
@@ -0,0 +1,57 @@
+#!/bin/sh
+
+test_description='test git wire-protocol version 2'
+
+TEST_NO_CREATE_REPO=1
+
+. ./test-lib.sh
+
+# Test protocol v2 with 'git://' transport
+#
+. "$TEST_DIRECTORY"/lib-git-daemon.sh
+start_git_daemon --export-all --enable=receive-pack
+daemon_parent=$GIT_DAEMON_DOCUMENT_ROOT_PATH/parent
+
+test_expect_success 'create repo to be served by git-daemon' '
+	git init "$daemon_parent" &&
+	test_commit -C "$daemon_parent" one
+'
+
+test_expect_success 'list refs with git:// using protocol v2' '
+	test_when_finished "rm -f log" &&
+
+	GIT_TRACE_PACKET="$(pwd)/log" git -c protocol.version=2 \
+		ls-remote --symref "$GIT_DAEMON_URL/parent" >actual &&
+
+	# Client requested to use protocol v2
+	grep "git> .*\\\0\\\0version=2\\\0$" log &&
+	# Server responded using protocol v2
+	grep "git< version 2" log &&
+
+	git ls-remote --symref "$GIT_DAEMON_URL/parent" >expect &&
+	test_cmp actual expect
+'
+
+stop_git_daemon
+
+# Test protocol v2 with 'file://' transport
+#
+test_expect_success 'create repo to be served by file:// transport' '
+	git init file_parent &&
+	test_commit -C file_parent one
+'
+
+test_expect_success 'list refs with file:// using protocol v2' '
+	test_when_finished "rm -f log" &&
+
+	GIT_TRACE_PACKET="$(pwd)/log" git -c protocol.version=2 \
+		ls-remote --symref "file://$(pwd)/file_parent" >actual &&
+
+	# Server responded using protocol v2
+	grep "git< version 2" log &&
+
+	git ls-remote --symref "file://$(pwd)/file_parent" >expect &&
+	test_cmp actual expect
+'
+
+test_done
diff --git a/transport.c b/transport.c
index 83d9dd1df..ffc6b2614 100644
--- a/transport.c
+++ b/transport.c
@@ -204,7 +204,7 @@ static struct ref *get_refs_via_connect(struct transport *transport, int for_pus
 	data->version = discover_version(&reader);
 	switch (data->version) {
 	case protocol_v2:
-		die("support for protocol v2 not implemented yet");
+		get_remote_refs(data->fd[1], &reader, &refs, for_push, NULL);
 		break;
 	case protocol_v1:
 	case protocol_v0:
-- 
2.16.2.395.g2e18187dfd-goog



* [PATCH v4 15/35] transport: convert get_refs_list to take a list of ref patterns
  2018-02-28 23:22     ` [PATCH v4 " Brandon Williams
                         ` (13 preceding siblings ...)
  2018-02-28 23:22       ` [PATCH v4 14/35] connect: request remote refs using v2 Brandon Williams
@ 2018-02-28 23:22       ` Brandon Williams
  2018-02-28 23:22       ` [PATCH v4 16/35] transport: convert transport_get_remote_refs " Brandon Williams
                         ` (20 subsequent siblings)
  35 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-02-28 23:22 UTC (permalink / raw)
  To: git
  Cc: git, gitster, jrnieder, pclouds, peff, sbeller, stolee, Brandon Williams

Convert the 'struct transport' virtual function 'get_refs_list()' to
optionally take an argv_array of ref patterns.  When communicating with
a server using protocol v2, these ref patterns can be sent when
requesting a listing of its refs, allowing the server to filter the
refs it sends based on those patterns.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 transport-helper.c   |  5 +++--
 transport-internal.h |  9 ++++++++-
 transport.c          | 16 +++++++++-------
 3 files changed, 20 insertions(+), 10 deletions(-)

diff --git a/transport-helper.c b/transport-helper.c
index 508015023..4c334b5ee 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -1026,7 +1026,8 @@ static int has_attribute(const char *attrs, const char *attr) {
 	}
 }
 
-static struct ref *get_refs_list(struct transport *transport, int for_push)
+static struct ref *get_refs_list(struct transport *transport, int for_push,
+				 const struct argv_array *ref_patterns)
 {
 	struct helper_data *data = transport->data;
 	struct child_process *helper;
@@ -1039,7 +1040,7 @@ static struct ref *get_refs_list(struct transport *transport, int for_push)
 
 	if (process_connect(transport, for_push)) {
 		do_take_over(transport);
-		return transport->vtable->get_refs_list(transport, for_push);
+		return transport->vtable->get_refs_list(transport, for_push, ref_patterns);
 	}
 
 	if (data->push && for_push)
diff --git a/transport-internal.h b/transport-internal.h
index 3c1a29d72..36fcee437 100644
--- a/transport-internal.h
+++ b/transport-internal.h
@@ -3,6 +3,7 @@
 
 struct ref;
 struct transport;
+struct argv_array;
 
 struct transport_vtable {
 	/**
@@ -17,11 +18,17 @@ struct transport_vtable {
 	 * the transport to try to share connections, for_push is a
 	 * hint as to whether the ultimate operation is a push or a fetch.
 	 *
+	 * If communicating using protocol v2 a list of patterns can be
+	 * provided to be sent to the server to enable it to limit the ref
+	 * advertisement.  Since ref filtering is done on the server's end,
+	 * this can return refs which don't match the provided ref_patterns.
+	 *
 	 * If the transport is able to determine the remote hash for
 	 * the ref without a huge amount of effort, it should store it
 	 * in the ref's old_sha1 field; otherwise it should be all 0.
 	 **/
-	struct ref *(*get_refs_list)(struct transport *transport, int for_push);
+	struct ref *(*get_refs_list)(struct transport *transport, int for_push,
+				     const struct argv_array *ref_patterns);
 
 	/**
 	 * Fetch the objects for the given refs. Note that this gets
diff --git a/transport.c b/transport.c
index ffc6b2614..c54a44630 100644
--- a/transport.c
+++ b/transport.c
@@ -72,7 +72,7 @@ struct bundle_transport_data {
 	struct bundle_header header;
 };
 
-static struct ref *get_refs_from_bundle(struct transport *transport, int for_push)
+static struct ref *get_refs_from_bundle(struct transport *transport, int for_push, const struct argv_array *ref_patterns)
 {
 	struct bundle_transport_data *data = transport->data;
 	struct ref *result = NULL;
@@ -189,7 +189,8 @@ static int connect_setup(struct transport *transport, int for_push)
 	return 0;
 }
 
-static struct ref *get_refs_via_connect(struct transport *transport, int for_push)
+static struct ref *get_refs_via_connect(struct transport *transport, int for_push,
+					const struct argv_array *ref_patterns)
 {
 	struct git_transport_data *data = transport->data;
 	struct ref *refs = NULL;
@@ -204,7 +205,8 @@ static struct ref *get_refs_via_connect(struct transport *transport, int for_pus
 	data->version = discover_version(&reader);
 	switch (data->version) {
 	case protocol_v2:
-		get_remote_refs(data->fd[1], &reader, &refs, for_push, NULL);
+		get_remote_refs(data->fd[1], &reader, &refs, for_push,
+				ref_patterns);
 		break;
 	case protocol_v1:
 	case protocol_v0:
@@ -250,7 +252,7 @@ static int fetch_refs_via_pack(struct transport *transport,
 	args.update_shallow = data->options.update_shallow;
 
 	if (!data->got_remote_heads)
-		refs_tmp = get_refs_via_connect(transport, 0);
+		refs_tmp = get_refs_via_connect(transport, 0, NULL);
 
 	switch (data->version) {
 	case protocol_v2:
@@ -568,7 +570,7 @@ static int git_transport_push(struct transport *transport, struct ref *remote_re
 	int ret = 0;
 
 	if (!data->got_remote_heads)
-		get_refs_via_connect(transport, 1);
+		get_refs_via_connect(transport, 1, NULL);
 
 	memset(&args, 0, sizeof(args));
 	args.send_mirror = !!(flags & TRANSPORT_PUSH_MIRROR);
@@ -1028,7 +1030,7 @@ int transport_push(struct transport *transport,
 		if (check_push_refs(local_refs, refspec_nr, refspec) < 0)
 			return -1;
 
-		remote_refs = transport->vtable->get_refs_list(transport, 1);
+		remote_refs = transport->vtable->get_refs_list(transport, 1, NULL);
 
 		if (flags & TRANSPORT_PUSH_ALL)
 			match_flags |= MATCH_REFS_ALL;
@@ -1137,7 +1139,7 @@ int transport_push(struct transport *transport,
 const struct ref *transport_get_remote_refs(struct transport *transport)
 {
 	if (!transport->got_remote_refs) {
-		transport->remote_refs = transport->vtable->get_refs_list(transport, 0);
+		transport->remote_refs = transport->vtable->get_refs_list(transport, 0, NULL);
 		transport->got_remote_refs = 1;
 	}
 
-- 
2.16.2.395.g2e18187dfd-goog



* [PATCH v4 16/35] transport: convert transport_get_remote_refs to take a list of ref patterns
  2018-02-28 23:22     ` [PATCH v4 " Brandon Williams
                         ` (14 preceding siblings ...)
  2018-02-28 23:22       ` [PATCH v4 15/35] transport: convert get_refs_list to take a list of ref patterns Brandon Williams
@ 2018-02-28 23:22       ` Brandon Williams
  2018-03-13 16:00         ` Jonathan Tan
  2018-02-28 23:22       ` [PATCH v4 17/35] ls-remote: pass ref patterns when requesting a remote's refs Brandon Williams
                         ` (19 subsequent siblings)
  35 siblings, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-02-28 23:22 UTC (permalink / raw)
  To: git
  Cc: git, gitster, jrnieder, pclouds, peff, sbeller, stolee, Brandon Williams

Convert 'transport_get_remote_refs()' to optionally take a list of ref
patterns.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 builtin/clone.c     |  2 +-
 builtin/fetch.c     |  4 ++--
 builtin/ls-remote.c |  2 +-
 builtin/remote.c    |  2 +-
 transport.c         |  7 +++++--
 transport.h         | 12 +++++++++++-
 6 files changed, 21 insertions(+), 8 deletions(-)

diff --git a/builtin/clone.c b/builtin/clone.c
index 284651797..6e77d993f 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -1121,7 +1121,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 	if (transport->smart_options && !deepen)
 		transport->smart_options->check_self_contained_and_connected = 1;
 
-	refs = transport_get_remote_refs(transport);
+	refs = transport_get_remote_refs(transport, NULL);
 
 	if (refs) {
 		mapped_refs = wanted_peer_refs(refs, refspec);
diff --git a/builtin/fetch.c b/builtin/fetch.c
index 7bbcd26fa..850382f55 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -250,7 +250,7 @@ static void find_non_local_tags(struct transport *transport,
 	struct string_list_item *item = NULL;
 
 	for_each_ref(add_existing, &existing_refs);
-	for (ref = transport_get_remote_refs(transport); ref; ref = ref->next) {
+	for (ref = transport_get_remote_refs(transport, NULL); ref; ref = ref->next) {
 		if (!starts_with(ref->name, "refs/tags/"))
 			continue;
 
@@ -336,7 +336,7 @@ static struct ref *get_ref_map(struct transport *transport,
 	/* opportunistically-updated references: */
 	struct ref *orefs = NULL, **oref_tail = &orefs;
 
-	const struct ref *remote_refs = transport_get_remote_refs(transport);
+	const struct ref *remote_refs = transport_get_remote_refs(transport, NULL);
 
 	if (refspec_count) {
 		struct refspec *fetch_refspec;
diff --git a/builtin/ls-remote.c b/builtin/ls-remote.c
index c4be98ab9..c6e9847c5 100644
--- a/builtin/ls-remote.c
+++ b/builtin/ls-remote.c
@@ -96,7 +96,7 @@ int cmd_ls_remote(int argc, const char **argv, const char *prefix)
 	if (uploadpack != NULL)
 		transport_set_option(transport, TRANS_OPT_UPLOADPACK, uploadpack);
 
-	ref = transport_get_remote_refs(transport);
+	ref = transport_get_remote_refs(transport, NULL);
 	if (transport_disconnect(transport))
 		return 1;
 
diff --git a/builtin/remote.c b/builtin/remote.c
index d95bf904c..d0b6ff6e2 100644
--- a/builtin/remote.c
+++ b/builtin/remote.c
@@ -862,7 +862,7 @@ static int get_remote_ref_states(const char *name,
 	if (query) {
 		transport = transport_get(states->remote, states->remote->url_nr > 0 ?
 			states->remote->url[0] : NULL);
-		remote_refs = transport_get_remote_refs(transport);
+		remote_refs = transport_get_remote_refs(transport, NULL);
 		transport_disconnect(transport);
 
 		states->queried = 1;
diff --git a/transport.c b/transport.c
index c54a44630..dfc603b36 100644
--- a/transport.c
+++ b/transport.c
@@ -1136,10 +1136,13 @@ int transport_push(struct transport *transport,
 	return 1;
 }
 
-const struct ref *transport_get_remote_refs(struct transport *transport)
+const struct ref *transport_get_remote_refs(struct transport *transport,
+					    const struct argv_array *ref_patterns)
 {
 	if (!transport->got_remote_refs) {
-		transport->remote_refs = transport->vtable->get_refs_list(transport, 0, NULL);
+		transport->remote_refs =
+			transport->vtable->get_refs_list(transport, 0,
+							 ref_patterns);
 		transport->got_remote_refs = 1;
 	}
 
diff --git a/transport.h b/transport.h
index 731c78b67..daea4770c 100644
--- a/transport.h
+++ b/transport.h
@@ -178,7 +178,17 @@ int transport_push(struct transport *connection,
 		   int refspec_nr, const char **refspec, int flags,
 		   unsigned int * reject_reasons);
 
-const struct ref *transport_get_remote_refs(struct transport *transport);
+/*
+ * Retrieve refs from a remote.
+ *
+ * Optionally a list of ref patterns can be provided which can be sent to the
+ * server (when communicating using protocol v2) to enable it to limit the ref
+ * advertisement.  Since ref filtering is done on the server's end (and only
+ * when using protocol v2), this can return refs which don't match the provided
+ * ref_patterns.
+ */
+const struct ref *transport_get_remote_refs(struct transport *transport,
+					    const struct argv_array *ref_patterns);
 
 int transport_fetch_refs(struct transport *transport, struct ref *refs);
 void transport_unlock_pack(struct transport *transport);
-- 
2.16.2.395.g2e18187dfd-goog


^ permalink raw reply related	[flat|nested] 362+ messages in thread
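
A note on the header comment in this patch: since filtering happens on the server and only under protocol v2, a caller cannot assume that every returned ref matches what it asked for. The sketch below shows the kind of client-side re-check a caller might keep. It is not code from the series: `ref_matches_any` is a hypothetical helper, and POSIX `fnmatch()` is used as a stand-in for git's own `wildmatch()`.

```c
#include <fnmatch.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of the patch): return 1 if 'refname'
 * matches at least one glob in the NULL-terminated 'patterns' array,
 * or if no patterns were supplied at all.  POSIX fnmatch() stands in
 * for git's wildmatch() here.
 */
static int ref_matches_any(const char *refname, const char *const *patterns)
{
	char path[1024];
	const char *const *p;
	int n;

	if (!patterns || !*patterns)
		return 1; /* no patterns: nothing to filter on */

	/* Prefix a slash so that tail-style globs can match the full name. */
	n = snprintf(path, sizeof(path), "/%s", refname);
	if (n < 0 || n >= (int)sizeof(path))
		return 0; /* refname too long for this sketch */

	for (p = patterns; *p; p++)
		if (!fnmatch(*p, path, 0))
			return 1;
	return 0;
}
```

With a pattern list built the way ls-remote builds its local `pattern` array (a glob of the form star, slash, name), this accepts refs/heads/master for the name "master" and rejects refs/heads/next, regardless of whether the server already filtered.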

* [PATCH v4 17/35] ls-remote: pass ref patterns when requesting a remote's refs
  2018-02-28 23:22     ` [PATCH v4 " Brandon Williams
                         ` (15 preceding siblings ...)
  2018-02-28 23:22       ` [PATCH v4 16/35] transport: convert transport_get_remote_refs " Brandon Williams
@ 2018-02-28 23:22       ` Brandon Williams
  2018-03-02 22:13         ` Junio C Hamano
  2018-02-28 23:22       ` [PATCH v4 18/35] fetch: pass ref patterns when fetching Brandon Williams
                         ` (18 subsequent siblings)
  35 siblings, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-02-28 23:22 UTC (permalink / raw)
  To: git
  Cc: git, gitster, jrnieder, pclouds, peff, sbeller, stolee, Brandon Williams

Construct an argv_array of the ref patterns supplied via the command
line and pass them to 'transport_get_remote_refs()' to be used when
communicating using protocol v2 so that the server can limit the ref
advertisement based on the supplied patterns.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 builtin/ls-remote.c    | 12 ++++++++++--
 refs.c                 | 14 ++++++++++++++
 refs.h                 |  7 +++++++
 t/t5702-protocol-v2.sh | 26 ++++++++++++++++++++++++++
 4 files changed, 57 insertions(+), 2 deletions(-)

diff --git a/builtin/ls-remote.c b/builtin/ls-remote.c
index c6e9847c5..083ba8b29 100644
--- a/builtin/ls-remote.c
+++ b/builtin/ls-remote.c
@@ -2,6 +2,7 @@
 #include "cache.h"
 #include "transport.h"
 #include "remote.h"
+#include "refs.h"
 
 static const char * const ls_remote_usage[] = {
 	N_("git ls-remote [--heads] [--tags] [--refs] [--upload-pack=<exec>]\n"
@@ -43,6 +44,7 @@ int cmd_ls_remote(int argc, const char **argv, const char *prefix)
 	int show_symref_target = 0;
 	const char *uploadpack = NULL;
 	const char **pattern = NULL;
+	struct argv_array ref_patterns = ARGV_ARRAY_INIT;
 
 	struct remote *remote;
 	struct transport *transport;
@@ -74,8 +76,14 @@ int cmd_ls_remote(int argc, const char **argv, const char *prefix)
 	if (argc > 1) {
 		int i;
 		pattern = xcalloc(argc, sizeof(const char *));
-		for (i = 1; i < argc; i++)
+		for (i = 1; i < argc; i++) {
 			pattern[i - 1] = xstrfmt("*/%s", argv[i]);
+
+			if (strchr(argv[i], '*'))
+				argv_array_push(&ref_patterns, argv[i]);
+			else
+				expand_ref_pattern(&ref_patterns, argv[i]);
+		}
 	}
 
 	remote = remote_get(dest);
@@ -96,7 +104,7 @@ int cmd_ls_remote(int argc, const char **argv, const char *prefix)
 	if (uploadpack != NULL)
 		transport_set_option(transport, TRANS_OPT_UPLOADPACK, uploadpack);
 
-	ref = transport_get_remote_refs(transport, NULL);
+	ref = transport_get_remote_refs(transport, &ref_patterns);
 	if (transport_disconnect(transport))
 		return 1;
 
diff --git a/refs.c b/refs.c
index 20ba82b43..58e9f88fb 100644
--- a/refs.c
+++ b/refs.c
@@ -13,6 +13,7 @@
 #include "tag.h"
 #include "submodule.h"
 #include "worktree.h"
+#include "argv-array.h"
 
 /*
  * List of all available backends
@@ -501,6 +502,19 @@ int refname_match(const char *abbrev_name, const char *full_name)
 	return 0;
 }
 
+/*
+ * Given a 'pattern' expand it by the rules in 'ref_rev_parse_rules' and add
+ * the results to 'patterns'
+ */
+void expand_ref_pattern(struct argv_array *patterns, const char *pattern)
+{
+	const char **p;
+	for (p = ref_rev_parse_rules; *p; p++) {
+		int len = strlen(pattern);
+		argv_array_pushf(patterns, *p, len, pattern);
+	}
+}
+
 /*
  * *string and *len will only be substituted, and *string returned (for
  * later free()ing) if the string passed in is a magic short-hand form
diff --git a/refs.h b/refs.h
index 01be5ae32..292ca35ce 100644
--- a/refs.h
+++ b/refs.h
@@ -139,6 +139,13 @@ int resolve_gitlink_ref(const char *submodule, const char *refname,
  */
 int refname_match(const char *abbrev_name, const char *full_name);
 
+/*
+ * Given a 'pattern' expand it by the rules in 'ref_rev_parse_rules' and add
+ * the results to 'patterns'
+ */
+struct argv_array;
+void expand_ref_pattern(struct argv_array *patterns, const char *pattern);
+
 int expand_ref(const char *str, int len, struct object_id *oid, char **ref);
 int dwim_ref(const char *str, int len, struct object_id *oid, char **ref);
 int dwim_log(const char *str, int len, struct object_id *oid, char **ref);
diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh
index dc5f813be..562610fd2 100755
--- a/t/t5702-protocol-v2.sh
+++ b/t/t5702-protocol-v2.sh
@@ -32,6 +32,19 @@ test_expect_success 'list refs with git:// using protocol v2' '
 	test_cmp actual expect
 '
 
+test_expect_success 'ref advertisement is filtered with ls-remote using protocol v2' '
+	test_when_finished "rm -f log" &&
+
+	GIT_TRACE_PACKET="$(pwd)/log" git -c protocol.version=2 \
+		ls-remote "$GIT_DAEMON_URL/parent" master >actual &&
+
+	cat >expect <<-EOF &&
+	$(git -C "$daemon_parent" rev-parse refs/heads/master)$(printf "\t")refs/heads/master
+	EOF
+
+	test_cmp actual expect
+'
+
 stop_git_daemon
 
 # Test protocol v2 with 'file://' transport
@@ -54,4 +67,17 @@ test_expect_success 'list refs with file:// using protocol v2' '
 	test_cmp actual expect
 '
 
+test_expect_success 'ref advertisement is filtered with ls-remote using protocol v2' '
+	test_when_finished "rm -f log" &&
+
+	GIT_TRACE_PACKET="$(pwd)/log" git -c protocol.version=2 \
+		ls-remote "file://$(pwd)/file_parent" master >actual &&
+
+	cat >expect <<-EOF &&
+	$(git -C file_parent rev-parse refs/heads/master)$(printf "\t")refs/heads/master
+	EOF
+
+	test_cmp actual expect
+'
+
 test_done
-- 
2.16.2.395.g2e18187dfd-goog


^ permalink raw reply related	[flat|nested] 362+ messages in thread
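
To make the effect of 'expand_ref_pattern()' concrete, here is a standalone sketch of the expansion. It assumes the rule table in refs.c looks like the one reproduced below (copied from memory, so treat it as an assumption), and it writes into a fixed-size array instead of an argv_array. `expand_pattern` and `EXPANSION_LEN` are hypothetical names used only for illustration.

```c
#include <stdio.h>
#include <string.h>

/*
 * Assumed contents of git's 'ref_rev_parse_rules' at the time of this
 * series; each entry is a printf format taking a length and a string.
 */
static const char *const rules[] = {
	"%.*s",
	"refs/%.*s",
	"refs/tags/%.*s",
	"refs/heads/%.*s",
	"refs/remotes/%.*s",
	"refs/remotes/%.*s/HEAD",
	NULL
};

#define EXPANSION_LEN 256

/*
 * Hypothetical stand-in for expand_ref_pattern(): write at most 'max'
 * expansions of 'pattern' into 'out' and return how many were written.
 */
static size_t expand_pattern(char out[][EXPANSION_LEN], size_t max,
			     const char *pattern)
{
	int len = (int)strlen(pattern);
	size_t n = 0;
	const char *const *p;

	for (p = rules; *p && n < max; p++)
		snprintf(out[n++], EXPANSION_LEN, *p, len, pattern);
	return n;
}
```

Under that assumption, a bare "master" on the ls-remote command line turns into six candidate names (master, refs/master, refs/tags/master, refs/heads/master, refs/remotes/master and refs/remotes/master/HEAD), which is why a name containing no '*' is expanded rather than sent as is.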

* [PATCH v4 18/35] fetch: pass ref patterns when fetching
  2018-02-28 23:22     ` [PATCH v4 " Brandon Williams
                         ` (16 preceding siblings ...)
  2018-02-28 23:22       ` [PATCH v4 17/35] ls-remote: pass ref patterns when requesting a remote's refs Brandon Williams
@ 2018-02-28 23:22       ` Brandon Williams
  2018-03-02 22:20         ` Junio C Hamano
  2018-02-28 23:22       ` [PATCH v4 19/35] push: pass ref patterns when pushing Brandon Williams
                         ` (17 subsequent siblings)
  35 siblings, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-02-28 23:22 UTC (permalink / raw)
  To: git
  Cc: git, gitster, jrnieder, pclouds, peff, sbeller, stolee, Brandon Williams

Construct a list of ref patterns to be passed to
'transport_get_remote_refs()' from the refspec to be used during the
fetch.  This list of ref patterns will be used to allow the server to
filter the ref advertisement when communicating using protocol v2.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 builtin/fetch.c | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index 850382f55..695fafe06 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -332,11 +332,25 @@ static struct ref *get_ref_map(struct transport *transport,
 	struct ref *rm;
 	struct ref *ref_map = NULL;
 	struct ref **tail = &ref_map;
+	struct argv_array ref_patterns = ARGV_ARRAY_INIT;
 
 	/* opportunistically-updated references: */
 	struct ref *orefs = NULL, **oref_tail = &orefs;
 
-	const struct ref *remote_refs = transport_get_remote_refs(transport, NULL);
+	const struct ref *remote_refs;
+
+	for (i = 0; i < refspec_count; i++) {
+		if (!refspecs[i].exact_sha1) {
+			if (refspecs[i].pattern)
+				argv_array_push(&ref_patterns, refspecs[i].src);
+			else
+				expand_ref_pattern(&ref_patterns, refspecs[i].src);
+		}
+	}
+
+	remote_refs = transport_get_remote_refs(transport, &ref_patterns);
+
+	argv_array_clear(&ref_patterns);
 
 	if (refspec_count) {
 		struct refspec *fetch_refspec;
-- 
2.16.2.395.g2e18187dfd-goog


^ permalink raw reply related	[flat|nested] 362+ messages in thread

* [PATCH v4 19/35] push: pass ref patterns when pushing
  2018-02-28 23:22     ` [PATCH v4 " Brandon Williams
                         ` (17 preceding siblings ...)
  2018-02-28 23:22       ` [PATCH v4 18/35] fetch: pass ref patterns when fetching Brandon Williams
@ 2018-02-28 23:22       ` Brandon Williams
  2018-03-02 22:25         ` Junio C Hamano
  2018-02-28 23:22       ` [PATCH v4 20/35] upload-pack: introduce fetch server command Brandon Williams
                         ` (16 subsequent siblings)
  35 siblings, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-02-28 23:22 UTC (permalink / raw)
  To: git
  Cc: git, gitster, jrnieder, pclouds, peff, sbeller, stolee, Brandon Williams

Construct a list of ref patterns to be passed to 'get_refs_list()' from
the refspec to be used during the push.  This list of ref patterns will
be used to allow the server to filter the ref advertisement when
communicating using protocol v2.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 transport.c | 26 +++++++++++++++++++++++++-
 1 file changed, 25 insertions(+), 1 deletion(-)

diff --git a/transport.c b/transport.c
index dfc603b36..bf7ba6879 100644
--- a/transport.c
+++ b/transport.c
@@ -1026,11 +1026,35 @@ int transport_push(struct transport *transport,
 		int porcelain = flags & TRANSPORT_PUSH_PORCELAIN;
 		int pretend = flags & TRANSPORT_PUSH_DRY_RUN;
 		int push_ret, ret, err;
+		struct refspec *tmp_rs;
+		struct argv_array ref_patterns = ARGV_ARRAY_INIT;
+		int i;
 
 		if (check_push_refs(local_refs, refspec_nr, refspec) < 0)
 			return -1;
 
-		remote_refs = transport->vtable->get_refs_list(transport, 1, NULL);
+		tmp_rs = parse_push_refspec(refspec_nr, refspec);
+		for (i = 0; i < refspec_nr; i++) {
+			const char *pattern = NULL;
+
+			if (tmp_rs[i].dst)
+				pattern = tmp_rs[i].dst;
+			else if (tmp_rs[i].src && !tmp_rs[i].exact_sha1)
+				pattern = tmp_rs[i].src;
+
+			if (pattern) {
+				if (tmp_rs[i].pattern)
+					argv_array_push(&ref_patterns, pattern);
+				else
+					expand_ref_pattern(&ref_patterns, pattern);
+			}
+		}
+
+		remote_refs = transport->vtable->get_refs_list(transport, 1,
+							       &ref_patterns);
+
+		argv_array_clear(&ref_patterns);
+		free_refspec(refspec_nr, tmp_rs);
 
 		if (flags & TRANSPORT_PUSH_ALL)
 			match_flags |= MATCH_REFS_ALL;
-- 
2.16.2.395.g2e18187dfd-goog


^ permalink raw reply related	[flat|nested] 362+ messages in thread
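
The choice of pattern in the push loop can be read in isolation: the destination side of the refspec names the ref on the remote, so it is preferred, and the source is used only when there is no destination and the source is not a raw object id. A minimal sketch of just that decision follows; `struct mini_refspec` and `push_pattern_for` are hypothetical and mirror only the fields the loop reads.

```c
#include <stddef.h>

/* Hypothetical, trimmed-down mirror of the refspec fields used by the loop. */
struct mini_refspec {
	const char *src;
	const char *dst;
	unsigned pattern : 1;
	unsigned exact_sha1 : 1;
};

/*
 * Return the string the push loop would turn into a ref pattern, or NULL
 * when the refspec contributes nothing (no dst, and src is either missing
 * or an exact object id).
 */
static const char *push_pattern_for(const struct mini_refspec *rs)
{
	if (rs->dst)
		return rs->dst;
	if (rs->src && !rs->exact_sha1)
		return rs->src;
	return NULL;
}
```

Whether the returned string is then pushed verbatim or expanded depends on the refspec's 'pattern' bit, exactly as in the fetch case.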

* [PATCH v4 20/35] upload-pack: introduce fetch server command
  2018-02-28 23:22     ` [PATCH v4 " Brandon Williams
                         ` (18 preceding siblings ...)
  2018-02-28 23:22       ` [PATCH v4 19/35] push: pass ref patterns when pushing Brandon Williams
@ 2018-02-28 23:22       ` Brandon Williams
  2018-03-13 16:20         ` Jonathan Tan
  2018-02-28 23:22       ` [PATCH v4 21/35] fetch-pack: perform a fetch using v2 Brandon Williams
                         ` (15 subsequent siblings)
  35 siblings, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-02-28 23:22 UTC (permalink / raw)
  To: git
  Cc: git, gitster, jrnieder, pclouds, peff, sbeller, stolee, Brandon Williams

Introduce the 'fetch' server command.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 Documentation/technical/protocol-v2.txt | 128 ++++++++++++
 serve.c                                 |   2 +
 t/t5701-git-serve.sh                    |   1 +
 upload-pack.c                           | 267 ++++++++++++++++++++++++
 upload-pack.h                           |   5 +
 5 files changed, 403 insertions(+)

diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
index 7f50e6462..99c70a1e4 100644
--- a/Documentation/technical/protocol-v2.txt
+++ b/Documentation/technical/protocol-v2.txt
@@ -205,3 +205,131 @@ The output of ls-refs is as follows:
     ref-attribute = (symref | peeled)
     symref = "symref-target:" symref-target
     peeled = "peeled:" obj-id
+
+ fetch
+-------
+
+`fetch` is the command used to fetch a packfile in v2.  It can be looked
+at as a modified version of the v1 fetch where the ref-advertisement is
+stripped out (since the `ls-refs` command fills that role) and the
+message format is tweaked to eliminate redundancies and permit easy
+addition of future extensions.
+
+Additional features not supported in the base command will be advertised
+as the value of the command in the capability advertisement in the form
+of a space separated list of features, e.g.  "<command>=<feature 1>
+<feature 2>".
+
+A `fetch` request can take the following arguments:
+
+    want <oid>
+	Indicates to the server an object which the client wants to
+	retrieve.  Wants can be anything and are not limited to
+	advertised objects.
+
+    have <oid>
+	Indicates to the server an object which the client has locally.
+	This allows the server to make a packfile which only contains
+	the objects that the client needs. Multiple 'have' lines can be
+	supplied.
+
+    done
+	Indicates to the server that negotiation should terminate (or
+	not even begin if performing a clone) and that the server should
+	use the information supplied in the request to construct the
+	packfile.
+
+    thin-pack
+	Request that a thin pack be sent, which is a pack with deltas
+	which reference base objects not contained within the pack (but
+	are known to exist at the receiving end). This can reduce the
+	network traffic significantly, but it requires the receiving end
+	to know how to "thicken" these packs by adding the missing bases
+	to the pack.
+
+    no-progress
+	Request that progress information that would normally be sent on
+	side-band channel 2, during the packfile transfer, should not be
+	sent.  However, the side-band channel 3 is still used for error
+	responses.
+
+    include-tag
+	Request that annotated tags should be sent if the objects they
+	point to are being sent.
+
+    ofs-delta
+	Indicate that the client understands PACKv2 with delta referring
+	to its base by position in pack rather than by an oid.  That is,
+	they can read OBJ_OFS_DELTA (aka type 6) in a packfile.
+
+The response of `fetch` is broken into a number of sections separated by
+delimiter packets (0001), with each section beginning with its section
+header.
+
+    output = *section
+    section = (acknowledgments | packfile)
+	      (flush-pkt | delim-pkt)
+
+    acknowledgments = PKT-LINE("acknowledgments" LF)
+		      (nak | *ack)
+		      (ready)
+    ready = PKT-LINE("ready" LF)
+    nak = PKT-LINE("NAK" LF)
+    ack = PKT-LINE("ACK" SP obj-id LF)
+
+    packfile = PKT-LINE("packfile" LF)
+	       [PACKFILE]
+
+----
+    acknowledgments section
+	* Always begins with the section header "acknowledgments"
+
+	* The server will respond with "NAK" if none of the object ids sent
+	  as have lines were common.
+
+	* The server will respond with "ACK obj-id" for all of the
+	  object ids sent as have lines which are common.
+
+	* A response cannot have both "ACK" lines as well as a "NAK"
+	  line.
+
+	* The server will respond with a "ready" line indicating that
+	  the server has found an acceptable common base and is ready to
+	  make and send a packfile (which will be found in the packfile
+	  section of the same response)
+
+	* If the client determines that it is finished with negotiations
+	  by sending a "done" line, the acknowledgments section MUST be
+	  omitted from the server's response.
+
+	* If the server has found a suitable cut point and has decided
+	  to send a "ready" line, then the server can decide to (as an
+	  optimization) omit any "ACK" lines it would have sent during
+	  its response.  This is because the server will have already
+	  determined the objects it plans to send to the client and no
+	  further negotiation is needed.
+
+----
+    packfile section
+	* Always begins with the section header "packfile"
+
+	* The transmission of the packfile begins immediately after the
+	  section header
+
+	* The data transfer of the packfile is always multiplexed, using
+	  the same semantics as the 'side-band-64k' capability from
+	  protocol version 1.  This means that each packet, during the
+	  packfile data stream, is made up of a leading 4-byte pkt-line
+	  length (typical of the pkt-line format), followed by a 1-byte
+	  stream code, followed by the actual data.
+
+	  The stream code can be one of:
+		1 - pack data
+		2 - progress messages
+		3 - fatal error message just before stream aborts
+
+	* This section is only included if the client has sent 'want'
+	  lines in its request and either requested that no more
+	  negotiation be done by sending 'done' or if the server has
+	  decided it has found a sufficient cut point to produce a
+	  packfile.
diff --git a/serve.c b/serve.c
index c7925c5c7..05cc434cf 100644
--- a/serve.c
+++ b/serve.c
@@ -6,6 +6,7 @@
 #include "argv-array.h"
 #include "ls-refs.h"
 #include "serve.h"
+#include "upload-pack.h"
 
 static int always_advertise(struct repository *r,
 			    struct strbuf *value)
@@ -52,6 +53,7 @@ struct protocol_capability {
 static struct protocol_capability capabilities[] = {
 	{ "agent", agent_advertise, NULL },
 	{ "ls-refs", always_advertise, ls_refs },
+	{ "fetch", always_advertise, upload_pack_v2 },
 };
 
 static void advertise_capabilities(void)
diff --git a/t/t5701-git-serve.sh b/t/t5701-git-serve.sh
index 11aeb0541..cc5918a67 100755
--- a/t/t5701-git-serve.sh
+++ b/t/t5701-git-serve.sh
@@ -9,6 +9,7 @@ test_expect_success 'test capability advertisement' '
 	version 2
 	agent=git/$(git version | cut -d" " -f3)
 	ls-refs
+	fetch
 	0000
 	EOF
 
diff --git a/upload-pack.c b/upload-pack.c
index 1e8a9e1ca..2af6b1382 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -18,6 +18,7 @@
 #include "prio-queue.h"
 #include "protocol.h"
 #include "upload-pack.h"
+#include "serve.h"
 
 /* Remember to update object flag allocation in object.h */
 #define THEY_HAVE	(1u << 11)
@@ -1065,3 +1066,269 @@ void upload_pack(struct upload_pack_options *options)
 		create_pack_file();
 	}
 }
+
+struct upload_pack_data {
+	struct object_array wants;
+	struct oid_array haves;
+
+	unsigned stateless_rpc : 1;
+
+	unsigned use_thin_pack : 1;
+	unsigned use_ofs_delta : 1;
+	unsigned no_progress : 1;
+	unsigned use_include_tag : 1;
+	unsigned done : 1;
+};
+
+static void upload_pack_data_init(struct upload_pack_data *data)
+{
+	struct object_array wants = OBJECT_ARRAY_INIT;
+	struct oid_array haves = OID_ARRAY_INIT;
+
+	memset(data, 0, sizeof(*data));
+	data->wants = wants;
+	data->haves = haves;
+}
+
+static void upload_pack_data_clear(struct upload_pack_data *data)
+{
+	object_array_clear(&data->wants);
+	oid_array_clear(&data->haves);
+}
+
+static int parse_want(const char *line)
+{
+	const char *arg;
+	if (skip_prefix(line, "want ", &arg)) {
+		struct object_id oid;
+		struct object *o;
+
+		if (get_oid_hex(arg, &oid))
+			die("git upload-pack: protocol error, "
+			    "expected to get oid, not '%s'", line);
+
+		o = parse_object(&oid);
+		if (!o) {
+			packet_write_fmt(1,
+					 "ERR upload-pack: not our ref %s",
+					 oid_to_hex(&oid));
+			die("git upload-pack: not our ref %s",
+			    oid_to_hex(&oid));
+		}
+
+		if (!(o->flags & WANTED)) {
+			o->flags |= WANTED;
+			add_object_array(o, NULL, &want_obj);
+		}
+
+		return 1;
+	}
+
+	return 0;
+}
+
+static int parse_have(const char *line, struct oid_array *haves)
+{
+	const char *arg;
+	if (skip_prefix(line, "have ", &arg)) {
+		struct object_id oid;
+
+		if (get_oid_hex(arg, &oid))
+			die("git upload-pack: expected SHA1 object, got '%s'", arg);
+		oid_array_append(haves, &oid);
+		return 1;
+	}
+
+	return 0;
+}
+
+static void process_args(struct argv_array *args, struct upload_pack_data *data)
+{
+	int i;
+
+	for (i = 0; i < args->argc; i++) {
+		const char *arg = args->argv[i];
+
+		/* process want */
+		if (parse_want(arg))
+			continue;
+		/* process have line */
+		if (parse_have(arg, &data->haves))
+			continue;
+
+		/* process args like thin-pack */
+		if (!strcmp(arg, "thin-pack")) {
+			use_thin_pack = 1;
+			continue;
+		}
+		if (!strcmp(arg, "ofs-delta")) {
+			use_ofs_delta = 1;
+			continue;
+		}
+		if (!strcmp(arg, "no-progress")) {
+			no_progress = 1;
+			continue;
+		}
+		if (!strcmp(arg, "include-tag")) {
+			use_include_tag = 1;
+			continue;
+		}
+		if (!strcmp(arg, "done")) {
+			data->done = 1;
+			continue;
+		}
+
+		/* ignore unknown lines maybe? */
+		die("unexpected line: '%s'", arg);
+	}
+}
+
+static int process_haves(struct oid_array *haves, struct oid_array *common)
+{
+	int i;
+
+	/* Process haves */
+	for (i = 0; i < haves->nr; i++) {
+		const struct object_id *oid = &haves->oid[i];
+		struct object *o;
+		int we_knew_they_have = 0;
+
+		if (!has_object_file(oid))
+			continue;
+
+		oid_array_append(common, oid);
+
+		o = parse_object(oid);
+		if (!o)
+			die("oops (%s)", oid_to_hex(oid));
+		if (o->type == OBJ_COMMIT) {
+			struct commit_list *parents;
+			struct commit *commit = (struct commit *)o;
+			if (o->flags & THEY_HAVE)
+				we_knew_they_have = 1;
+			else
+				o->flags |= THEY_HAVE;
+			if (!oldest_have || (commit->date < oldest_have))
+				oldest_have = commit->date;
+			for (parents = commit->parents;
+			     parents;
+			     parents = parents->next)
+				parents->item->object.flags |= THEY_HAVE;
+		}
+		if (!we_knew_they_have)
+			add_object_array(o, NULL, &have_obj);
+	}
+
+	return 0;
+}
+
+static int send_acks(struct oid_array *acks, struct strbuf *response)
+{
+	int i;
+
+	packet_buf_write(response, "acknowledgments\n");
+
+	/* Send Acks */
+	if (!acks->nr)
+		packet_buf_write(response, "NAK\n");
+
+	for (i = 0; i < acks->nr; i++) {
+		packet_buf_write(response, "ACK %s\n",
+				 oid_to_hex(&acks->oid[i]));
+	}
+
+	if (ok_to_give_up()) {
+		/* Send Ready */
+		packet_buf_write(response, "ready\n");
+		return 1;
+	}
+
+	return 0;
+}
+
+static int process_haves_and_send_acks(struct upload_pack_data *data)
+{
+	struct oid_array common = OID_ARRAY_INIT;
+	struct strbuf response = STRBUF_INIT;
+	int ret = 0;
+
+	process_haves(&data->haves, &common);
+	if (data->done) {
+		ret = 1;
+	} else if (send_acks(&common, &response)) {
+		packet_buf_delim(&response);
+		ret = 1;
+	} else {
+		/* Add Flush */
+		packet_buf_flush(&response);
+		ret = 0;
+	}
+
+	/* Send response */
+	write_or_die(1, response.buf, response.len);
+	strbuf_release(&response);
+
+	oid_array_clear(&data->haves);
+	oid_array_clear(&common);
+	return ret;
+}
+
+enum fetch_state {
+	FETCH_PROCESS_ARGS = 0,
+	FETCH_SEND_ACKS,
+	FETCH_SEND_PACK,
+	FETCH_DONE,
+};
+
+int upload_pack_v2(struct repository *r, struct argv_array *keys,
+		   struct argv_array *args)
+{
+	enum fetch_state state = FETCH_PROCESS_ARGS;
+	struct upload_pack_data data;
+
+	upload_pack_data_init(&data);
+	use_sideband = LARGE_PACKET_MAX;
+
+	while (state != FETCH_DONE) {
+		switch (state) {
+		case FETCH_PROCESS_ARGS:
+			process_args(args, &data);
+
+			if (!want_obj.nr) {
+				/*
+				 * Request didn't contain any 'want' lines,
+				 * guess they didn't want anything.
+				 */
+				state = FETCH_DONE;
+			} else if (data.haves.nr) {
+				/*
+				 * Request had 'have' lines, so let's ACK them.
+				 */
+				state = FETCH_SEND_ACKS;
+			} else {
+				/*
+				 * Request had 'want's but no 'have's so we can
+				 * immediately go to construct and send a pack.
+				 */
+				state = FETCH_SEND_PACK;
+			}
+			break;
+		case FETCH_SEND_ACKS:
+			if (process_haves_and_send_acks(&data))
+				state = FETCH_SEND_PACK;
+			else
+				state = FETCH_DONE;
+			break;
+		case FETCH_SEND_PACK:
+			packet_write_fmt(1, "packfile\n");
+			create_pack_file();
+			state = FETCH_DONE;
+			break;
+		case FETCH_DONE:
+			continue;
+		}
+	}
+
+	upload_pack_data_clear(&data);
+	return 0;
+}
diff --git a/upload-pack.h b/upload-pack.h
index a71e4dc7e..6b7890238 100644
--- a/upload-pack.h
+++ b/upload-pack.h
@@ -10,4 +10,9 @@ struct upload_pack_options {
 
 void upload_pack(struct upload_pack_options *options);
 
+struct repository;
+struct argv_array;
+extern int upload_pack_v2(struct repository *r, struct argv_array *keys,
+			  struct argv_array *args);
+
 #endif /* UPLOAD_PACK_H */
-- 
2.16.2.395.g2e18187dfd-goog


^ permalink raw reply related	[flat|nested] 362+ messages in thread
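
The packfile section of the documentation added above describes each packet as a 4-byte pkt-line length, a 1-byte stream code, and then the data. A small sketch of that framing follows. `frame_sideband` and the enum names are hypothetical, and the 65520-byte cap is an assumption based on git's usual pkt-line limit; git's real sender lives in its sideband and pkt-line code and is not reproduced here.

```c
#include <stdio.h>
#include <string.h>

/* Stream codes as listed in the packfile section of protocol-v2.txt. */
enum sideband_stream {
	SIDEBAND_PACK_DATA = 1,
	SIDEBAND_PROGRESS = 2,
	SIDEBAND_ERROR = 3
};

/*
 * Hypothetical helper: frame 'len' bytes of 'data' as one multiplexed
 * packet in 'out'.  The 4 hex digits of length count themselves, the
 * stream code and the payload.  Returns the number of bytes written,
 * or 0 if the packet would not fit.
 */
static size_t frame_sideband(unsigned char *out, size_t outsz,
			     enum sideband_stream band,
			     const void *data, size_t len)
{
	size_t total = 4 + 1 + len;
	char hdr[5];

	/* 65520 is assumed to be the maximum total pkt-line length. */
	if (total > 65520 || total > outsz)
		return 0;

	snprintf(hdr, sizeof(hdr), "%04x", (unsigned)total);
	memcpy(out, hdr, 4);
	out[4] = (unsigned char)band;
	memcpy(out + 5, data, len);
	return total;
}
```

For example, the three bytes "hi\n" sent as a progress message become the eight bytes "0008", 0x02, "hi\n" on the wire; a client demultiplexes by reading the length, then switching on the byte that follows it.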

* [PATCH v4 21/35] fetch-pack: perform a fetch using v2
  2018-02-28 23:22     ` [PATCH v4 " Brandon Williams
                         ` (19 preceding siblings ...)
  2018-02-28 23:22       ` [PATCH v4 20/35] upload-pack: introduce fetch server command Brandon Williams
@ 2018-02-28 23:22       ` Brandon Williams
  2018-02-28 23:22       ` [PATCH v4 22/35] fetch-pack: support shallow requests Brandon Williams
                         ` (14 subsequent siblings)
  35 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-02-28 23:22 UTC (permalink / raw)
  To: git
  Cc: git, gitster, jrnieder, pclouds, peff, sbeller, stolee, Brandon Williams

When communicating with a v2 server, perform a fetch by requesting the
'fetch' command.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 Documentation/technical/protocol-v2.txt |  68 +++++-
 builtin/fetch-pack.c                    |   2 +-
 fetch-pack.c                            | 270 +++++++++++++++++++++++-
 fetch-pack.h                            |   4 +-
 serve.c                                 |   2 +-
 t/t5701-git-serve.sh                    |   2 +-
 t/t5702-protocol-v2.sh                  |  97 +++++++++
 transport.c                             |   7 +-
 upload-pack.c                           | 141 ++++++++++---
 upload-pack.h                           |   3 +
 10 files changed, 548 insertions(+), 48 deletions(-)

diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
index 99c70a1e4..0d63456fc 100644
--- a/Documentation/technical/protocol-v2.txt
+++ b/Documentation/technical/protocol-v2.txt
@@ -262,12 +262,43 @@ A `fetch` request can take the following arguments:
 	to its base by position in pack rather than by an oid.  That is,
 	they can read OBJ_OFS_DELTA (aka type 6) in a packfile.
 
+    shallow <oid>
+	A client must notify the server of all commits for which it only
+	has shallow copies (meaning that it doesn't have the parents of
+	a commit) by supplying a 'shallow <oid>' line for each such
+	object so that the server is aware of the limitations of the
+	client's history.  This is so that the server is aware that the
+	client may not have all objects reachable from such commits.
+
+    deepen <depth>
+	Requests that the fetch/clone should be shallow having a commit
+	depth of <depth> relative to the remote side.
+
+    deepen-relative
+	Requests that the semantics of the "deepen" command be changed
+	to indicate that the depth requested is relative to the client's
+	current shallow boundary, instead of relative to the requested
+	commits.
+
+    deepen-since <timestamp>
+	Requests that the shallow clone/fetch should be cut at a
+	specific time, instead of depth.  Internally it's equivalent to
+	doing "git rev-list --max-age=<timestamp>". Cannot be used with
+	"deepen".
+
+    deepen-not <rev>
+	Requests that the shallow clone/fetch should be cut at a
+	specific revision specified by '<rev>', instead of a depth.
+	Internally it's equivalent to doing "git rev-list --not <rev>".
+	Cannot be used with "deepen", but can be used with
+	"deepen-since".
+
 The response of `fetch` is broken into a number of sections separated by
 delimiter packets (0001), with each section beginning with its section
 header.
 
     output = *section
-    section = (acknowledgments | packfile)
+    section = (acknowledgments | shallow-info | packfile)
 	      (flush-pkt | delim-pkt)
 
     acknowledgments = PKT-LINE("acknowledgments" LF)
@@ -277,6 +308,11 @@ header.
     nak = PKT-LINE("NAK" LF)
     ack = PKT-LINE("ACK" SP obj-id LF)
 
+    shallow-info = PKT-LINE("shallow-info" LF)
+		   *PKT-LINE((shallow | unshallow) LF)
+    shallow = "shallow" SP obj-id
+    unshallow = "unshallow" SP obj-id
+
     packfile = PKT-LINE("packfile" LF)
 	       [PACKFILE]
 
@@ -309,6 +345,36 @@ header.
 	  determined the objects it plans to send to the client and no
 	  further negotiation is needed.
 
+----
+    shallow-info section
+	If the client has requested a shallow fetch/clone, a shallow
+	client requests a fetch, or the server is shallow, then the
+	server's response may include a shallow-info section.  The
+	shallow-info section will be included if (due to one of the
+	above conditions) the server needs to inform the client of any
+	shallow boundaries or adjustments to the client's already
+	existing shallow boundaries.
+
+	* Always begins with the section header "shallow-info"
+
+	* If a positive depth is requested, the server will compute the
+	  set of commits which are no deeper than the desired depth.
+
+	* The server sends a "shallow obj-id" line for each commit whose
+	  parents will not be sent in the following packfile.
+
+	* The server sends an "unshallow obj-id" line for each commit
+	  which the client has indicated is shallow, but is no longer
+	  shallow as a result of the fetch (due to its parents being
+	  sent in the following packfile).
+
+	* The server MUST NOT send any "unshallow" lines for anything
+	  which the client has not indicated was shallow as a part of
+	  its request.
+
+	* This section is only included if a packfile section is also
+	  included in the response.
+
 ----
     packfile section
 	* Always begins with the section header "packfile"
diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
index b2374ddbb..f9d7d0b5a 100644
--- a/builtin/fetch-pack.c
+++ b/builtin/fetch-pack.c
@@ -212,7 +212,7 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix)
 	}
 
 	ref = fetch_pack(&args, fd, conn, ref, dest, sought, nr_sought,
-			 &shallow, pack_lockfile_ptr);
+			 &shallow, pack_lockfile_ptr, protocol_v0);
 	if (pack_lockfile) {
 		printf("lock %s\n", pack_lockfile);
 		fflush(stdout);
diff --git a/fetch-pack.c b/fetch-pack.c
index 9f6b07ad9..dffcfd66a 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -303,9 +303,9 @@ static void insert_one_alternate_object(struct object *obj)
 #define PIPESAFE_FLUSH 32
 #define LARGE_FLUSH 16384
 
-static int next_flush(struct fetch_pack_args *args, int count)
+static int next_flush(int stateless_rpc, int count)
 {
-	if (args->stateless_rpc) {
+	if (stateless_rpc) {
 		if (count < LARGE_FLUSH)
 			count <<= 1;
 		else
@@ -461,7 +461,7 @@ static int find_common(struct fetch_pack_args *args,
 			send_request(args, fd[1], &req_buf);
 			strbuf_setlen(&req_buf, state_len);
 			flushes++;
-			flush_at = next_flush(args, count);
+			flush_at = next_flush(args->stateless_rpc, count);
 
 			/*
 			 * We keep one window "ahead" of the other side, and
@@ -1008,6 +1008,259 @@ static struct ref *do_fetch_pack(struct fetch_pack_args *args,
 	return ref;
 }
 
+static void add_wants(const struct ref *wants, struct strbuf *req_buf)
+{
+	for ( ; wants ; wants = wants->next) {
+		const struct object_id *remote = &wants->old_oid;
+		const char *remote_hex;
+		struct object *o;
+
+		/*
+		 * If that object is complete (i.e. it is an ancestor of a
+		 * local ref), we tell them we have it but do not have to
+		 * tell them about its ancestors, which they already know
+		 * about.
+		 *
+		 * We use lookup_object here because we are only
+		 * interested in the case we *know* the object is
+		 * reachable and we have already scanned it.
+		 */
+		if (((o = lookup_object(remote->hash)) != NULL) &&
+		    (o->flags & COMPLETE)) {
+			continue;
+		}
+
+		remote_hex = oid_to_hex(remote);
+		packet_buf_write(req_buf, "want %s\n", remote_hex);
+	}
+}
+
+static void add_common(struct strbuf *req_buf, struct oidset *common)
+{
+	struct oidset_iter iter;
+	const struct object_id *oid;
+	oidset_iter_init(common, &iter);
+
+	while ((oid = oidset_iter_next(&iter))) {
+		packet_buf_write(req_buf, "have %s\n", oid_to_hex(oid));
+	}
+}
+
+static int add_haves(struct strbuf *req_buf, int *haves_to_send, int *in_vain)
+{
+	int ret = 0;
+	int haves_added = 0;
+	const struct object_id *oid;
+
+	while ((oid = get_rev())) {
+		packet_buf_write(req_buf, "have %s\n", oid_to_hex(oid));
+		if (++haves_added >= *haves_to_send)
+			break;
+	}
+
+	*in_vain += haves_added;
+	if (!haves_added || *in_vain >= MAX_IN_VAIN) {
+		/* Send Done */
+		packet_buf_write(req_buf, "done\n");
+		ret = 1;
+	}
+
+	/* Increase haves to send on next round */
+	*haves_to_send = next_flush(1, *haves_to_send);
+
+	return ret;
+}
+
+static int send_fetch_request(int fd_out, const struct fetch_pack_args *args,
+			      const struct ref *wants, struct oidset *common,
+			      int *haves_to_send, int *in_vain)
+{
+	int ret = 0;
+	struct strbuf req_buf = STRBUF_INIT;
+
+	if (server_supports_v2("fetch", 1))
+		packet_buf_write(&req_buf, "command=fetch");
+	if (server_supports_v2("agent", 0))
+		packet_buf_write(&req_buf, "agent=%s", git_user_agent_sanitized());
+
+	packet_buf_delim(&req_buf);
+	if (args->use_thin_pack)
+		packet_buf_write(&req_buf, "thin-pack");
+	if (args->no_progress)
+		packet_buf_write(&req_buf, "no-progress");
+	if (args->include_tag)
+		packet_buf_write(&req_buf, "include-tag");
+	if (prefer_ofs_delta)
+		packet_buf_write(&req_buf, "ofs-delta");
+
+	/* add wants */
+	add_wants(wants, &req_buf);
+
+	/* Add all of the common commits we've found in previous rounds */
+	add_common(&req_buf, common);
+
+	/* Add initial haves */
+	ret = add_haves(&req_buf, haves_to_send, in_vain);
+
+	/* Send request */
+	packet_buf_flush(&req_buf);
+	write_or_die(fd_out, req_buf.buf, req_buf.len);
+
+	strbuf_release(&req_buf);
+	return ret;
+}
+
+/*
+ * Processes a section header in a server's response and checks if it matches
+ * `section`.  If the value of `peek` is 1, the header line will be peeked (and
+ * not consumed); if 0, the line will be consumed and the function will die if
+ * the section header doesn't match what was expected.
+ */
+static int process_section_header(struct packet_reader *reader,
+				  const char *section, int peek)
+{
+	int ret;
+
+	if (packet_reader_peek(reader) != PACKET_READ_NORMAL)
+		die("error reading packet");
+
+	ret = !strcmp(reader->line, section);
+
+	if (!peek) {
+		if (!ret)
+			die("expected '%s', received '%s'",
+			    section, reader->line);
+		packet_reader_read(reader);
+	}
+
+	return ret;
+}
+
+static int process_acks(struct packet_reader *reader, struct oidset *common)
+{
+	int received_ready = 0;
+	int received_ack = 0;
+
+	process_section_header(reader, "acknowledgments", 0);
+	while (packet_reader_read(reader) == PACKET_READ_NORMAL) {
+		const char *arg;
+
+		if (!strcmp(reader->line, "NAK"))
+			continue;
+
+		if (skip_prefix(reader->line, "ACK ", &arg)) {
+			struct object_id oid;
+			if (!get_oid_hex(arg, &oid)) {
+				struct commit *commit;
+				oidset_insert(common, &oid);
+				received_ack = 1;
+				commit = lookup_commit(&oid);
+				mark_common(commit, 0, 1);
+			}
+			continue;
+		}
+
+		if (!strcmp(reader->line, "ready")) {
+			clear_prio_queue(&rev_list);
+			received_ready = 1;
+			continue;
+		}
+
+		die("unexpected acknowledgment line: '%s'", reader->line);
+	}
+
+	if (reader->status != PACKET_READ_FLUSH &&
+	    reader->status != PACKET_READ_DELIM)
+		die("error processing acks: %d", reader->status);
+
+	/* return 0 if no common, 1 if there are common, or 2 if ready */
+	return received_ready ? 2 : (received_ack ? 1 : 0);
+}
+
+enum fetch_state {
+	FETCH_CHECK_LOCAL = 0,
+	FETCH_SEND_REQUEST,
+	FETCH_PROCESS_ACKS,
+	FETCH_GET_PACK,
+	FETCH_DONE,
+};
+
+static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
+				    int fd[2],
+				    const struct ref *orig_ref,
+				    struct ref **sought, int nr_sought,
+				    char **pack_lockfile)
+{
+	struct ref *ref = copy_ref_list(orig_ref);
+	enum fetch_state state = FETCH_CHECK_LOCAL;
+	struct oidset common = OIDSET_INIT;
+	struct packet_reader reader;
+	int in_vain = 0;
+	int haves_to_send = INITIAL_FLUSH;
+	packet_reader_init(&reader, fd[0], NULL, 0,
+			   PACKET_READ_CHOMP_NEWLINE);
+
+	while (state != FETCH_DONE) {
+		switch (state) {
+		case FETCH_CHECK_LOCAL:
+			sort_ref_list(&ref, ref_compare_name);
+			QSORT(sought, nr_sought, cmp_ref_by_name);
+
+			/* v2 supports these by default */
+			allow_unadvertised_object_request |= ALLOW_REACHABLE_SHA1;
+			use_sideband = 2;
+
+			if (marked)
+				for_each_ref(clear_marks, NULL);
+			marked = 1;
+
+			for_each_ref(rev_list_insert_ref_oid, NULL);
+			for_each_cached_alternate(insert_one_alternate_object);
+
+			/* Filter 'ref' by 'sought' and those that aren't local */
+			if (everything_local(args, &ref, sought, nr_sought))
+				state = FETCH_DONE;
+			else
+				state = FETCH_SEND_REQUEST;
+			break;
+		case FETCH_SEND_REQUEST:
+			if (send_fetch_request(fd[1], args, ref, &common,
+					       &haves_to_send, &in_vain))
+				state = FETCH_GET_PACK;
+			else
+				state = FETCH_PROCESS_ACKS;
+			break;
+		case FETCH_PROCESS_ACKS:
+			/* Process ACKs/NAKs */
+			switch (process_acks(&reader, &common)) {
+			case 2:
+				state = FETCH_GET_PACK;
+				break;
+			case 1:
+				in_vain = 0;
+				/* fallthrough */
+			default:
+				state = FETCH_SEND_REQUEST;
+				break;
+			}
+			break;
+		case FETCH_GET_PACK:
+			/* get the pack */
+			process_section_header(&reader, "packfile", 0);
+			if (get_pack(args, fd, pack_lockfile))
+				die(_("git fetch-pack: fetch failed."));
+
+			state = FETCH_DONE;
+			break;
+		case FETCH_DONE:
+			continue;
+		}
+	}
+
+	oidset_clear(&common);
+	return ref;
+}
+
 static void fetch_pack_config(void)
 {
 	git_config_get_int("fetch.unpacklimit", &fetch_unpack_limit);
@@ -1153,7 +1406,8 @@ struct ref *fetch_pack(struct fetch_pack_args *args,
 		       const char *dest,
 		       struct ref **sought, int nr_sought,
 		       struct oid_array *shallow,
-		       char **pack_lockfile)
+		       char **pack_lockfile,
+		       enum protocol_version version)
 {
 	struct ref *ref_cpy;
 	struct shallow_info si;
@@ -1167,8 +1421,12 @@ struct ref *fetch_pack(struct fetch_pack_args *args,
 		die(_("no matching remote head"));
 	}
 	prepare_shallow_info(&si, shallow);
-	ref_cpy = do_fetch_pack(args, fd, ref, sought, nr_sought,
-				&si, pack_lockfile);
+	if (version == protocol_v2)
+		ref_cpy = do_fetch_pack_v2(args, fd, ref, sought, nr_sought,
+					   pack_lockfile);
+	else
+		ref_cpy = do_fetch_pack(args, fd, ref, sought, nr_sought,
+					&si, pack_lockfile);
 	reprepare_packed_git();
 	update_shallow(args, sought, nr_sought, &si);
 	clear_shallow_info(&si);
diff --git a/fetch-pack.h b/fetch-pack.h
index b6aeb43a8..7afca7305 100644
--- a/fetch-pack.h
+++ b/fetch-pack.h
@@ -3,6 +3,7 @@
 
 #include "string-list.h"
 #include "run-command.h"
+#include "protocol.h"
 
 struct oid_array;
 
@@ -43,7 +44,8 @@ struct ref *fetch_pack(struct fetch_pack_args *args,
 		       struct ref **sought,
 		       int nr_sought,
 		       struct oid_array *shallow,
-		       char **pack_lockfile);
+		       char **pack_lockfile,
+		       enum protocol_version version);
 
 /*
  * Print an appropriate error message for each sought ref that wasn't
diff --git a/serve.c b/serve.c
index 05cc434cf..c3e58c1e7 100644
--- a/serve.c
+++ b/serve.c
@@ -53,7 +53,7 @@ struct protocol_capability {
 static struct protocol_capability capabilities[] = {
 	{ "agent", agent_advertise, NULL },
 	{ "ls-refs", always_advertise, ls_refs },
-	{ "fetch", always_advertise, upload_pack_v2 },
+	{ "fetch", upload_pack_advertise, upload_pack_v2 },
 };
 
 static void advertise_capabilities(void)
diff --git a/t/t5701-git-serve.sh b/t/t5701-git-serve.sh
index cc5918a67..569922f7a 100755
--- a/t/t5701-git-serve.sh
+++ b/t/t5701-git-serve.sh
@@ -9,7 +9,7 @@ test_expect_success 'test capability advertisement' '
 	version 2
 	agent=git/$(git version | cut -d" " -f3)
 	ls-refs
-	fetch
+	fetch=shallow
 	0000
 	EOF
 
diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh
index 562610fd2..4365ac273 100755
--- a/t/t5702-protocol-v2.sh
+++ b/t/t5702-protocol-v2.sh
@@ -45,6 +45,56 @@ test_expect_success 'ref advertisment is filtered with ls-remote using protocol
 	test_cmp actual expect
 '
 
+test_expect_success 'clone with git:// using protocol v2' '
+	test_when_finished "rm -f log" &&
+
+	GIT_TRACE_PACKET="$(pwd)/log" git -c protocol.version=2 \
+		clone "$GIT_DAEMON_URL/parent" daemon_child &&
+
+	git -C daemon_child log -1 --format=%s >actual &&
+	git -C "$daemon_parent" log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+
+	# Client requested to use protocol v2
+	grep "clone> .*\\\0\\\0version=2\\\0$" log &&
+	# Server responded using protocol v2
+	grep "clone< version 2" log
+'
+
+test_expect_success 'fetch with git:// using protocol v2' '
+	test_when_finished "rm -f log" &&
+
+	test_commit -C "$daemon_parent" two &&
+
+	GIT_TRACE_PACKET="$(pwd)/log" git -C daemon_child -c protocol.version=2 \
+		fetch &&
+
+	git -C daemon_child log -1 --format=%s origin/master >actual &&
+	git -C "$daemon_parent" log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+
+	# Client requested to use protocol v2
+	grep "fetch> .*\\\0\\\0version=2\\\0$" log &&
+	# Server responded using protocol v2
+	grep "fetch< version 2" log
+'
+
+test_expect_success 'pull with git:// using protocol v2' '
+	test_when_finished "rm -f log" &&
+
+	GIT_TRACE_PACKET="$(pwd)/log" git -C daemon_child -c protocol.version=2 \
+		pull &&
+
+	git -C daemon_child log -1 --format=%s >actual &&
+	git -C "$daemon_parent" log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+
+	# Client requested to use protocol v2
+	grep "fetch> .*\\\0\\\0version=2\\\0$" log &&
+	# Server responded using protocol v2
+	grep "fetch< version 2" log
+'
+
 stop_git_daemon
 
 # Test protocol v2 with 'file://' transport
@@ -80,4 +130,51 @@ test_expect_success 'ref advertisment is filtered with ls-remote using protocol
 	test_cmp actual expect
 '
 
+test_expect_success 'clone with file:// using protocol v2' '
+	test_when_finished "rm -f log" &&
+
+	GIT_TRACE_PACKET="$(pwd)/log" git -c protocol.version=2 \
+		clone "file://$(pwd)/file_parent" file_child &&
+
+	git -C file_child log -1 --format=%s >actual &&
+	git -C file_parent log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+
+	# Server responded using protocol v2
+	grep "clone< version 2" log
+'
+
+test_expect_success 'fetch with file:// using protocol v2' '
+	test_when_finished "rm -f log" &&
+
+	test_commit -C file_parent two &&
+
+	GIT_TRACE_PACKET="$(pwd)/log" git -C file_child -c protocol.version=2 \
+		fetch origin &&
+
+	git -C file_child log -1 --format=%s origin/master >actual &&
+	git -C file_parent log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+
+	# Server responded using protocol v2
+	grep "fetch< version 2" log
+'
+
+test_expect_success 'ref advertisement is filtered during fetch using protocol v2' '
+	test_when_finished "rm -f log" &&
+
+	test_commit -C file_parent three &&
+
+	GIT_TRACE_PACKET="$(pwd)/log" git -C file_child -c protocol.version=2 \
+		fetch origin master &&
+
+	git -C file_child log -1 --format=%s origin/master >actual &&
+	git -C file_parent log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+
+	! grep "refs/tags/one" log &&
+	! grep "refs/tags/two" log &&
+	! grep "refs/tags/three" log
+'
+
 test_done
diff --git a/transport.c b/transport.c
index bf7ba6879..8e38352c5 100644
--- a/transport.c
+++ b/transport.c
@@ -256,14 +256,17 @@ static int fetch_refs_via_pack(struct transport *transport,
 
 	switch (data->version) {
 	case protocol_v2:
-		die("support for protocol v2 not implemented yet");
+		refs = fetch_pack(&args, data->fd, data->conn,
+				  refs_tmp ? refs_tmp : transport->remote_refs,
+				  dest, to_fetch, nr_heads, &data->shallow,
+				  &transport->pack_lockfile, data->version);
 		break;
 	case protocol_v1:
 	case protocol_v0:
 		refs = fetch_pack(&args, data->fd, data->conn,
 				  refs_tmp ? refs_tmp : transport->remote_refs,
 				  dest, to_fetch, nr_heads, &data->shallow,
-				  &transport->pack_lockfile);
+				  &transport->pack_lockfile, data->version);
 		break;
 	case protocol_unknown_version:
 		BUG("unknown protocol version");
diff --git a/upload-pack.c b/upload-pack.c
index 2af6b1382..65a1beeb0 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -710,7 +710,6 @@ static void deepen(int depth, int deepen_relative,
 	}
 
 	send_unshallow(shallows);
-	packet_flush(1);
 }
 
 static void deepen_by_rev_list(int ac, const char **av,
@@ -722,7 +721,53 @@ static void deepen_by_rev_list(int ac, const char **av,
 	send_shallow(result);
 	free_commit_list(result);
 	send_unshallow(shallows);
-	packet_flush(1);
+}
+
+/* Returns 1 if a shallow list is sent or 0 otherwise */
+static int send_shallow_list(int depth, int deepen_rev_list,
+			     timestamp_t deepen_since,
+			     struct string_list *deepen_not,
+			     struct object_array *shallows)
+{
+	int ret = 0;
+
+	if (depth > 0 && deepen_rev_list)
+		die("git upload-pack: deepen and deepen-since (or deepen-not) cannot be used together");
+	if (depth > 0) {
+		deepen(depth, deepen_relative, shallows);
+		ret = 1;
+	} else if (deepen_rev_list) {
+		struct argv_array av = ARGV_ARRAY_INIT;
+		int i;
+
+		argv_array_push(&av, "rev-list");
+		if (deepen_since)
+			argv_array_pushf(&av, "--max-age=%"PRItime, deepen_since);
+		if (deepen_not->nr) {
+			argv_array_push(&av, "--not");
+			for (i = 0; i < deepen_not->nr; i++) {
+				struct string_list_item *s = deepen_not->items + i;
+				argv_array_push(&av, s->string);
+			}
+			argv_array_push(&av, "--not");
+		}
+		for (i = 0; i < want_obj.nr; i++) {
+			struct object *o = want_obj.objects[i].item;
+			argv_array_push(&av, oid_to_hex(&o->oid));
+		}
+		deepen_by_rev_list(av.argc, av.argv, shallows);
+		argv_array_clear(&av);
+		ret = 1;
+	} else {
+		if (shallows->nr > 0) {
+			int i;
+			for (i = 0; i < shallows->nr; i++)
+				register_shallow(&shallows->objects[i].item->oid);
+		}
+	}
+
+	shallow_nr += shallows->nr;
+	return ret;
 }
 
 static int process_shallow(const char *line, struct object_array *shallows)
@@ -884,40 +929,10 @@ static void receive_needs(void)
 
 	if (depth == 0 && !deepen_rev_list && shallows.nr == 0)
 		return;
-	if (depth > 0 && deepen_rev_list)
-		die("git upload-pack: deepen and deepen-since (or deepen-not) cannot be used together");
-	if (depth > 0)
-		deepen(depth, deepen_relative, &shallows);
-	else if (deepen_rev_list) {
-		struct argv_array av = ARGV_ARRAY_INIT;
-		int i;
 
-		argv_array_push(&av, "rev-list");
-		if (deepen_since)
-			argv_array_pushf(&av, "--max-age=%"PRItime, deepen_since);
-		if (deepen_not.nr) {
-			argv_array_push(&av, "--not");
-			for (i = 0; i < deepen_not.nr; i++) {
-				struct string_list_item *s = deepen_not.items + i;
-				argv_array_push(&av, s->string);
-			}
-			argv_array_push(&av, "--not");
-		}
-		for (i = 0; i < want_obj.nr; i++) {
-			struct object *o = want_obj.objects[i].item;
-			argv_array_push(&av, oid_to_hex(&o->oid));
-		}
-		deepen_by_rev_list(av.argc, av.argv, &shallows);
-		argv_array_clear(&av);
-	}
-	else
-		if (shallows.nr > 0) {
-			int i;
-			for (i = 0; i < shallows.nr; i++)
-				register_shallow(&shallows.objects[i].item->oid);
-		}
-
-	shallow_nr += shallows.nr;
+	if (send_shallow_list(depth, deepen_rev_list, deepen_since,
+			      &deepen_not, &shallows))
+		packet_flush(1);
 	object_array_clear(&shallows);
 }
 
@@ -1071,6 +1086,13 @@ struct upload_pack_data {
 	struct object_array wants;
 	struct oid_array haves;
 
+	struct object_array shallows;
+	struct string_list deepen_not;
+	int depth;
+	timestamp_t deepen_since;
+	int deepen_rev_list;
+	int deepen_relative;
+
 	unsigned stateless_rpc : 1;
 
 	unsigned use_thin_pack : 1;
@@ -1084,16 +1106,22 @@ static void upload_pack_data_init(struct upload_pack_data *data)
 {
 	struct object_array wants = OBJECT_ARRAY_INIT;
 	struct oid_array haves = OID_ARRAY_INIT;
+	struct object_array shallows = OBJECT_ARRAY_INIT;
+	struct string_list deepen_not = STRING_LIST_INIT_DUP;
 
 	memset(data, 0, sizeof(*data));
 	data->wants = wants;
 	data->haves = haves;
+	data->shallows = shallows;
+	data->deepen_not = deepen_not;
 }
 
 static void upload_pack_data_clear(struct upload_pack_data *data)
 {
 	object_array_clear(&data->wants);
 	oid_array_clear(&data->haves);
+	object_array_clear(&data->shallows);
+	string_list_clear(&data->deepen_not, 0);
 }
 
 static int parse_want(const char *line)
@@ -1178,6 +1206,22 @@ static void process_args(struct argv_array *args, struct upload_pack_data *data)
 			continue;
 		}
 
+		/* Shallow related arguments */
+		if (process_shallow(arg, &data->shallows))
+			continue;
+		if (process_deepen(arg, &data->depth))
+			continue;
+		if (process_deepen_since(arg, &data->deepen_since,
+					 &data->deepen_rev_list))
+			continue;
+		if (process_deepen_not(arg, &data->deepen_not,
+				       &data->deepen_rev_list))
+			continue;
+		if (!strcmp(arg, "deepen-relative")) {
+			data->deepen_relative = 1;
+			continue;
+		}
+
 		/* ignore unknown lines maybe? */
 		die("unexpect line: '%s'", arg);
 	}
@@ -1273,6 +1317,23 @@ static int process_haves_and_send_acks(struct upload_pack_data *data)
 	return ret;
 }
 
+static void send_shallow_info(struct upload_pack_data *data)
+{
+	/* No shallow info needs to be sent */
+	if (!data->depth && !data->deepen_rev_list && !data->shallows.nr &&
+	    !is_repository_shallow())
+		return;
+
+	packet_write_fmt(1, "shallow-info\n");
+
+	if (!send_shallow_list(data->depth, data->deepen_rev_list,
+			       data->deepen_since, &data->deepen_not,
+			       &data->shallows) && is_repository_shallow())
+		deepen(INFINITE_DEPTH, data->deepen_relative, &data->shallows);
+
+	packet_delim(1);
+}
+
 enum fetch_state {
 	FETCH_PROCESS_ARGS = 0,
 	FETCH_SEND_ACKS,
@@ -1320,6 +1381,8 @@ int upload_pack_v2(struct repository *r, struct argv_array *keys,
 				state = FETCH_DONE;
 			break;
 		case FETCH_SEND_PACK:
+			send_shallow_info(&data);
+
 			packet_write_fmt(1, "packfile\n");
 			create_pack_file();
 			state = FETCH_DONE;
@@ -1332,3 +1395,11 @@ int upload_pack_v2(struct repository *r, struct argv_array *keys,
 	upload_pack_data_clear(&data);
 	return 0;
 }
+
+int upload_pack_advertise(struct repository *r,
+			  struct strbuf *value)
+{
+	if (value)
+		strbuf_addstr(value, "shallow");
+	return 1;
+}
diff --git a/upload-pack.h b/upload-pack.h
index 6b7890238..7720f2142 100644
--- a/upload-pack.h
+++ b/upload-pack.h
@@ -14,5 +14,8 @@ struct repository;
 struct argv_array;
 extern int upload_pack_v2(struct repository *r, struct argv_array *keys,
 			  struct argv_array *args);
+struct strbuf;
+extern int upload_pack_advertise(struct repository *r,
+				 struct strbuf *value);
 
 #endif /* UPLOAD_PACK_H */
-- 
2.16.2.395.g2e18187dfd-goog



* [PATCH v4 22/35] fetch-pack: support shallow requests
  2018-02-28 23:22     ` [PATCH v4 " Brandon Williams
                         ` (20 preceding siblings ...)
  2018-02-28 23:22       ` [PATCH v4 21/35] fetch-pack: perform a fetch using v2 Brandon Williams
@ 2018-02-28 23:22       ` Brandon Williams
  2018-02-28 23:22       ` [PATCH v4 23/35] connect: refactor git_connect to only get the protocol version once Brandon Williams
                         ` (13 subsequent siblings)
  35 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-02-28 23:22 UTC (permalink / raw)
  To: git
  Cc: git, gitster, jrnieder, pclouds, peff, sbeller, stolee, Brandon Williams

Enable shallow clones and deepen requests using protocol version 2 if
the server's 'fetch' command advertises the 'shallow' feature.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 connect.c    | 22 ++++++++++++++++
 connect.h    |  2 ++
 fetch-pack.c | 71 +++++++++++++++++++++++++++++++++++++++++++++++++++-
 3 files changed, 94 insertions(+), 1 deletion(-)

diff --git a/connect.c b/connect.c
index 6203ce576..66a9443c8 100644
--- a/connect.c
+++ b/connect.c
@@ -82,6 +82,28 @@ int server_supports_v2(const char *c, int die_on_error)
 	return 0;
 }
 
+int server_supports_feature(const char *c, const char *feature,
+			    int die_on_error)
+{
+	int i;
+
+	for (i = 0; i < server_capabilities_v2.argc; i++) {
+		const char *out;
+		if (skip_prefix(server_capabilities_v2.argv[i], c, &out) &&
+		    (!*out || *(out++) == '=')) {
+			if (parse_feature_request(out, feature))
+				return 1;
+			else
+				break;
+		}
+	}
+
+	if (die_on_error)
+		die("server doesn't support feature '%s'", feature);
+
+	return 0;
+}
+
 static void process_capabilities_v2(struct packet_reader *reader)
 {
 	while (packet_reader_read(reader) == PACKET_READ_NORMAL)
diff --git a/connect.h b/connect.h
index 8898d4495..0e69c6709 100644
--- a/connect.h
+++ b/connect.h
@@ -17,5 +17,7 @@ struct packet_reader;
 extern enum protocol_version discover_version(struct packet_reader *reader);
 
 extern int server_supports_v2(const char *c, int die_on_error);
+extern int server_supports_feature(const char *c, const char *feature,
+				   int die_on_error);
 
 #endif
diff --git a/fetch-pack.c b/fetch-pack.c
index dffcfd66a..837e1fd21 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1008,6 +1008,26 @@ static struct ref *do_fetch_pack(struct fetch_pack_args *args,
 	return ref;
 }
 
+static void add_shallow_requests(struct strbuf *req_buf,
+				 const struct fetch_pack_args *args)
+{
+	if (is_repository_shallow())
+		write_shallow_commits(req_buf, 1, NULL);
+	if (args->depth > 0)
+		packet_buf_write(req_buf, "deepen %d", args->depth);
+	if (args->deepen_since) {
+		timestamp_t max_age = approxidate(args->deepen_since);
+		packet_buf_write(req_buf, "deepen-since %"PRItime, max_age);
+	}
+	if (args->deepen_not) {
+		int i;
+		for (i = 0; i < args->deepen_not->nr; i++) {
+			struct string_list_item *s = args->deepen_not->items + i;
+			packet_buf_write(req_buf, "deepen-not %s", s->string);
+		}
+	}
+}
+
 static void add_wants(const struct ref *wants, struct strbuf *req_buf)
 {
 	for ( ; wants ; wants = wants->next) {
@@ -1093,6 +1113,12 @@ static int send_fetch_request(int fd_out, const struct fetch_pack_args *args,
 	if (prefer_ofs_delta)
 		packet_buf_write(&req_buf, "ofs-delta");
 
+	/* Add shallow-info and deepen request */
+	if (server_supports_feature("fetch", "shallow", 0))
+		add_shallow_requests(&req_buf, args);
+	else if (is_repository_shallow() || args->deepen)
+		die(_("Server does not support shallow requests"));
+
 	/* add wants */
 	add_wants(wants, &req_buf);
 
@@ -1122,7 +1148,7 @@ static int process_section_header(struct packet_reader *reader,
 	int ret;
 
 	if (packet_reader_peek(reader) != PACKET_READ_NORMAL)
-		die("error reading packet");
+		die("error reading section header '%s'", section);
 
 	ret = !strcmp(reader->line, section);
 
@@ -1177,6 +1203,43 @@ static int process_acks(struct packet_reader *reader, struct oidset *common)
 	return received_ready ? 2 : (received_ack ? 1 : 0);
 }
 
+static void receive_shallow_info(struct fetch_pack_args *args,
+				 struct packet_reader *reader)
+{
+	process_section_header(reader, "shallow-info", 0);
+	while (packet_reader_read(reader) == PACKET_READ_NORMAL) {
+		const char *arg;
+		struct object_id oid;
+
+		if (skip_prefix(reader->line, "shallow ", &arg)) {
+			if (get_oid_hex(arg, &oid))
+				die(_("invalid shallow line: %s"), reader->line);
+			register_shallow(&oid);
+			continue;
+		}
+		if (skip_prefix(reader->line, "unshallow ", &arg)) {
+			if (get_oid_hex(arg, &oid))
+				die(_("invalid unshallow line: %s"), reader->line);
+			if (!lookup_object(oid.hash))
+				die(_("object not found: %s"), reader->line);
+			/* make sure that it is parsed as shallow */
+			if (!parse_object(&oid))
+				die(_("error in object: %s"), reader->line);
+			if (unregister_shallow(&oid))
+				die(_("no shallow found: %s"), reader->line);
+			continue;
+		}
+		die(_("expected shallow/unshallow, got %s"), reader->line);
+	}
+
+	if (reader->status != PACKET_READ_FLUSH &&
+	    reader->status != PACKET_READ_DELIM)
+		die("error processing shallow info: %d", reader->status);
+
+	setup_alternate_shallow(&shallow_lock, &alternate_shallow_file, NULL);
+	args->deepen = 1;
+}
+
 enum fetch_state {
 	FETCH_CHECK_LOCAL = 0,
 	FETCH_SEND_REQUEST,
@@ -1209,6 +1272,8 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
 			/* v2 supports these by default */
 			allow_unadvertised_object_request |= ALLOW_REACHABLE_SHA1;
 			use_sideband = 2;
+			if (args->depth > 0 || args->deepen_since || args->deepen_not)
+				args->deepen = 1;
 
 			if (marked)
 				for_each_ref(clear_marks, NULL);
@@ -1245,6 +1310,10 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
 			}
 			break;
 		case FETCH_GET_PACK:
+			/* Check for shallow-info section */
+			if (process_section_header(&reader, "shallow-info", 1))
+				receive_shallow_info(args, &reader);
+
 			/* get the pack */
 			process_section_header(&reader, "packfile", 0);
 			if (get_pack(args, fd, pack_lockfile))
-- 
2.16.2.395.g2e18187dfd-goog



* [PATCH v4 23/35] connect: refactor git_connect to only get the protocol version once
  2018-02-28 23:22     ` [PATCH v4 " Brandon Williams
                         ` (21 preceding siblings ...)
  2018-02-28 23:22       ` [PATCH v4 22/35] fetch-pack: support shallow requests Brandon Williams
@ 2018-02-28 23:22       ` Brandon Williams
  2018-02-28 23:22       ` [PATCH v4 24/35] connect: don't request v2 when pushing Brandon Williams
                         ` (12 subsequent siblings)
  35 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-02-28 23:22 UTC (permalink / raw)
  To: git
  Cc: git, gitster, jrnieder, pclouds, peff, sbeller, stolee, Brandon Williams

Instead of having each builtin transport ask which protocol version the
user has configured in 'protocol.version' by calling
`get_protocol_version_config()` multiple times, factor this logic out
so there is just a single call at the beginning of `git_connect()`.

This will be helpful in the next patch where we can have centralized
logic which determines if we need to request a different protocol
version than what the user has configured.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 connect.c | 27 +++++++++++++++------------
 1 file changed, 15 insertions(+), 12 deletions(-)

diff --git a/connect.c b/connect.c
index 66a9443c8..a0bfcdf4f 100644
--- a/connect.c
+++ b/connect.c
@@ -1035,6 +1035,7 @@ static enum ssh_variant determine_ssh_variant(const char *ssh_command,
  */
 static struct child_process *git_connect_git(int fd[2], char *hostandport,
 					     const char *path, const char *prog,
+					     enum protocol_version version,
 					     int flags)
 {
 	struct child_process *conn;
@@ -1073,10 +1074,10 @@ static struct child_process *git_connect_git(int fd[2], char *hostandport,
 		    target_host, 0);
 
 	/* If using a new version put that stuff here after a second null byte */
-	if (get_protocol_version_config() > 0) {
+	if (version > 0) {
 		strbuf_addch(&request, '\0');
 		strbuf_addf(&request, "version=%d%c",
-			    get_protocol_version_config(), '\0');
+			    version, '\0');
 	}
 
 	packet_write(fd[1], request.buf, request.len);
@@ -1092,14 +1093,14 @@ static struct child_process *git_connect_git(int fd[2], char *hostandport,
  */
 static void push_ssh_options(struct argv_array *args, struct argv_array *env,
 			     enum ssh_variant variant, const char *port,
-			     int flags)
+			     enum protocol_version version, int flags)
 {
 	if (variant == VARIANT_SSH &&
-	    get_protocol_version_config() > 0) {
+	    version > 0) {
 		argv_array_push(args, "-o");
 		argv_array_push(args, "SendEnv=" GIT_PROTOCOL_ENVIRONMENT);
 		argv_array_pushf(env, GIT_PROTOCOL_ENVIRONMENT "=version=%d",
-				 get_protocol_version_config());
+				 version);
 	}
 
 	if (flags & CONNECT_IPV4) {
@@ -1152,7 +1153,8 @@ static void push_ssh_options(struct argv_array *args, struct argv_array *env,
 
 /* Prepare a child_process for use by Git's SSH-tunneled transport. */
 static void fill_ssh_args(struct child_process *conn, const char *ssh_host,
-			  const char *port, int flags)
+			  const char *port, enum protocol_version version,
+			  int flags)
 {
 	const char *ssh;
 	enum ssh_variant variant;
@@ -1186,14 +1188,14 @@ static void fill_ssh_args(struct child_process *conn, const char *ssh_host,
 		argv_array_push(&detect.args, ssh);
 		argv_array_push(&detect.args, "-G");
 		push_ssh_options(&detect.args, &detect.env_array,
-				 VARIANT_SSH, port, flags);
+				 VARIANT_SSH, port, version, flags);
 		argv_array_push(&detect.args, ssh_host);
 
 		variant = run_command(&detect) ? VARIANT_SIMPLE : VARIANT_SSH;
 	}
 
 	argv_array_push(&conn->args, ssh);
-	push_ssh_options(&conn->args, &conn->env_array, variant, port, flags);
+	push_ssh_options(&conn->args, &conn->env_array, variant, port, version, flags);
 	argv_array_push(&conn->args, ssh_host);
 }
 
@@ -1214,6 +1216,7 @@ struct child_process *git_connect(int fd[2], const char *url,
 	char *hostandport, *path;
 	struct child_process *conn;
 	enum protocol protocol;
+	enum protocol_version version = get_protocol_version_config();
 
 	/* Without this we cannot rely on waitpid() to tell
 	 * what happened to our children.
@@ -1228,7 +1231,7 @@ struct child_process *git_connect(int fd[2], const char *url,
 		printf("Diag: path=%s\n", path ? path : "NULL");
 		conn = NULL;
 	} else if (protocol == PROTO_GIT) {
-		conn = git_connect_git(fd, hostandport, path, prog, flags);
+		conn = git_connect_git(fd, hostandport, path, prog, version, flags);
 	} else {
 		struct strbuf cmd = STRBUF_INIT;
 		const char *const *var;
@@ -1271,12 +1274,12 @@ struct child_process *git_connect(int fd[2], const char *url,
 				strbuf_release(&cmd);
 				return NULL;
 			}
-			fill_ssh_args(conn, ssh_host, port, flags);
+			fill_ssh_args(conn, ssh_host, port, version, flags);
 		} else {
 			transport_check_allowed("file");
-			if (get_protocol_version_config() > 0) {
+			if (version > 0) {
 				argv_array_pushf(&conn->env_array, GIT_PROTOCOL_ENVIRONMENT "=version=%d",
-						 get_protocol_version_config());
+						 version);
 			}
 		}
 		argv_array_push(&conn->args, cmd.buf);
-- 
2.16.2.395.g2e18187dfd-goog


^ permalink raw reply related	[flat|nested] 362+ messages in thread

* [PATCH v4 24/35] connect: don't request v2 when pushing
  2018-02-28 23:22     ` [PATCH v4 " Brandon Williams
                         ` (22 preceding siblings ...)
  2018-02-28 23:22       ` [PATCH v4 23/35] connect: refactor git_connect to only get the protocol version once Brandon Williams
@ 2018-02-28 23:22       ` Brandon Williams
  2018-02-28 23:22       ` [PATCH v4 25/35] transport-helper: remove name parameter Brandon Williams
                         ` (11 subsequent siblings)
  35 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-02-28 23:22 UTC (permalink / raw)
  To: git
  Cc: git, gitster, jrnieder, pclouds, peff, sbeller, stolee, Brandon Williams

In order to ship protocol v2 with support for fetch only, we
need clients to not request protocol v2 when pushing
(since the client currently doesn't know how to push using protocol v2).
This allows a client to have protocol v2 configured in
`protocol.version`, taking advantage of v2 for fetch while falling
back to v0 when pushing, until v2 for push has been designed.

We could run into issues if we didn't fall back to protocol v0 when
pushing right now.  This is because currently a server will ignore a request to
use v2 when contacting the 'receive-pack' endpoint and fall back to
using v0, but when push v2 is rolled out to servers, the 'receive-pack'
endpoint will start responding using v2.  So we don't want to get into a
state where a client is requesting to push with v2 before they actually
know how to push using v2.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
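The fallback rule described above can be sketched in isolation as
follows.  This is only an illustration: the enum and the function name
`version_to_request()` are hypothetical stand-ins, not git's own
`enum protocol_version` or any function in connect.c.

```c
#include <string.h>

/* Hypothetical stand-in for git's enum protocol_version. */
enum sketch_version { SKETCH_V0 = 0, SKETCH_V1 = 1, SKETCH_V2 = 2 };

/*
 * Return the version to request for the service named by 'prog'.
 * A configured v2 is downgraded to v0 for pushes ('git-receive-pack');
 * every other combination is passed through unchanged.
 */
static enum sketch_version version_to_request(enum sketch_version configured,
					       const char *prog)
{
	if (configured == SKETCH_V2 && !strcmp(prog, "git-receive-pack"))
		return SKETCH_V0;
	return configured;
}
```

With this shape, `version_to_request(SKETCH_V2, "git-upload-pack")`
still yields v2, so fetch keeps using the configured version.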
 connect.c              |  8 ++++++++
 t/t5702-protocol-v2.sh | 24 ++++++++++++++++++++++++
 2 files changed, 32 insertions(+)

diff --git a/connect.c b/connect.c
index a0bfcdf4f..32284d050 100644
--- a/connect.c
+++ b/connect.c
@@ -1218,6 +1218,14 @@ struct child_process *git_connect(int fd[2], const char *url,
 	enum protocol protocol;
 	enum protocol_version version = get_protocol_version_config();
 
+	/*
+	 * NEEDSWORK: If we are trying to use protocol v2 and we are planning
+	 * to perform a push, then fallback to v0 since the client doesn't know
+	 * how to push yet using v2.
+	 */
+	if (version == protocol_v2 && !strcmp("git-receive-pack", prog))
+		version = protocol_v0;
+
 	/* Without this we cannot rely on waitpid() to tell
 	 * what happened to our children.
 	 */
diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh
index 4365ac273..e3a7c09d4 100755
--- a/t/t5702-protocol-v2.sh
+++ b/t/t5702-protocol-v2.sh
@@ -95,6 +95,30 @@ test_expect_success 'pull with git:// using protocol v2' '
 	grep "fetch< version 2" log
 '
 
+test_expect_success 'push with git:// and a config of v2 does not request v2' '
+	test_when_finished "rm -f log" &&
+
+	# Till v2 for push is designed, make sure that if a client has
+	# protocol.version configured to use v2, that the client instead falls
+	# back and uses v0.
+
+	test_commit -C daemon_child three &&
+
+	# Push to another branch, as the target repository has the
+	# master branch checked out and we cannot push into it.
+	GIT_TRACE_PACKET="$(pwd)/log" git -C daemon_child -c protocol.version=2 \
+		push origin HEAD:client_branch &&
+
+	git -C daemon_child log -1 --format=%s >actual &&
+	git -C "$daemon_parent" log -1 --format=%s client_branch >expect &&
+	test_cmp expect actual &&
+
+	# Client requested to use protocol v2
+	! grep "push> .*\\\0\\\0version=2\\\0$" log &&
+	# Server responded using protocol v2
+	! grep "push< version 2" log
+'
+
 stop_git_daemon
 
 # Test protocol v2 with 'file://' transport
-- 
2.16.2.395.g2e18187dfd-goog


^ permalink raw reply related	[flat|nested] 362+ messages in thread

* [PATCH v4 25/35] transport-helper: remove name parameter
  2018-02-28 23:22     ` [PATCH v4 " Brandon Williams
                         ` (23 preceding siblings ...)
  2018-02-28 23:22       ` [PATCH v4 24/35] connect: don't request v2 when pushing Brandon Williams
@ 2018-02-28 23:22       ` Brandon Williams
  2018-02-28 23:22       ` [PATCH v4 26/35] transport-helper: refactor process_connect_service Brandon Williams
                         ` (10 subsequent siblings)
  35 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-02-28 23:22 UTC (permalink / raw)
  To: git
  Cc: git, gitster, jrnieder, pclouds, peff, sbeller, stolee, Brandon Williams

Commit 266f1fdfa (transport-helper: be quiet on read errors from
helpers, 2013-06-21) removed a call to 'die()' which printed the name of
the remote helper passed in to the 'recvline_fh()' function using the
'name' parameter.  Once the call to 'die()' was removed the parameter
was no longer necessary but wasn't removed.  Clean up 'recvline_fh()'
parameter list by removing the 'name' parameter.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 transport-helper.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/transport-helper.c b/transport-helper.c
index 4c334b5ee..d72155768 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -49,7 +49,7 @@ static void sendline(struct helper_data *helper, struct strbuf *buffer)
 		die_errno("Full write to remote helper failed");
 }
 
-static int recvline_fh(FILE *helper, struct strbuf *buffer, const char *name)
+static int recvline_fh(FILE *helper, struct strbuf *buffer)
 {
 	strbuf_reset(buffer);
 	if (debug)
@@ -67,7 +67,7 @@ static int recvline_fh(FILE *helper, struct strbuf *buffer, const char *name)
 
 static int recvline(struct helper_data *helper, struct strbuf *buffer)
 {
-	return recvline_fh(helper->out, buffer, helper->name);
+	return recvline_fh(helper->out, buffer);
 }
 
 static void write_constant(int fd, const char *str)
@@ -586,7 +586,7 @@ static int process_connect_service(struct transport *transport,
 		goto exit;
 
 	sendline(data, &cmdbuf);
-	if (recvline_fh(input, &cmdbuf, name))
+	if (recvline_fh(input, &cmdbuf))
 		exit(128);
 
 	if (!strcmp(cmdbuf.buf, "")) {
-- 
2.16.2.395.g2e18187dfd-goog


^ permalink raw reply related	[flat|nested] 362+ messages in thread

* [PATCH v4 26/35] transport-helper: refactor process_connect_service
  2018-02-28 23:22     ` [PATCH v4 " Brandon Williams
                         ` (24 preceding siblings ...)
  2018-02-28 23:22       ` [PATCH v4 25/35] transport-helper: remove name parameter Brandon Williams
@ 2018-02-28 23:22       ` Brandon Williams
  2018-02-28 23:22       ` [PATCH v4 27/35] transport-helper: introduce stateless-connect Brandon Williams
                         ` (9 subsequent siblings)
  35 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-02-28 23:22 UTC (permalink / raw)
  To: git
  Cc: git, gitster, jrnieder, pclouds, peff, sbeller, stolee, Brandon Williams

A future patch will need to reuse the logic which runs the connect
command on a remote helper and processes its response, so factor this
logic out of 'process_connect_service()' and place it into a helper
function 'run_connect()'.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 transport-helper.c | 67 ++++++++++++++++++++++++++--------------------
 1 file changed, 38 insertions(+), 29 deletions(-)

diff --git a/transport-helper.c b/transport-helper.c
index d72155768..c032a2a87 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -545,14 +545,13 @@ static int fetch_with_import(struct transport *transport,
 	return 0;
 }
 
-static int process_connect_service(struct transport *transport,
-				   const char *name, const char *exec)
+static int run_connect(struct transport *transport, struct strbuf *cmdbuf)
 {
 	struct helper_data *data = transport->data;
-	struct strbuf cmdbuf = STRBUF_INIT;
-	struct child_process *helper;
-	int r, duped, ret = 0;
+	int ret = 0;
+	int duped;
 	FILE *input;
+	struct child_process *helper;
 
 	helper = get_helper(transport);
 
@@ -568,44 +567,54 @@ static int process_connect_service(struct transport *transport,
 	input = xfdopen(duped, "r");
 	setvbuf(input, NULL, _IONBF, 0);
 
+	sendline(data, cmdbuf);
+	if (recvline_fh(input, cmdbuf))
+		exit(128);
+
+	if (!strcmp(cmdbuf->buf, "")) {
+		data->no_disconnect_req = 1;
+		if (debug)
+			fprintf(stderr, "Debug: Smart transport connection "
+				"ready.\n");
+		ret = 1;
+	} else if (!strcmp(cmdbuf->buf, "fallback")) {
+		if (debug)
+			fprintf(stderr, "Debug: Falling back to dumb "
+				"transport.\n");
+	} else {
+		die("Unknown response to connect: %s",
+			cmdbuf->buf);
+	}
+
+	fclose(input);
+	return ret;
+}
+
+static int process_connect_service(struct transport *transport,
+				   const char *name, const char *exec)
+{
+	struct helper_data *data = transport->data;
+	struct strbuf cmdbuf = STRBUF_INIT;
+	int ret = 0;
+
 	/*
 	 * Handle --upload-pack and friends. This is fire and forget...
 	 * just warn if it fails.
 	 */
 	if (strcmp(name, exec)) {
-		r = set_helper_option(transport, "servpath", exec);
+		int r = set_helper_option(transport, "servpath", exec);
 		if (r > 0)
 			warning("Setting remote service path not supported by protocol.");
 		else if (r < 0)
 			warning("Invalid remote service path.");
 	}
 
-	if (data->connect)
+	if (data->connect) {
 		strbuf_addf(&cmdbuf, "connect %s\n", name);
-	else
-		goto exit;
-
-	sendline(data, &cmdbuf);
-	if (recvline_fh(input, &cmdbuf))
-		exit(128);
-
-	if (!strcmp(cmdbuf.buf, "")) {
-		data->no_disconnect_req = 1;
-		if (debug)
-			fprintf(stderr, "Debug: Smart transport connection "
-				"ready.\n");
-		ret = 1;
-	} else if (!strcmp(cmdbuf.buf, "fallback")) {
-		if (debug)
-			fprintf(stderr, "Debug: Falling back to dumb "
-				"transport.\n");
-	} else
-		die("Unknown response to connect: %s",
-			cmdbuf.buf);
+		ret = run_connect(transport, &cmdbuf);
+	}
 
-exit:
 	strbuf_release(&cmdbuf);
-	fclose(input);
 	return ret;
 }
 
-- 
2.16.2.395.g2e18187dfd-goog


^ permalink raw reply related	[flat|nested] 362+ messages in thread

* [PATCH v4 27/35] transport-helper: introduce stateless-connect
  2018-02-28 23:22     ` [PATCH v4 " Brandon Williams
                         ` (25 preceding siblings ...)
  2018-02-28 23:22       ` [PATCH v4 26/35] transport-helper: refactor process_connect_service Brandon Williams
@ 2018-02-28 23:22       ` Brandon Williams
  2018-03-13 16:30         ` Jonathan Tan
  2018-02-28 23:22       ` [PATCH v4 28/35] pkt-line: add packet_buf_write_len function Brandon Williams
                         ` (8 subsequent siblings)
  35 siblings, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-02-28 23:22 UTC (permalink / raw)
  To: git
  Cc: git, gitster, jrnieder, pclouds, peff, sbeller, stolee, Brandon Williams

Introduce the transport-helper capability 'stateless-connect'.  This
capability indicates that the transport-helper can be requested to run
the 'stateless-connect' command which should attempt to make a
stateless connection with a remote end.  Once established, the
connection can be used by the git client to communicate with
the remote end natively in a stateless-rpc manner as supported by
protocol v2.  This means that the client must send everything the server
needs in a single request as the client must not assume any
state-storing on the part of the server or transport.

If a stateless connection cannot be established then the remote-helper
will respond in the same manner as the 'connect' command indicating that
the client should fall back to using the dumb remote-helper commands.

A future patch will implement the 'stateless-connect' capability in our
http remote-helper (remote-curl) so that protocol v2 can be used over
the http transport.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
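As a rough sketch of the handshake described above: the client only
tries 'stateless-connect' when the helper lacks 'connect', advertises
'stateless-connect', v2 is configured and the service is
'git-upload-pack'; it then interprets the helper's one-line reply.
The names below (`should_try_stateless_connect()`,
`classify_connect_reply()`, the enum) are hypothetical and only
illustrate the decision logic; the reply line is assumed to have had
its trailing newline stripped already.

```c
#include <string.h>

/* Hypothetical classification of the helper's one-line reply. */
enum connect_reply {
	CONNECT_REPLY_ESTABLISHED,	/* empty line: connection is ready */
	CONNECT_REPLY_FALLBACK,		/* "fallback": use dumb commands */
	CONNECT_REPLY_UNKNOWN		/* anything else is a protocol error */
};

/* Decide whether 'stateless-connect' should be attempted at all. */
static int should_try_stateless_connect(int has_connect,
					int has_stateless_connect,
					int configured_version,
					const char *service)
{
	if (has_connect)
		return 0;	/* plain 'connect' takes priority */
	return has_stateless_connect &&
	       configured_version == 2 &&
	       !strcmp(service, "git-upload-pack");
}

/* Interpret the reply to 'connect' or 'stateless-connect'. */
static enum connect_reply classify_connect_reply(const char *line)
{
	if (!strcmp(line, ""))
		return CONNECT_REPLY_ESTABLISHED;
	if (!strcmp(line, "fallback"))
		return CONNECT_REPLY_FALLBACK;
	return CONNECT_REPLY_UNKNOWN;
}
```

In the patch itself an unknown reply is fatal, and an established
stateless connection additionally switches the transport into
stateless-rpc mode.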
 Documentation/gitremote-helpers.txt | 32 +++++++++++++++++++++++++++++
 transport-helper.c                  | 11 ++++++++++
 transport.c                         |  1 +
 transport.h                         |  6 ++++++
 4 files changed, 50 insertions(+)

diff --git a/Documentation/gitremote-helpers.txt b/Documentation/gitremote-helpers.txt
index 4a584f3c5..a8361ed95 100644
--- a/Documentation/gitremote-helpers.txt
+++ b/Documentation/gitremote-helpers.txt
@@ -102,6 +102,14 @@ Capabilities for Pushing
 +
 Supported commands: 'connect'.
 
+'stateless-connect'::
+	Experimental; for internal use only.
+	Can attempt to connect to a remote server for communication
+	using git's wire-protocol version 2.  This establishes a
+	stateless, half-duplex connection.
++
+Supported commands: 'stateless-connect'.
+
 'push'::
 	Can discover remote refs and push local commits and the
 	history leading up to them to new or existing remote refs.
@@ -136,6 +144,14 @@ Capabilities for Fetching
 +
 Supported commands: 'connect'.
 
+'stateless-connect'::
+	Experimental; for internal use only.
+	Can attempt to connect to a remote server for communication
+	using git's wire-protocol version 2.  This establishes a
+	stateless, half-duplex connection.
++
+Supported commands: 'stateless-connect'.
+
 'fetch'::
 	Can discover remote refs and transfer objects reachable from
 	them to the local object store.
@@ -375,6 +391,22 @@ Supported if the helper has the "export" capability.
 +
 Supported if the helper has the "connect" capability.
 
+'stateless-connect' <service>::
+	Experimental; for internal use only.
+	Connects to the given remote service for communication using
+	git's wire-protocol version 2.  This establishes a stateless,
+	half-duplex connection.  Valid replies to this command are empty
+	line (connection established), 'fallback' (no smart transport
+	support, fall back to dumb transports) and just exiting with
+	error message printed (can't connect, don't bother trying to
+	fall back).  After line feed terminating the positive (empty)
+	response, the output of the service starts.  Messages (both
+	request and response) must be terminated with a single flush
+	packet, allowing the remote helper to properly act as a proxy.
+	After the connection ends, the remote helper exits.
++
+Supported if the helper has the "stateless-connect" capability.
+
 If a fatal error occurs, the program writes the error message to
 stderr and exits. The caller should expect that a suitable error
 message has been printed if the child closes the connection without
diff --git a/transport-helper.c b/transport-helper.c
index c032a2a87..e20a5076e 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -12,6 +12,7 @@
 #include "argv-array.h"
 #include "refs.h"
 #include "transport-internal.h"
+#include "protocol.h"
 
 static int debug;
 
@@ -26,6 +27,7 @@ struct helper_data {
 		option : 1,
 		push : 1,
 		connect : 1,
+		stateless_connect : 1,
 		signed_tags : 1,
 		check_connectivity : 1,
 		no_disconnect_req : 1,
@@ -188,6 +190,8 @@ static struct child_process *get_helper(struct transport *transport)
 			refspecs[refspec_nr++] = xstrdup(arg);
 		} else if (!strcmp(capname, "connect")) {
 			data->connect = 1;
+		} else if (!strcmp(capname, "stateless-connect")) {
+			data->stateless_connect = 1;
 		} else if (!strcmp(capname, "signed-tags")) {
 			data->signed_tags = 1;
 		} else if (skip_prefix(capname, "export-marks ", &arg)) {
@@ -612,6 +616,13 @@ static int process_connect_service(struct transport *transport,
 	if (data->connect) {
 		strbuf_addf(&cmdbuf, "connect %s\n", name);
 		ret = run_connect(transport, &cmdbuf);
+	} else if (data->stateless_connect &&
+		   (get_protocol_version_config() == protocol_v2) &&
+		   !strcmp("git-upload-pack", name)) {
+		strbuf_addf(&cmdbuf, "stateless-connect %s\n", name);
+		ret = run_connect(transport, &cmdbuf);
+		if (ret)
+			transport->stateless_rpc = 1;
 	}
 
 	strbuf_release(&cmdbuf);
diff --git a/transport.c b/transport.c
index 8e38352c5..2e7b7a715 100644
--- a/transport.c
+++ b/transport.c
@@ -250,6 +250,7 @@ static int fetch_refs_via_pack(struct transport *transport,
 		data->options.check_self_contained_and_connected;
 	args.cloning = transport->cloning;
 	args.update_shallow = data->options.update_shallow;
+	args.stateless_rpc = transport->stateless_rpc;
 
 	if (!data->got_remote_heads)
 		refs_tmp = get_refs_via_connect(transport, 0, NULL);
diff --git a/transport.h b/transport.h
index daea4770c..0ef0d1902 100644
--- a/transport.h
+++ b/transport.h
@@ -55,6 +55,12 @@ struct transport {
 	 */
 	unsigned cloning : 1;
 
+	/*
+	 * Indicates that the transport is connected via a half-duplex
+	 * connection and should operate in stateless-rpc mode.
+	 */
+	unsigned stateless_rpc : 1;
+
 	/*
 	 * These strings will be passed to the {pre, post}-receive hook,
 	 * on the remote side, if both sides support the push options capability.
-- 
2.16.2.395.g2e18187dfd-goog


^ permalink raw reply related	[flat|nested] 362+ messages in thread

* [PATCH v4 28/35] pkt-line: add packet_buf_write_len function
  2018-02-28 23:22     ` [PATCH v4 " Brandon Williams
                         ` (26 preceding siblings ...)
  2018-02-28 23:22       ` [PATCH v4 27/35] transport-helper: introduce stateless-connect Brandon Williams
@ 2018-02-28 23:22       ` Brandon Williams
  2018-02-28 23:22       ` [PATCH v4 29/35] remote-curl: create copy of the service name Brandon Williams
                         ` (7 subsequent siblings)
  35 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-02-28 23:22 UTC (permalink / raw)
  To: git
  Cc: git, gitster, jrnieder, pclouds, peff, sbeller, stolee, Brandon Williams

Add the 'packet_buf_write_len()' function which allows for writing an
arbitrary length buffer into a 'struct strbuf' and formatting it in
packet-line format.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
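For readers unfamiliar with packet-line framing, here is a minimal
standalone sketch of the encoding: a four-digit lowercase hex length
(which counts the four header bytes themselves) followed by the
payload.  It writes into a plain caller-supplied buffer instead of a
'struct strbuf', and `pkt_line_encode()` and `SKETCH_PACKET_MAX` are
hypothetical names chosen for this illustration; the limit mirrors
git's LARGE_PACKET_MAX of 65520.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_PACKET_MAX 65520	/* same value as git's LARGE_PACKET_MAX */

/*
 * Encode data[0..len) as a single pkt-line into 'out'.
 * Returns the number of bytes written (len + 4), or 0 if the packet
 * would exceed the maximum packet size or would not fit in 'out'.
 */
static size_t pkt_line_encode(char *out, size_t out_size,
			      const char *data, size_t len)
{
	static const char hex[] = "0123456789abcdef";
	size_t total;

	if (len > SKETCH_PACKET_MAX - 4 || len + 4 > out_size)
		return 0;
	total = len + 4;

	/* The length header covers itself plus the payload. */
	out[0] = hex[(total >> 12) & 0xf];
	out[1] = hex[(total >> 8) & 0xf];
	out[2] = hex[(total >> 4) & 0xf];
	out[3] = hex[total & 0xf];
	memcpy(out + 4, data, len);
	return total;
}
```

Encoding the six bytes "hello\n" this way produces the ten bytes
"000ahello\n".  Unlike this sketch, the function added by the patch
dies on an over-long packet rather than returning an error.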
 pkt-line.c | 16 ++++++++++++++++
 pkt-line.h |  1 +
 2 files changed, 17 insertions(+)

diff --git a/pkt-line.c b/pkt-line.c
index 87a24bd17..5223e24e2 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -215,6 +215,22 @@ void packet_buf_write(struct strbuf *buf, const char *fmt, ...)
 	va_end(args);
 }
 
+void packet_buf_write_len(struct strbuf *buf, const char *data, size_t len)
+{
+	size_t orig_len, n;
+
+	orig_len = buf->len;
+	strbuf_addstr(buf, "0000");
+	strbuf_add(buf, data, len);
+	n = buf->len - orig_len;
+
+	if (n > LARGE_PACKET_MAX)
+		die("protocol error: impossibly long line");
+
+	set_packet_header(&buf->buf[orig_len], n);
+	packet_trace(data, len, 1);
+}
+
 int write_packetized_from_fd(int fd_in, int fd_out)
 {
 	static char buf[LARGE_PACKET_DATA_MAX];
diff --git a/pkt-line.h b/pkt-line.h
index 3f836f01a..4f97ae3e5 100644
--- a/pkt-line.h
+++ b/pkt-line.h
@@ -26,6 +26,7 @@ void packet_buf_flush(struct strbuf *buf);
 void packet_buf_delim(struct strbuf *buf);
 void packet_write(int fd_out, const char *buf, size_t size);
 void packet_buf_write(struct strbuf *buf, const char *fmt, ...) __attribute__((format (printf, 2, 3)));
+void packet_buf_write_len(struct strbuf *buf, const char *data, size_t len);
 int packet_flush_gently(int fd);
 int packet_write_fmt_gently(int fd, const char *fmt, ...) __attribute__((format (printf, 2, 3)));
 int write_packetized_from_fd(int fd_in, int fd_out);
-- 
2.16.2.395.g2e18187dfd-goog


^ permalink raw reply related	[flat|nested] 362+ messages in thread

* [PATCH v4 29/35] remote-curl: create copy of the service name
  2018-02-28 23:22     ` [PATCH v4 " Brandon Williams
                         ` (27 preceding siblings ...)
  2018-02-28 23:22       ` [PATCH v4 28/35] pkt-line: add packet_buf_write_len function Brandon Williams
@ 2018-02-28 23:22       ` Brandon Williams
  2018-03-13 16:32         ` Jonathan Tan
  2018-02-28 23:22       ` [PATCH v4 30/35] remote-curl: store the protocol version the server responded with Brandon Williams
                         ` (6 subsequent siblings)
  35 siblings, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-02-28 23:22 UTC (permalink / raw)
  To: git
  Cc: git, gitster, jrnieder, pclouds, peff, sbeller, stolee, Brandon Williams

Make a copy of the service name being requested instead of relying on
the buffer pointed to by the passed in 'const char *' to remain
unchanged.

Currently, all service names are string constants, but a subsequent
patch will introduce service names from external sources.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 remote-curl.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/remote-curl.c b/remote-curl.c
index dae8a4a48..4086aa733 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -165,7 +165,7 @@ static int set_option(const char *name, const char *value)
 }
 
 struct discovery {
-	const char *service;
+	char *service;
 	char *buf_alloc;
 	char *buf;
 	size_t len;
@@ -257,6 +257,7 @@ static void free_discovery(struct discovery *d)
 		free(d->shallow.oid);
 		free(d->buf_alloc);
 		free_refs(d->refs);
+		free(d->service);
 		free(d);
 	}
 }
@@ -343,7 +344,7 @@ static struct discovery *discover_refs(const char *service, int for_push)
 		warning(_("redirecting to %s"), url.buf);
 
 	last= xcalloc(1, sizeof(*last_discovery));
-	last->service = service;
+	last->service = xstrdup(service);
 	last->buf_alloc = strbuf_detach(&buffer, &last->len);
 	last->buf = last->buf_alloc;
 
-- 
2.16.2.395.g2e18187dfd-goog


^ permalink raw reply related	[flat|nested] 362+ messages in thread

* [PATCH v4 30/35] remote-curl: store the protocol version the server responded with
  2018-02-28 23:22     ` [PATCH v4 " Brandon Williams
                         ` (28 preceding siblings ...)
  2018-02-28 23:22       ` [PATCH v4 29/35] remote-curl: create copy of the service name Brandon Williams
@ 2018-02-28 23:22       ` Brandon Williams
  2018-02-28 23:22       ` [PATCH v4 31/35] http: allow providing extra headers for http requests Brandon Williams
                         ` (5 subsequent siblings)
  35 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-02-28 23:22 UTC (permalink / raw)
  To: git
  Cc: git, gitster, jrnieder, pclouds, peff, sbeller, stolee, Brandon Williams

Store the protocol version the server responded with when performing
discovery.  This will be used in a future patch to either change the
'Git-Protocol' header sent in subsequent requests or to determine if a
client needs to fall back to using a different protocol version.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 remote-curl.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/remote-curl.c b/remote-curl.c
index 4086aa733..c54035843 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -171,6 +171,7 @@ struct discovery {
 	size_t len;
 	struct ref *refs;
 	struct oid_array shallow;
+	enum protocol_version version;
 	unsigned proto_git : 1;
 };
 static struct discovery *last_discovery;
@@ -184,7 +185,8 @@ static struct ref *parse_git_refs(struct discovery *heads, int for_push)
 			   PACKET_READ_CHOMP_NEWLINE |
 			   PACKET_READ_GENTLE_ON_EOF);
 
-	switch (discover_version(&reader)) {
+	heads->version = discover_version(&reader);
+	switch (heads->version) {
 	case protocol_v2:
 		die("support for protocol v2 not implemented yet");
 		break;
-- 
2.16.2.395.g2e18187dfd-goog


^ permalink raw reply related	[flat|nested] 362+ messages in thread

* [PATCH v4 31/35] http: allow providing extra headers for http requests
  2018-02-28 23:22     ` [PATCH v4 " Brandon Williams
                         ` (29 preceding siblings ...)
  2018-02-28 23:22       ` [PATCH v4 30/35] remote-curl: store the protocol version the server responded with Brandon Williams
@ 2018-02-28 23:22       ` Brandon Williams
  2018-03-13 16:33         ` Jonathan Tan
  2018-02-28 23:22       ` [PATCH v4 32/35] http: don't always add Git-Protocol header Brandon Williams
                         ` (4 subsequent siblings)
  35 siblings, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-02-28 23:22 UTC (permalink / raw)
  To: git
  Cc: git, gitster, jrnieder, pclouds, peff, sbeller, stolee, Brandon Williams

Add a way for callers to request that extra headers be included when
making http requests.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 http.c | 8 ++++++++
 http.h | 7 +++++++
 2 files changed, 15 insertions(+)

diff --git a/http.c b/http.c
index 597771271..e1757d62b 100644
--- a/http.c
+++ b/http.c
@@ -1723,6 +1723,14 @@ static int http_request(const char *url,
 
 	headers = curl_slist_append(headers, buf.buf);
 
+	/* Add additional headers here */
+	if (options && options->extra_headers) {
+		const struct string_list_item *item;
+		for_each_string_list_item(item, options->extra_headers) {
+			headers = curl_slist_append(headers, item->string);
+		}
+	}
+
 	curl_easy_setopt(slot->curl, CURLOPT_URL, url);
 	curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, headers);
 	curl_easy_setopt(slot->curl, CURLOPT_ENCODING, "gzip");
diff --git a/http.h b/http.h
index f7bd3b26b..4df4a25e1 100644
--- a/http.h
+++ b/http.h
@@ -172,6 +172,13 @@ struct http_get_options {
 	 * for details.
 	 */
 	struct strbuf *base_url;
+
+	/*
+	 * If not NULL, contains additional HTTP headers to be sent with the
+	 * request. The strings in the list must not be freed until after the
+	 * request has completed.
+	 */
+	struct string_list *extra_headers;
 };
 
 /* Return values for http_get_*() */
-- 
2.16.2.395.g2e18187dfd-goog


^ permalink raw reply related	[flat|nested] 362+ messages in thread

* [PATCH v4 32/35] http: don't always add Git-Protocol header
  2018-02-28 23:22     ` [PATCH v4 " Brandon Williams
                         ` (30 preceding siblings ...)
  2018-02-28 23:22       ` [PATCH v4 31/35] http: allow providing extra headers for http requests Brandon Williams
@ 2018-02-28 23:22       ` Brandon Williams
  2018-02-28 23:22       ` [PATCH v4 33/35] http: eliminate "# service" line when using protocol v2 Brandon Williams
                         ` (3 subsequent siblings)
  35 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-02-28 23:22 UTC (permalink / raw)
  To: git
  Cc: git, gitster, jrnieder, pclouds, peff, sbeller, stolee, Brandon Williams

Instead of always sending the Git-Protocol header with the configured
version on every http request, explicitly send it when discovering
refs and then only send it on subsequent http requests if the server
understood the version requested.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
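The header construction can be sketched on its own as below.  This is
an illustration under assumptions: `format_protocol_header()` is a
hypothetical name, it writes into a fixed caller buffer rather than a
'struct strbuf', and it spells out the "Git-Protocol" header name
instead of using the GIT_PROTOCOL_HEADER macro.

```c
#include <stdio.h>

/*
 * Write "Git-Protocol: version=N" into 'out' when version > 0.
 * Returns 1 if a header was produced, 0 if no header should be sent
 * (version 0) or if the header did not fit in 'out'.
 */
static int format_protocol_header(int version, char *out, size_t out_size)
{
	int n;

	if (version <= 0)
		return 0;
	n = snprintf(out, out_size, "Git-Protocol: version=%d", version);
	return n > 0 && (size_t)n < out_size;
}
```

The point of the patch is which version gets passed in: the configured
one during ref discovery, and the version the server actually answered
with for later requests, so a server that replied with v0 never sees
the header again.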
 http.c        | 17 -----------------
 remote-curl.c | 33 +++++++++++++++++++++++++++++++++
 2 files changed, 33 insertions(+), 17 deletions(-)

diff --git a/http.c b/http.c
index e1757d62b..8f1129ac7 100644
--- a/http.c
+++ b/http.c
@@ -904,21 +904,6 @@ static void set_from_env(const char **var, const char *envname)
 		*var = val;
 }
 
-static void protocol_http_header(void)
-{
-	if (get_protocol_version_config() > 0) {
-		struct strbuf protocol_header = STRBUF_INIT;
-
-		strbuf_addf(&protocol_header, GIT_PROTOCOL_HEADER ": version=%d",
-			    get_protocol_version_config());
-
-
-		extra_http_headers = curl_slist_append(extra_http_headers,
-						       protocol_header.buf);
-		strbuf_release(&protocol_header);
-	}
-}
-
 void http_init(struct remote *remote, const char *url, int proactive_auth)
 {
 	char *low_speed_limit;
@@ -949,8 +934,6 @@ void http_init(struct remote *remote, const char *url, int proactive_auth)
 	if (remote)
 		var_override(&http_proxy_authmethod, remote->http_proxy_authmethod);
 
-	protocol_http_header();
-
 	pragma_header = curl_slist_append(http_copy_default_headers(),
 		"Pragma: no-cache");
 	no_pragma_header = curl_slist_append(http_copy_default_headers(),
diff --git a/remote-curl.c b/remote-curl.c
index c54035843..b4e9db85b 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -291,6 +291,19 @@ static int show_http_message(struct strbuf *type, struct strbuf *charset,
 	return 0;
 }
 
+static int get_protocol_http_header(enum protocol_version version,
+				    struct strbuf *header)
+{
+	if (version > 0) {
+		strbuf_addf(header, GIT_PROTOCOL_HEADER ": version=%d",
+			    version);
+
+		return 1;
+	}
+
+	return 0;
+}
+
 static struct discovery *discover_refs(const char *service, int for_push)
 {
 	struct strbuf exp = STRBUF_INIT;
@@ -299,6 +312,8 @@ static struct discovery *discover_refs(const char *service, int for_push)
 	struct strbuf buffer = STRBUF_INIT;
 	struct strbuf refs_url = STRBUF_INIT;
 	struct strbuf effective_url = STRBUF_INIT;
+	struct strbuf protocol_header = STRBUF_INIT;
+	struct string_list extra_headers = STRING_LIST_INIT_DUP;
 	struct discovery *last = last_discovery;
 	int http_ret, maybe_smart = 0;
 	struct http_get_options http_options;
@@ -318,11 +333,16 @@ static struct discovery *discover_refs(const char *service, int for_push)
 		strbuf_addf(&refs_url, "service=%s", service);
 	}
 
+	/* Add the extra Git-Protocol header */
+	if (get_protocol_http_header(get_protocol_version_config(), &protocol_header))
+		string_list_append(&extra_headers, protocol_header.buf);
+
 	memset(&http_options, 0, sizeof(http_options));
 	http_options.content_type = &type;
 	http_options.charset = &charset;
 	http_options.effective_url = &effective_url;
 	http_options.base_url = &url;
+	http_options.extra_headers = &extra_headers;
 	http_options.initial_request = 1;
 	http_options.no_cache = 1;
 	http_options.keep_error = 1;
@@ -389,6 +409,8 @@ static struct discovery *discover_refs(const char *service, int for_push)
 	strbuf_release(&charset);
 	strbuf_release(&effective_url);
 	strbuf_release(&buffer);
+	strbuf_release(&protocol_header);
+	string_list_clear(&extra_headers, 0);
 	last_discovery = last;
 	return last;
 }
@@ -425,6 +447,7 @@ struct rpc_state {
 	char *service_url;
 	char *hdr_content_type;
 	char *hdr_accept;
+	char *protocol_header;
 	char *buf;
 	size_t alloc;
 	size_t len;
@@ -611,6 +634,10 @@ static int post_rpc(struct rpc_state *rpc)
 	headers = curl_slist_append(headers, needs_100_continue ?
 		"Expect: 100-continue" : "Expect:");
 
+	/* Add the extra Git-Protocol header */
+	if (rpc->protocol_header)
+		headers = curl_slist_append(headers, rpc->protocol_header);
+
 retry:
 	slot = get_active_slot();
 
@@ -751,6 +778,11 @@ static int rpc_service(struct rpc_state *rpc, struct discovery *heads)
 	strbuf_addf(&buf, "Accept: application/x-%s-result", svc);
 	rpc->hdr_accept = strbuf_detach(&buf, NULL);
 
+	if (get_protocol_http_header(heads->version, &buf))
+		rpc->protocol_header = strbuf_detach(&buf, NULL);
+	else
+		rpc->protocol_header = NULL;
+
 	while (!err) {
 		int n = packet_read(rpc->out, NULL, NULL, rpc->buf, rpc->alloc, 0);
 		if (!n)
@@ -778,6 +810,7 @@ static int rpc_service(struct rpc_state *rpc, struct discovery *heads)
 	free(rpc->service_url);
 	free(rpc->hdr_content_type);
 	free(rpc->hdr_accept);
+	free(rpc->protocol_header);
 	free(rpc->buf);
 	strbuf_release(&buf);
 	return err;
-- 
2.16.2.395.g2e18187dfd-goog


^ permalink raw reply related	[flat|nested] 362+ messages in thread

* [PATCH v4 33/35] http: eliminate "# service" line when using protocol v2
  2018-02-28 23:22     ` [PATCH v4 " Brandon Williams
                         ` (31 preceding siblings ...)
  2018-02-28 23:22       ` [PATCH v4 32/35] http: don't always add Git-Protocol header Brandon Williams
@ 2018-02-28 23:22       ` Brandon Williams
  2018-02-28 23:22       ` [PATCH v4 34/35] remote-curl: implement stateless-connect command Brandon Williams
                         ` (2 subsequent siblings)
  35 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-02-28 23:22 UTC (permalink / raw)
  To: git
  Cc: git, gitster, jrnieder, pclouds, peff, sbeller, stolee, Brandon Williams

When an http info/refs request is made, requesting that protocol v2 be
used, don't send a "# service" line since this line is not part of the
v2 spec.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 http-backend.c | 8 ++++++--
 remote-curl.c  | 3 +++
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/http-backend.c b/http-backend.c
index f3dc218b2..5d241e910 100644
--- a/http-backend.c
+++ b/http-backend.c
@@ -10,6 +10,7 @@
 #include "url.h"
 #include "argv-array.h"
 #include "packfile.h"
+#include "protocol.h"
 
 static const char content_type[] = "Content-Type";
 static const char content_length[] = "Content-Length";
@@ -466,8 +467,11 @@ static void get_info_refs(struct strbuf *hdr, char *arg)
 		hdr_str(hdr, content_type, buf.buf);
 		end_headers(hdr);
 
-		packet_write_fmt(1, "# service=git-%s\n", svc->name);
-		packet_flush(1);
+
+		if (determine_protocol_version_server() != protocol_v2) {
+			packet_write_fmt(1, "# service=git-%s\n", svc->name);
+			packet_flush(1);
+		}
 
 		argv[0] = svc->name;
 		run_service(argv, 0);
diff --git a/remote-curl.c b/remote-curl.c
index b4e9db85b..66a53f74b 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -396,6 +396,9 @@ static struct discovery *discover_refs(const char *service, int for_push)
 			;
 
 		last->proto_git = 1;
+	} else if (maybe_smart &&
+		   last->len > 5 && starts_with(last->buf + 4, "version 2")) {
+		last->proto_git = 1;
 	}
 
 	if (last->proto_git)
-- 
2.16.2.395.g2e18187dfd-goog


^ permalink raw reply related	[flat|nested] 362+ messages in thread

* [PATCH v4 34/35] remote-curl: implement stateless-connect command
  2018-02-28 23:22     ` [PATCH v4 " Brandon Williams
                         ` (32 preceding siblings ...)
  2018-02-28 23:22       ` [PATCH v4 33/35] http: eliminate "# service" line when using protocol v2 Brandon Williams
@ 2018-02-28 23:22       ` Brandon Williams
  2018-03-02 20:07         ` Johannes Schindelin
  2018-02-28 23:22       ` [PATCH v4 35/35] remote-curl: don't request v2 when pushing Brandon Williams
  2018-03-01 18:41       ` [PATCH v4 00/35] protocol version 2 Junio C Hamano
  35 siblings, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-02-28 23:22 UTC (permalink / raw)
  To: git
  Cc: git, gitster, jrnieder, pclouds, peff, sbeller, stolee, Brandon Williams

Teach remote-curl the 'stateless-connect' command which is used to
establish a stateless connection with servers which support protocol
version 2.  This allows remote-curl to act as a proxy, allowing the git
client to communicate natively with a remote end, simply using
remote-curl as a pass through to convert requests to http.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 remote-curl.c          | 205 ++++++++++++++++++++++++++++++++++++++++-
 t/t5702-protocol-v2.sh |  45 +++++++++
 2 files changed, 249 insertions(+), 1 deletion(-)

diff --git a/remote-curl.c b/remote-curl.c
index 66a53f74b..3f882d766 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -188,7 +188,12 @@ static struct ref *parse_git_refs(struct discovery *heads, int for_push)
 	heads->version = discover_version(&reader);
 	switch (heads->version) {
 	case protocol_v2:
-		die("support for protocol v2 not implemented yet");
+		/*
+		 * Do nothing.  This isn't a list of refs but rather a
+		 * capability advertisement.  Client would have run
+		 * 'stateless-connect' so we'll dump this capability listing
+		 * and let them request the refs themselves.
+		 */
 		break;
 	case protocol_v1:
 	case protocol_v0:
@@ -1085,6 +1090,200 @@ static void parse_push(struct strbuf *buf)
 	free(specs);
 }
 
+/*
+ * Used to represent the state of a connection to an HTTP server when
+ * communicating using git's wire-protocol version 2.
+ */
+struct proxy_state {
+	char *service_name;
+	char *service_url;
+	struct curl_slist *headers;
+	struct strbuf request_buffer;
+	int in;
+	int out;
+	struct packet_reader reader;
+	size_t pos;
+	int seen_flush;
+};
+
+static void proxy_state_init(struct proxy_state *p, const char *service_name,
+			     enum protocol_version version)
+{
+	struct strbuf buf = STRBUF_INIT;
+
+	memset(p, 0, sizeof(*p));
+	p->service_name = xstrdup(service_name);
+
+	p->in = 0;
+	p->out = 1;
+	strbuf_init(&p->request_buffer, 0);
+
+	strbuf_addf(&buf, "%s%s", url.buf, p->service_name);
+	p->service_url = strbuf_detach(&buf, NULL);
+
+	p->headers = http_copy_default_headers();
+
+	strbuf_addf(&buf, "Content-Type: application/x-%s-request", p->service_name);
+	p->headers = curl_slist_append(p->headers, buf.buf);
+	strbuf_reset(&buf);
+
+	strbuf_addf(&buf, "Accept: application/x-%s-result", p->service_name);
+	p->headers = curl_slist_append(p->headers, buf.buf);
+	strbuf_reset(&buf);
+
+	p->headers = curl_slist_append(p->headers, "Transfer-Encoding: chunked");
+
+	/* Add the Git-Protocol header */
+	if (get_protocol_http_header(version, &buf))
+		p->headers = curl_slist_append(p->headers, buf.buf);
+
+	packet_reader_init(&p->reader, p->in, NULL, 0,
+			   PACKET_READ_GENTLE_ON_EOF);
+
+	strbuf_release(&buf);
+}
+
+static void proxy_state_clear(struct proxy_state *p)
+{
+	free(p->service_name);
+	free(p->service_url);
+	curl_slist_free_all(p->headers);
+	strbuf_release(&p->request_buffer);
+}
+
+/*
+ * CURLOPT_READFUNCTION callback function.
+ * Attempts to copy over a single packet-line at a time into the
+ * curl provided buffer.
+ */
+static size_t proxy_in(char *buffer, size_t eltsize,
+		       size_t nmemb, void *userdata)
+{
+	size_t max;
+	struct proxy_state *p = userdata;
+	size_t avail = p->request_buffer.len - p->pos;
+
+
+	if (eltsize != 1)
+		BUG("curl read callback called with size = %zu != 1", eltsize);
+	max = nmemb;
+
+	if (!avail) {
+		if (p->seen_flush) {
+			p->seen_flush = 0;
+			return 0;
+		}
+
+		strbuf_reset(&p->request_buffer);
+		switch (packet_reader_read(&p->reader)) {
+		case PACKET_READ_EOF:
+			die("unexpected EOF when reading from parent process");
+		case PACKET_READ_NORMAL:
+			packet_buf_write_len(&p->request_buffer, p->reader.line,
+					     p->reader.pktlen);
+			break;
+		case PACKET_READ_DELIM:
+			packet_buf_delim(&p->request_buffer);
+			break;
+		case PACKET_READ_FLUSH:
+			packet_buf_flush(&p->request_buffer);
+			p->seen_flush = 1;
+			break;
+		}
+		p->pos = 0;
+		avail = p->request_buffer.len;
+	}
+
+	if (max < avail)
+		avail = max;
+	memcpy(buffer, p->request_buffer.buf + p->pos, avail);
+	p->pos += avail;
+	return avail;
+}
+
+static size_t proxy_out(char *buffer, size_t eltsize,
+			size_t nmemb, void *userdata)
+{
+	size_t size;
+	struct proxy_state *p = userdata;
+
+	if (eltsize != 1)
+		BUG("curl write callback called with size = %zu != 1", eltsize);
+	size = nmemb;
+
+	write_or_die(p->out, buffer, size);
+	return size;
+}
+
+/* Issues a request to the HTTP server configured in `p` */
+static int proxy_request(struct proxy_state *p)
+{
+	struct active_request_slot *slot;
+
+	slot = get_active_slot();
+
+	curl_easy_setopt(slot->curl, CURLOPT_NOBODY, 0);
+	curl_easy_setopt(slot->curl, CURLOPT_POST, 1);
+	curl_easy_setopt(slot->curl, CURLOPT_URL, p->service_url);
+	curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, p->headers);
+
+	/* Setup function to read request from client */
+	curl_easy_setopt(slot->curl, CURLOPT_READFUNCTION, proxy_in);
+	curl_easy_setopt(slot->curl, CURLOPT_READDATA, p);
+
+	/* Setup function to write server response to client */
+	curl_easy_setopt(slot->curl, CURLOPT_WRITEFUNCTION, proxy_out);
+	curl_easy_setopt(slot->curl, CURLOPT_WRITEDATA, p);
+
+	if (run_slot(slot, NULL) != HTTP_OK)
+		return -1;
+
+	return 0;
+}
+
+static int stateless_connect(const char *service_name)
+{
+	struct discovery *discover;
+	struct proxy_state p;
+
+	/*
+	 * Run the info/refs request and see if the server supports protocol
+	 * v2.  If and only if the server supports v2 can we successfully
+	 * establish a stateless connection, otherwise we need to tell the
+	 * client to fallback to using other transport helper functions to
+	 * complete their request.
+	 */
+	discover = discover_refs(service_name, 0);
+	if (discover->version != protocol_v2) {
+		printf("fallback\n");
+		fflush(stdout);
+		return -1;
+	} else {
+		/* Stateless Connection established */
+		printf("\n");
+		fflush(stdout);
+	}
+
+	proxy_state_init(&p, service_name, discover->version);
+
+	/*
+	 * Dump the capability listing that we got from the server earlier
+	 * during the info/refs request.
+	 */
+	write_or_die(p.out, discover->buf, discover->len);
+
+	/* Peek the next packet line.  Until we see EOF keep sending POSTs */
+	while (packet_reader_peek(&p.reader) != PACKET_READ_EOF) {
+		if (proxy_request(&p)) {
+			/* We would have an err here */
+			break;
+		}
+	}
+
+	proxy_state_clear(&p);
+	return 0;
+}
+
 int cmd_main(int argc, const char **argv)
 {
 	struct strbuf buf = STRBUF_INIT;
@@ -1153,12 +1352,16 @@ int cmd_main(int argc, const char **argv)
 			fflush(stdout);
 
 		} else if (!strcmp(buf.buf, "capabilities")) {
+			printf("stateless-connect\n");
 			printf("fetch\n");
 			printf("option\n");
 			printf("push\n");
 			printf("check-connectivity\n");
 			printf("\n");
 			fflush(stdout);
+		} else if (skip_prefix(buf.buf, "stateless-connect ", &arg)) {
+			if (!stateless_connect(arg))
+				break;
 		} else {
 			error("remote-curl: unknown command '%s' from git", buf.buf);
 			return 1;
diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh
index e3a7c09d4..124063c2c 100755
--- a/t/t5702-protocol-v2.sh
+++ b/t/t5702-protocol-v2.sh
@@ -201,4 +201,49 @@ test_expect_success 'ref advertisment is filtered during fetch using protocol v2
 	! grep "refs/tags/three" log
 '
 
+# Test protocol v2 with 'http://' transport
+#
+. "$TEST_DIRECTORY"/lib-httpd.sh
+start_httpd
+
+test_expect_success 'create repo to be served by http:// transport' '
+	git init "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" config http.receivepack true &&
+	test_commit -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" one
+'
+
+test_expect_success 'clone with http:// using protocol v2' '
+	test_when_finished "rm -f log" &&
+
+	GIT_TRACE_PACKET="$(pwd)/log" GIT_TRACE_CURL="$(pwd)/log" git -c protocol.version=2 \
+		clone "$HTTPD_URL/smart/http_parent" http_child &&
+
+	git -C http_child log -1 --format=%s >actual &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+
+	# Client requested to use protocol v2
+	grep "Git-Protocol: version=2" log &&
+	# Server responded using protocol v2
+	grep "git< version 2" log
+'
+
+test_expect_success 'fetch with http:// using protocol v2' '
+	test_when_finished "rm -f log" &&
+
+	test_commit -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" two &&
+
+	GIT_TRACE_PACKET="$(pwd)/log" git -C http_child -c protocol.version=2 \
+		fetch &&
+
+	git -C http_child log -1 --format=%s origin/master >actual &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+
+	# Server responded using protocol v2
+	grep "git< version 2" log
+'
+
+stop_httpd
+
 test_done
-- 
2.16.2.395.g2e18187dfd-goog


^ permalink raw reply related	[flat|nested] 362+ messages in thread

* [PATCH v4 35/35] remote-curl: don't request v2 when pushing
  2018-02-28 23:22     ` [PATCH v4 " Brandon Williams
                         ` (33 preceding siblings ...)
  2018-02-28 23:22       ` [PATCH v4 34/35] remote-curl: implement stateless-connect command Brandon Williams
@ 2018-02-28 23:22       ` Brandon Williams
  2018-03-13 16:35         ` Jonathan Tan
  2018-03-01 18:41       ` [PATCH v4 00/35] protocol version 2 Junio C Hamano
  35 siblings, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-02-28 23:22 UTC (permalink / raw)
  To: git
  Cc: git, gitster, jrnieder, pclouds, peff, sbeller, stolee, Brandon Williams

In order to be able to ship protocol v2 with only supporting fetch, we
need clients to not issue a request to use protocol v2 when pushing
(since the client currently doesn't know how to push using protocol v2).
This allows a client to have protocol v2 configured in
`protocol.version` and take advantage of using v2 for fetch and falling
back to using v0 when pushing while v2 for push is being designed.

We could run into issues if we didn't fall back to protocol v0 when
pushing right now.  This is because currently a server will ignore a request to
use v2 when contacting the 'receive-pack' endpoint and fall back to
using v0, but when push v2 is rolled out to servers, the 'receive-pack'
endpoint will start responding using v2.  So we don't want to get into a
state where a client is requesting to push with v2 before they actually
know how to push using v2.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 remote-curl.c          | 11 ++++++++++-
 t/t5702-protocol-v2.sh | 24 ++++++++++++++++++++++++
 2 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/remote-curl.c b/remote-curl.c
index 3f882d766..379ab9b21 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -322,6 +322,7 @@ static struct discovery *discover_refs(const char *service, int for_push)
 	struct discovery *last = last_discovery;
 	int http_ret, maybe_smart = 0;
 	struct http_get_options http_options;
+	enum protocol_version version = get_protocol_version_config();
 
 	if (last && !strcmp(service, last->service))
 		return last;
@@ -338,8 +339,16 @@ static struct discovery *discover_refs(const char *service, int for_push)
 		strbuf_addf(&refs_url, "service=%s", service);
 	}
 
+	/*
+	 * NEEDSWORK: If we are trying to use protocol v2 and we are planning
+	 * to perform a push, then fallback to v0 since the client doesn't know
+	 * how to push yet using v2.
+	 */
+	if (version == protocol_v2 && !strcmp("git-receive-pack", service))
+		version = protocol_v0;
+
 	/* Add the extra Git-Protocol header */
-	if (get_protocol_http_header(get_protocol_version_config(), &protocol_header))
+	if (get_protocol_http_header(version, &protocol_header))
 		string_list_append(&extra_headers, protocol_header.buf);
 
 	memset(&http_options, 0, sizeof(http_options));
diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh
index 124063c2c..56f7c3c32 100755
--- a/t/t5702-protocol-v2.sh
+++ b/t/t5702-protocol-v2.sh
@@ -244,6 +244,30 @@ test_expect_success 'fetch with http:// using protocol v2' '
 	grep "git< version 2" log
 '
 
+test_expect_success 'push with http:// and a config of v2 does not request v2' '
+	test_when_finished "rm -f log" &&
+	# Till v2 for push is designed, make sure that if a client has
+	# protocol.version configured to use v2, that the client instead falls
+	# back and uses v0.
+
+	test_commit -C http_child three &&
+
+	# Push to another branch, as the target repository has the
+	# master branch checked out and we cannot push into it.
+	GIT_TRACE_PACKET="$(pwd)/log" git -C http_child -c protocol.version=2 \
+		push origin HEAD:client_branch &&
+
+	git -C http_child log -1 --format=%s >actual &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" log -1 --format=%s client_branch >expect &&
+	test_cmp expect actual &&
+
+	# Client did not request to use protocol v2
+	! grep "Git-Protocol: version=2" log &&
+	# Server did not respond using protocol v2
+	! grep "git< version 2" log
+'
+
+
 stop_httpd
 
 test_done
-- 
2.16.2.395.g2e18187dfd-goog


^ permalink raw reply related	[flat|nested] 362+ messages in thread

* Re: [PATCH v4 00/35] protocol version 2
  2018-02-28 23:22     ` [PATCH v4 " Brandon Williams
                         ` (34 preceding siblings ...)
  2018-02-28 23:22       ` [PATCH v4 35/35] remote-curl: don't request v2 when pushing Brandon Williams
@ 2018-03-01 18:41       ` Junio C Hamano
  2018-03-01 19:16         ` Brandon Williams
  35 siblings, 1 reply; 362+ messages in thread
From: Junio C Hamano @ 2018-03-01 18:41 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, git, jrnieder, pclouds, peff, sbeller, stolee

Brandon Williams <bmwill@google.com> writes:

> Lots of changes since v3 (well more than between v2 and v3).  Thanks for
> all of the reviews on the last round, the series is getting more
> polished.
>
>  * Eliminated the "# service" line from the response from an HTTP
>    server.  This means that the response to a v2 request is exactly the
>    same regardless of which transport you use!  Docs for this have been
>    added as well.
>  * Changed how ref-patterns work with the `ls-refs` command.  Instead of
>    using wildmatch all patterns must either match exactly or they can
>    contain a single '*' character at the end to mean that the prefix
>    must match.  Docs for this have also been added.
>  * Lots of updates to the docs.  Including documenting the
>    `stateless-connect` remote-helper command used by remote-curl to
>    handle the http transport.
>  * Fixed a number of bugs with the `fetch` command, one of which didn't
>    use objects from configured alternates.

I noticed that this round is built on top of v2.16.0-rc0.  It
certainly makes it easier to compare against the previous round
which was built on top of that old commit and it is very much
appreciated that a reroll does not involve pointless rebases.

For those who are helping from sidelines, it may be helpful to
mention where in the history this was developed on, though, as
applying these on the current 'master' has a handful of small
conflicts.

Thanks, will replace and will comment on individual patches as
needed.
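
For reference, the ref-pattern rule described in the quoted cover
letter (a pattern either matches exactly, or ends in a single '*'
meaning that the prefix must match) could be sketched roughly as
below.  This is only an illustration under that reading of the
description; `sketch_ref_matches()` is a made-up name and is not the
code in the series.

```c
#include <string.h>

/*
 * Hypothetical helper: returns 1 if `refname` satisfies `pattern`.
 * A trailing '*' turns the pattern into a prefix match; any other
 * pattern must match the ref name exactly.
 */
static int sketch_ref_matches(const char *pattern, const char *refname)
{
	size_t len = strlen(pattern);

	if (len && pattern[len - 1] == '*')
		return !strncmp(pattern, refname, len - 1);
	return !strcmp(pattern, refname);
}
```

With this reading, "refs/heads/*" would accept "refs/heads/master"
but not "refs/tags/v1", and "refs/heads/master" would accept only
that exact name.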

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v4 00/35] protocol version 2
  2018-03-01 18:41       ` [PATCH v4 00/35] protocol version 2 Junio C Hamano
@ 2018-03-01 19:16         ` Brandon Williams
  2018-03-01 20:59           ` Junio C Hamano
  0 siblings, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-03-01 19:16 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, git, jrnieder, pclouds, peff, sbeller, stolee

On 03/01, Junio C Hamano wrote:
> Brandon Williams <bmwill@google.com> writes:
> 
> > Lots of changes since v3 (well more than between v2 and v3).  Thanks for
> > all of the reviews on the last round, the series is getting more
> > polished.
> >
> >  * Eliminated the "# service" line from the response from an HTTP
> >    server.  This means that the response to a v2 request is exactly the
> >    same regardless of which transport you use!  Docs for this have been
> >    added as well.
> >  * Changed how ref-patterns work with the `ls-refs` command.  Instead of
> >    using wildmatch all patterns must either match exactly or they can
> >    contain a single '*' character at the end to mean that the prefix
> >    must match.  Docs for this have also been added.
> >  * Lots of updates to the docs.  Including documenting the
> >    `stateless-connect` remote-helper command used by remote-curl to
> >    handle the http transport.
> >  * Fixed a number of bugs with the `fetch` command, one of which didn't
> >    use objects from configured alternates.
> 
> I noticed that this round is built on top of v2.16.0-rc0.  It
> certainly makes it easier to compare against the previous round
> which was built on top of that old commit and it is very much
> appreciated that a reroll does not involve pointless rebases.
> 
> For those who are helping from sidelines, it may be ehlpful to
> mention where in the history this was developed on, though, as
> applying these on the current 'master' has a handful of small
> conflicts.
> 
> Thanks, will replace and will comment on individual patches as
> needed.

I've tried to keep building on the same base that I started with when
sending out a new version of series, mostly because I thought it was
easier to see what was different between rounds.

I can, in the future, try to remember to put the commit it's based on.
Do we have any sort of guidance about the best practice here?

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v4 02/35] pkt-line: allow peeking a packet line without consuming it
  2018-02-28 23:22       ` [PATCH v4 02/35] pkt-line: allow peeking a packet line without consuming it Brandon Williams
@ 2018-03-01 20:48         ` Junio C Hamano
  2018-03-12 21:56           ` Brandon Williams
  0 siblings, 1 reply; 362+ messages in thread
From: Junio C Hamano @ 2018-03-01 20:48 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, git, jrnieder, pclouds, peff, sbeller, stolee

Brandon Williams <bmwill@google.com> writes:

> +enum packet_read_status packet_reader_read(struct packet_reader *reader)
> +{
> +	if (reader->line_peeked) {
> +		reader->line_peeked = 0;
> +		return reader->status;
> +	}
> +
> +	reader->status = packet_read_with_status(reader->fd,
> +						 &reader->src_buffer,
> +						 &reader->src_len,
> +						 reader->buffer,
> +						 reader->buffer_size,
> +						 &reader->pktlen,
> +						 reader->options);
> +
> +	switch (reader->status) {
> +	case PACKET_READ_EOF:
> +		reader->pktlen = -1;
> +		reader->line = NULL;
> +		break;
> +	case PACKET_READ_NORMAL:
> +		reader->line = reader->buffer;
> +		break;
> +	case PACKET_READ_FLUSH:
> +		reader->pktlen = 0;
> +		reader->line = NULL;
> +		break;
> +	}
> +
> +	return reader->status;
> +}

With the way _peek() interface interacts with the reader instance
(which by the way I find is well designed), it is understandable
that we want almost everything available in reader's fields, but
having to manually clear pktlen field upon non NORMAL status feels
a bit strange.  

Perhaps that is because the underlying packet_read_with_status()
does not set *pktlen in these cases?  Shouldn't it be doing that so
the caller does not have to?
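
A minimal sketch of that idea, assuming a simplified reader that
parses pkt-lines out of an in-memory buffer: the low-level read sets
`*pktlen` and `*line` for every status, so a wrapper would not need
its own fix-ups.  All `sketch_*` names are hypothetical and this is
not the pkt-line.c code; the values 0 (flush/delim) and -1 (EOF)
simply mirror what the quoted switch assigns.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the packet_read_status values. */
enum sketch_status {
	SKETCH_EOF,
	SKETCH_NORMAL,
	SKETCH_FLUSH,
	SKETCH_DELIM
};

/* Decode the 4-hex-digit pkt-line length prefix; -1 if malformed. */
static int sketch_hex4(const char *s)
{
	int value = 0;
	int i;

	for (i = 0; i < 4; i++) {
		char c = s[i];
		int digit;

		if (c >= '0' && c <= '9')
			digit = c - '0';
		else if (c >= 'a' && c <= 'f')
			digit = c - 'a' + 10;
		else if (c >= 'A' && c <= 'F')
			digit = c - 'A' + 10;
		else
			return -1;
		value = (value << 4) | digit;
	}
	return value;
}

/* Read one pkt-line; *line and *pktlen are valid for every status. */
static enum sketch_status sketch_read(const char **src, size_t *src_len,
				      const char **line, int *pktlen)
{
	int len;

	*line = NULL;
	*pktlen = -1;
	if (*src_len < 4)
		return SKETCH_EOF;

	len = sketch_hex4(*src);
	if (len == 0 || len == 1) {
		/* "0000" is a flush packet, "0001" a delim packet */
		*src += 4;
		*src_len -= 4;
		*pktlen = 0;
		return len ? SKETCH_DELIM : SKETCH_FLUSH;
	}
	if (len < 4 || (size_t)len > *src_len)
		return SKETCH_EOF; /* sketch only: bad input treated as EOF */

	*line = *src + 4;
	*pktlen = len - 4;
	*src += len;
	*src_len -= len;
	return SKETCH_NORMAL;
}
```

A caller built on top of this could copy the status and the two
output values into its own fields without a per-status switch.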

A similar comment applies for reader's line field.  In principle,
as the status field is part of a reader, it does not have to exist
as a separate field, i.e.

	#define line_of(reader) \
		((reader).status == PACKET_READ_NORMAL ? \
		(reader).buffer : NULL)

can be used as a substitute for it.  I guess it depends on how the
actual callers want to use this interface.
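
To make the macro concrete, here is a small self-contained sketch.
The `demo_reader` struct and `DEMO_READ_*` names are stand-ins
invented for this example (the real struct has more fields); only the
shape of `line_of()` follows the macro above.

```c
#include <stddef.h>

/* Hypothetical, cut-down versions of the status enum and reader. */
enum demo_status {
	DEMO_READ_EOF,
	DEMO_READ_NORMAL,
	DEMO_READ_FLUSH
};

struct demo_reader {
	enum demo_status status;
	char *buffer;
};

/* The line is derived from the status instead of being stored. */
#define line_of(reader) \
	((reader).status == DEMO_READ_NORMAL ? \
	(reader).buffer : NULL)
```

The trade-off is that every access pays for a comparison, while the
reader loses one field that would otherwise have to be kept in sync
with the status.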

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v4 03/35] pkt-line: add delim packet support
  2018-02-28 23:22       ` [PATCH v4 03/35] pkt-line: add delim packet support Brandon Williams
@ 2018-03-01 20:50         ` Junio C Hamano
  2018-03-01 21:04           ` Junio C Hamano
  0 siblings, 1 reply; 362+ messages in thread
From: Junio C Hamano @ 2018-03-01 20:50 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, git, jrnieder, pclouds, peff, sbeller, stolee

Brandon Williams <bmwill@google.com> writes:

> One of the design goals of protocol-v2 is to improve the semantics of
> flush packets.  Currently in protocol-v1, flush packets are used both to
> indicate a break in a list of packet lines as well as an indication that
> one side has finished speaking.  This makes it particularly difficult
> to implement proxies as a proxy would need to completely understand git
> protocol instead of simply looking for a flush packet.

Good ;-) Yes, this has been one of the largest gripes about the
smart-http support code we have.

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v4 00/35] protocol version 2
  2018-03-01 19:16         ` Brandon Williams
@ 2018-03-01 20:59           ` Junio C Hamano
  0 siblings, 0 replies; 362+ messages in thread
From: Junio C Hamano @ 2018-03-01 20:59 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, git, jrnieder, pclouds, peff, sbeller, stolee

Brandon Williams <bmwill@google.com> writes:

> I've tried to keep building on the same base that I started with when
> sending out a new version of series, mostly because I thought it was
> easier to see what was different between rounds.

Yes.  It indeed is easier to see the evolution if the series does
not get rebased needlessly.

> I can, in the future, try to remember to put the commit its based on.
> Do we have any sort of guidance about the best practice here?

I recall we taught a new "--base" option to "format-patch" not too
long ago, so one way to do so may be:

    $ git format-patch --cover-letter --base=v2.16.0-rc0 master..bw/protocol-v2
    $ tail -4 0000-cover*.txt
    base-commit: 1eaabe34fc6f486367a176207420378f587d3b48
    --
    2.16.2-345-g7e31236f65

perhaps?

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v4 03/35] pkt-line: add delim packet support
  2018-03-01 20:50         ` Junio C Hamano
@ 2018-03-01 21:04           ` Junio C Hamano
  2018-03-01 22:49             ` Brandon Williams
  0 siblings, 1 reply; 362+ messages in thread
From: Junio C Hamano @ 2018-03-01 21:04 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, git, jrnieder, pclouds, peff, sbeller, stolee

Junio C Hamano <gitster@pobox.com> writes:

> Brandon Williams <bmwill@google.com> writes:
>
>> One of the design goals of protocol-v2 is to improve the semantics of
>> flush packets.  Currently in protocol-v1, flush packets are used both to
>> indicate a break in a list of packet lines as well as an indication that
>> one side has finished speaking.  This makes it particularly difficult
>> to implement proxies as a proxy would need to completely understand git
>> protocol instead of simply looking for a flush packet.
>
> Good ;-) Yes, this has been one of the largest gripe about the
> smart-http support code we have.

Hmph, strictly speaking, the "delim" does not have to be a part of
how packetized stream is defined.  As long as we stop abusing flush
as "This is merely an end of one segment of what I say." and make it
always mean "I am done speaking, it is your turn.", the application
payload can define its own syntax to separate groups of packets.

I do not mind having this "delim" thing defined at the protocol
level too much, though.
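
As an illustration of the distinction, the sketch below frames a
request in which a delim packet ("0001") separates two groups of
lines and a flush packet ("0000") appears only once, at the very end,
to say "I am done speaking".  The `sketch_*` helpers and the fixed
256-byte buffer are assumptions made for this example, and the
payload strings are merely illustrative.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical fixed-size output buffer for the sketch. */
struct sketch_buf {
	char data[256];
	size_t len;
};

/* Append raw bytes, keeping data NUL-terminated; -1 if out of room. */
static int sketch_add(struct sketch_buf *b, const char *bytes, size_t n)
{
	if (n >= sizeof(b->data) - b->len)
		return -1;
	memcpy(b->data + b->len, bytes, n);
	b->len += n;
	b->data[b->len] = '\0';
	return 0;
}

/* Data packet: 4 hex digits giving the total length, then the payload. */
static int sketch_line(struct sketch_buf *b, const char *payload)
{
	char prefix[5];
	size_t n = strlen(payload);

	if (n + 4 >= sizeof(b->data) - b->len)
		return -1;
	snprintf(prefix, sizeof(prefix), "%04x", (unsigned int)(n + 4));
	sketch_add(b, prefix, 4);
	return sketch_add(b, payload, n);
}

/* Delim: end of one section, the same speaker continues. */
static int sketch_delim(struct sketch_buf *b)
{
	return sketch_add(b, "0001", 4);
}

/* Flush: end of the whole message, it is the other side's turn. */
static int sketch_flush(struct sketch_buf *b)
{
	return sketch_add(b, "0000", 4);
}
```

A proxy that only forwards bytes could then treat "0000" as the sole
end-of-message marker without understanding what the sections mean.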

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v4 05/35] upload-pack: factor out processing lines
  2018-02-28 23:22       ` [PATCH v4 05/35] upload-pack: factor out processing lines Brandon Williams
@ 2018-03-01 21:25         ` Junio C Hamano
  2018-03-12 22:24           ` Brandon Williams
  0 siblings, 1 reply; 362+ messages in thread
From: Junio C Hamano @ 2018-03-01 21:25 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, git, jrnieder, pclouds, peff, sbeller, stolee

Brandon Williams <bmwill@google.com> writes:

> Factor out the logic for processing shallow, deepen, deepen_since, and
> deepen_not lines into their own functions to simplify the
> 'receive_needs()' function in addition to making it easier to reuse some
> of this logic when implementing protocol_v2.

These little functions that still require their incoming data to
begin with fixed prefixes feel like a strange way to refactor the
logic for later reuse (when I imagine "reuse", the first use case
that comes to my mind is "this data source our new code reads from
gives the same data as the old 'shallow' packet used to give, but in
a different syntax"---so I'd restructure the code in such a way that
the caller figures out the syntax part and the called helper just
groks the "information contents" unwrapped from the surface syntax;
the syntax may be different in the new codepath but once unwrapped,
the "information contents" to be processed would not be different
hence we can reuse the helper).

IOW, I would have expected the caller to be not like this:

> -		if (skip_prefix(line, "shallow ", &arg)) {
> -			struct object_id oid;
> -			struct object *object;
> -			if (get_oid_hex(arg, &oid))
> -				die("invalid shallow line: %s", line);
> -			object = parse_object(&oid);
> -			if (!object)
> -				continue;
> -			if (object->type != OBJ_COMMIT)
> -				die("invalid shallow object %s", oid_to_hex(&oid));
> -			if (!(object->flags & CLIENT_SHALLOW)) {
> -				object->flags |= CLIENT_SHALLOW;
> -				add_object_array(object, NULL, &shallows);
> -			}
> +		if (process_shallow(line, &shallows))
>  			continue;
> +		if (process_deepen(line, &depth))
>  			continue;
		...

but more like

		if (skip_prefix(line, "shallow ", &arg)) {
			process_shallow(arg, &shallows);
			continue;
		}
		if (skip_prefix(line, "deepen ", &arg)) {
			process_deepen(arg, &depth);
			continue;
		}
		...

I need to defer the final judgment until I see how they are used,
though.  It's not too big a deal either way---it just felt "not
quite right" to me.
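
A self-contained sketch of that second shape, where the caller owns
the surface syntax and the helpers see only the unwrapped payload.
Everything named `sketch_*` is hypothetical; the helpers here just
validate and copy their argument instead of touching object flags,
and the depth parsing is simplified to base 10.

```c
#include <ctype.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for skip_prefix(): strips `prefix` and reports success. */
static int sketch_skip_prefix(const char *str, const char *prefix,
			      const char **out)
{
	size_t n = strlen(prefix);

	if (strncmp(str, prefix, n))
		return 0;
	*out = str + n;
	return 1;
}

/* Sees only the payload: exactly 40 hex digits naming an object. */
static int sketch_process_shallow(const char *arg, char oid_hex[41])
{
	size_t i;

	for (i = 0; i < 40; i++)
		if (!isxdigit((unsigned char)arg[i]))
			return -1;
	if (arg[40] != '\0')
		return -1;
	memcpy(oid_hex, arg, 41); /* copies the trailing NUL as well */
	return 0;
}

/* Sees only the payload: a positive decimal depth. */
static int sketch_process_deepen(const char *arg, int *depth)
{
	char *end;
	long value = strtol(arg, &end, 10);

	if (end == arg || *end || value <= 0 || value > INT_MAX)
		return -1;
	*depth = (int)value;
	return 0;
}

/* Caller: 1 if handled, 0 if the line is something else, -1 if invalid. */
static int sketch_handle_line(const char *line, char oid_hex[41], int *depth)
{
	const char *arg;

	if (sketch_skip_prefix(line, "shallow ", &arg))
		return sketch_process_shallow(arg, oid_hex) ? -1 : 1;
	if (sketch_skip_prefix(line, "deepen ", &arg))
		return sketch_process_deepen(arg, depth) ? -1 : 1;
	return 0;
}
```

A different caller that receives the same information in another
syntax could reuse the two helpers unchanged, which is the kind of
reuse described above.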



^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v4 06/35] transport: use get_refs_via_connect to get refs
  2018-02-28 23:22       ` [PATCH v4 06/35] transport: use get_refs_via_connect to get refs Brandon Williams
@ 2018-03-01 21:25         ` Junio C Hamano
  0 siblings, 0 replies; 362+ messages in thread
From: Junio C Hamano @ 2018-03-01 21:25 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, git, jrnieder, pclouds, peff, sbeller, stolee

Brandon Williams <bmwill@google.com> writes:

> Remove code duplication and use the existing 'get_refs_via_connect()'
> function to retrieve a remote's heads in 'fetch_refs_via_pack()' and
> 'git_transport_push()'.
>
> Signed-off-by: Brandon Williams <bmwill@google.com>
> ---
>  transport.c | 18 ++++--------------
>  1 file changed, 4 insertions(+), 14 deletions(-)

Nice ;-)

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v4 03/35] pkt-line: add delim packet support
  2018-03-01 21:04           ` Junio C Hamano
@ 2018-03-01 22:49             ` Brandon Williams
  2018-03-01 23:43               ` Junio C Hamano
  0 siblings, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-03-01 22:49 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, git, jrnieder, pclouds, peff, sbeller, stolee

On 03/01, Junio C Hamano wrote:
> Junio C Hamano <gitster@pobox.com> writes:
> 
> > Brandon Williams <bmwill@google.com> writes:
> >
> >> One of the design goals of protocol-v2 is to improve the semantics of
> >> flush packets.  Currently in protocol-v1, flush packets are used both to
> >> indicate a break in a list of packet lines as well as an indication that
> >> one side has finished speaking.  This makes it particularly difficult
> >> to implement proxies as a proxy would need to completely understand git
> >> protocol instead of simply looking for a flush packet.
> >
> > Good ;-) Yes, this has been one of the largest gripes about the
> > smart-http support code we have.
> 
> Hmph, strictly speaking, the "delim" does not have to be a part of
> how packetized stream is defined.  As long as we stop abusing flush
> as "This is merely an end of one segment of what I say." and make it
> always mean "I am done speaking, it is your turn.", the application
> payload can define its own syntax to separate groups of packets.

That's actually a good point.  We could just as easily have the delim
packet be an empty packet-line "0004" or something like that.

> 
> I do not mind having this "delim" thing defined at the protocol
> level too much, though.

-- 
Brandon Williams


* Re: [PATCH v4 12/35] serve: introduce git-serve
  2018-02-28 23:22       ` [PATCH v4 12/35] serve: introduce git-serve Brandon Williams
@ 2018-03-01 23:11         ` Junio C Hamano
  2018-03-12 22:08           ` Brandon Williams
  2018-03-02 20:42         ` Junio C Hamano
  2018-03-02 20:56         ` Junio C Hamano
  2 siblings, 1 reply; 362+ messages in thread
From: Junio C Hamano @ 2018-03-01 23:11 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, git, jrnieder, pclouds, peff, sbeller, stolee

Brandon Williams <bmwill@google.com> writes:

>  Documentation/technical/protocol-v2.txt | 171 ++++++++++++++++

Unlike other things in Documentation/technical/, this is not listed
on TECH_DOCS list in Documentation/Makefile.  Shouldn't it be?


* Re: [PATCH v4 03/35] pkt-line: add delim packet support
  2018-03-01 22:49             ` Brandon Williams
@ 2018-03-01 23:43               ` Junio C Hamano
  0 siblings, 0 replies; 362+ messages in thread
From: Junio C Hamano @ 2018-03-01 23:43 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, git, jrnieder, pclouds, peff, sbeller, stolee

Brandon Williams <bmwill@google.com> writes:

> On 03/01, Junio C Hamano wrote:
> ...
>> Hmph, strictly speaking, the "delim" does not have to be a part of
>> how packetized stream is defined.  As long as we stop abusing flush
>> as "This is merely an end of one segment of what I say." and make it
>> always mean "I am done speaking, it is your turn.", the application
>> payload can define its own syntax to separate groups of packets.
>
> That's actually a good point.  We could just as easily have the delim
> packet be an empty packet-line "0004" or something like that.

Yes.  As long as there is an easy and obvious "cannot be a value"
constant, you can use it as a delimiter defined at the application
level.  For example, your command-request uses delim, like so:

+    request = empty-request | command-request
+    empty-request = flush-pkt
+    command-request = command
+		      capability-list
+		      (command-args)
+		      flush-pkt
+    command = PKT-LINE("command=" key LF)
+    command-args = delim-pkt
+		   *PKT-LINE(arg LF)

to mark the end of the cap list, but if an empty packet does not make
sense as a member of a cap list or a command args list, then an
empty packet between the cap list and the command args can be used instead.
A protocol-ignorant proxy can still work just fine.

Having a defined delim at the protocol level is often convenient, of
course, but once the application starts calling for multi-level
delimiters (i.e. maybe there are chapters and sections inside each
chapter in a single request message), it would not be sufficient to
define a single delim packet type.  The application layer needs to
define its own convention (e.g. if no "empty" section is allowed,
then "two consecutive delim is a chapter break; one delim is a
section break" can become a viable way to emulate multi-level
delimiters).
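
To illustrate what a protocol-ignorant reader has to look at, here is
a minimal sketch (not part of the patch series) that classifies a
pkt-line purely from its 4-byte hex length header.  It assumes the
framing discussed in this thread: "0000" is a flush-pkt, "0001" is the
proposed delim-pkt, and "0004" is an empty packet-line.  The names
classify_pkt and pkt_kind are made up for illustration, and treating
"0002" and "0003" as invalid is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>

enum pkt_kind { PKT_INVALID, PKT_FLUSH, PKT_DELIM, PKT_EMPTY, PKT_DATA };

/* Decode one hex digit, or return -1 if it is not a hex digit. */
static int hexval(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/*
 * Classify a pkt-line from its 4-byte length header.  The length
 * counts the header itself, so for PKT_DATA the number of payload
 * bytes that follow is stored in *payload_len.
 */
enum pkt_kind classify_pkt(const char hdr[4], size_t *payload_len)
{
	size_t len = 0;
	int i;

	assert(hdr && payload_len);
	*payload_len = 0;
	for (i = 0; i < 4; i++) {
		int v = hexval(hdr[i]);
		if (v < 0)
			return PKT_INVALID;
		len = (len << 4) | (size_t)v;
	}
	if (len == 0)
		return PKT_FLUSH;	/* "I am done speaking" */
	if (len == 1)
		return PKT_DELIM;	/* separator inside one message */
	if (len < 4)
		return PKT_INVALID;	/* 0002, 0003: unused in this sketch */
	if (len == 4)
		return PKT_EMPTY;	/* header only, no payload */
	*payload_len = len - 4;
	return PKT_DATA;
}
```

A proxy built on this only needs to forward bytes until it sees
PKT_FLUSH; whether sections are separated by PKT_DELIM or by PKT_EMPTY
is then a matter for the application layer, as argued above.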


* Re: [PATCH v4 34/35] remote-curl: implement stateless-connect command
  2018-02-28 23:22       ` [PATCH v4 34/35] remote-curl: implement stateless-connect command Brandon Williams
@ 2018-03-02 20:07         ` Johannes Schindelin
  2018-03-05 19:35           ` Brandon Williams
  0 siblings, 1 reply; 362+ messages in thread
From: Johannes Schindelin @ 2018-03-02 20:07 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, git, gitster, jrnieder, pclouds, peff, sbeller, stolee

Hi Brandon,

On Wed, 28 Feb 2018, Brandon Williams wrote:

> diff --git a/remote-curl.c b/remote-curl.c
> index 66a53f74b..3f882d766 100644
> --- a/remote-curl.c
> +++ b/remote-curl.c
> @@ -188,7 +188,12 @@ static struct ref *parse_git_refs(struct discovery *heads, int for_push)
> [...]
> +static size_t proxy_in(char *buffer, size_t eltsize,
> +		       size_t nmemb, void *userdata)
> +{
> +	size_t max;
> +	struct proxy_state *p = userdata;
> +	size_t avail = p->request_buffer.len - p->pos;
> +
> +
> +	if (eltsize != 1)
> +		BUG("curl read callback called with size = %zu != 1", eltsize);

The format specifier %zu is not actually portable. Please use PRIuMAX and
cast to (uintmax_t) instead.

This breaks the Windows build of `pu` (before that, there was still a test
failure that I did not have the time to chase down).
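
As a minimal sketch of that suggestion (the helper name format_size is
made up for illustration and is not part of the patch), a size_t can
be printed without %zu like this:

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Format a size_t into buf without relying on the %zu specifier:
 * widen the value to uintmax_t and print it with PRIuMAX.
 * Returns what snprintf returns (the length that was needed).
 */
int format_size(char *buf, size_t buflen, size_t n)
{
	assert(buf != NULL || buflen == 0);
	return snprintf(buf, buflen, "%" PRIuMAX, (uintmax_t)n);
}
```

Applied to the hunk above, the call would presumably become
BUG("curl read callback called with size = %"PRIuMAX" != 1",
(uintmax_t)eltsize).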

Ciao,
Dscho


* Re: [PATCH v4 12/35] serve: introduce git-serve
  2018-02-28 23:22       ` [PATCH v4 12/35] serve: introduce git-serve Brandon Williams
  2018-03-01 23:11         ` Junio C Hamano
@ 2018-03-02 20:42         ` Junio C Hamano
  2018-03-13 21:40           ` Brandon Williams
  2018-03-02 20:56         ` Junio C Hamano
  2 siblings, 1 reply; 362+ messages in thread
From: Junio C Hamano @ 2018-03-02 20:42 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, git, jrnieder, pclouds, peff, sbeller, stolee

Brandon Williams <bmwill@google.com> writes:

> + Capabilities
> +~~~~~~~~~~~~~~
> +
> +There are two different types of capabilities: normal capabilities,
> +which can be used to convey information or alter the behavior of a
> +request, and commands, which are the core actions that a client wants to
> +perform (fetch, push, etc).
> +
> +All commands must only last a single round and be stateless from the
> +perspective of the server side.  All state MUST be retained and managed
> +by the client process.  This permits simple round-robin load-balancing
> +on the server side, without needing to worry about state management.
> +
> +Clients MUST NOT require state management on the server side in order to
> +function correctly.

This somehow feels a bit too HTTP centric worldview that potentially
may penalize those who do not mind stateful services.

> + agent
> +-------
> +
> +The server can advertise the `agent` capability with a value `X` (in the
> +form `agent=X`) to notify the client that the server is running version
> +`X`.  The client may optionally send its own agent string by including
> +the `agent` capability with a value `Y` (in the form `agent=Y`) in its
> +request to the server (but it MUST NOT do so if the server did not
> +advertise the agent capability).

Are there different degrees of permissiveness between "The server
CAN" and "The client MAY" above, or is the above paragraph merely
being fuzzy?

I notice that, with the above "MUST NOT", it is impossible for a
server to collect voluntary census information from client without
revealing its own "version".  Because in principle it is not
sensible to allow one side to send random capabilities without first
making sure that the other side understands them, unsolicited
"agent" from the client over a channel where the server did not say
it would accept one is quite fine, and the server can always say
something silly like "agent=undisclosed" to allow the clients to
volunteer their own version, but the definition of this capability
smells like conflating two unrelated things (i.e. advertising your
own version vs permission to announce yourself).



* Re: [PATCH v4 12/35] serve: introduce git-serve
  2018-02-28 23:22       ` [PATCH v4 12/35] serve: introduce git-serve Brandon Williams
  2018-03-01 23:11         ` Junio C Hamano
  2018-03-02 20:42         ` Junio C Hamano
@ 2018-03-02 20:56         ` Junio C Hamano
  2018-03-13 21:35           ` Brandon Williams
  2 siblings, 1 reply; 362+ messages in thread
From: Junio C Hamano @ 2018-03-02 20:56 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, git, jrnieder, pclouds, peff, sbeller, stolee

Brandon Williams <bmwill@google.com> writes:

> +	/*
> +	 * Function queried to see if a capability should be advertised.
> +	 * Optionally a value can be specified by adding it to 'value'.
> +	 * If a value is added to 'value', the server will advertise this
> +	 * capability as "<name>=<value>" instead of "<name>".
> +	 */
> +	int (*advertise)(struct repository *r, struct strbuf *value);

So this is "do we tell them about this capability?"

> +static void advertise_capabilities(void)
> +{
> +	...
> +	for (i = 0; i < ARRAY_SIZE(capabilities); i++) {
> +		struct protocol_capability *c = &capabilities[i];
> +
> +		if (c->advertise(the_repository, &value)) {
> +			strbuf_addstr(&capability, c->name);
> +             ...

And used as such in this function.  We tell the other side about the
capability only when .advertise returns true.

> +static int is_valid_capability(const char *key)
> +{
> +	const struct protocol_capability *c = get_capability(key);
> +
> +	return c && c->advertise(the_repository, NULL);
> +}

But this is different---the other side mentioned a capability's
name, and we looked it up from our table to see if we know about it
(i.e. NULL-ness of c), but in addition, we ask if we would tell them
about it if we were advertising.  I am not sure how I should feel
about it (yet).

> +static int is_command(const char *key, struct protocol_capability **command)
> +{
> +	const char *out;
> +
> +	if (skip_prefix(key, "command=", &out)) {
> +		struct protocol_capability *cmd = get_capability(out);
> +
> +		if (!cmd || !cmd->advertise(the_repository, NULL) || !cmd->command)
> +			die("invalid command '%s'", out);
> +		if (*command)
> +			die("command already requested");

Shouldn't these two checks that lead to die be the other way around?
When they give us "command=frotz" and we already have *command, it
would be an error whether we understand 'frotz' or not.

Who are the target audience of these "die"?  Are they meant to be
communicated back to the other side of the connection, or are they
only to be sent to the "server log"?

The latter one may want to say what the two conflicting commands are
in the log message, perhaps?
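
A minimal sketch of the reordering asked about above, under stated
assumptions: it is a standalone illustration, not the serve.c code.
The known_commands table stands in for the real capability table, the
.advertise and .command checks are folded into a single lookup, and
status codes are returned where the patch calls die().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum cmd_status { NOT_A_COMMAND, CMD_ACCEPTED, CMD_DUPLICATE, CMD_INVALID };

/* Hypothetical stand-in for the server's capability table. */
static const char *known_commands[] = { "ls-refs", "fetch" };

static const char *lookup_command(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(known_commands) / sizeof(known_commands[0]); i++)
		if (!strcmp(known_commands[i], name))
			return known_commands[i];
	return NULL;
}

/*
 * Parse a "command=<name>" key.  *command must be NULL before the
 * first call; it is set only when a command is accepted.
 */
enum cmd_status parse_command(const char *key, const char **command)
{
	static const char prefix[] = "command=";
	const char *cmd;

	assert(key && command);
	if (strncmp(key, prefix, sizeof(prefix) - 1))
		return NOT_A_COMMAND;

	/*
	 * Check for a duplicate first: a second "command=" line is an
	 * error whether or not we understand the command it names.
	 */
	if (*command)
		return CMD_DUPLICATE;

	cmd = lookup_command(key + sizeof(prefix) - 1);
	if (!cmd)
		return CMD_INVALID;
	*command = cmd;
	return CMD_ACCEPTED;
}
```

With this ordering, "command=frotz" after an accepted "command=fetch"
is reported as a duplicate rather than as an unknown command, and the
caller still holds the first command's name if it wants to log both.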

> +		*command = cmd;



* Re: [PATCH v4 13/35] ls-refs: introduce ls-refs server command
  2018-02-28 23:22       ` [PATCH v4 13/35] ls-refs: introduce ls-refs server command Brandon Williams
@ 2018-03-02 21:13         ` Junio C Hamano
  2018-03-13 21:27           ` Brandon Williams
  2018-03-03  4:43         ` Jeff King
  1 sibling, 1 reply; 362+ messages in thread
From: Junio C Hamano @ 2018-03-02 21:13 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, git, jrnieder, pclouds, peff, sbeller, stolee

Brandon Williams <bmwill@google.com> writes:

> + ls-refs
> +---------
> +
> +`ls-refs` is the command used to request a reference advertisement in v2.
> +Unlike the current reference advertisement, ls-refs takes in arguments
> +which can be used to limit the refs sent from the server.

OK.

> +Additional features not supported in the base command will be advertised
> +as the value of the command in the capability advertisement in the form
> +of a space separated list of features, e.g.  "<command>=<feature 1>
> +<feature 2>".

Doesn't this explain the general convention that applies to any
command, not just ls-refs command?  As a part of ls-refs section,
<command> in the above explanation is always a constant "ls-refs",
right?

It is a bit unclear how <feature N> in the above description is
related to "arguments" in the following paragraph.  Does a server
that can show symrefs and peeled tags and that can limit the output
with ref-pattern advertise these three as supported features, i.e.

	ls-refs=symrefs peel ref-pattern

or something?  Would there be a case where a "feature" does not
correspond 1:1 to an argument to the command, and if so how would
the server and the client negotiate use of such a feature?

> +    ref-pattern <pattern>
> +	When specified, only references matching one of the provided
> +	patterns are displayed.  A pattern is either a valid refname
> +	(e.g.  refs/heads/master), in which a ref must match the pattern
> +	exactly, or a prefix of a ref followed by a single '*' wildcard
> +	character (e.g. refs/heads/*), in which a ref must have a prefix
> +	equal to the pattern up to the wildcard character.

I thought the recent consensus was left-anchored prefix match that
honors /-directory boundary, i.e. no explicit asterisk and just
saying "refs/heads" is enough to match "refs/heads" itself and
"refs/heads/master" but not "refs/headscarf"?
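
To make the rule concrete, here is a minimal sketch of such a
left-anchored match that honors the '/' boundary.  The function name
ref_has_prefix is made up for illustration, and accepting an empty
prefix or a prefix that already ends in '/' is an assumption of this
sketch rather than something settled in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return 1 if refname is the prefix itself or lives below it in the
 * ref hierarchy, 0 otherwise.  "refs/heads" matches "refs/heads" and
 * "refs/heads/master" but not "refs/headscarf".
 */
int ref_has_prefix(const char *refname, const char *prefix)
{
	size_t len;

	assert(refname && prefix);
	len = strlen(prefix);
	if (strncmp(refname, prefix, len))
		return 0;
	/* An empty prefix or one ending in '/' is already at a boundary. */
	if (len == 0 || prefix[len - 1] == '/')
		return 1;
	/* Otherwise the match must end exactly at a '/' or at the end. */
	return refname[len] == '\0' || refname[len] == '/';
}
```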


* Re: [PATCH v4 17/35] ls-remote: pass ref patterns when requesting a remote's refs
  2018-02-28 23:22       ` [PATCH v4 17/35] ls-remote: pass ref patterns when requesting a remote's refs Brandon Williams
@ 2018-03-02 22:13         ` Junio C Hamano
  0 siblings, 0 replies; 362+ messages in thread
From: Junio C Hamano @ 2018-03-02 22:13 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, git, jrnieder, pclouds, peff, sbeller, stolee

Brandon Williams <bmwill@google.com> writes:

> Construct an argv_array of the ref patterns supplied via the command
> line and pass them to 'transport_get_remote_refs()' to be used when
> communicating protocol v2 so that the server can limit the ref
> advertisement based on the supplied patterns.
>
> Signed-off-by: Brandon Williams <bmwill@google.com>
> ---
>  builtin/ls-remote.c    | 12 ++++++++++--
>  refs.c                 | 14 ++++++++++++++
>  refs.h                 |  7 +++++++
>  t/t5702-protocol-v2.sh | 26 ++++++++++++++++++++++++++
>  4 files changed, 57 insertions(+), 2 deletions(-)
>
> diff --git a/builtin/ls-remote.c b/builtin/ls-remote.c
> index c6e9847c5..083ba8b29 100644
> --- a/builtin/ls-remote.c
> +++ b/builtin/ls-remote.c
> @@ -2,6 +2,7 @@
>  #include "cache.h"
>  #include "transport.h"
>  #include "remote.h"
> +#include "refs.h"
>  
>  static const char * const ls_remote_usage[] = {
>  	N_("git ls-remote [--heads] [--tags] [--refs] [--upload-pack=<exec>]\n"
> @@ -43,6 +44,7 @@ int cmd_ls_remote(int argc, const char **argv, const char *prefix)
>  	int show_symref_target = 0;
>  	const char *uploadpack = NULL;
>  	const char **pattern = NULL;
> +	struct argv_array ref_patterns = ARGV_ARRAY_INIT;
>  
>  	struct remote *remote;
>  	struct transport *transport;
> @@ -74,8 +76,14 @@ int cmd_ls_remote(int argc, const char **argv, const char *prefix)
>  	if (argc > 1) {
>  		int i;
>  		pattern = xcalloc(argc, sizeof(const char *));
> -		for (i = 1; i < argc; i++)
> +		for (i = 1; i < argc; i++) {
>  			pattern[i - 1] = xstrfmt("*/%s", argv[i]);
> +
> +			if (strchr(argv[i], '*'))
> +				argv_array_push(&ref_patterns, argv[i]);
> +			else
> +				expand_ref_pattern(&ref_patterns, argv[i]);
> +		}
>  	}
>  
>  	remote = remote_get(dest);
> @@ -96,7 +104,7 @@ int cmd_ls_remote(int argc, const char **argv, const char *prefix)
>  	if (uploadpack != NULL)
>  		transport_set_option(transport, TRANS_OPT_UPLOADPACK, uploadpack);
>  
> -	ref = transport_get_remote_refs(transport, NULL);
> +	ref = transport_get_remote_refs(transport, &ref_patterns);

Yup, this is a logical and an obvious conclusion of the past handful
of steps ;-) I actually was wondering why the previous step didn't
do this already, but the resulting series is easier to understand if
this is kept as a separate step.

However, this also means that the traditional pattern language that
ls-remote has supported dictates what the ls-refs command over the
wire can take, which may not be optimal.


* Re: [PATCH v4 18/35] fetch: pass ref patterns when fetching
  2018-02-28 23:22       ` [PATCH v4 18/35] fetch: pass ref patterns when fetching Brandon Williams
@ 2018-03-02 22:20         ` Junio C Hamano
  2018-03-12 22:18           ` Brandon Williams
  0 siblings, 1 reply; 362+ messages in thread
From: Junio C Hamano @ 2018-03-02 22:20 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, git, jrnieder, pclouds, peff, sbeller, stolee

Brandon Williams <bmwill@google.com> writes:

> diff --git a/builtin/fetch.c b/builtin/fetch.c
> index 850382f55..695fafe06 100644
> --- a/builtin/fetch.c
> +++ b/builtin/fetch.c
> @@ -332,11 +332,25 @@ static struct ref *get_ref_map(struct transport *transport,
>  	struct ref *rm;
>  	struct ref *ref_map = NULL;
>  	struct ref **tail = &ref_map;
> +	struct argv_array ref_patterns = ARGV_ARRAY_INIT;
>  
>  	/* opportunistically-updated references: */
>  	struct ref *orefs = NULL, **oref_tail = &orefs;
>  
> -	const struct ref *remote_refs = transport_get_remote_refs(transport, NULL);
> +	const struct ref *remote_refs;
> +
> +	for (i = 0; i < refspec_count; i++) {
> +		if (!refspecs[i].exact_sha1) {
> +			if (refspecs[i].pattern)
> +				argv_array_push(&ref_patterns, refspecs[i].src);
> +			else
> +				expand_ref_pattern(&ref_patterns, refspecs[i].src);
> +		}
> +	}
> +
> +	remote_refs = transport_get_remote_refs(transport, &ref_patterns);
> +
> +	argv_array_clear(&ref_patterns);

Is the idea here, which is shared with 17/35 about ls-remote, that
we used to grab literally everything they have in remote_refs, but
we have code in place to filter that set using refspecs given in the
remote.*.fetch configuration, so it is OK as long as we grab everything
that would match the remote.*.fetch pattern?  That is, grabbing too
much is acceptable, but if we populated ref_patterns[] with too few
patterns and failed to ask for refs that would match our refspec, it
would be a bug?

The reason behind this question is that I am wondering if/how we can
take advantage of this remote-side pre-filtering while doing "fetch
--prune".

Thanks.

>  
>  	if (refspec_count) {
>  		struct refspec *fetch_refspec;


* Re: [PATCH v4 19/35] push: pass ref patterns when pushing
  2018-02-28 23:22       ` [PATCH v4 19/35] push: pass ref patterns when pushing Brandon Williams
@ 2018-03-02 22:25         ` Junio C Hamano
  2018-03-12 22:20           ` Brandon Williams
  0 siblings, 1 reply; 362+ messages in thread
From: Junio C Hamano @ 2018-03-02 22:25 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, git, jrnieder, pclouds, peff, sbeller, stolee

Brandon Williams <bmwill@google.com> writes:

> Construct a list of ref patterns to be passed to 'get_refs_list()' from
> the refspec to be used during the push.  This list of ref patterns will
> be used to allow the server to filter the ref advertisement when
> communicating using protocol v2.
>
> Signed-off-by: Brandon Williams <bmwill@google.com>
> ---
>  transport.c | 26 +++++++++++++++++++++++++-
>  1 file changed, 25 insertions(+), 1 deletion(-)

When you are pushing 'master', we no longer hear what the other end
has at 'next', with this change, no?

In a project whose 'master' is extended primarily by merging topics
that have been cooking in 'next', the old way of pushing would only
have transferred the merge commits and resulting trees but not the
bulk of the blob data, because they are all available on 'next'.
Would this change make the object transfer far less efficient, I wonder?

I guess it is OK only because the push side of the current protocol
does not do common ancestor discovery exchange ;-)

>
> diff --git a/transport.c b/transport.c
> index dfc603b36..bf7ba6879 100644
> --- a/transport.c
> +++ b/transport.c
> @@ -1026,11 +1026,35 @@ int transport_push(struct transport *transport,
>  		int porcelain = flags & TRANSPORT_PUSH_PORCELAIN;
>  		int pretend = flags & TRANSPORT_PUSH_DRY_RUN;
>  		int push_ret, ret, err;
> +		struct refspec *tmp_rs;
> +		struct argv_array ref_patterns = ARGV_ARRAY_INIT;
> +		int i;
>  
>  		if (check_push_refs(local_refs, refspec_nr, refspec) < 0)
>  			return -1;
>  
> -		remote_refs = transport->vtable->get_refs_list(transport, 1, NULL);
> +		tmp_rs = parse_push_refspec(refspec_nr, refspec);
> +		for (i = 0; i < refspec_nr; i++) {
> +			const char *pattern = NULL;
> +
> +			if (tmp_rs[i].dst)
> +				pattern = tmp_rs[i].dst;
> +			else if (tmp_rs[i].src && !tmp_rs[i].exact_sha1)
> +				pattern = tmp_rs[i].src;
> +
> +			if (pattern) {
> +				if (tmp_rs[i].pattern)
> +					argv_array_push(&ref_patterns, pattern);
> +				else
> +					expand_ref_pattern(&ref_patterns, pattern);
> +			}
> +		}
> +
> +		remote_refs = transport->vtable->get_refs_list(transport, 1,
> +							       &ref_patterns);
> +
> +		argv_array_clear(&ref_patterns);
> +		free_refspec(refspec_nr, tmp_rs);
>  
>  		if (flags & TRANSPORT_PUSH_ALL)
>  			match_flags |= MATCH_REFS_ALL;


* Re: [PATCH v3 04/35] upload-pack: convert to a builtin
  2018-02-23 21:09                     ` Brandon Williams
@ 2018-03-03  4:24                       ` Jeff King
  0 siblings, 0 replies; 362+ messages in thread
From: Jeff King @ 2018-03-03  4:24 UTC (permalink / raw)
  To: Brandon Williams
  Cc: Jonathan Nieder, Jonathan Tan, git, sbeller, gitster, stolee,
	git, pclouds

On Fri, Feb 23, 2018 at 01:09:04PM -0800, Brandon Williams wrote:

> > By the way, any decision here would presumably need to be extended to
> > git-serve, etc. The current property is that it's safe to fetch from an
> > untrusted repository, even over ssh. If we're keeping that for protocol
> > v1, we'd want it to apply to protocol v2, as well.
> 
> This may be more complicated.  Right now (for backward compatibility)
> all fetches for v2 are issued to the upload-pack endpoint. So even
> though I've introduced git-serve it doesn't have requests issued to it
> and no requests can be issued to it currently (support isn't added to
> http-backend or git-daemon).  This just means that the command already
> exists to make it easy for testing specific v2 stuff and if we want to
> expose it as an endpoint (like when we have a brand new server command
> that is completely incompatible with v1) it's already there and support
> just needs to be plumbed in.
> 
> This whole notion of treating upload-pack differently from receive-pack
> has bad consequences for v2 though.  The idea for v2 is to be able to
> run any number of commands via the same endpoint, so at the end of the
> day the endpoint you used is irrelevant.  So you could issue both fetch
> and push commands via the same endpoint in v2 whether it's git-serve,
> receive-pack, or upload-pack.  So really, like Jonathan has said
> elsewhere, we need to figure out how to be ok with having receive-pack
> and upload-pack builtins, or having neither of them builtins, because it
> doesn't make much sense for v2 to straddle that line.

It seems like it would be OK if the whole code path of git-serve
invoking upload-pack happened without being a builtin, even if it would
be possible to run a builtin receive-pack from that same (non-builtin)
git-serve.

Remember that the client is driving the whole operation here, and we can
assume that git-serve is operating on the client's behalf. So a client
who chooses not to trigger receive-pack would be fine.

-Peff


* Re: [PATCH v3 11/35] test-pkt-line: introduce a packet-line test helper
  2018-02-23 21:22         ` Brandon Williams
@ 2018-03-03  4:25           ` Jeff King
  2018-03-05 18:48             ` Brandon Williams
  0 siblings, 1 reply; 362+ messages in thread
From: Jeff King @ 2018-03-03  4:25 UTC (permalink / raw)
  To: Brandon Williams
  Cc: Stefan Beller, git, Junio C Hamano, Jonathan Nieder,
	Derrick Stolee, Jeff Hostetler, Duy Nguyen

On Fri, Feb 23, 2018 at 01:22:31PM -0800, Brandon Williams wrote:

> On 02/22, Stefan Beller wrote:
> > On Tue, Feb 6, 2018 at 5:12 PM, Brandon Williams <bmwill@google.com> wrote:
> > 
> > > +static void pack_line(const char *line)
> > > +{
> > > +       if (!strcmp(line, "0000") || !strcmp(line, "0000\n"))
> > 
> > From our in-office discussion:
> > v1/v0 packs pktlines twice in http, which is not possible to
> > construct using this test helper when using the same string
> > for the packed and unpacked representation of flush and delim packets,
> > i.e. test-pkt-line --pack $(test-pkt-line --pack 0000) would produce
> > '0000' instead of '00090000\n'.
> > To fix it we'd have to replace the unpacked versions of these pkts to
> > something else such as "FLUSH" "DELIM".
> > 
> > However as we do not anticipate the test helper to be used in further
> > tests for v0, this ought to be no big issue.
> > Maybe someone else cares though?
> 
> I'm going to punt and say, if someone cares enough they can update this
> test-helper when they want to use it for v1/v0 stuff.

I recently added packetize and depacketize helpers for testing v0 streams;
see 4414a15002 (t/lib-git-daemon: add network-protocol helpers,
2018-01-24). Is it worth folding these together?
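
For reference, the "FLUSH"/"DELIM" idea from the quoted discussion can
be sketched as below.  This is a standalone illustration under stated
assumptions, not the test helper from the patch: the signature of
pack_line is changed to write into a caller-supplied buffer, only
NUL-terminated text lines are handled, and the only length limit
applied is what fits in four hex digits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Encode one line as a pkt-line into out.  The spellings "FLUSH" and
 * "DELIM" select the special packets, so a literal "0000" line can
 * still be packed as data.  Returns the number of bytes written
 * (excluding the NUL), or -1 if the line or the buffer is too small
 * or too large.
 */
int pack_line(const char *line, char *out, size_t outlen)
{
	size_t len;
	int n;

	assert(line && out);
	if (!strcmp(line, "FLUSH")) {
		n = snprintf(out, outlen, "0000");
	} else if (!strcmp(line, "DELIM")) {
		n = snprintf(out, outlen, "0001");
	} else {
		len = strlen(line) + 4;	/* length includes the header */
		if (len > 0xffff)
			return -1;	/* does not fit in four hex digits */
		n = snprintf(out, outlen, "%04x%s", (unsigned int)len, line);
	}
	return (n < 0 || (size_t)n >= outlen) ? -1 : n;
}
```

With distinct unpacked spellings, packing the line "0000\n" yields
'00090000\n' as described above, instead of collapsing into a flush
packet.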

-Peff


* Re: [PATCH v3 12/35] serve: introduce git-serve
  2018-02-23 21:45         ` Brandon Williams
@ 2018-03-03  4:33           ` Jeff King
  2018-03-05 18:43             ` Brandon Williams
  0 siblings, 1 reply; 362+ messages in thread
From: Jeff King @ 2018-03-03  4:33 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, sbeller, gitster, jrnieder, stolee, git, pclouds

On Fri, Feb 23, 2018 at 01:45:57PM -0800, Brandon Williams wrote:

> I think this is the price of extending the protocol in a backward
> compatible way.  If we don't want to be backwards compatible (allowing
> for graceful fallback to v1) then we could design this differently.
> Even so we're not completely out of luck just yet.
> 
> Back when I introduced the GIT_PROTOCOL side-channel I was able to
> demonstrate that arbitrary data could be sent to the server and it would
> only respect the stuff it knows about.  This means that we can do a
> follow up to v2 at some point to introduce an optimization where we can
> stuff a request into GIT_PROTOCOL and short-circuit the first round-trip
> if the server supports it.

If that's our end-game, it does make me wonder if we'd be happier just
jumping to that at first. Before you started the v2 protocol work, I had
a rough patch series passing what I called "early capabilities". The
idea was to let the client speak a few optional capabilities before the
ref advertisement, and be ready for the server to ignore them
completely. That doesn't clean up all the warts with the v0 protocol,
but it handles the major one (allowing more efficient ref
advertisements).

I dunno. There's a lot more going on here in v2 and I'm not sure I've
fully digested it.

> The great thing about this is that from the POV of the git-client, it
> doesn't care if it's speaking using the git://, ssh://, file://, or
> http:// transport; it's all the same protocol.  In my next re-roll I'll
> even drop the "# service" bit from the http server response and then the
> responses will truly be identical in all cases.

This part has me a little confused still. The big difference between
http and the other protocols is that the other ones are full-duplex, and
http is a series of stateless request/response pairs.

Are the other protocols becoming stateless request/response pairs, too?
Or will they be "the same protocol" only in the sense of using the same
transport?

(There are a lot of reasons not to like the stateless pair thing; it has
some horrid corner cases during want/have negotiation).

-Peff


* Re: [PATCH v4 13/35] ls-refs: introduce ls-refs server command
  2018-02-28 23:22       ` [PATCH v4 13/35] ls-refs: introduce ls-refs server command Brandon Williams
  2018-03-02 21:13         ` Junio C Hamano
@ 2018-03-03  4:43         ` Jeff King
  2018-03-05 18:21           ` Brandon Williams
  1 sibling, 1 reply; 362+ messages in thread
From: Jeff King @ 2018-03-03  4:43 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, git, gitster, jrnieder, pclouds, sbeller, stolee

On Wed, Feb 28, 2018 at 03:22:30PM -0800, Brandon Williams wrote:

> +static void add_pattern(struct pattern_list *patterns, const char *pattern)
> +{
> +	struct ref_pattern p;
> +	const char *wildcard;
> +
> +	p.pattern = strdup(pattern);

xstrdup?

> +	wildcard = strchr(pattern, '*');
> +	if (wildcard) {
> +		p.wildcard_pos = wildcard - pattern;
> +	} else {
> +		p.wildcard_pos = -1;
> +	}

Hmm, so this would accept stuff like "refs/heads/*/foo" but quietly
ignore the "/foo" part.

It also accepts "refs/h*" to get "refs/heads" and "refs/hello".  I think
it's worth going for the most-restrictive thing to start with, since
that enables a lot more server operations without worrying about
breaking compatibility.

-Peff


* Re: [PATCH v4 13/35] ls-refs: introduce ls-refs server command
  2018-03-03  4:43         ` Jeff King
@ 2018-03-05 18:21           ` Brandon Williams
  2018-03-05 18:29             ` Jonathan Nieder
  2018-03-05 20:28             ` Jeff King
  0 siblings, 2 replies; 362+ messages in thread
From: Brandon Williams @ 2018-03-05 18:21 UTC (permalink / raw)
  To: Jeff King; +Cc: git, git, gitster, jrnieder, pclouds, sbeller, stolee

On 03/02, Jeff King wrote:
> On Wed, Feb 28, 2018 at 03:22:30PM -0800, Brandon Williams wrote:
> 
> > +static void add_pattern(struct pattern_list *patterns, const char *pattern)
> > +{
> > +	struct ref_pattern p;
> > +	const char *wildcard;
> > +
> > +	p.pattern = strdup(pattern);
> 
> xstrdup?
> 
> > +	wildcard = strchr(pattern, '*');
> > +	if (wildcard) {
> > +		p.wildcard_pos = wildcard - pattern;
> > +	} else {
> > +		p.wildcard_pos = -1;
> > +	}
> 
> Hmm, so this would accept stuff like "refs/heads/*/foo" but quietly
> ignore the "/foo" part.

Yeah that's true...this should probably not do that.  Since
"refs/heads/*/foo" violates the spec, this should really error
out as an invalid pattern.

> 
> It also accepts "refs/h*" to get "refs/heads" and "refs/hello".  I think
> it's worth going for the most-restrictive thing to start with, since
> that enables a lot more server operations without worrying about
> breaking compatibility.

And just to clarify, what do you see as being the most-restrictive case
of patterns that we should use?

-- 
Brandon Williams


* Re: [PATCH v4 13/35] ls-refs: introduce ls-refs server command
  2018-03-05 18:21           ` Brandon Williams
@ 2018-03-05 18:29             ` Jonathan Nieder
  2018-03-05 20:38               ` Jeff King
  2018-03-05 20:28             ` Jeff King
  1 sibling, 1 reply; 362+ messages in thread
From: Jonathan Nieder @ 2018-03-05 18:29 UTC (permalink / raw)
  To: Brandon Williams; +Cc: Jeff King, git, git, gitster, pclouds, sbeller, stolee

Hi,

On Mon, Mar 05, 2018 at 10:21:55AM -0800, Brandon Williams wrote:
> On 03/02, Jeff King wrote:

>> It also accepts "refs/h*" to get "refs/heads" and "refs/hello".  I think
>> it's worth going for the most-restrictive thing to start with, since
>> that enables a lot more server operations without worrying about
>> breaking compatibility.
>
> And just to clarify, what do you see as being the most-restrictive case
> of patterns that we should use?

Peff, can you say a little more about the downsides of accepting
refs/h*?

IIRC the "git push" command already accepts such refspecs, so there's a
benefit to accepting them.  Reftable and packed-refs support such
queries about as efficiently as refs/heads/*.  For loose refs, readdir
doesn't provide a way to restrict which files you look at, but loose
refs are always slow anyway. :)

In other words, I see real benefits and I don't see much in the way of
costs, so I'm not seeing why not to support this.

Thanks,
Jonathan

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v3 12/35] serve: introduce git-serve
  2018-03-03  4:33           ` Jeff King
@ 2018-03-05 18:43             ` Brandon Williams
  2018-03-05 20:52               ` Jeff King
  0 siblings, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-03-05 18:43 UTC (permalink / raw)
  To: Jeff King; +Cc: git, sbeller, gitster, jrnieder, stolee, git, pclouds

On 03/02, Jeff King wrote:
> On Fri, Feb 23, 2018 at 01:45:57PM -0800, Brandon Williams wrote:
> 
> > I think this is the price of extending the protocol in a backward
> > compatible way.  If we don't want to be backwards compatible (allowing
> > for graceful fallback to v1) then we could design this differently.
> > Even so we're not completely out of luck just yet.
> > 
> > Back when I introduced the GIT_PROTOCOL side-channel I was able to
> > demonstrate that arbitrary data could be sent to the server and it would
> > only respect the stuff it knows about.  This means that we can do a
> > follow up to v2 at some point to introduce an optimization where we can
> > stuff a request into GIT_PROTOCOL and short-circuit the first round-trip
> > if the server supports it.
> 
> If that's our end-game, it does make me wonder if we'd be happier just
> jumping to that at first. Before you started the v2 protocol work, I had
> a rough patch series passing what I called "early capabilities". The
> idea was to let the client speak a few optional capabilities before the
> ref advertisement, and be ready for the server to ignore them
> completely. That doesn't clean up all the warts with the v0 protocol,
> but it handles the major one (allowing more efficient ref
> advertisements).

I didn't really want to get to that just yet, simply because I want to
try and keep the scope of this smaller while still being able to fix
most of the issues we have with v0.

> I dunno. There's a lot more going on here in v2 and I'm not sure I've
> fully digested it.

I tried to keep it similar enough to v0 such that it wouldn't be that
big of a leap (small steps).  For example negotiation is really done the
same as it is in v0 during fetch (a next step would be to actually
improve that).  We can definitely talk about all this in more detail
later this week too.

> 
> > The great thing about this is that from the POV of the git-client, it
> > doesn't care if it's speaking using the git://, ssh://, file://, or
> > http:// transport; it's all the same protocol.  In my next re-roll I'll
> > even drop the "# service" bit from the http server response and then the
> > responses will truly be identical in all cases.
> 
> This part has me a little confused still. The big difference between
> http and the other protocols is that the other ones are full-duplex, and
> http is a series of stateless request/response pairs.
> 
> Are the other protocols becoming stateless request/response pairs, too?
> Or will they be "the same protocol" only in the sense of using the same
> transport?
> 
> (There are a lot of reasons not to like the stateless pair thing; it has
> some horrid corner cases during want/have negotiation).

Junio made a comment on the Spec in the most recent version of the
series about how I state that v2 is stateless and "MUST NOT" rely on
state being stored on the server side.  In reality I think this needs to
be tweaked a bit because when you do have a full-duplex connection you
will probably want to use that to reduce the amount of data that you send
in some cases.

In the current protocol http has a lot of additional stuff that's had to
be done to it to get it to work with a protocol that was designed to be
stateful first.  What I want is for the protocol to be designed
stateless first so that http functions essentially the same as ssh or
file or git transports and we don't have to do any hackery to get it to
work.  This also makes it very simple to implement a new feature in the
protocol because you only need to think about implementing it once
instead of twice like you kind of have to do with v0.  So in the most
recent series everything is a chain of request/response pairs even in
the non-http cases.

In a previous version of the series I had each command being able to
last any number of rounds and having a 'stateless' capability indicating
if the command needed to be run stateless.  I didn't think that was a
good design because by default you are still designing the stateful
thing first and the http (stateless) case can be an afterthought.  So
instead maybe we'll need commands which can benefit from state to have a
'stateful' feature that can be advertised when a full-duplex connection
is possible.  This still gives you the opportunity to not advertise that
and have the same behavior over ssh as http.  I actually remember
hearing someone talk about how they would like to allow for ssh
connections to their server and just have it be a proxy for http and
this would enable that.

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v3 11/35] test-pkt-line: introduce a packet-line test helper
  2018-03-03  4:25           ` Jeff King
@ 2018-03-05 18:48             ` Brandon Williams
  0 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-03-05 18:48 UTC (permalink / raw)
  To: Jeff King
  Cc: Stefan Beller, git, Junio C Hamano, Jonathan Nieder,
	Derrick Stolee, Jeff Hostetler, Duy Nguyen

On 03/02, Jeff King wrote:
> On Fri, Feb 23, 2018 at 01:22:31PM -0800, Brandon Williams wrote:
> 
> > On 02/22, Stefan Beller wrote:
> > > On Tue, Feb 6, 2018 at 5:12 PM, Brandon Williams <bmwill@google.com> wrote:
> > > 
> > > > +static void pack_line(const char *line)
> > > > +{
> > > > +       if (!strcmp(line, "0000") || !strcmp(line, "0000\n"))
> > > 
> > > From our in-office discussion:
> > > v1/v0 packs pktlines twice in http, which is not possible to
> > > construct using this test helper when using the same string
> > > for the packed and unpacked representation of flush and delim packets,
> > > i.e. test-pkt-line --pack $(test-pkt-line --pack 0000) would produce
> > > '0000' instead of '00090000\n'.
> > > To fix it we'd have to replace the unpacked versions of these pkts to
> > > something else such as "FLUSH" "DELIM".
> > > 
> > > However as we do not anticipate the test helper to be used in further
> > > tests for v0, this ought to be no big issue.
> > > Maybe someone else cares though?
> > 
> > I'm going to punt and say, if someone cares enough they can update this
> > test-helper when they want to use it for v1/v0 stuff.
> 
> I recently added packetize and depacketize helpers for testing v0 streams;
> see 4414a15002 (t/lib-git-daemon: add network-protocol helpers,
> 2018-01-24). Is it worth folding these together?

I didn't know something like that existed! (of course if it was just
added this year then it didn't exist when I started working on this
stuff).  Yeah, it's probably a good idea to fold these together; I can
take a look at how your packetize and depacketize helpers work and add
the small amount of functionality that I'd need to replace the helper I
made.
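
For anyone following the '00090000\n' example quoted above, here is a
minimal sketch of pkt-line framing.  It is not the test-pkt-line helper
from the patch; pkt_line_encode() is a hypothetical name, and the only
rule it assumes is that the four hex digits of the length prefix count
themselves plus the payload.  Under that rule the five-byte payload
"0000\n" packs to "0009" followed by the payload, which is why using
the literal string "0000" to also mean "emit a flush packet" is
ambiguous.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not the code from the patch): wrap `payload`
 * in a pkt-line.  The 4-digit hex length prefix counts itself, so the
 * prefix value is strlen(payload) + 4.
 * Returns the number of bytes written, or -1 if it does not fit.
 */
static int pkt_line_encode(const char *payload, char *out, size_t outsz)
{
	size_t len = strlen(payload) + 4;

	/* four hex digits cannot express more than 0xffff */
	if (len > 0xffff || len + 1 > outsz)
		return -1;
	snprintf(out, outsz, "%04x%s", (unsigned int)len, payload);
	return (int)len;
}
```

A flush packet is the bare string "0000" with no payload, so it cannot
be produced by this function; a helper would need a separate spelling
for it, such as the "FLUSH" token suggested above.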

> 
> -Peff

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v4 34/35] remote-curl: implement stateless-connect command
  2018-03-02 20:07         ` Johannes Schindelin
@ 2018-03-05 19:35           ` Brandon Williams
  0 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-03-05 19:35 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: git, git, gitster, jrnieder, pclouds, peff, sbeller, stolee

On 03/02, Johannes Schindelin wrote:
> Hi Brandon,
> 
> On Wed, 28 Feb 2018, Brandon Williams wrote:
> 
> > diff --git a/remote-curl.c b/remote-curl.c
> > index 66a53f74b..3f882d766 100644
> > --- a/remote-curl.c
> > +++ b/remote-curl.c
> > @@ -188,7 +188,12 @@ static struct ref *parse_git_refs(struct discovery *heads, int for_push)
> > [...]
> > +static size_t proxy_in(char *buffer, size_t eltsize,
> > +		       size_t nmemb, void *userdata)
> > +{
> > +	size_t max;
> > +	struct proxy_state *p = userdata;
> > +	size_t avail = p->request_buffer.len - p->pos;
> > +
> > +
> > +	if (eltsize != 1)
> > +		BUG("curl read callback called with size = %zu != 1", eltsize);
> 
> The format specifier %zu is not actually portable. Please use PRIuMAX and
> cast to (uintmax_t) instead.
> 
> This breaks the Windows build of `pu` (before that, there was still a test
> failure that I did not have the time to chase down).

Oh sorry, looks like Junio put a patch on top in pu to fix this.  I'll
squash that fix into this patch.

Thanks for catching this.
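
For reference, a minimal sketch of the PRIuMAX form suggested above.
This is not the fix that was queued in pu; format_size() is a
hypothetical helper used only to show the cast and the macro from
<inttypes.h> working together.

```c
#include <inttypes.h>
#include <stdio.h>

/*
 * Hypothetical helper: format a size_t without relying on "%zu".
 * The value is widened to uintmax_t and printed with PRIuMAX.
 */
static int format_size(char *out, size_t outsz, size_t value)
{
	return snprintf(out, outsz, "size = %" PRIuMAX, (uintmax_t)value);
}
```

The same pattern applies inside a BUG() or die() format string: the
macro is pasted into the literal and the argument gets the cast.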

> 
> Ciao,
> Dscho

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v4 13/35] ls-refs: introduce ls-refs server command
  2018-03-05 18:21           ` Brandon Williams
  2018-03-05 18:29             ` Jonathan Nieder
@ 2018-03-05 20:28             ` Jeff King
  2018-03-13 21:23               ` Brandon Williams
  1 sibling, 1 reply; 362+ messages in thread
From: Jeff King @ 2018-03-05 20:28 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, git, gitster, jrnieder, pclouds, sbeller, stolee

On Mon, Mar 05, 2018 at 10:21:55AM -0800, Brandon Williams wrote:

> > Hmm, so this would accept stuff like "refs/heads/*/foo" but quietly
> > ignore the "/foo" part.
> 
> Yeah that's true...this should probably not do that.  Since
> "refs/heads/*/foo" violates what the spec is, really this should error
> out as an invalid pattern.

Yeah, that would be better, I think.

> > It also accepts "refs/h*" to get "refs/heads" and "refs/hello".  I think
> > it's worth going for the most-restrictive thing to start with, since
> > that enables a lot more server operations without worrying about
> > breaking compatibility.
> 
> > And just to clarify, what do you see as being the most-restrictive case
> > of patterns that we should use?

I mean only accepting "*" at a "/" boundary (or just allowing a trailing
slash to imply recursion, like "refs/heads/", or even just always
assuming recursion to allow "refs/heads").

-Peff

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v4 13/35] ls-refs: introduce ls-refs server command
  2018-03-05 18:29             ` Jonathan Nieder
@ 2018-03-05 20:38               ` Jeff King
  0 siblings, 0 replies; 362+ messages in thread
From: Jeff King @ 2018-03-05 20:38 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: Brandon Williams, git, git, gitster, pclouds, sbeller, stolee

On Mon, Mar 05, 2018 at 10:29:14AM -0800, Jonathan Nieder wrote:

> >> It also accepts "refs/h*" to get "refs/heads" and "refs/hello".  I think
> >> it's worth going for the most-restrictive thing to start with, since
> >> that enables a lot more server operations without worrying about
> >> breaking compatibility.
> >
> > > And just to clarify, what do you see as being the most-restrictive case
> > > of patterns that we should use?
> 
> Peff, can you say a little more about the downsides of accepting
> refs/h*?
> 
> IIRC the "git push" command already accepts such refspecs, so there's a
> benefit to accepting them.  Reftable and packed-refs support such
> queries about as efficiently as refs/heads/*.  For loose refs, readdir
> doesn't provide a way to restrict which files you look at, but loose
> refs are always slow anyway. :)
> 
> In other words, I see real benefits and I don't see much in the way of
> costs, so I'm not seeing why not to support this.

"git for-each-ref" only handles "/" boundaries. I think we used to have
similar problems with the internal for_each_ref(), but I just checked
and I think it's more flexible these days.  One could imagine a more
trie-like storage, though I agree that is stretching it with a
hypothetical.

Mostly my point was that I don't see any big upside, and the choice
seemed rather arbitrary. And as it is generally easier to loosen the
patterns later than tighten them, it makes sense to go with the tightest
option at first unless there is a compelling reason not to.

-Peff

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v3 12/35] serve: introduce git-serve
  2018-03-05 18:43             ` Brandon Williams
@ 2018-03-05 20:52               ` Jeff King
  2018-03-05 21:36                 ` Jonathan Nieder
  0 siblings, 1 reply; 362+ messages in thread
From: Jeff King @ 2018-03-05 20:52 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, sbeller, gitster, jrnieder, stolee, git, pclouds

On Mon, Mar 05, 2018 at 10:43:21AM -0800, Brandon Williams wrote:

> In the current protocol http has a lot of additional stuff that's had to
> be done to it to get it to work with a protocol that was designed to be
> stateful first.  What I want is for the protocol to be designed
> stateless first so that http functions essentially the same as ssh or
> file or git transports and we don't have to do any hackery to get it to
> work.  This also makes it very simple to implement a new feature in the
> protocol because you only need to think about implementing it once
> instead of twice like you kind of have to do with v0.  So in the most
> recent series everything is a chain of request/response pairs even in
> the non-http cases.

I agree that would be a lot more pleasant for adding protocol features.
But I just worry that the stateful protocols get a lot less efficient.
I'm having trouble coming up with an easy reproduction, but my
recollection is that http has some nasty corner cases, because each
round of "have" lines sent to the server has to summarize the previous
conversation. So you can get a case where the client's requests keep
getting bigger and bigger during the negotiation (and eventually getting
large enough to cause problems).
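
A toy model of that growth, with made-up numbers and under the
worst-case assumption that every earlier "have" line has to be
repeated in each stateless request.  Real negotiation does not
necessarily repeat everything; the function names here are
hypothetical and only illustrate the linear versus quadratic totals.

```c
/*
 * Toy model: each round adds `per_round` new "have" lines.  A stateful
 * connection sends only the new lines in each round.
 */
static unsigned long haves_sent_stateful(unsigned rounds, unsigned per_round)
{
	return (unsigned long)rounds * per_round;
}

/*
 * Worst-case stateless model: round r resends the lines from rounds
 * 1..r-1 in addition to its own, so the total grows quadratically.
 */
static unsigned long haves_sent_stateless(unsigned rounds, unsigned per_round)
{
	unsigned long total = 0;
	unsigned r;

	for (r = 1; r <= rounds; r++)
		total += (unsigned long)r * per_round;
	return total;
}
```

With 10 rounds of 16 lines each, the stateful total is 160 lines and
the stateless total in this model is 880, and the last request alone
carries 160.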

If anything, I wish we could push the http protocol in a more stateful
direction with something like websockets. But I suspect that's an
unrealistic dream, just because not everybody's http setup (proxies,
etc) will be able to handle that.

-Peff

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v3 12/35] serve: introduce git-serve
  2018-03-05 20:52               ` Jeff King
@ 2018-03-05 21:36                 ` Jonathan Nieder
  2018-03-06  6:29                   ` Jeff King
  0 siblings, 1 reply; 362+ messages in thread
From: Jonathan Nieder @ 2018-03-05 21:36 UTC (permalink / raw)
  To: Jeff King; +Cc: Brandon Williams, git, sbeller, gitster, stolee, git, pclouds

Hi,

Jeff King wrote:

> I agree that would be a lot more pleasant for adding protocol features.
> But I just worry that the stateful protocols get a lot less efficient.
> I'm having trouble coming up with an easy reproduction, but my
> recollection is that http has some nasty corner cases, because each
> round of "have" lines sent to the server has to summarize the previous
> conversation. So you can get a case where the client's requests keep
> getting bigger and bigger during the negotiation (and eventually getting
> large enough to cause problems).

That's not so much a corner case as just how negotiation works over
http.

We want to do better (e.g. see [1]) but that's a bigger change than
the initial protocol v2.

As Brandon explained it to me, we really do want to use stateless-rpc
semantics by default, since that's just better for maintainability.
Instead of having two protocols, one that is sane and one that
struggles to hoist that into stateless-rpc, there would be one
stateless baseline plus capabilities to make use of state.

For example, it would be nice to have a capability to remember
negotiation state between rounds, to get around exactly the problem
you're describing when using a stateful protocol.  Stateless backends
would just not advertise such a capability.  But doing that without [1]
still sort of feels like a cop-out.  If we can get a reasonable
baseline using ideas like [1] and then have a capability to keep
server-side state as icing on the cake instead of having a negotiation
process that only really makes sense when you have server-side state,
then that would be even better.

> If anything, I wish we could push the http protocol in a more stateful
> direction with something like websockets. But I suspect that's an
> unrealistic dream, just because not everybody's http setup (proxies,
> etc) will be able to handle that.

Agreed.  I think we have to continue to deal with stateless-rpc
semantics, at least for the near future.

Jonathan

[1] https://public-inbox.org/git/20180227054638.GB65699@aiede.svl.corp.google.com/

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v3 12/35] serve: introduce git-serve
  2018-03-05 21:36                 ` Jonathan Nieder
@ 2018-03-06  6:29                   ` Jeff King
  2018-03-12 23:46                     ` Jeff King
  0 siblings, 1 reply; 362+ messages in thread
From: Jeff King @ 2018-03-06  6:29 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: Brandon Williams, git, sbeller, gitster, stolee, git, pclouds

On Mon, Mar 05, 2018 at 01:36:49PM -0800, Jonathan Nieder wrote:

> > I agree that would be a lot more pleasant for adding protocol features.
> > But I just worry that the stateful protocols get a lot less efficient.
> > I'm having trouble coming up with an easy reproduction, but my
> > recollection is that http has some nasty corner cases, because each
> > round of "have" lines sent to the server has to summarize the previous
> > conversation. So you can get a case where the client's requests keep
> > getting bigger and bigger during the negotiation (and eventually getting
> > large enough to cause problems).
> 
> That's not so much a corner case as just how negotiation works over
> http.

Sure. What I meant more was "there are corner cases where it gets out of
control and doesn't work".

I have had to give the advice in the past "if your fetch over http
doesn't work, try it over ssh". If we change the ssh protocol to be
stateless, too, then that closes that escape hatch.

I haven't had to give that advice for a while, though. Maybe tweaks to
the parameters or just larger buffers have made the problem go away over
the years?

> We want to do better (e.g. see [1]) but that's a bigger change than
> the initial protocol v2.
> 
> As Brandon explained it to me, we really do want to use stateless-rpc
> semantics by default, since that's just better for maintainability.
> Instead of having two protocols, one that is sane and one that
> struggles to hoist that into stateless-rpc, there would be one
> stateless baseline plus capabilities to make use of state.

Yes, I think that would be a nice end-game. It just wasn't clear to me
where we'd be in the interim.

-Peff

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v4 02/35] pkt-line: allow peeking a packet line without consuming it
  2018-03-01 20:48         ` Junio C Hamano
@ 2018-03-12 21:56           ` Brandon Williams
  0 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-03-12 21:56 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, git, jrnieder, pclouds, peff, sbeller, stolee

On 03/01, Junio C Hamano wrote:
> Brandon Williams <bmwill@google.com> writes:
> 
> > +enum packet_read_status packet_reader_read(struct packet_reader *reader)
> > +{
> > +	if (reader->line_peeked) {
> > +		reader->line_peeked = 0;
> > +		return reader->status;
> > +	}
> > +
> > +	reader->status = packet_read_with_status(reader->fd,
> > +						 &reader->src_buffer,
> > +						 &reader->src_len,
> > +						 reader->buffer,
> > +						 reader->buffer_size,
> > +						 &reader->pktlen,
> > +						 reader->options);
> > +
> > +	switch (reader->status) {
> > +	case PACKET_READ_EOF:
> > +		reader->pktlen = -1;
> > +		reader->line = NULL;
> > +		break;
> > +	case PACKET_READ_NORMAL:
> > +		reader->line = reader->buffer;
> > +		break;
> > +	case PACKET_READ_FLUSH:
> > +		reader->pktlen = 0;
> > +		reader->line = NULL;
> > +		break;
> > +	}
> > +
> > +	return reader->status;
> > +}
> 
> With the way _peek() interface interacts with the reader instance
> (which by the way I find is well designed), it is understandable
> that we want almost everything available in reader's fields, but
> having to manually clear pktlen field upon non NORMAL status feels
> a bit strange.  
> 
> Perhaps that is because the underlying packet_read_with_status()
> does not set *pktlen in these cases?  Shouldn't it be doing that so
> the caller does not have to?

That's true, I'll fix that.


-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v4 12/35] serve: introduce git-serve
  2018-03-01 23:11         ` Junio C Hamano
@ 2018-03-12 22:08           ` Brandon Williams
  0 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-03-12 22:08 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, git, jrnieder, pclouds, peff, sbeller, stolee

On 03/01, Junio C Hamano wrote:
> Brandon Williams <bmwill@google.com> writes:
> 
> >  Documentation/technical/protocol-v2.txt | 171 ++++++++++++++++
> 
> Unlike other things in Documentation/technical/, this is not listed
> on TECH_DOCS list in Documentation/Makefile.  Shouldn't it be?

Yes it should, I'll fix that.

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v4 18/35] fetch: pass ref patterns when fetching
  2018-03-02 22:20         ` Junio C Hamano
@ 2018-03-12 22:18           ` Brandon Williams
  0 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-03-12 22:18 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, git, jrnieder, pclouds, peff, sbeller, stolee

On 03/02, Junio C Hamano wrote:
> Brandon Williams <bmwill@google.com> writes:
> 
> > diff --git a/builtin/fetch.c b/builtin/fetch.c
> > index 850382f55..695fafe06 100644
> > --- a/builtin/fetch.c
> > +++ b/builtin/fetch.c
> > @@ -332,11 +332,25 @@ static struct ref *get_ref_map(struct transport *transport,
> >  	struct ref *rm;
> >  	struct ref *ref_map = NULL;
> >  	struct ref **tail = &ref_map;
> > +	struct argv_array ref_patterns = ARGV_ARRAY_INIT;
> >  
> >  	/* opportunistically-updated references: */
> >  	struct ref *orefs = NULL, **oref_tail = &orefs;
> >  
> > -	const struct ref *remote_refs = transport_get_remote_refs(transport, NULL);
> > +	const struct ref *remote_refs;
> > +
> > +	for (i = 0; i < refspec_count; i++) {
> > +		if (!refspecs[i].exact_sha1) {
> > +			if (refspecs[i].pattern)
> > +				argv_array_push(&ref_patterns, refspecs[i].src);
> > +			else
> > +				expand_ref_pattern(&ref_patterns, refspecs[i].src);
> > +		}
> > +	}
> > +
> > +	remote_refs = transport_get_remote_refs(transport, &ref_patterns);
> > +
> > +	argv_array_clear(&ref_patterns);
> 
> Is the idea here, which is shared with 17/35 about ls-remote, that
> we used to grab literally everything they have in remote_refs, but
> we have code in place to filter that set using refspecs given in the
> remote.*.fetch configuration, so it is OK as long as we grab everything
> that would match the remote.*.fetch pattern?  That is, grabbing too
> much is acceptable, but if we populated ref_patterns[] with too few
> patterns and fail to ask refs that would match our refspec it would
> be a bug?

Yes that's the idea.  Right now we're in the state where we ask for
everything (since there is no server side filtering) and the client just
does its own filtering after the fact using the refspec.  So if we end
up not sending enough ref patterns to match what the refspec is, it
would be a bug.

> 
> The reason behind this question is that I am wondering if/how we can
> take advantage of this remote-side pre-filtering while doing "fetch
> --prune".

Hmm maybe, assuming prune then means "get rid of all remote-tracking
branches that don't match the user-provided refspec".

> 
> Thanks.
> 
> >  
> >  	if (refspec_count) {
> >  		struct refspec *fetch_refspec;

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v4 19/35] push: pass ref patterns when pushing
  2018-03-02 22:25         ` Junio C Hamano
@ 2018-03-12 22:20           ` Brandon Williams
  0 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-03-12 22:20 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, git, jrnieder, pclouds, peff, sbeller, stolee

On 03/02, Junio C Hamano wrote:
> Brandon Williams <bmwill@google.com> writes:
> 
> > Construct a list of ref patterns to be passed to 'get_refs_list()' from
> > the refspec to be used during the push.  This list of ref patterns will
> > be used to allow the server to filter the ref advertisement when
> > communicating using protocol v2.
> >
> > Signed-off-by: Brandon Williams <bmwill@google.com>
> > ---
> >  transport.c | 26 +++++++++++++++++++++++++-
> >  1 file changed, 25 insertions(+), 1 deletion(-)
> 
> When you are pushing 'master', we no longer hear what the other end
> has at 'next', with this change, no?
> 
> In a project whose 'master' is extended primarily by merging topics
> that have been cooking in 'next', old way of pushing would only have
> transferred the merge commits and resulting trees but not bulk of
> the blob data because they are all available on 'next', would it
> make the object transfer far less efficient, I wonder?
> 
> I guess it is OK only because the push side of the current protocol
> does not do common ancestor discovery exchange ;-)

Yep, though we've been throwing around ideas of adding that in push v2
after we figure out a good way to improve negotiation with fetch.

> 
> >
> > diff --git a/transport.c b/transport.c
> > index dfc603b36..bf7ba6879 100644
> > --- a/transport.c
> > +++ b/transport.c
> > @@ -1026,11 +1026,35 @@ int transport_push(struct transport *transport,
> >  		int porcelain = flags & TRANSPORT_PUSH_PORCELAIN;
> >  		int pretend = flags & TRANSPORT_PUSH_DRY_RUN;
> >  		int push_ret, ret, err;
> > +		struct refspec *tmp_rs;
> > +		struct argv_array ref_patterns = ARGV_ARRAY_INIT;
> > +		int i;
> >  
> >  		if (check_push_refs(local_refs, refspec_nr, refspec) < 0)
> >  			return -1;
> >  
> > -		remote_refs = transport->vtable->get_refs_list(transport, 1, NULL);
> > +		tmp_rs = parse_push_refspec(refspec_nr, refspec);
> > +		for (i = 0; i < refspec_nr; i++) {
> > +			const char *pattern = NULL;
> > +
> > +			if (tmp_rs[i].dst)
> > +				pattern = tmp_rs[i].dst;
> > +			else if (tmp_rs[i].src && !tmp_rs[i].exact_sha1)
> > +				pattern = tmp_rs[i].src;
> > +
> > +			if (pattern) {
> > +				if (tmp_rs[i].pattern)
> > +					argv_array_push(&ref_patterns, pattern);
> > +				else
> > +					expand_ref_pattern(&ref_patterns, pattern);
> > +			}
> > +		}
> > +
> > +		remote_refs = transport->vtable->get_refs_list(transport, 1,
> > +							       &ref_patterns);
> > +
> > +		argv_array_clear(&ref_patterns);
> > +		free_refspec(refspec_nr, tmp_rs);
> >  
> >  		if (flags & TRANSPORT_PUSH_ALL)
> >  			match_flags |= MATCH_REFS_ALL;

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v4 05/35] upload-pack: factor out processing lines
  2018-03-01 21:25         ` Junio C Hamano
@ 2018-03-12 22:24           ` Brandon Williams
  2018-03-12 22:39             ` Brandon Williams
  0 siblings, 1 reply; 362+ messages in thread
From: Brandon Williams @ 2018-03-12 22:24 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, git, jrnieder, pclouds, peff, sbeller, stolee

On 03/01, Junio C Hamano wrote:
> Brandon Williams <bmwill@google.com> writes:
> 
> > Factor out the logic for processing shallow, deepen, deepen_since, and
> > deepen_not lines into their own functions to simplify the
> > 'receive_needs()' function in addition to making it easier to reuse some
> > of this logic when implementing protocol_v2.
> 
> These little functions that still require their incoming data to
> begin with fixed prefixes feel like a slightly strange way to refactor
> the logic for later reuse (when I imagine "reuse", the first use case
> that comes to my mind is "this data source our new code reads from
> gives the same data as the old 'shallow' packet used to give, but in
> a different syntax"---so I'd restructure the code in such a way that
> the caller figures out the syntax part and the called helper just
> groks the "information contents" unwrapped from the surface syntax;
> the syntax may be different in the new codepath but once unwrapped,
> the "information contents" to be processed would not be different
> hence we can reuse the helper).
> 
> IOW, I would have expected the caller to be not like this:
> 
> > -		if (skip_prefix(line, "shallow ", &arg)) {
> > -			struct object_id oid;
> > -			struct object *object;
> > -			if (get_oid_hex(arg, &oid))
> > -				die("invalid shallow line: %s", line);
> > -			object = parse_object(&oid);
> > -			if (!object)
> > -				continue;
> > -			if (object->type != OBJ_COMMIT)
> > -				die("invalid shallow object %s", oid_to_hex(&oid));
> > -			if (!(object->flags & CLIENT_SHALLOW)) {
> > -				object->flags |= CLIENT_SHALLOW;
> > -				add_object_array(object, NULL, &shallows);
> > -			}
> > +		if (process_shallow(line, &shallows))
> >  			continue;
> > +		if (process_deepen(line, &depth))
> >  			continue;
> 		...
> 
> but more like
> 
> 		if (skip_prefix(line, "shallow ", &arg)) {
> 			process_shallow(arg, &shallows);
> 			continue;
> 		}
> 		if (skip_prefix(line, "deepen ", &arg)) {
> 			process_deepen(arg, &depth);
> 			continue;
> 		}
> 		...
> 
> I need to defer the final judgment until I see how they are used,
> though.  It's not too big a deal either way---it just felt "not
> quite right" to me.

This is actually a really good point (and maybe the same point Stefan
was trying to make on an old revision of this series).  I think it makes
much more sense to refactor the code to have a structure like you've
outlined.  I'll fix this for the next version.
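
Roughly, the split would look like the sketch below: the caller strips
the keyword and the helper only sees the value.  This is not the code
for the next version; skip_prefix_sketch() is a stand-in for git's
skip_prefix() so the example is self-contained, and
process_deepen_arg() and handle_line() are hypothetical names.

```c
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for git's skip_prefix(), written out for self-containment. */
static int skip_prefix_sketch(const char *str, const char *prefix,
			      const char **out)
{
	size_t n = strlen(prefix);

	if (strncmp(str, prefix, n))
		return 0;
	*out = str + n;
	return 1;
}

/* Hypothetical helper: sees only the value, never the "deepen " keyword. */
static int process_deepen_arg(const char *arg, int *depth)
{
	char *end;
	long v = strtol(arg, &end, 10);

	if (end == arg || *end || v <= 0 || v > INT_MAX)
		return -1;
	*depth = (int)v;
	return 0;
}

/*
 * The caller owns the surface syntax.
 * Returns 1 if the line was handled, 0 if it is some other kind of
 * line, and -1 if the keyword matched but the value was invalid.
 */
static int handle_line(const char *line, int *depth)
{
	const char *arg;

	if (skip_prefix_sketch(line, "deepen ", &arg))
		return process_deepen_arg(arg, depth) ? -1 : 1;
	return 0;
}
```

A v2 code path that receives the same value under a different keyword
could then call process_deepen_arg() directly.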

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v4 05/35] upload-pack: factor out processing lines
  2018-03-12 22:24           ` Brandon Williams
@ 2018-03-12 22:39             ` Brandon Williams
  0 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-03-12 22:39 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, git, jrnieder, pclouds, peff, sbeller, stolee

On 03/12, Brandon Williams wrote:
> On 03/01, Junio C Hamano wrote:
> > Brandon Williams <bmwill@google.com> writes:
> > 
> > > Factor out the logic for processing shallow, deepen, deepen_since, and
> > > deepen_not lines into their own functions to simplify the
> > > 'receive_needs()' function in addition to making it easier to reuse some
> > > of this logic when implementing protocol_v2.
> > 
> > These little functions that still require their incoming data to
> > begin with fixed prefixes feel like a slightly strange way to refactor
> > the logic for later reuse (when I imagine "reuse", the first use case
> > that comes to my mind is "this data source our new code reads from
> > gives the same data as the old 'shallow' packet used to give, but in
> > a different syntax"---so I'd restructure the code in such a way that
> > the caller figures out the syntax part and the called helper just
> > groks the "information contents" unwrapped from the surface syntax;
> > the syntax may be different in the new codepath but once unwrapped,
> > the "information contents" to be processed would not be different
> > hence we can reuse the helper).
> > 
> > IOW, I would have expected the caller to be not like this:
> > 
> > > -		if (skip_prefix(line, "shallow ", &arg)) {
> > > -			struct object_id oid;
> > > -			struct object *object;
> > > -			if (get_oid_hex(arg, &oid))
> > > -				die("invalid shallow line: %s", line);
> > > -			object = parse_object(&oid);
> > > -			if (!object)
> > > -				continue;
> > > -			if (object->type != OBJ_COMMIT)
> > > -				die("invalid shallow object %s", oid_to_hex(&oid));
> > > -			if (!(object->flags & CLIENT_SHALLOW)) {
> > > -				object->flags |= CLIENT_SHALLOW;
> > > -				add_object_array(object, NULL, &shallows);
> > > -			}
> > > +		if (process_shallow(line, &shallows))
> > >  			continue;
> > > +		if (process_deepen(line, &depth))
> > >  			continue;
> > 		...
> > 
> > but more like
> > 
> > 		if (skip_prefix(line, "shallow ", &arg)) {
> > 			process_shallow(arg, &shallows);
> > 			continue;
> > 		}
> > 		if (skip_prefix(line, "deepen ", &arg)) {
> > 			process_deepen(arg, &depth);
> > 			continue;
> > 		}
> > 		...
> > 
> > I need to defer the final judgment until I see how they are used,
> > though.  It's not too big a deal either way---it just felt "not
> > quite right" to me.
> 
> This is actually a really good point (and maybe the same point Stefan
> was trying to make on an old revision of this series).  I think it makes
> much more sense to refactor the code to have a structure like you've
> outlined.  I'll fix this for the next version.

And then I started writing the code and now I don't know which I
prefer.  The issue is that it's for processing a line which has some
well-defined structure, and moving the check for "shallow " away from the
rest of the code which does the processing makes it a little less clear
how that shallow line is to be defined.
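
For concreteness, here is a minimal standalone sketch of the caller-side
structure outlined above, where the caller strips the prefix and the
helpers only see the payload.  It is not the upload-pack code:
skip_prefix_sketch() is a local stand-in for git's skip_prefix(), and
struct needs_sketch, process_shallow_arg(), process_deepen_arg() and
process_line_sketch() are hypothetical names.  The real helpers parse
object ids and fill an object_array; this one only counts.

```c
#include <stdlib.h>
#include <string.h>

/* Local stand-in for git's skip_prefix(): on match, *out points past the prefix. */
static int skip_prefix_sketch(const char *str, const char *prefix, const char **out)
{
	size_t len = strlen(prefix);

	if (strncmp(str, prefix, len))
		return 0;
	*out = str + len;
	return 1;
}

/* Hypothetical stand-in for the state receive_needs() accumulates. */
struct needs_sketch {
	int nr_shallow;
	int depth;
};

/* The helpers see only the payload, not the "shallow "/"deepen " syntax. */
static void process_shallow_arg(const char *arg, struct needs_sketch *n)
{
	if (strlen(arg) == 40) /* assumes a 40-hex-digit object id */
		n->nr_shallow++;
}

static void process_deepen_arg(const char *arg, struct needs_sketch *n)
{
	n->depth = atoi(arg);
}

/* The caller owns the surface syntax; returns 1 if the line was consumed. */
static int process_line_sketch(const char *line, struct needs_sketch *n)
{
	const char *arg;

	if (skip_prefix_sketch(line, "shallow ", &arg)) {
		process_shallow_arg(arg, n);
		return 1;
	}
	if (skip_prefix_sketch(line, "deepen ", &arg)) {
		process_deepen_arg(arg, n);
		return 1;
	}
	return 0;
}
```

The trade-off is visible here: the "shallow " literal lives in the
caller, so nothing in process_shallow_arg() documents what the full
line looks like.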

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v3 04/35] upload-pack: convert to a builtin
  2018-02-22 21:44                       ` Jeff King
@ 2018-03-12 22:43                         ` Jonathan Nieder
  2018-03-12 23:28                           ` Jeff King
  0 siblings, 1 reply; 362+ messages in thread
From: Jonathan Nieder @ 2018-03-12 22:43 UTC (permalink / raw)
  To: Jeff King
  Cc: Brandon Williams, Jonathan Tan, git, sbeller, gitster, stolee,
	git, pclouds

Hi,

Jeff King wrote:
> On Thu, Feb 22, 2018 at 01:26:34PM -0800, Jonathan Nieder wrote:

>> Keep in mind that git upload-archive (a read-only command, just like
>> git upload-pack) also already has the same issues.
>
> Yuck. I don't think we've ever made a historical promise about that. But
> then, I don't think the promise about upload-pack has ever really been
> documented, except in mailing list discussions.

Sorry to revive this old side-thread.  Good news: for a dashed command
like git-upload-archive, the pager selection code only runs for
commands with RUN_SETUP or RUN_SETUP_GENTLY:

	if (use_pager == -1 && p->option & (RUN_SETUP | RUN_SETUP_GENTLY) &&
	    !(p->option & DELAY_PAGER_CONFIG))
		use_pager = check_pager_config(p->cmd);

None of upload-pack, receive-pack, git-serve, or upload-archive set
those flags, so we (narrowly) escape trouble here.
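
To make the condition easier to reason about, here is a standalone
sketch of the same check.  The flag values and struct cmd_sketch are
assumptions for illustration only and are not copied from git.c;
would_check_pager_config() is a hypothetical name.

```c
/* Flag values are illustrative assumptions, not the ones in git.c. */
#define RUN_SETUP          (1 << 0)
#define RUN_SETUP_GENTLY   (1 << 1)
#define DELAY_PAGER_CONFIG (1 << 2)

/* Hypothetical, reduced version of a builtin command table entry. */
struct cmd_sketch {
	const char *cmd;
	unsigned int option;
};

/* Mirrors the quoted condition: returns 1 if pager config would be consulted. */
static int would_check_pager_config(int use_pager, const struct cmd_sketch *p)
{
	return use_pager == -1 &&
	       (p->option & (RUN_SETUP | RUN_SETUP_GENTLY)) &&
	       !(p->option & DELAY_PAGER_CONFIG);
}
```

An entry with option == 0 never reaches check_pager_config(), which is
the only reason the commands above are safe today.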

Later today I should be able to send a cleanup to make the behavior
more obvious.

Thanks again,
Jonathan

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v3 04/35] upload-pack: convert to a builtin
  2018-03-12 22:43                         ` Jonathan Nieder
@ 2018-03-12 23:28                           ` Jeff King
  2018-03-12 23:37                             ` Jonathan Nieder
  0 siblings, 1 reply; 362+ messages in thread
From: Jeff King @ 2018-03-12 23:28 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: Brandon Williams, Jonathan Tan, git, sbeller, gitster, stolee,
	git, pclouds

On Mon, Mar 12, 2018 at 03:43:55PM -0700, Jonathan Nieder wrote:

> Hi,
> 
> Jeff King wrote:
> > On Thu, Feb 22, 2018 at 01:26:34PM -0800, Jonathan Nieder wrote:
> 
> >> Keep in mind that git upload-archive (a read-only command, just like
> >> git upload-pack) also already has the same issues.
> >
> > Yuck. I don't think we've ever made a historical promise about that. But
> > then, I don't think the promise about upload-pack has ever really been
> > documented, except in mailing list discussions.
> 
> Sorry to revive this old side-thread.  Good news: for a dashed command
> like git-upload-archive, the pager selection code only runs for
> commands with RUN_SETUP or RUN_SETUP_GENTLY:
> 
> 	if (use_pager == -1 && p->option & (RUN_SETUP | RUN_SETUP_GENTLY) &&
> 	    !(p->option & DELAY_PAGER_CONFIG))
> 		use_pager = check_pager_config(p->cmd);
> 
> None of upload-pack, receive-pack, git-serve, or upload-archive set
> those flags, so we (narrowly) escape trouble here.

Right, I saw that earlier. But I actually think that is stale from the
days when it wasn't safe to call check_pager_config() too early. So I
could very well see somebody removing it and causing a spooky
vulnerability at a distance.

> Later today I should be able to send a cleanup to make the behavior
> more obvious.

Thanks. I'm still on the fence over the whole builtin concept, but
certainly a "don't ever turn on a pager" flag seems like a reasonable
thing to have.

An alternative approach is some kind of global "don't trust the
local repo" flag. That could be respected from very low-level code
(e.g., where we read and/or respect the pager command, but also in other
places like hooks, other config that runs arbitrary commands, etc). And
then upload-pack would set that to "do not trust", and other programs
would default to "trust".

We could even give it an environment variable, which would allow
something like:

  tar xf maybe-evil.git.tar
  cd maybe-evil
  export GIT_TRUST_REPO=false
  git log

without worrying about pager.log config, etc. My two concerns with this
approach would be:

  1. We have to manually annotate any "dangerous" code to act more
     safely when it sees the flag. Which means it's highly likely to
     miss a spot, or to add a new feature which doesn't respect it. And
     suddenly that's a security hole. So I'm concerned it may create a
     false sense of security and actually make things worse.

  2. As a global, I'm not sure how it would interact with multi-repo
     processes like submodules. In theory it ought to go into the
     repository struct, but it would often need to be set globally
     before we've even discovered the repo.

     That might be fine, though. It's really more about context than
     about a specific repo (so you may say "don't trust this repo", and
     that extends to any submodules you happen to access, too).

I dunno. I think (2) is probably OK, but (1) really gives me pause.
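
A rough sketch of what the flag could look like, mostly to show where
concern (1) bites.  Everything here is hypothetical: GIT_TRUST_REPO is
only a proposal in this thread, not an existing variable, and
trust_from_value(), repo_is_trusted() and pager_command_sketch() are
made-up names.

```c
#include <stdlib.h>
#include <string.h>

/* Unset means "trust", matching the proposed default for most programs. */
static int trust_from_value(const char *v)
{
	if (!v)
		return 1;
	return strcmp(v, "false") != 0 && strcmp(v, "0") != 0;
}

/* Reads the proposed (hypothetical) environment variable. */
static int repo_is_trusted(void)
{
	return trust_from_value(getenv("GIT_TRUST_REPO"));
}

/*
 * Every "dangerous" call site needs its own guard like this one;
 * a site that forgets the check silently ignores the flag.
 */
static const char *pager_command_sketch(const char *configured_pager)
{
	if (!repo_is_trusted())
		return NULL; /* ignore the repo-supplied pager config */
	return configured_pager;
}
```

Because the variable is read from the environment, child processes
inherit the setting without any extra plumbing.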

-Peff

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v3 04/35] upload-pack: convert to a builtin
  2018-03-12 23:28                           ` Jeff King
@ 2018-03-12 23:37                             ` Jonathan Nieder
  2018-03-12 23:52                               ` Jeff King
  0 siblings, 1 reply; 362+ messages in thread
From: Jonathan Nieder @ 2018-03-12 23:37 UTC (permalink / raw)
  To: Jeff King
  Cc: Brandon Williams, Jonathan Tan, git, sbeller, gitster, stolee,
	git, pclouds

Jeff King wrote:

> We could even give it an environment variable, which would allow
> something like:
>
>   tar xf maybe-evil.git.tar
>   cd maybe-evil
>   export GIT_TRUST_REPO=false
>   git log

Interesting idea.  Putting it in an envvar means it gets inherited by
child processes, which if I understand you correctly is a good thing.

[...]
>   1. We have to manually annotate any "dangerous" code to act more
>      safely when it sees the flag. Which means it's highly likely to
>      miss a spot, or to add a new feature which doesn't respect it. And
>      suddenly that's a security hole. So I'm concerned it may create a
>      false sense of security and actually make things worse.

As an internal implementation detail, this is so obviously fragile
that it wouldn't give me any feeling of security. ;-)  So it should be
strictly an improvement.

As a public-facing feature, I suspect it's a bad idea for exactly that
reason.

FWIW for pager specifically I am going for a whitelisting approach:
new commands would have to explicitly set ALLOW_PAGER if they want to
respect pager config.  That doesn't guarantee people think about it
again as things evolve but it should at least help with getting the
right setting for new plumbing.

Thanks,
Jonathan

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v3 12/35] serve: introduce git-serve
  2018-03-06  6:29                   ` Jeff King
@ 2018-03-12 23:46                     ` Jeff King
  0 siblings, 0 replies; 362+ messages in thread
From: Jeff King @ 2018-03-12 23:46 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: Brandon Williams, git, sbeller, gitster, stolee, git, pclouds

On Tue, Mar 06, 2018 at 07:29:02AM +0100, Jeff King wrote:

> > We want to do better (e.g. see [1]) but that's a bigger change than
> > the initial protocol v2.
> > 
> > As Brandon explained it to me, we really do want to use stateless-rpc
> > semantics by default, since that's just better for maintainability.
> > Instead of having two protocols, one that is sane and one that
> > struggles to hoist that into stateless-rpc, there would be one
> > stateless baseline plus capabilities to make use of state.
> 
> Yes, I think that would be a nice end-game. It just wasn't clear to me
> where we'd be in the interim.

After some more thinking about this, and a little chatting with Brandon
at the contrib summit, I'm willing to soften my position on this.

Basically I was concerned about this as a regression where git-over-ssh
would stop working in a few corner cases. And it would cease to be
available as an escape hatch for those cases where http wouldn't work.

But we may be OK in this "interim" period (before unified
stateful-negotiation bits are added back) because v2 would not yet be
the default. So the ssh cases can't regress without flipping the v2
switch manually, and any escape hatch would continue to work by flipping
back to v1 anyway.

So it's probably OK to continue experimenting in this direction and see
how often it's a problem in practice.

-Peff

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v3 04/35] upload-pack: convert to a builtin
  2018-03-12 23:37                             ` Jonathan Nieder
@ 2018-03-12 23:52                               ` Jeff King
  0 siblings, 0 replies; 362+ messages in thread
From: Jeff King @ 2018-03-12 23:52 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: Brandon Williams, Jonathan Tan, git, sbeller, gitster, stolee,
	git, pclouds

On Mon, Mar 12, 2018 at 04:37:47PM -0700, Jonathan Nieder wrote:

> Jeff King wrote:
> 
> > We could even give it an environment variable, which would allow
> > something like:
> >
> >   tar xf maybe-evil.git.tar
> >   cd maybe-evil
> >   export GIT_TRUST_REPO=false
> >   git log
> [...]
> As an internal implementation detail, this is so obviously fragile
> that it wouldn't give me any feeling of security. ;-)  So it should be
> strictly an improvement.
> 
> As a public-facing feature, I suspect it's a bad idea for exactly that
> reason.

So that pretty much kills off the GIT_TRUST_REPO idea, I guess.

> FWIW for pager specifically I am going for a whitelisting approach:
> new commands would have to explicitly set ALLOW_PAGER if they want to
> respect pager config.  That doesn't guarantee people think about it
> again as things evolve but it should at least help with getting the
> right setting for new plumbing.

I suspect we'd be about as well off with the "don't trust the repo"
internal flag. Touching the ALLOW_PAGER setup code is about as likely to
set off red flags for the developers (or reviewers) as code that checks
the "trust" flag.

Forcing a whitelist on ALLOW_PAGER _is_ more likely to catch people
adding new commands. But I don't think we actually want to add more
commands to the "safe to run in a malicious repo" list. It's already a
slightly sketchy concept. This is really all about upload-pack and its
existing promises.

But ALLOW_PAGER would _just_ fix the pager issue. When we inevitably
find another problem spot, it won't help us there. But a global "trust"
flag might.

I dunno. I guess I'm OK with either approach, but it seems like the
global trust flag has more room to grow.

-Peff

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v4 08/35] connect: discover protocol version outside of get_remote_heads
  2018-02-28 23:22       ` [PATCH v4 08/35] connect: discover protocol version outside of get_remote_heads Brandon Williams
@ 2018-03-13 15:49         ` Jonathan Tan
  0 siblings, 0 replies; 362+ messages in thread
From: Jonathan Tan @ 2018-03-13 15:49 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, git, gitster, jrnieder, pclouds, peff, sbeller, stolee

On Wed, 28 Feb 2018 15:22:25 -0800
Brandon Williams <bmwill@google.com> wrote:

> In order to prepare for the addition of protocol_v2 push the protocol
> version discovery outside of 'get_remote_heads()'.  This will allow for
> keeping the logic for processing the reference advertisement for
> protocol_v1 and protocol_v0 separate from the logic for protocol_v2.
> 
> Signed-off-by: Brandon Williams <bmwill@google.com>

I had one issue in version 3 which turns out to not be one, and this
patch is unchanged from version 3, so:

Reviewed-by: Jonathan Tan <jonathantanmy@google.com>

> ---
>  builtin/fetch-pack.c | 16 +++++++++++++++-
>  builtin/send-pack.c  | 17 +++++++++++++++--
>  connect.c            | 27 ++++++++++-----------------
>  connect.h            |  3 +++
>  remote-curl.c        | 20 ++++++++++++++++++--
>  remote.h             |  5 +++--
>  transport.c          | 24 +++++++++++++++++++-----
>  7 files changed, 83 insertions(+), 29 deletions(-)
> 
> diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
> index 366b9d13f..85d4faf76 100644
> --- a/builtin/fetch-pack.c
> +++ b/builtin/fetch-pack.c
> @@ -4,6 +4,7 @@
>  #include "remote.h"
>  #include "connect.h"
>  #include "sha1-array.h"
> +#include "protocol.h"
>  
>  static const char fetch_pack_usage[] =
>  "git fetch-pack [--all] [--stdin] [--quiet | -q] [--keep | -k] [--thin] "
> @@ -52,6 +53,7 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix)
>  	struct fetch_pack_args args;
>  	struct oid_array shallow = OID_ARRAY_INIT;
>  	struct string_list deepen_not = STRING_LIST_INIT_DUP;
> +	struct packet_reader reader;
>  
>  	packet_trace_identity("fetch-pack");
>  
> @@ -193,7 +195,19 @@ int cmd_fetch_pack(int argc, const char **argv, const char *prefix)
>  		if (!conn)
>  			return args.diag_url ? 0 : 1;
>  	}
> -	get_remote_heads(fd[0], NULL, 0, &ref, 0, NULL, &shallow);
> +
> +	packet_reader_init(&reader, fd[0], NULL, 0,
> +			   PACKET_READ_CHOMP_NEWLINE |
> +			   PACKET_READ_GENTLE_ON_EOF);
> +
> +	switch (discover_version(&reader)) {
> +	case protocol_v1:
> +	case protocol_v0:
> +		get_remote_heads(&reader, &ref, 0, NULL, &shallow);
> +		break;
> +	case protocol_unknown_version:
> +		BUG("unknown protocol version");
> +	}
>  
>  	ref = fetch_pack(&args, fd, conn, ref, dest, sought, nr_sought,
>  			 &shallow, pack_lockfile_ptr);
> diff --git a/builtin/send-pack.c b/builtin/send-pack.c
> index fc4f0bb5f..83cb125a6 100644
> --- a/builtin/send-pack.c
> +++ b/builtin/send-pack.c
> @@ -14,6 +14,7 @@
>  #include "sha1-array.h"
>  #include "gpg-interface.h"
>  #include "gettext.h"
> +#include "protocol.h"
>  
>  static const char * const send_pack_usage[] = {
>  	N_("git send-pack [--all | --mirror] [--dry-run] [--force] "
> @@ -154,6 +155,7 @@ int cmd_send_pack(int argc, const char **argv, const char *prefix)
>  	int progress = -1;
>  	int from_stdin = 0;
>  	struct push_cas_option cas = {0};
> +	struct packet_reader reader;
>  
>  	struct option options[] = {
>  		OPT__VERBOSITY(&verbose),
> @@ -256,8 +258,19 @@ int cmd_send_pack(int argc, const char **argv, const char *prefix)
>  			args.verbose ? CONNECT_VERBOSE : 0);
>  	}
>  
> -	get_remote_heads(fd[0], NULL, 0, &remote_refs, REF_NORMAL,
> -			 &extra_have, &shallow);
> +	packet_reader_init(&reader, fd[0], NULL, 0,
> +			   PACKET_READ_CHOMP_NEWLINE |
> +			   PACKET_READ_GENTLE_ON_EOF);
> +
> +	switch (discover_version(&reader)) {
> +	case protocol_v1:
> +	case protocol_v0:
> +		get_remote_heads(&reader, &remote_refs, REF_NORMAL,
> +				 &extra_have, &shallow);
> +		break;
> +	case protocol_unknown_version:
> +		BUG("unknown protocol version");
> +	}
>  
>  	transport_verify_remote_names(nr_refspecs, refspecs);
>  
> diff --git a/connect.c b/connect.c
> index c82c90b7c..0b111e62d 100644
> --- a/connect.c
> +++ b/connect.c
> @@ -62,7 +62,7 @@ static void die_initial_contact(int unexpected)
>  		      "and the repository exists."));
>  }
>  
> -static enum protocol_version discover_version(struct packet_reader *reader)
> +enum protocol_version discover_version(struct packet_reader *reader)
>  {
>  	enum protocol_version version = protocol_unknown_version;
>  
> @@ -233,7 +233,7 @@ enum get_remote_heads_state {
>  /*
>   * Read all the refs from the other end
>   */
> -struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
> +struct ref **get_remote_heads(struct packet_reader *reader,
>  			      struct ref **list, unsigned int flags,
>  			      struct oid_array *extra_have,
>  			      struct oid_array *shallow_points)
> @@ -241,24 +241,17 @@ struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
>  	struct ref **orig_list = list;
>  	int len = 0;
>  	enum get_remote_heads_state state = EXPECTING_FIRST_REF;
> -	struct packet_reader reader;
>  	const char *arg;
>  
> -	packet_reader_init(&reader, in, src_buf, src_len,
> -			   PACKET_READ_CHOMP_NEWLINE |
> -			   PACKET_READ_GENTLE_ON_EOF);
> -
> -	discover_version(&reader);
> -
>  	*list = NULL;
>  
>  	while (state != EXPECTING_DONE) {
> -		switch (packet_reader_read(&reader)) {
> +		switch (packet_reader_read(reader)) {
>  		case PACKET_READ_EOF:
>  			die_initial_contact(1);
>  		case PACKET_READ_NORMAL:
> -			len = reader.pktlen;
> -			if (len > 4 && skip_prefix(reader.line, "ERR ", &arg))
> +			len = reader->pktlen;
> +			if (len > 4 && skip_prefix(reader->line, "ERR ", &arg))
>  				die("remote error: %s", arg);
>  			break;
>  		case PACKET_READ_FLUSH:
> @@ -270,22 +263,22 @@ struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
>  
>  		switch (state) {
>  		case EXPECTING_FIRST_REF:
> -			process_capabilities(reader.line, &len);
> -			if (process_dummy_ref(reader.line)) {
> +			process_capabilities(reader->line, &len);
> +			if (process_dummy_ref(reader->line)) {
>  				state = EXPECTING_SHALLOW;
>  				break;
>  			}
>  			state = EXPECTING_REF;
>  			/* fallthrough */
>  		case EXPECTING_REF:
> -			if (process_ref(reader.line, len, &list, flags, extra_have))
> +			if (process_ref(reader->line, len, &list, flags, extra_have))
>  				break;
>  			state = EXPECTING_SHALLOW;
>  			/* fallthrough */
>  		case EXPECTING_SHALLOW:
> -			if (process_shallow(reader.line, len, shallow_points))
> +			if (process_shallow(reader->line, len, shallow_points))
>  				break;
> -			die("protocol error: unexpected '%s'", reader.line);
> +			die("protocol error: unexpected '%s'", reader->line);
>  		case EXPECTING_DONE:
>  			break;
>  		}
> diff --git a/connect.h b/connect.h
> index 01f14cdf3..cdb8979dc 100644
> --- a/connect.h
> +++ b/connect.h
> @@ -13,4 +13,7 @@ extern int parse_feature_request(const char *features, const char *feature);
>  extern const char *server_feature_value(const char *feature, int *len_ret);
>  extern int url_is_local_not_ssh(const char *url);
>  
> +struct packet_reader;
> +extern enum protocol_version discover_version(struct packet_reader *reader);
> +
>  #endif
> diff --git a/remote-curl.c b/remote-curl.c
> index 0053b0954..9f6d07683 100644
> --- a/remote-curl.c
> +++ b/remote-curl.c
> @@ -1,6 +1,7 @@
>  #include "cache.h"
>  #include "config.h"
>  #include "remote.h"
> +#include "connect.h"
>  #include "strbuf.h"
>  #include "walker.h"
>  #include "http.h"
> @@ -13,6 +14,7 @@
>  #include "credential.h"
>  #include "sha1-array.h"
>  #include "send-pack.h"
> +#include "protocol.h"
>  
>  static struct remote *remote;
>  /* always ends with a trailing slash */
> @@ -176,8 +178,22 @@ static struct discovery *last_discovery;
>  static struct ref *parse_git_refs(struct discovery *heads, int for_push)
>  {
>  	struct ref *list = NULL;
> -	get_remote_heads(-1, heads->buf, heads->len, &list,
> -			 for_push ? REF_NORMAL : 0, NULL, &heads->shallow);
> +	struct packet_reader reader;
> +
> +	packet_reader_init(&reader, -1, heads->buf, heads->len,
> +			   PACKET_READ_CHOMP_NEWLINE |
> +			   PACKET_READ_GENTLE_ON_EOF);
> +
> +	switch (discover_version(&reader)) {
> +	case protocol_v1:
> +	case protocol_v0:
> +		get_remote_heads(&reader, &list, for_push ? REF_NORMAL : 0,
> +				 NULL, &heads->shallow);
> +		break;
> +	case protocol_unknown_version:
> +		BUG("unknown protocol version");
> +	}
> +
>  	return list;
>  }
>  
> diff --git a/remote.h b/remote.h
> index 1f6611be2..2016461df 100644
> --- a/remote.h
> +++ b/remote.h
> @@ -150,10 +150,11 @@ int check_ref_type(const struct ref *ref, int flags);
>  void free_refs(struct ref *ref);
>  
>  struct oid_array;
> -extern struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
> +struct packet_reader;
> +extern struct ref **get_remote_heads(struct packet_reader *reader,
>  				     struct ref **list, unsigned int flags,
>  				     struct oid_array *extra_have,
> -				     struct oid_array *shallow);
> +				     struct oid_array *shallow_points);
>  
>  int resolve_remote_symref(struct ref *ref, struct ref *list);
>  int ref_newer(const struct object_id *new_oid, const struct object_id *old_oid);
> diff --git a/transport.c b/transport.c
> index 8e8779096..63c3dbab9 100644
> --- a/transport.c
> +++ b/transport.c
> @@ -18,6 +18,7 @@
>  #include "sha1-array.h"
>  #include "sigchain.h"
>  #include "transport-internal.h"
> +#include "protocol.h"
>  
>  static void set_upstreams(struct transport *transport, struct ref *refs,
>  	int pretend)
> @@ -190,13 +191,26 @@ static int connect_setup(struct transport *transport, int for_push)
>  static struct ref *get_refs_via_connect(struct transport *transport, int for_push)
>  {
>  	struct git_transport_data *data = transport->data;
> -	struct ref *refs;
> +	struct ref *refs = NULL;
> +	struct packet_reader reader;
>  
>  	connect_setup(transport, for_push);
> -	get_remote_heads(data->fd[0], NULL, 0, &refs,
> -			 for_push ? REF_NORMAL : 0,
> -			 &data->extra_have,
> -			 &data->shallow);
> +
> +	packet_reader_init(&reader, data->fd[0], NULL, 0,
> +			   PACKET_READ_CHOMP_NEWLINE |
> +			   PACKET_READ_GENTLE_ON_EOF);
> +
> +	switch (discover_version(&reader)) {
> +	case protocol_v1:
> +	case protocol_v0:
> +		get_remote_heads(&reader, &refs,
> +				 for_push ? REF_NORMAL : 0,
> +				 &data->extra_have,
> +				 &data->shallow);
> +		break;
> +	case protocol_unknown_version:
> +		BUG("unknown protocol version");
> +	}
>  	data->got_remote_heads = 1;
>  
>  	return refs;
> -- 
> 2.16.2.395.g2e18187dfd-goog
> 

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v4 16/35] transport: convert transport_get_remote_refs to take a list of ref patterns
  2018-02-28 23:22       ` [PATCH v4 16/35] transport: convert transport_get_remote_refs " Brandon Williams
@ 2018-03-13 16:00         ` Jonathan Tan
  0 siblings, 0 replies; 362+ messages in thread
From: Jonathan Tan @ 2018-03-13 16:00 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, git, gitster, jrnieder, pclouds, peff, sbeller, stolee

On Wed, 28 Feb 2018 15:22:33 -0800
Brandon Williams <bmwill@google.com> wrote:

> Convert 'transport_get_remote_refs()' to optionally take a list of ref
> patterns.
> 
> Signed-off-by: Brandon Williams <bmwill@google.com>

[snip]

> -const struct ref *transport_get_remote_refs(struct transport *transport);
> +/*
> + * Retrieve refs from a remote.
> + *
> + * Optionally a list of ref patterns can be provided which can be sent to the
> + * server (when communicating using protocol v2) to enable it to limit the ref
> + * advertisement.  Since ref filtering is done on the server's end (and only
> + * when using protocol v2), this can return refs which don't match the provided
> + * ref_patterns.
> + */
> +const struct ref *transport_get_remote_refs(struct transport *transport,
> +					    const struct argv_array *ref_patterns);

Thanks for adding the documentation, but I think this should also go
into the commit message. For example:

    Teach transport_get_remote_refs() to accept a list of ref patterns,
    which will be sent to the server for use in filtering when using
    protocol v2. (This list will be ignored when not using protocol v2.)

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v4 20/35] upload-pack: introduce fetch server command
  2018-02-28 23:22       ` [PATCH v4 20/35] upload-pack: introduce fetch server command Brandon Williams
@ 2018-03-13 16:20         ` Jonathan Tan
  2018-03-13 21:49           ` Brandon Williams
  0 siblings, 1 reply; 362+ messages in thread
From: Jonathan Tan @ 2018-03-13 16:20 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, git, gitster, jrnieder, pclouds, peff, sbeller, stolee

On Wed, 28 Feb 2018 15:22:37 -0800
Brandon Williams <bmwill@google.com> wrote:

> +    output = *section
> +    section = (acknowledgments | packfile)
> +	      (flush-pkt | delim-pkt)
> +
> +    acknowledgments = PKT-LINE("acknowledgments" LF)
> +		      (nak | *ack)
> +		      (ready)
> +    ready = PKT-LINE("ready" LF)
> +    nak = PKT-LINE("NAK" LF)
> +    ack = PKT-LINE("ACK" SP obj-id LF)
> +
> +    packfile = PKT-LINE("packfile" LF)
> +	       [PACKFILE]

I should have noticed this earlier, but "PACKFILE" is not defined anywhere -
it's probably better written as:

    *PKT-LINE(%x01-03 *%x00-ff)

or something like that.
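
To illustrate what that grammar describes, here is a small standalone
sketch that decodes one such packet: a 4-hex-digit pkt-line length
(which counts the 4 length bytes themselves), a 1-byte stream code, then
the data.  It is not the sideband code in git; struct sideband_pkt,
hexval_sketch() and parse_sideband_pkt() are hypothetical names, and
delim-pkt ("0001") is deliberately treated as malformed to keep the
sketch short.

```c
#include <stddef.h>
#include <string.h>

/* Decode one hex digit; returns -1 on invalid input. */
static int hexval_sketch(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

struct sideband_pkt {
	int band;         /* 1 = pack data, 2 = progress, 3 = fatal error */
	const char *data; /* payload after the band byte, not NUL-terminated */
	size_t len;       /* number of payload bytes */
};

/*
 * Parse one side-band pkt-line from buf.  Returns the total packet
 * length in bytes on success, 0 for a flush-pkt ("0000"), or -1 if the
 * input is malformed or truncated.
 */
static int parse_sideband_pkt(const char *buf, size_t avail,
			      struct sideband_pkt *out)
{
	size_t total = 0;
	int i;

	if (avail < 4)
		return -1;
	for (i = 0; i < 4; i++) {
		int v = hexval_sketch(buf[i]);
		if (v < 0)
			return -1;
		total = (total << 4) | (size_t)v;
	}
	if (total == 0)
		return 0; /* flush-pkt ends the section */
	if (total < 5 || total > avail)
		return -1; /* need the length plus at least the band byte */
	out->band = (unsigned char)buf[4];
	if (out->band < 1 || out->band > 3)
		return -1;
	out->data = buf + 5;
	out->len = total - 5;
	return (int)total;
}
```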

> +    acknowledgments section
> +	* Always begins with the section header "acknowledgments"
> +
> +	* The server will respond with "NAK" if none of the object ids sent
> +	  as have lines were common.
> +
> +	* The server will respond with "ACK obj-id" for all of the
> +	  object ids sent as have lines which are common.
> +
> +	* A response cannot have both "ACK" lines as well as a "NAK"
> +	  line.
> +
> +	* The server will respond with a "ready" line indicating that
> +	  the server has found an acceptable common base and is ready to
> +	  make and send a packfile (which will be found in the packfile
> +	  section of the same response)
> +
> +	* If the client determines that it is finished with negotiations
> +	  by sending a "done" line, the acknowledgments sections MUST be
> +	  omitted from the server's response.
> +
> +	* If the server has found a suitable cut point and has decided
> +	  to send a "ready" line, then the server can decide to (as an
> +	  optimization) omit any "ACK" lines it would have sent during
> +	  its response.  This is because the server will have already
> +	  determined the objects it plans to send to the client and no
> +	  further negotiation is needed.
> +
> +----
> +    packfile section
> +	* Always begins with the section header "packfile"
> +
> +	* The transmission of the packfile begins immediately after the
> +	  section header
> +
> +	* The data transfer of the packfile is always multiplexed, using
> +	  the same semantics of the 'side-band-64k' capability from
> +	  protocol version 1.  This means that each packet, during the
> +	  packfile data stream, is made up of a leading 4-byte pkt-line
> +	  length (typical of the pkt-line format), followed by a 1-byte
> +	  stream code, followed by the actual data.
> +
> +	  The stream code can be one of:
> +		1 - pack data
> +		2 - progress messages
> +		3 - fatal error message just before stream aborts
> +
> +	* This section is only included if the client has sent 'want'
> +	  lines in its request and either requested that no more
> +	  negotiation be done by sending 'done' or if the server has
> +	  decided it has found a sufficient cut point to produce a
> +	  packfile.

For both the sections, I think that the conditions for
inclusion/non-inclusion ("This section is only included if...") should
be the first point.

> +static void upload_pack_data_init(struct upload_pack_data *data)
> +{
> +	struct object_array wants = OBJECT_ARRAY_INIT;
> +	struct oid_array haves = OID_ARRAY_INIT;
> +
> +	memset(data, 0, sizeof(*data));
> +	data->wants = wants;
> +	data->haves = haves;
> +}

Any reason to use an initializer function instead of a static literal?
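
For reference, the static-literal alternative could look roughly like
the sketch below.  The struct definitions are simplified stand-ins, not
git's real object_array, oid_array or upload_pack_data, and the two
bitfields are assumed members added only to show how the remaining
fields get zeroed; all *_sketch and *_SKETCH_INIT names are
hypothetical.

```c
#include <stddef.h>

/* Simplified stand-ins for git's object_array and oid_array. */
struct object_array_sketch {
	unsigned int nr, alloc;
	void *objects;
};
struct oid_array_sketch {
	void *oid;
	int nr, alloc;
	int sorted;
};
#define OBJECT_ARRAY_SKETCH_INIT { 0, 0, NULL }
#define OID_ARRAY_SKETCH_INIT { NULL, 0, 0, 0 }

/* Hypothetical subset of the request state. */
struct upload_pack_data_sketch {
	struct object_array_sketch wants;
	struct oid_array_sketch haves;
	unsigned stateless_rpc : 1;
	unsigned done : 1;
};

/* Static literal replacing the memset() plus two assignments. */
#define UPLOAD_PACK_DATA_SKETCH_INIT \
	{ OBJECT_ARRAY_SKETCH_INIT, OID_ARRAY_SKETCH_INIT, 0, 0 }
```

A caller would then write
"struct upload_pack_data_sketch data = UPLOAD_PACK_DATA_SKETCH_INIT;"
instead of calling an init function.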

^ permalink raw reply	[flat|nested] 362+ messages in thread

* Re: [PATCH v4 27/35] transport-helper: introduce stateless-connect
  2018-02-28 23:22       ` [PATCH v4 27/35] transport-helper: introduce stateless-connect Brandon Williams
@ 2018-03-13 16:30         ` Jonathan Tan
  2018-03-14 17:36           ` Brandon Williams
  0 siblings, 1 reply; 362+ messages in thread
From: Jonathan Tan @ 2018-03-13 16:30 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, git, gitster, jrnieder, pclouds, peff, sbeller, stolee

On Wed, 28 Feb 2018 15:22:44 -0800
Brandon Williams <bmwill@google.com> wrote:

> +'stateless-connect'::
> +	Experimental; for internal use only.
> +	Can attempt to connect to a remote server for communication
> +	using git's wire-protocol version 2.  This establishes a
> +	stateless, half-duplex connection.
> ++
> +Supported commands: 'stateless-connect'.
> +
>  'push'::
>  	Can discover remote refs and push local commits and the
>  	history leading up to them to new or existing remote refs.
> @@ -136,6 +144,14 @@ Capabilities for Fetching
>  +
>  Supported commands: 'connect'.
>  
> +'stateless-connect'::
> +	Experimental; for internal use only.
> +	Can attempt to connect to a remote server for communication
> +	using git's wire-protocol version 2.  This establishes a
> +	stateless, half-duplex connection.
> ++
> +Supported commands: 'stateless-connect'.

I don't think we should use the term "half-duplex" - from a search, it
means that both parties can use the wire but not simultaneously, which
is not strictly true. Might be better to just say "see the documentation
for the stateless-connect command for more information".

> +'stateless-connect' <service>::
> +	Experimental; for internal use only.
> +	Connects to the given remote service for communication using
> +	git's wire-protocol version 2.  This establishes a stateless,
> +	half-duplex connection.  Valid replies to this command are empty
> +	line (connection established), 'fallback' (no smart transport
> +	support, fall back to dumb transports) and just exiting with
> +	error message printed (can't connect, don't bother trying to
> +	fall back).  After line feed terminating the positive (empty)
> +	response, the output of the service starts.  Messages (both
> +	request and response) must be terminated with a single flush
> +	packet, allowing the remote helper to properly act as a proxy.
> +	After the connection ends, the remote helper exits.
> ++
> +Supported if the helper has the "stateless-connect" capability.

I'm not sure of the relevance of "allowing the remote helper to properly
act as a proxy" - this scheme makes it easier to implement proxies; it
does not make any party start acting as one. I would write that part
as:

    Messages (both request and response) must consist of zero or more
    PKT-LINEs, terminating in a flush packet. The client must not expect
    the server to store any state in between request-response pairs.

(This covers the so-called "half-duplex" part and the "stateless" part.)
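
As a rough illustration of the framing in the suggested wording, here is a
minimal C sketch. It assumes the usual pkt-line encoding (four hex digits
giving the total length, prefix included, with "0000" as the flush packet)
and a 65520-byte ceiling; `format_pkt_line` and `is_flush_pkt` are
hypothetical names for this sketch, not functions from pkt-line.c.

```c
#include <stdio.h>
#include <string.h>

/* Assumed limit: the largest pkt-line, length prefix included. */
#define SKETCH_PKT_MAX 65520

/*
 * Hypothetical helper: encode `payload` as one pkt-line in `out`.
 * The four hex digits count themselves plus the payload.
 * Returns the number of bytes written (excluding the NUL), or -1
 * if the line is too long or does not fit in `out`.
 */
int format_pkt_line(char *out, size_t outsz, const char *payload)
{
	size_t len = strlen(payload) + 4;

	if (len > SKETCH_PKT_MAX || len + 1 > outsz)
		return -1;
	snprintf(out, outsz, "%04zx%s", len, payload);
	return (int)len;
}

/* True if `buf` starts with a flush packet, the special length "0000". */
int is_flush_pkt(const char *buf, size_t len)
{
	return len >= 4 && !memcmp(buf, "0000", 4);
}
```

A request or response would then be zero or more such lines followed by one
"0000", which is what lets a helper find the end of a message without
understanding its contents.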

* Re: [PATCH v4 29/35] remote-curl: create copy of the service name
  2018-02-28 23:22       ` [PATCH v4 29/35] remote-curl: create copy of the service name Brandon Williams
@ 2018-03-13 16:32         ` Jonathan Tan
  0 siblings, 0 replies; 362+ messages in thread
From: Jonathan Tan @ 2018-03-13 16:32 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, git, gitster, jrnieder, pclouds, peff, sbeller, stolee

On Wed, 28 Feb 2018 15:22:46 -0800
Brandon Williams <bmwill@google.com> wrote:

> Make a copy of the service name being requested instead of relying on
> the buffer pointed to by the passed in 'const char *' to remain
> unchanged.
> 
> Currently, all service names are string constants, but a subsequent
> patch will introduce service names from external sources.
> 
> Signed-off-by: Brandon Williams <bmwill@google.com>

Once again,

Reviewed-by: Jonathan Tan <jonathantanmy@google.com>

* Re: [PATCH v4 31/35] http: allow providing extra headers for http requests
  2018-02-28 23:22       ` [PATCH v4 31/35] http: allow providing extra headers for http requests Brandon Williams
@ 2018-03-13 16:33         ` Jonathan Tan
  0 siblings, 0 replies; 362+ messages in thread
From: Jonathan Tan @ 2018-03-13 16:33 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, git, gitster, jrnieder, pclouds, peff, sbeller, stolee

On Wed, 28 Feb 2018 15:22:48 -0800
Brandon Williams <bmwill@google.com> wrote:

> Add a way for callers to request that extra headers be included when
> making http requests.
> 
> Signed-off-by: Brandon Williams <bmwill@google.com>

Likewise,

Reviewed-by: Jonathan Tan <jonathantanmy@google.com>

* Re: [PATCH v4 35/35] remote-curl: don't request v2 when pushing
  2018-02-28 23:22       ` [PATCH v4 35/35] remote-curl: don't request v2 when pushing Brandon Williams
@ 2018-03-13 16:35         ` Jonathan Tan
  0 siblings, 0 replies; 362+ messages in thread
From: Jonathan Tan @ 2018-03-13 16:35 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, git, gitster, jrnieder, pclouds, peff, sbeller, stolee

On Wed, 28 Feb 2018 15:22:52 -0800
Brandon Williams <bmwill@google.com> wrote:

> In order to be able to ship protocol v2 with only supporting fetch, we
> need clients to not issue a request to use protocol v2 when pushing
> (since the client currently doesn't know how to push using protocol v2).
> This allows a client to have protocol v2 configured in
> `protocol.version` and take advantage of using v2 for fetch and falling
> back to using v0 when pushing while v2 for push is being designed.
> 
> We could run into issues if we didn't fall back to protocol v0 when
> pushing right now.  This is because currently a server will ignore a request to
> use v2 when contacting the 'receive-pack' endpoint and fall back to
> using v0, but when push v2 is rolled out to servers, the 'receive-pack'
> endpoint will start responding using v2.  So we don't want to get into a
> state where a client is requesting to push with v2 before they actually
> know how to push using v2.
> 
> Signed-off-by: Brandon Williams <bmwill@google.com>

I noticed that my review comments have been addressed, so:

Reviewed-by: Jonathan Tan <jonathantanmy@google.com>

* Re: [PATCH v4 04/35] upload-pack: convert to a builtin
  2018-02-28 23:22       ` [PATCH v4 04/35] upload-pack: convert to a builtin Brandon Williams
@ 2018-03-13 16:40         ` Jonathan Tan
  2018-03-13 19:50           ` Brandon Williams
  0 siblings, 1 reply; 362+ messages in thread
From: Jonathan Tan @ 2018-03-13 16:40 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, git, gitster, jrnieder, pclouds, peff, sbeller, stolee

On Wed, 28 Feb 2018 15:22:21 -0800
Brandon Williams <bmwill@google.com> wrote:

> In order to allow for code sharing with the server-side of fetch in
> protocol-v2 convert upload-pack to be a builtin.
> 
> Signed-off-by: Brandon Williams <bmwill@google.com>

I suggested updating the commit message in my previous review [1], but I
understand that my comment might have been lost in the ensuing long
discussion.

[1] https://public-inbox.org/git/20180221134422.2386e1aca39fe673235590e7@google.com/

* Re: [PATCH v4 01/35] pkt-line: introduce packet_read_with_status
  2018-02-28 23:22       ` [PATCH v4 01/35] pkt-line: introduce packet_read_with_status Brandon Williams
@ 2018-03-13 19:35         ` Jonathan Tan
  2018-03-13 19:52           ` Brandon Williams
  0 siblings, 1 reply; 362+ messages in thread
From: Jonathan Tan @ 2018-03-13 19:35 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, git, gitster, jrnieder, pclouds, peff, sbeller, stolee

On Wed, 28 Feb 2018 15:22:18 -0800
Brandon Williams <bmwill@google.com> wrote:

> +	if (len < 0) {
>  		die("protocol error: bad line length character: %.4s", linelen);
> -	if (!len) {
> +	} else if (!len) {
>  		packet_trace("0000", 4, 0);
> -		return 0;
> +		return PACKET_READ_FLUSH;
> +	} else if (len < 4) {
> +		die("protocol error: bad line length %d", len);
>  	}
> +
>  	len -= 4;
> -	if (len >= size)
> +	if ((unsigned)len >= size)
>  		die("protocol error: bad line length %d", len);

The cast to unsigned is safe, since len was at least 4 before "len -=
4". I can't think of a better way to write this to make that more
obvious, though.
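
To spell out why the cast cannot misbehave, here is a self-contained C
sketch of the same sequence of checks. It is not the code from the patch:
`check_pkt_header`, `parse_len4` and the `HDR_*` values are hypothetical
names, and the sketch returns an error code where the patch calls die().

```c
/* Outcome of validating a 4-byte pkt-line length header. */
enum hdr_status { HDR_FLUSH, HDR_DATA, HDR_BAD };

static int hexval(char c)
{
	if (c >= '0' && c <= '9') return c - '0';
	if (c >= 'a' && c <= 'f') return c - 'a' + 10;
	if (c >= 'A' && c <= 'F') return c - 'A' + 10;
	return -1;
}

/* Parse four hex digits; returns -1 on any non-hex character. */
static int parse_len4(const char *h)
{
	int i, len = 0;

	for (i = 0; i < 4; i++) {
		int v = hexval(h[i]);
		if (v < 0)
			return -1;
		len = (len << 4) | v;
	}
	return len;
}

/*
 * `size` is the capacity of the destination buffer. On HDR_DATA,
 * *payload_len receives the number of payload bytes that follow.
 */
enum hdr_status check_pkt_header(const char *hdr, unsigned size,
				 int *payload_len)
{
	int len = parse_len4(hdr);

	if (len < 0)
		return HDR_BAD;		/* bad length character */
	if (len == 0)
		return HDR_FLUSH;	/* "0000" */
	if (len < 4)
		return HDR_BAD;		/* shorter than its own header */

	len -= 4;
	/* len was >= 4 above, so it is >= 0 here and the cast keeps its value. */
	if ((unsigned)len >= size)
		return HDR_BAD;		/* would not fit, presumably with a NUL */
	*payload_len = len;
	return HDR_DATA;
}
```

The point is only the ordering: every path that reaches the cast has
already excluded negative and too-small lengths.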

> +/*
> + * Read a packetized line into a buffer like the 'packet_read()' function but
> + * returns an 'enum packet_read_status' which indicates the status of the read.
> + * The number of bytes read will be assigned to *pktlen if the status of the
> + * read was 'PACKET_READ_NORMAL'.
> + */
> +enum packet_read_status {
> +	PACKET_READ_EOF = -1,
> +	PACKET_READ_NORMAL,
> +	PACKET_READ_FLUSH,
> +};
> +enum packet_read_status packet_read_with_status(int fd, char **src_buffer,
> +						size_t *src_len, char *buffer,
> +						unsigned size, int *pktlen,
> +						int options);

jrnieder said in [1], referring to the definition of enum
packet_read_status:

> nit: do any callers treat the return value as a number?  It would be
> less magical if the numbering were left to the compiler (0, 1, 2).

I checked the result of the entire patch set and the only callers seem
to be packet_read() (modified in this patch) and the
soon-to-be-introduced packet_reader_read(). So not only can the
numbering be left to the compiler, this function can (and should) be
marked static as well (and the enum definition moved to .c), since I
think that future development should be encouraged to use packet_reader.

The commit message would also thus need to be rewritten, since this
becomes more of a refactoring into a function with a more precisely
specified return type, to be used both by the existing packet_read() and
a soon-to-be-introduced packet_reader_read().
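
A small sketch of what that refactoring could look like from the caller's
side, with the numbering left to the compiler. The names are hypothetical
stand-ins rather than the identifiers in the series, and the -1/0/length
convention for the old integer return is an assumption of this sketch.

```c
/* No explicit values: callers only ever compare against the names. */
enum sketch_read_status {
	SKETCH_READ_EOF,
	SKETCH_READ_NORMAL,
	SKETCH_READ_FLUSH
};

/*
 * What a packet_read()-style wrapper might do with the status:
 * translate it back to a plain integer length for existing callers.
 */
int legacy_length(enum sketch_read_status status, int pktlen)
{
	switch (status) {
	case SKETCH_READ_EOF:
		return -1;
	case SKETCH_READ_FLUSH:
		return 0;
	case SKETCH_READ_NORMAL:
		return pktlen;
	}
	return -1; /* not reached for valid enum values */
}
```

Since nothing depends on the numeric values, reordering or extending the
enum later does not affect any caller written this way.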

[1] https://public-inbox.org/git/20180213002554.GA42272@aiede.svl.corp.google.com/

* Re: [PATCH v4 04/35] upload-pack: convert to a builtin
  2018-03-13 16:40         ` Jonathan Tan
@ 2018-03-13 19:50           ` Brandon Williams
  0 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-03-13 19:50 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, git, gitster, jrnieder, pclouds, peff, sbeller, stolee

On 03/13, Jonathan Tan wrote:
> On Wed, 28 Feb 2018 15:22:21 -0800
> Brandon Williams <bmwill@google.com> wrote:
> 
> > In order to allow for code sharing with the server-side of fetch in
> > protocol-v2 convert upload-pack to be a builtin.
> > 
> > Signed-off-by: Brandon Williams <bmwill@google.com>
> 
> I suggested updating the commit message in my previous review [1], but I
> understand that my comment might have been lost in the ensuing long
> discussion.
> 
> [1] https://public-inbox.org/git/20180221134422.2386e1aca39fe673235590e7@google.com/

Your suggested change to the commit msg isn't quite accurate as you can
already run "git-upload-pack --help" today.

-- 
Brandon Williams

* Re: [PATCH v4 01/35] pkt-line: introduce packet_read_with_status
  2018-03-13 19:35         ` Jonathan Tan
@ 2018-03-13 19:52           ` Brandon Williams
  0 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-03-13 19:52 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, git, gitster, jrnieder, pclouds, peff, sbeller, stolee

On 03/13, Jonathan Tan wrote:
> On Wed, 28 Feb 2018 15:22:18 -0800
> Brandon Williams <bmwill@google.com> wrote:
> 
> > +	if (len < 0) {
> >  		die("protocol error: bad line length character: %.4s", linelen);
> > -	if (!len) {
> > +	} else if (!len) {
> >  		packet_trace("0000", 4, 0);
> > -		return 0;
> > +		return PACKET_READ_FLUSH;
> > +	} else if (len < 4) {
> > +		die("protocol error: bad line length %d", len);
> >  	}
> > +
> >  	len -= 4;
> > -	if (len >= size)
> > +	if ((unsigned)len >= size)
> >  		die("protocol error: bad line length %d", len);
> 
> The cast to unsigned is safe, since len was at least 4 before "len -=
> 4". I can't think of a better way to write this to make that more
> obvious, though.
> 
> > +/*
> > + * Read a packetized line into a buffer like the 'packet_read()' function but
> > + * returns an 'enum packet_read_status' which indicates the status of the read.
> > + * The number of bytes read will be assigned to *pktlen if the status of the
> > + * read was 'PACKET_READ_NORMAL'.
> > + */
> > +enum packet_read_status {
> > +	PACKET_READ_EOF = -1,
> > +	PACKET_READ_NORMAL,
> > +	PACKET_READ_FLUSH,
> > +};
> > +enum packet_read_status packet_read_with_status(int fd, char **src_buffer,
> > +						size_t *src_len, char *buffer,
> > +						unsigned size, int *pktlen,
> > +						int options);
> 
> jrnieder said in [1], referring to the definition of enum
> packet_read_status:
> 
> > nit: do any callers treat the return value as a number?  It would be
> > less magical if the numbering were left to the compiler (0, 1, 2).

Yeah, I'll do this.

> 
> I checked the result of the entire patch set and the only callers seem
> to be packet_read() (modified in this patch) and the
> soon-to-be-introduced packet_reader_read(). So not only can the
> numbering be left to the compiler, this function can (and should) be
> marked static as well (and the enum definition moved to .c), since I
> think that future development should be encouraged to use packet_reader.

The enum definition can't be moved, as it's needed outside this file.

> 
> The commit message would also thus need to be rewritten, since this
> becomes more of a refactoring into a function with a more precisely
> specified return type, to be used both by the existing packet_read() and
> a soon-to-be-introduced packet_reader_read().
> 
> [1] https://public-inbox.org/git/20180213002554.GA42272@aiede.svl.corp.google.com/

-- 
Brandon Williams

* Re: [PATCH v4 13/35] ls-refs: introduce ls-refs server command
  2018-03-05 20:28             ` Jeff King
@ 2018-03-13 21:23               ` Brandon Williams
  0 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-03-13 21:23 UTC (permalink / raw)
  To: Jeff King; +Cc: git, git, gitster, jrnieder, pclouds, sbeller, stolee

On 03/05, Jeff King wrote:
> On Mon, Mar 05, 2018 at 10:21:55AM -0800, Brandon Williams wrote:
> 
> > > Hmm, so this would accept stuff like "refs/heads/*/foo" but quietly
> > > ignore the "/foo" part.
> > 
> > Yeah that's true...this should probably not do that.  Since
> > "refs/heads/*/foo" violates what the spec is, really this should error
> > out as an invalid pattern.
> 
> Yeah, that would be better, I think.
> 
> > > It also accepts "refs/h*" to get "refs/heads" and "refs/hello".  I think
> > > it's worth going for the most-restrictive thing to start with, since
> > > that enables a lot more server operations without worrying about
> > > breaking compatibility.
> > 
> > And just to clarify what do you see as being the most-restrictive case
> > of patterns that we should use?
> 
> I mean only accepting "*" at a "/" boundary (or just allowing a trailing
> slash to imply recursion, like "refs/heads/", or even just always
> assuming recursion to allow "refs/heads").

For simplicity I'll change ref-patterns to be ref-prefixes, where
a ref must starts_with() one of the provided ref-prefixes.  Clients won't
send '*'s either, but can send everything up to but not including the '*'
as a prefix.
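
A minimal C sketch of that plan, under stated assumptions: the function
names are hypothetical, `sketch_starts_with` is a local stand-in for git's
starts_with(), and treating an empty prefix list as "send everything" is a
guess of this sketch rather than something decided in the thread.

```c
#include <stddef.h>
#include <string.h>

/* Local stand-in for a starts_with()-style helper. */
int sketch_starts_with(const char *str, const char *prefix)
{
	return strncmp(str, prefix, strlen(prefix)) == 0;
}

/*
 * Client side: length of the part of a refspec pattern to send,
 * i.e. everything up to but not including the first '*'.
 */
size_t pattern_prefix_len(const char *pattern)
{
	return strcspn(pattern, "*");
}

/*
 * Server side: a ref is listed if it starts with any of the
 * requested prefixes. No prefixes means no filtering (assumption).
 */
int ref_matches_prefixes(const char *refname, const char **prefixes,
			 size_t nr)
{
	size_t i;

	if (!nr)
		return 1;
	for (i = 0; i < nr; i++)
		if (sketch_starts_with(refname, prefixes[i]))
			return 1;
	return 0;
}
```

With a plain prefix test like this, "refs/h" would still match both
"refs/heads/..." and "refs/hello"; the server can over-send and the client
filters the result against its real refspec.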

-- 
Brandon Williams

* Re: [PATCH v4 13/35] ls-refs: introduce ls-refs server command
  2018-03-02 21:13         ` Junio C Hamano
@ 2018-03-13 21:27           ` Brandon Williams
  0 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-03-13 21:27 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, git, jrnieder, pclouds, peff, sbeller, stolee

On 03/02, Junio C Hamano wrote:
> Brandon Williams <bmwill@google.com> writes:
> 
> > + ls-refs
> > +---------
> > +
> > +`ls-refs` is the command used to request a reference advertisement in v2.
> > +Unlike the current reference advertisement, ls-refs takes in arguments
> > +which can be used to limit the refs sent from the server.
> 
> OK.
> 
> > +Additional features not supported in the base command will be advertised
> > +as the value of the command in the capability advertisement in the form
> > +of a space separated list of features, e.g.  "<command>=<feature 1>
> > +<feature 2>".
> 
> Doesn't this explain the general convention that applies to any
> command, not just ls-refs command?  As a part of ls-refs section,
> <command> in the above explanation is always a constant "ls-refs",
> right?
> 
> It is a bit unclear how <feature N> in the above description are
> related to "arguments" in the following paragraph.  Does a server
> that can show symref and peeled tags and that can limit the output
> with ref-pattern advertise these three as supported features, i.e.
> 
> 	ls-refs=symrefs peel ref-pattern
> 
> or something?  Would there be a case where a "feature" does not
> correspond 1:1 to an argument to the command, and if so how would
> the server and the client negotiate use of such a feature?

I mention earlier in the document that the values of each capability are
to be defined by the capability itself, so I'm just documenting what the
value advertised means.  And a feature could mean a couple things and
doesn't necessarily mean it affects the arguments which can be provided,
and it definitely doesn't mean that each argument that can be provided
must be advertised as a feature.  If you look at the patch that
introduces shallow, the shallow feature adds lots of arguments that a
client can then use in its request.

> 
> > +    ref-pattern <pattern>
> > +	When specified, only references matching one of the provided
> > +	patterns are displayed.  A pattern is either a valid refname
> > +	(e.g.  refs/heads/master), in which a ref must match the pattern
> > +	exactly, or a prefix of a ref followed by a single '*' wildcard
> > +	character (e.g. refs/heads/*), in which a ref must have a prefix
> > +	equal to the pattern up to the wildcard character.
> 
> I thought the recent consensus was left-anchored prefix match that
> honors /-directory boundary, i.e. no explicit asterisk and just
> saying "refs/heads" is enough to match "refs/heads" itself and
> "refs/heads/master" but not "refs/headscarf"?

I don't think there was a consensus at the time, but in the next
revision I'll have them be prefixes instead of containing wildcards.
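
For comparison, the left-anchored match that honors the '/' boundary, as
described in the quoted question, could be sketched as below. Whether the
next revision uses this rule or a plain string-prefix test is not settled
in this message; `ref_under_prefix` is a hypothetical name.

```c
#include <stddef.h>
#include <string.h>

/*
 * "refs/heads" matches "refs/heads" and "refs/heads/master"
 * but not "refs/headscarf".
 */
int ref_under_prefix(const char *ref, const char *prefix)
{
	size_t n = strlen(prefix);

	if (n == 0)
		return 1;		/* empty prefix matches everything */
	if (strncmp(ref, prefix, n) != 0)
		return 0;
	if (prefix[n - 1] == '/')
		return 1;		/* prefix already ends at a boundary */
	/* Otherwise the ref must end here or continue with '/'. */
	return ref[n] == '\0' || ref[n] == '/';
}
```

The only difference from a bare prefix test is the final check on the
character that follows the prefix.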

-- 
Brandon Williams

* Re: [PATCH v4 12/35] serve: introduce git-serve
  2018-03-02 20:56         ` Junio C Hamano
@ 2018-03-13 21:35           ` Brandon Williams
  0 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-03-13 21:35 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, git, jrnieder, pclouds, peff, sbeller, stolee

On 03/02, Junio C Hamano wrote:
> Brandon Williams <bmwill@google.com> writes:
> > +static int is_command(const char *key, struct protocol_capability **command)
> > +{
> > +	const char *out;
> > +
> > +	if (skip_prefix(key, "command=", &out)) {
> > +		struct protocol_capability *cmd = get_capability(out);
> > +
> > +		if (!cmd || !cmd->advertise(the_repository, NULL) || !cmd->command)
> > +			die("invalid command '%s'", out);
> > +		if (*command)
> > +			die("command already requested");
> 
> Shouldn't these two checks that lead to die be the other way around?
> When they give us "command=frotz" and we already have *command, it
> would be an error whether we understand 'frotz' or not.
> 
> Who are the target audience of these "die"?  Are they meant to be
> communicated back to the other side of the connection, or are they
> only to be sent to the "server log"?
> 
> The latter one may want to say what two conflicting commands are in
> the log message, perhaps?

Yeah I'll switch the order of these checks as well as print out the two
commands requested for better logging.
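
Roughly, the reordered logic might look like the following self-contained
sketch. It is not the serve.c code: the capability table, `find_capability`
and `parse_command` are hypothetical, the advertise() check is left out,
and errors are returned in a buffer where the real code would die().

```c
#include <stdio.h>
#include <string.h>

struct capability {
	const char *name;
	int is_command;
};

/* Hypothetical table standing in for the server's capability list. */
static const struct capability caps[] = {
	{ "ls-refs", 1 },
	{ "fetch", 1 },
	{ "agent", 0 },
};

static const struct capability *find_capability(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(caps) / sizeof(caps[0]); i++)
		if (!strcmp(caps[i].name, name))
			return &caps[i];
	return NULL;
}

/*
 * Returns 1 if `key` was a valid "command=" line, 0 if it was not a
 * command line at all, and -1 (with a message in `err`) on error.
 */
int parse_command(const char *key, const struct capability **command,
		  char *err, size_t errsz)
{
	const struct capability *cmd;
	const char *out;

	if (strncmp(key, "command=", 8) != 0)
		return 0;
	out = key + 8;

	/* Duplicate check first: it is an error whether or not we know `out`. */
	if (*command) {
		snprintf(err, errsz,
			 "command '%s' requested after already requesting command '%s'",
			 out, (*command)->name);
		return -1;
	}

	cmd = find_capability(out);
	if (!cmd || !cmd->is_command) {
		snprintf(err, errsz, "invalid command '%s'", out);
		return -1;
	}
	*command = cmd;
	return 1;
}
```

Naming both commands in the message means the server log shows the
conflict itself, not just that one occurred.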

> 
> > +		*command = cmd;
> 

-- 
Brandon Williams

* Re: [PATCH v4 12/35] serve: introduce git-serve
  2018-03-02 20:42         ` Junio C Hamano
@ 2018-03-13 21:40           ` Brandon Williams
  0 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-03-13 21:40 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, git, jrnieder, pclouds, peff, sbeller, stolee

On 03/02, Junio C Hamano wrote:
> Brandon Williams <bmwill@google.com> writes:
> 
> > + Capabilities
> > +~~~~~~~~~~~~~~
> > +
> > +There are two different types of capabilities: normal capabilities,
> > +which can be used to convey information or alter the behavior of a
> > +request, and commands, which are the core actions that a client wants to
> > +perform (fetch, push, etc).
> > +
> > +All commands must only last a single round and be stateless from the
> > +perspective of the server side.  All state MUST be retained and managed
> > +by the client process.  This permits simple round-robin load-balancing
> > +on the server side, without needing to worry about state management.
> > +
> > +Clients MUST NOT require state management on the server side in order to
> > +function correctly.
> 
> This somehow feels a bit too HTTP centric worldview that potentially
> may penalize those who do not mind stateful services.

It's meant to be that way so that we don't run into the same issue we
have with the current HTTP transport.  Though I've decided to loosen
this slightly by making protocol v2 stateless by default unless a
capability is advertised and requested by the client indicating that
state can be maintained by the server.  That leaves the door open for
adding state later for transports which have full-duplex connections
while still requiring that stateless is designed first.  I'm kind of
hoping we never need to add state to the protocol because hopefully we
can figure out how to improve negotiation as a whole.

> 
> > + agent
> > +-------
> > +
> > +The server can advertise the `agent` capability with a value `X` (in the
> > +form `agent=X`) to notify the client that the server is running version
> > +`X`.  The client may optionally send its own agent string by including
> > +the `agent` capability with a value `Y` (in the form `agent=Y`) in its
> > +request to the server (but it MUST NOT do so if the server did not
> > +advertise the agent capability).
> 
> Are there different degrees of permissiveness between "The server
> CAN" and "The client MAY" above, or is the above paragraph merely
> being fuzzy?

I don't think so? I believe I ripped this from the existing description
of the agent capability from the current protocol.

-- 
Brandon Williams

* Re: [PATCH v4 20/35] upload-pack: introduce fetch server command
  2018-03-13 16:20         ` Jonathan Tan
@ 2018-03-13 21:49           ` Brandon Williams
  0 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-03-13 21:49 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, git, gitster, jrnieder, pclouds, peff, sbeller, stolee

On 03/13, Jonathan Tan wrote:
> On Wed, 28 Feb 2018 15:22:37 -0800
> Brandon Williams <bmwill@google.com> wrote:
> 
> > +    output = *section
> > +    section = (acknowledgments | packfile)
> > +	      (flush-pkt | delim-pkt)
> > +
> > +    acknowledgments = PKT-LINE("acknowledgments" LF)
> > +		      (nak | *ack)
> > +		      (ready)
> > +    ready = PKT-LINE("ready" LF)
> > +    nak = PKT-LINE("NAK" LF)
> > +    ack = PKT-LINE("ACK" SP obj-id LF)
> > +
> > +    packfile = PKT-LINE("packfile" LF)
> > +	       [PACKFILE]
> 
> I should have noticed this earlier, but "PACKFILE" is not defined anywhere -
> it's probably better written as:
> 
>     *PKT-LINE(%x01-03 *%x00-ff)
> 
> or something like that.

I'll document it as you described.
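
For reference, a sketch of how a reader might take apart the payload of one
such PKT-LINE, following the stream codes listed in the quoted documentation
below (1 pack data, 2 progress, 3 fatal error). `split_sideband` and the
`SKETCH_BAND_*` names are hypothetical; this is not git's sideband code.

```c
#include <stddef.h>

enum sketch_band {
	SKETCH_BAND_PACK = 1,		/* pack data */
	SKETCH_BAND_PROGRESS = 2,	/* progress messages */
	SKETCH_BAND_ERROR = 3		/* fatal error before the stream aborts */
};

/*
 * `payload` is the content of one pkt-line with the 4-byte length
 * already stripped. Returns the stream code and points *data at the
 * bytes after it, or returns -1 if the line is empty or the code is
 * outside %x01-03.
 */
int split_sideband(const unsigned char *payload, size_t len,
		   const unsigned char **data, size_t *data_len)
{
	if (len < 1)
		return -1;
	if (payload[0] < SKETCH_BAND_PACK || payload[0] > SKETCH_BAND_ERROR)
		return -1;
	*data = payload + 1;
	*data_len = len - 1;
	return payload[0];
}
```

A client would append band-1 data to the incoming pack, show band-2 data to
the user, and stop with an error on band 3.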

> 
> > +    acknowledgments section
> > +	* Always begins with the section header "acknowledgments"
> > +
> > +	* The server will respond with "NAK" if none of the object ids sent
> > +	  as have lines were common.
> > +
> > +	* The server will respond with "ACK obj-id" for all of the
> > +	  object ids sent as have lines which are common.
> > +
> > +	* A response cannot have both "ACK" lines as well as a "NAK"
> > +	  line.
> > +
> > +	* The server will respond with a "ready" line indicating that
> > +	  the server has found an acceptable common base and is ready to
> > +	  make and send a packfile (which will be found in the packfile
> > +	  section of the same response)
> > +
> > +	* If the client determines that it is finished with negotiations
> > +	  by sending a "done" line, the acknowledgments sections MUST be
> > +	  omitted from the server's response.
> > +
> > +	* If the server has found a suitable cut point and has decided
> > +	  to send a "ready" line, then the server can decide to (as an
> > +	  optimization) omit any "ACK" lines it would have sent during
> > +	  its response.  This is because the server will have already
> > +	  determined the objects it plans to send to the client and no
> > +	  further negotiation is needed.
> > +
> > +----
> > +    packfile section
> > +	* Always begins with the section header "packfile"
> > +
> > +	* The transmission of the packfile begins immediately after the
> > +	  section header
> > +
> > +	* The data transfer of the packfile is always multiplexed, using
> > +	  the same semantics of the 'side-band-64k' capability from
> > +	  protocol version 1.  This means that each packet, during the
> > +	  packfile data stream, is made up of a leading 4-byte pkt-line
> > +	  length (typical of the pkt-line format), followed by a 1-byte
> > +	  stream code, followed by the actual data.
> > +
> > +	  The stream code can be one of:
> > +		1 - pack data
> > +		2 - progress messages
> > +		3 - fatal error message just before stream aborts
> > +
> > +	* This section is only included if the client has sent 'want'
> > +	  lines in its request and either requested that no more
> > +	  negotiation be done by sending 'done' or if the server has
> > +	  decided it has found a sufficient cut point to produce a
> > +	  packfile.
> 
> For both the sections, I think that the conditions for
> inclusion/non-inclusion ("This section is only included if...") should
> be the first point.
> 
> > +static void upload_pack_data_init(struct upload_pack_data *data)
> > +{
> > +	struct object_array wants = OBJECT_ARRAY_INIT;
> > +	struct oid_array haves = OID_ARRAY_INIT;
> > +
> > +	memset(data, 0, sizeof(*data));
> > +	data->wants = wants;
> > +	data->haves = haves;
> > +}
> 
> Any reason to use a initializer function instead of a static literal?

It's much cleaner and easier to read than it was when I was using an
initializer.

-- 
Brandon Williams

* Re: [PATCH v4 27/35] transport-helper: introduce stateless-connect
  2018-03-13 16:30         ` Jonathan Tan
@ 2018-03-14 17:36           ` Brandon Williams
  0 siblings, 0 replies; 362+ messages in thread
From: Brandon Williams @ 2018-03-14 17:36 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, git, gitster, jrnieder, pclouds, peff, sbeller, stolee

On 03/13, Jonathan Tan wrote:
> On Wed, 28 Feb 2018 15:22:44 -0800
> Brandon Williams <bmwill@google.com> wrote:
> 
> > +'stateless-connect'::
> > +	Experimental; for internal use only.
> > +	Can attempt to connect to a remote server for communication
> > +	using git's wire-protocol version 2.  This establishes a
> > +	stateless, half-duplex connection.
> > ++
> > +Supported commands: 'stateless-connect'.
> > +
> >  'push'::
> >  	Can discover remote refs and push local commits and the
> >  	history leading up to them to new or existing remote refs.
> > @@ -136,6 +144,14 @@ Capabilities for Fetching
> >  +
> >  Supported commands: 'connect'.
> >  
> > +'stateless-connect'::
> > +	Experimental; for internal use only.
> > +	Can attempt to connect to a remote server for communication
> > +	using git's wire-protocol version 2.  This establishes a
> > +	stateless, half-duplex connection.
> > ++
> > +Supported commands: 'stateless-connect'.
> 
> I don't think we should use the term "half-duplex" - from a search, it
> means that both parties can use the wire but not simultaneously, which
> is not strictly true. Might be better to just say "see the documentation
> for the stateless-connect command for more information".
> 
> > +'stateless-connect' <service>::
> > +	Experimental; for internal use only.
> > +	Connects to the given remote service for communication using
> > +	git's wire-protocol version 2.  This establishes a stateless,
> > +	half-duplex connection.  Valid replies to this command are empty
> > +	line (connection established), 'fallback' (no smart transport
> > +	support, fall back to dumb transports) and just exiting with
> > +	error message printed (can't connect, don't bother trying to
> > +	fall back).  After line feed terminating the positive (empty)
> > +	response, the output of the service starts.  Messages (both
> > +	request and response) must be terminated with a single flush
> > +	packet, allowing the remote helper to properly act as a proxy.
> > +	After the connection ends, the remote helper exits.
> > ++
> > +Supported if the helper has the "stateless-connect" capability.
> 
> I'm not sure of the relevance of "allowing the remote helper to properly
> act as a proxy" - this scheme does make it easier to implement proxies,
> but it does not make any party start acting as one. I would write that
> part as:
> 
>     Messages (both request and response) must consist of zero or more
>     PKT-LINEs, terminating in a flush packet. The client must not expect
>     the server to store any state in between request-response pairs.
> 
> (This covers the so-called "half-duplex" part and the "stateless" part.)

Thanks for helping wordsmith this, I'll update the docs based on these
suggestions.

-- 
Brandon Williams

Thread overview: 362+ messages
2018-01-03  0:18 [PATCH 00/26] protocol version 2 Brandon Williams
2018-01-03  0:18 ` [PATCH 01/26] pkt-line: introduce packet_read_with_status Brandon Williams
2018-01-03 19:27   ` Stefan Beller
2018-01-05 23:41     ` Brandon Williams
2018-01-09 18:04   ` Jonathan Tan
2018-01-09 19:28     ` Brandon Williams
2018-01-03  0:18 ` [PATCH 02/26] pkt-line: introduce struct packet_reader Brandon Williams
2018-01-09 18:08   ` Jonathan Tan
2018-01-09 19:19     ` Brandon Williams
2018-01-03  0:18 ` [PATCH 03/26] pkt-line: add delim packet support Brandon Williams
2018-01-03  0:18 ` [PATCH 04/26] upload-pack: convert to a builtin Brandon Williams
2018-01-03 20:33   ` Stefan Beller
2018-01-03 20:39     ` Brandon Williams
2018-02-21 21:47       ` Jonathan Nieder
2018-02-21 23:35         ` Junio C Hamano
2018-01-03  0:18 ` [PATCH 05/26] upload-pack: factor out processing lines Brandon Williams
2018-01-03 20:38   ` Stefan Beller
2018-01-03  0:18 ` [PATCH 06/26] transport: use get_refs_via_connect to get refs Brandon Williams
2018-01-03 21:20   ` Stefan Beller
2018-01-03  0:18 ` [PATCH 07/26] connect: convert get_remote_heads to use struct packet_reader Brandon Williams
2018-01-09 18:27   ` Jonathan Tan
2018-01-09 19:09     ` Brandon Williams
2018-01-03  0:18 ` [PATCH 08/26] connect: discover protocol version outside of get_remote_heads Brandon Williams
2018-01-03  0:18 ` [PATCH 09/26] transport: store protocol version Brandon Williams
2018-01-09 18:41   ` Jonathan Tan
2018-01-09 19:15     ` Brandon Williams
2018-01-03  0:18 ` [PATCH 10/26] protocol: introduce enum protocol_version value protocol_v2 Brandon Williams
2018-01-03  0:18 ` [PATCH 11/26] serve: introduce git-serve Brandon Williams
2018-01-09 20:24   ` Jonathan Tan
2018-01-09 22:16     ` Brandon Williams
2018-01-09 22:28       ` Jonathan Tan
2018-01-09 22:34         ` Brandon Williams
2018-02-01 18:48   ` Jeff Hostetler
2018-02-01 18:57     ` Stefan Beller
2018-02-01 19:09       ` Jeff Hostetler
2018-02-01 20:05         ` Brandon Williams
2018-02-01 19:45       ` Randall S. Becker
2018-02-01 20:08         ` 'Brandon Williams'
2018-02-01 20:37           ` Randall S. Becker
2018-02-01 20:50             ` Stefan Beller
2018-01-03  0:18 ` [PATCH 12/26] ls-refs: introduce ls-refs server command Brandon Williams
2018-01-04  0:17   ` Stefan Beller
2018-01-05 23:49     ` Brandon Williams
2018-01-09 20:50   ` Jonathan Tan
2018-01-16 19:23     ` Brandon Williams
2018-02-01 19:16   ` Jeff Hostetler
2018-02-07  0:55     ` Brandon Williams
2018-01-03  0:18 ` [PATCH 13/26] connect: request remote refs using v2 Brandon Williams
2018-01-09 22:24   ` Jonathan Tan
2018-01-03  0:18 ` [PATCH 14/26] transport: convert get_refs_list to take a list of ref patterns Brandon Williams
2018-01-03  0:18 ` [PATCH 15/26] transport: convert transport_get_remote_refs " Brandon Williams
2018-01-03  0:18 ` [PATCH 16/26] ls-remote: pass ref patterns when requesting a remote's refs Brandon Williams
2018-01-03  0:18 ` [PATCH 17/26] fetch: pass ref patterns when fetching Brandon Williams
2018-01-03  0:18 ` [PATCH 18/26] push: pass ref patterns when pushing Brandon Williams
2018-01-03  0:18 ` [PATCH 19/26] upload-pack: introduce fetch server command Brandon Williams
2018-01-04  1:07   ` Stefan Beller
2018-01-03  0:18 ` [PATCH 20/26] fetch-pack: perform a fetch using v2 Brandon Williams
2018-01-04  1:23   ` Stefan Beller
2018-01-05 23:55     ` Brandon Williams
2018-01-10  0:05   ` Jonathan Tan
2018-01-03  0:18 ` [PATCH 21/26] transport-helper: remove name parameter Brandon Williams
2018-01-03  0:18 ` [PATCH 22/26] transport-helper: refactor process_connect_service Brandon Williams
2018-01-03  0:18 ` [PATCH 23/26] transport-helper: introduce connect-half-duplex Brandon Williams
2018-01-03  0:18 ` [PATCH 24/26] pkt-line: add packet_buf_write_len function Brandon Williams
2018-01-03  0:18 ` [PATCH 25/26] remote-curl: create copy of the service name Brandon Williams
2018-01-03  0:18 ` [PATCH 26/26] remote-curl: implement connect-half-duplex command Brandon Williams
2018-01-10  0:10   ` Jonathan Tan
2018-01-10 17:57   ` Jonathan Tan
2018-01-11  1:09     ` Brandon Williams
2018-01-09 17:55 ` [PATCH 00/26] protocol version 2 Jonathan Tan
2018-01-11  0:23   ` Brandon Williams
2018-01-25 23:58 ` [PATCH v2 00/27] " Brandon Williams
2018-01-25 23:58   ` [PATCH v2 01/27] pkt-line: introduce packet_read_with_status Brandon Williams
2018-01-25 23:58   ` [PATCH v2 02/27] pkt-line: introduce struct packet_reader Brandon Williams
2018-01-25 23:58   ` [PATCH v2 03/27] pkt-line: add delim packet support Brandon Williams
2018-01-25 23:58   ` [PATCH v2 04/27] upload-pack: convert to a builtin Brandon Williams
2018-01-25 23:58   ` [PATCH v2 05/27] upload-pack: factor out processing lines Brandon Williams
2018-01-26 20:12     ` Stefan Beller
2018-01-26 21:33       ` Brandon Williams
2018-01-31 14:08         ` Derrick Stolee
2018-01-25 23:58   ` [PATCH v2 06/27] transport: use get_refs_via_connect to get refs Brandon Williams
2018-01-25 23:58   ` [PATCH v2 07/27] connect: convert get_remote_heads to use struct packet_reader Brandon Williams
2018-01-25 23:58   ` [PATCH v2 08/27] connect: discover protocol version outside of get_remote_heads Brandon Williams
2018-01-31 14:40     ` Derrick Stolee
2018-02-01 17:57       ` Brandon Williams
2018-01-25 23:58   ` [PATCH v2 09/27] transport: store protocol version Brandon Williams
2018-01-31 14:45     ` Derrick Stolee
2018-01-25 23:58   ` [PATCH v2 10/27] protocol: introduce enum protocol_version value protocol_v2 Brandon Williams
2018-01-31 14:54     ` Derrick Stolee
2018-02-02 22:44       ` Brandon Williams
2018-02-05 14:14         ` Derrick Stolee
2018-01-25 23:58   ` [PATCH v2 11/27] test-pkt-line: introduce a packet-line test helper Brandon Williams
2018-01-25 23:58   ` [PATCH v2 12/27] serve: introduce git-serve Brandon Williams
2018-01-26 10:39     ` Duy Nguyen
2018-02-27  5:46       ` Jonathan Nieder
2018-01-31 15:39     ` Derrick Stolee
2018-01-25 23:58   ` [PATCH v2 13/27] ls-refs: introduce ls-refs server command Brandon Williams
2018-01-26 22:20     ` Stefan Beller
2018-02-02 22:31       ` Brandon Williams
2018-01-25 23:58   ` [PATCH v2 14/27] connect: request remote refs using v2 Brandon Williams
2018-01-31 15:22     ` Derrick Stolee
2018-01-31 20:10       ` Eric Sunshine
2018-01-31 22:14         ` Derrick Stolee
2018-01-25 23:58   ` [PATCH v2 15/27] transport: convert get_refs_list to take a list of ref patterns Brandon Williams
2018-01-25 23:58   ` [PATCH v2 16/27] transport: convert transport_get_remote_refs " Brandon Williams
2018-01-25 23:58   ` [PATCH v2 17/27] ls-remote: pass ref patterns when requesting a remote's refs Brandon Williams
2018-01-25 23:58   ` [PATCH v2 18/27] fetch: pass ref patterns when fetching Brandon Williams
2018-01-25 23:58   ` [PATCH v2 19/27] push: pass ref patterns when pushing Brandon Williams
2018-01-25 23:58   ` [PATCH v2 20/27] upload-pack: introduce fetch server command Brandon Williams
2018-01-25 23:58   ` [PATCH v2 21/27] fetch-pack: perform a fetch using v2 Brandon Williams
2018-01-25 23:58   ` [PATCH v2 22/27] transport-helper: remove name parameter Brandon Williams
2018-01-25 23:58   ` [PATCH v2 23/27] transport-helper: refactor process_connect_service Brandon Williams
2018-01-25 23:58   ` [PATCH v2 24/27] transport-helper: introduce stateless-connect Brandon Williams
2018-01-25 23:58   ` [PATCH v2 25/27] pkt-line: add packet_buf_write_len function Brandon Williams
2018-01-25 23:58   ` [PATCH v2 26/27] remote-curl: create copy of the service name Brandon Williams
2018-01-25 23:58   ` [PATCH v2 27/27] remote-curl: implement stateless-connect command Brandon Williams
2018-01-31 16:00   ` [PATCH v2 00/27] protocol version 2 Derrick Stolee
2018-02-07  0:58     ` Brandon Williams
2018-02-01 19:40   ` Jeff Hostetler
2018-02-07  1:12   ` [PATCH v3 00/35] " Brandon Williams
2018-02-07  1:12     ` [PATCH v3 01/35] pkt-line: introduce packet_read_with_status Brandon Williams
2018-02-13  0:25       ` Jonathan Nieder
2018-02-07  1:12     ` [PATCH v3 02/35] pkt-line: introduce struct packet_reader Brandon Williams
2018-02-13  0:49       ` Jonathan Nieder
2018-02-27 18:14         ` Brandon Williams
2018-02-27 19:20           ` Jonathan Nieder
2018-02-27  5:57       ` Jonathan Nieder
2018-02-27  6:12         ` Jonathan Nieder
2018-02-07  1:12     ` [PATCH v3 03/35] pkt-line: add delim packet support Brandon Williams
2018-02-22 19:13       ` Stefan Beller
2018-02-22 19:37         ` Brandon Williams
2018-02-07  1:12     ` [PATCH v3 04/35] upload-pack: convert to a builtin Brandon Williams
2018-02-21 21:44       ` Jonathan Tan
2018-02-22  9:58         ` Jeff King
2018-02-22 18:07           ` Brandon Williams
2018-02-22 18:14             ` Jeff King
2018-02-22 19:38               ` Jonathan Nieder
2018-02-22 20:19                 ` Jeff King
2018-02-22 20:21                   ` Jeff King
2018-02-22 21:26                     ` Jonathan Nieder
2018-02-22 21:44                       ` Jeff King
2018-03-12 22:43                         ` Jonathan Nieder
2018-03-12 23:28                           ` Jeff King
2018-03-12 23:37                             ` Jonathan Nieder
2018-03-12 23:52                               ` Jeff King
2018-02-23 21:09                     ` Brandon Williams
2018-03-03  4:24                       ` Jeff King
2018-02-22 21:24                   ` Jonathan Nieder
2018-02-22 21:44                     ` Jeff King
2018-02-22 22:21                       ` Jeff King
2018-02-22 22:42                         ` Jonathan Nieder
2018-02-22 23:05                           ` Jeff King
2018-02-22 23:23                             ` Jeff King
2018-02-07  1:12     ` [PATCH v3 05/35] upload-pack: factor out processing lines Brandon Williams
2018-02-22 19:31       ` Stefan Beller
2018-02-22 19:39         ` Brandon Williams
2018-02-07  1:12     ` [PATCH v3 06/35] transport: use get_refs_via_connect to get refs Brandon Williams
2018-02-27  6:08       ` Jonathan Nieder
2018-02-27 18:17         ` Brandon Williams
2018-02-27 19:25           ` Jonathan Nieder
2018-02-27 19:46             ` Brandon Williams
2018-02-07  1:12     ` [PATCH v3 07/35] connect: convert get_remote_heads to use struct packet_reader Brandon Williams
2018-02-22 19:52       ` Stefan Beller
2018-02-22 20:09       ` Stefan Beller
2018-02-23 21:30         ` Brandon Williams
2018-02-23 21:48           ` Stefan Beller
2018-02-23 22:56             ` Brandon Williams
2018-02-07  1:12     ` [PATCH v3 08/35] connect: discover protocol version outside of get_remote_heads Brandon Williams
2018-02-21 22:11       ` Jonathan Tan
2018-02-22 18:17         ` Brandon Williams
2018-02-22 19:22           ` Jonathan Tan
2018-02-07  1:12     ` [PATCH v3 09/35] transport: store protocol version Brandon Williams
2018-02-07  1:12     ` [PATCH v3 10/35] protocol: introduce enum protocol_version value protocol_v2 Brandon Williams
2018-02-27  6:18       ` Jonathan Nieder
2018-02-27 18:41         ` Brandon Williams
2018-02-07  1:12     ` [PATCH v3 11/35] test-pkt-line: introduce a packet-line test helper Brandon Williams
2018-02-22 20:40       ` Stefan Beller
2018-02-23 21:22         ` Brandon Williams
2018-03-03  4:25           ` Jeff King
2018-03-05 18:48             ` Brandon Williams
2018-02-07  1:12     ` [PATCH v3 12/35] serve: introduce git-serve Brandon Williams
2018-02-21 22:45       ` Jonathan Tan
2018-02-23 21:33         ` Brandon Williams
2018-02-27 18:05           ` Jonathan Tan
2018-02-27 18:34             ` Brandon Williams
2018-02-22  9:33       ` Jeff King
2018-02-23 21:45         ` Brandon Williams
2018-03-03  4:33           ` Jeff King
2018-03-05 18:43             ` Brandon Williams
2018-03-05 20:52               ` Jeff King
2018-03-05 21:36                 ` Jonathan Nieder
2018-03-06  6:29                   ` Jeff King
2018-03-12 23:46                     ` Jeff King
2018-02-07  1:12     ` [PATCH v3 13/35] ls-refs: introduce ls-refs server command Brandon Williams
2018-02-22  9:48       ` Jeff King
2018-02-23  0:45         ` Brandon Williams
2018-02-24  0:19           ` Brandon Williams
2018-02-24  4:03             ` Jeff King
2018-02-24  4:01           ` Jeff King
2018-02-26 22:33             ` Junio C Hamano
2018-02-27  0:02             ` Ævar Arnfjörð Bjarmason
2018-02-27  5:15               ` Jonathan Nieder
2018-02-27 18:02                 ` Brandon Williams
2018-02-07  1:12     ` [PATCH v3 14/35] connect: request remote refs using v2 Brandon Williams
2018-02-21 22:54       ` Jonathan Tan
2018-02-22 18:19         ` Brandon Williams
2018-02-22 18:26           ` Jeff King
2018-02-22 19:25             ` Jonathan Tan
2018-02-27  6:21               ` Jonathan Nieder
2018-02-27 21:58                 ` Junio C Hamano
2018-02-27 22:04                   ` Jeff King
2018-02-27 22:10                     ` Eric Sunshine
2018-02-27 22:18                       ` Jeff King
2018-02-27 23:32                         ` Junio C Hamano
2018-02-27  6:51       ` Jonathan Nieder
2018-02-27 19:30         ` Brandon Williams
2018-02-07  1:12     ` [PATCH v3 15/35] transport: convert get_refs_list to take a list of ref patterns Brandon Williams
2018-02-21 22:56       ` Jonathan Tan
2018-02-22 18:25         ` Brandon Williams
2018-02-07  1:12     ` [PATCH v3 16/35] transport: convert transport_get_remote_refs " Brandon Williams
2018-02-21 22:58       ` Jonathan Tan
2018-02-22 18:26         ` Brandon Williams
2018-02-22 19:32           ` Jonathan Tan
2018-02-22 19:51             ` Brandon Williams
2018-02-07  1:12     ` [PATCH v3 17/35] ls-remote: pass ref patterns when requesting a remote's refs Brandon Williams
2018-02-07  1:12     ` [PATCH v3 18/35] fetch: pass ref patterns when fetching Brandon Williams
2018-02-27  6:53       ` Jonathan Nieder
2018-02-07  1:12     ` [PATCH v3 19/35] push: pass ref patterns when pushing Brandon Williams
2018-02-27 18:23       ` Stefan Beller
2018-02-07  1:12     ` [PATCH v3 20/35] upload-pack: introduce fetch server command Brandon Williams
2018-02-21 23:46       ` Jonathan Tan
2018-02-22 18:48         ` Brandon Williams
2018-02-07  1:12     ` [PATCH v3 21/35] fetch-pack: perform a fetch using v2 Brandon Williams
2018-02-24  0:54       ` Jonathan Tan
2018-02-26 22:23         ` Brandon Williams
2018-02-27 19:27       ` Stefan Beller
2018-02-27 19:40         ` Brandon Williams
2018-02-07  1:12     ` [PATCH v3 22/35] upload-pack: support shallow requests Brandon Williams
2018-02-07 19:00       ` Stefan Beller
2018-02-10 10:23         ` Duy Nguyen
2018-02-13 17:06         ` Brandon Williams
2018-02-27 18:29       ` Jonathan Nieder
2018-02-27 18:57         ` Brandon Williams
2018-02-07  1:13     ` [PATCH v3 23/35] fetch-pack: " Brandon Williams
2018-02-23 19:37       ` Jonathan Tan
2018-02-23 19:56         ` Brandon Williams
2018-02-07  1:13     ` [PATCH v3 24/35] connect: refactor git_connect to only get the protocol version once Brandon Williams
2018-02-21 23:51       ` Jonathan Tan
2018-02-07  1:13     ` [PATCH v3 25/35] connect: don't request v2 when pushing Brandon Williams
2018-02-07  1:13     ` [PATCH v3 26/35] transport-helper: remove name parameter Brandon Williams
2018-02-27 23:03       ` Jonathan Nieder
2018-02-07  1:13     ` [PATCH v3 27/35] transport-helper: refactor process_connect_service Brandon Williams
2018-02-07  1:13     ` [PATCH v3 28/35] transport-helper: introduce stateless-connect Brandon Williams
2018-02-22  0:01       ` Jonathan Tan
2018-02-22 18:53         ` Brandon Williams
2018-02-22 21:55           ` Jonathan Tan
2018-02-27 23:30       ` Jonathan Nieder
2018-02-28 19:09         ` Brandon Williams
2018-02-07  1:13     ` [PATCH v3 29/35] pkt-line: add packet_buf_write_len function Brandon Williams
2018-02-27 23:11       ` Jonathan Nieder
2018-02-28  1:08         ` Brandon Williams
2018-02-07  1:13     ` [PATCH v3 30/35] remote-curl: create copy of the service name Brandon Williams
2018-02-22  0:06       ` Jonathan Tan
2018-02-22 18:56         ` Brandon Williams
2018-02-07  1:13     ` [PATCH v3 31/35] remote-curl: store the protocol version the server responded with Brandon Williams
2018-02-27 23:17       ` Jonathan Nieder
2018-02-07  1:13     ` [PATCH v3 32/35] http: allow providing extra headers for http requests Brandon Williams
2018-02-22  0:09       ` Jonathan Tan
2018-02-22 18:58         ` Brandon Williams
2018-02-07  1:13     ` [PATCH v3 33/35] http: don't always add Git-Protocol header Brandon Williams
2018-02-07  1:13     ` [PATCH v3 34/35] remote-curl: implement stateless-connect command Brandon Williams
2018-02-28  0:05       ` Jonathan Nieder
2018-02-28 20:21         ` Brandon Williams
2018-02-07  1:13     ` [PATCH v3 35/35] remote-curl: don't request v2 when pushing Brandon Williams
2018-02-22  0:12       ` Jonathan Tan
2018-02-22 18:59         ` Brandon Williams
2018-02-22 19:09           ` Brandon Williams
2018-02-12 14:50     ` [PATCH v3 00/35] protocol version 2 Derrick Stolee
2018-02-21 20:01     ` Brandon Williams
2018-02-28 23:22     ` [PATCH v4 " Brandon Williams
2018-02-28 23:22       ` [PATCH v4 01/35] pkt-line: introduce packet_read_with_status Brandon Williams
2018-03-13 19:35         ` Jonathan Tan
2018-03-13 19:52           ` Brandon Williams
2018-02-28 23:22       ` [PATCH v4 02/35] pkt-line: allow peeking a packet line without consuming it Brandon Williams
2018-03-01 20:48         ` Junio C Hamano
2018-03-12 21:56           ` Brandon Williams
2018-02-28 23:22       ` [PATCH v4 03/35] pkt-line: add delim packet support Brandon Williams
2018-03-01 20:50         ` Junio C Hamano
2018-03-01 21:04           ` Junio C Hamano
2018-03-01 22:49             ` Brandon Williams
2018-03-01 23:43               ` Junio C Hamano
2018-02-28 23:22       ` [PATCH v4 04/35] upload-pack: convert to a builtin Brandon Williams
2018-03-13 16:40         ` Jonathan Tan
2018-03-13 19:50           ` Brandon Williams
2018-02-28 23:22       ` [PATCH v4 05/35] upload-pack: factor out processing lines Brandon Williams
2018-03-01 21:25         ` Junio C Hamano
2018-03-12 22:24           ` Brandon Williams
2018-03-12 22:39             ` Brandon Williams
2018-02-28 23:22       ` [PATCH v4 06/35] transport: use get_refs_via_connect to get refs Brandon Williams
2018-03-01 21:25         ` Junio C Hamano
2018-02-28 23:22       ` [PATCH v4 07/35] connect: convert get_remote_heads to use struct packet_reader Brandon Williams
2018-02-28 23:22       ` [PATCH v4 08/35] connect: discover protocol version outside of get_remote_heads Brandon Williams
2018-03-13 15:49         ` Jonathan Tan
2018-02-28 23:22       ` [PATCH v4 09/35] transport: store protocol version Brandon Williams
2018-02-28 23:22       ` [PATCH v4 10/35] protocol: introduce enum protocol_version value protocol_v2 Brandon Williams
2018-02-28 23:22       ` [PATCH v4 11/35] test-pkt-line: introduce a packet-line test helper Brandon Williams
2018-02-28 23:22       ` [PATCH v4 12/35] serve: introduce git-serve Brandon Williams
2018-03-01 23:11         ` Junio C Hamano
2018-03-12 22:08           ` Brandon Williams
2018-03-02 20:42         ` Junio C Hamano
2018-03-13 21:40           ` Brandon Williams
2018-03-02 20:56         ` Junio C Hamano
2018-03-13 21:35           ` Brandon Williams
2018-02-28 23:22       ` [PATCH v4 13/35] ls-refs: introduce ls-refs server command Brandon Williams
2018-03-02 21:13         ` Junio C Hamano
2018-03-13 21:27           ` Brandon Williams
2018-03-03  4:43         ` Jeff King
2018-03-05 18:21           ` Brandon Williams
2018-03-05 18:29             ` Jonathan Nieder
2018-03-05 20:38               ` Jeff King
2018-03-05 20:28             ` Jeff King
2018-03-13 21:23               ` Brandon Williams
2018-02-28 23:22       ` [PATCH v4 14/35] connect: request remote refs using v2 Brandon Williams
2018-02-28 23:22       ` [PATCH v4 15/35] transport: convert get_refs_list to take a list of ref patterns Brandon Williams
2018-02-28 23:22       ` [PATCH v4 16/35] transport: convert transport_get_remote_refs " Brandon Williams
2018-03-13 16:00         ` Jonathan Tan
2018-02-28 23:22       ` [PATCH v4 17/35] ls-remote: pass ref patterns when requesting a remote's refs Brandon Williams
2018-03-02 22:13         ` Junio C Hamano
2018-02-28 23:22       ` [PATCH v4 18/35] fetch: pass ref patterns when fetching Brandon Williams
2018-03-02 22:20         ` Junio C Hamano
2018-03-12 22:18           ` Brandon Williams
2018-02-28 23:22       ` [PATCH v4 19/35] push: pass ref patterns when pushing Brandon Williams
2018-03-02 22:25         ` Junio C Hamano
2018-03-12 22:20           ` Brandon Williams
2018-02-28 23:22       ` [PATCH v4 20/35] upload-pack: introduce fetch server command Brandon Williams
2018-03-13 16:20         ` Jonathan Tan
2018-03-13 21:49           ` Brandon Williams
2018-02-28 23:22       ` [PATCH v4 21/35] fetch-pack: perform a fetch using v2 Brandon Williams
2018-02-28 23:22       ` [PATCH v4 22/35] fetch-pack: support shallow requests Brandon Williams
2018-02-28 23:22       ` [PATCH v4 23/35] connect: refactor git_connect to only get the protocol version once Brandon Williams
2018-02-28 23:22       ` [PATCH v4 24/35] connect: don't request v2 when pushing Brandon Williams
2018-02-28 23:22       ` [PATCH v4 25/35] transport-helper: remove name parameter Brandon Williams
2018-02-28 23:22       ` [PATCH v4 26/35] transport-helper: refactor process_connect_service Brandon Williams
2018-02-28 23:22       ` [PATCH v4 27/35] transport-helper: introduce stateless-connect Brandon Williams
2018-03-13 16:30         ` Jonathan Tan
2018-03-14 17:36           ` Brandon Williams
2018-02-28 23:22       ` [PATCH v4 28/35] pkt-line: add packet_buf_write_len function Brandon Williams
2018-02-28 23:22       ` [PATCH v4 29/35] remote-curl: create copy of the service name Brandon Williams
2018-03-13 16:32         ` Jonathan Tan
2018-02-28 23:22       ` [PATCH v4 30/35] remote-curl: store the protocol version the server responded with Brandon Williams
2018-02-28 23:22       ` [PATCH v4 31/35] http: allow providing extra headers for http requests Brandon Williams
2018-03-13 16:33         ` Jonathan Tan
2018-02-28 23:22       ` [PATCH v4 32/35] http: don't always add Git-Protocol header Brandon Williams
2018-02-28 23:22       ` [PATCH v4 33/35] http: eliminate "# service" line when using protocol v2 Brandon Williams
2018-02-28 23:22       ` [PATCH v4 34/35] remote-curl: implement stateless-connect command Brandon Williams
2018-03-02 20:07         ` Johannes Schindelin
2018-03-05 19:35           ` Brandon Williams
2018-02-28 23:22       ` [PATCH v4 35/35] remote-curl: don't request v2 when pushing Brandon Williams
2018-03-13 16:35         ` Jonathan Tan
2018-03-01 18:41       ` [PATCH v4 00/35] protocol version 2 Junio C Hamano
2018-03-01 19:16         ` Brandon Williams
2018-03-01 20:59           ` Junio C Hamano
