* [PATCH 0/8] protocol transition
@ 2017-09-13 21:54 Brandon Williams
  2017-09-13 21:54 ` [PATCH 1/8] pkt-line: add packet_write function Brandon Williams
                   ` (9 more replies)
  0 siblings, 10 replies; 161+ messages in thread
From: Brandon Williams @ 2017-09-13 21:54 UTC (permalink / raw)
  To: git
  Cc: peff, sbeller, gitster, jrnieder, bturner, git, jonathantanmy,
	Brandon Williams

Here is the non-RFC version of my proposed protocol transition plan which
can be found here:
https://public-inbox.org/git/20170824225328.8174-1-bmwill@google.com/

The main takeaway from the comments on the RFC was that the first transition
shouldn't be drastic, so this patch set introduces protocol v1 which is simply
protocol v0 (which is what I'm calling the current git wire protocol) with a
single pkt-line containing a version string before the ref advertisement.

I have included tests for protocol version 1 to verify that it works with the
following transports: git://, file://, ssh://, and http://.  I have also
included an interop test to ensure that sending the version request out of band
doesn't cause issues with older servers.

Any and all comments and feedback are welcome, thanks!

Brandon Williams (8):
  pkt-line: add packet_write function
  protocol: introduce protocol extention mechanisms
  daemon: recognize hidden request arguments
  upload-pack, receive-pack: introduce protocol version 1
  connect: teach client to recognize v1 server response
  connect: tell server that the client understands v1
  http: tell server that the client understands v1
  i5700: add interop test for protocol transition

 Documentation/config.txt               |  16 ++
 Documentation/git.txt                  |   5 +
 Makefile                               |   1 +
 builtin/receive-pack.c                 |  14 ++
 cache.h                                |   9 +
 connect.c                              |  59 ++++++-
 daemon.c                               |  71 ++++++--
 http.c                                 |  18 ++
 pkt-line.c                             |   6 +
 pkt-line.h                             |   1 +
 protocol.c                             |  72 ++++++++
 protocol.h                             |  15 ++
 t/interop/i5700-protocol-transition.sh |  68 ++++++++
 t/lib-httpd/apache.conf                |   7 +
 t/t5700-protocol-v1.sh                 | 292 +++++++++++++++++++++++++++++++++
 upload-pack.c                          |  17 +-
 16 files changed, 655 insertions(+), 16 deletions(-)
 create mode 100644 protocol.c
 create mode 100644 protocol.h
 create mode 100755 t/interop/i5700-protocol-transition.sh
 create mode 100755 t/t5700-protocol-v1.sh

-- 
2.14.1.690.gbb1197296e-goog



* [PATCH 1/8] pkt-line: add packet_write function
  2017-09-13 21:54 [PATCH 0/8] protocol transition Brandon Williams
@ 2017-09-13 21:54 ` Brandon Williams
  2017-09-13 21:54 ` [PATCH 2/8] protocol: introduce protocol extention mechanisms Brandon Williams
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 161+ messages in thread
From: Brandon Williams @ 2017-09-13 21:54 UTC (permalink / raw)
  To: git
  Cc: peff, sbeller, gitster, jrnieder, bturner, git, jonathantanmy,
	Brandon Williams

Add a function which can be used to write the contents of an arbitrary
buffer.  This makes it easy to build up data in a buffer before writing
the packet instead of formatting the entire contents of the packet using
'packet_write_fmt()'.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 pkt-line.c | 6 ++++++
 pkt-line.h | 1 +
 2 files changed, 7 insertions(+)

diff --git a/pkt-line.c b/pkt-line.c
index 7db911957..cf98f371b 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -188,6 +188,12 @@ static int packet_write_gently(const int fd_out, const char *buf, size_t size)
 	return error("packet write failed");
 }
 
+void packet_write(const int fd_out, const char *buf, size_t size)
+{
+	if (packet_write_gently(fd_out, buf, size))
+		die_errno("packet write failed");
+}
+
 void packet_buf_write(struct strbuf *buf, const char *fmt, ...)
 {
 	va_list args;
diff --git a/pkt-line.h b/pkt-line.h
index 66ef610fc..d9e9783b1 100644
--- a/pkt-line.h
+++ b/pkt-line.h
@@ -22,6 +22,7 @@
 void packet_flush(int fd);
 void packet_write_fmt(int fd, const char *fmt, ...) __attribute__((format (printf, 2, 3)));
 void packet_buf_flush(struct strbuf *buf);
+void packet_write(const int fd_out, const char *buf, size_t size);
 void packet_buf_write(struct strbuf *buf, const char *fmt, ...) __attribute__((format (printf, 2, 3)));
 int packet_flush_gently(int fd);
 int packet_write_fmt_gently(int fd, const char *fmt, ...) __attribute__((format (printf, 2, 3)));
-- 
2.14.1.690.gbb1197296e-goog



* [PATCH 2/8] protocol: introduce protocol extention mechanisms
  2017-09-13 21:54 [PATCH 0/8] protocol transition Brandon Williams
  2017-09-13 21:54 ` [PATCH 1/8] pkt-line: add packet_write function Brandon Williams
@ 2017-09-13 21:54 ` Brandon Williams
  2017-09-13 22:27   ` Stefan Beller
  2017-09-13 21:54 ` [PATCH 3/8] daemon: recognize hidden request arguments Brandon Williams
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 161+ messages in thread
From: Brandon Williams @ 2017-09-13 21:54 UTC (permalink / raw)
  To: git
  Cc: peff, sbeller, gitster, jrnieder, bturner, git, jonathantanmy,
	Brandon Williams

Create protocol.{c,h} and provide functions which future servers and
clients can use to determine which protocol to use or is being used.

Also introduce the 'GIT_PROTOCOL' environment variable which will be
used to communicate a colon separated list of keys with optional values
to a server.  Unknown keys and values must be tolerated.  This mechanism
is used to communicate which version of the wire protocol a client would
like to use with a server.
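
As a rough illustration of the format only (this is not code from the series), a receiver could scan such a colon-separated list and keep the highest version it knows about while skipping everything else. The helper name `highest_known_version` is made up for this sketch, and it assumes that only versions 0 and 1 are known.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of git): return the highest known
 * protocol version named in a GIT_PROTOCOL-style string such as
 * "version=1:key=value".  Unknown keys and values are ignored.
 */
int highest_known_version(const char *git_protocol)
{
	int best = 0;	/* no request at all means version 0 */
	const char *p = git_protocol;

	while (p && *p) {
		size_t len = strcspn(p, ":");	/* length of this entry */

		/* "version=" is 8 bytes; only a single-digit "1" raises the result. */
		if (len == 9 && !strncmp(p, "version=", 8) && p[8] == '1')
			best = 1;

		/* Step over the entry and the ':' that follows it, if any. */
		p += len;
		if (*p == ':')
			p++;
	}
	return best;
}
```

An entry such as "version=7" or a bare key like "foo" leaves the result untouched, which is the tolerant behavior described above.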

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 Documentation/config.txt | 16 +++++++++++
 Documentation/git.txt    |  5 ++++
 Makefile                 |  1 +
 cache.h                  |  7 +++++
 protocol.c               | 72 ++++++++++++++++++++++++++++++++++++++++++++++++
 protocol.h               | 15 ++++++++++
 6 files changed, 116 insertions(+)
 create mode 100644 protocol.c
 create mode 100644 protocol.h

diff --git a/Documentation/config.txt b/Documentation/config.txt
index dc4e3f58a..d5b28a32c 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2517,6 +2517,22 @@ The protocol names currently used by git are:
     `hg` to allow the `git-remote-hg` helper)
 --
 
+protocol.version::
+	If set, clients will attempt to communicate with a server using
+	the specified protocol version.  If unset, no attempt will be
+	made by the client to communicate using a particular protocol
+	version; this results in protocol version 0 being used.
+	Supported versions:
++
+--
+
+* `0` - the original wire protocol.
+
+* `1` - the original wire protocol with the addition of a version string
+  in the initial response from the server.
+
+--
+
 pull.ff::
 	By default, Git does not create an extra merge commit when merging
 	a commit that is a descendant of the current commit. Instead, the
diff --git a/Documentation/git.txt b/Documentation/git.txt
index 6e3a6767e..299f75c7b 100644
--- a/Documentation/git.txt
+++ b/Documentation/git.txt
@@ -697,6 +697,11 @@ of clones and fetches.
 	which feed potentially-untrusted URLS to git commands.  See
 	linkgit:git-config[1] for more details.
 
+`GIT_PROTOCOL`::
+	For internal use only.  Used in handshaking the wire protocol.
+	Contains a colon ':' separated list of keys with optional values
+	'key[=value]'.  Presence of unknown keys must be tolerated.
+
 Discussion[[Discussion]]
 ------------------------
 
diff --git a/Makefile b/Makefile
index f2bb7f2f6..1f300bd6b 100644
--- a/Makefile
+++ b/Makefile
@@ -837,6 +837,7 @@ LIB_OBJS += pretty.o
 LIB_OBJS += prio-queue.o
 LIB_OBJS += progress.o
 LIB_OBJS += prompt.o
+LIB_OBJS += protocol.o
 LIB_OBJS += quote.o
 LIB_OBJS += reachable.o
 LIB_OBJS += read-cache.o
diff --git a/cache.h b/cache.h
index a916bc79e..8839b1ed4 100644
--- a/cache.h
+++ b/cache.h
@@ -444,6 +444,13 @@ static inline enum object_type object_type(unsigned int mode)
 #define GIT_ICASE_PATHSPECS_ENVIRONMENT "GIT_ICASE_PATHSPECS"
 #define GIT_QUARANTINE_ENVIRONMENT "GIT_QUARANTINE_PATH"
 
+/*
+ * Environment variable used in handshaking the wire protocol.
+ * Contains a colon ':' separated list of keys with optional values
+ * 'key[=value]'.  Presence of unknown keys must be tolerated.
+ */
+#define GIT_PROTOCOL_ENVIRONMENT "GIT_PROTOCOL"
+
 /*
  * This environment variable is expected to contain a boolean indicating
  * whether we should or should not treat:
diff --git a/protocol.c b/protocol.c
new file mode 100644
index 000000000..1b16c7b9a
--- /dev/null
+++ b/protocol.c
@@ -0,0 +1,72 @@
+#include "cache.h"
+#include "config.h"
+#include "protocol.h"
+
+enum protocol_version parse_protocol_version(const char *value)
+{
+	if (!strcmp(value, "0"))
+		return protocol_v0;
+	else if (!strcmp(value, "1"))
+		return protocol_v1;
+	else
+		return protocol_unknown_version;
+}
+
+enum protocol_version get_protocol_version_config(void)
+{
+	const char *value;
+	if (!git_config_get_string_const("protocol.version", &value)) {
+		enum protocol_version version = parse_protocol_version(value);
+
+		if (version == protocol_unknown_version)
+			die("unknown value for config 'protocol.version': %s",
+			    value);
+
+		return version;
+	}
+
+	return protocol_v0;
+}
+
+enum protocol_version determine_protocol_version_server(void)
+{
+	const char *git_protocol = getenv(GIT_PROTOCOL_ENVIRONMENT);
+	enum protocol_version version = protocol_v0;
+
+	if (git_protocol) {
+		struct string_list list = STRING_LIST_INIT_DUP;
+		const struct string_list_item *item;
+		string_list_split(&list, git_protocol, ':', -1);
+
+		for_each_string_list_item(item, &list) {
+			const char *value;
+			enum protocol_version v;
+
+			if (skip_prefix(item->string, "version=", &value)) {
+				v = parse_protocol_version(value);
+				if (v > version)
+					version = v;
+			}
+		}
+
+		string_list_clear(&list, 0);
+	}
+
+	return version;
+}
+
+enum protocol_version determine_protocol_version_client(const char *server_response)
+{
+	enum protocol_version version = protocol_v0;
+
+	if (skip_prefix(server_response, "version ", &server_response)) {
+		version = parse_protocol_version(server_response);
+
+		if (version == protocol_unknown_version)
+			die("server is speaking an unknown protocol");
+		if (version == protocol_v0)
+			die("protocol error: server explicitly said version 0");
+	}
+
+	return version;
+}
diff --git a/protocol.h b/protocol.h
new file mode 100644
index 000000000..2fa6486d0
--- /dev/null
+++ b/protocol.h
@@ -0,0 +1,15 @@
+#ifndef PROTOCOL_H
+#define PROTOCOL_H
+
+enum protocol_version {
+	protocol_unknown_version = -1,
+	protocol_v0 = 0,
+	protocol_v1 = 1,
+};
+
+extern enum protocol_version parse_protocol_version(const char *value);
+extern enum protocol_version get_protocol_version_config(void);
+extern enum protocol_version determine_protocol_version_server(void);
+extern enum protocol_version determine_protocol_version_client(const char *server_response);
+
+#endif /* PROTOCOL_H */
-- 
2.14.1.690.gbb1197296e-goog



* [PATCH 3/8] daemon: recognize hidden request arguments
  2017-09-13 21:54 [PATCH 0/8] protocol transition Brandon Williams
  2017-09-13 21:54 ` [PATCH 1/8] pkt-line: add packet_write function Brandon Williams
  2017-09-13 21:54 ` [PATCH 2/8] protocol: introduce protocol extention mechanisms Brandon Williams
@ 2017-09-13 21:54 ` Brandon Williams
  2017-09-13 22:31   ` Stefan Beller
  2017-09-21  0:24   ` Jonathan Tan
  2017-09-13 21:54 ` [PATCH 4/8] upload-pack, receive-pack: introduce protocol version 1 Brandon Williams
                   ` (6 subsequent siblings)
  9 siblings, 2 replies; 161+ messages in thread
From: Brandon Williams @ 2017-09-13 21:54 UTC (permalink / raw)
  To: git
  Cc: peff, sbeller, gitster, jrnieder, bturner, git, jonathantanmy,
	Brandon Williams

A normal request to git-daemon is structured as
"command path/to/repo\0host=..\0" and due to a bug in an old version of
git-daemon 73bb33a94 (daemon: Strictly parse the "extra arg" part of the
command, 2009-06-04) we aren't able to place any extra args (separated
by NULs) besides the host.

In order to get around this limitation teach git-daemon to recognize
additional request arguments hidden behind a second NUL byte.  Requests
can then be structured like:
"command path/to/repo\0host=..\0\0version=1\0key=value\0".  git-daemon
can then parse out the extra arguments and set 'GIT_PROTOCOL'
accordingly.
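
To make the request layout concrete, here is a minimal standalone sketch (not the daemon.c code in this patch) that walks the bytes following the second NUL and joins the non-empty arguments with ':'. The name `join_extra_args` and the fixed-size output buffer are assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: join the NUL-separated arguments found in
 * buf[0..len) with ':' and store the NUL-terminated result in out
 * (capacity outsz).  Returns the length written, or -1 if out is
 * too small.
 */
int join_extra_args(const char *buf, size_t len, char *out, size_t outsz)
{
	size_t pos = 0, used = 0;

	if (outsz == 0)
		return -1;
	out[0] = '\0';

	while (pos < len) {
		/* Length of this argument, never reading past the buffer end. */
		size_t n = 0;
		while (pos + n < len && buf[pos + n] != '\0')
			n++;

		if (n > 0) {
			size_t need = n + (used > 0 ? 1 : 0);

			if (used + need + 1 > outsz)
				return -1;
			if (used > 0)
				out[used++] = ':';
			memcpy(out + used, buf + pos, n);
			used += n;
			out[used] = '\0';
		}
		pos += n + 1;	/* skip the argument and its terminating NUL */
	}
	return (int)used;
}
```

Fed the bytes "version=1\0key=value\0", the sketch produces "version=1:key=value", which is the shape of value the daemon would place in 'GIT_PROTOCOL'.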

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 daemon.c | 71 +++++++++++++++++++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 61 insertions(+), 10 deletions(-)

diff --git a/daemon.c b/daemon.c
index 30747075f..250dbf82c 100644
--- a/daemon.c
+++ b/daemon.c
@@ -282,7 +282,7 @@ static const char *path_ok(const char *directory, struct hostinfo *hi)
 	return NULL;		/* Fallthrough. Deny by default */
 }
 
-typedef int (*daemon_service_fn)(void);
+typedef int (*daemon_service_fn)(const struct argv_array *env);
 struct daemon_service {
 	const char *name;
 	const char *config_name;
@@ -363,7 +363,7 @@ static int run_access_hook(struct daemon_service *service, const char *dir,
 }
 
 static int run_service(const char *dir, struct daemon_service *service,
-		       struct hostinfo *hi)
+		       struct hostinfo *hi, const struct argv_array *env)
 {
 	const char *path;
 	int enabled = service->enabled;
@@ -422,7 +422,7 @@ static int run_service(const char *dir, struct daemon_service *service,
 	 */
 	signal(SIGTERM, SIG_IGN);
 
-	return service->fn();
+	return service->fn(env);
 }
 
 static void copy_to_log(int fd)
@@ -462,25 +462,34 @@ static int run_service_command(struct child_process *cld)
 	return finish_command(cld);
 }
 
-static int upload_pack(void)
+static int upload_pack(const struct argv_array *env)
 {
 	struct child_process cld = CHILD_PROCESS_INIT;
 	argv_array_pushl(&cld.args, "upload-pack", "--strict", NULL);
 	argv_array_pushf(&cld.args, "--timeout=%u", timeout);
+
+	argv_array_pushv(&cld.env_array, env->argv);
+
 	return run_service_command(&cld);
 }
 
-static int upload_archive(void)
+static int upload_archive(const struct argv_array *env)
 {
 	struct child_process cld = CHILD_PROCESS_INIT;
 	argv_array_push(&cld.args, "upload-archive");
+
+	argv_array_pushv(&cld.env_array, env->argv);
+
 	return run_service_command(&cld);
 }
 
-static int receive_pack(void)
+static int receive_pack(const struct argv_array *env)
 {
 	struct child_process cld = CHILD_PROCESS_INIT;
 	argv_array_push(&cld.args, "receive-pack");
+
+	argv_array_pushv(&cld.env_array, env->argv);
+
 	return run_service_command(&cld);
 }
 
@@ -574,7 +583,7 @@ static void canonicalize_client(struct strbuf *out, const char *in)
 /*
  * Read the host as supplied by the client connection.
  */
-static void parse_host_arg(struct hostinfo *hi, char *extra_args, int buflen)
+static char *parse_host_arg(struct hostinfo *hi, char *extra_args, int buflen)
 {
 	char *val;
 	int vallen;
@@ -602,6 +611,39 @@ static void parse_host_arg(struct hostinfo *hi, char *extra_args, int buflen)
 		if (extra_args < end && *extra_args)
 			die("Invalid request");
 	}
+
+	return extra_args;
+}
+
+static void parse_extra_args(struct argv_array *env, const char *extra_args,
+			     int buflen)
+{
+	const char *end = extra_args + buflen;
+	struct strbuf git_protocol = STRBUF_INIT;
+
+	for (; extra_args < end; extra_args += strlen(extra_args) + 1) {
+		const char *arg = extra_args;
+
+		/*
+		 * Parse the extra arguments, adding most to 'git_protocol'
+		 * which will be used to set the 'GIT_PROTOCOL' envvar in the
+		 * service that will be run.
+		 *
+		 * If there ends up being a particular arg in the future that
+		 * git-daemon needs to parse specifically (like the 'host' arg)
+		 * then it can be parsed here and not added to 'git_protocol'.
+		 */
+		if (*arg) {
+			if (git_protocol.len > 0)
+				strbuf_addch(&git_protocol, ':');
+			strbuf_addstr(&git_protocol, arg);
+		}
+	}
+
+	if (git_protocol.len > 0)
+		argv_array_pushf(env, GIT_PROTOCOL_ENVIRONMENT "=%s",
+				 git_protocol.buf);
+	strbuf_release(&git_protocol);
 }
 
 /*
@@ -695,6 +737,7 @@ static int execute(void)
 	int pktlen, len, i;
 	char *addr = getenv("REMOTE_ADDR"), *port = getenv("REMOTE_PORT");
 	struct hostinfo hi;
+	struct argv_array env = ARGV_ARRAY_INIT;
 
 	hostinfo_init(&hi);
 
@@ -716,8 +759,14 @@ static int execute(void)
 		pktlen--;
 	}
 
-	if (len != pktlen)
-		parse_host_arg(&hi, line + len + 1, pktlen - len - 1);
+	if (len != pktlen) {
+		const char *extra_args;
+		/* retrieve host */
+		extra_args = parse_host_arg(&hi, line + len + 1, pktlen - len - 1);
+
+		/* parse additional args hidden behind a second NUL byte */
+		parse_extra_args(&env, extra_args + 1, pktlen - (extra_args - line) - 1);
+	}
 
 	for (i = 0; i < ARRAY_SIZE(daemon_service); i++) {
 		struct daemon_service *s = &(daemon_service[i]);
@@ -730,13 +779,15 @@ static int execute(void)
 			 * Note: The directory here is probably context sensitive,
 			 * and might depend on the actual service being performed.
 			 */
-			int rc = run_service(arg, s, &hi);
+			int rc = run_service(arg, s, &hi, &env);
 			hostinfo_clear(&hi);
+			argv_array_clear(&env);
 			return rc;
 		}
 	}
 
 	hostinfo_clear(&hi);
+	argv_array_clear(&env);
 	logerror("Protocol error: '%s'", line);
 	return -1;
 }
-- 
2.14.1.690.gbb1197296e-goog



* [PATCH 4/8] upload-pack, receive-pack: introduce protocol version 1
  2017-09-13 21:54 [PATCH 0/8] protocol transition Brandon Williams
                   ` (2 preceding siblings ...)
  2017-09-13 21:54 ` [PATCH 3/8] daemon: recognize hidden request arguments Brandon Williams
@ 2017-09-13 21:54 ` Brandon Williams
  2017-09-13 21:54 ` [PATCH 5/8] connect: teach client to recognize v1 server response Brandon Williams
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 161+ messages in thread
From: Brandon Williams @ 2017-09-13 21:54 UTC (permalink / raw)
  To: git
  Cc: peff, sbeller, gitster, jrnieder, bturner, git, jonathantanmy,
	Brandon Williams

Teach upload-pack and receive-pack to understand and respond using
protocol version 1, if requested.

Protocol version 1 is simply the original and current protocol (what I'm
calling version 0) with the addition of a single packet line, which
precedes the ref advertisement, indicating the protocol version being
spoken.
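
For readers unfamiliar with the framing, the following sketch shows what that extra packet line looks like on the wire. It is a standalone illustration, not the pkt-line.c implementation; the name `format_pkt_line` is hypothetical, and the 65520-byte cap is assumed to match git's maximum pkt-line length.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: frame payload as a pkt-line in out.
 * A pkt-line is four hex digits giving the total length (payload plus
 * the four length bytes themselves), followed by the payload.
 * Returns the framed length, or -1 if it does not fit.
 */
int format_pkt_line(const char *payload, char *out, size_t outsz)
{
	size_t n = strlen(payload) + 4;

	/* 65520 is assumed here as the largest pkt-line git will send. */
	if (n > 65520 || n + 1 > outsz)
		return -1;
	snprintf(out, outsz, "%04x%s", (unsigned int)n, payload);
	return (int)n;
}
```

With the payload "version 1\n" (10 bytes) the framed line is "000eversion 1\n", since 10 + 4 = 14 = 0x0e.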

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 builtin/receive-pack.c | 14 ++++++++++++++
 upload-pack.c          | 17 ++++++++++++++++-
 2 files changed, 30 insertions(+), 1 deletion(-)

diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 52c63ebfd..aebe77cc3 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -24,6 +24,7 @@
 #include "tmp-objdir.h"
 #include "oidset.h"
 #include "packfile.h"
+#include "protocol.h"
 
 static const char * const receive_pack_usage[] = {
 	N_("git receive-pack <git-dir>"),
@@ -1963,6 +1964,19 @@ int cmd_receive_pack(int argc, const char **argv, const char *prefix)
 	else if (0 <= receive_unpack_limit)
 		unpack_limit = receive_unpack_limit;
 
+	switch (determine_protocol_version_server()) {
+	case protocol_v1:
+		if (advertise_refs || !stateless_rpc)
+			packet_write_fmt(1, "version 1\n");
+		/*
+		 * v1 is just the original protocol with a version string,
+		 * so just fall through after writing the version string.
+		 */
+	case protocol_v0:
+	default:
+		break;
+	}
+
 	if (advertise_refs || !stateless_rpc) {
 		write_head_info();
 	}
diff --git a/upload-pack.c b/upload-pack.c
index 7efff2fbf..5cab39819 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -18,6 +18,7 @@
 #include "parse-options.h"
 #include "argv-array.h"
 #include "prio-queue.h"
+#include "protocol.h"
 
 static const char * const upload_pack_usage[] = {
 	N_("git upload-pack [<options>] <dir>"),
@@ -1067,6 +1068,20 @@ int cmd_main(int argc, const char **argv)
 		die("'%s' does not appear to be a git repository", dir);
 
 	git_config(upload_pack_config, NULL);
-	upload_pack();
+
+	switch (determine_protocol_version_server()) {
+	case protocol_v1:
+		if (advertise_refs || !stateless_rpc)
+			packet_write_fmt(1, "version 1\n");
+		/*
+		 * v1 is just the original protocol with a version string,
+		 * so just fall through after writing the version string.
+		 */
+	case protocol_v0:
+	default:
+		upload_pack();
+		break;
+	}
+
 	return 0;
 }
-- 
2.14.1.690.gbb1197296e-goog



* [PATCH 5/8] connect: teach client to recognize v1 server response
  2017-09-13 21:54 [PATCH 0/8] protocol transition Brandon Williams
                   ` (3 preceding siblings ...)
  2017-09-13 21:54 ` [PATCH 4/8] upload-pack, receive-pack: introduce protocol version 1 Brandon Williams
@ 2017-09-13 21:54 ` Brandon Williams
  2017-09-13 21:54 ` [PATCH 6/8] connect: tell server that the client understands v1 Brandon Williams
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 161+ messages in thread
From: Brandon Williams @ 2017-09-13 21:54 UTC (permalink / raw)
  To: git
  Cc: peff, sbeller, gitster, jrnieder, bturner, git, jonathantanmy,
	Brandon Williams

Teach a client to recognize that a server understands protocol v1 by
looking at the first pkt-line the server sends in response.  This is
done by looking for the response "version 1" sent by upload-pack or
receive-pack.
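
The decision the client has to make on that first pkt-line can be sketched as below. This is a simplified standalone version, not the code added to connect.c: it returns a value where the real code calls die(), the enum and function names are invented for the sketch, and it assumes the trailing newline has already been stripped from the line.

```c
#include <string.h>

/* Hypothetical names, chosen to avoid clashing with protocol.h. */
enum sketch_version {
	SKETCH_UNKNOWN = -1,
	SKETCH_V0 = 0,
	SKETCH_V1 = 1
};

/*
 * Classify the first line received from the server.  A line that does
 * not start with "version " is treated as the first ref of a v0
 * advertisement; "version 1" selects v1; anything else is rejected.
 */
enum sketch_version classify_first_line(const char *line)
{
	static const char prefix[] = "version ";
	size_t plen = sizeof(prefix) - 1;

	if (strncmp(line, prefix, plen))
		return SKETCH_V0;	/* no version line at all */
	if (!strcmp(line + plen, "1"))
		return SKETCH_V1;
	return SKETCH_UNKNOWN;		/* includes an explicit "version 0" */
}
```

Only in the v1 case does the client skip the line and read on; in the v0 case the same line still has to be processed as a ref.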

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 connect.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/connect.c b/connect.c
index 49b28b83b..2702e1f2e 100644
--- a/connect.c
+++ b/connect.c
@@ -11,6 +11,7 @@
 #include "string-list.h"
 #include "sha1-array.h"
 #include "transport.h"
+#include "protocol.h"
 
 static char *server_capabilities;
 static const char *parse_feature_value(const char *, const char *, int *);
@@ -142,6 +143,27 @@ struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
 		if (len < 0)
 			die_initial_contact(saw_response);
 
+		/* Only check for version information on first response */
+		if (!saw_response) {
+			switch (determine_protocol_version_client(buffer)) {
+			case protocol_v1:
+				/*
+				 * First pkt-line contained the version string.
+				 * Continue on to process the ref advertisement.
+				 */
+				continue;
+			case protocol_v0:
+				/*
+				 * Server is speaking protocol v0 and sent a
+				 * ref so we need to process it.
+				 */
+				break;
+			default:
+				die("server is speaking an unknown protocol");
+				break;
+			}
+		}
+
 		if (!len)
 			break;
 
-- 
2.14.1.690.gbb1197296e-goog



* [PATCH 6/8] connect: tell server that the client understands v1
  2017-09-13 21:54 [PATCH 0/8] protocol transition Brandon Williams
                   ` (4 preceding siblings ...)
  2017-09-13 21:54 ` [PATCH 5/8] connect: teach client to recognize v1 server response Brandon Williams
@ 2017-09-13 21:54 ` Brandon Williams
  2017-09-13 21:54 ` [PATCH 7/8] http: " Brandon Williams
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 161+ messages in thread
From: Brandon Williams @ 2017-09-13 21:54 UTC (permalink / raw)
  To: git
  Cc: peff, sbeller, gitster, jrnieder, bturner, git, jonathantanmy,
	Brandon Williams

Teach the connection logic to tell a server that it understands protocol
v1.  This is done in two different ways for the built-in protocols.

1. git://
   A normal request is structured as "command path/to/repo\0host=..\0"
   and due to a bug in an old version of git-daemon 73bb33a94 (daemon:
   Strictly parse the "extra arg" part of the command, 2009-06-04) we
   aren't able to place any extra args (separated by NULs) besides the
   host.  In order to get around this limitation put protocol version
   information after a second NUL byte so the request is structured
   like: "command path/to/repo\0host=..\0\0version=1\0".  git-daemon can
   then parse out the version number and set GIT_PROTOCOL.

2. ssh://, file://
   Set GIT_PROTOCOL envvar with the desired protocol version.  The
   envvar can be sent across ssh by using '-o SendEnv=GIT_PROTOCOL' and
   having the server whitelist this envvar.
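
The git:// case (1) can be sketched as follows. This is an illustration under stated assumptions rather than the connect.c change itself: `build_daemon_request` is a hypothetical name, a fixed-size buffer stands in for a strbuf, and the host and path used in any call are placeholders.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: build the payload of a git:// request in out.
 * The payload contains NUL bytes, so the length is returned explicitly
 * instead of relying on strlen().  Returns -1 if out is too small.
 */
int build_daemon_request(char *out, size_t outsz, const char *prog,
			 const char *path, const char *host, int version)
{
	int n, m;

	/* "prog path\0host=...\0": the part every git-daemon accepts */
	n = snprintf(out, outsz, "%s %s%chost=%s%c", prog, path, 0, host, 0);
	if (n < 0 || (size_t)n >= outsz)
		return -1;
	if (version <= 0)
		return n;	/* version 0 sends nothing extra */

	/* a second NUL, then "version=N\0" hidden behind it */
	m = snprintf(out + n, outsz - n, "%cversion=%d%c", 0, version, 0);
	if (m < 0 || (size_t)m >= outsz - n)
		return -1;
	return n + m;
}
```

Because the length is tracked separately, the resulting buffer can be handed to a function that writes an arbitrary buffer as one pkt-line, which is what packet_write() from patch 1 is for.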

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 connect.c              |  37 ++++++--
 t/t5700-protocol-v1.sh | 223 +++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 255 insertions(+), 5 deletions(-)
 create mode 100755 t/t5700-protocol-v1.sh

diff --git a/connect.c b/connect.c
index 2702e1f2e..40b388acd 100644
--- a/connect.c
+++ b/connect.c
@@ -815,6 +815,7 @@ struct child_process *git_connect(int fd[2], const char *url,
 		printf("Diag: path=%s\n", path ? path : "NULL");
 		conn = NULL;
 	} else if (protocol == PROTO_GIT) {
+		struct strbuf request = STRBUF_INIT;
 		/*
 		 * Set up virtual host information based on where we will
 		 * connect, unless the user has overridden us in
@@ -842,12 +843,24 @@ struct child_process *git_connect(int fd[2], const char *url,
 		 * Note: Do not add any other headers here!  Doing so
 		 * will cause older git-daemon servers to crash.
 		 */
-		packet_write_fmt(fd[1],
-			     "%s %s%chost=%s%c",
-			     prog, path, 0,
-			     target_host, 0);
+		strbuf_addf(&request,
+			    "%s %s%chost=%s%c",
+			    prog, path, 0,
+			    target_host, 0);
+
+		/* If using a new version put that stuff here after a second null byte */
+		if (get_protocol_version_config() > 0) {
+			strbuf_addch(&request, '\0');
+			strbuf_addf(&request, "version=%d%c",
+				    get_protocol_version_config(), '\0');
+		}
+
+		packet_write(fd[1], request.buf, request.len);
+
 		free(target_host);
+		strbuf_release(&request);
 	} else {
+		const char *const *var;
 		conn = xmalloc(sizeof(*conn));
 		child_process_init(conn);
 
@@ -859,7 +872,9 @@ struct child_process *git_connect(int fd[2], const char *url,
 		sq_quote_buf(&cmd, path);
 
 		/* remove repo-local variables from the environment */
-		conn->env = local_repo_env;
+		for (var = local_repo_env; *var; var++)
+			argv_array_push(&conn->env_array, *var);
+
 		conn->use_shell = 1;
 		conn->in = conn->out = -1;
 		if (protocol == PROTO_SSH) {
@@ -912,6 +927,14 @@ struct child_process *git_connect(int fd[2], const char *url,
 			}
 
 			argv_array_push(&conn->args, ssh);
+
+			if (get_protocol_version_config() > 0) {
+				argv_array_push(&conn->args, "-o");
+				argv_array_push(&conn->args, "SendEnv=" GIT_PROTOCOL_ENVIRONMENT);
+				argv_array_pushf(&conn->env_array, GIT_PROTOCOL_ENVIRONMENT "=version=%d",
+						 get_protocol_version_config());
+			}
+
 			if (flags & CONNECT_IPV4)
 				argv_array_push(&conn->args, "-4");
 			else if (flags & CONNECT_IPV6)
@@ -926,6 +949,10 @@ struct child_process *git_connect(int fd[2], const char *url,
 			argv_array_push(&conn->args, ssh_host);
 		} else {
 			transport_check_allowed("file");
+			if (get_protocol_version_config() > 0) {
+				argv_array_pushf(&conn->env_array, GIT_PROTOCOL_ENVIRONMENT "=version=%d",
+						 get_protocol_version_config());
+			}
 		}
 		argv_array_push(&conn->args, cmd.buf);
 
diff --git a/t/t5700-protocol-v1.sh b/t/t5700-protocol-v1.sh
new file mode 100755
index 000000000..1988bbce6
--- /dev/null
+++ b/t/t5700-protocol-v1.sh
@@ -0,0 +1,223 @@
+#!/bin/sh
+
+test_description='test git wire-protocol transition'
+
+TEST_NO_CREATE_REPO=1
+
+. ./test-lib.sh
+
+# Test protocol v1 with 'git://' transport
+#
+. "$TEST_DIRECTORY"/lib-git-daemon.sh
+start_git_daemon --export-all --enable=receive-pack
+daemon_parent=$GIT_DAEMON_DOCUMENT_ROOT_PATH/parent
+
+test_expect_success 'create repo to be served by git-daemon' '
+	git init "$daemon_parent" &&
+	test_commit -C "$daemon_parent" one
+'
+
+test_expect_success 'clone with git:// using protocol v1' '
+	GIT_TRACE_PACKET=1 git -c protocol.version=1 \
+		clone "$GIT_DAEMON_URL/parent" daemon_child 2>log &&
+
+	git -C daemon_child log -1 --format=%s >actual &&
+	git -C "$daemon_parent" log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+
+	# Client requested to use protocol v1
+	grep "version=1" log &&
+	# Server responded using protocol v1
+	grep "clone< version 1" log
+'
+
+test_expect_success 'fetch with git:// using protocol v1' '
+	test_commit -C "$daemon_parent" two &&
+
+	GIT_TRACE_PACKET=1 git -C daemon_child -c protocol.version=1 \
+		fetch 2>log &&
+
+	git -C daemon_child log -1 --format=%s FETCH_HEAD >actual &&
+	git -C "$daemon_parent" log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+
+	# Client requested to use protocol v1
+	grep "version=1" log &&
+	# Server responded using protocol v1
+	grep "fetch< version 1" log
+'
+
+test_expect_success 'pull with git:// using protocol v1' '
+	GIT_TRACE_PACKET=1 git -C daemon_child -c protocol.version=1 \
+		pull 2>log &&
+
+	git -C daemon_child log -1 --format=%s >actual &&
+	git -C "$daemon_parent" log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+
+	# Client requested to use protocol v1
+	grep "version=1" log &&
+	# Server responded using protocol v1
+	grep "fetch< version 1" log
+'
+
+test_expect_success 'push with git:// using protocol v1' '
+	test_commit -C daemon_child three &&
+
+	# Since the repository being served isnt bare we need to push to
+	# another branch explicitly to avoid mangling the master branch
+	GIT_TRACE_PACKET=1 git -C daemon_child -c protocol.version=1 \
+		push origin HEAD:client_branch 2>log &&
+
+	git -C daemon_child log -1 --format=%s >actual &&
+	git -C "$daemon_parent" log -1 --format=%s client_branch >expect &&
+	test_cmp expect actual &&
+
+	# Client requested to use protocol v1
+	grep "version=1" log &&
+	# Server responded using protocol v1
+	grep "push< version 1" log
+'
+
+stop_git_daemon
+
+# Test protocol v1 with 'file://' transport
+#
+test_expect_success 'create repo to be served by file:// transport' '
+	git init file_parent &&
+	test_commit -C file_parent one
+'
+
+test_expect_success 'clone with file:// using protocol v1' '
+	GIT_TRACE_PACKET=1 git -c protocol.version=1 \
+		clone "file://$(pwd)/file_parent" file_child 2>log &&
+
+	git -C file_child log -1 --format=%s >actual &&
+	git -C file_parent log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+
+	# Server responded using protocol v1
+	grep "clone< version 1" log
+'
+
+test_expect_success 'fetch with file:// using protocol v1' '
+	test_commit -C file_parent two &&
+
+	GIT_TRACE_PACKET=1 git -C file_child -c protocol.version=1 \
+		fetch 2>log &&
+
+	git -C file_child log -1 --format=%s FETCH_HEAD >actual &&
+	git -C file_parent log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+
+	# Server responded using protocol v1
+	grep "fetch< version 1" log
+'
+
+test_expect_success 'pull with file:// using protocol v1' '
+	GIT_TRACE_PACKET=1 git -C file_child -c protocol.version=1 \
+		pull 2>log &&
+
+	git -C file_child log -1 --format=%s >actual &&
+	git -C file_parent log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+
+	# Server responded using protocol v1
+	grep "fetch< version 1" log
+'
+
+test_expect_success 'push with file:// using protocol v1' '
+	test_commit -C file_child three &&
+
+	# Since the repository being served isnt bare we need to push to
+	# another branch explicitly to avoid mangling the master branch
+	GIT_TRACE_PACKET=1 git -C file_child -c protocol.version=1 \
+		push origin HEAD:client_branch 2>log &&
+
+	git -C file_child log -1 --format=%s >actual &&
+	git -C file_parent log -1 --format=%s client_branch >expect &&
+	test_cmp expect actual &&
+
+	# Server responded using protocol v1
+	grep "push< version 1" log
+'
+
+# Test protocol v1 with 'ssh://' transport
+#
+test_expect_success 'setup ssh wrapper' '
+	GIT_SSH="$GIT_BUILD_DIR/t/helper/test-fake-ssh$X" &&
+	export GIT_SSH &&
+	export TRASH_DIRECTORY &&
+	>"$TRASH_DIRECTORY"/ssh-output
+'
+
+expect_ssh () {
+	test_when_finished '(cd "$TRASH_DIRECTORY" && rm -f ssh-expect && >ssh-output)' &&
+	echo "ssh: -o SendEnv=GIT_PROTOCOL myhost $1 '$PWD/ssh_parent'" >"$TRASH_DIRECTORY/ssh-expect" &&
+	(cd "$TRASH_DIRECTORY" && test_cmp ssh-expect ssh-output)
+}
+
+test_expect_success 'create repo to be served by ssh:// transport' '
+	git init ssh_parent &&
+	test_commit -C ssh_parent one
+'
+
+test_expect_success 'clone with ssh:// using protocol v1' '
+	GIT_TRACE_PACKET=1 git -c protocol.version=1 \
+		clone "ssh://myhost:$(pwd)/ssh_parent" ssh_child 2>log &&
+	expect_ssh git-upload-pack &&
+
+	git -C ssh_child log -1 --format=%s >actual &&
+	git -C ssh_parent log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+
+	# Server responded using protocol v1
+	grep "clone< version 1" log
+'
+
+test_expect_success 'fetch with ssh:// using protocol v1' '
+	test_commit -C ssh_parent two &&
+
+	GIT_TRACE_PACKET=1 git -C ssh_child -c protocol.version=1 \
+		fetch 2>log &&
+	expect_ssh git-upload-pack &&
+
+	git -C ssh_child log -1 --format=%s FETCH_HEAD >actual &&
+	git -C ssh_parent log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+
+	# Server responded using protocol v1
+	grep "fetch< version 1" log
+'
+
+test_expect_success 'pull with ssh:// using protocol v1' '
+	GIT_TRACE_PACKET=1 git -C ssh_child -c protocol.version=1 \
+		pull 2>log &&
+	expect_ssh git-upload-pack &&
+
+	git -C ssh_child log -1 --format=%s >actual &&
+	git -C ssh_parent log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+
+	# Server responded using protocol v1
+	grep "fetch< version 1" log
+'
+
+test_expect_success 'push with ssh:// using protocol v1' '
+	test_commit -C ssh_child three &&
+
+	# Since the repository being served is not bare we need to push to
+	# another branch explicitly to avoid mangling the master branch
+	GIT_TRACE_PACKET=1 git -C ssh_child -c protocol.version=1 \
+		push origin HEAD:client_branch 2>log &&
+	expect_ssh git-receive-pack &&
+
+	git -C ssh_child log -1 --format=%s >actual &&
+	git -C ssh_parent log -1 --format=%s client_branch >expect &&
+	test_cmp expect actual &&
+
+	# Server responded using protocol v1
+	grep "push< version 1" log
+'
+
+test_done
-- 
2.14.1.690.gbb1197296e-goog



* [PATCH 7/8] http: tell server that the client understands v1
  2017-09-13 21:54 [PATCH 0/8] protocol transition Brandon Williams
                   ` (5 preceding siblings ...)
  2017-09-13 21:54 ` [PATCH 6/8] connect: tell server that the client understands v1 Brandon Williams
@ 2017-09-13 21:54 ` Brandon Williams
  2017-09-13 21:54 ` [PATCH 8/8] i5700: add interop test for protocol transition Brandon Williams
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 161+ messages in thread
From: Brandon Williams @ 2017-09-13 21:54 UTC (permalink / raw)
  To: git
  Cc: peff, sbeller, gitster, jrnieder, bturner, git, jonathantanmy,
	Brandon Williams

Tell a server that protocol v1 can be used by sending the http header
'Git-Protocol' indicating this.

Also teach the apache http server to pass through the 'Git-Protocol'
header as an environment variable 'GIT_PROTOCOL'.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 cache.h                 |  2 ++
 http.c                  | 18 +++++++++++++
 t/lib-httpd/apache.conf |  7 +++++
 t/t5700-protocol-v1.sh  | 69 +++++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 96 insertions(+)

diff --git a/cache.h b/cache.h
index 8839b1ed4..82a643968 100644
--- a/cache.h
+++ b/cache.h
@@ -450,6 +450,8 @@ static inline enum object_type object_type(unsigned int mode)
  * 'key[=value]'.  Presence of unknown keys must be tolerated.
  */
 #define GIT_PROTOCOL_ENVIRONMENT "GIT_PROTOCOL"
+/* HTTP header used to handshake the wire protocol */
+#define GIT_PROTOCOL_HEADER "Git-Protocol"
 
 /*
  * This environment variable is expected to contain a boolean indicating
diff --git a/http.c b/http.c
index 9e40a465f..ffb719216 100644
--- a/http.c
+++ b/http.c
@@ -12,6 +12,7 @@
 #include "gettext.h"
 #include "transport.h"
 #include "packfile.h"
+#include "protocol.h"
 
 static struct trace_key trace_curl = TRACE_KEY_INIT(CURL);
 #if LIBCURL_VERSION_NUM >= 0x070a08
@@ -897,6 +898,21 @@ static void set_from_env(const char **var, const char *envname)
 		*var = val;
 }
 
+static void protocol_http_header(void)
+{
+	if (get_protocol_version_config() > 0) {
+		struct strbuf protocol_header = STRBUF_INIT;
+
+		strbuf_addf(&protocol_header, GIT_PROTOCOL_HEADER ": version=%d",
+			    get_protocol_version_config());
+
+
+		extra_http_headers = curl_slist_append(extra_http_headers,
+						       protocol_header.buf);
+		strbuf_release(&protocol_header);
+	}
+}
+
 void http_init(struct remote *remote, const char *url, int proactive_auth)
 {
 	char *low_speed_limit;
@@ -927,6 +943,8 @@ void http_init(struct remote *remote, const char *url, int proactive_auth)
 	if (remote)
 		var_override(&http_proxy_authmethod, remote->http_proxy_authmethod);
 
+	protocol_http_header();
+
 	pragma_header = curl_slist_append(http_copy_default_headers(),
 		"Pragma: no-cache");
 	no_pragma_header = curl_slist_append(http_copy_default_headers(),
diff --git a/t/lib-httpd/apache.conf b/t/lib-httpd/apache.conf
index 0642ae7e6..df1943631 100644
--- a/t/lib-httpd/apache.conf
+++ b/t/lib-httpd/apache.conf
@@ -67,6 +67,9 @@ LockFile accept.lock
 <IfModule !mod_unixd.c>
 	LoadModule unixd_module modules/mod_unixd.so
 </IfModule>
+<IfModule !mod_setenvif.c>
+	LoadModule setenvif_module modules/mod_setenvif.so
+</IfModule>
 </IfVersion>
 
 PassEnv GIT_VALGRIND
@@ -76,6 +79,10 @@ PassEnv ASAN_OPTIONS
 PassEnv GIT_TRACE
 PassEnv GIT_CONFIG_NOSYSTEM
 
+<IfVersion >= 2.4>
+	SetEnvIf Git-Protocol ".*" GIT_PROTOCOL=$0
+</IfVersion>
+
 Alias /dumb/ www/
 Alias /auth/dumb/ www/auth/dumb/
 
diff --git a/t/t5700-protocol-v1.sh b/t/t5700-protocol-v1.sh
index 1988bbce6..222265127 100755
--- a/t/t5700-protocol-v1.sh
+++ b/t/t5700-protocol-v1.sh
@@ -220,4 +220,73 @@ test_expect_success 'push with ssh:// using protocol v1' '
 	grep "push< version 1" log
 '
 
+# Test protocol v1 with 'http://' transport
+#
+. "$TEST_DIRECTORY"/lib-httpd.sh
+start_httpd
+
+test_expect_success 'create repo to be served by http:// transport' '
+	git init "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" config http.receivepack true &&
+	test_commit -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" one
+'
+
+test_expect_success 'clone with http:// using protocol v1' '
+	GIT_TRACE_PACKET=1 GIT_TRACE_CURL=1 git -c protocol.version=1 \
+		clone "$HTTPD_URL/smart/http_parent" http_child 2>log &&
+
+	git -C http_child log -1 --format=%s >actual &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+
+	# Client requested to use protocol v1
+	grep "Git-Protocol: version=1" log &&
+	# Server responded using protocol v1
+	grep "git< version 1" log
+'
+
+test_expect_success 'fetch with http:// using protocol v1' '
+	test_commit -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" two &&
+
+	GIT_TRACE_PACKET=1 git -C http_child -c protocol.version=1 \
+		fetch 2>log &&
+
+	git -C http_child log -1 --format=%s FETCH_HEAD >actual &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+
+	# Server responded using protocol v1
+	grep "git< version 1" log
+'
+
+test_expect_success 'pull with http:// using protocol v1' '
+	GIT_TRACE_PACKET=1 git -C http_child -c protocol.version=1 \
+		pull 2>log &&
+
+	git -C http_child log -1 --format=%s >actual &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+
+	# Server responded using protocol v1
+	grep "git< version 1" log
+'
+
+test_expect_success 'push with http:// using protocol v1' '
+	test_commit -C http_child three &&
+
+	# Since the repository being served is not bare we need to push to
+	# another branch explicitly to avoid mangling the master branch
+	GIT_TRACE_PACKET=1 git -C http_child -c protocol.version=1 \
+		push origin HEAD:client_branch 2>log &&
+
+	git -C http_child log -1 --format=%s >actual &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" log -1 --format=%s client_branch >expect &&
+	test_cmp expect actual &&
+
+	# Server responded using protocol v1
+	grep "git< version 1" log
+'
+
+stop_httpd
+
 test_done
-- 
2.14.1.690.gbb1197296e-goog



* [PATCH 8/8] i5700: add interop test for protocol transition
  2017-09-13 21:54 [PATCH 0/8] protocol transition Brandon Williams
                   ` (6 preceding siblings ...)
  2017-09-13 21:54 ` [PATCH 7/8] http: " Brandon Williams
@ 2017-09-13 21:54 ` Brandon Williams
  2017-09-20 18:48 ` [PATCH 1.5/8] connect: die when a capability line comes after a ref Brandon Williams
  2017-09-26 23:56 ` [PATCH v2 0/9] protocol transition Brandon Williams
  9 siblings, 0 replies; 161+ messages in thread
From: Brandon Williams @ 2017-09-13 21:54 UTC (permalink / raw)
  To: git
  Cc: peff, sbeller, gitster, jrnieder, bturner, git, jonathantanmy,
	Brandon Williams

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 t/interop/i5700-protocol-transition.sh | 68 ++++++++++++++++++++++++++++++++++
 1 file changed, 68 insertions(+)
 create mode 100755 t/interop/i5700-protocol-transition.sh

diff --git a/t/interop/i5700-protocol-transition.sh b/t/interop/i5700-protocol-transition.sh
new file mode 100755
index 000000000..9e83428a8
--- /dev/null
+++ b/t/interop/i5700-protocol-transition.sh
@@ -0,0 +1,68 @@
+#!/bin/sh
+
+VERSION_A=.
+VERSION_B=v2.0.0
+
+: ${LIB_GIT_DAEMON_PORT:=5600}
+LIB_GIT_DAEMON_COMMAND='git.b daemon'
+
+test_description='clone and fetch by client who is trying to use a new protocol'
+. ./interop-lib.sh
+. "$TEST_DIRECTORY"/lib-git-daemon.sh
+
+start_git_daemon --export-all
+
+repo=$GIT_DAEMON_DOCUMENT_ROOT_PATH/repo
+
+test_expect_success "create repo served by $VERSION_B" '
+	git.b init "$repo" &&
+	git.b -C "$repo" commit --allow-empty -m one
+'
+
+test_expect_success "git:// clone with $VERSION_A and protocol v1" '
+	GIT_TRACE_PACKET=1 git.a -c protocol.version=1 clone "$GIT_DAEMON_URL/repo" child 2>log &&
+	git.a -C child log -1 --format=%s >actual &&
+	git.b -C "$repo" log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+	grep "version=1" log
+'
+
+test_expect_success "git:// fetch with $VERSION_A and protocol v1" '
+	git.b -C "$repo" commit --allow-empty -m two &&
+	git.b -C "$repo" log -1 --format=%s >expect &&
+
+	GIT_TRACE_PACKET=1 git.a -C child -c protocol.version=1 fetch 2>log &&
+	git.a -C child log -1 --format=%s FETCH_HEAD >actual &&
+
+	test_cmp expect actual &&
+	grep "version=1" log &&
+	! grep "version 1" log
+'
+
+stop_git_daemon
+
+test_expect_success "create repo served by $VERSION_B" '
+	git.b init parent &&
+	git.b -C parent commit --allow-empty -m one
+'
+
+test_expect_success "file:// clone with $VERSION_A and protocol v1" '
+	GIT_TRACE_PACKET=1 git.a -c protocol.version=1 clone --upload-pack="git.b upload-pack" parent child2 2>log &&
+	git.a -C child2 log -1 --format=%s >actual &&
+	git.b -C parent log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+	! grep "version 1" log
+'
+
+test_expect_success "file:// fetch with $VERSION_A and protocol v1" '
+	git.b -C parent commit --allow-empty -m two &&
+	git.b -C parent log -1 --format=%s >expect &&
+
+	GIT_TRACE_PACKET=1 git.a -C child2 -c protocol.version=1 fetch --upload-pack="git.b upload-pack" 2>log &&
+	git.a -C child2 log -1 --format=%s FETCH_HEAD >actual &&
+
+	test_cmp expect actual &&
+	! grep "version 1" log
+'
+
+test_done
-- 
2.14.1.690.gbb1197296e-goog



* Re: [PATCH 2/8] protocol: introduce protocol extention mechanisms
  2017-09-13 21:54 ` [PATCH 2/8] protocol: introduce protocol extention mechanisms Brandon Williams
@ 2017-09-13 22:27   ` Stefan Beller
  2017-09-18 17:02     ` Brandon Williams
  0 siblings, 1 reply; 161+ messages in thread
From: Stefan Beller @ 2017-09-13 22:27 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, Jeff King, Junio C Hamano, Jonathan Nieder, bturner,
	Jeff Hostetler, Jonathan Tan

On Wed, Sep 13, 2017 at 2:54 PM, Brandon Williams <bmwill@google.com> wrote:
> Create protocol.{c,h} and provide functions which future servers and
> clients can use to determine which protocol to use or is being used.
>
> Also introduce the 'GIT_PROTOCOL' environment variable which will be
> used to communicate a colon separated list of keys with optional values
> to a server.  Unknown keys and values must be tolerated.  This mechanism
> is used to communicate which version of the wire protocol a client would
> like to use with a server.
>
> Signed-off-by: Brandon Williams <bmwill@google.com>
> ---
>  Documentation/config.txt | 16 +++++++++++
>  Documentation/git.txt    |  5 ++++
>  Makefile                 |  1 +
>  cache.h                  |  7 +++++
>  protocol.c               | 72 ++++++++++++++++++++++++++++++++++++++++++++++++
>  protocol.h               | 15 ++++++++++
>  6 files changed, 116 insertions(+)
>  create mode 100644 protocol.c
>  create mode 100644 protocol.h
>
> diff --git a/Documentation/config.txt b/Documentation/config.txt
> index dc4e3f58a..d5b28a32c 100644
> --- a/Documentation/config.txt
> +++ b/Documentation/config.txt
> @@ -2517,6 +2517,22 @@ The protocol names currently used by git are:
>      `hg` to allow the `git-remote-hg` helper)
>  --
>
> +protocol.version::

It would be cool to be able to configure a set of acceptable versions.
(I am not sure if that can be deferred to a later patch.)

  Consider we'd have versions 0,1,2,3,4 in the future:
  In an ideal world the client and server would talk using v4
  as it is the most advanced protocol, right?
  Maybe a security/performance issue is found on the server side
  with say protocol v3. Still no big deal as we are speaking v4.
  But then consider an issue is found on the client side with v4.
  Then the client would happily talk 0..3 while the server would
  love to talk using 0,1,2,4.

The way I think about protocol version negotiation is that
both parties involved have a set of versions that they are willing
to speak (they might understand more than that set, but the
user forbade some), and the goal of the negotiation is to find
the highest version number that is part of both the server set
and client set. So quite naturally with this line of thinking the
configuration is to configure a set of versions, which is what
I propose here. Maybe even in the wire format, separated
with colons?
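
To make the idea concrete, here is a minimal sketch of such a
negotiation. It is not part of this series: the helper name and the
bitmask representation (bit N set means "version N is acceptable") are
assumptions made purely for illustration.

```c
#include <stdio.h>

/*
 * Hypothetical helper, not taken from the patches: return the highest
 * protocol version present in both sets, or -1 if the sets are disjoint.
 * Each set is a bitmask where bit N stands for protocol version N.
 */
static int highest_common_version(unsigned client_set, unsigned server_set)
{
	unsigned common = client_set & server_set;
	int version = -1;

	/* Shift until no bits remain; the last bit seen is the highest. */
	while (common) {
		version++;
		common >>= 1;
	}
	return version;
}
```

With the sets from the scenario above (client 0..3 is 0x0f, server
0,1,2,4 is 0x17) the intersection is {0,1,2}, so the helper picks 2.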

> +       If set, clients will attempt to communicate with a server using
> +       the specified protocol version.  If unset, no attempt will be
> +       made by the client to communicate using a particular protocol
> +       version, this results in protocol version 0 being used.

This sounds as if we're going to be really shy at first and only
users that care will try out new versions at their own risk.
From a user's POV this may be frustrating as I would imagine that
people want to run

  git config --global protocol.version 2

to try it out and then realize that some of their hosts do not speak
2, so they have to actually configure it per repo/remote.

> +       Supported versions:

> +* `0` - the original wire protocol.

In the future this may be misleading as it doesn't say as of when the
protocol counts as "original", e.g. are capabilities already supported in "original"?

Maybe phrase it as "wire protocol as of v2.14" ? (Though this sounds
as if new capabilities added in the future are not allowed)


> +
> +extern enum protocol_version parse_protocol_version(const char *value);
> +extern enum protocol_version get_protocol_version_config(void);
> +extern enum protocol_version determine_protocol_version_server(void);
> +extern enum protocol_version determine_protocol_version_client(const char *server_response);

Here is a good place to have some comments.


* Re: [PATCH 3/8] daemon: recognize hidden request arguments
  2017-09-13 21:54 ` [PATCH 3/8] daemon: recognize hidden request arguments Brandon Williams
@ 2017-09-13 22:31   ` Stefan Beller
  2017-09-18 16:56     ` Brandon Williams
  2017-09-21  0:24   ` Jonathan Tan
  1 sibling, 1 reply; 161+ messages in thread
From: Stefan Beller @ 2017-09-13 22:31 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, Jeff King, Junio C Hamano, Jonathan Nieder, bturner,
	Jeff Hostetler, Jonathan Tan

On Wed, Sep 13, 2017 at 2:54 PM, Brandon Williams <bmwill@google.com> wrote:
> A normal request to git-daemon is structured as
> "command path/to/repo\0host=..\0" and due to a bug in an old version of
> git-daemon 73bb33a94 (daemon: Strictly parse the "extra arg" part of the
> command, 2009-06-04) we aren't able to place any extra args (separated
> by NULs) besides the host.
>
> In order to get around this limitation teach git-daemon to recognize
> additional request arguments hidden behind a second NUL byte.  Requests
> can then be structured like:
> "command path/to/repo\0host=..\0\0version=1\0key=value\0".  git-daemon
> can then parse out the extra arguments and set 'GIT_PROTOCOL'
> accordingly.
>
> Signed-off-by: Brandon Williams <bmwill@google.com>
> ---
>  daemon.c | 71 +++++++++++++++++++++++++++++++++++++++++++++++++++++++---------
>  1 file changed, 61 insertions(+), 10 deletions(-)
>
> diff --git a/daemon.c b/daemon.c
> index 30747075f..250dbf82c 100644
> --- a/daemon.c
> +++ b/daemon.c
> @@ -282,7 +282,7 @@ static const char *path_ok(const char *directory, struct hostinfo *hi)
>         return NULL;            /* Fallthrough. Deny by default */
>  }
>
> -typedef int (*daemon_service_fn)(void);
> +typedef int (*daemon_service_fn)(const struct argv_array *env);
>  struct daemon_service {
>         const char *name;
>         const char *config_name;
> @@ -363,7 +363,7 @@ static int run_access_hook(struct daemon_service *service, const char *dir,
>  }
>
>  static int run_service(const char *dir, struct daemon_service *service,
> -                      struct hostinfo *hi)
> +                      struct hostinfo *hi, const struct argv_array *env)
>  {
>         const char *path;
>         int enabled = service->enabled;
> @@ -422,7 +422,7 @@ static int run_service(const char *dir, struct daemon_service *service,
>          */
>         signal(SIGTERM, SIG_IGN);
>
> -       return service->fn();
> +       return service->fn(env);
>  }
>
>  static void copy_to_log(int fd)
> @@ -462,25 +462,34 @@ static int run_service_command(struct child_process *cld)
>         return finish_command(cld);
>  }
>
> -static int upload_pack(void)
> +static int upload_pack(const struct argv_array *env)
>  {
>         struct child_process cld = CHILD_PROCESS_INIT;
>         argv_array_pushl(&cld.args, "upload-pack", "--strict", NULL);
>         argv_array_pushf(&cld.args, "--timeout=%u", timeout);
> +
> +       argv_array_pushv(&cld.env_array, env->argv);
> +
>         return run_service_command(&cld);
>  }
>
> -static int upload_archive(void)
> +static int upload_archive(const struct argv_array *env)
>  {
>         struct child_process cld = CHILD_PROCESS_INIT;
>         argv_array_push(&cld.args, "upload-archive");
> +
> +       argv_array_pushv(&cld.env_array, env->argv);
> +
>         return run_service_command(&cld);
>  }
>
> -static int receive_pack(void)
> +static int receive_pack(const struct argv_array *env)
>  {
>         struct child_process cld = CHILD_PROCESS_INIT;
>         argv_array_push(&cld.args, "receive-pack");
> +
> +       argv_array_pushv(&cld.env_array, env->argv);
> +
>         return run_service_command(&cld);
>  }
>
> @@ -574,7 +583,7 @@ static void canonicalize_client(struct strbuf *out, const char *in)
>  /*
>   * Read the host as supplied by the client connection.
>   */
> -static void parse_host_arg(struct hostinfo *hi, char *extra_args, int buflen)
> +static char *parse_host_arg(struct hostinfo *hi, char *extra_args, int buflen)
>  {
>         char *val;
>         int vallen;
> @@ -602,6 +611,39 @@ static void parse_host_arg(struct hostinfo *hi, char *extra_args, int buflen)
>                 if (extra_args < end && *extra_args)
>                         die("Invalid request");
>         }
> +
> +       return extra_args;
> +}
> +
> +static void parse_extra_args(struct argv_array *env, const char *extra_args,
> +                            int buflen)
> +{
> +       const char *end = extra_args + buflen;
> +       struct strbuf git_protocol = STRBUF_INIT;
> +
> +       for (; extra_args < end; extra_args += strlen(extra_args) + 1) {
> +               const char *arg = extra_args;
> +
> +               /*
> +                * Parse the extra arguments, adding most to 'git_protocol'
> +                * which will be used to set the 'GIT_PROTOCOL' envvar in the
> +                * service that will be run.
> +                *
> +                * If there ends up being a particular arg in the future that
> +                * git-daemon needs to parse specifically (like the 'host' arg)
> +                * then it can be parsed here and not added to 'git_protocol'.
> +                */
> +               if (*arg) {
> +                       if (git_protocol.len > 0)
> +                               strbuf_addch(&git_protocol, ':');
> +                       strbuf_addstr(&git_protocol, arg);
> +               }
> +       }
> +
> +       if (git_protocol.len > 0)
> +               argv_array_pushf(env, GIT_PROTOCOL_ENVIRONMENT "=%s",
> +                                git_protocol.buf);
> +       strbuf_release(&git_protocol);
>  }

I wonder if this could be written as

  begin = extra_args;
  p = extra_args;
  end = extra_args + buflen;

  while (p < end) {
    if (!*p)
        *p = ':';
    p++;
  }
  argv_array_pushf(env, GIT_PROTOCOL_ENVIRONMENT "=%s", begin);

to ease the load on the server side, as then we do not
have to copy the partial strings into strbufs and then
count the length individually? (maybe performance is no big deal here?)
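
For comparison, a standalone sketch of the joining step that avoids
strbufs without touching the request buffer. This is only an
illustration under assumptions: join_extra_args() is a hypothetical name
and the caller-supplied output buffer is not how the patch does it. Note
that, unlike the in-place loop above, it skips empty arguments, does not
turn the trailing NUL into a ':' and always NUL-terminates the result.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not the patch's parse_extra_args(): join the
 * non-empty NUL-separated strings in buf[0..buflen) with ':' into out.
 * Returns the length of the joined string, or -1 if out is too small.
 */
static int join_extra_args(const char *buf, size_t buflen,
			   char *out, size_t outlen)
{
	const char *end = buf + buflen;
	size_t used = 0;

	if (!outlen)
		return -1;

	while (buf < end) {
		/* memchr() never looks past the end of the request. */
		const char *nul = memchr(buf, '\0', (size_t)(end - buf));
		size_t arglen = nul ? (size_t)(nul - buf) : (size_t)(end - buf);

		if (arglen) {
			size_t need = arglen + (used ? 1 : 0);

			/* Leave room for the terminating NUL. */
			if (used + need >= outlen)
				return -1;
			if (used)
				out[used++] = ':';
			memcpy(out + used, buf, arglen);
			used += arglen;
		}
		if (!nul)
			break;
		buf = nul + 1;
	}
	out[used] = '\0';
	return (int)used;
}
```

Fed the bytes "version=1\0key=value\0" this yields
"version=1:key=value", which is the value GIT_PROTOCOL would carry.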


>
>  /*
> @@ -695,6 +737,7 @@ static int execute(void)
>         int pktlen, len, i;
>         char *addr = getenv("REMOTE_ADDR"), *port = getenv("REMOTE_PORT");
>         struct hostinfo hi;
> +       struct argv_array env = ARGV_ARRAY_INIT;
>
>         hostinfo_init(&hi);
>
> @@ -716,8 +759,14 @@ static int execute(void)
>                 pktlen--;
>         }
>
> -       if (len != pktlen)
> -               parse_host_arg(&hi, line + len + 1, pktlen - len - 1);
> +       if (len != pktlen) {
> +               const char *extra_args;
> +               /* retrieve host */
> +               extra_args = parse_host_arg(&hi, line + len + 1, pktlen - len - 1);
> +
> +               /* parse additional args hidden behind a second NUL byte */
> +               parse_extra_args(&env, extra_args + 1, pktlen - (extra_args - line) - 1);
> +       }
>
>         for (i = 0; i < ARRAY_SIZE(daemon_service); i++) {
>                 struct daemon_service *s = &(daemon_service[i]);
> @@ -730,13 +779,15 @@ static int execute(void)
>                          * Note: The directory here is probably context sensitive,
>                          * and might depend on the actual service being performed.
>                          */
> -                       int rc = run_service(arg, s, &hi);
> +                       int rc = run_service(arg, s, &hi, &env);
>                         hostinfo_clear(&hi);
> +                       argv_array_clear(&env);
>                         return rc;
>                 }
>         }
>
>         hostinfo_clear(&hi);
> +       argv_array_clear(&env);
>         logerror("Protocol error: '%s'", line);
>         return -1;
>  }
> --
> 2.14.1.690.gbb1197296e-goog
>


* Re: [PATCH 3/8] daemon: recognize hidden request arguments
  2017-09-13 22:31   ` Stefan Beller
@ 2017-09-18 16:56     ` Brandon Williams
  0 siblings, 0 replies; 161+ messages in thread
From: Brandon Williams @ 2017-09-18 16:56 UTC (permalink / raw)
  To: Stefan Beller
  Cc: git, Jeff King, Junio C Hamano, Jonathan Nieder, bturner,
	Jeff Hostetler, Jonathan Tan

On 09/13, Stefan Beller wrote:
> On Wed, Sep 13, 2017 at 2:54 PM, Brandon Williams <bmwill@google.com> wrote:
> > A normal request to git-daemon is structured as
> > "command path/to/repo\0host=..\0" and due to a bug in an old version of
> > git-daemon 73bb33a94 (daemon: Strictly parse the "extra arg" part of the
> > command, 2009-06-04) we aren't able to place any extra args (separated
> > by NULs) besides the host.
> >
> > In order to get around this limitation teach git-daemon to recognize
> > additional request arguments hidden behind a second NUL byte.  Requests
> > can then be structured like:
> > "command path/to/repo\0host=..\0\0version=1\0key=value\0".  git-daemon
> > can then parse out the extra arguments and set 'GIT_PROTOCOL'
> > accordingly.
> >
> > Signed-off-by: Brandon Williams <bmwill@google.com>
> > ---
> >  daemon.c | 71 +++++++++++++++++++++++++++++++++++++++++++++++++++++++---------
> >  1 file changed, 61 insertions(+), 10 deletions(-)
> >
> > diff --git a/daemon.c b/daemon.c
> > index 30747075f..250dbf82c 100644
> > --- a/daemon.c
> > +++ b/daemon.c
> > @@ -282,7 +282,7 @@ static const char *path_ok(const char *directory, struct hostinfo *hi)
> >         return NULL;            /* Fallthrough. Deny by default */
> >  }
> >
> > -typedef int (*daemon_service_fn)(void);
> > +typedef int (*daemon_service_fn)(const struct argv_array *env);
> >  struct daemon_service {
> >         const char *name;
> >         const char *config_name;
> > @@ -363,7 +363,7 @@ static int run_access_hook(struct daemon_service *service, const char *dir,
> >  }
> >
> >  static int run_service(const char *dir, struct daemon_service *service,
> > -                      struct hostinfo *hi)
> > +                      struct hostinfo *hi, const struct argv_array *env)
> >  {
> >         const char *path;
> >         int enabled = service->enabled;
> > @@ -422,7 +422,7 @@ static int run_service(const char *dir, struct daemon_service *service,
> >          */
> >         signal(SIGTERM, SIG_IGN);
> >
> > -       return service->fn();
> > +       return service->fn(env);
> >  }
> >
> >  static void copy_to_log(int fd)
> > @@ -462,25 +462,34 @@ static int run_service_command(struct child_process *cld)
> >         return finish_command(cld);
> >  }
> >
> > -static int upload_pack(void)
> > +static int upload_pack(const struct argv_array *env)
> >  {
> >         struct child_process cld = CHILD_PROCESS_INIT;
> >         argv_array_pushl(&cld.args, "upload-pack", "--strict", NULL);
> >         argv_array_pushf(&cld.args, "--timeout=%u", timeout);
> > +
> > +       argv_array_pushv(&cld.env_array, env->argv);
> > +
> >         return run_service_command(&cld);
> >  }
> >
> > -static int upload_archive(void)
> > +static int upload_archive(const struct argv_array *env)
> >  {
> >         struct child_process cld = CHILD_PROCESS_INIT;
> >         argv_array_push(&cld.args, "upload-archive");
> > +
> > +       argv_array_pushv(&cld.env_array, env->argv);
> > +
> >         return run_service_command(&cld);
> >  }
> >
> > -static int receive_pack(void)
> > +static int receive_pack(const struct argv_array *env)
> >  {
> >         struct child_process cld = CHILD_PROCESS_INIT;
> >         argv_array_push(&cld.args, "receive-pack");
> > +
> > +       argv_array_pushv(&cld.env_array, env->argv);
> > +
> >         return run_service_command(&cld);
> >  }
> >
> > @@ -574,7 +583,7 @@ static void canonicalize_client(struct strbuf *out, const char *in)
> >  /*
> >   * Read the host as supplied by the client connection.
> >   */
> > -static void parse_host_arg(struct hostinfo *hi, char *extra_args, int buflen)
> > +static char *parse_host_arg(struct hostinfo *hi, char *extra_args, int buflen)
> >  {
> >         char *val;
> >         int vallen;
> > @@ -602,6 +611,39 @@ static void parse_host_arg(struct hostinfo *hi, char *extra_args, int buflen)
> >                 if (extra_args < end && *extra_args)
> >                         die("Invalid request");
> >         }
> > +
> > +       return extra_args;
> > +}
> > +
> > +static void parse_extra_args(struct argv_array *env, const char *extra_args,
> > +                            int buflen)
> > +{
> > +       const char *end = extra_args + buflen;
> > +       struct strbuf git_protocol = STRBUF_INIT;
> > +
> > +       for (; extra_args < end; extra_args += strlen(extra_args) + 1) {
> > +               const char *arg = extra_args;
> > +
> > +               /*
> > +                * Parse the extra arguments, adding most to 'git_protocol'
> > +                * which will be used to set the 'GIT_PROTOCOL' envvar in the
> > +                * service that will be run.
> > +                *
> > +                * If there ends up being a particular arg in the future that
> > +                * git-daemon needs to parse specifically (like the 'host' arg)
> > +                * then it can be parsed here and not added to 'git_protocol'.
> > +                */
> > +               if (*arg) {
> > +                       if (git_protocol.len > 0)
> > +                               strbuf_addch(&git_protocol, ':');
> > +                       strbuf_addstr(&git_protocol, arg);
> > +               }
> > +       }
> > +
> > +       if (git_protocol.len > 0)
> > +               argv_array_pushf(env, GIT_PROTOCOL_ENVIRONMENT "=%s",
> > +                                git_protocol.buf);
> > +       strbuf_release(&git_protocol);
> >  }
> 
> I wonder if this could be written as
> 
>   begin = extra_args;
>   p = extra_args;
>   end = extra_args + buflen;
> 
>   while (p < end) {
>     if (!*p)
>         *p = ':';
>     p++;
>   }
>   argv_array_pushf(env, GIT_PROTOCOL_ENVIRONMENT "=%s", begin);
> 
> to ease the load on the server side, as then we do not
> have to copy the partial strings into strbufs and then
> count the length individually? (maybe performance is no big deal here?)

I'm sure something like that could work, and I don't know how
performance sensitive this bit is.  Also, depending on whether we need
the unmodified string for anything at a later point, maybe it's best
not to modify it in place?  I don't know :)

> 
> 
> >
> >  /*
> > @@ -695,6 +737,7 @@ static int execute(void)
> >         int pktlen, len, i;
> >         char *addr = getenv("REMOTE_ADDR"), *port = getenv("REMOTE_PORT");
> >         struct hostinfo hi;
> > +       struct argv_array env = ARGV_ARRAY_INIT;
> >
> >         hostinfo_init(&hi);
> >
> > @@ -716,8 +759,14 @@ static int execute(void)
> >                 pktlen--;
> >         }
> >
> > -       if (len != pktlen)
> > -               parse_host_arg(&hi, line + len + 1, pktlen - len - 1);
> > +       if (len != pktlen) {
> > +               const char *extra_args;
> > +               /* retrieve host */
> > +               extra_args = parse_host_arg(&hi, line + len + 1, pktlen - len - 1);
> > +
> > +               /* parse additional args hidden behind a second NUL byte */
> > +               parse_extra_args(&env, extra_args + 1, pktlen - (extra_args - line) - 1);
> > +       }
> >
> >         for (i = 0; i < ARRAY_SIZE(daemon_service); i++) {
> >                 struct daemon_service *s = &(daemon_service[i]);
> > @@ -730,13 +779,15 @@ static int execute(void)
> >                          * Note: The directory here is probably context sensitive,
> >                          * and might depend on the actual service being performed.
> >                          */
> > -                       int rc = run_service(arg, s, &hi);
> > +                       int rc = run_service(arg, s, &hi, &env);
> >                         hostinfo_clear(&hi);
> > +                       argv_array_clear(&env);
> >                         return rc;
> >                 }
> >         }
> >
> >         hostinfo_clear(&hi);
> > +       argv_array_clear(&env);
> >         logerror("Protocol error: '%s'", line);
> >         return -1;
> >  }
> > --
> > 2.14.1.690.gbb1197296e-goog
> >

-- 
Brandon Williams


* Re: [PATCH 2/8] protocol: introduce protocol extention mechanisms
  2017-09-13 22:27   ` Stefan Beller
@ 2017-09-18 17:02     ` Brandon Williams
  2017-09-18 18:34       ` Stefan Beller
  0 siblings, 1 reply; 161+ messages in thread
From: Brandon Williams @ 2017-09-18 17:02 UTC (permalink / raw)
  To: Stefan Beller
  Cc: git, Jeff King, Junio C Hamano, Jonathan Nieder, bturner,
	Jeff Hostetler, Jonathan Tan

On 09/13, Stefan Beller wrote:
> On Wed, Sep 13, 2017 at 2:54 PM, Brandon Williams <bmwill@google.com> wrote:
> > Create protocol.{c,h} and provide functions which future servers and
> > clients can use to determine which protocol to use or is being used.
> >
> > Also introduce the 'GIT_PROTOCOL' environment variable which will be
> > used to communicate a colon separated list of keys with optional values
> > to a server.  Unknown keys and values must be tolerated.  This mechanism
> > is used to communicate which version of the wire protocol a client would
> > like to use with a server.
> >
> > Signed-off-by: Brandon Williams <bmwill@google.com>
> > ---
> >  Documentation/config.txt | 16 +++++++++++
> >  Documentation/git.txt    |  5 ++++
> >  Makefile                 |  1 +
> >  cache.h                  |  7 +++++
> >  protocol.c               | 72 ++++++++++++++++++++++++++++++++++++++++++++++++
> >  protocol.h               | 15 ++++++++++
> >  6 files changed, 116 insertions(+)
> >  create mode 100644 protocol.c
> >  create mode 100644 protocol.h
> >
> > diff --git a/Documentation/config.txt b/Documentation/config.txt
> > index dc4e3f58a..d5b28a32c 100644
> > --- a/Documentation/config.txt
> > +++ b/Documentation/config.txt
> > @@ -2517,6 +2517,22 @@ The protocol names currently used by git are:
> >      `hg` to allow the `git-remote-hg` helper)
> >  --
> >
> > +protocol.version::
> 
> It would be cool to set a set of versions that are good. (I am not sure
> if that can be deferred to a later patch.)
> 
>   Consider we'd have versions 0,1,2,3,4 in the future:
>   In an ideal world the client and server would talk using v4
>   as it is the most advanced protocol, right?
>   Maybe a security/performance issue is found on the server side
>   with say protocol v3. Still no big deal as we are speaking v4.
>   But then consider an issue is found on the client side with v4.
>   Then the client would happily talk 0..3 while the server would
>   love to talk using 0,1,2,4.
> 
> The way I think about protocol version negotiation is that
> both parties involved have a set of versions that they tolerate
> to talk (they might understand more than the tolerated set, but the
> user forbade some), and the goal of the negotiation is to find
> the highest version number that is part of both the server set
> and client set. So quite naturally with this line of thinking the
> configuration is to configure a set of versions, which is what
> I propose here. Maybe even in the wire format, separated
> with colons?

I'm sure it wouldn't take too much to change this to be a multi-valued
config.  Because after this series there is just v0 and v1 I didn't
think through this case too much.  If others agree then I can go ahead
and make it so in a reroll.

> 
> > +       If set, clients will attempt to communicate with a server using
> > +       the specified protocol version.  If unset, no attempt will be
> > +       made by the client to communicate using a particular protocol
> > +       version, this results in protocol version 0 being used.
> 
> This sounds as if we're going to be really shy at first and only
> users that care will try out new versions at their own risk.
> From a users POV this may be frustrating as I would imagine that
> people want to run
> 
>   git config --global protocol.version 2
> 
> to try it out and then realize that some of their hosts do not speak
> 2, so they have to actually configure it per repo/remote.

The point would be to be able to set this globally, not per-repo.  Even
if a repo doesn't speak v2 then it should be able to gracefully degrade
to v1 without the user having to do anything.  The reason this escape
hatch exists is so that out-of-band protocol negotiation can be shut off
if it causes issues when communicating with a server.


> > +       Supported versions:
> 
> > +* `0` - the original wire protocol.
> 
> In the future this may be misleading as it doesn't specify the date of
> when it was original. e.g. are capabilities already supported in "original"?
> 
> Maybe phrase it as "wire protocol as of v2.14" ? (Though this sounds
> as if new capabilities added in the future are not allowed)

Yeah I can see how this could be misleading, though I'm not sure how
best to word it to avoid that.

> 
> 
> > +
> > +extern enum protocol_version parse_protocol_version(const char *value);
> > +extern enum protocol_version get_protocol_version_config(void);
> > +extern enum protocol_version determine_protocol_version_server(void);
> > +extern enum protocol_version determine_protocol_version_client(const char *server_response);
> 
> Here is a good place to have some comments.

-- 
Brandon Williams


* Re: [PATCH 2/8] protocol: introduce protocol extention mechanisms
  2017-09-18 17:02     ` Brandon Williams
@ 2017-09-18 18:34       ` Stefan Beller
  2017-09-18 19:58         ` Brandon Williams
  0 siblings, 1 reply; 161+ messages in thread
From: Stefan Beller @ 2017-09-18 18:34 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, Jeff King, Junio C Hamano, Jonathan Nieder, Bryan Turner,
	Jeff Hostetler, Jonathan Tan

>> From a users POV this may be frustrating as I would imagine that
>> people want to run
>>
>>   git config --global protocol.version 2
>>
>> to try it out and then realize that some of their hosts do not speak
>> 2, so they have to actually configure it per repo/remote.
>
> The point would be to be able to set this globally, not per-repo.  Even
> if a repo doesn't speak v2 then it should be able to gracefully degrade
> to v1 without the user having to do anything.  The reason why there is
> this escape hatch is if doing the protocol negotiation out of band
> causing issues with communicating with a server that it can be shut off.

In the current situation it is easy to assume that if v1 (and not v0)
is configured, the user's intent is "to try out v1 and fall back
gracefully to v0".

But this will change over time in the future!

Eventually people will have the desire to say:
"Use version N+1, but never version N", because N has
performance or security issues; the user might not want
to bother to try N or even actively want to be affirmed that
Git will never use version N.

In this future we need a mechanism that contains either a white list
or a black list of protocols. To keep it simple (I assume
there won't be many protocol versions), a single white list will do.

However transitioning from the currently proposed "try the new
configured thing and fallback to whatever" to "this is the exact list
of options that Git will try for you" will be hard, as we may break people
if we do not unconditionally fall back to v0.

That is why I propose to start with an explicit white list as then we
do not have to have a transition plan or otherwise work around the
issue. Also it doesn't hurt now to use

    git config --global protocol.version v1,v0

instead of the configuration proposed above.
(Even better yet, then people could play around with "v1 only"
and see how it falls apart on old servers)


* Re: [PATCH 2/8] protocol: introduce protocol extention mechanisms
  2017-09-18 18:34       ` Stefan Beller
@ 2017-09-18 19:58         ` Brandon Williams
  2017-09-18 20:06           ` Stefan Beller
  0 siblings, 1 reply; 161+ messages in thread
From: Brandon Williams @ 2017-09-18 19:58 UTC (permalink / raw)
  To: Stefan Beller
  Cc: git, Jeff King, Junio C Hamano, Jonathan Nieder, Bryan Turner,
	Jeff Hostetler, Jonathan Tan

On 09/18, Stefan Beller wrote:
> >> From a users POV this may be frustrating as I would imagine that
> >> people want to run
> >>
> >>   git config --global protocol.version 2
> >>
> >> to try it out and then realize that some of their hosts do not speak
> >> 2, so they have to actually configure it per repo/remote.
> >
> > The point would be to be able to set this globally, not per-repo.  Even
> > if a repo doesn't speak v2 then it should be able to gracefully degrade
> > to v1 without the user having to do anything.  The reason why there is
> > this escape hatch is if doing the protocol negotiation out of band
> > causing issues with communicating with a server that it can be shut off.
> 
> In the current situation it is easy to assume that if v1 (and not v0)
> is configured, that the users intent is "to try out v1 and fallback
> gracefully to v0".
> 
> But this will change over time in the future!
> 
> Eventually people will have the desire to say:
> "Use version N+1, but never version N", because N has
> performance or security issues; the user might not want
> to bother to try N or even actively want to be affirmed that
> Git will never use version N.
> 
> In this future we need a mechanism, that either contains a
> white list or black list of protocols. To keep it simple (I assume
> there won't be many protocol versions), a single white list will do.
> 
> However transitioning from the currently proposed "try the new
> configured thing and fallback to whatever" to "this is the exact list
> of options that Git will try for you" will be hard, as we may break people
> if we do not unconditionally fall back to v0.
> 
> That is why I propose to start with an explicit white list as then we
> do not have to have a transition plan or otherwise work around the
> issue. Also it doesn't hurt now to use
> 
>     git config --global protocol.version v1,v0
> 
> instead compared to the proposed configuration above.
> (Even better yet, then people could play around with "v1 only"
> and see how it falls apart on old servers)

Except we can't start with an explicit whitelist because we must
fall back to v0 if v1 isn't supported; otherwise we would break people.

That is unless we have the semantics of: If not configured v0 will be
used, otherwise only use the configured protocol versions.

-- 
Brandon Williams


* Re: [PATCH 2/8] protocol: introduce protocol extention mechanisms
  2017-09-18 19:58         ` Brandon Williams
@ 2017-09-18 20:06           ` Stefan Beller
  0 siblings, 0 replies; 161+ messages in thread
From: Stefan Beller @ 2017-09-18 20:06 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, Jeff King, Junio C Hamano, Jonathan Nieder, Bryan Turner,
	Jeff Hostetler, Jonathan Tan

>> instead compared to the proposed configuration above.
>> (Even better yet, then people could play around with "v1 only"
>> and see how it falls apart on old servers)
>
> Except we can't start with an explicit whitelist because we must
> fallback to v0 if v1 isn't supported otherwise we would break people.
>
> That is unless we have the semantics of: If not configured v0 will be
> used, otherwise only use the configured protocol versions.
>

Good point.

Thinking about this, how about:

  If not configured, we do as we want. (i.e. Git has full control over
  its decision-making process, which for now is "favor v0 over v1 as
  we are experimenting with v1". This strategy may change in the future
  to "prefer highest version number that both client and server can speak".)

  If it is configured, "use highest configured number from the given set".

?
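
As a rough illustration of the "highest version that is in both sets"
rule discussed in this subthread, here is a minimal sketch.  It is not
part of the series: the bitmask encoding, the VERSION_BIT() macro and
the name negotiate_version() are all assumptions made up for the
example.

```c
/*
 * Hypothetical encoding (not from the patch series): each side's set of
 * tolerated protocol versions is a bitmask, where bit N set means
 * "version N is allowed".
 */
#define VERSION_BIT(n) (1u << (n))

/*
 * Return the highest version present in both sets, or -1 if the two
 * sets have no version in common.
 */
static int negotiate_version(unsigned client_set, unsigned server_set)
{
	unsigned common = client_set & server_set;
	int version = -1;

	/* Shift until no bits remain; the last bit seen is the highest. */
	while (common) {
		version++;
		common >>= 1;
	}
	return version;
}
```

With the example from earlier in the thread (client tolerates 0..3,
server tolerates 0,1,2,4) this picks version 2.  An empty intersection
yields -1, and what to do then (die, or fall back to v0) is exactly the
policy question being debated here.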


* [PATCH 1.5/8] connect: die when a capability line comes after a ref
  2017-09-13 21:54 [PATCH 0/8] protocol transition Brandon Williams
                   ` (7 preceding siblings ...)
  2017-09-13 21:54 ` [PATCH 8/8] i5700: add interop test for protocol transition Brandon Williams
@ 2017-09-20 18:48 ` Brandon Williams
  2017-09-20 19:14   ` Jeff King
  2017-09-26 23:56 ` [PATCH v2 0/9] protocol transition Brandon Williams
  9 siblings, 1 reply; 161+ messages in thread
From: Brandon Williams @ 2017-09-20 18:48 UTC (permalink / raw)
  To: git
  Cc: Brandon Williams, peff, sbeller, gitster, jrnieder, bturner, git,
	jonathantanmy

Commit eb398797c (connect: advertized capability is not a ref,
2016-09-09) taught 'get_remote_heads()' to recognize that the
'capabilities^{}' line isn't a ref but required that the
'capabilities^{}' line came during the first response from the server.
A future patch will introduce a version string sent by the server during
its first response which can then cause a client to unnecessarily die if
a 'capabilities^{}' line is sent as the first ref.

Teach 'get_remote_heads()' to instead die if a 'capabilities^{}' line is
sent after a ref.

Reported-by: Miguel Alcon <malcon@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
This is a fix to the bug we found when internally deploying this series.  It
just makes it so that a capability line won't cause a client to error out if
it's not the first response, because it won't be the first response come
protocol v1.

 connect.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/connect.c b/connect.c
index df56c0cbf..af5096ec6 100644
--- a/connect.c
+++ b/connect.c
@@ -124,10 +124,11 @@ struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
 	 * response does not necessarily mean an ACL problem, though.
 	 */
 	int saw_response;
+	int seen_ref;
 	int got_dummy_ref_with_capabilities_declaration = 0;

 	*list = NULL;
-	for (saw_response = 0; ; saw_response = 1) {
+	for (saw_response = 0, seen_ref = 0; ; saw_response = 1) {
 		struct ref *ref;
 		struct object_id old_oid;
 		char *name;
@@ -165,6 +166,8 @@ struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,

 		name_len = strlen(name);
 		if (len != name_len + GIT_SHA1_HEXSZ + 1) {
+			if (seen_ref)
+				; /* NEEDSWORK: Error out for multiple capabilities lines? */
 			free(server_capabilities);
 			server_capabilities = xstrdup(name + name_len + 1);
 		}
@@ -175,7 +178,7 @@ struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
 		}

 		if (!strcmp(name, "capabilities^{}")) {
-			if (saw_response)
+			if (seen_ref)
 				die("protocol error: unexpected capabilities^{}");
 			if (got_dummy_ref_with_capabilities_declaration)
 				die("protocol error: multiple capabilities^{}");
@@ -193,6 +196,7 @@ struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
 		oidcpy(&ref->old_oid, &old_oid);
 		*list = ref;
 		list = &ref->next;
+		seen_ref = 1;
 	}

 	annotate_refs_with_symref_info(*orig_list);
--
2.14.1.821.g8fa685d3b7-goog



* Re: [PATCH 1.5/8] connect: die when a capability line comes after a ref
  2017-09-20 18:48 ` [PATCH 1.5/8] connect: die when a capability line comes after a ref Brandon Williams
@ 2017-09-20 19:14   ` Jeff King
  2017-09-20 20:06     ` Brandon Williams
  0 siblings, 1 reply; 161+ messages in thread
From: Jeff King @ 2017-09-20 19:14 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, sbeller, gitster, jrnieder, bturner, git, jonathantanmy

On Wed, Sep 20, 2017 at 11:48:32AM -0700, Brandon Williams wrote:

> Commit eb398797c (connect: advertized capability is not a ref,
> 2016-09-09) taught 'get_remote_heads()' to recognize that the
> 'capabilities^{}' line isn't a ref but required that the
> 'capabilities^{}' line came during the first response from the server.
> A future patch will introduce a version string sent by the server during
> its first response which can then cause a client to unnecessarily die if
> a 'capabilities^{}' line sent as the first ref.
> 
> Teach 'get_remote_heads()' to instead die if a 'capabilities^{}' line is
> sent after a ref.

Hmm. I think I understand why you'd want this loosening. But why are we
sending a version line to a client that we don't know is speaking v2?
IOW, shouldn't we be reporting the version to the client in the normal
capabilities when we don't know for sure that they can handle the new
field? Otherwise we're breaking existing clients.

Or is this only for v2 clients, and we've changed the protocol but
get_remote_heads() just needs to be updated, too?

> diff --git a/connect.c b/connect.c
> index df56c0cbf..af5096ec6 100644
> --- a/connect.c
> +++ b/connect.c
> @@ -124,10 +124,11 @@ struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
>  	 * response does not necessarily mean an ACL problem, though.
>  	 */
>  	int saw_response;
> +	int seen_ref;
>  	int got_dummy_ref_with_capabilities_declaration = 0;
> 
>  	*list = NULL;
> -	for (saw_response = 0; ; saw_response = 1) {
> +	for (saw_response = 0, seen_ref = 0; ; saw_response = 1) {

If we're not going to update it in the right-hand side of the for-loop,
should we perhaps not be initializing it in the left-hand side? I.e.,
can we just do:

  seen_ref = 0;

above the loop, like we initialize "list"?

(For that matter, could we just be checking whether *list is NULL?)

> @@ -165,6 +166,8 @@ struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
> 
>  		name_len = strlen(name);
>  		if (len != name_len + GIT_SHA1_HEXSZ + 1) {
> +			if (seen_ref)
> +				; /* NEEDSWORK: Error out for multiple capabilities lines? */
>  			free(server_capabilities);
>  			server_capabilities = xstrdup(name + name_len + 1);
>  		}

Interesting question. Probably it would be fine to. Coincidentally I ran
across a similar case. It seems that upload-pack will read multiple
capabilities lines back from the client. I.e., if it gets:

  want 1234abcd... foo
  want 5678abcd... bar

then it will turn on both the "foo" and "bar" capabilities. I'm pretty
sure this is unintended, and is somewhat counter to the way that clients
handle multiple lines (which is to forget the old line and respect only
the new one, as shown in the quoted hunk).

I wonder if we should be outlawing extra capabilities in both
directions. I don't _think_ we've ever relied on that working, and I
don't have much sympathy for any 3rd-party implementation that does
(though I doubt that any exists).

That tangent aside, I do think this hunk is kind of orthogonal to the point of
your patch. We're talking about potential _tightening_ here, whereas the
point of your patch is loosening. And it's not clear to me what we want
to tighten:

  - should capabilities come as part of the first response, even if we
    have no refs? In which case we really want "if (saw_response)" here.

  - should they come as part of the first ref (or pseudo-ref), in which
    case "if (seen_ref)" is the right thing.

  - should we loosen it to complaining when there are multiple
    capabilities sent. In which case "if (server_capabilities)" is the
    right thing.

I'm not sure which we'd want, but it really seems like a separate topic
that should be explored on top.

-Peff


* Re: [PATCH 1.5/8] connect: die when a capability line comes after a ref
  2017-09-20 19:14   ` Jeff King
@ 2017-09-20 20:06     ` Brandon Williams
  2017-09-20 20:48       ` Jonathan Nieder
                         ` (2 more replies)
  0 siblings, 3 replies; 161+ messages in thread
From: Brandon Williams @ 2017-09-20 20:06 UTC (permalink / raw)
  To: Jeff King; +Cc: git, sbeller, gitster, jrnieder, bturner, git, jonathantanmy

On 09/20, Jeff King wrote:
> On Wed, Sep 20, 2017 at 11:48:32AM -0700, Brandon Williams wrote:
> 
> > Commit eb398797c (connect: advertized capability is not a ref,
> > 2016-09-09) taught 'get_remote_heads()' to recognize that the
> > 'capabilities^{}' line isn't a ref but required that the
> > 'capabilities^{}' line came during the first response from the server.
> > A future patch will introduce a version string sent by the server during
> > its first response which can then cause a client to unnecessarily die if
> > a 'capabilities^{}' line sent as the first ref.
> > 
> > Teach 'get_remote_heads()' to instead die if a 'capabilities^{}' line is
> > sent after a ref.
> 
> Hmm. I think I understand why you'd want this loosening. But why are we
> sending a version line to a client that we don't know is speaking v2?
> IOW, shouldn't we be reporting the version to the client in the normal
> capabilities when we don't know for sure that they can handle the new
> field? Otherwise we're breaking existing clients.

The client requested the version, this is the server's response.  So
older clients shouldn't be broken because they wouldn't be requesting
the newer versions.

> 
> Or is this only for v2 clients, and we've changed the protocol but
> get_remote_heads() just needs to be updated, too?

A client which didn't request protocol v1 (I'm calling the current
protocol v0, and v1 is just v0 with the initial response from the server
containing a version string) should not receive a version string in the
initial response.  The problem is that when introducing the version
string to protocol version 1, I didn't want to have to do a huge
refactoring of ALL of the current transport code so I stuck the version
check in get_remote_heads() since v1 is exactly the same as v0, except
for the first line from the server.

When we introduce v2, I'm sure we'll have to do more refactoring to
separate out the logic for the different versions.
> 
> > diff --git a/connect.c b/connect.c
> > index df56c0cbf..af5096ec6 100644
> > --- a/connect.c
> > +++ b/connect.c
> > @@ -124,10 +124,11 @@ struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
> >  	 * response does not necessarily mean an ACL problem, though.
> >  	 */
> >  	int saw_response;
> > +	int seen_ref;
> >  	int got_dummy_ref_with_capabilities_declaration = 0;
> > 
> >  	*list = NULL;
> > -	for (saw_response = 0; ; saw_response = 1) {
> > +	for (saw_response = 0, seen_ref = 0; ; saw_response = 1) {
> 
> If we're not going to update it in the right-hand side of the for-loop,
> should we perhaps not be initializing it in the left-hand side? I.e.,
> can we just do:
> 
>   seen_ref = 0;
> 
> above the loop, like we initialize "list"?
> 
> (For that matter, could we just be checking whether *list is NULL?)

True, that would probably be the better way to do this.

> 
> > @@ -165,6 +166,8 @@ struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
> > 
> >  		name_len = strlen(name);
> >  		if (len != name_len + GIT_SHA1_HEXSZ + 1) {
> > +			if (seen_ref)
> > +				; /* NEEDSWORK: Error out for multiple capabilities lines? */
> >  			free(server_capabilities);
> >  			server_capabilities = xstrdup(name + name_len + 1);
> >  		}
> 
> Interesting question. Probably it would be fine to. Coincidentally I ran
> across a similar case. It seems that upload-pack will read multiple
> capabilities lines back from the client. I.e., if it gets:
> 
>   want 1234abcd... foo
>   want 5678abcd... bar
> 
> then it will turn on both the "foo" and "bar" capabilities. I'm pretty
> sure this is unintended, and is somewhat counter to the way that clients
> handle multiple lines (which is to forget the old line and respect only
> the new one, as shown in the quoted hunk).
> 
> I wonder if we should be outlawing extra capabilities in both
> directions. I don't _think_ we've ever relied on that working, and I
> don't have much sympathy for any 3rd-party implementation that does
> (though I doubt that any exists).
> 
> That tangent aside, I do this hunk is kind of orthogonal to the point of
> your patch. We're talking about potential _tightening_ here, whereas the
> point of your patch is loosening. And it's not clear to me what we want
> to tighten:
> 
>   - should capabilities come as part of the first response, even if we
>     have no refs? In which case we really want "if (saw_response)" here.
> 
>   - should they came as part of the first ref (or pseudo-ref), in which
>     case "if (seen_ref)" is the right thing.
> 
>   - should we loosen it to complaining when there are multiple
>     capabilities sent. In which case "if (server_capabilities)" is the
>     right thing.
> 
> I'm not sure which we'd want, but it really seems like a separate topic
> that should be explored on top.

I wasn't sure either, which is why I added the comment to prod
discussion.  I agree that it is orthogonal to this series so I'll most
likely drop it, as it doesn't help with the protocol transition
discussion.

-- 
Brandon Williams


* Re: [PATCH 1.5/8] connect: die when a capability line comes after a ref
  2017-09-20 20:06     ` Brandon Williams
@ 2017-09-20 20:48       ` Jonathan Nieder
  2017-09-21  3:02       ` Junio C Hamano
  2017-09-21 20:45       ` [PATCH] connect: in ref advertisement, shallows are last Jonathan Tan
  2 siblings, 0 replies; 161+ messages in thread
From: Jonathan Nieder @ 2017-09-20 20:48 UTC (permalink / raw)
  To: Brandon Williams
  Cc: Jeff King, git, sbeller, gitster, bturner, git, jonathantanmy

Brandon Williams wrote:
> On 09/20, Jeff King wrote:

>> (For that matter, could we just be checking whether *list is NULL?)
>
> True, that would probably be the better way to do this.

Nice idea, thank you.

That doesn't capture a few other cases of pkts that aren't supposed to
come before the capabilities^{} line:

 * shallow
 * .have
 * capabilities^{}
 * invalid refnames

Perhaps it should check all of those:

	if ((shallow_points && shallow_points->nr) ||
	    (extra_have && extra_have->nr) ||
	    got_dummy_ref_with_capabilities_declaration ||
	    got_invalid_ref ||
	    *list)

What happens when another type of pkt gets introduced?  This feels
pretty error-prone.  The underlying problem is that we are using a
simple for loop to emulate a state machine that is more complicated
than that, piling up variables to keep track of the current state.  That
suggests one of the following approaches:

 A. Replace saw_response with an enum describing the state.
    Immediately after reading the first packet, update the state to
    EXPECTING_FIRST_REF.  Immediately after reading the first ref,
    update the state to EXPECTING_SHALLOW.

 B. Use instruction flow to encode the state machine.  Have separate
    loops for processing refs and shallow lines.
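
To make approach A a bit more concrete, here is a minimal sketch of
what an explicit state enum could look like.  Only the names
EXPECTING_FIRST_REF and EXPECTING_SHALLOW come from the description
above; EXPECTING_REF, ADV_PROTOCOL_ERROR, enum pkt_kind and
next_state() are assumptions invented for illustration, and .have
lines and the version pkt-line are left out for brevity.

```c
/* Hypothetical classification of an already-parsed advertisement line. */
enum pkt_kind {
	PKT_REF,                /* "<oid> <refname>" */
	PKT_CAPABILITIES_DUMMY, /* "<zero-oid> capabilities^{}" */
	PKT_SHALLOW             /* "shallow <oid>" */
};

enum adv_state {
	EXPECTING_FIRST_REF,
	EXPECTING_REF,          /* assumed extra state, not named above */
	EXPECTING_SHALLOW,
	ADV_PROTOCOL_ERROR      /* assumed terminal error state */
};

/* Return the state to move to after seeing 'kind' while in 'state'. */
static enum adv_state next_state(enum adv_state state, enum pkt_kind kind)
{
	switch (state) {
	case EXPECTING_FIRST_REF:
		if (kind == PKT_REF)
			return EXPECTING_REF;
		/*
		 * capabilities^{} stands in for an empty ref list, so after
		 * it (or after a leading shallow line) only shallows remain.
		 */
		return EXPECTING_SHALLOW;
	case EXPECTING_REF:
		if (kind == PKT_REF)
			return EXPECTING_REF;
		if (kind == PKT_SHALLOW)
			return EXPECTING_SHALLOW;
		return ADV_PROTOCOL_ERROR; /* capabilities^{} after a ref */
	case EXPECTING_SHALLOW:
		return kind == PKT_SHALLOW ? EXPECTING_SHALLOW
					   : ADV_PROTOCOL_ERROR;
	default:
		return ADV_PROTOCOL_ERROR;
	}
}
```

The point of the sketch is that every "is this line allowed here?"
question lives in one switch, so introducing a new kind of pkt means
adding a case rather than another flag variable.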

By the way, there are some other ways the current code is less strict
than described in pack-protocol.txt:

 - allowing an empty list-of-refs.  (This is deliberate ---
   pack-protocol.txt's lack of documentation of this case is a bug.)

 - allowing multiple capability-lists

 - allowing capabilities^{} combined with other refs

 - allowing refs, shallow, and .have to be interleaved

Tightening those would likely be good for the ecosystem (so that
buggy servers get noticed quickly), but that's a separate topic from
this change.

[...]
> I wasn't sure either, which is why I added the comment to prod
> discussion.  I agree that is is orthogonal to this series so I'll most
> likely drop it, as it doesn't help with the protocol transition
> discussion.

I'd be happy to write a separate patch adding the NEEDSWORK comment
(or even a patch doing what the NEEDSWORK comment suggests) to avoid
derailing this one. :)

Thanks,
Jonathan


* Re: [PATCH 3/8] daemon: recognize hidden request arguments
  2017-09-13 21:54 ` [PATCH 3/8] daemon: recognize hidden request arguments Brandon Williams
  2017-09-13 22:31   ` Stefan Beller
@ 2017-09-21  0:24   ` Jonathan Tan
  2017-09-21  0:31     ` Jonathan Tan
  1 sibling, 1 reply; 161+ messages in thread
From: Jonathan Tan @ 2017-09-21  0:24 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, peff, sbeller, gitster, jrnieder, bturner, git

On Wed, 13 Sep 2017 14:54:43 -0700
Brandon Williams <bmwill@google.com> wrote:

> A normal request to git-daemon is structured as
> "command path/to/repo\0host=..\0" and due to a bug in an old version of
> git-daemon 73bb33a94 (daemon: Strictly parse the "extra arg" part of the
> command, 2009-06-04) we aren't able to place any extra args (separated
> by NULs) besides the host.
> 
> In order to get around this limitation teach git-daemon to recognize
> additional request arguments hidden behind a second NUL byte.  Requests
> can then be structured like:
> "command path/to/repo\0host=..\0\0version=1\0key=value\0".  git-daemon
> can then parse out the extra arguments and set 'GIT_PROTOCOL'
> accordingly.

A test in this patch (if possible) would be nice, but it is probably
clearer to test this when one of the commands (e.g. upload-pack) is
done. Could a test be added to the next patch to verify (using
GIT_TRACE_PACKET, maybe) that the expected strings are sent? Then
mention in this commit message that this will be tested in the next
patch too.

> @@ -574,7 +583,7 @@ static void canonicalize_client(struct strbuf *out, const char *in)
>  /*
>   * Read the host as supplied by the client connection.
>   */
> -static void parse_host_arg(struct hostinfo *hi, char *extra_args, int buflen)
> +static char *parse_host_arg(struct hostinfo *hi, char *extra_args, int buflen)
>  {
>  	char *val;
>  	int vallen;
> @@ -602,6 +611,39 @@ static void parse_host_arg(struct hostinfo *hi, char *extra_args, int buflen)
>  		if (extra_args < end && *extra_args)
>  			die("Invalid request");
>  	}
> +
> +	return extra_args;
> +}
> +
> +static void parse_extra_args(struct argv_array *env, const char *extra_args,
> +			     int buflen)
> +{
> +	const char *end = extra_args + buflen;
> +	struct strbuf git_protocol = STRBUF_INIT;
> +
> +	for (; extra_args < end; extra_args += strlen(extra_args) + 1) {
> +		const char *arg = extra_args;
> +
> +		/*
> +		 * Parse the extra arguments, adding most to 'git_protocol'
> +		 * which will be used to set the 'GIT_PROTOCOL' envvar in the
> +		 * service that will be run.
> +		 *
> +		 * If there ends up being a particular arg in the future that
> +		 * git-daemon needs to parse specificly (like the 'host' arg)
> +		 * then it can be parsed here and not added to 'git_protocol'.
> +		 */
> +		if (*arg) {
> +			if (git_protocol.len > 0)
> +				strbuf_addch(&git_protocol, ':');
> +			strbuf_addstr(&git_protocol, arg);
> +		}
> +	}
> +
> +	if (git_protocol.len > 0)
> +		argv_array_pushf(env, GIT_PROTOCOL_ENVIRONMENT "=%s",
> +				 git_protocol.buf);
> +	strbuf_release(&git_protocol);
>  }

I would rewrite this with parse_extra_args() calling parse_host_arg()
instead (right now, you have 2 functions with 2 different meanings of
"extra_args"). If you want to keep this arrangement, though, add a
documentation comment about the meaning of the return value of
parse_host_arg().
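
For anyone reading along without git's strbuf/argv_array helpers at
hand, the joining step of parse_extra_args() can be sketched in plain C
roughly as follows.  This is a standalone approximation, not the code
from the patch: join_extra_args() and its return convention are
assumptions, and unlike the quoted hunk it uses memchr() so that it
never reads past the end of the buffer when the last argument lacks a
terminating NUL.

```c
#include <string.h>

/*
 * Hypothetical standalone helper: join the non-empty NUL-terminated
 * strings found in buf[0..buflen) with ':' and store the result in out
 * (at most outlen bytes, including the terminating NUL).
 *
 * Returns the length of the joined string, or -1 if out is too small
 * or the last argument in buf is not NUL-terminated.
 */
static int join_extra_args(const char *buf, size_t buflen,
			   char *out, size_t outlen)
{
	const char *end = buf + buflen;
	size_t used = 0;

	if (!outlen)
		return -1;
	out[0] = '\0';

	while (buf < end) {
		const char *nul = memchr(buf, '\0', (size_t)(end - buf));
		size_t arglen;

		if (!nul)
			return -1; /* unterminated trailing argument */

		arglen = (size_t)(nul - buf);
		if (arglen) {
			/* one extra byte for ':' unless this is the first arg */
			size_t need = arglen + (used ? 1 : 0);

			if (used + need + 1 > outlen)
				return -1;
			if (used)
				out[used++] = ':';
			memcpy(out + used, buf, arglen);
			used += arglen;
			out[used] = '\0';
		}
		buf = nul + 1; /* skip past the NUL to the next argument */
	}
	return (int)used;
}
```

Given the bytes "version=1\0key=value\0" this produces
"version=1:key=value", which is the value the daemon would then export
through GIT_PROTOCOL.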


* Re: [PATCH 3/8] daemon: recognize hidden request arguments
  2017-09-21  0:24   ` Jonathan Tan
@ 2017-09-21  0:31     ` Jonathan Tan
  2017-09-21 21:55       ` Brandon Williams
  0 siblings, 1 reply; 161+ messages in thread
From: Jonathan Tan @ 2017-09-21  0:31 UTC (permalink / raw)
  To: Jonathan Tan
  Cc: Brandon Williams, git, peff, sbeller, gitster, jrnieder, bturner, git

On Wed, 20 Sep 2017 17:24:43 -0700
Jonathan Tan <jonathantanmy@google.com> wrote:

> On Wed, 13 Sep 2017 14:54:43 -0700
> Brandon Williams <bmwill@google.com> wrote:
> 
> > A normal request to git-daemon is structured as
> > "command path/to/repo\0host=..\0" and due to a bug in an old version of
> > git-daemon 73bb33a94 (daemon: Strictly parse the "extra arg" part of the
> > command, 2009-06-04) we aren't able to place any extra args (separated
> > by NULs) besides the host.
> > 
> > In order to get around this limitation teach git-daemon to recognize
> > additional request arguments hidden behind a second NUL byte.  Requests
> > can then be structured like:
> > "command path/to/repo\0host=..\0\0version=1\0key=value\0".  git-daemon
> > can then parse out the extra arguments and set 'GIT_PROTOCOL'
> > accordingly.
> 
> A test in this patch (if possible) would be nice, but it is probably
> clearer to test this when one of the commands (e.g. upload-pack) is
> done. Could a test be added to the next patch to verify (using
> GIT_TRACE_PACKET, maybe) that the expected strings are sent? Then
> mention in this commit message that this will be tested in the next
> patch too.

Ah, I see that it is tested in 6/8. You can ignore this comment.
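
For readers following the request layout described in the quoted commit
message, here is a minimal sketch of how the arguments hidden behind the
second NUL could be joined into a GIT_PROTOCOL-style value. It is not the
code from the patch: join_extra_args() and its signature are hypothetical,
and it assumes the caller has already skipped "host=..\0" and the empty
string that marks the start of the hidden section.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not git's API): walk the NUL-separated arguments
 * in args[0..len) and join the non-empty ones with ':' into 'out'.
 *
 * Returns 0 on success, -1 if 'out' is too small or if an argument is
 * not NUL-terminated within 'len' bytes.
 */
static int join_extra_args(const char *args, size_t len,
			   char *out, size_t outsz)
{
	const char *end = args + len;
	size_t used = 0;

	if (!outsz)
		return -1;
	out[0] = '\0';

	while (args < end) {
		const char *nul = memchr(args, '\0', (size_t)(end - args));
		size_t arglen;

		if (!nul)
			return -1; /* argument runs off the end of the request */
		arglen = (size_t)(nul - args);

		if (arglen) {
			/* one extra byte for the ':' separator, if needed */
			size_t need = arglen + (used ? 1 : 0);

			if (used + need >= outsz)
				return -1; /* no room for arg plus final NUL */
			if (used)
				out[used++] = ':';
			memcpy(out + used, args, arglen);
			used += arglen;
			out[used] = '\0';
		}
		args = nul + 1; /* step over the terminating NUL */
	}
	return 0;
}
```

Given the bytes "version=1\0key=value\0", this sketch produces
"version=1:key=value", which mirrors how parse_extra_args() in the patch
builds the value it exports as GIT_PROTOCOL.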


* Re: [PATCH 1.5/8] connect: die when a capability line comes after a ref
  2017-09-20 20:06     ` Brandon Williams
  2017-09-20 20:48       ` Jonathan Nieder
@ 2017-09-21  3:02       ` Junio C Hamano
  2017-09-21 20:45       ` [PATCH] connect: in ref advertisement, shallows are last Jonathan Tan
  2 siblings, 0 replies; 161+ messages in thread
From: Junio C Hamano @ 2017-09-21  3:02 UTC (permalink / raw)
  To: Brandon Williams
  Cc: Jeff King, git, sbeller, jrnieder, bturner, git, jonathantanmy

Brandon Williams <bmwill@google.com> writes:

>> Or is this only for v2 clients, and we've changed the protocol but
>> get_remote_heads() just needs to be updated, too?
>
> A client which didn't request protocol v1 (I'm calling the current
> protocol v0, and v1 is just v0 with the initial response from the server
> containing a version string) should not receive a version string in the
> initial response.  The problem is that when introducing the version
> string to protocol version 1, I didn't want to have to do a huge
> refactoring of ALL of the current transport code so I stuck the version
> check in get_remote_heads() since v1 is exactly the same as v0, except
> for the first line from the server.

It is still unclear from your response what other things the server
is now allowed to say before "version".  I have a slight suspicion
that this change makes the input language overly loose.  Before
eb398797 ("connect: advertized capability is not a ref", 2016-09-09)
introduced the rule "the dummy ref must come before any ref, and no
refs should be sent if a dummy ref is sent", the code used to allow
a ".have" or a "shallow" to appear at the beginning.  With the
"nothing whatsoever from the other end is allowed before the dummy
one" check that the commit introduced, it became a protocol error
to send these before the dummy ref advertisement.  But with this patch,
you are again allowing them to come before the dummy ref, together
with the "version" line you recently added.  I do not know if it is
a problem in practice or not offhand, though.
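
To make the ordering question concrete, here is a rough sketch of a strict
line-ordering check: an optional version line first, then refs (or a lone
"capabilities^{}" dummy ref), then shallow lines. It is only an
illustration under simplifying assumptions, not code from any patch in
this thread: the enum, adv_next() and is_hex_oid() are hypothetical names,
".have" lines are treated like ordinary refs, and capabilities after the
NUL on a ref line are ignored.

```c
#include <stddef.h>
#include <string.h>

#define HEXSZ 40 /* length of a hex SHA-1, like GIT_SHA1_HEXSZ */

enum adv_state {
	ADV_START,	/* nothing seen yet */
	ADV_FIRST_REF,	/* version line seen, no ref yet */
	ADV_REF,	/* at least one ref seen */
	ADV_SHALLOW,	/* only shallow lines may follow */
	ADV_ERROR
};

/* Return 1 if 's' starts with HEXSZ lowercase hex digits. */
static int is_hex_oid(const char *s)
{
	size_t i;

	for (i = 0; i < HEXSZ; i++)
		if (!s[i] || !strchr("0123456789abcdef", s[i]))
			return 0;
	return 1;
}

/* Advance the state machine by one (already chomped) advertisement line. */
static enum adv_state adv_next(enum adv_state state, const char *line)
{
	if (state == ADV_ERROR)
		return ADV_ERROR;

	/* A version line is only valid as the very first line. */
	if (!strncmp(line, "version ", 8))
		return state == ADV_START ? ADV_FIRST_REF : ADV_ERROR;

	/* Shallow lines may start at any point and then run to the end. */
	if (!strncmp(line, "shallow ", 8))
		return is_hex_oid(line + 8) ? ADV_SHALLOW : ADV_ERROR;

	if (is_hex_oid(line) && line[HEXSZ] == ' ') {
		const char *name = line + HEXSZ + 1;

		if (state == ADV_SHALLOW)
			return ADV_ERROR; /* refs may not follow shallows */
		if (!strcmp(name, "capabilities^{}"))
			/* the dummy ref cannot coexist with other refs */
			return state == ADV_REF ? ADV_ERROR : ADV_SHALLOW;
		return ADV_REF;
	}
	return ADV_ERROR;
}
```

A caller would start in ADV_START, feed each pkt-line through adv_next(),
and treat ADV_ERROR as a protocol error. Whether ".have" should be allowed
alongside the dummy ref is exactly the kind of detail this sketch leaves
open.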



* [PATCH] connect: in ref advertisement, shallows are last
  2017-09-20 20:06     ` Brandon Williams
  2017-09-20 20:48       ` Jonathan Nieder
  2017-09-21  3:02       ` Junio C Hamano
@ 2017-09-21 20:45       ` Jonathan Tan
  2017-09-21 23:45         ` [PATCH v2] " Jonathan Tan
  2017-09-26 18:21         ` [PATCH v5] " Jonathan Tan
  2 siblings, 2 replies; 161+ messages in thread
From: Jonathan Tan @ 2017-09-21 20:45 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, jrnieder, gitster, peff

Currently, get_remote_heads() parses the ref advertisement in one loop,
allowing refs and shallow lines to intersperse, despite this not being
allowed by the specification. Refactor get_remote_heads() to use two
loops instead, enforcing that refs come first, and then shallows.

This also makes it easier to teach get_remote_heads() to interpret other
lines in the ref advertisement, which will be done in a subsequent
patch.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
It seems that some people are concerned about looseness in interpreting
the ref advertisement, so here is a patch to tighten it instead. This is
a replacement for Brandon's PATCH 1.5.

I think this is what Jonathan Nieder meant by his instruction flow idea.

I've checked that Brandon's other patches apply cleanly on this patch,
except for "connect: teach client to recognize v1 server response" which
has to be modified to the following:

    @@ -149,6 +150,26 @@ struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
     	*list = NULL;
     
     	len = read_remote_ref(in, &src_buf, &src_len, &responded);
    +
    +	switch (determine_protocol_version_client(packet_buffer)) {
    +	case protocol_v1:
    +		/*
    +		 * First pkt-line contained the version string.
    +		 * Continue on to process the ref advertisement.
    +		 */
    +		len = read_remote_ref(in, &src_buf, &src_len, &responded);
    +		break;
    +	case protocol_v0:
    +		/*
    +		 * Server is speaking protocol v0 and sent a
    +		 * ref so we need to process it.
    +		 */
    +		break;
    +	default:
    +		die("server is speaking an unknown protocol");
    +		break;
    +	}
    +
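
As a rough illustration of what the switch above relies on, the first
pkt-line can be classified along these lines. This is a sketch only: the
enum and classify_first_line() are hypothetical stand-ins, not the actual
determine_protocol_version_client(), and the "version 1" wire string is an
assumption based on the cover letter's description of v1.

```c
#include <string.h>

/* Hypothetical stand-in for the protocol version enum in the series. */
enum proto_version {
	PROTO_UNKNOWN = -1,
	PROTO_V0 = 0,
	PROTO_V1 = 1
};

/*
 * Classify the first (already chomped) line of a server response.
 * Anything that is not a version line is assumed to be the first line
 * of a v0 ref advertisement.
 */
static enum proto_version classify_first_line(const char *line)
{
	if (strncmp(line, "version ", 8))
		return PROTO_V0;	/* no version line: plain v0 */
	if (!strcmp(line + 8, "1"))
		return PROTO_V1;
	return PROTO_UNKNOWN;		/* a version we do not speak */
}
```

With this shape, a v1 response costs one extra read before the ref loop,
a v0 response is handed to the ref loop unchanged, and anything else is
rejected up front.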

 connect.c | 112 ++++++++++++++++++++++++++++++++++----------------------------
 1 file changed, 61 insertions(+), 51 deletions(-)

diff --git a/connect.c b/connect.c
index 49b28b83b..9bf97adf6 100644
--- a/connect.c
+++ b/connect.c
@@ -107,6 +107,26 @@ static void annotate_refs_with_symref_info(struct ref *ref)
 	string_list_clear(&symref, 0);
 }
 
+/*
+ * Read one line of a server's ref advertisement into packet_buffer.
+ */
+int read_remote_ref(int in, char **src_buf, size_t *src_len, int *responded)
+{
+	int len = packet_read(in, src_buf, src_len,
+			      packet_buffer, sizeof(packet_buffer),
+			      PACKET_READ_GENTLE_ON_EOF |
+			      PACKET_READ_CHOMP_NEWLINE);
+	const char *arg;
+	if (len < 0)
+		die_initial_contact(*responded);
+	if (len > 4 && skip_prefix(packet_buffer, "ERR ", &arg))
+		die("remote error: %s", arg);
+
+	*responded = 1;
+
+	return len;
+}
+
 /*
  * Read all the refs from the other end
  */
@@ -123,46 +143,23 @@ struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
 	 * willing to talk to us.  A hang-up before seeing any
 	 * response does not necessarily mean an ACL problem, though.
 	 */
-	int saw_response;
-	int got_dummy_ref_with_capabilities_declaration = 0;
+	int responded = 0;
+	int len;
 
 	*list = NULL;
-	for (saw_response = 0; ; saw_response = 1) {
-		struct ref *ref;
+
+	len = read_remote_ref(in, &src_buf, &src_len, &responded);
+	do {
 		struct object_id old_oid;
 		char *name;
-		int len, name_len;
-		char *buffer = packet_buffer;
-		const char *arg;
+		int name_len;
 
-		len = packet_read(in, &src_buf, &src_len,
-				  packet_buffer, sizeof(packet_buffer),
-				  PACKET_READ_GENTLE_ON_EOF |
-				  PACKET_READ_CHOMP_NEWLINE);
-		if (len < 0)
-			die_initial_contact(saw_response);
-
-		if (!len)
+		if (len < GIT_SHA1_HEXSZ + 2 ||
+		    get_oid_hex(packet_buffer, &old_oid) ||
+		    packet_buffer[GIT_SHA1_HEXSZ] != ' ')
 			break;
 
-		if (len > 4 && skip_prefix(buffer, "ERR ", &arg))
-			die("remote error: %s", arg);
-
-		if (len == GIT_SHA1_HEXSZ + strlen("shallow ") &&
-			skip_prefix(buffer, "shallow ", &arg)) {
-			if (get_oid_hex(arg, &old_oid))
-				die("protocol error: expected shallow sha-1, got '%s'", arg);
-			if (!shallow_points)
-				die("repository on the other end cannot be shallow");
-			oid_array_append(shallow_points, &old_oid);
-			continue;
-		}
-
-		if (len < GIT_SHA1_HEXSZ + 2 || get_oid_hex(buffer, &old_oid) ||
-			buffer[GIT_SHA1_HEXSZ] != ' ')
-			die("protocol error: expected sha/ref, got '%s'", buffer);
-		name = buffer + GIT_SHA1_HEXSZ + 1;
-
+		name = packet_buffer + GIT_SHA1_HEXSZ + 1;
 		name_len = strlen(name);
 		if (len != name_len + GIT_SHA1_HEXSZ + 1) {
 			free(server_capabilities);
@@ -171,29 +168,42 @@ struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
 
 		if (extra_have && !strcmp(name, ".have")) {
 			oid_array_append(extra_have, &old_oid);
-			continue;
-		}
-
-		if (!strcmp(name, "capabilities^{}")) {
-			if (saw_response)
+		} else if (!strcmp(name, "capabilities^{}")) {
+			if (*list)
+				/* cannot coexist with other refs */
 				die("protocol error: unexpected capabilities^{}");
-			if (got_dummy_ref_with_capabilities_declaration)
-				die("protocol error: multiple capabilities^{}");
-			got_dummy_ref_with_capabilities_declaration = 1;
-			continue;
+			/*
+			 * There should be no more refs; read the next line and
+			 * go to next block.
+			 */
+			len = read_remote_ref(in, &src_buf, &src_len,
+					      &responded);
+			break;
+		} else if (check_ref(name, flags)) {
+			struct ref *ref = alloc_ref(name);
+			oidcpy(&ref->old_oid, &old_oid);
+			*list = ref;
+			list = &ref->next;
 		}
+	} while ((len = read_remote_ref(in, &src_buf, &src_len, &responded)));
 
-		if (!check_ref(name, flags))
-			continue;
+	do {
+		const char *arg;
+		struct object_id old_oid;
 
-		if (got_dummy_ref_with_capabilities_declaration)
-			die("protocol error: unexpected ref after capabilities^{}");
+		if (skip_prefix(packet_buffer, "shallow ", &arg)) {
+			if (get_oid_hex(arg, &old_oid))
+				die("protocol error: expected shallow sha-1, got '%s'", arg);
+			if (!shallow_points)
+				die("repository on the other end cannot be shallow");
+			oid_array_append(shallow_points, &old_oid);
+		} else {
+			break;
+		}
+	} while ((len = read_remote_ref(in, &src_buf, &src_len, &responded)));
 
-		ref = alloc_ref(buffer + GIT_SHA1_HEXSZ + 1);
-		oidcpy(&ref->old_oid, &old_oid);
-		*list = ref;
-		list = &ref->next;
-	}
+	if (len)
+		die("protocol error: unexpected '%s'", packet_buffer);
 
 	annotate_refs_with_symref_info(*orig_list);
 
-- 
2.14.1.728.g20a5b67d5.dirty



* Re: [PATCH 3/8] daemon: recognize hidden request arguments
  2017-09-21  0:31     ` Jonathan Tan
@ 2017-09-21 21:55       ` Brandon Williams
  0 siblings, 0 replies; 161+ messages in thread
From: Brandon Williams @ 2017-09-21 21:55 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, peff, sbeller, gitster, jrnieder, bturner, git

On 09/20, Jonathan Tan wrote:
> On Wed, 20 Sep 2017 17:24:43 -0700
> Jonathan Tan <jonathantanmy@google.com> wrote:
> 
> > On Wed, 13 Sep 2017 14:54:43 -0700
> > Brandon Williams <bmwill@google.com> wrote:
> > 
> > > A normal request to git-daemon is structured as
> > > "command path/to/repo\0host=..\0" and due to a bug in an old version of
> > > git-daemon 73bb33a94 (daemon: Strictly parse the "extra arg" part of the
> > > command, 2009-06-04) we aren't able to place any extra args (separated
> > > by NULs) besides the host.
> > > 
> > > In order to get around this limitation teach git-daemon to recognize
> > > additional request arguments hidden behind a second NUL byte.  Requests
> > > can then be structured like:
> > > "command path/to/repo\0host=..\0\0version=1\0key=value\0".  git-daemon
> > > can then parse out the extra arguments and set 'GIT_PROTOCOL'
> > > accordingly.
> > 
> > A test in this patch (if possible) would be nice, but it is probably
> > clearer to test this when one of the commands (e.g. upload-pack) is
> > done. Could a test be added to the next patch to verify (using
> > GIT_TRACE_PACKET, maybe) that the expected strings are sent? Then
> > mention in this commit message that this will be tested in the next
> > patch too.
> 
> Ah, I see that it is tested in 6/8. You can ignore this comment.

Yeah I felt it would have been difficult to test any earlier without
both the client and server sides done.

-- 
Brandon Williams


* [PATCH v2] connect: in ref advertisement, shallows are last
  2017-09-21 20:45       ` [PATCH] connect: in ref advertisement, shallows are last Jonathan Tan
@ 2017-09-21 23:45         ` Jonathan Tan
  2017-09-22  0:00           ` Brandon Williams
  2017-09-26 18:21         ` [PATCH v5] " Jonathan Tan
  1 sibling, 1 reply; 161+ messages in thread
From: Jonathan Tan @ 2017-09-21 23:45 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, jrnieder, gitster, peff, bmwill

Currently, get_remote_heads() parses the ref advertisement in one loop,
allowing refs and shallow lines to intersperse, despite this not being
allowed by the specification. Refactor get_remote_heads() to use two
loops instead, enforcing that refs come first, and then shallows.

This also makes it easier to teach get_remote_heads() to interpret other
lines in the ref advertisement, which will be done in a subsequent
patch.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
In some in-office discussion, I was informed that my original patch
relaxed the ordering of ".keep" lines. Here is an update.

I'm also using a switch statement now, which avoids having multiple
lines of read_remote_ref().

 connect.c | 159 ++++++++++++++++++++++++++++++++++++--------------------------
 1 file changed, 93 insertions(+), 66 deletions(-)

diff --git a/connect.c b/connect.c
index 49b28b83b..e0821dbff 100644
--- a/connect.c
+++ b/connect.c
@@ -107,6 +107,84 @@ static void annotate_refs_with_symref_info(struct ref *ref)
 	string_list_clear(&symref, 0);
 }
 
+/*
+ * Read one line of a server's ref advertisement into packet_buffer.
+ */
+static int read_remote_ref(int in, char **src_buf, size_t *src_len,
+			   int *responded)
+{
+	int len = packet_read(in, src_buf, src_len,
+			      packet_buffer, sizeof(packet_buffer),
+			      PACKET_READ_GENTLE_ON_EOF |
+			      PACKET_READ_CHOMP_NEWLINE);
+	const char *arg;
+	if (len < 0)
+		die_initial_contact(*responded);
+	if (len > 4 && skip_prefix(packet_buffer, "ERR ", &arg))
+		die("remote error: %s", arg);
+
+	*responded = 1;
+
+	return len;
+}
+
+#define EXPECTING_REF 0
+#define EXPECTING_SHALLOW 1
+
+static int process_ref(int *state, int len, struct ref ***list,
+		       unsigned int flags, struct oid_array *extra_have)
+{
+	struct object_id old_oid;
+	char *name;
+	int name_len;
+
+	if (len < GIT_SHA1_HEXSZ + 2 ||
+	    get_oid_hex(packet_buffer, &old_oid) ||
+	    packet_buffer[GIT_SHA1_HEXSZ] != ' ') {
+		(*state)++;
+		return 0;
+	}
+
+	name = packet_buffer + GIT_SHA1_HEXSZ + 1;
+	name_len = strlen(name);
+	if (len != name_len + GIT_SHA1_HEXSZ + 1) {
+		free(server_capabilities);
+		server_capabilities = xstrdup(name + name_len + 1);
+	}
+
+	if (extra_have && !strcmp(name, ".have")) {
+		oid_array_append(extra_have, &old_oid);
+	} else if (!strcmp(name, "capabilities^{}")) {
+		if (**list)
+			/* cannot coexist with other refs */
+			die("protocol error: unexpected capabilities^{}");
+		/* There should be no more refs; proceed to the next state. */
+		(*state)++;
+	} else if (check_ref(name, flags)) {
+		struct ref *ref = alloc_ref(name);
+		oidcpy(&ref->old_oid, &old_oid);
+		**list = ref;
+		*list = &ref->next;
+	}
+	return 1;
+}
+
+static int process_shallow(int *state, struct oid_array *shallow_points)
+{
+	const char *arg;
+	struct object_id old_oid;
+
+	if (!skip_prefix(packet_buffer, "shallow ", &arg))
+		return 0;
+
+	if (get_oid_hex(arg, &old_oid))
+		die("protocol error: expected shallow sha-1, got '%s'", arg);
+	if (!shallow_points)
+		die("repository on the other end cannot be shallow");
+	oid_array_append(shallow_points, &old_oid);
+	return 1;
+}
+
 /*
  * Read all the refs from the other end
  */
@@ -123,76 +201,25 @@ struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
 	 * willing to talk to us.  A hang-up before seeing any
 	 * response does not necessarily mean an ACL problem, though.
 	 */
-	int saw_response;
-	int got_dummy_ref_with_capabilities_declaration = 0;
+	int responded = 0;
+	int len;
+	int state = EXPECTING_REF;
 
 	*list = NULL;
-	for (saw_response = 0; ; saw_response = 1) {
-		struct ref *ref;
-		struct object_id old_oid;
-		char *name;
-		int len, name_len;
-		char *buffer = packet_buffer;
-		const char *arg;
-
-		len = packet_read(in, &src_buf, &src_len,
-				  packet_buffer, sizeof(packet_buffer),
-				  PACKET_READ_GENTLE_ON_EOF |
-				  PACKET_READ_CHOMP_NEWLINE);
-		if (len < 0)
-			die_initial_contact(saw_response);
-
-		if (!len)
-			break;
-
-		if (len > 4 && skip_prefix(buffer, "ERR ", &arg))
-			die("remote error: %s", arg);
-
-		if (len == GIT_SHA1_HEXSZ + strlen("shallow ") &&
-			skip_prefix(buffer, "shallow ", &arg)) {
-			if (get_oid_hex(arg, &old_oid))
-				die("protocol error: expected shallow sha-1, got '%s'", arg);
-			if (!shallow_points)
-				die("repository on the other end cannot be shallow");
-			oid_array_append(shallow_points, &old_oid);
-			continue;
-		}
-
-		if (len < GIT_SHA1_HEXSZ + 2 || get_oid_hex(buffer, &old_oid) ||
-			buffer[GIT_SHA1_HEXSZ] != ' ')
-			die("protocol error: expected sha/ref, got '%s'", buffer);
-		name = buffer + GIT_SHA1_HEXSZ + 1;
-
-		name_len = strlen(name);
-		if (len != name_len + GIT_SHA1_HEXSZ + 1) {
-			free(server_capabilities);
-			server_capabilities = xstrdup(name + name_len + 1);
-		}
 
-		if (extra_have && !strcmp(name, ".have")) {
-			oid_array_append(extra_have, &old_oid);
-			continue;
-		}
-
-		if (!strcmp(name, "capabilities^{}")) {
-			if (saw_response)
-				die("protocol error: unexpected capabilities^{}");
-			if (got_dummy_ref_with_capabilities_declaration)
-				die("protocol error: multiple capabilities^{}");
-			got_dummy_ref_with_capabilities_declaration = 1;
-			continue;
+	while ((len = read_remote_ref(in, &src_buf, &src_len, &responded))) {
+		switch (state) {
+		case EXPECTING_REF:
+			if (process_ref(&state, len, &list, flags, extra_have))
+				break;
+			/* fallthrough */
+		case EXPECTING_SHALLOW:
+			if (process_shallow(&state, shallow_points))
+				break;
+			die("protocol error: unexpected '%s'", packet_buffer);
+		default:
+			die("unexpected state %d", state);
 		}
-
-		if (!check_ref(name, flags))
-			continue;
-
-		if (got_dummy_ref_with_capabilities_declaration)
-			die("protocol error: unexpected ref after capabilities^{}");
-
-		ref = alloc_ref(buffer + GIT_SHA1_HEXSZ + 1);
-		oidcpy(&ref->old_oid, &old_oid);
-		*list = ref;
-		list = &ref->next;
 	}
 
 	annotate_refs_with_symref_info(*orig_list);
-- 
2.14.1.728.g20a5b67d5.dirty



* Re: [PATCH v2] connect: in ref advertisement, shallows are last
  2017-09-21 23:45         ` [PATCH v2] " Jonathan Tan
@ 2017-09-22  0:00           ` Brandon Williams
  2017-09-22  0:08             ` [PATCH v3] " Jonathan Tan
  0 siblings, 1 reply; 161+ messages in thread
From: Brandon Williams @ 2017-09-22  0:00 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, jrnieder, gitster, peff

On 09/21, Jonathan Tan wrote:
> Currently, get_remote_heads() parses the ref advertisement in one loop,
> allowing refs and shallow lines to intersperse, despite this not being
> allowed by the specification. Refactor get_remote_heads() to use two
> loops instead, enforcing that refs come first, and then shallows.
> 
> This also makes it easier to teach get_remote_heads() to interpret other
> lines in the ref advertisement, which will be done in a subsequent
> patch.
> 
> Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
> ---
> In some in-office discussion, I was informed that my original patch
> relaxed the ordering of ".keep" lines. Here is an update.
> 
> I'm also using a switch statement now, which avoids having multiple
> lines of read_remote_ref().


Looks cleaner than the last patch.

> 
>  connect.c | 159 ++++++++++++++++++++++++++++++++++++--------------------------
>  1 file changed, 93 insertions(+), 66 deletions(-)
> 
> diff --git a/connect.c b/connect.c
> index 49b28b83b..e0821dbff 100644
> --- a/connect.c
> +++ b/connect.c
> @@ -107,6 +107,84 @@ static void annotate_refs_with_symref_info(struct ref *ref)
>  	string_list_clear(&symref, 0);
>  }
>  
> +/*
> + * Read one line of a server's ref advertisement into packet_buffer.
> + */
> +static int read_remote_ref(int in, char **src_buf, size_t *src_len,
> +			   int *responded)
> +{
> +	int len = packet_read(in, src_buf, src_len,
> +			      packet_buffer, sizeof(packet_buffer),
> +			      PACKET_READ_GENTLE_ON_EOF |
> +			      PACKET_READ_CHOMP_NEWLINE);
> +	const char *arg;
> +	if (len < 0)
> +		die_initial_contact(*responded);
> +	if (len > 4 && skip_prefix(packet_buffer, "ERR ", &arg))
> +		die("remote error: %s", arg);
> +
> +	*responded = 1;
> +
> +	return len;
> +}
> +
> +#define EXPECTING_REF 0
> +#define EXPECTING_SHALLOW 1
> +
> +static int process_ref(int *state, int len, struct ref ***list,
> +		       unsigned int flags, struct oid_array *extra_have)
> +{
> +	struct object_id old_oid;
> +	char *name;
> +	int name_len;
> +
> +	if (len < GIT_SHA1_HEXSZ + 2 ||
> +	    get_oid_hex(packet_buffer, &old_oid) ||
> +	    packet_buffer[GIT_SHA1_HEXSZ] != ' ') {
> +		(*state)++;

I think it may be cleaner if the state variable is updated outside of
this function based on a return value from this function.

> +		return 0;
> +	}
> +
> +	name = packet_buffer + GIT_SHA1_HEXSZ + 1;
> +	name_len = strlen(name);
> +	if (len != name_len + GIT_SHA1_HEXSZ + 1) {
> +		free(server_capabilities);
> +		server_capabilities = xstrdup(name + name_len + 1);
> +	}
> +
> +	if (extra_have && !strcmp(name, ".have")) {
> +		oid_array_append(extra_have, &old_oid);
> +	} else if (!strcmp(name, "capabilities^{}")) {
> +		if (**list)
> +			/* cannot coexist with other refs */
> +			die("protocol error: unexpected capabilities^{}");
> +		/* There should be no more refs; proceed to the next state. */
> +		(*state)++;
> +	} else if (check_ref(name, flags)) {
> +		struct ref *ref = alloc_ref(name);
> +		oidcpy(&ref->old_oid, &old_oid);
> +		**list = ref;
> +		*list = &ref->next;
> +	}
> +	return 1;
> +}
> +
> +static int process_shallow(int *state, struct oid_array *shallow_points)

state isn't needed here and could be dropped from the parameter list.

> +{
> +	const char *arg;
> +	struct object_id old_oid;
> +
> +	if (!skip_prefix(packet_buffer, "shallow ", &arg))
> +		return 0;
> +
> +	if (get_oid_hex(arg, &old_oid))
> +		die("protocol error: expected shallow sha-1, got '%s'", arg);
> +	if (!shallow_points)
> +		die("repository on the other end cannot be shallow");
> +	oid_array_append(shallow_points, &old_oid);
> +	return 1;
> +}
> +
>  /*
>   * Read all the refs from the other end
>   */
> @@ -123,76 +201,25 @@ struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
>  	 * willing to talk to us.  A hang-up before seeing any
>  	 * response does not necessarily mean an ACL problem, though.
>  	 */
> -	int saw_response;
> -	int got_dummy_ref_with_capabilities_declaration = 0;
> +	int responded = 0;
> +	int len;
> +	int state = EXPECTING_REF;
>  
>  	*list = NULL;
> -	for (saw_response = 0; ; saw_response = 1) {
> -		struct ref *ref;
> -		struct object_id old_oid;
> -		char *name;
> -		int len, name_len;
> -		char *buffer = packet_buffer;
> -		const char *arg;
> -
> -		len = packet_read(in, &src_buf, &src_len,
> -				  packet_buffer, sizeof(packet_buffer),
> -				  PACKET_READ_GENTLE_ON_EOF |
> -				  PACKET_READ_CHOMP_NEWLINE);
> -		if (len < 0)
> -			die_initial_contact(saw_response);
> -
> -		if (!len)
> -			break;
> -
> -		if (len > 4 && skip_prefix(buffer, "ERR ", &arg))
> -			die("remote error: %s", arg);
> -
> -		if (len == GIT_SHA1_HEXSZ + strlen("shallow ") &&
> -			skip_prefix(buffer, "shallow ", &arg)) {
> -			if (get_oid_hex(arg, &old_oid))
> -				die("protocol error: expected shallow sha-1, got '%s'", arg);
> -			if (!shallow_points)
> -				die("repository on the other end cannot be shallow");
> -			oid_array_append(shallow_points, &old_oid);
> -			continue;
> -		}
> -
> -		if (len < GIT_SHA1_HEXSZ + 2 || get_oid_hex(buffer, &old_oid) ||
> -			buffer[GIT_SHA1_HEXSZ] != ' ')
> -			die("protocol error: expected sha/ref, got '%s'", buffer);
> -		name = buffer + GIT_SHA1_HEXSZ + 1;
> -
> -		name_len = strlen(name);
> -		if (len != name_len + GIT_SHA1_HEXSZ + 1) {
> -			free(server_capabilities);
> -			server_capabilities = xstrdup(name + name_len + 1);
> -		}
>  
> -		if (extra_have && !strcmp(name, ".have")) {
> -			oid_array_append(extra_have, &old_oid);
> -			continue;
> -		}
> -
> -		if (!strcmp(name, "capabilities^{}")) {
> -			if (saw_response)
> -				die("protocol error: unexpected capabilities^{}");
> -			if (got_dummy_ref_with_capabilities_declaration)
> -				die("protocol error: multiple capabilities^{}");
> -			got_dummy_ref_with_capabilities_declaration = 1;
> -			continue;
> +	while ((len = read_remote_ref(in, &src_buf, &src_len, &responded))) {
> +		switch (state) {
> +		case EXPECTING_REF:
> +			if (process_ref(&state, len, &list, flags, extra_have))
> +				break;
> +			/* fallthrough */
> +		case EXPECTING_SHALLOW:
> +			if (process_shallow(&state, shallow_points))
> +				break;
> +			die("protocol error: unexpected '%s'", packet_buffer);
> +		default:
> +			die("unexpected state %d", state);
>  		}
> -
> -		if (!check_ref(name, flags))
> -			continue;
> -
> -		if (got_dummy_ref_with_capabilities_declaration)
> -			die("protocol error: unexpected ref after capabilities^{}");
> -
> -		ref = alloc_ref(buffer + GIT_SHA1_HEXSZ + 1);
> -		oidcpy(&ref->old_oid, &old_oid);
> -		*list = ref;
> -		list = &ref->next;
>  	}
>  
>  	annotate_refs_with_symref_info(*orig_list);
> -- 
> 2.14.1.728.g20a5b67d5.dirty
> 

-- 
Brandon Williams


* [PATCH v3] connect: in ref advertisement, shallows are last
  2017-09-22  0:00           ` Brandon Williams
@ 2017-09-22  0:08             ` Jonathan Tan
  2017-09-22  1:06               ` Junio C Hamano
  0 siblings, 1 reply; 161+ messages in thread
From: Jonathan Tan @ 2017-09-22  0:08 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, jrnieder, gitster, peff, bmwill

Currently, get_remote_heads() parses the ref advertisement in one loop,
allowing refs and shallow lines to intersperse, despite this not being
allowed by the specification. Refactor get_remote_heads() to use two
loops instead, enforcing that refs come first, and then shallows.

This also makes it easier to teach get_remote_heads() to interpret other
lines in the ref advertisement, which will be done in a subsequent
patch.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
I sent the wrong version of this patch :-(

This should be the correct one. A bit less clean because I introduced a
3rd state, however.

 connect.c | 167 +++++++++++++++++++++++++++++++++++++-------------------------
 1 file changed, 101 insertions(+), 66 deletions(-)

diff --git a/connect.c b/connect.c
index 49b28b83b..ef6358cfc 100644
--- a/connect.c
+++ b/connect.c
@@ -107,6 +107,91 @@ static void annotate_refs_with_symref_info(struct ref *ref)
 	string_list_clear(&symref, 0);
 }
 
+/*
+ * Read one line of a server's ref advertisement into packet_buffer.
+ */
+static int read_remote_ref(int in, char **src_buf, size_t *src_len,
+			   int *responded)
+{
+	int len = packet_read(in, src_buf, src_len,
+			      packet_buffer, sizeof(packet_buffer),
+			      PACKET_READ_GENTLE_ON_EOF |
+			      PACKET_READ_CHOMP_NEWLINE);
+	const char *arg;
+	if (len < 0)
+		die_initial_contact(*responded);
+	if (len > 4 && skip_prefix(packet_buffer, "ERR ", &arg))
+		die("remote error: %s", arg);
+
+	*responded = 1;
+
+	return len;
+}
+
+#define EXPECTING_REF_WITH_CAPABILITIES 0
+#define EXPECTING_REF 1
+#define EXPECTING_SHALLOW 2
+
+static int process_ref(int *state, int len, struct ref ***list,
+		       unsigned int flags, struct oid_array *extra_have)
+{
+	struct object_id old_oid;
+	char *name;
+	int name_len;
+
+	if (len < GIT_SHA1_HEXSZ + 2 ||
+	    get_oid_hex(packet_buffer, &old_oid) ||
+	    packet_buffer[GIT_SHA1_HEXSZ] != ' ') {
+		*state = EXPECTING_SHALLOW;
+		return 0;
+	}
+
+	name = packet_buffer + GIT_SHA1_HEXSZ + 1;
+	name_len = strlen(name);
+	if (*state == EXPECTING_REF_WITH_CAPABILITIES &&
+	    len != name_len + GIT_SHA1_HEXSZ + 1) {
+		free(server_capabilities);
+		server_capabilities = xstrdup(name + name_len + 1);
+	} else if (*state == EXPECTING_REF) {
+		if (len != name_len + GIT_SHA1_HEXSZ + 1)
+			die("unexpected capabilities after ref name");
+	}
+
+	if (extra_have && !strcmp(name, ".have")) {
+		oid_array_append(extra_have, &old_oid);
+	} else if (!strcmp(name, "capabilities^{}")) {
+		if (**list)
+			/* cannot coexist with other refs */
+			die("protocol error: unexpected capabilities^{}");
+		/* There should be no more refs; proceed to the next state. */
+		*state = EXPECTING_SHALLOW;
+		return 1;
+	} else if (check_ref(name, flags)) {
+		struct ref *ref = alloc_ref(name);
+		oidcpy(&ref->old_oid, &old_oid);
+		**list = ref;
+		*list = &ref->next;
+	}
+	*state = EXPECTING_REF;
+	return 1;
+}
+
+static int process_shallow(struct oid_array *shallow_points)
+{
+	const char *arg;
+	struct object_id old_oid;
+
+	if (!skip_prefix(packet_buffer, "shallow ", &arg))
+		return 0;
+
+	if (get_oid_hex(arg, &old_oid))
+		die("protocol error: expected shallow sha-1, got '%s'", arg);
+	if (!shallow_points)
+		die("repository on the other end cannot be shallow");
+	oid_array_append(shallow_points, &old_oid);
+	return 1;
+}
+
 /*
  * Read all the refs from the other end
  */
@@ -123,76 +208,26 @@ struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
 	 * willing to talk to us.  A hang-up before seeing any
 	 * response does not necessarily mean an ACL problem, though.
 	 */
-	int saw_response;
-	int got_dummy_ref_with_capabilities_declaration = 0;
+	int responded = 0;
+	int len;
+	int state = EXPECTING_REF_WITH_CAPABILITIES;
 
 	*list = NULL;
-	for (saw_response = 0; ; saw_response = 1) {
-		struct ref *ref;
-		struct object_id old_oid;
-		char *name;
-		int len, name_len;
-		char *buffer = packet_buffer;
-		const char *arg;
-
-		len = packet_read(in, &src_buf, &src_len,
-				  packet_buffer, sizeof(packet_buffer),
-				  PACKET_READ_GENTLE_ON_EOF |
-				  PACKET_READ_CHOMP_NEWLINE);
-		if (len < 0)
-			die_initial_contact(saw_response);
-
-		if (!len)
-			break;
-
-		if (len > 4 && skip_prefix(buffer, "ERR ", &arg))
-			die("remote error: %s", arg);
-
-		if (len == GIT_SHA1_HEXSZ + strlen("shallow ") &&
-			skip_prefix(buffer, "shallow ", &arg)) {
-			if (get_oid_hex(arg, &old_oid))
-				die("protocol error: expected shallow sha-1, got '%s'", arg);
-			if (!shallow_points)
-				die("repository on the other end cannot be shallow");
-			oid_array_append(shallow_points, &old_oid);
-			continue;
-		}
-
-		if (len < GIT_SHA1_HEXSZ + 2 || get_oid_hex(buffer, &old_oid) ||
-			buffer[GIT_SHA1_HEXSZ] != ' ')
-			die("protocol error: expected sha/ref, got '%s'", buffer);
-		name = buffer + GIT_SHA1_HEXSZ + 1;
-
-		name_len = strlen(name);
-		if (len != name_len + GIT_SHA1_HEXSZ + 1) {
-			free(server_capabilities);
-			server_capabilities = xstrdup(name + name_len + 1);
-		}
-
-		if (extra_have && !strcmp(name, ".have")) {
-			oid_array_append(extra_have, &old_oid);
-			continue;
-		}
 
-		if (!strcmp(name, "capabilities^{}")) {
-			if (saw_response)
-				die("protocol error: unexpected capabilities^{}");
-			if (got_dummy_ref_with_capabilities_declaration)
-				die("protocol error: multiple capabilities^{}");
-			got_dummy_ref_with_capabilities_declaration = 1;
-			continue;
+	while ((len = read_remote_ref(in, &src_buf, &src_len, &responded))) {
+		switch (state) {
+		case EXPECTING_REF_WITH_CAPABILITIES:
+		case EXPECTING_REF:
+			if (process_ref(&state, len, &list, flags, extra_have))
+				break;
+			/* fallthrough */
+		case EXPECTING_SHALLOW:
+			if (process_shallow(shallow_points))
+				break;
+			die("protocol error: unexpected '%s'", packet_buffer);
+		default:
+			die("unexpected state %d", state);
 		}
-
-		if (!check_ref(name, flags))
-			continue;
-
-		if (got_dummy_ref_with_capabilities_declaration)
-			die("protocol error: unexpected ref after capabilities^{}");
-
-		ref = alloc_ref(buffer + GIT_SHA1_HEXSZ + 1);
-		oidcpy(&ref->old_oid, &old_oid);
-		*list = ref;
-		list = &ref->next;
 	}
 
 	annotate_refs_with_symref_info(*orig_list);
-- 
2.14.1.728.g20a5b67d5.dirty



* Re: [PATCH v3] connect: in ref advertisement, shallows are last
  2017-09-22  0:08             ` [PATCH v3] " Jonathan Tan
@ 2017-09-22  1:06               ` Junio C Hamano
  2017-09-22  1:39                 ` Junio C Hamano
  0 siblings, 1 reply; 161+ messages in thread
From: Junio C Hamano @ 2017-09-22  1:06 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, jrnieder, peff, bmwill

Jonathan Tan <jonathantanmy@google.com> writes:

> Currently, get_remote_heads() parses the ref advertisement in one loop,
> allowing refs and shallow lines to intersperse, despite this not being
> allowed by the specification. Refactor get_remote_heads() to use two
> loops instead, enforcing that refs come first, and then shallows.
>
> This also makes it easier to teach get_remote_heads() to interpret other
> lines in the ref advertisement, which will be done in a subsequent
> patch.

Sounds sensible.  This still replaces the earlier 1.5?

>
> Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
> ---
> I sent the wrong version of this patch :-(
>
> This should be the correct one. A bit less clean because I introduced a
> 3rd state, however.
>
>  connect.c | 167 +++++++++++++++++++++++++++++++++++++-------------------------
>  1 file changed, 101 insertions(+), 66 deletions(-)
>
> diff --git a/connect.c b/connect.c
> index 49b28b83b..ef6358cfc 100644
> --- a/connect.c
> +++ b/connect.c
> @@ -107,6 +107,91 @@ static void annotate_refs_with_symref_info(struct ref *ref)
>  	string_list_clear(&symref, 0);
>  }
>  
> +/*
> + * Read one line of a server's ref advertisement into packet_buffer.
> + */
> +static int read_remote_ref(int in, char **src_buf, size_t *src_len,
> +			   int *responded)
> +{
> +	int len = packet_read(in, src_buf, src_len,
> +			      packet_buffer, sizeof(packet_buffer),
> +			      PACKET_READ_GENTLE_ON_EOF |
> +			      PACKET_READ_CHOMP_NEWLINE);
> +	const char *arg;
> +	if (len < 0)
> +		die_initial_contact(*responded);
> +	if (len > 4 && skip_prefix(packet_buffer, "ERR ", &arg))
> +		die("remote error: %s", arg);
> +
> +	*responded = 1;
> +
> +	return len;
> +}
> +
> +#define EXPECTING_REF_WITH_CAPABILITIES 0
> +#define EXPECTING_REF 1
> +#define EXPECTING_SHALLOW 2
> +
> +static int process_ref(int *state, int len, struct ref ***list,
> +		       unsigned int flags, struct oid_array *extra_have)
> +{
> +	struct object_id old_oid;
> +	char *name;
> +	int name_len;
> +
> +	if (len < GIT_SHA1_HEXSZ + 2 ||
> +	    get_oid_hex(packet_buffer, &old_oid) ||
> +	    packet_buffer[GIT_SHA1_HEXSZ] != ' ') {
> +		*state = EXPECTING_SHALLOW;
> +		return 0;
> +	}
> +
> +	name = packet_buffer + GIT_SHA1_HEXSZ + 1;
> +	name_len = strlen(name);
> +	if (*state == EXPECTING_REF_WITH_CAPABILITIES &&
> +	    len != name_len + GIT_SHA1_HEXSZ + 1) {
> +		free(server_capabilities);
> +		server_capabilities = xstrdup(name + name_len + 1);
> +	} else if (*state == EXPECTING_REF) {
> +		if (len != name_len + GIT_SHA1_HEXSZ + 1)
> +			die("unexpected capabilities after ref name");
> +	}
> +
> +	if (extra_have && !strcmp(name, ".have")) {
> +		oid_array_append(extra_have, &old_oid);
> +	} else if (!strcmp(name, "capabilities^{}")) {
> +		if (**list)
> +			/* cannot coexist with other refs */
> +			die("protocol error: unexpected capabilities^{}");
> +		/* There should be no more refs; proceed to the next state. */
> +		*state = EXPECTING_SHALLOW;
> +		return 1;
> +	} else if (check_ref(name, flags)) {
> +		struct ref *ref = alloc_ref(name);
> +		oidcpy(&ref->old_oid, &old_oid);
> +		**list = ref;
> +		*list = &ref->next;
> +	}
> +	*state = EXPECTING_REF;
> +	return 1;
> +}
> +
> +static int process_shallow(struct oid_array *shallow_points)
> +{
> +	const char *arg;
> +	struct object_id old_oid;
> +
> +	if (!skip_prefix(packet_buffer, "shallow ", &arg))
> +		return 0;
> +
> +	if (get_oid_hex(arg, &old_oid))
> +		die("protocol error: expected shallow sha-1, got '%s'", arg);
> +	if (!shallow_points)
> +		die("repository on the other end cannot be shallow");
> +	oid_array_append(shallow_points, &old_oid);
> +	return 1;
> +}
> +
>  /*
>   * Read all the refs from the other end
>   */
> @@ -123,76 +208,26 @@ struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
>  	 * willing to talk to us.  A hang-up before seeing any
>  	 * response does not necessarily mean an ACL problem, though.
>  	 */
> -	int saw_response;
> -	int got_dummy_ref_with_capabilities_declaration = 0;
> +	int responded = 0;
> +	int len;
> +	int state = EXPECTING_REF_WITH_CAPABILITIES;
>  
>  	*list = NULL;
> -	for (saw_response = 0; ; saw_response = 1) {
> -		struct ref *ref;
> -		struct object_id old_oid;
> -		char *name;
> -		int len, name_len;
> -		char *buffer = packet_buffer;
> -		const char *arg;
> -
> -		len = packet_read(in, &src_buf, &src_len,
> -				  packet_buffer, sizeof(packet_buffer),
> -				  PACKET_READ_GENTLE_ON_EOF |
> -				  PACKET_READ_CHOMP_NEWLINE);
> -		if (len < 0)
> -			die_initial_contact(saw_response);
> -
> -		if (!len)
> -			break;
> -
> -		if (len > 4 && skip_prefix(buffer, "ERR ", &arg))
> -			die("remote error: %s", arg);
> -
> -		if (len == GIT_SHA1_HEXSZ + strlen("shallow ") &&
> -			skip_prefix(buffer, "shallow ", &arg)) {
> -			if (get_oid_hex(arg, &old_oid))
> -				die("protocol error: expected shallow sha-1, got '%s'", arg);
> -			if (!shallow_points)
> -				die("repository on the other end cannot be shallow");
> -			oid_array_append(shallow_points, &old_oid);
> -			continue;
> -		}
> -
> -		if (len < GIT_SHA1_HEXSZ + 2 || get_oid_hex(buffer, &old_oid) ||
> -			buffer[GIT_SHA1_HEXSZ] != ' ')
> -			die("protocol error: expected sha/ref, got '%s'", buffer);
> -		name = buffer + GIT_SHA1_HEXSZ + 1;
> -
> -		name_len = strlen(name);
> -		if (len != name_len + GIT_SHA1_HEXSZ + 1) {
> -			free(server_capabilities);
> -			server_capabilities = xstrdup(name + name_len + 1);
> -		}
> -
> -		if (extra_have && !strcmp(name, ".have")) {
> -			oid_array_append(extra_have, &old_oid);
> -			continue;
> -		}
>  
> -		if (!strcmp(name, "capabilities^{}")) {
> -			if (saw_response)
> -				die("protocol error: unexpected capabilities^{}");
> -			if (got_dummy_ref_with_capabilities_declaration)
> -				die("protocol error: multiple capabilities^{}");
> -			got_dummy_ref_with_capabilities_declaration = 1;
> -			continue;
> +	while ((len = read_remote_ref(in, &src_buf, &src_len, &responded))) {
> +		switch (state) {
> +		case EXPECTING_REF_WITH_CAPABILITIES:
> +		case EXPECTING_REF:
> +			if (process_ref(&state, len, &list, flags, extra_have))
> +				break;
> +			/* fallthrough */
> +		case EXPECTING_SHALLOW:
> +			if (process_shallow(shallow_points))
> +				break;
> +			die("protocol error: unexpected '%s'", packet_buffer);
> +		default:
> +			die("unexpected state %d", state);
>  		}
> -
> -		if (!check_ref(name, flags))
> -			continue;
> -
> -		if (got_dummy_ref_with_capabilities_declaration)
> -			die("protocol error: unexpected ref after capabilities^{}");
> -
> -		ref = alloc_ref(buffer + GIT_SHA1_HEXSZ + 1);
> -		oidcpy(&ref->old_oid, &old_oid);
> -		*list = ref;
> -		list = &ref->next;
>  	}
>  
>  	annotate_refs_with_symref_info(*orig_list);


* Re: [PATCH v3] connect: in ref advertisement, shallows are last
  2017-09-22  1:06               ` Junio C Hamano
@ 2017-09-22  1:39                 ` Junio C Hamano
  2017-09-22 16:45                   ` Brandon Williams
  0 siblings, 1 reply; 161+ messages in thread
From: Junio C Hamano @ 2017-09-22  1:39 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, jrnieder, peff, bmwill

Junio C Hamano <gitster@pobox.com> writes:

> Jonathan Tan <jonathantanmy@google.com> writes:
>
>> Currently, get_remote_heads() parses the ref advertisement in one loop,
>> allowing refs and shallow lines to intersperse, despite this not being
>> allowed by the specification. Refactor get_remote_heads() to use two
>> loops instead, enforcing that refs come first, and then shallows.
>>
>> This also makes it easier to teach get_remote_heads() to interpret other
>> lines in the ref advertisement, which will be done in a subsequent
>> patch.
>
> Sounds sensible.  This still replaces the earlier 1.5?

Well, it does, but it also invalidates how the new "pick the version
offered and used" feature is integrated into this callchain.  I guess
we'd need a new "we are now expecting the version info" state in a
patch to replace "connect: teach client to recognize v1 server
response".

>> +static int process_ref(int *state, int len, struct ref ***list,
>> +		       unsigned int flags, struct oid_array *extra_have)
>> +{
>> +	struct object_id old_oid;
>> +	char *name;
>> +	int name_len;
>> +
>> +	if (len < GIT_SHA1_HEXSZ + 2 ||
>> +	    get_oid_hex(packet_buffer, &old_oid) ||
>> +	    packet_buffer[GIT_SHA1_HEXSZ] != ' ') {
>> +		*state = EXPECTING_SHALLOW;
>> +		return 0;
>> +	}
>> +
>> +	name = packet_buffer + GIT_SHA1_HEXSZ + 1;
>> +	name_len = strlen(name);
>> +	if (*state == EXPECTING_REF_WITH_CAPABILITIES &&
>> +	    len != name_len + GIT_SHA1_HEXSZ + 1) {
>> +		free(server_capabilities);

Is this free() still needed?  After hitting this block, you'd set
*state to EXPECTING_REF before you return, so nobody would set
server_capabilities by hitting this block twice, and an attempt to
do so will hit the die("unexpected cap") below, no?

Or it may be a signal that this patch tightens it too much and
breaks older or third-party implementations of the other side that
can emit more than one ref with a capability advertisement?

>> +		server_capabilities = xstrdup(name + name_len + 1);
>> +	} else if (*state == EXPECTING_REF) {
>> +		if (len != name_len + GIT_SHA1_HEXSZ + 1)
>> +			die("unexpected capabilities after ref name");
>> +	}
>> +	...
>> +	}
>> +	*state = EXPECTING_REF;
>> +	return 1;
>> +}

>> @@ -123,76 +208,26 @@ struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
>>  	 * willing to talk to us.  A hang-up before seeing any
>>  	 * response does not necessarily mean an ACL problem, though.
>>  	 */
>> -	int saw_response;
>> -	int got_dummy_ref_with_capabilities_declaration = 0;
>> +	int responded = 0;
>> +	int len;
>> +	int state = EXPECTING_REF_WITH_CAPABILITIES;
>>  
>>  	*list = NULL;

>> +	while ((len = read_remote_ref(in, &src_buf, &src_len, &responded))) {
>> +		switch (state) {
>> +		case EXPECTING_REF_WITH_CAPABILITIES:
>> +		case EXPECTING_REF:
>> +			if (process_ref(&state, len, &list, flags, extra_have))
>> +				break;
>> +			/* fallthrough */

OK.  This fallthrough is because expecting-ref is really expecting
ref or shallow, and once we see a shallow, we no longer expect ref
and expect only shallow.  So from that point of view, the assignment
that sets state to EXPECTING_SHALLOW could happen here, not inside
process_ref.  I mention this because in general, passing state
around and letting it be updated in helper functions makes the
state transition harder to follow, not easier, even though
refactoring the processing needed in different stages into helper
functions, as this patch does, ought to make it easier to see by
shrinking the outer loop (i.e. this one) that controls the whole
process.

I think if we split process_ref() further into two, then we no
longer need to pass &state to that function?  We start this loop
with "expecting the dummy ref (or other)" state, have a new
process_dummy_ref() function check if we got "capabilities^{}" thing
and do its thing if that is the case (otherwise we fall through to
the call to process_ref(), just like the above code falls through to
call process_shallow() when it realizes what it got is not a ref),
and after the first call to process_dummy_ref() we'd be in the
"expecting ref (or other)" state---and the state transition can
happen in this caller, not in process_dummy_ref() or process_ref().

Inside process_dummy_ref() and process_ref(), there would be a call
to the same helper that notices and extracts the server capability
and stores it (or barfs against the second line that advertises the
capability, by noticing that server_capabilities is not NULL).

Wouldn't that make the presentation of the state machine cleaner?
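
For illustration only, the caller-driven state machine described above
could be sketched roughly as follows.  This is a standalone toy, not
the patch: looks_like_ref(), is_dummy_ref(), is_shallow() and
walk_advertisement() are hypothetical stand-ins for the real helpers,
the input is an array of C strings rather than pkt-lines, and the
capability extraction is left out entirely.

```c
#include <assert.h>
#include <string.h>

#define HEXSZ 40	/* length of a hex SHA-1, like GIT_SHA1_HEXSZ */

enum adv_state { EXPECTING_FIRST_REF, EXPECTING_REF, EXPECTING_SHALLOW };

/* Return 1 if line starts with HEXSZ lowercase hex digits and a space. */
static int looks_like_ref(const char *line)
{
	size_t i;

	for (i = 0; i < HEXSZ; i++)
		if (!line[i] || !strchr("0123456789abcdef", line[i]))
			return 0;
	return line[HEXSZ] == ' ';
}

/* Return 1 for the all-zero "capabilities^{}" placeholder line. */
static int is_dummy_ref(const char *line)
{
	return looks_like_ref(line) &&
	       strspn(line, "0") == HEXSZ &&
	       !strcmp(line + HEXSZ + 1, "capabilities^{}");
}

/* Return 1 if line is a "shallow <oid>" line (oid not validated here). */
static int is_shallow(const char *line)
{
	return !strncmp(line, "shallow ", strlen("shallow "));
}

/*
 * Walk the advertisement and count refs and shallows.  Every state
 * transition happens here in the caller; the helpers only report
 * whether they recognized the line.  Returns 0 on success, -1 if a
 * line shows up where it is not expected.
 */
static int walk_advertisement(const char **lines, size_t nr,
			      int *nr_refs, int *nr_shallows)
{
	enum adv_state state = EXPECTING_FIRST_REF;
	size_t i;

	assert(lines && nr_refs && nr_shallows);
	*nr_refs = *nr_shallows = 0;

	for (i = 0; i < nr; i++) {
		switch (state) {
		case EXPECTING_FIRST_REF:
			if (is_dummy_ref(lines[i])) {
				state = EXPECTING_SHALLOW;
				break;
			}
			state = EXPECTING_REF;
			/* fallthrough */
		case EXPECTING_REF:
			if (looks_like_ref(lines[i])) {
				(*nr_refs)++;
				break;
			}
			state = EXPECTING_SHALLOW;
			/* fallthrough */
		case EXPECTING_SHALLOW:
			if (is_shallow(lines[i])) {
				(*nr_shallows)++;
				break;
			}
			return -1;
		}
	}
	return 0;
}
```

With this shape a ref line that arrives after a shallow line is
rejected, because the state can only move forward and only the caller
moves it.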




* Re: [PATCH v3] connect: in ref advertisement, shallows are last
  2017-09-22  1:39                 ` Junio C Hamano
@ 2017-09-22 16:45                   ` Brandon Williams
  2017-09-22 20:15                     ` [PATCH v4] " Jonathan Tan
  0 siblings, 1 reply; 161+ messages in thread
From: Brandon Williams @ 2017-09-22 16:45 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jonathan Tan, git, jrnieder, peff

On 09/22, Junio C Hamano wrote:
> Junio C Hamano <gitster@pobox.com> writes:
> 
> > Jonathan Tan <jonathantanmy@google.com> writes:
> >
> >> Currently, get_remote_heads() parses the ref advertisement in one loop,
> >> allowing refs and shallow lines to intersperse, despite this not being
> >> allowed by the specification. Refactor get_remote_heads() to use two
> >> loops instead, enforcing that refs come first, and then shallows.
> >>
> >> This also makes it easier to teach get_remote_heads() to interpret other
> >> lines in the ref advertisement, which will be done in a subsequent
> >> patch.
> >
> > Sounds sensible.  This still replaces the earlier 1.5?
> 
> Well, it does, but it also invalidates how the new "pick the version
> offered and used" feature is integrated to this callchain.  I guess
> we'd need a new "we are now expecting the version info" state in a
> patch to replace "connect: teach client to recognize v1 server
> response".

Yeah, if we go with this patch, which is probably a better cleanup
than what I attempted, I would need to change how a client
recognizes a v1 server.  That could probably be done easily by adding a
new state.

I do think that once a v2 protocol rolls around we'll probably have to
do even more refactoring, because I don't think we'll want to keep all
the version checking logic in get_remote_heads() for different protocol
versions which may not be interested in a server's ref advertisement, but
that'll be for another time.

> 
> >> +static int process_ref(int *state, int len, struct ref ***list,
> >> +		       unsigned int flags, struct oid_array *extra_have)
> >> +{
> >> +	struct object_id old_oid;
> >> +	char *name;
> >> +	int name_len;
> >> +
> >> +	if (len < GIT_SHA1_HEXSZ + 2 ||
> >> +	    get_oid_hex(packet_buffer, &old_oid) ||
> >> +	    packet_buffer[GIT_SHA1_HEXSZ] != ' ') {
> >> +		*state = EXPECTING_SHALLOW;
> >> +		return 0;
> >> +	}
> >> +
> >> +	name = packet_buffer + GIT_SHA1_HEXSZ + 1;
> >> +	name_len = strlen(name);
> >> +	if (*state == EXPECTING_REF_WITH_CAPABILITIES &&
> >> +	    len != name_len + GIT_SHA1_HEXSZ + 1) {
> >> +		free(server_capabilities);
> 
> Is this free() still needed?  After hitting this block, you'd set
> *state to EXPECTING_REF before you return, so nobody would set
> server_capabilities by hitting this block twice, and an attempt to
> do so will hit the die("unexpected cap") below, no?
> 
> Or it may be a signal that this patch tightens it too much and
> breaks older or third-party implementations of the other side that
> can emit more than one refs with capability advertisement?
> 
> >> +		server_capabilities = xstrdup(name + name_len + 1);
> >> +	} else if (*state == EXPECTING_REF) {
> >> +		if (len != name_len + GIT_SHA1_HEXSZ + 1)
> >> +			die("unexpected capabilities after ref name");
> >> +	}
> >> +	...
> >> +	}
> >> +	*state = EXPECTING_REF;
> >> +	return 1;
> >> +}
> 
> >> @@ -123,76 +208,26 @@ struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
> >>  	 * willing to talk to us.  A hang-up before seeing any
> >>  	 * response does not necessarily mean an ACL problem, though.
> >>  	 */
> >> -	int saw_response;
> >> -	int got_dummy_ref_with_capabilities_declaration = 0;
> >> +	int responded = 0;
> >> +	int len;
> >> +	int state = EXPECTING_REF_WITH_CAPABILITIES;
> >>  
> >>  	*list = NULL;
> 
> >> +	while ((len = read_remote_ref(in, &src_buf, &src_len, &responded))) {
> >> +		switch (state) {
> >> +		case EXPECTING_REF_WITH_CAPABILITIES:
> >> +		case EXPECTING_REF:
> >> +			if (process_ref(&state, len, &list, flags, extra_have))
> >> +				break;
> >> +			/* fallthrough */
> 
> OK.  This fallthrough is because expecting-ref is really expecting
> ref or shallow and once we see a shallow, we no longer expect ref
> and expect only shallow.  So from that point of view, an assignment
> to set state to EXPECTING_SHALLOW could happen here, not inside
> process_ref.  I mention this because in general, passing state
> around and let it be updated in helper functions would make the
> state transition harder to follow, not easier, even though
> refactoring the processing needed in different stages into helper
> functions like this patch does ought to make it easier to see by
> shrinking the outer loop (i.e. this one) that controls the whole
> process.
> 
> I think if we split process_ref() further into two, then we no
> longer need to pass &state to that function?  We start this loop
> with "expecting the dummy ref (or other)" state, have a new
> process_dummy_ref() function check if we got "capabilities^{}" thing
> and do its thing if that is the case (otherwise we fall through to
> the call to process_ref(), just like the above code falls through to
> call process_shallow() when it realizes what it got is not a ref),
> and after the first call to process_dummy_ref() we'd be in the
> "expecting ref (or other)" state---and the state transition can
> happen in this caller, not in process_dummy_ref() or process_ref().
> 
> Inside process_dummy_ref() and process_ref(), there would be a call
> to the same helper that notices and extracts the server capability
> and stores it (or barfs against the second line that advertises the
> capability, by noticing that server_capabilities is not NULL).
> 
> Wouldn't that make the presentation of the state machine cleaner?

I mentioned this when looking at v2 of this patch: it would
probably be cleaner to stop passing the state variable around and
updating it inside a helper function.  The logic is simpler to
follow if 'state' is updated directly instead of indirectly.

-- 
Brandon Williams


* [PATCH v4] connect: in ref advertisement, shallows are last
  2017-09-22 16:45                   ` Brandon Williams
@ 2017-09-22 20:15                     ` Jonathan Tan
  2017-09-22 21:01                       ` Brandon Williams
  2017-09-24  0:52                       ` Junio C Hamano
  0 siblings, 2 replies; 161+ messages in thread
From: Jonathan Tan @ 2017-09-22 20:15 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, jrnieder, gitster, peff, bmwill

Currently, get_remote_heads() parses the ref advertisement in one loop,
allowing refs and shallow lines to intersperse, despite this not being
allowed by the specification. Refactor get_remote_heads() to use two
loops instead, enforcing that refs come first, and then shallows.

This also makes it easier to teach get_remote_heads() to interpret other
lines in the ref advertisement, which will be done in a subsequent
patch.

As part of this change, this patch interprets capabilities only on the
first line in the ref advertisement, ignoring all others.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
I've updated state transitions to occur in get_remote_heads() instead,
as suggested. I didn't want to do that previously because each step in
the state machine needed to communicate if (i) the line is "consumed"
and (ii) the state needed to be advanced, but with Junio's suggestion to
reorganize the methods, that is no longer true.

As Junio said, the free(server_capabilities) can be removed.

As for how capabilities on subsequent lines are handled, I think
it's better to ignore them - they are behind NULs, after all.

Yes, "connect: teach client to recognize v1 server response" will need
to be modified.

This change does have the side effect that if the server sends a ref
advertisement with "shallow"s only (and no refs), things will still
work, and the server can even tuck capabilities on the first "shallow"
line. I think that's fine, and it does make the client code cleaner.
---
 connect.c | 171 ++++++++++++++++++++++++++++++++++++++------------------------
 1 file changed, 105 insertions(+), 66 deletions(-)

diff --git a/connect.c b/connect.c
index 49b28b83b..978d01359 100644
--- a/connect.c
+++ b/connect.c
@@ -11,6 +11,7 @@
 #include "string-list.h"
 #include "sha1-array.h"
 #include "transport.h"
+#include "strbuf.h"
 
 static char *server_capabilities;
 static const char *parse_feature_value(const char *, const char *, int *);
@@ -107,6 +108,86 @@ static void annotate_refs_with_symref_info(struct ref *ref)
 	string_list_clear(&symref, 0);
 }
 
+/*
+ * Read one line of a server's ref advertisement into packet_buffer.
+ */
+static int read_remote_ref(int in, char **src_buf, size_t *src_len,
+			   int *responded)
+{
+	int len = packet_read(in, src_buf, src_len,
+			      packet_buffer, sizeof(packet_buffer),
+			      PACKET_READ_GENTLE_ON_EOF |
+			      PACKET_READ_CHOMP_NEWLINE);
+	const char *arg;
+	if (len < 0)
+		die_initial_contact(*responded);
+	if (len > 4 && skip_prefix(packet_buffer, "ERR ", &arg))
+		die("remote error: %s", arg);
+
+	*responded = 1;
+
+	return len;
+}
+
+#define EXPECTING_FIRST_REF 0
+#define EXPECTING_REF 1
+#define EXPECTING_SHALLOW 2
+
+static void process_capabilities(int len)
+{
+	int nul_location = strlen(packet_buffer);
+	if (nul_location == len)
+		return;
+	server_capabilities = xstrdup(packet_buffer + nul_location + 1);
+}
+
+static int process_dummy_ref(void)
+{
+	static char *template;
+	if (!template)
+		template = xstrfmt("%040d capabilities^{}", 0);
+	return !strcmp(packet_buffer, template);
+}
+
+static int process_ref(struct ref ***list, unsigned int flags,
+		       struct oid_array *extra_have)
+{
+	struct object_id old_oid;
+	const char *name;
+
+	if (parse_oid_hex(packet_buffer, &old_oid, &name))
+		return 0;
+	if (*name != ' ')
+		return 0;
+	name++;
+
+	if (extra_have && !strcmp(name, ".have")) {
+		oid_array_append(extra_have, &old_oid);
+	} else if (check_ref(name, flags)) {
+		struct ref *ref = alloc_ref(name);
+		oidcpy(&ref->old_oid, &old_oid);
+		**list = ref;
+		*list = &ref->next;
+	}
+	return 1;
+}
+
+static int process_shallow(struct oid_array *shallow_points)
+{
+	const char *arg;
+	struct object_id old_oid;
+
+	if (!skip_prefix(packet_buffer, "shallow ", &arg))
+		return 0;
+
+	if (get_oid_hex(arg, &old_oid))
+		die("protocol error: expected shallow sha-1, got '%s'", arg);
+	if (!shallow_points)
+		die("repository on the other end cannot be shallow");
+	oid_array_append(shallow_points, &old_oid);
+	return 1;
+}
+
 /*
  * Read all the refs from the other end
  */
@@ -123,76 +204,34 @@ struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
 	 * willing to talk to us.  A hang-up before seeing any
 	 * response does not necessarily mean an ACL problem, though.
 	 */
-	int saw_response;
-	int got_dummy_ref_with_capabilities_declaration = 0;
+	int responded = 0;
+	int len;
+	int state = EXPECTING_FIRST_REF;
 
 	*list = NULL;
-	for (saw_response = 0; ; saw_response = 1) {
-		struct ref *ref;
-		struct object_id old_oid;
-		char *name;
-		int len, name_len;
-		char *buffer = packet_buffer;
-		const char *arg;
-
-		len = packet_read(in, &src_buf, &src_len,
-				  packet_buffer, sizeof(packet_buffer),
-				  PACKET_READ_GENTLE_ON_EOF |
-				  PACKET_READ_CHOMP_NEWLINE);
-		if (len < 0)
-			die_initial_contact(saw_response);
-
-		if (!len)
-			break;
 
-		if (len > 4 && skip_prefix(buffer, "ERR ", &arg))
-			die("remote error: %s", arg);
-
-		if (len == GIT_SHA1_HEXSZ + strlen("shallow ") &&
-			skip_prefix(buffer, "shallow ", &arg)) {
-			if (get_oid_hex(arg, &old_oid))
-				die("protocol error: expected shallow sha-1, got '%s'", arg);
-			if (!shallow_points)
-				die("repository on the other end cannot be shallow");
-			oid_array_append(shallow_points, &old_oid);
-			continue;
-		}
-
-		if (len < GIT_SHA1_HEXSZ + 2 || get_oid_hex(buffer, &old_oid) ||
-			buffer[GIT_SHA1_HEXSZ] != ' ')
-			die("protocol error: expected sha/ref, got '%s'", buffer);
-		name = buffer + GIT_SHA1_HEXSZ + 1;
-
-		name_len = strlen(name);
-		if (len != name_len + GIT_SHA1_HEXSZ + 1) {
-			free(server_capabilities);
-			server_capabilities = xstrdup(name + name_len + 1);
-		}
-
-		if (extra_have && !strcmp(name, ".have")) {
-			oid_array_append(extra_have, &old_oid);
-			continue;
-		}
-
-		if (!strcmp(name, "capabilities^{}")) {
-			if (saw_response)
-				die("protocol error: unexpected capabilities^{}");
-			if (got_dummy_ref_with_capabilities_declaration)
-				die("protocol error: multiple capabilities^{}");
-			got_dummy_ref_with_capabilities_declaration = 1;
-			continue;
+	while ((len = read_remote_ref(in, &src_buf, &src_len, &responded))) {
+		switch (state) {
+		case EXPECTING_FIRST_REF:
+			process_capabilities(len);
+			if (process_dummy_ref()) {
+				state = EXPECTING_SHALLOW;
+				break;
+			}
+			state = EXPECTING_REF;
+			/* fallthrough */
+		case EXPECTING_REF:
+			if (process_ref(&list, flags, extra_have))
+				break;
+			state = EXPECTING_SHALLOW;
+			/* fallthrough */
+		case EXPECTING_SHALLOW:
+			if (process_shallow(shallow_points))
+				break;
+			die("protocol error: unexpected '%s'", packet_buffer);
+		default:
+			die("unexpected state %d", state);
 		}
-
-		if (!check_ref(name, flags))
-			continue;
-
-		if (got_dummy_ref_with_capabilities_declaration)
-			die("protocol error: unexpected ref after capabilities^{}");
-
-		ref = alloc_ref(buffer + GIT_SHA1_HEXSZ + 1);
-		oidcpy(&ref->old_oid, &old_oid);
-		*list = ref;
-		list = &ref->next;
 	}
 
 	annotate_refs_with_symref_info(*orig_list);
-- 
2.14.1.728.g20a5b67d5.dirty



* Re: [PATCH v4] connect: in ref advertisement, shallows are last
  2017-09-22 20:15                     ` [PATCH v4] " Jonathan Tan
@ 2017-09-22 21:01                       ` Brandon Williams
  2017-09-22 22:16                         ` Jonathan Tan
  2017-09-24  0:52                       ` Junio C Hamano
  1 sibling, 1 reply; 161+ messages in thread
From: Brandon Williams @ 2017-09-22 21:01 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, jrnieder, gitster, peff

On 09/22, Jonathan Tan wrote:
> Currently, get_remote_heads() parses the ref advertisement in one loop,
> allowing refs and shallow lines to intersperse, despite this not being
> allowed by the specification. Refactor get_remote_heads() to use two
> loops instead, enforcing that refs come first, and then shallows.
> 
> This also makes it easier to teach get_remote_heads() to interpret other
> lines in the ref advertisement, which will be done in a subsequent
> patch.
> 
> As part of this change, this patch interprets capabilities only on the
> first line in the ref advertisement, ignoring all others.
> 
> Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
> ---
> I've updated state transitions to occur in get_remote_heads() instead,
> as suggested. I didn't want to do that previously because each step in
> the state machine needed to communicate if (i) the line is "consumed"
> and (ii) the state needed to be advanced, but with Junio's suggestion to
> reorganize the methods, that is no longer true.
> 
> As Junio said, the free(server_capabilities) can be removed.
> 
> As for whether how capabilities on subsequent lines are handled, I think
> it's better to ignore them - they are behind NULs, after all.
> 
> Yes, "connect: teach client to recognize v1 server response" will need
> to be modified.
> 
> This change does have the side effect that if the server sends a ref
> advertisement with "shallow"s only (and no refs), things will still
> work, and the server can even tuck capabilities on the first "shallow"
> line. I think that's fine, and it does make the client code cleaner.
> ---
>  connect.c | 171 ++++++++++++++++++++++++++++++++++++++------------------------
>  1 file changed, 105 insertions(+), 66 deletions(-)
> 
> diff --git a/connect.c b/connect.c
> index 49b28b83b..978d01359 100644
> --- a/connect.c
> +++ b/connect.c
> @@ -11,6 +11,7 @@
>  #include "string-list.h"
>  #include "sha1-array.h"
>  #include "transport.h"
> +#include "strbuf.h"
>  
>  static char *server_capabilities;
>  static const char *parse_feature_value(const char *, const char *, int *);
> @@ -107,6 +108,86 @@ static void annotate_refs_with_symref_info(struct ref *ref)
>  	string_list_clear(&symref, 0);
>  }
>  
> +/*
> + * Read one line of a server's ref advertisement into packet_buffer.
> + */
> +static int read_remote_ref(int in, char **src_buf, size_t *src_len,
> +			   int *responded)
> +{
> +	int len = packet_read(in, src_buf, src_len,
> +			      packet_buffer, sizeof(packet_buffer),
> +			      PACKET_READ_GENTLE_ON_EOF |
> +			      PACKET_READ_CHOMP_NEWLINE);
> +	const char *arg;
> +	if (len < 0)
> +		die_initial_contact(*responded);
> +	if (len > 4 && skip_prefix(packet_buffer, "ERR ", &arg))
> +		die("remote error: %s", arg);
> +
> +	*responded = 1;
> +
> +	return len;
> +}
> +
> +#define EXPECTING_FIRST_REF 0
> +#define EXPECTING_REF 1
> +#define EXPECTING_SHALLOW 2
> +
> +static void process_capabilities(int len)
> +{
> +	int nul_location = strlen(packet_buffer);

It may make more sense to not rely on accessing a global buffer here
directly and instead pass in the buffer you're working on, much like you
are doing with len.
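
Something along these lines is what I have in mind.  This is only a
standalone sketch: extract_capabilities() is a made-up name, it
returns the copy instead of assigning to the server_capabilities
global, and it uses plain malloc() where the real code would use
xstrdup().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return a newly allocated copy of whatever follows the first NUL
 * inside buf[0..len), or NULL if the payload has no embedded NUL
 * (i.e. the line carries no capabilities).  The caller owns the
 * returned string.  buf is assumed to hold at least len bytes.
 */
static char *extract_capabilities(const char *buf, size_t len)
{
	const char *nul;
	size_t cap_len;
	char *copy;

	assert(buf);
	nul = memchr(buf, '\0', len);
	if (!nul)
		return NULL;

	nul++;	/* skip the NUL itself */
	cap_len = len - (size_t)(nul - buf);
	copy = malloc(cap_len + 1);
	if (!copy)
		abort();	/* stand-in for xstrdup()'s die-on-failure */
	memcpy(copy, nul, cap_len);
	copy[cap_len] = '\0';
	return copy;
}
```

Taking the buffer as a parameter also makes the helper easy to
exercise without a live connection.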

> +	if (nul_location == len)
> +		return;
> +	server_capabilities = xstrdup(packet_buffer + nul_location + 1);
> +}
> +
> +static int process_dummy_ref(void)
> +{
> +	static char *template;
> +	if (!template)
> +		template = xstrfmt("%040d capabilities^{}", 0);

I'm not the biggest fan of dynamically allocating this and then using it
to compare.  Maybe we can check to make sure that the oid matches the
null_oid and that the name matches the "capabilities^{}" string?  That
way you can avoid the allocation?
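
Roughly like this, as a standalone sketch.  hexval(), parse_hex_oid()
and is_dummy_ref() below are toy stand-ins that I wrote for
illustration; in connect.c the existing oid parsing and null-oid
helpers would do this work instead.

```c
#include <assert.h>
#include <string.h>

#define RAWSZ 20		/* raw SHA-1 size in bytes */
#define HEXSZ (2 * RAWSZ)	/* hex SHA-1 size in characters */

/* Value of one lowercase hex digit, or -1 (also for NUL). */
static int hexval(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	return -1;
}

/*
 * Parse HEXSZ hex digits from hex into out.  On success return 0 and
 * point *end just past the digits; on failure return -1.
 */
static int parse_hex_oid(const char *hex, unsigned char *out,
			 const char **end)
{
	int i;

	for (i = 0; i < RAWSZ; i++) {
		int hi = hexval(hex[2 * i]);
		int lo;

		if (hi < 0)
			return -1;
		lo = hexval(hex[2 * i + 1]);
		if (lo < 0)
			return -1;
		out[i] = (unsigned char)(hi << 4 | lo);
	}
	*end = hex + HEXSZ;
	return 0;
}

/*
 * Return 1 if line is "<null oid> capabilities^{}", without
 * building a template string to compare against.
 */
static int is_dummy_ref(const char *line)
{
	static const unsigned char null_oid[RAWSZ];	/* all zero */
	unsigned char oid[RAWSZ];
	const char *name;

	assert(line);
	if (parse_hex_oid(line, oid, &name))
		return 0;
	if (memcmp(oid, null_oid, RAWSZ))
		return 0;
	return !strcmp(name, " capabilities^{}");
}
```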

> +	return !strcmp(packet_buffer, template);
> +}
> +
> +static int process_ref(struct ref ***list, unsigned int flags,
> +		       struct oid_array *extra_have)

So from comparing this to the current code it doesn't look like there is
a check in 'process_ref' that ensures that a 'capabilities^{}' line
doesn't show up after a normal ref, or am I missing something?

> +{
> +	struct object_id old_oid;
> +	const char *name;
> +
> +	if (parse_oid_hex(packet_buffer, &old_oid, &name))
> +		return 0;
> +	if (*name != ' ')
> +		return 0;
> +	name++;
> +
> +	if (extra_have && !strcmp(name, ".have")) {
> +		oid_array_append(extra_have, &old_oid);
> +	} else if (check_ref(name, flags)) {
> +		struct ref *ref = alloc_ref(name);
> +		oidcpy(&ref->old_oid, &old_oid);
> +		**list = ref;
> +		*list = &ref->next;
> +	}
> +	return 1;
> +}
> +
> +static int process_shallow(struct oid_array *shallow_points)
> +{
> +	const char *arg;
> +	struct object_id old_oid;
> +
> +	if (!skip_prefix(packet_buffer, "shallow ", &arg))
> +		return 0;
> +
> +	if (get_oid_hex(arg, &old_oid))
> +		die("protocol error: expected shallow sha-1, got '%s'", arg);
> +	if (!shallow_points)
> +		die("repository on the other end cannot be shallow");
> +	oid_array_append(shallow_points, &old_oid);
> +	return 1;
> +}
> +
>  /*
>   * Read all the refs from the other end
>   */
> @@ -123,76 +204,34 @@ struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
>  	 * willing to talk to us.  A hang-up before seeing any
>  	 * response does not necessarily mean an ACL problem, though.
>  	 */
> -	int saw_response;
> -	int got_dummy_ref_with_capabilities_declaration = 0;
> +	int responded = 0;
> +	int len;
> +	int state = EXPECTING_FIRST_REF;
>  
>  	*list = NULL;
> -	for (saw_response = 0; ; saw_response = 1) {
> -		struct ref *ref;
> -		struct object_id old_oid;
> -		char *name;
> -		int len, name_len;
> -		char *buffer = packet_buffer;
> -		const char *arg;
> -
> -		len = packet_read(in, &src_buf, &src_len,
> -				  packet_buffer, sizeof(packet_buffer),
> -				  PACKET_READ_GENTLE_ON_EOF |
> -				  PACKET_READ_CHOMP_NEWLINE);
> -		if (len < 0)
> -			die_initial_contact(saw_response);
> -
> -		if (!len)
> -			break;
>  
> -		if (len > 4 && skip_prefix(buffer, "ERR ", &arg))
> -			die("remote error: %s", arg);
> -
> -		if (len == GIT_SHA1_HEXSZ + strlen("shallow ") &&
> -			skip_prefix(buffer, "shallow ", &arg)) {
> -			if (get_oid_hex(arg, &old_oid))
> -				die("protocol error: expected shallow sha-1, got '%s'", arg);
> -			if (!shallow_points)
> -				die("repository on the other end cannot be shallow");
> -			oid_array_append(shallow_points, &old_oid);
> -			continue;
> -		}
> -
> -		if (len < GIT_SHA1_HEXSZ + 2 || get_oid_hex(buffer, &old_oid) ||
> -			buffer[GIT_SHA1_HEXSZ] != ' ')
> -			die("protocol error: expected sha/ref, got '%s'", buffer);
> -		name = buffer + GIT_SHA1_HEXSZ + 1;
> -
> -		name_len = strlen(name);
> -		if (len != name_len + GIT_SHA1_HEXSZ + 1) {
> -			free(server_capabilities);
> -			server_capabilities = xstrdup(name + name_len + 1);
> -		}
> -
> -		if (extra_have && !strcmp(name, ".have")) {
> -			oid_array_append(extra_have, &old_oid);
> -			continue;
> -		}
> -
> -		if (!strcmp(name, "capabilities^{}")) {
> -			if (saw_response)
> -				die("protocol error: unexpected capabilities^{}");
> -			if (got_dummy_ref_with_capabilities_declaration)
> -				die("protocol error: multiple capabilities^{}");
> -			got_dummy_ref_with_capabilities_declaration = 1;
> -			continue;
> +	while ((len = read_remote_ref(in, &src_buf, &src_len, &responded))) {
> +		switch (state) {
> +		case EXPECTING_FIRST_REF:
> +			process_capabilities(len);
> +			if (process_dummy_ref()) {
> +				state = EXPECTING_SHALLOW;
> +				break;
> +			}
> +			state = EXPECTING_REF;
> +			/* fallthrough */
> +		case EXPECTING_REF:
> +			if (process_ref(&list, flags, extra_have))
> +				break;
> +			state = EXPECTING_SHALLOW;
> +			/* fallthrough */
> +		case EXPECTING_SHALLOW:
> +			if (process_shallow(shallow_points))
> +				break;
> +			die("protocol error: unexpected '%s'", packet_buffer);
> +		default:
> +			die("unexpected state %d", state);

Looks much cleaner, thanks!

>  		}
> -
> -		if (!check_ref(name, flags))
> -			continue;
> -
> -		if (got_dummy_ref_with_capabilities_declaration)
> -			die("protocol error: unexpected ref after capabilities^{}");
> -
> -		ref = alloc_ref(buffer + GIT_SHA1_HEXSZ + 1);
> -		oidcpy(&ref->old_oid, &old_oid);
> -		*list = ref;
> -		list = &ref->next;
>  	}
>  
>  	annotate_refs_with_symref_info(*orig_list);
> -- 
> 2.14.1.728.g20a5b67d5.dirty
> 

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 161+ messages in thread

* Re: [PATCH v4] connect: in ref advertisement, shallows are last
  2017-09-22 21:01                       ` Brandon Williams
@ 2017-09-22 22:16                         ` Jonathan Tan
  0 siblings, 0 replies; 161+ messages in thread
From: Jonathan Tan @ 2017-09-22 22:16 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, jrnieder, gitster, peff

On Fri, 22 Sep 2017 14:01:04 -0700
Brandon Williams <bmwill@google.com> wrote:

> > +static void process_capabilities(int len)
> > +{
> > +	int nul_location = strlen(packet_buffer);
> 
> It may make more sense to not rely on accessing a global buffer here
> directly and instead pass in the buff you're working on, much like your
> are doing with len.

I wanted to preserve the existing code's behavior of using the global
buffer, and it didn't make sense for me to alias it (like the existing
code does).

I pass len in because I need to read beyond NUL.
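
To illustrate the point about len: on the first advertisement line the
capabilities sit behind a NUL byte, so strlen() stops before the end of the
packet and only the packet length tells us more data follows. Below is a
minimal standalone sketch, not the patch itself. split_capabilities() is a
hypothetical name, it takes the buffer as a parameter instead of using the
global packet_buffer, and it uses plain malloc() rather than xstrdup().

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not part of the patch: returns a malloc'd copy of
 * the text after the first NUL in line[0..len), or NULL if the line has
 * no embedded NUL (or if allocation fails). On success *len is shrunk so
 * that it covers only the part before the NUL.
 */
static char *split_capabilities(const char *line, int *len)
{
	int nul_location = (int)strlen(line);
	size_t cap_len;
	char *caps;

	if (nul_location >= *len)
		return NULL; /* nothing behind a NUL, so no capabilities */

	cap_len = (size_t)(*len - nul_location - 1);
	caps = malloc(cap_len + 1);
	if (!caps)
		return NULL;
	memcpy(caps, line + nul_location + 1, cap_len);
	caps[cap_len] = '\0';
	*len = nul_location;
	return caps;
}
```

The caller owns the returned string and is expected to free() it.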

> I'm not the biggest fan of dynamically allocating this and then using it
> to compare.  Maybe we can check to make sure that the oid matches the
> null_oid and that the name matches the "capabilities^{}" string?  That
> way you can avoid the allocation?

The dynamic allocation happens only once per process, since it is
static. To check the oid matches null_oid, I would have to parse it
first, and that seemed unnecessary.

Ideally I would just check against "000...000 capabilities^{}", but
writing it in source code would be error-prone, I think.
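
For readers unfamiliar with the "%040d" trick: formatting the integer 0
with a zero-padded width of 40 yields forty '0' characters, which avoids
typing them out by hand. Here is a rough standalone sketch using snprintf()
in place of xstrfmt(). HEXSZ, DUMMY_NAME, format_dummy_ref() and
is_dummy_ref() are names made up for this illustration.

```c
#include <stdio.h>
#include <string.h>

#define HEXSZ 40 /* assumed hex length of a SHA-1, like GIT_SHA1_HEXSZ */
#define DUMMY_NAME " capabilities^{}"

/* Writes HEXSZ zeros followed by " capabilities^{}" into buf. */
static int format_dummy_ref(char *buf, size_t size)
{
	return snprintf(buf, size, "%0*d" DUMMY_NAME, HEXSZ, 0);
}

/* Returns 1 if line is exactly the all-zero dummy ref line. */
static int is_dummy_ref(const char *line)
{
	/* sizeof(DUMMY_NAME) already counts the terminating NUL */
	char expected[HEXSZ + sizeof(DUMMY_NAME)];

	format_dummy_ref(expected, sizeof(expected));
	return !strcmp(line, expected);
}
```

Unlike the static template in the patch, this sketch rebuilds the string on
every call, which trades the one-time allocation for a small stack buffer.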

> > +static int process_ref(struct ref ***list, unsigned int flags,
> > +		       struct oid_array *extra_have)
> 
> So from comparing this to the current code it doesn't look like there is
> a check in 'process_ref' that ensures that a 'capabilities^{}' line
> doesn't show up after a normal ref, or am I missing something?

Ah...yes, you're right. I'll fix this by adding a check in
process_ref().

This is getting more complicated than I thought, so I'll wait a while
for other comments before sending out an updated patch.

^ permalink raw reply	[flat|nested] 161+ messages in thread

* Re: [PATCH v4] connect: in ref advertisement, shallows are last
  2017-09-22 20:15                     ` [PATCH v4] " Jonathan Tan
  2017-09-22 21:01                       ` Brandon Williams
@ 2017-09-24  0:52                       ` Junio C Hamano
  1 sibling, 0 replies; 161+ messages in thread
From: Junio C Hamano @ 2017-09-24  0:52 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, jrnieder, peff, bmwill

Jonathan Tan <jonathantanmy@google.com> writes:

> Currently, get_remote_heads() parses the ref advertisement in one loop,
> allowing refs and shallow lines to intersperse, despite this not being
> allowed by the specification. Refactor get_remote_heads() to use two
> loops instead, enforcing that refs come first, and then shallows.
>
> This also makes it easier to teach get_remote_heads() to interpret other
> lines in the ref advertisement, which will be done in a subsequent
> patch.
>
> As part of this change, this patch interprets capabilities only on the
> first line in the ref advertisement, ignoring all others.
>
> Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
> ---
> I've updated state transitions to occur in get_remote_heads() instead,
> as suggested. I didn't want to do that previously because each step in
> the state machine needed to communicate if (i) the line is "consumed"
> and (ii) the state needed to be advanced, but with Junio's suggestion to
> reorganize the methods, that is no longer true.
>
> As Junio said, the free(server_capabilities) can be removed.
>
> As for whether how capabilities on subsequent lines are handled, I think
> it's better to ignore them - they are behind NULs, after all.

Not even a diagnosis?  When the other side clearly wants to tell us
something and we deliberately ignore it, I'd think we want to at least
know about it---that may lead us to notify the implementors of the
other side of a protocol violation, or rethink the design decision
(i.e. only the first one matters) ourselves.

> This change does have the side effect that if the server sends a ref
> advertisement with "shallow"s only (and no refs), things will still
> work, and the server can even tuck capabilities on the first "shallow"
> line. I think that's fine, and it does make the client code cleaner.

I am ambivalent on this aspect of the change.

The change makes the resulting state transition logic quite easy to
follow.  Very nicely done.
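
As a standalone illustration of the transitions in the loop quoted below,
here is a simplified sketch of the ordering rule only. It is not the patch:
check_order(), is_shallow() and is_dummy() are hypothetical names, object
ids are not parsed, and any line that does not start with "shallow " is
simply assumed to be a ref.

```c
#include <string.h>

enum adv_state { EXPECTING_FIRST_REF, EXPECTING_REF, EXPECTING_SHALLOW };

static int is_shallow(const char *line)
{
	return !strncmp(line, "shallow ", 8);
}

/* True if the text after the first space is "capabilities^{}". */
static int is_dummy(const char *line)
{
	const char *name = strchr(line, ' ');
	return name && !strcmp(name + 1, "capabilities^{}");
}

/* Returns 1 if the lines come in an accepted order, 0 otherwise. */
static int check_order(const char **lines, int n)
{
	enum adv_state state = EXPECTING_FIRST_REF;
	int i;

	for (i = 0; i < n; i++) {
		switch (state) {
		case EXPECTING_FIRST_REF:
			if (is_dummy(lines[i])) {
				state = EXPECTING_SHALLOW;
				break;
			}
			state = EXPECTING_REF;
			/* fallthrough */
		case EXPECTING_REF:
			if (!is_shallow(lines[i]))
				break; /* treated as a ref in this sketch */
			state = EXPECTING_SHALLOW;
			/* fallthrough */
		case EXPECTING_SHALLOW:
			if (is_shallow(lines[i]))
				break;
			return 0; /* a ref after a shallow line */
		}
	}
	return 1;
}
```

The fallthroughs let a single line be retried against the next state, so no
helper has to report both "consumed" and "advance" separately.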

> +	while ((len = read_remote_ref(in, &src_buf, &src_len, &responded))) {
> +		switch (state) {
> +		case EXPECTING_FIRST_REF:
> +			process_capabilities(len);
> +			if (process_dummy_ref()) {
> +				state = EXPECTING_SHALLOW;
> +				break;
> +			}
> +			state = EXPECTING_REF;
> +			/* fallthrough */
> +		case EXPECTING_REF:
> +			if (process_ref(&list, flags, extra_have))
> +				break;
> +			state = EXPECTING_SHALLOW;
> +			/* fallthrough */
> +		case EXPECTING_SHALLOW:
> +			if (process_shallow(shallow_points))
> +				break;
> +			die("protocol error: unexpected '%s'", packet_buffer);
> +		default:
> +			die("unexpected state %d", state);
>  		}
>  	}


^ permalink raw reply	[flat|nested] 161+ messages in thread

* [PATCH v5] connect: in ref advertisement, shallows are last
  2017-09-21 20:45       ` [PATCH] connect: in ref advertisement, shallows are last Jonathan Tan
  2017-09-21 23:45         ` [PATCH v2] " Jonathan Tan
@ 2017-09-26 18:21         ` Jonathan Tan
  2017-09-26 18:31           ` Brandon Williams
  1 sibling, 1 reply; 161+ messages in thread
From: Jonathan Tan @ 2017-09-26 18:21 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster, bmwill

Currently, get_remote_heads() parses the ref advertisement in one loop,
allowing refs and shallow lines to intersperse, despite this not being
allowed by the specification. Refactor get_remote_heads() to use two
loops instead, enforcing that refs come first, and then shallows.

This also makes it easier to teach get_remote_heads() to interpret other
lines in the ref advertisement, which will be done in a subsequent
patch.

As part of this change, this patch interprets capabilities only on the
first line in the ref advertisement, printing a warning message when
encountering capabilities on other lines.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
Changes in v5:
 - print warning when encountering capabilities on other lines instead
   of ignoring them (also updated commit message)
 - explicitly disallow refs of name "capabilities^{}" (except when it is
   the only ref)
---
 connect.c | 183 +++++++++++++++++++++++++++++++++++++++-----------------------
 1 file changed, 117 insertions(+), 66 deletions(-)

diff --git a/connect.c b/connect.c
index df56c0cbf..df65a3fc4 100644
--- a/connect.c
+++ b/connect.c
@@ -11,6 +11,7 @@
 #include "string-list.h"
 #include "sha1-array.h"
 #include "transport.h"
+#include "strbuf.h"
 
 static char *server_capabilities;
 static const char *parse_feature_value(const char *, const char *, int *);
@@ -107,6 +108,98 @@ static void annotate_refs_with_symref_info(struct ref *ref)
 	string_list_clear(&symref, 0);
 }
 
+/*
+ * Read one line of a server's ref advertisement into packet_buffer.
+ */
+static int read_remote_ref(int in, char **src_buf, size_t *src_len,
+			   int *responded)
+{
+	int len = packet_read(in, src_buf, src_len,
+			      packet_buffer, sizeof(packet_buffer),
+			      PACKET_READ_GENTLE_ON_EOF |
+			      PACKET_READ_CHOMP_NEWLINE);
+	const char *arg;
+	if (len < 0)
+		die_initial_contact(*responded);
+	if (len > 4 && skip_prefix(packet_buffer, "ERR ", &arg))
+		die("remote error: %s", arg);
+
+	*responded = 1;
+
+	return len;
+}
+
+#define EXPECTING_FIRST_REF 0
+#define EXPECTING_REF 1
+#define EXPECTING_SHALLOW 2
+
+static void process_capabilities(int *len)
+{
+	int nul_location = strlen(packet_buffer);
+	if (nul_location == *len)
+		return;
+	server_capabilities = xstrdup(packet_buffer + nul_location + 1);
+	*len = nul_location;
+}
+
+static int process_dummy_ref(void)
+{
+	static char *template;
+	if (!template)
+		template = xstrfmt("%040d capabilities^{}", 0);
+	return !strcmp(packet_buffer, template);
+}
+
+static void check_no_capabilities(int len)
+{
+	if (strlen(packet_buffer) != len)
+		warning("Ignoring capabilities after first line '%s'",
+			packet_buffer + strlen(packet_buffer));
+}
+
+static int process_ref(int len, struct ref ***list, unsigned int flags,
+		       struct oid_array *extra_have)
+{
+	struct object_id old_oid;
+	const char *name;
+
+	if (parse_oid_hex(packet_buffer, &old_oid, &name))
+		return 0;
+	if (*name != ' ')
+		return 0;
+	name++;
+
+	if (extra_have && !strcmp(name, ".have")) {
+		oid_array_append(extra_have, &old_oid);
+	} else if (!strcmp(name, "capabilities^{}")) {
+		die("protocol error: unexpected capabilities^{}");
+	} else if (check_ref(name, flags)) {
+		struct ref *ref = alloc_ref(name);
+		oidcpy(&ref->old_oid, &old_oid);
+		**list = ref;
+		*list = &ref->next;
+	}
+	check_no_capabilities(len);
+	return 1;
+}
+
+static int process_shallow(int len, struct oid_array *shallow_points)
+{
+	const char *arg;
+	struct object_id old_oid;
+
+	if (!skip_prefix(packet_buffer, "shallow ", &arg))
+		return 0;
+
+	if (get_oid_hex(arg, &old_oid))
+		die("protocol error: expected shallow sha-1, got '%s'", arg);
+	if (!shallow_points)
+		die("repository on the other end cannot be shallow");
+	oid_array_append(shallow_points, &old_oid);
+	check_no_capabilities(len);
+	return 1;
+}
+
 /*
  * Read all the refs from the other end
  */
@@ -123,76 +216,34 @@ struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
 	 * willing to talk to us.  A hang-up before seeing any
 	 * response does not necessarily mean an ACL problem, though.
 	 */
-	int saw_response;
-	int got_dummy_ref_with_capabilities_declaration = 0;
+	int responded = 0;
+	int len;
+	int state = EXPECTING_FIRST_REF;
 
 	*list = NULL;
-	for (saw_response = 0; ; saw_response = 1) {
-		struct ref *ref;
-		struct object_id old_oid;
-		char *name;
-		int len, name_len;
-		char *buffer = packet_buffer;
-		const char *arg;
-
-		len = packet_read(in, &src_buf, &src_len,
-				  packet_buffer, sizeof(packet_buffer),
-				  PACKET_READ_GENTLE_ON_EOF |
-				  PACKET_READ_CHOMP_NEWLINE);
-		if (len < 0)
-			die_initial_contact(saw_response);
-
-		if (!len)
-			break;
-
-		if (len > 4 && skip_prefix(buffer, "ERR ", &arg))
-			die("remote error: %s", arg);
-
-		if (len == GIT_SHA1_HEXSZ + strlen("shallow ") &&
-			skip_prefix(buffer, "shallow ", &arg)) {
-			if (get_oid_hex(arg, &old_oid))
-				die("protocol error: expected shallow sha-1, got '%s'", arg);
-			if (!shallow_points)
-				die("repository on the other end cannot be shallow");
-			oid_array_append(shallow_points, &old_oid);
-			continue;
-		}
-
-		if (len < GIT_SHA1_HEXSZ + 2 || get_oid_hex(buffer, &old_oid) ||
-			buffer[GIT_SHA1_HEXSZ] != ' ')
-			die("protocol error: expected sha/ref, got '%s'", buffer);
-		name = buffer + GIT_SHA1_HEXSZ + 1;
 
-		name_len = strlen(name);
-		if (len != name_len + GIT_SHA1_HEXSZ + 1) {
-			free(server_capabilities);
-			server_capabilities = xstrdup(name + name_len + 1);
-		}
-
-		if (extra_have && !strcmp(name, ".have")) {
-			oid_array_append(extra_have, &old_oid);
-			continue;
-		}
-
-		if (!strcmp(name, "capabilities^{}")) {
-			if (saw_response)
-				die("protocol error: unexpected capabilities^{}");
-			if (got_dummy_ref_with_capabilities_declaration)
-				die("protocol error: multiple capabilities^{}");
-			got_dummy_ref_with_capabilities_declaration = 1;
-			continue;
+	while ((len = read_remote_ref(in, &src_buf, &src_len, &responded))) {
+		switch (state) {
+		case EXPECTING_FIRST_REF:
+			process_capabilities(&len);
+			if (process_dummy_ref()) {
+				state = EXPECTING_SHALLOW;
+				break;
+			}
+			state = EXPECTING_REF;
+			/* fallthrough */
+		case EXPECTING_REF:
+			if (process_ref(len, &list, flags, extra_have))
+				break;
+			state = EXPECTING_SHALLOW;
+			/* fallthrough */
+		case EXPECTING_SHALLOW:
+			if (process_shallow(len, shallow_points))
+				break;
+			die("protocol error: unexpected '%s'", packet_buffer);
+		default:
+			die("unexpected state %d", state);
 		}
-
-		if (!check_ref(name, flags))
-			continue;
-
-		if (got_dummy_ref_with_capabilities_declaration)
-			die("protocol error: unexpected ref after capabilities^{}");
-
-		ref = alloc_ref(buffer + GIT_SHA1_HEXSZ + 1);
-		oidcpy(&ref->old_oid, &old_oid);
-		*list = ref;
-		list = &ref->next;
 	}
 
 	annotate_refs_with_symref_info(*orig_list);
-- 
2.14.1.821.g8fa685d3b7-goog


^ permalink raw reply related	[flat|nested] 161+ messages in thread

* Re: [PATCH v5] connect: in ref advertisement, shallows are last
  2017-09-26 18:21         ` [PATCH v5] " Jonathan Tan
@ 2017-09-26 18:31           ` Brandon Williams
  0 siblings, 0 replies; 161+ messages in thread
From: Brandon Williams @ 2017-09-26 18:31 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, gitster

On 09/26, Jonathan Tan wrote:
> Currently, get_remote_heads() parses the ref advertisement in one loop,
> allowing refs and shallow lines to intersperse, despite this not being
> allowed by the specification. Refactor get_remote_heads() to use two
> loops instead, enforcing that refs come first, and then shallows.
> 
> This also makes it easier to teach get_remote_heads() to interpret other
> lines in the ref advertisement, which will be done in a subsequent
> patch.
> 
> As part of this change, this patch interprets capabilities only on the
> first line in the ref advertisement, printing a warning message when
> encountering capabilities on other lines.
> 
> Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
> ---
> Changes in v5:
>  - print warning when encountering capabilities on other lines instead
>    of ignoring them (also updated commit message)
>  - explicitly disallow refs of name "capabilities^{}" (except when it is
>    the only ref)
> ---
>  connect.c | 183 +++++++++++++++++++++++++++++++++++++++-----------------------
>  1 file changed, 117 insertions(+), 66 deletions(-)
> 
> diff --git a/connect.c b/connect.c
> index df56c0cbf..df65a3fc4 100644
> --- a/connect.c
> +++ b/connect.c
> @@ -11,6 +11,7 @@
>  #include "string-list.h"
>  #include "sha1-array.h"
>  #include "transport.h"
> +#include "strbuf.h"
>  
>  static char *server_capabilities;
>  static const char *parse_feature_value(const char *, const char *, int *);
> @@ -107,6 +108,98 @@ static void annotate_refs_with_symref_info(struct ref *ref)
>  	string_list_clear(&symref, 0);
>  }
>  
> +/*
> + * Read one line of a server's ref advertisement into packet_buffer.
> + */
> +static int read_remote_ref(int in, char **src_buf, size_t *src_len,
> +			   int *responded)
> +{
> +	int len = packet_read(in, src_buf, src_len,
> +			      packet_buffer, sizeof(packet_buffer),
> +			      PACKET_READ_GENTLE_ON_EOF |
> +			      PACKET_READ_CHOMP_NEWLINE);
> +	const char *arg;
> +	if (len < 0)
> +		die_initial_contact(*responded);
> +	if (len > 4 && skip_prefix(packet_buffer, "ERR ", &arg))
> +		die("remote error: %s", arg);
> +
> +	*responded = 1;
> +
> +	return len;
> +}
> +
> +#define EXPECTING_FIRST_REF 0
> +#define EXPECTING_REF 1
> +#define EXPECTING_SHALLOW 2
> +
> +static void process_capabilities(int *len)
> +{
> +	int nul_location = strlen(packet_buffer);
> +	if (nul_location == *len)
> +		return;
> +	server_capabilities = xstrdup(packet_buffer + nul_location + 1);
> +	*len = nul_location;
> +}
> +
> +static int process_dummy_ref(void)
> +{
> +	static char *template;
> +	if (!template)
> +		template = xstrfmt("%040d capabilities^{}", 0);

My only complaint is still here: I don't like the notion of hardcoding
the 0's.  It's much more future-proof and less error-prone to call
parse_oid_hex and require that it matches null_oid.
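
Roughly what I have in mind, as a self-contained sketch rather than real
git code: parse_hex_run() below is a hypothetical stand-in for
parse_oid_hex() plus the comparison against null_oid. Unlike the real
function it accepts a hex run of any length, so the sketch does not depend
on the hash size at all.

```c
#include <ctype.h>
#include <string.h>

/*
 * Hypothetical stand-in for parse_oid_hex(): skips a run of hex digits,
 * records whether every digit was '0', and sets *end to the first
 * character after the run. Returns -1 if there were no hex digits.
 */
static int parse_hex_run(const char *s, int *all_zero, const char **end)
{
	const char *p = s;

	*all_zero = 1;
	while (isxdigit((unsigned char)*p)) {
		if (*p != '0')
			*all_zero = 0;
		p++;
	}
	if (p == s)
		return -1;
	*end = p;
	return 0;
}

/* Returns 1 for "<all-zero id> capabilities^{}", 0 for anything else. */
static int is_dummy_ref(const char *line)
{
	int all_zero;
	const char *name;

	if (parse_hex_run(line, &all_zero, &name))
		return 0;
	if (*name != ' ')
		return 0;
	name++;

	return all_zero && !strcmp(name, "capabilities^{}");
}
```

This avoids both the allocation and the literal string of zeros.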

> +	return !strcmp(packet_buffer, template);
> +}
> +
> +static void check_no_capabilities(int len)
> +{
> +	if (strlen(packet_buffer) != len)
> +		warning("Ignoring capabilities after first line '%s'",
> +			packet_buffer + strlen(packet_buffer));
> +}
> +
> +static int process_ref(int len, struct ref ***list, unsigned int flags,
> +		       struct oid_array *extra_have)
> +{
> +	struct object_id old_oid;
> +	const char *name;
> +
> +	if (parse_oid_hex(packet_buffer, &old_oid, &name))
> +		return 0;
> +	if (*name != ' ')
> +		return 0;
> +	name++;
> +
> +	if (extra_have && !strcmp(name, ".have")) {
> +		oid_array_append(extra_have, &old_oid);
> +	} else if (!strcmp(name, "capabilities^{}")) {
> +		die("protocol error: unexpected capabilities^{}");
> +	} else if (check_ref(name, flags)) {
> +		struct ref *ref = alloc_ref(name);
> +		oidcpy(&ref->old_oid, &old_oid);
> +		**list = ref;
> +		*list = &ref->next;
> +	}
> +	check_no_capabilities(len);
> +	return 1;
> +}
> +
> +static int process_shallow(int len, struct oid_array *shallow_points)
> +{
> +	const char *arg;
> +	struct object_id old_oid;
> +
> +	if (!skip_prefix(packet_buffer, "shallow ", &arg))
> +		return 0;
> +
> +	if (get_oid_hex(arg, &old_oid))
> +		die("protocol error: expected shallow sha-1, got '%s'", arg);
> +	if (!shallow_points)
> +		die("repository on the other end cannot be shallow");
> +	oid_array_append(shallow_points, &old_oid);
> +	check_no_capabilities(len);
> +	return 1;
> +}
> +
>  /*
>   * Read all the refs from the other end
>   */
> @@ -123,76 +216,34 @@ struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
>  	 * willing to talk to us.  A hang-up before seeing any
>  	 * response does not necessarily mean an ACL problem, though.
>  	 */
> -	int saw_response;
> -	int got_dummy_ref_with_capabilities_declaration = 0;
> +	int responded = 0;
> +	int len;
> +	int state = EXPECTING_FIRST_REF;
>  
>  	*list = NULL;
> -	for (saw_response = 0; ; saw_response = 1) {
> -		struct ref *ref;
> -		struct object_id old_oid;
> -		char *name;
> -		int len, name_len;
> -		char *buffer = packet_buffer;
> -		const char *arg;
> -
> -		len = packet_read(in, &src_buf, &src_len,
> -				  packet_buffer, sizeof(packet_buffer),
> -				  PACKET_READ_GENTLE_ON_EOF |
> -				  PACKET_READ_CHOMP_NEWLINE);
> -		if (len < 0)
> -			die_initial_contact(saw_response);
> -
> -		if (!len)
> -			break;
> -
> -		if (len > 4 && skip_prefix(buffer, "ERR ", &arg))
> -			die("remote error: %s", arg);
> -
> -		if (len == GIT_SHA1_HEXSZ + strlen("shallow ") &&
> -			skip_prefix(buffer, "shallow ", &arg)) {
> -			if (get_oid_hex(arg, &old_oid))
> -				die("protocol error: expected shallow sha-1, got '%s'", arg);
> -			if (!shallow_points)
> -				die("repository on the other end cannot be shallow");
> -			oid_array_append(shallow_points, &old_oid);
> -			continue;
> -		}
> -
> -		if (len < GIT_SHA1_HEXSZ + 2 || get_oid_hex(buffer, &old_oid) ||
> -			buffer[GIT_SHA1_HEXSZ] != ' ')
> -			die("protocol error: expected sha/ref, got '%s'", buffer);
> -		name = buffer + GIT_SHA1_HEXSZ + 1;
>  
> -		name_len = strlen(name);
> -		if (len != name_len + GIT_SHA1_HEXSZ + 1) {
> -			free(server_capabilities);
> -			server_capabilities = xstrdup(name + name_len + 1);
> -		}
> -
> -		if (extra_have && !strcmp(name, ".have")) {
> -			oid_array_append(extra_have, &old_oid);
> -			continue;
> -		}
> -
> -		if (!strcmp(name, "capabilities^{}")) {
> -			if (saw_response)
> -				die("protocol error: unexpected capabilities^{}");
> -			if (got_dummy_ref_with_capabilities_declaration)
> -				die("protocol error: multiple capabilities^{}");
> -			got_dummy_ref_with_capabilities_declaration = 1;
> -			continue;
> +	while ((len = read_remote_ref(in, &src_buf, &src_len, &responded))) {
> +		switch (state) {
> +		case EXPECTING_FIRST_REF:
> +			process_capabilities(&len);
> +			if (process_dummy_ref()) {
> +				state = EXPECTING_SHALLOW;
> +				break;
> +			}
> +			state = EXPECTING_REF;
> +			/* fallthrough */
> +		case EXPECTING_REF:
> +			if (process_ref(len, &list, flags, extra_have))
> +				break;
> +			state = EXPECTING_SHALLOW;
> +			/* fallthrough */
> +		case EXPECTING_SHALLOW:
> +			if (process_shallow(len, shallow_points))
> +				break;
> +			die("protocol error: unexpected '%s'", packet_buffer);
> +		default:
> +			die("unexpected state %d", state);
>  		}
> -
> -		if (!check_ref(name, flags))
> -			continue;
> -
> -		if (got_dummy_ref_with_capabilities_declaration)
> -			die("protocol error: unexpected ref after capabilities^{}");
> -
> -		ref = alloc_ref(buffer + GIT_SHA1_HEXSZ + 1);
> -		oidcpy(&ref->old_oid, &old_oid);
> -		*list = ref;
> -		list = &ref->next;
>  	}
>  
>  	annotate_refs_with_symref_info(*orig_list);
> -- 
> 2.14.1.821.g8fa685d3b7-goog
> 

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 161+ messages in thread

* [PATCH v2 0/9] protocol transition
  2017-09-13 21:54 [PATCH 0/8] protocol transition Brandon Williams
                   ` (8 preceding siblings ...)
  2017-09-20 18:48 ` [PATCH 1.5/8] connect: die when a capability line comes after a ref Brandon Williams
@ 2017-09-26 23:56 ` Brandon Williams
  2017-09-26 23:56   ` [PATCH v2 1/9] connect: in ref advertisement, shallows are last Brandon Williams
                     ` (9 more replies)
  9 siblings, 10 replies; 161+ messages in thread
From: Brandon Williams @ 2017-09-26 23:56 UTC (permalink / raw)
  To: git
  Cc: bturner, git, gitster, jonathantanmy, jrnieder, peff, sbeller,
	Brandon Williams

v2 of this series has the following changes:

 * Included Jonathan Tan's patch as [1/9] with a small tweak to not rely on
   hardcoding the size of sha1 into the capability line check.
 * Reworked some of the logic in daemon.c
 * Added the word 'Experimental' to protocol.version to indicate that users
   shouldn't rely on its semantics to hold until it has been thoroughly tested.

Brandon Williams (8):
  pkt-line: add packet_write function
  protocol: introduce protocol extention mechanisms
  daemon: recognize hidden request arguments
  upload-pack, receive-pack: introduce protocol version 1
  connect: teach client to recognize v1 server response
  connect: tell server that the client understands v1
  http: tell server that the client understands v1
  i5700: add interop test for protocol transition

Jonathan Tan (1):
  connect: in ref advertisement, shallows are last

 Documentation/config.txt               |  17 ++
 Documentation/git.txt                  |   5 +
 Makefile                               |   1 +
 builtin/receive-pack.c                 |  14 ++
 cache.h                                |   9 +
 connect.c                              | 248 ++++++++++++++++++++--------
 daemon.c                               |  68 +++++++-
 http.c                                 |  18 ++
 pkt-line.c                             |   6 +
 pkt-line.h                             |   1 +
 protocol.c                             |  72 ++++++++
 protocol.h                             |  14 ++
 t/interop/i5700-protocol-transition.sh |  68 ++++++++
 t/lib-httpd/apache.conf                |   7 +
 t/t5700-protocol-v1.sh                 | 292 +++++++++++++++++++++++++++++++++
 upload-pack.c                          |  17 +-
 16 files changed, 776 insertions(+), 81 deletions(-)
 create mode 100644 protocol.c
 create mode 100644 protocol.h
 create mode 100755 t/interop/i5700-protocol-transition.sh
 create mode 100755 t/t5700-protocol-v1.sh

-- 
2.14.1.992.g2c7b836f3a-goog


^ permalink raw reply	[flat|nested] 161+ messages in thread

* [PATCH v2 1/9] connect: in ref advertisement, shallows are last
  2017-09-26 23:56 ` [PATCH v2 0/9] protocol transition Brandon Williams
@ 2017-09-26 23:56   ` Brandon Williams
  2017-09-26 23:56   ` [PATCH v2 2/9] pkt-line: add packet_write function Brandon Williams
                     ` (8 subsequent siblings)
  9 siblings, 0 replies; 161+ messages in thread
From: Brandon Williams @ 2017-09-26 23:56 UTC (permalink / raw)
  To: git
  Cc: bturner, git, gitster, jonathantanmy, jrnieder, peff, sbeller,
	Brandon Williams

From: Jonathan Tan <jonathantanmy@google.com>

Currently, get_remote_heads() parses the ref advertisement in one loop,
allowing refs and shallow lines to intersperse, despite this not being
allowed by the specification. Refactor get_remote_heads() to use two
loops instead, enforcing that refs come first, and then shallows.

This also makes it easier to teach get_remote_heads() to interpret other
lines in the ref advertisement, which will be done in a subsequent
patch.

As part of this change, this patch interprets capabilities only on the
first line in the ref advertisement, printing a warning message when
encountering capabilities on other lines.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 connect.c | 189 ++++++++++++++++++++++++++++++++++++++++----------------------
 1 file changed, 123 insertions(+), 66 deletions(-)

diff --git a/connect.c b/connect.c
index df56c0cbf..8e2e276b6 100644
--- a/connect.c
+++ b/connect.c
@@ -11,6 +11,7 @@
 #include "string-list.h"
 #include "sha1-array.h"
 #include "transport.h"
+#include "strbuf.h"
 
 static char *server_capabilities;
 static const char *parse_feature_value(const char *, const char *, int *);
@@ -107,6 +108,104 @@ static void annotate_refs_with_symref_info(struct ref *ref)
 	string_list_clear(&symref, 0);
 }
 
+/*
+ * Read one line of a server's ref advertisement into packet_buffer.
+ */
+static int read_remote_ref(int in, char **src_buf, size_t *src_len,
+			   int *responded)
+{
+	int len = packet_read(in, src_buf, src_len,
+			      packet_buffer, sizeof(packet_buffer),
+			      PACKET_READ_GENTLE_ON_EOF |
+			      PACKET_READ_CHOMP_NEWLINE);
+	const char *arg;
+	if (len < 0)
+		die_initial_contact(*responded);
+	if (len > 4 && skip_prefix(packet_buffer, "ERR ", &arg))
+		die("remote error: %s", arg);
+
+	*responded = 1;
+
+	return len;
+}
+
+#define EXPECTING_FIRST_REF 0
+#define EXPECTING_REF 1
+#define EXPECTING_SHALLOW 2
+
+static void process_capabilities(int *len)
+{
+	int nul_location = strlen(packet_buffer);
+	if (nul_location == *len)
+		return;
+	server_capabilities = xstrdup(packet_buffer + nul_location + 1);
+	*len = nul_location;
+}
+
+static int process_dummy_ref(void)
+{
+	struct object_id oid;
+	const char *name;
+
+	if (parse_oid_hex(packet_buffer, &oid, &name))
+		return 0;
+	if (*name != ' ')
+		return 0;
+	name++;
+
+	return !oidcmp(&null_oid, &oid) && !strcmp(name, "capabilities^{}");
+}
+
+static void check_no_capabilities(int len)
+{
+	if (strlen(packet_buffer) != len)
+		warning("Ignoring capabilities after first line '%s'",
+			packet_buffer + strlen(packet_buffer));
+}
+
+static int process_ref(int len, struct ref ***list, unsigned int flags,
+		       struct oid_array *extra_have)
+{
+	struct object_id old_oid;
+	const char *name;
+
+	if (parse_oid_hex(packet_buffer, &old_oid, &name))
+		return 0;
+	if (*name != ' ')
+		return 0;
+	name++;
+
+	if (extra_have && !strcmp(name, ".have")) {
+		oid_array_append(extra_have, &old_oid);
+	} else if (!strcmp(name, "capabilities^{}")) {
+		die("protocol error: unexpected capabilities^{}");
+	} else if (check_ref(name, flags)) {
+		struct ref *ref = alloc_ref(name);
+		oidcpy(&ref->old_oid, &old_oid);
+		**list = ref;
+		*list = &ref->next;
+	}
+	check_no_capabilities(len);
+	return 1;
+}
+
+static int process_shallow(int len, struct oid_array *shallow_points)
+{
+	const char *arg;
+	struct object_id old_oid;
+
+	if (!skip_prefix(packet_buffer, "shallow ", &arg))
+		return 0;
+
+	if (get_oid_hex(arg, &old_oid))
+		die("protocol error: expected shallow sha-1, got '%s'", arg);
+	if (!shallow_points)
+		die("repository on the other end cannot be shallow");
+	oid_array_append(shallow_points, &old_oid);
+	check_no_capabilities(len);
+	return 1;
+}
+
 /*
  * Read all the refs from the other end
  */
@@ -123,76 +222,34 @@ struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
 	 * willing to talk to us.  A hang-up before seeing any
 	 * response does not necessarily mean an ACL problem, though.
 	 */
-	int saw_response;
-	int got_dummy_ref_with_capabilities_declaration = 0;
+	int responded = 0;
+	int len;
+	int state = EXPECTING_FIRST_REF;
 
 	*list = NULL;
-	for (saw_response = 0; ; saw_response = 1) {
-		struct ref *ref;
-		struct object_id old_oid;
-		char *name;
-		int len, name_len;
-		char *buffer = packet_buffer;
-		const char *arg;
-
-		len = packet_read(in, &src_buf, &src_len,
-				  packet_buffer, sizeof(packet_buffer),
-				  PACKET_READ_GENTLE_ON_EOF |
-				  PACKET_READ_CHOMP_NEWLINE);
-		if (len < 0)
-			die_initial_contact(saw_response);
-
-		if (!len)
-			break;
-
-		if (len > 4 && skip_prefix(buffer, "ERR ", &arg))
-			die("remote error: %s", arg);
-
-		if (len == GIT_SHA1_HEXSZ + strlen("shallow ") &&
-			skip_prefix(buffer, "shallow ", &arg)) {
-			if (get_oid_hex(arg, &old_oid))
-				die("protocol error: expected shallow sha-1, got '%s'", arg);
-			if (!shallow_points)
-				die("repository on the other end cannot be shallow");
-			oid_array_append(shallow_points, &old_oid);
-			continue;
-		}
 
-		if (len < GIT_SHA1_HEXSZ + 2 || get_oid_hex(buffer, &old_oid) ||
-			buffer[GIT_SHA1_HEXSZ] != ' ')
-			die("protocol error: expected sha/ref, got '%s'", buffer);
-		name = buffer + GIT_SHA1_HEXSZ + 1;
-
-		name_len = strlen(name);
-		if (len != name_len + GIT_SHA1_HEXSZ + 1) {
-			free(server_capabilities);
-			server_capabilities = xstrdup(name + name_len + 1);
-		}
-
-		if (extra_have && !strcmp(name, ".have")) {
-			oid_array_append(extra_have, &old_oid);
-			continue;
-		}
-
-		if (!strcmp(name, "capabilities^{}")) {
-			if (saw_response)
-				die("protocol error: unexpected capabilities^{}");
-			if (got_dummy_ref_with_capabilities_declaration)
-				die("protocol error: multiple capabilities^{}");
-			got_dummy_ref_with_capabilities_declaration = 1;
-			continue;
+	while ((len = read_remote_ref(in, &src_buf, &src_len, &responded))) {
+		switch (state) {
+		case EXPECTING_FIRST_REF:
+			process_capabilities(&len);
+			if (process_dummy_ref()) {
+				state = EXPECTING_SHALLOW;
+				break;
+			}
+			state = EXPECTING_REF;
+			/* fallthrough */
+		case EXPECTING_REF:
+			if (process_ref(len, &list, flags, extra_have))
+				break;
+			state = EXPECTING_SHALLOW;
+			/* fallthrough */
+		case EXPECTING_SHALLOW:
+			if (process_shallow(len, shallow_points))
+				break;
+			die("protocol error: unexpected '%s'", packet_buffer);
+		default:
+			die("unexpected state %d", state);
 		}
-
-		if (!check_ref(name, flags))
-			continue;
-
-		if (got_dummy_ref_with_capabilities_declaration)
-			die("protocol error: unexpected ref after capabilities^{}");
-
-		ref = alloc_ref(buffer + GIT_SHA1_HEXSZ + 1);
-		oidcpy(&ref->old_oid, &old_oid);
-		*list = ref;
-		list = &ref->next;
 	}
 
 	annotate_refs_with_symref_info(*orig_list);
-- 
2.14.1.992.g2c7b836f3a-goog



* [PATCH v2 2/9] pkt-line: add packet_write function
  2017-09-26 23:56 ` [PATCH v2 0/9] protocol transition Brandon Williams
  2017-09-26 23:56   ` [PATCH v2 1/9] connect: in ref advertisement, shallows are last Brandon Williams
@ 2017-09-26 23:56   ` Brandon Williams
  2017-09-26 23:56   ` [PATCH v2 3/9] protocol: introduce protocol extention mechanisms Brandon Williams
                     ` (7 subsequent siblings)
  9 siblings, 0 replies; 161+ messages in thread
From: Brandon Williams @ 2017-09-26 23:56 UTC (permalink / raw)
  To: git
  Cc: bturner, git, gitster, jonathantanmy, jrnieder, peff, sbeller,
	Brandon Williams

Add a function which can be used to write the contents of an arbitrary
buffer.  This makes it easy to build up data in a buffer before writing
the packet instead of formatting the entire contents of the packet using
'packet_write_fmt()'.
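
For readers unfamiliar with pkt-line framing, here is a minimal standalone
sketch of what writing one packet from a raw buffer involves.  It is not the
code from this patch: 'frame_pkt_line()' is a hypothetical helper, and the
size bound is simply what four hex digits can express.  The point is that
the length is passed explicitly, so the payload may contain NUL bytes, which
a printf-style formatter handles awkwardly.

```c
#include <stdio.h>
#include <string.h>

/*
 * Illustrative only (hypothetical helper, not part of git).
 * Frame `size` bytes of `payload` as a single pkt-line in `out`:
 * a 4-digit hex length that counts itself, followed by the payload.
 * Returns the number of bytes written, or -1 if it does not fit.
 */
static int frame_pkt_line(char *out, size_t out_size,
			  const char *payload, size_t size)
{
	size_t total = size + 4;
	char header[5];

	/* assumption: bound chosen as the largest 4-hex-digit value */
	if (total > 0xffff || total > out_size)
		return -1;

	snprintf(header, sizeof(header), "%04x", (unsigned int)total);
	memcpy(out, header, 4);
	/* memcpy, not strcpy: the payload may contain embedded NULs */
	memcpy(out + 4, payload, size);
	return (int)total;
}
```

A caller would build the payload in a buffer first and then frame it in one
step, passing the buffer length rather than relying on NUL termination.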

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 pkt-line.c | 6 ++++++
 pkt-line.h | 1 +
 2 files changed, 7 insertions(+)

diff --git a/pkt-line.c b/pkt-line.c
index 647bbd3bc..c025d0332 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -188,6 +188,12 @@ static int packet_write_gently(const int fd_out, const char *buf, size_t size)
 	return 0;
 }
 
+void packet_write(const int fd_out, const char *buf, size_t size)
+{
+	if (packet_write_gently(fd_out, buf, size))
+		die_errno("packet write failed");
+}
+
 void packet_buf_write(struct strbuf *buf, const char *fmt, ...)
 {
 	va_list args;
diff --git a/pkt-line.h b/pkt-line.h
index 66ef610fc..d9e9783b1 100644
--- a/pkt-line.h
+++ b/pkt-line.h
@@ -22,6 +22,7 @@
 void packet_flush(int fd);
 void packet_write_fmt(int fd, const char *fmt, ...) __attribute__((format (printf, 2, 3)));
 void packet_buf_flush(struct strbuf *buf);
+void packet_write(const int fd_out, const char *buf, size_t size);
 void packet_buf_write(struct strbuf *buf, const char *fmt, ...) __attribute__((format (printf, 2, 3)));
 int packet_flush_gently(int fd);
 int packet_write_fmt_gently(int fd, const char *fmt, ...) __attribute__((format (printf, 2, 3)));
-- 
2.14.1.992.g2c7b836f3a-goog



* [PATCH v2 3/9] protocol: introduce protocol extention mechanisms
  2017-09-26 23:56 ` [PATCH v2 0/9] protocol transition Brandon Williams
  2017-09-26 23:56   ` [PATCH v2 1/9] connect: in ref advertisement, shallows are last Brandon Williams
  2017-09-26 23:56   ` [PATCH v2 2/9] pkt-line: add packet_write function Brandon Williams
@ 2017-09-26 23:56   ` Brandon Williams
  2017-09-27  5:17     ` Junio C Hamano
  2017-09-27  6:30     ` Stefan Beller
  2017-09-26 23:56   ` [PATCH v2 4/9] daemon: recognize hidden request arguments Brandon Williams
                     ` (6 subsequent siblings)
  9 siblings, 2 replies; 161+ messages in thread
From: Brandon Williams @ 2017-09-26 23:56 UTC (permalink / raw)
  To: git
  Cc: bturner, git, gitster, jonathantanmy, jrnieder, peff, sbeller,
	Brandon Williams

Create protocol.{c,h} and provide functions which future servers and
clients can use to determine which protocol version to use or which
version is being used.

Also introduce the 'GIT_PROTOCOL' environment variable which will be
used to communicate a colon separated list of keys with optional values
to a server.  Unknown keys and values must be tolerated.  This mechanism
is used to communicate which version of the wire protocol a client would
like to use with a server.
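
To make the 'key[=value]' format concrete, here is a small standalone sketch
of how a server might pick the highest known version out of such a list
while ignoring everything it does not recognize.  It uses only the C
standard library and is not the implementation in protocol.c below; the
helper names are hypothetical.

```c
#include <string.h>

/* Hypothetical helper: map a value of length `len` to a known version. */
static int parse_known_version(const char *value, size_t len)
{
	if (len == 1 && value[0] == '0')
		return 0;
	if (len == 1 && value[0] == '1')
		return 1;
	return -1;	/* unknown values are tolerated, not fatal */
}

/*
 * Hypothetical helper: return the highest known "version=N" found in a
 * colon-separated list such as "foo:version=1:bar=baz", or 0 if none.
 */
static int highest_requested_version(const char *git_protocol)
{
	const char *p = git_protocol;
	int best = 0;

	while (p && *p) {
		size_t len = strcspn(p, ":");	/* length of this entry */

		if (len >= 8 && !strncmp(p, "version=", 8)) {
			int v = parse_known_version(p + 8, len - 8);
			if (v > best)
				best = v;
		}
		p += len;
		if (*p == ':')
			p++;
	}
	return best;
}
```

Falling back to 0 when nothing usable is found mirrors the intent that an
unset or unrecognized request results in the original protocol.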

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 Documentation/config.txt | 17 ++++++++++++
 Documentation/git.txt    |  5 ++++
 Makefile                 |  1 +
 cache.h                  |  7 +++++
 protocol.c               | 72 ++++++++++++++++++++++++++++++++++++++++++++++++
 protocol.h               | 14 ++++++++++
 6 files changed, 116 insertions(+)
 create mode 100644 protocol.c
 create mode 100644 protocol.h

diff --git a/Documentation/config.txt b/Documentation/config.txt
index dc4e3f58a..b78747abc 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2517,6 +2517,23 @@ The protocol names currently used by git are:
     `hg` to allow the `git-remote-hg` helper)
 --
 
+protocol.version::
+	Experimental. If set, clients will attempt to communicate with a
+	server using the specified protocol version.  If unset, no
+	attempt will be made by the client to communicate using a
+	particular protocol version; this results in protocol version 0
+	being used.
+	Supported versions:
++
+--
+
+* `0` - the original wire protocol.
+
+* `1` - the original wire protocol with the addition of a version string
+  in the initial response from the server.
+
+--
+
 pull.ff::
 	By default, Git does not create an extra merge commit when merging
 	a commit that is a descendant of the current commit. Instead, the
diff --git a/Documentation/git.txt b/Documentation/git.txt
index 6e3a6767e..299f75c7b 100644
--- a/Documentation/git.txt
+++ b/Documentation/git.txt
@@ -697,6 +697,11 @@ of clones and fetches.
 	which feed potentially-untrusted URLS to git commands.  See
 	linkgit:git-config[1] for more details.
 
+`GIT_PROTOCOL`::
+	For internal use only.  Used in handshaking the wire protocol.
+	Contains a colon ':' separated list of keys with optional values
+	'key[=value]'.  Presence of unknown keys must be tolerated.
+
 Discussion[[Discussion]]
 ------------------------
 
diff --git a/Makefile b/Makefile
index ed4ca438b..9ce68cded 100644
--- a/Makefile
+++ b/Makefile
@@ -842,6 +842,7 @@ LIB_OBJS += pretty.o
 LIB_OBJS += prio-queue.o
 LIB_OBJS += progress.o
 LIB_OBJS += prompt.o
+LIB_OBJS += protocol.o
 LIB_OBJS += quote.o
 LIB_OBJS += reachable.o
 LIB_OBJS += read-cache.o
diff --git a/cache.h b/cache.h
index 49b083ee0..0c792545f 100644
--- a/cache.h
+++ b/cache.h
@@ -445,6 +445,13 @@ static inline enum object_type object_type(unsigned int mode)
 #define GIT_ICASE_PATHSPECS_ENVIRONMENT "GIT_ICASE_PATHSPECS"
 #define GIT_QUARANTINE_ENVIRONMENT "GIT_QUARANTINE_PATH"
 
+/*
+ * Environment variable used in handshaking the wire protocol.
+ * Contains a colon ':' separated list of keys with optional values
+ * 'key[=value]'.  Presence of unknown keys must be tolerated.
+ */
+#define GIT_PROTOCOL_ENVIRONMENT "GIT_PROTOCOL"
+
 /*
  * This environment variable is expected to contain a boolean indicating
  * whether we should or should not treat:
diff --git a/protocol.c b/protocol.c
new file mode 100644
index 000000000..369503065
--- /dev/null
+++ b/protocol.c
@@ -0,0 +1,72 @@
+#include "cache.h"
+#include "config.h"
+#include "protocol.h"
+
+static enum protocol_version parse_protocol_version(const char *value)
+{
+	if (!strcmp(value, "0"))
+		return protocol_v0;
+	else if (!strcmp(value, "1"))
+		return protocol_v1;
+	else
+		return protocol_unknown_version;
+}
+
+enum protocol_version get_protocol_version_config(void)
+{
+	const char *value;
+	if (!git_config_get_string_const("protocol.version", &value)) {
+		enum protocol_version version = parse_protocol_version(value);
+
+		if (version == protocol_unknown_version)
+			die("unknown value for config 'protocol.version': %s",
+			    value);
+
+		return version;
+	}
+
+	return protocol_v0;
+}
+
+enum protocol_version determine_protocol_version_server(void)
+{
+	const char *git_protocol = getenv(GIT_PROTOCOL_ENVIRONMENT);
+	enum protocol_version version = protocol_v0;
+
+	if (git_protocol) {
+		struct string_list list = STRING_LIST_INIT_DUP;
+		const struct string_list_item *item;
+		string_list_split(&list, git_protocol, ':', -1);
+
+		for_each_string_list_item(item, &list) {
+			const char *value;
+			enum protocol_version v;
+
+			if (skip_prefix(item->string, "version=", &value)) {
+				v = parse_protocol_version(value);
+				if (v > version)
+					version = v;
+			}
+		}
+
+		string_list_clear(&list, 0);
+	}
+
+	return version;
+}
+
+enum protocol_version determine_protocol_version_client(const char *server_response)
+{
+	enum protocol_version version = protocol_v0;
+
+	if (skip_prefix(server_response, "version ", &server_response)) {
+		version = parse_protocol_version(server_response);
+
+		if (version == protocol_unknown_version)
+			die("server is speaking an unknown protocol");
+		if (version == protocol_v0)
+			die("protocol error: server explicitly said version 0");
+	}
+
+	return version;
+}
diff --git a/protocol.h b/protocol.h
new file mode 100644
index 000000000..18f9a5235
--- /dev/null
+++ b/protocol.h
@@ -0,0 +1,14 @@
+#ifndef PROTOCOL_H
+#define PROTOCOL_H
+
+enum protocol_version {
+	protocol_unknown_version = -1,
+	protocol_v0 = 0,
+	protocol_v1 = 1,
+};
+
+extern enum protocol_version get_protocol_version_config(void);
+extern enum protocol_version determine_protocol_version_server(void);
+extern enum protocol_version determine_protocol_version_client(const char *server_response);
+
+#endif /* PROTOCOL_H */
-- 
2.14.1.992.g2c7b836f3a-goog



* [PATCH v2 4/9] daemon: recognize hidden request arguments
  2017-09-26 23:56 ` [PATCH v2 0/9] protocol transition Brandon Williams
                     ` (2 preceding siblings ...)
  2017-09-26 23:56   ` [PATCH v2 3/9] protocol: introduce protocol extention mechanisms Brandon Williams
@ 2017-09-26 23:56   ` Brandon Williams
  2017-09-27  5:20     ` Junio C Hamano
  2017-09-26 23:56   ` [PATCH v2 5/9] upload-pack, receive-pack: introduce protocol version 1 Brandon Williams
                     ` (5 subsequent siblings)
  9 siblings, 1 reply; 161+ messages in thread
From: Brandon Williams @ 2017-09-26 23:56 UTC (permalink / raw)
  To: git
  Cc: bturner, git, gitster, jonathantanmy, jrnieder, peff, sbeller,
	Brandon Williams

A normal request to git-daemon is structured as
"command path/to/repo\0host=..\0" and due to a bug in an old version of
git-daemon 73bb33a94 (daemon: Strictly parse the "extra arg" part of the
command, 2009-06-04) we aren't able to place any extra args (separated
by NULs) besides the host.

In order to get around this limitation teach git-daemon to recognize
additional request arguments hidden behind a second NUL byte.  Requests
can then be structured like:
"command path/to/repo\0host=..\0\0version=1\0key=value\0".  git-daemon
can then parse out the extra arguments and set 'GIT_PROTOCOL'
accordingly.
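
As an illustration of the request layout described above, the following
standalone sketch walks a request buffer and joins the arguments found
after the second NUL with ':'.  It is not the daemon.c code in this patch:
'collect_hidden_args()' is a hypothetical helper, and it assumes the first
two NUL-terminated fields are the "command path" and "host=.." parts.

```c
#include <string.h>

/*
 * Hypothetical helper (not part of git).  `req` holds `req_len` bytes of
 * the form "command path\0host=..\0\0key=value\0...".  Non-empty fields
 * after the first two are joined with ':' into `out`.
 * Returns the length of the string written to `out`.
 */
static size_t collect_hidden_args(const char *req, size_t req_len,
				  char *out, size_t out_size)
{
	const char *p = req, *end = req + req_len;
	size_t used = 0;
	int field = 0;

	if (!out_size)
		return 0;
	out[0] = '\0';

	while (p < end) {
		const char *nul = memchr(p, '\0', (size_t)(end - p));
		size_t len = nul ? (size_t)(nul - p) : (size_t)(end - p);

		/* field 0 is "command path", field 1 is "host=.." */
		if (field >= 2 && len > 0) {
			size_t need = len + (used ? 1 : 0);

			if (used + need >= out_size)
				break;	/* no room left, stop collecting */
			if (used)
				out[used++] = ':';
			memcpy(out + used, p, len);
			used += len;
			out[used] = '\0';
		}
		field++;
		if (!nul)
			break;	/* last field was not NUL-terminated */
		p = nul + 1;
	}
	return used;
}
```

For the example request in the text, the collected string would be
"version=1:key=value", which is the shape of value the daemon would place
in 'GIT_PROTOCOL'.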

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 daemon.c | 68 +++++++++++++++++++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 59 insertions(+), 9 deletions(-)

diff --git a/daemon.c b/daemon.c
index 30747075f..36cc794c9 100644
--- a/daemon.c
+++ b/daemon.c
@@ -282,7 +282,7 @@ static const char *path_ok(const char *directory, struct hostinfo *hi)
 	return NULL;		/* Fallthrough. Deny by default */
 }
 
-typedef int (*daemon_service_fn)(void);
+typedef int (*daemon_service_fn)(const struct argv_array *env);
 struct daemon_service {
 	const char *name;
 	const char *config_name;
@@ -363,7 +363,7 @@ static int run_access_hook(struct daemon_service *service, const char *dir,
 }
 
 static int run_service(const char *dir, struct daemon_service *service,
-		       struct hostinfo *hi)
+		       struct hostinfo *hi, const struct argv_array *env)
 {
 	const char *path;
 	int enabled = service->enabled;
@@ -422,7 +422,7 @@ static int run_service(const char *dir, struct daemon_service *service,
 	 */
 	signal(SIGTERM, SIG_IGN);
 
-	return service->fn();
+	return service->fn(env);
 }
 
 static void copy_to_log(int fd)
@@ -462,25 +462,34 @@ static int run_service_command(struct child_process *cld)
 	return finish_command(cld);
 }
 
-static int upload_pack(void)
+static int upload_pack(const struct argv_array *env)
 {
 	struct child_process cld = CHILD_PROCESS_INIT;
 	argv_array_pushl(&cld.args, "upload-pack", "--strict", NULL);
 	argv_array_pushf(&cld.args, "--timeout=%u", timeout);
+
+	argv_array_pushv(&cld.env_array, env->argv);
+
 	return run_service_command(&cld);
 }
 
-static int upload_archive(void)
+static int upload_archive(const struct argv_array *env)
 {
 	struct child_process cld = CHILD_PROCESS_INIT;
 	argv_array_push(&cld.args, "upload-archive");
+
+	argv_array_pushv(&cld.env_array, env->argv);
+
 	return run_service_command(&cld);
 }
 
-static int receive_pack(void)
+static int receive_pack(const struct argv_array *env)
 {
 	struct child_process cld = CHILD_PROCESS_INIT;
 	argv_array_push(&cld.args, "receive-pack");
+
+	argv_array_pushv(&cld.env_array, env->argv);
+
 	return run_service_command(&cld);
 }
 
@@ -574,7 +583,7 @@ static void canonicalize_client(struct strbuf *out, const char *in)
 /*
  * Read the host as supplied by the client connection.
  */
-static void parse_host_arg(struct hostinfo *hi, char *extra_args, int buflen)
+static char *parse_host_arg(struct hostinfo *hi, char *extra_args, int buflen)
 {
 	char *val;
 	int vallen;
@@ -602,6 +611,43 @@ static void parse_host_arg(struct hostinfo *hi, char *extra_args, int buflen)
 		if (extra_args < end && *extra_args)
 			die("Invalid request");
 	}
+
+	return extra_args;
+}
+
+static void parse_extra_args(struct hostinfo *hi, struct argv_array *env,
+			     char *extra_args, int buflen)
+{
+	const char *end = extra_args + buflen;
+	struct strbuf git_protocol = STRBUF_INIT;
+
+	/* First look for the host argument */
+	extra_args = parse_host_arg(hi, extra_args, buflen);
+
+	/* Look for additional arguments placed after a second NUL byte */
+	for (; extra_args < end; extra_args += strlen(extra_args) + 1) {
+		const char *arg = extra_args;
+
+		/*
+		 * Parse the extra arguments, adding most to 'git_protocol'
+		 * which will be used to set the 'GIT_PROTOCOL' envvar in the
+		 * service that will be run.
+		 *
+		 * If there ends up being a particular arg in the future that
+		 * git-daemon needs to parse specifically (like the 'host' arg)
+		 * then it can be parsed here and not added to 'git_protocol'.
+		 */
+		if (*arg) {
+			if (git_protocol.len > 0)
+				strbuf_addch(&git_protocol, ':');
+			strbuf_addstr(&git_protocol, arg);
+		}
+	}
+
+	if (git_protocol.len > 0)
+		argv_array_pushf(env, GIT_PROTOCOL_ENVIRONMENT "=%s",
+				 git_protocol.buf);
+	strbuf_release(&git_protocol);
 }
 
 /*
@@ -695,6 +741,7 @@ static int execute(void)
 	int pktlen, len, i;
 	char *addr = getenv("REMOTE_ADDR"), *port = getenv("REMOTE_PORT");
 	struct hostinfo hi;
+	struct argv_array env = ARGV_ARRAY_INIT;
 
 	hostinfo_init(&hi);
 
@@ -716,8 +763,9 @@ static int execute(void)
 		pktlen--;
 	}
 
+	/* parse additional args hidden behind a NUL byte */
 	if (len != pktlen)
-		parse_host_arg(&hi, line + len + 1, pktlen - len - 1);
+		parse_extra_args(&hi, &env, line + len + 1, pktlen - len - 1);
 
 	for (i = 0; i < ARRAY_SIZE(daemon_service); i++) {
 		struct daemon_service *s = &(daemon_service[i]);
@@ -730,13 +778,15 @@ static int execute(void)
 			 * Note: The directory here is probably context sensitive,
 			 * and might depend on the actual service being performed.
 			 */
-			int rc = run_service(arg, s, &hi);
+			int rc = run_service(arg, s, &hi, &env);
 			hostinfo_clear(&hi);
+			argv_array_clear(&env);
 			return rc;
 		}
 	}
 
 	hostinfo_clear(&hi);
+	argv_array_clear(&env);
 	logerror("Protocol error: '%s'", line);
 	return -1;
 }
-- 
2.14.1.992.g2c7b836f3a-goog



* [PATCH v2 5/9] upload-pack, receive-pack: introduce protocol version 1
  2017-09-26 23:56 ` [PATCH v2 0/9] protocol transition Brandon Williams
                     ` (3 preceding siblings ...)
  2017-09-26 23:56   ` [PATCH v2 4/9] daemon: recognize hidden request arguments Brandon Williams
@ 2017-09-26 23:56   ` Brandon Williams
  2017-09-27  5:23     ` Junio C Hamano
  2017-09-26 23:56   ` [PATCH v2 6/9] connect: teach client to recognize v1 server response Brandon Williams
                     ` (4 subsequent siblings)
  9 siblings, 1 reply; 161+ messages in thread
From: Brandon Williams @ 2017-09-26 23:56 UTC (permalink / raw)
  To: git
  Cc: bturner, git, gitster, jonathantanmy, jrnieder, peff, sbeller,
	Brandon Williams

Teach upload-pack and receive-pack to understand and respond using
protocol version 1, if requested.

Protocol version 1 is simply the original and current protocol (what I'm
calling version 0) with the addition of a single packet line, which
precedes the ref advertisement, indicating the protocol version being
spoken.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 builtin/receive-pack.c | 14 ++++++++++++++
 upload-pack.c          | 17 ++++++++++++++++-
 2 files changed, 30 insertions(+), 1 deletion(-)

diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index dd06b3fb4..cb179367b 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -24,6 +24,7 @@
 #include "tmp-objdir.h"
 #include "oidset.h"
 #include "packfile.h"
+#include "protocol.h"
 
 static const char * const receive_pack_usage[] = {
 	N_("git receive-pack <git-dir>"),
@@ -1963,6 +1964,19 @@ int cmd_receive_pack(int argc, const char **argv, const char *prefix)
 	else if (0 <= receive_unpack_limit)
 		unpack_limit = receive_unpack_limit;
 
+	switch (determine_protocol_version_server()) {
+	case protocol_v1:
+		if (advertise_refs || !stateless_rpc)
+			packet_write_fmt(1, "version 1\n");
+		/*
+		 * v1 is just the original protocol with a version string,
+		 * so just fall through after writing the version string.
+		 */
+	case protocol_v0:
+	default:
+		break;
+	}
+
 	if (advertise_refs || !stateless_rpc) {
 		write_head_info();
 	}
diff --git a/upload-pack.c b/upload-pack.c
index 7efff2fbf..5cab39819 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -18,6 +18,7 @@
 #include "parse-options.h"
 #include "argv-array.h"
 #include "prio-queue.h"
+#include "protocol.h"
 
 static const char * const upload_pack_usage[] = {
 	N_("git upload-pack [<options>] <dir>"),
@@ -1067,6 +1068,20 @@ int cmd_main(int argc, const char **argv)
 		die("'%s' does not appear to be a git repository", dir);
 
 	git_config(upload_pack_config, NULL);
-	upload_pack();
+
+	switch (determine_protocol_version_server()) {
+	case protocol_v1:
+		if (advertise_refs || !stateless_rpc)
+			packet_write_fmt(1, "version 1\n");
+		/*
+		 * v1 is just the original protocol with a version string,
+		 * so just fall through after writing the version string.
+		 */
+	case protocol_v0:
+	default:
+		upload_pack();
+		break;
+	}
+
 	return 0;
 }
-- 
2.14.1.992.g2c7b836f3a-goog



* [PATCH v2 6/9] connect: teach client to recognize v1 server response
  2017-09-26 23:56 ` [PATCH v2 0/9] protocol transition Brandon Williams
                     ` (4 preceding siblings ...)
  2017-09-26 23:56   ` [PATCH v2 5/9] upload-pack, receive-pack: introduce protocol version 1 Brandon Williams
@ 2017-09-26 23:56   ` Brandon Williams
  2017-09-27  1:07     ` Junio C Hamano
  2017-09-27  5:29     ` Junio C Hamano
  2017-09-26 23:56   ` [PATCH v2 7/9] connect: tell server that the client understands v1 Brandon Williams
                     ` (3 subsequent siblings)
  9 siblings, 2 replies; 161+ messages in thread
From: Brandon Williams @ 2017-09-26 23:56 UTC (permalink / raw)
  To: git
  Cc: bturner, git, gitster, jonathantanmy, jrnieder, peff, sbeller,
	Brandon Williams

Teach a client to recognize that a server understands protocol v1 by
looking at the first pkt-line the server sends in response.  This is
done by looking for the response "version 1" sent by upload-pack or
receive-pack.
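
The check described above can be sketched in isolation as follows.  This
is a simplified stand-in, not the connect.c code: the enum and function
names are hypothetical, the trailing newline is assumed to have been
stripped already, and cases where the real client dies are reported here
as an "unknown" return value instead.

```c
#include <string.h>

/* Hypothetical names, used only for this sketch. */
enum sketch_version {
	SKETCH_UNKNOWN = -1,
	SKETCH_V0 = 0,
	SKETCH_V1 = 1
};

/*
 * Classify the first pkt-line payload received from the server.
 * A line without the "version " prefix is treated as the start of an
 * old-style (v0) ref advertisement.
 */
static enum sketch_version classify_first_line(const char *line)
{
	static const char prefix[] = "version ";
	const size_t n = sizeof(prefix) - 1;

	if (strncmp(line, prefix, n))
		return SKETCH_V0;
	if (!strcmp(line + n, "1"))
		return SKETCH_V1;
	/* an explicit "version 0" is also rejected, as in the series */
	return SKETCH_UNKNOWN;
}
```

A v0 result means the same line still has to be processed as the first
ref, which is why the state machine in the diff falls through rather than
reading another packet.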

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 connect.c | 30 ++++++++++++++++++++++++++----
 1 file changed, 26 insertions(+), 4 deletions(-)

diff --git a/connect.c b/connect.c
index 8e2e276b6..1805debf3 100644
--- a/connect.c
+++ b/connect.c
@@ -12,6 +12,7 @@
 #include "sha1-array.h"
 #include "transport.h"
 #include "strbuf.h"
+#include "protocol.h"
 
 static char *server_capabilities;
 static const char *parse_feature_value(const char *, const char *, int *);
@@ -129,9 +130,23 @@ static int read_remote_ref(int in, char **src_buf, size_t *src_len,
 	return len;
 }
 
-#define EXPECTING_FIRST_REF 0
-#define EXPECTING_REF 1
-#define EXPECTING_SHALLOW 2
+#define EXPECTING_PROTOCOL_VERSION 0
+#define EXPECTING_FIRST_REF 1
+#define EXPECTING_REF 2
+#define EXPECTING_SHALLOW 3
+
+/* Returns 1 if packet_buffer is a protocol version pkt-line, 0 otherwise. */
+static int process_protocol_version(void)
+{
+	switch (determine_protocol_version_client(packet_buffer)) {
+		case protocol_v1:
+			return 1;
+		case protocol_v0:
+			return 0;
+		default:
+			die("server is speaking an unknown protocol");
+	}
+}
 
 static void process_capabilities(int *len)
 {
@@ -224,12 +239,19 @@ struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
 	 */
 	int responded = 0;
 	int len;
-	int state = EXPECTING_FIRST_REF;
+	int state = EXPECTING_PROTOCOL_VERSION;
 
 	*list = NULL;
 
 	while ((len = read_remote_ref(in, &src_buf, &src_len, &responded))) {
 		switch (state) {
+		case EXPECTING_PROTOCOL_VERSION:
+			if (process_protocol_version()) {
+				state = EXPECTING_FIRST_REF;
+				break;
+			}
+			state = EXPECTING_FIRST_REF;
+			/* fallthrough */
 		case EXPECTING_FIRST_REF:
 			process_capabilities(&len);
 			if (process_dummy_ref()) {
-- 
2.14.1.992.g2c7b836f3a-goog



* [PATCH v2 7/9] connect: tell server that the client understands v1
  2017-09-26 23:56 ` [PATCH v2 0/9] protocol transition Brandon Williams
                     ` (5 preceding siblings ...)
  2017-09-26 23:56   ` [PATCH v2 6/9] connect: teach client to recognize v1 server response Brandon Williams
@ 2017-09-26 23:56   ` Brandon Williams
  2017-09-27  6:21     ` Junio C Hamano
  2017-09-26 23:56   ` [PATCH v2 8/9] http: " Brandon Williams
                     ` (2 subsequent siblings)
  9 siblings, 1 reply; 161+ messages in thread
From: Brandon Williams @ 2017-09-26 23:56 UTC (permalink / raw)
  To: git
  Cc: bturner, git, gitster, jonathantanmy, jrnieder, peff, sbeller,
	Brandon Williams

Teach the connection logic to tell a server that it understands protocol
v1.  This is done in 2 different ways for the built-in protocols.

1. git://
   A normal request is structured as "command path/to/repo\0host=..\0"
   and due to a bug in an old version of git-daemon 73bb33a94 (daemon:
   Strictly parse the "extra arg" part of the command, 2009-06-04) we
   aren't able to place any extra args (separated by NULs) besides the
   host.  In order to get around this limitation put protocol version
   information after a second NUL byte so the request is structured
   like: "command path/to/repo\0host=..\0\0version=1\0".  git-daemon can
   then parse out the version number and set GIT_PROTOCOL.

2. ssh://, file://
   Set GIT_PROTOCOL envvar with the desired protocol version.  The
   envvar can be sent across ssh by using '-o SendEnv=GIT_PROTOCOL' and
   having the server whitelist this envvar.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 connect.c              |  37 ++++++--
 t/t5700-protocol-v1.sh | 223 +++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 255 insertions(+), 5 deletions(-)
 create mode 100755 t/t5700-protocol-v1.sh

diff --git a/connect.c b/connect.c
index 1805debf3..12ebab724 100644
--- a/connect.c
+++ b/connect.c
@@ -871,6 +871,7 @@ struct child_process *git_connect(int fd[2], const char *url,
 		printf("Diag: path=%s\n", path ? path : "NULL");
 		conn = NULL;
 	} else if (protocol == PROTO_GIT) {
+		struct strbuf request = STRBUF_INIT;
 		/*
 		 * Set up virtual host information based on where we will
 		 * connect, unless the user has overridden us in
@@ -898,13 +899,25 @@ struct child_process *git_connect(int fd[2], const char *url,
 		 * Note: Do not add any other headers here!  Doing so
 		 * will cause older git-daemon servers to crash.
 		 */
-		packet_write_fmt(fd[1],
-			     "%s %s%chost=%s%c",
-			     prog, path, 0,
-			     target_host, 0);
+		strbuf_addf(&request,
+			    "%s %s%chost=%s%c",
+			    prog, path, 0,
+			    target_host, 0);
+
+		/* If using a new version put that stuff here after a second null byte */
+		if (get_protocol_version_config() > 0) {
+			strbuf_addch(&request, '\0');
+			strbuf_addf(&request, "version=%d%c",
+				    get_protocol_version_config(), '\0');
+		}
+
+		packet_write(fd[1], request.buf, request.len);
+
 		free(target_host);
+		strbuf_release(&request);
 	} else {
 		struct strbuf cmd = STRBUF_INIT;
+		const char *const *var;
 
 		conn = xmalloc(sizeof(*conn));
 		child_process_init(conn);
@@ -917,7 +930,9 @@ struct child_process *git_connect(int fd[2], const char *url,
 		sq_quote_buf(&cmd, path);
 
 		/* remove repo-local variables from the environment */
-		conn->env = local_repo_env;
+		for (var = local_repo_env; *var; var++)
+			argv_array_push(&conn->env_array, *var);
+
 		conn->use_shell = 1;
 		conn->in = conn->out = -1;
 		if (protocol == PROTO_SSH) {
@@ -971,6 +986,14 @@ struct child_process *git_connect(int fd[2], const char *url,
 			}
 
 			argv_array_push(&conn->args, ssh);
+
+			if (get_protocol_version_config() > 0) {
+				argv_array_push(&conn->args, "-o");
+				argv_array_push(&conn->args, "SendEnv=" GIT_PROTOCOL_ENVIRONMENT);
+				argv_array_pushf(&conn->env_array, GIT_PROTOCOL_ENVIRONMENT "=version=%d",
+						 get_protocol_version_config());
+			}
+
 			if (flags & CONNECT_IPV4)
 				argv_array_push(&conn->args, "-4");
 			else if (flags & CONNECT_IPV6)
@@ -985,6 +1008,10 @@ struct child_process *git_connect(int fd[2], const char *url,
 			argv_array_push(&conn->args, ssh_host);
 		} else {
 			transport_check_allowed("file");
+			if (get_protocol_version_config() > 0) {
+				argv_array_pushf(&conn->env_array, GIT_PROTOCOL_ENVIRONMENT "=version=%d",
+						 get_protocol_version_config());
+			}
 		}
 		argv_array_push(&conn->args, cmd.buf);
 
diff --git a/t/t5700-protocol-v1.sh b/t/t5700-protocol-v1.sh
new file mode 100755
index 000000000..1988bbce6
--- /dev/null
+++ b/t/t5700-protocol-v1.sh
@@ -0,0 +1,223 @@
+#!/bin/sh
+
+test_description='test git wire-protocol transition'
+
+TEST_NO_CREATE_REPO=1
+
+. ./test-lib.sh
+
+# Test protocol v1 with 'git://' transport
+#
+. "$TEST_DIRECTORY"/lib-git-daemon.sh
+start_git_daemon --export-all --enable=receive-pack
+daemon_parent=$GIT_DAEMON_DOCUMENT_ROOT_PATH/parent
+
+test_expect_success 'create repo to be served by git-daemon' '
+	git init "$daemon_parent" &&
+	test_commit -C "$daemon_parent" one
+'
+
+test_expect_success 'clone with git:// using protocol v1' '
+	GIT_TRACE_PACKET=1 git -c protocol.version=1 \
+		clone "$GIT_DAEMON_URL/parent" daemon_child 2>log &&
+
+	git -C daemon_child log -1 --format=%s >actual &&
+	git -C "$daemon_parent" log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+
+	# Client requested to use protocol v1
+	grep "version=1" log &&
+	# Server responded using protocol v1
+	grep "clone< version 1" log
+'
+
+test_expect_success 'fetch with git:// using protocol v1' '
+	test_commit -C "$daemon_parent" two &&
+
+	GIT_TRACE_PACKET=1 git -C daemon_child -c protocol.version=1 \
+		fetch 2>log &&
+
+	git -C daemon_child log -1 --format=%s FETCH_HEAD >actual &&
+	git -C "$daemon_parent" log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+
+	# Client requested to use protocol v1
+	grep "version=1" log &&
+	# Server responded using protocol v1
+	grep "fetch< version 1" log
+'
+
+test_expect_success 'pull with git:// using protocol v1' '
+	GIT_TRACE_PACKET=1 git -C daemon_child -c protocol.version=1 \
+		pull 2>log &&
+
+	git -C daemon_child log -1 --format=%s >actual &&
+	git -C "$daemon_parent" log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+
+	# Client requested to use protocol v1
+	grep "version=1" log &&
+	# Server responded using protocol v1
+	grep "fetch< version 1" log
+'
+
+test_expect_success 'push with git:// using protocol v1' '
+	test_commit -C daemon_child three &&
+
+	# Since the repository being served isnt bare we need to push to
+	# another branch explicitly to avoid mangling the master branch
+	GIT_TRACE_PACKET=1 git -C daemon_child -c protocol.version=1 \
+		push origin HEAD:client_branch 2>log &&
+
+	git -C daemon_child log -1 --format=%s >actual &&
+	git -C "$daemon_parent" log -1 --format=%s client_branch >expect &&
+	test_cmp expect actual &&
+
+	# Client requested to use protocol v1
+	grep "version=1" log &&
+	# Server responded using protocol v1
+	grep "push< version 1" log
+'
+
+stop_git_daemon
+
+# Test protocol v1 with 'file://' transport
+#
+test_expect_success 'create repo to be served by file:// transport' '
+	git init file_parent &&
+	test_commit -C file_parent one
+'
+
+test_expect_success 'clone with file:// using protocol v1' '
+	GIT_TRACE_PACKET=1 git -c protocol.version=1 \
+		clone "file://$(pwd)/file_parent" file_child 2>log &&
+
+	git -C file_child log -1 --format=%s >actual &&
+	git -C file_parent log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+
+	# Server responded using protocol v1
+	grep "clone< version 1" log
+'
+
+test_expect_success 'fetch with file:// using protocol v1' '
+	test_commit -C file_parent two &&
+
+	GIT_TRACE_PACKET=1 git -C file_child -c protocol.version=1 \
+		fetch 2>log &&
+
+	git -C file_child log -1 --format=%s FETCH_HEAD >actual &&
+	git -C file_parent log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+
+	# Server responded using protocol v1
+	grep "fetch< version 1" log
+'
+
+test_expect_success 'pull with file:// using protocol v1' '
+	GIT_TRACE_PACKET=1 git -C file_child -c protocol.version=1 \
+		pull 2>log &&
+
+	git -C file_child log -1 --format=%s >actual &&
+	git -C file_parent log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+
+	# Server responded using protocol v1
+	grep "fetch< version 1" log
+'
+
+test_expect_success 'push with file:// using protocol v1' '
+	test_commit -C file_child three &&
+
+	# Since the repository being served isnt bare we need to push to
+	# another branch explicitly to avoid mangling the master branch
+	GIT_TRACE_PACKET=1 git -C file_child -c protocol.version=1 \
+		push origin HEAD:client_branch 2>log &&
+
+	git -C file_child log -1 --format=%s >actual &&
+	git -C file_parent log -1 --format=%s client_branch >expect &&
+	test_cmp expect actual &&
+
+	# Server responded using protocol v1
+	grep "push< version 1" log
+'
+
+# Test protocol v1 with 'ssh://' transport
+#
+test_expect_success 'setup ssh wrapper' '
+	GIT_SSH="$GIT_BUILD_DIR/t/helper/test-fake-ssh$X" &&
+	export GIT_SSH &&
+	export TRASH_DIRECTORY &&
+	>"$TRASH_DIRECTORY"/ssh-output
+'
+
+expect_ssh () {
+	test_when_finished '(cd "$TRASH_DIRECTORY" && rm -f ssh-expect && >ssh-output)' &&
+	echo "ssh: -o SendEnv=GIT_PROTOCOL myhost $1 '$PWD/ssh_parent'" >"$TRASH_DIRECTORY/ssh-expect" &&
+	(cd "$TRASH_DIRECTORY" && test_cmp ssh-expect ssh-output)
+}
+
+test_expect_success 'create repo to be served by ssh:// transport' '
+	git init ssh_parent &&
+	test_commit -C ssh_parent one
+'
+
+test_expect_success 'clone with ssh:// using protocol v1' '
+	GIT_TRACE_PACKET=1 git -c protocol.version=1 \
+		clone "ssh://myhost:$(pwd)/ssh_parent" ssh_child 2>log &&
+	expect_ssh git-upload-pack &&
+
+	git -C ssh_child log -1 --format=%s >actual &&
+	git -C ssh_parent log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+
+	# Server responded using protocol v1
+	grep "clone< version 1" log
+'
+
+test_expect_success 'fetch with ssh:// using protocol v1' '
+	test_commit -C ssh_parent two &&
+
+	GIT_TRACE_PACKET=1 git -C ssh_child -c protocol.version=1 \
+		fetch 2>log &&
+	expect_ssh git-upload-pack &&
+
+	git -C ssh_child log -1 --format=%s FETCH_HEAD >actual &&
+	git -C ssh_parent log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+
+	# Server responded using protocol v1
+	grep "fetch< version 1" log
+'
+
+test_expect_success 'pull with ssh:// using protocol v1' '
+	GIT_TRACE_PACKET=1 git -C ssh_child -c protocol.version=1 \
+		pull 2>log &&
+	expect_ssh git-upload-pack &&
+
+	git -C ssh_child log -1 --format=%s >actual &&
+	git -C ssh_parent log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+
+	# Server responded using protocol v1
+	grep "fetch< version 1" log
+'
+
+test_expect_success 'push with ssh:// using protocol v1' '
+	test_commit -C ssh_child three &&
+
+	# Since the repository being served isnt bare we need to push to
+	# another branch explicitly to avoid mangling the master branch
+	GIT_TRACE_PACKET=1 git -C ssh_child -c protocol.version=1 \
+		push origin HEAD:client_branch 2>log &&
+	expect_ssh git-receive-pack &&
+
+	git -C ssh_child log -1 --format=%s >actual &&
+	git -C ssh_parent log -1 --format=%s client_branch >expect &&
+	test_cmp expect actual &&
+
+	# Server responded using protocol v1
+	grep "push< version 1" log
+'
+
+test_done
-- 
2.14.1.992.g2c7b836f3a-goog



* [PATCH v2 8/9] http: tell server that the client understands v1
  2017-09-26 23:56 ` [PATCH v2 0/9] protocol transition Brandon Williams
                     ` (6 preceding siblings ...)
  2017-09-26 23:56   ` [PATCH v2 7/9] connect: tell server that the client understands v1 Brandon Williams
@ 2017-09-26 23:56   ` Brandon Williams
  2017-09-27  6:24     ` Junio C Hamano
  2017-09-26 23:56   ` [PATCH v2 9/9] i5700: add interop test for protocol transition Brandon Williams
  2017-10-03 20:14   ` [PATCH v3 00/10] " Brandon Williams
  9 siblings, 1 reply; 161+ messages in thread
From: Brandon Williams @ 2017-09-26 23:56 UTC (permalink / raw)
  To: git
  Cc: bturner, git, gitster, jonathantanmy, jrnieder, peff, sbeller,
	Brandon Williams

Tell a server that protocol v1 can be used by sending the HTTP header
'Git-Protocol' with the requested version.

Also teach the Apache HTTP server used by the test suite to pass the
'Git-Protocol' header through as the environment variable 'GIT_PROTOCOL'.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 cache.h                 |  2 ++
 http.c                  | 18 +++++++++++++
 t/lib-httpd/apache.conf |  7 +++++
 t/t5700-protocol-v1.sh  | 69 +++++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 96 insertions(+)

diff --git a/cache.h b/cache.h
index 0c792545f..aaadac1f0 100644
--- a/cache.h
+++ b/cache.h
@@ -451,6 +451,8 @@ static inline enum object_type object_type(unsigned int mode)
  * 'key[=value]'.  Presence of unknown keys must be tolerated.
  */
 #define GIT_PROTOCOL_ENVIRONMENT "GIT_PROTOCOL"
+/* HTTP header used to handshake the wire protocol */
+#define GIT_PROTOCOL_HEADER "Git-Protocol"
 
 /*
  * This environment variable is expected to contain a boolean indicating
diff --git a/http.c b/http.c
index 9e40a465f..ffb719216 100644
--- a/http.c
+++ b/http.c
@@ -12,6 +12,7 @@
 #include "gettext.h"
 #include "transport.h"
 #include "packfile.h"
+#include "protocol.h"
 
 static struct trace_key trace_curl = TRACE_KEY_INIT(CURL);
 #if LIBCURL_VERSION_NUM >= 0x070a08
@@ -897,6 +898,21 @@ static void set_from_env(const char **var, const char *envname)
 		*var = val;
 }
 
+static void protocol_http_header(void)
+{
+	if (get_protocol_version_config() > 0) {
+		struct strbuf protocol_header = STRBUF_INIT;
+
+		strbuf_addf(&protocol_header, GIT_PROTOCOL_HEADER ": version=%d",
+			    get_protocol_version_config());
+
+
+		extra_http_headers = curl_slist_append(extra_http_headers,
+						       protocol_header.buf);
+		strbuf_release(&protocol_header);
+	}
+}
+
 void http_init(struct remote *remote, const char *url, int proactive_auth)
 {
 	char *low_speed_limit;
@@ -927,6 +943,8 @@ void http_init(struct remote *remote, const char *url, int proactive_auth)
 	if (remote)
 		var_override(&http_proxy_authmethod, remote->http_proxy_authmethod);
 
+	protocol_http_header();
+
 	pragma_header = curl_slist_append(http_copy_default_headers(),
 		"Pragma: no-cache");
 	no_pragma_header = curl_slist_append(http_copy_default_headers(),
diff --git a/t/lib-httpd/apache.conf b/t/lib-httpd/apache.conf
index 0642ae7e6..df1943631 100644
--- a/t/lib-httpd/apache.conf
+++ b/t/lib-httpd/apache.conf
@@ -67,6 +67,9 @@ LockFile accept.lock
 <IfModule !mod_unixd.c>
 	LoadModule unixd_module modules/mod_unixd.so
 </IfModule>
+<IfModule !mod_setenvif.c>
+	LoadModule setenvif_module modules/mod_setenvif.so
+</IfModule>
 </IfVersion>
 
 PassEnv GIT_VALGRIND
@@ -76,6 +79,10 @@ PassEnv ASAN_OPTIONS
 PassEnv GIT_TRACE
 PassEnv GIT_CONFIG_NOSYSTEM
 
+<IfVersion >= 2.4>
+	SetEnvIf Git-Protocol ".*" GIT_PROTOCOL=$0
+</IfVersion>
+
 Alias /dumb/ www/
 Alias /auth/dumb/ www/auth/dumb/
 
diff --git a/t/t5700-protocol-v1.sh b/t/t5700-protocol-v1.sh
index 1988bbce6..222265127 100755
--- a/t/t5700-protocol-v1.sh
+++ b/t/t5700-protocol-v1.sh
@@ -220,4 +220,73 @@ test_expect_success 'push with ssh:// using protocol v1' '
 	grep "push< version 1" log
 '
 
+# Test protocol v1 with 'http://' transport
+#
+. "$TEST_DIRECTORY"/lib-httpd.sh
+start_httpd
+
+test_expect_success 'create repo to be served by http:// transport' '
+	git init "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" config http.receivepack true &&
+	test_commit -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" one
+'
+
+test_expect_success 'clone with http:// using protocol v1' '
+	GIT_TRACE_PACKET=1 GIT_TRACE_CURL=1 git -c protocol.version=1 \
+		clone "$HTTPD_URL/smart/http_parent" http_child 2>log &&
+
+	git -C http_child log -1 --format=%s >actual &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+
+	# Client requested to use protocol v1
+	grep "Git-Protocol: version=1" log &&
+	# Server responded using protocol v1
+	grep "git< version 1" log
+'
+
+test_expect_success 'fetch with http:// using protocol v1' '
+	test_commit -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" two &&
+
+	GIT_TRACE_PACKET=1 git -C http_child -c protocol.version=1 \
+		fetch 2>log &&
+
+	git -C http_child log -1 --format=%s FETCH_HEAD >actual &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+
+	# Server responded using protocol v1
+	grep "git< version 1" log
+'
+
+test_expect_success 'pull with http:// using protocol v1' '
+	GIT_TRACE_PACKET=1 git -C http_child -c protocol.version=1 \
+		pull 2>log &&
+
+	git -C http_child log -1 --format=%s >actual &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+
+	# Server responded using protocol v1
+	grep "git< version 1" log
+'
+
+test_expect_success 'push with http:// using protocol v1' '
+	test_commit -C http_child three &&
+
+	# Since the repository being served isnt bare we need to push to
+	# another branch explicitly to avoid mangling the master branch
+	GIT_TRACE_PACKET=1 git -C http_child -c protocol.version=1 \
+		push origin HEAD:client_branch 2>log &&
+
+	git -C http_child log -1 --format=%s >actual &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" log -1 --format=%s client_branch >expect &&
+	test_cmp expect actual &&
+
+	# Server responded using protocol v1
+	grep "git< version 1" log
+'
+
+stop_httpd
+
 test_done
-- 
2.14.1.992.g2c7b836f3a-goog



* [PATCH v2 9/9] i5700: add interop test for protocol transition
  2017-09-26 23:56 ` [PATCH v2 0/9] protocol transition Brandon Williams
                     ` (7 preceding siblings ...)
  2017-09-26 23:56   ` [PATCH v2 8/9] http: " Brandon Williams
@ 2017-09-26 23:56   ` Brandon Williams
  2017-10-03 20:14   ` [PATCH v3 00/10] " Brandon Williams
  9 siblings, 0 replies; 161+ messages in thread
From: Brandon Williams @ 2017-09-26 23:56 UTC (permalink / raw)
  To: git
  Cc: bturner, git, gitster, jonathantanmy, jrnieder, peff, sbeller,
	Brandon Williams

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 t/interop/i5700-protocol-transition.sh | 68 ++++++++++++++++++++++++++++++++++
 1 file changed, 68 insertions(+)
 create mode 100755 t/interop/i5700-protocol-transition.sh

diff --git a/t/interop/i5700-protocol-transition.sh b/t/interop/i5700-protocol-transition.sh
new file mode 100755
index 000000000..9e83428a8
--- /dev/null
+++ b/t/interop/i5700-protocol-transition.sh
@@ -0,0 +1,68 @@
+#!/bin/sh
+
+VERSION_A=.
+VERSION_B=v2.0.0
+
+: ${LIB_GIT_DAEMON_PORT:=5600}
+LIB_GIT_DAEMON_COMMAND='git.b daemon'
+
+test_description='clone and fetch by client who is trying to use a new protocol'
+. ./interop-lib.sh
+. "$TEST_DIRECTORY"/lib-git-daemon.sh
+
+start_git_daemon --export-all
+
+repo=$GIT_DAEMON_DOCUMENT_ROOT_PATH/repo
+
+test_expect_success "create repo served by $VERSION_B" '
+	git.b init "$repo" &&
+	git.b -C "$repo" commit --allow-empty -m one
+'
+
+test_expect_success "git:// clone with $VERSION_A and protocol v1" '
+	GIT_TRACE_PACKET=1 git.a -c protocol.version=1 clone "$GIT_DAEMON_URL/repo" child 2>log &&
+	git.a -C child log -1 --format=%s >actual &&
+	git.b -C "$repo" log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+	grep "version=1" log
+'
+
+test_expect_success "git:// fetch with $VERSION_A and protocol v1" '
+	git.b -C "$repo" commit --allow-empty -m two &&
+	git.b -C "$repo" log -1 --format=%s >expect &&
+
+	GIT_TRACE_PACKET=1 git.a -C child -c protocol.version=1 fetch 2>log &&
+	git.a -C child log -1 --format=%s FETCH_HEAD >actual &&
+
+	test_cmp expect actual &&
+	grep "version=1" log &&
+	! grep "version 1" log
+'
+
+stop_git_daemon
+
+test_expect_success "create repo served by $VERSION_B" '
+	git.b init parent &&
+	git.b -C parent commit --allow-empty -m one
+'
+
+test_expect_success "file:// clone with $VERSION_A and protocol v1" '
+	GIT_TRACE_PACKET=1 git.a -c protocol.version=1 clone --upload-pack="git.b upload-pack" parent child2 2>log &&
+	git.a -C child2 log -1 --format=%s >actual &&
+	git.b -C parent log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+	! grep "version 1" log
+'
+
+test_expect_success "file:// fetch with $VERSION_A and protocol v1" '
+	git.b -C parent commit --allow-empty -m two &&
+	git.b -C parent log -1 --format=%s >expect &&
+
+	GIT_TRACE_PACKET=1 git.a -C child2 -c protocol.version=1 fetch --upload-pack="git.b upload-pack" 2>log &&
+	git.a -C child2 log -1 --format=%s FETCH_HEAD >actual &&
+
+	test_cmp expect actual &&
+	! grep "version 1" log
+'
+
+test_done
-- 
2.14.1.992.g2c7b836f3a-goog



* Re: [PATCH v2 6/9] connect: teach client to recognize v1 server response
  2017-09-26 23:56   ` [PATCH v2 6/9] connect: teach client to recognize v1 server response Brandon Williams
@ 2017-09-27  1:07     ` Junio C Hamano
  2017-09-27 17:34       ` Brandon Williams
  2017-09-27  5:29     ` Junio C Hamano
  1 sibling, 1 reply; 161+ messages in thread
From: Junio C Hamano @ 2017-09-27  1:07 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, bturner, git, jonathantanmy, jrnieder, peff, sbeller

Brandon Williams <bmwill@google.com> writes:

> +/* Returns 1 if packet_buffer is a protocol version pkt-line, 0 otherwise. */
> +static int process_protocol_version(void)
> +{
> +	switch (determine_protocol_version_client(packet_buffer)) {
> +		case protocol_v1:
> +			return 1;
> +		case protocol_v0:
> +			return 0;
> +		default:
> +			die("server is speaking an unknown protocol");
> +	}
> +}

checkpatch.pl yells at me:

    ERROR: switch and case should be at the same indent

and we would probably want to teach "make style" the same, if we
don't already.


* Re: [PATCH v2 3/9] protocol: introduce protocol extention mechanisms
  2017-09-26 23:56   ` [PATCH v2 3/9] protocol: introduce protocol extention mechanisms Brandon Williams
@ 2017-09-27  5:17     ` Junio C Hamano
  2017-09-27 11:23       ` Junio C Hamano
  2017-09-28 21:58       ` Brandon Williams
  2017-09-27  6:30     ` Stefan Beller
  1 sibling, 2 replies; 161+ messages in thread
From: Junio C Hamano @ 2017-09-27  5:17 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, bturner, git, jonathantanmy, jrnieder, peff, sbeller

Brandon Williams <bmwill@google.com> writes:

> +`GIT_PROTOCOL`::
> +	For internal use only.  Used in handshaking the wire protocol.
> +	Contains a colon ':' separated list of keys with optional values
> +	'key[=value]'.  Presence of unknown keys must be tolerated.

Is this meant to be used only on the "server" end?  Am I correct to
interpret "handshaking" to mean the initial connection acceptor
(e.g. "git daemon") uses it to pass what it decided to the programs
that implement the service (e.g. "git receive-pack")?

> +/*
> + * Environment variable used in handshaking the wire protocol.
> + * Contains a colon ':' separated list of keys with optional values
> + * 'key[=value]'.  Presence of unknown keys must be tolerated.
> + */
> +#define GIT_PROTOCOL_ENVIRONMENT "GIT_PROTOCOL"

"Must be tolerated" feels a bit strange.  When somebody asks you to
use "version 3" or "version 1 variant 2", when you only know
"version 0" or "version 1" and you are not yet even aware of the
concept of "variant", we simply ignore "variant=2" as if it wasn't
there, even though "version=3" will be rejected (because we know of
"version"; it's just that we don't know "version=3").

> +enum protocol_version determine_protocol_version_server(void)
> +{
> +	const char *git_protocol = getenv(GIT_PROTOCOL_ENVIRONMENT);
> +	enum protocol_version version = protocol_v0;
> +
> +	if (git_protocol) {
> +		struct string_list list = STRING_LIST_INIT_DUP;
> +		const struct string_list_item *item;
> +		string_list_split(&list, git_protocol, ':', -1);
> +
> +		for_each_string_list_item(item, &list) {
> +			const char *value;
> +			enum protocol_version v;
> +
> +			if (skip_prefix(item->string, "version=", &value)) {
> +				v = parse_protocol_version(value);
> +				if (v > version)
> +					version = v;
> +			}
> +		}
> +
> +		string_list_clear(&list, 0);
> +	}
> +
> +	return version;
> +}

This implements "the largest one wins", not "the last one wins".  Is
there a particular reason why the former is chosen?
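
A minimal sketch of the two policies, for comparison.  The helper names
(entry_version, pick_version) are hypothetical, and versions are treated
as plain non-negative integers, which is an assumption for illustration
rather than what parse_protocol_version() in the patch does:

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: return N for an entry of the form "version=N",
 * or -1 for any other key (unknown keys are tolerated, i.e. skipped).
 */
static int entry_version(const char *entry, size_t len)
{
	static const char key[] = "version=";
	const size_t klen = sizeof(key) - 1;
	char *end;
	long v;

	if (len <= klen || strncmp(entry, key, klen) != 0)
		return -1;
	v = strtol(entry + klen, &end, 10);
	if (end != entry + len || v < 0 || v > INT_MAX)
		return -1;
	return (int)v;
}

/*
 * Walk a colon-separated "key[=value]" list.  With largest_wins set,
 * the highest version seen is kept; otherwise the last one is kept.
 * Version 0 is the default when no version key is present.
 */
static int pick_version(const char *list, int largest_wins)
{
	int version = 0;
	const char *p = list;

	assert(list != NULL);
	while (*p) {
		size_t len = strcspn(p, ":");
		int v = entry_version(p, len);

		if (v >= 0 && (!largest_wins || v > version))
			version = v;
		p += len;
		if (*p == ':')
			p++;
	}
	return version;
}
```

For the list "version=2:version=1", pick_version(list, 1) keeps 2 while
pick_version(list, 0) keeps 1; a key such as "variant=2" is skipped in
both modes.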



* Re: [PATCH v2 4/9] daemon: recognize hidden request arguments
  2017-09-26 23:56   ` [PATCH v2 4/9] daemon: recognize hidden request arguments Brandon Williams
@ 2017-09-27  5:20     ` Junio C Hamano
  2017-09-27 21:22       ` Brandon Williams
  0 siblings, 1 reply; 161+ messages in thread
From: Junio C Hamano @ 2017-09-27  5:20 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, bturner, git, jonathantanmy, jrnieder, peff, sbeller

Brandon Williams <bmwill@google.com> writes:

> A normal request to git-daemon is structured as
> "command path/to/repo\0host=..\0" and due to a bug in an old version of
> git-daemon 73bb33a94 (daemon: Strictly parse the "extra arg" part of the
> command, 2009-06-04) we aren't able to place any extra args (separated
> by NULs) besides the host.

It's a bit unclear if that commit _introduced_ a bug, or just
noticed an old bug and documented it in its log message.  How does
that commit impact the versions of Git that the updated code is
capable of interacting with?

> +static void parse_extra_args(struct hostinfo *hi, struct argv_array *env,
> +			     char *extra_args, int buflen)
> +{
> +	const char *end = extra_args + buflen;
> +	struct strbuf git_protocol = STRBUF_INIT;
> +
> +	/* First look for the host argument */
> +	extra_args = parse_host_arg(hi, extra_args, buflen);
> +
> +	/* Look for additional arguments placed after a second NUL byte */
> +	for (; extra_args < end; extra_args += strlen(extra_args) + 1) {
> +		const char *arg = extra_args;
> +
> +		/*
> +		 * Parse the extra arguments, adding most to 'git_protocol'
> +		 * which will be used to set the 'GIT_PROTOCOL' envvar in the
> +		 * service that will be run.
> +		 *
> +		 * If there ends up being a particular arg in the future that
> +		 * git-daemon needs to parse specifically (like the 'host' arg)
> +		 * then it can be parsed here and not added to 'git_protocol'.
> +		 */
> +		if (*arg) {
> +			if (git_protocol.len > 0)
> +				strbuf_addch(&git_protocol, ':');
> +			strbuf_addstr(&git_protocol, arg);
> +		}
> +	}
> +
> +	if (git_protocol.len > 0)
> +		argv_array_pushf(env, GIT_PROTOCOL_ENVIRONMENT "=%s",
> +				 git_protocol.buf);
> +	strbuf_release(&git_protocol);
>  }
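
The joining loop quoted above can be tried outside git-daemon with a
standalone sketch.  join_extra_args is a hypothetical name; it uses a
fixed output buffer instead of a strbuf, and it assumes the caller has
already skipped past the host argument:

```c
#include <assert.h>
#include <string.h>

/*
 * Join the non-empty NUL-separated arguments found in
 * [args, args + len) with ':' and store the result in out.
 * Arguments that no longer fit in out are dropped.
 * Returns the length of the joined string.
 */
static size_t join_extra_args(const char *args, size_t len,
			      char *out, size_t outsz)
{
	const char *end = args + len;
	size_t used = 0;

	assert(args != NULL && out != NULL && outsz > 0);
	out[0] = '\0';
	while (args < end) {
		const char *nul = memchr(args, '\0', (size_t)(end - args));
		size_t n = nul ? (size_t)(nul - args) : (size_t)(end - args);

		if (n > 0) {
			/* room for an optional ':', the arg, and the final NUL */
			size_t need = n + (used ? 1 : 0);

			if (used + need + 1 > outsz)
				break;
			if (used)
				out[used++] = ':';
			memcpy(out + used, args, n);
			used += n;
			out[used] = '\0';
		}
		if (!nul)
			break;	/* unterminated final argument */
		args = nul + 1;
	}
	return used;
}
```

Given the bytes "version=1\0\0variant=2\0", the result is the string
"version=1:variant=2", which is the shape the GIT_PROTOCOL value is
described as having.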


* Re: [PATCH v2 5/9] upload-pack, receive-pack: introduce protocol version 1
  2017-09-26 23:56   ` [PATCH v2 5/9] upload-pack, receive-pack: introduce protocol version 1 Brandon Williams
@ 2017-09-27  5:23     ` Junio C Hamano
  2017-09-27 21:29       ` Brandon Williams
  0 siblings, 1 reply; 161+ messages in thread
From: Junio C Hamano @ 2017-09-27  5:23 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, bturner, git, jonathantanmy, jrnieder, peff, sbeller

Brandon Williams <bmwill@google.com> writes:

> @@ -1963,6 +1964,19 @@ int cmd_receive_pack(int argc, const char **argv, const char *prefix)
>  	else if (0 <= receive_unpack_limit)
>  		unpack_limit = receive_unpack_limit;
>  
> +	switch (determine_protocol_version_server()) {
> +	case protocol_v1:
> +		if (advertise_refs || !stateless_rpc)
> +			packet_write_fmt(1, "version 1\n");
> +		/*
> +		 * v1 is just the original protocol with a version string,
> +		 * so just fall through after writing the version string.
> +		 */
> +	case protocol_v0:
> +	default:
> +		break;

When protocol_v2 is introduced in the other part of the codebase
(i.e. in protocol.[ch]), until these lines are updated accordingly
to take care of the new protocol, we'd pretend that the client asked
for (and the server accepted) v0, even though the client and the daemon
agreed to talk v2.

Shouldn't the "default:" die instead?  The same for upload-pack.c
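
A small sketch of that shape, under stated assumptions: the enum mirrors
the names used in this series, protocol_v2 is a hypothetical future
value, and the function returns -1 where real code would call die(), so
that the behaviour can be checked in isolation:

```c
/* Names follow the series; protocol_v2 is a hypothetical future value. */
enum protocol_version {
	protocol_unknown_version = -1,
	protocol_v0 = 0,
	protocol_v1 = 1,
	protocol_v2 = 2
};

/*
 * Returns 1 if a "version 1" line has to be written before the ref
 * advertisement, 0 if nothing extra is needed, and -1 if this code
 * path has not been taught about the negotiated version.
 */
static int version_line_needed(enum protocol_version v)
{
	switch (v) {
	case protocol_v1:
		return 1;
	case protocol_v0:
		return 0;
	default:
		/*
		 * Real code would die() here rather than silently
		 * falling back to speaking v0.
		 */
		return -1;
	}
}
```

With an explicit failure in the default arm, adding protocol_v2 to the
enum without updating this switch is reported instead of being treated
as v0.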


* Re: [PATCH v2 6/9] connect: teach client to recognize v1 server response
  2017-09-26 23:56   ` [PATCH v2 6/9] connect: teach client to recognize v1 server response Brandon Williams
  2017-09-27  1:07     ` Junio C Hamano
@ 2017-09-27  5:29     ` Junio C Hamano
  2017-09-28 22:08       ` Brandon Williams
  1 sibling, 1 reply; 161+ messages in thread
From: Junio C Hamano @ 2017-09-27  5:29 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, bturner, git, jonathantanmy, jrnieder, peff, sbeller

Brandon Williams <bmwill@google.com> writes:

> +/* Returns 1 if packet_buffer is a protocol version pkt-line, 0 otherwise. */
> +static int process_protocol_version(void)
> +{
> +	switch (determine_protocol_version_client(packet_buffer)) {
> +		case protocol_v1:
> +			return 1;
> +		case protocol_v0:
> +			return 0;
> +		default:
> +			die("server is speaking an unknown protocol");
> +	}
> +}

For the purpose of "technology demonstration" v1 protocol, it is OK
to discard the result of "determine_pvc()" like the above code, but
in a real application, we would do a bit more than just ignoring an
extra "version #" packet that appears at the beginning, no?

It would be sensible to design how the result of determine_pvc()
call is propagated to the remainder of the program in this patch and
implement it.  Perhaps add a new global (like server_capabilities
already is) and store the value there, or something?  Or pass a
pointer to enum protocol_version as a return-location parameter to
this helper function so that the process_capabilities() can pass a
pointer to its local variable?
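
The return-location variant could look roughly like the sketch below.
version_from_line and process_version_line are hypothetical names, the
pkt-line payload is assumed to arrive with its trailing newline already
stripped, and -1 stands in for the die() call:

```c
#include <assert.h>
#include <string.h>

enum protocol_version {
	protocol_unknown_version = -1,
	protocol_v0 = 0,
	protocol_v1 = 1
};

/* Classify the first pkt-line payload sent by the server. */
static enum protocol_version version_from_line(const char *line)
{
	if (!strncmp(line, "version ", 8)) {
		if (!strcmp(line + 8, "1"))
			return protocol_v1;
		return protocol_unknown_version;
	}
	/* No version line at all: the server speaks the original protocol. */
	return protocol_v0;
}

/*
 * Returns 1 if the line was a version line (and is consumed), 0 if it
 * is the first ref line, and -1 for an unknown version.  The detected
 * version is stored in *out so the caller can act on it later.
 */
static int process_version_line(const char *line, enum protocol_version *out)
{
	assert(line != NULL && out != NULL);
	*out = version_from_line(line);
	switch (*out) {
	case protocol_v1:
		return 1;
	case protocol_v0:
		return 0;
	default:
		return -1;
	}
}
```

The caller keeps the enum in a local variable and can branch on it
after the version line has been consumed, instead of only learning
whether a line was skipped.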

>  static void process_capabilities(int *len)
>  {
> @@ -224,12 +239,19 @@ struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
>  	 */
>  	int responded = 0;
>  	int len;
> -	int state = EXPECTING_FIRST_REF;
> +	int state = EXPECTING_PROTOCOL_VERSION;
>  
>  	*list = NULL;
>  
>  	while ((len = read_remote_ref(in, &src_buf, &src_len, &responded))) {
>  		switch (state) {
> +		case EXPECTING_PROTOCOL_VERSION:
> +			if (process_protocol_version()) {
> +				state = EXPECTING_FIRST_REF;
> +				break;
> +			}
> +			state = EXPECTING_FIRST_REF;
> +			/* fallthrough */
>  		case EXPECTING_FIRST_REF:
>  			process_capabilities(&len);
>  			if (process_dummy_ref()) {


* Re: [PATCH v2 7/9] connect: tell server that the client understands v1
  2017-09-26 23:56   ` [PATCH v2 7/9] connect: tell server that the client understands v1 Brandon Williams
@ 2017-09-27  6:21     ` Junio C Hamano
  2017-09-27  6:29       ` Junio C Hamano
  2017-09-28 22:20       ` Brandon Williams
  0 siblings, 2 replies; 161+ messages in thread
From: Junio C Hamano @ 2017-09-27  6:21 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, bturner, git, jonathantanmy, jrnieder, peff, sbeller

Brandon Williams <bmwill@google.com> writes:

> Teach the connection logic to tell a server that it understands protocol
> v1.  This is done in 2 different ways for the built in protocols.
>
> 1. git://
>    A normal request is structured as "command path/to/repo\0host=..\0"
>    and due to a bug in an old version of git-daemon 73bb33a94 (daemon:
>    Strictly parse the "extra arg" part of the command, 2009-06-04) we
>    aren't able to place any extra args (separated by NULs) besides the
>    host.  In order to get around this limitation put protocol version
>    information after a second NUL byte so the request is structured
>    like: "command path/to/repo\0host=..\0\0version=1\0".  git-daemon can
>    then parse out the version number and set GIT_PROTOCOL.

Same question as in a previous step, wrt the cited commit.  It reads
as if we are saying that the commit introduced a bug and left it there,
that we cannot use \0host=..\0version=..\0other=..\0 until that bug
is fixed, and that in the meantime we use \0host=..\0\0version=.. as
a workaround.  That reading leaves readers wondering if we want to
eventually drop this double-NUL workaround.

I am guessing that we want to declare that the current protocol has a
glitch that prevents us from using \0host=..\0version=..\0, that we
accept that and plan to keep it that way, and that we'll use the
double-NUL for anything other than host from now on, as it is
compatible with the current version of Git before this patch (the
extras are safely ignored).  But then it still leaves readers
wondering if the mention of the old commit from 2009 means that this
double-NUL would not even work if the other end is running a version
of Git before that commit, or whether we are safe to talk with
versions of Git even older than that.

I do not think it is a showstopper if we did not work with v1.6.4,
but it still needs to be clarified.
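
For reference, the byte layout under discussion can be written out as a
short sketch.  format_daemon_request is a hypothetical helper, not git's
API, and "example.com" in the usage note is a placeholder host:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Write "<command> <path>\0host=<host>\0\0version=<n>\0" into buf.
 * The empty field between the two consecutive NULs is what separates
 * the host argument from the extra arguments.
 * Returns the number of bytes used, or 0 if buf is too small.
 */
static size_t format_daemon_request(char *buf, size_t bufsz,
				    const char *command, const char *path,
				    const char *host, int version)
{
	size_t used;
	int n;

	assert(buf != NULL && command != NULL && path != NULL && host != NULL);

	n = snprintf(buf, bufsz, "%s %s", command, path);
	if (n < 0 || (size_t)n >= bufsz)
		return 0;
	used = (size_t)n + 1;		/* keep the NUL after the path */

	n = snprintf(buf + used, bufsz - used, "host=%s", host);
	if (n < 0 || (size_t)n >= bufsz - used)
		return 0;
	used += (size_t)n + 1;		/* keep the NUL after the host */

	if (used >= bufsz)
		return 0;
	buf[used++] = '\0';		/* the second NUL */

	n = snprintf(buf + used, bufsz - used, "version=%d", version);
	if (n < 0 || (size_t)n >= bufsz - used)
		return 0;
	used += (size_t)n + 1;		/* keep the final NUL */

	return used;
}
```

Calling it with "git-upload-pack", "/repo", "example.com" and 1 yields
50 bytes: "git-upload-pack /repo\0host=example.com\0\0version=1\0".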

> 2. ssh://, file://
>    Set GIT_PROTOCOL envvar with the desired protocol version.  The
>    envvar can be sent across ssh by using '-o SendEnv=GIT_PROTOCOL' and
>    having the server whitelist this envvar.

OpenSSH lets us do this, but I do not know how well this works with
other implementations of SSH clients.  The log message perhaps needs
to ask for volunteers to check if it is OK with the implementations
they use, and offer conditional code (just like we have for putty
and plink customizations) otherwise.

Other than that, the code changes looked good.

> diff --git a/t/t5700-protocol-v1.sh b/t/t5700-protocol-v1.sh
> new file mode 100755
> index 000000000..1988bbce6
> --- /dev/null
> +++ b/t/t5700-protocol-v1.sh
> @@ -0,0 +1,223 @@
> +#!/bin/sh
> +
> +test_description='test git wire-protocol transition'
> +
> +TEST_NO_CREATE_REPO=1
> +
> +. ./test-lib.sh
> +
> +# Test protocol v1 with 'git://' transport
> +#
> +. "$TEST_DIRECTORY"/lib-git-daemon.sh
> +start_git_daemon --export-all --enable=receive-pack
> +daemon_parent=$GIT_DAEMON_DOCUMENT_ROOT_PATH/parent
> +
> +test_expect_success 'create repo to be served by git-daemon' '
> +	git init "$daemon_parent" &&
> +	test_commit -C "$daemon_parent" one
> +'
> +
> +test_expect_success 'clone with git:// using protocol v1' '
> +	GIT_TRACE_PACKET=1 git -c protocol.version=1 \
> +		clone "$GIT_DAEMON_URL/parent" daemon_child 2>log &&
> +
> +	git -C daemon_child log -1 --format=%s >actual &&
> +	git -C "$daemon_parent" log -1 --format=%s >expect &&
> +	test_cmp expect actual &&
> +
> +	# Client requested to use protocol v1
> +	grep "version=1" log &&
> +	# Server responded using protocol v1
> +	grep "clone< version 1" log

This looked a bit strange to check "clone< version 1" for one
direction, but did not check "$something> version 1" for the other
direction.  Doesn't "version=1" end up producing 2 hits?

Not a complaint, but wondering if we can write it in such a way that
does not have to make readers wonder.

> +'
> +
> +test_expect_success 'fetch with git:// using protocol v1' '
> +	test_commit -C "$daemon_parent" two &&
> +
> +	GIT_TRACE_PACKET=1 git -C daemon_child -c protocol.version=1 \
> +		fetch 2>log &&
> +
> +	git -C daemon_child log -1 --format=%s FETCH_HEAD >actual &&
> +	git -C "$daemon_parent" log -1 --format=%s >expect &&
> +	test_cmp expect actual &&

OK.  So the origin repository gained one commit on the 'master'
branch (and a tag 'two').  By fetching, but not pulling, our
'master' would not advance, and that is where the check on FETCH_HEAD
comes from.  I suspect that the tag 'two' is also auto-followed with
this operation and would be in FETCH_HEAD; is that something we want
to check?  Alternatively, the "actual" log may want to see what the
remote tracking branch for their 'master' has---then we do not have
to worry about "FETCH_HEAD has two refs---which one are we checking?"

> +
> +	# Client requested to use protocol v1
> +	grep "version=1" log &&
> +	# Server responded using protocol v1
> +	grep "fetch< version 1" log
> +'

Same "version=1" vs "fetch< version 1" strangeness appears here.

> +test_expect_success 'pull with git:// using protocol v1' '
> +	GIT_TRACE_PACKET=1 git -C daemon_child -c protocol.version=1 \
> +		pull 2>log &&
> +
> +	git -C daemon_child log -1 --format=%s >actual &&
> +	git -C "$daemon_parent" log -1 --format=%s >expect &&
> +	test_cmp expect actual &&

Here we can check our 'master', as we pulled their 'master' into it.
What is this testing, though?  The fact that protocol.version=1
given via "git -c var=val" mechanism is propagated to the underlying
fetch?

> +	# Client requested to use protocol v1
> +	grep "version=1" log &&
> +	# Server responded using protocol v1
> +	grep "fetch< version 1" log
> +'
> +
> +test_expect_success 'push with git:// using protocol v1' '
> +	test_commit -C daemon_child three &&
> +
> +	# Since the repository being served isnt bare we need to push to
> +	# another branch explicitly to avoid mangling the master branch

The other end avoids mangling the master just fine without us doing
anything special ;-).  You are pushing to another branch because you
cannot push into a branch that is currently checked out.

	# Push to another branch, as the target repository has the
	# master branch checked out and we cannot push into it.

perhaps?

The tests for file:// looked identical, so the same set of comments
apply.

> +# Test protocol v1 with 'ssh://' transport
> +#
> +test_expect_success 'setup ssh wrapper' '
> +	GIT_SSH="$GIT_BUILD_DIR/t/helper/test-fake-ssh$X" &&
> +	export GIT_SSH &&
> +	export TRASH_DIRECTORY &&
> +	>"$TRASH_DIRECTORY"/ssh-output
> +'
> +
> +expect_ssh () {
> +	test_when_finished '(cd "$TRASH_DIRECTORY" && rm -f ssh-expect && >ssh-output)' &&
> +	echo "ssh: -o SendEnv=GIT_PROTOCOL myhost $1 '$PWD/ssh_parent'" >"$TRASH_DIRECTORY/ssh-expect" &&
> +	(cd "$TRASH_DIRECTORY" && test_cmp ssh-expect ssh-output)
> +}
> +
> +test_expect_success 'create repo to be served by ssh:// transport' '
> +	git init ssh_parent &&
> +	test_commit -C ssh_parent one
> +'
> +
> +test_expect_success 'clone with ssh:// using protocol v1' '
> +	GIT_TRACE_PACKET=1 git -c protocol.version=1 \
> +		clone "ssh://myhost:$(pwd)/ssh_parent" ssh_child 2>log &&

Hmm, this is a fun one, as we deliberately make $(pwd) have
whitespace in the test setup.  I am impressed/kinda surprised that
this works ;-)

Other than that, these also look more or less identical to file://
and git:// tests, so the same set of comments apply.

Overall very nicely done.

Thanks.


* Re: [PATCH v2 8/9] http: tell server that the client understands v1
  2017-09-26 23:56   ` [PATCH v2 8/9] http: " Brandon Williams
@ 2017-09-27  6:24     ` Junio C Hamano
  2017-09-27 21:36       ` Brandon Williams
  0 siblings, 1 reply; 161+ messages in thread
From: Junio C Hamano @ 2017-09-27  6:24 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, bturner, git, jonathantanmy, jrnieder, peff, sbeller

Brandon Williams <bmwill@google.com> writes:

> @@ -897,6 +898,21 @@ static void set_from_env(const char **var, const char *envname)
>  		*var = val;
>  }
>  
> +static void protocol_http_header(void)
> +{
> +	if (get_protocol_version_config() > 0) {
> +		struct strbuf protocol_header = STRBUF_INIT;
> +
> +		strbuf_addf(&protocol_header, GIT_PROTOCOL_HEADER ": version=%d",
> +			    get_protocol_version_config());
> +
> +
> +		extra_http_headers = curl_slist_append(extra_http_headers,
> +						       protocol_header.buf);
> +		strbuf_release(&protocol_header);
> +	}
> +}
> +
>  void http_init(struct remote *remote, const char *url, int proactive_auth)
>  {
>  	char *low_speed_limit;
> @@ -927,6 +943,8 @@ void http_init(struct remote *remote, const char *url, int proactive_auth)
>  	if (remote)
>  		var_override(&http_proxy_authmethod, remote->http_proxy_authmethod);
>  
> +	protocol_http_header();
> +
>  	pragma_header = curl_slist_append(http_copy_default_headers(),
>  		"Pragma: no-cache");
>  	no_pragma_header = curl_slist_append(http_copy_default_headers(),
> diff --git a/t/lib-httpd/apache.conf b/t/lib-httpd/apache.conf
> index 0642ae7e6..df1943631 100644
> --- a/t/lib-httpd/apache.conf
> +++ b/t/lib-httpd/apache.conf
> @@ -67,6 +67,9 @@ LockFile accept.lock
>  <IfModule !mod_unixd.c>
>  	LoadModule unixd_module modules/mod_unixd.so
>  </IfModule>
> +<IfModule !mod_setenvif.c>
> +	LoadModule setenvif_module modules/mod_setenvif.so
> +</IfModule>
>  </IfVersion>
>  
>  PassEnv GIT_VALGRIND
> @@ -76,6 +79,10 @@ PassEnv ASAN_OPTIONS
>  PassEnv GIT_TRACE
>  PassEnv GIT_CONFIG_NOSYSTEM
>  
> +<IfVersion >= 2.4>
> +	SetEnvIf Git-Protocol ".*" GIT_PROTOCOL=$0
> +</IfVersion>
> +

It is very nice to see that only with a single extra HTTP header and
the server configuration, everybody else does not have to care how
the version information is plumbed through ;-)

^ permalink raw reply	[flat|nested] 161+ messages in thread

* Re: [PATCH v2 7/9] connect: tell server that the client understands v1
  2017-09-27  6:21     ` Junio C Hamano
@ 2017-09-27  6:29       ` Junio C Hamano
  2017-09-29 21:32         ` Brandon Williams
  2017-09-28 22:20       ` Brandon Williams
  1 sibling, 1 reply; 161+ messages in thread
From: Junio C Hamano @ 2017-09-27  6:29 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, bturner, git, jonathantanmy, jrnieder, peff, sbeller

Junio C Hamano <gitster@pobox.com> writes:

>> +	# Client requested to use protocol v1
>> +	grep "version=1" log &&
>> +	# Server responded using protocol v1
>> +	grep "clone< version 1" log
>
> This looked a bit strange to check "clone< version 1" for one
> direction, but did not check "$something> version 1" for the other
> direction.  Doesn't "version=1" end up producing 2 hits?
>
> Not a complaint, but wondering if we can write it in such a way that
> does not have to make readers wonder.

Ah, the check for "version=1" is a short-hand for

	grep "clone> git-upload-pack ...\\0\\0version=1\\0$" log

and the symmetry I sought is already there.  So ignore the above; if
we wanted to make the symmetry more explicit, it would not hurt to
spell the first one as

	grep "clone> .*\\0\\0version=1\\0$" log

though.


^ permalink raw reply	[flat|nested] 161+ messages in thread

* Re: [PATCH v2 3/9] protocol: introduce protocol extention mechanisms
  2017-09-26 23:56   ` [PATCH v2 3/9] protocol: introduce protocol extention mechanisms Brandon Williams
  2017-09-27  5:17     ` Junio C Hamano
@ 2017-09-27  6:30     ` Stefan Beller
  2017-09-28 21:04       ` Brandon Williams
  1 sibling, 1 reply; 161+ messages in thread
From: Stefan Beller @ 2017-09-27  6:30 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, Bryan Turner, Jeff Hostetler, Junio C Hamano, Jonathan Tan,
	Jonathan Nieder, Jeff King

> +extern enum protocol_version get_protocol_version_config(void);
> +extern enum protocol_version determine_protocol_version_server(void);
> +extern enum protocol_version determine_protocol_version_client(const char *server_response);

It would be cool to have some documentation here.

^ permalink raw reply	[flat|nested] 161+ messages in thread

* Re: [PATCH v2 3/9] protocol: introduce protocol extention mechanisms
  2017-09-27  5:17     ` Junio C Hamano
@ 2017-09-27 11:23       ` Junio C Hamano
  2017-09-29 21:20         ` Brandon Williams
  2017-09-28 21:58       ` Brandon Williams
  1 sibling, 1 reply; 161+ messages in thread
From: Junio C Hamano @ 2017-09-27 11:23 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, bturner, git, jonathantanmy, jrnieder, peff, sbeller

Junio C Hamano <gitster@pobox.com> writes:

>> +enum protocol_version determine_protocol_version_server(void)
>> +{
>> +	const char *git_protocol = getenv(GIT_PROTOCOL_ENVIRONMENT);
>> +	enum protocol_version version = protocol_v0;
>> +
>> +	if (git_protocol) {
>> +		struct string_list list = STRING_LIST_INIT_DUP;
>> +		const struct string_list_item *item;
>> +		string_list_split(&list, git_protocol, ':', -1);
>> +
>> +		for_each_string_list_item(item, &list) {
>> +			const char *value;
>> +			enum protocol_version v;
>> +
>> +			if (skip_prefix(item->string, "version=", &value)) {
>> +				v = parse_protocol_version(value);
>> +				if (v > version)
>> +					version = v;
>> +			}
>> +		}
>> +
>> +		string_list_clear(&list, 0);
>> +	}
>> +
>> +	return version;
>> +}
>
> This implements "the largest one wins", not "the last one wins".  Is
> there a particular reason why the former is chosen?

Let me give my version of why the usual "the last one wins" would
not necessarily be a good idea.  I would imagine that a client
contacting the server may want to say "I understand v3, v2 (but not
v1 nor v0)" and in order to influence the server's choice between
the available two, it may want to somehow say it prefers v3 over v2
(or v2 over v3).  

One way to implement such a behaviour would be "the first one that
is understood is used", i.e. something along this line:

	enum protocol_version version = protocol_unknown;

	for_each_string_list_item(item, &list) {
		const char *value;
		enum protocol_version v;
		if (skip_prefix(item->string, "version=", &value)) {
			if (version == protocol_unknown) {
				v = parse_protocol_version(value);
				if (v != protocol_unknown)
					version = v;
			}
		}
	}

	if (version == protocol_unknown)
		version = protocol_v0;

and not "the largest one wins" nor "the last one wins".

I am not saying your code or the choice of "the largest one wins" is
necessarily wrong.  I am just illustrating the way to explain
"because I want to support a usecase like _this_, I define the way
in which multiple values to the version variable are parsed like so,
hence this code".  IOW, I think this commit should mention how the
"largest one wins" rule would be useful to the clients and the
servers when they want to achieve X---and that X is left unexplained.
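
To make the difference between the two policies concrete, here is a
minimal, self-contained sketch.  It is not code from the patch: the
enum values, parse_version() and pick_version() are made-up stand-ins
that only mimic the shape of protocol.c, and the list is assumed to be
the colon-separated "key[=value]" form used for GIT_PROTOCOL.

```c
#include <string.h>

/* Hypothetical stand-in for the enum in protocol.h. */
enum protocol_version {
	protocol_unknown = -1,
	protocol_v0 = 0,
	protocol_v1 = 1
};

/* Hypothetical parser: only "0" and "1" are understood. */
static enum protocol_version parse_version(const char *value, size_t len)
{
	if (len == 1 && value[0] == '0')
		return protocol_v0;
	if (len == 1 && value[0] == '1')
		return protocol_v1;
	return protocol_unknown;
}

/*
 * Walk a colon-separated "key[=value]" list and pick a version.
 * With first_wins set, the first understood "version=" entry is used;
 * otherwise the largest understood one is.  Unknown keys and unknown
 * version values are skipped, and v0 is the fallback.
 */
static enum protocol_version pick_version(const char *list, int first_wins)
{
	enum protocol_version version = protocol_unknown;
	const char *p = list;

	while (*p) {
		size_t len = strcspn(p, ":");

		if (len > 8 && !strncmp(p, "version=", 8)) {
			enum protocol_version v = parse_version(p + 8, len - 8);

			if (first_wins) {
				if (version == protocol_unknown)
					version = v;
			} else if (v > version) {
				version = v;
			}
		}
		p += len;
		if (*p == ':')
			p++;
	}
	return version == protocol_unknown ? protocol_v0 : version;
}
```

With "version=0:version=1" the first policy settles on v0 and the
second on v1, so the order in which a client lists versions only
matters under "first understood wins".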




^ permalink raw reply	[flat|nested] 161+ messages in thread

* Re: [PATCH v2 6/9] connect: teach client to recognize v1 server response
  2017-09-27  1:07     ` Junio C Hamano
@ 2017-09-27 17:34       ` Brandon Williams
  0 siblings, 0 replies; 161+ messages in thread
From: Brandon Williams @ 2017-09-27 17:34 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, bturner, git, jonathantanmy, jrnieder, peff, sbeller

On 09/27, Junio C Hamano wrote:
> Brandon Williams <bmwill@google.com> writes:
> 
> > +/* Returns 1 if packet_buffer is a protocol version pkt-line, 0 otherwise. */
> > +static int process_protocol_version(void)
> > +{
> > +	switch (determine_protocol_version_client(packet_buffer)) {
> > +		case protocol_v1:
> > +			return 1;
> > +		case protocol_v0:
> > +			return 0;
> > +		default:
> > +			die("server is speaking an unknown protocol");
> > +	}
> > +}
> 
> checkpatch.pl yells at me:
> 
>     ERROR: switch and case should be at the same indent
> 
> and we would probably want to teach "make style" the same, if we
> already don't.

'make style' actually already understands this, I just forgot to run it
on this change :)

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 161+ messages in thread

* Re: [PATCH v2 4/9] daemon: recognize hidden request arguments
  2017-09-27  5:20     ` Junio C Hamano
@ 2017-09-27 21:22       ` Brandon Williams
  2017-09-28 16:57         ` Brandon Williams
  0 siblings, 1 reply; 161+ messages in thread
From: Brandon Williams @ 2017-09-27 21:22 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, bturner, git, jonathantanmy, jrnieder, peff, sbeller

On 09/27, Junio C Hamano wrote:
> Brandon Williams <bmwill@google.com> writes:
> 
> > A normal request to git-daemon is structured as
> > "command path/to/repo\0host=..\0" and due to a bug in an old version of
> > git-daemon 73bb33a94 (daemon: Strictly parse the "extra arg" part of the
> > command, 2009-06-04) we aren't able to place any extra args (separated
> > by NULs) besides the host.
> 
> It's a bit unclear if that commit _introduced_ a bug, or just
> noticed an old bug and documented it in its log message.  How does
> that commit impact the versions of Git that the updated code is
> capable of interacting with?

You're right, after reading it again it isn't clear.  I'll change this
to indicate that the commit is a fix to a bug and that the fix doesn't
allow us to place any additional args.

> 
> > +static void parse_extra_args(struct hostinfo *hi, struct argv_array *env,
> > +			     char *extra_args, int buflen)
> > +{
> > +	const char *end = extra_args + buflen;
> > +	struct strbuf git_protocol = STRBUF_INIT;
> > +
> > +	/* First look for the host argument */
> > +	extra_args = parse_host_arg(hi, extra_args, buflen);
> > +
> > +	/* Look for additional arguments places after a second NUL byte */
> > +	for (; extra_args < end; extra_args += strlen(extra_args) + 1) {
> > +		const char *arg = extra_args;
> > +
> > +		/*
> > +		 * Parse the extra arguments, adding most to 'git_protocol'
> > +		 * which will be used to set the 'GIT_PROTOCOL' envvar in the
> > +		 * service that will be run.
> > +		 *
> > +		 * If there ends up being a particular arg in the future that
> > +		 * git-daemon needs to parse specificly (like the 'host' arg)
> > +		 * then it can be parsed here and not added to 'git_protocol'.
> > +		 */
> > +		if (*arg) {
> > +			if (git_protocol.len > 0)
> > +				strbuf_addch(&git_protocol, ':');
> > +			strbuf_addstr(&git_protocol, arg);
> > +		}
> > +	}
> > +
> > +	if (git_protocol.len > 0)
> > +		argv_array_pushf(env, GIT_PROTOCOL_ENVIRONMENT "=%s",
> > +				 git_protocol.buf);
> > +	strbuf_release(&git_protocol);
> >  }

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 161+ messages in thread

* Re: [PATCH v2 5/9] upload-pack, receive-pack: introduce protocol version 1
  2017-09-27  5:23     ` Junio C Hamano
@ 2017-09-27 21:29       ` Brandon Williams
  0 siblings, 0 replies; 161+ messages in thread
From: Brandon Williams @ 2017-09-27 21:29 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, bturner, git, jonathantanmy, jrnieder, peff, sbeller

On 09/27, Junio C Hamano wrote:
> Brandon Williams <bmwill@google.com> writes:
> 
> > @@ -1963,6 +1964,19 @@ int cmd_receive_pack(int argc, const char **argv, const char *prefix)
> >  	else if (0 <= receive_unpack_limit)
> >  		unpack_limit = receive_unpack_limit;
> >  
> > +	switch (determine_protocol_version_server()) {
> > +	case protocol_v1:
> > +		if (advertise_refs || !stateless_rpc)
> > +			packet_write_fmt(1, "version 1\n");
> > +		/*
> > +		 * v1 is just the original protocol with a version string,
> > +		 * so just fall through after writing the version string.
> > +		 */
> > +	case protocol_v0:
> > +	default:
> > +		break;
> 
> When protocol_v2 is introduced in the other part of the codebase
> (i.e. in protocol.[ch]), until these lines are updated accordingly
> to take care of the new protocol, we'd pretend that client asked
> (and the server accepted) v0, even though the client and the daemon
> agreed to talk v2.
> 
> Shouldn't the "default:" die instead?  The same for upload-pack.c

Good catch.  Yeah you're right, the default should probably die saying
that receive pack or upload pack doesn't support the protocol version.
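
Roughly what I have in mind, as a standalone sketch rather than the
actual receive-pack code: start_service(), the greeting buffer and
protocol_v2 are hypothetical names used only for illustration, and the
-1 return stands in for the die() the real code would call.

```c
#include <stdio.h>

/* Hypothetical stand-in for the enum in protocol.h. */
enum protocol_version {
	protocol_unknown_version = -1,
	protocol_v0 = 0,
	protocol_v1 = 1,
	protocol_v2 = 2	/* hypothetical: not handled by this service yet */
};

/*
 * Fill 'greeting' (size must be at least 1) with whatever has to be
 * written before the ref advertisement.  Returns 0 when the service
 * can speak the requested version and -1 when it cannot, which is
 * where the real code would die() instead of silently speaking v0.
 */
static int start_service(enum protocol_version v, char *greeting, size_t size)
{
	greeting[0] = '\0';

	switch (v) {
	case protocol_v1:
		snprintf(greeting, size, "version 1\n");
		/* v1 is just v0 with a version line in front */
		/* fallthrough */
	case protocol_v0:
		return 0;
	default:
		return -1;
	}
}
```

The point is only that a version this service has not been taught
about lands in "default" and is refused, instead of being treated as
if the client had asked for v0.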

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 161+ messages in thread

* Re: [PATCH v2 8/9] http: tell server that the client understands v1
  2017-09-27  6:24     ` Junio C Hamano
@ 2017-09-27 21:36       ` Brandon Williams
  0 siblings, 0 replies; 161+ messages in thread
From: Brandon Williams @ 2017-09-27 21:36 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, bturner, git, jonathantanmy, jrnieder, peff, sbeller

On 09/27, Junio C Hamano wrote:
> Brandon Williams <bmwill@google.com> writes:
> 
> > @@ -897,6 +898,21 @@ static void set_from_env(const char **var, const char *envname)
> >  		*var = val;
> >  }
> >  
> > +static void protocol_http_header(void)
> > +{
> > +	if (get_protocol_version_config() > 0) {
> > +		struct strbuf protocol_header = STRBUF_INIT;
> > +
> > +		strbuf_addf(&protocol_header, GIT_PROTOCOL_HEADER ": version=%d",
> > +			    get_protocol_version_config());
> > +
> > +
> > +		extra_http_headers = curl_slist_append(extra_http_headers,
> > +						       protocol_header.buf);
> > +		strbuf_release(&protocol_header);
> > +	}
> > +}
> > +
> >  void http_init(struct remote *remote, const char *url, int proactive_auth)
> >  {
> >  	char *low_speed_limit;
> > @@ -927,6 +943,8 @@ void http_init(struct remote *remote, const char *url, int proactive_auth)
> >  	if (remote)
> >  		var_override(&http_proxy_authmethod, remote->http_proxy_authmethod);
> >  
> > +	protocol_http_header();
> > +
> >  	pragma_header = curl_slist_append(http_copy_default_headers(),
> >  		"Pragma: no-cache");
> >  	no_pragma_header = curl_slist_append(http_copy_default_headers(),
> > diff --git a/t/lib-httpd/apache.conf b/t/lib-httpd/apache.conf
> > index 0642ae7e6..df1943631 100644
> > --- a/t/lib-httpd/apache.conf
> > +++ b/t/lib-httpd/apache.conf
> > @@ -67,6 +67,9 @@ LockFile accept.lock
> >  <IfModule !mod_unixd.c>
> >  	LoadModule unixd_module modules/mod_unixd.so
> >  </IfModule>
> > +<IfModule !mod_setenvif.c>
> > +	LoadModule setenvif_module modules/mod_setenvif.so
> > +</IfModule>
> >  </IfVersion>
> >  
> >  PassEnv GIT_VALGRIND
> > @@ -76,6 +79,10 @@ PassEnv ASAN_OPTIONS
> >  PassEnv GIT_TRACE
> >  PassEnv GIT_CONFIG_NOSYSTEM
> >  
> > +<IfVersion >= 2.4>
> > +	SetEnvIf Git-Protocol ".*" GIT_PROTOCOL=$0
> > +</IfVersion>
> > +
> 
> It is very nice to see that only with a single extra HTTP header and
> the server configuration, everybody else does not have to care how
> the version information is plumbed through ;-)

Having limited experience working with HTTP, it took me a bit to
figure out how to get the server configuration right, but once I got it
working it seemed to work pretty seamlessly :)

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 161+ messages in thread

* Re: [PATCH v2 4/9] daemon: recognize hidden request arguments
  2017-09-27 21:22       ` Brandon Williams
@ 2017-09-28 16:57         ` Brandon Williams
  0 siblings, 0 replies; 161+ messages in thread
From: Brandon Williams @ 2017-09-28 16:57 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, bturner, git, jonathantanmy, jrnieder, peff, sbeller

On 09/27, Brandon Williams wrote:
> On 09/27, Junio C Hamano wrote:
> > Brandon Williams <bmwill@google.com> writes:
> >
> > > A normal request to git-daemon is structured as
> > > "command path/to/repo\0host=..\0" and due to a bug in an old version of
> > > git-daemon 73bb33a94 (daemon: Strictly parse the "extra arg" part of the
> > > command, 2009-06-04) we aren't able to place any extra args (separated
> > > by NULs) besides the host.
> >
> > It's a bit unclear if that commit _introduced_ a bug, or just
> > noticed an old bug and documented it in its log message.  How does
> > that commit impact the versions of Git that the updated code is
> > capable of interacting with?
>
> You're right, after reading it again it isn't clear.  I'll change this
> to indicate that the commit is a fix to a bug and that the fix doesn't
> allow us to place any additional args.

Ok how about this wording for the commit msg:

  daemon: recognize hidden request arguments

  A normal request to git-daemon is structured as "command
  path/to/repo\0host=..\0" and due to a bug introduced in
  49ba83fb6 (Add virtualization support to git-daemon, 2006-09-19)
  we aren't able to place any extra arguments (separated by NULs)
  besides the host otherwise the parsing of those arguments would
  enter an infinite loop.  This bug was fixed in 73bb33a94
  (daemon: Strictly parse the "extra arg" part of the command,
  2009-06-04) but a check was put in place to disallow extra
  arguments so that new clients wouldn't trigger this bug in older
  servers.

  In order to get around this limitation teach git-daemon to
  recognize additional request arguments hidden behind a second
  NUL byte.  Requests can then be structured like: "command
  path/to/repo\0host=..\0\0version=1\0key=value\0".  git-daemon
  can then parse out the extra arguments and set 'GIT_PROTOCOL'
  accordingly.

  By placing these extra arguments behind a second NUL byte we can
  skirt around both the infinite loop bug in 49ba83fb6 (Add
  virtualization support to git-daemon, 2006-09-19) as well as the
  explicit disallowing of extra arguments introduced in 73bb33a94
  (daemon: Strictly parse the "extra arg" part of the command,
  2009-06-04) because both of these versions of git-daemon check
  for a single NUL byte after the host argument before terminating
  the argument parsing.

This way I'm citing when the bug was actually introduced as well as
describing why the 'fix' didn't completely resolve the issue.  I also
explain a little bit about how this change should work even with very
old servers which still have the bug.  (I tried to get the interop test
to work on a version of git that old but am having some difficulty even
getting the old version to launch git-daemon without hanging for some
unknown reason.)
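
For anyone who wants to see the request layout spelled out, here is a
small standalone sketch of splitting such a request.  It is not the
daemon.c code from the patch: collect_hidden_args() is a made-up
helper, and it assumes the whole request, including the final NUL, is
already sitting in one buffer.

```c
#include <stddef.h>
#include <string.h>

/*
 * Join the arguments hidden behind the second NUL with ':' into 'out'
 * (the shape GIT_PROTOCOL takes).  Returns the number of arguments
 * found, or -1 if the request is malformed or 'out' is too small.
 */
static int collect_hidden_args(const char *req, size_t len,
			       char *out, size_t outlen)
{
	const char *end = req + len;
	const char *p = req;
	size_t used = 0;
	int count = 0;

	if (!len || req[len - 1] != '\0' || !outlen)
		return -1;	/* every argument must be NUL-terminated */
	out[0] = '\0';

	p += strlen(p) + 1;		/* skip "command path/to/repo" */
	if (p < end && !strncmp(p, "host=", 5))
		p += strlen(p) + 1;	/* skip the host argument */
	if (p == end)
		return 0;		/* old-style request, nothing hidden */
	if (*p != '\0')
		return -1;		/* extra arg without the second NUL */
	p++;				/* step over the second NUL */

	for (; p < end; p += strlen(p) + 1) {
		size_t n = strlen(p);

		if (!n)
			continue;	/* ignore empty arguments */
		if (used + n + 2 > outlen)
			return -1;	/* no room for ':' + arg + NUL */
		if (used)
			out[used++] = ':';
		memcpy(out + used, p, n + 1);
		used += n;
		count++;
	}
	return count;
}
```

Given "git-upload-pack /repo\0host=example.com\0\0version=1\0key=value\0"
this yields "version=1:key=value", while a request that puts
"version=1" directly after the host (no empty argument in between) is
rejected, which mirrors the restriction described above.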

--
Brandon Williams

^ permalink raw reply	[flat|nested] 161+ messages in thread

* Re: [PATCH v2 3/9] protocol: introduce protocol extention mechanisms
  2017-09-27  6:30     ` Stefan Beller
@ 2017-09-28 21:04       ` Brandon Williams
  0 siblings, 0 replies; 161+ messages in thread
From: Brandon Williams @ 2017-09-28 21:04 UTC (permalink / raw)
  To: Stefan Beller
  Cc: git, Bryan Turner, Jeff Hostetler, Junio C Hamano, Jonathan Tan,
	Jonathan Nieder, Jeff King

On 09/26, Stefan Beller wrote:
> > +extern enum protocol_version get_protocol_version_config(void);
> > +extern enum protocol_version determine_protocol_version_server(void);
> > +extern enum protocol_version determine_protocol_version_client(const char *server_response);
> 
> It would be cool to have some documentation here.

Thanks for reminding me, I'll get to writing some more documentation :)

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 161+ messages in thread

* Re: [PATCH v2 3/9] protocol: introduce protocol extention mechanisms
  2017-09-27  5:17     ` Junio C Hamano
  2017-09-27 11:23       ` Junio C Hamano
@ 2017-09-28 21:58       ` Brandon Williams
  1 sibling, 0 replies; 161+ messages in thread
From: Brandon Williams @ 2017-09-28 21:58 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, bturner, git, jonathantanmy, jrnieder, peff, sbeller

On 09/27, Junio C Hamano wrote:
> Brandon Williams <bmwill@google.com> writes:
> 
> > +`GIT_PROTOCOL`::
> > +	For internal use only.  Used in handshaking the wire protocol.
> > +	Contains a colon ':' separated list of keys with optional values
> > +	'key[=value]'.  Presence of unknown keys must be tolerated.
> 
> Is this meant to be used only on the "server" end?  Am I correct to
> interpret "handshaking" to mean the initial connection acceptor
> (e.g. "git daemon") uses it to pass what it decided to the programs
> that implement the service (e.g. "git receive-pack")?

Yes, the idea is that the client will request a protocol version by
setting GIT_PROTOCOL (or indirectly set when using git:// or http://).
upload-pack and receive-pack will use the keys and values set in
GIT_PROTOCOL to determine which version of the protocol to use.  At some
point in the future they may even use other keys and values as a means
of sending more information in an initial request from the client.

> 
> > +/*
> > + * Environment variable used in handshaking the wire protocol.
> > + * Contains a colon ':' separated list of keys with optional values
> > + * 'key[=value]'.  Presence of unknown keys must be tolerated.
> > + */
> > +#define GIT_PROTOCOL_ENVIRONMENT "GIT_PROTOCOL"
> 
> "Must be tolerated" feels a bit strange.  When somebody asks you to
> use "version 3" or "version 1 variant 2", when you only know
> "version 0" or "version 1" and you are not yet even aware of the
> concept of "variant", we simply ignore "variant=2" as if it wasn't
> there, even though "version=3" will be rejected (because we know of
> "version"; it's just that we don't know "version=3").

By "Must be tolerated" I was trying to get across that if the server
sees something it doesn't understand, it shouldn't choke.  Maybe a
better wording would be to use the word "ignored"?

> 
> > +enum protocol_version determine_protocol_version_server(void)
> > +{
> > +	const char *git_protocol = getenv(GIT_PROTOCOL_ENVIRONMENT);
> > +	enum protocol_version version = protocol_v0;
> > +
> > +	if (git_protocol) {
> > +		struct string_list list = STRING_LIST_INIT_DUP;
> > +		const struct string_list_item *item;
> > +		string_list_split(&list, git_protocol, ':', -1);
> > +
> > +		for_each_string_list_item(item, &list) {
> > +			const char *value;
> > +			enum protocol_version v;
> > +
> > +			if (skip_prefix(item->string, "version=", &value)) {
> > +				v = parse_protocol_version(value);
> > +				if (v > version)
> > +					version = v;
> > +			}
> > +		}
> > +
> > +		string_list_clear(&list, 0);
> > +	}
> > +
> > +	return version;
> > +}
> 
> This implements "the largest one wins", not "the last one wins".  Is
> there a particular reason why the former is chosen?
> 

I envision this logic changing for newer servers once more protocol
versions are added because at some point a server may want to disallow a
particular version (because of a security issue or what have you).  So I
figured the easiest thing to do for now was to implement "Newest version
wins".

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 161+ messages in thread

* Re: [PATCH v2 6/9] connect: teach client to recognize v1 server response
  2017-09-27  5:29     ` Junio C Hamano
@ 2017-09-28 22:08       ` Brandon Williams
  0 siblings, 0 replies; 161+ messages in thread
From: Brandon Williams @ 2017-09-28 22:08 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, bturner, git, jonathantanmy, jrnieder, peff, sbeller

On 09/27, Junio C Hamano wrote:
> Brandon Williams <bmwill@google.com> writes:
> 
> > +/* Returns 1 if packet_buffer is a protocol version pkt-line, 0 otherwise. */
> > +static int process_protocol_version(void)
> > +{
> > +	switch (determine_protocol_version_client(packet_buffer)) {
> > +		case protocol_v1:
> > +			return 1;
> > +		case protocol_v0:
> > +			return 0;
> > +		default:
> > +			die("server is speaking an unknown protocol");
> > +	}
> > +}
> 
> For the purpose of "technology demonstration" v1 protocol, it is OK
> to discard the result of "determine_pvc()" like the above code, but
> in a real application, we would do a bit more than just ignoring an
> extra "version #" packet that appears at the beginning, no?
> 
> It would be sensible to design how the result of determine_pvc()
> call is propagated to the remainder of the program in this patch and
> implement it.  Perhaps add a new global (like server_capabilities
> already is) and store the value there, or something?  Or pass a
> pointer to enum protocol_version as a return-location parameter to
> this helper function so that the process_capabilities() can pass a
> pointer to its local variable?

Yes, once we actually implement a v2 we would need to not throw away the
result of 'determine_pvc()' and instead do control flow based on the
resultant version.  I was trying to implement 'v1' as simply as possible
so that I wouldn't have to do a large amount of refactoring when
proposing this transition, though it seems Jonathan ended up doing more
than I planned, as I figured we didn't really know what the code will
need to be refactored to, in order to handle another protocol version.
I would suspect that we maybe wouldn't want to determine which version a
server is speaking in 'get_remote_heads()' but rather at some point
before that so we can branch off to do v2 like things, for example,
capability discovery and not ref discovery.

If you do think we need to do more of that refactoring now, before a v2,
I can most certainly work on that.
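
As a rough idea of the return-location variant you describe, something
like the following standalone sketch could work.  The names are only
illustrative: version_from_line() is a simplified stand-in for
determine_protocol_version_client(), and the line is assumed to have
its trailing newline already stripped.

```c
#include <string.h>

/* Hypothetical stand-in for the enum in protocol.h. */
enum protocol_version {
	protocol_unknown_version = -1,
	protocol_v0 = 0,
	protocol_v1 = 1
};

/* Simplified stand-in for determine_protocol_version_client(). */
static enum protocol_version version_from_line(const char *line)
{
	if (strncmp(line, "version ", 8))
		return protocol_v0;	/* no version line: plain v0 advertisement */
	if (!strcmp(line + 8, "1"))
		return protocol_v1;
	return protocol_unknown_version;
}

/*
 * Returns 1 if 'line' was a version pkt-line and has been consumed,
 * 0 if it is the first line of a v0 ref advertisement, and -1 if the
 * server announced a version we do not know.  In every case the
 * detected version is stored in '*version' so the caller can branch
 * on it later instead of throwing it away.
 */
static int process_protocol_version(const char *line,
				    enum protocol_version *version)
{
	*version = version_from_line(line);

	switch (*version) {
	case protocol_v1:
		return 1;
	case protocol_v0:
		return 0;
	default:
		return -1;
	}
}
```

The caller would keep the enum in a local variable (or a global next
to server_capabilities) and pick the code path from it once a v2
exists.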


> 
> >  static void process_capabilities(int *len)
> >  {
> > @@ -224,12 +239,19 @@ struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
> >  	 */
> >  	int responded = 0;
> >  	int len;
> > -	int state = EXPECTING_FIRST_REF;
> > +	int state = EXPECTING_PROTOCOL_VERSION;
> >  
> >  	*list = NULL;
> >  
> >  	while ((len = read_remote_ref(in, &src_buf, &src_len, &responded))) {
> >  		switch (state) {
> > +		case EXPECTING_PROTOCOL_VERSION:
> > +			if (process_protocol_version()) {
> > +				state = EXPECTING_FIRST_REF;
> > +				break;
> > +			}
> > +			state = EXPECTING_FIRST_REF;
> > +			/* fallthrough */
> >  		case EXPECTING_FIRST_REF:
> >  			process_capabilities(&len);
> >  			if (process_dummy_ref()) {

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 161+ messages in thread

* Re: [PATCH v2 7/9] connect: tell server that the client understands v1
  2017-09-27  6:21     ` Junio C Hamano
  2017-09-27  6:29       ` Junio C Hamano
@ 2017-09-28 22:20       ` Brandon Williams
  1 sibling, 0 replies; 161+ messages in thread
From: Brandon Williams @ 2017-09-28 22:20 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, bturner, git, jonathantanmy, jrnieder, peff, sbeller

On 09/27, Junio C Hamano wrote:
> Brandon Williams <bmwill@google.com> writes:
> 
> > Teach the connection logic to tell a server that it understands protocol
> > v1.  This is done in 2 different ways for the built in protocols.
> >
> > 1. git://
> >    A normal request is structured as "command path/to/repo\0host=..\0"
> >    and due to a bug in an old version of git-daemon 73bb33a94 (daemon:
> >    Strictly parse the "extra arg" part of the command, 2009-06-04) we
> >    aren't able to place any extra args (separated by NULs) besides the
> >    host.  In order to get around this limitation put protocol version
> >    information after a second NUL byte so the request is structured
> >    like: "command path/to/repo\0host=..\0\0version=1\0".  git-daemon can
> >    then parse out the version number and set GIT_PROTOCOL.
> 
> Same question as a previous step, wrt the cited commit.  It reads as
> if we are saying that the commit introduced a bug and left it there,
> that we cannot use \0host=..\0version=..\0other=..\0 until that bug
> is fixed, and that in the meantime we use \0host=..\0\0version=.. as
> a workaround, but that reading leaves readers wondering if we want
> to eventually drop this double-NUL workaround.  I am guessing that
> we want to declare that the current protocol has a glitch that
> prevents us to use \0host=..\0version=..\0 but we accept that and
> plan to keep it that way, and we'll use the double-NUL for anything
> other than host from now on, as it is compatible with the current
> version of Git before this patch (the extras are safely ignored),
> but then it still leaves readers wonder if the mention of the
> old commit from 2009 means that this double-NUL would not even work
> if the other end is running a version of Git before that commit, or
> we are safe to talk with versions of Git even older than that.
> 
> I do not think it is a showstopper if we did not work with v1.6.4,
> but it still needs to be clarified.

I wrote an updated commit msg for the daemon change, I can make a
similar change here.  And this mechanism shouldn't cause any issues with
either the pre or post 73bb33a94 git-daemon servers.

> 
> > 2. ssh://, file://
> >    Set GIT_PROTOCOL envvar with the desired protocol version.  The
> >    envvar can be sent across ssh by using '-o SendEnv=GIT_PROTOCOL' and
> >    having the server whitelist this envvar.
> 
> OpenSSH lets us do this, but I do not know how well this works with
> other implementations of SSH clients.  The log message perhaps needs
> to ask for volunteers to check if it is OK with the implementations
> they use, and offer conditional code (just like we have for putty
> and plink customizations) otherwise.

I'll add a comment indicating that.

> 
> Other than that, the code changes looked good.
> 
> > diff --git a/t/t5700-protocol-v1.sh b/t/t5700-protocol-v1.sh
> > new file mode 100755
> > index 000000000..1988bbce6
> > --- /dev/null
> > +++ b/t/t5700-protocol-v1.sh
> > @@ -0,0 +1,223 @@
> > +#!/bin/sh
> > +
> > +test_description='test git wire-protocol transition'
> > +
> > +TEST_NO_CREATE_REPO=1
> > +
> > +. ./test-lib.sh
> > +
> > +# Test protocol v1 with 'git://' transport
> > +#
> > +. "$TEST_DIRECTORY"/lib-git-daemon.sh
> > +start_git_daemon --export-all --enable=receive-pack
> > +daemon_parent=$GIT_DAEMON_DOCUMENT_ROOT_PATH/parent
> > +
> > +test_expect_success 'create repo to be served by git-daemon' '
> > +	git init "$daemon_parent" &&
> > +	test_commit -C "$daemon_parent" one
> > +'
> > +
> > +test_expect_success 'clone with git:// using protocol v1' '
> > +	GIT_TRACE_PACKET=1 git -c protocol.version=1 \
> > +		clone "$GIT_DAEMON_URL/parent" daemon_child 2>log &&
> > +
> > +	git -C daemon_child log -1 --format=%s >actual &&
> > +	git -C "$daemon_parent" log -1 --format=%s >expect &&
> > +	test_cmp expect actual &&
> > +
> > +	# Client requested to use protocol v1
> > +	grep "version=1" log &&
> > +	# Server responded using protocol v1
> > +	grep "clone< version 1" log
> 
> This looked a bit strange to check "clone< version 1" for one
> direction, but did not check "$something> version 1" for the other
> direction.  Doesn't "version=1" end up producing 2 hits?

I think you discovered this in your next email but the "version=1" check
is to check for the request sent to git-daemon, the "command
path/to/repo\0host=blah\0\0version=1\0" one. While the "clone< version
1" check is to make sure that the server responded with the correct
version.

> 
> Not a complaint, but wondering if we can write it in such a way that
> does not have to make readers wonder.
> 
> > +'
> > +
> > +test_expect_success 'fetch with git:// using protocol v1' '
> > +	test_commit -C "$daemon_parent" two &&
> > +
> > +	GIT_TRACE_PACKET=1 git -C daemon_child -c protocol.version=1 \
> > +		fetch 2>log &&
> > +
> > +	git -C daemon_child log -1 --format=%s FETCH_HEAD >actual &&
> > +	git -C "$daemon_parent" log -1 --format=%s >expect &&
> > +	test_cmp expect actual &&
> 
> OK.  So the origin repository gained one commit on the 'master'
> branch (and a tag 'two').  By fetching, but not pulling, our
> 'master' would not advance, and that is where check on FETCH_HEAD
> comes from.  I suspect that the tag 'two' is also auto-followed with
> this operation and would be in FETCH_HEAD; is that something we want
> to check?  Alternatively, the "actual" log may want to see what the
> remote tracking branch for their 'master' has---then we do not have
> to worry about "FETCH_HEAD has two refs---which one are we checking?"

Yeah I can do that instead if you would prefer.

> 
> > +
> > +	# Client requested to use protocol v1
> > +	grep "version=1" log &&
> > +	# Server responded using protocol v1
> > +	grep "fetch< version 1" log
> > +'
> 
> Same "version=1" vs "fetch< version 1" strangeness appears here.
> 
> > +test_expect_success 'pull with git:// using protocol v1' '
> > +	GIT_TRACE_PACKET=1 git -C daemon_child -c protocol.version=1 \
> > +		pull 2>log &&
> > +
> > +	git -C daemon_child log -1 --format=%s >actual &&
> > +	git -C "$daemon_parent" log -1 --format=%s >expect &&
> > +	test_cmp expect actual &&
> 
> Here we can check our 'master', as we pulled their 'master' into it.
> What is this testing, though?  The fact that protocol.version=1
> given via "git -c var=val" mechanism is propagated to the underlying
> fetch?

Yeah, I guess we could realistically drop either the fetch or pull test
as they essentially do the same thing.  I was just being overly
cautious.

> 
> > +	# Client requested to use protocol v1
> > +	grep "version=1" log &&
> > +	# Server responded using protocol v1
> > +	grep "fetch< version 1" log
> > +'
> > +
> > +test_expect_success 'push with git:// using protocol v1' '
> > +	test_commit -C daemon_child three &&
> > +
> > +	# Since the repository being served isnt bare we need to push to
> > +	# another branch explicitly to avoid mangling the master branch
> 
> The other end avoids mangling the master just fine without us doing
> anything special ;-).  You are pushing to another branch because you
> cannot push into a branch that is currently checked out.
> 
> 	# Push to another branch, as the target repository has the
> 	# master branch checked out and we cannot push into it.

Sounds good I'll change that.

> 
> perhaps?
> 
> The tests for file:// looked identical, so the same set of comments
> apply.
> 
> > +# Test protocol v1 with 'ssh://' transport
> > +#
> > +test_expect_success 'setup ssh wrapper' '
> > +	GIT_SSH="$GIT_BUILD_DIR/t/helper/test-fake-ssh$X" &&
> > +	export GIT_SSH &&
> > +	export TRASH_DIRECTORY &&
> > +	>"$TRASH_DIRECTORY"/ssh-output
> > +'
> > +
> > +expect_ssh () {
> > +	test_when_finished '(cd "$TRASH_DIRECTORY" && rm -f ssh-expect && >ssh-output)' &&
> > +	echo "ssh: -o SendEnv=GIT_PROTOCOL myhost $1 '$PWD/ssh_parent'" >"$TRASH_DIRECTORY/ssh-expect" &&
> > +	(cd "$TRASH_DIRECTORY" && test_cmp ssh-expect ssh-output)
> > +}
> > +
> > +test_expect_success 'create repo to be served by ssh:// transport' '
> > +	git init ssh_parent &&
> > +	test_commit -C ssh_parent one
> > +'
> > +
> > +test_expect_success 'clone with ssh:// using protocol v1' '
> > +	GIT_TRACE_PACKET=1 git -c protocol.version=1 \
> > +		clone "ssh://myhost:$(pwd)/ssh_parent" ssh_child 2>log &&
> 
> Hmm, this is a fun one, as we deliberately make $(pwd) to have
> whitespace in the test setup.  I am impressed/kinda surprised that
> this works ;-)
> 
> Other than that, these also look more or less identical to file://
> and git:// tests, so the same set of comments apply.
> 
> Overall very nicely done.

Thanks! :D

> 
> Thanks.

-- 
Brandon Williams


* Re: [PATCH v2 3/9] protocol: introduce protocol extention mechanisms
  2017-09-27 11:23       ` Junio C Hamano
@ 2017-09-29 21:20         ` Brandon Williams
  0 siblings, 0 replies; 161+ messages in thread
From: Brandon Williams @ 2017-09-29 21:20 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, bturner, git, jonathantanmy, jrnieder, peff, sbeller

On 09/27, Junio C Hamano wrote:
> Junio C Hamano <gitster@pobox.com> writes:
> 
> >> +enum protocol_version determine_protocol_version_server(void)
> >> +{
> >> +	const char *git_protocol = getenv(GIT_PROTOCOL_ENVIRONMENT);
> >> +	enum protocol_version version = protocol_v0;
> >> +
> >> +	if (git_protocol) {
> >> +		struct string_list list = STRING_LIST_INIT_DUP;
> >> +		const struct string_list_item *item;
> >> +		string_list_split(&list, git_protocol, ':', -1);
> >> +
> >> +		for_each_string_list_item(item, &list) {
> >> +			const char *value;
> >> +			enum protocol_version v;
> >> +
> >> +			if (skip_prefix(item->string, "version=", &value)) {
> >> +				v = parse_protocol_version(value);
> >> +				if (v > version)
> >> +					version = v;
> >> +			}
> >> +		}
> >> +
> >> +		string_list_clear(&list, 0);
> >> +	}
> >> +
> >> +	return version;
> >> +}
> >
> > This implements "the largest one wins", not "the last one wins".  Is
> > there a particular reason why the former is chosen?
> 
> Let me give my version of why the usual "the last one wins" would
> not necessarily be a good idea.  I would imagine that a client
> contacting the server may want to say "I understand v3, v2 (but not
> v1 nor v0)" and in order to influence the server's choice between
> the available two, it may want to somehow say it prefers v3 over v2
> (or v2 over v3).  
> 
> One way to implement such a behaviour would be "the first one that
> is understood is used", i.e. something along this line:
> 
>         enum protocol_version version = protocol_unknown;
> 
> 	for_each_string_list_item(item, &list) {
> 		const char *value;
> 		enum protocol_version v;
> 		if (skip_prefix(item->string, "version=", &value)) {
>                 	if (version == protocol_unknown) {
>                         	v = parse_protocol_version(value);
> 			        if (v != protocol_unknown)
> 					version = v;
> 			}
> 		}
> 	}
> 
> 	if (version == protocol_unknown)
> 		version = protocol_v0;
> 
> and not "the largest one wins" nor "the last one wins".
> 
> I am not saying your code or the choice of "the largest one wins" is
> necessarily wrong.  I am just illustrating the way to explain
> "because I want to support a usecase like _this_, I define the way
> in which multiple values to the version variable are parsed like so,
> hence this code".  IOW, I think this commit should mention how the
> "largest one wins" rule would be useful to the clients and the
> servers when they want to achieve X---and that X is left unexplained.

I believe I mentioned this elsewhere, but I think this logic will
probably have to be tweaked again at some point so that a server is
able to prefer one version over another.

That being said I can definitely add a comment indicating how this code
selects the version and that it can be used to ensure that the latest
and greatest protocol version is used.
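
To make the difference between the two rules concrete, here is a
standalone sketch that runs both over the same colon-separated list.  It
is not the code in protocol.c: select_version() and parse_version() are
hypothetical names, strcspn() stands in for string_list_split(), and the
sketch assumes only v0 and v1 are understood.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum protocol_version {
	protocol_unknown = -1,
	protocol_v0 = 0,
	protocol_v1 = 1,
};

/* Hypothetical stand-in for parse_protocol_version(). */
static enum protocol_version parse_version(const char *value, size_t len)
{
	if (len == 1 && value[0] == '0')
		return protocol_v0;
	if (len == 1 && value[0] == '1')
		return protocol_v1;
	return protocol_unknown;
}

/*
 * Walk a colon-separated "key[=value]" list and pick a version.
 * With first_wins == 0 the greatest understood version is chosen;
 * otherwise the first understood one is.  Unknown keys and values
 * are ignored, and v0 is the fallback.  The caller is assumed to
 * have checked that the variable is set at all.
 */
static enum protocol_version select_version(const char *git_protocol,
					    int first_wins)
{
	enum protocol_version version = protocol_unknown;
	const char *p = git_protocol;

	assert(git_protocol != NULL);

	while (*p) {
		size_t len = strcspn(p, ":");

		/* "version=" is 8 bytes; require a non-empty value. */
		if (len > 8 && !strncmp(p, "version=", 8)) {
			enum protocol_version v =
				parse_version(p + 8, len - 8);

			if (v != protocol_unknown) {
				if (first_wins) {
					if (version == protocol_unknown)
						version = v;
				} else if (v > version) {
					version = v;
				}
			}
		}
		p += len;
		if (*p == ':')
			p++;
	}

	return version == protocol_unknown ? protocol_v0 : version;
}
```

Given "version=0:frotz:version=7:version=1", calling select_version()
with first_wins == 0 picks v1, while first_wins == 1 picks v0; the
unknown key and the unknown version are skipped in both cases.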

-- 
Brandon Williams


* Re: [PATCH v2 7/9] connect: tell server that the client understands v1
  2017-09-27  6:29       ` Junio C Hamano
@ 2017-09-29 21:32         ` Brandon Williams
  0 siblings, 0 replies; 161+ messages in thread
From: Brandon Williams @ 2017-09-29 21:32 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, bturner, git, jonathantanmy, jrnieder, peff, sbeller

On 09/27, Junio C Hamano wrote:
> Junio C Hamano <gitster@pobox.com> writes:
> 
> >> +	# Client requested to use protocol v1
> >> +	grep "version=1" log &&
> >> +	# Server responded using protocol v1
> >> +	grep "clone< version 1" log
> >
> > This looked a bit strange to check "clone< version 1" for one
> > direction, but did not check "$something> version 1" for the other
> > direction.  Doesn't "version=1" end up producing 2 hits?
> >
> > Not a complaint, but wondering if we can write it in such a way that
> > does not have to make readers wonder.
> 
> Ah, the check for "version=1" is a short-hand for
> 
> 	grep "clone> git-upload-pack ...\\0\\0version=1\\0$" log
> 
> and the symmetry I sought is already there.  So ignore the above; if
> we wanted to make the symmetry more explicit, it would not hurt to
> spell the first one as
> 
> 	grep "clone> .*\\0\\0version=1\\0$" log

I think you need three '\' to get an escaped backslash, but I agree,
I'll spell this out more explicitly in the tests.

> 
> though.
> 

-- 
Brandon Williams


* [PATCH v3 00/10] protocol transition
  2017-09-26 23:56 ` [PATCH v2 0/9] protocol transition Brandon Williams
                     ` (8 preceding siblings ...)
  2017-09-26 23:56   ` [PATCH v2 9/9] i5700: add interop test for protocol transition Brandon Williams
@ 2017-10-03 20:14   ` Brandon Williams
  2017-10-03 20:14     ` [PATCH v3 01/10] connect: in ref advertisement, shallows are last Brandon Williams
                       ` (12 more replies)
  9 siblings, 13 replies; 161+ messages in thread
From: Brandon Williams @ 2017-10-03 20:14 UTC (permalink / raw)
  To: git
  Cc: bturner, git, gitster, jonathantanmy, jrnieder, peff, sbeller,
	Brandon Williams

Changes in v3:
 * added a new ssh variant 'simple' and updated documentation to better
   reflect the command-line parameters passed to the ssh command.
 * updated various commit messages based on feedback.
 * tightened the wording for 'GIT_PROTOCOL' to indicate that both unknown keys
   and values must be ignored.
 * added API comments for functions in protocol.h
 * updated various tests in t5700 based on reviewer feedback
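
As a rough illustration of the first item, the per-variant argument
selection can be sketched as below.  This is a simplified standalone
model and not the code in connect.c: build_ssh_args() is a hypothetical
helper, the program name is hard-coded to "ssh", the -4/-6 handling is
left out, and a plain array stands in for argv_array.

```c
#include <assert.h>
#include <stddef.h>

enum ssh_variant {
	VARIANT_SIMPLE,
	VARIANT_SSH,
	VARIANT_PLINK,
	VARIANT_PUTTY,
	VARIANT_TORTOISEPLINK,
};

/*
 * Hypothetical helper: fill argv (which must hold at least 8
 * entries) with the arguments that precede the remote command,
 * and return how many were written.  The result is
 * NULL-terminated.
 */
static size_t build_ssh_args(enum ssh_variant variant, const char *host,
			     const char *port, int want_v1,
			     const char **argv, size_t cap)
{
	size_t n = 0;

	assert(host && argv && cap >= 8);

	argv[n++] = "ssh";

	/* Only OpenSSH is known to understand "-o SendEnv=...". */
	if (variant == VARIANT_SSH && want_v1) {
		argv[n++] = "-o";
		argv[n++] = "SendEnv=GIT_PROTOCOL";
	}

	if (variant == VARIANT_TORTOISEPLINK)
		argv[n++] = "-batch";

	/* The 'simple' variant gets no port option at all. */
	if (port && variant != VARIANT_SIMPLE) {
		argv[n++] = variant == VARIANT_SSH ? "-p" : "-P";
		argv[n++] = port;
	}

	argv[n++] = host;
	argv[n] = NULL;
	return n;
}
```

With VARIANT_SIMPLE and a port of "123" this yields just "ssh myhost",
which is what the updated 'uplink is treated as simple' test in t5601
expects; with VARIANT_SSH the same call yields "ssh -p 123 myhost".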

Brandon Williams (9):
  pkt-line: add packet_write function
  protocol: introduce protocol extention mechanisms
  daemon: recognize hidden request arguments
  upload-pack, receive-pack: introduce protocol version 1
  connect: teach client to recognize v1 server response
  connect: tell server that the client understands v1
  http: tell server that the client understands v1
  i5700: add interop test for protocol transition
  ssh: introduce a 'simple' ssh variant

Jonathan Tan (1):
  connect: in ref advertisement, shallows are last

 Documentation/config.txt               |  44 +++-
 Documentation/git.txt                  |  15 +-
 Makefile                               |   1 +
 builtin/receive-pack.c                 |  15 ++
 cache.h                                |  10 +
 connect.c                              | 353 ++++++++++++++++++++++-----------
 daemon.c                               |  68 ++++++-
 http.c                                 |  18 ++
 pkt-line.c                             |   6 +
 pkt-line.h                             |   1 +
 protocol.c                             |  79 ++++++++
 protocol.h                             |  33 +++
 t/interop/i5700-protocol-transition.sh |  68 +++++++
 t/lib-httpd/apache.conf                |   7 +
 t/t5601-clone.sh                       |   9 +-
 t/t5700-protocol-v1.sh                 | 294 +++++++++++++++++++++++++++
 upload-pack.c                          |  18 +-
 17 files changed, 900 insertions(+), 139 deletions(-)
 create mode 100644 protocol.c
 create mode 100644 protocol.h
 create mode 100755 t/interop/i5700-protocol-transition.sh
 create mode 100755 t/t5700-protocol-v1.sh

--- interdiff with 'origin/bw/protocol-v1'

diff --git a/Documentation/config.txt b/Documentation/config.txt
index b78747abc..0460af37e 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2084,12 +2084,31 @@ ssh.variant::
 	Depending on the value of the environment variables `GIT_SSH` or
 	`GIT_SSH_COMMAND`, or the config setting `core.sshCommand`, Git
 	auto-detects whether to adjust its command-line parameters for use
-	with plink or tortoiseplink, as opposed to the default (OpenSSH).
+	with ssh (OpenSSH), plink or tortoiseplink, as opposed to the default
+	(simple).
 +
 The config variable `ssh.variant` can be set to override this auto-detection;
-valid values are `ssh`, `plink`, `putty` or `tortoiseplink`. Any other value
-will be treated as normal ssh. This setting can be overridden via the
-environment variable `GIT_SSH_VARIANT`.
+valid values are `ssh`, `simple`, `plink`, `putty` or `tortoiseplink`. Any
+other value will be treated as normal ssh. This setting can be overridden via
+the environment variable `GIT_SSH_VARIANT`.
++
+The current command-line parameters used for each variant are as
+follows:
++
+--
+
+* `ssh` - [-p port] [-4] [-6] [-o option] [username@]host command
+
+* `simple` - [username@]host command
+
+* `plink` or `putty` - [-P port] [-4] [-6] [username@]host command
+
+* `tortoiseplink` - [-P port] [-4] [-6] -batch [username@]host command
+
+--
++
+Except for the `simple` variant, command-line parameters are likely to
+change as git gains new features.
 
 i18n.commitEncoding::
 	Character encoding the commit messages are stored in; Git itself
diff --git a/Documentation/git.txt b/Documentation/git.txt
index 299f75c7b..8bc3f2147 100644
--- a/Documentation/git.txt
+++ b/Documentation/git.txt
@@ -518,11 +518,10 @@ other
 	If either of these environment variables is set then 'git fetch'
 	and 'git push' will use the specified command instead of 'ssh'
 	when they need to connect to a remote system.
-	The command will be given exactly two or four arguments: the
-	'username@host' (or just 'host') from the URL and the shell
-	command to execute on that remote system, optionally preceded by
-	`-p` (literally) and the 'port' from the URL when it specifies
-	something other than the default SSH port.
+	The command-line parameters passed to the configured command are
+	determined by the ssh variant.  See `ssh.variant` option in
+	linkgit:git-config[1] for details.
+
 +
 `$GIT_SSH_COMMAND` takes precedence over `$GIT_SSH`, and is interpreted
 by the shell, which allows additional arguments to be included.
@@ -700,7 +699,8 @@ of clones and fetches.
 `GIT_PROTOCOL`::
 	For internal use only.  Used in handshaking the wire protocol.
 	Contains a colon ':' separated list of keys with optional values
-	'key[=value]'.  Presence of unknown keys must be tolerated.
+	'key[=value]'.  Presence of unknown keys and values must be
+	ignored.
 
 Discussion[[Discussion]]
 ------------------------
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index cb179367b..94b7d29ea 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -1973,8 +1973,9 @@ int cmd_receive_pack(int argc, const char **argv, const char *prefix)
 		 * so just fall through after writing the version string.
 		 */
 	case protocol_v0:
-	default:
 		break;
+	default:
+		BUG("unknown protocol version");
 	}
 
 	if (advertise_refs || !stateless_rpc) {
diff --git a/cache.h b/cache.h
index aaadac1f0..3a6b869c2 100644
--- a/cache.h
+++ b/cache.h
@@ -448,7 +448,8 @@ static inline enum object_type object_type(unsigned int mode)
 /*
  * Environment variable used in handshaking the wire protocol.
  * Contains a colon ':' separated list of keys with optional values
- * 'key[=value]'.  Presence of unknown keys must be tolerated.
+ * 'key[=value]'.  Presence of unknown keys and values must be
+ * ignored.
  */
 #define GIT_PROTOCOL_ENVIRONMENT "GIT_PROTOCOL"
 /* HTTP header used to handshake the wire protocol */
diff --git a/connect.c b/connect.c
index 12ebab724..65cee49b6 100644
--- a/connect.c
+++ b/connect.c
@@ -139,12 +139,12 @@ static int read_remote_ref(int in, char **src_buf, size_t *src_len,
 static int process_protocol_version(void)
 {
 	switch (determine_protocol_version_client(packet_buffer)) {
-		case protocol_v1:
-			return 1;
-		case protocol_v0:
-			return 0;
-		default:
-			die("server is speaking an unknown protocol");
+	case protocol_v1:
+		return 1;
+	case protocol_v0:
+		return 0;
+	default:
+		die("server is speaking an unknown protocol");
 	}
 }
 
@@ -776,37 +776,44 @@ static const char *get_ssh_command(void)
 	return NULL;
 }
 
-static int override_ssh_variant(int *port_option, int *needs_batch)
+enum ssh_variant {
+	VARIANT_SIMPLE,
+	VARIANT_SSH,
+	VARIANT_PLINK,
+	VARIANT_PUTTY,
+	VARIANT_TORTOISEPLINK,
+};
+
+static int override_ssh_variant(enum ssh_variant *ssh_variant)
 {
-	char *variant;
+	const char *variant = getenv("GIT_SSH_VARIANT");
 
-	variant = xstrdup_or_null(getenv("GIT_SSH_VARIANT"));
-	if (!variant &&
-	    git_config_get_string("ssh.variant", &variant))
+	if (!variant && git_config_get_string_const("ssh.variant", &variant))
 		return 0;
 
-	if (!strcmp(variant, "plink") || !strcmp(variant, "putty")) {
-		*port_option = 'P';
-		*needs_batch = 0;
-	} else if (!strcmp(variant, "tortoiseplink")) {
-		*port_option = 'P';
-		*needs_batch = 1;
-	} else {
-		*port_option = 'p';
-		*needs_batch = 0;
-	}
-	free(variant);
+	if (!strcmp(variant, "plink"))
+		*ssh_variant = VARIANT_PLINK;
+	else if (!strcmp(variant, "putty"))
+		*ssh_variant = VARIANT_PUTTY;
+	else if (!strcmp(variant, "tortoiseplink"))
+		*ssh_variant = VARIANT_TORTOISEPLINK;
+	else if (!strcmp(variant, "simple"))
+		*ssh_variant = VARIANT_SIMPLE;
+	else
+		*ssh_variant = VARIANT_SSH;
+
 	return 1;
 }
 
-static void handle_ssh_variant(const char *ssh_command, int is_cmdline,
-			       int *port_option, int *needs_batch)
+static enum ssh_variant determine_ssh_variant(const char *ssh_command,
+					      int is_cmdline)
 {
+	enum ssh_variant ssh_variant = VARIANT_SIMPLE;
 	const char *variant;
 	char *p = NULL;
 
-	if (override_ssh_variant(port_option, needs_batch))
-		return;
+	if (override_ssh_variant(&ssh_variant))
+		return ssh_variant;
 
 	if (!is_cmdline) {
 		p = xstrdup(ssh_command);
@@ -825,19 +832,21 @@ static void handle_ssh_variant(const char *ssh_command, int is_cmdline,
 			free(ssh_argv);
 		} else {
 			free(p);
-			return;
+			return ssh_variant;
 		}
 	}
 
-	if (!strcasecmp(variant, "plink") ||
-	    !strcasecmp(variant, "plink.exe"))
-		*port_option = 'P';
+	if (!strcasecmp(variant, "ssh"))
+		ssh_variant = VARIANT_SSH;
+	else if (!strcasecmp(variant, "plink") ||
+		 !strcasecmp(variant, "plink.exe"))
+		ssh_variant = VARIANT_PLINK;
 	else if (!strcasecmp(variant, "tortoiseplink") ||
-		 !strcasecmp(variant, "tortoiseplink.exe")) {
-		*port_option = 'P';
-		*needs_batch = 1;
-	}
+		 !strcasecmp(variant, "tortoiseplink.exe"))
+		ssh_variant = VARIANT_TORTOISEPLINK;
+
 	free(p);
+	return ssh_variant;
 }
 
 /*
@@ -937,8 +946,7 @@ struct child_process *git_connect(int fd[2], const char *url,
 		conn->in = conn->out = -1;
 		if (protocol == PROTO_SSH) {
 			const char *ssh;
-			int needs_batch = 0;
-			int port_option = 'p';
+			enum ssh_variant variant;
 			char *ssh_host = hostandport;
 			const char *port = NULL;
 			transport_check_allowed("ssh");
@@ -965,10 +973,9 @@ struct child_process *git_connect(int fd[2], const char *url,
 				die("strange hostname '%s' blocked", ssh_host);
 
 			ssh = get_ssh_command();
-			if (ssh)
-				handle_ssh_variant(ssh, 1, &port_option,
-						   &needs_batch);
-			else {
+			if (ssh) {
+				variant = determine_ssh_variant(ssh, 1);
+			} else {
 				/*
 				 * GIT_SSH is the no-shell version of
 				 * GIT_SSH_COMMAND (and must remain so for
@@ -979,32 +986,38 @@ struct child_process *git_connect(int fd[2], const char *url,
 				ssh = getenv("GIT_SSH");
 				if (!ssh)
 					ssh = "ssh";
-				else
-					handle_ssh_variant(ssh, 0,
-							   &port_option,
-							   &needs_batch);
+				variant = determine_ssh_variant(ssh, 0);
 			}
 
 			argv_array_push(&conn->args, ssh);
 
-			if (get_protocol_version_config() > 0) {
+			if (variant == VARIANT_SSH &&
+			    get_protocol_version_config() > 0) {
 				argv_array_push(&conn->args, "-o");
 				argv_array_push(&conn->args, "SendEnv=" GIT_PROTOCOL_ENVIRONMENT);
 				argv_array_pushf(&conn->env_array, GIT_PROTOCOL_ENVIRONMENT "=version=%d",
 						 get_protocol_version_config());
 			}
 
-			if (flags & CONNECT_IPV4)
-				argv_array_push(&conn->args, "-4");
-			else if (flags & CONNECT_IPV6)
-				argv_array_push(&conn->args, "-6");
-			if (needs_batch)
+			if (variant != VARIANT_SIMPLE) {
+				if (flags & CONNECT_IPV4)
+					argv_array_push(&conn->args, "-4");
+				else if (flags & CONNECT_IPV6)
+					argv_array_push(&conn->args, "-6");
+			}
+
+			if (variant == VARIANT_TORTOISEPLINK)
 				argv_array_push(&conn->args, "-batch");
-			if (port) {
-				argv_array_pushf(&conn->args,
-						 "-%c", port_option);
+
+			if (port && variant != VARIANT_SIMPLE) {
+				if (variant == VARIANT_SSH)
+					argv_array_push(&conn->args, "-p");
+				else
+					argv_array_push(&conn->args, "-P");
+
 				argv_array_push(&conn->args, port);
 			}
+
 			argv_array_push(&conn->args, ssh_host);
 		} else {
 			transport_check_allowed("file");
diff --git a/protocol.c b/protocol.c
index 369503065..43012b7eb 100644
--- a/protocol.c
+++ b/protocol.c
@@ -33,6 +33,13 @@ enum protocol_version determine_protocol_version_server(void)
 	const char *git_protocol = getenv(GIT_PROTOCOL_ENVIRONMENT);
 	enum protocol_version version = protocol_v0;
 
+	/*
+	 * Determine which protocol version the client has requested.  Since
+	 * multiple 'version' keys can be sent by the client, indicating that
+	 * the client is okay to speak any of them, select the greatest version
+	 * that the client has requested.  This is due to the assumption that
+	 * the most recent protocol version will be the most state-of-the-art.
+	 */
 	if (git_protocol) {
 		struct string_list list = STRING_LIST_INIT_DUP;
 		const struct string_list_item *item;
diff --git a/protocol.h b/protocol.h
index 18f9a5235..1b2bc94a8 100644
--- a/protocol.h
+++ b/protocol.h
@@ -7,8 +7,27 @@ enum protocol_version {
 	protocol_v1 = 1,
 };
 
+/*
+ * Used by a client to determine which protocol version to request be used when
+ * communicating with a server, reflecting the configured value of the
+ * 'protocol.version' config.  If unconfigured, a value of 'protocol_v0' is
+ * returned.
+ */
 extern enum protocol_version get_protocol_version_config(void);
+
+/*
+ * Used by a server to determine which protocol version should be used based on
+ * a client's request, communicated via the 'GIT_PROTOCOL' environment variable
+ * by setting appropriate values for the key 'version'.  If a client doesn't
+ * request a particular protocol version, a default of 'protocol_v0' will be
+ * used.
+ */
 extern enum protocol_version determine_protocol_version_server(void);
+
+/*
+ * Used by a client to determine which protocol version the server is speaking
+ * based on the server's initial response.
+ */
 extern enum protocol_version determine_protocol_version_client(const char *server_response);
 
 #endif /* PROTOCOL_H */
diff --git a/t/interop/i5700-protocol-transition.sh b/t/interop/i5700-protocol-transition.sh
index 9e83428a8..97e8e580e 100755
--- a/t/interop/i5700-protocol-transition.sh
+++ b/t/interop/i5700-protocol-transition.sh
@@ -3,7 +3,7 @@
 VERSION_A=.
 VERSION_B=v2.0.0
 
-: ${LIB_GIT_DAEMON_PORT:=5600}
+: ${LIB_GIT_DAEMON_PORT:=5700}
 LIB_GIT_DAEMON_COMMAND='git.b daemon'
 
 test_description='clone and fetch by client who is trying to use a new protocol'
diff --git a/t/t5601-clone.sh b/t/t5601-clone.sh
index 9c56f771b..ee1a24c5b 100755
--- a/t/t5601-clone.sh
+++ b/t/t5601-clone.sh
@@ -312,6 +312,8 @@ setup_ssh_wrapper () {
 			"$TRASH_DIRECTORY/ssh-wrapper$X" &&
 		GIT_SSH="$TRASH_DIRECTORY/ssh-wrapper$X" &&
 		export GIT_SSH &&
+		GIT_SSH_VARIANT=ssh &&
+		export GIT_SSH_VARIANT &&
 		export TRASH_DIRECTORY &&
 		>"$TRASH_DIRECTORY"/ssh-output
 	'
@@ -320,7 +322,8 @@ setup_ssh_wrapper () {
 copy_ssh_wrapper_as () {
 	cp "$TRASH_DIRECTORY/ssh-wrapper$X" "${1%$X}$X" &&
 	GIT_SSH="${1%$X}$X" &&
-	export GIT_SSH
+	export GIT_SSH &&
+	unset GIT_SSH_VARIANT
 }
 
 expect_ssh () {
@@ -362,10 +365,10 @@ test_expect_success 'bracketed hostnames are still ssh' '
 	expect_ssh "-p 123" myhost src
 '
 
-test_expect_success 'uplink is not treated as putty' '
+test_expect_success 'uplink is treated as simple' '
 	copy_ssh_wrapper_as "$TRASH_DIRECTORY/uplink" &&
 	git clone "[myhost:123]:src" ssh-bracket-clone-uplink &&
-	expect_ssh "-p 123" myhost src
+	expect_ssh myhost src
 '
 
 test_expect_success 'plink is treated specially (as putty)' '
diff --git a/t/t5700-protocol-v1.sh b/t/t5700-protocol-v1.sh
index 222265127..ba86a44eb 100755
--- a/t/t5700-protocol-v1.sh
+++ b/t/t5700-protocol-v1.sh
@@ -26,7 +26,7 @@ test_expect_success 'clone with git:// using protocol v1' '
 	test_cmp expect actual &&
 
 	# Client requested to use protocol v1
-	grep "version=1" log &&
+	grep "clone> .*\\\0\\\0version=1\\\0$" log &&
 	# Server responded using protocol v1
 	grep "clone< version 1" log
 '
@@ -37,12 +37,12 @@ test_expect_success 'fetch with git:// using protocol v1' '
 	GIT_TRACE_PACKET=1 git -C daemon_child -c protocol.version=1 \
 		fetch 2>log &&
 
-	git -C daemon_child log -1 --format=%s FETCH_HEAD >actual &&
+	git -C daemon_child log -1 --format=%s origin/master >actual &&
 	git -C "$daemon_parent" log -1 --format=%s >expect &&
 	test_cmp expect actual &&
 
 	# Client requested to use protocol v1
-	grep "version=1" log &&
+	grep "fetch> .*\\\0\\\0version=1\\\0$" log &&
 	# Server responded using protocol v1
 	grep "fetch< version 1" log
 '
@@ -56,7 +56,7 @@ test_expect_success 'pull with git:// using protocol v1' '
 	test_cmp expect actual &&
 
 	# Client requested to use protocol v1
-	grep "version=1" log &&
+	grep "fetch> .*\\\0\\\0version=1\\\0$" log &&
 	# Server responded using protocol v1
 	grep "fetch< version 1" log
 '
@@ -64,8 +64,8 @@ test_expect_success 'pull with git:// using protocol v1' '
 test_expect_success 'push with git:// using protocol v1' '
 	test_commit -C daemon_child three &&
 
-	# Since the repository being served isnt bare we need to push to
-	# another branch explicitly to avoid mangling the master branch
+	# Push to another branch, as the target repository has the
+	# master branch checked out and we cannot push into it.
 	GIT_TRACE_PACKET=1 git -C daemon_child -c protocol.version=1 \
 		push origin HEAD:client_branch 2>log &&
 
@@ -74,7 +74,7 @@ test_expect_success 'push with git:// using protocol v1' '
 	test_cmp expect actual &&
 
 	# Client requested to use protocol v1
-	grep "version=1" log &&
+	grep "push> .*\\\0\\\0version=1\\\0$" log &&
 	# Server responded using protocol v1
 	grep "push< version 1" log
 '
@@ -106,7 +106,7 @@ test_expect_success 'fetch with file:// using protocol v1' '
 	GIT_TRACE_PACKET=1 git -C file_child -c protocol.version=1 \
 		fetch 2>log &&
 
-	git -C file_child log -1 --format=%s FETCH_HEAD >actual &&
+	git -C file_child log -1 --format=%s origin/master >actual &&
 	git -C file_parent log -1 --format=%s >expect &&
 	test_cmp expect actual &&
 
@@ -129,8 +129,8 @@ test_expect_success 'pull with file:// using protocol v1' '
 test_expect_success 'push with file:// using protocol v1' '
 	test_commit -C file_child three &&
 
-	# Since the repository being served isnt bare we need to push to
-	# another branch explicitly to avoid mangling the master branch
+	# Push to another branch, as the target repository has the
+	# master branch checked out and we cannot push into it.
 	GIT_TRACE_PACKET=1 git -C file_child -c protocol.version=1 \
 		push origin HEAD:client_branch 2>log &&
 
@@ -145,8 +145,10 @@ test_expect_success 'push with file:// using protocol v1' '
 # Test protocol v1 with 'ssh://' transport
 #
 test_expect_success 'setup ssh wrapper' '
-	GIT_SSH="$GIT_BUILD_DIR/t/helper/test-fake-ssh$X" &&
+	GIT_SSH="$GIT_BUILD_DIR/t/helper/test-fake-ssh" &&
 	export GIT_SSH &&
+	GIT_SSH_VARIANT=ssh &&
+	export GIT_SSH_VARIANT &&
 	export TRASH_DIRECTORY &&
 	>"$TRASH_DIRECTORY"/ssh-output
 '
@@ -182,7 +184,7 @@ test_expect_success 'fetch with ssh:// using protocol v1' '
 		fetch 2>log &&
 	expect_ssh git-upload-pack &&
 
-	git -C ssh_child log -1 --format=%s FETCH_HEAD >actual &&
+	git -C ssh_child log -1 --format=%s origin/master >actual &&
 	git -C ssh_parent log -1 --format=%s >expect &&
 	test_cmp expect actual &&
 
@@ -206,8 +208,8 @@ test_expect_success 'pull with ssh:// using protocol v1' '
 test_expect_success 'push with ssh:// using protocol v1' '
 	test_commit -C ssh_child three &&
 
-	# Since the repository being served isnt bare we need to push to
-	# another branch explicitly to avoid mangling the master branch
+	# Push to another branch, as the target repository has the
+	# master branch checked out and we cannot push into it.
 	GIT_TRACE_PACKET=1 git -C ssh_child -c protocol.version=1 \
 		push origin HEAD:client_branch 2>log &&
 	expect_ssh git-receive-pack &&
@@ -251,7 +253,7 @@ test_expect_success 'fetch with http:// using protocol v1' '
 	GIT_TRACE_PACKET=1 git -C http_child -c protocol.version=1 \
 		fetch 2>log &&
 
-	git -C http_child log -1 --format=%s FETCH_HEAD >actual &&
+	git -C http_child log -1 --format=%s origin/master >actual &&
 	git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" log -1 --format=%s >expect &&
 	test_cmp expect actual &&
 
@@ -274,8 +276,8 @@ test_expect_success 'pull with http:// using protocol v1' '
 test_expect_success 'push with http:// using protocol v1' '
 	test_commit -C http_child three &&
 
-	# Since the repository being served isnt bare we need to push to
-	# another branch explicitly to avoid mangling the master branch
+	# Push to another branch, as the target repository has the
+	# master branch checked out and we cannot push into it.
 	GIT_TRACE_PACKET=1 git -C http_child -c protocol.version=1 \
 		push origin HEAD:client_branch && #2>log &&
 
diff --git a/upload-pack.c b/upload-pack.c
index 5cab39819..ef438e9c2 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -1078,9 +1078,10 @@ int cmd_main(int argc, const char **argv)
 		 * so just fall through after writing the version string.
 		 */
 	case protocol_v0:
-	default:
 		upload_pack();
 		break;
+	default:
+		BUG("unknown protocol version");
 	}
 
 	return 0;


-- 
2.14.2.920.gcf0c67979c-goog



* [PATCH v3 01/10] connect: in ref advertisement, shallows are last
  2017-10-03 20:14   ` [PATCH v3 00/10] " Brandon Williams
@ 2017-10-03 20:14     ` Brandon Williams
  2017-10-10 18:14       ` Jonathan Tan
  2017-10-03 20:14     ` [PATCH v3 02/10] pkt-line: add packet_write function Brandon Williams
                       ` (11 subsequent siblings)
  12 siblings, 1 reply; 161+ messages in thread
From: Brandon Williams @ 2017-10-03 20:14 UTC (permalink / raw)
  To: git
  Cc: bturner, git, gitster, jonathantanmy, jrnieder, peff, sbeller,
	Brandon Williams

From: Jonathan Tan <jonathantanmy@google.com>

Currently, get_remote_heads() parses the ref advertisement in one loop,
allowing refs and shallow lines to intersperse, despite this not being
allowed by the specification. Refactor get_remote_heads() to use two
loops instead, enforcing that refs come first, and then shallows.

This also makes it easier to teach get_remote_heads() to interpret other
lines in the ref advertisement, which will be done in a subsequent
patch.

As part of this change, this patch interprets capabilities only on the
first line in the ref advertisement, printing a warning message when
encountering capabilities on other lines.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 connect.c | 189 ++++++++++++++++++++++++++++++++++++++++----------------------
 1 file changed, 123 insertions(+), 66 deletions(-)

diff --git a/connect.c b/connect.c
index df56c0cbf..8e2e276b6 100644
--- a/connect.c
+++ b/connect.c
@@ -11,6 +11,7 @@
 #include "string-list.h"
 #include "sha1-array.h"
 #include "transport.h"
+#include "strbuf.h"
 
 static char *server_capabilities;
 static const char *parse_feature_value(const char *, const char *, int *);
@@ -107,6 +108,104 @@ static void annotate_refs_with_symref_info(struct ref *ref)
 	string_list_clear(&symref, 0);
 }
 
+/*
+ * Read one line of a server's ref advertisement into packet_buffer.
+ */
+static int read_remote_ref(int in, char **src_buf, size_t *src_len,
+			   int *responded)
+{
+	int len = packet_read(in, src_buf, src_len,
+			      packet_buffer, sizeof(packet_buffer),
+			      PACKET_READ_GENTLE_ON_EOF |
+			      PACKET_READ_CHOMP_NEWLINE);
+	const char *arg;
+	if (len < 0)
+		die_initial_contact(*responded);
+	if (len > 4 && skip_prefix(packet_buffer, "ERR ", &arg))
+		die("remote error: %s", arg);
+
+	*responded = 1;
+
+	return len;
+}
+
+#define EXPECTING_FIRST_REF 0
+#define EXPECTING_REF 1
+#define EXPECTING_SHALLOW 2
+
+static void process_capabilities(int *len)
+{
+	int nul_location = strlen(packet_buffer);
+	if (nul_location == *len)
+		return;
+	server_capabilities = xstrdup(packet_buffer + nul_location + 1);
+	*len = nul_location;
+}
+
+static int process_dummy_ref(void)
+{
+	struct object_id oid;
+	const char *name;
+
+	if (parse_oid_hex(packet_buffer, &oid, &name))
+		return 0;
+	if (*name != ' ')
+		return 0;
+	name++;
+
+	return !oidcmp(&null_oid, &oid) && !strcmp(name, "capabilities^{}");
+}
+
+static void check_no_capabilities(int len)
+{
+	if (strlen(packet_buffer) != len)
+		warning("Ignoring capabilities after first line '%s'",
+			packet_buffer + strlen(packet_buffer));
+}
+
+static int process_ref(int len, struct ref ***list, unsigned int flags,
+		       struct oid_array *extra_have)
+{
+	struct object_id old_oid;
+	const char *name;
+
+	if (parse_oid_hex(packet_buffer, &old_oid, &name))
+		return 0;
+	if (*name != ' ')
+		return 0;
+	name++;
+
+	if (extra_have && !strcmp(name, ".have")) {
+		oid_array_append(extra_have, &old_oid);
+	} else if (!strcmp(name, "capabilities^{}")) {
+		die("protocol error: unexpected capabilities^{}");
+	} else if (check_ref(name, flags)) {
+		struct ref *ref = alloc_ref(name);
+		oidcpy(&ref->old_oid, &old_oid);
+		**list = ref;
+		*list = &ref->next;
+	}
+	check_no_capabilities(len);
+	return 1;
+}
+
+static int process_shallow(int len, struct oid_array *shallow_points)
+{
+	const char *arg;
+	struct object_id old_oid;
+
+	if (!skip_prefix(packet_buffer, "shallow ", &arg))
+		return 0;
+
+	if (get_oid_hex(arg, &old_oid))
+		die("protocol error: expected shallow sha-1, got '%s'", arg);
+	if (!shallow_points)
+		die("repository on the other end cannot be shallow");
+	oid_array_append(shallow_points, &old_oid);
+	check_no_capabilities(len);
+	return 1;
+}
+
 /*
  * Read all the refs from the other end
  */
@@ -123,76 +222,34 @@ struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
 	 * willing to talk to us.  A hang-up before seeing any
 	 * response does not necessarily mean an ACL problem, though.
 	 */
-	int saw_response;
-	int got_dummy_ref_with_capabilities_declaration = 0;
+	int responded = 0;
+	int len;
+	int state = EXPECTING_FIRST_REF;
 
 	*list = NULL;
-	for (saw_response = 0; ; saw_response = 1) {
-		struct ref *ref;
-		struct object_id old_oid;
-		char *name;
-		int len, name_len;
-		char *buffer = packet_buffer;
-		const char *arg;
-
-		len = packet_read(in, &src_buf, &src_len,
-				  packet_buffer, sizeof(packet_buffer),
-				  PACKET_READ_GENTLE_ON_EOF |
-				  PACKET_READ_CHOMP_NEWLINE);
-		if (len < 0)
-			die_initial_contact(saw_response);
-
-		if (!len)
-			break;
-
-		if (len > 4 && skip_prefix(buffer, "ERR ", &arg))
-			die("remote error: %s", arg);
-
-		if (len == GIT_SHA1_HEXSZ + strlen("shallow ") &&
-			skip_prefix(buffer, "shallow ", &arg)) {
-			if (get_oid_hex(arg, &old_oid))
-				die("protocol error: expected shallow sha-1, got '%s'", arg);
-			if (!shallow_points)
-				die("repository on the other end cannot be shallow");
-			oid_array_append(shallow_points, &old_oid);
-			continue;
-		}
 
-		if (len < GIT_SHA1_HEXSZ + 2 || get_oid_hex(buffer, &old_oid) ||
-			buffer[GIT_SHA1_HEXSZ] != ' ')
-			die("protocol error: expected sha/ref, got '%s'", buffer);
-		name = buffer + GIT_SHA1_HEXSZ + 1;
-
-		name_len = strlen(name);
-		if (len != name_len + GIT_SHA1_HEXSZ + 1) {
-			free(server_capabilities);
-			server_capabilities = xstrdup(name + name_len + 1);
-		}
-
-		if (extra_have && !strcmp(name, ".have")) {
-			oid_array_append(extra_have, &old_oid);
-			continue;
-		}
-
-		if (!strcmp(name, "capabilities^{}")) {
-			if (saw_response)
-				die("protocol error: unexpected capabilities^{}");
-			if (got_dummy_ref_with_capabilities_declaration)
-				die("protocol error: multiple capabilities^{}");
-			got_dummy_ref_with_capabilities_declaration = 1;
-			continue;
+	while ((len = read_remote_ref(in, &src_buf, &src_len, &responded))) {
+		switch (state) {
+		case EXPECTING_FIRST_REF:
+			process_capabilities(&len);
+			if (process_dummy_ref()) {
+				state = EXPECTING_SHALLOW;
+				break;
+			}
+			state = EXPECTING_REF;
+			/* fallthrough */
+		case EXPECTING_REF:
+			if (process_ref(len, &list, flags, extra_have))
+				break;
+			state = EXPECTING_SHALLOW;
+			/* fallthrough */
+		case EXPECTING_SHALLOW:
+			if (process_shallow(len, shallow_points))
+				break;
+			die("protocol error: unexpected '%s'", packet_buffer);
+		default:
+			die("unexpected state %d", state);
 		}
-
-		if (!check_ref(name, flags))
-			continue;
-
-		if (got_dummy_ref_with_capabilities_declaration)
-			die("protocol error: unexpected ref after capabilities^{}");
-
-		ref = alloc_ref(buffer + GIT_SHA1_HEXSZ + 1);
-		oidcpy(&ref->old_oid, &old_oid);
-		*list = ref;
-		list = &ref->next;
 	}
 
 	annotate_refs_with_symref_info(*orig_list);
-- 
2.14.2.920.gcf0c67979c-goog


^ permalink raw reply related	[flat|nested] 161+ messages in thread

* [PATCH v3 02/10] pkt-line: add packet_write function
  2017-10-03 20:14   ` [PATCH v3 00/10] " Brandon Williams
  2017-10-03 20:14     ` [PATCH v3 01/10] connect: in ref advertisement, shallows are last Brandon Williams
@ 2017-10-03 20:14     ` Brandon Williams
  2017-10-10 18:15       ` Jonathan Tan
  2017-10-03 20:15     ` [PATCH v3 03/10] protocol: introduce protocol extention mechanisms Brandon Williams
                       ` (10 subsequent siblings)
  12 siblings, 1 reply; 161+ messages in thread
From: Brandon Williams @ 2017-10-03 20:14 UTC (permalink / raw)
  To: git
  Cc: bturner, git, gitster, jonathantanmy, jrnieder, peff, sbeller,
	Brandon Williams

Add a function which can be used to write the contents of an arbitrary
buffer.  This makes it easy to build up data in a buffer before writing
the packet instead of formatting the entire contents of the packet using
'packet_write_fmt()'.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
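Note for readers (not part of the commit): a rough sketch of the framing
that writing a raw buffer as a packet involves, assuming the usual
pkt-line layout of four hexadecimal length digits (which count the four
length bytes themselves) followed by the payload.  The name
frame_pkt_line() and the limit SKETCH_PKT_MAX are made up for
illustration; in this patch the actual work is done by
packet_write_gently().

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Assumed limit for this sketch; the total includes the 4 length bytes. */
#define SKETCH_PKT_MAX 65520

/*
 * Frame `size` bytes from `buf` as one pkt-line in `out`.
 * Returns the number of bytes written, or 0 if the payload does not fit.
 * The payload may contain NUL bytes, so it is copied with memcpy()
 * rather than formatted as a string.
 */
static size_t frame_pkt_line(const char *buf, size_t size,
			     char *out, size_t out_size)
{
	char header[5];
	size_t total = size + 4;

	if (total > SKETCH_PKT_MAX || total > out_size)
		return 0;

	/* Four lowercase hex digits holding the total length. */
	snprintf(header, sizeof(header), "%04x", (unsigned int)total);
	memcpy(out, header, 4);
	memcpy(out + 4, buf, size);
	return total;
}
```

A caller can therefore assemble a payload with embedded NULs in a buffer
first and frame it in one step, which is hard to do with a printf-style
interface alone.
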
 pkt-line.c | 6 ++++++
 pkt-line.h | 1 +
 2 files changed, 7 insertions(+)

diff --git a/pkt-line.c b/pkt-line.c
index 647bbd3bc..c025d0332 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -188,6 +188,12 @@ static int packet_write_gently(const int fd_out, const char *buf, size_t size)
 	return 0;
 }
 
+void packet_write(const int fd_out, const char *buf, size_t size)
+{
+	if (packet_write_gently(fd_out, buf, size))
+		die_errno("packet write failed");
+}
+
 void packet_buf_write(struct strbuf *buf, const char *fmt, ...)
 {
 	va_list args;
diff --git a/pkt-line.h b/pkt-line.h
index 66ef610fc..d9e9783b1 100644
--- a/pkt-line.h
+++ b/pkt-line.h
@@ -22,6 +22,7 @@
 void packet_flush(int fd);
 void packet_write_fmt(int fd, const char *fmt, ...) __attribute__((format (printf, 2, 3)));
 void packet_buf_flush(struct strbuf *buf);
+void packet_write(const int fd_out, const char *buf, size_t size);
 void packet_buf_write(struct strbuf *buf, const char *fmt, ...) __attribute__((format (printf, 2, 3)));
 int packet_flush_gently(int fd);
 int packet_write_fmt_gently(int fd, const char *fmt, ...) __attribute__((format (printf, 2, 3)));
-- 
2.14.2.920.gcf0c67979c-goog


^ permalink raw reply related	[flat|nested] 161+ messages in thread

* [PATCH v3 03/10] protocol: introduce protocol extention mechanisms
  2017-10-03 20:14   ` [PATCH v3 00/10] " Brandon Williams
  2017-10-03 20:14     ` [PATCH v3 01/10] connect: in ref advertisement, shallows are last Brandon Williams
  2017-10-03 20:14     ` [PATCH v3 02/10] pkt-line: add packet_write function Brandon Williams
@ 2017-10-03 20:15     ` Brandon Williams
  2017-10-06  9:09       ` Simon Ruderich
  2017-10-10 19:51       ` Jonathan Tan
  2017-10-03 20:15     ` [PATCH v3 04/10] daemon: recognize hidden request arguments Brandon Williams
                       ` (9 subsequent siblings)
  12 siblings, 2 replies; 161+ messages in thread
From: Brandon Williams @ 2017-10-03 20:15 UTC (permalink / raw)
  To: git
  Cc: bturner, git, gitster, jonathantanmy, jrnieder, peff, sbeller,
	Brandon Williams

Create protocol.{c,h} and provide functions which future servers and
clients can use to determine which protocol version to use or which one
is being used.

Also introduce the 'GIT_PROTOCOL' environment variable which will be
used to communicate a colon-separated list of keys with optional values
to a server.  Unknown keys and values must be tolerated.  This mechanism
is used to communicate which version of the wire protocol a client would
like to use with a server.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
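Note for readers (not part of the commit): a minimal sketch of how a
server might pick a version out of a 'GIT_PROTOCOL' value, assuming only
versions 0 and 1 exist.  It uses plain libc instead of git's string_list
helpers, and pick_protocol_version() and parse_version_value() are
hypothetical names; the patch's own logic lives in
determine_protocol_version_server().

```c
#include <stddef.h>
#include <string.h>

/* Map a version value of length `len` to 0 or 1; -1 means "unknown". */
static int parse_version_value(const char *value, size_t len)
{
	if (len == 1 && value[0] == '0')
		return 0;
	if (len == 1 && value[0] == '1')
		return 1;
	return -1;
}

/*
 * Return the greatest recognized "version=N" entry in a colon-separated
 * 'key[=value]' list, or 0 when there is none.  Unknown keys and unknown
 * values are skipped rather than treated as errors.
 */
static int pick_protocol_version(const char *git_protocol)
{
	int version = 0;
	const char *p = git_protocol;

	while (p && *p) {
		size_t len = strcspn(p, ":");	/* length of this entry */

		if (len >= 8 && !strncmp(p, "version=", 8)) {
			int v = parse_version_value(p + 8, len - 8);
			if (v > version)
				version = v;
		}

		p += len;
		if (*p == ':')
			p++;
	}
	return version;
}
```

A server would call this with the result of getenv("GIT_PROTOCOL"); a
NULL or empty value yields version 0.
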
 Documentation/config.txt | 17 +++++++++++
 Documentation/git.txt    |  6 ++++
 Makefile                 |  1 +
 cache.h                  |  8 +++++
 protocol.c               | 79 ++++++++++++++++++++++++++++++++++++++++++++++++
 protocol.h               | 33 ++++++++++++++++++++
 6 files changed, 144 insertions(+)
 create mode 100644 protocol.c
 create mode 100644 protocol.h

diff --git a/Documentation/config.txt b/Documentation/config.txt
index dc4e3f58a..b78747abc 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2517,6 +2517,23 @@ The protocol names currently used by git are:
     `hg` to allow the `git-remote-hg` helper)
 --
 
+protocol.version::
+	Experimental. If set, clients will attempt to communicate with a
+	server using the specified protocol version.  If unset, no
+	attempt will be made by the client to communicate using a
+	particular protocol version; this results in protocol version 0
+	being used.
+	Supported versions:
++
+--
+
+* `0` - the original wire protocol.
+
+* `1` - the original wire protocol with the addition of a version string
+  in the initial response from the server.
+
+--
+
 pull.ff::
 	By default, Git does not create an extra merge commit when merging
 	a commit that is a descendant of the current commit. Instead, the
diff --git a/Documentation/git.txt b/Documentation/git.txt
index 6e3a6767e..7518ea3af 100644
--- a/Documentation/git.txt
+++ b/Documentation/git.txt
@@ -697,6 +697,12 @@ of clones and fetches.
 	which feed potentially-untrusted URLS to git commands.  See
 	linkgit:git-config[1] for more details.
 
+`GIT_PROTOCOL`::
+	For internal use only.  Used in handshaking the wire protocol.
+	Contains a colon ':' separated list of keys with optional values
+	'key[=value]'.  Presence of unknown keys and values must be
+	ignored.
+
 Discussion[[Discussion]]
 ------------------------
 
diff --git a/Makefile b/Makefile
index ed4ca438b..9ce68cded 100644
--- a/Makefile
+++ b/Makefile
@@ -842,6 +842,7 @@ LIB_OBJS += pretty.o
 LIB_OBJS += prio-queue.o
 LIB_OBJS += progress.o
 LIB_OBJS += prompt.o
+LIB_OBJS += protocol.o
 LIB_OBJS += quote.o
 LIB_OBJS += reachable.o
 LIB_OBJS += read-cache.o
diff --git a/cache.h b/cache.h
index 49b083ee0..c74b73671 100644
--- a/cache.h
+++ b/cache.h
@@ -445,6 +445,14 @@ static inline enum object_type object_type(unsigned int mode)
 #define GIT_ICASE_PATHSPECS_ENVIRONMENT "GIT_ICASE_PATHSPECS"
 #define GIT_QUARANTINE_ENVIRONMENT "GIT_QUARANTINE_PATH"
 
+/*
+ * Environment variable used in handshaking the wire protocol.
+ * Contains a colon ':' separated list of keys with optional values
+ * 'key[=value]'.  Presence of unknown keys and values must be
+ * ignored.
+ */
+#define GIT_PROTOCOL_ENVIRONMENT "GIT_PROTOCOL"
+
 /*
  * This environment variable is expected to contain a boolean indicating
  * whether we should or should not treat:
diff --git a/protocol.c b/protocol.c
new file mode 100644
index 000000000..43012b7eb
--- /dev/null
+++ b/protocol.c
@@ -0,0 +1,79 @@
+#include "cache.h"
+#include "config.h"
+#include "protocol.h"
+
+static enum protocol_version parse_protocol_version(const char *value)
+{
+	if (!strcmp(value, "0"))
+		return protocol_v0;
+	else if (!strcmp(value, "1"))
+		return protocol_v1;
+	else
+		return protocol_unknown_version;
+}
+
+enum protocol_version get_protocol_version_config(void)
+{
+	const char *value;
+	if (!git_config_get_string_const("protocol.version", &value)) {
+		enum protocol_version version = parse_protocol_version(value);
+
+		if (version == protocol_unknown_version)
+			die("unknown value for config 'protocol.version': %s",
+			    value);
+
+		return version;
+	}
+
+	return protocol_v0;
+}
+
+enum protocol_version determine_protocol_version_server(void)
+{
+	const char *git_protocol = getenv(GIT_PROTOCOL_ENVIRONMENT);
+	enum protocol_version version = protocol_v0;
+
+	/*
+	 * Determine which protocol version the client has requested.  Since
+	 * multiple 'version' keys can be sent by the client, indicating that
+	 * the client is okay to speak any of them, select the greatest version
+	 * that the client has requested.  This is due to the assumption that
+	 * the most recent protocol version will be the most state-of-the-art.
+	 */
+	if (git_protocol) {
+		struct string_list list = STRING_LIST_INIT_DUP;
+		const struct string_list_item *item;
+		string_list_split(&list, git_protocol, ':', -1);
+
+		for_each_string_list_item(item, &list) {
+			const char *value;
+			enum protocol_version v;
+
+			if (skip_prefix(item->string, "version=", &value)) {
+				v = parse_protocol_version(value);
+				if (v > version)
+					version = v;
+			}
+		}
+
+		string_list_clear(&list, 0);
+	}
+
+	return version;
+}
+
+enum protocol_version determine_protocol_version_client(const char *server_response)
+{
+	enum protocol_version version = protocol_v0;
+
+	if (skip_prefix(server_response, "version ", &server_response)) {
+		version = parse_protocol_version(server_response);
+
+		if (version == protocol_unknown_version)
+			die("server is speaking an unknown protocol");
+		if (version == protocol_v0)
+			die("protocol error: server explicitly said version 0");
+	}
+
+	return version;
+}
diff --git a/protocol.h b/protocol.h
new file mode 100644
index 000000000..1b2bc94a8
--- /dev/null
+++ b/protocol.h
@@ -0,0 +1,33 @@
+#ifndef PROTOCOL_H
+#define PROTOCOL_H
+
+enum protocol_version {
+	protocol_unknown_version = -1,
+	protocol_v0 = 0,
+	protocol_v1 = 1,
+};
+
+/*
+ * Used by a client to determine which protocol version to request be used when
+ * communicating with a server, reflecting the configured value of the
+ * 'protocol.version' config.  If unconfigured, a value of 'protocol_v0' is
+ * returned.
+ */
+extern enum protocol_version get_protocol_version_config(void);
+
+/*
+ * Used by a server to determine which protocol version should be used based on
+ * a client's request, communicated via the 'GIT_PROTOCOL' environment variable
+ * by setting appropriate values for the key 'version'.  If a client doesn't
+ * request a particular protocol version, a default of 'protocol_v0' will be
+ * used.
+ */
+extern enum protocol_version determine_protocol_version_server(void);
+
+/*
+ * Used by a client to determine which protocol version the server is speaking
+ * based on the server's initial response.
+ */
+extern enum protocol_version determine_protocol_version_client(const char *server_response);
+
+#endif /* PROTOCOL_H */
-- 
2.14.2.920.gcf0c67979c-goog


^ permalink raw reply related	[flat|nested] 161+ messages in thread

* [PATCH v3 04/10] daemon: recognize hidden request arguments
  2017-10-03 20:14   ` [PATCH v3 00/10] " Brandon Williams
                       ` (2 preceding siblings ...)
  2017-10-03 20:15     ` [PATCH v3 03/10] protocol: introduce protocol extention mechanisms Brandon Williams
@ 2017-10-03 20:15     ` Brandon Williams
  2017-10-10 18:24       ` Jonathan Tan
  2017-10-03 20:15     ` [PATCH v3 05/10] upload-pack, receive-pack: introduce protocol version 1 Brandon Williams
                       ` (8 subsequent siblings)
  12 siblings, 1 reply; 161+ messages in thread
From: Brandon Williams @ 2017-10-03 20:15 UTC (permalink / raw)
  To: git
  Cc: bturner, git, gitster, jonathantanmy, jrnieder, peff, sbeller,
	Brandon Williams

A normal request to git-daemon is structured as
"command path/to/repo\0host=..\0".  Due to a bug introduced in
49ba83fb6 (Add virtualization support to git-daemon, 2006-09-19) we
aren't able to place any extra arguments (separated by NULs) besides the
host; otherwise the parsing of those arguments would enter an infinite
loop.  This bug was fixed in 73bb33a94 (daemon: Strictly parse the
"extra arg" part of the command, 2009-06-04) but a check was put in
place to disallow extra arguments so that new clients wouldn't trigger
this bug in older servers.

In order to get around this limitation, teach git-daemon to recognize
additional request arguments hidden behind a second NUL byte.  Requests
can then be structured like:
"command path/to/repo\0host=..\0\0version=1\0key=value\0".  git-daemon
can then parse out the extra arguments and set 'GIT_PROTOCOL'
accordingly.

By placing these extra arguments behind a second NUL byte we can skirt
around both the infinite loop bug in 49ba83fb6 (Add virtualization
support to git-daemon, 2006-09-19) as well as the explicit disallowing
of extra arguments introduced in 73bb33a94 (daemon: Strictly parse the
"extra arg" part of the command, 2009-06-04) because both of these
versions of git-daemon check for a single NUL byte after the host
argument before terminating the argument parsing.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
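Note for readers (not part of the commit): a rough sketch of the parsing
described above, assuming the input is everything that follows
"command path/to/repo\0" in the request.  It skips the host argument,
then joins every non-empty NUL-terminated string into a colon-separated
value suitable for 'GIT_PROTOCOL'.  join_extra_args() is a hypothetical
helper written with plain libc, not the parse_extra_args() from the diff
below.

```c
#include <stddef.h>
#include <string.h>

/*
 * `args`/`len` cover "host=..\0\0version=1\0key=value\0".
 * Writes "version=1:key=value" to `out`.  Returns 0 on success and -1 if
 * the input is not NUL-terminated or `out` is too small.
 */
static int join_extra_args(const char *args, size_t len,
			   char *out, size_t out_size)
{
	const char *end = args + len;
	const char *p = args;
	size_t used = 0;

	if (!out_size)
		return -1;
	out[0] = '\0';
	if (!len)
		return 0;
	if (args[len - 1] != '\0')
		return -1;	/* strlen() below relies on this */

	/* Skip the host argument and its terminating NUL. */
	p += strlen(p) + 1;

	for (; p < end; p += strlen(p) + 1) {
		size_t n = strlen(p);

		if (!n)
			continue;	/* empty string: the second NUL byte */
		if (used + n + 2 > out_size)
			return -1;
		if (used)
			out[used++] = ':';
		memcpy(out + used, p, n);
		used += n;
		out[used] = '\0';
	}
	return 0;
}
```

A request from an older client, which carries only the host argument,
produces an empty result, so nothing would be exported in that case.
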
 daemon.c | 68 +++++++++++++++++++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 59 insertions(+), 9 deletions(-)

diff --git a/daemon.c b/daemon.c
index 30747075f..36cc794c9 100644
--- a/daemon.c
+++ b/daemon.c
@@ -282,7 +282,7 @@ static const char *path_ok(const char *directory, struct hostinfo *hi)
 	return NULL;		/* Fallthrough. Deny by default */
 }
 
-typedef int (*daemon_service_fn)(void);
+typedef int (*daemon_service_fn)(const struct argv_array *env);
 struct daemon_service {
 	const char *name;
 	const char *config_name;
@@ -363,7 +363,7 @@ static int run_access_hook(struct daemon_service *service, const char *dir,
 }
 
 static int run_service(const char *dir, struct daemon_service *service,
-		       struct hostinfo *hi)
+		       struct hostinfo *hi, const struct argv_array *env)
 {
 	const char *path;
 	int enabled = service->enabled;
@@ -422,7 +422,7 @@ static int run_service(const char *dir, struct daemon_service *service,
 	 */
 	signal(SIGTERM, SIG_IGN);
 
-	return service->fn();
+	return service->fn(env);
 }
 
 static void copy_to_log(int fd)
@@ -462,25 +462,34 @@ static int run_service_command(struct child_process *cld)
 	return finish_command(cld);
 }
 
-static int upload_pack(void)
+static int upload_pack(const struct argv_array *env)
 {
 	struct child_process cld = CHILD_PROCESS_INIT;
 	argv_array_pushl(&cld.args, "upload-pack", "--strict", NULL);
 	argv_array_pushf(&cld.args, "--timeout=%u", timeout);
+
+	argv_array_pushv(&cld.env_array, env->argv);
+
 	return run_service_command(&cld);
 }
 
-static int upload_archive(void)
+static int upload_archive(const struct argv_array *env)
 {
 	struct child_process cld = CHILD_PROCESS_INIT;
 	argv_array_push(&cld.args, "upload-archive");
+
+	argv_array_pushv(&cld.env_array, env->argv);
+
 	return run_service_command(&cld);
 }
 
-static int receive_pack(void)
+static int receive_pack(const struct argv_array *env)
 {
 	struct child_process cld = CHILD_PROCESS_INIT;
 	argv_array_push(&cld.args, "receive-pack");
+
+	argv_array_pushv(&cld.env_array, env->argv);
+
 	return run_service_command(&cld);
 }
 
@@ -574,7 +583,7 @@ static void canonicalize_client(struct strbuf *out, const char *in)
 /*
  * Read the host as supplied by the client connection.
  */
-static void parse_host_arg(struct hostinfo *hi, char *extra_args, int buflen)
+static char *parse_host_arg(struct hostinfo *hi, char *extra_args, int buflen)
 {
 	char *val;
 	int vallen;
@@ -602,6 +611,43 @@ static void parse_host_arg(struct hostinfo *hi, char *extra_args, int buflen)
 		if (extra_args < end && *extra_args)
 			die("Invalid request");
 	}
+
+	return extra_args;
+}
+
+static void parse_extra_args(struct hostinfo *hi, struct argv_array *env,
+			     char *extra_args, int buflen)
+{
+	const char *end = extra_args + buflen;
+	struct strbuf git_protocol = STRBUF_INIT;
+
+	/* First look for the host argument */
+	extra_args = parse_host_arg(hi, extra_args, buflen);
+
+	/* Look for additional arguments placed after a second NUL byte */
+	for (; extra_args < end; extra_args += strlen(extra_args) + 1) {
+		const char *arg = extra_args;
+
+		/*
+		 * Parse the extra arguments, adding most to 'git_protocol'
+		 * which will be used to set the 'GIT_PROTOCOL' envvar in the
+		 * service that will be run.
+		 *
+		 * If there ends up being a particular arg in the future that
+		 * git-daemon needs to parse specifically (like the 'host' arg)
+		 * then it can be parsed here and not added to 'git_protocol'.
+		 */
+		if (*arg) {
+			if (git_protocol.len > 0)
+				strbuf_addch(&git_protocol, ':');
+			strbuf_addstr(&git_protocol, arg);
+		}
+	}
+
+	if (git_protocol.len > 0)
+		argv_array_pushf(env, GIT_PROTOCOL_ENVIRONMENT "=%s",
+				 git_protocol.buf);
+	strbuf_release(&git_protocol);
 }
 
 /*
@@ -695,6 +741,7 @@ static int execute(void)
 	int pktlen, len, i;
 	char *addr = getenv("REMOTE_ADDR"), *port = getenv("REMOTE_PORT");
 	struct hostinfo hi;
+	struct argv_array env = ARGV_ARRAY_INIT;
 
 	hostinfo_init(&hi);
 
@@ -716,8 +763,9 @@ static int execute(void)
 		pktlen--;
 	}
 
+	/* parse additional args hidden behind a NUL byte */
 	if (len != pktlen)
-		parse_host_arg(&hi, line + len + 1, pktlen - len - 1);
+		parse_extra_args(&hi, &env, line + len + 1, pktlen - len - 1);
 
 	for (i = 0; i < ARRAY_SIZE(daemon_service); i++) {
 		struct daemon_service *s = &(daemon_service[i]);
@@ -730,13 +778,15 @@ static int execute(void)
 			 * Note: The directory here is probably context sensitive,
 			 * and might depend on the actual service being performed.
 			 */
-			int rc = run_service(arg, s, &hi);
+			int rc = run_service(arg, s, &hi, &env);
 			hostinfo_clear(&hi);
+			argv_array_clear(&env);
 			return rc;
 		}
 	}
 
 	hostinfo_clear(&hi);
+	argv_array_clear(&env);
 	logerror("Protocol error: '%s'", line);
 	return -1;
 }
-- 
2.14.2.920.gcf0c67979c-goog


^ permalink raw reply related	[flat|nested] 161+ messages in thread

* [PATCH v3 05/10] upload-pack, receive-pack: introduce protocol version 1
  2017-10-03 20:14   ` [PATCH v3 00/10] " Brandon Williams
                       ` (3 preceding siblings ...)
  2017-10-03 20:15     ` [PATCH v3 04/10] daemon: recognize hidden request arguments Brandon Williams
@ 2017-10-03 20:15     ` Brandon Williams
  2017-10-10 18:28       ` Jonathan Tan
  2017-10-03 20:15     ` [PATCH v3 06/10] connect: teach client to recognize v1 server response Brandon Williams
                       ` (7 subsequent siblings)
  12 siblings, 1 reply; 161+ messages in thread
From: Brandon Williams @ 2017-10-03 20:15 UTC (permalink / raw)
  To: git
  Cc: bturner, git, gitster, jonathantanmy, jrnieder, peff, sbeller,
	Brandon Williams

Teach upload-pack and receive-pack to understand and respond using
protocol version 1, if requested.

Protocol version 1 is simply the original and current protocol (what I'm
calling version 0) with the addition of a single packet line, which
precedes the ref advertisement, indicating the protocol version being
spoken.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 builtin/receive-pack.c | 15 +++++++++++++++
 upload-pack.c          | 18 +++++++++++++++++-
 2 files changed, 32 insertions(+), 1 deletion(-)

diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index dd06b3fb4..94b7d29ea 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -24,6 +24,7 @@
 #include "tmp-objdir.h"
 #include "oidset.h"
 #include "packfile.h"
+#include "protocol.h"
 
 static const char * const receive_pack_usage[] = {
 	N_("git receive-pack <git-dir>"),
@@ -1963,6 +1964,20 @@ int cmd_receive_pack(int argc, const char **argv, const char *prefix)
 	else if (0 <= receive_unpack_limit)
 		unpack_limit = receive_unpack_limit;
 
+	switch (determine_protocol_version_server()) {
+	case protocol_v1:
+		if (advertise_refs || !stateless_rpc)
+			packet_write_fmt(1, "version 1\n");
+		/*
+		 * v1 is just the original protocol with a version string,
+		 * so just fall through after writing the version string.
+		 */
+	case protocol_v0:
+		break;
+	default:
+		BUG("unknown protocol version");
+	}
+
 	if (advertise_refs || !stateless_rpc) {
 		write_head_info();
 	}
diff --git a/upload-pack.c b/upload-pack.c
index 7efff2fbf..ef438e9c2 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -18,6 +18,7 @@
 #include "parse-options.h"
 #include "argv-array.h"
 #include "prio-queue.h"
+#include "protocol.h"
 
 static const char * const upload_pack_usage[] = {
 	N_("git upload-pack [<options>] <dir>"),
@@ -1067,6 +1068,21 @@ int cmd_main(int argc, const char **argv)
 		die("'%s' does not appear to be a git repository", dir);
 
 	git_config(upload_pack_config, NULL);
-	upload_pack();
+
+	switch (determine_protocol_version_server()) {
+	case protocol_v1:
+		if (advertise_refs || !stateless_rpc)
+			packet_write_fmt(1, "version 1\n");
+		/*
+		 * v1 is just the original protocol with a version string,
+		 * so just fall through after writing the version string.
+		 */
+	case protocol_v0:
+		upload_pack();
+		break;
+	default:
+		BUG("unknown protocol version");
+	}
+
 	return 0;
 }
-- 
2.14.2.920.gcf0c67979c-goog


^ permalink raw reply related	[flat|nested] 161+ messages in thread

* [PATCH v3 06/10] connect: teach client to recognize v1 server response
  2017-10-03 20:14   ` [PATCH v3 00/10] " Brandon Williams
                       ` (4 preceding siblings ...)
  2017-10-03 20:15     ` [PATCH v3 05/10] upload-pack, receive-pack: introduce protocol version 1 Brandon Williams
@ 2017-10-03 20:15     ` Brandon Williams
  2017-10-03 20:15     ` [PATCH v3 07/10] connect: tell server that the client understands v1 Brandon Williams
                       ` (6 subsequent siblings)
  12 siblings, 0 replies; 161+ messages in thread
From: Brandon Williams @ 2017-10-03 20:15 UTC (permalink / raw)
  To: git
  Cc: bturner, git, gitster, jonathantanmy, jrnieder, peff, sbeller,
	Brandon Williams

Teach a client to recognize that a server understands protocol v1 by
looking at the first pkt-line the server sends in response.  This is
done by looking for the response "version 1" sent by upload-pack or
receive-pack.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
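Note for readers (not part of the commit): a small sketch of the check
described above, assuming the first pkt-line payload has already had its
trailing newline removed.  The enum and classify_first_line() are
hypothetical; in the patch the decision is made by
determine_protocol_version_client(), which die()s where this sketch
returns SKETCH_UNKNOWN.

```c
#include <stddef.h>
#include <string.h>

enum sketch_version {
	SKETCH_UNKNOWN = -1,
	SKETCH_V0 = 0,
	SKETCH_V1 = 1
};

/*
 * Classify the first line sent by the server.  A line that does not
 * start with "version " is taken to be the start of a v0 ref
 * advertisement.  An explicit "version 0" is reported as unknown here,
 * since a v0 server never sends a version line.
 */
static enum sketch_version classify_first_line(const char *line)
{
	static const char prefix[] = "version ";
	size_t plen = sizeof(prefix) - 1;

	if (strncmp(line, prefix, plen))
		return SKETCH_V0;
	if (!strcmp(line + plen, "1"))
		return SKETCH_V1;
	return SKETCH_UNKNOWN;
}
```

When the result is SKETCH_V1 the line is consumed and the next pkt-line
is the first ref; when it is SKETCH_V0 the same line has to be processed
again as the first ref.
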
 connect.c | 30 ++++++++++++++++++++++++++----
 1 file changed, 26 insertions(+), 4 deletions(-)

diff --git a/connect.c b/connect.c
index 8e2e276b6..a5e708a61 100644
--- a/connect.c
+++ b/connect.c
@@ -12,6 +12,7 @@
 #include "sha1-array.h"
 #include "transport.h"
 #include "strbuf.h"
+#include "protocol.h"
 
 static char *server_capabilities;
 static const char *parse_feature_value(const char *, const char *, int *);
@@ -129,9 +130,23 @@ static int read_remote_ref(int in, char **src_buf, size_t *src_len,
 	return len;
 }
 
-#define EXPECTING_FIRST_REF 0
-#define EXPECTING_REF 1
-#define EXPECTING_SHALLOW 2
+#define EXPECTING_PROTOCOL_VERSION 0
+#define EXPECTING_FIRST_REF 1
+#define EXPECTING_REF 2
+#define EXPECTING_SHALLOW 3
+
+/* Returns 1 if packet_buffer is a protocol version pkt-line, 0 otherwise. */
+static int process_protocol_version(void)
+{
+	switch (determine_protocol_version_client(packet_buffer)) {
+	case protocol_v1:
+		return 1;
+	case protocol_v0:
+		return 0;
+	default:
+		die("server is speaking an unknown protocol");
+	}
+}
 
 static void process_capabilities(int *len)
 {
@@ -224,12 +239,19 @@ struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
 	 */
 	int responded = 0;
 	int len;
-	int state = EXPECTING_FIRST_REF;
+	int state = EXPECTING_PROTOCOL_VERSION;
 
 	*list = NULL;
 
 	while ((len = read_remote_ref(in, &src_buf, &src_len, &responded))) {
 		switch (state) {
+		case EXPECTING_PROTOCOL_VERSION:
+			if (process_protocol_version()) {
+				state = EXPECTING_FIRST_REF;
+				break;
+			}
+			state = EXPECTING_FIRST_REF;
+			/* fallthrough */
 		case EXPECTING_FIRST_REF:
 			process_capabilities(&len);
 			if (process_dummy_ref()) {
-- 
2.14.2.920.gcf0c67979c-goog


^ permalink raw reply related	[flat|nested] 161+ messages in thread

* [PATCH v3 07/10] connect: tell server that the client understands v1
  2017-10-03 20:14   ` [PATCH v3 00/10] " Brandon Williams
                       ` (5 preceding siblings ...)
  2017-10-03 20:15     ` [PATCH v3 06/10] connect: teach client to recognize v1 server response Brandon Williams
@ 2017-10-03 20:15     ` Brandon Williams
  2017-10-10 18:30       ` Jonathan Tan
  2017-10-03 20:15     ` [PATCH v3 08/10] http: " Brandon Williams
                       ` (5 subsequent siblings)
  12 siblings, 1 reply; 161+ messages in thread
From: Brandon Williams @ 2017-10-03 20:15 UTC (permalink / raw)
  To: git
  Cc: bturner, git, gitster, jonathantanmy, jrnieder, peff, sbeller,
	Brandon Williams

Teach the connection logic to tell a server that it understands protocol
v1.  This is done in two different ways for the builtin transports.

1. git://
   A normal request to git-daemon is structured as
   "command path/to/repo\0host=..\0".  Due to a bug introduced in
   49ba83fb6 (Add virtualization support to git-daemon, 2006-09-19) we
   aren't able to place any extra arguments (separated by NULs) besides
   the host; otherwise the parsing of those arguments would enter an
   infinite loop.  This bug was fixed in 73bb33a94 (daemon: Strictly
   parse the "extra arg" part of the command, 2009-06-04) but a check
   was put in place to disallow extra arguments so that new clients
   wouldn't trigger this bug in older servers.

   In order to get around this limitation, git-daemon was taught to
   recognize additional request arguments hidden behind a second
   NUL byte.  Requests can then be structured like:
   "command path/to/repo\0host=..\0\0version=1\0key=value\0".
   git-daemon can then parse out the extra arguments and set
   'GIT_PROTOCOL' accordingly.

   By placing these extra arguments behind a second NUL byte we can
   skirt around both the infinite loop bug in 49ba83fb6 (Add
   virtualization support to git-daemon, 2006-09-19) as well as the
   explicit disallowing of extra arguments introduced in 73bb33a94
   (daemon: Strictly parse the "extra arg" part of the command,
   2009-06-04) because both of these versions of git-daemon check for a
   single NUL byte after the host argument before terminating the
   argument parsing.

2. ssh://, file://
   Set 'GIT_PROTOCOL' environment variable with the desired protocol
   version.  With the file:// transport, 'GIT_PROTOCOL' can be set
   explicitly in the locally running git-upload-pack or git-receive-pack
   processes.  With the ssh:// transport and OpenSSH-compliant ssh
   programs, 'GIT_PROTOCOL' can be sent across ssh by using '-o
   SendEnv=GIT_PROTOCOL' and having the server whitelist this
   environment variable.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
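Note for readers (not part of the commit): a minimal sketch of the
git:// request layout from item 1, assuming a fixed-size output buffer
instead of a strbuf.  build_daemon_request() is a hypothetical helper;
the patch builds the same bytes with strbuf_addf() and sends them with
packet_write().

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Build "prog path\0host=HOST\0" and, for version > 0, append
 * "\0version=N\0" so that the extra argument sits behind a second NUL.
 * Returns the request length in bytes, or 0 if `out` is too small.
 */
static size_t build_daemon_request(char *out, size_t out_size,
				   const char *prog, const char *path,
				   const char *host, int version)
{
	size_t used;
	int n;

	/* "%c" with a 0 argument emits the NUL separators. */
	n = snprintf(out, out_size, "%s %s%chost=%s%c",
		     prog, path, 0, host, 0);
	if (n < 0 || (size_t)n >= out_size)
		return 0;
	used = (size_t)n;

	if (version > 0) {
		if (used + 1 >= out_size)
			return 0;
		out[used++] = '\0';	/* the second NUL byte */

		n = snprintf(out + used, out_size - used,
			     "version=%d%c", version, 0);
		if (n < 0 || (size_t)n >= out_size - used)
			return 0;
		used += (size_t)n;
	}
	return used;
}
```

With version 0 the output is byte-for-byte the request older clients
already send, which is why existing servers are unaffected unless
protocol.version is set.
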
 connect.c              |  37 ++++++--
 t/t5700-protocol-v1.sh | 223 +++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 255 insertions(+), 5 deletions(-)
 create mode 100755 t/t5700-protocol-v1.sh

diff --git a/connect.c b/connect.c
index a5e708a61..b8695a2fa 100644
--- a/connect.c
+++ b/connect.c
@@ -871,6 +871,7 @@ struct child_process *git_connect(int fd[2], const char *url,
 		printf("Diag: path=%s\n", path ? path : "NULL");
 		conn = NULL;
 	} else if (protocol == PROTO_GIT) {
+		struct strbuf request = STRBUF_INIT;
 		/*
 		 * Set up virtual host information based on where we will
 		 * connect, unless the user has overridden us in
@@ -898,13 +899,25 @@ struct child_process *git_connect(int fd[2], const char *url,
 		 * Note: Do not add any other headers here!  Doing so
 		 * will cause older git-daemon servers to crash.
 		 */
-		packet_write_fmt(fd[1],
-			     "%s %s%chost=%s%c",
-			     prog, path, 0,
-			     target_host, 0);
+		strbuf_addf(&request,
+			    "%s %s%chost=%s%c",
+			    prog, path, 0,
+			    target_host, 0);
+
+		/* If using a new version, put it here after a second NUL byte */
+		if (get_protocol_version_config() > 0) {
+			strbuf_addch(&request, '\0');
+			strbuf_addf(&request, "version=%d%c",
+				    get_protocol_version_config(), '\0');
+		}
+
+		packet_write(fd[1], request.buf, request.len);
+
 		free(target_host);
+		strbuf_release(&request);
 	} else {
 		struct strbuf cmd = STRBUF_INIT;
+		const char *const *var;
 
 		conn = xmalloc(sizeof(*conn));
 		child_process_init(conn);
@@ -917,7 +930,9 @@ struct child_process *git_connect(int fd[2], const char *url,
 		sq_quote_buf(&cmd, path);
 
 		/* remove repo-local variables from the environment */
-		conn->env = local_repo_env;
+		for (var = local_repo_env; *var; var++)
+			argv_array_push(&conn->env_array, *var);
+
 		conn->use_shell = 1;
 		conn->in = conn->out = -1;
 		if (protocol == PROTO_SSH) {
@@ -971,6 +986,14 @@ struct child_process *git_connect(int fd[2], const char *url,
 			}
 
 			argv_array_push(&conn->args, ssh);
+
+			if (get_protocol_version_config() > 0) {
+				argv_array_push(&conn->args, "-o");
+				argv_array_push(&conn->args, "SendEnv=" GIT_PROTOCOL_ENVIRONMENT);
+				argv_array_pushf(&conn->env_array, GIT_PROTOCOL_ENVIRONMENT "=version=%d",
+						 get_protocol_version_config());
+			}
+
 			if (flags & CONNECT_IPV4)
 				argv_array_push(&conn->args, "-4");
 			else if (flags & CONNECT_IPV6)
@@ -985,6 +1008,10 @@ struct child_process *git_connect(int fd[2], const char *url,
 			argv_array_push(&conn->args, ssh_host);
 		} else {
 			transport_check_allowed("file");
+			if (get_protocol_version_config() > 0) {
+				argv_array_pushf(&conn->env_array, GIT_PROTOCOL_ENVIRONMENT "=version=%d",
+						 get_protocol_version_config());
+			}
 		}
 		argv_array_push(&conn->args, cmd.buf);
 
diff --git a/t/t5700-protocol-v1.sh b/t/t5700-protocol-v1.sh
new file mode 100755
index 000000000..6551932da
--- /dev/null
+++ b/t/t5700-protocol-v1.sh
@@ -0,0 +1,223 @@
+#!/bin/sh
+
+test_description='test git wire-protocol transition'
+
+TEST_NO_CREATE_REPO=1
+
+. ./test-lib.sh
+
+# Test protocol v1 with 'git://' transport
+#
+. "$TEST_DIRECTORY"/lib-git-daemon.sh
+start_git_daemon --export-all --enable=receive-pack
+daemon_parent=$GIT_DAEMON_DOCUMENT_ROOT_PATH/parent
+
+test_expect_success 'create repo to be served by git-daemon' '
+	git init "$daemon_parent" &&
+	test_commit -C "$daemon_parent" one
+'
+
+test_expect_success 'clone with git:// using protocol v1' '
+	GIT_TRACE_PACKET=1 git -c protocol.version=1 \
+		clone "$GIT_DAEMON_URL/parent" daemon_child 2>log &&
+
+	git -C daemon_child log -1 --format=%s >actual &&
+	git -C "$daemon_parent" log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+
+	# Client requested to use protocol v1
+	grep "clone> .*\\\0\\\0version=1\\\0$" log &&
+	# Server responded using protocol v1
+	grep "clone< version 1" log
+'
+
+test_expect_success 'fetch with git:// using protocol v1' '
+	test_commit -C "$daemon_parent" two &&
+
+	GIT_TRACE_PACKET=1 git -C daemon_child -c protocol.version=1 \
+		fetch 2>log &&
+
+	git -C daemon_child log -1 --format=%s origin/master >actual &&
+	git -C "$daemon_parent" log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+
+	# Client requested to use protocol v1
+	grep "fetch> .*\\\0\\\0version=1\\\0$" log &&
+	# Server responded using protocol v1
+	grep "fetch< version 1" log
+'
+
+test_expect_success 'pull with git:// using protocol v1' '
+	GIT_TRACE_PACKET=1 git -C daemon_child -c protocol.version=1 \
+		pull 2>log &&
+
+	git -C daemon_child log -1 --format=%s >actual &&
+	git -C "$daemon_parent" log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+
+	# Client requested to use protocol v1
+	grep "fetch> .*\\\0\\\0version=1\\\0$" log &&
+	# Server responded using protocol v1
+	grep "fetch< version 1" log
+'
+
+test_expect_success 'push with git:// using protocol v1' '
+	test_commit -C daemon_child three &&
+
+	# Push to another branch, as the target repository has the
+	# master branch checked out and we cannot push into it.
+	GIT_TRACE_PACKET=1 git -C daemon_child -c protocol.version=1 \
+		push origin HEAD:client_branch 2>log &&
+
+	git -C daemon_child log -1 --format=%s >actual &&
+	git -C "$daemon_parent" log -1 --format=%s client_branch >expect &&
+	test_cmp expect actual &&
+
+	# Client requested to use protocol v1
+	grep "push> .*\\\0\\\0version=1\\\0$" log &&
+	# Server responded using protocol v1
+	grep "push< version 1" log
+'
+
+stop_git_daemon
+
+# Test protocol v1 with 'file://' transport
+#
+test_expect_success 'create repo to be served by file:// transport' '
+	git init file_parent &&
+	test_commit -C file_parent one
+'
+
+test_expect_success 'clone with file:// using protocol v1' '
+	GIT_TRACE_PACKET=1 git -c protocol.version=1 \
+		clone "file://$(pwd)/file_parent" file_child 2>log &&
+
+	git -C file_child log -1 --format=%s >actual &&
+	git -C file_parent log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+
+	# Server responded using protocol v1
+	grep "clone< version 1" log
+'
+
+test_expect_success 'fetch with file:// using protocol v1' '
+	test_commit -C file_parent two &&
+
+	GIT_TRACE_PACKET=1 git -C file_child -c protocol.version=1 \
+		fetch 2>log &&
+
+	git -C file_child log -1 --format=%s origin/master >actual &&
+	git -C file_parent log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+
+	# Server responded using protocol v1
+	grep "fetch< version 1" log
+'
+
+test_expect_success 'pull with file:// using protocol v1' '
+	GIT_TRACE_PACKET=1 git -C file_child -c protocol.version=1 \
+		pull 2>log &&
+
+	git -C file_child log -1 --format=%s >actual &&
+	git -C file_parent log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+
+	# Server responded using protocol v1
+	grep "fetch< version 1" log
+'
+
+test_expect_success 'push with file:// using protocol v1' '
+	test_commit -C file_child three &&
+
+	# Push to another branch, as the target repository has the
+	# master branch checked out and we cannot push into it.
+	GIT_TRACE_PACKET=1 git -C file_child -c protocol.version=1 \
+		push origin HEAD:client_branch 2>log &&
+
+	git -C file_child log -1 --format=%s >actual &&
+	git -C file_parent log -1 --format=%s client_branch >expect &&
+	test_cmp expect actual &&
+
+	# Server responded using protocol v1
+	grep "push< version 1" log
+'
+
+# Test protocol v1 with 'ssh://' transport
+#
+test_expect_success 'setup ssh wrapper' '
+	GIT_SSH="$GIT_BUILD_DIR/t/helper/test-fake-ssh" &&
+	export GIT_SSH &&
+	export TRASH_DIRECTORY &&
+	>"$TRASH_DIRECTORY"/ssh-output
+'
+
+expect_ssh () {
+	test_when_finished '(cd "$TRASH_DIRECTORY" && rm -f ssh-expect && >ssh-output)' &&
+	echo "ssh: -o SendEnv=GIT_PROTOCOL myhost $1 '$PWD/ssh_parent'" >"$TRASH_DIRECTORY/ssh-expect" &&
+	(cd "$TRASH_DIRECTORY" && test_cmp ssh-expect ssh-output)
+}
+
+test_expect_success 'create repo to be served by ssh:// transport' '
+	git init ssh_parent &&
+	test_commit -C ssh_parent one
+'
+
+test_expect_success 'clone with ssh:// using protocol v1' '
+	GIT_TRACE_PACKET=1 git -c protocol.version=1 \
+		clone "ssh://myhost:$(pwd)/ssh_parent" ssh_child 2>log &&
+	expect_ssh git-upload-pack &&
+
+	git -C ssh_child log -1 --format=%s >actual &&
+	git -C ssh_parent log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+
+	# Server responded using protocol v1
+	grep "clone< version 1" log
+'
+
+test_expect_success 'fetch with ssh:// using protocol v1' '
+	test_commit -C ssh_parent two &&
+
+	GIT_TRACE_PACKET=1 git -C ssh_child -c protocol.version=1 \
+		fetch 2>log &&
+	expect_ssh git-upload-pack &&
+
+	git -C ssh_child log -1 --format=%s origin/master >actual &&
+	git -C ssh_parent log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+
+	# Server responded using protocol v1
+	grep "fetch< version 1" log
+'
+
+test_expect_success 'pull with ssh:// using protocol v1' '
+	GIT_TRACE_PACKET=1 git -C ssh_child -c protocol.version=1 \
+		pull 2>log &&
+	expect_ssh git-upload-pack &&
+
+	git -C ssh_child log -1 --format=%s >actual &&
+	git -C ssh_parent log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+
+	# Server responded using protocol v1
+	grep "fetch< version 1" log
+'
+
+test_expect_success 'push with ssh:// using protocol v1' '
+	test_commit -C ssh_child three &&
+
+	# Push to another branch, as the target repository has the
+	# master branch checked out and we cannot push into it.
+	GIT_TRACE_PACKET=1 git -C ssh_child -c protocol.version=1 \
+		push origin HEAD:client_branch 2>log &&
+	expect_ssh git-receive-pack &&
+
+	git -C ssh_child log -1 --format=%s >actual &&
+	git -C ssh_parent log -1 --format=%s client_branch >expect &&
+	test_cmp expect actual &&
+
+	# Server responded using protocol v1
+	grep "push< version 1" log
+'
+
+test_done
-- 
2.14.2.920.gcf0c67979c-goog



* [PATCH v3 08/10] http: tell server that the client understands v1
  2017-10-03 20:14   ` [PATCH v3 00/10] " Brandon Williams
                       ` (6 preceding siblings ...)
  2017-10-03 20:15     ` [PATCH v3 07/10] connect: tell server that the client understands v1 Brandon Williams
@ 2017-10-03 20:15     ` Brandon Williams
  2017-10-03 20:15     ` [PATCH v3 09/10] i5700: add interop test for protocol transition Brandon Williams
                       ` (4 subsequent siblings)
  12 siblings, 0 replies; 161+ messages in thread
From: Brandon Williams @ 2017-10-03 20:15 UTC (permalink / raw)
  To: git
  Cc: bturner, git, gitster, jonathantanmy, jrnieder, peff, sbeller,
	Brandon Williams

Tell a server that the client understands protocol v1 by sending the
HTTP header 'Git-Protocol' with the requested version.

Also teach the Apache HTTP server configuration used by the tests to
pass the 'Git-Protocol' header through as the environment variable
'GIT_PROTOCOL'.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 cache.h                 |  2 ++
 http.c                  | 18 +++++++++++++
 t/lib-httpd/apache.conf |  7 +++++
 t/t5700-protocol-v1.sh  | 69 +++++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 96 insertions(+)
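
The header line described above can be sketched in isolation.  This is
only an illustration under stated assumptions: format_protocol_header()
is a hypothetical helper, not a function in this series, and it uses a
plain char buffer where http.c uses a strbuf and curl_slist_append().

```c
#include <stdio.h>

/* Header name, matching the GIT_PROTOCOL_HEADER define added to cache.h. */
#define GIT_PROTOCOL_HEADER "Git-Protocol"

/*
 * Hypothetical helper: format the request header for the given protocol
 * version into buf.  Returns the number of bytes written (excluding the
 * trailing NUL), or 0 when no header should be sent (version <= 0) or
 * when buf is too small to hold the whole line.
 */
size_t format_protocol_header(char *buf, size_t len, int version)
{
	int n;

	if (version <= 0)
		return 0;	/* protocol v0 sends no extra header */

	n = snprintf(buf, len, GIT_PROTOCOL_HEADER ": version=%d", version);
	if (n < 0 || (size_t)n >= len)
		return 0;	/* formatting error or truncated output */
	return (size_t)n;
}
```

With the SetEnvIf rule in t/lib-httpd/apache.conf, the value after the
colon ("version=1") is what ends up in GIT_PROTOCOL on the server side.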

diff --git a/cache.h b/cache.h
index c74b73671..3a6b869c2 100644
--- a/cache.h
+++ b/cache.h
@@ -452,6 +452,8 @@ static inline enum object_type object_type(unsigned int mode)
  * ignored.
  */
 #define GIT_PROTOCOL_ENVIRONMENT "GIT_PROTOCOL"
+/* HTTP header used to handshake the wire protocol */
+#define GIT_PROTOCOL_HEADER "Git-Protocol"
 
 /*
  * This environment variable is expected to contain a boolean indicating
diff --git a/http.c b/http.c
index 9e40a465f..ffb719216 100644
--- a/http.c
+++ b/http.c
@@ -12,6 +12,7 @@
 #include "gettext.h"
 #include "transport.h"
 #include "packfile.h"
+#include "protocol.h"
 
 static struct trace_key trace_curl = TRACE_KEY_INIT(CURL);
 #if LIBCURL_VERSION_NUM >= 0x070a08
@@ -897,6 +898,21 @@ static void set_from_env(const char **var, const char *envname)
 		*var = val;
 }
 
+static void protocol_http_header(void)
+{
+	if (get_protocol_version_config() > 0) {
+		struct strbuf protocol_header = STRBUF_INIT;
+
+		strbuf_addf(&protocol_header, GIT_PROTOCOL_HEADER ": version=%d",
+			    get_protocol_version_config());
+
+
+		extra_http_headers = curl_slist_append(extra_http_headers,
+						       protocol_header.buf);
+		strbuf_release(&protocol_header);
+	}
+}
+
 void http_init(struct remote *remote, const char *url, int proactive_auth)
 {
 	char *low_speed_limit;
@@ -927,6 +943,8 @@ void http_init(struct remote *remote, const char *url, int proactive_auth)
 	if (remote)
 		var_override(&http_proxy_authmethod, remote->http_proxy_authmethod);
 
+	protocol_http_header();
+
 	pragma_header = curl_slist_append(http_copy_default_headers(),
 		"Pragma: no-cache");
 	no_pragma_header = curl_slist_append(http_copy_default_headers(),
diff --git a/t/lib-httpd/apache.conf b/t/lib-httpd/apache.conf
index 0642ae7e6..df1943631 100644
--- a/t/lib-httpd/apache.conf
+++ b/t/lib-httpd/apache.conf
@@ -67,6 +67,9 @@ LockFile accept.lock
 <IfModule !mod_unixd.c>
 	LoadModule unixd_module modules/mod_unixd.so
 </IfModule>
+<IfModule !mod_setenvif.c>
+	LoadModule setenvif_module modules/mod_setenvif.so
+</IfModule>
 </IfVersion>
 
 PassEnv GIT_VALGRIND
@@ -76,6 +79,10 @@ PassEnv ASAN_OPTIONS
 PassEnv GIT_TRACE
 PassEnv GIT_CONFIG_NOSYSTEM
 
+<IfVersion >= 2.4>
+	SetEnvIf Git-Protocol ".*" GIT_PROTOCOL=$0
+</IfVersion>
+
 Alias /dumb/ www/
 Alias /auth/dumb/ www/auth/dumb/
 
diff --git a/t/t5700-protocol-v1.sh b/t/t5700-protocol-v1.sh
index 6551932da..b0779d362 100755
--- a/t/t5700-protocol-v1.sh
+++ b/t/t5700-protocol-v1.sh
@@ -220,4 +220,73 @@ test_expect_success 'push with ssh:// using protocol v1' '
 	grep "push< version 1" log
 '
 
+# Test protocol v1 with 'http://' transport
+#
+. "$TEST_DIRECTORY"/lib-httpd.sh
+start_httpd
+
+test_expect_success 'create repo to be served by http:// transport' '
+	git init "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" config http.receivepack true &&
+	test_commit -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" one
+'
+
+test_expect_success 'clone with http:// using protocol v1' '
+	GIT_TRACE_PACKET=1 GIT_TRACE_CURL=1 git -c protocol.version=1 \
+		clone "$HTTPD_URL/smart/http_parent" http_child 2>log &&
+
+	git -C http_child log -1 --format=%s >actual &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+
+	# Client requested to use protocol v1
+	grep "Git-Protocol: version=1" log &&
+	# Server responded using protocol v1
+	grep "git< version 1" log
+'
+
+test_expect_success 'fetch with http:// using protocol v1' '
+	test_commit -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" two &&
+
+	GIT_TRACE_PACKET=1 git -C http_child -c protocol.version=1 \
+		fetch 2>log &&
+
+	git -C http_child log -1 --format=%s origin/master >actual &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+
+	# Server responded using protocol v1
+	grep "git< version 1" log
+'
+
+test_expect_success 'pull with http:// using protocol v1' '
+	GIT_TRACE_PACKET=1 git -C http_child -c protocol.version=1 \
+		pull 2>log &&
+
+	git -C http_child log -1 --format=%s >actual &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+
+	# Server responded using protocol v1
+	grep "git< version 1" log
+'
+
+test_expect_success 'push with http:// using protocol v1' '
+	test_commit -C http_child three &&
+
+	# Push to another branch, as the target repository has the
+	# master branch checked out and we cannot push into it.
+	GIT_TRACE_PACKET=1 git -C http_child -c protocol.version=1 \
+		push origin HEAD:client_branch 2>log &&
+
+	git -C http_child log -1 --format=%s >actual &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" log -1 --format=%s client_branch >expect &&
+	test_cmp expect actual &&
+
+	# Server responded using protocol v1
+	grep "git< version 1" log
+'
+
+stop_httpd
+
 test_done
-- 
2.14.2.920.gcf0c67979c-goog



* [PATCH v3 09/10] i5700: add interop test for protocol transition
  2017-10-03 20:14   ` [PATCH v3 00/10] " Brandon Williams
                       ` (7 preceding siblings ...)
  2017-10-03 20:15     ` [PATCH v3 08/10] http: " Brandon Williams
@ 2017-10-03 20:15     ` Brandon Williams
  2017-10-03 20:15     ` [PATCH v3 10/10] ssh: introduce a 'simple' ssh variant Brandon Williams
                       ` (3 subsequent siblings)
  12 siblings, 0 replies; 161+ messages in thread
From: Brandon Williams @ 2017-10-03 20:15 UTC (permalink / raw)
  To: git
  Cc: bturner, git, gitster, jonathantanmy, jrnieder, peff, sbeller,
	Brandon Williams

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 t/interop/i5700-protocol-transition.sh | 68 ++++++++++++++++++++++++++++++++++
 1 file changed, 68 insertions(+)
 create mode 100755 t/interop/i5700-protocol-transition.sh
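
For context, the git:// cases in this test check that an old daemon
tolerates the extra data the new client appends to its initial request.
A rough sketch of that request payload, following the connect.c change
earlier in the series, is given below.  build_daemon_request() is a
hypothetical name used only for illustration, and the sketch leaves out
the pkt-line length prefix that packet_write() adds.

```c
#include <stdio.h>

/*
 * Hypothetical helper: build the payload of the initial git:// request
 * into buf and return its length in bytes, or 0 if buf is too small.
 *
 * Layout:
 *   "<prog> <path>\0host=<host>\0"    always sent
 *   "\0version=<n>\0"                 appended only when version > 0
 *
 * The payload contains embedded NUL bytes, so callers must use the
 * returned length rather than strlen().
 */
size_t build_daemon_request(char *buf, size_t len, const char *prog,
			    const char *path, const char *host, int version)
{
	size_t used;
	int n;

	n = snprintf(buf, len, "%s %s%chost=%s%c", prog, path, 0, host, 0);
	if (n < 0 || (size_t)n >= len)
		return 0;
	used = (size_t)n;

	if (version > 0) {
		/* extra arguments start after a second NUL byte */
		n = snprintf(buf + used, len - used, "%cversion=%d%c",
			     0, version, 0);
		if (n < 0 || (size_t)n >= len - used)
			return 0;
		used += (size_t)n;
	}
	return used;
}
```

An old daemon is expected to stop reading at the NUL that ends the
host= argument, which is why the tests above grep for "version=1" in
the client's trace but require that no "version 1" line comes back.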

diff --git a/t/interop/i5700-protocol-transition.sh b/t/interop/i5700-protocol-transition.sh
new file mode 100755
index 000000000..97e8e580e
--- /dev/null
+++ b/t/interop/i5700-protocol-transition.sh
@@ -0,0 +1,68 @@
+#!/bin/sh
+
+VERSION_A=.
+VERSION_B=v2.0.0
+
+: ${LIB_GIT_DAEMON_PORT:=5700}
+LIB_GIT_DAEMON_COMMAND='git.b daemon'
+
+test_description='clone and fetch by client who is trying to use a new protocol'
+. ./interop-lib.sh
+. "$TEST_DIRECTORY"/lib-git-daemon.sh
+
+start_git_daemon --export-all
+
+repo=$GIT_DAEMON_DOCUMENT_ROOT_PATH/repo
+
+test_expect_success "create repo served by $VERSION_B" '
+	git.b init "$repo" &&
+	git.b -C "$repo" commit --allow-empty -m one
+'
+
+test_expect_success "git:// clone with $VERSION_A and protocol v1" '
+	GIT_TRACE_PACKET=1 git.a -c protocol.version=1 clone "$GIT_DAEMON_URL/repo" child 2>log &&
+	git.a -C child log -1 --format=%s >actual &&
+	git.b -C "$repo" log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+	grep "version=1" log
+'
+
+test_expect_success "git:// fetch with $VERSION_A and protocol v1" '
+	git.b -C "$repo" commit --allow-empty -m two &&
+	git.b -C "$repo" log -1 --format=%s >expect &&
+
+	GIT_TRACE_PACKET=1 git.a -C child -c protocol.version=1 fetch 2>log &&
+	git.a -C child log -1 --format=%s FETCH_HEAD >actual &&
+
+	test_cmp expect actual &&
+	grep "version=1" log &&
+	! grep "version 1" log
+'
+
+stop_git_daemon
+
+test_expect_success "create repo served by $VERSION_B" '
+	git.b init parent &&
+	git.b -C parent commit --allow-empty -m one
+'
+
+test_expect_success "file:// clone with $VERSION_A and protocol v1" '
+	GIT_TRACE_PACKET=1 git.a -c protocol.version=1 clone --upload-pack="git.b upload-pack" parent child2 2>log &&
+	git.a -C child2 log -1 --format=%s >actual &&
+	git.b -C parent log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+	! grep "version 1" log
+'
+
+test_expect_success "file:// fetch with $VERSION_A and protocol v1" '
+	git.b -C parent commit --allow-empty -m two &&
+	git.b -C parent log -1 --format=%s >expect &&
+
+	GIT_TRACE_PACKET=1 git.a -C child2 -c protocol.version=1 fetch --upload-pack="git.b upload-pack" 2>log &&
+	git.a -C child2 log -1 --format=%s FETCH_HEAD >actual &&
+
+	test_cmp expect actual &&
+	! grep "version 1" log
+'
+
+test_done
-- 
2.14.2.920.gcf0c67979c-goog



* [PATCH v3 10/10] ssh: introduce a 'simple' ssh variant
  2017-10-03 20:14   ` [PATCH v3 00/10] " Brandon Williams
                       ` (8 preceding siblings ...)
  2017-10-03 20:15     ` [PATCH v3 09/10] i5700: add interop test for protocol transition Brandon Williams
@ 2017-10-03 20:15     ` Brandon Williams
  2017-10-03 21:42       ` Jonathan Nieder
  2017-10-04  6:20     ` [PATCH v3 00/10] protocol transition Junio C Hamano
                       ` (2 subsequent siblings)
  12 siblings, 1 reply; 161+ messages in thread
From: Brandon Williams @ 2017-10-03 20:15 UTC (permalink / raw)
  To: git
  Cc: bturner, git, gitster, jonathantanmy, jrnieder, peff, sbeller,
	Brandon Williams

When using the 'ssh' transport, the '-o' option is used to specify an
environment variable which should be set on the remote end.  This allows
git to send additional information when contacting the server,
requesting the use of a different protocol version via the
'GIT_PROTOCOL' environment variable like so: "-o SendEnv=GIT_PROTOCOL"

Unfortunately not all ssh variants support the sending of environment
variables to the remote end.  To account for this, only use the '-o'
option for ssh variants which are OpenSSH compliant.  This is done by
checking that the basename of the ssh command is 'ssh' or the ssh
variant is overridden to be 'ssh' (via the ssh.variant config).

Previously if an ssh command's basename wasn't 'plink' or
'tortoiseplink' git assumed that the command was an OpenSSH variant.
Since user configured ssh commands may not be OpenSSH compliant, tighten
this constraint and assume a variant of 'simple' if the basename of the
command doesn't match the variants known to git.  The new ssh variant
'simple' will only have the host and command to execute ([username@]host
command) passed as parameters to the ssh command.

Update the Documentation to better reflect the command-line options sent
to ssh commands based on their variant.

Reported-by: Jeffrey Yasskin <jyasskin@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 Documentation/config.txt |  27 ++++++++++--
 Documentation/git.txt    |   9 ++--
 connect.c                | 107 ++++++++++++++++++++++++++---------------------
 t/t5601-clone.sh         |   9 ++--
 t/t5700-protocol-v1.sh   |   2 +
 5 files changed, 95 insertions(+), 59 deletions(-)
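
The detection rule described above can be sketched on its own as
follows.  This is a simplified illustration, not the code in the diff:
guess_variant(), port_flag() and base_name() are hypothetical names,
only '/' is treated as a path separator, and the GIT_SSH_VARIANT /
ssh.variant override is left out.

```c
#include <string.h>
#include <strings.h>	/* strcasecmp() (POSIX) */

enum ssh_variant {
	VARIANT_SIMPLE,
	VARIANT_SSH,
	VARIANT_PLINK,
	VARIANT_PUTTY,
	VARIANT_TORTOISEPLINK,
};

/* Hypothetical helper: return the part of path after the last '/'. */
static const char *base_name(const char *path)
{
	const char *slash = strrchr(path, '/');
	return slash ? slash + 1 : path;
}

/* Guess the variant from the command's basename; unknown means simple. */
enum ssh_variant guess_variant(const char *ssh_command)
{
	const char *name = base_name(ssh_command);

	if (!strcasecmp(name, "ssh"))
		return VARIANT_SSH;
	if (!strcasecmp(name, "plink") || !strcasecmp(name, "plink.exe"))
		return VARIANT_PLINK;
	if (!strcasecmp(name, "tortoiseplink") ||
	    !strcasecmp(name, "tortoiseplink.exe"))
		return VARIANT_TORTOISEPLINK;
	return VARIANT_SIMPLE;
}

/* Port option for a variant, or NULL when no port can be passed. */
const char *port_flag(enum ssh_variant variant)
{
	switch (variant) {
	case VARIANT_SSH:
		return "-p";
	case VARIANT_PLINK:
	case VARIANT_PUTTY:
	case VARIANT_TORTOISEPLINK:
		return "-P";
	default:
		return NULL;	/* simple: only [username@]host command */
	}
}
```

Under this rule a wrapper named "uplink" falls through to the simple
variant and receives neither a port option nor -4/-6, which is the
behavior the updated t5601 test expects.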

diff --git a/Documentation/config.txt b/Documentation/config.txt
index b78747abc..0460af37e 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2084,12 +2084,31 @@ ssh.variant::
 	Depending on the value of the environment variables `GIT_SSH` or
 	`GIT_SSH_COMMAND`, or the config setting `core.sshCommand`, Git
 	auto-detects whether to adjust its command-line parameters for use
-	with plink or tortoiseplink, as opposed to the default (OpenSSH).
+	with ssh (OpenSSH), plink or tortoiseplink, as opposed to the default
+	(simple).
 +
 The config variable `ssh.variant` can be set to override this auto-detection;
-valid values are `ssh`, `plink`, `putty` or `tortoiseplink`. Any other value
-will be treated as normal ssh. This setting can be overridden via the
-environment variable `GIT_SSH_VARIANT`.
+valid values are `ssh`, `simple`, `plink`, `putty` or `tortoiseplink`. Any
+other value will be treated as normal ssh. This setting can be overridden via
+the environment variable `GIT_SSH_VARIANT`.
++
+The current command-line parameters used for each variant are as
+follows:
++
+--
+
+* `ssh` - [-p port] [-4] [-6] [-o option] [username@]host command
+
+* `simple` - [username@]host command
+
+* `plink` or `putty` - [-P port] [-4] [-6] [username@]host command
+
+* `tortoiseplink` - [-P port] [-4] [-6] -batch [username@]host command
+
+--
++
+Except for the `simple` variant, command-line parameters are likely to
+change as git gains new features.
 
 i18n.commitEncoding::
 	Character encoding the commit messages are stored in; Git itself
diff --git a/Documentation/git.txt b/Documentation/git.txt
index 7518ea3af..8bc3f2147 100644
--- a/Documentation/git.txt
+++ b/Documentation/git.txt
@@ -518,11 +518,10 @@ other
 	If either of these environment variables is set then 'git fetch'
 	and 'git push' will use the specified command instead of 'ssh'
 	when they need to connect to a remote system.
-	The command will be given exactly two or four arguments: the
-	'username@host' (or just 'host') from the URL and the shell
-	command to execute on that remote system, optionally preceded by
-	`-p` (literally) and the 'port' from the URL when it specifies
-	something other than the default SSH port.
+	The command-line parameters passed to the configured command are
+	determined by the ssh variant.  See `ssh.variant` option in
+	linkgit:git-config[1] for details.
+
 +
 `$GIT_SSH_COMMAND` takes precedence over `$GIT_SSH`, and is interpreted
 by the shell, which allows additional arguments to be included.
diff --git a/connect.c b/connect.c
index b8695a2fa..65cee49b6 100644
--- a/connect.c
+++ b/connect.c
@@ -776,37 +776,44 @@ static const char *get_ssh_command(void)
 	return NULL;
 }
 
-static int override_ssh_variant(int *port_option, int *needs_batch)
+enum ssh_variant {
+	VARIANT_SIMPLE,
+	VARIANT_SSH,
+	VARIANT_PLINK,
+	VARIANT_PUTTY,
+	VARIANT_TORTOISEPLINK,
+};
+
+static int override_ssh_variant(enum ssh_variant *ssh_variant)
 {
-	char *variant;
+	const char *variant = getenv("GIT_SSH_VARIANT");
 
-	variant = xstrdup_or_null(getenv("GIT_SSH_VARIANT"));
-	if (!variant &&
-	    git_config_get_string("ssh.variant", &variant))
+	if (!variant && git_config_get_string_const("ssh.variant", &variant))
 		return 0;
 
-	if (!strcmp(variant, "plink") || !strcmp(variant, "putty")) {
-		*port_option = 'P';
-		*needs_batch = 0;
-	} else if (!strcmp(variant, "tortoiseplink")) {
-		*port_option = 'P';
-		*needs_batch = 1;
-	} else {
-		*port_option = 'p';
-		*needs_batch = 0;
-	}
-	free(variant);
+	if (!strcmp(variant, "plink"))
+		*ssh_variant = VARIANT_PLINK;
+	else if (!strcmp(variant, "putty"))
+		*ssh_variant = VARIANT_PUTTY;
+	else if (!strcmp(variant, "tortoiseplink"))
+		*ssh_variant = VARIANT_TORTOISEPLINK;
+	else if (!strcmp(variant, "simple"))
+		*ssh_variant = VARIANT_SIMPLE;
+	else
+		*ssh_variant = VARIANT_SSH;
+
 	return 1;
 }
 
-static void handle_ssh_variant(const char *ssh_command, int is_cmdline,
-			       int *port_option, int *needs_batch)
+static enum ssh_variant determine_ssh_variant(const char *ssh_command,
+					      int is_cmdline)
 {
+	enum ssh_variant ssh_variant = VARIANT_SIMPLE;
 	const char *variant;
 	char *p = NULL;
 
-	if (override_ssh_variant(port_option, needs_batch))
-		return;
+	if (override_ssh_variant(&ssh_variant))
+		return ssh_variant;
 
 	if (!is_cmdline) {
 		p = xstrdup(ssh_command);
@@ -825,19 +832,21 @@ static void handle_ssh_variant(const char *ssh_command, int is_cmdline,
 			free(ssh_argv);
 		} else {
 			free(p);
-			return;
+			return ssh_variant;
 		}
 	}
 
-	if (!strcasecmp(variant, "plink") ||
-	    !strcasecmp(variant, "plink.exe"))
-		*port_option = 'P';
+	if (!strcasecmp(variant, "ssh"))
+		ssh_variant = VARIANT_SSH;
+	else if (!strcasecmp(variant, "plink") ||
+		 !strcasecmp(variant, "plink.exe"))
+		ssh_variant = VARIANT_PLINK;
 	else if (!strcasecmp(variant, "tortoiseplink") ||
-		 !strcasecmp(variant, "tortoiseplink.exe")) {
-		*port_option = 'P';
-		*needs_batch = 1;
-	}
+		 !strcasecmp(variant, "tortoiseplink.exe"))
+		ssh_variant = VARIANT_TORTOISEPLINK;
+
 	free(p);
+	return ssh_variant;
 }
 
 /*
@@ -937,8 +946,7 @@ struct child_process *git_connect(int fd[2], const char *url,
 		conn->in = conn->out = -1;
 		if (protocol == PROTO_SSH) {
 			const char *ssh;
-			int needs_batch = 0;
-			int port_option = 'p';
+			enum ssh_variant variant;
 			char *ssh_host = hostandport;
 			const char *port = NULL;
 			transport_check_allowed("ssh");
@@ -965,10 +973,9 @@ struct child_process *git_connect(int fd[2], const char *url,
 				die("strange hostname '%s' blocked", ssh_host);
 
 			ssh = get_ssh_command();
-			if (ssh)
-				handle_ssh_variant(ssh, 1, &port_option,
-						   &needs_batch);
-			else {
+			if (ssh) {
+				variant = determine_ssh_variant(ssh, 1);
+			} else {
 				/*
 				 * GIT_SSH is the no-shell version of
 				 * GIT_SSH_COMMAND (and must remain so for
@@ -979,32 +986,38 @@ struct child_process *git_connect(int fd[2], const char *url,
 				ssh = getenv("GIT_SSH");
 				if (!ssh)
 					ssh = "ssh";
-				else
-					handle_ssh_variant(ssh, 0,
-							   &port_option,
-							   &needs_batch);
+				variant = determine_ssh_variant(ssh, 0);
 			}
 
 			argv_array_push(&conn->args, ssh);
 
-			if (get_protocol_version_config() > 0) {
+			if (variant == VARIANT_SSH &&
+			    get_protocol_version_config() > 0) {
 				argv_array_push(&conn->args, "-o");
 				argv_array_push(&conn->args, "SendEnv=" GIT_PROTOCOL_ENVIRONMENT);
 				argv_array_pushf(&conn->env_array, GIT_PROTOCOL_ENVIRONMENT "=version=%d",
 						 get_protocol_version_config());
 			}
 
-			if (flags & CONNECT_IPV4)
-				argv_array_push(&conn->args, "-4");
-			else if (flags & CONNECT_IPV6)
-				argv_array_push(&conn->args, "-6");
-			if (needs_batch)
+			if (variant != VARIANT_SIMPLE) {
+				if (flags & CONNECT_IPV4)
+					argv_array_push(&conn->args, "-4");
+				else if (flags & CONNECT_IPV6)
+					argv_array_push(&conn->args, "-6");
+			}
+
+			if (variant == VARIANT_TORTOISEPLINK)
 				argv_array_push(&conn->args, "-batch");
-			if (port) {
-				argv_array_pushf(&conn->args,
-						 "-%c", port_option);
+
+			if (port && variant != VARIANT_SIMPLE) {
+				if (variant == VARIANT_SSH)
+					argv_array_push(&conn->args, "-p");
+				else
+					argv_array_push(&conn->args, "-P");
+
 				argv_array_push(&conn->args, port);
 			}
+
 			argv_array_push(&conn->args, ssh_host);
 		} else {
 			transport_check_allowed("file");
diff --git a/t/t5601-clone.sh b/t/t5601-clone.sh
index 9c56f771b..ee1a24c5b 100755
--- a/t/t5601-clone.sh
+++ b/t/t5601-clone.sh
@@ -312,6 +312,8 @@ setup_ssh_wrapper () {
 			"$TRASH_DIRECTORY/ssh-wrapper$X" &&
 		GIT_SSH="$TRASH_DIRECTORY/ssh-wrapper$X" &&
 		export GIT_SSH &&
+		GIT_SSH_VARIANT=ssh &&
+		export GIT_SSH_VARIANT &&
 		export TRASH_DIRECTORY &&
 		>"$TRASH_DIRECTORY"/ssh-output
 	'
@@ -320,7 +322,8 @@ setup_ssh_wrapper () {
 copy_ssh_wrapper_as () {
 	cp "$TRASH_DIRECTORY/ssh-wrapper$X" "${1%$X}$X" &&
 	GIT_SSH="${1%$X}$X" &&
-	export GIT_SSH
+	export GIT_SSH &&
+	unset GIT_SSH_VARIANT
 }
 
 expect_ssh () {
@@ -362,10 +365,10 @@ test_expect_success 'bracketed hostnames are still ssh' '
 	expect_ssh "-p 123" myhost src
 '
 
-test_expect_success 'uplink is not treated as putty' '
+test_expect_success 'uplink is treated as simple' '
 	copy_ssh_wrapper_as "$TRASH_DIRECTORY/uplink" &&
 	git clone "[myhost:123]:src" ssh-bracket-clone-uplink &&
-	expect_ssh "-p 123" myhost src
+	expect_ssh myhost src
 '
 
 test_expect_success 'plink is treated specially (as putty)' '
diff --git a/t/t5700-protocol-v1.sh b/t/t5700-protocol-v1.sh
index b0779d362..ba86a44eb 100755
--- a/t/t5700-protocol-v1.sh
+++ b/t/t5700-protocol-v1.sh
@@ -147,6 +147,8 @@ test_expect_success 'push with file:// using protocol v1' '
 test_expect_success 'setup ssh wrapper' '
 	GIT_SSH="$GIT_BUILD_DIR/t/helper/test-fake-ssh" &&
 	export GIT_SSH &&
+	GIT_SSH_VARIANT=ssh &&
+	export GIT_SSH_VARIANT &&
 	export TRASH_DIRECTORY &&
 	>"$TRASH_DIRECTORY"/ssh-output
 '
-- 
2.14.2.920.gcf0c67979c-goog



* Re: [PATCH v3 10/10] ssh: introduce a 'simple' ssh variant
  2017-10-03 20:15     ` [PATCH v3 10/10] ssh: introduce a 'simple' ssh variant Brandon Williams
@ 2017-10-03 21:42       ` Jonathan Nieder
  2017-10-16 17:18         ` Brandon Williams
  0 siblings, 1 reply; 161+ messages in thread
From: Jonathan Nieder @ 2017-10-03 21:42 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, bturner, git, gitster, jonathantanmy, peff, sbeller

Hi,

Brandon Williams wrote:

> When using the 'ssh' transport, the '-o' option is used to specify an
> environment variable which should be set on the remote end.  This allows
> git to send additional information when contacting the server,
> requesting the use of a different protocol version via the
> 'GIT_PROTOCOL' environment variable like so: "-o SendEnv=GIT_PROTOCOL"
>
> Unfortunately not all ssh variants support the sending of environment
> variables to the remote end.  To account for this, only use the '-o'
> option for ssh variants which are OpenSSH compliant.  This is done by
> checking that the basename of the ssh command is 'ssh' or the ssh
> variant is overridden to be 'ssh' (via the ssh.variant config).

This also affects -p (port), right?

What happens if I specify an ssh://host:port/path URL and the SSH
implementation is of 'simple' type?

> Previously if an ssh command's basename wasn't 'plink' or

Git's commit messages use the present tense to describe the current
state of the code, so this is "Currently". :)

> 'tortoiseplink' git assumed that the command was an OpenSSH variant.
> Since user configured ssh commands may not be OpenSSH compliant, tighten
> this constraint and assume a variant of 'simple' if the basename of the
> command doesn't match the variants known to git.  The new ssh variant
> 'simple' will only have the host and command to execute ([username@]host
> command) passed as parameters to the ssh command.
>
> Update the Documentation to better reflect the command-line options sent
> to ssh commands based on their variant.
>
> Reported-by: Jeffrey Yasskin <jyasskin@google.com>
> Signed-off-by: Brandon Williams <bmwill@google.com>

Thanks for working on this.

For background, the GIT_SSH implementation that motivated this is
https://github.com/travis-ci/dpl/blob/6c3fddfda1f2a85944c544446b068bac0a77c049/lib/dpl/provider.rb#L215,
which does not support -p or -4/-6, either.

> ---
>  Documentation/config.txt |  27 ++++++++++--
>  Documentation/git.txt    |   9 ++--
>  connect.c                | 107 ++++++++++++++++++++++++++---------------------
>  t/t5601-clone.sh         |   9 ++--
>  t/t5700-protocol-v1.sh   |   2 +
>  5 files changed, 95 insertions(+), 59 deletions(-)
[...]
> --- a/connect.c
> +++ b/connect.c
> @@ -776,37 +776,44 @@ static const char *get_ssh_command(void)
[...]
> +static enum ssh_variant determine_ssh_variant(const char *ssh_command,
> +					      int is_cmdline)
[...]
> -	if (!strcasecmp(variant, "plink") ||
> -	    !strcasecmp(variant, "plink.exe"))
> -		*port_option = 'P';
> +	if (!strcasecmp(variant, "ssh"))
> +		ssh_variant = VARIANT_SSH;

Could this handle ssh.exe, too?

[...]
> --- a/t/t5601-clone.sh
> +++ b/t/t5601-clone.sh

Can this get tests for the new defaulting behavior?  E.g.

 - default is "simple"
 - how "simple" treats an ssh://host:port/path URL
 - how "simple" treats ipv4/ipv6 switching
 - ssh defaults to "ssh"
 - if GIT_SSH=ssh, can set ssh.variant to "simple" to force the "simple"
   mode

One other worry: this (intentionally) changes the behavior of a
previously-working GIT_SSH=ssh-wrapper that wants to support
OpenSSH-style options but does not declare ssh.variant=ssh.  When
discovering this change, what should the author of such an ssh-wrapper
do?

They could instruct their users to set ssh.variant or GIT_SSH_VARIANT
to "ssh", but then they are at the mercy of any additional OpenSSH
options we may want to start using in the future (e.g., maybe we will
start passing "--" to separate options from the hostname).  So this is
not a futureproof option for them.

They could take the new default behavior or instruct their users to
set ssh.variant or GIT_SSH_VARIANT to "simple", but then they lose
support for handling alternate ports, ipv4/ipv6, and specifying -o
SendEnv to propagate GIT_PROTOCOL or other envvars.  They can handle
GIT_PROTOCOL propagation manually, but losing port support seems like
a heavy cost.

They could send a patch to define yet another variant that is
forward-compatible, for example using an interface similar to what
git-credential(1) uses.  Then they can set GIT_SSH to their
OpenSSH-style helper and GIT_FANCY_NEW_SSH to their more modern
helper, so that old Git versions could use GIT_SSH and new Git
versions could use GIT_FANCY_NEW_SSH.  This might be their best
option.  It feels odd to say that their only good way forward is to
send a patch, but on the other hand someone with such an itch is
likely to be in the best position to define an appropriate interface.

They could send a documentation patch to make more promises about the
commandline used in OpenSSH mode: e.g. setting a rule in advance about
which options can take an argument so that they can properly parse an
OpenSSH command line in a future-compatible way.

Or they could send a patch to allow passing the port in "simple"
mode, for example using an environment variable.

Am I missing another option?  What advice do we give to this person?

Thanks,
Jonathan

* Re: [PATCH v3 00/10] protocol transition
  2017-10-03 20:14   ` [PATCH v3 00/10] " Brandon Williams
                       ` (9 preceding siblings ...)
  2017-10-03 20:15     ` [PATCH v3 10/10] ssh: introduce a 'simple' ssh variant Brandon Williams
@ 2017-10-04  6:20     ` Junio C Hamano
  2017-10-10 19:39     ` [PATCH] Documentation: document Extra Parameters Jonathan Tan
  2017-10-16 17:55     ` [PATCH v4 00/11] protocol transition Brandon Williams
  12 siblings, 0 replies; 161+ messages in thread
From: Junio C Hamano @ 2017-10-04  6:20 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, bturner, git, jonathantanmy, jrnieder, peff, sbeller

Thanks.  All of my review comments from the previous round seem to
have been addressed, so this is Reviewed-by: me ;-)


* Re: [PATCH v3 03/10] protocol: introduce protocol extention mechanisms
  2017-10-03 20:15     ` [PATCH v3 03/10] protocol: introduce protocol extention mechanisms Brandon Williams
@ 2017-10-06  9:09       ` Simon Ruderich
  2017-10-06  9:40         ` Junio C Hamano
  2017-10-10 19:51       ` Jonathan Tan
  1 sibling, 1 reply; 161+ messages in thread
From: Simon Ruderich @ 2017-10-06  9:09 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, bturner, git, gitster, jonathantanmy, jrnieder, peff, sbeller

On Tue, Oct 03, 2017 at 01:15:00PM -0700, Brandon Williams wrote:
> [snip]
>
> +protocol.version::
> +	Experimental. If set, clients will attempt to communicate with a
> +	server using the specified protocol version.  If unset, no
> +	attempt will be made by the client to communicate using a
> +	particular protocol version, this results in protocol version 0
> +	being used.
> +	Supported versions:

Did you consider Stefan Beller's suggestion regarding a
(white)list of allowed versions?

On Mon, Sep 18, 2017 at 01:06:59PM -0700, Stefan Beller wrote:
> Thinking about this, how about:
>
>   If not configured, we do as we want. (i.e. Git has full control over
>   its decision making process, which for now is "favor v0 over v1 as
>   we are experimenting with v1". This strategy may change in the future
>   to "prefer highest version number that both client and server can speak".)
>
>   If it is configured, "use highest configured number from the given set".
>
> ?

It would also allow the server operator to configure only a
specific set of versions (to handle the "version x is
insecure/slow"-issue raised by Stefan Beller). The current code
always uses the latest protocol supported by the git binary.

Minor nit, s/extention/extension/ in the patch name?

Regards
Simon
-- 
+ privacy is necessary
+ using gnupg http://gnupg.org
+ public key id: 0x92FEFDB7E44C32F9

* Re: [PATCH v3 03/10] protocol: introduce protocol extention mechanisms
  2017-10-06  9:09       ` Simon Ruderich
@ 2017-10-06  9:40         ` Junio C Hamano
  2017-10-06 11:11           ` Martin Ågren
  2017-10-09  4:05           ` Martin Ågren
  0 siblings, 2 replies; 161+ messages in thread
From: Junio C Hamano @ 2017-10-06  9:40 UTC (permalink / raw)
  To: Simon Ruderich
  Cc: Brandon Williams, git, bturner, git, jonathantanmy, jrnieder,
	peff, sbeller

Simon Ruderich <simon@ruderich.org> writes:

> Did you consider Stefan Beller's suggestion regarding a
> (white)list of allowed versions?
>
> On Mon, Sep 18, 2017 at 01:06:59PM -0700, Stefan Beller wrote:
>> Thinking about this, how about:
>>
>>   If not configured, we do as we want. (i.e. Git has full control over
>>   its decision making process, which for now is "favor v0 over v1 as
>>   we are experimenting with v1". This strategy may change in the future
>>   to "prefer highest version number that both client and server can speak".)
>>
>>   If it is configured, "use highest configured number from the given set".
>>
>> ?
>
> It would also allow the server operator to configure only a
> specific set of versions (to handle the "version x is
> insecure/slow"-issue raised by Stefan Beller). The current code
> always uses the latest protocol supported by the git binary.

If we do anything less trivial than "highest supported by both" (and
I suspect we want to in the final production version), I'd prefer
the configuration to list versions one side supports in decreasing
order of preference (e.g. "v3 v0 v2"), and take the earliest from
this list that both sides know how to talk, so that we can skip
insecure versions altogether by omitting, and we can express that we
would rather avoid talking expensive versions unless there is no
other version that is understood by the other side.
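
For illustration, the preference-list selection described above can be
sketched in C roughly as follows.  This is only a sketch under stated
assumptions, not code from the series: pick_protocol_version() and the
integer-array representation of the configured list are hypothetical.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not part of git: return the first entry of
 * 'preferred' (ordered from most to least preferred) that also appears
 * in 'peer', or -1 if the two sides share no version.
 */
int pick_protocol_version(const int *preferred, size_t n_preferred,
			  const int *peer, size_t n_peer)
{
	size_t i, j;

	for (i = 0; i < n_preferred; i++)
		for (j = 0; j < n_peer; j++)
			if (preferred[i] == peer[j])
				return preferred[i];
	return -1;
}
```

With a configured list of "v3 v0 v2" and a peer that knows v0, v1 and
v2, this picks v0; v1 can never be chosen because it was omitted from
the list.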

* Re: [PATCH v3 03/10] protocol: introduce protocol extention mechanisms
  2017-10-06  9:40         ` Junio C Hamano
@ 2017-10-06 11:11           ` Martin Ågren
  2017-10-06 12:09             ` Junio C Hamano
  2017-10-10 21:00             ` Brandon Williams
  2017-10-09  4:05           ` Martin Ågren
  1 sibling, 2 replies; 161+ messages in thread
From: Martin Ågren @ 2017-10-06 11:11 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Simon Ruderich, Brandon Williams, Git Mailing List, bturner,
	Jeff Hostetler, jonathantanmy, Jonathan Nieder, Jeff King,
	Stefan Beller

On 6 October 2017 at 11:40, Junio C Hamano <gitster@pobox.com> wrote:
> Simon Ruderich <simon@ruderich.org> writes:
>
>> Did you consider Stefan Beller's suggestion regarding a
>> (white)list of allowed versions?
>>
>> On Mon, Sep 18, 2017 at 01:06:59PM -0700, Stefan Beller wrote:
>>> Thinking about this, how about:
>>>
>>>   If not configured, we do as we want. (i.e. Git has full control over
>>>   its decision making process, which for now is "favor v0 over v1 as
>>>   we are experimenting with v1". This strategy may change in the future
>>>   to "prefer highest version number that both client and server can speak".)
>>>
>>>   If it is configured, "use highest configured number from the given set".
>>>
>>> ?
>>
>> It would also allow the server operator to configure only a
>> specific set of versions (to handle the "version x is
>> insecure/slow"-issue raised by Stefan Beller). The current code
>> always uses the latest protocol supported by the git binary.
>
> If we do anything less trivial than "highest supported by both" (and
> I suspect we want to in the final production version), I'd prefer
> the configuration to list versions one side supports in decreasing
> order of preference (e.g. "v3 v0 v2"), and take the earliest from
> this list that both sides know how to talk, so that we can skip
> insecure versions altogether by omitting, and we can express that we
> would rather avoid talking expensive versions unless there is no
> other version that is understood by the other side.

Maybe I'm missing something Git-specific, but isn't the only thing that
needs to be done now, to document/specify that 1) the client should send
its list ordered by preference, 2) how preference is signalled, and 3)
that the server gets to choose?

Why would a server operator with only v0 and v1 at their disposal want
to choose v0 instead of v1, considering that -- as far as I understand
-- they are in fact the same?

Different server implementations and different server configurations
will be able to apply whatever rules they want in order to decide which
version to use. (It's not like the client can verify the choice that the
server makes.) And Brandon's "pick the highest number" will do for now.

There are many possible rules that the server could apply and they
shouldn't affect other servers or what the client does. For example, the
server can go "You seem to know lots of versions, including X and Y.
Those are the two that I really prefer, but between those two I'm not
picky, so I'll use whichever of X and Y that you seem to prefer." Unless
I've missed something, we'll never need to implement -- nor specify --
anything like that before we have learned both v2 and v3.

I guess my thinking is, there's a difference between the protocol and
the implementation.

Martin

* Re: [PATCH v3 03/10] protocol: introduce protocol extention mechanisms
  2017-10-06 11:11           ` Martin Ågren
@ 2017-10-06 12:09             ` Junio C Hamano
  2017-10-06 19:42               ` Martin Ågren
  2017-10-10 21:00             ` Brandon Williams
  1 sibling, 1 reply; 161+ messages in thread
From: Junio C Hamano @ 2017-10-06 12:09 UTC (permalink / raw)
  To: Martin Ågren
  Cc: Simon Ruderich, Brandon Williams, Git Mailing List, bturner,
	Jeff Hostetler, jonathantanmy, Jonathan Nieder, Jeff King,
	Stefan Beller

Martin Ågren <martin.agren@gmail.com> writes:

> Maybe I'm missing something Git-specific, but isn't the only thing that
> needs to be done now, to document/specify that 1) the client should send
> its list ordered by preference, 2) how preference is signalled, and 3)
> that the server gets to choose?

I think Simon's reminder of Stefan's suggestion was about specifying
something different from (1) above---it was just a list of good ones
(as opposed to ones to be avoided).  I was suggesting to tweak that to
match what you wrote above.

> Why would a server operator with only v0 and v1 at their disposal want
> to choose v0 instead of v1, considering that -- as far as I understand
> -- they are in fact the same?

Because we may later discover some reason we do not yet know that
makes v$n+1 unsuitable after we introduce it, and we need to avoid it
by preferring v$n instead?

* Re: [PATCH v3 03/10] protocol: introduce protocol extention mechanisms
  2017-10-06 12:09             ` Junio C Hamano
@ 2017-10-06 19:42               ` Martin Ågren
  2017-10-06 20:27                 ` Stefan Beller
  0 siblings, 1 reply; 161+ messages in thread
From: Martin Ågren @ 2017-10-06 19:42 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Simon Ruderich, Brandon Williams, Git Mailing List, bturner,
	Jeff Hostetler, jonathantanmy, Jonathan Nieder, Jeff King,
	Stefan Beller

On 6 October 2017 at 14:09, Junio C Hamano <gitster@pobox.com> wrote:
> Martin Ågren <martin.agren@gmail.com> writes:
>
>> Maybe I'm missing something Git-specific, but isn't the only thing that
>> needs to be done now, to document/specify that 1) the client should send
>> its list ordered by preference, 2) how preference is signalled, and 3)
>> that the server gets to choose?
>
> I think Simon's reminder of Stefan's was about specifying something
> different from (1) above---it was just a list of good ones (as
> opposed to ones to be avoided).  I was suggesting to tweak that to
> match what you wrote above.

I replied to your mail, but also to the general notion of "we need a
carefully designed configuration and fall-back strategy before we can
include this series" which I sensed (and which I hope I didn't just
misrepresent). I didn't make it very clear exactly what I was replying
to, sorry about that.

Note that my 1-3 above are not about "configuration", which I interpret
as "how does the user tell Git how to select a protocol version?", but
about the protocol, i.e., "how do the client and server agree on which
version to use?".

>> Why would a server operator with only v0 and v1 at their disposal want
>> to choose v0 instead of v1, considering that -- as far as I understand
>> -- they are in fact the same?
>
> Because we may later discover some reason we not yet know that makes
> v$n+1 unsuitable after we introduce it, and we need to avoid it by
> preferring v$n instead?

For n in general, I agree completely. For n=0, and in particular for a
Git which only knows v0 and v1, I'm not so sure. If v1 has a problem
which v0 doesn't, then to the best of my understanding, that problem
would be in this series, i.e., in the version-negotiation. And to
minimize risk in that area, we'd want to make this series more complex?
That might be the correct thing to do -- certainly, if we botch this
series totally, then we might be in big trouble going forward --, but
it's not at all obvious to me. Nor is it obvious to me that such an
escape-hatch needs to be a multi-choice, prioritized configuration.

If a fall-back mechanism to v0 is wanted on the first Git server with
v1/v0, the simplest approach might be to make the server respect
protocol.version (possibly with a default of 1!?).

I might be naive in thinking that protocol.version could be removed or
redefined at our discretion just because it's marked as "experimental".

Martin

* Re: [PATCH v3 03/10] protocol: introduce protocol extention mechanisms
  2017-10-06 19:42               ` Martin Ågren
@ 2017-10-06 20:27                 ` Stefan Beller
  2017-10-08 14:24                   ` Martin Ågren
  0 siblings, 1 reply; 161+ messages in thread
From: Stefan Beller @ 2017-10-06 20:27 UTC (permalink / raw)
  To: Martin Ågren
  Cc: Junio C Hamano, Simon Ruderich, Brandon Williams,
	Git Mailing List, Bryan Turner, Jeff Hostetler, Jonathan Tan,
	Jonathan Nieder, Jeff King

> I might be naive in thinking that protocol.version could be removed or
> redefined at our discretion just because it's marked as "experimental".

Well the redefinition might very well occur, when we now say "set to v1
to test v1 and fallback to v0 otherwise", but long term we want a white
or black list or some other protocol selection strategy encoded in this
configuration (we would not want to introduce yet another config to work
around the initial "failed experiment", would we?)

And hence I would be careful how we define the meaning of
protocol.version now.

For example we could instead now claim "protocol.version is a whitelist
of protocol versions, order is not specified. The only guarantee we're willing
to give is that no protocol is used that is not on the list".

Later we may want to either add another variable '.versionSelectionStrategy'
that helps out there or we'd just say protocol.version tries to select
the leftmost (first) protocol that both ends support.

All I was trying to say initially is that "we may try (one of) protocol.version,
but fall back to whatever (currently v0)" is too broad. We'd need to redefine
it shortly in the foreseeable future already.

* Re: [PATCH v3 03/10] protocol: introduce protocol extention mechanisms
  2017-10-06 20:27                 ` Stefan Beller
@ 2017-10-08 14:24                   ` Martin Ågren
  0 siblings, 0 replies; 161+ messages in thread
From: Martin Ågren @ 2017-10-08 14:24 UTC (permalink / raw)
  To: Stefan Beller
  Cc: Junio C Hamano, Simon Ruderich, Brandon Williams,
	Git Mailing List, Bryan Turner, Jeff Hostetler, Jonathan Tan,
	Jonathan Nieder, Jeff King

On 6 October 2017 at 22:27, Stefan Beller <sbeller@google.com> wrote:
>> I might be naive in thinking that protocol.version could be removed or
>> redefined at our discretion just because it's marked as "experimental".
>
> Well the redefinition might very well occur, when we now say "set to v1
> to test v1 and fallback to v0 otherwise", but long term we want a white
> or black list or some other protocol selection strategy encoded in this
> configuration (we would not want to introduce yet another config to work
> around the initial "failed experiment", would we?)
>
> And hence I would be careful how we define the meaning of
> protocol.version now.

Good points. If we want to go for a more general / future-proof wording
now, then we must already now implement the config-parsing as "does this
string contain the word '1'" instead of "is this string exactly '1'". If
we claim that "34 1 5" is a valid configuration, then the implementation
should accept it. (We'd probably also want to verify that there are only
integers and spaces in the string.)
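
A minimal sketch of such a check, assuming the value is a
space-separated list of non-negative integers.  The function name
version_list_contains() is hypothetical and does not exist in git:

```c
#include <ctype.h>
#include <stdlib.h>

/*
 * Hypothetical helper: 'value' is a space-separated list of
 * non-negative integers such as "34 1 5".
 * Returns 1 if 'version' is listed, 0 if it is not, and -1 if the
 * string contains anything other than digits and spaces.
 */
int version_list_contains(const char *value, long version)
{
	const char *p = value;
	int found = 0;

	while (*p) {
		char *end;
		long v;

		if (*p == ' ') {
			p++;
			continue;
		}
		if (!isdigit((unsigned char)*p))
			return -1;
		v = strtol(p, &end, 10);
		if (*end && *end != ' ')
			return -1;
		if (v == version)
			found = 1;
		p = end;
	}
	return found;
}
```

Matching whole numbers rather than substrings means that "11" does not
count as containing "1", and a value such as "1 v2" is rejected rather
than silently accepted.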

> For example we could instead now claim "protocol.version is a whitelist
> of protocol versions, order is not specified. The only guarantee we're willing
> to give is that no protocol is used that is not on the list".

If we want to be able to list more than one version, we need to define
how to signal preference from the first day, IMHO. (I know you just gave
an example; I'm simply responding with what I think makes that example
non-ideal.)

The fact that v0 is requested by lack of data, while all other
protocols (whether v1 or v34) have to be requested by presence of data,
is a bit unfortunate, and it is bound to bleed through into the
definitions, at least until v0 is simply ripped out of git.git. Ok,
this definition suggests that "1 0" will be the preferred variant for
checking basic robustness, while "1" will be how to ensure you have a
peer which knows v1.

> All I was trying to say initially is that "we may try (one of) protocol.version,
> but fall back to whatever (currently v0)" is too broad. We'd need to redefine
> it shortly in the foreseeable future already.

Yes we would. I'll post a suggestion elsewhere in the thread.

Martin

* Re: [PATCH v3 03/10] protocol: introduce protocol extention mechanisms
  2017-10-06  9:40         ` Junio C Hamano
  2017-10-06 11:11           ` Martin Ågren
@ 2017-10-09  4:05           ` Martin Ågren
  1 sibling, 0 replies; 161+ messages in thread
From: Martin Ågren @ 2017-10-09  4:05 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Simon Ruderich, Brandon Williams, Git Mailing List, Bryan Turner,
	Jeff Hostetler, Jonathan Tan, Jonathan Nieder, Jeff King,
	Stefan Beller

On 6 October 2017 at 11:40, Junio C Hamano <gitster@pobox.com> wrote:
> Simon Ruderich <simon@ruderich.org> writes:
>
>> Did you consider Stefan Beller's suggestion regarding a
>> (white)list of allowed versions?
>>
>> On Mon, Sep 18, 2017 at 01:06:59PM -0700, Stefan Beller wrote:
>>> Thinking about this, how about:
>>>
>>>   If not configured, we do as we want. (i.e. Git has full control over
>>>   its decision making process, which for now is "favor v0 over v1 as
>>>   we are experimenting with v1". This strategy may change in the future
>>>   to "prefer highest version number that both client and server can speak".)
>>>
>>>   If it is configured, "use highest configured number from the given set".
>>>
>>> ?
>>
>> It would also allow the server operator to configure only a
>> specific set of versions (to handle the "version x is
>> insecure/slow"-issue raised by Stefan Beller). The current code
>> always uses the latest protocol supported by the git binary.
>
> If we do anything less trivial than "highest supported by both" (and
> I suspect we want to in the final production version), I'd prefer
> the configuration to list versions one side supports in decreasing
> order of preference (e.g. "v3 v0 v2"), and take the earliest from
> this list that both sides know how to talk, so that we can skip
> insecure versions altogether by omitting, and we can express that we
> would rather avoid talking expensive versions unless there is no
> other version that is understood by the other side.

I think I've managed to convince myself that a blacklist would be the
most future-proof approach, simply because it cannot be overloaded with
any other meanings in the future.

If an ordering needs to be possible, that would have to go into another
config item. An ordering would open up for some interesting issues, but
at least that shouldn't be any worse because the blacklist-approach has
been taken (as opposed to a whitelist-approach).

To aid with a slow roll-out, the default blacklist could be used (start
by blacklisting v1), but after that the default list should be empty. It
should not be misused for slowly rolling out any later experimental
versions.

Letting the blacklist be different server- and client-side seems useful
for driving the experiment forwards. Post-experiment, I'm not so sure;
that just seems unnecessarily complicated.

So, here's a suggestion:

* experimental.{client,server}ProtocolV1 is "0" (don't experiment) or
  "1" (experiment).

* experimental.serverProtocolV1 has default "0". Unless early feedback
  is negative, the default is changed to "1".

* experimental.clientProtocolV1 has default "0". Switch the default to
  "1" after some time.

* Big warnings that if someone finds themselves switching to "0" they
  should get in touch.

Once we feel confident, we implement protocol.blacklist and let the
default be "". The experimental.* are simply dropped, no "aliasing" or
"transitioning". That is, we activate v0 and v1. We don't respect "0" in
a blacklist (but don't forbid it either). Once we introduce v2, sure,
but until then, some will just be tempted to blacklist v0 "to get the
modern v1" -- they will have risk, but no benefits.

Martin

* Re: [PATCH v3 01/10] connect: in ref advertisement, shallows are last
  2017-10-03 20:14     ` [PATCH v3 01/10] connect: in ref advertisement, shallows are last Brandon Williams
@ 2017-10-10 18:14       ` Jonathan Tan
  0 siblings, 0 replies; 161+ messages in thread
From: Jonathan Tan @ 2017-10-10 18:14 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, bturner, git, gitster, jrnieder, peff, sbeller

On Tue,  3 Oct 2017 13:14:58 -0700
Brandon Williams <bmwill@google.com> wrote:

> +static int process_dummy_ref(void)
> +{
> +	struct object_id oid;
> +	const char *name;
> +
> +	if (parse_oid_hex(packet_buffer, &oid, &name))
> +		return 0;
> +	if (*name != ' ')
> +		return 0;
> +	name++;
> +
> +	return !oidcmp(&null_oid, &oid) && !strcmp(name, "capabilities^{}");
> +}

Nit: If you're planning to parse_oid_hex, you can strcmp with
" capabilities^{}" directly (note the space at the start) instead of
first checking for ' ' then "capabilities^{}".
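
As a self-contained illustration of the single-strcmp form, the sketch
below replaces git's parse_oid_hex() and oidcmp() with a direct check
for 40 '0' characters.  The name is_dummy_ref() and the HEXSZ constant
are stand-ins for this example only and assume SHA-1 object names:

```c
#include <stddef.h>
#include <string.h>

#define HEXSZ 40 /* assumption: SHA-1 object names, 40 hex digits */

/*
 * Stand-in for the quoted process_dummy_ref(): instead of git's
 * parse_oid_hex() and oidcmp(), this sketch checks for 40 '0'
 * characters directly, then compares the remainder, leading space
 * included, in a single strcmp().
 */
int is_dummy_ref(const char *line)
{
	size_t i;

	/* A short line hits its NUL here and returns before overrunning. */
	for (i = 0; i < HEXSZ; i++)
		if (line[i] != '0')
			return 0;
	return !strcmp(line + HEXSZ, " capabilities^{}");
}
```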

* Re: [PATCH v3 02/10] pkt-line: add packet_write function
  2017-10-03 20:14     ` [PATCH v3 02/10] pkt-line: add packet_write function Brandon Williams
@ 2017-10-10 18:15       ` Jonathan Tan
  0 siblings, 0 replies; 161+ messages in thread
From: Jonathan Tan @ 2017-10-10 18:15 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, bturner, git, gitster, jrnieder, peff, sbeller

On Tue,  3 Oct 2017 13:14:59 -0700
Brandon Williams <bmwill@google.com> wrote:

> +void packet_write(const int fd_out, const char *buf, size_t size)

No need for "const" in "const int fd_out", I think. Same comment for the
header file.

> +{
> +	if (packet_write_gently(fd_out, buf, size))
> +		die_errno("packet write failed");
> +}

* Re: [PATCH v3 04/10] daemon: recognize hidden request arguments
  2017-10-03 20:15     ` [PATCH v3 04/10] daemon: recognize hidden request arguments Brandon Williams
@ 2017-10-10 18:24       ` Jonathan Tan
  2017-10-13 22:04         ` Brandon Williams
  0 siblings, 1 reply; 161+ messages in thread
From: Jonathan Tan @ 2017-10-10 18:24 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, bturner, git, gitster, jrnieder, peff, sbeller

On Tue,  3 Oct 2017 13:15:01 -0700
Brandon Williams <bmwill@google.com> wrote:

>  /*
>   * Read the host as supplied by the client connection.

The return value is probably worth documenting. Something like "Returns
a pointer to the character *after* the NUL byte terminating the host
argument, or extra_args if there is no host argument."

>   */
> -static void parse_host_arg(struct hostinfo *hi, char *extra_args, int buflen)
> +static char *parse_host_arg(struct hostinfo *hi, char *extra_args, int buflen)

[snip]

> +static void parse_extra_args(struct hostinfo *hi, struct argv_array *env,
> +			     char *extra_args, int buflen)
> +{
> +	const char *end = extra_args + buflen;
> +	struct strbuf git_protocol = STRBUF_INIT;
> +
> +	/* First look for the host argument */
> +	extra_args = parse_host_arg(hi, extra_args, buflen);

This works, but is a bit loose in what it accepts. I think it's better
to be tighter - in particular, if there is no host argument, we
shouldn't be looking for extra args.

> +
> +	/* Look for additional arguments placed after a second NUL byte */
> +	for (; extra_args < end; extra_args += strlen(extra_args) + 1) {

Assuming that the host argument exists, this for loop should start at
extra_args + 1 to skip the second NUL byte. This currently works
because this code is lenient towards empty strings.

> +		const char *arg = extra_args;
> +
> +		/*
> +		 * Parse the extra arguments, adding most to 'git_protocol'
> +		 * which will be used to set the 'GIT_PROTOCOL' envvar in the
> +		 * service that will be run.
> +		 *
> +		 * If there ends up being a particular arg in the future that
> +		 * git-daemon needs to parse specifically (like the 'host' arg)
> +		 * then it can be parsed here and not added to 'git_protocol'.
> +		 */
> +		if (*arg) {
> +			if (git_protocol.len > 0)
> +				strbuf_addch(&git_protocol, ':');
> +			strbuf_addstr(&git_protocol, arg);
> +		}
> +	}

But I think we shouldn't be lenient towards empty strings.

> +
> +	if (git_protocol.len > 0)
> +		argv_array_pushf(env, GIT_PROTOCOL_ENVIRONMENT "=%s",
> +				 git_protocol.buf);
> +	strbuf_release(&git_protocol);
>  }
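
To make the stricter behaviour suggested above concrete, here is a
self-contained sketch.  It is not git's parse_extra_args(): the
function parse_extra_args_strict(), its plain-buffer interface, and the
choice to reject a separator with nothing after it are all assumptions
made for this example.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical strict parser, not git's parse_extra_args().
 * 'buf' holds 'len' bytes laid out as
 *     "host=<name>" NUL [ NUL 1*( <param> NUL ) ]
 * On success, the parameters are joined with ':' into 'out' and 0 is
 * returned.  Returns -1 if the host argument is missing, a parameter
 * is empty, a string is not NUL-terminated, or 'out' is too small.
 */
int parse_extra_args_strict(const char *buf, size_t len,
			    char *out, size_t outsz)
{
	const char *p = buf, *end = buf + len;
	size_t used = 0;

	if (!outsz)
		return -1;
	out[0] = '\0';

	/* The host argument is mandatory and must be NUL-terminated. */
	if (len < 5 || memcmp(p, "host=", 5) || !memchr(p, '\0', len))
		return -1;
	p += strlen(p) + 1;
	if (p == end)
		return 0;		/* host only, no extra args */

	/* A second NUL byte introduces the extra args. */
	if (*p != '\0')
		return -1;
	p++;
	if (p == end)
		return -1;		/* separator with nothing after it */

	while (p < end) {
		size_t n;

		if (!memchr(p, '\0', (size_t)(end - p)))
			return -1;	/* unterminated string */
		n = strlen(p);
		if (!n)
			return -1;	/* empty strings are rejected */
		if (used + n + 2 > outsz)
			return -1;
		if (used)
			out[used++] = ':';
		memcpy(out + used, p, n + 1);
		used += n;
		p += n + 1;
	}
	return 0;
}
```

Given "host=myserver.com\0\0version=1\0", the sketch produces
"version=1"; a buffer without a host argument, or one with an empty
string among the extra args, is refused instead of being skipped over.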

* Re: [PATCH v3 05/10] upload-pack, receive-pack: introduce protocol version 1
  2017-10-03 20:15     ` [PATCH v3 05/10] upload-pack, receive-pack: introduce protocol version 1 Brandon Williams
@ 2017-10-10 18:28       ` Jonathan Tan
  2017-10-13 22:18         ` Brandon Williams
  0 siblings, 1 reply; 161+ messages in thread
From: Jonathan Tan @ 2017-10-10 18:28 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, bturner, git, gitster, jrnieder, peff, sbeller

On Tue,  3 Oct 2017 13:15:02 -0700
Brandon Williams <bmwill@google.com> wrote:

> +	switch (determine_protocol_version_server()) {
> +	case protocol_v1:
> +		if (advertise_refs || !stateless_rpc)
> +			packet_write_fmt(1, "version 1\n");
> +		/*
> +		 * v1 is just the original protocol with a version string,
> +		 * so just fall through after writing the version string.
> +		 */

Peff sent out at least one patch [1] that reformats fallthrough comments
to be understood by GCC. It's probably worth doing here too.

In this case, I would do the 2-comment system that Peff suggested:

	/*
	 * v1 is just the original protocol with a version string,
	 * so just fall through after writing the version string.
	 */
	if (advertise_refs || !stateless_rpc)
		packet_write_fmt(1, "version 1\n");
	/* fallthrough */

(I put the first comment before the code, so it doesn't look so weird.)

[1] https://public-inbox.org/git/20170921062541.ew67gyvrmb2ot4sf@sigill.intra.peff.net/

> +	case protocol_v0:
> +		break;
> +	default:
> +		BUG("unknown protocol version");
> +	}

* Re: [PATCH v3 07/10] connect: tell server that the client understands v1
  2017-10-03 20:15     ` [PATCH v3 07/10] connect: tell server that the client understands v1 Brandon Williams
@ 2017-10-10 18:30       ` Jonathan Tan
  2017-10-13 22:56         ` Brandon Williams
  0 siblings, 1 reply; 161+ messages in thread
From: Jonathan Tan @ 2017-10-10 18:30 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, bturner, git, gitster, jrnieder, peff, sbeller

On Tue,  3 Oct 2017 13:15:04 -0700
Brandon Williams <bmwill@google.com> wrote:

> 2. ssh://, file://
>    Set 'GIT_PROTOCOL' environment variable with the desired protocol
>    version.  With the file:// transport, 'GIT_PROTOCOL' can be set
>    explicitly in the locally running git-upload-pack or git-receive-pack
>    processes.  With the ssh:// transport and OpenSSH compliant ssh
>    programs, 'GIT_PROTOCOL' can be sent across ssh by using '-o
>    SendEnv=GIT_PROTOCOL' and having the server whitelist this
>    environment variable.

In your commit message, also mention what GIT_PROTOCOL contains
(version=?). (From this commit message alone, I would have expected a
lone integer, but that is not the case.)

Same comment for the commit message of PATCH v3 08/10.
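
To illustrate the "version=<n>" form, here is a small sketch of how a
receiver might read such a value.  It assumes a colon-separated list of
entries, as used elsewhere in this series; requested_protocol_version()
is a hypothetical name, and picking the highest listed version is this
sketch's policy rather than a statement about the patch.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: scan a colon-separated GIT_PROTOCOL value such
 * as "version=1" or "foo:version=1" and return the highest version
 * requested.  Unrecognized keys are ignored; 0 is returned when no
 * "version=<n>" entry is present (or 'value' is NULL).
 */
int requested_protocol_version(const char *value)
{
	int best = 0;

	while (value && *value) {
		size_t n = strcspn(value, ":");

		if (n > 8 && !strncmp(value, "version=", 8)) {
			char *end;
			long v = strtol(value + 8, &end, 10);

			/* accept only a number that fills the whole entry */
			if (end == value + n && v > best)
				best = (int)v;
		}
		value += n;
		if (*value == ':')
			value++;
	}
	return best;
}
```

An unset variable or a value without a "version=" entry yields 0 here,
which corresponds to the original protocol.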

* [PATCH] Documentation: document Extra Parameters
  2017-10-03 20:14   ` [PATCH v3 00/10] " Brandon Williams
                       ` (10 preceding siblings ...)
  2017-10-04  6:20     ` [PATCH v3 00/10] protocol transition Junio C Hamano
@ 2017-10-10 19:39     ` Jonathan Tan
  2017-10-13 22:26       ` Brandon Williams
  2017-10-16 17:55     ` [PATCH v4 00/11] protocol transition Brandon Williams
  12 siblings, 1 reply; 161+ messages in thread
From: Jonathan Tan @ 2017-10-10 19:39 UTC (permalink / raw)
  To: bmwill; +Cc: bturner, git, git, gitster, jonathantanmy, jrnieder, peff, sbeller

Document the server support for Extra Parameters, additional information
that the client can send in its first message to the server during a
Git client-server interaction.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
I noticed that Documentation/technical was not updated in this patch
series. It probably should be, and here is a suggestion on how to do it.

Also, I'm not sure what to call the extra sendable information. I'm open
to suggestions for a better name than Extra Parameters.
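
The pkt-line length prefixes used in the pack-protocol.txt examples in
this patch (0033 and 003e) can be cross-checked with a few lines of C.
This is a sketch for verification only; pkt_line_header() is a
hypothetical helper, not git's pkt-line API, and it only checks that
the length fits in four hex digits.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Write the 4-hex-digit pkt-line length prefix for a payload of
 * 'payload_len' bytes into 'hdr' (which needs room for 5 bytes).
 * The prefix counts its own 4 bytes.  Returns -1 if the total does
 * not fit in 4 hex digits.
 */
int pkt_line_header(size_t payload_len, char hdr[5])
{
	size_t total = payload_len + 4;

	if (total > 0xffff)
		return -1;
	snprintf(hdr, 5, "%04x", (unsigned int)total);
	return 0;
}
```

"git-upload-pack /project.git\0host=myserver.com\0" is 47 bytes, so the
prefix is 47 + 4 = 51 = 0x33; adding "\0version=1\0" contributes 11
more bytes, giving 62 = 0x3e.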
---
 Documentation/technical/http-protocol.txt |  8 ++++++
 Documentation/technical/pack-protocol.txt | 43 ++++++++++++++++++++++++++-----
 2 files changed, 44 insertions(+), 7 deletions(-)

diff --git a/Documentation/technical/http-protocol.txt b/Documentation/technical/http-protocol.txt
index 1c561bdd9..a0e45f288 100644
--- a/Documentation/technical/http-protocol.txt
+++ b/Documentation/technical/http-protocol.txt
@@ -219,6 +219,10 @@ smart server reply:
    S: 003c2cb58b79488a98d2721cea644875a8dd0026b115 refs/tags/v1.0\n
    S: 003fa3c2e2402b99163d1d59756e5f207ae21cccba4c refs/tags/v1.0^{}\n
 
+The client may send Extra Parameters (see
+Documentation/technical/pack-protocol.txt) as a colon-separated string
+in the Git-Protocol HTTP header.
+
 Dumb Server Response
 ^^^^^^^^^^^^^^^^^^^^
 Dumb servers MUST respond with the dumb server reply format.
@@ -269,7 +273,11 @@ the C locale ordering.  The stream SHOULD include the default ref
 named `HEAD` as the first ref.  The stream MUST include capability
 declarations behind a NUL on the first ref.
 
+The returned response contains "version 1" if "version=1" was sent as an
+Extra Parameter.
+
   smart_reply     =  PKT-LINE("# service=$servicename" LF)
+		     *1("version 1")
 		     ref_list
 		     "0000"
   ref_list        =  empty_list / non_empty_list
diff --git a/Documentation/technical/pack-protocol.txt b/Documentation/technical/pack-protocol.txt
index ed1eae8b8..f9ebfb23e 100644
--- a/Documentation/technical/pack-protocol.txt
+++ b/Documentation/technical/pack-protocol.txt
@@ -39,6 +39,19 @@ communicates with that invoked process over the SSH connection.
 The file:// transport runs the 'upload-pack' or 'receive-pack'
 process locally and communicates with it over a pipe.
 
+Extra Parameters
+----------------
+
+The protocol provides a mechanism in which clients can send additional
+information in their first message to the server. These are called "Extra
+Parameters", and are supported by the Git, SSH, and HTTP protocols.
+
+Each Extra Parameter takes the form of `<key>=<value>` or `<key>`.
+
+Servers that receive any such Extra Parameters MUST ignore all
+unrecognized keys. Currently, the only Extra Parameter recognized is
+"version=1".
+
 Git Transport
 -------------
 
@@ -46,18 +59,25 @@ The Git transport starts off by sending the command and repository
 on the wire using the pkt-line format, followed by a NUL byte and a
 hostname parameter, terminated by a NUL byte.
 
-   0032git-upload-pack /project.git\0host=myserver.com\0
+   0033git-upload-pack /project.git\0host=myserver.com\0
+
+The transport may send Extra Parameters by adding an additional NUL
+byte, and then adding one or more NUL-terminated strings:
+
+   003egit-upload-pack /project.git\0host=myserver.com\0\0version=1\0
 
 --
-   git-proto-request = request-command SP pathname NUL [ host-parameter NUL ]
+   git-proto-request = request-command SP pathname NUL
+		       [ host-parameter NUL [ NUL extra-parameters ] ]
    request-command   = "git-upload-pack" / "git-receive-pack" /
 		       "git-upload-archive"   ; case sensitive
    pathname          = *( %x01-ff ) ; exclude NUL
    host-parameter    = "host=" hostname [ ":" port ]
+   extra-parameters  = 1*extra-parameter
+   extra-parameter   = 1*( %x01-ff ) NUL
 --
 
-Only host-parameter is allowed in the git-proto-request. Clients
-MUST NOT attempt to send additional parameters. It is used for the
+host-parameter is used for the
 git-daemon name based virtual hosting.  See --interpolated-path
 option to git daemon, with the %H/%CH format characters.
 
@@ -117,6 +137,12 @@ we execute it without the leading '/'.
 		     v
    ssh user@example.com "git-upload-pack '~alice/project.git'"
 
+Depending on the value of the `protocol.version` configuration variable,
+Git may attempt to send Extra Parameters as a colon-separated string in
+the GIT_PROTOCOL environment variable. This is done only if
+the `ssh.variant` configuration variable indicates that the ssh command
+supports passing environment variables as an argument.
+
 A few things to remember here:
 
 - The "command name" is spelled with dash (e.g. git-upload-pack), but
@@ -137,11 +163,13 @@ Reference Discovery
 -------------------
 
 When the client initially connects the server will immediately respond
-with a listing of each reference it has (all branches and tags) along
+with a version number (if "version=1" is sent as an Extra Parameter),
+and a listing of each reference it has (all branches and tags) along
 with the object name that each reference currently points to.
 
-   $ echo -e -n "0039git-upload-pack /schacon/gitbook.git\0host=example.com\0" |
+   $ echo -e -n "0045git-upload-pack /schacon/gitbook.git\0host=example.com\0\0version=1\0" |
       nc -v example.com 9418
+   000eversion 1
    00887217a7c7e582c46cec22a130adf4b9d7d950fba0 HEAD\0multi_ack thin-pack
 		side-band side-band-64k ofs-delta shallow no-progress include-tag
    00441d3fcd5ced445d1abc402225c0b8a1299641f497 refs/heads/integration
@@ -165,7 +193,8 @@ immediately after the ref itself, if presented. A conforming server
 MUST peel the ref if it's an annotated tag.
 
 ----
-  advertised-refs  =  (no-refs / list-of-refs)
+  advertised-refs  =  *1("version 1")
+		      (no-refs / list-of-refs)
 		      *shallow
 		      flush-pkt
 
-- 
2.14.2.920.gcf0c67979c-goog
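
The request framing documented in the pack-protocol.txt hunk can be
checked with a small standalone sketch. This is not code from Git:
build_proto_request() and append_bytes() are hypothetical names, and
the host argument is required here because the grammar in this patch
only allows Extra Parameters after a host-parameter. The sketch
assembles the request and fills in the four hex digits of the pkt-line
length, which counts the length field itself.

```c
#include <stdio.h>
#include <string.h>

/* Append n bytes of s to out at *len; return -1 if they do not fit. */
static int append_bytes(char *out, size_t cap, size_t *len,
			const char *s, size_t n)
{
	if (n > cap - *len)
		return -1;
	memcpy(out + *len, s, n);
	*len += n;
	return 0;
}

/*
 * Hypothetical helper: write a git:// request pkt-line into 'out'.
 * 'extra' is a NULL-terminated array of Extra Parameters, or NULL.
 * Returns the total pkt-line length, or 0 if the buffer is too small.
 */
static size_t build_proto_request(char *out, size_t cap,
				  const char *command, const char *path,
				  const char *host, const char *const *extra)
{
	size_t len = 4;		/* reserve room for the hex length prefix */
	char prefix[5];
	size_t i;

	if (cap < len)
		return 0;

	/* request-command SP pathname NUL host-parameter NUL */
	if (append_bytes(out, cap, &len, command, strlen(command)) ||
	    append_bytes(out, cap, &len, " ", 1) ||
	    append_bytes(out, cap, &len, path, strlen(path) + 1) ||
	    append_bytes(out, cap, &len, "host=", 5) ||
	    append_bytes(out, cap, &len, host, strlen(host) + 1))
		return 0;

	if (extra && extra[0]) {
		/* one additional NUL, then each parameter NUL-terminated */
		if (append_bytes(out, cap, &len, "", 1))
			return 0;
		for (i = 0; extra[i]; i++)
			if (append_bytes(out, cap, &len, extra[i],
					 strlen(extra[i]) + 1))
				return 0;
	}

	if (len > 0xffff)
		return 0;	/* would not fit in four hex digits */
	snprintf(prefix, sizeof(prefix), "%04zx", len);
	memcpy(out, prefix, 4);
	return len;
}
```

With command "git-upload-pack", path "/project.git", host
"myserver.com" and the single Extra Parameter "version=1", the function
returns 62 (0x3e); without Extra Parameters it returns 51 (0x33). Both
agree with the myserver.com examples in the hunk.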


^ permalink raw reply related	[flat|nested] 161+ messages in thread

* Re: [PATCH v3 03/10] protocol: introduce protocol extention mechanisms
  2017-10-03 20:15     ` [PATCH v3 03/10] protocol: introduce protocol extention mechanisms Brandon Williams
  2017-10-06  9:09       ` Simon Ruderich
@ 2017-10-10 19:51       ` Jonathan Tan
  1 sibling, 0 replies; 161+ messages in thread
From: Jonathan Tan @ 2017-10-10 19:51 UTC (permalink / raw)
  To: Brandon Williams; +Cc: git, bturner, git, gitster, jrnieder, peff, sbeller

On Tue,  3 Oct 2017 13:15:00 -0700
Brandon Williams <bmwill@google.com> wrote:

> +enum protocol_version determine_protocol_version_server(void)
> +{
> +	const char *git_protocol = getenv(GIT_PROTOCOL_ENVIRONMENT);
> +	enum protocol_version version = protocol_v0;
> +
> +	/*
> +	 * Determine which protocol version the client has requested.  Since
> +	 * multiple 'version' keys can be sent by the client, indicating that
> +	 * the client is okay to speak any of them, select the greatest version
> +	 * that the client has requested.  This is due to the assumption that
> +	 * the most recent protocol version will be the most state-of-the-art.
> +	 */
> +	if (git_protocol) {
> +		struct string_list list = STRING_LIST_INIT_DUP;
> +		const struct string_list_item *item;
> +		string_list_split(&list, git_protocol, ':', -1);
> +
> +		for_each_string_list_item(item, &list) {
> +			const char *value;
> +			enum protocol_version v;
> +
> +			if (skip_prefix(item->string, "version=", &value)) {

After writing some protocol docs [1], I wonder if this is also too
lenient. The code should probably die if a lone "version" (without the
equals sign) is given.

[1] https://public-inbox.org/git/20171010193956.168385-1-jonathantanmy@google.com/

> +				v = parse_protocol_version(value);
> +				if (v > version)
> +					version = v;
> +			}
> +		}
> +
> +		string_list_clear(&list, 0);
> +	}
> +
> +	return version;
> +}

Also, in your commit title, it is "extension", not "extention".
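
For readers without the Git tree at hand, the selection logic under
discussion can be approximated without string_list. This is only a
sketch under stated assumptions: select_protocol_version() and
MAX_KNOWN_VERSION are hypothetical names, versions are plain integers,
and a lone "version" key or an unknown version number is silently
ignored here (whether the real code should die on a lone "version" is
exactly the open question above).

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Assumption for this sketch: the highest version the server speaks. */
#define MAX_KNOWN_VERSION 1

/*
 * Return the highest known version requested through "version=<n>"
 * entries in a colon-separated list, or 0 if there is none.
 */
static int select_protocol_version(const char *git_protocol)
{
	int best = 0;
	const char *p = git_protocol;

	while (p && *p) {
		size_t n = strcspn(p, ":");	/* length of this entry */

		if (n > 8 && !strncmp(p, "version=", 8) &&
		    isdigit((unsigned char)p[8])) {
			char *end;
			long v = strtol(p + 8, &end, 10);

			/* the number must span the whole value */
			if (end == p + n && v <= MAX_KNOWN_VERSION && v > best)
				best = (int)v;
		}
		p += n;
		if (*p == ':')
			p++;
	}
	return best;
}
```

Under these assumptions "version=0:version=1:foo=bar" selects 1, while
"version", "version=99" and "version=1x" all fall back to 0.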

^ permalink raw reply	[flat|nested] 161+ messages in thread

* Re: [PATCH v3 03/10] protocol: introduce protocol extention mechanisms
  2017-10-06 11:11           ` Martin Ågren
  2017-10-06 12:09             ` Junio C Hamano
@ 2017-10-10 21:00             ` Brandon Williams
  2017-10-10 21:17               ` Jonathan Nieder
  1 sibling, 1 reply; 161+ messages in thread
From: Brandon Williams @ 2017-10-10 21:00 UTC (permalink / raw)
  To: Martin Ågren
  Cc: Junio C Hamano, Simon Ruderich, Git Mailing List, bturner,
	Jeff Hostetler, jonathantanmy, Jonathan Nieder, Jeff King,
	Stefan Beller

On 10/06, Martin Ågren wrote:
> On 6 October 2017 at 11:40, Junio C Hamano <gitster@pobox.com> wrote:
> > Simon Ruderich <simon@ruderich.org> writes:
> >
> >> Did you consider Stefan Beller's suggestion regarding a
> >> (white)list of allowed versions?
> >>
> >> On Mon, Sep 18, 2017 at 01:06:59PM -0700, Stefan Beller wrote:
> >>> Thinking about this, how about:
> >>>
> >>>   If not configured, we do as we want. (i.e. Git has full control over
> >>>   its decision-making process, which for now is "favor v0 over v1 as
> >>>   we are experimenting with v1". This strategy may change in the future
> >>>   to "prefer highest version number that both client and server can speak".)
> >>>
> >>>   If it is configured, "use highest configured number from the given set".
> >>>
> >>> ?
> >>
> >> It would also allow the server operator to configure only a
> >> specific set of versions (to handle the "version x is
> >> insecure/slow"-issue raised by Stefan Beller). The current code
> >> always uses the latest protocol supported by the git binary.
> >
> > If we do anything less trivial than "highest supported by both" (and
> > I suspect we want to in the final production version), I'd prefer
> > the configuration to list versions one side supports in decreasing
> > order of preference (e.g. "v3 v0 v2"), and take the earliest from
> > this list that both sides know how to talk, so that we can skip
> > insecure versions altogether by omitting, and we can express that we
> > would rather avoid talking expensive versions unless there is no
> > other version that is understood by the other side.
> 
> Maybe I'm missing something Git-specific, but isn't the only thing that
> needs to be done now, to document/specify that 1) the client should send
> its list ordered by preference, 2) how preference is signalled, and 3)
> that the server gets to choose?
> 
> Why would a server operator with only v0 and v1 at their disposal want
> to choose v0 instead of v1, considering that -- as far as I understand
> -- they are in fact the same?
> 
> Different server implementations and different server configurations
> will be able to apply whatever rules they want in order to decide which
> version to use. (It's not like the client can verify the choice that the
> server makes.) And Brandon's "pick the highest number" will do for now.
> 
> There are many possible rules that the server could apply and they
> shouldn't affect other servers or what the client does. For example, the
> server can go "You seem to know lots of versions, including X and Y.
> Those are the two that I really prefer, but between those two I'm not
> picky, so I'll use whichever of X and Y that you seem to prefer." Unless
> I've missed something, we'll never need to implement -- nor specify --
> anything like that before we have learned both v2 and v3.
> 
> I guess my thinking is, there's a difference between the protocol and
> the implementation.

I've been busy the last week or so and I probably won't get much more
time to go into more detail on this until the end of the week, so sorry
for not being super active on this thread in the past couple of days.

One of the key things about this transition is ensuring that we get it
right, because if we get it wrong and find out years later we'll have
to live with it.  So I'm excited and happy that people are taking a
close look at this.

That being said I don't think we need to worry too much about how one
version of the protocol is selected over another.  I fully expect this
to change based on many different factors.  Maybe one particular server
implementation favors v4 over v5, or another server has no preference at
all.  We may learn something later on, based on security or other
reasons, that we want to prefer one version or another.  Because of that
(and because I'm hoping that once we have a v2 built that we don't have
to move to another protocol version any time soon) I think it would be a
mistake to hard-code or design in inherent preferences that a client
expresses that servers are 'required' to respect.

So I agree with Martin here that if more complicated use cases arise we
can design in a preference system for them at a later time.
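
As a point of reference for that later design, the ordered-preference
rule quoted above (each side lists versions in decreasing order of
preference, and the earliest entry known to both is taken) is small. A
minimal sketch, where pick_version() and the use of plain integers for
versions are assumptions made for illustration and not anything in
these patches:

```c
#include <stddef.h>

/*
 * Return the first entry of 'preferred' (most preferred first) that
 * also appears in 'peer', or -1 if the two lists have nothing in common.
 */
static int pick_version(const int *preferred, size_t npref,
			const int *peer, size_t npeer)
{
	size_t i, j;

	for (i = 0; i < npref; i++)
		for (j = 0; j < npeer; j++)
			if (preferred[i] == peer[j])
				return preferred[i];
	return -1;
}
```

With a local preference of "v3 v0 v2" and a peer that knows v0, v1 and
v2, this picks v0; leaving a version out of the local list is enough to
never select it.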

Given some of this discussion though, maybe we want to change the
semantics of 'protocol.version' so that both servers and clients respect
it.  As of right now, these patches have the server always allow
protocol v0 and v1?  Though that doesn't do much right now since v1 is
the same as v0.

One other consideration that I should probably handle is that a client
doesn't do any verification right now to ensure that the protocol
version the server selects was indeed the protocol that the client
requested.  Is that something you think we need to check for?

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 161+ messages in thread

* Re: [PATCH v3 03/10] protocol: introduce protocol extention mechanisms
  2017-10-10 21:00             ` Brandon Williams
@ 2017-10-10 21:17               ` Jonathan Nieder
  2017-10-10 21:32                 ` Stefan Beller
                                   ` (2 more replies)
  0 siblings, 3 replies; 161+ messages in thread
From: Jonathan Nieder @ 2017-10-10 21:17 UTC (permalink / raw)
  To: Brandon Williams
  Cc: Martin Ågren, Junio C Hamano, Simon Ruderich,
	Git Mailing List, bturner, Jeff Hostetler, jonathantanmy,
	Jeff King, Stefan Beller

Hi,

Brandon Williams wrote:

> Given some of this discussion though, maybe we want to change the
> semantics of 'protocol.version' so that both servers and clients respect
> it.  As of right now, these patches have the server always allow
> protocol v0 and v1?  Though that doesn't do much right now since v1 is
> the same as v0.

I strongly prefer not to do that.

If we want to make the advertised protocol versions on the server side
configurable, I think it should be independent from the configuration
for protocol versions to use on the client side.  Rationale:

 - As a client-side user, I may want to (after reporting the bug, of
   course!) blacklist certain protocol versions to work around server
   bugs.  But this should not affect my behavior as a server --- in
   my role as a server, these server-side bugs have no effect on me.

 - As a server operator, I may want to (after reporting the bug, of
   course!) blacklist certain protocol versions to work around client
   bugs.  But this should not affect my behavior as a client --- in my
   role as a client, these client-side bugs have no effect on me.

Making the client-side case configurable seems important since Git is
widely used in environments where it may not be easy to control the
deployed version (so having configuration as an escape hatch is
important).

Making the server-side case configurable seems less important since
Git server operators usually have tight control over the deployed Git
version and can apply a targeted fix or workaround.

> One other consideration that I should probably handle is that a client
> doesn't do any verification right now to ensure that the protocol
> version the server selects was indeed the protocol that the client
> requested.  Is that something you think we need to check for?

Do you mean in tests, or are you referring to something else?

Thanks,
Jonathan

^ permalink raw reply	[flat|nested] 161+ messages in thread

* Re: [PATCH v3 03/10] protocol: introduce protocol extention mechanisms
  2017-10-10 21:17               ` Jonathan Nieder
@ 2017-10-10 21:32                 ` Stefan Beller
  2017-10-11  0:39                 ` Junio C Hamano
  2017-10-13 22:46                 ` Brandon Williams
  2 siblings, 0 replies; 161+ messages in thread
From: Stefan Beller @ 2017-10-10 21:32 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: Brandon Williams, Martin Ågren, Junio C Hamano,
	Simon Ruderich, Git Mailing List, Bryan Turner, Jeff Hostetler,
	Jonathan Tan, Jeff King

On Tue, Oct 10, 2017 at 2:17 PM, Jonathan Nieder <jrnieder@gmail.com> wrote:
> Hi,
>
> Brandon Williams wrote:
>
>> Given some of this discussion though, maybe we want to change the
>> semantics of 'protocol.version' so that both servers and clients respect
>> it.

I have no preference for this, but I agree with Jonathan's reasoning.


>> As of right now, these patches have the server always allow
>> protocol v0 and v1?  Though that doesn't do much right now since v1 is
>> the same as v0.

I would strongly prefer if the user configures "v0 and v1", being explicit,
if the client wants to talk either of them.

(v0 is not any more special than any future protocol after all -- from the
user's perspective, configuring protocol.version)

On the wire transfer we may want to omit the v0, but for simplicity we could
just dump the client's config of "v0 and v1" onto the wire, and the server
would either ignore the "v0 and" part (when speaking v1), or ignore it
completely (old server, speaking v0).

Given this model, we have
* a strict whitelist client-side,
* the ordering decision can be deferred until later,
  when we have an actual v2.

>> One other consideration that I should probably handle is that a client
>> doesn't do any verification right now to ensure that the protocol
>> version the server selects was indeed the protocol that the client
>> requested.  Is that something you think we need to check for?

Yes, we would want to see if the protocol version matches the configured
white list. ("Once we have a v2, I would no longer want a server talking
v0,v1 to me, because I consider v0 insecure[1], and I personally want
to rather have no communication than a v0 protocol. After all I configured
'protocol.version = v2' only")

[1] e.g. https://public-inbox.org/git/1477690790.2904.22.camel@mattmccutchen.net/
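
Such a client-side check could be sketched roughly as follows. Nothing
here is from the patches: server_version_from_line() and
version_allowed() are hypothetical names, the first response line is
assumed to be already stripped of its pkt-line length prefix, and a
first line that is not a version line is treated as v0.

```c
#include <ctype.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/*
 * Version implied by the first line of the server's response:
 * "version <n>" gives n, any other line means v0, and a version line
 * with a malformed number gives -1.
 */
static int server_version_from_line(const char *line)
{
	char *end;
	long v;

	if (strncmp(line, "version ", 8))
		return 0;	/* no version line: plain v0 */
	if (!isdigit((unsigned char)line[8]))
		return -1;
	v = strtol(line + 8, &end, 10);
	if ((*end && strcmp(end, "\n")) || v > INT_MAX)
		return -1;
	return (int)v;
}

/* Return 1 if 'version' is in the configured allow-list, else 0. */
static int version_allowed(int version, const int *allowed, size_t nr)
{
	size_t i;

	for (i = 0; i < nr; i++)
		if (allowed[i] == version)
			return 1;
	return 0;
}
```

A client configured for v2 only would then refuse to continue when the
first line is "version 1" or a bare ref advertisement, since neither 1
nor 0 is in its list.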

> Do you mean in tests, or are you referring to something else?

A test would be lovely.

Thanks,
Stefan

^ permalink raw reply	[flat|nested] 161+ messages in thread

* Re: [PATCH v3 03/10] protocol: introduce protocol extention mechanisms
  2017-10-10 21:17               ` Jonathan Nieder
  2017-10-10 21:32                 ` Stefan Beller
@ 2017-10-11  0:39                 ` Junio C Hamano
  2017-10-13 22:46                 ` Brandon Williams
  2 siblings, 0 replies; 161+ messages in thread
From: Junio C Hamano @ 2017-10-11  0:39 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: Brandon Williams, Martin Ågren, Simon Ruderich,
	Git Mailing List, bturner, Jeff Hostetler, jonathantanmy,
	Jeff King, Stefan Beller

Jonathan Nieder <jrnieder@gmail.com> writes:

> If we want to make the advertised protocol versions on the server side
> configurable, I think it should be independent from the configuration
> for protocol versions to use on the client side.  Rationale:
>
>  - As a client-side user, I may want to (after reporting the bug, of
>    course!) blacklist certain protocol versions to work around server
>    bugs.  But this should not affect my behavior as a server --- in
>    my role as a server, these server-side bugs have no effect on me.
>
>  - As a server operator, I may want to (after reporting the bug, of
>    course!) blacklist certain protocol versions to work around client
>    bugs.  But this should not affect my behavior as a client --- in my
>    role as a client, these client-side bugs have no effect on me.

Good point.  

> Making the client-side case configurable seems important since Git is
> widely used in environments where it may not be easy to control the
> deployed version (so having configuration as an escape hatch is
> important).
>
> Making the server-side case configurable seems less important since
> Git server operators usually have tight control over the deployed Git
> version and can apply a targeted fix or workaround.

The above also suggests that the configuration variable that lets
you specify the protocol version should hint in its name which
direction of the protocol it controls.  Even if we decide that we'd
add only the variable for the initiator side for now, if it is
possible that we later may want to add another for the acceptor
side, we need to do it right from day one.

Perhaps protocol.connectVersion(s) vs protocol.acceptVersion(s) or
something like that?

^ permalink raw reply	[flat|nested] 161+ messages in thread

* Re: [PATCH v3 04/10] daemon: recognize hidden request arguments
  2017-10-10 18:24       ` Jonathan Tan
@ 2017-10-13 22:04         ` Brandon Williams
  0 siblings, 0 replies; 161+ messages in thread
From: Brandon Williams @ 2017-10-13 22:04 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, bturner, git, gitster, jrnieder, peff, sbeller

On 10/10, Jonathan Tan wrote:
> On Tue,  3 Oct 2017 13:15:01 -0700
> Brandon Williams <bmwill@google.com> wrote:
> 
> >  /*
> >   * Read the host as supplied by the client connection.
> 
> The return value is probably worth documenting. Something like "Returns
> a pointer to the character *after* the NUL byte terminating the host
> argument, or extra_args if there is no host argument."

I can add that comment.

> 
> >   */
> > -static void parse_host_arg(struct hostinfo *hi, char *extra_args, int buflen)
> > +static char *parse_host_arg(struct hostinfo *hi, char *extra_args, int buflen)
> 
> [snip]
> 
> > +static void parse_extra_args(struct hostinfo *hi, struct argv_array *env,
> > +			     char *extra_args, int buflen)
> > +{
> > +	const char *end = extra_args + buflen;
> > +	struct strbuf git_protocol = STRBUF_INIT;
> > +
> > +	/* First look for the host argument */
> > +	extra_args = parse_host_arg(hi, extra_args, buflen);
> 
> This works, but is a bit loose in what it accepts. I think it's better
> to be tighter - in particular, if there is no host argument, we
> shouldn't be looking for extra args.

I disagree, you shouldn't be precluded from using protocol v2 if you
don't include a host argument.

> 
> > +
> > +	/* Look for additional arguments placed after a second NUL byte */
> > +	for (; extra_args < end; extra_args += strlen(extra_args) + 1) {
> 
> Assuming that the host argument exists, this for loop should start at
> extra_args + 1 to skip the second NUL byte. This currently works
> because this code is lenient towards empty strings.

Being lenient towards empty strings is fine, I don't see any reason why
we should disallow them.  Also, this code already
requires that there be the second NUL byte because if there isn't then
the code which parses the host argument would fail out.

> 
> > +		const char *arg = extra_args;
> > +
> > +		/*
> > +		 * Parse the extra arguments, adding most to 'git_protocol'
> > +		 * which will be used to set the 'GIT_PROTOCOL' envvar in the
> > +		 * service that will be run.
> > +		 *
> > +		 * If there ends up being a particular arg in the future that
> > +		 * git-daemon needs to parse specifically (like the 'host' arg)
> > +		 * then it can be parsed here and not added to 'git_protocol'.
> > +		 */
> > +		if (*arg) {
> > +			if (git_protocol.len > 0)
> > +				strbuf_addch(&git_protocol, ':');
> > +			strbuf_addstr(&git_protocol, arg);
> > +		}
> > +	}
> 
> But I think we shouldn't be lenient towards empty strings.

Why not? I see no issue with allowing them.  In fact if we error out we
could be painting ourselves into a corner much like how we did with the
host parsing logic.

> 
> > +
> > +	if (git_protocol.len > 0)
> > +		argv_array_pushf(env, GIT_PROTOCOL_ENVIRONMENT "=%s",
> > +				 git_protocol.buf);
> > +	strbuf_release(&git_protocol);
> >  }
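
The loop quoted above boils down to: walk the NUL-separated strings,
skip the empty ones, and join the rest with ':' for GIT_PROTOCOL. A
standalone sketch of that behaviour without strbuf or argv_array;
join_extra_args() is a hypothetical name, not the daemon's function,
and the fixed-size output buffer is an assumption of the sketch:

```c
#include <string.h>

/*
 * Join the NUL-terminated strings found in [args, args + len) with ':'
 * into 'out', skipping empty strings.  Returns 0 on success or -1 if
 * 'out' (capacity 'cap') is too small.
 */
static int join_extra_args(const char *args, size_t len,
			   char *out, size_t cap)
{
	const char *end = args + len;
	size_t used = 0;

	if (!cap)
		return -1;
	out[0] = '\0';

	while (args < end) {
		const char *nul = memchr(args, '\0', (size_t)(end - args));
		size_t n = nul ? (size_t)(nul - args) : (size_t)(end - args);

		if (n) {
			/* separator (if any) + string + trailing NUL */
			if (used + (used ? 1 : 0) + n + 1 > cap)
				return -1;
			if (used)
				out[used++] = ':';
			memcpy(out + used, args, n);
			used += n;
			out[used] = '\0';
		}
		if (!nul)
			break;	/* unterminated tail: nothing follows */
		args = nul + 1;
	}
	return 0;
}
```

Because empty strings are skipped, the result is the same whether or
not the walk starts on the second NUL byte, which is the leniency being
discussed here.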

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 161+ messages in thread

* Re: [PATCH v3 05/10] upload-pack, receive-pack: introduce protocol version 1
  2017-10-10 18:28       ` Jonathan Tan
@ 2017-10-13 22:18         ` Brandon Williams
  0 siblings, 0 replies; 161+ messages in thread
From: Brandon Williams @ 2017-10-13 22:18 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, bturner, git, gitster, jrnieder, peff, sbeller

On 10/10, Jonathan Tan wrote:
> On Tue,  3 Oct 2017 13:15:02 -0700
> Brandon Williams <bmwill@google.com> wrote:
> 
> > +	switch (determine_protocol_version_server()) {
> > +	case protocol_v1:
> > +		if (advertise_refs || !stateless_rpc)
> > +			packet_write_fmt(1, "version 1\n");
> > +		/*
> > +		 * v1 is just the original protocol with a version string,
> > +		 * so just fall through after writing the version string.
> > +		 */
> 
> Peff sent out at least one patch [1] that reformats fallthrough comments
> to be understood by GCC. It's probably worth doing here too.
> 
> In this case, I would do the 2-comment system that Peff suggested:
> 
> 	/*
> 	 * v1 is just the original protocol with a version string,
> 	 * so just fall through after writing the version string.
> 	 */
> 	if (advertise_refs || !stateless_rpc)
> 		packet_write_fmt(1, "version 1\n");
> 	/* fallthrough */
> 
> (I put the first comment before the code, so it doesn't look so weird.)

Sounds good.

> 
> [1] https://public-inbox.org/git/20170921062541.ew67gyvrmb2ot4sf@sigill.intra.peff.net/
> 
> > +	case protocol_v0:
> > +		break;
> > +	default:

I'm also going to change this from 'default' to
'protocol_unknown_version'; that way we get a compiler error instead of a
run-time BUG when introducing a new protocol version number.

> > +		BUG("unknown protocol version");
> > +	}
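
Putting the two suggestions together, the switch might end up shaped
roughly like the sketch below. It is standalone and illustrative only:
the enum values are assumed for the example, write_version_line() is a
hypothetical stand-in that writes into a buffer instead of calling
packet_write_fmt(), and the advertise_refs/stateless_rpc condition is
left out.

```c
#include <stdio.h>

/* Values assumed for this sketch. */
enum protocol_version {
	protocol_unknown_version = -1,
	protocol_v0 = 0,
	protocol_v1 = 1
};

/*
 * Write the optional version line for 'v' into 'out' (cap must be > 0).
 * Returns the number of bytes written, or -1 for an unknown version.
 */
static int write_version_line(enum protocol_version v, char *out, size_t cap)
{
	int n = 0;

	out[0] = '\0';
	switch (v) {
	case protocol_v1:
		/*
		 * v1 is just the original protocol with a version string,
		 * so just fall through after writing the version string.
		 */
		n = snprintf(out, cap, "version 1\n");
		/* fallthrough */
	case protocol_v0:
		break;
	case protocol_unknown_version:
		return -1;
	}
	return n;
}
```

With every enumerator listed and no 'default' label, a compiler that
checks switch coverage (for example GCC with -Wswitch) can flag the
switch as soon as a new enumerator is added.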

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 161+ messages in thread

* Re: [PATCH] Documentation: document Extra Parameters
  2017-10-10 19:39     ` [PATCH] Documentation: document Extra Parameters Jonathan Tan
@ 2017-10-13 22:26       ` Brandon Williams
  0 siblings, 0 replies; 161+ messages in thread
From: Brandon Williams @ 2017-10-13 22:26 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: bturner, git, git, gitster, jrnieder, peff, sbeller

On 10/10, Jonathan Tan wrote:
> Document the server support for Extra Parameters, additional information
> that the client can send in its first message to the server during a
> Git client-server interaction.
> 
> Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
> ---
> I noticed that Documentation/technical was not updated in this patch
> series. It probably should, and here is a suggestion on how to do it.
> 
> Also, I'm not sure what to call the extra sendable information. I'm open
> to suggestions for a better name than Extra Parameters.

Thanks for writing this up.

> ---
>  Documentation/technical/http-protocol.txt |  8 ++++++
>  Documentation/technical/pack-protocol.txt | 43 ++++++++++++++++++++++++++-----
>  2 files changed, 44 insertions(+), 7 deletions(-)
> 
> diff --git a/Documentation/technical/http-protocol.txt b/Documentation/technical/http-protocol.txt
> index 1c561bdd9..a0e45f288 100644
> --- a/Documentation/technical/http-protocol.txt
> +++ b/Documentation/technical/http-protocol.txt
> @@ -219,6 +219,10 @@ smart server reply:
>     S: 003c2cb58b79488a98d2721cea644875a8dd0026b115 refs/tags/v1.0\n
>     S: 003fa3c2e2402b99163d1d59756e5f207ae21cccba4c refs/tags/v1.0^{}\n
>  
> +The client may send Extra Parameters (see
> +Documentation/technical/pack-protocol.txt) as a colon-separated string
> +in the Git-Protocol HTTP header.
> +
>  Dumb Server Response
>  ^^^^^^^^^^^^^^^^^^^^
>  Dumb servers MUST respond with the dumb server reply format.
> @@ -269,7 +273,11 @@ the C locale ordering.  The stream SHOULD include the default ref
>  named `HEAD` as the first ref.  The stream MUST include capability
>  declarations behind a NUL on the first ref.
>  
> +The returned response contains "version 1" if "version=1" was sent as an
> +Extra Parameter.
> +
>    smart_reply     =  PKT-LINE("# service=$servicename" LF)
> +		     *1("version 1")
>  		     ref_list
>  		     "0000"
>    ref_list        =  empty_list / non_empty_list
> diff --git a/Documentation/technical/pack-protocol.txt b/Documentation/technical/pack-protocol.txt
> index ed1eae8b8..f9ebfb23e 100644
> --- a/Documentation/technical/pack-protocol.txt
> +++ b/Documentation/technical/pack-protocol.txt
> @@ -39,6 +39,19 @@ communicates with that invoked process over the SSH connection.
>  The file:// transport runs the 'upload-pack' or 'receive-pack'
>  process locally and communicates with it over a pipe.
>  
> +Extra Parameters
> +----------------
> +
> +The protocol provides a mechanism in which clients can send additional
> +information in their first message to the server. These are called "Extra
> +Parameters", and are supported by the Git, SSH, and HTTP protocols.
> +
> +Each Extra Parameter takes the form of `<key>=<value>` or `<key>`.
> +
> +Servers that receive any such Extra Parameters MUST ignore all
> +unrecognized keys. Currently, the only Extra Parameter recognized is
> +"version=1".
> +
>  Git Transport
>  -------------
>  
> @@ -46,18 +59,25 @@ The Git transport starts off by sending the command and repository
>  on the wire using the pkt-line format, followed by a NUL byte and a
>  hostname parameter, terminated by a NUL byte.
>  
> -   0032git-upload-pack /project.git\0host=myserver.com\0
> +   0033git-upload-pack /project.git\0host=myserver.com\0
> +
> +The transport may send Extra Parameters by adding an additional NUL
> +byte, and then adding one or more NUL-terminated strings:
> +
> +   003egit-upload-pack /project.git\0host=myserver.com\0\0version=1\0
>  
>  --
> -   git-proto-request = request-command SP pathname NUL [ host-parameter NUL ]
> +   git-proto-request = request-command SP pathname NUL
> +		       [ host-parameter NUL [ NUL extra-parameters ] ]

This should probably be "[ host-parameter NUL ] [ NUL extra-parameters ]"
because we don't want to require sending a host parameter in order to
send extra parameters.

>     request-command   = "git-upload-pack" / "git-receive-pack" /
>  		       "git-upload-archive"   ; case sensitive
>     pathname          = *( %x01-ff ) ; exclude NUL
>     host-parameter    = "host=" hostname [ ":" port ]
> +   extra-parameters  = 1*extra-parameter
> +   extra-parameter   = 1*( %x01-ff ) NUL
>  --
>  
> -Only host-parameter is allowed in the git-proto-request. Clients
> -MUST NOT attempt to send additional parameters. It is used for the
> +host-parameter is used for the
>  git-daemon name based virtual hosting.  See --interpolated-path
>  option to git daemon, with the %H/%CH format characters.
>  
> @@ -117,6 +137,12 @@ we execute it without the leading '/'.
>  		     v
>     ssh user@example.com "git-upload-pack '~alice/project.git'"
>  
> +Depending on the value of the `protocol.version` configuration variable,
> +Git may attempt to send Extra Parameters as a colon-separated string in
> +the GIT_PROTOCOL environment variable. This is done only if
> +the `ssh.variant` configuration variable indicates that the ssh command
> +supports passing environment variables as an argument.
> +
>  A few things to remember here:
>  
>  - The "command name" is spelled with dash (e.g. git-upload-pack), but
> @@ -137,11 +163,13 @@ Reference Discovery
>  -------------------
>  
>  When the client initially connects the server will immediately respond
> -with a listing of each reference it has (all branches and tags) along
> +with a version number (if "version=1" is sent as an Extra Parameter),
> +and a listing of each reference it has (all branches and tags) along
>  with the object name that each reference currently points to.
>  
> -   $ echo -e -n "0039git-upload-pack /schacon/gitbook.git\0host=example.com\0" |
> +   $ echo -e -n "0045git-upload-pack /schacon/gitbook.git\0host=example.com\0\0version=1\0" |
>        nc -v example.com 9418
> +   000eversion 1
>     00887217a7c7e582c46cec22a130adf4b9d7d950fba0 HEAD\0multi_ack thin-pack
>  		side-band side-band-64k ofs-delta shallow no-progress include-tag
>     00441d3fcd5ced445d1abc402225c0b8a1299641f497 refs/heads/integration
> @@ -165,7 +193,8 @@ immediately after the ref itself, if presented. A conforming server
>  MUST peel the ref if it's an annotated tag.
>  
>  ----
> -  advertised-refs  =  (no-refs / list-of-refs)
> +  advertised-refs  =  *1("version 1")
> +		      (no-refs / list-of-refs)
>  		      *shallow
>  		      flush-pkt
>  
> -- 
> 2.14.2.920.gcf0c67979c-goog
> 

-- 
Brandon Williams

^ permalink raw reply	[flat|nested] 161+ messages in thread

* Re: [PATCH v3 03/10] protocol: introduce protocol extention mechanisms
  2017-10-10 21:17               ` Jonathan Nieder
  2017-10-10 21:32                 ` Stefan Beller
  2017-10-11  0:39                 ` Junio C Hamano
@ 2017-10-13 22:46                 ` Brandon Williams
  2 siblings, 0 replies; 161+ messages in thread
From: Brandon Williams @ 2017-10-13 22:46 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: Martin Ågren, Junio C Hamano, Simon Ruderich,
	Git Mailing List, bturner, Jeff Hostetler, jonathantanmy,
	Jeff King, Stefan Beller

On 10/10, Jonathan Nieder wrote:
> Hi,
> 
> Brandon Williams wrote:
> 
> > Given some of this discussion though, maybe we want to change the
> > semantics of 'protocol.version' so that both servers and clients respect
> > it.  As of right now, these patches have the server always allow
> > protocol v0 and v1?  Though that doesn't do much right now since v1 is
> > the same as v0.
> 
> I strongly prefer not to do that.
> 
> If we want to make the advertised protocol versions on the server side
> configurable, I think it should be independent from the configuration
> for protocol versions to use on the client side.  Rationale:
> 
>  - As a client-side user, I may want to (after reporting the bug, of
>    course!) blacklist certain protocol versions to work around server
>    bugs.  But this should not affect my behavior as a server --- in
>    my role as a server, these server-side bugs have no effect on me.
> 
>  - As a server operator, I may want to (after reporting the bug, of
>    course!) blacklist certain protocol versions to work around client
>    bugs.  But this should not affect my behavior as a client --- in my
>    role as a client, these client-side bugs have no effect on me.
> 
> Making the client-side case configurable seems important since Git is
> widely used in environments where it may not be easy to control the
> deployed version (so having configuration as an escape hatch is
> important).
> 
> Making the server-side case configurable seems less important since
> Git server operators usually have tight control over the deployed Git
> version and can apply a targeted fix or workaround.

This is fine with me.  Realistically, as I mentioned, all of this is
unimportant at the moment as it doesn't prevent us from moving forward
with the transition or with implementing a v2.  If we do get to a point
in the future where we need to explicitly care about blacklisting or
whitelisting protocol versions from the config then we can take care of
that then.  The important thing is that servers won't die if they see
multiple 'version=?' entries or unknown values of '?' in GIT_PROTOCOL.
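
To make that last point concrete, here is a minimal sketch of a tolerant
scan of the colon-separated GIT_PROTOCOL value.  This is not the code from
this series: the function name highest_known_version() and the policy of
picking the highest recognized version are assumptions for illustration.

```c
#include <string.h>

/*
 * Hypothetical helper (illustration only, not from the patches).
 * Scans a colon-separated list such as "version=7:foo:version=1" and
 * returns the highest version this sketch recognizes (0 or 1).
 * Unknown keys, unknown version values and repeated entries are
 * skipped rather than treated as errors.  The caller must pass a
 * non-NULL string ("" when the variable is unset).
 */
static int highest_known_version(const char *git_protocol)
{
	int best = 0;
	const char *p = git_protocol;

	while (*p) {
		/* length of the current entry, up to ':' or end of string */
		size_t len = strcspn(p, ":");

		/* only the exact entries "version=0" and "version=1" count */
		if (len == 9 && !strncmp(p, "version=", 8) &&
		    (p[8] == '0' || p[8] == '1')) {
			int v = p[8] - '0';
			if (v > best)
				best = v;
		}

		p += len;
		if (*p == ':')
			p++;	/* step over the separator */
	}
	return best;
}
```

With this shape, "version=7:foo:version=1:version=1" yields 1 and
"version=2:bar=baz" falls back to 0 instead of aborting.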


-- 
Brandon Williams


* Re: [PATCH v3 07/10] connect: tell server that the client understands v1
  2017-10-10 18:30       ` Jonathan Tan
@ 2017-10-13 22:56         ` Brandon Williams
  0 siblings, 0 replies; 161+ messages in thread
From: Brandon Williams @ 2017-10-13 22:56 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: git, bturner, git, gitster, jrnieder, peff, sbeller

On 10/10, Jonathan Tan wrote:
> On Tue,  3 Oct 2017 13:15:04 -0700
> Brandon Williams <bmwill@google.com> wrote:
> 
> > 2. ssh://, file://
> >    Set 'GIT_PROTOCOL' environment variable with the desired protocol
> >    version.  With the file:// transport, 'GIT_PROTOCOL' can be set
> >    explicitly in the locally running git-upload-pack or git-receive-pack
> >    processes.  With the ssh:// transport and OpenSSH compliant ssh
> >    programs, 'GIT_PROTOCOL' can be sent across ssh by using '-o
> >    SendEnv=GIT_PROTOCOL' and having the server whitelist this
> >    environment variable.
> 
> In your commit message, also mention what GIT_PROTOCOL contains
> (version=?). (From this commit message alone, I would have expected a
> lone integer, but that is not the case.)
> 
> Same comment for the commit message of PATCH v3 08/10.

I'll update the commit msgs.

-- 
Brandon Williams


* Re: [PATCH v3 10/10] ssh: introduce a 'simple' ssh variant
  2017-10-03 21:42       ` Jonathan Nieder
@ 2017-10-16 17:18         ` Brandon Williams
  2017-10-23 21:28           ` [PATCH 0/5] Coping with unrecognized ssh wrapper scripts in GIT_SSH Jonathan Nieder
  0 siblings, 1 reply; 161+ messages in thread
From: Brandon Williams @ 2017-10-16 17:18 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: git, bturner, git, gitster, jonathantanmy, peff, sbeller

On 10/03, Jonathan Nieder wrote:
> Hi,
> 
> Brandon Williams wrote:
> 
> > When using the 'ssh' transport, the '-o' option is used to specify an
> > environment variable which should be set on the remote end.  This allows
> > git to send additional information when contacting the server,
> > requesting the use of a different protocol version via the
> > 'GIT_PROTOCOL' environment variable like so: "-o SendEnv=GIT_PROTOCOL"
> >
> > Unfortunately not all ssh variants support the sending of environment
> > variables to the remote end.  To account for this, only use the '-o'
> > option for ssh variants which are OpenSSH compliant.  This is done by
> > checking that the basename of the ssh command is 'ssh' or the ssh
> > variant is overridden to be 'ssh' (via the ssh.variant config).
> 
> This also affects -p (port), right?

Yeah, I'll add a comment in the commit msg indicating that options like
-p, -4 and -6 are only supported by some variants.

> 
> What happens if I specify a ssh://host:port/path URL and the SSH
> implementation is of 'simple' type?

The port would only be sent if your ssh command supported it.

> 
> > Previously if an ssh command's basename wasn't 'plink' or
> 
> Git's commit messages use the present tense to describe the current
> state of the code, so this is "Currently". :)

I'll fix this :)

> 
> > 'tortoiseplink' git assumed that the command was an OpenSSH variant.
> > Since user configured ssh commands may not be OpenSSH compliant, tighten
> > this constraint and assume a variant of 'simple' if the basename of the
> > command doesn't match the variants known to git.  The new ssh variant
> > 'simple' will only have the host and command to execute ([username@]host
> > command) passed as parameters to the ssh command.
> >
> > Update the Documentation to better reflect the command-line options sent
> > to ssh commands based on their variant.
> >
> > Reported-by: Jeffrey Yasskin <jyasskin@google.com>
> > Signed-off-by: Brandon Williams <bmwill@google.com>
> 
> Thanks for working on this.
> 
> For background, the GIT_SSH implementation that motivated this is
> https://github.com/travis-ci/dpl/blob/6c3fddfda1f2a85944c544446b068bac0a77c049/lib/dpl/provider.rb#L215,
> which does not support -p or -4/-6, either.
> 
> > ---
> >  Documentation/config.txt |  27 ++++++++++--
> >  Documentation/git.txt    |   9 ++--
> >  connect.c                | 107 ++++++++++++++++++++++++++---------------------
> >  t/t5601-clone.sh         |   9 ++--
> >  t/t5700-protocol-v1.sh   |   2 +
> >  5 files changed, 95 insertions(+), 59 deletions(-)
> [...]
> > --- a/connect.c
> > +++ b/connect.c
> > @@ -776,37 +776,44 @@ static const char *get_ssh_command(void)
> [...]
> > +static enum ssh_variant determine_ssh_variant(const char *ssh_command,
> > +					      int is_cmdline)
> [...]
> > -	if (!strcasecmp(variant, "plink") ||
> > -	    !strcasecmp(variant, "plink.exe"))
> > -		*port_option = 'P';
> > +	if (!strcasecmp(variant, "ssh"))
> > +		ssh_variant = VARIANT_SSH;
> 
> Could this handle ssh.exe, too?

Yeah I'll add the additional comparison.

> 
> [...]
> > --- a/t/t5601-clone.sh
> > +++ b/t/t5601-clone.sh
> 
> Can this get tests for the new defaulting behavior?  E.g.
> 
>  - default is "simple"
>  - how "simple" treats an ssh://host:port/path URL
>  - how "simple" treats ipv4/ipv6 switching
>  - ssh defaults to "ssh"
>  - if GIT_SSH=ssh, can set ssh.variant to "simple" to force the "simple"
>    mode

I'll look into adding a few more tests.

> 
> One other worry: this (intentionally) changes the behavior of a
> previously-working GIT_SSH=ssh-wrapper that wants to support
> OpenSSH-style options but does not declare ssh.variant=ssh.  When
> discovering this change, what should the author of such an ssh-wrapper
> do?
> 
> They could instruct their users to set ssh.variant or GIT_SSH_VARIANT
> to "ssh", but then they are at the mercy of future additional options
> supported by OpenSSH we may want to start using in the future (e.g.,
> maybe we will start passing "--" to separate options from the
> hostname).  So this is not a futureproof option for them.
> 
> They could take the new default behavior or instruct their users to
> set ssh.variant or GIT_SSH_VARIANT to "simple", but then they lose
> support for handling alternate ports, ipv4/ipv6, and specifying -o
> SendEnv to propagate GIT_PROTOCOL or other envvars.  They can handle
> GIT_PROTOCOL propagation manually, but losing port support seems like
> a heavy cost.
> 
> They could send a patch to define yet another variant that is
> forward-compatible, for example using an interface similar to what
> git-credential(1) uses.  Then they can set GIT_SSH to their
> OpenSSH-style helper and GIT_FANCY_NEW_SSH to their more modern
> helper, so that old Git versions could use GIT_SSH and new Git
> versions could use GIT_FANCY_NEW_SSH.  This might be their best
> option.  It feels odd to say that their only good way forward is to
> send a patch, but on the other hand someone with such an itch is
> likely to be in the best position to define an appropriate interface.
> 
> They could send a documentation patch to make more promises about the
> commandline used in OpenSSH mode: e.g. setting a rule in advance about
> which options can take an argument so that they can properly parse an
> OpenSSH command line in a future-compatible way.
> 
> Or they could send a patch to allow passing the port in "simple"
> mode, for example using an environment variable.
> 
> Am I missing another option?  What advice do we give to this person?
> 
> Thanks,
> Jonathan

-- 
Brandon Williams


* [PATCH v4 00/11] protocol transition
  2017-10-03 20:14   ` [PATCH v3 00/10] " Brandon Williams
                       ` (11 preceding siblings ...)
  2017-10-10 19:39     ` [PATCH] Documentation: document Extra Parameters Jonathan Tan
@ 2017-10-16 17:55     ` Brandon Williams
  2017-10-16 17:55       ` [PATCH v4 01/11] connect: in ref advertisement, shallows are last Brandon Williams
                         ` (10 more replies)
  12 siblings, 11 replies; 161+ messages in thread
From: Brandon Williams @ 2017-10-16 17:55 UTC (permalink / raw)
  To: git
  Cc: martin.agren, simon, bturner, git, gitster, jonathantanmy,
	jrnieder, peff, sbeller, Brandon Williams

Changes in v4:
 * Added more tests for the new handling of ssh variants.
 * Removed the 'default' case in upload_pack and receive_pack and instead
   ensured that all enum values were accounted for.  This way when a new
   protocol version is introduced the compiler will throw an error if the new
   protocol version isn't accounted for in these switch statements.
 * Added Jonathan's Documentation patch on top of the series (with the small
   change I pointed out in reply to the patch itself)
 * A few other small changes due to reviewer comments.
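
The point about dropping the 'default' case can be sketched as follows.
The enum and function names here are hypothetical stand-ins, not the ones
in protocol.h.  With GCC or Clang the diagnostic comes from -Wswitch
(enabled by -Wall), so it is a warning unless the build also uses -Werror.

```c
#include <stddef.h>

/* Hypothetical stand-in for the protocol version enum in the series. */
enum proto_version {
	PROTO_UNKNOWN = -1,
	PROTO_V0 = 0,
	PROTO_V1 = 1
};

/*
 * Every enumerator has its own case and there is no 'default'.  If a
 * PROTO_V2 enumerator is added later without a matching case, -Wswitch
 * reports the missing case for this switch at compile time.
 */
static const char *version_banner(enum proto_version v)
{
	switch (v) {
	case PROTO_V1:
		return "version 1\n";
	case PROTO_V0:
		return "";	/* v0 sends no version line */
	case PROTO_UNKNOWN:
		return NULL;	/* caller treats this as a bug */
	}
	return NULL;	/* not reached for valid enumerators */
}
```

A 'default' label would silence that diagnostic, which is why an explicit
case for the "unknown" enumerator is the safer choice here.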


Brandon Williams (9):
  pkt-line: add packet_write function
  protocol: introduce protocol extension mechanisms
  daemon: recognize hidden request arguments
  upload-pack, receive-pack: introduce protocol version 1
  connect: teach client to recognize v1 server response
  connect: tell server that the client understands v1
  http: tell server that the client understands v1
  i5700: add interop test for protocol transition
  ssh: introduce a 'simple' ssh variant

Jonathan Tan (2):
  connect: in ref advertisement, shallows are last
  Documentation: document Extra Parameters

 Documentation/config.txt                  |  44 +++-
 Documentation/git.txt                     |  15 +-
 Documentation/technical/http-protocol.txt |   8 +
 Documentation/technical/pack-protocol.txt |  43 +++-
 Makefile                                  |   1 +
 builtin/receive-pack.c                    |  17 ++
 cache.h                                   |  10 +
 connect.c                                 | 354 ++++++++++++++++++++----------
 daemon.c                                  |  71 +++++-
 http.c                                    |  18 ++
 pkt-line.c                                |   6 +
 pkt-line.h                                |   1 +
 protocol.c                                |  79 +++++++
 protocol.h                                |  33 +++
 t/interop/i5700-protocol-transition.sh    |  68 ++++++
 t/lib-httpd/apache.conf                   |   7 +
 t/t5601-clone.sh                          |  26 ++-
 t/t5700-protocol-v1.sh                    | 294 +++++++++++++++++++++++++
 upload-pack.c                             |  20 +-
 19 files changed, 967 insertions(+), 148 deletions(-)
 create mode 100644 protocol.c
 create mode 100644 protocol.h
 create mode 100755 t/interop/i5700-protocol-transition.sh
 create mode 100755 t/t5700-protocol-v1.sh

--- interdiff with 'origin/bw/protocol-v1'


diff --git a/Documentation/technical/http-protocol.txt b/Documentation/technical/http-protocol.txt
index 1c561bdd9..a0e45f288 100644
--- a/Documentation/technical/http-protocol.txt
+++ b/Documentation/technical/http-protocol.txt
@@ -219,6 +219,10 @@ smart server reply:
    S: 003c2cb58b79488a98d2721cea644875a8dd0026b115 refs/tags/v1.0\n
    S: 003fa3c2e2402b99163d1d59756e5f207ae21cccba4c refs/tags/v1.0^{}\n
 
+The client may send Extra Parameters (see
+Documentation/technical/pack-protocol.txt) as a colon-separated string
+in the Git-Protocol HTTP header.
+
 Dumb Server Response
 ^^^^^^^^^^^^^^^^^^^^
 Dumb servers MUST respond with the dumb server reply format.
@@ -269,7 +273,11 @@ the C locale ordering.  The stream SHOULD include the default ref
 named `HEAD` as the first ref.  The stream MUST include capability
 declarations behind a NUL on the first ref.
 
+The returned response contains "version 1" if "version=1" was sent as an
+Extra Parameter.
+
   smart_reply     =  PKT-LINE("# service=$servicename" LF)
+		     *1("version 1")
 		     ref_list
 		     "0000"
   ref_list        =  empty_list / non_empty_list
diff --git a/Documentation/technical/pack-protocol.txt b/Documentation/technical/pack-protocol.txt
index ed1eae8b8..cd31edc91 100644
--- a/Documentation/technical/pack-protocol.txt
+++ b/Documentation/technical/pack-protocol.txt
@@ -39,6 +39,19 @@ communicates with that invoked process over the SSH connection.
 The file:// transport runs the 'upload-pack' or 'receive-pack'
 process locally and communicates with it over a pipe.
 
+Extra Parameters
+----------------
+
+The protocol provides a mechanism in which clients can send additional
+information in its first message to the server. These are called "Extra
+Parameters", and are supported by the Git, SSH, and HTTP protocols.
+
+Each Extra Parameter takes the form of `<key>=<value>` or `<key>`.
+
+Servers that receive any such Extra Parameters MUST ignore all
+unrecognized keys. Currently, the only Extra Parameter recognized is
+"version=1".
+
 Git Transport
 -------------
 
@@ -46,18 +59,25 @@ The Git transport starts off by sending the command and repository
 on the wire using the pkt-line format, followed by a NUL byte and a
 hostname parameter, terminated by a NUL byte.
 
-   0032git-upload-pack /project.git\0host=myserver.com\0
+   0033git-upload-pack /project.git\0host=myserver.com\0
+
+The transport may send Extra Parameters by adding an additional NUL
+byte, and then adding one or more NUL-terminated strings:
+
+   003egit-upload-pack /project.git\0host=myserver.com\0\0version=1\0
 
 --
-   git-proto-request = request-command SP pathname NUL [ host-parameter NUL ]
+   git-proto-request = request-command SP pathname NUL
+		       [ host-parameter NUL ] [ NUL extra-parameters ]
    request-command   = "git-upload-pack" / "git-receive-pack" /
 		       "git-upload-archive"   ; case sensitive
    pathname          = *( %x01-ff ) ; exclude NUL
    host-parameter    = "host=" hostname [ ":" port ]
+   extra-parameters  = 1*extra-parameter
+   extra-parameter   = 1*( %x01-ff ) NUL
 --
 
-Only host-parameter is allowed in the git-proto-request. Clients
-MUST NOT attempt to send additional parameters. It is used for the
+host-parameter is used for the
 git-daemon name based virtual hosting.  See --interpolated-path
 option to git daemon, with the %H/%CH format characters.
 
@@ -117,6 +137,12 @@ we execute it without the leading '/'.
 		     v
    ssh user@example.com "git-upload-pack '~alice/project.git'"
 
+Depending on the value of the `protocol.version` configuration variable,
+Git may attempt to send Extra Parameters as a colon-separated string in
+the GIT_PROTOCOL environment variable. This is done only if
+the `ssh.variant` configuration variable indicates that the ssh command
+supports passing environment variables as an argument.
+
 A few things to remember here:
 
 - The "command name" is spelled with dash (e.g. git-upload-pack), but
@@ -137,11 +163,13 @@ Reference Discovery
 -------------------
 
 When the client initially connects the server will immediately respond
-with a listing of each reference it has (all branches and tags) along
+with a version number (if "version=1" is sent as an Extra Parameter),
+and a listing of each reference it has (all branches and tags) along
 with the object name that each reference currently points to.
 
-   $ echo -e -n "0039git-upload-pack /schacon/gitbook.git\0host=example.com\0" |
+   $ echo -e -n "0044git-upload-pack /schacon/gitbook.git\0host=example.com\0\0version=1\0" |
       nc -v example.com 9418
+   000aversion 1
    00887217a7c7e582c46cec22a130adf4b9d7d950fba0 HEAD\0multi_ack thin-pack
 		side-band side-band-64k ofs-delta shallow no-progress include-tag
    00441d3fcd5ced445d1abc402225c0b8a1299641f497 refs/heads/integration
@@ -165,7 +193,8 @@ immediately after the ref itself, if presented. A conforming server
 MUST peel the ref if it's an annotated tag.
 
 ----
-  advertised-refs  =  (no-refs / list-of-refs)
+  advertised-refs  =  *1("version 1")
+		      (no-refs / list-of-refs)
 		      *shallow
 		      flush-pkt
 
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 94b7d29ea..839c1462d 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -1966,15 +1966,17 @@ int cmd_receive_pack(int argc, const char **argv, const char *prefix)
 
 	switch (determine_protocol_version_server()) {
 	case protocol_v1:
-		if (advertise_refs || !stateless_rpc)
-			packet_write_fmt(1, "version 1\n");
 		/*
 		 * v1 is just the original protocol with a version string,
 		 * so just fall through after writing the version string.
 		 */
+		if (advertise_refs || !stateless_rpc)
+			packet_write_fmt(1, "version 1\n");
+
+		/* fallthrough */
 	case protocol_v0:
 		break;
-	default:
+	case protocol_unknown_version:
 		BUG("unknown protocol version");
 	}
 
diff --git a/connect.c b/connect.c
index 65cee49b6..7fbd396b3 100644
--- a/connect.c
+++ b/connect.c
@@ -836,7 +836,8 @@ static enum ssh_variant determine_ssh_variant(const char *ssh_command,
 		}
 	}
 
-	if (!strcasecmp(variant, "ssh"))
+	if (!strcasecmp(variant, "ssh") ||
+	    !strcasecmp(variant, "ssh.exe"))
 		ssh_variant = VARIANT_SSH;
 	else if (!strcasecmp(variant, "plink") ||
 		 !strcasecmp(variant, "plink.exe"))
diff --git a/daemon.c b/daemon.c
index 36cc794c9..e37e343d0 100644
--- a/daemon.c
+++ b/daemon.c
@@ -582,6 +582,9 @@ static void canonicalize_client(struct strbuf *out, const char *in)
 
 /*
  * Read the host as supplied by the client connection.
+ *
+ * Returns a pointer to the character after the NUL byte terminating the host
+ * arguemnt, or 'extra_args' if there is no host arguemnt.
  */
 static char *parse_host_arg(struct hostinfo *hi, char *extra_args, int buflen)
 {
diff --git a/pkt-line.c b/pkt-line.c
index c025d0332..7006b3587 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -188,7 +188,7 @@ static int packet_write_gently(const int fd_out, const char *buf, size_t size)
 	return 0;
 }
 
-void packet_write(const int fd_out, const char *buf, size_t size)
+void packet_write(int fd_out, const char *buf, size_t size)
 {
 	if (packet_write_gently(fd_out, buf, size))
 		die_errno("packet write failed");
diff --git a/pkt-line.h b/pkt-line.h
index d9e9783b1..3dad583e2 100644
--- a/pkt-line.h
+++ b/pkt-line.h
@@ -22,7 +22,7 @@
 void packet_flush(int fd);
 void packet_write_fmt(int fd, const char *fmt, ...) __attribute__((format (printf, 2, 3)));
 void packet_buf_flush(struct strbuf *buf);
-void packet_write(const int fd_out, const char *buf, size_t size);
+void packet_write(int fd_out, const char *buf, size_t size);
 void packet_buf_write(struct strbuf *buf, const char *fmt, ...) __attribute__((format (printf, 2, 3)));
 int packet_flush_gently(int fd);
 int packet_write_fmt_gently(int fd, const char *fmt, ...) __attribute__((format (printf, 2, 3)));
diff --git a/t/t5601-clone.sh b/t/t5601-clone.sh
index ee1a24c5b..86811a0c3 100755
--- a/t/t5601-clone.sh
+++ b/t/t5601-clone.sh
@@ -309,21 +309,18 @@ test_expect_success 'clone checking out a tag' '
 setup_ssh_wrapper () {
 	test_expect_success 'setup ssh wrapper' '
 		cp "$GIT_BUILD_DIR/t/helper/test-fake-ssh$X" \
-			"$TRASH_DIRECTORY/ssh-wrapper$X" &&
-		GIT_SSH="$TRASH_DIRECTORY/ssh-wrapper$X" &&
+			"$TRASH_DIRECTORY/ssh$X" &&
+		GIT_SSH="$TRASH_DIRECTORY/ssh$X" &&
 		export GIT_SSH &&
-		GIT_SSH_VARIANT=ssh &&
-		export GIT_SSH_VARIANT &&
 		export TRASH_DIRECTORY &&
 		>"$TRASH_DIRECTORY"/ssh-output
 	'
 }
 
 copy_ssh_wrapper_as () {
-	cp "$TRASH_DIRECTORY/ssh-wrapper$X" "${1%$X}$X" &&
+	cp "$TRASH_DIRECTORY/ssh$X" "${1%$X}$X" &&
 	GIT_SSH="${1%$X}$X" &&
-	export GIT_SSH &&
-	unset GIT_SSH_VARIANT
+	export GIT_SSH
 }
 
 expect_ssh () {
@@ -365,6 +362,22 @@ test_expect_success 'bracketed hostnames are still ssh' '
 	expect_ssh "-p 123" myhost src
 '
 
+test_expect_success 'OpenSSH variant passes -4' '
+	git clone -4 "[myhost:123]:src" ssh-ipv4-clone &&
+	expect_ssh "-4 -p 123" myhost src
+'
+
+test_expect_success 'variant can be overriden' '
+	git -c ssh.variant=simple clone -4 "[myhost:123]:src" ssh-simple-clone &&
+	expect_ssh myhost src
+'
+
+test_expect_success 'simple is treated as simple' '
+	copy_ssh_wrapper_as "$TRASH_DIRECTORY/simple" &&
+	git clone -4 "[myhost:123]:src" ssh-bracket-clone-simple &&
+	expect_ssh myhost src
+'
+
 test_expect_success 'uplink is treated as simple' '
 	copy_ssh_wrapper_as "$TRASH_DIRECTORY/uplink" &&
 	git clone "[myhost:123]:src" ssh-bracket-clone-uplink &&
diff --git a/upload-pack.c b/upload-pack.c
index ef438e9c2..ef99a029c 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -1071,16 +1071,18 @@ int cmd_main(int argc, const char **argv)
 
 	switch (determine_protocol_version_server()) {
 	case protocol_v1:
-		if (advertise_refs || !stateless_rpc)
-			packet_write_fmt(1, "version 1\n");
 		/*
 		 * v1 is just the original protocol with a version string,
 		 * so just fall through after writing the version string.
 		 */
+		if (advertise_refs || !stateless_rpc)
+			packet_write_fmt(1, "version 1\n");
+
+		/* fallthrough */
 	case protocol_v0:
 		upload_pack();
 		break;
-	default:
+	case protocol_unknown_version:
 		BUG("unknown protocol version");
 	}
 

-- 
2.15.0.rc0.271.g36b669edcc-goog



* [PATCH v4 01/11] connect: in ref advertisement, shallows are last
  2017-10-16 17:55     ` [PATCH v4 00/11] protocol transition Brandon Williams
@ 2017-10-16 17:55       ` Brandon Williams
  2017-10-16 17:55       ` [PATCH v4 02/11] pkt-line: add packet_write function Brandon Williams
                         ` (9 subsequent siblings)
  10 siblings, 0 replies; 161+ messages in thread
From: Brandon Williams @ 2017-10-16 17:55 UTC (permalink / raw)
  To: git
  Cc: martin.agren, simon, bturner, git, gitster, jonathantanmy,
	jrnieder, peff, sbeller, Brandon Williams

From: Jonathan Tan <jonathantanmy@google.com>

Currently, get_remote_heads() parses the ref advertisement in one loop,
allowing refs and shallow lines to intersperse, despite this not being
allowed by the specification. Refactor get_remote_heads() to use two
loops instead, enforcing that refs come first, and then shallows.

This also makes it easier to teach get_remote_heads() to interpret other
lines in the ref advertisement, which will be done in a subsequent
patch.

As part of this change, this patch interprets capabilities only on the
first line in the ref advertisement, printing a warning message when
encountering capabilities on other lines.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 connect.c | 189 ++++++++++++++++++++++++++++++++++++++++----------------------
 1 file changed, 123 insertions(+), 66 deletions(-)
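
For readers skimming the diff, the "refs first, then shallows" ordering
can be sketched as a small state machine over already-read lines.  This is
a simplified illustration, not the patch itself: it ignores capabilities,
.have lines and the capabilities^{} dummy ref, and all names below are
hypothetical.

```c
#include <ctype.h>
#include <string.h>

enum adv_state { EXPECT_REF, EXPECT_SHALLOW };

/* True if s starts with 40 hex digits (stops safely at a NUL). */
static int is_hex40(const char *s)
{
	int i;

	for (i = 0; i < 40; i++)
		if (!isxdigit((unsigned char)s[i]))
			return 0;
	return 1;
}

/* "<40 hex> <refname>" */
static int is_ref_line(const char *line)
{
	return is_hex40(line) && line[40] == ' ' && line[41] != '\0';
}

/* "shallow <40 hex>" */
static int is_shallow_line(const char *line)
{
	return !strncmp(line, "shallow ", 8) &&
	       is_hex40(line + 8) && line[48] == '\0';
}

/*
 * Returns 0 if all ref lines precede all shallow lines, -1 otherwise.
 * Once a non-ref line is seen the state moves to EXPECT_SHALLOW and
 * never goes back, so a ref after a shallow line is rejected.
 */
static int check_advertisement(const char **lines, size_t nr)
{
	enum adv_state state = EXPECT_REF;
	size_t i;

	for (i = 0; i < nr; i++) {
		switch (state) {
		case EXPECT_REF:
			if (is_ref_line(lines[i]))
				break;
			state = EXPECT_SHALLOW;
			/* fallthrough */
		case EXPECT_SHALLOW:
			if (is_shallow_line(lines[i]))
				break;
			return -1;
		}
	}
	return 0;
}
```

The one-way state transition is what enforces the ordering; the old
single loop accepted either kind of line at any point.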

diff --git a/connect.c b/connect.c
index df56c0cbf..8e2e276b6 100644
--- a/connect.c
+++ b/connect.c
@@ -11,6 +11,7 @@
 #include "string-list.h"
 #include "sha1-array.h"
 #include "transport.h"
+#include "strbuf.h"
 
 static char *server_capabilities;
 static const char *parse_feature_value(const char *, const char *, int *);
@@ -107,6 +108,104 @@ static void annotate_refs_with_symref_info(struct ref *ref)
 	string_list_clear(&symref, 0);
 }
 
+/*
+ * Read one line of a server's ref advertisement into packet_buffer.
+ */
+static int read_remote_ref(int in, char **src_buf, size_t *src_len,
+			   int *responded)
+{
+	int len = packet_read(in, src_buf, src_len,
+			      packet_buffer, sizeof(packet_buffer),
+			      PACKET_READ_GENTLE_ON_EOF |
+			      PACKET_READ_CHOMP_NEWLINE);
+	const char *arg;
+	if (len < 0)
+		die_initial_contact(*responded);
+	if (len > 4 && skip_prefix(packet_buffer, "ERR ", &arg))
+		die("remote error: %s", arg);
+
+	*responded = 1;
+
+	return len;
+}
+
+#define EXPECTING_FIRST_REF 0
+#define EXPECTING_REF 1
+#define EXPECTING_SHALLOW 2
+
+static void process_capabilities(int *len)
+{
+	int nul_location = strlen(packet_buffer);
+	if (nul_location == *len)
+		return;
+	server_capabilities = xstrdup(packet_buffer + nul_location + 1);
+	*len = nul_location;
+}
+
+static int process_dummy_ref(void)
+{
+	struct object_id oid;
+	const char *name;
+
+	if (parse_oid_hex(packet_buffer, &oid, &name))
+		return 0;
+	if (*name != ' ')
+		return 0;
+	name++;
+
+	return !oidcmp(&null_oid, &oid) && !strcmp(name, "capabilities^{}");
+}
+
+static void check_no_capabilities(int len)
+{
+	if (strlen(packet_buffer) != len)
+		warning("Ignoring capabilities after first line '%s'",
+			packet_buffer + strlen(packet_buffer));
+}
+
+static int process_ref(int len, struct ref ***list, unsigned int flags,
+		       struct oid_array *extra_have)
+{
+	struct object_id old_oid;
+	const char *name;
+
+	if (parse_oid_hex(packet_buffer, &old_oid, &name))
+		return 0;
+	if (*name != ' ')
+		return 0;
+	name++;
+
+	if (extra_have && !strcmp(name, ".have")) {
+		oid_array_append(extra_have, &old_oid);
+	} else if (!strcmp(name, "capabilities^{}")) {
+		die("protocol error: unexpected capabilities^{}");
+	} else if (check_ref(name, flags)) {
+		struct ref *ref = alloc_ref(name);
+		oidcpy(&ref->old_oid, &old_oid);
+		**list = ref;
+		*list = &ref->next;
+	}
+	check_no_capabilities(len);
+	return 1;
+}
+
+static int process_shallow(int len, struct oid_array *shallow_points)
+{
+	const char *arg;
+	struct object_id old_oid;
+
+	if (!skip_prefix(packet_buffer, "shallow ", &arg))
+		return 0;
+
+	if (get_oid_hex(arg, &old_oid))
+		die("protocol error: expected shallow sha-1, got '%s'", arg);
+	if (!shallow_points)
+		die("repository on the other end cannot be shallow");
+	oid_array_append(shallow_points, &old_oid);
+	check_no_capabilities(len);
+	return 1;
+}
+
 /*
  * Read all the refs from the other end
  */
@@ -123,76 +222,34 @@ struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
 	 * willing to talk to us.  A hang-up before seeing any
 	 * response does not necessarily mean an ACL problem, though.
 	 */
-	int saw_response;
-	int got_dummy_ref_with_capabilities_declaration = 0;
+	int responded = 0;
+	int len;
+	int state = EXPECTING_FIRST_REF;
 
 	*list = NULL;
-	for (saw_response = 0; ; saw_response = 1) {
-		struct ref *ref;
-		struct object_id old_oid;
-		char *name;
-		int len, name_len;
-		char *buffer = packet_buffer;
-		const char *arg;
-
-		len = packet_read(in, &src_buf, &src_len,
-				  packet_buffer, sizeof(packet_buffer),
-				  PACKET_READ_GENTLE_ON_EOF |
-				  PACKET_READ_CHOMP_NEWLINE);
-		if (len < 0)
-			die_initial_contact(saw_response);
-
-		if (!len)
-			break;
-
-		if (len > 4 && skip_prefix(buffer, "ERR ", &arg))
-			die("remote error: %s", arg);
-
-		if (len == GIT_SHA1_HEXSZ + strlen("shallow ") &&
-			skip_prefix(buffer, "shallow ", &arg)) {
-			if (get_oid_hex(arg, &old_oid))
-				die("protocol error: expected shallow sha-1, got '%s'", arg);
-			if (!shallow_points)
-				die("repository on the other end cannot be shallow");
-			oid_array_append(shallow_points, &old_oid);
-			continue;
-		}
 
-		if (len < GIT_SHA1_HEXSZ + 2 || get_oid_hex(buffer, &old_oid) ||
-			buffer[GIT_SHA1_HEXSZ] != ' ')
-			die("protocol error: expected sha/ref, got '%s'", buffer);
-		name = buffer + GIT_SHA1_HEXSZ + 1;
-
-		name_len = strlen(name);
-		if (len != name_len + GIT_SHA1_HEXSZ + 1) {
-			free(server_capabilities);
-			server_capabilities = xstrdup(name + name_len + 1);
-		}
-
-		if (extra_have && !strcmp(name, ".have")) {
-			oid_array_append(extra_have, &old_oid);
-			continue;
-		}
-
-		if (!strcmp(name, "capabilities^{}")) {
-			if (saw_response)
-				die("protocol error: unexpected capabilities^{}");
-			if (got_dummy_ref_with_capabilities_declaration)
-				die("protocol error: multiple capabilities^{}");
-			got_dummy_ref_with_capabilities_declaration = 1;
-			continue;
+	while ((len = read_remote_ref(in, &src_buf, &src_len, &responded))) {
+		switch (state) {
+		case EXPECTING_FIRST_REF:
+			process_capabilities(&len);
+			if (process_dummy_ref()) {
+				state = EXPECTING_SHALLOW;
+				break;
+			}
+			state = EXPECTING_REF;
+			/* fallthrough */
+		case EXPECTING_REF:
+			if (process_ref(len, &list, flags, extra_have))
+				break;
+			state = EXPECTING_SHALLOW;
+			/* fallthrough */
+		case EXPECTING_SHALLOW:
+			if (process_shallow(len, shallow_points))
+				break;
+			die("protocol error: unexpected '%s'", packet_buffer);
+		default:
+			die("unexpected state %d", state);
 		}
-
-		if (!check_ref(name, flags))
-			continue;
-
-		if (got_dummy_ref_with_capabilities_declaration)
-			die("protocol error: unexpected ref after capabilities^{}");
-
-		ref = alloc_ref(buffer + GIT_SHA1_HEXSZ + 1);
-		oidcpy(&ref->old_oid, &old_oid);
-		*list = ref;
-		list = &ref->next;
 	}
 
 	annotate_refs_with_symref_info(*orig_list);
-- 
2.15.0.rc0.271.g36b669edcc-goog



* [PATCH v4 02/11] pkt-line: add packet_write function
  2017-10-16 17:55     ` [PATCH v4 00/11] protocol transition Brandon Williams
  2017-10-16 17:55       ` [PATCH v4 01/11] connect: in ref advertisement, shallows are last Brandon Williams
@ 2017-10-16 17:55       ` Brandon Williams
  2017-10-16 17:55       ` [PATCH v4 03/11] protocol: introduce protocol extension mechanisms Brandon Williams
                         ` (8 subsequent siblings)
  10 siblings, 0 replies; 161+ messages in thread
From: Brandon Williams @ 2017-10-16 17:55 UTC (permalink / raw)
  To: git
  Cc: martin.agren, simon, bturner, git, gitster, jonathantanmy,
	jrnieder, peff, sbeller, Brandon Williams

Add a function which can be used to write the contents of an arbitrary
buffer.  This makes it easy to build up data in a buffer before writing
the packet instead of formatting the entire contents of the packet using
'packet_write_fmt()'.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 pkt-line.c | 6 ++++++
 pkt-line.h | 1 +
 2 files changed, 7 insertions(+)
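
As background for what a pkt-line writer has to produce, here is a minimal
sketch of the framing: four lowercase hex digits giving the total length
(payload plus the four length bytes), followed by the raw payload.  The
helper name format_pkt_line() and the PKT_MAX_TOTAL macro are hypothetical;
this writes into a caller-supplied buffer rather than a file descriptor
and is not the implementation in pkt-line.c.

```c
#include <string.h>

/* Largest pkt-line allowed on the wire, including the 4-byte header. */
#define PKT_MAX_TOTAL 65520

/*
 * Hypothetical helper (illustration only).  Frames 'payload' as one
 * pkt-line in 'out' and returns the number of bytes written, or -1 if
 * the payload is too large or 'out' is too small.  The payload may
 * contain NUL bytes, so its length is passed explicitly.
 */
static long format_pkt_line(char *out, size_t out_size,
			    const char *payload, size_t payload_len)
{
	static const char hex[] = "0123456789abcdef";
	size_t total = payload_len + 4;

	if (total > PKT_MAX_TOTAL || total > out_size)
		return -1;

	/* 4-digit hex length, most significant nibble first */
	out[0] = hex[(total >> 12) & 0xf];
	out[1] = hex[(total >> 8) & 0xf];
	out[2] = hex[(total >> 4) & 0xf];
	out[3] = hex[total & 0xf];

	memcpy(out + 4, payload, payload_len);
	return (long)total;
}
```

For the 47-byte request "git-upload-pack /project.git\0host=myserver.com\0"
this produces the header "0033", matching the corrected example in
pack-protocol.txt.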

diff --git a/pkt-line.c b/pkt-line.c
index 647bbd3bc..7006b3587 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -188,6 +188,12 @@ static int packet_write_gently(const int fd_out, const char *buf, size_t size)
 	return 0;
 }
 
+void packet_write(int fd_out, const char *buf, size_t size)
+{
+	if (packet_write_gently(fd_out, buf, size))
+		die_errno("packet write failed");
+}
+
 void packet_buf_write(struct strbuf *buf, const char *fmt, ...)
 {
 	va_list args;
diff --git a/pkt-line.h b/pkt-line.h
index 66ef610fc..3dad583e2 100644
--- a/pkt-line.h
+++ b/pkt-line.h
@@ -22,6 +22,7 @@
 void packet_flush(int fd);
 void packet_write_fmt(int fd, const char *fmt, ...) __attribute__((format (printf, 2, 3)));
 void packet_buf_flush(struct strbuf *buf);
+void packet_write(int fd_out, const char *buf, size_t size);
 void packet_buf_write(struct strbuf *buf, const char *fmt, ...) __attribute__((format (printf, 2, 3)));
 int packet_flush_gently(int fd);
 int packet_write_fmt_gently(int fd, const char *fmt, ...) __attribute__((format (printf, 2, 3)));
-- 
2.15.0.rc0.271.g36b669edcc-goog



* [PATCH v4 03/11] protocol: introduce protocol extension mechanisms
  2017-10-16 17:55     ` [PATCH v4 00/11] protocol transition Brandon Williams
  2017-10-16 17:55       ` [PATCH v4 01/11] connect: in ref advertisement, shallows are last Brandon Williams
  2017-10-16 17:55       ` [PATCH v4 02/11] pkt-line: add packet_write function Brandon Williams
@ 2017-10-16 17:55       ` Brandon Williams
  2017-10-16 21:25         ` Kevin Daudt
  2017-10-16 17:55       ` [PATCH v4 04/11] daemon: recognize hidden request arguments Brandon Williams
                         ` (7 subsequent siblings)
  10 siblings, 1 reply; 161+ messages in thread
From: Brandon Williams @ 2017-10-16 17:55 UTC (permalink / raw)
  To: git
  Cc: martin.agren, simon, bturner, git, gitster, jonathantanmy,
	jrnieder, peff, sbeller, Brandon Williams

Create protocol.{c,h} and provide functions which future servers and
clients can use to determine which protocol to use or is being used.

Also introduce the 'GIT_PROTOCOL' environment variable which will be
used to communicate a colon separated list of keys with optional values
to a server.  Unknown keys and values must be tolerated.  This mechanism
is used to communicate which version of the wire protocol a client would
like to use with a server.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
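Note: the server-side selection rule described above (take the greatest
known 'version' value, ignore everything else) can be sketched without
any of git's helpers.  The function below is a hypothetical stand-in for
determine_protocol_version_server(), written for illustration only; it
takes the value of 'GIT_PROTOCOL' as a parameter instead of reading the
environment, and 'max_requested_version' is not a name used in the
patch.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical sketch: scan a colon-separated 'key[=value]' list and
 * return the greatest known "version" value, or 0 when no known version
 * is requested.  Unknown keys and unknown version values are ignored.
 */
static int max_requested_version(const char *git_protocol)
{
	int best = 0;
	const char *p = git_protocol;

	while (p && *p) {
		size_t len = strcspn(p, ":");	/* length of this entry */

		if (len > 8 && !strncmp(p, "version=", 8)) {
			const char *value = p + 8;
			size_t vlen = len - 8;

			/* Only "0" and "1" are known versions in this series. */
			if (vlen == 1 && (value[0] == '0' || value[0] == '1')) {
				int v = value[0] - '0';
				if (v > best)
					best = v;
			}
		}

		p += len;
		if (*p == ':')
			p++;	/* step over the separator */
	}

	assert(best == 0 || best == 1);
	return best;
}
```

With this rule a value such as "foo:version=0:version=1:bar=baz" selects
version 1, while "version=7" alone falls back to version 0, so a newer
client talking to this server degrades gracefully.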
 Documentation/config.txt | 17 +++++++++++
 Documentation/git.txt    |  6 ++++
 Makefile                 |  1 +
 cache.h                  |  8 +++++
 protocol.c               | 79 ++++++++++++++++++++++++++++++++++++++++++++++++
 protocol.h               | 33 ++++++++++++++++++++
 6 files changed, 144 insertions(+)
 create mode 100644 protocol.c
 create mode 100644 protocol.h

diff --git a/Documentation/config.txt b/Documentation/config.txt
index dc4e3f58a..b78747abc 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2517,6 +2517,23 @@ The protocol names currently used by git are:
     `hg` to allow the `git-remote-hg` helper)
 --
 
+protocol.version::
+	Experimental. If set, clients will attempt to communicate with a
+	server using the specified protocol version.  If unset, no
+	attempt will be made by the client to communicate using a
+	particular protocol version; this results in protocol version 0
+	being used.
+	Supported versions:
++
+--
+
+* `0` - the original wire protocol.
+
+* `1` - the original wire protocol with the addition of a version string
+  in the initial response from the server.
+
+--
+
 pull.ff::
 	By default, Git does not create an extra merge commit when merging
 	a commit that is a descendant of the current commit. Instead, the
diff --git a/Documentation/git.txt b/Documentation/git.txt
index 6e3a6767e..7518ea3af 100644
--- a/Documentation/git.txt
+++ b/Documentation/git.txt
@@ -697,6 +697,12 @@ of clones and fetches.
 	which feed potentially-untrusted URLS to git commands.  See
 	linkgit:git-config[1] for more details.
 
+`GIT_PROTOCOL`::
+	For internal use only.  Used in handshaking the wire protocol.
+	Contains a colon ':' separated list of keys with optional values
+	'key[=value]'.  Presence of unknown keys and values must be
+	ignored.
+
 Discussion[[Discussion]]
 ------------------------
 
diff --git a/Makefile b/Makefile
index ed4ca438b..9ce68cded 100644
--- a/Makefile
+++ b/Makefile
@@ -842,6 +842,7 @@ LIB_OBJS += pretty.o
 LIB_OBJS += prio-queue.o
 LIB_OBJS += progress.o
 LIB_OBJS += prompt.o
+LIB_OBJS += protocol.o
 LIB_OBJS += quote.o
 LIB_OBJS += reachable.o
 LIB_OBJS += read-cache.o
diff --git a/cache.h b/cache.h
index 49b083ee0..c74b73671 100644
--- a/cache.h
+++ b/cache.h
@@ -445,6 +445,14 @@ static inline enum object_type object_type(unsigned int mode)
 #define GIT_ICASE_PATHSPECS_ENVIRONMENT "GIT_ICASE_PATHSPECS"
 #define GIT_QUARANTINE_ENVIRONMENT "GIT_QUARANTINE_PATH"
 
+/*
+ * Environment variable used in handshaking the wire protocol.
+ * Contains a colon ':' separated list of keys with optional values
+ * 'key[=value]'.  Presence of unknown keys and values must be
+ * ignored.
+ */
+#define GIT_PROTOCOL_ENVIRONMENT "GIT_PROTOCOL"
+
 /*
  * This environment variable is expected to contain a boolean indicating
  * whether we should or should not treat:
diff --git a/protocol.c b/protocol.c
new file mode 100644
index 000000000..43012b7eb
--- /dev/null
+++ b/protocol.c
@@ -0,0 +1,79 @@
+#include "cache.h"
+#include "config.h"
+#include "protocol.h"
+
+static enum protocol_version parse_protocol_version(const char *value)
+{
+	if (!strcmp(value, "0"))
+		return protocol_v0;
+	else if (!strcmp(value, "1"))
+		return protocol_v1;
+	else
+		return protocol_unknown_version;
+}
+
+enum protocol_version get_protocol_version_config(void)
+{
+	const char *value;
+	if (!git_config_get_string_const("protocol.version", &value)) {
+		enum protocol_version version = parse_protocol_version(value);
+
+		if (version == protocol_unknown_version)
+			die("unknown value for config 'protocol.version': %s",
+			    value);
+
+		return version;
+	}
+
+	return protocol_v0;
+}
+
+enum protocol_version determine_protocol_version_server(void)
+{
+	const char *git_protocol = getenv(GIT_PROTOCOL_ENVIRONMENT);
+	enum protocol_version version = protocol_v0;
+
+	/*
+	 * Determine which protocol version the client has requested.  Since
+	 * multiple 'version' keys can be sent by the client, indicating that
+	 * the client is okay to speak any of them, select the greatest version
+	 * that the client has requested.  This is due to the assumption that
+	 * the most recent protocol version will be the most state-of-the-art.
+	 */
+	if (git_protocol) {
+		struct string_list list = STRING_LIST_INIT_DUP;
+		const struct string_list_item *item;
+		string_list_split(&list, git_protocol, ':', -1);
+
+		for_each_string_list_item(item, &list) {
+			const char *value;
+			enum protocol_version v;
+
+			if (skip_prefix(item->string, "version=", &value)) {
+				v = parse_protocol_version(value);
+				if (v > version)
+					version = v;
+			}
+		}
+
+		string_list_clear(&list, 0);
+	}
+
+	return version;
+}
+
+enum protocol_version determine_protocol_version_client(const char *server_response)
+{
+	enum protocol_version version = protocol_v0;
+
+	if (skip_prefix(server_response, "version ", &server_response)) {
+		version = parse_protocol_version(server_response);
+
+		if (version == protocol_unknown_version)
+			die("server is speaking an unknown protocol");
+		if (version == protocol_v0)
+			die("protocol error: server explicitly said version 0");
+	}
+
+	return version;
+}
diff --git a/protocol.h b/protocol.h
new file mode 100644
index 000000000..1b2bc94a8
--- /dev/null
+++ b/protocol.h
@@ -0,0 +1,33 @@
+#ifndef PROTOCOL_H
+#define PROTOCOL_H
+
+enum protocol_version {
+	protocol_unknown_version = -1,
+	protocol_v0 = 0,
+	protocol_v1 = 1,
+};
+
+/*
+ * Used by a client to determine which protocol version to request be used when
+ * communicating with a server, reflecting the configured value of the
+ * 'protocol.version' config.  If unconfigured, a value of 'protocol_v0' is
+ * returned.
+ */
+extern enum protocol_version get_protocol_version_config(void);
+
+/*
+ * Used by a server to determine which protocol version should be used based on
+ * a client's request, communicated via the 'GIT_PROTOCOL' environment variable
+ * by setting appropriate values for the key 'version'.  If a client doesn't
+ * request a particular protocol version, a default of 'protocol_v0' will be
+ * used.
+ */
+extern enum protocol_version determine_protocol_version_server(void);
+
+/*
+ * Used by a client to determine which protocol version the server is speaking
+ * based on the server's initial response.
+ */
+extern enum protocol_version determine_protocol_version_client(const char *server_response);
+
+#endif /* PROTOCOL_H */
-- 
2.15.0.rc0.271.g36b669edcc-goog



* [PATCH v4 04/11] daemon: recognize hidden request arguments
  2017-10-16 17:55     ` [PATCH v4 00/11] protocol transition Brandon Williams
                         ` (2 preceding siblings ...)
  2017-10-16 17:55       ` [PATCH v4 03/11] protocol: introduce protocol extension mechanisms Brandon Williams
@ 2017-10-16 17:55       ` Brandon Williams
  2017-10-16 17:55       ` [PATCH v4 05/11] upload-pack, receive-pack: introduce protocol version 1 Brandon Williams
                         ` (6 subsequent siblings)
  10 siblings, 0 replies; 161+ messages in thread
From: Brandon Williams @ 2017-10-16 17:55 UTC (permalink / raw)
  To: git
  Cc: martin.agren, simon, bturner, git, gitster, jonathantanmy,
	jrnieder, peff, sbeller, Brandon Williams

A normal request to git-daemon is structured as
"command path/to/repo\0host=..\0".  Due to a bug introduced in
49ba83fb6 (Add virtualization support to git-daemon, 2006-09-19), we
cannot place any extra arguments (separated by NULs) after the host,
because parsing those arguments would enter an infinite loop.  This
bug was fixed in 73bb33a94 (daemon: Strictly parse the "extra arg"
part of the command, 2009-06-04), but a check was put in place to
disallow extra arguments so that new clients wouldn't trigger the
bug in older servers.

In order to get around this limitation, teach git-daemon to recognize
additional request arguments hidden behind a second NUL byte.  Requests
can then be structured like:
"command path/to/repo\0host=..\0\0version=1\0key=value\0".  git-daemon
can then parse out the extra arguments and set 'GIT_PROTOCOL'
accordingly.

By placing these extra arguments behind a second NUL byte we can skirt
around both the infinite loop bug in 49ba83fb6 (Add virtualization
support to git-daemon, 2006-09-19) as well as the explicit disallowing
of extra arguments introduced in 73bb33a94 (daemon: Strictly parse the
"extra arg" part of the command, 2009-06-04) because both of these
versions of git-daemon check for a single NUL byte after the host
argument before terminating the argument parsing.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
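Note: the handling of the bytes that follow the host argument can be
sketched as a standalone function.  This is an illustration under
simplifying assumptions, not the daemon code: it writes into a fixed
buffer instead of a strbuf, returns -1 instead of calling die(), and
'join_extra_args' is a made-up name.  The input is assumed to start just
past the NUL that terminates the host argument, so the first string it
sees is the empty one formed by the second NUL.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: 'args' holds 'len' bytes of NUL-terminated
 * strings.  Join every non-empty string with ':' into 'out' (capacity
 * 'cap'), producing the value that would be exported as GIT_PROTOCOL.
 * Returns 0 on success, -1 if the input does not end with a NUL or
 * 'out' is too small.
 */
static int join_extra_args(const char *args, size_t len, char *out, size_t cap)
{
	const char *end = args + len;
	size_t used = 0;

	if (cap == 0 || (len > 0 && end[-1] != '\0'))
		return -1;
	out[0] = '\0';

	while (args < end) {
		size_t n = strlen(args);	/* safe: buffer ends with NUL */

		if (n > 0) {
			size_t sep = used ? 1 : 0;

			if (used + sep + n + 1 > cap)
				return -1;
			if (sep)
				out[used++] = ':';
			memcpy(out + used, args, n);
			used += n;
			out[used] = '\0';
		}
		args += n + 1;	/* step over the string and its NUL */
	}

	assert(used < cap);
	return 0;
}
```

Given the bytes "\0version=1\0key=value\0", the result is
"version=1:key=value", which matches the colon-separated format
documented for 'GIT_PROTOCOL'.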
 daemon.c | 71 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 62 insertions(+), 9 deletions(-)

diff --git a/daemon.c b/daemon.c
index 30747075f..e37e343d0 100644
--- a/daemon.c
+++ b/daemon.c
@@ -282,7 +282,7 @@ static const char *path_ok(const char *directory, struct hostinfo *hi)
 	return NULL;		/* Fallthrough. Deny by default */
 }
 
-typedef int (*daemon_service_fn)(void);
+typedef int (*daemon_service_fn)(const struct argv_array *env);
 struct daemon_service {
 	const char *name;
 	const char *config_name;
@@ -363,7 +363,7 @@ static int run_access_hook(struct daemon_service *service, const char *dir,
 }
 
 static int run_service(const char *dir, struct daemon_service *service,
-		       struct hostinfo *hi)
+		       struct hostinfo *hi, const struct argv_array *env)
 {
 	const char *path;
 	int enabled = service->enabled;
@@ -422,7 +422,7 @@ static int run_service(const char *dir, struct daemon_service *service,
 	 */
 	signal(SIGTERM, SIG_IGN);
 
-	return service->fn();
+	return service->fn(env);
 }
 
 static void copy_to_log(int fd)
@@ -462,25 +462,34 @@ static int run_service_command(struct child_process *cld)
 	return finish_command(cld);
 }
 
-static int upload_pack(void)
+static int upload_pack(const struct argv_array *env)
 {
 	struct child_process cld = CHILD_PROCESS_INIT;
 	argv_array_pushl(&cld.args, "upload-pack", "--strict", NULL);
 	argv_array_pushf(&cld.args, "--timeout=%u", timeout);
+
+	argv_array_pushv(&cld.env_array, env->argv);
+
 	return run_service_command(&cld);
 }
 
-static int upload_archive(void)
+static int upload_archive(const struct argv_array *env)
 {
 	struct child_process cld = CHILD_PROCESS_INIT;
 	argv_array_push(&cld.args, "upload-archive");
+
+	argv_array_pushv(&cld.env_array, env->argv);
+
 	return run_service_command(&cld);
 }
 
-static int receive_pack(void)
+static int receive_pack(const struct argv_array *env)
 {
 	struct child_process cld = CHILD_PROCESS_INIT;
 	argv_array_push(&cld.args, "receive-pack");
+
+	argv_array_pushv(&cld.env_array, env->argv);
+
 	return run_service_command(&cld);
 }
 
@@ -573,8 +582,11 @@ static void canonicalize_client(struct strbuf *out, const char *in)
 
 /*
  * Read the host as supplied by the client connection.
+ *
+ * Returns a pointer to the character after the NUL byte terminating the host
+ * argument, or 'extra_args' if there is no host argument.
  */
-static void parse_host_arg(struct hostinfo *hi, char *extra_args, int buflen)
+static char *parse_host_arg(struct hostinfo *hi, char *extra_args, int buflen)
 {
 	char *val;
 	int vallen;
@@ -602,6 +614,43 @@ static void parse_host_arg(struct hostinfo *hi, char *extra_args, int buflen)
 		if (extra_args < end && *extra_args)
 			die("Invalid request");
 	}
+
+	return extra_args;
+}
+
+static void parse_extra_args(struct hostinfo *hi, struct argv_array *env,
+			     char *extra_args, int buflen)
+{
+	const char *end = extra_args + buflen;
+	struct strbuf git_protocol = STRBUF_INIT;
+
+	/* First look for the host argument */
+	extra_args = parse_host_arg(hi, extra_args, buflen);
+
+	/* Look for additional arguments placed after a second NUL byte */
+	for (; extra_args < end; extra_args += strlen(extra_args) + 1) {
+		const char *arg = extra_args;
+
+		/*
+		 * Parse the extra arguments, adding most to 'git_protocol'
+		 * which will be used to set the 'GIT_PROTOCOL' envvar in the
+		 * service that will be run.
+		 *
+		 * If there ends up being a particular arg in the future that
+		 * git-daemon needs to parse specifically (like the 'host' arg)
+		 * then it can be parsed here and not added to 'git_protocol'.
+		 */
+		if (*arg) {
+			if (git_protocol.len > 0)
+				strbuf_addch(&git_protocol, ':');
+			strbuf_addstr(&git_protocol, arg);
+		}
+	}
+
+	if (git_protocol.len > 0)
+		argv_array_pushf(env, GIT_PROTOCOL_ENVIRONMENT "=%s",
+				 git_protocol.buf);
+	strbuf_release(&git_protocol);
 }
 
 /*
@@ -695,6 +744,7 @@ static int execute(void)
 	int pktlen, len, i;
 	char *addr = getenv("REMOTE_ADDR"), *port = getenv("REMOTE_PORT");
 	struct hostinfo hi;
+	struct argv_array env = ARGV_ARRAY_INIT;
 
 	hostinfo_init(&hi);
 
@@ -716,8 +766,9 @@ static int execute(void)
 		pktlen--;
 	}
 
+	/* parse additional args hidden behind a NUL byte */
 	if (len != pktlen)
-		parse_host_arg(&hi, line + len + 1, pktlen - len - 1);
+		parse_extra_args(&hi, &env, line + len + 1, pktlen - len - 1);
 
 	for (i = 0; i < ARRAY_SIZE(daemon_service); i++) {
 		struct daemon_service *s = &(daemon_service[i]);
@@ -730,13 +781,15 @@ static int execute(void)
 			 * Note: The directory here is probably context sensitive,
 			 * and might depend on the actual service being performed.
 			 */
-			int rc = run_service(arg, s, &hi);
+			int rc = run_service(arg, s, &hi, &env);
 			hostinfo_clear(&hi);
+			argv_array_clear(&env);
 			return rc;
 		}
 	}
 
 	hostinfo_clear(&hi);
+	argv_array_clear(&env);
 	logerror("Protocol error: '%s'", line);
 	return -1;
 }
-- 
2.15.0.rc0.271.g36b669edcc-goog



* [PATCH v4 05/11] upload-pack, receive-pack: introduce protocol version 1
  2017-10-16 17:55     ` [PATCH v4 00/11] protocol transition Brandon Williams
                         ` (3 preceding siblings ...)
  2017-10-16 17:55       ` [PATCH v4 04/11] daemon: recognize hidden request arguments Brandon Williams
@ 2017-10-16 17:55       ` Brandon Williams
  2017-10-16 17:55       ` [PATCH v4 06/11] connect: teach client to recognize v1 server response Brandon Williams
                         ` (5 subsequent siblings)
  10 siblings, 0 replies; 161+ messages in thread
From: Brandon Williams @ 2017-10-16 17:55 UTC (permalink / raw)
  To: git
  Cc: martin.agren, simon, bturner, git, gitster, jonathantanmy,
	jrnieder, peff, sbeller, Brandon Williams

Teach upload-pack and receive-pack to understand and respond using
protocol version 1, if requested.

Protocol version 1 is simply the original and current protocol (what I'm
calling version 0) with the addition of a single packet line, which
precedes the ref advertisement, indicating the protocol version being
spoken.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 builtin/receive-pack.c | 17 +++++++++++++++++
 upload-pack.c          | 20 +++++++++++++++++++-
 2 files changed, 36 insertions(+), 1 deletion(-)

diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index dd06b3fb4..839c1462d 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -24,6 +24,7 @@
 #include "tmp-objdir.h"
 #include "oidset.h"
 #include "packfile.h"
+#include "protocol.h"
 
 static const char * const receive_pack_usage[] = {
 	N_("git receive-pack <git-dir>"),
@@ -1963,6 +1964,22 @@ int cmd_receive_pack(int argc, const char **argv, const char *prefix)
 	else if (0 <= receive_unpack_limit)
 		unpack_limit = receive_unpack_limit;
 
+	switch (determine_protocol_version_server()) {
+	case protocol_v1:
+		/*
+		 * v1 is just the original protocol with a version string,
+		 * so just fall through after writing the version string.
+		 */
+		if (advertise_refs || !stateless_rpc)
+			packet_write_fmt(1, "version 1\n");
+
+		/* fallthrough */
+	case protocol_v0:
+		break;
+	case protocol_unknown_version:
+		BUG("unknown protocol version");
+	}
+
 	if (advertise_refs || !stateless_rpc) {
 		write_head_info();
 	}
diff --git a/upload-pack.c b/upload-pack.c
index 7efff2fbf..ef99a029c 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -18,6 +18,7 @@
 #include "parse-options.h"
 #include "argv-array.h"
 #include "prio-queue.h"
+#include "protocol.h"
 
 static const char * const upload_pack_usage[] = {
 	N_("git upload-pack [<options>] <dir>"),
@@ -1067,6 +1068,23 @@ int cmd_main(int argc, const char **argv)
 		die("'%s' does not appear to be a git repository", dir);
 
 	git_config(upload_pack_config, NULL);
-	upload_pack();
+
+	switch (determine_protocol_version_server()) {
+	case protocol_v1:
+		/*
+		 * v1 is just the original protocol with a version string,
+		 * so just fall through after writing the version string.
+		 */
+		if (advertise_refs || !stateless_rpc)
+			packet_write_fmt(1, "version 1\n");
+
+		/* fallthrough */
+	case protocol_v0:
+		upload_pack();
+		break;
+	case protocol_unknown_version:
+		BUG("unknown protocol version");
+	}
+
 	return 0;
 }
-- 
2.15.0.rc0.271.g36b669edcc-goog



* [PATCH v4 06/11] connect: teach client to recognize v1 server response
  2017-10-16 17:55     ` [PATCH v4 00/11] protocol transition Brandon Williams
                         ` (4 preceding siblings ...)
  2017-10-16 17:55       ` [PATCH v4 05/11] upload-pack, receive-pack: introduce protocol version 1 Brandon Williams
@ 2017-10-16 17:55       ` Brandon Williams
  2017-10-16 17:55       ` [PATCH v4 07/11] connect: tell server that the client understands v1 Brandon Williams
                         ` (4 subsequent siblings)
  10 siblings, 0 replies; 161+ messages in thread
From: Brandon Williams @ 2017-10-16 17:55 UTC (permalink / raw)
  To: git
  Cc: martin.agren, simon, bturner, git, gitster, jonathantanmy,
	jrnieder, peff, sbeller, Brandon Williams

Teach a client to recognize that a server understands protocol v1 by
looking at the first pkt-line the server sends in response.  This is
done by looking for the response "version 1" sent by upload-pack or
receive-pack.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
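Note: the client-side decision on the first pkt-line can be summarized
as a three-way classification.  The sketch below is illustrative only;
the enum and 'classify_first_line' are hypothetical names, and it
assumes the payload has already had its trailing newline stripped.

```c
#include <assert.h>
#include <string.h>

enum first_line {
	FIRST_LINE_BAD = -1,	/* "version 0" or an unknown version */
	FIRST_LINE_IS_REF = 0,	/* v0: no version line, payload is a ref */
	FIRST_LINE_IS_V1 = 1	/* v1: consume the line, refs follow */
};

/* Hypothetical sketch of how the first pkt-line payload is interpreted. */
static enum first_line classify_first_line(const char *payload)
{
	static const char prefix[] = "version ";
	size_t plen = sizeof(prefix) - 1;

	assert(payload != NULL);

	if (strncmp(payload, prefix, plen))
		return FIRST_LINE_IS_REF;
	if (!strcmp(payload + plen, "1"))
		return FIRST_LINE_IS_V1;
	return FIRST_LINE_BAD;
}
```

In the FIRST_LINE_IS_REF case the same payload still has to be handled
as the first ref, which is why the patch falls through from
EXPECTING_PROTOCOL_VERSION into EXPECTING_FIRST_REF instead of reading
another line.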
 connect.c | 30 ++++++++++++++++++++++++++----
 1 file changed, 26 insertions(+), 4 deletions(-)

diff --git a/connect.c b/connect.c
index 8e2e276b6..a5e708a61 100644
--- a/connect.c
+++ b/connect.c
@@ -12,6 +12,7 @@
 #include "sha1-array.h"
 #include "transport.h"
 #include "strbuf.h"
+#include "protocol.h"
 
 static char *server_capabilities;
 static const char *parse_feature_value(const char *, const char *, int *);
@@ -129,9 +130,23 @@ static int read_remote_ref(int in, char **src_buf, size_t *src_len,
 	return len;
 }
 
-#define EXPECTING_FIRST_REF 0
-#define EXPECTING_REF 1
-#define EXPECTING_SHALLOW 2
+#define EXPECTING_PROTOCOL_VERSION 0
+#define EXPECTING_FIRST_REF 1
+#define EXPECTING_REF 2
+#define EXPECTING_SHALLOW 3
+
+/* Returns 1 if packet_buffer is a protocol version pkt-line, 0 otherwise. */
+static int process_protocol_version(void)
+{
+	switch (determine_protocol_version_client(packet_buffer)) {
+	case protocol_v1:
+		return 1;
+	case protocol_v0:
+		return 0;
+	default:
+		die("server is speaking an unknown protocol");
+	}
+}
 
 static void process_capabilities(int *len)
 {
@@ -224,12 +239,19 @@ struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
 	 */
 	int responded = 0;
 	int len;
-	int state = EXPECTING_FIRST_REF;
+	int state = EXPECTING_PROTOCOL_VERSION;
 
 	*list = NULL;
 
 	while ((len = read_remote_ref(in, &src_buf, &src_len, &responded))) {
 		switch (state) {
+		case EXPECTING_PROTOCOL_VERSION:
+			if (process_protocol_version()) {
+				state = EXPECTING_FIRST_REF;
+				break;
+			}
+			state = EXPECTING_FIRST_REF;
+			/* fallthrough */
 		case EXPECTING_FIRST_REF:
 			process_capabilities(&len);
 			if (process_dummy_ref()) {
-- 
2.15.0.rc0.271.g36b669edcc-goog



* [PATCH v4 07/11] connect: tell server that the client understands v1
  2017-10-16 17:55     ` [PATCH v4 00/11] protocol transition Brandon Williams
                         ` (5 preceding siblings ...)
  2017-10-16 17:55       ` [PATCH v4 06/11] connect: teach client to recognize v1 server response Brandon Williams
@ 2017-10-16 17:55       ` Brandon Williams
  2017-10-16 17:55       ` [PATCH v4 08/11] http: " Brandon Williams
                         ` (3 subsequent siblings)
  10 siblings, 0 replies; 161+ messages in thread
From: Brandon Williams @ 2017-10-16 17:55 UTC (permalink / raw)
  To: git
  Cc: martin.agren, simon, bturner, git, gitster, jonathantanmy,
	jrnieder, peff, sbeller, Brandon Williams

Teach the connection logic to tell a server that it understands protocol
v1.  This is done in two different ways for the builtin transports, both
of which ultimately set 'GIT_PROTOCOL' to 'version=1' on the server.

1. git://
   A normal request to git-daemon is structured as
   "command path/to/repo\0host=..\0".  Due to a bug introduced in
   49ba83fb6 (Add virtualization support to git-daemon, 2006-09-19), we
   cannot place any extra arguments (separated by NULs) after the host,
   because parsing those arguments would enter an infinite loop.  This
   bug was fixed in 73bb33a94 (daemon: Strictly parse the "extra arg"
   part of the command, 2009-06-04), but a check was put in place to
   disallow extra arguments so that new clients wouldn't trigger the
   bug in older servers.

   In order to get around this limitation, git-daemon was taught to
   recognize additional request arguments hidden behind a second
   NUL byte.  Requests can then be structured like:
   "command path/to/repo\0host=..\0\0version=1\0key=value\0".
   git-daemon can then parse out the extra arguments and set
   'GIT_PROTOCOL' accordingly.

   By placing these extra arguments behind a second NUL byte we can
   skirt around both the infinite loop bug in 49ba83fb6 (Add
   virtualization support to git-daemon, 2006-09-19) as well as the
   explicit disallowing of extra arguments introduced in 73bb33a94
   (daemon: Strictly parse the "extra arg" part of the command,
   2009-06-04) because both of these versions of git-daemon check for a
   single NUL byte after the host argument before terminating the
   argument parsing.

2. ssh://, file://
   Set 'GIT_PROTOCOL' environment variable with the desired protocol
   version.  With the file:// transport, 'GIT_PROTOCOL' can be set
   explicitly in the locally running git-upload-pack or git-receive-pack
   processes.  With the ssh:// transport and OpenSSH compliant ssh
   programs, 'GIT_PROTOCOL' can be sent across ssh by using '-o
   SendEnv=GIT_PROTOCOL' and having the server whitelist this
   environment variable.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
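Note: for the git:// case, the byte layout of the request can be
sketched with plain snprintf().  This is an illustration, not the code
in connect.c, which builds the request in a strbuf; the helper name
'build_daemon_request' and its fixed-size output buffer are assumptions
made for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: write "prog path\0host=HOST\0" into 'out' and,
 * when 'version' > 0, append "\0version=N\0".  Returns the request
 * length in bytes (embedded NULs included), or 0 if 'out' is too small.
 */
static size_t build_daemon_request(char *out, size_t cap, const char *prog,
				   const char *path, const char *host,
				   int version)
{
	size_t len;
	int n;

	n = snprintf(out, cap, "%s %s%chost=%s%c", prog, path, 0, host, 0);
	if (n < 0 || (size_t)n >= cap)
		return 0;
	len = (size_t)n;

	if (version > 0) {
		/* the extra leading NUL hides the arguments from old daemons */
		n = snprintf(out + len, cap - len, "%cversion=%d%c",
			     0, version, 0);
		if (n < 0 || (size_t)n >= cap - len)
			return 0;
		len += (size_t)n;
	}

	assert(len < cap);
	return len;
}
```

The length has to be tracked explicitly because strlen() would stop at
the first NUL; this is also why the request is sent with packet_write()
and an explicit size rather than through a format string.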
 connect.c              |  37 ++++++--
 t/t5700-protocol-v1.sh | 223 +++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 255 insertions(+), 5 deletions(-)
 create mode 100755 t/t5700-protocol-v1.sh

diff --git a/connect.c b/connect.c
index a5e708a61..b8695a2fa 100644
--- a/connect.c
+++ b/connect.c
@@ -871,6 +871,7 @@ struct child_process *git_connect(int fd[2], const char *url,
 		printf("Diag: path=%s\n", path ? path : "NULL");
 		conn = NULL;
 	} else if (protocol == PROTO_GIT) {
+		struct strbuf request = STRBUF_INIT;
 		/*
 		 * Set up virtual host information based on where we will
 		 * connect, unless the user has overridden us in
@@ -898,13 +899,25 @@ struct child_process *git_connect(int fd[2], const char *url,
 		 * Note: Do not add any other headers here!  Doing so
 		 * will cause older git-daemon servers to crash.
 		 */
-		packet_write_fmt(fd[1],
-			     "%s %s%chost=%s%c",
-			     prog, path, 0,
-			     target_host, 0);
+		strbuf_addf(&request,
+			    "%s %s%chost=%s%c",
+			    prog, path, 0,
+			    target_host, 0);
+
+		/* If using a new version put that stuff here after a second null byte */
+		if (get_protocol_version_config() > 0) {
+			strbuf_addch(&request, '\0');
+			strbuf_addf(&request, "version=%d%c",
+				    get_protocol_version_config(), '\0');
+		}
+
+		packet_write(fd[1], request.buf, request.len);
+
 		free(target_host);
+		strbuf_release(&request);
 	} else {
 		struct strbuf cmd = STRBUF_INIT;
+		const char *const *var;
 
 		conn = xmalloc(sizeof(*conn));
 		child_process_init(conn);
@@ -917,7 +930,9 @@ struct child_process *git_connect(int fd[2], const char *url,
 		sq_quote_buf(&cmd, path);
 
 		/* remove repo-local variables from the environment */
-		conn->env = local_repo_env;
+		for (var = local_repo_env; *var; var++)
+			argv_array_push(&conn->env_array, *var);
+
 		conn->use_shell = 1;
 		conn->in = conn->out = -1;
 		if (protocol == PROTO_SSH) {
@@ -971,6 +986,14 @@ struct child_process *git_connect(int fd[2], const char *url,
 			}
 
 			argv_array_push(&conn->args, ssh);
+
+			if (get_protocol_version_config() > 0) {
+				argv_array_push(&conn->args, "-o");
+				argv_array_push(&conn->args, "SendEnv=" GIT_PROTOCOL_ENVIRONMENT);
+				argv_array_pushf(&conn->env_array, GIT_PROTOCOL_ENVIRONMENT "=version=%d",
+						 get_protocol_version_config());
+			}
+
 			if (flags & CONNECT_IPV4)
 				argv_array_push(&conn->args, "-4");
 			else if (flags & CONNECT_IPV6)
@@ -985,6 +1008,10 @@ struct child_process *git_connect(int fd[2], const char *url,
 			argv_array_push(&conn->args, ssh_host);
 		} else {
 			transport_check_allowed("file");
+			if (get_protocol_version_config() > 0) {
+				argv_array_pushf(&conn->env_array, GIT_PROTOCOL_ENVIRONMENT "=version=%d",
+						 get_protocol_version_config());
+			}
 		}
 		argv_array_push(&conn->args, cmd.buf);
 
diff --git a/t/t5700-protocol-v1.sh b/t/t5700-protocol-v1.sh
new file mode 100755
index 000000000..6551932da
--- /dev/null
+++ b/t/t5700-protocol-v1.sh
@@ -0,0 +1,223 @@
+#!/bin/sh
+
+test_description='test git wire-protocol transition'
+
+TEST_NO_CREATE_REPO=1
+
+. ./test-lib.sh
+
+# Test protocol v1 with 'git://' transport
+#
+. "$TEST_DIRECTORY"/lib-git-daemon.sh
+start_git_daemon --export-all --enable=receive-pack
+daemon_parent=$GIT_DAEMON_DOCUMENT_ROOT_PATH/parent
+
+test_expect_success 'create repo to be served by git-daemon' '
+	git init "$daemon_parent" &&
+	test_commit -C "$daemon_parent" one
+'
+
+test_expect_success 'clone with git:// using protocol v1' '
+	GIT_TRACE_PACKET=1 git -c protocol.version=1 \
+		clone "$GIT_DAEMON_URL/parent" daemon_child 2>log &&
+
+	git -C daemon_child log -1 --format=%s >actual &&
+	git -C "$daemon_parent" log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+
+	# Client requested to use protocol v1
+	grep "clone> .*\\\0\\\0version=1\\\0$" log &&
+	# Server responded using protocol v1
+	grep "clone< version 1" log
+'
+
+test_expect_success 'fetch with git:// using protocol v1' '
+	test_commit -C "$daemon_parent" two &&
+
+	GIT_TRACE_PACKET=1 git -C daemon_child -c protocol.version=1 \
+		fetch 2>log &&
+
+	git -C daemon_child log -1 --format=%s origin/master >actual &&
+	git -C "$daemon_parent" log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+
+	# Client requested to use protocol v1
+	grep "fetch> .*\\\0\\\0version=1\\\0$" log &&
+	# Server responded using protocol v1
+	grep "fetch< version 1" log
+'
+
+test_expect_success 'pull with git:// using protocol v1' '
+	GIT_TRACE_PACKET=1 git -C daemon_child -c protocol.version=1 \
+		pull 2>log &&
+
+	git -C daemon_child log -1 --format=%s >actual &&
+	git -C "$daemon_parent" log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+
+	# Client requested to use protocol v1
+	grep "fetch> .*\\\0\\\0version=1\\\0$" log &&
+	# Server responded using protocol v1
+	grep "fetch< version 1" log
+'
+
+test_expect_success 'push with git:// using protocol v1' '
+	test_commit -C daemon_child three &&
+
+	# Push to another branch, as the target repository has the
+	# master branch checked out and we cannot push into it.
+	GIT_TRACE_PACKET=1 git -C daemon_child -c protocol.version=1 \
+		push origin HEAD:client_branch 2>log &&
+
+	git -C daemon_child log -1 --format=%s >actual &&
+	git -C "$daemon_parent" log -1 --format=%s client_branch >expect &&
+	test_cmp expect actual &&
+
+	# Client requested to use protocol v1
+	grep "push> .*\\\0\\\0version=1\\\0$" log &&
+	# Server responded using protocol v1
+	grep "push< version 1" log
+'
+
+stop_git_daemon
+
+# Test protocol v1 with 'file://' transport
+#
+test_expect_success 'create repo to be served by file:// transport' '
+	git init file_parent &&
+	test_commit -C file_parent one
+'
+
+test_expect_success 'clone with file:// using protocol v1' '
+	GIT_TRACE_PACKET=1 git -c protocol.version=1 \
+		clone "file://$(pwd)/file_parent" file_child 2>log &&
+
+	git -C file_child log -1 --format=%s >actual &&
+	git -C file_parent log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+
+	# Server responded using protocol v1
+	grep "clone< version 1" log
+'
+
+test_expect_success 'fetch with file:// using protocol v1' '
+	test_commit -C file_parent two &&
+
+	GIT_TRACE_PACKET=1 git -C file_child -c protocol.version=1 \
+		fetch 2>log &&
+
+	git -C file_child log -1 --format=%s origin/master >actual &&
+	git -C file_parent log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+
+	# Server responded using protocol v1
+	grep "fetch< version 1" log
+'
+
+test_expect_success 'pull with file:// using protocol v1' '
+	GIT_TRACE_PACKET=1 git -C file_child -c protocol.version=1 \
+		pull 2>log &&
+
+	git -C file_child log -1 --format=%s >actual &&
+	git -C file_parent log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+
+	# Server responded using protocol v1
+	grep "fetch< version 1" log
+'
+
+test_expect_success 'push with file:// using protocol v1' '
+	test_commit -C file_child three &&
+
+	# Push to another branch, as the target repository has the
+	# master branch checked out and we cannot push into it.
+	GIT_TRACE_PACKET=1 git -C file_child -c protocol.version=1 \
+		push origin HEAD:client_branch 2>log &&
+
+	git -C file_child log -1 --format=%s >actual &&
+	git -C file_parent log -1 --format=%s client_branch >expect &&
+	test_cmp expect actual &&
+
+	# Server responded using protocol v1
+	grep "push< version 1" log
+'
+
+# Test protocol v1 with 'ssh://' transport
+#
+test_expect_success 'setup ssh wrapper' '
+	GIT_SSH="$GIT_BUILD_DIR/t/helper/test-fake-ssh" &&
+	export GIT_SSH &&
+	export TRASH_DIRECTORY &&
+	>"$TRASH_DIRECTORY"/ssh-output
+'
+
+expect_ssh () {
+	test_when_finished '(cd "$TRASH_DIRECTORY" && rm -f ssh-expect && >ssh-output)' &&
+	echo "ssh: -o SendEnv=GIT_PROTOCOL myhost $1 '$PWD/ssh_parent'" >"$TRASH_DIRECTORY/ssh-expect" &&
+	(cd "$TRASH_DIRECTORY" && test_cmp ssh-expect ssh-output)
+}
+
+test_expect_success 'create repo to be served by ssh:// transport' '
+	git init ssh_parent &&
+	test_commit -C ssh_parent one
+'
+
+test_expect_success 'clone with ssh:// using protocol v1' '
+	GIT_TRACE_PACKET=1 git -c protocol.version=1 \
+		clone "ssh://myhost:$(pwd)/ssh_parent" ssh_child 2>log &&
+	expect_ssh git-upload-pack &&
+
+	git -C ssh_child log -1 --format=%s >actual &&
+	git -C ssh_parent log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+
+	# Server responded using protocol v1
+	grep "clone< version 1" log
+'
+
+test_expect_success 'fetch with ssh:// using protocol v1' '
+	test_commit -C ssh_parent two &&
+
+	GIT_TRACE_PACKET=1 git -C ssh_child -c protocol.version=1 \
+		fetch 2>log &&
+	expect_ssh git-upload-pack &&
+
+	git -C ssh_child log -1 --format=%s origin/master >actual &&
+	git -C ssh_parent log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+
+	# Server responded using protocol v1
+	grep "fetch< version 1" log
+'
+
+test_expect_success 'pull with ssh:// using protocol v1' '
+	GIT_TRACE_PACKET=1 git -C ssh_child -c protocol.version=1 \
+		pull 2>log &&
+	expect_ssh git-upload-pack &&
+
+	git -C ssh_child log -1 --format=%s >actual &&
+	git -C ssh_parent log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+
+	# Server responded using protocol v1
+	grep "fetch< version 1" log
+'
+
+test_expect_success 'push with ssh:// using protocol v1' '
+	test_commit -C ssh_child three &&
+
+	# Push to another branch, as the target repository has the
+	# master branch checked out and we cannot push into it.
+	GIT_TRACE_PACKET=1 git -C ssh_child -c protocol.version=1 \
+		push origin HEAD:client_branch 2>log &&
+	expect_ssh git-receive-pack &&
+
+	git -C ssh_child log -1 --format=%s >actual &&
+	git -C ssh_parent log -1 --format=%s client_branch >expect &&
+	test_cmp expect actual &&
+
+	# Server responded using protocol v1
+	grep "push< version 1" log
+'
+
+test_done
-- 
2.15.0.rc0.271.g36b669edcc-goog



* [PATCH v4 08/11] http: tell server that the client understands v1
  2017-10-16 17:55     ` [PATCH v4 00/11] protocol transition Brandon Williams
                         ` (6 preceding siblings ...)
  2017-10-16 17:55       ` [PATCH v4 07/11] connect: tell server that the client understands v1 Brandon Williams
@ 2017-10-16 17:55       ` Brandon Williams
  2017-10-16 17:55       ` [PATCH v4 09/11] i5700: add interop test for protocol transition Brandon Williams
                         ` (2 subsequent siblings)
  10 siblings, 0 replies; 161+ messages in thread
From: Brandon Williams @ 2017-10-16 17:55 UTC (permalink / raw)
  To: git
  Cc: martin.agren, simon, bturner, git, gitster, jonathantanmy,
	jrnieder, peff, sbeller, Brandon Williams

Tell a server that protocol v1 can be used by sending the http header
'Git-Protocol' with 'version=1' indicating this.

Also teach the apache http server to pass through the 'Git-Protocol'
header as an environment variable 'GIT_PROTOCOL'.

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 cache.h                 |  2 ++
 http.c                  | 18 +++++++++++++
 t/lib-httpd/apache.conf |  7 +++++
 t/t5700-protocol-v1.sh  | 69 +++++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 96 insertions(+)

diff --git a/cache.h b/cache.h
index c74b73671..3a6b869c2 100644
--- a/cache.h
+++ b/cache.h
@@ -452,6 +452,8 @@ static inline enum object_type object_type(unsigned int mode)
  * ignored.
  */
 #define GIT_PROTOCOL_ENVIRONMENT "GIT_PROTOCOL"
+/* HTTP header used to handshake the wire protocol */
+#define GIT_PROTOCOL_HEADER "Git-Protocol"
 
 /*
  * This environment variable is expected to contain a boolean indicating
diff --git a/http.c b/http.c
index 9e40a465f..ffb719216 100644
--- a/http.c
+++ b/http.c
@@ -12,6 +12,7 @@
 #include "gettext.h"
 #include "transport.h"
 #include "packfile.h"
+#include "protocol.h"
 
 static struct trace_key trace_curl = TRACE_KEY_INIT(CURL);
 #if LIBCURL_VERSION_NUM >= 0x070a08
@@ -897,6 +898,21 @@ static void set_from_env(const char **var, const char *envname)
 		*var = val;
 }
 
+static void protocol_http_header(void)
+{
+	if (get_protocol_version_config() > 0) {
+		struct strbuf protocol_header = STRBUF_INIT;
+
+		strbuf_addf(&protocol_header, GIT_PROTOCOL_HEADER ": version=%d",
+			    get_protocol_version_config());
+
+
+		extra_http_headers = curl_slist_append(extra_http_headers,
+						       protocol_header.buf);
+		strbuf_release(&protocol_header);
+	}
+}
+
 void http_init(struct remote *remote, const char *url, int proactive_auth)
 {
 	char *low_speed_limit;
@@ -927,6 +943,8 @@ void http_init(struct remote *remote, const char *url, int proactive_auth)
 	if (remote)
 		var_override(&http_proxy_authmethod, remote->http_proxy_authmethod);
 
+	protocol_http_header();
+
 	pragma_header = curl_slist_append(http_copy_default_headers(),
 		"Pragma: no-cache");
 	no_pragma_header = curl_slist_append(http_copy_default_headers(),
diff --git a/t/lib-httpd/apache.conf b/t/lib-httpd/apache.conf
index 0642ae7e6..df1943631 100644
--- a/t/lib-httpd/apache.conf
+++ b/t/lib-httpd/apache.conf
@@ -67,6 +67,9 @@ LockFile accept.lock
 <IfModule !mod_unixd.c>
 	LoadModule unixd_module modules/mod_unixd.so
 </IfModule>
+<IfModule !mod_setenvif.c>
+	LoadModule setenvif_module modules/mod_setenvif.so
+</IfModule>
 </IfVersion>
 
 PassEnv GIT_VALGRIND
@@ -76,6 +79,10 @@ PassEnv ASAN_OPTIONS
 PassEnv GIT_TRACE
 PassEnv GIT_CONFIG_NOSYSTEM
 
+<IfVersion >= 2.4>
+	SetEnvIf Git-Protocol ".*" GIT_PROTOCOL=$0
+</IfVersion>
+
 Alias /dumb/ www/
 Alias /auth/dumb/ www/auth/dumb/
 
diff --git a/t/t5700-protocol-v1.sh b/t/t5700-protocol-v1.sh
index 6551932da..b0779d362 100755
--- a/t/t5700-protocol-v1.sh
+++ b/t/t5700-protocol-v1.sh
@@ -220,4 +220,73 @@ test_expect_success 'push with ssh:// using protocol v1' '
 	grep "push< version 1" log
 '
 
+# Test protocol v1 with 'http://' transport
+#
+. "$TEST_DIRECTORY"/lib-httpd.sh
+start_httpd
+
+test_expect_success 'create repo to be served by http:// transport' '
+	git init "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" config http.receivepack true &&
+	test_commit -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" one
+'
+
+test_expect_success 'clone with http:// using protocol v1' '
+	GIT_TRACE_PACKET=1 GIT_TRACE_CURL=1 git -c protocol.version=1 \
+		clone "$HTTPD_URL/smart/http_parent" http_child 2>log &&
+
+	git -C http_child log -1 --format=%s >actual &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+
+	# Client requested to use protocol v1
+	grep "Git-Protocol: version=1" log &&
+	# Server responded using protocol v1
+	grep "git< version 1" log
+'
+
+test_expect_success 'fetch with http:// using protocol v1' '
+	test_commit -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" two &&
+
+	GIT_TRACE_PACKET=1 git -C http_child -c protocol.version=1 \
+		fetch 2>log &&
+
+	git -C http_child log -1 --format=%s origin/master >actual &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+
+	# Server responded using protocol v1
+	grep "git< version 1" log
+'
+
+test_expect_success 'pull with http:// using protocol v1' '
+	GIT_TRACE_PACKET=1 git -C http_child -c protocol.version=1 \
+		pull 2>log &&
+
+	git -C http_child log -1 --format=%s >actual &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+
+	# Server responded using protocol v1
+	grep "git< version 1" log
+'
+
+test_expect_success 'push with http:// using protocol v1' '
+	test_commit -C http_child three &&
+
+	# Push to another branch, as the target repository has the
+	# master branch checked out and we cannot push into it.
+	GIT_TRACE_PACKET=1 git -C http_child -c protocol.version=1 \
+		push origin HEAD:client_branch 2>log &&
+
+	git -C http_child log -1 --format=%s >actual &&
+	git -C "$HTTPD_DOCUMENT_ROOT_PATH/http_parent" log -1 --format=%s client_branch >expect &&
+	test_cmp expect actual &&
+
+	# Server responded using protocol v1
+	grep "git< version 1" log
+'
+
+stop_httpd
+
 test_done
-- 
2.15.0.rc0.271.g36b669edcc-goog
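
For readers following the client side of this patch, here is a minimal
standalone sketch of how the header text could be formed.  It uses
neither strbuf nor libcurl, and format_git_protocol_header() is a
hypothetical name.  The only things taken from the patch are the
"Git-Protocol: version=N" shape and the rule that nothing is sent when
the configured version is 0.

```c
#include <stdio.h>

/*
 * Hypothetical helper, not part of git: write the "Git-Protocol" request
 * header for the given protocol version into buf.
 * Returns the header length, or 0 when nothing should be sent
 * (version 0 or lower, or a buffer too small to hold the header).
 */
int format_git_protocol_header(char *buf, size_t size, int version)
{
	int len;

	if (version <= 0)
		return 0;	/* v0 clients send no header at all */

	len = snprintf(buf, size, "Git-Protocol: version=%d", version);
	if (len < 0 || (size_t)len >= size)
		return 0;	/* formatting failed or output was truncated */
	return len;
}
```

With the test apache.conf from this patch, the SetEnvIf rule would then
copy the header value ("version=1") into GIT_PROTOCOL on the server
side.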



* [PATCH v4 09/11] i5700: add interop test for protocol transition
  2017-10-16 17:55     ` [PATCH v4 00/11] protocol transition Brandon Williams
                         ` (7 preceding siblings ...)
  2017-10-16 17:55       ` [PATCH v4 08/11] http: " Brandon Williams
@ 2017-10-16 17:55       ` Brandon Williams
  2017-10-16 17:55       ` [PATCH v4 10/11] ssh: introduce a 'simple' ssh variant Brandon Williams
  2017-10-16 17:55       ` [PATCH v4 11/11] Documentation: document Extra Parameters Brandon Williams
  10 siblings, 0 replies; 161+ messages in thread
From: Brandon Williams @ 2017-10-16 17:55 UTC (permalink / raw)
  To: git
  Cc: martin.agren, simon, bturner, git, gitster, jonathantanmy,
	jrnieder, peff, sbeller, Brandon Williams

Signed-off-by: Brandon Williams <bmwill@google.com>
---
 t/interop/i5700-protocol-transition.sh | 68 ++++++++++++++++++++++++++++++++++
 1 file changed, 68 insertions(+)
 create mode 100755 t/interop/i5700-protocol-transition.sh

diff --git a/t/interop/i5700-protocol-transition.sh b/t/interop/i5700-protocol-transition.sh
new file mode 100755
index 000000000..97e8e580e
--- /dev/null
+++ b/t/interop/i5700-protocol-transition.sh
@@ -0,0 +1,68 @@
+#!/bin/sh
+
+VERSION_A=.
+VERSION_B=v2.0.0
+
+: ${LIB_GIT_DAEMON_PORT:=5700}
+LIB_GIT_DAEMON_COMMAND='git.b daemon'
+
+test_description='clone and fetch by client who is trying to use a new protocol'
+. ./interop-lib.sh
+. "$TEST_DIRECTORY"/lib-git-daemon.sh
+
+start_git_daemon --export-all
+
+repo=$GIT_DAEMON_DOCUMENT_ROOT_PATH/repo
+
+test_expect_success "create repo served by $VERSION_B" '
+	git.b init "$repo" &&
+	git.b -C "$repo" commit --allow-empty -m one
+'
+
+test_expect_success "git:// clone with $VERSION_A and protocol v1" '
+	GIT_TRACE_PACKET=1 git.a -c protocol.version=1 clone "$GIT_DAEMON_URL/repo" child 2>log &&
+	git.a -C child log -1 --format=%s >actual &&
+	git.b -C "$repo" log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+	grep "version=1" log
+'
+
+test_expect_success "git:// fetch with $VERSION_A and protocol v1" '
+	git.b -C "$repo" commit --allow-empty -m two &&
+	git.b -C "$repo" log -1 --format=%s >expect &&
+
+	GIT_TRACE_PACKET=1 git.a -C child -c protocol.version=1 fetch 2>log &&
+	git.a -C child log -1 --format=%s FETCH_HEAD >actual &&
+
+	test_cmp expect actual &&
+	grep "version=1" log &&
+	! grep "version 1" log
+'
+
+stop_git_daemon
+
+test_expect_success "create repo served by $VERSION_B" '
+	git.b init parent &&
+	git.b -C parent commit --allow-empty -m one
+'
+
+test_expect_success "file:// clone with $VERSION_A and protocol v1" '
+	GIT_TRACE_PACKET=1 git.a -c protocol.version=1 clone --upload-pack="git.b upload-pack" parent child2 2>log &&
+	git.a -C child2 log -1 --format=%s >actual &&
+	git.b -C parent log -1 --format=%s >expect &&
+	test_cmp expect actual &&
+	! grep "version 1" log
+'
+
+test_expect_success "file:// fetch with $VERSION_A and protocol v1" '
+	git.b -C parent commit --allow-empty -m two &&
+	git.b -C parent log -1 --format=%s >expect &&
+
+	GIT_TRACE_PACKET=1 git.a -C child2 -c protocol.version=1 fetch --upload-pack="git.b upload-pack" 2>log &&
+	git.a -C child2 log -1 --format=%s FETCH_HEAD >actual &&
+
+	test_cmp expect actual &&
+	! grep "version 1" log
+'
+
+test_done
-- 
2.15.0.rc0.271.g36b669edcc-goog
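
The interop test above depends on a server tolerating Extra Parameters
it does not understand.  A rough sketch of that kind of tolerant
parsing follows, assuming the colon-separated `<key>[=<value>]` form
this series describes for GIT_PROTOCOL.  version_from_extra_params() is
a made-up name; this is not the code in protocol.c, which may differ.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Illustrative only: scan a colon-separated list such as
 * "foo:version=1:bar=2" and return the highest protocol version this
 * sketch knows about (0 or 1).  Unknown keys and unknown version
 * values are skipped rather than treated as errors.
 */
int version_from_extra_params(const char *params)
{
	int best = 0;

	while (params && *params) {
		size_t len = strcspn(params, ":");	/* length of this entry */

		if (len > 8 && !strncmp(params, "version=", 8) &&
		    params[8] >= '0' && params[8] <= '9') {
			char *end;
			long v = strtol(params + 8, &end, 10);

			/* accept only a clean number that we understand */
			if (end == params + len && v == 1)
				best = 1;
		}
		params += len;
		if (*params == ':')
			params++;	/* step over the separator */
	}
	return best;
}
```

A NULL or empty string yields 0, which matches the idea that a client
which sends nothing gets the existing v0 behaviour.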



* [PATCH v4 10/11] ssh: introduce a 'simple' ssh variant
  2017-10-16 17:55     ` [PATCH v4 00/11] protocol transition Brandon Williams
                         ` (8 preceding siblings ...)
  2017-10-16 17:55       ` [PATCH v4 09/11] i5700: add interop test for protocol transition Brandon Williams
@ 2017-10-16 17:55       ` Brandon Williams
  2017-10-16 17:55       ` [PATCH v4 11/11] Documentation: document Extra Parameters Brandon Williams
  10 siblings, 0 replies; 161+ messages in thread
From: Brandon Williams @ 2017-10-16 17:55 UTC (permalink / raw)
  To: git
  Cc: martin.agren, simon, bturner, git, gitster, jonathantanmy,
	jrnieder, peff, sbeller, Brandon Williams

When using the 'ssh' transport, the '-o' option is used to specify an
environment variable which should be set on the remote end.  This allows
git to send additional information when contacting the server,
requesting the use of a different protocol version via the
'GIT_PROTOCOL' environment variable like so: "-o SendEnv=GIT_PROTOCOL".

Unfortunately not all ssh variants support the sending of environment
variables to the remote end.  To account for this, only use the '-o'
option for ssh variants which are OpenSSH compliant.  This is done by
checking that the basename of the ssh command is 'ssh' or the ssh
variant is overridden to be 'ssh' (via the ssh.variant config).

Other options like '-p' and '-P', which are used to specify the port to
connect to, or '-4' and '-6', which are used to indicate that IPv4 or
IPv6 addresses should be used, may also not be supported by all ssh
variants.

Currently if an ssh command's basename isn't 'plink' or
'tortoiseplink' git assumes that the command is an OpenSSH variant.
Since user configured ssh commands may not be OpenSSH compliant, tighten
this constraint and assume a variant of 'simple' if the basename of the
command doesn't match the variants known to git.  The new ssh variant
'simple' will only have the host and command to execute ([username@]host
command) passed as parameters to the ssh command.

Update the Documentation to better reflect the command-line options sent
to ssh commands based on their variant.

Reported-by: Jeffrey Yasskin <jyasskin@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 Documentation/config.txt |  27 ++++++++++--
 Documentation/git.txt    |   9 ++--
 connect.c                | 108 ++++++++++++++++++++++++++---------------------
 t/t5601-clone.sh         |  26 +++++++++---
 t/t5700-protocol-v1.sh   |   2 +
 5 files changed, 111 insertions(+), 61 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index b78747abc..0460af37e 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2084,12 +2084,31 @@ ssh.variant::
 	Depending on the value of the environment variables `GIT_SSH` or
 	`GIT_SSH_COMMAND`, or the config setting `core.sshCommand`, Git
 	auto-detects whether to adjust its command-line parameters for use
-	with plink or tortoiseplink, as opposed to the default (OpenSSH).
+	with ssh (OpenSSH), plink or tortoiseplink, as opposed to the default
+	(simple).
 +
 The config variable `ssh.variant` can be set to override this auto-detection;
-valid values are `ssh`, `plink`, `putty` or `tortoiseplink`. Any other value
-will be treated as normal ssh. This setting can be overridden via the
-environment variable `GIT_SSH_VARIANT`.
+valid values are `ssh`, `simple`, `plink`, `putty` or `tortoiseplink`. Any
+other value will be treated as normal ssh. This setting can be overridden via
+the environment variable `GIT_SSH_VARIANT`.
++
+The current command-line parameters used for each variant are as
+follows:
++
+--
+
+* `ssh` - [-p port] [-4] [-6] [-o option] [username@]host command
+
+* `simple` - [username@]host command
+
+* `plink` or `putty` - [-P port] [-4] [-6] [username@]host command
+
+* `tortoiseplink` - [-P port] [-4] [-6] -batch [username@]host command
+
+--
++
+Except for the `simple` variant, command-line parameters are likely to
+change as git gains new features.
 
 i18n.commitEncoding::
 	Character encoding the commit messages are stored in; Git itself
diff --git a/Documentation/git.txt b/Documentation/git.txt
index 7518ea3af..8bc3f2147 100644
--- a/Documentation/git.txt
+++ b/Documentation/git.txt
@@ -518,11 +518,10 @@ other
 	If either of these environment variables is set then 'git fetch'
 	and 'git push' will use the specified command instead of 'ssh'
 	when they need to connect to a remote system.
-	The command will be given exactly two or four arguments: the
-	'username@host' (or just 'host') from the URL and the shell
-	command to execute on that remote system, optionally preceded by
-	`-p` (literally) and the 'port' from the URL when it specifies
-	something other than the default SSH port.
+	The command-line parameters passed to the configured command are
+	determined by the ssh variant.  See `ssh.variant` option in
+	linkgit:git-config[1] for details.
+
 +
 `$GIT_SSH_COMMAND` takes precedence over `$GIT_SSH`, and is interpreted
 by the shell, which allows additional arguments to be included.
diff --git a/connect.c b/connect.c
index b8695a2fa..7fbd396b3 100644
--- a/connect.c
+++ b/connect.c
@@ -776,37 +776,44 @@ static const char *get_ssh_command(void)
 	return NULL;
 }
 
-static int override_ssh_variant(int *port_option, int *needs_batch)
+enum ssh_variant {
+	VARIANT_SIMPLE,
+	VARIANT_SSH,
+	VARIANT_PLINK,
+	VARIANT_PUTTY,
+	VARIANT_TORTOISEPLINK,
+};
+
+static int override_ssh_variant(enum ssh_variant *ssh_variant)
 {
-	char *variant;
+	const char *variant = getenv("GIT_SSH_VARIANT");
 
-	variant = xstrdup_or_null(getenv("GIT_SSH_VARIANT"));
-	if (!variant &&
-	    git_config_get_string("ssh.variant", &variant))
+	if (!variant && git_config_get_string_const("ssh.variant", &variant))
 		return 0;
 
-	if (!strcmp(variant, "plink") || !strcmp(variant, "putty")) {
-		*port_option = 'P';
-		*needs_batch = 0;
-	} else if (!strcmp(variant, "tortoiseplink")) {
-		*port_option = 'P';
-		*needs_batch = 1;
-	} else {
-		*port_option = 'p';
-		*needs_batch = 0;
-	}
-	free(variant);
+	if (!strcmp(variant, "plink"))
+		*ssh_variant = VARIANT_PLINK;
+	else if (!strcmp(variant, "putty"))
+		*ssh_variant = VARIANT_PUTTY;
+	else if (!strcmp(variant, "tortoiseplink"))
+		*ssh_variant = VARIANT_TORTOISEPLINK;
+	else if (!strcmp(variant, "simple"))
+		*ssh_variant = VARIANT_SIMPLE;
+	else
+		*ssh_variant = VARIANT_SSH;
+
 	return 1;
 }
 
-static void handle_ssh_variant(const char *ssh_command, int is_cmdline,
-			       int *port_option, int *needs_batch)
+static enum ssh_variant determine_ssh_variant(const char *ssh_command,
+					      int is_cmdline)
 {
+	enum ssh_variant ssh_variant = VARIANT_SIMPLE;
 	const char *variant;
 	char *p = NULL;
 
-	if (override_ssh_variant(port_option, needs_batch))
-		return;
+	if (override_ssh_variant(&ssh_variant))
+		return ssh_variant;
 
 	if (!is_cmdline) {
 		p = xstrdup(ssh_command);
@@ -825,19 +832,22 @@ static void handle_ssh_variant(const char *ssh_command, int is_cmdline,
 			free(ssh_argv);
 		} else {
 			free(p);
-			return;
+			return ssh_variant;
 		}
 	}
 
-	if (!strcasecmp(variant, "plink") ||
-	    !strcasecmp(variant, "plink.exe"))
-		*port_option = 'P';
+	if (!strcasecmp(variant, "ssh") ||
+	    !strcasecmp(variant, "ssh.exe"))
+		ssh_variant = VARIANT_SSH;
+	else if (!strcasecmp(variant, "plink") ||
+		 !strcasecmp(variant, "plink.exe"))
+		ssh_variant = VARIANT_PLINK;
 	else if (!strcasecmp(variant, "tortoiseplink") ||
-		 !strcasecmp(variant, "tortoiseplink.exe")) {
-		*port_option = 'P';
-		*needs_batch = 1;
-	}
+		 !strcasecmp(variant, "tortoiseplink.exe"))
+		ssh_variant = VARIANT_TORTOISEPLINK;
+
 	free(p);
+	return ssh_variant;
 }
 
 /*
@@ -937,8 +947,7 @@ struct child_process *git_connect(int fd[2], const char *url,
 		conn->in = conn->out = -1;
 		if (protocol == PROTO_SSH) {
 			const char *ssh;
-			int needs_batch = 0;
-			int port_option = 'p';
+			enum ssh_variant variant;
 			char *ssh_host = hostandport;
 			const char *port = NULL;
 			transport_check_allowed("ssh");
@@ -965,10 +974,9 @@ struct child_process *git_connect(int fd[2], const char *url,
 				die("strange hostname '%s' blocked", ssh_host);
 
 			ssh = get_ssh_command();
-			if (ssh)
-				handle_ssh_variant(ssh, 1, &port_option,
-						   &needs_batch);
-			else {
+			if (ssh) {
+				variant = determine_ssh_variant(ssh, 1);
+			} else {
 				/*
 				 * GIT_SSH is the no-shell version of
 				 * GIT_SSH_COMMAND (and must remain so for
@@ -979,32 +987,38 @@ struct child_process *git_connect(int fd[2], const char *url,
 				ssh = getenv("GIT_SSH");
 				if (!ssh)
 					ssh = "ssh";
-				else
-					handle_ssh_variant(ssh, 0,
-							   &port_option,
-							   &needs_batch);
+				variant = determine_ssh_variant(ssh, 0);
 			}
 
 			argv_array_push(&conn->args, ssh);
 
-			if (get_protocol_version_config() > 0) {
+			if (variant == VARIANT_SSH &&
+			    get_protocol_version_config() > 0) {
 				argv_array_push(&conn->args, "-o");
 				argv_array_push(&conn->args, "SendEnv=" GIT_PROTOCOL_ENVIRONMENT);
 				argv_array_pushf(&conn->env_array, GIT_PROTOCOL_ENVIRONMENT "=version=%d",
 						 get_protocol_version_config());
 			}
 
-			if (flags & CONNECT_IPV4)
-				argv_array_push(&conn->args, "-4");
-			else if (flags & CONNECT_IPV6)
-				argv_array_push(&conn->args, "-6");
-			if (needs_batch)
+			if (variant != VARIANT_SIMPLE) {
+				if (flags & CONNECT_IPV4)
+					argv_array_push(&conn->args, "-4");
+				else if (flags & CONNECT_IPV6)
+					argv_array_push(&conn->args, "-6");
+			}
+
+			if (variant == VARIANT_TORTOISEPLINK)
 				argv_array_push(&conn->args, "-batch");
-			if (port) {
-				argv_array_pushf(&conn->args,
-						 "-%c", port_option);
+
+			if (port && variant != VARIANT_SIMPLE) {
+				if (variant == VARIANT_SSH)
+					argv_array_push(&conn->args, "-p");
+				else
+					argv_array_push(&conn->args, "-P");
+
 				argv_array_push(&conn->args, port);
 			}
+
 			argv_array_push(&conn->args, ssh_host);
 		} else {
 			transport_check_allowed("file");
diff --git a/t/t5601-clone.sh b/t/t5601-clone.sh
index 9c56f771b..86811a0c3 100755
--- a/t/t5601-clone.sh
+++ b/t/t5601-clone.sh
@@ -309,8 +309,8 @@ test_expect_success 'clone checking out a tag' '
 setup_ssh_wrapper () {
 	test_expect_success 'setup ssh wrapper' '
 		cp "$GIT_BUILD_DIR/t/helper/test-fake-ssh$X" \
-			"$TRASH_DIRECTORY/ssh-wrapper$X" &&
-		GIT_SSH="$TRASH_DIRECTORY/ssh-wrapper$X" &&
+			"$TRASH_DIRECTORY/ssh$X" &&
+		GIT_SSH="$TRASH_DIRECTORY/ssh$X" &&
 		export GIT_SSH &&
 		export TRASH_DIRECTORY &&
 		>"$TRASH_DIRECTORY"/ssh-output
@@ -318,7 +318,7 @@ setup_ssh_wrapper () {
 }
 
 copy_ssh_wrapper_as () {
-	cp "$TRASH_DIRECTORY/ssh-wrapper$X" "${1%$X}$X" &&
+	cp "$TRASH_DIRECTORY/ssh$X" "${1%$X}$X" &&
 	GIT_SSH="${1%$X}$X" &&
 	export GIT_SSH
 }
@@ -362,10 +362,26 @@ test_expect_success 'bracketed hostnames are still ssh' '
 	expect_ssh "-p 123" myhost src
 '
 
-test_expect_success 'uplink is not treated as putty' '
+test_expect_success 'OpenSSH variant passes -4' '
+	git clone -4 "[myhost:123]:src" ssh-ipv4-clone &&
+	expect_ssh "-4 -p 123" myhost src
+'
+
+test_expect_success 'variant can be overridden' '
+	git -c ssh.variant=simple clone -4 "[myhost:123]:src" ssh-simple-clone &&
+	expect_ssh myhost src
+'
+
+test_expect_success 'simple is treated as simple' '
+	copy_ssh_wrapper_as "$TRASH_DIRECTORY/simple" &&
+	git clone -4 "[myhost:123]:src" ssh-bracket-clone-simple &&
+	expect_ssh myhost src
+'
+
+test_expect_success 'uplink is treated as simple' '
 	copy_ssh_wrapper_as "$TRASH_DIRECTORY/uplink" &&
 	git clone "[myhost:123]:src" ssh-bracket-clone-uplink &&
-	expect_ssh "-p 123" myhost src
+	expect_ssh myhost src
 '
 
 test_expect_success 'plink is treated specially (as putty)' '
diff --git a/t/t5700-protocol-v1.sh b/t/t5700-protocol-v1.sh
index b0779d362..ba86a44eb 100755
--- a/t/t5700-protocol-v1.sh
+++ b/t/t5700-protocol-v1.sh
@@ -147,6 +147,8 @@ test_expect_success 'push with file:// using protocol v1' '
 test_expect_success 'setup ssh wrapper' '
 	GIT_SSH="$GIT_BUILD_DIR/t/helper/test-fake-ssh" &&
 	export GIT_SSH &&
+	GIT_SSH_VARIANT=ssh &&
+	export GIT_SSH_VARIANT &&
 	export TRASH_DIRECTORY &&
 	>"$TRASH_DIRECTORY"/ssh-output
 '
-- 
2.15.0.rc0.271.g36b669edcc-goog
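
To make the detection rule in the commit message concrete, here is a
small self-contained sketch of classifying a command by its basename
and picking the port flag for it.  The names (ssh_variant_sketch,
classify_ssh_command, port_flag) are invented for illustration.  Unlike
connect.c, the sketch ignores the ssh.variant / GIT_SSH_VARIANT
override, does not split a GIT_SSH_COMMAND line, and only understands
'/' as a path separator.

```c
#include <string.h>
#include <strings.h>	/* strcasecmp (POSIX) */

enum ssh_variant_sketch { SK_SIMPLE, SK_SSH, SK_PLINK, SK_TORTOISEPLINK };

/*
 * Hypothetical stand-in for the basename check: classify an ssh command
 * by its last path component.  Anything unrecognized is "simple".
 */
enum ssh_variant_sketch classify_ssh_command(const char *cmd)
{
	const char *base = strrchr(cmd, '/');

	base = base ? base + 1 : cmd;
	if (!strcasecmp(base, "ssh") || !strcasecmp(base, "ssh.exe"))
		return SK_SSH;
	if (!strcasecmp(base, "plink") || !strcasecmp(base, "plink.exe"))
		return SK_PLINK;
	if (!strcasecmp(base, "tortoiseplink") ||
	    !strcasecmp(base, "tortoiseplink.exe"))
		return SK_TORTOISEPLINK;
	return SK_SIMPLE;
}

/* The port flag each variant accepts, or NULL if it takes none. */
const char *port_flag(enum ssh_variant_sketch v)
{
	switch (v) {
	case SK_SSH:
		return "-p";
	case SK_PLINK:
	case SK_TORTOISEPLINK:
		return "-P";
	default:
		return NULL;	/* simple: only [username@]host command */
	}
}
```

Under this rule a wrapper named "uplink" falls through to the simple
variant, so no port option is passed to it, which is what the updated
t5601 test expects.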



* [PATCH v4 11/11] Documentation: document Extra Parameters
  2017-10-16 17:55     ` [PATCH v4 00/11] protocol transition Brandon Williams
                         ` (9 preceding siblings ...)
  2017-10-16 17:55       ` [PATCH v4 10/11] ssh: introduce a 'simple' ssh variant Brandon Williams
@ 2017-10-16 17:55       ` Brandon Williams
  10 siblings, 0 replies; 161+ messages in thread
From: Brandon Williams @ 2017-10-16 17:55 UTC (permalink / raw)
  To: git
  Cc: martin.agren, simon, bturner, git, gitster, jonathantanmy,
	jrnieder, peff, sbeller, Brandon Williams

From: Jonathan Tan <jonathantanmy@google.com>

Document the server support for Extra Parameters, additional information
that the client can send in its first message to the server during a
Git client-server interaction.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
---
 Documentation/technical/http-protocol.txt |  8 ++++++
 Documentation/technical/pack-protocol.txt | 43 ++++++++++++++++++++++++++-----
 2 files changed, 44 insertions(+), 7 deletions(-)

diff --git a/Documentation/technical/http-protocol.txt b/Documentation/technical/http-protocol.txt
index 1c561bdd9..a0e45f288 100644
--- a/Documentation/technical/http-protocol.txt
+++ b/Documentation/technical/http-protocol.txt
@@ -219,6 +219,10 @@ smart server reply:
    S: 003c2cb58b79488a98d2721cea644875a8dd0026b115 refs/tags/v1.0\n
    S: 003fa3c2e2402b99163d1d59756e5f207ae21cccba4c refs/tags/v1.0^{}\n
 
+The client may send Extra Parameters (see
+Documentation/technical/pack-protocol.txt) as a colon-separated string
+in the Git-Protocol HTTP header.
+
 Dumb Server Response
 ^^^^^^^^^^^^^^^^^^^^
 Dumb servers MUST respond with the dumb server reply format.
@@ -269,7 +273,11 @@ the C locale ordering.  The stream SHOULD include the default ref
 named `HEAD` as the first ref.  The stream MUST include capability
 declarations behind a NUL on the first ref.
 
+The returned response contains "version 1" if "version=1" was sent as an
+Extra Parameter.
+
   smart_reply     =  PKT-LINE("# service=$servicename" LF)
+		     *1("version 1")
 		     ref_list
 		     "0000"
   ref_list        =  empty_list / non_empty_list
diff --git a/Documentation/technical/pack-protocol.txt b/Documentation/technical/pack-protocol.txt
index ed1eae8b8..cd31edc91 100644
--- a/Documentation/technical/pack-protocol.txt
+++ b/Documentation/technical/pack-protocol.txt
@@ -39,6 +39,19 @@ communicates with that invoked process over the SSH connection.
 The file:// transport runs the 'upload-pack' or 'receive-pack'
 process locally and communicates with it over a pipe.
 
+Extra Parameters
+----------------
+
+The protocol provides a mechanism by which a client can send additional
+information in its first message to the server. These are called "Extra
+Parameters", and are supported by the Git, SSH, and HTTP protocols.
+
+Each Extra Parameter takes the form of `<key>=<value>` or `<key>`.
+
+Servers that receive any such Extra Parameters MUST ignore all
+unrecognized keys. Currently, the only Extra Parameter recognized is
+"version=1".
+
 Git Transport
 -------------
 
@@ -46,18 +59,25 @@ The Git transport starts off by sending the command and repository
 on the wire using the pkt-line format, followed by a NUL byte and a
 hostname parameter, terminated by a NUL byte.
 
-   0032git-upload-pack /project.git\0host=myserver.com\0
+   0033git-upload-pack /project.git\0host=myserver.com\0
+
+The transport may send Extra Parameters by adding an additional NUL
+byte, and then adding one or more NUL-terminated strings:
+
+   003egit-upload-pack /project.git\0host=myserver.com\0\0version=1\0
 
 --
-   git-proto-request = request-command SP pathname NUL [ host-parameter NUL ]
+   git-proto-request = request-command SP pathname NUL
+		       [ host-parameter NUL ] [ NUL extra-parameters ]
    request-command   = "git-upload-pack" / "git-receive-pack" /
 		       "git-upload-archive"   ; case sensitive
    pathname          = *( %x01-ff ) ; exclude NUL
    host-parameter    = "host=" hostname [ ":" port ]
+   extra-parameters  = 1*extra-parameter
+   extra-parameter   = 1*( %x01-ff ) NUL
 --
 
-Only host-parameter is allowed in the git-proto-request. Clients
-MUST NOT attempt to send additional parameters. It is used for the
+host-parameter is used for the
 git-daemon name based virtual hosting.  See --interpolated-path
 option to git daemon, with the %H/%CH format characters.
 
@@ -117,6 +137,12 @@ we execute it without the leading '/'.
 		     v
    ssh user@example.com "git-upload-pack '~alice/project.git'"
 
+Depending on the value of the `protocol.version` configuration variable,
+Git may attempt to send Extra Parameters as a colon-separated string in
+the GIT_PROTOCOL environment variable. This is done only if
+the `ssh.variant` configuration variable indicates that the ssh command
+supports passing environment variables as an argument.
+
 A few things to remember here:
 
 - The "command name" is spelled with dash (e.g. git-upload-pack), but
@@ -137,11 +163,13 @@ Reference Discovery
 -------------------
 
 When the client initially connects the server will immediately respond
-with a listing of each reference it has (all branches and tags) along
+with a version number (if "version=1" is sent as an Extra Parameter),
+and a listing of each reference it has (all branches and tags) along
 with the object name that each reference currently points to.
 
-   $ echo -e -n "0039git-upload-pack /schacon/gitbook.git\0host=example.com\0" |
+   $ echo -e -n "0045git-upload-pack /schacon/gitbook.git\0host=example.com\0\0version=1\0" |
       nc -v example.com 9418
+   000eversion 1
    00887217a7c7e582c46cec22a130adf4b9d7d950fba0 HEAD\0multi_ack thin-pack
 		side-band side-band-64k ofs-delta shallow no-progress include-tag
    00441d3fcd5ced445d1abc402225c0b8a1299641f497 refs/heads/integration
@@ -165,7 +193,8 @@ immediately after the ref itself, if presented. A conforming server
 MUST peel the ref if it's an annotated tag.
 
 ----
-  advertised-refs  =  (no-refs / list-of-refs)
+  advertised-refs  =  *1("version 1")
+		      (no-refs / list-of-refs)
 		      *shallow
 		      flush-pkt
 
-- 
2.15.0.rc0.271.g36b669edcc-goog
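
Since the pkt-line length prefixes in these examples are easy to get
wrong by one, here is a minimal sketch of the arithmetic, assuming the
git-proto-request layout documented above: a 4-digit hex length that
counts itself, then command, SP, pathname, NUL, host-parameter, NUL,
and optionally NUL plus one NUL-terminated Extra Parameter.
build_proto_request() is a hypothetical helper, not a function in git.

```c
#include <stdio.h>
#include <string.h>

/*
 * Sketch only: build a git:// request pkt-line in buf and return the
 * total number of bytes written (including the 4-byte hex length
 * prefix), or 0 if buf is too small.  `extra` may be NULL; otherwise it
 * is a single Extra Parameter such as "version=1".
 */
size_t build_proto_request(char *buf, size_t size, const char *command,
			   const char *path, const char *host,
			   const char *extra)
{
	size_t total = 4 + strlen(command) + 1 + strlen(path) + 1 +
		       strlen("host=") + strlen(host) + 1;
	char *p = buf;

	if (extra)
		total += 1 + strlen(extra) + 1;	/* NUL separator, param, NUL */
	if (total > size || total > 0xffff)
		return 0;	/* must fit the buffer and four hex digits */

	p += sprintf(p, "%04zx%s %s", total, command, path);
	*p++ = '\0';
	p += sprintf(p, "host=%s", host);
	*p++ = '\0';
	if (extra) {
		*p++ = '\0';	/* empty string marks the start of extras */
		p += sprintf(p, "%s", extra);
		*p++ = '\0';
	}
	return (size_t)(p - buf);
}
```

For "git-upload-pack", "/project.git" and host "myserver.com" this
gives a total of 0x33 bytes without Extra Parameters and 0x3e bytes
with "version=1".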



* Re: [PATCH v4 03/11] protocol: introduce protocol extension mechanisms
  2017-10-16 17:55       ` [PATCH v4 03/11] protocol: introduce protocol extension mechanisms Brandon Williams
@ 2017-10-16 21:25         ` Kevin Daudt
  0 siblings, 0 replies; 161+ messages in thread
From: Kevin Daudt @ 2017-10-16 21:25 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, martin.agren, simon, bturner, git, gitster, jonathantanmy,
	jrnieder, peff, sbeller

On Mon, Oct 16, 2017 at 10:55:24AM -0700, Brandon Williams wrote:
> Create protocol.{c,h} and provide functions which future servers and
> clients can use to determine which protocol to use or is being used.
> 
> Also introduce the 'GIT_PROTOCOL' environment variable which will be
> used to communicate a colon separated list of keys with optional values
> to a server.  Unknown keys and values must be tolerated.  This mechanism
> is used to communicate which version of the wire protocol a client would
> like to use with a server.
> 
> Signed-off-by: Brandon Williams <bmwill@google.com>
> ---
>  Documentation/config.txt | 17 +++++++++++
>  Documentation/git.txt    |  6 ++++
>  Makefile                 |  1 +
>  cache.h                  |  8 +++++
>  protocol.c               | 79 ++++++++++++++++++++++++++++++++++++++++++++++++
>  protocol.h               | 33 ++++++++++++++++++++
>  6 files changed, 144 insertions(+)
>  create mode 100644 protocol.c
>  create mode 100644 protocol.h
>
> [...]
> 
> diff --git a/protocol.h b/protocol.h
> new file mode 100644
> index 000000000..1b2bc94a8
> --- /dev/null
> +++ b/protocol.h
> @@ -0,0 +1,33 @@
> +#ifndef PROTOCOL_H
> +#define PROTOCOL_H
> +
> +enum protocol_version {
> +	protocol_unknown_version = -1,
> +	protocol_v0 = 0,
> +	protocol_v1 = 1,
> +};
> +
> +/*
> + * Used by a client to determine which protocol version to request be used when
> + * communicating with a server, reflecting the configured value of the
> + * 'protocol.version' config.  If unconfigured, a value of 'protocol_v0' is
> + * returned.
> + */

The first sentence reads a little weird to me around 'which version to
request be used'. 
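
As an aside on the quoted commit message: the colon-separated key list
carried in GIT_PROTOCOL could be scanned roughly as below.  This is only
a sketch, not the code in protocol.c; the function name
requested_version() is made up for illustration, and taking the last
"version" key seen (rather than, say, the highest) is an assumption.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Scan a colon-separated list such as "version=1:foo=bar" and return
 * the value of the "version" key, or 0 when it is absent.  Unknown
 * keys and values are skipped, as the commit message requires.
 * (Hypothetical helper, not part of the patch.)
 */
static int requested_version(const char *list)
{
	int version = 0;

	while (list && *list) {
		/* Length of the current "key[=value]" element. */
		size_t len = strcspn(list, ":");

		if (len > 8 && !strncmp(list, "version=", 8))
			version = atoi(list + 8);

		list += len;
		if (*list == ':')
			list++;
	}
	return version;
}
```

A caller would pass it the result of getenv("GIT_PROTOCOL"); a NULL or
empty value falls through to version 0.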

^ permalink raw reply	[flat|nested] 161+ messages in thread

* [PATCH 0/5] Coping with unrecognized ssh wrapper scripts in GIT_SSH
  2017-10-16 17:18         ` Brandon Williams
@ 2017-10-23 21:28           ` Jonathan Nieder
  2017-10-23 21:29             ` [PATCH 1/5] connect: split git:// setup into a separate function Jonathan Nieder
                               ` (4 more replies)
  0 siblings, 5 replies; 161+ messages in thread
From: Jonathan Nieder @ 2017-10-23 21:28 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, bturner, git, gitster, jonathantanmy, peff, sbeller, William Yan

Hi,

Brandon Williams wrote:
> On 10/03, Jonathan Nieder wrote:

>> What happens if I specify a ssh://host:port/path URL and the SSH
>> implementation is of 'simple' type?
>
> The port would only be sent if your ssh command supported it.

Thanks again for this patch.  William Yan (cc-ed) ran into this case,
and based on his experience here are some patches to handle it.

See patches 3 and 5 for more details.

Longer term, I hope that wrapper script authors start setting
GIT_SSH_VARIANT based on the behavior they want (e.g. like (*))
instead of making us autodetect, but this should be useful as a
stopgap in the meantime.

These patches are against bw/protocol-v1, which is in "next".

Thoughts of all kinds welcome, as always.

Sincerely,
Jonathan Nieder (5):
  connect: split git:// setup into a separate function
  connect: split ssh command line options into separate function
  ssh: 'auto' variant to select between 'ssh' and 'simple'
  ssh: 'simple' variant does not support -4/-6
  ssh: 'simple' variant does not support --port

 Documentation/config.txt |  17 +--
 connect.c                | 275 ++++++++++++++++++++++++++++++-----------------
 t/t5601-clone.sh         |  34 ++++--
 t/t5603-clone-dirname.sh |   2 +
 4 files changed, 214 insertions(+), 114 deletions(-)

^ permalink raw reply	[flat|nested] 161+ messages in thread

* [PATCH 1/5] connect: split git:// setup into a separate function
  2017-10-23 21:28           ` [PATCH 0/5] Coping with unrecognized ssh wrapper scripts in GIT_SSH Jonathan Nieder
@ 2017-10-23 21:29             ` Jonathan Nieder
  2017-10-23 22:16               ` Stefan Beller
  2017-10-23 21:30             ` [PATCH 2/5] connect: split ssh command line options into " Jonathan Nieder
                               ` (3 subsequent siblings)
  4 siblings, 1 reply; 161+ messages in thread
From: Jonathan Nieder @ 2017-10-23 21:29 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, bturner, git, gitster, jonathantanmy, peff, sbeller, William Yan

The git_connect function is growing long.  Split the
PROTO_GIT-specific portion to a separate function to make it easier to
read.

No functional change intended.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
 connect.c | 103 +++++++++++++++++++++++++++++++++++---------------------------
 1 file changed, 59 insertions(+), 44 deletions(-)

diff --git a/connect.c b/connect.c
index 7fbd396b35..068e70caad 100644
--- a/connect.c
+++ b/connect.c
@@ -850,6 +850,64 @@ static enum ssh_variant determine_ssh_variant(const char *ssh_command,
 	return ssh_variant;
 }
 
+/*
+ * Open a connection using Git's native protocol.
+ *
+ * The caller is responsible for freeing hostandport, but this function may
+ * modify it (for example, to truncate it to remove the port part).
+ */
+static struct child_process *git_connect_git(int fd[2], char *hostandport,
+					     const char *path, const char *prog,
+					     int flags)
+{
+	struct child_process *conn = &no_fork;
+	struct strbuf request = STRBUF_INIT;
+	/*
+	 * Set up virtual host information based on where we will
+	 * connect, unless the user has overridden us in
+	 * the environment.
+	 */
+	char *target_host = getenv("GIT_OVERRIDE_VIRTUAL_HOST");
+	if (target_host)
+		target_host = xstrdup(target_host);
+	else
+		target_host = xstrdup(hostandport);
+
+	transport_check_allowed("git");
+
+	/* These underlying connection commands die() if they
+	 * cannot connect.
+	 */
+	if (git_use_proxy(hostandport))
+		conn = git_proxy_connect(fd, hostandport);
+	else
+		git_tcp_connect(fd, hostandport, flags);
+	/*
+	 * Separate original protocol components prog and path
+	 * from extended host header with a NUL byte.
+	 *
+	 * Note: Do not add any other headers here!  Doing so
+	 * will cause older git-daemon servers to crash.
+	 */
+	strbuf_addf(&request,
+		    "%s %s%chost=%s%c",
+		    prog, path, 0,
+		    target_host, 0);
+
+	/* If using a new version put that stuff here after a second null byte */
+	if (get_protocol_version_config() > 0) {
+		strbuf_addch(&request, '\0');
+		strbuf_addf(&request, "version=%d%c",
+			    get_protocol_version_config(), '\0');
+	}
+
+	packet_write(fd[1], request.buf, request.len);
+
+	free(target_host);
+	strbuf_release(&request);
+	return conn;
+}
+
 /*
  * This returns a dummy child_process if the transport protocol does not
  * need fork(2), or a struct child_process object if it does.  Once done,
@@ -881,50 +939,7 @@ struct child_process *git_connect(int fd[2], const char *url,
 		printf("Diag: path=%s\n", path ? path : "NULL");
 		conn = NULL;
 	} else if (protocol == PROTO_GIT) {
-		struct strbuf request = STRBUF_INIT;
-		/*
-		 * Set up virtual host information based on where we will
-		 * connect, unless the user has overridden us in
-		 * the environment.
-		 */
-		char *target_host = getenv("GIT_OVERRIDE_VIRTUAL_HOST");
-		if (target_host)
-			target_host = xstrdup(target_host);
-		else
-			target_host = xstrdup(hostandport);
-
-		transport_check_allowed("git");
-
-		/* These underlying connection commands die() if they
-		 * cannot connect.
-		 */
-		if (git_use_proxy(hostandport))
-			conn = git_proxy_connect(fd, hostandport);
-		else
-			git_tcp_connect(fd, hostandport, flags);
-		/*
-		 * Separate original protocol components prog and path
-		 * from extended host header with a NUL byte.
-		 *
-		 * Note: Do not add any other headers here!  Doing so
-		 * will cause older git-daemon servers to crash.
-		 */
-		strbuf_addf(&request,
-			    "%s %s%chost=%s%c",
-			    prog, path, 0,
-			    target_host, 0);
-
-		/* If using a new version put that stuff here after a second null byte */
-		if (get_protocol_version_config() > 0) {
-			strbuf_addch(&request, '\0');
-			strbuf_addf(&request, "version=%d%c",
-				    get_protocol_version_config(), '\0');
-		}
-
-		packet_write(fd[1], request.buf, request.len);
-
-		free(target_host);
-		strbuf_release(&request);
+		conn = git_connect_git(fd, hostandport, path, prog, flags);
 	} else {
 		struct strbuf cmd = STRBUF_INIT;
 		const char *const *var;
-- 
2.15.0.rc1.287.g2b38de12cc


^ permalink raw reply related	[flat|nested] 161+ messages in thread

* [PATCH 2/5] connect: split ssh command line options into separate function
  2017-10-23 21:28           ` [PATCH 0/5] Coping with unrecognized ssh wrapper scripts in GIT_SSH Jonathan Nieder
  2017-10-23 21:29             ` [PATCH 1/5] connect: split git:// setup into a separate function Jonathan Nieder
@ 2017-10-23 21:30             ` Jonathan Nieder
  2017-10-23 21:48               ` Stefan Beller
  2017-10-23 21:31             ` [PATCH 3/5] ssh: 'auto' variant to select between 'ssh' and 'simple' Jonathan Nieder
                               ` (2 subsequent siblings)
  4 siblings, 1 reply; 161+ messages in thread
From: Jonathan Nieder @ 2017-10-23 21:30 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, bturner, git, gitster, jonathantanmy, peff, sbeller, William Yan

The git_connect function is growing long.  Split the portion that
discovers an ssh command and options it accepts before the service
name and path to a separate function to make it easier to read.

No functional change intended.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
 connect.c | 116 +++++++++++++++++++++++++++++++++-----------------------------
 1 file changed, 61 insertions(+), 55 deletions(-)

diff --git a/connect.c b/connect.c
index 068e70caad..77ab6db3bb 100644
--- a/connect.c
+++ b/connect.c
@@ -908,6 +908,65 @@ static struct child_process *git_connect_git(int fd[2], char *hostandport,
 	return conn;
 }
 
+/* Prepare a child_process for use by Git's SSH-tunneled transport. */
+static void fill_ssh_args(struct child_process *conn, const char *ssh_host,
+			  const char *port, int flags)
+{
+	const char *ssh;
+	enum ssh_variant variant;
+
+	if (looks_like_command_line_option(ssh_host))
+		die("strange hostname '%s' blocked", ssh_host);
+
+	ssh = get_ssh_command();
+	if (ssh) {
+		variant = determine_ssh_variant(ssh, 1);
+	} else {
+		/*
+		 * GIT_SSH is the no-shell version of
+		 * GIT_SSH_COMMAND (and must remain so for
+		 * historical compatibility).
+		 */
+		conn->use_shell = 0;
+
+		ssh = getenv("GIT_SSH");
+		if (!ssh)
+			ssh = "ssh";
+		variant = determine_ssh_variant(ssh, 0);
+	}
+
+	argv_array_push(&conn->args, ssh);
+
+	if (variant == VARIANT_SSH &&
+	    get_protocol_version_config() > 0) {
+		argv_array_push(&conn->args, "-o");
+		argv_array_push(&conn->args, "SendEnv=" GIT_PROTOCOL_ENVIRONMENT);
+		argv_array_pushf(&conn->env_array, GIT_PROTOCOL_ENVIRONMENT "=version=%d",
+				 get_protocol_version_config());
+	}
+
+	if (variant != VARIANT_SIMPLE) {
+		if (flags & CONNECT_IPV4)
+			argv_array_push(&conn->args, "-4");
+		else if (flags & CONNECT_IPV6)
+			argv_array_push(&conn->args, "-6");
+	}
+
+	if (variant == VARIANT_TORTOISEPLINK)
+		argv_array_push(&conn->args, "-batch");
+
+	if (port && variant != VARIANT_SIMPLE) {
+		if (variant == VARIANT_SSH)
+			argv_array_push(&conn->args, "-p");
+		else
+			argv_array_push(&conn->args, "-P");
+
+		argv_array_push(&conn->args, port);
+	}
+
+	argv_array_push(&conn->args, ssh_host);
+}
+
 /*
  * This returns a dummy child_process if the transport protocol does not
  * need fork(2), or a struct child_process object if it does.  Once done,
@@ -961,16 +1020,13 @@ struct child_process *git_connect(int fd[2], const char *url,
 		conn->use_shell = 1;
 		conn->in = conn->out = -1;
 		if (protocol == PROTO_SSH) {
-			const char *ssh;
-			enum ssh_variant variant;
 			char *ssh_host = hostandport;
 			const char *port = NULL;
+
 			transport_check_allowed("ssh");
 			get_host_and_port(&ssh_host, &port);
-
 			if (!port)
 				port = get_port(ssh_host);
-
 			if (flags & CONNECT_DIAG_URL) {
 				printf("Diag: url=%s\n", url ? url : "NULL");
 				printf("Diag: protocol=%s\n", prot_name(protocol));
@@ -984,57 +1040,7 @@ struct child_process *git_connect(int fd[2], const char *url,
 				strbuf_release(&cmd);
 				return NULL;
 			}
-
-			if (looks_like_command_line_option(ssh_host))
-				die("strange hostname '%s' blocked", ssh_host);
-
-			ssh = get_ssh_command();
-			if (ssh) {
-				variant = determine_ssh_variant(ssh, 1);
-			} else {
-				/*
-				 * GIT_SSH is the no-shell version of
-				 * GIT_SSH_COMMAND (and must remain so for
-				 * historical compatibility).
-				 */
-				conn->use_shell = 0;
-
-				ssh = getenv("GIT_SSH");
-				if (!ssh)
-					ssh = "ssh";
-				variant = determine_ssh_variant(ssh, 0);
-			}
-
-			argv_array_push(&conn->args, ssh);
-
-			if (variant == VARIANT_SSH &&
-			    get_protocol_version_config() > 0) {
-				argv_array_push(&conn->args, "-o");
-				argv_array_push(&conn->args, "SendEnv=" GIT_PROTOCOL_ENVIRONMENT);
-				argv_array_pushf(&conn->env_array, GIT_PROTOCOL_ENVIRONMENT "=version=%d",
-						 get_protocol_version_config());
-			}
-
-			if (variant != VARIANT_SIMPLE) {
-				if (flags & CONNECT_IPV4)
-					argv_array_push(&conn->args, "-4");
-				else if (flags & CONNECT_IPV6)
-					argv_array_push(&conn->args, "-6");
-			}
-
-			if (variant == VARIANT_TORTOISEPLINK)
-				argv_array_push(&conn->args, "-batch");
-
-			if (port && variant != VARIANT_SIMPLE) {
-				if (variant == VARIANT_SSH)
-					argv_array_push(&conn->args, "-p");
-				else
-					argv_array_push(&conn->args, "-P");
-
-				argv_array_push(&conn->args, port);
-			}
-
-			argv_array_push(&conn->args, ssh_host);
+			fill_ssh_args(conn, ssh_host, port, flags);
 		} else {
 			transport_check_allowed("file");
 			if (get_protocol_version_config() > 0) {
-- 
2.15.0.rc1.287.g2b38de12cc


^ permalink raw reply related	[flat|nested] 161+ messages in thread

* [PATCH 3/5] ssh: 'auto' variant to select between 'ssh' and 'simple'
  2017-10-23 21:28           ` [PATCH 0/5] Coping with unrecognized ssh wrapper scripts in GIT_SSH Jonathan Nieder
  2017-10-23 21:29             ` [PATCH 1/5] connect: split git:// setup into a separate function Jonathan Nieder
  2017-10-23 21:30             ` [PATCH 2/5] connect: split ssh command line options into " Jonathan Nieder
@ 2017-10-23 21:31             ` Jonathan Nieder
  2017-10-23 22:19               ` Jonathan Tan
                                 ` (3 more replies)
  2017-10-23 21:32             ` [PATCH 4/5] ssh: 'simple' variant does not support -4/-6 Jonathan Nieder
  2017-10-23 21:33             ` [PATCH 5/5] ssh: 'simple' variant does not support --port Jonathan Nieder
  4 siblings, 4 replies; 161+ messages in thread
From: Jonathan Nieder @ 2017-10-23 21:31 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, bturner, git, gitster, jonathantanmy, peff, sbeller, William Yan

Android's "repo" tool is a tool for managing a large codebase
consisting of multiple smaller repositories, similar to Git's
submodule feature.  Starting with Git 94b8ae5a (ssh: introduce a
'simple' ssh variant, 2017-10-16), users noticed that it stopped
handling the port in ssh:// URLs.

The cause: when it encounters ssh:// URLs, repo pre-connects to the
server and sets GIT_SSH to a helper ".repo/repo/git_ssh" that reuses
that connection.  Before 94b8ae5a, the helper was assumed to support
OpenSSH options for lack of a better guess and got passed a -p option
to set the port.  After that patch, it uses the new default of a
simple helper that does not accept an option to set the port.

The next release of "repo" will set GIT_SSH_VARIANT to "ssh" to avoid
that.  But users of old versions and of other similar GIT_SSH
implementations would not get the benefit of that fix.

So update the default to use OpenSSH options again, with a twist.  As
observed in 94b8ae5a, we cannot assume that $GIT_SSH always handles
OpenSSH options: common helpers such as travis-ci's dpl[*] are
configured using GIT_SSH and do not accept OpenSSH options.  So make
the default a new variant "auto", with the following behavior:

 1. First, check whether $GIT_SSH supports OpenSSH options by running

	$GIT_SSH -G <options> <host>

    With OpenSSH, this prints the configuration and returns status 0
    if it recognizes all <options>, and returns status 255 if it
    encounters an unrecognized option.  A wrapper script like

	exec ssh -- "$@"

    would fail with

	ssh: Could not resolve hostname -g: Name or service not known

    , correctly reflecting that it does not support OpenSSH options.

 2. Based on the result from step (1), behave like "ssh" (if it
    succeeded) or "simple" (if it failed).

This way, the default ssh variant for unrecognized commands can handle
both the repo and dpl cases as intended.

If the GIT_SSH command name is recognized (e.g., "ssh") then continue
to use that variant directly without the autodetection step (1), as
before.
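
The decision described in steps (1) and (2) can be sketched in isolation
as follows.  The enum and function names here are invented for the
sketch and are not the identifiers used in connect.c; the only input is
the exit status of the "-G" probe.

```c
/* Hypothetical names for this sketch only. */
enum probe_variant { PROBE_AUTO, PROBE_SIMPLE, PROBE_SSH };

/*
 * Resolve "auto" from the exit status of "$GIT_SSH -G <options> <host>":
 * status 0 means the command accepted the OpenSSH options, anything
 * else (for example 255) means it did not.  A variant that was already
 * determined from the command name is returned unchanged.
 */
static enum probe_variant resolve_auto(enum probe_variant variant,
				       int probe_status)
{
	if (variant != PROBE_AUTO)
		return variant;
	return probe_status == 0 ? PROBE_SSH : PROBE_SIMPLE;
}
```

Keeping the probe result as a plain status makes the rule easy to check:
only a fully successful "-G" run selects the OpenSSH behavior.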

[*] https://github.com/travis-ci/dpl/blob/6c3fddfda1f2a85944c544446b068bac0a77c049/lib/dpl/provider.rb#L215

Reported-by: William Yan <wyan@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
repo fix: https://gerrit-review.googlesource.com/c/git-repo/+/134950

 Documentation/config.txt | 17 ++++++++----
 connect.c                | 72 ++++++++++++++++++++++++++++++++----------------
 t/t5601-clone.sh         | 14 ++++++++++
 3 files changed, 73 insertions(+), 30 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 0460af37e2..4a16b324f0 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2083,14 +2083,19 @@ visited as a result of a redirection do not participate in matching.
 ssh.variant::
 	Depending on the value of the environment variables `GIT_SSH` or
 	`GIT_SSH_COMMAND`, or the config setting `core.sshCommand`, Git
-	auto-detects whether to adjust its command-line parameters for use
-	with ssh (OpenSSH), plink or tortoiseplink, as opposed to the default
-	(simple).
+	auto-detects whether to pass command-line parameters for use
+	with a simple wrapper script (simple), OpenSSH (ssh), plink, or
+	tortoiseplink.
++
+The default is `auto`, which means to auto-detect whether the ssh command
+implements OpenSSH options using the `-G` (print configuration) option.
+If the ssh command supports OpenSSH options, it then behaves like `ssh`;
+otherwise, it behaves like `simple`.
 +
 The config variable `ssh.variant` can be set to override this auto-detection;
-valid values are `ssh`, `simple`, `plink`, `putty` or `tortoiseplink`. Any
-other value will be treated as normal ssh. This setting can be overridden via
-the environment variable `GIT_SSH_VARIANT`.
+valid values are `ssh`, `simple`, `plink`, `putty`, `tortoiseplink`, and
+`auto`.  Any other value will be treated as normal ssh.  This setting can be
+overridden via the environment variable `GIT_SSH_VARIANT`.
 +
 The current command-line parameters used for each variant are as
 follows:
diff --git a/connect.c b/connect.c
index 77ab6db3bb..2dc9554b30 100644
--- a/connect.c
+++ b/connect.c
@@ -777,6 +777,7 @@ static const char *get_ssh_command(void)
 }
 
 enum ssh_variant {
+	VARIANT_AUTO,
 	VARIANT_SIMPLE,
 	VARIANT_SSH,
 	VARIANT_PLINK,
@@ -791,7 +792,9 @@ static int override_ssh_variant(enum ssh_variant *ssh_variant)
 	if (!variant && git_config_get_string_const("ssh.variant", &variant))
 		return 0;
 
-	if (!strcmp(variant, "plink"))
+	if (!strcmp(variant, "auto"))
+		*ssh_variant = VARIANT_AUTO;
+	else if (!strcmp(variant, "plink"))
 		*ssh_variant = VARIANT_PLINK;
 	else if (!strcmp(variant, "putty"))
 		*ssh_variant = VARIANT_PUTTY;
@@ -808,7 +811,7 @@ static int override_ssh_variant(enum ssh_variant *ssh_variant)
 static enum ssh_variant determine_ssh_variant(const char *ssh_command,
 					      int is_cmdline)
 {
-	enum ssh_variant ssh_variant = VARIANT_SIMPLE;
+	enum ssh_variant ssh_variant = VARIANT_AUTO;
 	const char *variant;
 	char *p = NULL;
 
@@ -908,6 +911,38 @@ static struct child_process *git_connect_git(int fd[2], char *hostandport,
 	return conn;
 }
 
+static void push_ssh_options(struct argv_array *args, struct argv_array *env,
+			       enum ssh_variant variant, const char *port,
+			       int flags)
+{
+	if (variant == VARIANT_SSH &&
+	    get_protocol_version_config() > 0) {
+		argv_array_push(args, "-o");
+		argv_array_push(args, "SendEnv=" GIT_PROTOCOL_ENVIRONMENT);
+		argv_array_pushf(env, GIT_PROTOCOL_ENVIRONMENT "=version=%d",
+				 get_protocol_version_config());
+	}
+
+	if (variant != VARIANT_SIMPLE) {
+		if (flags & CONNECT_IPV4)
+			argv_array_push(args, "-4");
+		else if (flags & CONNECT_IPV6)
+			argv_array_push(args, "-6");
+	}
+
+	if (variant == VARIANT_TORTOISEPLINK)
+		argv_array_push(args, "-batch");
+
+	if (port && variant != VARIANT_SIMPLE) {
+		if (variant == VARIANT_SSH)
+			argv_array_push(args, "-p");
+		else
+			argv_array_push(args, "-P");
+
+		argv_array_push(args, port);
+	}
+}
+
 /* Prepare a child_process for use by Git's SSH-tunneled transport. */
 static void fill_ssh_args(struct child_process *conn, const char *ssh_host,
 			  const char *port, int flags)
@@ -937,33 +972,22 @@ static void fill_ssh_args(struct child_process *conn, const char *ssh_host,
 
 	argv_array_push(&conn->args, ssh);
 
-	if (variant == VARIANT_SSH &&
-	    get_protocol_version_config() > 0) {
-		argv_array_push(&conn->args, "-o");
-		argv_array_push(&conn->args, "SendEnv=" GIT_PROTOCOL_ENVIRONMENT);
-		argv_array_pushf(&conn->env_array, GIT_PROTOCOL_ENVIRONMENT "=version=%d",
-				 get_protocol_version_config());
-	}
-
-	if (variant != VARIANT_SIMPLE) {
-		if (flags & CONNECT_IPV4)
-			argv_array_push(&conn->args, "-4");
-		else if (flags & CONNECT_IPV6)
-			argv_array_push(&conn->args, "-6");
-	}
+	if (variant == VARIANT_AUTO) {
+		struct child_process detect = CHILD_PROCESS_INIT;
 
-	if (variant == VARIANT_TORTOISEPLINK)
-		argv_array_push(&conn->args, "-batch");
+		detect.use_shell = conn->use_shell;
+		detect.no_stdin = detect.no_stdout = detect.no_stderr = 1;
 
-	if (port && variant != VARIANT_SIMPLE) {
-		if (variant == VARIANT_SSH)
-			argv_array_push(&conn->args, "-p");
-		else
-			argv_array_push(&conn->args, "-P");
+		argv_array_push(&detect.args, ssh);
+		argv_array_push(&detect.args, "-G");
+		push_ssh_options(&detect.args, &detect.env_array,
+				 VARIANT_SSH, port, flags);
+		argv_array_push(&detect.args, ssh_host);
 
-		argv_array_push(&conn->args, port);
+		variant = run_command(&detect) ? VARIANT_SIMPLE : VARIANT_SSH;
 	}
 
+	push_ssh_options(&conn->args, &conn->env_array, variant, port, flags);
 	argv_array_push(&conn->args, ssh_host);
 }
 
diff --git a/t/t5601-clone.sh b/t/t5601-clone.sh
index 86811a0c35..fd94dd40d2 100755
--- a/t/t5601-clone.sh
+++ b/t/t5601-clone.sh
@@ -384,6 +384,20 @@ test_expect_success 'uplink is treated as simple' '
 	expect_ssh myhost src
 '
 
+test_expect_success 'OpenSSH-like uplink is treated as ssh' '
+	write_script "$TRASH_DIRECTORY/uplink" <<-EOF &&
+	if test "\$1" = "-G"
+	then
+		exit 0
+	fi &&
+	exec "\$TRASH_DIRECTORY/ssh$X" "\$@"
+	EOF
+	GIT_SSH="$TRASH_DIRECTORY/uplink" &&
+	export GIT_SSH &&
+	git clone "[myhost:123]:src" ssh-bracket-clone-sshlike-uplink &&
+	expect_ssh "-p 123" myhost src
+'
+
 test_expect_success 'plink is treated specially (as putty)' '
 	copy_ssh_wrapper_as "$TRASH_DIRECTORY/plink" &&
 	git clone "[myhost:123]:src" ssh-bracket-clone-plink-0 &&
-- 
2.15.0.rc1.287.g2b38de12cc


^ permalink raw reply related	[flat|nested] 161+ messages in thread

* [PATCH 4/5] ssh: 'simple' variant does not support -4/-6
  2017-10-23 21:28           ` [PATCH 0/5] Coping with unrecognized ssh wrapper scripts in GIT_SSH Jonathan Nieder
                               ` (2 preceding siblings ...)
  2017-10-23 21:31             ` [PATCH 3/5] ssh: 'auto' variant to select between 'ssh' and 'simple' Jonathan Nieder
@ 2017-10-23 21:32             ` Jonathan Nieder
  2017-10-23 21:33             ` [PATCH 5/5] ssh: 'simple' variant does not support --port Jonathan Nieder
  4 siblings, 0 replies; 161+ messages in thread
From: Jonathan Nieder @ 2017-10-23 21:32 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, bturner, git, gitster, jonathantanmy, peff, sbeller, William Yan

If the user passes -4/--ipv4 or -6/--ipv6 to "git fetch" or "git push"
and the ssh command configured with GIT_SSH does not support such a
setting, error out instead of ignoring the option and continuing.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
 connect.c        | 25 ++++++++++++++++++++++---
 t/t5601-clone.sh | 12 ++++++------
 2 files changed, 28 insertions(+), 9 deletions(-)

diff --git a/connect.c b/connect.c
index 2dc9554b30..a1c0ba1b3a 100644
--- a/connect.c
+++ b/connect.c
@@ -923,11 +923,30 @@ static void push_ssh_options(struct argv_array *args, struct argv_array *env,
 				 get_protocol_version_config());
 	}
 
-	if (variant != VARIANT_SIMPLE) {
-		if (flags & CONNECT_IPV4)
+	if (flags & CONNECT_IPV4) {
+		switch (variant) {
+		case VARIANT_AUTO:
+			BUG("VARIANT_AUTO passed to push_ssh_options");
+		case VARIANT_SIMPLE:
+			die("ssh variant 'simple' does not support -4");
+		case VARIANT_SSH:
+		case VARIANT_PLINK:
+		case VARIANT_PUTTY:
+		case VARIANT_TORTOISEPLINK:
 			argv_array_push(args, "-4");
-		else if (flags & CONNECT_IPV6)
+		}
+	} else if (flags & CONNECT_IPV6) {
+		switch (variant) {
+		case VARIANT_AUTO:
+			BUG("VARIANT_AUTO passed to push_ssh_options");
+		case VARIANT_SIMPLE:
+			die("ssh variant 'simple' does not support -6");
+		case VARIANT_SSH:
+		case VARIANT_PLINK:
+		case VARIANT_PUTTY:
+		case VARIANT_TORTOISEPLINK:
 			argv_array_push(args, "-6");
+		}
 	}
 
 	if (variant == VARIANT_TORTOISEPLINK)
diff --git a/t/t5601-clone.sh b/t/t5601-clone.sh
index fd94dd40d2..a672fc1e08 100755
--- a/t/t5601-clone.sh
+++ b/t/t5601-clone.sh
@@ -367,15 +367,15 @@ test_expect_success 'OpenSSH variant passes -4' '
 	expect_ssh "-4 -p 123" myhost src
 '
 
-test_expect_success 'variant can be overriden' '
-	git -c ssh.variant=simple clone -4 "[myhost:123]:src" ssh-simple-clone &&
-	expect_ssh myhost src
+test_expect_success 'variant can be overridden' '
+	copy_ssh_wrapper_as "$TRASH_DIRECTORY/putty" &&
+	git -c ssh.variant=putty clone -4 "[myhost:123]:src" ssh-putty-clone &&
+	expect_ssh "-4 -P 123" myhost src
 '
 
-test_expect_success 'simple is treated as simple' '
+test_expect_success 'simple does not support -4/-6' '
 	copy_ssh_wrapper_as "$TRASH_DIRECTORY/simple" &&
-	git clone -4 "[myhost:123]:src" ssh-bracket-clone-simple &&
-	expect_ssh myhost src
+	test_must_fail git clone -4 "[myhost:123]:src" ssh-bracket-clone-simple
 '
 
 test_expect_success 'uplink is treated as simple' '
-- 
2.15.0.rc1.287.g2b38de12cc


^ permalink raw reply related	[flat|nested] 161+ messages in thread

* [PATCH 5/5] ssh: 'simple' variant does not support --port
  2017-10-23 21:28           ` [PATCH 0/5] Coping with unrecognized ssh wrapper scripts in GIT_SSH Jonathan Nieder
                               ` (3 preceding siblings ...)
  2017-10-23 21:32             ` [PATCH 4/5] ssh: 'simple' variant does not support -4/-6 Jonathan Nieder
@ 2017-10-23 21:33             ` Jonathan Nieder
  2017-10-23 22:37               ` Stefan Beller
  4 siblings, 1 reply; 161+ messages in thread
From: Jonathan Nieder @ 2017-10-23 21:33 UTC (permalink / raw)
  To: Brandon Williams
  Cc: git, bturner, git, gitster, jonathantanmy, peff, sbeller, William Yan

When connecting to an ssh:// URL with an explicitly specified port, if
the ssh command configured with GIT_SSH does not support such a
setting, it is less confusing to error out than to silently suppress
the port setting and continue.

This requires updating the GIT_SSH setting in t5603-clone-dirname.sh.
That test is about the directory name produced when cloning various
URLs.  It uses an ssh wrapper that ignores all its arguments but does
not declare that it supports a port argument; update it to set
GIT_SSH_VARIANT=ssh to do so.  (Real-life ssh wrappers that pass a
port argument to OpenSSH would also support -G and would not require
such an update.)
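
The per-variant rule can be summarized by a small lookup like the one
below.  It is a sketch with made-up names (port_variant, port_option),
not the code in this patch, which calls die() directly; here a NULL
return stands for "this variant cannot set a port, report an error".

```c
#include <stddef.h>

/* Hypothetical names for this sketch only. */
enum port_variant {
	PORT_SIMPLE,
	PORT_SSH,
	PORT_PLINK,
	PORT_PUTTY,
	PORT_TORTOISEPLINK
};

/*
 * Return the option that introduces the port for this variant, or
 * NULL when the variant has no way to set one.  The caller is expected
 * to fail loudly on NULL instead of dropping the port.
 */
static const char *port_option(enum port_variant variant)
{
	switch (variant) {
	case PORT_SSH:
		return "-p";
	case PORT_PLINK:
	case PORT_PUTTY:
	case PORT_TORTOISEPLINK:
		return "-P";
	case PORT_SIMPLE:
		break;
	}
	return NULL;
}
```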

Reported-by: William Yan <wyan@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
That's the end of the series.  Thanks for reading.

 connect.c                | 15 ++++++++++++---
 t/t5601-clone.sh         | 10 ++++++++--
 t/t5603-clone-dirname.sh |  2 ++
 3 files changed, 22 insertions(+), 5 deletions(-)

diff --git a/connect.c b/connect.c
index a1c0ba1b3a..98f2d9ce57 100644
--- a/connect.c
+++ b/connect.c
@@ -952,11 +952,20 @@ static void push_ssh_options(struct argv_array *args, struct argv_array *env,
 	if (variant == VARIANT_TORTOISEPLINK)
 		argv_array_push(args, "-batch");
 
-	if (port && variant != VARIANT_SIMPLE) {
-		if (variant == VARIANT_SSH)
+	if (port) {
+		switch (variant) {
+		case VARIANT_AUTO:
+			BUG("VARIANT_AUTO passed to push_ssh_options");
+		case VARIANT_SIMPLE:
+			die("ssh variant 'simple' does not support setting port");
+		case VARIANT_SSH:
 			argv_array_push(args, "-p");
-		else
+			break;
+		case VARIANT_PLINK:
+		case VARIANT_PUTTY:
+		case VARIANT_TORTOISEPLINK:
 			argv_array_push(args, "-P");
+		}
 
 		argv_array_push(args, port);
 	}
diff --git a/t/t5601-clone.sh b/t/t5601-clone.sh
index a672fc1e08..11fa516997 100755
--- a/t/t5601-clone.sh
+++ b/t/t5601-clone.sh
@@ -375,12 +375,18 @@ test_expect_success 'variant can be overridden' '
 
 test_expect_success 'simple does not support -4/-6' '
 	copy_ssh_wrapper_as "$TRASH_DIRECTORY/simple" &&
-	test_must_fail git clone -4 "[myhost:123]:src" ssh-bracket-clone-simple
+	test_must_fail git clone -4 "myhost:src" ssh-4-clone-simple
+'
+
+test_expect_success 'simple does not support port' '
+	copy_ssh_wrapper_as "$TRASH_DIRECTORY/simple" &&
+	test_must_fail git clone "[myhost:123]:src" ssh-bracket-clone-simple
 '
 
 test_expect_success 'uplink is treated as simple' '
 	copy_ssh_wrapper_as "$TRASH_DIRECTORY/uplink" &&
-	git clone "[myhost:123]:src" ssh-bracket-clone-uplink &&
+	test_must_fail git clone "[myhost:123]:src" ssh-bracket-clone-uplink &&
+	git clone "myhost:src" ssh-clone-uplink &&
 	expect_ssh myhost src
 '
 
diff --git a/t/t5603-clone-dirname.sh b/t/t5603-clone-dirname.sh
index d5af758129..13b5e5eb9b 100755
--- a/t/t5603-clone-dirname.sh
+++ b/t/t5603-clone-dirname.sh
@@ -11,7 +11,9 @@ test_expect_success 'setup ssh wrapper' '
 	git upload-pack "$TRASH_DIRECTORY"
 	EOF
 	GIT_SSH="$TRASH_DIRECTORY/ssh-wrapper" &&
+	GIT_SSH_VARIANT=ssh &&
 	export GIT_SSH &&
+	export GIT_SSH_VARIANT &&
 	export TRASH_DIRECTORY
 '
 
-- 
2.15.0.rc1.287.g2b38de12cc


^ permalink raw reply related	[flat|nested] 161+ messages in thread

* Re: [PATCH 2/5] connect: split ssh command line options into separate function
  2017-10-23 21:30             ` [PATCH 2/5] connect: split ssh command line options into " Jonathan Nieder
@ 2017-10-23 21:48               ` Stefan Beller
  0 siblings, 0 replies; 161+ messages in thread
From: Stefan Beller @ 2017-10-23 21:48 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: Brandon Williams, git, Bryan Turner, Jeff Hostetler,
	Junio C Hamano, Jonathan Tan, Jeff King, William Yan

On Mon, Oct 23, 2017 at 2:30 PM, Jonathan Nieder <jrnieder@gmail.com> wrote:
> The git_connect function is growing long.  Split the portion that
> discovers an ssh command and options it accepts before the service
> name and path to a separate function to make it easier to read.
>
> No functional change intended.
>
> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>

and
Reviewed-by: Stefan Beller <sbeller@google.com>

Thanks,
Stefan

^ permalink raw reply	[flat|nested] 161+ messages in thread

* Re: [PATCH 1/5] connect: split git:// setup into a separate function
  2017-10-23 21:29             ` [PATCH 1/5] connect: split git:// setup into a separate function Jonathan Nieder
@ 2017-10-23 22:16               ` Stefan Beller
  2017-10-24  0:09                 ` [WIP PATCH] diff: add option to ignore whitespaces for move detection only Stefan Beller
  2017-10-24  1:54                 ` [PATCH 1/5] connect: split git:// setup into a separate function Junio C Hamano
  0 siblings, 2 replies; 161+ messages in thread
From: Stefan Beller @ 2017-10-23 22:16 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: Brandon Williams, git, Bryan Turner, Jeff Hostetler,
	Junio C Hamano, Jonathan Tan, Jeff King, William Yan

On Mon, Oct 23, 2017 at 2:29 PM, Jonathan Nieder <jrnieder@gmail.com> wrote:
> The git_connect function is growing long.  Split the
> PROTO_GIT-specific portion to a separate function to make it easier to
> read.
>
> No functional change intended.
>
> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>

This also looks good to me.

unrelated:
Patch 2 was very easy to review using "log -p -w --color-moved",
this one however was not. This is because -w caused the diff machinery
to generate a completely different diff. (Not showing the new function
completely but some weird function header trickery. The white space
mangled output is below; most of it was colored "moved")

I had to have -w as otherwise --color-moved would not work,
so maybe we want to have an option to ignore white space for the
sake of move detection only, not affecting the diff in general;
maybe '--ignore-white-space-in-move-detection'?

I think once this option is given, all we have to do is pay attention to
this option in diff.c#moved_entry_cmp/next_byte, which is best built
on top of Peff's recent fixes in origin/jk/diff-color-moved-fix.
Would that be of interest for people?
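
To make the idea concrete, here is a minimal standalone sketch of a
comparison that ignores whitespace, which is the kind of check move
detection would need.  It does not use the real
moved_entry_cmp/next_byte code in diff.c; the helper name
lines_equal_ignoring_space is made up for illustration only.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of diff.c): return 1 if the two
 * buffers are equal once every whitespace byte is ignored, else 0.
 */
static int lines_equal_ignoring_space(const char *a, size_t alen,
				      const char *b, size_t blen)
{
	size_t i = 0, j = 0;

	for (;;) {
		/* Skip whitespace on both sides before comparing. */
		while (i < alen && isspace((unsigned char)a[i]))
			i++;
		while (j < blen && isspace((unsigned char)b[j]))
			j++;
		if (i == alen || j == blen)
			break;
		if (a[i] != b[j])
			return 0;
		i++;
		j++;
	}
	/* Equal only if both sides ran out at the same time. */
	return i == alen && j == blen;
}
```

With such a check, a line that was only re-indented would still count
as "moved", while the diff shown to the user stays untouched.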

Thanks,
Stefan

diff --git a/connect.c b/connect.c
index 7fbd396b35..068e70caad 100644
--- a/connect.c
+++ b/connect.c
@@ -851,36 +851,16 @@ static enum ssh_variant determine_ssh_variant(const char *ssh_command,
 }

 /*
- * This returns a dummy child_process if the transport protocol does not
- * need fork(2), or a struct child_process object if it does.  Once done,
- * finish the connection with finish_connect() with the value returned from
- * this function (it is safe to call finish_connect() with NULL to support
- * the former case).
+ * Open a connection using Git's native protocol.
  *
- * If it returns, the connect is successful; it just dies on errors (this
- * will hopefully be changed in a libification effort, to return NULL when
- * the connection failed).
+ * The caller is responsible for freeing hostandport, but this function may
+ * modify it (for example, to truncate it to remove the port part).
  */
-struct child_process *git_connect(int fd[2], const char *url,
-                                  const char *prog, int flags)
+static struct child_process *git_connect_git(int fd[2], char *hostandport,
+                                             const char *path, const char *prog,
+                                             int flags)
 {
-        char *hostandport, *path;
         struct child_process *conn = &no_fork;
-        enum protocol protocol;
-
-        /* Without this we cannot rely on waitpid() to tell
-         * what happened to our children.
-         */
-        signal(SIGCHLD, SIG_DFL);
-
-        protocol = parse_connect_url(url, &hostandport, &path);
-        if ((flags & CONNECT_DIAG_URL) && (protocol != PROTO_SSH)) {
-                printf("Diag: url=%s\n", url ? url : "NULL");
-                printf("Diag: protocol=%s\n", prot_name(protocol));
-                printf("Diag: hostandport=%s\n", hostandport ? hostandport : "NULL");
-                printf("Diag: path=%s\n", path ? path : "NULL");
-                conn = NULL;
-        } else if (protocol == PROTO_GIT) {
         struct strbuf request = STRBUF_INIT;
         /*
          * Set up virtual host information based on where we will
@@ -925,6 +905,41 @@ struct child_process *git_connect(int fd[2], const char *url,

         free(target_host);
         strbuf_release(&request);
+        return conn;
+}
+
+/*
+ * This returns a dummy child_process if the transport protocol does not
+ * need fork(2), or a struct child_process object if it does.  Once done,
+ * finish the connection with finish_connect() with the value returned from
+ * this function (it is safe to call finish_connect() with NULL to support
+ * the former case).
+ *
+ * If it returns, the connect is successful; it just dies on errors (this
+ * will hopefully be changed in a libification effort, to return NULL when
+ * the connection failed).
+ */
+struct child_process *git_connect(int fd[2], const char *url,
+                                  const char *prog, int flags)
+{
+        char *hostandport, *path;
+        struct child_process *conn = &no_fork;
+        enum protocol protocol;
+
+        /* Without this we cannot rely on waitpid() to tell
+         * what happened to our children.
+         */
+        signal(SIGCHLD, SIG_DFL);
+
+        protocol = parse_connect_url(url, &hostandport, &path);
+        if ((flags & CONNECT_DIAG_URL) && (protocol != PROTO_SSH)) {
+                printf("Diag: url=%s\n", url ? url : "NULL");
+                printf("Diag: protocol=%s\n", prot_name(protocol));
+                printf("Diag: hostandport=%s\n", hostandport ? hostandport : "NULL");
+                printf("Diag: path=%s\n", path ? path : "NULL");
+                conn = NULL;
+        } else if (protocol == PROTO_GIT) {
+                conn = git_connect_git(fd, hostandport, path, prog, flags);
         } else {
                 struct strbuf cmd = STRBUF_INIT;
                 const char *const *var;

* Re: [PATCH 3/5] ssh: 'auto' variant to select between 'ssh' and 'simple'
  2017-10-23 21:31             ` [PATCH 3/5] ssh: 'auto' variant to select between 'ssh' and 'simple' Jonathan Nieder
@ 2017-10-23 22:19               ` Jonathan Tan
  2017-10-23 22:43                 ` Jonathan Nieder
  2017-10-23 22:33               ` Stefan Beller
                                 ` (2 subsequent siblings)
  3 siblings, 1 reply; 161+ messages in thread
From: Jonathan Tan @ 2017-10-23 22:19 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: Brandon Williams, git, bturner, git, gitster, peff, sbeller, William Yan

On Mon, 23 Oct 2017 14:31:59 -0700
Jonathan Nieder <jrnieder@gmail.com> wrote:

> @@ -2083,14 +2083,19 @@ visited as a result of a redirection do not participate in matching.
>  ssh.variant::
>  	Depending on the value of the environment variables `GIT_SSH` or
>  	`GIT_SSH_COMMAND`, or the config setting `core.sshCommand`, Git
> -	auto-detects whether to adjust its command-line parameters for use
> -	with ssh (OpenSSH), plink or tortoiseplink, as opposed to the default
> -	(simple).
> +	auto-detects whether to pass command-line parameters for use
> +	with a simple wrapper script (simple), OpenSSH (ssh), plink, or
> +	tortoiseplink.
> ++
> +The default is `auto`, which means to auto-detect whether the ssh command
> +implements OpenSSH options using the `-G` (print configuration) option.
> +If the ssh command supports OpenSSH options, it then behaves like `ssh`;
> +otherwise, it behaves like `simple`.
>  +
>  The config variable `ssh.variant` can be set to override this auto-detection;
> -valid values are `ssh`, `simple`, `plink`, `putty` or `tortoiseplink`. Any
> -other value will be treated as normal ssh. This setting can be overridden via
> -the environment variable `GIT_SSH_VARIANT`.
> +valid values are `ssh`, `simple`, `plink`, `putty`, `tortoiseplink`, and
> +`auto`.  Any other value will be treated as normal ssh.  This setting can be
> +overridden via the environment variable `GIT_SSH_VARIANT`.

The new documentation seems to imply that setting ssh.variant (or
GIT_SSH_VARIANT) to "auto" is equivalent to not setting it at all, but
looking at the code, it doesn't seem to be the case (not setting it at
all invokes checking the first word of core.sshCommand, and only uses
VARIANT_AUTO if that check is inconclusive, whereas setting
ssh.variant=auto skips the core.sshCommand check entirely).

Maybe document ssh.variant as follows:

    If unset, Git will determine the command-line arguments to use based
    on the basename of the configured SSH command (through the
    environment variable `GIT_SSH` or `GIT_SSH_COMMAND`, or the config
    setting `core.sshCommand`). If the basename is unrecognized, Git
    will attempt to detect support of OpenSSH options by first invoking
    the configured SSH command with the `-G` (print configuration) flag,
    and will subsequently use OpenSSH options (upon success) or no
    options besides the host (upon failure).

    If set, Git will not do any auto-detection based on the basename of
    the configured SSH command. This can be set to `ssh` (OpenSSH
    options), `plink`, `putty`, `tortoiseplink`, `simple` (no options
    besides the host), or `auto` (the detection with `-G` as described
    above). If set to any other value, Git behaves as if this is set to
    `ssh`.
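
The order of checks described in that wording can be sketched as
follows.  This is a simplified model, not the code in connect.c: the
enum and function names are invented for illustration, the basename
check only knows exact names (no ".exe" or case handling), and the
result of the `-G` probe is passed in as a flag instead of running
anything.

```c
#include <stddef.h>
#include <string.h>

/* Illustrative names only; these are not the identifiers in connect.c. */
enum variant {
	V_AUTO, V_SIMPLE, V_SSH, V_PLINK, V_PUTTY, V_TORTOISEPLINK
};

/* Value of ssh.variant or GIT_SSH_VARIANT; unknown values mean ssh. */
static enum variant from_setting(const char *value)
{
	if (!strcmp(value, "auto"))
		return V_AUTO;
	if (!strcmp(value, "simple"))
		return V_SIMPLE;
	if (!strcmp(value, "plink"))
		return V_PLINK;
	if (!strcmp(value, "putty"))
		return V_PUTTY;
	if (!strcmp(value, "tortoiseplink"))
		return V_TORTOISEPLINK;
	return V_SSH;
}

/* Basename check; V_AUTO here means "basename not recognized". */
static enum variant from_basename(const char *command)
{
	const char *slash = strrchr(command, '/');
	const char *base = slash ? slash + 1 : command;

	if (!strcmp(base, "ssh"))
		return V_SSH;
	if (!strcmp(base, "plink"))
		return V_PLINK;
	if (!strcmp(base, "tortoiseplink"))
		return V_TORTOISEPLINK;
	return V_AUTO;
}

/*
 * setting:   configured variant, or NULL if unset
 * command:   configured SSH command
 * dash_g_ok: nonzero if "<command> -G ..." succeeded
 */
static enum variant pick_variant(const char *setting, const char *command,
				 int dash_g_ok)
{
	/* A configured value suppresses the basename check entirely. */
	enum variant v = setting ? from_setting(setting)
				 : from_basename(command);

	/* Fall back to the -G probe: success means ssh, failure simple. */
	if (v == V_AUTO)
		v = dash_g_ok ? V_SSH : V_SIMPLE;
	return v;
}
```

Note that in this model an explicit `auto` skips the basename check,
which is exactly the difference from the unset case pointed out above.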

(Patches 1, 2, 4, and 5 seem fine to me.)

* Re: [PATCH 3/5] ssh: 'auto' variant to select between 'ssh' and 'simple'
  2017-10-23 21:31             ` [PATCH 3/5] ssh: 'auto' variant to select between 'ssh' and 'simple' Jonathan Nieder
  2017-10-23 22:19               ` Jonathan Tan
@ 2017-10-23 22:33               ` Stefan Beller
  2017-10-23 22:54                 ` Jonathan Nieder
  2017-10-24  2:16               ` Junio C Hamano
  2017-10-25 12:51               ` Johannes Schindelin
  3 siblings, 1 reply; 161+ messages in thread
From: Stefan Beller @ 2017-10-23 22:33 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: Brandon Williams, git, Bryan Turner, Jeff Hostetler,
	Junio C Hamano, Jonathan Tan, Jeff King, William Yan

On Mon, Oct 23, 2017 at 2:31 PM, Jonathan Nieder <jrnieder@gmail.com> wrote:

>  1. First, check whether $GIT_SSH supports OpenSSH options by running
>
>         $GIT_SSH -G <options> <host>
>
>     This returns status 0 and prints configuration in OpenSSH if it
>     recognizes all <options> and returns status 255 if it encounters
>     an unrecognized option.  A wrapper script like
>
>         exec ssh -- "$@"
>
>     would fail with
>
>         ssh: Could not resolve hostname -g: Name or service not known

capital -G?


> -       if (variant == VARIANT_TORTOISEPLINK)
> -               argv_array_push(&conn->args, "-batch");
> +               detect.use_shell = conn->use_shell;

Why do we have to use a shell for evaluation of this
test balloon?

> +               detect.no_stdin = detect.no_stdout = detect.no_stderr = 1;

okay.

...
> +               argv_array_push(&detect.args, "-G");
...
> +               variant = run_command(&detect) ? VARIANT_SIMPLE : VARIANT_SSH;

What if we have a VARIANT_SIMPLE, that doesn't care about -G
but just connects to the remote host (keeping the session open), do we need
to kill it after some time to have run_command return eventually?

Or can we give a command to be executed remotely (e.g. 'false') that
we know returns quickly?

>  '
>
> +test_expect_success 'OpenSSH-like uplink is treated as ssh' '
> +       write_script "$TRASH_DIRECTORY/uplink" <<-EOF &&
> +       if test "\$1" = "-G"

Reading this test (and the commit message) I realize, we do care
about the order of options, so this is fine.

* Re: [PATCH 5/5] ssh: 'simple' variant does not support --port
  2017-10-23 21:33             ` [PATCH 5/5] ssh: 'simple' variant does not support --port Jonathan Nieder
@ 2017-10-23 22:37               ` Stefan Beller
  0 siblings, 0 replies; 161+ messages in thread
From: Stefan Beller @ 2017-10-23 22:37 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: Brandon Williams, git, Bryan Turner, Jeff Hostetler,
	Junio C Hamano, Jonathan Tan, Jeff King, William Yan

On Mon, Oct 23, 2017 at 2:33 PM, Jonathan Nieder <jrnieder@gmail.com> wrote:
> When trying to connect to an ssh:// URL with port explicitly specified
> and the ssh command configured with GIT_SSH does not support such a
> setting, it is less confusing to error out than to silently suppress
> the port setting and continue.
>
> This requires updating the GIT_SSH setting in t5603-clone-dirname.sh.
> That test is about the directory name produced when cloning various
> URLs.  It uses an ssh wrapper that ignores all its arguments but does
> not declare that it supports a port argument; update it to set
> GIT_SSH_VARIANT=ssh to do so.  (Real-life ssh wrappers that pass a
> port argument to OpenSSH would also support -G and would not require
> such an update.)
>
> Reported-by: William Yan <wyan@google.com>
> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
> ---
> That's the end of the series.  Thanks for reading.

Patches 4 & 5 look good to me,

Thanks,
Stefan

* Re: [PATCH 3/5] ssh: 'auto' variant to select between 'ssh' and 'simple'
  2017-10-23 22:19               ` Jonathan Tan
@ 2017-10-23 22:43                 ` Jonathan Nieder
  2017-10-23 22:51                   ` Brandon Williams
  0 siblings, 1 reply; 161+ messages in thread
From: Jonathan Nieder @ 2017-10-23 22:43 UTC (permalink / raw)
  To: Jonathan Tan
  Cc: Brandon Williams, git, bturner, git, gitster, peff, sbeller, William Yan

Hi,

Jonathan Tan wrote:

> The new documentation seems to imply that setting ssh.variant (or
> GIT_SSH_VARIANT) to "auto" is equivalent to not setting it at all, but
> looking at the code, it doesn't seem to be the case (not setting it at
> all invokes checking the first word of core.sshCommand, and only uses
> VARIANT_AUTO if that check is inconclusive, whereas setting
> ssh.variant=auto skips the core.sshCommand check entirely).
>
> Maybe document ssh.variant as follows:
>
>     If unset, Git will determine the command-line arguments to use based
>     on the basename of the configured SSH command (through the
>     environment variable `GIT_SSH` or `GIT_SSH_COMMAND`, or the config
>     setting `core.sshCommand`). If the basename is unrecognized, Git
>     will attempt to detect support of OpenSSH options by first invoking
>     the configured SSH command with the `-G` (print configuration) flag,
>     and will subsequently use OpenSSH options (upon success) or no
>     options besides the host (upon failure).
>
>     If set, Git will not do any auto-detection based on the basename of
>     the configured SSH command. This can be set to `ssh` (OpenSSH
>     options), `plink`, `putty`, `tortoiseplink`, `simple` (no options
>     besides the host), or `auto` (the detection with `-G` as described
>     above). If set to any other value, Git behaves as if this is set to
>     `ssh`.

Good point.  Brandon noticed something similar as well.

Separately from how to document it, what do you think a good behavior
would be?  Should the "auto" configuration trigger command line based
detection just like no configuration at all?  Should the "auto" value
for configuration be removed and that behavior restricted to the
no-configuration case?

I'm tempted to go with the former, which would look like the following.
What do you think?

If this looks good, I can reroll in a moment.

diff --git i/Documentation/config.txt w/Documentation/config.txt
index 4a16b324f0..6dffa4aa3d 100644
--- i/Documentation/config.txt
+++ w/Documentation/config.txt
@@ -2081,20 +2081,21 @@ matched against are those given directly to Git commands.  This means any URLs
 visited as a result of a redirection do not participate in matching.
 
 ssh.variant::
-	Depending on the value of the environment variables `GIT_SSH` or
-	`GIT_SSH_COMMAND`, or the config setting `core.sshCommand`, Git
-	auto-detects whether to pass command-line parameters for use
-	with a simple wrapper script (simple), OpenSSH (ssh), plink, or
-	tortoiseplink.
-+
-The default is `auto`, which means to auto-detect whether the ssh command
-implements OpenSSH options using the `-G` (print configuration) option.
-If the ssh command supports OpenSSH options, it then behaves like `ssh`;
-otherwise, it behaves like `simple`.
-+
-The config variable `ssh.variant` can be set to override this auto-detection;
-valid values are `ssh`, `simple`, `plink`, `putty`, `tortoiseplink`, and
-`auto`.  Any other value will be treated as normal ssh.  This setting can be
+	By default, Git determines the command line arguments to use
+	based on the basename of the configured SSH command (configured
+	using the environment variable `GIT_SSH` or `GIT_SSH_COMMAND` or
+	the config setting `core.sshCommand`). If the basename is
+	unrecognized, Git will attempt to detect support of OpenSSH
+	options by first invoking the configured SSH command with the
+	`-G` (print configuration) option and will subsequently use
+	OpenSSH options (if that is successful) or no options besides
+	the host and remote command (if it fails).
++
+The config variable `ssh.variant` can be set to override this detection:
+valid values are `ssh` (to use OpenSSH options), `plink`, `putty`,
+`tortoiseplink`, `simple` (no options except the host and remote command).
+The default auto-detection can be explicitly requested using the value
+`auto`.  Any other value is treated as `ssh`.  This setting can also be
 overridden via the environment variable `GIT_SSH_VARIANT`.
 +
 The current command-line parameters used for each variant are as
diff --git i/connect.c w/connect.c
index 98f2d9ce57..06bcd3981e 100644
--- i/connect.c
+++ w/connect.c
@@ -785,12 +785,12 @@ enum ssh_variant {
 	VARIANT_TORTOISEPLINK,
 };
 
-static int override_ssh_variant(enum ssh_variant *ssh_variant)
+static void override_ssh_variant(enum ssh_variant *ssh_variant)
 {
 	const char *variant = getenv("GIT_SSH_VARIANT");
 
 	if (!variant && git_config_get_string_const("ssh.variant", &variant))
-		return 0;
+		return;
 
 	if (!strcmp(variant, "auto"))
 		*ssh_variant = VARIANT_AUTO;
@@ -804,8 +804,6 @@ static int override_ssh_variant(enum ssh_variant *ssh_variant)
 		*ssh_variant = VARIANT_SIMPLE;
 	else
 		*ssh_variant = VARIANT_SSH;
-
-	return 1;
 }
 
 static enum ssh_variant determine_ssh_variant(const char *ssh_command,
@@ -815,7 +813,9 @@ static enum ssh_variant determine_ssh_variant(const char *ssh_command,
 	const char *variant;
 	char *p = NULL;
 
-	if (override_ssh_variant(&ssh_variant))
+	override_ssh_variant(&ssh_variant);
+
+	if (ssh_variant != VARIANT_AUTO)
 		return ssh_variant;
 
 	if (!is_cmdline) {
diff --git i/t/t5601-clone.sh w/t/t5601-clone.sh
index 11fa516997..f9a2ae84c7 100755
--- i/t/t5601-clone.sh
+++ w/t/t5601-clone.sh
@@ -373,6 +373,12 @@ test_expect_success 'variant can be overridden' '
 	expect_ssh "-4 -P 123" myhost src
 '
 
+test_expect_success 'variant=auto picks based on basename' '
+	copy_ssh_wrapper_as "$TRASH_DIRECTORY/plink" &&
+	git -c ssh.variant=auto clone -4 "[myhost:123]:src" ssh-auto-clone &&
+	expect_ssh "-4 -P 123" myhost src
+'
+
 test_expect_success 'simple does not support -4/-6' '
 	copy_ssh_wrapper_as "$TRASH_DIRECTORY/simple" &&
 	test_must_fail git clone -4 "myhost:src" ssh-4-clone-simple

* Re: [PATCH 3/5] ssh: 'auto' variant to select between 'ssh' and 'simple'
  2017-10-23 22:43                 ` Jonathan Nieder
@ 2017-10-23 22:51                   ` Brandon Williams
  2017-10-23 22:57                     ` Jonathan Tan
  2017-10-23 23:12                     ` [PATCH 3/5] ssh: 'auto' variant to select between 'ssh' and 'simple' Jonathan Nieder
  0 siblings, 2 replies; 161+ messages in thread
From: Brandon Williams @ 2017-10-23 22:51 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: Jonathan Tan, git, bturner, git, gitster, peff, sbeller, William Yan

On 10/23, Jonathan Nieder wrote:
> Hi,
> 
> Jonathan Tan wrote:
> 
> > The new documentation seems to imply that setting ssh.variant (or
> > GIT_SSH_VARIANT) to "auto" is equivalent to not setting it at all, but
> > looking at the code, it doesn't seem to be the case (not setting it at
> > all invokes checking the first word of core.sshCommand, and only uses
> > VARIANT_AUTO if that check is inconclusive, whereas setting
> > ssh.variant=auto skips the core.sshCommand check entirely).
> >
> > Maybe document ssh.variant as follows:
> >
> >     If unset, Git will determine the command-line arguments to use based
> >     on the basename of the configured SSH command (through the
> >     environment variable `GIT_SSH` or `GIT_SSH_COMMAND`, or the config
> >     setting `core.sshCommand`). If the basename is unrecognized, Git
> >     will attempt to detect support of OpenSSH options by first invoking
> >     the configured SSH command with the `-G` (print configuration) flag,
> >     and will subsequently use OpenSSH options (upon success) or no
> >     options besides the host (upon failure).
> >
> >     If set, Git will not do any auto-detection based on the basename of
> >     the configured SSH command. This can be set to `ssh` (OpenSSH
> >     options), `plink`, `putty`, `tortoiseplink`, `simple` (no options
> >     besides the host), or `auto` (the detection with `-G` as described
> >     above). If set to any other value, Git behaves as if this is set to
> >     `ssh`.
> 
> Good point.  Brandon noticed something similar as well.
> 
> Separately from how to document it, what do you think a good behavior
> would be?  Should the "auto" configuration trigger command line based
> detection just like no configuration at all?  Should the "auto" value
> for configuration be removed and that behavior restricted to the
> no-configuration case?
> 
> I'm tempted to go with the former, which would look like the following.
> What do you think?

As a user, having some variant as 'auto' doesn't make much sense; I mean,
isn't that exactly what the default behavior is?  Check if my ssh
command matches existing variants and go with that.  What you are
proposing is to make the existing auto detection better (yay!) though I
don't know if it warrants adding a new variant altogether.

Instead it may be better to stick this new improved detection at the end
of the existing variant discovery function 'determine_ssh_variant()' as
a last ditch effort to figure out the variant.  That way we don't have
an extra variant type that can be configured, and we eliminate some of the
additional code in the switch statements to handle that enum value
(though that isn't really that big of a deal).

> 
> If this looks good, I can reroll in a moment.
> 
> diff --git i/Documentation/config.txt w/Documentation/config.txt
> index 4a16b324f0..6dffa4aa3d 100644
> --- i/Documentation/config.txt
> +++ w/Documentation/config.txt
> @@ -2081,20 +2081,21 @@ matched against are those given directly to Git commands.  This means any URLs
>  visited as a result of a redirection do not participate in matching.
>  
>  ssh.variant::
> -	Depending on the value of the environment variables `GIT_SSH` or
> -	`GIT_SSH_COMMAND`, or the config setting `core.sshCommand`, Git
> -	auto-detects whether to pass command-line parameters for use
> -	with a simple wrapper script (simple), OpenSSH (ssh), plink, or
> -	tortoiseplink.
> -+
> -The default is `auto`, which means to auto-detect whether the ssh command
> -implements OpenSSH options using the `-G` (print configuration) option.
> -If the ssh command supports OpenSSH options, it then behaves like `ssh`;
> -otherwise, it behaves like `simple`.
> -+
> -The config variable `ssh.variant` can be set to override this auto-detection;
> -valid values are `ssh`, `simple`, `plink`, `putty`, `tortoiseplink`, and
> -`auto`.  Any other value will be treated as normal ssh.  This setting can be
> +	By default, Git determines the command line arguments to use
> +	based on the basename of the configured SSH command (configured
> +	using the environment variable `GIT_SSH` or `GIT_SSH_COMMAND` or
> +	the config setting `core.sshCommand`). If the basename is
> +	unrecognized, Git will attempt to detect support of OpenSSH
> +	options by first invoking the configured SSH command with the
> +	`-G` (print configuration) option and will subsequently use
> +	OpenSSH options (if that is successful) or no options besides
> +	the host and remote command (if it fails).
> ++
> +The config variable `ssh.variant` can be set to override this detection:
> +valid values are `ssh` (to use OpenSSH options), `plink`, `putty`,
> +`tortoiseplink`, `simple` (no options except the host and remote command).
> +The default auto-detection can be explicitly requested using the value
> +`auto`.  Any other value is treated as `ssh`.  This setting can also be
>  overridden via the environment variable `GIT_SSH_VARIANT`.
>  +
>  The current command-line parameters used for each variant are as
> diff --git i/connect.c w/connect.c
> index 98f2d9ce57..06bcd3981e 100644
> --- i/connect.c
> +++ w/connect.c
> @@ -785,12 +785,12 @@ enum ssh_variant {
>  	VARIANT_TORTOISEPLINK,
>  };
>  
> -static int override_ssh_variant(enum ssh_variant *ssh_variant)
> +static void override_ssh_variant(enum ssh_variant *ssh_variant)
>  {
>  	const char *variant = getenv("GIT_SSH_VARIANT");
>  
>  	if (!variant && git_config_get_string_const("ssh.variant", &variant))
> -		return 0;
> +		return;
>  
>  	if (!strcmp(variant, "auto"))
>  		*ssh_variant = VARIANT_AUTO;
> @@ -804,8 +804,6 @@ static int override_ssh_variant(enum ssh_variant *ssh_variant)
>  		*ssh_variant = VARIANT_SIMPLE;
>  	else
>  		*ssh_variant = VARIANT_SSH;
> -
> -	return 1;
>  }
>  
>  static enum ssh_variant determine_ssh_variant(const char *ssh_command,
> @@ -815,7 +813,9 @@ static enum ssh_variant determine_ssh_variant(const char *ssh_command,
>  	const char *variant;
>  	char *p = NULL;
>  
> -	if (override_ssh_variant(&ssh_variant))
> +	override_ssh_variant(&ssh_variant);
> +
> +	if (ssh_variant != VARIANT_AUTO)
>  		return ssh_variant;
>  
>  	if (!is_cmdline) {
> diff --git i/t/t5601-clone.sh w/t/t5601-clone.sh
> index 11fa516997..f9a2ae84c7 100755
> --- i/t/t5601-clone.sh
> +++ w/t/t5601-clone.sh
> @@ -373,6 +373,12 @@ test_expect_success 'variant can be overridden' '
>  	expect_ssh "-4 -P 123" myhost src
>  '
>  
> +test_expect_success 'variant=auto picks based on basename' '
> +	copy_ssh_wrapper_as "$TRASH_DIRECTORY/plink" &&
> +	git -c ssh.variant=auto clone -4 "[myhost:123]:src" ssh-auto-clone &&
> +	expect_ssh "-4 -P 123" myhost src
> +'
> +
>  test_expect_success 'simple does not support -4/-6' '
>  	copy_ssh_wrapper_as "$TRASH_DIRECTORY/simple" &&
>  	test_must_fail git clone -4 "myhost:src" ssh-4-clone-simple

-- 
Brandon Williams

* Re: [PATCH 3/5] ssh: 'auto' variant to select between 'ssh' and 'simple'
  2017-10-23 22:33               ` Stefan Beller
@ 2017-10-23 22:54                 ` Jonathan Nieder
  0 siblings, 0 replies; 161+ messages in thread
From: Jonathan Nieder @ 2017-10-23 22:54 UTC (permalink / raw)
  To: Stefan Beller
  Cc: Brandon Williams, git, Bryan Turner, Jeff Hostetler,
	Junio C Hamano, Jonathan Tan, Jeff King, William Yan

Stefan Beller wrote:
> On Mon, Oct 23, 2017 at 2:31 PM, Jonathan Nieder <jrnieder@gmail.com> wrote:

>>  1. First, check whether $GIT_SSH supports OpenSSH options by running
>>
>>         $GIT_SSH -G <options> <host>
>>
>>     This returns status 0 and prints configuration in OpenSSH if it
>>     recognizes all <options> and returns status 255 if it encounters
>>     an unrecognized option.  A wrapper script like
>>
>>         exec ssh -- "$@"
>>
>>     would fail with
>>
>>         ssh: Could not resolve hostname -g: Name or service not known
>
> capital -G?

The actual error message uses lowercase (presumably they use tolower on
the hostname).

>> -       if (variant == VARIANT_TORTOISEPLINK)
>> -               argv_array_push(&conn->args, "-batch");
>> +               detect.use_shell = conn->use_shell;
>
> Why do we have to use a shell for evaluation of this
> test balloon?

If the user set GIT_SSH_COMMAND instead of GIT_SSH then we need to run
it using a shell.  The above line inherits the use_shell setting so it
ends up doing whatever conn would do.

[...]
>> +               argv_array_push(&detect.args, "-G");
> ...
>> +               variant = run_command(&detect) ? VARIANT_SIMPLE : VARIANT_SSH;
>
> What if we have a VARIANT_SIMPLE, that doesn't care about -G
> but just connects to the remote host (keeping the session open), do we need
> to kill it after some time to have run_command return eventually?
>
> Or can we give a command to be executed remotely (e.g. 'false') that
> we know returns quickly?

Since stdin is /dev/null, it would presumably return quickly.  But I
also don't expect this kind of behavior from GIT_SSH commands.  The
kinds I'd expect are

 A. The repo case, which forwards to 'ssh' and supports all 'ssh'
    flags

 B. The travis-ci case, which expects a host and command and does
    not accept any flags

 C. More sophisticated wrappers that parse flags but still do not
    handle -G

For case (A), the detection run would figure out that this accepts
OpenSSH options. Good.

For case (B), the detection run would figure out that this does not
accept OpenSSH options. Good.

For case (C), the detection run would think that this does not accept
OpenSSH options, when it accepts some. But I think that's the best we
can do for now. Longer term, we need to work with the author of such a
script to find out what kind of interface they want.
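
For reference, the detection run discussed above boils down to
something like the following standalone sketch.  It uses plain
fork/exec instead of git's run_command API, the function name
accepts_dash_g is made up, and it does not cover the GIT_SSH_COMMAND
case where the command has to go through a shell.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical stand-in for the detection run: execute
 * "<command> -G <host>" with stdin, stdout and stderr on /dev/null
 * and return 1 if it exited with status 0, otherwise 0.
 */
static int accepts_dash_g(const char *command, const char *host)
{
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return 0;
	if (pid == 0) {
		/* Child: silence all standard streams, then exec. */
		int fd = open("/dev/null", O_RDWR);

		if (fd < 0)
			_exit(127);
		dup2(fd, 0);
		dup2(fd, 1);
		dup2(fd, 2);
		if (fd > 2)
			close(fd);
		execlp(command, command, "-G", host, (char *)NULL);
		_exit(127);	/* exec failed */
	}
	/* Parent: wait for the probe, retrying if interrupted. */
	while (waitpid(pid, &status, 0) < 0) {
		if (errno != EINTR)
			return 0;
	}
	return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

A wrapper of kind (A) makes this return 1; kinds (B) and (C) make it
return 0 as long as they reject `-G` with a nonzero exit status.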

Thanks and hope that helps,
Jonathan

* Re: [PATCH 3/5] ssh: 'auto' variant to select between 'ssh' and 'simple'
  2017-10-23 22:51                   ` Brandon Williams
@ 2017-10-23 22:57                     ` Jonathan Tan
  2017-10-23 23:16                       ` [PATCH v2 0/5] Coping with unrecognized ssh wrapper scripts in GIT_SSH Jonathan Nieder
  2017-10-23 23:12                     ` [PATCH 3/5] ssh: 'auto' variant to select between 'ssh' and 'simple' Jonathan Nieder
  1 sibling, 1 reply; 161+ messages in thread
From: Jonathan Tan @ 2017-10-23 22:57 UTC (permalink / raw)
  To: Brandon Williams
  Cc: Jonathan Nieder, git, bturner, git, gitster, peff, sbeller, William Yan

On Mon, 23 Oct 2017 15:51:06 -0700
Brandon Williams <bmwill@google.com> wrote:

> On 10/23, Jonathan Nieder wrote:
> > Separately from how to document it, what do you think a good behavior
> > would be?  Should the "auto" configuration trigger command line based
> > detection just like no configuration at all?  Should the "auto" value
> > for configuration be removed and that behavior restricted to the
> > no-configuration case?
> > 
> > I'm tempted to go with the former, which would look like the following.
> > What do you think?
> 
> As a user, having some variant as 'auto' doesn't make much sense; I mean,
> isn't that exactly what the default behavior is?

So you're suggesting the second option ("that behavior restricted to the
no-configuration case")?

I'm leaning towards supporting "auto", actually. At the very least, it
gives the user a clear way to override an existing config.

> Check if my ssh
> command matches existing variants and go with that.  What you are
> proposing is to make the existing auto detection better (yay!) though I
> don't know if it warrants adding a new variant altogether.
> 
> Instead it may be better to stick this new improved detection at the end
> of the existing variant discovery function 'determine_ssh_variant()' as
> a last ditch effort to figure out the variant.  That way we don't have
> an extra variant type that can be configured, and we eliminate some of the
> additional code in the switch statements to handle that enum value
> (though that isn't really that big of a deal).

This sounds like what is already being done in the code.

> > If this looks good, I can reroll in a moment.

Yes, this looks good.

* Re: [PATCH 3/5] ssh: 'auto' variant to select between 'ssh' and 'simple'
  2017-10-23 22:51                   ` Brandon Williams
  2017-10-23 22:57                     ` Jonathan Tan
@ 2017-10-23 23:12                     ` Jonathan Nieder
  1 sibling, 0 replies; 161+ messages in thread
From: Jonathan Nieder @ 2017-10-23 23:12 UTC (permalink / raw)
  To: Brandon Williams
  Cc: Jonathan Tan, git, bturner, git, gitster, peff, sbeller, William Yan

Brandon Williams wrote:
> On 10/23, Jonathan Nieder wrote:

>> Separately from how to document it, what do you think a good behavior
>> would be?  Should the "auto" configuration trigger command line based
>> detection just like no configuration at all?  Should the "auto" value
>> for configuration be removed and that behavior restricted to the
>> no-configuration case?
>>
>> I'm tempted to go with the former, which would look like the following.
>> What do you think?
>
> As a user having some variant as 'auto' doesn't make much sense, i mean
> isn't that exactly what the default behavior is?  Check if my ssh
> command matches existing variants and go with that.  What you are
> proposing is the make the existing auto detection better (yay!) though I
> don't know if it warrants adding a new variant all together.
>
> Instead it may be better to stick this new improved detection at the end
> of the existing variant discovery function 'determine_ssh_variant()' as
> a last ditch effort to figure out the variant.  That way we don't have
> an extra variant type that can be configured and eliminates some of the
> additional code in the switch statements to handle that enum value
> (though that isn't really that big of a deal).

Yes, if git config allowed e.g. ".git/config" to unset a setting from
e.g. "/etc/gitconfig", then I wouldn't want the 'auto' value to exist
in configuration at all.  But because git's config language doesn't
have a way to unset a variable, this series provided "auto" as a way
to explicitly request the same behavior as unset.
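
To make the intent concrete, here is a minimal sketch of that rule (not
the code from the series): an unset value and "auto" resolve to the same
thing, and the environment takes precedence over config.  The names
variant_from_setting() and resolve_variant() are made up for
illustration; the enum mirrors the one in connect.c.

```c
#include <string.h>

enum ssh_variant {
	VARIANT_AUTO,
	VARIANT_SIMPLE,
	VARIANT_SSH,
	VARIANT_PLINK,
	VARIANT_PUTTY,
	VARIANT_TORTOISEPLINK,
};

/*
 * Hypothetical helper: map a configured string to a variant.
 * NULL stands for "not configured" and is treated exactly like "auto".
 */
static enum ssh_variant variant_from_setting(const char *value)
{
	if (!value || !strcmp(value, "auto"))
		return VARIANT_AUTO;
	if (!strcmp(value, "plink"))
		return VARIANT_PLINK;
	if (!strcmp(value, "putty"))
		return VARIANT_PUTTY;
	if (!strcmp(value, "tortoiseplink"))
		return VARIANT_TORTOISEPLINK;
	if (!strcmp(value, "simple"))
		return VARIANT_SIMPLE;
	/* Any other value is treated as OpenSSH. */
	return VARIANT_SSH;
}

/*
 * Hypothetical helper: the environment value (GIT_SSH_VARIANT), when
 * present, is used instead of the config value (ssh.variant).
 */
static enum ssh_variant resolve_variant(const char *env_value,
					const char *config_value)
{
	return variant_from_setting(env_value ? env_value : config_value);
}
```

With this shape, a repository can set ssh.variant=auto to get the
detection behavior back even when a system-wide config names a specific
variant.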

In that spirit, the patch I sent was broken ("auto" meant something
different from unset), so thanks for pointing the issue out.

Sensible?
Jonathan

* [PATCH v2 0/5] Coping with unrecognized ssh wrapper scripts in GIT_SSH
  2017-10-23 22:57                     ` Jonathan Tan
@ 2017-10-23 23:16                       ` Jonathan Nieder
  2017-10-23 23:17                         ` [PATCH 1/5] connect: split git:// setup into a separate function Jonathan Nieder
                                           ` (5 more replies)
  0 siblings, 6 replies; 161+ messages in thread
From: Jonathan Nieder @ 2017-10-23 23:16 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: Brandon Williams, git, gitster, peff, sbeller, William Yan

Jonathan Tan wrote:
>> On 10/23, Jonathan Nieder wrote:

>>> If this looks good, I can reroll in a moment.
>
> Yes, this looks good.

Thanks.  Here goes.

The interdiff is upthread.  Thanks, all, for the quick review.

Jonathan Nieder (5):
  connect: split git:// setup into a separate function
  connect: split ssh command line options into separate function
  ssh: 'auto' variant to select between 'ssh' and 'simple'
  ssh: 'simple' variant does not support -4/-6
  ssh: 'simple' variant does not support --port

 Documentation/config.txt |  24 ++--
 connect.c                | 285 +++++++++++++++++++++++++++++------------------
 t/t5601-clone.sh         |  40 +++++--
 t/t5603-clone-dirname.sh |   2 +
 4 files changed, 229 insertions(+), 122 deletions(-)

* [PATCH 1/5] connect: split git:// setup into a separate function
  2017-10-23 23:16                       ` [PATCH v2 0/5] Coping with unrecognized ssh wrapper scripts in GIT_SSH Jonathan Nieder
@ 2017-10-23 23:17                         ` Jonathan Nieder
  2017-10-24  1:44                           ` Junio C Hamano
  2017-10-23 23:17                         ` [PATCH 2/5] connect: split ssh command line options into " Jonathan Nieder
                                           ` (4 subsequent siblings)
  5 siblings, 1 reply; 161+ messages in thread
From: Jonathan Nieder @ 2017-10-23 23:17 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: Brandon Williams, git, gitster, peff, sbeller, William Yan

The git_connect function is growing long.  Split the
PROTO_GIT-specific portion to a separate function to make it easier to
read.

No functional change intended.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Reviewed-by: Stefan Beller <sbeller@google.com>
---
As before, except with sbeller's Reviewed-by.

 connect.c | 103 +++++++++++++++++++++++++++++++++++---------------------------
 1 file changed, 59 insertions(+), 44 deletions(-)

diff --git a/connect.c b/connect.c
index 7fbd396b35..068e70caad 100644
--- a/connect.c
+++ b/connect.c
@@ -850,6 +850,64 @@ static enum ssh_variant determine_ssh_variant(const char *ssh_command,
 	return ssh_variant;
 }
 
+/*
+ * Open a connection using Git's native protocol.
+ *
+ * The caller is responsible for freeing hostandport, but this function may
+ * modify it (for example, to truncate it to remove the port part).
+ */
+static struct child_process *git_connect_git(int fd[2], char *hostandport,
+					     const char *path, const char *prog,
+					     int flags)
+{
+	struct child_process *conn = &no_fork;
+	struct strbuf request = STRBUF_INIT;
+	/*
+	 * Set up virtual host information based on where we will
+	 * connect, unless the user has overridden us in
+	 * the environment.
+	 */
+	char *target_host = getenv("GIT_OVERRIDE_VIRTUAL_HOST");
+	if (target_host)
+		target_host = xstrdup(target_host);
+	else
+		target_host = xstrdup(hostandport);
+
+	transport_check_allowed("git");
+
+	/* These underlying connection commands die() if they
+	 * cannot connect.
+	 */
+	if (git_use_proxy(hostandport))
+		conn = git_proxy_connect(fd, hostandport);
+	else
+		git_tcp_connect(fd, hostandport, flags);
+	/*
+	 * Separate original protocol components prog and path
+	 * from extended host header with a NUL byte.
+	 *
+	 * Note: Do not add any other headers here!  Doing so
+	 * will cause older git-daemon servers to crash.
+	 */
+	strbuf_addf(&request,
+		    "%s %s%chost=%s%c",
+		    prog, path, 0,
+		    target_host, 0);
+
+	/* If using a new version put that stuff here after a second null byte */
+	if (get_protocol_version_config() > 0) {
+		strbuf_addch(&request, '\0');
+		strbuf_addf(&request, "version=%d%c",
+			    get_protocol_version_config(), '\0');
+	}
+
+	packet_write(fd[1], request.buf, request.len);
+
+	free(target_host);
+	strbuf_release(&request);
+	return conn;
+}
+
 /*
  * This returns a dummy child_process if the transport protocol does not
  * need fork(2), or a struct child_process object if it does.  Once done,
@@ -881,50 +939,7 @@ struct child_process *git_connect(int fd[2], const char *url,
 		printf("Diag: path=%s\n", path ? path : "NULL");
 		conn = NULL;
 	} else if (protocol == PROTO_GIT) {
-		struct strbuf request = STRBUF_INIT;
-		/*
-		 * Set up virtual host information based on where we will
-		 * connect, unless the user has overridden us in
-		 * the environment.
-		 */
-		char *target_host = getenv("GIT_OVERRIDE_VIRTUAL_HOST");
-		if (target_host)
-			target_host = xstrdup(target_host);
-		else
-			target_host = xstrdup(hostandport);
-
-		transport_check_allowed("git");
-
-		/* These underlying connection commands die() if they
-		 * cannot connect.
-		 */
-		if (git_use_proxy(hostandport))
-			conn = git_proxy_connect(fd, hostandport);
-		else
-			git_tcp_connect(fd, hostandport, flags);
-		/*
-		 * Separate original protocol components prog and path
-		 * from extended host header with a NUL byte.
-		 *
-		 * Note: Do not add any other headers here!  Doing so
-		 * will cause older git-daemon servers to crash.
-		 */
-		strbuf_addf(&request,
-			    "%s %s%chost=%s%c",
-			    prog, path, 0,
-			    target_host, 0);
-
-		/* If using a new version put that stuff here after a second null byte */
-		if (get_protocol_version_config() > 0) {
-			strbuf_addch(&request, '\0');
-			strbuf_addf(&request, "version=%d%c",
-				    get_protocol_version_config(), '\0');
-		}
-
-		packet_write(fd[1], request.buf, request.len);
-
-		free(target_host);
-		strbuf_release(&request);
+		conn = git_connect_git(fd, hostandport, path, prog, flags);
 	} else {
 		struct strbuf cmd = STRBUF_INIT;
 		const char *const *var;
-- 
2.15.0.rc1.287.g2b38de12cc


* [PATCH 2/5] connect: split ssh command line options into separate function
  2017-10-23 23:16                       ` [PATCH v2 0/5] Coping with unrecognized ssh wrapper scripts in GIT_SSH Jonathan Nieder
  2017-10-23 23:17                         ` [PATCH 1/5] connect: split git:// setup into a separate function Jonathan Nieder
@ 2017-10-23 23:17                         ` Jonathan Nieder
  2017-10-24  2:01                           ` Junio C Hamano
  2017-10-23 23:18                         ` [PATCH 3/5] ssh: 'auto' variant to select between 'ssh' and 'simple' Jonathan Nieder
                                           ` (3 subsequent siblings)
  5 siblings, 1 reply; 161+ messages in thread
From: Jonathan Nieder @ 2017-10-23 23:17 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: Brandon Williams, git, gitster, peff, sbeller, William Yan

The git_connect function is growing long.  Split the portion that
discovers an ssh command and options it accepts before the service
name and path to a separate function to make it easier to read.

No functional change intended.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Reviewed-by: Stefan Beller <sbeller@google.com>
---
As before, except for the Reviewed-by.

 connect.c | 116 +++++++++++++++++++++++++++++++++-----------------------------
 1 file changed, 61 insertions(+), 55 deletions(-)

diff --git a/connect.c b/connect.c
index 068e70caad..77ab6db3bb 100644
--- a/connect.c
+++ b/connect.c
@@ -908,6 +908,65 @@ static struct child_process *git_connect_git(int fd[2], char *hostandport,
 	return conn;
 }
 
+/* Prepare a child_process for use by Git's SSH-tunneled transport. */
+static void fill_ssh_args(struct child_process *conn, const char *ssh_host,
+			  const char *port, int flags)
+{
+	const char *ssh;
+	enum ssh_variant variant;
+
+	if (looks_like_command_line_option(ssh_host))
+		die("strange hostname '%s' blocked", ssh_host);
+
+	ssh = get_ssh_command();
+	if (ssh) {
+		variant = determine_ssh_variant(ssh, 1);
+	} else {
+		/*
+		 * GIT_SSH is the no-shell version of
+		 * GIT_SSH_COMMAND (and must remain so for
+		 * historical compatibility).
+		 */
+		conn->use_shell = 0;
+
+		ssh = getenv("GIT_SSH");
+		if (!ssh)
+			ssh = "ssh";
+		variant = determine_ssh_variant(ssh, 0);
+	}
+
+	argv_array_push(&conn->args, ssh);
+
+	if (variant == VARIANT_SSH &&
+	    get_protocol_version_config() > 0) {
+		argv_array_push(&conn->args, "-o");
+		argv_array_push(&conn->args, "SendEnv=" GIT_PROTOCOL_ENVIRONMENT);
+		argv_array_pushf(&conn->env_array, GIT_PROTOCOL_ENVIRONMENT "=version=%d",
+				 get_protocol_version_config());
+	}
+
+	if (variant != VARIANT_SIMPLE) {
+		if (flags & CONNECT_IPV4)
+			argv_array_push(&conn->args, "-4");
+		else if (flags & CONNECT_IPV6)
+			argv_array_push(&conn->args, "-6");
+	}
+
+	if (variant == VARIANT_TORTOISEPLINK)
+		argv_array_push(&conn->args, "-batch");
+
+	if (port && variant != VARIANT_SIMPLE) {
+		if (variant == VARIANT_SSH)
+			argv_array_push(&conn->args, "-p");
+		else
+			argv_array_push(&conn->args, "-P");
+
+		argv_array_push(&conn->args, port);
+	}
+
+	argv_array_push(&conn->args, ssh_host);
+}
+
 /*
  * This returns a dummy child_process if the transport protocol does not
  * need fork(2), or a struct child_process object if it does.  Once done,
@@ -961,16 +1020,13 @@ struct child_process *git_connect(int fd[2], const char *url,
 		conn->use_shell = 1;
 		conn->in = conn->out = -1;
 		if (protocol == PROTO_SSH) {
-			const char *ssh;
-			enum ssh_variant variant;
 			char *ssh_host = hostandport;
 			const char *port = NULL;
+
 			transport_check_allowed("ssh");
 			get_host_and_port(&ssh_host, &port);
-
 			if (!port)
 				port = get_port(ssh_host);
-
 			if (flags & CONNECT_DIAG_URL) {
 				printf("Diag: url=%s\n", url ? url : "NULL");
 				printf("Diag: protocol=%s\n", prot_name(protocol));
@@ -984,57 +1040,7 @@ struct child_process *git_connect(int fd[2], const char *url,
 				strbuf_release(&cmd);
 				return NULL;
 			}
-
-			if (looks_like_command_line_option(ssh_host))
-				die("strange hostname '%s' blocked", ssh_host);
-
-			ssh = get_ssh_command();
-			if (ssh) {
-				variant = determine_ssh_variant(ssh, 1);
-			} else {
-				/*
-				 * GIT_SSH is the no-shell version of
-				 * GIT_SSH_COMMAND (and must remain so for
-				 * historical compatibility).
-				 */
-				conn->use_shell = 0;
-
-				ssh = getenv("GIT_SSH");
-				if (!ssh)
-					ssh = "ssh";
-				variant = determine_ssh_variant(ssh, 0);
-			}
-
-			argv_array_push(&conn->args, ssh);
-
-			if (variant == VARIANT_SSH &&
-			    get_protocol_version_config() > 0) {
-				argv_array_push(&conn->args, "-o");
-				argv_array_push(&conn->args, "SendEnv=" GIT_PROTOCOL_ENVIRONMENT);
-				argv_array_pushf(&conn->env_array, GIT_PROTOCOL_ENVIRONMENT "=version=%d",
-						 get_protocol_version_config());
-			}
-
-			if (variant != VARIANT_SIMPLE) {
-				if (flags & CONNECT_IPV4)
-					argv_array_push(&conn->args, "-4");
-				else if (flags & CONNECT_IPV6)
-					argv_array_push(&conn->args, "-6");
-			}
-
-			if (variant == VARIANT_TORTOISEPLINK)
-				argv_array_push(&conn->args, "-batch");
-
-			if (port && variant != VARIANT_SIMPLE) {
-				if (variant == VARIANT_SSH)
-					argv_array_push(&conn->args, "-p");
-				else
-					argv_array_push(&conn->args, "-P");
-
-				argv_array_push(&conn->args, port);
-			}
-
-			argv_array_push(&conn->args, ssh_host);
+			fill_ssh_args(conn, ssh_host, port, flags);
 		} else {
 			transport_check_allowed("file");
 			if (get_protocol_version_config() > 0) {
-- 
2.15.0.rc1.287.g2b38de12cc-goog


* [PATCH 3/5] ssh: 'auto' variant to select between 'ssh' and 'simple'
  2017-10-23 23:16                       ` [PATCH v2 0/5] Coping with unrecognized ssh wrapper scripts in GIT_SSH Jonathan Nieder
  2017-10-23 23:17                         ` [PATCH 1/5] connect: split git:// setup into a separate function Jonathan Nieder
  2017-10-23 23:17                         ` [PATCH 2/5] connect: split ssh command line options into " Jonathan Nieder
@ 2017-10-23 23:18                         ` Jonathan Nieder
  2017-10-23 23:27                           ` Brandon Williams
  2017-10-23 23:19                         ` [PATCH 4/5] ssh: 'simple' variant does not support -4/-6 Jonathan Nieder
                                           ` (2 subsequent siblings)
  5 siblings, 1 reply; 161+ messages in thread
From: Jonathan Nieder @ 2017-10-23 23:18 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: Brandon Williams, git, gitster, peff, sbeller, William Yan

Android's "repo" tool is a tool for managing a large codebase
consisting of multiple smaller repositories, similar to Git's
submodule feature.  Starting with Git 94b8ae5a (ssh: introduce a
'simple' ssh variant, 2017-10-16), users noticed that it stopped
handling the port in ssh:// URLs.

The cause: when it encounters ssh:// URLs, repo pre-connects to the
server and sets GIT_SSH to a helper ".repo/repo/git_ssh" that reuses
that connection.  Before 94b8ae5a, the helper was assumed to support
OpenSSH options for lack of a better guess and got passed a -p option
to set the port.  After that patch, it uses the new default of a
simple helper that does not accept an option to set the port.

The next release of "repo" will set GIT_SSH_VARIANT to "ssh" to avoid
that.  But users of old versions and of other similar GIT_SSH
implementations would not get the benefit of that fix.

So update the default to use OpenSSH options again, with a twist.  As
observed in 94b8ae5a, we cannot assume that $GIT_SSH always handles
OpenSSH options: common helpers such as travis-ci's dpl[*] are
configured using GIT_SSH and do not accept OpenSSH options.  So make
the default a new variant "auto", with the following behavior:

 1. First, check for a recognized basename, like today.

 2. If the basename is not recognized, check whether $GIT_SSH supports
    OpenSSH options by running

	$GIT_SSH -G <options> <host>

    With OpenSSH, this prints the resulting configuration and returns
    status 0 if it recognizes all <options>, and returns status 255 if
    it encounters an unrecognized option.  A wrapper script like

	exec ssh -- "$@"

    would fail with

	ssh: Could not resolve hostname -g: Name or service not known

    , correctly reflecting that it does not support OpenSSH options.

 3. Based on the result from step (2), behave like "ssh" (if it
    succeeded) or "simple" (if it failed).

This way, the default ssh variant for unrecognized commands can handle
both the repo and dpl cases as intended.
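
As a rough illustration of step (2), the probe can be pictured as the
standalone POSIX C sketch below.  It is not the code in this patch,
which goes through run_command() and also passes the port and -4/-6
options along with -G.  accepts_openssh_options() is a hypothetical
name, and the sketch does not run the command through a shell.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: run "<ssh> -G <host>" with stdin, stdout and
 * stderr pointed at /dev/null.  Returns 1 if the command exits with
 * status 0 (treat it like the "ssh" variant) and 0 otherwise (treat
 * it like the "simple" variant).
 */
static int accepts_openssh_options(const char *ssh, const char *host)
{
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return 0;

	if (pid == 0) {
		/* Child: silence all output, then run the probe. */
		int devnull = open("/dev/null", O_RDWR);
		if (devnull >= 0) {
			dup2(devnull, 0);
			dup2(devnull, 1);
			dup2(devnull, 2);
			if (devnull > 2)
				close(devnull);
		}
		execlp(ssh, ssh, "-G", host, (char *)NULL);
		_exit(127); /* exec failed: report as "not OpenSSH-like" */
	}

	if (waitpid(pid, &status, 0) < 0)
		return 0;
	return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

A caller would only reach this probe after the basename check in step
(1) failed to recognize the command, and would pick between "ssh" and
"simple" based on the return value as in step (3).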

[*] https://github.com/travis-ci/dpl/blob/6c3fddfda1f2a85944c544446b068bac0a77c049/lib/dpl/provider.rb#L215

Reported-by: William Yan <wyan@google.com>
Improved-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
This is the main one.  Simplified by making "auto" behave the same as
unset.

 Documentation/config.txt | 24 ++++++++------
 connect.c                | 82 +++++++++++++++++++++++++++++++-----------------
 t/t5601-clone.sh         | 20 ++++++++++++
 3 files changed, 88 insertions(+), 38 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 0460af37e2..6dffa4aa3d 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2081,16 +2081,22 @@ matched against are those given directly to Git commands.  This means any URLs
 visited as a result of a redirection do not participate in matching.
 
 ssh.variant::
-	Depending on the value of the environment variables `GIT_SSH` or
-	`GIT_SSH_COMMAND`, or the config setting `core.sshCommand`, Git
-	auto-detects whether to adjust its command-line parameters for use
-	with ssh (OpenSSH), plink or tortoiseplink, as opposed to the default
-	(simple).
+	By default, Git determines the command line arguments to use
+	based on the basename of the configured SSH command (configured
+	using the environment variable `GIT_SSH` or `GIT_SSH_COMMAND` or
+	the config setting `core.sshCommand`). If the basename is
+	unrecognized, Git will attempt to detect support of OpenSSH
+	options by first invoking the configured SSH command with the
+	`-G` (print configuration) option and will subsequently use
+	OpenSSH options (if that is successful) or no options besides
+	the host and remote command (if it fails).
 +
-The config variable `ssh.variant` can be set to override this auto-detection;
-valid values are `ssh`, `simple`, `plink`, `putty` or `tortoiseplink`. Any
-other value will be treated as normal ssh. This setting can be overridden via
-the environment variable `GIT_SSH_VARIANT`.
+The config variable `ssh.variant` can be set to override this detection:
+valid values are `ssh` (to use OpenSSH options), `plink`, `putty`,
+`tortoiseplink`, `simple` (no options except the host and remote command).
+The default auto-detection can be explicitly requested using the value
+`auto`.  Any other value is treated as `ssh`.  This setting can also be
+overridden via the environment variable `GIT_SSH_VARIANT`.
 +
 The current command-line parameters used for each variant are as
 follows:
diff --git a/connect.c b/connect.c
index 77ab6db3bb..0441dcbacf 100644
--- a/connect.c
+++ b/connect.c
@@ -777,6 +777,7 @@ static const char *get_ssh_command(void)
 }
 
 enum ssh_variant {
+	VARIANT_AUTO,
 	VARIANT_SIMPLE,
 	VARIANT_SSH,
 	VARIANT_PLINK,
@@ -784,14 +785,16 @@ enum ssh_variant {
 	VARIANT_TORTOISEPLINK,
 };
 
-static int override_ssh_variant(enum ssh_variant *ssh_variant)
+static void override_ssh_variant(enum ssh_variant *ssh_variant)
 {
 	const char *variant = getenv("GIT_SSH_VARIANT");
 
 	if (!variant && git_config_get_string_const("ssh.variant", &variant))
-		return 0;
+		return;
 
-	if (!strcmp(variant, "plink"))
+	if (!strcmp(variant, "auto"))
+		*ssh_variant = VARIANT_AUTO;
+	else if (!strcmp(variant, "plink"))
 		*ssh_variant = VARIANT_PLINK;
 	else if (!strcmp(variant, "putty"))
 		*ssh_variant = VARIANT_PUTTY;
@@ -801,18 +804,18 @@ static int override_ssh_variant(enum ssh_variant *ssh_variant)
 		*ssh_variant = VARIANT_SIMPLE;
 	else
 		*ssh_variant = VARIANT_SSH;
-
-	return 1;
 }
 
 static enum ssh_variant determine_ssh_variant(const char *ssh_command,
 					      int is_cmdline)
 {
-	enum ssh_variant ssh_variant = VARIANT_SIMPLE;
+	enum ssh_variant ssh_variant = VARIANT_AUTO;
 	const char *variant;
 	char *p = NULL;
 
-	if (override_ssh_variant(&ssh_variant))
+	override_ssh_variant(&ssh_variant);
+
+	if (ssh_variant != VARIANT_AUTO)
 		return ssh_variant;
 
 	if (!is_cmdline) {
@@ -908,6 +911,38 @@ static struct child_process *git_connect_git(int fd[2], char *hostandport,
 	return conn;
 }
 
+static void push_ssh_options(struct argv_array *args, struct argv_array *env,
+			       enum ssh_variant variant, const char *port,
+			       int flags)
+{
+	if (variant == VARIANT_SSH &&
+	    get_protocol_version_config() > 0) {
+		argv_array_push(args, "-o");
+		argv_array_push(args, "SendEnv=" GIT_PROTOCOL_ENVIRONMENT);
+		argv_array_pushf(env, GIT_PROTOCOL_ENVIRONMENT "=version=%d",
+				 get_protocol_version_config());
+	}
+
+	if (variant != VARIANT_SIMPLE) {
+		if (flags & CONNECT_IPV4)
+			argv_array_push(args, "-4");
+		else if (flags & CONNECT_IPV6)
+			argv_array_push(args, "-6");
+	}
+
+	if (variant == VARIANT_TORTOISEPLINK)
+		argv_array_push(args, "-batch");
+
+	if (port && variant != VARIANT_SIMPLE) {
+		if (variant == VARIANT_SSH)
+			argv_array_push(args, "-p");
+		else
+			argv_array_push(args, "-P");
+
+		argv_array_push(args, port);
+	}
+}
+
 /* Prepare a child_process for use by Git's SSH-tunneled transport. */
 static void fill_ssh_args(struct child_process *conn, const char *ssh_host,
 			  const char *port, int flags)
@@ -937,33 +972,22 @@ static void fill_ssh_args(struct child_process *conn, const char *ssh_host,
 
 	argv_array_push(&conn->args, ssh);
 
-	if (variant == VARIANT_SSH &&
-	    get_protocol_version_config() > 0) {
-		argv_array_push(&conn->args, "-o");
-		argv_array_push(&conn->args, "SendEnv=" GIT_PROTOCOL_ENVIRONMENT);
-		argv_array_pushf(&conn->env_array, GIT_PROTOCOL_ENVIRONMENT "=version=%d",
-				 get_protocol_version_config());
-	}
-
-	if (variant != VARIANT_SIMPLE) {
-		if (flags & CONNECT_IPV4)
-			argv_array_push(&conn->args, "-4");
-		else if (flags & CONNECT_IPV6)
-			argv_array_push(&conn->args, "-6");
-	}
+	if (variant == VARIANT_AUTO) {
+		struct child_process detect = CHILD_PROCESS_INIT;
 
-	if (variant == VARIANT_TORTOISEPLINK)
-		argv_array_push(&conn->args, "-batch");
+		detect.use_shell = conn->use_shell;
+		detect.no_stdin = detect.no_stdout = detect.no_stderr = 1;
 
-	if (port && variant != VARIANT_SIMPLE) {
-		if (variant == VARIANT_SSH)
-			argv_array_push(&conn->args, "-p");
-		else
-			argv_array_push(&conn->args, "-P");
+		argv_array_push(&detect.args, ssh);
+		argv_array_push(&detect.args, "-G");
+		push_ssh_options(&detect.args, &detect.env_array,
+				 VARIANT_SSH, port, flags);
+		argv_array_push(&detect.args, ssh_host);
 
-		argv_array_push(&conn->args, port);
+		variant = run_command(&detect) ? VARIANT_SIMPLE : VARIANT_SSH;
 	}
 
+	push_ssh_options(&conn->args, &conn->env_array, variant, port, flags);
 	argv_array_push(&conn->args, ssh_host);
 }
 
diff --git a/t/t5601-clone.sh b/t/t5601-clone.sh
index 86811a0c35..df9dfafdd8 100755
--- a/t/t5601-clone.sh
+++ b/t/t5601-clone.sh
@@ -372,6 +372,12 @@ test_expect_success 'variant can be overriden' '
 	expect_ssh myhost src
 '
 
+test_expect_success 'variant=auto picks based on basename' '
+	copy_ssh_wrapper_as "$TRASH_DIRECTORY/plink" &&
+	git -c ssh.variant=auto clone -4 "[myhost:123]:src" ssh-auto-clone &&
+	expect_ssh "-4 -P 123" myhost src
+'
+
 test_expect_success 'simple is treated as simple' '
 	copy_ssh_wrapper_as "$TRASH_DIRECTORY/simple" &&
 	git clone -4 "[myhost:123]:src" ssh-bracket-clone-simple &&
@@ -384,6 +390,20 @@ test_expect_success 'uplink is treated as simple' '
 	expect_ssh myhost src
 '
 
+test_expect_success 'OpenSSH-like uplink is treated as ssh' '
+	write_script "$TRASH_DIRECTORY/uplink" <<-EOF &&
+	if test "\$1" = "-G"
+	then
+		exit 0
+	fi &&
+	exec "\$TRASH_DIRECTORY/ssh$X" "\$@"
+	EOF
+	GIT_SSH="$TRASH_DIRECTORY/uplink" &&
+	export GIT_SSH &&
+	git clone "[myhost:123]:src" ssh-bracket-clone-sshlike-uplink &&
+	expect_ssh "-p 123" myhost src
+'
+
 test_expect_success 'plink is treated specially (as putty)' '
 	copy_ssh_wrapper_as "$TRASH_DIRECTORY/plink" &&
 	git clone "[myhost:123]:src" ssh-bracket-clone-plink-0 &&
-- 
2.15.0.rc1.287.g2b38de12cc


* [PATCH 4/5] ssh: 'simple' variant does not support -4/-6
  2017-10-23 23:16                       ` [PATCH v2 0/5] Coping with unrecognized ssh wrapper scripts in GIT_SSH Jonathan Nieder
                                           ` (2 preceding siblings ...)
  2017-10-23 23:18                         ` [PATCH 3/5] ssh: 'auto' variant to select between 'ssh' and 'simple' Jonathan Nieder
@ 2017-10-23 23:19                         ` Jonathan Nieder
  2017-10-23 23:19                         ` [PATCH 5/5] ssh: 'simple' variant does not support --port Jonathan Nieder
  2017-10-24  2:22                         ` [PATCH v2 0/5] Coping with unrecognized ssh wrapper scripts in GIT_SSH Junio C Hamano
  5 siblings, 0 replies; 161+ messages in thread
From: Jonathan Nieder @ 2017-10-23 23:19 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: Brandon Williams, git, gitster, peff, sbeller, William Yan

If the user passes -4/--ipv4 or -6/--ipv6 to "git fetch" or "git push"
and the ssh command configured with GIT_SSH does not support such a
setting, error out instead of ignoring the option and continuing.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Acked-by: Stefan Beller <sbeller@google.com>
---
As before, just rebased.

 connect.c        | 25 ++++++++++++++++++++++---
 t/t5601-clone.sh | 12 ++++++------
 2 files changed, 28 insertions(+), 9 deletions(-)

diff --git a/connect.c b/connect.c
index 0441dcbacf..9c64e8155a 100644
--- a/connect.c
+++ b/connect.c
@@ -923,11 +923,30 @@ static void push_ssh_options(struct argv_array *args, struct argv_array *env,
 				 get_protocol_version_config());
 	}
 
-	if (variant != VARIANT_SIMPLE) {
-		if (flags & CONNECT_IPV4)
+	if (flags & CONNECT_IPV4) {
+		switch (variant) {
+		case VARIANT_AUTO:
+			BUG("VARIANT_AUTO passed to push_ssh_options");
+		case VARIANT_SIMPLE:
+			die("ssh variant 'simple' does not support -4");
+		case VARIANT_SSH:
+		case VARIANT_PLINK:
+		case VARIANT_PUTTY:
+		case VARIANT_TORTOISEPLINK:
 			argv_array_push(args, "-4");
-		else if (flags & CONNECT_IPV6)
+		}
+	} else if (flags & CONNECT_IPV6) {
+		switch (variant) {
+		case VARIANT_AUTO:
+			BUG("VARIANT_AUTO passed to push_ssh_options");
+		case VARIANT_SIMPLE:
+			die("ssh variant 'simple' does not support -6");
+		case VARIANT_SSH:
+		case VARIANT_PLINK:
+		case VARIANT_PUTTY:
+		case VARIANT_TORTOISEPLINK:
 			argv_array_push(args, "-6");
+		}
 	}
 
 	if (variant == VARIANT_TORTOISEPLINK)
diff --git a/t/t5601-clone.sh b/t/t5601-clone.sh
index df9dfafdd8..ea401cec1f 100755
--- a/t/t5601-clone.sh
+++ b/t/t5601-clone.sh
@@ -367,9 +367,10 @@ test_expect_success 'OpenSSH variant passes -4' '
 	expect_ssh "-4 -p 123" myhost src
 '
 
-test_expect_success 'variant can be overriden' '
-	git -c ssh.variant=simple clone -4 "[myhost:123]:src" ssh-simple-clone &&
-	expect_ssh myhost src
+test_expect_success 'variant can be overridden' '
+	copy_ssh_wrapper_as "$TRASH_DIRECTORY/putty" &&
+	git -c ssh.variant=putty clone -4 "[myhost:123]:src" ssh-putty-clone &&
+	expect_ssh "-4 -P 123" myhost src
 '
 
 test_expect_success 'variant=auto picks based on basename' '
@@ -378,10 +379,9 @@ test_expect_success 'variant=auto picks based on basename' '
 	expect_ssh "-4 -P 123" myhost src
 '
 
-test_expect_success 'simple is treated as simple' '
+test_expect_success 'simple does not support -4/-6' '
 	copy_ssh_wrapper_as "$TRASH_DIRECTORY/simple" &&
-	git clone -4 "[myhost:123]:src" ssh-bracket-clone-simple &&
-	expect_ssh myhost src
+	test_must_fail git clone -4 "[myhost:123]:src" ssh-bracket-clone-simple
 '
 
 test_expect_success 'uplink is treated as simple' '
-- 
2.15.0.rc1.287.g2b38de12cc


* [PATCH 5/5] ssh: 'simple' variant does not support --port
  2017-10-23 23:16                       ` [PATCH v2 0/5] Coping with unrecognized ssh wrapper scripts in GIT_SSH Jonathan Nieder
                                           ` (3 preceding siblings ...)
  2017-10-23 23:19                         ` [PATCH 4/5] ssh: 'simple' variant does not support -4/-6 Jonathan Nieder
@ 2017-10-23 23:19                         ` Jonathan Nieder
  2017-10-24  2:22                         ` [PATCH v2 0/5] Coping with unrecognized ssh wrapper scripts in GIT_SSH Junio C Hamano
  5 siblings, 0 replies; 161+ messages in thread
From: Jonathan Nieder @ 2017-10-23 23:19 UTC (permalink / raw)
  To: Jonathan Tan; +Cc: Brandon Williams, git, gitster, peff, sbeller, William Yan

When connecting to an ssh:// URL with an explicitly specified port, if
the ssh command configured with GIT_SSH does not support such a
setting, it is less confusing to error out than to silently drop the
port setting and continue.

This requires updating the GIT_SSH setting in t5603-clone-dirname.sh.
That test is about the directory name produced when cloning various
URLs.  It uses an ssh wrapper that ignores all its arguments but does
not declare that it supports a port argument; update it to set
GIT_SSH_VARIANT=ssh to do so.  (Real-life ssh wrappers that pass a
port argument to OpenSSH would also support -G and would not require
such an update.)
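
For readers skimming the series, the per-variant port handling can be
summarized by the small sketch below.  It is an illustration only:
port_option() is a hypothetical helper, and where it returns NULL the
actual patch calls die() (for 'simple') or BUG() (for an unresolved
'auto').

```c
#include <stddef.h>

enum ssh_variant {
	VARIANT_AUTO,
	VARIANT_SIMPLE,
	VARIANT_SSH,
	VARIANT_PLINK,
	VARIANT_PUTTY,
	VARIANT_TORTOISEPLINK,
};

/*
 * Hypothetical helper: return the command line option that introduces
 * the port for this variant, or NULL if the variant has no way to
 * accept a port.  A NULL result should be reported as an error rather
 * than silently dropping the port.
 */
static const char *port_option(enum ssh_variant variant)
{
	switch (variant) {
	case VARIANT_SSH:
		return "-p";
	case VARIANT_PLINK:
	case VARIANT_PUTTY:
	case VARIANT_TORTOISEPLINK:
		return "-P";
	case VARIANT_SIMPLE:
	case VARIANT_AUTO: /* must be resolved before options are built */
		break;
	}
	return NULL;
}
```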

Reported-by: William Yan <wyan@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Acked-by: Stefan Beller <sbeller@google.com>
---
That's the end of the series.  Thanks for reading.

 connect.c                | 15 ++++++++++++---
 t/t5601-clone.sh         | 10 ++++++++--
 t/t5603-clone-dirname.sh |  2 ++
 3 files changed, 22 insertions(+), 5 deletions(-)

diff --git a/connect.c b/connect.c
index 9c64e8155a..06bcd3981e 100644
--- a/connect.c
+++ b/connect.c
@@ -952,11 +952,20 @@ static void push_ssh_options(struct argv_array *args, struct argv_array *env,
 	if (variant == VARIANT_TORTOISEPLINK)
 		argv_array_push(args, "-batch");
 
-	if (port && variant != VARIANT_SIMPLE) {
-		if (variant == VARIANT_SSH)
+	if (port) {
+		switch (variant) {
+		case VARIANT_AUTO:
+			BUG("VARIANT_AUTO passed to push_ssh_options");
+		case VARIANT_SIMPLE:
+			die("ssh variant 'simple' does not support setting port");
+		case VARIANT_SSH:
 			argv_array_push(args, "-p");
-		else
+			break;
+		case VARIANT_PLINK:
+		case VARIANT_PUTTY:
+		case VARIANT_TORTOISEPLINK:
 			argv_array_push(args, "-P");
+		}
 
 		argv_array_push(args, port);
 	}
diff --git a/t/t5601-clone.sh b/t/t5601-clone.sh
index ea401cec1f..f9a2ae84c7 100755
--- a/t/t5601-clone.sh
+++ b/t/t5601-clone.sh
@@ -381,12 +381,18 @@ test_expect_success 'variant=auto picks based on basename' '
 
 test_expect_success 'simple does not support -4/-6' '
 	copy_ssh_wrapper_as "$TRASH_DIRECTORY/simple" &&
-	test_must_fail git clone -4 "[myhost:123]:src" ssh-bracket-clone-simple
+	test_must_fail git clone -4 "myhost:src" ssh-4-clone-simple
+'
+
+test_expect_success 'simple does not support port' '
+	copy_ssh_wrapper_as "$TRASH_DIRECTORY/simple" &&
+	test_must_fail git clone "[myhost:123]:src" ssh-bracket-clone-simple
 '
 
 test_expect_success 'uplink is treated as simple' '
 	copy_ssh_wrapper_as "$TRASH_DIRECTORY/uplink" &&
-	git clone "[myhost:123]:src" ssh-bracket-clone-uplink &&
+	test_must_fail git clone "[myhost:123]:src" ssh-bracket-clone-uplink &&
+	git clone "myhost:src" ssh-clone-uplink &&
 	expect_ssh myhost src
 '
 
diff --git a/t/t5603-clone-dirname.sh b/t/t5603-clone-dirname.sh
index d5af758129..13b5e5eb9b 100755
--- a/t/t5603-clone-dirname.sh
+++ b/t/t5603-clone-dirname.sh
@@ -11,7 +11,9 @@ test_expect_success 'setup ssh wrapper' '
 	git upload-pack "$TRASH_DIRECTORY"
 	EOF
 	GIT_SSH="$TRASH_DIRECTORY/ssh-wrapper" &&
+	GIT_SSH_VARIANT=ssh &&
 	export GIT_SSH &&
+	export GIT_SSH_VARIANT &&
 	export TRASH_DIRECTORY
 '
 
-- 
2.15.0.rc1.287.g2b38de12cc


* Re: [PATCH 3/5] ssh: 'auto' variant to select between 'ssh' and 'simple'
  2017-10-23 23:18                         ` [PATCH 3/5] ssh: 'auto' variant to select between 'ssh' and 'simple' Jonathan Nieder
@ 2017-10-23 23:27                           ` Brandon Williams
  2017-10-23 23:33                             ` Stefan Beller
  0 siblings, 1 reply; 161+ messages in thread
From: Brandon Williams @ 2017-10-23 23:27 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: Jonathan Tan, git, gitster, peff, sbeller, William Yan

On 10/23, Jonathan Nieder wrote:
> Android's "repo" tool is a tool for managing a large codebase
> consisting of multiple smaller repositories, similar to Git's
> submodule feature.  Starting with Git 94b8ae5a (ssh: introduce a
> 'simple' ssh variant, 2017-10-16), users noticed that it stopped
> handling the port in ssh:// URLs.
> 
> The cause: when it encounters ssh:// URLs, repo pre-connects to the
> server and sets GIT_SSH to a helper ".repo/repo/git_ssh" that reuses
> that connection.  Before 94b8ae5a, the helper was assumed to support
> OpenSSH options for lack of a better guess and got passed a -p option
> to set the port.  After that patch, it uses the new default of a
> simple helper that does not accept an option to set the port.
> 
> The next release of "repo" will set GIT_SSH_VARIANT to "ssh" to avoid
> that.  But users of old versions and of other similar GIT_SSH
> implementations would not get the benefit of that fix.
> 
> So update the default to use OpenSSH options again, with a twist.  As
> observed in 94b8ae5a, we cannot assume that $GIT_SSH always handles
> OpenSSH options: common helpers such as travis-ci's dpl[*] are
> configured using GIT_SSH and do not accept OpenSSH options.  So make
> the default a new variant "auto", with the following behavior:
> 
>  1. First, check for a recognized basename, like today.
> 
>  2. If the basename is not recognized, check whether $GIT_SSH supports
>     OpenSSH options by running
> 
> 	$GIT_SSH -G <options> <host>
> 
>     This returns status 0 and prints configuration in OpenSSH if it
>     recognizes all <options> and returns status 255 if it encounters
>     an unrecognized option.  A wrapper script like
> 
> 	exec ssh -- "$@"
> 
>     would fail with
> 
> 	ssh: Could not resolve hostname -g: Name or service not known
> 
>     , correctly reflecting that it does not support OpenSSH options.
> 
>  3. Based on the result from step (2), behave like "ssh" (if it
>     succeeded) or "simple" (if it failed).
> 
> This way, the default ssh variant for unrecognized commands can handle
> both the repo and dpl cases as intended.
> 
> [*] https://github.com/travis-ci/dpl/blob/6c3fddfda1f2a85944c544446b068bac0a77c049/lib/dpl/provider.rb#L215
> 
> Reported-by: William Yan <wyan@google.com>
> Improved-by: Jonathan Tan <jonathantanmy@google.com>
> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
> ---
> This is the main one.  Simplified by making "auto" behave the same as
> unset.

I still don't see the benefit of allowing a user to explicitly set
'auto' then, if setting it to 'auto' is effectively a noop.  But maybe
there's something I'm not seeing.

> 
>  Documentation/config.txt | 24 ++++++++------
>  connect.c                | 82 +++++++++++++++++++++++++++++++-----------------
>  t/t5601-clone.sh         | 20 ++++++++++++
>  3 files changed, 88 insertions(+), 38 deletions(-)
> 
> diff --git a/Documentation/config.txt b/Documentation/config.txt
> index 0460af37e2..6dffa4aa3d 100644
> --- a/Documentation/config.txt
> +++ b/Documentation/config.txt
> @@ -2081,16 +2081,22 @@ matched against are those given directly to Git commands.  This means any URLs
>  visited as a result of a redirection do not participate in matching.
>  
>  ssh.variant::
> -	Depending on the value of the environment variables `GIT_SSH` or
> -	`GIT_SSH_COMMAND`, or the config setting `core.sshCommand`, Git
> -	auto-detects whether to adjust its command-line parameters for use
> -	with ssh (OpenSSH), plink or tortoiseplink, as opposed to the default
> -	(simple).
> +	By default, Git determines the command line arguments to use
> +	based on the basename of the configured SSH command (configured
> +	using the environment variable `GIT_SSH` or `GIT_SSH_COMMAND` or
> +	the config setting `core.sshCommand`). If the basename is
> +	unrecognized, Git will attempt to detect support of OpenSSH
> +	options by first invoking the configured SSH command with the
> +	`-G` (print configuration) option and will subsequently use
> +	OpenSSH options (if that is successful) or no options besides
> +	the host and remote command (if it fails).
>  +
> -The config variable `ssh.variant` can be set to override this auto-detection;
> -valid values are `ssh`, `simple`, `plink`, `putty` or `tortoiseplink`. Any
> -other value will be treated as normal ssh. This setting can be overridden via
> -the environment variable `GIT_SSH_VARIANT`.
> +The config variable `ssh.variant` can be set to override this detection:
> +valid values are `ssh` (to use OpenSSH options), `plink`, `putty`,
> +`tortoiseplink`, `simple` (no options except the host and remote command).
> +The default auto-detection can be explicitly requested using the value
> +`auto`.  Any other value is treated as `ssh`.  This setting can also be
> +overridden via the environment variable `GIT_SSH_VARIANT`.
>  +
>  The current command-line parameters used for each variant are as
>  follows:
> diff --git a/connect.c b/connect.c
> index 77ab6db3bb..0441dcbacf 100644
> --- a/connect.c
> +++ b/connect.c
> @@ -777,6 +777,7 @@ static const char *get_ssh_command(void)
>  }
>  
>  enum ssh_variant {
> +	VARIANT_AUTO,
>  	VARIANT_SIMPLE,
>  	VARIANT_SSH,
>  	VARIANT_PLINK,
> @@ -784,14 +785,16 @@ enum ssh_variant {
>  	VARIANT_TORTOISEPLINK,
>  };
>  
> -static int override_ssh_variant(enum ssh_variant *ssh_variant)
> +static void override_ssh_variant(enum ssh_variant *ssh_variant)
>  {
>  	const char *variant = getenv("GIT_SSH_VARIANT");
>  
>  	if (!variant && git_config_get_string_const("ssh.variant", &variant))
> -		return 0;
> +		return;
>  
> -	if (!strcmp(variant, "plink"))
> +	if (!strcmp(variant, "auto"))
> +		*ssh_variant = VARIANT_AUTO;
> +	else if (!strcmp(variant, "plink"))
>  		*ssh_variant = VARIANT_PLINK;
>  	else if (!strcmp(variant, "putty"))
>  		*ssh_variant = VARIANT_PUTTY;
> @@ -801,18 +804,18 @@ static int override_ssh_variant(enum ssh_variant *ssh_variant)
>  		*ssh_variant = VARIANT_SIMPLE;
>  	else
>  		*ssh_variant = VARIANT_SSH;
> -
> -	return 1;
>  }
>  
>  static enum ssh_variant determine_ssh_variant(const char *ssh_command,
>  					      int is_cmdline)
>  {
> -	enum ssh_variant ssh_variant = VARIANT_SIMPLE;
> +	enum ssh_variant ssh_variant = VARIANT_AUTO;
>  	const char *variant;
>  	char *p = NULL;
>  
> -	if (override_ssh_variant(&ssh_variant))
> +	override_ssh_variant(&ssh_variant);
> +
> +	if (ssh_variant != VARIANT_AUTO)
>  		return ssh_variant;
>  
>  	if (!is_cmdline) {
> @@ -908,6 +911,38 @@ static struct child_process *git_connect_git(int fd[2], char *hostandport,
>  	return conn;
>  }
>  
> +static void push_ssh_options(struct argv_array *args, struct argv_array *env,
> +			       enum ssh_variant variant, const char *port,
> +			       int flags)
> +{
> +	if (variant == VARIANT_SSH &&
> +	    get_protocol_version_config() > 0) {
> +		argv_array_push(args, "-o");
> +		argv_array_push(args, "SendEnv=" GIT_PROTOCOL_ENVIRONMENT);
> +		argv_array_pushf(env, GIT_PROTOCOL_ENVIRONMENT "=version=%d",
> +				 get_protocol_version_config());
> +	}
> +
> +	if (variant != VARIANT_SIMPLE) {
> +		if (flags & CONNECT_IPV4)
> +			argv_array_push(args, "-4");
> +		else if (flags & CONNECT_IPV6)
> +			argv_array_push(args, "-6");
> +	}
> +
> +	if (variant == VARIANT_TORTOISEPLINK)
> +		argv_array_push(args, "-batch");
> +
> +	if (port && variant != VARIANT_SIMPLE) {
> +		if (variant == VARIANT_SSH)
> +			argv_array_push(args, "-p");
> +		else
> +			argv_array_push(args, "-P");
> +
> +		argv_array_push(args, port);
> +	}
> +}
> +
>  /* Prepare a child_process for use by Git's SSH-tunneled transport. */
>  static void fill_ssh_args(struct child_process *conn, const char *ssh_host,
>  			  const char *port, int flags)
> @@ -937,33 +972,22 @@ static void fill_ssh_args(struct child_process *conn, const char *ssh_host,
>  
>  	argv_array_push(&conn->args, ssh);
>  
> -	if (variant == VARIANT_SSH &&
> -	    get_protocol_version_config() > 0) {
> -		argv_array_push(&conn->args, "-o");
> -		argv_array_push(&conn->args, "SendEnv=" GIT_PROTOCOL_ENVIRONMENT);
> -		argv_array_pushf(&conn->env_array, GIT_PROTOCOL_ENVIRONMENT "=version=%d",
> -				 get_protocol_version_config());
> -	}
> -
> -	if (variant != VARIANT_SIMPLE) {
> -		if (flags & CONNECT_IPV4)
> -			argv_array_push(&conn->args, "-4");
> -		else if (flags & CONNECT_IPV6)
> -			argv_array_push(&conn->args, "-6");
> -	}
> +	if (variant == VARIANT_AUTO) {
> +		struct child_process detect = CHILD_PROCESS_INIT;
>  
> -	if (variant == VARIANT_TORTOISEPLINK)
> -		argv_array_push(&conn->args, "-batch");
> +		detect.use_shell = conn->use_shell;
> +		detect.no_stdin = detect.no_stdout = detect.no_stderr = 1;
>  
> -	if (port && variant != VARIANT_SIMPLE) {
> -		if (variant == VARIANT_SSH)
> -			argv_array_push(&conn->args, "-p");
> -		else
> -			argv_array_push(&conn->args, "-P");
> +		argv_array_push(&detect.args, ssh);
> +		argv_array_push(&detect.args, "-G");
> +		push_ssh_options(&detect.args, &detect.env_array,
> +				 VARIANT_SSH, port, flags);
> +		argv_array_push(&detect.args, ssh_host);
>  
> -		argv_array_push(&conn->args, port);
> +		variant = run_command(&detect) ? VARIANT_SIMPLE : VARIANT_SSH;
>  	}
>  
> +	push_ssh_options(&conn->args, &conn->env_array, variant, port, flags);
>  	argv_array_push(&conn->args, ssh_host);
>  }
>  
> diff --git a/t/t5601-clone.sh b/t/t5601-clone.sh
> index 86811a0c35..df9dfafdd8 100755
> --- a/t/t5601-clone.sh
> +++ b/t/t5601-clone.sh
> @@ -372,6 +372,12 @@ test_expect_success 'variant can be overriden' '
>  	expect_ssh myhost src
>  '
>  
> +test_expect_success 'variant=auto picks based on basename' '
> +	copy_ssh_wrapper_as "$TRASH_DIRECTORY/plink" &&
> +	git -c ssh.variant=auto clone -4 "[myhost:123]:src" ssh-auto-clone &&
> +	expect_ssh "-4 -P 123" myhost src
> +'
> +
>  test_expect_success 'simple is treated as simple' '
>  	copy_ssh_wrapper_as "$TRASH_DIRECTORY/simple" &&
>  	git clone -4 "[myhost:123]:src" ssh-bracket-clone-simple &&
> @@ -384,6 +390,20 @@ test_expect_success 'uplink is treated as simple' '
>  	expect_ssh myhost src
>  '
>  
> +test_expect_success 'OpenSSH-like uplink is treated as ssh' '
> +	write_script "$TRASH_DIRECTORY/uplink" <<-EOF &&
> +	if test "\$1" = "-G"
> +	then
> +		exit 0
> +	fi &&
> +	exec "\$TRASH_DIRECTORY/ssh$X" "\$@"
> +	EOF
> +	GIT_SSH="$TRASH_DIRECTORY/uplink" &&
> +	export GIT_SSH &&
> +	git clone "[myhost:123]:src" ssh-bracket-clone-sshlike-uplink &&
> +	expect_ssh "-p 123" myhost src
> +'
> +
>  test_expect_success 'plink is treated specially (as putty)' '
>  	copy_ssh_wrapper_as "$TRASH_DIRECTORY/plink" &&
>  	git clone "[myhost:123]:src" ssh-bracket-clone-plink-0 &&
> -- 
> 2.15.0.rc1.287.g2b38de12cc
> 

-- 
Brandon Williams

* Re: [PATCH 3/5] ssh: 'auto' variant to select between 'ssh' and 'simple'
  2017-10-23 23:27                           ` Brandon Williams
@ 2017-10-23 23:33                             ` Stefan Beller
  0 siblings, 0 replies; 161+ messages in thread
From: Stefan Beller @ 2017-10-23 23:33 UTC (permalink / raw)
  To: Brandon Williams
  Cc: Jonathan Nieder, Jonathan Tan, git, Junio C Hamano, Jeff King,
	William Yan

>> This is the main one.  Simplified by making "auto" behave the same as
>> unset.
>
> I still don't see the benefit of allowing a user to explicitly set
> 'auto' then, if setting it to 'auto' is effectively a noop.  But maybe
> there's something I'm not seeing.
>

If /etc/gitconfig sets ssh.variant to "ssh" and your repositories need
different settings, the easiest way out is to set it to "auto" in
~/.gitconfig.
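
To illustrate the point about layered configuration, here is a minimal
sketch of how an explicit "auto" at a more specific level could undo "ssh"
set at a broader level.  The string-to-variant mapping follows the quoted
patch; effective_variant() and its four parameters are hypothetical and
only model the usual precedence (environment over repository config over
~/.gitconfig over /etc/gitconfig), they are not functions in connect.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same members as the enum in the quoted patch. */
enum ssh_variant {
	VARIANT_AUTO,
	VARIANT_SIMPLE,
	VARIANT_SSH,
	VARIANT_PLINK,
	VARIANT_PUTTY,
	VARIANT_TORTOISEPLINK
};

/* Map a configured string to a variant; unknown values mean "ssh". */
enum ssh_variant parse_variant(const char *value)
{
	assert(value != NULL);

	if (!strcmp(value, "auto"))
		return VARIANT_AUTO;
	if (!strcmp(value, "plink"))
		return VARIANT_PLINK;
	if (!strcmp(value, "putty"))
		return VARIANT_PUTTY;
	if (!strcmp(value, "tortoiseplink"))
		return VARIANT_TORTOISEPLINK;
	if (!strcmp(value, "simple"))
		return VARIANT_SIMPLE;
	return VARIANT_SSH;
}

/*
 * Hypothetical helper: NULL means "not set at this level".  The most
 * specific level that is set wins; with nothing set we auto-detect.
 */
enum ssh_variant effective_variant(const char *env_value,
				   const char *repo_config,
				   const char *user_config,
				   const char *system_config)
{
	if (env_value)
		return parse_variant(env_value);
	if (repo_config)
		return parse_variant(repo_config);
	if (user_config)
		return parse_variant(user_config);
	if (system_config)
		return parse_variant(system_config);
	return VARIANT_AUTO;
}
```

With "ssh" at the system level and "auto" at the user level, the result is
VARIANT_AUTO, so detection runs again unless a repository or the
environment says otherwise.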

* [WIP PATCH] diff: add option to ignore whitespaces for move detection only
  2017-10-23 22:16               ` Stefan Beller
@ 2017-10-24  0:09                 ` Stefan Beller
  2017-10-24 18:48                   ` Brandon Williams
  2017-10-24  1:54                 ` [PATCH 1/5] connect: split git:// setup into a separate function Junio C Hamano
  1 sibling, 1 reply; 161+ messages in thread
From: Stefan Beller @ 2017-10-24  0:09 UTC (permalink / raw)
  To: sbeller
  Cc: bmwill, bturner, git, git, gitster, jonathantanmy, jrnieder, peff, wyan

Signed-off-by: Stefan Beller <sbeller@google.com>
---

 diff.c                     |  10 ++--
 diff.h                     |   1 +
 t/t4015-diff-whitespace.sh | 114 ++++++++++++++++++++++++++++++++++++++++++++-
 
See, only 10 lines of code! (and a few more for tests)

We have run out of space in diff_options.{flags,touched_flags},
as we already use 1U<<31 as the highest bit. We could reuse 1<<9, which is
currently unused (removed in 882749a04f (diff: add --word-diff option that
generalizes --color-words, 2010-04-14)). But that postpones the
real fix for only a short amount of time.

Ideas on how to extend the flag space are welcome. (We cannot just make it
a long either, as some arcane architectures have 32-bit longs.)

Another TODO: documentation

I plan to trim the CC list for any resend that will be needed.

Thanks,
Stefan
 
 3 files changed, 121 insertions(+), 4 deletions(-)

diff --git a/diff.c b/diff.c
index c4a669ffa8..ddb2018307 100644
--- a/diff.c
+++ b/diff.c
@@ -726,7 +726,8 @@ static int next_byte(const char **cp, const char **endp,
 			return (int)' ';
 		}
 
-		if (DIFF_XDL_TST(diffopt, IGNORE_WHITESPACE)) {
+		if (DIFF_XDL_TST(diffopt, IGNORE_WHITESPACE) ||
+		    diffopt->color_moved_ignore_space) {
 			while (*cp < *endp && isspace(**cp))
 				(*cp)++;
 			/*
@@ -751,7 +752,8 @@ static int moved_entry_cmp(const struct diff_options *diffopt,
 	const char *ap = a->es->line, *ae = a->es->line + a->es->len;
 	const char *bp = b->es->line, *be = b->es->line + b->es->len;
 
-	if (!(diffopt->xdl_opts & XDF_WHITESPACE_FLAGS))
+	if (!(diffopt->xdl_opts & XDF_WHITESPACE_FLAGS) &&
+	    !diffopt->color_moved_ignore_space)
 		return a->es->len != b->es->len  || memcmp(ap, bp, a->es->len);
 
 	if (DIFF_XDL_TST(diffopt, IGNORE_WHITESPACE_AT_EOL)) {
@@ -774,7 +776,7 @@ static int moved_entry_cmp(const struct diff_options *diffopt,
 
 static unsigned get_string_hash(struct emitted_diff_symbol *es, struct diff_options *o)
 {
-	if (o->xdl_opts & XDF_WHITESPACE_FLAGS) {
+	if ((o->xdl_opts & XDF_WHITESPACE_FLAGS) || o->color_moved_ignore_space) {
 		static struct strbuf sb = STRBUF_INIT;
 		const char *ap = es->line, *ae = es->line + es->len;
 		int c;
@@ -4660,6 +4662,8 @@ int diff_opt_parse(struct diff_options *options,
 		DIFF_XDL_CLR(options, NEED_MINIMAL);
 	else if (!strcmp(arg, "-w") || !strcmp(arg, "--ignore-all-space"))
 		DIFF_XDL_SET(options, IGNORE_WHITESPACE);
+	else if (!strcmp(arg, "--ignore-all-space-in-move-detection"))
+		options->color_moved_ignore_space = 1;
 	else if (!strcmp(arg, "-b") || !strcmp(arg, "--ignore-space-change"))
 		DIFF_XDL_SET(options, IGNORE_WHITESPACE_CHANGE);
 	else if (!strcmp(arg, "--ignore-space-at-eol"))
diff --git a/diff.h b/diff.h
index aca150ba2e..6ba3f53bbd 100644
--- a/diff.h
+++ b/diff.h
@@ -196,6 +196,7 @@ struct diff_options {
 	} color_moved;
 	#define COLOR_MOVED_DEFAULT COLOR_MOVED_ZEBRA
 	#define COLOR_MOVED_MIN_ALNUM_COUNT 20
+	int color_moved_ignore_space;
 };
 
 void diff_emit_submodule_del(struct diff_options *o, const char *line);
diff --git a/t/t4015-diff-whitespace.sh b/t/t4015-diff-whitespace.sh
index 6c9a93b734..d7ee3aabf2 100755
--- a/t/t4015-diff-whitespace.sh
+++ b/t/t4015-diff-whitespace.sh
@@ -1677,7 +1677,119 @@ test_expect_success 'move detection with submodules' '
 
 	# nor did we mess with it another way
 	git diff --submodule=diff --color | test_decode_color >expect &&
-	test_cmp expect decoded_actual
+	test_cmp expect decoded_actual &&
+	rm -rf bananas &&
+	git submodule deinit bananas
+'
+
+test_expect_success 'move detection only ignores white spaces' '
+	git reset --hard &&
+	q_to_tab <<-\EOF >function.c &&
+	int func()
+	{
+	Qif (foo) {
+	QQ// this part of the function
+	QQ// function will be very long
+	QQ// indeed. We must exceed both
+	QQ// per-line and number of line
+	QQ// minimums
+	QQ;
+	Q}
+	Qbaz();
+	Qbar();
+	Q// more unrelated stuff
+	}
+	EOF
+	git add function.c &&
+	git commit -m "add function.c" &&
+	q_to_tab <<-\EOF >function.c &&
+	int do_foo()
+	{
+	Q// this part of the function
+	Q// function will be very long
+	Q// indeed. We must exceed both
+	Q// per-line and number of line
+	Q// minimums
+	Q;
+	}
+
+	int func()
+	{
+	Qif (foo)
+	QQdo_foo();
+	Qbaz();
+	Qbar();
+	Q// more unrelated stuff
+	}
+	EOF
+
+	# Make sure we get a different diff using -w ("moved function header")
+	git diff --color --color-moved -w |
+		grep -v "index" |
+		test_decode_color >actual &&
+	q_to_tab <<-\EOF >expected &&
+	<BOLD>diff --git a/function.c b/function.c<RESET>
+	<BOLD>--- a/function.c<RESET>
+	<BOLD>+++ b/function.c<RESET>
+	<CYAN>@@ -1,6 +1,5 @@<RESET>
+	<RED>-int func()<RESET>
+	<GREEN>+<RESET><GREEN>int do_foo()<RESET>
+	 {<RESET>
+	<RED>-	if (foo) {<RESET>
+	 Q// this part of the function<RESET>
+	 Q// function will be very long<RESET>
+	 Q// indeed. We must exceed both<RESET>
+	<CYAN>@@ -8,6 +7,11 @@<RESET> <RESET>int func()<RESET>
+	 Q// minimums<RESET>
+	 Q;<RESET>
+	 }<RESET>
+	<GREEN>+<RESET>
+	<GREEN>+<RESET><GREEN>int func()<RESET>
+	<GREEN>+<RESET><GREEN>{<RESET>
+	<GREEN>+<RESET>Q<GREEN>if (foo)<RESET>
+	<GREEN>+<RESET>QQ<GREEN>do_foo();<RESET>
+	 Qbaz();<RESET>
+	 Qbar();<RESET>
+	 Q// more unrelated stuff<RESET>
+	EOF
+	test_cmp expected actual &&
+
+	# And now ignoring white space only in the move detection
+	git diff --color --color-moved --ignore-all-space-in-move-detection |
+		grep -v "index" |
+		test_decode_color >actual &&
+	q_to_tab <<-\EOF >expected &&
+	<BOLD>diff --git a/function.c b/function.c<RESET>
+	<BOLD>--- a/function.c<RESET>
+	<BOLD>+++ b/function.c<RESET>
+	<CYAN>@@ -1,13 +1,17 @@<RESET>
+	<GREEN>+<RESET><GREEN>int do_foo()<RESET>
+	<GREEN>+<RESET><GREEN>{<RESET>
+	<BOLD;CYAN>+<RESET>Q<BOLD;CYAN>// this part of the function<RESET>
+	<BOLD;CYAN>+<RESET>Q<BOLD;CYAN>// function will be very long<RESET>
+	<BOLD;CYAN>+<RESET>Q<BOLD;CYAN>// indeed. We must exceed both<RESET>
+	<BOLD;CYAN>+<RESET>Q<BOLD;CYAN>// per-line and number of line<RESET>
+	<BOLD;CYAN>+<RESET>Q<BOLD;CYAN>// minimums<RESET>
+	<BOLD;CYAN>+<RESET>Q<BOLD;CYAN>;<RESET>
+	<BOLD;CYAN>+<RESET><BOLD;CYAN>}<RESET>
+	<GREEN>+<RESET>
+	 int func()<RESET>
+	 {<RESET>
+	<RED>-Qif (foo) {<RESET>
+	<BOLD;MAGENTA>-QQ// this part of the function<RESET>
+	<BOLD;MAGENTA>-QQ// function will be very long<RESET>
+	<BOLD;MAGENTA>-QQ// indeed. We must exceed both<RESET>
+	<BOLD;MAGENTA>-QQ// per-line and number of line<RESET>
+	<BOLD;MAGENTA>-QQ// minimums<RESET>
+	<BOLD;MAGENTA>-QQ;<RESET>
+	<BOLD;MAGENTA>-Q}<RESET>
+	<GREEN>+<RESET>Q<GREEN>if (foo)<RESET>
+	<GREEN>+<RESET>QQ<GREEN>do_foo();<RESET>
+	 Qbaz();<RESET>
+	 Qbar();<RESET>
+	 Q// more unrelated stuff<RESET>
+	EOF
+	test_cmp expected actual
 '
 
 test_done
-- 
2.15.0.rc2.6.g953226eb5f


* Re: [PATCH 1/5] connect: split git:// setup into a separate function
  2017-10-23 23:17                         ` [PATCH 1/5] connect: split git:// setup into a separate function Jonathan Nieder
@ 2017-10-24  1:44                           ` Junio C Hamano
  2017-11-15 20:25                             ` Jonathan Nieder
  0 siblings, 1 reply; 161+ messages in thread
From: Junio C Hamano @ 2017-10-24  1:44 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: Jonathan Tan, Brandon Williams, git, peff, sbeller, William Yan

Jonathan Nieder <jrnieder@gmail.com> writes:

> The git_connect function is growing long.  Split the
> PROTO_GIT-specific portion to a separate function to make it easier to
> read.
>
> No functional change intended.
>
> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
> Reviewed-by: Stefan Beller <sbeller@google.com>
> ---
> As before, except with sbeller's Reviewed-by.

I found this quite nice, except for one thing.

> +/*
> + * Open a connection using Git's native protocol.
> + *
> + * The caller is responsible for freeing hostandport, but this function may
> + * modify it (for example, to truncate it to remove the port part).
> + */
> +static struct child_process *git_connect_git(int fd[2], char *hostandport,
> +					     const char *path, const char *prog,
> +					     int flags)
> +{
> +	struct child_process *conn = &no_fork;
> +	struct strbuf request = STRBUF_INIT;

As this one decides what "conn" to return, including the fallback
&no_fork instance,...

> +	...
> +	return conn;
> +}
> +
>  /*
>   * This returns a dummy child_process if the transport protocol does not
>   * need fork(2), or a struct child_process object if it does.  Once done,
> @@ -881,50 +939,7 @@ struct child_process *git_connect(int fd[2], const char *url,

Each branch of the if/else-if cascade, one of which calls the new helper,
now makes an explicit assignment to "conn" declared in
git_connect().

Which means the defaulting of git_connect::conn to &no_fork is now
unneeded.  One of the things that made the original cascade a bit
harder to follow than necessary, aside from the physical length of
the PROTO_GIT part, was that the case where conn remains pointing at
no_fork looked very special and it was buried in that long PROTO_GIT
part.  Now that the main source of that issue is fixed, it would be
clearer to leave conn uninitialized (or initialize it to NULL---but leaving
it uninitialized would make the intention of the code clearer, I
would think: each branch of the if/else-if cascade must assign to it).

>  		printf("Diag: path=%s\n", path ? path : "NULL");
>  		conn = NULL;
>  	} else if (protocol == PROTO_GIT) {
> -		struct strbuf request = STRBUF_INIT;
> -...
> +		conn = git_connect_git(fd, hostandport, path, prog, flags);
>  	} else {
>  		struct strbuf cmd = STRBUF_INIT;
>  		const char *const *var;
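
The suggestion about leaving "conn" uninitialized can be sketched roughly
as follows.  This is not the code from connect.c: the enum, the function
name and the string return values are placeholders chosen for
illustration.  The point is only that, without a default, every branch of
the cascade visibly assigns the result, and a compiler with
uninitialized-variable warnings enabled may flag a branch that forgets to.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder protocol kinds; not the real enum from connect.c. */
enum example_protocol { EX_PROTO_DIAG, EX_PROTO_GIT, EX_PROTO_SSH };

/* Returns a label standing in for the child_process pointer. */
const char *example_connect(enum example_protocol protocol)
{
	const char *conn;	/* deliberately left uninitialized */

	assert(protocol <= EX_PROTO_SSH);

	if (protocol == EX_PROTO_DIAG)
		conn = NULL;		/* diagnostics only, no connection */
	else if (protocol == EX_PROTO_GIT)
		conn = "no_fork";	/* the helper decides, may be the dummy */
	else
		conn = "child";		/* a real child process */

	return conn;
}
```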

* Re: [PATCH 1/5] connect: split git:// setup into a separate function
  2017-10-23 22:16               ` Stefan Beller
  2017-10-24  0:09                 ` [WIP PATCH] diff: add option to ignore whitespaces for move detection only Stefan Beller
@ 2017-10-24  1:54                 ` Junio C Hamano
  2017-10-24  2:52                   ` Stefan Beller
  1 sibling, 1 reply; 161+ messages in thread
From: Junio C Hamano @ 2017-10-24  1:54 UTC (permalink / raw)
  To: Stefan Beller
  Cc: Jonathan Nieder, Brandon Williams, git, Bryan Turner,
	Jeff Hostetler, Jonathan Tan, Jeff King, William Yan

Stefan Beller <sbeller@google.com> writes:

> I think once this option is given, all we have to do is pay attention to
> this option in diff.c#moved_entry_cmp/next_byte, which is best built
> on top of Peff's recent fixes origin/jk/diff-color-moved-fix.
> Would that be of interest for people?

Two things and a half.

 * I was hoping that the next_byte() and string_hash() thing, once
   they are cleaned up, will eventually be shared with the xdiff/
   code at the lower layer, which needs to do pretty much the same
   in order to implement various whitespace ignoring options.  I am
   not sure how well the approach taken by the WIP patch meshes with
   the needs of the lower layer.

 * I agree that -w that applies only one or the other and not both
   may sometimes produce a better/readable result, but the more
   important part is how the user can tell when to exercise the
   option.  Would it be realistic to expect them to try -w in
   different combinations and see which looks the best?  What if we
   have a patch that touches two files, one looks better with -w only
   for coloring moved and the other looks better with -w for both?

 * As moved-lines display is mostly a presentation thing, I wonder
   if it makes sense to always match loosely wrt whitespace
   differences.  It is tempting because if it is true, we do not
   have to worry about the second issue above.

Thanks.

* Re: [PATCH 2/5] connect: split ssh command line options into separate function
  2017-10-23 23:17                         ` [PATCH 2/5] connect: split ssh command line options into " Jonathan Nieder
@ 2017-10-24  2:01                           ` Junio C Hamano
  0 siblings, 0 replies; 161+ messages in thread
From: Junio C Hamano @ 2017-10-24  2:01 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: Jonathan Tan, Brandon Williams, git, peff, sbeller, William Yan

Jonathan Nieder <jrnieder@gmail.com> writes:

> The git_connect function is growing long.  Split the portion that
> discovers an ssh command and options it accepts before the service
> name and path to a separate function to make it easier to read.
>
> No functional change intended.
>
> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
> Reviewed-by: Stefan Beller <sbeller@google.com>
> ---
> As before, except for the Reviewed-by.
>
>  connect.c | 116 +++++++++++++++++++++++++++++++++-----------------------------
>  1 file changed, 61 insertions(+), 55 deletions(-)

Looks like a straightforward split.  Makes sense.

Thanks.

* Re: [PATCH 3/5] ssh: 'auto' variant to select between 'ssh' and 'simple'
  2017-10-23 21:31             ` [PATCH 3/5] ssh: 'auto' variant to select between 'ssh' and 'simple' Jonathan Nieder
  2017-10-23 22:19               ` Jonathan Tan
  2017-10-23 22:33               ` Stefan Beller
@ 2017-10-24  2:16               ` Junio C Hamano
  2017-10-25 12:51               ` Johannes Schindelin
  3 siblings, 0 replies; 161+ messages in thread
From: Junio C Hamano @ 2017-10-24  2:16 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: Brandon Williams, git, bturner, git, jonathantanmy, peff,
	sbeller, William Yan

Jonathan Nieder <jrnieder@gmail.com> writes:

>  1. First, check whether $GIT_SSH supports OpenSSH options by running
>
> 	$GIT_SSH -G <options> <host>
>
>     This returns status 0 and prints configuration in OpenSSH if it
>     recognizes all <options> and returns status 255 if it encounters
>     an unrecognized option.  A wrapper script like
>
> 	exec ssh -- "$@"
>
>     would fail with
>
> 	ssh: Could not resolve hostname -g: Name or service not known
>
>     , correctly reflecting that it does not support OpenSSH options.

Two comments.

 * It would have been much nicer if the push_ssh_options() got
   split from its caller in a separate preparatory [PATCH 2.5/5].

 * Use of -G for auto-detection is clever and cute, but do we know
   how safe it would be in the real world?  What worries me the most
   is "myssh -G localhost" (no extra options) that does not fit our
   expectation on "-G" to either exit immediately with an error when
   it is not understood, or to exit like OpenSSH does, and instead
   successfully makes a connection and gets stuck.
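
For reference, the probe under discussion can be sketched in isolation as
below.  This is a simplified stand-in, not the code from the patch: the
patch goes through run_command() and also passes the real option list,
while this version uses plain fork/exec and only "-G <host>";
probe_ssh_helper() and enum probe_result are names made up for the
example.  Note that it has no timeout, so it shows the concern rather than
answering it: a helper that ignores "-G" and connects would keep the
caller waiting.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

enum probe_result { PROBE_SIMPLE, PROBE_SSH };

/*
 * Run "<ssh_cmd> -G <host>" with stdin, stdout and stderr on /dev/null
 * and classify the helper by its exit status: 0 means "-G" was accepted.
 */
enum probe_result probe_ssh_helper(const char *ssh_cmd, const char *host)
{
	pid_t pid;
	int status;

	assert(ssh_cmd != NULL && host != NULL);

	pid = fork();
	if (pid < 0)
		return PROBE_SIMPLE;	/* cannot probe: assume the least */

	if (pid == 0) {
		int devnull = open("/dev/null", O_RDWR);

		if (devnull >= 0) {
			dup2(devnull, STDIN_FILENO);
			dup2(devnull, STDOUT_FILENO);
			dup2(devnull, STDERR_FILENO);
			if (devnull > STDERR_FILENO)
				close(devnull);
		}
		execlp(ssh_cmd, ssh_cmd, "-G", host, (char *)NULL);
		_exit(127);		/* exec itself failed */
	}

	/* Blocks until the helper exits; retry if a signal interrupts us. */
	while (waitpid(pid, &status, 0) < 0) {
		if (errno != EINTR)
			return PROBE_SIMPLE;
	}

	if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
		return PROBE_SSH;
	return PROBE_SIMPLE;
}
```

Any helper that exits 0 for arguments it does not understand would be
classified as OpenSSH-like by this check, which is the other half of the
same risk.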

* Re: [PATCH v2 0/5] Coping with unrecognized ssh wrapper scripts in GIT_SSH
  2017-10-23 23:16                       ` [PATCH v2 0/5] Coping with unrecognized ssh wrapper scripts in GIT_SSH Jonathan Nieder
                                           ` (4 preceding siblings ...)
  2017-10-23 23:19                         ` [PATCH 5/5] ssh: 'simple' variant does not support --port Jonathan Nieder
@ 2017-10-24  2:22                         ` Junio C Hamano
  5 siblings, 0 replies; 161+ messages in thread
From: Junio C Hamano @ 2017-10-24  2:22 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: Jonathan Tan, Brandon Williams, git, peff, sbeller, William Yan

Jonathan Nieder <jrnieder@gmail.com> writes:

> Jonathan Tan wrote:
>>> On 10/23, Jonathan Nieder wrote:
>
>>>> If this looks good, I can reroll in a moment.
>>
>> Yes, this looks good.
>
> Thanks.  Here goes.
>
> The interdiff is upthread.  Thanks, all, for the quick review.
>
> Jonathan Nieder (5):
>   connect: split git:// setup into a separate function
>   connect: split ssh command line options into separate function
>   ssh: 'auto' variant to select between 'ssh' and 'simple'
>   ssh: 'simple' variant does not support -4/-6
>   ssh: 'simple' variant does not support --port

These looked mostly good. I left some nitpicks on individual
patches.


* Re: [PATCH 1/5] connect: split git:// setup into a separate function
  2017-10-24  1:54                 ` [PATCH 1/5] connect: split git:// setup into a separate function Junio C Hamano
@ 2017-10-24  2:52                   ` Stefan Beller
  0 siblings, 0 replies; 161+ messages in thread
From: Stefan Beller @ 2017-10-24  2:52 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Jonathan Nieder, Brandon Williams, git, Bryan Turner,
	Jeff Hostetler, Jonathan Tan, Jeff King, William Yan

On Mon, Oct 23, 2017 at 6:54 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Stefan Beller <sbeller@google.com> writes:
>
>> I think once this option is given, all we have to do is pay attention to
>> this option in diff.c#moved_entry_cmp/next_byte, which is best built
>> on top of Peff's recent fixes origin/jk/diff-color-moved-fix.
>> Would that be of interest for people?
>
> Two things and a half.
>
>  * I was hoping that the next_byte() and string_hash() thing, once
>    they are cleaned up, will eventually be shared with the xdiff/
>    code at the lower layer, which needs to do pretty much the same
>    in order to implement various whitespace ignoring options.  I am
>    not sure how well the approach taken by the WIP patch meshes with
>    the needs of the lower layer.

Good point. I'll keep that in mind when redoing that patch.
(I might even try to clean up the xdiff stuff and reuse the code
here)

>  * I agree that -w that applies only one or the other and not both
>    may sometimes produce a better/readable result, but the more
>    important part is how the user can tell when to exercise the
>    option.  Would it be realistic to expect them to try -w in
>    different combinations and see which looks the best?  What if we
>    have a patch that touches two files, one looks better with -w only
>    for coloring moved and the other looks better with -w for both?
>
>  * As moved-lines display is mostly a presentation thing, I wonder
>    if it makes sense to always match loosely wrt whitespace
>    differences.  It is tempting because if it is true, we do not
>    have to worry about the second issue above.

Well, sometimes the user wants to know if it is byte-for-byte identical
(unlikely to be code, but maybe column oriented data for input;
think of all our FORTRAN users. ;)
and at other times the user just wants to approximately know
if it is the same (i.e. code ignoring white space changes).
If Git was *really* smart w.r.t. languages it might even ignore
identifier renames in the programming language.

So I think "unconditionally ignore all white space in
move detection" is not the best (it could be a good default though!),
but the user wants to have the option of a byte-for-byte comparison
(maybe even including CRLFs/LFs).
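
The two kinds of comparison being contrasted here can be sketched as
follows.  These are standalone illustrations, not the next_byte() or
moved_entry_cmp() code from diff.c, and both function names are invented
for the example.  The loose variant skips every whitespace byte on both
sides, which is roughly what "-w" means, so it also treats CRLF and LF
line endings as equal.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Byte-for-byte comparison: 0 if identical, 1 otherwise. */
int lines_differ_exact(const char *a, const char *b)
{
	assert(a != NULL && b != NULL);
	return strcmp(a, b) != 0;
}

/* Comparison that ignores all whitespace: 0 if equal, 1 otherwise. */
int lines_differ_ignoring_space(const char *a, const char *b)
{
	assert(a != NULL && b != NULL);

	for (;;) {
		/* Skip any run of whitespace on either side. */
		while (*a && isspace((unsigned char)*a))
			a++;
		while (*b && isspace((unsigned char)*b))
			b++;

		if (*a != *b)
			return 1;
		if (!*a)
			return 0;	/* both strings ended together */
		a++;
		b++;
	}
}
```

A line that was only re-indented, as in the test from the WIP patch,
differs under the first function and matches under the second.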

Thanks,
Stefan

> Thanks.

* Re: [WIP PATCH] diff: add option to ignore whitespaces for move detection only
  2017-10-24  0:09                 ` [WIP PATCH] diff: add option to ignore whitespaces for move detection only Stefan Beller
@ 2017-10-24 18:48                   ` Brandon Williams
  2017-10-25  1:25                     ` Junio C Hamano
  0 siblings, 1 reply; 161+ messages in thread
From: Brandon Williams @ 2017-10-24 18:48 UTC (permalink / raw)
  To: Stefan Beller
  Cc: bturner, git, git, gitster, jonathantanmy, jrnieder, peff, wyan

On 10/23, Stefan Beller wrote:
> Signed-off-by: Stefan Beller <sbeller@google.com>
> ---
> 
>  diff.c                     |  10 ++--
>  diff.h                     |   1 +
>  t/t4015-diff-whitespace.sh | 114 ++++++++++++++++++++++++++++++++++++++++++++-
>  
> See, only 10 lines of code! (and a few more for tests)
> 
> We have run out of space in diff_options.{flags,touched_flags},
> as we already use 1U<<31 as the highest bit. We could reuse 1<<9 that is currently
> unused (removed in 882749a04f (diff: add --word-diff option that
> generalizes --color-words, 2010-04-14)). But that postpones the
> real fix for only a short amount of time.
> 
> Ideas welcome how to extend the flag space. (We cannot just make it
> a long either, as some arcane architectures have 32 bit longs.)

One simple idea would be to convert the single 'flag' into various bit
fields themselves, that way if you need to add a new flag you would just
make a new bit field.  I'm unaware of any downsides (though
I may be missing something), but it would probably cause a bit of
code churn.
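
A rough sketch of the bit-field idea, with made-up names (these are not
the actual diff option flags): each flag becomes a 1-bit wide member, so
adding a flag means adding a member instead of finding a free bit in a
32-bit word.

```c
/*
 * Hypothetical stand-in for the flag word in struct diff_options.
 * The compiler packs the 1-bit members into as many storage units
 * as it needs, so the list is not limited to 32 entries.
 */
struct example_diff_flags {
	unsigned recursive : 1;
	unsigned binary : 1;
	unsigned color_moved_ignore_ws : 1;	/* a newly added flag */
};

/* Count how many of the example flags are currently set. */
int example_flags_count_set(const struct example_diff_flags *f)
{
	return f->recursive + f->binary + f->color_moved_ignore_ws;
}
```

The churn comes from the callers: every place that tests or sets a bit
through a mask would have to name the member instead.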


-- 
Brandon Williams


* Re: [WIP PATCH] diff: add option to ignore whitespaces for move detection only
  2017-10-24 18:48                   ` Brandon Williams
@ 2017-10-25  1:25                     ` Junio C Hamano
  2017-10-25  1:26                       ` Junio C Hamano
  0 siblings, 1 reply; 161+ messages in thread
From: Junio C Hamano @ 2017-10-25  1:25 UTC (permalink / raw)
  To: Brandon Williams
  Cc: Stefan Beller, bturner, git, git, jonathantanmy, jrnieder, peff, wyan

Brandon Williams <bmwill@google.com> writes:

> One simple idea would be to convert the single 'flag' into various bit
> fields themselves, that way if you need to add a new flag you would just
> make a new bit field.  I'm unaware of any downsides of doing so (though
> i may be missing something) but doing so would probably cause a bit of
> code churn.

The reason why people want to have their own bit in the flags word
is because they want to use DIFF_OPT_{SET,CLR,TST,TOUCHED} but they
do not want to do the work to extend them beyond a single word.  

I think it is doable by making everything a 1-bit wide bitfield
without affecting existing users.


* Re: [WIP PATCH] diff: add option to ignore whitespaces for move detection only
  2017-10-25  1:25                     ` Junio C Hamano
@ 2017-10-25  1:26                       ` Junio C Hamano
  2017-10-25 18:58                         ` Brandon Williams
  0 siblings, 1 reply; 161+ messages in thread
From: Junio C Hamano @ 2017-10-25  1:26 UTC (permalink / raw)
  To: Brandon Williams
  Cc: Stefan Beller, bturner, git, git, jonathantanmy, jrnieder, peff, wyan

Junio C Hamano <gitster@pobox.com> writes:

> Brandon Williams <bmwill@google.com> writes:
>
>> One simple idea would be to convert the single 'flag' into various bit
>> fields themselves, that way if you need to add a new flag you would just
>> make a new bit field.  I'm unaware of any downsides of doing so (though
>> i may be missing something) but doing so would probably cause a bit of
>> code churn.
>
> The reason why people want to have their own bit in the flags word
> is because they want to use DIFF_OPT_{SET,CLR,TST,TOUCHED} but they
> do not want to do the work to extend them beyond a single word.  
>
> I think it is doable by making everything a 1-bit wide bitfield
> without affecting existing users.

... but the "touched" thing may be harder---I haven't thought it
through.


* Re: [PATCH 3/5] ssh: 'auto' variant to select between 'ssh' and 'simple'
  2017-10-23 21:31             ` [PATCH 3/5] ssh: 'auto' variant to select between 'ssh' and 'simple' Jonathan Nieder
                                 ` (2 preceding siblings ...)
  2017-10-24  2:16               ` Junio C Hamano
@ 2017-10-25 12:51               ` Johannes Schindelin
  2017-10-25 16:18                 ` Stefan Beller
  3 siblings, 1 reply; 161+ messages in thread
From: Johannes Schindelin @ 2017-10-25 12:51 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: Brandon Williams, git, bturner, git, gitster, jonathantanmy,
	peff, sbeller, William Yan

Hi Jonathan,

[I only saw that you replied to 3/5 with v2 after writing this reply, but
it would apply to v2's 3/5, too]

On Mon, 23 Oct 2017, Jonathan Nieder wrote:

> diff --git a/t/t5601-clone.sh b/t/t5601-clone.sh
> index 86811a0c35..fd94dd40d2 100755
> --- a/t/t5601-clone.sh
> +++ b/t/t5601-clone.sh
> @@ -384,6 +384,20 @@ test_expect_success 'uplink is treated as simple' '
>  	expect_ssh myhost src
>  '
>  
> +test_expect_success 'OpenSSH-like uplink is treated as ssh' '
> +	write_script "$TRASH_DIRECTORY/uplink" <<-EOF &&
> +	if test "\$1" = "-G"
> +	then
> +		exit 0
> +	fi &&
> +	exec "\$TRASH_DIRECTORY/ssh$X" "\$@"
> +	EOF
> +	GIT_SSH="$TRASH_DIRECTORY/uplink" &&
> +	export GIT_SSH &&
> +	git clone "[myhost:123]:src" ssh-bracket-clone-sshlike-uplink &&
> +	expect_ssh "-p 123" myhost src
> +'
> +
>  test_expect_success 'plink is treated specially (as putty)' '
>  	copy_ssh_wrapper_as "$TRASH_DIRECTORY/plink" &&
>  	git clone "[myhost:123]:src" ssh-bracket-clone-plink-0 &&

This breaks on Windows. And it is not immediately obvious how so, so let
me explain:

As you know, on Windows there is no executable flag. There is the .exe
file extension to indicate an executable (and .com and .bat and .cmd are
also handled as executable, at least as executable script).

Now, what happens if you call "abc" in the MSYS2 Bash and there is no
script called "abc" but an executable called "abc.exe" in the PATH? Why,
of course it executes that executable. It has to, because if "abc.exe"
were renamed to "abc", it would no longer work.

That is also the reason why that "copy_ssh_wrapper" helper function
automatically appends that .exe file suffix on Windows: it has to.

Every workaround breaks down at some point, and this workaround is no
exception. What should the MSYS2 Bash do if asked to overwrite "abc" and
there is only an "abc.exe"? It actually overwrites "abc.exe" and moves on.

And this is where we are here: the previous test case copied the ssh
wrapper as "uplink". Except on Windows, it is "uplink.exe". And your newly
introduced test case overwrites it. And then it tells Git specifically to
look for "uplink", and Git does *not* append that .exe suffix
automatically as the MSYS2 Bash would do, because git.exe is not intended
to work MSYS2-like.

As a consequence, the test fails. Could you please squash in this, to fix
the test on Windows?

-- snipsnap --
diff --git a/t/t5601-clone.sh b/t/t5601-clone.sh
index ec4b17bca62..1afcbd00617 100755
--- a/t/t5601-clone.sh
+++ b/t/t5601-clone.sh
@@ -393,6 +393,7 @@ test_expect_success 'simple does not support port' '
 
 test_expect_success 'uplink is treated as simple' '
 	copy_ssh_wrapper_as "$TRASH_DIRECTORY/uplink" &&
+	test_when_finished "rm \"$TRASH_DIRECTORY/uplink$X\"" &&
 	test_must_fail git clone "[myhost:123]:src" ssh-bracket-clone-uplink &&
 	git clone "myhost:src" ssh-clone-uplink &&
 	expect_ssh myhost src



* Re: [PATCH 3/5] ssh: 'auto' variant to select between 'ssh' and 'simple'
  2017-10-25 12:51               ` Johannes Schindelin
@ 2017-10-25 16:18                 ` Stefan Beller
  2017-10-25 16:32                   ` Jonathan Nieder
  0 siblings, 1 reply; 161+ messages in thread
From: Stefan Beller @ 2017-10-25 16:18 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Jonathan Nieder, Brandon Williams, git, Bryan Turner,
	Jeff Hostetler, Junio C Hamano, Jonathan Tan, Jeff King,
	William Yan

On Wed, Oct 25, 2017 at 5:51 AM, Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
> Hi Jonathan,
>
> [I only saw that you replied to 3/5 with v2 after writing this reply, but
> it would apply to v2's 3/5, too]
>
> On Mon, 23 Oct 2017, Jonathan Nieder wrote:
>
>> diff --git a/t/t5601-clone.sh b/t/t5601-clone.sh
>> index 86811a0c35..fd94dd40d2 100755
>> --- a/t/t5601-clone.sh
>> +++ b/t/t5601-clone.sh
>> @@ -384,6 +384,20 @@ test_expect_success 'uplink is treated as simple' '
>>       expect_ssh myhost src
>>  '
>>
>> +test_expect_success 'OpenSSH-like uplink is treated as ssh' '
>> +     write_script "$TRASH_DIRECTORY/uplink" <<-EOF &&
>> +     if test "\$1" = "-G"
>> +     then
>> +             exit 0
>> +     fi &&
>> +     exec "\$TRASH_DIRECTORY/ssh$X" "\$@"
>> +     EOF
>> +     GIT_SSH="$TRASH_DIRECTORY/uplink" &&
>> +     export GIT_SSH &&
>> +     git clone "[myhost:123]:src" ssh-bracket-clone-sshlike-uplink &&
>> +     expect_ssh "-p 123" myhost src
>> +'
>> +
>>  test_expect_success 'plink is treated specially (as putty)' '
>>       copy_ssh_wrapper_as "$TRASH_DIRECTORY/plink" &&
>>       git clone "[myhost:123]:src" ssh-bracket-clone-plink-0 &&
>
> This breaks on Windows. And it is not immediately obvious how so, so let
> me explain:
>
> As you know, on Windows there is no executable flag. There is the .exe
> file extension to indicate an executable (and .com and .bat and .cmd are
> also handled as executable, at least as executable script).
>
> Now, what happens if you call "abc" in the MSYS2 Bash and there is no
> script called "abc" but an executable called "abc.exe" in the PATH? Why,
> of course we execute that executable. It has to, because if "abc.exe"
> would be renamed into "abc", it would not work any longer.
>
> That is also the reason why that "copy_ssh_wrapper" helper function
> automatically appends that .exe file suffix on Windows: it has to.
>
> Every workaround breaks down at some point, and this workaround is no
> exception. What should the MSYS2 Bash do if asked to overwrite "abc" and
> there is only an "abc.exe"? It actually overwrites "abc.exe" and moves on.
>
> And this is where we are here: the previous test case copied the ssh
> wrapper as "uplink". Except on Windows, it is "uplink.exe". And your newly
> introduced test case overwrites it. And then it tells Git specifically to
> look for "uplink", and Git does *not* append that .exe suffix
> automatically as the MSYS2 Bash would do, because git.exe is not intended
> to work MSYS2-like.
>
> As a consequence, the test fails. Could you please squash in this, to fix
> the test on Windows?

This explanation is detailed and would even make a good commit message
for a follow-up commit. (Squashing just that line would lose the explanation,
as I suspect the original commit will not dedicate so much text to
this single line.)

>
> -- snipsnap --
> diff --git a/t/t5601-clone.sh b/t/t5601-clone.sh
> index ec4b17bca62..1afcbd00617 100755
> --- a/t/t5601-clone.sh
> +++ b/t/t5601-clone.sh
> @@ -393,6 +393,7 @@ test_expect_success 'simple does not support port' '
>
>  test_expect_success 'uplink is treated as simple' '
>         copy_ssh_wrapper_as "$TRASH_DIRECTORY/uplink" &&
> +       test_when_finished "rm \"$TRASH_DIRECTORY/uplink$X\"" &&
>         test_must_fail git clone "[myhost:123]:src" ssh-bracket-clone-uplink &&
>         git clone "myhost:src" ssh-clone-uplink &&
>         expect_ssh myhost src
>


* Re: [PATCH 3/5] ssh: 'auto' variant to select between 'ssh' and 'simple'
  2017-10-25 16:18                 ` Stefan Beller
@ 2017-10-25 16:32                   ` Jonathan Nieder
  2017-10-30  0:40                     ` Junio C Hamano
  0 siblings, 1 reply; 161+ messages in thread
From: Jonathan Nieder @ 2017-10-25 16:32 UTC (permalink / raw)
  To: Stefan Beller
  Cc: Johannes Schindelin, Brandon Williams, git, Bryan Turner,
	Jeff Hostetler, Junio C Hamano, Jonathan Tan, Jeff King,
	William Yan

Hi,

Stefan Beller wrote:
> On Wed, Oct 25, 2017 at 5:51 AM, Johannes Schindelin
> <Johannes.Schindelin@gmx.de> wrote:

>> This breaks on Windows. And it is not immediately obvious how so, so let
>> me explain:
[nice explanation snipped]
>> As a consequence, the test fails. Could you please squash in this, to fix
>> the test on Windows?
>
> This explanation is detailed and would even make a good commit message
> for a follow-up commit. (Squashing just that line would lose the explanation,
> as I suspect the original commit will not dedicate so much text to
> this single line.)

I have other changes to make when rerolling anyway (from Junio's
review), so no need for a followup patch.  Will fix this in the
reroll today.

Thanks for catching and diagnosing this, Dscho!

Jonathan


* Re: [WIP PATCH] diff: add option to ignore whitespaces for move detection only
  2017-10-25  1:26                       ` Junio C Hamano
@ 2017-10-25 18:58                         ` Brandon Williams
  0 siblings, 0 replies; 161+ messages in thread
From: Brandon Williams @ 2017-10-25 18:58 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Stefan Beller, bturner, git, git, jonathantanmy, jrnieder, peff, wyan

On 10/25, Junio C Hamano wrote:
> Junio C Hamano <gitster@pobox.com> writes:
> 
> > Brandon Williams <bmwill@google.com> writes:
> >
> >> One simple idea would be to convert the single 'flag' into various bit
> >> fields themselves, that way if you need to add a new flag you would just
> >> make a new bit field.  I'm unaware of any downsides of doing so (though
> >> i may be missing something) but doing so would probably cause a bit of
> >> code churn.
> >
> > The reason why people want to have their own bit in the flags word
> > is because they want to use DIFF_OPT_{SET,CLR,TST,TOUCHED} but they
> > do not want to do the work to extend them beyond a single word.  
> >
> > I think it is doable by making everything a 1-bit wide bitfield
> > without affecting existing users.
> 
> ... but the "touched" thing may be harder---I haven't thought it
> through.

From what I can tell the 'touched' thing is implemented as a parallel
flag field so we would just need to have each flag use 2-bits, one
for the flag itself and one for the 'touched' field.  Then when using
those macros it would just need to update the corresponding 'touched'
field as well as what ever happens with the flag itself.  It may be a
little more involved than the current scheme but it should be doable if
we need to extend the flag space past 32 bits.
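
As a sketch of what that could look like (all names here are invented
for illustration, and the macros only mimic the shape of
DIFF_OPT_{SET,CLR,TST,TOUCHED}), the 'touched' record can be a second
instance of the same bit-field struct, updated by the same macros:

```c
/* Hypothetical flag set; one 1-bit member per option. */
struct example_flags {
	unsigned follow_renames : 1;
	unsigned ignore_ws_in_move : 1;
};

struct example_options {
	struct example_flags flags;	/* current value of each flag */
	struct example_flags touched;	/* 1 if the flag was ever set or cleared */
};

/* Setting or clearing a flag also marks it as touched. */
#define EXAMPLE_OPT_SET(opts, flag) \
	((opts)->flags.flag = 1, (opts)->touched.flag = 1)
#define EXAMPLE_OPT_CLR(opts, flag) \
	((opts)->flags.flag = 0, (opts)->touched.flag = 1)
#define EXAMPLE_OPT_TST(opts, flag)	((opts)->flags.flag)
#define EXAMPLE_OPT_TOUCHED(opts, flag)	((opts)->touched.flag)
```

A caller would then write EXAMPLE_OPT_SET(&opts, ignore_ws_in_move), and
a cleared-on-purpose flag can still be told apart from one that was
never mentioned by checking EXAMPLE_OPT_TOUCHED.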

-- 
Brandon Williams


* Re: [PATCH 3/5] ssh: 'auto' variant to select between 'ssh' and 'simple'
  2017-10-25 16:32                   ` Jonathan Nieder
@ 2017-10-30  0:40                     ` Junio C Hamano
  2017-10-30 12:37                       ` Johannes Schindelin
  0 siblings, 1 reply; 161+ messages in thread
From: Junio C Hamano @ 2017-10-30  0:40 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: Stefan Beller, Johannes Schindelin, Brandon Williams, git,
	Bryan Turner, Jeff Hostetler, Jonathan Tan, Jeff King,
	William Yan

Jonathan Nieder <jrnieder@gmail.com> writes:

> I have other changes to make when rerolling anyway (from Junio's
> review), so no need for a followup patch.  Will fix this in the
> reroll today.
>
> Thanks for catching and diagnosing this, Dscho!

In the meantime, I've queued this from Dscho; please take it into
consideration when you reroll.

Thanks.

-- >8 --
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Date: Sun, 29 Oct 2017 16:12:46 +0100
Subject: [PATCH] fixup! ssh: 'auto' variant to select between 'ssh' and 'simple'

This is needed because on Windows, if `uplink.exe` exists, the MSYS2
Bash will overwrite that when redirecting via `>uplink`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/t5601-clone.sh | 1 +
 1 file changed, 1 insertion(+)

diff --git a/t/t5601-clone.sh b/t/t5601-clone.sh
index f9a2ae84c7..534eb21915 100755
--- a/t/t5601-clone.sh
+++ b/t/t5601-clone.sh
@@ -391,6 +391,7 @@ test_expect_success 'simple does not support port' '
 
 test_expect_success 'uplink is treated as simple' '
 	copy_ssh_wrapper_as "$TRASH_DIRECTORY/uplink" &&
+	test_when_finished "rm \"$TRASH_DIRECTORY/uplink$X\"" &&
 	test_must_fail git clone "[myhost:123]:src" ssh-bracket-clone-uplink &&
 	git clone "myhost:src" ssh-clone-uplink &&
 	expect_ssh myhost src
-- 
2.15.0-rc2-267-g7d3ed0014a



* Re: [PATCH 3/5] ssh: 'auto' variant to select between 'ssh' and 'simple'
  2017-10-30  0:40                     ` Junio C Hamano
@ 2017-10-30 12:37                       ` Johannes Schindelin
  0 siblings, 0 replies; 161+ messages in thread
From: Johannes Schindelin @ 2017-10-30 12:37 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Jonathan Nieder, Stefan Beller, Brandon Williams, git,
	Bryan Turner, Jeff Hostetler, Jonathan Tan, Jeff King,
	William Yan

Hi Junio,

On Mon, 30 Oct 2017, Junio C Hamano wrote:

> Jonathan Nieder <jrnieder@gmail.com> writes:
> 
> > I have other changes to make when rerolling anyway (from Junio's
> > review), so no need for a followup patch.  Will fix this in the
> > reroll today.
> >
> > Thanks for catching and diagnosing this, Dscho!
> 
> In the meantime, I've queued this from Dscho; please take it into
> consideration when you reroll.
> 
> Thanks.

Thanks for sending it out as an email.

Ciao,
Dscho


* Re: [PATCH 1/5] connect: split git:// setup into a separate function
  2017-10-24  1:44                           ` Junio C Hamano
@ 2017-11-15 20:25                             ` Jonathan Nieder
  2017-11-17  1:12                               ` Junio C Hamano
  0 siblings, 1 reply; 161+ messages in thread
From: Jonathan Nieder @ 2017-11-15 20:25 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Jonathan Tan, Brandon Williams, git, peff, sbeller, William Yan

Hi,

On Oct 24, 2017, Junio C Hamano wrote:
> Jonathan Nieder <jrnieder@gmail.com> writes:

>> +static struct child_process *git_connect_git(int fd[2], char *hostandport,
>> +					     const char *path, const char *prog,
>> +					     int flags)
>> +{
>> +	struct child_process *conn = &no_fork;
>> +	struct strbuf request = STRBUF_INIT;
>
> As this one decides what "conn" to return, including the fallback
> &no_fork instance,...
>
>> +	...
>> +	return conn;
>> +}
>> +
>>  /*
>>   * This returns a dummy child_process if the transport protocol does not
>>   * need fork(2), or a struct child_process object if it does.  Once done,
>> @@ -881,50 +939,7 @@ struct child_process *git_connect(int fd[2], const char *url,
>
> Each of the if/elseif/ cascade, one of which calls the new helper,
> now makes an explicit assignment to "conn" declared in
> git_connect().
>
> Which means the defaulting of git_connect::conn to &no_fork is now
> unneeded.  One of the things that made the original cascade a bit
> harder to follow than necessary, aside from the physical length of
> the PROTO_GIT part, was that the case where conn remains to point at
> no_fork looked very special and it was buried in that long PROTO_GIT
> part.

Good idea.  Here's what I'll include in the reroll.

-- >8 --
Subject: connect: move no_fork fallback to git_tcp_connect

git_connect has the structure

	struct child_process *conn = &no_fork;

	...
	switch (protocol) {
	case PROTO_GIT:
		if (git_use_proxy(hostandport))
			conn = git_proxy_connect(fd, hostandport);
		else
			git_tcp_connect(fd, hostandport, flags);
		...
		break;
	case PROTO_SSH:
		conn = xmalloc(sizeof(*conn));
		child_process_init(conn);
		argv_array_push(&conn->args, ssh);
		...
		break;
	...
	return conn;

In all cases except the git_tcp_connect case, conn is explicitly
assigned a value. Make the code clearer by explicitly assigning
'conn = &no_fork' in the tcp case and eliminating the default so the
compiler can ensure conn is always correctly assigned.

Noticed-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
 connect.c | 22 ++++++++++++----------
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/connect.c b/connect.c
index 7fbd396b35..b6accf71cb 100644
--- a/connect.c
+++ b/connect.c
@@ -582,12 +582,21 @@ static int git_tcp_connect_sock(char *host, int flags)
 #endif /* NO_IPV6 */
 
 
-static void git_tcp_connect(int fd[2], char *host, int flags)
+static struct child_process no_fork = CHILD_PROCESS_INIT;
+
+int git_connection_is_socket(struct child_process *conn)
+{
+	return conn == &no_fork;
+}
+
+static struct child_process *git_tcp_connect(int fd[2], char *host, int flags)
 {
 	int sockfd = git_tcp_connect_sock(host, flags);
 
 	fd[0] = sockfd;
 	fd[1] = dup(sockfd);
+
+	return &no_fork;
 }
 
 
@@ -761,8 +770,6 @@ static enum protocol parse_connect_url(const char *url_orig, char **ret_host,
 	return protocol;
 }
 
-static struct child_process no_fork = CHILD_PROCESS_INIT;
-
 static const char *get_ssh_command(void)
 {
 	const char *ssh;
@@ -865,7 +872,7 @@ struct child_process *git_connect(int fd[2], const char *url,
 				  const char *prog, int flags)
 {
 	char *hostandport, *path;
-	struct child_process *conn = &no_fork;
+	struct child_process *conn;
 	enum protocol protocol;
 
 	/* Without this we cannot rely on waitpid() to tell
@@ -901,7 +908,7 @@ struct child_process *git_connect(int fd[2], const char *url,
 		if (git_use_proxy(hostandport))
 			conn = git_proxy_connect(fd, hostandport);
 		else
-			git_tcp_connect(fd, hostandport, flags);
+			conn = git_tcp_connect(fd, hostandport, flags);
 		/*
 		 * Separate original protocol components prog and path
 		 * from extended host header with a NUL byte.
@@ -1041,11 +1048,6 @@ struct child_process *git_connect(int fd[2], const char *url,
 	return conn;
 }
 
-int git_connection_is_socket(struct child_process *conn)
-{
-	return conn == &no_fork;
-}
-
 int finish_connect(struct child_process *conn)
 {
 	int code;
-- 
2.15.0.448.gf294e3d99a
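
For readers skimming the thread, the pattern in this patch can be
reduced to a standalone sketch. The types and names below are
stand-ins, not the real connect.c definitions: the sentinel is returned
by the TCP helper itself, the caller's variable has no default, and
every branch assigns it explicitly.

```c
#include <stddef.h>

/* Stand-in for struct child_process. */
struct example_conn {
	int unused;
};

/* Sentinel meaning "no child process was forked". */
static struct example_conn example_no_fork;

enum example_proto { EXAMPLE_PROTO_GIT, EXAMPLE_PROTO_SSH };

/* The TCP path hands back the sentinel instead of relying on a default. */
struct example_conn *example_tcp_connect(void)
{
	return &example_no_fork;
}

struct example_conn *example_connect(enum example_proto proto,
				     struct example_conn *child)
{
	struct example_conn *conn;	/* deliberately left without a default */

	switch (proto) {
	case EXAMPLE_PROTO_GIT:
		conn = example_tcp_connect();
		break;
	case EXAMPLE_PROTO_SSH:
		conn = child;	/* a real child process in the ssh case */
		break;
	default:
		conn = NULL;
		break;
	}
	return conn;
}

int example_connection_is_socket(const struct example_conn *conn)
{
	return conn == &example_no_fork;
}
```

If a new case forgets to assign conn, compilers that warn about
possibly uninitialized variables have a chance to flag it, which the
old "conn = &no_fork" default would have hidden.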



* Re: [PATCH 1/5] connect: split git:// setup into a separate function
  2017-11-15 20:25                             ` Jonathan Nieder
@ 2017-11-17  1:12                               ` Junio C Hamano
  0 siblings, 0 replies; 161+ messages in thread
From: Junio C Hamano @ 2017-11-17  1:12 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: Jonathan Tan, Brandon Williams, git, peff, sbeller, William Yan

Jonathan Nieder <jrnieder@gmail.com> writes:

>> Which means the defaulting of git_connect::conn to &no_fork is now
>> unneeded.  One of the things that made the original cascade a bit
>> harder to follow than necessary, aside from the physical length of
>> the PROTO_GIT part, was that the case where conn remains to point at
>> no_fork looked very special and it was buried in that long PROTO_GIT
>> part.
>
> Good idea.  Here's what I'll include in the reroll.

Sounds good.


end of thread

Thread overview: 161+ messages
2017-09-13 21:54 [PATCH 0/8] protocol transition Brandon Williams
2017-09-13 21:54 ` [PATCH 1/8] pkt-line: add packet_write function Brandon Williams
2017-09-13 21:54 ` [PATCH 2/8] protocol: introduce protocol extention mechanisms Brandon Williams
2017-09-13 22:27   ` Stefan Beller
2017-09-18 17:02     ` Brandon Williams
2017-09-18 18:34       ` Stefan Beller
2017-09-18 19:58         ` Brandon Williams
2017-09-18 20:06           ` Stefan Beller
2017-09-13 21:54 ` [PATCH 3/8] daemon: recognize hidden request arguments Brandon Williams
2017-09-13 22:31   ` Stefan Beller
2017-09-18 16:56     ` Brandon Williams
2017-09-21  0:24   ` Jonathan Tan
2017-09-21  0:31     ` Jonathan Tan
2017-09-21 21:55       ` Brandon Williams
2017-09-13 21:54 ` [PATCH 4/8] upload-pack, receive-pack: introduce protocol version 1 Brandon Williams
2017-09-13 21:54 ` [PATCH 5/8] connect: teach client to recognize v1 server response Brandon Williams
2017-09-13 21:54 ` [PATCH 6/8] connect: tell server that the client understands v1 Brandon Williams
2017-09-13 21:54 ` [PATCH 7/8] http: " Brandon Williams
2017-09-13 21:54 ` [PATCH 8/8] i5700: add interop test for protocol transition Brandon Williams
2017-09-20 18:48 ` [PATCH 1.5/8] connect: die when a capability line comes after a ref Brandon Williams
2017-09-20 19:14   ` Jeff King
2017-09-20 20:06     ` Brandon Williams
2017-09-20 20:48       ` Jonathan Nieder
2017-09-21  3:02       ` Junio C Hamano
2017-09-21 20:45       ` [PATCH] connect: in ref advertisement, shallows are last Jonathan Tan
2017-09-21 23:45         ` [PATCH v2] " Jonathan Tan
2017-09-22  0:00           ` Brandon Williams
2017-09-22  0:08             ` [PATCH v3] " Jonathan Tan
2017-09-22  1:06               ` Junio C Hamano
2017-09-22  1:39                 ` Junio C Hamano
2017-09-22 16:45                   ` Brandon Williams
2017-09-22 20:15                     ` [PATCH v4] " Jonathan Tan
2017-09-22 21:01                       ` Brandon Williams
2017-09-22 22:16                         ` Jonathan Tan
2017-09-24  0:52                       ` Junio C Hamano
2017-09-26 18:21         ` [PATCH v5] " Jonathan Tan
2017-09-26 18:31           ` Brandon Williams
2017-09-26 23:56 ` [PATCH v2 0/9] protocol transition Brandon Williams
2017-09-26 23:56   ` [PATCH v2 1/9] connect: in ref advertisement, shallows are last Brandon Williams
2017-09-26 23:56   ` [PATCH v2 2/9] pkt-line: add packet_write function Brandon Williams
2017-09-26 23:56   ` [PATCH v2 3/9] protocol: introduce protocol extention mechanisms Brandon Williams
2017-09-27  5:17     ` Junio C Hamano
2017-09-27 11:23       ` Junio C Hamano
2017-09-29 21:20         ` Brandon Williams
2017-09-28 21:58       ` Brandon Williams
2017-09-27  6:30     ` Stefan Beller
2017-09-28 21:04       ` Brandon Williams
2017-09-26 23:56   ` [PATCH v2 4/9] daemon: recognize hidden request arguments Brandon Williams
2017-09-27  5:20     ` Junio C Hamano
2017-09-27 21:22       ` Brandon Williams
2017-09-28 16:57         ` Brandon Williams
2017-09-26 23:56   ` [PATCH v2 5/9] upload-pack, receive-pack: introduce protocol version 1 Brandon Williams
2017-09-27  5:23     ` Junio C Hamano
2017-09-27 21:29       ` Brandon Williams
2017-09-26 23:56   ` [PATCH v2 6/9] connect: teach client to recognize v1 server response Brandon Williams
2017-09-27  1:07     ` Junio C Hamano
2017-09-27 17:34       ` Brandon Williams
2017-09-27  5:29     ` Junio C Hamano
2017-09-28 22:08       ` Brandon Williams
2017-09-26 23:56   ` [PATCH v2 7/9] connect: tell server that the client understands v1 Brandon Williams
2017-09-27  6:21     ` Junio C Hamano
2017-09-27  6:29       ` Junio C Hamano
2017-09-29 21:32         ` Brandon Williams
2017-09-28 22:20       ` Brandon Williams
2017-09-26 23:56   ` [PATCH v2 8/9] http: " Brandon Williams
2017-09-27  6:24     ` Junio C Hamano
2017-09-27 21:36       ` Brandon Williams
2017-09-26 23:56   ` [PATCH v2 9/9] i5700: add interop test for protocol transition Brandon Williams
2017-10-03 20:14   ` [PATCH v3 00/10] " Brandon Williams
2017-10-03 20:14     ` [PATCH v3 01/10] connect: in ref advertisement, shallows are last Brandon Williams
2017-10-10 18:14       ` Jonathan Tan
2017-10-03 20:14     ` [PATCH v3 02/10] pkt-line: add packet_write function Brandon Williams
2017-10-10 18:15       ` Jonathan Tan
2017-10-03 20:15     ` [PATCH v3 03/10] protocol: introduce protocol extention mechanisms Brandon Williams
2017-10-06  9:09       ` Simon Ruderich
2017-10-06  9:40         ` Junio C Hamano
2017-10-06 11:11           ` Martin Ågren
2017-10-06 12:09             ` Junio C Hamano
2017-10-06 19:42               ` Martin Ågren
2017-10-06 20:27                 ` Stefan Beller
2017-10-08 14:24                   ` Martin Ågren
2017-10-10 21:00             ` Brandon Williams
2017-10-10 21:17               ` Jonathan Nieder
2017-10-10 21:32                 ` Stefan Beller
2017-10-11  0:39                 ` Junio C Hamano
2017-10-13 22:46                 ` Brandon Williams
2017-10-09  4:05           ` Martin Ågren
2017-10-10 19:51       ` Jonathan Tan
2017-10-03 20:15     ` [PATCH v3 04/10] daemon: recognize hidden request arguments Brandon Williams
2017-10-10 18:24       ` Jonathan Tan
2017-10-13 22:04         ` Brandon Williams
2017-10-03 20:15     ` [PATCH v3 05/10] upload-pack, receive-pack: introduce protocol version 1 Brandon Williams
2017-10-10 18:28       ` Jonathan Tan
2017-10-13 22:18         ` Brandon Williams
2017-10-03 20:15     ` [PATCH v3 06/10] connect: teach client to recognize v1 server response Brandon Williams
2017-10-03 20:15     ` [PATCH v3 07/10] connect: tell server that the client understands v1 Brandon Williams
2017-10-10 18:30       ` Jonathan Tan
2017-10-13 22:56         ` Brandon Williams
2017-10-03 20:15     ` [PATCH v3 08/10] http: " Brandon Williams
2017-10-03 20:15     ` [PATCH v3 09/10] i5700: add interop test for protocol transition Brandon Williams
2017-10-03 20:15     ` [PATCH v3 10/10] ssh: introduce a 'simple' ssh variant Brandon Williams
2017-10-03 21:42       ` Jonathan Nieder
2017-10-16 17:18         ` Brandon Williams
2017-10-23 21:28           ` [PATCH 0/5] Coping with unrecognized ssh wrapper scripts in GIT_SSH Jonathan Nieder
2017-10-23 21:29             ` [PATCH 1/5] connect: split git:// setup into a separate function Jonathan Nieder
2017-10-23 22:16               ` Stefan Beller
2017-10-24  0:09                 ` [WIP PATCH] diff: add option to ignore whitespaces for move detection only Stefan Beller
2017-10-24 18:48                   ` Brandon Williams
2017-10-25  1:25                     ` Junio C Hamano
2017-10-25  1:26                       ` Junio C Hamano
2017-10-25 18:58                         ` Brandon Williams
2017-10-24  1:54                 ` [PATCH 1/5] connect: split git:// setup into a separate function Junio C Hamano
2017-10-24  2:52                   ` Stefan Beller
2017-10-23 21:30             ` [PATCH 2/5] connect: split ssh command line options into " Jonathan Nieder
2017-10-23 21:48               ` Stefan Beller
2017-10-23 21:31             ` [PATCH 3/5] ssh: 'auto' variant to select between 'ssh' and 'simple' Jonathan Nieder
2017-10-23 22:19               ` Jonathan Tan
2017-10-23 22:43                 ` Jonathan Nieder
2017-10-23 22:51                   ` Brandon Williams
2017-10-23 22:57                     ` Jonathan Tan
2017-10-23 23:16                       ` [PATCH v2 0/5] Coping with unrecognized ssh wrapper scripts in GIT_SSH Jonathan Nieder
2017-10-23 23:17                         ` [PATCH 1/5] connect: split git:// setup into a separate function Jonathan Nieder
2017-10-24  1:44                           ` Junio C Hamano
2017-11-15 20:25                             ` Jonathan Nieder
2017-11-17  1:12                               ` Junio C Hamano
2017-10-23 23:17                         ` [PATCH 2/5] connect: split ssh command line options into " Jonathan Nieder
2017-10-24  2:01                           ` Junio C Hamano
2017-10-23 23:18                         ` [PATCH 3/5] ssh: 'auto' variant to select between 'ssh' and 'simple' Jonathan Nieder
2017-10-23 23:27                           ` Brandon Williams
2017-10-23 23:33                             ` Stefan Beller
2017-10-23 23:19                         ` [PATCH 4/5] ssh: 'simple' variant does not support -4/-6 Jonathan Nieder
2017-10-23 23:19                         ` [PATCH 5/5] ssh: 'simple' variant does not support --port Jonathan Nieder
2017-10-24  2:22                         ` [PATCH v2 0/5] Coping with unrecognized ssh wrapper scripts in GIT_SSH Junio C Hamano
2017-10-23 23:12                     ` [PATCH 3/5] ssh: 'auto' variant to select between 'ssh' and 'simple' Jonathan Nieder
2017-10-23 22:33               ` Stefan Beller
2017-10-23 22:54                 ` Jonathan Nieder
2017-10-24  2:16               ` Junio C Hamano
2017-10-25 12:51               ` Johannes Schindelin
2017-10-25 16:18                 ` Stefan Beller
2017-10-25 16:32                   ` Jonathan Nieder
2017-10-30  0:40                     ` Junio C Hamano
2017-10-30 12:37                       ` Johannes Schindelin
2017-10-23 21:32             ` [PATCH 4/5] ssh: 'simple' variant does not support -4/-6 Jonathan Nieder
2017-10-23 21:33             ` [PATCH 5/5] ssh: 'simple' variant does not support --port Jonathan Nieder
2017-10-23 22:37               ` Stefan Beller
2017-10-04  6:20     ` [PATCH v3 00/10] protocol transition Junio C Hamano
2017-10-10 19:39     ` [PATCH] Documentation: document Extra Parameters Jonathan Tan
2017-10-13 22:26       ` Brandon Williams
2017-10-16 17:55     ` [PATCH v4 00/11] protocol transition Brandon Williams
2017-10-16 17:55       ` [PATCH v4 01/11] connect: in ref advertisement, shallows are last Brandon Williams
2017-10-16 17:55       ` [PATCH v4 02/11] pkt-line: add packet_write function Brandon Williams
2017-10-16 17:55       ` [PATCH v4 03/11] protocol: introduce protocol extension mechanisms Brandon Williams
2017-10-16 21:25         ` Kevin Daudt
2017-10-16 17:55       ` [PATCH v4 04/11] daemon: recognize hidden request arguments Brandon Williams
2017-10-16 17:55       ` [PATCH v4 05/11] upload-pack, receive-pack: introduce protocol version 1 Brandon Williams
2017-10-16 17:55       ` [PATCH v4 06/11] connect: teach client to recognize v1 server response Brandon Williams
2017-10-16 17:55       ` [PATCH v4 07/11] connect: tell server that the client understands v1 Brandon Williams
2017-10-16 17:55       ` [PATCH v4 08/11] http: " Brandon Williams
2017-10-16 17:55       ` [PATCH v4 09/11] i5700: add interop test for protocol transition Brandon Williams
2017-10-16 17:55       ` [PATCH v4 10/11] ssh: introduce a 'simple' ssh variant Brandon Williams
2017-10-16 17:55       ` [PATCH v4 11/11] Documentation: document Extra Parameters Brandon Williams
