All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC PATCHv1 0/4] Push options in C Git
@ 2016-06-30  0:59 Stefan Beller
  2016-06-30  0:59 ` [PATCH 1/4] push options: {pre,post}-receive hook learns about push options Stefan Beller
                   ` (4 more replies)
  0 siblings, 5 replies; 52+ messages in thread
From: Stefan Beller @ 2016-06-30  0:59 UTC (permalink / raw)
  To: git, dwwang; +Cc: Stefan Beller

Allow a user to pass information along a push to the pre/post-receive hook
on the remote.

When using a remote that is more than just a plain Git host (e.g. Gerrit,
Git{hub/lab}, etc) this may become more obvious: The (server backend specific)
push options can instruct the server to:
* open a pull request
* send out emails asking for review
* (un)trigger continuous integration
* set priority for continuous integration (i.e. bots pushing may ask to be
  treated with lower priority compared to humans)
* ... 

Most of these actions can be done on the client side as well,
but in these remote-centric workflows it is easier to do that on the remote,
which is why we need to transport the information there.

More concrete examples:
* When you want a change in Gerrit to be submitted to refs/heads/master, you
  push instead to a magic branch refs/for/master and Gerrit will create a change
  for you (similar to a pull request). Instead we could imagine that you push
  to a magical refs/heads/master with a push option "create-change".
  
* When pushing to Gerrit you can already attach some of this information by
  adding a '%' followed by the parameter to the ref, i.e. when interacting with
  Gerrit it is possible to do things like[1]:
    
    git push origin HEAD:refs/for/master%draft%topic=example%cc=jon.doe@example.org
  
  This is not appealing to our users as it looks like hacks upon hacks to make
  it work. It would read better if it was spelled as:
  
  git push origin HEAD:refs/for/master \
      --push-option draft \
      --push-option topic=example \
      --push-option cc=jon.doe@example.org
      
  (with a short form that is even easier to type,
   but this is is more intuitive already)

This is a patch series to Git core, which is developed at the same time
as a change is proposed to JGit by Dan Wang, see [2].

This code is also available at [3].

Thanks,
Stefan

[1] Not all Gerrit '%' options are documented, so here is a link to source code instead :(
https://gerrit.googlesource.com/gerrit/+/refs/heads/master/gerrit-server/src/main/java/com/google/gerrit/server/git/ReceiveCommits.java#1141

[2] https://git.eclipse.org/r/#/c/74570/ 
 
[3] https://github.com/stefanbeller/git/tree/pushoptions

Stefan Beller (4):
  push options: {pre,post}-receive hook learns about push options
  receive-pack: implement advertising and receiving push options
  push: accept push options
  add a test for push options

 Documentation/config.txt                          |  7 +-
 Documentation/git-push.txt                        |  8 ++-
 Documentation/githooks.txt                        |  4 ++
 Documentation/technical/pack-protocol.txt         | 10 +--
 Documentation/technical/protocol-capabilities.txt |  8 +++
 builtin/push.c                                    | 16 ++++-
 builtin/receive-pack.c                            | 85 +++++++++++++++++++----
 send-pack.c                                       | 29 ++++++++
 send-pack.h                                       |  3 +
 t/t5544-push-options.sh                           | 85 +++++++++++++++++++++++
 transport.c                                       |  2 +
 transport.h                                       |  7 ++
 12 files changed, 242 insertions(+), 22 deletions(-)
 create mode 100755 t/t5544-push-options.sh

-- 
2.9.0.141.gdd65b60


^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH 1/4] push options: {pre,post}-receive hook learns about push options
  2016-06-30  0:59 [RFC PATCHv1 0/4] Push options in C Git Stefan Beller
@ 2016-06-30  0:59 ` Stefan Beller
  2016-07-01  7:14   ` Jeff King
  2016-06-30  0:59 ` [PATCH 2/4] receive-pack: implement advertising and receiving " Stefan Beller
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 52+ messages in thread
From: Stefan Beller @ 2016-06-30  0:59 UTC (permalink / raw)
  To: git, dwwang; +Cc: Stefan Beller

The environment variable GIT_PUSH_OPTION_FILE is set to the push options
separated by new line.

The code is not executed as the push options are set to NULL, nor is the
new capability advertised.

The rationale for keeping the actual options inside a file instead of
putting them directly into an environment variable has multiple reasons:

1) After a discussion about environment variables and shells, we may not
want to put user data into an environment variable (see [1] for example).

2) If a user passes multiple push options, we need to pass them to the
hooks in a way, the hook can separate them. This could be done via
multiple environment variables (e.g. have GIT_PUSH_OPTIONS_COUNT and
GIT_PUSH_OPTIONS_{0,1,2,...} set), or put it all in one environment
variable and choose an appropriate separator. That is hard to parse
in both ways. For now we'll just put it in a file separated by new line,
such that the hook scripts can pickup the variables as

    while read option ; do
        process_push_option() $option
    done < $GIT_PUSH_OPTION_FILE

3) environment variables are messed with in the run-command API depending
on the occurrence of a '=' to set or unset the variable. Having multiple
'=' is ok, such that we could have it set to a "key=value" pair.

4) When putting the options in a file, we need to take care of the race
condition of multiple clients pushing. That is actually rather easy.

5) We only inject new lines as separator into the file, so it is possible
to transmit binaries ("attach an image to a code review").

[1] 'Shellshock' https://lwn.net/Articles/614218/

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 Documentation/githooks.txt |  4 ++++
 builtin/receive-pack.c     | 41 ++++++++++++++++++++++++++++-------------
 2 files changed, 32 insertions(+), 13 deletions(-)

diff --git a/Documentation/githooks.txt b/Documentation/githooks.txt
index d82e912..dc80574 100644
--- a/Documentation/githooks.txt
+++ b/Documentation/githooks.txt
@@ -247,6 +247,8 @@ Both standard output and standard error output are forwarded to
 'git send-pack' on the other end, so you can simply `echo` messages
 for the user.
 
+The push options are available in the variable GIT_PUSH_OPTION_FILE.
+
 [[update]]
 update
 ~~~~~~
@@ -322,6 +324,8 @@ a sample script `post-receive-email` provided in the `contrib/hooks`
 directory in Git distribution, which implements sending commit
 emails.
 
+The push options are available in the variable GIT_PUSH_OPTION_FILE.
+
 [[post-update]]
 post-update
 ~~~~~~~~~~~
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 15c323a..0da6852 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -550,8 +550,16 @@ static void prepare_push_cert_sha1(struct child_process *proc)
 	}
 }
 
+struct receive_hook_feed_state {
+	struct command *cmd;
+	int skip_broken;
+	struct strbuf buf;
+	const char *push_options_file;
+};
+
 typedef int (*feed_fn)(void *, const char **, size_t *);
-static int run_and_feed_hook(const char *hook_name, feed_fn feed, void *feed_state)
+static int run_and_feed_hook(const char *hook_name, feed_fn feed,
+			     struct receive_hook_feed_state *feed_state)
 {
 	struct child_process proc = CHILD_PROCESS_INIT;
 	struct async muxer;
@@ -567,6 +575,9 @@ static int run_and_feed_hook(const char *hook_name, feed_fn feed, void *feed_sta
 	proc.argv = argv;
 	proc.in = -1;
 	proc.stdout_to_stderr = 1;
+	if (feed_state && feed_state->push_options_file)
+		argv_array_pushf(&proc.env_array, "GIT_PUSH_OPTION_FILE=%s",
+				 feed_state->push_options_file);
 
 	if (use_sideband) {
 		memset(&muxer, 0, sizeof(muxer));
@@ -606,12 +617,6 @@ static int run_and_feed_hook(const char *hook_name, feed_fn feed, void *feed_sta
 	return finish_command(&proc);
 }
 
-struct receive_hook_feed_state {
-	struct command *cmd;
-	int skip_broken;
-	struct strbuf buf;
-};
-
 static int feed_receive_hook(void *state_, const char **bufp, size_t *sizep)
 {
 	struct receive_hook_feed_state *state = state_;
@@ -634,8 +639,10 @@ static int feed_receive_hook(void *state_, const char **bufp, size_t *sizep)
 	return 0;
 }
 
-static int run_receive_hook(struct command *commands, const char *hook_name,
-			    int skip_broken)
+static int run_receive_hook(struct command *commands,
+			    const char *hook_name,
+			    int skip_broken,
+			    const char *push_options_file)
 {
 	struct receive_hook_feed_state state;
 	int status;
@@ -646,6 +653,7 @@ static int run_receive_hook(struct command *commands, const char *hook_name,
 	if (feed_receive_hook(&state, NULL, NULL))
 		return 0;
 	state.cmd = commands;
+	state.push_options_file = push_options_file;
 	status = run_and_feed_hook(hook_name, feed_receive_hook, &state);
 	strbuf_release(&state.buf);
 	return status;
@@ -1316,7 +1324,8 @@ cleanup:
 
 static void execute_commands(struct command *commands,
 			     const char *unpacker_error,
-			     struct shallow_info *si)
+			     struct shallow_info *si,
+			     const char *push_options_file)
 {
 	struct command *cmd;
 	unsigned char sha1[20];
@@ -1335,7 +1344,7 @@ static void execute_commands(struct command *commands,
 
 	reject_updates_to_hidden(commands);
 
-	if (run_receive_hook(commands, "pre-receive", 0)) {
+	if (run_receive_hook(commands, "pre-receive", 0, push_options_file)) {
 		for (cmd = commands; cmd; cmd = cmd->next) {
 			if (!cmd->error_string)
 				cmd->error_string = "pre-receive hook declined";
@@ -1756,6 +1765,7 @@ int cmd_receive_pack(int argc, const char **argv, const char *prefix)
 
 	if ((commands = read_head_info(&shallow)) != NULL) {
 		const char *unpack_status = NULL;
+		const char *push_options_file = NULL;
 
 		prepare_shallow_info(&si, &shallow);
 		if (!si.nr_ours && !si.nr_theirs)
@@ -1764,13 +1774,18 @@ int cmd_receive_pack(int argc, const char **argv, const char *prefix)
 			unpack_status = unpack_with_sideband(&si);
 			update_shallow_info(commands, &si, &ref);
 		}
-		execute_commands(commands, unpack_status, &si);
+		execute_commands(commands, unpack_status, &si,
+				 push_options_file);
 		if (pack_lockfile)
 			unlink_or_warn(pack_lockfile);
 		if (report_status)
 			report(commands, unpack_status);
-		run_receive_hook(commands, "post-receive", 1);
+		run_receive_hook(commands, "post-receive", 1,
+				 push_options_file);
 		run_update_post_hook(commands);
+		if (push_options_file)
+			/* ignore errors */
+			unlink(push_options_file);
 		if (auto_gc) {
 			const char *argv_gc_auto[] = {
 				"gc", "--auto", "--quiet", NULL,
-- 
2.9.0.141.gdd65b60


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 2/4] receive-pack: implement advertising and receiving push options
  2016-06-30  0:59 [RFC PATCHv1 0/4] Push options in C Git Stefan Beller
  2016-06-30  0:59 ` [PATCH 1/4] push options: {pre,post}-receive hook learns about push options Stefan Beller
@ 2016-06-30  0:59 ` Stefan Beller
  2016-07-01 17:11   ` Junio C Hamano
  2016-06-30  0:59 ` [PATCH 3/4] push: accept " Stefan Beller
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 52+ messages in thread
From: Stefan Beller @ 2016-06-30  0:59 UTC (permalink / raw)
  To: git, dwwang; +Cc: Stefan Beller

The pre/post receive hook may be interested in more information from the
user. This information can be transmitted when both client and server
support the "push-options" capability, which when used is a phase directly
after update commands ended by a flush pkt.

Similar to the atomic option, the server capability can be disabled via
the `receive.advertisePushOptions` config variable. While documenting
this, fix a nit in the `receive.advertiseAtomic` wording.

For now we only accept C strings with no new lines.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 Documentation/config.txt                          |  7 +++-
 Documentation/technical/pack-protocol.txt         | 10 +++---
 Documentation/technical/protocol-capabilities.txt |  8 +++++
 builtin/receive-pack.c                            | 44 +++++++++++++++++++++++
 4 files changed, 64 insertions(+), 5 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 626243f..3b199f9 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2410,7 +2410,12 @@ rebase.instructionFormat
 
 receive.advertiseAtomic::
 	By default, git-receive-pack will advertise the atomic push
-	capability to its clients. If you don't want to this capability
+	capability to its clients. If you don't want this capability
+	to be advertised, set this variable to false.
+
+receive.advertisePushOptions::
+	By default, git-receive-pack will advertise the push options capability
+	to its clients. If you don't want this capability
 	to be advertised, set this variable to false.
 
 receive.autogc::
diff --git a/Documentation/technical/pack-protocol.txt b/Documentation/technical/pack-protocol.txt
index 8b36343..7a2ed30 100644
--- a/Documentation/technical/pack-protocol.txt
+++ b/Documentation/technical/pack-protocol.txt
@@ -454,7 +454,8 @@ The reference discovery phase is done nearly the same way as it is in the
 fetching protocol. Each reference obj-id and name on the server is sent
 in packet-line format to the client, followed by a flush-pkt.  The only
 real difference is that the capability listing is different - the only
-possible values are 'report-status', 'delete-refs' and 'ofs-delta'.
+possible values are 'report-status', 'delete-refs', 'ofs-delta' and
+'push-options'.
 
 Reference Update Request and Packfile Transfer
 ----------------------------------------------
@@ -465,9 +466,10 @@ that it wants to update, it sends a line listing the obj-id currently on
 the server, the obj-id the client would like to update it to and the name
 of the reference.
 
-This list is followed by a flush-pkt and then the packfile that should
-contain all the objects that the server will need to complete the new
-references.
+This list is followed by a flush-pkt. Then the push options are transmitted
+one per packet followed by another flush-pkt. After that the packfile that
+should contain all the objects that the server will need to complete the new
+references will be sent.
 
 ----
   update-request    =  *shallow ( command-list | push-cert ) [packfile]
diff --git a/Documentation/technical/protocol-capabilities.txt b/Documentation/technical/protocol-capabilities.txt
index eaab6b4..b71eda9 100644
--- a/Documentation/technical/protocol-capabilities.txt
+++ b/Documentation/technical/protocol-capabilities.txt
@@ -253,6 +253,14 @@ atomic pushes. If the pushing client requests this capability, the server
 will update the refs in one atomic transaction. Either all refs are
 updated or none.
 
+push-options
+------------
+
+If the server sends the 'push-options' capability it is capable to accept
+push options after the update commands have been sent. If the pushing client
+requests this capability, the server will pass the options to the pre and post
+receive hooks that process this push request.
+
 allow-tip-sha1-in-want
 ----------------------
 
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 0da6852..68627ed 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -44,10 +44,12 @@ static struct strbuf fsck_msg_types = STRBUF_INIT;
 static int receive_unpack_limit = -1;
 static int transfer_unpack_limit = -1;
 static int advertise_atomic_push = 1;
+static int advertise_push_options = 1;
 static int unpack_limit = 100;
 static int report_status;
 static int use_sideband;
 static int use_atomic;
+static int use_push_options;
 static int quiet;
 static int prefer_ofs_delta = 1;
 static int auto_update_server_info;
@@ -193,6 +195,11 @@ static int receive_pack_config(const char *var, const char *value, void *cb)
 		return 0;
 	}
 
+	if (strcmp(var, "receive.advertisepushoptions") == 0) {
+		advertise_push_options = git_config_bool(var, value);
+		return 0;
+	}
+
 	return git_default_config(var, value, cb);
 }
 
@@ -207,6 +214,8 @@ static void show_ref(const char *path, const unsigned char *sha1)
 			      "report-status delete-refs side-band-64k quiet");
 		if (advertise_atomic_push)
 			strbuf_addstr(&cap, " atomic");
+		if (advertise_push_options)
+			strbuf_addstr(&cap, " push-options");
 		if (prefer_ofs_delta)
 			strbuf_addstr(&cap, " ofs-delta");
 		if (push_cert_nonce)
@@ -1448,6 +1457,9 @@ static struct command *read_head_info(struct sha1_array *shallow)
 			if (advertise_atomic_push
 			    && parse_feature_request(feature_list, "atomic"))
 				use_atomic = 1;
+			if (advertise_push_options
+			    && parse_feature_request(feature_list, "push-options"))
+				use_push_options = 1;
 		}
 
 		if (!strcmp(line, "push-cert")) {
@@ -1480,6 +1492,35 @@ static struct command *read_head_info(struct sha1_array *shallow)
 	return commands;
 }
 
+static const char *stream_push_options_to_file()
+{
+	static const char *fname = "push-options-XXXXXX";
+	char *ret = xmallocz(strlen(fname));
+	int fd;
+	memcpy(ret, fname, strlen(fname));
+	fd = mkstemp(ret);
+
+	for (;;) {
+		char *line;
+		int len;
+
+		line = packet_read_line(0, &len);
+
+		if (!line)
+			break;
+
+		if (write_in_full(fd, line, len) < 0 ||
+		    write_in_full(fd, "\n", 1) < 0)
+			goto fail;
+	}
+
+	return ret;
+fail:
+	close(fd);
+	free(ret);
+	return NULL;
+}
+
 static const char *parse_pack_header(struct pack_header *hdr)
 {
 	switch (read_pack_header(0, hdr)) {
@@ -1767,6 +1808,9 @@ int cmd_receive_pack(int argc, const char **argv, const char *prefix)
 		const char *unpack_status = NULL;
 		const char *push_options_file = NULL;
 
+		if (use_push_options)
+			push_options_file = stream_push_options_to_file();
+
 		prepare_shallow_info(&si, &shallow);
 		if (!si.nr_ours && !si.nr_theirs)
 			shallow_update = 0;
-- 
2.9.0.141.gdd65b60


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 3/4] push: accept push options
  2016-06-30  0:59 [RFC PATCHv1 0/4] Push options in C Git Stefan Beller
  2016-06-30  0:59 ` [PATCH 1/4] push options: {pre,post}-receive hook learns about push options Stefan Beller
  2016-06-30  0:59 ` [PATCH 2/4] receive-pack: implement advertising and receiving " Stefan Beller
@ 2016-06-30  0:59 ` Stefan Beller
  2016-06-30  0:59 ` [PATCH 4/4] add a test for " Stefan Beller
  2016-07-01  7:09 ` [RFC PATCHv1 0/4] Push options in C Git Jeff King
  4 siblings, 0 replies; 52+ messages in thread
From: Stefan Beller @ 2016-06-30  0:59 UTC (permalink / raw)
  To: git, dwwang; +Cc: Stefan Beller

This implements everything that is required on the client side to make use
of push options from the porcelain push command.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 Documentation/git-push.txt |  8 +++++++-
 builtin/push.c             | 16 +++++++++++++---
 send-pack.c                | 29 +++++++++++++++++++++++++++++
 send-pack.h                |  3 +++
 transport.c                |  2 ++
 transport.h                |  7 +++++++
 6 files changed, 61 insertions(+), 4 deletions(-)

diff --git a/Documentation/git-push.txt b/Documentation/git-push.txt
index 19f46b6..b0b1273 100644
--- a/Documentation/git-push.txt
+++ b/Documentation/git-push.txt
@@ -11,7 +11,7 @@ SYNOPSIS
 [verse]
 'git push' [--all | --mirror | --tags] [--follow-tags] [--atomic] [-n | --dry-run] [--receive-pack=<git-receive-pack>]
 	   [--repo=<repository>] [-f | --force] [-d | --delete] [--prune] [-v | --verbose]
-	   [-u | --set-upstream]
+	   [-u | --set-upstream] [--push-option=<string>]
 	   [--[no-]signed|--sign=(true|false|if-asked)]
 	   [--force-with-lease[=<refname>[:<expect>]]]
 	   [--no-verify] [<repository> [<refspec>...]]
@@ -156,6 +156,12 @@ already exists on the remote side.
 	Either all refs are updated, or on error, no refs are updated.
 	If the server does not support atomic pushes the push will fail.
 
+-L::
+--push-option::
+	Transmit the given string to the server, which passes them to
+	the pre-receive as well as the post-receive hook. Only C strings
+	containing no new lines are allowed.
+
 --receive-pack=<git-receive-pack>::
 --exec=<git-receive-pack>::
 	Path to the 'git-receive-pack' program on the remote
diff --git a/builtin/push.c b/builtin/push.c
index 4e9e4db..418f786 100644
--- a/builtin/push.c
+++ b/builtin/push.c
@@ -353,7 +353,8 @@ static int push_with_options(struct transport *transport, int flags)
 	return 1;
 }
 
-static int do_push(const char *repo, int flags)
+static int do_push(const char *repo, int flags,
+		   const struct string_list *push_options)
 {
 	int i, errs;
 	struct remote *remote = pushremote_get(repo);
@@ -376,6 +377,9 @@ static int do_push(const char *repo, int flags)
 	if (remote->mirror)
 		flags |= (TRANSPORT_PUSH_MIRROR|TRANSPORT_PUSH_FORCE);
 
+	if (push_options->nr)
+		flags |= TRANSPORT_PUSH_OPTIONS;
+
 	if ((flags & TRANSPORT_PUSH_ALL) && refspec) {
 		if (!strcmp(*refspec, "refs/tags/*"))
 			return error(_("--all and --tags are incompatible"));
@@ -406,13 +410,16 @@ static int do_push(const char *repo, int flags)
 		for (i = 0; i < url_nr; i++) {
 			struct transport *transport =
 				transport_get(remote, url[i]);
+			if (flags & TRANSPORT_PUSH_OPTIONS)
+				transport->push_options = push_options;
 			if (push_with_options(transport, flags))
 				errs++;
 		}
 	} else {
 		struct transport *transport =
 			transport_get(remote, NULL);
-
+		if (flags & TRANSPORT_PUSH_OPTIONS)
+			transport->push_options = push_options;
 		if (push_with_options(transport, flags))
 			errs++;
 	}
@@ -500,6 +507,8 @@ int cmd_push(int argc, const char **argv, const char *prefix)
 	int push_cert = -1;
 	int rc;
 	const char *repo = NULL;	/* default repository */
+	static struct string_list push_options = STRING_LIST_INIT_DUP;
+
 	struct option options[] = {
 		OPT__VERBOSITY(&verbosity),
 		OPT_STRING( 0 , "repo", &repo, N_("repository"), N_("repository")),
@@ -533,6 +542,7 @@ int cmd_push(int argc, const char **argv, const char *prefix)
 		  0, "signed", &push_cert, "yes|no|if-asked", N_("GPG sign the push"),
 		  PARSE_OPT_OPTARG, option_parse_push_signed },
 		OPT_BIT(0, "atomic", &flags, N_("request atomic transaction on remote side"), TRANSPORT_PUSH_ATOMIC),
+		OPT_STRING_LIST('L', "push-option", &push_options, N_("server-specific"), N_("options to transmit")),
 		OPT_SET_INT('4', "ipv4", &family, N_("use IPv4 addresses only"),
 				TRANSPORT_FAMILY_IPV4),
 		OPT_SET_INT('6', "ipv6", &family, N_("use IPv6 addresses only"),
@@ -563,7 +573,7 @@ int cmd_push(int argc, const char **argv, const char *prefix)
 		set_refspecs(argv + 1, argc - 1, repo);
 	}
 
-	rc = do_push(repo, flags);
+	rc = do_push(repo, flags, &push_options);
 	if (rc == -1)
 		usage_with_options(push_usage, options);
 	else
diff --git a/send-pack.c b/send-pack.c
index 37ee04e..17c30a1 100644
--- a/send-pack.c
+++ b/send-pack.c
@@ -261,6 +261,7 @@ static int generate_push_cert(struct strbuf *req_buf,
 			      const char *push_cert_nonce)
 {
 	const struct ref *ref;
+	struct string_list_item *item;
 	char *signing_key = xstrdup(get_signing_key());
 	const char *cp, *np;
 	struct strbuf cert = STRBUF_INIT;
@@ -279,6 +280,12 @@ static int generate_push_cert(struct strbuf *req_buf,
 		strbuf_addf(&cert, "nonce %s\n", push_cert_nonce);
 	strbuf_addstr(&cert, "\n");
 
+	if (args->push_options) {
+		for_each_string_list_item(item, args->push_options)
+			strbuf_addf(&cert, "push-option %s\n", item->string);
+		strbuf_addstr(&cert, "\n");
+	}
+
 	for (ref = remote_refs; ref; ref = ref->next) {
 		if (check_to_send_update(ref, args) < 0)
 			continue;
@@ -371,6 +378,8 @@ int send_pack(struct send_pack_args *args,
 	int agent_supported = 0;
 	int use_atomic = 0;
 	int atomic_supported = 0;
+	int use_push_options = 0;
+	int push_options_supported = 0;
 	unsigned cmds_sent = 0;
 	int ret;
 	struct async demux;
@@ -393,6 +402,8 @@ int send_pack(struct send_pack_args *args,
 		args->use_thin_pack = 0;
 	if (server_supports("atomic"))
 		atomic_supported = 1;
+	if (server_supports("push-options"))
+		push_options_supported = 1;
 
 	if (args->push_cert != SEND_PACK_PUSH_CERT_NEVER) {
 		int len;
@@ -419,6 +430,11 @@ int send_pack(struct send_pack_args *args,
 
 	use_atomic = atomic_supported && args->atomic;
 
+	if (args->push_options && !push_options_supported)
+		die(_("the receiving end does not support push options"));
+
+	use_push_options = push_options_supported && args->push_options;
+
 	if (status_report)
 		strbuf_addstr(&cap_buf, " report-status");
 	if (use_sideband)
@@ -427,6 +443,8 @@ int send_pack(struct send_pack_args *args,
 		strbuf_addstr(&cap_buf, " quiet");
 	if (use_atomic)
 		strbuf_addstr(&cap_buf, " atomic");
+	if (use_push_options)
+		strbuf_addstr(&cap_buf, " push-options");
 	if (agent_supported)
 		strbuf_addf(&cap_buf, " agent=%s", git_user_agent_sanitized());
 
@@ -513,6 +531,17 @@ int send_pack(struct send_pack_args *args,
 	strbuf_release(&req_buf);
 	strbuf_release(&cap_buf);
 
+	if (use_push_options) {
+		struct string_list_item *item;
+		struct strbuf sb = STRBUF_INIT;
+
+		for_each_string_list_item(item, args->push_options)
+			packet_buf_write(&sb, "%s", item->string);
+			write_or_die(out, sb.buf, sb.len);
+		packet_flush(out);
+		strbuf_release(&sb);
+	}
+
 	if (use_sideband && cmds_sent) {
 		memset(&demux, 0, sizeof(demux));
 		demux.proc = sideband_demux;
diff --git a/send-pack.h b/send-pack.h
index 57f222a..67fc40f 100644
--- a/send-pack.h
+++ b/send-pack.h
@@ -1,6 +1,8 @@
 #ifndef SEND_PACK_H
 #define SEND_PACK_H
 
+#include "string-list.h"
+
 /* Possible values for push_cert field in send_pack_args. */
 #define SEND_PACK_PUSH_CERT_NEVER 0
 #define SEND_PACK_PUSH_CERT_IF_ASKED 1
@@ -21,6 +23,7 @@ struct send_pack_args {
 		push_cert:2,
 		stateless_rpc:1,
 		atomic:1;
+	const struct string_list *push_options;
 };
 
 struct option;
diff --git a/transport.c b/transport.c
index 095e61f..598bd1f 100644
--- a/transport.c
+++ b/transport.c
@@ -510,6 +510,7 @@ static int git_transport_push(struct transport *transport, struct ref *remote_re
 	args.dry_run = !!(flags & TRANSPORT_PUSH_DRY_RUN);
 	args.porcelain = !!(flags & TRANSPORT_PUSH_PORCELAIN);
 	args.atomic = !!(flags & TRANSPORT_PUSH_ATOMIC);
+	args.push_options = transport->push_options;
 	args.url = transport->url;
 
 	if (flags & TRANSPORT_PUSH_CERT_ALWAYS)
@@ -640,6 +641,7 @@ struct transport *transport_get(struct remote *remote, const char *url)
 	struct transport *ret = xcalloc(1, sizeof(*ret));
 
 	ret->progress = isatty(2);
+	ret->push_options = NULL;
 
 	if (!remote)
 		die("No remote provided to transport_get()");
diff --git a/transport.h b/transport.h
index c681408..6fe3485 100644
--- a/transport.h
+++ b/transport.h
@@ -48,6 +48,12 @@ struct transport {
 	 */
 	unsigned cloning : 1;
 
+	/*
+	 * These strings will be passed to the {pre, post}-receive hook,
+	 * on the remote side, if both sides support the push options capability.
+	 */
+	const struct string_list *push_options;
+
 	/**
 	 * Returns 0 if successful, positive if the option is not
 	 * recognized or is inapplicable, and negative if the option
@@ -134,6 +140,7 @@ struct transport {
 #define TRANSPORT_PUSH_CERT_ALWAYS 2048
 #define TRANSPORT_PUSH_CERT_IF_ASKED 4096
 #define TRANSPORT_PUSH_ATOMIC 8192
+#define TRANSPORT_PUSH_OPTIONS 16384
 
 #define TRANSPORT_SUMMARY_WIDTH (2 * DEFAULT_ABBREV + 3)
 #define TRANSPORT_SUMMARY(x) (int)(TRANSPORT_SUMMARY_WIDTH + strlen(x) - gettext_width(x)), (x)
-- 
2.9.0.141.gdd65b60


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* [PATCH 4/4] add a test for push options
  2016-06-30  0:59 [RFC PATCHv1 0/4] Push options in C Git Stefan Beller
                   ` (2 preceding siblings ...)
  2016-06-30  0:59 ` [PATCH 3/4] push: accept " Stefan Beller
@ 2016-06-30  0:59 ` Stefan Beller
  2016-07-01  7:09 ` [RFC PATCHv1 0/4] Push options in C Git Jeff King
  4 siblings, 0 replies; 52+ messages in thread
From: Stefan Beller @ 2016-06-30  0:59 UTC (permalink / raw)
  To: git, dwwang; +Cc: Stefan Beller

The functions `mk_repo_pair` as well as `test_refs` are borrowed from
t5543-atomic-push, with additional hooks installed.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 t/t5544-push-options.sh | 85 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 85 insertions(+)
 create mode 100755 t/t5544-push-options.sh

diff --git a/t/t5544-push-options.sh b/t/t5544-push-options.sh
new file mode 100755
index 0000000..49d5f80
--- /dev/null
+++ b/t/t5544-push-options.sh
@@ -0,0 +1,85 @@
+#!/bin/sh
+
+test_description='pushing to a repository using push options'
+
+. ./test-lib.sh
+
+mk_repo_pair () {
+	rm -rf workbench upstream &&
+	test_create_repo upstream &&
+	test_create_repo workbench &&
+	(
+		cd upstream &&
+		git config receive.denyCurrentBranch warn &&
+		mkdir -p .git/hooks &&
+		cat >.git/hooks/pre-receive <<-'EOF' &&
+		#!/bin/sh
+		cat $GIT_PUSH_OPTION_FILE >hooks/pre-receive.push_options
+		EOF
+		chmod u+x .git/hooks/pre-receive
+
+		cat >.git/hooks/post-receive <<-'EOF' &&
+		#!/bin/sh
+		cat $GIT_PUSH_OPTION_FILE >hooks/post-receive.push_options
+		EOF
+		chmod u+x .git/hooks/post-receive
+	) &&
+	(
+		cd workbench &&
+		git remote add up ../upstream
+	)
+}
+
+# Compare the ref ($1) in upstream with a ref value from workbench ($2)
+# i.e. test_refs second HEAD@{2}
+test_refs () {
+	test $# = 2 &&
+	git -C upstream rev-parse --verify "$1" >expect &&
+	git -C workbench rev-parse --verify "$2" >actual &&
+	test_cmp expect actual
+}
+
+test_expect_success 'one push option works for a single branch' '
+	mk_repo_pair &&
+	(
+		cd workbench &&
+		test_commit one &&
+		git push --mirror up &&
+		test_commit two &&
+		git push --push-option=asdf up master
+	) &&
+	test_refs master master &&
+	echo "asdf" >expect &&
+	test_cmp expect upstream/.git/hooks/pre-receive.push_options &&
+	test_cmp expect upstream/.git/hooks/post-receive.push_options
+'
+
+test_expect_success 'push option denied by remote' '
+	mk_repo_pair &&
+	git -C upstream config receive.advertisePushOptions false &&
+	(
+		cd workbench &&
+		test_commit one &&
+		git push --mirror up &&
+		test_commit two &&
+		test_must_fail git push --push-option=asdf up master
+	) &&
+	test_refs master HEAD@{1}
+'
+
+test_expect_success 'two push options work' '
+	mk_repo_pair &&
+	(
+		cd workbench &&
+		test_commit one &&
+		git push --mirror up &&
+		test_commit two &&
+		git push --push-option=asdf --push-option="more structured text" up master
+	) &&
+	test_refs master master &&
+	printf "asdf\nmore structured text\n" >expect &&
+	test_cmp expect upstream/.git/hooks/pre-receive.push_options &&
+	test_cmp expect upstream/.git/hooks/post-receive.push_options
+'
+
+test_done
-- 
2.9.0.141.gdd65b60


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* Re: [RFC PATCHv1 0/4] Push options in C Git
  2016-06-30  0:59 [RFC PATCHv1 0/4] Push options in C Git Stefan Beller
                   ` (3 preceding siblings ...)
  2016-06-30  0:59 ` [PATCH 4/4] add a test for " Stefan Beller
@ 2016-07-01  7:09 ` Jeff King
  2016-07-01 17:07   ` Stefan Beller
  4 siblings, 1 reply; 52+ messages in thread
From: Jeff King @ 2016-07-01  7:09 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Dennis Kaarsemaker, git, dwwang

On Wed, Jun 29, 2016 at 05:59:47PM -0700, Stefan Beller wrote:

> Allow a user to pass information along a push to the pre/post-receive hook
> on the remote.

Seems like a reasonable thing to want, and related to:

  http://thread.gmane.org/gmane.comp.version-control.git/285124

which was specifically interested in making "--quiet" and "--force"
available, but stopped short of arbitrary data. (It looks like that
topic got overlooked, and Dennis never prodded again).

> When using a remote that is more than just a plain Git host (e.g. Gerrit,
> Git{hub/lab}, etc) this may become more obvious: The (server backend specific)
> push options can instruct the server to:
> * open a pull request

I know this is just a "for example", but I don't know if we would ever
support opening a pull request directly via push, if only because a good
pull request has actual parameters that the user needs to fill out (like
the cover letter message).

So you'd probably want some client tool to help the user figure out what
to put in the PR, and of course that already exists, because GitHub has
an HTTP API that it can talk to. Shoe-horning it into Git doesn't buy
much.

I'm not sure if your other examples would be better served by just using
an HTTP API or not. I think Gerrit is funny in this regard because it
eschews branches entirely. E.g., in a GitHub PR, you push to branch
"foo", and then you open a PR using "foo" as the source. But with
Gerrit, you push to the magic refs/for/master, and you have no real way
to cross-reference that submission later.

Whereas in Dennis's patches, it was about specific information directly
related to the act of pushing.

> * When pushing to Gerrit you can already attach some of this information by
>   adding a '%' followed by the parameter to the ref, i.e. when interacting with
>   Gerrit it is possible to do things like[1]:
>     
>     git push origin HEAD:refs/for/master%draft%topic=example%cc=jon.doe@example.org
>   
>   This is not appealing to our users as it looks like hacks upon hacks to make
>   it work. It would read better if it was spelled as:

Heh. It _is_ hacks upon hacks, isn't it? ;)

-Peff

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 1/4] push options: {pre,post}-receive hook learns about push options
  2016-06-30  0:59 ` [PATCH 1/4] push options: {pre,post}-receive hook learns about push options Stefan Beller
@ 2016-07-01  7:14   ` Jeff King
  2016-07-01 17:20     ` Stefan Beller
  0 siblings, 1 reply; 52+ messages in thread
From: Jeff King @ 2016-07-01  7:14 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, dwwang

On Wed, Jun 29, 2016 at 05:59:48PM -0700, Stefan Beller wrote:

> The environment variable GIT_PUSH_OPTION_FILE is set to the push options
> separated by new line.
> 
> The code is not executed as the push options are set to NULL, nor is the
> new capability advertised.
> 
> The rationale for keeping the actual options inside a file instead of
> putting them directly into an environment variable has multiple reasons:

Thanks for including this rationale; my first thought on seeing the
patch was "wouldn't this be much more convenient for a hook if each
value had its own environment variable?".

Putting the data into a file does alleviate some issues. But it also
makes parsing annoying, and opens up new issues (like what happens
to content that has a newline in it?). Wouldn't multiple files be a bit
more convenient? Especially for your example of eventually carrying
larger binary content like images.

-Peff

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [RFC PATCHv1 0/4] Push options in C Git
  2016-07-01  7:09 ` [RFC PATCHv1 0/4] Push options in C Git Jeff King
@ 2016-07-01 17:07   ` Stefan Beller
  2016-07-01 17:55     ` Jeff King
  0 siblings, 1 reply; 52+ messages in thread
From: Stefan Beller @ 2016-07-01 17:07 UTC (permalink / raw)
  To: Jeff King; +Cc: Dennis Kaarsemaker, git, Dan Wang

On Fri, Jul 1, 2016 at 12:09 AM, Jeff King <peff@peff.net> wrote:
> On Wed, Jun 29, 2016 at 05:59:47PM -0700, Stefan Beller wrote:
>
>> Allow a user to pass information along a push to the pre/post-receive hook
>> on the remote.
>
> Seems like a reasonable thing to want, and related to:
>
>   http://thread.gmane.org/gmane.comp.version-control.git/285124
>
> which was specifically interested in making "--quiet" and "--force"
> available, but stopped short of arbitrary data. (It looks like that
> topic got overlooked, and Dennis never prodded again).

Doh! I should have searched for such a topic. :/
I looked at the patches and they differ conceptually in what goes over the wire.

In Dennis series that is a hook_options=<string> in the capability
selection line,
that is after the first update line, i.e. the hook options are
restricted to not contain
white spaces and be of length < ~60k characters IIUC (reserve about 4k for the
update, ref and other selected capabilities)

In this proposal we only indicate that we want to transmit options in the
negotiation phase, and then after the update transmission is done, you can
have as many packets as you want excluding flush packets (the flush packet
indicates we're done transmitting these push/hook options)

So I might be biased by my code, but I'd claim that is more future proof.

>
>> When using a remote that is more than just a plain Git host (e.g. Gerrit,
>> Git{hub/lab}, etc) this may become more obvious: The (server backend specific)
>> push options can instruct the server to:
>> * open a pull request
>
> I know this is just a "for example", but I don't know if we would ever
> support opening a pull request directly via push, if only because a good
> pull request has actual parameters that the user needs to fill out (like
> the cover letter message).

Sure, just an example! You could have

    "git push --read-push-option-from-file=coverletter.html".

as this really only disallows flush packets (i.e. anything goes,
including images)
and the server must be smart enough to interpret the file.

>
> So you'd probably want some client tool to help the user figure out what
> to put in the PR, and of course that already exists, because GitHub has
> an HTTP API that it can talk to. Shoe-horning it into Git doesn't buy
> much.

Let's say the example was bad. the reality is that Dennis and I implemented
a needed feature independently of each other and you have the
out-of-the-box-Github-does-HTTP-too solution. So I think we'd want
*something* in Git to have information in the receiving hooks available?

It seems to me there are many other Git-wrappers, such as the repo tool
or git-review[2], and most of them start with the premise: "Git doesn't support
X yet, so while they take their time to figure out how to do it
properly upstream,
we'll just use this hacky script" and then eventually you end up with an xml
parsing beast with thousands lines of python code (The story of the repo
tool and submodules).

[1] https://code.google.com/p/git-repo/
[2] http://linux.die.net/man/1/git-review

>
> I'm not sure if your other examples would be better served by just using
> an HTTP API or not.

The decision for the HTTP API would be the decision for a wrapper tool.
And all these wrapper tools differ, so it is not easy to switch from one to
another, on the other hand, if you know Git well enough you can easily adapt
to other work flows as it is not a new tool, but just a new command line option
or so?

> I think Gerrit is funny in this regard because it
> eschews branches entirely. E.g., in a GitHub PR, you push to branch
> "foo", and then you open a PR using "foo" as the source.

Once upon a time you could also open a pull request using the sha1?
(I did that once and then was asked to do some fixes before pulling and
I had to abandon and reopen a proper branch PR)

> But with
> Gerrit, you push to the magic refs/for/master, and you have no real way
> to cross-reference that submission later.

What do you mean by cross reference?

I could ask you to look at https://gerrit-review.googlesource.com/#/c/79354/
and in the upper right corner you'll find a download button that let's you copy
"git fetch <URL> refs/changes/54/79354/3 && git checkout FETCH_HEAD".
That allows you to inspect that thing alone later (or any of the
revisions as the
structure is  refs/changes/<2 digit sharding>/<change id>/<revision count>
"revision" translates to the concept of resending a patch series v2,v3 here)

Alternatively I could ask you to look for a "Change-Id" that is a hash in the
footer of a commit, so you can find it in the Git-history or on the review site.

Gerrits flaws are off topic to this series though (maybe?)

>
> Whereas in Dennis's patches, it was about specific information directly
> related to the act of pushing.

But does it cover all the informations related to pushing? like

  "I am a bot, down-prioritize me, please"
  "I am a bot, therefore I do not care about the internal replication
on the server"

The last one is interesting as it is very specific to our servers. So you could
argue we'd want to roll our own fork of Git that also covers such a hook option,
but I think it is easier for both the Git community as well as us here to allow
Git transmit free form text and the server can decide how to act upon it?

>
>> * When pushing to Gerrit you can already attach some of this information by
>>   adding a '%' followed by the parameter to the ref, i.e. when interacting with
>>   Gerrit it is possible to do things like[1]:
>>
>>     git push origin HEAD:refs/for/master%draft%topic=example%cc=jon.doe@example.org
>>
>>   This is not appealing to our users as it looks like hacks upon hacks to make
>>   it work. It would read better if it was spelled as:
>
> Heh. It _is_ hacks upon hacks, isn't it? ;)

I do not disagree. Now is the time to do it right, but we like the
convenience of having just
one tool (Git) that can do it all for us. We don't want to write yet
another wrapper script that
does the HTTP API thing. ;)


I might have missed the point of this email completely as I seem to
try to defend the
patch series (and Gerrit a bit). So do you think the functionality of
this series is overkill
and we'd rather go with Dennis series?

I just think that is not future proof on the wire as it restricts what
you can put on the wire
as well as not future proof as it specifies what you options you can
select over the wire.

Gits spirit (Oh no, not this discussion!) is to allow a broad range of
uses and work flows.
It doesn't hinder you from shooting into your feet if that's what
you're into, and
that is what this series does precisely.

>
> -Peff

Thanks,
Stefan

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 2/4] receive-pack: implement advertising and receiving push options
  2016-06-30  0:59 ` [PATCH 2/4] receive-pack: implement advertising and receiving " Stefan Beller
@ 2016-07-01 17:11   ` Junio C Hamano
  2016-07-01 17:24     ` Stefan Beller
  0 siblings, 1 reply; 52+ messages in thread
From: Junio C Hamano @ 2016-07-01 17:11 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, dwwang

Stefan Beller <sbeller@google.com> writes:

> +static const char *stream_push_options_to_file()
> +{
> +	static const char *fname = "push-options-XXXXXX";
> +	char *ret = xmallocz(strlen(fname));
> +	int fd;
> +	memcpy(ret, fname, strlen(fname));
> +	fd = mkstemp(ret);

Probably

	char *ret;
	int fd;

	ret = xstrdup("push-options-XXXXXX");
	fd = xmkstemp(ret);

or use mkstemp but check the return value and "goto fail".

> +	for (;;) {
> +		char *line;
> +		int len;
> +
> +		line = packet_read_line(0, &len);
> +
> +		if (!line)
> +			break;
> +
> +		if (write_in_full(fd, line, len) < 0 ||
> +		    write_in_full(fd, "\n", 1) < 0)
> +			goto fail;
> +	}
> +
> +	return ret;
> +fail:
> +	close(fd);
> +	free(ret);
> +	return NULL;
> +}
> +
>  static const char *parse_pack_header(struct pack_header *hdr)
>  {
>  	switch (read_pack_header(0, hdr)) {
> @@ -1767,6 +1808,9 @@ int cmd_receive_pack(int argc, const char **argv, const char *prefix)
>  		const char *unpack_status = NULL;
>  		const char *push_options_file = NULL;
>  
> +		if (use_push_options)
> +			push_options_file = stream_push_options_to_file();
> +
>  		prepare_shallow_info(&si, &shallow);
>  		if (!si.nr_ours && !si.nr_theirs)
>  			shallow_update = 0;

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 1/4] push options: {pre,post}-receive hook learns about push options
  2016-07-01  7:14   ` Jeff King
@ 2016-07-01 17:20     ` Stefan Beller
  2016-07-01 17:59       ` Jeff King
  0 siblings, 1 reply; 52+ messages in thread
From: Stefan Beller @ 2016-07-01 17:20 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Dan Wang

On Fri, Jul 1, 2016 at 12:14 AM, Jeff King <peff@peff.net> wrote:
>
> Putting the data into a file does alleviate some issues. But it also
> makes parsing annoying, and opens up new issues (like what happens
> to content that has a newline in it?). Wouldn't multiple files be a bit
> more convenient? Especially for your example of eventually carrying
> larger binary content like images.

Yeah I did not think about having multiple files. I think that is the
way to go now.
In the environment variable you'd have the paths, such that

process_hook_option_from_file()
{
 ... # do a thing with $1
}

for p in $GIT_HOOK_OPTION_FILES ;
do
  process_hook_option_from_file $p
done

is a viable hook.


>> The rationale for keeping the actual options inside a file instead of
>> putting them directly into an environment variable has multiple reasons:
>
> Thanks for including this rationale; my first thought on seeing the
> patch was "wouldn't this be much more convenient for a hook if each
> value had its own environment variable?".

That's what I thought as well. Office discussion ensued and I am still
offended by this solution, but it sucks less than multiple environment variables
(I tried writing a script to construct and evaluate the environment
variables and
that doesn't look nice)

If we did not have a GIT_PUSH_OPTIONS_COUNT and GIT_PUSH_OPTION_<N>
but rather GIT_PUSH_OPTIONS_VARIABLES that contains the other variables,
it may be easier to handle, but whether you read from a file or evaluate the
environment variable is only a minor step, the indirection is there anyway
and this would be very close to what we have above.

Thanks,
Stefan

>
> -Peff

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 2/4] receive-pack: implement advertising and receiving push options
  2016-07-01 17:11   ` Junio C Hamano
@ 2016-07-01 17:24     ` Stefan Beller
  0 siblings, 0 replies; 52+ messages in thread
From: Stefan Beller @ 2016-07-01 17:24 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Dan Wang

On Fri, Jul 1, 2016 at 10:11 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Stefan Beller <sbeller@google.com> writes:
>
>> +static const char *stream_push_options_to_file()
>> +{
>> +     static const char *fname = "push-options-XXXXXX";
>> +     char *ret = xmallocz(strlen(fname));
>> +     int fd;
>> +     memcpy(ret, fname, strlen(fname));
>> +     fd = mkstemp(ret);
>
> Probably
>
>         char *ret;
>         int fd;
>
>         ret = xstrdup("push-options-XXXXXX");
>         fd = xmkstemp(ret);
>

That is way better to read!
Also I did not know we also have a wrapper for mkstemp;
Maybe I should check for any new syscall if we wrap that.

> or use mkstemp but check the return value and "goto fail".
>
>> +     for (;;) {
>> +             char *line;
>> +             int len;
>> +
>> +             line = packet_read_line(0, &len);
>> +
>> +             if (!line)
>> +                     break;
>> +
>> +             if (write_in_full(fd, line, len) < 0 ||
>> +                 write_in_full(fd, "\n", 1) < 0)
>> +                     goto fail;
>> +     }
>> +
>> +     return ret;
>> +fail:
>> +     close(fd);
>> +     free(ret);
>> +     return NULL;
>> +}
>> +
>>  static const char *parse_pack_header(struct pack_header *hdr)
>>  {
>>       switch (read_pack_header(0, hdr)) {
>> @@ -1767,6 +1808,9 @@ int cmd_receive_pack(int argc, const char **argv, const char *prefix)
>>               const char *unpack_status = NULL;
>>               const char *push_options_file = NULL;
>>
>> +             if (use_push_options)
>> +                     push_options_file = stream_push_options_to_file();
>> +
>>               prepare_shallow_info(&si, &shallow);
>>               if (!si.nr_ours && !si.nr_theirs)
>>                       shallow_update = 0;

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [RFC PATCHv1 0/4] Push options in C Git
  2016-07-01 17:07   ` Stefan Beller
@ 2016-07-01 17:55     ` Jeff King
  2016-07-01 18:25       ` Stefan Beller
  0 siblings, 1 reply; 52+ messages in thread
From: Jeff King @ 2016-07-01 17:55 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Dennis Kaarsemaker, git, Dan Wang

On Fri, Jul 01, 2016 at 10:07:10AM -0700, Stefan Beller wrote:

> > So you'd probably want some client tool to help the user figure out what
> > to put in the PR, and of course that already exists, because GitHub has
> > an HTTP API that it can talk to. Shoe-horning it into Git doesn't buy
> > much.
> 
> Let's say the example was bad. the reality is that Dennis and I implemented
> a needed feature independently of each other and you have the
> out-of-the-box-Github-does-HTTP-too solution. So I think we'd want
> *something* in Git to have information in the receiving hooks available?

Sure, _if_ it's information that the receiving hooks want to have.

I guess what I'm questioning is exactly what receiving hooks do need to
have. Things like "the user said --force" seem obviously related to the
push. Things like auto-creating a PR seem less about the push itself,
and more about something you can do with the result of the push.

So a litmus test for me would be something like:

  Can you get the same effect with:

    git push $remote my-branch &&
    curl https://$remote/api/do-the-thing?q=my-branch

  as you would with:

    git push --push-data=do-the-thing $remote mybranch

If so, then it seems like inserting the data through the git connection
is not really adding functionality, but it _is_ adding complexity to
git (that other tools are already handling with much more flexibility).

We have a similar rule of thumb for hooks; we don't add a hook if
there's no reason you couldn't just run the hook content immediately
before or after the command.

> It seems to me there are many other Git-wrappers, such as the repo tool
> or git-review[2], and most of them start with the premise: "Git doesn't support
> X yet, so while they take their time to figure out how to do it
> properly upstream,
> we'll just use this hacky script" and then eventually you end up with an xml
> parsing beast with thousands lines of python code (The story of the repo
> tool and submodules).

But it's exactly this complexity that makes me worried about moving
things into git. This patch addresses only the transport problem. If
your problem is that something like a review system built on git
involves lot of complicated data, now you have two problems: you still
have the complicated data, and now you have to shoe-horn it into Git's
ad-hoc protocol.

I'm also not convinced that it alleviates the need for the tool to have
a separate API. Imagine you can send this data while pushing now. How do
you edit it later? How do you retrieve it programmatically? Etc.

> > I think Gerrit is funny in this regard because it
> > eschews branches entirely. E.g., in a GitHub PR, you push to branch
> > "foo", and then you open a PR using "foo" as the source.
> 
> Once upon a time you could also open a pull request using the sha1?
> (I did that once and then was asked to do some fixes before pulling and
> I had to abandon and reopen a proper branch PR)
> 
> > But with
> > Gerrit, you push to the magic refs/for/master, and you have no real way
> > to cross-reference that submission later.
> 
> What do you mean by cross reference?

I just meant that in my GitHub example, the branch name you chose
("foo") becomes the key with which you can correlate the two actions:
pushing the data via Git, and later opening a PR (via the web page, or
the API, or whatever).

With Gerrit, how do you refer to the commits you just pushed, when
making a separate action?  My impression is that Gerrit assigns
change-ids centrally when you push to refs/for/master; how do you figure
out after the push what the change-id is of the thing you just pushed?
I can guess maybe it sends something to stderr that the user then reads,
but that is purely a guess. I've never really used Gerrit.

You're right that one could use sha1s for this purpose.

> Gerrits flaws are off topic to this series though (maybe?)

I'm far from qualified to critique Gerrit. I was mostly just observing
that your view of the problem space (and thus the solution) is colored
by Gerrit's way of doing things. That doesn't make it a wrong solution,
but it's worth seeing whether other systems would get the same benefit
(just because part of the "is it worth it" equation is how widely used
the feature could be).

> > Whereas in Dennis's patches, it was about specific information directly
> > related to the act of pushing.
> 
> But does it cover all the informations related to pushing? like
> 
>   "I am a bot, down-prioritize me, please"
>   "I am a bot, therefore I do not care about the internal replication
> on the server"
> 
> The last one is interesting as it is very specific to our servers. So you could
> argue we'd want to roll our own fork of Git that also covers such a hook option,
> but I think it is easier for both the Git community as well as us here to allow
> Git transmit free form text and the server can decide how to act upon it?

No, I don't think it covers all information. I like the idea of being
able to pass free-form key/value pairs to hooks, because by definition
we don't know what the hooks are doing, or what the user might want to
communicate to them.

> I might have missed the point of this email completely as I seem to
> try to defend the
> patch series (and Gerrit a bit). So do you think the functionality of
> this series is overkill
> and we'd rather go with Dennis series?

Oh, I'm supposed to have a point to my emails? :)

I was mostly just musing. I do see the general idea as a useful thing;
there really is data that is related to the push itself, and not other
actions you may want to perform on the push.

I'm not necessarily sure that we need a complicated data model (like
multiple values of the same key), or the ability to handle arbitrarily
large binary data. Those don't necessarily complicate the wire protocol
(I like the idea of adding a new phase of key/values, followed by a
flush), but they do complicate the server implementation (how do we
communicate them to the hooks? Should there be limits for safety and DoS
protection on the server?)

> Gits spirit (Oh no, not this discussion!) is to allow a broad range of
> uses and work flows.  It doesn't hinder you from shooting into your
> feet if that's what you're into, and that is what this series does
> precisely.

I think Git's spirit is also to fit as a tool into the greater
ecosystem, and not duplicate functionality that is done elsewhere. So
for example there is no authentication in the git:// protocol; that job
is done by ssh.

-Peff

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 1/4] push options: {pre,post}-receive hook learns about push options
  2016-07-01 17:20     ` Stefan Beller
@ 2016-07-01 17:59       ` Jeff King
  2016-07-01 18:03         ` Junio C Hamano
  2016-07-01 18:40         ` Stefan Beller
  0 siblings, 2 replies; 52+ messages in thread
From: Jeff King @ 2016-07-01 17:59 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, Dan Wang

On Fri, Jul 01, 2016 at 10:20:32AM -0700, Stefan Beller wrote:

> >> The rationale for keeping the actual options inside a file instead of
> >> putting them directly into an environment variable has multiple reasons:
> >
> > Thanks for including this rationale; my first thought on seeing the
> > patch was "wouldn't this be much more convenient for a hook if each
> > value had its own environment variable?".
> 
> That's what I thought as well. Office discussion ensued and I am still
> offended by this solution, but it sucks less than multiple environment variables
> (I tried writing a script to construct and evaluate the environment
> variables and
> that doesn't look nice)

If you give up on having multiple incarnations of each variable, then I
think:

  GIT_PUSH_VAR_foo=value_for_foo
  GIT_PUSH_VAR_bar=value_for_bar

is quite elegant, and easy to use from hooks. It just cannot represent
multiple such "foo" variables.

> If we did not have a GIT_PUSH_OPTIONS_COUNT and GIT_PUSH_OPTION_<N>
> but rather GIT_PUSH_OPTIONS_VARIABLES that contains the other variables,
> it may be easier to handle, but whether you read from a file or evaluate the
> environment variable is only a minor step, the indirection is there anyway
> and this would be very close to what we have above.

It makes the server implementation a bit uglier. You have to create the
temporary file, and you have to clean it up. What process is responsible
for cleaning up stale files? Obviously receive-pack would try to clean
up after itself, but what happens when it is "kill -9"'d, or the system
goes down, etc? We clean up stale tmp files like tmp_obj_* in git-gc; I
think we'd want something like that here.

-Peff

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 1/4] push options: {pre,post}-receive hook learns about push options
  2016-07-01 17:59       ` Jeff King
@ 2016-07-01 18:03         ` Junio C Hamano
  2016-07-01 18:11           ` Jeff King
  2016-07-01 18:40         ` Stefan Beller
  1 sibling, 1 reply; 52+ messages in thread
From: Junio C Hamano @ 2016-07-01 18:03 UTC (permalink / raw)
  To: Jeff King; +Cc: Stefan Beller, git, Dan Wang

On Fri, Jul 1, 2016 at 10:59 AM, Jeff King <peff@peff.net> wrote:
> If you give up on having multiple incarnations of each variable, then I
> think:
>
>   GIT_PUSH_VAR_foo=value_for_foo
>   GIT_PUSH_VAR_bar=value_for_bar
>
> is quite elegant, and easy to use from hooks. It just cannot represent
> multiple such "foo" variables.
>
>> If we did not have a GIT_PUSH_OPTIONS_COUNT and GIT_PUSH_OPTION_<N>
>> but rather GIT_PUSH_OPTIONS_VARIABLES that contains the other variables,
>> it may be easier to handle, but whether you read from a file or evaluate the
>> environment variable is only a minor step, the indirection is there anyway
>> and this would be very close to what we have above.
>
> It makes the server implementation a bit uglier. You have to create the
> temporary file, and you have to clean it up. What process is responsible
> for cleaning up stale files? Obviously receive-pack would try to clean
> up after itself, but what happens when it is "kill -9"'d, or the system
> goes down, etc? We clean up stale tmp files like tmp_obj_* in git-gc; I
> think we'd want something like that here.

It still is not clear to me why the option to pass _COUNT and _VAR_<N> is
rejected.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 1/4] push options: {pre,post}-receive hook learns about push options
  2016-07-01 18:03         ` Junio C Hamano
@ 2016-07-01 18:11           ` Jeff King
  2016-07-01 19:25             ` Junio C Hamano
  0 siblings, 1 reply; 52+ messages in thread
From: Jeff King @ 2016-07-01 18:11 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Stefan Beller, git, Dan Wang

On Fri, Jul 01, 2016 at 11:03:06AM -0700, Junio C Hamano wrote:

> On Fri, Jul 1, 2016 at 10:59 AM, Jeff King <peff@peff.net> wrote:
> > If you give up on having multiple incarnations of each variable, then I
> > think:
> >
> >   GIT_PUSH_VAR_foo=value_for_foo
> >   GIT_PUSH_VAR_bar=value_for_bar
> >
> > is quite elegant, and easy to use from hooks. It just cannot represent
> > multiple such "foo" variables.
> >
> >> If we did not have a GIT_PUSH_OPTIONS_COUNT and GIT_PUSH_OPTION_<N>
> >> but rather GIT_PUSH_OPTIONS_VARIABLES that contains the other variables,
> >> it may be easier to handle, but whether you read from a file or evaluate the
> >> environment variable is only a minor step, the indirection is there anyway
> >> and this would be very close to what we have above.
> >
> > It makes the server implementation a bit uglier. You have to create the
> > temporary file, and you have to clean it up. What process is responsible
> > for cleaning up stale files? Obviously receive-pack would try to clean
> > up after itself, but what happens when it is "kill -9"'d, or the system
> > goes down, etc? We clean up stale tmp files like tmp_obj_* in git-gc; I
> > think we'd want something like that here.
> 
> It still is not clear to me why the option to pass _COUNT and _VAR_<N> is
> rejected.

It's a little more unwieldy to parse in a shell hook, but not too bad, I
guess:

  if test -n "$GIT_PUSH_VAR_COUNT"; then
    i=0
    while test "$i" -lt "$GIT_PUSH_VAR_COUNT"; do
      eval "value=\$GIT_PUSH_VAR_$i"
      case "$value" in
      force=*)
	force=${value#*=}
	;;
      ... and so on ...
      esac
    done
  fi

  ...
  if test "$force" = true
     ...

Compare to:

  if test "$GIT_PUSH_VAR_force" = true
     ...

The "count" method gives you the flexibility to parse multiple keys as
lists, last-one-wins, or whatever scheme you want. But it also gives you
the _responsibility_ to do the parsing yourself, which is a pain when
you want to do the simple thing.

-Peff

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [RFC PATCHv1 0/4] Push options in C Git
  2016-07-01 17:55     ` Jeff King
@ 2016-07-01 18:25       ` Stefan Beller
  2016-07-01 20:01         ` Jeff King
  0 siblings, 1 reply; 52+ messages in thread
From: Stefan Beller @ 2016-07-01 18:25 UTC (permalink / raw)
  To: Jeff King; +Cc: Dennis Kaarsemaker, git, Dan Wang

On Fri, Jul 1, 2016 at 10:55 AM, Jeff King <peff@peff.net> wrote:
> On Fri, Jul 01, 2016 at 10:07:10AM -0700, Stefan Beller wrote:
>
>> > So you'd probably want some client tool to help the user figure out what
>> > to put in the PR, and of course that already exists, because GitHub has
>> > an HTTP API that it can talk to. Shoe-horning it into Git doesn't buy
>> > much.
>>
>> Let's say the example was bad. the reality is that Dennis and I implemented
>> a needed feature independently of each other and you have the
>> out-of-the-box-Github-does-HTTP-too solution. So I think we'd want
>> *something* in Git to have information in the receiving hooks available?
>
> Sure, _if_ it's information that the receiving hooks want to have.
>
> I guess what I'm questioning is exactly what receiving hooks do need to
> have. Things like "the user said --force" seem obviously related to the
> push. Things like auto-creating a PR seem less about the push itself,
> and more about something you can do with the result of the push.
>
> So a litmus test for me would be something like:
>
>   Can you get the same effect with:
>
>     git push $remote my-branch &&
>     curl https://$remote/api/do-the-thing?q=my-branch
>
>   as you would with:
>
>     git push --push-data=do-the-thing $remote mybranch

People are happy when they can use one tool instead of two,
e.g. "Deploy with Git" is a thing[1] and maybe they want to add a
switch "once pushed skip the tests, I know they are broken".

[1] https://devcenter.heroku.com/articles/git

>
> If so, then it seems like inserting the data through the git connection
> is not really adding functionality, but it _is_ adding complexity to
> git (that other tools are already handling with much more flexibility).
>
> We have a similar rule of thumb for hooks; we don't add a hook if
> there's no reason you couldn't just run the hook content immediately
> before or after the command.
>
>> It seems to me there are many other Git-wrappers, such as the repo tool
>> or git-review[2], and most of them start with the premise: "Git doesn't support
>> X yet, so while they take their time to figure out how to do it
>> properly upstream,
>> we'll just use this hacky script" and then eventually you end up with an xml
>> parsing beast with thousands lines of python code (The story of the repo
>> tool and submodules).
>
> But it's exactly this complexity that makes me worried about moving
> things into git. This patch addresses only the transport problem. If
> your problem is that something like a review system built on git
> involves lot of complicated data, now you have two problems: you still
> have the complicated data, and now you have to shoe-horn it into Git's
> ad-hoc protocol.
>
> I'm also not convinced that it alleviates the need for the tool to have
> a separate API. Imagine you can send this data while pushing now. How do
> you edit it later? How do you retrieve it programmatically? Etc.
>
>> > I think Gerrit is funny in this regard because it
>> > eschews branches entirely. E.g., in a GitHub PR, you push to branch
>> > "foo", and then you open a PR using "foo" as the source.
>>
>> Once upon a time you could also open a pull request using the sha1?
>> (I did that once and then was asked to do some fixes before pulling and
>> I had to abandon and reopen a proper branch PR)
>>
>> > But with
>> > Gerrit, you push to the magic refs/for/master, and you have no real way
>> > to cross-reference that submission later.
>>
>> What do you mean by cross reference?
>
> I just meant that in my GitHub example, the branch name you chose
> ("foo") becomes the key with which you can correlate the two actions:
> pushing the data via Git, and later opening a PR (via the web page, or
> the API, or whatever).
>
> With Gerrit, how do you refer to the commits you just pushed, when
> making a separate action?  My impression is that Gerrit assigns
> change-ids centrally when you push to refs/for/master;

actually you install a commit hook to do it decentralized.

> how do you figure
> out after the push what the change-id is of the thing you just pushed?
> I can guess maybe it sends something to stderr that the user then reads,
> but that is purely a guess. I've never really used Gerrit.

I assigns the "legacy" sequential numbers like so:

$ git push origin HEAD:refs/for/master
Counting objects: 18, done.
Delta compression using up to 12 threads.
Compressing objects: 100% (18/18), done.
Writing objects: 100% (18/18), 4.15 KiB | 0 bytes/s, done.
Total 18 (delta 13), reused 0 (delta 0)
remote: Resolving deltas: 100% (13/13)
remote: Processing changes: new: 3, updated: 2, done
remote: (W) ba18b08: too many commit message lines longer than 70
characters; manually wrap lines
remote:
remote: New Changes:
remote:   https://gerrit-review.googlesource.com/79620 project:
introduce hasCommit method
remote:   https://gerrit-review.googlesource.com/79621 project:
GetSubmodules uses recursive switch for ls-tree
remote:   https://gerrit-review.googlesource.com/79622 sync: add a
quick flag to ignore up to date projects
remote:
remote:
remote: Updated Changes:
remote:   https://gerrit-review.googlesource.com/79195 manifest_xml:
move path parsing before revision parsing
remote:   https://gerrit-review.googlesource.com/79200 Add a
superproject metarepository for atomic syncs
remote:
To https://gerrit.googlesource.com/git-repo
 * [new branch]      HEAD -> refs/for/master


>
> You're right that one could use sha1s for this purpose.
>
>> Gerrits flaws are off topic to this series though (maybe?)
>
> I'm far from qualified to critique Gerrit. I was mostly just observing
> that your view of the problem space (and thus the solution) is colored
> by Gerrit's way of doing things. That doesn't make it a wrong solution,
> but it's worth seeing whether other systems would get the same benefit
> (just because part of the "is it worth it" equation is how widely used
> the feature could be).


>
>> > Whereas in Dennis's patches, it was about specific information directly
>> > related to the act of pushing.
>>
>> But does it cover all the informations related to pushing? like
>>
>>   "I am a bot, down-prioritize me, please"
>>   "I am a bot, therefore I do not care about the internal replication
>> on the server"
>>
>> The last one is interesting as it is very specific to our servers. So you could
>> argue we'd want to roll our own fork of Git that also covers such a hook option,
>> but I think it is easier for both the Git community as well as us here to allow
>> Git transmit free form text and the server can decide how to act upon it?
>
> No, I don't think it covers all information. I like the idea of being
> able to pass free-form key/value pairs to hooks, because by definition
> we don't know what the hooks are doing, or what the user might want to
> communicate to them.
>
>> I might have missed the point of this email completely as I seem to
>> try to defend the
>> patch series (and Gerrit a bit). So do you think the functionality of
>> this series is overkill
>> and we'd rather go with Dennis series?
>
> Oh, I'm supposed to have a point to my emails? :)

I am used to read high quality emails from you and asked myself what I can
learn by reading this email. "Trying to figure out your point", nothing more.

>
> I was mostly just musing. I do see the general idea as a useful thing;
> there really is data that is related to the push itself, and not other
> actions you may want to perform on the push.
>
> I'm not necessarily sure that we need a complicated data model (like
> multiple values of the same key), or the ability to handle arbitrarily
> large binary data. Those don't necessarily complicate the wire protocol
> (I like the idea of adding a new phase of key/values, followed by a
> flush),

So you suggest that we only allow key=value pairs and not "key" only?
I am not sure if that makes it easier, though (what part is easier?) compared
to completely free form.

> but they do complicate the server implementation (how do we
> communicate them to the hooks? Should there be limits for safety and DoS
> protection on the server?)

Oh good point! How do you handle DoS protection for pushing large blobs,
or weird history (that may blow up the before-behind implementation as
referenced
in the "topological index field for commit objects" thread)

Shouldn't that be server specific as well?

Mind that I proposed a "receive.advertisePushOptions" config option.
Maybe that can be fine tuned in a later patch "receive.PushOptionSizeLimit",
"receive.maxPushOptions" or such?

>
>> Gits spirit (Oh no, not this discussion!) is to allow a broad range of
>> uses and work flows.  It doesn't hinder you from shooting into your
>> feet if that's what you're into, and that is what this series does
>> precisely.
>
> I think Git's spirit is also to fit as a tool into the greater
> ecosystem, and not duplicate functionality that is done elsewhere. So
> for example there is no authentication in the git:// protocol; that job
> is done by ssh.

or https and you have a cookie for that.

Talking about authentication, any tool that we additionally have "to
just trigger
this one thing remotely" would need to learn about proper auth, too. so it will
not be just a small helper script, but blows up into a large thing. I
think we could
avoid that duplicated effort, too.

>
> -Peff

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 1/4] push options: {pre,post}-receive hook learns about push options
  2016-07-01 17:59       ` Jeff King
  2016-07-01 18:03         ` Junio C Hamano
@ 2016-07-01 18:40         ` Stefan Beller
  1 sibling, 0 replies; 52+ messages in thread
From: Stefan Beller @ 2016-07-01 18:40 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Dan Wang

On Fri, Jul 1, 2016 at 10:59 AM, Jeff King <peff@peff.net> wrote:
> On Fri, Jul 01, 2016 at 10:20:32AM -0700, Stefan Beller wrote:
>
>> >> The rationale for keeping the actual options inside a file instead of
>> >> putting them directly into an environment variable has multiple reasons:
>> >
>> > Thanks for including this rationale; my first thought on seeing the
>> > patch was "wouldn't this be much more convenient for a hook if each
>> > value had its own environment variable?".
>>
>> That's what I thought as well. Office discussion ensued and I am still
>> offended by this solution, but it sucks less than multiple environment variables
>> (I tried writing a script to construct and evaluate the environment
>> variables and
>> that doesn't look nice)
>
> If you give up on having multiple incarnations of each variable, then I
> think:
>
>   GIT_PUSH_VAR_foo=value_for_foo
>   GIT_PUSH_VAR_bar=value_for_bar
>
> is quite elegant, and easy to use from hooks. It just cannot represent
> multiple such "foo" variables.

I see! Then you can have a single check for one feature

    if $GIT_PUSH_VAR_foo = "value_for_foo" ; then
        foo "bar"
    elif $GIT_PUSH_VAR_foo != "" ; then
        free_form_foo $GIT_PUSH_VAR_foo
    fi

and no worries about parsing.

Mind that we now put a user provided thing into the variable again.
Which we may be fine with.

The question I still have is how much of enforcement we want to do?
Does the client reject a push option if it doesn't contain a '=', such that
the server doesn't try setting a weird "GIT_PUSH_VAR_bar_baz".

Mind that a different server may not handle these in environment variables,
but read it off the wire and handle it in memory.

>
>> If we did not have a GIT_PUSH_OPTIONS_COUNT and GIT_PUSH_OPTION_<N>
>> but rather GIT_PUSH_OPTIONS_VARIABLES that contains the other variables,
>> it may be easier to handle, but whether you read from a file or evaluate the
>> environment variable is only a minor step, the indirection is there anyway
>> and this would be very close to what we have above.
>
> It makes the server implementation a bit uglier. You have to create the
> temporary file, and you have to clean it up. What process is responsible
> for cleaning up stale files? Obviously receive-pack would try to clean
> up after itself, but what happens when it is "kill -9"'d, or the system
> goes down, etc? We clean up stale tmp files like tmp_obj_* in git-gc; I
> think we'd want something like that here.

Yeah that is one of the weaknesses with the file based solution.

>
> -Peff

> Jeff writes:
>> Junio writes:
>> It still is not clear to me why the option to pass _COUNT and _VAR_<N> is
>> rejected.
> The "count" method gives you the flexibility to parse multiple keys as
> lists, last-one-wins, or whatever scheme you want. But it also gives you
> the _responsibility_ to do the parsing yourself, which is a pain when
> you want to do the simple thing.

We could ship Git with a default parser in the example hook that takes care of
the responsibility in an opinionated way (like Git does with "multiple
options are
allowed", and "last option wins").

If we do that, then the _COUNT method may be favorable?

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 1/4] push options: {pre,post}-receive hook learns about push options
  2016-07-01 18:11           ` Jeff King
@ 2016-07-01 19:25             ` Junio C Hamano
  2016-07-01 19:31               ` Stefan Beller
  2016-07-01 19:39               ` Jeff King
  0 siblings, 2 replies; 52+ messages in thread
From: Junio C Hamano @ 2016-07-01 19:25 UTC (permalink / raw)
  To: Jeff King; +Cc: Stefan Beller, git, Dan Wang

Jeff King <peff@peff.net> writes:

> Compare to:
>
>   if test "$GIT_PUSH_VAR_force" = true
>      ...

OK.  As this is a new feature, I guess it is not too bad if we tell
users that they cannot have duplicate keys in their <key,val> data
they ask Git to transport.  They can do the key1=val1 key2=val2
numbering themselves if that is really needed.


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 1/4] push options: {pre,post}-receive hook learns about push options
  2016-07-01 19:25             ` Junio C Hamano
@ 2016-07-01 19:31               ` Stefan Beller
  2016-07-01 19:39               ` Jeff King
  1 sibling, 0 replies; 52+ messages in thread
From: Stefan Beller @ 2016-07-01 19:31 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, git, Dan Wang

On Fri, Jul 1, 2016 at 12:25 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Jeff King <peff@peff.net> writes:
>
>> Compare to:
>>
>>   if test "$GIT_PUSH_VAR_force" = true
>>      ...
>
> OK.  As this is a new feature, I guess it is not too bad if we tell
> users that they cannot have duplicate keys in their <key,val> data
> they ask Git to transport.  They can do the key1=val1 key2=val2
> numbering themselves if that is really needed.
>

The original use case for us was to send out email on your behalf, so
git push -o CC=user1@domain -o CC=user2@domain would not work well with
this backend approach as the client would already yell at us for duplicate
keys.

Instead that has to be `git push -o CC=user1@domain,user2@domain
and we do need to parse the email addresses apart ourselves.

So currently I'd favor the _COUNT thing and we deliver a reasonable
_COUNT -> key/value parser/translator in the example hook.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 1/4] push options: {pre,post}-receive hook learns about push options
  2016-07-01 19:25             ` Junio C Hamano
  2016-07-01 19:31               ` Stefan Beller
@ 2016-07-01 19:39               ` Jeff King
  2016-07-01 19:50                 ` Stefan Beller
  1 sibling, 1 reply; 52+ messages in thread
From: Jeff King @ 2016-07-01 19:39 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Stefan Beller, git, Dan Wang

On Fri, Jul 01, 2016 at 12:25:39PM -0700, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > Compare to:
> >
> >   if test "$GIT_PUSH_VAR_force" = true
> >      ...
> 
> OK.  As this is a new feature, I guess it is not too bad if we tell
> users that they cannot have duplicate keys in their <key,val> data
> they ask Git to transport.  They can do the key1=val1 key2=val2
> numbering themselves if that is really needed.

That was sort of my question. _Is it_ too bad or not? It's hard to say
without knowing the use cases.

I'm OK with the more flexible scheme if it's something people really
want to use. Though perhaps with your suggestion, we can have our cake
and eat it, too. If we codify the key1/key2 thing by converting:

  git push --push-var=foo=one --push-var=foo=two

into:

  GIT_PUSH_VAR_foo=one
  GIT_PUSH_VAR_foo1=two

then a hook can either:

  - just use $GIT_PUSH_VAR_foo if they only know how to handle a single
    value

  - handle a list like:

     if test -n "$GIT_PUSH_VAR_foo"
     then
       # non-empty list
       echo first value is $GIT_PUSH_VAR_foo
       i=1
       while true; do
         eval "test -z \$GIT_PUSH_VAR_foo$i" && break
	 eval "echo value \$i is \$GIT_PUSH_VAR_foo$i"
       done
     fi

That's ugly, of course, but not really uglier than the parsing required
for the COUNT proposal.

I'm assuming that git actually knows about and enforces that things are
"key=value". I'm not sure how you'd get by without that (you'd have to
infer the meaning of a parameter by its position, which seems like a
recipe for mistakes and incompatibilities).

-Peff

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 1/4] push options: {pre,post}-receive hook learns about push options
  2016-07-01 19:39               ` Jeff King
@ 2016-07-01 19:50                 ` Stefan Beller
  0 siblings, 0 replies; 52+ messages in thread
From: Stefan Beller @ 2016-07-01 19:50 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, git, Dan Wang

On Fri, Jul 1, 2016 at 12:39 PM, Jeff King <peff@peff.net> wrote:
>
> That was sort of my question. _Is it_ too bad or not? It's hard to say
> without knowing the use cases.
>
...
>
> That's ugly, of course, but not really uglier than the parsing required
> for the COUNT proposal.

ok, I'll try to get the COUNT proposal in good enough shape.

>
> I'm assuming that git actually knows about and enforces that things are
> "key=value". I'm not sure how you'd get by without that (you'd have to
> infer the meaning of a parameter by its position, which seems like a
> recipe for mistakes and incompatibilities).
>

Of course! We could use "a hack" just like in the Git config itself.
If a boolean parameter is not given, the default is assumed. If it is
given the opposite is assumed. Same for attributes, that can be set
by just giving the key (with no "=true").

So I think I can be convinced to go only with key/value pairs for now, but
it would be easier if you'd just have

    git push -o draft ...

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [RFC PATCHv1 0/4] Push options in C Git
  2016-07-01 18:25       ` Stefan Beller
@ 2016-07-01 20:01         ` Jeff King
  0 siblings, 0 replies; 52+ messages in thread
From: Jeff King @ 2016-07-01 20:01 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Dennis Kaarsemaker, git, Dan Wang

On Fri, Jul 01, 2016 at 11:25:40AM -0700, Stefan Beller wrote:

> > So a litmus test for me would be something like:
> >
> >   Can you get the same effect with:
> >
> >     git push $remote my-branch &&
> >     curl https://$remote/api/do-the-thing?q=my-branch
> >
> >   as you would with:
> >
> >     git push --push-data=do-the-thing $remote mybranch
> 
> People are happy when they can use one tool instead of two,
> e.g. "Deploy with Git" is a thing[1] and maybe they want to add a
> switch "once pushed skip the tests, I know they are broken".
> 
> [1] https://devcenter.heroku.com/articles/git

Yes, but people are also happy when they can use a flexible and
standardized tool to do a thing. I'd be more frustrated when I found out
that Git's data-pushing protocol has arbitrary limitations (like, say, I
can't push a data item larger than a single 64K pkt-line), which would
easily just work with something like HTTP POSTs.

I'm not saying my point trumps yours; just that there are a lot of
tradeoffs in a design like this.

> > With Gerrit, how do you refer to the commits you just pushed, when
> > making a separate action?  My impression is that Gerrit assigns
> > change-ids centrally when you push to refs/for/master;
> 
> actually you install a commit hook to do it decentralized.

Ah, OK. So you could do roughly the same thing:

  git push ... &&
  id=$(
    git cat-file commit HEAD |
    perl -lne '/^Change-Id: (.*)/ and print $1'
  ) &&
  curl https://$remote/api/do-something?id=$id

(Yes, I know Gerrit doesn't necessarily have an API like that; I was
mostly just wondering how it compared to the GitHub PR example I had
laid out).

> I am used to read high quality emails from you and asked myself what I can
> learn by reading this email. "Trying to figure out your point", nothing more.

Heh. Sorry, they can't all be winners. ;)

> So you suggest that we only allow key=value pairs and not "key" only?
> I am not sure if that makes it easier, though (what part is easier?) compared
> to completely free form.

I had just assumed they would always have a key and a value. If you have
no key, then you are stuck parsing positionally to know what each value
means. If you have no value, you can treat each element like a boolean
true. But it's not that much worse to turn "foo" into "foo=true".

> > but they do complicate the server implementation (how do we
> > communicate them to the hooks? Should there be limits for safety and
> > DoS protection on the server?)
> 
> Oh good point! How do you handle DoS protection for pushing large
> blobs, or weird history (that may blow up the before-behind
> implementation as referenced in the "topological index field for
> commit objects" thread)

At GitHub we use a combination of push-time size heuristics and CPU
run-time limits.

So at push time, we reject objects larger than 100MB (uncompressed) in
index-pack, and we'll refuse to make a single allocation larger than a
gigabyte or so (to prevent index-pack blowing up while trying to figure
out how big the object is). We reject any single pack larger than 2GB.
Rejected objects never even make it to the repo, and we clean up the tmp
packs immediately. So you can't just stream data at us endlessly and
chew up disk or RAM (you can chew up plenty of CPU, though). 

That prevents the most egregious DoS problems, and protects us from crap
like "whoops, this tree entry has a 2GB name, and it wraps an integer
somewhere deep in git that probably should have been a size_t".

But git is complicated enough that you can still get plenty of malicious
crap through. So we aggressively limit completion times for most
operations, and we keep track of the runtime of every single git
operation and correlate it with the repo, user, and source IP, and apply
quotas when there's excessive use. That's not nice for a user who has a
legitimately big request, but it prevents them from impacting other
users (so it's a baseline, and then we try to optimize things that don't
fit into the timeouts, or that cause users to have quotas applied).

That was a bit of a tangent, but back to this topic: I'd probably want
some kind of hard limit on the number of data items and the size of each
(though if they are single pkt-lines, that puts a hard limit already).

> Shouldn't that be server specific as well?

Potentially, yes.

> Mind that I proposed a "receive.advertisePushOptions" config option.
> Maybe that can be fine tuned in a later patch "receive.PushOptionSizeLimit",
> "receive.maxPushOptions" or such?

Yep, I think that is a good approach (presumably with some sane safe-ish
defaults).

> Talking about authentication, any tool that we additionally have "to
> just trigger this one thing remotely" would need to learn about proper
> auth, too. so it will not be just a small helper script, but blows up
> into a large thing. I think we could avoid that duplicated effort,
> too.

I think the right tool there is credential helpers. And I don't think
it's a new problem. You already can log into Gerrit via the web browser,
but you have to have separate auth for running git. Hopefully there are
tools to make that less painful (either pulling the browser cookie, or
getting an opaque application token from the web app to feed into git).

-Peff

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH 2/4] receive-pack: implement advertising and receiving push options
  2016-07-14 21:49 [PATCHv7 0/4] Push options Stefan Beller
@ 2016-07-14 21:49 ` Stefan Beller
  0 siblings, 0 replies; 52+ messages in thread
From: Stefan Beller @ 2016-07-14 21:49 UTC (permalink / raw)
  To: git, gitster; +Cc: dwwang, e, peff, dennis, jrnieder, Stefan Beller

The pre/post receive hook may be interested in more information from the
user. This information can be transmitted when both client and server
support the "push-options" capability, which when used is a phase directly
after update commands ended by a flush pkt.

Similar to the atomic option, the server capability can be disabled via
the `receive.advertisePushOptions` config variable. While documenting
this, fix a nit in the `receive.advertiseAtomic` wording.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 Documentation/config.txt                          |  9 +++++--
 Documentation/technical/pack-protocol.txt         | 10 +++++---
 Documentation/technical/protocol-capabilities.txt |  9 +++++++
 builtin/receive-pack.c                            | 30 +++++++++++++++++++++++
 4 files changed, 52 insertions(+), 6 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 16dc22d..0bb6daa 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2417,8 +2417,13 @@ rebase.instructionFormat
 
 receive.advertiseAtomic::
 	By default, git-receive-pack will advertise the atomic push
-	capability to its clients. If you don't want to this capability
-	to be advertised, set this variable to false.
+	capability to its clients. If you don't want to advertise this
+	capability, set this variable to false.
+
+receive.advertisePushOptions::
+	By default, git-receive-pack will advertise the push options
+	capability to its clients. If you don't want to advertise this
+	capability, set this variable to false.
 
 receive.autogc::
 	By default, git-receive-pack will run "git-gc --auto" after
diff --git a/Documentation/technical/pack-protocol.txt b/Documentation/technical/pack-protocol.txt
index 8b36343..7a2ed30 100644
--- a/Documentation/technical/pack-protocol.txt
+++ b/Documentation/technical/pack-protocol.txt
@@ -454,7 +454,8 @@ The reference discovery phase is done nearly the same way as it is in the
 fetching protocol. Each reference obj-id and name on the server is sent
 in packet-line format to the client, followed by a flush-pkt.  The only
 real difference is that the capability listing is different - the only
-possible values are 'report-status', 'delete-refs' and 'ofs-delta'.
+possible values are 'report-status', 'delete-refs', 'ofs-delta' and
+'push-options'.
 
 Reference Update Request and Packfile Transfer
 ----------------------------------------------
@@ -465,9 +466,10 @@ that it wants to update, it sends a line listing the obj-id currently on
 the server, the obj-id the client would like to update it to and the name
 of the reference.
 
-This list is followed by a flush-pkt and then the packfile that should
-contain all the objects that the server will need to complete the new
-references.
+This list is followed by a flush-pkt. Then the push options are transmitted
+one per packet followed by another flush-pkt. After that the packfile that
+should contain all the objects that the server will need to complete the new
+references will be sent.
 
 ----
   update-request    =  *shallow ( command-list | push-cert ) [packfile]
diff --git a/Documentation/technical/protocol-capabilities.txt b/Documentation/technical/protocol-capabilities.txt
index eaab6b4..4c28d3a 100644
--- a/Documentation/technical/protocol-capabilities.txt
+++ b/Documentation/technical/protocol-capabilities.txt
@@ -253,6 +253,15 @@ atomic pushes. If the pushing client requests this capability, the server
 will update the refs in one atomic transaction. Either all refs are
 updated or none.
 
+push-options
+------------
+
+If the server sends the 'push-options' capability it is able to accept
+push options after the update commands have been sent, but before the
+packfile is streamed. If the pushing client requests this capability,
+the server will pass the options to the pre- and post- receive hooks
+that process this push request.
+
 allow-tip-sha1-in-want
 ----------------------
 
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 6cdd2c6..3c9360a 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -44,10 +44,12 @@ static struct strbuf fsck_msg_types = STRBUF_INIT;
 static int receive_unpack_limit = -1;
 static int transfer_unpack_limit = -1;
 static int advertise_atomic_push = 1;
+static int advertise_push_options;
 static int unpack_limit = 100;
 static int report_status;
 static int use_sideband;
 static int use_atomic;
+static int use_push_options;
 static int quiet;
 static int prefer_ofs_delta = 1;
 static int auto_update_server_info;
@@ -193,6 +195,11 @@ static int receive_pack_config(const char *var, const char *value, void *cb)
 		return 0;
 	}
 
+	if (strcmp(var, "receive.advertisepushoptions") == 0) {
+		advertise_push_options = git_config_bool(var, value);
+		return 0;
+	}
+
 	return git_default_config(var, value, cb);
 }
 
@@ -211,6 +218,8 @@ static void show_ref(const char *path, const unsigned char *sha1)
 			strbuf_addstr(&cap, " ofs-delta");
 		if (push_cert_nonce)
 			strbuf_addf(&cap, " push-cert=%s", push_cert_nonce);
+		if (advertise_push_options)
+			strbuf_addstr(&cap, " push-options");
 		strbuf_addf(&cap, " agent=%s", git_user_agent_sanitized());
 		packet_write(1, "%s %s%c%s\n",
 			     sha1_to_hex(sha1), path, 0, cap.buf);
@@ -1455,6 +1464,9 @@ static struct command *read_head_info(struct sha1_array *shallow)
 			if (advertise_atomic_push
 			    && parse_feature_request(feature_list, "atomic"))
 				use_atomic = 1;
+			if (advertise_push_options
+			    && parse_feature_request(feature_list, "push-options"))
+				use_push_options = 1;
 		}
 
 		if (!strcmp(line, "push-cert")) {
@@ -1487,6 +1499,21 @@ static struct command *read_head_info(struct sha1_array *shallow)
 	return commands;
 }
 
+static void read_push_options(struct string_list *options)
+{
+	while (1) {
+		char *line;
+		int len;
+
+		line = packet_read_line(0, &len);
+
+		if (!line)
+			break;
+
+		string_list_append(options, line);
+	}
+}
+
 static const char *parse_pack_header(struct pack_header *hdr)
 {
 	switch (read_pack_header(0, hdr)) {
@@ -1774,6 +1801,9 @@ int cmd_receive_pack(int argc, const char **argv, const char *prefix)
 		const char *unpack_status = NULL;
 		struct string_list push_options = STRING_LIST_INIT_DUP;
 
+		if (use_push_options)
+			read_push_options(&push_options);
+
 		prepare_shallow_info(&si, &shallow);
 		if (!si.nr_ours && !si.nr_theirs)
 			shallow_update = 0;
-- 
2.9.0.247.gf748855.dirty


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* Re: [PATCH 2/4] receive-pack: implement advertising and receiving push options
  2016-07-14 19:45         ` Jeff King
@ 2016-07-14 20:07           ` Junio C Hamano
  0 siblings, 0 replies; 52+ messages in thread
From: Junio C Hamano @ 2016-07-14 20:07 UTC (permalink / raw)
  To: Jeff King
  Cc: Stefan Beller, git, Dan Wang, Eric Wong, Dennis Kaarsemaker,
	Jonathan Nieder

Jeff King <peff@peff.net> writes:

> packet_read() does NUL-terminate for you. It gets the extra bytes
> because it doesn't store the 4-byte size in the output (whereas the
> client does not ever send anything over LARGE_PACKET_MAX, _including_
> those bytes, so we always have room to store its result in our
> LARGE_PACKET_MAX buffer, plus the NUL, with 3 bytes to spare).

Good; then the loop will further be simplified ;-)

Thanks.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 2/4] receive-pack: implement advertising and receiving push options
  2016-07-14 19:07       ` Junio C Hamano
@ 2016-07-14 19:45         ` Jeff King
  2016-07-14 20:07           ` Junio C Hamano
  0 siblings, 1 reply; 52+ messages in thread
From: Jeff King @ 2016-07-14 19:45 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Stefan Beller, git, Dan Wang, Eric Wong, Dennis Kaarsemaker,
	Jonathan Nieder

On Thu, Jul 14, 2016 at 12:07:15PM -0700, Junio C Hamano wrote:

> > Although as you remarked in another email, this would not pose a problem for
> > the shell variable, so we could also drop it to allow multi line
> > options. will do.
> 
> One thing to note is that I do not think there is a guarantee that
> packet_buf[] is NUL-terminated, and when len == LAGE_PACKET_MAX, you
> do not have room to NUL-terminate it yourself.

packet_read() does NUL-terminate for you. It gets the extra bytes
because it doesn't store the 4-byte size in the output (whereas the
client does not ever send anything over LARGE_PACKET_MAX, _including_
those bytes, so we always have room to store its result in our
LARGE_PACKET_MAX buffer, plus the NUL, with 3 bytes to spare).

-Peff

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 2/4] receive-pack: implement advertising and receiving push options
  2016-07-14 19:00     ` Stefan Beller
@ 2016-07-14 19:07       ` Junio C Hamano
  2016-07-14 19:45         ` Jeff King
  0 siblings, 1 reply; 52+ messages in thread
From: Junio C Hamano @ 2016-07-14 19:07 UTC (permalink / raw)
  To: Stefan Beller
  Cc: git, Dan Wang, Eric Wong, Jeff King, Dennis Kaarsemaker, Jonathan Nieder

Stefan Beller <sbeller@google.com> writes:

>>> +             lf = strchr(line, '\n');
>>> +             if (lf)
>>> +                     *lf = '\0';
>>
>> packet_read_line() -> packet_read_line_generic() calls packet_read()
>> with PACKET_READ_CHOMP_NEWLINE flag bit; do we need this check?
>
> This check was not about "option with lf at end\n", but rather we want to chop
> off "option\nover\nmultiple\nlines" ?

Ahh, I did misread the check.

> Although as you remarked in another email, this would not pose a problem for
> the shell variable, so we could also drop it to allow multi line
> options. will do.

One thing to note is that I do not think there is a guarantee that
packet_buf[] is NUL-terminated, and when len == LAGE_PACKET_MAX, you
do not have room to NUL-terminate it yourself.

string_list_append(ret, line) that assumes the "string" is NUL
terminated may become an issue that you need to solve by appending
the result of xmemdupz() into a non-duping string list.



^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 2/4] receive-pack: implement advertising and receiving push options
  2016-07-14 18:38   ` Junio C Hamano
@ 2016-07-14 19:00     ` Stefan Beller
  2016-07-14 19:07       ` Junio C Hamano
  0 siblings, 1 reply; 52+ messages in thread
From: Stefan Beller @ 2016-07-14 19:00 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, Dan Wang, Eric Wong, Jeff King, Dennis Kaarsemaker, Jonathan Nieder

> packet_read_line() calls packet_read_line_generic() that calls
> packet_read() with the fixed packet_buffer[] that is sized to be
> LARGE_PACKET_MAX.

>
> Can this check even trigger?

I thought when len == LARGE_PACKET_MAX,
this could trigger. Though on inspection of packet_read,
we already reject packets that have size LARGE_PACKET_MAX,
and the largest size allowed is LARGE_PACKET_MAX - 1.

I guess we can remove it altogether and still be future proof.
If we ever want to allow larger push options we either need to
have larger (variable sized) packets or a capability push-options-v2,
so I'll rip this check out.

>
>> +             lf = strchr(line, '\n');
>> +             if (lf)
>> +                     *lf = '\0';
>
> packet_read_line() -> packet_read_line_generic() calls packet_read()
> with PACKET_READ_CHOMP_NEWLINE flag bit; do we need this check?

This check was not about "option with lf at end\n", but rather we want to chop
off "option\nover\nmultiple\nlines" ?

Although as you remarked in another email, this would not pose a problem for
the shell variable, so we could also drop it to allow multi line
options. will do.

>
>> +
>> +             string_list_append(ret, line);
>> +     }
>> +
>> +     return ret;
>> +}
>
> Other than that, looks good to me.
>
> Thanks.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 2/4] receive-pack: implement advertising and receiving push options
  2016-07-14 17:39 ` [PATCH 2/4] receive-pack: implement advertising and receiving push options Stefan Beller
@ 2016-07-14 18:38   ` Junio C Hamano
  2016-07-14 19:00     ` Stefan Beller
  0 siblings, 1 reply; 52+ messages in thread
From: Junio C Hamano @ 2016-07-14 18:38 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, dwwang, e, peff, dennis, jrnieder

Stefan Beller <sbeller@google.com> writes:

> +static struct string_list *read_push_options(void)
> +{
> +	struct string_list *ret = xmalloc(sizeof(*ret));
> +	string_list_init(ret, 1);
> +
> +	while (1) {
> +		char *line, *lf;
> +		int len;
> +
> +		line = packet_read_line(0, &len);
> +
> +		if (!line)
> +			break;
> +
> +		/*
> +		* NEEDSWORK: expose the limitations to be configurable;
> +		* Once the limit can be lifted, include a way for payloads
> +		* larger than one pkt, e.g use last byte to indicate if
> +		* the push option continues in the next packet or implement
> +		* larger packets.
> +		*/
> +		if (len > LARGE_PACKET_MAX - 1) {

packet_read_line() calls packet_read_line_generic() that calls
packet_read() with the fixed packet_buffer[] that is sized to be
LARGE_PACKET_MAX.

Can this check even trigger?

> +			/*
> +			 * NEEDSWORK: The error message in die(..) is not
> +			 * transmitted in call cases, so ideally all die(..)
> +			 * calls are prefixed with rp_error and then we can
> +			 * combine rp_error && die into one helper function.
> +			 */
> +			rp_error("protocol error: server configuration allows push "
> +				 "options of size up to %d bytes",
> +				 LARGE_PACKET_MAX - 1);
> +			die("protocol error: push options too large");
> +		}

> +		lf = strchr(line, '\n');
> +		if (lf)
> +			*lf = '\0';

packet_read_line() -> packet_read_line_generic() calls packet_read()
with PACKET_READ_CHOMP_NEWLINE flag bit; do we need this check?

> +
> +		string_list_append(ret, line);
> +	}
> +
> +	return ret;
> +}

Other than that, looks good to me.

Thanks.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH 2/4] receive-pack: implement advertising and receiving push options
  2016-07-14 17:39 [PATCHv5 0/4] Push options Stefan Beller
@ 2016-07-14 17:39 ` Stefan Beller
  2016-07-14 18:38   ` Junio C Hamano
  0 siblings, 1 reply; 52+ messages in thread
From: Stefan Beller @ 2016-07-14 17:39 UTC (permalink / raw)
  To: git, gitster; +Cc: dwwang, e, peff, dennis, jrnieder, Stefan Beller

The pre/post receive hook may be interested in more information from the
user. This information can be transmitted when both client and server
support the "push-options" capability, which when used is a phase directly
after update commands ended by a flush pkt.

Similar to the atomic option, the server capability can be disabled via
the `receive.advertisePushOptions` config variable. While documenting
this, fix a nit in the `receive.advertiseAtomic` wording.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 Documentation/config.txt                          |  9 +++-
 Documentation/technical/pack-protocol.txt         | 10 ++--
 Documentation/technical/protocol-capabilities.txt |  9 ++++
 builtin/receive-pack.c                            | 59 +++++++++++++++++++++++
 4 files changed, 81 insertions(+), 6 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index e208af1..25b5db1 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2410,8 +2410,13 @@ rebase.instructionFormat
 
 receive.advertiseAtomic::
 	By default, git-receive-pack will advertise the atomic push
-	capability to its clients. If you don't want to this capability
-	to be advertised, set this variable to false.
+	capability to its clients. If you don't want to advertise this
+	capability, set this variable to false.
+
+receive.advertisePushOptions::
+	By default, git-receive-pack will advertise the push options
+	capability to its clients. If you don't want to advertise this
+	capability, set this variable to false.
 
 receive.autogc::
 	By default, git-receive-pack will run "git-gc --auto" after
diff --git a/Documentation/technical/pack-protocol.txt b/Documentation/technical/pack-protocol.txt
index 8b36343..7a2ed30 100644
--- a/Documentation/technical/pack-protocol.txt
+++ b/Documentation/technical/pack-protocol.txt
@@ -454,7 +454,8 @@ The reference discovery phase is done nearly the same way as it is in the
 fetching protocol. Each reference obj-id and name on the server is sent
 in packet-line format to the client, followed by a flush-pkt.  The only
 real difference is that the capability listing is different - the only
-possible values are 'report-status', 'delete-refs' and 'ofs-delta'.
+possible values are 'report-status', 'delete-refs', 'ofs-delta' and
+'push-options'.
 
 Reference Update Request and Packfile Transfer
 ----------------------------------------------
@@ -465,9 +466,10 @@ that it wants to update, it sends a line listing the obj-id currently on
 the server, the obj-id the client would like to update it to and the name
 of the reference.
 
-This list is followed by a flush-pkt and then the packfile that should
-contain all the objects that the server will need to complete the new
-references.
+This list is followed by a flush-pkt. Then the push options are transmitted
+one per packet followed by another flush-pkt. After that the packfile that
+should contain all the objects that the server will need to complete the new
+references will be sent.
 
 ----
   update-request    =  *shallow ( command-list | push-cert ) [packfile]
diff --git a/Documentation/technical/protocol-capabilities.txt b/Documentation/technical/protocol-capabilities.txt
index eaab6b4..4c28d3a 100644
--- a/Documentation/technical/protocol-capabilities.txt
+++ b/Documentation/technical/protocol-capabilities.txt
@@ -253,6 +253,15 @@ atomic pushes. If the pushing client requests this capability, the server
 will update the refs in one atomic transaction. Either all refs are
 updated or none.
 
+push-options
+------------
+
+If the server sends the 'push-options' capability it is able to accept
+push options after the update commands have been sent, but before the
+packfile is streamed. If the pushing client requests this capability,
+the server will pass the options to the pre- and post- receive hooks
+that process this push request.
+
 allow-tip-sha1-in-want
 ----------------------
 
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 236417e..917ac18 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -44,10 +44,12 @@ static struct strbuf fsck_msg_types = STRBUF_INIT;
 static int receive_unpack_limit = -1;
 static int transfer_unpack_limit = -1;
 static int advertise_atomic_push = 1;
+static int advertise_push_options;
 static int unpack_limit = 100;
 static int report_status;
 static int use_sideband;
 static int use_atomic;
+static int use_push_options;
 static int quiet;
 static int prefer_ofs_delta = 1;
 static int auto_update_server_info;
@@ -193,6 +195,11 @@ static int receive_pack_config(const char *var, const char *value, void *cb)
 		return 0;
 	}
 
+	if (strcmp(var, "receive.advertisepushoptions") == 0) {
+		advertise_push_options = git_config_bool(var, value);
+		return 0;
+	}
+
 	return git_default_config(var, value, cb);
 }
 
@@ -211,6 +218,8 @@ static void show_ref(const char *path, const unsigned char *sha1)
 			strbuf_addstr(&cap, " ofs-delta");
 		if (push_cert_nonce)
 			strbuf_addf(&cap, " push-cert=%s", push_cert_nonce);
+		if (advertise_push_options)
+			strbuf_addstr(&cap, " push-options");
 		strbuf_addf(&cap, " agent=%s", git_user_agent_sanitized());
 		packet_write(1, "%s %s%c%s\n",
 			     sha1_to_hex(sha1), path, 0, cap.buf);
@@ -1455,6 +1464,9 @@ static struct command *read_head_info(struct sha1_array *shallow)
 			if (advertise_atomic_push
 			    && parse_feature_request(feature_list, "atomic"))
 				use_atomic = 1;
+			if (advertise_push_options
+			    && parse_feature_request(feature_list, "push-options"))
+				use_push_options = 1;
 		}
 
 		if (!strcmp(line, "push-cert")) {
@@ -1487,6 +1499,50 @@ static struct command *read_head_info(struct sha1_array *shallow)
 	return commands;
 }
 
+static struct string_list *read_push_options(void)
+{
+	struct string_list *ret = xmalloc(sizeof(*ret));
+	string_list_init(ret, 1);
+
+	while (1) {
+		char *line, *lf;
+		int len;
+
+		line = packet_read_line(0, &len);
+
+		if (!line)
+			break;
+
+		/*
+		* NEEDSWORK: expose the limitations to be configurable;
+		* Once the limit can be lifted, include a way for payloads
+		* larger than one pkt, e.g use last byte to indicate if
+		* the push option continues in the next packet or implement
+		* larger packets.
+		*/
+		if (len > LARGE_PACKET_MAX - 1) {
+			/*
+			 * NEEDSWORK: The error message in die(..) is not
+			 * transmitted in call cases, so ideally all die(..)
+			 * calls are prefixed with rp_error and then we can
+			 * combine rp_error && die into one helper function.
+			 */
+			rp_error("protocol error: server configuration allows push "
+				 "options of size up to %d bytes",
+				 LARGE_PACKET_MAX - 1);
+			die("protocol error: push options too large");
+		}
+
+		lf = strchr(line, '\n');
+		if (lf)
+			*lf = '\0';
+
+		string_list_append(ret, line);
+	}
+
+	return ret;
+}
+
 static const char *parse_pack_header(struct pack_header *hdr)
 {
 	switch (read_pack_header(0, hdr)) {
@@ -1774,6 +1830,9 @@ int cmd_receive_pack(int argc, const char **argv, const char *prefix)
 		const char *unpack_status = NULL;
 		struct string_list *push_options = NULL;
 
+		if (use_push_options)
+			push_options = read_push_options();
+
 		prepare_shallow_info(&si, &shallow);
 		if (!si.nr_ours && !si.nr_theirs)
 			shallow_update = 0;
-- 
2.9.0.247.gf748855.dirty


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* Re: [PATCH 2/4] receive-pack: implement advertising and receiving push options
  2016-07-10 17:06   ` Shawn Pearce
  2016-07-10 18:05     ` Stefan Beller
@ 2016-07-12  5:24     ` Jeff King
  1 sibling, 0 replies; 52+ messages in thread
From: Jeff King @ 2016-07-12  5:24 UTC (permalink / raw)
  To: Shawn Pearce
  Cc: Stefan Beller, Junio C Hamano, git, e, dwwang,
	Dennis Kaarsemaker, Jonathan Nieder

On Sun, Jul 10, 2016 at 10:06:12AM -0700, Shawn Pearce wrote:

> On Fri, Jul 8, 2016 at 5:31 PM, Stefan Beller <sbeller@google.com> wrote:
> > +
> > +       /* NEEDSWORK: expose the limitations to be configurable. */
> > +       int max_options = 32;
> > +
> > +       /*
> > +        * NEEDSWORK: expose the limitations to be configurable;
> > +        * Once the limit can be lifted, include a way for payloads
> > +        * larger than one pkt, e.g allow a payload of up to
> > +        * LARGE_PACKET_MAX - 1 only, and reserve the last byte
> > +        * to indicate whether the next pkt continues with this
> > +        * push option.
> > +        */
> > +       int max_size = 1024;
> 
> Instead of this, what about a new config variable
> receive.maxCommandBytes[1] that places a limit on the number of bytes
> of pkt-line data the client can supply in both the command list ("old
> new ref"), push signature framing, and option list?

Hmm, I hadn't thought about that. Just for RAM usage, I don't really see
a huge need for individual limits Sometimes meta-limits like this can be
confusing to set ("I have a huge repo with 50,000 refs that I want to
mirror-push; what do I set the byte-limit to?"), but for the most part
people shouldn't need to be touching it.  And I wonder if this could
simply be done at the pkt-line level.

OTOH, if the true goal is just to limit memory usage, I wonder if git
should be involved at all. Sadly, things like ulimit or cgroups are not
greated for limiting heap memory. ulimit doesn't count non-brk() memory
on most operating systems, and cgroups is too eager to count shared
packfile mmaps. You can try hooking xmalloc/free to keep stats, but you
have to know the size of each allocation in free(), which means either
being intimate with the underlying malloc implementation, or doing dirty
tricks (e.g., increasing the allocation by a few bytes and shoving the
size there). So it feels like there should be a way to ask the malloc
implementation not to allocate more than N bytes at a time, but AFAIK
glibc malloc does not have such a feature. Maybe other allocators do.

I also think there are some reasons besides RAM usage to limit things.

For instance, there is definitely some per-ref work that receive-pack
does (and hooks may do, as well). That remains proportional to the
amount of data sent, but an annoying client can min-max the parameters
by sending a bunch of short ref names, maximizing the size of the "# of
refs" parameter.

Similarly, I'm as much interested in total RAM usage as I am in making
sure we don't hit weird pathological cases. For instance, places where
tree entry names are so long that they overflow "int" and cause bogus
size computations and access memory outside of the array. That probably
wouldn't be an issue here, if the total data size is on the order of
megabytes (so your worst-case is somebody minimizes all parameters but
one, and then that maximizes the other one; but if it still caps out at
a few megabytes, that's not going to overflow anything).

> I studied a lot of repositories[2] and most use ref names under 200
> bytes in length. A 3 MiB default for receive.maxCommandBytes gives
> users something like 11,115 references in a single git push invocation
> if they used all 200 bytes in every name. Most users don't have ref
> names this long. Unlike a cap on each ref, it allows users to use the
> full 65449 bytes in a reference name available in pkt-line, but you
> can only send 48 such references. Likewise for options. :)

My usual "worst case" for refs is:

  https://github.com/JetBrains/intellij-community

A mirror push of that needs more like 4MiB. I'd probably say something
like 10MiB is a reasonable default.

-Peff

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 2/4] receive-pack: implement advertising and receiving push options
  2016-07-10 18:05     ` Stefan Beller
@ 2016-07-12  4:53       ` Shawn Pearce
  0 siblings, 0 replies; 52+ messages in thread
From: Shawn Pearce @ 2016-07-12  4:53 UTC (permalink / raw)
  To: Stefan Beller
  Cc: Junio C Hamano, git, Eric Wong, Jeff King, Dan Wang,
	Dennis Kaarsemaker, Jonathan Nieder

On Sun, Jul 10, 2016 at 11:05 AM, Stefan Beller <sbeller@google.com> wrote:
> On Sun, Jul 10, 2016 at 10:06 AM, Shawn Pearce <spearce@spearce.org> wrote:
>> On Fri, Jul 8, 2016 at 5:31 PM, Stefan Beller <sbeller@google.com> wrote:
>>> +
>>> +       /* NEEDSWORK: expose the limitations to be configurable. */
>>> +       int max_options = 32;
>>> +
>>> +       /*
>>> +        * NEEDSWORK: expose the limitations to be configurable;
>>> +        * Once the limit can be lifted, include a way for payloads
>>> +        * larger than one pkt, e.g allow a payload of up to
>>> +        * LARGE_PACKET_MAX - 1 only, and reserve the last byte
>>> +        * to indicate whether the next pkt continues with this
>>> +        * push option.
>>> +        */
>>> +       int max_size = 1024;
>>
>> Instead of this, what about a new config variable
>> receive.maxCommandBytes[1] that places a limit on the number of bytes
>> of pkt-line data the client can supply in both the command list ("old
>> new ref"), push signature framing, and option list?

FYI https://git.eclipse.org/r/77108 is my proposal in JGit.

> The design here with two bounds was used to not care about the oversized
> push options for now. (I mean we can still just reject larger push
> options even when
> having a receive.maxCommandBytes setting.)

I don't really see a problem with a single option being nearly 64 KiB
in size. The pack file is going to (probably) need a lot more memory
than that to be unpacked and have SHA-1s generated. The
maxCommandBytes setting at 3 MiB neatly allows a few options to be up
to the pkt-line 64 KiB size, which is 1/10th of what anyone would ever
need. :)

> In an earlier discussion Jeff said roughly "either make it work well,
> or don't make it work at all, i.e. why are git push options better
> than a `git push .. && curl <server>/REST-API` thing?"

One reason this is less optimal is you need to setup authentication
twice. git push might be working over SSH, or over a custom
git-remote-NNN transport. But curl might not speak the same
authentication. Even if git and curl are both using .netrc or an HTTP
cookie file you need to pass the right options to make sure both use
the same auth.

Another is we can do validation of the incoming Git data in context of
what options you specified to us and refuse it while discarding the
pack if we don't like what was asked. With `git push && curl` we have
to save the data without necessarily knowing exactly what we should do
with it, just in case the user comes back with a corresponding REST
API call on another connection.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 2/4] receive-pack: implement advertising and receiving push options
  2016-07-10 17:06   ` Shawn Pearce
@ 2016-07-10 18:05     ` Stefan Beller
  2016-07-12  4:53       ` Shawn Pearce
  2016-07-12  5:24     ` Jeff King
  1 sibling, 1 reply; 52+ messages in thread
From: Stefan Beller @ 2016-07-10 18:05 UTC (permalink / raw)
  To: Shawn Pearce
  Cc: Junio C Hamano, git, Eric Wong, Jeff King, Dan Wang,
	Dennis Kaarsemaker, Jonathan Nieder

On Sun, Jul 10, 2016 at 10:06 AM, Shawn Pearce <spearce@spearce.org> wrote:
> On Fri, Jul 8, 2016 at 5:31 PM, Stefan Beller <sbeller@google.com> wrote:
>> +
>> +       /* NEEDSWORK: expose the limitations to be configurable. */
>> +       int max_options = 32;
>> +
>> +       /*
>> +        * NEEDSWORK: expose the limitations to be configurable;
>> +        * Once the limit can be lifted, include a way for payloads
>> +        * larger than one pkt, e.g allow a payload of up to
>> +        * LARGE_PACKET_MAX - 1 only, and reserve the last byte
>> +        * to indicate whether the next pkt continues with this
>> +        * push option.
>> +        */
>> +       int max_size = 1024;
>
> Instead of this, what about a new config variable
> receive.maxCommandBytes[1] that places a limit on the number of bytes
> of pkt-line data the client can supply in both the command list ("old
> new ref"), push signature framing, and option list?

Including the whole command list is pretty smart as it actually tackles the
DoS problem as a whole. We shortly discussed having just one upper bound
limit for the push options alone, but we were distracted by the discussion
on whether to advertise this number or just reject it on the server side
after it filled up so much data.

The design here with two bounds was used to not care about the oversized
push options for now. (I mean we can still just reject larger push
options even when
having a receive.maxCommandBytes setting.)

>
> Memory demands for the server are proportional to the data sent. A
> simple byte limit lets the user make the decision about how this gets
> used. Longer ref names or option values means fewer refs or options
> can be sent. Shorter ref names or option values means more values or
> options can be sent.
>
> I studied a lot of repositories[2] and most use ref names under 200
> bytes in length. A 3 MiB default for receive.maxCommandBytes gives
> users something like 11,115 references in a single git push invocation
> if they used all 200 bytes in every name. Most users don't have ref
> names this long. Unlike a cap on each ref, it allows users to use the
> full 65449 bytes in a reference name available in pkt-line, but you
> can only send 48 such references. Likewise for options. :)

In an earlier discussion Jeff said roughly "either make it work well,
or don't make it work at all, i.e. why are git push options better
than a `git push .. && curl <server>/REST-API` thing?"

And by having this design we could punt on the corner cases with
transmitting arbitrary large push options/binaries for now and claim
it's another next step that needs to be done when adding the config
option for it. By having a single receive.maxCommandBytes setting
we would sweep that problem under the rug and people could wonder
why it fails with the large push option.

As said in an earlier email as a side note, we could think about introducing
a v2 pkt line format which starts with a variable int to indicate the packet
size, such that the payload is not bound up to 64k.

I think 3MiB is a bit much for everyday use though and not enough for
corner cases?

>
>
> [1] I may propose this to JGit.
> [2] More than 3M, but maybe Peff has access to more.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 2/4] receive-pack: implement advertising and receiving push options
  2016-07-09  0:31 ` [PATCH 2/4] receive-pack: implement advertising and receiving push options Stefan Beller
@ 2016-07-10 17:06   ` Shawn Pearce
  2016-07-10 18:05     ` Stefan Beller
  2016-07-12  5:24     ` Jeff King
  0 siblings, 2 replies; 52+ messages in thread
From: Shawn Pearce @ 2016-07-10 17:06 UTC (permalink / raw)
  To: Stefan Beller
  Cc: Junio C Hamano, git, e, Jeff King, dwwang, Dennis Kaarsemaker,
	Jonathan Nieder

On Fri, Jul 8, 2016 at 5:31 PM, Stefan Beller <sbeller@google.com> wrote:
> +
> +       /* NEEDSWORK: expose the limitations to be configurable. */
> +       int max_options = 32;
> +
> +       /*
> +        * NEEDSWORK: expose the limitations to be configurable;
> +        * Once the limit can be lifted, include a way for payloads
> +        * larger than one pkt, e.g allow a payload of up to
> +        * LARGE_PACKET_MAX - 1 only, and reserve the last byte
> +        * to indicate whether the next pkt continues with this
> +        * push option.
> +        */
> +       int max_size = 1024;

Instead of this, what about a new config variable
receive.maxCommandBytes[1] that places a limit on the number of bytes
of pkt-line data the client can supply in both the command list ("old
new ref"), push signature framing, and option list?

Memory demands for the server are proportional to the data sent. A
simple byte limit lets the user make the decision about how this gets
used. Longer ref names or option values means fewer refs or options
can be sent. Shorter ref names or option values means more values or
options can be sent.

I studied a lot of repositories[2] and most use ref names under 200
bytes in length. A 3 MiB default for receive.maxCommandBytes gives
users something like 11,115 references in a single git push invocation
if they used all 200 bytes in every name. Most users don't have ref
names this long. Unlike a cap on each ref, it allows users to use the
full 65449 bytes in a reference name available in pkt-line, but you
can only send 48 such references. Likewise for options. :)


[1] I may propose this to JGit.
[2] More than 3M, but maybe Peff has access to more.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH 2/4] receive-pack: implement advertising and receiving push options
  2016-07-09  0:31 [PATCHv4 0/4] Push options Stefan Beller
@ 2016-07-09  0:31 ` Stefan Beller
  2016-07-10 17:06   ` Shawn Pearce
  0 siblings, 1 reply; 52+ messages in thread
From: Stefan Beller @ 2016-07-09  0:31 UTC (permalink / raw)
  To: gitster; +Cc: git, e, peff, dwwang, dennis, jrnieder, Stefan Beller

The pre/post receive hook may be interested in more information from the
user. This information can be transmitted when both client and server
support the "push-options" capability, which when used is a phase directly
after update commands ended by a flush pkt.

Similar to the atomic option, the server capability can be disabled via
the `receive.advertisePushOptions` config variable. While documenting
this, fix a nit in the `receive.advertiseAtomic` wording.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 Documentation/config.txt                          |  9 ++-
 Documentation/technical/pack-protocol.txt         | 10 ++--
 Documentation/technical/protocol-capabilities.txt |  9 +++
 builtin/receive-pack.c                            | 69 +++++++++++++++++++++++
 4 files changed, 91 insertions(+), 6 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index e208af1..25b5db1 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2410,8 +2410,13 @@ rebase.instructionFormat
 
 receive.advertiseAtomic::
 	By default, git-receive-pack will advertise the atomic push
-	capability to its clients. If you don't want to this capability
-	to be advertised, set this variable to false.
+	capability to its clients. If you don't want to advertise this
+	capability, set this variable to false.
+
+receive.advertisePushOptions::
+	By default, git-receive-pack will advertise the push options
+	capability to its clients. If you don't want to advertise this
+	capability, set this variable to false.
 
 receive.autogc::
 	By default, git-receive-pack will run "git-gc --auto" after
diff --git a/Documentation/technical/pack-protocol.txt b/Documentation/technical/pack-protocol.txt
index 8b36343..7a2ed30 100644
--- a/Documentation/technical/pack-protocol.txt
+++ b/Documentation/technical/pack-protocol.txt
@@ -454,7 +454,8 @@ The reference discovery phase is done nearly the same way as it is in the
 fetching protocol. Each reference obj-id and name on the server is sent
 in packet-line format to the client, followed by a flush-pkt.  The only
 real difference is that the capability listing is different - the only
-possible values are 'report-status', 'delete-refs' and 'ofs-delta'.
+possible values are 'report-status', 'delete-refs', 'ofs-delta' and
+'push-options'.
 
 Reference Update Request and Packfile Transfer
 ----------------------------------------------
@@ -465,9 +466,10 @@ that it wants to update, it sends a line listing the obj-id currently on
 the server, the obj-id the client would like to update it to and the name
 of the reference.
 
-This list is followed by a flush-pkt and then the packfile that should
-contain all the objects that the server will need to complete the new
-references.
+This list is followed by a flush-pkt. Then the push options are transmitted
+one per packet followed by another flush-pkt. After that the packfile that
+should contain all the objects that the server will need to complete the new
+references will be sent.
 
 ----
   update-request    =  *shallow ( command-list | push-cert ) [packfile]
diff --git a/Documentation/technical/protocol-capabilities.txt b/Documentation/technical/protocol-capabilities.txt
index eaab6b4..4c28d3a 100644
--- a/Documentation/technical/protocol-capabilities.txt
+++ b/Documentation/technical/protocol-capabilities.txt
@@ -253,6 +253,15 @@ atomic pushes. If the pushing client requests this capability, the server
 will update the refs in one atomic transaction. Either all refs are
 updated or none.
 
+push-options
+------------
+
+If the server sends the 'push-options' capability it is able to accept
+push options after the update commands have been sent, but before the
+packfile is streamed. If the pushing client requests this capability,
+the server will pass the options to the pre- and post- receive hooks
+that process this push request.
+
 allow-tip-sha1-in-want
 ----------------------
 
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 236417e..754db6e 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -44,10 +44,12 @@ static struct strbuf fsck_msg_types = STRBUF_INIT;
 static int receive_unpack_limit = -1;
 static int transfer_unpack_limit = -1;
 static int advertise_atomic_push = 1;
+static int advertise_push_options = 1;
 static int unpack_limit = 100;
 static int report_status;
 static int use_sideband;
 static int use_atomic;
+static int use_push_options;
 static int quiet;
 static int prefer_ofs_delta = 1;
 static int auto_update_server_info;
@@ -193,6 +195,11 @@ static int receive_pack_config(const char *var, const char *value, void *cb)
 		return 0;
 	}
 
+	if (strcmp(var, "receive.advertisepushoptions") == 0) {
+		advertise_push_options = git_config_bool(var, value);
+		return 0;
+	}
+
 	return git_default_config(var, value, cb);
 }
 
@@ -211,6 +218,8 @@ static void show_ref(const char *path, const unsigned char *sha1)
 			strbuf_addstr(&cap, " ofs-delta");
 		if (push_cert_nonce)
 			strbuf_addf(&cap, " push-cert=%s", push_cert_nonce);
+		if (advertise_push_options)
+			strbuf_addstr(&cap, " push-options");
 		strbuf_addf(&cap, " agent=%s", git_user_agent_sanitized());
 		packet_write(1, "%s %s%c%s\n",
 			     sha1_to_hex(sha1), path, 0, cap.buf);
@@ -1455,6 +1464,9 @@ static struct command *read_head_info(struct sha1_array *shallow)
 			if (advertise_atomic_push
 			    && parse_feature_request(feature_list, "atomic"))
 				use_atomic = 1;
+			if (advertise_push_options
+			    && parse_feature_request(feature_list, "push-options"))
+				use_push_options = 1;
 		}
 
 		if (!strcmp(line, "push-cert")) {
@@ -1487,6 +1499,60 @@ static struct command *read_head_info(struct sha1_array *shallow)
 	return commands;
 }
 
+static struct string_list *read_push_options(void)
+{
+	int i;
+	struct string_list *ret = xmalloc(sizeof(*ret));
+	string_list_init(ret, 1);
+
+	/* NEEDSWORK: expose the limitations to be configurable. */
+	int max_options = 32;
+
+	/*
+	 * NEEDSWORK: expose the limitations to be configurable;
+	 * Once the limit can be lifted, include a way for payloads
+	 * larger than one pkt, e.g allow a payload of up to
+	 * LARGE_PACKET_MAX - 1 only, and reserve the last byte
+	 * to indicate whether the next pkt continues with this
+	 * push option.
+	 */
+	int max_size = 1024;
+
+	for (i = 0; i < max_options; i++) {
+		char *line;
+		int len;
+
+		line = packet_read_line(0, &len);
+
+		if (!line)
+			break;
+
+		if (len > max_size) {
+			/*
+			 * NEEDSWORK: The error message in die(..) is not
+			 * transmitted in call cases, so ideally all die(..)
+			 * calls are prefixed with rp_error and then we can
+			 * combine rp_error && die into one helper function.
+			 */
+			rp_error("protocol error: server configuration allows push "
+				 "options of size up to %d bytes", max_size);
+			die("protocol error: push options too large");
+		}
+
+		len = strcspn(line, "\n");
+		line[len] = '\0';
+
+		string_list_append(ret, line);
+	}
+	if (i == max_options) {
+		rp_error("protocol error: server configuration only allows up "
+		    "to %d push options", max_options);
+		die("protocol error: push options too large");
+	}
+
+	return ret;
+}
+
 static const char *parse_pack_header(struct pack_header *hdr)
 {
 	switch (read_pack_header(0, hdr)) {
@@ -1774,6 +1840,9 @@ int cmd_receive_pack(int argc, const char **argv, const char *prefix)
 		const char *unpack_status = NULL;
 		struct string_list *push_options = NULL;
 
+		if (use_push_options)
+			push_options = read_push_options();
+
 		prepare_shallow_info(&si, &shallow);
 		if (!si.nr_ours && !si.nr_theirs)
 			shallow_update = 0;
-- 
2.9.0.247.g176c4f7


^ permalink raw reply related	[flat|nested] 52+ messages in thread

* Re: [PATCH 2/4] receive-pack: implement advertising and receiving push options
  2016-07-08 22:46                           ` Jeff King
@ 2016-07-08 22:51                             ` Stefan Beller
  0 siblings, 0 replies; 52+ messages in thread
From: Stefan Beller @ 2016-07-08 22:51 UTC (permalink / raw)
  To: Jeff King
  Cc: Junio C Hamano, Jonathan Nieder, git, Eric Wong, Dan Wang,
	Dennis Kaarsemaker

On Fri, Jul 8, 2016 at 3:46 PM, Jeff King <peff@peff.net> wrote:
> Sorry, I meant converting die() into:
>
>   rp_error(...);
>   die(...);
>
> possibly via an rp_die() helper.  The existing rp_error() cases would
> remain untouched.

Oh I see!
That makes a lot of sense.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 2/4] receive-pack: implement advertising and receiving push options
  2016-07-08 22:43                         ` Stefan Beller
@ 2016-07-08 22:46                           ` Jeff King
  2016-07-08 22:51                             ` Stefan Beller
  0 siblings, 1 reply; 52+ messages in thread
From: Jeff King @ 2016-07-08 22:46 UTC (permalink / raw)
  To: Stefan Beller
  Cc: Junio C Hamano, Jonathan Nieder, git, Eric Wong, Dan Wang,
	Dennis Kaarsemaker

On Fri, Jul 08, 2016 at 03:43:27PM -0700, Stefan Beller wrote:

> On Fri, Jul 8, 2016 at 3:35 PM, Jeff King <peff@peff.net> wrote:
> >
> > Yes. I haven't been following the intermediate discussion and patches,
> > but I don't see anything wrong with the general design above. I think
> > you do need to use rp_error() to get the die message to the client for
> > non-ssh cases, though (that is a problem with other protocol-error
> > messages, too; I wonder if we should install a custom die handler, or
> > convert them all to some kind of rp_die()).
> 
> Some of the rp_error messages do not want to die(), but most seem to
> be ok when the rp_error would die.

Sorry, I meant converting die() into:

  rp_error(...);
  die(...);

possibly via an rp_die() helper.  The existing rp_error() cases would
remain untouched.

Installing a die handler could make that work automatically, though I
suspect it would lead to corner cases where we break protocol (e.g., if
we die() in the middle of writing out a packet).

-Peff

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 2/4] receive-pack: implement advertising and receiving push options
  2016-07-08 22:35                       ` Jeff King
@ 2016-07-08 22:43                         ` Stefan Beller
  2016-07-08 22:46                           ` Jeff King
  0 siblings, 1 reply; 52+ messages in thread
From: Stefan Beller @ 2016-07-08 22:43 UTC (permalink / raw)
  To: Jeff King
  Cc: Junio C Hamano, Jonathan Nieder, git, Eric Wong, Dan Wang,
	Dennis Kaarsemaker

On Fri, Jul 8, 2016 at 3:35 PM, Jeff King <peff@peff.net> wrote:
>
> Yes. I haven't been following the intermediate discussion and patches,
> but I don't see anything wrong with the general design above. I think
> you do need to use rp_error() to get the die message to the client for
> non-ssh cases, though (that is a problem with other protocol-error
> messages, too; I wonder if we should install a custom die handler, or
> convert them all to some kind of rp_die()).

Some of the rp_error messages do not want to die(), but most seem to
be ok when the rp_error would die.

I'll look into that.

>
> -Peff

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 2/4] receive-pack: implement advertising and receiving push options
  2016-07-08 22:29                     ` Stefan Beller
@ 2016-07-08 22:35                       ` Jeff King
  2016-07-08 22:43                         ` Stefan Beller
  0 siblings, 1 reply; 52+ messages in thread
From: Jeff King @ 2016-07-08 22:35 UTC (permalink / raw)
  To: Stefan Beller
  Cc: Junio C Hamano, Jonathan Nieder, git, Eric Wong, Dan Wang,
	Dennis Kaarsemaker

On Fri, Jul 08, 2016 at 03:29:09PM -0700, Stefan Beller wrote:

> > In the malicious case, the client says "I'll send you 10 push option
> > with an upper bound of 1024K", and then sends gigabytes anyway. Either
> > way the server has to react to what is sent, not what is promised.
> 
> Well that is what the initial patch did via:
> 
> +       for (i = 0; i < max_options; i++) {
> +               char *line;
> +               int len;
> +
> +               line = packet_read_line(0, &len);
> +
> +               if (!line)
> +                       break;
> +
> +               if (len > max_size)
> +                       die("protocol error: server configuration allows push "
> +                           "options of size up to %d bytes", max_size);
> +
> +               len = strcspn(line, "\n");
> +               line[len] = '\0';
> +
> +               string_list_append(ret, line);
> +       }
> +       if (i == max_options)
> +               die("protocol error: server configuration only allows up "
> +                   "to %d push options", max_options);
> 
> I assume the die is an effective way to "stop receiving".
> 
> Thinking further about what you said, I think the initial selections of
> max_size and max_options is sufficient and we only see those bounds in
> the malicious case.
> 
> This discussion rather makes me wonder if we want to stick to the initial design
> as it was easy and not overcomplicating things as we assume the abort case
> doesn't occur often.

Yes. I haven't been following the intermediate discussion and patches,
but I don't see anything wrong with the general design above. I think
you do need to use rp_error() to get the die message to the client for
non-ssh cases, though (that is a problem with other protocol-error
messages, too; I wonder if we should install a custom die handler, or
convert them all to some kind of rp_die()).

-Peff

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 2/4] receive-pack: implement advertising and receiving push options
  2016-07-08 22:21                   ` Jeff King
@ 2016-07-08 22:29                     ` Stefan Beller
  2016-07-08 22:35                       ` Jeff King
  0 siblings, 1 reply; 52+ messages in thread
From: Stefan Beller @ 2016-07-08 22:29 UTC (permalink / raw)
  To: Jeff King
  Cc: Junio C Hamano, Jonathan Nieder, git, Eric Wong, Dan Wang,
	Dennis Kaarsemaker

On Fri, Jul 8, 2016 at 3:21 PM, Jeff King <peff@peff.net> wrote:
> On Fri, Jul 08, 2016 at 03:17:13PM -0700, Stefan Beller wrote:
>
>> >If people are seeing these in
>> > routine use, then the limits are set too low, and this should happen
>> > roughly as often as a BUG assertion, and IMHO should be treated roughly
>> > the same: don't bother with translation, and don't worry about
>> > optimizing wasted bandwidth for this case. It won't happen enough to
>> > matter.
>>
>> Well the wasted band width is part of the server protection, no?
>
> Not if you stop receiving as soon as you hit the limits. Then of course
> they can send up to the limit each time, but that is not a DoS. That is
> things working as advertised.
>
>> This would favor the idea Jonathan came up with:
>>
>>     server: I advertise push options
>>     client: ok I want to use push options
>>     client: I'll send you 1000 push options with upper bound of 1000M
>>     server: It's a bit too much, eh?
>>     * server quits
>>
>> So this case only occurs for the (malicious?) corner case, where I
>> do not bother a translation.
>
> In the malicious case, the client says "I'll send you 10 push option
> with an upper bound of 1024K", and then sends gigabytes anyway. Either
> way the server has to react to what is sent, not what is promised.

Well that is what the initial patch did via:

+       for (i = 0; i < max_options; i++) {
+               char *line;
+               int len;
+
+               line = packet_read_line(0, &len);
+
+               if (!line)
+                       break;
+
+               if (len > max_size)
+                       die("protocol error: server configuration allows push "
+                           "options of size up to %d bytes", max_size);
+
+               len = strcspn(line, "\n");
+               line[len] = '\0';
+
+               string_list_append(ret, line);
+       }
+       if (i == max_options)
+               die("protocol error: server configuration only allows up "
+                   "to %d push options", max_options);

I assume the die is an effective way to "stop receiving".

Thinking further about what you said, I think the initial selections of
max_size and max_options is sufficient and we only see those bounds in
the malicious case.

This discussion rather makes me wonder if we want to stick to the initial design
as it was easy and not overcomplicating things as we assume the abort case
doesn't occur often.


>
> -Peff

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 2/4] receive-pack: implement advertising and receiving push options
  2016-07-08 22:17                 ` Stefan Beller
@ 2016-07-08 22:21                   ` Jeff King
  2016-07-08 22:29                     ` Stefan Beller
  0 siblings, 1 reply; 52+ messages in thread
From: Jeff King @ 2016-07-08 22:21 UTC (permalink / raw)
  To: Stefan Beller
  Cc: Junio C Hamano, Jonathan Nieder, git, Eric Wong, Dan Wang,
	Dennis Kaarsemaker

On Fri, Jul 08, 2016 at 03:17:13PM -0700, Stefan Beller wrote:

> >If people are seeing these in
> > routine use, then the limits are set too low, and this should happen
> > roughly as often as a BUG assertion, and IMHO should be treated roughly
> > the same: don't bother with translation, and don't worry about
> > optimizing wasted bandwidth for this case. It won't happen enough to
> > matter.
> 
> Well the wasted band width is part of the server protection, no?

Not if you stop receiving as soon as you hit the limits. Then of course
they can send up to the limit each time, but that is not a DoS. That is
things working as advertised.

> This would favor the idea Jonathan came up with:
> 
>     server: I advertise push options
>     client: ok I want to use push options
>     client: I'll send you 1000 push options with upper bound of 1000M
>     server: It's a bit too much, eh?
>     * server quits
> 
> So this case only occurs for the (malicious?) corner case, where I
> do not bother a translation.

In the malicious case, the client says "I'll send you 10 push option
with an upper bound of 1024K", and then sends gigabytes anyway. Either
way the server has to react to what is sent, not what is promised.

-Peff

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 2/4] receive-pack: implement advertising and receiving push options
  2016-07-08 21:46               ` Jeff King
@ 2016-07-08 22:17                 ` Stefan Beller
  2016-07-08 22:21                   ` Jeff King
  0 siblings, 1 reply; 52+ messages in thread
From: Stefan Beller @ 2016-07-08 22:17 UTC (permalink / raw)
  To: Jeff King
  Cc: Junio C Hamano, Jonathan Nieder, git, Eric Wong, Dan Wang,
	Dennis Kaarsemaker

On Fri, Jul 8, 2016 at 2:46 PM, Jeff King <peff@peff.net> wrote:
> On Fri, Jul 08, 2016 at 11:57:20AM -0700, Stefan Beller wrote:
>
>> >> Sorry to butt into the conversation late, but: I am not yet convinced.
>> >>
>> >> Is the idea that if the push options were very large, this would save
>> >> the client from the cost of sending them?
>> >
>> > Not really.  I have no strong opinion on the benefit of limiting
>> > number/size.  Stefan limited the number/size at the receiving end
>> > and made receiving end die with its message.
>>
>> Jeff claimed we'd need some sort of DoS protection for this feature,
>> so I considered just die-ing enough for an initial implementation.
>
> I do not think we need to worry too much about niceties for these
> limits. The point is to protect servers from malicious nonsense, like
> somebody sending gigabytes of push options, or trying to overflow a
> buffer in a hook with a large value.

Agreed. This would speak for keeping the implementation as is.

>If people are seeing these in
> routine use, then the limits are set too low, and this should happen
> roughly as often as a BUG assertion, and IMHO should be treated roughly
> the same: don't bother with translation, and don't worry about
> optimizing wasted bandwidth for this case. It won't happen enough to
> matter.

Well the wasted band width is part of the server protection, no?
This would favor the idea Jonathan came up with:

    server: I advertise push options
    client: ok I want to use push options
    client: I'll send you 1000 push options with upper bound of 1000M
    server: It's a bit too much, eh?
    * server quits

So this case only occurs for the (malicious?) corner case, where I
do not bother a translation.

But having the size announcement not in
the capability advertisement, but in the actual push options phase makes
sense to me as we do not want to clutter the capabilities with data that can
come later. We would only waste a little bit of band width, (the
initial ls-remote
and command list of the client).


Speaking of this, I can craft a malicious client that sends the
following command list

0000000000000000000000000000000000000000
0000000000000000000000000000000000000000
refs/heads/loooooooooooooooooong-ref
0000000000000000000000000000000000000000
0000000000000000000000000000000000000000
refs/heads/loooooooooooooooooong-ref
0000000000000000000000000000000000000000
0000000000000000000000000000000000000000
refs/heads/loooooooooooooooooong-ref
0000000000000000000000000000000000000000
0000000000000000000000000000000000000000
refs/heads/loooooooooooooooooong-ref
0000000000000000000000000000000000000000
0000000000000000000000000000000000000000
refs/heads/loooooooooooooooooong-ref
<repeat the above a few times>
0000

IIUC in the receive-pack code we would queue that up and the error checking
(two times null sha1? update of the same ref more than once?), is
done just after we send out the flush packet, i.e. when all commands
are received.

This would also result in sending gigabytes of junk as well as a
memory issue on the server
side?

The new push options design is actually neat in the way that the
client exactly says what it wants
and the server can reject early, but not cluttering the capability
advertisement.

Thanks,
Stefan

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 2/4] receive-pack: implement advertising and receiving push options
  2016-07-08 18:57             ` Stefan Beller
@ 2016-07-08 21:46               ` Jeff King
  2016-07-08 22:17                 ` Stefan Beller
  0 siblings, 1 reply; 52+ messages in thread
From: Jeff King @ 2016-07-08 21:46 UTC (permalink / raw)
  To: Stefan Beller
  Cc: Junio C Hamano, Jonathan Nieder, git, Eric Wong, Dan Wang,
	Dennis Kaarsemaker

On Fri, Jul 08, 2016 at 11:57:20AM -0700, Stefan Beller wrote:

> >> Sorry to butt into the conversation late, but: I am not yet convinced.
> >>
> >> Is the idea that if the push options were very large, this would save
> >> the client from the cost of sending them?
> >
> > Not really.  I have no strong opinion on the benefit of limiting
> > number/size.  Stefan limited the number/size at the receiving end
> > and made receiving end die with its message.
> 
> Jeff claimed we'd need some sort of DoS protection for this feature,
> so I considered just die-ing enough for an initial implementation.

I do not think we need to worry too much about niceties for these
limits. The point is to protect servers from malicious nonsense, like
somebody sending gigabytes of push options, or trying to overflow a
buffer in a hook with a large value. If people are seeing these in
routine use, then the limits are set too low, and this should happen
roughly as often as a BUG assertion, and IMHO should be treated roughly
the same: don't bother with translation, and don't worry about
optimizing wasted bandwidth for this case. It won't happen enough to
matter.

-Peff

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 2/4] receive-pack: implement advertising and receiving push options
  2016-07-08 18:39           ` Junio C Hamano
@ 2016-07-08 18:57             ` Stefan Beller
  2016-07-08 21:46               ` Jeff King
  0 siblings, 1 reply; 52+ messages in thread
From: Stefan Beller @ 2016-07-08 18:57 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Jonathan Nieder, git, Eric Wong, Jeff King, Dan Wang, Dennis Kaarsemaker

On Fri, Jul 8, 2016 at 11:39 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Jonathan Nieder <jrnieder@gmail.com> writes:
>
>> Hi,
>>
>> Junio C Hamano wrote:
>>> Stefan Beller <sbeller@google.com> writes:
>>
>>>>> More importantly, if we plan to make this configurable and not make
>>>>> the limit a hardwired constant of the wire protocol, it may be
>>>>> better to advertise push-options capability with the limit, e.g.
>>>>> "push-options=32" (or even "push-options=1024/32"), so that the
>>>>> client side can count and abort early?
>>
>> Sorry to butt into the conversation late, but: I am not yet convinced.
>>
>> Is the idea that if the push options were very large, this would save
>> the client from the cost of sending them?
>
> Not really.  I have no strong opinion on the benefit of limiting
> number/size.  Stefan limited the number/size at the receiving end
> and made receiving end die with its message.

Jeff claimed we'd need some sort of DoS protection for this feature,
so I considered just die-ing enough for an initial implementation.

> I was merely trying to
> tweak the arrangement so that the sending end can complain with its
> own message, possibly in its own language, especially because it was
> unclear to me if the die() message on the receiving end would always
> go back to the sending end correctly.

I'm currently implementing Jonathans suggestion as it seems to be a reasonable
trade off (client hasn't sent a lot of data when it is decided it
doesn't go through,
the server can complain with a reasonable error message, only downside: no
i18n localisation support on the client side as the server will
currently report the
error in English).

That method will make heavy use of rp_error that uses the side band for
communicating the actual error message.

>
>> But this comes with a
>> downside: the server doesn't get to send an error message about
>> where
>> the maximum number of push options can come from (e.g., with a link to
>> a page where the limit can be adjusted, or with an explanation of when
>> clients tend to run into this problem and what they should do
>> instead).
>
> Hmm, interesting point.  That would be better told by the receiving
> end, as the way to configure it (if offered) would be different from
> installation to installation.
>
>> So I'd like to propose an alternative. What if the client tells the
>> server the number of push options early on (and possibly also a cap on
>> the length of those push options)?  That way, the client doesn't have
>> to waste bit sending the push options but the server gets an
>> opportunity to send a helpful error message on sideband 3.
>>
>>       server> HEAD\0push-options ...
>>       client> ... commands ...
>>       client> push-options 2
>>       client> my-first-option
>>       client> my-second-option
>

Another (slightly offtopic) observation:
If in the future we'll need to transmit push options >64k, instead of
splitting the push option to multiple packets, we could invent "large
packets". The current upper bound for packets is artificailly low, such
that the server is able to interleave sideband information with the actual
data in a fetch and have the client display the progress in a timely manner.
When pushing to the server we'd not need to have progress information
(the server doesn't care, and the client knows the size it is pushing).

As of today a packet consists of 4 bytes (hex characters) to indicate
the length and then the payload follows. So instead we could transmit
"v" (that is not a hex character) followed by a variable length integer for
the length and then the payload which has no upper bound.

In the release notes for 2.3 you wrote:
> * "git push" and "git fetch" did not communicate an overlong refname
>   correctly.  Now it uses 64kB sideband to accommodate longer ones.

That could also make use of these "large packets" instead.

Thanks,
Stefan

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 2/4] receive-pack: implement advertising and receiving push options
  2016-07-08 17:58         ` Jonathan Nieder
@ 2016-07-08 18:39           ` Junio C Hamano
  2016-07-08 18:57             ` Stefan Beller
  0 siblings, 1 reply; 52+ messages in thread
From: Junio C Hamano @ 2016-07-08 18:39 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: Stefan Beller, git, Eric Wong, Jeff King, Dan Wang, Dennis Kaarsemaker

Jonathan Nieder <jrnieder@gmail.com> writes:

> Hi,
>
> Junio C Hamano wrote:
>> Stefan Beller <sbeller@google.com> writes:
>
>>>> More importantly, if we plan to make this configurable and not make
>>>> the limit a hardwired constant of the wire protocol, it may be
>>>> better to advertise push-options capability with the limit, e.g.
>>>> "push-options=32" (or even "push-options=1024/32"), so that the
>>>> client side can count and abort early?
>
> Sorry to butt into the conversation late, but: I am not yet convinced.
>
> Is the idea that if the push options were very large, this would save
> the client from the cost of sending them?  

Not really.  I have no strong opinion on the benefit of limiting
number/size.  Stefan limited the number/size at the receiving end
and made receiving end die with its message.  I was merely trying to
tweak the arrangement so that the sending end can complain with its
own message, possibly in its own language, especially because it was
unclear to me if the die() message on the receiving end would always
go back to the sending end correctly.

> But this comes with a
> downside: the server doesn't get to send an error message about
> where 
> the maximum number of push options can come from (e.g., with a link to
> a page where the limit can be adjusted, or with an explanation of when
> clients tend to run into this problem and what they should do
> instead).

Hmm, interesting point.  That would be better told by the receiving
end, as the way to configure it (if offered) would be different from
installation to installation.

> So I'd like to propose an alternative. What if the client tells the
> server the number of push options early on (and possibly also a cap on
> the length of those push options)?  That way, the client doesn't have
> to waste bit sending the push options but the server gets an
> opportunity to send a helpful error message on sideband 3.
>
> 	server> HEAD\0push-options ...
> 	client> ... commands ...
> 	client> push-options 2
> 	client> my-first-option
> 	client> my-second-option


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 2/4] receive-pack: implement advertising and receiving push options
  2016-07-07 22:06       ` Junio C Hamano
@ 2016-07-08 17:58         ` Jonathan Nieder
  2016-07-08 18:39           ` Junio C Hamano
  0 siblings, 1 reply; 52+ messages in thread
From: Jonathan Nieder @ 2016-07-08 17:58 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Stefan Beller, git, Eric Wong, Jeff King, Dan Wang, Dennis Kaarsemaker

Hi,

Junio C Hamano wrote:
> Stefan Beller <sbeller@google.com> writes:

>>> More importantly, if we plan to make this configurable and not make
>>> the limit a hardwired constant of the wire protocol, it may be
>>> better to advertise push-options capability with the limit, e.g.
>>> "push-options=32" (or even "push-options=1024/32"), so that the
>>> client side can count and abort early?

Sorry to butt into the conversation late, but: I am not yet convinced.

Is the idea that if the push options were very large, this would save
the client from the cost of sending them?  But this comes with a
downside: the server doesn't get to send an error message about where
the maximum number of push options can come from (e.g., with a link to
a page where the limit can be adjusted, or with an explanation of when
clients tend to run into this problem and what they should do
instead).

So I'd like to propose an alternative. What if the client tells the
server the number of push options early on (and possibly also a cap on
the length of those push options)?  That way, the client doesn't have
to waste bit sending the push options but the server gets an
opportunity to send a helpful error message on sideband 3.

	server> HEAD\0push-options ...
	client> ... commands ...
	client> push-options 2
	client> my-first-option
	client> my-second-option

Thanks,
Jonathan

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 2/4] receive-pack: implement advertising and receiving push options
  2016-07-07 22:06         ` Stefan Beller
@ 2016-07-07 22:09           ` Jeff King
  0 siblings, 0 replies; 52+ messages in thread
From: Jeff King @ 2016-07-07 22:09 UTC (permalink / raw)
  To: Stefan Beller
  Cc: Junio C Hamano, git, Eric Wong, Dan Wang, Dennis Kaarsemaker

On Thu, Jul 07, 2016 at 03:06:31PM -0700, Stefan Beller wrote:

> > The problem is for hosting sites which serve repositories via git-daemon
> > from untrusted users who have real shell accounts (e.g., you set up
> > git-daemon to run as the "daemon" user serving repositories out of
> > people's home directories; you don't want users to escalate their shell
> > access into running arbitrary code as "daemon").
> 
> I think you would want to lock down the
> hosting site as much as possible and not put untrusted users home
> directories on there? So it is hard for me to imagine you'd go for such a setup
> in practice.

Sure, I think that's a good way to run a hosting site, too. But it
doesn't mean people don't have other needs. kernel.org was run as I
mentioned above for many years.

Another related case: you have a multi-user server where Alice might run
"git fetch server:~bob/repo.git". Alice does not want to run arbitrary
code based on what is in Bob's repo.git. Even if they are in the same
company, it is a poor security practice.

> > But I don't think that case applies here. That is about running
> > upload-pack on an untrusted repository, but your changes here are part
> > of receive-pack. In such a scenario, users should be pushing as
> > themselves via ssh. And if they are not (e.g., the admin set up
> > push-over-smart-http centrally), they are already screwed, as a
> > malicious user could just set up a pre-receive hook.
> 
> I hear that as: "The pre-receive hook itself can do much more
> damage than an oversized push option payload".

Exactly. Or more to the point: we promise nothing here except for
upload-pack, so changes to receive-pack do not have to worry about this
issue at all.

-Peff

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 2/4] receive-pack: implement advertising and receiving push options
  2016-07-07 21:41     ` Stefan Beller
  2016-07-07 21:56       ` Jeff King
@ 2016-07-07 22:06       ` Junio C Hamano
  2016-07-08 17:58         ` Jonathan Nieder
  1 sibling, 1 reply; 52+ messages in thread
From: Junio C Hamano @ 2016-07-07 22:06 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, Eric Wong, Jeff King, Dan Wang, Dennis Kaarsemaker

Stefan Beller <sbeller@google.com> writes:

> No, there is no good objective reason. I added it just after the atomic
> flag as that is what I implemented.
>
> Is there a reason for a particular order of capabilities? I always considered
> it a set of strings, i.e. any order is valid and there is no preference in
> which way to put it.

That is correct, but "there is no inherent order or grouping" does
not lead to "hence it is OK to put a new thing at a random place in
the middle."

If a reviewer sees a new thing at some specific point in a
collection, when the collection is understood not to have any
specific order or grouping, it makes the reviewer doubt the
belief--there must be a reason why this was placed not at the end;
otherwise a new thing wouldn't be placed randomly at the middle.

If you place a new thing at the end, it still leaves ambiguity; it
may be there because there is no inherent order or grouping, or the
new one is sorts later than the one that used to be at the last or
somehow related to it.  

> I agree on this for all content that can be modified by the user
> (e.g. files in the work tree such as .gitmodules), but the .git/config
> file cannot be changed remotely. So I wonder how an attack would
> look like for a hosting provider or anyone else?
> We still rely on a sane system and trust /etc/gitconfig
> so we do trust the host/admin but not the user?

It is not "we" do trust; it is "we let host/admin trust itself while
making sure that they can protect themselves from their users".

>> More importantly, if we plan to make this configurable and not make
>> the limit a hardwired constant of the wire protocol, it may be
>> better to advertise push-options capability with the limit, e.g.
>> "push-options=32" (or even "push-options=1024/32"), so that the
>> client side can count and abort early?
>
> Yeah we may want to start out with a strict format here indicating
> the parameters used for evaluating the size.
> So what do these numbers mean?
>
> I assume (and hence I should document that,) that the first (1024)
> is the maximum number of bytes per push option. The second
> number (32) is the number of push options (not the number of pkts,
> as one push option may take more than one pkt if the first number is
> larger than 65k, see the NEEDSWORK comment.)
>
> Do we really need 2 numbers, or could we just have one number
> describing the maximum total size in bytes before the remote rejects
> the connection?

That's for you to decide.  push-options=32 is probably OK but it
will prevent a well-behaving "git push" from catching a request that
will be rejected, if you are basing the receiver side decision on
the other number.

The suggestion was primarily my reaction after seeing the two new
calls to die() on the receiver side, whose message I was not sure
will be given to the user who is pushing, i.e.

> When not going over ssh://, does the user see these messages?

Your "git push" will be collecting the options in a string-list
while parsing the options, so it would be able to check immediately
upon seeing the advertised capability if it will trigger this
protocol error even before making the request, which would be a good
thing to do, and I am reasonably sure we can give a better error
message if we do that on the local side without having the receiver
to tell you (for one thing, we can i18n the local side error
message more easily).


^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 2/4] receive-pack: implement advertising and receiving push options
  2016-07-07 21:56       ` Jeff King
@ 2016-07-07 22:06         ` Stefan Beller
  2016-07-07 22:09           ` Jeff King
  0 siblings, 1 reply; 52+ messages in thread
From: Stefan Beller @ 2016-07-07 22:06 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, git, Eric Wong, Dan Wang, Dennis Kaarsemaker

On Thu, Jul 7, 2016 at 2:56 PM, Jeff King <peff@peff.net> wrote:
> On Thu, Jul 07, 2016 at 02:41:37PM -0700, Stefan Beller wrote:
>
>> >> +     /* NEEDSWORK: expose the limitations to be configurable. */
>> >> +     int max_options = 32;
>> >> +
>> >> +     /*
>> >> +      * NEEDSWORK: expose the limitations to be configurable;
>> >> +      * Once the limit can be lifted, include a way for payloads
>> >> +      * larger than one pkt, e.g allow a payload of up to
>> >> +      * LARGE_PACKET_MAX - 1 only, and reserve the last byte
>> >> +      * to indicate whether the next pkt continues with this
>> >> +      * push option.
>> >> +      */
>> >> +     int max_size = 1024;
>> >
>> > Good NEEDSWORK comments; perhaps also hint that the configuration
>> > must not come from the repository level configuration file (i.e.
>> > Peff's "scoped configuration" from jk/upload-pack-hook topic)?
>>
>> Ok, I reviewed that series. It is unclear to me how the attack would
>> actually look like in that case.
>>
>> In 20b20a22f8f Jeff writes:
>> > Because we promise that
>> > upload-pack is safe to run in an untrusted repository, we
>> > cannot execute arbitrary code or commands found in the
>> > repository (neither in hooks/, nor in the config).
>>
>> I agree on this for all content that can be modified by the user
>> (e.g. files in the work tree such as .gitmodules), but the .git/config
>> file cannot be changed remotely. So I wonder how an attack would
>> look like for a hosting provider or anyone else?
>> We still rely on a sane system and trust /etc/gitconfig
>> so we do trust the host/admin but not the user?
>
> The problem is for hosting sites which serve repositories via git-daemon
> from untrusted users who have real shell accounts (e.g., you set up
> git-daemon to run as the "daemon" user serving repositories out of
> people's home directories; you don't want users to escalate their shell
> access into running arbitrary code as "daemon").

I think you would want to lock down the
hosting site as much as possible and not put untrusted users home
directories on there? So it is hard for me to imagine you'd go for such a setup
in practice.

>
> But I don't think that case applies here. That is about running
> upload-pack on an untrusted repository, but your changes here are part
> of receive-pack. In such a scenario, users should be pushing as
> themselves via ssh. And if they are not (e.g., the admin set up
> push-over-smart-http centrally), they are already screwed, as a
> malicious user could just set up a pre-receive hook.

I hear that as: "The pre-receive hook itself can do much more
damage than an oversized push option payload".

OK.

>
> IOW, we promise only that upload-pack is safe to run an untrusted repo,
> but not receive-pack.
>
> -Peff

Thanks,
Stefan

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 2/4] receive-pack: implement advertising and receiving push options
  2016-07-07 21:41     ` Stefan Beller
@ 2016-07-07 21:56       ` Jeff King
  2016-07-07 22:06         ` Stefan Beller
  2016-07-07 22:06       ` Junio C Hamano
  1 sibling, 1 reply; 52+ messages in thread
From: Jeff King @ 2016-07-07 21:56 UTC (permalink / raw)
  To: Stefan Beller
  Cc: Junio C Hamano, git, Eric Wong, Dan Wang, Dennis Kaarsemaker

On Thu, Jul 07, 2016 at 02:41:37PM -0700, Stefan Beller wrote:

> >> +     /* NEEDSWORK: expose the limitations to be configurable. */
> >> +     int max_options = 32;
> >> +
> >> +     /*
> >> +      * NEEDSWORK: expose the limitations to be configurable;
> >> +      * Once the limit can be lifted, include a way for payloads
> >> +      * larger than one pkt, e.g allow a payload of up to
> >> +      * LARGE_PACKET_MAX - 1 only, and reserve the last byte
> >> +      * to indicate whether the next pkt continues with this
> >> +      * push option.
> >> +      */
> >> +     int max_size = 1024;
> >
> > Good NEEDSWORK comments; perhaps also hint that the configuration
> > must not come from the repository level configuration file (i.e.
> > Peff's "scoped configuration" from jk/upload-pack-hook topic)?
> 
> Ok, I reviewed that series. It is unclear to me how the attack would
> actually look like in that case.
> 
> In 20b20a22f8f Jeff writes:
> > Because we promise that
> > upload-pack is safe to run in an untrusted repository, we
> > cannot execute arbitrary code or commands found in the
> > repository (neither in hooks/, nor in the config).
> 
> I agree on this for all content that can be modified by the user
> (e.g. files in the work tree such as .gitmodules), but the .git/config
> file cannot be changed remotely. So I wonder how an attack would
> look like for a hosting provider or anyone else?
> We still rely on a sane system and trust /etc/gitconfig
> so we do trust the host/admin but not the user?

The problem is for hosting sites which serve repositories via git-daemon
from untrusted users who have real shell accounts (e.g., you set up
git-daemon to run as the "daemon" user serving repositories out of
people's home directories; you don't want users to escalate their shell
access into running arbitrary code as "daemon").

But I don't think that case applies here. That is about running
upload-pack on an untrusted repository, but your changes here are part
of receive-pack. In such a scenario, users should be pushing as
themselves via ssh. And if they are not (e.g., the admin set up
push-over-smart-http centrally), they are already screwed, as a
malicious user could just set up a pre-receive hook.

IOW, we promise only that upload-pack is safe to run an untrusted repo,
but not receive-pack.

-Peff

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 2/4] receive-pack: implement advertising and receiving push options
  2016-07-07 20:37   ` Junio C Hamano
@ 2016-07-07 21:41     ` Stefan Beller
  2016-07-07 21:56       ` Jeff King
  2016-07-07 22:06       ` Junio C Hamano
  0 siblings, 2 replies; 52+ messages in thread
From: Stefan Beller @ 2016-07-07 21:41 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Eric Wong, Jeff King, Dan Wang, Dennis Kaarsemaker

On Thu, Jul 7, 2016 at 1:37 PM, Junio C Hamano <gitster@pobox.com> wrote:
>
>> @@ -207,6 +214,8 @@ static void show_ref(const char *path, const unsigned char *sha1)
>>                             "report-status delete-refs side-band-64k quiet");
>>               if (advertise_atomic_push)
>>                       strbuf_addstr(&cap, " atomic");
>> +             if (advertise_push_options)
>> +                     strbuf_addstr(&cap, " push-options");
>>               if (prefer_ofs_delta)
>>                       strbuf_addstr(&cap, " ofs-delta");
>>               if (push_cert_nonce)
>
> Hmph, was there a good reason to add it in the middle (contrast to
> the previous addition to the "only possible values are..."
> enumeration)?

No, there is no good objective reason. I added it just after the atomic
flag as that is what I implemented.

Is there a reason for a particular order of capabilities? I always considered
it a set of strings, i.e. any order is valid and there is no preference in
which way to put it.

>
>> +static struct string_list *read_push_options()
>
> static struct string_list *read_push_options(void)
>
>> +{
>> +     int i;
>> +     struct string_list *ret = xmalloc(sizeof(*ret));
>> +     string_list_init(ret, 1);
>> +
>> +     /* NEEDSWORK: expose the limitations to be configurable. */
>> +     int max_options = 32;
>> +
>> +     /*
>> +      * NEEDSWORK: expose the limitations to be configurable;
>> +      * Once the limit can be lifted, include a way for payloads
>> +      * larger than one pkt, e.g allow a payload of up to
>> +      * LARGE_PACKET_MAX - 1 only, and reserve the last byte
>> +      * to indicate whether the next pkt continues with this
>> +      * push option.
>> +      */
>> +     int max_size = 1024;
>
> Good NEEDSWORK comments; perhaps also hint that the configuration
> must not come from the repository level configuration file (i.e.
> Peff's "scoped configuration" from jk/upload-pack-hook topic)?

Ok, I reviewed that series. It is unclear to me how the attack would
actually look like in that case.

In 20b20a22f8f Jeff writes:
> Because we promise that
> upload-pack is safe to run in an untrusted repository, we
> cannot execute arbitrary code or commands found in the
> repository (neither in hooks/, nor in the config).

I agree on this for all content that can be modified by the user
(e.g. files in the work tree such as .gitmodules), but the .git/config
file cannot be changed remotely. So I wonder how an attack would
look like for a hosting provider or anyone else?
We still rely on a sane system and trust /etc/gitconfig
so we do trust the host/admin but not the user?

>
>> +     for (i = 0; i < max_options; i++) {
>> +             char *line;
>> +             int len;
>> +
>> +             line = packet_read_line(0, &len);
>> +
>> +             if (!line)
>> +                     break;
>> +
>> +             if (len > max_size)
>> +                     die("protocol error: server configuration allows push "
>> +                         "options of size up to %d bytes", max_size);
>> +
>> +             len = strcspn(line, "\n");
>> +             line[len] = '\0';
>> +
>> +             string_list_append(ret, line);
>> +     }
>> +     if (i == max_options)
>> +             die("protocol error: server configuration only allows up "
>> +                 "to %d push options", max_options);
>
> When not going over ssh://, does the user sees these messages?
>
> More importantly, if we plan to make this configurable and not make
> the limit a hardwired constant of the wire protocol, it may be
> better to advertise push-options capability with the limit, e.g.
> "push-options=32" (or even "push-options=1024/32"), so that the
> client side can count and abort early?

Yeah we may want to start out with a strict format here indicating
the parameters used for evaluating the size.
So what do these numbers mean?

I assume (and hence I should document that,) that the first (1024)
is the maximum number of bytes per push option. The second
number (32) is the number of push options (not the number of pkts,
as one push option may take more than one pkt if the first number is
larger than 65k, see the NEEDSWORK comment.)

Do we really need 2 numbers, or could we just have one number
describing the maximum total size in bytes before the remote rejects
the connection?

>
> I wondered how well the extra flush works with the extra framing
> smart-http does to wrap the wire protocol; as I do not see any
> change to the http side, I'd assume that there is no issue.

That's a dangerous assumption of yours, as I did not test the
https side, yet.

>
>> +
>> +     return ret;
>> +}
>> +
>>  static const char *parse_pack_header(struct pack_header *hdr)
>>  {
>>       switch (read_pack_header(0, hdr)) {
>> @@ -1773,6 +1829,9 @@ int cmd_receive_pack(int argc, const char **argv, const char *prefix)
>>               const char *unpack_status = NULL;
>>               struct string_list *push_options = NULL;
>>
>> +             if (use_push_options)
>> +                     push_options = read_push_options();
>> +
>>               prepare_shallow_info(&si, &shallow);
>>               if (!si.nr_ours && !si.nr_theirs)
>>                       shallow_update = 0;
>

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH 2/4] receive-pack: implement advertising and receiving push options
  2016-07-07  1:12 ` [PATCH 2/4] receive-pack: implement advertising and receiving push options Stefan Beller
@ 2016-07-07 20:37   ` Junio C Hamano
  2016-07-07 21:41     ` Stefan Beller
  0 siblings, 1 reply; 52+ messages in thread
From: Junio C Hamano @ 2016-07-07 20:37 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, e, peff, dwwang, dennis

Stefan Beller <sbeller@google.com> writes:

> While documenting
> this, fix a nit in the `receive.advertiseAtomic` wording.
>  
>  receive.advertiseAtomic::
>  	By default, git-receive-pack will advertise the atomic push
> -	capability to its clients. If you don't want to this capability
> +	capability to its clients. If you don't want this capability
> +	to be advertised, set this variable to false.
> +
> +receive.advertisePushOptions::
> +	By default, git-receive-pack will advertise the push options capability
> +	to its clients. If you don't want this capability
>  	to be advertised, set this variable to false.

I think we correcting the nit by avoiding passive voice, i.e.

	If you don't want to advertise this capability, set this
	variable to false.

would make it easier to read.

>  in packet-line format to the client, followed by a flush-pkt.  The only
>  real difference is that the capability listing is different - the only
> -possible values are 'report-status', 'delete-refs' and 'ofs-delta'.
> +possible values are 'report-status', 'delete-refs', 'ofs-delta' and
> +'push-options'.

OK.

> +push-options
> +------------
> +
> +If the server sends the 'push-options' capability it is capable to accept

Two nits:

 - A comma would make it easier to read.
 - "capable" goes with "of <gerund>", while "able" goes with "to <infinitive>".

i.e. "... capability, it is capable of accepting..."

> +push options after the update commands have been sent. If the pushing client
> +requests this capability, the server will pass the options to the pre and post
> +receive hooks that process this push request.

Missing dashes, i.e. "pre- and post-receive hooks"?

> @@ -207,6 +214,8 @@ static void show_ref(const char *path, const unsigned char *sha1)
>  			      "report-status delete-refs side-band-64k quiet");
>  		if (advertise_atomic_push)
>  			strbuf_addstr(&cap, " atomic");
> +		if (advertise_push_options)
> +			strbuf_addstr(&cap, " push-options");
>  		if (prefer_ofs_delta)
>  			strbuf_addstr(&cap, " ofs-delta");
>  		if (push_cert_nonce)

Hmph, was there a good reason to add it in the middle (contrast to
the previous addition to the "only possible values are..."
enumeration)?

> +static struct string_list *read_push_options()

static struct string_list *read_push_options(void)

> +{
> +	int i;
> +	struct string_list *ret = xmalloc(sizeof(*ret));
> +	string_list_init(ret, 1);
> +
> +	/* NEEDSWORK: expose the limitations to be configurable. */
> +	int max_options = 32;
> +
> +	/*
> +	 * NEEDSWORK: expose the limitations to be configurable;
> +	 * Once the limit can be lifted, include a way for payloads
> +	 * larger than one pkt, e.g allow a payload of up to
> +	 * LARGE_PACKET_MAX - 1 only, and reserve the last byte
> +	 * to indicate whether the next pkt continues with this
> +	 * push option.
> +	 */
> +	int max_size = 1024;

Good NEEDSWORK comments; perhaps also hint that the configuration
must not come from the repository level configuration file (i.e.
Peff's "scoped configuration" from jk/upload-pack-hook topic)?

> +	for (i = 0; i < max_options; i++) {
> +		char *line;
> +		int len;
> +
> +		line = packet_read_line(0, &len);
> +
> +		if (!line)
> +			break;
> +
> +		if (len > max_size)
> +			die("protocol error: server configuration allows push "
> +			    "options of size up to %d bytes", max_size);
> +
> +		len = strcspn(line, "\n");
> +		line[len] = '\0';
> +
> +		string_list_append(ret, line);
> +	}
> +	if (i == max_options)
> +		die("protocol error: server configuration only allows up "
> +		    "to %d push options", max_options);

When not going over ssh://, does the user sees these messages?

More importantly, if we plan to make this configurable and not make
the limit a hardwired constant of the wire protocol, it may be
better to advertise push-options capability with the limit, e.g.
"push-options=32" (or even "push-options=1024/32"), so that the
client side can count and abort early?

I wondered how well the extra flush works with the extra framing
smart-http does to wrap the wire protocol; as I do not see any
change to the http side, I'd assume that there is no issue.

> +
> +	return ret;
> +}
> +
>  static const char *parse_pack_header(struct pack_header *hdr)
>  {
>  	switch (read_pack_header(0, hdr)) {
> @@ -1773,6 +1829,9 @@ int cmd_receive_pack(int argc, const char **argv, const char *prefix)
>  		const char *unpack_status = NULL;
>  		struct string_list *push_options = NULL;
>  
> +		if (use_push_options)
> +			push_options = read_push_options();
> +
>  		prepare_shallow_info(&si, &shallow);
>  		if (!si.nr_ours && !si.nr_theirs)
>  			shallow_update = 0;


^ permalink raw reply	[flat|nested] 52+ messages in thread

* [PATCH 2/4] receive-pack: implement advertising and receiving push options
  2016-07-07  1:12 [PATCHv3 " Stefan Beller
@ 2016-07-07  1:12 ` Stefan Beller
  2016-07-07 20:37   ` Junio C Hamano
  0 siblings, 1 reply; 52+ messages in thread
From: Stefan Beller @ 2016-07-07  1:12 UTC (permalink / raw)
  To: gitster; +Cc: git, e, peff, dwwang, dennis, Stefan Beller

The pre/post receive hook may be interested in more information from the
user. This information can be transmitted when both client and server
support the "push-options" capability, which when used is a phase directly
after update commands ended by a flush pkt.

Similar to the atomic option, the server capability can be disabled via
the `receive.advertisePushOptions` config variable. While documenting
this, fix a nit in the `receive.advertiseAtomic` wording.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 Documentation/config.txt                          |  7 ++-
 Documentation/technical/pack-protocol.txt         | 10 ++--
 Documentation/technical/protocol-capabilities.txt |  8 +++
 builtin/receive-pack.c                            | 59 +++++++++++++++++++++++
 4 files changed, 79 insertions(+), 5 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index e208af1..df1b314 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2410,7 +2410,12 @@ rebase.instructionFormat
 
 receive.advertiseAtomic::
 	By default, git-receive-pack will advertise the atomic push
-	capability to its clients. If you don't want to this capability
+	capability to its clients. If you don't want this capability
+	to be advertised, set this variable to false.
+
+receive.advertisePushOptions::
+	By default, git-receive-pack will advertise the push options capability
+	to its clients. If you don't want this capability
 	to be advertised, set this variable to false.
 
 receive.autogc::
diff --git a/Documentation/technical/pack-protocol.txt b/Documentation/technical/pack-protocol.txt
index 8b36343..7a2ed30 100644
--- a/Documentation/technical/pack-protocol.txt
+++ b/Documentation/technical/pack-protocol.txt
@@ -454,7 +454,8 @@ The reference discovery phase is done nearly the same way as it is in the
 fetching protocol. Each reference obj-id and name on the server is sent
 in packet-line format to the client, followed by a flush-pkt.  The only
 real difference is that the capability listing is different - the only
-possible values are 'report-status', 'delete-refs' and 'ofs-delta'.
+possible values are 'report-status', 'delete-refs', 'ofs-delta' and
+'push-options'.
 
 Reference Update Request and Packfile Transfer
 ----------------------------------------------
@@ -465,9 +466,10 @@ that it wants to update, it sends a line listing the obj-id currently on
 the server, the obj-id the client would like to update it to and the name
 of the reference.
 
-This list is followed by a flush-pkt and then the packfile that should
-contain all the objects that the server will need to complete the new
-references.
+This list is followed by a flush-pkt. Then the push options are transmitted
+one per packet followed by another flush-pkt. After that the packfile that
+should contain all the objects that the server will need to complete the new
+references will be sent.
 
 ----
   update-request    =  *shallow ( command-list | push-cert ) [packfile]
diff --git a/Documentation/technical/protocol-capabilities.txt b/Documentation/technical/protocol-capabilities.txt
index eaab6b4..b71eda9 100644
--- a/Documentation/technical/protocol-capabilities.txt
+++ b/Documentation/technical/protocol-capabilities.txt
@@ -253,6 +253,14 @@ atomic pushes. If the pushing client requests this capability, the server
 will update the refs in one atomic transaction. Either all refs are
 updated or none.
 
+push-options
+------------
+
+If the server sends the 'push-options' capability it is capable to accept
+push options after the update commands have been sent. If the pushing client
+requests this capability, the server will pass the options to the pre and post
+receive hooks that process this push request.
+
 allow-tip-sha1-in-want
 ----------------------
 
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index edbf81e..e71041a 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -44,10 +44,12 @@ static struct strbuf fsck_msg_types = STRBUF_INIT;
 static int receive_unpack_limit = -1;
 static int transfer_unpack_limit = -1;
 static int advertise_atomic_push = 1;
+static int advertise_push_options = 1;
 static int unpack_limit = 100;
 static int report_status;
 static int use_sideband;
 static int use_atomic;
+static int use_push_options;
 static int quiet;
 static int prefer_ofs_delta = 1;
 static int auto_update_server_info;
@@ -193,6 +195,11 @@ static int receive_pack_config(const char *var, const char *value, void *cb)
 		return 0;
 	}
 
+	if (strcmp(var, "receive.advertisepushoptions") == 0) {
+		advertise_push_options = git_config_bool(var, value);
+		return 0;
+	}
+
 	return git_default_config(var, value, cb);
 }
 
@@ -207,6 +214,8 @@ static void show_ref(const char *path, const unsigned char *sha1)
 			      "report-status delete-refs side-band-64k quiet");
 		if (advertise_atomic_push)
 			strbuf_addstr(&cap, " atomic");
+		if (advertise_push_options)
+			strbuf_addstr(&cap, " push-options");
 		if (prefer_ofs_delta)
 			strbuf_addstr(&cap, " ofs-delta");
 		if (push_cert_nonce)
@@ -1454,6 +1463,9 @@ static struct command *read_head_info(struct sha1_array *shallow)
 			if (advertise_atomic_push
 			    && parse_feature_request(feature_list, "atomic"))
 				use_atomic = 1;
+			if (advertise_push_options
+			    && parse_feature_request(feature_list, "push-options"))
+				use_push_options = 1;
 		}
 
 		if (!strcmp(line, "push-cert")) {
@@ -1486,6 +1498,50 @@ static struct command *read_head_info(struct sha1_array *shallow)
 	return commands;
 }
 
+static struct string_list *read_push_options()
+{
+	int i;
+	struct string_list *ret = xmalloc(sizeof(*ret));
+	string_list_init(ret, 1);
+
+	/* NEEDSWORK: expose the limitations to be configurable. */
+	int max_options = 32;
+
+	/*
+	 * NEEDSWORK: expose the limitations to be configurable;
+	 * Once the limit can be lifted, include a way for payloads
+	 * larger than one pkt, e.g allow a payload of up to
+	 * LARGE_PACKET_MAX - 1 only, and reserve the last byte
+	 * to indicate whether the next pkt continues with this
+	 * push option.
+	 */
+	int max_size = 1024;
+
+	for (i = 0; i < max_options; i++) {
+		char *line;
+		int len;
+
+		line = packet_read_line(0, &len);
+
+		if (!line)
+			break;
+
+		if (len > max_size)
+			die("protocol error: server configuration allows push "
+			    "options of size up to %d bytes", max_size);
+
+		len = strcspn(line, "\n");
+		line[len] = '\0';
+
+		string_list_append(ret, line);
+	}
+	if (i == max_options)
+		die("protocol error: server configuration only allows up "
+		    "to %d push options", max_options);
+
+	return ret;
+}
+
 static const char *parse_pack_header(struct pack_header *hdr)
 {
 	switch (read_pack_header(0, hdr)) {
@@ -1773,6 +1829,9 @@ int cmd_receive_pack(int argc, const char **argv, const char *prefix)
 		const char *unpack_status = NULL;
 		struct string_list *push_options = NULL;
 
+		if (use_push_options)
+			push_options = read_push_options();
+
 		prepare_shallow_info(&si, &shallow);
 		if (!si.nr_ours && !si.nr_theirs)
 			shallow_update = 0;
-- 
2.9.0.141.gd59d3e9.dirty


^ permalink raw reply related	[flat|nested] 52+ messages in thread

end of thread, other threads:[~2016-07-14 21:50 UTC | newest]

Thread overview: 52+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-06-30  0:59 [RFC PATCHv1 0/4] Push options in C Git Stefan Beller
2016-06-30  0:59 ` [PATCH 1/4] push options: {pre,post}-receive hook learns about push options Stefan Beller
2016-07-01  7:14   ` Jeff King
2016-07-01 17:20     ` Stefan Beller
2016-07-01 17:59       ` Jeff King
2016-07-01 18:03         ` Junio C Hamano
2016-07-01 18:11           ` Jeff King
2016-07-01 19:25             ` Junio C Hamano
2016-07-01 19:31               ` Stefan Beller
2016-07-01 19:39               ` Jeff King
2016-07-01 19:50                 ` Stefan Beller
2016-07-01 18:40         ` Stefan Beller
2016-06-30  0:59 ` [PATCH 2/4] receive-pack: implement advertising and receiving " Stefan Beller
2016-07-01 17:11   ` Junio C Hamano
2016-07-01 17:24     ` Stefan Beller
2016-06-30  0:59 ` [PATCH 3/4] push: accept " Stefan Beller
2016-06-30  0:59 ` [PATCH 4/4] add a test for " Stefan Beller
2016-07-01  7:09 ` [RFC PATCHv1 0/4] Push options in C Git Jeff King
2016-07-01 17:07   ` Stefan Beller
2016-07-01 17:55     ` Jeff King
2016-07-01 18:25       ` Stefan Beller
2016-07-01 20:01         ` Jeff King
2016-07-07  1:12 [PATCHv3 " Stefan Beller
2016-07-07  1:12 ` [PATCH 2/4] receive-pack: implement advertising and receiving push options Stefan Beller
2016-07-07 20:37   ` Junio C Hamano
2016-07-07 21:41     ` Stefan Beller
2016-07-07 21:56       ` Jeff King
2016-07-07 22:06         ` Stefan Beller
2016-07-07 22:09           ` Jeff King
2016-07-07 22:06       ` Junio C Hamano
2016-07-08 17:58         ` Jonathan Nieder
2016-07-08 18:39           ` Junio C Hamano
2016-07-08 18:57             ` Stefan Beller
2016-07-08 21:46               ` Jeff King
2016-07-08 22:17                 ` Stefan Beller
2016-07-08 22:21                   ` Jeff King
2016-07-08 22:29                     ` Stefan Beller
2016-07-08 22:35                       ` Jeff King
2016-07-08 22:43                         ` Stefan Beller
2016-07-08 22:46                           ` Jeff King
2016-07-08 22:51                             ` Stefan Beller
2016-07-09  0:31 [PATCHv4 0/4] Push options Stefan Beller
2016-07-09  0:31 ` [PATCH 2/4] receive-pack: implement advertising and receiving push options Stefan Beller
2016-07-10 17:06   ` Shawn Pearce
2016-07-10 18:05     ` Stefan Beller
2016-07-12  4:53       ` Shawn Pearce
2016-07-12  5:24     ` Jeff King
2016-07-14 17:39 [PATCHv5 0/4] Push options Stefan Beller
2016-07-14 17:39 ` [PATCH 2/4] receive-pack: implement advertising and receiving push options Stefan Beller
2016-07-14 18:38   ` Junio C Hamano
2016-07-14 19:00     ` Stefan Beller
2016-07-14 19:07       ` Junio C Hamano
2016-07-14 19:45         ` Jeff King
2016-07-14 20:07           ` Junio C Hamano
2016-07-14 21:49 [PATCHv7 0/4] Push options Stefan Beller
2016-07-14 21:49 ` [PATCH 2/4] receive-pack: implement advertising and receiving push options Stefan Beller

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.