All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/9] atomic pushes
@ 2014-12-30 23:41 Stefan Beller
  2014-12-30 23:41 ` [PATCHv9 1/9] receive-pack.c: shorten the execute_commands loop over all commands Stefan Beller
                   ` (8 more replies)
  0 siblings, 9 replies; 24+ messages in thread
From: Stefan Beller @ 2014-12-30 23:41 UTC (permalink / raw)
  To: gitster; +Cc: git, sunshine, mhagger, jrnieder, ronniesahlberg, Stefan Beller

This patch series adds a flag to git push to update the remote refs atomically.

All comments from Eric and Junio have been incorporated.

The patch to negotiate the atomic push protocol was ripped apart
and we have one commit for the receiving side and one for the sending side,
so the order of new functionality introduced should not bring any bisect
breakages. 

The remote side learned the receive.advertiseatomic config option
and a test was added, that the user cannot push any more if they ask
for --atomic but the server doesn't advertise atomic.


But as of writing this cover letter I have a question on the security 
implications here. If we ask the remote side to not advertise atomic, 
(say because of some problem with too many file descriptors open)
but the client side (being a malicious client, not possible with the code 
presented in this patch series) does ask for atomic, the server should not
obey the request to handle it atomic. I just added that as well.

How about other capabilites requested such as "report-status",
"side-band-64k" and "quiet" ? 

This series applies on top of origin/mh/reflog-expire

Any comments are welcome!

Thanks,
Stefan

Ronnie Sahlberg (3):
  receive-pack.c: negotiate atomic push support
  send-pack.c: add --atomic command line argument
  push.c: add an --atomic argument

Stefan Beller (6):
  receive-pack.c: shorten the execute_commands loop over all commands
  receive-pack.c: move iterating over all commands outside
    execute_commands
  receive-pack.c: move transaction handling in a central place
  receive-pack.c: add execute_commands_atomic function
  send-pack: rename ref_update_to_be_sent to check_to_send_update
  t5543-atomic-push.sh: add basic tests for atomic pushes

 Documentation/git-push.txt                        |   7 +-
 Documentation/git-send-pack.txt                   |   7 +-
 Documentation/technical/protocol-capabilities.txt |  13 +-
 builtin/push.c                                    |   5 +
 builtin/receive-pack.c                            | 166 ++++++++++++++----
 builtin/send-pack.c                               |   6 +-
 remote.h                                          |   3 +-
 send-pack.c                                       |  65 +++++++-
 send-pack.h                                       |   3 +-
 t/t5543-atomic-push.sh                            | 194 ++++++++++++++++++++++
 transport.c                                       |   5 +
 transport.h                                       |   1 +
 12 files changed, 425 insertions(+), 50 deletions(-)
 create mode 100755 t/t5543-atomic-push.sh

-- 
2.2.1.62.g3f15098

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCHv9 1/9] receive-pack.c: shorten the execute_commands loop over all commands
  2014-12-30 23:41 [PATCH 0/9] atomic pushes Stefan Beller
@ 2014-12-30 23:41 ` Stefan Beller
  2015-01-03  2:20   ` Jonathan Nieder
  2014-12-30 23:41 ` [PATCHv9 2/9] receive-pack.c: move iterating over all commands outside execute_commands Stefan Beller
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 24+ messages in thread
From: Stefan Beller @ 2014-12-30 23:41 UTC (permalink / raw)
  To: gitster; +Cc: git, sunshine, mhagger, jrnieder, ronniesahlberg, Stefan Beller

This commit shortens execute_commands loop over all commands by moving
some parts of the code to extra functions. This will help us in a later
patch to avoid duplicate code when the loop over all commands should be
made atomic or not.

No functional changes intended.

Suggested-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
---

Notes:
    v9:
    * simplified should_process_cmd to be a one liner
    * check_shallow_bugs doesn't check of shallow_update being set, rather
      the function is just called if that option is set.
    
    v8: no change
    
    v7:
     new in v7 as in v7 I'd split up the previous
     [PATCH 4/7] receive-pack.c: receive-pack.c: use a single ref_transaction for atomic pushes
     as suggested by Eric.
    
     This is pretty much
    > patch 1: Factor out code into helper functions which will be needed by
    > the upcoming atomic and non-atomic worker functions. Example helpers:
    > 'cmd->error_string' and cmd->skip_update' check; and the
    > 'si->shallow_ref[cmd->index]' check and handling.

 builtin/receive-pack.c | 42 ++++++++++++++++++++++++++----------------
 1 file changed, 26 insertions(+), 16 deletions(-)

diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 32fc540..68b9cfb 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -1042,11 +1042,33 @@ static void reject_updates_to_hidden(struct command *commands)
 	}
 }
 
+static int should_process_cmd(struct command *cmd)
+{
+	return !cmd->error_string && !cmd->skip_update;
+}
+
+static void check_shallow_bugs(struct command *commands,
+			       struct shallow_info *si)
+{
+	struct command *cmd;
+	int checked_connectivity = 1;
+	for (cmd = commands; cmd; cmd = cmd->next) {
+		if (should_process_cmd(cmd) && si->shallow_ref[cmd->index]) {
+			error("BUG: connectivity check has not been run on ref %s",
+			      cmd->ref_name);
+			checked_connectivity = 0;
+		}
+	}
+	if (!checked_connectivity)
+		error("BUG: run 'git fsck' for safety.\n"
+		      "If there are errors, try to remove "
+		      "the reported refs above");
+}
+
 static void execute_commands(struct command *commands,
 			     const char *unpacker_error,
 			     struct shallow_info *si)
 {
-	int checked_connectivity;
 	struct command *cmd;
 	unsigned char sha1[20];
 	struct iterate_data data;
@@ -1077,27 +1099,15 @@ static void execute_commands(struct command *commands,
 	free(head_name_to_free);
 	head_name = head_name_to_free = resolve_refdup("HEAD", 0, sha1, NULL);
 
-	checked_connectivity = 1;
 	for (cmd = commands; cmd; cmd = cmd->next) {
-		if (cmd->error_string)
-			continue;
-
-		if (cmd->skip_update)
+		if (!should_process_cmd(cmd))
 			continue;
 
 		cmd->error_string = update(cmd, si);
-		if (shallow_update && !cmd->error_string &&
-		    si->shallow_ref[cmd->index]) {
-			error("BUG: connectivity check has not been run on ref %s",
-			      cmd->ref_name);
-			checked_connectivity = 0;
-		}
 	}
 
-	if (shallow_update && !checked_connectivity)
-		error("BUG: run 'git fsck' for safety.\n"
-		      "If there are errors, try to remove "
-		      "the reported refs above");
+	if (shallow_update)
+		check_shallow_bugs(commands, si);
 }
 
 static struct command **queue_command(struct command **tail,
-- 
2.2.1.62.g3f15098

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCHv9 2/9] receive-pack.c: move iterating over all commands outside execute_commands
  2014-12-30 23:41 [PATCH 0/9] atomic pushes Stefan Beller
  2014-12-30 23:41 ` [PATCHv9 1/9] receive-pack.c: shorten the execute_commands loop over all commands Stefan Beller
@ 2014-12-30 23:41 ` Stefan Beller
  2014-12-30 23:41 ` [PATCHv9 3/9] receive-pack.c: move transaction handling in a central place Stefan Beller
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 24+ messages in thread
From: Stefan Beller @ 2014-12-30 23:41 UTC (permalink / raw)
  To: gitster; +Cc: git, sunshine, mhagger, jrnieder, ronniesahlberg, Stefan Beller

This commit allows us in a later patch to easily distinguish between
the non atomic way to update the received refs and the atomic way which
is introduced in a later patch.

Signed-off-by: Stefan Beller <sbeller@google.com>
---

Notes:
    v9:
     new and shiny. But makes the next patch easier to review.

 builtin/receive-pack.c | 19 +++++++++++++------
 1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 68b9cfb..941aae5 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -1065,6 +1065,18 @@ static void check_shallow_bugs(struct command *commands,
 		      "the reported refs above");
 }
 
+static void execute_commands_non_atomic(struct command *commands,
+					struct shallow_info *si)
+{
+	struct command *cmd;
+	for (cmd = commands; cmd; cmd = cmd->next) {
+		if (!should_process_cmd(cmd))
+			continue;
+
+		cmd->error_string = update(cmd, si);
+	}
+}
+
 static void execute_commands(struct command *commands,
 			     const char *unpacker_error,
 			     struct shallow_info *si)
@@ -1099,12 +1111,7 @@ static void execute_commands(struct command *commands,
 	free(head_name_to_free);
 	head_name = head_name_to_free = resolve_refdup("HEAD", 0, sha1, NULL);
 
-	for (cmd = commands; cmd; cmd = cmd->next) {
-		if (!should_process_cmd(cmd))
-			continue;
-
-		cmd->error_string = update(cmd, si);
-	}
+	execute_commands_non_atomic(commands, si);
 
 	if (shallow_update)
 		check_shallow_bugs(commands, si);
-- 
2.2.1.62.g3f15098

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCHv9 3/9] receive-pack.c: move transaction handling in a central place
  2014-12-30 23:41 [PATCH 0/9] atomic pushes Stefan Beller
  2014-12-30 23:41 ` [PATCHv9 1/9] receive-pack.c: shorten the execute_commands loop over all commands Stefan Beller
  2014-12-30 23:41 ` [PATCHv9 2/9] receive-pack.c: move iterating over all commands outside execute_commands Stefan Beller
@ 2014-12-30 23:41 ` Stefan Beller
  2014-12-30 23:41 ` [PATCHv9 4/9] receive-pack.c: add execute_commands_atomic function Stefan Beller
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 24+ messages in thread
From: Stefan Beller @ 2014-12-30 23:41 UTC (permalink / raw)
  To: gitster; +Cc: git, sunshine, mhagger, jrnieder, ronniesahlberg, Stefan Beller

This moves all code related to transactions into the
execute_commands_non_atomic function. This includes
beginning and committing the transaction as well as
dealing with the errors which may occur during the
begin and commit phase of a transaction.

No functional changes intended.

Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
---

Notes:
    v9:
     This was split up into the patch before and this one.
     This patch only deals with the transactions now.
    
    v8:
     move execute_commands_loop before execute_commands, so it compiles/links
     without warnings.
    
    v7:
     new in v7, this is part of the previous "[PATCH 4/7]
     receive-pack.c: receive-pack.c: use a single ref_transaction for atomic pushes"
    
    This covers the suggestion of patch 2 and 3 by Eric
    > patch 2: Factor out the main 'for' loop of execute_commands() into a
    > new function. This new function will eventually become
    > execute_commands_non_atomic(). At this point, execute_commands() is
    > pretty much in its final form with the exception of the upcoming 'if
    > (use_atomic)' conditional.
    > patch 3: Morph the function extracted in patch 2 into
    > execute_commands_non_atomic() by adding transaction handling inside
    > the 'for' loop (and applying the changes from the early part of the
    > patch which go along with that).

 builtin/receive-pack.c | 51 ++++++++++++++++++++++++++++++++++++--------------
 1 file changed, 37 insertions(+), 14 deletions(-)

diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 941aae5..36e8795 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -66,6 +66,7 @@ static const char *NONCE_SLOP = "SLOP";
 static const char *nonce_status;
 static long nonce_stamp_slop;
 static unsigned long nonce_stamp_slop_limit;
+static struct ref_transaction *transaction;
 
 static enum deny_action parse_deny_action(const char *var, const char *value)
 {
@@ -821,6 +822,7 @@ static const char *update(struct command *cmd, struct shallow_info *si)
 	}
 
 	if (is_null_sha1(new_sha1)) {
+		struct strbuf err = STRBUF_INIT;
 		if (!parse_object(old_sha1)) {
 			old_sha1 = NULL;
 			if (ref_exists(name)) {
@@ -830,35 +832,36 @@ static const char *update(struct command *cmd, struct shallow_info *si)
 				cmd->did_not_exist = 1;
 			}
 		}
-		if (delete_ref(namespaced_name, old_sha1, 0)) {
-			rp_error("failed to delete %s", name);
+		if (ref_transaction_delete(transaction,
+					   namespaced_name,
+					   old_sha1,
+					   0, old_sha1 != NULL,
+					   "push", &err)) {
+			rp_error("%s", err.buf);
+			strbuf_release(&err);
 			return "failed to delete";
 		}
+		strbuf_release(&err);
 		return NULL; /* good */
 	}
 	else {
 		struct strbuf err = STRBUF_INIT;
-		struct ref_transaction *transaction;
-
 		if (shallow_update && si->shallow_ref[cmd->index] &&
 		    update_shallow_ref(cmd, si))
 			return "shallow error";
 
-		transaction = ref_transaction_begin(&err);
-		if (!transaction ||
-		    ref_transaction_update(transaction, namespaced_name,
-					   new_sha1, old_sha1, 0, 1, "push",
-					   &err) ||
-		    ref_transaction_commit(transaction, &err)) {
-			ref_transaction_free(transaction);
-
+		if (ref_transaction_update(transaction,
+					   namespaced_name,
+					   new_sha1, old_sha1,
+					   0, 1, "push",
+					   &err)) {
 			rp_error("%s", err.buf);
 			strbuf_release(&err);
+
 			return "failed to update ref";
 		}
-
-		ref_transaction_free(transaction);
 		strbuf_release(&err);
+
 		return NULL; /* good */
 	}
 }
@@ -1069,12 +1072,32 @@ static void execute_commands_non_atomic(struct command *commands,
 					struct shallow_info *si)
 {
 	struct command *cmd;
+	struct strbuf err = STRBUF_INIT;
+
 	for (cmd = commands; cmd; cmd = cmd->next) {
 		if (!should_process_cmd(cmd))
 			continue;
 
+		transaction = ref_transaction_begin(&err);
+		if (!transaction) {
+			rp_error("%s", err.buf);
+			strbuf_reset(&err);
+			cmd->error_string = "transaction failed to start";
+			continue;
+		}
+
 		cmd->error_string = update(cmd, si);
+
+		if (!cmd->error_string
+		    && ref_transaction_commit(transaction, &err)) {
+			rp_error("%s", err.buf);
+			strbuf_reset(&err);
+			cmd->error_string = "failed to update ref";
+		}
+		ref_transaction_free(transaction);
 	}
+
+	strbuf_release(&err);
 }
 
 static void execute_commands(struct command *commands,
-- 
2.2.1.62.g3f15098

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCHv9 4/9] receive-pack.c: add execute_commands_atomic function
  2014-12-30 23:41 [PATCH 0/9] atomic pushes Stefan Beller
                   ` (2 preceding siblings ...)
  2014-12-30 23:41 ` [PATCHv9 3/9] receive-pack.c: move transaction handling in a central place Stefan Beller
@ 2014-12-30 23:41 ` Stefan Beller
  2014-12-30 23:41 ` [PATCHv9 5/9] receive-pack.c: negotiate atomic push support Stefan Beller
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 24+ messages in thread
From: Stefan Beller @ 2014-12-30 23:41 UTC (permalink / raw)
  To: gitster; +Cc: git, sunshine, mhagger, jrnieder, ronniesahlberg, Stefan Beller

This introduces the new function execute_commands_atomic which will use
one atomic transaction for all updates. The default behavior is still
the old non atomic way, one ref at a time. This is to cause as little
disruption as possible to existing clients. It is unknown if there are
client scripts that depend on the old non-atomic behavior so we make it
opt-in for now.

A later patch will add the possibility to actually use the functionality
added by this patch. For now use_atomic is always 0.

Inspired-by: Ronnie Sahlberg <sahlberg@google.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
---

Notes:
    v9:
    	Because the patches have been reordered, we introduce use_atomic
    	in this patch, but don't touch it.
    
    > Minor comment: This cleanup code is repeated in both the success and
    > fail branches. It _might_ (or not) be a bit cleaner and more
    > maintainable to replace the above three lines with:
    
    	Personally I have no strong opinion about that one.
    	I implemented it as you suggested, having a cleanup and a failure
    	label. It seems to look ok and has less lines of code.
    
    	However writing this code made me feel a bit like producing
    	spagetti code here. ("Goto is evil!", I accepted goto as a
    	convenient way to have one common cleanup or failure before
    	exiting the function before, but now we have two jump targets.)
    
    Changes in v8:
    	removed superflous "}" to make it compile again
    
    Changes in v7:
    	Eric suggested to replace "[PATCH 4/7] receive-pack.c:
    	receive-pack.c: use a single ref_transaction for atomic pushes"
    	by smaller patches
    	This is the last patch replacing said large commit.
    
    Changes in v6:
    	This is a complete rewrite of the patch essentially.
    	Eric suggested to split up the flow into functions, so
    	it is easier to read. It's more lines of code, but indeed
    	it's easier to follow. Thanks Eric!
    
    	Note there is another patch following this one
    	moving the helper functions above execute_commands.
    	I just choose the order of the functions in this patch
    	to have a better diff (just inserting the call to the function
    	execute_commands_non_atomic and that function directly follows.)
    	The next patch of the series will move that up.
    
    	Because of the rewrite and the fixes of the previous five
    	versions there is not much left of Ronnies original patch,
    	so I'll claim authorship of this one.
    
    Changes v1 -> v2:
    	* update(...) assumes to be always in a transaction
    	* Caring about when to begin/commit transactions is put
    	  into execute_commands
    v2->v3:
    	* meditated about the error flow. Now we always construct a local
    	  strbuf err if required. Then the flow is easier to follow and
    	  destruction of it is performed nearby.
    	* early return in execute_commands if transaction_begin fails.
    
    v3->v4:
    	* revamp logic again. This should keep the non atomic behavior
    	  as is (in case of error say so, in non error case just free the
    	  transaction). In the atomic case we either do nothing (when no error),
    	  or abort with the goto.
    
    		if (!cmd->error_string) {
    			if (!use_atomic
    			    && ref_transaction_commit(transaction, &err)) {
    				ref_transaction_free(transaction);
    				rp_error("%s", err.buf);
    				strbuf_release(&err);
    				cmd->error_string = "failed to update ref";
    			}
    		} else if (use_atomic) {
    			goto atomic_failure;
    		} else {
    			ref_transaction_free(transaction);
    		}
    
    	 * Having the goto directly there when checking for cmd->error_string,
    	   we don't need to do it again, so the paragraph explaining the error
    	   checking is gone as well. (Previous patch had the following, this is
    	   put at the end of the function, where the goto jumps to and the comment
    	   has been dropped.
    +		/*
    +		 * update(...) may abort early (i.e. because the hook refused to
    +		 * update that ref) which then doesn't even record a transaction
    +		 * regarding that ref. Make sure all commands are without error
    +		 * and then commit atomically.
    +		 */
    +		for (cmd = commands; cmd; cmd = cmd->next)
    +			if (cmd->error_string)
    +				break;
    
    v4->v5:
    Eric wrote:
    > Repeating from my earlier review[1]: If the 'pre-receive' hook
    > "declines", then this transaction is left dangling (and its resources
    > leaked).
    
    You're right. The initialization of the transaction is now
    near the actual loop after the pre receive hook.
    
    > The !use_atomic case (below), calls this error "failed to start
    > transaction", not merely "transaction error".
    
    ok, now both are "transaction failed to start".
    In all cases where these generic errors are reported,
    we do have a rp_error(...) with details.
    
    > Furthermore, in the use_atomic case (also below), when a commit fails,
    > you assign err.buf to cmd->error_string rather than a generic
    > "transaction error" message. What differs between these cases which
    > makes the generic message preferable here over the more specific
    > err.buf message?
    
    They are the same now.
    
    > Repeating from my earlier review[1]: This is leaking 'transaction' for
    > each successful commit (and only freeing it upon commit error).
    
    Right. I thought I had it covered with the else clause. Of course not.
    
    > At the end of this function, strbuf_release(&err) is invoked, which
    > leaves all these cmd->error_strings dangling.
    
    I removed all assignments of err.buf now.
    
    > goto's can help simplify error-handling when multiple conditional
    > branches need to perform common cleanup, however, this label
    > corresponds to only a single goto statement.
    
    moved up again.

 builtin/receive-pack.c | 47 ++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 46 insertions(+), 1 deletion(-)

diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 36e8795..d431e97 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -40,6 +40,7 @@ static int transfer_unpack_limit = -1;
 static int unpack_limit = 100;
 static int report_status;
 static int use_sideband;
+static int use_atomic;
 static int quiet;
 static int prefer_ofs_delta = 1;
 static int auto_update_server_info;
@@ -1096,7 +1097,48 @@ static void execute_commands_non_atomic(struct command *commands,
 		}
 		ref_transaction_free(transaction);
 	}
+	strbuf_release(&err);
+}
+
+static void execute_commands_atomic(struct command *commands,
+					struct shallow_info *si)
+{
+	struct command *cmd;
+	struct strbuf err = STRBUF_INIT;
+	const char *reported_error = "atomic push failure";
+
+	transaction = ref_transaction_begin(&err);
+	if (!transaction) {
+		rp_error("%s", err.buf);
+		strbuf_reset(&err);
+		reported_error = "transaction failed to start";
+		goto failure;
+	}
+
+	for (cmd = commands; cmd; cmd = cmd->next) {
+		if (!should_process_cmd(cmd))
+			continue;
+
+		cmd->error_string = update(cmd, si);
+
+		if (cmd->error_string)
+			goto failure;
+	}
 
+	if (ref_transaction_commit(transaction, &err)) {
+		rp_error("%s", err.buf);
+		reported_error = "atomic transaction failed";
+		goto failure;
+	}
+	goto cleanup;
+
+failure:
+	for (cmd = commands; cmd; cmd = cmd->next)
+		if (!cmd->error_string)
+			cmd->error_string = reported_error;
+
+cleanup:
+	ref_transaction_free(transaction);
 	strbuf_release(&err);
 }
 
@@ -1134,7 +1176,10 @@ static void execute_commands(struct command *commands,
 	free(head_name_to_free);
 	head_name = head_name_to_free = resolve_refdup("HEAD", 0, sha1, NULL);
 
-	execute_commands_non_atomic(commands, si);
+	if (use_atomic)
+		execute_commands_atomic(commands, si);
+	else
+		execute_commands_non_atomic(commands, si);
 
 	if (shallow_update)
 		check_shallow_bugs(commands, si);
-- 
2.2.1.62.g3f15098

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCHv9 5/9] receive-pack.c: negotiate atomic push support
  2014-12-30 23:41 [PATCH 0/9] atomic pushes Stefan Beller
                   ` (3 preceding siblings ...)
  2014-12-30 23:41 ` [PATCHv9 4/9] receive-pack.c: add execute_commands_atomic function Stefan Beller
@ 2014-12-30 23:41 ` Stefan Beller
  2014-12-30 23:41 ` [PATCHv9 6/9] send-pack: rename ref_update_to_be_sent to check_to_send_update Stefan Beller
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 24+ messages in thread
From: Stefan Beller @ 2014-12-30 23:41 UTC (permalink / raw)
  To: gitster
  Cc: git, sunshine, mhagger, jrnieder, ronniesahlberg,
	Ronnie Sahlberg, Stefan Beller

From: Ronnie Sahlberg <sahlberg@google.com>

This adds the atomic protocol option to allow
receive-pack to inform the client that it has
atomic push capability.

This commit makes the functionality introduced
in the previous commits go live for the serving
side. The changes in documentation reflect the
protocol capabilities of the server.

Signed-off-by: Stefan Beller <sbeller@google.com>
---

Notes:
    v9:
     This was once part of "[PATCH 1/7] receive-pack.c:
     add protocol support to negotiate atomic-push"
     but now it only touches the receive-pack.c part
     and doesn't bother with the send-pack part any more.
     That is done in a later patch, when send-pack actually
     learns all the things it needs to know about the
     atomic push option.
    
     We can configure the remote if it wants to advertise
     atomicity!
    
    All previous notes were lost due to my glorious
    capability to operate git rebase.

 Documentation/technical/protocol-capabilities.txt | 13 +++++++++++--
 builtin/receive-pack.c                            | 11 +++++++++++
 2 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/Documentation/technical/protocol-capabilities.txt b/Documentation/technical/protocol-capabilities.txt
index 6d5424c..4f8a7bf 100644
--- a/Documentation/technical/protocol-capabilities.txt
+++ b/Documentation/technical/protocol-capabilities.txt
@@ -18,8 +18,9 @@ was sent.  Server MUST NOT ignore capabilities that client requested
 and server advertised.  As a consequence of these rules, server MUST
 NOT advertise capabilities it does not understand.
 
-The 'report-status', 'delete-refs', 'quiet', and 'push-cert' capabilities
-are sent and recognized by the receive-pack (push to server) process.
+The 'atomic', 'report-status', 'delete-refs', 'quiet', and 'push-cert'
+capabilities are sent and recognized by the receive-pack (push to server)
+process.
 
 The 'ofs-delta' and 'side-band-64k' capabilities are sent and recognized
 by both upload-pack and receive-pack protocols.  The 'agent' capability
@@ -244,6 +245,14 @@ respond with the 'quiet' capability to suppress server-side progress
 reporting if the local progress reporting is also being suppressed
 (e.g., via `push -q`, or if stderr does not go to a tty).
 
+atomic
+------
+
+If the server sends the 'atomic' capability it is capable of accepting
+atomic pushes. If the pushing client requests this capability, the server
+will update the refs in one atomic transaction. Either all refs are
+updated or none.
+
 allow-tip-sha1-in-want
 ----------------------
 
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index d431e97..8fd58ff 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -37,6 +37,7 @@ static int receive_fsck_objects = -1;
 static int transfer_fsck_objects = -1;
 static int receive_unpack_limit = -1;
 static int transfer_unpack_limit = -1;
+static int advertise_atomic_push = 1;
 static int unpack_limit = 100;
 static int report_status;
 static int use_sideband;
@@ -159,6 +160,11 @@ static int receive_pack_config(const char *var, const char *value, void *cb)
 		return 0;
 	}
 
+	if (strcmp(var, "receive.advertiseatomic") == 0) {
+		advertise_atomic_push = git_config_bool(var, value);
+		return 0;
+	}
+
 	return git_default_config(var, value, cb);
 }
 
@@ -174,6 +180,8 @@ static void show_ref(const char *path, const unsigned char *sha1)
 
 		strbuf_addstr(&cap,
 			      "report-status delete-refs side-band-64k quiet");
+		if (advertise_atomic_push)
+			strbuf_addstr(&cap, " atomic");
 		if (prefer_ofs_delta)
 			strbuf_addstr(&cap, " ofs-delta");
 		if (push_cert_nonce)
@@ -1264,6 +1272,9 @@ static struct command *read_head_info(struct sha1_array *shallow)
 				use_sideband = LARGE_PACKET_MAX;
 			if (parse_feature_request(feature_list, "quiet"))
 				quiet = 1;
+			if (advertise_atomic_push
+			    && parse_feature_request(feature_list, "atomic"))
+				use_atomic = 1;
 		}
 
 		if (!strcmp(line, "push-cert")) {
-- 
2.2.1.62.g3f15098

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCHv9 6/9] send-pack: rename ref_update_to_be_sent to check_to_send_update
  2014-12-30 23:41 [PATCH 0/9] atomic pushes Stefan Beller
                   ` (4 preceding siblings ...)
  2014-12-30 23:41 ` [PATCHv9 5/9] receive-pack.c: negotiate atomic push support Stefan Beller
@ 2014-12-30 23:41 ` Stefan Beller
  2014-12-30 23:41 ` [PATCHv9 7/9] send-pack.c: add --atomic command line argument Stefan Beller
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 24+ messages in thread
From: Stefan Beller @ 2014-12-30 23:41 UTC (permalink / raw)
  To: gitster; +Cc: git, sunshine, mhagger, jrnieder, ronniesahlberg, Stefan Beller

This renames ref_update_to_be_sent to check_to_send_update and inverts
the meaning of the return value. Having the return value inverted we
can have different values for the error codes. This is useful in a
later patch when we want to know if we hit the CHECK_REF_STATUS_REJECTED
case.

Signed-off-by: Stefan Beller <sbeller@google.com>
---

Notes:
    v7, v8, v9:
    * no changes
    
    v6:
    * negative #define'd values
    
    skipped v4 v5
    
    This was introduced with the [PATCHv2] series.
    Changes v2 -> v3:
    
    * Rename to check_to_send_update
    * Negative error values.
    * errors values are #define'd and not raw numbers.

 send-pack.c | 18 +++++++++++-------
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/send-pack.c b/send-pack.c
index 949cb61..4974825 100644
--- a/send-pack.c
+++ b/send-pack.c
@@ -190,10 +190,13 @@ static void advertise_shallow_grafts_buf(struct strbuf *sb)
 	for_each_commit_graft(advertise_shallow_grafts_cb, sb);
 }
 
-static int ref_update_to_be_sent(const struct ref *ref, const struct send_pack_args *args)
+#define CHECK_REF_NO_PUSH -1
+#define CHECK_REF_STATUS_REJECTED -2
+#define CHECK_REF_UPTODATE -3
+static int check_to_send_update(const struct ref *ref, const struct send_pack_args *args)
 {
 	if (!ref->peer_ref && !args->send_mirror)
-		return 0;
+		return CHECK_REF_NO_PUSH;
 
 	/* Check for statuses set by set_ref_status_for_push() */
 	switch (ref->status) {
@@ -203,10 +206,11 @@ static int ref_update_to_be_sent(const struct ref *ref, const struct send_pack_a
 	case REF_STATUS_REJECT_NEEDS_FORCE:
 	case REF_STATUS_REJECT_STALE:
 	case REF_STATUS_REJECT_NODELETE:
+		return CHECK_REF_STATUS_REJECTED;
 	case REF_STATUS_UPTODATE:
-		return 0;
+		return CHECK_REF_UPTODATE;
 	default:
-		return 1;
+		return 0;
 	}
 }
 
@@ -250,7 +254,7 @@ static int generate_push_cert(struct strbuf *req_buf,
 	strbuf_addstr(&cert, "\n");
 
 	for (ref = remote_refs; ref; ref = ref->next) {
-		if (!ref_update_to_be_sent(ref, args))
+		if (check_to_send_update(ref, args) < 0)
 			continue;
 		update_seen = 1;
 		strbuf_addf(&cert, "%s %s %s\n",
@@ -359,7 +363,7 @@ int send_pack(struct send_pack_args *args,
 	 * the pack data.
 	 */
 	for (ref = remote_refs; ref; ref = ref->next) {
-		if (!ref_update_to_be_sent(ref, args))
+		if (check_to_send_update(ref, args) < 0)
 			continue;
 
 		if (!ref->deletion)
@@ -380,7 +384,7 @@ int send_pack(struct send_pack_args *args,
 		if (args->dry_run || args->push_cert)
 			continue;
 
-		if (!ref_update_to_be_sent(ref, args))
+		if (check_to_send_update(ref, args) < 0)
 			continue;
 
 		old_hex = sha1_to_hex(ref->old_sha1);
-- 
2.2.1.62.g3f15098

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCHv9 7/9] send-pack.c: add --atomic command line argument
  2014-12-30 23:41 [PATCH 0/9] atomic pushes Stefan Beller
                   ` (5 preceding siblings ...)
  2014-12-30 23:41 ` [PATCHv9 6/9] send-pack: rename ref_update_to_be_sent to check_to_send_update Stefan Beller
@ 2014-12-30 23:41 ` Stefan Beller
  2014-12-30 23:41 ` [PATCHv9 8/9] push.c: add an --atomic argument Stefan Beller
  2014-12-30 23:41 ` [PATCHv9 9/9] t5543-atomic-push.sh: add basic tests for atomic pushes Stefan Beller
  8 siblings, 0 replies; 24+ messages in thread
From: Stefan Beller @ 2014-12-30 23:41 UTC (permalink / raw)
  To: gitster
  Cc: git, sunshine, mhagger, jrnieder, ronniesahlberg,
	Ronnie Sahlberg, Stefan Beller

From: Ronnie Sahlberg <sahlberg@google.com>

This adds support to send-pack to negotiate and use atomic pushes
iff the server supports it. Atomic pushes are activated by a new command
line flag --atomic.

In order to do this we also need to change the semantics for send_pack()
slightly. The existing send_pack() function actually doesn't send all the
refs back to the server when multiple refs are involved, for example
when using --all. Several of the failure modes for pushes can already be
detected locally in the send_pack client based on the information from the
initial server side list of all the refs as generated by receive-pack.
Any such refs that we thus know would fail to push are thus pruned from
the list of refs we send to the server to update.

For atomic pushes, we have to deal thus with both failures that are detected
locally as well as failures that are reported back from the server. In order
to do so we treat all local failures as push failures too.

We introduce a new status code REF_STATUS_ATOMIC_PUSH_FAILED so we can
flag all refs that we would normally have tried to push to the server
but we did not due to local failures. This is to improve the error message
back to the end user to flag that "these refs failed to update since the
atomic push operation failed."

Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
---

Notes:
    v9:
     This patch now incorporates parts of the very first patch of
     the previous round. So this patch now implements all changes
     in send-pack apart from the refactoring in the previous patch.
    
    v8:
    	no changes
    v7:
     * return error(...); instead of error(...); return -1;
    
    v6:
    * realign to one error string:
    +	error("atomic push failed for ref %s. status: %d\n",
    +	      failing_ref->name, failing_ref->status);
    
    * Use correct value now (negative defined from previous patch)
    
    skipped v4 v5
    
    Changes v2 -> v3:
    > We avoid assignment inside a conditional.
    
    Ok I switched to using a switch statement.
    
        Changes v1 -> v2:
         * Now we only need a very small change in the existing code and have
           a new static function, which cares about error reporting.
              Junio wrote:
              > Hmph.  Is "atomic push" so special that it deserves a separate
              > parameter?  When we come up with yet another mode of failure, would
              > we add another parameter to the callers to this function?
         * error messages are worded differently (lower case!),
         * use of error function instead of fprintf
    
         * undashed the printed error message ("atomic push failed");

 Documentation/git-send-pack.txt |  7 +++++-
 builtin/send-pack.c             |  6 ++++-
 remote.h                        |  3 ++-
 send-pack.c                     | 49 +++++++++++++++++++++++++++++++++++++++--
 send-pack.h                     |  3 ++-
 transport.c                     |  4 ++++
 6 files changed, 66 insertions(+), 6 deletions(-)

diff --git a/Documentation/git-send-pack.txt b/Documentation/git-send-pack.txt
index 2a0de42..45c7725 100644
--- a/Documentation/git-send-pack.txt
+++ b/Documentation/git-send-pack.txt
@@ -9,7 +9,7 @@ git-send-pack - Push objects over Git protocol to another repository
 SYNOPSIS
 --------
 [verse]
-'git send-pack' [--all] [--dry-run] [--force] [--receive-pack=<git-receive-pack>] [--verbose] [--thin] [<host>:]<directory> [<ref>...]
+'git send-pack' [--all] [--dry-run] [--force] [--receive-pack=<git-receive-pack>] [--verbose] [--thin] [--atomic] [<host>:]<directory> [<ref>...]
 
 DESCRIPTION
 -----------
@@ -62,6 +62,11 @@ be in a separate packet, and the list must end with a flush packet.
 	Send a "thin" pack, which records objects in deltified form based
 	on objects not included in the pack to reduce network traffic.
 
+--atomic::
+	Use an atomic transaction for updating the refs. If any of the refs
+	fails to update then the entire push will fail without changing any
+	refs.
+
 <host>::
 	A remote host to house the repository.  When this
 	part is specified, 'git-receive-pack' is invoked via
diff --git a/builtin/send-pack.c b/builtin/send-pack.c
index b564a77..b961e5a 100644
--- a/builtin/send-pack.c
+++ b/builtin/send-pack.c
@@ -13,7 +13,7 @@
 #include "sha1-array.h"
 
 static const char send_pack_usage[] =
-"git send-pack [--all | --mirror] [--dry-run] [--force] [--receive-pack=<git-receive-pack>] [--verbose] [--thin] [<host>:]<directory> [<ref>...]\n"
+"git send-pack [--all | --mirror] [--dry-run] [--force] [--receive-pack=<git-receive-pack>] [--verbose] [--thin] [--atomic] [<host>:]<directory> [<ref>...]\n"
 "  --all and explicit <ref> specification are mutually exclusive.";
 
 static struct send_pack_args args;
@@ -170,6 +170,10 @@ int cmd_send_pack(int argc, const char **argv, const char *prefix)
 				args.use_thin_pack = 1;
 				continue;
 			}
+			if (!strcmp(arg, "--atomic")) {
+				args.atomic = 1;
+				continue;
+			}
 			if (!strcmp(arg, "--stateless-rpc")) {
 				args.stateless_rpc = 1;
 				continue;
diff --git a/remote.h b/remote.h
index 8b62efd..f346524 100644
--- a/remote.h
+++ b/remote.h
@@ -115,7 +115,8 @@ struct ref {
 		REF_STATUS_REJECT_SHALLOW,
 		REF_STATUS_UPTODATE,
 		REF_STATUS_REMOTE_REJECT,
-		REF_STATUS_EXPECTING_REPORT
+		REF_STATUS_EXPECTING_REPORT,
+		REF_STATUS_ATOMIC_PUSH_FAILED
 	} status;
 	char *remote_status;
 	struct ref *peer_ref; /* when renaming */
diff --git a/send-pack.c b/send-pack.c
index 4974825..e8f60df 100644
--- a/send-pack.c
+++ b/send-pack.c
@@ -282,6 +282,29 @@ free_return:
 	return update_seen;
 }
 
+
+static int atomic_push_failure(struct send_pack_args *args,
+			       struct ref *remote_refs,
+			       struct ref *failing_ref)
+{
+	struct ref *ref;
+	/* Mark other refs as failed */
+	for (ref = remote_refs; ref; ref = ref->next) {
+		if (!ref->peer_ref && !args->send_mirror)
+			continue;
+
+		switch (ref->status) {
+		case REF_STATUS_EXPECTING_REPORT:
+			ref->status = REF_STATUS_ATOMIC_PUSH_FAILED;
+			continue;
+		default:
+			; /* do nothing */
+		}
+	}
+	return error("atomic push failed for ref %s. status: %d\n",
+		     failing_ref->name, failing_ref->status);
+}
+
 int send_pack(struct send_pack_args *args,
 	      int fd[], struct child_process *conn,
 	      struct ref *remote_refs,
@@ -298,6 +321,8 @@ int send_pack(struct send_pack_args *args,
 	int use_sideband = 0;
 	int quiet_supported = 0;
 	int agent_supported = 0;
+	int use_atomic = 0;
+	int atomic_supported = 0;
 	unsigned cmds_sent = 0;
 	int ret;
 	struct async demux;
@@ -318,6 +343,8 @@ int send_pack(struct send_pack_args *args,
 		agent_supported = 1;
 	if (server_supports("no-thin"))
 		args->use_thin_pack = 0;
+	if (server_supports("atomic"))
+		atomic_supported = 1;
 	if (args->push_cert) {
 		int len;
 
@@ -332,6 +359,10 @@ int send_pack(struct send_pack_args *args,
 			"Perhaps you should specify a branch such as 'master'.\n");
 		return 0;
 	}
+	if (args->atomic && !atomic_supported)
+		die(_("server does not support --atomic push"));
+
+	use_atomic = atomic_supported && args->atomic;
 
 	if (status_report)
 		strbuf_addstr(&cap_buf, " report-status");
@@ -339,6 +370,8 @@ int send_pack(struct send_pack_args *args,
 		strbuf_addstr(&cap_buf, " side-band-64k");
 	if (quiet_supported && (args->quiet || !args->progress))
 		strbuf_addstr(&cap_buf, " quiet");
+	if (use_atomic)
+		strbuf_addstr(&cap_buf, " atomic");
 	if (agent_supported)
 		strbuf_addf(&cap_buf, " agent=%s", git_user_agent_sanitized());
 
@@ -363,9 +396,21 @@ int send_pack(struct send_pack_args *args,
 	 * the pack data.
 	 */
 	for (ref = remote_refs; ref; ref = ref->next) {
-		if (check_to_send_update(ref, args) < 0)
+		switch (check_to_send_update(ref, args)) {
+		case 0: /* no error */
+			break;
+		case CHECK_REF_STATUS_REJECTED:
+			/*
+			 * When we know the server would reject a ref update if
+			 * we were to send it and we're trying to send the refs
+			 * atomically, abort the whole operation.
+			 */
+			if (use_atomic)
+				return atomic_push_failure(args, remote_refs, ref);
+			/* Fallthrough for non atomic case. */
+		default:
 			continue;
-
+		}
 		if (!ref->deletion)
 			need_pack_data = 1;
 
diff --git a/send-pack.h b/send-pack.h
index 5635457..b664648 100644
--- a/send-pack.h
+++ b/send-pack.h
@@ -13,7 +13,8 @@ struct send_pack_args {
 		use_ofs_delta:1,
 		dry_run:1,
 		push_cert:1,
-		stateless_rpc:1;
+		stateless_rpc:1,
+		atomic:1;
 };
 
 int send_pack(struct send_pack_args *args,
diff --git a/transport.c b/transport.c
index 70d38e4..c67feee 100644
--- a/transport.c
+++ b/transport.c
@@ -728,6 +728,10 @@ static int print_one_push_status(struct ref *ref, const char *dest, int count, i
 						 ref->deletion ? NULL : ref->peer_ref,
 						 "remote failed to report status", porcelain);
 		break;
+	case REF_STATUS_ATOMIC_PUSH_FAILED:
+		print_ref_status('!', "[rejected]", ref, ref->peer_ref,
+						 "atomic push failed", porcelain);
+		break;
 	case REF_STATUS_OK:
 		print_ok_ref_status(ref, porcelain);
 		break;
-- 
2.2.1.62.g3f15098

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCHv9 8/9] push.c: add an --atomic argument
  2014-12-30 23:41 [PATCH 0/9] atomic pushes Stefan Beller
                   ` (6 preceding siblings ...)
  2014-12-30 23:41 ` [PATCHv9 7/9] send-pack.c: add --atomic command line argument Stefan Beller
@ 2014-12-30 23:41 ` Stefan Beller
  2014-12-30 23:41 ` [PATCHv9 9/9] t5543-atomic-push.sh: add basic tests for atomic pushes Stefan Beller
  8 siblings, 0 replies; 24+ messages in thread
From: Stefan Beller @ 2014-12-30 23:41 UTC (permalink / raw)
  To: gitster
  Cc: git, sunshine, mhagger, jrnieder, ronniesahlberg,
	Ronnie Sahlberg, Stefan Beller

From: Ronnie Sahlberg <sahlberg@google.com>

Add a command line argument to the git push command to request atomic
pushes.

Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
---

Notes:
    v8, v9:
    no changes
    
    v7:
    	Use OPT_BOOL instead of OPT_BIT.
    	This allows for --no-atomic option on the command line.
    v6:
    -		OPT_BIT(0, "atomic", &flags, N_("use an atomic transaction remote"),
    +		OPT_BIT(0, "atomic", &flags, N_("request atomic transaction on remote side"),
    
    skipped v4 v5
    
    v2->v3:
    	* s/atomic-push/atomic/
    	* s/the an/an/
    	* no serverside, but just remote instead
    	* TRANSPORT_PUSH_ATOMIC instead of TRANSPORT_ATOMIC_PUSH
    
    Changes v1 -> v2
    	It's --atomic now! (dropping the -push)

 Documentation/git-push.txt | 7 ++++++-
 builtin/push.c             | 5 +++++
 transport.c                | 1 +
 transport.h                | 1 +
 4 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/Documentation/git-push.txt b/Documentation/git-push.txt
index 21b3f29..4764fcf 100644
--- a/Documentation/git-push.txt
+++ b/Documentation/git-push.txt
@@ -9,7 +9,7 @@ git-push - Update remote refs along with associated objects
 SYNOPSIS
 --------
 [verse]
-'git push' [--all | --mirror | --tags] [--follow-tags] [-n | --dry-run] [--receive-pack=<git-receive-pack>]
+'git push' [--all | --mirror | --tags] [--follow-tags] [--atomic] [-n | --dry-run] [--receive-pack=<git-receive-pack>]
 	   [--repo=<repository>] [-f | --force] [--prune] [-v | --verbose]
 	   [-u | --set-upstream] [--signed]
 	   [--force-with-lease[=<refname>[:<expect>]]]
@@ -136,6 +136,11 @@ already exists on the remote side.
 	logged.  See linkgit:git-receive-pack[1] for the details
 	on the receiving end.
 
+--[no-]atomic::
+	Use an atomic transaction on the remote side if available.
+	Either all refs are updated, or on error, no refs are updated.
+	If the server does not support atomic pushes the push will fail.
+
 --receive-pack=<git-receive-pack>::
 --exec=<git-receive-pack>::
 	Path to the 'git-receive-pack' program on the remote
diff --git a/builtin/push.c b/builtin/push.c
index a076b19..8f1d945 100644
--- a/builtin/push.c
+++ b/builtin/push.c
@@ -487,6 +487,7 @@ int cmd_push(int argc, const char **argv, const char *prefix)
 	int flags = 0;
 	int tags = 0;
 	int rc;
+	int atomic = 0;
 	const char *repo = NULL;	/* default repository */
 	struct option options[] = {
 		OPT__VERBOSITY(&verbosity),
@@ -518,6 +519,7 @@ int cmd_push(int argc, const char **argv, const char *prefix)
 		OPT_BIT(0, "follow-tags", &flags, N_("push missing but relevant tags"),
 			TRANSPORT_PUSH_FOLLOW_TAGS),
 		OPT_BIT(0, "signed", &flags, N_("GPG sign the push"), TRANSPORT_PUSH_CERT),
+		OPT_BOOL(0, "atomic", &atomic, N_("request atomic transaction on remote side")),
 		OPT_END()
 	};
 
@@ -533,6 +535,9 @@ int cmd_push(int argc, const char **argv, const char *prefix)
 	if (tags)
 		add_refspec("refs/tags/*");
 
+	if (atomic)
+		flags |= TRANSPORT_PUSH_ATOMIC;
+
 	if (argc > 0) {
 		repo = argv[0];
 		set_refspecs(argv + 1, argc - 1, repo);
diff --git a/transport.c b/transport.c
index c67feee..1373152 100644
--- a/transport.c
+++ b/transport.c
@@ -830,6 +830,7 @@ static int git_transport_push(struct transport *transport, struct ref *remote_re
 	args.dry_run = !!(flags & TRANSPORT_PUSH_DRY_RUN);
 	args.porcelain = !!(flags & TRANSPORT_PUSH_PORCELAIN);
 	args.push_cert = !!(flags & TRANSPORT_PUSH_CERT);
+	args.atomic = !!(flags & TRANSPORT_PUSH_ATOMIC);
 	args.url = transport->url;
 
 	ret = send_pack(&args, data->fd, data->conn, remote_refs,
diff --git a/transport.h b/transport.h
index 3e0091e..18d2cf8 100644
--- a/transport.h
+++ b/transport.h
@@ -125,6 +125,7 @@ struct transport {
 #define TRANSPORT_PUSH_NO_HOOK 512
 #define TRANSPORT_PUSH_FOLLOW_TAGS 1024
 #define TRANSPORT_PUSH_CERT 2048
+#define TRANSPORT_PUSH_ATOMIC 4096
 
 #define TRANSPORT_SUMMARY_WIDTH (2 * DEFAULT_ABBREV + 3)
 #define TRANSPORT_SUMMARY(x) (int)(TRANSPORT_SUMMARY_WIDTH + strlen(x) - gettext_width(x)), (x)
-- 
2.2.1.62.g3f15098

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCHv9 9/9] t5543-atomic-push.sh: add basic tests for atomic pushes
  2014-12-30 23:41 [PATCH 0/9] atomic pushes Stefan Beller
                   ` (7 preceding siblings ...)
  2014-12-30 23:41 ` [PATCHv9 8/9] push.c: add an --atomic argument Stefan Beller
@ 2014-12-30 23:41 ` Stefan Beller
  8 siblings, 0 replies; 24+ messages in thread
From: Stefan Beller @ 2014-12-30 23:41 UTC (permalink / raw)
  To: gitster; +Cc: git, sunshine, mhagger, jrnieder, ronniesahlberg, Stefan Beller

This adds tests for the atomic push option.
The first four tests check if the atomic option works in
good conditions and the last three patches check if the atomic
option prevents any change to be pushed if just one ref cannot
be updated.

Signed-off-by: Stefan Beller <sbeller@google.com>
---

Notes:
    v9:
     add test for 'atomic push is not advertised if configured'
     at the end.
    
    v7, v8: no changes
    
    v6:
    the same as v2, so I can resend the whole series as v6
    
    v3 v4 v5 were skipped
    
    Changes v1 -> v2:
    > Please drop unused comments; they are distracting.
    
    ok
    
    > It is not wrong per-se, but haven't you already tested in
    > combination with --mirror in the previous test?
    
    I fixed the previous tests, so that there is no --mirror
    and --atomic together. There is still a first --mirror push
    for setup and a second with --atomic <branchnames> though
    
    > check_branches upstream master HEAD@{2} second HEAD~
    
    A similar function test_ref_upstream is introduced.
    
    > What's the value of this test?  Isn't it a non-fast-forward check
    > you already tested in the previous one?
    
    I messed up there. Originally I wanted to test the 2 different
    stages of rejection. A non-fast-forward check is done locally and
    we don't even try pushing. But I also want to test if we locally
    thing all is good, but the server refuses a ref to update.
    This is now done with the last test named 'atomic push obeys
    update hook preventing a branch to be pushed'. And that still fails.
    
    I'll investigate that, while still sending out the series for another
    review though.
    
    * Redone the test helper, there is test_ref_upstream now.
      This tests explicitely for SHA1 values of the ref.
      (It's needed in the last test for example. The git push fails,
      but still modifies the ref :/ )
    * checked all && chains and repaired them
    * sometimes make use of git -C <workdir>
    
    Notes v1:
    Originally Ronnie had a similar patch prepared. But as I added
    more tests and cleaned up the existing tests (e.g. use test_commit
    instead of "echo one >file && gitadd file && git commit -a -m 'one'",
    removal of dead code), the file has changed so much that I'd rather
    take ownership.

 t/t5543-atomic-push.sh | 194 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 194 insertions(+)
 create mode 100755 t/t5543-atomic-push.sh

diff --git a/t/t5543-atomic-push.sh b/t/t5543-atomic-push.sh
new file mode 100755
index 0000000..3480b33
--- /dev/null
+++ b/t/t5543-atomic-push.sh
@@ -0,0 +1,194 @@
+#!/bin/sh
+
+test_description='pushing to a repository using the atomic push option'
+
+. ./test-lib.sh
+
+mk_repo_pair () {
+	rm -rf workbench upstream &&
+	test_create_repo upstream &&
+	test_create_repo workbench &&
+	(
+		cd upstream &&
+		git config receive.denyCurrentBranch warn
+	) &&
+	(
+		cd workbench &&
+		git remote add up ../upstream
+	)
+}
+
+# Compare the ref ($1) in upstream with a ref value from workbench ($2)
+# i.e. test_refs second HEAD@{2}
+test_refs () {
+	test $# = 2 &&
+	git -C upstream rev-parse --verify "$1" >expect &&
+	git -C workbench rev-parse --verify "$2" >actual &&
+	test_cmp expect actual
+}
+
+test_expect_success 'atomic push works for a single branch' '
+	mk_repo_pair &&
+	(
+		cd workbench &&
+		test_commit one &&
+		git push --mirror up &&
+		test_commit two &&
+		git push --atomic up master
+	) &&
+	test_refs master master
+'
+
+test_expect_success 'atomic push works for two branches' '
+	mk_repo_pair &&
+	(
+		cd workbench &&
+		test_commit one &&
+		git branch second &&
+		git push --mirror up &&
+		test_commit two &&
+		git checkout second &&
+		test_commit three &&
+		git push --atomic up master second
+	) &&
+	test_refs master master &&
+	test_refs second second
+'
+
+test_expect_success 'atomic push works in combination with --mirror' '
+	mk_repo_pair &&
+	(
+		cd workbench &&
+		test_commit one &&
+		git checkout -b second &&
+		test_commit two &&
+		git push --atomic --mirror up
+	) &&
+	test_refs master master &&
+	test_refs second second
+'
+
+test_expect_success 'atomic push works in combination with --force' '
+	mk_repo_pair &&
+	(
+		cd workbench &&
+		test_commit one &&
+		git branch second master &&
+		test_commit two_a &&
+		git checkout second &&
+		test_commit two_b &&
+		test_commit three_b &&
+		test_commit four &&
+		git push --mirror up &&
+		# The actual test is below
+		git checkout master &&
+		test_commit three_a &&
+		git checkout second &&
+		git reset --hard HEAD^ &&
+		git push --force --atomic up master second
+	) &&
+	test_refs master master &&
+	test_refs second second
+'
+
+# set up two branches where master can be pushed but second can not
+# (non-fast-forward). Since second can not be pushed the whole operation
+# will fail and leave master untouched.
+test_expect_success 'atomic push fails if one branch fails' '
+	mk_repo_pair &&
+	(
+		cd workbench &&
+		test_commit one &&
+		git checkout -b second master &&
+		test_commit two &&
+		test_commit three &&
+		test_commit four &&
+		git push --mirror up &&
+		git reset --hard HEAD~2 &&
+		test_commit five &&
+		git checkout master &&
+		test_commit six &&
+		test_must_fail git push --atomic --all up
+	) &&
+	test_refs master HEAD@{7} &&
+	test_refs second HEAD@{4}
+'
+
+test_expect_success 'atomic push fails if one tag fails remotely' '
+	# prepare the repo
+	mk_repo_pair &&
+	(
+		cd workbench &&
+		test_commit one &&
+		git checkout -b second master &&
+		test_commit two &&
+		git push --mirror up
+	) &&
+	# a third party modifies the server side:
+	(
+		cd upstream &&
+		git checkout second &&
+		git tag test_tag second
+	) &&
+	# see if we can now push both branches.
+	(
+		cd workbench &&
+		git checkout master &&
+		test_commit three &&
+		git checkout second &&
+		test_commit four &&
+		git tag test_tag &&
+		test_must_fail git push --tags --atomic up master second
+	) &&
+	test_refs master HEAD@{3} &&
+	test_refs second HEAD@{1}
+'
+
+test_expect_success 'atomic push obeys update hook preventing a branch to be pushed' '
+	mk_repo_pair &&
+	(
+		cd workbench &&
+		test_commit one &&
+		git checkout -b second master &&
+		test_commit two &&
+		git push --mirror up
+	) &&
+	(
+		cd upstream &&
+		HOOKDIR="$(git rev-parse --git-dir)/hooks" &&
+		HOOK="$HOOKDIR/update" &&
+		mkdir -p "$HOOKDIR" &&
+		write_script "$HOOK" <<-\EOF
+			# only allow update to master from now on
+			test "$1" = "refs/heads/master"
+		EOF
+	) &&
+	(
+		cd workbench &&
+		git checkout master &&
+		test_commit three &&
+		git checkout second &&
+		test_commit four &&
+		test_must_fail git push --atomic up master second
+	) &&
+	test_refs master HEAD@{3} &&
+	test_refs second HEAD@{1}
+'
+
+test_expect_success 'atomic push is not advertised if configured' '
+	mk_repo_pair &&
+	(
+		cd upstream
+		git config receive.advertiseatomic 0
+	) &&
+	(
+		cd workbench &&
+		test_commit one &&
+		git push --mirror up &&
+		test_commit two &&
+		test_must_fail git push --atomic up master
+	) &&
+	test_refs master HEAD@{1}
+'
+
+test_done
-- 
2.2.1.62.g3f15098

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCHv9 1/9] receive-pack.c: shorten the execute_commands loop over all commands
  2014-12-30 23:41 ` [PATCHv9 1/9] receive-pack.c: shorten the execute_commands loop over all commands Stefan Beller
@ 2015-01-03  2:20   ` Jonathan Nieder
  2015-01-03  9:53     ` Duy Nguyen
  0 siblings, 1 reply; 24+ messages in thread
From: Jonathan Nieder @ 2015-01-03  2:20 UTC (permalink / raw)
  To: Stefan Beller
  Cc: gitster, git, sunshine, mhagger, ronniesahlberg,
	Nguyễn Thái Ngọc Duy

(+cc: Duy, who understands shallow push well)
Hi,

Stefan Beller wrote:

> This commit shortens execute_commands loop over all commands by moving

The commit message can be simplified by leaving out "This commit" and
stating what the commit does in the imperative, focusing on what problem
the commit solves.

For example:

	Make the main "execute_commands" loop in receive-pack easier to read
	by splitting out some steps into helper functions.  The new helper
	'should_process_cmd' checks if a ref update is unnecessary, whether
	due to an error having occured or for another reason.  The helper
	'check_shallow_bugs' warns if we have forgotten to run a connectivity
	check on a ref which is shallow for the client.

	This will help us to duplicate less code in a later patch when we make
	a second copy of the "execute_commands" loop.

	No functional change intended.

By the way, is there some clearer name for check_shallow_bugs?  That
name makes it sound like there are some bugs we are checking up on,
whereas a name that makes it obvious what the function will do and saves
me from having to read the function body would be ideal.

[...]
> +++ b/builtin/receive-pack.c
[...]
> @@ -1077,27 +1099,15 @@ static void execute_commands(struct command *commands,
[...]
>  	for (cmd = commands; cmd; cmd = cmd->next) {
[...]
> -		if (shallow_update && !cmd->error_string &&
> -		    si->shallow_ref[cmd->index]) {
> -			error("BUG: connectivity check has not been run on ref %s",
> -			      cmd->ref_name);
> -			checked_connectivity = 0;
> -		}
>  	}
>
> -	if (shallow_update && !checked_connectivity)
> -		error("BUG: run 'git fsck' for safety.\n"
> -		      "If there are errors, try to remove "
> -		      "the reported refs above");
> +	if (shallow_update)
> +		check_shallow_bugs(commands, si);

In the same spirit of saving the reader from having to look at the
body of check_shallow_bugs, would it make sense for the part that reacts
to an error to be kept in the caller?  E.g.:

	if (shallow_update && warn_if_skipped_connectivity_check(commands, si))
		error("BUG: run 'git fsck for safety.\n"
		      "If there are errors, try removing the refs reported above");

Is this error possible, by the way?  update() does not return success
unless it has reached the bottom block in the function.  In the
!is_null_sha1(new_sha1) case that means it calls update_shallow_ref(),
which performs the connectivity check.  In the is_null_sha1(new_sha1)
case, update_shallow_info() does not set cmd->index and
si->shallow_ref[cmd->index] cannot be set.

Perhaps this error message could be written in a way that makes it
clearer that we really expect it not to happen, like

		die("BUG: connectivity check skipped in shallow push???");

(die() instead of error() to prevent refs from updating and pointing
to a disconnected history).

Thoughts?
Jonathan

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCHv9 1/9] receive-pack.c: shorten the execute_commands loop over all commands
  2015-01-03  2:20   ` Jonathan Nieder
@ 2015-01-03  9:53     ` Duy Nguyen
  2015-01-05 18:02       ` Stefan Beller
  0 siblings, 1 reply; 24+ messages in thread
From: Duy Nguyen @ 2015-01-03  9:53 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: Stefan Beller, Junio C Hamano, Git Mailing List, Eric Sunshine,
	Michael Haggerty, ronniesahlberg

On Sat, Jan 3, 2015 at 9:20 AM, Jonathan Nieder <jrnieder@gmail.com> wrote:
>> -     if (shallow_update && !checked_connectivity)
>> -             error("BUG: run 'git fsck' for safety.\n"
>> -                   "If there are errors, try to remove "
>> -                   "the reported refs above");
>> +     if (shallow_update)
>> +             check_shallow_bugs(commands, si);
>
> In the same spirit of saving the reader from having to look at the
> body of check_shallow_bugs, would it make sense for the part that reacts
> to an error to be kept in the caller?  E.g.:
>
>         if (shallow_update && warn_if_skipped_connectivity_check(commands, si))
>                 error("BUG: run 'git fsck for safety.\n"
>                       "If there are errors, try removing the refs reported above");
>
> Is this error possible, by the way?

That code is to prevent bugs in future. The whole operation is spread
out in many functions/steps and people may overlook.

> update() does not return success
> unless it has reached the bottom block in the function.  In the
> !is_null_sha1(new_sha1) case that means it calls update_shallow_ref(),
> which performs the connectivity check.  In the is_null_sha1(new_sha1)
> case, update_shallow_info() does not set cmd->index and
> si->shallow_ref[cmd->index] cannot be set.
>
> Perhaps this error message could be written in a way that makes it
> clearer that we really expect it not to happen, like
>
>                 die("BUG: connectivity check skipped in shallow push???");
>
> (die() instead of error() to prevent refs from updating and pointing
> to a disconnected history).

If connectivity test is not done, we don't know if history is really
disconnected. But yeah die() may be better (err on the safe side).
-- 
Duy

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCHv9 1/9] receive-pack.c: shorten the execute_commands loop over all commands
  2015-01-03  9:53     ` Duy Nguyen
@ 2015-01-05 18:02       ` Stefan Beller
  2015-01-05 18:25         ` [PATCHv10 01/10] " Stefan Beller
  0 siblings, 1 reply; 24+ messages in thread
From: Stefan Beller @ 2015-01-05 18:02 UTC (permalink / raw)
  To: Duy Nguyen
  Cc: Jonathan Nieder, Junio C Hamano, Git Mailing List, Eric Sunshine,
	Michael Haggerty, ronnie sahlberg

On Sat, Jan 3, 2015 at 1:53 AM, Duy Nguyen <pclouds@gmail.com> wrote:
> On Sat, Jan 3, 2015 at 9:20 AM, Jonathan Nieder <jrnieder@gmail.com> wrote:
>>> -     if (shallow_update && !checked_connectivity)
>>> -             error("BUG: run 'git fsck' for safety.\n"
>>> -                   "If there are errors, try to remove "
>>> -                   "the reported refs above");
>>> +     if (shallow_update)
>>> +             check_shallow_bugs(commands, si);
>>
>> In the same spirit of saving the reader from having to look at the
>> body of check_shallow_bugs, would it make sense for the part that reacts
>> to an error to be kept in the caller?  E.g.:
>>
>>         if (shallow_update && warn_if_skipped_connectivity_check(commands, si))

The intention of that patch is to move the code unrelated to executing commands
out of execute_commands. And I feel this error checking is not the core task of
executing commands, hence it should be moved out completely. Having one part
in  warn_if_skipped_connectivity_check and then the other error part
triggered by
an unsuccessful return doesn't improve the situation IMHO.

I think about moving the check for shallow_update inside that function and
to have a

        if (!shallow_update)
                return;

and additionally renaming the function to be more precise:

        assure_shallow_connectivity_checked();

These changes I can put into this patch.

Replacing error by a die command will go in an extra patch on top.
So I am understanding your answers such that we definitely want to prevent
a push if this future bug happen. Instead of incorporating that into
later patches
of the series to abort in case of this possible bug, we could just go
with s/error/die/
and the problem is solved.

>>                 error("BUG: run 'git fsck for safety.\n"
>>                       "If there are errors, try removing the refs reported above");
>>
>> Is this error possible, by the way?
>
> That code is to prevent bugs in future. The whole operation is spread
> out in many functions/steps and people may overlook.
>

Then this patch also helps with introducing a dedicated function to assure
the connectivity which we can reuse maybe.

Thanks,
Stefan

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCHv10 01/10] receive-pack.c: shorten the execute_commands loop over all commands
  2015-01-05 18:02       ` Stefan Beller
@ 2015-01-05 18:25         ` Stefan Beller
  2015-01-05 18:25           ` [PATCHv10 02/10] receive-pack.c: die instead of error in assure_connectivity_checked Stefan Beller
  2015-01-05 20:22           ` [PATCHv10 01/10] receive-pack.c: shorten the execute_commands loop over all commands Jonathan Nieder
  0 siblings, 2 replies; 24+ messages in thread
From: Stefan Beller @ 2015-01-05 18:25 UTC (permalink / raw)
  To: jrnieder, pclouds
  Cc: gitster, git, sunshine, mhagger, ronniesahlberg, Stefan Beller

Make the main "execute_commands" loop in receive-pack easier to read
by splitting out some steps into helper functions. The new helper
'should_process_cmd' checks if a ref update is unnecessary, whether
due to an error having occurred or for another reason. The helper
'assure_connectivity_checked' warns if we have forgotten to run a
connectivity check on a ref which is shallow for the client which
would be a bug.

This will help us to duplicate less code in a later patch when we make
a second copy of the "execute_commands" loop.

No functional change intended.

Signed-off-by: Stefan Beller <sbeller@google.com>
---

Notes:
    v10:
    * rename check_shallow_bugs to assure_connectivity_checked.
    * reword commit message.
    
    v9:
    * simplified should_process_cmd to be a one liner
    * check_shallow_bugs doesn't check of shallow_update being set, rather
      the function is just called if that option is set.
    
    v8: no change
    
    v7:
     new in v7 as in v7 I'd split up the previous
     [PATCH 4/7] receive-pack.c: receive-pack.c: use a single ref_transaction for atomic pushes
     as suggested by Eric.
    
     This is pretty much
    > patch 1: Factor out code into helper functions which will be needed by
    > the upcoming atomic and non-atomic worker functions. Example helpers:
    > 'cmd->error_string' and cmd->skip_update' check; and the
    > 'si->shallow_ref[cmd->index]' check and handling.

 builtin/receive-pack.c | 43 +++++++++++++++++++++++++++----------------
 1 file changed, 27 insertions(+), 16 deletions(-)

diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 32fc540..ed428e4 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -1042,11 +1042,34 @@ static void reject_updates_to_hidden(struct command *commands)
 	}
 }
 
+static int should_process_cmd(struct command *cmd)
+{
+	return !cmd->error_string && !cmd->skip_update;
+}
+
+static void assure_connectivity_checked(struct command *commands,
+					struct shallow_info *si)
+{
+	struct command *cmd;
+	int checked_connectivity = 1;
+
+	for (cmd = commands; cmd; cmd = cmd->next) {
+		if (should_process_cmd(cmd) && si->shallow_ref[cmd->index]) {
+			error("BUG: connectivity check has not been run on ref %s",
+			      cmd->ref_name);
+			checked_connectivity = 0;
+		}
+	}
+	if (!checked_connectivity)
+		error("BUG: run 'git fsck' for safety.\n"
+		      "If there are errors, try to remove "
+		      "the reported refs above");
+}
+
 static void execute_commands(struct command *commands,
 			     const char *unpacker_error,
 			     struct shallow_info *si)
 {
-	int checked_connectivity;
 	struct command *cmd;
 	unsigned char sha1[20];
 	struct iterate_data data;
@@ -1077,27 +1100,15 @@ static void execute_commands(struct command *commands,
 	free(head_name_to_free);
 	head_name = head_name_to_free = resolve_refdup("HEAD", 0, sha1, NULL);
 
-	checked_connectivity = 1;
 	for (cmd = commands; cmd; cmd = cmd->next) {
-		if (cmd->error_string)
-			continue;
-
-		if (cmd->skip_update)
+		if (!should_process_cmd(cmd))
 			continue;
 
 		cmd->error_string = update(cmd, si);
-		if (shallow_update && !cmd->error_string &&
-		    si->shallow_ref[cmd->index]) {
-			error("BUG: connectivity check has not been run on ref %s",
-			      cmd->ref_name);
-			checked_connectivity = 0;
-		}
 	}
 
-	if (shallow_update && !checked_connectivity)
-		error("BUG: run 'git fsck' for safety.\n"
-		      "If there are errors, try to remove "
-		      "the reported refs above");
+	if (shallow_update)
+		assure_connectivity_checked(commands, si);
 }
 
 static struct command **queue_command(struct command **tail,
-- 
2.2.1.62.g3f15098

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCHv10 02/10] receive-pack.c: die instead of error in assure_connectivity_checked
  2015-01-05 18:25         ` [PATCHv10 01/10] " Stefan Beller
@ 2015-01-05 18:25           ` Stefan Beller
  2015-01-05 20:17             ` Jonathan Nieder
  2015-01-05 20:22           ` [PATCHv10 01/10] receive-pack.c: shorten the execute_commands loop over all commands Jonathan Nieder
  1 sibling, 1 reply; 24+ messages in thread
From: Stefan Beller @ 2015-01-05 18:25 UTC (permalink / raw)
  To: jrnieder, pclouds
  Cc: gitster, git, sunshine, mhagger, ronniesahlberg, Stefan Beller

Discussion on the previous patch revealed we rather want to err on the
safe side. To do so we need to stop receive-pack in case of the possible
future bug when connectivity is not checked on a shallow push.

Suggested-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
---

Notes:
    v10:
    * new in v10.

 builtin/receive-pack.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index ed428e4..7ebea7f 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -1055,15 +1055,15 @@ static void assure_connectivity_checked(struct command *commands,
 
 	for (cmd = commands; cmd; cmd = cmd->next) {
 		if (should_process_cmd(cmd) && si->shallow_ref[cmd->index]) {
-			error("BUG: connectivity check has not been run on ref %s",
-			      cmd->ref_name);
+			die("BUG: connectivity check has not been run on ref %s",
+			    cmd->ref_name);
 			checked_connectivity = 0;
 		}
 	}
 	if (!checked_connectivity)
-		error("BUG: run 'git fsck' for safety.\n"
-		      "If there are errors, try to remove "
-		      "the reported refs above");
+		die("BUG: run 'git fsck' for safety.\n"
+		    "If there are errors, try to remove "
+		    "the reported refs above");
 }
 
 static void execute_commands(struct command *commands,
-- 
2.2.1.62.g3f15098

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCHv10 02/10] receive-pack.c: die instead of error in assure_connectivity_checked
  2015-01-05 18:25           ` [PATCHv10 02/10] receive-pack.c: die instead of error in assure_connectivity_checked Stefan Beller
@ 2015-01-05 20:17             ` Jonathan Nieder
  2015-01-05 21:15               ` Stefan Beller
  0 siblings, 1 reply; 24+ messages in thread
From: Jonathan Nieder @ 2015-01-05 20:17 UTC (permalink / raw)
  To: Stefan Beller; +Cc: pclouds, gitster, git, sunshine, mhagger, ronniesahlberg

Stefan Beller wrote:

> --- a/builtin/receive-pack.c
> +++ b/builtin/receive-pack.c
> @@ -1055,15 +1055,15 @@ static void assure_connectivity_checked(struct command *commands,
>  
>  	for (cmd = commands; cmd; cmd = cmd->next) {
>  		if (should_process_cmd(cmd) && si->shallow_ref[cmd->index]) {
> -			error("BUG: connectivity check has not been run on ref %s",
> -			      cmd->ref_name);
> +			die("BUG: connectivity check has not been run on ref %s",
> +			    cmd->ref_name);

This could stay as error() so that the caller gets to see the list of
refs that didn't experience a connectivity check.  Or if that list
isn't important, this error/die call could be dropped and the
'checked_connectivity = 0' setting would be enough.

>  			checked_connectivity = 0;
>  		}
>  	}
>  	if (!checked_connectivity)
> -		error("BUG: run 'git fsck' for safety.\n"
> -		      "If there are errors, try to remove "
> -		      "the reported refs above");
> +		die("BUG: run 'git fsck' for safety.\n"
> +		    "If there are errors, try to remove "
> +		    "the reported refs above");

I find this error message misleading and confusing.  It makes it seem
like this is an expected error that we are trying to help the user
recover from.  Instead, something impossible has happened.  "Try to
remove the reported refs" is actively harmful advice --- that would be
a way for the user to work around a serious bug instead of figuring
out what went wrong and getting git fixed.

Jonathan

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCHv10 01/10] receive-pack.c: shorten the execute_commands loop over all commands
  2015-01-05 18:25         ` [PATCHv10 01/10] " Stefan Beller
  2015-01-05 18:25           ` [PATCHv10 02/10] receive-pack.c: die instead of error in assure_connectivity_checked Stefan Beller
@ 2015-01-05 20:22           ` Jonathan Nieder
  2015-01-05 21:07             ` Stefan Beller
  1 sibling, 1 reply; 24+ messages in thread
From: Jonathan Nieder @ 2015-01-05 20:22 UTC (permalink / raw)
  To: Stefan Beller; +Cc: pclouds, gitster, git, sunshine, mhagger, ronniesahlberg

Stefan Beller wrote:

> --- a/builtin/receive-pack.c
> +++ b/builtin/receive-pack.c
[...]
> @@ -1077,27 +1100,15 @@ static void execute_commands(struct command *commands,
[...]
> +	if (shallow_update)
> +		assure_connectivity_checked(commands, si);

Looking at this code alone, it seems like assure_connectivity_checked()
is going to ensure that connectivity was checked, so that I can assume
connectivity going forward.  But the opposite is true --- it is a
safety check that prints a warning and doesn't affect what I can
assume.

The factored-out function fails in what it is meant to do, which is to
save the reader of execute_commands from having to look at the
implementation of the parts they are not interested in.

Would something like warn_if_skipped_connectivity_check() make sense?

Jonathan

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCHv10 01/10] receive-pack.c: shorten the execute_commands loop over all commands
  2015-01-05 20:22           ` [PATCHv10 01/10] receive-pack.c: shorten the execute_commands loop over all commands Jonathan Nieder
@ 2015-01-05 21:07             ` Stefan Beller
  2015-01-05 21:18               ` Jonathan Nieder
  0 siblings, 1 reply; 24+ messages in thread
From: Stefan Beller @ 2015-01-05 21:07 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: Duy Nguyen, Junio C Hamano, git, Eric Sunshine, Michael Haggerty,
	ronnie sahlberg

On Mon, Jan 5, 2015 at 12:22 PM, Jonathan Nieder <jrnieder@gmail.com> wrote:
> Stefan Beller wrote:
>
>> --- a/builtin/receive-pack.c
>> +++ b/builtin/receive-pack.c
> [...]
>> @@ -1077,27 +1100,15 @@ static void execute_commands(struct command *commands,
> [...]
>> +     if (shallow_update)
>> +             assure_connectivity_checked(commands, si);
>
> Looking at this code alone, it seems like assure_connectivity_checked()
> is going to ensure that connectivity was checked, so that I can assume
> connectivity going forward.  But the opposite is true --- it is a
> safety check that prints a warning and doesn't affect what I can
> assume.

I disagree on that. Combined with the next patch (s/error/die/) we can assume
that the the connectivity is there as if it is not, git is dead.

This is why I choose the word assure. Maybe check_assumption would be better?

>
> The factored-out function fails in what it is meant to do, which is to
> save the reader of execute_commands from having to look at the
> implementation of the parts they are not interested in.
>
> Would something like warn_if_skipped_connectivity_check() make sense?

The next patch would then change this to die_if_... ?
I'd be ok with that, but in your original email you would still have the last
die(...) in the execute_command function which I dislike.
So what about:

if (shallow_update)
       (warn|die)_on_skipped_connectivity_check()

?

>
> Jonathan

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCHv10 02/10] receive-pack.c: die instead of error in assure_connectivity_checked
  2015-01-05 20:17             ` Jonathan Nieder
@ 2015-01-05 21:15               ` Stefan Beller
  2015-01-05 21:25                 ` Jonathan Nieder
  0 siblings, 1 reply; 24+ messages in thread
From: Stefan Beller @ 2015-01-05 21:15 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: Duy Nguyen, Junio C Hamano, git, Eric Sunshine, Michael Haggerty,
	ronnie sahlberg

On Mon, Jan 5, 2015 at 12:17 PM, Jonathan Nieder <jrnieder@gmail.com> wrote:
> Stefan Beller wrote:
>
>> --- a/builtin/receive-pack.c
>> +++ b/builtin/receive-pack.c
>> @@ -1055,15 +1055,15 @@ static void assure_connectivity_checked(struct command *commands,
>>
>>       for (cmd = commands; cmd; cmd = cmd->next) {
>>               if (should_process_cmd(cmd) && si->shallow_ref[cmd->index]) {
>> -                     error("BUG: connectivity check has not been run on ref %s",
>> -                           cmd->ref_name);
>> +                     die("BUG: connectivity check has not been run on ref %s",
>> +                         cmd->ref_name);
>
> This could stay as error() so that the caller gets to see the list of
> refs that didn't experience a connectivity check.  Or if that list
> isn't important, this error/die call could be dropped and the
> 'checked_connectivity = 0' setting would be enough.

Right. I was once again writing patches without brain activity.
So we'd keep this as an error.

>
>>                       checked_connectivity = 0;
>>               }
>>       }
>>       if (!checked_connectivity)
>> -             error("BUG: run 'git fsck' for safety.\n"
>> -                   "If there are errors, try to remove "
>> -                   "the reported refs above");
>> +             die("BUG: run 'git fsck' for safety.\n"
>> +                 "If there are errors, try to remove "
>> +                 "the reported refs above");
>
> I find this error message misleading and confusing.  It makes it seem
> like this is an expected error that we are trying to help the user
> recover from.  Instead, something impossible has happened.  "Try to
> remove the reported refs" is actively harmful advice --- that would be
> a way for the user to work around a serious bug instead of figuring
> out what went wrong and getting git fixed.

Maybe we should do both?

    die ("BUG: Some refs have not been checked for connectivity."
          "Please contact the git developers (git@vger.kernel.org) and"
          "report the problem. As a workaround run 'git fsck'. If there"
          "are errors, try to remove the reported refs above. (This "
          "may lead to data loss, backup first.)"

Just thinking out loud:
We are using die(...) for two kinds of problems. Either because of
some bad condition given to us by the user (wrong/meaningless arguments)
which we then to refuse to accept.

The other case is usually die("BUG: Git is broken in some way") as we're
discussing here. Would it make sense to have an extra die_bug function,
which amends the reported error string by something like

    "Please contact the git developers (git@vger.kernel.org) and
report the problem."

as above?

Thanks,
Stefan

>
> Jonathan

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCHv10 01/10] receive-pack.c: shorten the execute_commands loop over all commands
  2015-01-05 21:07             ` Stefan Beller
@ 2015-01-05 21:18               ` Jonathan Nieder
  2015-01-06 19:34                 ` [PATCHv11 01/11] " Stefan Beller
  0 siblings, 1 reply; 24+ messages in thread
From: Jonathan Nieder @ 2015-01-05 21:18 UTC (permalink / raw)
  To: Stefan Beller
  Cc: Duy Nguyen, Junio C Hamano, git, Eric Sunshine, Michael Haggerty,
	ronnie sahlberg

Stefan Beller wrote:
> On Mon, Jan 5, 2015 at 12:22 PM, Jonathan Nieder <jrnieder@gmail.com> wrote:
>> Stefan Beller wrote:

>>> --- a/builtin/receive-pack.c
>>> +++ b/builtin/receive-pack.c
>> [...]
>>> @@ -1077,27 +1100,15 @@ static void execute_commands(struct command *commands,
>> [...]
>>> +     if (shallow_update)
>>> +             assure_connectivity_checked(commands, si);
>>
>> Looking at this code alone, it seems like assure_connectivity_checked()
>> is going to ensure that connectivity was checked, so that I can assume
>> connectivity going forward.  But the opposite is true --- it is a
>> safety check that prints a warning and doesn't affect what I can
>> assume.
>
> I disagree on that. Combined with the next patch (s/error/die/) we can assume
> that the the connectivity is there as if it is not, git is dead.
>
> This is why I choose the word assure.

If this patch depends on the next one, would it make sense to put them
in the opposite order?

>> The factored-out function fails in what it is meant to do, which is to
>> save the reader of execute_commands from having to look at the
>> implementation of the parts they are not interested in.
>>
>> Would something like warn_if_skipped_connectivity_check() make sense?
>
> The next patch would then change this to die_if_... ?
> I'd be ok with that, but in your original email you would still have the last
> die(...) in the execute_command function which I dislike.
> So what about:
>
> if (shallow_update)
>        (warn|die)_on_skipped_connectivity_check()
>
> ?

My personal preference would be to refactor the preceding code to make
the check unnecessary.

But aside from that, anything that makes the code clearer is fine with
me.  I find ..._if_... clearer than ..._on_... here because it seems
more obvious that it is not an expected condition (i.e., it's a kind
of abbreviation for

	warn_if_skipped_connectivity_check_which_should_never_happen()

) but that's a more minor detail.  An alternative way to make the code
clearer would be to use a comment.

Thanks,
Jonathan

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCHv10 02/10] receive-pack.c: die instead of error in assure_connectivity_checked
  2015-01-05 21:15               ` Stefan Beller
@ 2015-01-05 21:25                 ` Jonathan Nieder
  2015-01-06 19:40                   ` [PATCHv11 02/11] receive-pack.c: die instead of error in case of possible future bug Stefan Beller
  0 siblings, 1 reply; 24+ messages in thread
From: Jonathan Nieder @ 2015-01-05 21:25 UTC (permalink / raw)
  To: Stefan Beller
  Cc: Duy Nguyen, Junio C Hamano, git, Eric Sunshine, Michael Haggerty,
	ronnie sahlberg

Stefan Beller wrote:

> Maybe we should do both?
>
>     die ("BUG: Some refs have not been checked for connectivity."
>           "Please contact the git developers (git@vger.kernel.org) and"
>           "report the problem. As a workaround run 'git fsck'. If there"
>           "are errors, try to remove the reported refs above. (This "
>           "may lead to data loss, backup first.)"

I personally find this kind of message grating when I run into it.
The message is trying to tell me what to do, but it is not in a
position to know what the best thing to do is.

It could be that I am using an ancient version of git with known bugs.
In that case I should upgrade.

It could be that I am using faulty hardware that flips random bits and
confuses software.

It could be that I have a patched version of git, in which case I should
contact the author of my patch instead of git@vger.kernel.org.

It could be that this is a recent, terrible regression and
git@vger.kernel.org is already bombarded with reports about it.

If the message says

	fatal: BUG: connectivity check skipped???

then it has exactly the right amount of information to tell me what to
do.  Now I have

 - a short string to grep for in the source code (or on the web) to
   find out what happened

 - a clear indication that This Can't Happen, so I know to try to
   reproduce it and contact the author of my patched git or upstream
   or whoever, depending on the context

 - no irrelevant guesses to confuse me

The workaround of running 'git fsck' could be actively harmful,
depending on what the bug is.  All that we know is that a bug has
occured and we shouldn't proceed further.

> Just thinking out loud:
[...]
>                  Would it make sense to have an extra die_bug function,

Yes, I think something like the kernel's BUG_ON and WARN_ON would be
very nice (though orthogonal to this change).

Thanks,
Jonathan

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCHv11 01/11] receive-pack.c: shorten the execute_commands loop over all commands
  2015-01-05 21:18               ` Jonathan Nieder
@ 2015-01-06 19:34                 ` Stefan Beller
  0 siblings, 0 replies; 24+ messages in thread
From: Stefan Beller @ 2015-01-06 19:34 UTC (permalink / raw)
  To: jrnieder, pclouds
  Cc: gitster, git, sunshine, mhagger, ronniesahlberg, Stefan Beller

Make the main "execute_commands" loop in receive-pack easier to read
by splitting out some steps into helper functions. The new helper
'should_process_cmd' checks if a ref update is unnecessary, whether
due to an error having occurred or for another reason. The helper
'warn_if_skipped_connectivity_check' warns if we have forgotten to
run a connectivity check on a ref which is shallow for the client
which would be a bug.

This will help us to duplicate less code in a later patch when we make
a second copy of the "execute_commands" loop.

No functional change intended.

Signed-off-by: Stefan Beller <sbeller@google.com>
---

Notes:
    v11:
    * rename assure_connectivity_checked to warn_if_skipped_connectivity_check
    
        >> This is why I choose the word assure.
        >If this patch depends on the next one, would it make sense to put them
        >in the opposite order?
        
        With the next patch applied which dies if the assumption doesn't hold, 
        assure/assume sounds to me as if it describes the siuation well.
        
        > My personal preference would be to refactor the preceding code to make
        > the check unnecessary.
        
        The way I understand it, the shallow case is spread out all over the place
        not by choise but because it doesn't work better. So the original author
        (Duy) was wise enough to put checks in place because knowing it is fragile
        and may break in the future? 
        This series doesn't intend to refactor the shallow case.
        
        So I picked up warn_if_skipped_connectivity_check as I'm ok with that.
    
    v10:
    * rename check_shallow_bugs to assure_connectivity_checked.
    * reword commit message.
    
    v9:
    * simplified should_process_cmd to be a one liner
    * check_shallow_bugs doesn't check of shallow_update being set, rather
      the function is just called if that option is set.
    
    v8: no change
    
    v7:
     new in v7 as in v7 I'd split up the previous
     [PATCH 4/7] receive-pack.c: receive-pack.c: use a single ref_transaction for atomic pushes
     as suggested by Eric.
    
     This is pretty much
    > patch 1: Factor out code into helper functions which will be needed by
    > the upcoming atomic and non-atomic worker functions. Example helpers:
    > 'cmd->error_string' and cmd->skip_update' check; and the
    > 'si->shallow_ref[cmd->index]' check and handling.

 builtin/receive-pack.c | 43 +++++++++++++++++++++++++++----------------
 1 file changed, 27 insertions(+), 16 deletions(-)

diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 32fc540..2ebaf66 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -1042,11 +1042,34 @@ static void reject_updates_to_hidden(struct command *commands)
 	}
 }
 
+static int should_process_cmd(struct command *cmd)
+{
+	return !cmd->error_string && !cmd->skip_update;
+}
+
+static void warn_if_skipped_connectivity_check(struct command *commands,
+					       struct shallow_info *si)
+{
+	struct command *cmd;
+	int checked_connectivity = 1;
+
+	for (cmd = commands; cmd; cmd = cmd->next) {
+		if (should_process_cmd(cmd) && si->shallow_ref[cmd->index]) {
+			error("BUG: connectivity check has not been run on ref %s",
+			      cmd->ref_name);
+			checked_connectivity = 0;
+		}
+	}
+	if (!checked_connectivity)
+		error("BUG: run 'git fsck' for safety.\n"
+		      "If there are errors, try to remove "
+		      "the reported refs above");
+}
+
 static void execute_commands(struct command *commands,
 			     const char *unpacker_error,
 			     struct shallow_info *si)
 {
-	int checked_connectivity;
 	struct command *cmd;
 	unsigned char sha1[20];
 	struct iterate_data data;
@@ -1077,27 +1100,15 @@ static void execute_commands(struct command *commands,
 	free(head_name_to_free);
 	head_name = head_name_to_free = resolve_refdup("HEAD", 0, sha1, NULL);
 
-	checked_connectivity = 1;
 	for (cmd = commands; cmd; cmd = cmd->next) {
-		if (cmd->error_string)
-			continue;
-
-		if (cmd->skip_update)
+		if (!should_process_cmd(cmd))
 			continue;
 
 		cmd->error_string = update(cmd, si);
-		if (shallow_update && !cmd->error_string &&
-		    si->shallow_ref[cmd->index]) {
-			error("BUG: connectivity check has not been run on ref %s",
-			      cmd->ref_name);
-			checked_connectivity = 0;
-		}
 	}
 
-	if (shallow_update && !checked_connectivity)
-		error("BUG: run 'git fsck' for safety.\n"
-		      "If there are errors, try to remove "
-		      "the reported refs above");
+	if (shallow_update)
+		warn_if_skipped_connectivity_check(commands, si);
 }
 
 static struct command **queue_command(struct command **tail,
-- 
2.2.1.62.g3f15098

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCHv11 02/11] receive-pack.c: die instead of error in case of possible future bug
  2015-01-05 21:25                 ` Jonathan Nieder
@ 2015-01-06 19:40                   ` Stefan Beller
  2015-01-06 19:46                     ` Jonathan Nieder
  0 siblings, 1 reply; 24+ messages in thread
From: Stefan Beller @ 2015-01-06 19:40 UTC (permalink / raw)
  To: jrnieder, pclouds
  Cc: gitster, git, sunshine, mhagger, ronniesahlberg, Stefan Beller

Discussion on the previous patch revealed we rather want to err on the
safe side. To do so we need to stop receive-pack in case of the possible
future bug when connectivity is not checked on a shallow push.

Also while touching that code we considered that removing the reported
refs may be harmful in some situations. Sound the message more like a
"This Cannot Happen, Please Investigate!" instead of giving advice to
remove refs. Running 'git fsck' should not cause any problems though.

Suggested-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
---

Notes:
    v11:
    * only die at the end so the loop works out for all refs.
    * Remove the advice to delete refs.
    
    > If the message says
    >        fatal: BUG: connectivity check skipped???
    > then it has exactly the right amount of information to tell me what to
    > do.  Now I have
    > - a short string to grep for in the source code (or on the web) to
    >    find out what happened
    
    And do a git blame to see previous versions?
    
    I am not so sure of this patch any more as it actually stops people
    doing work if they want to do so. (They may deliberately choose to
    ignore the BUG:... message, because of a deadline in 2 hours.)
    
    So I do think this helps on getting people to report the bug in the
    future if it arises faster, but on the other hand if we assume the
    faulty hardware scenario and the deadline we actually stop people
    from getting their desired work done.
    
    This patch doesn't actually relate to the topic of the series
    (atomic pushes), but is a cleanup-as-we-go patch. If we need to
    have further discussion on this, I'd rather want to delay this patch
    and have a follow up on top of the atomic series.
    
    Thanks,
    Stefan
    
    v10:
    * new in v10.

 builtin/receive-pack.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 2ebaf66..7b0d0f4 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -1061,9 +1061,8 @@ static void warn_if_skipped_connectivity_check(struct command *commands,
 		}
 	}
 	if (!checked_connectivity)
-		error("BUG: run 'git fsck' for safety.\n"
-		      "If there are errors, try to remove "
-		      "the reported refs above");
+		die("BUG: run 'git fsck' for safety.\n"
+		    "BUG: connectivity check skipped???");
 }
 
 static void execute_commands(struct command *commands,
-- 
2.2.1.62.g3f15098

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCHv11 02/11] receive-pack.c: die instead of error in case of possible future bug
  2015-01-06 19:40                   ` [PATCHv11 02/11] receive-pack.c: die instead of error in case of possible future bug Stefan Beller
@ 2015-01-06 19:46                     ` Jonathan Nieder
  0 siblings, 0 replies; 24+ messages in thread
From: Jonathan Nieder @ 2015-01-06 19:46 UTC (permalink / raw)
  To: Stefan Beller; +Cc: pclouds, gitster, git, sunshine, mhagger, ronniesahlberg

Stefan Beller wrote:

> +++ b/builtin/receive-pack.c
> @@ -1061,9 +1061,8 @@ static void warn_if_skipped_connectivity_check(struct command *commands,
>  		}
>  	}
>  	if (!checked_connectivity)
> -		error("BUG: run 'git fsck' for safety.\n"
> -		      "If there are errors, try to remove "
> -		      "the reported refs above");
> +		die("BUG: run 'git fsck' for safety.\n"

This advice is still inaccurate.

If the idea is that the user should ensure the connectivity check is
run, why don't we run the check ourselves here?

If the idea is instead that something unexpected has happened and we
don't have enough information to go further, why do we ask the user
to run 'git fsck'?

Hope that helps,
Jonathan

> +		    "BUG: connectivity check skipped???");

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2015-01-06 19:46 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-12-30 23:41 [PATCH 0/9] atomic pushes Stefan Beller
2014-12-30 23:41 ` [PATCHv9 1/9] receive-pack.c: shorten the execute_commands loop over all commands Stefan Beller
2015-01-03  2:20   ` Jonathan Nieder
2015-01-03  9:53     ` Duy Nguyen
2015-01-05 18:02       ` Stefan Beller
2015-01-05 18:25         ` [PATCHv10 01/10] " Stefan Beller
2015-01-05 18:25           ` [PATCHv10 02/10] receive-pack.c: die instead of error in assure_connectivity_checked Stefan Beller
2015-01-05 20:17             ` Jonathan Nieder
2015-01-05 21:15               ` Stefan Beller
2015-01-05 21:25                 ` Jonathan Nieder
2015-01-06 19:40                   ` [PATCHv11 02/11] receive-pack.c: die instead of error in case of possible future bug Stefan Beller
2015-01-06 19:46                     ` Jonathan Nieder
2015-01-05 20:22           ` [PATCHv10 01/10] receive-pack.c: shorten the execute_commands loop over all commands Jonathan Nieder
2015-01-05 21:07             ` Stefan Beller
2015-01-05 21:18               ` Jonathan Nieder
2015-01-06 19:34                 ` [PATCHv11 01/11] " Stefan Beller
2014-12-30 23:41 ` [PATCHv9 2/9] receive-pack.c: move iterating over all commands outside execute_commands Stefan Beller
2014-12-30 23:41 ` [PATCHv9 3/9] receive-pack.c: move transaction handling in a central place Stefan Beller
2014-12-30 23:41 ` [PATCHv9 4/9] receive-pack.c: add execute_commands_atomic function Stefan Beller
2014-12-30 23:41 ` [PATCHv9 5/9] receive-pack.c: negotiate atomic push support Stefan Beller
2014-12-30 23:41 ` [PATCHv9 6/9] send-pack: rename ref_update_to_be_sent to check_to_send_update Stefan Beller
2014-12-30 23:41 ` [PATCHv9 7/9] send-pack.c: add --atomic command line argument Stefan Beller
2014-12-30 23:41 ` [PATCHv9 8/9] push.c: add an --atomic argument Stefan Beller
2014-12-30 23:41 ` [PATCHv9 9/9] t5543-atomic-push.sh: add basic tests for atomic pushes Stefan Beller

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.