[PATCH 0/5] Avoid file descriptor exhaustion in ref_transaction

All of lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH 0/5] Avoid file descriptor exhaustion in ref_transaction_commit()
@ 2015-04-24 11:35 Michael Haggerty
  2015-04-24 11:35 ` [PATCH 1/5] write_ref_to_lockfile(): new function, extracted from write_ref_sha1() Michael Haggerty
                   ` (6 more replies)
  0 siblings, 7 replies; 19+ messages in thread
From: Michael Haggerty @ 2015-04-24 11:35 UTC (permalink / raw)
  To: Junio C Hamano, Stefan Beller; +Cc: Jeff King, git, Michael Haggerty

In ref_transaction_commit(), close the reference lockfiles immediately
to avoid keeping too many file descriptors open at a time. This is
pretty easy, because in the first loop (where we create the locks) we
already know what, if anything, has to be written into the lockfile.
So write it and close the lockfile immediately. In the second loop,
rename the lockfiles for reference updates into place, and in the
cleanup loop, unlock any references that are still locked (i.e., those
that were only being verified or deleted).

I think this is a cleaner solution than Stefan's approach [1] of
closing and reopening fds based on an estimate of how many fds we can
afford to waste--we only need a single file descriptor at a time, and
we never have to close then reopen a lockfile. But it is a bit more
intrusive, so it might still be preferable to use Stefan's approach
for release 2.4.0, if indeed any fix for this problem is still being
considered for that release.

This patch series applies on top of Stefan's

    c1f0ca9 refs.c: remove lock_fd from struct ref_lock (2015-04-16)

and it fixes two tests that Stefan introduced earlier in that series.

It is also available from my GitHub account:

    https://github.com/mhagger/git branch close-ref-locks-promptly

Michael

[1] http://article.gmane.org/gmane.comp.version-control.git/267548

Michael Haggerty (5):
  write_ref_to_lockfile(): new function, extracted from write_ref_sha1()
  commit_ref_update(): new function, extracted from write_ref_sha1()
  write_ref_sha1(): inline function at callers
  ref_transaction_commit(): remove the local flags variables
  ref_transaction_commit(): only keep one lockfile open at a time

 refs.c                | 107 ++++++++++++++++++++++++++++++++++----------------
 t/t1400-update-ref.sh |   4 +-
 2 files changed, 75 insertions(+), 36 deletions(-)

-- 
2.1.4

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH 1/5] write_ref_to_lockfile(): new function, extracted from write_ref_sha1()
  2015-04-24 11:35 [PATCH 0/5] Avoid file descriptor exhaustion in ref_transaction_commit() Michael Haggerty
@ 2015-04-24 11:35 ` Michael Haggerty
  2015-04-24 11:35 ` [PATCH 2/5] commit_ref_update(): " Michael Haggerty
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 19+ messages in thread
From: Michael Haggerty @ 2015-04-24 11:35 UTC (permalink / raw)
  To: Junio C Hamano, Stefan Beller; +Cc: Jeff King, git, Michael Haggerty

This is the first step towards separating the checking and writing of
the new reference value to committing the change.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 refs.c | 20 ++++++++++++++++----
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/refs.c b/refs.c
index 4f495bd..72dae21 100644
--- a/refs.c
+++ b/refs.c
@@ -3017,11 +3017,10 @@ int is_branch(const char *refname)
 }
 
 /*
- * Write sha1 into the ref specified by the lock. Make sure that errno
- * is sane on error.
+ * Write sha1 into the open lockfile, then close the lockfile. On
+ * errors, rollback the lockfile and set errno to reflect the problem.
  */
-static int write_ref_sha1(struct ref_lock *lock,
-	const unsigned char *sha1, const char *logmsg)
+static int write_ref_to_lockfile(struct ref_lock *lock, const unsigned char *sha1)
 {
 	static char term = '\n';
 	struct object *o;
@@ -3050,6 +3049,19 @@ static int write_ref_sha1(struct ref_lock *lock,
 		errno = save_errno;
 		return -1;
 	}
+	return 0;
+}
+
+/*
+ * Write sha1 into the ref specified by the lock. Make sure that errno
+ * is sane on error.
+ */
+static int write_ref_sha1(struct ref_lock *lock,
+	const unsigned char *sha1, const char *logmsg)
+{
+	if (write_ref_to_lockfile(lock, sha1))
+		return -1;
+
 	clear_loose_ref_cache(&ref_cache);
 	if (log_ref_write(lock->ref_name, lock->old_sha1, sha1, logmsg) < 0 ||
 	    (strcmp(lock->ref_name, lock->orig_ref_name) &&
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 2/5] commit_ref_update(): new function, extracted from write_ref_sha1()
  2015-04-24 11:35 [PATCH 0/5] Avoid file descriptor exhaustion in ref_transaction_commit() Michael Haggerty
  2015-04-24 11:35 ` [PATCH 1/5] write_ref_to_lockfile(): new function, extracted from write_ref_sha1() Michael Haggerty
@ 2015-04-24 11:35 ` Michael Haggerty
  2015-04-24 11:35 ` [PATCH 3/5] write_ref_sha1(): inline function at callers Michael Haggerty
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 19+ messages in thread
From: Michael Haggerty @ 2015-04-24 11:35 UTC (permalink / raw)
  To: Junio C Hamano, Stefan Beller; +Cc: Jeff King, git, Michael Haggerty

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 refs.c | 26 +++++++++++++++++---------
 1 file changed, 17 insertions(+), 9 deletions(-)

diff --git a/refs.c b/refs.c
index 72dae21..9b68aec 100644
--- a/refs.c
+++ b/refs.c
@@ -3052,16 +3052,9 @@ static int write_ref_to_lockfile(struct ref_lock *lock, const unsigned char *sha
 	return 0;
 }
 
-/*
- * Write sha1 into the ref specified by the lock. Make sure that errno
- * is sane on error.
- */
-static int write_ref_sha1(struct ref_lock *lock,
-	const unsigned char *sha1, const char *logmsg)
+static int commit_ref_update(struct ref_lock *lock,
+			     const unsigned char *sha1, const char *logmsg)
 {
-	if (write_ref_to_lockfile(lock, sha1))
-		return -1;
-
 	clear_loose_ref_cache(&ref_cache);
 	if (log_ref_write(lock->ref_name, lock->old_sha1, sha1, logmsg) < 0 ||
 	    (strcmp(lock->ref_name, lock->orig_ref_name) &&
@@ -3100,6 +3093,21 @@ static int write_ref_sha1(struct ref_lock *lock,
 	return 0;
 }
 
+/*
+ * Write sha1 as the new value of the reference specified by the
+ * (open) lock. On error, roll back the lockfile and set errno
+ * appropriately.
+ */
+static int write_ref_sha1(struct ref_lock *lock,
+	const unsigned char *sha1, const char *logmsg)
+{
+	if (write_ref_to_lockfile(lock, sha1) ||
+	    commit_ref_update(lock, sha1, logmsg))
+		return -1;
+
+	return 0;
+}
+
 int create_symref(const char *ref_target, const char *refs_heads_master,
 		  const char *logmsg)
 {
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 3/5] write_ref_sha1(): inline function at callers
  2015-04-24 11:35 [PATCH 0/5] Avoid file descriptor exhaustion in ref_transaction_commit() Michael Haggerty
  2015-04-24 11:35 ` [PATCH 1/5] write_ref_to_lockfile(): new function, extracted from write_ref_sha1() Michael Haggerty
  2015-04-24 11:35 ` [PATCH 2/5] commit_ref_update(): " Michael Haggerty
@ 2015-04-24 11:35 ` Michael Haggerty
  2015-04-24 11:35 ` [PATCH 4/5] ref_transaction_commit(): remove the local flags variables Michael Haggerty
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 19+ messages in thread
From: Michael Haggerty @ 2015-04-24 11:35 UTC (permalink / raw)
  To: Junio C Hamano, Stefan Beller; +Cc: Jeff King, git, Michael Haggerty

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 refs.c | 38 +++++++++++++++-----------------------
 1 file changed, 15 insertions(+), 23 deletions(-)

diff --git a/refs.c b/refs.c
index 9b68aec..a55d541 100644
--- a/refs.c
+++ b/refs.c
@@ -2770,8 +2770,10 @@ static int rename_ref_available(const char *oldname, const char *newname)
 	return ret;
 }
 
-static int write_ref_sha1(struct ref_lock *lock, const unsigned char *sha1,
-			  const char *logmsg);
+static int write_ref_to_lockfile(struct ref_lock *lock,
+				 const unsigned char *sha1);
+static int commit_ref_update(struct ref_lock *lock,
+			     const unsigned char *sha1, const char *logmsg);
 
 int rename_ref(const char *oldrefname, const char *newrefname, const char *logmsg)
 {
@@ -2829,7 +2831,8 @@ int rename_ref(const char *oldrefname, const char *newrefname, const char *logms
 		goto rollback;
 	}
 	hashcpy(lock->old_sha1, orig_sha1);
-	if (write_ref_sha1(lock, orig_sha1, logmsg)) {
+	if (write_ref_to_lockfile(lock, orig_sha1) ||
+	    commit_ref_update(lock, orig_sha1, logmsg)) {
 		error("unable to write current sha1 into %s", newrefname);
 		goto rollback;
 	}
@@ -2845,7 +2848,8 @@ int rename_ref(const char *oldrefname, const char *newrefname, const char *logms
 
 	flag = log_all_ref_updates;
 	log_all_ref_updates = 0;
-	if (write_ref_sha1(lock, orig_sha1, NULL))
+	if (write_ref_to_lockfile(lock, orig_sha1) ||
+	    commit_ref_update(lock, orig_sha1, NULL))
 		error("unable to write current sha1 into %s", oldrefname);
 	log_all_ref_updates = flag;
 
@@ -3093,21 +3097,6 @@ static int commit_ref_update(struct ref_lock *lock,
 	return 0;
 }
 
-/*
- * Write sha1 as the new value of the reference specified by the
- * (open) lock. On error, roll back the lockfile and set errno
- * appropriately.
- */
-static int write_ref_sha1(struct ref_lock *lock,
-	const unsigned char *sha1, const char *logmsg)
-{
-	if (write_ref_to_lockfile(lock, sha1) ||
-	    commit_ref_update(lock, sha1, logmsg))
-		return -1;
-
-	return 0;
-}
-
 int create_symref(const char *ref_target, const char *refs_heads_master,
 		  const char *logmsg)
 {
@@ -3801,15 +3790,18 @@ int ref_transaction_commit(struct ref_transaction *transaction,
 				 */
 				unlock_ref(update->lock);
 				update->lock = NULL;
-			} else if (write_ref_sha1(update->lock, update->new_sha1,
-						  update->msg)) {
-				update->lock = NULL; /* freed by write_ref_sha1 */
+			} else if (write_ref_to_lockfile(update->lock,
+							 update->new_sha1) ||
+				   commit_ref_update(update->lock,
+						     update->new_sha1,
+						     update->msg)) {
+				update->lock = NULL; /* freed by the above calls */
 				strbuf_addf(err, "Cannot update the ref '%s'.",
 					    update->refname);
 				ret = TRANSACTION_GENERIC_ERROR;
 				goto cleanup;
 			} else {
-				/* freed by write_ref_sha1(): */
+				/* freed by the above calls: */
 				update->lock = NULL;
 			}
 		}
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 4/5] ref_transaction_commit(): remove the local flags variables
  2015-04-24 11:35 [PATCH 0/5] Avoid file descriptor exhaustion in ref_transaction_commit() Michael Haggerty
                   ` (2 preceding siblings ...)
  2015-04-24 11:35 ` [PATCH 3/5] write_ref_sha1(): inline function at callers Michael Haggerty
@ 2015-04-24 11:35 ` Michael Haggerty
  2015-04-24 17:30   ` Jeff King
  2015-04-24 21:51   ` Eric Sunshine
  2015-04-24 11:35 ` [PATCH 5/5] ref_transaction_commit(): only keep one lockfile open at a time Michael Haggerty
                   ` (2 subsequent siblings)
  6 siblings, 2 replies; 19+ messages in thread
From: Michael Haggerty @ 2015-04-24 11:35 UTC (permalink / raw)
  To: Junio C Hamano, Stefan Beller; +Cc: Jeff King, git, Michael Haggerty

Instead, work directly with update->flags. This has the advantage that
the REF_DELETING bit, set in the first loop, can be read in the third
loop instead of having to compute the same expression again. Plus, it
was kindof confusing having both update->flags and flags, which
sometimes had different values.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 refs.c | 15 ++++++---------
 1 file changed, 6 insertions(+), 9 deletions(-)

diff --git a/refs.c b/refs.c
index a55d541..782e777 100644
--- a/refs.c
+++ b/refs.c
@@ -3752,16 +3752,15 @@ int ref_transaction_commit(struct ref_transaction *transaction,
 	/* Acquire all locks while verifying old values */
 	for (i = 0; i < n; i++) {
 		struct ref_update *update = updates[i];
-		unsigned int flags = update->flags;
 
-		if ((flags & REF_HAVE_NEW) && is_null_sha1(update->new_sha1))
-			flags |= REF_DELETING;
+		if ((update->flags & REF_HAVE_NEW) && is_null_sha1(update->new_sha1))
+			update->flags |= REF_DELETING;
 		update->lock = lock_ref_sha1_basic(
 				update->refname,
 				((update->flags & REF_HAVE_OLD) ?
 				 update->old_sha1 : NULL),
 				NULL,
-				flags,
+				update->flags,
 				&update->type);
 		if (!update->lock) {
 			ret = (errno == ENOTDIR)
@@ -3776,9 +3775,8 @@ int ref_transaction_commit(struct ref_transaction *transaction,
 	/* Perform updates first so live commits remain referenced */
 	for (i = 0; i < n; i++) {
 		struct ref_update *update = updates[i];
-		int flags = update->flags;
 
-		if ((flags & REF_HAVE_NEW) && !is_null_sha1(update->new_sha1)) {
+		if ((update->flags & REF_HAVE_NEW) && !is_null_sha1(update->new_sha1)) {
 			int overwriting_symref = ((update->type & REF_ISSYMREF) &&
 						  (update->flags & REF_NODEREF));
 
@@ -3810,15 +3808,14 @@ int ref_transaction_commit(struct ref_transaction *transaction,
 	/* Perform deletes now that updates are safely completed */
 	for (i = 0; i < n; i++) {
 		struct ref_update *update = updates[i];
-		int flags = update->flags;
 
-		if ((flags & REF_HAVE_NEW) && is_null_sha1(update->new_sha1)) {
+		if (update->flags & REF_DELETING) {
 			if (delete_ref_loose(update->lock, update->type, err)) {
 				ret = TRANSACTION_GENERIC_ERROR;
 				goto cleanup;
 			}
 
-			if (!(flags & REF_ISPRUNING))
+			if (!(update->flags & REF_ISPRUNING))
 				string_list_append(&refs_to_delete,
 						   update->lock->ref_name);
 		}
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH 5/5] ref_transaction_commit(): only keep one lockfile open at a time
  2015-04-24 11:35 [PATCH 0/5] Avoid file descriptor exhaustion in ref_transaction_commit() Michael Haggerty
                   ` (3 preceding siblings ...)
  2015-04-24 11:35 ` [PATCH 4/5] ref_transaction_commit(): remove the local flags variables Michael Haggerty
@ 2015-04-24 11:35 ` Michael Haggerty
  2015-04-25  6:08   ` Michael Haggerty
  2015-04-24 17:26 ` [PATCH 0/5] Avoid file descriptor exhaustion in ref_transaction_commit() Jeff King
  2015-04-25  4:28 ` Junio C Hamano
  6 siblings, 1 reply; 19+ messages in thread
From: Michael Haggerty @ 2015-04-24 11:35 UTC (permalink / raw)
  To: Junio C Hamano, Stefan Beller; +Cc: Jeff King, git, Michael Haggerty

The old code locked all of the references in the first loop, leaving
all of the lockfiles open until later loops. Since we might be
updating a lot of references, this could cause file descriptor
exhaustion.

But there is no reason to keep the lockfiles open after the first
loop:

* For references being deleted or verified, we don't have to write
  anything into the lockfile, so we can close it immediately.

* For references being updated, we can write the new SHA-1 into the
  lockfile immediately and close the lockfile without renaming it into
  place. In the second loop, the already-written contents can be
  renamed into place using commit_ref_update().

To simplify the bookkeeping across loops, add a new REF_NEEDS_COMMIT
bit to update->flags, which keeps track of whether the corresponding
lockfile needs to be committed, as opposed to just unlocked.

This change fixes two tests in t1400.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 refs.c                | 68 +++++++++++++++++++++++++++++++++++++--------------
 t/t1400-update-ref.sh |  4 +--
 2 files changed, 51 insertions(+), 21 deletions(-)

diff --git a/refs.c b/refs.c
index 782e777..2bdd93c 100644
--- a/refs.c
+++ b/refs.c
@@ -36,25 +36,31 @@ static unsigned char refname_disposition[256] = {
  * Flag passed to lock_ref_sha1_basic() telling it to tolerate broken
  * refs (i.e., because the reference is about to be deleted anyway).
  */
-#define REF_DELETING	0x02
+#define REF_DELETING 0x02
 
 /*
  * Used as a flag in ref_update::flags when a loose ref is being
  * pruned.
  */
-#define REF_ISPRUNING	0x04
+#define REF_ISPRUNING 0x04
 
 /*
  * Used as a flag in ref_update::flags when the reference should be
  * updated to new_sha1.
  */
-#define REF_HAVE_NEW	0x08
+#define REF_HAVE_NEW 0x08
 
 /*
  * Used as a flag in ref_update::flags when old_sha1 should be
  * checked.
  */
-#define REF_HAVE_OLD	0x10
+#define REF_HAVE_OLD 0x10
+
+/*
+ * Used as a flag in ref_update::flags when the lockfile needs to be
+ * committed.
+ */
+#define REF_NEEDS_COMMIT 0x20
 
 /*
  * Try to read one refname component from the front of refname.
@@ -3749,7 +3755,12 @@ int ref_transaction_commit(struct ref_transaction *transaction,
 		goto cleanup;
 	}
 
-	/* Acquire all locks while verifying old values */
+	/*
+	 * Acquire all locks, verify old values if provided, check
+	 * that new values are valid, and write new values to the
+	 * lockfiles, ready to be activated. Only keep one lockfile
+	 * open at a time to avoid running out of file descriptors.
+	 */
 	for (i = 0; i < n; i++) {
 		struct ref_update *update = updates[i];
 
@@ -3770,13 +3781,7 @@ int ref_transaction_commit(struct ref_transaction *transaction,
 				    update->refname);
 			goto cleanup;
 		}
-	}
-
-	/* Perform updates first so live commits remain referenced */
-	for (i = 0; i < n; i++) {
-		struct ref_update *update = updates[i];
-
-		if ((update->flags & REF_HAVE_NEW) && !is_null_sha1(update->new_sha1)) {
+		if ((update->flags & REF_HAVE_NEW) && !(update->flags & REF_DELETING)) {
 			int overwriting_symref = ((update->type & REF_ISSYMREF) &&
 						  (update->flags & REF_NODEREF));
 
@@ -3786,14 +3791,39 @@ int ref_transaction_commit(struct ref_transaction *transaction,
 				 * The reference already has the desired
 				 * value, so we don't need to write it.
 				 */
-				unlock_ref(update->lock);
-				update->lock = NULL;
 			} else if (write_ref_to_lockfile(update->lock,
-							 update->new_sha1) ||
-				   commit_ref_update(update->lock,
-						     update->new_sha1,
-						     update->msg)) {
-				update->lock = NULL; /* freed by the above calls */
+							 update->new_sha1)) {
+				update->lock = NULL; /* freed by the above call */
+				strbuf_addf(err, "Cannot update the ref '%s'.",
+					    update->refname);
+				ret = TRANSACTION_GENERIC_ERROR;
+				goto cleanup;
+			} else {
+				update->flags |= REF_NEEDS_COMMIT;
+			}
+		}
+		if (!(update->flags & REF_NEEDS_COMMIT)) {
+			/*
+			 * We didn't have to write anything to the lockfile.
+			 * Close it to free up the file descriptor:
+			 */
+			if (close_ref(update->lock)) {
+				strbuf_addf(err, "Couldn't close %s.lock",
+					    update->refname);
+				goto cleanup;
+			}
+		}
+	}
+
+	/* Perform updates first so live commits remain referenced */
+	for (i = 0; i < n; i++) {
+		struct ref_update *update = updates[i];
+
+		if (update->flags & REF_NEEDS_COMMIT) {
+			if (commit_ref_update(update->lock,
+						    update->new_sha1,
+						    update->msg)) {
+				update->lock = NULL; /* freed by the above call */
 				strbuf_addf(err, "Cannot update the ref '%s'.",
 					    update->refname);
 				ret = TRANSACTION_GENERIC_ERROR;
diff --git a/t/t1400-update-ref.sh b/t/t1400-update-ref.sh
index 7a69f1a..636d3a1 100755
--- a/t/t1400-update-ref.sh
+++ b/t/t1400-update-ref.sh
@@ -1071,7 +1071,7 @@ run_with_limited_open_files () {
 
 test_lazy_prereq ULIMIT_FILE_DESCRIPTORS 'run_with_limited_open_files true'
 
-test_expect_failure ULIMIT_FILE_DESCRIPTORS 'large transaction creating branches does not burst open file limit' '
+test_expect_success ULIMIT_FILE_DESCRIPTORS 'large transaction creating branches does not burst open file limit' '
 (
 	for i in $(test_seq 33)
 	do
@@ -1082,7 +1082,7 @@ test_expect_failure ULIMIT_FILE_DESCRIPTORS 'large transaction creating branches
 )
 '
 
-test_expect_failure ULIMIT_FILE_DESCRIPTORS 'large transaction deleting branches does not burst open file limit' '
+test_expect_success ULIMIT_FILE_DESCRIPTORS 'large transaction deleting branches does not burst open file limit' '
 (
 	for i in $(test_seq 33)
 	do
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH 0/5] Avoid file descriptor exhaustion in ref_transaction_commit()
  2015-04-24 11:35 [PATCH 0/5] Avoid file descriptor exhaustion in ref_transaction_commit() Michael Haggerty
                   ` (4 preceding siblings ...)
  2015-04-24 11:35 ` [PATCH 5/5] ref_transaction_commit(): only keep one lockfile open at a time Michael Haggerty
@ 2015-04-24 17:26 ` Jeff King
  2015-04-24 19:13   ` Stefan Beller
  2015-04-25 19:27   ` Junio C Hamano
  2015-04-25  4:28 ` Junio C Hamano
  6 siblings, 2 replies; 19+ messages in thread
From: Jeff King @ 2015-04-24 17:26 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Junio C Hamano, Stefan Beller, git

On Fri, Apr 24, 2015 at 01:35:44PM +0200, Michael Haggerty wrote:

> In ref_transaction_commit(), close the reference lockfiles immediately
> to avoid keeping too many file descriptors open at a time. This is
> pretty easy, because in the first loop (where we create the locks) we
> already know what, if anything, has to be written into the lockfile.
> So write it and close the lockfile immediately. In the second loop,
> rename the lockfiles for reference updates into place, and in the
> cleanup loop, unlock any references that are still locked (i.e., those
> that were only being verified or deleted).
> 
> I think this is a cleaner solution than Stefan's approach [1] of
> closing and reopening fds based on an estimate of how many fds we can
> afford to waste--we only need a single file descriptor at a time, and
> we never have to close then reopen a lockfile. But it is a bit more
> intrusive, so it might still be preferable to use Stefan's approach
> for release 2.4.0, if indeed any fix for this problem is still being
> considered for that release.

I like this approach much better. It seems like the best of all worlds
(same performance, and we don't have to worry about whether and when to
close lockfiles).

Stefan's patch is just in pu at this point, right? I do not think there
is any rushing/release concern. It is too late for either to be in
v2.4.0, so the only decision is whether to aim for "master" or "maint".
To me, they both seem to be in the same ballpark as far as risking a
regression.

-Peff

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 4/5] ref_transaction_commit(): remove the local flags variables
  2015-04-24 11:35 ` [PATCH 4/5] ref_transaction_commit(): remove the local flags variables Michael Haggerty
@ 2015-04-24 17:30   ` Jeff King
  2015-04-24 21:15     ` Michael Haggerty
  2015-04-24 21:51   ` Eric Sunshine
  1 sibling, 1 reply; 19+ messages in thread
From: Jeff King @ 2015-04-24 17:30 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Junio C Hamano, Stefan Beller, git

On Fri, Apr 24, 2015 at 01:35:48PM +0200, Michael Haggerty wrote:

> Instead, work directly with update->flags. This has the advantage that
> the REF_DELETING bit, set in the first loop, can be read in the third
> loop instead of having to compute the same expression again. Plus, it
> was kindof confusing having both update->flags and flags, which
> sometimes had different values.

Hmm. I think this is losing the distinction of "flags the caller has
passed in to us" versus "flags we are using locally only during the
transaction_commit routine". If callers look at the flags in the
REF_TRANSACTION_CLOSED state, do they care about seeing these new flags?

My guess is probably not in practice, and "leaking" these flags is an
acceptable tradeoff for keeping the transaction_commit function simpler.
But I haven't looked that closely.

-Peff

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 0/5] Avoid file descriptor exhaustion in ref_transaction_commit()
  2015-04-24 17:26 ` [PATCH 0/5] Avoid file descriptor exhaustion in ref_transaction_commit() Jeff King
@ 2015-04-24 19:13   ` Stefan Beller
  2015-04-25 19:27   ` Junio C Hamano
  1 sibling, 0 replies; 19+ messages in thread
From: Stefan Beller @ 2015-04-24 19:13 UTC (permalink / raw)
  To: Jeff King; +Cc: Michael Haggerty, Junio C Hamano, git

On Fri, Apr 24, 2015 at 10:26 AM, Jeff King <peff@peff.net> wrote:
> On Fri, Apr 24, 2015 at 01:35:44PM +0200, Michael Haggerty wrote:
>
>> In ref_transaction_commit(), close the reference lockfiles immediately
>> to avoid keeping too many file descriptors open at a time. This is
>> pretty easy, because in the first loop (where we create the locks) we
>> already know what, if anything, has to be written into the lockfile.
>> So write it and close the lockfile immediately. In the second loop,
>> rename the lockfiles for reference updates into place, and in the
>> cleanup loop, unlock any references that are still locked (i.e., those
>> that were only being verified or deleted).
>>
>> I think this is a cleaner solution than Stefan's approach [1] of
>> closing and reopening fds based on an estimate of how many fds we can
>> afford to waste--we only need a single file descriptor at a time, and
>> we never have to close then reopen a lockfile. But it is a bit more
>> intrusive, so it might still be preferable to use Stefan's approach
>> for release 2.4.0, if indeed any fix for this problem is still being
>> considered for that release.
>
> I like this approach much better. It seems like the best of all worlds
> (same performance, and we don't have to worry about whether and when to
> close lockfiles).

I would have guessed this approach to take more work to do it right.
Thanks Michael for tackling the problem in an elegant way!

>
> Stefan's patch is just in pu at this point, right? I do not think there
> is any rushing/release concern.

Yeah it's in pu, so it's easy to remove.

> It is too late for either to be in
> v2.4.0, so the only decision is whether to aim for "master" or "maint".
> To me, they both seem to be in the same ballpark as far as risking a
> regression.
>
> -Peff


Thanks,
Stefan

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 4/5] ref_transaction_commit(): remove the local flags variables
  2015-04-24 17:30   ` Jeff King
@ 2015-04-24 21:15     ` Michael Haggerty
  2015-04-24 21:19       ` Jeff King
  0 siblings, 1 reply; 19+ messages in thread
From: Michael Haggerty @ 2015-04-24 21:15 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, Stefan Beller, git

On 04/24/2015 07:30 PM, Jeff King wrote:
> On Fri, Apr 24, 2015 at 01:35:48PM +0200, Michael Haggerty wrote:
> 
>> Instead, work directly with update->flags. This has the advantage that
>> the REF_DELETING bit, set in the first loop, can be read in the third
>> loop instead of having to compute the same expression again. Plus, it
>> was kindof confusing having both update->flags and flags, which
>> sometimes had different values.
> 
> Hmm. I think this is losing the distinction of "flags the caller has
> passed in to us" versus "flags we are using locally only during the
> transaction_commit routine". If callers look at the flags in the
> REF_TRANSACTION_CLOSED state, do they care about seeing these new flags?
> 
> My guess is probably not in practice, and "leaking" these flags is an
> acceptable tradeoff for keeping the transaction_commit function simpler.
> But I haven't looked that closely.

"struct ref_update" is opaque to callers outside of the refs module, and
ref_update::flags is not read anywhere outside of
ref_transaction_commit() (and its value is passed to
lock_ref_sha1_basic()). So I don't think we have to be shy about storing
our own internal information there.

In fact, REF_DELETING, REF_ISPRUNING, REF_HAVE_NEW, and REF_HAVE_OLD are
also private to the refs module.

I suppose we could mask out all the "private" bits in the flags
parameter passed by the caller, to make sure that the caller hasn't
accidentally set other bits. I think that would be more defensive than
our usual practice, but I don't mind doing it if people think it would
be prudent.

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 4/5] ref_transaction_commit(): remove the local flags variables
  2015-04-24 21:15     ` Michael Haggerty
@ 2015-04-24 21:19       ` Jeff King
  0 siblings, 0 replies; 19+ messages in thread
From: Jeff King @ 2015-04-24 21:19 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Junio C Hamano, Stefan Beller, git

On Fri, Apr 24, 2015 at 11:15:09PM +0200, Michael Haggerty wrote:

> > Hmm. I think this is losing the distinction of "flags the caller has
> > passed in to us" versus "flags we are using locally only during the
> > transaction_commit routine". If callers look at the flags in the
> > REF_TRANSACTION_CLOSED state, do they care about seeing these new flags?
> > 
> > My guess is probably not in practice, and "leaking" these flags is an
> > acceptable tradeoff for keeping the transaction_commit function simpler.
> > But I haven't looked that closely.
> 
> "struct ref_update" is opaque to callers outside of the refs module, and
> ref_update::flags is not read anywhere outside of
> ref_transaction_commit() (and its value is passed to
> lock_ref_sha1_basic()). So I don't think we have to be shy about storing
> our own internal information there.
> 
> In fact, REF_DELETING, REF_ISPRUNING, REF_HAVE_NEW, and REF_HAVE_OLD are
> also private to the refs module.

Thanks for checking. If nobody is affected (and is not likely to be), I
agree it's not worth worrying about.

> I suppose we could mask out all the "private" bits in the flags
> parameter passed by the caller, to make sure that the caller hasn't
> accidentally set other bits. I think that would be more defensive than
> our usual practice, but I don't mind doing it if people think it would
> be prudent.

I don't think it's necessary.

-Peff

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 4/5] ref_transaction_commit(): remove the local flags variables
  2015-04-24 11:35 ` [PATCH 4/5] ref_transaction_commit(): remove the local flags variables Michael Haggerty
  2015-04-24 17:30   ` Jeff King
@ 2015-04-24 21:51   ` Eric Sunshine
  2015-04-24 22:22     ` Michael Haggerty
  1 sibling, 1 reply; 19+ messages in thread
From: Eric Sunshine @ 2015-04-24 21:51 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Junio C Hamano, Stefan Beller, Jeff King, Git List

On Fri, Apr 24, 2015 at 7:35 AM, Michael Haggerty <mhagger@alum.mit.edu> wrote:
> Instead, work directly with update->flags. This has the advantage that
> the REF_DELETING bit, set in the first loop, can be read in the third
> loop instead of having to compute the same expression again. Plus, it
> was kindof confusing having both update->flags and flags, which

s/kindof/kind of/

> sometimes had different values.
>
> Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 4/5] ref_transaction_commit(): remove the local flags variables
  2015-04-24 21:51   ` Eric Sunshine
@ 2015-04-24 22:22     ` Michael Haggerty
  0 siblings, 0 replies; 19+ messages in thread
From: Michael Haggerty @ 2015-04-24 22:22 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: Junio C Hamano, Stefan Beller, Jeff King, Git List

On 04/24/2015 11:51 PM, Eric Sunshine wrote:
> On Fri, Apr 24, 2015 at 7:35 AM, Michael Haggerty <mhagger@alum.mit.edu> wrote:
>> Instead, work directly with update->flags. This has the advantage that
>> the REF_DELETING bit, set in the first loop, can be read in the third
>> loop instead of having to compute the same expression again. Plus, it
>> was kindof confusing having both update->flags and flags, which
> 
> s/kindof/kind of/
> 
>> sometimes had different values.
>>
>> Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>

Hehe thanks for looking over my scribblings. In this case, neither
"kindof" nor "kind of" is in fact correct English. I used the slang word
"kindof" (sometimes spelled "kinda") to mean "somewhat", I guess because
"somewhat" sounded too formal for my slapdash opinion.

But I suppose it's kindof thoughtless to use slang in commit messages
:-), especially given that they are also meant for people for whom
English is a second language (let alone sloppy American English).

I suggest s/kindof/potentially/, at least until I have time to submit a
patch to the English language to make "kindof" a reputable word :-)

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 0/5] Avoid file descriptor exhaustion in ref_transaction_commit()
  2015-04-24 11:35 [PATCH 0/5] Avoid file descriptor exhaustion in ref_transaction_commit() Michael Haggerty
                   ` (5 preceding siblings ...)
  2015-04-24 17:26 ` [PATCH 0/5] Avoid file descriptor exhaustion in ref_transaction_commit() Jeff King
@ 2015-04-25  4:28 ` Junio C Hamano
  6 siblings, 0 replies; 19+ messages in thread
From: Junio C Hamano @ 2015-04-25  4:28 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Stefan Beller, Jeff King, git

Michael Haggerty <mhagger@alum.mit.edu> writes:

> In ref_transaction_commit(), close the reference lockfiles immediately
> to avoid keeping too many file descriptors open at a time. This is
> pretty easy, because in the first loop (where we create the locks) we
> already know what, if anything, has to be written into the lockfile.

Nicely analysed and beautifully executed.  I like it.

> This patch series applies on top of Stefan's
>
>     c1f0ca9 refs.c: remove lock_fd from struct ref_lock (2015-04-16)
>
> and it fixes two tests that Stefan introduced earlier in that series.

Thanks.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 5/5] ref_transaction_commit(): only keep one lockfile open at a time
  2015-04-24 11:35 ` [PATCH 5/5] ref_transaction_commit(): only keep one lockfile open at a time Michael Haggerty
@ 2015-04-25  6:08   ` Michael Haggerty
  2015-04-25  6:57     ` Junio C Hamano
  0 siblings, 1 reply; 19+ messages in thread
From: Michael Haggerty @ 2015-04-25  6:08 UTC (permalink / raw)
  To: Junio C Hamano, Stefan Beller; +Cc: Jeff King, git

On 04/24/2015 01:35 PM, Michael Haggerty wrote:
> The old code locked all of the references in the first loop, leaving
> all of the lockfiles open until later loops. Since we might be
> updating a lot of references, this could cause file descriptor
> exhaustion.
> 
> But there is no reason to keep the lockfiles open after the first
> loop:
> 
> * For references being deleted or verified, we don't have to write
>   anything into the lockfile, so we can close it immediately.
> 
> * For references being updated, we can write the new SHA-1 into the
>   lockfile immediately and close the lockfile without renaming it into
>   place. In the second loop, the already-written contents can be
>   renamed into place using commit_ref_update().
> 
> To simplify the bookkeeping across loops, add a new REF_NEEDS_COMMIT
> bit to update->flags, which keeps track of whether the corresponding
> lockfile needs to be committed, as opposed to just unlocked.
> 
> This change fixes two tests in t1400.

I realize that I should have highlighted another, perhaps more serious
bug that is fixed by this change. The old code was roughly

    for update in updates:
        acquire locks and check old_sha
    for update in updates:
        if changing value:
            write_ref_to_lockfile()
            commit_ref_update()
    for update in updates:
        if deleting value:
            unlink()
    for update in updates:
        if reference still locked:
            unlock_ref()

The atomicity of the update depends on all preconditions being checked
in the first loop, before any changes have start being committed. The
problem is that write_ref_to_lockfile() (which used to be part of
write_ref_sha1()), which happens in the second loop, checks some more
preconditions:

* It verifies that new_sha1 is a valid object
* If the reference being updated is a branch, it verifies that new_sha1
  points at a commit object (as opposed to a tag, tree, or blob).

If either of these preconditions fails, the "transaction" might be
aborted after some reference updates have already been permanently
committed. In other words, the all-or-nothing promise of "git update-ref
--stdin" and "git push --atomic" would be violated.

I'm sorry that I didn't fully realize this problem earlier.

After this patch, the code looks like

    for update in updates:
        acquire locks and check old_sha
        if changing value:
            write_ref_to_lockfile()
    for update in updates:
        if changing value:
            commit_ref_update()
    for update in updates:
        if deleting value:
            unlink()
    for update in updates:
        if reference still locked:
            unlock_ref()

This has the important effect that the pre-checks in
write_ref_to_lockfile() are carried out before any changes have been
committed, so if any of those checks fail, the transaction can be rolled
back correctly.

Given that "git push --atomic" is one of the main new features of 2.4.0,
it would be unfortunate for the release to contain this bug, plus the
bug that atomic pushes can fail due to file descriptor exhaustion.

Is this problem important enough to justify delaying the release to get
the fix in?

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 5/5] ref_transaction_commit(): only keep one lockfile open at a time
  2015-04-25  6:08   ` Michael Haggerty
@ 2015-04-25  6:57     ` Junio C Hamano
  2015-04-25 19:21       ` Junio C Hamano
  0 siblings, 1 reply; 19+ messages in thread
From: Junio C Hamano @ 2015-04-25  6:57 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Stefan Beller, Jeff King, git

Michael Haggerty <mhagger@alum.mit.edu> writes:

> Given that "git push --atomic" is one of the main new features of 2.4.0,
> it would be unfortunate for the release to contain this bug, plus the
> bug that atomic pushes can fail due to file descriptor exhaustion.
>
> Is this problem important enough to justify delaying the release to get
> the fix in?

I am not too worried about "push --atomic", as we can just add a few
words to Release Notes and documentation saying "this is still an
experimental broken code that is unusable; don't use the feature in
production".

I however am more worried about the other one "update-ref --stdin";
the change will be pure regression for those who want to do many
updates and do not care if the update is atomic, no?

Unfortuntely it is pretty late in the game, and even though in
principle I know that the only sensible way forward is to revert the
original breakage, I find it very tempting to patch it up with your
series.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 5/5] ref_transaction_commit(): only keep one lockfile open at a time
  2015-04-25  6:57     ` Junio C Hamano
@ 2015-04-25 19:21       ` Junio C Hamano
  2015-04-28  4:36         ` Jeff King
  0 siblings, 1 reply; 19+ messages in thread
From: Junio C Hamano @ 2015-04-25 19:21 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Stefan Beller, Jeff King, git

Junio C Hamano <gitster@pobox.com> writes:

> I am not too worried about "push --atomic", as we can just add a few
> words to Release Notes and documentation saying "this is still an
> experimental broken code that is unusable; don't use the feature in
> production".
>
> I however am more worried about the other one "update-ref --stdin";
> the change will be pure regression for those who want to do many
> updates and do not care if the update is atomic, no?

I should have refrained from touching the keyboard so late at night
X-<.  This regression was done long time ago (even in v2.1.0 I see
that ref_transaction_commit() tries to grab all locks at once).

So it is only "push --atomic".

The choice is between (1) shipping "push --atomic" that is known to
be broken, (2) applying your five-patch series which may (a) fix
both "push --atomic" and "update-ref --stdin", or (b) break other
transaction users including "update-ref -stdin" in unexpected ways.

I dunno.  I am still tempted to go route (2) hoping that it would
result in (2-a) not (2-b).

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 0/5] Avoid file descriptor exhaustion in ref_transaction_commit()
  2015-04-24 17:26 ` [PATCH 0/5] Avoid file descriptor exhaustion in ref_transaction_commit() Jeff King
  2015-04-24 19:13   ` Stefan Beller
@ 2015-04-25 19:27   ` Junio C Hamano
  1 sibling, 0 replies; 19+ messages in thread
From: Junio C Hamano @ 2015-04-25 19:27 UTC (permalink / raw)
  To: Jeff King; +Cc: Michael Haggerty, Stefan Beller, git

Jeff King <peff@peff.net> writes:

> Stefan's patch is just in pu at this point, right? I do not think there
> is any rushing/release concern. It is too late for either to be in
> v2.4.0, so the only decision is whether to aim for "master" or "maint".
> To me, they both seem to be in the same ballpark as far as risking a
> regression.

The series builds on c1f0ca9 that is on 2.4.0-rc2, so we would need
to wiggle things around to apply to older codebase if we want to fix
the "too many open file descriptor" issue on older maintenance
releases for "update-ref --stdin".

I personally feel that it is OK to ship v2.4.0 without the fix,
leaving "push --atomic" broken, and fix it in v2.4.1, but I kinda
prefer that the final fix to be applicable for older maintenance
releases, at least to 2.3.x track, if not 2.2.x track.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 5/5] ref_transaction_commit(): only keep one lockfile open at a time
  2015-04-25 19:21       ` Junio C Hamano
@ 2015-04-28  4:36         ` Jeff King
  0 siblings, 0 replies; 19+ messages in thread
From: Jeff King @ 2015-04-28  4:36 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Michael Haggerty, Stefan Beller, git

On Sat, Apr 25, 2015 at 12:21:07PM -0700, Junio C Hamano wrote:

> Junio C Hamano <gitster@pobox.com> writes:
> 
> > I am not too worried about "push --atomic", as we can just add a few
> > words to Release Notes and documentation saying "this is still an
> > experimental broken code that is unusable; don't use the feature in
> > production".
> >
> > I however am more worried about the other one "update-ref --stdin";
> > the change will be pure regression for those who want to do many
> > updates and do not care if the update is atomic, no?
> 
> I should have refrained from touching the keyboard so late at night
> X-<.  This regression was done long time ago (even in v2.1.0 I see
> that ref_transaction_commit() tries to grab all locks at once).
> 
> So it is only "push --atomic".
> 
> The choice is between (1) shipping "push --atomic" that is known to
> be broken, (2) applying your five-patch series which may (a) fix
> both "push --atomic" and "update-ref --stdin", or (b) break other
> transaction users including "update-ref -stdin" in unexpected ways.

I'm not sure "--atomic" is foolproof even with these patches.

Certainly there is the obvious problem that the filesystem is not atomic
(so a power loss halfway through committing the locks will result in a
non-atomic push). But there are also subtle locking issues related to
D/F conflicts.

For instance, if we do (and I take no credit for this discovery; Michael
showed it to me last week):

  {
    echo "create refs/heads/branch $sha1"
    echo "create refs/heads/foo/bar $sha1" &&
    echo "create refs/heads/foo $sha1"
  } | git update-ref --stdin

we do not notice the conflict between "foo" and "foo/bar" until the
commit stage. Fortunately our ref-ordering rules mean we will always try
to write "foo" before "foo/bar", which will fail because "foo" must be a
directory (because we have "foo/bar.lock"). But because "branch" comes
alphabetically first, we will have written that ref already, committing
part of the request.

You can work around this with code to check for D/F conflicts in a
single transaction, but then you are subject to races with other
processes. E.g., one process writes "branch" and "foo", the other is
simultaneously writing "foo/bar". There is no lock contention, but the
write of "foo" may fail at the commit stage.

So I think it is OK to restrict the definition of "atomic" here to the
common case of making sure that we can lock and check the $old_sha1 for
each of the refs. Under that definition, "push --atomic" is not broken
as-is, but there is room for improving its robustness.  And I have no
problem with trying to make our storage format as robust as possible in
these cases, but I think we need to admit that the filesystem ref
storage is never going to be 100% transactional.

I know that sounds a bit like moving the goalposts to say "eh, this
feature is not broken, your definition of broken is just wrong". But my
point is that shipping "push --atomic" as-is is still a useful step
forward, and we can continue to iterate and improve on the concept in
future releases.

-Peff

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2015-04-28  4:37 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-04-24 11:35 [PATCH 0/5] Avoid file descriptor exhaustion in ref_transaction_commit() Michael Haggerty
2015-04-24 11:35 ` [PATCH 1/5] write_ref_to_lockfile(): new function, extracted from write_ref_sha1() Michael Haggerty
2015-04-24 11:35 ` [PATCH 2/5] commit_ref_update(): " Michael Haggerty
2015-04-24 11:35 ` [PATCH 3/5] write_ref_sha1(): inline function at callers Michael Haggerty
2015-04-24 11:35 ` [PATCH 4/5] ref_transaction_commit(): remove the local flags variables Michael Haggerty
2015-04-24 17:30   ` Jeff King
2015-04-24 21:15     ` Michael Haggerty
2015-04-24 21:19       ` Jeff King
2015-04-24 21:51   ` Eric Sunshine
2015-04-24 22:22     ` Michael Haggerty
2015-04-24 11:35 ` [PATCH 5/5] ref_transaction_commit(): only keep one lockfile open at a time Michael Haggerty
2015-04-25  6:08   ` Michael Haggerty
2015-04-25  6:57     ` Junio C Hamano
2015-04-25 19:21       ` Junio C Hamano
2015-04-28  4:36         ` Jeff King
2015-04-24 17:26 ` [PATCH 0/5] Avoid file descriptor exhaustion in ref_transaction_commit() Jeff King
2015-04-24 19:13   ` Stefan Beller
2015-04-25 19:27   ` Junio C Hamano
2015-04-25  4:28 ` Junio C Hamano

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.