All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 1/2] refs/files-backend: duplicate strings added to affected_refnames
@ 2017-08-25 18:49 Martin Ågren
  2017-08-25 18:49 ` [PATCH 2/2] refs/files-backend: fix memory leak in lock_ref_for_update Martin Ågren
  2017-08-25 21:00 ` [PATCH 1/2] refs/files-backend: duplicate strings added to affected_refnames Junio C Hamano
  0 siblings, 2 replies; 34+ messages in thread
From: Martin Ågren @ 2017-08-25 18:49 UTC (permalink / raw)
  To: git; +Cc: Michael Haggerty

files_transaction_prepare() and the functions it calls add strings to a
string list without duplicating them, i.e., we keep the original raw
pointers we were given. That is "ok", since we keep them only for a
short-enough time, but we end up leaking some of them.

Switch to duplicating the strings, so that affected_refnames does not
leak memory. The original strings might still leak, but at least that
can now be addressed without worrying about these pointers.

No-one takes any pointers to the strings in the list (it is basically
only used to check for set membership), so it is ok for
string_list_clear to free them.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
---
 refs/files-backend.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index 5cca55510..22daca2ba 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -2407,7 +2407,7 @@ static int files_transaction_prepare(struct ref_store *ref_store,
 			       "ref_transaction_prepare");
 	size_t i;
 	int ret = 0;
-	struct string_list affected_refnames = STRING_LIST_INIT_NODUP;
+	struct string_list affected_refnames = STRING_LIST_INIT_DUP;
 	char *head_ref = NULL;
 	int head_type;
 	struct object_id head_oid;
-- 
2.14.1.151.g45c1275a3.dirty


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 2/2] refs/files-backend: fix memory leak in lock_ref_for_update
  2017-08-25 18:49 [PATCH 1/2] refs/files-backend: duplicate strings added to affected_refnames Martin Ågren
@ 2017-08-25 18:49 ` Martin Ågren
  2017-08-25 21:00 ` [PATCH 1/2] refs/files-backend: duplicate strings added to affected_refnames Junio C Hamano
  1 sibling, 0 replies; 34+ messages in thread
From: Martin Ågren @ 2017-08-25 18:49 UTC (permalink / raw)
  To: git; +Cc: Michael Haggerty

We hand over the "referent" string (buffer) to several different
functions, one of which would sometimes keep the raw pointer,
referent.buf. (When split_symref_update calls string_list_insert.) The
previous patch removed that behavior, so we can now safely release the
string buffer before returning.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
---
I appreciate Git's stance on memory leaks ("if we're about to exit or
die, why bother?") but this one feels different since
lock_ref_for_update is called in a loop rather deep down the call stack.
Feel free to tell me I'm wrong. Patch 1/2 does introduce some
malloc-churning. An alternative would be to call strbuf_release only
when we know it's safe, but that feels really ugly.

 refs/files-backend.c | 31 ++++++++++++++++++++-----------
 1 file changed, 20 insertions(+), 11 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index 22daca2ba..6af07c404 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -2252,7 +2252,7 @@ static int lock_ref_for_update(struct files_ref_store *refs,
 	struct strbuf referent = STRBUF_INIT;
 	int mustexist = (update->flags & REF_HAVE_OLD) &&
 		!is_null_oid(&update->old_oid);
-	int ret;
+	int ret = 0;
 	struct ref_lock *lock;
 
 	files_assert_main_repository(refs, "lock_ref_for_update");
@@ -2264,7 +2264,7 @@ static int lock_ref_for_update(struct files_ref_store *refs,
 		ret = split_head_update(update, transaction, head_ref,
 					affected_refnames, err);
 		if (ret)
-			return ret;
+			goto out;
 	}
 
 	ret = lock_raw_ref(refs, update->refname, mustexist,
@@ -2278,7 +2278,7 @@ static int lock_ref_for_update(struct files_ref_store *refs,
 		strbuf_addf(err, "cannot lock ref '%s': %s",
 			    original_update_refname(update), reason);
 		free(reason);
-		return ret;
+		goto out;
 	}
 
 	update->backend_data = lock;
@@ -2297,10 +2297,12 @@ static int lock_ref_for_update(struct files_ref_store *refs,
 					strbuf_addf(err, "cannot lock ref '%s': "
 						    "error reading reference",
 						    original_update_refname(update));
-					return -1;
+					ret = -1;
+					goto out;
 				}
 			} else if (check_old_oid(update, &lock->old_oid, err)) {
-				return TRANSACTION_GENERIC_ERROR;
+				ret = TRANSACTION_GENERIC_ERROR;
+				goto out;
 			}
 		} else {
 			/*
@@ -2314,13 +2316,15 @@ static int lock_ref_for_update(struct files_ref_store *refs,
 						  referent.buf, transaction,
 						  affected_refnames, err);
 			if (ret)
-				return ret;
+				goto out;
 		}
 	} else {
 		struct ref_update *parent_update;
 
-		if (check_old_oid(update, &lock->old_oid, err))
-			return TRANSACTION_GENERIC_ERROR;
+		if (check_old_oid(update, &lock->old_oid, err)) {
+			ret = TRANSACTION_GENERIC_ERROR;
+			goto out;
+		}
 
 		/*
 		 * If this update is happening indirectly because of a
@@ -2357,7 +2361,8 @@ static int lock_ref_for_update(struct files_ref_store *refs,
 				    "cannot update ref '%s': %s",
 				    update->refname, write_err);
 			free(write_err);
-			return TRANSACTION_GENERIC_ERROR;
+			ret = TRANSACTION_GENERIC_ERROR;
+			goto out;
 		} else {
 			update->flags |= REF_NEEDS_COMMIT;
 		}
@@ -2371,10 +2376,14 @@ static int lock_ref_for_update(struct files_ref_store *refs,
 		if (close_ref(lock)) {
 			strbuf_addf(err, "couldn't close '%s.lock'",
 				    update->refname);
-			return TRANSACTION_GENERIC_ERROR;
+			ret = TRANSACTION_GENERIC_ERROR;
+			goto out;
 		}
 	}
-	return 0;
+
+out:
+	strbuf_release(&referent);
+	return ret;
 }
 
 /*
-- 
2.14.1.151.g45c1275a3.dirty


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* Re: [PATCH 1/2] refs/files-backend: duplicate strings added to affected_refnames
  2017-08-25 18:49 [PATCH 1/2] refs/files-backend: duplicate strings added to affected_refnames Martin Ågren
  2017-08-25 18:49 ` [PATCH 2/2] refs/files-backend: fix memory leak in lock_ref_for_update Martin Ågren
@ 2017-08-25 21:00 ` Junio C Hamano
  2017-08-26 10:16   ` Martin Ågren
  1 sibling, 1 reply; 34+ messages in thread
From: Junio C Hamano @ 2017-08-25 21:00 UTC (permalink / raw)
  To: Martin Ågren; +Cc: git, Michael Haggerty

Martin Ågren <martin.agren@gmail.com> writes:

> files_transaction_prepare() and the functions it calls add strings to a
> string list without duplicating them, i.e., we keep the original raw
> pointers we were given. That is "ok", since we keep them only for a
> short-enough time, but we end up leaking some of them.

Sorry, but I do not understand this statement.  If affected_refnames
string list borrows strings from other structures who own them, and
none of these strings are freed by clearing affected_refnames list,
that is not "leaking"---we didn't acquire the ownership, so it is
not our job to free them in the first place.  Among the original
owners of strings we borrow from, some may not properly free, in
which case that is a leak.

What problem are you solving?

>
> Switch to duplicating the strings, so that affected_refnames does not
> leak memory. The original strings might still leak, but at least that
> can now be addressed without worrying about these pointers.
>
> No-one takes any pointers to the strings in the list (it is basically
> only used to check for set membership), so it is ok for
> string_list_clear to free them.
>
> Signed-off-by: Martin Ågren <martin.agren@gmail.com>
> ---
>  refs/files-backend.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/refs/files-backend.c b/refs/files-backend.c
> index 5cca55510..22daca2ba 100644
> --- a/refs/files-backend.c
> +++ b/refs/files-backend.c
> @@ -2407,7 +2407,7 @@ static int files_transaction_prepare(struct ref_store *ref_store,
>  			       "ref_transaction_prepare");
>  	size_t i;
>  	int ret = 0;
> -	struct string_list affected_refnames = STRING_LIST_INIT_NODUP;
> +	struct string_list affected_refnames = STRING_LIST_INIT_DUP;
>  	char *head_ref = NULL;
>  	int head_type;
>  	struct object_id head_oid;

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 1/2] refs/files-backend: duplicate strings added to affected_refnames
  2017-08-25 21:00 ` [PATCH 1/2] refs/files-backend: duplicate strings added to affected_refnames Junio C Hamano
@ 2017-08-26 10:16   ` Martin Ågren
  2017-08-28  8:06     ` Michael Haggerty
  0 siblings, 1 reply; 34+ messages in thread
From: Martin Ågren @ 2017-08-26 10:16 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git Mailing List, Michael Haggerty

On 25 August 2017 at 23:00, Junio C Hamano <gitster@pobox.com> wrote:
> Martin Ågren <martin.agren@gmail.com> writes:
>
>> files_transaction_prepare() and the functions it calls add strings to a
>> string list without duplicating them, i.e., we keep the original raw
>> pointers we were given. That is "ok", since we keep them only for a
>> short-enough time, but we end up leaking some of them.
>
> Sorry, but I do not understand this statement.  If affected_refnames
> string list borrows strings from other structures who own them, and
> none of these strings are freed by clearing affected_refnames list,
> that is not "leaking"---we didn't acquire the ownership, so it is
> not our job to free them in the first place.  Among the original
> owners of strings we borrow from, some may not properly free, in
> which case that is a leak.
>
> What problem are you solving?

Sorry. Maybe this explains my intentions better:

    In lock_ref_for_update(), we populate a strbuf "referent" through
    lock_raw_ref(). If we don't have a symref, we don't use "referent"
    for anything (and it won't have allocated any memory). Otherwise, we
    hand over referent.buf to someone who uses it immediately
    (refs_read_ref_full) or to someone who holds on to the pointer
    (split_symref_update ends up adding it to a string list). Therefore,
    at the end of lock_ref_for_update() we can't unconditionally release
    the strbuf, so we end up leaking it.

    We could release the strbuf when we know that it's safe (possibly
    also only when we know that it's needed). Instead, in preparation
    for the next patch, make the string list not hold on to the raw
    pointers, i.e., make it duplicate the strings on insertion and
    manage its own resources.

Of course, the pointer-keeping and free-avoidance might be by design
and/or wanted, e.g., to avoid excessive mallocing and freeing. I admit
to not knowing what is a realistic number of iterations in the loop that
calls lock_ref_for_update, i.e., how severe this leak might be. Maybe
the "backend" nature of this code does not necessarily imply "this could
be called any number of times throughout the process' lifetime".

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 1/2] refs/files-backend: duplicate strings added to affected_refnames
  2017-08-26 10:16   ` Martin Ågren
@ 2017-08-28  8:06     ` Michael Haggerty
  2017-08-28 10:09       ` Martin Ågren
  0 siblings, 1 reply; 34+ messages in thread
From: Michael Haggerty @ 2017-08-28  8:06 UTC (permalink / raw)
  To: Martin Ågren; +Cc: Junio C Hamano, Git Mailing List

On Sat, Aug 26, 2017 at 12:16 PM, Martin Ågren <martin.agren@gmail.com> wrote:
> On 25 August 2017 at 23:00, Junio C Hamano <gitster@pobox.com> wrote:
>> Martin Ågren <martin.agren@gmail.com> writes:
>>
>>> files_transaction_prepare() and the functions it calls add strings to a
>>> string list without duplicating them, i.e., we keep the original raw
>>> pointers we were given. That is "ok", since we keep them only for a
>>> short-enough time, but we end up leaking some of them.
>>
>> [...]

Good find, Martin.

First of all, you are right that we don't want any memory leaks here.
Nobody promises that the program will end soon if a reference update
fails. (In fact, there are invocations of `git` that trigger multiple
reference updates.) This is a small leak, but we should fix it.

The problem (if I may take a stab at explaining it in my own words) is
that `files_transaction_prepare()` uses a `string_list` called
`affected_refnames` to ensure that the same reference name is not
modified more than once in a single reference transaction. Currently,
`affected_refnames` is configured *not* to duplicate the refnames that
are fed to it, which also means that it doesn't free the refnames when
it is cleared.

This is correct for most refnames, which are owned by entries in the
`ref_transaction`, and therefore have a longer lifetime than
`affected_refnames`.

But there is one code path that can add a refname to
`affected_refnames` without making a provision for the refname ever to
be freed:

* files_transaction_prepare()
  * lock_ref_for_update()
    * split_symref_update()
      * item = string_list_insert(affected_refnames, referent)

The `referent` in the last statement comes from a `strbuf` that is
created in `lock_ref_for_update()` then passed to `lock_raw_ref()`,
which fills it.

It would be a serious bug if `lock_ref_for_update()` would dispose of
`referent`, because the pointer in `affected_refnames` would point at
freed memory. But in fact we have the opposite problem;
`lock_ref_for_update()` never frees the memory (it doesn't even
`strbuf_detach()` it). So that memory is leaked.

Your proposed solution is to change `affected_refnames` to duplicate
the strings that are stored in it, in which case
`lock_ref_for_update()` can properly dispose of `referent`, fixing the
leak. This works, but at the price of having to allocate memory for
*all* references in the update, even though most of them are already
fine.

But note that `split_symref_update()` *also* passes a copy of
`referent` to `ref_transaction_add_update()`, which *already* makes a
copy of the reference name and adds an entry containing the name to
the `ref_transaction`. If we would store *that* copy to
`affected_refnames`, then it would get freed when the
`ref_transaction` is cleaned up, and we could fix the memory leak
without allocating any new memory. This requires a little reorg of
`split_symref_update()` but it's not too bad:

* Do the initial check using `string_list_has_string()` rather than
calling `string_list_insert()` already.
* After `new_update` has been created, call `string_list_insert()`,
passing it `new_update->refname` as the string.

If this is done in place of your first commit, then your second commit
could be left unchanged.

Michael

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 1/2] refs/files-backend: duplicate strings added to affected_refnames
  2017-08-28  8:06     ` Michael Haggerty
@ 2017-08-28 10:09       ` Martin Ågren
  2017-08-28 20:32         ` [PATCH v2 1/2] refs/files-backend: add longer-scoped copy of string to list Martin Ågren
  2017-08-28 20:32         ` [PATCH v2 2/2] refs/files-backend: fix memory leak in lock_ref_for_update Martin Ågren
  0 siblings, 2 replies; 34+ messages in thread
From: Martin Ågren @ 2017-08-28 10:09 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Junio C Hamano, Git Mailing List

On 28 August 2017 at 10:06, Michael Haggerty <mhagger@alum.mit.edu> wrote:
> On Sat, Aug 26, 2017 at 12:16 PM, Martin Ågren <martin.agren@gmail.com> wrote:
>> On 25 August 2017 at 23:00, Junio C Hamano <gitster@pobox.com> wrote:
>>> Martin Ågren <martin.agren@gmail.com> writes:
>>>
>>>> files_transaction_prepare() and the functions it calls add strings to a
>>>> string list without duplicating them, i.e., we keep the original raw
>>>> pointers we were given. That is "ok", since we keep them only for a
>>>> short-enough time, but we end up leaking some of them.
>>>
>>> [...]
>
> Good find, Martin.
>
> First of all, you are right that we don't want any memory leaks here.
> Nobody promises that the program will end soon if a reference update
> fails. (In fact, there are invocations of `git` that trigger multiple
> reference updates.) This is a small leak, but we should fix it.
>
> The problem (if I may take a stab at explaining it in my own words) is
> that `files_transaction_prepare()` uses a `string_list` called
> `affected_refnames` to ensure that the same reference name is not
> modified more than once in a single reference transaction. Currently,
> `affected_refnames` is configured *not* to duplicate the refnames that
> are fed to it, which also means that it doesn't free the refnames when
> it is cleared.
>
> This is correct for most refnames, which are owned by entries in the
> `ref_transaction`, and therefore have a longer lifetime than
> `affected_refnames`.
>
> But there is one code path that can add a refname to
> `affected_refnames` without making a provision for the refname ever to
> be freed:
>
> * files_transaction_prepare()
>   * lock_ref_for_update()
>     * split_symref_update()
>       * item = string_list_insert(affected_refnames, referent)
>
> The `referent` in the last statement comes from a `strbuf` that is
> created in `lock_ref_for_update()` then passed to `lock_raw_ref()`,
> which fills it.
>
> It would be a serious bug if `lock_ref_for_update()` would dispose of
> `referent`, because the pointer in `affected_refnames` would point at
> freed memory. But in fact we have the opposite problem;
> `lock_ref_for_update()` never frees the memory (it doesn't even
> `strbuf_detach()` it). So that memory is leaked.

Thanks for this very well-written description. It matches my
understanding completely, which is comforting.

> Your proposed solution is to change `affected_refnames` to duplicate
> the strings that are stored in it, in which case
> `lock_ref_for_update()` can properly dispose of `referent`, fixing the
> leak. This works, but at the price of having to allocate memory for
> *all* references in the update, even though most of them are already
> fine.

Agreed.

> But note that `split_symref_update()` *also* passes a copy of
> `referent` to `ref_transaction_add_update()`, which *already* makes a
> copy of the reference name and adds an entry containing the name to
> the `ref_transaction`. If we would store *that* copy to
> `affected_refnames`, then it would get freed when the
> `ref_transaction` is cleaned up, and we could fix the memory leak
> without allocating any new memory. This requires a little reorg of
> `split_symref_update()` but it's not too bad:
>
> * Do the initial check using `string_list_has_string()` rather than
> calling `string_list_insert()` already.
> * After `new_update` has been created, call `string_list_insert()`,
> passing it `new_update->refname` as the string.

I haven't looked at the code yet, but from what I remember, this would
a) work and b) be a straightforward rearrangement as you say. I'll give
this approach a try (unless you want to, of course; just holler).

> If this is done in place of your first commit, then your second commit
> could be left unchanged.

Thanks a lot for your comments. I appreciate it.

Martin

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH v2 1/2] refs/files-backend: add longer-scoped copy of string to list
  2017-08-28 10:09       ` Martin Ågren
@ 2017-08-28 20:32         ` Martin Ågren
  2017-08-29  8:33           ` Michael Haggerty
  2017-08-28 20:32         ` [PATCH v2 2/2] refs/files-backend: fix memory leak in lock_ref_for_update Martin Ågren
  1 sibling, 1 reply; 34+ messages in thread
From: Martin Ågren @ 2017-08-28 20:32 UTC (permalink / raw)
  To: git; +Cc: Michael Haggerty, Junio C Hamano

split_symref_update() receives a string-pointer `referent` and adds it
to the list of `affected_refnames`. The list simply holds on to the
pointers it is given, it does not copy the strings and it never frees
them. The `referent` string in split_symref_update() belongs to a string
buffer in the caller. After we return, the string will be leaked.

In the next patch, we want to properly release the string buffer in the
caller, but we can't safely do so until we've made sure that
`affected_refnames` will not be holding on to a pointer to the string.
We could configure the list to handle its own resources, but it would
mean some alloc/free-churning. The list is already handling other
strings (through other code paths) which we do not need to worry about,
and we'd be memory-churning those strings too, completely unnecessary.

Observe that split_symref_update() creates a `new_update`-object and
that `new_update->refname` is then a copy of `referent`. The difference
is, this copy will be freed, and it will be freed *after*
`affected_refnames` has been cleared.

Rearrange the handling of `referent`, so that we don't add it directly
to `affected_refnames`. Instead, first just check whether `referent`
exists in the string list, and later add `new_update->refname`.

Helped-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
---
Thanks Junio and Michael for your comments on the first version. This
first patch is now completely different and much much better (thanks
Michael!). The commit message should also be better (sorry Junio...).
The second one has a new commit message, but the diff is the same.

Martin

 refs/files-backend.c | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index 5cca55510..bdb0e22e5 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -2140,13 +2140,12 @@ static int split_symref_update(struct files_ref_store *refs,
 
 	/*
 	 * First make sure that referent is not already in the
-	 * transaction. This insertion is O(N) in the transaction
+	 * transaction. This check is O(N) in the transaction
 	 * size, but it happens at most once per symref in a
 	 * transaction.
 	 */
-	item = string_list_insert(affected_refnames, referent);
-	if (item->util) {
-		/* An entry already existed */
+	if (string_list_has_string(affected_refnames, referent)) {
+		/* An entry already exists */
 		strbuf_addf(err,
 			    "multiple updates for '%s' (including one "
 			    "via symref '%s') are not allowed",
@@ -2181,6 +2180,15 @@ static int split_symref_update(struct files_ref_store *refs,
 	update->flags |= REF_LOG_ONLY | REF_NODEREF;
 	update->flags &= ~REF_HAVE_OLD;
 
+	/*
+	 * Add the referent. This insertion is O(N) in the transaction
+	 * size, but it happens at most once per symref in a
+	 * transaction. Make sure to add new_update->refname, which will
+	 * be valid as long as affected_refnames is in use, and NOT
+	 * referent, which might soon be freed by our caller.
+	 */
+	item = string_list_insert(affected_refnames, new_update->refname);
+	assert(!item->util);
 	item->util = new_update;
 
 	return 0;
-- 
2.14.1.151.g45c1275a3.dirty


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v2 2/2] refs/files-backend: fix memory leak in lock_ref_for_update
  2017-08-28 10:09       ` Martin Ågren
  2017-08-28 20:32         ` [PATCH v2 1/2] refs/files-backend: add longer-scoped copy of string to list Martin Ågren
@ 2017-08-28 20:32         ` Martin Ågren
  2017-08-29  8:39           ` Michael Haggerty
  1 sibling, 1 reply; 34+ messages in thread
From: Martin Ågren @ 2017-08-28 20:32 UTC (permalink / raw)
  To: git; +Cc: Michael Haggerty, Junio C Hamano

After the previous patch, none of the functions we call hold on to
`referent.buf`, so we can safely release the string buffer before
returning.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
---
 refs/files-backend.c | 31 ++++++++++++++++++++-----------
 1 file changed, 20 insertions(+), 11 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index bdb0e22e5..15f34b10e 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -2260,7 +2260,7 @@ static int lock_ref_for_update(struct files_ref_store *refs,
 	struct strbuf referent = STRBUF_INIT;
 	int mustexist = (update->flags & REF_HAVE_OLD) &&
 		!is_null_oid(&update->old_oid);
-	int ret;
+	int ret = 0;
 	struct ref_lock *lock;
 
 	files_assert_main_repository(refs, "lock_ref_for_update");
@@ -2272,7 +2272,7 @@ static int lock_ref_for_update(struct files_ref_store *refs,
 		ret = split_head_update(update, transaction, head_ref,
 					affected_refnames, err);
 		if (ret)
-			return ret;
+			goto out;
 	}
 
 	ret = lock_raw_ref(refs, update->refname, mustexist,
@@ -2286,7 +2286,7 @@ static int lock_ref_for_update(struct files_ref_store *refs,
 		strbuf_addf(err, "cannot lock ref '%s': %s",
 			    original_update_refname(update), reason);
 		free(reason);
-		return ret;
+		goto out;
 	}
 
 	update->backend_data = lock;
@@ -2305,10 +2305,12 @@ static int lock_ref_for_update(struct files_ref_store *refs,
 					strbuf_addf(err, "cannot lock ref '%s': "
 						    "error reading reference",
 						    original_update_refname(update));
-					return -1;
+					ret = -1;
+					goto out;
 				}
 			} else if (check_old_oid(update, &lock->old_oid, err)) {
-				return TRANSACTION_GENERIC_ERROR;
+				ret = TRANSACTION_GENERIC_ERROR;
+				goto out;
 			}
 		} else {
 			/*
@@ -2322,13 +2324,15 @@ static int lock_ref_for_update(struct files_ref_store *refs,
 						  referent.buf, transaction,
 						  affected_refnames, err);
 			if (ret)
-				return ret;
+				goto out;
 		}
 	} else {
 		struct ref_update *parent_update;
 
-		if (check_old_oid(update, &lock->old_oid, err))
-			return TRANSACTION_GENERIC_ERROR;
+		if (check_old_oid(update, &lock->old_oid, err)) {
+			ret = TRANSACTION_GENERIC_ERROR;
+			goto out;
+		}
 
 		/*
 		 * If this update is happening indirectly because of a
@@ -2365,7 +2369,8 @@ static int lock_ref_for_update(struct files_ref_store *refs,
 				    "cannot update ref '%s': %s",
 				    update->refname, write_err);
 			free(write_err);
-			return TRANSACTION_GENERIC_ERROR;
+			ret = TRANSACTION_GENERIC_ERROR;
+			goto out;
 		} else {
 			update->flags |= REF_NEEDS_COMMIT;
 		}
@@ -2379,10 +2384,14 @@ static int lock_ref_for_update(struct files_ref_store *refs,
 		if (close_ref(lock)) {
 			strbuf_addf(err, "couldn't close '%s.lock'",
 				    update->refname);
-			return TRANSACTION_GENERIC_ERROR;
+			ret = TRANSACTION_GENERIC_ERROR;
+			goto out;
 		}
 	}
-	return 0;
+
+out:
+	strbuf_release(&referent);
+	return ret;
 }
 
 /*
-- 
2.14.1.151.g45c1275a3.dirty


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* Re: [PATCH v2 1/2] refs/files-backend: add longer-scoped copy of string to list
  2017-08-28 20:32         ` [PATCH v2 1/2] refs/files-backend: add longer-scoped copy of string to list Martin Ågren
@ 2017-08-29  8:33           ` Michael Haggerty
  0 siblings, 0 replies; 34+ messages in thread
From: Michael Haggerty @ 2017-08-29  8:33 UTC (permalink / raw)
  To: Martin Ågren, git; +Cc: Junio C Hamano

On 08/28/2017 10:32 PM, Martin Ågren wrote:
> split_symref_update() receives a string-pointer `referent` and adds it
> to the list of `affected_refnames`. The list simply holds on to the
> pointers it is given, it does not copy the strings and it never frees
> them. The `referent` string in split_symref_update() belongs to a string
> buffer in the caller. After we return, the string will be leaked.
> 
> In the next patch, we want to properly release the string buffer in the
> caller, but we can't safely do so until we've made sure that
> `affected_refnames` will not be holding on to a pointer to the string.
> We could configure the list to handle its own resources, but it would
> mean some alloc/free-churning. The list is already handling other
> strings (through other code paths) which we do not need to worry about,
> and we'd be memory-churning those strings too, completely unnecessary.
> 
> Observe that split_symref_update() creates a `new_update`-object and
> that `new_update->refname` is then a copy of `referent`. The difference
> is, this copy will be freed, and it will be freed *after*
> `affected_refnames` has been cleared.
> 
> Rearrange the handling of `referent`, so that we don't add it directly
> to `affected_refnames`. Instead, first just check whether `referent`
> exists in the string list, and later add `new_update->refname`.
> 
> Helped-by: Michael Haggerty <mhagger@alum.mit.edu>
> Signed-off-by: Martin Ågren <martin.agren@gmail.com>
> ---
> Thanks Junio and Michael for your comments on the first version. This
> first patch is now completely different and much much better (thanks
> Michael!). The commit message should also be better (sorry Junio...).
> The second one has a new commit message, but the diff is the same.
> 
> Martin
> 
>  refs/files-backend.c | 16 ++++++++++++----
>  1 file changed, 12 insertions(+), 4 deletions(-)
> 
> diff --git a/refs/files-backend.c b/refs/files-backend.c
> index 5cca55510..bdb0e22e5 100644
> --- a/refs/files-backend.c
> +++ b/refs/files-backend.c
> @@ -2140,13 +2140,12 @@ static int split_symref_update(struct files_ref_store *refs,
>  
>  	/*
>  	 * First make sure that referent is not already in the
> -	 * transaction. This insertion is O(N) in the transaction
> +	 * transaction. This check is O(N) in the transaction

The check is not O(N); it is O(lg N) because it is implemented via a
binary search in the (sorted) `affected_refnames`. The insertion below
is still O(N), because it has to shift entries later in the list to the
right to make room for the new entry.

>  	 * size, but it happens at most once per symref in a
>  	 * transaction.
>  	 */
> -	item = string_list_insert(affected_refnames, referent);
> -	if (item->util) {
> -		/* An entry already existed */
> +	if (string_list_has_string(affected_refnames, referent)) {
> +		/* An entry already exists */
>  		strbuf_addf(err,
>  			    "multiple updates for '%s' (including one "
>  			    "via symref '%s') are not allowed",
> @@ -2181,6 +2180,15 @@ static int split_symref_update(struct files_ref_store *refs,
>  	update->flags |= REF_LOG_ONLY | REF_NODEREF;
>  	update->flags &= ~REF_HAVE_OLD;
>  
> +	/*
> +	 * Add the referent. This insertion is O(N) in the transaction
> +	 * size, but it happens at most once per symref in a
> +	 * transaction. Make sure to add new_update->refname, which will
> +	 * be valid as long as affected_refnames is in use, and NOT
> +	 * referent, which might soon be freed by our caller.
> +	 */
> +	item = string_list_insert(affected_refnames, new_update->refname);
> +	assert(!item->util);

We generally avoid using `assert()`. It would be preferable to use

        if (item->util)
                BUG("%s unexpectedly found in affected_refnames",
new_update->refname);

>  	item->util = new_update;
>  
>  	return 0;
> 

Otherwise, looks good!

Michael

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v2 2/2] refs/files-backend: fix memory leak in lock_ref_for_update
  2017-08-28 20:32         ` [PATCH v2 2/2] refs/files-backend: fix memory leak in lock_ref_for_update Martin Ågren
@ 2017-08-29  8:39           ` Michael Haggerty
  2017-08-29 10:41             ` Martin Ågren
  0 siblings, 1 reply; 34+ messages in thread
From: Michael Haggerty @ 2017-08-29  8:39 UTC (permalink / raw)
  To: Martin Ågren, git; +Cc: Junio C Hamano

On 08/28/2017 10:32 PM, Martin Ågren wrote:
> After the previous patch, none of the functions we call hold on to
> `referent.buf`, so we can safely release the string buffer before
> returning.

This patch looks good to me, but I did notice a pre-existing problem in
the area...

> ---
>  refs/files-backend.c | 31 ++++++++++++++++++++-----------
>  1 file changed, 20 insertions(+), 11 deletions(-)
> 
> diff --git a/refs/files-backend.c b/refs/files-backend.c
> index bdb0e22e5..15f34b10e 100644
> --- a/refs/files-backend.c
> +++ b/refs/files-backend.c
> [...]
> @@ -2305,10 +2305,12 @@ static int lock_ref_for_update(struct files_ref_store *refs,
>  					strbuf_addf(err, "cannot lock ref '%s': "
>  						    "error reading reference",
>  						    original_update_refname(update));
> -					return -1;
> +					ret = -1;
> +					goto out;

It is incorrect to return -1 here. First of all, stylistically, the
return value should be a symbolic constant. But in fact, it should be
returning `TRANSACTION_GENERIC_ERROR` here, whereas -1 is
`TRANSACTION_NAME_CONFLICT`. So the code is not just stylistically
wrong; it is functionally wrong.

I know that this is not your mistake, but would you like to add another
patch to your series to fix this up? I'd do it myself, but it's a little
bit awkward because the fix will conflict with your patch.

Thanks,
Michael

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v2 2/2] refs/files-backend: fix memory leak in lock_ref_for_update
  2017-08-29  8:39           ` Michael Haggerty
@ 2017-08-29 10:41             ` Martin Ågren
  2017-08-29 17:18               ` [PATCH v3 1/3] refs/files-backend: add longer-scoped copy of string to list Martin Ågren
                                 ` (2 more replies)
  0 siblings, 3 replies; 34+ messages in thread
From: Martin Ågren @ 2017-08-29 10:41 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Git Mailing List, Junio C Hamano

On 29 August 2017 at 10:39, Michael Haggerty <mhagger@alum.mit.edu> wrote:
> On 08/28/2017 10:32 PM, Martin Ågren wrote:
>> After the previous patch, none of the functions we call hold on to
>> `referent.buf`, so we can safely release the string buffer before
>> returning.
>
> This patch looks good to me, but I did notice a pre-existing problem in
> the area...
>
>> ---
>>  refs/files-backend.c | 31 ++++++++++++++++++++-----------
>>  1 file changed, 20 insertions(+), 11 deletions(-)
>>
>> diff --git a/refs/files-backend.c b/refs/files-backend.c
>> index bdb0e22e5..15f34b10e 100644
>> --- a/refs/files-backend.c
>> +++ b/refs/files-backend.c
>> [...]
>> @@ -2305,10 +2305,12 @@ static int lock_ref_for_update(struct files_ref_store *refs,
>>                                       strbuf_addf(err, "cannot lock ref '%s': "
>>                                                   "error reading reference",
>>                                                   original_update_refname(update));
>> -                                     return -1;
>> +                                     ret = -1;
>> +                                     goto out;
>
> It is incorrect to return -1 here. First of all, stylistically, the
> return value should be a symbolic constant. But in fact, it should be
> returning `TRANSACTION_GENERIC_ERROR` here, whereas -1 is
> `TRANSACTION_NAME_CONFLICT`. So the code is not just stylistically
> wrong; it is functionally wrong.
>
> I know that this is not your mistake, but would you like to add another
> patch to your series to fix this up? I'd do it myself, but it's a little
> bit awkward because the fix will conflict with your patch.

Sure. I'll send out a v3 later today. I'll fix this in a third patch,
and I'll also address your comments on the first patch.

Thanks a lot.

Martin

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH v3 1/3] refs/files-backend: add longer-scoped copy of string to list
  2017-08-29 10:41             ` Martin Ågren
@ 2017-08-29 17:18               ` Martin Ågren
  2017-08-30  2:52                 ` Michael Haggerty
  2017-09-05  8:45                 ` [PATCH v3 1/3] refs/files-backend: add longer-scoped copy of string to list Jeff King
  2017-08-29 17:18               ` [PATCH v3 2/3] refs/files-backend: fix memory leak in lock_ref_for_update Martin Ågren
  2017-08-29 17:18               ` [PATCH v3 3/3] refs/files-backend: correct return value " Martin Ågren
  2 siblings, 2 replies; 34+ messages in thread
From: Martin Ågren @ 2017-08-29 17:18 UTC (permalink / raw)
  To: git; +Cc: Michael Haggerty

split_symref_update() receives a string-pointer `referent` and adds it
to the list of `affected_refnames`. The list simply holds on to the
pointers it is given, it does not copy the strings and it does not ever
free them. The `referent` string in split_symref_update() belongs to a
string buffer in the caller. After we return, the string will be leaked.

In the next patch, we want to properly release the string buffer in the
caller, but we can't safely do so until we've made sure that
`affected_refnames` will not be holding on to a pointer to the string.
We could configure the list to handle its own resources, but it would
mean some alloc/free-churning. The list is already handling other
strings (through other code paths) which we do not need to worry about,
and we'd be memory-churning those strings too, completely unnecessary.

Observe that split_symref_update() creates a `new_update`-object through
ref_transaction_add_update(), after which `new_update->refname` is a
copy of `referent`. The difference is, this copy will be freed, and it
will be freed *after* `affected_refnames` has been cleared.

Rearrange the handling of `referent`, so that we don't add it directly
to `affected_refnames`. Instead, first just check whether `referent`
exists in the string list, and later add `new_update->refname`.

Helped-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
---
v3: "O(lg N)" in comment; if-BUG() instead of assert()

 refs/files-backend.c | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index fccbc24ac..73615d869 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -2144,13 +2144,12 @@ static int split_symref_update(struct files_ref_store *refs,
 
 	/*
 	 * First make sure that referent is not already in the
-	 * transaction. This insertion is O(N) in the transaction
+	 * transaction. This check is O(lg N) in the transaction
 	 * size, but it happens at most once per symref in a
 	 * transaction.
 	 */
-	item = string_list_insert(affected_refnames, referent);
-	if (item->util) {
-		/* An entry already existed */
+	if (string_list_has_string(affected_refnames, referent)) {
+		/* An entry already exists */
 		strbuf_addf(err,
 			    "multiple updates for '%s' (including one "
 			    "via symref '%s') are not allowed",
@@ -2185,6 +2184,17 @@ static int split_symref_update(struct files_ref_store *refs,
 	update->flags |= REF_LOG_ONLY | REF_NODEREF;
 	update->flags &= ~REF_HAVE_OLD;
 
+	/*
+	 * Add the referent. This insertion is O(N) in the transaction
+	 * size, but it happens at most once per symref in a
+	 * transaction. Make sure to add new_update->refname, which will
+	 * be valid as long as affected_refnames is in use, and NOT
+	 * referent, which might soon be freed by our caller.
+	 */
+	item = string_list_insert(affected_refnames, new_update->refname);
+	if (item->util)
+		BUG("%s unexpectedly found in affected_refnames",
+		    new_update->refname);
 	item->util = new_update;
 
 	return 0;
-- 
2.14.1.151.g45c1275a3.dirty


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v3 2/3] refs/files-backend: fix memory leak in lock_ref_for_update
  2017-08-29 10:41             ` Martin Ågren
  2017-08-29 17:18               ` [PATCH v3 1/3] refs/files-backend: add longer-scoped copy of string to list Martin Ågren
@ 2017-08-29 17:18               ` Martin Ågren
  2017-09-05  8:47                 ` Jeff King
  2017-08-29 17:18               ` [PATCH v3 3/3] refs/files-backend: correct return value " Martin Ågren
  2 siblings, 1 reply; 34+ messages in thread
From: Martin Ågren @ 2017-08-29 17:18 UTC (permalink / raw)
  To: git; +Cc: Michael Haggerty

After the previous patch, none of the functions we call hold on to
`referent.buf`, so we can safely release the string buffer before
returning.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
---
v3: identical to v2

 refs/files-backend.c | 31 ++++++++++++++++++++-----------
 1 file changed, 20 insertions(+), 11 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index 73615d869..a2b3df21b 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -2266,7 +2266,7 @@ static int lock_ref_for_update(struct files_ref_store *refs,
 	struct strbuf referent = STRBUF_INIT;
 	int mustexist = (update->flags & REF_HAVE_OLD) &&
 		!is_null_oid(&update->old_oid);
-	int ret;
+	int ret = 0;
 	struct ref_lock *lock;
 
 	files_assert_main_repository(refs, "lock_ref_for_update");
@@ -2278,7 +2278,7 @@ static int lock_ref_for_update(struct files_ref_store *refs,
 		ret = split_head_update(update, transaction, head_ref,
 					affected_refnames, err);
 		if (ret)
-			return ret;
+			goto out;
 	}
 
 	ret = lock_raw_ref(refs, update->refname, mustexist,
@@ -2292,7 +2292,7 @@ static int lock_ref_for_update(struct files_ref_store *refs,
 		strbuf_addf(err, "cannot lock ref '%s': %s",
 			    original_update_refname(update), reason);
 		free(reason);
-		return ret;
+		goto out;
 	}
 
 	update->backend_data = lock;
@@ -2311,10 +2311,12 @@ static int lock_ref_for_update(struct files_ref_store *refs,
 					strbuf_addf(err, "cannot lock ref '%s': "
 						    "error reading reference",
 						    original_update_refname(update));
-					return -1;
+					ret = -1;
+					goto out;
 				}
 			} else if (check_old_oid(update, &lock->old_oid, err)) {
-				return TRANSACTION_GENERIC_ERROR;
+				ret = TRANSACTION_GENERIC_ERROR;
+				goto out;
 			}
 		} else {
 			/*
@@ -2328,13 +2330,15 @@ static int lock_ref_for_update(struct files_ref_store *refs,
 						  referent.buf, transaction,
 						  affected_refnames, err);
 			if (ret)
-				return ret;
+				goto out;
 		}
 	} else {
 		struct ref_update *parent_update;
 
-		if (check_old_oid(update, &lock->old_oid, err))
-			return TRANSACTION_GENERIC_ERROR;
+		if (check_old_oid(update, &lock->old_oid, err)) {
+			ret = TRANSACTION_GENERIC_ERROR;
+			goto out;
+		}
 
 		/*
 		 * If this update is happening indirectly because of a
@@ -2371,7 +2375,8 @@ static int lock_ref_for_update(struct files_ref_store *refs,
 				    "cannot update ref '%s': %s",
 				    update->refname, write_err);
 			free(write_err);
-			return TRANSACTION_GENERIC_ERROR;
+			ret = TRANSACTION_GENERIC_ERROR;
+			goto out;
 		} else {
 			update->flags |= REF_NEEDS_COMMIT;
 		}
@@ -2385,10 +2390,14 @@ static int lock_ref_for_update(struct files_ref_store *refs,
 		if (close_ref(lock)) {
 			strbuf_addf(err, "couldn't close '%s.lock'",
 				    update->refname);
-			return TRANSACTION_GENERIC_ERROR;
+			ret = TRANSACTION_GENERIC_ERROR;
+			goto out;
 		}
 	}
-	return 0;
+
+out:
+	strbuf_release(&referent);
+	return ret;
 }
 
 /*
-- 
2.14.1.151.g45c1275a3.dirty


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v3 3/3] refs/files-backend: correct return value in lock_ref_for_update
  2017-08-29 10:41             ` Martin Ågren
  2017-08-29 17:18               ` [PATCH v3 1/3] refs/files-backend: add longer-scoped copy of string to list Martin Ågren
  2017-08-29 17:18               ` [PATCH v3 2/3] refs/files-backend: fix memory leak in lock_ref_for_update Martin Ågren
@ 2017-08-29 17:18               ` Martin Ågren
  2 siblings, 0 replies; 34+ messages in thread
From: Martin Ågren @ 2017-08-29 17:18 UTC (permalink / raw)
  To: git; +Cc: Michael Haggerty

In one code path we return a literal -1 and not a symbolic constant. The
value -1 would be interpreted as TRANSACTION_NAME_CONFLICT, which is
wrong. Use TRANSACTION_GENERIC_ERROR instead (that is the only other
return value we have to choose from).

Noticed-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
---
v3: this is a new patch, as suggested by Michael

 refs/files-backend.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index a2b3df21b..ad05d1d5f 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -2311,7 +2311,7 @@ static int lock_ref_for_update(struct files_ref_store *refs,
 					strbuf_addf(err, "cannot lock ref '%s': "
 						    "error reading reference",
 						    original_update_refname(update));
-					ret = -1;
+					ret = TRANSACTION_GENERIC_ERROR;
 					goto out;
 				}
 			} else if (check_old_oid(update, &lock->old_oid, err)) {
-- 
2.14.1.151.g45c1275a3.dirty


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* Re: [PATCH v3 1/3] refs/files-backend: add longer-scoped copy of string to list
  2017-08-29 17:18               ` [PATCH v3 1/3] refs/files-backend: add longer-scoped copy of string to list Martin Ågren
@ 2017-08-30  2:52                 ` Michael Haggerty
  2017-08-30 18:02                   ` Martin Ågren
  2017-09-05  8:45                 ` [PATCH v3 1/3] refs/files-backend: add longer-scoped copy of string to list Jeff King
  1 sibling, 1 reply; 34+ messages in thread
From: Michael Haggerty @ 2017-08-30  2:52 UTC (permalink / raw)
  To: Martin Ågren, git

v3 looks good to me. Thanks!

Reviewed-by: Michael Haggerty <mhagger@alum.mit.edu>

Michael

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v3 1/3] refs/files-backend: add longer-scoped copy of string to list
  2017-08-30  2:52                 ` Michael Haggerty
@ 2017-08-30 18:02                   ` Martin Ågren
  2017-09-05 10:02                     ` Junio C Hamano
  0 siblings, 1 reply; 34+ messages in thread
From: Martin Ågren @ 2017-08-30 18:02 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Git Mailing List

On 30 August 2017 at 04:52, Michael Haggerty <mhagger@alum.mit.edu> wrote:
> v3 looks good to me. Thanks!
>
> Reviewed-by: Michael Haggerty <mhagger@alum.mit.edu>

Thank _you_ for very helpful feedback on the earlier versions.

Martin

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v3 1/3] refs/files-backend: add longer-scoped copy of string to list
  2017-08-29 17:18               ` [PATCH v3 1/3] refs/files-backend: add longer-scoped copy of string to list Martin Ågren
  2017-08-30  2:52                 ` Michael Haggerty
@ 2017-09-05  8:45                 ` Jeff King
  2017-09-05  9:03                   ` Michael Haggerty
  1 sibling, 1 reply; 34+ messages in thread
From: Jeff King @ 2017-09-05  8:45 UTC (permalink / raw)
  To: Martin Ågren; +Cc: git, Michael Haggerty

On Tue, Aug 29, 2017 at 07:18:22PM +0200, Martin Ågren wrote:

> Observe that split_symref_update() creates a `new_update`-object through
> ref_transaction_add_update(), after which `new_update->refname` is a
> copy of `referent`. The difference is, this copy will be freed, and it
> will be freed *after* `affected_refnames` has been cleared.
> 
> Rearrange the handling of `referent`, so that we don't add it directly
> to `affected_refnames`. Instead, first just check whether `referent`
> exists in the string list, and later add `new_update->refname`.

Coincidentally[1] I came across this same leak, and my solution ended up
slightly different. I'll share it here in case it's of interest.

In your solution we end up searching the string list twice: once to see
if we have the item, and then again to insert it. Whereas in the
original we did both with a single search.

But we can observe that either:

  1. The item already existed, in which case our insert was a noop, and
     we're good.

or

  2. We inserted it, in which case we proceed with creating new_update.

     We can then in O(1) replace the pointer in the string list item
     with the storage in new_update. We know we're not violating any
     string_list invariants because the strings contain the same bytes.

I.e.:

diff --git a/refs/files-backend.c b/refs/files-backend.c
index 9266f5ab9d..1d16c1b33e 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -2178,6 +2178,11 @@ static int split_symref_update(struct files_ref_store *refs,
 	update->flags |= REF_LOG_ONLY | REF_NODEREF;
 	update->flags &= ~REF_HAVE_OLD;
 
+	/*
+	 * Re-point at the storage provided by our ref_update, which we know
+	 * will last as long as the affected_refnames list.
+	 */
+	item->string = new_update->refname;
 	item->util = new_update;
 
 	return 0;

It feels pretty dirty, though. It would certainly be a bug if we ever
decided to switch affected_refnames to duplicate its strings.

So given that your solution is only a constant-time factor worse in
efficiency, we should probably prefer it as the more maintainable
option.

-Peff

[1] It's not really a coincidence, of course. All the recent leak
    discussion has got both of us prodding at Git with various tools. :)

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* Re: [PATCH v3 2/3] refs/files-backend: fix memory leak in lock_ref_for_update
  2017-08-29 17:18               ` [PATCH v3 2/3] refs/files-backend: fix memory leak in lock_ref_for_update Martin Ågren
@ 2017-09-05  8:47                 ` Jeff King
  2017-09-05 17:28                   ` Martin Ågren
  0 siblings, 1 reply; 34+ messages in thread
From: Jeff King @ 2017-09-05  8:47 UTC (permalink / raw)
  To: Martin Ågren; +Cc: git, Michael Haggerty

On Tue, Aug 29, 2017 at 07:18:23PM +0200, Martin Ågren wrote:

> After the previous patch, none of the functions we call hold on to
> `referent.buf`, so we can safely release the string buffer before
> returning.

I ended up doing this a little differently in my version:

diff --git a/refs/files-backend.c b/refs/files-backend.c
index 9266f5ab9d..1d16c1b33e 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -2287,9 +2292,12 @@ static int lock_ref_for_update(struct files_ref_store *refs,
 			 * the transaction, so we have to read it here
 			 * to record and possibly check old_sha1:
 			 */
-			if (refs_read_ref_full(&refs->base,
-					       referent.buf, 0,
-					       lock->old_oid.hash, NULL)) {
+			ret = refs_read_ref_full(&refs->base,
+						 referent.buf, 0,
+						 lock->old_oid.hash, NULL);
+			strbuf_release(&referent);
+
+			if (ret) {
 				if (update->flags & REF_HAVE_OLD) {
 					strbuf_addf(err, "cannot lock ref '%s': "
 						    "error reading reference",
@@ -2310,6 +2318,7 @@ static int lock_ref_for_update(struct files_ref_store *refs,
 			ret = split_symref_update(refs, update,
 						  referent.buf, transaction,
 						  affected_refnames, err);
+			strbuf_release(&referent);
 			if (ret)
 				return ret;
 		}

After we look at referent.buf once in each of the branch arms, we don't
need it at all. Disposing of it there means we don't have to worry about
it in all of the later early-returns.

I'm assuming that referent will always be empty unless REF_ISSYMREF is
set. Which seems reasonable, but I didn't double check.

Food for thought. I'd be OK with either version.

-Peff

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* Re: [PATCH v3 1/3] refs/files-backend: add longer-scoped copy of string to list
  2017-09-05  8:45                 ` [PATCH v3 1/3] refs/files-backend: add longer-scoped copy of string to list Jeff King
@ 2017-09-05  9:03                   ` Michael Haggerty
  2017-09-05  9:04                     ` Jeff King
  0 siblings, 1 reply; 34+ messages in thread
From: Michael Haggerty @ 2017-09-05  9:03 UTC (permalink / raw)
  To: Jeff King; +Cc: Martin Ågren, Git Mailing List

On Tue, Sep 5, 2017 at 10:45 AM, Jeff King <peff@peff.net> wrote:
> On Tue, Aug 29, 2017 at 07:18:22PM +0200, Martin Ågren wrote:
>
>> [...]
>> Rearrange the handling of `referent`, so that we don't add it directly
>> to `affected_refnames`. Instead, first just check whether `referent`
>> exists in the string list, and later add `new_update->refname`.
>
> Coincidentally[1] I came across this same leak, and my solution ended up
> slightly different. I'll share it here in case it's of interest.
>
> In your solution we end up searching the string list twice: once to see
> if we have the item, and then again to insert it. Whereas in the
> original we did both with a single search.
>
> But we can observe that either:
>
>   1. The item already existed, in which case our insert was a noop, and
>      we're good.
>
> or
>
>   2. We inserted it, in which case we proceed with creating new_update.
>
>      We can then in O(1) replace the pointer in the string list item
>      with the storage in new_update. We know we're not violating any
>      string_list invariants because the strings contain the same bytes.
>
> I.e.:
>
> diff --git a/refs/files-backend.c b/refs/files-backend.c
> index 9266f5ab9d..1d16c1b33e 100644
> --- a/refs/files-backend.c
> +++ b/refs/files-backend.c
> @@ -2178,6 +2178,11 @@ static int split_symref_update(struct files_ref_store *refs,
>         update->flags |= REF_LOG_ONLY | REF_NODEREF;
>         update->flags &= ~REF_HAVE_OLD;
>
> +       /*
> +        * Re-point at the storage provided by our ref_update, which we know
> +        * will last as long as the affected_refnames list.
> +        */
> +       item->string = new_update->refname;
>         item->util = new_update;
>
>         return 0;
>
> It feels pretty dirty, though. It would certainly be a bug if we ever
> decided to switch affected_refnames to duplicate its strings.
>
> So given that your solution is only a constant-time factor worse in
> efficiency, we should probably prefer it as the more maintainable
> option.

This is clever, but I don't like that it requires outside code to
change internal `string_list` structures in a way that is not
documented to be OK.

If we cared about getting rid of the extra `O(lg N)` search (and I
agree with you that it doesn't matter in this case), I think the clean
way to do it would be for `string_list` to expose a method like

    struct string_list_item *string_list_insert_at_index(struct
string_list *list, size_t index, const char *string);

and to use it, together with `string_list_find_insert_index()`, to
avoid having to search twice.

Michael

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v3 1/3] refs/files-backend: add longer-scoped copy of string to list
  2017-09-05  9:03                   ` Michael Haggerty
@ 2017-09-05  9:04                     ` Jeff King
  0 siblings, 0 replies; 34+ messages in thread
From: Jeff King @ 2017-09-05  9:04 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Martin Ågren, Git Mailing List

On Tue, Sep 05, 2017 at 11:03:36AM +0200, Michael Haggerty wrote:

> > It feels pretty dirty, though. It would certainly be a bug if we ever
> > decided to switch affected_refnames to duplicate its strings.
> >
> > So given that your solution is only a constant-time factor worse in
> > efficiency, we should probably prefer it as the more maintainable
> > option.
> 
> This is clever, but I don't like that it requires outside code to
> change internal `string_list` structures in a way that is not
> documented to be OK.
> 
> If we cared about getting rid of the extra `O(lg N)` search (and I
> agree with you that it doesn't matter in this case), I think the clean
> way to do it would be for `string_list` to expose a method like
> 
>     struct string_list_item *string_list_insert_at_index(struct
> string_list *list, size_t index, const char *string);
> 
> and to use it, together with `string_list_find_insert_index()`, to
> avoid having to search twice.

Yes, agreed on all counts.

-Peff

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v3 1/3] refs/files-backend: add longer-scoped copy of string to list
  2017-08-30 18:02                   ` Martin Ågren
@ 2017-09-05 10:02                     ` Junio C Hamano
  2017-09-05 17:24                       ` Martin Ågren
  0 siblings, 1 reply; 34+ messages in thread
From: Junio C Hamano @ 2017-09-05 10:02 UTC (permalink / raw)
  To: Martin Ågren; +Cc: Michael Haggerty, Git Mailing List

Martin Ågren <martin.agren@gmail.com> writes:

> On 30 August 2017 at 04:52, Michael Haggerty <mhagger@alum.mit.edu> wrote:
>> v3 looks good to me. Thanks!
>>
>> Reviewed-by: Michael Haggerty <mhagger@alum.mit.edu>
>
> Thank _you_ for very helpful feedback on the earlier versions.
>
> Martin

Yes, the earlier attempt was sort-of barking up a wrong tree.  

Once we step back and observe other users of affected_refnames and
realize that the list is meant to to point at a refname field of an
existing instance of another structure by borrowing, the blame
shifts from files_transaction_prepare() to the real culprit.
Michael's review gave us a very good "let's step back a bit" that
made the huge improvement between v1 and v2/v3.

I wonder if we should be tightening the use of affected_refnames
even further, though.  

It is may be sufficient to make sure that we do not throw anything
that we would need to free as part of destroying affected_refnames
into the string list, purely from the "leak prevention" perspective.

But stepping back a bit, the reason why the string list exists in
the first place is to make sure we do not touch the same ref twice
in a single transaction, the set of all possible updates in the
single transaction exists somewhere, and each of these updates know
what ref it wants to update.  

And that is recorded in transaction->updates[]->refname no?

So it seems to me that logically any and all string that is
registered in affected_refnames list must be the refname field of
a ref_update structure in the transaction.

And from that point of view, doesn't split_head_update() wants a
similar fix?  It attempts to insert "HEAD", makes sure it hasn't
been inserted and then hangs a new update transaction as its util.
It is not wrong per-se from purely leak-prevention point of view,
as that "HEAD" is a literal string we woudn't even want to free,
but from logical/"what each data means" point of view, it still
feels wrong.



^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v3 1/3] refs/files-backend: add longer-scoped copy of string to list
  2017-09-05 10:02                     ` Junio C Hamano
@ 2017-09-05 17:24                       ` Martin Ågren
  2017-09-05 20:36                         ` Jeff King
  0 siblings, 1 reply; 34+ messages in thread
From: Martin Ågren @ 2017-09-05 17:24 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Michael Haggerty, Git Mailing List

On 5 September 2017 at 12:02, Junio C Hamano <gitster@pobox.com> wrote:
> Martin Ågren <martin.agren@gmail.com> writes:
>
>> On 30 August 2017 at 04:52, Michael Haggerty <mhagger@alum.mit.edu> wrote:
>>> v3 looks good to me. Thanks!
>>>
>>> Reviewed-by: Michael Haggerty <mhagger@alum.mit.edu>
>>
>> Thank _you_ for very helpful feedback on the earlier versions.
>>
>> Martin
>
> Yes, the earlier attempt was sort-of barking up a wrong tree.
>
> Once we step back and observe other users of affected_refnames and
> realize that the list is meant to to point at a refname field of an
> existing instance of another structure by borrowing, the blame
> shifts from files_transaction_prepare() to the real culprit.
> Michael's review gave us a very good "let's step back a bit" that
> made the huge improvement between v1 and v2/v3.
>
> I wonder if we should be tightening the use of affected_refnames
> even further, though.
>
> It is may be sufficient to make sure that we do not throw anything
> that we would need to free as part of destroying affected_refnames
> into the string list, purely from the "leak prevention" perspective.
>
> But stepping back a bit, the reason why the string list exists in
> the first place is to make sure we do not touch the same ref twice
> in a single transaction, the set of all possible updates in the
> single transaction exists somewhere, and each of these updates know
> what ref it wants to update.
>
> And that is recorded in transaction->updates[]->refname no?
>
> So it seems to me that logically any and all string that is
> registered in affected_refnames list must be the refname field of
> a ref_update structure in the transaction.

I'm with you up to here.

> And from that point of view, doesn't split_head_update() wants a
> similar fix?  It attempts to insert "HEAD", makes sure it hasn't
> been inserted and then hangs a new update transaction as its util.
> It is not wrong per-se from purely leak-prevention point of view,
> as that "HEAD" is a literal string we woudn't even want to free,
> but from logical/"what each data means" point of view, it still
> feels wrong.

There is a "Special hack" comment related to this, and I don't feel
particularly confident that I could make any meaningful contribution in
this area. To be honest, I don't immediately see in which direction your
suggestion/idea/thought is going, which tells me I should not be making
a mess out of it. :-)

Martin

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v3 2/3] refs/files-backend: fix memory leak in lock_ref_for_update
  2017-09-05  8:47                 ` Jeff King
@ 2017-09-05 17:28                   ` Martin Ågren
  0 siblings, 0 replies; 34+ messages in thread
From: Martin Ågren @ 2017-09-05 17:28 UTC (permalink / raw)
  To: Jeff King; +Cc: Git Mailing List, Michael Haggerty

On 5 September 2017 at 10:47, Jeff King <peff@peff.net> wrote:
> On Tue, Aug 29, 2017 at 07:18:23PM +0200, Martin Ågren wrote:
>
>> After the previous patch, none of the functions we call hold on to
>> `referent.buf`, so we can safely release the string buffer before
>> returning.
>
> I ended up doing this a little differently in my version:
>
> diff --git a/refs/files-backend.c b/refs/files-backend.c
> index 9266f5ab9d..1d16c1b33e 100644
> --- a/refs/files-backend.c
> +++ b/refs/files-backend.c
> @@ -2287,9 +2292,12 @@ static int lock_ref_for_update(struct files_ref_store *refs,
>                          * the transaction, so we have to read it here
>                          * to record and possibly check old_sha1:
>                          */
> -                       if (refs_read_ref_full(&refs->base,
> -                                              referent.buf, 0,
> -                                              lock->old_oid.hash, NULL)) {
> +                       ret = refs_read_ref_full(&refs->base,
> +                                                referent.buf, 0,
> +                                                lock->old_oid.hash, NULL);
> +                       strbuf_release(&referent);
> +
> +                       if (ret) {
>                                 if (update->flags & REF_HAVE_OLD) {
>                                         strbuf_addf(err, "cannot lock ref '%s': "
>                                                     "error reading reference",
> @@ -2310,6 +2318,7 @@ static int lock_ref_for_update(struct files_ref_store *refs,
>                         ret = split_symref_update(refs, update,
>                                                   referent.buf, transaction,
>                                                   affected_refnames, err);
> +                       strbuf_release(&referent);
>                         if (ret)
>                                 return ret;
>                 }
>
> After we look at referent.buf once in each of the branch arms, we don't
> need it at all. Disposing of it there means we don't have to worry about
> it in all of the later early-returns.
>
> I'm assuming that referent will always be empty unless REF_ISSYMREF is
> set. Which seems reasonable, but I didn't double check.

Some time after I posted v3, I had the same thought. I did double-check
and it does hold. But then I thought that even if it holds now, maybe
it's a bit too brittle. On the other hand, my patch is quite noisy and
maybe the connection between the two is obvious enough that it will hold
in the future as well.

> Food for thought. I'd be OK with either version.

So would I...

Martin

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v3 1/3] refs/files-backend: add longer-scoped copy of string to list
  2017-09-05 17:24                       ` Martin Ågren
@ 2017-09-05 20:36                         ` Jeff King
  2017-09-05 21:26                           ` Junio C Hamano
  0 siblings, 1 reply; 34+ messages in thread
From: Jeff King @ 2017-09-05 20:36 UTC (permalink / raw)
  To: Martin Ågren; +Cc: Junio C Hamano, Michael Haggerty, Git Mailing List

On Tue, Sep 05, 2017 at 07:24:18PM +0200, Martin Ågren wrote:

> > And from that point of view, doesn't split_head_update() wants a
> > similar fix?  It attempts to insert "HEAD", makes sure it hasn't
> > been inserted and then hangs a new update transaction as its util.
> > It is not wrong per-se from purely leak-prevention point of view,
> > as that "HEAD" is a literal string we woudn't even want to free,
> > but from logical/"what each data means" point of view, it still
> > feels wrong.
> 
> There is a "Special hack" comment related to this, and I don't feel
> particularly confident that I could make any meaningful contribution in
> this area. To be honest, I don't immediately see in which direction your
> suggestion/idea/thought is going, which tells me I should not be making
> a mess out of it. :-)

I noticed the HEAD funniness, too, when looking at this earlier. I agree
with Junio that it's not quite consistent with the general rule of
"string list items point to their refnames", but I don't think it
matters in practice.

I think the fix, if we wanted to do one, would be similar to what you
did in split_symref_update(). Like:

diff --git a/refs/files-backend.c b/refs/files-backend.c
index f3455609d6..3f9deff902 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -2095,8 +2095,7 @@ static int split_head_update(struct ref_update *update,
 	 * transaction. This insertion is O(N) in the transaction
 	 * size, but it happens at most once per transaction.
 	 */
-	item = string_list_insert(affected_refnames, "HEAD");
-	if (item->util) {
+	if (string_list_has_string(affected_refnames, "HEAD")) {
 		/* An entry already existed */
 		strbuf_addf(err,
 			    "multiple updates for 'HEAD' (including one "
@@ -2111,6 +2110,7 @@ static int split_head_update(struct ref_update *update,
 			update->new_oid.hash, update->old_oid.hash,
 			update->msg);
 
+	item = string_list_insert(affected_refnames, new_update->refname);
 	item->util = new_update;
 
 	return 0;

-Peff

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* Re: [PATCH v3 1/3] refs/files-backend: add longer-scoped copy of string to list
  2017-09-05 20:36                         ` Jeff King
@ 2017-09-05 21:26                           ` Junio C Hamano
  2017-09-06 18:12                             ` Martin Ågren
  0 siblings, 1 reply; 34+ messages in thread
From: Junio C Hamano @ 2017-09-05 21:26 UTC (permalink / raw)
  To: Jeff King; +Cc: Martin Ågren, Michael Haggerty, Git Mailing List

Jeff King <peff@peff.net> writes:

> I noticed the HEAD funniness, too, when looking at this earlier. I agree
> with Junio that it's not quite consistent with the general rule of
> "string list items point to their refnames", but I don't think it
> matters in practice.

Yup, we are on the same page; the "fix" I was alluding to would look
exactly like what you wrote below, but I agree the distinction does
not matter in practice.  IOW, I do not think the code after Martin's
fix is wrong per-se.

Thanks.

> I think the fix, if we wanted to do one, would be similar to what you
> did in split_symref_update(). Like:
>
> diff --git a/refs/files-backend.c b/refs/files-backend.c
> index f3455609d6..3f9deff902 100644
> --- a/refs/files-backend.c
> +++ b/refs/files-backend.c
> @@ -2095,8 +2095,7 @@ static int split_head_update(struct ref_update *update,
>  	 * transaction. This insertion is O(N) in the transaction
>  	 * size, but it happens at most once per transaction.
>  	 */
> -	item = string_list_insert(affected_refnames, "HEAD");
> -	if (item->util) {
> +	if (string_list_has_string(affected_refnames, "HEAD")) {
>  		/* An entry already existed */
>  		strbuf_addf(err,
>  			    "multiple updates for 'HEAD' (including one "
> @@ -2111,6 +2110,7 @@ static int split_head_update(struct ref_update *update,
>  			update->new_oid.hash, update->old_oid.hash,
>  			update->msg);
>  
> +	item = string_list_insert(affected_refnames, new_update->refname);
>  	item->util = new_update;
>  
>  	return 0;
>
> -Peff

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v3 1/3] refs/files-backend: add longer-scoped copy of string to list
  2017-09-05 21:26                           ` Junio C Hamano
@ 2017-09-06 18:12                             ` Martin Ågren
  2017-09-06 19:52                               ` Junio C Hamano
                                                 ` (2 more replies)
  0 siblings, 3 replies; 34+ messages in thread
From: Martin Ågren @ 2017-09-06 18:12 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, Michael Haggerty, Git Mailing List

On 5 September 2017 at 23:26, Junio C Hamano <gitster@pobox.com> wrote:
> Jeff King <peff@peff.net> writes:
>
>> I noticed the HEAD funniness, too, when looking at this earlier. I agree
>> with Junio that it's not quite consistent with the general rule of
>> "string list items point to their refnames", but I don't think it
>> matters in practice.
>
> Yup, we are on the same page; the "fix" I was alluding to would look
> exactly like what you wrote below, but I agree the distinction does
> not matter in practice.  IOW, I do not think the code after Martin's
> fix is wrong per-se.

Well, "not wrong per-se" tells me you'd feel slightly more comfortable
about the state of things if we did this. ;)

I'll take Peff's hint, tweak/add comments for correctness and symmetry
with the previous patch and add an if-BUG for symmetry. Peff: Do I have
your sign-off? (Do I need it?)

Junio: The topic is in pu as ma/split-symref-update-fix. Re-roll or new
topic or as a separate "patch 4/3" for you to place on top, any
preference?

If we re-roll, would you prefer Peff's much smaller take on patch 2
(strbuf_release where it matters, instead of sprinkling "goto out" all
over)? I think me and him agreed that we'd be fine either way. I'd reuse
my commit message, if I get his sign-off and "From:".

Obviously, if Michael or anyone else has any input, I'm open to that as
well..

Martin

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v3 1/3] refs/files-backend: add longer-scoped copy of string to list
  2017-09-06 18:12                             ` Martin Ågren
@ 2017-09-06 19:52                               ` Junio C Hamano
  2017-09-06 23:45                               ` Jeff King
  2017-09-09  6:57                               ` [PATCH v4 0/4] Rerolling ma/split-symref-update-fix Martin Ågren
  2 siblings, 0 replies; 34+ messages in thread
From: Junio C Hamano @ 2017-09-06 19:52 UTC (permalink / raw)
  To: Martin Ågren; +Cc: Jeff King, Michael Haggerty, Git Mailing List

Martin Ågren <martin.agren@gmail.com> writes:

> Junio: The topic is in pu as ma/split-symref-update-fix. Re-roll or new
> topic or as a separate "patch 4/3" for you to place on top, any
> preference?

Until a topic hits 'next', you solely own it and it is mostly up to
you how to go about improving it.  My preference would be much less
relevant than what the end result would require, i.e....

> If we re-roll, would you prefer Peff's much smaller take on patch 2
> (strbuf_release where it matters, instead of sprinkling "goto out" all
> over)? I think me and him agreed that we'd be fine either way. I'd reuse
> my commit message, if I get his sign-off and "From:".

... if we take Peff's approach, then rerolling the whole thing would
be the only approach that makes sense---incremental update would
make the resulting history less readable.  Between two approaches I
do not have a strong preference either way---it's a choice that can
be made between you and Peff.

> Obviously, if Michael or anyone else has any input, I'm open to that as
> well..

Sure.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v3 1/3] refs/files-backend: add longer-scoped copy of string to list
  2017-09-06 18:12                             ` Martin Ågren
  2017-09-06 19:52                               ` Junio C Hamano
@ 2017-09-06 23:45                               ` Jeff King
  2017-09-09  6:57                               ` [PATCH v4 0/4] Rerolling ma/split-symref-update-fix Martin Ågren
  2 siblings, 0 replies; 34+ messages in thread
From: Jeff King @ 2017-09-06 23:45 UTC (permalink / raw)
  To: Martin Ågren; +Cc: Junio C Hamano, Michael Haggerty, Git Mailing List

On Wed, Sep 06, 2017 at 08:12:24PM +0200, Martin Ågren wrote:

> On 5 September 2017 at 23:26, Junio C Hamano <gitster@pobox.com> wrote:
> > Jeff King <peff@peff.net> writes:
> >
> >> I noticed the HEAD funniness, too, when looking at this earlier. I agree
> >> with Junio that it's not quite consistent with the general rule of
> >> "string list items point to their refnames", but I don't think it
> >> matters in practice.
> >
> > Yup, we are on the same page; the "fix" I was alluding to would look
> > exactly like what you wrote below, but I agree the distinction does
> > not matter in practice.  IOW, I do not think the code after Martin's
> > fix is wrong per-se.
> 
> Well, "not wrong per-se" tells me you'd feel slightly more comfortable
> about the state of things if we did this. ;)
> 
> I'll take Peff's hint, tweak/add comments for correctness and symmetry
> with the previous patch and add an if-BUG for symmetry. Peff: Do I have
> your sign-off? (Do I need it?)

Yes, you have my sign-off. Probably it is not necessary for such a
trivial patch, but it never hurts to be sure.

> If we re-roll, would you prefer Peff's much smaller take on patch 2
> (strbuf_release where it matters, instead of sprinkling "goto out" all
> over)? I think me and him agreed that we'd be fine either way. I'd reuse
> my commit message, if I get his sign-off and "From:".

You are welcome to forge my sign-off there (but I really am OK with
either approach).

-Peff

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH v4 0/4] Rerolling ma/split-symref-update-fix
  2017-09-06 18:12                             ` Martin Ågren
  2017-09-06 19:52                               ` Junio C Hamano
  2017-09-06 23:45                               ` Jeff King
@ 2017-09-09  6:57                               ` Martin Ågren
  2017-09-09  6:57                                 ` [PATCH v4 1/4] refs/files-backend: add longer-scoped copy of string to list Martin Ågren
                                                   ` (4 more replies)
  2 siblings, 5 replies; 34+ messages in thread
From: Martin Ågren @ 2017-09-09  6:57 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Junio C Hamano, Michael Haggerty

> I'll take Peff's hint, tweak/add comments for correctness and symmetry
> with the previous patch and add an if-BUG for symmetry.

Here's a reroll of ma/split-symref-update-fix. The first three patches
are v3 plus Michael's Reviewed-By.

The fourth is the conceptual fix of adding `refname` instead of "HEAD"
into the list of affected refnames.

Thanks all for comments, suggestions and help along the way.

Martin

Martin Ågren (4):
  refs/files-backend: add longer-scoped copy of string to list
  refs/files-backend: fix memory leak in lock_ref_for_update
  refs/files-backend: correct return value in lock_ref_for_update
  refs/files-backend: add `refname`, not "HEAD", to list

 refs/files-backend.c | 62 +++++++++++++++++++++++++++++++++++++---------------
 1 file changed, 44 insertions(+), 18 deletions(-)

-- 
2.14.1.460.g848a19d64


^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH v4 1/4] refs/files-backend: add longer-scoped copy of string to list
  2017-09-09  6:57                               ` [PATCH v4 0/4] Rerolling ma/split-symref-update-fix Martin Ågren
@ 2017-09-09  6:57                                 ` Martin Ågren
  2017-09-09  6:57                                 ` [PATCH v4 2/4] refs/files-backend: fix memory leak in lock_ref_for_update Martin Ågren
                                                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 34+ messages in thread
From: Martin Ågren @ 2017-09-09  6:57 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Junio C Hamano, Michael Haggerty

split_symref_update() receives a string-pointer `referent` and adds it
to the list of `affected_refnames`. The list simply holds on to the
pointers it is given, it does not copy the strings and it does not ever
free them. The `referent` string in split_symref_update() belongs to a
string buffer in the caller. After we return, the string will be leaked.

In the next patch, we want to properly release the string buffer in the
caller, but we can't safely do so until we've made sure that
`affected_refnames` will not be holding on to a pointer to the string.
We could configure the list to handle its own resources, but it would
mean some alloc/free-churning. The list is already handling other
strings (through other code paths) which we do not need to worry about,
and we'd be memory-churning those strings too, completely unnecessary.

Observe that split_symref_update() creates a `new_update`-object through
ref_transaction_add_update(), after which `new_update->refname` is a
copy of `referent`. The difference is, this copy will be freed, and it
will be freed *after* `affected_refnames` has been cleared.

Rearrange the handling of `referent`, so that we don't add it directly
to `affected_refnames`. Instead, first just check whether `referent`
exists in the string list, and later add `new_update->refname`.

Helped-by: Michael Haggerty <mhagger@alum.mit.edu>
Reviewed-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
---
 refs/files-backend.c | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index 0404f2c23..baceef3b3 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -2634,13 +2634,12 @@ static int split_symref_update(struct files_ref_store *refs,
 
 	/*
 	 * First make sure that referent is not already in the
-	 * transaction. This insertion is O(N) in the transaction
+	 * transaction. This check is O(lg N) in the transaction
 	 * size, but it happens at most once per symref in a
 	 * transaction.
 	 */
-	item = string_list_insert(affected_refnames, referent);
-	if (item->util) {
-		/* An entry already existed */
+	if (string_list_has_string(affected_refnames, referent)) {
+		/* An entry already exists */
 		strbuf_addf(err,
 			    "multiple updates for '%s' (including one "
 			    "via symref '%s') are not allowed",
@@ -2675,6 +2674,17 @@ static int split_symref_update(struct files_ref_store *refs,
 	update->flags |= REF_LOG_ONLY | REF_NODEREF;
 	update->flags &= ~REF_HAVE_OLD;
 
+	/*
+	 * Add the referent. This insertion is O(N) in the transaction
+	 * size, but it happens at most once per symref in a
+	 * transaction. Make sure to add new_update->refname, which will
+	 * be valid as long as affected_refnames is in use, and NOT
+	 * referent, which might soon be freed by our caller.
+	 */
+	item = string_list_insert(affected_refnames, new_update->refname);
+	if (item->util)
+		BUG("%s unexpectedly found in affected_refnames",
+		    new_update->refname);
 	item->util = new_update;
 
 	return 0;
-- 
2.14.1.460.g848a19d64


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v4 2/4] refs/files-backend: fix memory leak in lock_ref_for_update
  2017-09-09  6:57                               ` [PATCH v4 0/4] Rerolling ma/split-symref-update-fix Martin Ågren
  2017-09-09  6:57                                 ` [PATCH v4 1/4] refs/files-backend: add longer-scoped copy of string to list Martin Ågren
@ 2017-09-09  6:57                                 ` Martin Ågren
  2017-09-09  6:57                                 ` [PATCH v4 3/4] refs/files-backend: correct return value " Martin Ågren
                                                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 34+ messages in thread
From: Martin Ågren @ 2017-09-09  6:57 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Junio C Hamano, Michael Haggerty

After the previous patch, none of the functions we call hold on to
`referent.buf`, so we can safely release the string buffer before
returning.

Reviewed-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
---
 refs/files-backend.c | 31 ++++++++++++++++++++-----------
 1 file changed, 20 insertions(+), 11 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index baceef3b3..3d6363966 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -2756,7 +2756,7 @@ static int lock_ref_for_update(struct files_ref_store *refs,
 	struct strbuf referent = STRBUF_INIT;
 	int mustexist = (update->flags & REF_HAVE_OLD) &&
 		!is_null_oid(&update->old_oid);
-	int ret;
+	int ret = 0;
 	struct ref_lock *lock;
 
 	files_assert_main_repository(refs, "lock_ref_for_update");
@@ -2768,7 +2768,7 @@ static int lock_ref_for_update(struct files_ref_store *refs,
 		ret = split_head_update(update, transaction, head_ref,
 					affected_refnames, err);
 		if (ret)
-			return ret;
+			goto out;
 	}
 
 	ret = lock_raw_ref(refs, update->refname, mustexist,
@@ -2782,7 +2782,7 @@ static int lock_ref_for_update(struct files_ref_store *refs,
 		strbuf_addf(err, "cannot lock ref '%s': %s",
 			    original_update_refname(update), reason);
 		free(reason);
-		return ret;
+		goto out;
 	}
 
 	update->backend_data = lock;
@@ -2801,10 +2801,12 @@ static int lock_ref_for_update(struct files_ref_store *refs,
 					strbuf_addf(err, "cannot lock ref '%s': "
 						    "error reading reference",
 						    original_update_refname(update));
-					return -1;
+					ret = -1;
+					goto out;
 				}
 			} else if (check_old_oid(update, &lock->old_oid, err)) {
-				return TRANSACTION_GENERIC_ERROR;
+				ret = TRANSACTION_GENERIC_ERROR;
+				goto out;
 			}
 		} else {
 			/*
@@ -2818,13 +2820,15 @@ static int lock_ref_for_update(struct files_ref_store *refs,
 						  referent.buf, transaction,
 						  affected_refnames, err);
 			if (ret)
-				return ret;
+				goto out;
 		}
 	} else {
 		struct ref_update *parent_update;
 
-		if (check_old_oid(update, &lock->old_oid, err))
-			return TRANSACTION_GENERIC_ERROR;
+		if (check_old_oid(update, &lock->old_oid, err)) {
+			ret = TRANSACTION_GENERIC_ERROR;
+			goto out;
+		}
 
 		/*
 		 * If this update is happening indirectly because of a
@@ -2861,7 +2865,8 @@ static int lock_ref_for_update(struct files_ref_store *refs,
 				    "cannot update ref '%s': %s",
 				    update->refname, write_err);
 			free(write_err);
-			return TRANSACTION_GENERIC_ERROR;
+			ret = TRANSACTION_GENERIC_ERROR;
+			goto out;
 		} else {
 			update->flags |= REF_NEEDS_COMMIT;
 		}
@@ -2875,10 +2880,14 @@ static int lock_ref_for_update(struct files_ref_store *refs,
 		if (close_ref(lock)) {
 			strbuf_addf(err, "couldn't close '%s.lock'",
 				    update->refname);
-			return TRANSACTION_GENERIC_ERROR;
+			ret = TRANSACTION_GENERIC_ERROR;
+			goto out;
 		}
 	}
-	return 0;
+
+out:
+	strbuf_release(&referent);
+	return ret;
 }
 
 /*
-- 
2.14.1.460.g848a19d64


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v4 3/4] refs/files-backend: correct return value in lock_ref_for_update
  2017-09-09  6:57                               ` [PATCH v4 0/4] Rerolling ma/split-symref-update-fix Martin Ågren
  2017-09-09  6:57                                 ` [PATCH v4 1/4] refs/files-backend: add longer-scoped copy of string to list Martin Ågren
  2017-09-09  6:57                                 ` [PATCH v4 2/4] refs/files-backend: fix memory leak in lock_ref_for_update Martin Ågren
@ 2017-09-09  6:57                                 ` Martin Ågren
  2017-09-09  6:57                                 ` [PATCH v4 4/4] refs/files-backend: add `refname`, not "HEAD", to list Martin Ågren
  2017-09-09 10:47                                 ` [PATCH v4 0/4] Rerolling ma/split-symref-update-fix Jeff King
  4 siblings, 0 replies; 34+ messages in thread
From: Martin Ågren @ 2017-09-09  6:57 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Junio C Hamano, Michael Haggerty

In one code path we return a literal -1 and not a symbolic constant. The
value -1 would be interpreted as TRANSACTION_NAME_CONFLICT, which is
wrong. Use TRANSACTION_GENERIC_ERROR instead (that is the only other
return value we have to choose from).

Noticed-by: Michael Haggerty <mhagger@alum.mit.edu>
Reviewed-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
---
 refs/files-backend.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index 3d6363966..03df00275 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -2801,7 +2801,7 @@ static int lock_ref_for_update(struct files_ref_store *refs,
 					strbuf_addf(err, "cannot lock ref '%s': "
 						    "error reading reference",
 						    original_update_refname(update));
-					ret = -1;
+					ret = TRANSACTION_GENERIC_ERROR;
 					goto out;
 				}
 			} else if (check_old_oid(update, &lock->old_oid, err)) {
-- 
2.14.1.460.g848a19d64


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v4 4/4] refs/files-backend: add `refname`, not "HEAD", to list
  2017-09-09  6:57                               ` [PATCH v4 0/4] Rerolling ma/split-symref-update-fix Martin Ågren
                                                   ` (2 preceding siblings ...)
  2017-09-09  6:57                                 ` [PATCH v4 3/4] refs/files-backend: correct return value " Martin Ågren
@ 2017-09-09  6:57                                 ` Martin Ågren
  2017-09-09 10:47                                 ` [PATCH v4 0/4] Rerolling ma/split-symref-update-fix Jeff King
  4 siblings, 0 replies; 34+ messages in thread
From: Martin Ågren @ 2017-09-09  6:57 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Junio C Hamano, Michael Haggerty

An earlier patch rewrote `split_symref_update()` to add a copy of a
string to a string list instead of adding the original string. That was
so that the original string could be freed in a later patch, but it is
also conceptually cleaner, since now all calls to `string_list_insert()`
and `string_list_append()` add `update->refname`. --- Except a literal
"HEAD" is added in `split_head_update()`.

Restructure `split_head_update()` in the same way as the earlier patch
did for `split_symref_update()`. This does not correct any practical
problem, but makes things conceptually cleaner. The downside is a call
to `string_list_has_string()`, which should be relatively cheap.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
---
 refs/files-backend.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index 03df00275..926f87b94 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -2589,11 +2589,10 @@ static int split_head_update(struct ref_update *update,
 
 	/*
 	 * First make sure that HEAD is not already in the
-	 * transaction. This insertion is O(N) in the transaction
+	 * transaction. This check is O(lg N) in the transaction
 	 * size, but it happens at most once per transaction.
 	 */
-	item = string_list_insert(affected_refnames, "HEAD");
-	if (item->util) {
+	if (string_list_has_string(affected_refnames, "HEAD")) {
 		/* An entry already existed */
 		strbuf_addf(err,
 			    "multiple updates for 'HEAD' (including one "
@@ -2608,6 +2607,14 @@ static int split_head_update(struct ref_update *update,
 			update->new_oid.hash, update->old_oid.hash,
 			update->msg);
 
+	/*
+	 * Add "HEAD". This insertion is O(N) in the transaction
+	 * size, but it happens at most once per transaction.
+	 * Add new_update->refname instead of a literal "HEAD".
+	 */
+	if (strcmp(new_update->refname, "HEAD"))
+		BUG("%s unexpectedly not 'HEAD'", new_update->refname);
+	item = string_list_insert(affected_refnames, new_update->refname);
 	item->util = new_update;
 
 	return 0;
-- 
2.14.1.460.g848a19d64


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* Re: [PATCH v4 0/4] Rerolling ma/split-symref-update-fix
  2017-09-09  6:57                               ` [PATCH v4 0/4] Rerolling ma/split-symref-update-fix Martin Ågren
                                                   ` (3 preceding siblings ...)
  2017-09-09  6:57                                 ` [PATCH v4 4/4] refs/files-backend: add `refname`, not "HEAD", to list Martin Ågren
@ 2017-09-09 10:47                                 ` Jeff King
  4 siblings, 0 replies; 34+ messages in thread
From: Jeff King @ 2017-09-09 10:47 UTC (permalink / raw)
  To: Martin Ågren; +Cc: git, Junio C Hamano, Michael Haggerty

On Sat, Sep 09, 2017 at 08:57:14AM +0200, Martin Ågren wrote:

> > I'll take Peff's hint, tweak/add comments for correctness and symmetry
> > with the previous patch and add an if-BUG for symmetry.
> 
> Here's a reroll of ma/split-symref-update-fix. The first three patches
> are v3 plus Michael's Reviewed-By.
> 
> The fourth is the conceptual fix of adding `refname` instead of "HEAD"
> into the list of affected refnames.
> 
> Thanks all for comments, suggestions and help along the way.

This looks good to me, including the explanation in the new patch.
Thanks!

-Peff

^ permalink raw reply	[flat|nested] 34+ messages in thread

end of thread, other threads:[~2017-09-09 10:47 UTC | newest]

Thread overview: 34+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-08-25 18:49 [PATCH 1/2] refs/files-backend: duplicate strings added to affected_refnames Martin Ågren
2017-08-25 18:49 ` [PATCH 2/2] refs/files-backend: fix memory leak in lock_ref_for_update Martin Ågren
2017-08-25 21:00 ` [PATCH 1/2] refs/files-backend: duplicate strings added to affected_refnames Junio C Hamano
2017-08-26 10:16   ` Martin Ågren
2017-08-28  8:06     ` Michael Haggerty
2017-08-28 10:09       ` Martin Ågren
2017-08-28 20:32         ` [PATCH v2 1/2] refs/files-backend: add longer-scoped copy of string to list Martin Ågren
2017-08-29  8:33           ` Michael Haggerty
2017-08-28 20:32         ` [PATCH v2 2/2] refs/files-backend: fix memory leak in lock_ref_for_update Martin Ågren
2017-08-29  8:39           ` Michael Haggerty
2017-08-29 10:41             ` Martin Ågren
2017-08-29 17:18               ` [PATCH v3 1/3] refs/files-backend: add longer-scoped copy of string to list Martin Ågren
2017-08-30  2:52                 ` Michael Haggerty
2017-08-30 18:02                   ` Martin Ågren
2017-09-05 10:02                     ` Junio C Hamano
2017-09-05 17:24                       ` Martin Ågren
2017-09-05 20:36                         ` Jeff King
2017-09-05 21:26                           ` Junio C Hamano
2017-09-06 18:12                             ` Martin Ågren
2017-09-06 19:52                               ` Junio C Hamano
2017-09-06 23:45                               ` Jeff King
2017-09-09  6:57                               ` [PATCH v4 0/4] Rerolling ma/split-symref-update-fix Martin Ågren
2017-09-09  6:57                                 ` [PATCH v4 1/4] refs/files-backend: add longer-scoped copy of string to list Martin Ågren
2017-09-09  6:57                                 ` [PATCH v4 2/4] refs/files-backend: fix memory leak in lock_ref_for_update Martin Ågren
2017-09-09  6:57                                 ` [PATCH v4 3/4] refs/files-backend: correct return value " Martin Ågren
2017-09-09  6:57                                 ` [PATCH v4 4/4] refs/files-backend: add `refname`, not "HEAD", to list Martin Ågren
2017-09-09 10:47                                 ` [PATCH v4 0/4] Rerolling ma/split-symref-update-fix Jeff King
2017-09-05  8:45                 ` [PATCH v3 1/3] refs/files-backend: add longer-scoped copy of string to list Jeff King
2017-09-05  9:03                   ` Michael Haggerty
2017-09-05  9:04                     ` Jeff King
2017-08-29 17:18               ` [PATCH v3 2/3] refs/files-backend: fix memory leak in lock_ref_for_update Martin Ågren
2017-09-05  8:47                 ` Jeff King
2017-09-05 17:28                   ` Martin Ågren
2017-08-29 17:18               ` [PATCH v3 3/3] refs/files-backend: correct return value " Martin Ågren

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.