All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCHv3] refs.c: enable large transactions
@ 2015-04-22 21:30 Stefan Beller
  2015-04-23 17:56 ` Junio C Hamano
  0 siblings, 1 reply; 12+ messages in thread
From: Stefan Beller @ 2015-04-22 21:30 UTC (permalink / raw)
  To: mhagger, gitster, git, peff; +Cc: Stefan Beller

This is another attempt on enabling large transactions
(large in terms of open file descriptors). We keep track of how many
lock files are opened by the ref_transaction_commit function.
When more than a reasonable amount of files is open, we close
the file descriptors to make sure the transaction can continue.

Another idea I had during implementing this was to move this file
closing into the lock file API, such that only a certain amount of
lock files can be open at any given point in time and we'd be 'garbage
collecting' open fds when necessary in any relevant call to the lock
file API. This would have brought the advantage of having such
functionality available in other users of the lock file API as well.
The downside however is the over complication, you really need to always
check for (lock->fd != -1) all the time, which may slow down other parts
of the code, which did not ask for such a feature.

Signed-off-by: Stefan Beller <sbeller@google.com>
---

Notes:
    * Added error checking when reopening the lock
    
    * Only call close_lock_file when needed.
      (This makes the code a bit harder to read, but
       might be worth it nevertheless. close_lock_file would
       return early anyway, so we're trading off a function call
       to some additional check (!(flags & REF_HAVE_NEW) ||is_null_sha1(update->new_sha1))
    
    * tuned the number of spare fd to be 25 as in the other occurence.
      At least we want to be consistent with our made up ballpark numbers.
    
    * This replaces the latest patch on origin/sb/remove-fd-from-ref-lock

 refs.c                | 28 ++++++++++++++++++++++++++++
 t/t1400-update-ref.sh |  4 ++--
 2 files changed, 30 insertions(+), 2 deletions(-)

diff --git a/refs.c b/refs.c
index 4f495bd..7ce7b97 100644
--- a/refs.c
+++ b/refs.c
@@ -3041,6 +3041,13 @@ static int write_ref_sha1(struct ref_lock *lock,
 		errno = EINVAL;
 		return -1;
 	}
+	if (lock->lk->fd == -1 && reopen_lock_file(lock->lk) == -1) {
+		int save_errno = errno;
+		error("Couldn't reopen %s", lock->lk->filename.buf);
+		unlock_ref(lock);
+		errno = save_errno;
+		return -1;
+	}
 	if (write_in_full(lock->lk->fd, sha1_to_hex(sha1), 40) != 40 ||
 	    write_in_full(lock->lk->fd, &term, 1) != 1 ||
 	    close_ref(lock) < 0) {
@@ -3718,6 +3725,7 @@ int ref_transaction_commit(struct ref_transaction *transaction,
 			   struct strbuf *err)
 {
 	int ret = 0, i;
+	unsigned int remaining_fds;
 	int n = transaction->nr;
 	struct ref_update **updates = transaction->updates;
 	struct string_list refs_to_delete = STRING_LIST_INIT_NODUP;
@@ -3733,6 +3741,20 @@ int ref_transaction_commit(struct ref_transaction *transaction,
 		return 0;
 	}
 
+	/*
+	 * We need to open many files in a large transaction, so come up with
+	 * a reasonable maximum. We still keep some spares for stdin/out and
+	 * other open files. Experiments determined we need more fds when
+	 * running inside our test suite than directly in the shell. It's
+	 * unclear where these fds come from. 25 should be a reasonable large
+	 * number though.
+	 */
+	remaining_fds = get_max_fd_limit();
+	if (remaining_fds > 25)
+		remaining_fds -= 25;
+	else
+		remaining_fds = 0;
+
 	/* Copy, sort, and reject duplicate refs */
 	qsort(updates, n, sizeof(*updates), ref_update_compare);
 	if (ref_update_reject_duplicates(updates, n, err)) {
@@ -3762,6 +3784,12 @@ int ref_transaction_commit(struct ref_transaction *transaction,
 				    update->refname);
 			goto cleanup;
 		}
+		if (!(flags & REF_HAVE_NEW) ||
+		    is_null_sha1(update->new_sha1) ||
+		    remaining_fds == 0)
+			close_lock_file(update->lock->lk);
+		else
+			remaining_fds--;
 	}
 
 	/* Perform updates first so live commits remain referenced */
diff --git a/t/t1400-update-ref.sh b/t/t1400-update-ref.sh
index 7a69f1a..636d3a1 100755
--- a/t/t1400-update-ref.sh
+++ b/t/t1400-update-ref.sh
@@ -1071,7 +1071,7 @@ run_with_limited_open_files () {
 
 test_lazy_prereq ULIMIT_FILE_DESCRIPTORS 'run_with_limited_open_files true'
 
-test_expect_failure ULIMIT_FILE_DESCRIPTORS 'large transaction creating branches does not burst open file limit' '
+test_expect_success ULIMIT_FILE_DESCRIPTORS 'large transaction creating branches does not burst open file limit' '
 (
 	for i in $(test_seq 33)
 	do
@@ -1082,7 +1082,7 @@ test_expect_failure ULIMIT_FILE_DESCRIPTORS 'large transaction creating branches
 )
 '
 
-test_expect_failure ULIMIT_FILE_DESCRIPTORS 'large transaction deleting branches does not burst open file limit' '
+test_expect_success ULIMIT_FILE_DESCRIPTORS 'large transaction deleting branches does not burst open file limit' '
 (
 	for i in $(test_seq 33)
 	do
-- 
2.4.0.rc2.5.g4c2045b.dirty

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCHv3] refs.c: enable large transactions
  2015-04-22 21:30 [PATCHv3] refs.c: enable large transactions Stefan Beller
@ 2015-04-23 17:56 ` Junio C Hamano
  2015-04-24  0:21   ` Stefan Beller
  0 siblings, 1 reply; 12+ messages in thread
From: Junio C Hamano @ 2015-04-23 17:56 UTC (permalink / raw)
  To: Stefan Beller; +Cc: mhagger, git, peff

Stefan Beller <sbeller@google.com> writes:

> diff --git a/refs.c b/refs.c
> index 4f495bd..7ce7b97 100644
> --- a/refs.c
> +++ b/refs.c
> @@ -3041,6 +3041,13 @@ static int write_ref_sha1(struct ref_lock *lock,
>  		errno = EINVAL;
>  		return -1;
>  	}
> +	if (lock->lk->fd == -1 && reopen_lock_file(lock->lk) == -1) {

Granted, we explicitly assign -1 to lk->fd when we know it is
closed, and the return value from reopen_lock_file() can come only
from the return value from open(2), which is defined to return -1
(i.e. not just any negative value) upon failure, but still, isn't it
customary to check with "< 0" rather than "== -1" for errors?

> +		int save_errno = errno;
> +		error("Couldn't reopen %s", lock->lk->filename.buf);

No need to change this line, but I noticed that we might want to do
something about the first one of the following two:

    $ git grep -e '[ 	]error(_*"[A-Z]' | wc -l
    146
    $ git grep -e '[ 	]error(_*"[a-z]' | wc -l
    390

> +		unlock_ref(lock);
> +		errno = save_errno;
> +		return -1;
> +	}

Is this the only place in the entire codebase where a lockfile,
which may have been closed to save number of open file descriptors,
needs to be reopened?  If I understand correctly, lockfile API is
not for sole use of refs (don't the index and the pack codepaths use
it, too?), so it may give us a better abstraction to have a helper
function in lockfile.[ch] that takes a lock object, i.e.

	int make_lock_fd_valid(struct lock_file *lk)
        {
		if (lk->fd < 0 && reopen_lock_file(lk) < 0) {
			... error ...
                        return -1;
		}
                return 0;
	}

and to call it at this point, i.e.

	if (make_lock_fd_valid(lock->lk) < 0)
        	return -1;

perhaps?

> @@ -3733,6 +3741,20 @@ int ref_transaction_commit(struct ref_transaction *transaction,
>  		return 0;
>  	}
>  
> +	/*
> +	 * We need to open many files in a large transaction, so come up with
> +	 * a reasonable maximum. We still keep some spares for stdin/out and
> +	 * other open files. Experiments determined we need more fds when
> +	 * running inside our test suite than directly in the shell. It's
> +	 * unclear where these fds come from. 25 should be a reasonable large
> +	 * number though.
> +	 */

> +	remaining_fds = get_max_fd_limit();
> +	if (remaining_fds > 25)
> +		remaining_fds -= 25;
> +	else
> +		remaining_fds = 0;

Is the value of pack_open_fds visible from here? I am wondering if
we might want "scratch fds Git can use for its own use" accounting
so that the two subsystems can collectively say "it is still safe
to use one more fd and leave 25 for other people". With the code
structure proposed here, the pack reader can grab all but 25 fds,
which would leave the rest of Git including this code only 25,
and the value in remaining_fds would become totally meaningless.

It certainly can wait to worry about that and we do not have to do
anything about it in this patch, but it may still be a good idea to
leave "NEEDSWORK:" comment here (and point at open_packed_git_1() in
sha1_file.c in the comment).

I do not think the other side needs to know about the fd consumption
in this function, as what is opened in here will be closed before
this function returns.

> @@ -3762,6 +3784,12 @@ int ref_transaction_commit(struct ref_transaction *transaction,
>  				    update->refname);
>  			goto cleanup;
>  		}
> +		if (!(flags & REF_HAVE_NEW) ||
> +		    is_null_sha1(update->new_sha1) ||
> +		    remaining_fds == 0)
> +			close_lock_file(update->lock->lk);
> +		else
> +			remaining_fds--;
>  	}

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCHv3] refs.c: enable large transactions
  2015-04-23 17:56 ` Junio C Hamano
@ 2015-04-24  0:21   ` Stefan Beller
  2015-04-24  1:37     ` Junio C Hamano
  0 siblings, 1 reply; 12+ messages in thread
From: Stefan Beller @ 2015-04-24  0:21 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Michael Haggerty, git, Jeff King

On Thu, Apr 23, 2015 at 10:56 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Stefan Beller <sbeller@google.com> writes:
>
>> diff --git a/refs.c b/refs.c
>> index 4f495bd..7ce7b97 100644
>> --- a/refs.c
>> +++ b/refs.c
>> @@ -3041,6 +3041,13 @@ static int write_ref_sha1(struct ref_lock *lock,
>>               errno = EINVAL;
>>               return -1;
>>       }
>> +     if (lock->lk->fd == -1 && reopen_lock_file(lock->lk) == -1) {
>
> Granted, we explicitly assign -1 to lk->fd when we know it is
> closed, and the return value from reopen_lock_file() can come only
> from the return value from open(2), which is defined to return -1
> (i.e. not just any negative value) upon failure, but still, isn't it
> customary to check with "< 0" rather than "== -1" for errors?

< 0 would be better here indeed.

>
>> +             int save_errno = errno;
>> +             error("Couldn't reopen %s", lock->lk->filename.buf);
>
> No need to change this line, but I noticed that we might want to do
> something about the first one of the following two:

I personally like to have each error(...) to have a unique string, such
that when you run into trouble (most likely reported by a user),
you can easily pinpoint where the exact error is.
Maybe we could think about overriding error to actually report
"version, filename, linenumber, actual error message"

>
>     $ git grep -e '[    ]error(_*"[A-Z]' | wc -l
>     146
>     $ git grep -e '[    ]error(_*"[a-z]' | wc -l
>     390
>
>> +             unlock_ref(lock);
>> +             errno = save_errno;
>> +             return -1;
>> +     }
>
> Is this the only place in the entire codebase where a lockfile,
> which may have been closed to save number of open file descriptors,
> needs to be reopened?  If I understand correctly, lockfile API is
> not for sole use of refs (don't the index and the pack codepaths use
> it, too?), so it may give us a better abstraction to have a helper
> function in lockfile.[ch] that takes a lock object, i.e.
>
>         int make_lock_fd_valid(struct lock_file *lk)
>         {
>                 if (lk->fd < 0 && reopen_lock_file(lk) < 0) {
>                         ... error ...
>                         return -1;
>                 }
>                 return 0;
>         }
>
> and to call it at this point, i.e.
>
>         if (make_lock_fd_valid(lock->lk) < 0)
>                 return -1;
>
> perhaps?

This is what I was originally looking for, thanks for the design guidance!

>
>> @@ -3733,6 +3741,20 @@ int ref_transaction_commit(struct ref_transaction *transaction,
>>               return 0;
>>       }
>>
>> +     /*
>> +      * We need to open many files in a large transaction, so come up with
>> +      * a reasonable maximum. We still keep some spares for stdin/out and
>> +      * other open files. Experiments determined we need more fds when
>> +      * running inside our test suite than directly in the shell. It's
>> +      * unclear where these fds come from. 25 should be a reasonable large
>> +      * number though.
>> +      */
>
>> +     remaining_fds = get_max_fd_limit();
>> +     if (remaining_fds > 25)
>> +             remaining_fds -= 25;
>> +     else
>> +             remaining_fds = 0;
>
> Is the value of pack_open_fds visible from here? I am wondering if
> we might want "scratch fds Git can use for its own use" accounting
> so that the two subsystems can collectively say "it is still safe
> to use one more fd and leave 25 for other people". With the code
> structure proposed here, the pack reader can grab all but 25 fds,
> which would leave the rest of Git including this code only 25,
> and the value in remaining_fds would become totally meaningless.
>
> It certainly can wait to worry about that and we do not have to do
> anything about it in this patch, but it may still be a good idea to
> leave "NEEDSWORK:" comment here (and point at open_packed_git_1() in
> sha1_file.c in the comment).
>
> I do not think the other side needs to know about the fd consumption
> in this function, as what is opened in here will be closed before
> this function returns.
>
>> @@ -3762,6 +3784,12 @@ int ref_transaction_commit(struct ref_transaction *transaction,
>>                                   update->refname);
>>                       goto cleanup;
>>               }
>> +             if (!(flags & REF_HAVE_NEW) ||
>> +                 is_null_sha1(update->new_sha1) ||
>> +                 remaining_fds == 0)
>> +                     close_lock_file(update->lock->lk);
>> +             else
>> +                     remaining_fds--;
>>       }

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCHv3] refs.c: enable large transactions
  2015-04-24  0:21   ` Stefan Beller
@ 2015-04-24  1:37     ` Junio C Hamano
  2015-04-24 16:16       ` Stefan Beller
  0 siblings, 1 reply; 12+ messages in thread
From: Junio C Hamano @ 2015-04-24  1:37 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Michael Haggerty, git, Jeff King

Stefan Beller <sbeller@google.com> writes:

> On Thu, Apr 23, 2015 at 10:56 AM, Junio C Hamano <gitster@pobox.com> wrote:
>
>>> +             int save_errno = errno;
>>> +             error("Couldn't reopen %s", lock->lk->filename.buf);
>>
>> No need to change this line, but I noticed that we might want to do
>> something about the first one of the following two:
>
> I personally like to have each error(...) to have a unique string, such
> that when you run into trouble (most likely reported by a user),
> you can easily pinpoint where the exact error is.

I was hoping that the "grep" patterns were strong enough hint, but
let me be explicit.  I was commenting on "Could" not being spelled
as "could".

>>     $ git grep -e '[    ]error(_*"[A-Z]' | wc -l
>>     146
>>     $ git grep -e '[    ]error(_*"[a-z]' | wc -l
>>     390

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCHv3] refs.c: enable large transactions
  2015-04-24  1:37     ` Junio C Hamano
@ 2015-04-24 16:16       ` Stefan Beller
  2015-04-24 17:19         ` Jeff King
  2015-04-24 18:12         ` Jonathan Nieder
  0 siblings, 2 replies; 12+ messages in thread
From: Stefan Beller @ 2015-04-24 16:16 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Michael Haggerty, git, Jeff King

On Thu, Apr 23, 2015 at 6:37 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Stefan Beller <sbeller@google.com> writes:
>
>> On Thu, Apr 23, 2015 at 10:56 AM, Junio C Hamano <gitster@pobox.com> wrote:
>>
>>>> +             int save_errno = errno;
>>>> +             error("Couldn't reopen %s", lock->lk->filename.buf);
>>>
>>> No need to change this line, but I noticed that we might want to do
>>> something about the first one of the following two:
>>
>> I personally like to have each error(...) to have a unique string, such
>> that when you run into trouble (most likely reported by a user),
>> you can easily pinpoint where the exact error is.
>
> I was hoping that the "grep" patterns were strong enough hint, but
> let me be explicit.  I was commenting on "Could" not being spelled
> as "could".
>
>>>     $ git grep -e '[    ]error(_*"[A-Z]' | wc -l
>>>     146
>>>     $ git grep -e '[    ]error(_*"[a-z]' | wc -l
>>>     390

I understood that you were talking about Could being capitalized.
Though it's distributed 30/70 which hints to me we do not care in
enough to explain the capitalized letters as slip-throughs on review
or such, but it looks like the authors choice to me.

I think it's a mistake to s/Could/could/g for all errors in the code base
as it reduces the information provided in the error messages.
Just 3 days ago ("Regular Rebase Failure"). I used different
capitalization to get a better understanding where the error may be.

So if we throw away that information, we should add new information
to make the spot of the error easily findable in the source.
That's why I proposed the idea of the version,filename,linenumber
as that is one of the strongest signals (most information in a short
amount of text) I can imagine.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCHv3] refs.c: enable large transactions
  2015-04-24 16:16       ` Stefan Beller
@ 2015-04-24 17:19         ` Jeff King
  2015-04-24 18:12         ` Jonathan Nieder
  1 sibling, 0 replies; 12+ messages in thread
From: Jeff King @ 2015-04-24 17:19 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Junio C Hamano, Michael Haggerty, git

On Fri, Apr 24, 2015 at 09:16:28AM -0700, Stefan Beller wrote:

> I think it's a mistake to s/Could/could/g for all errors in the code base
> as it reduces the information provided in the error messages.
> Just 3 days ago ("Regular Rebase Failure"). I used different
> capitalization to get a better understanding where the error may be.
> 
> So if we throw away that information, we should add new information
> to make the spot of the error easily findable in the source.
> That's why I proposed the idea of the version,filename,linenumber
> as that is one of the strongest signals (most information in a short
> amount of text) I can imagine.

I do like that idea, and I think you could base it on the trace_printf
implementation. Note that it requires variadic macros, but I think
that's OK. Just like trace_printf, we can do the macro implementation
when we support that feature, and people on older systems just won't get
the extra file/line data.

I also assume we would not show this information by default, but only
with GIT_TRACE_ERRORS or something like that.

I would love it if we could also get a stack trace for warnings and
errors. Very often the line number of the error() call is not nearly as
interesting as the line number of the _caller_. But doing that portably
is rather hard. Maybe a better option would be to make it easier to
convince git to dump core at the right moments (e.g., dump core when we
hit die() rather than calling exit). And then you can run gdb on the
core file, which gives you a backtrace and much more.

-Peff

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCHv3] refs.c: enable large transactions
  2015-04-24 16:16       ` Stefan Beller
  2015-04-24 17:19         ` Jeff King
@ 2015-04-24 18:12         ` Jonathan Nieder
  2015-04-24 18:31           ` Stefan Beller
  2015-04-24 20:17           ` Jeff King
  1 sibling, 2 replies; 12+ messages in thread
From: Jonathan Nieder @ 2015-04-24 18:12 UTC (permalink / raw)
  To: Stefan Beller; +Cc: Junio C Hamano, Michael Haggerty, git, Jeff King

Stefan Beller wrote:

> I understood that you were talking about Could being capitalized.
> Though it's distributed 30/70 which hints to me we do not care in
> enough to explain the capitalized letters as slip-throughs on review
> or such, but it looks like the authors choice to me.

Lowercase, brief, with no period at the end is the convention.  This
is user-facing UI, so consistency matters.  It's not author's choice.

Existing examples of capitalized messages fall into a few categories:

 - shell scripts used to tend to use the capitalized form, and when
   they got rewritten as builtins the messages weren't cleaned up at
   the same time

 - some patch authors have different habits and it wasn't worth going
   back and forth to fix it (but the convention still stands, and the
   result of a program that can't decide how to talk to the user is
   still harmful)

 - once there are a few examples, they get copy/pasted and imitated

> I think it's a mistake to s/Could/could/g for all errors in the code base
> as it reduces the information provided in the error messages.

This seems like an after-the-fact justification for a mistake.

Often there is a choice about whether the caller or the callee to a
function prints an error.  If the caller does, it can say more about
the context.  If the callee does, the message is in one place and can
be tweaked more easily to be more useful.

To get the benefits of both, we could print a backtrace with every
error.  That way, the callee can describe the error in one place but
the context is all there.  But I'm really glad we don't do that!

The main purpose of most error messages is instructional: they give
information to the user in a way that is abstracted from the
implementation (the user doesn't care what function it happened in!)
that tells them what happened and what they can do next.
Gratuitous inconsistency in formatting doesn't help with that.

Actually, I wouldn't mind an environment variable that tells error()
to include a backtrace if someone wants it.  (See backtrace(3)
for implementation hints if interested in doing that.)

> Just 3 days ago ("Regular Rebase Failure"). I used different
> capitalization to get a better understanding where the error may be.

Wouldn't it be better if there weren't so many similar messages
produced in different contexts in the first place?  (I.e., this
suggests to me that we should move more in the direction of
callee-produces-error.)

Sorry, that was a long way to say very little.  I just wanted to
emphasize that when a UI inconsistency has a useful side effect, while
that can be useful to understand and learn from, we should not be
afraid to fix it.  And to clear up any ambiguity about git's error
message convention. :)

Thanks and hope that helps,
Jonathan

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCHv3] refs.c: enable large transactions
  2015-04-24 18:12         ` Jonathan Nieder
@ 2015-04-24 18:31           ` Stefan Beller
  2015-04-24 20:17           ` Jeff King
  1 sibling, 0 replies; 12+ messages in thread
From: Stefan Beller @ 2015-04-24 18:31 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: Junio C Hamano, Michael Haggerty, git, Jeff King

On Fri, Apr 24, 2015 at 11:12 AM, Jonathan Nieder <jrnieder@gmail.com> wrote:
> Stefan Beller wrote:
>
>> I understood that you were talking about Could being capitalized.
>> Though it's distributed 30/70 which hints to me we do not care in
>> enough to explain the capitalized letters as slip-throughs on review
>> or such, but it looks like the authors choice to me.
>
> Lowercase, brief, with no period at the end is the convention.  This
> is user-facing UI, so consistency matters.  It's not author's choice.
>
> Existing examples of capitalized messages fall into a few categories:
>
>  - shell scripts used to tend to use the capitalized form, and when
>    they got rewritten as builtins the messages weren't cleaned up at
>    the same time
>
>  - some patch authors have different habits and it wasn't worth going
>    back and forth to fix it (but the convention still stands, and the
>    result of a program that can't decide how to talk to the user is
>    still harmful)
>
>  - once there are a few examples, they get copy/pasted and imitated
>
>> I think it's a mistake to s/Could/could/g for all errors in the code base
>> as it reduces the information provided in the error messages.
>
> This seems like an after-the-fact justification for a mistake.
>
> Often there is a choice about whether the caller or the callee to a
> function prints an error.  If the caller does, it can say more about
> the context.  If the callee does, the message is in one place and can
> be tweaked more easily to be more useful.
>
> To get the benefits of both, we could print a backtrace with every
> error.  That way, the callee can describe the error in one place but
> the context is all there.  But I'm really glad we don't do that!
>
> The main purpose of most error messages is instructional: they give
> information to the user in a way that is abstracted from the
> implementation (the user doesn't care what function it happened in!)
> that tells them what happened and what they can do next.
> Gratuitous inconsistency in formatting doesn't help with that.

I agree we should fix the formatting, but I was pointing out the side effects
and how to avoid the side effects. So what I am proposing is a cleanup of
the warning messages as well as a GIT_TRACE_ERRORS variable as
Jeff proposed, because then we have all the benefits.

If we were to just cleanup the error messages we improve on certain aspects
(UI), while making others worse.


>
> Actually, I wouldn't mind an environment variable that tells error()
> to include a backtrace if someone wants it.  (See backtrace(3)
> for implementation hints if interested in doing that.)

CONFORMING TO
       These functions are GNU extensions.

I guess I can live with backtrace, but the Git for Windows people may
hate me for it.


>
>> Just 3 days ago ("Regular Rebase Failure"). I used different
>> capitalization to get a better understanding where the error may be.
>
> Wouldn't it be better if there weren't so many similar messages
> produced in different contexts in the first place?  (I.e., this
> suggests to me that we should move more in the direction of
> callee-produces-error.)
>
> Sorry, that was a long way to say very little.  I just wanted to
> emphasize that when a UI inconsistency has a useful side effect, while
> that can be useful to understand and learn from, we should not be
> afraid to fix it.  And to clear up any ambiguity about git's error
> message convention. :)
>
> Thanks and hope that helps,

I really appreciate the background information provided!

Thanks,
Stefan

> Jonathan

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCHv3] refs.c: enable large transactions
  2015-04-24 18:12         ` Jonathan Nieder
  2015-04-24 18:31           ` Stefan Beller
@ 2015-04-24 20:17           ` Jeff King
  2015-04-25  4:23             ` Junio C Hamano
  1 sibling, 1 reply; 12+ messages in thread
From: Jeff King @ 2015-04-24 20:17 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: Stefan Beller, Junio C Hamano, Michael Haggerty, git

On Fri, Apr 24, 2015 at 11:12:36AM -0700, Jonathan Nieder wrote:

> Actually, I wouldn't mind an environment variable that tells error()
> to include a backtrace if someone wants it.  (See backtrace(3)
> for implementation hints if interested in doing that.)

Thanks, I didn't know about backtrace(3), and figured we'd be stuck with
an ELF library or similar to get at the symbols.

That being said, the resulting backtrace is not nearly as nice as what
is produced by gdb (which includes pretty-printed variables, or even
local variables with "bt full"). Not everybody will have gdb, of course,
but nor will everybody have backtrace().

So if anything, I think my inclination would be to make it easier to
help people (and ourselves) get a backtrace from gdb.

One can get a core for a running process with gcore, or trigger a
coredump by killing with SIGABRT. But catching it at the exact moment of
a die() is a bit hard without support from the program. What about
something like this:

diff --git a/usage.c b/usage.c
index ed14645..fa92190 100644
--- a/usage.c
+++ b/usage.c
@@ -34,6 +34,8 @@ static NORETURN void usage_builtin(const char *err, va_list params)
 static NORETURN void die_builtin(const char *err, va_list params)
 {
 	vreportf("fatal: ", err, params);
+	if (git_env_bool("GIT_DIE_ABORT", 0))
+		abort();
 	exit(128);
 }
 

Usage would be something like:

  ulimit -c unlimited ;# optional, but many distros seem to ship with
                      ;# cores disabled these days
  GIT_DIE_ABORT=1 git some-command-that-fails
  gdb --quiet -ex 'bt full' -ex quit git core

We could even wrap that up in a git-debug script. I suspect there may be
some complications with finding the core file, though, as the git
process may chdir before dumping. I'm not sure if there's a good way
around that (obviously setting /proc/sys/kernel/core_pattern is not
exactly friendly).

I dunno. Certainly there is room for a less-full-featured system that
would run more seamlessly, or on more people's systems (i.e., so that we
can easily say "set GIT_FOO and tell us what it outputs" in response to
bug reports). Maybe that system is backtrace(), or maybe it is just
__FILE__ and __LINE__ markers for each printed error.

-Peff

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCHv3] refs.c: enable large transactions
  2015-04-24 20:17           ` Jeff King
@ 2015-04-25  4:23             ` Junio C Hamano
  2015-04-25  5:00               ` Jeff King
  0 siblings, 1 reply; 12+ messages in thread
From: Junio C Hamano @ 2015-04-25  4:23 UTC (permalink / raw)
  To: Jeff King; +Cc: Jonathan Nieder, Stefan Beller, Michael Haggerty, git

Jeff King <peff@peff.net> writes:

> So if anything, I think my inclination would be to make it easier to
> help people (and ourselves) get a backtrace from gdb.
>
> One can get a core for a running process with gcore, or trigger a
> coredump by killing with SIGABRT. But catching it at the exact moment of
> a die() is a bit hard without support from the program. What about
> something like this:
>
> diff --git a/usage.c b/usage.c
> index ed14645..fa92190 100644
> --- a/usage.c
> +++ b/usage.c
> @@ -34,6 +34,8 @@ static NORETURN void usage_builtin(const char *err, va_list params)
>  static NORETURN void die_builtin(const char *err, va_list params)
>  {
>  	vreportf("fatal: ", err, params);
> +	if (git_env_bool("GIT_DIE_ABORT", 0))
> +		abort();
>  	exit(128);
>  }

Two comments.

There probably are more than a few places that calls exit(128)
without using die(), upon seeing that some helper function returned
an error code, because the helper already gave an error message.

The proposals so far, including this one, assume that the bug
reporter will first fail the operation with "normal" invocation
of Git (e.g. without GIT_DIE_ABORT exported) and then retry the
same operation in a different way (e.g. with GIT_DIE_ABORT) to
give us something that would help diagnosis.  Such a user could
just rerun Git under gdb with breakpoint set to die_builtin, no?

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCHv3] refs.c: enable large transactions
  2015-04-25  4:23             ` Junio C Hamano
@ 2015-04-25  5:00               ` Jeff King
  2015-04-25  5:24                 ` Jeff King
  0 siblings, 1 reply; 12+ messages in thread
From: Jeff King @ 2015-04-25  5:00 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jonathan Nieder, Stefan Beller, Michael Haggerty, git

On Fri, Apr 24, 2015 at 09:23:16PM -0700, Junio C Hamano wrote:

> The proposals so far, including this one, assume that the bug
> reporter will first fail the operation with "normal" invocation
> of Git (e.g. without GIT_DIE_ABORT exported) and then retry the
> same operation in a different way (e.g. with GIT_DIE_ABORT) to
> give us something that would help diagnosis.  Such a user could
> just rerun Git under gdb with breakpoint set to die_builtin, no?

Good point. I was trying to automate the gathering of the backtrace so
that even bug-reporters who have not used gdb could easily get us more
information. But of course if a coredump only gets us halfway there and
we have to script gdb to convert the core into a backtrace anyway, it is
not buying us much over just scripting gdb in the first place.

A better solution to what I proposed earlier is perhaps:

  git config alias.debug '!gdb --quiet \
	    -ex "break exit" \
	    -ex "run" \
	    -ex "bt full" \
	    -ex "continue" \
	    -ex "quit" \
	    --args git \
  '
  git debug rev-parse foobar

It has the minor irritation that gdb will control the process stdio
(slurping from stdin and polluting stdout, whereas we would prefer no
input and output to stderr). There might be a way to convince gdb to do
otherwise, or you could probably go pretty far with some shell fd
redirects and using "set args" inside gdb. Or maybe something with
gdbserver.

But the point is that yeah, we shouldn't try to build really good
introspection inside git. Debuggers already do a way better job of this.
If they're hard for people to use to obtain simple information like a
backtrace, we should work on wrapping that difficulty up in a script.

It might still be useful to provide a much lesser form of introspection,
if it would be available in a lot more places than gdb would. E.g.,
__FILE__ and __LINE__ markers on error messages might be useful. A
mediocre backtrace() that is only available on glibc systems is probably
not.

-Peff

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCHv3] refs.c: enable large transactions
  2015-04-25  5:00               ` Jeff King
@ 2015-04-25  5:24                 ` Jeff King
  0 siblings, 0 replies; 12+ messages in thread
From: Jeff King @ 2015-04-25  5:24 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jonathan Nieder, Stefan Beller, Michael Haggerty, git

On Sat, Apr 25, 2015 at 01:00:58AM -0400, Jeff King wrote:

> A better solution to what I proposed earlier is perhaps:
> 
>   git config alias.debug '!gdb --quiet \
> 	    -ex "break exit" \
> 	    -ex "run" \
> 	    -ex "bt full" \
> 	    -ex "continue" \
> 	    -ex "quit" \
> 	    --args git \
>   '
>   git debug rev-parse foobar
> 
> It has the minor irritation that gdb will control the process stdio
> (slurping from stdin and polluting stdout, whereas we would prefer no
> input and output to stderr). There might be a way to convince gdb to do
> otherwise, or you could probably go pretty far with some shell fd
> redirects and using "set args" inside gdb. Or maybe something with
> gdbserver.

Just to extend the thought exercise, here is something marginally less
horrible. Save as "git-debug" in your PATH and chmod +x.

-- >8 --
#!/bin/sh

if ! type gdb >/dev/null 2>&1; then
	echo >&2 "Sorry, you don't seem to have gdb installed."
	exit 1
fi

args=
for i in "$@"; do
	args="$args '$(printf '%s' "$i" | sed "s/'/'\\\\''/")'"
done

gdb --quiet \
	-ex "set args $args <&7 >&8 2>&9" 7<&0 8>&1 9>&2 </dev/null >&2 \
	-ex 'break exit' \
	-ex 'run' \
	-ex 'bt full' \
	-ex 'continue' \
	-ex 'quit' \
	git
-- 8< --

It's still rather hard to use with sub-programs started by git (e.g.,
the upload-pack spawned by a fetch), but I think it would work for many
simple cases. I'm not sure if I would use it myself. As somebody
confident in using gdb, my next step after seeing the backtrace would be
to start poking around interactively. Besides the stdio magic, this is
not really buying me much beyond running gdb myself.

-Peff

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2015-04-25  5:24 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-04-22 21:30 [PATCHv3] refs.c: enable large transactions Stefan Beller
2015-04-23 17:56 ` Junio C Hamano
2015-04-24  0:21   ` Stefan Beller
2015-04-24  1:37     ` Junio C Hamano
2015-04-24 16:16       ` Stefan Beller
2015-04-24 17:19         ` Jeff King
2015-04-24 18:12         ` Jonathan Nieder
2015-04-24 18:31           ` Stefan Beller
2015-04-24 20:17           ` Jeff King
2015-04-25  4:23             ` Junio C Hamano
2015-04-25  5:00               ` Jeff King
2015-04-25  5:24                 ` Jeff King

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.