All of lore.kernel.org
 help / color / mirror / Atom feed
* Segfault in git reflog
@ 2015-12-30  9:24 Dennis Kaarsemaker
  2015-12-30 10:31 ` Duy Nguyen
  2015-12-30 11:17 ` Dennis Kaarsemaker
  0 siblings, 2 replies; 25+ messages in thread
From: Dennis Kaarsemaker @ 2015-12-30  9:24 UTC (permalink / raw)
  To: git

I've hit a segfault in git reflog with latest git, reproducable in git.git:

spirit:~/code/git (master)$ ./git describe
v2.7.0-rc3

I've minimized the reflog to:

spirit:~/code/git (master)$ cat .git/logs/HEAD
2635c2b8bfc9aec07b7f023d8e3b3d02df715344 54bc41416c5d3ecb978acb0df80d57aa3e54494c Dennis Kaarsemaker <dennis@kaarsemaker.net> 1446765642 +0100  
74c855f87d25a5b5c12d0485ec77c785a1c734c5 54bc41416c5d3ecb978acb0df80d57aa3e54494c Dennis Kaarsemaker <dennis@kaarsemaker.net> 1446765951 +0100  checkout: moving from 3c3d3f629a6176b401ebec455c5dd59ed1b5f910 to master

...which I realize looks a bit broken. I think at the time I was playing with
some patches that also caused segfaults, causing gaps in the reflog.
Nevertheless, I think segfaulting is bad. All objects in the reflog are
reachable.

gdb has the following to say:

spirit:~/code/git (master)$ gdb --args ./git --no-pager reflog
(gdb) run
Starting program: /home/dennis/code/git/git --no-pager reflog
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
28274d0 (HEAD -> master, tag: v2.7.0-rc3, upstream/master, peff/jk/tag-source-propagate, peff/jk/sigpipe-report, gitster/master) HEAD@{0}: checkout: moving from 3c3d3f629a6176b401ebec455c5dd59ed1b5f910 to master

Program received signal SIGSEGV, Segmentation fault.
copy_commit_list (list=0x4834dc7000000011) at commit.c:450
450         pp = commit_list_append(list->item, pp);
(gdb) bt
#0  copy_commit_list (list=0x4834dc7000000011) at commit.c:450
#1  0x000000000050705e in save_parents (commit=commit@entry=0x928a90, revs=0x7fffffffcb80) at revision.c:3044
#2  0x000000000050a54e in get_revision_1 (revs=revs@entry=0x7fffffffcb80) at revision.c:3119
#3  0x000000000050a710 in get_revision_1 (revs=<optimized out>) at revision.c:3112
#4  get_revision_internal (revs=0x7fffffffcb80) at revision.c:3248
#5  0x000000000050a99d in get_revision (revs=revs@entry=0x7fffffffcb80) at revision.c:3322
#6  0x0000000000446032 in cmd_log_walk (rev=rev@entry=0x7fffffffcb80) at builtin/log.c:344
#7  0x0000000000446bf8 in cmd_log_reflog (argc=1, argv=0x7fffffffd6a8, prefix=0x0) at builtin/log.c:626
#8  0x0000000000406126 in run_builtin (argv=0x7fffffffd6a8, argc=1, p=0x7bbec0 <commands+1920>) at git.c:350
#9  handle_builtin (argc=1, argv=0x7fffffffd6a8) at git.c:536
#10 0x0000000000405261 in run_argv (argv=0x7fffffffd4c8, argcp=0x7fffffffd4ac) at git.c:582
#11 main (argc=1, av=<optimized out>) at git.c:690
(gdb) p list
$1 = (struct commit_list *) 0x4834dc7000000011
(gdb) p list->item
Cannot access memory at address 0x4834dc7000000011

A bisect blames 53d00b3 (log: use true parents for diff even when rewriting),
which does indeed touch the code that seems to be segfaulting.

I've tried digging into this, but didn't get very far.
-- 
Dennis Kaarsemaker <dennis@kaarsemaker.net>

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Segfault in git reflog
  2015-12-30  9:24 Segfault in git reflog Dennis Kaarsemaker
@ 2015-12-30 10:31 ` Duy Nguyen
  2015-12-30 11:17 ` Dennis Kaarsemaker
  1 sibling, 0 replies; 25+ messages in thread
From: Duy Nguyen @ 2015-12-30 10:31 UTC (permalink / raw)
  To: Dennis Kaarsemaker; +Cc: Git Mailing List

On Wed, Dec 30, 2015 at 4:24 PM, Dennis Kaarsemaker
<dennis@kaarsemaker.net> wrote:
> I've hit a segfault in git reflog with latest git, reproducable in git.git:
>
> spirit:~/code/git (master)$ ./git describe
> v2.7.0-rc3
>
> I've minimized the reflog to:
>
> spirit:~/code/git (master)$ cat .git/logs/HEAD
> 2635c2b8bfc9aec07b7f023d8e3b3d02df715344 54bc41416c5d3ecb978acb0df80d57aa3e54494c Dennis Kaarsemaker <dennis@kaarsemaker.net> 1446765642 +0100
> 74c855f87d25a5b5c12d0485ec77c785a1c734c5 54bc41416c5d3ecb978acb0df80d57aa3e54494c Dennis Kaarsemaker <dennis@kaarsemaker.net> 1446765951 +0100  checkout: moving from 3c3d3f629a6176b401ebec455c5dd59ed1b5f910 to master
>
> ...which I realize looks a bit broken. I think at the time I was playing with
> some patches that also caused segfaults, causing gaps in the reflog.
> Nevertheless, I think segfaulting is bad. All objects in the reflog are
> reachable.
>
> gdb has the following to say:
>
> spirit:~/code/git (master)$ gdb --args ./git --no-pager reflog

If it helps, if add_ref_decoration is not called at all, there's no
segfault. Definitely something with reflog parsing, probably leaving
an uninitialized pointer. But I have not gotten there yet.
-- 
Duy

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Segfault in git reflog
  2015-12-30  9:24 Segfault in git reflog Dennis Kaarsemaker
  2015-12-30 10:31 ` Duy Nguyen
@ 2015-12-30 11:17 ` Dennis Kaarsemaker
  2015-12-30 11:26   ` Duy Nguyen
  1 sibling, 1 reply; 25+ messages in thread
From: Dennis Kaarsemaker @ 2015-12-30 11:17 UTC (permalink / raw)
  To: git

On wo, 2015-12-30 at 10:24 +0100, Dennis Kaarsemaker wrote:
> spirit:~/code/git (master)$ cat .git/logs/HEAD
> 2635c2b8bfc9aec07b7f023d8e3b3d02df715344 54bc41416c5d3ecb978acb0df80d57aa3e54494c Dennis Kaarsemaker <dennis@kaarsemaker.net> 1446765642 +0100  
> 74c855f87d25a5b5c12d0485ec77c785a1c734c5 54bc41416c5d3ecb978acb0df80d57aa3e54494c Dennis Kaarsemaker <dennis@kaarsemaker.net> 1446765951 +0100  checkout: moving from 3c3d3f629a6176b401ebec455c5dd59ed1b5f910 to master

74c855f87d25a5b5c12d0485ec77c785a1c734c5 is actually a tag, pointing to
3c3d3f629a6176b401ebec455c5dd59ed1b5f910. I'm not sure if that is
relevant.

-- 
Dennis Kaarsemaker
www.kaarsemaker.net

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Segfault in git reflog
  2015-12-30 11:17 ` Dennis Kaarsemaker
@ 2015-12-30 11:26   ` Duy Nguyen
  2015-12-30 11:28     ` Duy Nguyen
  0 siblings, 1 reply; 25+ messages in thread
From: Duy Nguyen @ 2015-12-30 11:26 UTC (permalink / raw)
  To: Dennis Kaarsemaker; +Cc: Git Mailing List

On Wed, Dec 30, 2015 at 6:17 PM, Dennis Kaarsemaker
<dennis@kaarsemaker.net> wrote:
> On wo, 2015-12-30 at 10:24 +0100, Dennis Kaarsemaker wrote:
>> spirit:~/code/git (master)$ cat .git/logs/HEAD
>> 2635c2b8bfc9aec07b7f023d8e3b3d02df715344 54bc41416c5d3ecb978acb0df80d57aa3e54494c Dennis Kaarsemaker <dennis@kaarsemaker.net> 1446765642 +0100
>> 74c855f87d25a5b5c12d0485ec77c785a1c734c5 54bc41416c5d3ecb978acb0df80d57aa3e54494c Dennis Kaarsemaker <dennis@kaarsemaker.net> 1446765951 +0100  checkout: moving from 3c3d3f629a6176b401ebec455c5dd59ed1b5f910 to master
>
> 74c855f87d25a5b5c12d0485ec77c785a1c734c5 is actually a tag, pointing to
> 3c3d3f629a6176b401ebec455c5dd59ed1b5f910. I'm not sure if that is
> relevant.

It is. save_parents() expects "commit" to be a commit because it needs
commit->index, which is not available from struct tag. So when
saved_parents_at() tries to read commit->index, it gets random value
(from the tag). Hell breaks loose from there because this index field
points to some memory in the slab mem allocator. The question is, how
come a tag is passed in here. Maybe we can sprinkle some
assert(object->type == OBJ_COMMIT) in revision.c and see..
-- 
Duy

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Segfault in git reflog
  2015-12-30 11:26   ` Duy Nguyen
@ 2015-12-30 11:28     ` Duy Nguyen
  2015-12-30 12:28       ` Dennis Kaarsemaker
  0 siblings, 1 reply; 25+ messages in thread
From: Duy Nguyen @ 2015-12-30 11:28 UTC (permalink / raw)
  To: Dennis Kaarsemaker; +Cc: Git Mailing List

On Wed, Dec 30, 2015 at 6:26 PM, Duy Nguyen <pclouds@gmail.com> wrote:
> On Wed, Dec 30, 2015 at 6:17 PM, Dennis Kaarsemaker
> <dennis@kaarsemaker.net> wrote:
>> On wo, 2015-12-30 at 10:24 +0100, Dennis Kaarsemaker wrote:
>>> spirit:~/code/git (master)$ cat .git/logs/HEAD
>>> 2635c2b8bfc9aec07b7f023d8e3b3d02df715344 54bc41416c5d3ecb978acb0df80d57aa3e54494c Dennis Kaarsemaker <dennis@kaarsemaker.net> 1446765642 +0100
>>> 74c855f87d25a5b5c12d0485ec77c785a1c734c5 54bc41416c5d3ecb978acb0df80d57aa3e54494c Dennis Kaarsemaker <dennis@kaarsemaker.net> 1446765951 +0100  checkout: moving from 3c3d3f629a6176b401ebec455c5dd59ed1b5f910 to master

Ah... I came from a different angle and did not realize the tag sha1
is from your reflog. So yeah maybe reflog parsing code should check
object type first, don't assume it's a commit!

>>
>> 74c855f87d25a5b5c12d0485ec77c785a1c734c5 is actually a tag, pointing to
>> 3c3d3f629a6176b401ebec455c5dd59ed1b5f910. I'm not sure if that is
>> relevant.
>
> It is. save_parents() expects "commit" to be a commit because it needs
> commit->index, which is not available from struct tag. So when
> saved_parents_at() tries to read commit->index, it gets random value
> (from the tag). Hell breaks loose from there because this index field
> points to some memory in the slab mem allocator. The question is, how
> come a tag is passed in here. Maybe we can sprinkle some
> assert(object->type == OBJ_COMMIT) in revision.c and see..
> --
> Duy



-- 
Duy

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Segfault in git reflog
  2015-12-30 11:28     ` Duy Nguyen
@ 2015-12-30 12:28       ` Dennis Kaarsemaker
  2015-12-30 13:19         ` Duy Nguyen
  0 siblings, 1 reply; 25+ messages in thread
From: Dennis Kaarsemaker @ 2015-12-30 12:28 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: Git Mailing List

On wo, 2015-12-30 at 18:28 +0700, Duy Nguyen wrote:
> On Wed, Dec 30, 2015 at 6:26 PM, Duy Nguyen <pclouds@gmail.com>
> wrote:
> > On Wed, Dec 30, 2015 at 6:17 PM, Dennis Kaarsemaker
> > <dennis@kaarsemaker.net> wrote:
> >> On wo, 2015-12-30 at 10:24 +0100, Dennis Kaarsemaker wrote:
> >>> spirit:~/code/git (master)$ cat .git/logs/HEAD
> >>> 2635c2b8bfc9aec07b7f023d8e3b3d02df715344
> 54bc41416c5d3ecb978acb0df80d57aa3e54494c Dennis Kaarsemaker <
> dennis@kaarsemaker.net> 1446765642 +0100
> >>> 74c855f87d25a5b5c12d0485ec77c785a1c734c5
> 54bc41416c5d3ecb978acb0df80d57aa3e54494c Dennis Kaarsemaker <
> dennis@kaarsemaker.net> 1446765951 +0100  checkout: moving from
> 3c3d3f629a6176b401ebec455c5dd59ed1b5f910 to master
> 
> Ah... I came from a different angle and did not realize the tag sha1
> is from your reflog. So yeah maybe reflog parsing code should check
> object type first, don't assume it's a commit!

Something like this perhaps?

diff --git a/reflog-walk.c b/reflog-walk.c
index 85b8a54..cd538dd 100644
--- a/reflog-walk.c
+++ b/reflog-walk.c
@@ -25,6 +25,14 @@ static int read_one_reflog(unsigned char *osha1, unsigned char *nsha1,
 {
        struct complete_reflogs *array = cb_data;
        struct reflog_info *item;
+       struct object *obj;
+
+       obj = parse_object(osha1);
+       if(obj && obj->type != OBJ_COMMIT)
+               die(_("Broken reflog, %s is a %s, not a commit"), sha1_to_hex(obj->oid.hash), typename(obj->type));
+       obj = parse_object(nsha1);
+       if(obj && obj->type != OBJ_COMMIT)
+               die(_("Broken reflog, %s is a %s, not a commit"), sha1_to_hex(obj->oid.hash), typename(obj->type));
 
        ALLOC_GROW(array->items, array->nr + 1, array->alloc);
        item = array->items + array->nr;

That gives me:
fatal: Broken reflog, 74c855f87d25a5b5c12d0485ec77c785a1c734c5 is a tag, not a commit

-- 
Dennis Kaarsemaker
www.kaarsemaker.net

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: Segfault in git reflog
  2015-12-30 12:28       ` Dennis Kaarsemaker
@ 2015-12-30 13:19         ` Duy Nguyen
  2015-12-30 15:22           ` [PATCH] reflog-walk: don't segfault on non-commit sha1's in the reflog Dennis Kaarsemaker
  0 siblings, 1 reply; 25+ messages in thread
From: Duy Nguyen @ 2015-12-30 13:19 UTC (permalink / raw)
  To: Dennis Kaarsemaker; +Cc: Git Mailing List

On Wed, Dec 30, 2015 at 01:28:06PM +0100, Dennis Kaarsemaker wrote:
> On wo, 2015-12-30 at 18:28 +0700, Duy Nguyen wrote:
> > On Wed, Dec 30, 2015 at 6:26 PM, Duy Nguyen <pclouds@gmail.com>
> > wrote:
> > > On Wed, Dec 30, 2015 at 6:17 PM, Dennis Kaarsemaker
> > > <dennis@kaarsemaker.net> wrote:
> > >> On wo, 2015-12-30 at 10:24 +0100, Dennis Kaarsemaker wrote:
> > >>> spirit:~/code/git (master)$ cat .git/logs/HEAD
> > >>> 2635c2b8bfc9aec07b7f023d8e3b3d02df715344
> > 54bc41416c5d3ecb978acb0df80d57aa3e54494c Dennis Kaarsemaker <
> > dennis@kaarsemaker.net> 1446765642 +0100
> > >>> 74c855f87d25a5b5c12d0485ec77c785a1c734c5
> > 54bc41416c5d3ecb978acb0df80d57aa3e54494c Dennis Kaarsemaker <
> > dennis@kaarsemaker.net> 1446765951 +0100  checkout: moving from
> > 3c3d3f629a6176b401ebec455c5dd59ed1b5f910 to master
> > 
> > Ah... I came from a different angle and did not realize the tag sha1
> > is from your reflog. So yeah maybe reflog parsing code should check
> > object type first, don't assume it's a commit!
> 
> Something like this perhaps?
> 
> diff --git a/reflog-walk.c b/reflog-walk.c
> index 85b8a54..cd538dd 100644
> --- a/reflog-walk.c
> +++ b/reflog-walk.c
> @@ -25,6 +25,14 @@ static int read_one_reflog(unsigned char *osha1, unsigned char *nsha1,
>  {
>         struct complete_reflogs *array = cb_data;
>         struct reflog_info *item;
> +       struct object *obj;
> +
> +       obj = parse_object(osha1);
> +       if(obj && obj->type != OBJ_COMMIT)
> +               die(_("Broken reflog, %s is a %s, not a commit"), sha1_to_hex(obj->oid.hash), typename(obj->type));
> +       obj = parse_object(nsha1);
> +       if(obj && obj->type != OBJ_COMMIT)
> +               die(_("Broken reflog, %s is a %s, not a commit"), sha1_to_hex(obj->oid.hash), typename(obj->type));
>  
>         ALLOC_GROW(array->items, array->nr + 1, array->alloc);
>         item = array->items + array->nr;
> 
> That gives me:
> fatal: Broken reflog, 74c855f87d25a5b5c12d0485ec77c785a1c734c5 is a tag, not a commit

I would go with something like this. The typecasting to "struct commit
*" is the bug because parse_object() can return any object type. With
this I got

error: object 74c855f87d25a5b5c12d0485ec77c785a1c734c5 is a tag, not a commit

diff --git a/reflog-walk.c b/reflog-walk.c
index 85b8a54..09d18fa 100644
--- a/reflog-walk.c
+++ b/reflog-walk.c
@@ -236,8 +236,9 @@ void fake_reflog_parent(struct reflog_walk_info *info, struct commit *commit)
 	reflog = &commit_reflog->reflogs->items[commit_reflog->recno];
 	info->last_commit_reflog = commit_reflog;
 	commit_reflog->recno--;
-	commit_info->commit = (struct commit *)parse_object(reflog->osha1);
-	if (!commit_info->commit) {
+	commit_info->commit = lookup_commit(reflog->osha1);
+	if (!commit_info->commit ||
+	    parse_commit(commit_info->commit)) {
 		commit->parents = NULL;
 		return;
 	}
--
Duy

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH] reflog-walk: don't segfault on non-commit sha1's in the reflog
  2015-12-30 13:19         ` Duy Nguyen
@ 2015-12-30 15:22           ` Dennis Kaarsemaker
  2015-12-30 21:20             ` Junio C Hamano
  0 siblings, 1 reply; 25+ messages in thread
From: Dennis Kaarsemaker @ 2015-12-30 15:22 UTC (permalink / raw)
  To: git, pclouds

Use lookup_commit instead of parse_object to look up commits mentioned
in the reflog. This avoids a segfault in save_parents if somehow a sha1
for something other than a commit ends up in the reflog.

Signed-off-by: Dennis Kaarsemaker <dennis@kaarsemaker.net>
Helped-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
Duy Nguyen wrote:

> I would go with something like this. The typecasting to "struct commit
> *" is the bug because parse_object() can return any object type.

Yeah, that's much better. Here it is as a patch with a test. 

 reflog-walk.c     | 4 ++--
 t/t1410-reflog.sh | 6 ++++++
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/reflog-walk.c b/reflog-walk.c
index 85b8a54..b85c8e8 100644
--- a/reflog-walk.c
+++ b/reflog-walk.c
@@ -236,8 +236,8 @@ void fake_reflog_parent(struct reflog_walk_info *info, struct commit *commit)
 	reflog = &commit_reflog->reflogs->items[commit_reflog->recno];
 	info->last_commit_reflog = commit_reflog;
 	commit_reflog->recno--;
-	commit_info->commit = (struct commit *)parse_object(reflog->osha1);
-	if (!commit_info->commit) {
+	commit_info->commit = lookup_commit(reflog->osha1);
+	if (!commit_info->commit || parse_commit(commit_info->commit)) {
 		commit->parents = NULL;
 		return;
 	}
diff --git a/t/t1410-reflog.sh b/t/t1410-reflog.sh
index b79049f..76ccbe5 100755
--- a/t/t1410-reflog.sh
+++ b/t/t1410-reflog.sh
@@ -325,4 +325,10 @@ test_expect_success 'parsing reverse reflogs at BUFSIZ boundaries' '
 	test_cmp expect actual
 '
 
+test_expect_success 'reflog containing non-commit sha1s' '
+	git checkout -b broken-reflog &&
+	echo "$(git rev-parse HEAD^{tree}) $(git rev-parse HEAD) abc <xyz> 0000000001 +0000" >> .git/logs/refs/heads/broken-reflog &&
+	git reflog broken-reflog
+'
+
 test_done
-- 
2.7.0-rc1-207-ga35084c


-- 
Dennis Kaarsemaker <dennis@kaarsemaker.net>
http://twitter.com/seveas

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH] reflog-walk: don't segfault on non-commit sha1's in the reflog
  2015-12-30 15:22           ` [PATCH] reflog-walk: don't segfault on non-commit sha1's in the reflog Dennis Kaarsemaker
@ 2015-12-30 21:20             ` Junio C Hamano
  2015-12-30 21:33               ` Dennis Kaarsemaker
  0 siblings, 1 reply; 25+ messages in thread
From: Junio C Hamano @ 2015-12-30 21:20 UTC (permalink / raw)
  To: Dennis Kaarsemaker; +Cc: git, pclouds

Dennis Kaarsemaker <dennis@kaarsemaker.net> writes:

> diff --git a/reflog-walk.c b/reflog-walk.c
> index 85b8a54..b85c8e8 100644
> --- a/reflog-walk.c
> +++ b/reflog-walk.c
> @@ -236,8 +236,8 @@ void fake_reflog_parent(struct reflog_walk_info *info, struct commit *commit)
>  	reflog = &commit_reflog->reflogs->items[commit_reflog->recno];
>  	info->last_commit_reflog = commit_reflog;
>  	commit_reflog->recno--;
> -	commit_info->commit = (struct commit *)parse_object(reflog->osha1);
> -	if (!commit_info->commit) {
> +	commit_info->commit = lookup_commit(reflog->osha1);
> +	if (!commit_info->commit || parse_commit(commit_info->commit)) {
>  		commit->parents = NULL;
>  		return;

This looks somewhat roundabout and illogical.  The original was bad
because it blindly assumed reflgo->osha1 refers to a commit without
making sure that assumption holds.  Calling lookup_commit() blindly
is not much better, even though you are helped that the function
happens not to barf if the given object is not a commit.

Also this changes semantics, no?  Trace the original flow and think
what happens, when we see a commit object that cannot be parsed in
parse_commit_buffer().  parse_object() calls parse_object_buffer()
which in turn calls parse_commit_buffer() and the entire callchain
returns NULL.  commit_info->commit will become NULL in such a case.

With your code, lookup_commit() will store a non NULL in
commit_info->commit, and parse_commit() calls parse_commit_buffer()
and that would fail, so you clear commit->parents to NULL but fail
to set commit_info->commit to NULL.

Why not keep the parse_object() as-is and make sure we error out
unless the result is a commit with a more explicit check, perhaps
like this, instead?

 reflog-walk.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/reflog-walk.c b/reflog-walk.c
index 85b8a54..861d7c4 100644
--- a/reflog-walk.c
+++ b/reflog-walk.c
@@ -221,6 +221,7 @@ void fake_reflog_parent(struct reflog_walk_info *info, struct commit *commit)
 	struct commit_info *commit_info =
 		get_commit_info(commit, &info->reflogs, 0);
 	struct commit_reflog *commit_reflog;
+	struct object *logobj;
 	struct reflog_info *reflog;
 
 	info->last_commit_reflog = NULL;
@@ -236,11 +237,13 @@ void fake_reflog_parent(struct reflog_walk_info *info, struct commit *commit)
 	reflog = &commit_reflog->reflogs->items[commit_reflog->recno];
 	info->last_commit_reflog = commit_reflog;
 	commit_reflog->recno--;
-	commit_info->commit = (struct commit *)parse_object(reflog->osha1);
-	if (!commit_info->commit) {
+	logobj = parse_object(reflog->osha1);
+	if (!logobj || logobj->type != OBJ_COMMIT) {
+		commit_info->commit = NULL;
 		commit->parents = NULL;
 		return;
 	}
+	commit_info->commit = (struct commit *)logobj;
 
 	commit->parents = xcalloc(1, sizeof(struct commit_list));
 	commit->parents->item = commit_info->commit;


> +test_expect_success 'reflog containing non-commit sha1s' '
> +	git checkout -b broken-reflog &&
> +	echo "$(git rev-parse HEAD^{tree}) $(git rev-parse HEAD) abc <xyz> 0000000001 +0000" >> .git/logs/refs/heads/broken-reflog &&
> +	git reflog broken-reflog
> +'
> +

This will negatively affect the ongoing effort to abstract out the
on-disk implementation of the reflog.  In some future installation
of Git, the reflog may not even be in .git/logs/refs/whatever file.

Use a non-branch ref, so that you can store any valid object not
just commits, and use a Git command (e.g. "git update-ref" or "git
tag") instead of the raw filesystem access to update it, perhaps
like this?

	git tag --create-reflog test-logs HEAD^ &&
	git tag -f test-logs HEAD^{tree} &&
	git tag -f test-logs HEAD &&
	git reflog test-logs

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH] reflog-walk: don't segfault on non-commit sha1's in the reflog
  2015-12-30 21:20             ` Junio C Hamano
@ 2015-12-30 21:33               ` Dennis Kaarsemaker
  2015-12-30 21:41                 ` Junio C Hamano
  0 siblings, 1 reply; 25+ messages in thread
From: Dennis Kaarsemaker @ 2015-12-30 21:33 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, pclouds

On wo, 2015-12-30 at 13:20 -0800, Junio C Hamano wrote:
> Dennis Kaarsemaker <dennis@kaarsemaker.net> writes:
> 
> > diff --git a/reflog-walk.c b/reflog-walk.c
> > index 85b8a54..b85c8e8 100644
> > --- a/reflog-walk.c
> > +++ b/reflog-walk.c
> > @@ -236,8 +236,8 @@ void fake_reflog_parent(struct reflog_walk_info
> > *info, struct commit *commit)
> >  	reflog = &commit_reflog->reflogs->items[commit_reflog
> > ->recno];
> >  	info->last_commit_reflog = commit_reflog;
> >  	commit_reflog->recno--;
> > -	commit_info->commit = (struct commit *)parse_object(reflog
> > ->osha1);
> > -	if (!commit_info->commit) {
> > +	commit_info->commit = lookup_commit(reflog->osha1);
> > +	if (!commit_info->commit || parse_commit(commit_info
> > ->commit)) {
> >  		commit->parents = NULL;
> >  		return;
> 
> This looks somewhat roundabout and illogical.  The original was bad
> because it blindly assumed reflgo->osha1 refers to a commit without
> making sure that assumption holds.  Calling lookup_commit() blindly
> is not much better, even though you are helped that the function
> happens not to barf if the given object is not a commit.
> 
> Also this changes semantics, no?  Trace the original flow and think
> what happens, when we see a commit object that cannot be parsed in
> parse_commit_buffer().  parse_object() calls parse_object_buffer()
> which in turn calls parse_commit_buffer() and the entire callchain
> returns NULL.  commit_info->commit will become NULL in such a case.
> 
> With your code, lookup_commit() will store a non NULL in
> commit_info->commit, and parse_commit() calls parse_commit_buffer()
> and that would fail, so you clear commit->parents to NULL but fail
> to set commit_info->commit to NULL.
>
> Why not keep the parse_object() as-is and make sure we error out
> unless the result is a commit with a more explicit check, perhaps
> like this, instead?

lookup_commit actually returns NULL (via object_as_type) for objects
that are not commits, so I don't think the above is true. The code
below also loses the diagnostic message about the object not being a
commit.

>  reflog-walk.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/reflog-walk.c b/reflog-walk.c
> index 85b8a54..861d7c4 100644
> --- a/reflog-walk.c
> +++ b/reflog-walk.c
> @@ -221,6 +221,7 @@ void fake_reflog_parent(struct reflog_walk_info
> *info, struct commit *commit)
>  	struct commit_info *commit_info =
>  		get_commit_info(commit, &info->reflogs, 0);
>  	struct commit_reflog *commit_reflog;
> +	struct object *logobj;
>  	struct reflog_info *reflog;
>  
>  	info->last_commit_reflog = NULL;
> @@ -236,11 +237,13 @@ void fake_reflog_parent(struct reflog_walk_info
> *info, struct commit *commit)
>  	reflog = &commit_reflog->reflogs->items[commit_reflog
> ->recno];
>  	info->last_commit_reflog = commit_reflog;
>  	commit_reflog->recno--;
> -	commit_info->commit = (struct commit *)parse_object(reflog
> ->osha1);
> -	if (!commit_info->commit) {
> +	logobj = parse_object(reflog->osha1);
> +	if (!logobj || logobj->type != OBJ_COMMIT) {
> +		commit_info->commit = NULL;
>  		commit->parents = NULL;
>  		return;
>  	}
> +	commit_info->commit = (struct commit *)logobj;
>  
>  	commit->parents = xcalloc(1, sizeof(struct commit_list));
>  	commit->parents->item = commit_info->commit;
> 
> 
> > +test_expect_success 'reflog containing non-commit sha1s' '
> > +	git checkout -b broken-reflog &&
> > +	echo "$(git rev-parse HEAD^{tree}) $(git rev-parse HEAD)
> > abc <xyz> 0000000001 +0000" >> .git/logs/refs/heads/broken-reflog
> > &&
> > +	git reflog broken-reflog
> > +'
> > +
> 
> This will negatively affect the ongoing effort to abstract out the
> on-disk implementation of the reflog.  In some future installation
> of Git, the reflog may not even be in .git/logs/refs/whatever file.

I was following the style of the test above it, will fix.

> Use a non-branch ref, so that you can store any valid object not
> just commits, and use a Git command (e.g. "git update-ref" or "git
> tag") instead of the raw filesystem access to update it, perhaps
> like this?
> 
> 	git tag --create-reflog test-logs HEAD^ &&
> 	git tag -f test-logs HEAD^{tree} &&
> 	git tag -f test-logs HEAD &&
> 	git reflog test-logs

-- 
Dennis Kaarsemaker
www.kaarsemaker.net

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] reflog-walk: don't segfault on non-commit sha1's in the reflog
  2015-12-30 21:33               ` Dennis Kaarsemaker
@ 2015-12-30 21:41                 ` Junio C Hamano
  2015-12-30 21:49                   ` Dennis Kaarsemaker
  0 siblings, 1 reply; 25+ messages in thread
From: Junio C Hamano @ 2015-12-30 21:41 UTC (permalink / raw)
  To: Dennis Kaarsemaker; +Cc: git, pclouds

Dennis Kaarsemaker <dennis@kaarsemaker.net> writes:

> On wo, 2015-12-30 at 13:20 -0800, Junio C Hamano wrote:
>> Dennis Kaarsemaker <dennis@kaarsemaker.net> writes:
>> 
>> > diff --git a/reflog-walk.c b/reflog-walk.c
>> > index 85b8a54..b85c8e8 100644
>> > --- a/reflog-walk.c
>> > +++ b/reflog-walk.c
>> > @@ -236,8 +236,8 @@ void fake_reflog_parent(struct reflog_walk_info
>> > *info, struct commit *commit)
>> >  	reflog = &commit_reflog->reflogs->items[commit_reflog
>> > ->recno];
>> >  	info->last_commit_reflog = commit_reflog;
>> >  	commit_reflog->recno--;
>> > -	commit_info->commit = (struct commit *)parse_object(reflog
>> > ->osha1);
>> > -	if (!commit_info->commit) {
>> > +	commit_info->commit = lookup_commit(reflog->osha1);
>> > +	if (!commit_info->commit || parse_commit(commit_info
>> > ->commit)) {
>> >  		commit->parents = NULL;
>> >  		return;
>> 
>> This looks somewhat roundabout and illogical.  The original was bad
>> because it blindly assumed reflgo->osha1 refers to a commit without
>> making sure that assumption holds.  Calling lookup_commit() blindly
>> is not much better, even though you are helped that the function
>> happens not to barf if the given object is not a commit.
>> 
>> Also this changes semantics, no?  Trace the original flow and think
>> what happens, when we see a commit object that cannot be parsed in
>> parse_commit_buffer().  parse_object() calls parse_object_buffer()
>> which in turn calls parse_commit_buffer() and the entire callchain
>> returns NULL.  commit_info->commit will become NULL in such a case.
>> 
>> With your code, lookup_commit() will store a non NULL in
>> commit_info->commit, and parse_commit() calls parse_commit_buffer()
>> and that would fail, so you clear commit->parents to NULL but fail
>> to set commit_info->commit to NULL.
>>
>> Why not keep the parse_object() as-is and make sure we error out
>> unless the result is a commit with a more explicit check, perhaps
>> like this, instead?
>
> lookup_commit actually returns NULL (via object_as_type) for objects
> that are not commits, so I don't think the above is true.

I think you did not read what you are responding to.  I was talking
about the error case where the object _is_ a commit (hence lookup
returns it), but parse_commit_buffer() does not like its contents.

> The code below also loses the diagnostic message about the object
> not being a commit.

Giving such a diagnostic message is a BUG.

A ref can legitimately point at any type of object (only refs under
refs/heads/, aka "branches", must point at commits), so you MUST NOT
complain about seeing a non-commit in a reflog in general.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH] reflog-walk: don't segfault on non-commit sha1's in the reflog
  2015-12-30 21:41                 ` Junio C Hamano
@ 2015-12-30 21:49                   ` Dennis Kaarsemaker
  2015-12-30 22:17                     ` [PATCH v2] " Dennis Kaarsemaker
  0 siblings, 1 reply; 25+ messages in thread
From: Dennis Kaarsemaker @ 2015-12-30 21:49 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, pclouds

On wo, 2015-12-30 at 13:41 -0800, Junio C Hamano wrote:
> Dennis Kaarsemaker <dennis@kaarsemaker.net> writes:
> 
> > On wo, 2015-12-30 at 13:20 -0800, Junio C Hamano wrote:
> > > Dennis Kaarsemaker <dennis@kaarsemaker.net> writes:
> > > 
> > > > diff --git a/reflog-walk.c b/reflog-walk.c
> > > > index 85b8a54..b85c8e8 100644
> > > > --- a/reflog-walk.c
> > > > +++ b/reflog-walk.c
> > > > @@ -236,8 +236,8 @@ void fake_reflog_parent(struct
> > > > reflog_walk_info
> > > > *info, struct commit *commit)
> > > >  	reflog = &commit_reflog->reflogs->items[commit_reflog
> > > > ->recno];
> > > >  	info->last_commit_reflog = commit_reflog;
> > > >  	commit_reflog->recno--;
> > > > -	commit_info->commit = (struct commit
> > > > *)parse_object(reflog
> > > > ->osha1);
> > > > -	if (!commit_info->commit) {
> > > > +	commit_info->commit = lookup_commit(reflog->osha1);
> > > > +	if (!commit_info->commit || parse_commit(commit_info
> > > > ->commit)) {
> > > >  		commit->parents = NULL;
> > > >  		return;
> > > 
> > > This looks somewhat roundabout and illogical.  The original was
> > > bad
> > > because it blindly assumed reflgo->osha1 refers to a commit
> > > without
> > > making sure that assumption holds.  Calling lookup_commit()
> > > blindly
> > > is not much better, even though you are helped that the function
> > > happens not to barf if the given object is not a commit.
> > > 
> > > Also this changes semantics, no?  Trace the original flow and
> > > think
> > > what happens, when we see a commit object that cannot be parsed
> > > in
> > > parse_commit_buffer().  parse_object() calls
> > > parse_object_buffer()
> > > which in turn calls parse_commit_buffer() and the entire
> > > callchain
> > > returns NULL.  commit_info->commit will become NULL in such a
> > > case.
> > > 
> > > With your code, lookup_commit() will store a non NULL in
> > > commit_info->commit, and parse_commit() calls
> > > parse_commit_buffer()
> > > and that would fail, so you clear commit->parents to NULL but
> > > fail
> > > to set commit_info->commit to NULL.
> > > 
> > > Why not keep the parse_object() as-is and make sure we error out
> > > unless the result is a commit with a more explicit check, perhaps
> > > like this, instead?
> > 
> > lookup_commit actually returns NULL (via object_as_type) for
> > objects
> > that are not commits, so I don't think the above is true.
> 
> I think you did not read what you are responding to.  I was talking
> about the error case where the object _is_ a commit (hence lookup
> returns it), but parse_commit_buffer() does not like its contents.

I read it, but misunderstood it. Thanks for clarifying.

> > The code below also loses the diagnostic message about the object
> > not being a commit.
> 
> Giving such a diagnostic message is a BUG.
> 
> A ref can legitimately point at any type of object (only refs under
> refs/heads/, aka "branches", must point at commits), so you MUST NOT
> complain about seeing a non-commit in a reflog in general.

Yeah, that makes sense, didn't think of that.
-- 
Dennis Kaarsemaker
www.kaarsemaker.net

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH v2] reflog-walk: don't segfault on non-commit sha1's in the reflog
  2015-12-30 21:49                   ` Dennis Kaarsemaker
@ 2015-12-30 22:17                     ` Dennis Kaarsemaker
  2015-12-30 22:42                       ` Junio C Hamano
  0 siblings, 1 reply; 25+ messages in thread
From: Dennis Kaarsemaker @ 2015-12-30 22:17 UTC (permalink / raw)
  To: git, pclouds, gitster

Use lookup_commit instead of parse_object to look up commits mentioned
in the reflog. This avoids a segfault in save_parents if somehow a sha1
for something other than a commit ends up in the reflog.

Signed-off-by: Dennis Kaarsemaker <dennis@kaarsemaker.net>
Helped-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
---
 reflog-walk.c     |  7 +++++--
 t/t1410-reflog.sh | 13 +++++++++++++
 2 files changed, 18 insertions(+), 2 deletions(-)

This version implements Junio's way of doing this, to avoid spurious warnings
and to handle broken commit objects better.

I also added a failing test to show that git reflog still fails to display the
full reflog when it encounters such non-commit objects (it just stops when it
seems them). I don't know how to fix that.

The really correct way of fixing this bug may actually be a level higher, and
making git reflog not rely on information about parent commits, whether they
are fake or not.

diff --git a/reflog-walk.c b/reflog-walk.c
index 85b8a54..861d7c4 100644
--- a/reflog-walk.c
+++ b/reflog-walk.c
@@ -221,6 +221,7 @@ void fake_reflog_parent(struct reflog_walk_info *info, struct commit *commit)
 	struct commit_info *commit_info =
 		get_commit_info(commit, &info->reflogs, 0);
 	struct commit_reflog *commit_reflog;
+	struct object *logobj;
 	struct reflog_info *reflog;
 
 	info->last_commit_reflog = NULL;
@@ -236,11 +237,13 @@ void fake_reflog_parent(struct reflog_walk_info *info, struct commit *commit)
 	reflog = &commit_reflog->reflogs->items[commit_reflog->recno];
 	info->last_commit_reflog = commit_reflog;
 	commit_reflog->recno--;
-	commit_info->commit = (struct commit *)parse_object(reflog->osha1);
-	if (!commit_info->commit) {
+	logobj = parse_object(reflog->osha1);
+	if (!logobj || logobj->type != OBJ_COMMIT) {
+		commit_info->commit = NULL;
 		commit->parents = NULL;
 		return;
 	}
+	commit_info->commit = (struct commit *)logobj;
 
 	commit->parents = xcalloc(1, sizeof(struct commit_list));
 	commit->parents->item = commit_info->commit;
diff --git a/t/t1410-reflog.sh b/t/t1410-reflog.sh
index b79049f..130d671 100755
--- a/t/t1410-reflog.sh
+++ b/t/t1410-reflog.sh
@@ -325,4 +325,17 @@ test_expect_success 'parsing reverse reflogs at BUFSIZ boundaries' '
 	test_cmp expect actual
 '
 
+test_expect_success 'no segfaults for reflog containing non-commit sha1s' '
+	git update-ref --create-reflog -m "Creating ref" \
+		refs/tests/tree-in-reflog HEAD &&
+	git update-ref -m "Forcing tree" refs/tests/tree-in-reflog HEAD^{tree} &&
+	git update-ref -m "Restoring to commit" refs/tests/tree-in-reflog HEAD &&
+	git reflog refs/tests/tree-in-reflog
+'
+
+test_expect_failure 'reflog containing non-commit sha1s displays fully' '
+	git reflog refs/tests/tree-in-reflog > actual &&
+	test_line_count = 3 actual
+'
+
 test_done
-- 
2.7.0-rc1-207-ga35084c

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH v2] reflog-walk: don't segfault on non-commit sha1's in the reflog
  2015-12-30 22:17                     ` [PATCH v2] " Dennis Kaarsemaker
@ 2015-12-30 22:42                       ` Junio C Hamano
  2015-12-30 23:33                         ` [PATCH v3] " Dennis Kaarsemaker
  0 siblings, 1 reply; 25+ messages in thread
From: Junio C Hamano @ 2015-12-30 22:42 UTC (permalink / raw)
  To: Dennis Kaarsemaker; +Cc: git, pclouds

Dennis Kaarsemaker <dennis@kaarsemaker.net> writes:

> The really correct way of fixing this bug may actually be a level higher, and
> making git reflog not rely on information about parent commits, whether they
> are fake or not.

I tend to agree.

The only common thing between "git reflog" wants to do (i.e. showing
the objects referred to by reflog entries) and what the normal "git
log" was/is designed to do is "we have many things, and we show them
one by one".  As the "many things" we have in the context of the
normal "git log" are all commits, it is reasonable that the internal
interface (i.e. revision.c::get_revision()) to iterate over these
"many things" returns a "struct commit *" and it also is reasonable
that "show them one by one" is done by calling log_tree_commit() in
builtin/log.c::cmd_log_walk().  Neither is suitable to deal with
series of reflog entries in general.  A proper implementation of
"git reflog" would have liked to be able to iterate over "many
things" by returning "struct object *" one-by-one, and then do the
equivalent of the switch() statement in builtin/log.c::cmd_show()
to show these objects.

The way "git log" was abused and made to show entries from reflog is
one of the ugly and unfortunate hacks in our codebase.

However, I see that there are one of two things that you could do to
make this part of code do slightly better than stopping at the first
non-commit object:

 - pretend that the non-commit entry never existed in the first
   place and return the commit that appears in the reflog next.

 - fabricate a fake "commit" object that says "I am not a commit;
   I merely exist to represent that the reflog you are walking has
   this non-commit object at this point in the sequence" and return
   it, instead of giving NULL in the error path.

> diff --git a/t/t1410-reflog.sh b/t/t1410-reflog.sh
> index b79049f..130d671 100755
> --- a/t/t1410-reflog.sh
> +++ b/t/t1410-reflog.sh
> @@ -325,4 +325,17 @@ test_expect_success 'parsing reverse reflogs at BUFSIZ boundaries' '
>  	test_cmp expect actual
>  '
>  
> +test_expect_success 'no segfaults for reflog containing non-commit sha1s' '
> +	git update-ref --create-reflog -m "Creating ref" \
> +		refs/tests/tree-in-reflog HEAD &&
> +	git update-ref -m "Forcing tree" refs/tests/tree-in-reflog HEAD^{tree} &&
> +	git update-ref -m "Restoring to commit" refs/tests/tree-in-reflog HEAD &&
> +	git reflog refs/tests/tree-in-reflog
> +'
> +
> +test_expect_failure 'reflog containing non-commit sha1s displays fully' '
> +	git reflog refs/tests/tree-in-reflog > actual &&

Please write this without space after the redirection operator, i.e.

	git reflog refs/tests/tree-in-reflog >actual &&

> +	test_line_count = 3 actual
> +'
> +
>  test_done

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH v3] reflog-walk: don't segfault on non-commit sha1's in the reflog
  2015-12-30 22:42                       ` Junio C Hamano
@ 2015-12-30 23:33                         ` Dennis Kaarsemaker
  2015-12-31  0:02                           ` Junio C Hamano
  0 siblings, 1 reply; 25+ messages in thread
From: Dennis Kaarsemaker @ 2015-12-30 23:33 UTC (permalink / raw)
  To: git, pclouds, gitster

git reflog (ab)uses the log machinery to display its list of log
entries. To do so it must fake commit parent information for the log
walker.

For refs in refs/heads this is no problem, as they should only ever
point to commits. Tags and other refs however can point to anything,
thus their reflog may contain non-commit objects.

To avoid segfaulting, we check whether reflog entries are commits before
feeding them to the log walker and skip any non-commits. This means that
git reflog output will be incomplete for such refs, but that's one step
up from segfaulting. A more complete solution would be to decouple git
reflog from the log walker machinery.

Signed-off-by: Dennis Kaarsemaker <dennis@kaarsemaker.net>
Helped-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
---
 reflog-walk.c     | 16 +++++++++++-----
 t/t1410-reflog.sh | 13 +++++++++++++
 2 files changed, 24 insertions(+), 5 deletions(-)

Junio C Hamano wrote:

> However, I see that there are one of two things that you could do to
> make this part of code do slightly better than stopping at the first
> non-commit object:
> 
>  - pretend that the non-commit entry never existed in the first
>     place and return the commit that appears in the reflog next.

This turned out to be doable in the same code segment: just keep on
processing reflog entries until you hit a commit or run out of entries.
That (and the updated foremerly-failing test) are the only changes
between v2 and v3.

I'll try to actually implement the proper solution, but that'll take a
while. Until then, this at least stops a segfault :)

diff --git a/reflog-walk.c b/reflog-walk.c
index 85b8a54..0ebd1da 100644
--- a/reflog-walk.c
+++ b/reflog-walk.c
@@ -221,6 +221,7 @@ void fake_reflog_parent(struct reflog_walk_info *info, struct commit *commit)
 	struct commit_info *commit_info =
 		get_commit_info(commit, &info->reflogs, 0);
 	struct commit_reflog *commit_reflog;
+	struct object *logobj;
 	struct reflog_info *reflog;
 
 	info->last_commit_reflog = NULL;
@@ -232,15 +233,20 @@ void fake_reflog_parent(struct reflog_walk_info *info, struct commit *commit)
 		commit->parents = NULL;
 		return;
 	}
-
-	reflog = &commit_reflog->reflogs->items[commit_reflog->recno];
 	info->last_commit_reflog = commit_reflog;
-	commit_reflog->recno--;
-	commit_info->commit = (struct commit *)parse_object(reflog->osha1);
-	if (!commit_info->commit) {
+
+	do {
+		reflog = &commit_reflog->reflogs->items[commit_reflog->recno];
+		commit_reflog->recno--;
+		logobj = parse_object(reflog->osha1);
+	} while (commit_reflog->recno && (logobj && logobj->type != OBJ_COMMIT));
+
+	if (!logobj || logobj->type != OBJ_COMMIT) {
+		commit_info->commit = NULL;
 		commit->parents = NULL;
 		return;
 	}
+	commit_info->commit = (struct commit *)logobj;
 
 	commit->parents = xcalloc(1, sizeof(struct commit_list));
 	commit->parents->item = commit_info->commit;
diff --git a/t/t1410-reflog.sh b/t/t1410-reflog.sh
index b79049f..f97513c 100755
--- a/t/t1410-reflog.sh
+++ b/t/t1410-reflog.sh
@@ -325,4 +325,17 @@ test_expect_success 'parsing reverse reflogs at BUFSIZ boundaries' '
 	test_cmp expect actual
 '
 
+test_expect_success 'no segfaults for reflog containing non-commit sha1s' '
+	git update-ref --create-reflog -m "Creating ref" \
+		refs/tests/tree-in-reflog HEAD &&
+	git update-ref -m "Forcing tree" refs/tests/tree-in-reflog HEAD^{tree} &&
+	git update-ref -m "Restoring to commit" refs/tests/tree-in-reflog HEAD &&
+	git reflog refs/tests/tree-in-reflog
+'
+
+test_expect_success 'reflog containing non-commit sha1s displays properly' '
+	git reflog refs/tests/tree-in-reflog >actual &&
+	test_line_count = 2 actual
+'
+
 test_done
-- 
2.7.0-rc1-207-ga35084c


-- 
Dennis Kaarsemaker <dennis@kaarsemaker.net>
http://twitter.com/seveas

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH v3] reflog-walk: don't segfault on non-commit sha1's in the reflog
  2015-12-30 23:33                         ` [PATCH v3] " Dennis Kaarsemaker
@ 2015-12-31  0:02                           ` Junio C Hamano
  2015-12-31  8:57                             ` Dennis Kaarsemaker
  0 siblings, 1 reply; 25+ messages in thread
From: Junio C Hamano @ 2015-12-31  0:02 UTC (permalink / raw)
  To: Dennis Kaarsemaker; +Cc: git, pclouds

Dennis Kaarsemaker <dennis@kaarsemaker.net> writes:

> This turned out to be doable in the same code segment: just keep on
> processing reflog entries until you hit a commit or run out of entries.
> That (and the updated foremerly-failing test) are the only changes
> between v2 and v3.
>
> I'll try to actually implement the proper solution, but that'll take a
> while. Until then, this at least stops a segfault :)

Yeah, that would be an ambitious project.

> diff --git a/reflog-walk.c b/reflog-walk.c
> index 85b8a54..0ebd1da 100644
> --- a/reflog-walk.c
> +++ b/reflog-walk.c
> @@ -221,6 +221,7 @@ void fake_reflog_parent(struct reflog_walk_info *info, struct commit *commit)
>  	struct commit_info *commit_info =
>  		get_commit_info(commit, &info->reflogs, 0);
>  	struct commit_reflog *commit_reflog;
> +	struct object *logobj;

This thing is not initialized...

>  	struct reflog_info *reflog;
>  
>  	info->last_commit_reflog = NULL;
> @@ -232,15 +233,20 @@ void fake_reflog_parent(struct reflog_walk_info *info, struct commit *commit)
>  		commit->parents = NULL;
>  		return;
>  	}
> -
> -	reflog = &commit_reflog->reflogs->items[commit_reflog->recno];
>  	info->last_commit_reflog = commit_reflog;
> -	commit_reflog->recno--;
> -	commit_info->commit = (struct commit *)parse_object(reflog->osha1);
> -	if (!commit_info->commit) {
> +
> +	do {
> +		reflog = &commit_reflog->reflogs->items[commit_reflog->recno];
> +		commit_reflog->recno--;
> +		logobj = parse_object(reflog->osha1);
> +	} while (commit_reflog->recno && (logobj && logobj->type != OBJ_COMMIT));

But this loop runs at least once, so logobj will always have some
sane value when the loop exits.

> +	if (!logobj || logobj->type != OBJ_COMMIT) {

And the only case where this should trigger is when we ran out of
recno.  Am I reading the updated code correctly?

With the updated code, the number of times we return from this
function is different from the number initially set to recno.  I had
to wonder if the caller cares and misbehaves, but the caller does
not know how long the reflog is before starting to call
get_revision() in a loop anyway, so it already has to deal with a
case where it did .recno=20 and get_revision() did not return that
many times.  So this probably is safe.

> +		commit_info->commit = NULL;
>  		commit->parents = NULL;
>  		return;
>  	}

> +	commit_info->commit = (struct commit *)logobj;
>  
>  	commit->parents = xcalloc(1, sizeof(struct commit_list));
>  	commit->parents->item = commit_info->commit;
> diff --git a/t/t1410-reflog.sh b/t/t1410-reflog.sh
> index b79049f..f97513c 100755
> --- a/t/t1410-reflog.sh
> +++ b/t/t1410-reflog.sh
> @@ -325,4 +325,17 @@ test_expect_success 'parsing reverse reflogs at BUFSIZ boundaries' '
>  	test_cmp expect actual
>  '
>  
> +test_expect_success 'no segfaults for reflog containing non-commit sha1s' '
> +	git update-ref --create-reflog -m "Creating ref" \
> +		refs/tests/tree-in-reflog HEAD &&
> +	git update-ref -m "Forcing tree" refs/tests/tree-in-reflog HEAD^{tree} &&
> +	git update-ref -m "Restoring to commit" refs/tests/tree-in-reflog HEAD &&
> +	git reflog refs/tests/tree-in-reflog
> +'
> +
> +test_expect_success 'reflog containing non-commit sha1s displays properly' '

In general, "properly" is a poor word to use in test description (or
a commit log message or a bug report, for that matter), as the whole
point of a test is to precisely define what is "proper".

And the code change declares that a proper thing to do is to omit
non-commit entries without segfaulting, so something like

    s/displays properly/omits them/

perhaps?

> +	git reflog refs/tests/tree-in-reflog >actual &&
> +	test_line_count = 2 actual
> +'
> +
>  test_done
> -- 
> 2.7.0-rc1-207-ga35084c

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3] reflog-walk: don't segfault on non-commit sha1's in the reflog
  2015-12-31  0:02                           ` Junio C Hamano
@ 2015-12-31  8:57                             ` Dennis Kaarsemaker
  2015-12-31 15:43                               ` Dennis Kaarsemaker
  2016-01-05 21:12                               ` [PATCH v4] " Dennis Kaarsemaker
  0 siblings, 2 replies; 25+ messages in thread
From: Dennis Kaarsemaker @ 2015-12-31  8:57 UTC (permalink / raw)
  To: Junio C Hamano, git, pclouds

On wo, 2015-12-30 at 16:02 -0800, Junio C Hamano wrote:
> Dennis Kaarsemaker <dennis@kaarsemaker.net> writes:
> 
> > diff --git a/reflog-walk.c b/reflog-walk.c
> > index 85b8a54..0ebd1da 100644
> > --- a/reflog-walk.c
> > +++ b/reflog-walk.c
> > @@ -221,6 +221,7 @@ void fake_reflog_parent(struct reflog_walk_info
> > *info, struct commit *commit)
> >  	struct commit_info *commit_info =
> >  		get_commit_info(commit, &info->reflogs, 0);
> >  	struct commit_reflog *commit_reflog;
> > +	struct object *logobj;
> 
> This thing is not initialized...
> 
> >  	struct reflog_info *reflog;
> >  
> >  	info->last_commit_reflog = NULL;
> > @@ -232,15 +233,20 @@ void fake_reflog_parent(struct
> > reflog_walk_info *info, struct commit *commit)
> >  		commit->parents = NULL;
> >  		return;
> >  	}
> > -
> > -	reflog = &commit_reflog->reflogs->items[commit_reflog
> > ->recno];
> >  	info->last_commit_reflog = commit_reflog;
> > -	commit_reflog->recno--;
> > -	commit_info->commit = (struct commit *)parse_object(reflog
> > ->osha1);
> > -	if (!commit_info->commit) {
> > +
> > +	do {
> > +		reflog = &commit_reflog->reflogs
> > ->items[commit_reflog->recno];
> > +		commit_reflog->recno--;
> > +		logobj = parse_object(reflog->osha1);
> > +	} while (commit_reflog->recno && (logobj && logobj->type
> > != OBJ_COMMIT));
> 
> But this loop runs at least once, so logobj will always have some
> sane value when the loop exits.
> 
> > +	if (!logobj || logobj->type != OBJ_COMMIT) {
> 
> And the only case where this should trigger is when we ran out of
> recno.  Am I reading the updated code correctly?

Yes, your description matches what I tried to implement.

> With the updated code, the number of times we return from this
> function is different from the number initially set to recno.  I had
> to wonder if the caller cares and misbehaves, but the caller does
> not know how long the reflog is before starting to call
> get_revision() in a loop anyway, so it already has to deal with a
> case where it did .recno=20 and get_revision() did not return that
> many times.  So this probably is safe.

That corresponds to what I see when experimenting with different
reflogs.

> > +test_expect_success 'reflog containing non-commit sha1s displays
> > properly' '
> 
> In general, "properly" is a poor word to use in test description (or
> a commit log message or a bug report, for that matter), as the whole
> point of a test is to precisely define what is "proper".
> 
> And the code change declares that a proper thing to do is to omit
> non-commit entries without segfaulting, so something like
> 
>     s/displays properly/omits them/
> 
> perhaps?

I did find the test title a bit iffy but couldn't really figure out
why. What you're saying makes a lot of sense, will fix.
-- 
Dennis Kaarsemaker
www.kaarsemaker.net

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v3] reflog-walk: don't segfault on non-commit sha1's in the reflog
  2015-12-31  8:57                             ` Dennis Kaarsemaker
@ 2015-12-31 15:43                               ` Dennis Kaarsemaker
  2016-01-05 21:12                               ` [PATCH v4] " Dennis Kaarsemaker
  1 sibling, 0 replies; 25+ messages in thread
From: Dennis Kaarsemaker @ 2015-12-31 15:43 UTC (permalink / raw)
  To: Junio C Hamano, git, pclouds

On do, 2015-12-31 at 09:57 +0100, Dennis Kaarsemaker wrote:
> > > +test_expect_success 'reflog containing non-commit sha1s displays
> > > properly' '
> > 
> > In general, "properly" is a poor word to use in test description
> (or
> > a commit log message or a bug report, for that matter), as the
> whole
> > point of a test is to precisely define what is "proper".
> > 
> > And the code change declares that a proper thing to do is to omit
> > non-commit entries without segfaulting, so something like
> > 
> >     s/displays properly/omits them/
> > 
> > perhaps?
> 
> I did find the test title a bit iffy but couldn't really figure out
> why. What you're saying makes a lot of sense, will fix.

Thinking about it a bit more: 'proper' would be to show everything, we
just expect that not to work yet. So I'll make it

test_expect_failure 'reflog with non-commit entries displays all entries' '
	git reflog refs/tests/tree-in-reflog >actual &&
	test_line_count = 3 actual
'

-- 
Dennis Kaarsemaker
www.kaarsemaker.net

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH v4] reflog-walk: don't segfault on non-commit sha1's in the reflog
  2015-12-31  8:57                             ` Dennis Kaarsemaker
  2015-12-31 15:43                               ` Dennis Kaarsemaker
@ 2016-01-05 21:12                               ` Dennis Kaarsemaker
  2016-01-06  1:05                                 ` Eric Sunshine
  1 sibling, 1 reply; 25+ messages in thread
From: Dennis Kaarsemaker @ 2016-01-05 21:12 UTC (permalink / raw)
  To: git, pclouds, gitster

git reflog (ab)uses the log machinery to display its list of log
entries. To do so it must fake commit parent information for the log
walker.

For refs in refs/heads this is no problem, as they should only ever
point to commits. Tags and other refs however can point to anything,
thus their reflog may contain non-commit objects.

To avoid segfaulting, we check whether reflog entries are commits before
feeding them to the log walker and skip any non-commits. This means that
git reflog output will be incomplete for such refs, but that's one step
up from segfaulting. A more complete solution would be to decouple git
reflog from the log walker machinery.

Signed-off-by: Dennis Kaarsemaker <dennis@kaarsemaker.net>
Helped-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
---
Changes since v3: tweaks to the second test.

 reflog-walk.c     | 16 +++++++++++-----
 t/t1410-reflog.sh | 13 +++++++++++++
 2 files changed, 24 insertions(+), 5 deletions(-)

diff --git a/reflog-walk.c b/reflog-walk.c
index 85b8a54..0ebd1da 100644
--- a/reflog-walk.c
+++ b/reflog-walk.c
@@ -221,6 +221,7 @@ void fake_reflog_parent(struct reflog_walk_info *info, struct commit *commit)
 	struct commit_info *commit_info =
 		get_commit_info(commit, &info->reflogs, 0);
 	struct commit_reflog *commit_reflog;
+	struct object *logobj;
 	struct reflog_info *reflog;
 
 	info->last_commit_reflog = NULL;
@@ -232,15 +233,20 @@ void fake_reflog_parent(struct reflog_walk_info *info, struct commit *commit)
 		commit->parents = NULL;
 		return;
 	}
-
-	reflog = &commit_reflog->reflogs->items[commit_reflog->recno];
 	info->last_commit_reflog = commit_reflog;
-	commit_reflog->recno--;
-	commit_info->commit = (struct commit *)parse_object(reflog->osha1);
-	if (!commit_info->commit) {
+
+	do {
+		reflog = &commit_reflog->reflogs->items[commit_reflog->recno];
+		commit_reflog->recno--;
+		logobj = parse_object(reflog->osha1);
+	} while (commit_reflog->recno && (logobj && logobj->type != OBJ_COMMIT));
+
+	if (!logobj || logobj->type != OBJ_COMMIT) {
+		commit_info->commit = NULL;
 		commit->parents = NULL;
 		return;
 	}
+	commit_info->commit = (struct commit *)logobj;
 
 	commit->parents = xcalloc(1, sizeof(struct commit_list));
 	commit->parents->item = commit_info->commit;
diff --git a/t/t1410-reflog.sh b/t/t1410-reflog.sh
index b79049f..17a194b 100755
--- a/t/t1410-reflog.sh
+++ b/t/t1410-reflog.sh
@@ -325,4 +325,17 @@ test_expect_success 'parsing reverse reflogs at BUFSIZ boundaries' '
 	test_cmp expect actual
 '
 
+test_expect_success 'no segfaults for reflog containing non-commit sha1s' '
+	git update-ref --create-reflog -m "Creating ref" \
+		refs/tests/tree-in-reflog HEAD &&
+	git update-ref -m "Forcing tree" refs/tests/tree-in-reflog HEAD^{tree} &&
+	git update-ref -m "Restoring to commit" refs/tests/tree-in-reflog HEAD &&
+	git reflog refs/tests/tree-in-reflog
+'
+
+test_expect_failure 'reflog with non-commit entries displays all entries' '
+	git reflog refs/tests/tree-in-reflog >actual &&
+	test_line_count = 3 actual
+'
+
 test_done
-- 
2.7.0-rc3-219-g24972d4


-- 
Dennis Kaarsemaker <dennis@kaarsemaker.net>
http://twitter.com/seveas

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH v4] reflog-walk: don't segfault on non-commit sha1's in the reflog
  2016-01-05 21:12                               ` [PATCH v4] " Dennis Kaarsemaker
@ 2016-01-06  1:05                                 ` Eric Sunshine
  2016-01-06  1:20                                   ` Dennis Kaarsemaker
  0 siblings, 1 reply; 25+ messages in thread
From: Eric Sunshine @ 2016-01-06  1:05 UTC (permalink / raw)
  To: Dennis Kaarsemaker
  Cc: Git List, Nguyễn Thái Ngọc Duy, Junio C Hamano

On Tue, Jan 5, 2016 at 4:12 PM, Dennis Kaarsemaker
<dennis@kaarsemaker.net> wrote:
> git reflog (ab)uses the log machinery to display its list of log
> entries. To do so it must fake commit parent information for the log
> walker.
>
> For refs in refs/heads this is no problem, as they should only ever
> point to commits. Tags and other refs however can point to anything,
> thus their reflog may contain non-commit objects.
>
> To avoid segfaulting, we check whether reflog entries are commits before
> feeding them to the log walker and skip any non-commits. This means that
> git reflog output will be incomplete for such refs, but that's one step
> up from segfaulting. A more complete solution would be to decouple git
> reflog from the log walker machinery.
>
> Signed-off-by: Dennis Kaarsemaker <dennis@kaarsemaker.net>
> ---
> diff --git a/t/t1410-reflog.sh b/t/t1410-reflog.sh
> @@ -325,4 +325,17 @@ test_expect_success 'parsing reverse reflogs at BUFSIZ boundaries' '
> +test_expect_success 'no segfaults for reflog containing non-commit sha1s' '

Nit: It's kind of strange for a test title to talk about not
segfaulting; that's behavior you'd expect to be true for all tests.
Perhaps describe it as "non-commit reflog entries handled sanely" or
something.

> +       git update-ref --create-reflog -m "Creating ref" \
> +               refs/tests/tree-in-reflog HEAD &&
> +       git update-ref -m "Forcing tree" refs/tests/tree-in-reflog HEAD^{tree} &&
> +       git update-ref -m "Restoring to commit" refs/tests/tree-in-reflog HEAD &&
> +       git reflog refs/tests/tree-in-reflog
> +'

Hmm, this test is successful for me on OS X even without the
reflog-walk.c changes applied.

> +test_expect_failure 'reflog with non-commit entries displays all entries' '
> +       git reflog refs/tests/tree-in-reflog >actual &&
> +       test_line_count = 3 actual
> +'

And this test actually fails (inversely) because it's expecting a
failure, but doesn't get one since the command produces the expected
output.

By the way, it may make sense to combine these two tests. If a
segfault occurs, the actual output likely will not match the expected
output, thus the test will fail anyhow (unless the segfault occurs
after all output).

> +
>  test_done
> --
> 2.7.0-rc3-219-g24972d4

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v4] reflog-walk: don't segfault on non-commit sha1's in the reflog
  2016-01-06  1:05                                 ` Eric Sunshine
@ 2016-01-06  1:20                                   ` Dennis Kaarsemaker
  2016-01-06  1:28                                     ` Eric Sunshine
  0 siblings, 1 reply; 25+ messages in thread
From: Dennis Kaarsemaker @ 2016-01-06  1:20 UTC (permalink / raw)
  To: Eric Sunshine
  Cc: Git List, Nguyễn Thái Ngọc Duy, Junio C Hamano

On di, 2016-01-05 at 20:05 -0500, Eric Sunshine wrote:
> On Tue, Jan 5, 2016 at 4:12 PM, Dennis Kaarsemaker
> <dennis@kaarsemaker.net> wrote:
> > git reflog (ab)uses the log machinery to display its list of log
> > entries. To do so it must fake commit parent information for the
> > log
> > walker.
> > 
> > For refs in refs/heads this is no problem, as they should only ever
> > point to commits. Tags and other refs however can point to
> > anything,
> > thus their reflog may contain non-commit objects.
> > 
> > To avoid segfaulting, we check whether reflog entries are commits
> > before
> > feeding them to the log walker and skip any non-commits. This means
> > that
> > git reflog output will be incomplete for such refs, but that's one
> > step
> > up from segfaulting. A more complete solution would be to decouple
> > git
> > reflog from the log walker machinery.
> > 
> > Signed-off-by: Dennis Kaarsemaker <dennis@kaarsemaker.net>
> > ---
> > diff --git a/t/t1410-reflog.sh b/t/t1410-reflog.sh
> > @@ -325,4 +325,17 @@ test_expect_success 'parsing reverse reflogs
> > at BUFSIZ boundaries' '
> > +test_expect_success 'no segfaults for reflog containing non-commit
> > sha1s' '
> 
> Nit: It's kind of strange for a test title to talk about not
> segfaulting; that's behavior you'd expect to be true for all tests.
> Perhaps describe it as "non-commit reflog entries handled sanely" or
> something.

To paraphrase what Junio said earlier in this thread: tests determine
what is sane behavior, so using the word 'sanely' isn't really
appropriate. This is a regression test to make sure we don't
accidentally reintroduce behavior that segfaults, which I think is an
easy mistake to make with the current code, so I think the title is
appropriate.

> > +       git update-ref --create-reflog -m "Creating ref" \
> > +               refs/tests/tree-in-reflog HEAD &&
> > +       git update-ref -m "Forcing tree" refs/tests/tree-in-reflog
> > HEAD^{tree} &&
> > +       git update-ref -m "Restoring to commit" refs/tests/tree-in
> > -reflog HEAD &&
> > +       git reflog refs/tests/tree-in-reflog
> > +'
> 
> Hmm, this test is successful for me on OS X even without the
> reflog-walk.c changes applied.
> 
> > +test_expect_failure 'reflog with non-commit entries displays all
> > entries' '
> > +       git reflog refs/tests/tree-in-reflog >actual &&
> > +       test_line_count = 3 actual
> > +'
> 
> And this test actually fails (inversely) because it's expecting a
> failure, but doesn't get one since the command produces the expected
> output.

That's... surprising to say the least. What's the content of 'actual',
and which git.git commit are you on?

> By the way, it may make sense to combine these two tests. If a
> segfault occurs, the actual output likely will not match the expected
> output, thus the test will fail anyhow (unless the segfault occurs
> after all output).

I kept them separate to show that while this no longer segfaults, it's
still not the correct output, but showing correct output is a much
bigger project.

-- 
Dennis Kaarsemaker
www.kaarsemaker.net

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v4] reflog-walk: don't segfault on non-commit sha1's in the reflog
  2016-01-06  1:20                                   ` Dennis Kaarsemaker
@ 2016-01-06  1:28                                     ` Eric Sunshine
  2016-01-06  1:52                                       ` Eric Sunshine
  0 siblings, 1 reply; 25+ messages in thread
From: Eric Sunshine @ 2016-01-06  1:28 UTC (permalink / raw)
  To: Dennis Kaarsemaker
  Cc: Git List, Nguyễn Thái Ngọc, Junio C Hamano

On Tue, Jan 5, 2016 at 8:20 PM, Dennis Kaarsemaker
<dennis@kaarsemaker.net> wrote:
> On di, 2016-01-05 at 20:05 -0500, Eric Sunshine wrote:
>> On Tue, Jan 5, 2016 at 4:12 PM, Dennis Kaarsemaker
>> > +       git update-ref --create-reflog -m "Creating ref" \
>> > +               refs/tests/tree-in-reflog HEAD &&
>> > +       git update-ref -m "Forcing tree" refs/tests/tree-in-reflog
>> > HEAD^{tree} &&
>> > +       git update-ref -m "Restoring to commit" refs/tests/tree-in
>> > -reflog HEAD &&
>> > +       git reflog refs/tests/tree-in-reflog
>> > +'
>>
>> Hmm, this test is successful for me on OS X even without the
>> reflog-walk.c changes applied.
>>
>> > +test_expect_failure 'reflog with non-commit entries displays all
>> > entries' '
>> > +       git reflog refs/tests/tree-in-reflog >actual &&
>> > +       test_line_count = 3 actual
>> > +'
>>
>> And this test actually fails (inversely) because it's expecting a
>> failure, but doesn't get one since the command produces the expected
>> output.
>
> That's... surprising to say the least. What's the content of 'actual',
> and which git.git commit are you on?

% cat t/trash\ directory.t1410-reflog/actual
b60a214 refs/tests/tree-in-reflog@{0}: Restoring to commit
140c527 refs/tests/tree-in-reflog@{1}: Forcing tree
b60a214 refs/tests/tree-in-reflog@{2}: Creating ref
%

This is with only the t/t1410-reflog.sh changes from your patch
applied atop current 'master' (SHA1 7548842).

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v4] reflog-walk: don't segfault on non-commit sha1's in the reflog
  2016-01-06  1:28                                     ` Eric Sunshine
@ 2016-01-06  1:52                                       ` Eric Sunshine
  2016-01-06  9:13                                         ` Dennis Kaarsemaker
  0 siblings, 1 reply; 25+ messages in thread
From: Eric Sunshine @ 2016-01-06  1:52 UTC (permalink / raw)
  To: Dennis Kaarsemaker
  Cc: Git List, Nguyễn Thái Ngọc, Junio C Hamano

On Tue, Jan 5, 2016 at 8:28 PM, Eric Sunshine <sunshine@sunshineco.com> wrote:
> On Tue, Jan 5, 2016 at 8:20 PM, Dennis Kaarsemaker
> <dennis@kaarsemaker.net> wrote:
>> On di, 2016-01-05 at 20:05 -0500, Eric Sunshine wrote:
>>> Hmm, this test is successful for me on OS X even without the
>>> reflog-walk.c changes applied.
>>>
>>> And this test actually fails (inversely) because it's expecting a
>>> failure, but doesn't get one since the command produces the expected
>>> output.
>>
>> That's... surprising to say the least. What's the content of 'actual',
>> and which git.git commit are you on?
>
> % cat t/trash\ directory.t1410-reflog/actual
> b60a214 refs/tests/tree-in-reflog@{0}: Restoring to commit
> 140c527 refs/tests/tree-in-reflog@{1}: Forcing tree
> b60a214 refs/tests/tree-in-reflog@{2}: Creating ref
> %
>
> This is with only the t/t1410-reflog.sh changes from your patch
> applied atop current 'master' (SHA1 7548842).

By the way, the segfault does occur for me on Linux and FreeBSD.

And, in all cases, on all tested platforms, with the full patch
applied, both tests behave sanely (in the expected fashion). So, even
though the crash doesn't manifest everywhere, the fact that the tests
are meaningfully testing it on the "affected" platforms may mean that
it's not worth worrying about why it doesn't segfault on OS X.

(Of course, practicality aside, one might want to satisfy one's
intellectual curiosity about why it behaves differently on OS X.)

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v4] reflog-walk: don't segfault on non-commit sha1's in the reflog
  2016-01-06  1:52                                       ` Eric Sunshine
@ 2016-01-06  9:13                                         ` Dennis Kaarsemaker
  2016-01-06  9:30                                           ` Duy Nguyen
  0 siblings, 1 reply; 25+ messages in thread
From: Dennis Kaarsemaker @ 2016-01-06  9:13 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: Git List, Nguyễn Thái Ngọc, Junio C Hamano

On di, 2016-01-05 at 20:52 -0500, Eric Sunshine wrote:
> On Tue, Jan 5, 2016 at 8:28 PM, Eric Sunshine <
> sunshine@sunshineco.com> wrote:
> > On Tue, Jan 5, 2016 at 8:20 PM, Dennis Kaarsemaker
> > <dennis@kaarsemaker.net> wrote:
> > > On di, 2016-01-05 at 20:05 -0500, Eric Sunshine wrote:
> > > > Hmm, this test is successful for me on OS X even without the
> > > > reflog-walk.c changes applied.
> > > > 
> > > > And this test actually fails (inversely) because it's expecting
> > > > a
> > > > failure, but doesn't get one since the command produces the
> > > > expected
> > > > output.
> > > 
> > > That's... surprising to say the least. What's the content of
> > > 'actual',
> > > and which git.git commit are you on?
> > 
> > % cat t/trash\ directory.t1410-reflog/actual
> > b60a214 refs/tests/tree-in-reflog@{0}: Restoring to commit
> > 140c527 refs/tests/tree-in-reflog@{1}: Forcing tree
> > b60a214 refs/tests/tree-in-reflog@{2}: Creating ref
> > %
> > 
> > This is with only the t/t1410-reflog.sh changes from your patch
> > applied atop current 'master' (SHA1 7548842).
>
> By the way, the segfault does occur for me on Linux and FreeBSD.
> 
> And, in all cases, on all tested platforms, with the full patch
> applied, both tests behave sanely (in the expected fashion). So, even
> though the crash doesn't manifest everywhere, the fact that the tests
> are meaningfully testing it on the "affected" platforms may mean that
> it's not worth worrying about why it doesn't segfault on OS X.
> 
> (Of course, practicality aside, one might want to satisfy one's
> intellectual curiosity about why it behaves differently on OS X.)

The only explanation I can think of (and that's with practically no
knowledge of OS X internals) is that OS X's memory allocation strategy
is unlucky. Git is definitely writing to a location it should not write
to. On linux and freebsd this is unallocated memory, so you get a
segfault. On OS X, it happens to be memory actually allocated by git,
resulting not in a segfault but in silent corruption of other in-memory
data. I would argue that this is a much worse result, even though in
this small test that corruption seems to not trigger a crash.

-- 
Dennis Kaarsemaker
http://www.kaarsemaker.net

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v4] reflog-walk: don't segfault on non-commit sha1's in the reflog
  2016-01-06  9:13                                         ` Dennis Kaarsemaker
@ 2016-01-06  9:30                                           ` Duy Nguyen
  0 siblings, 0 replies; 25+ messages in thread
From: Duy Nguyen @ 2016-01-06  9:30 UTC (permalink / raw)
  To: Dennis Kaarsemaker; +Cc: Eric Sunshine, Git List, Junio C Hamano

On Wed, Jan 6, 2016 at 4:13 PM, Dennis Kaarsemaker
<dennis@kaarsemaker.net> wrote:
> On di, 2016-01-05 at 20:52 -0500, Eric Sunshine wrote:
>> On Tue, Jan 5, 2016 at 8:28 PM, Eric Sunshine <
>> sunshine@sunshineco.com> wrote:
>> > On Tue, Jan 5, 2016 at 8:20 PM, Dennis Kaarsemaker
>> > <dennis@kaarsemaker.net> wrote:
>> > > On di, 2016-01-05 at 20:05 -0500, Eric Sunshine wrote:
>> > > > Hmm, this test is successful for me on OS X even without the
>> > > > reflog-walk.c changes applied.
>> > > >
>> > > > And this test actually fails (inversely) because it's expecting
>> > > > a
>> > > > failure, but doesn't get one since the command produces the
>> > > > expected
>> > > > output.
>> > >
>> > > That's... surprising to say the least. What's the content of
>> > > 'actual',
>> > > and which git.git commit are you on?
>> >
>> > % cat t/trash\ directory.t1410-reflog/actual
>> > b60a214 refs/tests/tree-in-reflog@{0}: Restoring to commit
>> > 140c527 refs/tests/tree-in-reflog@{1}: Forcing tree
>> > b60a214 refs/tests/tree-in-reflog@{2}: Creating ref
>> > %
>> >
>> > This is with only the t/t1410-reflog.sh changes from your patch
>> > applied atop current 'master' (SHA1 7548842).
>>
>> By the way, the segfault does occur for me on Linux and FreeBSD.
>>
>> And, in all cases, on all tested platforms, with the full patch
>> applied, both tests behave sanely (in the expected fashion). So, even
>> though the crash doesn't manifest everywhere, the fact that the tests
>> are meaningfully testing it on the "affected" platforms may mean that
>> it's not worth worrying about why it doesn't segfault on OS X.
>>
>> (Of course, practicality aside, one might want to satisfy one's
>> intellectual curiosity about why it behaves differently on OS X.)
>
> The only explanation I can think of (and that's with practically no
> knowledge of OS X internals) is that OS X's memory allocation strategy
> is unlucky. Git is definitely writing to a location it should not write
> to. On linux and freebsd this is unallocated memory, so you get a
> segfault. On OS X, it happens to be memory actually allocated by git,
> resulting not in a segfault but in silent corruption of other in-memory
> data. I would argue that this is a much worse result, even though in
> this small test that corruption seems to not trigger a crash.

Agreed. For a guaranteed crash, put assert(c->object.type ==
OBJ_COMMIT); in the macro function "slabname## _at_peek" in
commit-slab.h. That is if I analyzed the crash correctly. I'm not
suggesting to put the assert() in the code permanently though because
I don't know how often this function is called.
-- 
Duy

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2016-01-06  9:31 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-12-30  9:24 Segfault in git reflog Dennis Kaarsemaker
2015-12-30 10:31 ` Duy Nguyen
2015-12-30 11:17 ` Dennis Kaarsemaker
2015-12-30 11:26   ` Duy Nguyen
2015-12-30 11:28     ` Duy Nguyen
2015-12-30 12:28       ` Dennis Kaarsemaker
2015-12-30 13:19         ` Duy Nguyen
2015-12-30 15:22           ` [PATCH] reflog-walk: don't segfault on non-commit sha1's in the reflog Dennis Kaarsemaker
2015-12-30 21:20             ` Junio C Hamano
2015-12-30 21:33               ` Dennis Kaarsemaker
2015-12-30 21:41                 ` Junio C Hamano
2015-12-30 21:49                   ` Dennis Kaarsemaker
2015-12-30 22:17                     ` [PATCH v2] " Dennis Kaarsemaker
2015-12-30 22:42                       ` Junio C Hamano
2015-12-30 23:33                         ` [PATCH v3] " Dennis Kaarsemaker
2015-12-31  0:02                           ` Junio C Hamano
2015-12-31  8:57                             ` Dennis Kaarsemaker
2015-12-31 15:43                               ` Dennis Kaarsemaker
2016-01-05 21:12                               ` [PATCH v4] " Dennis Kaarsemaker
2016-01-06  1:05                                 ` Eric Sunshine
2016-01-06  1:20                                   ` Dennis Kaarsemaker
2016-01-06  1:28                                     ` Eric Sunshine
2016-01-06  1:52                                       ` Eric Sunshine
2016-01-06  9:13                                         ` Dennis Kaarsemaker
2016-01-06  9:30                                           ` Duy Nguyen

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.