All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] grep: Fix race condition in delta_base_cache
@ 2011-08-30 13:45 Nicolas Morey-Chaisemartin
  2011-08-31  1:59 ` Jeff King
  0 siblings, 1 reply; 4+ messages in thread
From: Nicolas Morey-Chaisemartin @ 2011-08-30 13:45 UTC (permalink / raw)
  To: Junio C Hamano, git

[-- Attachment #1: Type: text/plain, Size: 825 bytes --]


When running large git grep (ie: git grep regexp $(git rev-list --all)), glibc error sometimes occur:
*** glibc detected *** git: double free or corruption (!prev): 0x00000000010abdf0 ***

According to gdb the problem originate from release_delta_cash (sha1_file.c:1703)
		free(ent->data);

>From my analysis it seems that git grep threads do acquire lock before calling read_sha1_file but not before calling
read_object_with_reference who ends up calling read_sha1_file too.

Adding the lock around read_object_with_reference seems to fix the issue for me.
I've ran git grep about a dozen time and seen no more error while
it usually happened half the time before.

Signed-off-by: Nicolas Morey-Chaisemartin <nicolas@morey-chaisemartin.com>
---
 builtin/grep.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)



[-- Attachment #2: 0001-grep-Fix-race-condition-in-delta_base_cache.patch --]
[-- Type: text/x-patch, Size: 465 bytes --]

diff --git a/builtin/grep.c b/builtin/grep.c
index 1c359c2..56398d5 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -598,8 +598,10 @@ static int grep_object(struct grep_opt *opt, const struct pathspec *pathspec,
 		struct strbuf base;
 		int hit, len;
 
+		read_sha1_lock();
 		data = read_object_with_reference(obj->sha1, tree_type,
 						  &size, NULL);
+		read_sha1_unlock();
 		if (!data)
 			die(_("unable to read tree (%s)"), sha1_to_hex(obj->sha1));
 


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] grep: Fix race condition in delta_base_cache
  2011-08-30 13:45 [PATCH] grep: Fix race condition in delta_base_cache Nicolas Morey-Chaisemartin
@ 2011-08-31  1:59 ` Jeff King
  2011-08-31  6:32   ` Nicolas Morey-Chaisemartin
  2011-08-31 19:13   ` Nicolas Morey-Chaisemartin
  0 siblings, 2 replies; 4+ messages in thread
From: Jeff King @ 2011-08-31  1:59 UTC (permalink / raw)
  To: Nicolas Morey-Chaisemartin; +Cc: Junio C Hamano, git

On Tue, Aug 30, 2011 at 03:45:38PM +0200, Nicolas Morey-Chaisemartin wrote:

> According to gdb the problem originate from release_delta_cash (sha1_file.c:1703)
> 		free(ent->data);
> 
> From my analysis it seems that git grep threads do acquire lock before
> calling read_sha1_file but not before calling
> read_object_with_reference who ends up calling read_sha1_file too.

Yeah, I think this is necessary, and the patch looks good.

I notice there are some other code paths that end up in xmalloc without
locking, too (e.g., load_file, and some strbuf_* calls). Don't those
need locking, too, as malloc may try to release packfile memory?

builtin/pack-objects.c dealt with this already by setting a new
"try_to_free" routine that locks[1], which we should also do. It
probably comes up less frequently, because it only happens when we're
under memory pressure.

-Peff

[1] Actually, it looks like the "try_to_free" routine starts as nothing,
    and then add_packed_git sets it lazily to try_to_free_pack_memory.
    But what builtin/pack-objects tries to do is overwrite that with a
    version of try_to_free_pack_memory that does locking. So it's
    possible that we would not have read any packed objects while
    setting up the threads, and add_packed_git will overwrite our
    careful, locking version of try_to_free_pack_memory.

    I _think_ pack-objects is probably OK, because it will have already
    done the complete "counting objects" phase, which would look in any
    packs. But it may be harder for grep.

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] grep: Fix race condition in delta_base_cache
  2011-08-31  1:59 ` Jeff King
@ 2011-08-31  6:32   ` Nicolas Morey-Chaisemartin
  2011-08-31 19:13   ` Nicolas Morey-Chaisemartin
  1 sibling, 0 replies; 4+ messages in thread
From: Nicolas Morey-Chaisemartin @ 2011-08-31  6:32 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, git

On 08/31/2011 03:59 AM, Jeff King wrote:
> 
> I notice there are some other code paths that end up in xmalloc without
> locking, too (e.g., load_file, and some strbuf_* calls). Don't those
> need locking, too, as malloc may try to release packfile memory?
> 
After some consideration I think they do.
Grep threads definitly get the try_to_free  function registered while the main one does at least some xreallock out of sha1_file.c (so not owning the lock).
I guess it requires the same kind of lock pack-objects.c uses (meaning we need to set the try_to_free function in grep.c too).

> 
> [1] Actually, it looks like the "try_to_free" routine starts as nothing,
>     and then add_packed_git sets it lazily to try_to_free_pack_memory.
>     But what builtin/pack-objects tries to do is overwrite that with a
>     version of try_to_free_pack_memory that does locking. So it's
>     possible that we would not have read any packed objects while
>     setting up the threads, and add_packed_git will overwrite our
>     careful, locking version of try_to_free_pack_memory.
> 
>     I _think_ pack-objects is probably OK, because it will have already
>     done the complete "counting objects" phase, which would look in any
>     packs. But it may be harder for grep.

I'm not expert enough in git pathways to be sure about pack_objects but I agree it looks "risky".
Maybe add_packed_git should check whether there already is a free routine (other than do_nothing) instead of simply setting it up the first time.

Nicolas Morey-Chaisemartin

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] grep: Fix race condition in delta_base_cache
  2011-08-31  1:59 ` Jeff King
  2011-08-31  6:32   ` Nicolas Morey-Chaisemartin
@ 2011-08-31 19:13   ` Nicolas Morey-Chaisemartin
  1 sibling, 0 replies; 4+ messages in thread
From: Nicolas Morey-Chaisemartin @ 2011-08-31 19:13 UTC (permalink / raw)
  To: Jeff King; +Cc: git

On 08/31/2011 03:59 AM, Jeff King wrote:
> I notice there are some other code paths that end up in xmalloc without
> locking, too (e.g., load_file, and some strbuf_* calls). Don't those
> need locking, too, as malloc may try to release packfile memory?
> 
> builtin/pack-objects.c dealt with this already by setting a new
> "try_to_free" routine that locks[1], which we should also do. It
> probably comes up less frequently, because it only happens when we're
> under memory pressure.
> 
> -Peff
> 
> [1] Actually, it looks like the "try_to_free" routine starts as nothing,
>     and then add_packed_git sets it lazily to try_to_free_pack_memory.
>     But what builtin/pack-objects tries to do is overwrite that with a
>     version of try_to_free_pack_memory that does locking. So it's
>     possible that we would not have read any packed objects while
>     setting up the threads, and add_packed_git will overwrite our
>     careful, locking version of try_to_free_pack_memory.
> 
>     I _think_ pack-objects is probably OK, because it will have already
>     done the complete "counting objects" phase, which would look in any
>     packs. But it may be harder for grep.
> 

After some more looking around, I'd say the best way to fix this is provide a lock to the try_to_free function from sha1_file
and provide access to this lock for pack-objects and grep to replace respectively the read_mutex and read_sha1_mutex.
So we can simplify the problem by having a single lock to avoid all the cache/free issues (and reusable elsewhere if needed in the future), whether it's shared through direct access or API (I'm not sure what's git policy about that).
And this way, there is no need to duplicate what pack-objects is achieving and it gives some peace of mind about the fact that the try_to _free function won't be overwritten in our backs.

Nicolas Morey-Chaisemartin

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2011-08-31 21:06 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-08-30 13:45 [PATCH] grep: Fix race condition in delta_base_cache Nicolas Morey-Chaisemartin
2011-08-31  1:59 ` Jeff King
2011-08-31  6:32   ` Nicolas Morey-Chaisemartin
2011-08-31 19:13   ` Nicolas Morey-Chaisemartin

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.