git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/3] Fix gc segfault
@ 2018-10-16 21:02 Jameson Miller via GitGitGadget
  2018-10-16 21:02 ` [PATCH 1/3] Fix typo 'detla' -> 'delta' Johannes Schindelin via GitGitGadget
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Jameson Miller via GitGitGadget @ 2018-10-16 21:02 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano

In 9ac3f0e (pack-objects: fix performance issues on packing large deltas,
2018-07-22), a mutex was introduced that is used to guard the call to set
the delta size. This commit added code to initialize it, but at an incorrect
spot: in init_threaded_search(), while the call to oe_set_delta_size() (and
hence to packing_data_lock()) can happen in the call chain check_object() <- 
get_object_details() <-prepare_pack() <- cmd_pack_objects(), which is long
before theprepare_pack() function calls ll_find_deltas() (which initializes
the threaded search).

Another tell-tale that the mutex was initialized in an incorrect spot is
that the function to initialize it lives in builtin/, while the code that
uses the mutex is defined in a libgit.a header file.

Let's use a more appropriate function: prepare_packing_data(), which not
only lives in libgit.a, but has to be called before thepacking_data struct
is used that contains that mutex.

Johannes Schindelin (3):
  Fix typo 'detla' -> 'delta'
  pack-objects (mingw): demonstrate a segmentation fault with large
    deltas
  pack-objects (mingw): initialize `packing_data` mutex in the correct
    spot

 builtin/pack-objects.c        |  1 -
 pack-objects.c                |  3 +++
 pack-objects.h                |  2 +-
 t/t5321-pack-large-objects.sh | 32 ++++++++++++++++++++++++++++++++
 4 files changed, 36 insertions(+), 2 deletions(-)
 create mode 100755 t/t5321-pack-large-objects.sh


base-commit: 5a0cc8aca797dbd7d2be3b67458ff880ed45cddf
Published-As: https://github.com/gitgitgadget/git/releases/tags/pr-47%2Fjamill%2Ffix-gc-segfault-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-47/jamill/fix-gc-segfault-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/47
-- 
gitgitgadget

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH 1/3] Fix typo 'detla' -> 'delta'
  2018-10-16 21:02 [PATCH 0/3] Fix gc segfault Jameson Miller via GitGitGadget
@ 2018-10-16 21:02 ` Johannes Schindelin via GitGitGadget
  2018-10-16 21:02 ` [PATCH 2/3] pack-objects (mingw): demonstrate a segmentation fault with large deltas Johannes Schindelin via GitGitGadget
  2018-10-16 21:02 ` [PATCH 3/3] pack-objects (mingw): initialize `packing_data` mutex in the correct spot Johannes Schindelin via GitGitGadget
  2 siblings, 0 replies; 6+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2018-10-16 21:02 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 pack-objects.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/pack-objects.h b/pack-objects.h
index 2ca39cfcfe..86ee93feb4 100644
--- a/pack-objects.h
+++ b/pack-objects.h
@@ -377,7 +377,7 @@ static inline unsigned long oe_delta_size(struct packing_data *pack,
 		return e->delta_size_;
 
 	/*
-	 * pack->detla_size[] can't be NULL because oe_set_delta_size()
+	 * pack->delta_size[] can't be NULL because oe_set_delta_size()
 	 * must have been called when a new delta is saved with
 	 * oe_set_delta().
 	 * If oe_delta() returns NULL (i.e. default state, which means
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 2/3] pack-objects (mingw): demonstrate a segmentation fault with large deltas
  2018-10-16 21:02 [PATCH 0/3] Fix gc segfault Jameson Miller via GitGitGadget
  2018-10-16 21:02 ` [PATCH 1/3] Fix typo 'detla' -> 'delta' Johannes Schindelin via GitGitGadget
@ 2018-10-16 21:02 ` Johannes Schindelin via GitGitGadget
  2018-10-16 23:43   ` brian m. carlson
  2018-10-16 21:02 ` [PATCH 3/3] pack-objects (mingw): initialize `packing_data` mutex in the correct spot Johannes Schindelin via GitGitGadget
  2 siblings, 1 reply; 6+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2018-10-16 21:02 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

There is a problem in the way 9ac3f0e5b3e4 (pack-objects: fix
performance issues on packing large deltas, 2018-07-22) initializes that
mutex in the `packing_data` struct. The problem manifests in a
segmentation fault on Windows, when a mutex (AKA critical section) is
accessed without being initialized. (With pthreads, you apparently do
not really have to initialize them?)

This was reported in https://github.com/git-for-windows/git/issues/1839.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 t/t5321-pack-large-objects.sh | 32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)
 create mode 100755 t/t5321-pack-large-objects.sh

diff --git a/t/t5321-pack-large-objects.sh b/t/t5321-pack-large-objects.sh
new file mode 100755
index 0000000000..c36c66fbb4
--- /dev/null
+++ b/t/t5321-pack-large-objects.sh
@@ -0,0 +1,32 @@
+#!/bin/sh
+#
+# Copyright (c) 2018 Johannes Schindelin
+#
+
+test_description='git pack-object with "large" deltas
+
+'
+. ./test-lib.sh
+. "$TEST_DIRECTORY"/lib-pack.sh
+
+# Two similar-ish objects that we have computed deltas between.
+A=01d7713666f4de822776c7622c10f1b07de280dc
+B=e68fe8129b546b101aee9510c5328e7f21ca1d18
+
+test_expect_success 'setup' '
+	clear_packs &&
+	{
+		pack_header 2 &&
+		pack_obj $A $B &&
+		pack_obj $B
+	} >ab.pack &&
+	pack_trailer ab.pack &&
+	git index-pack --stdin <ab.pack
+'
+
+test_expect_failure 'repack large deltas' '
+	printf "%s\\n" $A $B |
+	GIT_TEST_OE_DELTA_SIZE=2 git pack-objects tmp-pack
+'
+
+test_done
-- 
gitgitgadget


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 3/3] pack-objects (mingw): initialize `packing_data` mutex in the correct spot
  2018-10-16 21:02 [PATCH 0/3] Fix gc segfault Jameson Miller via GitGitGadget
  2018-10-16 21:02 ` [PATCH 1/3] Fix typo 'detla' -> 'delta' Johannes Schindelin via GitGitGadget
  2018-10-16 21:02 ` [PATCH 2/3] pack-objects (mingw): demonstrate a segmentation fault with large deltas Johannes Schindelin via GitGitGadget
@ 2018-10-16 21:02 ` Johannes Schindelin via GitGitGadget
  2018-10-17 10:14   ` Jeff King
  2 siblings, 1 reply; 6+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2018-10-16 21:02 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Johannes Schindelin

From: Johannes Schindelin <johannes.schindelin@gmx.de>

In 9ac3f0e5b3e4 (pack-objects: fix performance issues on packing large
deltas, 2018-07-22), a mutex was introduced that is used to guard the
call to set the delta size. This commit even added code to initialize
it, but at an incorrect spot: in `init_threaded_search()`, while the
call to `oe_set_delta_size()` (and hence to `packing_data_lock()`) can
happen in the call chain `check_object()` <- `get_object_details()` <-
`prepare_pack()` <- `cmd_pack_objects()`, which is long before the
`prepare_pack()` function calls `ll_find_deltas()` (which initializes
the threaded search).

Another tell-tale that the mutex was initialized in an incorrect spot is
that the function to initialize it lives in builtin/, while the code
that uses the mutex is defined in a libgit.a header file.

Let's use a more appropriate function: `prepare_packing_data()`, which
not only lives in libgit.a, but *has* to be called before the
`packing_data` struct is used that contains that mutex.

This fixes https://github.com/git-for-windows/git/issues/1839.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
 builtin/pack-objects.c        | 1 -
 pack-objects.c                | 3 +++
 t/t5321-pack-large-objects.sh | 2 +-
 3 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index e6316d294d..e752cf9c7a 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -2363,7 +2363,6 @@ static void init_threaded_search(void)
 	pthread_mutex_init(&cache_mutex, NULL);
 	pthread_mutex_init(&progress_mutex, NULL);
 	pthread_cond_init(&progress_cond, NULL);
-	pthread_mutex_init(&to_pack.lock, NULL);
 	old_try_to_free_routine = set_try_to_free_routine(try_to_free_from_threads);
 }
 
diff --git a/pack-objects.c b/pack-objects.c
index 7e624c30eb..b6cdbb0166 100644
--- a/pack-objects.c
+++ b/pack-objects.c
@@ -148,6 +148,9 @@ void prepare_packing_data(struct packing_data *pdata)
 					     1U << OE_SIZE_BITS);
 	pdata->oe_delta_size_limit = git_env_ulong("GIT_TEST_OE_DELTA_SIZE",
 						   1UL << OE_DELTA_SIZE_BITS);
+#ifndef NO_PTHREADS
+	pthread_mutex_init(&pdata->lock, NULL);
+#endif
 }
 
 struct object_entry *packlist_alloc(struct packing_data *pdata,
diff --git a/t/t5321-pack-large-objects.sh b/t/t5321-pack-large-objects.sh
index c36c66fbb4..a75eab87d3 100755
--- a/t/t5321-pack-large-objects.sh
+++ b/t/t5321-pack-large-objects.sh
@@ -24,7 +24,7 @@ test_expect_success 'setup' '
 	git index-pack --stdin <ab.pack
 '
 
-test_expect_failure 'repack large deltas' '
+test_expect_success 'repack large deltas' '
 	printf "%s\\n" $A $B |
 	GIT_TEST_OE_DELTA_SIZE=2 git pack-objects tmp-pack
 '
-- 
gitgitgadget

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH 2/3] pack-objects (mingw): demonstrate a segmentation fault with large deltas
  2018-10-16 21:02 ` [PATCH 2/3] pack-objects (mingw): demonstrate a segmentation fault with large deltas Johannes Schindelin via GitGitGadget
@ 2018-10-16 23:43   ` brian m. carlson
  0 siblings, 0 replies; 6+ messages in thread
From: brian m. carlson @ 2018-10-16 23:43 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget
  Cc: git, Junio C Hamano, Johannes Schindelin

[-- Attachment #1: Type: text/plain, Size: 883 bytes --]

On Tue, Oct 16, 2018 at 02:02:50PM -0700, Johannes Schindelin via GitGitGadget wrote:
> From: Johannes Schindelin <johannes.schindelin@gmx.de>
> 
> There is a problem in the way 9ac3f0e5b3e4 (pack-objects: fix
> performance issues on packing large deltas, 2018-07-22) initializes that
> mutex in the `packing_data` struct. The problem manifests in a
> segmentation fault on Windows, when a mutex (AKA critical section) is
> accessed without being initialized. (With pthreads, you apparently do
> not really have to initialize them?)

This is a good catch.  You do have to initialize a pthread mutex, but on
amd64 Linux the default initializer is all zeros, so since we use a
static variable, it happens to coincidentally have the right value, so
we don't notice.

Thanks for fixing this.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 868 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 3/3] pack-objects (mingw): initialize `packing_data` mutex in the correct spot
  2018-10-16 21:02 ` [PATCH 3/3] pack-objects (mingw): initialize `packing_data` mutex in the correct spot Johannes Schindelin via GitGitGadget
@ 2018-10-17 10:14   ` Jeff King
  0 siblings, 0 replies; 6+ messages in thread
From: Jeff King @ 2018-10-17 10:14 UTC (permalink / raw)
  To: Johannes Schindelin via GitGitGadget
  Cc: git, Junio C Hamano, Johannes Schindelin

On Tue, Oct 16, 2018 at 02:02:52PM -0700, Johannes Schindelin via GitGitGadget wrote:

> From: Johannes Schindelin <johannes.schindelin@gmx.de>
> 
> In 9ac3f0e5b3e4 (pack-objects: fix performance issues on packing large
> deltas, 2018-07-22), a mutex was introduced that is used to guard the
> call to set the delta size. This commit even added code to initialize
> it, but at an incorrect spot: in `init_threaded_search()`, while the
> call to `oe_set_delta_size()` (and hence to `packing_data_lock()`) can
> happen in the call chain `check_object()` <- `get_object_details()` <-
> `prepare_pack()` <- `cmd_pack_objects()`, which is long before the
> `prepare_pack()` function calls `ll_find_deltas()` (which initializes
> the threaded search).
> 
> Another tell-tale that the mutex was initialized in an incorrect spot is
> that the function to initialize it lives in builtin/, while the code
> that uses the mutex is defined in a libgit.a header file.
> 
> Let's use a more appropriate function: `prepare_packing_data()`, which
> not only lives in libgit.a, but *has* to be called before the
> `packing_data` struct is used that contains that mutex.

Nicely explained. I think this is a good solution.

Both before and after your patch, we do still take the lock even in
single-threaded scenarios (the case you found where we are not yet in
the delta search phase, but also when --threads=1). I think that should
be fine.  It looks like we do that with the other locks in
pack-objects.c already.

In index-pack.c, we check a threads_active flag before looking at the
lock, which could be another possible solution. I doubt it's any faster,
though (which is why I assume index-pack.c does it). Locking/unlocking a
mutex should not really be much slower than checking the conditional
flag in the first place.

Which is all a roundabout way of saying "looks good to me".

-Peff

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2018-10-17 10:14 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-10-16 21:02 [PATCH 0/3] Fix gc segfault Jameson Miller via GitGitGadget
2018-10-16 21:02 ` [PATCH 1/3] Fix typo 'detla' -> 'delta' Johannes Schindelin via GitGitGadget
2018-10-16 21:02 ` [PATCH 2/3] pack-objects (mingw): demonstrate a segmentation fault with large deltas Johannes Schindelin via GitGitGadget
2018-10-16 23:43   ` brian m. carlson
2018-10-16 21:02 ` [PATCH 3/3] pack-objects (mingw): initialize `packing_data` mutex in the correct spot Johannes Schindelin via GitGitGadget
2018-10-17 10:14   ` Jeff King

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).