git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] index-pack: spawn threads atomically
@ 2024-01-05  8:50 Jeff King
  2024-01-05 16:33 ` Taylor Blau
  0 siblings, 1 reply; 12+ messages in thread
From: Jeff King @ 2024-01-05  8:50 UTC (permalink / raw)
  To: git; +Cc: Taylor Blau

The t5309 script triggers a racy false positive with SANITIZE=leak on a
multi-core system. Running with "--stress --run=6" usually fails within
10 seconds or so for me, complaining with something like:

    + git index-pack --fix-thin --stdin
    fatal: REF_DELTA at offset 46 already resolved (duplicate base 01d7713666f4de822776c7622c10f1b07de280dc?)

    =================================================================
    ==3904583==ERROR: LeakSanitizer: detected memory leaks

    Direct leak of 32 byte(s) in 1 object(s) allocated from:
        #0 0x7fa790d01986 in __interceptor_realloc ../../../../src/libsanitizer/lsan/lsan_interceptors.cpp:98
        #1 0x7fa790add769 in __pthread_getattr_np nptl/pthread_getattr_np.c:180
        #2 0x7fa790d117c5 in __sanitizer::GetThreadStackTopAndBottom(bool, unsigned long*, unsigned long*) ../../../../src/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp:150
        #3 0x7fa790d11957 in __sanitizer::GetThreadStackAndTls(bool, unsigned long*, unsigned long*, unsigned long*, unsigned long*) ../../../../src/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp:598
        #4 0x7fa790d03fe8 in __lsan::ThreadStart(unsigned int, unsigned long long, __sanitizer::ThreadType) ../../../../src/libsanitizer/lsan/lsan_posix.cpp:51
        #5 0x7fa790d013fd in __lsan_thread_start_func ../../../../src/libsanitizer/lsan/lsan_interceptors.cpp:440
        #6 0x7fa790adc3eb in start_thread nptl/pthread_create.c:444
        #7 0x7fa790b5ca5b in clone3 ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

    SUMMARY: LeakSanitizer: 32 byte(s) leaked in 1 allocation(s).
    Aborted

What happens is this:

  0. We construct a bogus pack with a duplicate object in it and trigger
     index-pack.

  1. We spawn a bunch of worker threads to resolve deltas (on my system
     it is 16 threads).

  2. One of the threads sees the duplicate object and bails by calling
     exit(), taking down all of the threads. This is expected and is the
     point of the test.

  3. At the time exit() is called, we may still be spawning threads from
     the main process via pthread_create(). LSan hooks thread creation
     to update its book-keeping; it has to know where each thread's
     stack is (so it can find entry points for reachable memory). So it
     calls pthread_getattr_np() to get information about the new thread.
     That may allocate memory that must be freed with a matching call to
     pthread_attr_destroy(). Probably LSan does that immediately, but
     if you're unlucky enough, the exit() will happen while it's between
     those two calls, and the allocated pthread_attr_t appears as a
     leak.

This isn't a real leak. It's not even in our code, but rather in the
LSan instrumentation code. So we could just ignore it. But the false
positive can cause people to waste time tracking it down.

It's possibly something that LSan could protect against (e.g., cover the
getattr/destroy pair with a mutex, and then in the final post-exit()
check for leaks try to take the same mutex). But I don't know enough
about LSan to say if that's a reasonable approach or not (or if my
analysis is even completely correct).

In the meantime, it's pretty easy to avoid the race by making creation
of the worker threads "atomic". That is, we'll spawn all of them before
letting any of them start to work. That's easy to do because we already
have a work_lock() mutex for handing out that work. If the main process
takes it, then all of the threads will immediately block until we've
finished spawning and released it.

This shouldn't make any practical difference for non-LSan runs. The
thread spawning is quick, and could happen before any worker thread gets
scheduled anyway.

Probably other spots that use threads are subject to the same issues.
But since we have to manually insert locking (and since this really is
kind of a hack), let's not bother with them unless somebody experiences
a similar racy false-positive in practice.

Signed-off-by: Jeff King <peff@peff.net>
---
Rescuing this from:

  https://lore.kernel.org/git/20231221105124.GD570888@coredump.intra.peff.net/

where it was buried deep in a thread. I still think it's kind of gross,
but it may be the least-bad thing.

 builtin/index-pack.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index dda94a9f46..0e94819216 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1257,13 +1257,15 @@ static void resolve_deltas(void)
 	base_cache_limit = delta_base_cache_limit * nr_threads;
 	if (nr_threads > 1 || getenv("GIT_FORCE_THREADS")) {
 		init_thread();
+		work_lock();
 		for (i = 0; i < nr_threads; i++) {
 			int ret = pthread_create(&thread_data[i].thread, NULL,
 						 threaded_second_pass, thread_data + i);
 			if (ret)
 				die(_("unable to create thread: %s"),
 				    strerror(ret));
 		}
+		work_unlock();
 		for (i = 0; i < nr_threads; i++)
 			pthread_join(thread_data[i].thread, NULL);
 		cleanup_thread();
-- 
2.43.0.553.g8113c77dd0

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH] index-pack: spawn threads atomically
  2024-01-05  8:50 [PATCH] index-pack: spawn threads atomically Jeff King
@ 2024-01-05 16:33 ` Taylor Blau
  2024-01-10 11:44   ` Jeff King
  0 siblings, 1 reply; 12+ messages in thread
From: Taylor Blau @ 2024-01-05 16:33 UTC (permalink / raw)
  To: Jeff King; +Cc: git

On Fri, Jan 05, 2024 at 03:50:34AM -0500, Jeff King wrote:
> The t5309 script triggers a racy false positive with SANITIZE=leak on a
> multi-core system. Running with "--stress --run=6" usually fails within
> 10 seconds or so for me, complaining with something like:
>
>     + git index-pack --fix-thin --stdin
>     fatal: REF_DELTA at offset 46 already resolved (duplicate base 01d7713666f4de822776c7622c10f1b07de280dc?)
>
>     =================================================================
>     ==3904583==ERROR: LeakSanitizer: detected memory leaks
>
>     Direct leak of 32 byte(s) in 1 object(s) allocated from:
>         #0 0x7fa790d01986 in __interceptor_realloc ../../../../src/libsanitizer/lsan/lsan_interceptors.cpp:98
>         #1 0x7fa790add769 in __pthread_getattr_np nptl/pthread_getattr_np.c:180
>         #2 0x7fa790d117c5 in __sanitizer::GetThreadStackTopAndBottom(bool, unsigned long*, unsigned long*) ../../../../src/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp:150
>         #3 0x7fa790d11957 in __sanitizer::GetThreadStackAndTls(bool, unsigned long*, unsigned long*, unsigned long*, unsigned long*) ../../../../src/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp:598
>         #4 0x7fa790d03fe8 in __lsan::ThreadStart(unsigned int, unsigned long long, __sanitizer::ThreadType) ../../../../src/libsanitizer/lsan/lsan_posix.cpp:51
>         #5 0x7fa790d013fd in __lsan_thread_start_func ../../../../src/libsanitizer/lsan/lsan_interceptors.cpp:440
>         #6 0x7fa790adc3eb in start_thread nptl/pthread_create.c:444
>         #7 0x7fa790b5ca5b in clone3 ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
>
>     SUMMARY: LeakSanitizer: 32 byte(s) leaked in 1 allocation(s).
>     Aborted

We discussed this in another thread (beginning here [1]), and I would be
fine with this approach. I share your feeling that it is a little gross
to have to work around LSan's implementation by tweaking production
code, but I think that this doesn't have to be the most pristine patch
ever written, either ;-).

Just playing devil's advocate for a moment, I wonder if another approach
might be to disable the threading altogether for the purposes of this
test. The performance difference is negligible, and I don't think we're
exercising any interesting paths in this particular test that have to do
with pack.threads > 1 that aren't covered extensively elsewhere.

So, in other words, I think a reasonable approach would be to do
something like:

--- 8< ---
diff --git a/t/t5309-pack-delta-cycles.sh b/t/t5309-pack-delta-cycles.sh
index 4e910c5b9d..1d132b6324 100755
--- a/t/t5309-pack-delta-cycles.sh
+++ b/t/t5309-pack-delta-cycles.sh
@@ -73,7 +73,7 @@ test_expect_success 'failover to a duplicate object in the same pack' '
 		pack_obj $A
 	} >recoverable.pack &&
 	pack_trailer recoverable.pack &&
-	test_must_fail git index-pack --fix-thin --stdin <recoverable.pack
+	test_must_fail git index-pack --threads=1 --fix-thin --stdin <recoverable.pack
 '

 test_done
--- >8 ---

And call it a day. I built with SANITIZE=leak, and then ran:

    $ GIT_TEST_PASSING_SANITIZE_LEAK=true ./t5309-pack-delta-cycles.sh --stress --run=6

For a while and didn't see any failures. That could be luck, of course,
but without the above patch I was seeing failures within a few seconds.
I'm reasonably confident that this would do the trick.

For what it's worth, I'm fine with either approach, mostly to avoid
tying up more of the list's time discussing the options. But I have a
vague preference towards `--threads=1` since it doesn't require us to
touch production code.

> Rescuing this from:
>
>   https://lore.kernel.org/git/20231221105124.GD570888@coredump.intra.peff.net/
>
> where it was buried deep in a thread. I still think it's kind of gross,
> but it may be the least-bad thing.

In either case, thanks for digging it back up :-).

Thanks,
Taylor

[1]: https://lore.kernel.org/git/xmqqbkasnxba.fsf@gitster.g/

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH] index-pack: spawn threads atomically
  2024-01-05 16:33 ` Taylor Blau
@ 2024-01-10 11:44   ` Jeff King
  2024-01-10 17:34     ` Taylor Blau
  0 siblings, 1 reply; 12+ messages in thread
From: Jeff King @ 2024-01-10 11:44 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git

On Fri, Jan 05, 2024 at 11:33:23AM -0500, Taylor Blau wrote:

> -	test_must_fail git index-pack --fix-thin --stdin <recoverable.pack
> +	test_must_fail git index-pack --threads=1 --fix-thin --stdin <recoverable.pack
> [...]
> For what it's worth, I'm fine with either approach, mostly to avoid
> tying up more of the list's time discussing the options. But I have a
> vague preference towards `--threads=1` since it doesn't require us to
> touch production code.

That's quite tempting, actually. The flip side, though, is that the test
no longer reflects the production code as well. That is, in the real
world we'd still call exit() from a thread. That obviously works OK now
(modulo LSan), but if we ever had a regression where that left us in an
inconsistent state, we'd be less likely to notice it. Feels kind of
unlikely in practice, though.

I dunno. I guess the real least-bad thing is seeing if LSan can be
fixed to handle this atomically. I haven't even reported it there.

If do go with "--threads=1", I suspect several tests in that file need
it.

-Peff

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] index-pack: spawn threads atomically
  2024-01-10 11:44   ` Jeff King
@ 2024-01-10 17:34     ` Taylor Blau
  2024-01-10 17:55       ` [PATCH 1/5] t5309: run expected-to-fail `index-pack`s with `--threads=1` Taylor Blau
                         ` (5 more replies)
  0 siblings, 6 replies; 12+ messages in thread
From: Taylor Blau @ 2024-01-10 17:34 UTC (permalink / raw)
  To: Jeff King; +Cc: git

On Wed, Jan 10, 2024 at 06:44:56AM -0500, Jeff King wrote:
> On Fri, Jan 05, 2024 at 11:33:23AM -0500, Taylor Blau wrote:
>
> > -	test_must_fail git index-pack --fix-thin --stdin <recoverable.pack
> > +	test_must_fail git index-pack --threads=1 --fix-thin --stdin <recoverable.pack
> > [...]
> > For what it's worth, I'm fine with either approach, mostly to avoid
> > tying up more of the list's time discussing the options. But I have a
> > vague preference towards `--threads=1` since it doesn't require us to
> > touch production code.
>
> That's quite tempting, actually. The flip side, though, is that the test
> no longer reflects the production code as well. That is, in the real
> world we'd still call exit() from a thread. That obviously works OK now
> (modulo LSan), but if we ever had a regression where that left us in an
> inconsistent state, we'd be less likely to notice it. Feels kind of
> unlikely in practice, though.
>
> I dunno. I guess the real least-bad thing is seeing if LSan can be
> fixed to handle this atomically. I haven't even reported it there.

In the meantime, I think that the `--threads=1` approach feels less
invasive. I tend to agree that neither option is ideal, but that
`--threads=1` is probably the least bad, and that failing to catch a
regression there feels rather unlikely.

> If do go with "--threads=1", I suspect several tests in that file need
> it.

Yeah, there are a couple of others. I think the ones that need modifying
are at the intersection of "expected to fail" and "in a test which is
expected to pass leak-free":

    $ grep -l 'TEST_PASSES_SANITIZE_LEAK=true' t????-*.sh |
      xargs grep -l 'test_must_fail git index-pack'
    t5302-pack-index.sh
    t5308-pack-detect-duplicates.sh
    t5309-pack-delta-cycles.sh
    t5313-pack-bounds-checks.sh
    t5325-reverse-index.sh

I'll send a series shortly to tweak those test scripts to avoid this
issue if you want to notify the LSan folks of this issue more generally.

> -Peff

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 1/5] t5309: run expected-to-fail `index-pack`s with `--threads=1`
  2024-01-10 17:34     ` Taylor Blau
@ 2024-01-10 17:55       ` Taylor Blau
  2024-01-10 22:18         ` Junio C Hamano
  2024-01-10 17:55       ` [PATCH 2/5] t5302: " Taylor Blau
                         ` (4 subsequent siblings)
  5 siblings, 1 reply; 12+ messages in thread
From: Taylor Blau @ 2024-01-10 17:55 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Junio C Hamano

The t5309 script triggers a racy false positive with SANITIZE=leak on a
multi-core system. Running with "--stress --run=6" usually fails within
10 seconds or so for me, complaining with something like:

    + git index-pack --fix-thin --stdin
    fatal: REF_DELTA at offset 46 already resolved (duplicate base 01d7713666f4de822776c7622c10f1b07de280dc?)

    =================================================================
    ==3904583==ERROR: LeakSanitizer: detected memory leaks

    Direct leak of 32 byte(s) in 1 object(s) allocated from:
        #0 0x7fa790d01986 in __interceptor_realloc ../../../../src/libsanitizer/lsan/lsan_interceptors.cpp:98
        #1 0x7fa790add769 in __pthread_getattr_np nptl/pthread_getattr_np.c:180
        #2 0x7fa790d117c5 in __sanitizer::GetThreadStackTopAndBottom(bool, unsigned long*, unsigned long*) ../../../../src/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp:150
        #3 0x7fa790d11957 in __sanitizer::GetThreadStackAndTls(bool, unsigned long*, unsigned long*, unsigned long*, unsigned long*) ../../../../src/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp:598
        #4 0x7fa790d03fe8 in __lsan::ThreadStart(unsigned int, unsigned long long, __sanitizer::ThreadType) ../../../../src/libsanitizer/lsan/lsan_posix.cpp:51
        #5 0x7fa790d013fd in __lsan_thread_start_func ../../../../src/libsanitizer/lsan/lsan_interceptors.cpp:440
        #6 0x7fa790adc3eb in start_thread nptl/pthread_create.c:444
        #7 0x7fa790b5ca5b in clone3 ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

    SUMMARY: LeakSanitizer: 32 byte(s) leaked in 1 allocation(s).
    Aborted

What happens is this:

  0. We construct a bogus pack with a duplicate object in it and trigger
     index-pack.

  1. We spawn a bunch of worker threads to resolve deltas (on my system
     it is 16 threads).

  2. One of the threads sees the duplicate object and bails by calling
     exit(), taking down all of the threads. This is expected and is the
     point of the test.

  3. At the time exit() is called, we may still be spawning threads from
     the main process via pthread_create(). LSan hooks thread creation
     to update its book-keeping; it has to know where each thread's
     stack is (so it can find entry points for reachable memory). So it
     calls pthread_getattr_np() to get information about the new thread.
     That may allocate memory that must be freed with a matching call to
     pthread_attr_destroy(). Probably LSan does that immediately, but
     if you're unlucky enough, the exit() will happen while it's between
     those two calls, and the allocated pthread_attr_t appears as a
     leak.

This isn't a real leak. It's not even in our code, but rather in the
LSan instrumentation code. So we could just ignore it. But the false
positive can cause people to waste time tracking it down.

It's possibly something that LSan could protect against (e.g., cover the
getattr/destroy pair with a mutex, and then in the final post-exit()
check for leaks try to take the same mutex). But I don't know enough
about LSan to say if that's a reasonable approach or not (or if my
analysis is even completely correct).

One approach to papering over this issue (short of LSan fixing it
upstream) is to make the creation of work threads "atomic", i.e. by
spawning all of them before letting any of them start to work.  This
shouldn't make any practical difference for non-LSan runs. The thread
spawning is quick, and could happen before any worker thread gets
scheduled anyway.

But that requires us to tweak production code (albeit at a negligible
cost) in order to appease LSan in this narrow circumstance. Another
approach is to simply run these expected-to-fail `index-pack`
invocations with `--threads=1` so that we bypass the above issue
entirely.

The downside of that approach is that the test doesn't match our
production code as well as it did before (where we might have run those
same `index-pack` invocations with >1 thread, depending on how many CPUs
the testing machine has). The risk there is that we might miss a
regression that would leave us in an inconsistent state. But that feels
rather unlikely in practice, and there are many other tests related to
`index-pack` in the suite.

Original-patch-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 t/t5309-pack-delta-cycles.sh | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/t/t5309-pack-delta-cycles.sh b/t/t5309-pack-delta-cycles.sh
index 4e910c5b9d..4100595c89 100755
--- a/t/t5309-pack-delta-cycles.sh
+++ b/t/t5309-pack-delta-cycles.sh
@@ -44,7 +44,7 @@ test_expect_success 'index-pack detects missing base objects' '
 		pack_obj $A $B
 	} >missing.pack &&
 	pack_trailer missing.pack &&
-	test_must_fail git index-pack --fix-thin --stdin <missing.pack
+	test_must_fail git index-pack --threads=1 --fix-thin --stdin <missing.pack
 '
 
 test_expect_success 'index-pack detects REF_DELTA cycles' '
@@ -55,13 +55,13 @@ test_expect_success 'index-pack detects REF_DELTA cycles' '
 		pack_obj $B $A
 	} >cycle.pack &&
 	pack_trailer cycle.pack &&
-	test_must_fail git index-pack --fix-thin --stdin <cycle.pack
+	test_must_fail git index-pack --threads=1 --fix-thin --stdin <cycle.pack
 '
 
 test_expect_success 'failover to an object in another pack' '
 	clear_packs &&
 	git index-pack --stdin <ab.pack &&
-	test_must_fail git index-pack --stdin --fix-thin <cycle.pack
+	test_must_fail git index-pack --threads=1 --stdin --fix-thin <cycle.pack
 '
 
 test_expect_success 'failover to a duplicate object in the same pack' '
@@ -73,7 +73,7 @@ test_expect_success 'failover to a duplicate object in the same pack' '
 		pack_obj $A
 	} >recoverable.pack &&
 	pack_trailer recoverable.pack &&
-	test_must_fail git index-pack --fix-thin --stdin <recoverable.pack
+	test_must_fail git index-pack --threads=1 --fix-thin --stdin <recoverable.pack
 '
 
 test_done
-- 
2.43.0.288.g906e6a084d


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 2/5] t5302: run expected-to-fail `index-pack`s with `--threads=1`
  2024-01-10 17:34     ` Taylor Blau
  2024-01-10 17:55       ` [PATCH 1/5] t5309: run expected-to-fail `index-pack`s with `--threads=1` Taylor Blau
@ 2024-01-10 17:55       ` Taylor Blau
  2024-01-10 17:55       ` [PATCH 3/5] t5308: " Taylor Blau
                         ` (3 subsequent siblings)
  5 siblings, 0 replies; 12+ messages in thread
From: Taylor Blau @ 2024-01-10 17:55 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Junio C Hamano

For identical reasons as in the previous commit, apply the same
treatment to expected-to-fail `index-pack` invocations in t5302 with
`--threads=1`.

(Note that this treatment only needs to be applied in tests which are
expected to pass when built with the leak-checker enabled).

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 t/t5302-pack-index.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/t5302-pack-index.sh b/t/t5302-pack-index.sh
index d88e6f1691..fdbcd80f89 100755
--- a/t/t5302-pack-index.sh
+++ b/t/t5302-pack-index.sh
@@ -290,7 +290,7 @@ test_expect_success 'too-large packs report the breach' '
 	pack=$(git pack-objects --all pack </dev/null) &&
 	sz="$(test_file_size pack-$pack.pack)" &&
 	test "$sz" -gt 20 &&
-	test_must_fail git index-pack --max-input-size=20 pack-$pack.pack 2>err &&
+	test_must_fail git index-pack --threads=1 --max-input-size=20 pack-$pack.pack 2>err &&
 	grep "maximum allowed size (20 bytes)" err
 '
 
-- 
2.43.0.288.g906e6a084d


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 3/5] t5308: run expected-to-fail `index-pack`s with `--threads=1`
  2024-01-10 17:34     ` Taylor Blau
  2024-01-10 17:55       ` [PATCH 1/5] t5309: run expected-to-fail `index-pack`s with `--threads=1` Taylor Blau
  2024-01-10 17:55       ` [PATCH 2/5] t5302: " Taylor Blau
@ 2024-01-10 17:55       ` Taylor Blau
  2024-01-10 17:55       ` [PATCH 4/5] t5313: " Taylor Blau
                         ` (2 subsequent siblings)
  5 siblings, 0 replies; 12+ messages in thread
From: Taylor Blau @ 2024-01-10 17:55 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Junio C Hamano

For identical reasons as in the previous commit, apply the same
treatment to expected-to-fail `index-pack` invocations in t5308 with
`--threads=1`.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 t/t5308-pack-detect-duplicates.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/t5308-pack-detect-duplicates.sh b/t/t5308-pack-detect-duplicates.sh
index 655cafa054..5df552a42e 100755
--- a/t/t5308-pack-detect-duplicates.sh
+++ b/t/t5308-pack-detect-duplicates.sh
@@ -76,7 +76,7 @@ test_expect_success 'lookup in duplicated pack' '
 test_expect_success 'index-pack can reject packs with duplicates' '
 	clear_packs &&
 	create_pack dups.pack 2 &&
-	test_must_fail git index-pack --strict --stdin <dups.pack &&
+	test_must_fail git index-pack --threads=1 --strict --stdin <dups.pack &&
 	test_expect_code 1 git cat-file -e $LO_SHA1
 '
 
-- 
2.43.0.288.g906e6a084d


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 4/5] t5313: run expected-to-fail `index-pack`s with `--threads=1`
  2024-01-10 17:34     ` Taylor Blau
                         ` (2 preceding siblings ...)
  2024-01-10 17:55       ` [PATCH 3/5] t5308: " Taylor Blau
@ 2024-01-10 17:55       ` Taylor Blau
  2024-01-10 17:55       ` [PATCH 5/5] t5325: " Taylor Blau
  2024-01-11  6:53       ` [PATCH] index-pack: spawn threads atomically Jeff King
  5 siblings, 0 replies; 12+ messages in thread
From: Taylor Blau @ 2024-01-10 17:55 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Junio C Hamano

For identical reasons as in the previous commit, apply the same
treatment to expected-to-fail `index-pack` invocations in t5313 with
`--threads=1`.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 t/t5313-pack-bounds-checks.sh | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/t/t5313-pack-bounds-checks.sh b/t/t5313-pack-bounds-checks.sh
index ceaa6700a2..e9829b6ef3 100755
--- a/t/t5313-pack-bounds-checks.sh
+++ b/t/t5313-pack-bounds-checks.sh
@@ -91,7 +91,7 @@ test_expect_success 'pack/index object count mismatch' '
 
 	# ...and make sure that index-pack --verify, which has its
 	# own reading routines, does not segfault.
-	test_must_fail git index-pack --verify $pack
+	test_must_fail git index-pack --threads=1 --verify $pack
 '
 
 test_expect_success 'matched bogus object count' '
@@ -111,7 +111,7 @@ test_expect_success 'matched bogus object count' '
 	git cat-file blob $object >actual &&
 	test_cmp file actual &&
 
-	test_must_fail git index-pack --verify $pack
+	test_must_fail git index-pack --threads=1 --verify $pack
 '
 
 # Note that we cannot check the fallback case for these
@@ -128,7 +128,7 @@ test_expect_success 'bogus object offset (v1)' '
 	munge $idx $((4 * 256)) "\377\0\0\0" &&
 	clear_base &&
 	test_must_fail git cat-file blob $object &&
-	test_must_fail git index-pack --verify $pack
+	test_must_fail git index-pack --threads=1 --verify $pack
 '
 
 test_expect_success 'bogus object offset (v2, no msb)' '
@@ -136,7 +136,7 @@ test_expect_success 'bogus object offset (v2, no msb)' '
 	munge $idx $(ofs_table 1) "\0\377\0\0" &&
 	clear_base &&
 	test_must_fail git cat-file blob $object &&
-	test_must_fail git index-pack --verify $pack
+	test_must_fail git index-pack --threads=1 --verify $pack
 '
 
 test_expect_success 'bogus offset into v2 extended table' '
@@ -144,7 +144,7 @@ test_expect_success 'bogus offset into v2 extended table' '
 	munge $idx $(ofs_table 1) "\377\0\0\0" &&
 	clear_base &&
 	test_must_fail git cat-file blob $object &&
-	test_must_fail git index-pack --verify $pack
+	test_must_fail git index-pack --threads=1 --verify $pack
 '
 
 test_expect_success 'bogus offset inside v2 extended table' '
@@ -169,7 +169,7 @@ test_expect_success 'bogus offset inside v2 extended table' '
 	mv tmp "$idx" &&
 	clear_base &&
 	test_must_fail git cat-file blob $object &&
-	test_must_fail git index-pack --verify $pack
+	test_must_fail git index-pack --threads=1 --verify $pack
 '
 
 test_expect_success 'bogus OFS_DELTA in packfile' '
-- 
2.43.0.288.g906e6a084d


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 5/5] t5325: run expected-to-fail `index-pack`s with `--threads=1`
  2024-01-10 17:34     ` Taylor Blau
                         ` (3 preceding siblings ...)
  2024-01-10 17:55       ` [PATCH 4/5] t5313: " Taylor Blau
@ 2024-01-10 17:55       ` Taylor Blau
  2024-01-11  6:53       ` [PATCH] index-pack: spawn threads atomically Jeff King
  5 siblings, 0 replies; 12+ messages in thread
From: Taylor Blau @ 2024-01-10 17:55 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Junio C Hamano

For identical reasons as in the previous commit, apply the same
treatment to expected-to-fail `index-pack` invocations in t5325 with
`--threads=1`.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 t/t5325-reverse-index.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/t5325-reverse-index.sh b/t/t5325-reverse-index.sh
index 431a603ca0..dc3d2235e8 100755
--- a/t/t5325-reverse-index.sh
+++ b/t/t5325-reverse-index.sh
@@ -64,7 +64,7 @@ test_expect_success 'index-pack can verify reverse indexes' '
 	chmod u+w $rev &&
 	printf "xxxx" | dd of=$rev bs=1 count=4 conv=notrunc &&
 
-	test_must_fail git index-pack --rev-index --verify \
+	test_must_fail git index-pack --threads=1 --rev-index --verify \
 		$packdir/pack-$pack.pack 2>err &&
 	grep "validation error" err
 '
-- 
2.43.0.288.g906e6a084d

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/5] t5309: run expected-to-fail `index-pack`s with `--threads=1`
  2024-01-10 17:55       ` [PATCH 1/5] t5309: run expected-to-fail `index-pack`s with `--threads=1` Taylor Blau
@ 2024-01-10 22:18         ` Junio C Hamano
  2024-01-10 22:25           ` Taylor Blau
  0 siblings, 1 reply; 12+ messages in thread
From: Junio C Hamano @ 2024-01-10 22:18 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, Jeff King

Taylor Blau <me@ttaylorr.com> writes:

> But that requires us to tweak production code (albeit at a negligible
> cost) in order to appease LSan in this narrow circumstance. Another
> approach is to simply run these expected-to-fail `index-pack`
> invocations with `--threads=1` so that we bypass the above issue
> entirely.

But of course, multi-threaded operation that production folks use
will not be tested at all with the alternative.

> The downside of that approach is that the test doesn't match our
> production code as well as it did before (where we might have run those
> same `index-pack` invocations with >1 thread, depending on how many CPUs
> the testing machine has). The risk there is that we might miss a
> regression that would leave us in an inconsistent state. But that feels
> rather unlikely in practice, and there are many other tests related to
> `index-pack` in the suite.

As long as "make sure we spawn all of them atmically" has negligible
downside, I'd rather take that approach. Buying predictability with
minimum cost is quite attractive.

Thanks.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/5] t5309: run expected-to-fail `index-pack`s with `--threads=1`
  2024-01-10 22:18         ` Junio C Hamano
@ 2024-01-10 22:25           ` Taylor Blau
  0 siblings, 0 replies; 12+ messages in thread
From: Taylor Blau @ 2024-01-10 22:25 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jeff King

On Wed, Jan 10, 2024 at 02:18:52PM -0800, Junio C Hamano wrote:
> Taylor Blau <me@ttaylorr.com> writes:
>
> > But that requires us to tweak production code (albeit at a negligible
> > cost) in order to appease LSan in this narrow circumstance. Another
> > approach is to simply run these expected-to-fail `index-pack`
> > invocations with `--threads=1` so that we bypass the above issue
> > entirely.
>
> But of course, multi-threaded operation that production folks use
> will not be tested at all with the alternative.

Just the ones that we expect to fail *and* are in test scripts which are
marked as leak-free.

> > The downside of that approach is that the test doesn't match our
> > production code as well as it did before (where we might have run those
> > same `index-pack` invocations with >1 thread, depending on how many CPUs
> > the testing machine has). The risk there is that we might miss a
> > regression that would leave us in an inconsistent state. But that feels
> > rather unlikely in practice, and there are many other tests related to
> > `index-pack` in the suite.
>
> As long as "make sure we spawn all of them atmically" has negligible
> downside, I'd rather take that approach. Buying predictability with
> minimum cost is quite attractive.

Like I said earlier, I have no strong preference between either
approach. If you want to pick up Peff's patch instead of these five,
that is fine with me :-).

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH] index-pack: spawn threads atomically
  2024-01-10 17:34     ` Taylor Blau
                         ` (4 preceding siblings ...)
  2024-01-10 17:55       ` [PATCH 5/5] t5325: " Taylor Blau
@ 2024-01-11  6:53       ` Jeff King
  5 siblings, 0 replies; 12+ messages in thread
From: Jeff King @ 2024-01-11  6:53 UTC (permalink / raw)
  To: Taylor Blau; +Cc: Junio C Hamano, git

On Wed, Jan 10, 2024 at 12:34:09PM -0500, Taylor Blau wrote:

> > If do go with "--threads=1", I suspect several tests in that file need
> > it.
> 
> Yeah, there are a couple of others. I think the ones that need modifying
> are at the intersection of "expected to fail" and "in a test which is
> expected to pass leak-free":
> 
>     $ grep -l 'TEST_PASSES_SANITIZE_LEAK=true' t????-*.sh |
>       xargs grep -l 'test_must_fail git index-pack'
>     t5302-pack-index.sh
>     t5308-pack-detect-duplicates.sh
>     t5309-pack-delta-cycles.sh
>     t5313-pack-bounds-checks.sh
>     t5325-reverse-index.sh

I think that is more than we need. It's only a problem when we hit a
die() inside a thread, which happens only during delta resolution. So
your patch 2, for example, touches a test which triggers the
--max-input-size check. But we would find that out on the initial
unthreaded pass over the pack.

The one in patch 3 seems at first glance like it might be a problem
(it's another duplicate-object case, like the one of them in 5309). But
it isn't a problem because the duplicate object isn't a delta, so we
notice the problem in write_idx_file() from the main thread (which I
verified by running it under gdb and setting a breakpoint at die()).

I suspect patch 4 is the same, but didn't run gdb on each case. And
patch 5 is about a corrupt reverse index, so almost certainly the main
thread. So I suspect that patch 1 is the only one that matters here (and
probably all of those are needed, because it is all about broken
deltas).

All that said, I am on the fence between the two approaches. If Junio
prefers the atomic-spawn direction, I'm fine with that, and there's not
much point in polishing the --threads=1 approach further.

-Peff

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2024-01-11  6:53 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-01-05  8:50 [PATCH] index-pack: spawn threads atomically Jeff King
2024-01-05 16:33 ` Taylor Blau
2024-01-10 11:44   ` Jeff King
2024-01-10 17:34     ` Taylor Blau
2024-01-10 17:55       ` [PATCH 1/5] t5309: run expected-to-fail `index-pack`s with `--threads=1` Taylor Blau
2024-01-10 22:18         ` Junio C Hamano
2024-01-10 22:25           ` Taylor Blau
2024-01-10 17:55       ` [PATCH 2/5] t5302: " Taylor Blau
2024-01-10 17:55       ` [PATCH 3/5] t5308: " Taylor Blau
2024-01-10 17:55       ` [PATCH 4/5] t5313: " Taylor Blau
2024-01-10 17:55       ` [PATCH 5/5] t5325: " Taylor Blau
2024-01-11  6:53       ` [PATCH] index-pack: spawn threads atomically Jeff King

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).