* UNLEAK(), leak checking in the default tests etc.
@ 2021-06-09 14:38 Ævar Arnfjörð Bjarmason
  2021-06-09 17:44 ` Andrzej Hunt
                   ` (2 more replies)
  0 siblings, 3 replies; 125+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-09 14:38 UTC (permalink / raw)
  To: Andrzej Hunt
  Cc: git, Andrzej Hunt, Jeff King, Lénaïc Huard, Derrick Stolee


[In-Reply-To
<a74bbcae7363df03bf8e93167d9274d16dc807f3.1615747662.git.gitgitgadget@gmail.com>,
but intentionally breaking threading for a new topic]

On Sun, Mar 14 2021, Andrzej Hunt via GitGitGadget wrote:

> Most of these pointers can safely be freed when cmd_clone() completes,
> therefore we make sure to free them. The one exception is that we
> have to UNLEAK(repo) because it can point either to argv[0], or a
> malloc'd string returned by absolute_pathdup().

I ran into this when manually checking with valgrind and discovered that
you need SANITIZERS for -DSUPPRESS_ANNOTATED_LEAKS to squash it.

I wonder if that shouldn't be in DEVOPTS (or even a default under
DEVELOPER=1). I.e. you don't need any other special compile flags, just
a compiled git that you then run under valgrind to spot this.

>  builtin/clone.c | 14 ++++++++++----
>  1 file changed, 10 insertions(+), 4 deletions(-)
>
> diff --git a/builtin/clone.c b/builtin/clone.c
> index 51e844a2de0a..952fe3d8fc88 100644
> --- a/builtin/clone.c
> +++ b/builtin/clone.c
> @@ -964,10 +964,10 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
>  {
>  	int is_bundle = 0, is_local;
>  	const char *repo_name, *repo, *work_tree, *git_dir;
> -	char *path, *dir, *display_repo = NULL;
> +	char *path = NULL, *dir, *display_repo = NULL;
>  	int dest_exists, real_dest_exists = 0;
>  	const struct ref *refs, *remote_head;
> -	const struct ref *remote_head_points_at;
> +	struct ref *remote_head_points_at = NULL;
>  	const struct ref *our_head_points_at;
>  	struct ref *mapped_refs;
>  	const struct ref *ref;
> @@ -1017,9 +1017,10 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
>  	repo_name = argv[0];
>  
>  	path = get_repo_path(repo_name, &is_bundle);
> -	if (path)
> +	if (path) {
> +		FREE_AND_NULL(path);
>  		repo = absolute_pathdup(repo_name);
> -	else if (strchr(repo_name, ':')) {
> +	} else if (strchr(repo_name, ':')) {
>  		repo = repo_name;
>  		display_repo = transport_anonymize_url(repo);
>  	} else

In this case it seems better to just have:

    int repo_heap = 0;

Then set "repo_heap = 1" in that absolute_pathdup(repo_name) branch,
and...

> @@ -1393,6 +1394,11 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
>  	strbuf_release(&reflog_msg);
>  	strbuf_release(&branch_top);
>  	strbuf_release(&key);
> +	free_refs(mapped_refs);
> +	free_refs(remote_head_points_at);
> +	free(dir);
> +	free(path);
> +	UNLEAK(repo);

Here do:

    if (repo_heap)
        free(repo);

But maybe there's some other out of the box way to make leak checking
Just Work without special flags in this case. I'm just noting this one
because it ended up being the only one that leaked unless I compiled
with -DSUPPRESS_ANNOTATED_LEAKS. I was fixing some leaks in the bundle
code.
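
A minimal sketch of the flag-based alternative suggested above, written
as a standalone program fragment rather than as a patch to cmd_clone().
The names maybe_owned, pick_repo() and release_repo() are hypothetical
and exist only for illustration; the malloc()+memcpy() copy merely
stands in for absolute_pathdup().

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical holder: a string plus a flag saying whether we own it. */
struct maybe_owned {
	const char *str;
	int on_heap; /* 1 if str came from malloc() and must be freed */
};

/*
 * Stand-in for the branch in cmd_clone(): either borrow the argument
 * as-is, or take a heap copy (playing the role of absolute_pathdup()).
 */
static struct maybe_owned pick_repo(const char *arg, int want_copy)
{
	struct maybe_owned r = { arg, 0 };

	if (want_copy) {
		size_t len = strlen(arg) + 1;
		char *copy = malloc(len);

		if (copy) {
			memcpy(copy, arg, len);
			r.str = copy;
			r.on_heap = 1;
		}
	}
	return r;
}

/* Free only what we allocated; returns 1 if free() was actually called. */
static int release_repo(struct maybe_owned *r)
{
	int freed = r->on_heap;

	if (r->on_heap)
		free((char *)r->str);
	r->str = NULL;
	r->on_heap = 0;
	return freed;
}
```

The cost is one extra variable to keep in sync with the pointer, which
is the trade-off against simply annotating the pointer with UNLEAK().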

Anyway, getting to the "default tests" point. I fixed a memory leak, and
wanted to have it tested that the specific command doesn't leak in git's
default tests.

Do we have such a thing? If not, why not?

The closest I got to getting this was:

    GIT_VALGRIND_MODE=memcheck GIT_VALGRIND_OPTIONS="--leak-check=full --errors-for-leak-kinds=definite --error-exitcode=123" <SOME TEST> --valgrind

But as t/README notes it implies --verbose so we can't currently run it
under the test harness (although I have out-of-tree patches to fix that
in general).

It seems pretty straightforward to turn that specific thing into a test
with a prereq to detect if valgrind works in that mode at all, and then
do (in some dedicated test file):

	# Exit/skip if we can't set up valgrind, then set up relevant
	# valgrind options (maybe needing to re-source test-lib.sh, ew!)
	test_expect_success 'list-heads should not leak' '
		git bundle list-heads a.bdl
	'

But from what I've found so far no such thing exists, and it seems
that, to the extent this is checked, it's run manually as a one-off
(see git log --grep=valgrind); we don't explicitly test for this
anywhere. Have I missed something?


* Re: UNLEAK(), leak checking in the default tests etc.
  2021-06-09 14:38 UNLEAK(), leak checking in the default tests etc Ævar Arnfjörð Bjarmason
@ 2021-06-09 17:44 ` Andrzej Hunt
  2021-06-09 20:36   ` Felipe Contreras
                     ` (2 more replies)
  2021-06-10 19:01 ` SZEDER Gábor
  2021-07-14  0:11 ` [PATCH 0/4] add a test mode for SANITIZE=leak, run it in CI Ævar Arnfjörð Bjarmason
  2 siblings, 3 replies; 125+ messages in thread
From: Andrzej Hunt @ 2021-06-09 17:44 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Jeff King, Lénaïc Huard, Derrick Stolee



On 09/06/2021 16:38, Ævar Arnfjörð Bjarmason wrote:
> 
> [In-Reply-To
> <a74bbcae7363df03bf8e93167d9274d16dc807f3.1615747662.git.gitgitgadget@gmail.com>,
> but intentionally breaking threading for a new topic]
> 
> On Sun, Mar 14 2021, Andrzej Hunt via GitGitGadget wrote:
> 
>> Most of these pointers can safely be freed when cmd_clone() completes,
>> therefore we make sure to free them. The one exception is that we
>> have to UNLEAK(repo) because it can point either to argv[0], or a
>> malloc'd string returned by absolute_pathdup().
> 
> I ran into this when manually checking with valgrind and discovered that
> you need SANITIZERS for -DSUPPRESS_ANNOTATED_LEAKS to squash it.
> 
> I wonder if that shouldn't be in DEVOPTS (or even a default under
> DEVELOPER=1). I.e. you don't need any other special compile flags, just
> a compiled git that you then run under valgrind to spot this.

I'm not familiar with git's development conventions/philosophy, but my 
2c is that it's better not to enable it by default in order to minimise 
divergence from the code that users are running. OTOH it's not a major 
difference in behaviour so perhaps that's not a concern here.

More significantly: I get the impression it's easier to do leak checking 
using LSAN, which requires recompiling git anyway - at which point you 
get the flag for free - so how often will people actually perform leak 
checking with Valgrind in the first place?

> 
>>   builtin/clone.c | 14 ++++++++++----
>>   1 file changed, 10 insertions(+), 4 deletions(-)
>>
>> diff --git a/builtin/clone.c b/builtin/clone.c
>> index 51e844a2de0a..952fe3d8fc88 100644
>> --- a/builtin/clone.c
>> +++ b/builtin/clone.c
>> @@ -964,10 +964,10 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
>>   {
>>   	int is_bundle = 0, is_local;
>>   	const char *repo_name, *repo, *work_tree, *git_dir;
>> -	char *path, *dir, *display_repo = NULL;
>> +	char *path = NULL, *dir, *display_repo = NULL;
>>   	int dest_exists, real_dest_exists = 0;
>>   	const struct ref *refs, *remote_head;
>> -	const struct ref *remote_head_points_at;
>> +	struct ref *remote_head_points_at = NULL;
>>   	const struct ref *our_head_points_at;
>>   	struct ref *mapped_refs;
>>   	const struct ref *ref;
>> @@ -1017,9 +1017,10 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
>>   	repo_name = argv[0];
>>   
>>   	path = get_repo_path(repo_name, &is_bundle);
>> -	if (path)
>> +	if (path) {
>> +		FREE_AND_NULL(path);
>>   		repo = absolute_pathdup(repo_name);
>> -	else if (strchr(repo_name, ':')) {
>> +	} else if (strchr(repo_name, ':')) {
>>   		repo = repo_name;
>>   		display_repo = transport_anonymize_url(repo);
>>   	} else
> 
> In this case it seems better to just have a :
> 
>      int repo_heap = 0;
> 
>      Then set "repo_heap = 1" in that absolute_pathdup(repo_name) branch,
>      and...
> 
>> @@ -1393,6 +1394,11 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
>>   	strbuf_release(&reflog_msg);
>>   	strbuf_release(&branch_top);
>>   	strbuf_release(&key);
>> +	free_refs(mapped_refs);
>> +	free_refs(remote_head_points_at);
>> +	free(dir);
>> +	free(path);
>> +	UNLEAK(repo);
> 
> Here do:
> 
>      if (repo_heap)
>          free(repo);
> 

Although this is possible, I don't think it's worth it: if UNLEAK 
already exists, we might as well use it here to make the code simpler. 
And UNLEAK is unlikely to go away anytime soon because... (continued below)

> But maybe there's some other out of the box way to make leak checking
> Just Work without special flags in this case. I'm just noting this one
> because it ended up being the only one that leaked unless I compiled
> with -DSUPPRESS_ANNOTATED_LEAKS. I was fixing some leaks in the bundle
> code.

There are trickier examples where a cmd_* function has a complex struct 
on the stack, and correctly clearing all allocated memory pointed to by 
its members (or in turn further children with potentially multiple 
levels of indirection) is a lot of work - and that work doesn't actually 
benefit the user in any way. In other words, we either need to be able 
to use UNLEAK to suppress certain classes of uninteresting memory leaks 
- which allows us to focus on the interesting/real leaks - or someone 
has to spend a lot of time doing cleanup by hand (and/or someone has to 
implement a bunch of new cleanup functions).

In your example above, the UNLEAK can be avoided at the cost of one 
additional tracking variable - but in many other cases avoiding an 
UNLEAK is much more expensive. It's certainly valid to debate the merits 
of the UNLEAK here, but that won't remove the need for UNLEAK's 
existence in general.

(The most common example that I remember is where cmd_* has a rev_info, 
and AFAICT there's no one-liner to clean that up. Using UNLEAK is 
honestly the best approach there. I don't think I've actually submitted 
any patches doing this, but I have a few in my local backlog.)

> Anyway, getting to the "default tests" point. I fixed a memory leak, and
> wanted to it tested that the specific command doesn't leak in git's
> default tests.
> 
> Do we have such a thing, if not why not?
> 
> The closest I got to getting this was:
> 
>      GIT_VALGRIND_MODE=memcheck GIT_VALGRIND_OPTIONS="--leak-check=full --errors-for-leak-kinds=definite --error-exitcode=123" <SOME TEST> --valgrind

It's easy to perform leak-checking runs *if* you're OK recompiling with 
LSAN, instead of using valgrind. My usual recipe for running against a 
range of tests is something like:

   make SANITIZE=address,leak \
        ASAN_OPTIONS="detect_leaks=1:abort_on_error=1" CFLAGS="-Og -g" \
        T="\$(wildcard t00[0-9][0-9]-*.sh)" test

Additionally: I usually specify CC=clang, although gcc+LSAN has mostly 
been stable enough in my experience so you might be able to skip that.
(I've found ASAN+LSAN to be more stable than LSAN by itself, which is 
why I specify address+leak, but adding ASAN in turn requires overriding 
ASAN_OPTIONS to reenable leak checking.)

I don't know whether or not Valgrind is more/less effective at finding 
leaks, so being able to run the test suite under valgrind would be nice 
for comparison purposes though.

ATB,

   Andrzej


* Re: UNLEAK(), leak checking in the default tests etc.
  2021-06-09 17:44 ` Andrzej Hunt
@ 2021-06-09 20:36   ` Felipe Contreras
  2021-06-10 10:46   ` Jeff King
  2021-06-10 10:56   ` Ævar Arnfjörð Bjarmason
  2 siblings, 0 replies; 125+ messages in thread
From: Felipe Contreras @ 2021-06-09 20:36 UTC (permalink / raw)
  To: Andrzej Hunt, Ævar Arnfjörð Bjarmason
  Cc: git, Jeff King, Lénaïc Huard, Derrick Stolee

Andrzej Hunt wrote:
> On 09/06/2021 16:38, Ævar Arnfjörð Bjarmason wrote:

> > I wonder if that shouldn't be in DEVOPTS (or even a default under
> > DEVELOPER=1). I.e. you don't need any other special compile flags, just
> > a compiled git that you then run under valgrind to spot this.
> 
> I'm not familiar with git's development conventions/philosophy, but my 
> 2c is that it's better not to enable it by default in order to minimise 
> divergence from the code that users are running.

It wouldn't be on by default, you would need to turn it on with
`make DEVELOPER=1`.

-- 
Felipe Contreras


* Re: UNLEAK(), leak checking in the default tests etc.
  2021-06-09 17:44 ` Andrzej Hunt
  2021-06-09 20:36   ` Felipe Contreras
@ 2021-06-10 10:46   ` Jeff King
  2021-06-10 10:56   ` Ævar Arnfjörð Bjarmason
  2 siblings, 0 replies; 125+ messages in thread
From: Jeff King @ 2021-06-10 10:46 UTC (permalink / raw)
  To: Andrzej Hunt
  Cc: Ævar Arnfjörð Bjarmason, git,
	Lénaïc Huard, Derrick Stolee

On Wed, Jun 09, 2021 at 07:44:12PM +0200, Andrzej Hunt wrote:

> > I ran into this when manually checking with valgrind and discovered that
> > you need SANITIZERS for -DSUPPRESS_ANNOTATED_LEAKS to squash it.
> > 
> > I wonder if that shouldn't be in DEVOPTS (or even a default under
> > DEVELOPER=1). I.e. you don't need any other special compile flags, just
> > a compiled git that you then run under valgrind to spot this.
> 
> I'm not familiar with git's development conventions/philosophy, but my 2c is
> that it's better not to enable it by default in order to minimise divergence
> from the code that users are running. OTOH it's not a major difference in
> behaviour so perhaps that's not a concern here.

Yeah, I'd rather not enable the option during normal builds. It carries
a run-time penalty (it is actually building a pointless data structure
that _does_ effectively leak the pointers, but backed by a global so
they're "findable" by leak checkers). So it changes speed and possibly
correctness of the final binary in a way that is different from what
people would actually run in practice.
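
To make the mechanism concrete, here is a rough sketch of how such an
annotation can work. It is loosely modelled on the description above
and is not a copy of git's code: leak_root and leak_roots are
illustrative names, and the out-of-memory handling is a simplification.

```c
#include <stdlib.h>
#include <string.h>

/* Each node stores a byte-for-byte copy of an annotated variable. */
struct leak_root {
	struct leak_root *next;
	size_t len;
	unsigned char data[]; /* C99 flexible array member */
};

/* Global head: anything reachable from here looks "still in use". */
static struct leak_root *leak_roots;

/*
 * Copy the bytes of a variable (typically a pointer, or a struct that
 * contains pointers) into a node that is never freed. A leak checker
 * scanning from globals then finds the copied pointers and treats the
 * memory they point at as reachable.
 */
static void unleak_memory(const void *ptr, size_t len)
{
	struct leak_root *root = malloc(sizeof(*root) + len);

	if (!root)
		return; /* the annotation is best-effort in this sketch */
	memcpy(root->data, ptr, len);
	root->len = len;
	root->next = leak_roots;
	leak_roots = root;
}

#ifdef SUPPRESS_ANNOTATED_LEAKS
#define UNLEAK(var) unleak_memory(&(var), sizeof(var))
#else
#define UNLEAK(var) do { } while (0)
#endif
```

Every UNLEAK() call in an annotated build costs an allocation and a
copy, which is the run-time penalty; without the define the macro
expands to nothing.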

That might be worth it if there was some advantage to just turning it
on (i.e., if by running with it all the time we might detect some bug).
But by itself it does nothing useful.

If you really want to leak-check the normal binary more thoroughly, then
IMHO you'd be better off converting UNLEAK() sites to actual free calls.

> More significantly: I get the impression it's easier to do leak checking
> using LSAN, which requires recompiling git anyway - at which point you get
> the flag for free - so how often will people actually perform leak checking
> with Valgrind in the first place?

And yeah, I'd very much agree here. It's definitely not wrong to run
with Valgrind. But it's slower and much less thorough than ASan (probably not for
leak detection, but definitely for bug-finding, since it can't look at
stack variables).

If you do use it, and want to build with -DSUPPRESS_ANNOTATED_LEAKS all
the time, that's OK, but I don't think it makes sense for it to be the
default even under DEVELOPER=1. I'm not opposed to a patch to make it
easier to flip the switch, though (but I also find sticking a line in
your config.mak to be pretty easy already).

-Peff


* Re: UNLEAK(), leak checking in the default tests etc.
  2021-06-09 17:44 ` Andrzej Hunt
  2021-06-09 20:36   ` Felipe Contreras
  2021-06-10 10:46   ` Jeff King
@ 2021-06-10 10:56   ` Ævar Arnfjörð Bjarmason
  2021-06-10 13:38     ` Jeff King
  2 siblings, 1 reply; 125+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-06-10 10:56 UTC (permalink / raw)
  To: Andrzej Hunt; +Cc: git, Jeff King, Lénaïc Huard, Derrick Stolee


On Wed, Jun 09 2021, Andrzej Hunt wrote:

> On 09/06/2021 16:38, Ævar Arnfjörð Bjarmason wrote:
>> [In-Reply-To
>> <a74bbcae7363df03bf8e93167d9274d16dc807f3.1615747662.git.gitgitgadget@gmail.com>,
>> but intentionally breaking threading for a new topic]
>> On Sun, Mar 14 2021, Andrzej Hunt via GitGitGadget wrote:
>> 
>>> Most of these pointers can safely be freed when cmd_clone() completes,
>>> therefore we make sure to free them. The one exception is that we
>>> have to UNLEAK(repo) because it can point either to argv[0], or a
>>> malloc'd string returned by absolute_pathdup().
>> I ran into this when manually checking with valgrind and discovered
>> that
>> you need SANITIZERS for -DSUPPRESS_ANNOTATED_LEAKS to squash it.
>> I wonder if that shouldn't be in DEVOPTS (or even a default under
>> DEVELOPER=1). I.e. you don't need any other special compile flags, just
>> a compiled git that you then run under valgrind to spot this.
>
> I'm not familiar with git's development conventions/philosophy, but my
> 2c is that it's better not to enable it by default in order to
> minimise divergence from the code that users are running. OTOH it's
> not a major difference in behaviour so perhaps that's not a concern
> here.
>
> More significantly: I get the impression it's easier to do leak
> checking using LSAN, which requires recompiling git anyway - at which
> point you get the flag for free - so how often will people actually
> perform leak checking with Valgrind in the first place?

*Nod*, I didn't investigate the runtime penalty you and Jeff point
out. In any case, it seems that can also be done with valgrind exclusion
rules and/or manually ignoring these cases in the test wrapper.

>> 
>>>   builtin/clone.c | 14 ++++++++++----
>>>   1 file changed, 10 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/builtin/clone.c b/builtin/clone.c
>>> index 51e844a2de0a..952fe3d8fc88 100644
>>> --- a/builtin/clone.c
>>> +++ b/builtin/clone.c
>>> @@ -964,10 +964,10 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
>>>   {
>>>   	int is_bundle = 0, is_local;
>>>   	const char *repo_name, *repo, *work_tree, *git_dir;
>>> -	char *path, *dir, *display_repo = NULL;
>>> +	char *path = NULL, *dir, *display_repo = NULL;
>>>   	int dest_exists, real_dest_exists = 0;
>>>   	const struct ref *refs, *remote_head;
>>> -	const struct ref *remote_head_points_at;
>>> +	struct ref *remote_head_points_at = NULL;
>>>   	const struct ref *our_head_points_at;
>>>   	struct ref *mapped_refs;
>>>   	const struct ref *ref;
>>> @@ -1017,9 +1017,10 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
>>>   	repo_name = argv[0];
>>>     	path = get_repo_path(repo_name, &is_bundle);
>>> -	if (path)
>>> +	if (path) {
>>> +		FREE_AND_NULL(path);
>>>   		repo = absolute_pathdup(repo_name);
>>> -	else if (strchr(repo_name, ':')) {
>>> +	} else if (strchr(repo_name, ':')) {
>>>   		repo = repo_name;
>>>   		display_repo = transport_anonymize_url(repo);
>>>   	} else
>> In this case it seems better to just have a :
>>      int repo_heap = 0;
>>      Then set "repo_heap = 1" in that absolute_pathdup(repo_name)
>> branch,
>>      and...
>> 
>>> @@ -1393,6 +1394,11 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
>>>   	strbuf_release(&reflog_msg);
>>>   	strbuf_release(&branch_top);
>>>   	strbuf_release(&key);
>>> +	free_refs(mapped_refs);
>>> +	free_refs(remote_head_points_at);
>>> +	free(dir);
>>> +	free(path);
>>> +	UNLEAK(repo);
>> Here do:
>>      if (repo_heap)
>>          free(repo);
>> 
>
> Although this is possible, I don't think it's worth it: if UNLEAK
> already exists, we might as well use it here to make the code
> simpler. And UNLEAK is unlikely to go away anytime soon
> because... (continued below)
>
>> But maybe there's some other out of the box way to make leak checking
>> Just Work without special flags in this case. I'm just noting this one
>> because it ended up being the only one that leaked unless I compiled
>> with -DSUPPRESS_ANNOTATED_LEAKS. I was fixing some leaks in the bundle
>> code.
>
> There are trickier examples where a cmd_* function has a complex
> struct on the stack, and correctly clearing all allocated memory
> pointed to by its members (or in turn further children with
> potentially multiple levels of indirection) is a lot of work - and
> that work doesn't actually benefit the user in any way. In other
> words, we either need to be able to use UNLEAK to suppress certain
> classes of uninteresting memory leaks - which allows us to focus on
> the interesting/real leaks - or someone has to spend a lot of time
> doing cleanup by hand (and/or someone has to implement a bunch of new
> cleanup functions)).
>
> In your example above, the UNLEAK can be avoided at the cost of one
> additional tracking variable - but in many other cases avoiding an 
> UNLEAK is much more expensive. It's certainly valid to debate the
> merits of the UNLEAK here, but that won't remove the need for UNLEAK's 
> existence in general.
>
> (The most common example that I remember is where cmd_* has a
> rev_info, and AFAICT there's no one-liner to clean that up. Using
> UNLEAK is honestly the best approach there. I don't think I've
> actually submitted any patches doing this, but I have a few in my
> local backlog.)

The thing I was patching happened to be making rev_info * not leak. I
probably didn't cover some more complex cases, but some simple cases
seem relatively easy.

I.e. it just doesn't have a release() function, and at least the things
I was looking at (bundle.c code) were relatively easy cases where we
were just missing a loop to free() data from some struct.
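
As an illustration of that kind of "missing loop", here is a sketch of
a release function for a struct that owns an array of heap strings.
The struct name_list and both helpers are made up for this example;
they are not rev_info or anything in bundle.c.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical struct owning an array of heap-allocated strings. */
struct name_list {
	char **names;
	size_t nr;
};

/* Append a private copy of "name"; returns 0 on success, -1 on OOM. */
static int name_list_add(struct name_list *list, const char *name)
{
	size_t len = strlen(name) + 1;
	char *copy = malloc(len);
	char **grown;

	if (!copy)
		return -1;
	memcpy(copy, name, len);
	grown = realloc(list->names, (list->nr + 1) * sizeof(*grown));
	if (!grown) {
		free(copy);
		return -1;
	}
	grown[list->nr++] = copy;
	list->names = grown;
	return 0;
}

/* The loop that is easy to forget: free each member, then the array. */
static void name_list_release(struct name_list *list)
{
	size_t i;

	for (i = 0; i < list->nr; i++)
		free(list->names[i]);
	free(list->names);
	list->names = NULL;
	list->nr = 0;
}
```

Resetting the fields at the end leaves the struct reusable, which
matters once the same code is called in a loop rather than as a
one-off.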

But yes, I agree that free()-ing just before we exit() is rather useless
in itself, the reason I wanted it is because it's a useful (although not
perfect) proxy for checking if the APIs the command uses as a one-off
leak when used as libraries, where we may be processing N items, later
doing other work etc.

We should probably eventually have a s/free/end_free()/g and imitate
perl(1)'s PERL_DESTRUCT_LEVEL option. I.e. you can globally configure
perl to run in a mode that assumes a one-off command, in that case
you'll just let the OS handle the cleanup, or one where you care about
memory leaks because you're using it e.g. as an embedded library.
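
A very rough sketch of what such an end_free() could look like. All of
this is hypothetical: nothing named end_free(), destruct_level or
init_destruct_level() exists in git today, and a real design would
need to decide where the level comes from.

```c
#include <stdlib.h>

/* 0 = one-off command, let the OS reclaim memory at exit; 1+ = free. */
static int destruct_level;

/* Parse the level from a string such as an environment variable value. */
static void init_destruct_level(const char *value)
{
	destruct_level = value ? atoi(value) : 0;
}

/*
 * For end-of-program cleanup only: skip the free() unless cleanup was
 * requested. Returns 1 if the memory was freed, 0 if it was left alone.
 */
static int end_free(void *ptr)
{
	if (destruct_level < 1)
		return 0;
	free(ptr);
	return 1;
}
```

A caller might run init_destruct_level(getenv("SOME_VARIABLE")) once at
startup (the variable name being another placeholder), so that leak
checking runs exercise the real free() paths while normal runs skip
them.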

But maybe it's not even worth it. In Perl the main benefit is that it's
a programming language with DESTROY handlers etc., so destruction can
often be expensive; turning it off entirely can also be buggy, imagine
relying on destructors to free temporary files etc.

We have that issue in theory with the interaction of atexit() handlers
and e.g. things that would behave differently at a distance if certain
things were free()'d already, but in practice we probably don't.

But maybe it's not even worth pursuing. Have you (or anyone else) tried
e.g. benchmarking git's tests or t/perf tests where free() is defined to
be some noop stub? I'd expect it not to matter, but maybe I'm wrong...

>> Anyway, getting to the "default tests" point. I fixed a memory leak, and
>> wanted to it tested that the specific command doesn't leak in git's
>> default tests.
>> Do we have such a thing, if not why not?
>> The closest I got to getting this was:
>>      GIT_VALGRIND_MODE=memcheck
>> GIT_VALGRIND_OPTIONS="--leak-check=full
>> --errors-for-leak-kinds=definite --error-exitcode=123" <SOME TEST>
>> --valgrind
>
> It's easy to perform leak-checking runs *if* you're OK recompiling
> with LSAN, instead of using valgrind. My usual recipe for running
> against a range of tests is something like:

I thought valgrind would be a better approach since we might rely on it
just being there, so we could run some known-good commands that don't
leak even in a "normal" test run, but...

>   make SANITIZE=address,leak
>   ASAN_OPTIONS="detect_leaks=1:abort_on_error=1" CFLAGS="-Og -g" 
> T="\$(wildcard t00[0-9][0-9]-*.sh)" test
>
> Additionally: I usually specify CC=clang, although gcc+LSAN has mostly
> been stable enough in my experience so you might be able to skip that.
> (I've found ASAN+LSAN to be more stable than LSAN by itself, which is
> why I specify address+leak, but adding ASAN in turn requires
> overriding ASAN_OPTIONS to reenable leak checking.)
>
> I don't know whether or not Valgrind is more/less effective at finding
> leaks, so being able to run the test suite under valgrind would be
> nice for comparison purposes though.

I didn't know how to set that up, that seems easy enough.

This works for me:

    make CC=clang SANITIZE=address,leak CFLAGS="-O0 -g"
    (cd t && make ASAN_OPTIONS="<what you said>" [...])

I.e. it's just SANITIZE & flags that's important at compile-time. You
doubtless knew that, mainly for my own notes & others following along.

I ran it, noted the failing tests, produced a giant GIT_SKIP_TESTS list
and hacked ci/ to run that as a new linux-clang-SANITIZE job. That messy
WIP code is currently running at:
https://github.com/avar/git/runs/2793150092

Wouldn't it be a good idea to have such a job and slowly work on the
exclusion list?

E.g. I saw that t0004 failed, which was trivially fixed with a single
strbuf_release(), and we could guard against regressions.

Anyway, I can submit some cleaned-up patches for that. I was just
fishing for whether there was some good reason not to do it, since there
seemed to have been interest in leak fixes, but it hadn't made it into
CI / some "blessed" GIT_TEST_* mode or whatever. I.e. maybe the reports
were unstable or unreliable...



* Re: UNLEAK(), leak checking in the default tests etc.
  2021-06-10 10:56   ` Ævar Arnfjörð Bjarmason
@ 2021-06-10 13:38     ` Jeff King
  2021-06-10 15:32       ` Andrzej Hunt
  0 siblings, 1 reply; 125+ messages in thread
From: Jeff King @ 2021-06-10 13:38 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Andrzej Hunt, git, Lénaïc Huard, Derrick Stolee

On Thu, Jun 10, 2021 at 12:56:55PM +0200, Ævar Arnfjörð Bjarmason wrote:

> > More significantly: I get the impression it's easier to do leak
> > checking using LSAN, which requires recompiling git anyway - at which
> > point you get the flag for free - so how often will people actually
> > perform leak checking with Valgrind in the first place?
> 
> *Nod*, I didn't investigate the runtime penalty you and Jeff point
> out. In any case, it seems that can also be done with valgrind exclusion
> rules and/or manually ignoring these cases in the test wrapper.

I had trouble using valgrind's exclusions; there's more discussion in
0e5bba53af (add UNLEAK annotation for reducing leak false positives,
2017-09-08), but the gist of it is that it's awkward to annotate the
point of leak, rather than the point of allocation (so you have to
provide the complete callstack to the allocation, which is a maintenance
headache).

Of course, if you find ways to make useful annotations with valgrind,
I'm all for it. We have a few in t/valgrind already.

> But maybe it's not even worth pursuing. Have you (or anyone else) tried
> e.g. benchmarking git's tests or t/perf tests where free() is defined to
> be some noop stub? I'd expect it not to matter, but maybe I'm wrong...

I haven't. Even though I originated UNLEAK(), I'm not really all that
concerned about the cost of free() in general. My motivation for
introducing it (versus adding free() calls) was mostly about convenience
(complex data structures that don't have an easy free/release function,
but also the fact that you can still access data after marking it with
unleak).

The fact that it also preempts any arguments about the performance of
calling free() was just a bonus. ;)

To be clear, I could easily be convinced by real numbers that the cost
of free() at program end matters. I am just saying I am not one of the
people who is going to argue that position in the meantime.

> I didn't know how to set that up, that seems easy enough.
> 
> This works for me:
> 
>     make CC=clang SANITIZE=address,leak CFLAGS="-O0 -g"
>     (cd t && make ASAN_OPTIONS="<what you said>" [...])
> 
> I.e. it's just SANITIZE & flags that's important at compile-time. You
> doubtless knew that, mainly for my own notes & others following along.

It should Just Work with:

  make SANITIZE=leak test

for both gcc and clang. You do need ASAN_OPTIONS if you're asking ASan
to do leak-checking (since we usually suppress that for the obvious
reason that almost every test fails). I'm not sure if using both ASan
and LSan together confuses LSan there (if so, it may be reasonable for
test-lib.sh to modify its ASAN_OPTIONS setting if LSan is enabled).

> I ran it, noted the failing tests, produced a giant GIT_SKIP_TESTS list
> and hacked ci/ to run that as a new linux-clang-SANITIZE job. That messy
> WIP code is currently running at:
> https://github.com/avar/git/runs/2793150092
> 
> Wouldn't it be a good idea to have such a job and slowly work on the
> exclusion list?
> 
> E.g. I saw that t0004 failed, which was trivially fixed with a single
> strbuf_release(), and we could guard against regressions.

I don't mind that. My intent was to get the whole suite clean
eventually, and then start worrying about regressions. But that may take
a while.

I do think it would be worth splitting out ASan from leak-checking. The
whole suite should run clean with regular ASan already, and we'd want to
find regressions there even in the tests that aren't leak-clean. I do
periodic ASan runs already; the main argument against doing it for every
CI run is just that's a lot more CPU. But maybe not enough to be
prohibitive? It's probably still way cheaper than running the test suite
on Windows.

-Peff


* Re: UNLEAK(), leak checking in the default tests etc.
  2021-06-10 13:38     ` Jeff King
@ 2021-06-10 15:32       ` Andrzej Hunt
  2021-06-10 16:36         ` Jeff King
  0 siblings, 1 reply; 125+ messages in thread
From: Andrzej Hunt @ 2021-06-10 15:32 UTC (permalink / raw)
  To: Jeff King, Ævar Arnfjörð Bjarmason
  Cc: git, Lénaïc Huard, Derrick Stolee



On 10/06/2021 15:38, Jeff King wrote:
>> I ran it, noted the failing tests, produced a giant GIT_SKIP_TESTS list
>> and hacked ci/ to run that as a new linux-clang-SANITIZE job. That messy
>> WIP code is currently running at:
>> https://github.com/avar/git/runs/2793150092
>>
>> Wouldn't it be a good idea to have such a job and slowly work on the
>> exclusion list?
>>
>> E.g. I saw that t0004 failed, which was trivially fixed with a single
>> strbuf_release(), and we could guard against regressions.
> 
> I don't mind that. My intent was to get the whole suite clean
> eventually, and then start worrying about regressions. But that may take
> a while.
> 
> I do think it would be worth splitting out ASan from leak-checking. The
> whole suite should run clean with regular ASan already, and we'd want to
> find regressions there even in the tests that aren't leak-clean. I do
> periodic ASan runs already; the main argument against doing it for every
> CI run is just that's a lot more CPU. But maybe not enough to be
> prohibitive? It's probably still way cheaper than running the test suite
> on Windows.

I've been running tests with ASAN in the Github Actions environment, and 
a single run takes just over 30 minutes [1] - which I believe is similar 
to the normal test jobs (they do run the test suite twice in that time I 
think).

I've been doing the same with UBSAN, and that's even faster at 15-20 
minutes [2]. However I get the impression that ASAN issues are both more 
common (at least on seen), and more impactful - so I would argue that 
ASAN should be prioritised if there's spare capacity. (I have no idea if 
ASAN+UBSAN can be combined, but I suspect that doing so would make the 
tests slower?)

I'm also running LSAN tests in CI to try and catch regressions, but I've 
only enabled a handful of tests so far. My much simpler approach was to 
specify the range of tests to run as 0-X, and as we make progress on 
fixing leaks, X will slowly approach 9999 (currently we're at something 
like X~=5, although I'm not too far off sending out some patches to 
boost that to 99). The skip-tests approach seems much more useful!

ATB,

   Andrzej

[1] https://github.com/ahunt/git/runs/2789921851?check_suite_focus=true
[2] https://github.com/ahunt/git/runs/2760632000?check_suite_focus=true



* Re: UNLEAK(), leak checking in the default tests etc.
  2021-06-10 15:32       ` Andrzej Hunt
@ 2021-06-10 16:36         ` Jeff King
  2021-06-11 15:44           ` Andrzej Hunt
  0 siblings, 1 reply; 125+ messages in thread
From: Jeff King @ 2021-06-10 16:36 UTC (permalink / raw)
  To: Andrzej Hunt
  Cc: Ævar Arnfjörð Bjarmason, git,
	Lénaïc Huard, Derrick Stolee

On Thu, Jun 10, 2021 at 05:32:41PM +0200, Andrzej Hunt wrote:

> > I do think it would be worth splitting out ASan from leak-checking. The
> > whole suite should run clean with regular ASan already, and we'd want to
> > find regressions there even in the tests that aren't leak-clean. I do
> > periodic ASan runs already; the main argument against doing it for every
> > CI run is just that's a lot more CPU. But maybe not enough to be
> > prohibitive? It's probably still way cheaper than running the test suite
> > on Windows.
> 
> I've been running tests with ASAN in the Github Actions environment, and a
> single run takes just over 30 minutes [1] - which I believe is similar to
> the normal test jobs (they do run the test suite twice in that time I
> think).
> 
> I've been doing the same with UBSAN, and that's even faster at 15-20 minutes
> [2]. However I get the impression that ASAN issues are both more common (at
> least on seen), and more impactful - so I would argue that ASAN should be
> prioritised if there's spare capacity. (I have no idea if ASAN+UBSAN can be
> combined, but I suspect that doing so would make the tests slower?)

I routinely do SANITIZE=address,undefined since they are both useful
(and we do not trigger either in the current test suite). I never
measured the time of their combined use versus just one, but surely the
two-at-once approach is faster than running the test suite twice.

> I'm also running LSAN tests in CI to try and catch regressions, but I've
> only enabled a handful of tests so far. My much simpler approach was to
> specify the range of tests to run as 0-X, and as we make progress on fixing
> leaks, X will slowly approach 9999 (currently we're at something like X~=5,
> although I'm not too far off sending out some patches to boost that to 99).
> The skip-tests approach seems much more useful!

Depending how fine-grained you get with skip-tests, it can create a
hassle as individual tests are removed or reordered (and now somebody
has to maintain the skip list). Doing it with whole scripts (whether
saying "these ones are OK" or "these ones are known bad") seems like
less maintenance overall. The results aren't as fine-grained, but for
something that is meant to be a transitional step, I'm not sure it's
worth the trouble to get more specific.

-Peff


* Re: UNLEAK(), leak checking in the default tests etc.
  2021-06-09 14:38 UNLEAK(), leak checking in the default tests etc Ævar Arnfjörð Bjarmason
  2021-06-09 17:44 ` Andrzej Hunt
@ 2021-06-10 19:01 ` SZEDER Gábor
  2021-07-14  0:11 ` [PATCH 0/4] add a test mode for SANITIZE=leak, run it in CI Ævar Arnfjörð Bjarmason
  2 siblings, 0 replies; 125+ messages in thread
From: SZEDER Gábor @ 2021-06-10 19:01 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Andrzej Hunt, git, Andrzej Hunt, Jeff King,
	Lénaïc Huard, Derrick Stolee

On Wed, Jun 09, 2021 at 04:38:52PM +0200, Ævar Arnfjörð Bjarmason wrote:
>     GIT_VALGRIND_MODE=memcheck GIT_VALGRIND_OPTIONS="--leak-check=full --errors-for-leak-kinds=definite --error-exitcode=123" <SOME TEST> --valgrind
> 
> But as t/README notes it implies --verbose so we can't currently run it
> under the test harness (although I have out-of-tree patches to fix that
> in general).

'--valgrind' doesn't imply '--verbose' if '--verbose-log' was given,
and that works with the test harness just fine; see 88c6e9d31c
(test-lib: --valgrind should not override --verbose-log, 2017-09-05).
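
For illustration, a sketch of such an invocation, reusing the valgrind options from the quoted message. It assumes it is run from git's t/ directory after building git; the script name t0004-unwritable.sh is only an example, and the guard merely keeps the snippet harmless when run elsewhere:

```shell
# Options copied from the quoted message above.
valgrind_opts="--leak-check=full --errors-for-leak-kinds=definite --error-exitcode=123"

# Example script name only; pass another one as the first argument.
test_script=${1:-t0004-unwritable.sh}

if test -x "./$test_script"
then
	# --verbose-log keeps --valgrind from implying --verbose.
	GIT_VALGRIND_MODE=memcheck GIT_VALGRIND_OPTIONS="$valgrind_opts" \
		"./$test_script" --valgrind --verbose-log
else
	echo "skipping: ./$test_script not found (run this from git's t/ directory)"
fi
```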



* Re: UNLEAK(), leak checking in the default tests etc.
  2021-06-10 16:36         ` Jeff King
@ 2021-06-11 15:44           ` Andrzej Hunt
  0 siblings, 0 replies; 125+ messages in thread
From: Andrzej Hunt @ 2021-06-11 15:44 UTC (permalink / raw)
  To: Jeff King
  Cc: Ævar Arnfjörð Bjarmason, git,
	Lénaïc Huard, Derrick Stolee



On 10/06/2021 18:36, Jeff King wrote:
> On Thu, Jun 10, 2021 at 05:32:41PM +0200, Andrzej Hunt wrote:
> 
>>> I do think it would be worth splitting out ASan from leak-checking. The
>>> whole suite should run clean with regular ASan already, and we'd want to
>>> find regressions there even in the tests that aren't leak-clean. I do
>>> periodic ASan runs already; the main argument against doing it for every
>>> CI run is just that's a lot more CPU. But maybe not enough to be
>>> prohibitive? It's probably still way cheaper than running the test suite
>>> on Windows.
>>
>> I've been running tests with ASAN in the Github Actions environment, and a
>> single run takes just over 30 minutes [1] - which I believe is similar to
>> the normal test jobs (they do run the test suite twice in that time I
>> think).
>>
>> I've been doing the same with UBSAN, and that's even faster at 15-20 minutes
>> [2]. However I get the impression that ASAN issues are both more common (at
>> least on seen), and more impactful - so I would argue that ASAN should be
>> prioritised if there's spare capacity. (I have no idea if ASAN+UBSAN can be
>> combined, but I suspect that doing so would make the tests slower?)
> 
> I routinely do SANITIZE=address,undefined since they are both useful
> (and we do not trigger either in the current test suite). I never
> measured the time of their combined use versus just one, but surely the
> two-at-once approach is faster than running the test suite twice.

I'm seeing 33 minutes for SANITIZE=address,undefined - which is no 
slower than SANITIZE=address by itself (disclaimer: it's only one 
measurement):
https://github.com/ahunt/git/runs/2795642716?check_suite_focus=true
(The job's name is wrong but if you look in the logs you can confirm 
that it's using address+undefined.)

The usual linux and mac test-jobs are actually from 24 to 30 minutes 
(the numbers seem a bit variable) - with the exception of one faster 10 
minute job:
https://github.com/git/git/actions/runs/925771097
vs
https://github.com/git/git/actions/runs/927729395

So to summarise: adding an ASAN+UBSAN job would make things a bit 
slower, but not a huge amount slower.


* [PATCH 0/4] add a test mode for SANITIZE=leak, run it in CI
  2021-06-09 14:38 UNLEAK(), leak checking in the default tests etc Ævar Arnfjörð Bjarmason
  2021-06-09 17:44 ` Andrzej Hunt
  2021-06-10 19:01 ` SZEDER Gábor
@ 2021-07-14  0:11 ` Ævar Arnfjörð Bjarmason
  2021-07-14  0:11   ` [PATCH 1/4] tests: " Ævar Arnfjörð Bjarmason
                     ` (4 more replies)
  2 siblings, 5 replies; 125+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-14  0:11 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Andrzej Hunt, Lénaïc Huard,
	Derrick Stolee, Felipe Contreras, SZEDER Gábor,
	Ævar Arnfjörð Bjarmason

As a follow-up to my recent thread[1] asking whether we had a test mode
or CI job to catch memory leak regressions (we don't), add such a test
mode, and run it in CI.

Currently the two new CI targets take ~2-3 minutes to run in GitHub
CI, whereas the normal test targets take 20-30 minutes. Each test runs
slower under SANITIZE=leak, but only a small whitelist of known-good
test scripts is run.

1. https://lore.kernel.org/git/87czsv2idy.fsf@evledraar.gmail.com/

Ævar Arnfjörð Bjarmason (4):
  tests: add a test mode for SANITIZE=leak, run it in CI
  SANITIZE tests: fix memory leaks in t13*config*, add to whitelist
  SANITIZE tests: fix memory leaks in t5701*, add to whitelist
  SANITIZE tests: fix leak in mailmap.c

 .github/workflows/main.yml  |  6 ++++
 Makefile                    |  5 +++
 ci/install-dependencies.sh  |  4 +--
 ci/lib.sh                   | 18 ++++++++---
 ci/run-build-and-tests.sh   |  4 +--
 config.c                    | 17 ++++++++---
 mailmap.c                   |  2 ++
 protocol-caps.c             |  5 +--
 t/README                    | 16 ++++++++++
 t/t0500-progress-display.sh |  3 +-
 t/t1300-config.sh           | 16 ++++++----
 t/t4203-mailmap.sh          |  6 ++++
 t/t5701-git-serve.sh        |  3 +-
 t/test-lib.sh               | 61 +++++++++++++++++++++++++++++++++++++
 14 files changed, 142 insertions(+), 24 deletions(-)

-- 
2.32.0-dev



* [PATCH 1/4] tests: add a test mode for SANITIZE=leak, run it in CI
  2021-07-14  0:11 ` [PATCH 0/4] add a test mode for SANITIZE=leak, run it in CI Ævar Arnfjörð Bjarmason
@ 2021-07-14  0:11   ` Ævar Arnfjörð Bjarmason
  2021-07-14  3:23     ` Đoàn Trần Công Danh
  2021-07-14  0:11   ` [PATCH 2/4] SANITIZE tests: fix memory leaks in t13*config*, add to whitelist Ævar Arnfjörð Bjarmason
                     ` (3 subsequent siblings)
  4 siblings, 1 reply; 125+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-14  0:11 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Andrzej Hunt, Lénaïc Huard,
	Derrick Stolee, Felipe Contreras, SZEDER Gábor,
	Ævar Arnfjörð Bjarmason

While git can be compiled with SANITIZE=leak there has been no
corresponding GIT_TEST_* mode for it, i.e. memory leaks have been
fixed as one-offs without structured regression testing.

This change adds such a mode. We now have new
linux-{clang,gcc}-sanitize-leak CI targets; these targets run the same
tests as linux-{clang,gcc}, except that almost all of them are
skipped.

There is a whitelist of known-good test scripts in test-lib.sh, and
individual test scripts can opt in by setting
GIT_TEST_SANITIZE_LEAK=true before sourcing test-lib.sh. Within those
scripts, individual tests can be skipped with the "!SANITIZE_LEAK"
prerequisite. See the updated t/README for more details.
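
As a rough model of the rules above, here is a sketch of the decision a SANITIZE=leak build makes for each test script. The function name leak_build_decision and its two arguments are hypothetical; they stand in for state that test-lib.sh works out itself:

```shell
# Sketch only: "leak_build_decision" is a hypothetical name. It models a
# build made with SANITIZE=leak.
#   $1: "true" if the script set GIT_TEST_SANITIZE_LEAK=true
#   $2: "white", "black" or "" depending on which list matched the script
leak_build_decision () {
	opted_in=$1 list=$2
	if test "$opted_in" = true
	then
		echo run	# explicit opt-in wins
	elif test "$list" = black
	then
		echo skip	# blacklist overrides whitelist
	elif test "$list" = white
	then
		echo run
	else
		echo skip	# default: everything else is skipped
	fi
}

leak_build_decision true ""	# prints "run"
leak_build_decision "" white	# prints "run"
leak_build_decision "" black	# prints "skip"
leak_build_decision "" ""	# prints "skip"
```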

I'm using the GIT_TEST_SANITIZE_LEAK=true and !SANITIZE_LEAK pattern
in a couple of tests whose memory leaks I'll fix in subsequent
commits.

I'm not being aggressive about opting in tests: the whitelist is not
every test that currently passes under SANITIZE=leak, just a small
number of known-good tests. We can add more later as we fix leaks and
grow more confident in this test mode.

See the recent discussion at [1] about the lack of this sort of test
mode, and 0e5bba53af (add UNLEAK annotation for reducing leak false
positives, 2017-09-08) for the initial addition of SANITIZE=leak.

See also 09595ab381 (Merge branch 'jk/leak-checkers', 2017-09-19),
7782066f67 (Merge branch 'jk/apache-lsan', 2019-05-19) and the recent
936e58851a (Merge branch 'ah/plugleaks', 2021-05-07) for some of the
past history of "one-off" SANITIZE=leak (and more) fixes.

When calling maybe_skip_all_sanitize_leak, matching against
"$TEST_NAME" instead of "$this_test" (as other "match_pattern_list()"
users do) is intentional: I'd like to match things like "t13*config*"
in subsequent commits. This part of the API isn't public, so we can
freely change it in the future.
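
To show why the full name matters, here is a small sketch of glob matching against a pattern list. The helper matches_any is a hypothetical stand-in, and git's own match_pattern_list() may differ in detail:

```shell
# Hypothetical stand-in for a pattern-list matcher: succeed if the name
# in $1 matches any of the glob patterns given as remaining arguments.
matches_any () {
	name=$1
	shift
	for pattern in "$@"
	do
		case "$name" in
		$pattern)	# unquoted on purpose, so it is treated as a glob
			return 0
			;;
		esac
	done
	return 1
}

# The full test name can match a pattern that looks past the number.
matches_any t1300-config 't000*' 't13*config*' &&
	echo "t1300-config: listed"

# The bare number prefix cannot.
matches_any t1300 't000*' 't13*config*' ||
	echo "t1300: not listed"

# Other t13* scripts are left alone.
matches_any t1301-shared-repo 't000*' 't13*config*' ||
	echo "t1301-shared-repo: not listed"
```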

1. https://lore.kernel.org/git/87czsv2idy.fsf@evledraar.gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 .github/workflows/main.yml  |  6 ++++
 Makefile                    |  5 ++++
 ci/install-dependencies.sh  |  4 +--
 ci/lib.sh                   | 18 +++++++----
 ci/run-build-and-tests.sh   |  4 +--
 t/README                    | 16 ++++++++++
 t/t0500-progress-display.sh |  3 +-
 t/t5701-git-serve.sh        |  2 +-
 t/test-lib.sh               | 60 +++++++++++++++++++++++++++++++++++++
 9 files changed, 107 insertions(+), 11 deletions(-)

diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
index 73856bafc9..b81ec34959 100644
--- a/.github/workflows/main.yml
+++ b/.github/workflows/main.yml
@@ -297,6 +297,12 @@ jobs:
           - jobname: linux-gcc-default
             cc: gcc
             pool: ubuntu-latest
+          - jobname: linux-clang-sanitize-leak
+            cc: clang
+            pool: ubuntu-latest
+          - jobname: linux-gcc-sanitize-leak
+            cc: gcc
+            pool: ubuntu-latest
     env:
       CC: ${{matrix.vector.cc}}
       jobname: ${{matrix.vector.jobname}}
diff --git a/Makefile b/Makefile
index 502e0c9a81..d4cad5136f 100644
--- a/Makefile
+++ b/Makefile
@@ -1216,6 +1216,9 @@ PTHREAD_CFLAGS =
 SPARSE_FLAGS ?=
 SP_EXTRA_FLAGS = -Wno-universal-initializer
 
+# For informing GIT-BUILD-OPTIONS of the SANITIZE=leak target
+SANITIZE_LEAK =
+
 # For the 'coccicheck' target; setting SPATCH_BATCH_SIZE higher will
 # usually result in less CPU usage at the cost of higher peak memory.
 # Setting it to 0 will feed all files in a single spatch invocation.
@@ -1260,6 +1263,7 @@ BASIC_CFLAGS += -DSHA1DC_FORCE_ALIGNED_ACCESS
 endif
 ifneq ($(filter leak,$(SANITIZERS)),)
 BASIC_CFLAGS += -DSUPPRESS_ANNOTATED_LEAKS
+SANITIZE_LEAK = YesCompiledWithIt
 endif
 ifneq ($(filter address,$(SANITIZERS)),)
 NO_REGEX = NeededForASAN
@@ -2793,6 +2797,7 @@ GIT-BUILD-OPTIONS: FORCE
 	@echo NO_UNIX_SOCKETS=\''$(subst ','\'',$(subst ','\'',$(NO_UNIX_SOCKETS)))'\' >>$@+
 	@echo PAGER_ENV=\''$(subst ','\'',$(subst ','\'',$(PAGER_ENV)))'\' >>$@+
 	@echo DC_SHA1=\''$(subst ','\'',$(subst ','\'',$(DC_SHA1)))'\' >>$@+
+	@echo SANITIZE_LEAK=\''$(subst ','\'',$(subst ','\'',$(SANITIZE_LEAK)))'\' >>$@+
 	@echo X=\'$(X)\' >>$@+
 ifdef TEST_OUTPUT_DIRECTORY
 	@echo TEST_OUTPUT_DIRECTORY=\''$(subst ','\'',$(subst ','\'',$(TEST_OUTPUT_DIRECTORY)))'\' >>$@+
diff --git a/ci/install-dependencies.sh b/ci/install-dependencies.sh
index 67852d0d37..31e519cde9 100755
--- a/ci/install-dependencies.sh
+++ b/ci/install-dependencies.sh
@@ -12,13 +12,13 @@ UBUNTU_COMMON_PKGS="make libssl-dev libcurl4-openssl-dev libexpat-dev
  libemail-valid-perl libio-socket-ssl-perl libnet-smtp-ssl-perl"
 
 case "$jobname" in
-linux-clang|linux-gcc)
+linux-clang*|linux-gcc*)
 	sudo apt-add-repository -y "ppa:ubuntu-toolchain-r/test"
 	sudo apt-get -q update
 	sudo apt-get -q -y install language-pack-is libsvn-perl apache2 \
 		$UBUNTU_COMMON_PKGS
 	case "$jobname" in
-	linux-gcc)
+	linux-gcc*)
 		sudo apt-get -q -y install gcc-8
 		;;
 	esac
diff --git a/ci/lib.sh b/ci/lib.sh
index 476c3f369f..34fd914438 100755
--- a/ci/lib.sh
+++ b/ci/lib.sh
@@ -183,14 +183,16 @@ export GIT_TEST_CLONE_2GB=true
 export SKIP_DASHED_BUILT_INS=YesPlease
 
 case "$jobname" in
-linux-clang|linux-gcc)
-	if [ "$jobname" = linux-gcc ]
-	then
+linux-clang*|linux-gcc*)
+	case "$jobname" in
+	linux-gcc*)
 		export CC=gcc-8
 		MAKEFLAGS="$MAKEFLAGS PYTHON_PATH=/usr/bin/python3"
-	else
+		;;
+	*)
 		MAKEFLAGS="$MAKEFLAGS PYTHON_PATH=/usr/bin/python2"
-	fi
+		;;
+	esac
 
 	export GIT_TEST_HTTPD=true
 
@@ -233,4 +235,10 @@ linux-musl)
 	;;
 esac
 
+case "$jobname" in
+linux-*-sanitize-leak)
+	export SANITIZE=leak
+	;;
+esac
+
 MAKEFLAGS="$MAKEFLAGS CC=${CC:-cc}"
diff --git a/ci/run-build-and-tests.sh b/ci/run-build-and-tests.sh
index 3ce81ffee9..07b9c09f45 100755
--- a/ci/run-build-and-tests.sh
+++ b/ci/run-build-and-tests.sh
@@ -12,7 +12,7 @@ esac
 
 make
 case "$jobname" in
-linux-gcc)
+linux-gcc*)
 	export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 	make test
 	export GIT_TEST_SPLIT_INDEX=yes
@@ -29,7 +29,7 @@ linux-gcc)
 	export GIT_TEST_CHECKOUT_WORKERS=2
 	make test
 	;;
-linux-clang)
+linux-clang*)
 	export GIT_TEST_DEFAULT_HASH=sha1
 	make test
 	export GIT_TEST_DEFAULT_HASH=sha256
diff --git a/t/README b/t/README
index 1a2072b2c8..303d0be817 100644
--- a/t/README
+++ b/t/README
@@ -448,6 +448,22 @@ GIT_TEST_CHECKOUT_WORKERS=<n> overrides the 'checkout.workers' setting
 to <n> and 'checkout.thresholdForParallelism' to 0, forcing the
 execution of the parallel-checkout code.
 
+GIT_TEST_SANITIZE_LEAK=<boolean> will force the tests to run when git
+is compiled with SANITIZE=leak (we pick it up via
+../GIT-BUILD-OPTIONS).
+
+By default all tests are skipped when compiled with SANITIZE=leak, and
+individual test scripts opt themselves in to leak testing by setting
+GIT_TEST_SANITIZE_LEAK=true before sourcing test-lib.sh. Within those
+tests use the SANITIZE_LEAK prerequisite to skip individual tests
+(i.e. test_expect_success !SANITIZE_LEAK [...]).
+
+So the GIT_TEST_SANITIZE_LEAK setting is different in behavior from
+other GIT_TEST_*=[true|false] settings, but more useful given how
+SANITIZE=leak works & the state of the test suite. Manually setting
+GIT_TEST_SANITIZE_LEAK=true is only useful during development when
+finding and fixing memory leaks.
+
 Naming Tests
 ------------
 
diff --git a/t/t0500-progress-display.sh b/t/t0500-progress-display.sh
index 22058b503a..7afb9abb1f 100755
--- a/t/t0500-progress-display.sh
+++ b/t/t0500-progress-display.sh
@@ -2,6 +2,7 @@
 
 test_description='progress display'
 
+GIT_TEST_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 show_cr () {
@@ -283,7 +284,7 @@ test_expect_success 'cover up after throughput shortens a lot' '
 	test_cmp expect out
 '
 
-test_expect_success 'progress generates traces' '
+test_expect_success !SANITIZE_LEAK 'progress generates traces' '
 	cat >in <<-\EOF &&
 	throughput 102400 1000
 	update
diff --git a/t/t5701-git-serve.sh b/t/t5701-git-serve.sh
index 930721f053..d58efb0aa9 100755
--- a/t/t5701-git-serve.sh
+++ b/t/t5701-git-serve.sh
@@ -243,7 +243,7 @@ test_expect_success 'unexpected lines are not allowed in fetch request' '
 
 # Test the basics of object-info
 #
-test_expect_success 'basics of object-info' '
+test_expect_success !SANITIZE_LEAK 'basics of object-info' '
 	test-tool pkt-line pack >in <<-EOF &&
 	command=object-info
 	object-format=$(test_oid algo)
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 7036f83b33..9201510e16 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1353,6 +1353,40 @@ then
 	exit 1
 fi
 
+# SANITIZE=leak test mode
+sanitize_leak_true=
+add_sanitize_leak_true () {
+	sanitize_leak_true="$sanitize_leak_true$1 "
+}
+
+sanitize_leak_false=
+add_sanitize_leak_false () {
+	sanitize_leak_false="$sanitize_leak_false$1 "
+}
+
+sanitize_leak_opt_in_msg="opt-in with GIT_TEST_SANITIZE_LEAK=true"
+maybe_skip_all_sanitize_leak () {
+	# Whitelist patterns
+	add_sanitize_leak_true 't000*'
+	add_sanitize_leak_true 't001*'
+	add_sanitize_leak_true 't006*'
+
+	# Blacklist patterns (overrides whitelist)
+	add_sanitize_leak_false 't000[469]*'
+	add_sanitize_leak_false 't001[2459]*'
+	add_sanitize_leak_false 't006[0248]*'
+
+	if match_pattern_list "$1" "$sanitize_leak_false"
+	then
+		skip_all="test $this_test on SANITIZE=leak blacklist, $sanitize_leak_opt_in_msg"
+		test_done
+	elif match_pattern_list "$1" "$sanitize_leak_true"
+	then
+		return 0
+	fi
+	return 1
+}
+
 # Are we running this test at all?
 remove_trash=
 this_test=${0##*/}
@@ -1364,6 +1398,31 @@ then
 	test_done
 fi
 
+# Aggressively skip non-whitelisted tests when compiled with
+# SANITIZE=leak
+if test -n "$SANITIZE_LEAK"
+then
+	if test -z "$GIT_TEST_SANITIZE_LEAK" &&
+		maybe_skip_all_sanitize_leak "$TEST_NAME"
+	then
+		say_color info >&3 "test $this_test on SANITIZE=leak whitelist"
+		GIT_TEST_SANITIZE_LEAK=true
+	fi
+
+	# We need to see it in "git env--helper" (via
+	# test_bool_env)
+	export GIT_TEST_SANITIZE_LEAK
+
+	if ! test_bool_env GIT_TEST_SANITIZE_LEAK false
+	then
+		skip_all="skip all tests in $this_test under SANITIZE=leak, $sanitize_leak_opt_in_msg"
+		test_done
+	fi
+elif test_bool_env GIT_TEST_SANITIZE_LEAK false
+then
+	error "GIT_TEST_SANITIZE_LEAK=true has no effect except when compiled with SANITIZE=leak"
+fi
+
 # Last-minute variable setup
 HOME="$TRASH_DIRECTORY"
 GNUPGHOME="$HOME/gnupg-home-not-used"
@@ -1516,6 +1575,7 @@ test -z "$NO_PYTHON" && test_set_prereq PYTHON
 test -n "$USE_LIBPCRE2" && test_set_prereq PCRE
 test -n "$USE_LIBPCRE2" && test_set_prereq LIBPCRE2
 test -z "$NO_GETTEXT" && test_set_prereq GETTEXT
+test -n "$SANITIZE_LEAK" && test_set_prereq SANITIZE_LEAK
 
 if test -z "$GIT_TEST_CHECK_CACHE_TREE"
 then
-- 
2.32.0-dev



* [PATCH 2/4] SANITIZE tests: fix memory leaks in t13*config*, add to whitelist
  2021-07-14  0:11 ` [PATCH 0/4] add a test mode for SANITIZE=leak, run it in CI Ævar Arnfjörð Bjarmason
  2021-07-14  0:11   ` [PATCH 1/4] tests: " Ævar Arnfjörð Bjarmason
@ 2021-07-14  0:11   ` Ævar Arnfjörð Bjarmason
  2021-07-14  0:11   ` [PATCH 3/4] SANITIZE tests: fix memory leaks in t5701*, " Ævar Arnfjörð Bjarmason
                     ` (2 subsequent siblings)
  4 siblings, 0 replies; 125+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-14  0:11 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Andrzej Hunt, Lénaïc Huard,
	Derrick Stolee, Felipe Contreras, SZEDER Gábor,
	Ævar Arnfjörð Bjarmason

Fix a couple of trivial memory leaks introduced in 3efd0bedc6 (config:
add conditional include, 2017-03-01) and my own 867ad08a26 (hooks:
allow customizing where the hook directory is, 2016-05-04).

In the latter case the "fix" is UNLEAK() on the global variable. This
allows us to run all t13*config* tests under SANITIZE=leak.

With this change we can now run almost the whole set of config.c
tests (t13*config*) under SANITIZE=leak, so let's do so, with a few
exceptions:

 * The test added in ce81b1da23 (config: add new way to pass config
   via `--config-env`, 2021-01-12) fails in GitHub CI, but passes
   for me locally. Let's just skip it for now.

 * Ditto the split_cmdline and "aliases of builtins" tests. The former
   required splitting up an existing test; there is an issue with the
   test that would also have been revealed by skipping it.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 config.c          | 17 ++++++++++++-----
 t/t1300-config.sh | 16 ++++++++++------
 t/test-lib.sh     |  1 +
 3 files changed, 23 insertions(+), 11 deletions(-)

diff --git a/config.c b/config.c
index f9c400ad30..38e132c0e2 100644
--- a/config.c
+++ b/config.c
@@ -138,8 +138,10 @@ static int handle_path_include(const char *path, struct config_include_data *inc
 		return config_error_nonbool("include.path");
 
 	expanded = expand_user_path(path, 0);
-	if (!expanded)
-		return error(_("could not expand include path '%s'"), path);
+	if (!expanded) {
+		ret = error(_("could not expand include path '%s'"), path);
+		goto cleanup;
+	}
 	path = expanded;
 
 	/*
@@ -149,8 +151,10 @@ static int handle_path_include(const char *path, struct config_include_data *inc
 	if (!is_absolute_path(path)) {
 		char *slash;
 
-		if (!cf || !cf->path)
-			return error(_("relative config includes must come from files"));
+		if (!cf || !cf->path) {
+			ret = error(_("relative config includes must come from files"));
+			goto cleanup;
+		}
 
 		slash = find_last_dir_sep(cf->path);
 		if (slash)
@@ -168,6 +172,7 @@ static int handle_path_include(const char *path, struct config_include_data *inc
 		ret = git_config_from_file(git_config_include, path, inc);
 		inc->depth--;
 	}
+cleanup:
 	strbuf_release(&buf);
 	free(expanded);
 	return ret;
@@ -1331,8 +1336,10 @@ static int git_default_core_config(const char *var, const char *value, void *cb)
 	if (!strcmp(var, "core.attributesfile"))
 		return git_config_pathname(&git_attributes_file, var, value);
 
-	if (!strcmp(var, "core.hookspath"))
+	if (!strcmp(var, "core.hookspath")) {
+		UNLEAK(git_hooks_path);
 		return git_config_pathname(&git_hooks_path, var, value);
+	}
 
 	if (!strcmp(var, "core.bare")) {
 		is_bare_repository_cfg = git_config_bool(var, value);
diff --git a/t/t1300-config.sh b/t/t1300-config.sh
index 9ff46f3b04..93ad0f4887 100755
--- a/t/t1300-config.sh
+++ b/t/t1300-config.sh
@@ -1050,12 +1050,16 @@ test_expect_success SYMLINKS 'symlink to nonexistent configuration' '
 	test_must_fail git config --file=linktolinktonada --list
 '
 
-test_expect_success 'check split_cmdline return' "
-	git config alias.split-cmdline-fix 'echo \"' &&
-	test_must_fail git split-cmdline-fix &&
+test_expect_success 'setup check split_cmdline return' "
 	echo foo > foo &&
 	git add foo &&
-	git commit -m 'initial commit' &&
+	git commit -m 'initial commit'
+"
+
+test_expect_success !SANITIZE_LEAK 'check split_cmdline return' "
+	git config alias.split-cmdline-fix 'echo \"' &&
+	test_must_fail git split-cmdline-fix &&
+
 	git config branch.main.mergeoptions 'echo \"' &&
 	test_must_fail git merge main
 "
@@ -1101,7 +1105,7 @@ test_expect_success 'key sanity-checking' '
 	git config foo."ba =z".bar false
 '
 
-test_expect_success 'git -c works with aliases of builtins' '
+test_expect_success !SANITIZE_LEAK 'git -c works with aliases of builtins' '
 	git config alias.checkconfig "-c foo.check=bar config foo.check" &&
 	echo bar >expect &&
 	git checkconfig >actual &&
@@ -1397,7 +1401,7 @@ test_expect_success 'git --config-env with missing value' '
 	grep "invalid config format: config" error
 '
 
-test_expect_success 'git --config-env fails with invalid parameters' '
+test_expect_success !SANITIZE_LEAK 'git --config-env fails with invalid parameters' '
 	test_must_fail git --config-env=foo.flag config --bool foo.flag 2>error &&
 	test_i18ngrep "invalid config format: foo.flag" error &&
 	test_must_fail git --config-env=foo.flag= config --bool foo.flag 2>error &&
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 9201510e16..98e20950c3 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1370,6 +1370,7 @@ maybe_skip_all_sanitize_leak () {
 	add_sanitize_leak_true 't000*'
 	add_sanitize_leak_true 't001*'
 	add_sanitize_leak_true 't006*'
+	add_sanitize_leak_true 't13*config*'
 
 	# Blacklist patterns (overrides whitelist)
 	add_sanitize_leak_false 't000[469]*'
-- 
2.32.0-dev



* [PATCH 3/4] SANITIZE tests: fix memory leaks in t5701*, add to whitelist
  2021-07-14  0:11 ` [PATCH 0/4] add a test mode for SANITIZE=leak, run it in CI Ævar Arnfjörð Bjarmason
  2021-07-14  0:11   ` [PATCH 1/4] tests: " Ævar Arnfjörð Bjarmason
  2021-07-14  0:11   ` [PATCH 2/4] SANITIZE tests: fix memory leaks in t13*config*, add to whitelist Ævar Arnfjörð Bjarmason
@ 2021-07-14  0:11   ` Ævar Arnfjörð Bjarmason
  2021-07-14  0:11   ` [PATCH 4/4] SANITIZE tests: fix leak in mailmap.c Ævar Arnfjörð Bjarmason
  2021-07-14 17:23   ` [PATCH v2 0/4] add a test mode for SANITIZE=leak, run it in CI Ævar Arnfjörð Bjarmason
  4 siblings, 0 replies; 125+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-14  0:11 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Andrzej Hunt, Lénaïc Huard,
	Derrick Stolee, Felipe Contreras, SZEDER Gábor,
	Ævar Arnfjörð Bjarmason

Fix a memory leak in a2ba162cda (object-info: support for retrieving
object info, 2021-04-20) which appears to have been based on a
misunderstanding of how the pkt-line.c API works: there is no need to
strdup() its input, it's just a printf()-like format function.

This fixes a potentially large memory leak, since the number of OID
lines in an "object-info" call can be arbitrarily large (or a small
one if the request is small).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 protocol-caps.c      | 5 +++--
 t/t5701-git-serve.sh | 1 +
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/protocol-caps.c b/protocol-caps.c
index 13a9e63a04..901b6795e4 100644
--- a/protocol-caps.c
+++ b/protocol-caps.c
@@ -69,9 +69,10 @@ static void send_info(struct repository *r, struct packet_writer *writer,
 			}
 		}
 
-		packet_writer_write(writer, "%s",
-				    strbuf_detach(&send_buffer, NULL));
+		packet_writer_write(writer, "%s", send_buffer.buf);
+		strbuf_reset(&send_buffer);
 	}
+	strbuf_release(&send_buffer);
 }
 
 int cap_object_info(struct repository *r, struct strvec *keys,
diff --git a/t/t5701-git-serve.sh b/t/t5701-git-serve.sh
index d58efb0aa9..e2f4832adf 100755
--- a/t/t5701-git-serve.sh
+++ b/t/t5701-git-serve.sh
@@ -5,6 +5,7 @@ test_description='test protocol v2 server commands'
 GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
+GIT_TEST_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 test_expect_success 'test capability advertisement' '
-- 
2.32.0-dev



* [PATCH 4/4] SANITIZE tests: fix leak in mailmap.c
  2021-07-14  0:11 ` [PATCH 0/4] add a test mode for SANITIZE=leak, run it in CI Ævar Arnfjörð Bjarmason
                     ` (2 preceding siblings ...)
  2021-07-14  0:11   ` [PATCH 3/4] SANITIZE tests: fix memory leaks in t5701*, " Ævar Arnfjörð Bjarmason
@ 2021-07-14  0:11   ` Ævar Arnfjörð Bjarmason
  2021-07-14  2:19     ` Eric Sunshine
  2021-07-14 17:23   ` [PATCH v2 0/4] add a test mode for SANITIZE=leak, run it in CI Ævar Arnfjörð Bjarmason
  4 siblings, 1 reply; 125+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-14  0:11 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Andrzej Hunt, Lénaïc Huard,
	Derrick Stolee, Felipe Contreras, SZEDER Gábor,
	Ævar Arnfjörð Bjarmason

Get closer to being able to run t4203-mailmap.sh by fixing a couple of
memory leaks in mailmap.c.

In the free_mailmap_entry() code added in 0925ce4d49 (Add map_user()
and clear_mailmap() to mailmap, 2009-02-08) the intent was clearly to
clear the "me" structure, but while we freed parts of the
mailmap_entry structure, we didn't free the structure itself. The same
goes for the "mailmap_info" structure.
---
 mailmap.c          | 2 ++
 t/t4203-mailmap.sh | 6 ++++++
 2 files changed, 8 insertions(+)

diff --git a/mailmap.c b/mailmap.c
index d1f7c0d272..e1c8736093 100644
--- a/mailmap.c
+++ b/mailmap.c
@@ -36,6 +36,7 @@ static void free_mailmap_info(void *p, const char *s)
 		 s, debug_str(mi->name), debug_str(mi->email));
 	free(mi->name);
 	free(mi->email);
+	free(mi);
 }
 
 static void free_mailmap_entry(void *p, const char *s)
@@ -51,6 +52,7 @@ static void free_mailmap_entry(void *p, const char *s)
 
 	me->namemap.strdup_strings = 1;
 	string_list_clear_func(&me->namemap, free_mailmap_info);
+	free(me);
 }
 
 /*
diff --git a/t/t4203-mailmap.sh b/t/t4203-mailmap.sh
index 0b2d21ec55..c7de4299cf 100755
--- a/t/t4203-mailmap.sh
+++ b/t/t4203-mailmap.sh
@@ -79,6 +79,12 @@ test_expect_success 'check-mailmap bogus contact --stdin' '
 	test_must_fail git check-mailmap --stdin bogus </dev/null
 '
 
+if test_have_prereq SANITIZE_LEAK
+then
+	skip_all='skipping the rest of mailmap tests under SANITIZE_LEAK'
+	test_done
+fi
+
 test_expect_success 'No mailmap' '
 	cat >expect <<-EOF &&
 	$GIT_AUTHOR_NAME (1):
-- 
2.32.0-dev



* Re: [PATCH 4/4] SANITIZE tests: fix leak in mailmap.c
  2021-07-14  0:11   ` [PATCH 4/4] SANITIZE tests: fix leak in mailmap.c Ævar Arnfjörð Bjarmason
@ 2021-07-14  2:19     ` Eric Sunshine
  0 siblings, 0 replies; 125+ messages in thread
From: Eric Sunshine @ 2021-07-14  2:19 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Git List, Junio C Hamano, Jeff King, Andrzej Hunt,
	Lénaïc Huard, Derrick Stolee, Felipe Contreras,
	SZEDER Gábor

On Tue, Jul 13, 2021 at 8:12 PM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
> Get closer to being able to run t4203-mailmap.sh by fixing a couple of
> memory leaks in mailmap.c.
>
> In the free_mailmap_entry() code added in 0925ce4d49 (Add map_user()
> and clear_mailmap() to mailmap, 2009-02-08) the intent was clearly to
> clear the "me" structure, but while we freed parts of the
> mailmap_entry structure, we didn't free the structure itself. The same
> goes for the "mailmap_info" structure.
> ---

Missing sign-off.


* Re: [PATCH 1/4] tests: add a test mode for SANITIZE=leak, run it in CI
  2021-07-14  0:11   ` [PATCH 1/4] tests: " Ævar Arnfjörð Bjarmason
@ 2021-07-14  3:23     ` Đoàn Trần Công Danh
  0 siblings, 0 replies; 125+ messages in thread
From: Đoàn Trần Công Danh @ 2021-07-14  3:23 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Junio C Hamano, Jeff King, Andrzej Hunt,
	Lénaïc Huard, Derrick Stolee, Felipe Contreras,
	SZEDER Gábor

On 2021-07-14 02:11:46+0200, Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
> While git can be compiled with SANITIZE=leak there has been no
> corresponding GIT_TEST_* mode for it, i.e. memory leaks have been
> fixed as one-offs without structured regression testing.
> 
> This change adds such a mode: we now have new
> linux-{clang,gcc}-sanitize-leak CI targets. These targets run the same
> tests as linux-{clang,gcc}, except that almost all of them are
> skipped.
> 
> There is a whitelist of some tests that are OK in test-lib.sh, and
> individual tests can be opted-in by setting
> GIT_TEST_SANITIZE_LEAK=true before sourcing test-lib.sh. Within those,
> individual tests can be skipped with the "!SANITIZE_LEAK"
> prerequisite. See the updated t/README for more details.
> 
> I'm using the GIT_TEST_SANITIZE_LEAK=true and !SANITIZE_LEAK pattern
> in a couple of tests whose memory leaks I'll fix in subsequent
> commits.
> 
> I'm not being aggressive about opting in tests, it's not all tests
> that currently pass under SANITIZE=leak, just a small number of
> known-good tests. We can add more later as we fix leaks and grow more
> confident in this test mode.
> 
> See the recent discussion at [1] about the lack of this sort of test
> mode, and 0e5bba53af (add UNLEAK annotation for reducing leak false
> positives, 2017-09-08) for the initial addition of SANITIZE=leak.
> 
> See also 09595ab381 (Merge branch 'jk/leak-checkers', 2017-09-19),
> 7782066f67 (Merge branch 'jk/apache-lsan', 2019-05-19) and the recent
> 936e58851a (Merge branch 'ah/plugleaks', 2021-05-07) for some of the
> past history of "one-off" SANITIZE=leak (and more) fixes.
> 
> When calling maybe_skip_all_sanitize_leak matching against
> "$TEST_NAME" instead of "$this_test" as other "match_pattern_list()"
> users do is intentional. I'd like to match things like "t13*config*"
> in subsequent commits. This part of the API isn't public, so we can
> freely change it in the future.
> 
> 1. https://lore.kernel.org/git/87czsv2idy.fsf@evledraar.gmail.com/
> 
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  .github/workflows/main.yml  |  6 ++++
>  Makefile                    |  5 ++++
>  ci/install-dependencies.sh  |  4 +--
>  ci/lib.sh                   | 18 +++++++----
>  ci/run-build-and-tests.sh   |  4 +--
>  t/README                    | 16 ++++++++++
>  t/t0500-progress-display.sh |  3 +-
>  t/t5701-git-serve.sh        |  2 +-
>  t/test-lib.sh               | 60 +++++++++++++++++++++++++++++++++++++
>  9 files changed, 107 insertions(+), 11 deletions(-)
> 
> diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
> index 73856bafc9..b81ec34959 100644
> --- a/.github/workflows/main.yml
> +++ b/.github/workflows/main.yml
> @@ -297,6 +297,12 @@ jobs:
>            - jobname: linux-gcc-default
>              cc: gcc
>              pool: ubuntu-latest
> +          - jobname: linux-clang-sanitize-leak
> +            cc: clang
> +            pool: ubuntu-latest
> +          - jobname: linux-gcc-sanitize-leak
> +            cc: clang

I think you meant:

	cc: gcc

?
> +            pool: ubuntu-latest
>      env:
>        CC: ${{matrix.vector.cc}}
>        jobname: ${{matrix.vector.jobname}}
> diff --git a/Makefile b/Makefile
> index 502e0c9a81..d4cad5136f 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -1216,6 +1216,9 @@ PTHREAD_CFLAGS =
>  SPARSE_FLAGS ?=
>  SP_EXTRA_FLAGS = -Wno-universal-initializer
>  
> +# For informing GIT-BUILD-OPTIONS of the SANITIZE=leak target
> +SANITIZE_LEAK =
> +
>  # For the 'coccicheck' target; setting SPATCH_BATCH_SIZE higher will
>  # usually result in less CPU usage at the cost of higher peak memory.
>  # Setting it to 0 will feed all files in a single spatch invocation.
> @@ -1260,6 +1263,7 @@ BASIC_CFLAGS += -DSHA1DC_FORCE_ALIGNED_ACCESS
>  endif
>  ifneq ($(filter leak,$(SANITIZERS)),)
>  BASIC_CFLAGS += -DSUPPRESS_ANNOTATED_LEAKS
> +SANITIZE_LEAK = YesCompiledWithIt
>  endif
>  ifneq ($(filter address,$(SANITIZERS)),)
>  NO_REGEX = NeededForASAN
> @@ -2793,6 +2797,7 @@ GIT-BUILD-OPTIONS: FORCE
>  	@echo NO_UNIX_SOCKETS=\''$(subst ','\'',$(subst ','\'',$(NO_UNIX_SOCKETS)))'\' >>$@+
>  	@echo PAGER_ENV=\''$(subst ','\'',$(subst ','\'',$(PAGER_ENV)))'\' >>$@+
>  	@echo DC_SHA1=\''$(subst ','\'',$(subst ','\'',$(DC_SHA1)))'\' >>$@+
> +	@echo SANITIZE_LEAK=\''$(subst ','\'',$(subst ','\'',$(SANITIZE_LEAK)))'\' >>$@+
>  	@echo X=\'$(X)\' >>$@+
>  ifdef TEST_OUTPUT_DIRECTORY
>  	@echo TEST_OUTPUT_DIRECTORY=\''$(subst ','\'',$(subst ','\'',$(TEST_OUTPUT_DIRECTORY)))'\' >>$@+
> diff --git a/ci/install-dependencies.sh b/ci/install-dependencies.sh
> index 67852d0d37..31e519cde9 100755
> --- a/ci/install-dependencies.sh
> +++ b/ci/install-dependencies.sh
> @@ -12,13 +12,13 @@ UBUNTU_COMMON_PKGS="make libssl-dev libcurl4-openssl-dev libexpat-dev
>   libemail-valid-perl libio-socket-ssl-perl libnet-smtp-ssl-perl"
>  
>  case "$jobname" in
> -linux-clang|linux-gcc)
> +linux-clang*|linux-gcc*)

This pattern also matches linux-gcc-default; is that intended?
I think not, so a separate case for linux-gcc-default is needed here.
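
To illustrate the concern, here is a minimal sketch. classify_job is a
hypothetical helper, not something in ci/; its "glob" arm copies the
patterns from this patch and its "exact" arm spells out only the four
job names that are meant to share the Ubuntu toolchain setup.

```shell
#!/bin/sh
# Hypothetical helper (not part of ci/): print which arm a job name
# would take under the globbed patterns versus an explicit list.
classify_job () {
	style=$1 jobname=$2
	case "$style" in
	glob)
		case "$jobname" in
		linux-clang*|linux-gcc*) echo ubuntu-toolchain ;;
		*) echo other ;;
		esac
		;;
	exact)
		case "$jobname" in
		linux-clang|linux-gcc|linux-clang-sanitize-leak|linux-gcc-sanitize-leak)
			echo ubuntu-toolchain ;;
		*) echo other ;;
		esac
		;;
	esac
}

# Compare both styles for an existing job, the job that is matched
# by accident, and one of the new jobs.
for job in linux-gcc linux-gcc-default linux-gcc-sanitize-leak
do
	printf '%s: glob=%s exact=%s\n' "$job" \
		"$(classify_job glob "$job")" "$(classify_job exact "$job")"
done
```

With the globbed patterns linux-gcc-default lands in the same arm as
linux-gcc, while the explicit list leaves it in the fallback arm.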

>  	sudo apt-add-repository -y "ppa:ubuntu-toolchain-r/test"
>  	sudo apt-get -q update
>  	sudo apt-get -q -y install language-pack-is libsvn-perl apache2 \
>  		$UBUNTU_COMMON_PKGS
>  	case "$jobname" in
> -	linux-gcc)
> +	linux-gcc*)
>  		sudo apt-get -q -y install gcc-8
>  		;;
>  	esac
> diff --git a/ci/lib.sh b/ci/lib.sh
> index 476c3f369f..34fd914438 100755
> --- a/ci/lib.sh
> +++ b/ci/lib.sh
> @@ -183,14 +183,16 @@ export GIT_TEST_CLONE_2GB=true
>  export SKIP_DASHED_BUILT_INS=YesPlease
>  
>  case "$jobname" in
> -linux-clang|linux-gcc)
> -	if [ "$jobname" = linux-gcc ]
> -	then
> +linux-clang*|linux-gcc*)

Ditto.

> +	case "$jobname" in
> +	linux-gcc*)
>  		export CC=gcc-8
>  		MAKEFLAGS="$MAKEFLAGS PYTHON_PATH=/usr/bin/python3"
> -	else
> +		;;
> +	*)
>  		MAKEFLAGS="$MAKEFLAGS PYTHON_PATH=/usr/bin/python2"
> -	fi
> +		;;
> +	esac
>  
>  	export GIT_TEST_HTTPD=true
>  
> @@ -233,4 +235,10 @@ linux-musl)
>  	;;
>  esac
>  
> +case "$jobname" in
> +linux-*-sanitize-leak)
> +	export SANITIZE=leak
> +	;;
> +esac
> +
>  MAKEFLAGS="$MAKEFLAGS CC=${CC:-cc}"
> diff --git a/ci/run-build-and-tests.sh b/ci/run-build-and-tests.sh
> index 3ce81ffee9..07b9c09f45 100755
> --- a/ci/run-build-and-tests.sh
> +++ b/ci/run-build-and-tests.sh
> @@ -12,7 +12,7 @@ esac
>  
>  make
>  case "$jobname" in
> -linux-gcc)
> +linux-gcc*)

I'm not sure about this one, though.
As it stands, linux-gcc-default falls into the '*' leg.
Do we want to run it in this leg or the original one?

>  	export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
>  	make test
>  	export GIT_TEST_SPLIT_INDEX=yes
> @@ -29,7 +29,7 @@ linux-gcc)
>  	export GIT_TEST_CHECKOUT_WORKERS=2
>  	make test
>  	;;
> -linux-clang)
> +linux-clang*)
>  	export GIT_TEST_DEFAULT_HASH=sha1
>  	make test
>  	export GIT_TEST_DEFAULT_HASH=sha256
> diff --git a/t/README b/t/README
> index 1a2072b2c8..303d0be817 100644
> --- a/t/README
> +++ b/t/README
> @@ -448,6 +448,22 @@ GIT_TEST_CHECKOUT_WORKERS=<n> overrides the 'checkout.workers' setting
>  to <n> and 'checkout.thresholdForParallelism' to 0, forcing the
>  execution of the parallel-checkout code.
>  
> +GIT_TEST_SANITIZE_LEAK=<boolean> will force the tests to run when git
> +is compiled with SANITIZE=leak (we pick it up via
> +../GIT-BUILD-OPTIONS).
> +
> +By default all tests are skipped when compiled with SANITIZE=leak, and
> +individual test scripts opt themselves in to leak testing by setting
> +GIT_TEST_SANITIZE_LEAK=true before sourcing test-lib.sh. Within those
> +tests use the SANITIZE_LEAK prerequisite to skip individiual tests
> +(i.e. test_expect_success !SANITIZE_LEAK [...]).
> +
> +So the GIT_TEST_SANITIZE_LEAK setting is different in behavior from
> +both other GIT_TEST_*=[true|false] settings, but more useful given how
> +SANITIZE=leak works & the state of the test suite. Manually setting
> +GIT_TEST_SANITIZE_LEAK=true is only useful during development when
> +finding and fixing memory leaks.
> +
>  Naming Tests
>  ------------
>  
> diff --git a/t/t0500-progress-display.sh b/t/t0500-progress-display.sh
> index 22058b503a..7afb9abb1f 100755
> --- a/t/t0500-progress-display.sh
> +++ b/t/t0500-progress-display.sh
> @@ -2,6 +2,7 @@
>  
>  test_description='progress display'
>  
> +GIT_TEST_SANITIZE_LEAK=true
>  . ./test-lib.sh
>  
>  show_cr () {
> @@ -283,7 +284,7 @@ test_expect_success 'cover up after throughput shortens a lot' '
>  	test_cmp expect out
>  '
>  
> -test_expect_success 'progress generates traces' '
> +test_expect_success !SANITIZE_LEAK 'progress generates traces' '
>  	cat >in <<-\EOF &&
>  	throughput 102400 1000
>  	update
> diff --git a/t/t5701-git-serve.sh b/t/t5701-git-serve.sh
> index 930721f053..d58efb0aa9 100755
> --- a/t/t5701-git-serve.sh
> +++ b/t/t5701-git-serve.sh
> @@ -243,7 +243,7 @@ test_expect_success 'unexpected lines are not allowed in fetch request' '
>  
>  # Test the basics of object-info
>  #
> -test_expect_success 'basics of object-info' '
> +test_expect_success !SANITIZE_LEAK 'basics of object-info' '
>  	test-tool pkt-line pack >in <<-EOF &&
>  	command=object-info
>  	object-format=$(test_oid algo)
> diff --git a/t/test-lib.sh b/t/test-lib.sh
> index 7036f83b33..9201510e16 100644
> --- a/t/test-lib.sh
> +++ b/t/test-lib.sh
> @@ -1353,6 +1353,40 @@ then
>  	exit 1
>  fi
>  
> +# SANITIZE=leak test mode
> +sanitize_leak_true=
> +add_sanitize_leak_true () {
> +	sanitize_leak_true="$sanitize_leak_true$1 "
> +}
> +
> +sanitize_leak_false=
> +add_sanitize_leak_false () {
> +	sanitize_leak_false="$sanitize_leak_false$1 "
> +}
> +
> +sanitize_leak_opt_in_msg="opt-in with GIT_TEST_SANITIZE_LEAK=true"
> +maybe_skip_all_sanitize_leak () {
> +	# Whitelist patterns
> +	add_sanitize_leak_true 't000*'
> +	add_sanitize_leak_true 't001*'
> +	add_sanitize_leak_true 't006*'
> +
> +	# Blacklist patterns (overrides whitelist)
> +	add_sanitize_leak_false 't000[469]*'
> +	add_sanitize_leak_false 't001[2459]*'
> +	add_sanitize_leak_false 't006[0248]*'
> +
> +	if match_pattern_list "$1" "$sanitize_leak_false"
> +	then
> +		skip_all="test $this_test on SANITIZE=leak blacklist, $sanitize_leak_opt_in_msg"
> +		test_done
> +	elif match_pattern_list "$1" "$sanitize_leak_true"
> +	then
> +		return 0
> +	fi
> +	return 1
> +}
> +
>  # Are we running this test at all?
>  remove_trash=
>  this_test=${0##*/}
> @@ -1364,6 +1398,31 @@ then
>  	test_done
>  fi
>  
> +# Aggressively skip non-whitelisted tests when compiled with
> +# SANITIZE=leak
> +if test -n "$SANITIZE_LEAK"
> +then
> +	if test -z "$GIT_TEST_SANITIZE_LEAK" &&
> +		maybe_skip_all_sanitize_leak "$TEST_NAME"
> +	then
> +		say_color info >&3 "test $this_test on SANITIZE=leak whitelist"
> +		GIT_TEST_SANITIZE_LEAK=true
> +	fi
> +
> +	# We need to see it in "git env--helper" (via
> +	# test_bool_env)
> +	export GIT_TEST_SANITIZE_LEAK
> +
> +	if ! test_bool_env GIT_TEST_SANITIZE_LEAK false
> +	then
> +		skip_all="skip all tests in $this_test under SANITIZE=leak, $sanitize_leak_opt_in_msg"
> +		test_done
> +	fi
> +elif test_bool_env GIT_TEST_SANITIZE_LEAK false
> +then
> +	error "GIT_TEST_SANITIZE_LEAK=true has no effect except when compiled with SANITIZE=leak"
> +fi
> +
>  # Last-minute variable setup
>  HOME="$TRASH_DIRECTORY"
>  GNUPGHOME="$HOME/gnupg-home-not-used"
> @@ -1516,6 +1575,7 @@ test -z "$NO_PYTHON" && test_set_prereq PYTHON
>  test -n "$USE_LIBPCRE2" && test_set_prereq PCRE
>  test -n "$USE_LIBPCRE2" && test_set_prereq LIBPCRE2
>  test -z "$NO_GETTEXT" && test_set_prereq GETTEXT
> +test -n "$SANITIZE_LEAK" && test_set_prereq SANITIZE_LEAK
>  
>  if test -z "$GIT_TEST_CHECK_CACHE_TREE"
>  then
> -- 
> 2.32.0-dev
> 

-- 
Danh


* [PATCH v2 0/4] add a test mode for SANITIZE=leak, run it in CI
  2021-07-14  0:11 ` [PATCH 0/4] add a test mode for SANITIZE=leak, run it in CI Ævar Arnfjörð Bjarmason
                     ` (3 preceding siblings ...)
  2021-07-14  0:11   ` [PATCH 4/4] SANITIZE tests: fix leak in mailmap.c Ævar Arnfjörð Bjarmason
@ 2021-07-14 17:23   ` Ævar Arnfjörð Bjarmason
  2021-07-14 17:23     ` [PATCH v2 1/4] tests: " Ævar Arnfjörð Bjarmason
                       ` (6 more replies)
  4 siblings, 7 replies; 125+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-14 17:23 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Andrzej Hunt, Lénaïc Huard,
	Derrick Stolee, Felipe Contreras, SZEDER Gábor,
	Đoàn Trần Công Danh, Eric Sunshine,
	Ævar Arnfjörð Bjarmason

As a follow-up to my recent thread asking if we had some test mode or
CI to test for memory leak regressions (we don't), add such a test
mode, and run it in CI.

Currently the two new CI targets take ~2-3 minutes to run in GitHub
CI, whereas the normal test targets take 20-30 minutes. Individual
tests run slower under SANITIZE=leak, but we only run a small
whitelist of test scripts that are known to be OK.

v2:

 * Fixes issues spotted by Đoàn Trần Công Danh and Eric Sunshine,
   thanks both!

 * I got rid of the change to t0500. I saw it being flaky in GitHub
   CI, and it looks like there'll be other concurrent edits to that
   file, so I'm leaving it be.

v1: http://lore.kernel.org/git/cover-0.4-0000000000-20210714T001007Z-avarab@gmail.com

Ævar Arnfjörð Bjarmason (4):
  tests: add a test mode for SANITIZE=leak, run it in CI
  SANITIZE tests: fix memory leaks in t13*config*, add to whitelist
  SANITIZE tests: fix memory leaks in t5701*, add to whitelist
  SANITIZE tests: fix leak in mailmap.c

 .github/workflows/main.yml |  6 ++++
 Makefile                   |  5 ++++
 ci/install-dependencies.sh |  4 +--
 ci/lib.sh                  | 18 +++++++----
 ci/run-build-and-tests.sh  |  4 +--
 config.c                   | 17 +++++++----
 mailmap.c                  |  2 ++
 protocol-caps.c            |  5 ++--
 t/README                   | 16 ++++++++++
 t/t1300-config.sh          | 16 ++++++----
 t/t4203-mailmap.sh         |  6 ++++
 t/t5701-git-serve.sh       |  3 +-
 t/test-lib.sh              | 61 ++++++++++++++++++++++++++++++++++++++
 13 files changed, 140 insertions(+), 23 deletions(-)

Range-diff against v1:
1:  b7948c408d ! 1:  0795436a24 tests: add a test mode for SANITIZE=leak, run it in CI
    @@ .github/workflows/main.yml: jobs:
     +            cc: clang
     +            pool: ubuntu-latest
     +          - jobname: linux-gcc-sanitize-leak
    -+            cc: clang
    ++            cc: gcc
     +            pool: ubuntu-latest
          env:
            CC: ${{matrix.vector.cc}}
    @@ ci/install-dependencies.sh: UBUNTU_COMMON_PKGS="make libssl-dev libcurl4-openssl
      
      case "$jobname" in
     -linux-clang|linux-gcc)
    -+linux-clang*|linux-gcc*)
    ++linux-clang|linux-gcc|linux-clang-sanitize-leak|linux-gcc-sanitize-leak)
      	sudo apt-add-repository -y "ppa:ubuntu-toolchain-r/test"
      	sudo apt-get -q update
      	sudo apt-get -q -y install language-pack-is libsvn-perl apache2 \
      		$UBUNTU_COMMON_PKGS
      	case "$jobname" in
     -	linux-gcc)
    -+	linux-gcc*)
    ++	linux-gcc|linux-gcc-sanitize-leak)
      		sudo apt-get -q -y install gcc-8
      		;;
      	esac
    @@ ci/lib.sh: export GIT_TEST_CLONE_2GB=true
     -linux-clang|linux-gcc)
     -	if [ "$jobname" = linux-gcc ]
     -	then
    -+linux-clang*|linux-gcc*)
    ++linux-clang|linux-gcc|linux-clang-sanitize-leak|linux-gcc-sanitize-leak)
     +	case "$jobname" in
    -+	linux-gcc*)
    ++	linux-gcc|linux-gcc-sanitize-leak)
      		export CC=gcc-8
      		MAKEFLAGS="$MAKEFLAGS PYTHON_PATH=/usr/bin/python3"
     -	else
    @@ ci/lib.sh: linux-musl)
      esac
      
     +case "$jobname" in
    -+linux-*-sanitize-leak)
    ++linux-clang-sanitize-leak|linux-gcc-sanitize-leak)
     +	export SANITIZE=leak
     +	;;
     +esac
    @@ ci/run-build-and-tests.sh: esac
      make
      case "$jobname" in
     -linux-gcc)
    -+linux-gcc*)
    ++linux-gcc|linux-gcc-sanitize-leak)
      	export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
      	make test
      	export GIT_TEST_SPLIT_INDEX=yes
    @@ ci/run-build-and-tests.sh: linux-gcc)
      	make test
      	;;
     -linux-clang)
    -+linux-clang*)
    ++linux-clang|linux-clang-sanitize-leak)
      	export GIT_TEST_DEFAULT_HASH=sha1
      	make test
      	export GIT_TEST_DEFAULT_HASH=sha256
    @@ t/README: GIT_TEST_CHECKOUT_WORKERS=<n> overrides the 'checkout.workers' setting
      ------------
      
     
    - ## t/t0500-progress-display.sh ##
    -@@
    - 
    - test_description='progress display'
    - 
    -+GIT_TEST_SANITIZE_LEAK=true
    - . ./test-lib.sh
    - 
    - show_cr () {
    -@@ t/t0500-progress-display.sh: test_expect_success 'cover up after throughput shortens a lot' '
    - 	test_cmp expect out
    - '
    - 
    --test_expect_success 'progress generates traces' '
    -+test_expect_success !SANITIZE_LEAK 'progress generates traces' '
    - 	cat >in <<-\EOF &&
    - 	throughput 102400 1000
    - 	update
    -
      ## t/t5701-git-serve.sh ##
     @@ t/t5701-git-serve.sh: test_expect_success 'unexpected lines are not allowed in fetch request' '
      
2:  babcb1c289 = 2:  867e8e9a6c SANITIZE tests: fix memory leaks in t13*config*, add to whitelist
3:  11aa2f3bb5 = 3:  b7fb5d5a56 SANITIZE tests: fix memory leaks in t5701*, add to whitelist
4:  7f4e433559 ! 4:  ad8680f529 SANITIZE tests: fix leak in mailmap.c
    @@ Commit message
         mailmap_entry structure, we didn't free the structure itself. The same
         goes for the "mailmap_info" structure.
     
    +    Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
    +
      ## mailmap.c ##
     @@ mailmap.c: static void free_mailmap_info(void *p, const char *s)
      		 s, debug_str(mi->name), debug_str(mi->email));
-- 
2.32.0.853.g5a570c9bf9



* [PATCH v2 1/4] tests: add a test mode for SANITIZE=leak, run it in CI
  2021-07-14 17:23   ` [PATCH v2 0/4] add a test mode for SANITIZE=leak, run it in CI Ævar Arnfjörð Bjarmason
@ 2021-07-14 17:23     ` Ævar Arnfjörð Bjarmason
  2021-07-14 18:42       ` Andrzej Hunt
  2021-07-15 21:06       ` Jeff King
  2021-07-14 17:23     ` [PATCH v2 2/4] SANITIZE tests: fix memory leaks in t13*config*, add to whitelist Ævar Arnfjörð Bjarmason
                       ` (5 subsequent siblings)
  6 siblings, 2 replies; 125+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-14 17:23 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Andrzej Hunt, Lénaïc Huard,
	Derrick Stolee, Felipe Contreras, SZEDER Gábor,
	Đoàn Trần Công Danh, Eric Sunshine,
	Ævar Arnfjörð Bjarmason

While git can be compiled with SANITIZE=leak there has been no
corresponding GIT_TEST_* mode for it, i.e. memory leaks have been
fixed as one-offs without structured regression testing.

This change adds such a mode: we now have new
linux-{clang,gcc}-sanitize-leak CI targets. These targets run the same
tests as linux-{clang,gcc}, except that almost all of them are
skipped.

There is a whitelist of some tests that are OK in test-lib.sh, and
individual tests can be opted-in by setting
GIT_TEST_SANITIZE_LEAK=true before sourcing test-lib.sh. Within those,
individual tests can be skipped with the "!SANITIZE_LEAK"
prerequisite. See the updated t/README for more details.
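
The run/skip decision described above can be modelled roughly as
follows. This is only a sketch: leak_test_decision and its three
arguments are hypothetical names, it compares against the literal
string "true" where test-lib.sh uses test_bool_env, and it leaves out
the blacklist handling.

```shell
#!/bin/sh
# Hypothetical model of the decision made in test-lib.sh.
#   $1: non-empty if git was built with SANITIZE=leak
#   $2: value of GIT_TEST_SANITIZE_LEAK ("", "true" or "false")
#   $3: "yes" if the script is on the built-in whitelist
leak_test_decision () {
	built=$1 opt_in=$2 whitelisted=$3
	if test -n "$built"
	then
		# An unset variable on a whitelisted script counts as opt-in.
		if test -z "$opt_in" && test "$whitelisted" = yes
		then
			opt_in=true
		fi
		if test "$opt_in" = true
		then
			echo run
		else
			echo skip_all
		fi
	elif test "$opt_in" = true
	then
		# Opting in makes no sense without a SANITIZE=leak build.
		echo error
	else
		echo run
	fi
}

leak_test_decision yes "" yes    # whitelisted script in a leak build
leak_test_decision yes "" no     # everything else in a leak build
leak_test_decision yes true no   # script that opted itself in
leak_test_decision "" true no    # opt-in without a leak build
```

An explicit GIT_TEST_SANITIZE_LEAK=false wins over the whitelist,
because the whitelist is only consulted when the variable is unset.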

I'm using the GIT_TEST_SANITIZE_LEAK=true and !SANITIZE_LEAK pattern
in a couple of tests whose memory leaks I'll fix in subsequent
commits.

I'm not being aggressive about opting in tests, it's not all tests
that currently pass under SANITIZE=leak, just a small number of
known-good tests. We can add more later as we fix leaks and grow more
confident in this test mode.

See the recent discussion at [1] about the lack of this sort of test
mode, and 0e5bba53af (add UNLEAK annotation for reducing leak false
positives, 2017-09-08) for the initial addition of SANITIZE=leak.

See also 09595ab381 (Merge branch 'jk/leak-checkers', 2017-09-19),
7782066f67 (Merge branch 'jk/apache-lsan', 2019-05-19) and the recent
936e58851a (Merge branch 'ah/plugleaks', 2021-05-07) for some of the
past history of "one-off" SANITIZE=leak (and more) fixes.

When calling maybe_skip_all_sanitize_leak, matching against
"$TEST_NAME" instead of "$this_test" (as other "match_pattern_list()"
users do) is intentional. I'd like to match things like "t13*config*"
in subsequent commits. This part of the API isn't public, so we can
freely change it in the future.
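
As a rough sketch of how the two pattern lists interact: matches_any
and leak_mode_for below are hypothetical stand-ins, not the
match_pattern_list() from test-lib.sh, and the script names are only
examples. The patterns themselves are copied from this patch.

```shell
#!/bin/sh
# Hypothetical stand-in for the pattern matching: succeed if $1
# matches any of the whitespace-separated glob patterns in $2.
matches_any () {
	name=$1
	(
		set -f	# do not expand the patterns against files on disk
		for pattern in $2
		do
			case "$name" in
			$pattern) exit 0 ;;
			esac
		done
		exit 1
	)
}

# Patterns copied from the patch; the blacklist is consulted first.
whitelist='t000* t001* t006*'
blacklist='t000[469]* t001[2459]* t006[0248]*'

leak_mode_for () {
	if matches_any "$1" "$blacklist"
	then
		echo skip
	elif matches_any "$1" "$whitelist"
	then
		echo run
	else
		echo opt-in-needed
	fi
}

for t in t0001-init t0004-unwritable t1300-config
do
	printf '%s: %s\n' "$t" "$(leak_mode_for "$t")"
done
```

A pattern such as "t13*config*" can only match the full script name
(e.g. t1300-config), never a bare number prefix such as t1300, which
is why the match needs "$TEST_NAME".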

1. https://lore.kernel.org/git/87czsv2idy.fsf@evledraar.gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 .github/workflows/main.yml |  6 ++++
 Makefile                   |  5 ++++
 ci/install-dependencies.sh |  4 +--
 ci/lib.sh                  | 18 ++++++++----
 ci/run-build-and-tests.sh  |  4 +--
 t/README                   | 16 ++++++++++
 t/t5701-git-serve.sh       |  2 +-
 t/test-lib.sh              | 60 ++++++++++++++++++++++++++++++++++++++
 8 files changed, 105 insertions(+), 10 deletions(-)

diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
index 73856bafc9..752fe187f9 100644
--- a/.github/workflows/main.yml
+++ b/.github/workflows/main.yml
@@ -297,6 +297,12 @@ jobs:
           - jobname: linux-gcc-default
             cc: gcc
             pool: ubuntu-latest
+          - jobname: linux-clang-sanitize-leak
+            cc: clang
+            pool: ubuntu-latest
+          - jobname: linux-gcc-sanitize-leak
+            cc: gcc
+            pool: ubuntu-latest
     env:
       CC: ${{matrix.vector.cc}}
       jobname: ${{matrix.vector.jobname}}
diff --git a/Makefile b/Makefile
index 502e0c9a81..d4cad5136f 100644
--- a/Makefile
+++ b/Makefile
@@ -1216,6 +1216,9 @@ PTHREAD_CFLAGS =
 SPARSE_FLAGS ?=
 SP_EXTRA_FLAGS = -Wno-universal-initializer
 
+# For informing GIT-BUILD-OPTIONS of the SANITIZE=leak target
+SANITIZE_LEAK =
+
 # For the 'coccicheck' target; setting SPATCH_BATCH_SIZE higher will
 # usually result in less CPU usage at the cost of higher peak memory.
 # Setting it to 0 will feed all files in a single spatch invocation.
@@ -1260,6 +1263,7 @@ BASIC_CFLAGS += -DSHA1DC_FORCE_ALIGNED_ACCESS
 endif
 ifneq ($(filter leak,$(SANITIZERS)),)
 BASIC_CFLAGS += -DSUPPRESS_ANNOTATED_LEAKS
+SANITIZE_LEAK = YesCompiledWithIt
 endif
 ifneq ($(filter address,$(SANITIZERS)),)
 NO_REGEX = NeededForASAN
@@ -2793,6 +2797,7 @@ GIT-BUILD-OPTIONS: FORCE
 	@echo NO_UNIX_SOCKETS=\''$(subst ','\'',$(subst ','\'',$(NO_UNIX_SOCKETS)))'\' >>$@+
 	@echo PAGER_ENV=\''$(subst ','\'',$(subst ','\'',$(PAGER_ENV)))'\' >>$@+
 	@echo DC_SHA1=\''$(subst ','\'',$(subst ','\'',$(DC_SHA1)))'\' >>$@+
+	@echo SANITIZE_LEAK=\''$(subst ','\'',$(subst ','\'',$(SANITIZE_LEAK)))'\' >>$@+
 	@echo X=\'$(X)\' >>$@+
 ifdef TEST_OUTPUT_DIRECTORY
 	@echo TEST_OUTPUT_DIRECTORY=\''$(subst ','\'',$(subst ','\'',$(TEST_OUTPUT_DIRECTORY)))'\' >>$@+
diff --git a/ci/install-dependencies.sh b/ci/install-dependencies.sh
index 67852d0d37..8ac72d7246 100755
--- a/ci/install-dependencies.sh
+++ b/ci/install-dependencies.sh
@@ -12,13 +12,13 @@ UBUNTU_COMMON_PKGS="make libssl-dev libcurl4-openssl-dev libexpat-dev
  libemail-valid-perl libio-socket-ssl-perl libnet-smtp-ssl-perl"
 
 case "$jobname" in
-linux-clang|linux-gcc)
+linux-clang|linux-gcc|linux-clang-sanitize-leak|linux-gcc-sanitize-leak)
 	sudo apt-add-repository -y "ppa:ubuntu-toolchain-r/test"
 	sudo apt-get -q update
 	sudo apt-get -q -y install language-pack-is libsvn-perl apache2 \
 		$UBUNTU_COMMON_PKGS
 	case "$jobname" in
-	linux-gcc)
+	linux-gcc|linux-gcc-sanitize-leak)
 		sudo apt-get -q -y install gcc-8
 		;;
 	esac
diff --git a/ci/lib.sh b/ci/lib.sh
index 476c3f369f..bb02b5abf4 100755
--- a/ci/lib.sh
+++ b/ci/lib.sh
@@ -183,14 +183,16 @@ export GIT_TEST_CLONE_2GB=true
 export SKIP_DASHED_BUILT_INS=YesPlease
 
 case "$jobname" in
-linux-clang|linux-gcc)
-	if [ "$jobname" = linux-gcc ]
-	then
+linux-clang|linux-gcc|linux-clang-sanitize-leak|linux-gcc-sanitize-leak)
+	case "$jobname" in
+	linux-gcc|linux-gcc-sanitize-leak)
 		export CC=gcc-8
 		MAKEFLAGS="$MAKEFLAGS PYTHON_PATH=/usr/bin/python3"
-	else
+		;;
+	*)
 		MAKEFLAGS="$MAKEFLAGS PYTHON_PATH=/usr/bin/python2"
-	fi
+		;;
+	esac
 
 	export GIT_TEST_HTTPD=true
 
@@ -233,4 +235,10 @@ linux-musl)
 	;;
 esac
 
+case "$jobname" in
+linux-clang-sanitize-leak|linux-gcc-sanitize-leak)
+	export SANITIZE=leak
+	;;
+esac
+
 MAKEFLAGS="$MAKEFLAGS CC=${CC:-cc}"
diff --git a/ci/run-build-and-tests.sh b/ci/run-build-and-tests.sh
index 3ce81ffee9..5fe047b5c6 100755
--- a/ci/run-build-and-tests.sh
+++ b/ci/run-build-and-tests.sh
@@ -12,7 +12,7 @@ esac
 
 make
 case "$jobname" in
-linux-gcc)
+linux-gcc|linux-gcc-sanitize-leak)
 	export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 	make test
 	export GIT_TEST_SPLIT_INDEX=yes
@@ -29,7 +29,7 @@ linux-gcc)
 	export GIT_TEST_CHECKOUT_WORKERS=2
 	make test
 	;;
-linux-clang)
+linux-clang|linux-clang-sanitize-leak)
 	export GIT_TEST_DEFAULT_HASH=sha1
 	make test
 	export GIT_TEST_DEFAULT_HASH=sha256
diff --git a/t/README b/t/README
index 1a2072b2c8..303d0be817 100644
--- a/t/README
+++ b/t/README
@@ -448,6 +448,22 @@ GIT_TEST_CHECKOUT_WORKERS=<n> overrides the 'checkout.workers' setting
 to <n> and 'checkout.thresholdForParallelism' to 0, forcing the
 execution of the parallel-checkout code.
 
+GIT_TEST_SANITIZE_LEAK=<boolean> will force the tests to run when git
+is compiled with SANITIZE=leak (we pick it up via
+../GIT-BUILD-OPTIONS).
+
+By default all tests are skipped when compiled with SANITIZE=leak, and
+individual test scripts opt themselves in to leak testing by setting
+GIT_TEST_SANITIZE_LEAK=true before sourcing test-lib.sh. Within those
+tests use the SANITIZE_LEAK prerequisite to skip individiual tests
+(i.e. test_expect_success !SANITIZE_LEAK [...]).
+
+So the GIT_TEST_SANITIZE_LEAK setting is different in behavior from
+both other GIT_TEST_*=[true|false] settings, but more useful given how
+SANITIZE=leak works & the state of the test suite. Manually setting
+GIT_TEST_SANITIZE_LEAK=true is only useful during development when
+finding and fixing memory leaks.
+
 Naming Tests
 ------------
 
diff --git a/t/t5701-git-serve.sh b/t/t5701-git-serve.sh
index 930721f053..d58efb0aa9 100755
--- a/t/t5701-git-serve.sh
+++ b/t/t5701-git-serve.sh
@@ -243,7 +243,7 @@ test_expect_success 'unexpected lines are not allowed in fetch request' '
 
 # Test the basics of object-info
 #
-test_expect_success 'basics of object-info' '
+test_expect_success !SANITIZE_LEAK 'basics of object-info' '
 	test-tool pkt-line pack >in <<-EOF &&
 	command=object-info
 	object-format=$(test_oid algo)
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 7036f83b33..9201510e16 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1353,6 +1353,40 @@ then
 	exit 1
 fi
 
+# SANITIZE=leak test mode
+sanitize_leak_true=
+add_sanitize_leak_true () {
+	sanitize_leak_true="$sanitize_leak_true$1 "
+}
+
+sanitize_leak_false=
+add_sanitize_leak_false () {
+	sanitize_leak_false="$sanitize_leak_false$1 "
+}
+
+sanitize_leak_opt_in_msg="opt-in with GIT_TEST_SANITIZE_LEAK=true"
+maybe_skip_all_sanitize_leak () {
+	# Whitelist patterns
+	add_sanitize_leak_true 't000*'
+	add_sanitize_leak_true 't001*'
+	add_sanitize_leak_true 't006*'
+
+	# Blacklist patterns (overrides whitelist)
+	add_sanitize_leak_false 't000[469]*'
+	add_sanitize_leak_false 't001[2459]*'
+	add_sanitize_leak_false 't006[0248]*'
+
+	if match_pattern_list "$1" "$sanitize_leak_false"
+	then
+		skip_all="test $this_test on SANITIZE=leak blacklist, $sanitize_leak_opt_in_msg"
+		test_done
+	elif match_pattern_list "$1" "$sanitize_leak_true"
+	then
+		return 0
+	fi
+	return 1
+}
+
 # Are we running this test at all?
 remove_trash=
 this_test=${0##*/}
@@ -1364,6 +1398,31 @@ then
 	test_done
 fi
 
+# Aggressively skip non-whitelisted tests when compiled with
+# SANITIZE=leak
+if test -n "$SANITIZE_LEAK"
+then
+	if test -z "$GIT_TEST_SANITIZE_LEAK" &&
+		maybe_skip_all_sanitize_leak "$TEST_NAME"
+	then
+		say_color info >&3 "test $this_test on SANITIZE=leak whitelist"
+		GIT_TEST_SANITIZE_LEAK=true
+	fi
+
+	# We need to see it in "git env--helper" (via
+	# test_bool_env)
+	export GIT_TEST_SANITIZE_LEAK
+
+	if ! test_bool_env GIT_TEST_SANITIZE_LEAK false
+	then
+		skip_all="skip all tests in $this_test under SANITIZE=leak, $sanitize_leak_opt_in_msg"
+		test_done
+	fi
+elif test_bool_env GIT_TEST_SANITIZE_LEAK false
+then
+	error "GIT_TEST_SANITIZE_LEAK=true has no effect except when compiled with SANITIZE=leak"
+fi
+
 # Last-minute variable setup
 HOME="$TRASH_DIRECTORY"
 GNUPGHOME="$HOME/gnupg-home-not-used"
@@ -1516,6 +1575,7 @@ test -z "$NO_PYTHON" && test_set_prereq PYTHON
 test -n "$USE_LIBPCRE2" && test_set_prereq PCRE
 test -n "$USE_LIBPCRE2" && test_set_prereq LIBPCRE2
 test -z "$NO_GETTEXT" && test_set_prereq GETTEXT
+test -n "$SANITIZE_LEAK" && test_set_prereq SANITIZE_LEAK
 
 if test -z "$GIT_TEST_CHECK_CACHE_TREE"
 then
-- 
2.32.0.853.g5a570c9bf9



* [PATCH v2 2/4] SANITIZE tests: fix memory leaks in t13*config*, add to whitelist
  2021-07-14 17:23   ` [PATCH v2 0/4] add a test mode for SANITIZE=leak, run it in CI Ævar Arnfjörð Bjarmason
  2021-07-14 17:23     ` [PATCH v2 1/4] tests: " Ævar Arnfjörð Bjarmason
@ 2021-07-14 17:23     ` Ævar Arnfjörð Bjarmason
  2021-07-14 18:57       ` Andrzej Hunt
  2021-07-14 17:23     ` [PATCH v2 3/4] SANITIZE tests: fix memory leaks in t5701*, " Ævar Arnfjörð Bjarmason
                       ` (4 subsequent siblings)
  6 siblings, 1 reply; 125+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-14 17:23 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Andrzej Hunt, Lénaïc Huard,
	Derrick Stolee, Felipe Contreras, SZEDER Gábor,
	Đoàn Trần Công Danh, Eric Sunshine,
	Ævar Arnfjörð Bjarmason

Fix a couple of trivial memory leaks introduced in 3efd0bedc6 (config:
add conditional include, 2017-03-01) and my own 867ad08a26 (hooks:
allow customizing where the hook directory is, 2016-05-04).

In the latter case the "fix" is UNLEAK() on the global variable. This
allows us to run all t13*config* tests under SANITIZE=leak.

With this change we can now run almost the whole set of config.c
tests (t13*config*) under SANITIZE=leak, so let's do so, with a few
exceptions:

 * The test added in ce81b1da23 (config: add new way to pass config
   via `--config-env`, 2021-01-12), it fails in GitHub CI, but passes
   for me locally. Let's just skip it for now.

 * Ditto the split_cmdline and "aliases of builtins" tests. The former
   required splitting up an existing test; there is an issue with the
   test that would have also been revealed by skipping it.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 config.c          | 17 ++++++++++++-----
 t/t1300-config.sh | 16 ++++++++++------
 t/test-lib.sh     |  1 +
 3 files changed, 23 insertions(+), 11 deletions(-)

diff --git a/config.c b/config.c
index f9c400ad30..38e132c0e2 100644
--- a/config.c
+++ b/config.c
@@ -138,8 +138,10 @@ static int handle_path_include(const char *path, struct config_include_data *inc
 		return config_error_nonbool("include.path");
 
 	expanded = expand_user_path(path, 0);
-	if (!expanded)
-		return error(_("could not expand include path '%s'"), path);
+	if (!expanded) {
+		ret = error(_("could not expand include path '%s'"), path);
+		goto cleanup;
+	}
 	path = expanded;
 
 	/*
@@ -149,8 +151,10 @@ static int handle_path_include(const char *path, struct config_include_data *inc
 	if (!is_absolute_path(path)) {
 		char *slash;
 
-		if (!cf || !cf->path)
-			return error(_("relative config includes must come from files"));
+		if (!cf || !cf->path) {
+			ret = error(_("relative config includes must come from files"));
+			goto cleanup;
+		}
 
 		slash = find_last_dir_sep(cf->path);
 		if (slash)
@@ -168,6 +172,7 @@ static int handle_path_include(const char *path, struct config_include_data *inc
 		ret = git_config_from_file(git_config_include, path, inc);
 		inc->depth--;
 	}
+cleanup:
 	strbuf_release(&buf);
 	free(expanded);
 	return ret;
@@ -1331,8 +1336,10 @@ static int git_default_core_config(const char *var, const char *value, void *cb)
 	if (!strcmp(var, "core.attributesfile"))
 		return git_config_pathname(&git_attributes_file, var, value);
 
-	if (!strcmp(var, "core.hookspath"))
+	if (!strcmp(var, "core.hookspath")) {
+		UNLEAK(git_hooks_path);
 		return git_config_pathname(&git_hooks_path, var, value);
+	}
 
 	if (!strcmp(var, "core.bare")) {
 		is_bare_repository_cfg = git_config_bool(var, value);
diff --git a/t/t1300-config.sh b/t/t1300-config.sh
index 9ff46f3b04..93ad0f4887 100755
--- a/t/t1300-config.sh
+++ b/t/t1300-config.sh
@@ -1050,12 +1050,16 @@ test_expect_success SYMLINKS 'symlink to nonexistent configuration' '
 	test_must_fail git config --file=linktolinktonada --list
 '
 
-test_expect_success 'check split_cmdline return' "
-	git config alias.split-cmdline-fix 'echo \"' &&
-	test_must_fail git split-cmdline-fix &&
+test_expect_success 'setup check split_cmdline return' "
 	echo foo > foo &&
 	git add foo &&
-	git commit -m 'initial commit' &&
+	git commit -m 'initial commit'
+"
+
+test_expect_success !SANITIZE_LEAK 'check split_cmdline return' "
+	git config alias.split-cmdline-fix 'echo \"' &&
+	test_must_fail git split-cmdline-fix &&
+
 	git config branch.main.mergeoptions 'echo \"' &&
 	test_must_fail git merge main
 "
@@ -1101,7 +1105,7 @@ test_expect_success 'key sanity-checking' '
 	git config foo."ba =z".bar false
 '
 
-test_expect_success 'git -c works with aliases of builtins' '
+test_expect_success !SANITIZE_LEAK 'git -c works with aliases of builtins' '
 	git config alias.checkconfig "-c foo.check=bar config foo.check" &&
 	echo bar >expect &&
 	git checkconfig >actual &&
@@ -1397,7 +1401,7 @@ test_expect_success 'git --config-env with missing value' '
 	grep "invalid config format: config" error
 '
 
-test_expect_success 'git --config-env fails with invalid parameters' '
+test_expect_success !SANITIZE_LEAK 'git --config-env fails with invalid parameters' '
 	test_must_fail git --config-env=foo.flag config --bool foo.flag 2>error &&
 	test_i18ngrep "invalid config format: foo.flag" error &&
 	test_must_fail git --config-env=foo.flag= config --bool foo.flag 2>error &&
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 9201510e16..98e20950c3 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1370,6 +1370,7 @@ maybe_skip_all_sanitize_leak () {
 	add_sanitize_leak_true 't000*'
 	add_sanitize_leak_true 't001*'
 	add_sanitize_leak_true 't006*'
+	add_sanitize_leak_true 't13*config*'
 
 	# Blacklist patterns (overrides whitelist)
 	add_sanitize_leak_false 't000[469]*'
-- 
2.32.0.853.g5a570c9bf9



* [PATCH v2 3/4] SANITIZE tests: fix memory leaks in t5701*, add to whitelist
  2021-07-14 17:23   ` [PATCH v2 0/4] add a test mode for SANITIZE=leak, run it in CI Ævar Arnfjörð Bjarmason
  2021-07-14 17:23     ` [PATCH v2 1/4] tests: " Ævar Arnfjörð Bjarmason
  2021-07-14 17:23     ` [PATCH v2 2/4] SANITIZE tests: fix memory leaks in t13*config*, add to whitelist Ævar Arnfjörð Bjarmason
@ 2021-07-14 17:23     ` Ævar Arnfjörð Bjarmason
  2021-07-15 17:37       ` Andrzej Hunt
                         ` (2 more replies)
  2021-07-14 17:23     ` [PATCH v2 4/4] SANITIZE tests: fix leak in mailmap.c Ævar Arnfjörð Bjarmason
                       ` (3 subsequent siblings)
  6 siblings, 3 replies; 125+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-14 17:23 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Andrzej Hunt, Lénaïc Huard,
	Derrick Stolee, Felipe Contreras, SZEDER Gábor,
	Đoàn Trần Công Danh, Eric Sunshine,
	Ævar Arnfjörð Bjarmason

Fix a memory leak in a2ba162cda (object-info: support for retrieving
object info, 2021-04-20), which appears to have been based on a
misunderstanding of how the pkt-line.c API works: there is no need to
strdup() the input to it, as it's just a printf()-like format function.

This fixes a potentially large memory leak, since the number of OID
lines in an "object-info" call can be arbitrarily large (or a small one
if the request is small).
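
As a rough standalone illustration of the pattern (a sketch only:
"struct mini_buf", its helpers and send_lines() are hypothetical
stand-ins, not git's strbuf or pkt-line API), a printf()-like consumer
only reads its arguments, so the caller can keep one buffer, reset it
after each line and release it once at the end:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical miniature of a growable string buffer (not git's strbuf). */
struct mini_buf {
	char *data;
	size_t len, cap;
};

/* Append a NUL-terminated string, growing the allocation as needed. */
void mini_buf_addstr(struct mini_buf *b, const char *s)
{
	size_t n = strlen(s);

	if (b->len + n + 1 > b->cap) {
		size_t cap = (b->len + n + 1) * 2;
		char *p = realloc(b->data, cap);

		if (!p)
			abort();
		b->data = p;
		b->cap = cap;
	}
	memcpy(b->data + b->len, s, n + 1);
	b->len += n;
}

/* Keep the allocation, forget the contents. */
void mini_buf_reset(struct mini_buf *b)
{
	b->len = 0;
	if (b->data)
		b->data[0] = '\0';
}

/* Give the allocation back; the buffer may be reused afterwards. */
void mini_buf_release(struct mini_buf *b)
{
	free(b->data);
	b->data = NULL;
	b->len = b->cap = 0;
}

/*
 * Format one line per id. fprintf() only reads its arguments, so the
 * buffer stays owned by this function: reset per line, release once.
 */
size_t send_lines(FILE *out, const char **ids, size_t nr)
{
	struct mini_buf line = { NULL, 0, 0 };
	size_t i;

	for (i = 0; i < nr; i++) {
		mini_buf_addstr(&line, ids[i]);
		mini_buf_addstr(&line, " size");
		fprintf(out, "%s\n", line.data);
		mini_buf_reset(&line);
	}
	mini_buf_release(&line);
	return i;
}
```

Handing a freshly detached copy to the formatter on every iteration
would instead leave one unreachable allocation per line, which is the
kind of leak this patch removes.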

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 protocol-caps.c      | 5 +++--
 t/t5701-git-serve.sh | 1 +
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/protocol-caps.c b/protocol-caps.c
index 13a9e63a04..901b6795e4 100644
--- a/protocol-caps.c
+++ b/protocol-caps.c
@@ -69,9 +69,10 @@ static void send_info(struct repository *r, struct packet_writer *writer,
 			}
 		}
 
-		packet_writer_write(writer, "%s",
-				    strbuf_detach(&send_buffer, NULL));
+		packet_writer_write(writer, "%s", send_buffer.buf);
+		strbuf_reset(&send_buffer);
 	}
+	strbuf_release(&send_buffer);
 }
 
 int cap_object_info(struct repository *r, struct strvec *keys,
diff --git a/t/t5701-git-serve.sh b/t/t5701-git-serve.sh
index d58efb0aa9..e2f4832adf 100755
--- a/t/t5701-git-serve.sh
+++ b/t/t5701-git-serve.sh
@@ -5,6 +5,7 @@ test_description='test protocol v2 server commands'
 GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
+GIT_TEST_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 test_expect_success 'test capability advertisement' '
-- 
2.32.0.853.g5a570c9bf9



* [PATCH v2 4/4] SANITIZE tests: fix leak in mailmap.c
  2021-07-14 17:23   ` [PATCH v2 0/4] add a test mode for SANITIZE=leak, run it in CI Ævar Arnfjörð Bjarmason
                       ` (2 preceding siblings ...)
  2021-07-14 17:23     ` [PATCH v2 3/4] SANITIZE tests: fix memory leaks in t5701*, " Ævar Arnfjörð Bjarmason
@ 2021-07-14 17:23     ` Ævar Arnfjörð Bjarmason
  2021-08-31 13:42       ` [PATCH] mailmap.c: fix a memory leak in free_mailap_{info,entry}() Ævar Arnfjörð Bjarmason
  2021-07-15 17:37     ` [PATCH v2 0/4] add a test mode for SANITIZE=leak, run it in CI Andrzej Hunt
                       ` (2 subsequent siblings)
  6 siblings, 1 reply; 125+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-14 17:23 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Andrzej Hunt, Lénaïc Huard,
	Derrick Stolee, Felipe Contreras, SZEDER Gábor,
	Đoàn Trần Công Danh, Eric Sunshine,
	Ævar Arnfjörð Bjarmason

Get closer to being able to run t4203-mailmap.sh by fixing a couple of
memory leaks in mailmap.c.

In the free_mailmap_entry() code added in 0925ce4d49 (Add map_user()
and clear_mailmap() to mailmap, 2009-02-08) the intent was clearly to
clear the "me" structure, but while we freed parts of the
mailmap_entry structure, we didn't free the structure itself. The same
goes for the "mailmap_info" structure.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 mailmap.c          | 2 ++
 t/t4203-mailmap.sh | 6 ++++++
 2 files changed, 8 insertions(+)

diff --git a/mailmap.c b/mailmap.c
index d1f7c0d272..e1c8736093 100644
--- a/mailmap.c
+++ b/mailmap.c
@@ -36,6 +36,7 @@ static void free_mailmap_info(void *p, const char *s)
 		 s, debug_str(mi->name), debug_str(mi->email));
 	free(mi->name);
 	free(mi->email);
+	free(mi);
 }
 
 static void free_mailmap_entry(void *p, const char *s)
@@ -51,6 +52,7 @@ static void free_mailmap_entry(void *p, const char *s)
 
 	me->namemap.strdup_strings = 1;
 	string_list_clear_func(&me->namemap, free_mailmap_info);
+	free(me);
 }
 
 /*
diff --git a/t/t4203-mailmap.sh b/t/t4203-mailmap.sh
index 0b2d21ec55..c7de4299cf 100755
--- a/t/t4203-mailmap.sh
+++ b/t/t4203-mailmap.sh
@@ -79,6 +79,12 @@ test_expect_success 'check-mailmap bogus contact --stdin' '
 	test_must_fail git check-mailmap --stdin bogus </dev/null
 '
 
+if test_have_prereq SANITIZE_LEAK
+then
+	skip_all='skipping the rest of mailmap tests under SANITIZE_LEAK'
+	test_done
+fi
+
 test_expect_success 'No mailmap' '
 	cat >expect <<-EOF &&
 	$GIT_AUTHOR_NAME (1):
-- 
2.32.0.853.g5a570c9bf9



* Re: [PATCH v2 1/4] tests: add a test mode for SANITIZE=leak, run it in CI
  2021-07-14 17:23     ` [PATCH v2 1/4] tests: " Ævar Arnfjörð Bjarmason
@ 2021-07-14 18:42       ` Andrzej Hunt
  2021-07-14 22:39         ` Ævar Arnfjörð Bjarmason
  2021-07-15 21:14         ` Jeff King
  2021-07-15 21:06       ` Jeff King
  1 sibling, 2 replies; 125+ messages in thread
From: Andrzej Hunt @ 2021-07-14 18:42 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason, git
  Cc: Junio C Hamano, Jeff King, Lénaïc Huard,
	Derrick Stolee, Felipe Contreras, SZEDER Gábor,
	Đoàn Trần Công Danh, Eric Sunshine



On 14/07/2021 19:23, Ævar Arnfjörð Bjarmason wrote:
> While git can be compiled with SANITIZE=leak there has been no
> corresponding GIT_TEST_* mode for it, i.e. memory leaks have been
> fixed as one-offs without structured regression testing.
> 
> This change adds such a mode. We now have new
> linux-{clang,gcc}-sanitize-leak CI targets, which run the same
> tests as linux-{clang,gcc}, except that almost all of them are
> skipped.
> 
> There is a whitelist of some tests that are OK in test-lib.sh, and
> individual tests can be opted-in by setting
> GIT_TEST_SANITIZE_LEAK=true before sourcing test-lib.sh. Within those
> individual tests can be skipped with the "!SANITIZE_LEAK"
> prerequisite. See the updated t/README for more details.
> 
> I'm using the GIT_TEST_SANITIZE_LEAK=true and !SANITIZE_LEAK pattern
> in a couple of tests whose memory leaks I'll fix in subsequent
> commits.
> 
> I'm not being aggressive about opting in tests, it's not all tests
> that currently pass under SANITIZE=leak, just a small number of
> known-good tests. We can add more later as we fix leaks and grow more
> confident in this test mode.
> 
> See the recent discussion at [1] about the lack of this sort of test
> mode, and 0e5bba53af (add UNLEAK annotation for reducing leak false
> positives, 2017-09-08) for the initial addition of SANITIZE=leak.
> 
> See also 09595ab381 (Merge branch 'jk/leak-checkers', 2017-09-19),
> 7782066f67 (Merge branch 'jk/apache-lsan', 2019-05-19) and the recent
> 936e58851a (Merge branch 'ah/plugleaks', 2021-05-07) for some of the
> past history of "one-off" SANITIZE=leak (and more) fixes.
> 
> When calling maybe_skip_all_sanitize_leak matching against
> "$TEST_NAME" instead of "$this_test" as other "match_pattern_list()"
> users do is intentional. I'd like to match things like "t13*config*"
> in subsequent commits. This part of the API isn't public, so we can
> freely change it in the future.
> 
> 1. https://lore.kernel.org/git/87czsv2idy.fsf@evledraar.gmail.com/
> 
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>   .github/workflows/main.yml |  6 ++++
>   Makefile                   |  5 ++++
>   ci/install-dependencies.sh |  4 +--
>   ci/lib.sh                  | 18 ++++++++----
>   ci/run-build-and-tests.sh  |  4 +--
>   t/README                   | 16 ++++++++++
>   t/t5701-git-serve.sh       |  2 +-
>   t/test-lib.sh              | 60 ++++++++++++++++++++++++++++++++++++++
>   8 files changed, 105 insertions(+), 10 deletions(-)
> 
> diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
> index 73856bafc9..752fe187f9 100644
> --- a/.github/workflows/main.yml
> +++ b/.github/workflows/main.yml
> @@ -297,6 +297,12 @@ jobs:
>             - jobname: linux-gcc-default
>               cc: gcc
>               pool: ubuntu-latest
> +          - jobname: linux-clang-sanitize-leak
> +            cc: clang
> +            pool: ubuntu-latest
> +          - jobname: linux-gcc-sanitize-leak
> +            cc: gcc
> +            pool: ubuntu-latest

Is there any advantage to running leak checking with both gcc and clang? 
My understanding is that you end up using the same sanitiser 
implementation under the hood - I can't remember if using a different 
compiler actually helps find different leaks though.

My other question is: if we are adding a new job - should it really be 
just a leak checking job? Leak checking is just a subset of ASAN 
(Address Sanitizer). And as discussed at [1] it's possible to run ASAN 
and UBSAN (Undefined Behaviour Sanitizer) in the same build. I feel like 
it's much more useful to first add a combined ASAN+UBSAN job, followed 
by enabling leak-checking as part of ASAN in those jobs for known 
leak-free tests - as opposed to only adding leak checking. We currently 
disable Leak checking for ASAN here [2], but that could be made 
conditional on the test ID (i.e. check an allowlist to enable leak 
checking for some tests)?

I think it's worth focusing on ASAN+UBSAN first because they tend to 
find more impactful issues (e.g. buffer overflows, and other real bugs) 
- whereas leaks... are ugly, but leaks in git don't actually have much 
user impact?

[1] 
https://lore.kernel.org/git/YMI%2Fg1sHxJgb8%2FYD@coredump.intra.peff.net/

[2] https://git.kernel.org/pub/scm/git/git.git/tree/t/test-lib.sh#n44

>       env:
>         CC: ${{matrix.vector.cc}}
>         jobname: ${{matrix.vector.jobname}}
> diff --git a/Makefile b/Makefile
> index 502e0c9a81..d4cad5136f 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -1216,6 +1216,9 @@ PTHREAD_CFLAGS =
>   SPARSE_FLAGS ?=
>   SP_EXTRA_FLAGS = -Wno-universal-initializer
>   
> +# For informing GIT-BUILD-OPTIONS of the SANITIZE=leak target
> +SANITIZE_LEAK =
> +
>   # For the 'coccicheck' target; setting SPATCH_BATCH_SIZE higher will
>   # usually result in less CPU usage at the cost of higher peak memory.
>   # Setting it to 0 will feed all files in a single spatch invocation.
> @@ -1260,6 +1263,7 @@ BASIC_CFLAGS += -DSHA1DC_FORCE_ALIGNED_ACCESS
>   endif
>   ifneq ($(filter leak,$(SANITIZERS)),)
>   BASIC_CFLAGS += -DSUPPRESS_ANNOTATED_LEAKS
> +SANITIZE_LEAK = YesCompiledWithIt
>   endif
>   ifneq ($(filter address,$(SANITIZERS)),)
>   NO_REGEX = NeededForASAN
> @@ -2793,6 +2797,7 @@ GIT-BUILD-OPTIONS: FORCE
>   	@echo NO_UNIX_SOCKETS=\''$(subst ','\'',$(subst ','\'',$(NO_UNIX_SOCKETS)))'\' >>$@+
>   	@echo PAGER_ENV=\''$(subst ','\'',$(subst ','\'',$(PAGER_ENV)))'\' >>$@+
>   	@echo DC_SHA1=\''$(subst ','\'',$(subst ','\'',$(DC_SHA1)))'\' >>$@+
> +	@echo SANITIZE_LEAK=\''$(subst ','\'',$(subst ','\'',$(SANITIZE_LEAK)))'\' >>$@+
>   	@echo X=\'$(X)\' >>$@+
>   ifdef TEST_OUTPUT_DIRECTORY
>   	@echo TEST_OUTPUT_DIRECTORY=\''$(subst ','\'',$(subst ','\'',$(TEST_OUTPUT_DIRECTORY)))'\' >>$@+
> diff --git a/ci/install-dependencies.sh b/ci/install-dependencies.sh
> index 67852d0d37..8ac72d7246 100755
> --- a/ci/install-dependencies.sh
> +++ b/ci/install-dependencies.sh
> @@ -12,13 +12,13 @@ UBUNTU_COMMON_PKGS="make libssl-dev libcurl4-openssl-dev libexpat-dev
>    libemail-valid-perl libio-socket-ssl-perl libnet-smtp-ssl-perl"
>   
>   case "$jobname" in
> -linux-clang|linux-gcc)
> +linux-clang|linux-gcc|linux-clang-sanitize-leak|linux-gcc-sanitize-leak)

How about `linux-clang*|linux-gcc*)` here and below?

>   	sudo apt-add-repository -y "ppa:ubuntu-toolchain-r/test"
>   	sudo apt-get -q update
>   	sudo apt-get -q -y install language-pack-is libsvn-perl apache2 \
>   		$UBUNTU_COMMON_PKGS
>   	case "$jobname" in
> -	linux-gcc)
> +	linux-gcc|linux-gcc-sanitize-leak)
>   		sudo apt-get -q -y install gcc-8
>   		;;
>   	esac
> diff --git a/ci/lib.sh b/ci/lib.sh
> index 476c3f369f..bb02b5abf4 100755
> --- a/ci/lib.sh
> +++ b/ci/lib.sh
> @@ -183,14 +183,16 @@ export GIT_TEST_CLONE_2GB=true
>   export SKIP_DASHED_BUILT_INS=YesPlease
>   
>   case "$jobname" in
> -linux-clang|linux-gcc)
> -	if [ "$jobname" = linux-gcc ]
> -	then
> +linux-clang|linux-gcc|linux-clang-sanitize-leak|linux-gcc-sanitize-leak)
> +	case "$jobname" in
> +	linux-gcc|linux-gcc-sanitize-leak)
>   		export CC=gcc-8
>   		MAKEFLAGS="$MAKEFLAGS PYTHON_PATH=/usr/bin/python3"
> -	else
> +		;;
> +	*)
>   		MAKEFLAGS="$MAKEFLAGS PYTHON_PATH=/usr/bin/python2"
> -	fi
> +		;;
> +	esac
>   
>   	export GIT_TEST_HTTPD=true
>   
> @@ -233,4 +235,10 @@ linux-musl)
>   	;;
>   esac
>   
> +case "$jobname" in
> +linux-clang-sanitize-leak|linux-gcc-sanitize-leak)
> +	export SANITIZE=leak
> +	;;
> +esac
> +

Have you considered doing this in the yaml job configuration instead? 
It's possible to set env-vars in yaml, although it will require some 
careful tweaking - here's an example where I'm setting different values 
for SANITIZE depending on job (you'd probably just have to set it to 
empty for the non leak-checking jobs):

https://github.com/ahunt/git/blob/master/.github/workflows/ahunt-sync-next2.yml#L51-L69

That does make the yaml more complex, but I think it's worth it to 
reduce the amount of special-casing elsewhere (and is also worth it if 
we ever add other sanitisers)?

>   MAKEFLAGS="$MAKEFLAGS CC=${CC:-cc}"
> diff --git a/ci/run-build-and-tests.sh b/ci/run-build-and-tests.sh
> index 3ce81ffee9..5fe047b5c6 100755
> --- a/ci/run-build-and-tests.sh
> +++ b/ci/run-build-and-tests.sh
> @@ -12,7 +12,7 @@ esac
>   
>   make
>   case "$jobname" in
> -linux-gcc)
> +linux-gcc|linux-gcc-sanitize-leak)
>   	export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
>   	make test
>   	export GIT_TEST_SPLIT_INDEX=yes
> @@ -29,7 +29,7 @@ linux-gcc)
>   	export GIT_TEST_CHECKOUT_WORKERS=2
>   	make test
>   	;;
> -linux-clang)
> +linux-clang|linux-clang-sanitize-leak)
>   	export GIT_TEST_DEFAULT_HASH=sha1
>   	make test
>   	export GIT_TEST_DEFAULT_HASH=sha256
> diff --git a/t/README b/t/README
> index 1a2072b2c8..303d0be817 100644
> --- a/t/README
> +++ b/t/README
> @@ -448,6 +448,22 @@ GIT_TEST_CHECKOUT_WORKERS=<n> overrides the 'checkout.workers' setting
>   to <n> and 'checkout.thresholdForParallelism' to 0, forcing the
>   execution of the parallel-checkout code.
>   
> +GIT_TEST_SANITIZE_LEAK=<boolean> will force the tests to run when git
> +is compiled with SANITIZE=leak (we pick it up via
> +../GIT-BUILD-OPTIONS).
> +
> +By default all tests are skipped when compiled with SANITIZE=leak, and
> +individual test scripts opt themselves in to leak testing by setting
> +GIT_TEST_SANITIZE_LEAK=true before sourcing test-lib.sh. Within those
> +tests use the SANITIZE_LEAK prerequisite to skip individual tests
> +(i.e. test_expect_success !SANITIZE_LEAK [...]).
> +
> +So the GIT_TEST_SANITIZE_LEAK setting is different in behavior from
> +both other GIT_TEST_*=[true|false] settings, but more useful given how
> +SANITIZE=leak works & the state of the test suite. Manually setting
> +GIT_TEST_SANITIZE_LEAK=true is only useful during development when
> +finding and fixing memory leaks.
> +
>   Naming Tests
>   ------------
>   
> diff --git a/t/t5701-git-serve.sh b/t/t5701-git-serve.sh
> index 930721f053..d58efb0aa9 100755
> --- a/t/t5701-git-serve.sh
> +++ b/t/t5701-git-serve.sh
> @@ -243,7 +243,7 @@ test_expect_success 'unexpected lines are not allowed in fetch request' '
>   
>   # Test the basics of object-info
>   #
> -test_expect_success 'basics of object-info' '
> +test_expect_success !SANITIZE_LEAK 'basics of object-info' '
>   	test-tool pkt-line pack >in <<-EOF &&
>   	command=object-info
>   	object-format=$(test_oid algo)
> diff --git a/t/test-lib.sh b/t/test-lib.sh
> index 7036f83b33..9201510e16 100644
> --- a/t/test-lib.sh
> +++ b/t/test-lib.sh
> @@ -1353,6 +1353,40 @@ then
>   	exit 1
>   fi
>   
> +# SANITIZE=leak test mode
> +sanitize_leak_true=
> +add_sanitize_leak_true () {
> +	sanitize_leak_true="$sanitize_leak_true$1 "
> +}
> +
> +sanitize_leak_false=
> +add_sanitize_leak_false () {
> +	sanitize_leak_false="$sanitize_leak_false$1 "
> +}
> +
> +sanitize_leak_opt_in_msg="opt-in with GIT_TEST_SANITIZE_LEAK=true"
> +maybe_skip_all_sanitize_leak () {
> +	# Whitelist patterns
> +	add_sanitize_leak_true 't000*'
> +	add_sanitize_leak_true 't001*'
> +	add_sanitize_leak_true 't006*'
> +
> +	# Blacklist patterns (overrides whitelist)
> +	add_sanitize_leak_false 't000[469]*'
> +	add_sanitize_leak_false 't001[2459]*'
> +	add_sanitize_leak_false 't006[0248]*'
> +
> +	if match_pattern_list "$1" "$sanitize_leak_false"
> +	then
> +		skip_all="test $this_test on SANITIZE=leak blacklist, $sanitize_leak_opt_in_msg"
> +		test_done
> +	elif match_pattern_list "$1" "$sanitize_leak_true"
> +	then
> +		return 0
> +	fi
> +	return 1
> +}
> +
>   # Are we running this test at all?
>   remove_trash=
>   this_test=${0##*/}
> @@ -1364,6 +1398,31 @@ then
>   	test_done
>   fi
>   
> +# Aggressively skip non-whitelisted tests when compiled with
> +# SANITIZE=leak
> +if test -n "$SANITIZE_LEAK"
> +then
> +	if test -z "$GIT_TEST_SANITIZE_LEAK" &&
> +		maybe_skip_all_sanitize_leak "$TEST_NAME"
> +	then
> +		say_color info >&3 "test $this_test on SANITIZE=leak whitelist"
> +		GIT_TEST_SANITIZE_LEAK=true
> +	fi
> +
> +	# We need to see it in "git env--helper" (via
> +	# test_bool_env)
> +	export GIT_TEST_SANITIZE_LEAK
> +
> +	if ! test_bool_env GIT_TEST_SANITIZE_LEAK false
> +	then
> +		skip_all="skip all tests in $this_test under SANITIZE=leak, $sanitize_leak_opt_in_msg"
> +		test_done
> +	fi
> +elif test_bool_env GIT_TEST_SANITIZE_LEAK false
> +then
> +	error "GIT_TEST_SANITIZE_LEAK=true has no effect except when compiled with SANITIZE=leak"
> +fi
> +
>   # Last-minute variable setup
>   HOME="$TRASH_DIRECTORY"
>   GNUPGHOME="$HOME/gnupg-home-not-used"
> @@ -1516,6 +1575,7 @@ test -z "$NO_PYTHON" && test_set_prereq PYTHON
>   test -n "$USE_LIBPCRE2" && test_set_prereq PCRE
>   test -n "$USE_LIBPCRE2" && test_set_prereq LIBPCRE2
>   test -z "$NO_GETTEXT" && test_set_prereq GETTEXT
> +test -n "$SANITIZE_LEAK" && test_set_prereq SANITIZE_LEAK
>   
>   if test -z "$GIT_TEST_CHECK_CACHE_TREE"
>   then
> 


* Re: [PATCH v2 2/4] SANITIZE tests: fix memory leaks in t13*config*, add to whitelist
  2021-07-14 17:23     ` [PATCH v2 2/4] SANITIZE tests: fix memory leaks in t13*config*, add to whitelist Ævar Arnfjörð Bjarmason
@ 2021-07-14 18:57       ` Andrzej Hunt
  2021-07-14 22:56         ` Ævar Arnfjörð Bjarmason
  2021-07-15 21:42         ` Jeff King
  0 siblings, 2 replies; 125+ messages in thread
From: Andrzej Hunt @ 2021-07-14 18:57 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason, git
  Cc: Junio C Hamano, Jeff King, Lénaïc Huard,
	Derrick Stolee, Felipe Contreras, SZEDER Gábor,
	Đoàn Trần Công Danh, Eric Sunshine



On 14/07/2021 19:23, Ævar Arnfjörð Bjarmason wrote:
> Fix a couple of trivial memory leaks introduced in 3efd0bedc6 (config:
> add conditional include, 2017-03-01) and my own 867ad08a26 (hooks:
> allow customizing where the hook directory is, 2016-05-04).
> 
> In the latter case the "fix" is UNLEAK() on the global variable. This
> allows us to run all t13*config* tests under SANITIZE=leak.
> 
> With this change we can now run almost the whole set of config.c
> tests (t13*config) under SANITIZE=leak, so let's do so, with a few
> exceptions:
> 
>   * The test added in ce81b1da23 (config: add new way to pass config
>     via `--config-env`, 2021-01-12) fails in GitHub CI, but passes
>     for me locally. Let's just skip it for now.
> 
>   * Ditto the split_cmdline and "aliases of builtins" tests. The former
>     required splitting up an existing test; there was an issue with the
>     test that would also have been revealed by skipping it.
> 
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>   config.c          | 17 ++++++++++++-----
>   t/t1300-config.sh | 16 ++++++++++------
>   t/test-lib.sh     |  1 +
>   3 files changed, 23 insertions(+), 11 deletions(-)
> 
> diff --git a/config.c b/config.c
> index f9c400ad30..38e132c0e2 100644
> --- a/config.c
> +++ b/config.c
> @@ -138,8 +138,10 @@ static int handle_path_include(const char *path, struct config_include_data *inc
>   		return config_error_nonbool("include.path");
>   
>   	expanded = expand_user_path(path, 0);
> -	if (!expanded)
> -		return error(_("could not expand include path '%s'"), path);
> +	if (!expanded) {
> +		ret = error(_("could not expand include path '%s'"), path);
> +		goto cleanup;
> +	}
>   	path = expanded;
>   
>   	/*
> @@ -149,8 +151,10 @@ static int handle_path_include(const char *path, struct config_include_data *inc
>   	if (!is_absolute_path(path)) {
>   		char *slash;
>   
> -		if (!cf || !cf->path)
> -			return error(_("relative config includes must come from files"));
> +		if (!cf || !cf->path) {
> +			ret = error(_("relative config includes must come from files"));
> +			goto cleanup;
> +		}
>   
>   		slash = find_last_dir_sep(cf->path);
>   		if (slash)
> @@ -168,6 +172,7 @@ static int handle_path_include(const char *path, struct config_include_data *inc
>   		ret = git_config_from_file(git_config_include, path, inc);
>   		inc->depth--;
>   	}
> +cleanup:
>   	strbuf_release(&buf);
>   	free(expanded);
>   	return ret;
> @@ -1331,8 +1336,10 @@ static int git_default_core_config(const char *var, const char *value, void *cb)
>   	if (!strcmp(var, "core.attributesfile"))
>   		return git_config_pathname(&git_attributes_file, var, value);
>   
> -	if (!strcmp(var, "core.hookspath"))
> +	if (!strcmp(var, "core.hookspath")) {
> +		UNLEAK(git_hooks_path);
>   		return git_config_pathname(&git_hooks_path, var, value);
> +	}

Why is the UNLEAK necessary here? We generally want to limit use of 
UNLEAK to cmd_* functions or direct helpers. git_default_core_config() 
seems generic enough that it could be called from anywhere, and using 
UNLEAK here means we're potentially masking a real leak?

IIUC the leak here happens because:
- git_hooks_path is a global variable - hence it's unlikely we'd ever
   bother cleaning it up.
- git_default_core_config() gets called a first time with
   core.hookspath, and we end up allocating new memory into
   git_hooks_path.
- git_default_core_config() gets called again with core.hookspath,
   and we overwrite git_hooks_path with a new string which leaks
   the string that git_hooks_path used to point to.

So I think the real fix is to free(git_hooks_path) instead of an UNLEAK? 
(Looking at the surrounding code, it looks like the same pattern of leak 
might be repeated for other similar globals - is it worth auditing those 
while we're here?)
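
To make the overwrite pattern described above concrete, here is a
minimal standalone sketch. It is not git's actual code: hooks_path,
dup_string(), set_path_leaky() and set_path_fixed() are hypothetical
names, and dup_string() merely stands in for whatever allocation
git_config_pathname() ends up doing.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a long-lived global such as git_hooks_path. */
char *hooks_path;

/* Portable strdup() replacement so the sketch needs only ISO C. */
char *dup_string(const char *s)
{
	size_t n = strlen(s) + 1;
	char *copy = malloc(n);

	if (copy)
		memcpy(copy, s, n);
	return copy;
}

/* Leaky: a second call overwrites *dest and the old string is lost. */
int set_path_leaky(char **dest, const char *value)
{
	*dest = dup_string(value);
	return *dest ? 0 : -1;
}

/* Fixed: release the previous value before storing the new one. */
int set_path_fixed(char **dest, const char *value)
{
	char *copy = dup_string(value);

	if (!copy)
		return -1;
	free(*dest); /* free(NULL) is a no-op, so the first call is fine */
	*dest = copy;
	return 0;
}
```

Calling set_path_fixed(&hooks_path, ...) twice leaves only the second
string allocated and still reachable through the global. With the leaky
variant the first string becomes unreachable after the second call,
which I'd expect a leak checker to report.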

>   
>   	if (!strcmp(var, "core.bare")) {
>   		is_bare_repository_cfg = git_config_bool(var, value);
> diff --git a/t/t1300-config.sh b/t/t1300-config.sh
> index 9ff46f3b04..93ad0f4887 100755
> --- a/t/t1300-config.sh
> +++ b/t/t1300-config.sh
> @@ -1050,12 +1050,16 @@ test_expect_success SYMLINKS 'symlink to nonexistent configuration' '
>   	test_must_fail git config --file=linktolinktonada --list
>   '
>   
> -test_expect_success 'check split_cmdline return' "
> -	git config alias.split-cmdline-fix 'echo \"' &&
> -	test_must_fail git split-cmdline-fix &&
> +test_expect_success 'setup check split_cmdline return' "
>   	echo foo > foo &&
>   	git add foo &&
> -	git commit -m 'initial commit' &&
> +	git commit -m 'initial commit'
> +"
> +
> +test_expect_success !SANITIZE_LEAK 'check split_cmdline return' "
> +	git config alias.split-cmdline-fix 'echo \"' &&
> +	test_must_fail git split-cmdline-fix &&
> +
>   	git config branch.main.mergeoptions 'echo \"' &&
>   	test_must_fail git merge main
>   "
> @@ -1101,7 +1105,7 @@ test_expect_success 'key sanity-checking' '
>   	git config foo."ba =z".bar false
>   '
>   
> -test_expect_success 'git -c works with aliases of builtins' '
> +test_expect_success !SANITIZE_LEAK 'git -c works with aliases of builtins' '
>   	git config alias.checkconfig "-c foo.check=bar config foo.check" &&
>   	echo bar >expect &&
>   	git checkconfig >actual &&
> @@ -1397,7 +1401,7 @@ test_expect_success 'git --config-env with missing value' '
>   	grep "invalid config format: config" error
>   '
>   
> -test_expect_success 'git --config-env fails with invalid parameters' '
> +test_expect_success !SANITIZE_LEAK 'git --config-env fails with invalid parameters' '
>   	test_must_fail git --config-env=foo.flag config --bool foo.flag 2>error &&
>   	test_i18ngrep "invalid config format: foo.flag" error &&
>   	test_must_fail git --config-env=foo.flag= config --bool foo.flag 2>error &&
> diff --git a/t/test-lib.sh b/t/test-lib.sh
> index 9201510e16..98e20950c3 100644
> --- a/t/test-lib.sh
> +++ b/t/test-lib.sh
> @@ -1370,6 +1370,7 @@ maybe_skip_all_sanitize_leak () {
>   	add_sanitize_leak_true 't000*'
>   	add_sanitize_leak_true 't001*'
>   	add_sanitize_leak_true 't006*'
> +	add_sanitize_leak_true 't13*config*'
>   
>   	# Blacklist patterns (overrides whitelist)
>   	add_sanitize_leak_false 't000[469]*'
> 


* Re: [PATCH v2 1/4] tests: add a test mode for SANITIZE=leak, run it in CI
  2021-07-14 18:42       ` Andrzej Hunt
@ 2021-07-14 22:39         ` Ævar Arnfjörð Bjarmason
  2021-07-15 21:14         ` Jeff King
  1 sibling, 0 replies; 125+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-14 22:39 UTC (permalink / raw)
  To: Andrzej Hunt
  Cc: git, Junio C Hamano, Jeff King, Lénaïc Huard,
	Derrick Stolee, Felipe Contreras, SZEDER Gábor,
	Đoàn Trần Công Danh, Eric Sunshine


On Wed, Jul 14 2021, Andrzej Hunt wrote:

> On 14/07/2021 19:23, Ævar Arnfjörð Bjarmason wrote:
>> While git can be compiled with SANITIZE=leak there has been no
>> corresponding GIT_TEST_* mode for it, i.e. memory leaks have been
>> fixed as one-offs without structured regression testing.
>> This change adds such a mode. We now have new
>> linux-{clang,gcc}-sanitize-leak CI targets, which run the same
>> tests as linux-{clang,gcc}, except that almost all of them are
>> skipped.
>> There is a whitelist of some tests that are OK in test-lib.sh, and
>> individual tests can be opted-in by setting
>> GIT_TEST_SANITIZE_LEAK=true before sourcing test-lib.sh. Within those
>> individual tests can be skipped with the "!SANITIZE_LEAK"
>> prerequisite. See the updated t/README for more details.
>> I'm using the GIT_TEST_SANITIZE_LEAK=true and !SANITIZE_LEAK pattern
>> in a couple of tests whose memory leaks I'll fix in subsequent
>> commits.
>> I'm not being aggressive about opting in tests, it's not all tests
>> that currently pass under SANITIZE=leak, just a small number of
>> known-good tests. We can add more later as we fix leaks and grow more
>> confident in this test mode.
>> See the recent discussion at [1] about the lack of this sort of test
>> mode, and 0e5bba53af (add UNLEAK annotation for reducing leak false
>> positives, 2017-09-08) for the initial addition of SANITIZE=leak.
>> See also 09595ab381 (Merge branch 'jk/leak-checkers', 2017-09-19),
>> 7782066f67 (Merge branch 'jk/apache-lsan', 2019-05-19) and the recent
>> 936e58851a (Merge branch 'ah/plugleaks', 2021-05-07) for some of the
>> past history of "one-off" SANITIZE=leak (and more) fixes.
>> When calling maybe_skip_all_sanitize_leak matching against
>> "$TEST_NAME" instead of "$this_test" as other "match_pattern_list()"
>> users do is intentional. I'd like to match things like "t13*config*"
>> in subsequent commits. This part of the API isn't public, so we can
>> freely change it in the future.
>> 1. https://lore.kernel.org/git/87czsv2idy.fsf@evledraar.gmail.com/
>> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
>> ---
>>   .github/workflows/main.yml |  6 ++++
>>   Makefile                   |  5 ++++
>>   ci/install-dependencies.sh |  4 +--
>>   ci/lib.sh                  | 18 ++++++++----
>>   ci/run-build-and-tests.sh  |  4 +--
>>   t/README                   | 16 ++++++++++
>>   t/t5701-git-serve.sh       |  2 +-
>>   t/test-lib.sh              | 60 ++++++++++++++++++++++++++++++++++++++
>>   8 files changed, 105 insertions(+), 10 deletions(-)
>> diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
>> index 73856bafc9..752fe187f9 100644
>> --- a/.github/workflows/main.yml
>> +++ b/.github/workflows/main.yml
>> @@ -297,6 +297,12 @@ jobs:
>>             - jobname: linux-gcc-default
>>               cc: gcc
>>               pool: ubuntu-latest
>> +          - jobname: linux-clang-sanitize-leak
>> +            cc: clang
>> +            pool: ubuntu-latest
>> +          - jobname: linux-gcc-sanitize-leak
>> +            cc: gcc
>> +            pool: ubuntu-latest
>
> Is there any advantage to running leak checking with both gcc and
> clang? My understanding is that you end up using the same sanitiser 
> implementation under the hood - I can't remember if using a different
> compiler actually helps find different leaks though.

I didn't know that, makes sense. I'll make it one job and have it use
whatever CC is.

> My other question is: if we are adding a new job - should it really be
> just a leak checking job? Leak checking is just a subset of ASAN 
> (Address Sanitizer). And as discussed at [1] it's possible to run ASAN
> and UBSAN (Undefined Behaviour Sanitizer) in the same build. I feel
> like it's much more useful to first add a combined ASAN+UBSAN job,
> followed by enabling leak-checking as part of ASAN in those jobs for
> known leak-free tests - as opposed to only adding leak checking. We
> currently disable Leak checking for ASAN here [2], but that could be
> made conditional on the test ID (i.e. check an allowlist to enable
> leak checking for some tests)?

It sounds good to support that, but at least right now I've got the itch
of finding leaks during development, and I think in any case being able
to do a full run with just sanitizing, leak checking (or combined) makes
sense, i.e. to make GIT_TEST_SANITIZE_LEAK=* the top-level interface.

I haven't checked how noisy ASAN is, is it like the leak checking where
we fail almost all tests now?

Anyway, once we have some test mode like this it'll be trivial to extend
it. I mainly want us to get this into CI so we can have an expanding
line in the sand with regressions.

> I think it's worth focusing on ASAN+UBSAN first because they tend to
> find more impactful issues (e.g. buffer overflows, and other real
> bugs) - whereas leaks... are ugly, but leaks in git don't actually
> have much user impact?

We have one-off commands, but also long-lived things like "git cat-file
--batch", it's useful if we don't leak in those.

The entry point to those tends to be one-off commands in tests, so
checking leaks for all commands in a test (if you can get there) is a
useful indicator for how the underlying API performs.

I think in the git.git codebase we don't have much of an issue with
buffer overflows etc, because we tend to consistently use APIs like
strbuf that avoid those issues, but then again I haven't run the tests
with that, maybe I'll be unpleasantly surprised.

I also find leak checking to be useful during development to spot faulty
assumptions, i.e. the leak itself may not be a big deal, but it's
usually an early sign that I'm structuring something incorrectly.

> [1]
> https://lore.kernel.org/git/YMI%2Fg1sHxJgb8%2FYD@coredump.intra.peff.net/
>
> [2] https://git.kernel.org/pub/scm/git/git.git/tree/t/test-lib.sh#n44
>
>>       env:
>>         CC: ${{matrix.vector.cc}}
>>         jobname: ${{matrix.vector.jobname}}
>> diff --git a/Makefile b/Makefile
>> index 502e0c9a81..d4cad5136f 100644
>> --- a/Makefile
>> +++ b/Makefile
>> @@ -1216,6 +1216,9 @@ PTHREAD_CFLAGS =
>>   SPARSE_FLAGS ?=
>>   SP_EXTRA_FLAGS = -Wno-universal-initializer
>>   +# For informing GIT-BUILD-OPTIONS of the SANITIZE=leak target
>> +SANITIZE_LEAK =
>> +
>>   # For the 'coccicheck' target; setting SPATCH_BATCH_SIZE higher will
>>   # usually result in less CPU usage at the cost of higher peak memory.
>>   # Setting it to 0 will feed all files in a single spatch invocation.
>> @@ -1260,6 +1263,7 @@ BASIC_CFLAGS += -DSHA1DC_FORCE_ALIGNED_ACCESS
>>   endif
>>   ifneq ($(filter leak,$(SANITIZERS)),)
>>   BASIC_CFLAGS += -DSUPPRESS_ANNOTATED_LEAKS
>> +SANITIZE_LEAK = YesCompiledWithIt
>>   endif
>>   ifneq ($(filter address,$(SANITIZERS)),)
>>   NO_REGEX = NeededForASAN
>> @@ -2793,6 +2797,7 @@ GIT-BUILD-OPTIONS: FORCE
>>   	@echo NO_UNIX_SOCKETS=\''$(subst ','\'',$(subst ','\'',$(NO_UNIX_SOCKETS)))'\' >>$@+
>>   	@echo PAGER_ENV=\''$(subst ','\'',$(subst ','\'',$(PAGER_ENV)))'\' >>$@+
>>   	@echo DC_SHA1=\''$(subst ','\'',$(subst ','\'',$(DC_SHA1)))'\' >>$@+
>> +	@echo SANITIZE_LEAK=\''$(subst ','\'',$(subst ','\'',$(SANITIZE_LEAK)))'\' >>$@+
>>   	@echo X=\'$(X)\' >>$@+
>>   ifdef TEST_OUTPUT_DIRECTORY
>>   	@echo TEST_OUTPUT_DIRECTORY=\''$(subst ','\'',$(subst ','\'',$(TEST_OUTPUT_DIRECTORY)))'\' >>$@+
>> diff --git a/ci/install-dependencies.sh b/ci/install-dependencies.sh
>> index 67852d0d37..8ac72d7246 100755
>> --- a/ci/install-dependencies.sh
>> +++ b/ci/install-dependencies.sh
>> @@ -12,13 +12,13 @@ UBUNTU_COMMON_PKGS="make libssl-dev libcurl4-openssl-dev libexpat-dev
>>    libemail-valid-perl libio-socket-ssl-perl libnet-smtp-ssl-perl"
>>     case "$jobname" in
>> -linux-clang|linux-gcc)
>> +linux-clang|linux-gcc|linux-clang-sanitize-leak|linux-gcc-sanitize-leak)
>
> How about `linux-clang*|linux-gcc*)` here and below?

I did that in v1, but as Đoàn Trần Công Danh pointed out, we have other
jobs that would match that.

>>   	sudo apt-add-repository -y "ppa:ubuntu-toolchain-r/test"
>>   	sudo apt-get -q update
>>   	sudo apt-get -q -y install language-pack-is libsvn-perl apache2 \
>>   		$UBUNTU_COMMON_PKGS
>>   	case "$jobname" in
>> -	linux-gcc)
>> +	linux-gcc|linux-gcc-sanitize-leak)
>>   		sudo apt-get -q -y install gcc-8
>>   		;;
>>   	esac
>> diff --git a/ci/lib.sh b/ci/lib.sh
>> index 476c3f369f..bb02b5abf4 100755
>> --- a/ci/lib.sh
>> +++ b/ci/lib.sh
>> @@ -183,14 +183,16 @@ export GIT_TEST_CLONE_2GB=true
>>   export SKIP_DASHED_BUILT_INS=YesPlease
>>     case "$jobname" in
>> -linux-clang|linux-gcc)
>> -	if [ "$jobname" = linux-gcc ]
>> -	then
>> +linux-clang|linux-gcc|linux-clang-sanitize-leak|linux-gcc-sanitize-leak)
>> +	case "$jobname" in
>> +	linux-gcc|linux-gcc-sanitize-leak)
>>   		export CC=gcc-8
>>   		MAKEFLAGS="$MAKEFLAGS PYTHON_PATH=/usr/bin/python3"
>> -	else
>> +		;;
>> +	*)
>>   		MAKEFLAGS="$MAKEFLAGS PYTHON_PATH=/usr/bin/python2"
>> -	fi
>> +		;;
>> +	esac
>>     	export GIT_TEST_HTTPD=true
>>   @@ -233,4 +235,10 @@ linux-musl)
>>   	;;
>>   esac
>>   +case "$jobname" in
>> +linux-clang-sanitize-leak|linux-gcc-sanitize-leak)
>> +	export SANITIZE=leak
>> +	;;
>> +esac
>> +
>
> Have you considered doing this in the yaml job configuration instead?
> It's possible to set env-vars in yaml, although it will require some 
> careful tweaking - here's an example where I'm setting different
> values for SANITIZE depending on job (you'd probably just have to set
> it to empty for the non leak-checking jobs):
>
> https://github.com/ahunt/git/blob/master/.github/workflows/ahunt-sync-next2.yml#L51-L69
>
> That does make the yaml more complex, but I think it's worth it to
> reduce the amount of special-casing elsewhere (and is also worth it if 
> we ever add other sanitisers)?

I'm not too familiar with git.git's ci/* dir, but I think it's like that
in general because we don't just run in GitHub CI, but want to support
Azure, Travis etc.

So almost all of the logic is in those shellscripts. Right now this
target just runs in the GitHub CI, but it would be useful to make it
drop-in enabled elsewhere.

Given that, I think doing anything overly clever in the YAML would
probably be counter-productive.

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH v2 2/4] SANITIZE tests: fix memory leaks in t13*config*, add to whitelist
  2021-07-14 18:57       ` Andrzej Hunt
@ 2021-07-14 22:56         ` Ævar Arnfjörð Bjarmason
  2021-07-15 21:42         ` Jeff King
  1 sibling, 0 replies; 125+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-14 22:56 UTC (permalink / raw)
  To: Andrzej Hunt
  Cc: git, Junio C Hamano, Jeff King, Lénaïc Huard,
	Derrick Stolee, Felipe Contreras, SZEDER Gábor,
	Đoàn Trần Công Danh, Eric Sunshine


On Wed, Jul 14 2021, Andrzej Hunt wrote:

> On 14/07/2021 19:23, Ævar Arnfjörð Bjarmason wrote:
>> Fix a couple of trivial memory leaks introduced in 3efd0bedc6 (config:
>> add conditional include, 2017-03-01) and my own 867ad08a26 (hooks:
>> allow customizing where the hook directory is, 2016-05-04).
>> In the latter case the "fix" is UNLEAK() on the global variable. This
>> allows us to run all t13*config* tests under SANITIZE=leak.
>> With this change we can now run almost the whole set of config.c
>> tests (t13*config) under SANITIZE=leak, so let's do so, with a few
>> exceptions:
>>   * The test added in ce81b1da23 (config: add new way to pass config
>>     via `--config-env`, 2021-01-12), it fails in GitHub CI, but passes
>>     for me locally. Let's just skip it for now.
>>   * Ditto the split_cmdline and "aliases of builtins" tests, the former
>>     required splitting up an existing test, there is an issue with the
>>     test that would have also been revealed by skipping it.
>> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
>> ---
>>   config.c          | 17 ++++++++++++-----
>>   t/t1300-config.sh | 16 ++++++++++------
>>   t/test-lib.sh     |  1 +
>>   3 files changed, 23 insertions(+), 11 deletions(-)
>> diff --git a/config.c b/config.c
>> index f9c400ad30..38e132c0e2 100644
>> --- a/config.c
>> +++ b/config.c
>> @@ -138,8 +138,10 @@ static int handle_path_include(const char *path, struct config_include_data *inc
>>   		return config_error_nonbool("include.path");
>>     	expanded = expand_user_path(path, 0);
>> -	if (!expanded)
>> -		return error(_("could not expand include path '%s'"), path);
>> +	if (!expanded) {
>> +		ret = error(_("could not expand include path '%s'"), path);
>> +		goto cleanup;
>> +	}
>>   	path = expanded;
>>     	/*
>> @@ -149,8 +151,10 @@ static int handle_path_include(const char *path, struct config_include_data *inc
>>   	if (!is_absolute_path(path)) {
>>   		char *slash;
>>   -		if (!cf || !cf->path)
>> -			return error(_("relative config includes must come from files"));
>> +		if (!cf || !cf->path) {
>> +			ret = error(_("relative config includes must come from files"));
>> +			goto cleanup;
>> +		}
>>     		slash = find_last_dir_sep(cf->path);
>>   		if (slash)
>> @@ -168,6 +172,7 @@ static int handle_path_include(const char *path, struct config_include_data *inc
>>   		ret = git_config_from_file(git_config_include, path, inc);
>>   		inc->depth--;
>>   	}
>> +cleanup:
>>   	strbuf_release(&buf);
>>   	free(expanded);
>>   	return ret;
>> @@ -1331,8 +1336,10 @@ static int git_default_core_config(const char *var, const char *value, void *cb)
>>   	if (!strcmp(var, "core.attributesfile"))
>>   		return git_config_pathname(&git_attributes_file, var, value);
>>   -	if (!strcmp(var, "core.hookspath"))
>> +	if (!strcmp(var, "core.hookspath")) {
>> +		UNLEAK(git_hooks_path);
>>   		return git_config_pathname(&git_hooks_path, var, value);
>> +	}
>
> Why is the UNLEAK necessary here? We generally want to limit use of
> UNLEAK to cmd_* functions or direct helpers. git_default_core_config() 
> seems generic enough that it could be called from anywhere, and using
> UNLEAK here means we're potentially masking a real leak?
>
> IIUC the leak here happens because:
> - git_hooks_path is a global variable - hence it's unlikely we'd ever
>   bother cleaning it up.
> - git_default_core_config() gets called a first time with
>   core.hookspath, and we end up allocating new memory into
>   git_hooks_path.
> - git_default_core_config() gets called again with core.hookspath,
>   and we overwrite git_hooks_path with a new string which leaks
>   the string that git_hooks_path used to point to.
>
> So I think the real fix is to free(git_hooks_path) instead of an
> UNLEAK? (Looking at the surrounding code, it looks like the same
> pattern of leak might be repeated for other similar globals - is it
> worth auditing those while we're here?)

Good point, I'll fix that.

I was doing this rather blindly to see if I could get this large batch of
tests to pass with some minimal fixes/whitelisting of some "known
bad".

>>     	if (!strcmp(var, "core.bare")) {
>>   		is_bare_repository_cfg = git_config_bool(var, value);
>> diff --git a/t/t1300-config.sh b/t/t1300-config.sh
>> index 9ff46f3b04..93ad0f4887 100755
>> --- a/t/t1300-config.sh
>> +++ b/t/t1300-config.sh
>> @@ -1050,12 +1050,16 @@ test_expect_success SYMLINKS 'symlink to nonexistent configuration' '
>>   	test_must_fail git config --file=linktolinktonada --list
>>   '
>>   -test_expect_success 'check split_cmdline return' "
>> -	git config alias.split-cmdline-fix 'echo \"' &&
>> -	test_must_fail git split-cmdline-fix &&
>> +test_expect_success 'setup check split_cmdline return' "
>>   	echo foo > foo &&
>>   	git add foo &&
>> -	git commit -m 'initial commit' &&
>> +	git commit -m 'initial commit'
>> +"
>> +
>> +test_expect_success !SANITIZE_LEAK 'check split_cmdline return' "
>> +	git config alias.split-cmdline-fix 'echo \"' &&
>> +	test_must_fail git split-cmdline-fix &&
>> +
>>   	git config branch.main.mergeoptions 'echo \"' &&
>>   	test_must_fail git merge main
>>   "
>> @@ -1101,7 +1105,7 @@ test_expect_success 'key sanity-checking' '
>>   	git config foo."ba =z".bar false
>>   '
>>   -test_expect_success 'git -c works with aliases of builtins' '
>> +test_expect_success !SANITIZE_LEAK 'git -c works with aliases of builtins' '
>>   	git config alias.checkconfig "-c foo.check=bar config foo.check" &&
>>   	echo bar >expect &&
>>   	git checkconfig >actual &&
>> @@ -1397,7 +1401,7 @@ test_expect_success 'git --config-env with missing value' '
>>   	grep "invalid config format: config" error
>>   '
>>   -test_expect_success 'git --config-env fails with invalid parameters' '
>> +test_expect_success !SANITIZE_LEAK 'git --config-env fails with invalid parameters' '
>>   	test_must_fail git --config-env=foo.flag config --bool foo.flag 2>error &&
>>   	test_i18ngrep "invalid config format: foo.flag" error &&
>>   	test_must_fail git --config-env=foo.flag= config --bool foo.flag 2>error &&
>> diff --git a/t/test-lib.sh b/t/test-lib.sh
>> index 9201510e16..98e20950c3 100644
>> --- a/t/test-lib.sh
>> +++ b/t/test-lib.sh
>> @@ -1370,6 +1370,7 @@ maybe_skip_all_sanitize_leak () {
>>   	add_sanitize_leak_true 't000*'
>>   	add_sanitize_leak_true 't001*'
>>   	add_sanitize_leak_true 't006*'
>> +	add_sanitize_leak_true 't13*config*'
>>     	# Blacklist patterns (overrides whitelist)
>>   	add_sanitize_leak_false 't000[469]*'
>> 


^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH v2 3/4] SANITIZE tests: fix memory leaks in t5701*, add to whitelist
  2021-07-14 17:23     ` [PATCH v2 3/4] SANITIZE tests: fix memory leaks in t5701*, " Ævar Arnfjörð Bjarmason
@ 2021-07-15 17:37       ` Andrzej Hunt
  2021-07-15 21:43       ` Jeff King
  2021-08-31 13:46       ` [PATCH] protocol-caps.c: fix memory leak in send_info() Ævar Arnfjörð Bjarmason
  2 siblings, 0 replies; 125+ messages in thread
From: Andrzej Hunt @ 2021-07-15 17:37 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason, git
  Cc: Junio C Hamano, Jeff King, Lénaïc Huard,
	Derrick Stolee, Felipe Contreras, SZEDER Gábor,
	Đoàn Trần Công Danh, Eric Sunshine



On 14/07/2021 19:23, Ævar Arnfjörð Bjarmason wrote:
> Fix a memory leak in a2ba162cda (object-info: support for retrieving
> object info, 2021-04-20) which appears to have been based on a
> misunderstanding of how the pkt-line.c API works, there is no need to
> strdup() input to, it's just a printf()-like format function.
> 
> This fixes a potentially large memory leak, since the number of OID
> lines the "object-info" call can be arbitrarily large (or a small one
> if the request is small).
> 
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>   protocol-caps.c      | 5 +++--
>   t/t5701-git-serve.sh | 1 +
>   2 files changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/protocol-caps.c b/protocol-caps.c
> index 13a9e63a04..901b6795e4 100644
> --- a/protocol-caps.c
> +++ b/protocol-caps.c
> @@ -69,9 +69,10 @@ static void send_info(struct repository *r, struct packet_writer *writer,
>   			}
>   		}
>   
> -		packet_writer_write(writer, "%s",
> -				    strbuf_detach(&send_buffer, NULL));
> +		packet_writer_write(writer, "%s", send_buffer.buf);
> +		strbuf_reset(&send_buffer);
>   	}
> +	strbuf_release(&send_buffer);
>   }

Good catch! strbufs seem to be a common source of leaks, where either
the release is forgotten or detach is used incorrectly - and I'm tempted
to try to implement some automated checks to catch those (I wonder if
coccicheck is powerful enough for this?).
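
As a rough illustration of the detach-versus-reset pattern discussed
above, here is a minimal sketch. It assumes a simplified, hypothetical
"minibuf" type standing in for git's strbuf (the names minibuf_addstr(),
minibuf_reset(), minibuf_release() and send_all() are made up for this
example and are not git's API). The leaky variant would hand a detached
buffer to a printf()-like function on every iteration and never free it;
the variant below reuses one allocation and frees it once.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for git's strbuf; not the real API. */
struct minibuf {
	char *buf;
	size_t len, alloc;
};

/* Append a NUL-terminated string, growing the allocation as needed. */
static void minibuf_addstr(struct minibuf *sb, const char *s)
{
	size_t n = strlen(s);

	if (sb->len + n + 1 > sb->alloc) {
		sb->alloc = (sb->len + n + 1) * 2;
		sb->buf = realloc(sb->buf, sb->alloc);
		assert(sb->buf);
	}
	memcpy(sb->buf + sb->len, s, n + 1);
	sb->len += n;
}

/* Empty the buffer but keep the allocation so it can be reused. */
static void minibuf_reset(struct minibuf *sb)
{
	sb->len = 0;
	if (sb->buf)
		sb->buf[0] = '\0';
}

/* Free the allocation; the buffer can be reused from scratch afterwards. */
static void minibuf_release(struct minibuf *sb)
{
	free(sb->buf);
	sb->buf = NULL;
	sb->len = sb->alloc = 0;
}

/*
 * Write one line per entry in "oids" to "out" and return the number of
 * lines written. fprintf() only reads the buffer, so the buffer stays
 * owned by "sb": reset it per iteration, release it once at the end.
 */
static size_t send_all(const char **oids, size_t n, FILE *out)
{
	struct minibuf sb = { NULL, 0, 0 };
	size_t i, sent = 0;

	for (i = 0; i < n; i++) {
		minibuf_addstr(&sb, oids[i]);
		minibuf_addstr(&sb, " 42");
		fprintf(out, "%s\n", sb.buf);
		minibuf_reset(&sb);
		sent++;
	}
	minibuf_release(&sb);
	return sent;
}
```

Calling send_all() with an array of strings and stdout prints one line
per entry; built with a leak checker, it should report nothing because
the only allocation is released after the loop.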

>   ...

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH v2 0/4] add a test mode for SANITIZE=leak, run it in CI
  2021-07-14 17:23   ` [PATCH v2 0/4] add a test mode for SANITIZE=leak, run it in CI Ævar Arnfjörð Bjarmason
                       ` (3 preceding siblings ...)
  2021-07-14 17:23     ` [PATCH v2 4/4] SANITIZE tests: fix leak in mailmap.c Ævar Arnfjörð Bjarmason
@ 2021-07-15 17:37     ` Andrzej Hunt
  2021-08-31 13:35     ` [PATCH v3 0/8] " Ævar Arnfjörð Bjarmason
       [not found]     ` <cover-v3-0.8-00000000000-20210831T132607Z-avarab@gmail.com>
  6 siblings, 0 replies; 125+ messages in thread
From: Andrzej Hunt @ 2021-07-15 17:37 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason, git
  Cc: Junio C Hamano, Jeff King, Lénaïc Huard,
	Derrick Stolee, Felipe Contreras, SZEDER Gábor,
	Đoàn Trần Công Danh, Eric Sunshine

On 14/07/2021 19:23, Ævar Arnfjörð Bjarmason wrote:
> As a follow-up to my recent thread asking if we had some test mode or
> CI to test for memory leak regression (we don't), add such a test
> mode, and run it in CI.
> 
> Currently the two new CI targets take ~2-3 minutes to run in GitHub
> CI, whereas the normal test targets take 20-30 minutes. The tests run
> slower, but we have a small whitelist of test scripts that are OK.
> 
> v2:
> 
>   * Fixes issues spotted by Đoàn Trần Công Danh and Eric Sunshine,
>     thanks both!
> 
>   * I got rid of the change to t0500, I saw it being flaky in GitHub
>     CI, and looks like there'll be other concurrent edits to that file,
>     so leaving it be.
> 
> v1: http://lore.kernel.org/git/cover-0.4-0000000000-20210714T001007Z-avarab@gmail.com

> 
> Ævar Arnfjörð Bjarmason (4):
>    tests: add a test mode for SANITIZE=leak, run it in CI
>    SANITIZE tests: fix memory leaks in t13*config*, add to whitelist
>    SANITIZE tests: fix memory leaks in t5701*, add to whitelist
>    SANITIZE tests: fix leak in mailmap.c
> 

The leak fixes look good to me, modulo the UNLEAK as already commented 
on in patch 2/4 - thank you!

I don't feel qualified to review the test and CI related scripting, 
hopefully someone else will be able to look at those changes :).

ATB,

Andrzej

[...snip...]

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH v2 1/4] tests: add a test mode for SANITIZE=leak, run it in CI
  2021-07-14 17:23     ` [PATCH v2 1/4] tests: " Ævar Arnfjörð Bjarmason
  2021-07-14 18:42       ` Andrzej Hunt
@ 2021-07-15 21:06       ` Jeff King
  2021-07-16 14:46         ` Ævar Arnfjörð Bjarmason
  1 sibling, 1 reply; 125+ messages in thread
From: Jeff King @ 2021-07-15 21:06 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Junio C Hamano, Andrzej Hunt, Lénaïc Huard,
	Derrick Stolee, Felipe Contreras, SZEDER Gábor,
	Đoàn Trần Công Danh, Eric Sunshine

On Wed, Jul 14, 2021 at 07:23:51PM +0200, Ævar Arnfjörð Bjarmason wrote:

> While git can be compiled with SANITIZE=leak there has been no
> corresponding GIT_TEST_* mode for it, i.e. memory leaks have been
> fixed as one-offs without structured regression testing.

This opening puzzled me. I'm not sure I understand why we need a special
GIT_TEST_* mode for it.  If you do "make SANITIZE=leak test", then your
binaries will leak-check while running the tests.

I.e., there is nothing that test-lib.sh itself needs to do differently
to enable it.

What we _do_ need is some mechanism of annotating tests to say "this
is known to leak", so that we can skip them for normal integration runs.

And that is part of what's going on in this patch, but I'm not sure it
is the simplest way to do it. The first question is: how do we want to
annotate the tests. By marking individual scripts or tests in the
test-files themselves? Or by using a separate list of "these scripts or
tests are known to pass"?

IMHO the latter is preferable. It keeps the annotations out of the way
of normal work (they are a temporary thing until we eventually pass the
whole suite leak free, but I expect they'll be with us for a while). The
downside is that the annotations may get out of sync with test numbers.
But if we are primarily annotating whole scripts (and not individual
tests), then that is generally pretty stable.

And with that in mind, can we just use an existing mechanism for picking
which tests to run, and drive it externally from the CI job?

We already have GIT_SKIP_TESTS and --run. Those are perhaps a bit
awkward for feeding huge lists to, and there is no environment
equivalent for --run (so you can't trigger it easily from "make test").
But what if we could do something like:

  GIT_TEST_RUN_FROM=t/leak-free make SANITIZE=leak test

and then t/leak-free contained the usual patterns like:

  t000*
  t1234.5

and so on. That requires two new features in test-lib.sh:

  - making a GIT_TEST_RUN variable that is the opposite of GIT_SKIP_TESTS
    (instead of just the command-line --run).

  - adding GIT_TEST_{RUN,SKIP}_FROM variables to read the values from a
    file rather than the environment (I suppose the caller could just
    stuff the contents into the variable, but I expect that test-lib.sh
    may want to pare down the entries that do not even apply to the
    current script for the sake of efficiency in checking each test).

That infrastructure would then be applicable to other cases, too. Or
even just useful for using another list (or no list at all) when you
are looking at whether other tests are leak-free or not.
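
To make the pattern-list idea concrete, here is a minimal sketch of
"run list first, skip list overrides" matching. It is written in C with
POSIX fnmatch() purely for illustration; the real logic would live in
the shell code of test-lib.sh, and matches_any() and should_run() are
hypothetical names. It only handles whole-script globs such as "t000*",
not per-test entries like "t1234.5".

```c
#include <assert.h>
#include <fnmatch.h>
#include <stddef.h>

/*
 * Return 1 if "name" matches any glob in the NULL-terminated
 * "patterns" list, 0 otherwise.
 */
static int matches_any(const char *name, const char *const *patterns)
{
	size_t i;

	assert(name && patterns);
	for (i = 0; patterns[i]; i++)
		if (!fnmatch(patterns[i], name, 0))
			return 1;
	return 0;
}

/*
 * A script runs only if it is on the "run" list and not on the "skip"
 * list, i.e. the skip list overrides the run list.
 */
static int should_run(const char *name, const char *const *run,
		      const char *const *skip)
{
	if (!matches_any(name, run))
		return 0;
	return !matches_any(name, skip);
}
```

With a run list of "t000*" and "t13*config*" and a skip list of
"t000[469]*", a script named t0001-init would run, t0004-unwritable
would be skipped, and t5701-git-serve would not run because nothing on
the run list matches it.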

> This change adds such a mode, we now have new
> linux-{clang,gcc}-sanitize-leak CI targets, these targets run the same
> tests as linux-{clang,gcc}, except that almost all of them are
> skipped.

I'm not clear on what we expect to get out of running it with both clang
and gcc. They should be producing identical results.

-Peff

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH v2 1/4] tests: add a test mode for SANITIZE=leak, run it in CI
  2021-07-14 18:42       ` Andrzej Hunt
  2021-07-14 22:39         ` Ævar Arnfjörð Bjarmason
@ 2021-07-15 21:14         ` Jeff King
  1 sibling, 0 replies; 125+ messages in thread
From: Jeff King @ 2021-07-15 21:14 UTC (permalink / raw)
  To: Andrzej Hunt
  Cc: Ævar Arnfjörð Bjarmason, git, Junio C Hamano,
	Lénaïc Huard, Derrick Stolee, Felipe Contreras,
	SZEDER Gábor, Đoàn Trần Công Danh,
	Eric Sunshine

On Wed, Jul 14, 2021 at 08:42:27PM +0200, Andrzej Hunt wrote:

> My other question is: if we are adding a new job - should it really be just
> a leak checking job? Leak checking is just a subset of ASAN (Address
> Sanitizer). And as discussed at [1] it's possible to run ASAN and UBSAN
> (Undefined Behaviour Sanitizer) in the same build. I feel like it's much
> more useful to first add a combined ASAN+UBSAN job, followed by enabling
> leak-checking as part of ASAN in those jobs for known leak-free tests - as
> opposed to only adding leak checking. We currently disable Leak checking for
> ASAN here [2], but that could be made conditional on the test ID (i.e. check
> an allowlist to enable leak checking for some tests)?

I do think it's worth having an ASan+UBSan job. In the CI we use for our
custom fork of Git at GitHub, we run it for every pull request (and I do
bring upstream any applicable fixes). It's kind of expensive compared to
a regular "make test", but probably not nearly as bad as just running
the regular test suite on Windows.

And it's true that ASan can do leak-checking, too. In the long run, when
we are leak-free, I think it may make sense to combine the jobs. But in
the interim state where we can run the whole suite with ASan/UBSan, but
not with LSan, I think it's simpler to just keep them separate. That
lets us just entirely skip tests or scripts in the leak-checking run. (I
haven't measured, but I also expect that LSan is not much more expensive
than a regular run, so combining the two isn't that big a win.)

So I do like your suggestion, but I think it's just orthogonal to
further leak-checking.

-Peff

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH v2 2/4] SANITIZE tests: fix memory leaks in t13*config*, add to whitelist
  2021-07-14 18:57       ` Andrzej Hunt
  2021-07-14 22:56         ` Ævar Arnfjörð Bjarmason
@ 2021-07-15 21:42         ` Jeff King
  2021-07-16  5:18           ` Andrzej Hunt
  2021-07-16  7:46           ` Ævar Arnfjörð Bjarmason
  1 sibling, 2 replies; 125+ messages in thread
From: Jeff King @ 2021-07-15 21:42 UTC (permalink / raw)
  To: Andrzej Hunt
  Cc: Ævar Arnfjörð Bjarmason, git, Junio C Hamano,
	Lénaïc Huard, Derrick Stolee, Felipe Contreras,
	SZEDER Gábor, Đoàn Trần Công Danh,
	Eric Sunshine

On Wed, Jul 14, 2021 at 08:57:37PM +0200, Andrzej Hunt wrote:

> > @@ -1331,8 +1336,10 @@ static int git_default_core_config(const char *var, const char *value, void *cb)
> >   	if (!strcmp(var, "core.attributesfile"))
> >   		return git_config_pathname(&git_attributes_file, var, value);
> > -	if (!strcmp(var, "core.hookspath"))
> > +	if (!strcmp(var, "core.hookspath")) {
> > +		UNLEAK(git_hooks_path);
> >   		return git_config_pathname(&git_hooks_path, var, value);
> > +	}
> 
> Why is the UNLEAK necessary here? We generally want to limit use of UNLEAK
> to cmd_* functions or direct helpers. git_default_core_config() seems
> generic enough that it could be called from anywhere, and using UNLEAK here
> means we're potentially masking a real leak?
> 
> IIUC the leak here happens because:
> - git_hooks_path is a global variable - hence it's unlikely we'd ever
>   bother cleaning it up.
> - git_default_core_config() gets called a first time with
>   core.hookspath, and we end up allocating new memory into
>   git_hooks_path.
> - git_default_core_config() gets called again with core.hookspath,
>   and we overwrite git_hooks_path with a new string which leaks
>   the string that git_hooks_path used to point to.
> 
> So I think the real fix is to free(git_hooks_path) instead of an UNLEAK?
> (Looking at the surrounding code, it looks like the same pattern of leak
> might be repeated for other similar globals - is it worth auditing those
> while we're here?)

This is a common leak pattern in Git. We do something like:

  static const char *foo = "default";
  ...
  int config_cb(const char *var, const char *value, void *)
  {
          if (!strcmp(var, "core.foo"))
                  foo = xstrdup(value);
          return 0;
  }

So we leak if the variable appears twice. But we can't just call
"free(foo)" here. In the first call, it's pointing to a string literal!

In the case of git_hooks_path, it defaults to NULL, so this works out
OK. But it's setting up a trap for somebody later on, who assigns it a
default value (and the compiler won't help; it's a "const char *", so
the assignment is fine, and the free() would already be casting away the
constness).

I see a few possible solutions:

  - instead of strdup'ing long-lived config values, strintern() them.
    This is really leaking them, but in a way that we hold on to the old
    values. This is actually more or less what UNLEAK() is doing under
    the hood (saving a reference to the old buffer, even if the variable
    is overwritten).

  - find a way to tell when a string comes from the heap versus a
    literal. I don't think you can do this portably without keeping your
    own separate flag. We could abstract away some of the pain with a
    struct like:

       struct def_string {
               /* might point to heap memory; const because you must
                * check flag before modifying */
               const char *value;
               int from_heap;
       };

       /* regular static initialization is OK if you don't want a default */
       #define DEF_STRING_INIT(str) { .value = str }

       static void def_string_set(struct def_string *ds, const char *value)
       {
               if (ds->from_heap)
                       free((char *)ds->value);
               ds->value = xstrdup(value);
               ds->from_heap = 1;
       }

    The annoying thing is all of the users need to refer to
    git_hooks_path.value instead of just git_hooks_path. If you don't
    mind a little macro hackery, we could get around that by declaring
    pairs of variables. Like:

      #define DEF_STRING_DECLARE(name, value) \
      const char *name = value; \
      int name##_from_heap

      #define DEF_STRING_SET(name, value) do { \
              if (name##_from_heap) \
                      free((char *)name); \
              name = xstrdup(value); \
              name##_from_heap = 1; \
      } while(0)

I can't say I _love_ any of that, but I think it would work (and
probably we'd adapt our helpers like git_config_pathname() to take a
def_string. Or I guess just have a def_string_free() which can be called
before writing into them).
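
For reference, the struct-based variant could be fleshed out roughly as
follows. This is only a sketch under a few assumptions: it uses plain
malloc() plus assert() where git would use xstrdup(), it casts away
constness when freeing, and def_string_free() is a hypothetical helper
of the kind mentioned above, not existing git code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct def_string {
	/* may point to a string literal or to heap memory */
	const char *value;
	/* nonzero only if "value" was allocated and must be free()d */
	int from_heap;
};

/* Static initializer for a default that is a string literal (or NULL). */
#define DEF_STRING_INIT(str) { .value = (str) }

/* Replace the current value with a heap copy of "value". */
static void def_string_set(struct def_string *ds, const char *value)
{
	size_t n = strlen(value) + 1;
	char *copy = malloc(n);	/* stand-in for git's xstrdup() */

	assert(copy);
	memcpy(copy, value, n);
	/* Free the old value only if we own it; literals are left alone. */
	if (ds->from_heap)
		free((char *)ds->value);
	ds->value = copy;
	ds->from_heap = 1;
}

/* Hypothetical helper: drop any heap value and reset to "no value". */
static void def_string_free(struct def_string *ds)
{
	if (ds->from_heap)
		free((char *)ds->value);
	ds->value = NULL;
	ds->from_heap = 0;
}
```

A variable declared as "static struct def_string foo =
DEF_STRING_INIT("default");" can then be set any number of times from a
config callback without leaking the previous heap copy and without ever
passing the literal default to free().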

But maybe there's a better solution I'm missing.

-Peff

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH v2 3/4] SANITIZE tests: fix memory leaks in t5701*, add to whitelist
  2021-07-14 17:23     ` [PATCH v2 3/4] SANITIZE tests: fix memory leaks in t5701*, " Ævar Arnfjörð Bjarmason
  2021-07-15 17:37       ` Andrzej Hunt
@ 2021-07-15 21:43       ` Jeff King
  2021-08-31 13:46       ` [PATCH] protocol-caps.c: fix memory leak in send_info() Ævar Arnfjörð Bjarmason
  2 siblings, 0 replies; 125+ messages in thread
From: Jeff King @ 2021-07-15 21:43 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Junio C Hamano, Andrzej Hunt, Lénaïc Huard,
	Derrick Stolee, Felipe Contreras, SZEDER Gábor,
	Đoàn Trần Công Danh, Eric Sunshine

On Wed, Jul 14, 2021 at 07:23:53PM +0200, Ævar Arnfjörð Bjarmason wrote:

> Fix a memory leak in a2ba162cda (object-info: support for retrieving
> object info, 2021-04-20) which appears to have been based on a
> misunderstanding of how the pkt-line.c API works, there is no need to
> strdup() input to, it's just a printf()-like format function.
> 
> This fixes a potentially large memory leak, since the number of OID
> lines the "object-info" call can be arbitrarily large (or a small one
> if the request is small).

Very nice. This will also be much more efficient, since we get to reuse
the same buffer in each run through the loop.

-Peff

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH v2 2/4] SANITIZE tests: fix memory leaks in t13*config*, add to whitelist
  2021-07-15 21:42         ` Jeff King
@ 2021-07-16  5:18           ` Andrzej Hunt
  2021-07-16 21:20             ` Jeff King
  2021-07-16  7:46           ` Ævar Arnfjörð Bjarmason
  1 sibling, 1 reply; 125+ messages in thread
From: Andrzej Hunt @ 2021-07-16  5:18 UTC (permalink / raw)
  To: Jeff King
  Cc: Ævar Arnfjörð Bjarmason, git, Junio C Hamano,
	Lénaïc Huard, Derrick Stolee, Felipe Contreras,
	SZEDER Gábor, Đoàn Trần Công Danh,
	Eric Sunshine



On 15/07/2021 23:42, Jeff King wrote:
> On Wed, Jul 14, 2021 at 08:57:37PM +0200, Andrzej Hunt wrote:
> 
>>> @@ -1331,8 +1336,10 @@ static int git_default_core_config(const char *var, const char *value, void *cb)
>>>    	if (!strcmp(var, "core.attributesfile"))
>>>    		return git_config_pathname(&git_attributes_file, var, value);
>>> -	if (!strcmp(var, "core.hookspath"))
>>> +	if (!strcmp(var, "core.hookspath")) {
>>> +		UNLEAK(git_hooks_path);
>>>    		return git_config_pathname(&git_hooks_path, var, value);
>>> +	}
>>
>> Why is the UNLEAK necessary here? We generally want to limit use of UNLEAK
>> to cmd_* functions or direct helpers. git_default_core_config() seems
>> generic enough that it could be called from anywhere, and using UNLEAK here
>> means we're potentially masking a real leak?
>>
>> IIUC the leak here happens because:
>> - git_hooks_path is a global variable - hence it's unlikely we'd ever
>>    bother cleaning it up.
>> - git_default_core_config() gets called a first time with
>>    core.hookspath, and we end up allocating new memory into
>>    git_hooks_path.
>> - git_default_core_config() gets called again with core.hookspath,
>>    and we overwrite git_hooks_path with a new string which leaks
>>    the string that git_hooks_path used to point to.
>>
>> So I think the real fix is to free(git_hooks_path) instead of an UNLEAK?
>> (Looking at the surrounding code, it looks like the same pattern of leak
>> might be repeated for other similar globals - is it worth auditing those
>> while we're here?)
> 
> This is a common leak pattern in Git. We do something like:
> 
>    static const char *foo = "default";
>    ...
>    int config_cb(const char *var, const char *value, void *)
>    {
>            if (!strcmp(var, "core.foo"))
> 	          foo = xstrdup(value);
>    }
> 
> So we leak if the variable appears twice. But we can't just call
> "free(foo)" here. In the first call, it's pointing to a string literal!
> 
> In the case of git_hooks_path, it defaults to NULL, so this works out
> OK. But it's setting up a trap for somebody later on, who assigns it a
> default value (and the compiler won't help; it's a "const char *", so
> the assignment is fine, and the free() would already be casting away the
> constness).

Ah, right. I didn't think about the risk of future breakages.
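
To make the pattern concrete, here is a minimal standalone sketch of the
"free the old value before overwriting" fix. It is not Git's config code:
`hooks_path`, `dup_string()` and `set_hooks_path()` are hypothetical names
made up for illustration. The sketch is only safe because the global
starts out as NULL; give it a string-literal default and the free() turns
into exactly the trap described above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical global; the NULL default is what makes free() on it valid. */
static char *hooks_path;

/* Portable stand-in for xstrdup(); returns NULL on allocation failure. */
static char *dup_string(const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);

	if (copy)
		memcpy(copy, s, len);
	return copy;
}

/*
 * Called once per occurrence of the config key. Without the free(),
 * the second call would overwrite the pointer and leak the first copy.
 */
static int set_hooks_path(const char *value)
{
	char *copy;

	assert(value);
	copy = dup_string(value);
	if (!copy)
		return -1;
	free(hooks_path); /* no-op on the first call, while still NULL */
	hooks_path = copy;
	return 0;
}
```

Calling set_hooks_path() twice leaves only the second copy allocated,
which is the behaviour the leak checker wants to see.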

> 
> I see a few possible solutions:
>  [...]
> I can't say I _love_ any of that, but I think it would work (and
> probably we'd adapt our helpers like git_config_pathname() to take a
> def_string. Or I guess just have a def_string_free() which can be called
> before writing into them).

Is it worth sidestepping the whole globals issue by migrating 
core.hookspath (and other string config values) to be fetched via 
git_config_get_pathname() and equivalents at the point of use instead?

I looked at the commit below which introduced git_config_get* which 
suggests that these methods were indeed intended to be an improvement 
over the callback based API, and IIUC switching over should have a bunch 
of advantages:
  - Removes some potential bugs that can happen if git_config() was never
    called with the right callback.
  - Potentially reduces the number of times we have to iterate over the
    config in the first place (assuming we migrate *all* config access
    and not just strings).
  - Fewer globals - which reduces potential for such leaks (and probably
    makes it easier to read the code in the first place).
OTOH I'm not familiar enough with this code to know what the 
disadvantages of such a migration might be (it's definitely going to be 
a lot of work... but that's going to apply to any of the approaches we 
can choose to fix these leaks).

git_config_get* were introduced in:
   3c8687a73e (add `config_set` API for caching config-like files, 
2014-07-28)


* Re: [PATCH v2 2/4] SANITIZE tests: fix memory leaks in t13*config*, add to whitelist
  2021-07-15 21:42         ` Jeff King
  2021-07-16  5:18           ` Andrzej Hunt
@ 2021-07-16  7:46           ` Ævar Arnfjörð Bjarmason
  2021-07-16 21:16             ` Jeff King
  1 sibling, 1 reply; 125+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-16  7:46 UTC (permalink / raw)
  To: Jeff King
  Cc: Andrzej Hunt, git, Junio C Hamano, Lénaïc Huard,
	Derrick Stolee, Felipe Contreras, SZEDER Gábor,
	Đoàn Trần Công Danh, Eric Sunshine,
	Elijah Newren


On Thu, Jul 15 2021, Jeff King wrote:

> On Wed, Jul 14, 2021 at 08:57:37PM +0200, Andrzej Hunt wrote:
>
>> > @@ -1331,8 +1336,10 @@ static int git_default_core_config(const char *var, const char *value, void *cb)
>> >   	if (!strcmp(var, "core.attributesfile"))
>> >   		return git_config_pathname(&git_attributes_file, var, value);
>> > -	if (!strcmp(var, "core.hookspath"))
>> > +	if (!strcmp(var, "core.hookspath")) {
>> > +		UNLEAK(git_hooks_path);
>> >   		return git_config_pathname(&git_hooks_path, var, value);
>> > +	}
>> 
>> Why is the UNLEAK necessary here? We generally want to limit use of UNLEAK
>> to cmd_* functions or direct helpers. git_default_core_config() seems
>> generic enough that it could be called from anywhere, and using UNLEAK here
>> means we're potentially masking a real leak?
>> 
>> IIUC the leak here happens because:
>> - git_hooks_path is a global variable - hence it's unlikely we'd ever
>>   bother cleaning it up.
>> - git_default_core_config() gets called a first time with
>>   core.hookspath, and we end up allocating new memory into
>>   git_hooks_path.
>> - git_default_core_config() gets called again with core.hookspath,
>>   and we overwrite git_hooks_path with a new string which leaks
>>   the string that git_hooks_path used to point to.
>> 
>> So I think the real fix is to free(git_hooks_path) instead of an UNLEAK?
>> (Looking at the surrounding code, it looks like the same pattern of leak
>> might be repeated for other similar globals - is it worth auditing those
>> while we're here?)
>
> This is a common leak pattern in Git. We do something like:
>
>   static const char *foo = "default";
>   ...
>   int config_cb(const char *var, const char *value, void *)
>   {
>           if (!strcmp(var, "core.foo"))
> 	          foo = xstrdup(value);
>   }
>
> So we leak if the variable appears twice. But we can't just call
> "free(foo)" here. In the first call, it's pointing to a string literal!
>
> In the case of git_hooks_path, it defaults to NULL, so this works out
> OK. But it's setting up a trap for somebody later on, who assigns it a
> default value (and the compiler won't help; it's a "const char *", so
> the assignment is fine, and the free() would already be casting away the
> constness).
>
> I see a few possible solutions:
>
>   - instead of strdup'ing long-lived config values, strintern() them.
>     This is really leaking them, but in a way that we hold on to the old
>     values. This is actually more or less what UNLEAK() is doing under
>     the hood (saving a reference to the old buffer, even if the variable is
>     overwritten).
>
>   - find a way to tell when a string comes from the heap versus a
>     literal. I don't think you can do this portably without keeping your
>     own separate flag. We could abstract away some of the pain with a
>     struct like:
>
>        struct def_string {
>                /* might point to heap memory; const because you must
>                 * check flag before modifying */
>                const char *value;
>                int from_heap;
>        };
>
>        /* regular static initialization is OK if you don't want a default */
>        #define DEF_STRING_INIT(str) { .value = str }
>
>        static void def_string_set(struct def_string *ds, const char *value)
>        {
>                if (ds->from_heap)
>                        free((char *)ds->value); /* cast away constness */
>                ds->value = xstrdup(value);
>                ds->from_heap = 1;
>        }
>
>     The annoying thing is all of the users need to refer to
>     git_hooks_path.value instead of just git_hooks_path. If you don't mind
>     a little macro hackery, we could get around that by declaring pairs
>     of variables. Like:
>
>       #define DEF_STRING_DECLARE(name, value) \
>       const char *name = value; \
>       int name##_from_heap
>
>       #define DEF_STRING_SET(name, value) do { \
>               if (name##_from_heap) \
>                       free((char *)name); \
>               name = xstrdup(value); \
>               name##_from_heap = 1; \
>       } while(0)
>
> I can't say I _love_ any of that, but I think it would work (and
> probably we'd adapt our helpers like git_config_pathname() to take a
> def_string. Or I guess just have a def_string_free() which can be called
> before writing into them).
>
> But maybe there's a better solution I'm missing.

Instead of: "int from_heap" in your "def_string" I think we should just
use "struct string_list_item". I.e. you want a void* here. Why?

<Digression>

I have an unsent series for handling some more common cases in the
string-list API. I started writing it due to a very related problem,
i.e. that we conflate "string init dup/nodup" with "do we want to
free?".

We (ab)use the "strdup_strings" in a few places to free that sort of
thing at the end if we have heap-allocated strings, but ones we did not
strdup ourselves, e.g. this in merge-ort.c (not picking on Elijah (CC'd)
here, it's common in lots of places, and this one was pretty much lifted
from merge-recursive).

        opti->paths_to_free.strdup_strings = 1;
        string_list_clear(&opti->paths_to_free, 0);
        opti->paths_to_free.strdup_strings = 0;

So I improved the string-list and strmap free functions so you can
instead do:

    string_list_clear_strings(&opti->paths_to_free, 0);

And that along with some other changes allows you to clear (or not) any
combination of the string, util, or have a callback function of your own
run (but be ensured to run all of those before we get to any of the
other freeing).

</Digression>

You may be wondering what any of this has to do with heap strings in C.
Well, one common case you've not discussed is that we sometimes do the
equivalent of the following, with string-list.h or not (somewhat pseudocode):

	void add_to_list(struct string_list *list, char *on_heap_now_we_own_it)
	{
		char *ptr = on_heap_now_we_own_it;
		char *mydup = xstrdup("foo");

		ptr++; /* skip first byte */
		string_list_append(list, ptr);
		string_list_append(list, mydup);
	}

And:

        struct string_list list = STRING_LIST_INIT_NODUP;
        /* other stuff here, we get strings from somewhere etc. */
        add_to_list(&list, some_string);

So now you're left with needing to free both at the end, but since we
did ptr++ there we can't free() that (we'd need to free(ptr - 1), but
how to keep track of that?).

Well, tying this back to my clear() improvements for string-list.h I
thought a really neat solution to this was:

    string_list_append(list, ptr)->util = on_heap_now_we_own_it;
    string_list_append(list, mydup)->util = mydup;

I.e. by convention we store the pointer we need to free (if any) in the
"util" field.

And then if you get a string not from the heap you just leave the "util"
as NULL, and at the end you just free() all your "util" fields, and it
just so happens that some of them are the same as the "string" field.

We're not in the habit of passing loose "string_list_item" around now,
but I don't see why we wouldn't (possibly with a change to extract that
bit out, so we could use it in other places).

The neat thing about doing this is also that you're not left with every
API boundary needing to deal with your new "def_string", a lot of them
use string_list already, and hardly need to change anything, to the
extent that we do need to change anything having a "void *util" is a lot
more generally usable. You end up getting memory management for free as
you gain a feature to pass arbitrary data along with your items.
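
As a rough illustration of the "store the pointer to free in util"
convention, here is a standalone sketch. It deliberately does not use
Git's string-list.h: `struct item`, `struct item_list`,
`item_list_append()` and `item_list_clear()` are hypothetical miniatures
invented for this example, so the real API may well look different.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical miniature of a list item: a string plus a spare pointer. */
struct item {
	const char *string; /* what callers read; may point into a buffer */
	void *util;         /* by convention here: pointer to free(), or NULL */
};

struct item_list {
	struct item *items;
	size_t nr, alloc;
};

/* Append without copying; returns the new item so the caller can set util. */
static struct item *item_list_append(struct item_list *list, const char *string)
{
	assert(list && string);
	if (list->nr == list->alloc) {
		size_t alloc = list->alloc ? 2 * list->alloc : 4;
		struct item *items = realloc(list->items, alloc * sizeof(*items));

		if (!items)
			return NULL;
		list->items = items;
		list->alloc = alloc;
	}
	list->items[list->nr].string = string;
	list->items[list->nr].util = NULL;
	return &list->items[list->nr++];
}

/* Free every owned buffer via util; returns how many buffers were freed. */
static size_t item_list_clear(struct item_list *list)
{
	size_t i, freed = 0;

	for (i = 0; i < list->nr; i++) {
		if (list->items[i].util) {
			free(list->items[i].util);
			freed++;
		}
	}
	free(list->items);
	memset(list, 0, sizeof(*list));
	return freed;
}
```

An item whose string points one byte into a heap buffer stores the start
of that buffer in util, while an item pointing at a literal leaves util
as NULL, so clearing the list frees exactly the buffers it owns.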


* Re: [PATCH v2 1/4] tests: add a test mode for SANITIZE=leak, run it in CI
  2021-07-15 21:06       ` Jeff King
@ 2021-07-16 14:46         ` Ævar Arnfjörð Bjarmason
  2021-07-16 18:09           ` Jeff King
  0 siblings, 1 reply; 125+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-16 14:46 UTC (permalink / raw)
  To: Jeff King
  Cc: git, Junio C Hamano, Andrzej Hunt, Lénaïc Huard,
	Derrick Stolee, Felipe Contreras, SZEDER Gábor,
	Đoàn Trần Công Danh, Eric Sunshine


On Thu, Jul 15 2021, Jeff King wrote:

> On Wed, Jul 14, 2021 at 07:23:51PM +0200, Ævar Arnfjörð Bjarmason wrote:
>
>> While git can be compiled with SANITIZE=leak there has been no
>> corresponding GIT_TEST_* mode for it, i.e. memory leaks have been
>> fixed as one-offs without structured regression testing.
>
> This opening puzzled me. I'm not sure I understand why we need a special
> GIT_TEST_* mode for it.  If you do "make SANITIZE=leak test", then your
> binaries will leak-check while running the tests.
>
> I.e., there is nothing that test-lib.sh itself needs to do differently
> to enable it.
>
> What we _do_ need is some mechanism of annotating tests to say "this
> is known to leak", so that we can skip them for normal integration runs.
>
> And that is part of what's going on in this patch, but I'm not sure it
> is the simplest way to do it. The first question is: how do we want to
> annotate the tests. By marking individual scripts or tests in the
> test-files themselves? Or by using a separate list of "these scripts or
> tests are known to pass"?
>
> IMHO the latter is preferable. It keeps the annotations out of the way
> of normal work (they are a temporary thing until we eventually pass the
> whole suite leak free, but I expect they'll be with us for a while). The
> downside is that the annotations may get out of sync with test numbers.
> But if we are primarily annotating whole scripts (and not individual
> tests), then that is generally pretty stable.
>
> And with that in mind, can we just use an existing mechanism for picking
> which tests to run, and drive it externally from the CI job?
>
> We already have GIT_SKIP_TESTS and --run. Those are perhaps a bit
> awkward for feeding huge lists to, and there is no environment
> equivalent for --run (so you can't trigger it easily from "make test").
> But what if we could do something like:
>
>   GIT_TEST_RUN_FROM=t/leak-free make SANITIZE=leak test
>
> and then t/leak-free contained the usual patterns like:
>
>   t000*
>   t1234.5
>
> and so on. That requires two new features in test-lib.sh:
>
>   - making a GIT_TEST_RUN variable that is the opposite of GIT_TEST_SKIP
>     (instead of just the command-line --run).
>
>   - adding GIT_TEST_{RUN,SKIP}_FROM variables to read the values from a
>     file rather than the environment (I suppose the caller could just
>     stuff the contents into the variable, but I expect that test-lib.sh
>     may want to pare down the entries that do not even apply to the
>     current script for the sake of efficiency in checking each test).
>
> That infrastructure would then be applicable to other cases, too. Or
> even just useful for using another list (or no list at all) when you
> are looking at whether other tests are leak-free or not.

I've included a mechanism for whitelisting specific globs, the idea was
not to have that be too detailed, but we'd e.g. get to the point of t00*
or whatever passing.

Anything that's a lot more granular than that is going to suck,
e.g. exposing the GIT_TEST_SKIP and --run features of specific test
numbers, now you need to count your tests if you add one in the middle
of one of those, and more likely you won't test under the mode and just
see it in CI.

The marking at a distance I've done also has that problem in theory, but
I think in practice we'll use it carefully for globs of tests unlikely
to break.

This whole thing is much more useful with the GIT_TEST_SANITIZE_LEAK mode.
It's a really common case that we e.g. leak in some revision.c API user.
We should fix that, but holding up marking the rest of a test file whose
tests otherwise pass is bad; it means you can't do any testing of a
given API or subsystem without getting the entire file to pass.

Whereas while we're fixing very common leaks in the codebase it's likely
that any given test file will have a few such tests.

It also means everything works by default, you get an appropriate notice
from prove(1), and even if you run one test manually it'll skip, but
emit a message saying you can set the env var to force its run.

>> This change adds such a mode, we now have new
>> linux-{clang,gcc}-sanitize-leak CI targets, these targets run the same
>> tests as linux-{clang,gcc}, except that almost all of them are
>> skipped.
>
> I'm not clear on what we expect to get out of running it with both clang
> and gcc. They should be producing identical results.

Indeed, addressed elsewhere, i.e. it's just a thinko of mine.


* Re: [PATCH v2 1/4] tests: add a test mode for SANITIZE=leak, run it in CI
  2021-07-16 14:46         ` Ævar Arnfjörð Bjarmason
@ 2021-07-16 18:09           ` Jeff King
  2021-07-16 18:45             ` Jeff King
  2021-07-16 18:56             ` Ævar Arnfjörð Bjarmason
  0 siblings, 2 replies; 125+ messages in thread
From: Jeff King @ 2021-07-16 18:09 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Junio C Hamano, Andrzej Hunt, Lénaïc Huard,
	Derrick Stolee, Felipe Contreras, SZEDER Gábor,
	Đoàn Trần Công Danh, Eric Sunshine

On Fri, Jul 16, 2021 at 04:46:12PM +0200, Ævar Arnfjörð Bjarmason wrote:

> > and so on. That requires two new features in test-lib.sh:
> >
> >   - making a GIT_TEST_RUN variable that is the opposite of GIT_TEST_SKIP
> >     (instead of just the command-line --run).
> >
> >   - adding GIT_TEST_{RUN,SKIP}_FROM variables to read the values from a
> >     file rather than the environment (I suppose the caller could just
> >     stuff the contents into the variable, but I expect that test-lib.sh
> >     may want to pare down the entries that do not even apply to the
> >     current script for the sake of efficiency in checking each test).
> >
> > That infrastructure would then be applicable to other cases, too. Or
> > even just useful for using another list (or no list at all) when you
> > are looking at whether other tests are leak-free or not.
> 
> I've included a mechanism for whitelisting specific globs, the idea was
> not to have that be too detailed, but we'd e.g. get to the point of t00*
> or whatever passing.
> 
> Anything that's a lot more granular than that is going to suck,
> e.g. exposing the GIT_TEST_SKIP and --run features of specific test
> numbers, now you need to count your tests if you add one in the middle
> of one of those, and more likely you won't test under the mode and just
> see it in CI.

I think you can do the same level of skipping with GIT_TEST_SKIP,
though. My argument was just that adding a new mechanism does not make
sense when we already have one. I.e., running:

  GIT_SKIP_TESTS='
    t[123456789]*
    t0[^0]*
    t00[^016]*
    t000[469]
    t001[2459]
    t006[0248]
  ' make SANITIZE=leak test

works already to do the same thing. The only thing we might want is a
nicer syntax (e.g., to allow positive and negative patterns, or to read
from a file). But that would benefit all users of GIT_SKIP_TESTS, not
just people interested in leaks.

> It also means everything works by default, you get an appropriate notice
> from prove(1), and even if you run one test manually it'll skip, but
> emit a message saying you can set the env var to force its run.

With GIT_SKIP_TESTS you obviously don't get a message saying "try
skipping this test" when it fails. :) But IMHO that is not that big a
deal. You'll get a test failure with good LSan output. If you are
working on expanding leak-checker coverage, you already know about your
options for skipping. If you're adding a new test that leaks, you might
consider fixing the leak (though not always, if it's far from code
you're touching).

-Peff


* Re: [PATCH v2 1/4] tests: add a test mode for SANITIZE=leak, run it in CI
  2021-07-16 18:09           ` Jeff King
@ 2021-07-16 18:45             ` Jeff King
  2021-07-16 18:56             ` Ævar Arnfjörð Bjarmason
  1 sibling, 0 replies; 125+ messages in thread
From: Jeff King @ 2021-07-16 18:45 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Junio C Hamano, Andrzej Hunt, Lénaïc Huard,
	Derrick Stolee, Felipe Contreras, SZEDER Gábor,
	Đoàn Trần Công Danh, Eric Sunshine

On Fri, Jul 16, 2021 at 02:09:22PM -0400, Jeff King wrote:

> > Anything that's a lot more granular than that is going to suck,
> > e.g. exposing the GIT_TEST_SKIP and --run features of specific test
> > numbers, now you need to count your tests if you add one in the middle
> > of one of those, and more likely you won't test under the mode and just
> > see it in CI.
> 
> I think you can do the same level of skipping with GIT_TEST_SKIP,
> though. My argument was just that adding a new mechanism does not make
> sense when we already have one. I.e., running:
> 
>   GIT_SKIP_TESTS='
>     t[123456789]*
>     t0[^0]*
>     t00[^016]*
>     t000[469]
>     t001[2459]
>     t006[0248]
>   ' make SANITIZE=leak test
> 
> works already to do the same thing. The only thing we might want is a
> nicer syntax (e.g., to allow positive and negative patterns, or to read
> from a file). But that would benefit all users of GIT_SKIP_TESTS, not
> just people interested in leaks.

I cheated a little here; an unrelated bug does cause a failure in t0000
with this pattern. I've just sent:

  https://lore.kernel.org/git/YPHTY5G9JaQFKlX5@coredump.intra.peff.net/

to fix it.

-Peff


* Re: [PATCH v2 1/4] tests: add a test mode for SANITIZE=leak, run it in CI
  2021-07-16 18:09           ` Jeff King
  2021-07-16 18:45             ` Jeff King
@ 2021-07-16 18:56             ` Ævar Arnfjörð Bjarmason
  2021-07-16 19:22               ` Jeff King
  1 sibling, 1 reply; 125+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-07-16 18:56 UTC (permalink / raw)
  To: Jeff King
  Cc: git, Junio C Hamano, Andrzej Hunt, Lénaïc Huard,
	Derrick Stolee, Felipe Contreras, SZEDER Gábor,
	Đoàn Trần Công Danh, Eric Sunshine


On Fri, Jul 16 2021, Jeff King wrote:

> On Fri, Jul 16, 2021 at 04:46:12PM +0200, Ævar Arnfjörð Bjarmason wrote:
>
>> > and so on. That requires two new features in test-lib.sh:
>> >
>> >   - making a GIT_TEST_RUN variable that is the opposite of GIT_TEST_SKIP
>> >     (instead of just the command-line --run).
>> >
>> >   - adding GIT_TEST_{RUN,SKIP}_FROM variables to read the values from a
>> >     file rather than the environment (I suppose the caller could just
>> >     stuff the contents into the variable, but I expect that test-lib.sh
>> >     may want to pare down the entries that do not even apply to the
>> >     current script for the sake of efficiency in checking each test).
>> >
>> > That infrastructure would then be applicable to other cases, too. Or
>> > even just useful for using another list (or no list at all) when you
>> > are looking at whether other tests are leak-free or not.
>> 
>> I've included a mechanism for whitelisting specific globs, the idea was
>> not to have that be too detailed, but we'd e.g. get to the point of t00*
>> or whatever passing.
>> 
>> Anything that's a lot more granular than that is going to suck,
>> e.g. exposing the GIT_TEST_SKIP and --run features of specific test
>> numbers, now you need to count your tests if you add one in the middle
>> of one of those, and more likely you won't test under the mode and just
>> see it in CI.
>
> I think you can do the same level of skipping with GIT_TEST_SKIP,
> though. My argument was just that adding a new mechanism does not make
> sense when we already have one. I.e., running:
>
>   GIT_SKIP_TESTS='
>     t[123456789]*
>     t0[^0]*
>     t00[^016]*
>     t000[469]
>     t001[2459]
>     t006[0248]
>   ' make SANITIZE=leak test
>
> works already to do the same thing. The only thing we might want is a
> nicer syntax (e.g., to allow positive and negative patterns, or to read
> from a file). But that would benefit all users of GIT_SKIP_TESTS, not
> just people interested in leaks.

A glob in this series is t13*config*, you can't do that with
GIT_SKIP_TESTS because it only includes the numeric part of the test,
i.e. t1300, not t1300-config, or t1306-xdg-files.
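
The difference between matching a glob against the numeric prefix and
against the full script name can be sketched in standalone C with POSIX
fnmatch(3). This is only an illustration: test-lib.sh does its matching
in shell, and `test_number()`, `matches_number()` and
`matches_full_name()` are hypothetical helpers invented here, on the
assumption that the numeric part is everything before the first '-'.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy the part of a test script name before the
 * first '-' (e.g. "t1300-config.sh" -> "t1300") into buf.
 */
static const char *test_number(const char *script, char *buf, size_t len)
{
	size_t n = strcspn(script, "-");

	assert(len > 0);
	if (n >= len)
		n = len - 1;
	memcpy(buf, script, n);
	buf[n] = '\0';
	return buf;
}

/* Return 1 if the glob matches the numeric part of the script name only. */
static int matches_number(const char *pattern, const char *script)
{
	char buf[64];

	return fnmatch(pattern, test_number(script, buf, sizeof(buf)), 0) == 0;
}

/* Return 1 if the glob matches the full script name. */
static int matches_full_name(const char *pattern, const char *script)
{
	return fnmatch(pattern, script, 0) == 0;
}
```

With these helpers, "t13*config*" matches "t1300-config.sh" only through
matches_full_name(); against the bare number "t1300" there is no
"config" left for the glob to match.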

But sure, it could happen via some other mechanism than the exact one I
picked, or we could add GIT_SKIP_TESTS2 or whatever.

I would like to be able to compile with it and run "make test" without a
wall of failures by default, i.e. we should be able to tell regressions
from known-OK to get anywhere with it, but that's orthogonal to the
exact mechanism.

>> It also means everything works by default, you get an appropriate notice
>> from prove(1), and even if you run one test manually it'll skip, but
>> emit a message saying you can set the env var to force its run.
>
> With GIT_SKIP_TESTS you obviously don't get a message saying "try
> skipping this test" when it fails. :) But IMHO that is not that big a
> deal. You'll get a test failure with good LSan output. If you are
> working on expanding leak-checker coverage, you already know about your
> options for skipping. If you're adding a new test that leaks, you might
> consider fixing the leak (though not always, if it's far from code
> you're touching).

I do think it makes sense as a test mode test-lib.sh is aware of,
e.g. one obvious next step is to not fail everything right away, but just
let the test run and log all failures to a file, then e.g. fail one test
at the end, or if we're running in that mode collate all the callstacks
and emit a summary for the whole test run.

But yes, the message it emits now isn't such a big deal.


* Re: [PATCH v2 1/4] tests: add a test mode for SANITIZE=leak, run it in CI
  2021-07-16 18:56             ` Ævar Arnfjörð Bjarmason
@ 2021-07-16 19:22               ` Jeff King
  0 siblings, 0 replies; 125+ messages in thread
From: Jeff King @ 2021-07-16 19:22 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Junio C Hamano, Andrzej Hunt, Lénaïc Huard,
	Derrick Stolee, Felipe Contreras, SZEDER Gábor,
	Đoàn Trần Công Danh, Eric Sunshine

On Fri, Jul 16, 2021 at 08:56:04PM +0200, Ævar Arnfjörð Bjarmason wrote:

> > I think you can do the same level of skipping with GIT_TEST_SKIP,
> > though. My argument was just that adding a new mechanism does not make
> > sense when we already have one. I.e., running:
> >
> >   GIT_SKIP_TESTS='
> >     t[123456789]*
> >     t0[^0]*
> >     t00[^016]*
> >     t000[469]
> >     t001[2459]
> >     t006[0248]
> >   ' make SANITIZE=leak test
> >
> > works already to do the same thing. The only thing we might want is a
> > nicer syntax (e.g., to allow positive and negative patterns, or to read
> > from a file). But that would benefit all users of GIT_SKIP_TESTS, not
> > just people interested in leaks.
> 
> A glob in this series is t13*config*, you can't do that with
> GIT_SKIP_TESTS because it only includes the numeric part of the test,
> i.e. t1300, not t1300-config, or t1306-xdg-files.

That seems like a feature that GIT_SKIP_TESTS could learn (though IMHO
just using the test number in your patterns is sufficient).

> I would like to be able to compile with it and run "make test" without a
> wall of failures by default, i.e. we should be able to tell regressions
> from known-OK to get anywhere with it, but that's orthogonal to the
> exact mechanism.

Right, I definitely agree on the goal. I just don't see the need to add
a new, very-specific mechanism. The skip-list above is gross and
obviously not something you'd want to type. Driving it from a ci script
is not too bad, but I agree people who want to leak-check locally would
want an easy way to use it, too. That's why I suggested extending it to
a file that could be easily specified (and possibly even auto-triggered
in the Makefile by SANITIZE=leak).

> > With GIT_SKIP_TESTS you obviously don't get a message saying "try
> > skipping this test" when it fails. :) But IMHO that is not that big a
> > deal. You'll get a test failure with good LSan output. If you are
> > working on expanding leak-checker coverage, you already know about your
> > options for skipping. If you're adding a new test that leaks, you might
> > consider fixing the leak (though not always, if it's far from code
> > you're touching).
> 
> I do think it makes sense as a test mode test-lib.sh is aware of,
> e.g. one obvious next step is to not fail everything right away, but just
> let the test run and log all failures to a file, then e.g. fail one test
> at the end, or if we're running in that mode collate all the callstacks
> and emit a summary for the whole test run.

That's a more compelling reason, if we did implement that feature. My
hope was that all of this would be a temporary state, though, and we'd
get to a point where you can simply run "make SANITIZE=leak test" and
actually run all of the tests. And then such a feature would not be that
interesting, because failures would be rare and cause for immediate
human attention.

-Peff


* Re: [PATCH v2 2/4] SANITIZE tests: fix memory leaks in t13*config*, add to whitelist
  2021-07-16  7:46           ` Ævar Arnfjörð Bjarmason
@ 2021-07-16 21:16             ` Jeff King
  2021-08-31 12:47               ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 125+ messages in thread
From: Jeff King @ 2021-07-16 21:16 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Andrzej Hunt, git, Junio C Hamano, Lénaïc Huard,
	Derrick Stolee, Felipe Contreras, SZEDER Gábor,
	Đoàn Trần Công Danh, Eric Sunshine,
	Elijah Newren

On Fri, Jul 16, 2021 at 09:46:33AM +0200, Ævar Arnfjörð Bjarmason wrote:

> > I can't say I _love_ any of that, but I think it would work (and
> > probably we'd adapt our helpers like git_config_pathname() to take a
> > def_string. Or I guess just have a def_string_free() which can be called
> > before writing into them).
> >
> > But maybe there's a better solution I'm missing.
> 
> Instead of: "int from_heap" in your "def_string" I think we should just
> use "struct string_list_item". I.e. you want a void* here. Why?

Yes, an equivalent way to write it is with a separate to_free buffer.
But why would we want it to be void? And why would we want to use a
string_list_item, which is otherwise unrelated?

> Well, tying this back to my clear() improvements for string-list.h I
> thought a really neat solution to this was:
> 
>     string_list_append(list, ptr)->util = on_heap_now_we_own_it;
>     string_list_append(list, mydup)->util = mydup;
> 
> I.e. by convention we store the pointer we need to free (if any) in the
> "util" field.

That works, but now "util" is not available for all the _other_ uses for
which it was intended. And if we're not using it for those other uses,
then why does it need to exist at all? If we are only using it to hold
the allocated string pointer, then shouldn't it be "char *to_free"?

> We're not in the habit of passing loose "string_list_item" around now,
> but I don't see why we wouldn't (possibly with a change to extract that
> bit out, so we could use it in other places).

It seems unnecessarily confusing to me. It sounds like you have a struct
which just _happens_ to have a "void *" in it you can re-use, so you
start using it in lots of other places that are not in fact string lists
at all. That is confusing to me on the face, but what happens when
string_list needs a feature which requires adding more fields to it?

If the point is to have a maybe-allocated string, why not make that a
type itself? And then if we want string_list to use it, it can.
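
A dedicated maybe-allocated string type along those lines might look
roughly like the sketch below. This is an assumption-laden illustration,
not a proposed patch: `struct maybe_owned_string`,
`MAYBE_OWNED_STRING_INIT`, `maybe_owned_string_set()` and
`maybe_owned_string_release()` are hypothetical names, and plain
malloc() stands in for xstrdup() so the example stands alone.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical "maybe-allocated string": value plus an owned buffer, if any. */
struct maybe_owned_string {
	const char *value; /* what readers use; literal or heap */
	char *to_free;     /* non-NULL only when value lives on the heap */
};

#define MAYBE_OWNED_STRING_INIT(str) { (str), NULL }

/* Replace the value with a heap copy, releasing any previous heap copy. */
static int maybe_owned_string_set(struct maybe_owned_string *s, const char *value)
{
	size_t len;
	char *copy;

	assert(s && value);
	len = strlen(value) + 1;
	copy = malloc(len);
	if (!copy)
		return -1;
	memcpy(copy, value, len);
	free(s->to_free); /* NULL while still pointing at the default literal */
	s->value = s->to_free = copy;
	return 0;
}

/* Release heap memory, if any, and clear the value. */
static void maybe_owned_string_release(struct maybe_owned_string *s)
{
	assert(s);
	free(s->to_free);
	s->to_free = NULL;
	s->value = NULL;
}
```

Readers only ever touch `value`, so passing the string along (without
ownership) degrades to a plain C string, while the literal default is
never handed to free().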

> The neat thing about doing this is also that you're not left with every
> API boundary needing to deal with your new "def_string", a lot of them
> use string_list already, and hardly need to change anything, to the
> extent that we do need to change anything having a "void *util" is a lot
> more generally usable. You end up getting memory management for free as
> you gain a feature to pass arbitrary data along with your items.

I don't think most interfaces take a string_list_item now, so wouldn't
they similarly need to be changed? Though the point is that all of these
degrade to a regular C-string, so when you are just passing the value
(and not ownership), you would just dereference at that point.

-Peff


* Re: [PATCH v2 2/4] SANITIZE tests: fix memory leaks in t13*config*, add to whitelist
  2021-07-16  5:18           ` Andrzej Hunt
@ 2021-07-16 21:20             ` Jeff King
  0 siblings, 0 replies; 125+ messages in thread
From: Jeff King @ 2021-07-16 21:20 UTC (permalink / raw)
  To: Andrzej Hunt
  Cc: Ævar Arnfjörð Bjarmason, git, Junio C Hamano,
	Lénaïc Huard, Derrick Stolee, Felipe Contreras,
	SZEDER Gábor, Đoàn Trần Công Danh,
	Eric Sunshine

On Fri, Jul 16, 2021 at 07:18:59AM +0200, Andrzej Hunt wrote:

> > I see a few possible solutions:
> >  [...]
> > I can't say I _love_ any of that, but I think it would work (and
> > probably we'd adapt our helpers like git_config_pathname() to take a
> > def_string. Or I guess just have a def_string_free() which can be called
> > before writing into them).
> 
> Is it worth sidestepping the whole globals issue by migrating core.hookspath
> (and other string config values) to be fetched via git_config_get_pathname()
> and equivalents at the point of use instead?

Yeah, I almost suggested that. It probably does work OK in this
instance, but it's a much bigger change to convert all cases.

I'm also not sure if you'd run into tricky details. For instance,
calling git_config_* is doing an expensive-ish lookup in the cached
config (well, expensive compared to accessing a single pointer). So in cases
where we're going to access the string many times (say, in a loop), we'd
want our own variable. That might end up being easy by calling it once
outside the loop. Or it might not be, for cases where the variable is
used under the hood by a helper function.
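
To illustrate the easy case, here is a rough sketch of hoisting the
lookup out of a loop. The lookup_config() function and its call counter
are hypothetical stand-ins for a cached config lookup, not git's actual
git_config_get_*() API, and the returned path is made up.

```c
#include <stddef.h>
#include <string.h>

/* Counts how often the (hypothetical) config lookup is performed. */
static int lookup_calls;

/* Stand-in for a cached config lookup; not git's real API. */
static const char *lookup_config(const char *key)
{
	lookup_calls++;
	if (!strcmp(key, "core.hookspath"))
		return "/tmp/hooks";
	return NULL;
}

/* Looks the value up on every iteration: simple, but repeats the search. */
static size_t total_len_lookup_in_loop(int n)
{
	size_t total = 0;

	for (int i = 0; i < n; i++) {
		const char *path = lookup_config("core.hookspath");
		total += path ? strlen(path) : 0;
	}
	return total;
}

/* Looks the value up once and reuses the result inside the loop. */
static size_t total_len_lookup_hoisted(int n)
{
	const char *path = lookup_config("core.hookspath");
	size_t len = path ? strlen(path) : 0;
	size_t total = 0;

	for (int i = 0; i < n; i++)
		total += len;
	return total;
}
```

The hard case is the one where the loop body lives in a helper that
reads the variable itself, since then there is no obvious place to keep
the hoisted copy without changing the helper's interface.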

-Peff


* Re: [PATCH v2 2/4] SANITIZE tests: fix memory leaks in t13*config*, add to whitelist
  2021-07-16 21:16             ` Jeff King
@ 2021-08-31 12:47               ` Ævar Arnfjörð Bjarmason
  2021-09-01  7:53                 ` Jeff King
  0 siblings, 1 reply; 125+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-08-31 12:47 UTC (permalink / raw)
  To: Jeff King
  Cc: Andrzej Hunt, git, Junio C Hamano, Lénaïc Huard,
	Derrick Stolee, Felipe Contreras, SZEDER Gábor,
	Đoàn Trần Công Danh, Eric Sunshine,
	Elijah Newren


On Fri, Jul 16 2021, Jeff King wrote:

[Very late reply, just getting back to this thread]

> On Fri, Jul 16, 2021 at 09:46:33AM +0200, Ævar Arnfjörð Bjarmason wrote:
>
>> > I can't say I _love_ any of that, but I think it would work (and
>> > probably we'd adapt our helpers like git_config_pathname() to take a
>> > def_string. Or I guess just have a def_string_free() which can be called
>> > before writing into them).
>> >
>> > But maybe there's a better solution I'm missing.
>> 
>> Instead of: "int from_heap" in your "def_string" I think we should just
>> use "struct string_list_item". I.e. you want a void* here. Why?
>
> Yes, an equivalent way to write it is with a separate to_free buffer.
> But why would we want it to be void? And why would we want to use a
> string_list_item, which is otherwise unrelated?

We could factor "string_list_item" out into a "string_pair" or
whatever.

Sorry, I didn't mean to get into the naming/code location aspect of the
discussion, just the sensibility of using a "char */void * pair" for
these common memory management cases, vs. your suggestion of having a
"char *, int from_heap" pair.

>> Well, tying this back to my clear() improvements for string-list.h I
>> thought a really neat solution to this was:
>> 
>>     string_list_append(list, ptr)->util = on_heap_now_we_own_it;
>>     string_list_append(list, mydup)->util = mydup;
>> 
>> I.e. by convention we store the pointer we need to free (if any) in the
>> "util" field.
>
> That works, but now "util" is not available for all the _other_ uses for
> which it was intended. And if we're not using it for those other uses,
> then why does it need to exist at all? If we are only using it to hold
> the allocated string pointer, then shouldn't it be "char *to_free"?

Because having it be "char *" doesn't cover the common case of
e.g. getting an already allocated "struct something *" which contains
your string, setting the "string" in "struct string_list_item" to some
string in that struct, and the "util" to the struct itself, as we now
own it and want to free() it later in its entirety.

That and the even more common case I mentioned upthread of wanting to
ferry around the truncated version of some char *, but still wanting to
account for the original for an eventual free().

But yes, if you want to account for freeing that data *and* have util
set to something else you'll need to have e.g. your own wrapper struct
and your own string_list_clear_func() callback.

I'm not suggesting that this handles every possible scenario, just that
having looked at a lot of the code involved recently this seemed like a
neat solution for the common cases.
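
A rough sketch of that convention, assuming a standalone pair with the
same "char */void *" shape as "struct string_list_item". The
"string_pair" name and the helpers are hypothetical, and skipping a
prefix is only one example of handing out a pointer that differs from
the start of the allocation.

```c
#include <stdlib.h>

/* Mirrors the string/util shape discussed above; the name is made up. */
struct string_pair {
	const char *string; /* what callers read */
	void *util;         /* by convention: the allocation to free(), or NULL */
};

/* Borrowed string: nothing to free later. */
static struct string_pair pair_borrowed(const char *s)
{
	struct string_pair p = { s, NULL };
	return p;
}

/*
 * Owned buffer of which we expose only the part after "skip" bytes.
 * "util" remembers the start of the allocation so it can still be freed.
 */
static struct string_pair pair_owned_skip(char *buf, size_t skip)
{
	struct string_pair p = { buf + skip, buf };
	return p;
}

/* Free whatever "util" points at (a no-op for borrowed strings). */
static void pair_clear(struct string_pair *p)
{
	free(p->util);
	p->string = NULL;
	p->util = NULL;
}
```

The same pair_clear() covers the "struct something *" case, as long as
a plain free() of the struct is enough to release it; anything needing
a deeper teardown would still want its own clear callback.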

>> We're not in the habit of passing loose "string_list_item" around now,
>> but I don't see why we wouldn't (possibly with a change to extract that
>> bit out, so we could use it in other places).
>
> It seems unnecessarily confusing to me. It sounds like you have a struct
> which just _happens_ to have a "void *" in it you can re-use, so you
> start using it in lots of other places that are not in fact string lists
> at all. That is confusing to me on the face, but what happens when
> string_list needs a feature which requires adding more fields to it?
>
> If the point is to have a maybe-allocated string, why not make that a
> type itself? And then if we want string_list to use it, it can.

*nod*, covered above. My examples were unnecessarily confusing...

>> The neat thing about doing this is also that you're not left with every
>> API boundary needing to deal with your new "def_string", a lot of them
>> use string_list already, and hardly need to change anything, to the
>> extent that we do need to change anything having a "void *util" is a lot
>> more generally usable. You end up getting memory management for free as
>> you gain a feature to pass arbitrary data along with your items.
>
> I don't think most interfaces take a string_list_item now, so wouldn't
> they similarly need to be changed? Though the point is that all of these
> degrade to a regular C-string, so when you are just passing the value
> (and not ownership), you would just dereference at that point.

Sure, just like things would need to be changed to handle your proposed
"struct def_string".

By piggy-backing on an already used struct in our codebase we can get a
lot of that memory management pretty much for free without much
churn.

If you squint and pretend that "struct string_list_item" isn't called
something to do with that particular collections API (but it would make
use of it) then we've already set up most of the scaffolding and
management for this.


* [PATCH v3 0/8] add a test mode for SANITIZE=leak, run it in CI
  2021-07-14 17:23   ` [PATCH v2 0/4] add a test mode for SANITIZE=leak, run it in CI Ævar Arnfjörð Bjarmason
                       ` (4 preceding siblings ...)
  2021-07-15 17:37     ` [PATCH v2 0/4] add a test mode for SANITIZE=leak, run it in CI Andrzej Hunt
@ 2021-08-31 13:35     ` Ævar Arnfjörð Bjarmason
  2021-09-01  9:56       ` Jeff King
  2021-09-07 15:33       ` [PATCH v4 0/3] " Ævar Arnfjörð Bjarmason
       [not found]     ` <cover-v3-0.8-00000000000-20210831T132607Z-avarab@gmail.com>
  6 siblings, 2 replies; 125+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-08-31 13:35 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Andrzej Hunt, Lénaïc Huard,
	Derrick Stolee, Felipe Contreras, SZEDER Gábor,
	Đoàn Trần Công Danh, Eric Sunshine,
	Ævar Arnfjörð Bjarmason

We can compile git with SANITIZE=leak, and have had various efforts in
the past such as 31f9acf9ce2 (Merge branch 'ah/plugleaks', 2021-08-04)
to plug memory leaks, but have had no CI testing of it to ensure that
we don't get regressions. This series adds a GIT_TEST_* mode for
checking those regressions, and runs it in CI.

Since I submitted v2 the delta between origin/master..origin/seen
broke even t0001-init.sh when run under SANITIZE=leak, so this series
will cause test smoke on "seen". It's a problem with another topic[1]
though, which Emily and I will fix.

Changes since v2:

 * In v2 compiling with SANITIZE=leak would change things so only
   known-good passing tests were run by default, everything else would
   pass as a dummy. Now the default running of tests is unchanged, but
   if we run with GIT_TEST_PASSING_SANITIZE_LEAK=true only those tests
   are run which set and export TEST_PASSES_SANITIZE_LEAK=true.

 * The facility for declaring known-good tests in test-lib.sh based on
   wildcards is gone. Instead, individual tests need to declare that
   they're OK under SANITIZE=leak. This is done via "export
   TEST_PASSES_SANITIZE_LEAK=true"; to set it, a test can source the
   handy "./test-pragma-SANITIZE=leak-ok.sh" before sourcing
   "./test-lib.sh" itself.

 * The various leak fixes are gone entirely. I'll submit some of those
   independently or as follow-ups.

 * We now mark 57 tests in the test suite as OK under
   SANITIZE=leak. This is far from all of them, but gives us a decent
   set to start out with. The largest chunk of these is in t0*.sh.

 * The CI job is no longer run under both GCC and Clang, but with just
   one default compiler (which happens to be GCC). I'd missed that
   running under both was pointless.

   It would be meaningful to run this under e.g. OSX & Windows too, as
   we take different codepaths there, but that can be left to a
   follow-up series.

1. https://lore.kernel.org/git/8735qvyw0p.fsf@evledraar.gmail.com/

Ævar Arnfjörð Bjarmason (8):
  Makefile: add SANITIZE=leak flag to GIT-BUILD-OPTIONS
  CI: refactor "if" to "case" statement
  tests: add a test mode for SANITIZE=leak, run it in CI
  tests: annotate t000*.sh with TEST_PASSES_SANITIZE_LEAK=true
  tests: annotate t001*.sh with TEST_PASSES_SANITIZE_LEAK=true
  tests: annotate t002*.sh with TEST_PASSES_SANITIZE_LEAK=true
  tests: annotate select t0*.sh with TEST_PASSES_SANITIZE_LEAK=true
  tests: annotate select t*.sh with TEST_PASSES_SANITIZE_LEAK=true

 .github/workflows/main.yml              |  2 ++
 Makefile                                |  5 +++++
 ci/install-dependencies.sh              |  4 ++--
 ci/lib.sh                               | 29 +++++++++++++++++--------
 ci/run-build-and-tests.sh               |  4 ++--
 t/README                                |  7 ++++++
 t/t0000-basic.sh                        |  1 +
 t/t0001-init.sh                         |  1 +
 t/t0002-gitfile.sh                      |  1 +
 t/t0003-attributes.sh                   |  1 +
 t/t0004-unwritable.sh                   |  3 ++-
 t/t0005-signals.sh                      |  2 ++
 t/t0007-git-var.sh                      |  2 ++
 t/t0008-ignores.sh                      |  1 +
 t/t0010-racy-git.sh                     |  1 +
 t/t0011-hashmap.sh                      |  1 +
 t/t0013-sha1dc.sh                       |  1 +
 t/t0016-oidmap.sh                       |  1 +
 t/t0017-env-helper.sh                   |  1 +
 t/t0018-advice.sh                       |  1 +
 t/t0022-crlf-rename.sh                  |  1 +
 t/t0024-crlf-archive.sh                 |  1 +
 t/t0025-crlf-renormalize.sh             |  1 +
 t/t0026-eol-config.sh                   |  1 +
 t/t0029-core-unsetenvvars.sh            |  1 +
 t/t0030-stripspace.sh                   |  1 +
 t/t0052-simple-ipc.sh                   |  1 +
 t/t0061-run-command.sh                  |  1 +
 t/t0063-string-list.sh                  |  1 +
 t/t0066-dir-iterator.sh                 |  1 +
 t/t0067-parse_pathspec_file.sh          |  1 +
 t/t0091-bugreport.sh                    |  1 +
 t/t1010-mktree.sh                       |  1 +
 t/t1100-commit-tree-options.sh          |  1 +
 t/t1308-config-set.sh                   |  1 +
 t/t1309-early-config.sh                 |  1 +
 t/t1420-lost-found.sh                   |  1 +
 t/t1430-bad-ref-name.sh                 |  1 +
 t/t1509-root-work-tree.sh               |  1 +
 t/t2002-checkout-cache-u.sh             |  1 +
 t/t2050-git-dir-relative.sh             |  1 +
 t/t2081-parallel-checkout-collisions.sh |  1 +
 t/t2100-update-cache-badpath.sh         |  1 +
 t/t2200-add-update.sh                   |  1 +
 t/t2201-add-update-typechange.sh        |  1 +
 t/t2202-add-addremove.sh                |  1 +
 t/t2204-add-ignored.sh                  |  1 +
 t/t2300-cd-to-toplevel.sh               |  1 +
 t/t3000-ls-files-others.sh              |  1 +
 t/t3004-ls-files-basic.sh               |  1 +
 t/t3006-ls-files-long.sh                |  1 +
 t/t3008-ls-files-lazy-init-name-hash.sh |  1 +
 t/t3100-ls-tree-restrict.sh             |  1 +
 t/t3101-ls-tree-dirname.sh              |  1 +
 t/t3102-ls-tree-wildcards.sh            |  1 +
 t/t3103-ls-tree-misc.sh                 |  1 +
 t/t3205-branch-color.sh                 |  1 +
 t/t3211-peel-ref.sh                     |  1 +
 t/t3300-funny-names.sh                  |  1 +
 t/t3902-quoted.sh                       |  1 +
 t/t4002-diff-basic.sh                   |  1 +
 t/t4026-color.sh                        |  1 +
 t/t4300-merge-tree.sh                   |  1 +
 t/test-lib.sh                           | 22 +++++++++++++++++++
 t/test-pragma-SANITIZE=leak-ok.sh       |  8 +++++++
 65 files changed, 128 insertions(+), 14 deletions(-)
 create mode 100644 t/test-pragma-SANITIZE=leak-ok.sh

Range-diff against v2:
-:  ----------- > 1:  85619728d41 Makefile: add SANITIZE=leak flag to GIT-BUILD-OPTIONS
-:  ----------- > 2:  91c36b94eaa CI: refactor "if" to "case" statement
1:  df5a44e70b5 ! 3:  7e3577e4e3c tests: add a test mode for SANITIZE=leak, run it in CI
    @@ Commit message
         corresponding GIT_TEST_* mode for it, i.e. memory leaks have been
         fixed as one-offs without structured regression testing.
     
    -    This change add such a mode, we now have new
    -    linux-{clang,gcc}-sanitize-leak CI targets, these targets run the same
    -    tests as linux-{clang,gcc}, except that almost all of them are
    -    skipped.
    +    This change add such a mode, and a new linux-SANITIZE=leak CI
    +    target. The test mode and CI target only runs a whitelist of
    +    known-good tests using a mechanism discussed below, to ensure that we
    +    won't add regressions to code that's had its memory leaks fixed.
     
    -    There is a whitelist of some tests that are OK in test-lib.sh, and
    -    individual tests can be opted-in by setting
    -    GIT_TEST_SANITIZE_LEAK=true before sourcing test-lib.sh. Within those
    -    individual test can be skipped with the "!SANITIZE_LEAK"
    -    prerequisite. See the updated t/README for more details.
    +    The CI target uses a new GIT_TEST_PASSING_SANITIZE_LEAK=true test
    +    mode. When running in that mode all tests except those that have opted
    +    themselves in to running by setting and exporting
    +    TEST_PASSES_SANITIZE_LEAK=true before sourcing test-lib.sh.
     
    -    I'm using the GIT_TEST_SANITIZE_LEAK=true and !SANITIZE_LEAK pattern
    -    in a couple of tests whose memory leaks I'll fix in subsequent
    -    commits.
    +    I'm adding a "test-pragma-SANITIZE=leak-ok.sh" wrapper for setting and
    +    exporting that variable, as the assignment/export boilerplate would
    +    otherwise get quite verbose and repetitive in subsequent commits.
     
    -    I'm not being aggressive about opting in tests, it's not all tests
    -    that currently pass under SANITIZE=leak, just a small number of
    -    known-good tests. We can add more later as we fix leaks and grow more
    -    confident in this test mode.
    +    The tests using the "test-pragma-SANITIZE=leak-ok.sh" pragma can in
    +    turn make use of the "SANITIZE_LEAK" prerequisite added in a preceding
    +    commit, should they wish to selectively skip tests even under
    +    "GIT_TEST_PASSING_SANITIZE_LEAK=true".
     
    -    See the recent discussion at [1] about the lack of this sort of test
    -    mode, and 0e5bba53af (add UNLEAK annotation for reducing leak false
    -    positives, 2017-09-08) for the initial addition of SANITIZE=leak.
    +    Now tests that don't set the "test-pragma-SANITIZE=leak-ok.sh" pragma
    +    will be skipped under GIT_TEST_PASSING_SANITIZE_LEAK=true:
    +
    +        $ GIT_TEST_PASSING_SANITIZE_LEAK=true ./t0001-init.sh
    +        1..0 # SKIP skip all tests in t0001 under SANITIZE=leak, TEST_PASSES_SANITIZE_LEAK not set
    +
    +    In subsequents commit we'll conservatively add more
    +    TEST_PASSES_SANITIZE_LEAK=true annotations. The idea is that as memory
    +    leaks are fixed we can add more known-good tests to this CI target, to
    +    ensure that we won't have regressions.
    +
    +    As of writing this we've got major regressions between master..seen,
    +    i.e. the t000*.sh tests and more fixed since 31f9acf9ce2 (Merge branch
    +    'ah/plugleaks', 2021-08-04) have regressed recently.
    +
    +    See the discussion at <87czsv2idy.fsf@evledraar.gmail.com> about the
    +    lack of this sort of test mode, and 0e5bba53af (add UNLEAK annotation
    +    for reducing leak false positives, 2017-09-08) for the initial
    +    addition of SANITIZE=leak.
     
         See also 09595ab381 (Merge branch 'jk/leak-checkers', 2017-09-19),
         7782066f67 (Merge branch 'jk/apache-lsan', 2019-05-19) and the recent
         936e58851a (Merge branch 'ah/plugleaks', 2021-05-07) for some of the
         past history of "one-off" SANITIZE=leak (and more) fixes.
     
    -    When calling maybe_skip_all_sanitize_leak matching against
    -    "$TEST_NAME" instead of "$this_test" as other "match_pattern_list()"
    -    users do is intentional. I'd like to match things like "t13*config*"
    -    in subsequent commits. This part of the API isn't public, so we can
    -    freely change it in the future.
    -
    -    1. https://lore.kernel.org/git/87czsv2idy.fsf@evledraar.gmail.com/
    -
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
      ## .github/workflows/main.yml ##
    @@ .github/workflows/main.yml: jobs:
                - jobname: linux-gcc-default
                  cc: gcc
                  pool: ubuntu-latest
    -+          - jobname: linux-clang-sanitize-leak
    -+            cc: clang
    -+            pool: ubuntu-latest
    -+          - jobname: linux-gcc-sanitize-leak
    -+            cc: gcc
    ++          - jobname: linux-SANITIZE=leak
     +            pool: ubuntu-latest
          env:
            CC: ${{matrix.vector.cc}}
            jobname: ${{matrix.vector.jobname}}
     
    - ## Makefile ##
    -@@ Makefile: PTHREAD_CFLAGS =
    - SPARSE_FLAGS ?=
    - SP_EXTRA_FLAGS = -Wno-universal-initializer
    - 
    -+# For informing GIT-BUILD-OPTIONS of the SANITIZE=leak target
    -+SANITIZE_LEAK =
    -+
    - # For the 'coccicheck' target; setting SPATCH_BATCH_SIZE higher will
    - # usually result in less CPU usage at the cost of higher peak memory.
    - # Setting it to 0 will feed all files in a single spatch invocation.
    -@@ Makefile: BASIC_CFLAGS += -DSHA1DC_FORCE_ALIGNED_ACCESS
    - endif
    - ifneq ($(filter leak,$(SANITIZERS)),)
    - BASIC_CFLAGS += -DSUPPRESS_ANNOTATED_LEAKS
    -+SANITIZE_LEAK = YesCompiledWithIt
    - endif
    - ifneq ($(filter address,$(SANITIZERS)),)
    - NO_REGEX = NeededForASAN
    -@@ Makefile: GIT-BUILD-OPTIONS: FORCE
    - 	@echo NO_UNIX_SOCKETS=\''$(subst ','\'',$(subst ','\'',$(NO_UNIX_SOCKETS)))'\' >>$@+
    - 	@echo PAGER_ENV=\''$(subst ','\'',$(subst ','\'',$(PAGER_ENV)))'\' >>$@+
    - 	@echo DC_SHA1=\''$(subst ','\'',$(subst ','\'',$(DC_SHA1)))'\' >>$@+
    -+	@echo SANITIZE_LEAK=\''$(subst ','\'',$(subst ','\'',$(SANITIZE_LEAK)))'\' >>$@+
    - 	@echo X=\'$(X)\' >>$@+
    - ifdef TEST_OUTPUT_DIRECTORY
    - 	@echo TEST_OUTPUT_DIRECTORY=\''$(subst ','\'',$(subst ','\'',$(TEST_OUTPUT_DIRECTORY)))'\' >>$@+
    -
      ## ci/install-dependencies.sh ##
     @@ ci/install-dependencies.sh: UBUNTU_COMMON_PKGS="make libssl-dev libcurl4-openssl-dev libexpat-dev
       libemail-valid-perl libio-socket-ssl-perl libnet-smtp-ssl-perl"
      
      case "$jobname" in
     -linux-clang|linux-gcc)
    -+linux-clang|linux-gcc|linux-clang-sanitize-leak|linux-gcc-sanitize-leak)
    ++linux-clang|linux-gcc|linux-SANITIZE=leak)
      	sudo apt-add-repository -y "ppa:ubuntu-toolchain-r/test"
      	sudo apt-get -q update
      	sudo apt-get -q -y install language-pack-is libsvn-perl apache2 \
      		$UBUNTU_COMMON_PKGS
      	case "$jobname" in
     -	linux-gcc)
    -+	linux-gcc|linux-gcc-sanitize-leak)
    ++	linux-gcc|linux-SANITIZE=leak)
      		sudo apt-get -q -y install gcc-8
      		;;
      	esac
    @@ ci/lib.sh: export GIT_TEST_CLONE_2GB=true
      
      case "$jobname" in
     -linux-clang|linux-gcc)
    --	if [ "$jobname" = linux-gcc ]
    --	then
    -+linux-clang|linux-gcc|linux-clang-sanitize-leak|linux-gcc-sanitize-leak)
    -+	case "$jobname" in
    -+	linux-gcc|linux-gcc-sanitize-leak)
    ++linux-clang|linux-gcc|linux-SANITIZE=leak)
    + 	case "$jobname" in
    +-	linux-gcc)
    ++	linux-gcc|linux-SANITIZE=leak)
      		export CC=gcc-8
      		MAKEFLAGS="$MAKEFLAGS PYTHON_PATH=/usr/bin/python3"
    --	else
    -+		;;
    -+	*)
    - 		MAKEFLAGS="$MAKEFLAGS PYTHON_PATH=/usr/bin/python2"
    --	fi
    -+		;;
    -+	esac
    - 
    - 	export GIT_TEST_HTTPD=true
    - 
    + 		;;
     @@ ci/lib.sh: linux-musl)
      	;;
      esac
      
     +case "$jobname" in
    -+linux-clang-sanitize-leak|linux-gcc-sanitize-leak)
    ++linux-SANITIZE=leak)
     +	export SANITIZE=leak
    ++	export GIT_TEST_PASSING_SANITIZE_LEAK=true
     +	;;
     +esac
     +
    @@ ci/run-build-and-tests.sh: esac
      make
      case "$jobname" in
     -linux-gcc)
    -+linux-gcc|linux-gcc-sanitize-leak)
    ++linux-gcc|linux-SANITIZE=leak)
      	export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
      	make test
      	export GIT_TEST_SPLIT_INDEX=yes
    @@ ci/run-build-and-tests.sh: linux-gcc)
      	make test
      	;;
     -linux-clang)
    -+linux-clang|linux-clang-sanitize-leak)
    ++linux-clang|linux-SANITIZE=leak)
      	export GIT_TEST_DEFAULT_HASH=sha1
      	make test
      	export GIT_TEST_DEFAULT_HASH=sha256
    @@ t/README: GIT_TEST_CHECKOUT_WORKERS=<n> overrides the 'checkout.workers' setting
      to <n> and 'checkout.thresholdForParallelism' to 0, forcing the
      execution of the parallel-checkout code.
      
    -+GIT_TEST_SANITIZE_LEAK=<boolean> will force the tests to run when git
    -+is compiled with SANITIZE=leak (we pick it up via
    -+../GIT-BUILD-OPTIONS).
    -+
    -+By default all tests are skipped when compiled with SANITIZE=leak, and
    -+individual test scripts opt themselves in to leak testing by setting
    -+GIT_TEST_SANITIZE_LEAK=true before sourcing test-lib.sh. Within those
    -+tests use the SANITIZE_LEAK prerequisite to skip individiual tests
    -+(i.e. test_expect_success !SANITIZE_LEAK [...]).
    -+
    -+So the GIT_TEST_SANITIZE_LEAK setting is different in behavior from
    -+both other GIT_TEST_*=[true|false] settings, but more useful given how
    -+SANITIZE=leak works & the state of the test suite. Manually setting
    -+GIT_TEST_SANITIZE_LEAK=true is only useful during development when
    -+finding and fixing memory leaks.
    ++GIT_TEST_PASSING_SANITIZE_LEAK=<boolean> when compiled with
    ++SANITIZE=leak will run only those tests that have whitelisted
    ++themselves as passing with no memory leaks. Do this by sourcing
    ++"test-pragma-SANITIZE=leak-ok.sh" before sourcing "test-lib.sh" itself
    ++at the top of the test script. This test mode is used by the
    ++"linux-SANITIZE=leak" CI target.
     +
      Naming Tests
      ------------
      
     
    - ## t/t5701-git-serve.sh ##
    -@@ t/t5701-git-serve.sh: test_expect_success 'unexpected lines are not allowed in fetch request' '
    + ## t/t0000-basic.sh ##
    +@@ t/t0000-basic.sh: swapping compression and hashing order, the person who is making the
    + modification *should* take notice and update the test vectors here.
    + '
    + 
    ++. ./test-pragma-SANITIZE=leak-ok.sh
    + . ./test-lib.sh
      
    - # Test the basics of object-info
    - #
    --test_expect_success 'basics of object-info' '
    -+test_expect_success !SANITIZE_LEAK 'basics of object-info' '
    - 	test-tool pkt-line pack >in <<-EOF &&
    - 	command=object-info
    - 	object-format=$(test_oid algo)
    + try_local_xy () {
     
      ## t/test-lib.sh ##
    -@@ t/test-lib.sh: then
    - 	exit 1
    - fi
    - 
    -+# SANITIZE=leak test mode
    -+sanitize_leak_true=
    -+add_sanitize_leak_true () {
    -+	sanitize_leak_true="$sanitize_leak_true$1 "
    -+}
    -+
    -+sanitize_leak_false=
    -+add_sanitize_leak_false () {
    -+	sanitize_leak_false="$sanitize_leak_false$1 "
    -+}
    -+
    -+sanitize_leak_opt_in_msg="opt-in with GIT_TEST_SANITIZE_LEAK=true"
    -+maybe_skip_all_sanitize_leak () {
    -+	# Whitelist patterns
    -+	add_sanitize_leak_true 't000*'
    -+	add_sanitize_leak_true 't001*'
    -+	add_sanitize_leak_true 't006*'
    -+
    -+	# Blacklist patterns (overrides whitelist)
    -+	add_sanitize_leak_false 't000[469]*'
    -+	add_sanitize_leak_false 't001[2459]*'
    -+	add_sanitize_leak_false 't006[0248]*'
    -+
    -+	if match_pattern_list "$1" "$sanitize_leak_false"
    -+	then
    -+		skip_all="test $this_test on SANITIZE=leak blacklist, $sanitize_leak_opt_in_msg"
    -+		test_done
    -+	elif match_pattern_list "$1" "$sanitize_leak_true"
    -+	then
    -+		return 0
    -+	fi
    -+	return 1
    -+}
    -+
    - # Are we running this test at all?
    - remove_trash=
    - this_test=${0##*/}
     @@ t/test-lib.sh: then
      	test_done
      fi
    @@ t/test-lib.sh: then
     +# SANITIZE=leak
     +if test -n "$SANITIZE_LEAK"
     +then
    -+	if test -z "$GIT_TEST_SANITIZE_LEAK" &&
    -+		maybe_skip_all_sanitize_leak "$TEST_NAME"
    ++	if test_bool_env GIT_TEST_PASSING_SANITIZE_LEAK false
     +	then
    -+		say_color info >&3 "test $this_test on SANITIZE=leak whitelist"
    -+		GIT_TEST_SANITIZE_LEAK=true
    -+	fi
    ++		# We need to see it in "git env--helper" (via
    ++		# test_bool_env)
    ++		export TEST_PASSES_SANITIZE_LEAK
     +
    -+	# We need to see it in "git env--helper" (via
    -+	# test_bool_env)
    -+	export GIT_TEST_SANITIZE_LEAK
    -+
    -+	if ! test_bool_env GIT_TEST_SANITIZE_LEAK false
    -+	then
    -+		skip_all="skip all tests in $this_test under SANITIZE=leak, $sanitize_leak_opt_in_msg"
    -+		test_done
    ++		if ! test_bool_env TEST_PASSES_SANITIZE_LEAK false
    ++		then
    ++			skip_all="skipping $this_test under GIT_TEST_PASSING_SANITIZE_LEAK=true"
    ++			test_done
    ++		fi
     +	fi
    -+elif test_bool_env GIT_TEST_SANITIZE_LEAK false
    ++elif test_bool_env GIT_TEST_PASSING_SANITIZE_LEAK false
     +then
    -+	error "GIT_TEST_SANITIZE_LEAK=true has no effect except when compiled with SANITIZE=leak"
    ++	error "GIT_TEST_PASSING_SANITIZE_LEAK=true has no effect except when compiled with SANITIZE=leak"
     +fi
     +
      # Last-minute variable setup
      HOME="$TRASH_DIRECTORY"
      GNUPGHOME="$HOME/gnupg-home-not-used"
    -@@ t/test-lib.sh: test -z "$NO_PYTHON" && test_set_prereq PYTHON
    - test -n "$USE_LIBPCRE2" && test_set_prereq PCRE
    - test -n "$USE_LIBPCRE2" && test_set_prereq LIBPCRE2
    - test -z "$NO_GETTEXT" && test_set_prereq GETTEXT
    -+test -n "$SANITIZE_LEAK" && test_set_prereq SANITIZE_LEAK
    - 
    - if test -z "$GIT_TEST_CHECK_CACHE_TREE"
    - then
    +
    + ## t/test-pragma-SANITIZE=leak-ok.sh (new) ##
    +@@
    ++#!/bin/sh
    ++
    ++## This "pragma" (as in "perldoc perlpragma") declares that the test
    ++## will pass under GIT_TEST_PASSING_SANITIZE_LEAK=true. Source this
    ++## before sourcing test-lib.sh
    ++
    ++TEST_PASSES_SANITIZE_LEAK=true
    ++export TEST_PASSES_SANITIZE_LEAK
2:  51fd1c400da < -:  ----------- SANITIZE tests: fix memory leaks in t13*config*, add to whitelist
3:  720852eee0b < -:  ----------- SANITIZE tests: fix memory leaks in t5701*, add to whitelist
4:  80edda308c9 < -:  ----------- SANITIZE tests: fix leak in mailmap.c
-:  ----------- > 4:  0cd14d64165 tests: annotate t000*.sh with TEST_PASSES_SANITIZE_LEAK=true
-:  ----------- > 5:  ed5f5705755 tests: annotate t001*.sh with TEST_PASSES_SANITIZE_LEAK=true
-:  ----------- > 6:  2599016c4e7 tests: annotate t002*.sh with TEST_PASSES_SANITIZE_LEAK=true
-:  ----------- > 7:  ddc4d6d2cf1 tests: annotate select t0*.sh with TEST_PASSES_SANITIZE_LEAK=true
-:  ----------- > 8:  e611d2c23d9 tests: annotate select t*.sh with TEST_PASSES_SANITIZE_LEAK=true
-- 
2.33.0.805.g739b16c2189



* [PATCH v3 1/8] Makefile: add SANITIZE=leak flag to GIT-BUILD-OPTIONS
       [not found]     ` <cover-v3-0.8-00000000000-20210831T132607Z-avarab@gmail.com>
@ 2021-08-31 13:35       ` Ævar Arnfjörð Bjarmason
  2021-08-31 13:35       ` [PATCH v3 2/8] CI: refactor "if" to "case" statement Ævar Arnfjörð Bjarmason
                         ` (6 subsequent siblings)
  7 siblings, 0 replies; 125+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-08-31 13:35 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Andrzej Hunt, Lénaïc Huard,
	Derrick Stolee, Felipe Contreras, SZEDER Gábor,
	Đoàn Trần Công Danh, Eric Sunshine,
	Ævar Arnfjörð Bjarmason

When SANITIZE=leak is specified we'll now add a SANITIZE_LEAK flag to
GIT-BUILD-OPTIONS. This can then be picked up by test-lib.sh, which
sets a SANITIZE_LEAK prerequisite.

We can then skip specific tests that are known to fail under
SANITIZE=leak. Add one such annotation to t0004-unwritable.sh, which
now passes under SANITIZE=leak.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 Makefile              | 5 +++++
 t/t0004-unwritable.sh | 2 +-
 t/test-lib.sh         | 1 +
 3 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/Makefile b/Makefile
index d1feab008fc..005d647d46e 100644
--- a/Makefile
+++ b/Makefile
@@ -1221,6 +1221,9 @@ PTHREAD_CFLAGS =
 SPARSE_FLAGS ?=
 SP_EXTRA_FLAGS = -Wno-universal-initializer
 
+# For informing GIT-BUILD-OPTIONS of the SANITIZE=leak target
+SANITIZE_LEAK =
+
 # For the 'coccicheck' target; setting SPATCH_BATCH_SIZE higher will
 # usually result in less CPU usage at the cost of higher peak memory.
 # Setting it to 0 will feed all files in a single spatch invocation.
@@ -1265,6 +1268,7 @@ BASIC_CFLAGS += -DSHA1DC_FORCE_ALIGNED_ACCESS
 endif
 ifneq ($(filter leak,$(SANITIZERS)),)
 BASIC_CFLAGS += -DSUPPRESS_ANNOTATED_LEAKS
+SANITIZE_LEAK = YesCompiledWithIt
 endif
 ifneq ($(filter address,$(SANITIZERS)),)
 NO_REGEX = NeededForASAN
@@ -2812,6 +2816,7 @@ GIT-BUILD-OPTIONS: FORCE
 	@echo NO_UNIX_SOCKETS=\''$(subst ','\'',$(subst ','\'',$(NO_UNIX_SOCKETS)))'\' >>$@+
 	@echo PAGER_ENV=\''$(subst ','\'',$(subst ','\'',$(PAGER_ENV)))'\' >>$@+
 	@echo DC_SHA1=\''$(subst ','\'',$(subst ','\'',$(DC_SHA1)))'\' >>$@+
+	@echo SANITIZE_LEAK=\''$(subst ','\'',$(subst ','\'',$(SANITIZE_LEAK)))'\' >>$@+
 	@echo X=\'$(X)\' >>$@+
 ifdef TEST_OUTPUT_DIRECTORY
 	@echo TEST_OUTPUT_DIRECTORY=\''$(subst ','\'',$(subst ','\'',$(TEST_OUTPUT_DIRECTORY)))'\' >>$@+
diff --git a/t/t0004-unwritable.sh b/t/t0004-unwritable.sh
index e3137d638ee..fbdcb926b3a 100755
--- a/t/t0004-unwritable.sh
+++ b/t/t0004-unwritable.sh
@@ -21,7 +21,7 @@ test_expect_success POSIXPERM,SANITY 'write-tree should notice unwritable reposi
 	test_must_fail git write-tree
 '
 
-test_expect_success POSIXPERM,SANITY 'commit should notice unwritable repository' '
+test_expect_success POSIXPERM,SANITY,!SANITIZE_LEAK 'commit should notice unwritable repository' '
 	test_when_finished "chmod 775 .git/objects .git/objects/??" &&
 	chmod a-w .git/objects .git/objects/?? &&
 	test_must_fail git commit -m second
diff --git a/t/test-lib.sh b/t/test-lib.sh
index abcfbed6d61..4ab18914a3d 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1533,6 +1533,7 @@ test -z "$NO_PYTHON" && test_set_prereq PYTHON
 test -n "$USE_LIBPCRE2" && test_set_prereq PCRE
 test -n "$USE_LIBPCRE2" && test_set_prereq LIBPCRE2
 test -z "$NO_GETTEXT" && test_set_prereq GETTEXT
+test -n "$SANITIZE_LEAK" && test_set_prereq SANITIZE_LEAK
 
 if test -z "$GIT_TEST_CHECK_CACHE_TREE"
 then
-- 
2.33.0.805.g739b16c2189



* [PATCH v3 2/8] CI: refactor "if" to "case" statement
       [not found]     ` <cover-v3-0.8-00000000000-20210831T132607Z-avarab@gmail.com>
  2021-08-31 13:35       ` [PATCH v3 1/8] Makefile: add SANITIZE=leak flag to GIT-BUILD-OPTIONS Ævar Arnfjörð Bjarmason
@ 2021-08-31 13:35       ` Ævar Arnfjörð Bjarmason
  2021-08-31 13:35       ` [PATCH v3 3/8] tests: add a test mode for SANITIZE=leak, run it in CI Ævar Arnfjörð Bjarmason
                         ` (5 subsequent siblings)
  7 siblings, 0 replies; 125+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-08-31 13:35 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Andrzej Hunt, Lénaïc Huard,
	Derrick Stolee, Felipe Contreras, SZEDER Gábor,
	Đoàn Trần Công Danh, Eric Sunshine,
	Ævar Arnfjörð Bjarmason

Refactor an "if" statement for "linux-gcc" to a "case" statement in
preparation for another case being added to it, and do the same for
the "osx-gcc" just below it for consistency.
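
As a rough illustration of why the "case" form is easier to extend,
here is a stand-alone sketch. The python_for_job function is
hypothetical and not part of ci/lib.sh; it only mirrors the linux
branch, using the extra job name that a later patch in this series
adds:

```shell
#!/bin/sh
# Hypothetical helper, for illustration only: map a CI job name to
# the PYTHON_PATH that the linux branch of ci/lib.sh would select.
python_for_job () {
	case "$1" in
	linux-gcc|linux-SANITIZE=leak)
		# A new job joins this arm by extending the pattern list.
		echo /usr/bin/python3
		;;
	*)
		echo /usr/bin/python2
		;;
	esac
}

python_for_job linux-gcc
python_for_job linux-SANITIZE=leak
python_for_job linux-clang
```

With the "if" form the same extension would need a compound test or
an extra "elif" arm.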

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 ci/lib.sh | 20 ++++++++++++--------
 1 file changed, 12 insertions(+), 8 deletions(-)

diff --git a/ci/lib.sh b/ci/lib.sh
index 476c3f369f5..33b9777ab7e 100755
--- a/ci/lib.sh
+++ b/ci/lib.sh
@@ -184,13 +184,15 @@ export SKIP_DASHED_BUILT_INS=YesPlease
 
 case "$jobname" in
 linux-clang|linux-gcc)
-	if [ "$jobname" = linux-gcc ]
-	then
+	case "$jobname" in
+	linux-gcc)
 		export CC=gcc-8
 		MAKEFLAGS="$MAKEFLAGS PYTHON_PATH=/usr/bin/python3"
-	else
+		;;
+	*)
 		MAKEFLAGS="$MAKEFLAGS PYTHON_PATH=/usr/bin/python2"
-	fi
+		;;
+	esac
 
 	export GIT_TEST_HTTPD=true
 
@@ -207,13 +209,15 @@ linux-clang|linux-gcc)
 	export PATH="$GIT_LFS_PATH:$P4_PATH:$PATH"
 	;;
 osx-clang|osx-gcc)
-	if [ "$jobname" = osx-gcc ]
-	then
+	case "$jobname" in
+	osx-gcc)
 		export CC=gcc-9
 		MAKEFLAGS="$MAKEFLAGS PYTHON_PATH=$(which python3)"
-	else
+		;;
+	*)
 		MAKEFLAGS="$MAKEFLAGS PYTHON_PATH=$(which python2)"
-	fi
+		;;
+	esac
 
 	# t9810 occasionally fails on Travis CI OS X
 	# t9816 occasionally fails with "TAP out of sequence errors" on
-- 
2.33.0.805.g739b16c2189



* [PATCH v3 3/8] tests: add a test mode for SANITIZE=leak, run it in CI
       [not found]     ` <cover-v3-0.8-00000000000-20210831T132607Z-avarab@gmail.com>
  2021-08-31 13:35       ` [PATCH v3 1/8] Makefile: add SANITIZE=leak flag to GIT-BUILD-OPTIONS Ævar Arnfjörð Bjarmason
  2021-08-31 13:35       ` [PATCH v3 2/8] CI: refactor "if" to "case" statement Ævar Arnfjörð Bjarmason
@ 2021-08-31 13:35       ` Ævar Arnfjörð Bjarmason
  2021-08-31 13:35       ` [PATCH v3 4/8] tests: annotate t000*.sh with TEST_PASSES_SANITIZE_LEAK=true Ævar Arnfjörð Bjarmason
                         ` (4 subsequent siblings)
  7 siblings, 0 replies; 125+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-08-31 13:35 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Andrzej Hunt, Lénaïc Huard,
	Derrick Stolee, Felipe Contreras, SZEDER Gábor,
	Đoàn Trần Công Danh, Eric Sunshine,
	Ævar Arnfjörð Bjarmason

While git can be compiled with SANITIZE=leak there has been no
corresponding GIT_TEST_* mode for it, i.e. memory leaks have been
fixed as one-offs without structured regression testing.

This change adds such a mode, and a new linux-SANITIZE=leak CI
target. The test mode and CI target run only a whitelist of
known-good tests, using a mechanism discussed below, to ensure that we
won't add regressions to code that's had its memory leaks fixed.

The CI target uses a new GIT_TEST_PASSING_SANITIZE_LEAK=true test
mode. When running in that mode all tests are skipped, except those
that have opted in to running by setting and exporting
TEST_PASSES_SANITIZE_LEAK=true before sourcing test-lib.sh.

I'm adding a "test-pragma-SANITIZE=leak-ok.sh" wrapper for setting and
exporting that variable, as the assignment/export boilerplate would
otherwise get quite verbose and repetitive in subsequent commits.

The tests using the "test-pragma-SANITIZE=leak-ok.sh" pragma can in
turn make use of the "SANITIZE_LEAK" prerequisite added in a preceding
commit, should they wish to selectively skip tests even under
"GIT_TEST_PASSING_SANITIZE_LEAK=true".

Now tests that don't set the "test-pragma-SANITIZE=leak-ok.sh" pragma
will be skipped under GIT_TEST_PASSING_SANITIZE_LEAK=true:

    $ GIT_TEST_PASSING_SANITIZE_LEAK=true ./t0001-init.sh
    1..0 # SKIP skip all tests in t0001 under SANITIZE=leak, TEST_PASSES_SANITIZE_LEAK not set
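
The decision the new test-lib.sh hunk makes can be modelled as a
small stand-alone function. This is only a sketch: leak_gate and
is_true are hypothetical names, and is_true is a simplified stand-in
for test_bool_env, which in the real code goes through "git
env--helper" and accepts more spellings:

```shell
#!/bin/sh
# Simplified stand-in for test_bool_env (an assumption, not git's code).
is_true () {
	case "$1" in
	true|yes|on|1) return 0 ;;
	*) return 1 ;;
	esac
}

# leak_gate <SANITIZE_LEAK> <GIT_TEST_PASSING_SANITIZE_LEAK> <TEST_PASSES_SANITIZE_LEAK>
# Prints what a test script would do: "run", "skip" or "error".
leak_gate () {
	if test -n "$1"
	then
		# Built with SANITIZE=leak: skip only when the mode is
		# requested and the script has not opted in.
		if is_true "$2" && ! is_true "$3"
		then
			echo skip
		else
			echo run
		fi
	elif is_true "$2"
	then
		# Mode requested, but the build has no leak checker.
		echo error
	else
		echo run
	fi
}

leak_gate YesCompiledWithIt true ""	# script without the pragma
leak_gate YesCompiledWithIt true true	# script with the pragma
leak_gate YesCompiledWithIt "" ""	# mode not requested
leak_gate "" true true			# mode requested, wrong build
```

The four calls cover the skip, run, run and error outcomes, in that
order.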

In subsequent commits we'll conservatively add more
TEST_PASSES_SANITIZE_LEAK=true annotations. The idea is that as memory
leaks are fixed we can add more known-good tests to this CI target, to
ensure that we won't have regressions.

As of writing this we've got major regressions between master..seen,
i.e. the t000*.sh tests and more fixed since 31f9acf9ce2 (Merge branch
'ah/plugleaks', 2021-08-04) have regressed recently.

See the discussion at <87czsv2idy.fsf@evledraar.gmail.com> about the
lack of this sort of test mode, and 0e5bba53af (add UNLEAK annotation
for reducing leak false positives, 2017-09-08) for the initial
addition of SANITIZE=leak.

See also 09595ab381 (Merge branch 'jk/leak-checkers', 2017-09-19),
7782066f67 (Merge branch 'jk/apache-lsan', 2019-05-19) and the recent
936e58851a (Merge branch 'ah/plugleaks', 2021-05-07) for some of the
past history of "one-off" SANITIZE=leak (and more) fixes.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 .github/workflows/main.yml        |  2 ++
 ci/install-dependencies.sh        |  4 ++--
 ci/lib.sh                         | 11 +++++++++--
 ci/run-build-and-tests.sh         |  4 ++--
 t/README                          |  7 +++++++
 t/t0000-basic.sh                  |  1 +
 t/test-lib.sh                     | 21 +++++++++++++++++++++
 t/test-pragma-SANITIZE=leak-ok.sh |  8 ++++++++
 8 files changed, 52 insertions(+), 6 deletions(-)
 create mode 100644 t/test-pragma-SANITIZE=leak-ok.sh

diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
index 47876a4f02e..d11b971f970 100644
--- a/.github/workflows/main.yml
+++ b/.github/workflows/main.yml
@@ -232,6 +232,8 @@ jobs:
           - jobname: linux-gcc-default
             cc: gcc
             pool: ubuntu-latest
+          - jobname: linux-SANITIZE=leak
+            pool: ubuntu-latest
     env:
       CC: ${{matrix.vector.cc}}
       jobname: ${{matrix.vector.jobname}}
diff --git a/ci/install-dependencies.sh b/ci/install-dependencies.sh
index 5772081b6e5..30276ae1e00 100755
--- a/ci/install-dependencies.sh
+++ b/ci/install-dependencies.sh
@@ -12,13 +12,13 @@ UBUNTU_COMMON_PKGS="make libssl-dev libcurl4-openssl-dev libexpat-dev
  libemail-valid-perl libio-socket-ssl-perl libnet-smtp-ssl-perl"
 
 case "$jobname" in
-linux-clang|linux-gcc)
+linux-clang|linux-gcc|linux-SANITIZE=leak)
 	sudo apt-add-repository -y "ppa:ubuntu-toolchain-r/test"
 	sudo apt-get -q update
 	sudo apt-get -q -y install language-pack-is libsvn-perl apache2 \
 		$UBUNTU_COMMON_PKGS
 	case "$jobname" in
-	linux-gcc)
+	linux-gcc|linux-SANITIZE=leak)
 		sudo apt-get -q -y install gcc-8
 		;;
 	esac
diff --git a/ci/lib.sh b/ci/lib.sh
index 33b9777ab7e..d86b83ed203 100755
--- a/ci/lib.sh
+++ b/ci/lib.sh
@@ -183,9 +183,9 @@ export GIT_TEST_CLONE_2GB=true
 export SKIP_DASHED_BUILT_INS=YesPlease
 
 case "$jobname" in
-linux-clang|linux-gcc)
+linux-clang|linux-gcc|linux-SANITIZE=leak)
 	case "$jobname" in
-	linux-gcc)
+	linux-gcc|linux-SANITIZE=leak)
 		export CC=gcc-8
 		MAKEFLAGS="$MAKEFLAGS PYTHON_PATH=/usr/bin/python3"
 		;;
@@ -237,4 +237,11 @@ linux-musl)
 	;;
 esac
 
+case "$jobname" in
+linux-SANITIZE=leak)
+	export SANITIZE=leak
+	export GIT_TEST_PASSING_SANITIZE_LEAK=true
+	;;
+esac
+
 MAKEFLAGS="$MAKEFLAGS CC=${CC:-cc}"
diff --git a/ci/run-build-and-tests.sh b/ci/run-build-and-tests.sh
index 3ce81ffee94..f0b9775b6c7 100755
--- a/ci/run-build-and-tests.sh
+++ b/ci/run-build-and-tests.sh
@@ -12,7 +12,7 @@ esac
 
 make
 case "$jobname" in
-linux-gcc)
+linux-gcc|linux-SANITIZE=leak)
 	export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 	make test
 	export GIT_TEST_SPLIT_INDEX=yes
@@ -29,7 +29,7 @@ linux-gcc)
 	export GIT_TEST_CHECKOUT_WORKERS=2
 	make test
 	;;
-linux-clang)
+linux-clang|linux-SANITIZE=leak)
 	export GIT_TEST_DEFAULT_HASH=sha1
 	make test
 	export GIT_TEST_DEFAULT_HASH=sha256
diff --git a/t/README b/t/README
index 9e701223020..f5dfac568d1 100644
--- a/t/README
+++ b/t/README
@@ -448,6 +448,13 @@ GIT_TEST_CHECKOUT_WORKERS=<n> overrides the 'checkout.workers' setting
 to <n> and 'checkout.thresholdForParallelism' to 0, forcing the
 execution of the parallel-checkout code.
 
+GIT_TEST_PASSING_SANITIZE_LEAK=<boolean> when compiled with
+SANITIZE=leak will run only those tests that have whitelisted
+themselves as passing with no memory leaks. Do this by sourcing
+"test-pragma-SANITIZE=leak-ok.sh" before sourcing "test-lib.sh" itself
+at the top of the test script. This test mode is used by the
+"linux-SANITIZE=leak" CI target.
+
 Naming Tests
 ------------
 
diff --git a/t/t0000-basic.sh b/t/t0000-basic.sh
index cb87768513c..14836c97cc6 100755
--- a/t/t0000-basic.sh
+++ b/t/t0000-basic.sh
@@ -18,6 +18,7 @@ swapping compression and hashing order, the person who is making the
 modification *should* take notice and update the test vectors here.
 '
 
+. ./test-pragma-SANITIZE=leak-ok.sh
 . ./test-lib.sh
 
 try_local_xy () {
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 4ab18914a3d..332dd59257d 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1379,6 +1379,27 @@ then
 	test_done
 fi
 
+# Aggressively skip non-whitelisted tests when compiled with
+# SANITIZE=leak
+if test -n "$SANITIZE_LEAK"
+then
+	if test_bool_env GIT_TEST_PASSING_SANITIZE_LEAK false
+	then
+		# We need to see it in "git env--helper" (via
+		# test_bool_env)
+		export TEST_PASSES_SANITIZE_LEAK
+
+		if ! test_bool_env TEST_PASSES_SANITIZE_LEAK false
+		then
+			skip_all="skipping $this_test under GIT_TEST_PASSING_SANITIZE_LEAK=true"
+			test_done
+		fi
+	fi
+elif test_bool_env GIT_TEST_PASSING_SANITIZE_LEAK false
+then
+	error "GIT_TEST_PASSING_SANITIZE_LEAK=true has no effect except when compiled with SANITIZE=leak"
+fi
+
 # Last-minute variable setup
 HOME="$TRASH_DIRECTORY"
 GNUPGHOME="$HOME/gnupg-home-not-used"
diff --git a/t/test-pragma-SANITIZE=leak-ok.sh b/t/test-pragma-SANITIZE=leak-ok.sh
new file mode 100644
index 00000000000..5f03397075d
--- /dev/null
+++ b/t/test-pragma-SANITIZE=leak-ok.sh
@@ -0,0 +1,8 @@
+#!/bin/sh
+
+## This "pragma" (as in "perldoc perlpragma") declares that the test
+## will pass under GIT_TEST_PASSING_SANITIZE_LEAK=true. Source this
+## before sourcing test-lib.sh
+
+TEST_PASSES_SANITIZE_LEAK=true
+export TEST_PASSES_SANITIZE_LEAK
-- 
2.33.0.805.g739b16c2189



* [PATCH v3 4/8] tests: annotate t000*.sh with TEST_PASSES_SANITIZE_LEAK=true
       [not found]     ` <cover-v3-0.8-00000000000-20210831T132607Z-avarab@gmail.com>
                         ` (2 preceding siblings ...)
  2021-08-31 13:35       ` [PATCH v3 3/8] tests: add a test mode for SANITIZE=leak, run it in CI Ævar Arnfjörð Bjarmason
@ 2021-08-31 13:35       ` Ævar Arnfjörð Bjarmason
  2021-08-31 13:35       ` [PATCH v3 5/8] tests: annotate t001*.sh " Ævar Arnfjörð Bjarmason
                         ` (3 subsequent siblings)
  7 siblings, 0 replies; 125+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-08-31 13:35 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Andrzej Hunt, Lénaïc Huard,
	Derrick Stolee, Felipe Contreras, SZEDER Gábor,
	Đoàn Trần Công Danh, Eric Sunshine,
	Ævar Arnfjörð Bjarmason

Annotate the t000*.sh tests that pass under SANITIZE=leak; these tests
now pass under GIT_TEST_PASSING_SANITIZE_LEAK=true. We skip
t0006-date.sh and t0009-prio-queue.sh due to outstanding memory leaks.
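
The annotation is a one-line change per script. A minimal sketch of
applying it mechanically, assuming the script sources the library
with the exact line ". ./test-lib.sh"; add_leak_pragma is a
hypothetical helper and not something in git's tree:

```shell
#!/bin/sh
# Hypothetical helper: print a test script with the pragma line
# inserted before the first ". ./test-lib.sh" line. Reads the files
# named as arguments, or stdin when there are none.
add_leak_pragma () {
	awk '
	$0 == ". ./test-lib.sh" && !seen {
		print ". ./test-pragma-SANITIZE=leak-ok.sh"
		seen = 1
	}
	{ print }
	' "$@"
}

# Demo on a tiny stand-in for a test script.
printf '%s\n' '#!/bin/sh' '' '. ./test-lib.sh' 'test_done' |
add_leak_pragma
```

The sketch does not check whether a script already carries the
pragma, so it should only be run on scripts that lack it.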

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/t0001-init.sh       | 1 +
 t/t0002-gitfile.sh    | 1 +
 t/t0003-attributes.sh | 1 +
 t/t0004-unwritable.sh | 1 +
 t/t0005-signals.sh    | 2 ++
 t/t0007-git-var.sh    | 2 ++
 t/t0008-ignores.sh    | 1 +
 7 files changed, 9 insertions(+)

diff --git a/t/t0001-init.sh b/t/t0001-init.sh
index df544bb321f..8ce04bcabd2 100755
--- a/t/t0001-init.sh
+++ b/t/t0001-init.sh
@@ -2,6 +2,7 @@
 
 test_description='git init'
 
+. ./test-pragma-SANITIZE=leak-ok.sh
 . ./test-lib.sh
 
 check_config () {
diff --git a/t/t0002-gitfile.sh b/t/t0002-gitfile.sh
index 8440e6add12..3dcb3d16944 100755
--- a/t/t0002-gitfile.sh
+++ b/t/t0002-gitfile.sh
@@ -7,6 +7,7 @@ Verify that plumbing commands work when .git is a file
 GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
+. ./test-pragma-SANITIZE=leak-ok.sh
 . ./test-lib.sh
 
 objpath() {
diff --git a/t/t0003-attributes.sh b/t/t0003-attributes.sh
index 1e4c672b84a..4ef24a35ab5 100755
--- a/t/t0003-attributes.sh
+++ b/t/t0003-attributes.sh
@@ -2,6 +2,7 @@
 
 test_description=gitattributes
 
+. ./test-pragma-SANITIZE=leak-ok.sh
 . ./test-lib.sh
 
 attr_check_basic () {
diff --git a/t/t0004-unwritable.sh b/t/t0004-unwritable.sh
index fbdcb926b3a..35571947ec5 100755
--- a/t/t0004-unwritable.sh
+++ b/t/t0004-unwritable.sh
@@ -2,6 +2,7 @@
 
 test_description='detect unwritable repository and fail correctly'
 
+. ./test-pragma-SANITIZE=leak-ok.sh
 . ./test-lib.sh
 
 test_expect_success setup '
diff --git a/t/t0005-signals.sh b/t/t0005-signals.sh
index 4c214bd11c4..cd3ecf403e0 100755
--- a/t/t0005-signals.sh
+++ b/t/t0005-signals.sh
@@ -1,6 +1,8 @@
 #!/bin/sh
 
 test_description='signals work as we expect'
+
+. ./test-pragma-SANITIZE=leak-ok.sh
 . ./test-lib.sh
 
 cat >expect <<EOF
diff --git a/t/t0007-git-var.sh b/t/t0007-git-var.sh
index 88b9ae81588..bb8353e6d32 100755
--- a/t/t0007-git-var.sh
+++ b/t/t0007-git-var.sh
@@ -1,6 +1,8 @@
 #!/bin/sh
 
 test_description='basic sanity checks for git var'
+
+. ./test-pragma-SANITIZE=leak-ok.sh
 . ./test-lib.sh
 
 test_expect_success 'get GIT_AUTHOR_IDENT' '
diff --git a/t/t0008-ignores.sh b/t/t0008-ignores.sh
index a594b4aa7d0..6daa7ce529e 100755
--- a/t/t0008-ignores.sh
+++ b/t/t0008-ignores.sh
@@ -2,6 +2,7 @@
 
 test_description=check-ignore
 
+. ./test-pragma-SANITIZE=leak-ok.sh
 . ./test-lib.sh
 
 init_vars () {
-- 
2.33.0.805.g739b16c2189



* [PATCH v3 5/8] tests: annotate t001*.sh with TEST_PASSES_SANITIZE_LEAK=true
       [not found]     ` <cover-v3-0.8-00000000000-20210831T132607Z-avarab@gmail.com>
                         ` (3 preceding siblings ...)
  2021-08-31 13:35       ` [PATCH v3 4/8] tests: annotate t000*.sh with TEST_PASSES_SANITIZE_LEAK=true Ævar Arnfjörð Bjarmason
@ 2021-08-31 13:35       ` Ævar Arnfjörð Bjarmason
  2021-08-31 13:35       ` [PATCH v3 6/8] tests: annotate t002*.sh " Ævar Arnfjörð Bjarmason
                         ` (2 subsequent siblings)
  7 siblings, 0 replies; 125+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-08-31 13:35 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Andrzej Hunt, Lénaïc Huard,
	Derrick Stolee, Felipe Contreras, SZEDER Gábor,
	Đoàn Trần Công Danh, Eric Sunshine,
	Ævar Arnfjörð Bjarmason

Annotate the t001*.sh tests that pass under SANITIZE=leak; these tests
now pass under GIT_TEST_PASSING_SANITIZE_LEAK=true. We skip
t0012-help.sh, t0014-alias.sh, t0015-hash.sh and t0019-json-writer.sh
due to outstanding memory leaks.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/t0010-racy-git.sh   | 1 +
 t/t0011-hashmap.sh    | 1 +
 t/t0013-sha1dc.sh     | 1 +
 t/t0016-oidmap.sh     | 1 +
 t/t0017-env-helper.sh | 1 +
 t/t0018-advice.sh     | 1 +
 6 files changed, 6 insertions(+)

diff --git a/t/t0010-racy-git.sh b/t/t0010-racy-git.sh
index 5657c5a87b6..9a627077be4 100755
--- a/t/t0010-racy-git.sh
+++ b/t/t0010-racy-git.sh
@@ -2,6 +2,7 @@
 
 test_description='racy GIT'
 
+. ./test-pragma-SANITIZE=leak-ok.sh
 . ./test-lib.sh
 
 # This test can give false success if your machine is sufficiently
diff --git a/t/t0011-hashmap.sh b/t/t0011-hashmap.sh
index 5343ffd3f92..02b07ffa75c 100755
--- a/t/t0011-hashmap.sh
+++ b/t/t0011-hashmap.sh
@@ -1,6 +1,7 @@
 #!/bin/sh
 
 test_description='test hashmap and string hash functions'
+. ./test-pragma-SANITIZE=leak-ok.sh
 . ./test-lib.sh
 
 test_hashmap() {
diff --git a/t/t0013-sha1dc.sh b/t/t0013-sha1dc.sh
index 419f31a8f7d..812b5fcaff3 100755
--- a/t/t0013-sha1dc.sh
+++ b/t/t0013-sha1dc.sh
@@ -1,6 +1,7 @@
 #!/bin/sh
 
 test_description='test sha1 collision detection'
+. ./test-pragma-SANITIZE=leak-ok.sh
 . ./test-lib.sh
 TEST_DATA="$TEST_DIRECTORY/t0013"
 
diff --git a/t/t0016-oidmap.sh b/t/t0016-oidmap.sh
index 31f8276ba82..a9e135d859b 100755
--- a/t/t0016-oidmap.sh
+++ b/t/t0016-oidmap.sh
@@ -1,6 +1,7 @@
 #!/bin/sh
 
 test_description='test oidmap'
+. ./test-pragma-SANITIZE=leak-ok.sh
 . ./test-lib.sh
 
 # This purposefully is very similar to t0011-hashmap.sh
diff --git a/t/t0017-env-helper.sh b/t/t0017-env-helper.sh
index 4a159f99e44..14bb6797b30 100755
--- a/t/t0017-env-helper.sh
+++ b/t/t0017-env-helper.sh
@@ -2,6 +2,7 @@
 
 test_description='test env--helper'
 
+. ./test-pragma-SANITIZE=leak-ok.sh
 . ./test-lib.sh
 
 
diff --git a/t/t0018-advice.sh b/t/t0018-advice.sh
index 39e5e4b34f8..326752a9711 100755
--- a/t/t0018-advice.sh
+++ b/t/t0018-advice.sh
@@ -2,6 +2,7 @@
 
 test_description='Test advise_if_enabled functionality'
 
+. ./test-pragma-SANITIZE=leak-ok.sh
 . ./test-lib.sh
 
 test_expect_success 'advice should be printed when config variable is unset' '
-- 
2.33.0.805.g739b16c2189



* [PATCH v3 6/8] tests: annotate t002*.sh with TEST_PASSES_SANITIZE_LEAK=true
       [not found]     ` <cover-v3-0.8-00000000000-20210831T132607Z-avarab@gmail.com>
                         ` (4 preceding siblings ...)
  2021-08-31 13:35       ` [PATCH v3 5/8] tests: annotate t001*.sh " Ævar Arnfjörð Bjarmason
@ 2021-08-31 13:35       ` Ævar Arnfjörð Bjarmason
  2021-08-31 13:35       ` [PATCH v3 7/8] tests: annotate select t0*.sh " Ævar Arnfjörð Bjarmason
  2021-08-31 13:35       ` [PATCH v3 8/8] tests: annotate select t*.sh " Ævar Arnfjörð Bjarmason
  7 siblings, 0 replies; 125+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-08-31 13:35 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Andrzej Hunt, Lénaïc Huard,
	Derrick Stolee, Felipe Contreras, SZEDER Gábor,
	Đoàn Trần Công Danh, Eric Sunshine,
	Ævar Arnfjörð Bjarmason

Annotate the t002*.sh tests that pass under SANITIZE=leak; these tests
now pass under GIT_TEST_PASSING_SANITIZE_LEAK=true. We skip
t0020-crlf.sh, t0021-conversion.sh, t0023-crlf-am.sh and
t0028-working-tree-encoding.sh due to outstanding memory leaks.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/t0022-crlf-rename.sh       | 1 +
 t/t0024-crlf-archive.sh      | 1 +
 t/t0025-crlf-renormalize.sh  | 1 +
 t/t0026-eol-config.sh        | 1 +
 t/t0029-core-unsetenvvars.sh | 1 +
 5 files changed, 5 insertions(+)

diff --git a/t/t0022-crlf-rename.sh b/t/t0022-crlf-rename.sh
index 7af3fbcc7b9..d8ae0879bdb 100755
--- a/t/t0022-crlf-rename.sh
+++ b/t/t0022-crlf-rename.sh
@@ -2,6 +2,7 @@
 
 test_description='ignore CR in CRLF sequence while computing similiarity'
 
+. ./test-pragma-SANITIZE=leak-ok.sh
 . ./test-lib.sh
 
 test_expect_success setup '
diff --git a/t/t0024-crlf-archive.sh b/t/t0024-crlf-archive.sh
index 4e9fa3cd684..95913032524 100755
--- a/t/t0024-crlf-archive.sh
+++ b/t/t0024-crlf-archive.sh
@@ -2,6 +2,7 @@
 
 test_description='respect crlf in git archive'
 
+. ./test-pragma-SANITIZE=leak-ok.sh
 . ./test-lib.sh
 
 test_expect_success setup '
diff --git a/t/t0025-crlf-renormalize.sh b/t/t0025-crlf-renormalize.sh
index e13363ade5c..88cbdc5ed3a 100755
--- a/t/t0025-crlf-renormalize.sh
+++ b/t/t0025-crlf-renormalize.sh
@@ -2,6 +2,7 @@
 
 test_description='CRLF renormalization'
 
+. ./test-pragma-SANITIZE=leak-ok.sh
 . ./test-lib.sh
 
 test_expect_success setup '
diff --git a/t/t0026-eol-config.sh b/t/t0026-eol-config.sh
index c5203e232c8..3be010e2f12 100755
--- a/t/t0026-eol-config.sh
+++ b/t/t0026-eol-config.sh
@@ -2,6 +2,7 @@
 
 test_description='CRLF conversion'
 
+. ./test-pragma-SANITIZE=leak-ok.sh
 . ./test-lib.sh
 
 has_cr() {
diff --git a/t/t0029-core-unsetenvvars.sh b/t/t0029-core-unsetenvvars.sh
index 24ce46a6ea1..87566900c2b 100755
--- a/t/t0029-core-unsetenvvars.sh
+++ b/t/t0029-core-unsetenvvars.sh
@@ -2,6 +2,7 @@
 
 test_description='test the Windows-only core.unsetenvvars setting'
 
+. ./test-pragma-SANITIZE=leak-ok.sh
 . ./test-lib.sh
 
 if ! test_have_prereq MINGW
-- 
2.33.0.805.g739b16c2189



* [PATCH v3 7/8] tests: annotate select t0*.sh with TEST_PASSES_SANITIZE_LEAK=true
       [not found]     ` <cover-v3-0.8-00000000000-20210831T132607Z-avarab@gmail.com>
                         ` (5 preceding siblings ...)
  2021-08-31 13:35       ` [PATCH v3 6/8] tests: annotate t002*.sh " Ævar Arnfjörð Bjarmason
@ 2021-08-31 13:35       ` Ævar Arnfjörð Bjarmason
  2021-08-31 13:35       ` [PATCH v3 8/8] tests: annotate select t*.sh " Ævar Arnfjörð Bjarmason
  7 siblings, 0 replies; 125+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-08-31 13:35 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Andrzej Hunt, Lénaïc Huard,
	Derrick Stolee, Felipe Contreras, SZEDER Gábor,
	Đoàn Trần Công Danh, Eric Sunshine,
	Ævar Arnfjörð Bjarmason

Annotate a few t0*.sh tests that pass with SANITIZE=leak; these tests
now pass under GIT_TEST_PASSING_SANITIZE_LEAK=true. These aren't all
of the ones in t0*.sh that pass; I'm selecting a few that test
some core APIs, and the simple "git bugreport" built-in.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/t0030-stripspace.sh          | 1 +
 t/t0052-simple-ipc.sh          | 1 +
 t/t0061-run-command.sh         | 1 +
 t/t0063-string-list.sh         | 1 +
 t/t0066-dir-iterator.sh        | 1 +
 t/t0067-parse_pathspec_file.sh | 1 +
 t/t0091-bugreport.sh           | 1 +
 7 files changed, 7 insertions(+)

diff --git a/t/t0030-stripspace.sh b/t/t0030-stripspace.sh
index 0c24a0f9a37..d00f7dd01e8 100755
--- a/t/t0030-stripspace.sh
+++ b/t/t0030-stripspace.sh
@@ -5,6 +5,7 @@
 
 test_description='git stripspace'
 
+. ./test-pragma-SANITIZE=leak-ok.sh
 . ./test-lib.sh
 
 t40='A quick brown fox jumps over the lazy do'
diff --git a/t/t0052-simple-ipc.sh b/t/t0052-simple-ipc.sh
index ff98be31a51..f76a1f5e249 100755
--- a/t/t0052-simple-ipc.sh
+++ b/t/t0052-simple-ipc.sh
@@ -2,6 +2,7 @@
 
 test_description='simple command server'
 
+. ./test-pragma-SANITIZE=leak-ok.sh
 . ./test-lib.sh
 
 test-tool simple-ipc SUPPORTS_SIMPLE_IPC || {
diff --git a/t/t0061-run-command.sh b/t/t0061-run-command.sh
index 7d599675e35..89fd3b18e52 100755
--- a/t/t0061-run-command.sh
+++ b/t/t0061-run-command.sh
@@ -5,6 +5,7 @@
 
 test_description='Test run command'
 
+. ./test-pragma-SANITIZE=leak-ok.sh
 . ./test-lib.sh
 
 cat >hello-script <<-EOF
diff --git a/t/t0063-string-list.sh b/t/t0063-string-list.sh
index c6ee9f66b11..0bd69de4f75 100755
--- a/t/t0063-string-list.sh
+++ b/t/t0063-string-list.sh
@@ -5,6 +5,7 @@
 
 test_description='Test string list functionality'
 
+. ./test-pragma-SANITIZE=leak-ok.sh
 . ./test-lib.sh
 
 test_split () {
diff --git a/t/t0066-dir-iterator.sh b/t/t0066-dir-iterator.sh
index 92910e4e6c1..edafdbbe7dc 100755
--- a/t/t0066-dir-iterator.sh
+++ b/t/t0066-dir-iterator.sh
@@ -2,6 +2,7 @@
 
 test_description='Test the dir-iterator functionality'
 
+. ./test-pragma-SANITIZE=leak-ok.sh
 . ./test-lib.sh
 
 test_expect_success 'setup' '
diff --git a/t/t0067-parse_pathspec_file.sh b/t/t0067-parse_pathspec_file.sh
index 7bab49f361a..cc2540db9f9 100755
--- a/t/t0067-parse_pathspec_file.sh
+++ b/t/t0067-parse_pathspec_file.sh
@@ -2,6 +2,7 @@
 
 test_description='Test parse_pathspec_file()'
 
+. ./test-pragma-SANITIZE=leak-ok.sh
 . ./test-lib.sh
 
 test_expect_success 'one item from stdin' '
diff --git a/t/t0091-bugreport.sh b/t/t0091-bugreport.sh
index 526304ff95b..946909dbfde 100755
--- a/t/t0091-bugreport.sh
+++ b/t/t0091-bugreport.sh
@@ -2,6 +2,7 @@
 
 test_description='git bugreport'
 
+. ./test-pragma-SANITIZE=leak-ok.sh
 . ./test-lib.sh
 
 # Headers "[System Info]" will be followed by a non-empty line if we put some
-- 
2.33.0.805.g739b16c2189



* [PATCH v3 8/8] tests: annotate select t*.sh with TEST_PASSES_SANITIZE_LEAK=true
       [not found]     ` <cover-v3-0.8-00000000000-20210831T132607Z-avarab@gmail.com>
                         ` (6 preceding siblings ...)
  2021-08-31 13:35       ` [PATCH v3 7/8] tests: annotate select t0*.sh " Ævar Arnfjörð Bjarmason
@ 2021-08-31 13:35       ` Ævar Arnfjörð Bjarmason
  7 siblings, 0 replies; 125+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-08-31 13:35 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Andrzej Hunt, Lénaïc Huard,
	Derrick Stolee, Felipe Contreras, SZEDER Gábor,
	Đoàn Trần Công Danh, Eric Sunshine,
	Ævar Arnfjörð Bjarmason

Annotate a few t*.sh tests that pass with SANITIZE=leak; these tests
now pass under GIT_TEST_PASSING_SANITIZE_LEAK=true. These aren't all
of the ones in t*.sh that pass; I'm selecting a few arbitrary passing
ones that stress a few common commands and APIs.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 t/t1010-mktree.sh                       | 1 +
 t/t1100-commit-tree-options.sh          | 1 +
 t/t1308-config-set.sh                   | 1 +
 t/t1309-early-config.sh                 | 1 +
 t/t1420-lost-found.sh                   | 1 +
 t/t1430-bad-ref-name.sh                 | 1 +
 t/t1509-root-work-tree.sh               | 1 +
 t/t2002-checkout-cache-u.sh             | 1 +
 t/t2050-git-dir-relative.sh             | 1 +
 t/t2081-parallel-checkout-collisions.sh | 1 +
 t/t2100-update-cache-badpath.sh         | 1 +
 t/t2200-add-update.sh                   | 1 +
 t/t2201-add-update-typechange.sh        | 1 +
 t/t2202-add-addremove.sh                | 1 +
 t/t2204-add-ignored.sh                  | 1 +
 t/t2300-cd-to-toplevel.sh               | 1 +
 t/t3000-ls-files-others.sh              | 1 +
 t/t3004-ls-files-basic.sh               | 1 +
 t/t3006-ls-files-long.sh                | 1 +
 t/t3008-ls-files-lazy-init-name-hash.sh | 1 +
 t/t3100-ls-tree-restrict.sh             | 1 +
 t/t3101-ls-tree-dirname.sh              | 1 +
 t/t3102-ls-tree-wildcards.sh            | 1 +
 t/t3103-ls-tree-misc.sh                 | 1 +
 t/t3205-branch-color.sh                 | 1 +
 t/t3211-peel-ref.sh                     | 1 +
 t/t3300-funny-names.sh                  | 1 +
 t/t3902-quoted.sh                       | 1 +
 t/t4002-diff-basic.sh                   | 1 +
 t/t4026-color.sh                        | 1 +
 t/t4300-merge-tree.sh                   | 1 +
 31 files changed, 31 insertions(+)

diff --git a/t/t1010-mktree.sh b/t/t1010-mktree.sh
index b946f876864..3912625cfad 100755
--- a/t/t1010-mktree.sh
+++ b/t/t1010-mktree.sh
@@ -2,6 +2,7 @@
 
 test_description='git mktree'
 
+. ./test-pragma-SANITIZE=leak-ok.sh
 . ./test-lib.sh
 
 test_expect_success setup '
diff --git a/t/t1100-commit-tree-options.sh b/t/t1100-commit-tree-options.sh
index ae66ba5babf..d603d9a8544 100755
--- a/t/t1100-commit-tree-options.sh
+++ b/t/t1100-commit-tree-options.sh
@@ -12,6 +12,7 @@ Also make sure that command line parser understands the normal
 "flags first and then non flag arguments" command line.
 '
 
+. ./test-pragma-SANITIZE=leak-ok.sh
 . ./test-lib.sh
 
 cat >expected <<EOF
diff --git a/t/t1308-config-set.sh b/t/t1308-config-set.sh
index 88b119a0a35..06d384369df 100755
--- a/t/t1308-config-set.sh
+++ b/t/t1308-config-set.sh
@@ -2,6 +2,7 @@
 
 test_description='Test git config-set API in different settings'
 
+. ./test-pragma-SANITIZE=leak-ok.sh
 . ./test-lib.sh
 
 # 'check_config get_* section.key value' verifies that the entry for
diff --git a/t/t1309-early-config.sh b/t/t1309-early-config.sh
index b4a9158307f..8556ec2ac23 100755
--- a/t/t1309-early-config.sh
+++ b/t/t1309-early-config.sh
@@ -2,6 +2,7 @@
 
 test_description='Test read_early_config()'
 
+. ./test-pragma-SANITIZE=leak-ok.sh
 . ./test-lib.sh
 
 test_expect_success 'read early config' '
diff --git a/t/t1420-lost-found.sh b/t/t1420-lost-found.sh
index dc9e402c555..0c137b047ae 100755
--- a/t/t1420-lost-found.sh
+++ b/t/t1420-lost-found.sh
@@ -4,6 +4,7 @@
 #
 
 test_description='Test fsck --lost-found'
+. ./test-pragma-SANITIZE=leak-ok.sh
 . ./test-lib.sh
 
 test_expect_success setup '
diff --git a/t/t1430-bad-ref-name.sh b/t/t1430-bad-ref-name.sh
index b1839e08771..c49e336bf2e 100755
--- a/t/t1430-bad-ref-name.sh
+++ b/t/t1430-bad-ref-name.sh
@@ -4,6 +4,7 @@ test_description='Test handling of ref names that check-ref-format rejects'
 GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
+. ./test-pragma-SANITIZE=leak-ok.sh
 . ./test-lib.sh
 
 test_expect_success setup '
diff --git a/t/t1509-root-work-tree.sh b/t/t1509-root-work-tree.sh
index 553a3f601ba..3410b53d6a4 100755
--- a/t/t1509-root-work-tree.sh
+++ b/t/t1509-root-work-tree.sh
@@ -9,6 +9,7 @@ Script t1509/prepare-chroot.sh may help you setup chroot, then you
 can chroot in and execute this test from there.
 '
 
+. ./test-pragma-SANITIZE=leak-ok.sh
 . ./test-lib.sh
 
 test_cmp_val() {
diff --git a/t/t2002-checkout-cache-u.sh b/t/t2002-checkout-cache-u.sh
index 70361c806e1..b45cc0dff41 100755
--- a/t/t2002-checkout-cache-u.sh
+++ b/t/t2002-checkout-cache-u.sh
@@ -8,6 +8,7 @@ test_description='git checkout-index -u test.
 With -u flag, git checkout-index internally runs the equivalent of
 git update-index --refresh on the checked out entry.'
 
+. ./test-pragma-SANITIZE=leak-ok.sh
 . ./test-lib.sh
 
 test_expect_success \
diff --git a/t/t2050-git-dir-relative.sh b/t/t2050-git-dir-relative.sh
index 21f4659a9d1..8bc80f0d969 100755
--- a/t/t2050-git-dir-relative.sh
+++ b/t/t2050-git-dir-relative.sh
@@ -12,6 +12,7 @@ into the subdir while keeping the worktree location,
 and tries commits from the top and the subdir, checking
 that the commit-hook still gets called.'
 
+. ./test-pragma-SANITIZE=leak-ok.sh
 . ./test-lib.sh
 
 COMMIT_FILE="$(pwd)/output"
diff --git a/t/t2081-parallel-checkout-collisions.sh b/t/t2081-parallel-checkout-collisions.sh
index f6fcfc0c1e4..f717709db3d 100755
--- a/t/t2081-parallel-checkout-collisions.sh
+++ b/t/t2081-parallel-checkout-collisions.sh
@@ -11,6 +11,7 @@ The tests in this file exercise parallel checkout's collision detection code in
 both these mechanics.
 "
 
+. ./test-pragma-SANITIZE=leak-ok.sh
 . ./test-lib.sh
 . "$TEST_DIRECTORY/lib-parallel-checkout.sh"
 
diff --git a/t/t2100-update-cache-badpath.sh b/t/t2100-update-cache-badpath.sh
index 2df3fdde8bf..c700b6ee0ae 100755
--- a/t/t2100-update-cache-badpath.sh
+++ b/t/t2100-update-cache-badpath.sh
@@ -22,6 +22,7 @@ and tries to git update-index --add the following:
 All of the attempts should fail.
 '
 
+. ./test-pragma-SANITIZE=leak-ok.sh
 . ./test-lib.sh
 
 mkdir path2 path3
diff --git a/t/t2200-add-update.sh b/t/t2200-add-update.sh
index 45ca35d60ac..81a53420813 100755
--- a/t/t2200-add-update.sh
+++ b/t/t2200-add-update.sh
@@ -14,6 +14,7 @@ only the updates to dir/sub.
 Also tested are "git add -u" without limiting, and "git add -u"
 without contents changes, and other conditions'
 
+. ./test-pragma-SANITIZE=leak-ok.sh
 . ./test-lib.sh
 
 test_expect_success setup '
diff --git a/t/t2201-add-update-typechange.sh b/t/t2201-add-update-typechange.sh
index a4eec0a3465..78593dc7451 100755
--- a/t/t2201-add-update-typechange.sh
+++ b/t/t2201-add-update-typechange.sh
@@ -2,6 +2,7 @@
 
 test_description='more git add -u'
 
+. ./test-pragma-SANITIZE=leak-ok.sh
 . ./test-lib.sh
 
 test_expect_success setup '
diff --git a/t/t2202-add-addremove.sh b/t/t2202-add-addremove.sh
index 9ee659098c4..cd0bbf96525 100755
--- a/t/t2202-add-addremove.sh
+++ b/t/t2202-add-addremove.sh
@@ -2,6 +2,7 @@
 
 test_description='git add --all'
 
+. ./test-pragma-SANITIZE=leak-ok.sh
 . ./test-lib.sh
 
 test_expect_success setup '
diff --git a/t/t2204-add-ignored.sh b/t/t2204-add-ignored.sh
index 2e07365bbb0..efb973d688f 100755
--- a/t/t2204-add-ignored.sh
+++ b/t/t2204-add-ignored.sh
@@ -2,6 +2,7 @@
 
 test_description='giving ignored paths to git add'
 
+. ./test-pragma-SANITIZE=leak-ok.sh
 . ./test-lib.sh
 
 test_expect_success setup '
diff --git a/t/t2300-cd-to-toplevel.sh b/t/t2300-cd-to-toplevel.sh
index c8de6d8a190..52794afe14b 100755
--- a/t/t2300-cd-to-toplevel.sh
+++ b/t/t2300-cd-to-toplevel.sh
@@ -2,6 +2,7 @@
 
 test_description='cd_to_toplevel'
 
+. ./test-pragma-SANITIZE=leak-ok.sh
 . ./test-lib.sh
 
 EXEC_PATH="$(git --exec-path)"
diff --git a/t/t3000-ls-files-others.sh b/t/t3000-ls-files-others.sh
index 740ce56eab5..0a7a07ab99f 100755
--- a/t/t3000-ls-files-others.sh
+++ b/t/t3000-ls-files-others.sh
@@ -15,6 +15,7 @@ filesystem.
     path3/file3 - a file in a directory
     path4       - an empty directory
 '
+. ./test-pragma-SANITIZE=leak-ok.sh
 . ./test-lib.sh
 
 test_expect_success 'setup ' '
diff --git a/t/t3004-ls-files-basic.sh b/t/t3004-ls-files-basic.sh
index 9fd5a1f188a..2a69a12e0f0 100755
--- a/t/t3004-ls-files-basic.sh
+++ b/t/t3004-ls-files-basic.sh
@@ -6,6 +6,7 @@ This test runs git ls-files with various unusual or malformed
 command-line arguments.
 '
 
+. ./test-pragma-SANITIZE=leak-ok.sh
 . ./test-lib.sh
 
 test_expect_success 'ls-files in empty repository' '
diff --git a/t/t3006-ls-files-long.sh b/t/t3006-ls-files-long.sh
index e109c3fbfb5..bfb70e0b11d 100755
--- a/t/t3006-ls-files-long.sh
+++ b/t/t3006-ls-files-long.sh
@@ -1,6 +1,7 @@
 #!/bin/sh
 
 test_description='overly long paths'
+. ./test-pragma-SANITIZE=leak-ok.sh
 . ./test-lib.sh
 
 test_expect_success setup '
diff --git a/t/t3008-ls-files-lazy-init-name-hash.sh b/t/t3008-ls-files-lazy-init-name-hash.sh
index 85f37049587..fce9e4c44cf 100755
--- a/t/t3008-ls-files-lazy-init-name-hash.sh
+++ b/t/t3008-ls-files-lazy-init-name-hash.sh
@@ -2,6 +2,7 @@
 
 test_description='Test the lazy init name hash with various folder structures'
 
+. ./test-pragma-SANITIZE=leak-ok.sh
 . ./test-lib.sh
 
 if test 1 -eq $(test-tool online-cpus)
diff --git a/t/t3100-ls-tree-restrict.sh b/t/t3100-ls-tree-restrict.sh
index 18baf49a49c..0562998120f 100755
--- a/t/t3100-ls-tree-restrict.sh
+++ b/t/t3100-ls-tree-restrict.sh
@@ -16,6 +16,7 @@ This test runs git ls-tree with the following in a tree.
 The new path restriction code should do the right thing for path2 and
 path2/baz.  Also path0/ should snow nothing.
 '
+. ./test-pragma-SANITIZE=leak-ok.sh
 . ./test-lib.sh
 
 test_expect_success \
diff --git a/t/t3101-ls-tree-dirname.sh b/t/t3101-ls-tree-dirname.sh
index 12bf31022a8..57df6c7548b 100755
--- a/t/t3101-ls-tree-dirname.sh
+++ b/t/t3101-ls-tree-dirname.sh
@@ -19,6 +19,7 @@ This test runs git ls-tree with the following in a tree.
 Test the handling of multiple directories which have matching file
 entries.  Also test odd filename and missing entries handling.
 '
+. ./test-pragma-SANITIZE=leak-ok.sh
 . ./test-lib.sh
 
 test_expect_success 'setup' '
diff --git a/t/t3102-ls-tree-wildcards.sh b/t/t3102-ls-tree-wildcards.sh
index 1e16c6b8ea6..47070e60428 100755
--- a/t/t3102-ls-tree-wildcards.sh
+++ b/t/t3102-ls-tree-wildcards.sh
@@ -2,6 +2,7 @@
 
 test_description='ls-tree with(out) globs'
 
+. ./test-pragma-SANITIZE=leak-ok.sh
 . ./test-lib.sh
 
 test_expect_success 'setup' '
diff --git a/t/t3103-ls-tree-misc.sh b/t/t3103-ls-tree-misc.sh
index 14520913afc..552c9e16574 100755
--- a/t/t3103-ls-tree-misc.sh
+++ b/t/t3103-ls-tree-misc.sh
@@ -7,6 +7,7 @@ Miscellaneous tests for git ls-tree.
 
 '
 
+. ./test-pragma-SANITIZE=leak-ok.sh
 . ./test-lib.sh
 
 test_expect_success 'setup' '
diff --git a/t/t3205-branch-color.sh b/t/t3205-branch-color.sh
index 08bd906173b..624abb51c1b 100755
--- a/t/t3205-branch-color.sh
+++ b/t/t3205-branch-color.sh
@@ -4,6 +4,7 @@ test_description='basic branch output coloring'
 GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
+. ./test-pragma-SANITIZE=leak-ok.sh
 . ./test-lib.sh
 
 test_expect_success 'set up some sample branches' '
diff --git a/t/t3211-peel-ref.sh b/t/t3211-peel-ref.sh
index 37b9d26f4b6..ac4db4fcf51 100755
--- a/t/t3211-peel-ref.sh
+++ b/t/t3211-peel-ref.sh
@@ -4,6 +4,7 @@ test_description='tests for the peel_ref optimization of packed-refs'
 GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
+. ./test-pragma-SANITIZE=leak-ok.sh
 . ./test-lib.sh
 
 test_expect_success 'create annotated tag in refs/tags' '
diff --git a/t/t3300-funny-names.sh b/t/t3300-funny-names.sh
index f5bf16abcd8..2d562b407fe 100755
--- a/t/t3300-funny-names.sh
+++ b/t/t3300-funny-names.sh
@@ -9,6 +9,7 @@ This test tries pathnames with funny characters in the working
 tree, index, and tree objects.
 '
 
+. ./test-pragma-SANITIZE=leak-ok.sh
 . ./test-lib.sh
 
 HT='	'
diff --git a/t/t3902-quoted.sh b/t/t3902-quoted.sh
index f528008c363..1720fe73686 100755
--- a/t/t3902-quoted.sh
+++ b/t/t3902-quoted.sh
@@ -5,6 +5,7 @@
 
 test_description='quoted output'
 
+. ./test-pragma-SANITIZE=leak-ok.sh
 . ./test-lib.sh
 
 FN='濱野'
diff --git a/t/t4002-diff-basic.sh b/t/t4002-diff-basic.sh
index 6a9f010197c..46964db1ceb 100755
--- a/t/t4002-diff-basic.sh
+++ b/t/t4002-diff-basic.sh
@@ -6,6 +6,7 @@
 test_description='Test diff raw-output.
 
 '
+. ./test-pragma-SANITIZE=leak-ok.sh
 . ./test-lib.sh
 
 . "$TEST_DIRECTORY"/lib-read-tree-m-3way.sh
diff --git a/t/t4026-color.sh b/t/t4026-color.sh
index c0b642c1ab0..8b4b1e01734 100755
--- a/t/t4026-color.sh
+++ b/t/t4026-color.sh
@@ -4,6 +4,7 @@
 #
 
 test_description='Test diff/status color escape codes'
+. ./test-pragma-SANITIZE=leak-ok.sh
 . ./test-lib.sh
 
 ESC=$(printf '\033')
diff --git a/t/t4300-merge-tree.sh b/t/t4300-merge-tree.sh
index e59601e5fe9..2527749aeff 100755
--- a/t/t4300-merge-tree.sh
+++ b/t/t4300-merge-tree.sh
@@ -4,6 +4,7 @@
 #
 
 test_description='git merge-tree'
+. ./test-pragma-SANITIZE=leak-ok.sh
 . ./test-lib.sh
 
 test_expect_success setup '
-- 
2.33.0.805.g739b16c2189


* [PATCH] mailmap.c: fix a memory leak in free_mailap_{info,entry}()
  2021-07-14 17:23     ` [PATCH v2 4/4] SANITIZE tests: fix leak in mailmap.c Ævar Arnfjörð Bjarmason
@ 2021-08-31 13:42       ` Ævar Arnfjörð Bjarmason
  2021-08-31 16:22         ` Eric Sunshine
  2021-08-31 19:38         ` Jeff King
  0 siblings, 2 replies; 125+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-08-31 13:42 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Marius Storm-Olsen,
	Ævar Arnfjörð Bjarmason

In the free_mailmap_entry() code added in 0925ce4d49 (Add map_user()
and clear_mailmap() to mailmap, 2009-02-08) the intent was clearly to
clear the "me" structure, but while we freed parts of the
mailmap_entry structure, we didn't free the structure itself. The same
goes for the "mailmap_info" structure.

This brings us from 50 failures when running t4203-mailmap.sh to
49. Not really progress as far as the number of failures is concerned,
but as far as I can tell this fixes all leaks in mailmap.c
itself. There's still users of it such as builtin/log.c that call
read_mailmap() without a clear_mailmap(), but that's on them.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---

This was originally submitted as part of the SANITIZE=leak series as
https://lore.kernel.org/git/patch-4.4-ad8680f529-20210714T172251Z-avarab@gmail.com/

In its v3 I stopped doing these leak fixes & test changes, let's just
consider this separately. We'll eventually want to add SANITIZE=leak
whitelisting to the relevant test if and when my SANITIZE=leak series
goes in, but we can just do that then along with adding various other
tests.

Range-diff:
1:  80edda308c9 ! 1:  f11eb44e4c5 SANITIZE tests: fix leak in mailmap.c
    @@ Metadata
     Author: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
      ## Commit message ##
    -    SANITIZE tests: fix leak in mailmap.c
    -
    -    Get closer to being able to run t4203-mailmap.sh by fixing a couple of
    -    memory leak in mailmap.c.
    +    mailmap.c: fix a memory leak in free_mailap_{info,entry}()
     
         In the free_mailmap_entry() code added in 0925ce4d49 (Add map_user()
         and clear_mailmap() to mailmap, 2009-02-08) the intent was clearly to
    @@ Commit message
         mailmap_entry structure, we didn't free the structure itself. The same
         goes for the "mailmap_info" structure.
     
    +    This brings us from 50 failures when running t4203-mailmap.sh to
    +    49. Not really progress as far as the number of failures is concerned,
    +    but as far as I can tell this fixes all leaks in mailmap.c
    +    itself. There's still users of it such as builtin/log.c that call
    +    read_mailmap() without a clear_mailmap(), but that's on them.
    +
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
      ## mailmap.c ##
    @@ mailmap.c: static void free_mailmap_entry(void *p, const char *s)
      }
      
      /*
    -
    - ## t/t4203-mailmap.sh ##
    -@@ t/t4203-mailmap.sh: test_expect_success 'check-mailmap bogus contact --stdin' '
    - 	test_must_fail git check-mailmap --stdin bogus </dev/null
    - '
    - 
    -+if test_have_prereq SANITIZE_LEAK
    -+then
    -+	skip_all='skipping the rest of mailmap tests under SANITIZE_LEAK'
    -+	test_done
    -+fi
    -+
    - test_expect_success 'No mailmap' '
    - 	cat >expect <<-EOF &&
    - 	$GIT_AUTHOR_NAME (1):

 mailmap.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/mailmap.c b/mailmap.c
index 462b3956340..40ce152024d 100644
--- a/mailmap.c
+++ b/mailmap.c
@@ -37,6 +37,7 @@ static void free_mailmap_info(void *p, const char *s)
 		 s, debug_str(mi->name), debug_str(mi->email));
 	free(mi->name);
 	free(mi->email);
+	free(mi);
 }
 
 static void free_mailmap_entry(void *p, const char *s)
@@ -52,6 +53,7 @@ static void free_mailmap_entry(void *p, const char *s)
 
 	me->namemap.strdup_strings = 1;
 	string_list_clear_func(&me->namemap, free_mailmap_info);
+	free(me);
 }
 
 /*
-- 
2.33.0.805.g739b16c2189


* [PATCH] protocol-caps.c: fix memory leak in send_info()
  2021-07-14 17:23     ` [PATCH v2 3/4] SANITIZE tests: fix memory leaks in t5701*, " Ævar Arnfjörð Bjarmason
  2021-07-15 17:37       ` Andrzej Hunt
  2021-07-15 21:43       ` Jeff King
@ 2021-08-31 13:46       ` Ævar Arnfjörð Bjarmason
  2021-08-31 15:32         ` Bruno Albuquerque
       [not found]         ` <CAPeR6H69a_HMwWnpHzssaCm_ow=ic7AnzMdZVQJQ2ECRDaWzaA@mail.gmail.com>
  2 siblings, 2 replies; 125+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-08-31 13:46 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Bruno Albuquerque, Andrzej Hunt,
	Ævar Arnfjörð Bjarmason

Fix a memory leak in a2ba162cda (object-info: support for retrieving
object info, 2021-04-20) which appears to have been based on a
misunderstanding of how the pkt-line.c API works. There is no need to
strdup() input to packet_writer_write(), it's just a printf()-like
format function.

This fixes a potentially large memory leak, since the number of OID
lines in an "object-info" call can be arbitrarily large (the leak is
small only if the request is small).

This makes t5701-git-serve.sh pass again under SANITIZE=leak, as it
did before a2ba162cda2.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---

This was originally submitted as part of the SANITIZE=leak series as
https://lore.kernel.org/git/patch-3.4-b7fb5d5a56-20210714T172251Z-avarab@gmail.com/

In its v3 I stopped doing these leak fixes & test changes, let's just
consider this separately. We'll eventually want to add SANITIZE=leak
whitelisting to the relevant test if and when my SANITIZE=leak series
goes in, but we can just do that then along with adding various other
tests.

Range-diff:
1:  720852eee0b ! 1:  9acbc21cdd3 SANITIZE tests: fix memory leaks in t5701*, add to whitelist
    @@ Metadata
     Author: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
      ## Commit message ##
    -    SANITIZE tests: fix memory leaks in t5701*, add to whitelist
    +    protocol-caps.c: fix memory leak in send_info()
     
         Fix a memory leak in a2ba162cda (object-info: support for retrieving
         object info, 2021-04-20) which appears to have been based on a
    -    misunderstanding of how the pkt-line.c API works, there is no need to
    -    strdup() input to, it's just a printf()-like format function.
    +    misunderstanding of how the pkt-line.c API works. There is no need to
    +    strdup() input to packet_writer_write(), it's just a printf()-like
    +    format function.
     
         This fixes a potentially large memory leak, since the number of OID
         lines the "object-info" call can be arbitrarily large (or a small one
         if the request is small).
     
    +    This makes t5701-git-serve.sh pass again under SANITIZE=leak, as it
    +    did before a2ba162cda2.
    +
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
      ## protocol-caps.c ##
    @@ protocol-caps.c: static void send_info(struct repository *r, struct packet_write
      }
      
      int cap_object_info(struct repository *r, struct strvec *keys,
    -
    - ## t/t5701-git-serve.sh ##
    -@@ t/t5701-git-serve.sh: test_description='test protocol v2 server commands'
    - GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
    - export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
    - 
    -+GIT_TEST_SANITIZE_LEAK=true
    - . ./test-lib.sh
    - 
    - test_expect_success 'test capability advertisement' '

 protocol-caps.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/protocol-caps.c b/protocol-caps.c
index 13a9e63a04a..901b6795e42 100644
--- a/protocol-caps.c
+++ b/protocol-caps.c
@@ -69,9 +69,10 @@ static void send_info(struct repository *r, struct packet_writer *writer,
 			}
 		}
 
-		packet_writer_write(writer, "%s",
-				    strbuf_detach(&send_buffer, NULL));
+		packet_writer_write(writer, "%s", send_buffer.buf);
+		strbuf_reset(&send_buffer);
 	}
+	strbuf_release(&send_buffer);
 }
 
 int cap_object_info(struct repository *r, struct strvec *keys,
-- 
2.33.0.805.g739b16c2189


* Re: [PATCH] protocol-caps.c: fix memory leak in send_info()
  2021-08-31 13:46       ` [PATCH] protocol-caps.c: fix memory leak in send_info() Ævar Arnfjörð Bjarmason
@ 2021-08-31 15:32         ` Bruno Albuquerque
  2021-08-31 18:15           ` Junio C Hamano
       [not found]         ` <CAPeR6H69a_HMwWnpHzssaCm_ow=ic7AnzMdZVQJQ2ECRDaWzaA@mail.gmail.com>
  1 sibling, 1 reply; 125+ messages in thread
From: Bruno Albuquerque @ 2021-08-31 15:32 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, Junio C Hamano, Andrzej Hunt

On Tue, Aug 31, 2021 at 6:46 AM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:

[Replying again as I used HTML mail by mistake. Sorry.]

> Fix a memory leak in a2ba162cda (object-info: support for retrieving
> object info, 2021-04-20) which appears to have been based on a
> misunderstanding of how the pkt-line.c API works. There is no need to
> strdup() input to packet_writer_write(), it's just a printf()-like
> format function.
>
> This fixes a potentially large memory leak, since the number of OID
> lines the "object-info" call can be arbitrarily large (or a small one
> if the request is small).
>
> This makes t5701-git-serve.sh pass again under SANITIZE=leak, as it
> did before a2ba162cda2.


Thanks for cleaning up after me. Yes, this was my lack of knowledge of
how the internals of Git work. I was also not aware of SANITIZE=leak
so thanks for the heads up. This looks good to me.

* Re: [PATCH] mailmap.c: fix a memory leak in free_mailap_{info,entry}()
  2021-08-31 13:42       ` [PATCH] mailmap.c: fix a memory leak in free_mailap_{info,entry}() Ævar Arnfjörð Bjarmason
@ 2021-08-31 16:22         ` Eric Sunshine
  2021-08-31 19:38         ` Jeff King
  1 sibling, 0 replies; 125+ messages in thread
From: Eric Sunshine @ 2021-08-31 16:22 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Git List, Junio C Hamano, Marius Storm-Olsen

On Tue, Aug 31, 2021 at 9:43 AM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
> In the free_mailmap_entry() code added in 0925ce4d49 (Add map_user()
> and clear_mailmap() to mailmap, 2009-02-08) the intent was clearly to
> clear the "me" structure, but while we freed parts of the
> mailmap_entry structure, we didn't free the structure itself. The same
> goes for the "mailmap_info" structure.
>
> This brings us from 50 failures when running t4203-mailmap.sh to
> 49. Not really progress as far as the number of failures is concerned,
> but as far as I can tell this fixes all leaks in mailmap.c
> itself. There's still users of it such as builtin/log.c that call
> read_mailmap() without a clear_mailmap(), but that's on them.

As a standalone patch, the "50 failures" is confusing and sounds quite
alarming. Adding even a tiny bit of context:

    s/50 failures/50 SANITIZE failures/

would help reduce the confusion. Alternatively, just dropping the
second paragraph altogether would clear up any misunderstanding since
the first paragraph and the patch body stand well on their own without
any additional explanation.

> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>

* Re: [PATCH] protocol-caps.c: fix memory leak in send_info()
  2021-08-31 15:32         ` Bruno Albuquerque
@ 2021-08-31 18:15           ` Junio C Hamano
  0 siblings, 0 replies; 125+ messages in thread
From: Junio C Hamano @ 2021-08-31 18:15 UTC (permalink / raw)
  To: Bruno Albuquerque
  Cc: Ævar Arnfjörð Bjarmason, git, Andrzej Hunt

Bruno Albuquerque <bga@google.com> writes:

> On Tue, Aug 31, 2021 at 6:46 AM Ævar Arnfjörð Bjarmason
> <avarab@gmail.com> wrote:
>
> [Replying again as I used HTML mail by mistake. Sorry.]
>
>> Fix a memory leak in a2ba162cda (object-info: support for retrieving
>> object info, 2021-04-20) which appears to have been based on a
>> misunderstanding of how the pkt-line.c API works. There is no need to
>> strdup() input to packet_writer_write(), it's just a printf()-like
>> format function.
>>
>> This fixes a potentially large memory leak, since the number of OID
>> lines the "object-info" call can be arbitrarily large (or a small one
>> if the request is small).
>>
>> This makes t5701-git-serve.sh pass again under SANITIZE=leak, as it
>> did before a2ba162cda2.
>
>
> Thanks for cleaning up after me. Yes, this was my lack of knowledge on
> how the internals of Git works. I was also not aware of SANITIZE=leak
> so thanks for the heads up. This looks good to me.

Thanks, both.

Will apply.

* Re: [PATCH] mailmap.c: fix a memory leak in free_mailap_{info,entry}()
  2021-08-31 13:42       ` [PATCH] mailmap.c: fix a memory leak in free_mailap_{info,entry}() Ævar Arnfjörð Bjarmason
  2021-08-31 16:22         ` Eric Sunshine
@ 2021-08-31 19:38         ` Jeff King
  2021-08-31 19:46           ` Junio C Hamano
  1 sibling, 1 reply; 125+ messages in thread
From: Jeff King @ 2021-08-31 19:38 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Junio C Hamano, Marius Storm-Olsen

On Tue, Aug 31, 2021 at 03:42:52PM +0200, Ævar Arnfjörð Bjarmason wrote:

> In the free_mailmap_entry() code added in 0925ce4d49 (Add map_user()
> and clear_mailmap() to mailmap, 2009-02-08) the intent was clearly to
> clear the "me" structure, but while we freed parts of the
> mailmap_entry structure, we didn't free the structure itself. The same
> goes for the "mailmap_info" structure.
> 
> This brings us from 50 failures when running t4203-mailmap.sh to
> 49. Not really progress as far as the number of failures is concerned,
> but as far as I can tell this fixes all leaks in mailmap.c
> itself. There's still users of it such as builtin/log.c that call
> read_mailmap() without a clear_mailmap(), but that's on them.
> 
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---

Thanks, the patch looks good to me. I agree with Eric that mentioning
"leak failures" in the second paragraph would make it less confusing. :)

-Peff

* Re: [PATCH] mailmap.c: fix a memory leak in free_mailap_{info,entry}()
  2021-08-31 19:38         ` Jeff King
@ 2021-08-31 19:46           ` Junio C Hamano
  0 siblings, 0 replies; 125+ messages in thread
From: Junio C Hamano @ 2021-08-31 19:46 UTC (permalink / raw)
  To: Jeff King; +Cc: Ævar Arnfjörð Bjarmason, git, Marius Storm-Olsen

Jeff King <peff@peff.net> writes:

> On Tue, Aug 31, 2021 at 03:42:52PM +0200, Ævar Arnfjörð Bjarmason wrote:
>
>> In the free_mailmap_entry() code added in 0925ce4d49 (Add map_user()
>> and clear_mailmap() to mailmap, 2009-02-08) the intent was clearly to
>> clear the "me" structure, but while we freed parts of the
>> mailmap_entry structure, we didn't free the structure itself. The same
>> goes for the "mailmap_info" structure.
>> 
>> This brings us from 50 failures when running t4203-mailmap.sh to
>> 49. Not really progress as far as the number of failures is concerned,
>> but as far as I can tell this fixes all leaks in mailmap.c
>> itself. There's still users of it such as builtin/log.c that call
>> read_mailmap() without a clear_mailmap(), but that's on them.
>> 
>> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
>> ---
>
> Thanks, the patch looks good to me. I agree with Eric that mentioning
> "leak failures" in the second paragraph would make it less confusing. :)

Here is what I queued.

Thanks, all.

From ccdd5d1eb14a6735c34428e856c0de33f1055520 Mon Sep 17 00:00:00 2001
From: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Date: Tue, 31 Aug 2021 15:42:52 +0200
Subject: [PATCH] mailmap.c: fix a memory leak in free_mailap_{info,entry}()
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

In the free_mailmap_entry() code added in 0925ce4d49 (Add map_user()
and clear_mailmap() to mailmap, 2009-02-08) the intent was clearly to
clear the "me" structure, but while we freed parts of the
mailmap_entry structure, we didn't free the structure itself. The same
goes for the "mailmap_info" structure.

This brings the number of SANITIZE=leak failures in t4203-mailmap.sh
down from 50 to 49. Not really progress as far as the number of
failures is concerned, but as far as I can tell this fixes all leaks
in mailmap.c itself. There's still users of it such as builtin/log.c
that call read_mailmap() without a clear_mailmap(), but that's on
them.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 mailmap.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/mailmap.c b/mailmap.c
index d1f7c0d272..e1c8736093 100644
--- a/mailmap.c
+++ b/mailmap.c
@@ -36,6 +36,7 @@ static void free_mailmap_info(void *p, const char *s)
 		 s, debug_str(mi->name), debug_str(mi->email));
 	free(mi->name);
 	free(mi->email);
+	free(mi);
 }
 
 static void free_mailmap_entry(void *p, const char *s)
@@ -51,6 +52,7 @@ static void free_mailmap_entry(void *p, const char *s)
 
 	me->namemap.strdup_strings = 1;
 	string_list_clear_func(&me->namemap, free_mailmap_info);
+	free(me);
 }
 
 /*
-- 
2.33.0-323-g897a01baa9


* Re: [PATCH] protocol-caps.c: fix memory leak in send_info()
       [not found]         ` <CAPeR6H69a_HMwWnpHzssaCm_ow=ic7AnzMdZVQJQ2ECRDaWzaA@mail.gmail.com>
@ 2021-08-31 20:08           ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 125+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-08-31 20:08 UTC (permalink / raw)
  To: Bruno Albuquerque; +Cc: git, Junio C Hamano, Andrzej Hunt


On Tue, Aug 31 2021, Bruno Albuquerque wrote:

> On Tue, Aug 31, 2021 at 6:46 AM Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
>
>  Fix a memory leak in a2ba162cda (object-info: support for retrieving
>  object info, 2021-04-20) which appears to have been based on a
>  misunderstanding of how the pkt-line.c API works. There is no need to
>  strdup() input to packet_writer_write(), it's just a printf()-like
>  format function.
>
>  This fixes a potentially large memory leak, since the number of OID
>  lines the "object-info" call can be arbitrarily large (or a small one
>  if the request is small).
>
>  This makes t5701-git-serve.sh pass again under SANITIZE=leak, as it
>  did before a2ba162cda2.
>
> Thanks for cleaning up after me. Yes, this was my lack of knowledge on how the internals of Git works. I was also not aware of SANITIZE=leak so thanks for
> the heads up. This looks good to me.

Thanks. For what it's worth, the series I submitted in parallel to
this, which adds a SANITIZE=leak CI mode, could use reviewers :)
https://lore.kernel.org/git/cover-v3-0.8-00000000000-20210831T132546Z-avarab@gmail.com

I.e. having some real tests for this sort of thing and running them in
CI will help to catch any such issues earlier.

* Re: [PATCH v2 2/4] SANITIZE tests: fix memory leaks in t13*config*, add to whitelist
  2021-08-31 12:47               ` Ævar Arnfjörð Bjarmason
@ 2021-09-01  7:53                 ` Jeff King
  2021-09-01 11:45                   ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 125+ messages in thread
From: Jeff King @ 2021-09-01  7:53 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Andrzej Hunt, git, Junio C Hamano, Lénaïc Huard,
	Derrick Stolee, Felipe Contreras, SZEDER Gábor,
	Đoàn Trần Công Danh, Eric Sunshine,
	Elijah Newren

On Tue, Aug 31, 2021 at 02:47:01PM +0200, Ævar Arnfjörð Bjarmason wrote:

> > That works, but now "util" is not available for all the _other_ uses for
> > which it was intended. And if we're not using it for those other uses,
> > then why does it need to exist at all? If we are only using it to hold
> > the allocated string pointer, then shouldn't it be "char *to_free"?
> 
> Because having it be "char *" doesn't cover the common case of
> e.g. getting an already allocated "struct something *" which contains
> your string, setting the "string" in "struct string_list_item" to some
> string in that struct, and the "util" to the struct itself, as we now
> own it and want to free() it later in its entirety.

OK. I buy that storing a void pointer makes it more flexible. I'm not
altogether convinced this pattern is especially common, but it's not any
harder to work with than a "need_to_free" flag, so there's no reason not
to do that (and to be fair, I didn't look around for possible uses of
the pattern; it's just not one I think of as common off the top of my
head).

> That and the even more common case I mentioned upthread of wanting to
> ferry around the truncated version of some char *, but still wanting to
> account for the original for an eventual free().
> 
> But yes, if you want to account for freeing that data *and* have util
> set to something else you'll need to have e.g. your own wrapper struct
> and your own string_list_clear_func() callback.

But stuffing it into the util field of string_list really feels like a
stretch, and something that would make existing string_list use painful.
There are tons of cases where util points to some totally unrelated (in
terms of memory ownership) item. I'd venture to say most cases where
string_list_clear() is called without free_util would count here.

> > I don't think most interfaces take a string_list_item now, so wouldn't
> > they similarly need to be changed? Though the point is that all of these
> > degrade to a regular C-string, so when you are just passing the value
> > (and not ownership), you would just dereference at that point.
> 
> Sure, just like things would need to be changed to handle your proposed
> "struct def_string".
> 
> By piggy-backing on an already used struct in our codebase we can get a
> lot of that memory management pretty much for free without much
> churn.
> 
> If you squint and pretend that "struct string_list_item" isn't called
> something to do with that particular collections API (but it would make
> use of it) then we've already set up most of the scaffolding and
> management for this.

It's that squinting that bothers me. Sure, it's _kinda_ similar. And I
don't have any problem with some kind of struct that says "this is a
string, and when you are done with it, this is how you free it". And I
don't have any problem with building the "dup" version of string_list
with that struct as a primitive. But it seems to me to be orthogonal
from the "util" pointer of a string_list, which is about creating a
mapping from the string to some other thing (which may or may not
contain the string, and may or may not be owned).

TBH, I have always found the "util" field of string_list a bit ugly (and
really most of string_list). I think most cases would be better off with
a different data structure (a set or a hash table), but we didn't have
convenient versions of those for a long time. I don't mind seeing
conversions of string_list to other data structures. But that seems to
be working against using string_list's string struct in more places.

-Peff

* Re: [PATCH v3 0/8] add a test mode for SANITIZE=leak, run it in CI
  2021-08-31 13:35     ` [PATCH v3 0/8] " Ævar Arnfjörð Bjarmason
@ 2021-09-01  9:56       ` Jeff King
  2021-09-01 10:42         ` Jeff King
  2021-09-02 12:25         ` Ævar Arnfjörð Bjarmason
  2021-09-07 15:33       ` [PATCH v4 0/3] " Ævar Arnfjörð Bjarmason
  1 sibling, 2 replies; 125+ messages in thread
From: Jeff King @ 2021-09-01  9:56 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Junio C Hamano, Andrzej Hunt, Lénaïc Huard,
	Derrick Stolee, Felipe Contreras, SZEDER Gábor,
	Đoàn Trần Công Danh, Eric Sunshine

On Tue, Aug 31, 2021 at 03:35:34PM +0200, Ævar Arnfjörð Bjarmason wrote:

>  * In v2 compiling with SANITIZE=leak would change things so only
>    known-good passing tests were run by default, everything else would
>    pass as a dummy. Now the default running of tests is unchanged, but
>    if we run with GIT_TEST_PASSING_SANITIZE_LEAK=true only those tests
>    are run which set and export TEST_PASSES_SANITIZE_LEAK=true.
> 
>  * The facility for declaring known-good tests in test-lib.sh based on
>    wildcards is gone, instead individual tests need to declare if
>    they're OK under SANITIZE=leak.[...]

Hmm. This still seems more complicated than we need. If we just want a
flag in each script, then test-lib.sh can use that flag to tweak
LSAN_OPTIONS. See the patch below.

That has two drawbacks:

  - it doesn't have any way to switch the flag per-test. But IMHO it is
    a mistake to go in that direction. This is all temporary scaffolding
    while we have leaks, and the script-level of granularity is fine.

  - it runs the tests not marked as LSAN-OK, just without leak checking,
    which is redundant in CI where we're already running them. But we
    could still be collecting leak stats (and just not failing the
    tests). See the patch below.

    If we do care about not running them, then I think it makes more
    sense to extend the run/skip mechanisms and build on that.

    (I also think I prefer the central list of "mark these scripts as OK
    for leak-checking", rather than annotating individuals. Because
    again, this is temporary, and it's nice to keep it in a sandbox that
    only people working on leak-checking would look at or touch).

I realize this is kind-of bikeshedding, and I'm not vehemently opposed
to what you have here. It just seems like fewer moving parts would be
less likely to confuse folks who want to poke at it.

>    This is done via "export
>    TEST_PASSES_SANITIZE_LEAK=true", there's a handy import of
>    "./test-pragma-SANITIZE=leak-ok.sh" before sourcing "./test-lib.sh"
>    itself to set this.

I found the extra level of indirection added by this pragma confusing.
We just need to set a variable, which is also a one-liner, and one that
is more obvious about what it's doing. In your code you also export it,
but that's not necessary for something that test-lib.sh is going to look
at. Or if it's really necessary at some point, then test-lib.sh can do
the export itself.
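
To illustrate the point about the export: a file sourced with "." runs
in the same shell, so it sees plain (unexported) variables; only child
processes need the export. A minimal sketch, where DEMO_FLAG and the
temporary "library" file are made-up names for the demonstration and
are not part of the real test-lib.sh:

```shell
#!/bin/sh
# DEMO_FLAG and the temporary "library" file are hypothetical names
# used only for this demonstration.
unset DEMO_FLAG
lib=$(mktemp)
cat >"$lib" <<\EOF
# Stand-in for a sourced library: record whether the flag is visible.
seen_by_sourced_file=${DEMO_FLAG:-unset}
EOF

DEMO_FLAG=1	# assigned, but not exported

# Sourced: runs in this very shell, so the plain variable is visible.
. "$lib"

# Separate process: only exported variables are passed along.
seen_by_child=$(sh -c 'echo "${DEMO_FLAG:-unset}"')

rm -f "$lib"
echo "sourced file saw: $seen_by_sourced_file"
echo "child process saw: $seen_by_child"
```

So an export only matters if something other than the sourced library
(a helper program, say) has to read the flag.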

> Ævar Arnfjörð Bjarmason (8):
>   Makefile: add SANITIZE=leak flag to GIT-BUILD-OPTIONS
>   CI: refactor "if" to "case" statement
>   tests: add a test mode for SANITIZE=leak, run it in CI
>   tests: annotate t000*.sh with TEST_PASSES_SANITIZE_LEAK=true
>   tests: annotate t001*.sh with TEST_PASSES_SANITIZE_LEAK=true
>   tests: annotate t002*.sh with TEST_PASSES_SANITIZE_LEAK=true
>   tests: annotate select t0*.sh with TEST_PASSES_SANITIZE_LEAK=true
>   tests: annotate select t*.sh with TEST_PASSES_SANITIZE_LEAK=true

Sort of a meta-question, but what's the plan for folks who add a new
test to say t0000, and it reveals a leak in code they didn't touch?

They'll get a CI failure (as will Junio if he picks up the patch), so
somebody is going to have to deal with it. Do they fix it? Do they unset
the "this script is OK" flag? Do they mark the individual test as
non-lsan-ok?

I do like the idea of finding real regressions. But while the state of
leak-checking is still so immature, I'm worried about this adding extra
friction for developers. Especially if they get some spooky action at a
distance caused by a leak in far-away code.

Anyway, here's the LSAN_OPTIONS thing I was thinking of.

---
diff --git a/t/t0001-init.sh b/t/t0001-init.sh
index df544bb321..b1da18955d 100755
--- a/t/t0001-init.sh
+++ b/t/t0001-init.sh
@@ -2,6 +2,7 @@
 
 test_description='git init'
 
+TEST_LSAN_OK=1
 . ./test-lib.sh
 
 check_config () {
diff --git a/t/test-lib.sh b/t/test-lib.sh
index abcfbed6d6..62627afeaf 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -44,9 +44,30 @@ GIT_BUILD_DIR="$TEST_DIRECTORY"/..
 : ${ASAN_OPTIONS=detect_leaks=0:abort_on_error=1}
 export ASAN_OPTIONS
 
-# If LSAN is in effect we _do_ want leak checking, but we still
-# want to abort so that we notice the problems.
-: ${LSAN_OPTIONS=abort_on_error=1}
+if test -n "$LSAN_OPTIONS"
+then
+	# Leave user-provided options alone.
+	:
+elif test -n "$TEST_LSAN_OK"
+then
+	# The test script has declared itself as LSAN-clean; turn on full leak
+	# checking.
+	LSAN_OPTIONS=abort_on_error=1
+else
+	# The test script has possible LSAN failures. Just disable
+	# leak-checking entirely. Another option would be to log the failures
+	# with:
+	#
+	#   LSAN_OPTIONS=exitcode=0:log_path=$TEST_DIRECTORY/lsan/out
+	#
+	# The results are rather confusing, though, as the logs are
+	# per-process; you have no idea which one came from which test script.
+	# Ideally we'd send them to descriptor 4 along with the rest of the
+	# script log, but there's no LSAN_OPTION for that (recent versions of
+	# libsanitizer do have a public function to do so, so we could hook it
+	# ourselves via common-main).
+	LSAN_OPTIONS=detect_leaks=0
+fi
 export LSAN_OPTIONS
 
 if test ! -f "$GIT_BUILD_DIR"/GIT-BUILD-OPTIONS
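
The three-way selection in the test-lib.sh hunk above can be tried in
isolation. A sketch, assuming a made-up helper name choose_lsan_options
(the patch inlines this logic rather than using a function); it takes
the user's LSAN_OPTIONS and the script's TEST_LSAN_OK as arguments and
prints the value that would end up exported:

```shell
#!/bin/sh
# choose_lsan_options is a hypothetical helper; the patch inlines this
# logic in test-lib.sh rather than using a function.
#   $1: LSAN_OPTIONS as provided by the user (may be empty)
#   $2: TEST_LSAN_OK as set by the test script (may be empty)
choose_lsan_options () {
	if test -n "$1"
	then
		# Leave user-provided options alone.
		printf '%s\n' "$1"
	elif test -n "$2"
	then
		# Script declared itself leak-free: check, and abort on leaks.
		printf '%s\n' "abort_on_error=1"
	else
		# Script may leak: turn leak detection off.
		printf '%s\n' "detect_leaks=0"
	fi
}

choose_lsan_options "" 1
choose_lsan_options "" ""
choose_lsan_options "exitcode=0" 1
```

One difference from the removed ": ${LSAN_OPTIONS=...}" line is worth
keeping in mind: that form only assigns when the variable is unset,
whereas "test -n" also treats a set-but-empty LSAN_OPTIONS as absent.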


* Re: [PATCH v3 0/8] add a test mode for SANITIZE=leak, run it in CI
  2021-09-01  9:56       ` Jeff King
@ 2021-09-01 10:42         ` Jeff King
  2021-09-02 12:25         ` Ævar Arnfjörð Bjarmason
  1 sibling, 0 replies; 125+ messages in thread
From: Jeff King @ 2021-09-01 10:42 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Junio C Hamano, Andrzej Hunt, Lénaïc Huard,
	Derrick Stolee, Felipe Contreras, SZEDER Gábor,
	Đoàn Trần Công Danh, Eric Sunshine

On Wed, Sep 01, 2021 at 05:56:31AM -0400, Jeff King wrote:

> +else
> +	# The test script has possible LSAN failures. Just disable
> +	# leak-checking entirely. Another option would be to log the failures
> +	# with:
> +	#
> +	#   LSAN_OPTIONS=exitcode=0:log_path=$TEST_DIRECTORY/lsan/out
> +	#
> +	# The results are rather confusing, though, as the logs are
> +	# per-process; you have no idea which one came from which test script.
> +	# Ideally we'd send them to descriptor 4 along with the rest of the
> +	# script log, but there's no LSAN_OPTION for that (recent versions of
> +	# libsanitizer do have a public function to do so, so we could hook it
> +	# ourselves via common-main).
> +	LSAN_OPTIONS=detect_leaks=0
> +fi

I was curious about the fd thing. The patch below implements it, and
lets t0203 (for example) pass while including its sanitizer output in a
--verbose-log.

But this is exactly the kind of gross complexity I was suggesting to
avoid. :)

diff --git a/Makefile b/Makefile
index d1feab008f..ba2174fb79 100644
--- a/Makefile
+++ b/Makefile
@@ -1260,6 +1260,7 @@ ifdef SANITIZE
 SANITIZERS := $(foreach flag,$(subst $(comma),$(space),$(SANITIZE)),$(flag))
 BASIC_CFLAGS += -fsanitize=$(SANITIZE) -fno-sanitize-recover=$(SANITIZE)
 BASIC_CFLAGS += -fno-omit-frame-pointer
+BASIC_CFLAGS += -DENABLE_CUSTOM_SANITIZER_OPTIONS
 ifneq ($(filter undefined,$(SANITIZERS)),)
 BASIC_CFLAGS += -DSHA1DC_FORCE_ALIGNED_ACCESS
 endif
diff --git a/common-main.c b/common-main.c
index 71e21dd20a..bff594ac04 100644
--- a/common-main.c
+++ b/common-main.c
@@ -2,6 +2,10 @@
 #include "exec-cmd.h"
 #include "attr.h"
 
+#ifdef ENABLE_CUSTOM_SANITIZER_OPTIONS
+#include <sanitizer/asan_interface.h>
+#endif
+
 /*
  * Many parts of Git have subprograms communicate via pipe, expect the
  * upstream of a pipe to die with SIGPIPE when the downstream of a
@@ -23,6 +27,18 @@ static void restore_sigpipe_to_default(void)
 	signal(SIGPIPE, SIG_DFL);
 }
 
+static void handle_custom_sanitizer_options(void)
+{
+#ifdef ENABLE_CUSTOM_SANITIZER_OPTIONS
+	const char *v;
+	v = getenv("GIT_SANITIZER_FD");
+	if (v) {
+		/* weird int-as-void interface from libsanitizer */
+		__sanitizer_set_report_fd((void *)(intptr_t)atoi(v));
+	}
+#endif
+}
+
 int main(int argc, const char **argv)
 {
 	int result;
@@ -37,6 +53,8 @@ int main(int argc, const char **argv)
 	sanitize_stdfds();
 	restore_sigpipe_to_default();
 
+	handle_custom_sanitizer_options();
+
 	git_resolve_executable_dir(argv[0]);
 
 	git_setup_gettext();
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 62627afeaf..674bd30c44 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -54,22 +54,16 @@ then
 	# checking.
 	LSAN_OPTIONS=abort_on_error=1
 else
-	# The test script has possible LSAN failures. Just disable
-	# leak-checking entirely. Another option would be to log the failures
-	# with:
-	#
-	#   LSAN_OPTIONS=exitcode=0:log_path=$TEST_DIRECTORY/lsan/out
-	#
-	# The results are rather confusing, though, as the logs are
-	# per-process; you have no idea which one came from which test script.
-	# Ideally we'd send them to descriptor 4 along with the rest of the
-	# script log, but there's no LSAN_OPTION for that (recent versions of
-	# libsanitizer do have a public function to do so, so we could hook it
-	# ourselves via common-main).
-	LSAN_OPTIONS=detect_leaks=0
+	# The test script has possible LSAN failures. Just log them but don't
+	# touch the exit code.
+	LSAN_OPTIONS=exitcode=0
 fi
 export LSAN_OPTIONS
 
+# If we do generate output, try to avoid it getting tangled up with stderr.
+GIT_SANITIZER_FD=4
+export GIT_SANITIZER_FD
+
 if test ! -f "$GIT_BUILD_DIR"/GIT-BUILD-OPTIONS
 then
 	echo >&2 'error: GIT-BUILD-OPTIONS missing (has Git been built?).'
@@ -463,6 +457,7 @@ unset VISUAL EMAIL LANGUAGE $("$PERL_PATH" -e '
 		PERF_
 		CURL_VERBOSE
 		TRACE_CURL
+		SANITIZER_.*
 	));
 	my @vars = grep(/^GIT_/ && !/^GIT_($ok)/o, @env);
 	print join("\n", @vars);
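
The reason a GIT_SANITIZER_FD=4 setting can work at all is that an open
descriptor is inherited by child processes, so whatever a child writes
to descriptor 4 lands wherever the parent pointed it, separate from
stderr. A rough shell illustration with no sanitizer involved; the
directory, file names and message texts are invented for the demo:

```shell
#!/bin/sh
# Illustration only: no sanitizer runs here. The directory, file names
# and message texts are made up.
dir=$(mktemp -d)

# Run a child with descriptor 4 pointed at one file and stderr at
# another. The child writes a fake "report" to descriptor 4 and an
# ordinary message to stderr.
sh -c 'echo "fake leak report" >&4; echo "ordinary stderr" >&2' \
	4>"$dir/fd4.log" 2>"$dir/stderr.log"

fd4_contents=$(cat "$dir/fd4.log")
stderr_contents=$(cat "$dir/stderr.log")
rm -rf "$dir"

echo "fd 4 got:   $fd4_contents"
echo "stderr got: $stderr_contents"
```

The two streams stay apart, which is the property the test-lib.sh
change wants for sanitizer output in a --verbose-log.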


* Re: [PATCH v2 2/4] SANITIZE tests: fix memory leaks in t13*config*, add to whitelist
  2021-09-01  7:53                 ` Jeff King
@ 2021-09-01 11:45                   ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 125+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-09-01 11:45 UTC (permalink / raw)
  To: Jeff King
  Cc: Andrzej Hunt, git, Junio C Hamano, Lénaïc Huard,
	Derrick Stolee, Felipe Contreras, SZEDER Gábor,
	Đoàn Trần Công Danh, Eric Sunshine,
	Elijah Newren


On Wed, Sep 01 2021, Jeff King wrote:

> On Tue, Aug 31, 2021 at 02:47:01PM +0200, Ævar Arnfjörð Bjarmason wrote:
>
>> > That works, but now "util" is not available for all the _other_ uses for
>> > which it was intended. And if we're not using it for those other uses,
>> > then why does it need to exist at all? If we are only using it to hold
>> > the allocated string pointer, then shouldn't it be "char *to_free"?
>> 
>> Because having it be "char *" doesn't cover the common case of
>> e.g. getting an already allocated "struct something *" which contains
>> your string, setting the "string" in "struct string_list_item" to some
>> string in that struct, and the "util" to the struct itself, as we now
>> own it and want to free() it later in its entirety.
>
> OK. I buy that storing a void pointer makes it more flexible. I'm not
> altogether convinced this pattern is especially common, but it's not any
> harder to work with than a "need_to_free" flag, so there's no reason not
> to do that (and to be fair, I didn't look around for possible uses of
> the pattern; it's just not one I think of as common off the top of my
> head).
>
>> That and the even more common case I mentioned upthread of wanting to
>> ferry around the truncated version of some char *, but still wanting to
>> account for the original for an eventual free().
>> 
>> But yes, if you want to account for freeing that data *and* have util
>> set to something else you'll need to have e.g. your own wrapper struct
>> and your own string_list_clear_func() callback.
>
> But stuffing it into the util field of string_list really feels like a
> stretch, and something that would make existing string_list use painful.
> There are tons of cases where util points to some totally unrelated (in
> terms of memory ownership) item. I'd venture to say most cases where
> string_list_clear() is called without free_util would count here.

For what it's worth I've got some WIP code that's part of my daily build
where I did end up going through all those callers, as part of general
string_list_clear() improvements mentioned offhand in
https://lore.kernel.org/git/87bl6kq631.fsf@evledraar.gmail.com/

This is just from fuzzy memory & I can't recall the specifics (and
haven't combed through that WIP code now), but it's something like this:
of the ~100 uses of string_list in our codebase, 60-70% are the simple
case where "strdup_strings" and string_list_clear() are enough, maybe
another 10-20% have a "util" field they manage or not, and 5%-ish have a
simple string_list_clear_func().

It was just 2-3 cases that leaked memory due to skipping a prefix and
sticking it in the list, and maybe another 1-2 where the void* to a
struct containing the string stuck into the string slot was something we
could use.

So it's not "common" in the sense of absolute numbers, but I did run
into a handful of them, and having them handled by having the
string_list take an arbitrary "util" was something I found neat.

I should probably have said "well known" (as in "well known technique"),
"idiomatic" or something...

>> > I don't think most interfaces take a string_list_item now, so wouldn't
>> > they similarly need to be changed? Though the point is that all of these
>> > degrade to a regular C-string, so when you are just passing the value
>> > (and not ownership), you would just dereference at that point.
>> 
>> Sure, just like things would need to be changed to handle your proposed
>> "struct def_string".
>> 
>> By piggy-backing on an already used struct in our codebase we can get a
>> lot of that memory management pretty much for free without much
>> churn.
>> 
>> If you squint and pretend that "struct string_list_item" isn't called
>> something to do with that particular collections API (but it would make
>> use of it) then we've already set up most of the scaffolding and
>> management for this.
>
> It's that squinting that bothers me. Sure, it's _kinda_ similar. And I
> don't have any problem with some kind of struct that says "this is a
> string, and when you are done with it, this is how you free it". And I
> don't have any problem with building the "dup" version of string_list
> with that struct as a primitive. But it seems to me to be orthogonal
> from the "util" pointer of a string_list, which is about creating a
> mapping from the string to some other thing (which may or may not
> contain the string, and may or may not be owned).

The "util" is whatever the user makes it. We could add a
"pointer_to_free" to every container type to solve this more
cleanly/generally at the API level, but just handing the problem to the
user seems better to me. I.e. an API like string_list has convenience
functions for freeing all the "util", if you only need it for memory
tracking use it as-is, if you need a "real util" *and* such tracking
just create a 2-member wrapper struct yourself & use that.

> TBH, I have always found the "util" field of string_list a bit ugly (and
> really most of string_list). I think most cases would be better off with
> a different data structure (a set or a hash table), but we didn't have
> convenient versions of those for a long time. I don't mind seeing
> conversions of string_list to other data structures. But that seems to
> be working against using string_list's string struct in more places.

If we followed my idle musings we'd be using string_list_item in more
places, not necessarily string_list, and would rename
s/string_list_item/string_and_util/ or something.

One way to look at this problem is that we're pretty close to just
re-inventing the sort of generalized refcounted container type that some
programming languages carry around. E.g. Perl has a "struct SV*" that a
$string maps to, but also hash and array values etc.

Those languages usually have a "refcount" or whatever, but since we're
using this in native C and it's usually (or at least should be) clear
who owns the memory just having something to point free() at will do.

I'm just saying that if we're going halfway there it would be
unfortunate if we'd end up with a "struct def_string" which wouldn't
handle this "borrowing a string from a struct" case.

Or maybe we should just use "struct strbuf" and do copying in even more
places...


* Re: [PATCH v3 0/8] add a test mode for SANITIZE=leak, run it in CI
  2021-09-01  9:56       ` Jeff King
  2021-09-01 10:42         ` Jeff King
@ 2021-09-02 12:25         ` Ævar Arnfjörð Bjarmason
  2021-09-03 11:13           ` Jeff King
  1 sibling, 1 reply; 125+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-09-02 12:25 UTC (permalink / raw)
  To: Jeff King
  Cc: git, Junio C Hamano, Andrzej Hunt, Lénaïc Huard,
	Derrick Stolee, Felipe Contreras, SZEDER Gábor,
	Đoàn Trần Công Danh, Eric Sunshine


On Wed, Sep 01 2021, Jeff King wrote:

> On Tue, Aug 31, 2021 at 03:35:34PM +0200, Ævar Arnfjörð Bjarmason wrote:
>
>>  * In v2 compiling with SANITIZE=leak would change things so only
>>    known-good passing tests were run by default, everything else would
>>    pass as a dummy. Now the default running of tests is unchanged, but
>>    if we run with GIT_TEST_PASSING_SANITIZE_LEAK=true only those tests
>>    are run which set and export TEST_PASSES_SANITIZE_LEAK=true.
>> 
>>  * The facility for declaring known-good tests in test-lib.sh based on
>>    wildcards is gone, instead individual tests need to declare if
>>    they're OK under SANITIZE=leak.[...]
>
> Hmm. This still seems more complicated than we need. If we just want a
> flag in each script, then test-lib.sh can use that flag to tweak
> LSAN_OPTIONS. See the patch below.

On the "pragma" include vs. env var + export: I figured this would be
easier to read as I thought the export was required (I don't think it is
in most cases, but e.g. for t0000*.sh I think it is, but that's from
memory...).

> That has two drawbacks:
>
>   - it doesn't have any way to switch the flag per-test. But IMHO it is
>     a mistake to go in that direction. This is all temporary scaffolding
>     while we have leaks, and the script-level of granularity is fine.

We have a lot of tests that do simple checking of the tool itself, and
later in the script might be stressing trace2, or common sources of
leaks like "git log" in combination with the tool (e.g. the commit-graph
tests).

So being able to tweak this inside the script is useful, but that can of
course also be done with this proposed TEST_LSAN_OK + prereq.

>   - it runs the tests not marked as LSAN-OK, just without leak checking,
>     which is redundant in CI where we're already running them. But we
>     could still be collecting leak stats (and just not failing the
>     tests). See the patch below.

Sure, I'd prefer 

>     If we do care about not running them, then I think it makes more
>     sense to extend the run/skip mechanisms and build on that.

The patch I have here is already nicely integrated with the skip
mechanism. I.e. we use skip_all which shows a summary in any TAP
consumer, and we can skip individual tests with prerequisites.
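
For anyone who hasn't looked at the mechanism: the mode boils down to a
check made before any test runs, and a whole-script skip is reported in
TAP as a "1..0 # SKIP ..." plan line. A self-contained sketch of that
decision; the function name leak_mode_plan and the skip message are
made up, plain string comparison stands in for whatever boolean parsing
the real code does, and the actual test-lib.sh code in the series may
differ:

```shell
#!/bin/sh
# leak_mode_plan is a hypothetical stand-in, not the code from the series.
#   $1: value of GIT_TEST_PASSING_SANITIZE_LEAK
#   $2: value of TEST_PASSES_SANITIZE_LEAK declared by the script
# Prints a TAP skip-all plan line when the whole script should be
# skipped, and prints nothing when the script should run normally.
leak_mode_plan () {
	if test "$1" = true && test "$2" != true
	then
		echo "1..0 # SKIP script not marked as passing under SANITIZE=leak"
	fi
}

leak_mode_plan true ""		# opted in, script unmarked: skipped
leak_mode_plan true true	# opted in, script marked: runs
leak_mode_plan "" ""		# mode off: runs as usual
```

Only the first call prints anything, which matches the "nothing changes
unless you opt in" behaviour of v3.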

>     (I also think I prefer the central list of "mark these scripts as OK
>     for leak-checking", rather than annotating individuals. Because
>     again, this is temporary, and it's nice to keep it in a sandbox that
>     only people working on leak-checking would look at or touch).
>
> I realize this is kind-of bikeshedding, and I'm not vehemently opposed
> to what you have here. It just seems like fewer moving parts would be
> less likely to confuse folks who want to poke at it.

I can see that for the proposed v2 mechanism, but in this v3 nothing
changes unless you opt in to things via the new GIT_TEST_* setting. So the
chance for confusion seems minimal to nonexistent.

I was interested in doing some summaries of existing leaks
eventually. It seems even with LSAN_OPTIONS=detect_leaks=0 compiling
with SANITIZE=leak makes things a bit slower, but not by much (but actual
leak checking is much slower).

But I'd prefer to leave any "write out leak logs and summarize" step for
some later change.

>>    This is done via "export
>>    TEST_PASSES_SANITIZE_LEAK=true", there's a handy import of
>>    "./test-pragma-SANITIZE=leak-ok.sh" before sourcing "./test-lib.sh"
>>    itself to set this.
>
> I found the extra level of indirection added by this pragma confusing.
> We just need to set a variable, which is also a one-liner, and one that
> is more obvious about what it's doing. In your code you also export it,
> but that's not necessary for something that test-lib.sh is going to look
> at. Or if it's really necessary at some point, then test-lib.sh can do
> the export itself.

*nod*, will remove it per discussion above.

>> Ævar Arnfjörð Bjarmason (8):
>>   Makefile: add SANITIZE=leak flag to GIT-BUILD-OPTIONS
>>   CI: refactor "if" to "case" statement
>>   tests: add a test mode for SANITIZE=leak, run it in CI
>>   tests: annotate t000*.sh with TEST_PASSES_SANITIZE_LEAK=true
>>   tests: annotate t001*.sh with TEST_PASSES_SANITIZE_LEAK=true
>>   tests: annotate t002*.sh with TEST_PASSES_SANITIZE_LEAK=true
>>   tests: annotate select t0*.sh with TEST_PASSES_SANITIZE_LEAK=true
>>   tests: annotate select t*.sh with TEST_PASSES_SANITIZE_LEAK=true
>
> Sort of a meta-question, but what's the plan for folks who add a new
> test to say t0000, and it reveals a leak in code they didn't touch?

Then CI will fail on this job. We'd have those same failures now
(e.g. the mentioned current delta between master..seen), we just don't
see them. Having visibility on them seems like an improvement.

> They'll get a CI failure (as will Junio if he picks up the patch), so
> somebody is going to have to deal with it. Do they fix it? Do they unset
> the "this script is OK" flag? Do they mark the individual test as
> non-lsan-ok?

I'd think they'd fix it, or make marking the regression as OK part of
their re-roll, just like failures on master..seen now.

If what you're getting at is that we should start this job out as an FYI
job that doesn't impact the CI run's overall status if it fails, I think
that would be OK as a start.

> I do like the idea of finding real regressions. But while the state of
> leak-checking is still so immature, I'm worried about this adding extra
> friction for developers. Especially if they get some spooky action at a
> distance caused by a leak in far-away code.

Yeah, ultimately this series is an implicit endorsement of us caring
more than we do now.

I think this friction point is going to be mitigated a lot by the
ability I've added to not just skip entire test scripts, but allow
prereq skipping of some tests, early bailing out etc.

It allows you to, say, add a "git log" test at the end of some test that
otherwise just uses some core API or a test tool and not have to throw
the baby out with the bathwater in terms of disabling all existing leak
checks there to make forward progress (or split up the entire test
script).

> Anyway, here's the LSAN_OPTIONS thing I was thinking of.

Thanks, that & your follow-up is very interesting. Can I assume this has
your SOB? I'd like to add that redirect to fd 4 change to this series.

> diff --git a/t/t0001-init.sh b/t/t0001-init.sh
> index df544bb321..b1da18955d 100755
> --- a/t/t0001-init.sh
> +++ b/t/t0001-init.sh
> @@ -2,6 +2,7 @@
>  
>  test_description='git init'
>  
> +TEST_LSAN_OK=1
>  . ./test-lib.sh
>  
>  check_config () {
> diff --git a/t/test-lib.sh b/t/test-lib.sh
> index abcfbed6d6..62627afeaf 100644
> --- a/t/test-lib.sh
> +++ b/t/test-lib.sh
> @@ -44,9 +44,30 @@ GIT_BUILD_DIR="$TEST_DIRECTORY"/..
>  : ${ASAN_OPTIONS=detect_leaks=0:abort_on_error=1}
>  export ASAN_OPTIONS
>  
> -# If LSAN is in effect we _do_ want leak checking, but we still
> -# want to abort so that we notice the problems.
> -: ${LSAN_OPTIONS=abort_on_error=1}
> +if test -n "$LSAN_OPTIONS"
> +then
> +	# Leave user-provided options alone.
> +	:
> +elif test -n "$TEST_LSAN_OK"
> +then
> +	# The test script has declared itself as LSAN-clean; turn on full leak
> +	# checking.
> +	LSAN_OPTIONS=abort_on_error=1
> +else
> +	# The test script has possible LSAN failures. Just disable
> +	# leak-checking entirely. Another option would be to log the failures
> +	# with:
> +	#
> +	#   LSAN_OPTIONS=exitcode=0:log_path=$TEST_DIRECTORY/lsan/out
> +	#
> +	# The results are rather confusing, though, as the logs are
> +	# per-process; you have no idea which one came from which test script.
> +	# Ideally we'd send them to descriptor 4 along with the rest of the
> +	# script log, but there's no LSAN_OPTION for that (recent versions of
> +	# libsanitizer do have a public function to do so, so we could hook it
> +	# ourselves via common-main).
> +	LSAN_OPTIONS=detect_leaks=0
> +fi
>  export LSAN_OPTIONS
>  
>  if test ! -f "$GIT_BUILD_DIR"/GIT-BUILD-OPTIONS



* Re: [PATCH v3 0/8] add a test mode for SANITIZE=leak, run it in CI
  2021-09-02 12:25         ` Ævar Arnfjörð Bjarmason
@ 2021-09-03 11:13           ` Jeff King
  0 siblings, 0 replies; 125+ messages in thread
From: Jeff King @ 2021-09-03 11:13 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Junio C Hamano, Andrzej Hunt, Lénaïc Huard,
	Derrick Stolee, Felipe Contreras, SZEDER Gábor,
	Đoàn Trần Công Danh, Eric Sunshine

On Thu, Sep 02, 2021 at 02:25:33PM +0200, Ævar Arnfjörð Bjarmason wrote:

> > Hmm. This still seems more complicated than we need. If we just want a
> > flag in each script, then test-lib.sh can use that flag to tweak
> > LSAN_OPTIONS. See the patch below.
> 
> On the "pragma" include vs. env var + export: I figured this would be
> easier to read as I thought the export was required (I don't think it is
> in most cases, but e.g. for t0000*.sh I think it is, but that's from
> memory...).

I admit that half of my complaint with the pragma is the weird filename
with an "=" in it. :) But I do think just assigning the variable is the
most readable thing.  If t0000 needs to export for whatever reason, it
can do so (preferably with a comment explaining why).

> > That has two drawbacks:
> >
> >   - it doesn't have any way to switch the flag per-test. But IMHO it is
> >     a mistake to go in that direction. This is all temporary scaffolding
> >     while we have leaks, and the script-level of granularity is fine.
> 
> We have a lot of tests that do simple checking of the tool itself, and
> later in the script might be stressing trace2, or common sources of
> leaks like "git log" in combination with the tool (e.g. the commit-graph
> tests).
> 
> So being able to tweak this inside the script is useful, but that can of
> course also be done with this proposed TEST_LSAN_OK + prereq.

Getting rid of the "let's tell the tests that we were built with LSAN"
was part of the simplicity I was going for (and obviously does preclude
a prerequisite). I had hoped we wouldn't need to do per-test stuff,
because this was all a temporary state. But maybe that's naive.

> >     If we do care about not running them, then I think it makes more
> >     sense to extend the run/skip mechanisms and build on that.
> 
> The patch I have here is already nicely integrated with the skip
> mechanism. I.e. we use skip_all which shows a summary in any TAP
> consumer, and we can skip individual tests with prerequisites.

I meant here that we'd be driving the selection externally from the
tests using the skip/run mechanisms (something along the lines of what I
sketched out before).

But I admit that there isn't really a big difference between the two
approaches. Since you've coded this one up already, let's go in that
direction (i.e., this series).

> I was interested in doing some summaries of existing leaks
> eventually. It seems even with LSAN_OPTIONS=detect_leaks=0 compiling
> with SANITIZE=leak makes things a bit slower, but not by much (but actual
> leak checking is much slower).
> 
> But I'd prefer to leave any "write out leak logs and summarize" step for
> some later change.

OK, I can live with that (especially given how apparently difficult it
is to convince LSAN to do it).

> > Sort of a meta-question, but what's the plan for folks who add a new
> > test to say t0000, and it reveals a leak in code they didn't touch?
> 
> Then CI will fail on this job. We'd have those same failures now
> (e.g. the mentioned current delta between master..seen), we just don't
> see them. Having visibility on them seems like an improvement.
> 
> > They'll get a CI failure (as will Junio if he picks up the patch), so
> > somebody is going to have to deal with it. Do they fix it? Do they unset
> > the "this script is OK" flag? Do they mark the individual test as
> > non-lsan-ok?
> 
> I'd think they'd fix it, or make marking the regression as OK part of
> their re-roll, just like failures on master..seen now.
> 
> If what you're getting at is that we should start this job out as an FYI
> job that doesn't impact the CI run's overall status if it fails, I think
> that would be OK as a start.

I think that would be OK, but I'm not quite sure of the best way to do
it. Why don't we start it as a regular required job, and then we can see
how often it is causing a headache. If once every few months somebody
fixes a leak, I'd be happy. If new developers are getting tangled up
constantly in unrelated leaks, then that's something we'd need to
revisit.

> > I do like the idea of finding real regressions. But while the state of
> > leak-checking is still so immature, I'm worried about this adding extra
> > friction for developers. Especially if they get some spooky action at a
> > distance caused by a leak in far-away code.
> 
> Yeah, ultimately this series is an implicit endorsement of us caring
> more than we do now.
> 
> I think this friction point is going to be mitigated a lot by the
> ability I've added to not just skip entire test scripts, but allow
> prereq skipping of some tests, early bailing out etc.

I half-agree with your final paragraph. The biggest friction point I
think will be for new folks when CI starts failing, and they don't
understand why (or where the problem is, or how to debug it, etc). But
like I said, let's see what happens.

> > Anyway, here's the LSAN_OPTIONS thing I was thinking of.
> 
> Thanks, that & your follow-up is very interesting. Can I assume this has
> your SOB? I'd like to add that redirect to fd 4 change to this series.

Yes, go for it.

-Peff


* [PATCH v4 0/3] add a test mode for SANITIZE=leak, run it in CI
  2021-08-31 13:35     ` [PATCH v3 0/8] " Ævar Arnfjörð Bjarmason
  2021-09-01  9:56       ` Jeff King
@ 2021-09-07 15:33       ` Ævar Arnfjörð Bjarmason
  2021-09-07 15:33         ` [PATCH v4 1/3] Makefile: add SANITIZE=leak flag to GIT-BUILD-OPTIONS Ævar Arnfjörð Bjarmason
                           ` (4 more replies)
  1 sibling, 5 replies; 125+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-09-07 15:33 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Andrzej Hunt, Lénaïc Huard,
	Derrick Stolee, Felipe Contreras, SZEDER Gábor,
	Đoàn Trần Công Danh, Eric Sunshine,
	Ævar Arnfjörð Bjarmason

We can compile git with SANITIZE=leak, and have had various efforts in
the past such as 31f9acf9ce2 (Merge branch 'ah/plugleaks', 2021-08-04)
to plug memory leaks, but have had no CI testing of it to ensure that
we don't get regressions. This series adds a GIT_TEST_* mode for
checking those regressions, and runs it in CI.

Since I submitted v2 the delta between origin/master..origin/seen
broke even t0001-init.sh when run under SANITIZE=leak, so this series
will cause test smoke on "seen".

That failure is due to a bug in es/config-based-hooks [1] and the
hn/reftable topic, i.e. these patches are legitimately catching
regressions in "seen" from day 1.

Changes since v3:

 * Much updated commit message

 * Re-arranged the t/README change to avoid a conflict with "seen".

 * Now testing OSX as well as Linux. Full CI passes on top of "master"
   on both: https://github.com/avar/git/runs/3535331215

 * I ejected the previous 4-8/8 patches that added SANITIZE=leak
   annotations to various tests. Let's focus on the test mode itself
   here and not distract ourselves with whatever other regressions on
   "seen" those annotations might cause; I can submit those
   annotations later.

 * As noted in the updated commit message I didn't end up going with
   Jeff King's suggestion of supporting LSAN_OPTIONS directly, and
   fixing the "fd" the tests write to. All of those things can be
   extended or fixed later.

1. https://lore.kernel.org/git/8735qvyw0p.fsf@evledraar.gmail.com/

Ævar Arnfjörð Bjarmason (3):
  Makefile: add SANITIZE=leak flag to GIT-BUILD-OPTIONS
  CI: refactor "if" to "case" statement
  tests: add a test mode for SANITIZE=leak, run it in CI

 .github/workflows/main.yml |  6 ++++++
 Makefile                   |  5 +++++
 ci/install-dependencies.sh |  6 +++---
 ci/lib.sh                  | 31 +++++++++++++++++++++----------
 ci/run-build-and-tests.sh  |  2 +-
 t/README                   |  7 +++++++
 t/t0000-basic.sh           |  1 +
 t/t0004-unwritable.sh      |  3 ++-
 t/test-lib.sh              | 21 +++++++++++++++++++++
 9 files changed, 67 insertions(+), 15 deletions(-)

Range-diff against v3:
1:  85619728d41 = 1:  bdfe2279271 Makefile: add SANITIZE=leak flag to GIT-BUILD-OPTIONS
2:  91c36b94eaa ! 2:  6aaa60e3759 CI: refactor "if" to "case" statement
    @@ Metadata
      ## Commit message ##
         CI: refactor "if" to "case" statement
     
    -    Refactor an "if" statement for "linux-gcc" to a "case" statement in
    -    preparation for another case being added to it, and do the same for
    -    the "osx-gcc" just below it for consistency.
    +    Refactor an "if" statement for "linux-gcc" and "osx-gcc" to a "case"
    +    statement in preparation for another case being added to them.
     
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
3:  7e3577e4e3c ! 3:  fffbfc35c00 tests: add a test mode for SANITIZE=leak, run it in CI
    @@ Metadata
      ## Commit message ##
         tests: add a test mode for SANITIZE=leak, run it in CI
     
    -    While git can be compiled with SANITIZE=leak there has been no
    -    corresponding GIT_TEST_* mode for it, i.e. memory leaks have been
    -    fixed as one-offs without structured regression testing.
    +    While git can be compiled with SANITIZE=leak we have not run
    +    regression tests under that mode, memory leaks have only been fixed as
    +    one-offs without structured regression testing.
     
    -    This change add such a mode, and a new linux-SANITIZE=leak CI
    -    target. The test mode and CI target only runs a whitelist of
    -    known-good tests using a mechanism discussed below, to ensure that we
    -    won't add regressions to code that's had its memory leaks fixed.
    +    This change adds CI testing for it. We'll now build with GCC under
    +    Linux and test t000[04]*.sh with SANITIZE=leak, and likewise with GCC
    +    on OSX. The new jobs are called "linux-SANITIZE=leak" and
    +    "osx-SANITIZE=leak".
     
         The CI target uses a new GIT_TEST_PASSING_SANITIZE_LEAK=true test
    -    mode. When running in that mode all tests except those that have opted
    -    themselves in to running by setting and exporting
    -    TEST_PASSES_SANITIZE_LEAK=true before sourcing test-lib.sh.
    +    mode. When running in that mode, we'll assert that we were compiled
    +    with SANITIZE=leak, and then skip all tests except those that we've
    +    opted-in by setting "TEST_PASSES_SANITIZE_LEAK=true" before sourcing
    +    test-lib.sh (see discussion in t/README).
     
    -    I'm adding a "test-pragma-SANITIZE=leak-ok.sh" wrapper for setting and
    -    exporting that variable, as the assignment/export boilerplate would
    -    otherwise get quite verbose and repetitive in subsequent commits.
    +    The tests using the "TEST_PASSES_SANITIZE_LEAK=true" setting can in
    +    turn make use of the "SANITIZE_LEAK" prerequisite, should they wish to
    +    selectively skip tests even under
    +    "GIT_TEST_PASSING_SANITIZE_LEAK=true". In a preceding commit we
    +    started doing this in "t0004-unwritable.sh" under SANITIZE=leak, now
    +    it'll combine nicely with "GIT_TEST_PASSING_SANITIZE_LEAK=true".
     
    -    The tests using the "test-pragma-SANITIZE=leak-ok.sh" pragma can in
    -    turn make use of the "SANITIZE_LEAK" prerequisite added in a preceding
    -    commit, should they wish to selectively skip tests even under
    -    "GIT_TEST_PASSING_SANITIZE_LEAK=true".
    -
    -    Now tests that don't set the "test-pragma-SANITIZE=leak-ok.sh" pragma
    -    will be skipped under GIT_TEST_PASSING_SANITIZE_LEAK=true:
    +    Now tests that don't set "TEST_PASSES_SANITIZE_LEAK=true" will be
    +    skipped under GIT_TEST_PASSING_SANITIZE_LEAK=true:
     
             $ GIT_TEST_PASSING_SANITIZE_LEAK=true ./t0001-init.sh
             1..0 # SKIP skip all tests in t0001 under SANITIZE=leak, TEST_PASSES_SANITIZE_LEAK not set
     
    -    In subsequents commit we'll conservatively add more
    -    TEST_PASSES_SANITIZE_LEAK=true annotations. The idea is that as memory
    -    leaks are fixed we can add more known-good tests to this CI target, to
    -    ensure that we won't have regressions.
    +    The intent is to add more TEST_PASSES_SANITIZE_LEAK=true annotations
    +    as a follow-up change, but let's start small.
    +
    +    It would also be possible to implement a more lightweight version of
    +    this by only relying on setting "LSAN_OPTIONS". See
    +    <YS9OT/pn5rRK9cGB@coredump.intra.peff.net>[1] and
    +    <YS9ZIDpANfsh7N+S@coredump.intra.peff.net>[2] for a discussion of
    +    that. I've opted for this approach of adding a GIT_TEST_* mode instead
    +    because it's consistent with how we handle other special test modes.
    +
    +    Being able to add a "!SANITIZE_LEAK" prerequisite and calling
    +    "test_done" early if it isn't satisfied also means that we can more
    +    incrementally add regression tests without being forced to fix
    +    widespread and hard-to-fix leaks at the same time.
    +
    +    We have tests that do simple checking of some tool we're interested
    +    in, but later on in the script might be stressing trace2, or common
    +    sources of leaks like "git log" in combination with the tool (e.g. the
    +    commit-graph tests). To be clear having a prerequisite could also be
    +    accomplished by using "LSAN_OPTIONS" directly.
    +
    +    On the topi of "LSAN_OPTIONS": It would be nice to have a mode to
    +    aggregate all failures in our various scripts, see [2] for a start at
    +    doing that which sets "log_path" in "LSAN_OPTIONS". I've punted on
    +    that for now, it can be added later, and that proposed patch is also
    +    hindered by us wanting to test e.g. test-tool leaks (and by proxy, any
    +    API leaks they uncover), not just the "common-main.c" entry point.
     
         As of writing this we've got major regressions between master..seen,
         i.e. the t000*.sh tests and more fixed since 31f9acf9ce2 (Merge branch
    @@ Commit message
         936e58851a (Merge branch 'ah/plugleaks', 2021-05-07) for some of the
         past history of "one-off" SANITIZE=leak (and more) fixes.
     
    +    The reason for using gcc on OSX over the clang default is because
    +    it'll currently fail to build with:
    +
    +        clang: error: unsupported option '-fsanitize=leak' for target 'x86_64-apple-darwin19.6.0'
    +
    +    If that's sorted out in the future we might want to run that job with
    +    "clang" merely to make use of the default, and also to add some
    +    compiler variance into the mix. Both use the
    +    "AddressSanitizerLeakSanitizer" library[3], so they shouldn't
    +    behave differently under GCC or clang.
    +
    +    1. https://lore.kernel.org/git/YS9OT%2Fpn5rRK9cGB@coredump.intra.peff.net/
    +    2. https://lore.kernel.org/git/YS9ZIDpANfsh7N+S@coredump.intra.peff.net/
    +    3. https://github.com/google/sanitizers/wiki/AddressSanitizerLeakSanitizer
    +
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
      ## .github/workflows/main.yml ##
    @@ .github/workflows/main.yml: jobs:
                  cc: gcc
                  pool: ubuntu-latest
     +          - jobname: linux-SANITIZE=leak
    ++            cc: gcc
     +            pool: ubuntu-latest
    ++          - jobname: osx-SANITIZE=leak
    ++            cc: gcc
    ++            pool: macos-latest
          env:
            CC: ${{matrix.vector.cc}}
            jobname: ${{matrix.vector.jobname}}
    @@ ci/install-dependencies.sh: UBUNTU_COMMON_PKGS="make libssl-dev libcurl4-openssl
      		sudo apt-get -q -y install gcc-8
      		;;
      	esac
    +@@ ci/install-dependencies.sh: linux-clang|linux-gcc)
    + 		cp git-lfs-$LINUX_GIT_LFS_VERSION/git-lfs .
    + 	popd
    + 	;;
    +-osx-clang|osx-gcc)
    ++osx-clang|osx-gcc|osx-SANITIZE=leak)
    + 	export HOMEBREW_NO_AUTO_UPDATE=1 HOMEBREW_NO_INSTALL_CLEANUP=1
    + 	# Uncomment this if you want to run perf tests:
    + 	# brew install gnu-time
     
      ## ci/lib.sh ##
     @@ ci/lib.sh: export GIT_TEST_CLONE_2GB=true
    @@ ci/lib.sh: export GIT_TEST_CLONE_2GB=true
      		export CC=gcc-8
      		MAKEFLAGS="$MAKEFLAGS PYTHON_PATH=/usr/bin/python3"
      		;;
    +@@ ci/lib.sh: linux-clang|linux-gcc)
    + 	GIT_LFS_PATH="$HOME/custom/git-lfs"
    + 	export PATH="$GIT_LFS_PATH:$P4_PATH:$PATH"
    + 	;;
    +-osx-clang|osx-gcc)
    ++osx-clang|osx-gcc|osx-SANITIZE=leak)
    + 	case "$jobname" in
    +-	osx-gcc)
    ++	osx-gcc|osx-SANITIZE=leak)
    + 		export CC=gcc-9
    + 		MAKEFLAGS="$MAKEFLAGS PYTHON_PATH=$(which python3)"
    + 		;;
     @@ ci/lib.sh: linux-musl)
      	;;
      esac
      
     +case "$jobname" in
    -+linux-SANITIZE=leak)
    ++linux-SANITIZE=leak|osx-SANITIZE=leak)
     +	export SANITIZE=leak
     +	export GIT_TEST_PASSING_SANITIZE_LEAK=true
     +	;;
    @@ ci/run-build-and-tests.sh: esac
      	export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
      	make test
      	export GIT_TEST_SPLIT_INDEX=yes
    -@@ ci/run-build-and-tests.sh: linux-gcc)
    - 	export GIT_TEST_CHECKOUT_WORKERS=2
    - 	make test
    - 	;;
    --linux-clang)
    -+linux-clang|linux-SANITIZE=leak)
    - 	export GIT_TEST_DEFAULT_HASH=sha1
    - 	make test
    - 	export GIT_TEST_DEFAULT_HASH=sha256
     
      ## t/README ##
    -@@ t/README: GIT_TEST_CHECKOUT_WORKERS=<n> overrides the 'checkout.workers' setting
    - to <n> and 'checkout.thresholdForParallelism' to 0, forcing the
    - execution of the parallel-checkout code.
    +@@ t/README: excluded as so much relies on it, but this might change in the future.
    + GIT_TEST_SPLIT_INDEX=<boolean> forces split-index mode on the whole
    + test suite. Accept any boolean values that are accepted by git-config.
      
     +GIT_TEST_PASSING_SANITIZE_LEAK=<boolean> when compiled with
     +SANITIZE=leak will run only those tests that have whitelisted
    -+themselves as passing with no memory leaks. Do this by sourcing
    -+"test-pragma-SANITIZE=leak-ok.sh" before sourcing "test-lib.sh" itself
    -+at the top of the test script. This test mode is used by the
    -+"linux-SANITIZE=leak" CI target.
    ++themselves as passing with no memory leaks. Tests can be whitelisted
    ++by setting "TEST_PASSES_SANITIZE_LEAK=true" before sourcing
    ++"test-lib.sh" itself at the top of the test script. This test mode is
    ++used by the "linux-SANITIZE=leak" CI target.
     +
    - Naming Tests
    - ------------
    + GIT_TEST_PROTOCOL_VERSION=<n>, when set, makes 'protocol.version'
    + default to n.
      
     
      ## t/t0000-basic.sh ##
    @@ t/t0000-basic.sh: swapping compression and hashing order, the person who is maki
      modification *should* take notice and update the test vectors here.
      '
      
    -+. ./test-pragma-SANITIZE=leak-ok.sh
    ++TEST_PASSES_SANITIZE_LEAK=true
      . ./test-lib.sh
      
      try_local_xy () {
     
    + ## t/t0004-unwritable.sh ##
    +@@
    + 
    + test_description='detect unwritable repository and fail correctly'
    + 
    ++TEST_PASSES_SANITIZE_LEAK=true
    + . ./test-lib.sh
    + 
    + test_expect_success setup '
    +
      ## t/test-lib.sh ##
     @@ t/test-lib.sh: then
      	test_done
      fi
      
    -+# Aggressively skip non-whitelisted tests when compiled with
    -+# SANITIZE=leak
    ++# skip non-whitelisted tests when compiled with SANITIZE=leak
     +if test -n "$SANITIZE_LEAK"
     +then
     +	if test_bool_env GIT_TEST_PASSING_SANITIZE_LEAK false
    @@ t/test-lib.sh: then
      # Last-minute variable setup
      HOME="$TRASH_DIRECTORY"
      GNUPGHOME="$HOME/gnupg-home-not-used"
    -
    - ## t/test-pragma-SANITIZE=leak-ok.sh (new) ##
    -@@
    -+#!/bin/sh
    -+
    -+## This "pragma" (as in "perldoc perlpragma") declares that the test
    -+## will pass under GIT_TEST_PASSING_SANITIZE_LEAK=true. Source this
    -+## before sourcing test-lib.sh
    -+
    -+TEST_PASSES_SANITIZE_LEAK=true
    -+export TEST_PASSES_SANITIZE_LEAK
4:  0cd14d64165 < -:  ----------- tests: annotate t000*.sh with TEST_PASSES_SANITIZE_LEAK=true
5:  ed5f5705755 < -:  ----------- tests: annotate t001*.sh with TEST_PASSES_SANITIZE_LEAK=true
6:  2599016c4e7 < -:  ----------- tests: annotate t002*.sh with TEST_PASSES_SANITIZE_LEAK=true
7:  ddc4d6d2cf1 < -:  ----------- tests: annotate select t0*.sh with TEST_PASSES_SANITIZE_LEAK=true
8:  e611d2c23d9 < -:  ----------- tests: annotate select t*.sh with TEST_PASSES_SANITIZE_LEAK=true
-- 
2.33.0.818.gd2ef2916285


^ permalink raw reply	[flat|nested] 125+ messages in thread

* [PATCH v4 1/3] Makefile: add SANITIZE=leak flag to GIT-BUILD-OPTIONS
  2021-09-07 15:33       ` [PATCH v4 0/3] " Ævar Arnfjörð Bjarmason
@ 2021-09-07 15:33         ` Ævar Arnfjörð Bjarmason
  2021-09-07 15:33         ` [PATCH v4 2/3] CI: refactor "if" to "case" statement Ævar Arnfjörð Bjarmason
                           ` (3 subsequent siblings)
  4 siblings, 0 replies; 125+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-09-07 15:33 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Andrzej Hunt, Lénaïc Huard,
	Derrick Stolee, Felipe Contreras, SZEDER Gábor,
	Đoàn Trần Công Danh, Eric Sunshine,
	Ævar Arnfjörð Bjarmason

When SANITIZE=leak is specified we'll now add a SANITIZE_LEAK flag to
GIT-BUILD-OPTIONS. This is then picked up by test-lib.sh, which sets a
SANITIZE_LEAK prerequisite.

We can then skip specific tests that are known to fail under
SANITIZE=leak. Add one such annotation to t0004-unwritable.sh, which
now passes under SANITIZE=leak.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
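
For illustration only (not part of this patch): a minimal sketch of how
a value recorded in GIT-BUILD-OPTIONS can end up gating a test through
a "!SANITIZE_LEAK" prerequisite. The temporary file stands in for
GIT-BUILD-OPTIONS, and "sketch_set_prereq" / "sketch_have_prereq" are
hypothetical, simplified stand-ins for the real helpers in test-lib.sh.

```shell
#!/bin/sh
# Stand-in for the GIT-BUILD-OPTIONS file written by "make SANITIZE=leak".
build_options=$(mktemp)
echo "SANITIZE_LEAK='YesCompiledWithIt'" >"$build_options"

# Hypothetical, simplified prerequisite bookkeeping (not the real test-lib.sh code).
satisfied_prereqs=" "
sketch_set_prereq () {
	satisfied_prereqs="$satisfied_prereqs$1 "
}
sketch_have_prereq () {
	case "$satisfied_prereqs" in
	*" $1 "*) return 0 ;;
	*) return 1 ;;
	esac
}

# Load the build options, then set the prerequisite if the flag is non-empty.
. "$build_options"
test -n "$SANITIZE_LEAK" && sketch_set_prereq SANITIZE_LEAK

# A test declared with "!SANITIZE_LEAK" runs only when the prereq is absent.
if sketch_have_prereq SANITIZE_LEAK
then
	result="skip"
else
	result="run"
fi
echo "$result"
rm -f "$build_options"
```

With the flag set as above, the sketch takes the "skip" branch, which
is the effect the t0004-unwritable.sh annotation below relies on.
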
 Makefile              | 5 +++++
 t/t0004-unwritable.sh | 2 +-
 t/test-lib.sh         | 1 +
 3 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/Makefile b/Makefile
index 429c276058d..34c12ea6e6f 100644
--- a/Makefile
+++ b/Makefile
@@ -1221,6 +1221,9 @@ PTHREAD_CFLAGS =
 SPARSE_FLAGS ?=
 SP_EXTRA_FLAGS = -Wno-universal-initializer
 
+# For informing GIT-BUILD-OPTIONS of the SANITIZE=leak target
+SANITIZE_LEAK =
+
 # For the 'coccicheck' target; setting SPATCH_BATCH_SIZE higher will
 # usually result in less CPU usage at the cost of higher peak memory.
 # Setting it to 0 will feed all files in a single spatch invocation.
@@ -1265,6 +1268,7 @@ BASIC_CFLAGS += -DSHA1DC_FORCE_ALIGNED_ACCESS
 endif
 ifneq ($(filter leak,$(SANITIZERS)),)
 BASIC_CFLAGS += -DSUPPRESS_ANNOTATED_LEAKS
+SANITIZE_LEAK = YesCompiledWithIt
 endif
 ifneq ($(filter address,$(SANITIZERS)),)
 NO_REGEX = NeededForASAN
@@ -2812,6 +2816,7 @@ GIT-BUILD-OPTIONS: FORCE
 	@echo NO_UNIX_SOCKETS=\''$(subst ','\'',$(subst ','\'',$(NO_UNIX_SOCKETS)))'\' >>$@+
 	@echo PAGER_ENV=\''$(subst ','\'',$(subst ','\'',$(PAGER_ENV)))'\' >>$@+
 	@echo DC_SHA1=\''$(subst ','\'',$(subst ','\'',$(DC_SHA1)))'\' >>$@+
+	@echo SANITIZE_LEAK=\''$(subst ','\'',$(subst ','\'',$(SANITIZE_LEAK)))'\' >>$@+
 	@echo X=\'$(X)\' >>$@+
 ifdef TEST_OUTPUT_DIRECTORY
 	@echo TEST_OUTPUT_DIRECTORY=\''$(subst ','\'',$(subst ','\'',$(TEST_OUTPUT_DIRECTORY)))'\' >>$@+
diff --git a/t/t0004-unwritable.sh b/t/t0004-unwritable.sh
index e3137d638ee..fbdcb926b3a 100755
--- a/t/t0004-unwritable.sh
+++ b/t/t0004-unwritable.sh
@@ -21,7 +21,7 @@ test_expect_success POSIXPERM,SANITY 'write-tree should notice unwritable reposi
 	test_must_fail git write-tree
 '
 
-test_expect_success POSIXPERM,SANITY 'commit should notice unwritable repository' '
+test_expect_success POSIXPERM,SANITY,!SANITIZE_LEAK 'commit should notice unwritable repository' '
 	test_when_finished "chmod 775 .git/objects .git/objects/??" &&
 	chmod a-w .git/objects .git/objects/?? &&
 	test_must_fail git commit -m second
diff --git a/t/test-lib.sh b/t/test-lib.sh
index abcfbed6d61..4ab18914a3d 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1533,6 +1533,7 @@ test -z "$NO_PYTHON" && test_set_prereq PYTHON
 test -n "$USE_LIBPCRE2" && test_set_prereq PCRE
 test -n "$USE_LIBPCRE2" && test_set_prereq LIBPCRE2
 test -z "$NO_GETTEXT" && test_set_prereq GETTEXT
+test -n "$SANITIZE_LEAK" && test_set_prereq SANITIZE_LEAK
 
 if test -z "$GIT_TEST_CHECK_CACHE_TREE"
 then
-- 
2.33.0.818.gd2ef2916285


^ permalink raw reply related	[flat|nested] 125+ messages in thread

* [PATCH v4 2/3] CI: refactor "if" to "case" statement
  2021-09-07 15:33       ` [PATCH v4 0/3] " Ævar Arnfjörð Bjarmason
  2021-09-07 15:33         ` [PATCH v4 1/3] Makefile: add SANITIZE=leak flag to GIT-BUILD-OPTIONS Ævar Arnfjörð Bjarmason
@ 2021-09-07 15:33         ` Ævar Arnfjörð Bjarmason
  2021-09-07 15:33         ` [PATCH v4 3/3] tests: add a test mode for SANITIZE=leak, run it in CI Ævar Arnfjörð Bjarmason
                           ` (2 subsequent siblings)
  4 siblings, 0 replies; 125+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-09-07 15:33 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Andrzej Hunt, Lénaïc Huard,
	Derrick Stolee, Felipe Contreras, SZEDER Gábor,
	Đoàn Trần Công Danh, Eric Sunshine,
	Ævar Arnfjörð Bjarmason

Refactor an "if" statement for "linux-gcc" and "osx-gcc" to a "case"
statement in preparation for another case being added to them.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
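
For illustration only (not part of this patch): a small sketch of why
the "case" form is convenient here. Adding a job name becomes a
one-token change to a pattern instead of a longer "if" condition. The
"pick_cc" function is a hypothetical helper that only mirrors the shape
of the linux-* arm in ci/lib.sh; the "linux-SANITIZE=leak" alternative
is the one added in 3/3.

```shell
#!/bin/sh
# Hypothetical helper mirroring the nested "case" in ci/lib.sh.
pick_cc () {
	case "$1" in
	linux-gcc|linux-SANITIZE=leak)
		# Extra job names are added as "|name" alternatives.
		echo gcc-8
		;;
	*)
		# Every other job keeps the default compiler.
		echo default
		;;
	esac
}

pick_cc linux-gcc
pick_cc linux-SANITIZE=leak
pick_cc linux-clang
```

The first two calls print "gcc-8" and the last prints "default".
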
 ci/lib.sh | 20 ++++++++++++--------
 1 file changed, 12 insertions(+), 8 deletions(-)

diff --git a/ci/lib.sh b/ci/lib.sh
index 476c3f369f5..33b9777ab7e 100755
--- a/ci/lib.sh
+++ b/ci/lib.sh
@@ -184,13 +184,15 @@ export SKIP_DASHED_BUILT_INS=YesPlease
 
 case "$jobname" in
 linux-clang|linux-gcc)
-	if [ "$jobname" = linux-gcc ]
-	then
+	case "$jobname" in
+	linux-gcc)
 		export CC=gcc-8
 		MAKEFLAGS="$MAKEFLAGS PYTHON_PATH=/usr/bin/python3"
-	else
+		;;
+	*)
 		MAKEFLAGS="$MAKEFLAGS PYTHON_PATH=/usr/bin/python2"
-	fi
+		;;
+	esac
 
 	export GIT_TEST_HTTPD=true
 
@@ -207,13 +209,15 @@ linux-clang|linux-gcc)
 	export PATH="$GIT_LFS_PATH:$P4_PATH:$PATH"
 	;;
 osx-clang|osx-gcc)
-	if [ "$jobname" = osx-gcc ]
-	then
+	case "$jobname" in
+	osx-gcc)
 		export CC=gcc-9
 		MAKEFLAGS="$MAKEFLAGS PYTHON_PATH=$(which python3)"
-	else
+		;;
+	*)
 		MAKEFLAGS="$MAKEFLAGS PYTHON_PATH=$(which python2)"
-	fi
+		;;
+	esac
 
 	# t9810 occasionally fails on Travis CI OS X
 	# t9816 occasionally fails with "TAP out of sequence errors" on
-- 
2.33.0.818.gd2ef2916285


^ permalink raw reply related	[flat|nested] 125+ messages in thread

* [PATCH v4 3/3] tests: add a test mode for SANITIZE=leak, run it in CI
  2021-09-07 15:33       ` [PATCH v4 0/3] " Ævar Arnfjörð Bjarmason
  2021-09-07 15:33         ` [PATCH v4 1/3] Makefile: add SANITIZE=leak flag to GIT-BUILD-OPTIONS Ævar Arnfjörð Bjarmason
  2021-09-07 15:33         ` [PATCH v4 2/3] CI: refactor "if" to "case" statement Ævar Arnfjörð Bjarmason
@ 2021-09-07 15:33         ` Ævar Arnfjörð Bjarmason
  2021-09-07 16:29           ` Eric Sunshine
  2021-09-07 16:51           ` Jeff King
  2021-09-07 16:44         ` [PATCH v4 0/3] " Jeff King
  2021-09-07 21:30         ` [PATCH v5 " Ævar Arnfjörð Bjarmason
  4 siblings, 2 replies; 125+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-09-07 15:33 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Andrzej Hunt, Lénaïc Huard,
	Derrick Stolee, Felipe Contreras, SZEDER Gábor,
	Đoàn Trần Công Danh, Eric Sunshine,
	Ævar Arnfjörð Bjarmason

While git can be compiled with SANITIZE=leak, we have not run
regression tests under that mode; memory leaks have only been fixed as
one-offs without structured regression testing.

This change adds CI testing for it. We'll now build with GCC under
Linux and test t000[04]*.sh with SANITIZE=leak, and likewise with GCC
on OSX. The new jobs are called "linux-SANITIZE=leak" and
"osx-SANITIZE=leak".

The CI target uses a new GIT_TEST_PASSING_SANITIZE_LEAK=true test
mode. When running in that mode, we'll assert that we were compiled
with SANITIZE=leak, and then skip all tests except those that we've
opted-in by setting "TEST_PASSES_SANITIZE_LEAK=true" before sourcing
test-lib.sh (see discussion in t/README).

The tests using the "TEST_PASSES_SANITIZE_LEAK=true" setting can in
turn make use of the "SANITIZE_LEAK" prerequisite, should they wish to
selectively skip tests even under
"GIT_TEST_PASSING_SANITIZE_LEAK=true". In a preceding commit we
started doing this in "t0004-unwritable.sh" under SANITIZE=leak, now
it'll combine nicely with "GIT_TEST_PASSING_SANITIZE_LEAK=true".

Now tests that don't set "TEST_PASSES_SANITIZE_LEAK=true" will be
skipped under GIT_TEST_PASSING_SANITIZE_LEAK=true:

    $ GIT_TEST_PASSING_SANITIZE_LEAK=true ./t0001-init.sh
    1..0 # SKIP skip all tests in t0001 under SANITIZE=leak, TEST_PASSES_SANITIZE_LEAK not set

The intent is to add more TEST_PASSES_SANITIZE_LEAK=true annotations
as a follow-up change, but let's start small.

It would also be possible to implement a more lightweight version of
this by only relying on setting "LSAN_OPTIONS". See
<YS9OT/pn5rRK9cGB@coredump.intra.peff.net>[1] and
<YS9ZIDpANfsh7N+S@coredump.intra.peff.net>[2] for a discussion of
that. I've opted for this approach of adding a GIT_TEST_* mode instead
because it's consistent with how we handle other special test modes.

Being able to add a "!SANITIZE_LEAK" prerequisite and calling
"test_done" early if it isn't satisfied also means that we can more
incrementally add regression tests without being forced to fix
widespread and hard-to-fix leaks at the same time.

We have tests that do simple checking of some tool we're interested
in, but later on in the script might be stressing trace2, or common
sources of leaks like "git log" in combination with the tool (e.g. the
commit-graph tests). To be clear having a prerequisite could also be
accomplished by using "LSAN_OPTIONS" directly.

On the topi of "LSAN_OPTIONS": It would be nice to have a mode to
aggregate all failures in our various scripts, see [2] for a start at
doing that which sets "log_path" in "LSAN_OPTIONS". I've punted on
that for now, it can be added later, and that proposed patch is also
hindered by us wanting to test e.g. test-tool leaks (and by proxy, any
API leaks they uncover), not just the "common-main.c" entry point.

As of writing this we've got major regressions between master..seen,
i.e. the t000*.sh tests and more fixed since 31f9acf9ce2 (Merge branch
'ah/plugleaks', 2021-08-04) have regressed recently.

See the discussion at <87czsv2idy.fsf@evledraar.gmail.com> about the
lack of this sort of test mode, and 0e5bba53af (add UNLEAK annotation
for reducing leak false positives, 2017-09-08) for the initial
addition of SANITIZE=leak.

See also 09595ab381 (Merge branch 'jk/leak-checkers', 2017-09-19),
7782066f67 (Merge branch 'jk/apache-lsan', 2019-05-19) and the recent
936e58851a (Merge branch 'ah/plugleaks', 2021-05-07) for some of the
past history of "one-off" SANITIZE=leak (and more) fixes.

We use gcc on OSX instead of the clang default because clang
currently fails to build with:

    clang: error: unsupported option '-fsanitize=leak' for target 'x86_64-apple-darwin19.6.0'

If that's sorted out in the future we might want to run that job with
"clang" merely to make use of the default, and also to add some
compiler variance into the mix. Both use the
"AddressSanitizerLeakSanitizer" library[3], so they shouldn't
behave differently under GCC or clang.

1. https://lore.kernel.org/git/YS9OT%2Fpn5rRK9cGB@coredump.intra.peff.net/
2. https://lore.kernel.org/git/YS9ZIDpANfsh7N+S@coredump.intra.peff.net/
3. https://github.com/google/sanitizers/wiki/AddressSanitizerLeakSanitizer

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
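
For illustration only (not part of this patch): a minimal sketch of the
run/skip/error decision that the test-lib.sh hunk below makes from the
three variables involved. "leak_mode_decision" is a hypothetical
helper. As a simplifying assumption it treats only the literal "true"
as true, whereas the real code goes through test_bool_env and accepts
any git-config boolean.

```shell
#!/bin/sh
# Hypothetical helper, not the code in test-lib.sh.
#   $1: SANITIZE_LEAK as recorded in GIT-BUILD-OPTIONS (empty if not built that way)
#   $2: GIT_TEST_PASSING_SANITIZE_LEAK
#   $3: TEST_PASSES_SANITIZE_LEAK as set by the test script
leak_mode_decision () {
	if test -n "$1"
	then
		# Built with SANITIZE=leak: skip scripts that have not opted in,
		# but only when the test mode is enabled.
		if test "$2" = true && test "$3" != true
		then
			echo skip
		else
			echo run
		fi
	elif test "$2" = true
	then
		# The mode was requested but the build lacks SANITIZE=leak.
		echo error
	else
		echo run
	fi
}

leak_mode_decision YesCompiledWithIt true ""     # script has not opted in
leak_mode_decision YesCompiledWithIt true true   # script has opted in
leak_mode_decision YesCompiledWithIt "" ""       # test mode is off
leak_mode_decision "" true true                  # not a SANITIZE=leak build
```

The four calls print "skip", "run", "run" and "error" respectively.
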
 .github/workflows/main.yml |  6 ++++++
 ci/install-dependencies.sh |  6 +++---
 ci/lib.sh                  | 15 +++++++++++----
 ci/run-build-and-tests.sh  |  2 +-
 t/README                   |  7 +++++++
 t/t0000-basic.sh           |  1 +
 t/t0004-unwritable.sh      |  1 +
 t/test-lib.sh              | 20 ++++++++++++++++++++
 8 files changed, 50 insertions(+), 8 deletions(-)

diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
index 68596f25927..b41572293c9 100644
--- a/.github/workflows/main.yml
+++ b/.github/workflows/main.yml
@@ -232,6 +232,12 @@ jobs:
           - jobname: linux-gcc-default
             cc: gcc
             pool: ubuntu-latest
+          - jobname: linux-SANITIZE=leak
+            cc: gcc
+            pool: ubuntu-latest
+          - jobname: osx-SANITIZE=leak
+            cc: gcc
+            pool: macos-latest
     env:
       CC: ${{matrix.vector.cc}}
       jobname: ${{matrix.vector.jobname}}
diff --git a/ci/install-dependencies.sh b/ci/install-dependencies.sh
index 5772081b6e5..a89e72c1438 100755
--- a/ci/install-dependencies.sh
+++ b/ci/install-dependencies.sh
@@ -12,13 +12,13 @@ UBUNTU_COMMON_PKGS="make libssl-dev libcurl4-openssl-dev libexpat-dev
  libemail-valid-perl libio-socket-ssl-perl libnet-smtp-ssl-perl"
 
 case "$jobname" in
-linux-clang|linux-gcc)
+linux-clang|linux-gcc|linux-SANITIZE=leak)
 	sudo apt-add-repository -y "ppa:ubuntu-toolchain-r/test"
 	sudo apt-get -q update
 	sudo apt-get -q -y install language-pack-is libsvn-perl apache2 \
 		$UBUNTU_COMMON_PKGS
 	case "$jobname" in
-	linux-gcc)
+	linux-gcc|linux-SANITIZE=leak)
 		sudo apt-get -q -y install gcc-8
 		;;
 	esac
@@ -37,7 +37,7 @@ linux-clang|linux-gcc)
 		cp git-lfs-$LINUX_GIT_LFS_VERSION/git-lfs .
 	popd
 	;;
-osx-clang|osx-gcc)
+osx-clang|osx-gcc|osx-SANITIZE=leak)
 	export HOMEBREW_NO_AUTO_UPDATE=1 HOMEBREW_NO_INSTALL_CLEANUP=1
 	# Uncomment this if you want to run perf tests:
 	# brew install gnu-time
diff --git a/ci/lib.sh b/ci/lib.sh
index 33b9777ab7e..36b7c0d3020 100755
--- a/ci/lib.sh
+++ b/ci/lib.sh
@@ -183,9 +183,9 @@ export GIT_TEST_CLONE_2GB=true
 export SKIP_DASHED_BUILT_INS=YesPlease
 
 case "$jobname" in
-linux-clang|linux-gcc)
+linux-clang|linux-gcc|linux-SANITIZE=leak)
 	case "$jobname" in
-	linux-gcc)
+	linux-gcc|linux-SANITIZE=leak)
 		export CC=gcc-8
 		MAKEFLAGS="$MAKEFLAGS PYTHON_PATH=/usr/bin/python3"
 		;;
@@ -208,9 +208,9 @@ linux-clang|linux-gcc)
 	GIT_LFS_PATH="$HOME/custom/git-lfs"
 	export PATH="$GIT_LFS_PATH:$P4_PATH:$PATH"
 	;;
-osx-clang|osx-gcc)
+osx-clang|osx-gcc|osx-SANITIZE=leak)
 	case "$jobname" in
-	osx-gcc)
+	osx-gcc|osx-SANITIZE=leak)
 		export CC=gcc-9
 		MAKEFLAGS="$MAKEFLAGS PYTHON_PATH=$(which python3)"
 		;;
@@ -237,4 +237,11 @@ linux-musl)
 	;;
 esac
 
+case "$jobname" in
+linux-SANITIZE=leak|osx-SANITIZE=leak)
+	export SANITIZE=leak
+	export GIT_TEST_PASSING_SANITIZE_LEAK=true
+	;;
+esac
+
 MAKEFLAGS="$MAKEFLAGS CC=${CC:-cc}"
diff --git a/ci/run-build-and-tests.sh b/ci/run-build-and-tests.sh
index 3ce81ffee94..4133239fc36 100755
--- a/ci/run-build-and-tests.sh
+++ b/ci/run-build-and-tests.sh
@@ -12,7 +12,7 @@ esac
 
 make
 case "$jobname" in
-linux-gcc)
+linux-gcc|linux-SANITIZE=leak)
 	export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 	make test
 	export GIT_TEST_SPLIT_INDEX=yes
diff --git a/t/README b/t/README
index 9e701223020..4864f208c8a 100644
--- a/t/README
+++ b/t/README
@@ -366,6 +366,13 @@ excluded as so much relies on it, but this might change in the future.
 GIT_TEST_SPLIT_INDEX=<boolean> forces split-index mode on the whole
 test suite. Accept any boolean values that are accepted by git-config.
 
+GIT_TEST_PASSING_SANITIZE_LEAK=<boolean> when compiled with
+SANITIZE=leak will run only those tests that have whitelisted
+themselves as passing with no memory leaks. Tests can be whitelisted
+by setting "TEST_PASSES_SANITIZE_LEAK=true" before sourcing
+"test-lib.sh" itself at the top of the test script. This test mode is
+used by the "linux-SANITIZE=leak" CI target.
+
 GIT_TEST_PROTOCOL_VERSION=<n>, when set, makes 'protocol.version'
 default to n.
 
diff --git a/t/t0000-basic.sh b/t/t0000-basic.sh
index cb87768513c..54318af3861 100755
--- a/t/t0000-basic.sh
+++ b/t/t0000-basic.sh
@@ -18,6 +18,7 @@ swapping compression and hashing order, the person who is making the
 modification *should* take notice and update the test vectors here.
 '
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 try_local_xy () {
diff --git a/t/t0004-unwritable.sh b/t/t0004-unwritable.sh
index fbdcb926b3a..37d68ef03be 100755
--- a/t/t0004-unwritable.sh
+++ b/t/t0004-unwritable.sh
@@ -2,6 +2,7 @@
 
 test_description='detect unwritable repository and fail correctly'
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 test_expect_success setup '
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 4ab18914a3d..3b7acfec23b 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1379,6 +1379,26 @@ then
 	test_done
 fi
 
+# skip non-whitelisted tests when compiled with SANITIZE=leak
+if test -n "$SANITIZE_LEAK"
+then
+	if test_bool_env GIT_TEST_PASSING_SANITIZE_LEAK false
+	then
+		# We need to see it in "git env--helper" (via
+		# test_bool_env)
+		export TEST_PASSES_SANITIZE_LEAK
+
+		if ! test_bool_env TEST_PASSES_SANITIZE_LEAK false
+		then
+			skip_all="skipping $this_test under GIT_TEST_PASSING_SANITIZE_LEAK=true"
+			test_done
+		fi
+	fi
+elif test_bool_env GIT_TEST_PASSING_SANITIZE_LEAK false
+then
+	error "GIT_TEST_PASSING_SANITIZE_LEAK=true has no effect except when compiled with SANITIZE=leak"
+fi
+
 # Last-minute variable setup
 HOME="$TRASH_DIRECTORY"
 GNUPGHOME="$HOME/gnupg-home-not-used"
-- 
2.33.0.818.gd2ef2916285


^ permalink raw reply related	[flat|nested] 125+ messages in thread

* Re: [PATCH v4 3/3] tests: add a test mode for SANITIZE=leak, run it in CI
  2021-09-07 15:33         ` [PATCH v4 3/3] tests: add a test mode for SANITIZE=leak, run it in CI Ævar Arnfjörð Bjarmason
@ 2021-09-07 16:29           ` Eric Sunshine
  2021-09-07 16:51           ` Jeff King
  1 sibling, 0 replies; 125+ messages in thread
From: Eric Sunshine @ 2021-09-07 16:29 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Git List, Junio C Hamano, Jeff King, Andrzej Hunt,
	Lénaïc Huard, Derrick Stolee, Felipe Contreras,
	SZEDER Gábor, Đoàn Trần Công Danh

On Tue, Sep 7, 2021 at 11:33 AM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
> [...]
> On the topi of "LSAN_OPTIONS": It would be nice to have a mode to
> aggregate all failures in our various scripts, see [2] for a start at
> doing that which sets "log_path" in "LSAN_OPTIONS". I've punted on
> that for now, it can be added later, and that proposed patch is also
> hindered by us wanting to test e.g. test-tool leaks (and by proxy, any
> API leaks they uncover), not just the "common-main.c" entry point.

s/topi/topic/

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH v4 0/3] add a test mode for SANITIZE=leak, run it in CI
  2021-09-07 15:33       ` [PATCH v4 0/3] " Ævar Arnfjörð Bjarmason
                           ` (2 preceding siblings ...)
  2021-09-07 15:33         ` [PATCH v4 3/3] tests: add a test mode for SANITIZE=leak, run it in CI Ævar Arnfjörð Bjarmason
@ 2021-09-07 16:44         ` Jeff King
  2021-09-07 18:22           ` Junio C Hamano
  2021-09-07 21:30         ` [PATCH v5 " Ævar Arnfjörð Bjarmason
  4 siblings, 1 reply; 125+ messages in thread
From: Jeff King @ 2021-09-07 16:44 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Junio C Hamano, Andrzej Hunt, Lénaïc Huard,
	Derrick Stolee, Felipe Contreras, SZEDER Gábor,
	Đoàn Trần Công Danh, Eric Sunshine

On Tue, Sep 07, 2021 at 05:33:28PM +0200, Ævar Arnfjörð Bjarmason wrote:

> Changes since v3:
> 
>  * Much updated commit message
> 
>  * Re-arranged the t/README change to avoid a conflict with "seen".
> 
>  * Now testing OSX as well as Linux. Full CI passes on top of "master"
>    on both: https://github.com/avar/git/runs/3535331215
> 
>  * I ejected the previous 4-8/8 patches of adding SANITIZE=leak
>    annotations to various tests, let's focus on the test mode itself
>    here and not overly distracting ourselves with whatever other
>    regressions on "seen" those annotations might cause, I can submit
>    those annotations later.
> 
>  * As noted in the updated commit message I didn't end up going with
>    Jeff King's suggestion of supporting LSAN_OPTIONS directly, and
>    fixing the "fd" the tests write to. All of those things can be
>    extended or fixed later.

OK, I think we should proceed with this series/approach, then. The
question of friction when CI fails is an open one, but we won't know
until we have more data. So let's see what happens. :)

The patches themselves look fine to me, though I had a few nits on the
third commit message.

-Peff


* Re: [PATCH v4 3/3] tests: add a test mode for SANITIZE=leak, run it in CI
  2021-09-07 15:33         ` [PATCH v4 3/3] tests: add a test mode for SANITIZE=leak, run it in CI Ævar Arnfjörð Bjarmason
  2021-09-07 16:29           ` Eric Sunshine
@ 2021-09-07 16:51           ` Jeff King
  1 sibling, 0 replies; 125+ messages in thread
From: Jeff King @ 2021-09-07 16:51 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Junio C Hamano, Andrzej Hunt, Lénaïc Huard,
	Derrick Stolee, Felipe Contreras, SZEDER Gábor,
	Đoàn Trần Công Danh, Eric Sunshine

On Tue, Sep 07, 2021 at 05:33:31PM +0200, Ævar Arnfjörð Bjarmason wrote:

> Subject: [PATCH v4 3/3] tests: add a test mode for SANITIZE=leak, run it in CI

The patch looks OK to me. There are a bunch of typos/nits in the commit
message which made it a little harder to read. I don't care _that_ much,
but there's one inaccuracy I wanted to point out, and the others are
along for the ride. :)

> While git can be compiled with SANITIZE=leak we have not run
> regression tests under that mode, memory leaks have only been fixed as
> one-offs without structured regression testing.

Funky comma placement. Maybe:

  While git can be compiled with SANITIZE=leak, we have not run
  regression tests under that mode. Memory leaks have only been fixed as
  one-offs without structured regression testing.

> This change add CI testing for it. We'll now build with GCC under
> Linux and test t000[04]*.sh with SANITIZE=leak, and likewise with GCC
> on OSX. The new jobs are called "linux-SANITIZE=leak" and
> "osx-SANITIZE=leak".

s/add/adds/

A matter of taste, but I find the "linux-SANITIZE=leak" a little funny
to read because of the mixed-caps and punctuation. Just linux-leaks or
something is descriptive enough. Pure bikeshedding, of course.

> On the topi of "LSAN_OPTIONS": It would be nice to have a mode to
> aggregate all failures in our various scripts, see [2] for a start at
> doing that which sets "log_path" in "LSAN_OPTIONS". I've punted on
> that for now, it can be added later, and that proposed patch is also
> hindered by us wanting to test e.g. test-tool leaks (and by proxy, any
> API leaks they uncover), not just the "common-main.c" entry point.

I think test-tool does actually use common-main.c, so we'd be covered
there, too. That said, I'm perfectly fine to leave this for now (or
perhaps never; if we can get the whole suite passing with leak-checking
on, then aggregating the many leak reports without having test failures
will become a moot point).

> +# skip non-whitelisted tests when compiled with SANITIZE=leak
> +if test -n "$SANITIZE_LEAK"
> +then
> +	if test_bool_env GIT_TEST_PASSING_SANITIZE_LEAK false
> +	then
> +		# We need to see it in "git env--helper" (via
> +		# test_bool_env)
> +		export TEST_PASSES_SANITIZE_LEAK
> +
> +		if ! test_bool_env TEST_PASSES_SANITIZE_LEAK false
> +		then
> +			skip_all="skipping $this_test under GIT_TEST_PASSING_SANITIZE_LEAK=true"
> +			test_done
> +		fi
> +	fi
> +elif test_bool_env GIT_TEST_PASSING_SANITIZE_LEAK false
> +then
> +	error "GIT_TEST_PASSING_SANITIZE_LEAK=true has no effect except when compiled with SANITIZE=leak"
> +fi

I wondered if it would be helpful for this to be more forgiving. But
there's not much point in setting GIT_TEST_PASSING_SANITIZE_LEAK all the
time (say, in your config.mak), since it will just skip a bunch of
tests. So it probably does make sense to alert the user that "oops, you
did not actually build things correctly".

-Peff


* Re: [PATCH v4 0/3] add a test mode for SANITIZE=leak, run it in CI
  2021-09-07 16:44         ` [PATCH v4 0/3] " Jeff King
@ 2021-09-07 18:22           ` Junio C Hamano
  0 siblings, 0 replies; 125+ messages in thread
From: Junio C Hamano @ 2021-09-07 18:22 UTC (permalink / raw)
  To: Jeff King
  Cc: Ævar Arnfjörð Bjarmason, git, Andrzej Hunt,
	Lénaïc Huard, Derrick Stolee, Felipe Contreras,
	SZEDER Gábor, Đoàn Trần Công Danh,
	Eric Sunshine

Jeff King <peff@peff.net> writes:

> On Tue, Sep 07, 2021 at 05:33:28PM +0200, Ævar Arnfjörð Bjarmason wrote:
>
> OK, I think we should proceed with this series/approach, then. The
> question of friction when CI fails is an open one, but we won't know
> until we have more data. So let's see what happens. :)
>
> The patches themselves look fine to me, though I had a few nits on the
> third commit message.

Thank you all.

Let's see a copy-edited v5 and we can go from there.




* [PATCH v5 0/3] add a test mode for SANITIZE=leak, run it in CI
  2021-09-07 15:33       ` [PATCH v4 0/3] " Ævar Arnfjörð Bjarmason
                           ` (3 preceding siblings ...)
  2021-09-07 16:44         ` [PATCH v4 0/3] " Jeff King
@ 2021-09-07 21:30         ` Ævar Arnfjörð Bjarmason
  2021-09-07 21:30           ` [PATCH v5 1/3] Makefile: add SANITIZE=leak flag to GIT-BUILD-OPTIONS Ævar Arnfjörð Bjarmason
                             ` (4 more replies)
  4 siblings, 5 replies; 125+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-09-07 21:30 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Andrzej Hunt, Lénaïc Huard,
	Derrick Stolee, Felipe Contreras, SZEDER Gábor,
	Đoàn Trần Công Danh, Eric Sunshine,
	Ævar Arnfjörð Bjarmason

We can compile git with SANITIZE=leak, and have had various efforts in
the past such as 31f9acf9ce2 (Merge branch 'ah/plugleaks', 2021-08-04)
to plug memory leaks, but have had no CI testing of it to ensure that
we don't get regressions. This series adds a GIT_TEST_* mode for
checking those regressions, and runs it in CI.

Since I submitted v2 the delta between origin/master..origin/seen
broke even t0001-init.sh when run under SANITIZE=leak, so this series
will cause test smoke on "seen".

That failure is due to a bug in es/config-based-hooks [1] and the
hn/reftable topic, i.e. these patches are legitimately catching
regressions in "seen" from day 1.

Changes since v4 (see
https://lore.kernel.org/git/cover-v4-0.3-00000000000-20210907T151855Z-avarab@gmail.com/):

 * Renamed the jobs to linux-leaks and osx-leaks, per Jeff King's
   suggestion.

 * Took all of Jeff King's suggested commit message improvements, and
   made some of my own, fixing overly verbose wording, grammar errors
   etc.

 * Ditto the small typo fix Eric Sunshine pointed out. Thanks both!

See https://github.com/avar/git/runs/3538356269 for the CI run for
this version.

Ævar Arnfjörð Bjarmason (3):
  Makefile: add SANITIZE=leak flag to GIT-BUILD-OPTIONS
  CI: refactor "if" to "case" statement
  tests: add a test mode for SANITIZE=leak, run it in CI

 .github/workflows/main.yml |  6 ++++++
 Makefile                   |  5 +++++
 ci/install-dependencies.sh |  6 +++---
 ci/lib.sh                  | 31 +++++++++++++++++++++----------
 ci/run-build-and-tests.sh  |  2 +-
 t/README                   |  7 +++++++
 t/t0000-basic.sh           |  1 +
 t/t0004-unwritable.sh      |  3 ++-
 t/test-lib.sh              | 21 +++++++++++++++++++++
 9 files changed, 67 insertions(+), 15 deletions(-)

Range-diff against v4:
1:  bdfe2279271 = 1:  bdfe2279271 Makefile: add SANITIZE=leak flag to GIT-BUILD-OPTIONS
2:  6aaa60e3759 = 2:  6aaa60e3759 CI: refactor "if" to "case" statement
3:  fffbfc35c00 ! 3:  f3cd04b16d1 tests: add a test mode for SANITIZE=leak, run it in CI
    @@ Metadata
      ## Commit message ##
         tests: add a test mode for SANITIZE=leak, run it in CI
     
    -    While git can be compiled with SANITIZE=leak we have not run
    -    regression tests under that mode, memory leaks have only been fixed as
    +    While git can be compiled with SANITIZE=leak, we have not run
    +    regression tests under that mode. Memory leaks have only been fixed as
         one-offs without structured regression testing.
     
    -    This change add CI testing for it. We'll now build with GCC under
    -    Linux and test t000[04]*.sh with SANITIZE=leak, and likewise with GCC
    -    on OSX. The new jobs are called "linux-SANITIZE=leak" and
    -    "osx-SANITIZE=leak".
    +    This change adds CI testing for it. We'll now build and test
    +    t000[04]*.sh under both Linux and OSX. The new jobs are called
    +    "linux-leaks" and "osx-leaks".
     
         The CI target uses a new GIT_TEST_PASSING_SANITIZE_LEAK=true test
         mode. When running in that mode, we'll assert that we were compiled
    -    with SANITIZE=leak, and then skip all tests except those that we've
    -    opted-in by setting "TEST_PASSES_SANITIZE_LEAK=true" before sourcing
    -    test-lib.sh (see discussion in t/README).
    +    with SANITIZE=leak. We'll then skip all tests, except those that we've
    +    opted-in by setting "TEST_PASSES_SANITIZE_LEAK=true".
     
    -    The tests using the "TEST_PASSES_SANITIZE_LEAK=true" setting can in
    +    A test tests setting "TEST_PASSES_SANITIZE_LEAK=true" setting can in
         turn make use of the "SANITIZE_LEAK" prerequisite, should they wish to
         selectively skip tests even under
         "GIT_TEST_PASSING_SANITIZE_LEAK=true". In a preceding commit we
         started doing this in "t0004-unwritable.sh" under SANITIZE=leak, now
         it'll combine nicely with "GIT_TEST_PASSING_SANITIZE_LEAK=true".
     
    -    Now tests that don't set "TEST_PASSES_SANITIZE_LEAK=true" will be
    -    skipped under GIT_TEST_PASSING_SANITIZE_LEAK=true:
    +    This is how tests that don't set "TEST_PASSES_SANITIZE_LEAK=true" will
    +    be skipped under GIT_TEST_PASSING_SANITIZE_LEAK=true:
     
             $ GIT_TEST_PASSING_SANITIZE_LEAK=true ./t0001-init.sh
             1..0 # SKIP skip all tests in t0001 under SANITIZE=leak, TEST_PASSES_SANITIZE_LEAK not set
    @@ Commit message
         commit-graph tests). To be clear having a prerequisite could also be
         accomplished by using "LSAN_OPTIONS" directly.
     
    -    On the topi of "LSAN_OPTIONS": It would be nice to have a mode to
    +    On the topic of "LSAN_OPTIONS": It would be nice to have a mode to
         aggregate all failures in our various scripts, see [2] for a start at
         doing that which sets "log_path" in "LSAN_OPTIONS". I've punted on
    -    that for now, it can be added later, and that proposed patch is also
    -    hindered by us wanting to test e.g. test-tool leaks (and by proxy, any
    -    API leaks they uncover), not just the "common-main.c" entry point.
    +    that for now, it can be added later.
     
         As of writing this we've got major regressions between master..seen,
         i.e. the t000*.sh tests and more fixed since 31f9acf9ce2 (Merge branch
         'ah/plugleaks', 2021-08-04) have regressed recently.
     
    -    See the discussion at <87czsv2idy.fsf@evledraar.gmail.com> about the
    -    lack of this sort of test mode, and 0e5bba53af (add UNLEAK annotation
    -    for reducing leak false positives, 2017-09-08) for the initial
    -    addition of SANITIZE=leak.
    +    See the discussion at <87czsv2idy.fsf@evledraar.gmail.com>[3] about
    +    the lack of this sort of test mode, and 0e5bba53af (add UNLEAK
    +    annotation for reducing leak false positives, 2017-09-08) for the
    +    initial addition of SANITIZE=leak.
     
         See also 09595ab381 (Merge branch 'jk/leak-checkers', 2017-09-19),
         7782066f67 (Merge branch 'jk/apache-lsan', 2019-05-19) and the recent
         936e58851a (Merge branch 'ah/plugleaks', 2021-05-07) for some of the
         past history of "one-off" SANITIZE=leak (and more) fixes.
     
    -    The reason for using gcc on OSX over the clang default is because
    -    it'll currently fail to build with:
    +    The reason for using gcc on OSX over the clang default is because when
    +    used with clang on "macos-latest" it'll currently fail to build with:
     
             clang: error: unsupported option '-fsanitize=leak' for target 'x86_64-apple-darwin19.6.0'
     
         If that's sorted out in the future we might want to run that job with
         "clang" merely to make use of the default, and also to add some
         compiler variance into the mix. Both use the
    -    "AddressSanitizerLeakSanitizer" library[3], so in they shouldn't be
    -    have differently under GCC or clang.
    +    "AddressSanitizerLeakSanitizer" library[4], so in they shouldn't
    +    behave differently under GCC or clang.
     
         1. https://github.com/google/sanitizers/wiki/AddressSanitizerLeakSanitizer
         2. https://lore.kernel.org/git/YS9OT%2Fpn5rRK9cGB@coredump.intra.peff.net/
    -    3. https://lore.kernel.org/git/YS9ZIDpANfsh7N+S@coredump.intra.peff.net/
    +    3. https://lore.kernel.org/git/87czsv2idy.fsf@evledraar.gmail.com/
    +    4. https://lore.kernel.org/git/YS9ZIDpANfsh7N+S@coredump.intra.peff.net/
     
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
    @@ .github/workflows/main.yml: jobs:
                - jobname: linux-gcc-default
                  cc: gcc
                  pool: ubuntu-latest
    -+          - jobname: linux-SANITIZE=leak
    ++          - jobname: linux-leaks
     +            cc: gcc
     +            pool: ubuntu-latest
    -+          - jobname: osx-SANITIZE=leak
    ++          - jobname: osx-leaks
     +            cc: gcc
     +            pool: macos-latest
          env:
    @@ ci/install-dependencies.sh: UBUNTU_COMMON_PKGS="make libssl-dev libcurl4-openssl
      
      case "$jobname" in
     -linux-clang|linux-gcc)
    -+linux-clang|linux-gcc|linux-SANITIZE=leak)
    ++linux-clang|linux-gcc|linux-leaks)
      	sudo apt-add-repository -y "ppa:ubuntu-toolchain-r/test"
      	sudo apt-get -q update
      	sudo apt-get -q -y install language-pack-is libsvn-perl apache2 \
      		$UBUNTU_COMMON_PKGS
      	case "$jobname" in
     -	linux-gcc)
    -+	linux-gcc|linux-SANITIZE=leak)
    ++	linux-gcc|linux-leaks)
      		sudo apt-get -q -y install gcc-8
      		;;
      	esac
    @@ ci/install-dependencies.sh: linux-clang|linux-gcc)
      	popd
      	;;
     -osx-clang|osx-gcc)
    -+osx-clang|osx-gcc|osx-SANITIZE=leak)
    ++osx-clang|osx-gcc|osx-leaks)
      	export HOMEBREW_NO_AUTO_UPDATE=1 HOMEBREW_NO_INSTALL_CLEANUP=1
      	# Uncomment this if you want to run perf tests:
      	# brew install gnu-time
    @@ ci/lib.sh: export GIT_TEST_CLONE_2GB=true
      
      case "$jobname" in
     -linux-clang|linux-gcc)
    -+linux-clang|linux-gcc|linux-SANITIZE=leak)
    ++linux-clang|linux-gcc|linux-leaks)
      	case "$jobname" in
     -	linux-gcc)
    -+	linux-gcc|linux-SANITIZE=leak)
    ++	linux-gcc|linux-leaks)
      		export CC=gcc-8
      		MAKEFLAGS="$MAKEFLAGS PYTHON_PATH=/usr/bin/python3"
      		;;
    @@ ci/lib.sh: linux-clang|linux-gcc)
      	export PATH="$GIT_LFS_PATH:$P4_PATH:$PATH"
      	;;
     -osx-clang|osx-gcc)
    -+osx-clang|osx-gcc|osx-SANITIZE=leak)
    ++osx-clang|osx-gcc|osx-leaks)
      	case "$jobname" in
     -	osx-gcc)
    -+	osx-gcc|osx-SANITIZE=leak)
    ++	osx-gcc|osx-leaks)
      		export CC=gcc-9
      		MAKEFLAGS="$MAKEFLAGS PYTHON_PATH=$(which python3)"
      		;;
    @@ ci/lib.sh: linux-musl)
      esac
      
     +case "$jobname" in
    -+linux-SANITIZE=leak|osx-SANITIZE=leak)
    ++linux-leaks|osx-leaks)
     +	export SANITIZE=leak
     +	export GIT_TEST_PASSING_SANITIZE_LEAK=true
     +	;;
    @@ ci/run-build-and-tests.sh: esac
      make
      case "$jobname" in
     -linux-gcc)
    -+linux-gcc|linux-SANITIZE=leak)
    ++linux-gcc|linux-leaks)
      	export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
      	make test
      	export GIT_TEST_SPLIT_INDEX=yes
    @@ t/README: excluded as so much relies on it, but this might change in the future.
     +themselves as passing with no memory leaks. Tests can be whitelisted
     +by setting "TEST_PASSES_SANITIZE_LEAK=true" before sourcing
     +"test-lib.sh" itself at the top of the test script. This test mode is
    -+used by the "linux-SANITIZE=leak" CI target.
    ++used by the "linux-leaks" CI target.
     +
      GIT_TEST_PROTOCOL_VERSION=<n>, when set, makes 'protocol.version'
      default to n.
-- 
2.33.0.819.g59feb45f5e0



* [PATCH v5 1/3] Makefile: add SANITIZE=leak flag to GIT-BUILD-OPTIONS
  2021-09-07 21:30         ` [PATCH v5 " Ævar Arnfjörð Bjarmason
@ 2021-09-07 21:30           ` Ævar Arnfjörð Bjarmason
  2021-09-07 21:30           ` [PATCH v5 2/3] CI: refactor "if" to "case" statement Ævar Arnfjörð Bjarmason
                             ` (3 subsequent siblings)
  4 siblings, 0 replies; 125+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-09-07 21:30 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Andrzej Hunt, Lénaïc Huard,
	Derrick Stolee, Felipe Contreras, SZEDER Gábor,
	Đoàn Trần Công Danh, Eric Sunshine,
	Ævar Arnfjörð Bjarmason

When SANITIZE=leak is specified we'll now add a SANITIZE_LEAK flag to
GIT-BUILD-OPTIONS. This can then be picked up by test-lib.sh, which
sets a SANITIZE_LEAK prerequisite.

We can then skip specific tests that are known to fail under
SANITIZE=leak. Add one such annotation to t0004-unwritable.sh, which
now passes under SANITIZE=leak.
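
For illustration, the flow from build option to prerequisite can be
sketched as a standalone script. This is only a minimal sketch, not the
code in test-lib.sh: the temporary file stands in for the generated
GIT-BUILD-OPTIONS, and set_prereq/have_prereq are hypothetical
simplifications of the test library's prerequisite helpers.

```shell
#!/bin/sh
# Hypothetical stand-in for the GIT-BUILD-OPTIONS file the Makefile writes.
build_options=$(mktemp)
cat >"$build_options" <<\EOF
SANITIZE_LEAK='YesCompiledWithIt'
EOF

# Simplified prerequisite bookkeeping (assumption: a space-separated list).
prereqs=
set_prereq () {
	prereqs="$prereqs $1"
}
have_prereq () {
	case " $prereqs " in
	*" $1 "*)
		return 0
		;;
	esac
	return 1
}

# Source the build options, then turn the flag into a prerequisite.
. "$build_options"
test -n "$SANITIZE_LEAK" && set_prereq SANITIZE_LEAK
rm -f "$build_options"

if have_prereq SANITIZE_LEAK
then
	echo "built with SANITIZE=leak"
else
	echo "regular build"
fi
```

A test annotated with "!SANITIZE_LEAK" would then be skipped exactly
when have_prereq succeeds in this sketch.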

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 Makefile              | 5 +++++
 t/t0004-unwritable.sh | 2 +-
 t/test-lib.sh         | 1 +
 3 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/Makefile b/Makefile
index 429c276058d..34c12ea6e6f 100644
--- a/Makefile
+++ b/Makefile
@@ -1221,6 +1221,9 @@ PTHREAD_CFLAGS =
 SPARSE_FLAGS ?=
 SP_EXTRA_FLAGS = -Wno-universal-initializer
 
+# For informing GIT-BUILD-OPTIONS of the SANITIZE=leak target
+SANITIZE_LEAK =
+
 # For the 'coccicheck' target; setting SPATCH_BATCH_SIZE higher will
 # usually result in less CPU usage at the cost of higher peak memory.
 # Setting it to 0 will feed all files in a single spatch invocation.
@@ -1265,6 +1268,7 @@ BASIC_CFLAGS += -DSHA1DC_FORCE_ALIGNED_ACCESS
 endif
 ifneq ($(filter leak,$(SANITIZERS)),)
 BASIC_CFLAGS += -DSUPPRESS_ANNOTATED_LEAKS
+SANITIZE_LEAK = YesCompiledWithIt
 endif
 ifneq ($(filter address,$(SANITIZERS)),)
 NO_REGEX = NeededForASAN
@@ -2812,6 +2816,7 @@ GIT-BUILD-OPTIONS: FORCE
 	@echo NO_UNIX_SOCKETS=\''$(subst ','\'',$(subst ','\'',$(NO_UNIX_SOCKETS)))'\' >>$@+
 	@echo PAGER_ENV=\''$(subst ','\'',$(subst ','\'',$(PAGER_ENV)))'\' >>$@+
 	@echo DC_SHA1=\''$(subst ','\'',$(subst ','\'',$(DC_SHA1)))'\' >>$@+
+	@echo SANITIZE_LEAK=\''$(subst ','\'',$(subst ','\'',$(SANITIZE_LEAK)))'\' >>$@+
 	@echo X=\'$(X)\' >>$@+
 ifdef TEST_OUTPUT_DIRECTORY
 	@echo TEST_OUTPUT_DIRECTORY=\''$(subst ','\'',$(subst ','\'',$(TEST_OUTPUT_DIRECTORY)))'\' >>$@+
diff --git a/t/t0004-unwritable.sh b/t/t0004-unwritable.sh
index e3137d638ee..fbdcb926b3a 100755
--- a/t/t0004-unwritable.sh
+++ b/t/t0004-unwritable.sh
@@ -21,7 +21,7 @@ test_expect_success POSIXPERM,SANITY 'write-tree should notice unwritable reposi
 	test_must_fail git write-tree
 '
 
-test_expect_success POSIXPERM,SANITY 'commit should notice unwritable repository' '
+test_expect_success POSIXPERM,SANITY,!SANITIZE_LEAK 'commit should notice unwritable repository' '
 	test_when_finished "chmod 775 .git/objects .git/objects/??" &&
 	chmod a-w .git/objects .git/objects/?? &&
 	test_must_fail git commit -m second
diff --git a/t/test-lib.sh b/t/test-lib.sh
index abcfbed6d61..4ab18914a3d 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1533,6 +1533,7 @@ test -z "$NO_PYTHON" && test_set_prereq PYTHON
 test -n "$USE_LIBPCRE2" && test_set_prereq PCRE
 test -n "$USE_LIBPCRE2" && test_set_prereq LIBPCRE2
 test -z "$NO_GETTEXT" && test_set_prereq GETTEXT
+test -n "$SANITIZE_LEAK" && test_set_prereq SANITIZE_LEAK
 
 if test -z "$GIT_TEST_CHECK_CACHE_TREE"
 then
-- 
2.33.0.819.g59feb45f5e0



* [PATCH v5 2/3] CI: refactor "if" to "case" statement
  2021-09-07 21:30         ` [PATCH v5 " Ævar Arnfjörð Bjarmason
  2021-09-07 21:30           ` [PATCH v5 1/3] Makefile: add SANITIZE=leak flag to GIT-BUILD-OPTIONS Ævar Arnfjörð Bjarmason
@ 2021-09-07 21:30           ` Ævar Arnfjörð Bjarmason
  2021-09-07 21:30           ` [PATCH v5 3/3] tests: add a test mode for SANITIZE=leak, run it in CI Ævar Arnfjörð Bjarmason
                             ` (2 subsequent siblings)
  4 siblings, 0 replies; 125+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-09-07 21:30 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Andrzej Hunt, Lénaïc Huard,
	Derrick Stolee, Felipe Contreras, SZEDER Gábor,
	Đoàn Trần Công Danh, Eric Sunshine,
	Ævar Arnfjörð Bjarmason

Refactor an "if" statement for "linux-gcc" and "osx-gcc" to a "case"
statement in preparation for another case being added to them.
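
As a rough sketch of why "case" scales better here: one more job name
becomes one more "|" alternative rather than another "if" branch. The
cc_for_job helper below is hypothetical (ci/lib.sh exports CC inline
instead), and the "cc" fallback is an assumption made for the example.

```shell
#!/bin/sh
# Hypothetical helper: map a CI job name to the compiler it should use.
cc_for_job () {
	case "$1" in
	linux-gcc|linux-leaks)
		echo gcc-8
		;;
	osx-gcc|osx-leaks)
		echo gcc-9
		;;
	*)
		# Assumed fallback for jobs that keep the default compiler.
		echo cc
		;;
	esac
}

cc_for_job linux-leaks
cc_for_job osx-clang
```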

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 ci/lib.sh | 20 ++++++++++++--------
 1 file changed, 12 insertions(+), 8 deletions(-)

diff --git a/ci/lib.sh b/ci/lib.sh
index 476c3f369f5..33b9777ab7e 100755
--- a/ci/lib.sh
+++ b/ci/lib.sh
@@ -184,13 +184,15 @@ export SKIP_DASHED_BUILT_INS=YesPlease
 
 case "$jobname" in
 linux-clang|linux-gcc)
-	if [ "$jobname" = linux-gcc ]
-	then
+	case "$jobname" in
+	linux-gcc)
 		export CC=gcc-8
 		MAKEFLAGS="$MAKEFLAGS PYTHON_PATH=/usr/bin/python3"
-	else
+		;;
+	*)
 		MAKEFLAGS="$MAKEFLAGS PYTHON_PATH=/usr/bin/python2"
-	fi
+		;;
+	esac
 
 	export GIT_TEST_HTTPD=true
 
@@ -207,13 +209,15 @@ linux-clang|linux-gcc)
 	export PATH="$GIT_LFS_PATH:$P4_PATH:$PATH"
 	;;
 osx-clang|osx-gcc)
-	if [ "$jobname" = osx-gcc ]
-	then
+	case "$jobname" in
+	osx-gcc)
 		export CC=gcc-9
 		MAKEFLAGS="$MAKEFLAGS PYTHON_PATH=$(which python3)"
-	else
+		;;
+	*)
 		MAKEFLAGS="$MAKEFLAGS PYTHON_PATH=$(which python2)"
-	fi
+		;;
+	esac
 
 	# t9810 occasionally fails on Travis CI OS X
 	# t9816 occasionally fails with "TAP out of sequence errors" on
-- 
2.33.0.819.g59feb45f5e0



* [PATCH v5 3/3] tests: add a test mode for SANITIZE=leak, run it in CI
  2021-09-07 21:30         ` [PATCH v5 " Ævar Arnfjörð Bjarmason
  2021-09-07 21:30           ` [PATCH v5 1/3] Makefile: add SANITIZE=leak flag to GIT-BUILD-OPTIONS Ævar Arnfjörð Bjarmason
  2021-09-07 21:30           ` [PATCH v5 2/3] CI: refactor "if" to "case" statement Ævar Arnfjörð Bjarmason
@ 2021-09-07 21:30           ` Ævar Arnfjörð Bjarmason
  2021-09-08  4:46             ` Eric Sunshine
  2021-09-16  3:56             ` [PATCH] fixup! " Carlo Marcelo Arenas Belón
  2021-09-08 11:02           ` [PATCH v5 0/3] " Junio C Hamano
  2021-09-16 10:48           ` [PATCH v6 0/2] " Ævar Arnfjörð Bjarmason
  4 siblings, 2 replies; 125+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-09-07 21:30 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Andrzej Hunt, Lénaïc Huard,
	Derrick Stolee, Felipe Contreras, SZEDER Gábor,
	Đoàn Trần Công Danh, Eric Sunshine,
	Ævar Arnfjörð Bjarmason

While git can be compiled with SANITIZE=leak, we have not run
regression tests under that mode. Memory leaks have only been fixed as
one-offs without structured regression testing.

This change adds CI testing for it. We'll now build and test
t000[04]*.sh under both Linux and OSX. The new jobs are called
"linux-leaks" and "osx-leaks".

The CI target uses a new GIT_TEST_PASSING_SANITIZE_LEAK=true test
mode. When running in that mode, we'll assert that we were compiled
with SANITIZE=leak. We'll then skip all tests, except those that we've
opted-in by setting "TEST_PASSES_SANITIZE_LEAK=true".
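
The decision described above can be sketched as a small standalone
function. This is a simplification for illustration only: the real
check in test-lib.sh uses test_bool_env and accepts any git-config
boolean, whereas the hypothetical leak_mode_action below only
recognizes the literal string "true".

```shell
#!/bin/sh
# Hypothetical helper. Arguments:
#   $1: SANITIZE_LEAK from the build (non-empty if compiled with it)
#   $2: GIT_TEST_PASSING_SANITIZE_LEAK ("true" to enable the mode)
#   $3: TEST_PASSES_SANITIZE_LEAK ("true" if the script opted in)
# Prints "run", "skip" or "error".
leak_mode_action () {
	compiled=$1
	mode=$2
	passes=$3

	if test -n "$compiled"
	then
		if test "$mode" = true && test "$passes" != true
		then
			echo skip
		else
			echo run
		fi
	elif test "$mode" = true
	then
		# The mode was requested, but git was not built with SANITIZE=leak.
		echo error
	else
		echo run
	fi
}

# A script that has not opted in, run in the new mode on a leak build:
leak_mode_action YesCompiledWithIt true ""
```

Note that the "error" case is what makes the mode assert that we were
compiled with SANITIZE=leak, rather than silently running everything.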

A test script setting "TEST_PASSES_SANITIZE_LEAK=true" can in
turn make use of the "SANITIZE_LEAK" prerequisite, should it wish to
selectively skip tests even under
"GIT_TEST_PASSING_SANITIZE_LEAK=true". In a preceding commit we
started doing this in "t0004-unwritable.sh" under SANITIZE=leak; now
it'll combine nicely with "GIT_TEST_PASSING_SANITIZE_LEAK=true".

This is how tests that don't set "TEST_PASSES_SANITIZE_LEAK=true" will
be skipped under GIT_TEST_PASSING_SANITIZE_LEAK=true:

    $ GIT_TEST_PASSING_SANITIZE_LEAK=true ./t0001-init.sh
    1..0 # SKIP skip all tests in t0001 under SANITIZE=leak, TEST_PASSES_SANITIZE_LEAK not set

The intent is to add more TEST_PASSES_SANITIZE_LEAK=true annotations
as a follow-up change, but let's start small to begin with.

It would also be possible to implement a more lightweight version of
this by only relying on setting "LSAN_OPTIONS". See
<YS9OT/pn5rRK9cGB@coredump.intra.peff.net>[1] and
<YS9ZIDpANfsh7N+S@coredump.intra.peff.net>[2] for a discussion of
that. I've opted for this approach of adding a GIT_TEST_* mode instead
because it's consistent with how we handle other special test modes.

Being able to add a "!SANITIZE_LEAK" prerequisite and calling
"test_done" early if it isn't satisfied also means that we can more
incrementally add regression tests without being forced to fix
widespread and hard-to-fix leaks at the same time.

We have tests that do simple checking of some tool we're interested
in, but later on in the script might be stressing trace2, or common
sources of leaks like "git log" in combination with the tool (e.g. the
commit-graph tests). To be clear, having a prerequisite could also be
accomplished by using "LSAN_OPTIONS" directly.

On the topic of "LSAN_OPTIONS": It would be nice to have a mode to
aggregate all failures in our various scripts, see [2] for a start at
doing that which sets "log_path" in "LSAN_OPTIONS". I've punted on
that for now; it can be added later.
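
To illustrate what such a mode might set, here is a minimal sketch
that composes an "LSAN_OPTIONS" value containing "log_path". The
lsan_add_option helper and the /tmp/lsan-logs directory are
hypothetical and not part of this series. The sketch assumes the
sanitizer runtime's documented behavior of appending ".<pid>" to the
given path.

```shell
#!/bin/sh
# Hypothetical helper: append one "key=value" to a colon-separated
# LSAN_OPTIONS-style string, avoiding a leading colon when it is empty.
lsan_add_option () {
	if test -n "$1"
	then
		echo "$1:$2"
	else
		echo "$2"
	fi
}

# Keep whatever the caller already set, then add a log prefix.
LSAN_OPTIONS=$(lsan_add_option "${LSAN_OPTIONS-}" "log_path=/tmp/lsan-logs/leak")
export LSAN_OPTIONS
echo "LSAN_OPTIONS=$LSAN_OPTIONS"
```

With something like this exported before running the tests, each
leaking process would be expected to write its own report file, which
could then be collected after the run.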

As of writing this we've got major regressions between master..seen,
i.e. the t000*.sh tests and more fixed since 31f9acf9ce2 (Merge branch
'ah/plugleaks', 2021-08-04) have regressed recently.

See the discussion at <87czsv2idy.fsf@evledraar.gmail.com>[3] about
the lack of this sort of test mode, and 0e5bba53af (add UNLEAK
annotation for reducing leak false positives, 2017-09-08) for the
initial addition of SANITIZE=leak.

See also 09595ab381 (Merge branch 'jk/leak-checkers', 2017-09-19),
7782066f67 (Merge branch 'jk/apache-lsan', 2019-05-19) and the recent
936e58851a (Merge branch 'ah/plugleaks', 2021-05-07) for some of the
past history of "one-off" SANITIZE=leak (and more) fixes.

We use gcc on OSX rather than the clang default because building with
clang on "macos-latest" currently fails with:

    clang: error: unsupported option '-fsanitize=leak' for target 'x86_64-apple-darwin19.6.0'

If that's sorted out in the future we might want to run that job with
"clang" merely to make use of the default, and also to add some
compiler variance into the mix. Both use the
"AddressSanitizerLeakSanitizer" library[1], so they shouldn't
behave differently under GCC or clang.

1. https://github.com/google/sanitizers/wiki/AddressSanitizerLeakSanitizer
2. https://lore.kernel.org/git/YS9OT%2Fpn5rRK9cGB@coredump.intra.peff.net/
3. https://lore.kernel.org/git/87czsv2idy.fsf@evledraar.gmail.com/
4. https://lore.kernel.org/git/YS9ZIDpANfsh7N+S@coredump.intra.peff.net/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 .github/workflows/main.yml |  6 ++++++
 ci/install-dependencies.sh |  6 +++---
 ci/lib.sh                  | 15 +++++++++++----
 ci/run-build-and-tests.sh  |  2 +-
 t/README                   |  7 +++++++
 t/t0000-basic.sh           |  1 +
 t/t0004-unwritable.sh      |  1 +
 t/test-lib.sh              | 20 ++++++++++++++++++++
 8 files changed, 50 insertions(+), 8 deletions(-)

diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
index 68596f25927..a2d345fb00e 100644
--- a/.github/workflows/main.yml
+++ b/.github/workflows/main.yml
@@ -232,6 +232,12 @@ jobs:
           - jobname: linux-gcc-default
             cc: gcc
             pool: ubuntu-latest
+          - jobname: linux-leaks
+            cc: gcc
+            pool: ubuntu-latest
+          - jobname: osx-leaks
+            cc: gcc
+            pool: macos-latest
     env:
       CC: ${{matrix.vector.cc}}
       jobname: ${{matrix.vector.jobname}}
diff --git a/ci/install-dependencies.sh b/ci/install-dependencies.sh
index 5772081b6e5..bb88afd3699 100755
--- a/ci/install-dependencies.sh
+++ b/ci/install-dependencies.sh
@@ -12,13 +12,13 @@ UBUNTU_COMMON_PKGS="make libssl-dev libcurl4-openssl-dev libexpat-dev
  libemail-valid-perl libio-socket-ssl-perl libnet-smtp-ssl-perl"
 
 case "$jobname" in
-linux-clang|linux-gcc)
+linux-clang|linux-gcc|linux-leaks)
 	sudo apt-add-repository -y "ppa:ubuntu-toolchain-r/test"
 	sudo apt-get -q update
 	sudo apt-get -q -y install language-pack-is libsvn-perl apache2 \
 		$UBUNTU_COMMON_PKGS
 	case "$jobname" in
-	linux-gcc)
+	linux-gcc|linux-leaks)
 		sudo apt-get -q -y install gcc-8
 		;;
 	esac
@@ -37,7 +37,7 @@ linux-clang|linux-gcc)
 		cp git-lfs-$LINUX_GIT_LFS_VERSION/git-lfs .
 	popd
 	;;
-osx-clang|osx-gcc)
+osx-clang|osx-gcc|osx-leaks)
 	export HOMEBREW_NO_AUTO_UPDATE=1 HOMEBREW_NO_INSTALL_CLEANUP=1
 	# Uncomment this if you want to run perf tests:
 	# brew install gnu-time
diff --git a/ci/lib.sh b/ci/lib.sh
index 33b9777ab7e..043c99d31cb 100755
--- a/ci/lib.sh
+++ b/ci/lib.sh
@@ -183,9 +183,9 @@ export GIT_TEST_CLONE_2GB=true
 export SKIP_DASHED_BUILT_INS=YesPlease
 
 case "$jobname" in
-linux-clang|linux-gcc)
+linux-clang|linux-gcc|linux-leaks)
 	case "$jobname" in
-	linux-gcc)
+	linux-gcc|linux-leaks)
 		export CC=gcc-8
 		MAKEFLAGS="$MAKEFLAGS PYTHON_PATH=/usr/bin/python3"
 		;;
@@ -208,9 +208,9 @@ linux-clang|linux-gcc)
 	GIT_LFS_PATH="$HOME/custom/git-lfs"
 	export PATH="$GIT_LFS_PATH:$P4_PATH:$PATH"
 	;;
-osx-clang|osx-gcc)
+osx-clang|osx-gcc|osx-leaks)
 	case "$jobname" in
-	osx-gcc)
+	osx-gcc|osx-leaks)
 		export CC=gcc-9
 		MAKEFLAGS="$MAKEFLAGS PYTHON_PATH=$(which python3)"
 		;;
@@ -237,4 +237,11 @@ linux-musl)
 	;;
 esac
 
+case "$jobname" in
+linux-leaks|osx-leaks)
+	export SANITIZE=leak
+	export GIT_TEST_PASSING_SANITIZE_LEAK=true
+	;;
+esac
+
 MAKEFLAGS="$MAKEFLAGS CC=${CC:-cc}"
diff --git a/ci/run-build-and-tests.sh b/ci/run-build-and-tests.sh
index 3ce81ffee94..23d2fa5565a 100755
--- a/ci/run-build-and-tests.sh
+++ b/ci/run-build-and-tests.sh
@@ -12,7 +12,7 @@ esac
 
 make
 case "$jobname" in
-linux-gcc)
+linux-gcc|linux-leaks)
 	export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 	make test
 	export GIT_TEST_SPLIT_INDEX=yes
diff --git a/t/README b/t/README
index 9e701223020..8b5f86a46f3 100644
--- a/t/README
+++ b/t/README
@@ -366,6 +366,13 @@ excluded as so much relies on it, but this might change in the future.
 GIT_TEST_SPLIT_INDEX=<boolean> forces split-index mode on the whole
 test suite. Accept any boolean values that are accepted by git-config.
 
+GIT_TEST_PASSING_SANITIZE_LEAK=<boolean> when compiled with
+SANITIZE=leak will run only those tests that have whitelisted
+themselves as passing with no memory leaks. Tests can be whitelisted
+by setting "TEST_PASSES_SANITIZE_LEAK=true" before sourcing
+"test-lib.sh" itself at the top of the test script. This test mode is
+used by the "linux-leaks" CI target.
+
 GIT_TEST_PROTOCOL_VERSION=<n>, when set, makes 'protocol.version'
 default to n.
 
diff --git a/t/t0000-basic.sh b/t/t0000-basic.sh
index cb87768513c..54318af3861 100755
--- a/t/t0000-basic.sh
+++ b/t/t0000-basic.sh
@@ -18,6 +18,7 @@ swapping compression and hashing order, the person who is making the
 modification *should* take notice and update the test vectors here.
 '
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 try_local_xy () {
diff --git a/t/t0004-unwritable.sh b/t/t0004-unwritable.sh
index fbdcb926b3a..37d68ef03be 100755
--- a/t/t0004-unwritable.sh
+++ b/t/t0004-unwritable.sh
@@ -2,6 +2,7 @@
 
 test_description='detect unwritable repository and fail correctly'
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 test_expect_success setup '
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 4ab18914a3d..3b7acfec23b 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1379,6 +1379,26 @@ then
 	test_done
 fi
 
+# skip non-whitelisted tests when compiled with SANITIZE=leak
+if test -n "$SANITIZE_LEAK"
+then
+	if test_bool_env GIT_TEST_PASSING_SANITIZE_LEAK false
+	then
+		# We need to see it in "git env--helper" (via
+		# test_bool_env)
+		export TEST_PASSES_SANITIZE_LEAK
+
+		if ! test_bool_env TEST_PASSES_SANITIZE_LEAK false
+		then
+			skip_all="skipping $this_test under GIT_TEST_PASSING_SANITIZE_LEAK=true"
+			test_done
+		fi
+	fi
+elif test_bool_env GIT_TEST_PASSING_SANITIZE_LEAK false
+then
+	error "GIT_TEST_PASSING_SANITIZE_LEAK=true has no effect except when compiled with SANITIZE=leak"
+fi
+
 # Last-minute variable setup
 HOME="$TRASH_DIRECTORY"
 GNUPGHOME="$HOME/gnupg-home-not-used"
-- 
2.33.0.819.g59feb45f5e0



* Re: [PATCH v5 3/3] tests: add a test mode for SANITIZE=leak, run it in CI
  2021-09-07 21:30           ` [PATCH v5 3/3] tests: add a test mode for SANITIZE=leak, run it in CI Ævar Arnfjörð Bjarmason
@ 2021-09-08  4:46             ` Eric Sunshine
  2021-09-16  3:56             ` [PATCH] fixup! " Carlo Marcelo Arenas Belón
  1 sibling, 0 replies; 125+ messages in thread
From: Eric Sunshine @ 2021-09-08  4:46 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Git List, Junio C Hamano, Jeff King, Andrzej Hunt,
	Lénaïc Huard, Derrick Stolee, Felipe Contreras,
	SZEDER Gábor, Đoàn Trần Công Danh

On Tue, Sep 7, 2021 at 5:30 PM Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote:
> [...]
> A test tests setting "TEST_PASSES_SANITIZE_LEAK=true" setting can in
> turn make use of the "SANITIZE_LEAK" prerequisite, should they wish to
> selectively skip tests even under
> "GIT_TEST_PASSING_SANITIZE_LEAK=true". In a preceding commit we
> started doing this in "t0004-unwritable.sh" under SANITIZE=leak, now
> it'll combine nicely with "GIT_TEST_PASSING_SANITIZE_LEAK=true".

Is the wording "A test tests setting ... setting" intentional?


* Re: [PATCH v5 0/3] add a test mode for SANITIZE=leak, run it in CI
  2021-09-07 21:30         ` [PATCH v5 " Ævar Arnfjörð Bjarmason
                             ` (2 preceding siblings ...)
  2021-09-07 21:30           ` [PATCH v5 3/3] tests: add a test mode for SANITIZE=leak, run it in CI Ævar Arnfjörð Bjarmason
@ 2021-09-08 11:02           ` Junio C Hamano
  2021-09-08 12:03             ` Ævar Arnfjörð Bjarmason
  2021-09-16 10:48           ` [PATCH v6 0/2] " Ævar Arnfjörð Bjarmason
  4 siblings, 1 reply; 125+ messages in thread
From: Junio C Hamano @ 2021-09-08 11:02 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Jeff King, Andrzej Hunt, Lénaïc Huard,
	Derrick Stolee, Felipe Contreras, SZEDER Gábor,
	Đoàn Trần Công Danh, Eric Sunshine

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> We can compile git with SANITIZE=leak, and have had various efforts in
> the past such as 31f9acf9ce2 (Merge branch 'ah/plugleaks', 2021-08-04)
> to plug memory leaks, but have had no CI testing of it to ensure that
> we don't get regressions. This series adds a GIT_TEST_* mode for
> checking those regressions, and runs it in CI.
>
> Since I submitted v2 the delta between origin/master..origin/seen
> broke even t0001-init.sh when run under SANITIZE=leak, so this series
> will cause test smoke on "seen".
>
> That failure is due to a bug in es/config-based-hooks [1] and the
> hn/reftable topic, i.e. these patches are legitimately catching
> regressions in "seen" from day 1.

So is there a point in sending this out to the list, before sending
fixes to these broken topics and making sure they get corrected?

Because the CI does not "bisect" to tell us "ok, up to this point in
'seen', all the topics merged play well together", the overall
effect in the bigger picture is that 'seen' with this series would
cause CI to stay in failed state.

For now, I'll keep this near the tip of 'seen'.

Thanks.



* Re: [PATCH v5 0/3] add a test mode for SANITIZE=leak, run it in CI
  2021-09-08 11:02           ` [PATCH v5 0/3] " Junio C Hamano
@ 2021-09-08 12:03             ` Ævar Arnfjörð Bjarmason
  2021-09-09 23:10               ` Emily Shaffer
  0 siblings, 1 reply; 125+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-09-08 12:03 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, Jeff King, Andrzej Hunt, Lénaïc Huard,
	Derrick Stolee, Felipe Contreras, SZEDER Gábor,
	Đoàn Trần Công Danh, Eric Sunshine,
	Emily Shaffer


On Wed, Sep 08 2021, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:
>
>> We can compile git with SANITIZE=leak, and have had various efforts in
>> the past such as 31f9acf9ce2 (Merge branch 'ah/plugleaks', 2021-08-04)
>> to plug memory leaks, but have had no CI testing of it to ensure that
>> we don't get regressions. This series adds a GIT_TEST_* mode for
>> checking those regressions, and runs it in CI.
>>
>> Since I submitted v2 the delta between origin/master..origin/seen
>> broke even t0001-init.sh when run under SANITIZE=leak, so this series
>> will cause test smoke on "seen".
>>
>> That failure is due to a bug in es/config-based-hooks [1] and the
>> hn/reftable topic, i.e. these patches are legitimately catching
>> regressions in "seen" from day 1.
>
> So is there a point in sending this out to the list, before sending
> fixes to these broken topics and making sure they get corrected?
>
> Because the CI does not "bisect" to tell us "ok, up to this point in
> 'seen', all the topics merged play well together", the overall
> effect in the bigger picture is that 'seen' with this series would
> cause CI to stay in failed state.
>
> For now, I'll keep this near the tip of 'seen'.

The breakages with it are in combination with:

    ab/config-based-hooks-base
    es/config-based-hooks
    hn/reftable

You've got v4 of ab/config-based-hooks-base; the v5 is at [1], but we've
been waiting on Emily to re-roll hers on top. As noted in that e-mail,
I've got a working re-roll of it as
avar-nasamuffin/config-based-hooks-restart-3 in my repo.

That'll leave hn/reftable, which given [2] I thought you were planning
to eject, and with the number of fixups for it / the planned re-doing of
it by Han-Wen[3] maybe it's better to do that now?

What do you think about that plan?

I.e. ejecting hn/reftable while waiting on a re-roll, and either
ejecting es/config-based-hooks while waiting, or I can submit the
avar-nasamuffin/config-based-hooks-restart-3 I've got pending Emily's
own re-roll (which may or may not be different from that).

That along with picking up the v5 of my ab/config-based-hooks-base
should make "seen" pass with SANITIZE=leak on these tests, unless
there are other just-introduced regressions. I tried re-building it a
few days ago; I haven't done that just now.

1. https://lore.kernel.org/git/cover-v5-00.36-00000000000-20210902T125110Z-avarab@gmail.com/
2. https://lore.kernel.org/git/xmqq4kaxe5dt.fsf@gitster.g/
3. https://lore.kernel.org/git/CAFQ2z_N8pUsp3cdBpybHBD-V9_1sARCZvSxr0UkMfcwCoQfCbw@mail.gmail.com/


* Re: [PATCH v5 0/3] add a test mode for SANITIZE=leak, run it in CI
  2021-09-08 12:03             ` Ævar Arnfjörð Bjarmason
@ 2021-09-09 23:10               ` Emily Shaffer
  0 siblings, 0 replies; 125+ messages in thread
From: Emily Shaffer @ 2021-09-09 23:10 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Junio C Hamano, git, Jeff King, Andrzej Hunt,
	Lénaïc Huard, Derrick Stolee, Felipe Contreras,
	SZEDER Gábor, Đoàn Trần Công Danh,
	Eric Sunshine

On Wed, Sep 08, 2021 at 02:03:30PM +0200, Ævar Arnfjörð Bjarmason wrote:
> 
> 
> On Wed, Sep 08 2021, Junio C Hamano wrote:
> 
> > Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:
> >
> >> We can compile git with SANITIZE=leak, and have had various efforts in
> >> the past such as 31f9acf9ce2 (Merge branch 'ah/plugleaks', 2021-08-04)
> >> to plug memory leaks, but have had no CI testing of it to ensure that
> >> we don't get regressions. This series adds a GIT_TEST_* mode for
> >> checking those regressions, and runs it in CI.
> >>
> >> Since I submitted v2 the delta between origin/master..origin/seen
> >> broke even t0001-init.sh when run under SANITIZE=leak, so this series
> >> will cause test smoke on "seen".
> >>
> >> That failure is due to a bug in es/config-based-hooks [1] and the
> >> hn/reftable topic, i.e. these patches are legitimately catching
> >> regressions in "seen" from day 1.
> >
> > So is there a point in sending this out to the list, before sending
> > fixes to these broken topics and making sure they get corrected?
> >
> > Because the CI does not "bisect" to tell us "ok, up to this point in
> > 'seen', all the topics merged play well together", the overall
> > effect in the bigger picture is that 'seen' with this series would
> > cause CI to stay in failed state.
> >
> > For now, I'll keep this near the tip of 'seen'.
> 
> The breakages with it are in combination with:
> 
>     ab/config-based-hooks-base
>     es/config-based-hooks
>     hn/reftable
> 
> You've got v4 of ab/config-based-hooks-base; the v5 is at [1], but we've
> been waiting on Emily to re-roll hers on top. As noted in that e-mail,
> I've got a working re-roll of it as
> avar-nasamuffin/config-based-hooks-restart-3 in my repo.
> 
> That'll leave hn/reftable, which given [2] I thought you were planning
> to eject, and with the number of fixups for it / the planned re-doing of
> it by Han-Wen[3] maybe it's better to do that now?
> 
> What do you think about that plan?
> 
> I.e. ejecting hn/reftable while waiting on a re-roll, and either
> ejecting es/config-based-hooks while waiting, or I can submit the
> avar-nasamuffin/config-based-hooks-restart-3 I've got pending Emily's
> own re-roll (which may or may not be different from that).

My own reroll is waiting on some feedback internally and probably won't
show up this week at all, so I suggest to kick mine out and prioritize
the reftable stuff for now.

 - Emily

> 
> That along with picking up the v5 of my ab/config-based-hooks-base
> should make "seen" pass with SANITIZE=leak on these tests, unless
> there are other just-introduced regressions. I tried re-building it a
> few days ago; I haven't done that just now.
> 
> 1. https://lore.kernel.org/git/cover-v5-00.36-00000000000-20210902T125110Z-avarab@gmail.com/
> 2. https://lore.kernel.org/git/xmqq4kaxe5dt.fsf@gitster.g/
> 3. https://lore.kernel.org/git/CAFQ2z_N8pUsp3cdBpybHBD-V9_1sARCZvSxr0UkMfcwCoQfCbw@mail.gmail.com/


* [PATCH] fixup! tests: add a test mode for SANITIZE=leak, run it in CI
  2021-09-07 21:30           ` [PATCH v5 3/3] tests: add a test mode for SANITIZE=leak, run it in CI Ævar Arnfjörð Bjarmason
  2021-09-08  4:46             ` Eric Sunshine
@ 2021-09-16  3:56             ` Carlo Marcelo Arenas Belón
  2021-09-16  6:14               ` Ævar Arnfjörð Bjarmason
  1 sibling, 1 reply; 125+ messages in thread
From: Carlo Marcelo Arenas Belón @ 2021-09-16  3:56 UTC (permalink / raw)
  To: git; +Cc: avarab, Carlo Marcelo Arenas Belón

Use the standard gcc in Linux, instead of the older version.

Remove the osx-leaks job; neither clang nor gcc supports it, and that
won't change until clang 14 is released.

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
---
 .github/workflows/main.yml | 3 ---
 ci/install-dependencies.sh | 4 ++--
 ci/lib.sh                  | 8 ++++----
 3 files changed, 6 insertions(+), 9 deletions(-)

diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
index 7c273147a0..59acc35d37 100644
--- a/.github/workflows/main.yml
+++ b/.github/workflows/main.yml
@@ -234,9 +234,6 @@ jobs:
           - jobname: linux-leaks
             cc: gcc
             pool: ubuntu-latest
-          - jobname: osx-leaks
-            cc: gcc
-            pool: macos-latest
     env:
       CC: ${{matrix.vector.cc}}
       jobname: ${{matrix.vector.jobname}}
diff --git a/ci/install-dependencies.sh b/ci/install-dependencies.sh
index bb88afd369..1d0e48f451 100755
--- a/ci/install-dependencies.sh
+++ b/ci/install-dependencies.sh
@@ -18,7 +18,7 @@ linux-clang|linux-gcc|linux-leaks)
 	sudo apt-get -q -y install language-pack-is libsvn-perl apache2 \
 		$UBUNTU_COMMON_PKGS
 	case "$jobname" in
-	linux-gcc|linux-leaks)
+	linux-gcc)
 		sudo apt-get -q -y install gcc-8
 		;;
 	esac
@@ -37,7 +37,7 @@ linux-clang|linux-gcc|linux-leaks)
 		cp git-lfs-$LINUX_GIT_LFS_VERSION/git-lfs .
 	popd
 	;;
-osx-clang|osx-gcc|osx-leaks)
+osx-clang|osx-gcc)
 	export HOMEBREW_NO_AUTO_UPDATE=1 HOMEBREW_NO_INSTALL_CLEANUP=1
 	# Uncomment this if you want to run perf tests:
 	# brew install gnu-time
diff --git a/ci/lib.sh b/ci/lib.sh
index cf62f786a3..36f594751d 100755
--- a/ci/lib.sh
+++ b/ci/lib.sh
@@ -185,7 +185,7 @@ export SKIP_DASHED_BUILT_INS=YesPlease
 case "$jobname" in
 linux-clang|linux-gcc|linux-leaks)
 	case "$jobname" in
-	linux-gcc|linux-leaks)
+	linux-gcc)
 		export CC=gcc-8
 		MAKEFLAGS="$MAKEFLAGS PYTHON_PATH=/usr/bin/python3"
 		;;
@@ -208,9 +208,9 @@ linux-clang|linux-gcc|linux-leaks)
 	GIT_LFS_PATH="$HOME/custom/git-lfs"
 	export PATH="$GIT_LFS_PATH:$P4_PATH:$PATH"
 	;;
-osx-clang|osx-gcc|osx-leaks)
+osx-clang|osx-gcc)
 	case "$jobname" in
-	osx-gcc|osx-leaks)
+	osx-gcc)
 		export CC=gcc-9
 		MAKEFLAGS="$MAKEFLAGS PYTHON_PATH=$(which python3)"
 		;;
@@ -239,7 +239,7 @@ linux-musl)
 esac
 
 case "$jobname" in
-linux-leaks|osx-leaks)
+*-leaks)
 	export SANITIZE=leak
 	export GIT_TEST_PASSING_SANITIZE_LEAK=true
 	;;
-- 
2.33.0.481.g26d3bed244



* Re: [PATCH] fixup! tests: add a test mode for SANITIZE=leak, run it in CI
  2021-09-16  3:56             ` [PATCH] fixup! " Carlo Marcelo Arenas Belón
@ 2021-09-16  6:14               ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 125+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-09-16  6:14 UTC (permalink / raw)
  To: Carlo Marcelo Arenas Belón; +Cc: git


On Wed, Sep 15 2021, Carlo Marcelo Arenas Belón wrote:

> Use the standard gcc in Linux, instead of the older version.
>
> Remove the osx-leaks job; neither clang nor gcc supports it, and that
> won't change until clang 14 is released.

Thanks, that's well spotted. I'll fix this in a re-roll. I just tested
and the osx job does nothing.

FWIW I'd tested and it errored on clang, but didn't check the gcc case
of silently ignoring it.

> Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
> ---
>  .github/workflows/main.yml | 3 ---
>  ci/install-dependencies.sh | 4 ++--
>  ci/lib.sh                  | 8 ++++----
>  3 files changed, 6 insertions(+), 9 deletions(-)
>
> diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
> index 7c273147a0..59acc35d37 100644
> --- a/.github/workflows/main.yml
> +++ b/.github/workflows/main.yml
> @@ -234,9 +234,6 @@ jobs:
>            - jobname: linux-leaks
>              cc: gcc
>              pool: ubuntu-latest
> -          - jobname: osx-leaks
> -            cc: gcc
> -            pool: macos-latest
>      env:
>        CC: ${{matrix.vector.cc}}
>        jobname: ${{matrix.vector.jobname}}
> diff --git a/ci/install-dependencies.sh b/ci/install-dependencies.sh
> index bb88afd369..1d0e48f451 100755
> --- a/ci/install-dependencies.sh
> +++ b/ci/install-dependencies.sh
> @@ -18,7 +18,7 @@ linux-clang|linux-gcc|linux-leaks)
>  	sudo apt-get -q -y install language-pack-is libsvn-perl apache2 \
>  		$UBUNTU_COMMON_PKGS
>  	case "$jobname" in
> -	linux-gcc|linux-leaks)
> +	linux-gcc)
>  		sudo apt-get -q -y install gcc-8
>  		;;
>  	esac


> @@ -37,7 +37,7 @@ linux-clang|linux-gcc|linux-leaks)
>  		cp git-lfs-$LINUX_GIT_LFS_VERSION/git-lfs .
>  	popd
>  	;;
> -osx-clang|osx-gcc|osx-leaks)
> +osx-clang|osx-gcc)
>  	export HOMEBREW_NO_AUTO_UPDATE=1 HOMEBREW_NO_INSTALL_CLEANUP=1
>  	# Uncomment this if you want to run perf tests:
>  	# brew install gnu-time
> diff --git a/ci/lib.sh b/ci/lib.sh
> index cf62f786a3..36f594751d 100755
> --- a/ci/lib.sh
> +++ b/ci/lib.sh
> @@ -185,7 +185,7 @@ export SKIP_DASHED_BUILT_INS=YesPlease
>  case "$jobname" in
>  linux-clang|linux-gcc|linux-leaks)
>  	case "$jobname" in
> -	linux-gcc|linux-leaks)
> +	linux-gcc)
>  		export CC=gcc-8
>  		MAKEFLAGS="$MAKEFLAGS PYTHON_PATH=/usr/bin/python3"
>  		;;
> @@ -208,9 +208,9 @@ linux-clang|linux-gcc|linux-leaks)
>  	GIT_LFS_PATH="$HOME/custom/git-lfs"
>  	export PATH="$GIT_LFS_PATH:$P4_PATH:$PATH"
>  	;;
> -osx-clang|osx-gcc|osx-leaks)
> +osx-clang|osx-gcc)
>  	case "$jobname" in
> -	osx-gcc|osx-leaks)
> +	osx-gcc)
>  		export CC=gcc-9
>  		MAKEFLAGS="$MAKEFLAGS PYTHON_PATH=$(which python3)"
>  		;;
> @@ -239,7 +239,7 @@ linux-musl)
>  esac
>  
>  case "$jobname" in
> -linux-leaks|osx-leaks)
> +*-leaks)
>  	export SANITIZE=leak
>  	export GIT_TEST_PASSING_SANITIZE_LEAK=true
>  	;;

I'll leave this stray cleanup out. Yes, it's functionally equivalent, but
it really helps to be able to see an identifier in main.yml and grep for
it across the untyped-language boundary of that YAML going into
shellscript.


* [PATCH v6 0/2] add a test mode for SANITIZE=leak, run it in CI
  2021-09-07 21:30         ` [PATCH v5 " Ævar Arnfjörð Bjarmason
                             ` (3 preceding siblings ...)
  2021-09-08 11:02           ` [PATCH v5 0/3] " Junio C Hamano
@ 2021-09-16 10:48           ` Ævar Arnfjörð Bjarmason
  2021-09-16 10:48             ` [PATCH v6 1/2] Makefile: add SANITIZE=leak flag to GIT-BUILD-OPTIONS Ævar Arnfjörð Bjarmason
                               ` (2 more replies)
  4 siblings, 3 replies; 125+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-09-16 10:48 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Andrzej Hunt, Lénaïc Huard,
	Derrick Stolee, Felipe Contreras, SZEDER Gábor,
	Đoàn Trần Công Danh,
	Carlo Marcelo Arenas Belón,
	Ævar Arnfjörð Bjarmason

This v6 incorporates a suggested fixup from Carlo Marcelo Arenas
Belón. As it turns out, we weren't running at all under OSX:
https://lore.kernel.org/git/20210916035603.76369-1-carenas@gmail.com/

So the osx-leaks job has been dropped, and we're not using an older
compiler anymore (I'd just copy/pasted that setting). Since we don't
need it, we can drop the 2nd patch of v5. For v5 see:
https://lore.kernel.org/git/cover-v5-0.3-00000000000-20210907T212626Z-avarab@gmail.com/

This also incorporates a wording fix from Eric Sunshine.

The rest of this cover letter is just a message for Eric Sunshine,
included here in case he sees this on-list:

Eric: If you're reading this, I dropped you from CC because since
around September 2nd mailer-daemon@googlemail.com has been failing to
deliver all mail to you. It appears the dynadot.com MTA you use has
banned delivery from GMail's public IPs due to spam complaints:

 The recipient server did not accept our requests to connect. Learn
 more at https://support.google.com/mail/answer/7720
 [parkmail.dynadot.com. 68.68.98.83: 421 parkmail.dynadot.com your ip
 address has been banned due to spam complaints 209.85.167.53 ]
 [parkmail.dynadot.com. 68.68.98.74: 421 parkmail.dynadot.com your ip
 address has been banned due to spam complaints 209.85.167.48 ]
 [parkmail.dynadot.com. 68.68.98.84: 421 parkmail.dynadot.com your ip
 address has been banned due to spam complaints 209.85.167.48 ]

Ævar Arnfjörð Bjarmason (2):
  Makefile: add SANITIZE=leak flag to GIT-BUILD-OPTIONS
  tests: add a test mode for SANITIZE=leak, run it in CI

 .github/workflows/main.yml |  3 +++
 Makefile                   |  5 +++++
 ci/install-dependencies.sh |  2 +-
 ci/lib.sh                  |  9 ++++++++-
 ci/run-build-and-tests.sh  |  2 +-
 t/README                   |  7 +++++++
 t/t0000-basic.sh           |  1 +
 t/t0004-unwritable.sh      |  3 ++-
 t/test-lib.sh              | 21 +++++++++++++++++++++
 9 files changed, 49 insertions(+), 4 deletions(-)

Range-diff against v5:
1:  bdfe2279271 = 1:  fc7ba4cb1c3 Makefile: add SANITIZE=leak flag to GIT-BUILD-OPTIONS
2:  6aaa60e3759 < -:  ----------- CI: refactor "if" to "case" statement
3:  f3cd04b16d1 ! 2:  8dcb1269881 tests: add a test mode for SANITIZE=leak, run it in CI
    @@ Commit message
         one-offs without structured regression testing.
     
         This change adds CI testing for it. We'll now build and test
    -    t000[04]*.sh under both Linux and OSX. The new jobs are called
    -    "linux-leaks" and "osx-leaks".
    +    t000[04]*.sh under Linux with a new job called "linux-leaks".
     
         The CI target uses a new GIT_TEST_PASSING_SANITIZE_LEAK=true test
         mode. When running in that mode, we'll assert that we were compiled
         with SANITIZE=leak. We'll then skip all tests, except those that we've
         opted-in by setting "TEST_PASSES_SANITIZE_LEAK=true".
     
    -    A test tests setting "TEST_PASSES_SANITIZE_LEAK=true" setting can in
    -    turn make use of the "SANITIZE_LEAK" prerequisite, should they wish to
    +    A test setting "TEST_PASSES_SANITIZE_LEAK=true" setting can in turn
    +    make use of the "SANITIZE_LEAK" prerequisite, should they wish to
         selectively skip tests even under
    -    "GIT_TEST_PASSING_SANITIZE_LEAK=true". In a preceding commit we
    +    "GIT_TEST_PASSING_SANITIZE_LEAK=true". In the preceding commit we
         started doing this in "t0004-unwritable.sh" under SANITIZE=leak, now
         it'll combine nicely with "GIT_TEST_PASSING_SANITIZE_LEAK=true".
     
    @@ Commit message
         936e58851a (Merge branch 'ah/plugleaks', 2021-05-07) for some of the
         past history of "one-off" SANITIZE=leak (and more) fixes.
     
    -    The reason for using gcc on OSX over the clang default is because when
    -    used with clang on "macos-latest" it'll currently fail to build with:
    -
    -        clang: error: unsupported option '-fsanitize=leak' for target 'x86_64-apple-darwin19.6.0'
    -
    -    If that's sorted out in the future we might want to run that job with
    -    "clang" merely to make use of the default, and also to add some
    -    compiler variance into the mix. Both use the
    -    "AddressSanitizerLeakSanitizer" library[4], so in they shouldn't
    -    behave differently under GCC or clang.
    +    As noted in [5] we can't support this on OSX yet until Clang 14 is
    +    released, at that point we'll probably want to resurrect that
    +    "osx-leaks" job.
     
         1. https://github.com/google/sanitizers/wiki/AddressSanitizerLeakSanitizer
         2. https://lore.kernel.org/git/YS9OT%2Fpn5rRK9cGB@coredump.intra.peff.net/
         3. https://lore.kernel.org/git/87czsv2idy.fsf@evledraar.gmail.com/
         4. https://lore.kernel.org/git/YS9ZIDpANfsh7N+S@coredump.intra.peff.net/
    +    5. https://lore.kernel.org/git/20210916035603.76369-1-carenas@gmail.com/
     
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
    +    Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
     
      ## .github/workflows/main.yml ##
     @@ .github/workflows/main.yml: jobs:
    @@ .github/workflows/main.yml: jobs:
     +          - jobname: linux-leaks
     +            cc: gcc
     +            pool: ubuntu-latest
    -+          - jobname: osx-leaks
    -+            cc: gcc
    -+            pool: macos-latest
          env:
            CC: ${{matrix.vector.cc}}
            jobname: ${{matrix.vector.jobname}}
    @@ ci/install-dependencies.sh: UBUNTU_COMMON_PKGS="make libssl-dev libcurl4-openssl
      	sudo apt-add-repository -y "ppa:ubuntu-toolchain-r/test"
      	sudo apt-get -q update
      	sudo apt-get -q -y install language-pack-is libsvn-perl apache2 \
    - 		$UBUNTU_COMMON_PKGS
    - 	case "$jobname" in
    --	linux-gcc)
    -+	linux-gcc|linux-leaks)
    - 		sudo apt-get -q -y install gcc-8
    - 		;;
    - 	esac
    -@@ ci/install-dependencies.sh: linux-clang|linux-gcc)
    - 		cp git-lfs-$LINUX_GIT_LFS_VERSION/git-lfs .
    - 	popd
    - 	;;
    --osx-clang|osx-gcc)
    -+osx-clang|osx-gcc|osx-leaks)
    - 	export HOMEBREW_NO_AUTO_UPDATE=1 HOMEBREW_NO_INSTALL_CLEANUP=1
    - 	# Uncomment this if you want to run perf tests:
    - 	# brew install gnu-time
     
      ## ci/lib.sh ##
     @@ ci/lib.sh: export GIT_TEST_CLONE_2GB=true
    @@ ci/lib.sh: export GIT_TEST_CLONE_2GB=true
      case "$jobname" in
     -linux-clang|linux-gcc)
     +linux-clang|linux-gcc|linux-leaks)
    - 	case "$jobname" in
    --	linux-gcc)
    -+	linux-gcc|linux-leaks)
    + 	if [ "$jobname" = linux-gcc ]
    + 	then
      		export CC=gcc-8
    - 		MAKEFLAGS="$MAKEFLAGS PYTHON_PATH=/usr/bin/python3"
    - 		;;
    -@@ ci/lib.sh: linux-clang|linux-gcc)
    - 	GIT_LFS_PATH="$HOME/custom/git-lfs"
    - 	export PATH="$GIT_LFS_PATH:$P4_PATH:$PATH"
    - 	;;
    --osx-clang|osx-gcc)
    -+osx-clang|osx-gcc|osx-leaks)
    - 	case "$jobname" in
    --	osx-gcc)
    -+	osx-gcc|osx-leaks)
    - 		export CC=gcc-9
    - 		MAKEFLAGS="$MAKEFLAGS PYTHON_PATH=$(which python3)"
    - 		;;
     @@ ci/lib.sh: linux-musl)
      	;;
      esac
      
     +case "$jobname" in
    -+linux-leaks|osx-leaks)
    ++linux-leaks)
     +	export SANITIZE=leak
     +	export GIT_TEST_PASSING_SANITIZE_LEAK=true
     +	;;
    @@ ci/lib.sh: linux-musl)
      MAKEFLAGS="$MAKEFLAGS CC=${CC:-cc}"
     
      ## ci/run-build-and-tests.sh ##
    -@@ ci/run-build-and-tests.sh: esac
    +@@ ci/run-build-and-tests.sh: fi
      
      make
      case "$jobname" in
    @@ t/test-lib.sh: then
     +fi
     +
      # Last-minute variable setup
    + USER_HOME="$HOME"
      HOME="$TRASH_DIRECTORY"
    - GNUPGHOME="$HOME/gnupg-home-not-used"
-- 
2.33.0.1056.gb2c8c79e36d



* [PATCH v6 1/2] Makefile: add SANITIZE=leak flag to GIT-BUILD-OPTIONS
  2021-09-16 10:48           ` [PATCH v6 0/2] " Ævar Arnfjörð Bjarmason
@ 2021-09-16 10:48             ` Ævar Arnfjörð Bjarmason
  2021-09-16 10:48             ` [PATCH v6 2/2] tests: add a test mode for SANITIZE=leak, run it in CI Ævar Arnfjörð Bjarmason
  2021-09-19  8:03             ` [PATCH v7 0/2] " Ævar Arnfjörð Bjarmason
  2 siblings, 0 replies; 125+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-09-16 10:48 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Andrzej Hunt, Lénaïc Huard,
	Derrick Stolee, Felipe Contreras, SZEDER Gábor,
	Đoàn Trần Công Danh,
	Carlo Marcelo Arenas Belón,
	Ævar Arnfjörð Bjarmason

When SANITIZE=leak is specified we'll now add a SANITIZE_LEAK flag to
GIT-BUILD-OPTIONS. This can then be picked up by test-lib.sh, which
sets a SANITIZE_LEAK prerequisite.

We can then skip specific tests that are known to fail under
SANITIZE=leak. Add one such annotation to t0004-unwritable.sh, which
now passes under SANITIZE=leak.
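
To illustrate how a negated prerequisite such as "!SANITIZE_LEAK" is
meant to behave, here is a minimal standalone sketch. It is not the
test library's own code: "have_prereqs" is a hypothetical helper that
takes the comma-separated list a test asks for and a space-separated
list of prerequisites assumed to be satisfied.

```shell
#!/bin/sh
# Hypothetical, simplified stand-in for the prerequisite check.
#   $1: comma-separated prerequisites a test requires; "!" negates one
#   $2: space-separated prerequisites that are currently satisfied
# Returns 0 if the test may run, 1 if it must be skipped.
have_prereqs () {
	wanted=$1
	satisfied=" $2 "
	old_IFS=$IFS
	IFS=,
	for prereq in $wanted
	do
		# The list was already split on ","; restore IFS right away
		IFS=$old_IFS
		case "$prereq" in
		!*)
			negated=t
			prereq=${prereq#!}
			;;
		*)
			negated=
			;;
		esac
		case "$satisfied" in
		*" $prereq "*)
			# Present: fine unless the test asked for its absence
			test -z "$negated" || return 1
			;;
		*)
			# Absent: fine only if the test asked for its absence
			test -n "$negated" || return 1
			;;
		esac
	done
	IFS=$old_IFS
	return 0
}
```

With this sketch, "POSIXPERM,SANITY,!SANITIZE_LEAK" is satisfied by
"POSIXPERM SANITY" but not once SANITIZE_LEAK joins the satisfied set,
which is the effect the t0004-unwritable.sh annotation relies on.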

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 Makefile              | 5 +++++
 t/t0004-unwritable.sh | 2 +-
 t/test-lib.sh         | 1 +
 3 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/Makefile b/Makefile
index b90af71a7a2..b4ad91743b5 100644
--- a/Makefile
+++ b/Makefile
@@ -1224,6 +1224,9 @@ PTHREAD_CFLAGS =
 SPARSE_FLAGS ?=
 SP_EXTRA_FLAGS = -Wno-universal-initializer
 
+# For informing GIT-BUILD-OPTIONS of the SANITIZE=leak target
+SANITIZE_LEAK =
+
 # For the 'coccicheck' target; setting SPATCH_BATCH_SIZE higher will
 # usually result in less CPU usage at the cost of higher peak memory.
 # Setting it to 0 will feed all files in a single spatch invocation.
@@ -1268,6 +1271,7 @@ BASIC_CFLAGS += -DSHA1DC_FORCE_ALIGNED_ACCESS
 endif
 ifneq ($(filter leak,$(SANITIZERS)),)
 BASIC_CFLAGS += -DSUPPRESS_ANNOTATED_LEAKS
+SANITIZE_LEAK = YesCompiledWithIt
 endif
 ifneq ($(filter address,$(SANITIZERS)),)
 NO_REGEX = NeededForASAN
@@ -2815,6 +2819,7 @@ GIT-BUILD-OPTIONS: FORCE
 	@echo NO_UNIX_SOCKETS=\''$(subst ','\'',$(subst ','\'',$(NO_UNIX_SOCKETS)))'\' >>$@+
 	@echo PAGER_ENV=\''$(subst ','\'',$(subst ','\'',$(PAGER_ENV)))'\' >>$@+
 	@echo DC_SHA1=\''$(subst ','\'',$(subst ','\'',$(DC_SHA1)))'\' >>$@+
+	@echo SANITIZE_LEAK=\''$(subst ','\'',$(subst ','\'',$(SANITIZE_LEAK)))'\' >>$@+
 	@echo X=\'$(X)\' >>$@+
 ifdef TEST_OUTPUT_DIRECTORY
 	@echo TEST_OUTPUT_DIRECTORY=\''$(subst ','\'',$(subst ','\'',$(TEST_OUTPUT_DIRECTORY)))'\' >>$@+
diff --git a/t/t0004-unwritable.sh b/t/t0004-unwritable.sh
index e3137d638ee..fbdcb926b3a 100755
--- a/t/t0004-unwritable.sh
+++ b/t/t0004-unwritable.sh
@@ -21,7 +21,7 @@ test_expect_success POSIXPERM,SANITY 'write-tree should notice unwritable reposi
 	test_must_fail git write-tree
 '
 
-test_expect_success POSIXPERM,SANITY 'commit should notice unwritable repository' '
+test_expect_success POSIXPERM,SANITY,!SANITIZE_LEAK 'commit should notice unwritable repository' '
 	test_when_finished "chmod 775 .git/objects .git/objects/??" &&
 	chmod a-w .git/objects .git/objects/?? &&
 	test_must_fail git commit -m second
diff --git a/t/test-lib.sh b/t/test-lib.sh
index d5ee9642548..06831086060 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1536,6 +1536,7 @@ test -z "$NO_PYTHON" && test_set_prereq PYTHON
 test -n "$USE_LIBPCRE2" && test_set_prereq PCRE
 test -n "$USE_LIBPCRE2" && test_set_prereq LIBPCRE2
 test -z "$NO_GETTEXT" && test_set_prereq GETTEXT
+test -n "$SANITIZE_LEAK" && test_set_prereq SANITIZE_LEAK
 
 if test -z "$GIT_TEST_CHECK_CACHE_TREE"
 then
-- 
2.33.0.1056.gb2c8c79e36d



* [PATCH v6 2/2] tests: add a test mode for SANITIZE=leak, run it in CI
  2021-09-16 10:48           ` [PATCH v6 0/2] " Ævar Arnfjörð Bjarmason
  2021-09-16 10:48             ` [PATCH v6 1/2] Makefile: add SANITIZE=leak flag to GIT-BUILD-OPTIONS Ævar Arnfjörð Bjarmason
@ 2021-09-16 10:48             ` Ævar Arnfjörð Bjarmason
  2021-09-19  8:03             ` [PATCH v7 0/2] " Ævar Arnfjörð Bjarmason
  2 siblings, 0 replies; 125+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-09-16 10:48 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Andrzej Hunt, Lénaïc Huard,
	Derrick Stolee, Felipe Contreras, SZEDER Gábor,
	Đoàn Trần Công Danh,
	Carlo Marcelo Arenas Belón,
	Ævar Arnfjörð Bjarmason

While git can be compiled with SANITIZE=leak, we have not run
regression tests under that mode. Memory leaks have only been fixed as
one-offs without structured regression testing.

This change adds CI testing for it. We'll now build and test
t000[04]*.sh under Linux with a new job called "linux-leaks".

The CI target uses a new GIT_TEST_PASSING_SANITIZE_LEAK=true test
mode. When running in that mode, we'll assert that we were compiled
with SANITIZE=leak. We'll then skip all tests, except those that we've
opted-in by setting "TEST_PASSES_SANITIZE_LEAK=true".

A test that sets "TEST_PASSES_SANITIZE_LEAK=true" can in turn make use
of the "SANITIZE_LEAK" prerequisite, should it wish to selectively
skip tests even under "GIT_TEST_PASSING_SANITIZE_LEAK=true". In the
preceding commit we started doing this in "t0004-unwritable.sh" under
SANITIZE=leak; now it'll combine nicely with
"GIT_TEST_PASSING_SANITIZE_LEAK=true".

This is how tests that don't set "TEST_PASSES_SANITIZE_LEAK=true" will
be skipped under GIT_TEST_PASSING_SANITIZE_LEAK=true:

    $ GIT_TEST_PASSING_SANITIZE_LEAK=true ./t0001-init.sh
    1..0 # SKIP skip all tests in t0001 under SANITIZE=leak, TEST_PASSES_SANITIZE_LEAK not set

The intent is to add more TEST_PASSES_SANITIZE_LEAK=true annotations
as follow-up changes, but let's start small to begin with.

It would also be possible to implement a more lightweight version of
this by only relying on setting "LSAN_OPTIONS". See
<YS9OT/pn5rRK9cGB@coredump.intra.peff.net>[2] and
<YS9ZIDpANfsh7N+S@coredump.intra.peff.net>[4] for a discussion of
that. I've opted for this approach of adding a GIT_TEST_* mode instead
because it's consistent with how we handle other special test modes.

Being able to add a "!SANITIZE_LEAK" prerequisite and calling
"test_done" early if it isn't satisfied also means that we can more
incrementally add regression tests without being forced to fix
widespread and hard-to-fix leaks at the same time.

We have tests that do simple checking of some tool we're interested
in, but later on in the script might be stressing trace2, or common
sources of leaks like "git log" in combination with the tool (e.g. the
commit-graph tests). To be clear, having a prerequisite could also be
accomplished by using "LSAN_OPTIONS" directly.

On the topic of "LSAN_OPTIONS": It would be nice to have a mode to
aggregate all failures in our various scripts, see [2] for a start at
doing that which sets "log_path" in "LSAN_OPTIONS". I've punted on
that for now, it can be added later.

As of writing this we've got major regressions between master..seen,
i.e. the t000*.sh tests and more fixed since 31f9acf9ce2 (Merge branch
'ah/plugleaks', 2021-08-04) have regressed recently.

See the discussion at <87czsv2idy.fsf@evledraar.gmail.com>[3] about
the lack of this sort of test mode, and 0e5bba53af (add UNLEAK
annotation for reducing leak false positives, 2017-09-08) for the
initial addition of SANITIZE=leak.

See also 09595ab381 (Merge branch 'jk/leak-checkers', 2017-09-19),
7782066f67 (Merge branch 'jk/apache-lsan', 2019-05-19) and the recent
936e58851a (Merge branch 'ah/plugleaks', 2021-05-07) for some of the
past history of "one-off" SANITIZE=leak (and more) fixes.

As noted in [5], we can't support this on OSX until Clang 14 is
released; at that point we'll probably want to resurrect the
"osx-leaks" job.

1. https://github.com/google/sanitizers/wiki/AddressSanitizerLeakSanitizer
2. https://lore.kernel.org/git/YS9OT%2Fpn5rRK9cGB@coredump.intra.peff.net/
3. https://lore.kernel.org/git/87czsv2idy.fsf@evledraar.gmail.com/
4. https://lore.kernel.org/git/YS9ZIDpANfsh7N+S@coredump.intra.peff.net/
5. https://lore.kernel.org/git/20210916035603.76369-1-carenas@gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
---
 .github/workflows/main.yml |  3 +++
 ci/install-dependencies.sh |  2 +-
 ci/lib.sh                  |  9 ++++++++-
 ci/run-build-and-tests.sh  |  2 +-
 t/README                   |  7 +++++++
 t/t0000-basic.sh           |  1 +
 t/t0004-unwritable.sh      |  1 +
 t/test-lib.sh              | 20 ++++++++++++++++++++
 8 files changed, 42 insertions(+), 3 deletions(-)
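
Not part of the patch: a minimal stand-alone sketch of the decision
the test-lib.sh hunk below makes, for readers who want to poke at it
outside the test suite. The function name leak_mode_action is
hypothetical, and the sketch only recognizes the literal string
"true", whereas the real code goes through test_bool_env and accepts
any boolean spelling git-config does.

```shell
#!/bin/sh
# Hypothetical model of the test-lib.sh logic added by this patch.
# leak_mode_action is an illustrative name, not part of git's test suite.
# Arguments:
#   $1: SANITIZE_LEAK from GIT-BUILD-OPTIONS (non-empty if built with
#       SANITIZE=leak)
#   $2: GIT_TEST_PASSING_SANITIZE_LEAK (only the literal "true" counts here)
#   $3: TEST_PASSES_SANITIZE_LEAK (only the literal "true" counts here)
# Prints one of "run", "skip" or "error".
leak_mode_action () {
	if test -n "$1"
	then
		# Built with SANITIZE=leak: skip scripts that have not opted in,
		# but only when the test mode is switched on.
		if test "$2" = true && test "$3" != true
		then
			echo skip
		else
			echo run
		fi
	elif test "$2" = true
	then
		# The test mode was requested without a SANITIZE=leak build.
		echo error
	else
		echo run
	fi
}
```

For example, "leak_mode_action YesCompiledWithIt true ''" models a
script such as t0001-init.sh that has not opted in, while passing an
empty first argument together with "true" models the misconfiguration
that the real code reports via error.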

diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
index b053b01c66e..47281684782 100644
--- a/.github/workflows/main.yml
+++ b/.github/workflows/main.yml
@@ -232,6 +232,9 @@ jobs:
           - jobname: linux-gcc-default
             cc: gcc
             pool: ubuntu-latest
+          - jobname: linux-leaks
+            cc: gcc
+            pool: ubuntu-latest
     env:
       CC: ${{matrix.vector.cc}}
       jobname: ${{matrix.vector.jobname}}
diff --git a/ci/install-dependencies.sh b/ci/install-dependencies.sh
index 5772081b6e5..1d0e48f4515 100755
--- a/ci/install-dependencies.sh
+++ b/ci/install-dependencies.sh
@@ -12,7 +12,7 @@ UBUNTU_COMMON_PKGS="make libssl-dev libcurl4-openssl-dev libexpat-dev
  libemail-valid-perl libio-socket-ssl-perl libnet-smtp-ssl-perl"
 
 case "$jobname" in
-linux-clang|linux-gcc)
+linux-clang|linux-gcc|linux-leaks)
 	sudo apt-add-repository -y "ppa:ubuntu-toolchain-r/test"
 	sudo apt-get -q update
 	sudo apt-get -q -y install language-pack-is libsvn-perl apache2 \
diff --git a/ci/lib.sh b/ci/lib.sh
index 476c3f369f5..82cb17f8eea 100755
--- a/ci/lib.sh
+++ b/ci/lib.sh
@@ -183,7 +183,7 @@ export GIT_TEST_CLONE_2GB=true
 export SKIP_DASHED_BUILT_INS=YesPlease
 
 case "$jobname" in
-linux-clang|linux-gcc)
+linux-clang|linux-gcc|linux-leaks)
 	if [ "$jobname" = linux-gcc ]
 	then
 		export CC=gcc-8
@@ -233,4 +233,11 @@ linux-musl)
 	;;
 esac
 
+case "$jobname" in
+linux-leaks)
+	export SANITIZE=leak
+	export GIT_TEST_PASSING_SANITIZE_LEAK=true
+	;;
+esac
+
 MAKEFLAGS="$MAKEFLAGS CC=${CC:-cc}"
diff --git a/ci/run-build-and-tests.sh b/ci/run-build-and-tests.sh
index f3aba5d6cbb..ba29a93d84b 100755
--- a/ci/run-build-and-tests.sh
+++ b/ci/run-build-and-tests.sh
@@ -17,7 +17,7 @@ fi
 
 make
 case "$jobname" in
-linux-gcc)
+linux-gcc|linux-leaks)
 	export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 	make test
 	export GIT_TEST_SPLIT_INDEX=yes
diff --git a/t/README b/t/README
index e924bd81e2d..ab84278b7eb 100644
--- a/t/README
+++ b/t/README
@@ -366,6 +366,13 @@ excluded as so much relies on it, but this might change in the future.
 GIT_TEST_SPLIT_INDEX=<boolean> forces split-index mode on the whole
 test suite. Accept any boolean values that are accepted by git-config.
 
+GIT_TEST_PASSING_SANITIZE_LEAK=<boolean> when compiled with
+SANITIZE=leak will run only those tests that have whitelisted
+themselves as passing with no memory leaks. Tests can be whitelisted
+by setting "TEST_PASSES_SANITIZE_LEAK=true" before sourcing
+"test-lib.sh" itself at the top of the test script. This test mode is
+used by the "linux-leaks" CI target.
+
 GIT_TEST_PROTOCOL_VERSION=<n>, when set, makes 'protocol.version'
 default to n.
 
diff --git a/t/t0000-basic.sh b/t/t0000-basic.sh
index cb87768513c..54318af3861 100755
--- a/t/t0000-basic.sh
+++ b/t/t0000-basic.sh
@@ -18,6 +18,7 @@ swapping compression and hashing order, the person who is making the
 modification *should* take notice and update the test vectors here.
 '
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 try_local_xy () {
diff --git a/t/t0004-unwritable.sh b/t/t0004-unwritable.sh
index fbdcb926b3a..37d68ef03be 100755
--- a/t/t0004-unwritable.sh
+++ b/t/t0004-unwritable.sh
@@ -2,6 +2,7 @@
 
 test_description='detect unwritable repository and fail correctly'
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 test_expect_success setup '
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 06831086060..9310d9d900a 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1381,6 +1381,26 @@ then
 	test_done
 fi
 
+# skip non-whitelisted tests when compiled with SANITIZE=leak
+if test -n "$SANITIZE_LEAK"
+then
+	if test_bool_env GIT_TEST_PASSING_SANITIZE_LEAK false
+	then
+		# We need to see it in "git env--helper" (via
+		# test_bool_env)
+		export TEST_PASSES_SANITIZE_LEAK
+
+		if ! test_bool_env TEST_PASSES_SANITIZE_LEAK false
+		then
+			skip_all="skipping $this_test under GIT_TEST_PASSING_SANITIZE_LEAK=true"
+			test_done
+		fi
+	fi
+elif test_bool_env GIT_TEST_PASSING_SANITIZE_LEAK false
+then
+	error "GIT_TEST_PASSING_SANITIZE_LEAK=true has no effect except when compiled with SANITIZE=leak"
+fi
+
 # Last-minute variable setup
 USER_HOME="$HOME"
 HOME="$TRASH_DIRECTORY"
-- 
2.33.0.1056.gb2c8c79e36d



* [PATCH v7 0/2] add a test mode for SANITIZE=leak, run it in CI
  2021-09-16 10:48           ` [PATCH v6 0/2] " Ævar Arnfjörð Bjarmason
  2021-09-16 10:48             ` [PATCH v6 1/2] Makefile: add SANITIZE=leak flag to GIT-BUILD-OPTIONS Ævar Arnfjörð Bjarmason
  2021-09-16 10:48             ` [PATCH v6 2/2] tests: add a test mode for SANITIZE=leak, run it in CI Ævar Arnfjörð Bjarmason
@ 2021-09-19  8:03             ` Ævar Arnfjörð Bjarmason
  2021-09-19  8:03               ` [PATCH v7 1/2] Makefile: add SANITIZE=leak flag to GIT-BUILD-OPTIONS Ævar Arnfjörð Bjarmason
                                 ` (2 more replies)
  2 siblings, 3 replies; 125+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-09-19  8:03 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Andrzej Hunt, Lénaïc Huard,
	Derrick Stolee, Felipe Contreras, SZEDER Gábor,
	Đoàn Trần Công Danh,
	Carlo Marcelo Arenas Belón,
	Ævar Arnfjörð Bjarmason

This series adds a small beachhead of tests we run in CI that we
assert to be memory-leak free with the SANITIZE=leak test mode. Once
it lands the intent is to expand the parts of the test suite we
whitelist as memory-leak free.

This v7 of the "test with SANITIZE=leak in CI" topic should be ready
for merging down. The v6 got marked as "Will merge to 'next'?", but as
Carlo points out[1] there were concurrent regressions in
t0000-basic.sh that caused the tests to fail. There are proposed fixes
to those[2] as well as Carlo's own series to fix other issues with
it[3].

All of those are worth doing, but the reason I picked t0000-basic.sh
was that it would hopefully stay leak free through the
seen->next->master cycle.

Let's not pick that one, but instead a few of the very small and basic
tests in t00*.sh.

These all run cleanly on top of master, and also when merged with next
and seen (except for the semantic "seen" failure due to merging with
v6 of this topic, and therefore t0000-basic.sh being run in the test
mode).

For v6 of this topic see:
https://lore.kernel.org/git/cover-v6-0.2-00000000000-20210916T085311Z-avarab@gmail.com

1. https://lore.kernel.org/git/CAPUEsphMUNYRACmK-nksotP1RrMn09mNGFdEHLLuNEWH4AcU7Q@mail.gmail.com/
2. https://lore.kernel.org/git/pull.1092.git.git.1631972978.gitgitgadget@gmail.com/
3. https://lore.kernel.org/git/20210916023706.55760-1-carenas@gmail.com/

Ævar Arnfjörð Bjarmason (2):
  Makefile: add SANITIZE=leak flag to GIT-BUILD-OPTIONS
  tests: add a test mode for SANITIZE=leak, run it in CI

 .github/workflows/main.yml |  3 +++
 Makefile                   |  5 +++++
 ci/install-dependencies.sh |  2 +-
 ci/lib.sh                  |  9 ++++++++-
 ci/run-build-and-tests.sh  |  2 +-
 t/README                   |  7 +++++++
 t/t0004-unwritable.sh      |  3 ++-
 t/t0011-hashmap.sh         |  2 ++
 t/t0016-oidmap.sh          |  2 ++
 t/t0017-env-helper.sh      |  1 +
 t/t0018-advice.sh          |  1 +
 t/t0030-stripspace.sh      |  1 +
 t/t0063-string-list.sh     |  1 +
 t/t0091-bugreport.sh       |  1 +
 t/test-lib.sh              | 21 +++++++++++++++++++++
 15 files changed, 57 insertions(+), 4 deletions(-)

Range-diff against v6:
1:  fc7ba4cb1c3 = 1:  fc7ba4cb1c3 Makefile: add SANITIZE=leak flag to GIT-BUILD-OPTIONS
2:  8dcb1269881 ! 2:  56592952db5 tests: add a test mode for SANITIZE=leak, run it in CI
    @@ Commit message
         regression tests under that mode. Memory leaks have only been fixed as
         one-offs without structured regression testing.
     
    -    This change adds CI testing for it. We'll now build and test
    -    t000[04]*.sh under Linux with a new job called "linux-leaks".
    +    This change adds CI testing for it. We'll now build and test a small
    +    set of whitelisted t00*.sh tests under Linux with a new job called
    +    "linux-leaks".
     
         The CI target uses a new GIT_TEST_PASSING_SANITIZE_LEAK=true test
         mode. When running in that mode, we'll assert that we were compiled
    @@ t/README: excluded as so much relies on it, but this might change in the future.
      default to n.
      
     
    - ## t/t0000-basic.sh ##
    -@@ t/t0000-basic.sh: swapping compression and hashing order, the person who is making the
    - modification *should* take notice and update the test vectors here.
    - '
    + ## t/t0004-unwritable.sh ##
    +@@
    + 
    + test_description='detect unwritable repository and fail correctly'
      
     +TEST_PASSES_SANITIZE_LEAK=true
      . ./test-lib.sh
      
    - try_local_xy () {
    + test_expect_success setup '
     
    - ## t/t0004-unwritable.sh ##
    + ## t/t0011-hashmap.sh ##
     @@
    + #!/bin/sh
      
    - test_description='detect unwritable repository and fail correctly'
    + test_description='test hashmap and string hash functions'
    ++
    ++TEST_PASSES_SANITIZE_LEAK=true
    + . ./test-lib.sh
    + 
    + test_hashmap() {
    +
    + ## t/t0016-oidmap.sh ##
    +@@
    + #!/bin/sh
      
    + test_description='test oidmap'
    ++
     +TEST_PASSES_SANITIZE_LEAK=true
      . ./test-lib.sh
      
    - test_expect_success setup '
    + # This purposefully is very similar to t0011-hashmap.sh
    +
    + ## t/t0017-env-helper.sh ##
    +@@
    + 
    + test_description='test env--helper'
    + 
    ++TEST_PASSES_SANITIZE_LEAK=true
    + . ./test-lib.sh
    + 
    + 
    +
    + ## t/t0018-advice.sh ##
    +@@
    + 
    + test_description='Test advise_if_enabled functionality'
    + 
    ++TEST_PASSES_SANITIZE_LEAK=true
    + . ./test-lib.sh
    + 
    + test_expect_success 'advice should be printed when config variable is unset' '
    +
    + ## t/t0030-stripspace.sh ##
    +@@
    + 
    + test_description='git stripspace'
    + 
    ++TEST_PASSES_SANITIZE_LEAK=true
    + . ./test-lib.sh
    + 
    + t40='A quick brown fox jumps over the lazy do'
    +
    + ## t/t0063-string-list.sh ##
    +@@
    + 
    + test_description='Test string list functionality'
    + 
    ++TEST_PASSES_SANITIZE_LEAK=true
    + . ./test-lib.sh
    + 
    + test_split () {
    +
    + ## t/t0091-bugreport.sh ##
    +@@
    + 
    + test_description='git bugreport'
    + 
    ++TEST_PASSES_SANITIZE_LEAK=true
    + . ./test-lib.sh
    + 
    + # Headers "[System Info]" will be followed by a non-empty line if we put some
     
      ## t/test-lib.sh ##
     @@ t/test-lib.sh: then
-- 
2.33.0.1092.g44c994ea1be



* [PATCH v7 1/2] Makefile: add SANITIZE=leak flag to GIT-BUILD-OPTIONS
  2021-09-19  8:03             ` [PATCH v7 0/2] " Ævar Arnfjörð Bjarmason
@ 2021-09-19  8:03               ` Ævar Arnfjörð Bjarmason
  2021-09-19  8:03               ` [PATCH v7 2/2] tests: add a test mode for SANITIZE=leak, run it in CI Ævar Arnfjörð Bjarmason
  2021-09-23  9:20               ` [PATCH v8 0/2] " Ævar Arnfjörð Bjarmason
  2 siblings, 0 replies; 125+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-09-19  8:03 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Andrzej Hunt, Lénaïc Huard,
	Derrick Stolee, Felipe Contreras, SZEDER Gábor,
	Đoàn Trần Công Danh,
	Carlo Marcelo Arenas Belón,
	Ævar Arnfjörð Bjarmason

When SANITIZE=leak is specified we'll now add a SANITIZE_LEAK flag to
GIT-BUILD-OPTIONS. This can then be picked up by test-lib.sh, which
sets a SANITIZE_LEAK prerequisite.

We can then skip specific tests that are known to fail under
SANITIZE=leak. Add one such annotation to t0004-unwritable.sh, which
now passes under SANITIZE=leak.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 Makefile              | 5 +++++
 t/t0004-unwritable.sh | 2 +-
 t/test-lib.sh         | 1 +
 3 files changed, 7 insertions(+), 1 deletion(-)
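
Not part of the patch: a rough sketch of how the flag travels from
the build to the test prerequisite. A temporary file stands in for
the real GIT-BUILD-OPTIONS, and set_prereq_demo is a hypothetical
stand-in for test_set_prereq; only the SANITIZE_LEAK variable name
and its "YesCompiledWithIt" value come from the diff below.

```shell
#!/bin/sh
# Illustrative only: a throw-away file plays the role of
# GIT-BUILD-OPTIONS, and set_prereq_demo plays the role of
# test_set_prereq.
build_options=$(mktemp)

# With SANITIZE=leak in effect, the Makefile rule in this patch would
# emit a shell assignment like this one.
echo "SANITIZE_LEAK='YesCompiledWithIt'" >"$build_options"

prereqs=
set_prereq_demo () {
	prereqs="$prereqs $1"
}

# The test library sources the build options, then declares the
# prerequisite if the variable is non-empty.
. "$build_options"
test -n "$SANITIZE_LEAK" && set_prereq_demo SANITIZE_LEAK
rm -f "$build_options"

echo "prereqs:$prereqs"
```

Without SANITIZE=leak the Makefile leaves SANITIZE_LEAK empty, so the
"test -n" guard fails and the prerequisite is never declared.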

diff --git a/Makefile b/Makefile
index b90af71a7a2..b4ad91743b5 100644
--- a/Makefile
+++ b/Makefile
@@ -1224,6 +1224,9 @@ PTHREAD_CFLAGS =
 SPARSE_FLAGS ?=
 SP_EXTRA_FLAGS = -Wno-universal-initializer
 
+# For informing GIT-BUILD-OPTIONS of the SANITIZE=leak target
+SANITIZE_LEAK =
+
 # For the 'coccicheck' target; setting SPATCH_BATCH_SIZE higher will
 # usually result in less CPU usage at the cost of higher peak memory.
 # Setting it to 0 will feed all files in a single spatch invocation.
@@ -1268,6 +1271,7 @@ BASIC_CFLAGS += -DSHA1DC_FORCE_ALIGNED_ACCESS
 endif
 ifneq ($(filter leak,$(SANITIZERS)),)
 BASIC_CFLAGS += -DSUPPRESS_ANNOTATED_LEAKS
+SANITIZE_LEAK = YesCompiledWithIt
 endif
 ifneq ($(filter address,$(SANITIZERS)),)
 NO_REGEX = NeededForASAN
@@ -2815,6 +2819,7 @@ GIT-BUILD-OPTIONS: FORCE
 	@echo NO_UNIX_SOCKETS=\''$(subst ','\'',$(subst ','\'',$(NO_UNIX_SOCKETS)))'\' >>$@+
 	@echo PAGER_ENV=\''$(subst ','\'',$(subst ','\'',$(PAGER_ENV)))'\' >>$@+
 	@echo DC_SHA1=\''$(subst ','\'',$(subst ','\'',$(DC_SHA1)))'\' >>$@+
+	@echo SANITIZE_LEAK=\''$(subst ','\'',$(subst ','\'',$(SANITIZE_LEAK)))'\' >>$@+
 	@echo X=\'$(X)\' >>$@+
 ifdef TEST_OUTPUT_DIRECTORY
 	@echo TEST_OUTPUT_DIRECTORY=\''$(subst ','\'',$(subst ','\'',$(TEST_OUTPUT_DIRECTORY)))'\' >>$@+
diff --git a/t/t0004-unwritable.sh b/t/t0004-unwritable.sh
index e3137d638ee..fbdcb926b3a 100755
--- a/t/t0004-unwritable.sh
+++ b/t/t0004-unwritable.sh
@@ -21,7 +21,7 @@ test_expect_success POSIXPERM,SANITY 'write-tree should notice unwritable reposi
 	test_must_fail git write-tree
 '
 
-test_expect_success POSIXPERM,SANITY 'commit should notice unwritable repository' '
+test_expect_success POSIXPERM,SANITY,!SANITIZE_LEAK 'commit should notice unwritable repository' '
 	test_when_finished "chmod 775 .git/objects .git/objects/??" &&
 	chmod a-w .git/objects .git/objects/?? &&
 	test_must_fail git commit -m second
diff --git a/t/test-lib.sh b/t/test-lib.sh
index d5ee9642548..06831086060 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1536,6 +1536,7 @@ test -z "$NO_PYTHON" && test_set_prereq PYTHON
 test -n "$USE_LIBPCRE2" && test_set_prereq PCRE
 test -n "$USE_LIBPCRE2" && test_set_prereq LIBPCRE2
 test -z "$NO_GETTEXT" && test_set_prereq GETTEXT
+test -n "$SANITIZE_LEAK" && test_set_prereq SANITIZE_LEAK
 
 if test -z "$GIT_TEST_CHECK_CACHE_TREE"
 then
-- 
2.33.0.1092.g44c994ea1be



* [PATCH v7 2/2] tests: add a test mode for SANITIZE=leak, run it in CI
  2021-09-19  8:03             ` [PATCH v7 0/2] " Ævar Arnfjörð Bjarmason
  2021-09-19  8:03               ` [PATCH v7 1/2] Makefile: add SANITIZE=leak flag to GIT-BUILD-OPTIONS Ævar Arnfjörð Bjarmason
@ 2021-09-19  8:03               ` Ævar Arnfjörð Bjarmason
  2021-09-22 11:17                 ` [PATCH] fixup! " Carlo Marcelo Arenas Belón
  2021-09-23  9:20               ` [PATCH v8 0/2] " Ævar Arnfjörð Bjarmason
  2 siblings, 1 reply; 125+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-09-19  8:03 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Andrzej Hunt, Lénaïc Huard,
	Derrick Stolee, Felipe Contreras, SZEDER Gábor,
	Đoàn Trần Công Danh,
	Carlo Marcelo Arenas Belón,
	Ævar Arnfjörð Bjarmason

While git can be compiled with SANITIZE=leak, we have not run
regression tests under that mode. Memory leaks have only been fixed as
one-offs without structured regression testing.

This change adds CI testing for it. We'll now build and test a small
set of whitelisted t00*.sh tests under Linux with a new job called
"linux-leaks".

The CI target uses a new GIT_TEST_PASSING_SANITIZE_LEAK=true test
mode. When running in that mode, we'll assert that we were compiled
with SANITIZE=leak. We'll then skip all tests, except those that we've
opted-in by setting "TEST_PASSES_SANITIZE_LEAK=true".

A test script that sets "TEST_PASSES_SANITIZE_LEAK=true" can in turn
make use of the "SANITIZE_LEAK" prerequisite, should it need to
selectively skip tests even under
"GIT_TEST_PASSING_SANITIZE_LEAK=true". In the preceding commit we
started doing this in "t0004-unwritable.sh" under SANITIZE=leak, now
it'll combine nicely with "GIT_TEST_PASSING_SANITIZE_LEAK=true".

This is how tests that don't set "TEST_PASSES_SANITIZE_LEAK=true" will
be skipped under GIT_TEST_PASSING_SANITIZE_LEAK=true:

    $ GIT_TEST_PASSING_SANITIZE_LEAK=true ./t0001-init.sh
    1..0 # SKIP skip all tests in t0001 under SANITIZE=leak, TEST_PASSES_SANITIZE_LEAK not set

The intent is to add more TEST_PASSES_SANITIZE_LEAK=true annotations
as follow-up changes, but let's start small to begin with.

It would also be possible to implement a more lightweight version of
this by only relying on setting "LSAN_OPTIONS". See
<YS9OT/pn5rRK9cGB@coredump.intra.peff.net>[2] and
<YS9ZIDpANfsh7N+S@coredump.intra.peff.net>[4] for a discussion of
that. I've opted for this approach of adding a GIT_TEST_* mode instead
because it's consistent with how we handle other special test modes.

Being able to add a "!SANITIZE_LEAK" prerequisite and calling
"test_done" early if it isn't satisfied also means that we can more
incrementally add regression tests without being forced to fix
widespread and hard-to-fix leaks at the same time.

We have tests that do simple checking of some tool we're interested
in, but later on in the script might be stressing trace2, or common
sources of leaks like "git log" in combination with the tool (e.g. the
commit-graph tests). To be clear, having a prerequisite could also be
accomplished by using "LSAN_OPTIONS" directly.

On the topic of "LSAN_OPTIONS": It would be nice to have a mode to
aggregate all failures in our various scripts, see [2] for a start at
doing that which sets "log_path" in "LSAN_OPTIONS". I've punted on
that for now, it can be added later.

As of writing this we've got major regressions between master..seen,
i.e. the t000*.sh tests and more fixed since 31f9acf9ce2 (Merge branch
'ah/plugleaks', 2021-08-04) have regressed recently.

See the discussion at <87czsv2idy.fsf@evledraar.gmail.com>[3] about
the lack of this sort of test mode, and 0e5bba53af (add UNLEAK
annotation for reducing leak false positives, 2017-09-08) for the
initial addition of SANITIZE=leak.

See also 09595ab381 (Merge branch 'jk/leak-checkers', 2017-09-19),
7782066f67 (Merge branch 'jk/apache-lsan', 2019-05-19) and the recent
936e58851a (Merge branch 'ah/plugleaks', 2021-05-07) for some of the
past history of "one-off" SANITIZE=leak (and more) fixes.

As noted in [5], we can't support this on OSX until Clang 14 is
released; at that point we'll probably want to resurrect the
"osx-leaks" job.

1. https://github.com/google/sanitizers/wiki/AddressSanitizerLeakSanitizer
2. https://lore.kernel.org/git/YS9OT%2Fpn5rRK9cGB@coredump.intra.peff.net/
3. https://lore.kernel.org/git/87czsv2idy.fsf@evledraar.gmail.com/
4. https://lore.kernel.org/git/YS9ZIDpANfsh7N+S@coredump.intra.peff.net/
5. https://lore.kernel.org/git/20210916035603.76369-1-carenas@gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
---
 .github/workflows/main.yml |  3 +++
 ci/install-dependencies.sh |  2 +-
 ci/lib.sh                  |  9 ++++++++-
 ci/run-build-and-tests.sh  |  2 +-
 t/README                   |  7 +++++++
 t/t0004-unwritable.sh      |  1 +
 t/t0011-hashmap.sh         |  2 ++
 t/t0016-oidmap.sh          |  2 ++
 t/t0017-env-helper.sh      |  1 +
 t/t0018-advice.sh          |  1 +
 t/t0030-stripspace.sh      |  1 +
 t/t0063-string-list.sh     |  1 +
 t/t0091-bugreport.sh       |  1 +
 t/test-lib.sh              | 20 ++++++++++++++++++++
 14 files changed, 50 insertions(+), 3 deletions(-)
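
Not part of the patch: a simplified sketch of how a prerequisite list
such as "POSIXPERM,SANITY,!SANITIZE_LEAK" (used in t0004-unwritable.sh
by the preceding commit) could be evaluated. The names
have_prereq_demo and satisfied_prereqs are hypothetical, and this is
only an approximation of the idea, not the test suite's actual
implementation.

```shell
#!/bin/sh
# Hypothetical simplification of prerequisite matching;
# have_prereq_demo and satisfied_prereqs are illustrative names.
satisfied_prereqs=" POSIXPERM SANITY SANITIZE_LEAK "

# Succeeds only if every comma-separated prerequisite in $1 holds.
# A leading "!" negates a prerequisite.
have_prereq_demo () {
	save_IFS=$IFS
	IFS=,
	set -- $1
	IFS=$save_IFS
	for prereq
	do
		case "$prereq" in
		!*)
			negated=t
			prereq=${prereq#!}
			;;
		*)
			negated=
			;;
		esac
		case "$satisfied_prereqs" in
		*" $prereq "*)
			# Prerequisite is set: fail if it was negated.
			test -z "$negated" || return 1
			;;
		*)
			# Prerequisite is unset: fail unless it was negated.
			test -n "$negated" || return 1
			;;
		esac
	done
	return 0
}
```

With SANITIZE_LEAK in the satisfied list, the negated form fails, so
a test guarded by "!SANITIZE_LEAK" would be skipped; drop it from the
list and the same guard succeeds.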

diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
index b053b01c66e..47281684782 100644
--- a/.github/workflows/main.yml
+++ b/.github/workflows/main.yml
@@ -232,6 +232,9 @@ jobs:
           - jobname: linux-gcc-default
             cc: gcc
             pool: ubuntu-latest
+          - jobname: linux-leaks
+            cc: gcc
+            pool: ubuntu-latest
     env:
       CC: ${{matrix.vector.cc}}
       jobname: ${{matrix.vector.jobname}}
diff --git a/ci/install-dependencies.sh b/ci/install-dependencies.sh
index 5772081b6e5..1d0e48f4515 100755
--- a/ci/install-dependencies.sh
+++ b/ci/install-dependencies.sh
@@ -12,7 +12,7 @@ UBUNTU_COMMON_PKGS="make libssl-dev libcurl4-openssl-dev libexpat-dev
  libemail-valid-perl libio-socket-ssl-perl libnet-smtp-ssl-perl"
 
 case "$jobname" in
-linux-clang|linux-gcc)
+linux-clang|linux-gcc|linux-leaks)
 	sudo apt-add-repository -y "ppa:ubuntu-toolchain-r/test"
 	sudo apt-get -q update
 	sudo apt-get -q -y install language-pack-is libsvn-perl apache2 \
diff --git a/ci/lib.sh b/ci/lib.sh
index 476c3f369f5..82cb17f8eea 100755
--- a/ci/lib.sh
+++ b/ci/lib.sh
@@ -183,7 +183,7 @@ export GIT_TEST_CLONE_2GB=true
 export SKIP_DASHED_BUILT_INS=YesPlease
 
 case "$jobname" in
-linux-clang|linux-gcc)
+linux-clang|linux-gcc|linux-leaks)
 	if [ "$jobname" = linux-gcc ]
 	then
 		export CC=gcc-8
@@ -233,4 +233,11 @@ linux-musl)
 	;;
 esac
 
+case "$jobname" in
+linux-leaks)
+	export SANITIZE=leak
+	export GIT_TEST_PASSING_SANITIZE_LEAK=true
+	;;
+esac
+
 MAKEFLAGS="$MAKEFLAGS CC=${CC:-cc}"
diff --git a/ci/run-build-and-tests.sh b/ci/run-build-and-tests.sh
index f3aba5d6cbb..ba29a93d84b 100755
--- a/ci/run-build-and-tests.sh
+++ b/ci/run-build-and-tests.sh
@@ -17,7 +17,7 @@ fi
 
 make
 case "$jobname" in
-linux-gcc)
+linux-gcc|linux-leaks)
 	export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 	make test
 	export GIT_TEST_SPLIT_INDEX=yes
diff --git a/t/README b/t/README
index e924bd81e2d..ab84278b7eb 100644
--- a/t/README
+++ b/t/README
@@ -366,6 +366,13 @@ excluded as so much relies on it, but this might change in the future.
 GIT_TEST_SPLIT_INDEX=<boolean> forces split-index mode on the whole
 test suite. Accept any boolean values that are accepted by git-config.
 
+GIT_TEST_PASSING_SANITIZE_LEAK=<boolean> when compiled with
+SANITIZE=leak will run only those tests that have whitelisted
+themselves as passing with no memory leaks. Tests can be whitelisted
+by setting "TEST_PASSES_SANITIZE_LEAK=true" before sourcing
+"test-lib.sh" itself at the top of the test script. This test mode is
+used by the "linux-leaks" CI target.
+
 GIT_TEST_PROTOCOL_VERSION=<n>, when set, makes 'protocol.version'
 default to n.
 
diff --git a/t/t0004-unwritable.sh b/t/t0004-unwritable.sh
index fbdcb926b3a..37d68ef03be 100755
--- a/t/t0004-unwritable.sh
+++ b/t/t0004-unwritable.sh
@@ -2,6 +2,7 @@
 
 test_description='detect unwritable repository and fail correctly'
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 test_expect_success setup '
diff --git a/t/t0011-hashmap.sh b/t/t0011-hashmap.sh
index 5343ffd3f92..e094975b13b 100755
--- a/t/t0011-hashmap.sh
+++ b/t/t0011-hashmap.sh
@@ -1,6 +1,8 @@
 #!/bin/sh
 
 test_description='test hashmap and string hash functions'
+
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 test_hashmap() {
diff --git a/t/t0016-oidmap.sh b/t/t0016-oidmap.sh
index 31f8276ba82..0faef1f4f11 100755
--- a/t/t0016-oidmap.sh
+++ b/t/t0016-oidmap.sh
@@ -1,6 +1,8 @@
 #!/bin/sh
 
 test_description='test oidmap'
+
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 # This purposefully is very similar to t0011-hashmap.sh
diff --git a/t/t0017-env-helper.sh b/t/t0017-env-helper.sh
index 4a159f99e44..2e42fba9567 100755
--- a/t/t0017-env-helper.sh
+++ b/t/t0017-env-helper.sh
@@ -2,6 +2,7 @@
 
 test_description='test env--helper'
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 
diff --git a/t/t0018-advice.sh b/t/t0018-advice.sh
index 39e5e4b34f8..c13057a4ca3 100755
--- a/t/t0018-advice.sh
+++ b/t/t0018-advice.sh
@@ -2,6 +2,7 @@
 
 test_description='Test advise_if_enabled functionality'
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 test_expect_success 'advice should be printed when config variable is unset' '
diff --git a/t/t0030-stripspace.sh b/t/t0030-stripspace.sh
index 0c24a0f9a37..ae1ca380c1a 100755
--- a/t/t0030-stripspace.sh
+++ b/t/t0030-stripspace.sh
@@ -5,6 +5,7 @@
 
 test_description='git stripspace'
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 t40='A quick brown fox jumps over the lazy do'
diff --git a/t/t0063-string-list.sh b/t/t0063-string-list.sh
index c6ee9f66b11..46d4839194b 100755
--- a/t/t0063-string-list.sh
+++ b/t/t0063-string-list.sh
@@ -5,6 +5,7 @@
 
 test_description='Test string list functionality'
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 test_split () {
diff --git a/t/t0091-bugreport.sh b/t/t0091-bugreport.sh
index 526304ff95b..eeedbfa9193 100755
--- a/t/t0091-bugreport.sh
+++ b/t/t0091-bugreport.sh
@@ -2,6 +2,7 @@
 
 test_description='git bugreport'
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 # Headers "[System Info]" will be followed by a non-empty line if we put some
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 06831086060..9310d9d900a 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1381,6 +1381,26 @@ then
 	test_done
 fi
 
+# skip non-whitelisted tests when compiled with SANITIZE=leak
+if test -n "$SANITIZE_LEAK"
+then
+	if test_bool_env GIT_TEST_PASSING_SANITIZE_LEAK false
+	then
+		# We need to see it in "git env--helper" (via
+		# test_bool_env)
+		export TEST_PASSES_SANITIZE_LEAK
+
+		if ! test_bool_env TEST_PASSES_SANITIZE_LEAK false
+		then
+			skip_all="skipping $this_test under GIT_TEST_PASSING_SANITIZE_LEAK=true"
+			test_done
+		fi
+	fi
+elif test_bool_env GIT_TEST_PASSING_SANITIZE_LEAK false
+then
+	error "GIT_TEST_PASSING_SANITIZE_LEAK=true has no effect except when compiled with SANITIZE=leak"
+fi
+
 # Last-minute variable setup
 USER_HOME="$HOME"
 HOME="$TRASH_DIRECTORY"
-- 
2.33.0.1092.g44c994ea1be



* [PATCH] fixup! tests: add a test mode for SANITIZE=leak, run it in CI
  2021-09-19  8:03               ` [PATCH v7 2/2] tests: add a test mode for SANITIZE=leak, run it in CI Ævar Arnfjörð Bjarmason
@ 2021-09-22 11:17                 ` Carlo Marcelo Arenas Belón
  2021-09-23  1:50                   ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 125+ messages in thread
From: Carlo Marcelo Arenas Belón @ 2021-09-22 11:17 UTC (permalink / raw)
  To: git; +Cc: avarab, Carlo Marcelo Arenas Belón

Runs cleanly in seen, as shown by:

  https://github.com/carenas/git/runs/3673976105

It was previously failing in the extended checks, as shown at least by:

  https://github.com/git/git/runs/3657308323

Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
---
 t/t0016-oidmap.sh | 1 -
 1 file changed, 1 deletion(-)

diff --git a/t/t0016-oidmap.sh b/t/t0016-oidmap.sh
index 0faef1f4f1..f81aa9ea03 100755
--- a/t/t0016-oidmap.sh
+++ b/t/t0016-oidmap.sh
@@ -2,7 +2,6 @@
 
 test_description='test oidmap'
 
-TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 # This purposefully is very similar to t0011-hashmap.sh
-- 
2.33.0.911.gbe391d4e11



* Re: [PATCH] fixup! tests: add a test mode for SANITIZE=leak, run it in CI
  2021-09-22 11:17                 ` [PATCH] fixup! " Carlo Marcelo Arenas Belón
@ 2021-09-23  1:50                   ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 125+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-09-23  1:50 UTC (permalink / raw)
  To: Carlo Marcelo Arenas Belón; +Cc: git, SZEDER Gábor


On Wed, Sep 22 2021, Carlo Marcelo Arenas Belón wrote:

> runs cleanly in seen as shown by :
>
>   https://github.com/carenas/git/runs/3673976105
>
> previously failing in the extended checks as shown at at least by :
>
>   https://github.com/git/git/runs/3657308323

Thanks. It broke in combination with sg/test-split-index-fix: running
the test with GIT_TEST_SPLIT_INDEX=true reveals a memory leak that we
weren't testing until then.

Junio: I think just applying this fixup is the right thing for now. Are
you willing to do that, or should I submit a re-roll with it?

> Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
> ---
>  t/t0016-oidmap.sh | 1 -
>  1 file changed, 1 deletion(-)
>
> diff --git a/t/t0016-oidmap.sh b/t/t0016-oidmap.sh
> index 0faef1f4f1..f81aa9ea03 100755
> --- a/t/t0016-oidmap.sh
> +++ b/t/t0016-oidmap.sh
> @@ -2,7 +2,6 @@
>  
>  test_description='test oidmap'
>  
> -TEST_PASSES_SANITIZE_LEAK=true
>  . ./test-lib.sh
>  
>  # This purposefully is very similar to t0011-hashmap.sh


^ permalink raw reply	[flat|nested] 125+ messages in thread

* [PATCH v8 0/2] add a test mode for SANITIZE=leak, run it in CI
  2021-09-19  8:03             ` [PATCH v7 0/2] " Ævar Arnfjörð Bjarmason
  2021-09-19  8:03               ` [PATCH v7 1/2] Makefile: add SANITIZE=leak flag to GIT-BUILD-OPTIONS Ævar Arnfjörð Bjarmason
  2021-09-19  8:03               ` [PATCH v7 2/2] tests: add a test mode for SANITIZE=leak, run it in CI Ævar Arnfjörð Bjarmason
@ 2021-09-23  9:20               ` Ævar Arnfjörð Bjarmason
  2021-09-23  9:20                 ` [PATCH v8 1/2] Makefile: add SANITIZE=leak flag to GIT-BUILD-OPTIONS Ævar Arnfjörð Bjarmason
  2021-09-23  9:20                 ` [PATCH v8 2/2] tests: add a test mode for SANITIZE=leak, run it in CI Ævar Arnfjörð Bjarmason
  2 siblings, 2 replies; 125+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-09-23  9:20 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Andrzej Hunt, Lénaïc Huard,
	Derrick Stolee, Felipe Contreras, SZEDER Gábor,
	Đoàn Trần Công Danh,
	Carlo Marcelo Arenas Belón,
	Ævar Arnfjörð Bjarmason

This series adds a small beachhead of tests we run in CI that we
assert to be memory-leak free with the SANITIZE=leak test mode. Once
it lands the intent is to expand the parts of the test suite we
whitelist as memory-leak free.

For the v7 see:
https://lore.kernel.org/git/cover-v7-0.2-00000000000-20210919T075619Z-avarab@gmail.com/

This v8 fixes a test failure that happened in combination with the
sg/test-split-index-fix topic, which just unearthed an old
GIT_TEST_SPLIT_INDEX=true memory leak.

Carlo Marcelo Arenas Belón had a fixup for it (that's currently
applied to the v7) here:
https://lore.kernel.org/git/20210922111741.82142-1-carenas@gmail.com/

I acked it in
https://lore.kernel.org/git/87h7ec59m7.fsf@evledraar.gmail.com/; but
on second thought I think this is a better solution for the reasons
noted in the updated commit message.

Ævar Arnfjörð Bjarmason (2):
  Makefile: add SANITIZE=leak flag to GIT-BUILD-OPTIONS
  tests: add a test mode for SANITIZE=leak, run it in CI

 .github/workflows/main.yml |  3 +++
 Makefile                   |  5 +++++
 ci/install-dependencies.sh |  2 +-
 ci/lib.sh                  |  9 ++++++++-
 t/README                   |  7 +++++++
 t/t0004-unwritable.sh      |  3 ++-
 t/t0011-hashmap.sh         |  2 ++
 t/t0016-oidmap.sh          |  2 ++
 t/t0017-env-helper.sh      |  1 +
 t/t0018-advice.sh          |  1 +
 t/t0030-stripspace.sh      |  1 +
 t/t0063-string-list.sh     |  1 +
 t/t0091-bugreport.sh       |  1 +
 t/test-lib.sh              | 21 +++++++++++++++++++++
 14 files changed, 56 insertions(+), 3 deletions(-)

Range-diff against v7:
1:  fc7ba4cb1c3 = 1:  c68a7108dc4 Makefile: add SANITIZE=leak flag to GIT-BUILD-OPTIONS
2:  56592952db5 ! 2:  90ecd49c910 tests: add a test mode for SANITIZE=leak, run it in CI
    @@ Commit message
         The intent is to add more TEST_PASSES_SANITIZE_LEAK=true annotations
         as follow-up change, but let's start small to begin with.
     
    +    In ci/run-build-and-tests.sh we make use of the default "*" case to
    +    run "make test" without any GIT_TEST_* modes. SANITIZE=leak is known
    +    to fail in combination with GIT_TEST_SPLIT_INDEX=true in
    +    t0016-oidmap.sh, and we're likely to have other such failures in
    +    various GIT_TEST_* modes. Let's focus on getting the base tests
    +    passing, we can expand coverage to GIT_TEST_* modes later.
    +
         It would also be possible to implement a more lightweight version of
         this by only relying on setting "LSAN_OPTIONS". See
         <YS9OT/pn5rRK9cGB@coredump.intra.peff.net>[1] and
    @@ ci/lib.sh: linux-musl)
     +
      MAKEFLAGS="$MAKEFLAGS CC=${CC:-cc}"
     
    - ## ci/run-build-and-tests.sh ##
    -@@ ci/run-build-and-tests.sh: fi
    - 
    - make
    - case "$jobname" in
    --linux-gcc)
    -+linux-gcc|linux-leaks)
    - 	export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
    - 	make test
    - 	export GIT_TEST_SPLIT_INDEX=yes
    -
      ## t/README ##
     @@ t/README: excluded as so much relies on it, but this might change in the future.
      GIT_TEST_SPLIT_INDEX=<boolean> forces split-index mode on the whole
-- 
2.33.0.1228.gdc65525c655


^ permalink raw reply	[flat|nested] 125+ messages in thread

* [PATCH v8 1/2] Makefile: add SANITIZE=leak flag to GIT-BUILD-OPTIONS
  2021-09-23  9:20               ` [PATCH v8 0/2] " Ævar Arnfjörð Bjarmason
@ 2021-09-23  9:20                 ` Ævar Arnfjörð Bjarmason
  2021-09-23  9:20                 ` [PATCH v8 2/2] tests: add a test mode for SANITIZE=leak, run it in CI Ævar Arnfjörð Bjarmason
  1 sibling, 0 replies; 125+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-09-23  9:20 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Andrzej Hunt, Lénaïc Huard,
	Derrick Stolee, Felipe Contreras, SZEDER Gábor,
	Đoàn Trần Công Danh,
	Carlo Marcelo Arenas Belón,
	Ævar Arnfjörð Bjarmason

When SANITIZE=leak is specified we'll now add a SANITIZE_LEAK flag to
GIT-BUILD-OPTIONS. This can then be picked up by test-lib.sh, which
sets a SANITIZE_LEAK prerequisite.

We can then skip specific tests that are known to fail under
SANITIZE=leak. Add one such annotation to t0004-unwritable.sh, which
now passes under SANITIZE=leak.
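
As a rough illustration of how a negated prerequisite such as
"!SANITIZE_LEAK" ends up skipping a test, here is a minimal sketch. It
is a simplified stand-in rather than test-lib.sh's actual
implementation: the helper names "set_prereq" and "have_prereqs" and the
"satisfied_prereqs" variable are hypothetical, chosen only for this
example.

```shell
#!/bin/sh
# Simplified stand-in for prerequisite handling. The names below are
# hypothetical and are not the functions test-lib.sh really defines.
satisfied_prereqs=" "

set_prereq () {
	satisfied_prereqs="$satisfied_prereqs$1 "
}

# Succeeds only if every entry of the comma-separated list in $1 holds.
# A leading "!" negates a single entry.
have_prereqs () {
	list=$1,
	while test -n "$list"
	do
		prereq=${list%%,*}
		list=${list#*,}
		case "$prereq" in
		\!*)
			# Negated: fail if the prerequisite is set.
			case "$satisfied_prereqs" in
			*" ${prereq#?} "*) return 1 ;;
			esac
			;;
		*)
			# Plain: fail if the prerequisite is missing.
			case "$satisfied_prereqs" in
			*" $prereq "*) ;;
			*) return 1 ;;
			esac
			;;
		esac
	done
	return 0
}

# Mirror the line this patch adds to test-lib.sh, using the value the
# Makefile writes to GIT-BUILD-OPTIONS for a SANITIZE=leak build.
SANITIZE_LEAK=YesCompiledWithIt
test -n "$SANITIZE_LEAK" && set_prereq SANITIZE_LEAK
set_prereq POSIXPERM
set_prereq SANITY

if have_prereqs 'POSIXPERM,SANITY,!SANITIZE_LEAK'
then
	echo "run the test"
else
	echo "skip the test"
fi
```

With SANITIZE_LEAK set as above this prints "skip the test". In a build
without SANITIZE=leak the variable is empty, the prerequisite is never
set, and the same list is satisfied.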

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 Makefile              | 5 +++++
 t/t0004-unwritable.sh | 2 +-
 t/test-lib.sh         | 1 +
 3 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/Makefile b/Makefile
index 9df565f27bb..d7390e6b2b5 100644
--- a/Makefile
+++ b/Makefile
@@ -1214,6 +1214,9 @@ PTHREAD_CFLAGS =
 SPARSE_FLAGS ?=
 SP_EXTRA_FLAGS = -Wno-universal-initializer
 
+# For informing GIT-BUILD-OPTIONS of the SANITIZE=leak target
+SANITIZE_LEAK =
+
 # For the 'coccicheck' target; setting SPATCH_BATCH_SIZE higher will
 # usually result in less CPU usage at the cost of higher peak memory.
 # Setting it to 0 will feed all files in a single spatch invocation.
@@ -1258,6 +1261,7 @@ BASIC_CFLAGS += -DSHA1DC_FORCE_ALIGNED_ACCESS
 endif
 ifneq ($(filter leak,$(SANITIZERS)),)
 BASIC_CFLAGS += -DSUPPRESS_ANNOTATED_LEAKS
+SANITIZE_LEAK = YesCompiledWithIt
 endif
 ifneq ($(filter address,$(SANITIZERS)),)
 NO_REGEX = NeededForASAN
@@ -2803,6 +2807,7 @@ GIT-BUILD-OPTIONS: FORCE
 	@echo NO_UNIX_SOCKETS=\''$(subst ','\'',$(subst ','\'',$(NO_UNIX_SOCKETS)))'\' >>$@+
 	@echo PAGER_ENV=\''$(subst ','\'',$(subst ','\'',$(PAGER_ENV)))'\' >>$@+
 	@echo DC_SHA1=\''$(subst ','\'',$(subst ','\'',$(DC_SHA1)))'\' >>$@+
+	@echo SANITIZE_LEAK=\''$(subst ','\'',$(subst ','\'',$(SANITIZE_LEAK)))'\' >>$@+
 	@echo X=\'$(X)\' >>$@+
 ifdef TEST_OUTPUT_DIRECTORY
 	@echo TEST_OUTPUT_DIRECTORY=\''$(subst ','\'',$(subst ','\'',$(TEST_OUTPUT_DIRECTORY)))'\' >>$@+
diff --git a/t/t0004-unwritable.sh b/t/t0004-unwritable.sh
index e3137d638ee..fbdcb926b3a 100755
--- a/t/t0004-unwritable.sh
+++ b/t/t0004-unwritable.sh
@@ -21,7 +21,7 @@ test_expect_success POSIXPERM,SANITY 'write-tree should notice unwritable reposi
 	test_must_fail git write-tree
 '
 
-test_expect_success POSIXPERM,SANITY 'commit should notice unwritable repository' '
+test_expect_success POSIXPERM,SANITY,!SANITIZE_LEAK 'commit should notice unwritable repository' '
 	test_when_finished "chmod 775 .git/objects .git/objects/??" &&
 	chmod a-w .git/objects .git/objects/?? &&
 	test_must_fail git commit -m second
diff --git a/t/test-lib.sh b/t/test-lib.sh
index d5ee9642548..06831086060 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1536,6 +1536,7 @@ test -z "$NO_PYTHON" && test_set_prereq PYTHON
 test -n "$USE_LIBPCRE2" && test_set_prereq PCRE
 test -n "$USE_LIBPCRE2" && test_set_prereq LIBPCRE2
 test -z "$NO_GETTEXT" && test_set_prereq GETTEXT
+test -n "$SANITIZE_LEAK" && test_set_prereq SANITIZE_LEAK
 
 if test -z "$GIT_TEST_CHECK_CACHE_TREE"
 then
-- 
2.33.0.1228.gdc65525c655


^ permalink raw reply related	[flat|nested] 125+ messages in thread

* [PATCH v8 2/2] tests: add a test mode for SANITIZE=leak, run it in CI
  2021-09-23  9:20               ` [PATCH v8 0/2] " Ævar Arnfjörð Bjarmason
  2021-09-23  9:20                 ` [PATCH v8 1/2] Makefile: add SANITIZE=leak flag to GIT-BUILD-OPTIONS Ævar Arnfjörð Bjarmason
@ 2021-09-23  9:20                 ` Ævar Arnfjörð Bjarmason
  2021-11-03 22:44                   ` Re* " Junio C Hamano
  1 sibling, 1 reply; 125+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-09-23  9:20 UTC (permalink / raw)
  To: git
  Cc: Junio C Hamano, Jeff King, Andrzej Hunt, Lénaïc Huard,
	Derrick Stolee, Felipe Contreras, SZEDER Gábor,
	Đoàn Trần Công Danh,
	Carlo Marcelo Arenas Belón,
	Ævar Arnfjörð Bjarmason

While git can be compiled with SANITIZE=leak, we have not run
regression tests under that mode. Memory leaks have only been fixed as
one-offs without structured regression testing.

This change adds CI testing for it. We'll now build and run a small
set of whitelisted t00*.sh tests under Linux with a new job called
"linux-leaks".

The CI target uses a new GIT_TEST_PASSING_SANITIZE_LEAK=true test
mode. When running in that mode, we'll assert that we were compiled
with SANITIZE=leak. We'll then skip all tests, except those that we've
opted-in by setting "TEST_PASSES_SANITIZE_LEAK=true".

A test setting "TEST_PASSES_SANITIZE_LEAK=true" can in turn make use
of the "SANITIZE_LEAK" prerequisite, should it wish to selectively
skip tests even under "GIT_TEST_PASSING_SANITIZE_LEAK=true". In the
preceding commit we started doing this in "t0004-unwritable.sh" under
SANITIZE=leak, now it'll combine nicely with
"GIT_TEST_PASSING_SANITIZE_LEAK=true".

This is how tests that don't set "TEST_PASSES_SANITIZE_LEAK=true" will
be skipped under GIT_TEST_PASSING_SANITIZE_LEAK=true:

    $ GIT_TEST_PASSING_SANITIZE_LEAK=true ./t0001-init.sh
    1..0 # SKIP skip all tests in t0001 under SANITIZE=leak, TEST_PASSES_SANITIZE_LEAK not set
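
The way the three settings interact can be sketched as a small decision
function. This is an illustration under stated assumptions, not the code
in test-lib.sh: "leak_mode_decision" is a hypothetical name, and only
the literal string "true" is treated as true here, whereas the real code
goes through test_bool_env.

```shell
#!/bin/sh
# Hypothetical helper mirroring the decision made in test-lib.sh.
# Assumption: only the literal "true" counts as true in this sketch.
leak_mode_decision () {
	compiled_with_leak=$1	# SANITIZE_LEAK from GIT-BUILD-OPTIONS
	passing_mode=$2		# GIT_TEST_PASSING_SANITIZE_LEAK
	script_opted_in=$3	# TEST_PASSES_SANITIZE_LEAK

	if test -n "$compiled_with_leak"
	then
		if test "$passing_mode" = true &&
		   test "$script_opted_in" != true
		then
			# Test mode is on, but the script has not opted in.
			echo skip
		else
			echo run
		fi
	elif test "$passing_mode" = true
	then
		# The test mode only makes sense in a SANITIZE=leak build.
		echo error
	else
		echo run
	fi
}

leak_mode_decision YesCompiledWithIt true ""	# prints "skip"
leak_mode_decision YesCompiledWithIt true true	# prints "run"
leak_mode_decision "" true true			# prints "error"
```

Note that a SANITIZE=leak build without the test mode still runs every
script, which matches the behavior before this change.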

The intent is to add more TEST_PASSES_SANITIZE_LEAK=true annotations
as follow-up changes, but let's start small to begin with.

In ci/run-build-and-tests.sh we make use of the default "*" case to
run "make test" without any GIT_TEST_* modes. SANITIZE=leak is known
to fail in combination with GIT_TEST_SPLIT_INDEX=true in
t0016-oidmap.sh, and we're likely to have other such failures in
various GIT_TEST_* modes. Let's focus on getting the base tests
passing, we can expand coverage to GIT_TEST_* modes later.

It would also be possible to implement a more lightweight version of
this by only relying on setting "LSAN_OPTIONS". See
<YS9OT/pn5rRK9cGB@coredump.intra.peff.net>[2] and
<YS9ZIDpANfsh7N+S@coredump.intra.peff.net>[4] for a discussion of
that. I've opted for this approach of adding a GIT_TEST_* mode instead
because it's consistent with how we handle other special test modes.

Being able to add a "!SANITIZE_LEAK" prerequisite and calling
"test_done" early if it isn't satisfied also means that we can more
incrementally add regression tests without being forced to fix
widespread and hard-to-fix leaks at the same time.

We have tests that do simple checking of some tool we're interested
in, but later on in the script might be stressing trace2, or common
sources of leaks like "git log" in combination with the tool (e.g. the
commit-graph tests). To be clear having a prerequisite could also be
accomplished by using "LSAN_OPTIONS" directly.

On the topic of "LSAN_OPTIONS": It would be nice to have a mode to
aggregate all failures in our various scripts, see [4] for a start at
doing that which sets "log_path" in "LSAN_OPTIONS". I've punted on
that for now, it can be added later.

As of writing this we've got major regressions between master..seen,
i.e. the t000*.sh tests and more fixed since 31f9acf9ce2 (Merge branch
'ah/plugleaks', 2021-08-04) have regressed recently.

See the discussion at <87czsv2idy.fsf@evledraar.gmail.com>[3] about
the lack of this sort of test mode, and 0e5bba53af (add UNLEAK
annotation for reducing leak false positives, 2017-09-08) for the
initial addition of SANITIZE=leak.

See also 09595ab381 (Merge branch 'jk/leak-checkers', 2017-09-19),
7782066f67 (Merge branch 'jk/apache-lsan', 2019-05-19) and the recent
936e58851a (Merge branch 'ah/plugleaks', 2021-05-07) for some of the
past history of "one-off" SANITIZE=leak (and more) fixes.

As noted in [5] we can't support this on OSX until Clang 14 is
released; at that point we'll probably want to resurrect that
"osx-leaks" job.

1. https://github.com/google/sanitizers/wiki/AddressSanitizerLeakSanitizer
2. https://lore.kernel.org/git/YS9OT%2Fpn5rRK9cGB@coredump.intra.peff.net/
3. https://lore.kernel.org/git/87czsv2idy.fsf@evledraar.gmail.com/
4. https://lore.kernel.org/git/YS9ZIDpANfsh7N+S@coredump.intra.peff.net/
5. https://lore.kernel.org/git/20210916035603.76369-1-carenas@gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
---
 .github/workflows/main.yml |  3 +++
 ci/install-dependencies.sh |  2 +-
 ci/lib.sh                  |  9 ++++++++-
 t/README                   |  7 +++++++
 t/t0004-unwritable.sh      |  1 +
 t/t0011-hashmap.sh         |  2 ++
 t/t0016-oidmap.sh          |  2 ++
 t/t0017-env-helper.sh      |  1 +
 t/t0018-advice.sh          |  1 +
 t/t0030-stripspace.sh      |  1 +
 t/t0063-string-list.sh     |  1 +
 t/t0091-bugreport.sh       |  1 +
 t/test-lib.sh              | 20 ++++++++++++++++++++
 13 files changed, 49 insertions(+), 2 deletions(-)

diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
index b053b01c66e..47281684782 100644
--- a/.github/workflows/main.yml
+++ b/.github/workflows/main.yml
@@ -232,6 +232,9 @@ jobs:
           - jobname: linux-gcc-default
             cc: gcc
             pool: ubuntu-latest
+          - jobname: linux-leaks
+            cc: gcc
+            pool: ubuntu-latest
     env:
       CC: ${{matrix.vector.cc}}
       jobname: ${{matrix.vector.jobname}}
diff --git a/ci/install-dependencies.sh b/ci/install-dependencies.sh
index 5772081b6e5..1d0e48f4515 100755
--- a/ci/install-dependencies.sh
+++ b/ci/install-dependencies.sh
@@ -12,7 +12,7 @@ UBUNTU_COMMON_PKGS="make libssl-dev libcurl4-openssl-dev libexpat-dev
  libemail-valid-perl libio-socket-ssl-perl libnet-smtp-ssl-perl"
 
 case "$jobname" in
-linux-clang|linux-gcc)
+linux-clang|linux-gcc|linux-leaks)
 	sudo apt-add-repository -y "ppa:ubuntu-toolchain-r/test"
 	sudo apt-get -q update
 	sudo apt-get -q -y install language-pack-is libsvn-perl apache2 \
diff --git a/ci/lib.sh b/ci/lib.sh
index 476c3f369f5..82cb17f8eea 100755
--- a/ci/lib.sh
+++ b/ci/lib.sh
@@ -183,7 +183,7 @@ export GIT_TEST_CLONE_2GB=true
 export SKIP_DASHED_BUILT_INS=YesPlease
 
 case "$jobname" in
-linux-clang|linux-gcc)
+linux-clang|linux-gcc|linux-leaks)
 	if [ "$jobname" = linux-gcc ]
 	then
 		export CC=gcc-8
@@ -233,4 +233,11 @@ linux-musl)
 	;;
 esac
 
+case "$jobname" in
+linux-leaks)
+	export SANITIZE=leak
+	export GIT_TEST_PASSING_SANITIZE_LEAK=true
+	;;
+esac
+
 MAKEFLAGS="$MAKEFLAGS CC=${CC:-cc}"
diff --git a/t/README b/t/README
index 51065d08006..b92155a822e 100644
--- a/t/README
+++ b/t/README
@@ -366,6 +366,13 @@ excluded as so much relies on it, but this might change in the future.
 GIT_TEST_SPLIT_INDEX=<boolean> forces split-index mode on the whole
 test suite. Accept any boolean values that are accepted by git-config.
 
+GIT_TEST_PASSING_SANITIZE_LEAK=<boolean> when compiled with
+SANITIZE=leak will run only those tests that have whitelisted
+themselves as passing with no memory leaks. Tests can be whitelisted
+by setting "TEST_PASSES_SANITIZE_LEAK=true" before sourcing
+"test-lib.sh" itself at the top of the test script. This test mode is
+used by the "linux-leaks" CI target.
+
 GIT_TEST_PROTOCOL_VERSION=<n>, when set, makes 'protocol.version'
 default to n.
 
diff --git a/t/t0004-unwritable.sh b/t/t0004-unwritable.sh
index fbdcb926b3a..37d68ef03be 100755
--- a/t/t0004-unwritable.sh
+++ b/t/t0004-unwritable.sh
@@ -2,6 +2,7 @@
 
 test_description='detect unwritable repository and fail correctly'
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 test_expect_success setup '
diff --git a/t/t0011-hashmap.sh b/t/t0011-hashmap.sh
index 5343ffd3f92..e094975b13b 100755
--- a/t/t0011-hashmap.sh
+++ b/t/t0011-hashmap.sh
@@ -1,6 +1,8 @@
 #!/bin/sh
 
 test_description='test hashmap and string hash functions'
+
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 test_hashmap() {
diff --git a/t/t0016-oidmap.sh b/t/t0016-oidmap.sh
index 31f8276ba82..0faef1f4f11 100755
--- a/t/t0016-oidmap.sh
+++ b/t/t0016-oidmap.sh
@@ -1,6 +1,8 @@
 #!/bin/sh
 
 test_description='test oidmap'
+
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 # This purposefully is very similar to t0011-hashmap.sh
diff --git a/t/t0017-env-helper.sh b/t/t0017-env-helper.sh
index 4a159f99e44..2e42fba9567 100755
--- a/t/t0017-env-helper.sh
+++ b/t/t0017-env-helper.sh
@@ -2,6 +2,7 @@
 
 test_description='test env--helper'
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 
diff --git a/t/t0018-advice.sh b/t/t0018-advice.sh
index 39e5e4b34f8..c13057a4ca3 100755
--- a/t/t0018-advice.sh
+++ b/t/t0018-advice.sh
@@ -2,6 +2,7 @@
 
 test_description='Test advise_if_enabled functionality'
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 test_expect_success 'advice should be printed when config variable is unset' '
diff --git a/t/t0030-stripspace.sh b/t/t0030-stripspace.sh
index 0c24a0f9a37..ae1ca380c1a 100755
--- a/t/t0030-stripspace.sh
+++ b/t/t0030-stripspace.sh
@@ -5,6 +5,7 @@
 
 test_description='git stripspace'
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 t40='A quick brown fox jumps over the lazy do'
diff --git a/t/t0063-string-list.sh b/t/t0063-string-list.sh
index c6ee9f66b11..46d4839194b 100755
--- a/t/t0063-string-list.sh
+++ b/t/t0063-string-list.sh
@@ -5,6 +5,7 @@
 
 test_description='Test string list functionality'
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 test_split () {
diff --git a/t/t0091-bugreport.sh b/t/t0091-bugreport.sh
index 526304ff95b..eeedbfa9193 100755
--- a/t/t0091-bugreport.sh
+++ b/t/t0091-bugreport.sh
@@ -2,6 +2,7 @@
 
 test_description='git bugreport'
 
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 # Headers "[System Info]" will be followed by a non-empty line if we put some
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 06831086060..9310d9d900a 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1381,6 +1381,26 @@ then
 	test_done
 fi
 
+# skip non-whitelisted tests when compiled with SANITIZE=leak
+if test -n "$SANITIZE_LEAK"
+then
+	if test_bool_env GIT_TEST_PASSING_SANITIZE_LEAK false
+	then
+		# We need to see it in "git env--helper" (via
+		# test_bool_env)
+		export TEST_PASSES_SANITIZE_LEAK
+
+		if ! test_bool_env TEST_PASSES_SANITIZE_LEAK false
+		then
+			skip_all="skipping $this_test under GIT_TEST_PASSING_SANITIZE_LEAK=true"
+			test_done
+		fi
+	fi
+elif test_bool_env GIT_TEST_PASSING_SANITIZE_LEAK false
+then
+	error "GIT_TEST_PASSING_SANITIZE_LEAK=true has no effect except when compiled with SANITIZE=leak"
+fi
+
 # Last-minute variable setup
 USER_HOME="$HOME"
 HOME="$TRASH_DIRECTORY"
-- 
2.33.0.1228.gdc65525c655


^ permalink raw reply related	[flat|nested] 125+ messages in thread

* Re* [PATCH v8 2/2] tests: add a test mode for SANITIZE=leak, run it in CI
  2021-09-23  9:20                 ` [PATCH v8 2/2] tests: add a test mode for SANITIZE=leak, run it in CI Ævar Arnfjörð Bjarmason
@ 2021-11-03 22:44                   ` Junio C Hamano
  2021-11-03 23:57                     ` Junio C Hamano
  2021-11-04 10:06                     ` Ævar Arnfjörð Bjarmason
  0 siblings, 2 replies; 125+ messages in thread
From: Junio C Hamano @ 2021-11-03 22:44 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Jeff King, Andrzej Hunt, Lénaïc Huard,
	Derrick Stolee, Felipe Contreras, SZEDER Gábor,
	Đoàn Trần Công Danh,
	Carlo Marcelo Arenas Belón

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> The CI target uses a new GIT_TEST_PASSING_SANITIZE_LEAK=true test
> mode. When running in that mode, we'll assert that we were compiled
> with SANITIZE=leak. We'll then skip all tests, except those that we've
> opted-in by setting "TEST_PASSES_SANITIZE_LEAK=true".
> ...
> This is how tests that don't set "TEST_PASSES_SANITIZE_LEAK=true" will
> be skipped under GIT_TEST_PASSING_SANITIZE_LEAK=true:

I've been playing with this locally, but cannot shake the nagging
feeling that GIT_TEST_PASSING_SANITIZE_LEAK must default to true.
Otherwise, it is one more thing they need to find out and set when
they do

    make SANITIZE=leak test

because they want to be a good developer and to ensure that they did
not introduce new leaks.

If we want to encourage folks to locally run the leak checks before
declaring their own work "done", that is.

Those who are hunting for and cleaning up existing leaks can and
should set it to false, no?


In any case, here is a small fallout out of my adventure into this
corner.

----- >8 --------- >8 --------- >8 --------- >8 -----
Subject: t0006: date_mode can leak .strftime_fmt member

As there is no date_mode_release() API function, and given the
set of current callers it probably is not worth adding one, let's
release the .strftime_fmt member that is obtained from strdup()
before the caller of show_date() is done with it.

This allows us to mark t0006 as passing under the leak sanitizer.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 t/helper/test-date.c | 2 ++
 t/t0006-date.sh      | 2 ++
 2 files changed, 4 insertions(+)

diff --git c/t/helper/test-date.c w/t/helper/test-date.c
index 099eff4f0f..e15ea02626 100644
--- c/t/helper/test-date.c
+++ w/t/helper/test-date.c
@@ -53,6 +53,8 @@ static void show_dates(const char **argv, const char *format)
 
 		printf("%s -> %s\n", *argv, show_date(t, tz, &mode));
 	}
+
+	free((void *)mode.strftime_fmt);
 }
 
 static void parse_dates(const char **argv)
diff --git c/t/t0006-date.sh w/t/t0006-date.sh
index 6b757d7169..5d01f57b27 100755
--- c/t/t0006-date.sh
+++ w/t/t0006-date.sh
@@ -1,6 +1,8 @@
 #!/bin/sh
 
 test_description='test date parsing and printing'
+
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 # arbitrary reference time: 2009-08-30 19:20:00

^ permalink raw reply related	[flat|nested] 125+ messages in thread

* Re: Re* [PATCH v8 2/2] tests: add a test mode for SANITIZE=leak, run it in CI
  2021-11-03 22:44                   ` Re* " Junio C Hamano
@ 2021-11-03 23:57                     ` Junio C Hamano
  2021-11-04 10:06                     ` Ævar Arnfjörð Bjarmason
  1 sibling, 0 replies; 125+ messages in thread
From: Junio C Hamano @ 2021-11-03 23:57 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Jeff King, Andrzej Hunt, Lénaïc Huard,
	Derrick Stolee, Felipe Contreras, SZEDER Gábor,
	Đoàn Trần Công Danh,
	Carlo Marcelo Arenas Belón

Junio C Hamano <gitster@pobox.com> writes:

> Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:
>
>> The CI target uses a new GIT_TEST_PASSING_SANITIZE_LEAK=true test
>> mode. When running in that mode, we'll assert that we were compiled
>> with SANITIZE=leak. We'll then skip all tests, except those that we've
>> opted-in by setting "TEST_PASSES_SANITIZE_LEAK=true".
>> ...
>> This is how tests that don't set "TEST_PASSES_SANITIZE_LEAK=true" will
>> be skipped under GIT_TEST_PASSING_SANITIZE_LEAK=true:
>
> I've been playing with this locally, but cannot shake the nagging
> feeling that GIT_TEST_PASSING_SANITIZE_LEAK must default to true.
> Otherwise, it is one more thing they need to find out and set when
> they do
>
>     make SANITIZE=leak test
>
> because they want to be a good developer and to ensure that they did
> not introduce new leaks.
>
> If we want to encourage folks to locally run the leak checks before
> declaring their own work "done", that is.
>
> Those who are hunting for and cleaning up existing leaks can and
> should set it to false, no?

Another thing while I am at it, I have a feeling that the polarity
of the TEST_PASSES_SANITIZE_LEAK declaration is the other way
around.

Marking the tests that do not yet pass the leak check with a special
annotation will make it easier to find not-yet-clean tests for those
who have too much time on their hands ;-) to find ones that are
affected by the leaky tests.
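
For comparison, finding the not-yet-clean scripts under the current
opt-in polarity could look roughly like the sketch below. The function
name "list_unmarked_tests" is hypothetical, and the sketch assumes the
annotation sits on a line of its own, as it does in the scripts touched
by this series.

```shell
#!/bin/sh
# Hypothetical helper: print the test scripts in directory $1 that do
# not yet contain a TEST_PASSES_SANITIZE_LEAK=true line.
list_unmarked_tests () {
	for script in "$1"/t[0-9]*.sh
	do
		# Skip the unexpanded pattern when nothing matches.
		test -f "$script" || continue
		grep -q '^TEST_PASSES_SANITIZE_LEAK=true$' "$script" ||
		echo "${script##*/}"
	done
}
```

Run from the top of a git checkout as "list_unmarked_tests t", it would
list every script that has not opted in.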

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: Re* [PATCH v8 2/2] tests: add a test mode for SANITIZE=leak, run it in CI
  2021-11-03 22:44                   ` Re* " Junio C Hamano
  2021-11-03 23:57                     ` Junio C Hamano
@ 2021-11-04 10:06                     ` Ævar Arnfjörð Bjarmason
  2021-11-16 18:31                       ` [PATCH] t0006: date_mode can leak .strftime_fmt member Ævar Arnfjörð Bjarmason
  1 sibling, 1 reply; 125+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-11-04 10:06 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, Jeff King, Andrzej Hunt, Lénaïc Huard,
	Derrick Stolee, Felipe Contreras, SZEDER Gábor,
	Đoàn Trần Công Danh,
	Carlo Marcelo Arenas Belón


On Wed, Nov 03 2021, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:
>
>> The CI target uses a new GIT_TEST_PASSING_SANITIZE_LEAK=true test
>> mode. When running in that mode, we'll assert that we were compiled
>> with SANITIZE=leak. We'll then skip all tests, except those that we've
>> opted-in by setting "TEST_PASSES_SANITIZE_LEAK=true".
>> ...
>> This is how tests that don't set "TEST_PASSES_SANITIZE_LEAK=true" will
>> be skipped under GIT_TEST_PASSING_SANITIZE_LEAK=true:
>
> I've been playing with this locally, but cannot shake the nagging
> feeling that GIT_TEST_PASSING_SANITIZE_LEAK must default to true.
> Otherwise, it is one more thing they need to find out and set when
> they do
>
>     make SANITIZE=leak test
>
> because they want to be a good developer and to ensure that they did
> not introduce new leaks.
>
> If we want to encourage folks to locally run the leak checks before
> declaring their own work "done", that is.
>
> Those who are hunting for and cleaning up existing leaks can and
> should set it to false, no?

I agree that that would make a lot more sense and be more useful :)

That was the behavior of the patch I originally suggested for
integrating this SANITIZE=leak[1], but due to feedback on it I ended up
keeping the pre-image behavior of how SANITIZE=leak worked, unless there
were any opt-in test modes etc. in play:
https://lore.kernel.org/git/patch-1.4-a61a294132-20210714T001007Z-avarab@gmail.com/

I think at this point it's probably better to just keep it as it is...

> In any case, here is a small fallout out of my adventure into this
> corner.
>
> ----- >8 --------- >8 --------- >8 --------- >8 -----
> Subject: t0006: date_mode can leak .strftime_fmt member
>
> As there is no date_mode_release() API function, and given the
> set of current callers it probably is not worth adding one, let's
> release the .strftime_fmt member that is obtained from strdup()
> before the caller of show_date() is done with it.
>
> This allows us to mark t0006 as passing under the leak sanitizer.
>
> Signed-off-by: Junio C Hamano <gitster@pobox.com>
> ---
>  t/helper/test-date.c | 2 ++
>  t/t0006-date.sh      | 2 ++
>  2 files changed, 4 insertions(+)
>
> diff --git c/t/helper/test-date.c w/t/helper/test-date.c
> index 099eff4f0f..e15ea02626 100644
> --- c/t/helper/test-date.c
> +++ w/t/helper/test-date.c
> @@ -53,6 +53,8 @@ static void show_dates(const char **argv, const char *format)
>  
>  		printf("%s -> %s\n", *argv, show_date(t, tz, &mode));
>  	}
> +
> +	free((void *)mode.strftime_fmt);
>  }

I'd noticed that failure before, but hadn't looked into it. That was
easier to fix than I thought.

This fix looks good to me, except that you also need to change this at
the top:

diff --git a/t/helper/test-date.c b/t/helper/test-date.c
index e15ea026267..9defeb57360 100644
--- a/t/helper/test-date.c
+++ b/t/helper/test-date.c
@@ -34,7 +34,7 @@ static void show_human_dates(const char **argv)
 
 static void show_dates(const char **argv, const char *format)
 {
-       struct date_mode mode;
+       struct date_mode mode = { 0 };
 
        parse_date_format(format, &mode);
        for (; *argv; argv++) {

I.e. this makes this specific thing pass, but in other tests we'd end up
freeing an uninitialized pointer holding a non-NULL garbage value unless
we init it to zero.

>  
>  static void parse_dates(const char **argv)
> diff --git c/t/t0006-date.sh w/t/t0006-date.sh
> index 6b757d7169..5d01f57b27 100755
> --- c/t/t0006-date.sh
> +++ w/t/t0006-date.sh
> @@ -1,6 +1,8 @@
>  #!/bin/sh
>  
>  test_description='test date parsing and printing'
> +
> +TEST_PASSES_SANITIZE_LEAK=true
>  . ./test-lib.sh
>  
>  # arbitrary reference time: 2009-08-30 19:20:00

And yeah, that's all that's needed in the test file then.

^ permalink raw reply related	[flat|nested] 125+ messages in thread

* [PATCH] t0006: date_mode can leak .strftime_fmt member
  2021-11-04 10:06                     ` Ævar Arnfjörð Bjarmason
@ 2021-11-16 18:31                       ` Ævar Arnfjörð Bjarmason
  2021-11-16 19:04                         ` Junio C Hamano
  2021-11-16 19:31                         ` Jeff King
  0 siblings, 2 replies; 125+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2021-11-16 18:31 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Ævar Arnfjörð Bjarmason

From: Junio C Hamano <gitster@pobox.com>

As there is no date_mode_release() API function, and given the
set of current callers it probably is not worth adding one, let's
release the .strftime_fmt member that is obtained from strdup()
before the caller of show_date() is done with it.

This allows us to mark t0006 as passing under the leak sanitizer.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---

A trivial leak test from Junio that fell through the cracks. Submitted
with my suggested fix-up in
https://lore.kernel.org/git/211104.86mtmki5ol.gmgdl@evledraar.gmail.com/

 t/helper/test-date.c | 4 +++-
 t/t0006-date.sh      | 2 ++
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/t/helper/test-date.c b/t/helper/test-date.c
index 099eff4f0fc..27a36a5c5fe 100644
--- a/t/helper/test-date.c
+++ b/t/helper/test-date.c
@@ -34,7 +34,7 @@ static void show_human_dates(const char **argv)
 
 static void show_dates(const char **argv, const char *format)
 {
-	struct date_mode mode;
+	struct date_mode mode = { 0 };
 
 	parse_date_format(format, &mode);
 	for (; *argv; argv++) {
@@ -53,6 +53,8 @@ static void show_dates(const char **argv, const char *format)
 
 		printf("%s -> %s\n", *argv, show_date(t, tz, &mode));
 	}
+
+	free((void *)mode.strftime_fmt);
 }
 
 static void parse_dates(const char **argv)
diff --git a/t/t0006-date.sh b/t/t0006-date.sh
index 6b757d71692..5d01f57b270 100755
--- a/t/t0006-date.sh
+++ b/t/t0006-date.sh
@@ -1,6 +1,8 @@
 #!/bin/sh
 
 test_description='test date parsing and printing'
+
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 # arbitrary reference time: 2009-08-30 19:20:00
-- 
2.34.0.795.g1e9501ab396


^ permalink raw reply related	[flat|nested] 125+ messages in thread

* Re: [PATCH] t0006: date_mode can leak .strftime_fmt member
  2021-11-16 18:31                       ` [PATCH] t0006: date_mode can leak .strftime_fmt member Ævar Arnfjörð Bjarmason
@ 2021-11-16 19:04                         ` Junio C Hamano
  2021-11-16 19:31                         ` Jeff King
  1 sibling, 0 replies; 125+ messages in thread
From: Junio C Hamano @ 2021-11-16 19:04 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> As there is no date_mode_release() API function, and given the
> set of current callers it probably is not worth adding one, let's
> release the .strftime_fmt member that is obtained from strdup()
> before the caller of show_date() is done with it.

I do not know what the last line exactly wants to say.  Perhaps the
original author meant "after", not "before"? ;-)

* Re: [PATCH] t0006: date_mode can leak .strftime_fmt member
  2021-11-16 18:31                       ` [PATCH] t0006: date_mode can leak .strftime_fmt member Ævar Arnfjörð Bjarmason
  2021-11-16 19:04                         ` Junio C Hamano
@ 2021-11-16 19:31                         ` Jeff King
  2022-02-02 21:03                           ` [PATCH 0/5] date.[ch] API: split from cache.h, add API docs, stop leaking memory Ævar Arnfjörð Bjarmason
  1 sibling, 1 reply; 125+ messages in thread
From: Jeff King @ 2021-11-16 19:31 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, Junio C Hamano

On Tue, Nov 16, 2021 at 07:31:12PM +0100, Ævar Arnfjörð Bjarmason wrote:

> As there is no date_mode_release() API function, and given the
> set of current callers it probably is not worth adding one, let's
> release the .strftime_fmt member that is obtained from strdup()
> before the caller of show_date() is done with it.

It does feel a bit ugly to assume that we can touch strftime_fmt here,
especially since we don't even confirm that we parsed DATE_STRFTIME.
You initialize it as NULL and the current code doesn't touch it
otherwise, so there's no bug. But it would be reasonable for other date
formats to store ancillary data as a union with strftime_fmt, which
would invalidate this.

It also seems like other callers will need to do similar cleanup. E.g.,
"git -c log.date=format:foo log" has the same leak. So maybe it is worth
adding an actual cleanup function.
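
A minimal sketch of what such a cleanup function could look like if it
checked the parsed type before freeing, which addresses the union
concern above. Everything prefixed "sketch_" is a hypothetical stand-in
written for illustration, not git's actual code:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins, not git's real definitions. */
enum sketch_date_type {
	SKETCH_DATE_NORMAL = 0,
	SKETCH_DATE_STRFTIME
};

struct sketch_date_mode {
	enum sketch_date_type type;
	char *strftime_fmt; /* owned only when type == SKETCH_DATE_STRFTIME */
	int local;
};

/*
 * Free what the mode owns. Checking the type first keeps this correct
 * even if other types later reuse the storage for data that must not
 * be passed to free().
 */
static void sketch_date_mode_release(struct sketch_date_mode *mode)
{
	assert(mode != NULL);
	if (mode->type != SKETCH_DATE_STRFTIME)
		return;
	free(mode->strftime_fmt);
	mode->strftime_fmt = NULL;
	mode->type = SKETCH_DATE_NORMAL;
}

/* Returns 1 if releasing leaves the struct in an empty, reusable state. */
static int sketch_release_demo(void)
{
	struct sketch_date_mode mode = { 0 };
	const char *fmt = "%Y-%m-%d";

	mode.strftime_fmt = malloc(strlen(fmt) + 1);
	if (!mode.strftime_fmt)
		return 0;
	strcpy(mode.strftime_fmt, fmt);
	mode.type = SKETCH_DATE_STRFTIME;

	sketch_date_mode_release(&mode);
	sketch_date_mode_release(&mode); /* second call is a no-op */
	return mode.strftime_fmt == NULL && mode.type == SKETCH_DATE_NORMAL;
}
```

Resetting the type after freeing is a design choice of this sketch: it
makes a repeated release harmless, at the cost of the struct no longer
remembering which format was parsed.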

-Peff

* [PATCH 0/5] date.[ch] API: split from cache.h, add API docs, stop leaking memory
  2021-11-16 19:31                         ` Jeff King
@ 2022-02-02 21:03                           ` Ævar Arnfjörð Bjarmason
  2022-02-02 21:03                             ` [PATCH 1/5] cache.h: remove always unused show_date_human() declaration Ævar Arnfjörð Bjarmason
                                               ` (5 more replies)
  0 siblings, 6 replies; 125+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-02-02 21:03 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Ævar Arnfjörð Bjarmason

This is a follow-up to a much smaller patch[1] discussed in November
to make t0006-date.sh pass with SANITIZE=leak.

In reply, Jeff King pointed out that reaching into the private guts of
"struct date_mode" in the test helper felt ugly[2].

So this series pursues a more thorough approach, creating a date.h,
moving our date functions there out of cache.h, documenting the core
functions, and finally adding and using a date_mode_release()
function.

It's definitely taking the long way around, but I think that the end
result is worth it. I then have a follow-up series to plug memory
leaks in revision.h, which will make use of this new API.

1. https://lore.kernel.org/git/patch-1.1-15f5bd3e4f4-20211116T183025Z-avarab@gmail.com/
2. https://lore.kernel.org/git/YZQHEiFnOdyxYX5t@coredump.intra.peff.net/

Ævar Arnfjörð Bjarmason (5):
  cache.h: remove always unused show_date_human() declaration
  date API: create a date.h, split from cache.h
  date API: provide and use a DATE_MODE_INIT
  date API: add basic API docs
  date API: add and use a date_mode_release()

 archive-zip.c         |  1 +
 builtin/am.c          |  1 +
 builtin/commit.c      |  1 +
 builtin/fast-import.c |  1 +
 builtin/show-branch.c |  1 +
 builtin/tag.c         |  1 +
 cache.h               | 50 -----------------------------
 config.c              |  1 +
 date.c                |  9 ++++--
 date.h                | 73 +++++++++++++++++++++++++++++++++++++++++++
 http-backend.c        |  1 +
 ident.c               |  1 +
 object-name.c         |  1 +
 pretty.h              | 10 ++++++
 ref-filter.c          |  3 +-
 refs.c                |  1 +
 strbuf.c              |  1 +
 t/helper/test-date.c  |  5 ++-
 t/t0006-date.sh       |  2 ++
 19 files changed, 110 insertions(+), 54 deletions(-)
 create mode 100644 date.h

-- 
2.35.0.913.g12b4baa2536


* [PATCH 1/5] cache.h: remove always unused show_date_human() declaration
  2022-02-02 21:03                           ` [PATCH 0/5] date.[ch] API: split from cache.h, add API docs, stop leaking memory Ævar Arnfjörð Bjarmason
@ 2022-02-02 21:03                             ` Ævar Arnfjörð Bjarmason
  2022-02-02 21:03                             ` [PATCH 2/5] date API: create a date.h, split from cache.h Ævar Arnfjörð Bjarmason
                                               ` (4 subsequent siblings)
  5 siblings, 0 replies; 125+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-02-02 21:03 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Ævar Arnfjörð Bjarmason

There has never been a show_date_human() function on the "master"
branch in git.git. This declaration was added in b841d4ff438 (Add
`human` format to test-tool, 2019-01-28).

A look at the ML history reveals that it was leftover cruft from an
earlier version of that commit[1].

1. https://lore.kernel.org/git/20190118061805.19086-5-ischis2@cox.net/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 cache.h | 2 --
 1 file changed, 2 deletions(-)

diff --git a/cache.h b/cache.h
index 281f00ab1b1..49b46244c74 100644
--- a/cache.h
+++ b/cache.h
@@ -1586,8 +1586,6 @@ struct date_mode *date_mode_from_type(enum date_mode_type type);
 
 const char *show_date(timestamp_t time, int timezone, const struct date_mode *mode);
 void show_date_relative(timestamp_t time, struct strbuf *timebuf);
-void show_date_human(timestamp_t time, int tz, const struct timeval *now,
-			struct strbuf *timebuf);
 int parse_date(const char *date, struct strbuf *out);
 int parse_date_basic(const char *date, timestamp_t *timestamp, int *offset);
 int parse_expiry_date(const char *date, timestamp_t *timestamp);
-- 
2.35.0.913.g12b4baa2536


* [PATCH 2/5] date API: create a date.h, split from cache.h
  2022-02-02 21:03                           ` [PATCH 0/5] date.[ch] API: split from cache.h, add API docs, stop leaking memory Ævar Arnfjörð Bjarmason
  2022-02-02 21:03                             ` [PATCH 1/5] cache.h: remove always unused show_date_human() declaration Ævar Arnfjörð Bjarmason
@ 2022-02-02 21:03                             ` Ævar Arnfjörð Bjarmason
  2022-02-02 21:19                               ` Ævar Arnfjörð Bjarmason
  2022-02-15  3:04                               ` Junio C Hamano
  2022-02-02 21:03                             ` [PATCH 3/5] date API: provide and use a DATE_MODE_INIT Ævar Arnfjörð Bjarmason
                                               ` (3 subsequent siblings)
  5 siblings, 2 replies; 125+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-02-02 21:03 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Ævar Arnfjörð Bjarmason

Move the declaration of the date.c functions from cache.h, and adjust
the relevant users to include the new date.h header.

The show_ident_date() function belonged in pretty.h (it's defined in
pretty.c), its two users outside of pretty.c didn't strictly need to
include pretty.h, as they get it indirectly, but let's add it to them
anyway.

Similarly, the change to "builtin/{fast-import,show-branch,tag}.c"
isn't needed as far as the compiler is concerned, but since they all
use the "DATE_MODE()" macro we now define in date.h, let's have them
include it.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 archive-zip.c         |  1 +
 builtin/am.c          |  1 +
 builtin/commit.c      |  1 +
 builtin/fast-import.c |  1 +
 builtin/show-branch.c |  1 +
 builtin/tag.c         |  1 +
 cache.h               | 48 -------------------------------------------
 config.c              |  1 +
 date.c                |  1 +
 date.h                | 43 ++++++++++++++++++++++++++++++++++++++
 http-backend.c        |  1 +
 ident.c               |  1 +
 object-name.c         |  1 +
 pretty.h              | 10 +++++++++
 refs.c                |  1 +
 strbuf.c              |  1 +
 t/helper/test-date.c  |  1 +
 17 files changed, 67 insertions(+), 48 deletions(-)
 create mode 100644 date.h

diff --git a/archive-zip.c b/archive-zip.c
index 2961e01c754..8ea9d1a5dae 100644
--- a/archive-zip.c
+++ b/archive-zip.c
@@ -9,6 +9,7 @@
 #include "object-store.h"
 #include "userdiff.h"
 #include "xdiff-interface.h"
+#include "date.h"
 
 static int zip_date;
 static int zip_time;
diff --git a/builtin/am.c b/builtin/am.c
index b6be1f1cb11..cc8cd6d6e4b 100644
--- a/builtin/am.c
+++ b/builtin/am.c
@@ -34,6 +34,7 @@
 #include "string-list.h"
 #include "packfile.h"
 #include "repository.h"
+#include "pretty.h"
 
 /**
  * Returns the length of the first line of msg.
diff --git a/builtin/commit.c b/builtin/commit.c
index b9ed0374e30..6b99ac276d8 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -37,6 +37,7 @@
 #include "help.h"
 #include "commit-reach.h"
 #include "commit-graph.h"
+#include "pretty.h"
 
 static const char * const builtin_commit_usage[] = {
 	N_("git commit [<options>] [--] <pathspec>..."),
diff --git a/builtin/fast-import.c b/builtin/fast-import.c
index 2b2e28bad79..28f2b9cc91f 100644
--- a/builtin/fast-import.c
+++ b/builtin/fast-import.c
@@ -19,6 +19,7 @@
 #include "mem-pool.h"
 #include "commit-reach.h"
 #include "khash.h"
+#include "date.h"
 
 #define PACK_ID_BITS 16
 #define MAX_PACK_ID ((1<<PACK_ID_BITS)-1)
diff --git a/builtin/show-branch.c b/builtin/show-branch.c
index e12c5e80e3e..330b0553b9d 100644
--- a/builtin/show-branch.c
+++ b/builtin/show-branch.c
@@ -8,6 +8,7 @@
 #include "parse-options.h"
 #include "dir.h"
 #include "commit-slab.h"
+#include "date.h"
 
 static const char* show_branch_usage[] = {
     N_("git show-branch [-a | --all] [-r | --remotes] [--topo-order | --date-order]\n"
diff --git a/builtin/tag.c b/builtin/tag.c
index 134b3f1edf0..2479da07049 100644
--- a/builtin/tag.c
+++ b/builtin/tag.c
@@ -20,6 +20,7 @@
 #include "oid-array.h"
 #include "column.h"
 #include "ref-filter.h"
+#include "date.h"
 
 static const char * const git_tag_usage[] = {
 	N_("git tag [-a | -s | -u <key-id>] [-f] [-m <msg> | -F <file>]\n"
diff --git a/cache.h b/cache.h
index 49b46244c74..6add78fd701 100644
--- a/cache.h
+++ b/cache.h
@@ -1557,46 +1557,6 @@ struct object *repo_peel_to_type(struct repository *r,
 #define peel_to_type(name, namelen, obj, type) \
 	repo_peel_to_type(the_repository, name, namelen, obj, type)
 
-enum date_mode_type {
-	DATE_NORMAL = 0,
-	DATE_HUMAN,
-	DATE_RELATIVE,
-	DATE_SHORT,
-	DATE_ISO8601,
-	DATE_ISO8601_STRICT,
-	DATE_RFC2822,
-	DATE_STRFTIME,
-	DATE_RAW,
-	DATE_UNIX
-};
-
-struct date_mode {
-	enum date_mode_type type;
-	const char *strftime_fmt;
-	int local;
-};
-
-/*
- * Convenience helper for passing a constant type, like:
- *
- *   show_date(t, tz, DATE_MODE(NORMAL));
- */
-#define DATE_MODE(t) date_mode_from_type(DATE_##t)
-struct date_mode *date_mode_from_type(enum date_mode_type type);
-
-const char *show_date(timestamp_t time, int timezone, const struct date_mode *mode);
-void show_date_relative(timestamp_t time, struct strbuf *timebuf);
-int parse_date(const char *date, struct strbuf *out);
-int parse_date_basic(const char *date, timestamp_t *timestamp, int *offset);
-int parse_expiry_date(const char *date, timestamp_t *timestamp);
-void datestamp(struct strbuf *out);
-#define approxidate(s) approxidate_careful((s), NULL)
-timestamp_t approxidate_careful(const char *, int *);
-timestamp_t approxidate_relative(const char *date);
-void parse_date_format(const char *format, struct date_mode *mode);
-int date_overflows(timestamp_t date);
-time_t tm_to_time_t(const struct tm *tm);
-
 #define IDENT_STRICT	       1
 #define IDENT_NO_DATE	       2
 #define IDENT_NO_NAME	       4
@@ -1642,14 +1602,6 @@ struct ident_split {
  */
 int split_ident_line(struct ident_split *, const char *, int);
 
-/*
- * Like show_date, but pull the timestamp and tz parameters from
- * the ident_split. It will also sanity-check the values and produce
- * a well-known sentinel date if they appear bogus.
- */
-const char *show_ident_date(const struct ident_split *id,
-			    const struct date_mode *mode);
-
 /*
  * Compare split idents for equality or strict ordering. Note that we
  * compare only the ident part of the line, ignoring any timestamp.
diff --git a/config.c b/config.c
index 2bffa8d4a01..9c9dc8a6f62 100644
--- a/config.c
+++ b/config.c
@@ -6,6 +6,7 @@
  *
  */
 #include "cache.h"
+#include "date.h"
 #include "branch.h"
 #include "config.h"
 #include "environment.h"
diff --git a/date.c b/date.c
index 84bb4451c1a..863b07e9e63 100644
--- a/date.c
+++ b/date.c
@@ -5,6 +5,7 @@
  */
 
 #include "cache.h"
+#include "date.h"
 
 /*
  * This is like mktime, but without normalization of tm_wday and tm_yday.
diff --git a/date.h b/date.h
new file mode 100644
index 00000000000..5db9ec8dd29
--- /dev/null
+++ b/date.h
@@ -0,0 +1,43 @@
+#ifndef DATE_H
+#define DATE_H
+
+enum date_mode_type {
+	DATE_NORMAL = 0,
+	DATE_HUMAN,
+	DATE_RELATIVE,
+	DATE_SHORT,
+	DATE_ISO8601,
+	DATE_ISO8601_STRICT,
+	DATE_RFC2822,
+	DATE_STRFTIME,
+	DATE_RAW,
+	DATE_UNIX
+};
+
+struct date_mode {
+	enum date_mode_type type;
+	const char *strftime_fmt;
+	int local;
+};
+
+/*
+ * Convenience helper for passing a constant type, like:
+ *
+ *   show_date(t, tz, DATE_MODE(NORMAL));
+ */
+#define DATE_MODE(t) date_mode_from_type(DATE_##t)
+struct date_mode *date_mode_from_type(enum date_mode_type type);
+
+const char *show_date(timestamp_t time, int timezone, const struct date_mode *mode);
+void show_date_relative(timestamp_t time, struct strbuf *timebuf);
+int parse_date(const char *date, struct strbuf *out);
+int parse_date_basic(const char *date, timestamp_t *timestamp, int *offset);
+int parse_expiry_date(const char *date, timestamp_t *timestamp);
+void datestamp(struct strbuf *out);
+#define approxidate(s) approxidate_careful((s), NULL)
+timestamp_t approxidate_careful(const char *, int *);
+timestamp_t approxidate_relative(const char *date);
+void parse_date_format(const char *format, struct date_mode *mode);
+int date_overflows(timestamp_t date);
+time_t tm_to_time_t(const struct tm *tm);
+#endif
diff --git a/http-backend.c b/http-backend.c
index 807fb8839e7..81a7229ece0 100644
--- a/http-backend.c
+++ b/http-backend.c
@@ -13,6 +13,7 @@
 #include "packfile.h"
 #include "object-store.h"
 #include "protocol.h"
+#include "date.h"
 
 static const char content_type[] = "Content-Type";
 static const char content_length[] = "Content-Length";
diff --git a/ident.c b/ident.c
index 6aba4b5cb6f..89ca5b47008 100644
--- a/ident.c
+++ b/ident.c
@@ -7,6 +7,7 @@
  */
 #include "cache.h"
 #include "config.h"
+#include "date.h"
 
 static struct strbuf git_default_name = STRBUF_INIT;
 static struct strbuf git_default_email = STRBUF_INIT;
diff --git a/object-name.c b/object-name.c
index fdff4601b2c..f9527817b64 100644
--- a/object-name.c
+++ b/object-name.c
@@ -15,6 +15,7 @@
 #include "submodule.h"
 #include "midx.h"
 #include "commit-reach.h"
+#include "date.h"
 
 static int get_oid_oneline(struct repository *r, const char *, struct object_id *, struct commit_list *);
 
diff --git a/pretty.h b/pretty.h
index 2f16acd213d..f34e24c53a4 100644
--- a/pretty.h
+++ b/pretty.h
@@ -2,6 +2,7 @@
 #define PRETTY_H
 
 #include "cache.h"
+#include "date.h"
 #include "string-list.h"
 
 struct commit;
@@ -163,4 +164,13 @@ int format_set_trailers_options(struct process_trailer_options *opts,
 			const char **arg,
 			char **invalid_arg);
 
+/*
+ * Like show_date, but pull the timestamp and tz parameters from
+ * the ident_split. It will also sanity-check the values and produce
+ * a well-known sentinel date if they appear bogus.
+ */
+const char *show_ident_date(const struct ident_split *id,
+			    const struct date_mode *mode);
+
+
 #endif /* PRETTY_H */
diff --git a/refs.c b/refs.c
index addb26293b4..33ed3732d1b 100644
--- a/refs.c
+++ b/refs.c
@@ -19,6 +19,7 @@
 #include "strvec.h"
 #include "repository.h"
 #include "sigchain.h"
+#include "date.h"
 
 /*
  * List of all available backends
diff --git a/strbuf.c b/strbuf.c
index 613fee8c82e..00abeb55afd 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -2,6 +2,7 @@
 #include "refs.h"
 #include "string-list.h"
 #include "utf8.h"
+#include "date.h"
 
 int starts_with(const char *str, const char *prefix)
 {
diff --git a/t/helper/test-date.c b/t/helper/test-date.c
index 099eff4f0fc..ded3d059f56 100644
--- a/t/helper/test-date.c
+++ b/t/helper/test-date.c
@@ -1,5 +1,6 @@
 #include "test-tool.h"
 #include "cache.h"
+#include "date.h"
 
 static const char *usage_msg = "\n"
 "  test-tool date relative [time_t]...\n"
-- 
2.35.0.913.g12b4baa2536


* [PATCH 3/5] date API: provide and use a DATE_MODE_INIT
  2022-02-02 21:03                           ` [PATCH 0/5] date.[ch] API: split from cache.h, add API docs, stop leaking memory Ævar Arnfjörð Bjarmason
  2022-02-02 21:03                             ` [PATCH 1/5] cache.h: remove always unused show_date_human() declaration Ævar Arnfjörð Bjarmason
  2022-02-02 21:03                             ` [PATCH 2/5] date API: create a date.h, split from cache.h Ævar Arnfjörð Bjarmason
@ 2022-02-02 21:03                             ` Ævar Arnfjörð Bjarmason
  2022-02-02 21:03                             ` [PATCH 4/5] date API: add basic API docs Ævar Arnfjörð Bjarmason
                                               ` (2 subsequent siblings)
  5 siblings, 0 replies; 125+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-02-02 21:03 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Ævar Arnfjörð Bjarmason

Provide and use a DATE_MODE_INIT macro. Most of the users of "struct
date_mode" use it via pretty.h's "struct pretty_print_context", which
doesn't have an initialization macro, so those users are still bound
to being initialized to "{ 0 }" by default.

But we can change the couple of callers that directly declared a
variable on the stack to instead use the initializer, and thus do away
with the "mode.local = 0" added in add00ba2de9 (date: make "local"
orthogonal to date format, 2015-09-03).
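
As an illustration of why an initializer macro of this shape needs no
separate "local = 0" assignment at initialization: in C, members that a
designated initializer does not name are zero-initialized. The sketch
below uses hypothetical "sketch_" names that only mirror the layout
shown in the patch; it is not git's code:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins mirroring the layout in the patch. */
enum sketch_date_type {
	SKETCH_DATE_NORMAL = 0,
	SKETCH_DATE_RELATIVE
};

struct sketch_date_mode {
	enum sketch_date_type type;
	const char *strftime_fmt;
	int local;
};

/* Only .type is named; the other members start out as NULL and 0. */
#define SKETCH_DATE_MODE_INIT { \
	.type = SKETCH_DATE_NORMAL, \
}

/* Same shape as a helper that hands out a shared static struct. */
static struct sketch_date_mode *sketch_mode_from_type(enum sketch_date_type type)
{
	static struct sketch_date_mode mode = SKETCH_DATE_MODE_INIT;

	mode.type = type;
	return &mode;
}

/* Returns 1 when every member not named in the initializer is zero. */
static int sketch_init_zeroes_rest(void)
{
	struct sketch_date_mode on_stack = SKETCH_DATE_MODE_INIT;
	struct sketch_date_mode *shared = sketch_mode_from_type(SKETCH_DATE_RELATIVE);

	assert(shared->type == SKETCH_DATE_RELATIVE);
	return on_stack.type == SKETCH_DATE_NORMAL &&
	       on_stack.strftime_fmt == NULL &&
	       on_stack.local == 0 &&
	       shared->local == 0;
}
```

Note that the sketch only shows the state right after initialization; a
caller that writes through the returned pointer could still change the
shared struct afterwards.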

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 date.c               | 3 +--
 date.h               | 4 ++++
 ref-filter.c         | 2 +-
 t/helper/test-date.c | 2 +-
 4 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/date.c b/date.c
index 863b07e9e63..54c709e4a08 100644
--- a/date.c
+++ b/date.c
@@ -206,11 +206,10 @@ void show_date_relative(timestamp_t time, struct strbuf *timebuf)
 
 struct date_mode *date_mode_from_type(enum date_mode_type type)
 {
-	static struct date_mode mode;
+	static struct date_mode mode = DATE_MODE_INIT;
 	if (type == DATE_STRFTIME)
 		BUG("cannot create anonymous strftime date_mode struct");
 	mode.type = type;
-	mode.local = 0;
 	return &mode;
 }
 
diff --git a/date.h b/date.h
index 5db9ec8dd29..c3a00d08ed6 100644
--- a/date.h
+++ b/date.h
@@ -20,6 +20,10 @@ struct date_mode {
 	int local;
 };
 
+#define DATE_MODE_INIT { \
+	.type = DATE_NORMAL, \
+}
+
 /*
  * Convenience helper for passing a constant type, like:
  *
diff --git a/ref-filter.c b/ref-filter.c
index f7a2f17bfd9..3399bde932f 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -1251,7 +1251,7 @@ static void grab_date(const char *buf, struct atom_value *v, const char *atomnam
 	char *zone;
 	timestamp_t timestamp;
 	long tz;
-	struct date_mode date_mode = { DATE_NORMAL };
+	struct date_mode date_mode = DATE_MODE_INIT;
 	const char *formatp;
 
 	/*
diff --git a/t/helper/test-date.c b/t/helper/test-date.c
index ded3d059f56..111071e1dd1 100644
--- a/t/helper/test-date.c
+++ b/t/helper/test-date.c
@@ -35,7 +35,7 @@ static void show_human_dates(const char **argv)
 
 static void show_dates(const char **argv, const char *format)
 {
-	struct date_mode mode;
+	struct date_mode mode = DATE_MODE_INIT;
 
 	parse_date_format(format, &mode);
 	for (; *argv; argv++) {
-- 
2.35.0.913.g12b4baa2536


* [PATCH 4/5] date API: add basic API docs
  2022-02-02 21:03                           ` [PATCH 0/5] date.[ch] API: split from cache.h, add API docs, stop leaking memory Ævar Arnfjörð Bjarmason
                                               ` (2 preceding siblings ...)
  2022-02-02 21:03                             ` [PATCH 3/5] date API: provide and use a DATE_MODE_INIT Ævar Arnfjörð Bjarmason
@ 2022-02-02 21:03                             ` Ævar Arnfjörð Bjarmason
  2022-02-15  2:14                               ` Junio C Hamano
  2022-02-02 21:03                             ` [PATCH 5/5] date API: add and use a date_mode_release() Ævar Arnfjörð Bjarmason
  2022-02-04 23:53                             ` [PATCH v2 0/5] date.[ch] API: split from cache.h, add API docs, stop leaking memory Ævar Arnfjörð Bjarmason
  5 siblings, 1 reply; 125+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-02-02 21:03 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Ævar Arnfjörð Bjarmason

Add basic API doc comments to date.h, and while doing so move the
parse_date_format() function adjacent to show_date(). This way all the
"struct date_mode" functions are grouped together. Documenting the
rest is one of our #leftoverbits.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 date.h | 23 +++++++++++++++++++++--
 1 file changed, 21 insertions(+), 2 deletions(-)

diff --git a/date.h b/date.h
index c3a00d08ed6..4ed83506de9 100644
--- a/date.h
+++ b/date.h
@@ -1,6 +1,12 @@
 #ifndef DATE_H
 #define DATE_H
 
+/**
+ * The date mode type. This has DATE_NORMAL at an explicit "= 0" to
+ * accommodate a memset([...], 0, [...]) initialization when "struct
+ * date_mode" is used as an embedded struct member, as in the case of
+ * e.g. "struct pretty_print_context" and "struct rev_info".
+ */
 enum date_mode_type {
 	DATE_NORMAL = 0,
 	DATE_HUMAN,
@@ -24,7 +30,7 @@ struct date_mode {
 	.type = DATE_NORMAL, \
 }
 
-/*
+/**
  * Convenience helper for passing a constant type, like:
  *
  *   show_date(t, tz, DATE_MODE(NORMAL));
@@ -32,7 +38,21 @@ struct date_mode {
 #define DATE_MODE(t) date_mode_from_type(DATE_##t)
 struct date_mode *date_mode_from_type(enum date_mode_type type);
 
+/**
+ * Show the date given an initialized "struct date_mode" (usually from
+ * the DATE_MODE() macro).
+ */
 const char *show_date(timestamp_t time, int timezone, const struct date_mode *mode);
+
+/**
+ * Parse a date format for later use with show_date().
+ *
+ * When the "date_mode_type" is DATE_STRFTIME the "strftime_fmt"
+ * member of "struct date_mode" will be a malloc()'d format string to
+ * be used with strbuf_addftime().
+ */
+void parse_date_format(const char *format, struct date_mode *mode);
+
 void show_date_relative(timestamp_t time, struct strbuf *timebuf);
 int parse_date(const char *date, struct strbuf *out);
 int parse_date_basic(const char *date, timestamp_t *timestamp, int *offset);
@@ -41,7 +61,6 @@ void datestamp(struct strbuf *out);
 #define approxidate(s) approxidate_careful((s), NULL)
 timestamp_t approxidate_careful(const char *, int *);
 timestamp_t approxidate_relative(const char *date);
-void parse_date_format(const char *format, struct date_mode *mode);
 int date_overflows(timestamp_t date);
 time_t tm_to_time_t(const struct tm *tm);
 #endif
-- 
2.35.0.913.g12b4baa2536


* [PATCH 5/5] date API: add and use a date_mode_release()
  2022-02-02 21:03                           ` [PATCH 0/5] date.[ch] API: split from cache.h, add API docs, stop leaking memory Ævar Arnfjörð Bjarmason
                                               ` (3 preceding siblings ...)
  2022-02-02 21:03                             ` [PATCH 4/5] date API: add basic API docs Ævar Arnfjörð Bjarmason
@ 2022-02-02 21:03                             ` Ævar Arnfjörð Bjarmason
  2022-02-15  0:28                               ` Junio C Hamano
  2022-02-04 23:53                             ` [PATCH v2 0/5] date.[ch] API: split from cache.h, add API docs, stop leaking memory Ævar Arnfjörð Bjarmason
  5 siblings, 1 reply; 125+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-02-02 21:03 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Ævar Arnfjörð Bjarmason

Fix a memory leak in the parse_date_format() function by providing a
new date_mode_release() companion function.

By using this in "t/helper/test-date.c" we can mark the
"t0006-date.sh" test as passing when git is compiled with
SANITIZE=leak, and whitelist it to run under
"GIT_TEST_PASSING_SANITIZE_LEAK=true" by adding
"TEST_PASSES_SANITIZE_LEAK=true" to the test itself.

The other tests that expose this memory leak (i.e. take the
"mode->type == DATE_STRFTIME" branch in parse_date_format()) are
"t6300-for-each-ref.sh" and "t7004-tag.sh". The former is due to an
easily fixed leak in "ref-filter.c", and brings the failures in
"t6300-for-each-ref.sh" down from 51 to 48.

Fixing the remaining leaks will have to wait until there's a
release_revisions() in "revision.c", as they have to do with leaks via
"struct rev_info".
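
The parse, use, release lifecycle described above can be sketched as
follows. This is a simplified illustration under stated assumptions:
the "sketch_" functions are hypothetical, the parser only understands a
"format:" prefix, and the standard strftime() stands in for git's own
formatting code:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical stand-ins, not git's real definitions. */
enum sketch_date_type {
	SKETCH_DATE_NORMAL = 0,
	SKETCH_DATE_STRFTIME
};

struct sketch_date_mode {
	enum sketch_date_type type;
	char *strftime_fmt;
};

/* Parse "format:<strftime string>"; the copy is owned by "mode". */
static int sketch_parse_date_format(const char *format,
				    struct sketch_date_mode *mode)
{
	static const char prefix[] = "format:";
	size_t len = sizeof(prefix) - 1;
	char *copy;

	if (strncmp(format, prefix, len))
		return -1;
	copy = malloc(strlen(format + len) + 1);
	if (!copy)
		return -1;
	strcpy(copy, format + len);
	mode->type = SKETCH_DATE_STRFTIME;
	mode->strftime_fmt = copy;
	return 0;
}

/* Companion to the parser: frees the copy made above. */
static void sketch_date_mode_release(struct sketch_date_mode *mode)
{
	free(mode->strftime_fmt);
	mode->strftime_fmt = NULL;
}

/* Format "t" (seconds since the epoch, as UTC) into "buf". */
static size_t sketch_show_date(time_t t, const struct sketch_date_mode *mode,
			       char *buf, size_t size)
{
	struct tm *tm = gmtime(&t);

	if (!tm || mode->type != SKETCH_DATE_STRFTIME)
		return 0;
	return strftime(buf, size, mode->strftime_fmt, tm);
}

/* Returns 0 on success, with the formatted epoch written to "out". */
static int sketch_lifecycle_demo(char *out, size_t size)
{
	struct sketch_date_mode mode = { 0 };
	size_t n;

	if (sketch_parse_date_format("format:%Y-%m-%d", &mode) < 0)
		return -1;
	n = sketch_show_date((time_t)0, &mode, out, size);
	sketch_date_mode_release(&mode); /* without this the copy leaks */
	assert(mode.strftime_fmt == NULL);
	return n ? 0 : -1;
}
```

The release call sits after the last use of the mode, so a caller that
formats many dates in a loop frees the format string once, after the
loop.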

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 date.c               | 5 +++++
 date.h               | 9 ++++++++-
 ref-filter.c         | 1 +
 t/helper/test-date.c | 2 ++
 t/t0006-date.sh      | 2 ++
 5 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/date.c b/date.c
index 54c709e4a08..68a260c214d 100644
--- a/date.c
+++ b/date.c
@@ -993,6 +993,11 @@ void parse_date_format(const char *format, struct date_mode *mode)
 		die("unknown date format %s", format);
 }
 
+void date_mode_release(struct date_mode *mode)
+{
+	free((char *)mode->strftime_fmt);
+}
+
 void datestamp(struct strbuf *out)
 {
 	time_t now;
diff --git a/date.h b/date.h
index 4ed83506de9..bfcd4eb458c 100644
--- a/date.h
+++ b/date.h
@@ -49,10 +49,17 @@ const char *show_date(timestamp_t time, int timezone, const struct date_mode *mo
  *
  * When the "date_mode_type" is DATE_STRFTIME the "strftime_fmt"
  * member of "struct date_mode" will be a malloc()'d format string to
- * be used with strbuf_addftime().
+ * be used with strbuf_addftime(), in which case you'll need to call
+ * date_mode_release() later.
  */
 void parse_date_format(const char *format, struct date_mode *mode);
 
+/**
+ * Release a "struct date_mode", currently only required if
+ * parse_date_format() has parsed a "DATE_STRFTIME" format.
+ */
+void date_mode_release(struct date_mode *mode);
+
 void show_date_relative(timestamp_t time, struct strbuf *timebuf);
 int parse_date(const char *date, struct strbuf *out);
 int parse_date_basic(const char *date, timestamp_t *timestamp, int *offset);
diff --git a/ref-filter.c b/ref-filter.c
index 3399bde932f..7838bd22b8d 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -1276,6 +1276,7 @@ static void grab_date(const char *buf, struct atom_value *v, const char *atomnam
 		goto bad;
 	v->s = xstrdup(show_date(timestamp, tz, &date_mode));
 	v->value = timestamp;
+	date_mode_release(&date_mode);
 	return;
  bad:
 	v->s = xstrdup("");
diff --git a/t/helper/test-date.c b/t/helper/test-date.c
index 111071e1dd1..45951b1df87 100644
--- a/t/helper/test-date.c
+++ b/t/helper/test-date.c
@@ -54,6 +54,8 @@ static void show_dates(const char **argv, const char *format)
 
 		printf("%s -> %s\n", *argv, show_date(t, tz, &mode));
 	}
+
+	date_mode_release(&mode);
 }
 
 static void parse_dates(const char **argv)
diff --git a/t/t0006-date.sh b/t/t0006-date.sh
index 794186961ee..2490162071e 100755
--- a/t/t0006-date.sh
+++ b/t/t0006-date.sh
@@ -1,6 +1,8 @@
 #!/bin/sh
 
 test_description='test date parsing and printing'
+
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 # arbitrary reference time: 2009-08-30 19:20:00
-- 
2.35.0.913.g12b4baa2536


* Re: [PATCH 2/5] date API: create a date.h, split from cache.h
  2022-02-02 21:03                             ` [PATCH 2/5] date API: create a date.h, split from cache.h Ævar Arnfjörð Bjarmason
@ 2022-02-02 21:19                               ` Ævar Arnfjörð Bjarmason
  2022-02-15  3:04                               ` Junio C Hamano
  1 sibling, 0 replies; 125+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-02-02 21:19 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Ævar Arnfjörð Bjarmason


On Wed, Feb 02 2022, Ævar Arnfjörð Bjarmason wrote:

> Move the declaration of the date.c functions from cache.h, and adjust
> the relevant users to include the new date.h header.
>
> The show_ident_date() function belonged in pretty.h (it's defined in
> pretty.c), its two users outside of pretty.c didn't strictly need to
> include pretty.h, as they get it indirectly, but let's add it to them
> anyway.
>
> Similarly, the change to "builtin/{fast-import,show-branch,tag}.c"
> isn't needed as far as the compiler is concerned, but since they all
> use the "DATE_MODE()" macro we now define in date.h, let's have them
> include it.
>
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  archive-zip.c         |  1 +
>  builtin/am.c          |  1 +
>  builtin/commit.c      |  1 +
>  builtin/fast-import.c |  1 +
>  builtin/show-branch.c |  1 +
>  builtin/tag.c         |  1 +
>  cache.h               | 48 -------------------------------------------
>  config.c              |  1 +
>  date.c                |  1 +
>  date.h                | 43 ++++++++++++++++++++++++++++++++++++++
>  http-backend.c        |  1 +
>  ident.c               |  1 +
>  object-name.c         |  1 +
>  pretty.h              | 10 +++++++++
>  refs.c                |  1 +
>  strbuf.c              |  1 +
>  t/helper/test-date.c  |  1 +
>  17 files changed, 67 insertions(+), 48 deletions(-)
>  create mode 100644 date.h

I managed to notice just after hitting "send" that I'd forgotten to
"make hdr-check". This commit will need the below fix-up. I'll hold off
on a v2 for now for any further comments though:

diff --git a/reflog-walk.h b/reflog-walk.h
index f26408f6cc1..e9e00ffd479 100644
--- a/reflog-walk.h
+++ b/reflog-walk.h
@@ -5,6 +5,7 @@
 
 struct commit;
 struct reflog_walk_info;
+struct date_mode;
 
 void init_reflog_walk(struct reflog_walk_info **info);
 int add_reflog_for_walk(struct reflog_walk_info *info,
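
For readers unfamiliar with why a one-line forward declaration is
enough here: a header that only passes pointers to a struct does not
need the full definition, just a prior declaration of the tag. The
sketch below shows the general C technique with hypothetical "sketch_"
names; it makes no claim about which prototypes in reflog-walk.h
actually triggered the "make hdr-check" failure:

```c
#include <assert.h>
#include <stddef.h>

/*
 * "Header" part: an incomplete type is enough for pointer parameters,
 * so the header compiles on its own without the full definition.
 */
struct sketch_date_mode;
int sketch_mode_is_local(const struct sketch_date_mode *mode);

/* "Implementation" part: the full definition is only needed here. */
struct sketch_date_mode {
	int type;
	int local;
};

int sketch_mode_is_local(const struct sketch_date_mode *mode)
{
	assert(mode != NULL);
	return mode->local != 0;
}
```

Without the forward declaration, a struct tag that first appears inside
a parameter list is scoped to that prototype alone, which compilers
typically warn about.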

* [PATCH v2 0/5] date.[ch] API: split from cache.h, add API docs, stop leaking memory
  2022-02-02 21:03                           ` [PATCH 0/5] date.[ch] API: split from cache.h, add API docs, stop leaking memory Ævar Arnfjörð Bjarmason
                                               ` (4 preceding siblings ...)
  2022-02-02 21:03                             ` [PATCH 5/5] date API: add and use a date_mode_release() Ævar Arnfjörð Bjarmason
@ 2022-02-04 23:53                             ` Ævar Arnfjörð Bjarmason
  2022-02-04 23:53                               ` [PATCH v2 1/5] cache.h: remove always unused show_date_human() declaration Ævar Arnfjörð Bjarmason
                                                 ` (6 more replies)
  5 siblings, 7 replies; 125+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-02-04 23:53 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Ævar Arnfjörð Bjarmason

Fix memory leaks in the date.[ch] API, in preparation for larger
changes to make the revision walking API stop leaking memory.

This is a trivial re-roll of v1 to fix an issue that "make hdr-check"
spotted. For v1 see:
https://lore.kernel.org/git/cover-0.5-00000000000-20220202T195651Z-avarab@gmail.com/

Ævar Arnfjörð Bjarmason (5):
  cache.h: remove always unused show_date_human() declaration
  date API: create a date.h, split from cache.h
  date API: provide and use a DATE_MODE_INIT
  date API: add basic API docs
  date API: add and use a date_mode_release()

 archive-zip.c         |  1 +
 builtin/am.c          |  1 +
 builtin/commit.c      |  1 +
 builtin/fast-import.c |  1 +
 builtin/show-branch.c |  1 +
 builtin/tag.c         |  1 +
 cache.h               | 50 -----------------------------
 config.c              |  1 +
 date.c                |  9 ++++--
 date.h                | 73 +++++++++++++++++++++++++++++++++++++++++++
 http-backend.c        |  1 +
 ident.c               |  1 +
 object-name.c         |  1 +
 pretty.h              | 10 ++++++
 ref-filter.c          |  3 +-
 reflog-walk.h         |  1 +
 refs.c                |  1 +
 strbuf.c              |  1 +
 t/helper/test-date.c  |  5 ++-
 t/t0006-date.sh       |  2 ++
 20 files changed, 111 insertions(+), 54 deletions(-)
 create mode 100644 date.h

Range-diff against v1:
1:  fb21bd7b2c5 = 1:  fb21bd7b2c5 cache.h: remove always unused show_date_human() declaration
2:  7de62956db4 ! 2:  96c904d0b9a date API: create a date.h, split from cache.h
    @@ pretty.h: int format_set_trailers_options(struct process_trailer_options *opts,
     +
      #endif /* PRETTY_H */
     
    + ## reflog-walk.h ##
    +@@
    + 
    + struct commit;
    + struct reflog_walk_info;
    ++struct date_mode;
    + 
    + void init_reflog_walk(struct reflog_walk_info **info);
    + int add_reflog_for_walk(struct reflog_walk_info *info,
    +
      ## refs.c ##
     @@
      #include "strvec.h"
3:  2d5210f9421 = 3:  9ef003a83bd date API: provide and use a DATE_MODE_INIT
4:  aab2ae9cc72 = 4:  3f70b1aa4c5 date API: add basic API docs
5:  b67e23549ed = 5:  60dbadacb16 date API: add and use a date_mode_release()
-- 
2.35.1.940.ge7a5b4b05f2


^ permalink raw reply	[flat|nested] 125+ messages in thread

* [PATCH v2 1/5] cache.h: remove always unused show_date_human() declaration
  2022-02-04 23:53                             ` [PATCH v2 0/5] date.[ch] API: split from cache.h, add API docs, stop leaking memory Ævar Arnfjörð Bjarmason
@ 2022-02-04 23:53                               ` Ævar Arnfjörð Bjarmason
  2022-02-04 23:53                               ` [PATCH v2 2/5] date API: create a date.h, split from cache.h Ævar Arnfjörð Bjarmason
                                                 ` (5 subsequent siblings)
  6 siblings, 0 replies; 125+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-02-04 23:53 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Ævar Arnfjörð Bjarmason

There has never been a show_date_human() function on the "master"
branch in git.git. This declaration was added in b841d4ff438 (Add
`human` format to test-tool, 2019-01-28).

A look at the ML history reveals that it was leftover cruft from an
earlier version of that commit[1].

1. https://lore.kernel.org/git/20190118061805.19086-5-ischis2@cox.net/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 cache.h | 2 --
 1 file changed, 2 deletions(-)

diff --git a/cache.h b/cache.h
index 281f00ab1b1..49b46244c74 100644
--- a/cache.h
+++ b/cache.h
@@ -1586,8 +1586,6 @@ struct date_mode *date_mode_from_type(enum date_mode_type type);
 
 const char *show_date(timestamp_t time, int timezone, const struct date_mode *mode);
 void show_date_relative(timestamp_t time, struct strbuf *timebuf);
-void show_date_human(timestamp_t time, int tz, const struct timeval *now,
-			struct strbuf *timebuf);
 int parse_date(const char *date, struct strbuf *out);
 int parse_date_basic(const char *date, timestamp_t *timestamp, int *offset);
 int parse_expiry_date(const char *date, timestamp_t *timestamp);
-- 
2.35.1.940.ge7a5b4b05f2


^ permalink raw reply related	[flat|nested] 125+ messages in thread

* [PATCH v2 2/5] date API: create a date.h, split from cache.h
  2022-02-04 23:53                             ` [PATCH v2 0/5] date.[ch] API: split from cache.h, add API docs, stop leaking memory Ævar Arnfjörð Bjarmason
  2022-02-04 23:53                               ` [PATCH v2 1/5] cache.h: remove always unused show_date_human() declaration Ævar Arnfjörð Bjarmason
@ 2022-02-04 23:53                               ` Ævar Arnfjörð Bjarmason
  2022-02-04 23:53                               ` [PATCH v2 3/5] date API: provide and use a DATE_MODE_INIT Ævar Arnfjörð Bjarmason
                                                 ` (4 subsequent siblings)
  6 siblings, 0 replies; 125+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-02-04 23:53 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Ævar Arnfjörð Bjarmason

Move the declaration of the date.c functions from cache.h, and adjust
the relevant users to include the new date.h header.

The show_ident_date() function belonged in pretty.h (it's defined in
pretty.c). Its two users outside of pretty.c didn't strictly need to
include pretty.h, as they get it indirectly, but let's add it to them
anyway.

Similarly, the change to "builtin/{fast-import,show-branch,tag}.c"
isn't needed as far as the compiler is concerned, but since they all
use the "DATE_MODE()" macro we now define in date.h, let's have them
include it.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 archive-zip.c         |  1 +
 builtin/am.c          |  1 +
 builtin/commit.c      |  1 +
 builtin/fast-import.c |  1 +
 builtin/show-branch.c |  1 +
 builtin/tag.c         |  1 +
 cache.h               | 48 -------------------------------------------
 config.c              |  1 +
 date.c                |  1 +
 date.h                | 43 ++++++++++++++++++++++++++++++++++++++
 http-backend.c        |  1 +
 ident.c               |  1 +
 object-name.c         |  1 +
 pretty.h              | 10 +++++++++
 reflog-walk.h         |  1 +
 refs.c                |  1 +
 strbuf.c              |  1 +
 t/helper/test-date.c  |  1 +
 18 files changed, 68 insertions(+), 48 deletions(-)
 create mode 100644 date.h

diff --git a/archive-zip.c b/archive-zip.c
index 2961e01c754..8ea9d1a5dae 100644
--- a/archive-zip.c
+++ b/archive-zip.c
@@ -9,6 +9,7 @@
 #include "object-store.h"
 #include "userdiff.h"
 #include "xdiff-interface.h"
+#include "date.h"
 
 static int zip_date;
 static int zip_time;
diff --git a/builtin/am.c b/builtin/am.c
index b6be1f1cb11..cc8cd6d6e4b 100644
--- a/builtin/am.c
+++ b/builtin/am.c
@@ -34,6 +34,7 @@
 #include "string-list.h"
 #include "packfile.h"
 #include "repository.h"
+#include "pretty.h"
 
 /**
  * Returns the length of the first line of msg.
diff --git a/builtin/commit.c b/builtin/commit.c
index b9ed0374e30..6b99ac276d8 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -37,6 +37,7 @@
 #include "help.h"
 #include "commit-reach.h"
 #include "commit-graph.h"
+#include "pretty.h"
 
 static const char * const builtin_commit_usage[] = {
 	N_("git commit [<options>] [--] <pathspec>..."),
diff --git a/builtin/fast-import.c b/builtin/fast-import.c
index 2b2e28bad79..28f2b9cc91f 100644
--- a/builtin/fast-import.c
+++ b/builtin/fast-import.c
@@ -19,6 +19,7 @@
 #include "mem-pool.h"
 #include "commit-reach.h"
 #include "khash.h"
+#include "date.h"
 
 #define PACK_ID_BITS 16
 #define MAX_PACK_ID ((1<<PACK_ID_BITS)-1)
diff --git a/builtin/show-branch.c b/builtin/show-branch.c
index e12c5e80e3e..330b0553b9d 100644
--- a/builtin/show-branch.c
+++ b/builtin/show-branch.c
@@ -8,6 +8,7 @@
 #include "parse-options.h"
 #include "dir.h"
 #include "commit-slab.h"
+#include "date.h"
 
 static const char* show_branch_usage[] = {
     N_("git show-branch [-a | --all] [-r | --remotes] [--topo-order | --date-order]\n"
diff --git a/builtin/tag.c b/builtin/tag.c
index 134b3f1edf0..2479da07049 100644
--- a/builtin/tag.c
+++ b/builtin/tag.c
@@ -20,6 +20,7 @@
 #include "oid-array.h"
 #include "column.h"
 #include "ref-filter.h"
+#include "date.h"
 
 static const char * const git_tag_usage[] = {
 	N_("git tag [-a | -s | -u <key-id>] [-f] [-m <msg> | -F <file>]\n"
diff --git a/cache.h b/cache.h
index 49b46244c74..6add78fd701 100644
--- a/cache.h
+++ b/cache.h
@@ -1557,46 +1557,6 @@ struct object *repo_peel_to_type(struct repository *r,
 #define peel_to_type(name, namelen, obj, type) \
 	repo_peel_to_type(the_repository, name, namelen, obj, type)
 
-enum date_mode_type {
-	DATE_NORMAL = 0,
-	DATE_HUMAN,
-	DATE_RELATIVE,
-	DATE_SHORT,
-	DATE_ISO8601,
-	DATE_ISO8601_STRICT,
-	DATE_RFC2822,
-	DATE_STRFTIME,
-	DATE_RAW,
-	DATE_UNIX
-};
-
-struct date_mode {
-	enum date_mode_type type;
-	const char *strftime_fmt;
-	int local;
-};
-
-/*
- * Convenience helper for passing a constant type, like:
- *
- *   show_date(t, tz, DATE_MODE(NORMAL));
- */
-#define DATE_MODE(t) date_mode_from_type(DATE_##t)
-struct date_mode *date_mode_from_type(enum date_mode_type type);
-
-const char *show_date(timestamp_t time, int timezone, const struct date_mode *mode);
-void show_date_relative(timestamp_t time, struct strbuf *timebuf);
-int parse_date(const char *date, struct strbuf *out);
-int parse_date_basic(const char *date, timestamp_t *timestamp, int *offset);
-int parse_expiry_date(const char *date, timestamp_t *timestamp);
-void datestamp(struct strbuf *out);
-#define approxidate(s) approxidate_careful((s), NULL)
-timestamp_t approxidate_careful(const char *, int *);
-timestamp_t approxidate_relative(const char *date);
-void parse_date_format(const char *format, struct date_mode *mode);
-int date_overflows(timestamp_t date);
-time_t tm_to_time_t(const struct tm *tm);
-
 #define IDENT_STRICT	       1
 #define IDENT_NO_DATE	       2
 #define IDENT_NO_NAME	       4
@@ -1642,14 +1602,6 @@ struct ident_split {
  */
 int split_ident_line(struct ident_split *, const char *, int);
 
-/*
- * Like show_date, but pull the timestamp and tz parameters from
- * the ident_split. It will also sanity-check the values and produce
- * a well-known sentinel date if they appear bogus.
- */
-const char *show_ident_date(const struct ident_split *id,
-			    const struct date_mode *mode);
-
 /*
  * Compare split idents for equality or strict ordering. Note that we
  * compare only the ident part of the line, ignoring any timestamp.
diff --git a/config.c b/config.c
index 2bffa8d4a01..9c9dc8a6f62 100644
--- a/config.c
+++ b/config.c
@@ -6,6 +6,7 @@
  *
  */
 #include "cache.h"
+#include "date.h"
 #include "branch.h"
 #include "config.h"
 #include "environment.h"
diff --git a/date.c b/date.c
index 84bb4451c1a..863b07e9e63 100644
--- a/date.c
+++ b/date.c
@@ -5,6 +5,7 @@
  */
 
 #include "cache.h"
+#include "date.h"
 
 /*
  * This is like mktime, but without normalization of tm_wday and tm_yday.
diff --git a/date.h b/date.h
new file mode 100644
index 00000000000..5db9ec8dd29
--- /dev/null
+++ b/date.h
@@ -0,0 +1,43 @@
+#ifndef DATE_H
+#define DATE_H
+
+enum date_mode_type {
+	DATE_NORMAL = 0,
+	DATE_HUMAN,
+	DATE_RELATIVE,
+	DATE_SHORT,
+	DATE_ISO8601,
+	DATE_ISO8601_STRICT,
+	DATE_RFC2822,
+	DATE_STRFTIME,
+	DATE_RAW,
+	DATE_UNIX
+};
+
+struct date_mode {
+	enum date_mode_type type;
+	const char *strftime_fmt;
+	int local;
+};
+
+/*
+ * Convenience helper for passing a constant type, like:
+ *
+ *   show_date(t, tz, DATE_MODE(NORMAL));
+ */
+#define DATE_MODE(t) date_mode_from_type(DATE_##t)
+struct date_mode *date_mode_from_type(enum date_mode_type type);
+
+const char *show_date(timestamp_t time, int timezone, const struct date_mode *mode);
+void show_date_relative(timestamp_t time, struct strbuf *timebuf);
+int parse_date(const char *date, struct strbuf *out);
+int parse_date_basic(const char *date, timestamp_t *timestamp, int *offset);
+int parse_expiry_date(const char *date, timestamp_t *timestamp);
+void datestamp(struct strbuf *out);
+#define approxidate(s) approxidate_careful((s), NULL)
+timestamp_t approxidate_careful(const char *, int *);
+timestamp_t approxidate_relative(const char *date);
+void parse_date_format(const char *format, struct date_mode *mode);
+int date_overflows(timestamp_t date);
+time_t tm_to_time_t(const struct tm *tm);
+#endif
diff --git a/http-backend.c b/http-backend.c
index 807fb8839e7..81a7229ece0 100644
--- a/http-backend.c
+++ b/http-backend.c
@@ -13,6 +13,7 @@
 #include "packfile.h"
 #include "object-store.h"
 #include "protocol.h"
+#include "date.h"
 
 static const char content_type[] = "Content-Type";
 static const char content_length[] = "Content-Length";
diff --git a/ident.c b/ident.c
index 6aba4b5cb6f..89ca5b47008 100644
--- a/ident.c
+++ b/ident.c
@@ -7,6 +7,7 @@
  */
 #include "cache.h"
 #include "config.h"
+#include "date.h"
 
 static struct strbuf git_default_name = STRBUF_INIT;
 static struct strbuf git_default_email = STRBUF_INIT;
diff --git a/object-name.c b/object-name.c
index fdff4601b2c..f9527817b64 100644
--- a/object-name.c
+++ b/object-name.c
@@ -15,6 +15,7 @@
 #include "submodule.h"
 #include "midx.h"
 #include "commit-reach.h"
+#include "date.h"
 
 static int get_oid_oneline(struct repository *r, const char *, struct object_id *, struct commit_list *);
 
diff --git a/pretty.h b/pretty.h
index 2f16acd213d..f34e24c53a4 100644
--- a/pretty.h
+++ b/pretty.h
@@ -2,6 +2,7 @@
 #define PRETTY_H
 
 #include "cache.h"
+#include "date.h"
 #include "string-list.h"
 
 struct commit;
@@ -163,4 +164,13 @@ int format_set_trailers_options(struct process_trailer_options *opts,
 			const char **arg,
 			char **invalid_arg);
 
+/*
+ * Like show_date, but pull the timestamp and tz parameters from
+ * the ident_split. It will also sanity-check the values and produce
+ * a well-known sentinel date if they appear bogus.
+ */
+const char *show_ident_date(const struct ident_split *id,
+			    const struct date_mode *mode);
+
+
 #endif /* PRETTY_H */
diff --git a/reflog-walk.h b/reflog-walk.h
index f26408f6cc1..e9e00ffd479 100644
--- a/reflog-walk.h
+++ b/reflog-walk.h
@@ -5,6 +5,7 @@
 
 struct commit;
 struct reflog_walk_info;
+struct date_mode;
 
 void init_reflog_walk(struct reflog_walk_info **info);
 int add_reflog_for_walk(struct reflog_walk_info *info,
diff --git a/refs.c b/refs.c
index addb26293b4..33ed3732d1b 100644
--- a/refs.c
+++ b/refs.c
@@ -19,6 +19,7 @@
 #include "strvec.h"
 #include "repository.h"
 #include "sigchain.h"
+#include "date.h"
 
 /*
  * List of all available backends
diff --git a/strbuf.c b/strbuf.c
index 613fee8c82e..00abeb55afd 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -2,6 +2,7 @@
 #include "refs.h"
 #include "string-list.h"
 #include "utf8.h"
+#include "date.h"
 
 int starts_with(const char *str, const char *prefix)
 {
diff --git a/t/helper/test-date.c b/t/helper/test-date.c
index 099eff4f0fc..ded3d059f56 100644
--- a/t/helper/test-date.c
+++ b/t/helper/test-date.c
@@ -1,5 +1,6 @@
 #include "test-tool.h"
 #include "cache.h"
+#include "date.h"
 
 static const char *usage_msg = "\n"
 "  test-tool date relative [time_t]...\n"
-- 
2.35.1.940.ge7a5b4b05f2


^ permalink raw reply related	[flat|nested] 125+ messages in thread

* [PATCH v2 3/5] date API: provide and use a DATE_MODE_INIT
  2022-02-04 23:53                             ` [PATCH v2 0/5] date.[ch] API: split from cache.h, add API docs, stop leaking memory Ævar Arnfjörð Bjarmason
  2022-02-04 23:53                               ` [PATCH v2 1/5] cache.h: remove always unused show_date_human() declaration Ævar Arnfjörð Bjarmason
  2022-02-04 23:53                               ` [PATCH v2 2/5] date API: create a date.h, split from cache.h Ævar Arnfjörð Bjarmason
@ 2022-02-04 23:53                               ` Ævar Arnfjörð Bjarmason
  2022-02-04 23:53                               ` [PATCH v2 4/5] date API: add basic API docs Ævar Arnfjörð Bjarmason
                                                 ` (3 subsequent siblings)
  6 siblings, 0 replies; 125+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-02-04 23:53 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Ævar Arnfjörð Bjarmason

Provide and use a DATE_MODE_INIT macro. Most of the users of "struct
date_mode" use it via pretty.h's "struct pretty_print_context", which
doesn't have an initialization macro, so those are still bound to
being initialized to "{ 0 }" by default.

But we can change the couple of callers that directly declared a
variable on the stack to instead use the initializer, and thus do away
with the "mode.local = 0" added in add00ba2de9 (date: make "local"
orthogonal to date format, 2015-09-03).
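
As a minimal sketch of why the explicit "mode.local = 0" becomes
redundant: with a C99 designated initializer, every member that is not
named is initialized as if it had static storage duration, i.e. to
zero or a null pointer. The declarations below are stand-ins that
mirror date.h, and date_mode_is_default() and demo_init() are
hypothetical helpers that exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins mirroring date.h; not git's actual header. */
enum date_mode_type { DATE_NORMAL = 0, DATE_HUMAN, DATE_RELATIVE };

struct date_mode {
	enum date_mode_type type;
	const char *strftime_fmt;
	int local;
};

#define DATE_MODE_INIT { \
	.type = DATE_NORMAL, \
}

/* Hypothetical helper: returns 1 if "mode" is in its default state. */
static int date_mode_is_default(const struct date_mode *mode)
{
	return mode->type == DATE_NORMAL && !mode->strftime_fmt && !mode->local;
}

/* Hypothetical demo: declare a stack variable the way the patch does. */
static int demo_init(void)
{
	struct date_mode mode = DATE_MODE_INIT;

	/* Unnamed members are zeroed, so no "mode.local = 0" is needed. */
	assert(mode.strftime_fmt == NULL);
	return date_mode_is_default(&mode);
}
```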

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 date.c               | 3 +--
 date.h               | 4 ++++
 ref-filter.c         | 2 +-
 t/helper/test-date.c | 2 +-
 4 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/date.c b/date.c
index 863b07e9e63..54c709e4a08 100644
--- a/date.c
+++ b/date.c
@@ -206,11 +206,10 @@ void show_date_relative(timestamp_t time, struct strbuf *timebuf)
 
 struct date_mode *date_mode_from_type(enum date_mode_type type)
 {
-	static struct date_mode mode;
+	static struct date_mode mode = DATE_MODE_INIT;
 	if (type == DATE_STRFTIME)
 		BUG("cannot create anonymous strftime date_mode struct");
 	mode.type = type;
-	mode.local = 0;
 	return &mode;
 }
 
diff --git a/date.h b/date.h
index 5db9ec8dd29..c3a00d08ed6 100644
--- a/date.h
+++ b/date.h
@@ -20,6 +20,10 @@ struct date_mode {
 	int local;
 };
 
+#define DATE_MODE_INIT { \
+	.type = DATE_NORMAL, \
+}
+
 /*
  * Convenience helper for passing a constant type, like:
  *
diff --git a/ref-filter.c b/ref-filter.c
index f7a2f17bfd9..3399bde932f 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -1251,7 +1251,7 @@ static void grab_date(const char *buf, struct atom_value *v, const char *atomnam
 	char *zone;
 	timestamp_t timestamp;
 	long tz;
-	struct date_mode date_mode = { DATE_NORMAL };
+	struct date_mode date_mode = DATE_MODE_INIT;
 	const char *formatp;
 
 	/*
diff --git a/t/helper/test-date.c b/t/helper/test-date.c
index ded3d059f56..111071e1dd1 100644
--- a/t/helper/test-date.c
+++ b/t/helper/test-date.c
@@ -35,7 +35,7 @@ static void show_human_dates(const char **argv)
 
 static void show_dates(const char **argv, const char *format)
 {
-	struct date_mode mode;
+	struct date_mode mode = DATE_MODE_INIT;
 
 	parse_date_format(format, &mode);
 	for (; *argv; argv++) {
-- 
2.35.1.940.ge7a5b4b05f2


^ permalink raw reply related	[flat|nested] 125+ messages in thread

* [PATCH v2 4/5] date API: add basic API docs
  2022-02-04 23:53                             ` [PATCH v2 0/5] date.[ch] API: split from cache.h, add API docs, stop leaking memory Ævar Arnfjörð Bjarmason
                                                 ` (2 preceding siblings ...)
  2022-02-04 23:53                               ` [PATCH v2 3/5] date API: provide and use a DATE_MODE_INIT Ævar Arnfjörð Bjarmason
@ 2022-02-04 23:53                               ` Ævar Arnfjörð Bjarmason
  2022-02-04 23:53                               ` [PATCH v2 5/5] date API: add and use a date_mode_release() Ævar Arnfjörð Bjarmason
                                                 ` (2 subsequent siblings)
  6 siblings, 0 replies; 125+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-02-04 23:53 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Ævar Arnfjörð Bjarmason

Add basic API doc comments to date.h, and while doing so move the
parse_date_format() function adjacent to show_date(). This way all the
"struct date_mode" functions are grouped together. Documenting the
rest is one of our #leftoverbits.
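
One property of the DATE_MODE() helper that may be worth keeping in
mind while reading these docs: date_mode_from_type() returns a pointer
to a single static struct, so every call hands back the same storage.
The sketch below models that with stand-in names
(sketch_date_mode_from_type(), SKETCH_DATE_MODE() and
demo_shared_storage() are hypothetical), and it omits the BUG() check
for DATE_STRFTIME.

```c
#include <assert.h>

/* Stand-ins mirroring date.h; not git's actual header. */
enum date_mode_type { DATE_NORMAL = 0, DATE_HUMAN, DATE_RELATIVE, DATE_SHORT };

struct date_mode {
	enum date_mode_type type;
	const char *strftime_fmt;
	int local;
};

/* Modelled on date_mode_from_type() in date.c: one static struct, reused. */
static struct date_mode *sketch_date_mode_from_type(enum date_mode_type type)
{
	static struct date_mode mode;

	mode.type = type;
	mode.local = 0;
	return &mode;
}

#define SKETCH_DATE_MODE(t) sketch_date_mode_from_type(DATE_##t)

/* Returns 1 if two calls share storage and the second overwrote the first. */
static int demo_shared_storage(void)
{
	struct date_mode *a = SKETCH_DATE_MODE(SHORT);
	struct date_mode *b = SKETCH_DATE_MODE(RELATIVE);

	assert(a == b);	/* same static object both times */
	return a->type == DATE_RELATIVE;
}
```

In other words, the returned pointer is best passed straight to the
consumer rather than held across another DATE_MODE() call.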

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 date.h | 23 +++++++++++++++++++++--
 1 file changed, 21 insertions(+), 2 deletions(-)

diff --git a/date.h b/date.h
index c3a00d08ed6..4ed83506de9 100644
--- a/date.h
+++ b/date.h
@@ -1,6 +1,12 @@
 #ifndef DATE_H
 #define DATE_H
 
+/**
+ * The date mode type. This has DATE_NORMAL at an explicit "= 0" to
+ * accommodate a memset([...], 0, [...]) initialization when "struct
+ * date_mode" is used as an embedded struct member, as in the case of
+ * e.g. "struct pretty_print_context" and "struct rev_info".
+ */
 enum date_mode_type {
 	DATE_NORMAL = 0,
 	DATE_HUMAN,
@@ -24,7 +30,7 @@ struct date_mode {
 	.type = DATE_NORMAL, \
 }
 
-/*
+/**
  * Convenience helper for passing a constant type, like:
  *
  *   show_date(t, tz, DATE_MODE(NORMAL));
@@ -32,7 +38,21 @@ struct date_mode {
 #define DATE_MODE(t) date_mode_from_type(DATE_##t)
 struct date_mode *date_mode_from_type(enum date_mode_type type);
 
+/**
+ * Show the date given an initialized "struct date_mode" (usually from
+ * the DATE_MODE() macro).
+ */
 const char *show_date(timestamp_t time, int timezone, const struct date_mode *mode);
+
+/**
+ * Parse a date format for later use with show_date().
+ *
+ * When the "date_mode_type" is DATE_STRFTIME the "strftime_fmt"
+ * member of "struct date_mode" will be a malloc()'d format string to
+ * be used with strbuf_addftime().
+ */
+void parse_date_format(const char *format, struct date_mode *mode);
+
 void show_date_relative(timestamp_t time, struct strbuf *timebuf);
 int parse_date(const char *date, struct strbuf *out);
 int parse_date_basic(const char *date, timestamp_t *timestamp, int *offset);
@@ -41,7 +61,6 @@ void datestamp(struct strbuf *out);
 #define approxidate(s) approxidate_careful((s), NULL)
 timestamp_t approxidate_careful(const char *, int *);
 timestamp_t approxidate_relative(const char *date);
-void parse_date_format(const char *format, struct date_mode *mode);
 int date_overflows(timestamp_t date);
 time_t tm_to_time_t(const struct tm *tm);
 #endif
-- 
2.35.1.940.ge7a5b4b05f2


^ permalink raw reply related	[flat|nested] 125+ messages in thread

* [PATCH v2 5/5] date API: add and use a date_mode_release()
  2022-02-04 23:53                             ` [PATCH v2 0/5] date.[ch] API: split from cache.h, add API docs, stop leaking memory Ævar Arnfjörð Bjarmason
                                                 ` (3 preceding siblings ...)
  2022-02-04 23:53                               ` [PATCH v2 4/5] date API: add basic API docs Ævar Arnfjörð Bjarmason
@ 2022-02-04 23:53                               ` Ævar Arnfjörð Bjarmason
  2022-02-14 17:25                               ` [PATCH v2 0/5] date.[ch] API: split from cache.h, add API docs, stop leaking memory Ævar Arnfjörð Bjarmason
  2022-02-16  8:14                               ` [PATCH v3 " Ævar Arnfjörð Bjarmason
  6 siblings, 0 replies; 125+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-02-04 23:53 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Ævar Arnfjörð Bjarmason

Fix a memory leak in the parse_date_format() function by providing a
new date_mode_release() companion function.

By using this in "t/helper/test-date.c" we can mark the
"t0006-date.sh" test as passing when git is compiled with
SANITIZE=leak, and whitelist it to run under
"GIT_TEST_PASSING_SANITIZE_LEAK=true" by adding
"TEST_PASSES_SANITIZE_LEAK=true" to the test itself.

The other tests that expose this memory leak (i.e. take the
"mode->type == DATE_STRFTIME" branch in parse_date_format()) are
"t6300-for-each-ref.sh" and "t7004-tag.sh". The former is due to an
easily fixed leak in "ref-filter.c", and brings the failures in
"t6300-for-each-ref.sh" down from 51 to 48.

Fixing the remaining leaks will have to wait until there's a
release_revisions() in "revision.c", as they have to do with leaks via
"struct rev_info".
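
The ownership pattern this introduces can be sketched as follows. This
is an illustration under stated assumptions, not the code in date.c:
sketch_parse_date_format() is a simplified stand-in that only
recognizes a "format:" prefix and treats everything else as
DATE_NORMAL, sketch_date_mode_release() mirrors the one-line release
function in the patch, and demo_release() is a hypothetical caller.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-ins mirroring date.h; not git's actual header. */
enum date_mode_type { DATE_NORMAL = 0, DATE_STRFTIME };

struct date_mode {
	enum date_mode_type type;
	const char *strftime_fmt;
	int local;
};

#define DATE_MODE_INIT { .type = DATE_NORMAL }

/* Simplified stand-in: only "format:<fmt>" allocates a format string. */
static void sketch_parse_date_format(const char *format, struct date_mode *mode)
{
	static const char prefix[] = "format:";
	size_t len = sizeof(prefix) - 1;

	if (!strncmp(format, prefix, len)) {
		char *copy = malloc(strlen(format + len) + 1);

		if (!copy)
			abort();
		strcpy(copy, format + len);
		mode->type = DATE_STRFTIME;
		mode->strftime_fmt = copy;	/* now owned by "mode" */
	} else {
		mode->type = DATE_NORMAL;
	}
}

/* Mirrors the release function added by the patch. */
static void sketch_date_mode_release(struct date_mode *mode)
{
	free((char *)mode->strftime_fmt);
}

/* Hypothetical caller: initialize, parse, inspect, release. */
static int demo_release(const char *format, const char *expect_fmt)
{
	struct date_mode mode = DATE_MODE_INIT;
	int ok;

	sketch_parse_date_format(format, &mode);
	if (expect_fmt)
		ok = mode.type == DATE_STRFTIME &&
		     !strcmp(mode.strftime_fmt, expect_fmt);
	else
		ok = mode.type == DATE_NORMAL && !mode.strftime_fmt;

	/* Safe in both cases: free(NULL) is a no-op. */
	sketch_date_mode_release(&mode);
	assert(ok == 0 || ok == 1);
	return ok;
}
```

Because DATE_MODE_INIT leaves "strftime_fmt" as a null pointer, the
release call can be made unconditionally, whether or not a strftime
format was parsed.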

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 date.c               | 5 +++++
 date.h               | 9 ++++++++-
 ref-filter.c         | 1 +
 t/helper/test-date.c | 2 ++
 t/t0006-date.sh      | 2 ++
 5 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/date.c b/date.c
index 54c709e4a08..68a260c214d 100644
--- a/date.c
+++ b/date.c
@@ -993,6 +993,11 @@ void parse_date_format(const char *format, struct date_mode *mode)
 		die("unknown date format %s", format);
 }
 
+void date_mode_release(struct date_mode *mode)
+{
+	free((char *)mode->strftime_fmt);
+}
+
 void datestamp(struct strbuf *out)
 {
 	time_t now;
diff --git a/date.h b/date.h
index 4ed83506de9..bfcd4eb458c 100644
--- a/date.h
+++ b/date.h
@@ -49,10 +49,17 @@ const char *show_date(timestamp_t time, int timezone, const struct date_mode *mo
  *
  * When the "date_mode_type" is DATE_STRFTIME the "strftime_fmt"
  * member of "struct date_mode" will be a malloc()'d format string to
- * be used with strbuf_addftime().
+ * be used with strbuf_addftime(), in which case you'll need to call
+ * date_mode_release() later.
  */
 void parse_date_format(const char *format, struct date_mode *mode);
 
+/**
+ * Release a "struct date_mode", currently only required if
+ * parse_date_format() has parsed a "DATE_STRFTIME" format.
+ */
+void date_mode_release(struct date_mode *mode);
+
 void show_date_relative(timestamp_t time, struct strbuf *timebuf);
 int parse_date(const char *date, struct strbuf *out);
 int parse_date_basic(const char *date, timestamp_t *timestamp, int *offset);
diff --git a/ref-filter.c b/ref-filter.c
index 3399bde932f..7838bd22b8d 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -1276,6 +1276,7 @@ static void grab_date(const char *buf, struct atom_value *v, const char *atomnam
 		goto bad;
 	v->s = xstrdup(show_date(timestamp, tz, &date_mode));
 	v->value = timestamp;
+	date_mode_release(&date_mode);
 	return;
  bad:
 	v->s = xstrdup("");
diff --git a/t/helper/test-date.c b/t/helper/test-date.c
index 111071e1dd1..45951b1df87 100644
--- a/t/helper/test-date.c
+++ b/t/helper/test-date.c
@@ -54,6 +54,8 @@ static void show_dates(const char **argv, const char *format)
 
 		printf("%s -> %s\n", *argv, show_date(t, tz, &mode));
 	}
+
+	date_mode_release(&mode);
 }
 
 static void parse_dates(const char **argv)
diff --git a/t/t0006-date.sh b/t/t0006-date.sh
index 794186961ee..2490162071e 100755
--- a/t/t0006-date.sh
+++ b/t/t0006-date.sh
@@ -1,6 +1,8 @@
 #!/bin/sh
 
 test_description='test date parsing and printing'
+
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 # arbitrary reference time: 2009-08-30 19:20:00
-- 
2.35.1.940.ge7a5b4b05f2


^ permalink raw reply related	[flat|nested] 125+ messages in thread

* Re: [PATCH v2 0/5] date.[ch] API: split from cache.h, add API docs, stop leaking memory
  2022-02-04 23:53                             ` [PATCH v2 0/5] date.[ch] API: split from cache.h, add API docs, stop leaking memory Ævar Arnfjörð Bjarmason
                                                 ` (4 preceding siblings ...)
  2022-02-04 23:53                               ` [PATCH v2 5/5] date API: add and use a date_mode_release() Ævar Arnfjörð Bjarmason
@ 2022-02-14 17:25                               ` Ævar Arnfjörð Bjarmason
  2022-02-14 19:52                                 ` Junio C Hamano
  2022-02-16  8:14                               ` [PATCH v3 " Ævar Arnfjörð Bjarmason
  6 siblings, 1 reply; 125+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-02-14 17:25 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Ævar Arnfjörð Bjarmason


On Sat, Feb 05 2022, Ævar Arnfjörð Bjarmason wrote:

> Fix memory leaks in the date.[ch] API, in preparation for larger
> changes to make the revision walking API stop leaking memory.
>
> This is a trivial re-roll to v1, to fix an issue that "make hdr-check"
> spotted. For v1 see:
> https://lore.kernel.org/git/cover-0.5-00000000000-20220202T195651Z-avarab@gmail.com/

Junio: I think this series may have fallen between the cracks. Any
chance you're willing to pick this up? I'm keen to submit the larger
revision.[ch] leak fixes in this cycle, and this is one of the few
remaining dependencies for that.

> Ævar Arnfjörð Bjarmason (5):
>   cache.h: remove always unused show_date_human() declaration
>   date API: create a date.h, split from cache.h
>   date API: provide and use a DATE_MODE_INIT
>   date API: add basic API docs
>   date API: add and use a date_mode_release()
>
>  archive-zip.c         |  1 +
>  builtin/am.c          |  1 +
>  builtin/commit.c      |  1 +
>  builtin/fast-import.c |  1 +
>  builtin/show-branch.c |  1 +
>  builtin/tag.c         |  1 +
>  cache.h               | 50 -----------------------------
>  config.c              |  1 +
>  date.c                |  9 ++++--
>  date.h                | 73 +++++++++++++++++++++++++++++++++++++++++++
>  http-backend.c        |  1 +
>  ident.c               |  1 +
>  object-name.c         |  1 +
>  pretty.h              | 10 ++++++
>  ref-filter.c          |  3 +-
>  reflog-walk.h         |  1 +
>  refs.c                |  1 +
>  strbuf.c              |  1 +
>  t/helper/test-date.c  |  5 ++-
>  t/t0006-date.sh       |  2 ++
>  20 files changed, 111 insertions(+), 54 deletions(-)
>  create mode 100644 date.h
>
> Range-diff against v1:
> 1:  fb21bd7b2c5 = 1:  fb21bd7b2c5 cache.h: remove always unused show_date_human() declaration
> 2:  7de62956db4 ! 2:  96c904d0b9a date API: create a date.h, split from cache.h
>     @@ pretty.h: int format_set_trailers_options(struct process_trailer_options *opts,
>      +
>       #endif /* PRETTY_H */
>      
>     + ## reflog-walk.h ##
>     +@@
>     + 
>     + struct commit;
>     + struct reflog_walk_info;
>     ++struct date_mode;
>     + 
>     + void init_reflog_walk(struct reflog_walk_info **info);
>     + int add_reflog_for_walk(struct reflog_walk_info *info,
>     +
>       ## refs.c ##
>      @@
>       #include "strvec.h"
> 3:  2d5210f9421 = 3:  9ef003a83bd date API: provide and use a DATE_MODE_INIT
> 4:  aab2ae9cc72 = 4:  3f70b1aa4c5 date API: add basic API docs
> 5:  b67e23549ed = 5:  60dbadacb16 date API: add and use a date_mode_release()


^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH v2 0/5] date.[ch] API: split from cache.h, add API docs, stop leaking memory
  2022-02-14 17:25                               ` [PATCH v2 0/5] date.[ch] API: split from cache.h, add API docs, stop leaking memory Ævar Arnfjörð Bjarmason
@ 2022-02-14 19:52                                 ` Junio C Hamano
  0 siblings, 0 replies; 125+ messages in thread
From: Junio C Hamano @ 2022-02-14 19:52 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, Jeff King

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> On Sat, Feb 05 2022, Ævar Arnfjörð Bjarmason wrote:
>
>> Fix memory leaks in the date.[ch] API, in preparation for larger
>> changes to make the revision walking API stop leaking memory.
>>
>> This is a trivial re-roll to v1, to fix an issue that "make hdr-check"
>> spotted. For v1 see:
>> https://lore.kernel.org/git/cover-0.5-00000000000-20220202T195651Z-avarab@gmail.com/
>
> Junio: I think this series may have fallen between the cracks. Any
> chance you're willing to pick this up? I'm keen to submit the larger
> revision.[ch] leak fixes in this cycle, and this is one of the few
> remaining dependencies for that.

I haven't seen the topic reviewed, and I haven't even had a chance
to give a cursory look, so until then, it will remain on the list
archive.

Thanks for reminding.

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH 5/5] date API: add and use a date_mode_release()
  2022-02-02 21:03                             ` [PATCH 5/5] date API: add and use a date_mode_release() Ævar Arnfjörð Bjarmason
@ 2022-02-15  0:28                               ` Junio C Hamano
  0 siblings, 0 replies; 125+ messages in thread
From: Junio C Hamano @ 2022-02-15  0:28 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, Jeff King

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> Fix a memory leak in the parse_date_format() function by providing a
> new date_mode_release() companion function.
>
> By using this in "t/helper/test-date.c" we can mark the
> "t0006-date.sh" test as passing when git is compiled with
> SANITIZE=leak, and whitelist it to run under
> "GIT_TEST_PASSING_SANITIZE_LEAK=true" by adding
> "TEST_PASSES_SANITIZE_LEAK=true" to the test itself.
>
> The other tests that expose this memory leak (i.e. take the
> "mode->type == DATE_STRFTIME" branch in parse_date_format()) are
> "t6300-for-each-ref.sh" and "t7004-tag.sh". The former is due to an
> easily fixed leak in "ref-filter.c", and brings the failures in
> "t6300-for-each-ref.sh" down from 51 to 48.
>
> Fixing the remaining leaks will have to wait until there's a
> release_revisions() in "revision.c", as they have to do with leaks via
> "struct rev_info".

Here are hits from "git grep -e parse_date_format -e date_mode_release":

builtin/blame.c:701:		parse_date_format(value, &blame_date_mode);
builtin/log.c:162:		parse_date_format(default_date_mode, &rev->date_mode);
date.c:966:void parse_date_format(const char *format, struct date_mode *mode)
date.c:996:void date_mode_release(struct date_mode *mode)
date.h:53: * date_mode_release() later.
date.h:55:void parse_date_format(const char *format, struct date_mode *mode);
date.h:59: * parse_date_format() has parsed a "DATE_STRFTIME" format.
date.h:61:void date_mode_release(struct date_mode *mode);
ref-filter.c:1266:		parse_date_format(formatp, &date_mode);
ref-filter.c:1279:	date_mode_release(&date_mode);
revision.c:2478:		parse_date_format(optarg, &revs->date_mode);
t/helper/test-date.c:40:	parse_date_format(format, &mode);
t/helper/test-date.c:58:	date_mode_release(&mode);

Unlike builtin/log.c which uses the date_mode member that is
embedded in a rev_info, the one used by format_time() in
builtin/blame.c should be releasable without waiting for updating
revision.c, right?
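
To make the parse/use/release life cycle in the grep hits above concrete, here is a minimal standalone sketch. It is not git's code: "struct mode", MODE_INIT, mode_parse() and mode_release() are hypothetical stand-ins for "struct date_mode", DATE_MODE_INIT, parse_date_format() and date_mode_release(), and only the "format:" prefix is handled.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, reduced stand-in for "enum date_mode_type". */
enum mode_type { MODE_NORMAL = 0, MODE_STRFTIME };

struct mode {
	enum mode_type type;
	char *strftime_fmt; /* owned; NULL unless type == MODE_STRFTIME */
};

#define MODE_INIT { .type = MODE_NORMAL }

/* "format:<fmt>" selects strftime mode and stores a heap copy of <fmt>. */
static void mode_parse(const char *format, struct mode *m)
{
	static const char prefix[] = "format:";
	size_t n = sizeof(prefix) - 1;

	assert(format && m);
	if (!strncmp(format, prefix, n)) {
		size_t len = strlen(format + n) + 1;

		m->type = MODE_STRFTIME;
		m->strftime_fmt = malloc(len);
		assert(m->strftime_fmt);
		memcpy(m->strftime_fmt, format + n, len);
	}
}

/* Companion to mode_parse(); safe to call on an untouched MODE_INIT. */
static void mode_release(struct mode *m)
{
	assert(m);
	free(m->strftime_fmt);
	m->strftime_fmt = NULL;
	m->type = MODE_NORMAL;
}
```

Unlike the date_mode_release() in the patch, this sketch also resets the struct after freeing. That is a design choice made here so that a second release is harmless, not something the series does.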


^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH 4/5] date API: add basic API docs
  2022-02-02 21:03                             ` [PATCH 4/5] date API: add basic API docs Ævar Arnfjörð Bjarmason
@ 2022-02-15  2:14                               ` Junio C Hamano
  0 siblings, 0 replies; 125+ messages in thread
From: Junio C Hamano @ 2022-02-15  2:14 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, Jeff King

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> +/**
> + * Show the date given an initialized "struct date_mode" (usually from
> + * the DATE_MODE() macro).
> + */
>  const char *show_date(timestamp_t time, int timezone, const struct date_mode *mode);

It's a bit of wasted bytes to explain "show_date()" as "show".  In
the context of this function, the verb "show" in its name does not
mean emitting to any output stream.  It means returning short-lived
memory holding a date string formatted according to the date mode,
which the caller needs to either consume immediately or strdup()
away if it wants to use it later.  That is a much more helpful thing
to tell the readers.

    /**
     * Format <'time', 'timezone'> into static memory according to
     * 'mode' and return it.
     */

or something along that line?
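
The "consume immediately or strdup() away" contract can be sketched outside of git as follows. The format_stamp() helper is a hypothetical stand-in for show_date() with a much simpler output format; the point is only that every call returns the same static storage.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for show_date(): formats into static memory. */
static const char *format_stamp(long time, int tz)
{
	static char buf[64];

	snprintf(buf, sizeof(buf), "%ld %+05d", time, tz);
	return buf; /* overwritten by the next call */
}

/* Copy the result when it must outlive the next call; caller free()s it. */
static char *format_stamp_dup(long time, int tz)
{
	const char *s = format_stamp(time, tz);
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);

	assert(copy);
	memcpy(copy, s, len);
	return copy;
}
```

Two back-to-back format_stamp() calls hand out the same pointer, so a caller that keeps the first result without copying it ends up reading the second one.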

^ permalink raw reply	[flat|nested] 125+ messages in thread

* Re: [PATCH 2/5] date API: create a date.h, split from cache.h
  2022-02-02 21:03                             ` [PATCH 2/5] date API: create a date.h, split from cache.h Ævar Arnfjörð Bjarmason
  2022-02-02 21:19                               ` Ævar Arnfjörð Bjarmason
@ 2022-02-15  3:04                               ` Junio C Hamano
  1 sibling, 0 replies; 125+ messages in thread
From: Junio C Hamano @ 2022-02-15  3:04 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, Jeff King

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> Move the declaration of the date.c functions from cache.h, and adjust
> the relevant users to include the new date.h header.

It makes the patch larger than it could be to split off part of
cache.h into a new header and force users to include the new date.h
in the same commit, rather than first including date.h in cache.h
so that users do not have to change, and then update the inclusion
in a separate follow-up commit.  The end result looks OK, though.


^ permalink raw reply	[flat|nested] 125+ messages in thread

* [PATCH v3 0/5] date.[ch] API: split from cache.h, add API docs, stop leaking memory
  2022-02-04 23:53                             ` [PATCH v2 0/5] date.[ch] API: split from cache.h, add API docs, stop leaking memory Ævar Arnfjörð Bjarmason
                                                 ` (5 preceding siblings ...)
  2022-02-14 17:25                               ` [PATCH v2 0/5] date.[ch] API: split from cache.h, add API docs, stop leaking memory Ævar Arnfjörð Bjarmason
@ 2022-02-16  8:14                               ` Ævar Arnfjörð Bjarmason
  2022-02-16  8:14                                 ` [PATCH v3 1/5] cache.h: remove always unused show_date_human() declaration Ævar Arnfjörð Bjarmason
                                                   ` (5 more replies)
  6 siblings, 6 replies; 125+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-02-16  8:14 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Ævar Arnfjörð Bjarmason

Fix memory leaks in the date.[ch] API, in preparation for larger
changes to make the revision walking API stop leaking memory.

This is a small re-roll of v2 to address Junio's feedback on that
version. For v2 see:
https://lore.kernel.org/git/cover-v2-0.5-00000000000-20220204T235143Z-avarab@gmail.com/

This is a documentation and commit-message only update. As explained
below I think it makes sense to punt on the "builtin/blame.c" leak,
and to keep 2/5 as-is with date.h not included in cache.h, but those
things are now all rationalized in the commit message. Thanks for the
review Junio!

Ævar Arnfjörð Bjarmason (5):
  cache.h: remove always unused show_date_human() declaration
  date API: create a date.h, split from cache.h
  date API: provide and use a DATE_MODE_INIT
  date API: add basic API docs
  date API: add and use a date_mode_release()

 archive-zip.c         |  1 +
 builtin/am.c          |  1 +
 builtin/commit.c      |  1 +
 builtin/fast-import.c |  1 +
 builtin/show-branch.c |  1 +
 builtin/tag.c         |  1 +
 cache.h               | 50 -----------------------------
 config.c              |  1 +
 date.c                |  9 ++++--
 date.h                | 74 +++++++++++++++++++++++++++++++++++++++++++
 http-backend.c        |  1 +
 ident.c               |  1 +
 object-name.c         |  1 +
 pretty.h              | 10 ++++++
 ref-filter.c          |  3 +-
 reflog-walk.h         |  1 +
 refs.c                |  1 +
 strbuf.c              |  1 +
 t/helper/test-date.c  |  5 ++-
 t/t0006-date.sh       |  2 ++
 20 files changed, 112 insertions(+), 54 deletions(-)
 create mode 100644 date.h

Range-diff against v2:
1:  fb21bd7b2c5 = 1:  97746d97810 cache.h: remove always unused show_date_human() declaration
2:  96c904d0b9a ! 2:  f73aa601e95 date API: create a date.h, split from cache.h
    @@ Commit message
         use the "DATE_MODE()" macro we now define in date.h, let's have them
         include it.
     
    +    We could simply include this new header in "cache.h", but as this
    +    change shows these functions weren't common enough to warrant
    +    including in it in the first place. By moving them out of cache.h
    +    changes to this API will no longer cause a (mostly) full re-build of
    +    the project when "make" is run.
    +
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
      ## archive-zip.c ##
3:  9ef003a83bd = 3:  764147e90e1 date API: provide and use a DATE_MODE_INIT
4:  3f70b1aa4c5 ! 4:  5c244960133 date API: add basic API docs
    @@ date.h: struct date_mode {
      struct date_mode *date_mode_from_type(enum date_mode_type type);
      
     +/**
    -+ * Show the date given an initialized "struct date_mode" (usually from
    -+ * the DATE_MODE() macro).
    ++ * Format <'time', 'timezone'> into static memory according to 'mode'
    ++ * and return it. The mode is an initialized "struct date_mode"
    ++ * (usually from the DATE_MODE() macro).
     + */
      const char *show_date(timestamp_t time, int timezone, const struct date_mode *mode);
     +
5:  60dbadacb16 ! 5:  b1ee9a30913 date API: add and use a date_mode_release()
    @@ Commit message
         release_revisions() in "revision.c", as they have to do with leaks via
         "struct rev_info".
     
    +    There is also a leak in "builtin/blame.c" due to its call to
    +    parse_date_format() to parse the "blame.date" configuration. However
    +    as it declares a file-level "static struct date_mode blame_date_mode"
    +    to track the data, LSAN will not report it as a leak. It's possible to
    +    get valgrind(1) to complain about it with e.g.:
    +
    +        valgrind --leak-check=full --show-leak-kinds=all ./git -P -c blame.date=format:%Y blame README.md
    +
    +    But let's focus on things LSAN complains about, and are thus
    +    observable with "TEST_PASSES_SANITIZE_LEAK=true". We should get to
    +    fixing memory leaks in "builtin/blame.c", but as doing so would
    +    require some re-arrangement of cmd_blame() let's leave it for some
    +    other time.
    +
         Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
     
      ## date.c ##
-- 
2.35.1.1028.g2d2d4be19de


^ permalink raw reply	[flat|nested] 125+ messages in thread

* [PATCH v3 1/5] cache.h: remove always unused show_date_human() declaration
  2022-02-16  8:14                               ` [PATCH v3 " Ævar Arnfjörð Bjarmason
@ 2022-02-16  8:14                                 ` Ævar Arnfjörð Bjarmason
  2022-02-16  8:14                                 ` [PATCH v3 2/5] date API: create a date.h, split from cache.h Ævar Arnfjörð Bjarmason
                                                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 125+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-02-16  8:14 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Ævar Arnfjörð Bjarmason

There has never been a show_date_human() function on the "master"
branch in git.git. This declaration was added in b841d4ff438 (Add
`human` format to test-tool, 2019-01-28).

A look at the ML history reveals that it was leftover cruft from an
earlier version of that commit[1].

1. https://lore.kernel.org/git/20190118061805.19086-5-ischis2@cox.net/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 cache.h | 2 --
 1 file changed, 2 deletions(-)

diff --git a/cache.h b/cache.h
index 4148b6322d5..703a474e5a7 100644
--- a/cache.h
+++ b/cache.h
@@ -1588,8 +1588,6 @@ struct date_mode *date_mode_from_type(enum date_mode_type type);
 
 const char *show_date(timestamp_t time, int timezone, const struct date_mode *mode);
 void show_date_relative(timestamp_t time, struct strbuf *timebuf);
-void show_date_human(timestamp_t time, int tz, const struct timeval *now,
-			struct strbuf *timebuf);
 int parse_date(const char *date, struct strbuf *out);
 int parse_date_basic(const char *date, timestamp_t *timestamp, int *offset);
 int parse_expiry_date(const char *date, timestamp_t *timestamp);
-- 
2.35.1.1028.g2d2d4be19de


^ permalink raw reply related	[flat|nested] 125+ messages in thread

* [PATCH v3 2/5] date API: create a date.h, split from cache.h
  2022-02-16  8:14                               ` [PATCH v3 " Ævar Arnfjörð Bjarmason
  2022-02-16  8:14                                 ` [PATCH v3 1/5] cache.h: remove always unused show_date_human() declaration Ævar Arnfjörð Bjarmason
@ 2022-02-16  8:14                                 ` Ævar Arnfjörð Bjarmason
  2022-02-16  8:14                                 ` [PATCH v3 3/5] date API: provide and use a DATE_MODE_INIT Ævar Arnfjörð Bjarmason
                                                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 125+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-02-16  8:14 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Ævar Arnfjörð Bjarmason

Move the declaration of the date.c functions from cache.h, and adjust
the relevant users to include the new date.h header.

The show_ident_date() function belonged in pretty.h (it's defined in
pretty.c). Its two users outside of pretty.c didn't strictly need to
include pretty.h, as they get it indirectly, but let's add it to them
anyway.

Similarly, the change to "builtin/{fast-import,show-branch,tag}.c"
isn't needed as far as the compiler is concerned, but since they all
use the "DATE_MODE()" macro we now define in date.h, let's have them
include it.

We could simply include this new header in "cache.h", but as this
change shows these functions weren't common enough to warrant
including in it in the first place. By moving them out of cache.h
changes to this API will no longer cause a (mostly) full re-build of
the project when "make" is run.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 archive-zip.c         |  1 +
 builtin/am.c          |  1 +
 builtin/commit.c      |  1 +
 builtin/fast-import.c |  1 +
 builtin/show-branch.c |  1 +
 builtin/tag.c         |  1 +
 cache.h               | 48 -------------------------------------------
 config.c              |  1 +
 date.c                |  1 +
 date.h                | 43 ++++++++++++++++++++++++++++++++++++++
 http-backend.c        |  1 +
 ident.c               |  1 +
 object-name.c         |  1 +
 pretty.h              | 10 +++++++++
 reflog-walk.h         |  1 +
 refs.c                |  1 +
 strbuf.c              |  1 +
 t/helper/test-date.c  |  1 +
 18 files changed, 68 insertions(+), 48 deletions(-)
 create mode 100644 date.h

diff --git a/archive-zip.c b/archive-zip.c
index 2961e01c754..8ea9d1a5dae 100644
--- a/archive-zip.c
+++ b/archive-zip.c
@@ -9,6 +9,7 @@
 #include "object-store.h"
 #include "userdiff.h"
 #include "xdiff-interface.h"
+#include "date.h"
 
 static int zip_date;
 static int zip_time;
diff --git a/builtin/am.c b/builtin/am.c
index 7de2c89ef22..eb24bc89bb5 100644
--- a/builtin/am.c
+++ b/builtin/am.c
@@ -34,6 +34,7 @@
 #include "string-list.h"
 #include "packfile.h"
 #include "repository.h"
+#include "pretty.h"
 
 /**
  * Returns the length of the first line of msg.
diff --git a/builtin/commit.c b/builtin/commit.c
index b9ed0374e30..6b99ac276d8 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -37,6 +37,7 @@
 #include "help.h"
 #include "commit-reach.h"
 #include "commit-graph.h"
+#include "pretty.h"
 
 static const char * const builtin_commit_usage[] = {
 	N_("git commit [<options>] [--] <pathspec>..."),
diff --git a/builtin/fast-import.c b/builtin/fast-import.c
index 2b2e28bad79..28f2b9cc91f 100644
--- a/builtin/fast-import.c
+++ b/builtin/fast-import.c
@@ -19,6 +19,7 @@
 #include "mem-pool.h"
 #include "commit-reach.h"
 #include "khash.h"
+#include "date.h"
 
 #define PACK_ID_BITS 16
 #define MAX_PACK_ID ((1<<PACK_ID_BITS)-1)
diff --git a/builtin/show-branch.c b/builtin/show-branch.c
index e12c5e80e3e..330b0553b9d 100644
--- a/builtin/show-branch.c
+++ b/builtin/show-branch.c
@@ -8,6 +8,7 @@
 #include "parse-options.h"
 #include "dir.h"
 #include "commit-slab.h"
+#include "date.h"
 
 static const char* show_branch_usage[] = {
     N_("git show-branch [-a | --all] [-r | --remotes] [--topo-order | --date-order]\n"
diff --git a/builtin/tag.c b/builtin/tag.c
index 134b3f1edf0..2479da07049 100644
--- a/builtin/tag.c
+++ b/builtin/tag.c
@@ -20,6 +20,7 @@
 #include "oid-array.h"
 #include "column.h"
 #include "ref-filter.h"
+#include "date.h"
 
 static const char * const git_tag_usage[] = {
 	N_("git tag [-a | -s | -u <key-id>] [-f] [-m <msg> | -F <file>]\n"
diff --git a/cache.h b/cache.h
index 703a474e5a7..48e77aa0697 100644
--- a/cache.h
+++ b/cache.h
@@ -1559,46 +1559,6 @@ struct object *repo_peel_to_type(struct repository *r,
 #define peel_to_type(name, namelen, obj, type) \
 	repo_peel_to_type(the_repository, name, namelen, obj, type)
 
-enum date_mode_type {
-	DATE_NORMAL = 0,
-	DATE_HUMAN,
-	DATE_RELATIVE,
-	DATE_SHORT,
-	DATE_ISO8601,
-	DATE_ISO8601_STRICT,
-	DATE_RFC2822,
-	DATE_STRFTIME,
-	DATE_RAW,
-	DATE_UNIX
-};
-
-struct date_mode {
-	enum date_mode_type type;
-	const char *strftime_fmt;
-	int local;
-};
-
-/*
- * Convenience helper for passing a constant type, like:
- *
- *   show_date(t, tz, DATE_MODE(NORMAL));
- */
-#define DATE_MODE(t) date_mode_from_type(DATE_##t)
-struct date_mode *date_mode_from_type(enum date_mode_type type);
-
-const char *show_date(timestamp_t time, int timezone, const struct date_mode *mode);
-void show_date_relative(timestamp_t time, struct strbuf *timebuf);
-int parse_date(const char *date, struct strbuf *out);
-int parse_date_basic(const char *date, timestamp_t *timestamp, int *offset);
-int parse_expiry_date(const char *date, timestamp_t *timestamp);
-void datestamp(struct strbuf *out);
-#define approxidate(s) approxidate_careful((s), NULL)
-timestamp_t approxidate_careful(const char *, int *);
-timestamp_t approxidate_relative(const char *date);
-void parse_date_format(const char *format, struct date_mode *mode);
-int date_overflows(timestamp_t date);
-time_t tm_to_time_t(const struct tm *tm);
-
 #define IDENT_STRICT	       1
 #define IDENT_NO_DATE	       2
 #define IDENT_NO_NAME	       4
@@ -1644,14 +1604,6 @@ struct ident_split {
  */
 int split_ident_line(struct ident_split *, const char *, int);
 
-/*
- * Like show_date, but pull the timestamp and tz parameters from
- * the ident_split. It will also sanity-check the values and produce
- * a well-known sentinel date if they appear bogus.
- */
-const char *show_ident_date(const struct ident_split *id,
-			    const struct date_mode *mode);
-
 /*
  * Compare split idents for equality or strict ordering. Note that we
  * compare only the ident part of the line, ignoring any timestamp.
diff --git a/config.c b/config.c
index e0c03d154c9..430868f1ec0 100644
--- a/config.c
+++ b/config.c
@@ -6,6 +6,7 @@
  *
  */
 #include "cache.h"
+#include "date.h"
 #include "branch.h"
 #include "config.h"
 #include "environment.h"
diff --git a/date.c b/date.c
index 84bb4451c1a..863b07e9e63 100644
--- a/date.c
+++ b/date.c
@@ -5,6 +5,7 @@
  */
 
 #include "cache.h"
+#include "date.h"
 
 /*
  * This is like mktime, but without normalization of tm_wday and tm_yday.
diff --git a/date.h b/date.h
new file mode 100644
index 00000000000..5db9ec8dd29
--- /dev/null
+++ b/date.h
@@ -0,0 +1,43 @@
+#ifndef DATE_H
+#define DATE_H
+
+enum date_mode_type {
+	DATE_NORMAL = 0,
+	DATE_HUMAN,
+	DATE_RELATIVE,
+	DATE_SHORT,
+	DATE_ISO8601,
+	DATE_ISO8601_STRICT,
+	DATE_RFC2822,
+	DATE_STRFTIME,
+	DATE_RAW,
+	DATE_UNIX
+};
+
+struct date_mode {
+	enum date_mode_type type;
+	const char *strftime_fmt;
+	int local;
+};
+
+/*
+ * Convenience helper for passing a constant type, like:
+ *
+ *   show_date(t, tz, DATE_MODE(NORMAL));
+ */
+#define DATE_MODE(t) date_mode_from_type(DATE_##t)
+struct date_mode *date_mode_from_type(enum date_mode_type type);
+
+const char *show_date(timestamp_t time, int timezone, const struct date_mode *mode);
+void show_date_relative(timestamp_t time, struct strbuf *timebuf);
+int parse_date(const char *date, struct strbuf *out);
+int parse_date_basic(const char *date, timestamp_t *timestamp, int *offset);
+int parse_expiry_date(const char *date, timestamp_t *timestamp);
+void datestamp(struct strbuf *out);
+#define approxidate(s) approxidate_careful((s), NULL)
+timestamp_t approxidate_careful(const char *, int *);
+timestamp_t approxidate_relative(const char *date);
+void parse_date_format(const char *format, struct date_mode *mode);
+int date_overflows(timestamp_t date);
+time_t tm_to_time_t(const struct tm *tm);
+#endif
diff --git a/http-backend.c b/http-backend.c
index 807fb8839e7..81a7229ece0 100644
--- a/http-backend.c
+++ b/http-backend.c
@@ -13,6 +13,7 @@
 #include "packfile.h"
 #include "object-store.h"
 #include "protocol.h"
+#include "date.h"
 
 static const char content_type[] = "Content-Type";
 static const char content_length[] = "Content-Length";
diff --git a/ident.c b/ident.c
index 6aba4b5cb6f..89ca5b47008 100644
--- a/ident.c
+++ b/ident.c
@@ -7,6 +7,7 @@
  */
 #include "cache.h"
 #include "config.h"
+#include "date.h"
 
 static struct strbuf git_default_name = STRBUF_INIT;
 static struct strbuf git_default_email = STRBUF_INIT;
diff --git a/object-name.c b/object-name.c
index 92862eeb1ac..060d892a97f 100644
--- a/object-name.c
+++ b/object-name.c
@@ -15,6 +15,7 @@
 #include "submodule.h"
 #include "midx.h"
 #include "commit-reach.h"
+#include "date.h"
 
 static int get_oid_oneline(struct repository *r, const char *, struct object_id *, struct commit_list *);
 
diff --git a/pretty.h b/pretty.h
index 2f16acd213d..f34e24c53a4 100644
--- a/pretty.h
+++ b/pretty.h
@@ -2,6 +2,7 @@
 #define PRETTY_H
 
 #include "cache.h"
+#include "date.h"
 #include "string-list.h"
 
 struct commit;
@@ -163,4 +164,13 @@ int format_set_trailers_options(struct process_trailer_options *opts,
 			const char **arg,
 			char **invalid_arg);
 
+/*
+ * Like show_date, but pull the timestamp and tz parameters from
+ * the ident_split. It will also sanity-check the values and produce
+ * a well-known sentinel date if they appear bogus.
+ */
+const char *show_ident_date(const struct ident_split *id,
+			    const struct date_mode *mode);
+
+
 #endif /* PRETTY_H */
diff --git a/reflog-walk.h b/reflog-walk.h
index f26408f6cc1..e9e00ffd479 100644
--- a/reflog-walk.h
+++ b/reflog-walk.h
@@ -5,6 +5,7 @@
 
 struct commit;
 struct reflog_walk_info;
+struct date_mode;
 
 void init_reflog_walk(struct reflog_walk_info **info);
 int add_reflog_for_walk(struct reflog_walk_info *info,
diff --git a/refs.c b/refs.c
index 7017ae59804..b74f3815a52 100644
--- a/refs.c
+++ b/refs.c
@@ -19,6 +19,7 @@
 #include "strvec.h"
 #include "repository.h"
 #include "sigchain.h"
+#include "date.h"
 
 /*
  * List of all available backends
diff --git a/strbuf.c b/strbuf.c
index 613fee8c82e..00abeb55afd 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -2,6 +2,7 @@
 #include "refs.h"
 #include "string-list.h"
 #include "utf8.h"
+#include "date.h"
 
 int starts_with(const char *str, const char *prefix)
 {
diff --git a/t/helper/test-date.c b/t/helper/test-date.c
index 099eff4f0fc..ded3d059f56 100644
--- a/t/helper/test-date.c
+++ b/t/helper/test-date.c
@@ -1,5 +1,6 @@
 #include "test-tool.h"
 #include "cache.h"
+#include "date.h"
 
 static const char *usage_msg = "\n"
 "  test-tool date relative [time_t]...\n"
-- 
2.35.1.1028.g2d2d4be19de


^ permalink raw reply related	[flat|nested] 125+ messages in thread

* [PATCH v3 3/5] date API: provide and use a DATE_MODE_INIT
  2022-02-16  8:14                               ` [PATCH v3 " Ævar Arnfjörð Bjarmason
  2022-02-16  8:14                                 ` [PATCH v3 1/5] cache.h: remove always unused show_date_human() declaration Ævar Arnfjörð Bjarmason
  2022-02-16  8:14                                 ` [PATCH v3 2/5] date API: create a date.h, split from cache.h Ævar Arnfjörð Bjarmason
@ 2022-02-16  8:14                                 ` Ævar Arnfjörð Bjarmason
  2022-02-16  8:14                                 ` [PATCH v3 4/5] date API: add basic API docs Ævar Arnfjörð Bjarmason
                                                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 125+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-02-16  8:14 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Ævar Arnfjörð Bjarmason

Provide and use a DATE_MODE_INIT macro. Most of the users of "struct
date_mode" use it via pretty.h's "struct pretty_print_context" which
doesn't have an initialization macro, so we're still bound to being
initialized to "{ 0 }" by default.

But we can change the couple of callers that directly declared a
variable on the stack to instead use the initializer, and thus do away
with the "mode.local = 0" added in add00ba2de9 (date: make "local"
orthogonal to date format, 2015-09-03).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 date.c               | 3 +--
 date.h               | 4 ++++
 ref-filter.c         | 2 +-
 t/helper/test-date.c | 2 +-
 4 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/date.c b/date.c
index 863b07e9e63..54c709e4a08 100644
--- a/date.c
+++ b/date.c
@@ -206,11 +206,10 @@ void show_date_relative(timestamp_t time, struct strbuf *timebuf)
 
 struct date_mode *date_mode_from_type(enum date_mode_type type)
 {
-	static struct date_mode mode;
+	static struct date_mode mode = DATE_MODE_INIT;
 	if (type == DATE_STRFTIME)
 		BUG("cannot create anonymous strftime date_mode struct");
 	mode.type = type;
-	mode.local = 0;
 	return &mode;
 }
 
diff --git a/date.h b/date.h
index 5db9ec8dd29..c3a00d08ed6 100644
--- a/date.h
+++ b/date.h
@@ -20,6 +20,10 @@ struct date_mode {
 	int local;
 };
 
+#define DATE_MODE_INIT { \
+	.type = DATE_NORMAL, \
+}
+
 /*
  * Convenience helper for passing a constant type, like:
  *
diff --git a/ref-filter.c b/ref-filter.c
index f7a2f17bfd9..3399bde932f 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -1251,7 +1251,7 @@ static void grab_date(const char *buf, struct atom_value *v, const char *atomnam
 	char *zone;
 	timestamp_t timestamp;
 	long tz;
-	struct date_mode date_mode = { DATE_NORMAL };
+	struct date_mode date_mode = DATE_MODE_INIT;
 	const char *formatp;
 
 	/*
diff --git a/t/helper/test-date.c b/t/helper/test-date.c
index ded3d059f56..111071e1dd1 100644
--- a/t/helper/test-date.c
+++ b/t/helper/test-date.c
@@ -35,7 +35,7 @@ static void show_human_dates(const char **argv)
 
 static void show_dates(const char **argv, const char *format)
 {
-	struct date_mode mode;
+	struct date_mode mode = DATE_MODE_INIT;
 
 	parse_date_format(format, &mode);
 	for (; *argv; argv++) {
-- 
2.35.1.1028.g2d2d4be19de
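
As an aside on why the patch above can drop the explicit "mode.local = 0": with a C99 designated initializer, members that are not named are initialized as if they had static storage, i.e. to 0 or NULL. A minimal sketch, using the hypothetical names "struct mode" and MODE_INIT rather than git's own:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced stand-in for "enum date_mode_type". */
enum mode_type { MODE_NORMAL = 0, MODE_SHORT, MODE_STRFTIME };

struct mode {
	enum mode_type type;
	const char *strftime_fmt;
	int local;
};

/* Only .type is named; .strftime_fmt and .local become NULL and 0. */
#define MODE_INIT { .type = MODE_NORMAL }

/* True if the struct still looks exactly like a fresh MODE_INIT. */
static int mode_is_pristine(const struct mode *m)
{
	assert(m);
	return m->type == MODE_NORMAL && !m->strftime_fmt && !m->local;
}
```

A plain "struct mode m;" on the stack, by contrast, leaves all three members indeterminate, which is the situation the initializer avoids.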


^ permalink raw reply related	[flat|nested] 125+ messages in thread

* [PATCH v3 4/5] date API: add basic API docs
  2022-02-16  8:14                               ` [PATCH v3 " Ævar Arnfjörð Bjarmason
                                                   ` (2 preceding siblings ...)
  2022-02-16  8:14                                 ` [PATCH v3 3/5] date API: provide and use a DATE_MODE_INIT Ævar Arnfjörð Bjarmason
@ 2022-02-16  8:14                                 ` Ævar Arnfjörð Bjarmason
  2022-02-16  8:14                                 ` [PATCH v3 5/5] date API: add and use a date_mode_release() Ævar Arnfjörð Bjarmason
  2022-02-16 17:45                                 ` [PATCH v3 0/5] date.[ch] API: split from cache.h, add API docs, stop leaking memory Junio C Hamano
  5 siblings, 0 replies; 125+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-02-16  8:14 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Ævar Arnfjörð Bjarmason

Add basic API doc comments to date.h, and while doing so move the
parse_date_format() function adjacent to show_date(). This way all the
"struct date_mode" functions are grouped together. Documenting the
rest is one of our #leftoverbits.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 date.h | 24 ++++++++++++++++++++++--
 1 file changed, 22 insertions(+), 2 deletions(-)

diff --git a/date.h b/date.h
index c3a00d08ed6..bbd6a6477b5 100644
--- a/date.h
+++ b/date.h
@@ -1,6 +1,12 @@
 #ifndef DATE_H
 #define DATE_H
 
+/**
+ * The date mode type. This has DATE_NORMAL at an explicit "= 0" to
+ * accommodate a memset([...], 0, [...]) initialization when "struct
+ * date_mode" is used as an embedded struct member, as in the case of
+ * e.g. "struct pretty_print_context" and "struct rev_info".
+ */
 enum date_mode_type {
 	DATE_NORMAL = 0,
 	DATE_HUMAN,
@@ -24,7 +30,7 @@ struct date_mode {
 	.type = DATE_NORMAL, \
 }
 
-/*
+/**
  * Convenience helper for passing a constant type, like:
  *
  *   show_date(t, tz, DATE_MODE(NORMAL));
@@ -32,7 +38,22 @@ struct date_mode {
 #define DATE_MODE(t) date_mode_from_type(DATE_##t)
 struct date_mode *date_mode_from_type(enum date_mode_type type);
 
+/**
+ * Format <'time', 'timezone'> into static memory according to 'mode'
+ * and return it. The mode is an initialized "struct date_mode"
+ * (usually from the DATE_MODE() macro).
+ */
 const char *show_date(timestamp_t time, int timezone, const struct date_mode *mode);
+
+/**
+ * Parse a date format for later use with show_date().
+ *
+ * When the "date_mode_type" is DATE_STRFTIME the "strftime_fmt"
+ * member of "struct date_mode" will be a malloc()'d format string to
+ * be used with strbuf_addftime().
+ */
+void parse_date_format(const char *format, struct date_mode *mode);
+
 void show_date_relative(timestamp_t time, struct strbuf *timebuf);
 int parse_date(const char *date, struct strbuf *out);
 int parse_date_basic(const char *date, timestamp_t *timestamp, int *offset);
@@ -41,7 +62,6 @@ void datestamp(struct strbuf *out);
 #define approxidate(s) approxidate_careful((s), NULL)
 timestamp_t approxidate_careful(const char *, int *);
 timestamp_t approxidate_relative(const char *date);
-void parse_date_format(const char *format, struct date_mode *mode);
 int date_overflows(timestamp_t date);
 time_t tm_to_time_t(const struct tm *tm);
 #endif
-- 
2.35.1.1028.g2d2d4be19de
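
The doc comment on the enum in the patch above relies on zero-filling an outer struct to leave the embedded mode in its "normal" state. A small sketch of that assumption follows; "struct context" and context_init() are hypothetical stand-ins for the likes of "struct pretty_print_context", and the pointer member is left out on purpose to keep the example to integer members.

```c
#include <assert.h>
#include <string.h>

/* The explicit "= 0" is what makes a zero-filled struct mean "normal". */
enum mode_type { MODE_NORMAL = 0, MODE_SHORT };

struct mode {
	enum mode_type type;
	int local;
};

/* Hypothetical outer struct that embeds the mode by value. */
struct context {
	int abbrev;
	struct mode date_mode;
};

static void context_init(struct context *ctx)
{
	assert(ctx);
	memset(ctx, 0, sizeof(*ctx)); /* also zeroes ctx->date_mode */
}
```

If MODE_NORMAL were moved away from 0, context_init() would silently start selecting whichever enumerator took its place.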


^ permalink raw reply related	[flat|nested] 125+ messages in thread

* [PATCH v3 5/5] date API: add and use a date_mode_release()
  2022-02-16  8:14                               ` [PATCH v3 " Ævar Arnfjörð Bjarmason
                                                   ` (3 preceding siblings ...)
  2022-02-16  8:14                                 ` [PATCH v3 4/5] date API: add basic API docs Ævar Arnfjörð Bjarmason
@ 2022-02-16  8:14                                 ` Ævar Arnfjörð Bjarmason
  2022-02-16 17:45                                 ` [PATCH v3 0/5] date.[ch] API: split from cache.h, add API docs, stop leaking memory Junio C Hamano
  5 siblings, 0 replies; 125+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-02-16  8:14 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Jeff King, Ævar Arnfjörð Bjarmason

Fix a memory leak in the parse_date_format() function by providing a
new date_mode_release() companion function.

By using this in "t/helper/test-date.c" we can mark the
"t0006-date.sh" test as passing when git is compiled with
SANITIZE=leak, and whitelist it to run under
"GIT_TEST_PASSING_SANITIZE_LEAK=true" by adding
"TEST_PASSES_SANITIZE_LEAK=true" to the test itself.

The other tests that expose this memory leak (i.e. take the
"mode->type == DATE_STRFTIME" branch in parse_date_format()) are
"t6300-for-each-ref.sh" and "t7004-tag.sh". The former is due to an
easily fixed leak in "ref-filter.c", and brings the failures in
"t6300-for-each-ref.sh" down from 51 to 48.

Fixing the remaining leaks will have to wait until there's a
release_revisions() in "revision.c", as they have to do with leaks via
"struct rev_info".

There is also a leak in "builtin/blame.c" due to its call to
parse_date_format() to parse the "blame.date" configuration. However
as it declares a file-level "static struct date_mode blame_date_mode"
to track the data, LSAN will not report it as a leak. It's possible to
get valgrind(1) to complain about it with e.g.:

    valgrind --leak-check=full --show-leak-kinds=all ./git -P -c blame.date=format:%Y blame README.md

But let's focus on things LSAN complains about, and are thus
observable with "TEST_PASSES_SANITIZE_LEAK=true". We should get to
fixing memory leaks in "builtin/blame.c", but as doing so would
require some re-arrangement of cmd_blame() let's leave it for some
other time.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
 date.c               | 5 +++++
 date.h               | 9 ++++++++-
 ref-filter.c         | 1 +
 t/helper/test-date.c | 2 ++
 t/t0006-date.sh      | 2 ++
 5 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/date.c b/date.c
index 54c709e4a08..68a260c214d 100644
--- a/date.c
+++ b/date.c
@@ -993,6 +993,11 @@ void parse_date_format(const char *format, struct date_mode *mode)
 		die("unknown date format %s", format);
 }
 
+void date_mode_release(struct date_mode *mode)
+{
+	free((char *)mode->strftime_fmt);
+}
+
 void datestamp(struct strbuf *out)
 {
 	time_t now;
diff --git a/date.h b/date.h
index bbd6a6477b5..5d4eaba0a90 100644
--- a/date.h
+++ b/date.h
@@ -50,10 +50,17 @@ const char *show_date(timestamp_t time, int timezone, const struct date_mode *mo
  *
  * When the "date_mode_type" is DATE_STRFTIME the "strftime_fmt"
  * member of "struct date_mode" will be a malloc()'d format string to
- * be used with strbuf_addftime().
+ * be used with strbuf_addftime(), in which case you'll need to call
+ * date_mode_release() later.
  */
 void parse_date_format(const char *format, struct date_mode *mode);
 
+/**
+ * Release a "struct date_mode", currently only required if
+ * parse_date_format() has parsed a "DATE_STRFTIME" format.
+ */
+void date_mode_release(struct date_mode *mode);
+
 void show_date_relative(timestamp_t time, struct strbuf *timebuf);
 int parse_date(const char *date, struct strbuf *out);
 int parse_date_basic(const char *date, timestamp_t *timestamp, int *offset);
diff --git a/ref-filter.c b/ref-filter.c
index 3399bde932f..7838bd22b8d 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -1276,6 +1276,7 @@ static void grab_date(const char *buf, struct atom_value *v, const char *atomnam
 		goto bad;
 	v->s = xstrdup(show_date(timestamp, tz, &date_mode));
 	v->value = timestamp;
+	date_mode_release(&date_mode);
 	return;
  bad:
 	v->s = xstrdup("");
diff --git a/t/helper/test-date.c b/t/helper/test-date.c
index 111071e1dd1..45951b1df87 100644
--- a/t/helper/test-date.c
+++ b/t/helper/test-date.c
@@ -54,6 +54,8 @@ static void show_dates(const char **argv, const char *format)
 
 		printf("%s -> %s\n", *argv, show_date(t, tz, &mode));
 	}
+
+	date_mode_release(&mode);
 }
 
 static void parse_dates(const char **argv)
diff --git a/t/t0006-date.sh b/t/t0006-date.sh
index 794186961ee..2490162071e 100755
--- a/t/t0006-date.sh
+++ b/t/t0006-date.sh
@@ -1,6 +1,8 @@
 #!/bin/sh
 
 test_description='test date parsing and printing'
+
+TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
 # arbitrary reference time: 2009-08-30 19:20:00
-- 
2.35.1.1028.g2d2d4be19de


* Re: [PATCH v3 0/5] date.[ch] API: split from cache.h, add API docs, stop leaking memory
  2022-02-16  8:14                               ` [PATCH v3 " Ævar Arnfjörð Bjarmason
                                                   ` (4 preceding siblings ...)
  2022-02-16  8:14                                 ` [PATCH v3 5/5] date API: add and use a date_mode_release() Ævar Arnfjörð Bjarmason
@ 2022-02-16 17:45                                 ` Junio C Hamano
  5 siblings, 0 replies; 125+ messages in thread
From: Junio C Hamano @ 2022-02-16 17:45 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: git, Jeff King

Ævar Arnfjörð Bjarmason  <avarab@gmail.com> writes:

> 2:  96c904d0b9a ! 2:  f73aa601e95 date API: create a date.h, split from cache.h
>     @@ Commit message
>          use the "DATE_MODE()" macro we now define in date.h, let's have them
>          include it.
>      
>     +    We could simply include this new header in "cache.h", but as this
>     +    change shows these functions weren't common enough to warrant
>     +    including in it in the first place. By moving them out of cache.h
>     +    changes to this API will no longer cause a (mostly) full re-build of
>     +    the project when "make" is run.
>     +

If this step were to include the new header in "cache.h" to reduce
the patch noise, and there were a follow-up step to update the *.c
files to include the new header while removing the inclusion of the
header from "cache.h", then the above would make a fine draft for
the log message that justifies that follow-up step.

But if we are doing these two things in a single step, the paragraph
would not make a very useful comment to help readers of "git log".

> 4:  3f70b1aa4c5 ! 4:  5c244960133 date API: add basic API docs
>     @@ date.h: struct date_mode {
>       struct date_mode *date_mode_from_type(enum date_mode_type type);
>       
>      +/**
>     -+ * Show the date given an initialized "struct date_mode" (usually from
>     -+ * the DATE_MODE() macro).
>     ++ * Format <'time', 'timezone'> into static memory according to 'mode'
>     ++ * and return it. The mode is an initialized "struct date_mode"
>     ++ * (usually from the DATE_MODE() macro).
>      + */
>       const char *show_date(timestamp_t time, int timezone, const struct date_mode *mode);

OK.

> 5:  60dbadacb16 ! 5:  b1ee9a30913 date API: add and use a date_mode_release()
>     @@ Commit message
>          release_revisions() in "revision.c", as they have to do with leaks via
>          "struct rev_info".
>      
>     +    There is also a leak in "builtin/blame.c" due to its call to
>     +    parse_date_format() to parse the "blame.date" configuration. However
>     +    as it declares a file-level "static struct date_mode blame_date_mode"
>     +    to track the data, LSAN will not report it as a leak.

Ah, it is not even a leak, then.  Is blame the only thing that uses
parse_date_format() outside the revision walkers?

Thanks.

Thread overview: 125+ messages
2021-06-09 14:38 UNLEAK(), leak checking in the default tests etc Ævar Arnfjörð Bjarmason
2021-06-09 17:44 ` Andrzej Hunt
2021-06-09 20:36   ` Felipe Contreras
2021-06-10 10:46   ` Jeff King
2021-06-10 10:56   ` Ævar Arnfjörð Bjarmason
2021-06-10 13:38     ` Jeff King
2021-06-10 15:32       ` Andrzej Hunt
2021-06-10 16:36         ` Jeff King
2021-06-11 15:44           ` Andrzej Hunt
2021-06-10 19:01 ` SZEDER Gábor
2021-07-14  0:11 ` [PATCH 0/4] add a test mode for SANITIZE=leak, run it in CI Ævar Arnfjörð Bjarmason
2021-07-14  0:11   ` [PATCH 1/4] tests: " Ævar Arnfjörð Bjarmason
2021-07-14  3:23     ` Đoàn Trần Công Danh
2021-07-14  0:11   ` [PATCH 2/4] SANITIZE tests: fix memory leaks in t13*config*, add to whitelist Ævar Arnfjörð Bjarmason
2021-07-14  0:11   ` [PATCH 3/4] SANITIZE tests: fix memory leaks in t5701*, " Ævar Arnfjörð Bjarmason
2021-07-14  0:11   ` [PATCH 4/4] SANITIZE tests: fix leak in mailmap.c Ævar Arnfjörð Bjarmason
2021-07-14  2:19     ` Eric Sunshine
2021-07-14 17:23   ` [PATCH v2 0/4] add a test mode for SANITIZE=leak, run it in CI Ævar Arnfjörð Bjarmason
2021-07-14 17:23     ` [PATCH v2 1/4] tests: " Ævar Arnfjörð Bjarmason
2021-07-14 18:42       ` Andrzej Hunt
2021-07-14 22:39         ` Ævar Arnfjörð Bjarmason
2021-07-15 21:14         ` Jeff King
2021-07-15 21:06       ` Jeff King
2021-07-16 14:46         ` Ævar Arnfjörð Bjarmason
2021-07-16 18:09           ` Jeff King
2021-07-16 18:45             ` Jeff King
2021-07-16 18:56             ` Ævar Arnfjörð Bjarmason
2021-07-16 19:22               ` Jeff King
2021-07-14 17:23     ` [PATCH v2 2/4] SANITIZE tests: fix memory leaks in t13*config*, add to whitelist Ævar Arnfjörð Bjarmason
2021-07-14 18:57       ` Andrzej Hunt
2021-07-14 22:56         ` Ævar Arnfjörð Bjarmason
2021-07-15 21:42         ` Jeff King
2021-07-16  5:18           ` Andrzej Hunt
2021-07-16 21:20             ` Jeff King
2021-07-16  7:46           ` Ævar Arnfjörð Bjarmason
2021-07-16 21:16             ` Jeff King
2021-08-31 12:47               ` Ævar Arnfjörð Bjarmason
2021-09-01  7:53                 ` Jeff King
2021-09-01 11:45                   ` Ævar Arnfjörð Bjarmason
2021-07-14 17:23     ` [PATCH v2 3/4] SANITIZE tests: fix memory leaks in t5701*, " Ævar Arnfjörð Bjarmason
2021-07-15 17:37       ` Andrzej Hunt
2021-07-15 21:43       ` Jeff King
2021-08-31 13:46       ` [PATCH] protocol-caps.c: fix memory leak in send_info() Ævar Arnfjörð Bjarmason
2021-08-31 15:32         ` Bruno Albuquerque
2021-08-31 18:15           ` Junio C Hamano
     [not found]         ` <CAPeR6H69a_HMwWnpHzssaCm_ow=ic7AnzMdZVQJQ2ECRDaWzaA@mail.gmail.com>
2021-08-31 20:08           ` Ævar Arnfjörð Bjarmason
2021-07-14 17:23     ` [PATCH v2 4/4] SANITIZE tests: fix leak in mailmap.c Ævar Arnfjörð Bjarmason
2021-08-31 13:42       ` [PATCH] mailmap.c: fix a memory leak in free_mailap_{info,entry}() Ævar Arnfjörð Bjarmason
2021-08-31 16:22         ` Eric Sunshine
2021-08-31 19:38         ` Jeff King
2021-08-31 19:46           ` Junio C Hamano
2021-07-15 17:37     ` [PATCH v2 0/4] add a test mode for SANITIZE=leak, run it in CI Andrzej Hunt
2021-08-31 13:35     ` [PATCH v3 0/8] " Ævar Arnfjörð Bjarmason
2021-09-01  9:56       ` Jeff King
2021-09-01 10:42         ` Jeff King
2021-09-02 12:25         ` Ævar Arnfjörð Bjarmason
2021-09-03 11:13           ` Jeff King
2021-09-07 15:33       ` [PATCH v4 0/3] " Ævar Arnfjörð Bjarmason
2021-09-07 15:33         ` [PATCH v4 1/3] Makefile: add SANITIZE=leak flag to GIT-BUILD-OPTIONS Ævar Arnfjörð Bjarmason
2021-09-07 15:33         ` [PATCH v4 2/3] CI: refactor "if" to "case" statement Ævar Arnfjörð Bjarmason
2021-09-07 15:33         ` [PATCH v4 3/3] tests: add a test mode for SANITIZE=leak, run it in CI Ævar Arnfjörð Bjarmason
2021-09-07 16:29           ` Eric Sunshine
2021-09-07 16:51           ` Jeff King
2021-09-07 16:44         ` [PATCH v4 0/3] " Jeff King
2021-09-07 18:22           ` Junio C Hamano
2021-09-07 21:30         ` [PATCH v5 " Ævar Arnfjörð Bjarmason
2021-09-07 21:30           ` [PATCH v5 1/3] Makefile: add SANITIZE=leak flag to GIT-BUILD-OPTIONS Ævar Arnfjörð Bjarmason
2021-09-07 21:30           ` [PATCH v5 2/3] CI: refactor "if" to "case" statement Ævar Arnfjörð Bjarmason
2021-09-07 21:30           ` [PATCH v5 3/3] tests: add a test mode for SANITIZE=leak, run it in CI Ævar Arnfjörð Bjarmason
2021-09-08  4:46             ` Eric Sunshine
2021-09-16  3:56             ` [PATCH] fixup! " Carlo Marcelo Arenas Belón
2021-09-16  6:14               ` Ævar Arnfjörð Bjarmason
2021-09-08 11:02           ` [PATCH v5 0/3] " Junio C Hamano
2021-09-08 12:03             ` Ævar Arnfjörð Bjarmason
2021-09-09 23:10               ` Emily Shaffer
2021-09-16 10:48           ` [PATCH v6 0/2] " Ævar Arnfjörð Bjarmason
2021-09-16 10:48             ` [PATCH v6 1/2] Makefile: add SANITIZE=leak flag to GIT-BUILD-OPTIONS Ævar Arnfjörð Bjarmason
2021-09-16 10:48             ` [PATCH v6 2/2] tests: add a test mode for SANITIZE=leak, run it in CI Ævar Arnfjörð Bjarmason
2021-09-19  8:03             ` [PATCH v7 0/2] " Ævar Arnfjörð Bjarmason
2021-09-19  8:03               ` [PATCH v7 1/2] Makefile: add SANITIZE=leak flag to GIT-BUILD-OPTIONS Ævar Arnfjörð Bjarmason
2021-09-19  8:03               ` [PATCH v7 2/2] tests: add a test mode for SANITIZE=leak, run it in CI Ævar Arnfjörð Bjarmason
2021-09-22 11:17                 ` [PATCH] fixup! " Carlo Marcelo Arenas Belón
2021-09-23  1:50                   ` Ævar Arnfjörð Bjarmason
2021-09-23  9:20               ` [PATCH v8 0/2] " Ævar Arnfjörð Bjarmason
2021-09-23  9:20                 ` [PATCH v8 1/2] Makefile: add SANITIZE=leak flag to GIT-BUILD-OPTIONS Ævar Arnfjörð Bjarmason
2021-09-23  9:20                 ` [PATCH v8 2/2] tests: add a test mode for SANITIZE=leak, run it in CI Ævar Arnfjörð Bjarmason
2021-11-03 22:44                   ` Re* " Junio C Hamano
2021-11-03 23:57                     ` Junio C Hamano
2021-11-04 10:06                     ` Ævar Arnfjörð Bjarmason
2021-11-16 18:31                       ` [PATCH] t0006: date_mode can leak .strftime_fmt member Ævar Arnfjörð Bjarmason
2021-11-16 19:04                         ` Junio C Hamano
2021-11-16 19:31                         ` Jeff King
2022-02-02 21:03                           ` [PATCH 0/5] date.[ch] API: split from cache.h, add API docs, stop leaking memory Ævar Arnfjörð Bjarmason
2022-02-02 21:03                             ` [PATCH 1/5] cache.h: remove always unused show_date_human() declaration Ævar Arnfjörð Bjarmason
2022-02-02 21:03                             ` [PATCH 2/5] date API: create a date.h, split from cache.h Ævar Arnfjörð Bjarmason
2022-02-02 21:19                               ` Ævar Arnfjörð Bjarmason
2022-02-15  3:04                               ` Junio C Hamano
2022-02-02 21:03                             ` [PATCH 3/5] date API: provide and use a DATE_MODE_INIT Ævar Arnfjörð Bjarmason
2022-02-02 21:03                             ` [PATCH 4/5] date API: add basic API docs Ævar Arnfjörð Bjarmason
2022-02-15  2:14                               ` Junio C Hamano
2022-02-02 21:03                             ` [PATCH 5/5] date API: add and use a date_mode_release() Ævar Arnfjörð Bjarmason
2022-02-15  0:28                               ` Junio C Hamano
2022-02-04 23:53                             ` [PATCH v2 0/5] date.[ch] API: split from cache.h, add API docs, stop leaking memory Ævar Arnfjörð Bjarmason
2022-02-04 23:53                               ` [PATCH v2 1/5] cache.h: remove always unused show_date_human() declaration Ævar Arnfjörð Bjarmason
2022-02-04 23:53                               ` [PATCH v2 2/5] date API: create a date.h, split from cache.h Ævar Arnfjörð Bjarmason
2022-02-04 23:53                               ` [PATCH v2 3/5] date API: provide and use a DATE_MODE_INIT Ævar Arnfjörð Bjarmason
2022-02-04 23:53                               ` [PATCH v2 4/5] date API: add basic API docs Ævar Arnfjörð Bjarmason
2022-02-04 23:53                               ` [PATCH v2 5/5] date API: add and use a date_mode_release() Ævar Arnfjörð Bjarmason
2022-02-14 17:25                               ` [PATCH v2 0/5] date.[ch] API: split from cache.h, add API docs, stop leaking memory Ævar Arnfjörð Bjarmason
2022-02-14 19:52                                 ` Junio C Hamano
2022-02-16  8:14                               ` [PATCH v3 " Ævar Arnfjörð Bjarmason
2022-02-16  8:14                                 ` [PATCH v3 1/5] cache.h: remove always unused show_date_human() declaration Ævar Arnfjörð Bjarmason
2022-02-16  8:14                                 ` [PATCH v3 2/5] date API: create a date.h, split from cache.h Ævar Arnfjörð Bjarmason
2022-02-16  8:14                                 ` [PATCH v3 3/5] date API: provide and use a DATE_MODE_INIT Ævar Arnfjörð Bjarmason
2022-02-16  8:14                                 ` [PATCH v3 4/5] date API: add basic API docs Ævar Arnfjörð Bjarmason
2022-02-16  8:14                                 ` [PATCH v3 5/5] date API: add and use a date_mode_release() Ævar Arnfjörð Bjarmason
2022-02-16 17:45                                 ` [PATCH v3 0/5] date.[ch] API: split from cache.h, add API docs, stop leaking memory Junio C Hamano
     [not found]     ` <cover-v3-0.8-00000000000-20210831T132607Z-avarab@gmail.com>
2021-08-31 13:35       ` [PATCH v3 1/8] Makefile: add SANITIZE=leak flag to GIT-BUILD-OPTIONS Ævar Arnfjörð Bjarmason
2021-08-31 13:35       ` [PATCH v3 2/8] CI: refactor "if" to "case" statement Ævar Arnfjörð Bjarmason
2021-08-31 13:35       ` [PATCH v3 3/8] tests: add a test mode for SANITIZE=leak, run it in CI Ævar Arnfjörð Bjarmason
2021-08-31 13:35       ` [PATCH v3 4/8] tests: annotate t000*.sh with TEST_PASSES_SANITIZE_LEAK=true Ævar Arnfjörð Bjarmason
2021-08-31 13:35       ` [PATCH v3 5/8] tests: annotate t001*.sh " Ævar Arnfjörð Bjarmason
2021-08-31 13:35       ` [PATCH v3 6/8] tests: annotate t002*.sh " Ævar Arnfjörð Bjarmason
2021-08-31 13:35       ` [PATCH v3 7/8] tests: annotate select t0*.sh " Ævar Arnfjörð Bjarmason
2021-08-31 13:35       ` [PATCH v3 8/8] tests: annotate select t*.sh " Ævar Arnfjörð Bjarmason