From: "SZEDER Gábor" <szeder.dev@gmail.com>
To: Andrzej Hunt via GitGitGadget <gitgitgadget@gmail.com>
Cc: git@vger.kernel.org, Andrzej Hunt <andrzej@ahunt.org>,
Andrzej Hunt <ajrhunt@google.com>
Subject: Re: [PATCH 04/12] bloom: clear each bloom_key after use
Date: Sun, 11 Apr 2021 09:26:51 +0200
Message-ID: <20210411072651.GF2947267@szeder.dev>
In-Reply-To: <9ae15b94881369fa1cbd09fc2de9cc94c30edb2d.1617994052.git.gitgitgadget@gmail.com>
On Fri, Apr 09, 2021 at 06:47:23PM +0000, Andrzej Hunt via GitGitGadget wrote:
> From: Andrzej Hunt <ajrhunt@google.com>
>
> fill_bloom_key() allocates memory into bloom_key, we need to clean that
> up once the key is no longer needed.
>
> This fixes the following leak which was found while running t0002-t0099.
> Although this leak is happening in code being called from a test-helper,
> the same code is also used in various locations around git, and could
> presumably happen during normal usage too.
It does indeed happen: 'git commit-graph write --reachable
--changed-paths' generates Bloom filters for every commit, with each
filter containing all paths modified by its associated commit, so it
leaks a lot of 7 * 4-byte hash arrays, one per path. This patch reduces
the memory usage of that command:
                       Max RSS
                   before      after
  -------------------------------------------
  android-base   1275028k   1006576k   -21.1%
  chromium       3245144k   3127764k    -3.6%
  cmssw           793996k    699156k   -12.0%
  cpython         371584k    343480k    -7.6%
  elasticsearch   748104k    637936k   -14.7%
  freebsd-src     819020k    741272k    -9.5%
  gcc             867412k    730332k   -15.8%
  gecko-dev      2619112k   2457280k    -6.2%
  git             252684k    216900k   -14.2%
  glibc           239000k    222228k    -7.0%
  go              264132k    251344k    -4.9%
  homebrew-cask   542188k    480588k   -11.4%
  homebrew-core   805332k    715848k   -11.1%
  jdk             417832k    342928k   -17.9%
  libreoff-core  1257296k   1089980k   -13.3%
  linux          2033296k   1759712k   -13.5%
  llvm-project   1067216k    956704k   -10.4%
  mariadb-srv     695172k    559508k   -19.5%
  postgres        340132k    317416k    -6.7%
  rails           325432k    294332k    -9.6%
  rust            655244k    584904k   -10.7%
  tensorflow      507308k    480848k    -5.2%
  webkit         2466812k   2237332k    -9.3%
Just out of curiosity, I disabled the questionable hardcoded 512-path
limit on the size of modified-path Bloom filters, and the memory usage
in the jdk repository sank by over 55%, from 849520k to 379760k.
Please feel free to include any of the above data points in the commit
message.
> Direct leak of 308 byte(s) in 11 object(s) allocated from:
>     #0 0x49a5e2 in calloc ../projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:154:3
>     #1 0x6f4032 in xcalloc wrapper.c:140:8
>     #2 0x4f2905 in fill_bloom_key bloom.c:137:28
>     #3 0x4f34c1 in get_or_compute_bloom_filter bloom.c:284:4
>     #4 0x4cb484 in get_bloom_filter_for_commit t/helper/test-bloom.c:43:11
>     #5 0x4cb072 in cmd__bloom t/helper/test-bloom.c:97:3
>     #6 0x4ca7ef in cmd_main t/helper/test-tool.c:121:11
>     #7 0x4caace in main common-main.c:52:11
>     #8 0x7f798af95349 in __libc_start_main (/lib64/libc.so.6+0x24349)
>
> SUMMARY: AddressSanitizer: 308 byte(s) leaked in 11 allocation(s).
>
> Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
> ---
> bloom.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/bloom.c b/bloom.c
> index 52b87474c6eb..5e297038bb1f 100644
> --- a/bloom.c
> +++ b/bloom.c
> @@ -283,6 +283,7 @@ struct bloom_filter *get_or_compute_bloom_filter(struct repository *r,
>  			struct bloom_key key;
>  			fill_bloom_key(e->path, strlen(e->path), &key, settings);
>  			add_key_to_filter(&key, filter, settings);
> +			clear_bloom_key(&key);
>  		}
>
> cleanup:
> --
> gitgitgadget
>