From: "René Scharfe" <l.s.r@web.de>
To: git@vger.kernel.org
Cc: "Junio C Hamano" <gitster@pobox.com>, "Jeff King" <peff@peff.net>,
"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Subject: [PATCH 3/3] object-store: use one oid_array per subdirectory for loose cache
Date: Sun, 6 Jan 2019 17:45:52 +0100
Message-ID: <c8dd851f-0a18-848f-8e58-cc0ee5f8e705@web.de>
In-Reply-To: <3512c798-aa42-6fba-ee82-d33a8985be91@web.de>

The loose objects cache is filled one subdirectory at a time as needed.
It is stored in an oid_array, which has to be re-sorted after each add
operation.  So when querying a wide range of objects, the partially
filled array needs to be re-sorted up to 255 times, which takes over 100
times longer than sorting once.

Use one oid_array for each subdirectory.  This ensures that the entries
of each subdirectory only have to be sorted once.  As a small bonus, it
also avoids eight binary search steps for each cache lookup, because the
first hash byte already selects one of the 256 arrays.
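
To illustrate the difference between the two layouts, here is a minimal
standalone sketch.  It does not use git's code: `struct id_array`,
`id_array_append()`, `id_array_contains()` and `simulate()` are
hypothetical stand-ins that only mimic an array which is sorted lazily on
lookup.  The counter measures how many elements are handed to qsort(),
which is a crude proxy for the sorting work and not a timing.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define HASH_LEN 20
#define SUBDIRS 256
#define PER_SUBDIR 100

/* Hypothetical stand-in for git's struct oid_array; not the real API. */
struct id_array {
	unsigned char (*ids)[HASH_LEN];
	size_t nr, alloc;
	int sorted;
};

/* Number of elements handed to qsort() so far, as a rough cost measure. */
static unsigned long sorted_elems;

static int cmp_id(const void *a, const void *b)
{
	return memcmp(a, b, HASH_LEN);
}

static void id_array_append(struct id_array *a, const unsigned char *id)
{
	if (a->nr == a->alloc) {
		a->alloc = a->alloc ? 2 * a->alloc : 64;
		a->ids = realloc(a->ids, a->alloc * sizeof(*a->ids));
		if (!a->ids)
			abort();
	}
	memcpy(a->ids[a->nr++], id, HASH_LEN);
	a->sorted = 0;	/* any append invalidates the sort order */
}

/* Sort lazily on lookup, then binary-search. */
static int id_array_contains(struct id_array *a, const unsigned char *id)
{
	if (!a->nr)
		return 0;
	if (!a->sorted) {
		qsort(a->ids, a->nr, sizeof(*a->ids), cmp_id);
		sorted_elems += a->nr;
		a->sorted = 1;
	}
	return bsearch(id, a->ids, a->nr, sizeof(*a->ids), cmp_id) != NULL;
}

/*
 * Load each fake subdirectory on demand and look up one ID in it.
 * per_subdir == 0: one shared array; per_subdir != 0: one array per
 * first hash byte.  Returns the number of elements that were sorted.
 */
static unsigned long simulate(int per_subdir)
{
	static struct id_array arrays[SUBDIRS];
	unsigned char id[HASH_LEN];
	int dir, j, found;

	sorted_elems = 0;
	for (dir = 0; dir < SUBDIRS; dir++) {
		struct id_array *a = &arrays[per_subdir ? dir : 0];

		memset(id, 0, sizeof(id));
		id[0] = dir;
		for (j = PER_SUBDIR; j > 0; j--) {
			id[1] = j;
			id_array_append(a, id);
		}
		found = id_array_contains(a, id);
		assert(found);
		(void)found;
	}
	for (dir = 0; dir < SUBDIRS; dir++) {
		free(arrays[dir].ids);
		memset(&arrays[dir], 0, sizeof(arrays[dir]));
	}
	return sorted_elems;
}
```

With 100 fake entries per subdirectory, simulate(0) pushes 3289600
elements through qsort() (100 + 200 + ... + 25600), while simulate(1)
pushes only 25600.  The eight saved binary search steps follow from
log2(256) = 8: picking the array by first byte replaces the first eight
halvings of a search over the combined array.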

The cache is used for collision checks for the log placeholders %h, %t
and %p, and we can see the change speeding them up in a repository with
ca. 100 objects per subdirectory:

   $ git count-objects
   26733 objects, 68808 kilobytes

   Test                        HEAD^             HEAD
   --------------------------------------------------------------------
   4205.1: log with %H         0.51(0.47+0.04)   0.51(0.49+0.02) +0.0%
   4205.2: log with %h         0.84(0.82+0.02)   0.60(0.57+0.03) -28.6%
   4205.3: log with %T         0.53(0.49+0.04)   0.52(0.48+0.03) -1.9%
   4205.4: log with %t         0.84(0.80+0.04)   0.60(0.59+0.01) -28.6%
   4205.5: log with %P         0.52(0.48+0.03)   0.51(0.50+0.01) -1.9%
   4205.6: log with %p         0.85(0.78+0.06)   0.61(0.56+0.05) -28.2%
   4205.7: log with %h-%h-%h   0.96(0.92+0.03)   0.69(0.64+0.04) -28.1%
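
For readers unfamiliar with why abbreviated placeholders need a sorted
cache at all, the following is a rough sketch of a collision check.  It
is an assumption-laden illustration and not git's implementation:
`common_hex_prefix()` and `min_unique_hex_len()` are hypothetical names,
and the sketch ignores packed objects and any minimum abbreviation
length.  It relies on the array being sorted and free of duplicates, so
that only the two neighbors of an entry can share its longest prefix.

```c
#include <assert.h>
#include <stddef.h>

#define HASH_LEN 20

/* Number of leading hex digits (nibbles) that two IDs have in common. */
static size_t common_hex_prefix(const unsigned char *a, const unsigned char *b)
{
	size_t i;

	for (i = 0; i < HASH_LEN; i++) {
		if (a[i] != b[i])
			return 2 * i + ((a[i] >> 4) == (b[i] >> 4));
	}
	return 2 * HASH_LEN;
}

/*
 * Smallest number of hex digits that distinguishes ids[pos] from all
 * other entries of a sorted, duplicate-free array.  Checking the two
 * neighbors is enough because sorting puts the closest IDs side by side.
 */
static size_t min_unique_hex_len(unsigned char (*ids)[HASH_LEN],
				 size_t nr, size_t pos)
{
	size_t len = 1, n;

	assert(pos < nr);
	if (pos > 0) {
		n = common_hex_prefix(ids[pos - 1], ids[pos]) + 1;
		if (n > len)
			len = n;
	}
	if (pos + 1 < nr) {
		n = common_hex_prefix(ids[pos], ids[pos + 1]) + 1;
		if (n > len)
			len = n;
	}
	return len;
}
```

For three sorted IDs starting with abc0, abcd and abd0, the middle one
needs four hex digits, because it shares "abc" with its predecessor.
Every such check needs a sorted array, which is why the cost of
re-sorting shows up in the %h, %t and %p rows and not in %H, %T and %P.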
Reported-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Rene Scharfe <l.s.r@web.de>
---
 object-store.h | 2 +-
 sha1-file.c    | 9 ++++++---
 2 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/object-store.h b/object-store.h
index 709bf856b6..2fb6c0e4db 100644
--- a/object-store.h
+++ b/object-store.h
@@ -20,7 +20,7 @@ struct object_directory {
 	 * Be sure to call odb_load_loose_cache() before using.
 	 */
 	char loose_objects_subdir_seen[256];
-	struct oid_array loose_objects_cache;
+	struct oid_array loose_objects_cache[256];
 
 	/*
 	 * Path to the alternative object store. If this is a relative path,
diff --git a/sha1-file.c b/sha1-file.c
index 2f965b2688..c3c6e50704 100644
--- a/sha1-file.c
+++ b/sha1-file.c
@@ -2155,7 +2155,7 @@ struct oid_array *odb_loose_cache(struct object_directory *odb,
 {
 	int subdir_nr = oid->hash[0];
 	odb_load_loose_cache(odb, subdir_nr);
-	return &odb->loose_objects_cache;
+	return &odb->loose_objects_cache[subdir_nr];
 }
 
 void odb_load_loose_cache(struct object_directory *odb, int subdir_nr)
@@ -2173,14 +2173,17 @@ void odb_load_loose_cache(struct object_directory *odb, int subdir_nr)
 	for_each_file_in_obj_subdir(subdir_nr, &buf,
 				    append_loose_object,
 				    NULL, NULL,
-				    &odb->loose_objects_cache);
+				    &odb->loose_objects_cache[subdir_nr]);
 	odb->loose_objects_subdir_seen[subdir_nr] = 1;
 	strbuf_release(&buf);
 }
 
 void odb_clear_loose_cache(struct object_directory *odb)
 {
-	oid_array_clear(&odb->loose_objects_cache);
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(odb->loose_objects_cache); i++)
+		oid_array_clear(&odb->loose_objects_cache[i]);
 	memset(&odb->loose_objects_subdir_seen, 0,
 	       sizeof(odb->loose_objects_subdir_seen));
 }
--
2.20.1