From: Derrick Stolee <stolee@gmail.com>
To: Jeff King <peff@peff.net>, Geert Jansen <gerardu@amazon.com>
Cc: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
"Junio C Hamano" <gitster@pobox.com>,
"git@vger.kernel.org" <git@vger.kernel.org>,
"René Scharfe" <l.s.r@web.de>,
"Takuto Ikuta" <tikuta@chromium.org>
Subject: Re: [PATCH 6/9] sha1-file: use an object_directory for the main object dir
Date: Mon, 12 Nov 2018 10:48:36 -0500 [thread overview]
Message-ID: <421d3b43-3425-72c9-218e-facd86e28267@gmail.com> (raw)
In-Reply-To: <20181112145039.GF7400@sigill.intra.peff.net>
On 11/12/2018 9:50 AM, Jeff King wrote:
> Our handling of alternate object directories is needlessly different
> from the main object directory. As a result, many places in the code
> basically look like this:
>
> do_something(r->objects->objdir);
>
> for (odb = r->objects->alt_odb_list; odb; odb = odb->next)
> do_something(odb->path);
>
> That gets annoying when do_something() is non-trivial, and we've
> resorted to gross hacks like creating fake alternates (see
> find_short_object_filename()).
>
> Instead, let's give each raw_object_store a unified list of
> object_directory structs. The first will be the main store, and
> everything after is an alternate. Very few callers even care about the
> distinction, and can just loop over the whole list (and those who care
> can just treat the first element differently).
>
> A few observations:
>
> - we don't need r->objects->objectdir anymore, and can just
> mechanically convert that to r->objects->odb->path
>
> - object_directory's path field needs to become a real pointer rather
> than a FLEX_ARRAY, in order to fill it with expand_base_dir()
>
> - we'll call prepare_alt_odb() earlier in many functions (i.e.,
> outside of the loop). This may result in us calling it even when our
> function would be satisfied looking only at the main odb.
>
> But this doesn't matter in practice. It's not a very expensive
> operation in the first place, and in the majority of cases it will
> be a noop. We call it already (and cache its results) in
> prepare_packed_git(), and we'll generally check packs before loose
> objects. So essentially every program is going to call it
> immediately once per program.
>
> Arguably we should just prepare_alt_odb() immediately upon setting
> up the repository's object directory, which would save us sprinkling
> calls throughout the code base (and forgetting to do so has been a
> source of subtle bugs in the past). But I've stopped short of that
> here, since there are already a lot of other moving parts in this
> patch.
>
> - Most call sites just get shorter. The check_and_freshen() functions
> are an exception, because they have entry points to handle local and
> nonlocal directories separately.
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
> If the "the first one is the main store, the rest are alternates" bit is
> too subtle, we could mark each "struct object_directory" with a bit for
> "is_local".
This is probably a good thing to do proactively. We have the equivalent
in the packed_git struct, but that's also because they get out of order.
At the moment, I can't think of a read-only action that needs to treat
the local object directory more carefully. The closest I know about is
'git pack-objects --local', but that also writes a pack-file.
I assume that when we write a pack-file to the "default location" we use
get_object_directory() instead of referring to the default object_directory?
>
> builtin/fsck.c | 21 ++-------
> builtin/grep.c | 2 +-
> commit-graph.c | 5 +-
> environment.c | 4 +-
> object-store.h | 27 ++++++-----
> object.c | 19 ++++----
> packfile.c | 10 ++--
> path.c | 2 +-
> repository.c | 8 +++-
> sha1-file.c | 122 ++++++++++++++++++-------------------------------
> sha1-name.c | 17 ++-----
> 11 files changed, 90 insertions(+), 147 deletions(-)
>
> diff --git a/builtin/fsck.c b/builtin/fsck.c
> index 55153cf92a..15338bd178 100644
> --- a/builtin/fsck.c
> +++ b/builtin/fsck.c
> @@ -725,13 +725,8 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
> for_each_loose_object(mark_loose_for_connectivity, NULL, 0);
> for_each_packed_object(mark_packed_for_connectivity, NULL, 0);
> } else {
> - struct object_directory *alt_odb_list;
> -
> - fsck_object_dir(get_object_directory());
> -
> prepare_alt_odb(the_repository);
> - alt_odb_list = the_repository->objects->alt_odb_list;
> - for (odb = alt_odb_list; odb; odb = odb->next)
> + for (odb = the_repository->objects->odb; odb; odb = odb->next)
> fsck_object_dir(odb->path);
>
> if (check_full) {
> @@ -834,13 +829,8 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
> struct child_process commit_graph_verify = CHILD_PROCESS_INIT;
> const char *verify_argv[] = { "commit-graph", "verify", NULL, NULL, NULL };
>
> - commit_graph_verify.argv = verify_argv;
> - commit_graph_verify.git_cmd = 1;
> - if (run_command(&commit_graph_verify))
> - errors_found |= ERROR_COMMIT_GRAPH;
> -
> prepare_alt_odb(the_repository);
> - for (odb = the_repository->objects->alt_odb_list; odb; odb = odb->next) {
> + for (odb = the_repository->objects->odb; odb; odb = odb->next) {
> child_process_init(&commit_graph_verify);
> commit_graph_verify.argv = verify_argv;
> commit_graph_verify.git_cmd = 1;
> @@ -855,13 +845,8 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
> struct child_process midx_verify = CHILD_PROCESS_INIT;
> const char *midx_argv[] = { "multi-pack-index", "verify", NULL, NULL, NULL };
>
> - midx_verify.argv = midx_argv;
> - midx_verify.git_cmd = 1;
> - if (run_command(&midx_verify))
> - errors_found |= ERROR_COMMIT_GRAPH;
> -
> prepare_alt_odb(the_repository);
> - for (odb = the_repository->objects->alt_odb_list; odb; odb = odb->next) {
> + for (odb = the_repository->objects->odb; odb; odb = odb->next) {
> child_process_init(&midx_verify);
> midx_verify.argv = midx_argv;
> midx_verify.git_cmd = 1;
> diff --git a/builtin/grep.c b/builtin/grep.c
> index d8508ddf79..714c8d91ba 100644
> --- a/builtin/grep.c
> +++ b/builtin/grep.c
> @@ -441,7 +441,7 @@ static int grep_submodule(struct grep_opt *opt, struct repository *superproject,
> * object.
> */
> grep_read_lock();
> - add_to_alternates_memory(submodule.objects->objectdir);
> + add_to_alternates_memory(submodule.objects->odb->path);
> grep_read_unlock();
>
> if (oid) {
> diff --git a/commit-graph.c b/commit-graph.c
> index 5dd3f5b15c..99163c244b 100644
> --- a/commit-graph.c
> +++ b/commit-graph.c
> @@ -231,7 +231,6 @@ static void prepare_commit_graph_one(struct repository *r, const char *obj_dir)
> static int prepare_commit_graph(struct repository *r)
> {
> struct object_directory *odb;
> - char *obj_dir;
> int config_value;
>
> if (r->objects->commit_graph_attempted)
> @@ -252,10 +251,8 @@ static int prepare_commit_graph(struct repository *r)
> if (!commit_graph_compatible(r))
> return 0;
>
> - obj_dir = r->objects->objectdir;
> - prepare_commit_graph_one(r, obj_dir);
> prepare_alt_odb(r);
> - for (odb = r->objects->alt_odb_list;
> + for (odb = r->objects->odb;
> !r->objects->commit_graph && odb;
> odb = odb->next)
> prepare_commit_graph_one(r, odb->path);
> diff --git a/environment.c b/environment.c
> index 3f3c8746c2..441ce56690 100644
> --- a/environment.c
> +++ b/environment.c
> @@ -274,9 +274,9 @@ const char *get_git_work_tree(void)
>
> char *get_object_directory(void)
> {
> - if (!the_repository->objects->objectdir)
> + if (!the_repository->objects->odb)
> BUG("git environment hasn't been setup");
> - return the_repository->objects->objectdir;
> + return the_repository->objects->odb->path;
> }
>
> int odb_mkstemp(struct strbuf *temp_filename, const char *pattern)
> diff --git a/object-store.h b/object-store.h
> index b2fa0d0df0..30faf7b391 100644
> --- a/object-store.h
> +++ b/object-store.h
> @@ -24,19 +24,14 @@ struct object_directory {
> * Path to the alternative object store. If this is a relative path,
> * it is relative to the current working directory.
> */
> - char path[FLEX_ARRAY];
> + char *path;
> };
> +
> void prepare_alt_odb(struct repository *r);
> char *compute_alternate_path(const char *path, struct strbuf *err);
> typedef int alt_odb_fn(struct object_directory *, void *);
> int foreach_alt_odb(alt_odb_fn, void*);
>
> -/*
> - * Allocate a "struct alternate_object_database" but do _not_ actually
> - * add it to the list of alternates.
> - */
> -struct object_directory *alloc_alt_odb(const char *dir);
> -
> /*
> * Add the directory to the on-disk alternates file; the new entry will also
> * take effect in the current process.
> @@ -80,17 +75,21 @@ struct multi_pack_index;
>
> struct raw_object_store {
> /*
> - * Path to the repository's object store.
> - * Cannot be NULL after initialization.
> + * Set of all object directories; the main directory is first (and
> + * cannot be NULL after initialization). Subsequent directories are
> + * alternates.
> */
> - char *objectdir;
> + struct object_directory *odb;
> + struct object_directory **odb_tail;
> + int loaded_alternates;
>
> - /* Path to extra alternate object database if not NULL */
> + /*
> + * A list of alternate object directories loaded from the environment;
> + * this should not generally need to be accessed directly, but will
> + * populate the "odb" list when prepare_alt_odb() is run.
> + */
> char *alternate_db;
>
> - struct object_directory *alt_odb_list;
> - struct object_directory **alt_odb_tail;
> -
> /*
> * Objects that should be substituted by other objects
> * (see git-replace(1)).
> diff --git a/object.c b/object.c
> index dd485ac629..79d636091c 100644
> --- a/object.c
> +++ b/object.c
> @@ -482,26 +482,26 @@ struct raw_object_store *raw_object_store_new(void)
> return o;
> }
>
> -static void free_alt_odb(struct object_directory *odb)
> +static void free_object_directory(struct object_directory *odb)
> {
> + free(odb->path);
> oid_array_clear(&odb->loose_objects_cache);
> free(odb);
> }
>
> -static void free_alt_odbs(struct raw_object_store *o)
> +static void free_object_directories(struct raw_object_store *o)
> {
> - while (o->alt_odb_list) {
> + while (o->odb) {
> struct object_directory *next;
>
> - next = o->alt_odb_list->next;
> - free_alt_odb(o->alt_odb_list);
> - o->alt_odb_list = next;
> + next = o->odb->next;
> + free_object_directory(o->odb);
> + o->odb = next;
> }
> }
>
> void raw_object_store_clear(struct raw_object_store *o)
> {
> - FREE_AND_NULL(o->objectdir);
> FREE_AND_NULL(o->alternate_db);
>
> oidmap_free(o->replace_map, 1);
> @@ -511,8 +511,9 @@ void raw_object_store_clear(struct raw_object_store *o)
> o->commit_graph = NULL;
> o->commit_graph_attempted = 0;
>
> - free_alt_odbs(o);
> - o->alt_odb_tail = NULL;
> + free_object_directories(o);
> + o->odb_tail = NULL;
> + o->loaded_alternates = 0;
>
> INIT_LIST_HEAD(&o->packed_git_mru);
> close_all_packs(o);
> diff --git a/packfile.c b/packfile.c
> index d6d511cfd2..1eda33247f 100644
> --- a/packfile.c
> +++ b/packfile.c
> @@ -970,12 +970,12 @@ static void prepare_packed_git(struct repository *r)
>
> if (r->objects->packed_git_initialized)
> return;
> - prepare_multi_pack_index_one(r, r->objects->objectdir, 1);
> - prepare_packed_git_one(r, r->objects->objectdir, 1);
> +
> prepare_alt_odb(r);
> - for (odb = r->objects->alt_odb_list; odb; odb = odb->next) {
> - prepare_multi_pack_index_one(r, odb->path, 0);
> - prepare_packed_git_one(r, odb->path, 0);
> + for (odb = r->objects->odb; odb; odb = odb->next) {
> + int local = (odb == r->objects->odb);
Here seems to be a place where `odb->is_local` would help.
> + prepare_multi_pack_index_one(r, odb->path, local);
> + prepare_packed_git_one(r, odb->path, local);
> }
> rearrange_packed_git(r);
>
Thanks,
-Stolee
next prev parent reply other threads:[~2018-11-12 15:48 UTC|newest]
Thread overview: 99+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-10-25 18:38 [RFC PATCH] index-pack: improve performance on NFS Jansen, Geert
2018-10-26 0:21 ` Junio C Hamano
2018-10-26 20:38 ` Ævar Arnfjörð Bjarmason
2018-10-27 7:26 ` Junio C Hamano
2018-10-27 9:33 ` Jeff King
2018-10-27 11:22 ` Ævar Arnfjörð Bjarmason
2018-10-28 22:50 ` [PATCH 0/4] index-pack: optionally turn off SHA-1 collision checking Ævar Arnfjörð Bjarmason
2018-10-30 2:49 ` Geert Jansen
2018-10-30 9:04 ` Junio C Hamano
2018-10-30 18:43 ` [PATCH v2 0/3] index-pack: test updates Ævar Arnfjörð Bjarmason
2018-11-13 20:19 ` [PATCH v3] index-pack: add ability to disable SHA-1 collision check Ævar Arnfjörð Bjarmason
2018-11-14 7:09 ` Junio C Hamano
2018-11-14 12:40 ` Ævar Arnfjörð Bjarmason
2018-10-30 18:43 ` [PATCH v2 1/3] pack-objects test: modernize style Ævar Arnfjörð Bjarmason
2018-10-30 18:43 ` [PATCH v2 2/3] pack-objects tests: don't leave test .git corrupt at end Ævar Arnfjörð Bjarmason
2018-10-30 18:43 ` [PATCH v2 3/3] index-pack tests: don't leave test repo dirty " Ævar Arnfjörð Bjarmason
2018-10-28 22:50 ` [PATCH 1/4] pack-objects test: modernize style Ævar Arnfjörð Bjarmason
2018-10-28 22:50 ` [PATCH 2/4] pack-objects tests: don't leave test .git corrupt at end Ævar Arnfjörð Bjarmason
2018-10-28 22:50 ` [PATCH 3/4] index-pack tests: don't leave test repo dirty " Ævar Arnfjörð Bjarmason
2018-10-28 22:50 ` [PATCH 4/4] index-pack: add ability to disable SHA-1 collision check Ævar Arnfjörð Bjarmason
2018-10-29 15:04 ` [RFC PATCH] index-pack: improve performance on NFS Jeff King
2018-10-29 15:09 ` Jeff King
2018-10-29 19:36 ` Ævar Arnfjörð Bjarmason
2018-10-29 23:27 ` Jeff King
2018-11-07 22:55 ` Geert Jansen
2018-11-08 12:02 ` Jeff King
2018-11-08 20:58 ` Geert Jansen
2018-11-08 21:18 ` Jeff King
2018-11-08 21:55 ` Geert Jansen
2018-11-08 22:20 ` Ævar Arnfjörð Bjarmason
2018-11-09 10:11 ` Ævar Arnfjörð Bjarmason
2018-11-12 14:31 ` Jeff King
2018-11-12 14:46 ` [PATCH 0/9] caching loose objects Jeff King
2018-11-12 14:46 ` [PATCH 1/9] fsck: do not reuse child_process structs Jeff King
2018-11-12 15:26 ` Derrick Stolee
2018-11-12 14:47 ` [PATCH 2/9] submodule--helper: prefer strip_suffix() to ends_with() Jeff King
2018-11-12 18:23 ` Stefan Beller
2018-11-12 14:48 ` [PATCH 3/9] rename "alternate_object_database" to "object_directory" Jeff King
2018-11-12 15:30 ` Derrick Stolee
2018-11-12 15:36 ` Jeff King
2018-11-12 19:41 ` Ramsay Jones
2018-11-12 14:48 ` [PATCH 4/9] sha1_file_name(): overwrite buffer instead of appending Jeff King
2018-11-12 15:32 ` Derrick Stolee
2018-11-12 14:49 ` [PATCH 5/9] handle alternates paths the same as the main object dir Jeff King
2018-11-12 15:38 ` Derrick Stolee
2018-11-12 15:46 ` Jeff King
2018-11-12 15:50 ` Derrick Stolee
2018-11-12 14:50 ` [PATCH 6/9] sha1-file: use an object_directory for " Jeff King
2018-11-12 15:48 ` Derrick Stolee [this message]
2018-11-12 16:09 ` Jeff King
2018-11-12 19:04 ` Stefan Beller
2018-11-22 17:42 ` Jeff King
2018-11-12 18:48 ` Stefan Beller
2018-11-12 14:50 ` [PATCH 7/9] object-store: provide helpers for loose_objects_cache Jeff King
2018-11-12 19:24 ` René Scharfe
2018-11-12 20:16 ` Jeff King
2018-11-12 14:54 ` [PATCH 8/9] sha1-file: use loose object cache for quick existence check Jeff King
2018-11-12 16:00 ` Derrick Stolee
2018-11-12 16:01 ` Ævar Arnfjörð Bjarmason
2018-11-12 16:21 ` Jeff King
2018-11-12 22:18 ` Ævar Arnfjörð Bjarmason
2018-11-12 22:30 ` Ævar Arnfjörð Bjarmason
2018-11-13 10:02 ` Ævar Arnfjörð Bjarmason
2018-11-14 18:21 ` René Scharfe
2018-12-02 10:52 ` René Scharfe
2018-12-03 22:04 ` Jeff King
2018-12-04 21:45 ` René Scharfe
2018-12-05 4:46 ` Jeff King
2018-12-05 6:02 ` René Scharfe
2018-12-05 6:51 ` Jeff King
2018-12-05 8:15 ` Jeff King
2018-12-05 18:41 ` René Scharfe
2018-12-05 20:17 ` Jeff King
2018-11-12 22:44 ` Geert Jansen
2018-11-27 20:48 ` René Scharfe
2018-12-01 19:49 ` Jeff King
2018-11-12 14:55 ` [PATCH 9/9] fetch-pack: drop custom loose object cache Jeff King
2018-11-12 19:25 ` René Scharfe
2018-11-12 19:32 ` Ævar Arnfjörð Bjarmason
2018-11-12 20:07 ` Jeff King
2018-11-12 20:13 ` René Scharfe
2018-11-12 16:02 ` [PATCH 0/9] caching loose objects Derrick Stolee
2018-11-12 19:10 ` Stefan Beller
2018-11-09 13:43 ` [RFC PATCH] index-pack: improve performance on NFS Ævar Arnfjörð Bjarmason
2018-11-09 16:08 ` Duy Nguyen
2018-11-10 14:04 ` Ævar Arnfjörð Bjarmason
2018-11-12 14:34 ` Jeff King
2018-11-12 22:58 ` Geert Jansen
2018-10-27 14:04 ` Duy Nguyen
2018-10-29 15:18 ` Jeff King
2018-10-29 0:48 ` Junio C Hamano
2018-10-29 15:20 ` Jeff King
2018-10-29 18:43 ` Ævar Arnfjörð Bjarmason
2018-10-29 21:34 ` Geert Jansen
2018-10-29 21:50 ` Jeff King
2018-10-29 22:21 ` Geert Jansen
2018-10-29 22:27 ` Jeff King
2018-10-29 22:35 ` Stefan Beller
2018-10-29 23:29 ` Jeff King
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=421d3b43-3425-72c9-218e-facd86e28267@gmail.com \
--to=stolee@gmail.com \
--cc=avarab@gmail.com \
--cc=gerardu@amazon.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=l.s.r@web.de \
--cc=peff@peff.net \
--cc=tikuta@chromium.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).