From: mhagger@alum.mit.edu
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org, Jeff King <peff@peff.net>,
Drew Northup <drew.northup@maine.edu>,
Jakub Narebski <jnareb@gmail.com>,
Heiko Voigt <hvoigt@hvoigt.net>,
Johan Herland <johan@herland.net>,
Julian Phillips <julian@quantumfyre.co.uk>,
Michael Haggerty <mhagger@alum.mit.edu>
Subject: [PATCH v2 31/51] refs: sort ref_dirs lazily
Date: Mon, 12 Dec 2011 06:38:38 +0100
Message-ID: <1323668338-1764-32-git-send-email-mhagger@alum.mit.edu>
In-Reply-To: <1323668338-1764-1-git-send-email-mhagger@alum.mit.edu>
From: Michael Haggerty <mhagger@alum.mit.edu>
Sort ref_dirs lazily, only when the ordering is needed: for searching
via search_ref_dir(), and for iterating over the references via
do_for_each_ref_in_dir() and do_for_each_ref_in_dirs().

This change means that we never have to sort directories recursively,
so change sort_ref_dir() to not recurse.

NOTE: the dirs can now be sorted as a side effect of other function
calls. Therefore, it would be problematic to do something from an
each_ref_fn callback that could provoke the sorting of the directory
that is currently being iterated over. This is not so likely, because
a directory is always sorted just before being iterated over and thus
can be searched during the iteration without causing a re-sort. But
if a callback function were to add a reference to a parent directory
of the reference in the iteration and then try to resolve a reference
under that directory, inconsistency could result.

Add a comment in refs.h warning against modifications during
iteration.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
refs.c | 35 +++++++++++++++--------------------
refs.h | 7 +++++--
2 files changed, 20 insertions(+), 22 deletions(-)
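
For readers skimming the series, here is a minimal standalone sketch of
the lazy-sorting idea described above. It is not the code in refs.c:
struct toy_dir, toy_add(), toy_sort(), toy_search() and toy_demo() are
hypothetical names made up for illustration, the entries are plain
strings rather than ref_entries, and the de-duplication step that
sort_ref_dir() performs is left out. Only the "sorted == nr" early
return mirrors the patch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct ref_dir: names only, no subdirectories. */
struct toy_dir {
	int nr, alloc;
	int sorted;         /* number of entries known to be in sorted order */
	const char **names;
	int sort_calls;     /* counts real qsort() runs, for illustration only */
};

static int name_cmp(const void *a, const void *b)
{
	return strcmp(*(const char *const *)a, *(const char *const *)b);
}

/* Appending never sorts; it only lets dir->nr run ahead of dir->sorted. */
static void toy_add(struct toy_dir *dir, const char *name)
{
	if (dir->nr == dir->alloc) {
		dir->alloc = dir->alloc ? 2 * dir->alloc : 8;
		dir->names = realloc(dir->names, dir->alloc * sizeof(*dir->names));
		if (!dir->names)
			abort();
	}
	dir->names[dir->nr++] = name;
}

/* Sort only if something was appended since the last sort. */
static void toy_sort(struct toy_dir *dir)
{
	if (dir->sorted == dir->nr)
		return; /* already sorted */
	qsort(dir->names, dir->nr, sizeof(*dir->names), name_cmp);
	dir->sorted = dir->nr;
	dir->sort_calls++;
}

/* The ordering is needed here, so this is where the sort is triggered. */
static int toy_search(struct toy_dir *dir, const char *name)
{
	const char **found;

	if (!dir->nr)
		return -1;
	toy_sort(dir);
	found = bsearch(&name, dir->names, dir->nr, sizeof(*dir->names), name_cmp);
	return found ? (int)(found - dir->names) : -1;
}

/* Usage: returns the number of real sorts that were needed. */
static int toy_demo(void)
{
	struct toy_dir dir = { 0 };
	int calls;

	toy_add(&dir, "refs/heads/next");
	toy_add(&dir, "refs/heads/master");
	toy_add(&dir, "refs/heads/maint");
	assert(dir.sort_calls == 0);    /* appends alone never sort */

	assert(toy_search(&dir, "refs/heads/maint") == 0);
	assert(toy_search(&dir, "refs/heads/pu") == -1);
	assert(dir.sort_calls == 1);    /* second search reused the order */

	toy_add(&dir, "refs/heads/pu");
	assert(toy_search(&dir, "refs/heads/pu") == 3);
	calls = dir.sort_calls;
	free(dir.names);
	return calls;
}
```

In this toy, any number of consecutive toy_add() calls costs no sorting
at all; the cost is paid once, by the first search that follows them.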
diff --git a/refs.c b/refs.c
index ce141ea..f01da78 100644
--- a/refs.c
+++ b/refs.c
@@ -256,9 +256,14 @@ static struct ref_entry *search_ref_dir(struct ref_dir *dir, const char *refname
/*
* We need dir to be sorted so that binary search works.
- * Calling sort_ref_dir() here is not quite as terribly
- * inefficient as it looks, because directories that are
- * already sorted are not re-sorted.
+ * Calling sort_ref_dir() here is not as terribly inefficient
+ * as it looks. (1) If the directory is already sorted, it is
+ * not re-sorted. (2) When adding a reference,
+ * search_ref_dir() is only called to find the containing
+ * subdirectories; there is no search of the directory to
+ * which the reference will be stored. Thus adding a bunch of
+ * references one after the other to a single subdirectory
+ * doesn't require *any* intermediate sorting.
*/
sort_ref_dir(dir);
@@ -364,26 +369,16 @@ static int is_dup_ref(const struct ref_entry *ref1, const struct ref_entry *ref2
}
/*
- * Sort the entries in dir and its subdirectories (if they are not
- * already sorted).
+ * Sort the entries in dir (if they are not already sorted). Sort
+ * only dir itself, not its subdirectories.
*/
static void sort_ref_dir(struct ref_dir *dir)
{
int i, j;
struct ref_entry *last = NULL;
- if (dir->sorted == dir->nr) {
- /*
- * This directory is already sorted and de-duped, but
- * we still have to sort subdirectories.
- */
- for (i = 0; i < dir->nr; i++) {
- struct ref_entry *entry = dir->entries[i];
- if (entry->flag & REF_DIR)
- sort_ref_dir(&entry->u.subdir);
- }
- return;
- }
+ if (dir->sorted == dir->nr)
+ return; /* This directory is already sorted and de-duped */
qsort(dir->entries, dir->nr, sizeof(*dir->entries), ref_entry_cmp);
@@ -393,7 +388,6 @@ static void sort_ref_dir(struct ref_dir *dir)
if (last && is_dup_ref(last, entry)) {
free_ref_entry(entry);
} else if (entry->flag & REF_DIR) {
- sort_ref_dir(&entry->u.subdir);
dir->entries[i++] = entry;
last = NULL;
} else {
@@ -430,6 +424,7 @@ static int do_for_each_ref_in_dir(struct ref_dir *dir, int offset,
each_ref_fn fn, int trim, int flags, void *cb_data)
{
int i;
+ sort_ref_dir(dir);
for (i = offset; i < dir->nr; i++) {
struct ref_entry *entry = dir->entries[i];
int retval;
@@ -453,6 +448,8 @@ static int do_for_each_ref_in_dirs(struct ref_dir *dir1,
int retval;
int i1 = 0, i2 = 0;
+ sort_ref_dir(dir1);
+ sort_ref_dir(dir2);
while (1) {
struct ref_entry *e1, *e2, *entry;
int cmp;
@@ -701,7 +698,6 @@ static void read_packed_refs(FILE *f, struct ref_dir *dir)
!get_sha1_hex(refline + 1, sha1))
hashcpy(last->u.value.peeled, sha1);
}
- sort_ref_dir(dir);
}
void add_extra_ref(const char *refname, const unsigned char *sha1, int flag)
@@ -803,7 +799,6 @@ static struct ref_dir *get_loose_refs(struct ref_cache *refs)
{
if (!refs->did_loose) {
get_ref_dir(refs, "refs", &refs->loose);
- sort_ref_dir(&refs->loose);
refs->did_loose = 1;
}
return &refs->loose;
diff --git a/refs.h b/refs.h
index d498291..5bb4678 100644
--- a/refs.h
+++ b/refs.h
@@ -15,8 +15,11 @@ struct ref_lock {
#define REF_ISBROKEN 0x04
/*
- * Calls the specified function for each ref file until it returns nonzero,
- * and returns the value
+ * Calls the specified function for each ref file until it returns
+ * nonzero, and returns the value. Please note that it is not safe to
+ * modify references while an iteration is in progress, unless the
+ * same callback function invocation that modifies the reference also
+ * returns a nonzero value to immediately stop the iteration.
*/
typedef int each_ref_fn(const char *refname, const unsigned char *sha1, int flags, void *cb_data);
extern int head_ref(each_ref_fn, void *);
--
1.7.8
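
The rule added to refs.h above can be illustrated with a small
standalone sketch. Only the each_ref_fn typedef is taken from the
patch; struct toy_store, toy_for_each(), struct drop_request,
drop_one() and toy_iter_demo() are hypothetical names invented for this
example, and the fixed-size array is a stand-in for the real ref cache.
The point is only the shape of a safe callback: it modifies the store
and returns nonzero from the same invocation, so the iterator never
looks at the store again.

```c
#include <assert.h>
#include <string.h>

/* Signature as declared in refs.h. */
typedef int each_ref_fn(const char *refname, const unsigned char *sha1,
			int flags, void *cb_data);

/* Hypothetical toy store, standing in for the real ref cache. */
struct toy_store {
	const char *names[8];
	int nr;
	int visited;    /* number of callback invocations */
};

/* Hypothetical iterator: stops as soon as fn returns nonzero. */
static int toy_for_each(struct toy_store *store, each_ref_fn fn, void *cb_data)
{
	int i;

	for (i = 0; i < store->nr; i++) {
		int retval;

		store->visited++;
		retval = fn(store->names[i], NULL, 0, cb_data);
		if (retval)
			return retval;  /* the store is not touched after this */
	}
	return 0;
}

struct drop_request {
	struct toy_store *store;
	const char *victim;
};

/* Modifies the store, then returns nonzero in the same invocation. */
static int drop_one(const char *refname, const unsigned char *sha1,
		    int flags, void *cb_data)
{
	struct drop_request *req = cb_data;
	struct toy_store *store = req->store;
	int i;

	(void)sha1;
	(void)flags;
	if (strcmp(refname, req->victim))
		return 0;       /* read-only visit: keep iterating */

	for (i = 0; i < store->nr; i++)
		if (!strcmp(store->names[i], req->victim))
			break;
	memmove(&store->names[i], &store->names[i + 1],
		(store->nr - i - 1) * sizeof(store->names[0]));
	store->nr--;
	return 1;               /* stop: the iterator's index is now stale */
}

/* Usage: returns how many entries were visited before the stop. */
static int toy_iter_demo(void)
{
	struct toy_store store = {
		{ "refs/heads/maint", "refs/heads/master", "refs/heads/next" },
		3, 0
	};
	struct drop_request req = { &store, "refs/heads/master" };
	int retval = toy_for_each(&store, drop_one, &req);

	assert(retval == 1);
	assert(store.visited == 2);     /* "refs/heads/next" was never visited */
	assert(store.nr == 2);
	return store.visited;
}
```

Had drop_one() returned 0 after the memmove(), the loop in this toy
would have continued with an index that no longer matches the array,
which is the kind of inconsistency the new comment warns about.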