git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v3 1/4] refs: clean up common_list
@ 2015-08-12 21:57 David Turner
  2015-08-12 21:57 ` [PATCH v3 2/4] path: optimize common dir checking David Turner
                   ` (3 more replies)
  0 siblings, 4 replies; 24+ messages in thread
From: David Turner @ 2015-08-12 21:57 UTC (permalink / raw)
  To: git, mhagger, chriscool, pclouds; +Cc: David Turner

Instead of common_list having formatting like ! and /, use a struct to
hold common_list data in a structured form.

We don't use 'exclude' yet; instead, we keep the old codepath that
handles info/sparse-checkout and logs/HEAD.  Later, we will use exclude.

Signed-off-by: David Turner <dturner@twopensource.com>
---

Junio was worried about the performance of common_list and the weird
string parsing bits of update_common_dir, so this version of the patch
series begins by cleaning and optimizing those bits.

Additionally, I incorporated Junio's suggestion to use
is_per_worktree_ref, and his formatting suggestions.

There is now a hack so that git for-each-ref works on per-worktree
refs.

I also added git-bisect.sh, which I had overzealously reverted during
my proofreading step last time.

---
 path.c | 58 +++++++++++++++++++++++++++++++++++++---------------------
 1 file changed, 37 insertions(+), 21 deletions(-)

diff --git a/path.c b/path.c
index 10f4cbf..236f797 100644
--- a/path.c
+++ b/path.c
@@ -91,35 +91,51 @@ static void replace_dir(struct strbuf *buf, int len, const char *newdir)
 		buf->buf[newlen] = '/';
 }
 
-static const char *common_list[] = {
-	"/branches", "/hooks", "/info", "!/logs", "/lost-found",
-	"/objects", "/refs", "/remotes", "/worktrees", "/rr-cache", "/svn",
-	"config", "!gc.pid", "packed-refs", "shallow",
-	NULL
+struct common_dir {
+	const char *dirname;
+	/* Not considered garbage for report_linked_checkout_garbage */
+	unsigned ignore_garbage:1;
+	unsigned is_dir:1;
+	/* Not common even though its parent is */
+	unsigned exclude:1;
+};
+
+struct common_dir common_list[] = {
+	{ "branches", 0, 1, 0 },
+	{ "hooks", 0, 1, 0 },
+	{ "info", 0, 1, 0 },
+	{ "info/sparse-checkout", 0, 0, 1 },
+	{ "logs", 1, 1, 0 },
+	{ "logs/HEAD", 1, 1, 1 },
+	{ "lost-found", 0, 1, 0 },
+	{ "objects", 0, 1, 0 },
+	{ "refs", 0, 1, 0 },
+	{ "remotes", 0, 1, 0 },
+	{ "worktrees", 0, 1, 0 },
+	{ "rr-cache", 0, 1, 0 },
+	{ "svn", 0, 1, 0 },
+	{ "config", 0, 0, 0 },
+	{ "gc.pid", 1, 0, 0 },
+	{ "packed-refs", 0, 0, 0 },
+	{ "shallow", 0, 0, 0 },
+	{ NULL, 0, 0, 0 }
 };
 
 static void update_common_dir(struct strbuf *buf, int git_dir_len)
 {
 	char *base = buf->buf + git_dir_len;
-	const char **p;
+	const struct common_dir *p;
 
 	if (is_dir_file(base, "logs", "HEAD") ||
 	    is_dir_file(base, "info", "sparse-checkout"))
 		return;	/* keep this in $GIT_DIR */
-	for (p = common_list; *p; p++) {
-		const char *path = *p;
-		int is_dir = 0;
-		if (*path == '!')
-			path++;
-		if (*path == '/') {
-			path++;
-			is_dir = 1;
-		}
-		if (is_dir && dir_prefix(base, path)) {
+	for (p = common_list; p->dirname; p++) {
+		const char *path = p->dirname;
+		if (p->is_dir && dir_prefix(base, path)) {
 			replace_dir(buf, git_dir_len, get_git_common_dir());
 			return;
 		}
-		if (!is_dir && !strcmp(base, path)) {
+		if (!p->is_dir && !strcmp(base, path)) {
 			replace_dir(buf, git_dir_len, get_git_common_dir());
 			return;
 		}
@@ -129,16 +145,16 @@ static void update_common_dir(struct strbuf *buf, int git_dir_len)
 void report_linked_checkout_garbage(void)
 {
 	struct strbuf sb = STRBUF_INIT;
-	const char **p;
+	const struct common_dir *p;
 	int len;
 
 	if (!git_common_dir_env)
 		return;
 	strbuf_addf(&sb, "%s/", get_git_dir());
 	len = sb.len;
-	for (p = common_list; *p; p++) {
-		const char *path = *p;
-		if (*path == '!')
+	for (p = common_list; p->dirname; p++) {
+		const char *path = p->dirname;
+		if (p->ignore_garbage)
 			continue;
 		strbuf_setlen(&sb, len);
 		strbuf_addstr(&sb, path);
-- 
2.0.4.315.gad8727a-twtrsrc

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v3 2/4] path: optimize common dir checking
  2015-08-12 21:57 [PATCH v3 1/4] refs: clean up common_list David Turner
@ 2015-08-12 21:57 ` David Turner
  2015-08-12 22:48   ` Junio C Hamano
                     ` (2 more replies)
  2015-08-12 21:57 ` [PATCH v3 3/4] refs: make refs/worktree/* per-worktree David Turner
                   ` (2 subsequent siblings)
  3 siblings, 3 replies; 24+ messages in thread
From: David Turner @ 2015-08-12 21:57 UTC (permalink / raw)
  To: git, mhagger, chriscool, pclouds; +Cc: David Turner

Instead of a linear search over common_list to check whether
a path is common, use a trie.  The trie search operates on
path prefixes, and handles excludes.

Signed-off-by: David Turner <dturner@twopensource.com>
---

Probably overkill, but maybe we could later use it for making exclude
or sparse-checkout matching faster (or maybe we have to go all the way
to McNaughton-Yamada for that to be truly worthwhile).

---
 path.c | 215 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 201 insertions(+), 14 deletions(-)

diff --git a/path.c b/path.c
index 236f797..21a4ce7 100644
--- a/path.c
+++ b/path.c
@@ -121,25 +121,212 @@ struct common_dir common_list[] = {
 	{ NULL, 0, 0, 0 }
 };
 
-static void update_common_dir(struct strbuf *buf, int git_dir_len)
+/*
+ * A compressed trie.  A trie node consists of zero or more characters that
+ * are common to all elements with this prefix, optionally followed by some
+ * children.  If value is not NULL, the trie node is a terminal node.
+ *
+ * For example, consider the following set of strings:
+ * abc
+ * def
+ * definite
+ * definition
+ *
+ * The trie would look look like:
+ * root: len = 0, value = (something), children a and d non-NULL.
+ *    a: len = 2, contents = bc
+ *    d: len = 2, contents = ef, children i non-NULL, value = (something)
+ *       i: len = 3, contents = nit, children e and i non-NULL, value = NULL
+ *           e: len = 0, children all NULL, value = (something)
+ *           i: len = 2, contents = on, children all NULL, value = (something)
+ */
+struct trie {
+	struct trie *children[256];
+	int len;
+	char *contents;
+	void *value;
+};
+
+static struct trie *make_trie_node(const char *key, void *value)
 {
-	char *base = buf->buf + git_dir_len;
-	const struct common_dir *p;
+	struct trie *new_node = xcalloc(1, sizeof(*new_node));
+	new_node->len = strlen(key);
+	if (new_node->len) {
+		new_node->contents = xmalloc(new_node->len);
+		memcpy(new_node->contents, key, new_node->len);
+	}
+	new_node->value = value;
+	return new_node;
+}
 
-	if (is_dir_file(base, "logs", "HEAD") ||
-	    is_dir_file(base, "info", "sparse-checkout"))
-		return;	/* keep this in $GIT_DIR */
-	for (p = common_list; p->dirname; p++) {
-		const char *path = p->dirname;
-		if (p->is_dir && dir_prefix(base, path)) {
-			replace_dir(buf, git_dir_len, get_git_common_dir());
-			return;
+/*
+ * Add a key/value pair to a trie.  The key is assumed to be \0-terminated.
+ * If there was an existing value for this key, return it.
+ */
+static void *add_to_trie(struct trie *root, const char *key, void *value)
+{
+	struct trie *child;
+	void *old;
+	int i;
+
+	if (!*key) {
+		/* we have reached the end of the key */
+		old = root->value;
+		root->value = value;
+		return old;
+	}
+
+	for (i = 0; i < root->len; ++i) {
+		if (root->contents[i] == key[i])
+			continue;
+
+		/*
+		 * Split this node: child will contain this node's
+		 * existing children.
+		 */
+		child = malloc(sizeof(*child));
+		memcpy(child->children, root->children, sizeof(root->children));
+
+		child->len = root->len - i - 1;
+		if (child->len) {
+			child->contents = strndup(root->contents + i + 1,
+						   child->len);
 		}
-		if (!p->is_dir && !strcmp(base, path)) {
-			replace_dir(buf, git_dir_len, get_git_common_dir());
-			return;
+		child->value = root->value;
+		root->value = NULL;
+		root->len = i;
+
+		memset(root->children, 0, sizeof(root->children));
+		root->children[(unsigned char)root->contents[i]] = child;
+
+		/* This is the newly-added child. */
+		root->children[(unsigned char)key[i]] =
+			make_trie_node(key + i + 1, value);
+		return NULL;
+	}
+
+	/* We have matched the entire compressed section */
+	if (key[i]) {
+		child = root->children[(unsigned char)key[root->len]];
+		if (child) {
+			return add_to_trie(child, key + root->len + 1, value);
+		} else {
+			child = make_trie_node(key + root->len + 1, value);
+			root->children[(unsigned char)key[root->len]] = child;
+			return NULL;
 		}
 	}
+
+	old = root->value;
+	root->value = value;
+	return old;
+}
+
+typedef int (*match_fn)(const char *unmatched, void *data, void *baton);
+
+/*
+ * Search a trie for some key.  Find the longest /-or-\0-terminated
+ * prefix of the key for which the trie contains a value.  Call fn
+ * with the unmatched portion of the key and the found value, and
+ * return its return value.  If there is no such prefix, return -1.
+ *
+ * For example, consider the trie containing only [refs,
+ * refs/worktree] (both with values).
+ *
+ * | key             | unmatched  | val from node | return value |
+ * |-----------------|------------|---------------|--------------|
+ * | a               | not called | n/a           | -1           |
+ * | refs            | \0         | refs          | as per fn    |
+ * | refs/           | /          | refs          | as per fn    |
+ * | refs/w          | /w         | refs          | as per fn    |
+ * | refs/worktree   | \0         | refs/worktree | as per fn    |
+ * | refs/worktree/  | /          | refs/worktree | as per fn    |
+ * | refs/worktree/a | /a         | refs/worktree | as per fn    |
+ * |-----------------|------------|---------------|--------------|
+ *
+ */
+static int trie_find(struct trie *root, const char *key, match_fn fn,
+		     void *baton)
+{
+	int i;
+	int result;
+	struct trie *child;
+
+	if (!*key) {
+		/* we have reached the end of the key */
+		if (root->value && !root->len)
+			return fn(key, root->value, baton);
+		else
+			return -1;
+	}
+
+	for (i = 0; i < root->len; ++i) {
+		if (root->contents[i] != key[i])
+			return -1;
+	}
+
+	/* Matched the entire compressed section */
+	key += i;
+	if (!*key)
+		/* End of key */
+		return fn(key, root->value, baton);
+
+	child = root->children[(unsigned char)*key];
+	if (child)
+		result = trie_find(child, key + 1, fn, baton);
+	else
+		result = -1;
+
+	if (result >= 0 || (*key != '/' && *key != 0))
+		return result;
+	if (root->value)
+		return fn(key, root->value, baton);
+	else
+		return -1;
+}
+
+static struct trie common_trie;
+static int common_trie_done_setup;
+
+static void init_common_trie(void)
+{
+	struct common_dir *p;
+
+	if (common_trie_done_setup)
+		return;
+
+	for (p = common_list; p->dirname; p++)
+		add_to_trie(&common_trie, p->dirname, p);
+
+	common_trie_done_setup = 1;
+}
+
+/*
+ * Helper function for update_common_dir: returns 1 if the dir
+ * prefix is common.
+ */
+static int check_common(const char *unmatched, void *value, void *baton)
+{
+	struct common_dir *dir = value;
+
+	if (!dir)
+		return 0;
+
+	if (dir->is_dir && (unmatched[0] == 0 || unmatched[0] == '/'))
+		return !dir->exclude;
+
+	if (!dir->is_dir && unmatched[0] == 0)
+		return !dir->exclude;
+
+	return 0;
+}
+
+static void update_common_dir(struct strbuf *buf, int git_dir_len)
+{
+	char *base = buf->buf + git_dir_len;
+	init_common_trie();
+	if (trie_find(&common_trie, base, check_common, NULL) > 0)
+		replace_dir(buf, git_dir_len, get_git_common_dir());
 }
 
 void report_linked_checkout_garbage(void)
-- 
2.0.4.315.gad8727a-twtrsrc

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v3 3/4] refs: make refs/worktree/* per-worktree
  2015-08-12 21:57 [PATCH v3 1/4] refs: clean up common_list David Turner
  2015-08-12 21:57 ` [PATCH v3 2/4] path: optimize common dir checking David Turner
@ 2015-08-12 21:57 ` David Turner
  2015-08-13 17:15   ` Eric Sunshine
  2015-08-15  8:04   ` Duy Nguyen
  2015-08-12 21:57 ` [PATCH v3 4/4] bisect: make bisection refs per-worktree David Turner
  2015-08-15  7:44 ` [PATCH v3 1/4] refs: clean up common_list Duy Nguyen
  3 siblings, 2 replies; 24+ messages in thread
From: David Turner @ 2015-08-12 21:57 UTC (permalink / raw)
  To: git, mhagger, chriscool, pclouds; +Cc: David Turner

We need a place to stick refs for bisects in progress that is not
shared between worktrees.  So we use the refs/worktree/ hierarchy.

The is_per_worktree_ref function and associated docs learn that
refs/worktree/ is per-worktree, as does the git_path code in path.c

The ref-packing functions learn that per-worktree refs should not be
packed (since packed-refs is common rather than per-worktree).

Signed-off-by: David Turner <dturner@twopensource.com>
---
 Documentation/glossary-content.txt |  5 +++--
 path.c                             |  1 +
 refs.c                             | 32 +++++++++++++++++++++++++++++++-
 t/t0060-path-utils.sh              |  1 +
 t/t1400-update-ref.sh              | 18 ++++++++++++++++++
 t/t3210-pack-refs.sh               |  7 +++++++
 6 files changed, 61 insertions(+), 3 deletions(-)

diff --git a/Documentation/glossary-content.txt b/Documentation/glossary-content.txt
index 8c6478b..5c707e6 100644
--- a/Documentation/glossary-content.txt
+++ b/Documentation/glossary-content.txt
@@ -413,8 +413,9 @@ exclude;;
 
 [[def_per_worktree_ref]]per-worktree ref::
 	Refs that are per-<<def_working_tree,worktree>>, rather than
-	global.  This is presently only <<def_HEAD,HEAD>>, but might
-	later include other unusual refs.
+	global.  This is presently only <<def_HEAD,HEAD>> and any refs
+	that start with `refs/worktree/`, but might later include other
+	unusual refs.
 
 [[def_pseudoref]]pseudoref::
 	Pseudorefs are a class of files under `$GIT_DIR` which behave
diff --git a/path.c b/path.c
index 21a4ce7..c53f732 100644
--- a/path.c
+++ b/path.c
@@ -110,6 +110,7 @@ struct common_dir common_list[] = {
 	{ "lost-found", 0, 1, 0 },
 	{ "objects", 0, 1, 0 },
 	{ "refs", 0, 1, 0 },
+	{ "refs/worktree", 0, 1, 1 },
 	{ "remotes", 0, 1, 0 },
 	{ "worktrees", 0, 1, 0 },
 	{ "rr-cache", 0, 1, 0 },
diff --git a/refs.c b/refs.c
index e6fc3fe..30331bc 100644
--- a/refs.c
+++ b/refs.c
@@ -298,6 +298,11 @@ struct ref_entry {
 };
 
 static void read_loose_refs(const char *dirname, struct ref_dir *dir);
+static int search_ref_dir(struct ref_dir *dir, const char *refname, size_t len);
+static struct ref_entry *create_dir_entry(struct ref_cache *ref_cache,
+					  const char *dirname, size_t len,
+					  int incomplete);
+static void add_entry_to_dir(struct ref_dir *dir, struct ref_entry *entry);
 
 static struct ref_dir *get_ref_dir(struct ref_entry *entry)
 {
@@ -306,6 +311,24 @@ static struct ref_dir *get_ref_dir(struct ref_entry *entry)
 	dir = &entry->u.subdir;
 	if (entry->flag & REF_INCOMPLETE) {
 		read_loose_refs(entry->name, dir);
+
+		/*
+		 * Manually add refs/worktree, which, being
+		 * per-worktree, might not appear in the directory
+		 * listing for refs/ in the main repo.
+		 */
+		if (!strcmp(entry->name, "refs/")) {
+			int pos = search_ref_dir(dir, "refs/worktree/", 14);
+			if (pos < 0) {
+				struct ref_entry *child_entry;
+				child_entry = create_dir_entry(dir->ref_cache,
+							       "refs/worktree/",
+							       14, 1);
+				add_entry_to_dir(dir, child_entry);
+				read_loose_refs("refs/worktree",
+						&child_entry->u.subdir);
+			}
+		}
 		entry->flag &= ~REF_INCOMPLETE;
 	}
 	return dir;
@@ -2643,6 +2666,8 @@ struct pack_refs_cb_data {
 	struct ref_to_prune *ref_to_prune;
 };
 
+static int is_per_worktree_ref(const char *refname);
+
 /*
  * An each_ref_entry_fn that is run over loose references only.  If
  * the loose reference can be packed, add an entry in the packed ref
@@ -2656,6 +2681,10 @@ static int pack_if_possible_fn(struct ref_entry *entry, void *cb_data)
 	struct ref_entry *packed_entry;
 	int is_tag_ref = starts_with(entry->name, "refs/tags/");
 
+	/* Do not pack per-worktree refs: */
+	if (is_per_worktree_ref(entry->name))
+		return 0;
+
 	/* ALWAYS pack tags */
 	if (!(cb->flags & PACK_REFS_ALL) && !is_tag_ref)
 		return 0;
@@ -2850,7 +2879,8 @@ static int delete_ref_loose(struct ref_lock *lock, int flag, struct strbuf *err)
 
 static int is_per_worktree_ref(const char *refname)
 {
-	return !strcmp(refname, "HEAD");
+	return !strcmp(refname, "HEAD") ||
+		starts_with(refname, "refs/worktree/");
 }
 
 static int is_pseudoref_syntax(const char *refname)
diff --git a/t/t0060-path-utils.sh b/t/t0060-path-utils.sh
index 93605f4..28e6dff 100755
--- a/t/t0060-path-utils.sh
+++ b/t/t0060-path-utils.sh
@@ -275,6 +275,7 @@ test_git_path GIT_COMMON_DIR=bar remotes/bar              bar/remotes/bar
 test_git_path GIT_COMMON_DIR=bar branches/bar             bar/branches/bar
 test_git_path GIT_COMMON_DIR=bar logs/refs/heads/master   bar/logs/refs/heads/master
 test_git_path GIT_COMMON_DIR=bar refs/heads/master        bar/refs/heads/master
+test_git_path GIT_COMMON_DIR=bar refs/worktree/foo        .git/refs/worktree/foo
 test_git_path GIT_COMMON_DIR=bar hooks/me                 bar/hooks/me
 test_git_path GIT_COMMON_DIR=bar config                   bar/config
 test_git_path GIT_COMMON_DIR=bar packed-refs              bar/packed-refs
diff --git a/t/t1400-update-ref.sh b/t/t1400-update-ref.sh
index 9d21c19..7ecd33b 100755
--- a/t/t1400-update-ref.sh
+++ b/t/t1400-update-ref.sh
@@ -1131,4 +1131,22 @@ test_expect_success ULIMIT_FILE_DESCRIPTORS 'large transaction deleting branches
 )
 '
 
+test_expect_success 'handle per-worktree refs in refs/worktree' '
+	git commit --allow-empty -m "initial commit" &&
+	git worktree add -b branch worktree &&
+	(
+		cd worktree &&
+		git commit --allow-empty -m "test commit"  &&
+		git for-each-ref | test_must_fail grep refs/worktree &&
+		git update-ref refs/worktree/something HEAD &&
+		git rev-parse refs/worktree/something >../worktree-head &&
+		git for-each-ref | grep refs/worktree/something
+	) &&
+	test_path_is_missing .git/refs/worktree &&
+	test_must_fail git rev-parse refs/worktree/something &&
+	git update-ref refs/worktree/something HEAD &&
+	git rev-parse refs/worktree/something >main-head &&
+	! test_cmp main-head worktree-head
+'
+
 test_done
diff --git a/t/t3210-pack-refs.sh b/t/t3210-pack-refs.sh
index 8aae98d..c54cd29 100755
--- a/t/t3210-pack-refs.sh
+++ b/t/t3210-pack-refs.sh
@@ -160,6 +160,13 @@ test_expect_success 'pack ref directly below refs/' '
 	test_path_is_missing .git/refs/top
 '
 
+test_expect_success 'do not pack ref in refs/worktree' '
+	git update-ref refs/worktree/local HEAD &&
+	git pack-refs --all --prune &&
+	! grep refs/worktree/local .git/packed-refs >/dev/null &&
+	test_path_is_file .git/refs/worktree/local
+'
+
 test_expect_success 'disable reflogs' '
 	git config core.logallrefupdates false &&
 	rm -rf .git/logs
-- 
2.0.4.315.gad8727a-twtrsrc

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v3 4/4] bisect: make bisection refs per-worktree
  2015-08-12 21:57 [PATCH v3 1/4] refs: clean up common_list David Turner
  2015-08-12 21:57 ` [PATCH v3 2/4] path: optimize common dir checking David Turner
  2015-08-12 21:57 ` [PATCH v3 3/4] refs: make refs/worktree/* per-worktree David Turner
@ 2015-08-12 21:57 ` David Turner
  2015-08-15  7:44 ` [PATCH v3 1/4] refs: clean up common_list Duy Nguyen
  3 siblings, 0 replies; 24+ messages in thread
From: David Turner @ 2015-08-12 21:57 UTC (permalink / raw)
  To: git, mhagger, chriscool, pclouds; +Cc: David Turner

Using the new refs/worktree/ refs, make bisection per-worktree.

Signed-off-by: David Turner <dturner@twopensource.com>
---
 Documentation/git-bisect.txt       |  4 ++--
 Documentation/rev-list-options.txt | 14 +++++++-------
 bisect.c                           |  2 +-
 builtin/rev-parse.c                |  6 ++++--
 git-bisect.sh                      | 14 +++++++-------
 revision.c                         |  2 +-
 t/t6030-bisect-porcelain.sh        | 20 ++++++++++----------
 7 files changed, 32 insertions(+), 30 deletions(-)

diff --git a/Documentation/git-bisect.txt b/Documentation/git-bisect.txt
index e97f2de..f0c52d1 100644
--- a/Documentation/git-bisect.txt
+++ b/Documentation/git-bisect.txt
@@ -82,7 +82,7 @@ to ask for the next commit that needs testing.
 
 Eventually there will be no more revisions left to inspect, and the
 command will print out a description of the first bad commit. The
-reference `refs/bisect/bad` will be left pointing at that commit.
+reference `refs/worktree/bisect/bad` will be left pointing at that commit.
 
 
 Bisect reset
@@ -373,7 +373,7 @@ on a single line.
 ------------
 $ git bisect start HEAD <known-good-commit> [ <boundary-commit> ... ] --no-checkout
 $ git bisect run sh -c '
-	GOOD=$(git for-each-ref "--format=%(objectname)" refs/bisect/good-*) &&
+	GOOD=$(git for-each-ref "--format=%(objectname)" refs/worktree/bisect/good-*) &&
 	git rev-list --objects BISECT_HEAD --not $GOOD >tmp.$$ &&
 	git pack-objects --stdout >/dev/null <tmp.$$
 	rc=$?
diff --git a/Documentation/rev-list-options.txt b/Documentation/rev-list-options.txt
index a9b808f..1175960 100644
--- a/Documentation/rev-list-options.txt
+++ b/Documentation/rev-list-options.txt
@@ -183,9 +183,9 @@ explicitly.
 
 ifndef::git-rev-list[]
 --bisect::
-	Pretend as if the bad bisection ref `refs/bisect/bad`
+	Pretend as if the bad bisection ref `refs/worktree/bisect/bad`
 	was listed and as if it was followed by `--not` and the good
-	bisection refs `refs/bisect/good-*` on the command
+	bisection refs `refs/worktree/bisect/good-*` on the command
 	line. Cannot be combined with --first-parent.
 endif::git-rev-list[]
 
@@ -548,10 +548,10 @@ Bisection Helpers
 --bisect::
 	Limit output to the one commit object which is roughly halfway between
 	included and excluded commits. Note that the bad bisection ref
-	`refs/bisect/bad` is added to the included commits (if it
-	exists) and the good bisection refs `refs/bisect/good-*` are
+	`refs/worktree/bisect/bad` is added to the included commits (if it
+	exists) and the good bisection refs `refs/worktree/bisect/good-*` are
 	added to the excluded commits (if they exist). Thus, supposing there
-	are no refs in `refs/bisect/`, if
+	are no refs in `refs/worktree/bisect/`, if
 +
 -----------------------------------------------------------------------
 	$ git rev-list --bisect foo ^bar ^baz
@@ -571,7 +571,7 @@ one. Cannot be combined with --first-parent.
 
 --bisect-vars::
 	This calculates the same as `--bisect`, except that refs in
-	`refs/bisect/` are not used, and except that this outputs
+	`refs/worktree/bisect/` are not used, and except that this outputs
 	text ready to be eval'ed by the shell. These lines will assign the
 	name of the midpoint revision to the variable `bisect_rev`, and the
 	expected number of commits to be tested after `bisect_rev` is tested
@@ -584,7 +584,7 @@ one. Cannot be combined with --first-parent.
 --bisect-all::
 	This outputs all the commit objects between the included and excluded
 	commits, ordered by their distance to the included and excluded
-	commits. Refs in `refs/bisect/` are not used. The farthest
+	commits. Refs in `refs/worktree/bisect/` are not used. The farthest
 	from them is displayed first. (This is the only one displayed by
 	`--bisect`.)
 +
diff --git a/bisect.c b/bisect.c
index 33ac88d..dbe3461 100644
--- a/bisect.c
+++ b/bisect.c
@@ -425,7 +425,7 @@ static int register_ref(const char *refname, const struct object_id *oid,
 
 static int read_bisect_refs(void)
 {
-	return for_each_ref_in("refs/bisect/", register_ref, NULL);
+	return for_each_ref_in("refs/worktree/bisect/", register_ref, NULL);
 }
 
 static void read_bisect_paths(struct argv_array *array)
diff --git a/builtin/rev-parse.c b/builtin/rev-parse.c
index 02d747d..3240ddf 100644
--- a/builtin/rev-parse.c
+++ b/builtin/rev-parse.c
@@ -663,8 +663,10 @@ int cmd_rev_parse(int argc, const char **argv, const char *prefix)
 				continue;
 			}
 			if (!strcmp(arg, "--bisect")) {
-				for_each_ref_in("refs/bisect/bad", show_reference, NULL);
-				for_each_ref_in("refs/bisect/good", anti_reference, NULL);
+				for_each_ref_in("refs/worktree/bisect/bad",
+						show_reference, NULL);
+				for_each_ref_in("refs/worktree/bisect/good",
+						anti_reference, NULL);
 				continue;
 			}
 			if (starts_with(arg, "--branches=")) {
diff --git a/git-bisect.sh b/git-bisect.sh
index ea63223..c16433d 100755
--- a/git-bisect.sh
+++ b/git-bisect.sh
@@ -210,7 +210,7 @@ bisect_write() {
 		*)
 			die "$(eval_gettext "Bad bisect_write argument: \$state")" ;;
 	esac
-	git update-ref "refs/bisect/$tag" "$rev" || exit
+	git update-ref "refs/worktree/bisect/$tag" "$rev" || exit
 	echo "# $state: $(git show-branch $rev)" >>"$GIT_DIR/BISECT_LOG"
 	test -n "$nolog" || echo "git bisect $state $rev" >>"$GIT_DIR/BISECT_LOG"
 }
@@ -282,8 +282,8 @@ bisect_state() {
 
 bisect_next_check() {
 	missing_good= missing_bad=
-	git show-ref -q --verify refs/bisect/$TERM_BAD || missing_bad=t
-	test -n "$(git for-each-ref "refs/bisect/$TERM_GOOD-*")" || missing_good=t
+	git show-ref -q --verify refs/worktree/bisect/$TERM_BAD || missing_bad=t
+	test -n "$(git for-each-ref "refs/worktree/bisect/$TERM_GOOD-*")" || missing_good=t
 
 	case "$missing_good,$missing_bad,$1" in
 	,,*)
@@ -341,15 +341,15 @@ bisect_next() {
 	# Check if we should exit because bisection is finished
 	if test $res -eq 10
 	then
-		bad_rev=$(git show-ref --hash --verify refs/bisect/$TERM_BAD)
+		bad_rev=$(git show-ref --hash --verify refs/worktree/bisect/$TERM_BAD)
 		bad_commit=$(git show-branch $bad_rev)
 		echo "# first $TERM_BAD commit: $bad_commit" >>"$GIT_DIR/BISECT_LOG"
 		exit 0
 	elif test $res -eq 2
 	then
 		echo "# only skipped commits left to test" >>"$GIT_DIR/BISECT_LOG"
-		good_revs=$(git for-each-ref --format="%(objectname)" "refs/bisect/$TERM_GOOD-*")
-		for skipped in $(git rev-list refs/bisect/$TERM_BAD --not $good_revs)
+		good_revs=$(git for-each-ref --format="%(objectname)" "refs/worktree/bisect/$TERM_GOOD-*")
+		for skipped in $(git rev-list refs/worktree/bisect/$TERM_BAD --not $good_revs)
 		do
 			skipped_commit=$(git show-branch $skipped)
 			echo "# possible first $TERM_BAD commit: $skipped_commit" >>"$GIT_DIR/BISECT_LOG"
@@ -412,7 +412,7 @@ Try 'git bisect reset <commit>'.")"
 
 bisect_clean_state() {
 	# There may be some refs packed during bisection.
-	git for-each-ref --format='%(refname) %(objectname)' refs/bisect/\* |
+	git for-each-ref --format='%(refname) %(objectname)' refs/worktree/bisect/\* |
 	while read ref hash
 	do
 		git update-ref -d $ref $hash || exit
diff --git a/revision.c b/revision.c
index b6b2cf7..974a11f 100644
--- a/revision.c
+++ b/revision.c
@@ -2084,7 +2084,7 @@ extern void read_bisect_terms(const char **bad, const char **good);
 static int for_each_bisect_ref(const char *submodule, each_ref_fn fn, void *cb_data, const char *term) {
 	struct strbuf bisect_refs = STRBUF_INIT;
 	int status;
-	strbuf_addf(&bisect_refs, "refs/bisect/%s", term);
+	strbuf_addf(&bisect_refs, "refs/worktree/bisect/%s", term);
 	status = for_each_ref_in_submodule(submodule, bisect_refs.buf, fn, cb_data);
 	strbuf_release(&bisect_refs);
 	return status;
diff --git a/t/t6030-bisect-porcelain.sh b/t/t6030-bisect-porcelain.sh
index 9e2c203..bfd5463 100755
--- a/t/t6030-bisect-porcelain.sh
+++ b/t/t6030-bisect-porcelain.sh
@@ -68,7 +68,7 @@ test_expect_success 'bisect fails if given any junk instead of revs' '
 	git bisect reset &&
 	test_must_fail git bisect start foo $HASH1 -- &&
 	test_must_fail git bisect start $HASH4 $HASH1 bar -- &&
-	test -z "$(git for-each-ref "refs/bisect/*")" &&
+	test -z "$(git for-each-ref "refs/worktree/bisect/*")" &&
 	test -z "$(ls .git/BISECT_* 2>/dev/null)" &&
 	git bisect start &&
 	test_must_fail git bisect good foo $HASH1 &&
@@ -77,7 +77,7 @@ test_expect_success 'bisect fails if given any junk instead of revs' '
 	test_must_fail git bisect bad $HASH3 $HASH4 &&
 	test_must_fail git bisect skip bar $HASH3 &&
 	test_must_fail git bisect skip $HASH1 foo &&
-	test -z "$(git for-each-ref "refs/bisect/*")" &&
+	test -z "$(git for-each-ref "refs/worktree/bisect/*")" &&
 	git bisect good $HASH1 &&
 	git bisect bad $HASH4
 '
@@ -115,7 +115,7 @@ test_expect_success 'bisect reset removes packed refs' '
 	git pack-refs --all --prune &&
 	git bisect next &&
 	git bisect reset &&
-	test -z "$(git for-each-ref "refs/bisect/*")" &&
+	test -z "$(git for-each-ref "refs/worktree/bisect/*")" &&
 	test -z "$(git for-each-ref "refs/heads/bisect")"
 '
 
@@ -126,7 +126,7 @@ test_expect_success 'bisect reset removes bisect state after --no-checkout' '
 	git bisect bad $HASH3 &&
 	git bisect next &&
 	git bisect reset &&
-	test -z "$(git for-each-ref "refs/bisect/*")" &&
+	test -z "$(git for-each-ref "refs/worktree/bisect/*")" &&
 	test -z "$(git for-each-ref "refs/heads/bisect")" &&
 	test -z "$(git for-each-ref "BISECT_HEAD")"
 '
@@ -176,7 +176,7 @@ test_expect_success 'bisect start: no ".git/BISECT_START" if checkout error' '
 	git branch > branch.output &&
 	grep "* other" branch.output > /dev/null &&
 	test_must_fail test -e .git/BISECT_START &&
-	test -z "$(git for-each-ref "refs/bisect/*")" &&
+	test -z "$(git for-each-ref "refs/worktree/bisect/*")" &&
 	git checkout HEAD hello
 '
 
@@ -671,7 +671,7 @@ test_expect_success 'bisect: --no-checkout - target before breakage' '
 	git bisect bad BISECT_HEAD &&
 	check_same BROKEN_HASH5 BISECT_HEAD &&
 	git bisect bad BISECT_HEAD &&
-	check_same BROKEN_HASH5 bisect/bad &&
+	check_same BROKEN_HASH5 refs/worktree/bisect/bad &&
 	git bisect reset
 '
 
@@ -682,7 +682,7 @@ test_expect_success 'bisect: --no-checkout - target in breakage' '
 	git bisect bad BISECT_HEAD &&
 	check_same BROKEN_HASH5 BISECT_HEAD &&
 	git bisect good BISECT_HEAD &&
-	check_same BROKEN_HASH6 bisect/bad &&
+	check_same BROKEN_HASH6 refs/worktree/bisect/bad &&
 	git bisect reset
 '
 
@@ -693,7 +693,7 @@ test_expect_success 'bisect: --no-checkout - target after breakage' '
 	git bisect good BISECT_HEAD &&
 	check_same BROKEN_HASH8 BISECT_HEAD &&
 	git bisect good BISECT_HEAD &&
-	check_same BROKEN_HASH9 bisect/bad &&
+	check_same BROKEN_HASH9 refs/worktree/bisect/bad &&
 	git bisect reset
 '
 
@@ -702,13 +702,13 @@ test_expect_success 'bisect: demonstrate identification of damage boundary' "
 	git checkout broken &&
 	git bisect start broken master --no-checkout &&
 	git bisect run \"\$SHELL_PATH\" -c '
-		GOOD=\$(git for-each-ref \"--format=%(objectname)\" refs/bisect/good-*) &&
+		GOOD=\$(git for-each-ref \"--format=%(objectname)\" refs/worktree/bisect/good-*) &&
 		git rev-list --objects BISECT_HEAD --not \$GOOD >tmp.\$\$ &&
 		git pack-objects --stdout >/dev/null < tmp.\$\$
 		rc=\$?
 		rm -f tmp.\$\$
 		test \$rc = 0' &&
-	check_same BROKEN_HASH6 bisect/bad &&
+	check_same BROKEN_HASH6 refs/worktree/bisect/bad &&
 	git bisect reset
 "
 
-- 
2.0.4.315.gad8727a-twtrsrc

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 2/4] path: optimize common dir checking
  2015-08-12 21:57 ` [PATCH v3 2/4] path: optimize common dir checking David Turner
@ 2015-08-12 22:48   ` Junio C Hamano
  2015-08-13  9:05   ` Michael Haggerty
  2015-08-15  7:59   ` Duy Nguyen
  2 siblings, 0 replies; 24+ messages in thread
From: Junio C Hamano @ 2015-08-12 22:48 UTC (permalink / raw)
  To: David Turner; +Cc: git, mhagger, chriscool, pclouds

David Turner <dturner@twopensource.com> writes:

> Instead of a linear search over common_list to check whether
> a path is common, use a trie.  The trie search operates on
> path prefixes, and handles excludes.
>
> Signed-off-by: David Turner <dturner@twopensource.com>
> ---
>
> Probably overkill, but maybe we could later use it for making exclude
> or sparse-checkout matching faster (or maybe we have to go all the way
> to McNaughton-Yamada for that to be truly worthwhile).

This is why I love this list.  A mere mention of "something better
than linear list" immediately is answered by a trie and a mention of
McNaughton-Yamada ;-).

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 2/4] path: optimize common dir checking
  2015-08-12 21:57 ` [PATCH v3 2/4] path: optimize common dir checking David Turner
  2015-08-12 22:48   ` Junio C Hamano
@ 2015-08-13  9:05   ` Michael Haggerty
  2015-08-14 17:04     ` Junio C Hamano
  2015-08-15  7:59   ` Duy Nguyen
  2 siblings, 1 reply; 24+ messages in thread
From: Michael Haggerty @ 2015-08-13  9:05 UTC (permalink / raw)
  To: David Turner, git, chriscool, pclouds

On 08/12/2015 11:57 PM, David Turner wrote:
> Instead of a linear search over common_list to check whether
> a path is common, use a trie.  The trie search operates on
> path prefixes, and handles excludes.
> 
> Signed-off-by: David Turner <dturner@twopensource.com>
> ---
> 
> Probably overkill, but maybe we could later use it for making exclude
> or sparse-checkout matching faster (or maybe we have to go all the way
> to McNaughton-Yamada for that to be truly worthwhile).

Let's take a step back.

We have always had a ton of code that uses `git_path()` and friends to
convert abstract things into filesystem paths. Let's take the
reference-handling code as an example:

`git_path("refs/heads/master")` returns something like
".git/refs/heads/master", which happens to be the place where we would
store a loose reference with that name. But in reality,
"refs/heads/master" is a reference name, not a fragment of a path. It's
just that the reference code knows that the transformation done by
`git_path()` *accidentally* happens to convert a reference name into the
name of the path of the corresponding loose reference file.

In fact, the reference code is even smarter than that. It knows that
within submodules, `git_path()` does *not* do the right mapping. In
those cases it calls `git_path_submodule()` instead.

But now we have workspaces, and things have become more complicated.
Some references are stored in `$GIT_DIR`, while others are stored in
`$GIT_COMMON_DIR`. Who should know all of these details?

The current answer is that the reference-handling code remains (mostly)
ignorant of workspaces. It just stupidly calls `git_path()` (or
`git_path_submodule()`) regardless of the reference name. It is
`git_path()` that has grown the global insight to know which files are
now stored in `$GIT_COMMON_DIR` vs `$GIT_DIR`. Now it helpfully
transforms "refs/heads/master" into "$GIT_COMMON_DIR/refs/heads/master"
but transforms "refs/worktree/foo" into "$GIT_DIR/refs/worktree/foo". It
has developed similar insight into lots of other file types. IT KNOWS
TOO MUCH. And because of that, it become a lot more complicated and
might even be a performance problem.

This seems crazy to me. It is the *reference* code that should know
whether a particular reference should be stored under `$GIT_DIR` or
`$GIT_COMMON_DIR`, or indeed whether it should be stored in a database.

We should have two *stupid* functions, `git_workspace_path()` and
`git_common_path()`, and have the *callers* decide which one to call.

The only reason to retain a knows-everything `git_path()` function is as
a crutch for 3rd-party applications that think they are clever enough to
grub around in `$GIT_DIR` at the filesystem level. But that should be
highly discouraged, and we should make it our mission to provide
commands that make it unnecessary.

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 3/4] refs: make refs/worktree/* per-worktree
  2015-08-12 21:57 ` [PATCH v3 3/4] refs: make refs/worktree/* per-worktree David Turner
@ 2015-08-13 17:15   ` Eric Sunshine
  2015-08-13 17:41     ` David Turner
  2015-08-15  8:04   ` Duy Nguyen
  1 sibling, 1 reply; 24+ messages in thread
From: Eric Sunshine @ 2015-08-13 17:15 UTC (permalink / raw)
  To: David Turner
  Cc: Git List, Michael Haggerty, Christian Couder,
	Nguyễn Thái Ngọc Duy

On Wed, Aug 12, 2015 at 5:57 PM, David Turner <dturner@twopensource.com> wrote:
> diff --git a/t/t0060-path-utils.sh b/t/t0060-path-utils.sh
> index 93605f4..28e6dff 100755
> --- a/t/t0060-path-utils.sh
> +++ b/t/t0060-path-utils.sh
> +test_expect_success 'handle per-worktree refs in refs/worktree' '
> +       git commit --allow-empty -m "initial commit" &&
> +       git worktree add -b branch worktree &&
> +       (
> +               cd worktree &&
> +               git commit --allow-empty -m "test commit"  &&
> +               git for-each-ref | test_must_fail grep refs/worktree &&

s/test_must_fail/!/

>From t/README:

   On the other hand, don't use test_must_fail for running regular
   platform commands; just use '! cmd'.  We are not in the business
   of verifying that the world given to us sanely works.

> +               git update-ref refs/worktree/something HEAD &&
> +               git rev-parse refs/worktree/something >../worktree-head &&
> +               git for-each-ref | grep refs/worktree/something
> +       ) &&
> +       test_path_is_missing .git/refs/worktree &&
> +       test_must_fail git rev-parse refs/worktree/something &&
> +       git update-ref refs/worktree/something HEAD &&
> +       git rev-parse refs/worktree/something >main-head &&
> +       ! test_cmp main-head worktree-head
> +'

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 3/4] refs: make refs/worktree/* per-worktree
  2015-08-13 17:15   ` Eric Sunshine
@ 2015-08-13 17:41     ` David Turner
  2015-08-13 20:16       ` Michael Haggerty
  0 siblings, 1 reply; 24+ messages in thread
From: David Turner @ 2015-08-13 17:41 UTC (permalink / raw)
  To: Eric Sunshine
  Cc: Git List, Michael Haggerty, Christian Couder,
	Nguyễn Thái Ngọc Duy

On Thu, 2015-08-13 at 13:15 -0400, Eric Sunshine wrote:
> On Wed, Aug 12, 2015 at 5:57 PM, David Turner <dturner@twopensource.com> wrote:
> > diff --git a/t/t0060-path-utils.sh b/t/t0060-path-utils.sh
> > index 93605f4..28e6dff 100755
> > --- a/t/t0060-path-utils.sh
> > +++ b/t/t0060-path-utils.sh
> > +test_expect_success 'handle per-worktree refs in refs/worktree' '
> > +       git commit --allow-empty -m "initial commit" &&
> > +       git worktree add -b branch worktree &&
> > +       (
> > +               cd worktree &&
> > +               git commit --allow-empty -m "test commit"  &&
> > +               git for-each-ref | test_must_fail grep refs/worktree &&
> 
> s/test_must_fail/!/
> 
> From t/README:
> 
>    On the other hand, don't use test_must_fail for running regular
>    platform commands; just use '! cmd'.  We are not in the business
>    of verifying that the world given to us sanely works.

When I make that change, my test fails with:

FATAL: Unexpected exit with code 2

Apparently, you can't use ! in pipelines like that.  So that's why I
used test_must_fail.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 3/4] refs: make refs/worktree/* per-worktree
  2015-08-13 17:41     ` David Turner
@ 2015-08-13 20:16       ` Michael Haggerty
  2015-08-13 20:32         ` David Turner
  0 siblings, 1 reply; 24+ messages in thread
From: Michael Haggerty @ 2015-08-13 20:16 UTC (permalink / raw)
  To: David Turner, Eric Sunshine
  Cc: Git List, Christian Couder, Nguyễn Thái Ngọc Duy

On 08/13/2015 07:41 PM, David Turner wrote:
> On Thu, 2015-08-13 at 13:15 -0400, Eric Sunshine wrote:
>> On Wed, Aug 12, 2015 at 5:57 PM, David Turner <dturner@twopensource.com> wrote:
>>> diff --git a/t/t0060-path-utils.sh b/t/t0060-path-utils.sh
>>> index 93605f4..28e6dff 100755
>>> --- a/t/t0060-path-utils.sh
>>> +++ b/t/t0060-path-utils.sh
>>> +test_expect_success 'handle per-worktree refs in refs/worktree' '
>>> +       git commit --allow-empty -m "initial commit" &&
>>> +       git worktree add -b branch worktree &&
>>> +       (
>>> +               cd worktree &&
>>> +               git commit --allow-empty -m "test commit"  &&
>>> +               git for-each-ref | test_must_fail grep refs/worktree &&
>>
>> s/test_must_fail/!/
>>
>> From t/README:
>>
>>    On the other hand, don't use test_must_fail for running regular
>>    platform commands; just use '! cmd'.  We are not in the business
>>    of verifying that the world given to us sanely works.
> 
> When I make that change, my test fails with:
> 
> FATAL: Unexpected exit with code 2
> 
> Apparently, you can't use ! in pipelines like that.  So that's why I
> used test_must_fail.

You would have to negate the whole pipeline, like

    ! git for-each-ref | grep refs/worktree

The result of a pipeline is taken from the last command.

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 3/4] refs: make refs/worktree/* per-worktree
  2015-08-13 20:16       ` Michael Haggerty
@ 2015-08-13 20:32         ` David Turner
  2015-08-14  8:18           ` Michael Haggerty
  0 siblings, 1 reply; 24+ messages in thread
From: David Turner @ 2015-08-13 20:32 UTC (permalink / raw)
  To: Michael Haggerty
  Cc: Eric Sunshine, Git List, Christian Couder,
	Nguyễn Thái Ngọc Duy, Jacob Keller

On Thu, 2015-08-13 at 22:16 +0200, Michael Haggerty wrote:
> On 08/13/2015 07:41 PM, David Turner wrote:
> > On Thu, 2015-08-13 at 13:15 -0400, Eric Sunshine wrote:
> >> On Wed, Aug 12, 2015 at 5:57 PM, David Turner <dturner@twopensource.com> wrote:
> >>> diff --git a/t/t0060-path-utils.sh b/t/t0060-path-utils.sh
> >>> index 93605f4..28e6dff 100755
> >>> --- a/t/t0060-path-utils.sh
> >>> +++ b/t/t0060-path-utils.sh
> >>> +test_expect_success 'handle per-worktree refs in refs/worktree' '
> >>> +       git commit --allow-empty -m "initial commit" &&
> >>> +       git worktree add -b branch worktree &&
> >>> +       (
> >>> +               cd worktree &&
> >>> +               git commit --allow-empty -m "test commit"  &&
> >>> +               git for-each-ref | test_must_fail grep refs/worktree &&
> >>
> >> s/test_must_fail/!/
> >>
> >> From t/README:
> >>
> >>    On the other hand, don't use test_must_fail for running regular
> >>    platform commands; just use '! cmd'.  We are not in the business
> >>    of verifying that the world given to us sanely works.
> > 
> > When I make that change, my test fails with:
> > 
> > FATAL: Unexpected exit with code 2
> > 
> > Apparently, you can't use ! in pipelines like that.  So that's why I
> > used test_must_fail.
> 
> You would have to negate the whole pipeline, like
> 
>     ! git for-each-ref | grep refs/worktree
> 
> The result of a pipeline is taken from the last command.

Yes, but that would pass if for-each-ref fails, which I do not want.  

Jacob's suggestion of parentheses around (! grep refs/worktree) seems to
work.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 3/4] refs: make refs/worktree/* per-worktree
  2015-08-13 20:32         ` David Turner
@ 2015-08-14  8:18           ` Michael Haggerty
  2015-08-14 17:10             ` Junio C Hamano
  0 siblings, 1 reply; 24+ messages in thread
From: Michael Haggerty @ 2015-08-14  8:18 UTC (permalink / raw)
  To: David Turner
  Cc: Eric Sunshine, Git List, Christian Couder,
	Nguyễn Thái Ngọc Duy, Jacob Keller

On 08/13/2015 10:32 PM, David Turner wrote:
> On Thu, 2015-08-13 at 22:16 +0200, Michael Haggerty wrote:
>> On 08/13/2015 07:41 PM, David Turner wrote:
>>> On Thu, 2015-08-13 at 13:15 -0400, Eric Sunshine wrote:
>>>> On Wed, Aug 12, 2015 at 5:57 PM, David Turner <dturner@twopensource.com> wrote:
>>>>> diff --git a/t/t0060-path-utils.sh b/t/t0060-path-utils.sh
>>>>> index 93605f4..28e6dff 100755
>>>>> --- a/t/t0060-path-utils.sh
>>>>> +++ b/t/t0060-path-utils.sh
>>>>> +test_expect_success 'handle per-worktree refs in refs/worktree' '
>>>>> +       git commit --allow-empty -m "initial commit" &&
>>>>> +       git worktree add -b branch worktree &&
>>>>> +       (
>>>>> +               cd worktree &&
>>>>> +               git commit --allow-empty -m "test commit"  &&
>>>>> +               git for-each-ref | test_must_fail grep refs/worktree &&
>>>>
>>>> s/test_must_fail/!/
>>>>
>>>> From t/README:
>>>>
>>>>    On the other hand, don't use test_must_fail for running regular
>>>>    platform commands; just use '! cmd'.  We are not in the business
>>>>    of verifying that the world given to us sanely works.
>>>
>>> When I make that change, my test fails with:
>>>
>>> FATAL: Unexpected exit with code 2
>>>
>>> Apparently, you can't use ! in pipelines like that.  So that's why I
>>> used test_must_fail.
>>
>> You would have to negate the whole pipeline, like
>>
>>     ! git for-each-ref | grep refs/worktree
>>
>> The result of a pipeline is taken from the last command.
> 
> Yes, but that would pass if for-each-ref fails, which I do not want.  
> 
> Jacob's suggestion of parentheses around (! grep refs/worktree) seems to
> work.

I don't see how that can help. The result of a pipeline is taken from
the last command. The exit codes of earlier commands in the pipeline are
lost in the sands of time:

    $ false | true
    $ echo $?
    0
    $ false | ( ! false )
    $ echo $?
    0

Working around this POSIX shell limitation is surprisingly awkward in a
general-purpose script. But in this case you could use a temporary file:

    git for-each-ref >refs-actual &&
    ! grep refs/worktree <refs-actual && [...]

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 2/4] path: optimize common dir checking
  2015-08-13  9:05   ` Michael Haggerty
@ 2015-08-14 17:04     ` Junio C Hamano
  2015-08-14 20:04       ` David Turner
  2015-08-15 18:12       ` Michael Haggerty
  0 siblings, 2 replies; 24+ messages in thread
From: Junio C Hamano @ 2015-08-14 17:04 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: David Turner, git, chriscool, pclouds

Michael Haggerty <mhagger@alum.mit.edu> writes:

> Let's take a step back.
>
> We have always had a ton of code that uses `git_path()` and friends to
> convert abstract things into filesystem paths. Let's take the
> reference-handling code as an example:
> ...
> This seems crazy to me. It is the *reference* code that should know
> whether a particular reference should be stored under `$GIT_DIR` or
> `$GIT_COMMON_DIR`, or indeed whether it should be stored in a database.

It is more like:

 1. The system as a whole should decide if HEAD and refs/heads/
    should be per workspace or shared across a repository (and we say
    the former should be per workspace, the latter should be shared).

 2. The reference code should decide which ref-backend is used to
    store refs.

 3. And any ref-backend should follow the decision made by the
    system as a whole in 1.

I'd imagine that David's ref-backend code inherited from Ronnie
would still accept the string "refs/heads/master" from the rest of
the system (i.e. callers that call into the ref API) to mean "the
ref that represents the 'master' branch", and uses that as the key
to decide "ok, that is shared across workspaces" to honor the
system-wide decision made in 1.  The outside callers wouldn't pass
the result of calling git_path("refs/heads/master") into the ref
API, which may expand to "$somewhere_else/refs/heads/master" when
run in a secondary workspace to point at the common location.

I'd also imagine that the workspace API would give ways for the
implementation of the reference API to ask these questions:

 - which workspace am I operating for?  where is the "common" thing?
   how would I identify this workspace among the ones that share the
   same "common" thing?

 - is this ref (or ref-like thing) supposed to be in common or per
   workspace?

I agree with you that there needs an intermediate level, between
what Duy and David's git_path() does (i.e. which gives the final
result of deciding where in the filesystem the thing should go) and
your two stupid functions would do (i.e. knowing which kind the
thing is, give you the final location in the filesystem).  That is,
to let the caller know if the thing is supposed to be shared or per
workspace in the first place.

With that intermediate level function, a database-based ref-backend
could make ('common', ref/heads/master) as the key for the current
value of the master branch and (workspace-name, HEAD) as the key for
the current value of the HEAD for a given workspace.

> We should have two *stupid* functions, `git_workspace_path()` and
> `git_common_path()`, and have the *callers* decide which one to call.

So I think we should have *three* functions:

 - git_workspace_name(void) returns some name that uniquely
   identifies the current workspace among the workspaces linked to
   the same repository.

 - is_common_thing(const char *) takes a path (that is relative to
   $GIT_DIR where the thing would have lived at in a pre-workspace
   world), and tells if it is a common thing or a per-workspace
   thing.

 - git_path() can stay the external interface and can be thought of:

	git_path(const char *path)
        {
		if (is_common_thing(path))
			return git_common_path(path);
		return git_workspace_path(git_workspace_name(), path);
	}

   if you think in terms of your two helpers.

But I do not think that git_common_path() and git_workspace_path()
need to be called from any other place in the system, and that is
the reason why I did not say we should have four functions (or five,
counting git_path() itself).

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 3/4] refs: make refs/worktree/* per-worktree
  2015-08-14  8:18           ` Michael Haggerty
@ 2015-08-14 17:10             ` Junio C Hamano
  0 siblings, 0 replies; 24+ messages in thread
From: Junio C Hamano @ 2015-08-14 17:10 UTC (permalink / raw)
  To: Michael Haggerty
  Cc: David Turner, Eric Sunshine, Git List, Christian Couder,
	Nguyễn Thái Ngọc Duy, Jacob Keller

Michael Haggerty <mhagger@alum.mit.edu> writes:

> I don't see how that can help. The result of a pipeline is taken from
> the last command. The exit codes of earlier commands in the pipeline are
> lost in the sands of time:
>
>     $ false | true
>     $ echo $?
>     0
>     $ false | ( ! false )
>     $ echo $?
>     0
>
> Working around this POSIX shell limitation is surprisingly awkward in a
> general-purpose script. But in this case you could use a temporary file:
>
>     git for-each-ref >refs-actual &&
>     ! grep refs/worktree <refs-actual && [...]

It is not just "you could", but that is what you "should" do, if you
cared the exit status from for-each-ref.

Thanks.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 2/4] path: optimize common dir checking
  2015-08-14 17:04     ` Junio C Hamano
@ 2015-08-14 20:04       ` David Turner
  2015-08-14 20:27         ` Junio C Hamano
  2015-08-15 18:20         ` Michael Haggerty
  2015-08-15 18:12       ` Michael Haggerty
  1 sibling, 2 replies; 24+ messages in thread
From: David Turner @ 2015-08-14 20:04 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Michael Haggerty, git, chriscool, pclouds

On Fri, 2015-08-14 at 10:04 -0700, Junio C Hamano wrote:
> Michael Haggerty <mhagger@alum.mit.edu> writes:
> 
> > Let's take a step back.
> >
> > We have always had a ton of code that uses `git_path()` and friends to
> > convert abstract things into filesystem paths. Let's take the
> > reference-handling code as an example:
> > ...
> > This seems crazy to me. It is the *reference* code that should know
> > whether a particular reference should be stored under `$GIT_DIR` or
> > `$GIT_COMMON_DIR`, or indeed whether it should be stored in a database.
> 
> It is more like:
> 
>  1. The system as a whole should decide if HEAD and refs/heads/
>     should be per workspace or shared across a repository (and we say
>     the former should be per workspace, the latter should be shared).
> 
>  2. The reference code should decide which ref-backend is used to
>     store refs.
> 
>  3. And any ref-backend should follow the decision made by the
>     system as a whole in 1.
> 
> I'd imagine that David's ref-backend code inherited from Ronnie
> would still accept the string "refs/heads/master" from the rest of
> the system (i.e. callers that call into the ref API) to mean "the
> ref that represents the 'master' branch", and uses that as the key
> to decide "ok, that is shared across workspaces" to honor the
> system-wide decision made in 1.  The outside callers wouldn't pass
> the result of calling git_path("refs/heads/master") into the ref
> API, which may expand to "$somewhere_else/refs/heads/master" when
> run in a secondary workspace to point at the common location.
> 
> I'd also imagine that the workspace API would give ways for the
> implementation of the reference API to ask these questions:
> 
>  - which workspace am I operating for?  where is the "common" thing?
>    how would I identify this workspace among the ones that share the
>    same "common" thing?
> 
>  - is this ref (or ref-like thing) supposed to be in common or per
>    workspace?
> 
> I agree with you that there needs an intermediate level, between
> what Duy and David's git_path() does (i.e. which gives the final
> result of deciding where in the filesystem the thing should go) and
> your two stupid functions would do (i.e. knowing which kind the
> thing is, give you the final location in the filesystem).  That is,
> to let the caller know if the thing is supposed to be shared or per
> workspace in the first place.
> 
> With that intermediate level function, a database-based ref-backend
> could make ('common', ref/heads/master) as the key for the current
> value of the master branch and (workspace-name, HEAD) as the key for
> the current value of the HEAD for a given workspace.
> 
> > We should have two *stupid* functions, `git_workspace_path()` and
> > `git_common_path()`, and have the *callers* decide which one to call.
> 
> So I think we should have *three* functions:
> 
>  - git_workspace_name(void) returns some name that uniquely
>    identifies the current workspace among the workspaces linked to
>    the same repository.

Random side note: the present workspace path name component is not
acceptable for this if alternate ref backends use a single db for
storage across all workspaces.  That's because you might create a
workspace at foo, then manually rm -r it, and then create a new one also
named foo.  The database wouldn't know about this series of events, and
would then have stale per-workspace refs for foo.

That said, with my lmdb backend, I've been falling back to the files
backend for per-workspace refs.  This also means I don't have to worry
about expiring per-workspace refs when a workspace is removed. 

I could change this, but IIRC, there are a fair number of things that
care about the existence of a file called HEAD, so the fallback was
easier.  (That is, the other way was a giant hassle).

>  - is_common_thing(const char *) takes a path (that is relative to
>    $GIT_DIR where the thing would have lived at in a pre-workspace
>    world), and tells if it is a common thing or a per-workspace
>    thing.
> 
>  - git_path() can stay the external interface and can be thought of:
> 
> 	git_path(const char *path)
>         {
> 		if (is_common_thing(path))
> 			return git_common_path(path);
> 		return git_workspace_path(git_workspace_name(), path);
> 	}
> 
>    if you think in terms of your two helpers.
> 
> But I do not think that git_common_path() and git_workspace_path()
> need to be called from any other place in the system, and that is
> the reason why I did not say we should have four functions (or five,
> counting git_path() itself).

I wrote an email arguing for Michael's position on this, and by the time
I was done writing it, I had come around to more-or-less Junio's
position.

My argument for keeping git_path as the external interface is this: in
the multiple worktree world, $GIT_DIR is effectively an overlay
filesystem.  It's not a complete overlay API: e.g. we don't have a
git_path_opendir function that special-cases refs/ to add in
refs/worktree.  But it handles the common cases.

It is true that even if we had a complete overlay API, it would not be
sufficient to hide all of the complexity of per-worktree refs (the files
ref backend still needs to know not to pack per-worktree refs).  But
that is equally true if we have the refs code call
git_common_path/git_worktree_path.  So we're not successfully hiding
details by using git_common_path/git_worktree_path in refs.c.

For this patch series, I don't think we need to change anything (except
that I realized that I forgot to add logs/refs/worktree to refs, and
people will probably find some issues once they start reviewing the
details of my code). In the present code, adjust_git_path already
handles the git_common_path vs git_workspace_path thing; it just does it
in a slightly less elegant way than Junio's proposal.  Implementing
Junio's proposal would not affect this series; it would just be an
additional patch on top (or beforehand).

There is one case where the refs code will probably need to directly
call git_workspace_path: when we fix git prune to handle detached HEADs
in workspaces.  It could just set the workspace and then call git_path,
but that is less elegant.  So I think when we fix that (which should
probably wait on for_each_worktree), we can implement Junio's proposal.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 2/4] path: optimize common dir checking
  2015-08-14 20:04       ` David Turner
@ 2015-08-14 20:27         ` Junio C Hamano
  2015-08-14 20:54           ` David Turner
  2015-08-15 18:20         ` Michael Haggerty
  1 sibling, 1 reply; 24+ messages in thread
From: Junio C Hamano @ 2015-08-14 20:27 UTC (permalink / raw)
  To: David Turner; +Cc: Michael Haggerty, git, chriscool, pclouds

David Turner <dturner@twopensource.com> writes:

> Random side note: the present workspace path name component is not
> acceptable for this if alternate ref backends use a single db for
> storage across all workspaces.  That's because you might create a
> workspace at foo, then manually rm -r it, and then create a new one also
> named foo.  The database wouldn't know about this series of events, and
> would then have stale per-workspace refs for foo.

The users can do "Create, manuallly rm -r and recreate" dance all
they want, but the result must still honor the invariant:

    For any $workspace, $workspace/.git is a "gitdir:" file that
    points at one subdirectory in $GIT_COMMON_DIR/worktrees/.

The "name" I had in mind was the names of the directories in
$GIT_COMMON_DIR/worktrees/ that by definition has to be unique.

Another invariant 

    $GIT_COMMON_DIR/worktrees/$that_subdirectory has commondir file
    that points at the $GIT_COMMON_DIR/.

must also be preserved by "Create, manuallly rm -r and recreate"
dance, but it is not important to define what the workspace ID is.

> That said, with my lmdb backend, I've been falling back to the files
> backend for per-workspace refs.

I think that is perfectly fine and within the 3-bullet design
guideline we saw earlier.  Your backend is honoring the decision
made by the system as a whole what is private and what is shared,
which is the only thing that counts in the larger picture.  It is up
to individual ref-backend implementations how they do so, and it is
perfectly fine that your backend takes advantage of the ref layout
by storing some stuff in lmdb storage (which I presume will live in
"common" part of the filesystem storage) and some other stuff
directly in the filesystem.

> There is one case where the refs code will probably need to directly
> call git_workspace_path: when we fix git prune to handle detached HEADs
> in workspaces.  It could just set the workspace and then call git_path,
> but that is less elegant.  So I think when we fix that (which should
> probably wait on for_each_worktree), we can implement Junio's proposal.

Yeah, I said *three* helper functions, but we do need the "enumerate
all workspaces" if we want to go that route, and we do need to
expose Michael's common_path() and workspace_path() to the code that
needs such an enumeration.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 2/4] path: optimize common dir checking
  2015-08-14 20:27         ` Junio C Hamano
@ 2015-08-14 20:54           ` David Turner
  0 siblings, 0 replies; 24+ messages in thread
From: David Turner @ 2015-08-14 20:54 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Michael Haggerty, git, chriscool, pclouds

On Fri, 2015-08-14 at 13:27 -0700, Junio C Hamano wrote:
> David Turner <dturner@twopensource.com> writes:
> 
> > Random side note: the present workspace path name component is not
> > acceptable for this if alternate ref backends use a single db for
> > storage across all workspaces.  That's because you might create a
> > workspace at foo, then manually rm -r it, and then create a new one also
> > named foo.  The database wouldn't know about this series of events, and
> > would then have stale per-workspace refs for foo.
> 
> The users can do "Create, manuallly rm -r and recreate" dance all
> they want, but the result must still honor the invariant:
> 
>     For any $workspace, $workspace/.git is a "gitdir:" file that
>     points at one subdirectory in $GIT_COMMON_DIR/worktrees/.
> 
> The "name" I had in mind was the names of the directories in
> $GIT_COMMON_DIR/worktrees/ that by definition has to be unique.
> 
> Another invariant 
>
>    $GIT_COMMON_DIR/worktrees/$that_subdirectory has commondir file
>    that points at the $GIT_COMMON_DIR/.
>
> must also be preserved by "Create, manuallly rm -r and recreate"
> dance, but it is not important to define what the workspace ID is.

My worry was that workspace would get deleted and
recreated without the refs db finding out.  Then the refs db would
retain per-workspace references from the deleted version of the
worskace.  That is, these names are unique at any given time, but not
across time.

My idea was just to have `git workspace add` create per-workspace
"workspace_id" file with a long random number in it. The id would be
$that_subdirectory/$random_number; we would know to split it at / and
double-check that the workspace id is still current before assuming that
our refs data was valid for the workspace.  

Alternately, git workspace add could inform the refs backend that a
workspace is being added, which would solve the problem in a different
way.

But given my refs lmdb backend design, I can actually ignore this for
now, so that's what I'm going to do.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 1/4] refs: clean up common_list
  2015-08-12 21:57 [PATCH v3 1/4] refs: clean up common_list David Turner
                   ` (2 preceding siblings ...)
  2015-08-12 21:57 ` [PATCH v3 4/4] bisect: make bisection refs per-worktree David Turner
@ 2015-08-15  7:44 ` Duy Nguyen
  3 siblings, 0 replies; 24+ messages in thread
From: Duy Nguyen @ 2015-08-15  7:44 UTC (permalink / raw)
  To: David Turner; +Cc: Git Mailing List, Michael Haggerty, Christian Couder

On Thu, Aug 13, 2015 at 4:57 AM, David Turner <dturner@twopensource.com> wrote:
> +struct common_dir common_list[] = {
> +       { "branches", 0, 1, 0 },
> +       { "hooks", 0, 1, 0 },
> +       { "info", 0, 1, 0 },
> +       { "info/sparse-checkout", 0, 0, 1 },
> +       { "logs", 1, 1, 0 },
> +       { "logs/HEAD", 1, 1, 1 },
> +       { "lost-found", 0, 1, 0 },
> +       { "objects", 0, 1, 0 },
> +       { "refs", 0, 1, 0 },
> +       { "remotes", 0, 1, 0 },
> +       { "worktrees", 0, 1, 0 },
> +       { "rr-cache", 0, 1, 0 },
> +       { "svn", 0, 1, 0 },
> +       { "config", 0, 0, 0 },
> +       { "gc.pid", 1, 0, 0 },
> +       { "packed-refs", 0, 0, 0 },
> +       { "shallow", 0, 0, 0 },
> +       { NULL, 0, 0, 0 }
>  };

Nit. If you make dirname the last field, we would have aligned
numbers, which might help reading.
-- 
Duy

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 2/4] path: optimize common dir checking
  2015-08-12 21:57 ` [PATCH v3 2/4] path: optimize common dir checking David Turner
  2015-08-12 22:48   ` Junio C Hamano
  2015-08-13  9:05   ` Michael Haggerty
@ 2015-08-15  7:59   ` Duy Nguyen
  2015-08-16  5:04     ` David Turner
  2 siblings, 1 reply; 24+ messages in thread
From: Duy Nguyen @ 2015-08-15  7:59 UTC (permalink / raw)
  To: David Turner; +Cc: Git Mailing List, Michael Haggerty, Christian Couder

On Thu, Aug 13, 2015 at 4:57 AM, David Turner <dturner@twopensource.com> wrote:
> Instead of a linear search over common_list to check whether
> a path is common, use a trie.  The trie search operates on
> path prefixes, and handles excludes.

Just be careful that the given key from git_path is not normalized. I
think you assume it is in the code, but I haven't read carefully. We
could of course optimize for the good case: assume normalized and
search, then fall back to explicit normalizing and search again.
-- 
Duy

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 3/4] refs: make refs/worktree/* per-worktree
  2015-08-12 21:57 ` [PATCH v3 3/4] refs: make refs/worktree/* per-worktree David Turner
  2015-08-13 17:15   ` Eric Sunshine
@ 2015-08-15  8:04   ` Duy Nguyen
  1 sibling, 0 replies; 24+ messages in thread
From: Duy Nguyen @ 2015-08-15  8:04 UTC (permalink / raw)
  To: David Turner; +Cc: Git Mailing List, Michael Haggerty, Christian Couder

On Thu, Aug 13, 2015 at 4:57 AM, David Turner <dturner@twopensource.com> wrote:
> We need a place to stick refs for bisects in progress that is not
> shared between worktrees.  So we use the refs/worktree/ hierarchy.
>
> The is_per_worktree_ref function and associated docs learn that
> refs/worktree/ is per-worktree, as does the git_path code in path.c

I assume you want to iterate over all these per-worktree refs later
on. Otherwise just moving bisect refs outside refs/ would
automatically make them per-worktree. But then it would polute
$GIT_DIR some more, so probably not the best move. So yeah,
refs/worktree is probably for the best.
-- 
Duy

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 2/4] path: optimize common dir checking
  2015-08-14 17:04     ` Junio C Hamano
  2015-08-14 20:04       ` David Turner
@ 2015-08-15 18:12       ` Michael Haggerty
  2015-08-17 15:55         ` Junio C Hamano
  1 sibling, 1 reply; 24+ messages in thread
From: Michael Haggerty @ 2015-08-15 18:12 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: David Turner, git, chriscool, pclouds

On 08/14/2015 07:04 PM, Junio C Hamano wrote:
> Michael Haggerty <mhagger@alum.mit.edu> writes:
> 
>> Let's take a step back.
>>
>> We have always had a ton of code that uses `git_path()` and friends to
>> convert abstract things into filesystem paths. Let's take the
>> reference-handling code as an example:
>> ...
>> This seems crazy to me. It is the *reference* code that should know
>> whether a particular reference should be stored under `$GIT_DIR` or
>> `$GIT_COMMON_DIR`, or indeed whether it should be stored in a database.
> 
> It is more like:
> 
>  1. The system as a whole should decide if HEAD and refs/heads/
>     should be per workspace or shared across a repository (and we say
>     the former should be per workspace, the latter should be shared).
> 
>  2. The reference code should decide which ref-backend is used to
>     store refs.
> 
>  3. And any ref-backend should follow the decision made by the
>     system as a whole in 1.

If I understand correctly, you consider the decision of where a
particular reference should be stored to be a kind of "business logic"
decision that should live outside of the refs module. I don't think it
is so important whether this knowledge is inside or outside of the refs
module (I can live with it either way).

> I'd imagine that David's ref-backend code inherited from Ronnie
> would still accept the string "refs/heads/master" from the rest of
> the system (i.e. callers that call into the ref API) to mean "the
> ref that represents the 'master' branch", and uses that as the key
> to decide "ok, that is shared across workspaces" to honor the
> system-wide decision made in 1.  The outside callers wouldn't pass
> the result of calling git_path("refs/heads/master") into the ref
> API, which may expand to "$somewhere_else/refs/heads/master" when
> run in a secondary workspace to point at the common location.

Definitely agreed.

> I'd also imagine that the workspace API would give ways for the
> implementation of the reference API to ask these questions:
> 
>  - which workspace am I operating for?  where is the "common" thing?
>    how would I identify this workspace among the ones that share the
>    same "common" thing?
> 
>  - is this ref (or ref-like thing) supposed to be in common or per
>    workspace?

Yes, I especially like this last idea. For example, suppose the function
is "is_common_reference(refname)". It's nice that this function doesn't
have to know about all possible "things" like your is_common_thing()
function. Therefore it can be simpler. Almost always the caller (and in
this case the caller would usually be within the refs module) will know
that the "thing" it wants to inquire about is a reference name, so it
can spare the extra expense of calling is_common_thing().

If we still need is_common_thing() (e.g., for the implementation of `git
rev-parse --git-path`), its definition would become something like

    return is_common_reference(thing) ||
           is_common_file(thing) ||
           is_common_flurg(thing) || ...;

In this construction I think it is is clear that is_common_reference()
would fit pretty well in the refs module, though elsewhere would be OK too.

> [...]

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 2/4] path: optimize common dir checking
  2015-08-14 20:04       ` David Turner
  2015-08-14 20:27         ` Junio C Hamano
@ 2015-08-15 18:20         ` Michael Haggerty
  1 sibling, 0 replies; 24+ messages in thread
From: Michael Haggerty @ 2015-08-15 18:20 UTC (permalink / raw)
  To: David Turner, Junio C Hamano; +Cc: git, chriscool, pclouds

On 08/14/2015 10:04 PM, David Turner wrote:
> On Fri, 2015-08-14 at 10:04 -0700, Junio C Hamano wrote:
>> [...]
>> So I think we should have *three* functions:
>>
>>  - git_workspace_name(void) returns some name that uniquely
>>    identifies the current workspace among the workspaces linked to
>>    the same repository.
> 
> Random side note: the present workspace path name component is not
> acceptable for this if alternate ref backends use a single db for
> storage across all workspaces.  That's because you might create a
> workspace at foo, then manually rm -r it, and then create a new one also
> named foo.  The database wouldn't know about this series of events, and
> would then have stale per-workspace refs for foo.

The reference backend should clearly have functions that are called on
the creation and destruction of a workspace. In these functions the
backend could initialize / clean up any reference-related resources
associated with that workspace.

These functions can be empty for the filesystem backend.

> That said, with my lmdb backend, I've been falling back to the files
> backend for per-workspace refs.  This also means I don't have to worry
> about expiring per-workspace refs when a workspace is removed. 
> 
> I could change this, but IIRC, there are a fair number of things that
> care about the existence of a file called HEAD, so the fallback was
> easier.  (That is, the other way was a giant hassle).

OK, so the functions can be empty for the lmdb backend, too :-)

> [...]
> For this patch series, I don't think we need to change anything [...]
> Implementing
> Junio's proposal would not affect this series; it would just be an
> additional patch on top (or beforehand).

OK.

> [...]

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 2/4] path: optimize common dir checking
  2015-08-15  7:59   ` Duy Nguyen
@ 2015-08-16  5:04     ` David Turner
  2015-08-16 12:20       ` Duy Nguyen
  0 siblings, 1 reply; 24+ messages in thread
From: David Turner @ 2015-08-16  5:04 UTC (permalink / raw)
  To: git

Duy Nguyen <pclouds@gmail.com> writes:
> On Thu, Aug 13, 2015 at 4:57 AM, David Turner <dturner@twopensource.com> 
wrote:
> > Instead of a linear search over common_list to check whether
> > a path is common, use a trie.  The trie search operates on
> > path prefixes, and handles excludes.
> 
> Just be careful that the given key from git_path is not normalized. I
> think you assume it is in the code, but I haven't read carefully. We
> could of course optimize for the good case: assume normalized and
> search, then fall back to explicit normalizing and search again.

What does it mean for a key to be normalized?  Do you mean normalized in 
terms of upper/lowercase on case-insensitive filesystems?  If so, I think the 
assumption here is that this will be called with paths generated by git, 
which will always use the lowercase path.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 2/4] path: optimize common dir checking
  2015-08-16  5:04     ` David Turner
@ 2015-08-16 12:20       ` Duy Nguyen
  0 siblings, 0 replies; 24+ messages in thread
From: Duy Nguyen @ 2015-08-16 12:20 UTC (permalink / raw)
  To: David Turner; +Cc: Git Mailing List

On Sun, Aug 16, 2015 at 12:04 PM, David Turner <dturner@twopensource.com> wrote:
> Duy Nguyen <pclouds@gmail.com> writes:
>> On Thu, Aug 13, 2015 at 4:57 AM, David Turner <dturner@twopensource.com>
> wrote:
>> > Instead of a linear search over common_list to check whether
>> > a path is common, use a trie.  The trie search operates on
>> > path prefixes, and handles excludes.
>>
>> Just be careful that the given key from git_path is not normalized. I
>> think you assume it is in the code, but I haven't read carefully. We
>> could of course optimize for the good case: assume normalized and
>> search, then fall back to explicit normalizing and search again.
>
> What does it mean for a key to be normalized?  Do you mean normalized in
> terms of upper/lowercase on case-insensitive filesystems?  If so, I think the
> assumption here is that this will be called with paths generated by git,
> which will always use the lowercase path.

Mostly about duplicated slashes, "abc//def" instead of "abc/def".
Technically nothing forbids git_path("refs/../refs/foo"), but that
would fool even current code.
-- 
Duy

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v3 2/4] path: optimize common dir checking
  2015-08-15 18:12       ` Michael Haggerty
@ 2015-08-17 15:55         ` Junio C Hamano
  0 siblings, 0 replies; 24+ messages in thread
From: Junio C Hamano @ 2015-08-17 15:55 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: David Turner, git, chriscool, pclouds

Michael Haggerty <mhagger@alum.mit.edu> writes:

> If I understand correctly, you consider the decision of where a
> particular reference should be stored to be a kind of "business logic"
> decision that should live outside of the refs module. I don't think it
> is so important whether this knowledge is inside or outside of the refs
> module (I can live with it either way).

I think I misspoke.  The decision to make a particular reference,
e.g. HEAD or 'master' branch, a per-workspace one or a repo-wide
one, should not be made by individual refs backend (i.e. lower-half
of the refs module).  It could still be the responsibility of the
upper-half of the refs module and that _feels_ more kosher.

But without actual implementation of the interface between upper-
and lower-half of the refs module yet (as we only have fs based
backend that is tightly integrated within the refs module and
nothing else right now), I do not yet see a clear implementation for
the "is this thing common?" table Duy did that decides where things
go for everything except "refs/" part while letting the upper-half
of the refs module to take responsiblity of that decision for refs.
For one thing, "the refs module decides what is in refs/, Duy's
table decides everything else " is not even the right partition, in
the presence of things like HEAD, logs/, etc.

So...

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2015-08-17 15:55 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-08-12 21:57 [PATCH v3 1/4] refs: clean up common_list David Turner
2015-08-12 21:57 ` [PATCH v3 2/4] path: optimize common dir checking David Turner
2015-08-12 22:48   ` Junio C Hamano
2015-08-13  9:05   ` Michael Haggerty
2015-08-14 17:04     ` Junio C Hamano
2015-08-14 20:04       ` David Turner
2015-08-14 20:27         ` Junio C Hamano
2015-08-14 20:54           ` David Turner
2015-08-15 18:20         ` Michael Haggerty
2015-08-15 18:12       ` Michael Haggerty
2015-08-17 15:55         ` Junio C Hamano
2015-08-15  7:59   ` Duy Nguyen
2015-08-16  5:04     ` David Turner
2015-08-16 12:20       ` Duy Nguyen
2015-08-12 21:57 ` [PATCH v3 3/4] refs: make refs/worktree/* per-worktree David Turner
2015-08-13 17:15   ` Eric Sunshine
2015-08-13 17:41     ` David Turner
2015-08-13 20:16       ` Michael Haggerty
2015-08-13 20:32         ` David Turner
2015-08-14  8:18           ` Michael Haggerty
2015-08-14 17:10             ` Junio C Hamano
2015-08-15  8:04   ` Duy Nguyen
2015-08-12 21:57 ` [PATCH v3 4/4] bisect: make bisection refs per-worktree David Turner
2015-08-15  7:44 ` [PATCH v3 1/4] refs: clean up common_list Duy Nguyen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).