All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 00/10] Sparse checkout fixes and improvements
@ 2010-11-15 10:36 Nguyễn Thái Ngọc Duy
  2010-11-15 10:36 ` [PATCH 01/10] add: do not rely on dtype being NULL behavior Nguyễn Thái Ngọc Duy
                   ` (9 more replies)
  0 siblings, 10 replies; 34+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2010-11-15 10:36 UTC (permalink / raw)
  To: git; +Cc: Nguyễn Thái Ngọc Duy

The first part fixes the long standing t1011.7 and removes the
work around in excluded_from_list(). As a result, the
EXC_FLAG_MUSTBEDIR fix for sparse checkout [1] is no longer needed.

Another plus is that because the index is now traversed tree-alike,
when a directory is match, all its children does not need to be checked,
which could speed things up a bit.

[1] http://http://article.gmane.org/gmane.comp.version-control.git/160892

Nguyễn Thái Ngọc Duy (10):
  add: do not rely on dtype being NULL behavior
  unpack-trees: move all skip-worktree check back to unpack_trees()
  unpack-trees: add function to update ce_flags based on sparse
    patterns
  unpack-trees: fix sparse checkout's "unable to match directories"
    fault


The second part is more experimental although I think it's good
change:

  unpack-trees: optimize full checkout case

The intention is that $GIT_DIR/info/sparse-checkout can be always-on.
But it should not impact performance when not really used.


  templates: add info/sparse-checkout

I should have done this long before. I did not notice it until recently.


  checkout: add -S to update sparse checkout
  checkout: add --full to fully populate working directory
  git-checkout.txt: mention of sparse checkout

These form a friendlier interface to update sparse checkout. Users need
not to dig deep in git-read-tree.txt just to use sparse checkout.
"git checkout -S" resembles "cleartool edcs", which makes sense to me.
Both edit a file (location "unknown" in clearcase case) and update
worktree after that.


  clean: support cleaning sparse checkout with -S

Sparse checkout does not prohibit you from checking out other parts of the
index. But you are pretty much left alone when doing so. This helps a bit.
Support "git clean -S -e" is possible with clear_ce_flags() from the first
part of this series but I need to think a bit more.


 Documentation/git-checkout.txt       |   49 +++++++++
 Documentation/git-clean.txt          |    6 +-
 Documentation/git-read-tree.txt      |   18 +---
 builtin/add.c                        |    3 +-
 builtin/checkout.c                   |   59 +++++++++++-
 builtin/clean.c                      |   70 +++++++++++++
 cache.h                              |    1 +
 dir.c                                |    6 -
 t/t1011-read-tree-sparse-checkout.sh |   39 +++++++-
 t/t7301-clean-sparse.sh              |   92 +++++++++++++++++
 templates/info--sparse-checkout      |    4 +
 unpack-trees.c                       |  188 +++++++++++++++++++++++++++++++---
 12 files changed, 490 insertions(+), 45 deletions(-)
 create mode 100755 t/t7301-clean-sparse.sh
 create mode 100644 templates/info--sparse-checkout

-- 
1.7.3.2.210.g045198

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH 01/10] add: do not rely on dtype being NULL behavior
  2010-11-15 10:36 [PATCH 00/10] Sparse checkout fixes and improvements Nguyễn Thái Ngọc Duy
@ 2010-11-15 10:36 ` Nguyễn Thái Ngọc Duy
  2010-11-15 12:14   ` Jonathan Nieder
  2010-11-15 10:36 ` [PATCH 02/10] unpack-trees: move all skip-worktree check back to unpack_trees() Nguyễn Thái Ngọc Duy
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 34+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2010-11-15 10:36 UTC (permalink / raw)
  To: git; +Cc: Nguyễn Thái Ngọc Duy

Commit c84de70 (excluded_1(): support exclude files in index -
2009-08-20) added support for excluded() where dtype can be NULL. It
was designed specifically for index matching because there was no
other way to extract dtype information from index. It did not support
wildcard matching (for example, "a*/" pattern would fail to match).

The code was probably misread when commit 108da0d (git add: Add the
"--ignore-missing" option for the dry run - 2010-07-10) was made
because DT_UNKNOWN happens to be zero (NULL) too.

Do not pass DT_UNKNOWN/NULL to excluded(), instead pass a pointer to a
variable that contains DT_UNKNOWN. The real dtype will be extracted
from worktree by excluded(), as expected.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 builtin/add.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/builtin/add.c b/builtin/add.c
index 56a4e0a..1a4672d 100644
--- a/builtin/add.c
+++ b/builtin/add.c
@@ -446,7 +446,8 @@ int cmd_add(int argc, const char **argv, const char *prefix)
 			if (!seen[i] && pathspec[i][0]
 			    && !file_exists(pathspec[i])) {
 				if (ignore_missing) {
-					if (excluded(&dir, pathspec[i], DT_UNKNOWN))
+					int dtype = DT_UNKNOWN;
+					if (excluded(&dir, pathspec[i], &dtype))
 						dir_add_ignored(&dir, pathspec[i], strlen(pathspec[i]));
 				} else
 					die("pathspec '%s' did not match any files",
-- 
1.7.3.2.210.g045198

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 02/10] unpack-trees: move all skip-worktree check back to unpack_trees()
  2010-11-15 10:36 [PATCH 00/10] Sparse checkout fixes and improvements Nguyễn Thái Ngọc Duy
  2010-11-15 10:36 ` [PATCH 01/10] add: do not rely on dtype being NULL behavior Nguyễn Thái Ngọc Duy
@ 2010-11-15 10:36 ` Nguyễn Thái Ngọc Duy
  2010-11-15 12:34   ` Thiago Farina
  2010-11-15 16:01   ` Jonathan Nieder
  2010-11-15 10:36 ` [PATCH 03/10] unpack-trees: add function to update ce_flags based on sparse patterns Nguyễn Thái Ngọc Duy
                   ` (7 subsequent siblings)
  9 siblings, 2 replies; 34+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2010-11-15 10:36 UTC (permalink / raw)
  To: git; +Cc: Nguyễn Thái Ngọc Duy

Earlier, the will_have_skip_worktree() checks are done in various places:

1. in verify_* functions to prevent absent/uptodate checks outside
   worktree
2. in merged_entry for new index entries
3. in apply_sparse_checkout() for all entries

In case all entries are added new ($GIT_DIR/index is missing) all the
above checks will be done for all entries, which in the worst case can
become cache_nr*el->nr*3 fnmatch() calls. Quite expensive.

By moving all will_have_skip_worktree() checks to unpack_trees(), we
now have two places to run the checks:

 - one before calling traverse_trees(), where all current index
   entries are checked.
 - one after traverse_trees(), where newly-added entries are checked.

In worst case, each entry will be checked once, or cache_nr*el->nr
fnmatch() calls at most. The two places are a loop over all entries,
which also makes optimization easier.

CE_ADDED is used to mark new entries. The flag is only used by
add_to_index(), which should not be called while unpack_trees() is
running.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 cache.h        |    1 +
 unpack-trees.c |   69 ++++++++++++++++++++++++++++++++++++++++++++++++++-----
 2 files changed, 63 insertions(+), 7 deletions(-)

diff --git a/cache.h b/cache.h
index 33decd9..d87708a 100644
--- a/cache.h
+++ b/cache.h
@@ -182,6 +182,7 @@ struct cache_entry {
 #define CE_WT_REMOVE (0x400000) /* remove in work directory */
 
 #define CE_UNPACKED  (0x1000000)
+#define CE_NEW_SKIP_WORKTREE (0x2000000)
 
 /*
  * Extended on-disk flags
diff --git a/unpack-trees.c b/unpack-trees.c
index 803445a..9acd9be 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -258,7 +258,7 @@ static int apply_sparse_checkout(struct cache_entry *ce, struct unpack_trees_opt
 {
 	int was_skip_worktree = ce_skip_worktree(ce);
 
-	if (!ce_stage(ce) && will_have_skip_worktree(ce, o))
+	if (ce->ce_flags & CE_NEW_SKIP_WORKTREE)
 		ce->ce_flags |= CE_SKIP_WORKTREE;
 	else
 		ce->ce_flags &= ~CE_SKIP_WORKTREE;
@@ -834,6 +834,49 @@ static int unpack_callback(int n, unsigned long mask, unsigned long dirmask, str
 	return mask;
 }
 
+static void set_new_skip_worktree_1(struct unpack_trees_options *o)
+{
+	int i;
+
+	for (i = 0;i < o->src_index->cache_nr;i++) {
+		struct cache_entry *ce = o->src_index->cache[i];
+		ce->ce_flags &= ~CE_ADDED;
+		if (!ce_stage(ce) && will_have_skip_worktree(ce, o))
+			ce->ce_flags |= CE_NEW_SKIP_WORKTREE;
+		else
+			ce->ce_flags &= ~CE_NEW_SKIP_WORKTREE;
+	}
+}
+
+static int verify_absent(struct cache_entry *, enum unpack_trees_error_types, struct unpack_trees_options *);
+static int set_new_skip_worktree_2(struct unpack_trees_options *o)
+{
+	int i;
+
+	/*
+	 * CE_ADDED marks new index entries. These have not been processed
+	 * by set_new_skip_worktree_1() so we do it here.
+	 */
+	for (i = 0;i < o->result.cache_nr;i++) {
+		struct cache_entry *ce = o->result.cache[i];
+
+		if (!(ce->ce_flags & CE_ADDED))
+			continue;
+
+		ce->ce_flags &= ~CE_ADDED;
+		if (!ce_stage(ce) && will_have_skip_worktree(ce, o))
+			ce->ce_flags |= CE_SKIP_WORKTREE | CE_NEW_SKIP_WORKTREE;
+		else
+			ce->ce_flags &= ~(CE_SKIP_WORKTREE | CE_NEW_SKIP_WORKTREE);
+
+		/* Left-over checks from merged_entry when old == NULL */
+		if (verify_absent(ce, ERROR_WOULD_LOSE_UNTRACKED_OVERWRITTEN, o))
+			return -1;
+	}
+
+	return 0;
+}
+
 /*
  * N-way merge "len" trees.  Returns 0 on success, -1 on failure to manipulate the
  * resulting index, -2 on failure to reflect the changes to the work tree.
@@ -862,6 +905,9 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 			o->el = &el;
 	}
 
+	if (!o->skip_sparse_checkout)
+		set_new_skip_worktree_1(o);
+
 	memset(&o->result, 0, sizeof(o->result));
 	o->result.initialized = 1;
 	o->result.timestamp.sec = o->src_index->timestamp.sec;
@@ -922,6 +968,10 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 
 	if (!o->skip_sparse_checkout) {
 		int empty_worktree = 1;
+
+		if (set_new_skip_worktree_2(o))
+			goto return_failed;
+
 		for (i = 0;i < o->result.cache_nr;i++) {
 			struct cache_entry *ce = o->result.cache[i];
 
@@ -1017,7 +1067,7 @@ static int verify_uptodate_1(struct cache_entry *ce,
 static int verify_uptodate(struct cache_entry *ce,
 			   struct unpack_trees_options *o)
 {
-	if (!o->skip_sparse_checkout && will_have_skip_worktree(ce, o))
+	if (!o->skip_sparse_checkout && (ce->ce_flags & CE_NEW_SKIP_WORKTREE))
 		return 0;
 	return verify_uptodate_1(ce, o, ERROR_NOT_UPTODATE_FILE);
 }
@@ -1204,7 +1254,7 @@ static int verify_absent(struct cache_entry *ce,
 			 enum unpack_trees_error_types error_type,
 			 struct unpack_trees_options *o)
 {
-	if (!o->skip_sparse_checkout && will_have_skip_worktree(ce, o))
+	if (!o->skip_sparse_checkout && (ce->ce_flags & CE_NEW_SKIP_WORKTREE))
 		return 0;
 	return verify_absent_1(ce, error_type, o);
 }
@@ -1226,10 +1276,15 @@ static int merged_entry(struct cache_entry *merge, struct cache_entry *old,
 	int update = CE_UPDATE;
 
 	if (!old) {
+		/*
+		 * Set CE_NEW_SKIP_WORKTREE on "merge" to make
+		 * verify_absent() no-op. The true check will be done
+		 * later on in unpack_trees().
+		 */
+		merge->ce_flags |= CE_NEW_SKIP_WORKTREE;
 		if (verify_absent(merge, ERROR_WOULD_LOSE_UNTRACKED_OVERWRITTEN, o))
 			return -1;
-		if (!o->skip_sparse_checkout && will_have_skip_worktree(merge, o))
-			update |= CE_SKIP_WORKTREE;
+		update |= CE_ADDED;
 		invalidate_ce_path(merge, o);
 	} else if (!(old->ce_flags & CE_CONFLICTED)) {
 		/*
@@ -1245,8 +1300,8 @@ static int merged_entry(struct cache_entry *merge, struct cache_entry *old,
 		} else {
 			if (verify_uptodate(old, o))
 				return -1;
-			if (ce_skip_worktree(old))
-				update |= CE_SKIP_WORKTREE;
+			/* Migrate old bits over */
+			update |= old->ce_flags & (CE_SKIP_WORKTREE | CE_NEW_SKIP_WORKTREE);
 			invalidate_ce_path(old, o);
 		}
 	} else {
-- 
1.7.3.2.210.g045198

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 03/10] unpack-trees: add function to update ce_flags based on sparse patterns
  2010-11-15 10:36 [PATCH 00/10] Sparse checkout fixes and improvements Nguyễn Thái Ngọc Duy
  2010-11-15 10:36 ` [PATCH 01/10] add: do not rely on dtype being NULL behavior Nguyễn Thái Ngọc Duy
  2010-11-15 10:36 ` [PATCH 02/10] unpack-trees: move all skip-worktree check back to unpack_trees() Nguyễn Thái Ngọc Duy
@ 2010-11-15 10:36 ` Nguyễn Thái Ngọc Duy
  2010-11-15 18:30   ` Jonathan Nieder
  2010-11-15 20:19   ` Jonathan Nieder
  2010-11-15 10:36 ` [PATCH 04/10] unpack-trees: fix sparse checkout's "unable to match directories" fault Nguyễn Thái Ngọc Duy
                   ` (6 subsequent siblings)
  9 siblings, 2 replies; 34+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2010-11-15 10:36 UTC (permalink / raw)
  To: git; +Cc: Nguyễn Thái Ngọc Duy

The function will reconstruct directory structure from index and feed
excluded_from_list() with proper dtype.

Another advantage over the old will_have_skip_worktree() is that when
a directory is matched or not matched, we can go ahead and set
ce_flags for all of its children without calling excluded_from_list()
again. The old solution requires one excluded_from_list() call per
index entry.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 unpack-trees.c |   88 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 88 insertions(+), 0 deletions(-)

diff --git a/unpack-trees.c b/unpack-trees.c
index 9acd9be..6c266ef 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -834,6 +834,94 @@ static int unpack_callback(int n, unsigned long mask, unsigned long dirmask, str
 	return mask;
 }
 
+/*
+ * traverse the index, find every entry that matches according to
+ * o->el. Do "ce_flags &= ~clear_mask" on those entries. Return the
+ * number of traversed entries.
+ *
+ * If select_mask is non-zero, only entries whose ce_flags has on of
+ * those bits enabled are traversed.
+ */
+static int clear_ce_flags_1(struct cache_entry **cache, int nr,
+			    int prefix_len, int match_all,
+			    int select_mask, int clear_mask,
+			    struct unpack_trees_options *o)
+{
+	const char *name, *slash;
+	struct cache_entry *ce = cache[0];
+	const char *prefix = ce->name;
+	int processed, original_nr = nr;
+
+	if (prefix_len && !match_all) {
+		int dtype = DT_DIR;
+		static char path[PATH_MAX];
+		int pathlen = prefix_len - 1;
+		const char *basename;
+
+		memcpy(path, prefix, pathlen);
+		path[pathlen] = '\0';
+		basename = strrchr(path, '/');
+		basename = basename ? basename+1 : path;
+		switch (excluded_from_list(path, pathlen, basename, &dtype, o->el)) {
+		case 0:
+			while (nr && !strncmp(ce->name, prefix, prefix_len)) {
+				cache++;
+				ce = *cache;
+				nr--;
+			}
+			return original_nr - nr;
+		case 1:
+			match_all = 1;
+			break;
+		default:	/* case -1, undecided */
+			break;
+		}
+	}
+
+	while (nr) {
+		if (select_mask && !(ce->ce_flags & select_mask)) {
+			cache++;
+			ce = *cache;
+			nr--;
+			continue;
+		}
+
+		if (prefix_len && strncmp(ce->name, prefix, prefix_len))
+			break;
+
+		name = ce->name + prefix_len;
+		slash = strchr(name, '/');
+
+		/* no slash, this is a file */
+		if (!slash) {
+			int dtype = ce_to_dtype(ce);
+			if (match_all ||
+			    excluded_from_list(ce->name, ce_namelen(ce), name, &dtype, o->el) > 0)
+				ce->ce_flags &= ~clear_mask;
+			cache++;
+			ce = *cache;
+			nr--;
+			continue;
+		}
+
+		/* has slash, this is a directory */
+		processed = clear_ce_flags_1(cache, nr,
+					     slash + 1 - ce->name, match_all,
+					     select_mask, clear_mask, o);
+		cache += processed;
+		ce = *cache;
+		nr -= processed;
+	}
+	return original_nr - nr;
+}
+
+static int clear_ce_flags(struct cache_entry **cache, int nr,
+			    int select_mask, int clear_mask,
+			    struct unpack_trees_options *o)
+{
+	return clear_ce_flags_1(cache, nr, 0, 0, select_mask, clear_mask, o);
+}
+
 static void set_new_skip_worktree_1(struct unpack_trees_options *o)
 {
 	int i;
-- 
1.7.3.2.210.g045198

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 04/10] unpack-trees: fix sparse checkout's "unable to match directories" fault
  2010-11-15 10:36 [PATCH 00/10] Sparse checkout fixes and improvements Nguyễn Thái Ngọc Duy
                   ` (2 preceding siblings ...)
  2010-11-15 10:36 ` [PATCH 03/10] unpack-trees: add function to update ce_flags based on sparse patterns Nguyễn Thái Ngọc Duy
@ 2010-11-15 10:36 ` Nguyễn Thái Ngọc Duy
  2010-11-15 19:10   ` Jonathan Nieder
  2010-11-15 10:36 ` [PATCH 05/10] unpack-trees: optimize full checkout case Nguyễn Thái Ngọc Duy
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 34+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2010-11-15 10:36 UTC (permalink / raw)
  To: git; +Cc: Nguyễn Thái Ngọc Duy

excluded_from_list() is designed to be work with a directory
structure, where there are files and directories. Index however is a
list of files, no directories, not really fit in excluded_from_list()
design.

Commit c84de70 (excluded_1(): support exclude files in index -
2009-08-20) helped make it work somewhat. Although the lack of proper
dtype means pattern "blah" won't match entries "blah/..." in the index.

clear_ce_flags() is now used to match the index against sparse
patterns. It does generate directories for excluded_from_list() so the
"blah" case should now work.

Also revert c84de70 because it is no longer needed.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 Documentation/git-read-tree.txt      |    7 -------
 dir.c                                |    6 ------
 t/t1011-read-tree-sparse-checkout.sh |   14 +++++++++++---
 unpack-trees.c                       |   33 ++++++++++++++++++---------------
 4 files changed, 29 insertions(+), 31 deletions(-)

diff --git a/Documentation/git-read-tree.txt b/Documentation/git-read-tree.txt
index e88e9c2..634423a 100644
--- a/Documentation/git-read-tree.txt
+++ b/Documentation/git-read-tree.txt
@@ -416,13 +416,6 @@ turn `core.sparseCheckout` on in order to have sparse checkout
 support.
 
 
-BUGS
-----
-In order to match a directory with $GIT_DIR/info/sparse-checkout,
-trailing slash must be used. The form without trailing slash, while
-works with .gitignore, does not work with sparse checkout.
-
-
 SEE ALSO
 --------
 linkgit:git-write-tree[1]; linkgit:git-ls-files[1];
diff --git a/dir.c b/dir.c
index d1e5e5e..2b38f94 100644
--- a/dir.c
+++ b/dir.c
@@ -359,12 +359,6 @@ int excluded_from_list(const char *pathname,
 			int to_exclude = x->to_exclude;
 
 			if (x->flags & EXC_FLAG_MUSTBEDIR) {
-				if (!dtype) {
-					if (!prefixcmp(pathname, exclude))
-						return to_exclude;
-					else
-						continue;
-				}
 				if (*dtype == DT_UNKNOWN)
 					*dtype = get_dtype(NULL, pathname, pathlen);
 				if (*dtype != DT_DIR)
diff --git a/t/t1011-read-tree-sparse-checkout.sh b/t/t1011-read-tree-sparse-checkout.sh
index 9a07de1..50f7dfe 100755
--- a/t/t1011-read-tree-sparse-checkout.sh
+++ b/t/t1011-read-tree-sparse-checkout.sh
@@ -91,12 +91,20 @@ test_expect_success 'match directories with trailing slash' '
 	test -f sub/added
 '
 
-test_expect_failure 'match directories without trailing slash' '
-	echo init.t >.git/info/sparse-checkout &&
+test_expect_success 'match directories without trailing slash' '
 	echo sub >>.git/info/sparse-checkout &&
 	git read-tree -m -u HEAD &&
 	git ls-files -t >result &&
-	test_cmp expected.swt result &&
+	test_cmp expected.swt-noinit result &&
+	test ! -f init.t &&
+	test -f sub/added
+'
+
+test_expect_success 'match directory pattern' '
+	echo "s?b" >>.git/info/sparse-checkout &&
+	git read-tree -m -u HEAD &&
+	git ls-files -t >result &&
+	test_cmp expected.swt-noinit result &&
 	test ! -f init.t &&
 	test -f sub/added
 '
diff --git a/unpack-trees.c b/unpack-trees.c
index 6c266ef..f005454 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -245,15 +245,6 @@ static int check_updates(struct unpack_trees_options *o)
 static int verify_uptodate_sparse(struct cache_entry *ce, struct unpack_trees_options *o);
 static int verify_absent_sparse(struct cache_entry *ce, enum unpack_trees_error_types, struct unpack_trees_options *o);
 
-static int will_have_skip_worktree(const struct cache_entry *ce, struct unpack_trees_options *o)
-{
-	const char *basename;
-
-	basename = strrchr(ce->name, '/');
-	basename = basename ? basename+1 : ce->name;
-	return excluded_from_list(ce->name, ce_namelen(ce), basename, NULL, o->el) <= 0;
-}
-
 static int apply_sparse_checkout(struct cache_entry *ce, struct unpack_trees_options *o)
 {
 	int was_skip_worktree = ce_skip_worktree(ce);
@@ -926,14 +917,19 @@ static void set_new_skip_worktree_1(struct unpack_trees_options *o)
 {
 	int i;
 
+	/* Mark everything out (except staged entries) */
 	for (i = 0;i < o->src_index->cache_nr;i++) {
 		struct cache_entry *ce = o->src_index->cache[i];
 		ce->ce_flags &= ~CE_ADDED;
-		if (!ce_stage(ce) && will_have_skip_worktree(ce, o))
+		if (!ce_stage(ce))
 			ce->ce_flags |= CE_NEW_SKIP_WORKTREE;
 		else
 			ce->ce_flags &= ~CE_NEW_SKIP_WORKTREE;
 	}
+
+	/* Then mark some of them in again according to o->el */
+	clear_ce_flags(o->src_index->cache, o->src_index->cache_nr,
+		       0, CE_NEW_SKIP_WORKTREE, o);
 }
 
 static int verify_absent(struct cache_entry *, enum unpack_trees_error_types, struct unpack_trees_options *);
@@ -943,19 +939,26 @@ static int set_new_skip_worktree_2(struct unpack_trees_options *o)
 
 	/*
 	 * CE_ADDED marks new index entries. These have not been processed
-	 * by set_new_skip_worktree_1() so we do it here.
+	 * by set_new_skip_worktree_1() so we do it here. Mark every new
+	 * entries out.
 	 */
 	for (i = 0;i < o->result.cache_nr;i++) {
 		struct cache_entry *ce = o->result.cache[i];
+		if (ce->ce_flags & CE_ADDED)
+			ce->ce_flags |= CE_SKIP_WORKTREE | CE_NEW_SKIP_WORKTREE;
+	}
+
+	/* Then mark some of them in again */
+	clear_ce_flags(o->result.cache, o->result.cache_nr, CE_ADDED,
+		       CE_SKIP_WORKTREE | CE_NEW_SKIP_WORKTREE, o);
+
+	for (i = 0;i < o->result.cache_nr;i++) {
+		struct cache_entry *ce = o->result.cache[i];
 
 		if (!(ce->ce_flags & CE_ADDED))
 			continue;
 
 		ce->ce_flags &= ~CE_ADDED;
-		if (!ce_stage(ce) && will_have_skip_worktree(ce, o))
-			ce->ce_flags |= CE_SKIP_WORKTREE | CE_NEW_SKIP_WORKTREE;
-		else
-			ce->ce_flags &= ~(CE_SKIP_WORKTREE | CE_NEW_SKIP_WORKTREE);
 
 		/* Left-over checks from merged_entry when old == NULL */
 		if (verify_absent(ce, ERROR_WOULD_LOSE_UNTRACKED_OVERWRITTEN, o))
-- 
1.7.3.2.210.g045198

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 05/10] unpack-trees: optimize full checkout case
  2010-11-15 10:36 [PATCH 00/10] Sparse checkout fixes and improvements Nguyễn Thái Ngọc Duy
                   ` (3 preceding siblings ...)
  2010-11-15 10:36 ` [PATCH 04/10] unpack-trees: fix sparse checkout's "unable to match directories" fault Nguyễn Thái Ngọc Duy
@ 2010-11-15 10:36 ` Nguyễn Thái Ngọc Duy
  2010-11-15 20:41   ` Jonathan Nieder
  2010-11-15 10:36 ` [PATCH 06/10] templates: add info/sparse-checkout Nguyễn Thái Ngọc Duy
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 34+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2010-11-15 10:36 UTC (permalink / raw)
  To: git; +Cc: Nguyễn Thái Ngọc Duy


Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 unpack-trees.c |   10 ++++++++++
 1 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/unpack-trees.c b/unpack-trees.c
index f005454..eb5d357 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -996,6 +996,16 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 			o->el = &el;
 	}
 
+	if (!o->skip_sparse_checkout &&
+	    o->el->nr == 1 &&
+	    !strcmp(o->el->excludes[0]->pattern, "*")) {
+		for (i = 0; i < o->src_index->cache_nr; i++)
+			if (ce_skip_worktree(o->src_index->cache[i]))
+				break;
+		if (i == o->src_index->cache_nr)
+			o->skip_sparse_checkout = 1;
+	}
+
 	if (!o->skip_sparse_checkout)
 		set_new_skip_worktree_1(o);
 
-- 
1.7.3.2.210.g045198

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 06/10] templates: add info/sparse-checkout
  2010-11-15 10:36 [PATCH 00/10] Sparse checkout fixes and improvements Nguyễn Thái Ngọc Duy
                   ` (4 preceding siblings ...)
  2010-11-15 10:36 ` [PATCH 05/10] unpack-trees: optimize full checkout case Nguyễn Thái Ngọc Duy
@ 2010-11-15 10:36 ` Nguyễn Thái Ngọc Duy
  2010-11-15 10:36 ` [PATCH 07/10] checkout: add -S to update sparse checkout Nguyễn Thái Ngọc Duy
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 34+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2010-11-15 10:36 UTC (permalink / raw)
  To: git; +Cc: Nguyễn Thái Ngọc Duy


Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 templates/info--sparse-checkout |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)
 create mode 100644 templates/info--sparse-checkout

diff --git a/templates/info--sparse-checkout b/templates/info--sparse-checkout
new file mode 100644
index 0000000..7426475
--- /dev/null
+++ b/templates/info--sparse-checkout
@@ -0,0 +1,3 @@
+# Specify what files are checked out in working directory
+# Run "git checkout" after updating this file to update working directory
+*
-- 
1.7.3.2.210.g045198

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 07/10] checkout: add -S to update sparse checkout
  2010-11-15 10:36 [PATCH 00/10] Sparse checkout fixes and improvements Nguyễn Thái Ngọc Duy
                   ` (5 preceding siblings ...)
  2010-11-15 10:36 ` [PATCH 06/10] templates: add info/sparse-checkout Nguyễn Thái Ngọc Duy
@ 2010-11-15 10:36 ` Nguyễn Thái Ngọc Duy
  2010-11-15 21:16   ` Jonathan Nieder
  2010-11-15 10:36 ` [PATCH 08/10] checkout: add --full to fully populate working directory Nguyễn Thái Ngọc Duy
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 34+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2010-11-15 10:36 UTC (permalink / raw)
  To: git; +Cc: Nguyễn Thái Ngọc Duy

The traditional way of updating sparse checkout is to open
$GIT_DIR/info/sparse-checkout manually, edit it, save, then run "git
checkout" or "git read-tree -m -u HEAD". That's not convenient.

"git checkout -S" is introduced to combine those steps in one. It also
provides some additional checks:

 - warn users if core.sparseCheckout is off
 - refuse to update sparse checkout if there is no valid sparse
   patterns (faulty sparse-checkout file for example)

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 Documentation/git-checkout.txt       |   16 +++++++++++++
 builtin/checkout.c                   |   42 +++++++++++++++++++++++++++++++++-
 t/t1011-read-tree-sparse-checkout.sh |   18 ++++++++++++++
 templates/info--sparse-checkout      |    1 +
 4 files changed, 76 insertions(+), 1 deletions(-)

diff --git a/Documentation/git-checkout.txt b/Documentation/git-checkout.txt
index 22d3611..478bfe7 100644
--- a/Documentation/git-checkout.txt
+++ b/Documentation/git-checkout.txt
@@ -12,6 +12,7 @@ SYNOPSIS
 'git checkout' [-q] [-f] [-m] [[-b|-B|--orphan] <new_branch>] [<start_point>]
 'git checkout' [-f|--ours|--theirs|-m|--conflict=<style>] [<tree-ish>] [--] <paths>...
 'git checkout' --patch [<tree-ish>] [--] [<paths>...]
+'git checkout' -S
 
 DESCRIPTION
 -----------
@@ -176,6 +177,13 @@ the conflicted merge in the specified paths.
 This means that you can use `git checkout -p` to selectively discard
 edits from your current working tree.
 
+-S::
+--update-sparse-checkout::
+	An editor is invoked to let you update your sparse checkout
+	patterns. The updated patterns will be saved in
+	$GIT_DIR/info/sparse-checkout. The working directory is also
+	updated. An empty file will abort the process.
+
 <branch>::
 	Branch to checkout; if it refers to a branch (i.e., a name that,
 	when prepended with "refs/heads/", is a valid ref), then that
@@ -316,6 +324,14 @@ $ git add frotz
 ------------
 
 
+ENVIRONMENT AND CONFIGURATION VARIABLES
+---------------------------------------
+The editor used to edit the commit log message will be chosen from the
+GIT_EDITOR environment variable, the core.editor configuration variable, the
+VISUAL environment variable, or the EDITOR environment variable (in that
+order).  See linkgit:git-var[1] for details.
+
+
 Author
 ------
 Written by Linus Torvalds <torvalds@osdl.org>
diff --git a/builtin/checkout.c b/builtin/checkout.c
index 9240faf..47847b3 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -675,6 +675,31 @@ static const char *unique_tracking_name(const char *name)
 	return NULL;
 }
 
+static void edit_info_sparse_checkout()
+{
+	char *tmpfile = xstrdup(git_path("sparse-checkout"));
+	struct exclude_list el;
+	int i;
+
+	copy_file(tmpfile, git_path("info/sparse-checkout"), 0666);
+
+	if (launch_editor(tmpfile, NULL, NULL))
+		exit(1);
+
+	memset(&el, 0, sizeof(el));
+	if (add_excludes_from_file_to_list(tmpfile, "", 0, NULL, &el, 0) < 0 ||
+	    el.nr == 0)
+		die("No valid sparse patterns. Abort.");
+	for (i = 0; i < el.nr; i++)
+		free(el.excludes[i]);
+	free(el.excludes);
+
+	if (rename(tmpfile, git_path("info/sparse-checkout")) < 0)
+		die_errno("Can't update %s", git_path("info/sparse-checkout"));
+
+	free(tmpfile);
+}
+
 int cmd_checkout(int argc, const char **argv, const char *prefix)
 {
 	struct checkout_opts opts;
@@ -685,6 +710,7 @@ int cmd_checkout(int argc, const char **argv, const char *prefix)
 	char *conflict_style = NULL;
 	int patch_mode = 0;
 	int dwim_new_local_branch = 1;
+	int update_sparse_checkout = 0;
 	struct option options[] = {
 		OPT__QUIET(&opts.quiet),
 		OPT_STRING('b', NULL, &opts.new_branch, "branch",
@@ -704,6 +730,8 @@ int cmd_checkout(int argc, const char **argv, const char *prefix)
 		OPT_STRING(0, "conflict", &conflict_style, "style",
 			   "conflict style (merge or diff3)"),
 		OPT_BOOLEAN('p', "patch", &patch_mode, "select hunks interactively"),
+		OPT_BOOLEAN('S', "update-sparse-checkout", &update_sparse_checkout,
+			    "open up editor to edit $GIT_DIR/info/sparse-checkout" ),
 		{ OPTION_BOOLEAN, 0, "guess", &dwim_new_local_branch, NULL,
 		  "second guess 'git checkout no-such-branch'",
 		  PARSE_OPT_NOARG | PARSE_OPT_HIDDEN },
@@ -722,6 +750,9 @@ int cmd_checkout(int argc, const char **argv, const char *prefix)
 	argc = parse_options(argc, argv, prefix, options, checkout_usage,
 			     PARSE_OPT_KEEP_DASHDASH);
 
+	if (update_sparse_checkout && !core_apply_sparse_checkout)
+		die("core.sparseCheckout needs to be turned on");
+
 	/* we can assume from now on new_branch = !new_branch_force */
 	if (opts.new_branch && opts.new_branch_force)
 		die("-B cannot be used with -b");
@@ -874,6 +905,9 @@ no_reference:
 		if (!pathspec)
 			die("invalid path specification");
 
+		if (update_sparse_checkout)
+			die("git checkout: update paths is incompatible with updating sparse checkout.");
+
 		if (patch_mode)
 			return interactive_checkout(new.name, pathspec, &opts);
 
@@ -892,8 +926,11 @@ no_reference:
 		return checkout_paths(source_tree, pathspec, &opts);
 	}
 
-	if (patch_mode)
+	if (patch_mode) {
+		if (update_sparse_checkout)
+			die("git checkout: interactive checkout is incompatible with updating sparse checkout.");
 		return interactive_checkout(new.name, NULL, &opts);
+	}
 
 	if (opts.new_branch) {
 		struct strbuf buf = STRBUF_INIT;
@@ -915,5 +952,8 @@ no_reference:
 	if (opts.writeout_stage)
 		die("--ours/--theirs is incompatible with switching branches.");
 
+	if (update_sparse_checkout)
+		edit_info_sparse_checkout();
+
 	return switch_branches(&opts, &new);
 }
diff --git a/t/t1011-read-tree-sparse-checkout.sh b/t/t1011-read-tree-sparse-checkout.sh
index 50f7dfe..fe4716c 100755
--- a/t/t1011-read-tree-sparse-checkout.sh
+++ b/t/t1011-read-tree-sparse-checkout.sh
@@ -184,4 +184,22 @@ test_expect_success 'read-tree --reset removes outside worktree' '
 	test_cmp empty result
 '
 
+test_expect_success 'git checkout -S fails to launch editor' '
+	GIT_EDITOR=/non-existent test_must_fail git checkout -S &&
+	grep init.t .git/info/sparse-checkout
+'
+
+test_expect_success 'git checkout -S' '
+	git checkout -f top &&
+	cat >editor.sh <<\EOF &&
+#!/bin/sh
+echo sub > "$1"
+EOF
+	chmod 755 editor.sh &&
+	GIT_EDITOR="./editor.sh" git checkout -S &&
+	grep sub .git/info/sparse-checkout &&
+	test -f sub/added &&
+	test ! -f init.t
+'
+
 test_done
diff --git a/templates/info--sparse-checkout b/templates/info--sparse-checkout
index 7426475..588008b 100644
--- a/templates/info--sparse-checkout
+++ b/templates/info--sparse-checkout
@@ -1,3 +1,4 @@
 # Specify what files are checked out in working directory
 # Run "git checkout" after updating this file to update working directory
+# If this is opened by "git checkout -S", empty this file will abort it.
 *
-- 
1.7.3.2.210.g045198

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 08/10] checkout: add --full to fully populate working directory
  2010-11-15 10:36 [PATCH 00/10] Sparse checkout fixes and improvements Nguyễn Thái Ngọc Duy
                   ` (6 preceding siblings ...)
  2010-11-15 10:36 ` [PATCH 07/10] checkout: add -S to update sparse checkout Nguyễn Thái Ngọc Duy
@ 2010-11-15 10:36 ` Nguyễn Thái Ngọc Duy
  2010-11-15 21:23   ` Jonathan Nieder
  2010-11-15 10:36 ` [PATCH 09/10] git-checkout.txt: mention of sparse checkout Nguyễn Thái Ngọc Duy
  2010-11-15 10:36 ` [PATCH 10/10] clean: support cleaning sparse checkout with -S Nguyễn Thái Ngọc Duy
  9 siblings, 1 reply; 34+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2010-11-15 10:36 UTC (permalink / raw)
  To: git; +Cc: Nguyễn Thái Ngọc Duy


Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 Documentation/git-checkout.txt       |    6 +++++-
 builtin/checkout.c                   |   23 ++++++++++++++++++++---
 t/t1011-read-tree-sparse-checkout.sh |    7 +++++++
 3 files changed, 32 insertions(+), 4 deletions(-)

diff --git a/Documentation/git-checkout.txt b/Documentation/git-checkout.txt
index 478bfe7..1d82063 100644
--- a/Documentation/git-checkout.txt
+++ b/Documentation/git-checkout.txt
@@ -12,7 +12,7 @@ SYNOPSIS
 'git checkout' [-q] [-f] [-m] [[-b|-B|--orphan] <new_branch>] [<start_point>]
 'git checkout' [-f|--ours|--theirs|-m|--conflict=<style>] [<tree-ish>] [--] <paths>...
 'git checkout' --patch [<tree-ish>] [--] [<paths>...]
-'git checkout' -S
+'git checkout' [-S|--full]
 
 DESCRIPTION
 -----------
@@ -184,6 +184,10 @@ edits from your current working tree.
 	$GIT_DIR/info/sparse-checkout. The working directory is also
 	updated. An empty file will abort the process.
 
+--full:
+	Reset $GIT_DIR/info/sparse-checkout to include everything and
+	update working directory accordingly.
+
 <branch>::
 	Branch to checkout; if it refers to a branch (i.e., a name that,
 	when prepended with "refs/heads/", is a valid ref), then that
diff --git a/builtin/checkout.c b/builtin/checkout.c
index 47847b3..f070930 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -710,7 +710,7 @@ int cmd_checkout(int argc, const char **argv, const char *prefix)
 	char *conflict_style = NULL;
 	int patch_mode = 0;
 	int dwim_new_local_branch = 1;
-	int update_sparse_checkout = 0;
+	int update_sparse_checkout = 0, full_checkout = 0;
 	struct option options[] = {
 		OPT__QUIET(&opts.quiet),
 		OPT_STRING('b', NULL, &opts.new_branch, "branch",
@@ -732,6 +732,8 @@ int cmd_checkout(int argc, const char **argv, const char *prefix)
 		OPT_BOOLEAN('p', "patch", &patch_mode, "select hunks interactively"),
 		OPT_BOOLEAN('S', "update-sparse-checkout", &update_sparse_checkout,
 			    "open up editor to edit $GIT_DIR/info/sparse-checkout" ),
+		OPT_BOOLEAN(0, "full", &full_checkout,
+			    "reset $GIT_DIR/info/sparse-checkout to checkout everything" ),
 		{ OPTION_BOOLEAN, 0, "guess", &dwim_new_local_branch, NULL,
 		  "second guess 'git checkout no-such-branch'",
 		  PARSE_OPT_NOARG | PARSE_OPT_HIDDEN },
@@ -750,6 +752,9 @@ int cmd_checkout(int argc, const char **argv, const char *prefix)
 	argc = parse_options(argc, argv, prefix, options, checkout_usage,
 			     PARSE_OPT_KEEP_DASHDASH);
 
+	if (full_checkout)
+		update_sparse_checkout = 1;
+
 	if (update_sparse_checkout && !core_apply_sparse_checkout)
 		die("core.sparseCheckout needs to be turned on");
 
@@ -952,8 +957,20 @@ no_reference:
 	if (opts.writeout_stage)
 		die("--ours/--theirs is incompatible with switching branches.");
 
-	if (update_sparse_checkout)
-		edit_info_sparse_checkout();
+	if (update_sparse_checkout) {
+		if (full_checkout) {
+			FILE *fp = fopen(git_path("info/sparse-checkout"), "w+");
+			if (!fp)
+				die_errno("Unable to reset %s", git_path("info/sparse-checkout"));
+			fprintf(fp, "# Specify what files are checked out in working directory\n");
+			fprintf(fp, "# Run \"git checkout\" after updating this file to update working directory\n");
+			fprintf(fp, "# If this is opened by \"git checkout -S\", empty this file will abort it.\n");
+			fprintf(fp, "*\n");
+			fclose(fp);
+		}
+		else
+			edit_info_sparse_checkout(full_checkout);
+	}
 
 	return switch_branches(&opts, &new);
 }
diff --git a/t/t1011-read-tree-sparse-checkout.sh b/t/t1011-read-tree-sparse-checkout.sh
index fe4716c..055e858 100755
--- a/t/t1011-read-tree-sparse-checkout.sh
+++ b/t/t1011-read-tree-sparse-checkout.sh
@@ -202,4 +202,11 @@ EOF
 	test ! -f init.t
 '
 
+test_expect_success 'git checkout --full' '
+	git checkout --full &&
+	grep "^\*$" .git/info/sparse-checkout &&
+	test -f sub/added &&
+	test -f init.t
+'
+
 test_done
-- 
1.7.3.2.210.g045198

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 09/10] git-checkout.txt: mention of sparse checkout
  2010-11-15 10:36 [PATCH 00/10] Sparse checkout fixes and improvements Nguyễn Thái Ngọc Duy
                   ` (7 preceding siblings ...)
  2010-11-15 10:36 ` [PATCH 08/10] checkout: add --full to fully populate working directory Nguyễn Thái Ngọc Duy
@ 2010-11-15 10:36 ` Nguyễn Thái Ngọc Duy
  2010-11-15 10:36 ` [PATCH 10/10] clean: support cleaning sparse checkout with -S Nguyễn Thái Ngọc Duy
  9 siblings, 0 replies; 34+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2010-11-15 10:36 UTC (permalink / raw)
  To: git; +Cc: Nguyễn Thái Ngọc Duy

All the boring technical stuff still resides in git-read-tree.txt. The
introduction here targets end users.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 Documentation/git-checkout.txt  |   29 +++++++++++++++++++++++++++++
 Documentation/git-read-tree.txt |   11 +----------
 2 files changed, 30 insertions(+), 10 deletions(-)

diff --git a/Documentation/git-checkout.txt b/Documentation/git-checkout.txt
index 1d82063..b9a97c9 100644
--- a/Documentation/git-checkout.txt
+++ b/Documentation/git-checkout.txt
@@ -253,6 +253,35 @@ $ git log -g -2 HEAD
 ------------
 
 
+Sparse checkout
+---------------
+
+"Sparse checkout" allows to sparsely populate working directory
+according to patterns defined in `$GIT_DIR/info/sparse-checkout`. The
+syntax of this file is similar to linkgit:gitignore[5].  This feature
+requires core.sparseCheckout to be turned on. `git checkout -S` should
+be used when editing this file. It will update working directory
+properly after editing. See linkgit:git-read-tree[1] for more
+information.
+
+Normally sparse-checkout file contains only one rule, which indicates
+that all files are checked out:
+
+----------------
+*
+----------------
+
+You can update the file to only contain files you want to be checked
+out, or files _not_ to be checked out, using negate patterns. For
+example, to remove file "unwanted":
+
+----------------
+*
+!unwanted
+----------------
+
+To fully populate working directory again, use "git checkout --full".
+
 EXAMPLES
 --------
 
diff --git a/Documentation/git-read-tree.txt b/Documentation/git-read-tree.txt
index 634423a..4b72c96 100644
--- a/Documentation/git-read-tree.txt
+++ b/Documentation/git-read-tree.txt
@@ -390,16 +390,7 @@ Then it compares the new skip-worktree value with the previous one. If
 skip-worktree turns from unset to set, it will add the corresponding
 file back. If it turns from set to unset, that file will be removed.
 
-While `$GIT_DIR/info/sparse-checkout` is usually used to specify what
-files are in. You can also specify what files are _not_ in, using
-negate patterns. For example, to remove file "unwanted":
-
-----------------
-*
-!unwanted
-----------------
-
-Another tricky thing is fully repopulating working directory when you
+A tricky thing is fully repopulating working directory when you
 no longer want sparse checkout. You cannot just disable "sparse
 checkout" because skip-worktree are still in the index and you working
 directory is still sparsely populated. You should re-populate working
-- 
1.7.3.2.210.g045198

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH 10/10] clean: support cleaning sparse checkout with -S
  2010-11-15 10:36 [PATCH 00/10] Sparse checkout fixes and improvements Nguyễn Thái Ngọc Duy
                   ` (8 preceding siblings ...)
  2010-11-15 10:36 ` [PATCH 09/10] git-checkout.txt: mention of sparse checkout Nguyễn Thái Ngọc Duy
@ 2010-11-15 10:36 ` Nguyễn Thái Ngọc Duy
  2010-11-15 21:30   ` Jonathan Nieder
  9 siblings, 1 reply; 34+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2010-11-15 10:36 UTC (permalink / raw)
  To: git; +Cc: Nguyễn Thái Ngọc Duy

Those files that are managed by git, but out of working directory due
to sparse-checkout file are subjected to be cleaned by "-S". Files
that match the index exactly will be cleaned without "-f". Otherwise
'-f' is needed.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 Documentation/git-clean.txt |    6 ++-
 builtin/clean.c             |   70 ++++++++++++++++++++++++++++++++
 t/t7301-clean-sparse.sh     |   92 +++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 167 insertions(+), 1 deletions(-)
 create mode 100755 t/t7301-clean-sparse.sh

diff --git a/Documentation/git-clean.txt b/Documentation/git-clean.txt
index 60e38e6..e0c95b1 100644
--- a/Documentation/git-clean.txt
+++ b/Documentation/git-clean.txt
@@ -8,7 +8,7 @@ git-clean - Remove untracked files from the working tree
 SYNOPSIS
 --------
 [verse]
-'git clean' [-d] [-f] [-n] [-q] [-e <pattern>] [-x | -X] [--] <path>...
+'git clean' [-d] [-f] [-n] [-q] [-e <pattern>] [-x | -X | -S] [--] <path>...
 
 DESCRIPTION
 -----------
@@ -61,6 +61,10 @@ OPTIONS
 	Remove only files ignored by git.  This may be useful to rebuild
 	everything from scratch, but keep manually created files.
 
+-S::
+	Remove files tracked by git but are outside of sparse checkout.
+	Files that match the index exactly will be removed even when
+	'-f' is not given and clean.requireForce is no.
 
 Author
 ------
diff --git a/builtin/clean.c b/builtin/clean.c
index c8798f5..5827993 100644
--- a/builtin/clean.c
+++ b/builtin/clean.c
@@ -34,11 +34,71 @@ static int exclude_cb(const struct option *opt, const char *arg, int unset)
 	return 0;
 }
 
+static int clean_sparse_checkout(const char *prefix, const char **pathspec,
+				 int show_only, int quiet, int force)
+{
+	struct stat st;
+	int i, errors = 0;
+	unsigned char sha1[20];
+	const char *qname;
+	struct strbuf buf = STRBUF_INIT;
+
+	if (read_cache() < 0)
+		die("index file corrupt");
+	for (i = 0; i < the_index.cache_nr; i++) {
+		struct cache_entry *ce = the_index.cache[i];
+
+		if (!ce_skip_worktree(ce))
+			continue;
+		if (pathspec && !match_pathspec(pathspec, ce->name, ce_namelen(ce), 0, NULL))
+			continue;
+
+		if (stat(ce->name, &st) < 0)
+			continue;
+		qname = quote_path_relative(ce->name, ce_namelen(ce), &buf, prefix);
+		if (index_path(sha1, ce->name, &st, 0) < 0) {
+			warning("failed to hash %s", qname);
+			errors++;
+			continue;
+		}
+		if (!hashcmp(sha1, ce->sha1)) {
+			if (show_only) {
+				printf("Would remove %s\n", qname);
+				continue;
+			}
+			if (!quiet)
+				printf("Removing %s\n", qname);
+			if (unlink(ce->name) != 0) {
+				warning("failed to remove %s", qname);
+				errors++;
+			}
+			continue;
+		}
+		if (force) {
+			if (show_only) {
+				printf("Would remove %s\n", qname);
+				continue;
+			}
+			if (!quiet)
+				printf("Removing %s\n", qname);
+			if (unlink(ce->name) != 0) {
+				warning("failed to remove %s", qname);
+				errors++;
+			}
+			continue;
+		}
+		if (show_only)
+			printf("Would not remove %s\n", qname);
+	}
+	return errors != 0;
+}
+
 int cmd_clean(int argc, const char **argv, const char *prefix)
 {
 	int i;
 	int show_only = 0, remove_directories = 0, quiet = 0, ignored = 0;
 	int ignored_only = 0, baselen = 0, config_set = 0, errors = 0;
+	int sparse = 0;
 	int rm_flags = REMOVE_DIR_KEEP_NESTED_GIT;
 	struct strbuf directory = STRBUF_INIT;
 	struct dir_struct dir;
@@ -55,6 +115,7 @@ int cmd_clean(int argc, const char **argv, const char *prefix)
 				"remove whole directories"),
 		{ OPTION_CALLBACK, 'e', "exclude", &exclude_list, "pattern",
 		  "exclude <pattern>", PARSE_OPT_NONEG, exclude_cb },
+		OPT_BOOLEAN('S', NULL, &sparse, "remove tracked files outside sparse checkout"),
 		OPT_BOOLEAN('x', NULL, &ignored, "remove ignored files, too"),
 		OPT_BOOLEAN('X', NULL, &ignored_only,
 				"remove only ignored files"),
@@ -70,6 +131,15 @@ int cmd_clean(int argc, const char **argv, const char *prefix)
 	argc = parse_options(argc, argv, prefix, options, builtin_clean_usage,
 			     0);
 
+	if (sparse) {
+		if (ignored || ignored_only)
+			die("-S, -x and -X cannot be used together");
+		if (exclude_list.nr)
+			die("-S and -e cannot be used together (yet)");
+		pathspec = get_pathspec(prefix, argv);
+		return clean_sparse_checkout(prefix, pathspec, show_only, quiet, force);
+	}
+
 	memset(&dir, 0, sizeof(dir));
 	if (ignored_only)
 		dir.flags |= DIR_SHOW_IGNORED;
diff --git a/t/t7301-clean-sparse.sh b/t/t7301-clean-sparse.sh
new file mode 100755
index 0000000..3ac0b0a
--- /dev/null
+++ b/t/t7301-clean-sparse.sh
@@ -0,0 +1,92 @@
+#!/bin/sh
+
+test_description='git clean -S basic tests'
+
+. ./test-lib.sh
+
+git config clean.requireForce yes
+
+test_expect_success 'setup' '
+	mkdir src &&
+	touch file1 file2 &&
+	touch src/file1 src/file2 &&
+	git add . &&
+	git update-index --skip-worktree file1 src/file1
+'
+
+test_expect_success 'clean -x -S does not work' '
+	test_must_fail git clean -x -S &&
+	test_must_fail git clean -X -S
+'
+
+test_expect_success 'clean -n -S' '
+	cat >expected <<\EOF
+Would remove file1
+Would remove src/file1
+EOF
+	git clean -n -S >result &&
+	test_cmp expected result
+'
+
+test_expect_success 'clean -n -S src' '
+	cat >expected <<\EOF
+Would remove src/file1
+EOF
+	git clean -n -S src >result &&
+	test_cmp expected result
+'
+
+test_expect_success '[src] clean -n -S .' '
+	(
+	cd src
+	cat >expected <<\EOF
+Would remove file1
+EOF
+	git clean -n -S . >result &&
+	test_cmp expected result
+	)
+'
+
+test_expect_success '[src] clean -n -S ../file1' '
+	(
+	cd src
+	cat >expected <<\EOF
+Would remove ../file1
+EOF
+	git clean -n -S ../file1 >result &&
+	test_cmp expected result
+	)
+'
+
+test_expect_success 'clean -n -S with dirty worktree' '
+	echo dirty >file1 &&
+	cat >expected <<\EOF
+Would not remove file1
+Would remove src/file1
+EOF
+	git clean -n -S >result &&
+	test_cmp expected result
+'
+
+test_expect_success 'clean -f -n -S with dirty worktree' '
+	echo dirty >file1 &&
+	cat >expected <<\EOF
+Would remove file1
+Would remove src/file1
+EOF
+	git clean -f -n -S >result &&
+	test_cmp expected result
+'
+
+test_expect_success 'clean -S with dirty worktree' '
+	git clean -S &&
+	grep dirty file1 &&
+	test ! -f src/file1
+'
+
+test_expect_success 'clean -f -S with dirty worktree' '
+	git clean -f -S &&
+	test ! -f file1
+'
+
+test_done
-- 
1.7.3.2.210.g045198

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* Re: [PATCH 01/10] add: do not rely on dtype being NULL behavior
  2010-11-15 10:36 ` [PATCH 01/10] add: do not rely on dtype being NULL behavior Nguyễn Thái Ngọc Duy
@ 2010-11-15 12:14   ` Jonathan Nieder
  2010-11-16  2:18     ` Nguyen Thai Ngoc Duy
  0 siblings, 1 reply; 34+ messages in thread
From: Jonathan Nieder @ 2010-11-15 12:14 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy
  Cc: git, Jens Lehmann, Ævar Arnfjörð Bjarmason

Nguyễn Thái Ngọc Duy wrote:

> Commit c84de70 (excluded_1(): support exclude files in index -
> 2009-08-20) added support for excluded() where dtype can be NULL. It
> was designed specifically for index matching because there was no
> other way to extract dtype information from index. It did not support
> wildcard matching (for example, "a*/" pattern would fail to match).
> 
> The code was probably misread when commit 108da0d (git add: Add the
> "--ignore-missing" option for the dry run - 2010-07-10) was made
> because DT_UNKNOWN happens to be zero (NULL) too.
> 
> Do not pass DT_UNKNOWN/NULL to excluded(), instead pass a pointer to a
> variable that contains DT_UNKNOWN. The real dtype will be extracted
> from worktree by excluded(), as expected.

Could you rephrase this in a way that contrasts current and desired
behavior?  Is it like this?

	The "git add --ignore-missing --dry-run" codepath is
	interpreting .gitignore incorrectly, unlike "git add".  For
	example:

		$ test -e foo || echo missing
		missing
		$ echo foo/ >>.gitignore
		$ mkdir bar
		$ git add --ignore-missing --dry-run foo; echo $?
		The following paths are ignored by one of your .gitignore files:
		foo/
		Use -f if you really want to add them.
		fatal: no files added
		128
		$ git add --ignore-missing --dry-run bar/foo; echo $?
		0

	In the original use case (preparing to add a submodule) the
	behavior of the first command is correct, second incorrect.
	If the entry to be added was a regular file, it would be the
	other way around.

	The cause: the --ignore-missing code passes DT_UNKNOWN as the
	dtype_ptr argument to excluded() which happens to equal zero
	(NULL) and accidentally triggers the "match pathspecs in index
	only" codepath (see c84de70, excluded_1(): support exclude
	files in index, 2009-08-20) that is unfortunately a bit
	primitive.

	Surely what was really wanted is to check paths against the
	index and work tree, defaulting to "regular file".

Wait --- that's not true.  In the "git submodule add" case, we really
want to default to (or even better, force) "directory".

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 02/10] unpack-trees: move all skip-worktree check back to unpack_trees()
  2010-11-15 10:36 ` [PATCH 02/10] unpack-trees: move all skip-worktree check back to unpack_trees() Nguyễn Thái Ngọc Duy
@ 2010-11-15 12:34   ` Thiago Farina
  2010-11-16  2:19     ` Nguyen Thai Ngoc Duy
  2010-11-15 16:01   ` Jonathan Nieder
  1 sibling, 1 reply; 34+ messages in thread
From: Thiago Farina @ 2010-11-15 12:34 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy; +Cc: git

2010/11/15 Nguyễn Thái Ngọc Duy <pclouds@gmail.com>:
>  cache.h        |    1 +
>  unpack-trees.c |   69 ++++++++++++++++++++++++++++++++++++++++++++++++++-----
>  2 files changed, 63 insertions(+), 7 deletions(-)
>
> diff --git a/cache.h b/cache.h
> index 33decd9..d87708a 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -182,6 +182,7 @@ struct cache_entry {
>  #define CE_WT_REMOVE (0x400000) /* remove in work directory */
>
>  #define CE_UNPACKED  (0x1000000)
> +#define CE_NEW_SKIP_WORKTREE (0x2000000)
>
Would be good to remove the () around the hex here and else in this file?

>  /*
>  * Extended on-disk flags
> diff --git a/unpack-trees.c b/unpack-trees.c
> index 803445a..9acd9be 100644
> --- a/unpack-trees.c
> +++ b/unpack-trees.c
> @@ -258,7 +258,7 @@ static int apply_sparse_checkout(struct cache_entry *ce, struct unpack_trees_opt
>  {
>        int was_skip_worktree = ce_skip_worktree(ce);
>
> -       if (!ce_stage(ce) && will_have_skip_worktree(ce, o))
> +       if (ce->ce_flags & CE_NEW_SKIP_WORKTREE)
>                ce->ce_flags |= CE_SKIP_WORKTREE;
>        else
>                ce->ce_flags &= ~CE_SKIP_WORKTREE;
> @@ -834,6 +834,49 @@ static int unpack_callback(int n, unsigned long mask, unsigned long dirmask, str
>        return mask;
>  }
>
> +static void set_new_skip_worktree_1(struct unpack_trees_options *o)
> +{
> +       int i;
> +
> +       for (i = 0;i < o->src_index->cache_nr;i++) {
Could you add spaces after ; for readability, please? There is another
for below that needs this to.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 02/10] unpack-trees: move all skip-worktree check back to unpack_trees()
  2010-11-15 10:36 ` [PATCH 02/10] unpack-trees: move all skip-worktree check back to unpack_trees() Nguyễn Thái Ngọc Duy
  2010-11-15 12:34   ` Thiago Farina
@ 2010-11-15 16:01   ` Jonathan Nieder
  2010-11-16  2:39     ` Nguyen Thai Ngoc Duy
  1 sibling, 1 reply; 34+ messages in thread
From: Jonathan Nieder @ 2010-11-15 16:01 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy; +Cc: git

Nguyễn Thái Ngọc Duy wrote:

> Earlier, the will_have_skip_worktree() checks are done in various places:
> 
> 1. in verify_* functions to prevent absent/uptodate checks outside
>    worktree
> 2. in merged_entry for new index entries
> 3. in apply_sparse_checkout() for all entries
> 
> In case all entries are added new ($GIT_DIR/index is missing) all the
> above checks will be done for all entries, which in the worst case can
> become cache_nr*el->nr*3 fnmatch() calls. Quite expensive.

Does this mean something like:

	Handling of sparse checkouts is slow.

	[timings]

	To fix this, we will do such-and-such.  As a first step,
	we'll do such-and-such which does not change semantics
	and at least does not make it any slower.

?

In other words, any commit message should make clear

 1. The purpose.
 2. The baseline of (sane or insane) behavior that is affected.
 3. The intended resulting functionality.

How the patch works and why it succeeds are just optional extras
(nice, certainly, but in 90% of cases it is obvious from the code once
you know (1), (2), and (3) anyway).

> --- a/cache.h
> +++ b/cache.h
> @@ -182,6 +182,7 @@ struct cache_entry {
>  #define CE_WT_REMOVE (0x400000) /* remove in work directory */
>  
>  #define CE_UNPACKED  (0x1000000)
> +#define CE_NEW_SKIP_WORKTREE (0x2000000)

Semantics?

> --- a/unpack-trees.c
> +++ b/unpack-trees.c
> @@ -258,7 +258,7 @@ static int apply_sparse_checkout(struct cache_entry *ce, struct unpack_trees_opt
>  {
>  	int was_skip_worktree = ce_skip_worktree(ce);
>  
> -	if (!ce_stage(ce) && will_have_skip_worktree(ce, o))
> +	if (ce->ce_flags & CE_NEW_SKIP_WORKTREE)
>  		ce->ce_flags |= CE_SKIP_WORKTREE;

So CE_NEW_SKIP_WORKTREE roughly means "stage-0 entry outside the sparse checkout area"?

>  	else
>  		ce->ce_flags &= ~CE_SKIP_WORKTREE;

In particular, I guess the NEW part refers to the sparse checkout
area, not the entry, since existing index entries with SKIP_WORKTREE
bits need to keep that bit.

> @@ -834,6 +834,49 @@ static int unpack_callback(int n, unsigned long mask, unsigned long dirmask, str
>  	return mask;
>  }
>  
> +static void set_new_skip_worktree_1(struct unpack_trees_options *o)
> +{

Will comment on name once we get to the call site.

> +	int i;
> +
> +	for (i = 0;i < o->src_index->cache_nr;i++) {
> +		struct cache_entry *ce = o->src_index->cache[i];
> +		ce->ce_flags &= ~CE_ADDED;
> +		if (!ce_stage(ce) && will_have_skip_worktree(ce, o))
> +			ce->ce_flags |= CE_NEW_SKIP_WORKTREE;
> +		else
> +			ce->ce_flags &= ~CE_NEW_SKIP_WORKTREE;
> +	}
> +}

Populating the CE_NEW_SKIP_WORKTREE flags based on the new
worktree area.

> +
> +static int verify_absent(struct cache_entry *, enum unpack_trees_error_types, struct unpack_trees_options *);
> +static int set_new_skip_worktree_2(struct unpack_trees_options *o)
> +{
> +	int i;
> +
> +	/*
> +	 * CE_ADDED marks new index entries. These have not been processed
> +	 * by set_new_skip_worktree_1() so we do it here.
> +	 */

Probably this comment belongs at the call site instead, to avoid some
chasing.

> +	for (i = 0;i < o->result.cache_nr;i++) {
> +		struct cache_entry *ce = o->result.cache[i];
> +
> +		if (!(ce->ce_flags & CE_ADDED))
> +			continue;
> +
> +		ce->ce_flags &= ~CE_ADDED;
> +		if (!ce_stage(ce) && will_have_skip_worktree(ce, o))
> +			ce->ce_flags |= CE_SKIP_WORKTREE | CE_NEW_SKIP_WORKTREE;
> +		else
> +			ce->ce_flags &= ~(CE_SKIP_WORKTREE | CE_NEW_SKIP_WORKTREE);

CE_ADDED is private to the add_file_to_index() code path shared
between add and rerere builtins.  rerere is libified and used e.g. by
cherry-pick foo..bar.  Can it get us in trouble or do we have clear
the flags before using them here?  I think it would be worth a note in
api-in-core-index.txt to warn future refactorers.

> +
> +		/* Left-over checks from merged_entry when old == NULL */

Huh?  (That is completely cryptic outside the context of the patch.)

> +		if (verify_absent(ce, ERROR_WOULD_LOSE_UNTRACKED_OVERWRITTEN, o))
> +			return -1;
> +	}
[...]
> @@ -862,6 +905,9 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
>  			o->el = &el;
>  	}
>  
> +	if (!o->skip_sparse_checkout)
> +		set_new_skip_worktree_1(o);
> +

Why is this not folded into the above if ()?

This populates the NEW_SKIP_WORKTREE (= future SKIP_WORKTREE?) bit
in the index that represents the preimage for a reset or merge.

Perhaps call it:

		set_new_skip_worktree(o->src_index, 0);

and mark that function __attribute__((always_inline)) if the
optimizer doesn't want to cooperate?

Or:

		set_src_skip_worktree(o);

>  	memset(&o->result, 0, sizeof(o->result));
>  	o->result.initialized = 1;
>  	o->result.timestamp.sec = o->src_index->timestamp.sec;
> @@ -922,6 +968,10 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
>  
>  	if (!o->skip_sparse_checkout) {
>  		int empty_worktree = 1;
> +
> +		if (set_new_skip_worktree_2(o))
> +			goto return_failed;
> +

Trivial part of the merge is over.  So now we can check the new
index entries against the sparse worktree patterns.  They were added in
that trivial part using add_entry() and will have the CE_ADDED flag.

So this might be:

		set_new_skip_worktree(&o->result, CE_ADDED);

or

		set_result_skip_worktree(o);

The CE_ADDED flag was set in merged_entry, called by the merge_fn_t
implementations.  If there were an api-traverse-trees.txt explaining
the API, it would be worth a note there; for now it should suffice
to add a note to future merge_fn_t implementors in the commit
message ("each unpack-trees merge function arranges for
CE_SKIP_WORKTREE and CE_SKIP_NEW_WORKTREE to be propagated and for the
CE_ADDED flag to be set on entries of new paths").

>  		for (i = 0;i < o->result.cache_nr;i++) {
>  			struct cache_entry *ce = o->result.cache[i];
>  
> @@ -1017,7 +1067,7 @@ static int verify_uptodate_1(struct cache_entry *ce,
>  static int verify_uptodate(struct cache_entry *ce,
>  			   struct unpack_trees_options *o)
>  {
> -	if (!o->skip_sparse_checkout && will_have_skip_worktree(ce, o))
> +	if (!o->skip_sparse_checkout && (ce->ce_flags & CE_NEW_SKIP_WORKTREE))
[...]
> @@ -1204,7 +1254,7 @@ static int verify_absent(struct cache_entry *ce,
>  			 enum unpack_trees_error_types error_type,
>  			 struct unpack_trees_options *o)
>  {
> -	if (!o->skip_sparse_checkout && will_have_skip_worktree(ce, o))
> +	if (!o->skip_sparse_checkout && (ce->ce_flags & CE_NEW_SKIP_WORKTREE))

The point, I hope.  Currently we alternate between finding entries to
remove and deciding whether if lie in the worktree.  After this patch,
whether they lie in the worktree is precomputed.

> @@ -1226,10 +1276,15 @@ static int merged_entry(struct cache_entry *merge, struct cache_entry *old,
>  	int update = CE_UPDATE;
>  
>  	if (!old) {
> +		/*
> +		 * Set CE_NEW_SKIP_WORKTREE on "merge" to make
> +		 * verify_absent() no-op. The true check will be done
> +		 * later on in unpack_trees().
> +		 */
> +		merge->ce_flags |= CE_NEW_SKIP_WORKTREE;

Mm?  Since verify_absent() is a no-op, why call it?  This looks like a
placeholder for code that calls verify_absent later, when we know if
it lies in the worktree.

>  		if (verify_absent(merge, ERROR_WOULD_LOSE_UNTRACKED_OVERWRITTEN, o))
>  			return -1;
> -		if (!o->skip_sparse_checkout && will_have_skip_worktree(merge, o))
> -			update |= CE_SKIP_WORKTREE;
> +		update |= CE_ADDED;

More of the main act.  Currently we alternate between finding entries
to add and deciding if they are in the worktree.  After the patch,
they are piled up for addition first, then checked against the
worktree in one batch.

>  		invalidate_ce_path(merge, o);
>  	} else if (!(old->ce_flags & CE_CONFLICTED)) {
>  		/*
> @@ -1245,8 +1300,8 @@ static int merged_entry(struct cache_entry *merge, struct cache_entry *old,
>  		} else {
>  			if (verify_uptodate(old, o))
>  				return -1;
> -			if (ce_skip_worktree(old))
> -				update |= CE_SKIP_WORKTREE;
> +			/* Migrate old bits over */
> +			update |= old->ce_flags & (CE_SKIP_WORKTREE | CE_NEW_SKIP_WORKTREE);

Thanks, this looks promising.
Jonathan

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 03/10] unpack-trees: add function to update ce_flags based on sparse patterns
  2010-11-15 10:36 ` [PATCH 03/10] unpack-trees: add function to update ce_flags based on sparse patterns Nguyễn Thái Ngọc Duy
@ 2010-11-15 18:30   ` Jonathan Nieder
  2010-11-15 20:19   ` Jonathan Nieder
  1 sibling, 0 replies; 34+ messages in thread
From: Jonathan Nieder @ 2010-11-15 18:30 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy; +Cc: git

Nguyễn Thái Ngọc Duy wrote:

> The function will reconstruct directory structure from index and feed
> excluded_from_list() with proper dtype.

Might be worth squashing with a patch introducing a caller; otherwise it's
hard to get a sense of what these functions do.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 04/10] unpack-trees: fix sparse checkout's "unable to match directories" fault
  2010-11-15 10:36 ` [PATCH 04/10] unpack-trees: fix sparse checkout's "unable to match directories" fault Nguyễn Thái Ngọc Duy
@ 2010-11-15 19:10   ` Jonathan Nieder
  2010-11-16  2:43     ` Nguyen Thai Ngoc Duy
  0 siblings, 1 reply; 34+ messages in thread
From: Jonathan Nieder @ 2010-11-15 19:10 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy; +Cc: git

Nguyễn Thái Ngọc Duy wrote:

> excluded_from_list() is designed to be work with a directory
> structure, where there are files and directories. Index however is a
                                                   ^
                                                   The
> list of files, no directories, not really fit in excluded_from_list()
                 ^with no separate entries for directories, so it does
> design.
[...]
> clear_ce_flags() is now used to match the index against sparse
> patterns. It does generate directories for excluded_from_list()

Sounds exciting.  I can't believe I missed that, but I guess what
was needed is a simple example caller.  Let's see how this patch
works in that vein.

> --- a/Documentation/git-read-tree.txt
> +++ b/Documentation/git-read-tree.txt
> @@ -416,13 +416,6 @@ turn `core.sparseCheckout` on in order to have sparse checkout
>  support.
>  
>  
> -BUGS
> -----
> -In order to match a directory with $GIT_DIR/info/sparse-checkout,
> -trailing slash must be used. The form without trailing slash, while
> -works with .gitignore, does not work with sparse checkout.

Nice.

[...]
> diff --git a/t/t1011-read-tree-sparse-checkout.sh b/t/t1011-read-tree-sparse-checkout.sh
> index 9a07de1..50f7dfe 100755
> --- a/t/t1011-read-tree-sparse-checkout.sh
> +++ b/t/t1011-read-tree-sparse-checkout.sh
> @@ -91,12 +91,20 @@ test_expect_success 'match directories with trailing slash' '
>  	test -f sub/added
>  '
>  
> -test_expect_failure 'match directories without trailing slash' '
> +test_expect_success 'match directories without trailing slash' '

Also nice.

> -	echo init.t >.git/info/sparse-checkout &&
>  	echo sub >>.git/info/sparse-checkout &&
>  	git read-tree -m -u HEAD &&
>  	git ls-files -t >result &&
> -	test_cmp expected.swt result &&
> +	test_cmp expected.swt-noinit result &&
> +	test ! -f init.t &&
> +	test -f sub/added

Maybe the new

	test_path_is_missing

and

	test_path_is_file

would be a good fit here, for better behavior when the path is a directory in
the former case and better debugging output for failures in the latter.

> +'
> +
> +test_expect_success 'match directory pattern' '
> +	echo "s?b" >>.git/info/sparse-checkout &&
> +	git read-tree -m -u HEAD &&
> +	git ls-files -t >result &&
> +	test_cmp expected.swt-noinit result &&
>  	test ! -f init.t &&
>  	test -f sub/added
>  '

A pattern change that changes the list of matched paths would be more
impressive.  (Though I understand, this version already works for
demonstrating progress.)

[...]
> --- a/dir.c
> +++ b/dir.c
> @@ -359,12 +359,6 @@ int excluded_from_list(const char *pathname,
>  			int to_exclude = x->to_exclude;
>  
>  			if (x->flags & EXC_FLAG_MUSTBEDIR) {
> -				if (!dtype) {
> -					if (!prefixcmp(pathname, exclude))
> -						return to_exclude;
> -					else
> -						continue;
> -				}

Removing the no longer used !dtype codepath.

This hunk could be a separate patch (simplifying the diff and
decoupling this series from patch 1).

> --- a/unpack-trees.c
> +++ b/unpack-trees.c
[...]
> @@ -926,14 +917,19 @@ static void set_new_skip_worktree_1(struct unpack_trees_options *o)
>  {
>  	int i;
>  
> +	/* Mark everything out (except staged entries) */
>  	for (i = 0;i < o->src_index->cache_nr;i++) {
>  		struct cache_entry *ce = o->src_index->cache[i];
>  		ce->ce_flags &= ~CE_ADDED;
> +		if (!ce_stage(ce))

Not a fan of this new definition of "staged".  The following might be
clearer:

	/*
	 * The WILL_BE_SKIP_WORKTREE bit is initialized in two stages:
	 *
	 * 1. Pretend the checkout is so narrow it contains no files.
	 *    Only unresolved entries participating in a merge would
	 *    get written to the worktree.
	 *
	 * 2. Add back entries according to .git/info/sparse-checkout
	 */

[...]
> @@ -943,19 +939,26 @@ static int set_new_skip_worktree_2(struct unpack_trees_options *o)
>  
>  	/*
>  	 * CE_ADDED marks new index entries. These have not been processed
> -	 * by set_new_skip_worktree_1() so we do it here.
> +	 * by set_new_skip_worktree_1() so we do it here. Mark every new
> +	 * entries out.
>  	 */

Nit: "every" wants a singular noun.  But even better would be to move
the description of effect to the call site.

The description of strategy can stay here so it doesn't go out of date
when the code changes:

	/*
	 * Just like set_new_skip_worktree_1.
	 * unpack-trees never writes conflicted entries to the worktree,
	 * so there is no need to worry about conflicted files.
	 *
	 * 1. Set WILL_BE_SKIP_WORKTREE for all new entries.
	 * 2. Clear it again for paths within the sparse checkout.
	 */

Is it possible to make the two code paths share code without
sacrificing speed?

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 03/10] unpack-trees: add function to update ce_flags based on sparse patterns
  2010-11-15 10:36 ` [PATCH 03/10] unpack-trees: add function to update ce_flags based on sparse patterns Nguyễn Thái Ngọc Duy
  2010-11-15 18:30   ` Jonathan Nieder
@ 2010-11-15 20:19   ` Jonathan Nieder
  1 sibling, 0 replies; 34+ messages in thread
From: Jonathan Nieder @ 2010-11-15 20:19 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy; +Cc: git

Nguyễn Thái Ngọc Duy wrote:

> The function will reconstruct directory structure from index and feed
> excluded_from_list() with proper dtype.

Okay, so I guess what this means is:

 Matching index entries against an excludes file currently has two
 problems.

 First, there's no function to do it.  Code paths (like sparse checkout)
 that wanted to try it would iterate over index entries and for each
 index entry pass that path to excluded_from_list().  But that is not
 how excluded_from_list() works; one is supposed to feed in each
 ancester of a path before a given path to find out if it was excluded
 because of some parent or grandparent matching a

	bigsubdirectory/

 pattern despite the path not matching any .gitignore pattern directly.

 Second, it's inefficient.  The excludes mechanism is supposed to let
 us block off vast swaths of the filesystem as uninteresting; separately
 checking every index entry doesn't fit that model.

 Introduce a new function to take care of both these problems.  This
 traverses the index in depth-first order (well, that's what order the
 index is in) to mark un-excluded entries.

 Maybe some day the in-core index format will be restructured to make
 this sort of operation easier.  Or maybe we will want to try some
 binary search based thing.  The interface is simple enough to allow
 all those things.  Example:

	clear_ce_flags(the_index.cache, the_index.cache_nr,
			CE_CANDIDATE, CE_CLEARME, exclude_list);

 would clear the CE_CLEARME flag on all index entries with
 CE_CANDIDATE flag and not matched by exclude_list.

> --- a/unpack-trees.c
> +++ b/unpack-trees.c
> @@ -834,6 +834,94 @@ static int unpack_callback(int n, unsigned long mask, unsigned long dirmask, str
>  	return mask;
>  }
>  
> +/*
> + * traverse the index, find every entry that matches according to
> + * o->el. Do "ce_flags &= ~clear_mask" on those entries. Return the
> + * number of traversed entries.
> + *
> + * If select_mask is non-zero, only entries whose ce_flags has on of
> + * those bits enabled are traversed.
> + */
> +static int clear_ce_flags_1(struct cache_entry **cache, int nr,
> +			    int prefix_len, int match_all,
> +			    int select_mask, int clear_mask,
> +			    struct unpack_trees_options *o)

o is only used for its excludes list.  Why not cut out the middle man
and take o->el as argument?

State (struct-worthy?): 
	cache     	pointer to an index entry
	prefix_len	an offset into its path
	match_all 	a boolean telling whether we have some skipping to do

The current path (called the prefix), including trailing '/', is

	cache[0]->name[0] .. cache[0]->name[prefix_len - 1]

> +{
> +	const char *name, *slash;
> +	struct cache_entry *ce = cache[0];
> +	const char *prefix = ce->name;
> +	int processed, original_nr = nr;
> +
> +	if (prefix_len && !match_all) {

Top level gets special handling.  Let's skip to it (see [*] for where we
were).

> +	while (nr) {

For each cache entry...

Might be nice to phrase this as a for loop.

	const char *prefix = cache[0]->name;
	struct cache_entry ** const cache_end = cache + nr;
	struct cache_entry **cep;
	for (cep = cache; cep != cache_end; ) {
		struct cache_entry *ce = *cep;

> +		if (select_mask && !(ce->ce_flags & select_mask)) {
> +			cache++;
> +			ce = *cache;
> +			nr--;
> +			continue;
> +		}

Some index entries are disqualified.

		if (select_mask)
			if (!(ce->ce_cflags & select_mask)) {
				cep++;
				continue;
			}

> +		if (prefix_len && strncmp(ce->name, prefix, prefix_len))
> +			break;

Entries not under prefix are handled by the caller.

		if (prefix_len)
			if (strncmp(ce->name, prefix, prefix_len))
				return cep - cache;	/* number processed */

> +		name = ce->name + prefix_len;
> +		slash = strchr(name, '/');
> +
> +		/* no slash, this is a file */
> +		if (!slash) {

Leaf!

			/* File or directory?  Check if it's a submodule. */
> +			int dtype = ce_to_dtype(ce);
> +			if (match_all ||
> +			    excluded_from_list(ce->name, ce_namelen(ce), name, &dtype, o->el) > 0)
> +				ce->ce_flags &= ~clear_mask;
> +			cache++;
> +			ce = *cache;
> +			nr--;
> +			continue;
> +		}

			if (excluded_block ||
			    excluded_from_list(...		/* Excluded? */
				ce->ce_flags &= ~clear_mask;
			cep++;
			continue;
		}

Putting the loop body in its own function could allow some nice depth
reduction.

> +
> +		/* has slash, this is a directory */
> +		processed = clear_ce_flags_1(cache, nr,
> +					     slash + 1 - ce->name, match_all,
> +					     select_mask, clear_mask, o);
> +		cache += processed;
> +		ce = *cache;
> +		nr -= processed;
> +	}
> +	return original_nr - nr;

		/* Non-leaf.  Recurse. */
		cep += clear_ce_ ...
	}

How about the other case, mentioned before (see [*] above)?

	assert(prefix_len);	/* Not toplevel but a subdirectory. */
	assert(!match_all);	/* Not wholesale excluded. */

> +		int dtype = DT_DIR;
> +		static char path[PATH_MAX];
> +		int pathlen = prefix_len - 1;
> +		const char *basename;
> +
> +		memcpy(path, prefix, pathlen);
> +		path[pathlen] = '\0';
> +		basename = strrchr(path, '/');

Use memrchr?

> +		basename = basename ? basename+1 : path;
> +		switch (excluded_from_list(path, pathlen, basename, &dtype, o->el)) {
> +		case 0:
> +			while (nr && !strncmp(ce->name, prefix, prefix_len)) {
> +				cache++;
> +				ce = *cache;
> +				nr--;
> +			}
> +			return original_nr - nr;

		case 0:	/* Included.  Great, no flag to clear. */
			for (; cep != cache_end; cep++)
				if (strncmp((*cep)->name, prefix, prefix_len))
					break;
			return cep - cache;

> +		case 1:
> +			match_all = 1;
> +			break;

		case 1:	/* Excluded.  Great, clear them wholesale. */
			match_all = 1;
			continue;

> +		default:	/* case -1, undecided */
> +			break;
> +		}

		case -1:	/* Undecided.
			continue;
		}
> +	}
> +	return original_nr - nr;
> +}

	}
	/* Ran off the end. */
	return cep - cache;
 }

That last bit could even longjmp() if expecting deeply nested
directory hierarchies.  (Sorry, joking, you should not do such an
awful thing unless timing it proves it useful.)

That was clean, thanks.

Hope some of my musings are useful anyway.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 05/10] unpack-trees: optimize full checkout case
  2010-11-15 10:36 ` [PATCH 05/10] unpack-trees: optimize full checkout case Nguyễn Thái Ngọc Duy
@ 2010-11-15 20:41   ` Jonathan Nieder
  0 siblings, 0 replies; 34+ messages in thread
From: Jonathan Nieder @ 2010-11-15 20:41 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy; +Cc: git

Nguyễn Thái Ngọc Duy wrote:

> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>

So, what does it do?  What workflow was not optimized (yes, I know full
checkout means pattern "*", but the reader might not), why ought it to
be optimized, and what are the side-effects, if any?

> --- a/unpack-trees.c
> +++ b/unpack-trees.c
> @@ -996,6 +996,16 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
>  			o->el = &el;
>  	}
>  
> +	if (!o->skip_sparse_checkout &&
> +	    o->el->nr == 1 &&
> +	    !strcmp(o->el->excludes[0]->pattern, "*")) {
> +		for (i = 0; i < o->src_index->cache_nr; i++)
> +			if (ce_skip_worktree(o->src_index->cache[i]))
> +				break;
> +		if (i == o->src_index->cache_nr)
> +			o->skip_sparse_checkout = 1;

Millinit:

	if (!o->skip_sparse_checkout &&
			checkout_is_whole_tree(o) &&
			index_uses_flag(o->src_index, CE_SKIP_WORKTREE))
		o->skip_sparse_checkout = 1;

Functions for the two conditions (especially the second) would make
this clearer, I think.

Sensible.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 07/10] checkout: add -S to update sparse checkout
  2010-11-15 10:36 ` [PATCH 07/10] checkout: add -S to update sparse checkout Nguyễn Thái Ngọc Duy
@ 2010-11-15 21:16   ` Jonathan Nieder
  2010-11-15 21:52     ` Miles Bader
  2010-11-16  3:08     ` Nguyen Thai Ngoc Duy
  0 siblings, 2 replies; 34+ messages in thread
From: Jonathan Nieder @ 2010-11-15 21:16 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy; +Cc: git

Nguyễn Thái Ngọc Duy wrote:

> --- a/Documentation/git-checkout.txt
> +++ b/Documentation/git-checkout.txt
> @@ -12,6 +12,7 @@ SYNOPSIS
>  'git checkout' [-q] [-f] [-m] [[-b|-B|--orphan] <new_branch>] [<start_point>]
>  'git checkout' [-f|--ours|--theirs|-m|--conflict=<style>] [<tree-ish>] [--] <paths>...
>  'git checkout' --patch [<tree-ish>] [--] [<paths>...]
> +'git checkout' -S
>  
>  DESCRIPTION
>  -----------
> @@ -176,6 +177,13 @@ the conflicted merge in the specified paths.
>  This means that you can use `git checkout -p` to selectively discard
>  edits from your current working tree.
>  
> +-S::
> +--update-sparse-checkout::
> +	An editor is invoked to let you update your sparse checkout
> +	patterns. The updated patterns will be saved in
> +	$GIT_DIR/info/sparse-checkout. The working directory is also
> +	updated. An empty file will abort the process.

Wording nit: this doesn't make the worktree more up-to-date.  How
about:

 --edit-sparse-checkout
 --define-work-area
 --narrow-worktree

Hmph.

--edit-sparse-checkout seems best for consistency of the choices I can
think of.

[...]
> @@ -316,6 +324,14 @@ $ git add frotz
>  ------------
>  
>  
> +ENVIRONMENT AND CONFIGURATION VARIABLES
> +---------------------------------------
> +The editor used to edit the commit log message will be chosen from the
                                 ^
  patterns defining what subset of the tracked tree to work on

> +GIT_EDITOR environment variable, the core.editor configuration variable, the

Might be simpler to say:

 CONFIGURATION
 -------------
 --edit-sparse-checkout uses the configured editor.  See git-var(1) for
 details.

[...]
> --- a/builtin/checkout.c
> +++ b/builtin/checkout.c
> @@ -675,6 +675,31 @@ static const char *unique_tracking_name(const char *name)
>  	return NULL;
>  }
>  
> +static void edit_info_sparse_checkout()
> +{
> +	char *tmpfile = xstrdup(git_path("sparse-checkout"));

git_pathdup()?

> +	struct exclude_list el;
> +	int i;
> +
> +	copy_file(tmpfile, git_path("info/sparse-checkout"), 0666);
> +
> +	if (launch_editor(tmpfile, NULL, NULL))
> +		exit(1);
> +
> +	memset(&el, 0, sizeof(el));
> +	if (add_excludes_from_file_to_list(tmpfile, "", 0, NULL, &el, 0) < 0 ||
> +	    el.nr == 0)
> +		die("No valid sparse patterns. Abort.");

Perhaps the error message should mention the temp file.

> +	for (i = 0; i < el.nr; i++)
> +		free(el.excludes[i]);
> +	free(el.excludes);

Neat.

> +
> +	if (rename(tmpfile, git_path("info/sparse-checkout")) < 0)
> +		die_errno("Can't update %s", git_path("info/sparse-checkout"));

Wouldn't git_path() clobber errno?  Probably worth keeping a temporary
with the result from git_path before the rename.

> +
> +	free(tmpfile);
> +}
> +
>  int cmd_checkout(int argc, const char **argv, const char *prefix)
>  {
>  	struct checkout_opts opts;
> @@ -685,6 +710,7 @@ int cmd_checkout(int argc, const char **argv, const char *prefix)
>  	char *conflict_style = NULL;
>  	int patch_mode = 0;
>  	int dwim_new_local_branch = 1;
> +	int update_sparse_checkout = 0;
>  	struct option options[] = {
>  		OPT__QUIET(&opts.quiet),
>  		OPT_STRING('b', NULL, &opts.new_branch, "branch",
> @@ -704,6 +730,8 @@ int cmd_checkout(int argc, const char **argv, const char *prefix)
>  		OPT_STRING(0, "conflict", &conflict_style, "style",
>  			   "conflict style (merge or diff3)"),
>  		OPT_BOOLEAN('p', "patch", &patch_mode, "select hunks interactively"),
> +		OPT_BOOLEAN('S', "update-sparse-checkout", &update_sparse_checkout,
> +			    "open up editor to edit $GIT_DIR/info/sparse-checkout" ),
>  		{ OPTION_BOOLEAN, 0, "guess", &dwim_new_local_branch, NULL,
>  		  "second guess 'git checkout no-such-branch'",
>  		  PARSE_OPT_NOARG | PARSE_OPT_HIDDEN },
> @@ -722,6 +750,9 @@ int cmd_checkout(int argc, const char **argv, const char *prefix)
>  	argc = parse_options(argc, argv, prefix, options, checkout_usage,
>  			     PARSE_OPT_KEEP_DASHDASH);
>  
> +	if (update_sparse_checkout && !core_apply_sparse_checkout)
> +		die("core.sparseCheckout needs to be turned on");

--%s requires core.sparsecheckout to be turned out

> +
>  	/* we can assume from now on new_branch = !new_branch_force */
>  	if (opts.new_branch && opts.new_branch_force)
>  		die("-B cannot be used with -b");
> @@ -874,6 +905,9 @@ no_reference:
>  		if (!pathspec)
>  			die("invalid path specification");
>  
> +		if (update_sparse_checkout)
> +			die("git checkout: update paths is incompatible with updating sparse checkout.");
> +

"updating paths is incompatible", I think.

Maybe an area for future work. :)

> @@ -892,8 +926,11 @@ no_reference:
>  		return checkout_paths(source_tree, pathspec, &opts);
>  	}
>  
> -	if (patch_mode)
> +	if (patch_mode) {
> +		if (update_sparse_checkout)
> +			die("git checkout: interactive checkout is incompatible with updating sparse checkout.");

Hmm, I'd put

	if (patch_mode && update_sparse_checkout)
		die(...

earlier as part of option validation.

[...]
> --- a/t/t1011-read-tree-sparse-checkout.sh
> +++ b/t/t1011-read-tree-sparse-checkout.sh
> @@ -184,4 +184,22 @@ test_expect_success 'read-tree --reset removes outside worktree' '
>  	test_cmp empty result
>  '
>  
> +test_expect_success 'git checkout -S fails to launch editor' '
> +	GIT_EDITOR=/non-existent test_must_fail git checkout -S &&

Exporting envvars via a function is non-portable.  The usual workaround:

	(
		GIT_EDITOR=... &&
		export  GIT_EDITOR &&
		test_must_fail ...
	) &&

Maybe it's time to introduce

	eval_must_fail GIT_EDITOR=/non-existent git checkout -S

?

[...]
> +test_expect_success 'git checkout -S' '
> +	git checkout -f top &&
> +	cat >editor.sh <<\EOF &&
> +#!/bin/sh
> +echo sub > "$1"
> +EOF

Style: the git test scripts usually omit the space after >.

> --- a/templates/info--sparse-checkout
> +++ b/templates/info--sparse-checkout
> @@ -1,3 +1,4 @@
>  # Specify what files are checked out in working directory
>  # Run "git checkout" after updating this file to update working directory
> +# If this is opened by "git checkout -S", empty this file will abort it.
>  *

Usage nits: s/if/when/; s/opened/edited/; s/will/to/; s/it//.

 # When editing with "git checkout -S", empty this file to abort.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 08/10] checkout: add --full to fully populate working directory
  2010-11-15 10:36 ` [PATCH 08/10] checkout: add --full to fully populate working directory Nguyễn Thái Ngọc Duy
@ 2010-11-15 21:23   ` Jonathan Nieder
  2010-11-16  2:50     ` Nguyen Thai Ngoc Duy
  0 siblings, 1 reply; 34+ messages in thread
From: Jonathan Nieder @ 2010-11-15 21:23 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy; +Cc: git

Nguyễn Thái Ngọc Duy wrote:

> --- a/Documentation/git-checkout.txt
> +++ b/Documentation/git-checkout.txt
> @@ -12,7 +12,7 @@ SYNOPSIS
>  'git checkout' [-q] [-f] [-m] [[-b|-B|--orphan] <new_branch>] [<start_point>]
>  'git checkout' [-f|--ours|--theirs|-m|--conflict=<style>] [<tree-ish>] [--] <paths>...
>  'git checkout' --patch [<tree-ish>] [--] [<paths>...]
> -'git checkout' -S
> +'git checkout' [-S|--full]

Brackets mean "optional".

> @@ -184,6 +184,10 @@ edits from your current working tree.
>  	$GIT_DIR/info/sparse-checkout. The working directory is also
>  	updated. An empty file will abort the process.
>  
> +--full:
> +	Reset $GIT_DIR/info/sparse-checkout to include everything and
> +	update working directory accordingly.

Maybe say something about "widen" and "whole project"?

> --- a/builtin/checkout.c
> +++ b/builtin/checkout.c
> @@ -952,8 +957,20 @@ no_reference:
>  	if (opts.writeout_stage)
>  		die("--ours/--theirs is incompatible with switching branches.");
>  
> -	if (update_sparse_checkout)
> -		edit_info_sparse_checkout();
> +	if (update_sparse_checkout) {
> +		if (full_checkout) {
> +			FILE *fp = fopen(git_path("info/sparse-checkout"), "w+");

What should --full --edit-worktree-shape-or-whatever-that-was-called do?

> --- a/t/t1011-read-tree-sparse-checkout.sh
> +++ b/t/t1011-read-tree-sparse-checkout.sh
> @@ -202,4 +202,11 @@ EOF
>  	test ! -f init.t
>  '
>  
> +test_expect_success 'git checkout --full' '
> +	git checkout --full &&
> +	grep "^\*$" .git/info/sparse-checkout &&
> +	test -f sub/added &&
> +	test -f init.t

Probably worth checking that that's the only pattern in the file so
short-circuiting happens.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 10/10] clean: support cleaning sparse checkout with -S
  2010-11-15 10:36 ` [PATCH 10/10] clean: support cleaning sparse checkout with -S Nguyễn Thái Ngọc Duy
@ 2010-11-15 21:30   ` Jonathan Nieder
  2010-11-16  2:53     ` Nguyen Thai Ngoc Duy
  0 siblings, 1 reply; 34+ messages in thread
From: Jonathan Nieder @ 2010-11-15 21:30 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy; +Cc: git

Nguyễn Thái Ngọc Duy wrote:

>                                                              Files
> that match the index exactly will be cleaned without "-f".

Hmm, that's new.  Seems fine, though; a person using -S would be
forwarned.

> --- a/Documentation/git-clean.txt
> +++ b/Documentation/git-clean.txt
> @@ -8,7 +8,7 @@ git-clean - Remove untracked files from the working tree
>  SYNOPSIS
>  --------
>  [verse]
> -'git clean' [-d] [-f] [-n] [-q] [-e <pattern>] [-x | -X] [--] <path>...
> +'git clean' [-d] [-f] [-n] [-q] [-e <pattern>] [-x | -X | -S] [--] <path>...

So -S and -x don't combine?

>  
>  DESCRIPTION
>  -----------
> @@ -61,6 +61,10 @@ OPTIONS
>  	Remove only files ignored by git.  This may be useful to rebuild
>  	everything from scratch, but keep manually created files.
>  
> +-S::
> +	Remove files tracked by git but are outside of sparse checkout.
> +	Files that match the index exactly will be removed even when
> +	'-f' is not given and clean.requireForce is no.

Does "no" mean "yes" here?

Probably worth mentioning that -S does not apply here.

Would it be worth adding a -s/--clean-full-worktree option to
complete the analogy to -X?

> --- a/builtin/clean.c
> +++ b/builtin/clean.c

Not reading the code or tests because I'm trusting that I wouldn't
need to use this command. :)

Thanks, sparse checkout is looking sane now.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 07/10] checkout: add -S to update sparse checkout
  2010-11-15 21:16   ` Jonathan Nieder
@ 2010-11-15 21:52     ` Miles Bader
  2010-11-17 15:02       ` Nguyen Thai Ngoc Duy
  2010-11-16  3:08     ` Nguyen Thai Ngoc Duy
  1 sibling, 1 reply; 34+ messages in thread
From: Miles Bader @ 2010-11-15 21:52 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: Nguyễn Thái Ngọc Duy, git

Jonathan Nieder <jrnieder@gmail.com> writes:
>> +-S::
>> +--update-sparse-checkout::
>> +	An editor is invoked to let you update your sparse checkout
>> +	patterns. The updated patterns will be saved in
>> +	$GIT_DIR/info/sparse-checkout. The working directory is also
>> +	updated. An empty file will abort the process.
>
> Wording nit: this doesn't make the worktree more up-to-date.  How
> about:
>
>  --edit-sparse-checkout
>  --define-work-area
>  --narrow-worktree
>
> Hmph.
>
> --edit-sparse-checkout seems best for consistency of the choices I can
> think of.

"--change-sparse-checkout"?

Onna-account of "edit" sounding like you're actually somehow editting
the checkout itself...  OTOH, since it invokes the editor... hmmm

BTW, wouldn't it be more convenient to allow specifying patterns
directly via the command line?  I'd think in many (maybe the majority
of) cases people will really only want one entry, and having to edit a
file to specify it seems vaguely annoying...

-Miles

-- 
.Numeric stability is probably not all that important when you're guessing.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 01/10] add: do not rely on dtype being NULL behavior
  2010-11-15 12:14   ` Jonathan Nieder
@ 2010-11-16  2:18     ` Nguyen Thai Ngoc Duy
  2010-11-16  2:42       ` Jonathan Nieder
  2010-11-16 18:58       ` Junio C Hamano
  0 siblings, 2 replies; 34+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2010-11-16  2:18 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: git, Jens Lehmann, Ævar Arnfjörð

On Mon, Nov 15, 2010 at 7:14 PM, Jonathan Nieder <jrnieder@gmail.com> wrote:
> Nguyễn Thái Ngọc Duy wrote:
>
>> Commit c84de70 (excluded_1(): support exclude files in index -
>> 2009-08-20) added support for excluded() where dtype can be NULL. It
>> was designed specifically for index matching because there was no
>> other way to extract dtype information from index. It did not support
>> wildcard matching (for example, "a*/" pattern would fail to match).
>>
>> The code was probably misread when commit 108da0d (git add: Add the
>> "--ignore-missing" option for the dry run - 2010-07-10) was made
>> because DT_UNKNOWN happens to be zero (NULL) too.
>>
>> Do not pass DT_UNKNOWN/NULL to excluded(), instead pass a pointer to a
>> variable that contains DT_UNKNOWN. The real dtype will be extracted
>> from worktree by excluded(), as expected.
>
> Could you rephrase this in a way that contrasts current and desired
> behavior?  Is it like this?
>
>        The "git add --ignore-missing --dry-run" codepath is
>        interpreting .gitignore incorrectly, unlike "git add".  For
>        example:
>
>                $ test -e foo || echo missing
>                missing
>                $ echo foo/ >>.gitignore
>                $ mkdir bar
>                $ git add --ignore-missing --dry-run foo; echo $?
>                The following paths are ignored by one of your .gitignore files:
>                foo/
>                Use -f if you really want to add them.
>                fatal: no files added
>                128
>                $ git add --ignore-missing --dry-run bar/foo; echo $?
>                0
>
>        In the original use case (preparing to add a submodule) the
>        behavior of the first command is correct, second incorrect.
>        If the entry to be added was a regular file, it would be the
>        other way around.
>
>        The cause: the --ignore-missing code passes DT_UNKNOWN as the
>        dtype_ptr argument to excluded() which happens to equal zero
>        (NULL) and accidentally triggers the "match pathspecs in index
>        only" codepath (see c84de70, excluded_1(): support exclude
>        files in index, 2009-08-20) that is unfortunately a bit
>        primitive.
>
>        Surely what was really wanted is to check paths against the
>        index and work tree, defaulting to "regular file".
>
> Wait --- that's not true.  In the "git submodule add" case, we really
> want to default to (or even better, force) "directory".

Hmm.. get_index_dtype() would return DT_DIR if the submodule exists in
index. If it does not it must be a directory in worktree, right?
Call flow: excluded_from_list() -> get_dtype() -> get_index_dtype()
-- 
Duy

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 02/10] unpack-trees: move all skip-worktree check back to unpack_trees()
  2010-11-15 12:34   ` Thiago Farina
@ 2010-11-16  2:19     ` Nguyen Thai Ngoc Duy
  0 siblings, 0 replies; 34+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2010-11-16  2:19 UTC (permalink / raw)
  To: Thiago Farina; +Cc: git

2010/11/15 Thiago Farina <tfransosi@gmail.com>:
> 2010/11/15 Nguyễn Thái Ngọc Duy <pclouds@gmail.com>:
>>  cache.h        |    1 +
>>  unpack-trees.c |   69 ++++++++++++++++++++++++++++++++++++++++++++++++++-----
>>  2 files changed, 63 insertions(+), 7 deletions(-)
>>
>> diff --git a/cache.h b/cache.h
>> index 33decd9..d87708a 100644
>> --- a/cache.h
>> +++ b/cache.h
>> @@ -182,6 +182,7 @@ struct cache_entry {
>>  #define CE_WT_REMOVE (0x400000) /* remove in work directory */
>>
>>  #define CE_UNPACKED  (0x1000000)
>> +#define CE_NEW_SKIP_WORKTREE (0x2000000)
>>
> Would be good to remove the () around the hex here and else in this file?

That was the tradition of CE_*. I don't know why () is there. Will do
a preparation patch to remove all () first.

>
>>  /*
>>  * Extended on-disk flags
>> diff --git a/unpack-trees.c b/unpack-trees.c
>> index 803445a..9acd9be 100644
>> --- a/unpack-trees.c
>> +++ b/unpack-trees.c
>> @@ -258,7 +258,7 @@ static int apply_sparse_checkout(struct cache_entry *ce, struct unpack_trees_opt
>>  {
>>        int was_skip_worktree = ce_skip_worktree(ce);
>>
>> -       if (!ce_stage(ce) && will_have_skip_worktree(ce, o))
>> +       if (ce->ce_flags & CE_NEW_SKIP_WORKTREE)
>>                ce->ce_flags |= CE_SKIP_WORKTREE;
>>        else
>>                ce->ce_flags &= ~CE_SKIP_WORKTREE;
>> @@ -834,6 +834,49 @@ static int unpack_callback(int n, unsigned long mask, unsigned long dirmask, str
>>        return mask;
>>  }
>>
>> +static void set_new_skip_worktree_1(struct unpack_trees_options *o)
>> +{
>> +       int i;
>> +
>> +       for (i = 0;i < o->src_index->cache_nr;i++) {
> Could you add spaces after ; for readability, please? There is another
> for below that needs this to.
>

Yes. I kept reminding myself of doing that, then forgot.
-- 
Duy

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 02/10] unpack-trees: move all skip-worktree check back to unpack_trees()
  2010-11-15 16:01   ` Jonathan Nieder
@ 2010-11-16  2:39     ` Nguyen Thai Ngoc Duy
  0 siblings, 0 replies; 34+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2010-11-16  2:39 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: git

2010/11/15 Jonathan Nieder <jrnieder@gmail.com>:
> Nguyễn Thái Ngọc Duy wrote:
>
>> Earlier, the will_have_skip_worktree() checks are done in various places:
>>
>> 1. in verify_* functions to prevent absent/uptodate checks outside
>>    worktree
>> 2. in merged_entry for new index entries
>> 3. in apply_sparse_checkout() for all entries
>>
>> In case all entries are added new ($GIT_DIR/index is missing) all the
>> above checks will be done for all entries, which in the worst case can
>> become cache_nr*el->nr*3 fnmatch() calls. Quite expensive.
>
> Does this mean something like:
>
>        Handling of sparse checkouts is slow.
>
>        [timings]
>
>        To fix this, we will do such-and-such.  As a first step,
>        we'll do such-and-such which does not change semantics
>        and at least does not make it any slower.
>
> ?
>
> In other words, any commit message should make clear
>
>  1. The purpose.
>  2. The baseline of (sane or insane) behavior that is affected.
>  3. The intended resulting functionality.
>
> How the patch works and why it succeeds are just optional extras
> (nice, certainly, but in 90% of cases it is obvious from the code once
> you know (1), (2), and (3) anyway).

Addressing the slowness is a nice side effect. My main concern is to
collect all skip-worktree checks in a central place so that I can do
tree-like traversing.

>> --- a/cache.h
>> +++ b/cache.h
>> @@ -182,6 +182,7 @@ struct cache_entry {
>>  #define CE_WT_REMOVE (0x400000) /* remove in work directory */
>>
>>  #define CE_UNPACKED  (0x1000000)
>> +#define CE_NEW_SKIP_WORKTREE (0x2000000)
>
> Semantics?

Yep, missed this in the commit message. While unpack_trees() is
running, this bit will be set in entries that are outside worktree
area, according to the (possibly updated) sparse-checkout file.
Compare this bit with CE_SKIP_WORKTREE, we would know this entry
should be removed/added due to sparse-checkout updates.

>> --- a/unpack-trees.c
>> +++ b/unpack-trees.c
>> @@ -258,7 +258,7 @@ static int apply_sparse_checkout(struct cache_entry *ce, struct unpack_trees_opt
>>  {
>>       int was_skip_worktree = ce_skip_worktree(ce);
>>
>> -     if (!ce_stage(ce) && will_have_skip_worktree(ce, o))
>> +     if (ce->ce_flags & CE_NEW_SKIP_WORKTREE)
>>               ce->ce_flags |= CE_SKIP_WORKTREE;
>
> So CE_NEW_SKIP_WORKTREE roughly means "stage-0 entry outside the sparse checkout area"?

Yes. Conflicted entries must always stay in worktree regardless
sparse-checkout patterns.

>
>>       else
>>               ce->ce_flags &= ~CE_SKIP_WORKTREE;
>
> In particular, I guess the NEW part refers to the sparse checkout
> area, not the entry, since existing index entries with SKIP_WORKTREE
> bits need to keep that bit.

The NEW part still refers to entry. It's basically
ce_flags[CE_SKIP_WORKTREE] = ce_flags[CE_NEW_SKIP_WORKTREE]. The
former bit is kept in file, the later is in memory only.

>> +
>> +static int verify_absent(struct cache_entry *, enum unpack_trees_error_types, struct unpack_trees_options *);
>> +static int set_new_skip_worktree_2(struct unpack_trees_options *o)
>> +{
>> +     int i;
>> +
>> +     /*
>> +      * CE_ADDED marks new index entries. These have not been processed
>> +      * by set_new_skip_worktree_1() so we do it here.
>> +      */
>
> Probably this comment belongs at the call site instead, to avoid some
> chasing.

There is a comment in the callsite, though it does not directly say
"set_new_skip_worktree_2". Will fix.

>> +     for (i = 0;i < o->result.cache_nr;i++) {
>> +             struct cache_entry *ce = o->result.cache[i];
>> +
>> +             if (!(ce->ce_flags & CE_ADDED))
>> +                     continue;
>> +
>> +             ce->ce_flags &= ~CE_ADDED;
>> +             if (!ce_stage(ce) && will_have_skip_worktree(ce, o))
>> +                     ce->ce_flags |= CE_SKIP_WORKTREE | CE_NEW_SKIP_WORKTREE;
>> +             else
>> +                     ce->ce_flags &= ~(CE_SKIP_WORKTREE | CE_NEW_SKIP_WORKTREE);
>
> CE_ADDED is private to the add_file_to_index() code path shared
> between add and rerere builtins.  rerere is libified and used e.g. by
> cherry-pick foo..bar.  Can it get us in trouble or do we have clear
> the flags before using them here?  I think it would be worth a note in
> api-in-core-index.txt to warn future refactorers.

If it's too much trouble, I can use a new bit. Though the number of
available bits is decreasing.

>> +
>> +             /* Left-over checks from merged_entry when old == NULL */
>
> Huh?  (That is completely cryptic outside the context of the patch.)

Will elaborate.

>> @@ -862,6 +905,9 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
>>                       o->el = &el;
>>       }
>>
>> +     if (!o->skip_sparse_checkout)
>> +             set_new_skip_worktree_1(o);
>> +
>
> Why is this not folded into the above if ()?
>
> This populates the NEW_SKIP_WORKTREE (= future SKIP_WORKTREE?) bit
> in the index that represents the preimage for a reset or merge.
>
> Perhaps call it:
>
>                set_new_skip_worktree(o->src_index, 0);
>
> and mark that function __attribute__((always_inline)) if the
> optimizer doesn't want to cooperate?

It's setting future skip-worktree on existing entries index, contrast
with the _2() function, which is setting future skip-worktree on _new_
entries in the index. How do you name those functions? Wait..

>>       memset(&o->result, 0, sizeof(o->result));
>>       o->result.initialized = 1;
>>       o->result.timestamp.sec = o->src_index->timestamp.sec;
>> @@ -922,6 +968,10 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
>>
>>       if (!o->skip_sparse_checkout) {
>>               int empty_worktree = 1;
>> +
>> +             if (set_new_skip_worktree_2(o))
>> +                     goto return_failed;
>> +
>
> Trivial part of the merge is over.  So now we can check the new
> index entries against the sparse worktree patterns.  They were added in
> that trivial part using add_entry() and will have the CE_ADDED flag.
>
> So this might be:
>
>                set_new_skip_worktree(&o->result, CE_ADDED);
>
> or
>
>                set_result_skip_worktree(o);
>

OK I like the former over the latter.

> The CE_ADDED flag was set in merged_entry, called by the merge_fn_t
> implementations.  If there were an api-traverse-trees.txt explaining
> the API, it would be worth a note there; for now it should suffice
> to add a note to future merge_fn_t implementors in the commit
> message ("each unpack-trees merge function arranges for
> CE_SKIP_WORKTREE and CE_SKIP_NEW_WORKTREE to be propagated and for the
> CE_ADDED flag to be set on entries of new paths").

OK

>> @@ -1017,7 +1067,7 @@ static int verify_uptodate_1(struct cache_entry *ce,
>>  static int verify_uptodate(struct cache_entry *ce,
>>                          struct unpack_trees_options *o)
>>  {
>> -     if (!o->skip_sparse_checkout && will_have_skip_worktree(ce, o))
>> +     if (!o->skip_sparse_checkout && (ce->ce_flags & CE_NEW_SKIP_WORKTREE))
> [...]
>> @@ -1204,7 +1254,7 @@ static int verify_absent(struct cache_entry *ce,
>>                        enum unpack_trees_error_types error_type,
>>                        struct unpack_trees_options *o)
>>  {
>> -     if (!o->skip_sparse_checkout && will_have_skip_worktree(ce, o))
>> +     if (!o->skip_sparse_checkout && (ce->ce_flags & CE_NEW_SKIP_WORKTREE))
>
> The point, I hope.  Currently we alternate between finding entries to
> remove and deciding whether if lie in the worktree.  After this patch,
> whether they lie in the worktree is precomputed.

Yep.

>> @@ -1226,10 +1276,15 @@ static int merged_entry(struct cache_entry *merge, struct cache_entry *old,
>>       int update = CE_UPDATE;
>>
>>       if (!old) {
>> +             /*
>> +              * Set CE_NEW_SKIP_WORKTREE on "merge" to make
>> +              * verify_absent() no-op. The true check will be done
>> +              * later on in unpack_trees().
>> +              */
>> +             merge->ce_flags |= CE_NEW_SKIP_WORKTREE;
>
> Mm?  Since verify_absent() is a no-op, why call it?  This looks like a
> placeholder for code that calls verify_absent later, when we know if
> it lies in the worktree.

It is no-op only when sparse checkout is enabled. In such cases we
don't know (yet) whether we will add those files in worktree. In
normal cases, everything must be added in worktree, so verify_absent()
is real.

>> @@ -1245,8 +1300,8 @@ static int merged_entry(struct cache_entry *merge, struct cache_entry *old,
>>               } else {
>>                       if (verify_uptodate(old, o))
>>                               return -1;
>> -                     if (ce_skip_worktree(old))
>> -                             update |= CE_SKIP_WORKTREE;
>> +                     /* Migrate old bits over */
>> +                     update |= old->ce_flags & (CE_SKIP_WORKTREE | CE_NEW_SKIP_WORKTREE);
>
> Thanks, this looks promising.

Thanks for looking at it. These stuff can be error-prone.
-- 
Duy

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 01/10] add: do not rely on dtype being NULL behavior
  2010-11-16  2:18     ` Nguyen Thai Ngoc Duy
@ 2010-11-16  2:42       ` Jonathan Nieder
  2010-11-16 18:58       ` Junio C Hamano
  1 sibling, 0 replies; 34+ messages in thread
From: Jonathan Nieder @ 2010-11-16  2:42 UTC (permalink / raw)
  To: Nguyen Thai Ngoc Duy; +Cc: git, Jens Lehmann, Ævar Arnfjörð

Nguyen Thai Ngoc Duy wrote:
> On Mon, Nov 15, 2010 at 7:14 PM, Jonathan Nieder <jrnieder@gmail.com> wrote:

>>        Surely what was really wanted is to check paths against the
>>        index and work tree, defaulting to "regular file".
>>
>> Wait --- that's not true.  In the "git submodule add" case, we really
>> want to default to (or even better, force) "directory".
>
> Hmm.. get_index_dtype() would return DT_DIR if the submodule exists in
> index. If it does not it must be a directory in worktree, right?
> Call flow: excluded_from_list() -> get_dtype() -> get_index_dtype()

based on

	git ls-files --error-unmatch "$path" >/dev/null 2>&1 &&
	die "'$path' already exists in the index"

	if test -z "$force" && ! git add --dry-run --ignore-missing "$path" > /dev/null 2>&1
	then
		echo >&2 "The following path is ignored by one of your .gitignore files:" &&
		echo >&2 $path &&
		echo >&2 "Use -f if you really want to add it."
		exit 1
	fi

	# perhaps the path exists and is already a git repo, else clone it
	if test -e "$path"

I'd say no, the usual case is that the potential submodule does not
exist in the index or worktree, which is why that call site uses
--ignore-missing.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 04/10] unpack-trees: fix sparse checkout's "unable to match directories" fault
  2010-11-15 19:10   ` Jonathan Nieder
@ 2010-11-16  2:43     ` Nguyen Thai Ngoc Duy
  0 siblings, 0 replies; 34+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2010-11-16  2:43 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: git

2010/11/16 Jonathan Nieder <jrnieder@gmail.com>:
> Is it possible to make the two code paths share code without
> sacrificing speed?

I think so. I'll merge the two functions in set_new_skip_worktree() as
you suggested in the previous mail.
-- 
Duy

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 08/10] checkout: add --full to fully populate working directory
  2010-11-15 21:23   ` Jonathan Nieder
@ 2010-11-16  2:50     ` Nguyen Thai Ngoc Duy
  0 siblings, 0 replies; 34+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2010-11-16  2:50 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: git

2010/11/16 Jonathan Nieder <jrnieder@gmail.com>:
>> --- a/builtin/checkout.c
>> +++ b/builtin/checkout.c
>> @@ -952,8 +957,20 @@ no_reference:
>>       if (opts.writeout_stage)
>>               die("--ours/--theirs is incompatible with switching branches.");
>>
>> -     if (update_sparse_checkout)
>> -             edit_info_sparse_checkout();
>> +     if (update_sparse_checkout) {
>> +             if (full_checkout) {
>> +                     FILE *fp = fopen(git_path("info/sparse-checkout"), "w+");
>
> What should --full --edit-worktree-shape-or-whatever-that-was-called do?

Should die().
-- 
Duy

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 10/10] clean: support cleaning sparse checkout with -S
  2010-11-15 21:30   ` Jonathan Nieder
@ 2010-11-16  2:53     ` Nguyen Thai Ngoc Duy
  2010-11-16  3:07       ` Jonathan Nieder
  0 siblings, 1 reply; 34+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2010-11-16  2:53 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: git

2010/11/16 Jonathan Nieder <jrnieder@gmail.com>:
> Nguyễn Thái Ngọc Duy wrote:
>
>>                                                              Files
>> that match the index exactly will be cleaned without "-f".
>
> Hmm, that's new.  Seems fine, though; a person using -S would be
> forwarned.

It's safe, which is why I let it do that without '-f'. One can always
checkout those entries again (provided that they know what entries to
checkout).

How do you want them to be warned?

>> --- a/Documentation/git-clean.txt
>> +++ b/Documentation/git-clean.txt
>> @@ -8,7 +8,7 @@ git-clean - Remove untracked files from the working tree
>>  SYNOPSIS
>>  --------
>>  [verse]
>> -'git clean' [-d] [-f] [-n] [-q] [-e <pattern>] [-x | -X] [--] <path>...
>> +'git clean' [-d] [-f] [-n] [-q] [-e <pattern>] [-x | -X | -S] [--] <path>...
>
> So -S and -x don't combine?

No. Technical difficulties. Wait, nothing prevents me from doing that.
Yes combination is possible.

>>
>>  DESCRIPTION
>>  -----------
>> @@ -61,6 +61,10 @@ OPTIONS
>>       Remove only files ignored by git.  This may be useful to rebuild
>>       everything from scratch, but keep manually created files.
>>
>> +-S::
>> +     Remove files tracked by git but are outside of sparse checkout.
>> +     Files that match the index exactly will be removed even when
>> +     '-f' is not given and clean.requireForce is no.
>
> Does "no" mean "yes" here?

Probably, need to look at the that config variable again :(
-- 
Duy

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 10/10] clean: support cleaning sparse checkout with -S
  2010-11-16  2:53     ` Nguyen Thai Ngoc Duy
@ 2010-11-16  3:07       ` Jonathan Nieder
  0 siblings, 0 replies; 34+ messages in thread
From: Jonathan Nieder @ 2010-11-16  3:07 UTC (permalink / raw)
  To: Nguyen Thai Ngoc Duy; +Cc: git

Nguyen Thai Ngoc Duy wrote:
> 2010/11/16 Jonathan Nieder <jrnieder@gmail.com>:

>> Hmm, that's new.  Seems fine, though; a person using -S would be
>> forwarned.
>
> It's safe, which is why I let it do that without '-f'. One can always
> checkout those entries again (provided that they know what entries to
> checkout).
>
> How do you want them to be warned?

Manual and -h output, ideally.

>		OPT_BOOLEAN('S', NULL, &sparse, "remove tracked files outside sparse checkout"),

Maybe something to the effect of

		OPT_SET_INT('S', NULL, &sparse,
			"prune worktree to sparse pattern (even without -f)", 1),

Probably someone will chime in with the perfect wording that fits
in the 80-column "git clean -h" output format. :)

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 07/10] checkout: add -S to update sparse checkout
  2010-11-15 21:16   ` Jonathan Nieder
  2010-11-15 21:52     ` Miles Bader
@ 2010-11-16  3:08     ` Nguyen Thai Ngoc Duy
  1 sibling, 0 replies; 34+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2010-11-16  3:08 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: git

2010/11/16 Jonathan Nieder <jrnieder@gmail.com>:
>>  DESCRIPTION
>>  -----------
>> @@ -176,6 +177,13 @@ the conflicted merge in the specified paths.
>>  This means that you can use `git checkout -p` to selectively discard
>>  edits from your current working tree.
>>
>> +-S::
>> +--update-sparse-checkout::
>> +     An editor is invoked to let you update your sparse checkout
>> +     patterns. The updated patterns will be saved in
>> +     $GIT_DIR/info/sparse-checkout. The working directory is also
>> +     updated. An empty file will abort the process.
>
> Wording nit: this doesn't make the worktree more up-to-date.  How
> about:
>
>  --edit-sparse-checkout
>  --define-work-area
>  --narrow-worktree
>
> Hmph.
>
> --edit-sparse-checkout seems best for consistency of the choices I can
> think of.

Yep. --edit-sparse-checkout.

>> @@ -316,6 +324,14 @@ $ git add frotz
>>  ------------
>>
>>
>> +ENVIRONMENT AND CONFIGURATION VARIABLES
>> +---------------------------------------
>> +The editor used to edit the commit log message will be chosen from the
>                                 ^
>  patterns defining what subset of the tracked tree to work on
>
>> +GIT_EDITOR environment variable, the core.editor configuration variable, the
>
> Might be simpler to say:

That was copied from git-commit.txt (I think). Maybe that file should
be updated too.

> [...]
>> --- a/builtin/checkout.c
>> +++ b/builtin/checkout.c
>> @@ -675,6 +675,31 @@ static const char *unique_tracking_name(const char *name)
>>       return NULL;
>>  }
>>
>> +static void edit_info_sparse_checkout()
>> +{
>> +     char *tmpfile = xstrdup(git_path("sparse-checkout"));
>
> git_pathdup()?

Yup.

>
>> +     for (i = 0; i < el.nr; i++)
>> +             free(el.excludes[i]);
>> +     free(el.excludes);
>
> Neat.

This code is actually duplicated in unpack_trees(). It's time to
create free_exclude() function.

>> +
>> +     if (rename(tmpfile, git_path("info/sparse-checkout")) < 0)
>> +             die_errno("Can't update %s", git_path("info/sparse-checkout"));
>
> Wouldn't git_path() clobber errno?  Probably worth keeping a temporary
> with the result from git_path before the rename.

OK

>> --- a/t/t1011-read-tree-sparse-checkout.sh
>> +++ b/t/t1011-read-tree-sparse-checkout.sh
>> @@ -184,4 +184,22 @@ test_expect_success 'read-tree --reset removes outside worktree' '
>>       test_cmp empty result
>>  '
>>
>> +test_expect_success 'git checkout -S fails to launch editor' '
>> +     GIT_EDITOR=/non-existent test_must_fail git checkout -S &&
>
> Exporting envvars via a function is non-portable.  The usual workaround:
>
>        (
>                GIT_EDITOR=... &&
>                export  GIT_EDITOR &&
>                test_must_fail ...
>        ) &&
>
> Maybe it's time to introduce
>
>        eval_must_fail GIT_EDITOR=/non-existent git checkout -S
>
> ?

I'll go with the former.
-- 
Duy

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 01/10] add: do not rely on dtype being NULL behavior
  2010-11-16  2:18     ` Nguyen Thai Ngoc Duy
  2010-11-16  2:42       ` Jonathan Nieder
@ 2010-11-16 18:58       ` Junio C Hamano
  2010-11-17  6:38         ` Nguyen Thai Ngoc Duy
  1 sibling, 1 reply; 34+ messages in thread
From: Junio C Hamano @ 2010-11-16 18:58 UTC (permalink / raw)
  To: Nguyen Thai Ngoc Duy
  Cc: Jonathan Nieder, git, Jens Lehmann, Ævar Arnfjörð

Nguyen Thai Ngoc Duy <pclouds@gmail.com> writes:

> Hmm.. get_index_dtype() would return DT_DIR if the submodule exists in
> index. If it does not it must be a directory in worktree, right?
> Call flow: excluded_from_list() -> get_dtype() -> get_index_dtype()

By the way, your c84de70 (excluded_1(): support exclude files in index,
2009-08-20) looks broken.  What happens when the exclude pattern wants a
directory but specifies it with a wildcard?

The safest and sanest first step is to just change DT_UNKNOWN (which
clearly is a wrong thing to pass) to NULL to keep the current behaviour,
but it probably makes more sense to do the second step at the same time to
pass a pointer to an int that holds DT_UNKNOWN to avoid triggering that
broken codepath.

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 01/10] add: do not rely on dtype being NULL behavior
  2010-11-16 18:58       ` Junio C Hamano
@ 2010-11-17  6:38         ` Nguyen Thai Ngoc Duy
  0 siblings, 0 replies; 34+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2010-11-17  6:38 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Jonathan Nieder, git, Jens Lehmann, Ævar Arnfjörð

On Wed, Nov 17, 2010 at 1:58 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Nguyen Thai Ngoc Duy <pclouds@gmail.com> writes:
>
>> Hmm.. get_index_dtype() would return DT_DIR if the submodule exists in
>> index. If it does not it must be a directory in worktree, right?
>> Call flow: excluded_from_list() -> get_dtype() -> get_index_dtype()
>
> By the way, your c84de70 (excluded_1(): support exclude files in index,
> 2009-08-20) looks broken.  What happens when the exclude pattern wants a
> directory but specifies it with a wildcard?

It fails. I realized that just this week unfortunately. That commit is
reverted in this series.

> The safest and sanest first step is to just change DT_UNKNOWN (which
> clearly is a wrong thing to pass) to NULL to keep the current behaviour,
> but it probably makes more sense to do the second step at the same time to
> pass a pointer to an int that holds DT_UNKNOWN to avoid triggering that
> broken codepath.

Yes.
-- 
Duy

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH 07/10] checkout: add -S to update sparse checkout
  2010-11-15 21:52     ` Miles Bader
@ 2010-11-17 15:02       ` Nguyen Thai Ngoc Duy
  0 siblings, 0 replies; 34+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2010-11-17 15:02 UTC (permalink / raw)
  To: Miles Bader; +Cc: Jonathan Nieder, git

2010/11/16 Miles Bader <miles@gnu.org>:
> Jonathan Nieder <jrnieder@gmail.com> writes:
>>> +-S::
>>> +--update-sparse-checkout::
>>> +    An editor is invoked to let you update your sparse checkout
>>> +    patterns. The updated patterns will be saved in
>>> +    $GIT_DIR/info/sparse-checkout. The working directory is also
>>> +    updated. An empty file will abort the process.
>>
>> Wording nit: this doesn't make the worktree more up-to-date.  How
>> about:
>>
>>  --edit-sparse-checkout
>>  --define-work-area
>>  --narrow-worktree
>>
>> Hmph.
>>
>> --edit-sparse-checkout seems best for consistency of the choices I can
>> think of.
>
> "--change-sparse-checkout"?
>
> Onna-account of "edit" sounding like you're actually somehow editting
> the checkout itself...  OTOH, since it invokes the editor... hmmm
>
> BTW, wouldn't it be more convenient to allow specifying patterns
> directly via the command line?  I'd think in many (maybe the majority
> of) cases people will really only want one entry, and having to edit a
> file to specify it seems vaguely annoying...

A command line option to append patterns only sounds good to me. But
it would clutter up sparse-checkout file over time and may decrease
performance. The same option can also be reused for git-clone. Do you
suggest any name? I'm bad at naming.

I don't think I can make an option to remove patterns though.
-- 
Duy

^ permalink raw reply	[flat|nested] 34+ messages in thread

end of thread, other threads:[~2010-11-17 15:03 UTC | newest]

Thread overview: 34+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-11-15 10:36 [PATCH 00/10] Sparse checkout fixes and improvements Nguyễn Thái Ngọc Duy
2010-11-15 10:36 ` [PATCH 01/10] add: do not rely on dtype being NULL behavior Nguyễn Thái Ngọc Duy
2010-11-15 12:14   ` Jonathan Nieder
2010-11-16  2:18     ` Nguyen Thai Ngoc Duy
2010-11-16  2:42       ` Jonathan Nieder
2010-11-16 18:58       ` Junio C Hamano
2010-11-17  6:38         ` Nguyen Thai Ngoc Duy
2010-11-15 10:36 ` [PATCH 02/10] unpack-trees: move all skip-worktree check back to unpack_trees() Nguyễn Thái Ngọc Duy
2010-11-15 12:34   ` Thiago Farina
2010-11-16  2:19     ` Nguyen Thai Ngoc Duy
2010-11-15 16:01   ` Jonathan Nieder
2010-11-16  2:39     ` Nguyen Thai Ngoc Duy
2010-11-15 10:36 ` [PATCH 03/10] unpack-trees: add function to update ce_flags based on sparse patterns Nguyễn Thái Ngọc Duy
2010-11-15 18:30   ` Jonathan Nieder
2010-11-15 20:19   ` Jonathan Nieder
2010-11-15 10:36 ` [PATCH 04/10] unpack-trees: fix sparse checkout's "unable to match directories" fault Nguyễn Thái Ngọc Duy
2010-11-15 19:10   ` Jonathan Nieder
2010-11-16  2:43     ` Nguyen Thai Ngoc Duy
2010-11-15 10:36 ` [PATCH 05/10] unpack-trees: optimize full checkout case Nguyễn Thái Ngọc Duy
2010-11-15 20:41   ` Jonathan Nieder
2010-11-15 10:36 ` [PATCH 06/10] templates: add info/sparse-checkout Nguyễn Thái Ngọc Duy
2010-11-15 10:36 ` [PATCH 07/10] checkout: add -S to update sparse checkout Nguyễn Thái Ngọc Duy
2010-11-15 21:16   ` Jonathan Nieder
2010-11-15 21:52     ` Miles Bader
2010-11-17 15:02       ` Nguyen Thai Ngoc Duy
2010-11-16  3:08     ` Nguyen Thai Ngoc Duy
2010-11-15 10:36 ` [PATCH 08/10] checkout: add --full to fully populate working directory Nguyễn Thái Ngọc Duy
2010-11-15 21:23   ` Jonathan Nieder
2010-11-16  2:50     ` Nguyen Thai Ngoc Duy
2010-11-15 10:36 ` [PATCH 09/10] git-checkout.txt: mention of sparse checkout Nguyễn Thái Ngọc Duy
2010-11-15 10:36 ` [PATCH 10/10] clean: support cleaning sparse checkout with -S Nguyễn Thái Ngọc Duy
2010-11-15 21:30   ` Jonathan Nieder
2010-11-16  2:53     ` Nguyen Thai Ngoc Duy
2010-11-16  3:07       ` Jonathan Nieder

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.