All of lore.kernel.org
 help / color / mirror / Atom feed
* What's cooking in git.git (Nov 2008, #06; Wed, 26)
@ 2008-11-27  0:28 Junio C Hamano
  2008-11-27 22:49 ` Johannes Schindelin
  0 siblings, 1 reply; 37+ messages in thread
From: Junio C Hamano @ 2008-11-27  0:28 UTC (permalink / raw)
  To: git

Here are the topics that have been cooking.  Commits prefixed
with '-' are only in 'pu' while commits prefixed with '+' are
in 'next'.

The topics list the commits in reverse chronological order.  The topics
meant to be merged to the maintenance series have "maint-" in their names.

----------------------------------------------------------------
[New Topics]

* cr/remote-update-v (Tue Nov 18 19:04:02 2008 +0800) 1 commit
 + git-remote: add verbose mode to git remote update

Should be in 1.6.1-rc1.

* rs/strbuf-expand (Sun Nov 23 00:16:59 2008 +0100) 6 commits
 + remove the unused files interpolate.c and interpolate.h
 + daemon: deglobalize variable 'directory'
 + daemon: inline fill_in_extra_table_entries()
 + daemon: use strbuf_expand() instead of interpolate()
 + merge-recursive: use strbuf_expand() instead of interpolate()
 + add strbuf_expand_dict_cb(), a helper for simple cases

Should be in 1.6.1-rc1.

* mv/fast-export (Sun Nov 23 12:55:54 2008 +0100) 2 commits
 + fast-export: use an unsorted string list for extra_refs
 + Add new testcase to show fast-export does not always exports all
   tags

Should be in 1.6.1-rc1 and backmerged to 'maint'.

* st/levenshtein (Thu Nov 20 14:27:27 2008 +0100) 2 commits
 + Document levenshtein.c
 + Fix deletion of last character in levenshtein distance

Should be in 1.6.1-rc1.

* js/mingw-rename-fix (Wed Nov 19 17:25:27 2008 +0100) 1 commit
 + compat/mingw.c: Teach mingw_rename() to replace read-only files

Should be in 1.6.1-rc1 and backmerged to 'maint'.

* mv/clone-strbuf (Fri Nov 21 01:45:01 2008 +0100) 3 commits
 + builtin_clone: use strbuf in cmd_clone()
 + builtin-clone: use strbuf in clone_local() and
   copy_or_link_directory()
 + builtin-clone: use strbuf in guess_dir_name()

Should be in 1.6.1-rc1.

* pw/maint-p4 (Wed Nov 26 13:52:15 2008 -0500) 1 commit
 - git-p4: fix keyword-expansion regex

Waiting for Ack from git-p4 folks.

* cc/bisect-skip (Sun Nov 23 22:02:49 2008 +0100) 1 commit
 - bisect: teach "skip" to accept special arguments like "A..B"

Should be in 1.6.1-rc1.

* cc/bisect-replace (Mon Nov 24 22:20:30 2008 +0100) 9 commits
 - bisect: add "--no-replace" option to bisect without using replace
   refs
 - rev-list: make it possible to disable replacing using "--no-
   bisect-replace"
 - bisect: use "--bisect-replace" options when checking merge bases
 - merge-base: add "--bisect-replace" option to use fixed up revs
 - commit: add "bisect_replace_all" prototype to "commit.h"
 - rev-list: add "--bisect-replace" to list revisions with fixed up
   history
 - Documentation: add "git bisect replace" documentation
 - bisect: add test cases for "git bisect replace"
 - bisect: add "git bisect replace" subcommand

I really hate the idea of introducing a potentially much more useful
replacement of the existing graft mechanism and tie it very tightly to
bisect, making it unusable from outside.

 (1) I do not think "bisect replace" workflow is a practical and usable
     one;

 (2) The underlying mechanism to express "this object replaces that other
     object" is much easier to work with than what the graft does which is
     "the parents of this commit are these", and idea to use the normal
     ref to point at them means this can potentially be used for
     transferring the graft information across repositories, which the
     current graft mechanism cannot do.

 (3) Because I like the aspect (2) of this series so much, it deeply
     disappoints and troubles me that this is implemented minimally near
     the surface, and that it is controlled by the "bisect" Porcelain
     alone, by explicitly passing command line arguments.

I think a mechanism like this should be added to replace grafts, but it
should always be enabled for normal revision traversal operation, while
always disabled for object enumeration and transfer operation (iow, fsck,
fetch and push should use the real ancestry information recorded in the
underlying objects, while rev-list, log, etc. should always use the
replaced objects).  I have a suspicion that even cat-file could honor it.

----------------------------------------------------------------
[Graduated to "master"]

* bc/maint-keep-pack (Thu Nov 13 14:11:46 2008 -0600) 1 commit
 + repack: only unpack-unreachable if we are deleting redundant packs

This makes "repack -A -d" without -d do the same thing as "repack -a -d",
which makes sense.  This does not have to go to 'maint', though.

* jk/commit-v-strip (Wed Nov 12 03:23:37 2008 -0500) 4 commits
 + status: show "-v" diff even for initial commit
 + Merge branch 'jk/maint-commit-v-strip' into jk/commit-v-strip
 + wt-status: refactor initial commit printing
 + define empty tree sha1 as a macro

----------------------------------------------------------------
[Will merge to "master" soon]

* lt/preload-lstat (Mon Nov 17 09:01:20 2008 -0800) 2 commits
 + Fix index preloading for racy dirty case
 + Add cache preload facility

* ta/quiet-pull (Mon Nov 17 23:09:30 2008 +0100) 2 commits
 + Retain multiple -q/-v occurrences in git pull
 + Teach/Fix pull/fetch -q/-v options

* nd/narrow (Tue Nov 18 06:33:16 2008 -0500) 10 commits
 + t2104: touch portability fix
 + grep: skip files outside sparse checkout area
 + checkout_entry(): CE_NO_CHECKOUT on checked out entries.
 + Prevent diff machinery from examining worktree outside sparse
   checkout
 + ls-files: Add tests for --sparse and friends
 + update-index: add --checkout/--no-checkout to update
   CE_NO_CHECKOUT bit
 + update-index: refactor mark_valid() in preparation for new options
 + ls-files: add options to support sparse checkout
 + Introduce CE_NO_CHECKOUT bit
 + Extend index to save more flags

* ph/send-email (Tue Nov 11 00:54:02 2008 +0100) 4 commits
 + git send-email: ask less questions when --compose is used.
 + git send-email: add --annotate option
 + git send-email: interpret unknown files as revision lists
 + git send-email: make the message file name more specific.

----------------------------------------------------------------
[Actively Cooking]

* cb/mergetool (Thu Nov 13 12:41:15 2008 +0000) 3 commits
 - [DONTMERGE] Add -k/--keep-going option to mergetool
 - Add -y/--no-prompt option to mergetool
 - Fix some tab/space inconsistencies in git-mergetool.sh

Jeff had good comments on the last one; the discussion needs concluded,
and also waiting for comments from the original author (Ted).

* ds/uintmax-config (Mon Nov 3 09:14:28 2008 -0900) 1 commit
 - autoconf: Enable threaded delta search when pthreads are supported

* jc/blame (Wed Jun 4 22:58:40 2008 -0700) 2 commits
 + blame: show "previous" information in --porcelain/--incremental
   format
 + git-blame: refactor code to emit "porcelain format" output

----------------------------------------------------------------
[On Hold]

* jc/send-pack-tell-me-more (Thu Mar 20 00:44:11 2008 -0700) 1 commit
 - "git push": tellme-more protocol extension

This seems to have a deadlock during communication between the peers.
Someone needs to pick up this topic and resolve the deadlock before it can
continue.

* jk/renamelimit (Sat May 3 13:58:42 2008 -0700) 1 commit
 - diff: enable "too large a rename" warning when -M/-C is explicitly
   asked for

This would be the right thing to do for command line use,
but gitk will be hit due to tcl/tk's limitation, so I am holding
this back for now.

* jc/stripspace (Sun Mar 9 00:30:35 2008 -0800) 6 commits
 - git-am --forge: add Signed-off-by: line for the author
 - git-am: clean-up Signed-off-by: lines
 - stripspace: add --log-clean option to clean up signed-off-by:
   lines
 - stripspace: use parse_options()
 - Add "git am -s" test
 - git-am: refactor code to add signed-off-by line for the committer

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: What's cooking in git.git (Nov 2008, #06; Wed, 26)
  2008-11-27  0:28 What's cooking in git.git (Nov 2008, #06; Wed, 26) Junio C Hamano
@ 2008-11-27 22:49 ` Johannes Schindelin
  2008-11-28  2:06   ` Junio C Hamano
  0 siblings, 1 reply; 37+ messages in thread
From: Johannes Schindelin @ 2008-11-27 22:49 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Hi,

On Wed, 26 Nov 2008, Junio C Hamano wrote:

> ----------------------------------------------------------------
> [Will merge to "master" soon]
>
> [...]
> 
> * nd/narrow (Tue Nov 18 06:33:16 2008 -0500) 10 commits
>  + t2104: touch portability fix
>  + grep: skip files outside sparse checkout area
>  + checkout_entry(): CE_NO_CHECKOUT on checked out entries.
>  + Prevent diff machinery from examining worktree outside sparse
>    checkout
>  + ls-files: Add tests for --sparse and friends
>  + update-index: add --checkout/--no-checkout to update
>    CE_NO_CHECKOUT bit
>  + update-index: refactor mark_valid() in preparation for new options
>  + ls-files: add options to support sparse checkout
>  + Introduce CE_NO_CHECKOUT bit
>  + Extend index to save more flags

I have a strong suspicion that the narrow stuff will make the worktree 
mess pale in comparison.

Note that I do not have time to review this myself (which is not helped at 
all by it being no longer a trivial single patch, but a full 10 patches!), 
but I really have a bad feeling about this.  IMO it is substantially 
under-reviewed.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: What's cooking in git.git (Nov 2008, #06; Wed, 26)
  2008-11-27 22:49 ` Johannes Schindelin
@ 2008-11-28  2:06   ` Junio C Hamano
  2008-11-28 11:47     ` Johannes Schindelin
  0 siblings, 1 reply; 37+ messages in thread
From: Junio C Hamano @ 2008-11-28  2:06 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> I have a strong suspicion that the narrow stuff will make the worktree 
> mess pale in comparison.
>
> Note that I do not have time to review this myself (which is not helped at 
> all by it being no longer a trivial single patch, but a full 10 patches!), 
> but I really have a bad feeling about this.  IMO it is substantially 
> under-reviewed.

Well, "a bad feeling" is not a convincing enough argument either, is it?
What kind of bad interaction are you fearing?

I thought the changes this first half of the topic implements were safe
for people who do not use this feature at all (which is the most important
thing I care about very first), and also I thought they made sense.

A bigger concern I actually have about this series is that the original
author seems to have gone quiet.  I would have expected a discussion on
adding Porcelain level support after the series hit 'next' to begin, so
that the underlying feature can be made more accessible by the end users.
Also a new section to tutorial or a new addition to how-to series of
documents that describe how to work inside narrowly checked out work tree,
what the pitfalls are, etc., together with follow-up improvements of what
is already in 'next', and end user questions and reports on issues should
have come, if this is ever being used by anybody by now, but none of that
has happened.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: What's cooking in git.git (Nov 2008, #06; Wed, 26)
  2008-11-28  2:06   ` Junio C Hamano
@ 2008-11-28 11:47     ` Johannes Schindelin
  2008-11-28 19:20       ` Shawn O. Pearce
  0 siblings, 1 reply; 37+ messages in thread
From: Johannes Schindelin @ 2008-11-28 11:47 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Hi,

On Thu, 27 Nov 2008, Junio C Hamano wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> 
> > I have a strong suspicion that the narrow stuff will make the worktree 
> > mess pale in comparison.
> >
> > Note that I do not have time to review this myself (which is not 
> > helped at all by it being no longer a trivial single patch, but a full 
> > 10 patches!), but I really have a bad feeling about this.  IMO it is 
> > substantially under-reviewed.
> 
> Well, "a bad feeling" is not a convincing enough argument either, is it? 
> What kind of bad interaction are you fearing?

I just remember the worktree stuff well enough.  I had a bad gut feeling 
when it was proposed, and I had a bad impression of the (IMO way too 
intrusive) patch series implementing it.  It was pretty buggy and affected 
Git in serious ways (in order to accomodate worktree, we broke operations 
in bare repositories at least once, for example).

(So no, "bad feeling" is not convincing, but it is basically a primitive 
pattern matching in experiences that have not been fully analyzed, but 
that turned out to be bad enough.)

I tried to fix it, but did not a very good job at it.  In the meantime, I 
think I know why: there is no elegant way to implement this that is 
performant at the same time.  (Just think of having git_dir be relative: 
this is a necessity for the performance, but ugly to implement in the 
presence of worktree where it may _need_ to be absolute).

To me, the narrow patch series has all the looks of becoming the same type 
of nightmare:

- it is intrusive,

- it consists of a substantial number of patches (making bugs the opposite 
  of shallow),

- it is heavily under-reviewed,

- it _needs_ a lot of changes to be accomodated, affecting common code 
  paths, having all the potential to break existing workflows, and

- there is as little interest in the feature from core Git developers as 
  with worktree, literally guaranteeing that it will not, or only very 
  slowly, and probably badly, get fixed if it breaks.

And the worst part: I think that as with worktree, there has not been 
enough of kicking forth and back ideas how to design the beast, so I fully 
expect a subtle breakage that would require a redesign (which will be 
painful, with existing users of the feature).

Maybe I am crying "wolf", but I _do_ want to caution against risking too 
much, too fast, with that feature.

In other words, unless there is more interest in that feature, enough to 
generate a well-understood design before a good implementation, I'd rather 
see this patch series dropped.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: What's cooking in git.git (Nov 2008, #06; Wed, 26)
  2008-11-28 11:47     ` Johannes Schindelin
@ 2008-11-28 19:20       ` Shawn O. Pearce
  2008-11-29  0:13         ` Junio C Hamano
  0 siblings, 1 reply; 37+ messages in thread
From: Shawn O. Pearce @ 2008-11-28 19:20 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Junio C Hamano, git

Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
> On Thu, 27 Nov 2008, Junio C Hamano wrote:
> > Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> > 
> > > I have a strong suspicion that the narrow stuff will make the worktree 
> > > mess pale in comparison.
> > >
> > > Note that I do not have time to review this myself (which is not 
> > > helped at all by it being no longer a trivial single patch, but a full 
> > > 10 patches!), but I really have a bad feeling about this.  IMO it is 
> > > substantially under-reviewed.
> > 
> > Well, "a bad feeling" is not a convincing enough argument either, is it? 
> > What kind of bad interaction are you fearing?
...
> And the worst part: I think that as with worktree, there has not been 
> enough of kicking forth and back ideas how to design the beast, so I fully 
> expect a subtle breakage that would require a redesign (which will be 
> painful, with existing users of the feature).
> 
> Maybe I am crying "wolf", but I _do_ want to caution against risking too 
> much, too fast, with that feature.
> 
> In other words, unless there is more interest in that feature, enough to 
> generate a well-understood design before a good implementation, I'd rather 
> see this patch series dropped.

Ack.  I agree with every remark made by Dscho, and also want to cry "wolf".

I haven't had time to read the patch series.  Its big and intrusive
and I just don't need the feature.

But I feel like if it were in fact merged I'll fall over some bug
in it sometime soon and be forced to stop and debug it.  Heck at
the least I'll have to go back to JGit's index code and implement
the new file format.  That shouldn't cause git.git's development to
stop, but I am whining (a little) about the file format change.  ;-)

-- 
Shawn.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: What's cooking in git.git (Nov 2008, #06; Wed, 26)
  2008-11-28 19:20       ` Shawn O. Pearce
@ 2008-11-29  0:13         ` Junio C Hamano
  2008-11-29  0:15           ` [PATCH] git add --intent-to-add: fix removal of cached emptiness Junio C Hamano
  2008-11-29  1:25           ` What's cooking in git.git (Nov 2008, #06; Wed, 26) Daniel Barkalow
  0 siblings, 2 replies; 37+ messages in thread
From: Junio C Hamano @ 2008-11-29  0:13 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: Johannes Schindelin, git

"Shawn O. Pearce" <spearce@spearce.org> writes:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
> ...
>> In other words, unless there is more interest in that feature, enough to 
>> generate a well-understood design before a good implementation, I'd rather 
>> see this patch series dropped.
>
> Ack.  I agree with every remark made by Dscho, and also want to cry "wolf".
>
> I haven't had time to read the patch series.  Its big and intrusive
> and I just don't need the feature.

Well, "me neither".  Although I personally think resisting changes until
it becomes absolutely necessary is a good discipline, we also need to
recognise that there is a chicken-and-egg problem.  When you have a
potentially useful feature, unless people actually try using it in the
field, you won't discover the drawbacks in either the design nor the
implementation, let alone any improvements.

> But I feel like if it were in fact merged I'll fall over some bug
> in it sometime soon and be forced to stop and debug it.

Exactly.  That is how you make progress.

Having said that, I am willing to carry it over in 'next' outside 'master'
for the 1.6.1 cycle, as three people who are most likely to be able to fix
any potential issues are not using that feature.

> Heck at
> the least I'll have to go back to JGit's index code and implement
> the new file format.

I am sorry to dissapoint you but I am planning to use the first one in the
series, which is the one that adds extended index flag bits, for the fix
to an unrelated feature.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH] git add --intent-to-add: fix removal of cached emptiness
  2008-11-29  0:13         ` Junio C Hamano
@ 2008-11-29  0:15           ` Junio C Hamano
  2008-11-29  3:51             ` [PATCH 1/3] builtin-rm.c: explain and clarify the "local change" logic Junio C Hamano
  2008-11-29  1:25           ` What's cooking in git.git (Nov 2008, #06; Wed, 26) Daniel Barkalow
  1 sibling, 1 reply; 37+ messages in thread
From: Junio C Hamano @ 2008-11-29  0:15 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: Johannes Schindelin, git

This uses the extended index flag mechanism introduced earlier to mark
the entries added to the index via "git add -N" with CE_INTENT_TO_ADD.

The logic to detect an "intent to add" entry for the purpose of allowing
"git rm --cached $path" is tightened to check not just for a staged empty
blob, but with the CE_INTENT_TO_ADD bit.  This protects an empty blob that
was explicitly added and then modified in the work tree from being dropped
with this sequence:

	$ >empty
	$ git add empty
	$ echo "non empty" >empty
	$ git rm --cached empty

An index an "intent to add" entry is blocked.  This implies that you
cannot "git commit" from such a state; however "git commit -a" still
works.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---

 * This applies on top of the result of merging commit 06aaaa0 (Extend
   index to save more flags, 2008-10-01) from nd/narrow topic into 'master'.

 builtin-rm.c               |   11 ++++++-----
 builtin-write-tree.c       |    2 +-
 cache-tree.c               |   10 +++++++---
 cache.h                    |    3 ++-
 read-cache.c               |    2 ++
 t/t3600-rm.sh              |    4 ++--
 t/t3701-add-interactive.sh |   18 ++++++++++++++++++
 7 files changed, 38 insertions(+), 12 deletions(-)

diff --git a/builtin-rm.c b/builtin-rm.c
index b7126e3..8debcec 100644
--- a/builtin-rm.c
+++ b/builtin-rm.c
@@ -73,14 +73,15 @@ static int check_local_mod(unsigned char *head, int index_only)
 		}
 		if (ce_match_stat(ce, &st, 0))
 			local_changes = 1;
-		if (no_head
-		     || get_tree_entry(head, name, sha1, &mode)
-		     || ce->ce_mode != create_ce_mode(mode)
-		     || hashcmp(ce->sha1, sha1))
+		if (no_head ||
+		    ((get_tree_entry(head, name, sha1, &mode)
+		      || ce->ce_mode != create_ce_mode(mode)
+		      || hashcmp(ce->sha1, sha1)) &&
+		     !(ce->ce_flags & CE_INTENT_TO_ADD)))
 			staged_changes = 1;
 
 		if (local_changes && staged_changes &&
-		    !(index_only && is_empty_blob_sha1(ce->sha1)))
+		    !(index_only && (ce->ce_flags & CE_INTENT_TO_ADD)))
 			errs = error("'%s' has staged content different "
 				     "from both the file and the HEAD\n"
 				     "(use -f to force removal)", name);
diff --git a/builtin-write-tree.c b/builtin-write-tree.c
index 52a3c01..9d64050 100644
--- a/builtin-write-tree.c
+++ b/builtin-write-tree.c
@@ -42,7 +42,7 @@ int cmd_write_tree(int argc, const char **argv, const char *unused_prefix)
 		die("%s: error reading the index", me);
 		break;
 	case WRITE_TREE_UNMERGED_INDEX:
-		die("%s: error building trees; the index is unmerged?", me);
+		die("%s: error building trees", me);
 		break;
 	case WRITE_TREE_PREFIX_ERROR:
 		die("%s: prefix %s not found", me, prefix);
diff --git a/cache-tree.c b/cache-tree.c
index 5f8ee87..3d8f218 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -155,13 +155,17 @@ static int verify_cache(struct cache_entry **cache,
 	funny = 0;
 	for (i = 0; i < entries; i++) {
 		struct cache_entry *ce = cache[i];
-		if (ce_stage(ce)) {
+		if (ce_stage(ce) || (ce->ce_flags & CE_INTENT_TO_ADD)) {
 			if (10 < ++funny) {
 				fprintf(stderr, "...\n");
 				break;
 			}
-			fprintf(stderr, "%s: unmerged (%s)\n",
-				ce->name, sha1_to_hex(ce->sha1));
+			if (ce_stage(ce))
+				fprintf(stderr, "%s: unmerged (%s)\n",
+					ce->name, sha1_to_hex(ce->sha1));
+			else
+				fprintf(stderr, "%s: not added yet\n",
+					ce->name);
 		}
 	}
 	if (funny)
diff --git a/cache.h b/cache.h
index ef2e7f9..f15b3fc 100644
--- a/cache.h
+++ b/cache.h
@@ -176,10 +176,11 @@ struct cache_entry {
 /*
  * Extended on-disk flags
  */
+#define CE_INTENT_TO_ADD 0x20000000
 /* CE_EXTENDED2 is for future extension */
 #define CE_EXTENDED2 0x80000000
 
-#define CE_EXTENDED_FLAGS (0)
+#define CE_EXTENDED_FLAGS (CE_INTENT_TO_ADD)
 
 /*
  * Safeguard to avoid saving wrong flags:
diff --git a/read-cache.c b/read-cache.c
index abc627b..fa30a0f 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -546,6 +546,8 @@ int add_to_index(struct index_state *istate, const char *path, struct stat *st,
 	ce->ce_flags = namelen;
 	if (!intent_only)
 		fill_stat_cache_info(ce, st);
+	else
+		ce->ce_flags |= CE_INTENT_TO_ADD;
 
 	if (trust_executable_bit && has_symlinks)
 		ce->ce_mode = create_ce_mode(st_mode);
diff --git a/t/t3600-rm.sh b/t/t3600-rm.sh
index 5b4d6f7..b7d46e5 100755
--- a/t/t3600-rm.sh
+++ b/t/t3600-rm.sh
@@ -187,8 +187,8 @@ test_expect_success 'but with -f it should work.' '
 	test_must_fail git ls-files --error-unmatch baz
 '
 
-test_expect_failure 'refuse to remove cached empty file with modifications' '
-	touch empty &&
+test_expect_success 'refuse to remove cached empty file with modifications' '
+	>empty &&
 	git add empty &&
 	echo content >empty &&
 	test_must_fail git rm --cached empty
diff --git a/t/t3701-add-interactive.sh b/t/t3701-add-interactive.sh
index e95663d..473ef85 100755
--- a/t/t3701-add-interactive.sh
+++ b/t/t3701-add-interactive.sh
@@ -133,6 +133,24 @@ test_expect_success 'real edit works' '
 	test_cmp expected output
 '
 
+test_expect_success 'cannot commit with i-t-a entry' '
+	git reset --hard &&
+	echo xyzzy >rezrov &&
+	echo frotz >nitfol &&
+	git add rezrov &&
+	git add -N nitfol &&
+	test_must_fail git commit
+'
+
+test_expect_success 'can commit with an unrelated i-t-a entry in index' '
+	git reset --hard &&
+	echo xyzzy >rezrov &&
+	echo frotz >nitfol &&
+	git add rezrov &&
+	git add -N nitfol &&
+	git commit -m partial rezrov
+'
+
 if test "$(git config --bool core.filemode)" = false
 then
     say 'skipping filemode tests (filesystem does not properly support modes)'
-- 
1.6.0.4.850.g6bd829

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: What's cooking in git.git (Nov 2008, #06; Wed, 26)
  2008-11-29  0:13         ` Junio C Hamano
  2008-11-29  0:15           ` [PATCH] git add --intent-to-add: fix removal of cached emptiness Junio C Hamano
@ 2008-11-29  1:25           ` Daniel Barkalow
  2008-11-29 13:02             ` Nguyen Thai Ngoc Duy
  1 sibling, 1 reply; 37+ messages in thread
From: Daniel Barkalow @ 2008-11-29  1:25 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Shawn O. Pearce, Johannes Schindelin, git

On Fri, 28 Nov 2008, Junio C Hamano wrote:

> "Shawn O. Pearce" <spearce@spearce.org> writes:
> 
> > Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
> > ...
> >> In other words, unless there is more interest in that feature, enough to 
> >> generate a well-understood design before a good implementation, I'd rather 
> >> see this patch series dropped.
> >
> > Ack.  I agree with every remark made by Dscho, and also want to cry "wolf".
> >
> > I haven't had time to read the patch series.  Its big and intrusive
> > and I just don't need the feature.
> 
> Well, "me neither".  Although I personally think resisting changes until
> it becomes absolutely necessary is a good discipline, we also need to
> recognise that there is a chicken-and-egg problem.  When you have a
> potentially useful feature, unless people actually try using it in the
> field, you won't discover the drawbacks in either the design nor the
> implementation, let alone any improvements.

I just looked over most of it (skipping the generic index extension 
portion). It looks to me like it's introducing an extra concept to avoid 
actually fixing maybe-bugs in the "assume unchanged" implementation 
when used with files that have been changed intentionally (with the user 
intending git to overlook this change). Sparse checkout is essentially a 
special case of this, where the user has changed the working directory 
radically (not populating it at all) and wants git to carry on as if this 
was not the case (with a certain amount of porcelain code to cause this to 
happen automatically).

If there's any need for this to be distinguished from "assume unchanged", 
I think it should be used with, not instead of, the CE_VALID bit; and it 
could probably use some bit in the stat info section, since we don't need 
stat info if we know by assumption that the entry is valid.

	-Daniel
*This .sig left intentionally blank*

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH 1/3] builtin-rm.c: explain and clarify the "local change" logic
  2008-11-29  0:15           ` [PATCH] git add --intent-to-add: fix removal of cached emptiness Junio C Hamano
@ 2008-11-29  3:51             ` Junio C Hamano
  2008-11-29  3:55               ` [PATCH 2/3] git add --intent-to-add: fix removal of cached emptiness Junio C Hamano
  2008-11-29  3:56               ` [PATCH 3/3] git add --intent-to-add: do not let an empty blob committed by accident Junio C Hamano
  0 siblings, 2 replies; 37+ messages in thread
From: Junio C Hamano @ 2008-11-29  3:51 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: Johannes Schindelin, git

Explain the logic to check local modification a bit more in the comment,
especially because the existing comment that talks about "git rm --cached"
was placed in a part that was not about "--cached" at all.

Also clarify "if .. else if .." structure.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---

 * This is the first patch of a re-rolled sesries of the previous one, and
   again applies to the master plus the first commit from nd/narrow.

 builtin-rm.c |   53 ++++++++++++++++++++++++++++++++++++++++++-----------
 1 files changed, 42 insertions(+), 11 deletions(-)

diff --git a/builtin-rm.c b/builtin-rm.c
index b7126e3..3d03da0 100644
--- a/builtin-rm.c
+++ b/builtin-rm.c
@@ -31,7 +31,8 @@ static void add_list(const char *name)
 
 static int check_local_mod(unsigned char *head, int index_only)
 {
-	/* items in list are already sorted in the cache order,
+	/*
+	 * Items in list are already sorted in the cache order,
 	 * so we could do this a lot more efficiently by using
 	 * tree_desc based traversal if we wanted to, but I am
 	 * lazy, and who cares if removal of files is a tad
@@ -71,25 +72,55 @@ static int check_local_mod(unsigned char *head, int index_only)
 			 */
 			continue;
 		}
+
+		/*
+		 * "rm" of a path that has changes need to be treated
+		 * carefully not to allow losing local changes
+		 * accidentally.  A local change could be (1) file in
+		 * work tree is different since the index; and/or (2)
+		 * the user staged a content that is different from
+		 * the current commit in the index.
+		 *
+		 * In such a case, you would need to --force the
+		 * removal.  However, "rm --cached" (remove only from
+		 * the index) is safe if the index matches the file in
+		 * the work tree or the HEAD commit, as it means that
+		 * the content being removed is available elsewhere.
+		 */
+
+		/*
+		 * Is the index different from the file in the work tree?
+		 */
 		if (ce_match_stat(ce, &st, 0))
 			local_changes = 1;
+
+		/*
+		 * Is the index different from the HEAD commit?  By
+		 * definition, before the very initial commit,
+		 * anything staged in the index is treated by the same
+		 * way as changed from the HEAD.
+		 */
 		if (no_head
 		     || get_tree_entry(head, name, sha1, &mode)
 		     || ce->ce_mode != create_ce_mode(mode)
 		     || hashcmp(ce->sha1, sha1))
 			staged_changes = 1;
 
-		if (local_changes && staged_changes &&
-		    !(index_only && is_empty_blob_sha1(ce->sha1)))
-			errs = error("'%s' has staged content different "
-				     "from both the file and the HEAD\n"
-				     "(use -f to force removal)", name);
+		/*
+		 * If the index does not match the file in the work
+		 * tree and if it does not match the HEAD commit
+		 * either, (1) "git rm" without --cached definitely
+		 * will lose information; (2) "git rm --cached" will
+		 * lose information unless it is about removing an
+		 * "intent to add" entry.
+		 */
+		if (local_changes && staged_changes) {
+			if (!index_only || !is_empty_blob_sha1(ce->sha1))
+				errs = error("'%s' has staged content different "
+					     "from both the file and the HEAD\n"
+					     "(use -f to force removal)", name);
+		}
 		else if (!index_only) {
-			/* It's not dangerous to "git rm --cached" a
-			 * file if the index matches the file or the
-			 * HEAD, since it means the deleted content is
-			 * still available somewhere.
-			 */
 			if (staged_changes)
 				errs = error("'%s' has changes staged in the index\n"
 					     "(use --cached to keep the file, "
-- 
1.6.0.4.850.g6bd829

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 2/3] git add --intent-to-add: fix removal of cached emptiness
  2008-11-29  3:51             ` [PATCH 1/3] builtin-rm.c: explain and clarify the "local change" logic Junio C Hamano
@ 2008-11-29  3:55               ` Junio C Hamano
  2008-11-29 15:38                 ` Sverre Rabbelier
  2008-11-29  3:56               ` [PATCH 3/3] git add --intent-to-add: do not let an empty blob committed by accident Junio C Hamano
  1 sibling, 1 reply; 37+ messages in thread
From: Junio C Hamano @ 2008-11-29  3:55 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: Johannes Schindelin, git

This uses the extended index flag mechanism introduced earlier to mark
the entries added to the index via "git add -N" with CE_INTENT_TO_ADD.

The logic to detect an "intent to add" entry for the purpose of allowing
"git rm --cached $path" is tightened to check not just for a staged empty
blob, but with the CE_INTENT_TO_ADD bit.  This protects an empty blob that
was explicitly added and then modified in the work tree from being dropped
with this sequence:

	$ >empty
	$ git add empty
	$ echo "non empty" >empty
	$ git rm --cached empty

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin-rm.c  |    2 +-
 cache.h       |    3 ++-
 read-cache.c  |    2 ++
 t/t3600-rm.sh |    4 ++--
 4 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/builtin-rm.c b/builtin-rm.c
index 3d03da0..c11f455 100644
--- a/builtin-rm.c
+++ b/builtin-rm.c
@@ -115,7 +115,7 @@ static int check_local_mod(unsigned char *head, int index_only)
 		 * "intent to add" entry.
 		 */
 		if (local_changes && staged_changes) {
-			if (!index_only || !is_empty_blob_sha1(ce->sha1))
+			if (!index_only || !(ce->ce_flags & CE_INTENT_TO_ADD))
 				errs = error("'%s' has staged content different "
 					     "from both the file and the HEAD\n"
 					     "(use -f to force removal)", name);
diff --git a/cache.h b/cache.h
index ef2e7f9..f15b3fc 100644
--- a/cache.h
+++ b/cache.h
@@ -176,10 +176,11 @@ struct cache_entry {
 /*
  * Extended on-disk flags
  */
+#define CE_INTENT_TO_ADD 0x20000000
 /* CE_EXTENDED2 is for future extension */
 #define CE_EXTENDED2 0x80000000
 
-#define CE_EXTENDED_FLAGS (0)
+#define CE_EXTENDED_FLAGS (CE_INTENT_TO_ADD)
 
 /*
  * Safeguard to avoid saving wrong flags:
diff --git a/read-cache.c b/read-cache.c
index abc627b..fa30a0f 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -546,6 +546,8 @@ int add_to_index(struct index_state *istate, const char *path, struct stat *st,
 	ce->ce_flags = namelen;
 	if (!intent_only)
 		fill_stat_cache_info(ce, st);
+	else
+		ce->ce_flags |= CE_INTENT_TO_ADD;
 
 	if (trust_executable_bit && has_symlinks)
 		ce->ce_mode = create_ce_mode(st_mode);
diff --git a/t/t3600-rm.sh b/t/t3600-rm.sh
index 5b4d6f7..b7d46e5 100755
--- a/t/t3600-rm.sh
+++ b/t/t3600-rm.sh
@@ -187,8 +187,8 @@ test_expect_success 'but with -f it should work.' '
 	test_must_fail git ls-files --error-unmatch baz
 '
 
-test_expect_failure 'refuse to remove cached empty file with modifications' '
-	touch empty &&
+test_expect_success 'refuse to remove cached empty file with modifications' '
+	>empty &&
 	git add empty &&
 	echo content >empty &&
 	test_must_fail git rm --cached empty
-- 
1.6.0.4.850.g6bd829

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 3/3] git add --intent-to-add: do not let an empty blob committed by accident
  2008-11-29  3:51             ` [PATCH 1/3] builtin-rm.c: explain and clarify the "local change" logic Junio C Hamano
  2008-11-29  3:55               ` [PATCH 2/3] git add --intent-to-add: fix removal of cached emptiness Junio C Hamano
@ 2008-11-29  3:56               ` Junio C Hamano
  2008-11-30 19:14                 ` Jeff King
  1 sibling, 1 reply; 37+ messages in thread
From: Junio C Hamano @ 2008-11-29  3:56 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: Johannes Schindelin, git

Writing a tree out of an index with an "intent to add" entry is blocked.
This implies that you cannot "git commit" from such a state; however "git
commit -a" still works.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin-write-tree.c       |    2 +-
 cache-tree.c               |   10 +++++++---
 t/t3701-add-interactive.sh |   18 ++++++++++++++++++
 3 files changed, 26 insertions(+), 4 deletions(-)

diff --git a/builtin-write-tree.c b/builtin-write-tree.c
index 52a3c01..9d64050 100644
--- a/builtin-write-tree.c
+++ b/builtin-write-tree.c
@@ -42,7 +42,7 @@ int cmd_write_tree(int argc, const char **argv, const char *unused_prefix)
 		die("%s: error reading the index", me);
 		break;
 	case WRITE_TREE_UNMERGED_INDEX:
-		die("%s: error building trees; the index is unmerged?", me);
+		die("%s: error building trees", me);
 		break;
 	case WRITE_TREE_PREFIX_ERROR:
 		die("%s: prefix %s not found", me, prefix);
diff --git a/cache-tree.c b/cache-tree.c
index 5f8ee87..3d8f218 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -155,13 +155,17 @@ static int verify_cache(struct cache_entry **cache,
 	funny = 0;
 	for (i = 0; i < entries; i++) {
 		struct cache_entry *ce = cache[i];
-		if (ce_stage(ce)) {
+		if (ce_stage(ce) || (ce->ce_flags & CE_INTENT_TO_ADD)) {
 			if (10 < ++funny) {
 				fprintf(stderr, "...\n");
 				break;
 			}
-			fprintf(stderr, "%s: unmerged (%s)\n",
-				ce->name, sha1_to_hex(ce->sha1));
+			if (ce_stage(ce))
+				fprintf(stderr, "%s: unmerged (%s)\n",
+					ce->name, sha1_to_hex(ce->sha1));
+			else
+				fprintf(stderr, "%s: not added yet\n",
+					ce->name);
 		}
 	}
 	if (funny)
diff --git a/t/t3701-add-interactive.sh b/t/t3701-add-interactive.sh
index e95663d..473ef85 100755
--- a/t/t3701-add-interactive.sh
+++ b/t/t3701-add-interactive.sh
@@ -133,6 +133,24 @@ test_expect_success 'real edit works' '
 	test_cmp expected output
 '
 
+test_expect_success 'cannot commit with i-t-a entry' '
+	git reset --hard &&
+	echo xyzzy >rezrov &&
+	echo frotz >nitfol &&
+	git add rezrov &&
+	git add -N nitfol &&
+	test_must_fail git commit
+'
+
+test_expect_success 'can commit with an unrelated i-t-a entry in index' '
+	git reset --hard &&
+	echo xyzzy >rezrov &&
+	echo frotz >nitfol &&
+	git add rezrov &&
+	git add -N nitfol &&
+	git commit -m partial rezrov
+'
+
 if test "$(git config --bool core.filemode)" = false
 then
     say 'skipping filemode tests (filesystem does not properly support modes)'
-- 
1.6.0.4.850.g6bd829

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: What's cooking in git.git (Nov 2008, #06; Wed, 26)
  2008-11-29  1:25           ` What's cooking in git.git (Nov 2008, #06; Wed, 26) Daniel Barkalow
@ 2008-11-29 13:02             ` Nguyen Thai Ngoc Duy
  2008-11-30 10:29               ` Nguyen Thai Ngoc Duy
  0 siblings, 1 reply; 37+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2008-11-29 13:02 UTC (permalink / raw)
  To: Daniel Barkalow; +Cc: Junio C Hamano, Shawn O. Pearce, Johannes Schindelin, git

On 11/29/08, Daniel Barkalow <barkalow@iabervon.org> wrote:
> On Fri, 28 Nov 2008, Junio C Hamano wrote:
>
>  > "Shawn O. Pearce" <spearce@spearce.org> writes:
>  >
>  > > Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
>  > > ...
>  > >> In other words, unless there is more interest in that feature, enough to
>  > >> generate a well-understood design before a good implementation, I'd rather
>  > >> see this patch series dropped.
>  > >
>  > > Ack.  I agree with every remark made by Dscho, and also want to cry "wolf".
>  > >
>  > > I haven't had time to read the patch series.  Its big and intrusive
>  > > and I just don't need the feature.
>  >
>  > Well, "me neither".  Although I personally think resisting changes until
>  > it becomes absolutely necessary is a good discipline, we also need to
>  > recognise that there is a chicken-and-egg problem.  When you have a
>  > potentially useful feature, unless people actually try using it in the
>  > field, you won't discover the drawbacks in either the design nor the
>  > implementation, let alone any improvements.
>
>
> I just looked over most of it (skipping the generic index extension
>  portion). It looks to me like it's introducing an extra concept to avoid
>  actually fixing maybe-bugs in the "assume unchanged" implementation
>  when used with files that have been changed intentionally (with the user
>  intending git to overlook this change). Sparse checkout is essentially a
>  special case of this, where the user has changed the working directory
>  radically (not populating it at all) and wants git to carry on as if this
>  was not the case (with a certain amount of porcelain code to cause this to
>  happen automatically).

I chose to use another bit because I did not want to change the
behaviour of CE_VALID bit: it assumes all files are present.

>  If there's any need for this to be distinguished from "assume unchanged",
>  I think it should be used with, not instead of, the CE_VALID bit; and it
>  could probably use some bit in the stat info section, since we don't need
>  stat info if we know by assumption that the entry is valid.

Interesting. I'll think more about this.

>         -Daniel
>  *This .sig left intentionally blank*
>
> --
>  To unsubscribe from this list: send the line "unsubscribe git" in
>  the body of a message to majordomo@vger.kernel.org
>  More majordomo info at  http://vger.kernel.org/majordomo-info.html
>


-- 
Duy

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 2/3] git add --intent-to-add: fix removal of cached emptiness
  2008-11-29  3:55               ` [PATCH 2/3] git add --intent-to-add: fix removal of cached emptiness Junio C Hamano
@ 2008-11-29 15:38                 ` Sverre Rabbelier
  2008-11-30 19:21                   ` Jeff King
  0 siblings, 1 reply; 37+ messages in thread
From: Sverre Rabbelier @ 2008-11-29 15:38 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Shawn O. Pearce, Johannes Schindelin, git

On Sat, Nov 29, 2008 at 04:55, Junio C Hamano <gitster@pobox.com> wrote:
> This uses the extended index flag mechanism introduced earlier to mark
> the entries added to the index via "git add -N" with CE_INTENT_TO_ADD.

Is 'intent' [0] used properly here? Should it not be 'intend' [1]?

[0] http://en.wiktionary.org/wiki/intent
[1] http://en.wiktionary.org/wiki/intend

-- 
Cheers,

Sverre Rabbelier

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: What's cooking in git.git (Nov 2008, #06; Wed, 26)
  2008-11-29 13:02             ` Nguyen Thai Ngoc Duy
@ 2008-11-30 10:29               ` Nguyen Thai Ngoc Duy
  2008-11-30 21:26                 ` Daniel Barkalow
  0 siblings, 1 reply; 37+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2008-11-30 10:29 UTC (permalink / raw)
  To: Daniel Barkalow; +Cc: Junio C Hamano, Shawn O. Pearce, Johannes Schindelin, git

On 11/29/08, Nguyen Thai Ngoc Duy <pclouds@gmail.com> wrote:
> On 11/29/08, Daniel Barkalow <barkalow@iabervon.org> wrote:
>  >  If there's any need for this to be distinguished from "assume unchanged",
>  >  I think it should be used with, not instead of, the CE_VALID bit; and it
>  >  could probably use some bit in the stat info section, since we don't need
>  >  stat info if we know by assumption that the entry is valid.
>
>
> Interesting. I'll think more about this.
>

As I said, CE_VALID implies all files are present. I could make
CE_NO_CHECKOUT to be used with CE_VALID, but I would need to check all
CE_VALID code path to make sure the behaviour remains if
CE_NO_CHECKOUT is absent. It's just more intrusive.

I have nothing against storing CE_NO_CHECKOUT in stat info except that
it seems inappropriate/hidden place to do. ce_flags is more obvious
choice. I haven't looked closely to stat info code in read-cache.c
though.
-- 
Duy

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 3/3] git add --intent-to-add: do not let an empty blob committed by accident
  2008-11-29  3:56               ` [PATCH 3/3] git add --intent-to-add: do not let an empty blob committed by accident Junio C Hamano
@ 2008-11-30 19:14                 ` Jeff King
  2008-12-01  9:24                   ` Junio C Hamano
  0 siblings, 1 reply; 37+ messages in thread
From: Jeff King @ 2008-11-30 19:14 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Shawn O. Pearce, Johannes Schindelin, git

On Fri, Nov 28, 2008 at 07:56:34PM -0800, Junio C Hamano wrote:

> Subject: Re: [PATCH 3/3] git add --intent-to-add: do not let an empty blob
>	committed by accident

Minor nit: grammatical error in the subject.

> This implies that you cannot "git commit" from such a state; however "git
> commit -a" still works.

I was going to provide a test for "git commit -a" to squash in, but it
looks like the version in 'pu' already has one.

>  	case WRITE_TREE_UNMERGED_INDEX:
> -		die("%s: error building trees; the index is unmerged?", me);
> +		die("%s: error building trees", me);

This caught me by surprise while reading, but I assume the rationale is
"now there is a new, different reason not to be able to build the trees,
so our guess is less likely to be correct". I wonder if we can do better
by actually passing out a more exact error value (though it looks like
we will already have said "foo: not added yet" by that point anyway, so
maybe it is just pointless to say more).

> diff --git a/t/t3701-add-interactive.sh b/t/t3701-add-interactive.sh

Why in t3701? These don't have anything to do with interactive add, and
there is a already a t2203-add-intent.

-Peff

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 2/3] git add --intent-to-add: fix removal of cached emptiness
  2008-11-29 15:38                 ` Sverre Rabbelier
@ 2008-11-30 19:21                   ` Jeff King
  0 siblings, 0 replies; 37+ messages in thread
From: Jeff King @ 2008-11-30 19:21 UTC (permalink / raw)
  To: sverre; +Cc: Junio C Hamano, Shawn O. Pearce, Johannes Schindelin, git

On Sat, Nov 29, 2008 at 04:38:12PM +0100, Sverre Rabbelier wrote:

> On Sat, Nov 29, 2008 at 04:55, Junio C Hamano <gitster@pobox.com> wrote:
> > This uses the extended index flag mechanism introduced earlier to mark
> > the entries added to the index via "git add -N" with CE_INTENT_TO_ADD.
> 
> Is 'intent' [0] used properly here? Should it not be 'intend' [1]?
> 
> [0] http://en.wiktionary.org/wiki/intent
> [1] http://en.wiktionary.org/wiki/intend

I think it's fine. The flags describe the entry, not the user (e.g.,
CE_VALID). So the entry does not _intend_ to add anything, but rather
there exists _intent_ to add this entry (you might also say the entry is
"_intended_ to be added", but that is getting a bit clunky).

-Peff

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: What's cooking in git.git (Nov 2008, #06; Wed, 26)
  2008-11-30 10:29               ` Nguyen Thai Ngoc Duy
@ 2008-11-30 21:26                 ` Daniel Barkalow
  2008-12-06 17:26                   ` Nguyen Thai Ngoc Duy
  2008-12-07  3:45                   ` Junio C Hamano
  0 siblings, 2 replies; 37+ messages in thread
From: Daniel Barkalow @ 2008-11-30 21:26 UTC (permalink / raw)
  To: Nguyen Thai Ngoc Duy
  Cc: Junio C Hamano, Shawn O. Pearce, Johannes Schindelin, git

On Sun, 30 Nov 2008, Nguyen Thai Ngoc Duy wrote:

> On 11/29/08, Nguyen Thai Ngoc Duy <pclouds@gmail.com> wrote:
> > On 11/29/08, Daniel Barkalow <barkalow@iabervon.org> wrote:
> >  >  If there's any need for this to be distinguished from "assume unchanged",
> >  >  I think it should be used with, not instead of, the CE_VALID bit; and it
> >  >  could probably use some bit in the stat info section, since we don't need
> >  >  stat info if we know by assumption that the entry is valid.
> >
> >
> > Interesting. I'll think more about this.
> >
> 
> As I said, CE_VALID implies all files are present.

My first question is whether this actually should be true. Going back to 
the message for 5f73076c1a9b4b8dc94f77eac98eb558d25e33c0, it sounds like 
the CE_VALID code is designed to be safe and sort of correct even if the 
files are not actually unchanged; I don't think it would be out-of-spec 
for CE_VALID to (1) always produce output as if the working tree contained 
what the index contains, while (2) refusing to make any changes to working 
tree files that do not actually match the index. As it is now, (2) is 
explicitly true, but (1) is left vague-- commands may fail entirely or 
produce different output if CE_VALID is set in the index for a file that 
has changes in the working tree, but not in any particular way.

Now, it might be necessary for CE_NO_CHECKOUT to differ from CE_VALID in 
some ways in (2): if a file is CE_NO_CHECKOUT and absent, code which would 
modify it could probably just report sucess, while CE_VALID on a file 
with changes should probably report failure. On the other hand, that could 
just as easily be at the porcelain layer, with the porcelain instructing 
the plumbing to change the index without changing the working tree for 
those files outside the sparse checkout, and the plumbing would report 
errors if the porcelain did not do this.

> I could make CE_NO_CHECKOUT to be used with CE_VALID, but I would need 
> to check all CE_VALID code path to make sure the behaviour remains if
> CE_NO_CHECKOUT is absent. It's just more intrusive.

I would expect all code that has a CE_VALID path to do something actually 
wrong if it took the non-CE_VALID code path on CE_NO_CHECKOUT and there 
was no CE_NO_CHECKOUT code path. So I'd expect that your patch is 
insufficient to the extent that CE_NO_CHECKOUT doesn't imply CE_VALID 
(since there is very little in the way of CE_NO_CHECKOUT-specific 
handling in your patch).

The only case I can think of where NO_CHECKOUT is more like !VALID than 
VALID is with respect to whether we can report the content in the index by 
looking in the filesystem instead of in the database; I don't think this 
is an intentional optimization anywhere, and I think it would be a likely 
source of bugs if it were (e.g., it would have to know about files which 
are up-to-date with respect to stat info, but which have been "smudged" on 
disk and therefore don't match byte-for-byte with the database). Actually, 
it might be most accurate to treat --no-checkout as being CE_VALID with a 
smudge filter of "rm". If the combination of CE_VALID and on-disk 
conversion works (which is likely to be the common pattern for Windows 
users, who need autocrlf and have a slow lstat(), and is therefore 
maintained), surely this combination would work for CE_NO_CHECKOUT.

> I have nothing against storing CE_NO_CHECKOUT in stat info except that
> it seems inappropriate/hidden place to do. ce_flags is more obvious
> choice. I haven't looked closely to stat info code in read-cache.c
> though.

It should be pretty clean to check CE_VALID when reading an entry from 
disk and remap bits from it to additional flags in memory. I wouldn't 
suggest overlaying them in memory, but there's also no shortage of space 
for flags in memory.

	-Daniel
*This .sig left intentionally blank*

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 3/3] git add --intent-to-add: do not let an empty blob committed by accident
  2008-11-30 19:14                 ` Jeff King
@ 2008-12-01  9:24                   ` Junio C Hamano
  0 siblings, 0 replies; 37+ messages in thread
From: Junio C Hamano @ 2008-12-01  9:24 UTC (permalink / raw)
  To: Jeff King; +Cc: Shawn O. Pearce, Johannes Schindelin, git

Jeff King <peff@peff.net> writes:

> On Fri, Nov 28, 2008 at 07:56:34PM -0800, Junio C Hamano wrote:
> ...
>>  	case WRITE_TREE_UNMERGED_INDEX:
>> -		die("%s: error building trees; the index is unmerged?", me);
>> +		die("%s: error building trees", me);
>
> This caught me by surprise while reading, but I assume the rationale is
> "now there is a new, different reason not to be able to build the trees,
> so our guess is less likely to be correct". I wonder if we can do better
> by actually passing out a more exact error value (though it looks like
> we will already have said "foo: not added yet" by that point anyway, so
> maybe it is just pointless to say more).

The places that detect the "unmerged" (and then newly added "intent-only")
entries already have issued error messages in the codepath that leads to
this error.

>> diff --git a/t/t3701-add-interactive.sh b/t/t3701-add-interactive.sh
>
> Why in t3701?

Good question.  Brain fart, perhaps.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: What's cooking in git.git (Nov 2008, #06; Wed, 26)
  2008-11-30 21:26                 ` Daniel Barkalow
@ 2008-12-06 17:26                   ` Nguyen Thai Ngoc Duy
  2008-12-06 18:39                     ` Daniel Barkalow
  2008-12-07  3:45                   ` Junio C Hamano
  1 sibling, 1 reply; 37+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2008-12-06 17:26 UTC (permalink / raw)
  To: Daniel Barkalow; +Cc: Junio C Hamano, Shawn O. Pearce, Johannes Schindelin, git

On 12/1/08, Daniel Barkalow <barkalow@iabervon.org> wrote:
> On Sun, 30 Nov 2008, Nguyen Thai Ngoc Duy wrote:
>
>  > On 11/29/08, Nguyen Thai Ngoc Duy <pclouds@gmail.com> wrote:
>  > > On 11/29/08, Daniel Barkalow <barkalow@iabervon.org> wrote:
>  > >  >  If there's any need for this to be distinguished from "assume unchanged",
>  > >  >  I think it should be used with, not instead of, the CE_VALID bit; and it
>  > >  >  could probably use some bit in the stat info section, since we don't need
>  > >  >  stat info if we know by assumption that the entry is valid.
>  > >
>  > >
>  > > Interesting. I'll think more about this.
>  > >
>  >
>  > As I said, CE_VALID implies all files are present.
>
>
> My first question is whether this actually should be true. Going back to
>  the message for 5f73076c1a9b4b8dc94f77eac98eb558d25e33c0, it sounds like
>  the CE_VALID code is designed to be safe and sort of correct even if the
>  files are not actually unchanged; I don't think it would be out-of-spec
>  for CE_VALID to (1) always produce output as if the working tree contained
>  what the index contains, while (2) refusing to make any changes to working
>  tree files that do not actually match the index. As it is now, (2) is
>  explicitly true, but (1) is left vague-- commands may fail entirely or
>  produce different output if CE_VALID is set in the index for a file that
>  has changes in the working tree, but not in any particular way.

(1) is not always true. For example diff machinary may examine
worktree files regardless CE_VALID, which is updated for
CE_NO_CHECKOUT in d9f8fca (Prevent diff machinery from examining
worktree outside sparse checkout)

>  Now, it might be necessary for CE_NO_CHECKOUT to differ from CE_VALID in
>  some ways in (2): if a file is CE_NO_CHECKOUT and absent, code which would
>  modify it could probably just report sucess, while CE_VALID on a file
>  with changes should probably report failure. On the other hand, that could
>  just as easily be at the porcelain layer, with the porcelain instructing
>  the plumbing to change the index without changing the working tree for
>  those files outside the sparse checkout, and the plumbing would report
>  errors if the porcelain did not do this.

That's right. Much of work in the last half of the series is on
porcelain layer. "git grep" fix is the only porcelain that gets fixed
in this series.

>  > I could make CE_NO_CHECKOUT to be used with CE_VALID, but I would need
>  > to check all CE_VALID code path to make sure the behaviour remains if
>  > CE_NO_CHECKOUT is absent. It's just more intrusive.
>
>
> I would expect all code that has a CE_VALID path to do something actually
>  wrong if it took the non-CE_VALID code path on CE_NO_CHECKOUT and there
>  was no CE_NO_CHECKOUT code path. So I'd expect that your patch is
>  insufficient to the extent that CE_NO_CHECKOUT doesn't imply CE_VALID
>  (since there is very little in the way of CE_NO_CHECKOUT-specific
>  handling in your patch).

I read the code again. CE_NO_CHECKOUT should follow CE_VALID code path
(which was extended to CE_VALID_MASK to have both flags). That means
CE_NO_CHECKOUT is treated as same as CE_VALID. The only difference
here is CE_VALID is set/unset by "git update-index --really-refresh"
and core.ingorestat while CE_NO_CHECKOUT has its own way to set/unset.

There is not much work for CE_NO_CHECKOUT on plumbling level except
some fixes. The last half of the series, for porcelain level, you will
see more.

>  The only case I can think of where NO_CHECKOUT is more like !VALID than
>  VALID is with respect to whether we can report the content in the index by
>  looking in the filesystem instead of in the database; I don't think this
>  is an intentional optimization anywhere, and I think it would be a likely
>  source of bugs if it were (e.g., it would have to know about files which
>  are up-to-date with respect to stat info, but which have been "smudged" on
>  disk and therefore don't match byte-for-byte with the database).

There is worktree file reuse in diff code somewhere IIRC. Yes, this
should be checked.

>  Actually,
>  it might be most accurate to treat --no-checkout as being CE_VALID with a
>  smudge filter of "rm". If the combination of CE_VALID and on-disk
>  conversion works (which is likely to be the common pattern for Windows
>  users, who need autocrlf and have a slow lstat(), and is therefore
>  maintained), surely this combination would work for CE_NO_CHECKOUT.

Very interesting.

>  > I have nothing against storing CE_NO_CHECKOUT in stat info except that
>  > it seems inappropriate/hidden place to do. ce_flags is more obvious
>  > choice. I haven't looked closely to stat info code in read-cache.c
>  > though.
>
>
> It should be pretty clean to check CE_VALID when reading an entry from
>  disk and remap bits from it to additional flags in memory. I wouldn't
>  suggest overlaying them in memory, but there's also no shortage of space
>  for flags in memory.

I see. Still I prefer the current approach, less headache to decide
what bit to take from stat info ;-)
-- 
Duy

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: What's cooking in git.git (Nov 2008, #06; Wed, 26)
  2008-12-06 17:26                   ` Nguyen Thai Ngoc Duy
@ 2008-12-06 18:39                     ` Daniel Barkalow
  2008-12-07 12:27                       ` Nguyen Thai Ngoc Duy
  0 siblings, 1 reply; 37+ messages in thread
From: Daniel Barkalow @ 2008-12-06 18:39 UTC (permalink / raw)
  To: Nguyen Thai Ngoc Duy
  Cc: Junio C Hamano, Shawn O. Pearce, Johannes Schindelin, git

On Sun, 7 Dec 2008, Nguyen Thai Ngoc Duy wrote:

> On 12/1/08, Daniel Barkalow <barkalow@iabervon.org> wrote:
> > On Sun, 30 Nov 2008, Nguyen Thai Ngoc Duy wrote:
> >
> >  > On 11/29/08, Nguyen Thai Ngoc Duy <pclouds@gmail.com> wrote:
> >  > > On 11/29/08, Daniel Barkalow <barkalow@iabervon.org> wrote:
> >  > >  >  If there's any need for this to be distinguished from "assume unchanged",
> >  > >  >  I think it should be used with, not instead of, the CE_VALID bit; and it
> >  > >  >  could probably use some bit in the stat info section, since we don't need
> >  > >  >  stat info if we know by assumption that the entry is valid.
> >  > >
> >  > >
> >  > > Interesting. I'll think more about this.
> >  > >
> >  >
> >  > As I said, CE_VALID implies all files are present.
> >
> >
> > My first question is whether this actually should be true. Going back to
> >  the message for 5f73076c1a9b4b8dc94f77eac98eb558d25e33c0, it sounds like
> >  the CE_VALID code is designed to be safe and sort of correct even if the
> >  files are not actually unchanged; I don't think it would be out-of-spec
> >  for CE_VALID to (1) always produce output as if the working tree contained
> >  what the index contains, while (2) refusing to make any changes to working
> >  tree files that do not actually match the index. As it is now, (2) is
> >  explicitly true, but (1) is left vague-- commands may fail entirely or
> >  produce different output if CE_VALID is set in the index for a file that
> >  has changes in the working tree, but not in any particular way.
> 
> (1) is not always true. For example diff machinary may examine
> worktree files regardless CE_VALID, which is updated for
> CE_NO_CHECKOUT in d9f8fca (Prevent diff machinery from examining
> worktree outside sparse checkout)

I know (1) is not always true in the current implementation. What I'm 
getting at is to ask (a) whether our documented behavior ever violates (1) 
and (b) whether enforcing (1) would be an improvement.

I suspect that enforcing (1) would be less surprising to users in the 
situation where the assumption is false that worktree files match the 
index when CE_VALID is set. As it is, the diff machinery does surprising 
things in this situation, and I think we just say "Nasal Monkey Territory" 
(that is, we tell the user, "if you don't want git to do surprising 
things, don't get into this situation"). If that is the case, we should be 
fine changing it to match your CE_NO_CHECKOUT behavior, and it wouldn't be 
worse for CE_VALID and might be better.

> >  Now, it might be necessary for CE_NO_CHECKOUT to differ from CE_VALID in
> >  some ways in (2): if a file is CE_NO_CHECKOUT and absent, code which would
> >  modify it could probably just report sucess, while CE_VALID on a file
> >  with changes should probably report failure. On the other hand, that could
> >  just as easily be at the porcelain layer, with the porcelain instructing
> >  the plumbing to change the index without changing the working tree for
> >  those files outside the sparse checkout, and the plumbing would report
> >  errors if the porcelain did not do this.
> 
> That's right. Much of work in the last half of the series is on
> porcelain layer. "git grep" fix is the only porcelain that gets fixed
> in this series.

I haven't gotten to reading the last half yet, so I'll avoid taking a 
specific position on how it should work, and just make unsupported 
suggestions.

> >  > I could make CE_NO_CHECKOUT to be used with CE_VALID, but I would need
> >  > to check all CE_VALID code path to make sure the behaviour remains if
> >  > CE_NO_CHECKOUT is absent. It's just more intrusive.
> >
> >
> > I would expect all code that has a CE_VALID path to do something actually
> >  wrong if it took the non-CE_VALID code path on CE_NO_CHECKOUT and there
> >  was no CE_NO_CHECKOUT code path. So I'd expect that your patch is
> >  insufficient to the extent that CE_NO_CHECKOUT doesn't imply CE_VALID
> >  (since there is very little in the way of CE_NO_CHECKOUT-specific
> >  handling in your patch).
> 
> I read the code again. CE_NO_CHECKOUT should follow CE_VALID code path
> (which was extended to CE_VALID_MASK to have both flags). That means
> CE_NO_CHECKOUT is treated as same as CE_VALID. The only difference
> here is CE_VALID is set/unset by "git update-index --really-refresh"
> and core.ingorestat while CE_NO_CHECKOUT has its own way to set/unset.

I think, then, that it would be cleaner to have CE_VALID instead of 
CE_VALID_MASK, and have the things that care test CE_NO_CHECKOUT or 
!CE_NO_CHECKOUT. This wouldn't be all that different, except that it would 
mean that distinguishing them appears as a special case, which in turn 
makes it easier to question whether they should differ.

> There is not much work for CE_NO_CHECKOUT on plumbling level except
> some fixes. The last half of the series, for porcelain level, you will
> see more.

For the porcelain level, do we need the difference to be in the index? If 
the porcelain knows the sparse checkout area and can instruct the plumbing 
appropriately, the information shouldn't need to be stored in the index 
unless it's ever important to remember whether an entry is CE_VALID due to 
having been outside the checkout when the index was written, even though 
the checkout area now includes it. I don't have a good intuition as to 
what ought to happen if the user manually changes what's specified for 
checkout without actually updating the index and working tree.

> >  The only case I can think of where NO_CHECKOUT is more like !VALID than
> >  VALID is with respect to whether we can report the content in the index by
> >  looking in the filesystem instead of in the database; I don't think this
> >  is an intentional optimization anywhere, and I think it would be a likely
> >  source of bugs if it were (e.g., it would have to know about files which
> >  are up-to-date with respect to stat info, but which have been "smudged" on
> >  disk and therefore don't match byte-for-byte with the database).
> 
> There is worktree file reuse in diff code somewhere IIRC. Yes, this
> should be checked.
> 
> >  Actually,
> >  it might be most accurate to treat --no-checkout as being CE_VALID with a
> >  smudge filter of "rm". If the combination of CE_VALID and on-disk
> >  conversion works (which is likely to be the common pattern for Windows
> >  users, who need autocrlf and have a slow lstat(), and is therefore
> >  maintained), surely this combination would work for CE_NO_CHECKOUT.
> 
> Very interesting.
> 
> >  > I have nothing against storing CE_NO_CHECKOUT in stat info except that
> >  > it seems inappropriate/hidden place to do. ce_flags is more obvious
> >  > choice. I haven't looked closely to stat info code in read-cache.c
> >  > though.
> >
> >
> > It should be pretty clean to check CE_VALID when reading an entry from
> >  disk and remap bits from it to additional flags in memory. I wouldn't
> >  suggest overlaying them in memory, but there's also no shortage of space
> >  for flags in memory.
> 
> I see. Still I prefer the current approach, less headache to decide
> what bit to take from stat info ;-)

True. But it would be even easier if it didn't have to be marked in the 
index at all.

	-Daniel
*This .sig left intentionally blank*

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: What's cooking in git.git (Nov 2008, #06; Wed, 26)
  2008-11-30 21:26                 ` Daniel Barkalow
  2008-12-06 17:26                   ` Nguyen Thai Ngoc Duy
@ 2008-12-07  3:45                   ` Junio C Hamano
  1 sibling, 0 replies; 37+ messages in thread
From: Junio C Hamano @ 2008-12-07  3:45 UTC (permalink / raw)
  To: Daniel Barkalow
  Cc: Nguyen Thai Ngoc Duy, Shawn O. Pearce, Johannes Schindelin, git

Daniel Barkalow <barkalow@iabervon.org> writes:

>> As I said, CE_VALID implies all files are present.
>
> My first question is whether this actually should be true.

If the user says "Please pretend that I have never touched this file",
which is what "assume unchanged" is all about, I think we should not
notice if the user removes one of such files from the working tree, just
like we don't notice (rather, pretend not to notice) if the user modified
it.

I am inclined to think that we should rather treat that as a bug.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: What's cooking in git.git (Nov 2008, #06; Wed, 26)
  2008-12-06 18:39                     ` Daniel Barkalow
@ 2008-12-07 12:27                       ` Nguyen Thai Ngoc Duy
  2008-12-07 21:26                         ` Daniel Barkalow
  0 siblings, 1 reply; 37+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2008-12-07 12:27 UTC (permalink / raw)
  To: Daniel Barkalow; +Cc: Junio C Hamano, Shawn O. Pearce, Johannes Schindelin, git

On 12/7/08, Daniel Barkalow <barkalow@iabervon.org> wrote:
>  > There is not much work for CE_NO_CHECKOUT on plumbling level except
>  > some fixes. The last half of the series, for porcelain level, you will
>  > see more.
>
>
> For the porcelain level, do we need the difference to be in the index? If
>  the porcelain knows the sparse checkout area and can instruct the plumbing
>  appropriately, the information shouldn't need to be stored in the index

This was discussed since the beginning of this feature. I recall that
the index reflects worktree, and because we mark CE_NO_CHECKOUT on
file basis, it's best to save the information there, not separately.
We do save high level information to form the checkout area (sparse
patterns) in the last half, but basically you should be able to live
without that.

>  unless it's ever important to remember whether an entry is CE_VALID due to
>  having been outside the checkout when the index was written, even though
>  the checkout area now includes it. I don't have a good intuition as to
>  what ought to happen if the user manually changes what's specified for
>  checkout without actually updating the index and working tree.

So if a user changes worktree without updating index, they will have
the same results as they do now: files are shown as modified if they
don't have CE_NO_CHECKOUT set. If those files do, they are considered
'orphaned' or staled and are recommended to be removed/updated to
avoid unexpected consequences (not availble this this first half
series because that belongs to "git status").
-- 
Duy

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: What's cooking in git.git (Nov 2008, #06; Wed, 26)
  2008-12-07 12:27                       ` Nguyen Thai Ngoc Duy
@ 2008-12-07 21:26                         ` Daniel Barkalow
  2008-12-08 12:51                           ` Nguyen Thai Ngoc Duy
  0 siblings, 1 reply; 37+ messages in thread
From: Daniel Barkalow @ 2008-12-07 21:26 UTC (permalink / raw)
  To: Nguyen Thai Ngoc Duy
  Cc: Junio C Hamano, Shawn O. Pearce, Johannes Schindelin, git

On Sun, 7 Dec 2008, Nguyen Thai Ngoc Duy wrote:

> On 12/7/08, Daniel Barkalow <barkalow@iabervon.org> wrote:
> >  > There is not much work for CE_NO_CHECKOUT on plumbling level except
> >  > some fixes. The last half of the series, for porcelain level, you will
> >  > see more.
> >
> >
> > For the porcelain level, do we need the difference to be in the index? If
> >  the porcelain knows the sparse checkout area and can instruct the plumbing
> >  appropriately, the information shouldn't need to be stored in the index
> 
> This was discussed since the beginning of this feature. I recall that
> the index reflects worktree, and because we mark CE_NO_CHECKOUT on
> file basis, it's best to save the information there, not separately.
> We do save high level information to form the checkout area (sparse
> patterns) in the last half, but basically you should be able to live
> without that.

We need to mark in the index the information that reflects the worktree.

If, however, we take CE_VALID to be the flag for "ignore the worktree 
entirely at this path; act as it if contains what the index contains" (and 
use this to cause that aspect of no-checkout), and we then entirely ignore 
the worktree, including not caring whether there are files there or not 
(except, of course, that in the transition from caring to not caring for 
no-checkout, we make the worktree empty, while in the case for 
"stat-is-expensive", we bring it into agreement with the index), then 
there is no additional information that needs to be conveyed in the index.

> >  unless it's ever important to remember whether an entry is CE_VALID due to
> >  having been outside the checkout when the index was written, even though
> >  the checkout area now includes it. I don't have a good intuition as to
> >  what ought to happen if the user manually changes what's specified for
> >  checkout without actually updating the index and working tree.
> 
> So if a user changes worktree without updating index, they will have
> the same results as they do now: files are shown as modified if they
> don't have CE_NO_CHECKOUT set. If those files do, they are considered
> 'orphaned' or staled and are recommended to be removed/updated to
> avoid unexpected consequences (not availble this this first half
> series because that belongs to "git status").

I was actually thinking that there would be a file for "this is what the 
user wants to have checked out" (as opposed to the index, which must 
contain "this is what is checked out"), and the porcelain would instruct 
the plumbing as to what to do with the worktree (that the plumbing with 
then ignore, due to the index bit) based on this information.

The index obviously can't contain the user's full instructions for what 
should be checked out, because the user will want to say "I don't care 
about anything in Documentation/" and have this apply to 
Documentation/some-file-not-in-the-index, so that if this file is in the 
worktree, the user gets a warning.

I think you're doing this with core.defaultsparse, although you seem to 
allow the index to diverge from this easily.

The question, then, is what happens when the index and core.defaultsparse 
disagree, either because the porcelain supports causing it or because the 
user has simply editting the config file or used plumbing to modify the 
index. That is, (1) we have index entries that say that the worktree is 
ignored, and the rules don't say they're outside the sparse checkout; do 
we care whether we expect the worktree to be empty or match the index? 
And, (2) we have index entries that say we do care about them, but the 
rules say they're outside the sparse checkout; what happens with these?

Case (1) is where we would need to know, in the index, whether we expect 
the worktree to actually match the index (traditional CE_VALID) or whether 
we expect the worktree to be empty (CE_NO_CHECKOUT), if our behavior 
should actually differ. My vague feeling is that we don't want it to 
differ, and these paths are unexpectional "interesting to the user, but 
the worktree is ignored" until reading a tree into the index again. (But 
note that we will have to check on the worktree when reading into the 
index if this changes the index from "blobA, CE_VALID" to 
"blobA, !CE_VALID", since the worktree could differ in a way that we don't 
want to retain. And I think we want it to be an error to have the worktree 
be something other than blobA or nothing before, but "nothing" is fine and 
we just write it out. (This means that users of CE_VALID who remove files 
behind git's back may lose their removal work; but this is a pretty 
trivial danger).

	-Daniel
*This .sig left intentionally blank*

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: What's cooking in git.git (Nov 2008, #06; Wed, 26)
  2008-12-07 21:26                         ` Daniel Barkalow
@ 2008-12-08 12:51                           ` Nguyen Thai Ngoc Duy
  2008-12-08 19:41                             ` Daniel Barkalow
  0 siblings, 1 reply; 37+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2008-12-08 12:51 UTC (permalink / raw)
  To: Daniel Barkalow; +Cc: Junio C Hamano, Shawn O. Pearce, Johannes Schindelin, git

On 12/8/08, Daniel Barkalow <barkalow@iabervon.org> wrote:
>  > This was discussed since the beginning of this feature. I recall that
>  > the index reflects worktree, and because we mark CE_NO_CHECKOUT on
>  > file basis, it's best to save the information there, not separately.
>  > We do save high level information to form the checkout area (sparse
>  > patterns) in the last half, but basically you should be able to live
>  > without that.
>
>
> We need to mark in the index the information that reflects the worktree.
>
>  If, however, we take CE_VALID to be the flag for "ignore the worktree
>  entirely at this path; act as it if contains what the index contains" (and
>  use this to cause that aspect of no-checkout), and we then entirely ignore
>  the worktree, including not caring whether there are files there or not
>  (except, of course, that in the transition from caring to not caring for
>  no-checkout, we make the worktree empty, while in the case for
>  "stat-is-expensive", we bring it into agreement with the index), then
>  there is no additional information that needs to be conveyed in the index.

That's not enough. CE_VALID is "ignore the worktree files" while
CE_NO_CHECKOUT is stricter: "those files does (or should) not exist".
The difference is

 - for "git grep", we ignore path with CE_NO_CHECKOUT (while using
cache version for CE_VALID)
 - porcelain-level support to widen/narrow checkout area will need
CE_NO_CHECKOUT, not CE_VALID

>
>  > >  unless it's ever important to remember whether an entry is CE_VALID due to
>  > >  having been outside the checkout when the index was written, even though
>  > >  the checkout area now includes it. I don't have a good intuition as to
>  > >  what ought to happen if the user manually changes what's specified for
>  > >  checkout without actually updating the index and working tree.
>  >
>  > So if a user changes worktree without updating index, they will have
>  > the same results as they do now: files are shown as modified if they
>  > don't have CE_NO_CHECKOUT set. If those files do, they are considered
>  > 'orphaned' or staled and are recommended to be removed/updated to
>  > avoid unexpected consequences (not availble this this first half
>  > series because that belongs to "git status").
>
>
> I was actually thinking that there would be a file for "this is what the
>  user wants to have checked out" (as opposed to the index, which must
>  contain "this is what is checked out"), and the porcelain would instruct
>  the plumbing as to what to do with the worktree (that the plumbing with
>  then ignore, due to the index bit) based on this information.
>
>  The index obviously can't contain the user's full instructions for what
>  should be checked out, because the user will want to say "I don't care
>  about anything in Documentation/" and have this apply to
>  Documentation/some-file-not-in-the-index, so that if this file is in the
>  worktree, the user gets a warning.
>
>  I think you're doing this with core.defaultsparse, although you seem to
>  allow the index to diverge from this easily.

Yes they can.

>
>  The question, then, is what happens when the index and core.defaultsparse
>  disagree, either because the porcelain supports causing it or because the
>  user has simply editting the config file or used plumbing to modify the
>  index. That is, (1) we have index entries that say that the worktree is
>  ignored, and the rules don't say they're outside the sparse checkout; do
>  we care whether we expect the worktree to be empty or match the index?
>  And, (2) we have index entries that say we do care about them, but the
>  rules say they're outside the sparse checkout; what happens with these?

The rule is CE_NO_CHECKOUT is king. core.defaultsparse only helps
setting CE_NO_CHECKOUT on new entries when they enter the index.

So if core.defaultsparse does not match what is in index, that's fine.
You may get more files with "git merge", "git checkout".. but
already-removed files should stay removed.

If you are not happy with core.defaultsparse, you can replace it with
your own tools by manipulating directly CE_NO_CHECKOUT using "git
update-index" and "git ls-files". BTW  the implementation has not had
an easy way to replace core.defaultsparse. You have to modify
apply_narrow_spec(). But if someone really needs it, a hook can be
added.
-- 
Duy

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: What's cooking in git.git (Nov 2008, #06; Wed, 26)
  2008-12-08 12:51                           ` Nguyen Thai Ngoc Duy
@ 2008-12-08 19:41                             ` Daniel Barkalow
  2008-12-11 13:04                               ` Nguyen Thai Ngoc Duy
  0 siblings, 1 reply; 37+ messages in thread
From: Daniel Barkalow @ 2008-12-08 19:41 UTC (permalink / raw)
  To: Nguyen Thai Ngoc Duy
  Cc: Junio C Hamano, Shawn O. Pearce, Johannes Schindelin, git

On Mon, 8 Dec 2008, Nguyen Thai Ngoc Duy wrote:

> On 12/8/08, Daniel Barkalow <barkalow@iabervon.org> wrote:
> >  > This was discussed since the beginning of this feature. I recall that
> >  > the index reflects worktree, and because we mark CE_NO_CHECKOUT on
> >  > file basis, it's best to save the information there, not separately.
> >  > We do save high level information to form the checkout area (sparse
> >  > patterns) in the last half, but basically you should be able to live
> >  > without that.
> >
> >
> > We need to mark in the index the information that reflects the worktree.
> >
> >  If, however, we take CE_VALID to be the flag for "ignore the worktree
> >  entirely at this path; act as it if contains what the index contains" (and
> >  use this to cause that aspect of no-checkout), and we then entirely ignore
> >  the worktree, including not caring whether there are files there or not
> >  (except, of course, that in the transition from caring to not caring for
> >  no-checkout, we make the worktree empty, while in the case for
> >  "stat-is-expensive", we bring it into agreement with the index), then
> >  there is no additional information that needs to be conveyed in the index.
> 
> That's not enough. CE_VALID is "ignore the worktree files" while
> CE_NO_CHECKOUT is stricter: "those files does (or should) not exist".
> The difference is
> 
>  - for "git grep", we ignore path with CE_NO_CHECKOUT (while using
> cache version for CE_VALID)

Is this sufficient? I'd expect "git grep" to ignore paths that are outside 
the checked-out region, even when searching an arbitrary tree, and even 
when those files aren't in the index at all (i.e., the current commit 
doesn't have them). That is, I'd expect core.defaultsparse or the 
equivalent to limit the paths, normally giving this effect.

>  - porcelain-level support to widen/narrow checkout area will need
> CE_NO_CHECKOUT, not CE_VALID

This isn't a meaningful difference between CE_NO_CHECKOUT and CE_VALID if 
there aren't any other differences.

> >  > >  unless it's ever important to remember whether an entry is CE_VALID due to
> >  > >  having been outside the checkout when the index was written, even though
> >  > >  the checkout area now includes it. I don't have a good intuition as to
> >  > >  what ought to happen if the user manually changes what's specified for
> >  > >  checkout without actually updating the index and working tree.
> >  >
> >  > So if a user changes worktree without updating index, they will have
> >  > the same results as they do now: files are shown as modified if they
> >  > don't have CE_NO_CHECKOUT set. If those files do, they are considered
> >  > 'orphaned' or staled and are recommended to be removed/updated to
> >  > avoid unexpected consequences (not availble this this first half
> >  > series because that belongs to "git status").
> >
> >
> > I was actually thinking that there would be a file for "this is what the
> >  user wants to have checked out" (as opposed to the index, which must
> >  contain "this is what is checked out"), and the porcelain would instruct
> >  the plumbing as to what to do with the worktree (that the plumbing with
> >  then ignore, due to the index bit) based on this information.
> >
> >  The index obviously can't contain the user's full instructions for what
> >  should be checked out, because the user will want to say "I don't care
> >  about anything in Documentation/" and have this apply to
> >  Documentation/some-file-not-in-the-index, so that if this file is in the
> >  worktree, the user gets a warning.
> >
> >  I think you're doing this with core.defaultsparse, although you seem to
> >  allow the index to diverge from this easily.
> 
> Yes they can.
> 
> >
> >  The question, then, is what happens when the index and core.defaultsparse
> >  disagree, either because the porcelain supports causing it or because the
> >  user has simply editting the config file or used plumbing to modify the
> >  index. That is, (1) we have index entries that say that the worktree is
> >  ignored, and the rules don't say they're outside the sparse checkout; do
> >  we care whether we expect the worktree to be empty or match the index?
> >  And, (2) we have index entries that say we do care about them, but the
> >  rules say they're outside the sparse checkout; what happens with these?
> 
> The rule is CE_NO_CHECKOUT is king. core.defaultsparse only helps
> setting CE_NO_CHECKOUT on new entries when they enter the index.

This seems like a really bad idea to me. If you ask for a file that's 
outside your default area to be checked out, and then you switch branches 
and switch back, the file may or may not disappear (depending on whether 
the branch you switched to temporarily had it or not). Likewise, if you 
remove files, and then switch branches and back, the files may or may not 
reappear.

Of course, commands need to look at the index to determine what we've 
actually done with the worktree and index, but I think there should be 
some other location that is responsible for keeping track of what the user 
has asked for.

	-Daniel
*This .sig left intentionally blank*

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: What's cooking in git.git (Nov 2008, #06; Wed, 26)
  2008-12-08 19:41                             ` Daniel Barkalow
@ 2008-12-11 13:04                               ` Nguyen Thai Ngoc Duy
  2008-12-11 20:30                                 ` Daniel Barkalow
  0 siblings, 1 reply; 37+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2008-12-11 13:04 UTC (permalink / raw)
  To: Daniel Barkalow; +Cc: Junio C Hamano, Shawn O. Pearce, Johannes Schindelin, git

On 12/9/08, Daniel Barkalow <barkalow@iabervon.org> wrote:
>  >  - for "git grep", we ignore path with CE_NO_CHECKOUT (while using
>  > cache version for CE_VALID)
>
>
> Is this sufficient? I'd expect "git grep" to ignore paths that are outside
>  the checked-out region, even when searching an arbitrary tree, and even
>  when those files aren't in the index at all (i.e., the current commit
>  doesn't have them). That is, I'd expect core.defaultsparse or the
>  equivalent to limit the paths, normally giving this effect.

That's the point. CE_VALID does not define checkout area while
CE_NO_CHECKOUT does.  If an entry is CE_VALID, it is still in checkout
area. But if it is CE_NO_CHECKOUT, "git grep" should ignore that path.
core.defaultsparse has nothing to do here.

>  > >  The question, then, is what happens when the index and core.defaultsparse
>  > >  disagree, either because the porcelain supports causing it or because the
>  > >  user has simply editting the config file or used plumbing to modify the
>  > >  index. That is, (1) we have index entries that say that the worktree is
>  > >  ignored, and the rules don't say they're outside the sparse checkout; do
>  > >  we care whether we expect the worktree to be empty or match the index?
>  > >  And, (2) we have index entries that say we do care about them, but the
>  > >  rules say they're outside the sparse checkout; what happens with these?
>  >
>  > The rule is CE_NO_CHECKOUT is king. core.defaultsparse only helps
>  > setting CE_NO_CHECKOUT on new entries when they enter the index.
>
>
> This seems like a really bad idea to me. If you ask for a file that's
>  outside your default area to be checked out, and then you switch branches
>  and switch back, the file may or may not disappear (depending on whether
>  the branch you switched to temporarily had it or not). Likewise, if you
>  remove files, and then switch branches and back, the files may or may not
>  reappear.

Well, if you set core.defaultsparse properly, those files should
appear/disappear as you wish (and as of now if you define your
checkout area with "git checkout --{include-,exclude-,}sparse" then
core.defaultsparse should be updated accordingly). I don't say
core.defaultsparse is perfect.

Anyway how do you suppose the tool to do in your case (checkout,
switch away then switch back)?
-- 
Duy

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: What's cooking in git.git (Nov 2008, #06; Wed, 26)
  2008-12-11 13:04                               ` Nguyen Thai Ngoc Duy
@ 2008-12-11 20:30                                 ` Daniel Barkalow
  2008-12-12  1:41                                   ` Junio C Hamano
  2008-12-12 16:08                                   ` Nguyen Thai Ngoc Duy
  0 siblings, 2 replies; 37+ messages in thread
From: Daniel Barkalow @ 2008-12-11 20:30 UTC (permalink / raw)
  To: Nguyen Thai Ngoc Duy
  Cc: Junio C Hamano, Shawn O. Pearce, Johannes Schindelin, git

On Thu, 11 Dec 2008, Nguyen Thai Ngoc Duy wrote:

> On 12/9/08, Daniel Barkalow <barkalow@iabervon.org> wrote:
> >  >  - for "git grep", we ignore path with CE_NO_CHECKOUT (while using
> >  > cache version for CE_VALID)
> >
> >
> > Is this sufficient? I'd expect "git grep" to ignore paths that are outside
> >  the checked-out region, even when searching an arbitrary tree, and even
> >  when those files aren't in the index at all (i.e., the current commit
> >  doesn't have them). That is, I'd expect core.defaultsparse or the
> >  equivalent to limit the paths, normally giving this effect.
> 
> That's the point. CE_VALID does not define checkout area while
> CE_NO_CHECKOUT does.  If an entry is CE_VALID, it is still in checkout
> area. But if it is CE_NO_CHECKOUT, "git grep" should ignore that path.
> core.defaultsparse has nothing to do here.

My point is that the index cannot tell git grep whether it should search a 
path if the path isn't in the index. If I do a narrow checkout of only 
Documentation/, and I do "git grep foo", I won't see files that aren't in 
Documentation/; if I do "git grep foo origin/next", I think I shouldn't 
see files that aren't in Documentation/, and "new-program.c" isn't in my 
index at all, marked as CE_NO_CHECKOUT or otherwise, so git grep can't 
find out from the index whether that file is outside my area of interest. 
It needs to be able to determine that "only Documentation/ is in the 
checkout area" ignoring the details of the list of files in the working 
directory currently in or out of the area.

> >  > >  The question, then, is what happens when the index and core.defaultsparse
> >  > >  disagree, either because the porcelain supports causing it or because the
> >  > >  user has simply editting the config file or used plumbing to modify the
> >  > >  index. That is, (1) we have index entries that say that the worktree is
> >  > >  ignored, and the rules don't say they're outside the sparse checkout; do
> >  > >  we care whether we expect the worktree to be empty or match the index?
> >  > >  And, (2) we have index entries that say we do care about them, but the
> >  > >  rules say they're outside the sparse checkout; what happens with these?
> >  >
> >  > The rule is CE_NO_CHECKOUT is king. core.defaultsparse only helps
> >  > setting CE_NO_CHECKOUT on new entries when they enter the index.
> >
> >
> > This seems like a really bad idea to me. If you ask for a file that's
> >  outside your default area to be checked out, and then you switch branches
> >  and switch back, the file may or may not disappear (depending on whether
> >  the branch you switched to temporarily had it or not). Likewise, if you
> >  remove files, and then switch branches and back, the files may or may not
> >  reappear.
> 
> Well, if you set core.defaultsparse properly, those files should
> appear/disappear as you wish (and as of now if you define your
> checkout area with "git checkout --{include-,exclude-,}sparse" then
> core.defaultsparse should be updated accordingly). I don't say
> core.defaultsparse is perfect.

Right, so in order to get reasonable behavior, the user must use 
--{include,exclude}-sparse. I think that this should be the *default* 
behavior, and probably the *only porcelain-supported* behavior, because 
otherwise it's confusing.

	-Daniel
*This .sig left intentionally blank*

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: What's cooking in git.git (Nov 2008, #06; Wed, 26)
  2008-12-11 20:30                                 ` Daniel Barkalow
@ 2008-12-12  1:41                                   ` Junio C Hamano
  2008-12-12  2:40                                     ` Daniel Barkalow
  2008-12-12 16:13                                     ` Nguyen Thai Ngoc Duy
  2008-12-12 16:08                                   ` Nguyen Thai Ngoc Duy
  1 sibling, 2 replies; 37+ messages in thread
From: Junio C Hamano @ 2008-12-12  1:41 UTC (permalink / raw)
  To: Daniel Barkalow
  Cc: Nguyen Thai Ngoc Duy, Shawn O. Pearce, Johannes Schindelin, git

Daniel Barkalow <barkalow@iabervon.org> writes:

>> That's the point. CE_VALID does not define checkout area while
>> CE_NO_CHECKOUT does.  If an entry is CE_VALID, it is still in checkout
>> area. But if it is CE_NO_CHECKOUT, "git grep" should ignore that path.
>> core.defaultsparse has nothing to do here.
>
> My point is that the index cannot tell git grep whether it should search a 
> path if the path isn't in the index.

Let's step back a bit.  I think "git grep" that stays silent outside of
the checkout area when used to grep in the work tree or in the index is a
mistake.

The problem "sparse checkout" attempts to address is not this:

    I ran "git init && git add ." in /usr/src by mistake.  There is no
    reason for coreutils that is in /usr/src/coreutils and gnucash that is
    in /usr/src/gnucash to share the same development history nor their
    should be any ordering between commits in these two independent
    projects.  I should have done N separate "init & add" independently at
    one level deeper in the directory hierarchy, but I am too lazy to
    filter branch the resulting mess now.

At least, it should not be that, at least to me.

"Sparse" is "I am not going to modify the files in these areas, and I know
they do not need to be present for my purposes (e.g. build), so I do not
need copies in the work tree."  It still works on the whole tree structure
recorded in the commit, but gives you a way to work inside a sparsely
populated work tree, iow, without checking everything out.

So "git grep -e frotz Documentation/", whether you only check out
Documentation or the whole tree, should grep only in Documentation area,
and "git grep -e frotz" should grep in the whole tree, even if you happen
to have a sparse checkout.  By definition, a sparse checkout has no
modifications outside the checkout area, so whenever grep wants to look
for strings outside the checkout area it should pretend as if the same
content as what the index records is in the work tree.  This is consistent
with the way how "git diff" in a sparsely checked out work tree should
behave.

If you understand that, it is clear what "git grep -e frotz HEAD^" should
do.  No checkout area is involved.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: What's cooking in git.git (Nov 2008, #06; Wed, 26)
  2008-12-12  1:41                                   ` Junio C Hamano
@ 2008-12-12  2:40                                     ` Daniel Barkalow
  2008-12-12  3:12                                       ` Junio C Hamano
  2008-12-12 16:13                                     ` Nguyen Thai Ngoc Duy
  1 sibling, 1 reply; 37+ messages in thread
From: Daniel Barkalow @ 2008-12-12  2:40 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Nguyen Thai Ngoc Duy, Shawn O. Pearce, Johannes Schindelin, git

On Thu, 11 Dec 2008, Junio C Hamano wrote:

> Daniel Barkalow <barkalow@iabervon.org> writes:
> 
> >> That's the point. CE_VALID does not define checkout area while
> >> CE_NO_CHECKOUT does.  If an entry is CE_VALID, it is still in checkout
> >> area. But if it is CE_NO_CHECKOUT, "git grep" should ignore that path.
> >> core.defaultsparse has nothing to do here.
> >
> > My point is that the index cannot tell git grep whether it should search a 
> > path if the path isn't in the index.
> 
> Let's step back a bit.  I think "git grep" that stays silent outside of
> the checkout area when used to grep in the work tree or in the index is a
> mistake.
> 
> The problem "sparse checkout" attempts to address is not this:
> 
>     I ran "git init && git add ." in /usr/src by mistake.  There is no
>     reason for coreutils that is in /usr/src/coreutils and gnucash that is
>     in /usr/src/gnucash to share the same development history nor their
>     should be any ordering between commits in these two independent
>     projects.  I should have done N separate "init & add" independently at
>     one level deeper in the directory hierarchy, but I am too lazy to
>     filter branch the resulting mess now.
> 
> At least, it should not be that, at least to me.
> 
> "Sparse" is "I am not going to modify the files in these areas, and I know
> they do not need to be present for my purposes (e.g. build), so I do not
> need copies in the work tree."  It still works on the whole tree structure
> recorded in the commit, but gives you a way to work inside a sparsely
> populated work tree, iow, without checking everything out.

There's the meta question of: "Do people who have declared that they 
aren't going to modify or build with some files want their searches to 
tell them about those files?"

Say I'm the "tr" guy, and I care about the build system, library code, and 
"tr.c", and I run "make tr"; my sparse checkout doesn't include "head.c", 
and I totally ignore all the other stuff that's in coreutils. Maybe I want 
"git grep" to exclude the other stuff.

I don't really have a firm position on whether "git grep" should ignore 
"head.c" or not, but I think it should be consistent between "git grep" 
and "git grep origin/next", and I think that, if origin/next has a new 
"foot.c" that isn't in the current branch to by marked as NO_CHECKOUT, it 
should be skipped if "tail.c" (which is in my current branch) is skipped.

> So "git grep -e frotz Documentation/", whether you only check out
> Documentation or the whole tree, should grep only in Documentation area,
> and "git grep -e frotz" should grep in the whole tree, even if you happen
> to have a sparse checkout.  By definition, a sparse checkout has no
> modifications outside the checkout area, so whenever grep wants to look
> for strings outside the checkout area it should pretend as if the same
> content as what the index records is in the work tree.  This is consistent
> with the way how "git diff" in a sparsely checked out work tree should
> behave.

"git diff" is an ambiguous model for "git grep". It equally describes 
the behavior of "git diff" to say that it treats files outside the 
checkout area as matching the index or to say that it never lists files 
outside the checkout area. On the other hand, there is the question of 
whether "git diff branch1 branch2" shows differences that are outside the 
checkout area, and whether "git log" shows commits that only change things 
outside the checkout area, and "git grep" should match the behavior of 
these.

	-Daniel
*This .sig left intentionally blank*

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: What's cooking in git.git (Nov 2008, #06; Wed, 26)
  2008-12-12  2:40                                     ` Daniel Barkalow
@ 2008-12-12  3:12                                       ` Junio C Hamano
  2008-12-12  3:36                                         ` Jeff King
  0 siblings, 1 reply; 37+ messages in thread
From: Junio C Hamano @ 2008-12-12  3:12 UTC (permalink / raw)
  To: Daniel Barkalow
  Cc: Nguyen Thai Ngoc Duy, Shawn O. Pearce, Johannes Schindelin, git

Daniel Barkalow <barkalow@iabervon.org> writes:

> "git diff" is an ambiguous model for "git grep". It equally describes 
> the behavior of "git diff" to say that it treats files outside the 
> checkout area as matching the index or to say that it never lists files 
> outside the checkout area. On the other hand, there is the question of 
> whether "git diff branch1 branch2" shows differences that are outside the 
> checkout area, and whether "git log" shows commits that only change things 
> outside the checkout area, and "git grep" should match the behavior of 
> these.

Sure, but as "sparse" does not (again, "it should not, at least to me")
change the fact that git is about tracking the history of whole tree, not
just a single file, nor just a subset of files, none of these operations
should be affected about what the checkout area is.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: What's cooking in git.git (Nov 2008, #06; Wed, 26)
  2008-12-12  3:12                                       ` Junio C Hamano
@ 2008-12-12  3:36                                         ` Jeff King
  0 siblings, 0 replies; 37+ messages in thread
From: Jeff King @ 2008-12-12  3:36 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Daniel Barkalow, Nguyen Thai Ngoc Duy, Shawn O. Pearce,
	Johannes Schindelin, git

On Thu, Dec 11, 2008 at 07:12:53PM -0800, Junio C Hamano wrote:

> Sure, but as "sparse" does not (again, "it should not, at least to me")
> change the fact that git is about tracking the history of whole tree, not
> just a single file, nor just a subset of files, none of these operations
> should be affected about what the checkout area is.

I agree with Junio here. If you want "git grep foo HEAD^" to ignore
certain files, then sparse _checkout_ is not the right feature. In that
case you want a sparse _repo_, which is not something I think anybody is
seriously working on.

-Peff

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: What's cooking in git.git (Nov 2008, #06; Wed, 26)
  2008-12-11 20:30                                 ` Daniel Barkalow
  2008-12-12  1:41                                   ` Junio C Hamano
@ 2008-12-12 16:08                                   ` Nguyen Thai Ngoc Duy
  1 sibling, 0 replies; 37+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2008-12-12 16:08 UTC (permalink / raw)
  To: Daniel Barkalow; +Cc: Junio C Hamano, Shawn O. Pearce, Johannes Schindelin, git

On 12/12/08, Daniel Barkalow <barkalow@iabervon.org> wrote:
>  > Well, if you set core.defaultsparse properly, those files should
>  > appear/disappear as you wish (and as of now if you define your
>  > checkout area with "git checkout --{include-,exclude-,}sparse" then
>  > core.defaultsparse should be updated accordingly). I don't say
>  > core.defaultsparse is perfect.
>
>
> Right, so in order to get reasonable behavior, the user must use
>  --{include,exclude}-sparse. I think that this should be the *default*
>  behavior, and probably the *only porcelain-supported* behavior, because
>  otherwise it's confusing.

It's pretty hard (or intrusive) to enforce such behaviour. How about
showing files that does not match core.defaultsparse in "git status"
along with instructions how to add them to core.defaultsparse? That
way people can keep it consistent and less modification to current
code.
-- 
Duy

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: What's cooking in git.git (Nov 2008, #06; Wed, 26)
  2008-12-12  1:41                                   ` Junio C Hamano
  2008-12-12  2:40                                     ` Daniel Barkalow
@ 2008-12-12 16:13                                     ` Nguyen Thai Ngoc Duy
  2008-12-12 16:45                                       ` Johannes Sixt
  2008-12-13  5:51                                       ` Junio C Hamano
  1 sibling, 2 replies; 37+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2008-12-12 16:13 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Daniel Barkalow, Shawn O. Pearce, Johannes Schindelin, git

On 12/12/08, Junio C Hamano <gitster@pobox.com> wrote:
>  So "git grep -e frotz Documentation/", whether you only check out
>  Documentation or the whole tree, should grep only in Documentation area,
>  and "git grep -e frotz" should grep in the whole tree, even if you happen
>  to have a sparse checkout.  By definition, a sparse checkout has no
>  modifications outside the checkout area, so whenever grep wants to look
>  for strings outside the checkout area it should pretend as if the same
>  content as what the index records is in the work tree.  This is consistent
>  with the way how "git diff" in a sparsely checked out work tree should
>  behave.

Assume someone is using sparse checkout with KDE git repository. They
sparse-checkout kdeutils module and do "git grep -e foo". I would
expect that the command only searches in kdeutils only (and is the
current behavior).
-- 
Duy

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: What's cooking in git.git (Nov 2008, #06; Wed, 26)
  2008-12-12 16:13                                     ` Nguyen Thai Ngoc Duy
@ 2008-12-12 16:45                                       ` Johannes Sixt
  2008-12-12 16:54                                         ` Nguyen Thai Ngoc Duy
  2008-12-13  5:51                                       ` Junio C Hamano
  1 sibling, 1 reply; 37+ messages in thread
From: Johannes Sixt @ 2008-12-12 16:45 UTC (permalink / raw)
  To: Nguyen Thai Ngoc Duy
  Cc: Junio C Hamano, Daniel Barkalow, Shawn O. Pearce,
	Johannes Schindelin, git

Nguyen Thai Ngoc Duy schrieb:
> On 12/12/08, Junio C Hamano <gitster@pobox.com> wrote:
>>  So "git grep -e frotz Documentation/", whether you only check out
>>  Documentation or the whole tree, should grep only in Documentation area,
>>  and "git grep -e frotz" should grep in the whole tree, even if you happen
>>  to have a sparse checkout.  By definition, a sparse checkout has no
>>  modifications outside the checkout area, so whenever grep wants to look
>>  for strings outside the checkout area it should pretend as if the same
>>  content as what the index records is in the work tree.  This is consistent
>>  with the way how "git diff" in a sparsely checked out work tree should
>>  behave.
> 
> Assume someone is using sparse checkout with KDE git repository. They
> sparse-checkout kdeutils module and do "git grep -e foo". I would
> expect that the command only searches in kdeutils only (and is the
> current behavior).

But what if the same persion notices a #define in a kdeutils header file
and want's to know whether it is unused in order to remove it:

    $ git grep FOO
    kdeutils/foo.h:#define FOO bar

Conclusion from this output: "It's only defined, but not used anywhere."
But this conclusion is not necessarily correct because FOO could be used
outside kdeutils.

So, no, "git grep" should disregard the checkout area.

-- Hannes

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: What's cooking in git.git (Nov 2008, #06; Wed, 26)
  2008-12-12 16:45                                       ` Johannes Sixt
@ 2008-12-12 16:54                                         ` Nguyen Thai Ngoc Duy
  2008-12-13  5:51                                           ` Junio C Hamano
  0 siblings, 1 reply; 37+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2008-12-12 16:54 UTC (permalink / raw)
  To: Johannes Sixt
  Cc: Junio C Hamano, Daniel Barkalow, Shawn O. Pearce,
	Johannes Schindelin, git

On 12/12/08, Johannes Sixt <j.sixt@viscovery.net> wrote:
> Nguyen Thai Ngoc Duy schrieb:
>
> > On 12/12/08, Junio C Hamano <gitster@pobox.com> wrote:
>  >>  So "git grep -e frotz Documentation/", whether you only check out
>  >>  Documentation or the whole tree, should grep only in Documentation area,
>  >>  and "git grep -e frotz" should grep in the whole tree, even if you happen
>  >>  to have a sparse checkout.  By definition, a sparse checkout has no
>  >>  modifications outside the checkout area, so whenever grep wants to look
>  >>  for strings outside the checkout area it should pretend as if the same
>  >>  content as what the index records is in the work tree.  This is consistent
>  >>  with the way how "git diff" in a sparsely checked out work tree should
>  >>  behave.
>  >
>  > Assume someone is using sparse checkout with KDE git repository. They
>  > sparse-checkout kdeutils module and do "git grep -e foo". I would
>  > expect that the command only searches in kdeutils only (and is the
>  > current behavior).
>
>
> But what if the same persion notices a #define in a kdeutils header file
>  and want's to know whether it is unused in order to remove it:
>
>     $ git grep FOO
>     kdeutils/foo.h:#define FOO bar

"git grep --cached FOO" ?

>  Conclusion from this output: "It's only defined, but not used anywhere."
>  But this conclusion is not necessarily correct because FOO could be used
>  outside kdeutils.
>
>  So, no, "git grep" should disregard the checkout area.
>
>  -- Hannes
>


-- 
Duy

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: What's cooking in git.git (Nov 2008, #06; Wed, 26)
  2008-12-12 16:54                                         ` Nguyen Thai Ngoc Duy
@ 2008-12-13  5:51                                           ` Junio C Hamano
  0 siblings, 0 replies; 37+ messages in thread
From: Junio C Hamano @ 2008-12-13  5:51 UTC (permalink / raw)
  To: Nguyen Thai Ngoc Duy
  Cc: Johannes Sixt, Junio C Hamano, Daniel Barkalow, Shawn O. Pearce,
	Johannes Schindelin, git

"Nguyen Thai Ngoc Duy" <pclouds@gmail.com> writes:

> On 12/12/08, Johannes Sixt <j.sixt@viscovery.net> wrote:
> ...
>> But what if the same persion notices a #define in a kdeutils header file
>>  and want's to know whether it is unused in order to remove it:
>>
>>     $ git grep FOO
>>     kdeutils/foo.h:#define FOO bar
>
> "git grep --cached FOO" ?

That should behave identically when the work tree does not have change
since the index, and by definition paths outside the checkout area in the
"sparse" mode cannot have changes, so "git grep FOO" should behave the
same and should find it.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: What's cooking in git.git (Nov 2008, #06; Wed, 26)
  2008-12-12 16:13                                     ` Nguyen Thai Ngoc Duy
  2008-12-12 16:45                                       ` Johannes Sixt
@ 2008-12-13  5:51                                       ` Junio C Hamano
  1 sibling, 0 replies; 37+ messages in thread
From: Junio C Hamano @ 2008-12-13  5:51 UTC (permalink / raw)
  To: Nguyen Thai Ngoc Duy
  Cc: Daniel Barkalow, Shawn O. Pearce, Johannes Schindelin, git

"Nguyen Thai Ngoc Duy" <pclouds@gmail.com> writes:

> On 12/12/08, Junio C Hamano <gitster@pobox.com> wrote:
>>  So "git grep -e frotz Documentation/", whether you only check out
>>  Documentation or the whole tree, should grep only in Documentation area,
>>  and "git grep -e frotz" should grep in the whole tree, even if you happen
>>  to have a sparse checkout.  By definition, a sparse checkout has no
>>  modifications outside the checkout area, so whenever grep wants to look
>>  for strings outside the checkout area it should pretend as if the same
>>  content as what the index records is in the work tree.  This is consistent
>>  with the way how "git diff" in a sparsely checked out work tree should
>>  behave.
>
> Assume someone is using sparse checkout with KDE git repository. They
> sparse-checkout kdeutils module and do "git grep -e foo". I would
> expect that the command only searches in kdeutils only (and is the
> current behavior).

Yes it is the "current in next" behaviour, and no that is not what you
should expect, and that is why I earlier said it is a mistake.  The
ability to choose which part to leave out of the working tree should not
change the fact that git is about managing the history of the whole tree,
not an individual file nor a subset of files.

I do not think it is unreasonable to have a mechanism to let the user
limit the area of the whole tree often used Porcelain commands look at.
We already have pathspec "git grep -e foo kdeutils/" mechanism that lets
you do such limiting.  It is conceivable that some workflows _might_ find
having the default pathspec convenient in end-user initiated operations,
but then it would be convenient whether the end-user uses the sparse
checkout to limit the area to kdeutils/ or has the whole checkout.
Although I think it would be Okay to default the default pathspec match
the checkout area when the sparse checkout feature is in use, I think the
"checkout area" and "area of interest" should be two independent concepts.

I said "_might_" in the above because I do not think it is such a good
idea to have _the_ default pathspec to begin with, though.  It would
probably be more useful to allow people to use shorthands to pathspecs,
and at that point you can use usual shell variables to do that already,
e.g. instead of having to say "git grep -e foo arch/x86 include/asm-i386",
you would say "git grep -e foo $i386", after "i386=arch/x86 include/asm-i386",
or something like that.

^ permalink raw reply	[flat|nested] 37+ messages in thread

end of thread, other threads:[~2008-12-13  5:53 UTC | newest]

Thread overview: 37+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-11-27  0:28 What's cooking in git.git (Nov 2008, #06; Wed, 26) Junio C Hamano
2008-11-27 22:49 ` Johannes Schindelin
2008-11-28  2:06   ` Junio C Hamano
2008-11-28 11:47     ` Johannes Schindelin
2008-11-28 19:20       ` Shawn O. Pearce
2008-11-29  0:13         ` Junio C Hamano
2008-11-29  0:15           ` [PATCH] git add --intent-to-add: fix removal of cached emptiness Junio C Hamano
2008-11-29  3:51             ` [PATCH 1/3] builtin-rm.c: explain and clarify the "local change" logic Junio C Hamano
2008-11-29  3:55               ` [PATCH 2/3] git add --intent-to-add: fix removal of cached emptiness Junio C Hamano
2008-11-29 15:38                 ` Sverre Rabbelier
2008-11-30 19:21                   ` Jeff King
2008-11-29  3:56               ` [PATCH 3/3] git add --intent-to-add: do not let an empty blob committed by accident Junio C Hamano
2008-11-30 19:14                 ` Jeff King
2008-12-01  9:24                   ` Junio C Hamano
2008-11-29  1:25           ` What's cooking in git.git (Nov 2008, #06; Wed, 26) Daniel Barkalow
2008-11-29 13:02             ` Nguyen Thai Ngoc Duy
2008-11-30 10:29               ` Nguyen Thai Ngoc Duy
2008-11-30 21:26                 ` Daniel Barkalow
2008-12-06 17:26                   ` Nguyen Thai Ngoc Duy
2008-12-06 18:39                     ` Daniel Barkalow
2008-12-07 12:27                       ` Nguyen Thai Ngoc Duy
2008-12-07 21:26                         ` Daniel Barkalow
2008-12-08 12:51                           ` Nguyen Thai Ngoc Duy
2008-12-08 19:41                             ` Daniel Barkalow
2008-12-11 13:04                               ` Nguyen Thai Ngoc Duy
2008-12-11 20:30                                 ` Daniel Barkalow
2008-12-12  1:41                                   ` Junio C Hamano
2008-12-12  2:40                                     ` Daniel Barkalow
2008-12-12  3:12                                       ` Junio C Hamano
2008-12-12  3:36                                         ` Jeff King
2008-12-12 16:13                                     ` Nguyen Thai Ngoc Duy
2008-12-12 16:45                                       ` Johannes Sixt
2008-12-12 16:54                                         ` Nguyen Thai Ngoc Duy
2008-12-13  5:51                                           ` Junio C Hamano
2008-12-13  5:51                                       ` Junio C Hamano
2008-12-12 16:08                                   ` Nguyen Thai Ngoc Duy
2008-12-07  3:45                   ` Junio C Hamano

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.