git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH (WIP)] Show a dirty working tree and a detached HEAD in status for submodule
@ 2010-01-11 22:05 Jens Lehmann
  2010-01-11 22:45 ` Junio C Hamano
  0 siblings, 1 reply; 14+ messages in thread
From: Jens Lehmann @ 2010-01-11 22:05 UTC (permalink / raw)
  To: Git Mailing List
  Cc: Junio C Hamano, Johannes Schindelin, Shawn O. Pearce,
	Heiko Voigt, Lars Hjemli

Until now a submodule only showed up as changed in the supermodule when
the last commit in the submodule differed from the one in the index or
the last commit of the superproject. A dirty working tree or a detached
HEAD in a submodule were just ignored when looking at it from the
superproject.

This patch shows these changes when using git status or one of the diff
commands which compare against the working tree in the superproject.

Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
---


This is the first version of a patch letting git status and the git
diff family show dirty working directories and a detached HEAD in
directories. It is not intended to be merged in its present form but
to be used as a starting point for discussion if this is going in
the right direction.


What the patch does:

* It makes git show submodules as modified in the superproject when
  one or more of these conditions are met:

    a) The submodule contains untracked files
    b) The submodule contains modified files
    c) The submodules HEAD is not on a local or remote branch

  That can be seen when using either "git status", "git diff[-files]"
  & "git diff[-index] HEAD" (and with "git gui" & gitk).


What the patch doesn't do (yet):

* It still breaks tests t7400-submodule-basic.sh &
  t7407-submodule-foreach.sh.

* It doesn't give detailed output when doing a "git diff* -p" with or
  without the --submodule option. It should show something like

    diff --git a/sub b/sub
    index 5431f52..3f35670 160000
    --- a/sub
    +++ b/sub
    @@ -1 +1 @@
    -Subproject commit 5431f529197f3831cdfbba1354a819a79f948f6f
    +Subproject commit 3f356705649b5d566d97ff843cf193359229a453-dirty

  for "git diff* -p" (notice the "-dirty" in the last line) or

      Submodule sub contains untracked files
      Submodule sub contains modified files
      Submodule sub contains a HEAD not on any branch
      Submodule sub 5431f52..3f35670:
      > commit message 1

  when using the --submodule option of the diff family.

* This behavior is not configurable but activated by default. A config
  option is needed here.

* It doesn't give optimal performance:

  - Apart from the fact that checking submodules this way will always
    be slower than ignoring their changes as git does until now, doing
    two run_command() calls for each submodule is not going to help at
    all (especially when running on Windows).

  - AFAICS the check for a detached HEAD would be faster for the most
    probable case if it would check against remotes/origin/master first.
    And it could stop when the first branch was found instead of
    continuing to look for others too as "git branch --contains" does.

  - Similar for the test for a dirty working directory, no need to have
    the full list of new and modified files, it could stop at the first
    one it finds.

  - If no detailed output is wanted the examination of HEAD and the
    working directory is not necessary when the HEAD and the commit in
    the index of the superproject already don't match.


What do you think?




 diff-lib.c                  |    7 +++-
 submodule.c                 |  103 +++++++++++++++++++++++++++++++++++++++++++
 submodule.h                 |    1 +
 t/t7506-status-submodule.sh |   45 ++++++++++++++++++-
 4 files changed, 154 insertions(+), 2 deletions(-)

diff --git a/diff-lib.c b/diff-lib.c
index 1c7e652..323305a 100644
--- a/diff-lib.c
+++ b/diff-lib.c
@@ -159,7 +159,7 @@ int run_diff_files(struct rev_info *revs, unsigned int option)
 				continue;
 		}

-		if (ce_uptodate(ce) || ce_skip_worktree(ce))
+		if ((ce_uptodate(ce) && !S_ISGITLINK(ce->ce_mode)) || ce_skip_worktree(ce))
 			continue;

 		/* If CE_VALID is set, don't look at workdir for file removal */
@@ -176,6 +176,11 @@ int run_diff_files(struct rev_info *revs, unsigned int option)
 			continue;
 		}
 		changed = ce_match_stat(ce, &st, ce_option);
+		if (S_ISGITLINK(ce->ce_mode)) {
+			/* TODO: This should not be executed when the submodule is changed
+			 * and only short ouptut is wanted for performance reasons. */
+			changed |= is_submodule_modified(ce->name);
+		}
 		if (!changed) {
 			ce_mark_uptodate(ce);
 			if (!DIFF_OPT_TST(&revs->diffopt, FIND_COPIES_HARDER))
diff --git a/submodule.c b/submodule.c
index 86aad65..b35f1b3 100644
--- a/submodule.c
+++ b/submodule.c
@@ -4,6 +4,7 @@
 #include "diff.h"
 #include "commit.h"
 #include "revision.h"
+#include "run-command.h"

 int add_submodule_odb(const char *path)
 {
@@ -112,3 +113,105 @@ void show_submodule_summary(FILE *f, const char *path,
 	}
 	strbuf_release(&sb);
 }
+
+static int is_submodule_head_detached(const char *path)
+{
+	int retval, len;
+	struct child_process branch;
+	const char *argv[] = {
+		"branch",
+		"-a",
+		"--contains",
+		"HEAD",
+		NULL,
+	};
+	char *env[3];
+	struct strbuf buf = STRBUF_INIT;
+
+	strbuf_addf(&buf, "GIT_WORK_TREE=%s", path);
+	env[0] = strbuf_detach(&buf, NULL);
+	strbuf_addf(&buf, "GIT_DIR=%s/.git", path);
+	env[1] = strbuf_detach(&buf, NULL);
+	env[2] = NULL;
+
+	memset(&branch, 0, sizeof(branch));
+	branch.argv = argv;
+	branch.env = (const char *const *)env;
+	branch.git_cmd = 1;
+	branch.no_stdin = 1;
+	branch.out = -1;
+	if (start_command(&branch))
+		die("Could not run git branch -a --contains HEAD");
+
+	len = strbuf_read(&buf, branch.out, 1024);
+	close(branch.out);
+
+	if (finish_command(&branch))
+		die("git branch -a --contains HEAD failed");
+
+	retval = (strncmp(buf.buf, "* (no branch)", 13) == 0);
+
+	free(env[0]);
+	free(env[1]);
+	strbuf_release(&buf);
+	return retval;
+}
+
+static int is_submodule_working_directory_dirty(const char *path)
+{
+	int len;
+	struct child_process branch;
+	const char *argv[] = {
+		"status",
+		"--porcelain",
+		NULL,
+	};
+	char *env[3];
+	struct strbuf buf = STRBUF_INIT;
+
+	strbuf_addf(&buf, "GIT_WORK_TREE=%s", path);
+	env[0] = strbuf_detach(&buf, NULL);
+	strbuf_addf(&buf, "GIT_DIR=%s/.git", path);
+	env[1] = strbuf_detach(&buf, NULL);
+	env[2] = NULL;
+
+	memset(&branch, 0, sizeof(branch));
+	branch.argv = argv;
+	branch.env = (const char *const *)env;
+	branch.git_cmd = 1;
+	branch.no_stdin = 1;
+	branch.out = -1;
+	if (start_command(&branch))
+		die("Could not run git status --porcelain");
+
+	len = strbuf_read(&buf, branch.out, 1024);
+	close(branch.out);
+
+	if (finish_command(&branch))
+		die("git status --porcelain failed");
+
+	free(env[0]);
+	free(env[1]);
+	strbuf_release(&buf);
+	return len != 0;
+}
+
+int is_submodule_modified(const char *path)
+{
+	struct strbuf buffer = STRBUF_INIT;
+
+	strbuf_addf(&buffer, "%s/.git/", path);
+	if (!is_directory(buffer.buf)) {
+		strbuf_release(&buffer);
+		return 0;
+	}
+	strbuf_release(&buffer);
+
+	if (is_submodule_head_detached(path))
+		return 1;
+
+	if (is_submodule_working_directory_dirty(path))
+		return 1;
+
+	return 0;
+}
diff --git a/submodule.h b/submodule.h
index 4c0269d..0773121 100644
--- a/submodule.h
+++ b/submodule.h
@@ -4,5 +4,6 @@
 void show_submodule_summary(FILE *f, const char *path,
 		unsigned char one[20], unsigned char two[20],
 		const char *del, const char *add, const char *reset);
+int is_submodule_modified(const char *path);

 #endif
diff --git a/t/t7506-status-submodule.sh b/t/t7506-status-submodule.sh
index 3ca17ab..509754a 100755
--- a/t/t7506-status-submodule.sh
+++ b/t/t7506-status-submodule.sh
@@ -10,8 +10,12 @@ test_expect_success 'setup' '
 	: >bar &&
 	git add bar &&
 	git commit -m " Add bar" &&
+	: >foo &&
+	git add foo &&
+	git commit -m " Add foo" &&
 	cd .. &&
-	git add sub &&
+	echo output > .gitignore
+	git add sub .gitignore &&
 	git commit -m "Add submodule sub"
 '

@@ -23,6 +27,45 @@ test_expect_success 'commit --dry-run -a clean' '
 	git commit --dry-run -a |
 	grep "nothing to commit"
 '
+
+echo "changed" > sub/foo
+test_expect_success 'status with modified file in submodule' '
+	git status | grep "modified:   sub"
+'
+test_expect_success 'status with modified file in submodule (porcelain)' '
+	git status --porcelain >output &&
+	diff output - <<-EOF
+ M sub
+EOF
+'
+(cd sub && git checkout foo)
+
+echo "content" > sub/new-file
+test_expect_success 'status with untracked file in submodule' '
+	git status | grep "modified:   sub"
+'
+test_expect_success 'status with untracked file in submodule (porcelain)' '
+	git status --porcelain >output &&
+	diff output - <<-EOF
+ M sub
+EOF
+'
+rm sub/new-file
+
+(cd sub && 2>/dev/null
+old_head=$(cat .git/refs/heads/master) &&
+git reset --hard HEAD^ &&
+git checkout $old_head 2>/dev/null)
+test_expect_success 'status with detatched HEAD in submodule' '
+	git status | grep "modified:   sub"
+'
+test_expect_success 'status with detatched HEAD in submodule (porcelain)' '
+	git status --porcelain >output &&
+	diff output - <<-EOF
+ M sub
+EOF
+'
+
 test_expect_success 'rm submodule contents' '
 	rm -rf sub/* sub/.git
 '
-- 
1.6.6.203.g6b27d.dirty

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [RFC PATCH (WIP)] Show a dirty working tree and a detached HEAD in status for submodule
  2010-01-11 22:05 [RFC PATCH (WIP)] Show a dirty working tree and a detached HEAD in status for submodule Jens Lehmann
@ 2010-01-11 22:45 ` Junio C Hamano
  2010-01-12 16:20   ` Jens Lehmann
  0 siblings, 1 reply; 14+ messages in thread
From: Junio C Hamano @ 2010-01-11 22:45 UTC (permalink / raw)
  To: Jens Lehmann
  Cc: Git Mailing List, Junio C Hamano, Johannes Schindelin,
	Shawn O. Pearce, Heiko Voigt, Lars Hjemli

Jens Lehmann <Jens.Lehmann@web.de> writes:

> Until now a submodule only showed up as changed in the supermodule when
> the last commit in the submodule differed from the one in the index or
> the last commit of the superproject. A dirty working tree or a detached
> HEAD in a submodule were just ignored when looking at it from the
> superproject.
>
> This patch shows these changes when using git status or one of the diff
> commands which compare against the working tree in the superproject.
>
> Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
> ---
>
>
> This is the first version of a patch letting git status and the git
> diff family show dirty working directories and a detached HEAD in
> directories. It is not intended to be merged in its present form but
> to be used as a starting point for discussion if this is going in
> the right direction.
>
>
> What the patch does:
>
> * It makes git show submodules as modified in the superproject when
>   one or more of these conditions are met:
>
>     a) The submodule contains untracked files
>     b) The submodule contains modified files
>     c) The submodules HEAD is not on a local or remote branch
>
>   That can be seen when using either "git status", "git diff[-files]"
>   & "git diff[-index] HEAD" (and with "git gui" & gitk).

If the submodule is checked out, _and_ if the HEAD there, either detached
or not, does not agree with what the "other" one records (i.e. the commit
recorded in an entry in the index, or in the tree, that you are comparing
your work tree against), then it also should be considered modified.  I
don't think your (a)-(c) cover this case.

Also I don't understand why you want to treat (c) any specially at all.
Even if (c) is something we _should_ report, please do not call that as
"detached" in its implementation.  "detached HEAD" has a very precise
technical meaning, and can point at the same commit as a local or a remote
tracking branch, which is very different from the definition your
implementation seems to use.

> * This behavior is not configurable but activated by default. A config
>   option is needed here.

I doubt it.

My gut feeling is that this should be _always_ on for a submodule
directory that has been "submodule init/update".  The user is interested
in that particular submodule, and any change to it should be reported for
both classes of users.  Theose who meant to use the submodule read-only
need to be able to notice that they accidentally made the submodule dirty
before making a commit in the superproject.  Those who wanted to work in
submodule needs to know if the state is in sync with what they expect
before making a commit in the superproject.

That of course is provided if the unconditional check does not trigger for
submodules that the user hasn't "submodue init"ed; I think you did that
correctly at the beginning of your is_submodule_modified() implementation.

> +static int is_submodule_head_detached(const char *path)
> +{

I don't understand why you should care which branch the submodule happens
to be on, as long as the next commit you make in the superproject records
the commit that is checked out in the submodule.

Of course you may want to be careful when "pushing" the superproject
results out (i.e. you would want to push out the history leading to that
commit at the submodule HEAD in the submodule history), so that the people
who are pulling from the repository you are pushing into will have
everything available.

But the thing is, in a distributed environment, the submodule HEAD being
at the tip of _some_ branch (either local or remote) you have doesn't mean
anything to help them.  IOW, for protect others, you would need a check
when you _push out_ (either in 'push' or on the receiving end).

So I'd suggest dropping this condition in "status/diff" that is about
preparing to make the next commit in your _local_ history.

If "must be reachable from somewhere" is a condition worth caring about in
some context other than "status/diff", you can do an equivalent of:

    $ git rev-parse HEAD --not --all | git rev-list --stdin

and see if anything comes out (in which case you have commits that are not
reachable from any of your refs other than the detached HEAD).

But that is not "is HEAD detached?"; it is something else.  "Dangling",
perhaps, as that is how "git fsck" call commits that are not reachable
from any of your refs ("fsck" considers HEAD a part of refs, so it is not
strictly correct but it is much closer).

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFC PATCH (WIP)] Show a dirty working tree and a detached HEAD in status for submodule
  2010-01-11 22:45 ` Junio C Hamano
@ 2010-01-12 16:20   ` Jens Lehmann
  2010-01-13  7:09     ` Junio C Hamano
  0 siblings, 1 reply; 14+ messages in thread
From: Jens Lehmann @ 2010-01-12 16:20 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Git Mailing List, Johannes Schindelin, Shawn O. Pearce,
	Heiko Voigt, Lars Hjemli

Am 11.01.2010 23:45, schrieb Junio C Hamano:
> Jens Lehmann <Jens.Lehmann@web.de> writes:
>> * It makes git show submodules as modified in the superproject when
>>   one or more of these conditions are met:
>>
>>     a) The submodule contains untracked files
>>     b) The submodule contains modified files
>>     c) The submodules HEAD is not on a local or remote branch
>>
>>   That can be seen when using either "git status", "git diff[-files]"
>>   & "git diff[-index] HEAD" (and with "git gui" & gitk).
> 
> If the submodule is checked out, _and_ if the HEAD there, either detached
> or not, does not agree with what the "other" one records (i.e. the commit
> recorded in an entry in the index, or in the tree, that you are comparing
> your work tree against), then it also should be considered modified.  I
> don't think your (a)-(c) cover this case.

Right, i did not to add the current (and unchanged) behavior to this
list, i just wrote down the new cases (and these new cases only come
into play when the submodule has been checked out).


> Also I don't understand why you want to treat (c) any specially at all.

To avoid possible loss of commits.

Before doing something like "git checkout -f" or "git reset --hard", it
is a good idea to check via "git status" if you have local changes. I
hope checkout and reset will recurse into submodules in the near future.
when they do, all commits in the submodule which are not on any branch
are lost (at least when the reflog expired). Or the remote branch the
user thinks the submodule is tracking has been deleted or rebased. You
might want to know that before e.g. committing it in the superproject.

Maybe compare it to new or modified files in a git repo: They don't
necessarily pose a problem when committing, you might be able to push
and clone the repo somewhere else and nothing breaks. But you wanna
know about these new and modifies files, in case you just forgot to add
them. So i think the HEAD of a submodule not on any branch is a bit like
a new or modified file in a regular repo, both will not show up in a
different repo than yours unless you do something about it. And a
modification is lost by a checkout or reset just as the dangling commits
will be.

Yes, this test can't provide 100% safety against loss of commits, but at
least we should try to warn if we can detect it. Does it give false
positives (saying the submodules HEAD is dangling when it shouldn't)?
I doubt it. Does it give false negatives? Yes, but we can't do anything
about that due to the distributed nature of git.


> Even if (c) is something we _should_ report, please do not call that as
> "detached" in its implementation.

Correct, that term is misleading in this context. Maybe call it
something like "The submodule contains a HEAD not on any branch"
then? Or "The submodule has a dangling HEAD"?


>> * This behavior is not configurable but activated by default. A config
>>   option is needed here.
> 
> I doubt it.
> 
> My gut feeling is that this should be _always_ on for a submodule
> directory that has been "submodule init/update".  The user is interested
> in that particular submodule, and any change to it should be reported for
> both classes of users.  Theose who meant to use the submodule read-only
> need to be able to notice that they accidentally made the submodule dirty
> before making a commit in the superproject.  Those who wanted to work in
> submodule needs to know if the state is in sync with what they expect
> before making a commit in the superproject.

Yes, me too thinks it should default to on for every initialized
submodule.

But this is a major change in behavior, so it might be a good idea to be
able to turn it off (e.g. if it breaks scripts). Maybe a config option
really isn't such a bright idea, but what about having something like a
"--no-dirty-submodules" command line option?


> That of course is provided if the unconditional check does not trigger for
> submodules that the user hasn't "submodue init"ed; I think you did that
> correctly at the beginning of your is_submodule_modified() implementation.

Yes, that's what that test is for. Will add a comment there.


> But the thing is, in a distributed environment, the submodule HEAD being
> at the tip of _some_ branch (either local or remote) you have doesn't mean
> anything to help them.  IOW, for protect others, you would need a check
> when you _push out_ (either in 'push' or on the receiving end).

This is something on my TODO list: Add a change to "git push" to assert
that all HEADs of initialized submodules lie on a /remote/ branch before
doing the push in the superproject.


> So I'd suggest dropping this condition in "status/diff" that is about
> preparing to make the next commit in your _local_ history.

I would rather have this patch merged without c) than not at all. But i
think it is a worthwhile and rather cheap test. And i would prefer to
change the default behavior of "git status" only once now and not again
later.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFC PATCH (WIP)] Show a dirty working tree and a detached HEAD in status for submodule
  2010-01-12 16:20   ` Jens Lehmann
@ 2010-01-13  7:09     ` Junio C Hamano
  2010-01-13 18:59       ` [PATCH] Show submodules as modified when they contain a dirty work tree Jens Lehmann
  0 siblings, 1 reply; 14+ messages in thread
From: Junio C Hamano @ 2010-01-13  7:09 UTC (permalink / raw)
  To: Jens Lehmann
  Cc: Git Mailing List, Johannes Schindelin, Shawn O. Pearce,
	Heiko Voigt, Lars Hjemli

Jens Lehmann <Jens.Lehmann@web.de> writes:

> To avoid possible loss of commits.
> ... Or the remote branch the
> user thinks the submodule is tracking has been deleted or rebased. You
> might want to know that before e.g. committing it in the superproject.

The interaction between recursive reset/checkout and gc in submodule
repository is something I didn't think about thoroughly, but I think you
are solving the issue in a wrong place at a wrong level.

We have so far neglected the object reachability analysis across
repositories.  When people use "clone --reference" to take advantage of
alternate object store, we tell them not to use any repository that
rewinds its branches as reference.  Putting it the other way around, we
tell owners of repositories that somebody borrows objects from not to run
gc, because objects they themselves do not use may be used by others who
borrow from them.  But we don't _enforce_ it; it is merely a social
convention.  We might want to do something about it, perhaps by having
typed back-pointers from a borrowed repository to borrowing repositories.

I think commits in submodules that are pointed at by _some_ superproject
commits are in the same situation.  Just like some objects in a borrowed
repository may have to be protected by refs in borrowing repositories,
some commits in a submodule repository (here I am only talking about
repositories that serve as distribution points---a private repository of a
user who is not interested in that particular submodule is excluded from
this discussion and may not follow the same rules) may have to be
protected by commits reachable by refs in its superproject repository.

The right solution to your "gc might soon remove objects from submodule
repository even when commits in the superproject uses them" may be to
teach "gc" run in the submodule repository to behave better.  After all,
even if the commit were reachable from some of the refs in the submodule
when you created a commit that points at it at the superproject level,
next "git submodule update" might rewind that ref in the submodule that
reached the commit and make it unreachable; your check to be careful at
commit time wouldn't buy you much, even if the check were 100% accurate.

Also in _your_ repository you may have the submodule commit with your
check at the time you make commit in the superproject, but if you forget
to push it out from the submodule before you push the commit out from the
superproject, other people will suffer from the same issue you are trying
to guard against.

In any case, this "where in the submodule's history does the commit being
recorded lie" does not have anything to do with "diff/status" that lets
you know "has it been _modified_?" at the level of the superproject.  As
you mentioned, it is more similar to "Untracked files" section in "status"
output of a flat project.  It doesn't belong to "diff", nor to "Changed
but not updated" section.

>>> * This behavior is not configurable but activated by default. A config
>>>   option is needed here.
>> 
>> I doubt it.
>> 
>> My gut feeling is that this should be _always_ on for a submodule
>> directory that has been "submodule init/update".  The user is interested
>> in that particular submodule, and any change to it should be reported for
>> both classes of users.  Those who meant to use the submodule read-only
>> need to be able to notice that they accidentally made the submodule dirty
>> before making a commit in the superproject.  Those who wanted to work in
>> submodule needs to know if the state is in sync with what they expect
>> before making a commit in the superproject.
>
> Yes, me too thinks it should default to on for every initialized
> submodule.
>
> But this is a major change in behavior, so it might be a good idea to be
> able to turn it off (e.g. if it breaks scripts).

If scripts are broken by this change, I think it is actually a good thing.

They've been operating happily as if everything is clean when some
submodule directories are *not*, and your patch starts showing something
closer to the reality, the changes they should care about.  If on the
other hand the scripts are willing to commit the index while leaving local
modifications behind, they will not be checking with "diff" output
(instead they will be checking "diff --cached") and you won't change the
output with your patch for that codepath, so they will keep working
happily.

>> But the thing is, in a distributed environment, the submodule HEAD being
>> at the tip of _some_ branch (either local or remote) you have doesn't mean
>> anything to help them.  IOW, for protect others, you would need a check
>> when you _push out_ (either in 'push' or on the receiving end).

The right place to do this check is at the receiving end. Just like they
can be (and by default are) configured to reject a non ff push into the
repository, the receiving end of the superproject can inspect the incoming
history and make sure that the commits bound at submodule paths are all
available to others that might want to fetch those superproject commits
from it in associated submodule repositories.  It would forbid people from
first pushing superproject and then pushing submodule projects, but I
think that is a better workflow anyway (you make sure prerequisites are
available in the submodule repositories to others first, and then make the
superproject commit that depends on the submodules).

You may also need to check at the receiving end when pushing into the
submodule repository; if the push is a non ff one, it _might_ lose commits
that are still needed by the associated superproject.

>> So I'd suggest dropping this condition in "status/diff" that is about
>> preparing to make the next commit in your _local_ history.
>
> I would rather have this patch merged without c) than not at all. But i
> think it is a worthwhile and rather cheap test.

Did I ever say "drop (c) because it is *too expensive*"?

I suggested to drop it because it is a _pointless_ check in the context of
the codepath, even though I was too polite to put it that bluntly in the
message you are responding to.

I think this topic of yours overall is going in the right direction.  I
have no issue with the intent to show submodules with local changes in
"Changed but not updated" section of status, or do the "+sha1-dirty" thing
in diff.

Thanks.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH] Show submodules as modified when they contain a dirty work tree
  2010-01-13  7:09     ` Junio C Hamano
@ 2010-01-13 18:59       ` Jens Lehmann
  2010-01-13 22:10         ` Junio C Hamano
  2010-01-15 13:32         ` [PATCH] Show submodules as modified when they contain a dirty work tree Nanako Shiraishi
  0 siblings, 2 replies; 14+ messages in thread
From: Jens Lehmann @ 2010-01-13 18:59 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Git Mailing List, Johannes Schindelin, Shawn O. Pearce,
	Heiko Voigt, Lars Hjemli

Until now a submodule only then showed up as modified in the supermodule
when the last commit in the submodule differed from the one in the index
or the diffed against commit of the superproject. A dirty work tree
containing new untracked or modified files in a submodule was
undetectable when looking at it from the superproject.

Now git status and git diff (against the work tree) in the superproject
will also display submodules as modified when they contain untracked or
modified files, even if the compared ref matches the HEAD of the
submodule.

Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
---

Thanks for your review, here is the updated patch. Changes to the RFC
version are:

  - Removed check for a dangling HEAD (now the testsuite runs fine)
  - Reworded the commit message
  - Inlined is_submodule_working_directory_dirty() into
    is_submodule_modified()
  - The new code will only be called when refs did match (when they
    didn't the submodule will already show up as modified)

What do you think?


 diff-lib.c                  |    4 ++-
 submodule.c                 |   49 +++++++++++++++++++++++++++++++++++++++++++
 submodule.h                 |    1 +
 t/t7506-status-submodule.sh |   31 ++++++++++++++++++++++++++-
 4 files changed, 83 insertions(+), 2 deletions(-)

diff --git a/diff-lib.c b/diff-lib.c
index 1c7e652..6918920 100644
--- a/diff-lib.c
+++ b/diff-lib.c
@@ -159,7 +159,7 @@ int run_diff_files(struct rev_info *revs, unsigned int option)
 				continue;
 		}

-		if (ce_uptodate(ce) || ce_skip_worktree(ce))
+		if ((ce_uptodate(ce) && !S_ISGITLINK(ce->ce_mode)) || ce_skip_worktree(ce))
 			continue;

 		/* If CE_VALID is set, don't look at workdir for file removal */
@@ -176,6 +176,8 @@ int run_diff_files(struct rev_info *revs, unsigned int option)
 			continue;
 		}
 		changed = ce_match_stat(ce, &st, ce_option);
+		if (S_ISGITLINK(ce->ce_mode) && !changed)
+			changed = is_submodule_modified(ce->name);
 		if (!changed) {
 			ce_mark_uptodate(ce);
 			if (!DIFF_OPT_TST(&revs->diffopt, FIND_COPIES_HARDER))
diff --git a/submodule.c b/submodule.c
index 86aad65..3f851de 100644
--- a/submodule.c
+++ b/submodule.c
@@ -4,6 +4,7 @@
 #include "diff.h"
 #include "commit.h"
 #include "revision.h"
+#include "run-command.h"

 int add_submodule_odb(const char *path)
 {
@@ -112,3 +113,51 @@ void show_submodule_summary(FILE *f, const char *path,
 	}
 	strbuf_release(&sb);
 }
+
+int is_submodule_modified(const char *path)
+{
+	int len;
+	struct child_process cp;
+	const char *argv[] = {
+		"status",
+		"--porcelain",
+		NULL,
+	};
+	char *env[3];
+	struct strbuf buf = STRBUF_INIT;
+
+	strbuf_addf(&buf, "%s/.git/", path);
+	if (!is_directory(buf.buf)) {
+		strbuf_release(&buf);
+		/* The submodule is not checked out, so it is not modified */
+		return 0;
+
+	}
+	strbuf_reset(&buf);
+
+	strbuf_addf(&buf, "GIT_WORK_TREE=%s", path);
+	env[0] = strbuf_detach(&buf, NULL);
+	strbuf_addf(&buf, "GIT_DIR=%s/.git", path);
+	env[1] = strbuf_detach(&buf, NULL);
+	env[2] = NULL;
+
+	memset(&cp, 0, sizeof(cp));
+	cp.argv = argv;
+	cp.env = (const char *const *)env;
+	cp.git_cmd = 1;
+	cp.no_stdin = 1;
+	cp.out = -1;
+	if (start_command(&cp))
+		die("Could not run git status --porcelain");
+
+	len = strbuf_read(&buf, cp.out, 1024);
+	close(cp.out);
+
+	if (finish_command(&cp))
+		die("git status --porcelain failed");
+
+	free(env[0]);
+	free(env[1]);
+	strbuf_release(&buf);
+	return len != 0;
+}
diff --git a/submodule.h b/submodule.h
index 4c0269d..0773121 100644
--- a/submodule.h
+++ b/submodule.h
@@ -4,5 +4,6 @@
 void show_submodule_summary(FILE *f, const char *path,
 		unsigned char one[20], unsigned char two[20],
 		const char *del, const char *add, const char *reset);
+int is_submodule_modified(const char *path);

 #endif
diff --git a/t/t7506-status-submodule.sh b/t/t7506-status-submodule.sh
index 3ca17ab..47e205b 100755
--- a/t/t7506-status-submodule.sh
+++ b/t/t7506-status-submodule.sh
@@ -10,8 +10,12 @@ test_expect_success 'setup' '
 	: >bar &&
 	git add bar &&
 	git commit -m " Add bar" &&
+	: >foo &&
+	git add foo &&
+	git commit -m " Add foo" &&
 	cd .. &&
-	git add sub &&
+	echo output > .gitignore
+	git add sub .gitignore &&
 	git commit -m "Add submodule sub"
 '

@@ -23,6 +27,31 @@ test_expect_success 'commit --dry-run -a clean' '
 	git commit --dry-run -a |
 	grep "nothing to commit"
 '
+
+echo "changed" > sub/foo
+test_expect_success 'status with modified file in submodule' '
+	git status | grep "modified:   sub"
+'
+test_expect_success 'status with modified file in submodule (porcelain)' '
+	git status --porcelain >output &&
+	diff output - <<-EOF
+ M sub
+EOF
+'
+(cd sub && git checkout foo)
+
+echo "content" > sub/new-file
+test_expect_success 'status with untracked file in submodule' '
+	git status | grep "modified:   sub"
+'
+test_expect_success 'status with untracked file in submodule (porcelain)' '
+	git status --porcelain >output &&
+	diff output - <<-EOF
+ M sub
+EOF
+'
+rm sub/new-file
+
 test_expect_success 'rm submodule contents' '
 	rm -rf sub/* sub/.git
 '
-- 
1.6.6.203.g28a8ba.dirty

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH] Show submodules as modified when they contain a dirty work tree
  2010-01-13 18:59       ` [PATCH] Show submodules as modified when they contain a dirty work tree Jens Lehmann
@ 2010-01-13 22:10         ` Junio C Hamano
  2010-01-14  8:32           ` Jens Lehmann
  2010-01-14 21:38           ` Jens Lehmann
  2010-01-15 13:32         ` [PATCH] Show submodules as modified when they contain a dirty work tree Nanako Shiraishi
  1 sibling, 2 replies; 14+ messages in thread
From: Junio C Hamano @ 2010-01-13 22:10 UTC (permalink / raw)
  To: Jens Lehmann
  Cc: Git Mailing List, Johannes Schindelin, Shawn O. Pearce,
	Heiko Voigt, Lars Hjemli

Jens Lehmann <Jens.Lehmann@web.de> writes:

> Thanks for your review, here is the updated patch. Changes to the RFC
> version are:
>
>   - Removed check for a dangling HEAD (now the testsuite runs fine)
>   - Reworded the commit message
>   - Inlined is_submodule_working_directory_dirty() into
>     is_submodule_modified()
>   - The new code will only be called when refs did match (when they
>     didn't the submodule will already show up as modified)

Looking good.

I had to squash in '#include "submodule.h"' in diff-lib.c just after it
includes "refs.h", though.

And a patch to add:

>> * It doesn't give detailed output when doing a "git diff* -p" with or
>>   without the --submodule option. It should show something like
>> 
>>     diff --git a/sub b/sub
>>     index 5431f52..3f35670 160000
>>     --- a/sub
>>     +++ b/sub
>>     @@ -1 +1 @@
>>     -Subproject commit 5431f529197f3831cdfbba1354a819a79f948f6f
>>     +Subproject commit 3f356705649b5d566d97ff843cf193359229a453-dirty
>> 

would look like the attached.

I think a reasonable next step would be

 - Move the check for your condition (c) that we dropped from this round
   to wt-status.c;

 - Add wt_status_print_dangling_submodules() to wt-status.c, and use the
   above logic to produce a section "Submodules with Dangling HEAD" or
   something.

 - Call it in wt_status_print(), immediately before we check s->verbose
   and show the patch text under -v option.  "git status" now will warn
   about the condition (c).

 - Add a similar wt_shortstatus_print_dangling_submodules() and call it at
   the end of wt_shortstatus_print().

 - Update is_submodule_modified() in your patch thats reads the output
   from "status --porcelain", to *ignore* information about dangling
   submodules.  As we discussed, dangling submodules may be something the
   user cares about, but that is not something "diff" should.

-- >8 --
Subject: Teach diff that modified submodule directory is dirty

A diff run in superproject only compares the name of the commit object
bound at the submodule paths.  When we compare with a work tree and the
checked out submodule directory is dirty (e.g. has either staged or
unstaged changes, or has new files the user forgot to add to the index),
show the work tree side as "dirty".

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 diff.c                    |    9 ++++++-
 t/t4027-diff-submodule.sh |   49 ++++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 55 insertions(+), 3 deletions(-)

diff --git a/diff.c b/diff.c
index 04beb26..750c066 100644
--- a/diff.c
+++ b/diff.c
@@ -2029,9 +2029,14 @@ static int populate_from_stdin(struct diff_filespec *s)
 static int diff_populate_gitlink(struct diff_filespec *s, int size_only)
 {
 	int len;
-	char *data = xmalloc(100);
+	char *data = xmalloc(100), *dirty = "";
+
+	/* Are we looking at the work tree? */
+	if (!s->sha1_valid && is_submodule_modified(s->path))
+		dirty = "-dirty";
+
 	len = snprintf(data, 100,
-		"Subproject commit %s\n", sha1_to_hex(s->sha1));
+		       "Subproject commit %s%s\n", sha1_to_hex(s->sha1), dirty);
 	s->data = data;
 	s->size = len;
 	s->should_free = 1;
diff --git a/t/t4027-diff-submodule.sh b/t/t4027-diff-submodule.sh
index 5cf8924..bf8c980 100755
--- a/t/t4027-diff-submodule.sh
+++ b/t/t4027-diff-submodule.sh
@@ -32,7 +32,8 @@ test_expect_success setup '
 		cd sub &&
 		git rev-list HEAD
 	) &&
-	echo ":160000 160000 $3 $_z40 M	sub" >expect
+	echo ":160000 160000 $3 $_z40 M	sub" >expect &&
+	subtip=$3 subprev=$2
 '
 
 test_expect_success 'git diff --raw HEAD' '
@@ -50,6 +51,52 @@ test_expect_success 'git diff-files --raw' '
 	test_cmp expect actual.files
 '
 
+expect_from_to () {
+	printf "%sSubproject commit %s\n+Subproject commit %s\n" \
+		"-" "$1" "$2"
+}
+
+test_expect_success 'git diff HEAD' '
+	git diff HEAD >actual &&
+	sed -e "1,/^@@/d" actual >actual.body &&
+	expect_from_to >expect.body $subtip $subprev &&
+	test_cmp expect.body actual.body
+'
+
+test_expect_success 'git diff HEAD with dirty submodule (work tree)' '
+	echo >>sub/world &&
+	git diff HEAD >actual &&
+	sed -e "1,/^@@/d" actual >actual.body &&
+	expect_from_to >expect.body $subtip $subprev-dirty &&
+	test_cmp expect.body actual.body
+'
+
+test_expect_success 'git diff HEAD with dirty submodule (index)' '
+	(
+		cd sub &&
+		git reset --hard &&
+		echo >>world &&
+		git add world
+	) &&
+	git diff HEAD >actual &&
+	sed -e "1,/^@@/d" actual >actual.body &&
+	expect_from_to >expect.body $subtip $subprev-dirty &&
+	test_cmp expect.body actual.body
+'
+
+test_expect_success 'git diff HEAD with dirty submodule (untracked)' '
+	(
+		cd sub &&
+		git reset --hard &&
+		git clean -qfdx &&
+		>cruft
+	) &&
+	git diff HEAD >actual &&
+	sed -e "1,/^@@/d" actual >actual.body &&
+	expect_from_to >expect.body $subtip $subprev-dirty &&
+	test_cmp expect.body actual.body
+'
+
 test_expect_success 'git diff (empty submodule dir)' '
 	: >empty &&
 	rm -rf sub/* sub/.git &&

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH] Show submodules as modified when they contain a dirty work tree
  2010-01-13 22:10         ` Junio C Hamano
@ 2010-01-14  8:32           ` Jens Lehmann
  2010-01-14 21:38           ` Jens Lehmann
  1 sibling, 0 replies; 14+ messages in thread
From: Jens Lehmann @ 2010-01-14  8:32 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Git Mailing List, Johannes Schindelin, Shawn O. Pearce,
	Heiko Voigt, Lars Hjemli

Am 13.01.2010 23:10, schrieb Junio C Hamano:
> Jens Lehmann <Jens.Lehmann@web.de> writes:
> I had to squash in '#include "submodule.h"' in diff-lib.c just after it
> includes "refs.h", though.

Sorry, i seem to repeatedly have missed the compiler warning :-(


> And a patch to add:
> 
>>> * It doesn't give detailed output when doing a "git diff* -p" with or
>>>   without the --submodule option. It should show something like
>>>
>>>     diff --git a/sub b/sub
>>>     index 5431f52..3f35670 160000
>>>     --- a/sub
>>>     +++ b/sub
>>>     @@ -1 +1 @@
>>>     -Subproject commit 5431f529197f3831cdfbba1354a819a79f948f6f
>>>     +Subproject commit 3f356705649b5d566d97ff843cf193359229a453-dirty
>>>
> 
> would look like the attached.

Thanks!


> I think a reasonable next step would be
> 
>  - Move the check for your condition (c) that we dropped from this round
>    to wt-status.c;
> 
>  - Add wt_status_print_dangling_submodules() to wt-status.c, and use the
>    above logic to produce a section "Submodules with Dangling HEAD" or
>    something.
> 
>  - Call it in wt_status_print(), immediately before we check s->verbose
>    and show the patch text under -v option.  "git status" now will warn
>    about the condition (c).
> 
>  - Add a similar wt_shortstatus_print_dangling_submodules() and call it at
>    the end of wt_shortstatus_print().
> 
>  - Update is_submodule_modified() in your patch thats reads the output
>    from "status --porcelain", to *ignore* information about dangling
>    submodules.  As we discussed, dangling submodules may be something the
>    user cares about, but that is not something "diff" should.

Great, i will send patches when i have something to show.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Show submodules as modified when they contain a dirty work tree
  2010-01-13 22:10         ` Junio C Hamano
  2010-01-14  8:32           ` Jens Lehmann
@ 2010-01-14 21:38           ` Jens Lehmann
  2010-01-14 23:13             ` Junio C Hamano
  1 sibling, 1 reply; 14+ messages in thread
From: Jens Lehmann @ 2010-01-14 21:38 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Git Mailing List, Johannes Schindelin, Shawn O. Pearce,
	Heiko Voigt, Lars Hjemli

Am 13.01.2010 23:10, schrieb Junio C Hamano:
> And a patch to add:
> 
>>> * It doesn't give detailed output when doing a "git diff* -p" with or
>>>   without the --submodule option. It should show something like
>>>
>>>     diff --git a/sub b/sub
>>>     index 5431f52..3f35670 160000
>>>     --- a/sub
>>>     +++ b/sub
>>>     @@ -1 +1 @@
>>>     -Subproject commit 5431f529197f3831cdfbba1354a819a79f948f6f
>>>     +Subproject commit 3f356705649b5d566d97ff843cf193359229a453-dirty
>>>
> 
> would look like the attached.

Your patch did not show submodules as dirty when the refs were identical.
The following patch fixes that and extends the test to catch that too.

-- >8 --
Subject: Show a modified submodule directory as dirty even if the refs match

When the submodules HEAD and the ref committed in the HEAD of the
superproject were the same, "git diff[-index] HEAD" did not show the
submodule as dirty when it should.

Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
---
 diff-lib.c                |    3 ++-
 t/t4027-diff-submodule.sh |   35 +++++++++++++++++++++++++++++++++++
 2 files changed, 37 insertions(+), 1 deletions(-)

diff --git a/diff-lib.c b/diff-lib.c
index 5ce226b..9cdf6da 100644
--- a/diff-lib.c
+++ b/diff-lib.c
@@ -233,7 +233,8 @@ static int get_stat_data(struct cache_entry *ce,
 			return -1;
 		}
 		changed = ce_match_stat(ce, &st, 0);
-		if (changed) {
+		if (changed
+		    || (S_ISGITLINK(ce->ce_mode) && is_submodule_modified(ce->name))) {
 			mode = ce_mode_from_stat(ce, st.st_mode);
 			sha1 = null_sha1;
 		}
diff --git a/t/t4027-diff-submodule.sh b/t/t4027-diff-submodule.sh
index bf8c980..83c1914 100755
--- a/t/t4027-diff-submodule.sh
+++ b/t/t4027-diff-submodule.sh
@@ -97,6 +97,41 @@ test_expect_success 'git diff HEAD with dirty submodule (untracked)' '
 	test_cmp expect.body actual.body
 '

+test_expect_success 'git diff HEAD with dirty submodule (work tree, refs match)' '
+	git commit -m "x" sub &&
+	echo >>sub/world &&
+	git diff HEAD >actual &&
+	sed -e "1,/^@@/d" actual >actual.body &&
+	expect_from_to >expect.body $subprev $subprev-dirty &&
+	test_cmp expect.body actual.body
+'
+
+test_expect_success 'git diff HEAD with dirty submodule (index, refs match)' '
+	(
+		cd sub &&
+		git reset --hard &&
+		echo >>world &&
+		git add world
+	) &&
+	git diff HEAD >actual &&
+	sed -e "1,/^@@/d" actual >actual.body &&
+	expect_from_to >expect.body $subprev $subprev-dirty &&
+	test_cmp expect.body actual.body
+'
+
+test_expect_success 'git diff HEAD with dirty submodule (untracked, refs match)' '
+	(
+		cd sub &&
+		git reset --hard &&
+		git clean -qfdx &&
+		>cruft
+	) &&
+	git diff HEAD >actual &&
+	sed -e "1,/^@@/d" actual >actual.body &&
+	expect_from_to >expect.body $subprev $subprev-dirty &&
+	test_cmp expect.body actual.body
+'
+
 test_expect_success 'git diff (empty submodule dir)' '
 	: >empty &&
 	rm -rf sub/* sub/.git &&
-- 
1.6.6.294.g1f7f2.dirty

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH] Show submodules as modified when they contain a dirty work tree
  2010-01-14 21:38           ` Jens Lehmann
@ 2010-01-14 23:13             ` Junio C Hamano
  2010-01-15  0:24               ` Jens Lehmann
  2010-01-17 19:01               ` [PATCH] Performance optimization for detection of modified submodules Jens Lehmann
  0 siblings, 2 replies; 14+ messages in thread
From: Junio C Hamano @ 2010-01-14 23:13 UTC (permalink / raw)
  To: Jens Lehmann
  Cc: Git Mailing List, Johannes Schindelin, Shawn O. Pearce,
	Heiko Voigt, Lars Hjemli

Jens Lehmann <Jens.Lehmann@web.de> writes:

> Subject: Show a modified submodule directory as dirty even if the refs match
>
> When the submodules HEAD and the ref committed in the HEAD of the
> superproject were the same, "git diff[-index] HEAD" did not show the
> submodule as dirty when it should.
>
> Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
> ---
>  diff-lib.c                |    3 ++-
>  t/t4027-diff-submodule.sh |   35 +++++++++++++++++++++++++++++++++++
>  2 files changed, 37 insertions(+), 1 deletions(-)
>
> diff --git a/diff-lib.c b/diff-lib.c
> index 5ce226b..9cdf6da 100644
> --- a/diff-lib.c
> +++ b/diff-lib.c
> @@ -233,7 +233,8 @@ static int get_stat_data(struct cache_entry *ce,
>  			return -1;
>  		}
>  		changed = ce_match_stat(ce, &st, 0);
> -		if (changed) {
> +		if (changed
> +		    || (S_ISGITLINK(ce->ce_mode) && is_submodule_modified(ce->name))) {

You had a check in your previous patch that decides to call or skip
diff_change() based on is_submodule_modified() for diff-files, but forgot
to have the same for diff-index, which this patch does.  Perhaps we want
to squash this into 4519d9c (Show submodules as modified when they contain
a dirty work tree, 2010-01-13).

The existing code is a bit unfortunate; by the time we come to the output
routine, the information we found from is_submodule_modified() is lost;
that is why my "would look like this" patch calls is_submodule_modified().

We may want to add one parameter to diff_change() and diff_addremove(), to
tell them if the work-tree side (if we are comparing something with the
work tree) is a modified submodule, and add one bit to the diff_filespec
structure to record that in diff_change() and diff_addremove() (obviously
only when adding).  That way, my "would looks like this" patch needs to
check the result of is_submodule_modified() the front-ends left in the
filespec, instead of running it again.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Show submodules as modified when they contain a dirty work tree
  2010-01-14 23:13             ` Junio C Hamano
@ 2010-01-15  0:24               ` Jens Lehmann
  2010-01-17 19:01               ` [PATCH] Performance optimization for detection of modified submodules Jens Lehmann
  1 sibling, 0 replies; 14+ messages in thread
From: Jens Lehmann @ 2010-01-15  0:24 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Git Mailing List, Johannes Schindelin, Shawn O. Pearce,
	Heiko Voigt, Lars Hjemli

Am 15.01.2010 00:13, schrieb Junio C Hamano:
> Jens Lehmann <Jens.Lehmann@web.de> writes:
> 
>> Subject: Show a modified submodule directory as dirty even if the refs match
>>
>> When the submodules HEAD and the ref committed in the HEAD of the
>> superproject were the same, "git diff[-index] HEAD" did not show the
>> submodule as dirty when it should.
>>
>> Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
>> ---
>>  diff-lib.c                |    3 ++-
>>  t/t4027-diff-submodule.sh |   35 +++++++++++++++++++++++++++++++++++
>>  2 files changed, 37 insertions(+), 1 deletions(-)
>>
>> diff --git a/diff-lib.c b/diff-lib.c
>> index 5ce226b..9cdf6da 100644
>> --- a/diff-lib.c
>> +++ b/diff-lib.c
>> @@ -233,7 +233,8 @@ static int get_stat_data(struct cache_entry *ce,
>>  			return -1;
>>  		}
>>  		changed = ce_match_stat(ce, &st, 0);
>> -		if (changed) {
>> +		if (changed
>> +		    || (S_ISGITLINK(ce->ce_mode) && is_submodule_modified(ce->name))) {
> 
> You had a check in your previous patch that decides to call or skip
> diff_change() based on is_submodule_modified() for diff-files, but forgot
> to have the same for diff-index, which this patch does.  Perhaps we want
> to squash this into 4519d9c (Show submodules as modified when they contain
> a dirty work tree, 2010-01-13).

Of course you are right, the change you quoted should have been in my
patch in the first place. So squashing it seems to be the right thing to
do (but AFAICS the tests i added might be a problem, as they use
expect_from_to() which your intermediate patch added. Maybe squash these
tests into your patch and the diff you quoted above into mine?).


> The existing code is a bit unfortunate; by the time we come to the output
> routine, the information we found from is_submodule_modified() is lost;
> that is why my "would look like this" patch calls is_submodule_modified().
> 
> We may want to add one parameter to diff_change() and diff_addremove(), to
> tell them if the work-tree side (if we are comparing something with the
> work tree) is a modified submodule, and add one bit to the diff_filespec
> structure to record that in diff_change() and diff_addremove() (obviously
> only when adding).  That way, my "would looks like this" patch needs to
> check the result of is_submodule_modified() the front-ends left in the
> filespec, instead of running it again.

Good idea, i've been already exploring this line of thought too and came
to the same conclusion (i noticed that when calling plain "git diff" in a
repo with submodules, is_submodule_modified() gets called *three* times
for each submodule, which is not /that/ good for performance ;-). But i
intended to do this optimization in a subsequent patch (and in preparation
for "git diff --submodule" being able to print /how/ the submodule is
dirty without having to scan it again).

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Show submodules as modified when they contain a dirty work tree
  2010-01-13 18:59       ` [PATCH] Show submodules as modified when they contain a dirty work tree Jens Lehmann
  2010-01-13 22:10         ` Junio C Hamano
@ 2010-01-15 13:32         ` Nanako Shiraishi
  1 sibling, 0 replies; 14+ messages in thread
From: Nanako Shiraishi @ 2010-01-15 13:32 UTC (permalink / raw)
  To: Jens Lehmann
  Cc: Junio C Hamano, Git Mailing List, Johannes Schindelin,
	Shawn O. Pearce, Heiko Voigt, Lars Hjemli

Quoting Jens Lehmann <Jens.Lehmann@web.de>

> diff --git a/t/t7506-status-submodule.sh b/t/t7506-status-submodule.sh
> index 3ca17ab..47e205b 100755
> --- a/t/t7506-status-submodule.sh
> +++ b/t/t7506-status-submodule.sh
> @@ -10,8 +10,12 @@ test_expect_success 'setup' '
>  	: >bar &&
>  	git add bar &&
>  	git commit -m " Add bar" &&
> +	: >foo &&
> +	git add foo &&
> +	git commit -m " Add foo" &&
>  	cd .. &&
> -	git add sub &&
> +	echo output > .gitignore
> +	git add sub .gitignore &&
>  	git commit -m "Add submodule sub"
>  '

This is not a new problem you introduced, but if some commands
before 'cd ..' fails, the next test will run in 'sub'. Other
tests run operations inside () to avoid this problem.

> @@ -23,6 +27,31 @@ test_expect_success 'commit --dry-run -a clean' '
>  	git commit --dry-run -a |
>  	grep "nothing to commit"
>  '
> +
> +echo "changed" > sub/foo

Have it inside the next test_expect_success.

> +test_expect_success 'status with modified file in submodule' '
> +	git status | grep "modified:   sub"
> +'

To catch failure from 'git status' this is better written like this.
    git status >output &&
    grep "modified:   sub" output

> +test_expect_success 'status with modified file in submodule (porcelain)' '
> +	git status --porcelain >output &&
> +	diff output - <<-EOF
> + M sub
> +EOF
> +'

If you use -EOF you may want to align it with tab to make it
easier to read. The one in t7005-editor.sh is a good example
(t7401 is a bad example to imitate).

> +(cd sub && git checkout foo)
> +
> +echo "content" > sub/new-file

Move this part to the next test_expect_success to catch broken
checkout.

> +test_expect_success 'status with untracked file in submodule' '
> +	git status | grep "modified:   sub"
> +'

Same comment as before.

> +test_expect_success 'status with untracked file in submodule (porcelain)' '
> +	git status --porcelain >output &&
> +	diff output - <<-EOF
> + M sub
> +EOF
> +'

Same comment as before.

> +rm sub/new-file
> +

Do you need this? If so, move it inside the next
test_expect_success.

>  test_expect_success 'rm submodule contents' '
>  	rm -rf sub/* sub/.git
>  '
> -- 
> 1.6.6.203.g28a8ba.dirty

The following can be squashed to 4519d9cf092a173ac7b0a5570b0d5d602086ecf2

diff --git a/t/t7506-status-submodule.sh b/t/t7506-status-submodule.sh
index 47e205b..253c334 100755
--- a/t/t7506-status-submodule.sh
+++ b/t/t7506-status-submodule.sh
@@ -5,63 +5,87 @@ test_description='git status for submodule'
 . ./test-lib.sh
 
 test_expect_success 'setup' '
-	test_create_repo sub
-	cd sub &&
-	: >bar &&
-	git add bar &&
-	git commit -m " Add bar" &&
-	: >foo &&
-	git add foo &&
-	git commit -m " Add foo" &&
-	cd .. &&
-	echo output > .gitignore
+	test_create_repo sub &&
+	(
+		cd sub &&
+		: >bar &&
+		git add bar &&
+		git commit -m " Add bar" &&
+		: >foo &&
+		git add foo &&
+		git commit -m " Add foo"
+	) &&
+	echo output > .gitignore &&
 	git add sub .gitignore &&
 	git commit -m "Add submodule sub"
 '
 
 test_expect_success 'status clean' '
-	git status |
-	grep "nothing to commit"
+	git status >output &&
+	grep "nothing to commit" output
 '
+
 test_expect_success 'commit --dry-run -a clean' '
-	git commit --dry-run -a |
-	grep "nothing to commit"
+	test_must_fail git commit --dry-run -a >output &&
+	grep "nothing to commit" output
 '
 
-echo "changed" > sub/foo
 test_expect_success 'status with modified file in submodule' '
-	git status | grep "modified:   sub"
+	(cd sub && git reset --hard) &&
+	echo "changed" >sub/foo &&
+	git status >output &&
+	grep "modified:   sub" output
 '
+
 test_expect_success 'status with modified file in submodule (porcelain)' '
+	(cd sub && git reset --hard) &&
+	echo "changed" >sub/foo &&
+	git status --porcelain >output &&
+	diff output - <<-\EOF
+	 M sub
+	EOF
+'
+
+test_expect_success 'status with added file in submodule' '
+	(cd sub && git reset --hard && echo >foo && git add foo) &&
+	git status >output &&
+	grep "modified:   sub" output
+'
+
+test_expect_success 'status with added file in submodule (porcelain)' '
+	(cd sub && git reset --hard && echo >foo && git add foo) &&
 	git status --porcelain >output &&
-	diff output - <<-EOF
- M sub
-EOF
+	diff output - <<-\EOF
+	 M sub
+	EOF
 '
-(cd sub && git checkout foo)
 
-echo "content" > sub/new-file
 test_expect_success 'status with untracked file in submodule' '
-	git status | grep "modified:   sub"
+	(cd sub && git reset --hard) &&
+	echo "content" >sub/new-file &&
+	git status >output &&
+	grep "modified:   sub" output
 '
+
 test_expect_success 'status with untracked file in submodule (porcelain)' '
 	git status --porcelain >output &&
-	diff output - <<-EOF
- M sub
-EOF
+	diff output - <<-\EOF
+	 M sub
+	EOF
 '
-rm sub/new-file
 
 test_expect_success 'rm submodule contents' '
 	rm -rf sub/* sub/.git
 '
+
 test_expect_success 'status clean (empty submodule dir)' '
-	git status |
-	grep "nothing to commit"
+	git status >output &&
+	grep "nothing to commit" output
 '
+
 test_expect_success 'status -a clean (empty submodule dir)' '
-	git commit --dry-run -a |
-	grep "nothing to commit"
+	test_must_fail git commit --dry-run -a >output &&
+	grep "nothing to commit" output
 '
 
 test_done


-- 
Nanako Shiraishi
http://ivory.ap.teacup.com/nanako3/

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH] Performance optimization for detection of modified submodules
  2010-01-14 23:13             ` Junio C Hamano
  2010-01-15  0:24               ` Jens Lehmann
@ 2010-01-17 19:01               ` Jens Lehmann
  2010-01-17 20:01                 ` Junio C Hamano
  1 sibling, 1 reply; 14+ messages in thread
From: Jens Lehmann @ 2010-01-17 19:01 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Git Mailing List, Johannes Schindelin, Shawn O. Pearce,
	Heiko Voigt, Lars Hjemli

In the worst case is_submodule_modified() got called three times for
each submodule. The information we got from scanning the whole
submodule tree the first time should not be thrown away.

A new parameter has been added to diff_change() and diff_addremove(),
the information is stored in a new member of struct diff_filespec. Its
value is then reused instead of calling is_submodule_modified() again.

Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
---


Am 15.01.2010 00:13, schrieb Junio C Hamano:
> The existing code is a bit unfortunate; by the time we come to the output
> routine, the information we found from is_submodule_modified() is lost;
> that is why my "would look like this" patch calls is_submodule_modified().
>
> We may want to add one parameter to diff_change() and diff_addremove(), to
> tell them if the work-tree side (if we are comparing something with the
> work tree) is a modified submodule, and add one bit to the diff_filespec
> structure to record that in diff_change() and diff_addremove() (obviously
> only when adding).  That way, my "would looks like this" patch needs to
> check the result of is_submodule_modified() the front-ends left in the
> filespec, instead of running it again.

So here is my first attempt of implementing your proposal. The test suite
runs fine, but a few more eyeballs would really be appreciated as i am not
very familiar with the code and its corner cases (See diff_change(), is it
sufficient to only set "two->dirty_submodule", even if the REVERSE_DIFF
option is set? Apart from that i am not so sure about the four changes to
tree-diff.c).

I think we could skip the call to is_submodule_modified() in
run_diff_files() and get_stat_data() when the changed flag is already
set and only short output (without calling diff_populate_gitlink(), e.g.
"git status -s" or "git diff-files") is requested. What do you think
about doing that in a seperate patch?



 diff-lib.c  |   42 +++++++++++++++++++++++++++---------------
 diff.c      |   11 +++++++----
 diff.h      |    8 ++++----
 diffcore.h  |    1 +
 revision.c  |    4 ++--
 tree-diff.c |    8 ++++----
 6 files changed, 45 insertions(+), 29 deletions(-)

diff --git a/diff-lib.c b/diff-lib.c
index 29c5915..5b18d86 100644
--- a/diff-lib.c
+++ b/diff-lib.c
@@ -73,6 +73,7 @@ int run_diff_files(struct rev_info *revs, unsigned int option)
 		unsigned int oldmode, newmode;
 		struct cache_entry *ce = active_cache[i];
 		int changed;
+		int dirty_submodule = 0;

 		if (DIFF_OPT_TST(&revs->diffopt, QUICK) &&
 			DIFF_OPT_TST(&revs->diffopt, HAS_CHANGES))
@@ -173,12 +174,14 @@ int run_diff_files(struct rev_info *revs, unsigned int option)
 			if (silent_on_removed)
 				continue;
 			diff_addremove(&revs->diffopt, '-', ce->ce_mode,
-				       ce->sha1, ce->name);
+				       ce->sha1, ce->name, 0);
 			continue;
 		}
 		changed = ce_match_stat(ce, &st, ce_option);
-		if (S_ISGITLINK(ce->ce_mode) && !changed)
-			changed = is_submodule_modified(ce->name);
+		if (S_ISGITLINK(ce->ce_mode) && is_submodule_modified(ce->name)) {
+			changed = 1;
+			dirty_submodule = 1;
+		}
 		if (!changed) {
 			ce_mark_uptodate(ce);
 			if (!DIFF_OPT_TST(&revs->diffopt, FIND_COPIES_HARDER))
@@ -188,7 +191,7 @@ int run_diff_files(struct rev_info *revs, unsigned int option)
 		newmode = ce_mode_from_stat(ce, st.st_mode);
 		diff_change(&revs->diffopt, oldmode, newmode,
 			    ce->sha1, (changed ? null_sha1 : ce->sha1),
-			    ce->name);
+			    ce->name, dirty_submodule);

 	}
 	diffcore_std(&revs->diffopt);
@@ -204,16 +207,18 @@ int run_diff_files(struct rev_info *revs, unsigned int option)
 static void diff_index_show_file(struct rev_info *revs,
 				 const char *prefix,
 				 struct cache_entry *ce,
-				 const unsigned char *sha1, unsigned int mode)
+				 const unsigned char *sha1, unsigned int mode,
+				 int dirty_submodule)
 {
 	diff_addremove(&revs->diffopt, prefix[0], mode,
-		       sha1, ce->name);
+		       sha1, ce->name, dirty_submodule);
 }

 static int get_stat_data(struct cache_entry *ce,
 			 const unsigned char **sha1p,
 			 unsigned int *modep,
-			 int cached, int match_missing)
+			 int cached, int match_missing,
+			 int *dirty_submodule)
 {
 	const unsigned char *sha1 = ce->sha1;
 	unsigned int mode = ce->ce_mode;
@@ -233,8 +238,11 @@ static int get_stat_data(struct cache_entry *ce,
 			return -1;
 		}
 		changed = ce_match_stat(ce, &st, 0);
-		if (changed
-		    || (S_ISGITLINK(ce->ce_mode) && is_submodule_modified(ce->name))) {
+		if (S_ISGITLINK(ce->ce_mode) && is_submodule_modified(ce->name)) {
+			changed = 1;
+			*dirty_submodule = 1;
+		}
+		if (changed) {
 			mode = ce_mode_from_stat(ce, st.st_mode);
 			sha1 = null_sha1;
 		}
@@ -251,15 +259,17 @@ static void show_new_file(struct rev_info *revs,
 {
 	const unsigned char *sha1;
 	unsigned int mode;
+	int dirty_submodule = 0;

 	/*
 	 * New file in the index: it might actually be different in
 	 * the working copy.
 	 */
-	if (get_stat_data(new, &sha1, &mode, cached, match_missing) < 0)
+	if (get_stat_data(new, &sha1, &mode, cached, match_missing,
+	    &dirty_submodule) < 0)
 		return;

-	diff_index_show_file(revs, "+", new, sha1, mode);
+	diff_index_show_file(revs, "+", new, sha1, mode, dirty_submodule);
 }

 static int show_modified(struct rev_info *revs,
@@ -270,11 +280,13 @@ static int show_modified(struct rev_info *revs,
 {
 	unsigned int mode, oldmode;
 	const unsigned char *sha1;
+	int dirty_submodule = 0;

-	if (get_stat_data(new, &sha1, &mode, cached, match_missing) < 0) {
+	if (get_stat_data(new, &sha1, &mode, cached, match_missing,
+			  &dirty_submodule) < 0) {
 		if (report_missing)
 			diff_index_show_file(revs, "-", old,
-					     old->sha1, old->ce_mode);
+					     old->sha1, old->ce_mode, 0);
 		return -1;
 	}

@@ -309,7 +321,7 @@ static int show_modified(struct rev_info *revs,
 		return 0;

 	diff_change(&revs->diffopt, oldmode, mode,
-		    old->sha1, sha1, old->name);
+		    old->sha1, sha1, old->name, dirty_submodule);
 	return 0;
 }

@@ -356,7 +368,7 @@ static void do_oneway_diff(struct unpack_trees_options *o,
 	 * Something removed from the tree?
 	 */
 	if (!idx) {
-		diff_index_show_file(revs, "-", tree, tree->sha1, tree->ce_mode);
+		diff_index_show_file(revs, "-", tree, tree->sha1, tree->ce_mode, 0);
 		return;
 	}

diff --git a/diff.c b/diff.c
index 012b3d3..490a7ec 100644
--- a/diff.c
+++ b/diff.c
@@ -2032,7 +2032,7 @@ static int diff_populate_gitlink(struct diff_filespec *s, int size_only)
 	char *data = xmalloc(100), *dirty = "";

 	/* Are we looking at the work tree? */
-	if (!s->sha1_valid && is_submodule_modified(s->path))
+	if (!s->sha1_valid && s->dirty_submodule)
 		dirty = "-dirty";

 	len = snprintf(data, 100,
@@ -3736,7 +3736,7 @@ int diff_result_code(struct diff_options *opt, int status)
 void diff_addremove(struct diff_options *options,
 		    int addremove, unsigned mode,
 		    const unsigned char *sha1,
-		    const char *concatpath)
+		    const char *concatpath, int dirty_submodule)
 {
 	struct diff_filespec *one, *two;

@@ -3768,8 +3768,10 @@ void diff_addremove(struct diff_options *options,

 	if (addremove != '+')
 		fill_filespec(one, sha1, mode);
-	if (addremove != '-')
+	if (addremove != '-') {
 		fill_filespec(two, sha1, mode);
+		two->dirty_submodule = dirty_submodule;
+	}

 	diff_queue(&diff_queued_diff, one, two);
 	if (!DIFF_OPT_TST(options, DIFF_FROM_CONTENTS))
@@ -3780,7 +3782,7 @@ void diff_change(struct diff_options *options,
 		 unsigned old_mode, unsigned new_mode,
 		 const unsigned char *old_sha1,
 		 const unsigned char *new_sha1,
-		 const char *concatpath)
+		 const char *concatpath, int dirty_submodule)
 {
 	struct diff_filespec *one, *two;

@@ -3803,6 +3805,7 @@ void diff_change(struct diff_options *options,
 	two = alloc_filespec(concatpath);
 	fill_filespec(one, old_sha1, old_mode);
 	fill_filespec(two, new_sha1, new_mode);
+	two->dirty_submodule = dirty_submodule;

 	diff_queue(&diff_queued_diff, one, two);
 	if (!DIFF_OPT_TST(options, DIFF_FROM_CONTENTS))
diff --git a/diff.h b/diff.h
index 06a9a88..13596c2 100644
--- a/diff.h
+++ b/diff.h
@@ -14,12 +14,12 @@ typedef void (*change_fn_t)(struct diff_options *options,
 		 unsigned old_mode, unsigned new_mode,
 		 const unsigned char *old_sha1,
 		 const unsigned char *new_sha1,
-		 const char *fullpath);
+		 const char *fullpath, int dirty_submodule);

 typedef void (*add_remove_fn_t)(struct diff_options *options,
 		    int addremove, unsigned mode,
 		    const unsigned char *sha1,
-		    const char *fullpath);
+		    const char *fullpath, int dirty_submodule);

 typedef void (*diff_format_fn_t)(struct diff_queue_struct *q,
 		struct diff_options *options, void *data);
@@ -177,13 +177,13 @@ extern void diff_addremove(struct diff_options *,
 			   int addremove,
 			   unsigned mode,
 			   const unsigned char *sha1,
-			   const char *fullpath);
+			   const char *fullpath, int dirty_submodule);

 extern void diff_change(struct diff_options *,
 			unsigned mode1, unsigned mode2,
 			const unsigned char *sha1,
 			const unsigned char *sha2,
-			const char *fullpath);
+			const char *fullpath, int dirty_submodule);

 extern void diff_unmerge(struct diff_options *,
 			 const char *path,
diff --git a/diffcore.h b/diffcore.h
index 5b63458..66687c3 100644
--- a/diffcore.h
+++ b/diffcore.h
@@ -42,6 +42,7 @@ struct diff_filespec {
 #define DIFF_FILE_VALID(spec) (((spec)->mode) != 0)
 	unsigned should_free : 1; /* data should be free()'ed */
 	unsigned should_munmap : 1; /* data should be munmap()'ed */
+	unsigned dirty_submodule : 1;  /* For submodules: its work tree is dirty */

 	struct userdiff_driver *driver;
 	/* data should be considered "binary"; -1 means "don't know yet" */
diff --git a/revision.c b/revision.c
index 25fa14d..e95cc41 100644
--- a/revision.c
+++ b/revision.c
@@ -268,7 +268,7 @@ static int tree_difference = REV_TREE_SAME;
 static void file_add_remove(struct diff_options *options,
 		    int addremove, unsigned mode,
 		    const unsigned char *sha1,
-		    const char *fullpath)
+		    const char *fullpath, int dirty_submodule)
 {
 	int diff = addremove == '+' ? REV_TREE_NEW : REV_TREE_OLD;

@@ -281,7 +281,7 @@ static void file_change(struct diff_options *options,
 		 unsigned old_mode, unsigned new_mode,
 		 const unsigned char *old_sha1,
 		 const unsigned char *new_sha1,
-		 const char *fullpath)
+		 const char *fullpath, int dirty_submodule)
 {
 	tree_difference = REV_TREE_DIFFERENT;
 	DIFF_OPT_SET(options, HAS_CHANGES);
diff --git a/tree-diff.c b/tree-diff.c
index 7d745b4..2ef0c77 100644
--- a/tree-diff.c
+++ b/tree-diff.c
@@ -68,7 +68,7 @@ static int compare_tree_entry(struct tree_desc *t1, struct tree_desc *t2, const
 		if (DIFF_OPT_TST(opt, TREE_IN_RECURSIVE)) {
 			newbase[baselen + pathlen1] = 0;
 			opt->change(opt, mode1, mode2,
-				    sha1, sha2, newbase);
+				    sha1, sha2, newbase, 0);
 			newbase[baselen + pathlen1] = '/';
 		}
 		retval = diff_tree_sha1(sha1, sha2, newbase, opt);
@@ -77,7 +77,7 @@ static int compare_tree_entry(struct tree_desc *t1, struct tree_desc *t2, const
 	}

 	fullname = malloc_fullname(base, baselen, path1, pathlen1);
-	opt->change(opt, mode1, mode2, sha1, sha2, fullname);
+	opt->change(opt, mode1, mode2, sha1, sha2, fullname, 0);
 	free(fullname);
 	return 0;
 }
@@ -241,7 +241,7 @@ static void show_entry(struct diff_options *opt, const char *prefix, struct tree

 		if (DIFF_OPT_TST(opt, TREE_IN_RECURSIVE)) {
 			newbase[baselen + pathlen] = 0;
-			opt->add_remove(opt, *prefix, mode, sha1, newbase);
+			opt->add_remove(opt, *prefix, mode, sha1, newbase, 0);
 			newbase[baselen + pathlen] = '/';
 		}

@@ -252,7 +252,7 @@ static void show_entry(struct diff_options *opt, const char *prefix, struct tree
 		free(newbase);
 	} else {
 		char *fullname = malloc_fullname(base, baselen, path, pathlen);
-		opt->add_remove(opt, prefix[0], mode, sha1, fullname);
+		opt->add_remove(opt, prefix[0], mode, sha1, fullname, 0);
 		free(fullname);
 	}
 }
-- 
1.6.6.327.g4c0c1.dirty

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH] Performance optimization for detection of modified submodules
  2010-01-17 19:01               ` [PATCH] Performance optimization for detection of modified submodules Jens Lehmann
@ 2010-01-17 20:01                 ` Junio C Hamano
  2010-01-17 20:33                   ` Jens Lehmann
  0 siblings, 1 reply; 14+ messages in thread
From: Junio C Hamano @ 2010-01-17 20:01 UTC (permalink / raw)
  To: Jens Lehmann
  Cc: Git Mailing List, Johannes Schindelin, Shawn O. Pearce,
	Heiko Voigt, Lars Hjemli

Jens Lehmann <Jens.Lehmann@web.de> writes:

> So here is my first attempt of implementing your proposal. The test suite
> runs fine, but a few more eyeballs would really be appreciated as i am not
> very familiar with the code and its corner cases (See diff_change(), is it
> sufficient to only set "two->dirty_submodule", even if the REVERSE_DIFF
> option is set? Apart from that i am not so sure about the four changes to
> tree-diff.c).

The effect of REVERSE_DIFF bit is contained in the output layer.  The
order frontends (e.g. diff-files and diff-lib.c::run_diff_files()) feed
the entries from two hierarchies is not affected.

The current callers of addremove() may always give the work tree side as
the second one, but the API is meant to be usable by any other new callers
and for some of them feeding the work tree side as the first one _might_
be more sensible (we are talking about futureproofing, so by definition we
won't know).  It might even be the case where an unanticipated new caller
might be comparing two trees both living in the work tree (hence you might
require two independent dirty_submodule bits to the call to show which
side is dirty, and such a caller may say "both sides are dirty").

So it would be most future-proof if you add two independent "dirty" bits
to the API if you are changing it: "is the left side of the comparision a
dirty submodule?" and "is the right side ...?".  Especially I don't think
assuming "setting two->dirty is enough for the current implementation" is
the right way going forward.

> I think we could skip the call to is_submodule_modified() in
> run_diff_files() and get_stat_data() when the changed flag is already
> set and only short output (without calling diff_populate_gitlink(), e.g.
> "git status -s" or "git diff-files") is requested.

I am puzzled by your "we could skip"; isn't it what you already have done
in this patch?  More importantly, I think that is the whole point of the
change to diff API this patch brings in.

> What do you think
> about doing that in a seperate patch?

Doing these in this same patch like you did is better, as it demonstrates
how the callers benefit by the addition of these new bits to the API.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] Performance optimization for detection of modified submodules
  2010-01-17 20:01                 ` Junio C Hamano
@ 2010-01-17 20:33                   ` Jens Lehmann
  0 siblings, 0 replies; 14+ messages in thread
From: Jens Lehmann @ 2010-01-17 20:33 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Git Mailing List, Johannes Schindelin, Shawn O. Pearce,
	Heiko Voigt, Lars Hjemli

Am 17.01.2010 21:01, schrieb Junio C Hamano:
> Jens Lehmann <Jens.Lehmann@web.de> writes:
> 
>> So here is my first attempt of implementing your proposal. The test suite
>> runs fine, but a few more eyeballs would really be appreciated as i am not
>> very familiar with the code and its corner cases (See diff_change(), is it
>> sufficient to only set "two->dirty_submodule", even if the REVERSE_DIFF
>> option is set? Apart from that i am not so sure about the four changes to
>> tree-diff.c).
> 
> The effect of REVERSE_DIFF bit is contained in the output layer.  The
> order frontends (e.g. diff-files and diff-lib.c::run_diff_files()) feed
> the entries from two hierarchies is not affected.
> 
> The current callers of addremove() may always give the work tree side as
> the second one, but the API is meant to be usable by any other new callers
> and for some of them feeding the work tree side as the first one _might_
> be more sensible (we are talking about futureproofing, so by definition we
> won't know).  It might even be the case where an unanticipated new caller
> might be comparing two trees both living in the work tree (hence you might
> require two independent dirty_submodule bits to the call to show which
> side is dirty, and such a caller may say "both sides are dirty").
> 
> So it would be most future-proof if you add two independent "dirty" bits
> to the API if you are changing it: "is the left side of the comparision a
> dirty submodule?" and "is the right side ...?".  Especially I don't think
> assuming "setting two->dirty is enough for the current implementation" is
> the right way going forward.

Thanks for your explanation, will provide two independent dirty_submodule
bits there.


>> I think we could skip the call to is_submodule_modified() in
>> run_diff_files() and get_stat_data() when the changed flag is already
>> set and only short output (without calling diff_populate_gitlink(), e.g.
>> "git status -s" or "git diff-files") is requested.
> 
> I am puzzled by your "we could skip"; isn't it what you already have done
> in this patch?  More importantly, I think that is the whole point of the
> change to diff API this patch brings in.

This patch was about calling is_submodule_modified() /only once/, either
in run_diff_files() or in get_stat_data(), and reuse the result. But i
think we don't have to call it /at all/ when for example executing "git
status -s" and the HEAD of the submodule does not match the commit in the
index of the superproject. This information is enough to display an 'M'.
No need to check the submodule work tree for dirtyness, as it won't
change the output of the command anymore.

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2010-01-17 20:34 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-01-11 22:05 [RFC PATCH (WIP)] Show a dirty working tree and a detached HEAD in status for submodule Jens Lehmann
2010-01-11 22:45 ` Junio C Hamano
2010-01-12 16:20   ` Jens Lehmann
2010-01-13  7:09     ` Junio C Hamano
2010-01-13 18:59       ` [PATCH] Show submodules as modified when they contain a dirty work tree Jens Lehmann
2010-01-13 22:10         ` Junio C Hamano
2010-01-14  8:32           ` Jens Lehmann
2010-01-14 21:38           ` Jens Lehmann
2010-01-14 23:13             ` Junio C Hamano
2010-01-15  0:24               ` Jens Lehmann
2010-01-17 19:01               ` [PATCH] Performance optimization for detection of modified submodules Jens Lehmann
2010-01-17 20:01                 ` Junio C Hamano
2010-01-17 20:33                   ` Jens Lehmann
2010-01-15 13:32         ` [PATCH] Show submodules as modified when they contain a dirty work tree Nanako Shiraishi

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).