All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/8] Improve git-status --ignored
@ 2013-03-18 20:28 Karsten Blees
  2013-03-19  4:08 ` Junio C Hamano
  2013-04-15 19:04 ` [PATCH v2 00/14] " Karsten Blees
  0 siblings, 2 replies; 36+ messages in thread
From: Karsten Blees @ 2013-03-18 20:28 UTC (permalink / raw)
  To: Git List
  Cc: Junio C Hamano, Erik Faye-Lund, Ramkumar Ramachandra, Robert Zeh,
	Duy Nguyen, Antoine Pelisse, Adam Spiers

This patch series addresses several bugs and performance issues in .gitignore processing that came up in the inotify discussion.

Also available here:
https://github.com/kblees/git/tree/kb/improve-git-status-ignored
git pull git://github.com/kblees/git.git kb/improve-git-status-ignored


Patches 1 - 4 fix bugs in 'git-status --ignored' and add appropriate test cases.

Patches 5 - 7 eliminate the is_path_excluded API, in favor of a slightly improved and faster is_excluded. This speeds up 'git-ls-files --cached --ignored' by factor 5 - 6.

Patch 8 finally skips excluded checks for tracked files. With the bugs and is_path_excluded out of the way, it should be obvious that this can safely be done unconditionally without introducing regressions. Speeds up 'git-status [--ignored]' by factor 1.4 - 2.


I still believe that 'git-status --ignored' shouldn't list "ignored tracked" directories, to be consistent with the listing of untracked directories, and because "ignored tracked" contradicts the very definition of ignored content in gitignore(5).

Cheers,
Karsten


Karsten Blees (8):
  dir.c: git-status --ignored: don't drop ignored directories
  dir.c: git-status --ignored: don't list files in ignored directories
  dir.c: git-status --ignored: don't list empty ignored directories
  dir.c: git-status --ignored: don't list empty directories as ignored
  dir.c: move prep_exclude and factor out parts of last_exclude_matching
  dir.c: unify is_excluded and is_path_excluded APIs
  dir.c: replace is_path_excluded with now equivalent is_excluded API
  dir.c: git-status: avoid is_excluded checks for tracked files

 builtin/add.c              |   5 +-
 builtin/check-ignore.c     |   6 +-
 builtin/ls-files.c         |  15 +-
 dir.c                      | 351 +++++++++++++++++----------------------------
 dir.h                      |  22 +--
 t/t7061-wtstatus-ignore.sh | 104 +++++++++++++-
 unpack-trees.c             |  10 +-
 unpack-trees.h             |   1 -
 8 files changed, 243 insertions(+), 271 deletions(-)

-- 

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 0/8] Improve git-status --ignored
  2013-03-18 20:28 [PATCH 0/8] Improve git-status --ignored Karsten Blees
@ 2013-03-19  4:08 ` Junio C Hamano
  2013-03-19  5:20   ` Duy Nguyen
  2013-04-15 19:04 ` [PATCH v2 00/14] " Karsten Blees
  1 sibling, 1 reply; 36+ messages in thread
From: Junio C Hamano @ 2013-03-19  4:08 UTC (permalink / raw)
  To: Karsten Blees
  Cc: Git List, Erik Faye-Lund, Ramkumar Ramachandra, Robert Zeh,
	Duy Nguyen, Antoine Pelisse, Adam Spiers

Karsten Blees <karsten.blees@gmail.com> writes:

> This patch series addresses several bugs and performance issues in
> .gitignore processing that came up in the inotify discussion.

Thanks.

How does this interact with the nd/read-directory-recursive-optim
topic that has been cooking for a while?

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 0/8] Improve git-status --ignored
  2013-03-19  4:08 ` Junio C Hamano
@ 2013-03-19  5:20   ` Duy Nguyen
  2013-03-19 10:48     ` Karsten Blees
  2013-03-19 14:48     ` Junio C Hamano
  0 siblings, 2 replies; 36+ messages in thread
From: Duy Nguyen @ 2013-03-19  5:20 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Karsten Blees, Git List, Erik Faye-Lund, Ramkumar Ramachandra,
	Robert Zeh, Antoine Pelisse, Adam Spiers

On Tue, Mar 19, 2013 at 11:08 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Karsten Blees <karsten.blees@gmail.com> writes:
>
>> This patch series addresses several bugs and performance issues in
>> .gitignore processing that came up in the inotify discussion.
>
> Thanks.
>
> How does this interact with the nd/read-directory-recursive-optim
> topic that has been cooking for a while?

I think 8/8 is another version of nd/read-directory-recursive-optim
-- 
Duy

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 0/8] Improve git-status --ignored
  2013-03-19  5:20   ` Duy Nguyen
@ 2013-03-19 10:48     ` Karsten Blees
  2013-03-19 14:48     ` Junio C Hamano
  1 sibling, 0 replies; 36+ messages in thread
From: Karsten Blees @ 2013-03-19 10:48 UTC (permalink / raw)
  To: Duy Nguyen
  Cc: Junio C Hamano, Git List, Erik Faye-Lund, Ramkumar Ramachandra,
	Robert Zeh, Antoine Pelisse, Adam Spiers

Am 19.03.2013 06:20, schrieb Duy Nguyen:
> On Tue, Mar 19, 2013 at 11:08 AM, Junio C Hamano <gitster@pobox.com> wrote:
>> Karsten Blees <karsten.blees@gmail.com> writes:
>>
>>> This patch series addresses several bugs and performance issues in
>>> .gitignore processing that came up in the inotify discussion.
>>
>> Thanks.
>>
>> How does this interact with the nd/read-directory-recursive-optim
>> topic that has been cooking for a while?
> 
> I think 8/8 is another version of nd/read-directory-recursive-optim
> 

Yes. When reviewing Duy's patch, I wondered why we would need all those special cases (or: why is treat_file so damn complicated). It turned out that it gets much simpler after fixing some bugs and eliminating the is_excluded / is_path_excluded discrepancy. This variant also optimizes git-status --ignored.

I tried to express my ideas here [1], but I guess this was a bit unstructured, sorry :-)

Note that we could skip excluded checks for tracked directories as well if it weren't for the current notion of ignored tracked directories.

I'd still like to eliminate the second directory scan in git-status --ignored (i.e. don't call fill_directory twice, which would save ~50ms for linux and ~300ms for WebKit), but that's a bit more involved...

[1] http://thread.gmane.org/gmane.comp.version-control.git/217111

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 0/8] Improve git-status --ignored
  2013-03-19  5:20   ` Duy Nguyen
  2013-03-19 10:48     ` Karsten Blees
@ 2013-03-19 14:48     ` Junio C Hamano
  2013-03-19 15:58       ` Duy Nguyen
  1 sibling, 1 reply; 36+ messages in thread
From: Junio C Hamano @ 2013-03-19 14:48 UTC (permalink / raw)
  To: Duy Nguyen
  Cc: Karsten Blees, Git List, Erik Faye-Lund, Ramkumar Ramachandra,
	Robert Zeh, Antoine Pelisse, Adam Spiers

Duy Nguyen <pclouds@gmail.com> writes:

> On Tue, Mar 19, 2013 at 11:08 AM, Junio C Hamano <gitster@pobox.com> wrote:
>> Karsten Blees <karsten.blees@gmail.com> writes:
>>
>>> This patch series addresses several bugs and performance issues in
>>> .gitignore processing that came up in the inotify discussion.
>>
>> Thanks.
>>
>> How does this interact with the nd/read-directory-recursive-optim
>> topic that has been cooking for a while?
>
> I think 8/8 is another version of nd/read-directory-recursive-optim

Yeah, it seems so; even though the series is much larger, overall it
looks to me a cleaner solution without a specific special case.

Would we we better off kicking nd/read-directory-recursive-optim
back to 'pu' (and eventually ejecting it) and replacing it with a
reroll of Karsten's series when it comes, perhaps?

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 0/8] Improve git-status --ignored
  2013-03-19 14:48     ` Junio C Hamano
@ 2013-03-19 15:58       ` Duy Nguyen
  0 siblings, 0 replies; 36+ messages in thread
From: Duy Nguyen @ 2013-03-19 15:58 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Karsten Blees, Git List, Erik Faye-Lund, Ramkumar Ramachandra,
	Robert Zeh, Antoine Pelisse, Adam Spiers

On Tue, Mar 19, 2013 at 9:48 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Would we we better off kicking nd/read-directory-recursive-optim
> back to 'pu' (and eventually ejecting it) and replacing it with a
> reroll of Karsten's series when it comes, perhaps?

I have no problem with that. Whatever better should get in.
-- 
Duy

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH v2 00/14] Improve git-status --ignored
  2013-03-18 20:28 [PATCH 0/8] Improve git-status --ignored Karsten Blees
  2013-03-19  4:08 ` Junio C Hamano
@ 2013-04-15 19:04 ` Karsten Blees
  2013-04-15 19:05   ` [PATCH v2 01/14] dir.c: git-status --ignored: don't drop ignored directories Karsten Blees
                     ` (14 more replies)
  1 sibling, 15 replies; 36+ messages in thread
From: Karsten Blees @ 2013-04-15 19:04 UTC (permalink / raw)
  To: Git List
  Cc: Junio C Hamano, Erik Faye-Lund, Ramkumar Ramachandra, Robert Zeh,
	Duy Nguyen, Antoine Pelisse, Adam Spiers, Karsten Blees

This patch series addresses several bugs and performance issues in
.gitignore processing.

Patches #1 - #6 fix bugs and add appropriate test cases.

Patch #7 changes handling of "ignored tracked" directories, as I discovered
that with the current bahavior git-clean can delete tracked content.

Patches #8 - #14 are performance optimizations.


Also available here:
https://github.com/kblees/git/tree/kb/improve-git-status-ignored-v2
git pull git://github.com/kblees/git.git kb/improve-git-status-ignored-v2


Changes since v1 (old#->new#: description):

1->1: dir.c: git-status --ignored: don't drop ignored directories
2->2: dir.c: git-status --ignored: don't list files in ignored directories
3->3: dir.c: git-status --ignored: don't list empty ignored directories

*->4: dir.c: git-ls-files --directories: don't hide empty directories
      - new bugfix

4->5: dir.c: git-status --ignored: don't list empty directories as ignored
      - fixed typo in commit message thanks to Eric Sunshine
      - patch is reduced to a one-liner (the DIR_HIDE_EMPTY_DIRECTORIES
        flag has already been fixed in patch 4, renaming the variable and
	tweaking DIR_SHOW_OTHER_DIRECTORIES is not strictly necessary)

*->6: dir.c: make 'git-status --ignored' work within leading directories
      - new bugfix

*->7: dir.c: git-clean -d -X: don't delete tracked directories
      - changes handling of "ignored tracked" directories

5->8: dir.c: factor out parts of last_exclude_matching for later reuse
5->9: dir.c: move prep_exclude
      - split in two patches for cleaner diff
      - removed unnecessary ';' after '}'

6->10: dir.c: unify is_excluded and is_path_excluded APIs
       - fixed another typo in commit message

7->11: dir.c: replace is_path_excluded with now equivalent is_excluded API
8->12: dir.c: git-status: avoid is_excluded checks for tracked files

*->13: dir.c: git-status --ignored: don't scan the work tree three times
       - new optimization

*->14: dir.c: git-status --ignored: don't scan the work tree twice
       - new optimization

Karsten Blees (14):
  dir.c: git-status --ignored: don't drop ignored directories
  dir.c: git-status --ignored: don't list files in ignored directories
  dir.c: git-status --ignored: don't list empty ignored directories
  dir.c: git-ls-files --directories: don't hide empty directories
  dir.c: git-status --ignored: don't list empty directories as ignored
  dir.c: make 'git-status --ignored' work within leading directories
  dir.c: git-clean -d -X: don't delete tracked directories
  dir.c: factor out parts of last_exclude_matching for later reuse
  dir.c: move prep_exclude
  dir.c: unify is_excluded and is_path_excluded APIs
  dir.c: replace is_path_excluded with now equivalent is_excluded API
  dir.c: git-status: avoid is_excluded checks for tracked files
  dir.c: git-status --ignored: don't scan the work tree three times
  dir.c: git-status --ignored: don't scan the work tree twice

 Documentation/technical/api-directory-listing.txt |  25 +-
 builtin/add.c                                     |   5 +-
 builtin/check-ignore.c                            |   7 +-
 builtin/ls-files.c                                |  15 +-
 dir.c                                             | 499 +++++++++-------------
 dir.h                                             |  25 +-
 t/t3001-ls-files-others-exclude.sh                |  49 +++
 t/t7061-wtstatus-ignore.sh                        | 125 +++++-
 t/t7300-clean.sh                                  |  34 ++
 unpack-trees.c                                    |  10 +-
 unpack-trees.h                                    |   1 -
 wt-status.c                                       |  24 +-
 12 files changed, 455 insertions(+), 364 deletions(-)

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH v2 01/14] dir.c: git-status --ignored: don't drop ignored directories
  2013-04-15 19:04 ` [PATCH v2 00/14] " Karsten Blees
@ 2013-04-15 19:05   ` Karsten Blees
  2013-04-16 17:33     ` Ramkumar Ramachandra
  2013-04-15 19:06   ` [PATCH v2 02/14] dir.c: git-status --ignored: don't list files in " Karsten Blees
                     ` (13 subsequent siblings)
  14 siblings, 1 reply; 36+ messages in thread
From: Karsten Blees @ 2013-04-15 19:05 UTC (permalink / raw)
  To: Karsten Blees
  Cc: Git List, Junio C Hamano, Erik Faye-Lund, Ramkumar Ramachandra,
	Robert Zeh, Duy Nguyen, Antoine Pelisse, Adam Spiers

'git-status --ignored' drops ignored directories if they contain untracked
files in an untracked sub directory.

Fix it by getting exact (recursive) excluded status in treat_directory.

Signed-off-by: Karsten Blees <blees@dcon.de>
---
 dir.c                      |  9 +++++++++
 t/t7061-wtstatus-ignore.sh | 27 +++++++++++++++++++++++++++
 2 files changed, 36 insertions(+)

diff --git a/dir.c b/dir.c
index 57394e4..ec4eebf 100644
--- a/dir.c
+++ b/dir.c
@@ -1060,6 +1060,15 @@ static enum directory_treatment treat_directory(struct dir_struct *dir,
 
 	/* This is the "show_other_directories" case */
 
+	/* might be a sub directory in an excluded directory */
+	if (!exclude) {
+		struct path_exclude_check check;
+		int dt = DT_DIR;
+		path_exclude_check_init(&check, dir);
+		exclude = is_path_excluded(&check, dirname, len, &dt);
+		path_exclude_check_clear(&check);
+	}
+
 	/*
 	 * We are looking for ignored files and our directory is not ignored,
 	 * check if it contains only ignored files
diff --git a/t/t7061-wtstatus-ignore.sh b/t/t7061-wtstatus-ignore.sh
index 0da1214..0f1034e 100755
--- a/t/t7061-wtstatus-ignore.sh
+++ b/t/t7061-wtstatus-ignore.sh
@@ -143,4 +143,31 @@ test_expect_success 'status ignored tracked directory and uncommitted file with
 	test_cmp expected actual
 '
 
+cat >expected <<\EOF
+?? .gitignore
+?? actual
+?? expected
+!! tracked/
+EOF
+
+test_expect_success 'status ignored tracked directory with uncommitted file in untracked subdir with --ignore' '
+	rm -rf tracked/uncommitted &&
+	mkdir tracked/ignored &&
+	: >tracked/ignored/uncommitted &&
+	git status --porcelain --ignored >actual &&
+	test_cmp expected actual
+'
+
+cat >expected <<\EOF
+?? .gitignore
+?? actual
+?? expected
+!! tracked/ignored/uncommitted
+EOF
+
+test_expect_success 'status ignored tracked directory with uncommitted file in untracked subdir with --ignore -u' '
+	git status --porcelain --ignored -u >actual &&
+	test_cmp expected actual
+'
+
 test_done
-- 
1.8.1.2.8026.g2b66448.dirty

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v2 02/14] dir.c: git-status --ignored: don't list files in ignored directories
  2013-04-15 19:04 ` [PATCH v2 00/14] " Karsten Blees
  2013-04-15 19:05   ` [PATCH v2 01/14] dir.c: git-status --ignored: don't drop ignored directories Karsten Blees
@ 2013-04-15 19:06   ` Karsten Blees
  2013-04-16  9:57     ` [PATCH] read_revisions_from_stdin: make copies for handle_revision_arg Thomas Rast
  2013-04-15 19:07   ` [PATCH v2 03/14] dir.c: git-status --ignored: don't list empty ignored directories Karsten Blees
                     ` (12 subsequent siblings)
  14 siblings, 1 reply; 36+ messages in thread
From: Karsten Blees @ 2013-04-15 19:06 UTC (permalink / raw)
  To: Git List
  Cc: Karsten Blees, Junio C Hamano, Erik Faye-Lund,
	Ramkumar Ramachandra, Robert Zeh, Duy Nguyen, Antoine Pelisse,
	Adam Spiers

'git-status --ignored' lists both the ignored directory and the ignored
files if the files are in a tracked sub directory.

When recursing into sub directories in read_directory_recursive, pass on
the check_only parameter so that we don't accidentally add the files.

Signed-off-by: Karsten Blees <blees@dcon.de>
---
 dir.c                      |  4 +---
 t/t7061-wtstatus-ignore.sh | 27 +++++++++++++++++++++++++++
 2 files changed, 28 insertions(+), 3 deletions(-)

diff --git a/dir.c b/dir.c
index ec4eebf..7c9bc9c 100644
--- a/dir.c
+++ b/dir.c
@@ -1273,7 +1273,6 @@ static enum path_treatment treat_one_path(struct dir_struct *dir,
 		return path_ignored;
 	case DT_DIR:
 		strbuf_addch(path, '/');
-
 		switch (treat_directory(dir, path->buf, path->len, exclude, simplify)) {
 		case show_directory:
 			break;
@@ -1343,8 +1342,7 @@ static int read_directory_recursive(struct dir_struct *dir,
 		switch (treat_path(dir, de, &path, baselen, simplify)) {
 		case path_recurse:
 			contents += read_directory_recursive(dir, path.buf,
-							     path.len, 0,
-							     simplify);
+				path.len, check_only, simplify);
 			continue;
 		case path_ignored:
 			continue;
diff --git a/t/t7061-wtstatus-ignore.sh b/t/t7061-wtstatus-ignore.sh
index 0f1034e..4ece129 100755
--- a/t/t7061-wtstatus-ignore.sh
+++ b/t/t7061-wtstatus-ignore.sh
@@ -170,4 +170,31 @@ test_expect_success 'status ignored tracked directory with uncommitted file in u
 	test_cmp expected actual
 '
 
+cat >expected <<\EOF
+?? .gitignore
+?? actual
+?? expected
+!! tracked/
+EOF
+
+test_expect_success 'status ignored tracked directory with uncommitted file in tracked subdir with --ignore' '
+	: >tracked/ignored/committed &&
+	git add -f tracked/ignored/committed &&
+	git commit -m. &&
+	git status --porcelain --ignored >actual &&
+	test_cmp expected actual
+'
+
+cat >expected <<\EOF
+?? .gitignore
+?? actual
+?? expected
+!! tracked/ignored/uncommitted
+EOF
+
+test_expect_success 'status ignored tracked directory with uncommitted file in tracked subdir with --ignore -u' '
+	git status --porcelain --ignored -u >actual &&
+	test_cmp expected actual
+'
+
 test_done
-- 
1.8.1.2.8026.g2b66448.dirty

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v2 03/14] dir.c: git-status --ignored: don't list empty ignored directories
  2013-04-15 19:04 ` [PATCH v2 00/14] " Karsten Blees
  2013-04-15 19:05   ` [PATCH v2 01/14] dir.c: git-status --ignored: don't drop ignored directories Karsten Blees
  2013-04-15 19:06   ` [PATCH v2 02/14] dir.c: git-status --ignored: don't list files in " Karsten Blees
@ 2013-04-15 19:07   ` Karsten Blees
  2013-04-16 17:48     ` Ramkumar Ramachandra
  2013-04-15 19:08   ` [PATCH v2 04/14] dir.c: git-ls-files --directories: don't hide empty directories Karsten Blees
                     ` (11 subsequent siblings)
  14 siblings, 1 reply; 36+ messages in thread
From: Karsten Blees @ 2013-04-15 19:07 UTC (permalink / raw)
  To: Git List
  Cc: Karsten Blees, Junio C Hamano, Erik Faye-Lund,
	Ramkumar Ramachandra, Robert Zeh, Duy Nguyen, Antoine Pelisse,
	Adam Spiers

'git-status --ignored' lists ignored tracked directories without any
ignored files if a tracked file happens to match an exclude pattern.

Always exclude tracked files.

Signed-off-by: Karsten Blees <blees@dcon.de>
---
 dir.c                      | 11 ++++-------
 t/t7061-wtstatus-ignore.sh | 24 ++++++++++++++++++++++++
 2 files changed, 28 insertions(+), 7 deletions(-)

diff --git a/dir.c b/dir.c
index 7c9bc9c..fd1f088 100644
--- a/dir.c
+++ b/dir.c
@@ -1109,16 +1109,13 @@ static int treat_file(struct dir_struct *dir, struct strbuf *path, int exclude,
 	struct path_exclude_check check;
 	int exclude_file = 0;
 
+	/* Always exclude indexed files */
+	if (index_name_exists(&the_index, path->buf, path->len, ignore_case))
+		return 1;
+
 	if (exclude)
 		exclude_file = !(dir->flags & DIR_SHOW_IGNORED);
 	else if (dir->flags & DIR_SHOW_IGNORED) {
-		/* Always exclude indexed files */
-		struct cache_entry *ce = index_name_exists(&the_index,
-		    path->buf, path->len, ignore_case);
-
-		if (ce)
-			return 1;
-
 		path_exclude_check_init(&check, dir);
 
 		if (!is_path_excluded(&check, path->buf, path->len, dtype))
diff --git a/t/t7061-wtstatus-ignore.sh b/t/t7061-wtstatus-ignore.sh
index 4ece129..28b7d95 100755
--- a/t/t7061-wtstatus-ignore.sh
+++ b/t/t7061-wtstatus-ignore.sh
@@ -122,10 +122,34 @@ cat >expected <<\EOF
 ?? .gitignore
 ?? actual
 ?? expected
+EOF
+
+test_expect_success 'status ignored tracked directory and ignored file with --ignore' '
+	echo "committed" >>.gitignore &&
+	git status --porcelain --ignored >actual &&
+	test_cmp expected actual
+'
+
+cat >expected <<\EOF
+?? .gitignore
+?? actual
+?? expected
+EOF
+
+test_expect_success 'status ignored tracked directory and ignored file with --ignore -u' '
+	git status --porcelain --ignored -u >actual &&
+	test_cmp expected actual
+'
+
+cat >expected <<\EOF
+?? .gitignore
+?? actual
+?? expected
 !! tracked/
 EOF
 
 test_expect_success 'status ignored tracked directory and uncommitted file with --ignore' '
+	echo "tracked" >.gitignore &&
 	: >tracked/uncommitted &&
 	git status --porcelain --ignored >actual &&
 	test_cmp expected actual
-- 
1.8.1.2.8026.g2b66448.dirty

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v2 04/14] dir.c: git-ls-files --directories: don't hide empty directories
  2013-04-15 19:04 ` [PATCH v2 00/14] " Karsten Blees
                     ` (2 preceding siblings ...)
  2013-04-15 19:07   ` [PATCH v2 03/14] dir.c: git-status --ignored: don't list empty ignored directories Karsten Blees
@ 2013-04-15 19:08   ` Karsten Blees
  2013-04-15 19:08   ` [PATCH v2 05/14] dir.c: git-status --ignored: don't list empty directories as ignored Karsten Blees
                     ` (10 subsequent siblings)
  14 siblings, 0 replies; 36+ messages in thread
From: Karsten Blees @ 2013-04-15 19:08 UTC (permalink / raw)
  To: Git List
  Cc: Karsten Blees, Junio C Hamano, Erik Faye-Lund,
	Ramkumar Ramachandra, Robert Zeh, Duy Nguyen, Antoine Pelisse,
	Adam Spiers

'git-ls-files --ignored --directories' hides empty directories even though
--no-empty-directory was not specified.

Treat the DIR_HIDE_EMPTY_DIRECTORIES flag independently from
DIR_SHOW_IGNORED to make all git-ls-files options work as expected.

Signed-off-by: Karsten Blees <blees@dcon.de>
---
 dir.c                              |  6 ++----
 t/t3001-ls-files-others-exclude.sh | 23 +++++++++++++++++++++++
 wt-status.c                        |  2 +-
 3 files changed, 26 insertions(+), 5 deletions(-)

diff --git a/dir.c b/dir.c
index fd1f088..1112b05 100644
--- a/dir.c
+++ b/dir.c
@@ -1076,15 +1076,13 @@ static enum directory_treatment treat_directory(struct dir_struct *dir,
 	if ((dir->flags & DIR_SHOW_IGNORED) && !exclude) {
 		int ignored;
 		dir->flags &= ~DIR_SHOW_IGNORED;
-		dir->flags |= DIR_HIDE_EMPTY_DIRECTORIES;
 		ignored = read_directory_recursive(dir, dirname, len, 1, simplify);
-		dir->flags &= ~DIR_HIDE_EMPTY_DIRECTORIES;
 		dir->flags |= DIR_SHOW_IGNORED;
 
 		return ignored ? ignore_directory : show_directory;
 	}
-	if (!(dir->flags & DIR_SHOW_IGNORED) &&
-	    !(dir->flags & DIR_HIDE_EMPTY_DIRECTORIES))
+
+	if (!(dir->flags & DIR_HIDE_EMPTY_DIRECTORIES))
 		return show_directory;
 	if (!read_directory_recursive(dir, dirname, len, 1, simplify))
 		return ignore_directory;
diff --git a/t/t3001-ls-files-others-exclude.sh b/t/t3001-ls-files-others-exclude.sh
index efb7ebc..859da35 100755
--- a/t/t3001-ls-files-others-exclude.sh
+++ b/t/t3001-ls-files-others-exclude.sh
@@ -214,6 +214,29 @@ test_expect_success 'subdirectory ignore (l1)' '
 	test_cmp expect actual
 '
 
+test_expect_success 'show/hide empty ignored directory (setup)' '
+	rm top/l1/l2/l1 &&
+	rm top/l1/.gitignore
+'
+
+test_expect_success 'show empty ignored directory with --directory' '
+	(
+		cd top &&
+		git ls-files -o -i --exclude l1 --directory
+	) >actual &&
+	echo l1/ >expect &&
+	test_cmp expect actual
+'
+
+test_expect_success 'hide empty ignored directory with --no-empty-directory' '
+	(
+		cd top &&
+		git ls-files -o -i --exclude l1 --directory --no-empty-directory
+	) >actual &&
+	>expect &&
+	test_cmp expect actual
+'
+
 test_expect_success 'pattern matches prefix completely' '
 	: >expect &&
 	git ls-files -i -o --exclude "/three/a.3[abc]" >actual &&
diff --git a/wt-status.c b/wt-status.c
index ef405d0..79eb124 100644
--- a/wt-status.c
+++ b/wt-status.c
@@ -518,7 +518,7 @@ static void wt_status_collect_untracked(struct wt_status *s)
 		dir.nr = 0;
 		dir.flags = DIR_SHOW_IGNORED;
 		if (s->show_untracked_files != SHOW_ALL_UNTRACKED_FILES)
-			dir.flags |= DIR_SHOW_OTHER_DIRECTORIES;
+			dir.flags |= DIR_SHOW_OTHER_DIRECTORIES | DIR_HIDE_EMPTY_DIRECTORIES;
 		fill_directory(&dir, s->pathspec);
 		for (i = 0; i < dir.nr; i++) {
 			struct dir_entry *ent = dir.entries[i];
-- 
1.8.1.2.8026.g2b66448.dirty

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v2 05/14] dir.c: git-status --ignored: don't list empty directories as ignored
  2013-04-15 19:04 ` [PATCH v2 00/14] " Karsten Blees
                     ` (3 preceding siblings ...)
  2013-04-15 19:08   ` [PATCH v2 04/14] dir.c: git-ls-files --directories: don't hide empty directories Karsten Blees
@ 2013-04-15 19:08   ` Karsten Blees
  2013-04-15 19:09   ` [PATCH v2 06/14] dir.c: make 'git-status --ignored' work within leading directories Karsten Blees
                     ` (9 subsequent siblings)
  14 siblings, 0 replies; 36+ messages in thread
From: Karsten Blees @ 2013-04-15 19:08 UTC (permalink / raw)
  To: Git List
  Cc: Karsten Blees, Junio C Hamano, Erik Faye-Lund,
	Ramkumar Ramachandra, Robert Zeh, Duy Nguyen, Antoine Pelisse,
	Adam Spiers

'git-status --ignored' lists empty untracked directories as ignored, even
though they don't have any ignored files.

When checking if a directory is already listed as untracked (i.e. shouldn't
be listed as ignored as well), don't assume that the directory has only
ignored files if it doesn't have untracked files, as the directory may be
empty.

Signed-off-by: Karsten Blees <blees@dcon.de>
---
 dir.c                      |  5 +++--
 t/t7061-wtstatus-ignore.sh | 26 ++++++++++++++++++++++++--
 2 files changed, 27 insertions(+), 4 deletions(-)

diff --git a/dir.c b/dir.c
index 1112b05..c5705e3 100644
--- a/dir.c
+++ b/dir.c
@@ -1071,7 +1071,7 @@ static enum directory_treatment treat_directory(struct dir_struct *dir,
 
 	/*
 	 * We are looking for ignored files and our directory is not ignored,
-	 * check if it contains only ignored files
+	 * check if it contains untracked files (i.e. is listed as untracked)
 	 */
 	if ((dir->flags & DIR_SHOW_IGNORED) && !exclude) {
 		int ignored;
@@ -1079,7 +1079,8 @@ static enum directory_treatment treat_directory(struct dir_struct *dir,
 		ignored = read_directory_recursive(dir, dirname, len, 1, simplify);
 		dir->flags |= DIR_SHOW_IGNORED;
 
-		return ignored ? ignore_directory : show_directory;
+		if (ignored)
+			return ignore_directory;
 	}
 
 	if (!(dir->flags & DIR_HIDE_EMPTY_DIRECTORIES))
diff --git a/t/t7061-wtstatus-ignore.sh b/t/t7061-wtstatus-ignore.sh
index 28b7d95..6171a49 100755
--- a/t/t7061-wtstatus-ignore.sh
+++ b/t/t7061-wtstatus-ignore.sh
@@ -64,13 +64,35 @@ cat >expected <<\EOF
 ?? .gitignore
 ?? actual
 ?? expected
-!! untracked-ignored/
 EOF
 
-test_expect_success 'status untracked directory with ignored files with --ignore' '
+test_expect_success 'status empty untracked directory with --ignore' '
 	rm -rf ignored &&
 	mkdir untracked-ignored &&
 	mkdir untracked-ignored/test &&
+	git status --porcelain --ignored >actual &&
+	test_cmp expected actual
+'
+
+cat >expected <<\EOF
+?? .gitignore
+?? actual
+?? expected
+EOF
+
+test_expect_success 'status empty untracked directory with --ignore -u' '
+	git status --porcelain --ignored -u >actual &&
+	test_cmp expected actual
+'
+
+cat >expected <<\EOF
+?? .gitignore
+?? actual
+?? expected
+!! untracked-ignored/
+EOF
+
+test_expect_success 'status untracked directory with ignored files with --ignore' '
 	: >untracked-ignored/ignored &&
 	: >untracked-ignored/test/ignored &&
 	git status --porcelain --ignored >actual &&
-- 
1.8.1.2.8026.g2b66448.dirty

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v2 06/14] dir.c: make 'git-status --ignored' work within leading directories
  2013-04-15 19:04 ` [PATCH v2 00/14] " Karsten Blees
                     ` (4 preceding siblings ...)
  2013-04-15 19:08   ` [PATCH v2 05/14] dir.c: git-status --ignored: don't list empty directories as ignored Karsten Blees
@ 2013-04-15 19:09   ` Karsten Blees
  2013-04-15 19:10   ` [PATCH v2 07/14] dir.c: git-clean -d -X: don't delete tracked directories Karsten Blees
                     ` (8 subsequent siblings)
  14 siblings, 0 replies; 36+ messages in thread
From: Karsten Blees @ 2013-04-15 19:09 UTC (permalink / raw)
  To: Git List
  Cc: Karsten Blees, Junio C Hamano, Erik Faye-Lund,
	Ramkumar Ramachandra, Robert Zeh, Duy Nguyen, Antoine Pelisse,
	Adam Spiers

'git-status --ignored path/' doesn't list ignored files and directories
within 'path' if some component of 'path' is classified as untracked.

Disable the DIR_SHOW_OTHER_DIRECTORIES flag while traversing leading
directories. This prevents treat_leading_path() with DIR_SHOW_IGNORED flag
from aborting at the top level untracked directory.

As a side effect, this also eliminates a recursive directory scan per
leading directory level, as treat_directory() can no longer call
read_directory_recursive() when called from treat_leading_path().

Signed-off-by: Karsten Blees <blees@dcon.de>
---
 dir.c                      |  3 +++
 t/t7061-wtstatus-ignore.sh | 19 +++++++++++++++++++
 2 files changed, 22 insertions(+)

diff --git a/dir.c b/dir.c
index c5705e3..822e501 100644
--- a/dir.c
+++ b/dir.c
@@ -1403,12 +1403,14 @@ static int treat_leading_path(struct dir_struct *dir,
 	struct strbuf sb = STRBUF_INIT;
 	int baselen, rc = 0;
 	const char *cp;
+	int old_flags = dir->flags;
 
 	while (len && path[len - 1] == '/')
 		len--;
 	if (!len)
 		return 1;
 	baselen = 0;
+	dir->flags &= ~DIR_SHOW_OTHER_DIRECTORIES;
 	while (1) {
 		cp = path + baselen + !!baselen;
 		cp = memchr(cp, '/', path + len - cp);
@@ -1431,6 +1433,7 @@ static int treat_leading_path(struct dir_struct *dir,
 		}
 	}
 	strbuf_release(&sb);
+	dir->flags = old_flags;
 	return rc;
 }
 
diff --git a/t/t7061-wtstatus-ignore.sh b/t/t7061-wtstatus-ignore.sh
index 6171a49..4c6f145 100755
--- a/t/t7061-wtstatus-ignore.sh
+++ b/t/t7061-wtstatus-ignore.sh
@@ -32,6 +32,25 @@ test_expect_success 'status untracked directory with --ignored -u' '
 	git status --porcelain --ignored -u >actual &&
 	test_cmp expected actual
 '
+cat >expected <<\EOF
+?? untracked/uncommitted
+!! untracked/ignored
+EOF
+
+test_expect_success 'status prefixed untracked directory with --ignored' '
+	git status --porcelain --ignored untracked/ >actual &&
+	test_cmp expected actual
+'
+
+cat >expected <<\EOF
+?? untracked/uncommitted
+!! untracked/ignored
+EOF
+
+test_expect_success 'status prefixed untracked sub-directory with --ignored -u' '
+	git status --porcelain --ignored -u untracked/ >actual &&
+	test_cmp expected actual
+'
 
 cat >expected <<\EOF
 ?? .gitignore
-- 
1.8.1.2.8026.g2b66448.dirty

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v2 07/14] dir.c: git-clean -d -X: don't delete tracked directories
  2013-04-15 19:04 ` [PATCH v2 00/14] " Karsten Blees
                     ` (5 preceding siblings ...)
  2013-04-15 19:09   ` [PATCH v2 06/14] dir.c: make 'git-status --ignored' work within leading directories Karsten Blees
@ 2013-04-15 19:10   ` Karsten Blees
  2013-04-15 19:11   ` [PATCH v2 08/14] dir.c: factor out parts of last_exclude_matching for later reuse Karsten Blees
                     ` (7 subsequent siblings)
  14 siblings, 0 replies; 36+ messages in thread
From: Karsten Blees @ 2013-04-15 19:10 UTC (permalink / raw)
  To: Git List
  Cc: Karsten Blees, Junio C Hamano, Erik Faye-Lund,
	Ramkumar Ramachandra, Robert Zeh, Duy Nguyen, Antoine Pelisse,
	Adam Spiers

The notion of "ignored tracked" directories introduced in 721ac4ed "dir.c:
Make git-status --ignored more consistent" has a few unwanted side effects:

 - git-clean -d -X: deletes ignored tracked directories. git-clean should
   never delete tracked content.

 - git-ls-files --ignored --other --directory: lists ignored tracked
   directories instead of "other" directories.

 - git-status --ignored: lists ignored tracked directories while contained
   files may be listed as modified. Paths listed by git-status should be
   disjoint (except in long format where a path may be listed in both the
   staged and unstaged section).

Additionally, the current behaviour violates documentation in gitignore(5)
("Specifies intentionally *untracked* files to ignore") and Documentation/
technical/api-directory-listing.txt ("DIR_SHOW_OTHER_DIRECTORIES: Include
a directory that is *not tracked*.").

In dir.c::treat_directory, remove the special handling of ignored tracked
directories, so that the DIR_SHOW_OTHER_DIRECTORIES flag only affects
"other" (i.e. untracked) directories. In dir.c::dir_add_name, check that
added paths are untracked even if DIR_SHOW_IGNORED is set.

Signed-off-by: Karsten Blees <blees@dcon.de>
---
 dir.c                              | 11 +++--------
 t/t3001-ls-files-others-exclude.sh | 26 ++++++++++++++++++++++++++
 t/t7061-wtstatus-ignore.sh         |  6 +++---
 t/t7300-clean.sh                   | 34 ++++++++++++++++++++++++++++++++++
 4 files changed, 66 insertions(+), 11 deletions(-)

diff --git a/dir.c b/dir.c
index 822e501..f10fb69 100644
--- a/dir.c
+++ b/dir.c
@@ -897,8 +897,7 @@ static struct dir_entry *dir_entry_new(const char *pathname, int len)
 
 static struct dir_entry *dir_add_name(struct dir_struct *dir, const char *pathname, int len)
 {
-	if (!(dir->flags & DIR_SHOW_IGNORED) &&
-	    cache_name_exists(pathname, len, ignore_case))
+	if (cache_name_exists(pathname, len, ignore_case))
 		return NULL;
 
 	ALLOC_GROW(dir->entries, dir->nr+1, dir->alloc);
@@ -1000,9 +999,8 @@ static enum exist_status directory_exists_in_index(const char *dirname, int len)
  * traversal routine.
  *
  * Case 1: If we *already* have entries in the index under that
- * directory name, we recurse into the directory to see all the files,
- * unless the directory is excluded and we want to show ignored
- * directories
+ * directory name, we always recurse into the directory to see
+ * all the files.
  *
  * Case 2: If we *already* have that directory name as a gitlink,
  * we always continue to see it as a gitlink, regardless of whether
@@ -1037,9 +1035,6 @@ static enum directory_treatment treat_directory(struct dir_struct *dir,
 	/* The "len-1" is to strip the final '/' */
 	switch (directory_exists_in_index(dirname, len-1)) {
 	case index_directory:
-		if ((dir->flags & DIR_SHOW_OTHER_DIRECTORIES) && exclude)
-			break;
-
 		return recurse_into_directory;
 
 	case index_gitdir:
diff --git a/t/t3001-ls-files-others-exclude.sh b/t/t3001-ls-files-others-exclude.sh
index 859da35..ec4fae2 100755
--- a/t/t3001-ls-files-others-exclude.sh
+++ b/t/t3001-ls-files-others-exclude.sh
@@ -237,6 +237,32 @@ test_expect_success 'hide empty ignored directory with --no-empty-directory' '
 	test_cmp expect actual
 '
 
+test_expect_success 'show/hide empty ignored sub-directory (setup)' '
+	> top/l1/tracked &&
+	(
+		cd top &&
+		git add -f l1/tracked
+	)
+'
+
+test_expect_success 'show empty ignored sub-directory with --directory' '
+	(
+		cd top &&
+		git ls-files -o -i --exclude l1 --directory
+	) >actual &&
+	echo l1/l2/ >expect &&
+	test_cmp expect actual
+'
+
+test_expect_success 'hide empty ignored sub-directory with --no-empty-directory' '
+	(
+		cd top &&
+		git ls-files -o -i --exclude l1 --directory --no-empty-directory
+	) >actual &&
+	>expect &&
+	test_cmp expect actual
+'
+
 test_expect_success 'pattern matches prefix completely' '
 	: >expect &&
 	git ls-files -i -o --exclude "/three/a.3[abc]" >actual &&
diff --git a/t/t7061-wtstatus-ignore.sh b/t/t7061-wtstatus-ignore.sh
index 4c6f145..460789b 100755
--- a/t/t7061-wtstatus-ignore.sh
+++ b/t/t7061-wtstatus-ignore.sh
@@ -186,7 +186,7 @@ cat >expected <<\EOF
 ?? .gitignore
 ?? actual
 ?? expected
-!! tracked/
+!! tracked/uncommitted
 EOF
 
 test_expect_success 'status ignored tracked directory and uncommitted file with --ignore' '
@@ -212,7 +212,7 @@ cat >expected <<\EOF
 ?? .gitignore
 ?? actual
 ?? expected
-!! tracked/
+!! tracked/ignored/
 EOF
 
 test_expect_success 'status ignored tracked directory with uncommitted file in untracked subdir with --ignore' '
@@ -239,7 +239,7 @@ cat >expected <<\EOF
 ?? .gitignore
 ?? actual
 ?? expected
-!! tracked/
+!! tracked/ignored/uncommitted
 EOF
 
 test_expect_success 'status ignored tracked directory with uncommitted file in tracked subdir with --ignore' '
diff --git a/t/t7300-clean.sh b/t/t7300-clean.sh
index ccfb54d..710be90 100755
--- a/t/t7300-clean.sh
+++ b/t/t7300-clean.sh
@@ -298,6 +298,23 @@ test_expect_success 'git clean -d -x' '
 
 '
 
+test_expect_success 'git clean -d -x with ignored tracked directory' '
+
+	mkdir -p build docs &&
+	touch a.out src/part3.c docs/manual.txt obj.o build/lib.so &&
+	git clean -d -x -e src &&
+	test -f Makefile &&
+	test -f README &&
+	test -f src/part1.c &&
+	test -f src/part2.c &&
+	test ! -f a.out &&
+	test -f src/part3.c &&
+	test ! -d docs &&
+	test ! -f obj.o &&
+	test ! -d build
+
+'
+
 test_expect_success 'git clean -X' '
 
 	mkdir -p build docs &&
@@ -332,6 +349,23 @@ test_expect_success 'git clean -d -X' '
 
 '
 
+test_expect_success 'git clean -d -X with ignored tracked directory' '
+
+	mkdir -p build docs &&
+	touch a.out src/part3.c docs/manual.txt obj.o build/lib.so &&
+	git clean -d -X -e src &&
+	test -f Makefile &&
+	test -f README &&
+	test -f src/part1.c &&
+	test -f src/part2.c &&
+	test -f a.out &&
+	test ! -f src/part3.c &&
+	test -f docs/manual.txt &&
+	test ! -f obj.o &&
+	test ! -d build
+
+'
+
 test_expect_success 'clean.requireForce defaults to true' '
 
 	git config --unset clean.requireForce &&
-- 
1.8.1.2.8026.g2b66448.dirty

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v2 08/14] dir.c: factor out parts of last_exclude_matching for later reuse
  2013-04-15 19:04 ` [PATCH v2 00/14] " Karsten Blees
                     ` (6 preceding siblings ...)
  2013-04-15 19:10   ` [PATCH v2 07/14] dir.c: git-clean -d -X: don't delete tracked directories Karsten Blees
@ 2013-04-15 19:11   ` Karsten Blees
  2013-04-15 19:11   ` [PATCH v2 09/14] dir.c: move prep_exclude Karsten Blees
                     ` (6 subsequent siblings)
  14 siblings, 0 replies; 36+ messages in thread
From: Karsten Blees @ 2013-04-15 19:11 UTC (permalink / raw)
  To: Git List
  Cc: Karsten Blees, Junio C Hamano, Erik Faye-Lund,
	Ramkumar Ramachandra, Robert Zeh, Duy Nguyen, Antoine Pelisse,
	Adam Spiers

Signed-off-by: Karsten Blees <blees@dcon.de>
---
 dir.c | 36 ++++++++++++++++++++++--------------
 1 file changed, 22 insertions(+), 14 deletions(-)

diff --git a/dir.c b/dir.c
index f10fb69..594307c 100644
--- a/dir.c
+++ b/dir.c
@@ -751,6 +751,26 @@ int is_excluded_from_list(const char *pathname,
 	return -1; /* undecided */
 }
 
+static struct exclude *last_exclude_matching_from_lists(struct dir_struct *dir,
+		const char *pathname, int pathlen, const char *basename,
+		int *dtype_p)
+{
+	int i, j;
+	struct exclude_list_group *group;
+	struct exclude *exclude;
+	for (i = EXC_CMDL; i <= EXC_FILE; i++) {
+		group = &dir->exclude_list_group[i];
+		for (j = group->nr - 1; j >= 0; j--) {
+			exclude = last_exclude_matching_from_list(
+				pathname, pathlen, basename, dtype_p,
+				&group->el[j]);
+			if (exclude)
+				return exclude;
+		}
+	}
+	return NULL;
+}
+
 /*
  * Loads the exclude lists for the directory containing pathname, then
  * scans all exclude lists to determine whether pathname is excluded.
@@ -762,25 +782,13 @@ static struct exclude *last_exclude_matching(struct dir_struct *dir,
 					     int *dtype_p)
 {
 	int pathlen = strlen(pathname);
-	int i, j;
-	struct exclude_list_group *group;
-	struct exclude *exclude;
 	const char *basename = strrchr(pathname, '/');
 	basename = (basename) ? basename+1 : pathname;
 
 	prep_exclude(dir, pathname, basename-pathname);
 
-	for (i = EXC_CMDL; i <= EXC_FILE; i++) {
-		group = &dir->exclude_list_group[i];
-		for (j = group->nr - 1; j >= 0; j--) {
-			exclude = last_exclude_matching_from_list(
-				pathname, pathlen, basename, dtype_p,
-				&group->el[j]);
-			if (exclude)
-				return exclude;
-		}
-	}
-	return NULL;
+	return last_exclude_matching_from_lists(dir, pathname, pathlen,
+			basename, dtype_p);
 }
 
 /*
-- 
1.8.1.2.8026.g2b66448.dirty

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v2 09/14] dir.c: move prep_exclude
  2013-04-15 19:04 ` [PATCH v2 00/14] " Karsten Blees
                     ` (7 preceding siblings ...)
  2013-04-15 19:11   ` [PATCH v2 08/14] dir.c: factor out parts of last_exclude_matching for later reuse Karsten Blees
@ 2013-04-15 19:11   ` Karsten Blees
  2013-04-15 19:12   ` [PATCH v2 10/14] dir.c: unify is_excluded and is_path_excluded APIs Karsten Blees
                     ` (5 subsequent siblings)
  14 siblings, 0 replies; 36+ messages in thread
From: Karsten Blees @ 2013-04-15 19:11 UTC (permalink / raw)
  To: Git List
  Cc: Karsten Blees, Junio C Hamano, Erik Faye-Lund,
	Ramkumar Ramachandra, Robert Zeh, Duy Nguyen, Antoine Pelisse,
	Adam Spiers

Move prep_exclude in preparation for the next patch.

Signed-off-by: Karsten Blees <blees@dcon.de>
---
 dir.c | 144 +++++++++++++++++++++++++++++++++---------------------------------
 1 file changed, 72 insertions(+), 72 deletions(-)

diff --git a/dir.c b/dir.c
index 594307c..fcb3def 100644
--- a/dir.c
+++ b/dir.c
@@ -549,78 +549,6 @@ void add_excludes_from_file(struct dir_struct *dir, const char *fname)
 		die("cannot use %s as an exclude file", fname);
 }
 
-/*
- * Loads the per-directory exclude list for the substring of base
- * which has a char length of baselen.
- */
-static void prep_exclude(struct dir_struct *dir, const char *base, int baselen)
-{
-	struct exclude_list_group *group;
-	struct exclude_list *el;
-	struct exclude_stack *stk = NULL;
-	int current;
-
-	if ((!dir->exclude_per_dir) ||
-	    (baselen + strlen(dir->exclude_per_dir) >= PATH_MAX))
-		return; /* too long a path -- ignore */
-
-	group = &dir->exclude_list_group[EXC_DIRS];
-
-	/* Pop the exclude lists from the EXCL_DIRS exclude_list_group
-	 * which originate from directories not in the prefix of the
-	 * path being checked. */
-	while ((stk = dir->exclude_stack) != NULL) {
-		if (stk->baselen <= baselen &&
-		    !strncmp(dir->basebuf, base, stk->baselen))
-			break;
-		el = &group->el[dir->exclude_stack->exclude_ix];
-		dir->exclude_stack = stk->prev;
-		free((char *)el->src); /* see strdup() below */
-		clear_exclude_list(el);
-		free(stk);
-		group->nr--;
-	}
-
-	/* Read from the parent directories and push them down. */
-	current = stk ? stk->baselen : -1;
-	while (current < baselen) {
-		struct exclude_stack *stk = xcalloc(1, sizeof(*stk));
-		const char *cp;
-
-		if (current < 0) {
-			cp = base;
-			current = 0;
-		}
-		else {
-			cp = strchr(base + current + 1, '/');
-			if (!cp)
-				die("oops in prep_exclude");
-			cp++;
-		}
-		stk->prev = dir->exclude_stack;
-		stk->baselen = cp - base;
-		memcpy(dir->basebuf + current, base + current,
-		       stk->baselen - current);
-		strcpy(dir->basebuf + stk->baselen, dir->exclude_per_dir);
-		/*
-		 * dir->basebuf gets reused by the traversal, but we
-		 * need fname to remain unchanged to ensure the src
-		 * member of each struct exclude correctly
-		 * back-references its source file.  Other invocations
-		 * of add_exclude_list provide stable strings, so we
-		 * strdup() and free() here in the caller.
-		 */
-		el = add_exclude_list(dir, EXC_DIRS, strdup(dir->basebuf));
-		stk->exclude_ix = group->nr - 1;
-		add_excludes_from_file_to_list(dir->basebuf,
-					       dir->basebuf, stk->baselen,
-					       el, 1);
-		dir->exclude_stack = stk;
-		current = stk->baselen;
-	}
-	dir->basebuf[baselen] = '\0';
-}
-
 int match_basename(const char *basename, int basenamelen,
 		   const char *pattern, int prefix, int patternlen,
 		   int flags)
@@ -772,6 +700,78 @@ static struct exclude *last_exclude_matching_from_lists(struct dir_struct *dir,
 }
 
 /*
+ * Loads the per-directory exclude list for the substring of base
+ * which has a char length of baselen.
+ */
+static void prep_exclude(struct dir_struct *dir, const char *base, int baselen)
+{
+	struct exclude_list_group *group;
+	struct exclude_list *el;
+	struct exclude_stack *stk = NULL;
+	int current;
+
+	if ((!dir->exclude_per_dir) ||
+	    (baselen + strlen(dir->exclude_per_dir) >= PATH_MAX))
+		return; /* too long a path -- ignore */
+
+	group = &dir->exclude_list_group[EXC_DIRS];
+
+	/* Pop the exclude lists from the EXCL_DIRS exclude_list_group
+	 * which originate from directories not in the prefix of the
+	 * path being checked. */
+	while ((stk = dir->exclude_stack) != NULL) {
+		if (stk->baselen <= baselen &&
+		    !strncmp(dir->basebuf, base, stk->baselen))
+			break;
+		el = &group->el[dir->exclude_stack->exclude_ix];
+		dir->exclude_stack = stk->prev;
+		free((char *)el->src); /* see strdup() below */
+		clear_exclude_list(el);
+		free(stk);
+		group->nr--;
+	}
+
+	/* Read from the parent directories and push them down. */
+	current = stk ? stk->baselen : -1;
+	while (current < baselen) {
+		struct exclude_stack *stk = xcalloc(1, sizeof(*stk));
+		const char *cp;
+
+		if (current < 0) {
+			cp = base;
+			current = 0;
+		}
+		else {
+			cp = strchr(base + current + 1, '/');
+			if (!cp)
+				die("oops in prep_exclude");
+			cp++;
+		}
+		stk->prev = dir->exclude_stack;
+		stk->baselen = cp - base;
+		memcpy(dir->basebuf + current, base + current,
+		       stk->baselen - current);
+		strcpy(dir->basebuf + stk->baselen, dir->exclude_per_dir);
+		/*
+		 * dir->basebuf gets reused by the traversal, but we
+		 * need fname to remain unchanged to ensure the src
+		 * member of each struct exclude correctly
+		 * back-references its source file.  Other invocations
+		 * of add_exclude_list provide stable strings, so we
+		 * strdup() and free() here in the caller.
+		 */
+		el = add_exclude_list(dir, EXC_DIRS, strdup(dir->basebuf));
+		stk->exclude_ix = group->nr - 1;
+		add_excludes_from_file_to_list(dir->basebuf,
+					       dir->basebuf, stk->baselen,
+					       el, 1);
+		dir->exclude_stack = stk;
+		current = stk->baselen;
+	}
+	dir->basebuf[baselen] = '\0';
+}
+
+/*
  * Loads the exclude lists for the directory containing pathname, then
  * scans all exclude lists to determine whether pathname is excluded.
  * Returns the exclude_list element which matched, or NULL for
-- 
1.8.1.2.8026.g2b66448.dirty

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v2 10/14] dir.c: unify is_excluded and is_path_excluded APIs
  2013-04-15 19:04 ` [PATCH v2 00/14] " Karsten Blees
                     ` (8 preceding siblings ...)
  2013-04-15 19:11   ` [PATCH v2 09/14] dir.c: move prep_exclude Karsten Blees
@ 2013-04-15 19:12   ` Karsten Blees
  2013-04-15 21:35     ` Junio C Hamano
  2013-04-15 19:12   ` [PATCH v2 11/14] dir.c: replace is_path_excluded with now equivalent is_excluded API Karsten Blees
                     ` (4 subsequent siblings)
  14 siblings, 1 reply; 36+ messages in thread
From: Karsten Blees @ 2013-04-15 19:12 UTC (permalink / raw)
  To: Git List
  Cc: Karsten Blees, Junio C Hamano, Erik Faye-Lund,
	Ramkumar Ramachandra, Robert Zeh, Duy Nguyen, Antoine Pelisse,
	Adam Spiers

The is_excluded and is_path_excluded APIs are very similar, except for a
few noteworthy differences:

is_excluded doesn't handle ignored directories, results for paths within
ignored directories are incorrect. This is probably based on the premise
that recursive directory scans should stop at ignored directories, which
is no longer true (in certain cases, read_directory_recursive currently
calls is_excluded *and* is_path_excluded to get correct ignored state).

is_excluded caches parsed .gitignore files of the last directory in struct
dir_struct. If the directory changes, it finds a common parent directory
and is very careful to drop only as much state as necessary. On the other
hand, is_excluded will also read and parse .gitignore files in already
ignored directories, which are completely irrelevant.

is_path_excluded correctly handles ignored directories by checking if any
component in the path is excluded. As it uses is_excluded internally, this
unfortunately forces is_excluded to drop and re-read all .gitignore files,
as there is no common parent directory for the root dir.

is_path_excluded tracks state in a separate struct path_exclude_check,
which is essentially a wrapper of dir_struct with two more fields. However,
as is_path_excluded also modifies dir_struct, it is not possible to e.g.
use multiple path_exclude_check structures with the same dir_struct in
parallel. The additional structure just unnecessarily complicates the API.

Teach is_excluded / prep_exclude about ignored directories: whenever
entering a new directory, first check if the entire directory is excluded.
Remember the excluded state in dir_struct. Don't traverse into already
ignored directories (i.e. don't read irrelevant .gitignore files).

Directories could also be excluded by exclude patterns specified on the
command line or .git/info/exclude, so we cannot simply skip prep_exclude
entirely if there's no .gitignore file name (dir_struct.exclude_per_dir).
Move this check to just before actually reading the file.

is_path_excluded is now equivalent to is_excluded, so we can simply
redirect to it (the public API is cleaned up in the next patch).

The performance impact of the additional ignored check per directory is
hardly noticeable when reading directories recursively (e.g. 'git status').
However, performance of git commands using the is_path_excluded API (e.g.
'git ls-files --cached --ignored --exclude-standard') is greatly improved
as this no longer re-reads .gitignore files on each call.

Here's some performance data from the linux and WebKit repos (best of 10
runs on a Debian Linux on SSD, core.preloadIndex=true):

       | ls-files -ci   |    status      | status --ignored
       | linux | WebKit | linux | WebKit | linux | WebKit
-------+-------+--------+-------+--------+-------+---------
before | 0.506 |  6.539 | 0.212 |  1.555 | 0.323 |  2.541
after  | 0.080 |  1.191 | 0.218 |  1.583 | 0.321 |  2.579
gain   | 6.325 |  5.490 | 0.972 |  0.982 | 1.006 |  0.985

Signed-off-by: Karsten Blees <blees@dcon.de>
---
 dir.c | 107 +++++++++++++++++++++++++++---------------------------------------
 dir.h |   6 ++--
 2 files changed, 46 insertions(+), 67 deletions(-)

diff --git a/dir.c b/dir.c
index fcb3def..33bd019 100644
--- a/dir.c
+++ b/dir.c
@@ -710,10 +710,6 @@ static void prep_exclude(struct dir_struct *dir, const char *base, int baselen)
 	struct exclude_stack *stk = NULL;
 	int current;
 
-	if ((!dir->exclude_per_dir) ||
-	    (baselen + strlen(dir->exclude_per_dir) >= PATH_MAX))
-		return; /* too long a path -- ignore */
-
 	group = &dir->exclude_list_group[EXC_DIRS];
 
 	/* Pop the exclude lists from the EXCL_DIRS exclude_list_group
@@ -725,12 +721,17 @@ static void prep_exclude(struct dir_struct *dir, const char *base, int baselen)
 			break;
 		el = &group->el[dir->exclude_stack->exclude_ix];
 		dir->exclude_stack = stk->prev;
+		dir->exclude = NULL;
 		free((char *)el->src); /* see strdup() below */
 		clear_exclude_list(el);
 		free(stk);
 		group->nr--;
 	}
 
+	/* Skip traversing into sub directories if the parent is excluded */
+	if (dir->exclude)
+		return;
+
 	/* Read from the parent directories and push them down. */
 	current = stk ? stk->baselen : -1;
 	while (current < baselen) {
@@ -749,22 +750,43 @@ static void prep_exclude(struct dir_struct *dir, const char *base, int baselen)
 		}
 		stk->prev = dir->exclude_stack;
 		stk->baselen = cp - base;
+		stk->exclude_ix = group->nr;
+		el = add_exclude_list(dir, EXC_DIRS, NULL);
 		memcpy(dir->basebuf + current, base + current,
 		       stk->baselen - current);
-		strcpy(dir->basebuf + stk->baselen, dir->exclude_per_dir);
-		/*
-		 * dir->basebuf gets reused by the traversal, but we
-		 * need fname to remain unchanged to ensure the src
-		 * member of each struct exclude correctly
-		 * back-references its source file.  Other invocations
-		 * of add_exclude_list provide stable strings, so we
-		 * strdup() and free() here in the caller.
-		 */
-		el = add_exclude_list(dir, EXC_DIRS, strdup(dir->basebuf));
-		stk->exclude_ix = group->nr - 1;
-		add_excludes_from_file_to_list(dir->basebuf,
-					       dir->basebuf, stk->baselen,
-					       el, 1);
+
+		/* Abort if the directory is excluded */
+		if (stk->baselen) {
+			int dt = DT_DIR;
+			dir->basebuf[stk->baselen - 1] = 0;
+			dir->exclude = last_exclude_matching_from_lists(dir,
+				dir->basebuf, stk->baselen - 1,
+				dir->basebuf + current, &dt);
+			dir->basebuf[stk->baselen - 1] = '/';
+			if (dir->exclude) {
+				dir->basebuf[stk->baselen] = 0;
+				dir->exclude_stack = stk;
+				return;
+			}
+		}
+
+		/* Try to read per-directory file unless path is too long */
+		if (dir->exclude_per_dir &&
+		    stk->baselen + strlen(dir->exclude_per_dir) < PATH_MAX) {
+			strcpy(dir->basebuf + stk->baselen,
+					dir->exclude_per_dir);
+			/*
+			 * dir->basebuf gets reused by the traversal, but we
+			 * need fname to remain unchanged to ensure the src
+			 * member of each struct exclude correctly
+			 * back-references its source file.  Other invocations
+			 * of add_exclude_list provide stable strings, so we
+			 * strdup() and free() here in the caller.
+			 */
+			el->src = strdup(dir->basebuf);
+			add_excludes_from_file_to_list(dir->basebuf,
+					dir->basebuf, stk->baselen, el, 1);
+		}
 		dir->exclude_stack = stk;
 		current = stk->baselen;
 	}
@@ -787,6 +809,9 @@ static struct exclude *last_exclude_matching(struct dir_struct *dir,
 
 	prep_exclude(dir, pathname, basename-pathname);
 
+	if (dir->exclude)
+		return dir->exclude;
+
 	return last_exclude_matching_from_lists(dir, pathname, pathlen,
 			basename, dtype_p);
 }
@@ -809,13 +834,10 @@ void path_exclude_check_init(struct path_exclude_check *check,
 			     struct dir_struct *dir)
 {
 	check->dir = dir;
-	check->exclude = NULL;
-	strbuf_init(&check->path, 256);
 }
 
 void path_exclude_check_clear(struct path_exclude_check *check)
 {
-	strbuf_release(&check->path);
 }
 
 /*
@@ -831,49 +853,6 @@ struct exclude *last_exclude_matching_path(struct path_exclude_check *check,
 					   const char *name, int namelen,
 					   int *dtype)
 {
-	int i;
-	struct strbuf *path = &check->path;
-	struct exclude *exclude;
-
-	/*
-	 * we allow the caller to pass namelen as an optimization; it
-	 * must match the length of the name, as we eventually call
-	 * is_excluded() on the whole name string.
-	 */
-	if (namelen < 0)
-		namelen = strlen(name);
-
-	/*
-	 * If path is non-empty, and name is equal to path or a
-	 * subdirectory of path, name should be excluded, because
-	 * it's inside a directory which is already known to be
-	 * excluded and was previously left in check->path.
-	 */
-	if (path->len &&
-	    path->len <= namelen &&
-	    !memcmp(name, path->buf, path->len) &&
-	    (!name[path->len] || name[path->len] == '/'))
-		return check->exclude;
-
-	strbuf_setlen(path, 0);
-	for (i = 0; name[i]; i++) {
-		int ch = name[i];
-
-		if (ch == '/') {
-			int dt = DT_DIR;
-			exclude = last_exclude_matching(check->dir,
-							path->buf, &dt);
-			if (exclude) {
-				check->exclude = exclude;
-				return exclude;
-			}
-		}
-		strbuf_addch(path, ch);
-	}
-
-	/* An entry in the index; cannot be a directory with subentries */
-	strbuf_setlen(path, 0);
-
 	return last_exclude_matching(check->dir, name, dtype);
 }
 
diff --git a/dir.h b/dir.h
index c3eb4b5..cd166d0 100644
--- a/dir.h
+++ b/dir.h
@@ -110,9 +110,11 @@ struct dir_struct {
 	 *
 	 * exclude_stack points to the top of the exclude_stack, and
 	 * basebuf contains the full path to the current
-	 * (sub)directory in the traversal.
+	 * (sub)directory in the traversal. Exclude points to the
+	 * matching exclude struct if the directory is excluded.
 	 */
 	struct exclude_stack *exclude_stack;
+	struct exclude *exclude;
 	char basebuf[PATH_MAX];
 };
 
@@ -156,8 +158,6 @@ extern int match_pathname(const char *, int,
  */
 struct path_exclude_check {
 	struct dir_struct *dir;
-	struct exclude *exclude;
-	struct strbuf path;
 };
 extern void path_exclude_check_init(struct path_exclude_check *, struct dir_struct *);
 extern void path_exclude_check_clear(struct path_exclude_check *);
-- 
1.8.1.2.8026.g2b66448.dirty

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v2 11/14] dir.c: replace is_path_excluded with now equivalent is_excluded API
  2013-04-15 19:04 ` [PATCH v2 00/14] " Karsten Blees
                     ` (9 preceding siblings ...)
  2013-04-15 19:12   ` [PATCH v2 10/14] dir.c: unify is_excluded and is_path_excluded APIs Karsten Blees
@ 2013-04-15 19:12   ` Karsten Blees
  2013-04-15 19:13   ` [PATCH v2 12/14] dir.c: git-status: avoid is_excluded checks for tracked files Karsten Blees
                     ` (3 subsequent siblings)
  14 siblings, 0 replies; 36+ messages in thread
From: Karsten Blees @ 2013-04-15 19:12 UTC (permalink / raw)
  To: Git List
  Cc: Karsten Blees, Junio C Hamano, Erik Faye-Lund,
	Ramkumar Ramachandra, Robert Zeh, Duy Nguyen, Antoine Pelisse,
	Adam Spiers

Signed-off-by: Karsten Blees <blees@dcon.de>
---
 builtin/add.c          |  5 +---
 builtin/check-ignore.c |  6 +---
 builtin/ls-files.c     | 15 +++-------
 dir.c                  | 79 ++++----------------------------------------------
 dir.h                  | 16 ++--------
 unpack-trees.c         | 10 +------
 unpack-trees.h         |  1 -
 7 files changed, 16 insertions(+), 116 deletions(-)

diff --git a/builtin/add.c b/builtin/add.c
index ab1c9e8..06f365d 100644
--- a/builtin/add.c
+++ b/builtin/add.c
@@ -444,9 +444,7 @@ int cmd_add(int argc, const char **argv, const char *prefix)
 
 	if (pathspec) {
 		int i;
-		struct path_exclude_check check;
 
-		path_exclude_check_init(&check, &dir);
 		if (!seen)
 			seen = find_pathspecs_matching_against_index(pathspec);
 		for (i = 0; pathspec[i]; i++) {
@@ -454,7 +452,7 @@ int cmd_add(int argc, const char **argv, const char *prefix)
 			    && !file_exists(pathspec[i])) {
 				if (ignore_missing) {
 					int dtype = DT_UNKNOWN;
-					if (is_path_excluded(&check, pathspec[i], -1, &dtype))
+					if (is_excluded(&dir, pathspec[i], &dtype))
 						dir_add_ignored(&dir, pathspec[i], strlen(pathspec[i]));
 				} else
 					die(_("pathspec '%s' did not match any files"),
@@ -462,7 +460,6 @@ int cmd_add(int argc, const char **argv, const char *prefix)
 			}
 		}
 		free(seen);
-		path_exclude_check_clear(&check);
 	}
 
 	plug_bulk_checkin();
diff --git a/builtin/check-ignore.c b/builtin/check-ignore.c
index 0240f99..7388346 100644
--- a/builtin/check-ignore.c
+++ b/builtin/check-ignore.c
@@ -59,7 +59,6 @@ static int check_ignore(const char *prefix, const char **pathspec)
 	const char *path, *full_path;
 	char *seen;
 	int num_ignored = 0, dtype = DT_UNKNOWN, i;
-	struct path_exclude_check check;
 	struct exclude *exclude;
 
 	/* read_cache() is only necessary so we can watch out for submodules. */
@@ -76,7 +75,6 @@ static int check_ignore(const char *prefix, const char **pathspec)
 		return 0;
 	}
 
-	path_exclude_check_init(&check, &dir);
 	/*
 	 * look for pathspecs matching entries in the index, since these
 	 * should not be ignored, in order to be consistent with
@@ -90,8 +88,7 @@ static int check_ignore(const char *prefix, const char **pathspec)
 		full_path = check_path_for_gitlink(full_path);
 		die_if_path_beyond_symlink(full_path, prefix);
 		if (!seen[i]) {
-			exclude = last_exclude_matching_path(&check, full_path,
-							     -1, &dtype);
+			exclude = last_exclude_matching(&dir, full_path, &dtype);
 			if (exclude) {
 				if (!quiet)
 					output_exclude(path, exclude);
@@ -101,7 +98,6 @@ static int check_ignore(const char *prefix, const char **pathspec)
 	}
 	free(seen);
 	clear_directory(&dir);
-	path_exclude_check_clear(&check);
 
 	return num_ignored;
 }
diff --git a/builtin/ls-files.c b/builtin/ls-files.c
index 175e6e3..2202072 100644
--- a/builtin/ls-files.c
+++ b/builtin/ls-files.c
@@ -201,19 +201,15 @@ static void show_ru_info(void)
 	}
 }
 
-static int ce_excluded(struct path_exclude_check *check, struct cache_entry *ce)
+static int ce_excluded(struct dir_struct *dir, struct cache_entry *ce)
 {
 	int dtype = ce_to_dtype(ce);
-	return is_path_excluded(check, ce->name, ce_namelen(ce), &dtype);
+	return is_excluded(dir, ce->name, &dtype);
 }
 
 static void show_files(struct dir_struct *dir)
 {
 	int i;
-	struct path_exclude_check check;
-
-	if ((dir->flags & DIR_SHOW_IGNORED))
-		path_exclude_check_init(&check, dir);
 
 	/* For cached/deleted files we don't need to even do the readdir */
 	if (show_others || show_killed) {
@@ -227,7 +223,7 @@ static void show_files(struct dir_struct *dir)
 		for (i = 0; i < active_nr; i++) {
 			struct cache_entry *ce = active_cache[i];
 			if ((dir->flags & DIR_SHOW_IGNORED) &&
-			    !ce_excluded(&check, ce))
+			    !ce_excluded(dir, ce))
 				continue;
 			if (show_unmerged && !ce_stage(ce))
 				continue;
@@ -243,7 +239,7 @@ static void show_files(struct dir_struct *dir)
 			struct stat st;
 			int err;
 			if ((dir->flags & DIR_SHOW_IGNORED) &&
-			    !ce_excluded(&check, ce))
+			    !ce_excluded(dir, ce))
 				continue;
 			if (ce->ce_flags & CE_UPDATE)
 				continue;
@@ -256,9 +252,6 @@ static void show_files(struct dir_struct *dir)
 				show_ce_entry(tag_modified, ce);
 		}
 	}
-
-	if ((dir->flags & DIR_SHOW_IGNORED))
-		path_exclude_check_clear(&check);
 }
 
 /*
diff --git a/dir.c b/dir.c
index 33bd019..67313bd 100644
--- a/dir.c
+++ b/dir.c
@@ -799,7 +799,7 @@ static void prep_exclude(struct dir_struct *dir, const char *base, int baselen)
  * Returns the exclude_list element which matched, or NULL for
  * undecided.
  */
-static struct exclude *last_exclude_matching(struct dir_struct *dir,
+struct exclude *last_exclude_matching(struct dir_struct *dir,
 					     const char *pathname,
 					     int *dtype_p)
 {
@@ -821,7 +821,7 @@ static struct exclude *last_exclude_matching(struct dir_struct *dir,
  * scans all exclude lists to determine whether pathname is excluded.
  * Returns 1 if true, otherwise 0.
  */
-static int is_excluded(struct dir_struct *dir, const char *pathname, int *dtype_p)
+int is_excluded(struct dir_struct *dir, const char *pathname, int *dtype_p)
 {
 	struct exclude *exclude =
 		last_exclude_matching(dir, pathname, dtype_p);
@@ -830,47 +830,6 @@ static int is_excluded(struct dir_struct *dir, const char *pathname, int *dtype_
 	return 0;
 }
 
-void path_exclude_check_init(struct path_exclude_check *check,
-			     struct dir_struct *dir)
-{
-	check->dir = dir;
-}
-
-void path_exclude_check_clear(struct path_exclude_check *check)
-{
-}
-
-/*
- * For each subdirectory in name, starting with the top-most, checks
- * to see if that subdirectory is excluded, and if so, returns the
- * corresponding exclude structure.  Otherwise, checks whether name
- * itself (which is presumably a file) is excluded.
- *
- * A path to a directory known to be excluded is left in check->path to
- * optimize for repeated checks for files in the same excluded directory.
- */
-struct exclude *last_exclude_matching_path(struct path_exclude_check *check,
-					   const char *name, int namelen,
-					   int *dtype)
-{
-	return last_exclude_matching(check->dir, name, dtype);
-}
-
-/*
- * Is this name excluded?  This is for a caller like show_files() that
- * do not honor directory hierarchy and iterate through paths that are
- * possibly in an ignored directory.
- */
-int is_path_excluded(struct path_exclude_check *check,
-		  const char *name, int namelen, int *dtype)
-{
-	struct exclude *exclude =
-		last_exclude_matching_path(check, name, namelen, dtype);
-	if (exclude)
-		return exclude->flags & EXC_FLAG_NEGATIVE ? 0 : 1;
-	return 0;
-}
-
 static struct dir_entry *dir_entry_new(const char *pathname, int len)
 {
 	struct dir_entry *ent;
@@ -1042,15 +1001,6 @@ static enum directory_treatment treat_directory(struct dir_struct *dir,
 
 	/* This is the "show_other_directories" case */
 
-	/* might be a sub directory in an excluded directory */
-	if (!exclude) {
-		struct path_exclude_check check;
-		int dt = DT_DIR;
-		path_exclude_check_init(&check, dir);
-		exclude = is_path_excluded(&check, dirname, len, &dt);
-		path_exclude_check_clear(&check);
-	}
-
 	/*
 	 * We are looking for ignored files and our directory is not ignored,
 	 * check if it contains untracked files (i.e. is listed as untracked)
@@ -1085,27 +1035,13 @@ static enum directory_treatment treat_directory(struct dir_struct *dir,
  *
  * Return 1 for exclude, 0 for include.
  */
-static int treat_file(struct dir_struct *dir, struct strbuf *path, int exclude, int *dtype)
+static int treat_file(struct dir_struct *dir, struct strbuf *path, int exclude)
 {
-	struct path_exclude_check check;
-	int exclude_file = 0;
-
 	/* Always exclude indexed files */
 	if (index_name_exists(&the_index, path->buf, path->len, ignore_case))
 		return 1;
 
-	if (exclude)
-		exclude_file = !(dir->flags & DIR_SHOW_IGNORED);
-	else if (dir->flags & DIR_SHOW_IGNORED) {
-		path_exclude_check_init(&check, dir);
-
-		if (!is_path_excluded(&check, path->buf, path->len, dtype))
-			exclude_file = 1;
-
-		path_exclude_check_clear(&check);
-	}
-
-	return exclude_file;
+	return exclude == !(dir->flags & DIR_SHOW_IGNORED);
 }
 
 /*
@@ -1262,12 +1198,9 @@ static enum path_treatment treat_one_path(struct dir_struct *dir,
 		break;
 	case DT_REG:
 	case DT_LNK:
-		switch (treat_file(dir, path, exclude, &dtype)) {
-		case 1:
+		if (treat_file(dir, path, exclude))
 			return path_ignored;
-		default:
-			break;
-		}
+		break;
 	}
 	return path_handled;
 }
diff --git a/dir.h b/dir.h
index cd166d0..bfe726e 100644
--- a/dir.h
+++ b/dir.h
@@ -151,20 +151,10 @@ extern int match_pathname(const char *, int,
 			  const char *, int,
 			  const char *, int, int, int);
 
-/*
- * The is_excluded() API is meant for callers that check each level of leading
- * directory hierarchies with is_excluded() to avoid recursing into excluded
- * directories.  Callers that do not do so should use this API instead.
- */
-struct path_exclude_check {
-	struct dir_struct *dir;
-};
-extern void path_exclude_check_init(struct path_exclude_check *, struct dir_struct *);
-extern void path_exclude_check_clear(struct path_exclude_check *);
-extern struct exclude *last_exclude_matching_path(struct path_exclude_check *, const char *,
-						  int namelen, int *dtype);
-extern int is_path_excluded(struct path_exclude_check *, const char *, int namelen, int *dtype);
+extern struct exclude *last_exclude_matching(struct dir_struct *dir,
+					     const char *name, int *dtype);
 
+extern int is_excluded(struct dir_struct *dir, const char *name, int *dtype);
 
 extern struct exclude_list *add_exclude_list(struct dir_struct *dir,
 					     int group_type, const char *src);
diff --git a/unpack-trees.c b/unpack-trees.c
index 09e53df..ede4299 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -1026,10 +1026,6 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 			o->el = &el;
 	}
 
-	if (o->dir) {
-		o->path_exclude_check = xmalloc(sizeof(struct path_exclude_check));
-		path_exclude_check_init(o->path_exclude_check, o->dir);
-	}
 	memset(&o->result, 0, sizeof(o->result));
 	o->result.initialized = 1;
 	o->result.timestamp.sec = o->src_index->timestamp.sec;
@@ -1155,10 +1151,6 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 
 done:
 	clear_exclude_list(&el);
-	if (o->path_exclude_check) {
-		path_exclude_check_clear(o->path_exclude_check);
-		free(o->path_exclude_check);
-	}
 	return ret;
 
 return_failed:
@@ -1375,7 +1367,7 @@ static int check_ok_to_remove(const char *name, int len, int dtype,
 		return 0;
 
 	if (o->dir &&
-	    is_path_excluded(o->path_exclude_check, name, -1, &dtype))
+	    is_excluded(o->dir, name, &dtype))
 		/*
 		 * ce->name is explicitly excluded, so it is Ok to
 		 * overwrite it.
diff --git a/unpack-trees.h b/unpack-trees.h
index ec74a9f..5e432f5 100644
--- a/unpack-trees.h
+++ b/unpack-trees.h
@@ -52,7 +52,6 @@ struct unpack_trees_options {
 	const char *prefix;
 	int cache_bottom;
 	struct dir_struct *dir;
-	struct path_exclude_check *path_exclude_check;
 	struct pathspec *pathspec;
 	merge_fn_t fn;
 	const char *msgs[NB_UNPACK_TREES_ERROR_TYPES];
-- 
1.8.1.2.8026.g2b66448.dirty

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v2 12/14] dir.c: git-status: avoid is_excluded checks for tracked files
  2013-04-15 19:04 ` [PATCH v2 00/14] " Karsten Blees
                     ` (10 preceding siblings ...)
  2013-04-15 19:12   ` [PATCH v2 11/14] dir.c: replace is_path_excluded with now equivalent is_excluded API Karsten Blees
@ 2013-04-15 19:13   ` Karsten Blees
  2013-04-15 19:14   ` [PATCH v2 13/14] dir.c: git-status --ignored: don't scan the work tree three times Karsten Blees
                     ` (2 subsequent siblings)
  14 siblings, 0 replies; 36+ messages in thread
From: Karsten Blees @ 2013-04-15 19:13 UTC (permalink / raw)
  To: Git List
  Cc: Karsten Blees, Junio C Hamano, Erik Faye-Lund,
	Ramkumar Ramachandra, Robert Zeh, Duy Nguyen, Antoine Pelisse,
	Adam Spiers

Checking if a file is in the index is much faster (hashtable lookup) than
checking if the file is excluded (linear search over exclude patterns).

Skip is_excluded checks for files: move the cache_name_exists check from
treat_file to treat_one_path and return early if the file is tracked.

This can safely be done as all other code paths also return path_ignored
for tracked files, and dir_add_ignored skips tracked files as well.

There's just one line left in treat_file, so move this to treat_one_path
as well.

Here's some performance data for git-status from the linux and WebKit
repos (best of 10 runs on a Debian Linux on SSD, core.preloadIndex=true):

       |    status      | status --ignored
       | linux | WebKit | linux | WebKit
-------+-------+--------+-------+---------
before | 0.218 |  1.583 | 0.321 |  2.579
after  | 0.156 |  0.988 | 0.202 |  1.279
gain   | 1.397 |  1.602 | 1.589 |  2.016

Signed-off-by: Karsten Blees <blees@dcon.de>
---
 dir.c | 38 +++++++++++---------------------------
 1 file changed, 11 insertions(+), 27 deletions(-)

diff --git a/dir.c b/dir.c
index 67313bd..5ae5722 100644
--- a/dir.c
+++ b/dir.c
@@ -1023,28 +1023,6 @@ static enum directory_treatment treat_directory(struct dir_struct *dir,
 }
 
 /*
- * Decide what to do when we find a file while traversing the
- * filesystem. Mostly two cases:
- *
- *  1. We are looking for ignored files
- *   (a) File is ignored, include it
- *   (b) File is in ignored path, include it
- *   (c) File is not ignored, exclude it
- *
- *  2. Other scenarios, include the file if not excluded
- *
- * Return 1 for exclude, 0 for include.
- */
-static int treat_file(struct dir_struct *dir, struct strbuf *path, int exclude)
-{
-	/* Always exclude indexed files */
-	if (index_name_exists(&the_index, path->buf, path->len, ignore_case))
-		return 1;
-
-	return exclude == !(dir->flags & DIR_SHOW_IGNORED);
-}
-
-/*
  * This is an inexact early pruning of any recursive directory
  * reading - if the path cannot possibly be in the pathspec,
  * return true, and we'll skip it early.
@@ -1167,7 +1145,16 @@ static enum path_treatment treat_one_path(struct dir_struct *dir,
 					  const struct path_simplify *simplify,
 					  int dtype, struct dirent *de)
 {
-	int exclude = is_excluded(dir, path->buf, &dtype);
+	int exclude;
+	if (dtype == DT_UNKNOWN)
+		dtype = get_dtype(de, path->buf, path->len);
+
+	/* Always exclude indexed files */
+	if (dtype != DT_DIR &&
+	    cache_name_exists(path->buf, path->len, ignore_case))
+		return path_ignored;
+
+	exclude = is_excluded(dir, path->buf, &dtype);
 	if (exclude && (dir->flags & DIR_COLLECT_IGNORED)
 	    && exclude_matches_pathspec(path->buf, path->len, simplify))
 		dir_add_ignored(dir, path->buf, path->len);
@@ -1179,9 +1166,6 @@ static enum path_treatment treat_one_path(struct dir_struct *dir,
 	if (exclude && !(dir->flags & DIR_SHOW_IGNORED))
 		return path_ignored;
 
-	if (dtype == DT_UNKNOWN)
-		dtype = get_dtype(de, path->buf, path->len);
-
 	switch (dtype) {
 	default:
 		return path_ignored;
@@ -1198,7 +1182,7 @@ static enum path_treatment treat_one_path(struct dir_struct *dir,
 		break;
 	case DT_REG:
 	case DT_LNK:
-		if (treat_file(dir, path, exclude))
+		if (exclude == !(dir->flags & DIR_SHOW_IGNORED))
 			return path_ignored;
 		break;
 	}
-- 
1.8.1.2.8026.g2b66448.dirty

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v2 13/14] dir.c: git-status --ignored: don't scan the work tree three times
  2013-04-15 19:04 ` [PATCH v2 00/14] " Karsten Blees
                     ` (11 preceding siblings ...)
  2013-04-15 19:13   ` [PATCH v2 12/14] dir.c: git-status: avoid is_excluded checks for tracked files Karsten Blees
@ 2013-04-15 19:14   ` Karsten Blees
  2013-04-15 19:15   ` [PATCH v2 14/14] dir.c: git-status --ignored: don't scan the work tree twice Karsten Blees
  2013-04-15 19:23   ` [PATCH v2 00/14] Improve git-status --ignored Junio C Hamano
  14 siblings, 0 replies; 36+ messages in thread
From: Karsten Blees @ 2013-04-15 19:14 UTC (permalink / raw)
  To: Git List
  Cc: Karsten Blees, Junio C Hamano, Erik Faye-Lund,
	Ramkumar Ramachandra, Robert Zeh, Duy Nguyen, Antoine Pelisse,
	Adam Spiers

'git-status --ignored' recursively scans directories up to three times:

 1. To collect untracked files.

 2. To collect ignored files.

 3. When collecting ignored files, to check that an untracked directory
    that potentially contains ignored files doesn't also contain untracked
    files (i.e. isn't already listed as untracked).

Let's get rid of case 3 first.

Currently, read_directory_recursive returns a boolean whether a directory
contains the requested files or not (actually, it returns the number of
files, but no caller actually needs that), and DIR_SHOW_IGNORED specifies
what we're looking for.

To be able to test for both untracked and ignored files in a single scan,
we need to return a bit more info, and the result must be independent of
the DIR_SHOW_IGNORED flag.

Reuse the path_treatment enum as return value of read_directory_recursive.
Split path_handled in two separate values path_excluded and path_untracked
that don't change their meaning with the DIR_SHOW_IGNORED flag. We don't
need an extra value path_untracked_and_excluded, as directories with both
untracked and ignored files should be listed as untracked.

Rename path_ignored to path_none for clarity (i.e. "don't treat that path"
in contrast to "the path is ignored and should be treated according to
DIR_SHOW_IGNORED").

Replace enum directory_treatment with path_treatment. That's just another
enum with the same meaning, no need to translate back and forth.

In treat_directory, get rid of the extra read_directory_recursive call and
all the DIR_SHOW_IGNORED-specific code.

In read_directory_recursive, decide whether to dir_add_name path_excluded
or path_untracked paths based on the DIR_SHOW_IGNORED flag.

The return value of read_directory_recursive is the maximum path_treatment
of all files and sub-directories. In the check_only case, abort when we've
reached the most significant value (path_untracked).

Signed-off-by: Karsten Blees <blees@dcon.de>
---
 dir.c | 146 +++++++++++++++++++++++++++++++++---------------------------------
 1 file changed, 72 insertions(+), 74 deletions(-)

diff --git a/dir.c b/dir.c
index 5ae5722..5770ed4 100644
--- a/dir.c
+++ b/dir.c
@@ -17,7 +17,21 @@ struct path_simplify {
 	const char *path;
 };
 
-static int read_directory_recursive(struct dir_struct *dir, const char *path, int len,
+/*
+ * Tells read_directory_recursive how a file or directory should be treated.
+ * Values are ordered by significance, e.g. if a directory contains both
+ * excluded and untracked files, it is listed as untracked because
+ * path_untracked > path_excluded.
+ */
+enum path_treatment {
+	path_none = 0,
+	path_recurse,
+	path_excluded,
+	path_untracked
+};
+
+static enum path_treatment read_directory_recursive(struct dir_struct *dir,
+	const char *path, int len,
 	int check_only, const struct path_simplify *simplify);
 static int get_dtype(struct dirent *de, const char *path, int len);
 
@@ -958,35 +972,26 @@ static enum exist_status directory_exists_in_index(const char *dirname, int len)
  *
  *  (a) if "show_other_directories" is true, we show it as
  *      just a directory, unless "hide_empty_directories" is
- *      also true and the directory is empty, in which case
- *      we just ignore it entirely.
- *      if we are looking for ignored directories, look if it
- *      contains only ignored files to decide if it must be shown as
- *      ignored or not.
+ *      also true, in which case we need to check if it contains any
+ *      untracked and / or ignored files.
  *  (b) if it looks like a git directory, and we don't have
  *      'no_gitlinks' set we treat it as a gitlink, and show it
  *      as a directory.
  *  (c) otherwise, we recurse into it.
  */
-enum directory_treatment {
-	show_directory,
-	ignore_directory,
-	recurse_into_directory
-};
-
-static enum directory_treatment treat_directory(struct dir_struct *dir,
+static enum path_treatment treat_directory(struct dir_struct *dir,
 	const char *dirname, int len, int exclude,
 	const struct path_simplify *simplify)
 {
 	/* The "len-1" is to strip the final '/' */
 	switch (directory_exists_in_index(dirname, len-1)) {
 	case index_directory:
-		return recurse_into_directory;
+		return path_recurse;
 
 	case index_gitdir:
 		if (dir->flags & DIR_SHOW_OTHER_DIRECTORIES)
-			return ignore_directory;
-		return show_directory;
+			return path_none;
+		return path_untracked;
 
 	case index_nonexistent:
 		if (dir->flags & DIR_SHOW_OTHER_DIRECTORIES)
@@ -994,32 +999,17 @@ static enum directory_treatment treat_directory(struct dir_struct *dir,
 		if (!(dir->flags & DIR_NO_GITLINKS)) {
 			unsigned char sha1[20];
 			if (resolve_gitlink_ref(dirname, "HEAD", sha1) == 0)
-				return show_directory;
+				return path_untracked;
 		}
-		return recurse_into_directory;
+		return path_recurse;
 	}
 
 	/* This is the "show_other_directories" case */
 
-	/*
-	 * We are looking for ignored files and our directory is not ignored,
-	 * check if it contains untracked files (i.e. is listed as untracked)
-	 */
-	if ((dir->flags & DIR_SHOW_IGNORED) && !exclude) {
-		int ignored;
-		dir->flags &= ~DIR_SHOW_IGNORED;
-		ignored = read_directory_recursive(dir, dirname, len, 1, simplify);
-		dir->flags |= DIR_SHOW_IGNORED;
-
-		if (ignored)
-			return ignore_directory;
-	}
-
 	if (!(dir->flags & DIR_HIDE_EMPTY_DIRECTORIES))
-		return show_directory;
-	if (!read_directory_recursive(dir, dirname, len, 1, simplify))
-		return ignore_directory;
-	return show_directory;
+		return exclude ? path_excluded : path_untracked;
+
+	return read_directory_recursive(dir, dirname, len, 1, simplify);
 }
 
 /*
@@ -1134,12 +1124,6 @@ static int get_dtype(struct dirent *de, const char *path, int len)
 	return dtype;
 }
 
-enum path_treatment {
-	path_ignored,
-	path_handled,
-	path_recurse
-};
-
 static enum path_treatment treat_one_path(struct dir_struct *dir,
 					  struct strbuf *path,
 					  const struct path_simplify *simplify,
@@ -1152,7 +1136,7 @@ static enum path_treatment treat_one_path(struct dir_struct *dir,
 	/* Always exclude indexed files */
 	if (dtype != DT_DIR &&
 	    cache_name_exists(path->buf, path->len, ignore_case))
-		return path_ignored;
+		return path_none;
 
 	exclude = is_excluded(dir, path->buf, &dtype);
 	if (exclude && (dir->flags & DIR_COLLECT_IGNORED)
@@ -1164,29 +1148,19 @@ static enum path_treatment treat_one_path(struct dir_struct *dir,
 	 * ignored files, ignore it
 	 */
 	if (exclude && !(dir->flags & DIR_SHOW_IGNORED))
-		return path_ignored;
+		return path_excluded;
 
 	switch (dtype) {
 	default:
-		return path_ignored;
+		return path_none;
 	case DT_DIR:
 		strbuf_addch(path, '/');
-		switch (treat_directory(dir, path->buf, path->len, exclude, simplify)) {
-		case show_directory:
-			break;
-		case recurse_into_directory:
-			return path_recurse;
-		case ignore_directory:
-			return path_ignored;
-		}
-		break;
+		return treat_directory(dir, path->buf, path->len, exclude,
+			simplify);
 	case DT_REG:
 	case DT_LNK:
-		if (exclude == !(dir->flags & DIR_SHOW_IGNORED))
-			return path_ignored;
-		break;
+		return exclude ? path_excluded : path_untracked;
 	}
-	return path_handled;
 }
 
 static enum path_treatment treat_path(struct dir_struct *dir,
@@ -1198,11 +1172,11 @@ static enum path_treatment treat_path(struct dir_struct *dir,
 	int dtype;
 
 	if (is_dot_or_dotdot(de->d_name) || !strcmp(de->d_name, ".git"))
-		return path_ignored;
+		return path_none;
 	strbuf_setlen(path, baselen);
 	strbuf_addstr(path, de->d_name);
 	if (simplify_away(path->buf, path->len, simplify))
-		return path_ignored;
+		return path_none;
 
 	dtype = DTYPE(de);
 	return treat_one_path(dir, path, simplify, dtype, de);
@@ -1216,14 +1190,16 @@ static enum path_treatment treat_path(struct dir_struct *dir,
  *
  * Also, we ignore the name ".git" (even if it is not a directory).
  * That likely will not change.
+ *
+ * Returns the most significant path_treatment value encountered in the scan.
  */
-static int read_directory_recursive(struct dir_struct *dir,
+static enum path_treatment read_directory_recursive(struct dir_struct *dir,
 				    const char *base, int baselen,
 				    int check_only,
 				    const struct path_simplify *simplify)
 {
 	DIR *fdir;
-	int contents = 0;
+	enum path_treatment state, subdir_state, dir_state = path_none;
 	struct dirent *de;
 	struct strbuf path = STRBUF_INIT;
 
@@ -1234,26 +1210,48 @@ static int read_directory_recursive(struct dir_struct *dir,
 		goto out;
 
 	while ((de = readdir(fdir)) != NULL) {
-		switch (treat_path(dir, de, &path, baselen, simplify)) {
-		case path_recurse:
-			contents += read_directory_recursive(dir, path.buf,
+		/* check how the file or directory should be treated */
+		state = treat_path(dir, de, &path, baselen, simplify);
+		if (state > dir_state)
+			dir_state = state;
+
+		/* recurse into subdir if instructed by treat_path */
+		if (state == path_recurse) {
+			subdir_state = read_directory_recursive(dir, path.buf,
 				path.len, check_only, simplify);
+			if (subdir_state > dir_state)
+				dir_state = subdir_state;
+		}
+
+		if (check_only) {
+			/* abort early if maximum state has been reached */
+			if (dir_state == path_untracked)
+				break;
+			/* skip the dir_add_* part */
 			continue;
-		case path_ignored:
-			continue;
-		case path_handled:
-			break;
 		}
-		contents++;
-		if (check_only)
+
+		/* add the path to the appropriate result list */
+		switch (state) {
+		case path_excluded:
+			if (dir->flags & DIR_SHOW_IGNORED)
+				dir_add_name(dir, path.buf, path.len);
+			break;
+
+		case path_untracked:
+			if (!(dir->flags & DIR_SHOW_IGNORED))
+				dir_add_name(dir, path.buf, path.len);
 			break;
-		dir_add_name(dir, path.buf, path.len);
+
+		default:
+			break;
+		}
 	}
 	closedir(fdir);
  out:
 	strbuf_release(&path);
 
-	return contents;
+	return dir_state;
 }
 
 static int cmp_name(const void *p1, const void *p2)
@@ -1324,7 +1322,7 @@ static int treat_leading_path(struct dir_struct *dir,
 		if (simplify_away(sb.buf, sb.len, simplify))
 			break;
 		if (treat_one_path(dir, &sb, simplify,
-				   DT_DIR, NULL) == path_ignored)
+				   DT_DIR, NULL) == path_none)
 			break; /* do not recurse into it */
 		if (len <= baselen) {
 			rc = 1;
-- 
1.8.1.2.8026.g2b66448.dirty

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v2 14/14] dir.c: git-status --ignored: don't scan the work tree twice
  2013-04-15 19:04 ` [PATCH v2 00/14] " Karsten Blees
                     ` (12 preceding siblings ...)
  2013-04-15 19:14   ` [PATCH v2 13/14] dir.c: git-status --ignored: don't scan the work tree three times Karsten Blees
@ 2013-04-15 19:15   ` Karsten Blees
  2013-04-15 19:23   ` [PATCH v2 00/14] Improve git-status --ignored Junio C Hamano
  14 siblings, 0 replies; 36+ messages in thread
From: Karsten Blees @ 2013-04-15 19:15 UTC (permalink / raw)
  To: Git List
  Cc: Karsten Blees, Junio C Hamano, Erik Faye-Lund,
	Ramkumar Ramachandra, Robert Zeh, Duy Nguyen, Antoine Pelisse,
	Adam Spiers

'git-status --ignored' still scans the work tree twice to collect
untracked and ignored files, respectively.

fill_directory / read_directory already supports collecting untracked and
ignored files in a single directory scan. However, the DIR_COLLECT_IGNORED
flag to enable this has some git-add specific side-effects (e.g. it
doesn't recurse into ignored directories, so listing ignored files with
--untracked=all doesn't work).

The DIR_SHOW_IGNORED flag doesn't list untracked files and returns ignored
files in dir_struct.entries[] (instead of dir_struct.ignored[] as
DIR_COLLECT_IGNORED). DIR_SHOW_IGNORED is used all throughout git.

We don't want to break the existing API, so lets introduce a new flag
DIR_SHOW_IGNORED_TOO that lists untracked as well as ignored files similar
to DIR_COLLECT_FILES, but will recurse into sub-directories based on the
other flags as DIR_SHOW_IGNORED does.

In dir.c::read_directory_recursive, add ignored files to either
dir_struct.entries[] or dir_struct.ignored[] based on the flags. Also move
the DIR_COLLECT_IGNORED case here so that filling result lists is in a
common place.

In wt-status.c::wt_status_collect_untracked, use the new flag and read
results from dir_struct.ignored[]. Remove the extra fill_directory call.

builtin/check-ignore.c doesn't call fill_directory, setting the git-add
specific DIR_COLLECT_IGNORED flag has no effect here. Remove for clarity.

Update API documentation to reflect the changes.

Performance: with this patch, 'git-status --ignored' is typically as fast
as 'git-status'.

Signed-off-by: Karsten Blees <blees@dcon.de>
---
 Documentation/technical/api-directory-listing.txt | 25 ++++++++++++++++++++---
 builtin/check-ignore.c                            |  1 -
 dir.c                                             | 10 +++++----
 dir.h                                             |  3 ++-
 wt-status.c                                       | 24 ++++++++++------------
 5 files changed, 41 insertions(+), 22 deletions(-)

diff --git a/Documentation/technical/api-directory-listing.txt b/Documentation/technical/api-directory-listing.txt
index 1f349b2..7f8e78d 100644
--- a/Documentation/technical/api-directory-listing.txt
+++ b/Documentation/technical/api-directory-listing.txt
@@ -22,12 +22,23 @@ The notable options are:
 
 `flags`::
 
-	A bit-field of options:
+	A bit-field of options (the `*IGNORED*` flags are mutually exclusive):
 
 `DIR_SHOW_IGNORED`:::
 
-	The traversal is for finding just ignored files, not unignored
-	files.
+	Return just ignored files in `entries[]`, not untracked files.
+
+`DIR_SHOW_IGNORED_TOO`:::
+
+	Similar to `DIR_SHOW_IGNORED`, but return ignored files in `ignored[]`
+	in addition to untracked files in `entries[]`.
+
+`DIR_COLLECT_IGNORED`:::
+
+	Special mode for git-add. Return ignored files in `ignored[]` and
+	untracked files in `entries[]`. Only returns ignored files that match
+	pathspec exactly (no wildcards). Does not recurse into ignored
+	directories.
 
 `DIR_SHOW_OTHER_DIRECTORIES`:::
 
@@ -57,6 +68,14 @@ The result of the enumeration is left in these fields:
 
 	Internal use; keeps track of allocation of `entries[]` array.
 
+`ignored[]`::
+
+	An array of `struct dir_entry`, used for ignored paths with the
+	`DIR_SHOW_IGNORED_TOO` and `DIR_COLLECT_IGNORED` flags.
+
+`ignored_nr`::
+
+	The number of members in `ignored[]` array.
 
 Calling sequence
 ----------------
diff --git a/builtin/check-ignore.c b/builtin/check-ignore.c
index 7388346..854a88a 100644
--- a/builtin/check-ignore.c
+++ b/builtin/check-ignore.c
@@ -66,7 +66,6 @@ static int check_ignore(const char *prefix, const char **pathspec)
 		die(_("index file corrupt"));
 
 	memset(&dir, 0, sizeof(dir));
-	dir.flags |= DIR_COLLECT_IGNORED;
 	setup_standard_excludes(&dir);
 
 	if (!pathspec || !*pathspec) {
diff --git a/dir.c b/dir.c
index 5770ed4..847e2f2 100644
--- a/dir.c
+++ b/dir.c
@@ -1139,15 +1139,12 @@ static enum path_treatment treat_one_path(struct dir_struct *dir,
 		return path_none;
 
 	exclude = is_excluded(dir, path->buf, &dtype);
-	if (exclude && (dir->flags & DIR_COLLECT_IGNORED)
-	    && exclude_matches_pathspec(path->buf, path->len, simplify))
-		dir_add_ignored(dir, path->buf, path->len);
 
 	/*
 	 * Excluded? If we don't explicitly want to show
 	 * ignored files, ignore it
 	 */
-	if (exclude && !(dir->flags & DIR_SHOW_IGNORED))
+	if (exclude && !(dir->flags & (DIR_SHOW_IGNORED|DIR_SHOW_IGNORED_TOO)))
 		return path_excluded;
 
 	switch (dtype) {
@@ -1236,6 +1233,11 @@ static enum path_treatment read_directory_recursive(struct dir_struct *dir,
 		case path_excluded:
 			if (dir->flags & DIR_SHOW_IGNORED)
 				dir_add_name(dir, path.buf, path.len);
+			else if ((dir->flags & DIR_SHOW_IGNORED_TOO) ||
+				((dir->flags & DIR_COLLECT_IGNORED) &&
+				exclude_matches_pathspec(path.buf, path.len,
+					simplify)))
+				dir_add_ignored(dir, path.buf, path.len);
 			break;
 
 		case path_untracked:
diff --git a/dir.h b/dir.h
index bfe726e..3d6b80c 100644
--- a/dir.h
+++ b/dir.h
@@ -79,7 +79,8 @@ struct dir_struct {
 		DIR_SHOW_OTHER_DIRECTORIES = 1<<1,
 		DIR_HIDE_EMPTY_DIRECTORIES = 1<<2,
 		DIR_NO_GITLINKS = 1<<3,
-		DIR_COLLECT_IGNORED = 1<<4
+		DIR_COLLECT_IGNORED = 1<<4,
+		DIR_SHOW_IGNORED_TOO = 1<<5
 	} flags;
 	struct dir_entry **entries;
 	struct dir_entry **ignored;
diff --git a/wt-status.c b/wt-status.c
index 79eb124..7b3cc8e 100644
--- a/wt-status.c
+++ b/wt-status.c
@@ -503,9 +503,12 @@ static void wt_status_collect_untracked(struct wt_status *s)
 	if (s->show_untracked_files != SHOW_ALL_UNTRACKED_FILES)
 		dir.flags |=
 			DIR_SHOW_OTHER_DIRECTORIES | DIR_HIDE_EMPTY_DIRECTORIES;
+	if (s->show_ignored_files)
+		dir.flags |= DIR_SHOW_IGNORED_TOO;
 	setup_standard_excludes(&dir);
 
 	fill_directory(&dir, s->pathspec);
+
 	for (i = 0; i < dir.nr; i++) {
 		struct dir_entry *ent = dir.entries[i];
 		if (cache_name_is_other(ent->name, ent->len) &&
@@ -514,22 +517,17 @@ static void wt_status_collect_untracked(struct wt_status *s)
 		free(ent);
 	}
 
-	if (s->show_ignored_files) {
-		dir.nr = 0;
-		dir.flags = DIR_SHOW_IGNORED;
-		if (s->show_untracked_files != SHOW_ALL_UNTRACKED_FILES)
-			dir.flags |= DIR_SHOW_OTHER_DIRECTORIES | DIR_HIDE_EMPTY_DIRECTORIES;
-		fill_directory(&dir, s->pathspec);
-		for (i = 0; i < dir.nr; i++) {
-			struct dir_entry *ent = dir.entries[i];
-			if (cache_name_is_other(ent->name, ent->len) &&
-			    match_pathspec(s->pathspec, ent->name, ent->len, 0, NULL))
-				string_list_insert(&s->ignored, ent->name);
-			free(ent);
-		}
+	for (i = 0; i < dir.ignored_nr; i++) {
+		struct dir_entry *ent = dir.ignored[i];
+		if (cache_name_is_other(ent->name, ent->len) &&
+		    match_pathspec(s->pathspec, ent->name, ent->len, 0, NULL))
+			string_list_insert(&s->ignored, ent->name);
+		free(ent);
 	}
 
 	free(dir.entries);
+	free(dir.ignored);
+	clear_directory(&dir);
 }
 
 void wt_status_collect(struct wt_status *s)
-- 
1.8.1.2.8026.g2b66448.dirty

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 00/14] Improve git-status --ignored
  2013-04-15 19:04 ` [PATCH v2 00/14] " Karsten Blees
                     ` (13 preceding siblings ...)
  2013-04-15 19:15   ` [PATCH v2 14/14] dir.c: git-status --ignored: don't scan the work tree twice Karsten Blees
@ 2013-04-15 19:23   ` Junio C Hamano
  2013-04-15 19:33     ` Junio C Hamano
  14 siblings, 1 reply; 36+ messages in thread
From: Junio C Hamano @ 2013-04-15 19:23 UTC (permalink / raw)
  To: Karsten Blees
  Cc: Git List, Erik Faye-Lund, Ramkumar Ramachandra, Robert Zeh,
	Duy Nguyen, Antoine Pelisse, Adam Spiers

Karsten Blees <karsten.blees@gmail.com> writes:

> This patch series addresses several bugs and performance issues in
> .gitignore processing.

A 8-patch series ending at 5d765dc7888b (dir.c: git-status: avoid
is_excluded checks for tracked files, 2013-03-18) has been cooking
in 'next'; in general we won't revert and requeue a new round for a
topic that has already merged to 'next'.

Is this "v2" an update for that topic?

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 00/14] Improve git-status --ignored
  2013-04-15 19:23   ` [PATCH v2 00/14] Improve git-status --ignored Junio C Hamano
@ 2013-04-15 19:33     ` Junio C Hamano
  2013-04-15 20:06       ` Karsten Blees
  0 siblings, 1 reply; 36+ messages in thread
From: Junio C Hamano @ 2013-04-15 19:33 UTC (permalink / raw)
  To: Karsten Blees
  Cc: Git List, Erik Faye-Lund, Ramkumar Ramachandra, Robert Zeh,
	Duy Nguyen, Antoine Pelisse, Adam Spiers

Junio C Hamano <gitster@pobox.com> writes:

> Karsten Blees <karsten.blees@gmail.com> writes:
>
>> This patch series addresses several bugs and performance issues in
>> .gitignore processing.
>
> A 8-patch series ending at 5d765dc7888b (dir.c: git-status: avoid
> is_excluded checks for tracked files, 2013-03-18) has been cooking
> in 'next'; in general we won't revert and requeue a new round for a
> topic that has already merged to 'next'.
>
> Is this "v2" an update for that topic?

Hmph, it seems to be the case.  I was hoping that the one in 'next'
was basically in a good shape, not so broken to require a wholesale
replacement like this.  Overall, this round of reroll seems to be
better structured.

Let's revert the v1 from 'next', queue this one to 'pu', and make it
advance slooowly this time.

Thanks.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 00/14] Improve git-status --ignored
  2013-04-15 19:33     ` Junio C Hamano
@ 2013-04-15 20:06       ` Karsten Blees
  2013-04-15 20:25         ` Junio C Hamano
  0 siblings, 1 reply; 36+ messages in thread
From: Karsten Blees @ 2013-04-15 20:06 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Git List, Erik Faye-Lund, Ramkumar Ramachandra, Robert Zeh,
	Duy Nguyen, Antoine Pelisse, Adam Spiers

Am 15.04.2013 21:33, schrieb Junio C Hamano:
> Junio C Hamano <gitster@pobox.com> writes:
> 
>> Karsten Blees <karsten.blees@gmail.com> writes:
>>
>>> This patch series addresses several bugs and performance issues in
>>> .gitignore processing.
>>
>> A 8-patch series ending at 5d765dc7888b (dir.c: git-status: avoid
>> is_excluded checks for tracked files, 2013-03-18) has been cooking
>> in 'next'; in general we won't revert and requeue a new round for a
>> topic that has already merged to 'next'.
>>

I'm sorry to have caused such trouble. I thought you were expecting a reroll from this:

Am 19.03.2013 15:48, schrieb Junio C Hamano:
> Would we we better off kicking nd/read-directory-recursive-optim
> back to 'pu' (and eventually ejecting it) and replacing it with a
> reroll of Karsten's series when it comes, perhaps?


>> Is this "v2" an update for that topic?
> 
> Hmph, it seems to be the case.  I was hoping that the one in 'next'
> was basically in a good shape, not so broken to require a wholesale
> replacement like this.  Overall, this round of reroll seems to be
> better structured.
> 
> Let's revert the v1 from 'next', queue this one to 'pu', and make it
> advance slooowly this time.
> 

Thanks. I don't plan any more rerolls unless required by upcoming discussion.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 00/14] Improve git-status --ignored
  2013-04-15 20:06       ` Karsten Blees
@ 2013-04-15 20:25         ` Junio C Hamano
  2013-04-17 19:50           ` Karsten Blees
                             ` (2 more replies)
  0 siblings, 3 replies; 36+ messages in thread
From: Junio C Hamano @ 2013-04-15 20:25 UTC (permalink / raw)
  To: Karsten Blees
  Cc: Git List, Erik Faye-Lund, Ramkumar Ramachandra, Robert Zeh,
	Duy Nguyen, Antoine Pelisse, Adam Spiers

Karsten Blees <karsten.blees@gmail.com> writes:

> Am 15.04.2013 21:33, schrieb Junio C Hamano:
>> Junio C Hamano <gitster@pobox.com> writes:
>> 
>>> Karsten Blees <karsten.blees@gmail.com> writes:
>>>
>>>> This patch series addresses several bugs and performance issues in
>>>> .gitignore processing.
>>>
>>> A 8-patch series ending at 5d765dc7888b (dir.c: git-status: avoid
>>> is_excluded checks for tracked files, 2013-03-18) has been cooking
>>> in 'next'; in general we won't revert and requeue a new round for a
>>> topic that has already merged to 'next'.
>>>
>
> I'm sorry to have caused such trouble. I thought you were expecting a reroll from this:

Heh, that was an ancient history.

It is not such a big deal to revert stuff from 'next'.  I've tried
to queue this reroll, but tentatively ejected the result from 'pu'
for today's integration run, as as/check-ignore topic has an
unpleasant conflict with this.

Thanks.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 10/14] dir.c: unify is_excluded and is_path_excluded APIs
  2013-04-15 19:12   ` [PATCH v2 10/14] dir.c: unify is_excluded and is_path_excluded APIs Karsten Blees
@ 2013-04-15 21:35     ` Junio C Hamano
  0 siblings, 0 replies; 36+ messages in thread
From: Junio C Hamano @ 2013-04-15 21:35 UTC (permalink / raw)
  To: Karsten Blees
  Cc: Git List, Erik Faye-Lund, Ramkumar Ramachandra, Robert Zeh,
	Duy Nguyen, Antoine Pelisse, Adam Spiers

Karsten Blees <karsten.blees@gmail.com> writes:

> is_excluded doesn't handle ignored directories, results for paths within
> ignored directories are incorrect. This is probably based on the premise
> that recursive directory scans should stop at ignored directories,...

Correct.  Long time ago, we used to have strange rules in a case
like this:

	.gitignore has "d/" to exclude it;
        d/.gitignore says "!f" to cause d/f included.

and when you ask about d/f, we would say "d/ is ignored, so we
wouldn't even bother checking and declare it ignored", while asked
about "f" from within "d/" by first chdir'ing to it, we said "the
fact that you are inside d/ alone means you are interested in
looking at its contents, and checking d/.gitignore tells us that you
are interested in f". "recursion stops at ignored paths" mattered
back then.

I think we consistently honors higher level .gitignore files these
days.

> Teach is_excluded / prep_exclude about ignored directories: whenever
> entering a new directory, first check if the entire directory is excluded.
> Remember the excluded state in dir_struct. Don't traverse into already
> ignored directories (i.e. don't read irrelevant .gitignore files).
> ...
> Here's some performance data from the linux and WebKit repos (best of 10
> runs on a Debian Linux on SSD, core.preloadIndex=true):
>
>        | ls-files -ci   |    status      | status --ignored
>        | linux | WebKit | linux | WebKit | linux | WebKit
> -------+-------+--------+-------+--------+-------+---------
> before | 0.506 |  6.539 | 0.212 |  1.555 | 0.323 |  2.541
> after  | 0.080 |  1.191 | 0.218 |  1.583 | 0.321 |  2.579
> gain   | 6.325 |  5.490 | 0.972 |  0.982 | 1.006 |  0.985

Nice.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH] read_revisions_from_stdin: make copies for handle_revision_arg
  2013-04-15 19:06   ` [PATCH v2 02/14] dir.c: git-status --ignored: don't list files in " Karsten Blees
@ 2013-04-16  9:57     ` Thomas Rast
  2013-04-16 18:17       ` Junio C Hamano
  0 siblings, 1 reply; 36+ messages in thread
From: Thomas Rast @ 2013-04-16  9:57 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Miklos Vajna, Michael Haggerty, Ramkumar Ramachandra, git

read_revisions_from_stdin() has passed pointers to its read buffer
down to handle_revision_arg() since its inception way back in 42cabc3
(Teach rev-list an option to read revs from the standard input.,
2006-09-05).  Even back then, this was a bug: through
add_pending_object, the argument was recorded in the object_array's
'name' field.

Fix it by making a copy whenever read_revisions_from_stdin() passes an
argument down the callchain.  The other caller runs handle_revision_arg()
on argv[], where it would be redundant to make a copy.

Signed-off-by: Thomas Rast <trast@inf.ethz.ch>
---

> So I changed my mind.  Your "easy fix" looks to me the right thing
> to do.

So here's the same with a commit message and signoff.  I hope I got my
history right; I didn't look too long if it had any users, but it was
definitely recorded.


 revision.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/revision.c b/revision.c
index 3a20c96..181a8db 100644
--- a/revision.c
+++ b/revision.c
@@ -1277,7 +1277,8 @@ static void read_revisions_from_stdin(struct rev_info *revs,
 			}
 			die("options not supported in --stdin mode");
 		}
-		if (handle_revision_arg(sb.buf, revs, 0, REVARG_CANNOT_BE_FILENAME))
+		if (handle_revision_arg(xstrdup(sb.buf), revs, 0,
+					REVARG_CANNOT_BE_FILENAME))
 			die("bad revision '%s'", sb.buf);
 	}
 	if (seen_dashdash)
-- 
1.8.2.1.703.g2535f49

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 01/14] dir.c: git-status --ignored: don't drop ignored directories
  2013-04-15 19:05   ` [PATCH v2 01/14] dir.c: git-status --ignored: don't drop ignored directories Karsten Blees
@ 2013-04-16 17:33     ` Ramkumar Ramachandra
  2013-04-17  0:31       ` Karsten Blees
  0 siblings, 1 reply; 36+ messages in thread
From: Ramkumar Ramachandra @ 2013-04-16 17:33 UTC (permalink / raw)
  To: Karsten Blees
  Cc: Git List, Junio C Hamano, Erik Faye-Lund, Robert Zeh, Duy Nguyen,
	Antoine Pelisse, Adam Spiers

Firstly, great work on the series! I've just started looking into it,
so please don't take my comments too seriously: some of them may be
queries, and others may be minor suggestions, but I can't say I
understand the area you're patching.  I know Junio doesn't like me
mixing queries in reviews, but I don't fully agree with his policy.

Karsten Blees wrote:
> 'git-status --ignored' drops ignored directories if they contain untracked
> files in an untracked sub directory.

Wait, ignored directories will always contain untracked
subdirectories, unless you add -f them, right?  Why are you saying
untracked files in an _untracked_ subdirectory?  We don't track
directories anyway, and I would call a directory "tracked" if there's
atleast one file inside it is tracked.  So, my understanding of this
is:

    quux/
       bar
        baz/
             foo

In this example, if quux is ignored and untracked, git status
--ignored currently shows quux/.  If quux/bar is tracked (say with add
-f), but baz/ is untracked, git status --ignored doesn't show me
anything.  What exactly is the bug you're fixing?  I'll try to look at
the tests to infer this, but your commit message could probably be
clearer.

Nit: please s/git-status/git status/

> Fix it by getting exact (recursive) excluded status in treat_directory.

Okay, so you're patching treat_directory() in dir.c to do some
recursive exclude handling.  Let's see what this is.

> diff --git a/dir.c b/dir.c
> index 57394e4..ec4eebf 100644
> --- a/dir.c
> +++ b/dir.c
> @@ -1060,6 +1060,15 @@ static enum directory_treatment treat_directory(struct dir_struct *dir,
>
>         /* This is the "show_other_directories" case */
>
> +       /* might be a sub directory in an excluded directory */
> +       if (!exclude) {
> +               struct path_exclude_check check;
> +               int dt = DT_DIR;
> +               path_exclude_check_init(&check, dir);
> +               exclude = is_path_excluded(&check, dirname, len, &dt);
> +               path_exclude_check_clear(&check);
> +       }
> +

So, I'm guessing that DT_DIR refers to a value that a field in struct
dirent can take; that value could be one of DIR (directory), REG
(regular file?), LNK (symbolic link?).  I don't get much of this, but
what I do get is that you're setting exclude for the rest of the code
in this function.

Sorry that I'm not able to do a more thorough review.

> diff --git a/t/t7061-wtstatus-ignore.sh b/t/t7061-wtstatus-ignore.sh
> index 0da1214..0f1034e 100755
> --- a/t/t7061-wtstatus-ignore.sh
> +++ b/t/t7061-wtstatus-ignore.sh
> @@ -143,4 +143,31 @@ test_expect_success 'status ignored tracked directory and uncommitted file with
>         test_cmp expected actual
>  '
>
> +cat >expected <<\EOF
> +?? .gitignore
> +?? actual
> +?? expected
> +!! tracked/
> +EOF

Please put these segments inside the test_expect_success block, so
it's easy to think about those blocks in isolation.  I know you're
just following the existing conventions existing in this test, but
those are not necessarily good conventions.

> +test_expect_success 'status ignored tracked directory with uncommitted file in untracked subdir with --ignore' '
> +       rm -rf tracked/uncommitted &&
> +       mkdir tracked/ignored &&
> +       : >tracked/ignored/uncommitted &&
> +       git status --porcelain --ignored >actual &&
> +       test_cmp expected actual
> +'

This is very confusing.  How is tracked a tracked directory?  Oh,
right: some previous test git add'ed tracked/committed.  How do I know
about that in this test?

Yeah, changes to tracked ignored directories are not shown, but the
commit message didn't tell me this.

> +cat >expected <<\EOF
> +?? .gitignore
> +?? actual
> +?? expected
> +!! tracked/ignored/uncommitted
> +EOF
> +
> +test_expect_success 'status ignored tracked directory with uncommitted file in untracked subdir with --ignore -u' '
> +       git status --porcelain --ignored -u >actual &&
> +       test_cmp expected actual
> +'
> +
>  test_done

I suppose the commit message told me about this one vaguely, but I
think it could be much clearer overall.

Thanks.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 03/14] dir.c: git-status --ignored: don't list empty ignored directories
  2013-04-15 19:07   ` [PATCH v2 03/14] dir.c: git-status --ignored: don't list empty ignored directories Karsten Blees
@ 2013-04-16 17:48     ` Ramkumar Ramachandra
  2013-04-17  0:31       ` Karsten Blees
  0 siblings, 1 reply; 36+ messages in thread
From: Ramkumar Ramachandra @ 2013-04-16 17:48 UTC (permalink / raw)
  To: Karsten Blees
  Cc: Git List, Junio C Hamano, Erik Faye-Lund, Robert Zeh, Duy Nguyen,
	Antoine Pelisse, Adam Spiers

Karsten Blees wrote:
> 'git-status --ignored' lists ignored tracked directories without any
> ignored files if a tracked file happens to match an exclude pattern.

Here, I have:

    quux/
        bar
        baz/
            foo

So, quux is an ignored tracked directory.  bar is tracked, but matches
an ignore pattern.  Currently, git status --ignored lists quux/.  I'm
confused.

> Always exclude tracked files.

"exclude" it from the 'git status --ignored' output, I presume?
There's already an _exclude_ pattern in your previous sentence, so you
can see why the reader might be confused about what you're talking
about.

> diff --git a/dir.c b/dir.c
> index 7c9bc9c..fd1f088 100644
> --- a/dir.c
> +++ b/dir.c
> @@ -1109,16 +1109,13 @@ static int treat_file(struct dir_struct *dir, struct strbuf *path, int exclude,
>         struct path_exclude_check check;
>         int exclude_file = 0;
>
> +       /* Always exclude indexed files */
> +       if (index_name_exists(&the_index, path->buf, path->len, ignore_case))
> +               return 1;
> +
>         if (exclude)
>                 exclude_file = !(dir->flags & DIR_SHOW_IGNORED);
>         else if (dir->flags & DIR_SHOW_IGNORED) {
> -               /* Always exclude indexed files */
> -               struct cache_entry *ce = index_name_exists(&the_index,
> -                   path->buf, path->len, ignore_case);
> -
> -               if (ce)
> -                       return 1;
> -

Okay, so you just moved this segment outside the else if()
conditional.  Can you explain what the old logic was doing, and what
the rationale for it was?

> diff --git a/t/t7061-wtstatus-ignore.sh b/t/t7061-wtstatus-ignore.sh
> index 4ece129..28b7d95 100755
> --- a/t/t7061-wtstatus-ignore.sh
> +++ b/t/t7061-wtstatus-ignore.sh
> @@ -122,10 +122,34 @@ cat >expected <<\EOF
>  ?? .gitignore
>  ?? actual
>  ?? expected
> +EOF
> +
> +test_expect_success 'status ignored tracked directory and ignored file with --ignore' '
> +       echo "committed" >>.gitignore &&
> +       git status --porcelain --ignored >actual &&
> +       test_cmp expected actual
> +'

Um, didn't really get this one.  You have three untracked files, and
git status seems to be showing them fine.  What am I missing?

> +cat >expected <<\EOF
> +?? .gitignore
> +?? actual
> +?? expected
> +EOF
> +
> +test_expect_success 'status ignored tracked directory and ignored file with --ignore -u' '
> +       git status --porcelain --ignored -u >actual &&
> +       test_cmp expected actual
> +'

I didn't understand why you're invoking -u here (doesn't it imply
"all", as opposed to "normal" when unspecified?).  There are really no
directories, so I don't know what I'm expected to see.

> +cat >expected <<\EOF
> +?? .gitignore
> +?? actual
> +?? expected
>  !! tracked/
>  EOF
>
>  test_expect_success 'status ignored tracked directory and uncommitted file with --ignore' '
> +       echo "tracked" >.gitignore &&
>         : >tracked/uncommitted &&
>         git status --porcelain --ignored >actual &&
>         test_cmp expected actual

Didn't we test this in the last patch?  Okay, I'm completely confused now.

Once again, apologies for my inexperienced comments:  I'm contributing
whatever little I can to the review process.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH] read_revisions_from_stdin: make copies for handle_revision_arg
  2013-04-16  9:57     ` [PATCH] read_revisions_from_stdin: make copies for handle_revision_arg Thomas Rast
@ 2013-04-16 18:17       ` Junio C Hamano
  0 siblings, 0 replies; 36+ messages in thread
From: Junio C Hamano @ 2013-04-16 18:17 UTC (permalink / raw)
  To: Thomas Rast; +Cc: Miklos Vajna, Michael Haggerty, Ramkumar Ramachandra, git

Thomas Rast <trast@inf.ethz.ch> writes:

> read_revisions_from_stdin() has passed pointers to its read buffer
> down to handle_revision_arg() since its inception way back in 42cabc3
> (Teach rev-list an option to read revs from the standard input.,
> 2006-09-05).  Even back then, this was a bug: through
> add_pending_object, the argument was recorded in the object_array's
> 'name' field.
>
> Fix it by making a copy whenever read_revisions_from_stdin() passes an
> argument down the callchain.  The other caller runs handle_revision_arg()
> on argv[], where it would be redundant to make a copy.
>
> Signed-off-by: Thomas Rast <trast@inf.ethz.ch>
> ---
>
>> So I changed my mind.  Your "easy fix" looks to me the right thing
>> to do.
>
> So here's the same with a commit message and signoff.  I hope I got my
> history right; I didn't look too long if it had any users, but it was
> definitely recorded.

Thanks.


>
>
>  revision.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/revision.c b/revision.c
> index 3a20c96..181a8db 100644
> --- a/revision.c
> +++ b/revision.c
> @@ -1277,7 +1277,8 @@ static void read_revisions_from_stdin(struct rev_info *revs,
>  			}
>  			die("options not supported in --stdin mode");
>  		}
> -		if (handle_revision_arg(sb.buf, revs, 0, REVARG_CANNOT_BE_FILENAME))
> +		if (handle_revision_arg(xstrdup(sb.buf), revs, 0,
> +					REVARG_CANNOT_BE_FILENAME))
>  			die("bad revision '%s'", sb.buf);
>  	}
>  	if (seen_dashdash)

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 01/14] dir.c: git-status --ignored: don't drop ignored directories
  2013-04-16 17:33     ` Ramkumar Ramachandra
@ 2013-04-17  0:31       ` Karsten Blees
  0 siblings, 0 replies; 36+ messages in thread
From: Karsten Blees @ 2013-04-17  0:31 UTC (permalink / raw)
  To: Ramkumar Ramachandra
  Cc: Git List, Junio C Hamano, Erik Faye-Lund, Robert Zeh, Duy Nguyen,
	Antoine Pelisse, Adam Spiers

Am 16.04.2013 19:33, schrieb Ramkumar Ramachandra:
> Firstly, great work on the series! I've just started looking into it,
> so please don't take my comments too seriously: some of them may be
> queries, and others may be minor suggestions, but I can't say I
> understand the area you're patching.  I know Junio doesn't like me
> mixing queries in reviews, but I don't fully agree with his policy.
> 
> Karsten Blees wrote:
>> 'git-status --ignored' drops ignored directories if they contain untracked
>> files in an untracked sub directory.
> 
> Wait, ignored directories will always contain untracked
> subdirectories, unless you add -f them, right?  Why are you saying
> untracked files in an _untracked_ subdirectory?  We don't track

IIRC i mentioned untracked subdirectory explicitly because the problem doesn't occur if the subdirectory is tracked (i.e. contains a tracked file).

> directories anyway, and I would call a directory "tracked" if there's
> atleast one file inside it is tracked.  So, my understanding of this
> is:
> 
>     quux/
>        bar
>         baz/
>              foo
> 
> In this example, if quux is ignored and untracked, git status
> --ignored currently shows quux/.  If quux/bar is tracked (say with add
> -f), but baz/ is untracked, git status --ignored doesn't show me
> anything.  What exactly is the bug you're fixing?

With your example, you get this:

1) git-status --ignored -uall: quux/baz/foo
2) git-status --ignored -unormal: <empty>

Without the untracked directory 'baz' between 'quux' and 'foo':

	quux/
		bar
		foo

you get this:

3) git status --ignored -uall: quux/foo
4) git status --ignored -unormal: quux/

The bug is in case (2), this should be:

2) git-status --ignored -unormal: quux/

I.e. ignored directories with untracked files in untracked sub directories should be listed as ignored, because they contain ignored files. Currently these directories are not listed. I realize this sounds pretty much like the original commit message, but I don't know how to describe it any better.

> I'll try to look at
> the tests to infer this, but your commit message could probably be
> clearer.
> 
> Nit: please s/git-status/git status/

Both notations (as well as just 'status') seem to be widely used. Does this justify a reroll of the entire series with changed commit messages (I believe I've used git-status / git-ls-files / git-clean consistently in all patches)?

> 
>> Fix it by getting exact (recursive) excluded status in treat_directory.
> 
> Okay, so you're patching treat_directory() in dir.c to do some
> recursive exclude handling.  Let's see what this is.
> 
>> diff --git a/dir.c b/dir.c
>> index 57394e4..ec4eebf 100644
>> --- a/dir.c
>> +++ b/dir.c
>> @@ -1060,6 +1060,15 @@ static enum directory_treatment treat_directory(struct dir_struct *dir,
>>
>>         /* This is the "show_other_directories" case */
>>
>> +       /* might be a sub directory in an excluded directory */
>> +       if (!exclude) {
>> +               struct path_exclude_check check;
>> +               int dt = DT_DIR;
>> +               path_exclude_check_init(&check, dir);
>> +               exclude = is_path_excluded(&check, dirname, len, &dt);
>> +               path_exclude_check_clear(&check);
>> +       }
>> +
> 
> So, I'm guessing that DT_DIR refers to a value that a field in struct
> dirent can take; that value could be one of DIR (directory), REG
> (regular file?), LNK (symbolic link?).  I don't get much of this, but
> what I do get is that you're setting exclude for the rest of the code
> in this function.
> 

This simply does a recursive excluded check via is_path_excluded, because is_excluded is not good enough. The rest is just boilerplate code (i.e. initialize and clear the check structure and passing a dtype == DT_DIR because dirname obviously is a directory here). We already have similar code in treat_file (and all other places that use is_path_excluded).

> Sorry that I'm not able to do a more thorough review.
> 
>> diff --git a/t/t7061-wtstatus-ignore.sh b/t/t7061-wtstatus-ignore.sh
>> index 0da1214..0f1034e 100755
>> --- a/t/t7061-wtstatus-ignore.sh
>> +++ b/t/t7061-wtstatus-ignore.sh
>> @@ -143,4 +143,31 @@ test_expect_success 'status ignored tracked directory and uncommitted file with
>>         test_cmp expected actual
>>  '
>>
>> +cat >expected <<\EOF
>> +?? .gitignore
>> +?? actual
>> +?? expected
>> +!! tracked/
>> +EOF
> 
> Please put these segments inside the test_expect_success block, so
> it's easy to think about those blocks in isolation.  I know you're
> just following the existing conventions existing in this test, but
> those are not necessarily good conventions.
> 

$ grep -e '^cat .*EOF$' t*.sh | wc -l
    743
$ grep -e '^    cat .*EOF$' t*.sh | wc -l
     22

Setting up expected results outside test_expect_* (i.e. unindented) seems to be the norm in git tests.

>> +test_expect_success 'status ignored tracked directory with uncommitted file in untracked subdir with --ignore' '
>> +       rm -rf tracked/uncommitted &&
>> +       mkdir tracked/ignored &&
>> +       : >tracked/ignored/uncommitted &&
>> +       git status --porcelain --ignored >actual &&
>> +       test_cmp expected actual
>> +'
> 
> This is very confusing.  How is tracked a tracked directory?  Oh,
> right: some previous test git add'ed tracked/committed.  How do I know
> about that in this test?
> 

This test expands on the previous test 'status ignored tracked directory and uncommitted file with --ignore', i.e. the test just adds the 'in untracked subdir' part (and thus moves the 'uncommitted' file into a sub directory). Expanding on previous state is the whole point of putting several tests in a single t*.sh file, right? If I wanted to set up the test fixture from scratch I would have started a new t7063-whatever.sh.

> Yeah, changes to tracked ignored directories are not shown, but the
> commit message didn't tell me this.
> 
>> +cat >expected <<\EOF
>> +?? .gitignore
>> +?? actual
>> +?? expected
>> +!! tracked/ignored/uncommitted
>> +EOF
>> +
>> +test_expect_success 'status ignored tracked directory with uncommitted file in untracked subdir with --ignore -u' '
>> +       git status --porcelain --ignored -u >actual &&
>> +       test_cmp expected actual
>> +'
>> +
>>  test_done
> 
> I suppose the commit message told me about this one vaguely, but I
> think it could be much clearer overall.
> 

Note that all tests in t7061 come in pairs '--ignored' (i.e. --untracked-files=normal) and '--ignored -u' (i.e. --untracked-files=all). If '-uall' lists individual ignored files, then '-unormal' should list the ignored directory.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 03/14] dir.c: git-status --ignored: don't list empty ignored directories
  2013-04-16 17:48     ` Ramkumar Ramachandra
@ 2013-04-17  0:31       ` Karsten Blees
  0 siblings, 0 replies; 36+ messages in thread
From: Karsten Blees @ 2013-04-17  0:31 UTC (permalink / raw)
  To: Ramkumar Ramachandra
  Cc: Git List, Junio C Hamano, Erik Faye-Lund, Robert Zeh, Duy Nguyen,
	Antoine Pelisse, Adam Spiers

Am 16.04.2013 19:48, schrieb Ramkumar Ramachandra:
> Karsten Blees wrote:
>> 'git-status --ignored' lists ignored tracked directories without any
>> ignored files if a tracked file happens to match an exclude pattern.
> 
> Here, I have:
> 
>     quux/
>         bar
>         baz/
>             foo
> 
> So, quux is an ignored tracked directory.  bar is tracked, but matches
> an ignore pattern.  Currently, git status --ignored lists quux/.  I'm
> confused.
> 

The crucial part is 'without any ignored files'. So if you 'rm quux/baz/foo' or 'git-add -f quux/baz/foo', you get this:

1) git-status --ignored -uall: <empty>
2) git-status --ignored -unormal: quux/

In case 2), quux/ is listed as ignored even though there are no ignored files and the DIR_HIDE_EMPTY_DIRECTORIES flag should kick in.

>> Always exclude tracked files.
> 
> "exclude" it from the 'git status --ignored' output, I presume?
> There's already an _exclude_ pattern in your previous sentence, so you
> can see why the reader might be confused about what you're talking
> about.
> 

I see, but "Always *ignore* tracked files" is also ambiguous. Suggestions?

>> diff --git a/dir.c b/dir.c
>> index 7c9bc9c..fd1f088 100644
>> --- a/dir.c
>> +++ b/dir.c
>> @@ -1109,16 +1109,13 @@ static int treat_file(struct dir_struct *dir, struct strbuf *path, int exclude,
>>         struct path_exclude_check check;
>>         int exclude_file = 0;
>>
>> +       /* Always exclude indexed files */
>> +       if (index_name_exists(&the_index, path->buf, path->len, ignore_case))
>> +               return 1;
>> +
>>         if (exclude)
>>                 exclude_file = !(dir->flags & DIR_SHOW_IGNORED);
>>         else if (dir->flags & DIR_SHOW_IGNORED) {
>> -               /* Always exclude indexed files */
>> -               struct cache_entry *ce = index_name_exists(&the_index,
>> -                   path->buf, path->len, ignore_case);
>> -
>> -               if (ce)
>> -                       return 1;
>> -
> 
> Okay, so you just moved this segment outside the else if()
> conditional.  Can you explain what the old logic was doing, and what
> the rationale for it was?
> 

Quoting Antoine's patch from January:
+		/*
+		 * Optimization:
+		 * Don't spend time on indexed files, they won't be
+		 * added to the list anyway
+		 */

In other words: don't do the (expensive) recursive is_path_excluded check if the file will be dropped later anyway (see the additional 'cache_name_is_other' checks in wt_status.c::wt_status_collect_untracked and builtin/ls-files.c::show_other_files).

In the DIR_SHOW_OTHER_DIRECTORIES case, we use read_directory_recursive to just *check* for ignored / untracked files, all within dir.c. So the results must be correct *here*, we cannot rely on some other modules to fix up incorrect files later.

>> diff --git a/t/t7061-wtstatus-ignore.sh b/t/t7061-wtstatus-ignore.sh
>> index 4ece129..28b7d95 100755
>> --- a/t/t7061-wtstatus-ignore.sh
>> +++ b/t/t7061-wtstatus-ignore.sh
>> @@ -122,10 +122,34 @@ cat >expected <<\EOF
>>  ?? .gitignore
>>  ?? actual
>>  ?? expected
>> +EOF
>> +
>> +test_expect_success 'status ignored tracked directory and ignored file with --ignore' '
>> +       echo "committed" >>.gitignore &&
>> +       git status --porcelain --ignored >actual &&
>> +       test_cmp expected actual
>> +'
> 
> Um, didn't really get this one.  You have three untracked files, and
> git status seems to be showing them fine.  What am I missing?
> 

We have this:

	tracked/
		committed

.gitignore:
tracked
committed

Both the directory and tracked file match an exclude pattern. There are no untracked ignored files, yet before this patch, git-status --ignored would list tracked/ as ignored directory.

>> +cat >expected <<\EOF
>> +?? .gitignore
>> +?? actual
>> +?? expected
>> +EOF
>> +
>> +test_expect_success 'status ignored tracked directory and ignored file with --ignore -u' '
>> +       git status --porcelain --ignored -u >actual &&
>> +       test_cmp expected actual
>> +'
> 
> I didn't understand why you're invoking -u here (doesn't it imply
> "all", as opposed to "normal" when unspecified?).  There are really no
> directories, so I don't know what I'm expected to see.
> 
>> +cat >expected <<\EOF
>> +?? .gitignore
>> +?? actual
>> +?? expected
>>  !! tracked/
>>  EOF
>>
>>  test_expect_success 'status ignored tracked directory and uncommitted file with --ignore' '
>> +       echo "tracked" >.gitignore &&
>>         : >tracked/uncommitted &&
>>         git status --porcelain --ignored >actual &&
>>         test_cmp expected actual
> 
> Didn't we test this in the last patch?  Okay, I'm completely confused now.
> 

'echo "tracked" >.gitignore' reverts the 'echo "committed" >>.gitignore' from above.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 00/14] Improve git-status --ignored
  2013-04-15 20:25         ` Junio C Hamano
@ 2013-04-17 19:50           ` Karsten Blees
  2013-04-17 22:03             ` Junio C Hamano
  2013-04-17 19:50           ` [PATCH v2-pu 11/14] dir.c: replace is_path_excluded with now equivalent is_excluded API Karsten Blees
  2013-04-17 19:51           ` [PATCH v2-pu 14/14] dir.c: git-status --ignored: don't scan the work tree twice Karsten Blees
  2 siblings, 1 reply; 36+ messages in thread
From: Karsten Blees @ 2013-04-17 19:50 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Git List, Erik Faye-Lund, Ramkumar Ramachandra, Robert Zeh,
	Duy Nguyen, Antoine Pelisse, Adam Spiers

Am 15.04.2013 22:25, schrieb Junio C Hamano:
> Karsten Blees <karsten.blees@gmail.com> writes:
> 
>> Am 15.04.2013 21:33, schrieb Junio C Hamano:
>>> Junio C Hamano <gitster@pobox.com> writes:
>>>
>>>> Karsten Blees <karsten.blees@gmail.com> writes:
>>>>
>>>>> This patch series addresses several bugs and performance issues in
>>>>> .gitignore processing.
>>>>
>>>> A 8-patch series ending at 5d765dc7888b (dir.c: git-status: avoid
>>>> is_excluded checks for tracked files, 2013-03-18) has been cooking
>>>> in 'next'; in general we won't revert and requeue a new round for a
>>>> topic that has already merged to 'next'.
>>>>
>>
>> I'm sorry to have caused such trouble. I thought you were expecting a reroll from this:
> 
> Heh, that was an ancient history.

My git time is scarce, so progress is slow...guess I need some performance improvements, too :-)

> 
> It is not such a big deal to revert stuff from 'next'.  I've tried
> to queue this reroll, but tentatively ejected the result from 'pu'
> for today's integration run, as as/check-ignore topic has an
> unpleasant conflict with this.
> 

I'll send fixups for #11 and #14, or you can pick the entire series rebased to pu from:
https://github.com/kblees/git/tree/kb/improve-git-status-ignored-v2-pu
git pull git://github.com/kblees/git.git kb/improve-git-status-ignored-v2-pu

HTH, Karsten

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH v2-pu 11/14] dir.c: replace is_path_excluded with now equivalent is_excluded API
  2013-04-15 20:25         ` Junio C Hamano
  2013-04-17 19:50           ` Karsten Blees
@ 2013-04-17 19:50           ` Karsten Blees
  2013-04-17 19:51           ` [PATCH v2-pu 14/14] dir.c: git-status --ignored: don't scan the work tree twice Karsten Blees
  2 siblings, 0 replies; 36+ messages in thread
From: Karsten Blees @ 2013-04-17 19:50 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Git List, Erik Faye-Lund, Ramkumar Ramachandra, Robert Zeh,
	Duy Nguyen, Antoine Pelisse, Adam Spiers

Signed-off-by: Karsten Blees <blees@dcon.de>
---
 builtin/add.c          |  5 +---
 builtin/check-ignore.c | 16 ++++------
 builtin/ls-files.c     | 15 +++-------
 dir.c                  | 79 ++++----------------------------------------------
 dir.h                  | 16 ++--------
 unpack-trees.c         | 10 +------
 unpack-trees.h         |  1 -
 7 files changed, 21 insertions(+), 121 deletions(-)

diff --git a/builtin/add.c b/builtin/add.c
index ddd5e38..050330e 100644
--- a/builtin/add.c
+++ b/builtin/add.c
@@ -419,9 +419,7 @@ int cmd_add(int argc, const char **argv, const char *prefix)
 
 	if (pathspec) {
 		int i;
-		struct path_exclude_check check;
 
-		path_exclude_check_init(&check, &dir);
 		if (!seen)
 			seen = find_pathspecs_matching_against_index(pathspec);
 		for (i = 0; pathspec[i]; i++) {
@@ -429,7 +427,7 @@ int cmd_add(int argc, const char **argv, const char *prefix)
 			    && !file_exists(pathspec[i])) {
 				if (ignore_missing) {
 					int dtype = DT_UNKNOWN;
-					if (is_path_excluded(&check, pathspec[i], -1, &dtype))
+					if (is_excluded(&dir, pathspec[i], &dtype))
 						dir_add_ignored(&dir, pathspec[i], strlen(pathspec[i]));
 				} else
 					die(_("pathspec '%s' did not match any files"),
@@ -437,7 +435,6 @@ int cmd_add(int argc, const char **argv, const char *prefix)
 			}
 		}
 		free(seen);
-		path_exclude_check_clear(&check);
 	}
 
 	plug_bulk_checkin();
diff --git a/builtin/check-ignore.c b/builtin/check-ignore.c
index c00a7d6..a4357fb 100644
--- a/builtin/check-ignore.c
+++ b/builtin/check-ignore.c
@@ -63,7 +63,7 @@ static void output_exclude(const char *path, struct exclude *exclude)
 	}
 }
 
-static int check_ignore(struct path_exclude_check *check,
+static int check_ignore(struct dir_struct *dir,
 			const char *prefix, const char **pathspec)
 {
 	const char *path, *full_path;
@@ -91,8 +91,7 @@ static int check_ignore(struct path_exclude_check *check,
 		die_if_path_beyond_symlink(full_path, prefix);
 		exclude = NULL;
 		if (!seen[i]) {
-			exclude = last_exclude_matching_path(check, full_path,
-							     -1, &dtype);
+			exclude = last_exclude_matching(dir, full_path, &dtype);
 		}
 		if (!quiet && (exclude || show_non_matching))
 			output_exclude(path, exclude);
@@ -104,7 +103,7 @@ static int check_ignore(struct path_exclude_check *check,
 	return num_ignored;
 }
 
-static int check_ignore_stdin_paths(struct path_exclude_check *check, const char *prefix)
+static int check_ignore_stdin_paths(struct dir_struct *dir, const char *prefix)
 {
 	struct strbuf buf, nbuf;
 	char *pathspec[2] = { NULL, NULL };
@@ -121,7 +120,7 @@ static int check_ignore_stdin_paths(struct path_exclude_check *check, const char
 			strbuf_swap(&buf, &nbuf);
 		}
 		pathspec[0] = buf.buf;
-		num_ignored += check_ignore(check, prefix, (const char **)pathspec);
+		num_ignored += check_ignore(dir, prefix, (const char **)pathspec);
 		maybe_flush_or_die(stdout, "check-ignore to stdout");
 	}
 	strbuf_release(&buf);
@@ -133,7 +132,6 @@ int cmd_check_ignore(int argc, const char **argv, const char *prefix)
 {
 	int num_ignored;
 	struct dir_struct dir;
-	struct path_exclude_check check;
 
 	git_config(git_default_config, NULL);
 
@@ -166,16 +164,14 @@ int cmd_check_ignore(int argc, const char **argv, const char *prefix)
 	dir.flags |= DIR_COLLECT_IGNORED;
 	setup_standard_excludes(&dir);
 
-	path_exclude_check_init(&check, &dir);
 	if (stdin_paths) {
-		num_ignored = check_ignore_stdin_paths(&check, prefix);
+		num_ignored = check_ignore_stdin_paths(&dir, prefix);
 	} else {
-		num_ignored = check_ignore(&check, prefix, argv);
+		num_ignored = check_ignore(&dir, prefix, argv);
 		maybe_flush_or_die(stdout, "ignore to stdout");
 	}
 
 	clear_directory(&dir);
-	path_exclude_check_clear(&check);
 
 	return !num_ignored;
 }
diff --git a/builtin/ls-files.c b/builtin/ls-files.c
index 175e6e3..2202072 100644
--- a/builtin/ls-files.c
+++ b/builtin/ls-files.c
@@ -201,19 +201,15 @@ static void show_ru_info(void)
 	}
 }
 
-static int ce_excluded(struct path_exclude_check *check, struct cache_entry *ce)
+static int ce_excluded(struct dir_struct *dir, struct cache_entry *ce)
 {
 	int dtype = ce_to_dtype(ce);
-	return is_path_excluded(check, ce->name, ce_namelen(ce), &dtype);
+	return is_excluded(dir, ce->name, &dtype);
 }
 
 static void show_files(struct dir_struct *dir)
 {
 	int i;
-	struct path_exclude_check check;
-
-	if ((dir->flags & DIR_SHOW_IGNORED))
-		path_exclude_check_init(&check, dir);
 
 	/* For cached/deleted files we don't need to even do the readdir */
 	if (show_others || show_killed) {
@@ -227,7 +223,7 @@ static void show_files(struct dir_struct *dir)
 		for (i = 0; i < active_nr; i++) {
 			struct cache_entry *ce = active_cache[i];
 			if ((dir->flags & DIR_SHOW_IGNORED) &&
-			    !ce_excluded(&check, ce))
+			    !ce_excluded(dir, ce))
 				continue;
 			if (show_unmerged && !ce_stage(ce))
 				continue;
@@ -243,7 +239,7 @@ static void show_files(struct dir_struct *dir)
 			struct stat st;
 			int err;
 			if ((dir->flags & DIR_SHOW_IGNORED) &&
-			    !ce_excluded(&check, ce))
+			    !ce_excluded(dir, ce))
 				continue;
 			if (ce->ce_flags & CE_UPDATE)
 				continue;
@@ -256,9 +252,6 @@ static void show_files(struct dir_struct *dir)
 				show_ce_entry(tag_modified, ce);
 		}
 	}
-
-	if ((dir->flags & DIR_SHOW_IGNORED))
-		path_exclude_check_clear(&check);
 }
 
 /*
diff --git a/dir.c b/dir.c
index 7e68a1c..16e1c07 100644
--- a/dir.c
+++ b/dir.c
@@ -843,7 +843,7 @@ static void prep_exclude(struct dir_struct *dir, const char *base, int baselen)
  * Returns the exclude_list element which matched, or NULL for
  * undecided.
  */
-static struct exclude *last_exclude_matching(struct dir_struct *dir,
+struct exclude *last_exclude_matching(struct dir_struct *dir,
 					     const char *pathname,
 					     int *dtype_p)
 {
@@ -865,7 +865,7 @@ static struct exclude *last_exclude_matching(struct dir_struct *dir,
  * scans all exclude lists to determine whether pathname is excluded.
  * Returns 1 if true, otherwise 0.
  */
-static int is_excluded(struct dir_struct *dir, const char *pathname, int *dtype_p)
+int is_excluded(struct dir_struct *dir, const char *pathname, int *dtype_p)
 {
 	struct exclude *exclude =
 		last_exclude_matching(dir, pathname, dtype_p);
@@ -874,47 +874,6 @@ static int is_excluded(struct dir_struct *dir, const char *pathname, int *dtype_
 	return 0;
 }
 
-void path_exclude_check_init(struct path_exclude_check *check,
-			     struct dir_struct *dir)
-{
-	check->dir = dir;
-}
-
-void path_exclude_check_clear(struct path_exclude_check *check)
-{
-}
-
-/*
- * For each subdirectory in name, starting with the top-most, checks
- * to see if that subdirectory is excluded, and if so, returns the
- * corresponding exclude structure.  Otherwise, checks whether name
- * itself (which is presumably a file) is excluded.
- *
- * A path to a directory known to be excluded is left in check->path to
- * optimize for repeated checks for files in the same excluded directory.
- */
-struct exclude *last_exclude_matching_path(struct path_exclude_check *check,
-					   const char *name, int namelen,
-					   int *dtype)
-{
-	return last_exclude_matching(check->dir, name, dtype);
-}
-
-/*
- * Is this name excluded?  This is for a caller like show_files() that
- * do not honor directory hierarchy and iterate through paths that are
- * possibly in an ignored directory.
- */
-int is_path_excluded(struct path_exclude_check *check,
-		  const char *name, int namelen, int *dtype)
-{
-	struct exclude *exclude =
-		last_exclude_matching_path(check, name, namelen, dtype);
-	if (exclude)
-		return exclude->flags & EXC_FLAG_NEGATIVE ? 0 : 1;
-	return 0;
-}
-
 static struct dir_entry *dir_entry_new(const char *pathname, int len)
 {
 	struct dir_entry *ent;
@@ -1086,15 +1045,6 @@ static enum directory_treatment treat_directory(struct dir_struct *dir,
 
 	/* This is the "show_other_directories" case */
 
-	/* might be a sub directory in an excluded directory */
-	if (!exclude) {
-		struct path_exclude_check check;
-		int dt = DT_DIR;
-		path_exclude_check_init(&check, dir);
-		exclude = is_path_excluded(&check, dirname, len, &dt);
-		path_exclude_check_clear(&check);
-	}
-
 	/*
 	 * We are looking for ignored files and our directory is not ignored,
 	 * check if it contains untracked files (i.e. is listed as untracked)
@@ -1129,27 +1079,13 @@ static enum directory_treatment treat_directory(struct dir_struct *dir,
  *
  * Return 1 for exclude, 0 for include.
  */
-static int treat_file(struct dir_struct *dir, struct strbuf *path, int exclude, int *dtype)
+static int treat_file(struct dir_struct *dir, struct strbuf *path, int exclude)
 {
-	struct path_exclude_check check;
-	int exclude_file = 0;
-
 	/* Always exclude indexed files */
 	if (index_name_exists(&the_index, path->buf, path->len, ignore_case))
 		return 1;
 
-	if (exclude)
-		exclude_file = !(dir->flags & DIR_SHOW_IGNORED);
-	else if (dir->flags & DIR_SHOW_IGNORED) {
-		path_exclude_check_init(&check, dir);
-
-		if (!is_path_excluded(&check, path->buf, path->len, dtype))
-			exclude_file = 1;
-
-		path_exclude_check_clear(&check);
-	}
-
-	return exclude_file;
+	return exclude == !(dir->flags & DIR_SHOW_IGNORED);
 }
 
 /*
@@ -1306,12 +1242,9 @@ static enum path_treatment treat_one_path(struct dir_struct *dir,
 		break;
 	case DT_REG:
 	case DT_LNK:
-		switch (treat_file(dir, path, exclude, &dtype)) {
-		case 1:
+		if (treat_file(dir, path, exclude))
 			return path_ignored;
-		default:
-			break;
-		}
+		break;
 	}
 	return path_handled;
 }
diff --git a/dir.h b/dir.h
index cd166d0..bfe726e 100644
--- a/dir.h
+++ b/dir.h
@@ -151,20 +151,10 @@ extern int match_pathname(const char *, int,
 			  const char *, int,
 			  const char *, int, int, int);
 
-/*
- * The is_excluded() API is meant for callers that check each level of leading
- * directory hierarchies with is_excluded() to avoid recursing into excluded
- * directories.  Callers that do not do so should use this API instead.
- */
-struct path_exclude_check {
-	struct dir_struct *dir;
-};
-extern void path_exclude_check_init(struct path_exclude_check *, struct dir_struct *);
-extern void path_exclude_check_clear(struct path_exclude_check *);
-extern struct exclude *last_exclude_matching_path(struct path_exclude_check *, const char *,
-						  int namelen, int *dtype);
-extern int is_path_excluded(struct path_exclude_check *, const char *, int namelen, int *dtype);
+extern struct exclude *last_exclude_matching(struct dir_struct *dir,
+					     const char *name, int *dtype);
 
+extern int is_excluded(struct dir_struct *dir, const char *name, int *dtype);
 
 extern struct exclude_list *add_exclude_list(struct dir_struct *dir,
 					     int group_type, const char *src);
diff --git a/unpack-trees.c b/unpack-trees.c
index 09e53df..ede4299 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -1026,10 +1026,6 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 			o->el = &el;
 	}
 
-	if (o->dir) {
-		o->path_exclude_check = xmalloc(sizeof(struct path_exclude_check));
-		path_exclude_check_init(o->path_exclude_check, o->dir);
-	}
 	memset(&o->result, 0, sizeof(o->result));
 	o->result.initialized = 1;
 	o->result.timestamp.sec = o->src_index->timestamp.sec;
@@ -1155,10 +1151,6 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 
 done:
 	clear_exclude_list(&el);
-	if (o->path_exclude_check) {
-		path_exclude_check_clear(o->path_exclude_check);
-		free(o->path_exclude_check);
-	}
 	return ret;
 
 return_failed:
@@ -1375,7 +1367,7 @@ static int check_ok_to_remove(const char *name, int len, int dtype,
 		return 0;
 
 	if (o->dir &&
-	    is_path_excluded(o->path_exclude_check, name, -1, &dtype))
+	    is_excluded(o->dir, name, &dtype))
 		/*
 		 * ce->name is explicitly excluded, so it is Ok to
 		 * overwrite it.
diff --git a/unpack-trees.h b/unpack-trees.h
index ec74a9f..5e432f5 100644
--- a/unpack-trees.h
+++ b/unpack-trees.h
@@ -52,7 +52,6 @@ struct unpack_trees_options {
 	const char *prefix;
 	int cache_bottom;
 	struct dir_struct *dir;
-	struct path_exclude_check *path_exclude_check;
 	struct pathspec *pathspec;
 	merge_fn_t fn;
 	const char *msgs[NB_UNPACK_TREES_ERROR_TYPES];
-- 
1.8.1.2.8038.ge6fb49b.dirty

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH v2-pu 14/14] dir.c: git-status --ignored: don't scan the work tree twice
  2013-04-15 20:25         ` Junio C Hamano
  2013-04-17 19:50           ` Karsten Blees
  2013-04-17 19:50           ` [PATCH v2-pu 11/14] dir.c: replace is_path_excluded with now equivalent is_excluded API Karsten Blees
@ 2013-04-17 19:51           ` Karsten Blees
  2 siblings, 0 replies; 36+ messages in thread
From: Karsten Blees @ 2013-04-17 19:51 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Git List, Erik Faye-Lund, Ramkumar Ramachandra, Robert Zeh,
	Duy Nguyen, Antoine Pelisse, Adam Spiers

'git-status --ignored' still scans the work tree twice to collect
untracked and ignored files, respectively.

fill_directory / read_directory already supports collecting untracked and
ignored files in a single directory scan. However, the DIR_COLLECT_IGNORED
flag to enable this has some git-add specific side-effects (e.g. it
doesn't recurse into ignored directories, so listing ignored files with
--untracked=all doesn't work).

The DIR_SHOW_IGNORED flag doesn't list untracked files and returns ignored
files in dir_struct.entries[] (instead of dir_struct.ignored[] as
DIR_COLLECT_IGNORED). DIR_SHOW_IGNORED is used all throughout git.

We don't want to break the existing API, so lets introduce a new flag
DIR_SHOW_IGNORED_TOO that lists untracked as well as ignored files similar
to DIR_COLLECT_FILES, but will recurse into sub-directories based on the
other flags as DIR_SHOW_IGNORED does.

In dir.c::read_directory_recursive, add ignored files to either
dir_struct.entries[] or dir_struct.ignored[] based on the flags. Also move
the DIR_COLLECT_IGNORED case here so that filling result lists is in a
common place.

In wt-status.c::wt_status_collect_untracked, use the new flag and read
results from dir_struct.ignored[]. Remove the extra fill_directory call.

builtin/check-ignore.c doesn't call fill_directory, setting the git-add
specific DIR_COLLECT_IGNORED flag has no effect here. Remove for clarity.

Update API documentation to reflect the changes.

Performance: with this patch, 'git-status --ignored' is typically as fast
as 'git-status'.

Signed-off-by: Karsten Blees <blees@dcon.de>
---
 Documentation/technical/api-directory-listing.txt | 25 ++++++++++++++++++++---
 builtin/check-ignore.c                            |  1 -
 dir.c                                             | 10 +++++----
 dir.h                                             |  3 ++-
 wt-status.c                                       | 24 ++++++++++------------
 5 files changed, 41 insertions(+), 22 deletions(-)

diff --git a/Documentation/technical/api-directory-listing.txt b/Documentation/technical/api-directory-listing.txt
index 1f349b2..7f8e78d 100644
--- a/Documentation/technical/api-directory-listing.txt
+++ b/Documentation/technical/api-directory-listing.txt
@@ -22,12 +22,23 @@ The notable options are:
 
 `flags`::
 
-	A bit-field of options:
+	A bit-field of options (the `*IGNORED*` flags are mutually exclusive):
 
 `DIR_SHOW_IGNORED`:::
 
-	The traversal is for finding just ignored files, not unignored
-	files.
+	Return just ignored files in `entries[]`, not untracked files.
+
+`DIR_SHOW_IGNORED_TOO`:::
+
+	Similar to `DIR_SHOW_IGNORED`, but return ignored files in `ignored[]`
+	in addition to untracked files in `entries[]`.
+
+`DIR_COLLECT_IGNORED`:::
+
+	Special mode for git-add. Return ignored files in `ignored[]` and
+	untracked files in `entries[]`. Only returns ignored files that match
+	pathspec exactly (no wildcards). Does not recurse into ignored
+	directories.
 
 `DIR_SHOW_OTHER_DIRECTORIES`:::
 
@@ -57,6 +68,14 @@ The result of the enumeration is left in these fields:
 
 	Internal use; keeps track of allocation of `entries[]` array.
 
+`ignored[]`::
+
+	An array of `struct dir_entry`, used for ignored paths with the
+	`DIR_SHOW_IGNORED_TOO` and `DIR_COLLECT_IGNORED` flags.
+
+`ignored_nr`::
+
+	The number of members in `ignored[]` array.
 
 Calling sequence
 ----------------
diff --git a/builtin/check-ignore.c b/builtin/check-ignore.c
index a4357fb..4a8fc70 100644
--- a/builtin/check-ignore.c
+++ b/builtin/check-ignore.c
@@ -161,7 +161,6 @@ int cmd_check_ignore(int argc, const char **argv, const char *prefix)
 		die(_("index file corrupt"));
 
 	memset(&dir, 0, sizeof(dir));
-	dir.flags |= DIR_COLLECT_IGNORED;
 	setup_standard_excludes(&dir);
 
 	if (stdin_paths) {
diff --git a/dir.c b/dir.c
index 2088891..ee4d04d 100644
--- a/dir.c
+++ b/dir.c
@@ -1183,15 +1183,12 @@ static enum path_treatment treat_one_path(struct dir_struct *dir,
 		return path_none;
 
 	exclude = is_excluded(dir, path->buf, &dtype);
-	if (exclude && (dir->flags & DIR_COLLECT_IGNORED)
-	    && exclude_matches_pathspec(path->buf, path->len, simplify))
-		dir_add_ignored(dir, path->buf, path->len);
 
 	/*
 	 * Excluded? If we don't explicitly want to show
 	 * ignored files, ignore it
 	 */
-	if (exclude && !(dir->flags & DIR_SHOW_IGNORED))
+	if (exclude && !(dir->flags & (DIR_SHOW_IGNORED|DIR_SHOW_IGNORED_TOO)))
 		return path_excluded;
 
 	switch (dtype) {
@@ -1280,6 +1277,11 @@ static enum path_treatment read_directory_recursive(struct dir_struct *dir,
 		case path_excluded:
 			if (dir->flags & DIR_SHOW_IGNORED)
 				dir_add_name(dir, path.buf, path.len);
+			else if ((dir->flags & DIR_SHOW_IGNORED_TOO) ||
+				((dir->flags & DIR_COLLECT_IGNORED) &&
+				exclude_matches_pathspec(path.buf, path.len,
+					simplify)))
+				dir_add_ignored(dir, path.buf, path.len);
 			break;
 
 		case path_untracked:
diff --git a/dir.h b/dir.h
index bfe726e..3d6b80c 100644
--- a/dir.h
+++ b/dir.h
@@ -79,7 +79,8 @@ struct dir_struct {
 		DIR_SHOW_OTHER_DIRECTORIES = 1<<1,
 		DIR_HIDE_EMPTY_DIRECTORIES = 1<<2,
 		DIR_NO_GITLINKS = 1<<3,
-		DIR_COLLECT_IGNORED = 1<<4
+		DIR_COLLECT_IGNORED = 1<<4,
+		DIR_SHOW_IGNORED_TOO = 1<<5
 	} flags;
 	struct dir_entry **entries;
 	struct dir_entry **ignored;
diff --git a/wt-status.c b/wt-status.c
index 676b058..bf84a86 100644
--- a/wt-status.c
+++ b/wt-status.c
@@ -511,9 +511,12 @@ static void wt_status_collect_untracked(struct wt_status *s)
 	if (s->show_untracked_files != SHOW_ALL_UNTRACKED_FILES)
 		dir.flags |=
 			DIR_SHOW_OTHER_DIRECTORIES | DIR_HIDE_EMPTY_DIRECTORIES;
+	if (s->show_ignored_files)
+		dir.flags |= DIR_SHOW_IGNORED_TOO;
 	setup_standard_excludes(&dir);
 
 	fill_directory(&dir, s->pathspec);
+
 	for (i = 0; i < dir.nr; i++) {
 		struct dir_entry *ent = dir.entries[i];
 		if (cache_name_is_other(ent->name, ent->len) &&
@@ -522,22 +525,17 @@ static void wt_status_collect_untracked(struct wt_status *s)
 		free(ent);
 	}
 
-	if (s->show_ignored_files) {
-		dir.nr = 0;
-		dir.flags = DIR_SHOW_IGNORED;
-		if (s->show_untracked_files != SHOW_ALL_UNTRACKED_FILES)
-			dir.flags |= DIR_SHOW_OTHER_DIRECTORIES | DIR_HIDE_EMPTY_DIRECTORIES;
-		fill_directory(&dir, s->pathspec);
-		for (i = 0; i < dir.nr; i++) {
-			struct dir_entry *ent = dir.entries[i];
-			if (cache_name_is_other(ent->name, ent->len) &&
-			    match_pathspec(s->pathspec, ent->name, ent->len, 0, NULL))
-				string_list_insert(&s->ignored, ent->name);
-			free(ent);
-		}
+	for (i = 0; i < dir.ignored_nr; i++) {
+		struct dir_entry *ent = dir.ignored[i];
+		if (cache_name_is_other(ent->name, ent->len) &&
+		    match_pathspec(s->pathspec, ent->name, ent->len, 0, NULL))
+			string_list_insert(&s->ignored, ent->name);
+		free(ent);
 	}
 
 	free(dir.entries);
+	free(dir.ignored);
+	clear_directory(&dir);
 
 	if (advice_status_u_option) {
 		struct timeval t_end;
-- 
1.8.1.2.8038.ge6fb49b.dirty

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [PATCH v2 00/14] Improve git-status --ignored
  2013-04-17 19:50           ` Karsten Blees
@ 2013-04-17 22:03             ` Junio C Hamano
  0 siblings, 0 replies; 36+ messages in thread
From: Junio C Hamano @ 2013-04-17 22:03 UTC (permalink / raw)
  To: Karsten Blees
  Cc: Git List, Erik Faye-Lund, Ramkumar Ramachandra, Robert Zeh,
	Duy Nguyen, Antoine Pelisse, Adam Spiers

Karsten Blees <karsten.blees@gmail.com> writes:

> I'll send fixups for #11 and #14, or you can pick the entire series rebased to pu from:
> https://github.com/kblees/git/tree/kb/improve-git-status-ignored-v2-pu
> git pull git://github.com/kblees/git.git kb/improve-git-status-ignored-v2-pu

Will take a look; very much appreciated.

^ permalink raw reply	[flat|nested] 36+ messages in thread

end of thread, other threads:[~2013-04-17 22:04 UTC | newest]

Thread overview: 36+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-03-18 20:28 [PATCH 0/8] Improve git-status --ignored Karsten Blees
2013-03-19  4:08 ` Junio C Hamano
2013-03-19  5:20   ` Duy Nguyen
2013-03-19 10:48     ` Karsten Blees
2013-03-19 14:48     ` Junio C Hamano
2013-03-19 15:58       ` Duy Nguyen
2013-04-15 19:04 ` [PATCH v2 00/14] " Karsten Blees
2013-04-15 19:05   ` [PATCH v2 01/14] dir.c: git-status --ignored: don't drop ignored directories Karsten Blees
2013-04-16 17:33     ` Ramkumar Ramachandra
2013-04-17  0:31       ` Karsten Blees
2013-04-15 19:06   ` [PATCH v2 02/14] dir.c: git-status --ignored: don't list files in " Karsten Blees
2013-04-16  9:57     ` [PATCH] read_revisions_from_stdin: make copies for handle_revision_arg Thomas Rast
2013-04-16 18:17       ` Junio C Hamano
2013-04-15 19:07   ` [PATCH v2 03/14] dir.c: git-status --ignored: don't list empty ignored directories Karsten Blees
2013-04-16 17:48     ` Ramkumar Ramachandra
2013-04-17  0:31       ` Karsten Blees
2013-04-15 19:08   ` [PATCH v2 04/14] dir.c: git-ls-files --directories: don't hide empty directories Karsten Blees
2013-04-15 19:08   ` [PATCH v2 05/14] dir.c: git-status --ignored: don't list empty directories as ignored Karsten Blees
2013-04-15 19:09   ` [PATCH v2 06/14] dir.c: make 'git-status --ignored' work within leading directories Karsten Blees
2013-04-15 19:10   ` [PATCH v2 07/14] dir.c: git-clean -d -X: don't delete tracked directories Karsten Blees
2013-04-15 19:11   ` [PATCH v2 08/14] dir.c: factor out parts of last_exclude_matching for later reuse Karsten Blees
2013-04-15 19:11   ` [PATCH v2 09/14] dir.c: move prep_exclude Karsten Blees
2013-04-15 19:12   ` [PATCH v2 10/14] dir.c: unify is_excluded and is_path_excluded APIs Karsten Blees
2013-04-15 21:35     ` Junio C Hamano
2013-04-15 19:12   ` [PATCH v2 11/14] dir.c: replace is_path_excluded with now equivalent is_excluded API Karsten Blees
2013-04-15 19:13   ` [PATCH v2 12/14] dir.c: git-status: avoid is_excluded checks for tracked files Karsten Blees
2013-04-15 19:14   ` [PATCH v2 13/14] dir.c: git-status --ignored: don't scan the work tree three times Karsten Blees
2013-04-15 19:15   ` [PATCH v2 14/14] dir.c: git-status --ignored: don't scan the work tree twice Karsten Blees
2013-04-15 19:23   ` [PATCH v2 00/14] Improve git-status --ignored Junio C Hamano
2013-04-15 19:33     ` Junio C Hamano
2013-04-15 20:06       ` Karsten Blees
2013-04-15 20:25         ` Junio C Hamano
2013-04-17 19:50           ` Karsten Blees
2013-04-17 22:03             ` Junio C Hamano
2013-04-17 19:50           ` [PATCH v2-pu 11/14] dir.c: replace is_path_excluded with now equivalent is_excluded API Karsten Blees
2013-04-17 19:51           ` [PATCH v2-pu 14/14] dir.c: git-status --ignored: don't scan the work tree twice Karsten Blees

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.