All of lore.kernel.org
 help / color / mirror / Atom feed
* .gitignore sub-dir exclusions not overriding '*'
@ 2014-11-19  3:40 Phil Pennock
  2014-11-19  9:48 ` Duy Nguyen
  0 siblings, 1 reply; 7+ messages in thread
From: Phil Pennock @ 2014-11-19  3:40 UTC (permalink / raw)
  To: git

The behaviour below is seen with both git 1.8.3.2 and git 2.1.3; I am
not subscribed to the vger list, please keep me in the CC list.

Use-case: git repo which only holds PGP-encrypted files with a .asc
extension, no matter which sub-directory they're in, or if in the top
directory.

Simple layout; for demo purposes names starting 'incl' should end up
included, those starting 'excl' should end up excluded, but not based on
those prefices: they're success result indicators; so:

    cd wherever
    git init
    mkdir foo
    touch incl.asc excl excl.txt foo/incl.asc foo/excl.txt foo/excl
    $EDITOR .gitignore

Expected to work as .gitignore in top-level of repo:

    *
    !**/*.asc
    !.gitignore

With that, `git status` ignores the contents of foo/ thusly:

    $ git check-ignore -v foo/incl.asc
    .gitignore:1:*	foo/incl.asc

Commenting out the '*' line and removing the '!' from the second, the
**/*.asc clearly matches.  The only way I can make this style work is to
set the first line to '**/*.*' which fails to exclude the plain 'excl'
files (no extension).

It seems that there's some magic around '*' as the entire final path
component of a pattern which causes it to match against the entire
directory, and excludes of the directory can not be overriden by matches
against '*.ext' within the directory, even when they come later in the
same config file at the same precedence.

This does not seem to my reading to match the behaviour described by
`git help gitignore` (checked in both versions of git) and so seems to
me to be a bug, but if it's a failure of my understanding, please help
me to understand where I messed up.

Thanks,
-Phil

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: .gitignore sub-dir exclusions not overriding '*'
  2014-11-19  3:40 .gitignore sub-dir exclusions not overriding '*' Phil Pennock
@ 2014-11-19  9:48 ` Duy Nguyen
  2014-11-19 10:30   ` [PATCH] dir.c: allow re-include after a dir is excluded in some cases Nguyễn Thái Ngọc Duy
  2014-11-19 23:41   ` .gitignore sub-dir exclusions not overriding '*' Phil Pennock
  0 siblings, 2 replies; 7+ messages in thread
From: Duy Nguyen @ 2014-11-19  9:48 UTC (permalink / raw)
  To: Phil Pennock; +Cc: Git Mailing List

On Wed, Nov 19, 2014 at 10:40 AM, Phil Pennock
<phil-gitml@phil.spodhuis.org> wrote:
> Expected to work as .gitignore in top-level of repo:
>
>     *
>     !**/*.asc
>     !.gitignore
>

gitignore man page has this "It is not possible to re-include a file
if a parent directory of that file is excluded". In this case,
directory "foo" is ignored by "*". Although it makes sense for this
particular case to re-include something in foo because we can clearly
see there are rules to re-include. It's on my todo list, but I don't
know when it will be implemented.
-- 
Duy

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH] dir.c: allow re-include after a dir is excluded in some cases
  2014-11-19  9:48 ` Duy Nguyen
@ 2014-11-19 10:30   ` Nguyễn Thái Ngọc Duy
  2014-11-19 18:48     ` Junio C Hamano
  2014-11-19 23:41   ` .gitignore sub-dir exclusions not overriding '*' Phil Pennock
  1 sibling, 1 reply; 7+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2014-11-19 10:30 UTC (permalink / raw)
  To: phil-gitml; +Cc: git, Nguyễn Thái Ngọc Duy

If we exclude a directory and have no knowledge in advance if there
will be any negative rules on that directory, then it makes no sense
to enter the directory and search for those negative rules. That
defeats the purpose of excluding in the first place.

However there are cases where we know in advance about such negative
rules. One of them is when the rule to exclude a dir and negative
rules on that dir appear on the same .gitignore file. This case be
easily detected.

Note that I do not support this case

  *
  !bar

because I fear of performance degradation because people may not
realize "!bar" will force recursing indefinitely. So you need at least
a slash some where to nail your pattern to some directory. "!/bar" is
fine (but pointless), "bar/*.c" or "**/foo" may be seen more often.

This is a quick prototype. I know the two statements in add_exclude()
is not good enough. I should not count the last trailing slash because
that one means different thing.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
---
 On Wed, Nov 19, 2014 at 4:48 PM, Duy Nguyen <pclouds@gmail.com> wrote:
 > On Wed, Nov 19, 2014 at 10:40 AM, Phil Pennock
 > <phil-gitml@phil.spodhuis.org> wrote:
 >> Expected to work as .gitignore in top-level of repo:
 >>
 >>     *
 >>     !**/*.asc
 >>     !.gitignore
 >>
 >
 > gitignore man page has this "It is not possible to re-include a file
 > if a parent directory of that file is excluded". In this case,
 > directory "foo" is ignored by "*". Although it makes sense for this
 > particular case to re-include something in foo because we can clearly
 > see there are rules to re-include. It's on my todo list, but I don't
 > know when it will be implemented.

 It turns out not so hard to do. I just needed a push like your
 report. I'll try to find some time to finish this up, if people don't
 object this idea.

 Documentation/gitignore.txt | 13 +++++++++----
 dir.c                       | 27 +++++++++++++++++++++------
 dir.h                       |  2 ++
 3 files changed, 32 insertions(+), 10 deletions(-)

diff --git a/Documentation/gitignore.txt b/Documentation/gitignore.txt
index 09e82c3..0340c44 100644
--- a/Documentation/gitignore.txt
+++ b/Documentation/gitignore.txt
@@ -82,10 +82,9 @@ PATTERN FORMAT
 
  - An optional prefix "`!`" which negates the pattern; any
    matching file excluded by a previous pattern will become
-   included again. It is not possible to re-include a file if a parent
-   directory of that file is excluded. Git doesn't list excluded
-   directories for performance reasons, so any patterns on contained
-   files have no effect, no matter where they are defined.
+   included again.
+   It is usually not possible to re-include a file if a parent
+   directory of that file is excluded. See NOTES for details.
    Put a backslash ("`\`") in front of the first "`!`" for patterns
    that begin with a literal "`!`", for example, "`\!important!.txt`".
 
@@ -144,6 +143,12 @@ use 'git update-index {litdd}assume-unchanged'.
 To stop tracking a file that is currently tracked, use
 'git rm --cached'.
 
+It is usually not possible to re-include a file if a parent directory
+of that file is excluded because of performance reasons. However, if
+there are negative rules in the same .gitignore file that contains the
+rule to ignore a specific directory, and those negative rules contain
+a slash, then re-inclusion is possible.
+
 EXAMPLES
 --------
 
diff --git a/dir.c b/dir.c
index 3f7a025..df6d940 100644
--- a/dir.c
+++ b/dir.c
@@ -443,6 +443,8 @@ void add_exclude(const char *string, const char *base,
 	int flags;
 	int nowildcardlen;
 
+	if (string[0] == '!' && strchr(string + 1, '/'))
+		el->flags |= EXC_FLAG_MAY_REINCLUDE;
 	parse_exclude_pattern(&string, &patternlen, &flags, &nowildcardlen);
 	if (flags & EXC_FLAG_MUSTBEDIR) {
 		char *s;
@@ -710,14 +712,17 @@ static struct exclude *last_exclude_matching_from_list(const char *pathname,
 						       struct exclude_list *el)
 {
 	int i;
+	struct exclude *x;
+	const char *exclude;
+	int prefix;
 
 	if (!el->nr)
 		return NULL;	/* undefined */
 
 	for (i = el->nr - 1; 0 <= i; i--) {
-		struct exclude *x = el->excludes[i];
-		const char *exclude = x->pattern;
-		int prefix = x->nowildcardlen;
+		x = el->excludes[i];
+		exclude = x->pattern;
+		prefix = x->nowildcardlen;
 
 		if (x->flags & EXC_FLAG_MUSTBEDIR) {
 			if (*dtype == DT_UNKNOWN)
@@ -731,7 +736,7 @@ static struct exclude *last_exclude_matching_from_list(const char *pathname,
 					   pathlen - (basename - pathname),
 					   exclude, prefix, x->patternlen,
 					   x->flags))
-				return x;
+				goto done;
 			continue;
 		}
 
@@ -739,9 +744,19 @@ static struct exclude *last_exclude_matching_from_list(const char *pathname,
 		if (match_pathname(pathname, pathlen,
 				   x->base, x->baselen ? x->baselen - 1 : 0,
 				   exclude, prefix, x->patternlen, x->flags))
-			return x;
+			goto done;
+	}
+	x = NULL;		/* undecided */
+done:
+	if (x && !(x->flags & EXC_FLAG_NEGATIVE) &&
+	    (el->flags & EXC_FLAG_MAY_REINCLUDE)) {
+		if (*dtype == DT_UNKNOWN)
+			*dtype = get_dtype(NULL, pathname, pathlen);
+		if (*dtype == DT_DIR)
+			x = NULL;
 	}
-	return NULL; /* undecided */
+
+	return x;
 }
 
 /*
diff --git a/dir.h b/dir.h
index 6c45e9d..1ace1cf 100644
--- a/dir.h
+++ b/dir.h
@@ -14,6 +14,7 @@ struct dir_entry {
 #define EXC_FLAG_ENDSWITH 4
 #define EXC_FLAG_MUSTBEDIR 8
 #define EXC_FLAG_NEGATIVE 16
+#define EXC_FLAG_MAY_REINCLUDE 32
 
 struct exclude {
 	/*
@@ -46,6 +47,7 @@ struct exclude {
 struct exclude_list {
 	int nr;
 	int alloc;
+	int flags;
 
 	/* remember pointer to exclude file contents so we can free() */
 	char *filebuf;
-- 
2.1.0.rc0.78.gc0d8480

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] dir.c: allow re-include after a dir is excluded in some cases
  2014-11-19 10:30   ` [PATCH] dir.c: allow re-include after a dir is excluded in some cases Nguyễn Thái Ngọc Duy
@ 2014-11-19 18:48     ` Junio C Hamano
  2014-11-20  0:27       ` Duy Nguyen
  0 siblings, 1 reply; 7+ messages in thread
From: Junio C Hamano @ 2014-11-19 18:48 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy; +Cc: phil-gitml, git

Nguyễn Thái Ngọc Duy  <pclouds@gmail.com> writes:

> diff --git a/Documentation/gitignore.txt b/Documentation/gitignore.txt
> index 09e82c3..0340c44 100644
> --- a/Documentation/gitignore.txt
> +++ b/Documentation/gitignore.txt
> @@ -82,10 +82,9 @@ PATTERN FORMAT
>  
>   - An optional prefix "`!`" which negates the pattern; any
>     matching file excluded by a previous pattern will become
> -   included again. It is not possible to re-include a file if a parent
> -   directory of that file is excluded. Git doesn't list excluded
> -   directories for performance reasons, so any patterns on contained
> -   files have no effect, no matter where they are defined.
> +   included again.
> +   It is usually not possible to re-include a file if a parent
> +   directory of that file is excluded. See NOTES for details.
>     Put a backslash ("`\`") in front of the first "`!`" for patterns
>     that begin with a literal "`!`", for example, "`\!important!.txt`".
>  
> @@ -144,6 +143,12 @@ use 'git update-index {litdd}assume-unchanged'.
>  To stop tracking a file that is currently tracked, use
>  'git rm --cached'.
>  
> +It is usually not possible to re-include a file if a parent directory
> +of that file is excluded because of performance reasons. However, if
> +there are negative rules in the same .gitignore file that contains the
> +rule to ignore a specific directory, and those negative rules contain
> +a slash, then re-inclusion is possible.

Does that mean "performance reasons" goes out the window???

What trade-off are the users making by choosing to do so?  Is it
explained in the documentation well enough to allow them to make an
informed decision?

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: .gitignore sub-dir exclusions not overriding '*'
  2014-11-19  9:48 ` Duy Nguyen
  2014-11-19 10:30   ` [PATCH] dir.c: allow re-include after a dir is excluded in some cases Nguyễn Thái Ngọc Duy
@ 2014-11-19 23:41   ` Phil Pennock
  2014-11-20  1:26     ` Duy Nguyen
  1 sibling, 1 reply; 7+ messages in thread
From: Phil Pennock @ 2014-11-19 23:41 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: Git Mailing List

On 2014-11-19 at 16:48 +0700, Duy Nguyen wrote:
> On Wed, Nov 19, 2014 at 10:40 AM, Phil Pennock
> <phil-gitml@phil.spodhuis.org> wrote:
> > Expected to work as .gitignore in top-level of repo:
> >
> >     *
> >     !**/*.asc
> >     !.gitignore
> >
> 
> gitignore man page has this "It is not possible to re-include a file
> if a parent directory of that file is excluded". In this case,
> directory "foo" is ignored by "*". Although it makes sense for this
> particular case to re-include something in foo because we can clearly
> see there are rules to re-include. It's on my todo list, but I don't
> know when it will be implemented.

Thanks for this and the patches and discussion which follow.

I didn't cover it in my report, but one of the scenarios I tried was to
explicitly re-include directories, to make them candidates again, and
either use directory-matching patterns in the top-level .gitignore or to
use per-directory .gitignore to handle those directories.

Looking fresh today, I see that I failed to compare baseline behaviour
without a .gitignore when using `git status` as a baseline for
comparison.  So a .gitignore like this:

    *
    !*/
    !*.asc

appeared to not work; even within the `foo/` sub-directory, `git status`
shows no candidates for inclusion.  But this is true even without a
.gitignore.  *sigh*

In fact, it looks like the simple three lines above work, without any
.gitignore in sub-directories.

The behaviour which confused me between this simplified test-case and
the original was that `git status` shows files in the top-level
directory which are untracked, and in untracked files sub-directories
where some other file in that directory is already tracked, but if no
file in the sub-directory is already tracked, then `git status` does not
report the files for inclusion, even if the cwd is inside that
directory.

I tied myself in knots trying to avoid adding unencrypted files to the
repo.

Thanks,
-Phil

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] dir.c: allow re-include after a dir is excluded in some cases
  2014-11-19 18:48     ` Junio C Hamano
@ 2014-11-20  0:27       ` Duy Nguyen
  0 siblings, 0 replies; 7+ messages in thread
From: Duy Nguyen @ 2014-11-20  0:27 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: phil-gitml, Git Mailing List

On Thu, Nov 20, 2014 at 1:48 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Nguyễn Thái Ngọc Duy  <pclouds@gmail.com> writes:
>
>> diff --git a/Documentation/gitignore.txt b/Documentation/gitignore.txt
>> index 09e82c3..0340c44 100644
>> --- a/Documentation/gitignore.txt
>> +++ b/Documentation/gitignore.txt
>> @@ -82,10 +82,9 @@ PATTERN FORMAT
>>
>>   - An optional prefix "`!`" which negates the pattern; any
>>     matching file excluded by a previous pattern will become
>> -   included again. It is not possible to re-include a file if a parent
>> -   directory of that file is excluded. Git doesn't list excluded
>> -   directories for performance reasons, so any patterns on contained
>> -   files have no effect, no matter where they are defined.
>> +   included again.
>> +   It is usually not possible to re-include a file if a parent
>> +   directory of that file is excluded. See NOTES for details.
>>     Put a backslash ("`\`") in front of the first "`!`" for patterns
>>     that begin with a literal "`!`", for example, "`\!important!.txt`".
>>
>> @@ -144,6 +143,12 @@ use 'git update-index {litdd}assume-unchanged'.
>>  To stop tracking a file that is currently tracked, use
>>  'git rm --cached'.
>>
>> +It is usually not possible to re-include a file if a parent directory
>> +of that file is excluded because of performance reasons. However, if
>> +there are negative rules in the same .gitignore file that contains the
>> +rule to ignore a specific directory, and those negative rules contain
>> +a slash, then re-inclusion is possible.
>
> Does that mean "performance reasons" goes out the window???
>
> What trade-off are the users making by choosing to do so?  Is it
> explained in the documentation well enough to allow them to make an
> informed decision?

If they put "!**/foo" there, I think it's obvious for users that all
dirs must be looked at. "!a*/**/foo" may be expected to only look at a
subset of dirs recursively, instead of everything. "!*/def" may be
expected that only one more level of directories are looked at. I
didn't cover the last two cases well and I don't think it'll be easy
to do. Which makes this patch less appealing. (so much for a
not-thoroughly-thought-through quick prototype)

Perhaps I look at it from a wrong angle. If we have a way to say
"these patterns do not apply to directories", then the user can
selectively exclude certain directories and let others through. It
would give the user more control and make the travelling cost more
apparent (assuming we mention somewhere that the more dirs we examine,
the longer it will take).
-- 
Duy

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: .gitignore sub-dir exclusions not overriding '*'
  2014-11-19 23:41   ` .gitignore sub-dir exclusions not overriding '*' Phil Pennock
@ 2014-11-20  1:26     ` Duy Nguyen
  0 siblings, 0 replies; 7+ messages in thread
From: Duy Nguyen @ 2014-11-20  1:26 UTC (permalink / raw)
  To: Phil Pennock; +Cc: Git Mailing List

On Thu, Nov 20, 2014 at 6:41 AM, Phil Pennock
<phil-gitml@phil.spodhuis.org> wrote:
> On 2014-11-19 at 16:48 +0700, Duy Nguyen wrote:
>> On Wed, Nov 19, 2014 at 10:40 AM, Phil Pennock
>> <phil-gitml@phil.spodhuis.org> wrote:
>> > Expected to work as .gitignore in top-level of repo:
>> >
>> >     *
>> >     !**/*.asc
>> >     !.gitignore
>> >
>>
>> gitignore man page has this "It is not possible to re-include a file
>> if a parent directory of that file is excluded". In this case,
>> directory "foo" is ignored by "*". Although it makes sense for this
>> particular case to re-include something in foo because we can clearly
>> see there are rules to re-include. It's on my todo list, but I don't
>> know when it will be implemented.
>
> Thanks for this and the patches and discussion which follow.
>
> I didn't cover it in my report, but one of the scenarios I tried was to
> explicitly re-include directories, to make them candidates again, and
> either use directory-matching patterns in the top-level .gitignore or to
> use per-directory .gitignore to handle those directories.
>
> Looking fresh today, I see that I failed to compare baseline behaviour
> without a .gitignore when using `git status` as a baseline for
> comparison.  So a .gitignore like this:
>
>     *
>     !*/
>     !*.asc
>
> appeared to not work; even within the `foo/` sub-directory, `git status`
> shows no candidates for inclusion.  But this is true even without a
> .gitignore.  *sigh*

I should have read this mail before replying to Junio in the other
email :( Yeah the "!*/" would re-include dirs back. I'm not sure if
there are any side effects by doing this, no time to think about this
yet. Maybe we can put this in the example section in gitignore man
page with more explanation.

> In fact, it looks like the simple three lines above work, without any
> .gitignore in sub-directories.
>
> The behaviour which confused me between this simplified test-case and
> the original was that `git status` shows files in the top-level
> directory which are untracked, and in untracked files sub-directories
> where some other file in that directory is already tracked, but if no
> file in the sub-directory is already tracked, then `git status` does not
> report the files for inclusion, even if the cwd is inside that
> directory.
>
> I tied myself in knots trying to avoid adding unencrypted files to the
> repo.
>
> Thanks,
> -Phil



-- 
Duy

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2014-11-20  1:27 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-11-19  3:40 .gitignore sub-dir exclusions not overriding '*' Phil Pennock
2014-11-19  9:48 ` Duy Nguyen
2014-11-19 10:30   ` [PATCH] dir.c: allow re-include after a dir is excluded in some cases Nguyễn Thái Ngọc Duy
2014-11-19 18:48     ` Junio C Hamano
2014-11-20  0:27       ` Duy Nguyen
2014-11-19 23:41   ` .gitignore sub-dir exclusions not overriding '*' Phil Pennock
2014-11-20  1:26     ` Duy Nguyen

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.