All of lore.kernel.org
 help / color / mirror / Atom feed
* RFC: two minor tweaks to check-ignore to help git-annex assistant
@ 2013-04-08 18:13 Adam Spiers
  2013-04-08 21:56 ` Junio C Hamano
                   ` (2 more replies)
  0 siblings, 3 replies; 33+ messages in thread
From: Adam Spiers @ 2013-04-08 18:13 UTC (permalink / raw)
  To: Git Mailing List

Hi all,

I was recently informed by the author of git-annex that my
implementation of git check-ignore has two minor deficiencies which
currently prevent him from adding .gitignore support to the git-annex
assistant (web UI):

    1. When accepting a list of files to check via --stdin, no results
       are calculated until EOF is hit.  This prevents it being used
       as a persistent background query process which streams results
       to its caller.  (This is inconsistent with check-attr, which
       *does* support stream-like behaviour.)

    2. Even if it did support streaming, the lack of output for files
       which don't match any ignore pattern make it impossible for the
       consumer to distinguish between the two cases a) the file
       doesn't match any pattern and b) it does match but the output
       hasn't arrived yet.

Both of these are pretty trivial to fix.  The first is fixed by
changing check_ignore_stdin_paths() to invoke check_ignore() per input
line, rather than collecting all input from STDIN and then invoking
check_ignore() on the whole lot.  (I have not implemented this yet,
but may well be able to do it this week, thanks to it being one of
SUSE's hack weeks :-)

I already have a rough fix for the second issue, but I wanted to
solicit feedback on the appropriate UI changes before proceeding much
further.  Does something like the below patch seem reasonable, modulo
the lack of tests?  In case the UI changes I am proposing are not
clear from the patch, here's some example output from running it
inside a clone of the git source tree:

    $ git check-ignore -v -n foo.tar.{gz,bz2}
    .gitignore:203:*.tar.gz foo.tar.gz
    ::      foo.tar.bz2

So the number of output fields does not change depending on whether
the pattern matches or not, and any caller can determine whether it
does simply by checking whether the first field is non-empty.

Also, does it make sense to write a new test to accompany the fix to
the first (streaming) issue?

Thanks,
Adam

-- >8 --
Subject: [PATCH] check-ignore: add -n / --non-matching option

If `-n` or `--non-matching` are specified, non-matching pathnames will
also be output, in which case all fields in each output record except
for <pathname> will be empty.  This can be useful when running
non-interactively, so that files can be incrementally streamed to
STDIN of a long-running check-ignore process, and for each of these
files, STDOUT will indicate whether that file matched a pattern or
not.  (Without this option, it would be impossible to tell whether the
absence of output for a given file meant that it didn't match any
pattern, or that the output hadn't been generated yet.)

Signed-off-by: Adam Spiers <git@adamspiers.org>
---
 Documentation/git-check-ignore.txt | 15 +++++++++++++
 builtin/check-ignore.c             | 43 ++++++++++++++++++++++++--------------
 2 files changed, 42 insertions(+), 16 deletions(-)

diff --git a/Documentation/git-check-ignore.txt b/Documentation/git-check-ignore.txt
index 854e4d0..7e3cabc 100644
--- a/Documentation/git-check-ignore.txt
+++ b/Documentation/git-check-ignore.txt
@@ -39,6 +39,12 @@ OPTIONS
 	below).  If `--stdin` is also given, input paths are separated
 	with a NUL character instead of a linefeed character.
 
+-n, --non-matching::
+	Show given paths which don't match any pattern.	 This only
+	makes sense when `--verbose` is enabled, otherwise it would
+	not be possible to distinguish between paths which match a
+	pattern and those which don't.
+
 OUTPUT
 ------
 
@@ -65,6 +71,15 @@ are also used instead of colons and hard tabs:
 
 <source> <NULL> <linenum> <NULL> <pattern> <NULL> <pathname> <NULL>
 
+If `-n` or `--non-matching` are specified, non-matching pathnames will
+also be output, in which case all fields in each output record except
+for <pathname> will be empty.  This can be useful when running
+non-interactively, so that files can be incrementally streamed to
+STDIN of a long-running check-ignore process, and for each of these
+files, STDOUT will indicate whether that file matched a pattern or
+not.  (Without this option, it would be impossible to tell whether the
+absence of output for a given file meant that it didn't match any
+pattern, or that the output hadn't been generated yet.)
 
 EXIT STATUS
 -----------
diff --git a/builtin/check-ignore.c b/builtin/check-ignore.c
index 0240f99..498fd65 100644
--- a/builtin/check-ignore.c
+++ b/builtin/check-ignore.c
@@ -5,7 +5,7 @@
 #include "pathspec.h"
 #include "parse-options.h"
 
-static int quiet, verbose, stdin_paths;
+static int quiet, verbose, stdin_paths, show_non_matching;
 static const char * const check_ignore_usage[] = {
 "git check-ignore [options] pathname...",
 "git check-ignore [options] --stdin < <list-of-paths>",
@@ -22,21 +22,28 @@ static const struct option check_ignore_options[] = {
 		    N_("read file names from stdin")),
 	OPT_BOOLEAN('z', NULL, &null_term_line,
 		    N_("input paths are terminated by a null character")),
+	OPT_BOOLEAN('n', "non-matching", &show_non_matching,
+		    N_("show non-matching input paths")),
 	OPT_END()
 };
 
 static void output_exclude(const char *path, struct exclude *exclude)
 {
-	char *bang  = exclude->flags & EXC_FLAG_NEGATIVE  ? "!" : "";
-	char *slash = exclude->flags & EXC_FLAG_MUSTBEDIR ? "/" : "";
+	char *bang  = (exclude && exclude->flags & EXC_FLAG_NEGATIVE)  ? "!" : "";
+	char *slash = (exclude && exclude->flags & EXC_FLAG_MUSTBEDIR) ? "/" : "";
 	if (!null_term_line) {
 		if (!verbose) {
 			write_name_quoted(path, stdout, '\n');
 		} else {
-			quote_c_style(exclude->el->src, NULL, stdout, 0);
-			printf(":%d:%s%s%s\t",
-			       exclude->srcpos,
-			       bang, exclude->pattern, slash);
+			if (exclude) {
+				quote_c_style(exclude->el->src, NULL, stdout, 0);
+				printf(":%d:%s%s%s\t",
+				       exclude->srcpos,
+				       bang, exclude->pattern, slash);
+			}
+			else {
+				printf("::\t");
+			}
 			quote_c_style(path, NULL, stdout, 0);
 			fputc('\n', stdout);
 		}
@@ -44,11 +51,14 @@ static void output_exclude(const char *path, struct exclude *exclude)
 		if (!verbose) {
 			printf("%s%c", path, '\0');
 		} else {
-			printf("%s%c%d%c%s%s%s%c%s%c",
-			       exclude->el->src, '\0',
-			       exclude->srcpos, '\0',
-			       bang, exclude->pattern, slash, '\0',
-			       path, '\0');
+			if (exclude)
+				printf("%s%c%d%c%s%s%s%c%s%c",
+				       exclude->el->src, '\0',
+				       exclude->srcpos, '\0',
+				       bang, exclude->pattern, slash, '\0',
+				       path, '\0');
+			else
+				printf("%c%c%c%s%c", '\0', '\0', '\0', path, '\0');
 		}
 	}
 }
@@ -92,11 +102,10 @@ static int check_ignore(const char *prefix, const char **pathspec)
 		if (!seen[i]) {
 			exclude = last_exclude_matching_path(&check, full_path,
 							     -1, &dtype);
-			if (exclude) {
-				if (!quiet)
-					output_exclude(path, exclude);
+			if (!quiet && (exclude || show_non_matching))
+				output_exclude(path, exclude);
+			if (exclude)
 				num_ignored++;
-			}
 		}
 	}
 	free(seen);
@@ -161,6 +170,8 @@ int cmd_check_ignore(int argc, const char **argv, const char *prefix)
 		if (verbose)
 			die(_("cannot have both --quiet and --verbose"));
 	}
+	if (show_non_matching && !verbose)
+		die(_("--non-matching is only valid with --verbose"));
 
 	if (stdin_paths) {
 		num_ignored = check_ignore_stdin_paths(prefix);
-- 
1.8.2.242.g8617715.dirty

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: RFC: two minor tweaks to check-ignore to help git-annex assistant
  2013-04-08 18:13 RFC: two minor tweaks to check-ignore to help git-annex assistant Adam Spiers
@ 2013-04-08 21:56 ` Junio C Hamano
  2013-04-08 22:20 ` Jeff King
  2013-04-11  1:59 ` [PATCH 1/5] check-ignore: move setup into cmd_check_ignore() Adam Spiers
  2 siblings, 0 replies; 33+ messages in thread
From: Junio C Hamano @ 2013-04-08 21:56 UTC (permalink / raw)
  To: Adam Spiers; +Cc: Git Mailing List

Adam Spiers <git@adamspiers.org> writes:

> I already have a rough fix for the second issue, but I wanted to
> solicit feedback on the appropriate UI changes before proceeding much
> further.  Does something like the below patch seem reasonable, modulo
> the lack of tests?  In case the UI changes I am proposing are not
> clear from the patch, here's some example output from running it
> inside a clone of the git source tree:
>
>     $ git check-ignore -v -n foo.tar.{gz,bz2}
>     .gitignore:203:*.tar.gz foo.tar.gz
>     ::      foo.tar.bz2
>
> So the number of output fields does not change depending on whether
> the pattern matches or not, and any caller can determine whether it
> does simply by checking whether the first field is non-empty.

Haven't looked at the proposed patch very carefully, but the design
looks sound.  The above output screams "empty! nothing!", and I do
not think there is any other way :: will show up in that position.

> Also, does it make sense to write a new test to accompany the fix to
> the first (streaming) issue?

Would it be tricky to write safely not to get stuck?  You feed one
line, stop feeding, while checking that the output has arrived, and
then kill the whole thing?  Feels somewhat yucky, but sounds doable.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: RFC: two minor tweaks to check-ignore to help git-annex assistant
  2013-04-08 18:13 RFC: two minor tweaks to check-ignore to help git-annex assistant Adam Spiers
  2013-04-08 21:56 ` Junio C Hamano
@ 2013-04-08 22:20 ` Jeff King
  2013-04-11  1:59 ` [PATCH 1/5] check-ignore: move setup into cmd_check_ignore() Adam Spiers
  2 siblings, 0 replies; 33+ messages in thread
From: Jeff King @ 2013-04-08 22:20 UTC (permalink / raw)
  To: Adam Spiers; +Cc: Git Mailing List

On Mon, Apr 08, 2013 at 07:13:11PM +0100, Adam Spiers wrote:

> I was recently informed by the author of git-annex that my
> implementation of git check-ignore has two minor deficiencies which
> currently prevent him from adding .gitignore support to the git-annex
> assistant (web UI):
> 
>     1. When accepting a list of files to check via --stdin, no results
>        are calculated until EOF is hit.  This prevents it being used
>        as a persistent background query process which streams results
>        to its caller.  (This is inconsistent with check-attr, which
>        *does* support stream-like behaviour.)

I think flushing on each line is reasonable, though you are also
introducing a deadlock possibility for callers which do not read back
the output in real-time. For example, if I write N paths out then read N
ignore-lines back in, I risk a situation where I am blocked on write()
to check-ignore, and it is blocked on write back to me. Somebody has to
buffer (the pipe buffers give you some leeway, but they are limited).

Given how new check-ignore is, and that we have not advertised any
particular buffering scheme so far, it's probably OK to switch without
worrying about breaking existing callers.

But if this is a mode of operation that we expect people to use (here
and for check-attr), we should advertise the flushing behavior, and
probably warn about the deadlock (I don't think adding a "--no-flush"
option is worth it, as it would just mean buffering in check-ignore,
which the caller could just as easily do itself).

-Peff

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH 1/5] check-ignore: move setup into cmd_check_ignore()
  2013-04-08 18:13 RFC: two minor tweaks to check-ignore to help git-annex assistant Adam Spiers
  2013-04-08 21:56 ` Junio C Hamano
  2013-04-08 22:20 ` Jeff King
@ 2013-04-11  1:59 ` Adam Spiers
  2013-04-11  1:59   ` [PATCH 2/5] check-ignore: allow incremental streaming of queries via --stdin Adam Spiers
                     ` (4 more replies)
  2 siblings, 5 replies; 33+ messages in thread
From: Adam Spiers @ 2013-04-11  1:59 UTC (permalink / raw)
  To: git list

Initialisation of the dir_struct and path_exclude_check structs was
previously done within check_ignore().  This was acceptable since
check_ignore() was only called once per check-ignore invocation;
however the next commit will convert it into an inner loop which is
called once per line of STDIN when --stdin is given.  Therefore moving
the initialisation code out into cmd_check_ignore() ensures that
initialisation is still only performed once per check-ignore
invocation, and consequently that the output is identical whether
pathspecs are provided as CLI arguments or via STDIN.

Signed-off-by: Adam Spiers <git@adamspiers.org>
---
 builtin/check-ignore.c | 39 ++++++++++++++++++++-------------------
 1 file changed, 20 insertions(+), 19 deletions(-)

diff --git a/builtin/check-ignore.c b/builtin/check-ignore.c
index 0240f99..0a4eef1 100644
--- a/builtin/check-ignore.c
+++ b/builtin/check-ignore.c
@@ -53,30 +53,20 @@ static void output_exclude(const char *path, struct exclude *exclude)
 	}
 }
 
-static int check_ignore(const char *prefix, const char **pathspec)
+static int check_ignore(struct path_exclude_check check,
+			const char *prefix, const char **pathspec)
 {
-	struct dir_struct dir;
 	const char *path, *full_path;
 	char *seen;
 	int num_ignored = 0, dtype = DT_UNKNOWN, i;
-	struct path_exclude_check check;
 	struct exclude *exclude;
 
-	/* read_cache() is only necessary so we can watch out for submodules. */
-	if (read_cache() < 0)
-		die(_("index file corrupt"));
-
-	memset(&dir, 0, sizeof(dir));
-	dir.flags |= DIR_COLLECT_IGNORED;
-	setup_standard_excludes(&dir);
-
 	if (!pathspec || !*pathspec) {
 		if (!quiet)
 			fprintf(stderr, "no pathspec given.\n");
 		return 0;
 	}
 
-	path_exclude_check_init(&check, &dir);
 	/*
 	 * look for pathspecs matching entries in the index, since these
 	 * should not be ignored, in order to be consistent with
@@ -100,13 +90,11 @@ static int check_ignore(const char *prefix, const char **pathspec)
 		}
 	}
 	free(seen);
-	clear_directory(&dir);
-	path_exclude_check_clear(&check);
 
 	return num_ignored;
 }
 
-static int check_ignore_stdin_paths(const char *prefix)
+static int check_ignore_stdin_paths(struct path_exclude_check check, const char *prefix)
 {
 	struct strbuf buf, nbuf;
 	char **pathspec = NULL;
@@ -129,17 +117,18 @@ static int check_ignore_stdin_paths(const char *prefix)
 	}
 	ALLOC_GROW(pathspec, nr + 1, alloc);
 	pathspec[nr] = NULL;
-	num_ignored = check_ignore(prefix, (const char **)pathspec);
+	num_ignored = check_ignore(check, prefix, (const char **)pathspec);
 	maybe_flush_or_die(stdout, "attribute to stdout");
 	strbuf_release(&buf);
 	strbuf_release(&nbuf);
-	free(pathspec);
 	return num_ignored;
 }
 
 int cmd_check_ignore(int argc, const char **argv, const char *prefix)
 {
 	int num_ignored;
+	struct dir_struct dir;
+	struct path_exclude_check check;
 
 	git_config(git_default_config, NULL);
 
@@ -162,12 +151,24 @@ int cmd_check_ignore(int argc, const char **argv, const char *prefix)
 			die(_("cannot have both --quiet and --verbose"));
 	}
 
+	/* read_cache() is only necessary so we can watch out for submodules. */
+	if (read_cache() < 0)
+		die(_("index file corrupt"));
+
+	memset(&dir, 0, sizeof(dir));
+	dir.flags |= DIR_COLLECT_IGNORED;
+	setup_standard_excludes(&dir);
+
+	path_exclude_check_init(&check, &dir);
 	if (stdin_paths) {
-		num_ignored = check_ignore_stdin_paths(prefix);
+		num_ignored = check_ignore_stdin_paths(check, prefix);
 	} else {
-		num_ignored = check_ignore(prefix, argv);
+		num_ignored = check_ignore(check, prefix, argv);
 		maybe_flush_or_die(stdout, "ignore to stdout");
 	}
 
+	clear_directory(&dir);
+	path_exclude_check_clear(&check);
+
 	return !num_ignored;
 }
-- 
1.8.2.1.347.g37e0606

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 2/5] check-ignore: allow incremental streaming of queries via --stdin
  2013-04-11  1:59 ` [PATCH 1/5] check-ignore: move setup into cmd_check_ignore() Adam Spiers
@ 2013-04-11  1:59   ` Adam Spiers
  2013-04-11  5:31     ` Jeff King
  2013-04-11 11:20     ` Adam Spiers
  2013-04-11  1:59   ` [PATCH 3/5] Documentation: add caveats about I/O buffering for check-{attr,ignore} Adam Spiers
                     ` (3 subsequent siblings)
  4 siblings, 2 replies; 33+ messages in thread
From: Adam Spiers @ 2013-04-11  1:59 UTC (permalink / raw)
  To: git list

Some callers, such as the git-annex web assistant, find it useful to
invoke git check-ignore as a persistent background process, which can
then have queries fed to its STDIN at any point, and the corresponding
response consumed from its STDOUT.  For this we need to invoke
check_ignore() once per line of standard input, and flush standard
output after each result.

The above use case suggests that empty STDIN is actually a reasonable
scenario (e.g. when the caller doesn't know in advance whether any
queries need to be fed to the background process until after it's
already started), so we make the minor behavioural change that "no
pathspec given." is no longer emitted in when STDIN is empty.

Even though check_ignore() could now be changed to operate on a single
pathspec, we keep it operating on an array of pathspecs since that is
a more convenient way of consuming the existing pathspec API.

Signed-off-by: Adam Spiers <git@adamspiers.org>
---
 builtin/check-ignore.c | 16 ++++++----------
 t/t0008-ignores.sh     | 29 ++++++++++++++++++++++++-----
 2 files changed, 30 insertions(+), 15 deletions(-)

diff --git a/builtin/check-ignore.c b/builtin/check-ignore.c
index 0a4eef1..ce4b1ad 100644
--- a/builtin/check-ignore.c
+++ b/builtin/check-ignore.c
@@ -97,10 +97,9 @@ static int check_ignore(struct path_exclude_check check,
 static int check_ignore_stdin_paths(struct path_exclude_check check, const char *prefix)
 {
 	struct strbuf buf, nbuf;
-	char **pathspec = NULL;
-	size_t nr = 0, alloc = 0;
+	char *pathspec[2] = { NULL, NULL };
 	int line_termination = null_term_line ? 0 : '\n';
-	int num_ignored;
+	int num_ignored = 0;
 
 	strbuf_init(&buf, 0);
 	strbuf_init(&nbuf, 0);
@@ -111,14 +110,11 @@ static int check_ignore_stdin_paths(struct path_exclude_check check, const char
 				die("line is badly quoted");
 			strbuf_swap(&buf, &nbuf);
 		}
-		ALLOC_GROW(pathspec, nr + 1, alloc);
-		pathspec[nr] = xcalloc(strlen(buf.buf) + 1, sizeof(*buf.buf));
-		strcpy(pathspec[nr++], buf.buf);
+		pathspec[0] = xcalloc(strlen(buf.buf) + 1, sizeof(*buf.buf));
+		strcpy(pathspec[0], buf.buf);
+		num_ignored += check_ignore(check, prefix, (const char **)pathspec);
+		maybe_flush_or_die(stdout, "check-ignore to stdout");
 	}
-	ALLOC_GROW(pathspec, nr + 1, alloc);
-	pathspec[nr] = NULL;
-	num_ignored = check_ignore(check, prefix, (const char **)pathspec);
-	maybe_flush_or_die(stdout, "attribute to stdout");
 	strbuf_release(&buf);
 	strbuf_release(&nbuf);
 	return num_ignored;
diff --git a/t/t0008-ignores.sh b/t/t0008-ignores.sh
index 9c1bde1..0dd3ef7 100755
--- a/t/t0008-ignores.sh
+++ b/t/t0008-ignores.sh
@@ -189,11 +189,7 @@ test_expect_success_multi 'empty command line' '' '
 
 test_expect_success_multi '--stdin with empty STDIN' '' '
 	test_check_ignore "--stdin" 1 </dev/null &&
-	if test -n "$quiet_opt"; then
-		test_stderr ""
-	else
-		test_stderr "no pathspec given."
-	fi
+	test_stderr ""
 '
 
 test_expect_success '-q with multiple args' '
@@ -648,5 +644,28 @@ do
 	'
 done
 
+test_expect_success 'setup: have stdbuf?' '
+	if which stdbuf >/dev/null 2>&1
+	then
+		test_set_prereq STDBUF
+	fi
+'
+
+test_expect_success STDBUF 'streaming support for --stdin' '
+	(
+		echo one
+		sleep 2
+		echo two
+	) | stdbuf -oL git check-ignore -v -n --stdin >out &
+	pid=$! &&
+	sleep 1 &&
+	cat out &&
+	grep "^\.gitignore:1:one	one" out &&
+	test $( wc -l <out ) = 1 &&
+	sleep 2 &&
+	grep "^::	two" out &&
+	test $( wc -l <out ) = 2 &&
+	( wait $pid || kill $pid || : ) 2>/dev/null
+'
 
 test_done
-- 
1.8.2.1.347.g37e0606

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 3/5] Documentation: add caveats about I/O buffering for check-{attr,ignore}
  2013-04-11  1:59 ` [PATCH 1/5] check-ignore: move setup into cmd_check_ignore() Adam Spiers
  2013-04-11  1:59   ` [PATCH 2/5] check-ignore: allow incremental streaming of queries via --stdin Adam Spiers
@ 2013-04-11  1:59   ` Adam Spiers
  2013-04-11  5:31     ` Jeff King
  2013-04-11  1:59   ` [PATCH 4/5] t0008: remove duplicated test fixture data Adam Spiers
                     ` (2 subsequent siblings)
  4 siblings, 1 reply; 33+ messages in thread
From: Adam Spiers @ 2013-04-11  1:59 UTC (permalink / raw)
  To: git list

check-attr and check-ignore have the potential to deadlock callers
which do not read back the output in real-time.  For example, if a
caller writes N paths out and then reads N lines back in, it risks
becoming blocked on write() to check-*, and check-* is blocked on
write back to the caller.  Somebody has to buffer; the pipe buffers
provide some leeway, but they are limited.

Thanks to Peff for pointing this out:

    http://article.gmane.org/gmane.comp.version-control.git/220534

Signed-off-by: Adam Spiers <git@adamspiers.org>
---
 Documentation/git-check-attr.txt   |  5 +++++
 Documentation/git-check-ignore.txt |  5 +++++
 Documentation/git.txt              | 16 +++++++++-------
 3 files changed, 19 insertions(+), 7 deletions(-)

diff --git a/Documentation/git-check-attr.txt b/Documentation/git-check-attr.txt
index 5abdbaa..a7be80d 100644
--- a/Documentation/git-check-attr.txt
+++ b/Documentation/git-check-attr.txt
@@ -56,6 +56,11 @@ being queried and <info> can be either:
 'set';;		when the attribute is defined as true.
 <value>;;	when a value has been assigned to the attribute.
 
+Buffering happens as documented under the `GIT_FLUSH` option in
+linkgit:git[1].  The caller is responsible for avoiding deadlocks
+caused by overfilling an input buffer or reading from an empty output
+buffer.
+
 EXAMPLES
 --------
 
diff --git a/Documentation/git-check-ignore.txt b/Documentation/git-check-ignore.txt
index 854e4d0..4014d28 100644
--- a/Documentation/git-check-ignore.txt
+++ b/Documentation/git-check-ignore.txt
@@ -66,6 +66,11 @@ are also used instead of colons and hard tabs:
 <source> <NULL> <linenum> <NULL> <pattern> <NULL> <pathname> <NULL>
 
 
+Buffering happens as documented under the `GIT_FLUSH` option in
+linkgit:git[1].  The caller is responsible for avoiding deadlocks
+caused by overfilling an input buffer or reading from an empty output
+buffer.
+
 EXIT STATUS
 -----------
 
diff --git a/Documentation/git.txt b/Documentation/git.txt
index 6a875f2..eecdb15 100644
--- a/Documentation/git.txt
+++ b/Documentation/git.txt
@@ -808,13 +808,15 @@ for further details.
 
 'GIT_FLUSH'::
 	If this environment variable is set to "1", then commands such
-	as 'git blame' (in incremental mode), 'git rev-list', 'git log',
-	and 'git whatchanged' will force a flush of the output stream
-	after each commit-oriented record have been flushed.   If this
-	variable is set to "0", the output of these commands will be done
-	using completely buffered I/O.   If this environment variable is
-	not set, Git will choose buffered or record-oriented flushing
-	based on whether stdout appears to be redirected to a file or not.
+	as 'git blame' (in incremental mode), 'git rev-list', 'git
+	log', 'git check-attr', 'git check-ignore', and 'git
+	whatchanged' will force a flush of the output stream after
+	each commit-oriented record have been flushed.  If this
+	variable is set to "0", the output of these commands will be
+	done using completely buffered I/O.  If this environment
+	variable is not set, Git will choose buffered or
+	record-oriented flushing based on whether stdout appears to be
+	redirected to a file or not.
 
 'GIT_TRACE'::
 	If this variable is set to "1", "2" or "true" (comparison
-- 
1.8.2.1.347.g37e0606

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 4/5] t0008: remove duplicated test fixture data
  2013-04-11  1:59 ` [PATCH 1/5] check-ignore: move setup into cmd_check_ignore() Adam Spiers
  2013-04-11  1:59   ` [PATCH 2/5] check-ignore: allow incremental streaming of queries via --stdin Adam Spiers
  2013-04-11  1:59   ` [PATCH 3/5] Documentation: add caveats about I/O buffering for check-{attr,ignore} Adam Spiers
@ 2013-04-11  1:59   ` Adam Spiers
  2013-04-11  1:59   ` [PATCH 5/5] check-ignore: add -n / --non-matching option Adam Spiers
  2013-04-11  5:25   ` [PATCH 1/5] check-ignore: move setup into cmd_check_ignore() Jeff King
  4 siblings, 0 replies; 33+ messages in thread
From: Adam Spiers @ 2013-04-11  1:59 UTC (permalink / raw)
  To: git list

The expected contents of STDOUT for the final --stdin tests can be
derived from the expected contents of STDOUT for the same tests when
--verbose is given, in the same way that test_expect_success_multi
derives this for earlier tests.

Signed-off-by: Adam Spiers <git@adamspiers.org>
---
 t/t0008-ignores.sh | 16 +---------------
 1 file changed, 1 insertion(+), 15 deletions(-)

diff --git a/t/t0008-ignores.sh b/t/t0008-ignores.sh
index 0dd3ef7..80b731a 100755
--- a/t/t0008-ignores.sh
+++ b/t/t0008-ignores.sh
@@ -571,21 +571,6 @@ cat <<-\EOF >stdin
 	b/globaltwo
 	../b/globaltwo
 EOF
-cat <<-\EOF >expected-default
-	../one
-	one
-	b/on
-	b/one
-	b/one one
-	b/one two
-	"b/one\"three"
-	b/two
-	b/twooo
-	../globaltwo
-	globaltwo
-	b/globaltwo
-	../b/globaltwo
-EOF
 cat <<-EOF >expected-verbose
 	.gitignore:1:one	../one
 	.gitignore:1:one	one
@@ -601,6 +586,7 @@ cat <<-EOF >expected-verbose
 	$global_excludes:2:!globaltwo	b/globaltwo
 	$global_excludes:2:!globaltwo	../b/globaltwo
 EOF
+sed -e 's/.*	//' expected-verbose >expected-default
 
 sed -e 's/^"//' -e 's/\\//' -e 's/"$//' stdin | \
 	tr "\n" "\0" >stdin0
-- 
1.8.2.1.347.g37e0606

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 5/5] check-ignore: add -n / --non-matching option
  2013-04-11  1:59 ` [PATCH 1/5] check-ignore: move setup into cmd_check_ignore() Adam Spiers
                     ` (2 preceding siblings ...)
  2013-04-11  1:59   ` [PATCH 4/5] t0008: remove duplicated test fixture data Adam Spiers
@ 2013-04-11  1:59   ` Adam Spiers
  2013-04-11  5:25   ` [PATCH 1/5] check-ignore: move setup into cmd_check_ignore() Jeff King
  4 siblings, 0 replies; 33+ messages in thread
From: Adam Spiers @ 2013-04-11  1:59 UTC (permalink / raw)
  To: git list

If `-n` or `--non-matching` are specified, non-matching pathnames will
also be output, in which case all fields in each output record except
for <pathname> will be empty.  This can be useful when running
check-ignore as a background process, so that files can be
incrementally streamed to STDIN, and for each of these files, STDOUT
will indicate whether that file matched a pattern or not.  (Without
this option, it would be impossible to tell whether the absence of
output for a given file meant that it didn't match any pattern, or
that the result simply hadn't been flushed to STDOUT yet.)

Signed-off-by: Adam Spiers <git@adamspiers.org>
---
 Documentation/git-check-ignore.txt |  15 +++++
 builtin/check-ignore.c             |  46 ++++++++------
 t/t0008-ignores.sh                 | 122 +++++++++++++++++++++++++++----------
 3 files changed, 134 insertions(+), 49 deletions(-)

diff --git a/Documentation/git-check-ignore.txt b/Documentation/git-check-ignore.txt
index 4014d28..8e1f7ab 100644
--- a/Documentation/git-check-ignore.txt
+++ b/Documentation/git-check-ignore.txt
@@ -39,6 +39,12 @@ OPTIONS
 	below).  If `--stdin` is also given, input paths are separated
 	with a NUL character instead of a linefeed character.
 
+-n, --non-matching::
+	Show given paths which don't match any pattern.	 This only
+	makes sense when `--verbose` is enabled, otherwise it would
+	not be possible to distinguish between paths which match a
+	pattern and those which don't.
+
 OUTPUT
 ------
 
@@ -65,6 +71,15 @@ are also used instead of colons and hard tabs:
 
 <source> <NULL> <linenum> <NULL> <pattern> <NULL> <pathname> <NULL>
 
+If `-n` or `--non-matching` are specified, non-matching pathnames will
+also be output, in which case all fields in each output record except
+for <pathname> will be empty.  This can be useful when running
+non-interactively, so that files can be incrementally streamed to
+STDIN of a long-running check-ignore process, and for each of these
+files, STDOUT will indicate whether that file matched a pattern or
+not.  (Without this option, it would be impossible to tell whether the
+absence of output for a given file meant that it didn't match any
+pattern, or that the output hadn't been generated yet.)
 
 Buffering happens as documented under the `GIT_FLUSH` option in
 linkgit:git[1].  The caller is responsible for avoiding deadlocks
diff --git a/builtin/check-ignore.c b/builtin/check-ignore.c
index ce4b1ad..03ddada 100644
--- a/builtin/check-ignore.c
+++ b/builtin/check-ignore.c
@@ -5,7 +5,7 @@
 #include "pathspec.h"
 #include "parse-options.h"
 
-static int quiet, verbose, stdin_paths;
+static int quiet, verbose, stdin_paths, show_non_matching;
 static const char * const check_ignore_usage[] = {
 "git check-ignore [options] pathname...",
 "git check-ignore [options] --stdin < <list-of-paths>",
@@ -22,21 +22,28 @@ static const struct option check_ignore_options[] = {
 		    N_("read file names from stdin")),
 	OPT_BOOLEAN('z', NULL, &null_term_line,
 		    N_("input paths are terminated by a null character")),
+	OPT_BOOLEAN('n', "non-matching", &show_non_matching,
+		    N_("show non-matching input paths")),
 	OPT_END()
 };
 
 static void output_exclude(const char *path, struct exclude *exclude)
 {
-	char *bang  = exclude->flags & EXC_FLAG_NEGATIVE  ? "!" : "";
-	char *slash = exclude->flags & EXC_FLAG_MUSTBEDIR ? "/" : "";
+	char *bang  = (exclude && exclude->flags & EXC_FLAG_NEGATIVE)  ? "!" : "";
+	char *slash = (exclude && exclude->flags & EXC_FLAG_MUSTBEDIR) ? "/" : "";
 	if (!null_term_line) {
 		if (!verbose) {
 			write_name_quoted(path, stdout, '\n');
 		} else {
-			quote_c_style(exclude->el->src, NULL, stdout, 0);
-			printf(":%d:%s%s%s\t",
-			       exclude->srcpos,
-			       bang, exclude->pattern, slash);
+			if (exclude) {
+				quote_c_style(exclude->el->src, NULL, stdout, 0);
+				printf(":%d:%s%s%s\t",
+				       exclude->srcpos,
+				       bang, exclude->pattern, slash);
+			}
+			else {
+				printf("::\t");
+			}
 			quote_c_style(path, NULL, stdout, 0);
 			fputc('\n', stdout);
 		}
@@ -44,11 +51,14 @@ static void output_exclude(const char *path, struct exclude *exclude)
 		if (!verbose) {
 			printf("%s%c", path, '\0');
 		} else {
-			printf("%s%c%d%c%s%s%s%c%s%c",
-			       exclude->el->src, '\0',
-			       exclude->srcpos, '\0',
-			       bang, exclude->pattern, slash, '\0',
-			       path, '\0');
+			if (exclude)
+				printf("%s%c%d%c%s%s%s%c%s%c",
+				       exclude->el->src, '\0',
+				       exclude->srcpos, '\0',
+				       bang, exclude->pattern, slash, '\0',
+				       path, '\0');
+			else
+				printf("%c%c%c%s%c", '\0', '\0', '\0', path, '\0');
 		}
 	}
 }
@@ -79,15 +89,15 @@ static int check_ignore(struct path_exclude_check check,
 					? strlen(prefix) : 0, path);
 		full_path = check_path_for_gitlink(full_path);
 		die_if_path_beyond_symlink(full_path, prefix);
+		exclude = NULL;
 		if (!seen[i]) {
 			exclude = last_exclude_matching_path(&check, full_path,
 							     -1, &dtype);
-			if (exclude) {
-				if (!quiet)
-					output_exclude(path, exclude);
-				num_ignored++;
-			}
 		}
+		if (!quiet && (exclude || show_non_matching))
+			output_exclude(path, exclude);
+		if (exclude)
+			num_ignored++;
 	}
 	free(seen);
 
@@ -146,6 +156,8 @@ int cmd_check_ignore(int argc, const char **argv, const char *prefix)
 		if (verbose)
 			die(_("cannot have both --quiet and --verbose"));
 	}
+	if (show_non_matching && !verbose)
+		die(_("--non-matching is only valid with --verbose"));
 
 	/* read_cache() is only necessary so we can watch out for submodules. */
 	if (read_cache() < 0)
diff --git a/t/t0008-ignores.sh b/t/t0008-ignores.sh
index 80b731a..86b8bf8 100755
--- a/t/t0008-ignores.sh
+++ b/t/t0008-ignores.sh
@@ -66,16 +66,23 @@ test_check_ignore () {
 
 	init_vars &&
 	rm -f "$HOME/stdout" "$HOME/stderr" "$HOME/cmd" &&
-	echo git $global_args check-ignore $quiet_opt $verbose_opt $args \
+	echo git $global_args check-ignore $quiet_opt $verbose_opt $non_matching_opt $args \
 		>"$HOME/cmd" &&
+	echo "$expect_code" >"$HOME/expected-exit-code" &&
 	test_expect_code "$expect_code" \
-		git $global_args check-ignore $quiet_opt $verbose_opt $args \
+		git $global_args check-ignore $quiet_opt $verbose_opt $non_matching_opt $args \
 		>"$HOME/stdout" 2>"$HOME/stderr" &&
 	test_cmp "$HOME/expected-stdout" "$HOME/stdout" &&
 	stderr_empty_on_success "$expect_code"
 }
 
-# Runs the same code with 3 different levels of output verbosity,
+# Runs the same code with 4 different levels of output verbosity:
+#
+#   1. with -q / --quiet
+#   2. with default verbosity
+#   3. with -v / --verbose
+#   4. with -v / --verbose, *and* -n / --non-matching
+#
 # expecting success each time.  Takes advantage of the fact that
 # check-ignore --verbose output is the same as normal output except
 # for the extra first column.
@@ -83,7 +90,9 @@ test_check_ignore () {
 # Arguments:
 #   - (optional) prereqs for this test, e.g. 'SYMLINKS'
 #   - test name
-#   - output to expect from -v / --verbose mode
+#   - output to expect from the fourth verbosity mode (the output
+#     from the other verbosity modes is automatically inferred
+#     from this value)
 #   - code to run (should invoke test_check_ignore)
 test_expect_success_multi () {
 	prereq=
@@ -92,8 +101,9 @@ test_expect_success_multi () {
 		prereq=$1
 		shift
 	fi
-	testname="$1" expect_verbose="$2" code="$3"
+	testname="$1" expect_all="$2" code="$3"
 
+	expect_verbose=$( echo "$expect_all" | grep -v '^::	' )
 	expect=$( echo "$expect_verbose" | sed -e 's/.*	//' )
 
 	test_expect_success $prereq "$testname" '
@@ -101,23 +111,40 @@ test_expect_success_multi () {
 		eval "$code"
 	'
 
-	for quiet_opt in '-q' '--quiet'
-	do
-		test_expect_success $prereq "$testname${quiet_opt:+ with $quiet_opt}" "
+	# --quiet is only valid when a single pattern is passed
+	if test $( echo "$expect_all" | wc -l ) = 1
+	then
+		for quiet_opt in '-q' '--quiet'
+		do
+			test_expect_success $prereq "$testname${quiet_opt:+ with $quiet_opt}" "
 			expect '' &&
 			$code
 		"
-	done
-	quiet_opt=
+		done
+		quiet_opt=
+	fi
 
 	for verbose_opt in '-v' '--verbose'
 	do
-		test_expect_success $prereq "$testname${verbose_opt:+ with $verbose_opt}" "
-			expect '$expect_verbose' &&
-			$code
-		"
+		for non_matching_opt in '' ' -n' ' --non-matching'
+		do
+			if test -n "$non_matching_opt"
+			then
+				my_expect="$expect_all"
+			else
+				my_expect="$expect_verbose"
+			fi
+
+			test_code="
+				expect '$my_expect' &&
+				$code
+			"
+			opts="$verbose_opt$non_matching_opt"
+			test_expect_success $prereq "$testname${opts:+ with $opts}" "$test_code"
+		done
 	done
 	verbose_opt=
+	non_matching_opt=
 }
 
 test_expect_success 'setup' '
@@ -178,7 +205,7 @@ test_expect_success 'setup' '
 #
 # test invalid inputs
 
-test_expect_success_multi '. corner-case' '' '
+test_expect_success_multi '. corner-case' '::	.' '
 	test_check_ignore . 1
 '
 
@@ -272,27 +299,39 @@ do
 		where="in subdir $subdir"
 	fi
 
-	test_expect_success_multi "non-existent file $where not ignored" '' "
-		test_check_ignore '${subdir}non-existent' 1
-	"
+	test_expect_success_multi "non-existent file $where not ignored" \
+		"::	${subdir}non-existent" \
+		"test_check_ignore '${subdir}non-existent' 1"
 
 	test_expect_success_multi "non-existent file $where ignored" \
-		".gitignore:1:one	${subdir}one" "
-		test_check_ignore '${subdir}one'
-	"
+		".gitignore:1:one	${subdir}one" \
+		"test_check_ignore '${subdir}one'"
 
-	test_expect_success_multi "existing untracked file $where not ignored" '' "
-		test_check_ignore '${subdir}not-ignored' 1
-	"
+	test_expect_success_multi "existing untracked file $where not ignored" \
+		"::	${subdir}not-ignored" \
+		"test_check_ignore '${subdir}not-ignored' 1"
 
-	test_expect_success_multi "existing tracked file $where not ignored" '' "
-		test_check_ignore '${subdir}ignored-but-in-index' 1
-	"
+	test_expect_success_multi "existing tracked file $where not ignored" \
+		"::	${subdir}ignored-but-in-index" \
+		"test_check_ignore '${subdir}ignored-but-in-index' 1"
 
 	test_expect_success_multi "existing untracked file $where ignored" \
-		".gitignore:2:ignored-*	${subdir}ignored-and-untracked" "
-		test_check_ignore '${subdir}ignored-and-untracked'
-	"
+		".gitignore:2:ignored-*	${subdir}ignored-and-untracked" \
+		"test_check_ignore '${subdir}ignored-and-untracked'"
+
+	test_expect_success_multi "mix of file types $where" \
+"::	${subdir}non-existent
+.gitignore:1:one	${subdir}one
+::	${subdir}not-ignored
+::	${subdir}ignored-but-in-index
+.gitignore:2:ignored-*	${subdir}ignored-and-untracked" \
+		"test_check_ignore '
+			${subdir}non-existent
+			${subdir}one
+			${subdir}not-ignored
+			${subdir}ignored-but-in-index
+			${subdir}ignored-and-untracked'
+		"
 done
 
 # Having established the above, from now on we mostly test against
@@ -387,7 +426,7 @@ test_expect_success 'cd to ignored sub-directory with -v' '
 #
 # test handling of symlinks
 
-test_expect_success_multi SYMLINKS 'symlink' '' '
+test_expect_success_multi SYMLINKS 'symlink' '::	a/symlink' '
 	test_check_ignore "a/symlink" 1
 '
 
@@ -570,22 +609,33 @@ cat <<-\EOF >stdin
 	globaltwo
 	b/globaltwo
 	../b/globaltwo
+	c/not-ignored
 EOF
-cat <<-EOF >expected-verbose
+# N.B. we deliberately end STDIN with a non-matching pattern in order
+# to test that the exit code indicates that one or more of the
+# provided paths is ignored - in other words, that it represents an
+# aggregation of all the results, not just the final result.
+
+cat <<-EOF >expected-all
 	.gitignore:1:one	../one
+	::	../not-ignored
 	.gitignore:1:one	one
+	::	not-ignored
 	a/b/.gitignore:8:!on*	b/on
 	a/b/.gitignore:8:!on*	b/one
 	a/b/.gitignore:8:!on*	b/one one
 	a/b/.gitignore:8:!on*	b/one two
 	a/b/.gitignore:8:!on*	"b/one\"three"
 	a/b/.gitignore:9:!two	b/two
+	::	b/not-ignored
 	a/.gitignore:1:two*	b/twooo
 	$global_excludes:2:!globaltwo	../globaltwo
 	$global_excludes:2:!globaltwo	globaltwo
 	$global_excludes:2:!globaltwo	b/globaltwo
 	$global_excludes:2:!globaltwo	../b/globaltwo
+	::	c/not-ignored
 EOF
+grep -v '^::	' expected-all >expected-verbose
 sed -e 's/.*	//' expected-verbose >expected-default
 
 sed -e 's/^"//' -e 's/\\//' -e 's/"$//' stdin | \
@@ -611,6 +661,14 @@ test_expect_success '--stdin from subdirectory with -v' '
 	)
 '
 
+test_expect_success '--stdin from subdirectory with -v -n' '
+	expect_from_stdin <expected-all &&
+	(
+		cd a &&
+		test_check_ignore "--stdin -v -n" <../stdin
+	)
+'
+
 for opts in '--stdin -z' '-z --stdin'
 do
 	test_expect_success "$opts from subdirectory" '
-- 
1.8.2.1.347.g37e0606

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [PATCH 1/5] check-ignore: move setup into cmd_check_ignore()
  2013-04-11  1:59 ` [PATCH 1/5] check-ignore: move setup into cmd_check_ignore() Adam Spiers
                     ` (3 preceding siblings ...)
  2013-04-11  1:59   ` [PATCH 5/5] check-ignore: add -n / --non-matching option Adam Spiers
@ 2013-04-11  5:25   ` Jeff King
  2013-04-11 11:05     ` Adam Spiers
  4 siblings, 1 reply; 33+ messages in thread
From: Jeff King @ 2013-04-11  5:25 UTC (permalink / raw)
  To: Adam Spiers; +Cc: git list

On Thu, Apr 11, 2013 at 02:59:31AM +0100, Adam Spiers wrote:

> Initialisation of the dir_struct and path_exclude_check structs was
> previously done within check_ignore().  This was acceptable since
> check_ignore() was only called once per check-ignore invocation;
> however the next commit will convert it into an inner loop which is
> called once per line of STDIN when --stdin is given.  Therefore moving
> the initialisation code out into cmd_check_ignore() ensures that
> initialisation is still only performed once per check-ignore
> invocation, and consequently that the output is identical whether
> pathspecs are provided as CLI arguments or via STDIN.
> 
> Signed-off-by: Adam Spiers <git@adamspiers.org>
> ---
>  builtin/check-ignore.c | 39 ++++++++++++++++++++-------------------
>  1 file changed, 20 insertions(+), 19 deletions(-)
> 
> diff --git a/builtin/check-ignore.c b/builtin/check-ignore.c
> index 0240f99..0a4eef1 100644
> --- a/builtin/check-ignore.c
> +++ b/builtin/check-ignore.c
> @@ -53,30 +53,20 @@ static void output_exclude(const char *path, struct exclude *exclude)
>  	}
>  }
>  
> -static int check_ignore(const char *prefix, const char **pathspec)
> +static int check_ignore(struct path_exclude_check check,
> +			const char *prefix, const char **pathspec)

Did you mean to pass the struct by value here? If it is truly a per-path
value, shouldn't it be declared and initialized inside here? Otherwise
you risk one invocation munging things that the struct points to, but
the caller's copy does not know about the change.

In particular, I see that the struct includes a strbuf. What happens
when one invocation of check_ignore grows the strbuf, then returns? The
copy of the struct in the caller will not know that the buffer it is
pointing to is now bogus.

> -static int check_ignore_stdin_paths(const char *prefix)
> +static int check_ignore_stdin_paths(struct path_exclude_check check, const char *prefix)

Ditto here.

-Peff

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 2/5] check-ignore: allow incremental streaming of queries via --stdin
  2013-04-11  1:59   ` [PATCH 2/5] check-ignore: allow incremental streaming of queries via --stdin Adam Spiers
@ 2013-04-11  5:31     ` Jeff King
  2013-04-11 10:55       ` Adam Spiers
  2013-04-11 11:20     ` Adam Spiers
  1 sibling, 1 reply; 33+ messages in thread
From: Jeff King @ 2013-04-11  5:31 UTC (permalink / raw)
  To: Adam Spiers; +Cc: git list

On Thu, Apr 11, 2013 at 02:59:32AM +0100, Adam Spiers wrote:

> @@ -111,14 +110,11 @@ static int check_ignore_stdin_paths(struct path_exclude_check check, const char
>  				die("line is badly quoted");
>  			strbuf_swap(&buf, &nbuf);
>  		}
> -		ALLOC_GROW(pathspec, nr + 1, alloc);
> -		pathspec[nr] = xcalloc(strlen(buf.buf) + 1, sizeof(*buf.buf));
> -		strcpy(pathspec[nr++], buf.buf);
> +		pathspec[0] = xcalloc(strlen(buf.buf) + 1, sizeof(*buf.buf));
> +		strcpy(pathspec[0], buf.buf);
> +		num_ignored += check_ignore(check, prefix, (const char **)pathspec);
> +		maybe_flush_or_die(stdout, "check-ignore to stdout");

Now that you are not storing the whole pathspec at once, the pathspec
buffer only needs to be valid for the length of check_ignore, right?
That means you can drop this extra copy and just pass in buf.buf:

  pathspec[0] = buf.buf;
  num_ignored += check_ignore(check, prefix, pathspec);

> +test_expect_success 'setup: have stdbuf?' '
> +	if which stdbuf >/dev/null 2>&1
> +	then
> +		test_set_prereq STDBUF
> +	fi
> +'

Hmm. Today I learned about stdbuf. :)

-Peff

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 3/5] Documentation: add caveats about I/O buffering for check-{attr,ignore}
  2013-04-11  1:59   ` [PATCH 3/5] Documentation: add caveats about I/O buffering for check-{attr,ignore} Adam Spiers
@ 2013-04-11  5:31     ` Jeff King
  0 siblings, 0 replies; 33+ messages in thread
From: Jeff King @ 2013-04-11  5:31 UTC (permalink / raw)
  To: Adam Spiers; +Cc: git list

On Thu, Apr 11, 2013 at 02:59:33AM +0100, Adam Spiers wrote:

> check-attr and check-ignore have the potential to deadlock callers
> which do not read back the output in real-time.  For example, if a
> caller writes N paths out and then reads N lines back in, it risks
> becoming blocked on write() to check-*, and check-* is blocked on
> write back to the caller.  Somebody has to buffer; the pipe buffers
> provide some leeway, but they are limited.
> 
> Thanks to Peff for pointing this out:
> 
>     http://article.gmane.org/gmane.comp.version-control.git/220534
> 
> Signed-off-by: Adam Spiers <git@adamspiers.org>

Thanks, I think the documentation changes look sane.

-Peff

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 2/5] check-ignore: allow incremental streaming of queries via --stdin
  2013-04-11  5:31     ` Jeff King
@ 2013-04-11 10:55       ` Adam Spiers
  0 siblings, 0 replies; 33+ messages in thread
From: Adam Spiers @ 2013-04-11 10:55 UTC (permalink / raw)
  To: Jeff King; +Cc: git list

On Thu, Apr 11, 2013 at 01:31:45AM -0400, Jeff King wrote:
> On Thu, Apr 11, 2013 at 02:59:32AM +0100, Adam Spiers wrote:
> 
> > @@ -111,14 +110,11 @@ static int check_ignore_stdin_paths(struct path_exclude_check check, const char
> >  				die("line is badly quoted");
> >  			strbuf_swap(&buf, &nbuf);
> >  		}
> > -		ALLOC_GROW(pathspec, nr + 1, alloc);
> > -		pathspec[nr] = xcalloc(strlen(buf.buf) + 1, sizeof(*buf.buf));
> > -		strcpy(pathspec[nr++], buf.buf);
> > +		pathspec[0] = xcalloc(strlen(buf.buf) + 1, sizeof(*buf.buf));
> > +		strcpy(pathspec[0], buf.buf);
> > +		num_ignored += check_ignore(check, prefix, (const char **)pathspec);
> > +		maybe_flush_or_die(stdout, "check-ignore to stdout");
> 
> Now that you are not storing the whole pathspec at once, the pathspec
> buffer only needs to be valid for the length of check_ignore, right?
> That means you can drop this extra copy and just pass in buf.buf:
> 
>   pathspec[0] = buf.buf;
>   num_ignored += check_ignore(check, prefix, pathspec);

Oops, good point - thanks.  I've made that change.

> > +test_expect_success 'setup: have stdbuf?' '
> > +	if which stdbuf >/dev/null 2>&1
> > +	then
> > +		test_set_prereq STDBUF
> > +	fi
> > +'
> 
> Hmm. Today I learned about stdbuf. :)

Yeah, it's a relatively recent addition to coreutils.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 1/5] check-ignore: move setup into cmd_check_ignore()
  2013-04-11  5:25   ` [PATCH 1/5] check-ignore: move setup into cmd_check_ignore() Jeff King
@ 2013-04-11 11:05     ` Adam Spiers
  2013-04-11 12:05       ` [PATCH v2 1/5] t0008: remove duplicated test fixture data Adam Spiers
  2013-04-11 18:35       ` [PATCH 1/5] check-ignore: move setup into cmd_check_ignore() Jeff King
  0 siblings, 2 replies; 33+ messages in thread
From: Adam Spiers @ 2013-04-11 11:05 UTC (permalink / raw)
  To: git list

On Thu, Apr 11, 2013 at 01:25:53AM -0400, Jeff King wrote:
> On Thu, Apr 11, 2013 at 02:59:31AM +0100, Adam Spiers wrote:
> > -static int check_ignore(const char *prefix, const char **pathspec)
> > +static int check_ignore(struct path_exclude_check check,
> > +			const char *prefix, const char **pathspec)
> 
> Did you mean to pass the struct by value here? If it is truly a per-path
> value, shouldn't it be declared and initialized inside here? Otherwise
> you risk one invocation munging things that the struct points to, but
> the caller's copy does not know about the change.
> 
> In particular, I see that the struct includes a strbuf. What happens
> when one invocation of check_ignore grows the strbuf, then returns? The
> copy of the struct in the caller will not know that the buffer it is
> pointing to is now bogus.
> 
> > -static int check_ignore_stdin_paths(const char *prefix)
> > +static int check_ignore_stdin_paths(struct path_exclude_check check, const char *prefix)
> 
> Ditto here.

It's not a per-path value; it's supposed to be reused across checks
for multiple paths, as explained in the comments above
last_exclude_matching_path():

    ...
     * A path to a directory known to be excluded is left in check->path to
     * optimize for repeated checks for files in the same excluded directory.
     */
    struct exclude *last_exclude_matching_path(struct path_exclude_check *check,
    ...

So I think you're probably right that there is potential for
check->path to become effectively corrupted due to the caller not
seeing the reallocation.  I'll change this too.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 2/5] check-ignore: allow incremental streaming of queries via --stdin
  2013-04-11  1:59   ` [PATCH 2/5] check-ignore: allow incremental streaming of queries via --stdin Adam Spiers
  2013-04-11  5:31     ` Jeff King
@ 2013-04-11 11:20     ` Adam Spiers
  2013-04-11 18:33       ` Jeff King
  1 sibling, 1 reply; 33+ messages in thread
From: Adam Spiers @ 2013-04-11 11:20 UTC (permalink / raw)
  To: git list

On Thu, Apr 11, 2013 at 02:59:32AM +0100, Adam Spiers wrote:
> +test_expect_success STDBUF 'streaming support for --stdin' '
> +	(
> +		echo one
> +		sleep 2
> +		echo two
> +	) | stdbuf -oL git check-ignore -v -n --stdin >out &

I just noticed that this patch precedes the one in the same series
which adds -n support.  I'll reorder them accordingly to avoid
breaking git bisect.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH v2 1/5] t0008: remove duplicated test fixture data
  2013-04-11 11:05     ` Adam Spiers
@ 2013-04-11 12:05       ` Adam Spiers
  2013-04-11 12:05         ` [PATCH v2 2/5] check-ignore: add -n / --non-matching option Adam Spiers
                           ` (3 more replies)
  2013-04-11 18:35       ` [PATCH 1/5] check-ignore: move setup into cmd_check_ignore() Jeff King
  1 sibling, 4 replies; 33+ messages in thread
From: Adam Spiers @ 2013-04-11 12:05 UTC (permalink / raw)
  To: git list

The expected contents of STDOUT for the final --stdin tests can be
derived from the expected contents of STDOUT for the same tests when
--verbose is given, in the same way that test_expect_success_multi
derives this for earlier tests.

Signed-off-by: Adam Spiers <git@adamspiers.org>
---
 t/t0008-ignores.sh | 16 +---------------
 1 file changed, 1 insertion(+), 15 deletions(-)

diff --git a/t/t0008-ignores.sh b/t/t0008-ignores.sh
index 9c1bde1..314a86d 100755
--- a/t/t0008-ignores.sh
+++ b/t/t0008-ignores.sh
@@ -575,21 +575,6 @@ cat <<-\EOF >stdin
 	b/globaltwo
 	../b/globaltwo
 EOF
-cat <<-\EOF >expected-default
-	../one
-	one
-	b/on
-	b/one
-	b/one one
-	b/one two
-	"b/one\"three"
-	b/two
-	b/twooo
-	../globaltwo
-	globaltwo
-	b/globaltwo
-	../b/globaltwo
-EOF
 cat <<-EOF >expected-verbose
 	.gitignore:1:one	../one
 	.gitignore:1:one	one
@@ -605,6 +590,7 @@ cat <<-EOF >expected-verbose
 	$global_excludes:2:!globaltwo	b/globaltwo
 	$global_excludes:2:!globaltwo	../b/globaltwo
 EOF
+sed -e 's/.*	//' expected-verbose >expected-default
 
 sed -e 's/^"//' -e 's/\\//' -e 's/"$//' stdin | \
 	tr "\n" "\0" >stdin0
-- 
1.8.2.1.342.gfa7285d

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v2 2/5] check-ignore: add -n / --non-matching option
  2013-04-11 12:05       ` [PATCH v2 1/5] t0008: remove duplicated test fixture data Adam Spiers
@ 2013-04-11 12:05         ` Adam Spiers
  2013-04-11 12:05         ` [PATCH v2 3/5] check-ignore: move setup into cmd_check_ignore() Adam Spiers
                           ` (2 subsequent siblings)
  3 siblings, 0 replies; 33+ messages in thread
From: Adam Spiers @ 2013-04-11 12:05 UTC (permalink / raw)
  To: git list

If `-n` or `--non-matching` are specified, non-matching pathnames will
also be output, in which case all fields in each output record except
for <pathname> will be empty.  This can be useful when running
check-ignore as a background process, so that files can be
incrementally streamed to STDIN, and for each of these files, STDOUT
will indicate whether that file matched a pattern or not.  (Without
this option, it would be impossible to tell whether the absence of
output for a given file meant that it didn't match any pattern, or
that the result simply hadn't been flushed to STDOUT yet.)

Signed-off-by: Adam Spiers <git@adamspiers.org>
---
 Documentation/git-check-ignore.txt |  15 +++++
 builtin/check-ignore.c             |  46 ++++++++------
 t/t0008-ignores.sh                 | 122 +++++++++++++++++++++++++++----------
 3 files changed, 134 insertions(+), 49 deletions(-)

diff --git a/Documentation/git-check-ignore.txt b/Documentation/git-check-ignore.txt
index 854e4d0..7e3cabc 100644
--- a/Documentation/git-check-ignore.txt
+++ b/Documentation/git-check-ignore.txt
@@ -39,6 +39,12 @@ OPTIONS
 	below).  If `--stdin` is also given, input paths are separated
 	with a NUL character instead of a linefeed character.
 
+-n, --non-matching::
+	Show given paths which don't match any pattern.	 This only
+	makes sense when `--verbose` is enabled, otherwise it would
+	not be possible to distinguish between paths which match a
+	pattern and those which don't.
+
 OUTPUT
 ------
 
@@ -65,6 +71,15 @@ are also used instead of colons and hard tabs:
 
 <source> <NULL> <linenum> <NULL> <pattern> <NULL> <pathname> <NULL>
 
+If `-n` or `--non-matching` are specified, non-matching pathnames will
+also be output, in which case all fields in each output record except
+for <pathname> will be empty.  This can be useful when running
+non-interactively, so that files can be incrementally streamed to
+STDIN of a long-running check-ignore process, and for each of these
+files, STDOUT will indicate whether that file matched a pattern or
+not.  (Without this option, it would be impossible to tell whether the
+absence of output for a given file meant that it didn't match any
+pattern, or that the output hadn't been generated yet.)
 
 EXIT STATUS
 -----------
diff --git a/builtin/check-ignore.c b/builtin/check-ignore.c
index 0240f99..59acf74 100644
--- a/builtin/check-ignore.c
+++ b/builtin/check-ignore.c
@@ -5,7 +5,7 @@
 #include "pathspec.h"
 #include "parse-options.h"
 
-static int quiet, verbose, stdin_paths;
+static int quiet, verbose, stdin_paths, show_non_matching;
 static const char * const check_ignore_usage[] = {
 "git check-ignore [options] pathname...",
 "git check-ignore [options] --stdin < <list-of-paths>",
@@ -22,21 +22,28 @@ static const struct option check_ignore_options[] = {
 		    N_("read file names from stdin")),
 	OPT_BOOLEAN('z', NULL, &null_term_line,
 		    N_("input paths are terminated by a null character")),
+	OPT_BOOLEAN('n', "non-matching", &show_non_matching,
+		    N_("show non-matching input paths")),
 	OPT_END()
 };
 
 static void output_exclude(const char *path, struct exclude *exclude)
 {
-	char *bang  = exclude->flags & EXC_FLAG_NEGATIVE  ? "!" : "";
-	char *slash = exclude->flags & EXC_FLAG_MUSTBEDIR ? "/" : "";
+	char *bang  = (exclude && exclude->flags & EXC_FLAG_NEGATIVE)  ? "!" : "";
+	char *slash = (exclude && exclude->flags & EXC_FLAG_MUSTBEDIR) ? "/" : "";
 	if (!null_term_line) {
 		if (!verbose) {
 			write_name_quoted(path, stdout, '\n');
 		} else {
-			quote_c_style(exclude->el->src, NULL, stdout, 0);
-			printf(":%d:%s%s%s\t",
-			       exclude->srcpos,
-			       bang, exclude->pattern, slash);
+			if (exclude) {
+				quote_c_style(exclude->el->src, NULL, stdout, 0);
+				printf(":%d:%s%s%s\t",
+				       exclude->srcpos,
+				       bang, exclude->pattern, slash);
+			}
+			else {
+				printf("::\t");
+			}
 			quote_c_style(path, NULL, stdout, 0);
 			fputc('\n', stdout);
 		}
@@ -44,11 +51,14 @@ static void output_exclude(const char *path, struct exclude *exclude)
 		if (!verbose) {
 			printf("%s%c", path, '\0');
 		} else {
-			printf("%s%c%d%c%s%s%s%c%s%c",
-			       exclude->el->src, '\0',
-			       exclude->srcpos, '\0',
-			       bang, exclude->pattern, slash, '\0',
-			       path, '\0');
+			if (exclude)
+				printf("%s%c%d%c%s%s%s%c%s%c",
+				       exclude->el->src, '\0',
+				       exclude->srcpos, '\0',
+				       bang, exclude->pattern, slash, '\0',
+				       path, '\0');
+			else
+				printf("%c%c%c%s%c", '\0', '\0', '\0', path, '\0');
 		}
 	}
 }
@@ -89,15 +99,15 @@ static int check_ignore(const char *prefix, const char **pathspec)
 					? strlen(prefix) : 0, path);
 		full_path = check_path_for_gitlink(full_path);
 		die_if_path_beyond_symlink(full_path, prefix);
+		exclude = NULL;
 		if (!seen[i]) {
 			exclude = last_exclude_matching_path(&check, full_path,
 							     -1, &dtype);
-			if (exclude) {
-				if (!quiet)
-					output_exclude(path, exclude);
-				num_ignored++;
-			}
 		}
+		if (!quiet && (exclude || show_non_matching))
+			output_exclude(path, exclude);
+		if (exclude)
+			num_ignored++;
 	}
 	free(seen);
 	clear_directory(&dir);
@@ -161,6 +171,8 @@ int cmd_check_ignore(int argc, const char **argv, const char *prefix)
 		if (verbose)
 			die(_("cannot have both --quiet and --verbose"));
 	}
+	if (show_non_matching && !verbose)
+		die(_("--non-matching is only valid with --verbose"));
 
 	if (stdin_paths) {
 		num_ignored = check_ignore_stdin_paths(prefix);
diff --git a/t/t0008-ignores.sh b/t/t0008-ignores.sh
index 314a86d..7af93ba 100755
--- a/t/t0008-ignores.sh
+++ b/t/t0008-ignores.sh
@@ -66,16 +66,23 @@ test_check_ignore () {
 
 	init_vars &&
 	rm -f "$HOME/stdout" "$HOME/stderr" "$HOME/cmd" &&
-	echo git $global_args check-ignore $quiet_opt $verbose_opt $args \
+	echo git $global_args check-ignore $quiet_opt $verbose_opt $non_matching_opt $args \
 		>"$HOME/cmd" &&
+	echo "$expect_code" >"$HOME/expected-exit-code" &&
 	test_expect_code "$expect_code" \
-		git $global_args check-ignore $quiet_opt $verbose_opt $args \
+		git $global_args check-ignore $quiet_opt $verbose_opt $non_matching_opt $args \
 		>"$HOME/stdout" 2>"$HOME/stderr" &&
 	test_cmp "$HOME/expected-stdout" "$HOME/stdout" &&
 	stderr_empty_on_success "$expect_code"
 }
 
-# Runs the same code with 3 different levels of output verbosity,
+# Runs the same code with 4 different levels of output verbosity:
+#
+#   1. with -q / --quiet
+#   2. with default verbosity
+#   3. with -v / --verbose
+#   4. with -v / --verbose, *and* -n / --non-matching
+#
 # expecting success each time.  Takes advantage of the fact that
 # check-ignore --verbose output is the same as normal output except
 # for the extra first column.
@@ -83,7 +90,9 @@ test_check_ignore () {
 # Arguments:
 #   - (optional) prereqs for this test, e.g. 'SYMLINKS'
 #   - test name
-#   - output to expect from -v / --verbose mode
+#   - output to expect from the fourth verbosity mode (the output
+#     from the other verbosity modes is automatically inferred
+#     from this value)
 #   - code to run (should invoke test_check_ignore)
 test_expect_success_multi () {
 	prereq=
@@ -92,8 +101,9 @@ test_expect_success_multi () {
 		prereq=$1
 		shift
 	fi
-	testname="$1" expect_verbose="$2" code="$3"
+	testname="$1" expect_all="$2" code="$3"
 
+	expect_verbose=$( echo "$expect_all" | grep -v '^::	' )
 	expect=$( echo "$expect_verbose" | sed -e 's/.*	//' )
 
 	test_expect_success $prereq "$testname" '
@@ -101,23 +111,40 @@ test_expect_success_multi () {
 		eval "$code"
 	'
 
-	for quiet_opt in '-q' '--quiet'
-	do
-		test_expect_success $prereq "$testname${quiet_opt:+ with $quiet_opt}" "
+	# --quiet is only valid when a single pattern is passed
+	if test $( echo "$expect_all" | wc -l ) = 1
+	then
+		for quiet_opt in '-q' '--quiet'
+		do
+			test_expect_success $prereq "$testname${quiet_opt:+ with $quiet_opt}" "
 			expect '' &&
 			$code
 		"
-	done
-	quiet_opt=
+		done
+		quiet_opt=
+	fi
 
 	for verbose_opt in '-v' '--verbose'
 	do
-		test_expect_success $prereq "$testname${verbose_opt:+ with $verbose_opt}" "
-			expect '$expect_verbose' &&
-			$code
-		"
+		for non_matching_opt in '' ' -n' ' --non-matching'
+		do
+			if test -n "$non_matching_opt"
+			then
+				my_expect="$expect_all"
+			else
+				my_expect="$expect_verbose"
+			fi
+
+			test_code="
+				expect '$my_expect' &&
+				$code
+			"
+			opts="$verbose_opt$non_matching_opt"
+			test_expect_success $prereq "$testname${opts:+ with $opts}" "$test_code"
+		done
 	done
 	verbose_opt=
+	non_matching_opt=
 }
 
 test_expect_success 'setup' '
@@ -178,7 +205,7 @@ test_expect_success 'setup' '
 #
 # test invalid inputs
 
-test_expect_success_multi '. corner-case' '' '
+test_expect_success_multi '. corner-case' '::	.' '
 	test_check_ignore . 1
 '
 
@@ -276,27 +303,39 @@ do
 		where="in subdir $subdir"
 	fi
 
-	test_expect_success_multi "non-existent file $where not ignored" '' "
-		test_check_ignore '${subdir}non-existent' 1
-	"
+	test_expect_success_multi "non-existent file $where not ignored" \
+		"::	${subdir}non-existent" \
+		"test_check_ignore '${subdir}non-existent' 1"
 
 	test_expect_success_multi "non-existent file $where ignored" \
-		".gitignore:1:one	${subdir}one" "
-		test_check_ignore '${subdir}one'
-	"
+		".gitignore:1:one	${subdir}one" \
+		"test_check_ignore '${subdir}one'"
 
-	test_expect_success_multi "existing untracked file $where not ignored" '' "
-		test_check_ignore '${subdir}not-ignored' 1
-	"
+	test_expect_success_multi "existing untracked file $where not ignored" \
+		"::	${subdir}not-ignored" \
+		"test_check_ignore '${subdir}not-ignored' 1"
 
-	test_expect_success_multi "existing tracked file $where not ignored" '' "
-		test_check_ignore '${subdir}ignored-but-in-index' 1
-	"
+	test_expect_success_multi "existing tracked file $where not ignored" \
+		"::	${subdir}ignored-but-in-index" \
+		"test_check_ignore '${subdir}ignored-but-in-index' 1"
 
 	test_expect_success_multi "existing untracked file $where ignored" \
-		".gitignore:2:ignored-*	${subdir}ignored-and-untracked" "
-		test_check_ignore '${subdir}ignored-and-untracked'
-	"
+		".gitignore:2:ignored-*	${subdir}ignored-and-untracked" \
+		"test_check_ignore '${subdir}ignored-and-untracked'"
+
+	test_expect_success_multi "mix of file types $where" \
+"::	${subdir}non-existent
+.gitignore:1:one	${subdir}one
+::	${subdir}not-ignored
+::	${subdir}ignored-but-in-index
+.gitignore:2:ignored-*	${subdir}ignored-and-untracked" \
+		"test_check_ignore '
+			${subdir}non-existent
+			${subdir}one
+			${subdir}not-ignored
+			${subdir}ignored-but-in-index
+			${subdir}ignored-and-untracked'
+		"
 done
 
 # Having established the above, from now on we mostly test against
@@ -391,7 +430,7 @@ test_expect_success 'cd to ignored sub-directory with -v' '
 #
 # test handling of symlinks
 
-test_expect_success_multi SYMLINKS 'symlink' '' '
+test_expect_success_multi SYMLINKS 'symlink' '::	a/symlink' '
 	test_check_ignore "a/symlink" 1
 '
 
@@ -574,22 +613,33 @@ cat <<-\EOF >stdin
 	globaltwo
 	b/globaltwo
 	../b/globaltwo
+	c/not-ignored
 EOF
-cat <<-EOF >expected-verbose
+# N.B. we deliberately end STDIN with a non-matching pattern in order
+# to test that the exit code indicates that one or more of the
+# provided paths is ignored - in other words, that it represents an
+# aggregation of all the results, not just the final result.
+
+cat <<-EOF >expected-all
 	.gitignore:1:one	../one
+	::	../not-ignored
 	.gitignore:1:one	one
+	::	not-ignored
 	a/b/.gitignore:8:!on*	b/on
 	a/b/.gitignore:8:!on*	b/one
 	a/b/.gitignore:8:!on*	b/one one
 	a/b/.gitignore:8:!on*	b/one two
 	a/b/.gitignore:8:!on*	"b/one\"three"
 	a/b/.gitignore:9:!two	b/two
+	::	b/not-ignored
 	a/.gitignore:1:two*	b/twooo
 	$global_excludes:2:!globaltwo	../globaltwo
 	$global_excludes:2:!globaltwo	globaltwo
 	$global_excludes:2:!globaltwo	b/globaltwo
 	$global_excludes:2:!globaltwo	../b/globaltwo
+	::	c/not-ignored
 EOF
+grep -v '^::	' expected-all >expected-verbose
 sed -e 's/.*	//' expected-verbose >expected-default
 
 sed -e 's/^"//' -e 's/\\//' -e 's/"$//' stdin | \
@@ -615,6 +665,14 @@ test_expect_success '--stdin from subdirectory with -v' '
 	)
 '
 
+test_expect_success '--stdin from subdirectory with -v -n' '
+	expect_from_stdin <expected-all &&
+	(
+		cd a &&
+		test_check_ignore "--stdin -v -n" <../stdin
+	)
+'
+
 for opts in '--stdin -z' '-z --stdin'
 do
 	test_expect_success "$opts from subdirectory" '
-- 
1.8.2.1.342.gfa7285d

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v2 3/5] check-ignore: move setup into cmd_check_ignore()
  2013-04-11 12:05       ` [PATCH v2 1/5] t0008: remove duplicated test fixture data Adam Spiers
  2013-04-11 12:05         ` [PATCH v2 2/5] check-ignore: add -n / --non-matching option Adam Spiers
@ 2013-04-11 12:05         ` Adam Spiers
  2013-04-11 12:05         ` [PATCH v2 4/5] check-ignore: allow incremental streaming of queries via --stdin Adam Spiers
  2013-04-11 12:05         ` [PATCH v2 5/5] Documentation: add caveats about I/O buffering for check-{attr,ignore} Adam Spiers
  3 siblings, 0 replies; 33+ messages in thread
From: Adam Spiers @ 2013-04-11 12:05 UTC (permalink / raw)
  To: git list

Initialisation of the dir_struct and path_exclude_check structs was
previously done within check_ignore().  This was acceptable since
check_ignore() was only called once per check-ignore invocation;
however the next commit will convert it into an inner loop which is
called once per line of STDIN when --stdin is given.  Therefore moving
the initialisation code out into cmd_check_ignore() ensures that
initialisation is still only performed once per check-ignore
invocation, and consequently that the output is identical whether
pathspecs are provided as CLI arguments or via STDIN.

Signed-off-by: Adam Spiers <git@adamspiers.org>
---
 builtin/check-ignore.c | 41 +++++++++++++++++++++--------------------
 1 file changed, 21 insertions(+), 20 deletions(-)

diff --git a/builtin/check-ignore.c b/builtin/check-ignore.c
index 59acf74..e2d3006 100644
--- a/builtin/check-ignore.c
+++ b/builtin/check-ignore.c
@@ -63,30 +63,20 @@ static void output_exclude(const char *path, struct exclude *exclude)
 	}
 }
 
-static int check_ignore(const char *prefix, const char **pathspec)
+static int check_ignore(struct path_exclude_check *check,
+			const char *prefix, const char **pathspec)
 {
-	struct dir_struct dir;
 	const char *path, *full_path;
 	char *seen;
 	int num_ignored = 0, dtype = DT_UNKNOWN, i;
-	struct path_exclude_check check;
 	struct exclude *exclude;
 
-	/* read_cache() is only necessary so we can watch out for submodules. */
-	if (read_cache() < 0)
-		die(_("index file corrupt"));
-
-	memset(&dir, 0, sizeof(dir));
-	dir.flags |= DIR_COLLECT_IGNORED;
-	setup_standard_excludes(&dir);
-
 	if (!pathspec || !*pathspec) {
 		if (!quiet)
 			fprintf(stderr, "no pathspec given.\n");
 		return 0;
 	}
 
-	path_exclude_check_init(&check, &dir);
 	/*
 	 * look for pathspecs matching entries in the index, since these
 	 * should not be ignored, in order to be consistent with
@@ -101,7 +91,7 @@ static int check_ignore(const char *prefix, const char **pathspec)
 		die_if_path_beyond_symlink(full_path, prefix);
 		exclude = NULL;
 		if (!seen[i]) {
-			exclude = last_exclude_matching_path(&check, full_path,
+			exclude = last_exclude_matching_path(check, full_path,
 							     -1, &dtype);
 		}
 		if (!quiet && (exclude || show_non_matching))
@@ -110,13 +100,11 @@ static int check_ignore(const char *prefix, const char **pathspec)
 			num_ignored++;
 	}
 	free(seen);
-	clear_directory(&dir);
-	path_exclude_check_clear(&check);
 
 	return num_ignored;
 }
 
-static int check_ignore_stdin_paths(const char *prefix)
+static int check_ignore_stdin_paths(struct path_exclude_check *check, const char *prefix)
 {
 	struct strbuf buf, nbuf;
 	char **pathspec = NULL;
@@ -139,17 +127,18 @@ static int check_ignore_stdin_paths(const char *prefix)
 	}
 	ALLOC_GROW(pathspec, nr + 1, alloc);
 	pathspec[nr] = NULL;
-	num_ignored = check_ignore(prefix, (const char **)pathspec);
+	num_ignored = check_ignore(check, prefix, (const char **)pathspec);
 	maybe_flush_or_die(stdout, "attribute to stdout");
 	strbuf_release(&buf);
 	strbuf_release(&nbuf);
-	free(pathspec);
 	return num_ignored;
 }
 
 int cmd_check_ignore(int argc, const char **argv, const char *prefix)
 {
 	int num_ignored;
+	struct dir_struct dir;
+	struct path_exclude_check check;
 
 	git_config(git_default_config, NULL);
 
@@ -174,12 +163,24 @@ int cmd_check_ignore(int argc, const char **argv, const char *prefix)
 	if (show_non_matching && !verbose)
 		die(_("--non-matching is only valid with --verbose"));
 
+	/* read_cache() is only necessary so we can watch out for submodules. */
+	if (read_cache() < 0)
+		die(_("index file corrupt"));
+
+	memset(&dir, 0, sizeof(dir));
+	dir.flags |= DIR_COLLECT_IGNORED;
+	setup_standard_excludes(&dir);
+
+	path_exclude_check_init(&check, &dir);
 	if (stdin_paths) {
-		num_ignored = check_ignore_stdin_paths(prefix);
+		num_ignored = check_ignore_stdin_paths(&check, prefix);
 	} else {
-		num_ignored = check_ignore(prefix, argv);
+		num_ignored = check_ignore(&check, prefix, argv);
 		maybe_flush_or_die(stdout, "ignore to stdout");
 	}
 
+	clear_directory(&dir);
+	path_exclude_check_clear(&check);
+
 	return !num_ignored;
 }
-- 
1.8.2.1.342.gfa7285d

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v2 4/5] check-ignore: allow incremental streaming of queries via --stdin
  2013-04-11 12:05       ` [PATCH v2 1/5] t0008: remove duplicated test fixture data Adam Spiers
  2013-04-11 12:05         ` [PATCH v2 2/5] check-ignore: add -n / --non-matching option Adam Spiers
  2013-04-11 12:05         ` [PATCH v2 3/5] check-ignore: move setup into cmd_check_ignore() Adam Spiers
@ 2013-04-11 12:05         ` Adam Spiers
  2013-04-11 19:11           ` Jeff King
  2013-04-11 21:04           ` [PATCH v2 4/5] check-ignore: allow incremental streaming of queries via --stdin Aaron Schrab
  2013-04-11 12:05         ` [PATCH v2 5/5] Documentation: add caveats about I/O buffering for check-{attr,ignore} Adam Spiers
  3 siblings, 2 replies; 33+ messages in thread
From: Adam Spiers @ 2013-04-11 12:05 UTC (permalink / raw)
  To: git list

Some callers, such as the git-annex web assistant, find it useful to
invoke git check-ignore as a persistent background process, which can
then have queries fed to its STDIN at any point, and the corresponding
response consumed from its STDOUT.  For this we need to invoke
check_ignore() once per line of standard input, and flush standard
output after each result.

The above use case suggests that empty STDIN is actually a reasonable
scenario (e.g. when the caller doesn't know in advance whether any
queries need to be fed to the background process until after it's
already started), so we make the minor behavioural change that "no
pathspec given." is no longer emitted in when STDIN is empty.

Even though check_ignore() could now be changed to operate on a single
pathspec, we keep it operating on an array of pathspecs since that is
a more convenient way of consuming the existing pathspec API.

Signed-off-by: Adam Spiers <git@adamspiers.org>
---
 builtin/check-ignore.c | 15 +++++----------
 t/t0008-ignores.sh     | 28 +++++++++++++++++++++++-----
 2 files changed, 28 insertions(+), 15 deletions(-)

diff --git a/builtin/check-ignore.c b/builtin/check-ignore.c
index e2d3006..c00a7d6 100644
--- a/builtin/check-ignore.c
+++ b/builtin/check-ignore.c
@@ -107,10 +107,9 @@ static int check_ignore(struct path_exclude_check *check,
 static int check_ignore_stdin_paths(struct path_exclude_check *check, const char *prefix)
 {
 	struct strbuf buf, nbuf;
-	char **pathspec = NULL;
-	size_t nr = 0, alloc = 0;
+	char *pathspec[2] = { NULL, NULL };
 	int line_termination = null_term_line ? 0 : '\n';
-	int num_ignored;
+	int num_ignored = 0;
 
 	strbuf_init(&buf, 0);
 	strbuf_init(&nbuf, 0);
@@ -121,14 +120,10 @@ static int check_ignore_stdin_paths(struct path_exclude_check *check, const char
 				die("line is badly quoted");
 			strbuf_swap(&buf, &nbuf);
 		}
-		ALLOC_GROW(pathspec, nr + 1, alloc);
-		pathspec[nr] = xcalloc(strlen(buf.buf) + 1, sizeof(*buf.buf));
-		strcpy(pathspec[nr++], buf.buf);
+		pathspec[0] = buf.buf;
+		num_ignored += check_ignore(check, prefix, (const char **)pathspec);
+		maybe_flush_or_die(stdout, "check-ignore to stdout");
 	}
-	ALLOC_GROW(pathspec, nr + 1, alloc);
-	pathspec[nr] = NULL;
-	num_ignored = check_ignore(check, prefix, (const char **)pathspec);
-	maybe_flush_or_die(stdout, "attribute to stdout");
 	strbuf_release(&buf);
 	strbuf_release(&nbuf);
 	return num_ignored;
diff --git a/t/t0008-ignores.sh b/t/t0008-ignores.sh
index 7af93ba..fbf12ae 100755
--- a/t/t0008-ignores.sh
+++ b/t/t0008-ignores.sh
@@ -216,11 +216,7 @@ test_expect_success_multi 'empty command line' '' '
 
 test_expect_success_multi '--stdin with empty STDIN' '' '
 	test_check_ignore "--stdin" 1 </dev/null &&
-	if test -n "$quiet_opt"; then
-		test_stderr ""
-	else
-		test_stderr "no pathspec given."
-	fi
+	test_stderr ""
 '
 
 test_expect_success '-q with multiple args' '
@@ -692,5 +688,27 @@ do
 	'
 done
 
+test_expect_success 'setup: have stdbuf?' '
+	if which stdbuf >/dev/null 2>&1
+	then
+		test_set_prereq STDBUF
+	fi
+'
+
+test_expect_success STDBUF 'streaming support for --stdin' '
+	(
+		echo one
+		sleep 2
+		echo two
+	) | stdbuf -oL git check-ignore -v -n --stdin >out &
+	pid=$! &&
+	sleep 1 &&
+	grep "^\.gitignore:1:one	one" out &&
+	test $( wc -l <out ) = 1 &&
+	sleep 2 &&
+	grep "^::	two" out &&
+	test $( wc -l <out ) = 2 &&
+	( wait $pid || kill $pid || : ) 2>/dev/null
+'
 
 test_done
-- 
1.8.2.1.342.gfa7285d

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v2 5/5] Documentation: add caveats about I/O buffering for check-{attr,ignore}
  2013-04-11 12:05       ` [PATCH v2 1/5] t0008: remove duplicated test fixture data Adam Spiers
                           ` (2 preceding siblings ...)
  2013-04-11 12:05         ` [PATCH v2 4/5] check-ignore: allow incremental streaming of queries via --stdin Adam Spiers
@ 2013-04-11 12:05         ` Adam Spiers
  2013-04-11 18:09           ` Junio C Hamano
  3 siblings, 1 reply; 33+ messages in thread
From: Adam Spiers @ 2013-04-11 12:05 UTC (permalink / raw)
  To: git list

check-attr and check-ignore have the potential to deadlock callers
which do not read back the output in real-time.  For example, if a
caller writes N paths out and then reads N lines back in, it risks
becoming blocked on write() to check-*, and check-* is blocked on
write back to the caller.  Somebody has to buffer; the pipe buffers
provide some leeway, but they are limited.

Thanks to Peff for pointing this out:

    http://article.gmane.org/gmane.comp.version-control.git/220534

Signed-off-by: Adam Spiers <git@adamspiers.org>
---
 Documentation/git-check-attr.txt   |  5 +++++
 Documentation/git-check-ignore.txt |  5 +++++
 Documentation/git.txt              | 16 +++++++++-------
 3 files changed, 19 insertions(+), 7 deletions(-)

diff --git a/Documentation/git-check-attr.txt b/Documentation/git-check-attr.txt
index 5abdbaa..a7be80d 100644
--- a/Documentation/git-check-attr.txt
+++ b/Documentation/git-check-attr.txt
@@ -56,6 +56,11 @@ being queried and <info> can be either:
 'set';;		when the attribute is defined as true.
 <value>;;	when a value has been assigned to the attribute.
 
+Buffering happens as documented under the `GIT_FLUSH` option in
+linkgit:git[1].  The caller is responsible for avoiding deadlocks
+caused by overfilling an input buffer or reading from an empty output
+buffer.
+
 EXAMPLES
 --------
 
diff --git a/Documentation/git-check-ignore.txt b/Documentation/git-check-ignore.txt
index 7e3cabc..8e1f7ab 100644
--- a/Documentation/git-check-ignore.txt
+++ b/Documentation/git-check-ignore.txt
@@ -81,6 +81,11 @@ not.  (Without this option, it would be impossible to tell whether the
 absence of output for a given file meant that it didn't match any
 pattern, or that the output hadn't been generated yet.)
 
+Buffering happens as documented under the `GIT_FLUSH` option in
+linkgit:git[1].  The caller is responsible for avoiding deadlocks
+caused by overfilling an input buffer or reading from an empty output
+buffer.
+
 EXIT STATUS
 -----------
 
diff --git a/Documentation/git.txt b/Documentation/git.txt
index 6a875f2..eecdb15 100644
--- a/Documentation/git.txt
+++ b/Documentation/git.txt
@@ -808,13 +808,15 @@ for further details.
 
 'GIT_FLUSH'::
 	If this environment variable is set to "1", then commands such
-	as 'git blame' (in incremental mode), 'git rev-list', 'git log',
-	and 'git whatchanged' will force a flush of the output stream
-	after each commit-oriented record have been flushed.   If this
-	variable is set to "0", the output of these commands will be done
-	using completely buffered I/O.   If this environment variable is
-	not set, Git will choose buffered or record-oriented flushing
-	based on whether stdout appears to be redirected to a file or not.
+	as 'git blame' (in incremental mode), 'git rev-list', 'git
+	log', 'git check-attr', 'git check-ignore', and 'git
+	whatchanged' will force a flush of the output stream after
+	each commit-oriented record have been flushed.  If this
+	variable is set to "0", the output of these commands will be
+	done using completely buffered I/O.  If this environment
+	variable is not set, Git will choose buffered or
+	record-oriented flushing based on whether stdout appears to be
+	redirected to a file or not.
 
 'GIT_TRACE'::
 	If this variable is set to "1", "2" or "true" (comparison
-- 
1.8.2.1.342.gfa7285d

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 5/5] Documentation: add caveats about I/O buffering for check-{attr,ignore}
  2013-04-11 12:05         ` [PATCH v2 5/5] Documentation: add caveats about I/O buffering for check-{attr,ignore} Adam Spiers
@ 2013-04-11 18:09           ` Junio C Hamano
  2013-04-11 20:12             ` [PATCH v3 " Adam Spiers
  0 siblings, 1 reply; 33+ messages in thread
From: Junio C Hamano @ 2013-04-11 18:09 UTC (permalink / raw)
  To: Adam Spiers; +Cc: git list

Adam Spiers <git@adamspiers.org> writes:

> diff --git a/Documentation/git-check-ignore.txt b/Documentation/git-check-ignore.txt
> index 7e3cabc..8e1f7ab 100644
> --- a/Documentation/git-check-ignore.txt
> +++ b/Documentation/git-check-ignore.txt
> @@ -81,6 +81,11 @@ not.  (Without this option, it would be impossible to tell whether the
>  absence of output for a given file meant that it didn't match any
>  pattern, or that the output hadn't been generated yet.)
>  
> +Buffering happens as documented under the `GIT_FLUSH` option in
> +linkgit:git[1].  The caller is responsible for avoiding deadlocks
> +caused by overfilling an input buffer or reading from an empty output
> +buffer.
> +
>  EXIT STATUS
>  -----------
>  
> diff --git a/Documentation/git.txt b/Documentation/git.txt
> index 6a875f2..eecdb15 100644
> --- a/Documentation/git.txt
> +++ b/Documentation/git.txt
> @@ -808,13 +808,15 @@ for further details.
>  
>  'GIT_FLUSH'::
>  	If this environment variable is set to "1", then commands such
> -	as 'git blame' (in incremental mode), 'git rev-list', 'git log',
> -	and 'git whatchanged' will force a flush of the output stream
> -	after each commit-oriented record have been flushed.   If this
> -	variable is set to "0", the output of these commands will be done
> -	using completely buffered I/O.   If this environment variable is
> -	not set, Git will choose buffered or record-oriented flushing
> -	based on whether stdout appears to be redirected to a file or not.
> +	as 'git blame' (in incremental mode), 'git rev-list', 'git
> +	log', 'git check-attr', 'git check-ignore', and 'git
> +	whatchanged' will force a flush of the output stream after
> +	each commit-oriented record have been flushed.  If this
> +	variable is set to "0", the output of these commands will be
> +	done using completely buffered I/O.  If this environment
> +	variable is not set, Git will choose buffered or
> +	record-oriented flushing based on whether stdout appears to be
> +	redirected to a file or not.

Reflowing of the text is very much unappreciated X-<.  

It took me five minutes to spot that you only added check-attr and
check-ignore and forgot to adjust that "commit-oriented record" to
an updated reality, where you now have commands that produce
non-commit-oriented record to the output.

It would have been far simpler to review if it were like this, don't
you think?

>  	If this environment variable is set to "1", then commands such
> 	as 'git blame' (in incremental mode), 'git rev-list', 'git log',
> -	and 'git whatchanged' will force a flush of the output stream
> -	after each commit-oriented record have been flushed.   If this
> +	'git check-attr', 'git check-ignore', and 'git whatchanged' will
> +	force a flush of the output stream
> +     after each record have been flushed.   If this
> 	variable is set to "0", the output of these commands will be done
> 	using completely buffered I/O.   If this environment variable is
>  	not set, Git will choose buffered or record-oriented flushing
>  	based on whether stdout appears to be redirected to a file or not.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 2/5] check-ignore: allow incremental streaming of queries via --stdin
  2013-04-11 11:20     ` Adam Spiers
@ 2013-04-11 18:33       ` Jeff King
  0 siblings, 0 replies; 33+ messages in thread
From: Jeff King @ 2013-04-11 18:33 UTC (permalink / raw)
  To: Adam Spiers; +Cc: git list

On Thu, Apr 11, 2013 at 12:20:00PM +0100, Adam Spiers wrote:

> On Thu, Apr 11, 2013 at 02:59:32AM +0100, Adam Spiers wrote:
> > +test_expect_success STDBUF 'streaming support for --stdin' '
> > +	(
> > +		echo one
> > +		sleep 2
> > +		echo two
> > +	) | stdbuf -oL git check-ignore -v -n --stdin >out &
> 
> I just noticed that this patch precedes the one in the same series
> which adds -n support.  I'll reorder them accordingly to avoid
> breaking git bisect.

Thanks for noticing. FWIW, I often do this:

  GIT_EDITOR='sed -i "/^pick /aexec make test" \
  git rebase -i origin/master

on my topics to make sure each commit compiles and passes the tests in
isolation (it will stop on a failure, at which point you can fix up and
"git rebase --continue").

It's somewhat annoying, because it runs the whole test suite, even for
commits that are obviously not going to have an impact (e.g., doc
updates, or change to a single test). But it has saved me from
embarrassment many times, when I thought for sure that my commits were
obviously correct. :)

-Peff

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 1/5] check-ignore: move setup into cmd_check_ignore()
  2013-04-11 11:05     ` Adam Spiers
  2013-04-11 12:05       ` [PATCH v2 1/5] t0008: remove duplicated test fixture data Adam Spiers
@ 2013-04-11 18:35       ` Jeff King
  1 sibling, 0 replies; 33+ messages in thread
From: Jeff King @ 2013-04-11 18:35 UTC (permalink / raw)
  To: Adam Spiers; +Cc: git list

On Thu, Apr 11, 2013 at 12:05:11PM +0100, Adam Spiers wrote:

> On Thu, Apr 11, 2013 at 01:25:53AM -0400, Jeff King wrote:
> > On Thu, Apr 11, 2013 at 02:59:31AM +0100, Adam Spiers wrote:
> > > -static int check_ignore(const char *prefix, const char **pathspec)
> > > +static int check_ignore(struct path_exclude_check check,
> > > +			const char *prefix, const char **pathspec)
> > 
> > Did you mean to pass the struct by value here? If it is truly a per-path
> > [...]
>
> It's not a per-path value; it's supposed to be reused across checks
> for multiple paths, as explained in the comments above
> last_exclude_matching_path():

Makes sense (I didn't look into it very far, and was just guessing based
on the pass-by-value). Passing a pointer is definitely the right fix,
then.

Thanks.

-Peff

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 4/5] check-ignore: allow incremental streaming of queries via --stdin
  2013-04-11 12:05         ` [PATCH v2 4/5] check-ignore: allow incremental streaming of queries via --stdin Adam Spiers
@ 2013-04-11 19:11           ` Jeff King
  2013-04-11 20:31             ` Adam Spiers
  2013-04-11 21:04           ` [PATCH v2 4/5] check-ignore: allow incremental streaming of queries via --stdin Aaron Schrab
  1 sibling, 1 reply; 33+ messages in thread
From: Jeff King @ 2013-04-11 19:11 UTC (permalink / raw)
  To: Adam Spiers; +Cc: git list

On Thu, Apr 11, 2013 at 01:05:12PM +0100, Adam Spiers wrote:

> +test_expect_success 'setup: have stdbuf?' '
> +	if which stdbuf >/dev/null 2>&1
> +	then
> +		test_set_prereq STDBUF
> +	fi
> +'
> +
> +test_expect_success STDBUF 'streaming support for --stdin' '
> +	(
> +		echo one
> +		sleep 2
> +		echo two
> +	) | stdbuf -oL git check-ignore -v -n --stdin >out &
> +	pid=$! &&
> +	sleep 1 &&
> +	grep "^\.gitignore:1:one	one" out &&
> +	test $( wc -l <out ) = 1 &&
> +	sleep 2 &&
> +	grep "^::	two" out &&
> +	test $( wc -l <out ) = 2 &&
> +	( wait $pid || kill $pid || : ) 2>/dev/null
> +'

I always get a little nervous with sleeps in the test suite, as they are
indicative that we are trying to avoid some race condition, which means
that the test can fail when the system is under load, or when a tool
like valgrind is used which drastically alters the timing (e.g., if
check-ignore takes longer than 1 second to produce its answer, we may
fail here).

Is there a simpler way to test this?

Like:

  # Set up a long-running "check-ignore" connected by pipes.
  mkfifo in out &&
  (git check-ignore ... <in >out &) &&

  # We cannot just "echo >in" because check-ignore
  # would get EOF after echo exited; instead we open
  # the descriptor in our shell, and then echo to the
  # fd. We make sure to close it at the end, so that
  # the subprocess does get EOF and dies properly.
  exec 9>in &&
  test_when_finished "exec 9>&-" &&

  # Now we can do interactive tests
  echo >&9 one &&
  read response <out &&
  test "$response" = ... &&
  echo >&9 two &&
  read response <out &&
  test "$response" = ...

Hmm. Maybe simpler wasn't the right word. :) But it avoids any sleeps or
race conditions.

-Peff

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH v3 5/5] Documentation: add caveats about I/O buffering for check-{attr,ignore}
  2013-04-11 18:09           ` Junio C Hamano
@ 2013-04-11 20:12             ` Adam Spiers
  2013-04-12  2:12               ` Junio C Hamano
  0 siblings, 1 reply; 33+ messages in thread
From: Adam Spiers @ 2013-04-11 20:12 UTC (permalink / raw)
  To: git list

On Thu, Apr 11, 2013 at 11:09:28AM -0700, Junio C Hamano wrote:
> Reflowing of the text is very much unappreciated X-<.  

I very much appreciate the excellent job you do as maintainer; your
attention to detail results in an incredibly high quality project.
However I do occasionally find your communication style unnecessarily
abrasive.  Maybe that's just me.

> It took me five minutes to spot that you only added check-attr and
> check-ignore and forgot to adjust that "commit-oriented record" to
> an updated reality, where you now have commands that produce
> non-commit-oriented record to the output.

I am sorry for wasting five minutes of your time.  A non-reflowed
version is included inline below, and also at:

    https://github.com/aspiers/git/compare/master...git-annex-streaming

> It would have been far simpler to review if it were like this, don't
> you think?

It would have been slightly simpler, yes.  It did occur to me not to
re-flow it, but then one of the lines ended up noticeably shorter, and
as it was a short paragraph, I estimated that you would prefer it
re-flowed.  Clearly I was wrong - not the first time, and it won't be
the last either, since I'm just a flawed human being trying to do my
best in the time available.  The question then arises: how uneven does
a paragraph's right margin have to be in order to justify re-flowing?
I could not find any guidelines in SubmittingPatches or
CodingGuidelines regarding re-flowing of documentation.  With
hindsight, I can now see that it would have been better to skip it on
this occasion, or at least keep the re-flow as a separate commit.

So I apologise again for the mistake, but don't you think it would
have been far more pleasant if instead you'd worded your email
something like this?

    Thanks for the patches.  I notice that you unnecessarily re-flowed
    the latter half of the GIT_FLUSH paragraph; unfortunately this
    meant I had to spend a few extra minutes on the review and almost
    missed that "commit-oriented" is no longer applicable.  In future,
    please avoid re-flowing text where possible.

Fortunately I'm not the sensitive sort, but I imagine that there are
others in this community who might be discouraged from contributing
for fear of being on the receiving end of sentences which end with
phrases such as "[...] is very much unappreciated X-<."  Please don't
underestimate the human factor; common courtesy can make a big
difference.

Thanks,
Adam

-- >8 --
Subject: [PATCH v3 5/5] Documentation: add caveats about I/O buffering for
 check-{attr,ignore}

check-attr and check-ignore have the potential to deadlock callers
which do not read back the output in real-time.  For example, if a
caller writes N paths out and then reads N lines back in, it risks
becoming blocked on write() to check-*, and check-* is blocked on
write back to the caller.  Somebody has to buffer; the pipe buffers
provide some leeway, but they are limited.

Thanks to Peff for pointing this out:

    http://article.gmane.org/gmane.comp.version-control.git/220534

Signed-off-by: Adam Spiers <git@adamspiers.org>
---
 Documentation/git-check-attr.txt   | 5 +++++
 Documentation/git-check-ignore.txt | 5 +++++
 Documentation/git.txt              | 7 ++++---
 3 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/Documentation/git-check-attr.txt b/Documentation/git-check-attr.txt
index 5abdbaa..a7be80d 100644
--- a/Documentation/git-check-attr.txt
+++ b/Documentation/git-check-attr.txt
@@ -56,6 +56,11 @@ being queried and <info> can be either:
 'set';;		when the attribute is defined as true.
 <value>;;	when a value has been assigned to the attribute.
 
+Buffering happens as documented under the `GIT_FLUSH` option in
+linkgit:git[1].  The caller is responsible for avoiding deadlocks
+caused by overfilling an input buffer or reading from an empty output
+buffer.
+
 EXAMPLES
 --------
 
diff --git a/Documentation/git-check-ignore.txt b/Documentation/git-check-ignore.txt
index 7e3cabc..8e1f7ab 100644
--- a/Documentation/git-check-ignore.txt
+++ b/Documentation/git-check-ignore.txt
@@ -81,6 +81,11 @@ not.  (Without this option, it would be impossible to tell whether the
 absence of output for a given file meant that it didn't match any
 pattern, or that the output hadn't been generated yet.)
 
+Buffering happens as documented under the `GIT_FLUSH` option in
+linkgit:git[1].  The caller is responsible for avoiding deadlocks
+caused by overfilling an input buffer or reading from an empty output
+buffer.
+
 EXIT STATUS
 -----------
 
diff --git a/Documentation/git.txt b/Documentation/git.txt
index 6a875f2..3258f2c 100644
--- a/Documentation/git.txt
+++ b/Documentation/git.txt
@@ -808,9 +808,10 @@ for further details.
 
 'GIT_FLUSH'::
 	If this environment variable is set to "1", then commands such
-	as 'git blame' (in incremental mode), 'git rev-list', 'git log',
-	and 'git whatchanged' will force a flush of the output stream
-	after each commit-oriented record have been flushed.   If this
+	as 'git blame' (in incremental mode), 'git rev-list', 'git
+	log', 'git check-attr', 'git check-ignore', and 'git
+	whatchanged' will force a flush of the output stream
+	after each record has been flushed.  If this
 	variable is set to "0", the output of these commands will be done
 	using completely buffered I/O.   If this environment variable is
 	not set, Git will choose buffered or record-oriented flushing
-- 
1.8.2.1.347.gbef22ca

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 4/5] check-ignore: allow incremental streaming of queries via --stdin
  2013-04-11 19:11           ` Jeff King
@ 2013-04-11 20:31             ` Adam Spiers
  2013-04-11 20:40               ` Jeff King
  2013-04-22 18:03               ` Junio C Hamano
  0 siblings, 2 replies; 33+ messages in thread
From: Adam Spiers @ 2013-04-11 20:31 UTC (permalink / raw)
  To: git list

On Thu, Apr 11, 2013 at 03:11:32PM -0400, Jeff King wrote:
> I always get a little nervous with sleeps in the test suite, as they are
> indicative that we are trying to avoid some race condition, which means
> that the test can fail when the system is under load, or when a tool
> like valgrind is used which drastically alters the timing (e.g., if
> check-ignore takes longer than 1 second to produce its answer, we may
> fail here).

Agreed, especially here where my btrfs filesystems see fit to kindly
freeze my system for a few seconds many times each day :-/

> Is there a simpler way to test this?
> 
> Like:
> 
>   # Set up a long-running "check-ignore" connected by pipes.
>   mkfifo in out &&
>   (git check-ignore ... <in >out &) &&
> 
>   # We cannot just "echo >in" because check-ignore
>   # would get EOF after echo exited; instead we open
>   # the descriptor in our shell, and then echo to the
>   # fd. We make sure to close it at the end, so that
>   # the subprocess does get EOF and dies properly.
>   exec 9>in &&
>   test_when_finished "exec 9>&-" &&
> 
>   # Now we can do interactive tests
>   echo >&9 one &&
>   read response <out &&
>   test "$response" = ... &&
>   echo >&9 two &&
>   read response <out &&
>   test "$response" = ...
> 
> Hmm. Maybe simpler wasn't the right word. :) But it avoids any sleeps or
> race conditions.

The shell source is strong with this one ;-)

Congrats - I first tried with FIFOs (hence my other patch which moves
the PIPE test prerequisite definition into the core framework - the
original intention was to reuse it here) but failed to get it working.
I'll re-roll using your approach.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 4/5] check-ignore: allow incremental streaming of queries via --stdin
  2013-04-11 20:31             ` Adam Spiers
@ 2013-04-11 20:40               ` Jeff King
  2013-04-22 18:03               ` Junio C Hamano
  1 sibling, 0 replies; 33+ messages in thread
From: Jeff King @ 2013-04-11 20:40 UTC (permalink / raw)
  To: Adam Spiers; +Cc: git list

On Thu, Apr 11, 2013 at 09:31:41PM +0100, Adam Spiers wrote:

> The shell source is strong with this one ;-)
> 
> Congrats - I first tried with FIFOs (hence my other patch which moves
> the PIPE test prerequisite definition into the core framework - the
> original intention was to reuse it here) but failed to get it working.
> I'll re-roll using your approach.

Thanks. If it make you feel any better, it took about 20 minutes of
experimenting and at least three head-scratching "wait, that _should_
have worked" moments to get it right. :)

The rest of the series looked good to me, though I admit I did not think
too hard about the "--non-matching" patch, as it looked like you and
Junio had already given some thought to the output format, which is the
tricky part.

-Peff

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 4/5] check-ignore: allow incremental streaming of queries via --stdin
  2013-04-11 12:05         ` [PATCH v2 4/5] check-ignore: allow incremental streaming of queries via --stdin Adam Spiers
  2013-04-11 19:11           ` Jeff King
@ 2013-04-11 21:04           ` Aaron Schrab
  2013-04-11 22:55             ` Adam Spiers
  1 sibling, 1 reply; 33+ messages in thread
From: Aaron Schrab @ 2013-04-11 21:04 UTC (permalink / raw)
  To: Adam Spiers; +Cc: git list

At 13:05 +0100 11 Apr 2013, Adam Spiers <git@adamspiers.org> wrote:
>The above use case suggests that empty STDIN is actually a reasonable
>scenario (e.g. when the caller doesn't know in advance whether any
>queries need to be fed to the background process until after it's
>already started), so we make the minor behavioural change that "no
>pathspec given." is no longer emitted in when STDIN is empty.

The last "in" there looks to be misplaced.  Was that originally 
something like "in the case"?  If so the removed words should be 
restored or the lingering one removed as well.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 4/5] check-ignore: allow incremental streaming of queries via --stdin
  2013-04-11 21:04           ` [PATCH v2 4/5] check-ignore: allow incremental streaming of queries via --stdin Aaron Schrab
@ 2013-04-11 22:55             ` Adam Spiers
  0 siblings, 0 replies; 33+ messages in thread
From: Adam Spiers @ 2013-04-11 22:55 UTC (permalink / raw)
  To: git list

On Thu, Apr 11, 2013 at 05:04:30PM -0400, Aaron Schrab wrote:
> At 13:05 +0100 11 Apr 2013, Adam Spiers <git@adamspiers.org> wrote:
> >The above use case suggests that empty STDIN is actually a reasonable
> >scenario (e.g. when the caller doesn't know in advance whether any
> >queries need to be fed to the background process until after it's
> >already started), so we make the minor behavioural change that "no
> >pathspec given." is no longer emitted in when STDIN is empty.
> 
> The last "in" there looks to be misplaced.  Was that originally
> something like "in the case"?  If so the removed words should be
> restored or the lingering one removed as well.

I'll remove it; thanks.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v3 5/5] Documentation: add caveats about I/O buffering for check-{attr,ignore}
  2013-04-11 20:12             ` [PATCH v3 " Adam Spiers
@ 2013-04-12  2:12               ` Junio C Hamano
  2013-04-12 11:00                 ` Adam Spiers
  0 siblings, 1 reply; 33+ messages in thread
From: Junio C Hamano @ 2013-04-12  2:12 UTC (permalink / raw)
  To: Adam Spiers; +Cc: git list

Adam Spiers <git@adamspiers.org> writes:

> On Thu, Apr 11, 2013 at 11:09:28AM -0700, Junio C Hamano wrote:
>> Reflowing of the text is very much unappreciated X-<.  
>
> I very much appreciate the excellent job you do as maintainer; your
> attention to detail results in an incredibly high quality project.
> However I do occasionally find your communication style unnecessarily
> abrasive.  Maybe that's just me.

Sorry for being me X-<.  Yeah, I agree that the above came out to be
more blunt than needed.

It is usually OK to re-flow the text in the paragraph you are
touching. After all, for the purpose of reviewing, people can just
blindly apply and then ask "diff --color-words".  In this case,
however, there was some changes that conflict in the vicinity, and
reflowing made the resolution unnecessarily more cumbersome.

I have briefly looked at this series, but it severely conflicts with
a few topics in flight that touch the infrastructure you are using,
so I haven't merged it to 'pu'. Perhaps after things calm down, we
may want to ask you to reroll on top of updated codebase.

Thanks.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v3 5/5] Documentation: add caveats about I/O buffering for check-{attr,ignore}
  2013-04-12  2:12               ` Junio C Hamano
@ 2013-04-12 11:00                 ` Adam Spiers
  0 siblings, 0 replies; 33+ messages in thread
From: Adam Spiers @ 2013-04-12 11:00 UTC (permalink / raw)
  To: git list

On Thu, Apr 11, 2013 at 07:12:22PM -0700, Junio C Hamano wrote:
> It is usually OK to re-flow the text in the paragraph you are
> touching. After all, for the purpose of reviewing, people can just
> blindly apply and then ask "diff --color-words".  In this case,
> however, there was some changes that conflict in the vicinity, and
> reflowing made the resolution unnecessarily more cumbersome.

I see.  Thanks for the tip; I was only dimly aware of --color-words.

> I have briefly looked at this series, but it severely conflicts with
> a few topics in flight that touch the infrastructure you are using,
> so I haven't merged it to 'pu'. Perhaps after things calm down, we
> may want to ask you to reroll on top of updated codebase.

Sure, no problem.  I'll try a quick rebase now to see how ugly it is.

By the way, I've replaced my test for streaming --stdin which was
based on stdbuf(1) and sleep(1) with Peff's clever hack based on
mkfifo.  I'll hold off from sending a reroll until pathspec activity
cools down, but in the meantime it's available here:

    https://github.com/aspiers/git/compare/master...git-annex-streaming

It requires my "t: make PIPE a standard test prerequisite" patch, but
I notice that's already in master which will make things easier later
on.

Thanks!
Adam

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 4/5] check-ignore: allow incremental streaming of queries via --stdin
  2013-04-11 20:31             ` Adam Spiers
  2013-04-11 20:40               ` Jeff King
@ 2013-04-22 18:03               ` Junio C Hamano
  2013-04-24  8:02                 ` Adam Spiers
  1 sibling, 1 reply; 33+ messages in thread
From: Junio C Hamano @ 2013-04-22 18:03 UTC (permalink / raw)
  To: Adam Spiers; +Cc: git list

Adam Spiers <git@adamspiers.org> writes:

> On Thu, Apr 11, 2013 at 03:11:32PM -0400, Jeff King wrote:
>> I always get a little nervous with sleeps in the test suite, as they are
>> indicative that we are trying to avoid some race condition, which means
>> that the test can fail when the system is under load, or when a tool
>> like valgrind is used which drastically alters the timing (e.g., if
>> check-ignore takes longer than 1 second to produce its answer, we may
>> fail here).
>
> Agreed, especially here where my btrfs filesystems see fit to kindly
> freeze my system for a few seconds many times each day :-/
> 
>> Is there a simpler way to test this?
>> 
>> Like:
>> ...
> I'll re-roll using your approach.

I think I missed this one and it already is in 'next'.

I'll hold it back so please make your re-roll into an incremental
update.

Thanks.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 4/5] check-ignore: allow incremental streaming of queries via --stdin
  2013-04-22 18:03               ` Junio C Hamano
@ 2013-04-24  8:02                 ` Adam Spiers
  2013-04-29 22:55                   ` [PATCH] t0008: use named pipe (FIFO) to test check-ignore streaming Adam Spiers
  0 siblings, 1 reply; 33+ messages in thread
From: Adam Spiers @ 2013-04-24  8:02 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git list

On Mon, Apr 22, 2013 at 11:03:44AM -0700, Junio C Hamano wrote:
> Adam Spiers <git@adamspiers.org> writes:
> 
> > On Thu, Apr 11, 2013 at 03:11:32PM -0400, Jeff King wrote:
> >> I always get a little nervous with sleeps in the test suite, as they are
> >> indicative that we are trying to avoid some race condition, which means
> >> that the test can fail when the system is under load, or when a tool
> >> like valgrind is used which drastically alters the timing (e.g., if
> >> check-ignore takes longer than 1 second to produce its answer, we may
> >> fail here).
> >
> > Agreed, especially here where my btrfs filesystems see fit to kindly
> > freeze my system for a few seconds many times each day :-/
> > 
> >> Is there a simpler way to test this?
> >> 
> >> Like:
> >> ...
> > I'll re-roll using your approach.
> 
> I think I missed this one and it already is in 'next'.
> 
> I'll hold it back so please make your re-roll into an incremental
> update.

Will do - will probably take a few days more though, since I'm
currently catching up on a post-travel work backlog.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH] t0008: use named pipe (FIFO) to test check-ignore streaming
  2013-04-24  8:02                 ` Adam Spiers
@ 2013-04-29 22:55                   ` Adam Spiers
  0 siblings, 0 replies; 33+ messages in thread
From: Adam Spiers @ 2013-04-29 22:55 UTC (permalink / raw)
  To: git list

sleeps in the check-ignore test suite are not ideal since they can
fail when the system is under load, or when a tool like valgrind is
used which drastically alters the timing.  Therefore we replace them
with a more robust solution using a named pipe (FIFO).

Thanks to Jeff King for coming up with the redirection wizardry
required to make this work.

http://article.gmane.org/gmane.comp.version-control.git/220916

Signed-off-by: Adam Spiers <git@adamspiers.org>
---
 t/t0008-ignores.sh | 38 +++++++++++++++++---------------------
 1 file changed, 17 insertions(+), 21 deletions(-)

diff --git a/t/t0008-ignores.sh b/t/t0008-ignores.sh
index fbf12ae..a56db80 100755
--- a/t/t0008-ignores.sh
+++ b/t/t0008-ignores.sh
@@ -688,27 +688,23 @@ do
 	'
 done
 
-test_expect_success 'setup: have stdbuf?' '
-	if which stdbuf >/dev/null 2>&1
-	then
-		test_set_prereq STDBUF
-	fi
-'
-
-test_expect_success STDBUF 'streaming support for --stdin' '
-	(
-		echo one
-		sleep 2
-		echo two
-	) | stdbuf -oL git check-ignore -v -n --stdin >out &
-	pid=$! &&
-	sleep 1 &&
-	grep "^\.gitignore:1:one	one" out &&
-	test $( wc -l <out ) = 1 &&
-	sleep 2 &&
-	grep "^::	two" out &&
-	test $( wc -l <out ) = 2 &&
-	( wait $pid || kill $pid || : ) 2>/dev/null
+test_expect_success PIPE 'streaming support for --stdin' '
+	mkfifo in out &&
+	(git check-ignore -n -v --stdin <in >out &) &&
+
+	# We cannot just "echo >in" because check-ignore would get EOF
+	# after echo exited; instead we open the descriptor in our
+	# shell, and then echo to the fd. We make sure to close it at
+	# the end, so that the subprocess does get EOF and dies
+	# properly.
+	exec 9>in &&
+	test_when_finished "exec 9>&-" &&
+	echo >&9 one &&
+	read response <out &&
+	echo "$response" | grep "^\.gitignore:1:one	one" &&
+	echo >&9 two &&
+	read response <out &&
+	echo "$response" | grep "^::	two"
 '
 
 test_done
-- 
1.8.3.rc0.305.g6580fe1

^ permalink raw reply related	[flat|nested] 33+ messages in thread

end of thread, other threads:[~2013-04-29 22:55 UTC | newest]

Thread overview: 33+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-04-08 18:13 RFC: two minor tweaks to check-ignore to help git-annex assistant Adam Spiers
2013-04-08 21:56 ` Junio C Hamano
2013-04-08 22:20 ` Jeff King
2013-04-11  1:59 ` [PATCH 1/5] check-ignore: move setup into cmd_check_ignore() Adam Spiers
2013-04-11  1:59   ` [PATCH 2/5] check-ignore: allow incremental streaming of queries via --stdin Adam Spiers
2013-04-11  5:31     ` Jeff King
2013-04-11 10:55       ` Adam Spiers
2013-04-11 11:20     ` Adam Spiers
2013-04-11 18:33       ` Jeff King
2013-04-11  1:59   ` [PATCH 3/5] Documentation: add caveats about I/O buffering for check-{attr,ignore} Adam Spiers
2013-04-11  5:31     ` Jeff King
2013-04-11  1:59   ` [PATCH 4/5] t0008: remove duplicated test fixture data Adam Spiers
2013-04-11  1:59   ` [PATCH 5/5] check-ignore: add -n / --non-matching option Adam Spiers
2013-04-11  5:25   ` [PATCH 1/5] check-ignore: move setup into cmd_check_ignore() Jeff King
2013-04-11 11:05     ` Adam Spiers
2013-04-11 12:05       ` [PATCH v2 1/5] t0008: remove duplicated test fixture data Adam Spiers
2013-04-11 12:05         ` [PATCH v2 2/5] check-ignore: add -n / --non-matching option Adam Spiers
2013-04-11 12:05         ` [PATCH v2 3/5] check-ignore: move setup into cmd_check_ignore() Adam Spiers
2013-04-11 12:05         ` [PATCH v2 4/5] check-ignore: allow incremental streaming of queries via --stdin Adam Spiers
2013-04-11 19:11           ` Jeff King
2013-04-11 20:31             ` Adam Spiers
2013-04-11 20:40               ` Jeff King
2013-04-22 18:03               ` Junio C Hamano
2013-04-24  8:02                 ` Adam Spiers
2013-04-29 22:55                   ` [PATCH] t0008: use named pipe (FIFO) to test check-ignore streaming Adam Spiers
2013-04-11 21:04           ` [PATCH v2 4/5] check-ignore: allow incremental streaming of queries via --stdin Aaron Schrab
2013-04-11 22:55             ` Adam Spiers
2013-04-11 12:05         ` [PATCH v2 5/5] Documentation: add caveats about I/O buffering for check-{attr,ignore} Adam Spiers
2013-04-11 18:09           ` Junio C Hamano
2013-04-11 20:12             ` [PATCH v3 " Adam Spiers
2013-04-12  2:12               ` Junio C Hamano
2013-04-12 11:00                 ` Adam Spiers
2013-04-11 18:35       ` [PATCH 1/5] check-ignore: move setup into cmd_check_ignore() Jeff King

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.