All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] diff --no-index: support reading from named pipes
@ 2023-06-27 14:10 Phillip Wood
  2023-06-27 14:10 ` [PATCH 1/3] diff --no-index: die on error reading stdin Phillip Wood
                   ` (3 more replies)
  0 siblings, 4 replies; 17+ messages in thread
From: Phillip Wood @ 2023-06-27 14:10 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Taylor Blau, brian m. carlson, Thomas Guyot-Sionnest,
	Junio C Hamano

From: Phillip Wood <phillip.wood@dunelm.org.uk>

In some shells, such as bash and zsh, it's possible to use a command
substitution to provide the output of a command as a file argument to
another process, like so:

  diff -u <(printf "a\nb\n") <(printf "a\nc\n")

However, this syntax does not produce useful results with git diff
--no-index. On macOS, the arguments to the command are named pipes
under /dev/fd, and git diff doesn't know how to handle a named
pipe. On Linux, the arguments are symlinks to pipes, so git diff
"helpfully" diffs these symlinks, comparing their targets like
"pipe:[1234]" and "pipe:[5678]".

There have been at least three previous attempts [1-3] to address this
issue. They all seem to have received broad support for the aim of
supporting process substitutions but have foundered on details of the
implementation. In an effort to avoid the same fate this series is
narrowly focussed on making command substitutions work with "diff
--no-index" and does not attempt to add a general facility for
de-referencing symbolic links or reading from pipes to the diff
machinery.

The only functional change is that if a path given on the commandline
is a named pipe or a symbolic link that resolves to a named pipe then
we read the data to diff from that pipe. The first two patches improve
the error handling when reading from stdin and add a test. The third
patch implements support for reading from pipes.

This cover letter and the commit message for the third patch are
largely copied from brian’s patch[2] - do we have a standard commit
trailer for "I stole the commit message from someone else's patch"?

I've cc’d the participants of the discussion of the last attempt[1] to
fix this.

[1] https://lore.kernel.org/git/20200918113256.8699-3-tguyot@gmail.com/
[2] https://lore.kernel.org/git/20181220002610.43832-1-sandals@crustytoothpaste.net/
[3] https://public-inbox.org/git/20170113102021.6054-1-dennis@kaarsemaker.net/


Base-Commit: 94486b6763c29144c60932829a65fec0597e17b3
Published-As: https://github.com/phillipwood/git/releases/tag/diff-no-index-pipes%2Fv1
View-Changes-At: https://github.com/phillipwood/git/compare/94486b676...990e71882
Fetch-It-Via: git fetch https://github.com/phillipwood/git diff-no-index-pipes/v1

Phillip Wood (3):
  diff --no-index: die on error reading stdin
  t4054: test diff --no-index with stdin
  diff --no-index: support reading from named pipes

 diff-no-index.c          | 83 ++++++++++++++++++++++++----------------
 t/t4053-diff-no-index.sh | 43 +++++++++++++++++++++
 2 files changed, 92 insertions(+), 34 deletions(-)

-- 
2.40.1.852.g22d29fd9ba


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 1/3] diff --no-index: die on error reading stdin
  2023-06-27 14:10 [PATCH 0/3] diff --no-index: support reading from named pipes Phillip Wood
@ 2023-06-27 14:10 ` Phillip Wood
  2023-06-27 14:10 ` [PATCH 2/3] t4054: test diff --no-index with stdin Phillip Wood
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 17+ messages in thread
From: Phillip Wood @ 2023-06-27 14:10 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Taylor Blau, brian m. carlson, Thomas Guyot-Sionnest,
	Junio C Hamano, Phillip Wood

From: Phillip Wood <phillip.wood@dunelm.org.uk>

If there is an error when reading from stdin then "diff --no-index"
prints an error message but continues to try and diff a file named "-"
resulting in an error message that looks like

    error: error while reading from stdin: Invalid argument
    fatal: stat '-': No such file or directory

assuming that no file named "-" exists. If such a file exists it prints
the first error message and generates the diff from that file which is
not what the user wanted. Instead just die() straight away if we cannot
read from stdin.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
---
 diff-no-index.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/diff-no-index.c b/diff-no-index.c
index 4296940f90..7b9327f8f3 100644
--- a/diff-no-index.c
+++ b/diff-no-index.c
@@ -60,20 +60,19 @@ static int get_mode(const char *path, int *mode)
 	return 0;
 }
 
-static int populate_from_stdin(struct diff_filespec *s)
+static void populate_from_stdin(struct diff_filespec *s)
 {
 	struct strbuf buf = STRBUF_INIT;
 	size_t size = 0;
 
 	if (strbuf_read(&buf, 0, 0) < 0)
-		return error_errno("error while reading from stdin");
+		die_errno("error while reading from stdin");
 
 	s->should_munmap = 0;
 	s->data = strbuf_detach(&buf, &size);
 	s->size = size;
 	s->should_free = 1;
 	s->is_stdin = 1;
-	return 0;
 }
 
 static struct diff_filespec *noindex_filespec(const char *name, int mode)
-- 
2.40.1.852.g22d29fd9ba


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 2/3] t4054: test diff --no-index with stdin
  2023-06-27 14:10 [PATCH 0/3] diff --no-index: support reading from named pipes Phillip Wood
  2023-06-27 14:10 ` [PATCH 1/3] diff --no-index: die on error reading stdin Phillip Wood
@ 2023-06-27 14:10 ` Phillip Wood
  2023-06-27 14:10 ` [PATCH 3/3] diff --no-index: support reading from named pipes Phillip Wood
  2023-07-05 19:49 ` [PATCH v2 0/4] " Phillip Wood
  3 siblings, 0 replies; 17+ messages in thread
From: Phillip Wood @ 2023-06-27 14:10 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Taylor Blau, brian m. carlson, Thomas Guyot-Sionnest,
	Junio C Hamano, Phillip Wood

From: Phillip Wood <phillip.wood@dunelm.org.uk>

"git diff --no-index" supports reading from stdin with the path "-".
There is no test coverage for this so add a regression test before
changing the code in the next commit.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
---
 t/t4053-diff-no-index.sh | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/t/t4053-diff-no-index.sh b/t/t4053-diff-no-index.sh
index 4e9fa0403d..d14b194ea2 100755
--- a/t/t4053-diff-no-index.sh
+++ b/t/t4053-diff-no-index.sh
@@ -205,4 +205,22 @@ test_expect_success POSIXPERM,SYMLINKS 'diff --no-index normalizes: mode not lik
 	test_cmp expected actual
 '
 
+test_expect_success "diff --no-index treats '-' as stdin" '
+	cat >expect <<-EOF &&
+	diff --git a/- b/a/1
+	index $ZERO_OID..$(git hash-object --stdin <a/1) 100644
+	--- a/-
+	+++ b/a/1
+	@@ -1 +1 @@
+	-x
+	+1
+	EOF
+
+	test_write_lines x | test_expect_code 1 \
+		git -c core.abbrev=no diff --no-index -- - a/1 >actual &&
+	test_cmp expect actual &&
+
+	test_write_lines 1 | git diff --no-index -- a/1 - >actual &&
+	test_must_be_empty actual
+'
 test_done
-- 
2.40.1.852.g22d29fd9ba


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 3/3] diff --no-index: support reading from named pipes
  2023-06-27 14:10 [PATCH 0/3] diff --no-index: support reading from named pipes Phillip Wood
  2023-06-27 14:10 ` [PATCH 1/3] diff --no-index: die on error reading stdin Phillip Wood
  2023-06-27 14:10 ` [PATCH 2/3] t4054: test diff --no-index with stdin Phillip Wood
@ 2023-06-27 14:10 ` Phillip Wood
  2023-06-27 19:44   ` Junio C Hamano
  2023-07-05 19:49 ` [PATCH v2 0/4] " Phillip Wood
  3 siblings, 1 reply; 17+ messages in thread
From: Phillip Wood @ 2023-06-27 14:10 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Taylor Blau, brian m. carlson, Thomas Guyot-Sionnest,
	Junio C Hamano, Phillip Wood

From: Phillip Wood <phillip.wood@dunelm.org.uk>

In some shells, such as bash and zsh, it's possible to use a command
substitution to provide the output of a command as a file argument to
another process, like so:

  diff -u <(printf "a\nb\n") <(printf "a\nc\n")

However, this syntax does not produce useful results with "git diff
--no-index". On macOS, the arguments to the command are named pipes
under /dev/fd, and git diff doesn't know how to handle a named pipe. On
Linux, the arguments are symlinks to pipes, so git diff "helpfully"
diffs these symlinks, comparing their targets like "pipe:[1234]" and
"pipe:[5678]".

To address this "diff --no-index" is changed so that if a path given on
the commandline is a named pipe or a symbolic link that resolves to a
named pipe then we read the data to diff from that pipe. This is
implemented by generalizing the code that already exists to handle
reading from stdin when the user passes the path "-".

As process substitution is not support by POSIX this change is tested by
using a pipe and a symbolic link to a pipe.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
---
 diff-no-index.c          | 80 ++++++++++++++++++++++++----------------
 t/t4053-diff-no-index.sh | 25 +++++++++++++
 2 files changed, 73 insertions(+), 32 deletions(-)

diff --git a/diff-no-index.c b/diff-no-index.c
index 7b9327f8f3..45fbd7cdbe 100644
--- a/diff-no-index.c
+++ b/diff-no-index.c
@@ -41,7 +41,7 @@ static int read_directory_contents(const char *path, struct string_list *list)
  */
 static const char file_from_standard_input[] = "-";
 
-static int get_mode(const char *path, int *mode)
+static int get_mode(const char *path, int is_pipe, int *mode)
 {
 	struct stat st;
 
@@ -51,7 +51,7 @@ static int get_mode(const char *path, int *mode)
 	else if (!strcasecmp(path, "nul"))
 		*mode = 0;
 #endif
-	else if (path == file_from_standard_input)
+	else if (is_pipe)
 		*mode = create_ce_mode(0666);
 	else if (lstat(path, &st))
 		return error("Could not access '%s'", path);
@@ -60,13 +60,18 @@ static int get_mode(const char *path, int *mode)
 	return 0;
 }
 
-static void populate_from_stdin(struct diff_filespec *s)
+static void populate_from_pipe(struct diff_filespec *s, int is_stdin)
 {
 	struct strbuf buf = STRBUF_INIT;
 	size_t size = 0;
+	int fd = 0;
 
-	if (strbuf_read(&buf, 0, 0) < 0)
+	if (!is_stdin)
+		fd = xopen(s->path, O_RDONLY);
+	if (strbuf_read(&buf, fd, 0) < 0)
 		die_errno("error while reading from stdin");
+	if (!is_stdin)
+		close(fd);
 
 	s->should_munmap = 0;
 	s->data = strbuf_detach(&buf, &size);
@@ -75,40 +80,43 @@ static void populate_from_stdin(struct diff_filespec *s)
 	s->is_stdin = 1;
 }
 
-static struct diff_filespec *noindex_filespec(const char *name, int mode)
+static struct diff_filespec *noindex_filespec(const char *name, int is_pipe,
+					      int mode)
 {
 	struct diff_filespec *s;
 
 	if (!name)
 		name = "/dev/null";
 	s = alloc_filespec(name);
 	fill_filespec(s, null_oid(), 0, mode);
-	if (name == file_from_standard_input)
-		populate_from_stdin(s);
+	if (is_pipe)
+		populate_from_pipe(s, name == file_from_standard_input);
 	return s;
 }
 
 static int queue_diff(struct diff_options *o,
-		      const char *name1, const char *name2)
+		      const char *name1, int is_pipe1,
+		      const char *name2, int is_pipe2)
 {
 	int mode1 = 0, mode2 = 0;
 
-	if (get_mode(name1, &mode1) || get_mode(name2, &mode2))
+	if (get_mode(name1, is_pipe1, &mode1) ||
+	    get_mode(name2, is_pipe2, &mode2))
 		return -1;
 
 	if (mode1 && mode2 && S_ISDIR(mode1) != S_ISDIR(mode2)) {
 		struct diff_filespec *d1, *d2;
 
 		if (S_ISDIR(mode1)) {
 			/* 2 is file that is created */
-			d1 = noindex_filespec(NULL, 0);
-			d2 = noindex_filespec(name2, mode2);
+			d1 = noindex_filespec(NULL, 0, 0);
+			d2 = noindex_filespec(name2, is_pipe2, mode2);
 			name2 = NULL;
 			mode2 = 0;
 		} else {
 			/* 1 is file that is deleted */
-			d1 = noindex_filespec(name1, mode1);
-			d2 = noindex_filespec(NULL, 0);
+			d1 = noindex_filespec(name1, is_pipe1, mode1);
+			d2 = noindex_filespec(NULL, 0, 0);
 			name1 = NULL;
 			mode1 = 0;
 		}
@@ -173,7 +181,7 @@ static int queue_diff(struct diff_options *o,
 				n2 = buffer2.buf;
 			}
 
-			ret = queue_diff(o, n1, n2);
+			ret = queue_diff(o, n1, 0, n2, 0);
 		}
 		string_list_clear(&p1, 0);
 		string_list_clear(&p2, 0);
@@ -189,8 +197,8 @@ static int queue_diff(struct diff_options *o,
 			SWAP(name1, name2);
 		}
 
-		d1 = noindex_filespec(name1, mode1);
-		d2 = noindex_filespec(name2, mode2);
+		d1 = noindex_filespec(name1, is_pipe1, mode1);
+		d2 = noindex_filespec(name2, is_pipe2, mode2);
 		diff_queue(&diff_queued_diff, d1, d2);
 		return 0;
 	}
@@ -213,18 +221,11 @@ static void append_basename(struct strbuf *path, const char *dir, const char *fi
  * Note that we append the basename of F to D/, so "diff a/b/file D"
  * becomes "diff a/b/file D/file", not "diff a/b/file D/a/b/file".
  */
-static void fixup_paths(const char **path, struct strbuf *replacement)
+static void fixup_paths(const char **path, int *is_dir, struct strbuf *replacement)
 {
-	unsigned int isdir0, isdir1;
-
-	if (path[0] == file_from_standard_input ||
-	    path[1] == file_from_standard_input)
-		return;
-	isdir0 = is_directory(path[0]);
-	isdir1 = is_directory(path[1]);
-	if (isdir0 == isdir1)
-		return;
-	if (isdir0) {
+	if (is_dir[0] == is_dir[1])
+		return;
+	if (is_dir[0]) {
 		append_basename(replacement, path[0], path[1]);
 		path[0] = replacement->buf;
 	} else {
@@ -246,6 +247,8 @@ int diff_no_index(struct rev_info *revs,
 	int ret = 1;
 	const char *paths[2];
 	char *to_free[ARRAY_SIZE(paths)] = { 0 };
+	int is_dir[ARRAY_SIZE(paths)] = { 0 };
+	int is_pipe[ARRAY_SIZE(paths)] = { 0 };
 	struct strbuf replacement = STRBUF_INIT;
 	const char *prefix = revs->prefix;
 	struct option no_index_options[] = {
@@ -267,18 +270,30 @@ int diff_no_index(struct rev_info *revs,
 	FREE_AND_NULL(options);
 	for (i = 0; i < 2; i++) {
 		const char *p = argv[i];
-		if (!strcmp(p, "-"))
+		if (!strcmp(p, "-")) {
 			/*
 			 * stdin should be spelled as "-"; if you have
 			 * path that is "-", spell it as "./-".
 			 */
 			p = file_from_standard_input;
-		else if (prefix)
-			p = to_free[i] = prefix_filename(prefix, p);
+			is_pipe[i] = 1;
+		} else {
+			struct stat st;
+
+			if (prefix)
+				p = to_free[i] = prefix_filename(prefix, p);
+			if (stat(p, &st))
+				;
+			else if (S_ISDIR(st.st_mode))
+				is_dir[i] = 1;
+			else if (S_ISFIFO(st.st_mode))
+				is_pipe[i] = 1;
+		}
 		paths[i] = p;
 	}
 
-	fixup_paths(paths, &replacement);
+	if (!is_pipe[0] && !is_pipe[1])
+		fixup_paths(paths, is_dir, &replacement);
 
 	revs->diffopt.skip_stat_unmatch = 1;
 	if (!revs->diffopt.output_format)
@@ -295,7 +310,8 @@ int diff_no_index(struct rev_info *revs,
 	setup_diff_pager(&revs->diffopt);
 	revs->diffopt.flags.exit_with_status = 1;
 
-	if (queue_diff(&revs->diffopt, paths[0], paths[1]))
+	if (queue_diff(&revs->diffopt,
+		       paths[0], is_pipe[0], paths[1], is_pipe[1]))
 		goto out;
 	diff_set_mnemonic_prefix(&revs->diffopt, "1/", "2/");
 	diffcore_std(&revs->diffopt);
diff --git a/t/t4053-diff-no-index.sh b/t/t4053-diff-no-index.sh
index d14b194ea2..30b9e2c2f0 100755
--- a/t/t4053-diff-no-index.sh
+++ b/t/t4053-diff-no-index.sh
@@ -223,4 +223,29 @@ test_expect_success "diff --no-index treats '-' as stdin" '
 	test_write_lines 1 | git diff --no-index -- a/1 - >actual &&
 	test_must_be_empty actual
 '
+
+test_expect_success PIPE,SYMLINKS 'diff --no-index reads from pipes' '
+	mkfifo old &&
+	mkfifo new &&
+	ln -s new new-link &&
+	{
+		(test_write_lines a b c >old) &
+		(test_write_lines a x c >new) &
+	} &&
+
+	cat >expect <<-EOF &&
+	diff --git a/old b/new-link
+	--- a/old
+	+++ b/new-link
+	@@ -1,3 +1,3 @@
+	 a
+	-b
+	+x
+	 c
+	EOF
+
+	test_expect_code 1 git diff --no-index old new-link >actual &&
+	test_cmp expect actual
+'
+
 test_done
-- 
2.40.1.852.g22d29fd9ba


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH 3/3] diff --no-index: support reading from named pipes
  2023-06-27 14:10 ` [PATCH 3/3] diff --no-index: support reading from named pipes Phillip Wood
@ 2023-06-27 19:44   ` Junio C Hamano
  2023-06-28 10:05     ` Phillip Wood
  0 siblings, 1 reply; 17+ messages in thread
From: Junio C Hamano @ 2023-06-27 19:44 UTC (permalink / raw)
  To: Phillip Wood
  Cc: git, Jeff King, Taylor Blau, brian m. carlson, Thomas Guyot-Sionnest

Phillip Wood <phillip.wood123@gmail.com> writes:

> From: Phillip Wood <phillip.wood@dunelm.org.uk>
>
> In some shells, such as bash and zsh, it's possible to use a command
> substitution to provide the output of a command as a file argument to
> another process, like so:
>
>   diff -u <(printf "a\nb\n") <(printf "a\nc\n")
>
> However, this syntax does not produce useful results with "git diff
> --no-index". On macOS, the arguments to the command are named pipes
> under /dev/fd, and git diff doesn't know how to handle a named pipe. On
> Linux, the arguments are symlinks to pipes, so git diff "helpfully"
> diffs these symlinks, comparing their targets like "pipe:[1234]" and
> "pipe:[5678]".
>
> To address this "diff --no-index" is changed so that if a path given on
> the commandline is a named pipe or a symbolic link that resolves to a
> named pipe then we read the data to diff from that pipe. This is
> implemented by generalizing the code that already exists to handle
> reading from stdin when the user passes the path "-".
>
> As process substitution is not support by POSIX this change is tested by
> using a pipe and a symbolic link to a pipe.
>
> Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
> ---
>  diff-no-index.c          | 80 ++++++++++++++++++++++++----------------
>  t/t4053-diff-no-index.sh | 25 +++++++++++++
>  2 files changed, 73 insertions(+), 32 deletions(-)

This looks good, if a bit invasive, to a cursory read, at least to
me.  It is very focused to the real problem at hand, and shows that
the way we split the "no-index" mode out to its own implementation
of filespec population code does make sense.

> -static void populate_from_stdin(struct diff_filespec *s)
> +static void populate_from_pipe(struct diff_filespec *s, int is_stdin)
>  {
>  	struct strbuf buf = STRBUF_INIT;
>  	size_t size = 0;
> +	int fd = 0;
>  
> -	if (strbuf_read(&buf, 0, 0) < 0)
> +	if (!is_stdin)
> +		fd = xopen(s->path, O_RDONLY);
> +	if (strbuf_read(&buf, fd, 0) < 0)
>  		die_errno("error while reading from stdin");
> +	if (!is_stdin)
> +		close(fd);

Given that the error message explicitly says "stdin", and there are
many "if ([!]is_stdin)" sprinkled in the code, I actually suspect
that there should be two separate helpers, one for stdin and one for
non-stdin pipe.  It is especially true since there is only one
caller that does this:

> +	if (is_pipe)
> +		populate_from_pipe(s, name == file_from_standard_input);

which can be

	if (is_pipe) {
		if (name == file_from_standard_input)
			populate_from_stdin(s);
		else
			populate_from_pipe(s);
	}

without losing clarity.  The code that you are sharing by forcing
them to be a single helper to wrap up a handful of members in the s
structure can become its own helper that is called from these two
helper functions.

>  static int queue_diff(struct diff_options *o,
> -		      const char *name1, const char *name2)
> +		      const char *name1, int is_pipe1,
> +		      const char *name2, int is_pipe2)
>  {
>  	int mode1 = 0, mode2 = 0;
>  
> -	if (get_mode(name1, &mode1) || get_mode(name2, &mode2))
> +	if (get_mode(name1, is_pipe1, &mode1) ||
> +	    get_mode(name2, is_pipe2, &mode2))
>  		return -1;

Makes me wonder why the caller of queue_diff() even needs to know if
these two names are pipes; we are calling get_mode() which would run
stat(2) anyway, and the result from stat(2) is what you use (in the
caller) to determine the values of is_pipeN.  Wouldn't it be more
appropriate to leave the caller oblivious of special casing of the
pipes and let get_mode() handle this?  After all, that is how the
existing code special cases the standard input so there is a strong
precedence.

If we go that route, it may make sense to further isolate the
"address comparison" trick used for the standard input mode.
Perhaps we can and do something like

    static int get_mode(const char *path, int *mode, int *special)
    {
	struct stat st;

+	*special = 0; /* default - nothing special */
	...
	else if (path == file_from_standard_input) {
		*mode = create_ce_mode(0666);
+		*pipe_kind = 1; /* STDIN */
+	} else if (stat(path, &st)) {
+		... error ...
+	} else if (S_ISFIFO(st.st_mode)) {
+		*mode = create_ce_mode(0666);
+		*pipe_kind = 2; /* FIFO */
	} else if (lstat(path, &st)) {
		... error ...
	} else {
		*mode = st.st_mode;
	}

and have the caller act on "special" to choose among calling
populate_from_stdin(), populate_from_pipe(), or do nothing for
the regular files?

    Side note: this has an added benefit of highlighting that we do
    stat() and lstat() because of dereferencing.  What I suspect is
    that "git diff --no-index" mode was primarily to give Git
    niceties like rename detection and diff algorithms to those who
    wanted to use in contexts (i.e. contents not tracked by Git)
    they use "diff" by other people like GNU, without bothering to
    update "diff" by other people.  I further suspect that "compare
    the readlink contents", which is very much necessary within the
    Git context, may not fall into the "Git niceties" when they
    invoke "--no-index" mode.  Which leads me to imagine a future
    direction where we only use stat() and not lstat() in the
    "--no-index" codepath.  Having everything including these
    lstat() and stat() calls inside get_mode() will allow such a
    future transition hopefully simpler.

I do not quite see why you decided to move the "is_dir" processing
up and made the caller responsible.  Specifically,

> -	fixup_paths(paths, &replacement);
> +	if (!is_pipe[0] && !is_pipe[1])
> +		fixup_paths(paths, is_dir, &replacement);

this seems fishy when one side is pipe and the other one is not.
When the user says

    $ git diff --no-index <(command) path

fixup_paths() are bypassed because one of them is pipe.  It makes me
suspect that it should be an error if "path" is a directory.  I do
not know if fixup_paths() is the best place for doing such checking,
but somebody should be doing that, no?



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 3/3] diff --no-index: support reading from named pipes
  2023-06-27 19:44   ` Junio C Hamano
@ 2023-06-28 10:05     ` Phillip Wood
  0 siblings, 0 replies; 17+ messages in thread
From: Phillip Wood @ 2023-06-28 10:05 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, Jeff King, Taylor Blau, brian m. carlson, Thomas Guyot-Sionnest

Hi Junio

On 27/06/2023 20:44, Junio C Hamano wrote:
> Phillip Wood <phillip.wood123@gmail.com> writes:
> 
>> From: Phillip Wood <phillip.wood@dunelm.org.uk>
>>
>> In some shells, such as bash and zsh, it's possible to use a command
>> substitution to provide the output of a command as a file argument to
>> another process, like so:
>>
>>    diff -u <(printf "a\nb\n") <(printf "a\nc\n")
>>
>> However, this syntax does not produce useful results with "git diff
>> --no-index". On macOS, the arguments to the command are named pipes
>> under /dev/fd, and git diff doesn't know how to handle a named pipe. On
>> Linux, the arguments are symlinks to pipes, so git diff "helpfully"
>> diffs these symlinks, comparing their targets like "pipe:[1234]" and
>> "pipe:[5678]".
>>
>> To address this "diff --no-index" is changed so that if a path given on
>> the commandline is a named pipe or a symbolic link that resolves to a
>> named pipe then we read the data to diff from that pipe. This is
>> implemented by generalizing the code that already exists to handle
>> reading from stdin when the user passes the path "-".
>>
>> As process substitution is not support by POSIX this change is tested by
>> using a pipe and a symbolic link to a pipe.
>>
>> Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
>> ---
>>   diff-no-index.c          | 80 ++++++++++++++++++++++++----------------
>>   t/t4053-diff-no-index.sh | 25 +++++++++++++
>>   2 files changed, 73 insertions(+), 32 deletions(-)
> 
> This looks good, if a bit invasive, to a cursory read, at least to
> me.  It is very focused to the real problem at hand, and shows that
> the way we split the "no-index" mode out to its own implementation
> of filespec population code does make sense.

Yes, it is more invasive than I'd like but I think that stems from 
needing to treat paths given on the commandline differently to the paths 
we find when recursing directories.

>> -static void populate_from_stdin(struct diff_filespec *s)
>> +static void populate_from_pipe(struct diff_filespec *s, int is_stdin)
>>   {
>>   	struct strbuf buf = STRBUF_INIT;
>>   	size_t size = 0;
>> +	int fd = 0;
>>   
>> -	if (strbuf_read(&buf, 0, 0) < 0)
>> +	if (!is_stdin)
>> +		fd = xopen(s->path, O_RDONLY);
>> +	if (strbuf_read(&buf, fd, 0) < 0)
>>   		die_errno("error while reading from stdin");
>> +	if (!is_stdin)
>> +		close(fd);
> 
> Given that the error message explicitly says "stdin", and there are
> many "if ([!]is_stdin)" sprinkled in the code, I actually suspect
> that there should be two separate helpers, one for stdin and one for
> non-stdin pipe.  It is especially true since there is only one
> caller that does this:
> 
>> +	if (is_pipe)
>> +		populate_from_pipe(s, name == file_from_standard_input);
> 
> which can be
> 
> 	if (is_pipe) {
> 		if (name == file_from_standard_input)
> 			populate_from_stdin(s);
> 		else
> 			populate_from_pipe(s);
> 	}
> 
> without losing clarity.  The code that you are sharing by forcing
> them to be a single helper to wrap up a handful of members in the s
> structure can become its own helper that is called from these two
> helper functions.

Sure, and thanks for pointing out the error message, I'd overlooked that.

>>   static int queue_diff(struct diff_options *o,
>> -		      const char *name1, const char *name2)
>> +		      const char *name1, int is_pipe1,
>> +		      const char *name2, int is_pipe2)
>>   {
>>   	int mode1 = 0, mode2 = 0;
>>   
>> -	if (get_mode(name1, &mode1) || get_mode(name2, &mode2))
>> +	if (get_mode(name1, is_pipe1, &mode1) ||
>> +	    get_mode(name2, is_pipe2, &mode2))
>>   		return -1;
> 
> Makes me wonder why the caller of queue_diff() even needs to know if
> these two names are pipes; we are calling get_mode() which would run
> stat(2) anyway, and the result from stat(2) is what you use (in the
> caller) to determine the values of is_pipeN.  Wouldn't it be more
> appropriate to leave the caller oblivious of special casing of the
> pipes and let get_mode() handle this?  After all, that is how the
> existing code special cases the standard input so there is a strong
> precedence.

Maybe is_pipeN should be called force_pipeN. We only want to 
de-reference symbolic links to pipes for paths that are given on the 
command line, not when the the user asked to compare two directories. 
That means we need to pass some kind of flag around to say whether we're 
recursing or not. An earlier draft of this series had a recursing flag 
rather than is_pipeN like this

-static int get_mode(const char *path, int *mode)
+static int get_mode(const char *path, int *mode, int recursing)
  {
          struct stat st;

          if (!path || !strcmp(path, "/dev/null"))
                  *mode = 0;
  #ifdef GIT_WINDOWS_NATIVE
          else if (!strcasecmp(path, "nul"))
                  *mode = 0;
  #endif
          else if (path == file_from_standard_input)
                  *mode = create_ce_mode(0666);
          else if (lstat(path, &st))
                  return error("Could not access '%s'", path);
          else
                  *mode = st.st_mode;
+
+        /*
+	  * For paths given on the command line de-reference symbolic
+	  * links that resolve to a fifo.
+	  */
+        if (!recursing &&
+            S_ISLNK(*mode) && !stat(path, &st) && S_ISFIFO(st.st_mode))
+                *mode = st.st_mode;
+
          return 0;
  }

I dropped it in favor of is_pipeN after I realized we need to check if 
the paths on the command line are pipes before calling fixup_paths(). I 
think we could use the "special" parameter you suggest below as a 
recursion indicator by setting it to NULL when recursing.

> If we go that route, it may make sense to further isolate the
> "address comparison" trick used for the standard input mode.
> Perhaps we can and do something like
> 
>      static int get_mode(const char *path, int *mode, int *special)
>      {
> 	struct stat st;
> 
> +	*special = 0; /* default - nothing special */
> 	...
> 	else if (path == file_from_standard_input) {
> 		*mode = create_ce_mode(0666);
> +		*pipe_kind = 1; /* STDIN */
> +	} else if (stat(path, &st)) {
> +		... error ...
> +	} else if (S_ISFIFO(st.st_mode)) {
> +		*mode = create_ce_mode(0666);
> +		*pipe_kind = 2; /* FIFO */
> 	} else if (lstat(path, &st)) {
> 		... error ...
> 	} else {
> 		*mode = st.st_mode;
> 	}
> 
> and have the caller act on "special" to choose among calling
> populate_from_stdin(), populate_from_pipe(), or do nothing for
> the regular files?
> 
>      Side note: this has an added benefit of highlighting that we do
>      stat() and lstat() because of dereferencing.  What I suspect is
>      that "git diff --no-index" mode was primarily to give Git
>      niceties like rename detection and diff algorithms to those who
>      wanted to use in contexts (i.e. contents not tracked by Git)
>      they use "diff" by other people like GNU, without bothering to
>      update "diff" by other people.  I further suspect that "compare
>      the readlink contents", which is very much necessary within the
>      Git context, may not fall into the "Git niceties" when they
>      invoke "--no-index" mode.  Which leads me to imagine a future
>      direction where we only use stat() and not lstat() in the
>      "--no-index" codepath.  Having everything including these
>      lstat() and stat() calls inside get_mode() will allow such a
>      future transition hopefully simpler.
> 
> I do not quite see why you decided to move the "is_dir" processing
> up and made the caller responsible.

To avoid a second stat() call in is_directory() after we've already 
called stat() to see if the path is a pipe.

>  Specifically,
> 
>> -	fixup_paths(paths, &replacement);
>> +	if (!is_pipe[0] && !is_pipe[1])
>> +		fixup_paths(paths, is_dir, &replacement);
> 
> this seems fishy when one side is pipe and the other one is not.
> When the user says
> 
>      $ git diff --no-index <(command) path
> 
> fixup_paths() are bypassed because one of them is pipe.  It makes me
> suspect that it should be an error if "path" is a directory.  I do
> not know if fixup_paths() is the best place for doing such checking,
> but somebody should be doing that, no?

I  wasn't sure what was best to do in that case. In the end I went with 
making the behavior the same as "git diff --no-index -- - directory". 
I'm happy to make both an error.

Thanks for your comments, I'll leave it a couple of days to see if there 
are any more comments and then re-roll.

Phillip


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v2 0/4] diff --no-index: support reading from named pipes
  2023-06-27 14:10 [PATCH 0/3] diff --no-index: support reading from named pipes Phillip Wood
                   ` (2 preceding siblings ...)
  2023-06-27 14:10 ` [PATCH 3/3] diff --no-index: support reading from named pipes Phillip Wood
@ 2023-07-05 19:49 ` Phillip Wood
  2023-07-05 19:49   ` [PATCH v2 1/4] diff --no-index: refuse to compare stdin to a directory Phillip Wood
                     ` (3 more replies)
  3 siblings, 4 replies; 17+ messages in thread
From: Phillip Wood @ 2023-07-05 19:49 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Taylor Blau, brian m. carlson, Thomas Guyot-Sionnest,
	Junio C Hamano

From: Phillip Wood <phillip.wood@dunelm.org.uk>

In some shells, such as bash and zsh, it's possible to use a command
substitution to provide the output of a command as a file argument to
another process, like so:

  diff -u <(printf "a\nb\n") <(printf "a\nc\n")

However, this syntax does not produce useful results with git diff
--no-index. On macOS, the arguments to the command are named pipes
under /dev/fd, and git diff doesn't know how to handle a named
pipe. On Linux, the arguments are symlinks to pipes, so git diff
"helpfully" diffs these symlinks, comparing their targets like
"pipe:[1234]" and "pipe:[5678]".

There have been at least three previous attempts [1-3] to address this
issue. They all seem to have received broad support for the aim of
supporting process substitutions but have foundered on details of the
implementation. In an effort to avoid the same fate this series is
narrowly focussed on making command substitutions work with "diff
--no-index" and does not attempt to add a general facility for
de-referencing symbolic links or reading from pipes to the diff
machinery.

The only functional change is that if a path given on the commandline
is a named pipe or a symbolic link that resolves to a named pipe then
we read the data to diff from that pipe. The first two patches improve
the error handling when reading from stdin and add a test. The third
patch implements support for reading from pipes.

This cover letter and the commit message for the third patch are
largely copied from brian’s patch[2] - do we have a standard commit
trailer for "I stole the commit message from someone else's patch"?

I've cc’d the participants of the discussion of the last attempt[1] to
fix this.

Thanks to Junio for his comments, the changes since v1 are:

* Patch 1: - This is new and changes the code to error out if the user
	     tries to compare stdin to a directory.

* Patch 4: - Changed the implementation of get_mode() to treat stdin and
	     named pipes as special. (suggested by Junio)
	   - Added separate functions to populate a diff_filespec from
	     a stdin or a named pipe. (suggested by Junio)
	   - It now dies if the user tries to compare a named pipe to a
	     directory and added a test for this. (suggested by Junio)
	   - The test for comparing named pipes now cleans up the
	     background processes if it fails.

[1] https://lore.kernel.org/git/20200918113256.8699-3-tguyot@gmail.com/
[2] https://lore.kernel.org/git/20181220002610.43832-1-sandals@crustytoothpaste.net/
[3] https://public-inbox.org/git/20170113102021.6054-1-dennis@kaarsemaker.net/


base-commit: 94486b6763c29144c60932829a65fec0597e17b3
Published-As: https://github.com/phillipwood/git/releases/tag/diff-no-index-pipes%2Fv2
View-Changes-At: https://github.com/phillipwood/git/compare/94486b676...4e05a0be5
Fetch-It-Via: git fetch https://github.com/phillipwood/git diff-no-index-pipes/v2

Phillip Wood (4):
  diff --no-index: refuse to compare stdin to a directory
  diff --no-index: die on error reading stdin
  t4054: test diff --no-index with stdin
  diff --no-index: support reading from named pipes

 diff-no-index.c          | 126 ++++++++++++++++++++++++++++-----------
 t/t4053-diff-no-index.sh |  64 ++++++++++++++++++++
 2 files changed, 156 insertions(+), 34 deletions(-)

Range-diff against v1:
-:  ---------- > 1:  5e65a15223 diff --no-index: refuse to compare stdin to a directory
1:  5dad728f3b = 2:  be1d666769 diff --no-index: die on error reading stdin
2:  b94d59034f ! 3:  1c7db4dbe2 t4054: test diff --no-index with stdin
    @@ t/t4053-diff-no-index.sh: test_expect_success POSIXPERM,SYMLINKS 'diff --no-inde
     +	test_write_lines 1 | git diff --no-index -- a/1 - >actual &&
     +	test_must_be_empty actual
     +'
    - test_done
    ++
    + test_expect_success 'diff --no-index refuses to diff stdin and a directory' '
    + 	test_must_fail git diff --no-index -- - a </dev/null 2>err &&
    + 	grep "fatal: cannot compare stdin to a directory" err
3:  990e71882b < -:  ---------- diff --no-index: support reading from named pipes
-:  ---------- > 4:  4e05a0be54 diff --no-index: support reading from named pipes

base-commit: 94486b6763c29144c60932829a65fec0597e17b3
-- 
2.40.1.852.g0a1e0755a6


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v2 1/4] diff --no-index: refuse to compare stdin to a directory
  2023-07-05 19:49 ` [PATCH v2 0/4] " Phillip Wood
@ 2023-07-05 19:49   ` Phillip Wood
  2023-07-05 21:17     ` Junio C Hamano
  2023-07-05 19:49   ` [PATCH v2 2/4] diff --no-index: die on error reading stdin Phillip Wood
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 17+ messages in thread
From: Phillip Wood @ 2023-07-05 19:49 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Taylor Blau, brian m. carlson, Thomas Guyot-Sionnest,
	Junio C Hamano, Phillip Wood

From: Phillip Wood <phillip.wood@dunelm.org.uk>

When the user runs

    git diff --no-index file directory

we follow the behavior of POSIX diff and rewrite the arguments as

    git diff --no-index file directory/file

Doing that when "file" is "-" (which means "read from stdin") does not
make sense so we should error out if the user asks us to compare "-" to
a directory. This matches the behavior of GNU diff and diff on *BSD.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
---
 diff-no-index.c          | 12 +++++++-----
 t/t4053-diff-no-index.sh |  5 +++++
 2 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/diff-no-index.c b/diff-no-index.c
index 4296940f90..77462ac2a9 100644
--- a/diff-no-index.c
+++ b/diff-no-index.c
@@ -218,11 +218,13 @@ static void fixup_paths(const char **path, struct strbuf *replacement)
 {
 	unsigned int isdir0, isdir1;
 
-	if (path[0] == file_from_standard_input ||
-	    path[1] == file_from_standard_input)
-		return;
-	isdir0 = is_directory(path[0]);
-	isdir1 = is_directory(path[1]);
+	isdir0 = path[0] != file_from_standard_input && is_directory(path[0]);
+	isdir1 = path[1] != file_from_standard_input && is_directory(path[1]);
+
+	if ((path[0] == file_from_standard_input && isdir1) ||
+	    (isdir0 && path[1] == file_from_standard_input))
+		die(_("cannot compare stdin to a directory"));
+
 	if (isdir0 == isdir1)
 		return;
 	if (isdir0) {
diff --git a/t/t4053-diff-no-index.sh b/t/t4053-diff-no-index.sh
index 4e9fa0403d..5bfb282e98 100755
--- a/t/t4053-diff-no-index.sh
+++ b/t/t4053-diff-no-index.sh
@@ -205,4 +205,9 @@ test_expect_success POSIXPERM,SYMLINKS 'diff --no-index normalizes: mode not lik
 	test_cmp expected actual
 '
 
+test_expect_success 'diff --no-index refuses to diff stdin and a directory' '
+	test_must_fail git diff --no-index -- - a </dev/null 2>err &&
+	grep "fatal: cannot compare stdin to a directory" err
+'
+
 test_done
-- 
2.40.1.852.g0a1e0755a6


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v2 2/4] diff --no-index: die on error reading stdin
  2023-07-05 19:49 ` [PATCH v2 0/4] " Phillip Wood
  2023-07-05 19:49   ` [PATCH v2 1/4] diff --no-index: refuse to compare stdin to a directory Phillip Wood
@ 2023-07-05 19:49   ` Phillip Wood
  2023-07-05 21:18     ` Junio C Hamano
  2023-07-05 19:49   ` [PATCH v2 3/4] t4054: test diff --no-index with stdin Phillip Wood
  2023-07-05 19:49   ` [PATCH v2 4/4] diff --no-index: support reading from named pipes Phillip Wood
  3 siblings, 1 reply; 17+ messages in thread
From: Phillip Wood @ 2023-07-05 19:49 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Taylor Blau, brian m. carlson, Thomas Guyot-Sionnest,
	Junio C Hamano, Phillip Wood

From: Phillip Wood <phillip.wood@dunelm.org.uk>

If there is an error when reading from stdin then "diff --no-index"
prints an error message but continues to try and diff a file named "-"
resulting in an error message that looks like

    error: error while reading from stdin: Invalid argument
    fatal: stat '-': No such file or directory

assuming that no file named "-" exists. If such a file exists it prints
the first error message and generates the diff from that file which is
not what the user wanted. Instead just die() straight away if we cannot
read from stdin.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
---
 diff-no-index.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/diff-no-index.c b/diff-no-index.c
index 77462ac2a9..4470e0271d 100644
--- a/diff-no-index.c
+++ b/diff-no-index.c
@@ -60,20 +60,19 @@ static int get_mode(const char *path, int *mode)
 	return 0;
 }
 
-static int populate_from_stdin(struct diff_filespec *s)
+static void populate_from_stdin(struct diff_filespec *s)
 {
 	struct strbuf buf = STRBUF_INIT;
 	size_t size = 0;
 
 	if (strbuf_read(&buf, 0, 0) < 0)
-		return error_errno("error while reading from stdin");
+		die_errno("error while reading from stdin");
 
 	s->should_munmap = 0;
 	s->data = strbuf_detach(&buf, &size);
 	s->size = size;
 	s->should_free = 1;
 	s->is_stdin = 1;
-	return 0;
 }
 
 static struct diff_filespec *noindex_filespec(const char *name, int mode)
-- 
2.40.1.852.g0a1e0755a6


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v2 3/4] t4054: test diff --no-index with stdin
  2023-07-05 19:49 ` [PATCH v2 0/4] " Phillip Wood
  2023-07-05 19:49   ` [PATCH v2 1/4] diff --no-index: refuse to compare stdin to a directory Phillip Wood
  2023-07-05 19:49   ` [PATCH v2 2/4] diff --no-index: die on error reading stdin Phillip Wood
@ 2023-07-05 19:49   ` Phillip Wood
  2023-07-05 21:22     ` Junio C Hamano
  2023-07-05 19:49   ` [PATCH v2 4/4] diff --no-index: support reading from named pipes Phillip Wood
  3 siblings, 1 reply; 17+ messages in thread
From: Phillip Wood @ 2023-07-05 19:49 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Taylor Blau, brian m. carlson, Thomas Guyot-Sionnest,
	Junio C Hamano, Phillip Wood

From: Phillip Wood <phillip.wood@dunelm.org.uk>

"git diff --no-index" supports reading from stdin with the path "-".
There is no test coverage for this so add a regression test before
changing the code in the next commit.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
---
 t/t4053-diff-no-index.sh | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/t/t4053-diff-no-index.sh b/t/t4053-diff-no-index.sh
index 5bfb282e98..4870443609 100755
--- a/t/t4053-diff-no-index.sh
+++ b/t/t4053-diff-no-index.sh
@@ -205,6 +205,25 @@ test_expect_success POSIXPERM,SYMLINKS 'diff --no-index normalizes: mode not lik
 	test_cmp expected actual
 '
 
+test_expect_success "diff --no-index treats '-' as stdin" '
+	cat >expect <<-EOF &&
+	diff --git a/- b/a/1
+	index $ZERO_OID..$(git hash-object --stdin <a/1) 100644
+	--- a/-
+	+++ b/a/1
+	@@ -1 +1 @@
+	-x
+	+1
+	EOF
+
+	test_write_lines x | test_expect_code 1 \
+		git -c core.abbrev=no diff --no-index -- - a/1 >actual &&
+	test_cmp expect actual &&
+
+	test_write_lines 1 | git diff --no-index -- a/1 - >actual &&
+	test_must_be_empty actual
+'
+
 test_expect_success 'diff --no-index refuses to diff stdin and a directory' '
 	test_must_fail git diff --no-index -- - a </dev/null 2>err &&
 	grep "fatal: cannot compare stdin to a directory" err
-- 
2.40.1.852.g0a1e0755a6


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v2 4/4] diff --no-index: support reading from named pipes
  2023-07-05 19:49 ` [PATCH v2 0/4] " Phillip Wood
                     ` (2 preceding siblings ...)
  2023-07-05 19:49   ` [PATCH v2 3/4] t4054: test diff --no-index with stdin Phillip Wood
@ 2023-07-05 19:49   ` Phillip Wood
  2023-07-05 22:19     ` Junio C Hamano
  2023-08-09 17:17     ` Jeff King
  3 siblings, 2 replies; 17+ messages in thread
From: Phillip Wood @ 2023-07-05 19:49 UTC (permalink / raw)
  To: git
  Cc: Jeff King, Taylor Blau, brian m. carlson, Thomas Guyot-Sionnest,
	Junio C Hamano, Phillip Wood

From: Phillip Wood <phillip.wood@dunelm.org.uk>

In some shells, such as bash and zsh, it's possible to use a command
substitution to provide the output of a command as a file argument to
another process, like so:

  diff -u <(printf "a\nb\n") <(printf "a\nc\n")

However, this syntax does not produce useful results with "git diff
--no-index". On macOS, the arguments to the command are named pipes
under /dev/fd, and git diff doesn't know how to handle a named pipe. On
Linux, the arguments are symlinks to pipes, so git diff "helpfully"
diffs these symlinks, comparing their targets like "pipe:[1234]" and
"pipe:[5678]".

To address this "diff --no-index" is changed so that if a path given on
the commandline is a named pipe or a symbolic link that resolves to a
named pipe then we read the data to diff from that pipe. This is
implemented by generalizing the code that already exists to handle
reading from stdin when the user passes the path "-".

If the user tries to compare a named pipe to a directory then we die as
we do when trying to compare stdin to a directory.

As process substitution is not support by POSIX this change is tested by
using a pipe and a symbolic link to a pipe.

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
---
 diff-no-index.c          | 113 +++++++++++++++++++++++++++++----------
 t/t4053-diff-no-index.sh |  40 ++++++++++++++
 2 files changed, 125 insertions(+), 28 deletions(-)

diff --git a/diff-no-index.c b/diff-no-index.c
index 4470e0271d..4771cf02aa 100644
--- a/diff-no-index.c
+++ b/diff-no-index.c
@@ -41,74 +41,119 @@ static int read_directory_contents(const char *path, struct string_list *list)
  */
 static const char file_from_standard_input[] = "-";
 
-static int get_mode(const char *path, int *mode)
+/*
+ * For paths given on the command-line we treat "-" as stdin and named
+ * pipes and symbolic links to named pipes specially.
+ */
+enum special {
+	SPECIAL_NONE,
+	SPECIAL_STDIN,
+	SPECIAL_PIPE,
+};
+
+static int get_mode(const char *path, int *mode, enum special *special)
 {
 	struct stat st;
 
-	if (!path || !strcmp(path, "/dev/null"))
+	if (!path || !strcmp(path, "/dev/null")) {
 		*mode = 0;
 #ifdef GIT_WINDOWS_NATIVE
-	else if (!strcasecmp(path, "nul"))
+	} else if (!strcasecmp(path, "nul")) {
 		*mode = 0;
 #endif
-	else if (path == file_from_standard_input)
+	} else if (path == file_from_standard_input) {
 		*mode = create_ce_mode(0666);
-	else if (lstat(path, &st))
+		*special = SPECIAL_STDIN;
+	} else if (lstat(path, &st)) {
 		return error("Could not access '%s'", path);
-	else
+	} else {
 		*mode = st.st_mode;
+	}
+	/*
+	 * For paths on the command-line treat named pipes and symbolic
+	 * links that resolve to a named pipe specially.
+	 */
+	if (special &&
+	    (S_ISFIFO(*mode) ||
+	     (S_ISLNK(*mode) && !stat(path, &st) && S_ISFIFO(st.st_mode)))) {
+		*mode = create_ce_mode(0666);
+		*special = SPECIAL_PIPE;
+	}
+
 	return 0;
 }
 
-static void populate_from_stdin(struct diff_filespec *s)
+static void populate_common(struct diff_filespec *s, struct strbuf *buf)
 {
-	struct strbuf buf = STRBUF_INIT;
 	size_t size = 0;
 
-	if (strbuf_read(&buf, 0, 0) < 0)
-		die_errno("error while reading from stdin");
-
 	s->should_munmap = 0;
-	s->data = strbuf_detach(&buf, &size);
+	s->data = strbuf_detach(buf, &size);
 	s->size = size;
 	s->should_free = 1;
 	s->is_stdin = 1;
 }
 
-static struct diff_filespec *noindex_filespec(const char *name, int mode)
+static void populate_from_pipe(struct diff_filespec *s)
+{
+	struct strbuf buf = STRBUF_INIT;
+	int fd = xopen(s->path, O_RDONLY);
+
+	if (strbuf_read(&buf, fd, 0) < 0)
+		die_errno("error while reading from '%s'", s->path);
+	close(fd);
+	populate_common(s, &buf);
+}
+
+static void populate_from_stdin(struct diff_filespec *s)
+{
+	struct strbuf buf = STRBUF_INIT;
+
+	if (strbuf_read(&buf, 0, 0) < 0)
+		die_errno("error while reading from stdin");
+	populate_common(s, &buf);
+}
+
+static struct diff_filespec *noindex_filespec(const char *name, int mode,
+					      enum special special)
 {
 	struct diff_filespec *s;
 
 	if (!name)
 		name = "/dev/null";
 	s = alloc_filespec(name);
 	fill_filespec(s, null_oid(), 0, mode);
-	if (name == file_from_standard_input)
+	if (special == SPECIAL_STDIN)
 		populate_from_stdin(s);
+	else if (special == SPECIAL_PIPE)
+		populate_from_pipe(s);
 	return s;
 }
 
 static int queue_diff(struct diff_options *o,
-		      const char *name1, const char *name2)
+		      const char *name1, const char *name2, int recursing)
 {
 	int mode1 = 0, mode2 = 0;
+	enum special special1 = SPECIAL_NONE, special2 = SPECIAL_NONE;
 
-	if (get_mode(name1, &mode1) || get_mode(name2, &mode2))
+	/* Paths can only be special if we're not recursing. */
+	if (get_mode(name1, &mode1, recursing ? NULL : &special1) ||
+	    get_mode(name2, &mode2, recursing ? NULL : &special2))
 		return -1;
 
 	if (mode1 && mode2 && S_ISDIR(mode1) != S_ISDIR(mode2)) {
 		struct diff_filespec *d1, *d2;
 
 		if (S_ISDIR(mode1)) {
 			/* 2 is file that is created */
-			d1 = noindex_filespec(NULL, 0);
-			d2 = noindex_filespec(name2, mode2);
+			d1 = noindex_filespec(NULL, 0, SPECIAL_NONE);
+			d2 = noindex_filespec(name2, mode2, special2);
 			name2 = NULL;
 			mode2 = 0;
 		} else {
 			/* 1 is file that is deleted */
-			d1 = noindex_filespec(name1, mode1);
-			d2 = noindex_filespec(NULL, 0);
+			d1 = noindex_filespec(name1, mode1, special1);
+			d2 = noindex_filespec(NULL, 0, SPECIAL_NONE);
 			name1 = NULL;
 			mode1 = 0;
 		}
@@ -173,7 +218,7 @@ static int queue_diff(struct diff_options *o,
 				n2 = buffer2.buf;
 			}
 
-			ret = queue_diff(o, n1, n2);
+			ret = queue_diff(o, n1, n2, 1);
 		}
 		string_list_clear(&p1, 0);
 		string_list_clear(&p2, 0);
@@ -189,8 +234,8 @@ static int queue_diff(struct diff_options *o,
 			SWAP(name1, name2);
 		}
 
-		d1 = noindex_filespec(name1, mode1);
-		d2 = noindex_filespec(name2, mode2);
+		d1 = noindex_filespec(name1, mode1, special1);
+		d2 = noindex_filespec(name2, mode2, special2);
 		diff_queue(&diff_queued_diff, d1, d2);
 		return 0;
 	}
@@ -215,15 +260,27 @@ static void append_basename(struct strbuf *path, const char *dir, const char *fi
  */
 static void fixup_paths(const char **path, struct strbuf *replacement)
 {
-	unsigned int isdir0, isdir1;
-
-	isdir0 = path[0] != file_from_standard_input && is_directory(path[0]);
-	isdir1 = path[1] != file_from_standard_input && is_directory(path[1]);
+	struct stat st;
+	unsigned int isdir0 = 0, isdir1 = 0;
+	unsigned int ispipe0 = 0, ispipe1 = 0;
+
+	if (path[0] != file_from_standard_input && !stat(path[0], &st)) {
+		isdir0 = S_ISDIR(st.st_mode);
+		ispipe0 = S_ISFIFO(st.st_mode);
+	}
+
+	if (path[1] != file_from_standard_input && !stat(path[1], &st)) {
+		isdir1 = S_ISDIR(st.st_mode);
+		ispipe1 = S_ISFIFO(st.st_mode);
+	}
 
 	if ((path[0] == file_from_standard_input && isdir1) ||
 	    (isdir0 && path[1] == file_from_standard_input))
 		die(_("cannot compare stdin to a directory"));
 
+	if ((isdir0 && ispipe1) || (ispipe0 && isdir1))
+		die(_("cannot compare a named pipe to a directory"));
+
 	if (isdir0 == isdir1)
 		return;
 	if (isdir0) {
@@ -297,7 +354,7 @@ int diff_no_index(struct rev_info *revs,
 	setup_diff_pager(&revs->diffopt);
 	revs->diffopt.flags.exit_with_status = 1;
 
-	if (queue_diff(&revs->diffopt, paths[0], paths[1]))
+	if (queue_diff(&revs->diffopt, paths[0], paths[1], 0))
 		goto out;
 	diff_set_mnemonic_prefix(&revs->diffopt, "1/", "2/");
 	diffcore_std(&revs->diffopt);
diff --git a/t/t4053-diff-no-index.sh b/t/t4053-diff-no-index.sh
index 4870443609..a28b9ff243 100755
--- a/t/t4053-diff-no-index.sh
+++ b/t/t4053-diff-no-index.sh
@@ -229,4 +229,44 @@ test_expect_success 'diff --no-index refuses to diff stdin and a directory' '
 	grep "fatal: cannot compare stdin to a directory" err
 '
 
+test_expect_success PIPE 'diff --no-index refuses to diff a named pipe and a directory' '
+	test_when_finished "rm -f pipe" &&
+	mkfifo pipe &&
+	{
+		(>pipe) &
+	} &&
+	test_when_finished "kill $!" &&
+	test_must_fail git diff --no-index -- pipe a 2>err &&
+	grep "fatal: cannot compare a named pipe to a directory" err
+'
+
+test_expect_success PIPE,SYMLINKS 'diff --no-index reads from pipes' '
+	test_when_finished "rm -f old new new-link" &&
+	mkfifo old &&
+	mkfifo new &&
+	ln -s new new-link &&
+	{
+		(test_write_lines a b c >old) &
+	} &&
+	test_when_finished "! kill $!" &&
+	{
+		(test_write_lines a x c >new) &
+	} &&
+	test_when_finished "! kill $!" &&
+
+	cat >expect <<-EOF &&
+	diff --git a/old b/new-link
+	--- a/old
+	+++ b/new-link
+	@@ -1,3 +1,3 @@
+	 a
+	-b
+	+x
+	 c
+	EOF
+
+	test_expect_code 1 git diff --no-index old new-link >actual &&
+	test_cmp expect actual
+'
+
 test_done
-- 
2.40.1.852.g0a1e0755a6


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 1/4] diff --no-index: refuse to compare stdin to a directory
  2023-07-05 19:49   ` [PATCH v2 1/4] diff --no-index: refuse to compare stdin to a directory Phillip Wood
@ 2023-07-05 21:17     ` Junio C Hamano
  0 siblings, 0 replies; 17+ messages in thread
From: Junio C Hamano @ 2023-07-05 21:17 UTC (permalink / raw)
  To: Phillip Wood
  Cc: git, Jeff King, Taylor Blau, brian m. carlson, Thomas Guyot-Sionnest

Phillip Wood <phillip.wood123@gmail.com> writes:

> From: Phillip Wood <phillip.wood@dunelm.org.uk>
>
> When the user runs
>
>     git diff --no-index file directory
>
> we follow the behavior of POSIX diff and rewrite the arguments as
>
>     git diff --no-index file directory/file
>
> Doing that when "file" is "-" (which means "read from stdin") does not
> make sense so we should error out if the user asks us to compare "-" to
> a directory. This matches the behavior of GNU diff and diff on *BSD.

"git diff --no-index directory file" would also be the same way.

Makes sense.

> -	if (path[0] == file_from_standard_input ||
> -	    path[1] == file_from_standard_input)
> -		return;
> -	isdir0 = is_directory(path[0]);
> -	isdir1 = is_directory(path[1]);

We used to silently did nonsense, I guess.

> +	isdir0 = path[0] != file_from_standard_input && is_directory(path[0]);
> +	isdir1 = path[1] != file_from_standard_input && is_directory(path[1]);
> +
> +	if ((path[0] == file_from_standard_input && isdir1) ||
> +	    (isdir0 && path[1] == file_from_standard_input))
> +		die(_("cannot compare stdin to a directory"));

OK.  It is much better than turning "diff - D" into "diff - D/-".
If D were a missing or a misspelt directory name, the rest of the
function will just leave the original intact and let comparison
between the standard input and D that does not exist, and will fail
correctly anyway.  Good.

>  	if (isdir0 == isdir1)
>  		return;
>  	if (isdir0) {
> diff --git a/t/t4053-diff-no-index.sh b/t/t4053-diff-no-index.sh
> index 4e9fa0403d..5bfb282e98 100755
> --- a/t/t4053-diff-no-index.sh
> +++ b/t/t4053-diff-no-index.sh
> @@ -205,4 +205,9 @@ test_expect_success POSIXPERM,SYMLINKS 'diff --no-index normalizes: mode not lik
>  	test_cmp expected actual
>  '
>  
> +test_expect_success 'diff --no-index refuses to diff stdin and a directory' '
> +	test_must_fail git diff --no-index -- - a </dev/null 2>err &&
> +	grep "fatal: cannot compare stdin to a directory" err
> +'
> +
>  test_done

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 2/4] diff --no-index: die on error reading stdin
  2023-07-05 19:49   ` [PATCH v2 2/4] diff --no-index: die on error reading stdin Phillip Wood
@ 2023-07-05 21:18     ` Junio C Hamano
  0 siblings, 0 replies; 17+ messages in thread
From: Junio C Hamano @ 2023-07-05 21:18 UTC (permalink / raw)
  To: Phillip Wood
  Cc: git, Jeff King, Taylor Blau, brian m. carlson, Thomas Guyot-Sionnest

Phillip Wood <phillip.wood123@gmail.com> writes:

> From: Phillip Wood <phillip.wood@dunelm.org.uk>
>
> If there is an error when reading from stdin then "diff --no-index"
> prints an error message but continues to try and diff a file named "-"
> resulting in an error message that looks like
>
>     error: error while reading from stdin: Invalid argument
>     fatal: stat '-': No such file or directory
>
> assuming that no file named "-" exists. If such a file exists it prints
> the first error message and generates the diff from that file which is
> not what the user wanted. Instead just die() straight away if we cannot
> read from stdin.

Makes sense.

> Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
> ---
>  diff-no-index.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/diff-no-index.c b/diff-no-index.c
> index 77462ac2a9..4470e0271d 100644
> --- a/diff-no-index.c
> +++ b/diff-no-index.c
> @@ -60,20 +60,19 @@ static int get_mode(const char *path, int *mode)
>  	return 0;
>  }
>  
> -static int populate_from_stdin(struct diff_filespec *s)
> +static void populate_from_stdin(struct diff_filespec *s)
>  {
>  	struct strbuf buf = STRBUF_INIT;
>  	size_t size = 0;
>  
>  	if (strbuf_read(&buf, 0, 0) < 0)
> -		return error_errno("error while reading from stdin");
> +		die_errno("error while reading from stdin");
>  
>  	s->should_munmap = 0;
>  	s->data = strbuf_detach(&buf, &size);
>  	s->size = size;
>  	s->should_free = 1;
>  	s->is_stdin = 1;
> -	return 0;
>  }
>  
>  static struct diff_filespec *noindex_filespec(const char *name, int mode)

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 3/4] t4054: test diff --no-index with stdin
  2023-07-05 19:49   ` [PATCH v2 3/4] t4054: test diff --no-index with stdin Phillip Wood
@ 2023-07-05 21:22     ` Junio C Hamano
  0 siblings, 0 replies; 17+ messages in thread
From: Junio C Hamano @ 2023-07-05 21:22 UTC (permalink / raw)
  To: Phillip Wood
  Cc: git, Jeff King, Taylor Blau, brian m. carlson, Thomas Guyot-Sionnest

Phillip Wood <phillip.wood123@gmail.com> writes:

> +	cat >expect <<-EOF &&
> +	diff --git a/- b/a/1
> +	index $ZERO_OID..$(git hash-object --stdin <a/1) 100644
> +	--- a/-
> +	+++ b/a/1
> +	@@ -1 +1 @@
> +	-x
> +	+1
> +	EOF
> +
> +	test_write_lines x | test_expect_code 1 \
> +		git -c core.abbrev=no diff --no-index -- - a/1 >actual &&

Or "git diff --no-index --full-index -- - a/1", which looks somewhat
odd but may be correct and a bit shorter ;-)

Not a suggestion to use --full-index instead of core.abbrev; both
are equally valid.

> +	test_cmp expect actual &&
> +
> +	test_write_lines 1 | git diff --no-index -- a/1 - >actual &&
> +	test_must_be_empty actual
> +'

Good.

>  test_expect_success 'diff --no-index refuses to diff stdin and a directory' '
>  	test_must_fail git diff --no-index -- - a </dev/null 2>err &&
>  	grep "fatal: cannot compare stdin to a directory" err

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 4/4] diff --no-index: support reading from named pipes
  2023-07-05 19:49   ` [PATCH v2 4/4] diff --no-index: support reading from named pipes Phillip Wood
@ 2023-07-05 22:19     ` Junio C Hamano
  2023-08-09 17:17     ` Jeff King
  1 sibling, 0 replies; 17+ messages in thread
From: Junio C Hamano @ 2023-07-05 22:19 UTC (permalink / raw)
  To: Phillip Wood
  Cc: git, Jeff King, Taylor Blau, brian m. carlson, Thomas Guyot-Sionnest

Phillip Wood <phillip.wood123@gmail.com> writes:

> From: Phillip Wood <phillip.wood@dunelm.org.uk>
>
> In some shells, such as bash and zsh, it's possible to use a command
> substitution to provide the output of a command as a file argument to
> another process, like so:
>
>   diff -u <(printf "a\nb\n") <(printf "a\nc\n")

"command substitution" -> "process substitution", I think, are the
words found in the manual pages of both shells cited as examples.

> However, this syntax does not produce useful results with "git diff
> --no-index". On macOS, the arguments to the command are named pipes
> under /dev/fd, and git diff doesn't know how to handle a named pipe. On
> Linux, the arguments are symlinks to pipes, so git diff "helpfully"
> diffs these symlinks, comparing their targets like "pipe:[1234]" and
> "pipe:[5678]".

Concisely and clearly described.  Very nice.

> To address this "diff --no-index" is changed so that if a path given on

"this" -> "this,", probably.

> the commandline is a named pipe or a symbolic link that resolves to a
> named pipe then we read the data to diff from that pipe. This is
> implemented by generalizing the code that already exists to handle
> reading from stdin when the user passes the path "-".

Great.

> If the user tries to compare a named pipe to a directory then we die as
> we do when trying to compare stdin to a directory.
>
> As process substitution is not support by POSIX this change is tested by
> using a pipe and a symbolic link to a pipe.

People who are interested to more directly test the intended use
case can add a test "diff --no-index <(one) <(two)" under some
prerequisite (like BASH).  The test on the underlying mechansim like
this patch does is just fine in the meantime.

> diff --git a/diff-no-index.c b/diff-no-index.c
> index 4470e0271d..4771cf02aa 100644
> --- a/diff-no-index.c
> +++ b/diff-no-index.c
> @@ -41,74 +41,119 @@ static int read_directory_contents(const char *path, struct string_list *list)
>   */
>  static const char file_from_standard_input[] = "-";
>  
> -static int get_mode(const char *path, int *mode)
> +/*
> + * For paths given on the command-line we treat "-" as stdin and named
> + * pipes and symbolic links to named pipes specially.
> + */
> +enum special {
> +	SPECIAL_NONE,
> +	SPECIAL_STDIN,
> +	SPECIAL_PIPE,
> +};
> +
> +static int get_mode(const char *path, int *mode, enum special *special)
>  {
>  	struct stat st;
>  
> -	if (!path || !strcmp(path, "/dev/null"))
> +	if (!path || !strcmp(path, "/dev/null")) {
>  		*mode = 0;
>  #ifdef GIT_WINDOWS_NATIVE
> -	else if (!strcasecmp(path, "nul"))
> +	} else if (!strcasecmp(path, "nul")) {
>  		*mode = 0;
>  #endif
> -	else if (path == file_from_standard_input)
> +	} else if (path == file_from_standard_input) {
>  		*mode = create_ce_mode(0666);
> -	else if (lstat(path, &st))
> +		*special = SPECIAL_STDIN;
> +	} else if (lstat(path, &st)) {
>  		return error("Could not access '%s'", path);
> -	else
> +	} else {
>  		*mode = st.st_mode;
> +	}
> +	/*
> +	 * For paths on the command-line treat named pipes and symbolic
> +	 * links that resolve to a named pipe specially.
> +	 */

Excellent.  The code structure is very clear.  We do what we have
already done, and in addition, treat pipes and symlinks to pipes
specially.

> +	if (special &&
> +	    (S_ISFIFO(*mode) ||
> +	     (S_ISLNK(*mode) && !stat(path, &st) && S_ISFIFO(st.st_mode)))) {
> +		*mode = create_ce_mode(0666);
> +		*special = SPECIAL_PIPE;
> +	}

What should we do when !stat(path, &st) fails here?  It may be a
dangling symbolic link, causes stat() to fail and we want to be
silent about it without changing *mode.  And that is exactly what
the code does.  Great.

Under what circumstances does a caller pass NULL to the special?  If
we were given a fifo and the caller passes NULL to the special,
*mode at this point would be S_ISFIFO() and not 0666.  It may be
what we want, but I do not immediately see why (I'll realize why
while reading the "recursing?" bit below).

> -static void populate_from_stdin(struct diff_filespec *s)
> +static void populate_common(struct diff_filespec *s, struct strbuf *buf)
>  {
> -	struct strbuf buf = STRBUF_INIT;
>  	size_t size = 0;

Not a new issue, but this initialization assignment is useless as
strbuf_detach() would always overwrite it (or die and would not come
back to us).

> -	if (strbuf_read(&buf, 0, 0) < 0)
> -		die_errno("error while reading from stdin");
> -
>  	s->should_munmap = 0;
> -	s->data = strbuf_detach(&buf, &size);
> +	s->data = strbuf_detach(buf, &size);
>  	s->size = size;

Again not a new issue, but it is unfortunate that s->size is not
size_t but ulong, which requires us to use a temporary variable of
the correct type, only to assign to it.

>  	s->should_free = 1;
>  	s->is_stdin = 1;
>  }
>  
> -static struct diff_filespec *noindex_filespec(const char *name, int mode)
> +static void populate_from_pipe(struct diff_filespec *s)
> +{
> +	struct strbuf buf = STRBUF_INIT;
> +	int fd = xopen(s->path, O_RDONLY);
> +
> +	if (strbuf_read(&buf, fd, 0) < 0)
> +		die_errno("error while reading from '%s'", s->path);
> +	close(fd);
> +	populate_common(s, &buf);
> +}
> +
> +static void populate_from_stdin(struct diff_filespec *s)
> +{
> +	struct strbuf buf = STRBUF_INIT;
> +
> +	if (strbuf_read(&buf, 0, 0) < 0)
> +		die_errno("error while reading from stdin");
> +	populate_common(s, &buf);
> +}

Both look OK.  I would have named "_common" -> "_finalize" or
something to signal that this is the shared "tail" part of the two
callers, but it is minor.

> +static struct diff_filespec *noindex_filespec(const char *name, int mode,
> +					      enum special special)
>  {
>  	struct diff_filespec *s;
>  
>  	if (!name)
>  		name = "/dev/null";
>  	s = alloc_filespec(name);
>  	fill_filespec(s, null_oid(), 0, mode);
> -	if (name == file_from_standard_input)
> +	if (special == SPECIAL_STDIN)
>  		populate_from_stdin(s);
> +	else if (special == SPECIAL_PIPE)
> +		populate_from_pipe(s);
>  	return s;
>  }

Great.

>  static int queue_diff(struct diff_options *o,
> -		      const char *name1, const char *name2)
> +		      const char *name1, const char *name2, int recursing)
>  {
>  	int mode1 = 0, mode2 = 0;
> +	enum special special1 = SPECIAL_NONE, special2 = SPECIAL_NONE;
>  
> -	if (get_mode(name1, &mode1) || get_mode(name2, &mode2))
> +	/* Paths can only be special if we're not recursing. */
> +	if (get_mode(name1, &mode1, recursing ? NULL : &special1) ||
> +	    get_mode(name2, &mode2, recursing ? NULL : &special2))
>  		return -1;

Ahh, OK.  If we are told to "diff --no-index D1 D2", even if we find
a fifo or symlink to a fifo, they aren't "<(foo)" process
substitutions.  Makes sense.  And because we do not handle FIFO in
such a case, we'd leave *mode (in get_mode()) to a value that we
never use (i.e. S_ISFIFO(*mode) is true); hopefully later part of
the control flow somewhere we will notice it and die?

>  	if (mode1 && mode2 && S_ISDIR(mode1) != S_ISDIR(mode2)) {
>  		struct diff_filespec *d1, *d2;
>  
>  		if (S_ISDIR(mode1)) {
>  			/* 2 is file that is created */
> -			d1 = noindex_filespec(NULL, 0);
> -			d2 = noindex_filespec(name2, mode2);
> +			d1 = noindex_filespec(NULL, 0, SPECIAL_NONE);
> +			d2 = noindex_filespec(name2, mode2, special2);
>  			name2 = NULL;
>  			mode2 = 0;
>  		} else {
>  			/* 1 is file that is deleted */
> -			d1 = noindex_filespec(name1, mode1);
> -			d2 = noindex_filespec(NULL, 0);
> +			d1 = noindex_filespec(name1, mode1, special1);
> +			d2 = noindex_filespec(NULL, 0, SPECIAL_NONE);
>  			name1 = NULL;
>  			mode1 = 0;
>  		}
> @@ -173,7 +218,7 @@ static int queue_diff(struct diff_options *o,
>  				n2 = buffer2.buf;
>  			}
>  
> -			ret = queue_diff(o, n1, n2);
> +			ret = queue_diff(o, n1, n2, 1);
>  		}
>  		string_list_clear(&p1, 0);
>  		string_list_clear(&p2, 0);
> @@ -189,8 +234,8 @@ static int queue_diff(struct diff_options *o,
>  			SWAP(name1, name2);
>  		}
>  
> -		d1 = noindex_filespec(name1, mode1);
> -		d2 = noindex_filespec(name2, mode2);
> +		d1 = noindex_filespec(name1, mode1, special1);
> +		d2 = noindex_filespec(name2, mode2, special2);
>  		diff_queue(&diff_queued_diff, d1, d2);
>  		return 0;
>  	}

Looks quite straight-forward.

> @@ -215,15 +260,27 @@ static void append_basename(struct strbuf *path, const char *dir, const char *fi
>   */
>  static void fixup_paths(const char **path, struct strbuf *replacement)
>  {
> -	unsigned int isdir0, isdir1;
> -
> -	isdir0 = path[0] != file_from_standard_input && is_directory(path[0]);
> -	isdir1 = path[1] != file_from_standard_input && is_directory(path[1]);
> +	struct stat st;
> +	unsigned int isdir0 = 0, isdir1 = 0;
> +	unsigned int ispipe0 = 0, ispipe1 = 0;
> +
> +	if (path[0] != file_from_standard_input && !stat(path[0], &st)) {
> +		isdir0 = S_ISDIR(st.st_mode);
> +		ispipe0 = S_ISFIFO(st.st_mode);
> +	}
> +
> +	if (path[1] != file_from_standard_input && !stat(path[1], &st)) {
> +		isdir1 = S_ISDIR(st.st_mode);
> +		ispipe1 = S_ISFIFO(st.st_mode);
> +	}
>  
>  	if ((path[0] == file_from_standard_input && isdir1) ||
>  	    (isdir0 && path[1] == file_from_standard_input))
>  		die(_("cannot compare stdin to a directory"));
>  
> +	if ((isdir0 && ispipe1) || (ispipe0 && isdir1))
> +		die(_("cannot compare a named pipe to a directory"));
> +

Nice.


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 4/4] diff --no-index: support reading from named pipes
  2023-07-05 19:49   ` [PATCH v2 4/4] diff --no-index: support reading from named pipes Phillip Wood
  2023-07-05 22:19     ` Junio C Hamano
@ 2023-08-09 17:17     ` Jeff King
  2023-08-10 12:56       ` Phillip Wood
  1 sibling, 1 reply; 17+ messages in thread
From: Jeff King @ 2023-08-09 17:17 UTC (permalink / raw)
  To: Phillip Wood
  Cc: git, Taylor Blau, brian m. carlson, Thomas Guyot-Sionnest,
	Junio C Hamano, Phillip Wood

On Wed, Jul 05, 2023 at 08:49:30PM +0100, Phillip Wood wrote:

> +test_expect_success PIPE 'diff --no-index refuses to diff a named pipe and a directory' '
> +	test_when_finished "rm -f pipe" &&
> +	mkfifo pipe &&
> +	{
> +		(>pipe) &
> +	} &&
> +	test_when_finished "kill $!" &&
> +	test_must_fail git diff --no-index -- pipe a 2>err &&
> +	grep "fatal: cannot compare a named pipe to a directory" err
> +'
> +
> +test_expect_success PIPE,SYMLINKS 'diff --no-index reads from pipes' '
> +	test_when_finished "rm -f old new new-link" &&
> +	mkfifo old &&
> +	mkfifo new &&
> +	ln -s new new-link &&
> +	{
> +		(test_write_lines a b c >old) &
> +	} &&
> +	test_when_finished "! kill $!" &&
> +	{
> +		(test_write_lines a x c >new) &
> +	} &&
> +	test_when_finished "! kill $!" &&
> +
> +	cat >expect <<-EOF &&
> +	diff --git a/old b/new-link
> +	--- a/old
> +	+++ b/new-link
> +	@@ -1,3 +1,3 @@
> +	 a
> +	-b
> +	+x
> +	 c
> +	EOF
> +
> +	test_expect_code 1 git diff --no-index old new-link >actual &&
> +	test_cmp expect actual
> +'

I've had t4053 hang for me once or twice in the last few days. I haven't
managed to pinpoint the problem, but I did notice that running it with
--stress seems to occasionally fail in thie "reads from pipes" test.

The problem is that the "kill" is racy. Even after we've read all of the
input from those sub-processes, they might still be hanging around
waiting to exit when our test_when_finished runs. And then kill will
return "0". So I think we need to either:

  1. Soften the when_finished to "kill $! || true" so that we are OK if
     they are still there.

  2. If the diff command completed as expected, it should be safe to
     "wait" for each of them to confirm that they successfully wrote
     everything. I'm not sure it buys us much over testing the output
     from "diff", though.

I still don't see where the hang comes from, though. It might be
unrelated. I'll try to examine more next time I catch it in the act.

-Peff

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 4/4] diff --no-index: support reading from named pipes
  2023-08-09 17:17     ` Jeff King
@ 2023-08-10 12:56       ` Phillip Wood
  0 siblings, 0 replies; 17+ messages in thread
From: Phillip Wood @ 2023-08-10 12:56 UTC (permalink / raw)
  To: Jeff King, Phillip Wood
  Cc: git, Taylor Blau, brian m. carlson, Thomas Guyot-Sionnest,
	Junio C Hamano

Hi Eff

Thanks for reporting this

On 09/08/2023 18:17, Jeff King wrote:
> On Wed, Jul 05, 2023 at 08:49:30PM +0100, Phillip Wood wrote:
> 
>> +test_expect_success PIPE 'diff --no-index refuses to diff a named pipe and a directory' '
>> +	test_when_finished "rm -f pipe" &&
>> +	mkfifo pipe &&
>> +	{
>> +		(>pipe) &
>> +	} &&
>> +	test_when_finished "kill $!" &&
>> +	test_must_fail git diff --no-index -- pipe a 2>err &&
>> +	grep "fatal: cannot compare a named pipe to a directory" err
>> +'
>> +
>> +test_expect_success PIPE,SYMLINKS 'diff --no-index reads from pipes' '
>> +	test_when_finished "rm -f old new new-link" &&
>> +	mkfifo old &&
>> +	mkfifo new &&
>> +	ln -s new new-link &&
>> +	{
>> +		(test_write_lines a b c >old) &
>> +	} &&
>> +	test_when_finished "! kill $!" &&
>> +	{
>> +		(test_write_lines a x c >new) &
>> +	} &&
>> +	test_when_finished "! kill $!" &&
>> +
>> +	cat >expect <<-EOF &&
>> +	diff --git a/old b/new-link
>> +	--- a/old
>> +	+++ b/new-link
>> +	@@ -1,3 +1,3 @@
>> +	 a
>> +	-b
>> +	+x
>> +	 c
>> +	EOF
>> +
>> +	test_expect_code 1 git diff --no-index old new-link >actual &&
>> +	test_cmp expect actual
>> +'
> 
> I've had t4053 hang for me once or twice in the last few days. I haven't
> managed to pinpoint the problem, but I did notice that running it with
> --stress seems to occasionally fail in thie "reads from pipes" test.
> 
> The problem is that the "kill" is racy. Even after we've read all of the
> input from those sub-processes, they might still be hanging around
> waiting to exit when our test_when_finished runs. And then kill will
> return "0". So I think we need to either:
> 
>    1. Soften the when_finished to "kill $! || true" so that we are OK if
>       they are still there.

I think this is the easiest option, I'll send a patch later today

>    2. If the diff command completed as expected, it should be safe to
>       "wait" for each of them to confirm that they successfully wrote
>       everything. I'm not sure it buys us much over testing the output
>       from "diff", though.

If the diff output is OK that's I think that's all we really care about.

> I still don't see where the hang comes from, though. It might be
> unrelated. I'll try to examine more next time I catch it in the act.

I had a look at the tests again and nothing jumped out at me. If you do 
manage to catch it hanging we should at least we should be able to tell 
which test is causing the problem.

Thanks again,

Phillip

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2023-08-10 12:56 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-06-27 14:10 [PATCH 0/3] diff --no-index: support reading from named pipes Phillip Wood
2023-06-27 14:10 ` [PATCH 1/3] diff --no-index: die on error reading stdin Phillip Wood
2023-06-27 14:10 ` [PATCH 2/3] t4054: test diff --no-index with stdin Phillip Wood
2023-06-27 14:10 ` [PATCH 3/3] diff --no-index: support reading from named pipes Phillip Wood
2023-06-27 19:44   ` Junio C Hamano
2023-06-28 10:05     ` Phillip Wood
2023-07-05 19:49 ` [PATCH v2 0/4] " Phillip Wood
2023-07-05 19:49   ` [PATCH v2 1/4] diff --no-index: refuse to compare stdin to a directory Phillip Wood
2023-07-05 21:17     ` Junio C Hamano
2023-07-05 19:49   ` [PATCH v2 2/4] diff --no-index: die on error reading stdin Phillip Wood
2023-07-05 21:18     ` Junio C Hamano
2023-07-05 19:49   ` [PATCH v2 3/4] t4054: test diff --no-index with stdin Phillip Wood
2023-07-05 21:22     ` Junio C Hamano
2023-07-05 19:49   ` [PATCH v2 4/4] diff --no-index: support reading from named pipes Phillip Wood
2023-07-05 22:19     ` Junio C Hamano
2023-08-09 17:17     ` Jeff King
2023-08-10 12:56       ` Phillip Wood

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.