All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] sha1_file: pass empty buffer to index empty file
@ 2015-05-14 17:23 Jim Hill
  2015-05-14 18:43 ` Junio C Hamano
  0 siblings, 1 reply; 23+ messages in thread
From: Jim Hill @ 2015-05-14 17:23 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

`git add` of an empty file with a filter currently pops complaints from
`copy_fd` about a bad file descriptor.

This traces back to these lines in sha1_file.c:index_core:

	if (!size) {
		ret = index_mem(sha1, NULL, size, type, path, flags);

The problem here is that content to be added to the index can be
supplied from an fd, or from a memory buffer, or from a pathname. This
call is supplying a NULL buffer pointer and a zero size.

Downstream logic takes the complete absence of a buffer to mean the
data is to be found elsewhere -- for instance, these, from convert.c:

	if (params->src) {
		write_err = (write_in_full(child_process.in, params->src, params->size) < 0);
	} else {
		write_err = copy_fd(params->fd, child_process.in);
	}

~If there's a buffer, write from that, otherwise the data must be coming
from an open fd.~

Perfectly reasonable logic in a routine that's going to write from
either a buffer or an fd.

So change `index_core` to supply an empty buffer when indexing an empty
file.

There's a patch out there that instead changes the logic quoted above to
take a `-1` fd to mean "use the buffer", but it seems to me that the
distinction between a missing buffer and an empty one carries intrinsic
semantics, where the logic change is adapting the code to handle
incorrect arguments.

Signed-off-by: Jim Hill <gjthill@gmail.com>
---
 sha1_file.c                        |  2 +-
 t/t2205-add-empty-filtered-file.sh | 21 +++++++++++++++++++++
 2 files changed, 22 insertions(+), 1 deletion(-)
 create mode 100755 t/t2205-add-empty-filtered-file.sh

diff --git a/sha1_file.c b/sha1_file.c
index f860d67..61e2735 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -3186,7 +3186,7 @@ static int index_core(unsigned char *sha1, int fd, size_t size,
 	int ret;
 
 	if (!size) {
-		ret = index_mem(sha1, NULL, size, type, path, flags);
+		ret = index_mem(sha1, "", size, type, path, flags);
 	} else if (size <= SMALL_FILE_SIZE) {
 		char *buf = xmalloc(size);
 		if (size == read_in_full(fd, buf, size))
diff --git a/t/t2205-add-empty-filtered-file.sh b/t/t2205-add-empty-filtered-file.sh
new file mode 100755
index 0000000..28903c4
--- /dev/null
+++ b/t/t2205-add-empty-filtered-file.sh
@@ -0,0 +1,21 @@
+#!/bin/sh
+
+test_description='adding empty filtered file'
+
+. ./test-lib.sh
+
+test_expect_success setup '
+	echo "* filter=test" >>.gitattributes &&
+	git config filter.test.clean cat &&
+	git config filter.test.smudge cat &&
+	git add . &&
+	git commit -m-
+
+'
+
+test_expect_success "add of empty filtered file produces no complaints" '
+	touch emptyfile &&
+	git add emptyfile 2>out &&
+	test -e out -a ! -s out
+'
+test_done
-- 
2.4.1.4.gd9c648d

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH] sha1_file: pass empty buffer to index empty file
  2015-05-14 17:23 [PATCH] sha1_file: pass empty buffer to index empty file Jim Hill
@ 2015-05-14 18:43 ` Junio C Hamano
  2015-05-14 23:17   ` [PATCH v2] " Jim Hill
  2015-05-14 23:26   ` [PATCH] " Jim Hill
  0 siblings, 2 replies; 23+ messages in thread
From: Junio C Hamano @ 2015-05-14 18:43 UTC (permalink / raw)
  To: Jim Hill; +Cc: git

Jim Hill <gjthill@gmail.com> writes:

> `git add` of an empty file with a filter currently pops complaints from
> `copy_fd` about a bad file descriptor.

Our log message typically begins with the description of a problem
that exists; "currently" is redundant in that context.  Please lose
that word.

> This traces back to these lines in sha1_file.c:index_core:
>
> 	if (!size) {
> 		ret = index_mem(sha1, NULL, size, type, path, flags);
>
> The problem here is that content to be added to the index can be
> supplied from an fd, or from a memory buffer, or from a pathname. This
> call is supplying a NULL buffer pointer and a zero size.

Good spotting.

I think the original thinking was that "we'd feed mem[0..size) to
the hash function, so mem being NULL should not matter", but as you
analysed, mem being NULL is used as a signal with a different meaning,
and your fix is the right one.

I am not enthused to see a new test script wasted just for one piece
of test.  Don't we have other "run 'git add' with clean/smudge
filters" test to which you can add this new one to?  If there is
none, then a new test script is good, but then it should be a place
to _add_ missing "run 'git add' with filters" test and its name
should not be so specific to this "empty" special case.

> diff --git a/sha1_file.c b/sha1_file.c
> index f860d67..61e2735 100644
> --- a/sha1_file.c
> +++ b/sha1_file.c
> @@ -3186,7 +3186,7 @@ static int index_core(unsigned char *sha1, int fd, size_t size,
>  	int ret;
>  
>  	if (!size) {
> -		ret = index_mem(sha1, NULL, size, type, path, flags);
> +		ret = index_mem(sha1, "", size, type, path, flags);
>  	} else if (size <= SMALL_FILE_SIZE) {
>  		char *buf = xmalloc(size);
>  		if (size == read_in_full(fd, buf, size))
> diff --git a/t/t2205-add-empty-filtered-file.sh b/t/t2205-add-empty-filtered-file.sh
> new file mode 100755
> index 0000000..28903c4
> --- /dev/null
> +++ b/t/t2205-add-empty-filtered-file.sh
> @@ -0,0 +1,21 @@
> +#!/bin/sh
> +
> +test_description='adding empty filtered file'
> +
> +. ./test-lib.sh
> +
> +test_expect_success setup '
> +	echo "* filter=test" >>.gitattributes &&
> +	git config filter.test.clean cat &&
> +	git config filter.test.smudge cat &&
> +	git add . &&
> +	git commit -m-
> +
> +'

Please do not be cryptic and show a good example, e.g.

	git commit -m test

Also lose the blank line from that test.

> +
> +test_expect_success "add of empty filtered file produces no complaints" '
> +	touch emptyfile &&

"touch" is to be used when you care about the timestamp.  When you
care more about the _presence_ of the file than what timestamp it
has, do not use "touch".  Say

	>emptyfile &&

instead.  This is especially true in this case, because you not only
care about the presence but you care about it being _empty_.

> +	git add emptyfile 2>out &&
> +	test -e out -a ! -s out

Future generation of Git users may want to see "git add emptyfile"
that succeeds to still say something and that something may be
different from "the complaint about a bad file descriptor".  Don't
force the person who makes such a change to update this test.

You may do

	git add emptyfile 2>err &&

and check that 'err' does not contain the copy-fd error (in other
words, this test is not in a good position to reject any other
output to the standard error stream), if you really wanted to, but I
do not think it is worth it.

My preference is to lose the test on 'out' and end this test like
this:

	git add emptyfile

> +'
> +test_done

Thanks.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH v2] sha1_file: pass empty buffer to index empty file
  2015-05-14 18:43 ` Junio C Hamano
@ 2015-05-14 23:17   ` Jim Hill
  2015-05-15 18:01     ` Junio C Hamano
  2015-05-14 23:26   ` [PATCH] " Jim Hill
  1 sibling, 1 reply; 23+ messages in thread
From: Jim Hill @ 2015-05-14 23:17 UTC (permalink / raw)
  To: Junkio C Hamano; +Cc: git

`git add` of an empty file with a filter pops complaints from
`copy_fd` about a bad file descriptor.

This traces back to these lines in sha1_file.c:index_core:

	if (!size) {
		ret = index_mem(sha1, NULL, size, type, path, flags);

The problem here is that content to be added to the index can be
supplied from an fd, or from a memory buffer, or from a pathname. This
call is supplying a NULL buffer pointer and a zero size.

Downstream logic takes the complete absence of a buffer to mean the
data is to be found elsewhere -- for instance, these, from convert.c:

	if (params->src) {
		write_err = (write_in_full(child_process.in, params->src, params->size) < 0);
	} else {
		write_err = copy_fd(params->fd, child_process.in);
	}

~If there's a buffer, write from that, otherwise the data must be coming
from an open fd.~

Perfectly reasonable logic in a routine that's going to write from
either a buffer or an fd.

So change `index_core` to supply an empty buffer when indexing an empty
file.

There's a patch out there that instead changes the logic quoted above to
take a `-1` fd to mean "use the buffer", but it seems to me that the
distinction between a missing buffer and an empty one carries intrinsic
semantics, where the logic change is adapting the code to handle
incorrect arguments.

Signed-off-by: Jim Hill <gjthill@gmail.com>
---

Junio C Hamano wrote:
> Please lose that word.
Check.

> Good spotting.
Thanks :-)

> I am not enthused to see a new test script wasted just for one piece
> of test.  Don't we have other 
Yup. Didn't see a place I liked for it in the add and update-index
test suites, didn't think to look for a filter test suite...
Check.

> Please do not be cryptic and show a good example, e.g.
> Also lose the blank line from that test.
>~Say `>emptyfile &&`
Checkcheckcheck

> check that 'err' does not contain the copy-fd error

Implemented this out of necessity, because the add works and returns
success despite the complaints to stderr.


 sha1_file.c           |  2 +-
 t/t0021-conversion.sh | 11 +++++++++++
 2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/sha1_file.c b/sha1_file.c
index f860d67..61e2735 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -3186,7 +3186,7 @@ static int index_core(unsigned char *sha1, int fd, size_t size,
 	int ret;
 
 	if (!size) {
-		ret = index_mem(sha1, NULL, size, type, path, flags);
+		ret = index_mem(sha1, "", size, type, path, flags);
 	} else if (size <= SMALL_FILE_SIZE) {
 		char *buf = xmalloc(size);
 		if (size == read_in_full(fd, buf, size))
diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
index ca7d2a6..5986bb0 100755
--- a/t/t0021-conversion.sh
+++ b/t/t0021-conversion.sh
@@ -216,4 +216,15 @@ test_expect_success EXPENSIVE 'filter large file' '
 	! test -s err
 '
 
+test_expect_success "filtering empty file should not produce complaints" '
+	echo "emptyfile filter=cat" >>.gitattributes &&
+	git config filter.cat.clean cat &&
+	git config filter.cat.smudge cat &&
+	git add . &&
+	git commit -m "cat filter for emptyfile" &&
+	> emptyfile &&
+	git add emptyfile 2>err &&
+	! grep -Fiqs "bad file descriptor" err
+'
+
 test_done
-- 
2.4.1.4.gd9c648d

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH] sha1_file: pass empty buffer to index empty file
  2015-05-14 18:43 ` Junio C Hamano
  2015-05-14 23:17   ` [PATCH v2] " Jim Hill
@ 2015-05-14 23:26   ` Jim Hill
  1 sibling, 0 replies; 23+ messages in thread
From: Jim Hill @ 2015-05-14 23:26 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Thanks for the corrections and improvements. The resulting test code
looks much cleaner to me.

Please do suggest any further improvements that occur to you,

Thanks,
Jim

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2] sha1_file: pass empty buffer to index empty file
  2015-05-14 23:17   ` [PATCH v2] " Jim Hill
@ 2015-05-15 18:01     ` Junio C Hamano
  2015-05-15 23:31       ` Jim Hill
  0 siblings, 1 reply; 23+ messages in thread
From: Junio C Hamano @ 2015-05-15 18:01 UTC (permalink / raw)
  To: Jim Hill; +Cc: git

Jim Hill <gjthill@gmail.com> writes:

>> check that 'err' does not contain the copy-fd error
>
> Implemented this out of necessity, because the add works and returns
> success despite the complaints to stderr.

That would mean that you found _another_ bug, wouldn't it?  If
copy-fd failed to read input to feed the external filter with, it
must have returned an error to its caller, and somebody in the
callchain is not paying attention to that error and pretending
as if everything went well.  That's a separate issue, though.

In any case, I think the following patch may make the test better
(apply on top of yours).

 * A failure to run the filter with the right contents can be caught
   by examining the outcome.  I tweaked the filter to prepend an
   extra header line to the contents; if copy-fd failed to drive the
   filter, we wouldn't see the cleaned output to match that extra
   header line (and nothing else---as the contents we are feeding is
   an empty blob).

 * There is no need to create an extra commit; an uncommitted
   .gitattributes from the working tree would work just fine.

 * The "grep" is gone, with use of -i (questionable why it is
   needed), -q (generally, we do not squelch error output in
   individual tests, which is unnecessary and running tests with -v
   option less useful) and -s (same, withquestionable portability).

Thanks.

diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
index 5986bb0..a72d265 100755
--- a/t/t0021-conversion.sh
+++ b/t/t0021-conversion.sh
@@ -216,15 +216,17 @@ test_expect_success EXPENSIVE 'filter large file' '
 	! test -s err
 '
 
-test_expect_success "filtering empty file should not produce complaints" '
-	echo "emptyfile filter=cat" >>.gitattributes &&
-	git config filter.cat.clean cat &&
-	git config filter.cat.smudge cat &&
-	git add . &&
-	git commit -m "cat filter for emptyfile" &&
-	> emptyfile &&
-	git add emptyfile 2>err &&
-	! grep -Fiqs "bad file descriptor" err
+test_expect_success "filtering empty file should work correctly" '
+	write_script filter-clean.sh <<-EOF &&
+	echo "Extra Head" && cat
+	EOF
+	echo "emptyfile filter=check" >>.gitattributes &&
+	git config filter.check.clean "sh ./filter-clean.sh" &&
+	>emptyfile &&
+	git add emptyfile &&
+	echo "Extra Head" >expect &&
+	git cat-file blob :emptyfile >actual &&
+	test_cmp expect actual
 '
 
 test_done

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH v2] sha1_file: pass empty buffer to index empty file
  2015-05-15 18:01     ` Junio C Hamano
@ 2015-05-15 23:31       ` Jim Hill
  2015-05-16 18:48         ` Junio C Hamano
  0 siblings, 1 reply; 23+ messages in thread
From: Jim Hill @ 2015-05-15 23:31 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Fri, May 15, 2015 at 11:01:34AM -0700, Junio C Hamano wrote:
> That would mean that you found _another_ bug, wouldn't it?  If
> copy-fd failed to read input to feed the external filter with, it
> must have returned an error to its caller, and somebody in the
> callchain is not paying attention to that error and pretending
> as if everything went well.  That's a separate issue, though.

as you say, separate ... I think I stumbled over more than one:

setup:
	~/sandbox/40$ git grl
	core.autocrlf false
	core.whitespace cr-at-eof
	core.repositoryformatversion 0
	core.filemode true
	core.bare false
	core.logallrefupdates true
	filter.cat.smudge cat
	filter.cat.clean echo Kilroy was here && cat
	filter.cat.required true
	~/sandbox/40$ git rm --cached -f --ignore-unmatch emptyfile
	rm 'emptyfile'

with required filter:
	~/sandbox/40$ cat emptyfile
	~/sandbox/40$ git add emptyfile
	~/sandbox/40$ git show :emptyfile
	Kilroy was here
	~/sandbox/40$ git config --unset filter.cat.required

then with not-required filter:
	~/sandbox/40$ git rm --cached -f --ignore-unmatch emptyfile
	error: copy-fd: read returned Bad file descriptor
	error: cannot feed the input to external filter echo Kilroy was here && cat
	error: external filter echo Kilroy was here && cat failed
	rm 'emptyfile'
	~/sandbox/40$ git show :emptyfile
	fatal: Path 'emptyfile' exists on disk, but not in the index.
	~/sandbox/40$ git add emptyfile
	error: copy-fd: read returned Bad file descriptor
	error: cannot feed the input to external filter echo Kilroy was here && cat
	error: external filter echo Kilroy was here && cat failed
	~/sandbox/40$ git show :emptyfile
	~/sandbox/40$ git rm --cached emptyfile
	rm 'emptyfile'
	~/sandbox/40$ git add emptyfile
	error: copy-fd: read returned Bad file descriptor
	error: cannot feed the input to external filter echo Kilroy was here && cat
	error: external filter echo Kilroy was here && cat failed
	~/sandbox/40$ git rm --cached -f --ignore-unmatch emptyfile
	rm 'emptyfile'
	~/sandbox/40$ 

===

I don't understand rm's choices of when to run the filter, and the
apparently entirely separate code path for required filters is just
bothersome.

>  * A failure to run the filter with the right contents can be caught
>    by examining the outcome.

agreed. That's better anyway -- my few git greps didn't find any
empty-file filter tests anyway.

>  * There is no need to create an extra commit; an uncommitted
>    .gitattributes from the working tree would work just fine.

Done.

>  * The "grep" is gone, with use of -i (questionable why it is
>    needed), 

Yah, I was bad-thinking strerror results might be a bit unpredictable, I
should have checked for a string under git's control instead.  I'd just
assumed the 0 return was because non-required filters are allowed to
fail, got the above transcript while checking the assumption.

=== 

So, so long as we're testing empty-file filters, I figured I'd add real
empty-file filter tests, I think that covers it.

So is this better instead?

diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
index 5986bb0..fc2c644 100755
--- a/t/t0021-conversion.sh
+++ b/t/t0021-conversion.sh
@@ -216,15 +216,33 @@ test_expect_success EXPENSIVE 'filter large file' '
 	! test -s err
 '
 
-test_expect_success "filtering empty file should not produce complaints" '
-	echo "emptyfile filter=cat" >>.gitattributes &&
-	git config filter.cat.clean cat &&
-	git config filter.cat.smudge cat &&
-	git add . &&
-	git commit -m "cat filter for emptyfile" &&
-	> emptyfile &&
-	git add emptyfile 2>err &&
-	! grep -Fiqs "bad file descriptor" err
+test_expect_success "filter: clean empty file" '
+	header=---in-repo-header--- &&
+	git config filter.in-repo-header.clean  "echo $header && cat" &&
+	git config filter.in-repo-header.smudge "sed 1d" &&
+
+	echo "empty-in-worktree    filter=in-repo-header" >>.gitattributes &&
+	> empty-in-worktree &&
+
+	echo $header              > expected &&
+	git add empty-in-worktree            &&
+	git show :empty-in-worktree > actual &&
+	test_cmp expected actual
+'
+
+test_expect_success "filter: smudge empty file" '
+	git config filter.empty-in-repo.smudge "echo smudge added line && cat" &&
+	git config filter.empty-in-repo.clean   true &&
+
+	echo "empty-in-repo      filter=empty-in-repo"  >>.gitattributes &&
+
+	echo dead data walking > empty-in-repo &&
+	git add empty-in-repo &&
+
+	:			> expected &&
+	git show :empty-in-repo	> actual &&
+	test_cmp expected actual
 '
 
 test_done
+

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH v2] sha1_file: pass empty buffer to index empty file
  2015-05-15 23:31       ` Jim Hill
@ 2015-05-16 18:48         ` Junio C Hamano
  2015-05-16 20:06           ` [PATCH v3] " Jim Hill
  0 siblings, 1 reply; 23+ messages in thread
From: Junio C Hamano @ 2015-05-16 18:48 UTC (permalink / raw)
  To: Jim Hill; +Cc: git

Jim Hill <gjthill@gmail.com> writes:

> So, so long as we're testing empty-file filters, I figured I'd add real
> empty-file filter tests, I think that covers it.
>
> So is this better instead?

I wouldn't use "---in-repo-header--" as that extra string.  Feeding
anything that begins with '-' to 'echo' gives me portability worries
for one thing.  A single word "Extra" would suffice.

Be careful and consistent wrt redirection operator and its file; we
do not write SP there but some of yours do and some others don't.

Do not attempt to align && with excess SPs; other tests don't.

Other than that, yeah, I think that is an improvement.

Thanks.

>
> diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
> index 5986bb0..fc2c644 100755
> --- a/t/t0021-conversion.sh
> +++ b/t/t0021-conversion.sh
> @@ -216,15 +216,33 @@ test_expect_success EXPENSIVE 'filter large file' '
>  	! test -s err
>  '
>  
> -test_expect_success "filtering empty file should not produce complaints" '
> -	echo "emptyfile filter=cat" >>.gitattributes &&
> -	git config filter.cat.clean cat &&
> -	git config filter.cat.smudge cat &&
> -	git add . &&
> -	git commit -m "cat filter for emptyfile" &&
> -	> emptyfile &&
> -	git add emptyfile 2>err &&
> -	! grep -Fiqs "bad file descriptor" err
> +test_expect_success "filter: clean empty file" '
> +	header=---in-repo-header--- &&
> +	git config filter.in-repo-header.clean  "echo $header && cat" &&
> +	git config filter.in-repo-header.smudge "sed 1d" &&
> +
> +	echo "empty-in-worktree    filter=in-repo-header" >>.gitattributes &&
> +	> empty-in-worktree &&
> +
> +	echo $header              > expected &&
> +	git add empty-in-worktree            &&
> +	git show :empty-in-worktree > actual &&
> +	test_cmp expected actual
> +'
> +
> +test_expect_success "filter: smudge empty file" '
> +	git config filter.empty-in-repo.smudge "echo smudge added line && cat" &&
> +	git config filter.empty-in-repo.clean   true &&
> +
> +	echo "empty-in-repo      filter=empty-in-repo"  >>.gitattributes &&
> +
> +	echo dead data walking > empty-in-repo &&
> +	git add empty-in-repo &&
> +
> +	:			> expected &&
> +	git show :empty-in-repo	> actual &&
> +	test_cmp expected actual
>  '
>  
>  test_done
> +

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH v3] sha1_file: pass empty buffer to index empty file
  2015-05-16 18:48         ` Junio C Hamano
@ 2015-05-16 20:06           ` Jim Hill
  2015-05-16 23:22             ` Junio C Hamano
  2015-05-17 17:37             ` Junio C Hamano
  0 siblings, 2 replies; 23+ messages in thread
From: Jim Hill @ 2015-05-16 20:06 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

`git add` of an empty file with a filter pops complaints from
`copy_fd` about a bad file descriptor.

This traces back to these lines in sha1_file.c:index_core:

	if (!size) {
		ret = index_mem(sha1, NULL, size, type, path, flags);

The problem here is that content to be added to the index can be
supplied from an fd, or from a memory buffer, or from a pathname. This
call is supplying a NULL buffer pointer and a zero size.

Downstream logic takes the complete absence of a buffer to mean the
data is to be found elsewhere -- for instance, these, from convert.c:

	if (params->src) {
		write_err = (write_in_full(child_process.in, params->src, params->size) < 0);
	} else {
		write_err = copy_fd(params->fd, child_process.in);
	}

~If there's a buffer, write from that, otherwise the data must be coming
from an open fd.~

Perfectly reasonable logic in a routine that's going to write from
either a buffer or an fd.

So change `index_core` to supply an empty buffer when indexing an empty
file.

There's a patch out there that instead changes the logic quoted above to
take a `-1` fd to mean "use the buffer", but it seems to me that the
distinction between a missing buffer and an empty one carries intrinsic
semantics, where the logic change is adapting the code to handle
incorrect arguments.

Signed-off-by: Jim Hill <gjthill@gmail.com>
---
I promise to pay more attention to test quality in the future, thanks for the
patience.

 sha1_file.c           |  2 +-
 t/t0021-conversion.sh | 26 ++++++++++++++++++++++++++
 2 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/sha1_file.c b/sha1_file.c
index f860d67..61e2735 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -3186,7 +3186,7 @@ static int index_core(unsigned char *sha1, int fd, size_t size,
 	int ret;
 
 	if (!size) {
-		ret = index_mem(sha1, NULL, size, type, path, flags);
+		ret = index_mem(sha1, "", size, type, path, flags);
 	} else if (size <= SMALL_FILE_SIZE) {
 		char *buf = xmalloc(size);
 		if (size == read_in_full(fd, buf, size))
diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
index ca7d2a6..bf87e9b 100755
--- a/t/t0021-conversion.sh
+++ b/t/t0021-conversion.sh
@@ -216,4 +216,30 @@ test_expect_success EXPENSIVE 'filter large file' '
 	! test -s err
 '
 
+test_expect_success "filter: clean empty file" '
+	git config filter.in-repo-header.clean  "echo cleaned && cat" &&
+	git config filter.in-repo-header.smudge "sed 1d" &&
+
+	echo "empty-in-worktree    filter=in-repo-header" >>.gitattributes &&
+	>empty-in-worktree &&
+
+	echo cleaned >expected &&
+	git add empty-in-worktree &&
+	git show :empty-in-worktree >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success "filter: smudge empty file" '
+	git config filter.empty-in-repo.clean true &&
+	git config filter.empty-in-repo.smudge "echo smudged && cat" &&
+
+	echo "empty-in-repo filter=empty-in-repo"  >>.gitattributes &&
+	echo dead data walking >empty-in-repo &&
+	git add empty-in-repo &&
+
+	echo smudged >expected &&
+	git checkout-index --prefix=filtered- empty-in-repo &&
+	test_cmp expected filtered-empty-in-repo
+'
+
 test_done
-- 
2.4.1.4.g08cda19

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH v3] sha1_file: pass empty buffer to index empty file
  2015-05-16 20:06           ` [PATCH v3] " Jim Hill
@ 2015-05-16 23:22             ` Junio C Hamano
  2015-05-17 17:37             ` Junio C Hamano
  1 sibling, 0 replies; 23+ messages in thread
From: Junio C Hamano @ 2015-05-16 23:22 UTC (permalink / raw)
  To: Jim Hill; +Cc: git

Jim Hill <gjthill@gmail.com> writes:

> `git add` of an empty file with a filter pops complaints from
> `copy_fd` about a bad file descriptor.
>
> This traces back to these lines in sha1_file.c:index_core:
>
> 	if (!size) {
> 		ret = index_mem(sha1, NULL, size, type, path, flags);
>
> The problem here is that content to be added to the index can be
> supplied from an fd, or from a memory buffer, or from a pathname. This
> call is supplying a NULL buffer pointer and a zero size.
>
> Downstream logic takes the complete absence of a buffer to mean the
> data is to be found elsewhere -- for instance, these, from convert.c:
>
> 	if (params->src) {
> 		write_err = (write_in_full(child_process.in, params->src, params->size) < 0);
> 	} else {
> 		write_err = copy_fd(params->fd, child_process.in);
> 	}
>
> ~If there's a buffer, write from that, otherwise the data must be coming
> from an open fd.~
>
> Perfectly reasonable logic in a routine that's going to write from
> either a buffer or an fd.
>
> So change `index_core` to supply an empty buffer when indexing an empty
> file.
>
> There's a patch out there that instead changes the logic quoted above to
> take a `-1` fd to mean "use the buffer", but it seems to me that the
> distinction between a missing buffer and an empty one carries intrinsic
> semantics, where the logic change is adapting the code to handle
> incorrect arguments.
>
> Signed-off-by: Jim Hill <gjthill@gmail.com>
> ---
> I promise to pay more attention to test quality in the future, thanks for the
> patience.

It's us who should thank you ;-).  Thanks for spending time to
polish essentially a one-liner this long.


>
>  sha1_file.c           |  2 +-
>  t/t0021-conversion.sh | 26 ++++++++++++++++++++++++++
>  2 files changed, 27 insertions(+), 1 deletion(-)
>
> diff --git a/sha1_file.c b/sha1_file.c
> index f860d67..61e2735 100644
> --- a/sha1_file.c
> +++ b/sha1_file.c
> @@ -3186,7 +3186,7 @@ static int index_core(unsigned char *sha1, int fd, size_t size,
>  	int ret;
>  
>  	if (!size) {
> -		ret = index_mem(sha1, NULL, size, type, path, flags);
> +		ret = index_mem(sha1, "", size, type, path, flags);
>  	} else if (size <= SMALL_FILE_SIZE) {
>  		char *buf = xmalloc(size);
>  		if (size == read_in_full(fd, buf, size))
> diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
> index ca7d2a6..bf87e9b 100755
> --- a/t/t0021-conversion.sh
> +++ b/t/t0021-conversion.sh
> @@ -216,4 +216,30 @@ test_expect_success EXPENSIVE 'filter large file' '
>  	! test -s err
>  '
>  
> +test_expect_success "filter: clean empty file" '
> +	git config filter.in-repo-header.clean  "echo cleaned && cat" &&
> +	git config filter.in-repo-header.smudge "sed 1d" &&
> +
> +	echo "empty-in-worktree    filter=in-repo-header" >>.gitattributes &&
> +	>empty-in-worktree &&
> +
> +	echo cleaned >expected &&
> +	git add empty-in-worktree &&
> +	git show :empty-in-worktree >actual &&
> +	test_cmp expected actual
> +'
> +
> +test_expect_success "filter: smudge empty file" '
> +	git config filter.empty-in-repo.clean true &&
> +	git config filter.empty-in-repo.smudge "echo smudged && cat" &&
> +
> +	echo "empty-in-repo filter=empty-in-repo"  >>.gitattributes &&
> +	echo dead data walking >empty-in-repo &&
> +	git add empty-in-repo &&
> +
> +	echo smudged >expected &&
> +	git checkout-index --prefix=filtered- empty-in-repo &&
> +	test_cmp expected filtered-empty-in-repo
> +'
> +
>  test_done

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3] sha1_file: pass empty buffer to index empty file
  2015-05-16 20:06           ` [PATCH v3] " Jim Hill
  2015-05-16 23:22             ` Junio C Hamano
@ 2015-05-17 17:37             ` Junio C Hamano
  2015-05-17 19:10               ` Junio C Hamano
  1 sibling, 1 reply; 23+ messages in thread
From: Junio C Hamano @ 2015-05-17 17:37 UTC (permalink / raw)
  To: Jim Hill; +Cc: git

Jim Hill <gjthill@gmail.com> writes:

> +test_expect_success "filter: smudge empty file" '
> +	git config filter.empty-in-repo.clean true &&

But this one is correct but tricky ;-)

If the contents to be cleaned is small enough (i.e. the one-liner
file used in this test) to fit in the pipe buffer and we feed the
pipe before 'true' exits, we won't see any problem.  Otherwise we
may get SIGPIPE when we attempt to write to the 'true' (non-)filter,
but because we explicitly ignore SIGPIPE, 'true' still is a "black
hole" filter.

"cat >/dev/null" may have been a more naive and straight-forward way
to write this "black hole" filter, but what you did is fine.

> +	git config filter.empty-in-repo.smudge "echo smudged && cat" &&
> +
> +	echo "empty-in-repo filter=empty-in-repo"  >>.gitattributes &&
> +	echo dead data walking >empty-in-repo &&
> +	git add empty-in-repo &&
> +
> +	echo smudged >expected &&
> +	git checkout-index --prefix=filtered- empty-in-repo &&
> +	test_cmp expected filtered-empty-in-repo

This is also correct but tricky.

    rm -f empty-in-repo && git checkout empty-in-repo

may have been more straight-forward, but this exercises the same
codepath and perfectly fine.

Will queue and let's merge this to 'next' soonish.

Thanks.

> +'
> +
>  test_done

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3] sha1_file: pass empty buffer to index empty file
  2015-05-17 17:37             ` Junio C Hamano
@ 2015-05-17 19:10               ` Junio C Hamano
  2015-05-18  0:41                 ` [PATCH v4] " Jim Hill
  2015-05-19  6:37                 ` [PATCH v3] " Jeff King
  0 siblings, 2 replies; 23+ messages in thread
From: Junio C Hamano @ 2015-05-17 19:10 UTC (permalink / raw)
  To: Jim Hill; +Cc: git

Junio C Hamano <gitster@pobox.com> writes:

> If the contents to be cleaned is small enough (i.e. the one-liner
> file used in this test) to fit in the pipe buffer and we feed the
> pipe before 'true' exits, we won't see any problem.  Otherwise we
> may get SIGPIPE when we attempt to write to the 'true' (non-)filter,
> but because we explicitly ignore SIGPIPE, 'true' still is a "black
> hole" filter.
>
> "cat >/dev/null" may have been a more naive and straight-forward way
> to write this "black hole" filter, but what you did is fine.

I spoke too fast X-<.  "while sh t0021-*.sh; do :; done" dies after
a few iterations and with this squashed in it doesn't.

 t/t0021-conversion.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
index 42e6423..b778faf 100755
--- a/t/t0021-conversion.sh
+++ b/t/t0021-conversion.sh
@@ -218,7 +218,7 @@ test_expect_success "filter: clean empty file" '
 '
 
 test_expect_success "filter: smudge empty file" '
-	git config filter.empty-in-repo.clean true &&
+	git config filter.empty-in-repo.clean "cat >/dev/null" &&
 	git config filter.empty-in-repo.smudge "echo smudged && cat" &&
 
 	echo "empty-in-repo filter=empty-in-repo" >>.gitattributes &&
-- 
2.4.1-374-g090bfc9

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v4] sha1_file: pass empty buffer to index empty file
  2015-05-17 19:10               ` Junio C Hamano
@ 2015-05-18  0:41                 ` Jim Hill
  2015-05-19  6:37                 ` [PATCH v3] " Jeff King
  1 sibling, 0 replies; 23+ messages in thread
From: Jim Hill @ 2015-05-18  0:41 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

`git add` of an empty file with a filter pops complaints from
`copy_fd` about a bad file descriptor.

This traces back to these lines in sha1_file.c:index_core:

	if (!size) {
		ret = index_mem(sha1, NULL, size, type, path, flags);

The problem here is that content to be added to the index can be
supplied from an fd, or from a memory buffer, or from a pathname. This
call is supplying a NULL buffer pointer and a zero size.

Downstream logic takes the complete absence of a buffer to mean the
data is to be found elsewhere -- for instance, these, from convert.c:

	if (params->src) {
		write_err = (write_in_full(child_process.in, params->src, params->size) < 0);
	} else {
		write_err = copy_fd(params->fd, child_process.in);
	}

~If there's a buffer, write from that, otherwise the data must be coming
from an open fd.~

Perfectly reasonable logic in a routine that's going to write from
either a buffer or an fd.

So change `index_core` to supply an empty buffer when indexing an empty
file.

There's a patch out there that instead changes the logic quoted above to
take a `-1` fd to mean "use the buffer", but it seems to me that the
distinction between a missing buffer and an empty one carries intrinsic
semantics, where the logic change is adapting the code to handle
incorrect arguments.

Signed-off-by: Jim Hill <gjthill@gmail.com>
---
Okay, that's it: I'm officially laughably rusty, 'cause I'm laughing. This
applies your fix, _thank you_ for the caution. I get it: `true` can have
already exited by the time the write hits.

 sha1_file.c           |  2 +-
 t/t0021-conversion.sh | 26 ++++++++++++++++++++++++++
 2 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/sha1_file.c b/sha1_file.c
index f860d67..61e2735 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -3186,7 +3186,7 @@ static int index_core(unsigned char *sha1, int fd, size_t size,
 	int ret;
 
 	if (!size) {
-		ret = index_mem(sha1, NULL, size, type, path, flags);
+		ret = index_mem(sha1, "", size, type, path, flags);
 	} else if (size <= SMALL_FILE_SIZE) {
 		char *buf = xmalloc(size);
 		if (size == read_in_full(fd, buf, size))
diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
index ca7d2a6..eee4761 100755
--- a/t/t0021-conversion.sh
+++ b/t/t0021-conversion.sh
@@ -216,4 +216,30 @@ test_expect_success EXPENSIVE 'filter large file' '
 	! test -s err
 '
 
+test_expect_success "filter: clean empty file" '
+	git config filter.in-repo-header.clean  "echo cleaned && cat" &&
+	git config filter.in-repo-header.smudge "sed 1d" &&
+
+	echo "empty-in-worktree    filter=in-repo-header" >>.gitattributes &&
+	>empty-in-worktree &&
+
+	echo cleaned >expected &&
+	git add empty-in-worktree &&
+	git show :empty-in-worktree >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success "filter: smudge empty file" '
+	git config filter.empty-in-repo.clean "cat >/dev/null" &&
+	git config filter.empty-in-repo.smudge "echo smudged && cat" &&
+
+	echo "empty-in-repo filter=empty-in-repo"  >>.gitattributes &&
+	echo dead data walking >empty-in-repo &&
+	git add empty-in-repo &&
+
+	echo smudged >expected &&
+	git checkout-index --prefix=filtered- empty-in-repo &&
+	test_cmp expected filtered-empty-in-repo
+'
+
 test_done
-- 
2.4.1.4.g08cda19

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH v3] sha1_file: pass empty buffer to index empty file
  2015-05-17 19:10               ` Junio C Hamano
  2015-05-18  0:41                 ` [PATCH v4] " Jim Hill
@ 2015-05-19  6:37                 ` Jeff King
  2015-05-19 18:11                   ` Junio C Hamano
  1 sibling, 1 reply; 23+ messages in thread
From: Jeff King @ 2015-05-19  6:37 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jim Hill, git

On Sun, May 17, 2015 at 12:10:49PM -0700, Junio C Hamano wrote:

> I spoke too fast X-<.  "while sh t0021-*.sh; do :; done" dies after
> a few iterations and with this squashed in it doesn't.
> 
>  t/t0021-conversion.sh | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
> index 42e6423..b778faf 100755
> --- a/t/t0021-conversion.sh
> +++ b/t/t0021-conversion.sh
> @@ -218,7 +218,7 @@ test_expect_success "filter: clean empty file" '
>  '
>  
>  test_expect_success "filter: smudge empty file" '
> -	git config filter.empty-in-repo.clean true &&
> +	git config filter.empty-in-repo.clean "cat >/dev/null" &&

Hmm, I thought we turned off SIGPIPE when writing to filters these days.
Looks like we still complain if we get EPIPE, though. I feel like it
should be the filter's business whether it wants to consume all of the
input or not[1], and we should only be checking its exit status.

-Peff

[1] As a practical example, consider a file format that has a lot of
    cruft at the end. The clean filter would want to read only to the
    start of the cruft, and then stop for reasons of efficiency.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3] sha1_file: pass empty buffer to index empty file
  2015-05-19  6:37                 ` [PATCH v3] " Jeff King
@ 2015-05-19 18:11                   ` Junio C Hamano
  2015-05-19 18:17                     ` Junio C Hamano
                                       ` (2 more replies)
  0 siblings, 3 replies; 23+ messages in thread
From: Junio C Hamano @ 2015-05-19 18:11 UTC (permalink / raw)
  To: Jeff King; +Cc: Jim Hill, git

Jeff King <peff@peff.net> writes:

> On Sun, May 17, 2015 at 12:10:49PM -0700, Junio C Hamano wrote:
>
>> I spoke too fast X-<.  "while sh t0021-*.sh; do :; done" dies after
>> a few iterations and with this squashed in it doesn't.
>> 
>>  t/t0021-conversion.sh | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>> 
>> diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
>> index 42e6423..b778faf 100755
>> --- a/t/t0021-conversion.sh
>> +++ b/t/t0021-conversion.sh
>> @@ -218,7 +218,7 @@ test_expect_success "filter: clean empty file" '
>>  '
>>  
>>  test_expect_success "filter: smudge empty file" '
>> -	git config filter.empty-in-repo.clean true &&
>> +	git config filter.empty-in-repo.clean "cat >/dev/null" &&
>
> Hmm, I thought we turned off SIGPIPE when writing to filters these days.
> Looks like we still complain if we get EPIPE, though. I feel like it
> should be the filter's business whether it wants to consume all of the
> input or not[1], and we should only be checking its exit status.
>
> -Peff
>
> [1] As a practical example, consider a file format that has a lot of
>     cruft at the end. The clean filter would want to read only to the
>     start of the cruft, and then stop for reasons of efficiency.

Yes.  Let's do these two.  The preparatory patch is larger than the
real change.

-- >8 --
From: Junio C Hamano <gitster@pobox.com>
Date: Tue, 19 May 2015 10:55:16 -0700
Subject: [PATCH] copy.c: make copy_fd() report its status silently

When copy_fd() function encounters errors, it emits error messages
itself, which makes it impossible for callers to take responsibility
for reporting errors, especially when they want to ignore certaion
errors.

Move the error reporting to its callers in preparation.

 - copy_file() and copy_file_with_time() by indirection get their
   own calls to error().

 - hold_lock_file_for_append(), when told to die on error, used to
   exit(128) relying on the error message from copy_fd(), but now it
   does its own die() instead.  Note that the callers that do not
   pass LOCK_DIE_ON_ERROR need to be adjusted for this change, but
   fortunately there is none ;-)

 - filter_buffer_or_fd() has its own error() already, in addition to
   the message from copy_fd(), so this will change the output but
   arguably in a better way.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 cache.h    |  4 ++++
 copy.c     | 17 +++++++++++------
 lockfile.c |  2 +-
 3 files changed, 16 insertions(+), 7 deletions(-)

diff --git a/cache.h b/cache.h
index 22b7b81..2981eec 100644
--- a/cache.h
+++ b/cache.h
@@ -1482,9 +1482,13 @@ extern const char *git_mailmap_blob;
 extern void maybe_flush_or_die(FILE *, const char *);
 __attribute__((format (printf, 2, 3)))
 extern void fprintf_or_die(FILE *, const char *fmt, ...);
+
+#define COPY_READ_ERROR (-2)
+#define COPY_WRITE_ERROR (-3)
 extern int copy_fd(int ifd, int ofd);
 extern int copy_file(const char *dst, const char *src, int mode);
 extern int copy_file_with_time(const char *dst, const char *src, int mode);
+
 extern void write_or_die(int fd, const void *buf, size_t count);
 extern int write_or_whine(int fd, const void *buf, size_t count, const char *msg);
 extern int write_or_whine_pipe(int fd, const void *buf, size_t count, const char *msg);
diff --git a/copy.c b/copy.c
index f2970ec..574fa1f 100644
--- a/copy.c
+++ b/copy.c
@@ -7,13 +7,10 @@ int copy_fd(int ifd, int ofd)
 		ssize_t len = xread(ifd, buffer, sizeof(buffer));
 		if (!len)
 			break;
-		if (len < 0) {
-			return error("copy-fd: read returned %s",
-				     strerror(errno));
-		}
+		if (len < 0)
+			return COPY_READ_ERROR;
 		if (write_in_full(ofd, buffer, len) < 0)
-			return error("copy-fd: write returned %s",
-				     strerror(errno));
+			return COPY_WRITE_ERROR;
 	}
 	return 0;
 }
@@ -43,6 +40,14 @@ int copy_file(const char *dst, const char *src, int mode)
 		return fdo;
 	}
 	status = copy_fd(fdi, fdo);
+	switch (status) {
+	case COPY_READ_ERROR:
+		error("copy-fd: read returned %s", strerror(errno));
+		break;
+	case COPY_WRITE_ERROR:
+		error("copy-fd: write returned %s", strerror(errno));
+		break;
+	}
 	close(fdi);
 	if (close(fdo) != 0)
 		return error("%s: close error: %s", dst, strerror(errno));
diff --git a/lockfile.c b/lockfile.c
index 4f16ee7..beba0ed 100644
--- a/lockfile.c
+++ b/lockfile.c
@@ -206,7 +206,7 @@ int hold_lock_file_for_append(struct lock_file *lk, const char *path, int flags)
 		int save_errno = errno;
 
 		if (flags & LOCK_DIE_ON_ERROR)
-			exit(128);
+			die("failed to prepare '%s' for appending", path);
 		close(orig_fd);
 		rollback_lock_file(lk);
 		errno = save_errno;
-- 
2.4.1-413-ga38dc94

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH v3] sha1_file: pass empty buffer to index empty file
  2015-05-19 18:11                   ` Junio C Hamano
@ 2015-05-19 18:17                     ` Junio C Hamano
  2015-05-19 18:35                       ` Junio C Hamano
  2015-05-19 19:40                     ` Eric Sunshine
  2015-05-19 22:09                     ` Jeff King
  2 siblings, 1 reply; 23+ messages in thread
From: Junio C Hamano @ 2015-05-19 18:17 UTC (permalink / raw)
  To: Jeff King; +Cc: Jim Hill, git

Junio C Hamano <gitster@pobox.com> writes:

>> Hmm, I thought we turned off SIGPIPE when writing to filters these days.
>> Looks like we still complain if we get EPIPE, though. I feel like it
>> should be the filter's business whether it wants to consume all of the
>> input or not[1], and we should only be checking its exit status.
>>
>> -Peff
>>
>> [1] As a practical example, consider a file format that has a lot of
>>     cruft at the end. The clean filter would want to read only to the
>>     start of the cruft, and then stop for reasons of efficiency.
>
> Yes.  Let's do these two.  The preparatory patch is larger than the
> real change.

And this is the second one.

While preparing these, I noticed a handful of system calls whose
return values are not checked in the codepaths involved.  We should
clean them up, but I left them out of these two patches, as they are
separate issues.

-- >8 --
Subject: [PATCH 2/2] filter_buffer_or_fd(): ignore EPIPE

We are explicitly ignoring SIGPIPE, as we fully expect that the
filter program may not read our output fully.  Ignore EPIPE that
may come from writing to it as well.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 convert.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/convert.c b/convert.c
index 9a5612e..0f20979 100644
--- a/convert.c
+++ b/convert.c
@@ -359,6 +359,8 @@ static int filter_buffer_or_fd(int in, int out, void *data)
 		write_err = (write_in_full(child_process.in, params->src, params->size) < 0);
 	} else {
 		write_err = copy_fd(params->fd, child_process.in);
+		if (write_error == COPY_WRITE_ERROR && errno == EPIPE)
+			write_error = 0; /* we are ignoring it, right? */
 	}
 
 	if (close(child_process.in))
-- 
2.4.1-413-ga38dc94

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH v3] sha1_file: pass empty buffer to index empty file
  2015-05-19 18:17                     ` Junio C Hamano
@ 2015-05-19 18:35                       ` Junio C Hamano
  2015-05-19 19:48                         ` Junio C Hamano
  0 siblings, 1 reply; 23+ messages in thread
From: Junio C Hamano @ 2015-05-19 18:35 UTC (permalink / raw)
  To: Jeff King; +Cc: Jim Hill, git

Junio C Hamano <gitster@pobox.com> writes:

> Subject: [PATCH 2/2] filter_buffer_or_fd(): ignore EPIPE
>
> We are explicitly ignoring SIGPIPE, as we fully expect that the
> filter program may not read our output fully.  Ignore EPIPE that
> may come from writing to it as well.

Yuck; please discard the previous one.  write_in_full() side is also
writing into that process, so we should do the same.

-- >8 --
Subject: [PATCH v2 2/2] filter_buffer_or_fd(): ignore EPIPE

We are explicitly ignoring SIGPIPE, as we fully expect that the
filter program may not read our output fully.  Ignore EPIPE that
may come from writing to it as well.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 convert.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/convert.c b/convert.c
index 9a5612e..f3bd3e9 100644
--- a/convert.c
+++ b/convert.c
@@ -356,9 +356,14 @@ static int filter_buffer_or_fd(int in, int out, void *data)
 	sigchain_push(SIGPIPE, SIG_IGN);
 
 	if (params->src) {
-		write_err = (write_in_full(child_process.in, params->src, params->size) < 0);
+		write_err = (write_in_full(child_process.in,
+					   params->src, params->size) < 0);
+		if (errno == EPIPE)
+			write_err = 0;
 	} else {
 		write_err = copy_fd(params->fd, child_process.in);
+		if (write_err == COPY_WRITE_ERROR && errno == EPIPE)
+			write_err = 0;
 	}
 
 	if (close(child_process.in))
-- 
2.4.1-413-ga38dc94

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH v3] sha1_file: pass empty buffer to index empty file
  2015-05-19 18:11                   ` Junio C Hamano
  2015-05-19 18:17                     ` Junio C Hamano
@ 2015-05-19 19:40                     ` Eric Sunshine
  2015-05-19 22:09                     ` Jeff King
  2 siblings, 0 replies; 23+ messages in thread
From: Eric Sunshine @ 2015-05-19 19:40 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, Jim Hill, Git List

On Tue, May 19, 2015 at 2:11 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Subject: [PATCH] copy.c: make copy_fd() report its status silently
>
> When copy_fd() function encounters errors, it emits error messages
> itself, which makes it impossible for callers to take responsibility
> for reporting errors, especially when they want to ignore certaion

s/certaion/certain/

> errors.
>
> Move the error reporting to its callers in preparation.
>
>  - copy_file() and copy_file_with_time() by indirection get their
>    own calls to error().
>
>  - hold_lock_file_for_append(), when told to die on error, used to
>    exit(128) relying on the error message from copy_fd(), but now it
>    does its own die() instead.  Note that the callers that do not
>    pass LOCK_DIE_ON_ERROR need to be adjusted for this change, but
>    fortunately there is none ;-)
>
>  - filter_buffer_or_fd() has its own error() already, in addition to
>    the message from copy_fd(), so this will change the output but
>    arguably in a better way.
>
> Signed-off-by: Junio C Hamano <gitster@pobox.com>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3] sha1_file: pass empty buffer to index empty file
  2015-05-19 18:35                       ` Junio C Hamano
@ 2015-05-19 19:48                         ` Junio C Hamano
  2015-05-19 22:14                           ` Jeff King
  0 siblings, 1 reply; 23+ messages in thread
From: Junio C Hamano @ 2015-05-19 19:48 UTC (permalink / raw)
  To: Jeff King; +Cc: Jim Hill, git

Junio C Hamano <gitster@pobox.com> writes:

> Yuck; please discard the previous one.  write_in_full() side is also
> writing into that process, so we should do the same.

OK, without these two, and with the "true" filter that does not read
anything reinstated in the test script, t0021 used to die

    i=0; while sh t0021-conversion.sh; do i=$(( $i + 1 )); done

after 150 iteration or so for me.  With these two, it seems to go on
without breaking (I bored after 1000 iterations), so I'd declare it
good enough ;-)

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3] sha1_file: pass empty buffer to index empty file
  2015-05-19 18:11                   ` Junio C Hamano
  2015-05-19 18:17                     ` Junio C Hamano
  2015-05-19 19:40                     ` Eric Sunshine
@ 2015-05-19 22:09                     ` Jeff King
  2015-05-20 17:25                       ` Junio C Hamano
  2 siblings, 1 reply; 23+ messages in thread
From: Jeff King @ 2015-05-19 22:09 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jim Hill, git

On Tue, May 19, 2015 at 11:11:38AM -0700, Junio C Hamano wrote:

> Subject: [PATCH] copy.c: make copy_fd() report its status silently
> 
> When copy_fd() function encounters errors, it emits error messages
> itself, which makes it impossible for callers to take responsibility
> for reporting errors, especially when they want to ignore certaion
> errors.
> 
> Move the error reporting to its callers in preparation.
> [...]

Looks good to me. And thank you for being thorough in analyzing the
impact on all the callers.

>  - hold_lock_file_for_append(), when told to die on error, used to
>    exit(128) relying on the error message from copy_fd(), but now it
>    does its own die() instead.  Note that the callers that do not
>    pass LOCK_DIE_ON_ERROR need to be adjusted for this change, but
>    fortunately there is none ;-)

Not related to your patch, but I've often wondered if we can just get
rid of hold_lock_file_for_append. There's exactly one caller, and I
think it is doing the wrong thing. It is add_to_alternates_file(), but
shouldn't it probably read the existing lines to make sure it is not
adding a duplicate? IOW, I think hold_lock_file_for_append is a
fundamentally bad interface, because almost nobody truly wants to _just_
append.

And I have not investigated it carefully, but I suspect that we do not
even have to be that careful. The only time we write the file is during
clone, and I suspect we could just use a string_list, and then write it
out. We probably don't even need to lock (it's not like we take a lock
before creating the "objects" directory in the first place).

Anyway, end mini-rant. It is probably not hurting anyone and does not
need to be dealt with anytime soon.

-Peff

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3] sha1_file: pass empty buffer to index empty file
  2015-05-19 19:48                         ` Junio C Hamano
@ 2015-05-19 22:14                           ` Jeff King
  2015-05-20 17:03                             ` Junio C Hamano
  0 siblings, 1 reply; 23+ messages in thread
From: Jeff King @ 2015-05-19 22:14 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jim Hill, git

On Tue, May 19, 2015 at 12:48:21PM -0700, Junio C Hamano wrote:

> Junio C Hamano <gitster@pobox.com> writes:
> 
> > Yuck; please discard the previous one.  write_in_full() side is also
> > writing into that process, so we should do the same.
> 
> OK, without these two, and with the "true" filter that does not read
> anything reinstated in the test script, t0021 used to die
> 
>     i=0; while sh t0021-conversion.sh; do i=$(( $i + 1 )); done
> 
> after 150 iteration or so for me.  With these two, it seems to go on
> without breaking (I bored after 1000 iterations), so I'd declare it
> good enough ;-)

Your revised patch 2 looks good to me. I think you could test it more
reliably by simply adding a larger file, like:

  test-genrandom foo $((128 * 1024 + 1)) >big &&
  echo 'big filter=epipe' >.gitattributes &&
  git config filter.epipe.clean true &&
  git add big

The worst case if you get the size of the pipe buffer too small is that
the test will erroneously pass, but that is OK. As long as one person
has a reasonable-sized buffer, they will complain to the list
eventually. :)

-Peff

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3] sha1_file: pass empty buffer to index empty file
  2015-05-19 22:14                           ` Jeff King
@ 2015-05-20 17:03                             ` Junio C Hamano
  0 siblings, 0 replies; 23+ messages in thread
From: Junio C Hamano @ 2015-05-20 17:03 UTC (permalink / raw)
  To: Jeff King; +Cc: Jim Hill, git

Jeff King <peff@peff.net> writes:

> Your revised patch 2 looks good to me. I think you could test it more
> reliably by simply adding a larger file, like:
>
>   test-genrandom foo $((128 * 1024 + 1)) >big &&
>   echo 'big filter=epipe' >.gitattributes &&
>   git config filter.epipe.clean true &&
>   git add big
>
> The worst case if you get the size of the pipe buffer too small is that
> the test will erroneously pass, but that is OK. As long as one person
> has a reasonable-sized buffer, they will complain to the list
> eventually. :)

Yeah, I like it.  It was lazy of me not to add a new test.

Thanks.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3] sha1_file: pass empty buffer to index empty file
  2015-05-19 22:09                     ` Jeff King
@ 2015-05-20 17:25                       ` Junio C Hamano
  2015-05-20 17:38                         ` Jeff King
  0 siblings, 1 reply; 23+ messages in thread
From: Junio C Hamano @ 2015-05-20 17:25 UTC (permalink / raw)
  To: Jeff King; +Cc: Jim Hill, git

Jeff King <peff@peff.net> writes:

> Not related to your patch, but I've often wondered if we can just get
> rid of hold_lock_file_for_append. There's exactly one caller, and I
> think it is doing the wrong thing. It is add_to_alternates_file(), but
> shouldn't it probably read the existing lines to make sure it is not
> adding a duplicate? IOW, I think hold_lock_file_for_append is a
> fundamentally bad interface, because almost nobody truly wants to _just_
> append.

Yeah, I tend to agree.  Perhaps I should throw it into the list of
low hanging fruits (aka lmgtfy:"git blame leftover bits") and see if
anybody bites ;-)

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v3] sha1_file: pass empty buffer to index empty file
  2015-05-20 17:25                       ` Junio C Hamano
@ 2015-05-20 17:38                         ` Jeff King
  0 siblings, 0 replies; 23+ messages in thread
From: Jeff King @ 2015-05-20 17:38 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jim Hill, git

On Wed, May 20, 2015 at 10:25:41AM -0700, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > Not related to your patch, but I've often wondered if we can just get
> > rid of hold_lock_file_for_append. There's exactly one caller, and I
> > think it is doing the wrong thing. It is add_to_alternates_file(), but
> > shouldn't it probably read the existing lines to make sure it is not
> > adding a duplicate? IOW, I think hold_lock_file_for_append is a
> > fundamentally bad interface, because almost nobody truly wants to _just_
> > append.
> 
> Yeah, I tend to agree.  Perhaps I should throw it into the list of
> low hanging fruits (aka lmgtfy:"git blame leftover bits") and see if
> anybody bites ;-)

Good thinking. I think it is the right urgency and difficulty for that
list.

-Peff

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2015-05-20 17:38 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-05-14 17:23 [PATCH] sha1_file: pass empty buffer to index empty file Jim Hill
2015-05-14 18:43 ` Junio C Hamano
2015-05-14 23:17   ` [PATCH v2] " Jim Hill
2015-05-15 18:01     ` Junio C Hamano
2015-05-15 23:31       ` Jim Hill
2015-05-16 18:48         ` Junio C Hamano
2015-05-16 20:06           ` [PATCH v3] " Jim Hill
2015-05-16 23:22             ` Junio C Hamano
2015-05-17 17:37             ` Junio C Hamano
2015-05-17 19:10               ` Junio C Hamano
2015-05-18  0:41                 ` [PATCH v4] " Jim Hill
2015-05-19  6:37                 ` [PATCH v3] " Jeff King
2015-05-19 18:11                   ` Junio C Hamano
2015-05-19 18:17                     ` Junio C Hamano
2015-05-19 18:35                       ` Junio C Hamano
2015-05-19 19:48                         ` Junio C Hamano
2015-05-19 22:14                           ` Jeff King
2015-05-20 17:03                             ` Junio C Hamano
2015-05-19 19:40                     ` Eric Sunshine
2015-05-19 22:09                     ` Jeff King
2015-05-20 17:25                       ` Junio C Hamano
2015-05-20 17:38                         ` Jeff King
2015-05-14 23:26   ` [PATCH] " Jim Hill

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.