All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] Auditing use of xwrite()
@ 2024-03-02 19:03 Junio C Hamano
  2024-03-02 19:03 ` [PATCH 1/3] unpack: replace xwrite() loop with write_in_full() Junio C Hamano
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Junio C Hamano @ 2024-03-02 19:03 UTC (permalink / raw)
  To: git

The xwrite() helper relieves us from having to worry about an
interrupted write(2) system call that returns an EINTR error without
writing any, in which case we need to retry the system call.  It,
just as the underlying write(2) system call, however can return
without writing the bytes requested to be written fully.  A code
that uses xwrite() and wants to write the full contents of the
buffer needs to take care of such a "short write" situation,
typically by calling xwrite() in a loop.

We recently audited all the calls to xwrite(), and have found two
classes of potential "problems".  This miniseries is about fixing them.

 * There are such loops around xwrite() as described above, which
   were written long time ago.  The write_in_full() helper does
   implement such a loop in a reusable way, and we should just call
   it instead.

 * There are xwrite() calls whose return value is ignored.  We would
   not be able to detect (and complete) "short write" in such callers,
   of course, but more importantly, we would not notice I/O failures.
   Again, use write_in_full() to deal with short writes, and check
   the return value from it to detect errors.

Junio C Hamano (3):
  unpack: rewrite hand-rolled loop of xwrite()s with write_in_full()
  sideband: avoid short write(2)
  repack: check error writing to pack-objects subprocess

 builtin/index-pack.c     | 17 +++--------------
 builtin/repack.c         |  5 +++--
 builtin/unpack-objects.c |  8 +-------
 sideband.c               |  4 ++--
 4 files changed, 9 insertions(+), 25 deletions(-)

-- 
2.44.0-84-gb387623c12


^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 1/3] unpack: replace xwrite() loop with write_in_full()
  2024-03-02 19:03 [PATCH 0/3] Auditing use of xwrite() Junio C Hamano
@ 2024-03-02 19:03 ` Junio C Hamano
  2024-03-04  6:58   ` Patrick Steinhardt
  2024-03-02 19:03 ` [PATCH 2/3] sideband: avoid short write(2) Junio C Hamano
  2024-03-02 19:03 ` [PATCH 3/3] repack: check error writing to pack-objects subprocess Junio C Hamano
  2 siblings, 1 reply; 7+ messages in thread
From: Junio C Hamano @ 2024-03-02 19:03 UTC (permalink / raw)
  To: git

We have two packfile stream consumers, index-pack and
unpack-objects, that allow excess payload after the packfile stream
data. Their code to relay excess data hasn't changed significantly
since their original implementation that appeared in 67e5a5ec
(git-unpack-objects: re-write to read from stdin, 2005-06-28) and
9bee2478 (mimic unpack-objects when --stdin is used with index-pack,
2006-10-25).

These code blocks contain hand-rolled loops using xwrite(), written
before our write_in_full() helper existed. This helper now provides
the same functionality.

Replace these loops with write_in_full() for shorter, clearer
code. Update related variables accordingly.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/index-pack.c     | 17 +++--------------
 builtin/unpack-objects.c |  8 +-------
 2 files changed, 4 insertions(+), 21 deletions(-)

diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index a3a37bd215..856428fef9 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1524,14 +1524,12 @@ static void final(const char *final_pack_name, const char *curr_pack_name,
 	struct strbuf pack_name = STRBUF_INIT;
 	struct strbuf index_name = STRBUF_INIT;
 	struct strbuf rev_index_name = STRBUF_INIT;
-	int err;
 
 	if (!from_stdin) {
 		close(input_fd);
 	} else {
 		fsync_component_or_die(FSYNC_COMPONENT_PACK, output_fd, curr_pack_name);
-		err = close(output_fd);
-		if (err)
+		if (close(output_fd))
 			die_errno(_("error while closing pack file"));
 	}
 
@@ -1566,17 +1564,8 @@ static void final(const char *final_pack_name, const char *curr_pack_name,
 		write_or_die(1, buf.buf, buf.len);
 		strbuf_release(&buf);
 
-		/*
-		 * Let's just mimic git-unpack-objects here and write
-		 * the last part of the input buffer to stdout.
-		 */
-		while (input_len) {
-			err = xwrite(1, input_buffer + input_offset, input_len);
-			if (err <= 0)
-				break;
-			input_len -= err;
-			input_offset += err;
-		}
+		/* Write the last part of the buffer to stdout */
+		write_in_full(1, input_buffer + input_offset, input_len);
 	}
 
 	strbuf_release(&rev_index_name);
diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
index e0a701f2b3..f1c85a00ae 100644
--- a/builtin/unpack-objects.c
+++ b/builtin/unpack-objects.c
@@ -679,13 +679,7 @@ int cmd_unpack_objects(int argc, const char **argv, const char *prefix UNUSED)
 	use(the_hash_algo->rawsz);
 
 	/* Write the last part of the buffer to stdout */
-	while (len) {
-		int ret = xwrite(1, buffer + offset, len);
-		if (ret <= 0)
-			break;
-		len -= ret;
-		offset += ret;
-	}
+	write_in_full(1, buffer + offset, len);
 
 	/* All done */
 	return has_errors;
-- 
2.44.0-84-gb387623c12


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 2/3] sideband: avoid short write(2)
  2024-03-02 19:03 [PATCH 0/3] Auditing use of xwrite() Junio C Hamano
  2024-03-02 19:03 ` [PATCH 1/3] unpack: replace xwrite() loop with write_in_full() Junio C Hamano
@ 2024-03-02 19:03 ` Junio C Hamano
  2024-03-02 19:03 ` [PATCH 3/3] repack: check error writing to pack-objects subprocess Junio C Hamano
  2 siblings, 0 replies; 7+ messages in thread
From: Junio C Hamano @ 2024-03-02 19:03 UTC (permalink / raw)
  To: git

The sideband demultiplexor writes the data it receives on sideband
with xwrite().  We can lose data if the underlying write(2) results
in a short write.

If they are limited to unimportant bytes like eye-candy progress
meter, it may be OK to lose them, but lets be careful and ensure
that we use write_in_full() instead.  Note that the original does
not check for errors, and this rewrite does not check for one.  At
least not yet.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 sideband.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/sideband.c b/sideband.c
index 266a67342b..5d8907151f 100644
--- a/sideband.c
+++ b/sideband.c
@@ -220,7 +220,7 @@ int demultiplex_sideband(const char *me, int status,
 			}
 
 			strbuf_addch(scratch, *brk);
-			xwrite(2, scratch->buf, scratch->len);
+			write_in_full(2, scratch->buf, scratch->len);
 			strbuf_reset(scratch);
 
 			b = brk + 1;
@@ -247,7 +247,7 @@ int demultiplex_sideband(const char *me, int status,
 		die("%s", scratch->buf);
 	if (scratch->len) {
 		strbuf_addch(scratch, '\n');
-		xwrite(2, scratch->buf, scratch->len);
+		write_in_full(2, scratch->buf, scratch->len);
 	}
 	strbuf_release(scratch);
 	return 1;
-- 
2.44.0-84-gb387623c12


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH 3/3] repack: check error writing to pack-objects subprocess
  2024-03-02 19:03 [PATCH 0/3] Auditing use of xwrite() Junio C Hamano
  2024-03-02 19:03 ` [PATCH 1/3] unpack: replace xwrite() loop with write_in_full() Junio C Hamano
  2024-03-02 19:03 ` [PATCH 2/3] sideband: avoid short write(2) Junio C Hamano
@ 2024-03-02 19:03 ` Junio C Hamano
  2 siblings, 0 replies; 7+ messages in thread
From: Junio C Hamano @ 2024-03-02 19:03 UTC (permalink / raw)
  To: git

When "git repack" repacks promisor objects, it starts a pack-objects
subprocess and uses xwrite() to send object names over the pipe to
it, but without any error checking.  An I/O error or short write
(even though a short write is unlikely for such a small amount of
data) can result in a packfile that lacks certain objects we wanted
to put in there, leading to a silent repository corruption.

Use write_in_full(), instead of xwrite(), to mitigate short write
risks, check errors from it, and abort if we see a failure.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/repack.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/builtin/repack.c b/builtin/repack.c
index ede36328a3..15e4cccc45 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -314,8 +314,9 @@ static int write_oid(const struct object_id *oid,
 			die(_("could not start pack-objects to repack promisor objects"));
 	}
 
-	xwrite(cmd->in, oid_to_hex(oid), the_hash_algo->hexsz);
-	xwrite(cmd->in, "\n", 1);
+	if (write_in_full(cmd->in, oid_to_hex(oid), the_hash_algo->hexsz) < 0 ||
+	    write_in_full(cmd->in, "\n", 1) < 0)
+		die(_("failed to feed promisor objects to pack-objects"));
 	return 0;
 }
 
-- 
2.44.0-84-gb387623c12


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/3] unpack: replace xwrite() loop with write_in_full()
  2024-03-02 19:03 ` [PATCH 1/3] unpack: replace xwrite() loop with write_in_full() Junio C Hamano
@ 2024-03-04  6:58   ` Patrick Steinhardt
  2024-03-04  7:29     ` Junio C Hamano
  0 siblings, 1 reply; 7+ messages in thread
From: Patrick Steinhardt @ 2024-03-04  6:58 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

[-- Attachment #1: Type: text/plain, Size: 3345 bytes --]

On Sat, Mar 02, 2024 at 11:03:46AM -0800, Junio C Hamano wrote:
> We have two packfile stream consumers, index-pack and
> unpack-objects, that allow excess payload after the packfile stream
> data. Their code to relay excess data hasn't changed significantly
> since their original implementation that appeared in 67e5a5ec
> (git-unpack-objects: re-write to read from stdin, 2005-06-28) and
> 9bee2478 (mimic unpack-objects when --stdin is used with index-pack,
> 2006-10-25).
> 
> These code blocks contain hand-rolled loops using xwrite(), written
> before our write_in_full() helper existed. This helper now provides
> the same functionality.
> 
> Replace these loops with write_in_full() for shorter, clearer
> code. Update related variables accordingly.
> 
> Signed-off-by: Junio C Hamano <gitster@pobox.com>
> ---
>  builtin/index-pack.c     | 17 +++--------------
>  builtin/unpack-objects.c |  8 +-------
>  2 files changed, 4 insertions(+), 21 deletions(-)
> 
> diff --git a/builtin/index-pack.c b/builtin/index-pack.c
> index a3a37bd215..856428fef9 100644
> --- a/builtin/index-pack.c
> +++ b/builtin/index-pack.c
> @@ -1524,14 +1524,12 @@ static void final(const char *final_pack_name, const char *curr_pack_name,
>  	struct strbuf pack_name = STRBUF_INIT;
>  	struct strbuf index_name = STRBUF_INIT;
>  	struct strbuf rev_index_name = STRBUF_INIT;
> -	int err;
>  
>  	if (!from_stdin) {
>  		close(input_fd);
>  	} else {
>  		fsync_component_or_die(FSYNC_COMPONENT_PACK, output_fd, curr_pack_name);
> -		err = close(output_fd);
> -		if (err)
> +		if (close(output_fd))
>  			die_errno(_("error while closing pack file"));
>  	}
>  
> @@ -1566,17 +1564,8 @@ static void final(const char *final_pack_name, const char *curr_pack_name,
>  		write_or_die(1, buf.buf, buf.len);
>  		strbuf_release(&buf);
>  
> -		/*
> -		 * Let's just mimic git-unpack-objects here and write
> -		 * the last part of the input buffer to stdout.
> -		 */
> -		while (input_len) {
> -			err = xwrite(1, input_buffer + input_offset, input_len);
> -			if (err <= 0)
> -				break;
> -			input_len -= err;
> -			input_offset += err;
> -		}
> +		/* Write the last part of the buffer to stdout */
> +		write_in_full(1, input_buffer + input_offset, input_len);

With this change we stop updating `input_len` and `input_offset`, both
of which are global variables. Assuming that tests pass this must be
okay right now given that this is the final part of what we are writing.
But I wonder whether we shouldn't update those regardless just so that
these remain consistent?

>  	}
>  
>  	strbuf_release(&rev_index_name);
> diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
> index e0a701f2b3..f1c85a00ae 100644
> --- a/builtin/unpack-objects.c
> +++ b/builtin/unpack-objects.c
> @@ -679,13 +679,7 @@ int cmd_unpack_objects(int argc, const char **argv, const char *prefix UNUSED)
>  	use(the_hash_algo->rawsz);
>  
>  	/* Write the last part of the buffer to stdout */
> -	while (len) {
> -		int ret = xwrite(1, buffer + offset, len);
> -		if (ret <= 0)
> -			break;
> -		len -= ret;
> -		offset += ret;
> -	}
> +	write_in_full(1, buffer + offset, len);

Same here.

Patrick

>  	/* All done */
>  	return has_errors;
> -- 
> 2.44.0-84-gb387623c12
> 
> 

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/3] unpack: replace xwrite() loop with write_in_full()
  2024-03-04  6:58   ` Patrick Steinhardt
@ 2024-03-04  7:29     ` Junio C Hamano
  2024-03-04 16:43       ` Junio C Hamano
  0 siblings, 1 reply; 7+ messages in thread
From: Junio C Hamano @ 2024-03-04  7:29 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git

Patrick Steinhardt <ps@pks.im> writes:

>> -		while (input_len) {
>> -			err = xwrite(1, input_buffer + input_offset, input_len);
>> -			if (err <= 0)
>> -				break;
>> -			input_len -= err;
>> -			input_offset += err;
>> -		}
>> +		/* Write the last part of the buffer to stdout */
>> +		write_in_full(1, input_buffer + input_offset, input_len);
>
> With this change we stop updating `input_len` and `input_offset`, both
> of which are global variables. Assuming that tests pass this must be
> okay right now given that this is the final part of what we are writing.
> But I wonder whether we shouldn't update those regardless just so that
> these remain consistent?

It is probably a good hygiene, even though it may not matter at all
for the correctness in the current code.

Thanks for your sharp eyes.

>> diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
>> index e0a701f2b3..f1c85a00ae 100644
>> --- a/builtin/unpack-objects.c
>> +++ b/builtin/unpack-objects.c
>> @@ -679,13 +679,7 @@ int cmd_unpack_objects(int argc, const char **argv, const char *prefix UNUSED)
>>  	use(the_hash_algo->rawsz);
>>  
>>  	/* Write the last part of the buffer to stdout */
>> -	while (len) {
>> -		int ret = xwrite(1, buffer + offset, len);
>> -		if (ret <= 0)
>> -			break;
>> -		len -= ret;
>> -		offset += ret;
>> -	}
>> +	write_in_full(1, buffer + offset, len);
>
> Same here.
>
> Patrick
>
>>  	/* All done */
>>  	return has_errors;
>> -- 
>> 2.44.0-84-gb387623c12
>> 
>> 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/3] unpack: replace xwrite() loop with write_in_full()
  2024-03-04  7:29     ` Junio C Hamano
@ 2024-03-04 16:43       ` Junio C Hamano
  0 siblings, 0 replies; 7+ messages in thread
From: Junio C Hamano @ 2024-03-04 16:43 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git

Junio C Hamano <gitster@pobox.com> writes:

> Patrick Steinhardt <ps@pks.im> writes:
>
>>> -		while (input_len) {
>>> -			err = xwrite(1, input_buffer + input_offset, input_len);
>>> -			if (err <= 0)
>>> -				break;
>>> -			input_len -= err;
>>> -			input_offset += err;
>>> -		}
>>> +		/* Write the last part of the buffer to stdout */
>>> +		write_in_full(1, input_buffer + input_offset, input_len);
>>
>> With this change we stop updating `input_len` and `input_offset`, both
>> of which are global variables. Assuming that tests pass this must be
>> okay right now given that this is the final part of what we are writing.
>> But I wonder whether we shouldn't update those regardless just so that
>> these remain consistent?
>
> It is probably a good hygiene, even though it may not matter at all
> for the correctness in the current code.
>
> Thanks for your sharp eyes.

Actually, I changed my mind.  As you said, this is flushing the very
end of the data in the input_buffer[] and nobody will fill() the
input_buffer[] after the call to this function happens.

>>> -	while (len) {
>>> ...
>>> -		len -= ret;
>>> -		offset += ret;
>>> -	}
>>> +	write_in_full(1, buffer + offset, len);
>>
>> Same here.

Ditto.  We are about to pass the control back to the caller that
will exit using the "has_errors" we return from here.

>>
>> Patrick
>>
>>>  	/* All done */
>>>  	return has_errors;
>>> -- 
>>> 2.44.0-84-gb387623c12
>>> 
>>> 

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2024-03-04 16:43 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-03-02 19:03 [PATCH 0/3] Auditing use of xwrite() Junio C Hamano
2024-03-02 19:03 ` [PATCH 1/3] unpack: replace xwrite() loop with write_in_full() Junio C Hamano
2024-03-04  6:58   ` Patrick Steinhardt
2024-03-04  7:29     ` Junio C Hamano
2024-03-04 16:43       ` Junio C Hamano
2024-03-02 19:03 ` [PATCH 2/3] sideband: avoid short write(2) Junio C Hamano
2024-03-02 19:03 ` [PATCH 3/3] repack: check error writing to pack-objects subprocess Junio C Hamano

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.