All of lore.kernel.org
 help / color / mirror / Atom feed
From: Johannes Schindelin <johannes.schindelin@gmx.de>
To: Junio C Hamano <gitster@pobox.com>
Cc: Wolfgang Denk <wd@denx.de>, git@vger.kernel.org
Subject: Re: git error in tag ...: unterminated header
Date: Fri, 26 Jun 2015 10:06:20 +0200	[thread overview]
Message-ID: <d455a77d76b3558fb79d550d6ed4468d@www.dscho.org> (raw)
In-Reply-To: <xmqqbng3wheu.fsf@gitster.dls.corp.google.com>

Hi Junio,

On 2015-06-26 00:29, Junio C Hamano wrote:
> Junio C Hamano <gitster@pobox.com> writes:
> 
>> Johannes Schindelin <johannes.schindelin@gmx.de> writes:
>> ...
>>> If the buffer does *not* contain an empty line, the fsck code runs the
>>> danger of looking beyond the allocated memory because it uses
>>> functions that assume NUL-terminated strings, while the buffer passed
>>> to the fsck code is a counted string.
>>> that already, but not all of them.
>> ...
>> I do not think you necessarily need a NUL.  As you said, your input
>> is a counted string, so you know the length of the buffer.  And you
>> are verifying line by line.  As long as you make sure the buffer
>> ends with "\n" (instead of saying "it has "\n\n" somewhere),
>> updating the existing code that does ...
>> to do this instead: ...
>> shouldn't be rocket science, no?
> 
> Here is to illustrate what I meant by the above.

I understood what you were saying, but it still appears too fragile to me to mix functions that assume NUL-terminated strings with an ad-hoc counted string check. Take `skip_prefix()` for example: it really assumes implicitly that the string is NUL-terminated because a first argument that is too short for the prefix will simply compare NUL against non-NUL at some stage. With your idea, we will now assume that all the usages of `skip_prefix()` in `fsck.c`, including future ones, will refrain from asking for a prefix containing `\n`. That might not hold true, and worse: it might not be detected because a couple of code paths *already* pass NUL-terminated buffers to fsck.

Now, we are actually talking about really rare objects, right? So why not be a little generous with memory in the very unlikely event that we encounter such a tag object? I am thinking somewhat along these lines:

-- snipsnap --
diff --git a/fsck.c b/fsck.c
index 2fffa43..4c7530a 100644
--- a/fsck.c
+++ b/fsck.c
@@ -238,10 +238,11 @@ static int fsck_tree(struct tree *item, int strict, fsck_error error_func)
 	return retval;
 }
 
-static int require_end_of_header(const void *data, unsigned long size,
+static int require_end_of_header(const char **data, unsigned long size,
 	struct object *obj, fsck_error error_func)
 {
-	const char *buffer = (const char *)data;
+	static char *nul_terminated;
+	const char *buffer = *(const char **)data;
 	unsigned long i;
 
 	for (i = 0; i < size; i++) {
@@ -255,7 +256,14 @@ static int require_end_of_header(const void *data, unsigned long size,
 		}
 	}
 
-	return error_func(obj, FSCK_ERROR, "unterminated header");
+	/* TODO: when fsck-opt hits `next`, test whether to error out here */
+	if (nul_terminated)
+		free(nul_terminated);
+	nul_terminated = xmalloc(size + 1);
+	memcpy(nul_terminated, *data, size);
+	nul_terminated[size] = '\0';
+	*data = nul_terminated;
+	return 0;
 }
 
 static int fsck_ident(const char **ident, struct object *obj, fsck_error error_func)
@@ -297,15 +304,16 @@ static int fsck_ident(const char **ident, struct object *obj, fsck_error error_f
 	return 0;
 }
 
-static int fsck_commit_buffer(struct commit *commit, const char *buffer,
+static int fsck_commit_buffer(struct commit *commit, const char *orig,
 	unsigned long size, fsck_error error_func)
 {
+	const char *buffer = orig;
 	unsigned char tree_sha1[20], sha1[20];
 	struct commit_graft *graft;
 	unsigned parent_count, parent_line_count = 0;
 	int err;
 
-	if (require_end_of_header(buffer, size, &commit->object, error_func))
+	if (require_end_of_header(&buffer, size, &commit->object, error_func))
 		return -1;
 
 	if (!skip_prefix(buffer, "tree ", &buffer))
@@ -356,9 +364,10 @@ static int fsck_commit(struct commit *commit, const char *data,
 	return ret;
 }
 
-static int fsck_tag_buffer(struct tag *tag, const char *data,
+static int fsck_tag_buffer(struct tag *tag, const char *orig,
 	unsigned long size, fsck_error error_func)
 {
+	const char *data = orig;
 	unsigned char sha1[20];
 	int ret = 0;
 	const char *buffer;
@@ -384,7 +393,7 @@ static int fsck_tag_buffer(struct tag *tag, const char *data,
 		}
 	}
 
-	if (require_end_of_header(buffer, size, &tag->object, error_func))
+	if (require_end_of_header(&buffer, size, &tag->object, error_func))
 		goto done;
 
 	if (!skip_prefix(buffer, "object ", &buffer)) {

  reply	other threads:[~2015-06-26  8:06 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-06-25 15:51 git error in tag ...: unterminated header Wolfgang Denk
2015-06-25 17:32 ` Junio C Hamano
2015-06-25 20:13   ` Wolfgang Denk
2015-06-25 20:24     ` Junio C Hamano
2015-06-25 21:07       ` Johannes Schindelin
2015-06-25 21:21         ` Junio C Hamano
2015-06-25 22:29           ` Junio C Hamano
2015-06-26  8:06             ` Johannes Schindelin [this message]
2015-06-26 15:52               ` Jeff King
2015-06-26 17:37                 ` Junio C Hamano
2015-06-27  8:57                   ` Johannes Schindelin
2015-06-27 18:36                     ` Junio C Hamano
2015-06-28 18:18                 ` [PATCH] fsck: it is OK for a tag and a commit to lack the body Junio C Hamano
2015-06-28 18:21                   ` Eric Sunshine
2015-06-29  5:12                   ` Johannes Schindelin
2015-06-29  5:42                     ` Junio C Hamano
2015-06-29 14:51                       ` Johannes Schindelin
2015-06-25 20:48     ` git error in tag ...: unterminated header Junio C Hamano
2015-06-25 17:38 ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=d455a77d76b3558fb79d550d6ed4468d@www.dscho.org \
    --to=johannes.schindelin@gmx.de \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=wd@denx.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.