All of lore.kernel.org
 help / color / mirror / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: Johannes Schindelin <johannes.schindelin@gmx.de>
Cc: git@vger.kernel.org
Subject: Re: [PATCH 4/6] fsck: check tag objects' headers
Date: Thu, 28 Aug 2014 14:25:16 -0700	[thread overview]
Message-ID: <xmqqlhq88fyb.fsf@gitster.dls.corp.google.com> (raw)
In-Reply-To: <alpine.DEB.1.00.1408281646530.990@s15462909.onlinehome-server.info> (Johannes Schindelin's message of "Thu, 28 Aug 2014 16:46:55 +0200 (CEST)")

Johannes Schindelin <johannes.schindelin@gmx.de> writes:

> We inspect commit objects pretty much in detail in git-fsck, but we just
> glanced over the tag objects. Let's be stricter.
>
> This work was sponsored by GitHub Inc.

Is it only this commit, or all of these patches in the series?
Does GitHub want their name sprinkled over all changes they sponsor?

>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
>  fsck.c | 88 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 87 insertions(+), 1 deletion(-)
>
> diff --git a/fsck.c b/fsck.c
> index db6aaa4..d30946b 100644
> --- a/fsck.c
> +++ b/fsck.c
> @@ -6,6 +6,7 @@
>  #include "commit.h"
>  #include "tag.h"
>  #include "fsck.h"
> +#include "refs.h"
>  
>  static int fsck_walk_tree(struct tree *tree, fsck_walk_func walk, void *data)
>  {
> @@ -355,6 +356,90 @@ static int fsck_commit(struct commit *commit, const char *data,
>  	return ret;
>  }
>  
> +static int fsck_tag_buffer(struct tag *tag, const char *data,
> +	unsigned long size, fsck_error error_func)
> +{
> +	unsigned char commit_sha1[20];
> +	int ret = 0;
> +	const char *buffer;
> +	char *tmp = NULL, *eol;

Call it "to_free" or something; naming it 'tmp' would tempt people
who later touch this code to reuse it temporarily to hold something
unrelated (I know they will notice that mistake later, but noticing
mistake after wasting time is too late).

> +	if (data)
> +		buffer = data;
> +	else {
> +		enum object_type type;
> +
> +		buffer = tmp = read_sha1_file(tag->object.sha1, &type, &size);
> +		if (!buffer)
> +			return error_func(&tag->object, FSCK_ERROR,
> +				"cannot read tag object");
> +
> +		if (type != OBJ_TAG) {
> +			ret = error_func(&tag->object, FSCK_ERROR,
> +				"expected tag got %s",
> +			    typename(type));
> +			goto done;
> +		}
> +	}
> +
> +	if (must_have_empty_line(buffer, size, &tag->object, error_func))
> +		goto done;
> +
> +	if (!skip_prefix(buffer, "object ", &buffer)) {
> +		ret = error_func(&tag->object, FSCK_ERROR, "invalid format - expected 'object' line");
> +		goto done;
> +	}
> +	if (get_sha1_hex(buffer, commit_sha1) || buffer[40] != '\n') {

This code is not making the mistake to assume that tagged objects
are always commits, so let's not call it commit_sha1; I'd suggest 
just calling it sha1[] (or even "tmp" or "junk", as the parsed value
is not used).

> +	*eol = '\0';
> +	if (type_from_string_gently(buffer) < 0)
> +		ret = error_func(&tag->object, FSCK_ERROR, "invalid 'type' value");
> +	*eol = '\n';
> +	if (ret)
> +		goto done;

I can see that it is reverted back to '\n' immediately after calling
type_from_string_gently(), but it is a bit unfortunate that "const
char *data" needs to be touched this way.

Since the callee is introduced in this series, perhaps it can be
made to take a counted string?

> +	if (!skip_prefix(buffer, "tag ", &buffer)) {
> +		ret = error_func(&tag->object, FSCK_ERROR, "invalid format - expected 'tag' line");
> +		goto done;
> +	}
> +	eol = strchr(buffer, '\n');
> +	if (!eol) {
> +		ret = error_func(&tag->object, FSCK_ERROR, "invalid format - unexpected end after 'type' line");
> +		goto done;
> +	}
> +	*eol = '\0';
> +	if (check_refname_format(buffer, REFNAME_ALLOW_ONELEVEL))
> +		ret = error_func(&tag->object, FSCK_ERROR, "invalid 'tag' name: %s", buffer);
> +	*eol = '\n';

I actually think this check is harmful.  It often matches the name
of the ref, but there is nothing inherently "refname like" in the
tag name proper.

And I think it is unnecessary.  Don't we already warn if it does not
match the name of the ref we use to point at it?  It would mean that
anything that does not conform to the check-refname-format will get
a warning one way or the other, either it is pointed at by a ref
whose name is kosher and does not match, or a ref itself has a name
that does not pass check-refname-format.

(goes and looks)

Yikes.  I assumed too much.  We do not seem to do much checking on
refs in that

    (1) After "git rev-parse HEAD >.git/refs/heads/ee..oo"
        "fsck" does not complain about the malformed ee..oo branch;

    (2) After "git tag -a foo -m foo && cp .git/refs/tags/foo .git/refs/tags/bar"
        "fsck" does not complain that refs/tags/bar talks about "foo"

But these two are something we would want to fix in a larger context
within "git fack" anyway, so my comments above still stand.

> +	if (!skip_prefix(buffer, "tagger ", &buffer)) {
> +		/* early tags do not contain 'tagger' lines; warn only */
> +		error_func(&tag->object, FSCK_WARN, "invalid format - expected 'tagger' line");

Nice.  Even nicer that this explains why people should not touch
'ret' here.

> +	}
> +	ret = fsck_ident(&buffer, &tag->object, error_func);
> +
> +done:
> +	free(tmp);
> +	return ret;
> +}
> +
>  static int fsck_tag(struct tag *tag, const char *data,
>  	unsigned long size, fsck_error error_func)
>  {
> @@ -362,7 +447,8 @@ static int fsck_tag(struct tag *tag, const char *data,
>  
>  	if (!tagged)
>  		return error_func(&tag->object, FSCK_ERROR, "could not load tagged object");
> -	return 0;
> +
> +	return fsck_tag_buffer(tag, data, size, error_func);
>  }
>  
>  int fsck_object(struct object *obj, void *data, unsigned long size,

  reply	other threads:[~2014-08-28 21:25 UTC|newest]

Thread overview: 68+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-08-28 14:46 [PATCH 0/6] Improve tag checking in fsck and with transfer.fsckobjects Johannes Schindelin
2014-08-28 14:46 ` [PATCH 1/6] Refactor type_from_string() to avoid die()ing in case of errors Johannes Schindelin
2014-08-28 20:43   ` Junio C Hamano
2014-08-28 14:46 ` [PATCH 2/6] Accept object data in the fsck_object() function Johannes Schindelin
2014-08-28 20:47   ` Junio C Hamano
2014-08-29 23:10     ` Jeff King
2014-08-29 23:05   ` Jeff King
2014-08-28 14:46 ` [PATCH 3/6] Make sure fsck_commit_buffer() does not run out of the buffer Johannes Schindelin
2014-08-28 20:59   ` Junio C Hamano
2014-08-29 23:27   ` Jeff King
2014-08-28 14:46 ` [PATCH 4/6] fsck: check tag objects' headers Johannes Schindelin
2014-08-28 21:25   ` Junio C Hamano [this message]
2014-08-28 21:36     ` Junio C Hamano
2014-08-29 23:46       ` Jeff King
2014-08-31 22:46         ` Junio C Hamano
2014-09-03 22:29           ` Jeff King
2014-09-03 23:14             ` Junio C Hamano
2014-09-04  2:04               ` Jeff King
2014-08-29 23:43     ` Jeff King
2014-09-02 18:41       ` Junio C Hamano
2014-09-03 21:38         ` Jeff King
2014-08-28 14:46 ` [PATCH 5/6] Add regression tests for stricter tag fsck'ing Johannes Schindelin
2014-08-28 14:47 ` [PATCH 6/6] Make sure that index-pack --strict fails upon invalid tag objects Johannes Schindelin
2014-09-10 13:52 ` [PATCH v2 0/6] Improve tag checking in fsck and with transfer.fsckobjects Johannes Schindelin
2014-09-10 13:58   ` Johannes Schindelin
2014-09-10 21:07   ` Junio C Hamano
2014-09-10 21:31     ` Junio C Hamano
2014-09-11 14:20       ` Johannes Schindelin
2014-09-11 14:26   ` [PATCH v3 " Johannes Schindelin
2014-09-11 14:26     ` [PATCH v3 1/6] Refactor type_from_string() to avoid die()ing in case of errors Johannes Schindelin
2014-09-11 14:26     ` [PATCH v3 2/6] Accept object data in the fsck_object() function Johannes Schindelin
2014-09-11 14:26     ` [PATCH v3 3/6] Make sure fsck_commit_buffer() does not run out of the buffer Johannes Schindelin
2014-09-11 14:26     ` [PATCH v3 4/6] fsck: check tag objects' headers Johannes Schindelin
2014-09-11 14:26     ` [PATCH v3 5/6] Add regression tests for stricter tag fsck'ing Johannes Schindelin
2014-09-11 14:26     ` [PATCH v3 6/6] Make sure that index-pack --strict checks tag objects Johannes Schindelin
2014-09-11 17:58       ` Junio C Hamano
2014-09-11 21:16         ` Junio C Hamano
2014-09-11 21:17           ` [PATCH 0/3] hash-object --literally Junio C Hamano
2014-09-11 21:17             ` [PATCH 1/3] hash-object: reduce file-scope statics Junio C Hamano
2014-09-11 21:17             ` [PATCH 2/3] hash-object: pass 'write_object' as a flag Junio C Hamano
2014-09-11 21:17             ` [PATCH 3/3] hash-object: add --literally option Junio C Hamano
2014-09-12  8:04           ` [PATCH v3 6/6] Make sure that index-pack --strict checks tag objects Johannes Schindelin
2014-09-12  8:07     ` [PATCH v4 0/6] Improve tag checking in fsck and with transfer.fsckobjects Johannes Schindelin
2014-09-12  8:07       ` [PATCH v4 1/6] Refactor type_from_string() to avoid die()ing in case of errors Johannes Schindelin
2014-09-12  8:07       ` [PATCH v4 2/6] Accept object data in the fsck_object() function Johannes Schindelin
2014-09-12  8:07       ` [PATCH v4 3/6] Make sure fsck_commit_buffer() does not run out of the buffer Johannes Schindelin
2014-09-12  8:08       ` [PATCH v4 4/6] fsck: check tag objects' headers Johannes Schindelin
2014-09-12  8:08       ` [PATCH v4 5/6] Add regression tests for stricter tag fsck'ing Johannes Schindelin
2014-09-12  8:08       ` [PATCH v4 6/6] Make sure that index-pack --strict checks tag objects Johannes Schindelin
2014-09-12 18:02       ` [PATCH v4 0/6] Improve tag checking in fsck and with transfer.fsckobjects Junio C Hamano
2014-09-13  9:08         ` Johannes Schindelin
     [not found] ` <cover.1410356761.git.johannes.schindelin@gmx.de>
2014-09-10 13:52   ` [PATCH v2 1/6] Refactor type_from_string() to avoid die()ing in case of errors Johannes Schindelin
2014-09-10 13:52   ` [PATCH v2 2/6] Accept object data in the fsck_object() function Johannes Schindelin
2014-09-10 13:52   ` [PATCH v2 3/6] Make sure fsck_commit_buffer() does not run out of the buffer Johannes Schindelin
2014-09-10 17:43     ` Junio C Hamano
2014-09-11 11:59       ` Johannes Schindelin
2014-09-11 16:49         ` Junio C Hamano
2014-09-10 20:45     ` Eric Sunshine
2014-09-10 13:53   ` [PATCH v2 4/6] fsck: check tag objects' headers Johannes Schindelin
2014-09-10 17:52     ` Junio C Hamano
2014-09-10 13:53   ` [PATCH v2 5/6] Add regression tests for stricter tag fsck'ing Johannes Schindelin
2014-09-10 17:56     ` Junio C Hamano
2014-09-11 14:15       ` Johannes Schindelin
2014-09-10 13:53   ` [PATCH v2 6/6] Make sure that index-pack --strict fails upon invalid tag objects Johannes Schindelin
2014-09-10 21:54     ` Junio C Hamano
2014-09-11 14:22       ` Johannes Schindelin
2014-09-11 16:50         ` Junio C Hamano
2014-09-11 17:04           ` Johannes Schindelin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqqlhq88fyb.fsf@gitster.dls.corp.google.com \
    --to=gitster@pobox.com \
    --cc=git@vger.kernel.org \
    --cc=johannes.schindelin@gmx.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.