All of lore.kernel.org
 help / color / mirror / Atom feed
* Funnies with "git fetch"
@ 2011-09-01 17:53 Junio C Hamano
  2011-09-01 22:42 ` Junio C Hamano
                   ` (2 more replies)
  0 siblings, 3 replies; 18+ messages in thread
From: Junio C Hamano @ 2011-09-01 17:53 UTC (permalink / raw)
  To: git

I just did this in an empty directory.

    $ git init src
    $ cd src
    $ echo hello >greetings ; git add . ; git commit -m greetings
    $ S=$(git rev-parse :greetings | sed -e 's|^..|&/|')
    $ X=$(echo bye | git hash-object -w --stdin | sed -e 's|^..|&/|')
    $ mv -f .git/objects/$X .git/objects/$S

The tip commit _thinks_ it has "greetings" that contains "hello", but
somebody replaced it with a corrupt "bye" that does not match self
integrity.

    $ git fsck
    error: sha1 mismatch ce013625030ba8dba906f756967f9e9ca394464a

    error: ce013625030ba8dba906f756967f9e9ca394464a: object corrupt or missing
    missing blob ce013625030ba8dba906f756967f9e9ca394464a

The "hello" blob is ce0136, and the tree contained in HEAD expects "hello"
in that loose object file, but notices the contents do not match the
filename.

So far, so good. Let's see what others see when they interact with this
repository.

cd ../
git init dst
cd dst
git config receive.fsckobjects true
git remote add origin ../src
git config branch.master.remote origin
git config branch.master.merge refs/heads/master
git fetch
    remote: Counting objects: 3, done.
    remote: Total 3 (delta 0), reused 0 (delta 0)
    Unpacking objects: 100% (3/3), done.
    From ../src
     * [new branch]      master     -> origin/master

Oops? If we run "fsck" at this point, we would notice the breakage:

    $ git fsck
    notice: HEAD points to an unborn branch (master)
    broken link from    tree 1c93b84c9756b083e5751db1f9ffa7f80ac667e2
                  to    blob ce013625030ba8dba906f756967f9e9ca394464a
    missing blob ce013625030ba8dba906f756967f9e9ca394464a
    dangling blob b023018cabc396e7692c70bbf5784a93d3f738ab

Here, b02301 is the true identity of the "bye" blob the src repository
crafted and tried to fool us into believing it is "hello".  We can see
that the object transfer gave three objects, and because we only propagate
the contents and have the receiving end compute the object names from the
data, we received b02301 but not ce0136.

    $ ls .git/objects/??/?*
    .git/objects/1c/93b84c9756b083e5751db1f9ffa7f80ac667e2
    .git/objects/61/5d8c76daef6744635c87fb312a76a5ec7462ea
    .git/objects/b0/23018cabc396e7692c70bbf5784a93d3f738ab

As a side note, if we did "git pull" instead of "git fetch", we would have
also noticed the breakage, like so:

    $ git pull
    remote: Counting objects: 3, done.
    remote: Total 3 (delta 0), reused 0 (delta 0)
    Unpacking objects: 100% (3/3), done.
    From ../src
     * [new branch]      master     -> origin/master
    error: unable to find ce013625030ba8dba906f756967f9e9ca394464a
    error: unable to read sha1 file of greetings (ce013625030ba...)

But the straight "fetch" did not notice anything fishy going on. Shouldn't
we have?  Even though we may be reasonably safe, unpack-objects should be
able to do better, especially under receive.fsckobjects option.

Also as a side note, if we set 

    $ git config fetch.unpacklimit 1

before we run this "git fetch", we end up storing a single pack, whose
contents are the same three objects above (as expected), and we do not get
any indication of an error from the command.

I think the breakages are:

 - The sending side does not give any indication that it _wanted_ to send
   ce0136 but couldn't, and ended up sending another object;

 - The pack data sent over the wire was self consistent (no breakage here)
   and sent three well-formed objects, but it was inconsistent with
   respect to what history was being transferred (breakage is here);

 - The receiving end did not notice the inconsistency.

The first one is of the lower priority, as the client side should be able
to notice an upstream with corruption in any case. Perhaps after asking
for objects between "have" and "want", "git fetch" should verify that it
can fully walk the subhistory that was supposed to be transferred down to
the blob level?

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Funnies with "git fetch"
  2011-09-01 17:53 Funnies with "git fetch" Junio C Hamano
@ 2011-09-01 22:42 ` Junio C Hamano
  2011-09-01 23:31   ` Jeff King
  2011-09-01 22:43 ` [PATCH 0/3] Verify the objects fetch obtained before updating ref Junio C Hamano
  2011-09-04 19:15 ` Funnies with "git fetch" Junio C Hamano
  2 siblings, 1 reply; 18+ messages in thread
From: Junio C Hamano @ 2011-09-01 22:42 UTC (permalink / raw)
  To: git

Junio C Hamano <gitster@pobox.com> writes:

> I think the breakages are:
>
>  - The sending side does not give any indication that it _wanted_ to send
>    ce0136 but couldn't, and ended up sending another object;
>
>  - The pack data sent over the wire was self consistent (no breakage here)
>    and sent three well-formed objects, but it was inconsistent with
>    respect to what history was being transferred (breakage is here);
>
>  - The receiving end did not notice the inconsistency.
>
> The first one is of the lower priority, as the client side should be able
> to notice an upstream with corruption in any case. Perhaps after asking
> for objects between "have" and "want", "git fetch" should verify that it
> can fully walk the subhistory that was supposed to be transferred down to
> the blob level?

So I have a series to fix the latter "more important" half I'll be sending
out in this thread.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 0/3] Verify the objects fetch obtained before updating ref
  2011-09-01 17:53 Funnies with "git fetch" Junio C Hamano
  2011-09-01 22:42 ` Junio C Hamano
@ 2011-09-01 22:43 ` Junio C Hamano
  2011-09-01 22:43   ` [PATCH 1/3] list-objects: pass callback data to show_objects() Junio C Hamano
                     ` (2 more replies)
  2011-09-04 19:15 ` Funnies with "git fetch" Junio C Hamano
  2 siblings, 3 replies; 18+ messages in thread
From: Junio C Hamano @ 2011-09-01 22:43 UTC (permalink / raw)
  To: git

So here is a series to make the client side a bit more careful.

These patches apply on top of jc/traverse-commit-list beba25a (revision.c:
update show_object_with_name() without using malloc(), 2011-08-17) in
'next'.

Junio C Hamano (3):
  list-objects: pass callback data to show_objects()
  rev-list --verify-object
  fetch: ensure we transferred everything we need before updating our
    ref

 builtin/fetch.c        |  119 +++++++++++++++++++++++++----------------------
 builtin/pack-objects.c |    4 +-
 builtin/rev-list.c     |   14 ++++-
 list-objects.c         |   28 +++++++----
 list-objects.h         |    5 +-
 revision.c             |    5 ++
 revision.h             |    1 +
 upload-pack.c          |    4 +-
 8 files changed, 105 insertions(+), 75 deletions(-)

-- 
1.7.7.rc0.72.g4b5ea

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 1/3] list-objects: pass callback data to show_objects()
  2011-09-01 22:43 ` [PATCH 0/3] Verify the objects fetch obtained before updating ref Junio C Hamano
@ 2011-09-01 22:43   ` Junio C Hamano
  2011-09-01 22:43   ` [PATCH 2/3] rev-list --verify-object Junio C Hamano
  2011-09-01 22:43   ` [PATCH 3/3] fetch: verify we have everything we need before updating our ref Junio C Hamano
  2 siblings, 0 replies; 18+ messages in thread
From: Junio C Hamano @ 2011-09-01 22:43 UTC (permalink / raw)
  To: git

The traverse_commit_list() API takes two callback functions, one to show
commit objects, and the other to show other kinds of objects. Even though
the former has a callback data parameter, so that the callback does not
have to rely on global state, the latter does not.

Give the show_objects() callback the same callback data parameter.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/pack-objects.c |    4 +++-
 builtin/rev-list.c     |   10 +++++++---
 list-objects.c         |   28 +++++++++++++++++-----------
 list-objects.h         |    5 ++---
 upload-pack.c          |    4 +++-
 5 files changed, 32 insertions(+), 19 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index f402a84..fca6cb5 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -1936,7 +1936,9 @@ static void show_commit(struct commit *commit, void *data)
 	commit->object.flags |= OBJECT_ADDED;
 }
 
-static void show_object(struct object *obj, const struct name_path *path, const char *last)
+static void show_object(struct object *obj,
+			const struct name_path *path, const char *last,
+			void *data)
 {
 	char *name = path_name(path, last);
 
diff --git a/builtin/rev-list.c b/builtin/rev-list.c
index f5ce487..920b91c 100644
--- a/builtin/rev-list.c
+++ b/builtin/rev-list.c
@@ -168,15 +168,19 @@ static void finish_commit(struct commit *commit, void *data)
 	commit->buffer = NULL;
 }
 
-static void finish_object(struct object *obj, const struct name_path *path, const char *name)
+static void finish_object(struct object *obj,
+			  const struct name_path *path, const char *name,
+			  void *cb_data)
 {
 	if (obj->type == OBJ_BLOB && !has_sha1_file(obj->sha1))
 		die("missing blob object '%s'", sha1_to_hex(obj->sha1));
 }
 
-static void show_object(struct object *obj, const struct name_path *path, const char *component)
+static void show_object(struct object *obj,
+			const struct name_path *path, const char *component,
+			void *cb_data)
 {
-	finish_object(obj, path, component);
+	finish_object(obj, path, component, cb_data);
 	show_object_with_name(stdout, obj, path, component);
 }
 
diff --git a/list-objects.c b/list-objects.c
index 0fb44e7..39d80c0 100644
--- a/list-objects.c
+++ b/list-objects.c
@@ -12,7 +12,8 @@ static void process_blob(struct rev_info *revs,
 			 struct blob *blob,
 			 show_object_fn show,
 			 struct name_path *path,
-			 const char *name)
+			 const char *name,
+			 void *cb_data)
 {
 	struct object *obj = &blob->object;
 
@@ -23,7 +24,7 @@ static void process_blob(struct rev_info *revs,
 	if (obj->flags & (UNINTERESTING | SEEN))
 		return;
 	obj->flags |= SEEN;
-	show(obj, path, name);
+	show(obj, path, name, cb_data);
 }
 
 /*
@@ -52,7 +53,8 @@ static void process_gitlink(struct rev_info *revs,
 			    const unsigned char *sha1,
 			    show_object_fn show,
 			    struct name_path *path,
-			    const char *name)
+			    const char *name,
+			    void *cb_data)
 {
 	/* Nothing to do */
 }
@@ -62,7 +64,8 @@ static void process_tree(struct rev_info *revs,
 			 show_object_fn show,
 			 struct name_path *path,
 			 struct strbuf *base,
-			 const char *name)
+			 const char *name,
+			 void *cb_data)
 {
 	struct object *obj = &tree->object;
 	struct tree_desc desc;
@@ -80,7 +83,7 @@ static void process_tree(struct rev_info *revs,
 	if (parse_tree(tree) < 0)
 		die("bad tree object %s", sha1_to_hex(obj->sha1));
 	obj->flags |= SEEN;
-	show(obj, path, name);
+	show(obj, path, name, cb_data);
 	me.up = path;
 	me.elem = name;
 	me.elem_len = strlen(name);
@@ -106,14 +109,17 @@ static void process_tree(struct rev_info *revs,
 		if (S_ISDIR(entry.mode))
 			process_tree(revs,
 				     lookup_tree(entry.sha1),
-				     show, &me, base, entry.path);
+				     show, &me, base, entry.path,
+				     cb_data);
 		else if (S_ISGITLINK(entry.mode))
 			process_gitlink(revs, entry.sha1,
-					show, &me, entry.path);
+					show, &me, entry.path,
+					cb_data);
 		else
 			process_blob(revs,
 				     lookup_blob(entry.sha1),
-				     show, &me, entry.path);
+				     show, &me, entry.path,
+				     cb_data);
 	}
 	strbuf_setlen(base, baselen);
 	free(tree->buffer);
@@ -185,17 +191,17 @@ void traverse_commit_list(struct rev_info *revs,
 			continue;
 		if (obj->type == OBJ_TAG) {
 			obj->flags |= SEEN;
-			show_object(obj, NULL, name);
+			show_object(obj, NULL, name, data);
 			continue;
 		}
 		if (obj->type == OBJ_TREE) {
 			process_tree(revs, (struct tree *)obj, show_object,
-				     NULL, &base, name);
+				     NULL, &base, name, data);
 			continue;
 		}
 		if (obj->type == OBJ_BLOB) {
 			process_blob(revs, (struct blob *)obj, show_object,
-				     NULL, name);
+				     NULL, name, data);
 			continue;
 		}
 		die("unknown pending object %s (%s)",
diff --git a/list-objects.h b/list-objects.h
index d65dbf0..3db7bb6 100644
--- a/list-objects.h
+++ b/list-objects.h
@@ -2,11 +2,10 @@
 #define LIST_OBJECTS_H
 
 typedef void (*show_commit_fn)(struct commit *, void *);
-typedef void (*show_object_fn)(struct object *, const struct name_path *, const char *);
-typedef void (*show_edge_fn)(struct commit *);
-
+typedef void (*show_object_fn)(struct object *, const struct name_path *, const char *, void *);
 void traverse_commit_list(struct rev_info *, show_commit_fn, show_object_fn, void *);
 
+typedef void (*show_edge_fn)(struct commit *);
 void mark_edges_uninteresting(struct commit_list *, struct rev_info *, show_edge_fn);
 
 #endif
diff --git a/upload-pack.c b/upload-pack.c
index 970a1eb..6be6259 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -83,7 +83,9 @@ static void show_commit(struct commit *commit, void *data)
 	commit->buffer = NULL;
 }
 
-static void show_object(struct object *obj, const struct name_path *path, const char *component)
+static void show_object(struct object *obj,
+			const struct name_path *path, const char *component,
+			void *cb_data)
 {
 	show_object_with_name(pack_pipe, obj, path, component);
 }
-- 
1.7.7.rc0.72.g4b5ea

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 2/3] rev-list --verify-object
  2011-09-01 22:43 ` [PATCH 0/3] Verify the objects fetch obtained before updating ref Junio C Hamano
  2011-09-01 22:43   ` [PATCH 1/3] list-objects: pass callback data to show_objects() Junio C Hamano
@ 2011-09-01 22:43   ` Junio C Hamano
  2011-09-01 22:43   ` [PATCH 3/3] fetch: verify we have everything we need before updating our ref Junio C Hamano
  2 siblings, 0 replies; 18+ messages in thread
From: Junio C Hamano @ 2011-09-01 22:43 UTC (permalink / raw)
  To: git

Often we want to verify everything reachable from a given set of commits
are present in our repository and connected without a gap to the tips of
our refs. We used to do this for this purpose:

    $ rev-list --objects $commits_to_be_tested --not --all

Even though this is good enough for catching missing commits and trees,
we show the object name but do not verify their existence, let alone their
well-formedness, for the blob objects at the leaf level.

Add a new "--verify-object" option so that we can catch missing and broken
blobs as well.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/rev-list.c |    4 ++++
 revision.c         |    5 +++++
 revision.h         |    1 +
 3 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/builtin/rev-list.c b/builtin/rev-list.c
index 920b91c..ab3be7c 100644
--- a/builtin/rev-list.c
+++ b/builtin/rev-list.c
@@ -180,7 +180,11 @@ static void show_object(struct object *obj,
 			const struct name_path *path, const char *component,
 			void *cb_data)
 {
+	struct rev_info *info = cb_data;
+
 	finish_object(obj, path, component, cb_data);
+	if (info->verify_objects && !obj->parsed && obj->type != OBJ_COMMIT)
+		parse_object(obj->sha1);
 	show_object_with_name(stdout, obj, path, component);
 }
 
diff --git a/revision.c b/revision.c
index 072ddac..5ef498b 100644
--- a/revision.c
+++ b/revision.c
@@ -1383,6 +1383,11 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
 		revs->tree_objects = 1;
 		revs->blob_objects = 1;
 		revs->edge_hint = 1;
+	} else if (!strcmp(arg, "--verify-objects")) {
+		revs->tag_objects = 1;
+		revs->tree_objects = 1;
+		revs->blob_objects = 1;
+		revs->verify_objects = 1;
 	} else if (!strcmp(arg, "--unpacked")) {
 		revs->unpacked = 1;
 	} else if (!prefixcmp(arg, "--unpacked=")) {
diff --git a/revision.h b/revision.h
index da00a58..648876b 100644
--- a/revision.h
+++ b/revision.h
@@ -53,6 +53,7 @@ struct rev_info {
 			tag_objects:1,
 			tree_objects:1,
 			blob_objects:1,
+			verify_objects:1,
 			edge_hint:1,
 			limited:1,
 			unpacked:1,
-- 
1.7.7.rc0.72.g4b5ea

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 3/3] fetch: verify we have everything we need before updating our ref
  2011-09-01 22:43 ` [PATCH 0/3] Verify the objects fetch obtained before updating ref Junio C Hamano
  2011-09-01 22:43   ` [PATCH 1/3] list-objects: pass callback data to show_objects() Junio C Hamano
  2011-09-01 22:43   ` [PATCH 2/3] rev-list --verify-object Junio C Hamano
@ 2011-09-01 22:43   ` Junio C Hamano
  2011-09-02  3:55     ` Nguyen Thai Ngoc Duy
  2 siblings, 1 reply; 18+ messages in thread
From: Junio C Hamano @ 2011-09-01 22:43 UTC (permalink / raw)
  To: git

The "git fetch" command works in two phases. The remote side tells us what
objects are at the tip of the refs we are fetching from, and transfers the
objects missing from our side. After storing the objects in our repository,
we update our remote tracking branches to point at the updated tips of the
refs.

A broken or malicious remote side could send a perfectly well-formed pack
data during the object transfer phase, but there is no guarantee that the
given data actually fill the gap between the objects we originally had and
the refs we are updating to.

Although this kind of breakage can be caught by running fsck after a
fetch, it is much cheaper to verify that everything that is reachable from
the tips of the refs we fetched are indeed fully connected to the tips of
our current set of refs before we update them.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin/fetch.c |  119 +++++++++++++++++++++++++++++--------------------------
 1 files changed, 63 insertions(+), 56 deletions(-)

diff --git a/builtin/fetch.c b/builtin/fetch.c
index f9c41da..bdb03ff 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -345,6 +345,64 @@ static int update_local_ref(struct ref *ref,
 	}
 }
 
+/*
+ * The ref_map records the tips of the refs we are fetching. If
+ *
+ *  $ git rev-list --verify-objects --stdin --not --all
+ *
+ * (feeding all the refs in ref_map on its standard input) does not
+ * error out, that means everything reachable from these updated refs
+ * locally exists and is connected to some of our existing refs.
+ *
+ * Returns 0 if everything is connected, non-zero otherwise.
+ */
+static int check_everything_connected(struct ref *ref_map, int quiet)
+{
+	struct child_process rev_list;
+	const char *argv[] = {"rev-list", "--verify-objects",
+			      "--stdin", "--not", "--all", NULL, NULL};
+	char commit[41];
+	struct ref *ref;
+	int err = 0;
+
+	if (!ref_map)
+		return 0;
+
+	if (quiet)
+		argv[5] = "--quiet";
+
+	memset(&rev_list, 0, sizeof(rev_list));
+	rev_list.argv = argv;
+	rev_list.git_cmd = 1;
+	rev_list.in = -1;
+	rev_list.no_stdout = 1;
+	rev_list.no_stderr = quiet;
+	if (start_command(&rev_list))
+		return error(_("Could not run 'git rev-list'"));
+
+	sigchain_push(SIGPIPE, SIG_IGN);
+
+	memcpy(commit + 40, "\n", 2);
+	for (ref = ref_map; ref; ref = ref->next) {
+		memcpy(commit, sha1_to_hex(ref->old_sha1), 40);
+		if (write_in_full(rev_list.in, commit, 41) < 0) {
+			if (errno != EPIPE && errno != EINVAL)
+				error(_("failed write to rev-list: %s"),
+				      strerror(errno));
+			err = -1;
+			break;
+		}
+	}
+	if (close(rev_list.in)) {
+		error(_("failed to close rev-list's stdin: %s"), strerror(errno));
+		err = -1;
+	}
+
+	sigchain_pop(SIGPIPE);
+
+	return finish_command(&rev_list) || err;
+}
+
 static int store_updated_refs(const char *raw_url, const char *remote_name,
 		struct ref *ref_map)
 {
@@ -364,6 +422,10 @@ static int store_updated_refs(const char *raw_url, const char *remote_name,
 		url = transport_anonymize_url(raw_url);
 	else
 		url = xstrdup("foreign");
+
+	if (check_everything_connected(ref_map, 0))
+		return error(_("%s did not send all necessary objects\n"), url);
+
 	for (rm = ref_map; rm; rm = rm->next) {
 		struct ref *ref = NULL;
 
@@ -457,24 +519,9 @@ static int store_updated_refs(const char *raw_url, const char *remote_name,
  * We would want to bypass the object transfer altogether if
  * everything we are going to fetch already exists and is connected
  * locally.
- *
- * The refs we are going to fetch are in ref_map.  If running
- *
- *  $ git rev-list --objects --stdin --not --all
- *
- * (feeding all the refs in ref_map on its standard input)
- * does not error out, that means everything reachable from the
- * refs we are going to fetch exists and is connected to some of
- * our existing refs.
  */
 static int quickfetch(struct ref *ref_map)
 {
-	struct child_process revlist;
-	struct ref *ref;
-	int err;
-	const char *argv[] = {"rev-list",
-		"--quiet", "--objects", "--stdin", "--not", "--all", NULL};
-
 	/*
 	 * If we are deepening a shallow clone we already have these
 	 * objects reachable.  Running rev-list here will return with
@@ -484,47 +531,7 @@ static int quickfetch(struct ref *ref_map)
 	 */
 	if (depth)
 		return -1;
-
-	if (!ref_map)
-		return 0;
-
-	memset(&revlist, 0, sizeof(revlist));
-	revlist.argv = argv;
-	revlist.git_cmd = 1;
-	revlist.no_stdout = 1;
-	revlist.no_stderr = 1;
-	revlist.in = -1;
-
-	err = start_command(&revlist);
-	if (err) {
-		error(_("could not run rev-list"));
-		return err;
-	}
-
-	/*
-	 * If rev-list --stdin encounters an unknown commit, it terminates,
-	 * which will cause SIGPIPE in the write loop below.
-	 */
-	sigchain_push(SIGPIPE, SIG_IGN);
-
-	for (ref = ref_map; ref; ref = ref->next) {
-		if (write_in_full(revlist.in, sha1_to_hex(ref->old_sha1), 40) < 0 ||
-		    write_str_in_full(revlist.in, "\n") < 0) {
-			if (errno != EPIPE && errno != EINVAL)
-				error(_("failed write to rev-list: %s"), strerror(errno));
-			err = -1;
-			break;
-		}
-	}
-
-	if (close(revlist.in)) {
-		error(_("failed to close rev-list's stdin: %s"), strerror(errno));
-		err = -1;
-	}
-
-	sigchain_pop(SIGPIPE);
-
-	return finish_command(&revlist) || err;
+	return check_everything_connected(ref_map, 1);
 }
 
 static int fetch_refs(struct transport *transport, struct ref *ref_map)
-- 
1.7.7.rc0.72.g4b5ea

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: Funnies with "git fetch"
  2011-09-01 22:42 ` Junio C Hamano
@ 2011-09-01 23:31   ` Jeff King
  2011-09-02  3:09     ` Junio C Hamano
  0 siblings, 1 reply; 18+ messages in thread
From: Jeff King @ 2011-09-01 23:31 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Thu, Sep 01, 2011 at 03:42:46PM -0700, Junio C Hamano wrote:

> Junio C Hamano <gitster@pobox.com> writes:
> 
> > I think the breakages are:
> >
> >  - The sending side does not give any indication that it _wanted_ to send
> >    ce0136 but couldn't, and ended up sending another object;

This isn't fixable without the server re-hashing every outgoing object,
though, is it? It thinks it has ce0136, but the data is corrupted. The
only way to detect that is to rehash each object that we send. That's
pretty expensive when you consider decompression and delta
reconstruction.

Besides which, it doesn't help many malicious cases at all. It will
handle disk corruption or a malicious object-db tweak. But it does
nothing against an attacker who can trojan git, or who can spoof the
network session (which is probably hard to do if its straight ssh, but
keep in mind that traffic may be going through cleartext proxies).

So it seems that the receiving side is the only sensible place to put
such checks. It catches problems at more levels, and it can generally
afford the extra processing load (under the assumption that pushes to
servers are smaller and less frequent than fetches).

> >  - The pack data sent over the wire was self consistent (no breakage here)
> >    and sent three well-formed objects, but it was inconsistent with
> >    respect to what history was being transferred (breakage is here);

I don't think this matters. We have thin packs, after all. What goes on
the wire doesn't have to be sensible unto itself; it's about whether
the receiving end can do something useful with it.

> >  - The receiving end did not notice the inconsistency.
> >
> > The first one is of the lower priority, as the client side should be able
> > to notice an upstream with corruption in any case. Perhaps after asking
> > for objects between "have" and "want", "git fetch" should verify that it
> > can fully walk the subhistory that was supposed to be transferred down to
> > the blob level?

That seems like a sensible improvement to me.

> So I have a series to fix the latter "more important" half I'll be sending
> out in this thread.

If I understand correctly, your series is just about checking that we
have newly-referenced blobs. We were already checking commits and trees,
and we should already be hashing individual objects when we index the
pack. Right?

In that case, the performance change should be negligible.

-Peff

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Funnies with "git fetch"
  2011-09-01 23:31   ` Jeff King
@ 2011-09-02  3:09     ` Junio C Hamano
  2011-09-02  3:28       ` Junio C Hamano
  2011-09-02  5:03       ` Jeff King
  0 siblings, 2 replies; 18+ messages in thread
From: Junio C Hamano @ 2011-09-02  3:09 UTC (permalink / raw)
  To: Jeff King; +Cc: git

Jeff King <peff@peff.net> writes:

> If I understand correctly, your series is just about checking that we
> have newly-referenced blobs. We were already checking commits and trees,
> and we should already be hashing individual objects when we index the
> pack. Right?

You may be slightly misunderstanding the series.

We let unpack-objects or index-pack consume the pack stream, either by
exploding them into loose objects, or computing the object name for each
object to create the mapping from object name to the offset. During this
process, we deflate to read the contents and resolve the delta to come up
with the object name for individual objects, so we would notice corruption
at the individual object level. As pack stream does not say what name each
object is (the recipient is expected to compute it), there is no "stream
says it is object X but the data is actually for object Y" problem. The
recipient does not even see "X"---all it sees is Y.

The current code does not try to make sure we really have the objects
necessary to connect the updated tips to our original refs at all.  Not
just blobs but neither commits nor trees are traversed. The new check in
store_updated_refs() is about that. So in that sense, the series is not
about "just blobs".

The "rev-list --verify-objects" patch is about "blob vs everything else".
It is used in the existing quickfetch() check, and also the additional
check in store_updated_refs(). The existing check we run with "--objects"
is capable of detecting corruptions of commits and trees (as we had to be
able to read them to discover objects they refer to), but that is not a
sufficient check if we worry about missing blobs.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Funnies with "git fetch"
  2011-09-02  3:09     ` Junio C Hamano
@ 2011-09-02  3:28       ` Junio C Hamano
  2011-09-02  5:03       ` Jeff King
  1 sibling, 0 replies; 18+ messages in thread
From: Junio C Hamano @ 2011-09-02  3:28 UTC (permalink / raw)
  To: Jeff King; +Cc: git

Junio C Hamano <gitster@pobox.com> writes:

> Jeff King <peff@peff.net> writes:
>
>> If I understand correctly, your series is just about checking that we
>> have newly-referenced blobs. We were already checking commits and trees,
>> and we should already be hashing individual objects when we index the
>> pack. Right?
>
> You may be slightly misunderstanding the series.
>
> We let unpack-objects or index-pack consume the pack stream, either by
> exploding them into loose objects, or computing the object name for each
> object to create the mapping from object name to the offset. During this
> process, we deflate to read the contents and resolve the delta to come up
> with the object name for individual objects, so we would notice corruption
> at the individual object level. As pack stream does not say what name each
> object is (the recipient is expected to compute it), there is no "stream
> says it is object X but the data is actually for object Y" problem. The
> recipient does not even see "X"---all it sees is Y.

One thing I forgot to mention.

If you have receive.fsckobjects set, unpack-objects and index-pack are
called with the --strict option and causes fsck_objects() to run, so we
will also catch malformed commits and trees that way. But even then,
calling fsck_object() on a blob object always succeeds, so it is not a
real check.

We probably would want to flip the default for receive_fsck_objects to
true one of these days.

> The current code does not try to make sure we really have the objects
> necessary to connect the updated tips to our original refs at all.  Not
> just blobs but neither commits nor trees are traversed. The new check in
> store_updated_refs() is about that. So in that sense, the series is not
> about "just blobs".
>
> The "rev-list --verify-objects" patch is about "blob vs everything else".
> It is used in the existing quickfetch() check, and also the additional
> check in store_updated_refs(). The existing check we run with "--objects"
> is capable of detecting corruptions of commits and trees (as we had to be
> able to read them to discover objects they refer to), but that is not a
> sufficient check if we worry about missing blobs.

I am debating myself if we want to also add calls to fsck_object() to the
new codepath. An additional patch on top of [2/3] would look like this
(totally untested).

It would make "rev-list --verify-objects" useful independently from its
use in the context of "git fetch", and more importantly, what it checks in
the context of "git fetch", while it is related, is more or less
orthogonal to what receive.fsckobjects checks. The check done during the
transfer is to check the set of objects the other side threw at us. The
check done in the check_everything_connected() by calling "rev-list
--verify-objects" is to check the set of objects we are actually going to
use. The former _should_ be superset of the latter, but [3/3] is about
making sure that the former indeed is the superset of the latter.

 builtin/rev-list.c |   13 +++++++++++--
 1 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/builtin/rev-list.c b/builtin/rev-list.c
index ab3be7c..bd49615 100644
--- a/builtin/rev-list.c
+++ b/builtin/rev-list.c
@@ -7,6 +7,7 @@
 #include "log-tree.h"
 #include "graph.h"
 #include "bisect.h"
+#include "fsck.h"
 
 static const char rev_list_usage[] =
 "git rev-list [OPTION] <commit-id>... [ -- paths... ]\n"
@@ -53,6 +54,11 @@ static void show_commit(struct commit *commit, void *data)
 	struct rev_info *revs = info->revs;
 
 	graph_show_commit(revs->graph);
+	if (revs->verify_objects) {
+		if (!commit->object.parsed)
+			parse_object(commit->object.sha1);
+		fsck_object(&commit->object, 1, fsck_error_function);
+	}
 
 	if (revs->count) {
 		if (commit->object.flags & PATCHSAME)
@@ -183,8 +189,11 @@ static void show_object(struct object *obj,
 	struct rev_info *info = cb_data;
 
 	finish_object(obj, path, component, cb_data);
-	if (info->verify_objects && !obj->parsed && obj->type != OBJ_COMMIT)
-		parse_object(obj->sha1);
+	if (info->verify_objects && obj->type != OBJ_COMMIT) {
+		if (!obj->parsed)
+			parse_object(obj->sha1);
+		fsck_object(obj, 1, fsck_error_function);
+	}
 	show_object_with_name(stdout, obj, path, component);
 }
 

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH 3/3] fetch: verify we have everything we need before updating our ref
  2011-09-01 22:43   ` [PATCH 3/3] fetch: verify we have everything we need before updating our ref Junio C Hamano
@ 2011-09-02  3:55     ` Nguyen Thai Ngoc Duy
  2011-09-02  4:25       ` Junio C Hamano
  0 siblings, 1 reply; 18+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2011-09-02  3:55 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Fri, Sep 2, 2011 at 5:43 AM, Junio C Hamano <gitster@pobox.com> wrote:
> The "git fetch" command works in two phases. The remote side tells us what
> objects are at the tip of the refs we are fetching from, and transfers the
> objects missing from our side. After storing the objects in our repository,
> we update our remote tracking branches to point at the updated tips of the
> refs.
>
> A broken or malicious remote side could send a perfectly well-formed pack
> data during the object transfer phase, but there is no guarantee that the
> given data actually fill the gap between the objects we originally had and
> the refs we are updating to.

What about receive-pack? Does it have the same breakage?
-- 
Duy

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 3/3] fetch: verify we have everything we need before updating our ref
  2011-09-02  3:55     ` Nguyen Thai Ngoc Duy
@ 2011-09-02  4:25       ` Junio C Hamano
  2011-09-02 23:14         ` Junio C Hamano
  0 siblings, 1 reply; 18+ messages in thread
From: Junio C Hamano @ 2011-09-02  4:25 UTC (permalink / raw)
  To: Nguyen Thai Ngoc Duy; +Cc: git

Nguyen Thai Ngoc Duy <pclouds@gmail.com> writes:

> What about receive-pack? Does it have the same breakage?

I didn't look---you tell me ;-)

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Funnies with "git fetch"
  2011-09-02  3:09     ` Junio C Hamano
  2011-09-02  3:28       ` Junio C Hamano
@ 2011-09-02  5:03       ` Jeff King
  1 sibling, 0 replies; 18+ messages in thread
From: Jeff King @ 2011-09-02  5:03 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Thu, Sep 01, 2011 at 08:09:49PM -0700, Junio C Hamano wrote:

> You may be slightly misunderstanding the series.
> [...]
> The current code does not try to make sure we really have the objects
> necessary to connect the updated tips to our original refs at all.  Not
> just blobs but neither commits nor trees are traversed. The new check in
> store_updated_refs() is about that. So in that sense, the series is not
> about "just blobs".

Ah, OK, I see. I was too focused on pulling the bits out of quickfetch
into check_everything_connected, and missed the important new call in
store_updated_refs.

So what you are doing makes sense to me. I am curious, though, what the
performance impact is like. In particular, it seems that we will pull
each blob into memory via parse_object. Until now, we were mostly
streaming the blobs straight into packs. That makes me a little nervous
given the discussions recently about large blobs, and not accessing them
unnecessarily. But maybe that is a silly concern, as we will have just
reconstructed and hashed such an object anyway to get its name.

-Peff

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 3/3] fetch: verify we have everything we need before updating our ref
  2011-09-02  4:25       ` Junio C Hamano
@ 2011-09-02 23:14         ` Junio C Hamano
  0 siblings, 0 replies; 18+ messages in thread
From: Junio C Hamano @ 2011-09-02 23:14 UTC (permalink / raw)
  To: Nguyen Thai Ngoc Duy; +Cc: git

Junio C Hamano <gitster@pobox.com> writes:

> Nguyen Thai Ngoc Duy <pclouds@gmail.com> writes:
>
>> What about receive-pack? Does it have the same breakage?
>
> I didn't look---you tell me ;-)

I looked. It needs a similar check, and I'd need to refactor the
check_everything_connected() function a bit to take an iterator.

Will send a follow-up series this weekend.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Funnies with "git fetch"
  2011-09-01 17:53 Funnies with "git fetch" Junio C Hamano
  2011-09-01 22:42 ` Junio C Hamano
  2011-09-01 22:43 ` [PATCH 0/3] Verify the objects fetch obtained before updating ref Junio C Hamano
@ 2011-09-04 19:15 ` Junio C Hamano
  2011-09-05  2:21   ` [PATCH 0/3] Add fetch.fsckobjects Junio C Hamano
  2 siblings, 1 reply; 18+ messages in thread
From: Junio C Hamano @ 2011-09-04 19:15 UTC (permalink / raw)
  To: git

Junio C Hamano <gitster@pobox.com> writes:

> I just did this in an empty directory.
>
>     $ git init src
>     $ cd src
>     $ echo hello >greetings ; git add . ; git commit -m greetings
>     $ S=$(git rev-parse :greetings | sed -e 's|^..|&/|')
>     $ X=$(echo bye | git hash-object -w --stdin | sed -e 's|^..|&/|')
>     $ mv -f .git/objects/$X .git/objects/$S
>
> The tip commit _thinks_ it has "greetings" that contains "hello", but
> somebody replaced it with a corrupt "bye" that does not match self
> integrity.
>
>     $ git fsck
>     error: sha1 mismatch ce013625030ba8dba906f756967f9e9ca394464a
>
>     error: ce013625030ba8dba906f756967f9e9ca394464a: object corrupt or missing
>     missing blob ce013625030ba8dba906f756967f9e9ca394464a
>
> The "hello" blob is ce0136, and the tree contained in HEAD expects "hello"
> in that loose object file, but notices the contents do not match the
> filename.
>
> So far, so good. Let's see what others see when they interact with this
> repository.
>
> cd ../
> git init dst
> cd dst
> git config receive.fsckobjects true
> git remote add origin ../src
> git config branch.master.remote origin
> git config branch.master.merge refs/heads/master
> git fetch
>     remote: Counting objects: 3, done.
>     remote: Total 3 (delta 0), reused 0 (delta 0)
>     Unpacking objects: 100% (3/3), done.
>     From ../src
>      * [new branch]      master     -> origin/master
>
> Oops? If we run "fsck" at this point, we would notice the breakage:
> ...
> But the straight "fetch" did not notice anything fishy going on. Shouldn't
> we have?  Even though we may be reasonably safe, unpack-objects should be
> able to do better, especially under receive.fsckobjects option.

We do not have fetch.fsckobjects, and here is a quick patch to add it. I
would also add transfer.fsckobjects to cover both with a single variable
in a follow-up patch, but with this, it fails as expected:

    $ git config fetch.fsckobjects true
    $ git fetch
    remote: Counting objects: 3, done.
    remote: Total 3 (delta 0), reused 0 (delta 0)
    Unpacking objects: 100% (3/3), done.
    error: unable to find ce013625030ba8dba906f756967f9e9ca394464a
    fatal: object of unexpected type
    fatal: unpack-objects failed

during the transfer phase.  The "rev-list --verify-objects" addition is a
fix to a related but independent issue, so both are probably good to have,
but this makes it unnecessary to run fsck_object() during the update-ref
phase.

 builtin/fetch-pack.c |    8 ++++++++
 1 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
index 412bd32..bfed536 100644
--- a/builtin/fetch-pack.c
+++ b/builtin/fetch-pack.c
@@ -16,6 +16,7 @@ static int fetch_unpack_limit = -1;
 static int unpack_limit = 100;
 static int prefer_ofs_delta = 1;
 static int no_done = 0;
+static int fetch_fsck_objects;
 static struct fetch_pack_args args = {
 	/* .uploadpack = */ "git-upload-pack",
 };
@@ -734,6 +735,8 @@ static int get_pack(int xd[2], char **pack_lockfile)
 	}
 	if (*hdr_arg)
 		*av++ = hdr_arg;
+	if (fetch_fsck_objects)
+		*av++ = "--strict";
 	*av++ = NULL;
 
 	cmd.in = demux.out;
@@ -853,6 +856,11 @@ static int fetch_pack_config(const char *var, const char *value, void *cb)
 		return 0;
 	}
 
+	if (!strcmp(var, "fetch.fsckobjects")) {
+		fetch_fsck_objects = git_config_bool(var, value);
+		return 0;
+	}
+
 	return git_default_config(var, value, cb);
 }
 

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 0/3] Add fetch.fsckobjects
  2011-09-04 19:15 ` Funnies with "git fetch" Junio C Hamano
@ 2011-09-05  2:21   ` Junio C Hamano
  2011-09-05  2:21     ` [PATCH 1/3] fetch.fsckobjects: verify downloaded objects Junio C Hamano
                       ` (2 more replies)
  0 siblings, 3 replies; 18+ messages in thread
From: Junio C Hamano @ 2011-09-05  2:21 UTC (permalink / raw)
  To: git

Interestingly enough, when we added receive.fsckobjects long time ago,
for some reason we forgot to add the corresponding fetch.fsckobjects,
which would have been an obvious thing to do.

So here is a series to add it.

Junio C Hamano (3):
  fetch.fsckobjects: verify downloaded objects
  transfer.fsckobjects: unify fetch/receive.fsckobjects
  test: fetch/receive with fsckobjects

 Documentation/config.txt        |   15 +++++-
 builtin/fetch-pack.c            |   18 +++++++
 builtin/receive-pack.c          |   17 +++++-
 t/t5504-fetch-receive-strict.sh |  104 +++++++++++++++++++++++++++++++++++++++
 4 files changed, 150 insertions(+), 4 deletions(-)
 create mode 100755 t/t5504-fetch-receive-strict.sh

-- 
1.7.7.rc0.175.gb3212

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 1/3] fetch.fsckobjects: verify downloaded objects
  2011-09-05  2:21   ` [PATCH 0/3] Add fetch.fsckobjects Junio C Hamano
@ 2011-09-05  2:21     ` Junio C Hamano
  2011-09-05  2:21     ` [PATCH 2/3] transfer.fsckobjects: unify fetch/receive.fsckobjects Junio C Hamano
  2011-09-05  2:21     ` [PATCH 3/3] test: fetch/receive with fsckobjects Junio C Hamano
  2 siblings, 0 replies; 18+ messages in thread
From: Junio C Hamano @ 2011-09-05  2:21 UTC (permalink / raw)
  To: git

This corresponds to receive.fsckobjects configuration variable added (a
lot) earlier in 20dc001 (receive-pack: allow using --strict mode for
unpacking objects, 2008-02-25).

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/config.txt |    6 ++++++
 builtin/fetch-pack.c     |    8 ++++++++
 2 files changed, 14 insertions(+), 0 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 664de6b..4cbc4b9 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -820,6 +820,12 @@ diff.wordRegex::
 	sequences that match the regular expression are "words", all other
 	characters are *ignorable* whitespace.
 
+fetch.fsckObjects::
+	If it is set to true, git-fetch-pack will check all fetched
+	objects. It will abort in the case of a malformed object or a
+	broken link. The result of an abort are only dangling objects.
+	Defaults to false.
+
 fetch.unpackLimit::
 	If the number of objects fetched over the git native
 	transfer is below this
diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
index dbd8b7b..df6a8dc 100644
--- a/builtin/fetch-pack.c
+++ b/builtin/fetch-pack.c
@@ -14,6 +14,7 @@ static int transfer_unpack_limit = -1;
 static int fetch_unpack_limit = -1;
 static int unpack_limit = 100;
 static int prefer_ofs_delta = 1;
+static int fetch_fsck_objects;
 static struct fetch_pack_args args = {
 	/* .uploadpack = */ "git-upload-pack",
 };
@@ -663,6 +664,8 @@ static int get_pack(int xd[2], char **pack_lockfile)
 	}
 	if (*hdr_arg)
 		*av++ = hdr_arg;
+	if (fetch_fsck_objects)
+		*av++ = "--strict";
 	*av++ = NULL;
 
 	cmd.in = demux.out;
@@ -776,6 +779,11 @@ static int fetch_pack_config(const char *var, const char *value, void *cb)
 		return 0;
 	}
 
+	if (!strcmp(var, "fetch.fsckobjects")) {
+		fetch_fsck_objects = git_config_bool(var, value);
+		return 0;
+	}
+
 	return git_default_config(var, value, cb);
 }
 
-- 
1.7.7.rc0.175.gb3212

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 2/3] transfer.fsckobjects: unify fetch/receive.fsckobjects
  2011-09-05  2:21   ` [PATCH 0/3] Add fetch.fsckobjects Junio C Hamano
  2011-09-05  2:21     ` [PATCH 1/3] fetch.fsckobjects: verify downloaded objects Junio C Hamano
@ 2011-09-05  2:21     ` Junio C Hamano
  2011-09-05  2:21     ` [PATCH 3/3] test: fetch/receive with fsckobjects Junio C Hamano
  2 siblings, 0 replies; 18+ messages in thread
From: Junio C Hamano @ 2011-09-05  2:21 UTC (permalink / raw)
  To: git

This single variable can be used to set instead of setting fsckobjects
variable for fetch & receive independently.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 Documentation/config.txt |   11 +++++++++--
 builtin/fetch-pack.c     |   14 ++++++++++++--
 builtin/receive-pack.c   |   17 ++++++++++++++---
 3 files changed, 35 insertions(+), 7 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 4cbc4b9..d944403 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -824,7 +824,8 @@ fetch.fsckObjects::
 	If it is set to true, git-fetch-pack will check all fetched
 	objects. It will abort in the case of a malformed object or a
 	broken link. The result of an abort are only dangling objects.
-	Defaults to false.
+	Defaults to false. If not set, the value of `transfer.fsckObjects`
+	is used instead.
 
 fetch.unpackLimit::
 	If the number of objects fetched over the git native
@@ -1427,7 +1428,8 @@ receive.fsckObjects::
 	If it is set to true, git-receive-pack will check all received
 	objects. It will abort in the case of a malformed object or a
 	broken link. The result of an abort are only dangling objects.
-	Defaults to false.
+	Defaults to false. If not set, the value of `transfer.fsckObjects`
+	is used instead.
 
 receive.unpackLimit::
 	If the number of objects received in a push is below this
@@ -1616,6 +1618,11 @@ tar.umask::
 	archiving user's umask will be used instead.  See umask(2) and
 	linkgit:git-archive[1].
 
+transfer.fsckObjects::
+	When `fetch.fsckObjects` or `receive.fsckObjects` are
+	not set, the value of this variable is used instead.
+	Defaults to false.
+
 transfer.unpackLimit::
 	When `fetch.unpackLimit` or `receive.unpackLimit` are
 	not set, the value of this variable is used instead.
diff --git a/builtin/fetch-pack.c b/builtin/fetch-pack.c
index df6a8dc..dac3038 100644
--- a/builtin/fetch-pack.c
+++ b/builtin/fetch-pack.c
@@ -14,7 +14,8 @@ static int transfer_unpack_limit = -1;
 static int fetch_unpack_limit = -1;
 static int unpack_limit = 100;
 static int prefer_ofs_delta = 1;
-static int fetch_fsck_objects;
+static int fetch_fsck_objects = -1;
+static int transfer_fsck_objects = -1;
 static struct fetch_pack_args args = {
 	/* .uploadpack = */ "git-upload-pack",
 };
@@ -664,7 +665,11 @@ static int get_pack(int xd[2], char **pack_lockfile)
 	}
 	if (*hdr_arg)
 		*av++ = hdr_arg;
-	if (fetch_fsck_objects)
+	if (fetch_fsck_objects >= 0
+	    ? fetch_fsck_objects
+	    : transfer_fsck_objects >= 0
+	    ? transfer_fsck_objects
+	    : 0)
 		*av++ = "--strict";
 	*av++ = NULL;
 
@@ -784,6 +789,11 @@ static int fetch_pack_config(const char *var, const char *value, void *cb)
 		return 0;
 	}
 
+	if (!strcmp(var, "transfer.fsckobjects")) {
+		transfer_fsck_objects = git_config_bool(var, value);
+		return 0;
+	}
+
 	return git_default_config(var, value, cb);
 }
 
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 0559fcc..021ea65 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -23,7 +23,8 @@ static int deny_deletes;
 static int deny_non_fast_forwards;
 static enum deny_action deny_current_branch = DENY_UNCONFIGURED;
 static enum deny_action deny_delete_current = DENY_UNCONFIGURED;
-static int receive_fsck_objects;
+static int receive_fsck_objects = -1;
+static int transfer_fsck_objects = -1;
 static int receive_unpack_limit = -1;
 static int transfer_unpack_limit = -1;
 static int unpack_limit = 100;
@@ -77,6 +78,11 @@ static int receive_pack_config(const char *var, const char *value, void *cb)
 		return 0;
 	}
 
+	if (strcmp(var, "transfer.fsckobjects") == 0) {
+		transfer_fsck_objects = git_config_bool(var, value);
+		return 0;
+	}
+
 	if (!strcmp(var, "receive.denycurrentbranch")) {
 		deny_current_branch = parse_deny_action(var, value);
 		return 0;
@@ -586,6 +592,11 @@ static const char *unpack(void)
 	struct pack_header hdr;
 	const char *hdr_err;
 	char hdr_arg[38];
+	int fsck_objects = (receive_fsck_objects >= 0
+			    ? receive_fsck_objects
+			    : transfer_fsck_objects >= 0
+			    ? transfer_fsck_objects
+			    : 0);
 
 	hdr_err = parse_pack_header(&hdr);
 	if (hdr_err)
@@ -598,7 +609,7 @@ static const char *unpack(void)
 		int code, i = 0;
 		const char *unpacker[4];
 		unpacker[i++] = "unpack-objects";
-		if (receive_fsck_objects)
+		if (fsck_objects)
 			unpacker[i++] = "--strict";
 		unpacker[i++] = hdr_arg;
 		unpacker[i++] = NULL;
@@ -618,7 +629,7 @@ static const char *unpack(void)
 
 		keeper[i++] = "index-pack";
 		keeper[i++] = "--stdin";
-		if (receive_fsck_objects)
+		if (fsck_objects)
 			keeper[i++] = "--strict";
 		keeper[i++] = "--fix-thin";
 		keeper[i++] = hdr_arg;
-- 
1.7.7.rc0.175.gb3212

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 3/3] test: fetch/receive with fsckobjects
  2011-09-05  2:21   ` [PATCH 0/3] Add fetch.fsckobjects Junio C Hamano
  2011-09-05  2:21     ` [PATCH 1/3] fetch.fsckobjects: verify downloaded objects Junio C Hamano
  2011-09-05  2:21     ` [PATCH 2/3] transfer.fsckobjects: unify fetch/receive.fsckobjects Junio C Hamano
@ 2011-09-05  2:21     ` Junio C Hamano
  2 siblings, 0 replies; 18+ messages in thread
From: Junio C Hamano @ 2011-09-05  2:21 UTC (permalink / raw)
  To: git

Add tests for the new fetch.fsckobjects, and also tests for
receive.fsckobjects we have had for quite some time.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 t/t5504-fetch-receive-strict.sh |  104 +++++++++++++++++++++++++++++++++++++++
 1 files changed, 104 insertions(+), 0 deletions(-)
 create mode 100755 t/t5504-fetch-receive-strict.sh

diff --git a/t/t5504-fetch-receive-strict.sh b/t/t5504-fetch-receive-strict.sh
new file mode 100755
index 0000000..6610012
--- /dev/null
+++ b/t/t5504-fetch-receive-strict.sh
@@ -0,0 +1,104 @@
+#!/bin/sh
+
+test_description='fetch/receive strict mode'
+. ./test-lib.sh
+
+test_expect_success setup '
+	echo hello >greetings &&
+	git add greetings &&
+	git commit -m greetings &&
+
+	S=$(git rev-parse :greetings | sed -e "s|^..|&/|") &&
+	X=$(echo bye | git hash-object -w --stdin | sed -e "s|^..|&/|") &&
+	mv -f .git/objects/$X .git/objects/$S &&
+
+	test_must_fail git fsck
+'
+
+test_expect_success 'fetch without strict' '
+	rm -rf dst &&
+	git init dst &&
+	(
+		cd dst &&
+		git config fetch.fsckobjects false &&
+		git config transfer.fsckobjects false &&
+		git fetch ../.git master
+	)
+'
+
+test_expect_success 'fetch with !fetch.fsckobjects' '
+	rm -rf dst &&
+	git init dst &&
+	(
+		cd dst &&
+		git config fetch.fsckobjects false &&
+		git config transfer.fsckobjects true &&
+		git fetch ../.git master
+	)
+'
+
+test_expect_success 'fetch with fetch.fsckobjects' '
+	rm -rf dst &&
+	git init dst &&
+	(
+		cd dst &&
+		git config fetch.fsckobjects true &&
+		git config transfer.fsckobjects false &&
+		test_must_fail git fetch ../.git master
+	)
+'
+
+test_expect_success 'fetch with transfer.fsckobjects' '
+	rm -rf dst &&
+	git init dst &&
+	(
+		cd dst &&
+		git config transfer.fsckobjects true &&
+		test_must_fail git fetch ../.git master
+	)
+'
+
+test_expect_success 'push without strict' '
+	rm -rf dst &&
+	git init dst &&
+	(
+		cd dst &&
+		git config fetch.fsckobjects false &&
+		git config transfer.fsckobjects false
+	) &&
+	git push dst master:refs/heads/test
+'
+
+test_expect_success 'push with !receive.fsckobjects' '
+	rm -rf dst &&
+	git init dst &&
+	(
+		cd dst &&
+		git config receive.fsckobjects false &&
+		git config transfer.fsckobjects true
+	) &&
+	git push dst master:refs/heads/test
+'
+
+test_expect_success 'push with receive.fsckobjects' '
+	rm -rf dst &&
+	git init dst &&
+	(
+		cd dst &&
+		git config receive.fsckobjects true &&
+		git config transfer.fsckobjects false
+	) &&
+	test_must_fail git push dst master:refs/heads/test
+'
+
+test_expect_success 'push with transfer.fsckobjects' '
+	rm -rf dst &&
+	git init dst &&
+	(
+		cd dst &&
+		git config transfer.fsckobjects true
+	) &&
+	test_must_fail git push dst master:refs/heads/test
+'
+
+test_done
-- 
1.7.7.rc0.175.gb3212

^ permalink raw reply related	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2011-09-05  2:22 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-09-01 17:53 Funnies with "git fetch" Junio C Hamano
2011-09-01 22:42 ` Junio C Hamano
2011-09-01 23:31   ` Jeff King
2011-09-02  3:09     ` Junio C Hamano
2011-09-02  3:28       ` Junio C Hamano
2011-09-02  5:03       ` Jeff King
2011-09-01 22:43 ` [PATCH 0/3] Verify the objects fetch obtained before updating ref Junio C Hamano
2011-09-01 22:43   ` [PATCH 1/3] list-objects: pass callback data to show_objects() Junio C Hamano
2011-09-01 22:43   ` [PATCH 2/3] rev-list --verify-object Junio C Hamano
2011-09-01 22:43   ` [PATCH 3/3] fetch: verify we have everything we need before updating our ref Junio C Hamano
2011-09-02  3:55     ` Nguyen Thai Ngoc Duy
2011-09-02  4:25       ` Junio C Hamano
2011-09-02 23:14         ` Junio C Hamano
2011-09-04 19:15 ` Funnies with "git fetch" Junio C Hamano
2011-09-05  2:21   ` [PATCH 0/3] Add fetch.fsckobjects Junio C Hamano
2011-09-05  2:21     ` [PATCH 1/3] fetch.fsckobjects: verify downloaded objects Junio C Hamano
2011-09-05  2:21     ` [PATCH 2/3] transfer.fsckobjects: unify fetch/receive.fsckobjects Junio C Hamano
2011-09-05  2:21     ` [PATCH 3/3] test: fetch/receive with fsckobjects Junio C Hamano

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.